- PVR).
It is shown in [14, Corollary 7(2)] that a ring R is a φ-PVR if and only if Nil(R) is a divided prime ideal and for every a, b ∈ R \ Nil(R), either a | b in R or b | ac in R for each nonunit c ∈ R. Also, it is shown in [15, Theorem 2.6] that for each n > 0 there is a φ-PVR of Krull dimension n that is not a PVR. φ-pseudo-valuation rings have been studied extensively in [14], [15], [16], [17], and [22]. We would like to point out that if R is an integral domain, then Dobbs, Fontana, Huckaba, and Papick in [30] have defined and studied "T-strongly primes" and "strong rings" (see Section 5). Chang [22] gave another generalization of pseudo-valuation domains. Recall from [22] that a Marot ring R with total quotient ring T(R) is called an r-pseudo-valuation ring (r-PVR) if each regular prime ideal P of R is r-strongly prime, in the sense that xy ∈ P, x ∈ T(R), y ∈ T(R) implies that either x ∈ P or y ∈ P. Chang [22] gave an example of an r-PVR that is not a φ-PVR. In this article, we will only study and survey pseudo-valuation domains. If the reader is interested in the generalization of pseudo-valuation domains to the context of arbitrary rings with nonzero zero-divisors, then we recommend the following papers: [8], [5], [11], [12], [13], [14], [15], [16], [17], [18], and [22].
2. PSEUDO-VALUATION DOMAINS
We begin by stating some simple properties and characterizations of pseudo-valuation domains (PVDs). Recall that an integral domain R is called a valuation domain if for every a, b ∈ R, either a | b or b | a.
PROPOSITION 2.1. ([31, Proposition 1.1]). Every valuation domain is a pseudo-valuation domain. •
The following proposition is a characterization of strongly prime ideals.
PROPOSITION 2.2. ([31, Proposition 1.2]). Let P be a prime ideal of a domain R with quotient field K. Then P is strongly prime if and only if x^{-1}P ⊂ P whenever x ∈ K \ R. •
PROPOSITION 2.3. ([31, Corollary 1.3]). In a pseudo-valuation domain R, the prime ideals are linearly ordered (under inclusion). In particular, R is quasilocal. •
Anderson [4, Proposition 4.6] gave the following characterization of nonprincipal strongly prime ideals:
PROPOSITION 2.4. ([4, Proposition 4.6]). Let R be an integral domain with quotient field K, and let I be a nonzero ideal of R. Then the following statements are equivalent:
1. I is a nonprincipal strongly prime ideal.
2. I^{-1} = {x ∈ K : xI ⊂ R} is a ring and I is comparable to each principal fractional ideal of R.
•
Let I be an ideal of an integral domain R with quotient field K. Then I : I = {x ∈ K : xI ⊂ I}. Another characterization of strongly prime ideals was given in [1, Proposition 1.3]:
PROPOSITION 2.5. ([1, Proposition 1.3], also see [3, Propositions 4.2 and 4.3]). Let P be a prime ideal in an integral domain R with quotient field K. Then the following statements are equivalent:
1. P is a strongly prime ideal.
2. S = K \ P is multiplicatively closed.
3. P is prime and is comparable to each fractional ideal of R.
4. P : P is a valuation domain with maximal ideal P.
5. P is a prime ideal in some valuation overring of R. •
PROPOSITION 2.6. (Proposition 2.5, [35, Theorem 7], and [5, Corollary 3.7(b)]). Let P be a strongly prime ideal of an integral domain R. Then R_P = P : P if and only if R_P is a valuation domain. In particular, if P is a nonmaximal ideal of R, then R_P = P : P is a valuation domain.
•
PROPOSITION 2.7. ([35, Theorem 1]). Let R be a pseudo-valuation domain, and let I be an ideal of R and P be a prime ideal of R such that P ⊂ I. Then P is a prime ideal of I : I = {x ∈ K : xI ⊂ I}. •
PROPOSITION 2.8. ([31, Theorem 1.4 and Theorem 1.5]). Let R be an integral domain with quotient field K. The following statements are equivalent:
1. R is a pseudo-valuation domain.
2. A maximal ideal of R is strongly prime.
3. For each x ∈ K \ R and for each nonunit a of R, we have x^{-1}a ∈ R. •
The following proposition is a restatement of Proposition 2.8(3).
PROPOSITION 2.9. ([10, Proposition 3(4)]). An integral domain R is a PVD if and only if for every a, b ∈ R, either a | b or b | ac for every nonunit c of R. •
PROPOSITION 2.10. ([31, Corollary 2.9], also see [9, Proposition 5]). If a pseudo-valuation domain R has a nonzero principal prime ideal, then R is a valuation domain. •
Hedstrom and Houston [31, Theorem 2.10] gave the following characterization of pseudo-valuation domains:
PROPOSITION 2.11. ([31, Theorem 2.10]). Let (R, M) be a quasilocal domain with quotient field K which is not a valuation domain. Then R is a pseudo-valuation domain if and only if M^{-1} = {x ∈ K : xM ⊂ R} is a valuation domain with maximal ideal M. •
Anderson and Dobbs [6, Proposition 2.5] sharpened the above proposition.
PROPOSITION 2.12. ([6, Proposition 2.5]). Let (R, M) be a quasilocal domain with quotient field K. Then R is a pseudo-valuation domain if and only if M : M = {x ∈ K : xM ⊂ M} is a valuation domain with maximal ideal M.
•
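As an illustration of Proposition 2.12, the following is a hedged worked computation for the standard domain R = Q + XR[[X]] (rational constant term, real power-series coefficients), which appears as an example later in this section. The names V, M, K and the X-adic order v are notation assumed for this sketch only; they are not taken from [6].

```latex
% Assumed notation for this sketch:
%   V = \mathbb{R}[[X]],  M = X\mathbb{R}[[X]],  R = \mathbb{Q} + M,
%   K = \mathbb{R}((X)),  v = X-adic order on K.
\begin{align*}
x \in M : M
  &\iff xM \subset M \\
  &\iff v(x) + v(m) \ge 1 \ \text{ for all nonzero } m \in M \\
  &\iff v(x) + 1 \ge 1 \qquad (\text{take } m = X) \\
  &\iff v(x) \ge 0 \iff x \in V .
\end{align*}
% Hence M : M = V = \mathbb{R}[[X]], a discrete valuation domain whose
% maximal ideal is XV = M, so the criterion of Proposition 2.12 holds for R.
```

The same computation shows that M : M is strictly larger than R here, which is consistent with R not being a valuation domain.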
Anderson [4, Proposition 4.1] gave this characterization of pseudo-valuation domains:
PROPOSITION 2.13. ([4, Proposition 4.1]). Let R be an integral domain with quotient field K. The following statements are equivalent:
1. R is a pseudo-valuation domain (and hence quasilocal).
2. For each x ∈ K and prime ideal P of R, xR and P are comparable (under inclusion).
•
If R is a ring, then U(R) denotes the set of all units of R. Anderson and Anderson [1, Theorem 1.2] gave the following characterization of pseudo-valuation domains:
PROPOSITION 2.14. ([1, Theorem 1.2]). Let K be a field and R be a subring of K with group of units U(R). Then S = (K \ R) ∪ U(R) is multiplicatively closed if and only if either R is a pseudo-valuation domain with quotient field K or R is a subfield of K.
•
Let b be an element of an integral domain R. Then an element d of R is called a proper divisor of b if b = dm for some nonunit m ∈ R. Badawi [12, Proposition 4] gave the following characterization of pseudo-valuation domains:
PROPOSITION 2.15. ([12, Proposition 4]). An integral domain R is a pseudo-valuation domain if and only if for every a, b ∈ R, either a | b or d | a for every proper divisor d of b.
•
Anderson and Dobbs [6, Proposition 2.6] showed that a pseudo-valuation domain is the pullback of a valuation domain:
PROPOSITION 2.16. ([6, Proposition 2.6]). Let V be a valuation domain with maximal ideal M, F = V/M its residue field, φ : V → F the canonical epimorphism, k a subfield of F, and R = φ^{-1}(k). Then the pullback R = V ×_F k is a pseudo-valuation domain. •
In the following proposition, Dobbs [24, Proposition 4.9] gave an extension of Hedstrom-Houston's observation [31, Example 2.1] that the D + M construction yields a pseudo-valuation domain whenever D is a field.
PROPOSITION 2.17. ([24, Proposition 4.9]). Let M ≠ 0 be the maximal ideal of a valuation domain V = K + M, where K is a field. Let D be a proper subring of K. Set R = D + M. Then R is a pseudo-valuation domain if and only if either D is a pseudo-valuation domain with quotient field K or D is a field.
2.1. Examples of pseudo-valuation domains. Hedstrom and Houston gave the following example of a pseudo-valuation domain that is not a valuation domain:
EXAMPLE 2.1.1. ([31, Example 2.1]). Let V be a valuation domain of the form K + M, where K is a field and M is the maximal ideal of V. If F is a proper subfield of K, then R = F + M is a pseudo-valuation domain that is not a valuation domain. In particular, if K is a field and F is a proper subfield of K, then F + XK[[X]] is a pseudo-valuation domain that is not a valuation domain. For example, ℚ + Xℝ[[X]] (rational constant term, real coefficients) is a pseudo-valuation domain that is not a valuation domain.
•
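The last claim of Example 2.1.1 can be checked by hand against the divisibility criterion of Proposition 2.9. The following is a minimal worked sketch for R = Q + XR[[X]] (rationals plus power series with real coefficients and zero constant term); the specific elements a and b are chosen here for illustration and are not taken from [31].

```latex
% Assumed notation for this sketch:
%   M = X\mathbb{R}[[X]],  R = \mathbb{Q} + M.
% The nonunits of R are exactly the elements of M.
\begin{align*}
&\text{Take } a = X, \quad b = \sqrt{2}\,X, \quad\text{both in } M \subset R. \\
&b/a = \sqrt{2} \notin R \ \ (\text{constant term not in } \mathbb{Q}),
   \quad\text{so } a \nmid b \text{ in } R; \\
&a/b = 1/\sqrt{2} \notin R,
   \quad\text{so } b \nmid a \text{ in } R. \\
&\text{Hence } R \text{ is not a valuation domain.} \\
&\text{For every nonunit } c \in R \text{ we have } c \in M, \text{ and }
   \frac{ac}{b} = \frac{c}{\sqrt{2}} \in M \subset R,
   \quad\text{so } b \mid ac \text{ in } R.
\end{align*}
```

So the pair (a, b) fails the valuation condition but satisfies the weaker condition "a | b or b | ac for every nonunit c" that characterizes pseudo-valuation domains.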
EXAMPLE 2.1.2. ([32, Example 3.1]). For each positive integer n (possibly infinite), there is a pseudo-valuation domain of Krull dimension n that is not a valuation domain. Let D = ℚ + Xℝ[[X]]. Then D is a pseudo-valuation domain of Krull dimension 1 that is not a valuation domain. Now, assume that n > 1. Let K be the quotient field of D. Then there is a valuation domain of the form K + M with maximal ideal M of Krull dimension n − 1. Then R = D + M is a pseudo-valuation domain by Proposition 2.17. By standard properties of the D + M construction, R is not a valuation domain and R has Krull dimension n.
3. OVERRINGS THAT ARE PSEUDO-VALUATION DOMAINS
Recall that if R is an integral domain with quotient field K, then we say that B is an overring of R if R ⊂ B ⊂ K. We start with the following proposition:
PROPOSITION 3.1. ([31, Proposition 2.6]). Let R be a pseudo-valuation domain with maximal ideal M. If P is a nonmaximal prime ideal of R, then R_P is a valuation domain (and hence a pseudo-valuation domain). •
PROPOSITION 3.2. ([3, Proposition 4.3], also see [9, Proposition 6]). Let R be an integral domain with quotient field K. Suppose that P is a nonzero strongly prime ideal of R. Then:
1. If P is not principal, then P^{-1} = {x ∈ K : xP ⊂ R} = P : P = {x ∈ K : xP ⊂ P} is a valuation domain.
2. If P is principal, then P : P = R is a valuation domain. •
Anderson, Badawi, and Dobbs [8, Lemma 20] showed the following:
PROPOSITION 3.3. ([8, Lemma 20]). Let R be a pseudo-valuation domain with maximal ideal M. Let B be an overring of R. If s^{-1} ∈ B for some nonzero s ∈ M, then B is a pseudo-valuation domain. •
Let R' be the integral closure of R.
PROPOSITION 3.4. ([31, Proposition 2.7], [24, Proposition 4.2]). Let R be a pseudo-valuation domain with maximal ideal M. Then R' = M : M if and only if every overring of R is a pseudo-valuation domain. •
Badawi showed the following:
PROPOSITION 3.5. ([12, Corollary 18]). Let R be a pseudo-valuation domain with maximal ideal M. Then the following statements are equivalent:
1. R' = M : M.
2. Every overring of R is a pseudo-valuation domain.
3. Every overring C of R such that C ⊂ M : M is a pseudo-valuation domain.
4. Every overring C of R such that C ⊂ M : M is a pseudo-valuation domain with maximal ideal M.
5. M is the maximal ideal of every overring C of R such that C ⊂ M : M.
6. R ⊂ C satisfies the INC condition for every overring C of R such that C ⊂ M : M. (Recall that R ⊂ C satisfies the INC condition if any two prime ideals of C with the same contraction in R are incomparable (under inclusion).)
Anderson, Badawi, and Dobbs showed the following:
PROPOSITION 3.6. ([5, Corollary 2.2]). If (R, M) is a pseudo-valuation domain, then the following conditions are equivalent:
1. R' = M : M.
2. Every overring of R is a pseudo-valuation domain.
3. Every overring of R that does not contain an element of the form 1/s for some s ∈ M is a pseudo-valuation domain.
4. For each u ∈ (M : M) \ R, R[u] is a pseudo-valuation domain.
5. For each u ∈ (M : M) \ R, R[u] is quasilocal.
6. Every overring of R is quasilocal. •
Badawi [13, Theorem 3] proved the following result:
PROPOSITION 3.7. ([13, Theorem 3]). Let (R, M) be a pseudo-valuation domain with quotient field K, and let V be a valuation domain with maximal ideal N such that R ⊂ V ⊂ K. If P = N ∩ R is different from M, then V = R_P. •
The above result was used to prove the following:
PROPOSITION 3.8. ([13, Theorem 8]). Let (R, M) be a pseudo-valuation domain with quotient field K. The following are equivalent:
1. R' = M : M.
2. Every valuation domain V other than M : M such that R ⊂ V ⊂ K is of the form R_P for some nonmaximal prime ideal P of R.
3. Every overring of R is a pseudo-valuation domain.
Recall that an overring B of an integral domain R is called a proper overring of R if R ≠ B. Let R be an integral domain with quotient field K. Okabe [36] defined R to be a quasi-valuation domain (QVD) if each proper quasilocal overring B of R with maximal ideal M_B satisfies the condition (QV): R : B = {x ∈ K : xB ⊂ R} = M_B. Okabe showed the following:
PROPOSITION 3.9. ([36, Proposition 1]). Every valuation domain is a quasi-valuation domain.
•
Using the concept of quasi-valuation domains, Okabe proved the following:
PROPOSITION 3.10. ([36, Theorem 8]). Let R be a quasilocal domain with maximal ideal M and quotient field K. Then the following conditions are equivalent:
1. R is a quasi-valuation domain.
2. Each overring of R is a pseudo-valuation domain.
3. Each proper valuation overring V of R satisfies (QV).
4. Each proper minimal valuation overring of R satisfies (QV).
5. Some proper minimal valuation overring of R satisfies (QV).
6. R' = M^{-1} = {x ∈ K : xM ⊂ R}.
Recall that an integral domain R with quotient field K is called seminormal if whenever x^2, x^3 ∈ R for some x ∈ K, then x ∈ R. Anderson, Dobbs, and Huckaba proved the following result:
PROPOSITION 3.11. ([7, Proposition 3.1]).
1. Each pseudo-valuation domain is seminormal.
2. Let R be a pseudo-valuation domain with maximal ideal M and quotient field K. Then the following four conditions are equivalent:
(a) For each a ∈ K \ R, each overring of R which is maximal without a is a pseudo-valuation domain.
(b) Each overring of R is seminormal.
(c) Each overring of R is a pseudo-valuation domain.
(d) R' = M : M.
•
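Part 1 of Proposition 3.11 can be verified directly on a concrete pseudo-valuation domain. The following is a hedged sketch for R = Q + XR[[X]] (rational constant term, real coefficients); the notation V, M, K, v and the case split are assumptions of this illustration and are not taken from [7].

```latex
% Assumed notation for this sketch:
%   V = \mathbb{R}[[X]],  M = XV,  R = \mathbb{Q} + M,
%   K = \mathbb{R}((X)),  v = X-adic order on K.
% Claim: if x \in K and x^2, x^3 \in R, then x \in R.
\begin{align*}
&x^2 \in R \subset V \;\Longrightarrow\; 2\,v(x) \ge 0
   \;\Longrightarrow\; x \in V. \\
&\text{Case } v(x) \ge 1:\quad x \in M \subset R. \\
&\text{Case } v(x) = 0:\quad x = t + m,\ t \in \mathbb{R}\setminus\{0\},\ m \in M. \\
&\qquad x^2 \in R \Rightarrow t^2 \in \mathbb{Q}, \qquad
        x^3 \in R \Rightarrow t^3 \in \mathbb{Q}, \\
&\qquad\text{so } t = t^3 / t^2 \in \mathbb{Q}
   \ \text{ and } x = t + m \in \mathbb{Q} + M = R.
\end{align*}
```

Both cases give x ∈ R, so this particular R is seminormal, as the proposition predicts.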
Let R be an integral domain with quotient field K. Dobbs and Fontana [28] defined R to be a locally pseudo-valuation domain (LPVD) if R_P is a pseudo-valuation domain for every (nonzero) prime ideal P of R. For a generalization of locally pseudo-valuation domains to the context of arbitrary rings with nonzero zero-divisors, see [18]. Dobbs and Fontana showed the following:
PROPOSITION 3.12. ([28, Proposition 2.2]). An integral domain R is a locally pseudo-valuation domain if and only if R_M is a pseudo-valuation domain for every maximal ideal M of R.
•
PROPOSITION 3.13. ([28, Example 2.5]). Let n > 2. Then there exists a locally pseudo-valuation domain R with precisely n maximal ideals, such that R is neither a pseudo-valuation domain nor a Prüfer domain.
Proof (sketch). Let k be a field with the following two properties: (1) there exist n pairwise incomparable valuation domains V_i = k + M_i having (maximal ideal M_i, residue class field k and) a common quotient field; (2) there exist n distinct proper subfields k_i of k. Then R = ∩(k_i + M_i) is a locally pseudo-valuation domain, and R is neither a Prüfer domain nor a pseudo-valuation domain.
•
Let R be an integral domain with quotient field K. Recall from [37] that R is said to be an i-domain if the contraction map i : Spec(S) → Spec(R) is an injection for each overring S of R; equivalently ([37, Corollary 2.15]), if the integral closure of R_M is a valuation domain for each maximal ideal M of R.
PROPOSITION 3.14. ([28, Theorem 2.9]). Let R be an integral domain. Then the following conditions are equivalent:
1. Each overring of R is a locally pseudo-valuation domain.
2. R is a locally pseudo-valuation domain and each overring of R is seminormal.
3. R is a locally pseudo-valuation domain and R' is a Prüfer domain.
4. R is a locally pseudo-valuation domain and an i-domain.
•
PROPOSITION 3.15. ([28, Corollary 2.10]). Let R be a pseudo-valuation domain with maximal ideal M. Then the following conditions are equivalent:
1. Each overring of R is seminormal.
2. Each overring of R is a locally pseudo-valuation domain.
3. Each overring of R is a pseudo-valuation domain.
4. R' = M : M.
•
Let R be an integral domain with quotient field K. Matsuda [33] called an overring of R which is maximal without a specified element of K \ R a specified overring (s-overring). Matsuda [33] showed the following:
PROPOSITION 3.16. ([33, Theorem 3]). Assume that R is a domain such that R' is quasilocal. If each s-overring of R is a pseudo-valuation domain, then each overring of R is a pseudo-valuation domain.
•
PROPOSITION 3.17. ([33, Theorem 4]). Let R be an integral domain with quotient field K. Then the following conditions are equivalent:
1. Each s-overring of R is a pseudo-valuation domain.
2. Each s-overring B of R is a pseudo-valuation domain with maximal ideal M_B, and B' = M_B : M_B.
3. Each s-overring B of R is an i-domain, and each integral overring of B is seminormal.
4. For each s-overring B of R, each integral overring of B is seminormal, and B' is a pseudo-valuation domain.
5. Each overring of R is seminormal, and, for each overring B of R which is not an i-domain, B' contains no s-overring of B.
6. Each overring of R is seminormal, and each s-overring of R is an i-domain.
7. Each integral overring of R is seminormal, R' is a Prüfer domain, and each s-overring of R is an i-domain.
8. For each maximal ideal M of R, each s-overring of R_M is a pseudo-valuation domain.
Let R be an integral domain. Recall that R is called t-closed if, whenever a, r, c ∈ R satisfy a^3 + arc − c^2 = 0, there exists b ∈ R such that b^2 − rb = a and b^3 − rb^2 = c. Picavet [38] showed the following:
PROPOSITION 3.18. ([38, Proposition 3.1]). Let R be a pseudo-valuation domain with maximal ideal M and quotient field K. Then:
1. R is t-closed.
2. The following conditions are equivalent:
(a) Each overring of R is a pseudo-valuation domain.
(b) Each overring of R is t-closed.
(c) Each overring of R is seminormal.
(d) R' = M : M.
(e) For each a ∈ K \ R, each overring of R which is maximal without a is a pseudo-valuation domain.
•
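The cubic relation in the definition of t-closed, recalled before Proposition 3.18, is exactly the relation satisfied by any pair (a, c) of the form a = b^2 − rb, c = b^3 − rb^2. The short verification below is included only as a consistency check on the definition; it is not part of [38].

```latex
% With a = b^2 - rb = b(b - r) and c = b^3 - rb^2 = b^2 (b - r):
\begin{align*}
a^3     &= b^3 (b - r)^3, \\
a r c   &= r\, b^3 (b - r)^2, \\
c^2     &= b^4 (b - r)^2, \\
a^3 + a r c - c^2
        &= b^3 (b - r)^2 \bigl[ (b - r) + r - b \bigr] = 0 .
\end{align*}
```

Thus t-closedness asks for the converse: every solution (a, r, c) of a^3 + arc − c^2 = 0 in R must arise in this way from some b in R.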
4. ATOMIC PSEUDO-VALUATION DOMAINS
Let R be an integral domain. Recall that a nonunit a of R is called an atom of R if a is an irreducible element of R. If each nonzero nonunit element of R is a product of atoms of R, then R is called an atomic domain. It is well-known that a Noetherian domain is an atomic domain. Hedstrom and Houston [31] showed the following:
PROPOSITION 4.1. ([31, Theorem 3.1]). Let R be a Noetherian domain with quotient field K and integral closure R'. Then R is a pseudo-valuation domain if and only if R' is a valuation domain. •
PROPOSITION 4.2. ([31, Proposition 3.2]). If R is a Noetherian pseudo-valuation domain which is not a field, then R has Krull dimension 1. •
PROPOSITION 4.3. ([31, Corollary 3.3]). If R is a Noetherian pseudo-valuation domain, then every overring of R is a pseudo-valuation domain.
•
Let R be an atomic integral domain. Anderson and Mott [2] called a subset S of R universal if each element of S is divisible by each atom of R. Anderson and Mott in [2] showed the following:
PROPOSITION 4.4. ([2, Theorem 5.1]). Let R be an atomic quasilocal domain with maximal ideal M. Then R is a pseudo-valuation domain if and only if M^2 is universal.
•
The following result is a stronger version of Proposition 4.2:
PROPOSITION 4.5. ([2, Corollary 5.2], [12, Theorem 9], and [23]). If R is an atomic pseudo-valuation domain which is not a field, then R has Krull dimension 1. •
Recall from [39] that an atomic integral domain R is called a half-factorial domain (HFD) if each factorization of a nonzero nonunit element of R into a product of irreducible elements (atoms) of R has the same length. Let R be a half-factorial domain and x be a nonzero element of R. Then we define L(x) = n if x = x_1 x_2 ··· x_n for some atoms x_i of R. If x is a unit of R, then L(x) = 0. We have the following:
PROPOSITION 4.6. ([2, Theorem 6.2], also see [12, Theorem 5]). If R is an atomic pseudo-valuation domain, then R is a half-factorial domain.
Badawi [12] gave a characterization of atomic pseudo-valuation domains in terms of half-factorial domains.
PROPOSITION 4.7. ([12, Theorem 6]). Let R be an atomic domain. Then the following statements are equivalent:
1. R is a pseudo-valuation domain.
2. R is a half-factorial domain and for every x, y ∈ R, if L(x) < L(y), then x | y in R.
•
4.1. Examples of atomic pseudo-valuation domains.
EXAMPLE 4.1.1. ([31, Example 3.6]). Let R = Z[√5]_(2, 1+√5). Then R is a Noetherian (and hence atomic) pseudo-valuation domain.
EXAMPLE 4.1.2. ([2]). Let k be any field and X, Y be indeterminates. Then R = k + Xk(Y)[[X]] is an atomic pseudo-valuation domain that is not Noetherian.
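For the domain R = k + Xk(Y)[[X]] of Example 4.1.2, the length function L of Proposition 4.7 and the universality of M^2 in Proposition 4.4 can be made explicit. The following is a minimal worked sketch; the symbols F, V, M and the X-adic order v are notation assumed here for illustration and do not come from [2] or [12].

```latex
% Assumed notation for this sketch:
%   F = k(Y),  V = F[[X]],  M = XV,  R = k + M,
%   v = X-adic order on V  (least exponent of X occurring in f).
\begin{align*}
&\text{Nonunits of } R:\quad M = \{\, f \in R : v(f) \ge 1 \,\}. \\
&v(f) = 1 \;\Longrightarrow\; f \text{ is an atom, since }
    v(gh) = v(g) + v(h) \ge 2 \text{ for nonunits } g, h. \\
&v(f) = n \ge 2 \;\Longrightarrow\;
    f = X^{\,n-1} \cdot \bigl( f / X^{\,n-1} \bigr),
    \quad\text{with both factors in } M. \\
&\text{Hence the atoms are exactly the elements with } v = 1,
    \text{ and } L(f) = v(f). \\
&L(x) < L(y) \;\Longrightarrow\; v(y/x) \ge 1
    \;\Longrightarrow\; y/x \in M \subset R
    \;\Longrightarrow\; x \mid y \text{ in } R. \\
&M^2 = \{\, f : v(f) \ge 2 \,\},
    \text{ and each such } f \text{ is divisible by every atom of } R.
\end{align*}
```

Since v is additive, every factorization of f into atoms has exactly v(f) factors, so R is half-factorial and condition 2 of Proposition 4.7 holds.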
For further study on examples of pseudo-valuation domains, we recommend [29] and [23].
5. RELATED RESULTS
Let R be a subring of an integral domain T. Dobbs, Fontana, Huckaba, and Papick [30] called a prime ideal P of R T-strong if, whenever x ∈ T and y ∈ T satisfy xy ∈ P, then either x ∈ P or y ∈ P. If each prime ideal of R is T-strong, then T is called a strong extension of R (or R ⊂ T is a strong extension). Evidently, an integral domain R with quotient field K is a pseudo-valuation domain if and only if R ⊂ K is a strong extension. The following is an example of a strong overring extension R ⊂ T of domains for which neither R nor T is quasilocal (and hence neither R nor T is a pseudo-valuation domain).
EXAMPLE 5.1. ([30, Example 2.1]). Let L be the quotient field of Z[X], and V = L + XL[[X]] (observe that V is a valuation domain with maximal ideal XL[[X]]). Set R = Z + XL[[X]] and T = Z[X] + XL[[X]]. Then R ⊂ T is a strong overring extension with the stated properties.
Recall from [25] that a prime ideal P of an integral domain R is said to be divided in R if P is comparable (under inclusion) with each principal ideal of R. The following result is stated in [30]:
PROPOSITION 5.2. ([30, Theorem 2.3]). Let P be a prime ideal of an integral domain R. Then R ⊂ R_P is a strong extension if and only if both P is divided in R and R/P is a pseudo-valuation domain. Furthermore, if R ⊂ R_P is a strong extension, then the set of all prime ideals of R which contain P is linearly ordered by inclusion and R is quasilocal. •
The following result [30] states a characterization of pseudo-valuation domains in terms of strong extensions:
PROPOSITION 5.3. ([30, Theorem 2.9]). A domain R is a pseudo-valuation domain if and only if R has a prime ideal P satisfying the following two conditions:
1. R ⊂ R_P is a strong extension; and
2. R_P is a valuation domain. •
Recall that if A is a ring then U(A) denotes the set of all units of A.
PROPOSITION 5.4. ([30, Theorem 3.1]). Let R be an integral domain which is distinct from its quotient field K, and let T be an integral domain containing R. If K ⊂ T, then R ⊂ T is a strong extension if and only if both R is a pseudo-valuation domain and U(T) = U(K).
•
Let R be an integral domain. Recall that an ideal I of R is called a cancellation ideal if, whenever IJ_1 = IJ_2, then J_1 = J_2. Also, recall that an ideal I of R is called a quasi-cancellation ideal if, whenever aI ⊂ IJ for some a ∈ R and a finitely generated ideal J of R, then a ∈ J. Matsuda and Sugatani [34] proved the following:
PROPOSITION 5.5. ([34, Summary]).
1. For a pseudo-valuation domain R, a nonzero ideal I of R is a cancellation ideal if and only if I is a principal ideal.
2. There is a pseudo-valuation domain R that is not a valuation domain, such that R has a quasi-cancellation ideal which is not a cancellation ideal. •
Badawi and Houston in [19] called an ideal I of an integral domain R with quotient field K powerful if, whenever x ∈ K, y ∈ K, and xy ∈ I, then either x ∈ I or y ∈ I.
PROPOSITION 5.6. ([19, Proposition 1.3 and Corollary 1.6]). A prime ideal of R is strongly prime if and only if it is powerful. In particular, an integral domain R is a pseudo-valuation domain if and only if a maximal ideal of R is powerful. •
PROPOSITION 5.7. ([19, Proposition 1.14]). Let I be a powerful ideal of an integral domain R, and suppose that P … and hence N : N = {x ∈ K : xN ⊂ N} is a valuation overring of R with maximal ideal N.
•
Recall from [19] that an ideal I of an integral domain R with quotient field K is called strongly primary if, whenever xy ∈ I for some x, y ∈ K, we have x ∈ I or y^n ∈ I for some n > 1. An integral domain R is called an almost pseudo-valuation domain (APVD) if every prime ideal of R is strongly primary. Also, recall from [25] that a prime ideal of R is called divided if it is comparable to every principal ideal of R. If every prime ideal of R is divided, then R is called a divided domain. Dobbs in [24] proved that a pseudo-valuation domain is a divided domain. Badawi and Houston in [19] showed the following.
PROPOSITION 5.9. ([19, Proposition 3.2]). Let R be an almost pseudo-valuation domain. Then R is a (quasilocal) divided domain. Moreover, every nonmaximal prime ideal of R is strongly prime.
•
PROPOSITION 5.10. ([19, Theorem 3.4]). The following statements are equivalent for an integral domain R:
1. R is an almost pseudo-valuation domain.
2. Some maximal ideal of R is strongly primary.
3. R is a quasilocal domain, and the maximal ideal M of R is such that M : M is a valuation domain and M is primary to the maximal ideal of M : M.
•
PROPOSITION 5.11. ([19, Proposition 3.7]). If R is an almost pseudo-valuation domain with maximal ideal M, then R' is a pseudo-valuation domain with maximal ideal M.
•
PROPOSITION 5.12. ([19, Proposition 3.8]). If each overring of an integral domain R is an almost pseudo-valuation domain, then R' is a valuation domain.
•
The converse of Proposition 5.12 is false (see [19, Example 3.9]). However, we state the following result:
PROPOSITION 5.13. ([19, Proposition 3.10]). Let R be an almost pseudo-valuation domain, and assume that every integral overring of R is an almost pseudo-valuation domain. Then every overring of R is an almost pseudo-valuation domain. •
In [21] Bastida and Gilmer prove that a domain R shares an ideal with a valuation overring if and only if each overring of R which is different from the quotient field K of R has a nonzero conductor to R. Domains with this property, called conducive domains, were explicitly defined and studied by Dobbs and Fedder [26] and further studied by Barucci, Dobbs, and Fontana [20] and [27]. Recall from [26] that an integral domain R with quotient field K is called a conducive domain if for each overring T of R other than K, the conductor R : T = {x ∈ K : xT ⊂ R} is nonzero. Badawi and Houston in [19] showed that conducive domains, powerful ideals, and strongly primary ideals are intimately connected.
PROPOSITION 5.14. ([19, Theorem 4.1]). The following statements are equivalent:
1. R is a conducive domain.
2. R admits a powerful ideal.
3. R admits a strongly primary ideal.
4. R shares a nonzero ideal with some conducive overring. •
REFERENCES
[1] D. D. Anderson and D. F. Anderson, Multiplicatively closed subsets of fields, Houston J. Math. 13(1987), 1-11.
[2] D. D. Anderson and J. L. Mott, Cohen-Kaplansky domains: Integral domains with a finite number of irreducible elements, J. Algebra 148(1992), 17-41.
[3] D. F. Anderson, Comparability of ideals and valuation overrings, Houston J. Math. 5(1979), 451-463.
[4] D. F. Anderson, When the dual of an ideal is a ring, Houston J. Math. 9(1983), 325-332.
[5] D. F. Anderson, A. Badawi, and D. E. Dobbs, Pseudo-valuation rings II, Boll. Un. Mat. Ital. B(7)8(2000), 535-545.
[6] D. F. Anderson and D. E. Dobbs, Pairs of rings with the same prime ideals, Can. J. Math. 32(1980), 362-384.
[7] D. F. Anderson, D. E. Dobbs, and J. A. Huckaba, On seminormal overrings, Comm. Algebra 10(1982), 1421-1448.
[8] A. Badawi, D. F. Anderson, and D. E. Dobbs, Pseudo-valuation rings, Lecture Notes Pure Appl. Math. Vol. 185(1997), 57-67, Marcel Dekker, New York/Basel.
[9] A. Badawi, A visit to valuation and pseudo-valuation domains, Lecture Notes Pure Appl. Math. Vol. 171(1995), 155-161, Marcel Dekker, New York/Basel.
[10] A. Badawi, On domains which have prime ideals that are linearly ordered, Comm. Algebra 23(1995), 4365-4373.
[11] A. Badawi, On divided commutative rings, Comm. Algebra 27(1999), 1465-1474.
[12] A. Badawi, Remarks on pseudo-valuation rings, Comm. Algebra 28(2000), 2343-2358.
[13] A. Badawi, On chained overrings of pseudo-valuation rings, Comm. Algebra 28(2000), 2359-2366.
[14] A. Badawi, On φ-pseudo-valuation rings, Lecture Notes Pure Appl. Math. Vol. 205(1999), 101-110, Marcel Dekker, New York/Basel.
[15] A. Badawi, On φ-pseudo-valuation rings II, Houston J. Math. 26(2000), 473-480.
[16] A. Badawi, On φ-chained rings and …
… ideals, almost pseudo- … and conducive domains, to appear in Comm. Algebra.
[20] V. Barucci, D. E. Dobbs, and M. Fontana, Conducive integral domains as pullbacks, Manuscripta Math. 54(1986), 261-277.
[21] E. Bastida and R. Gilmer, Overrings and divisorial ideals in rings of the form D + M, Michigan Math. J. 20(1973), 79-95.
[22] G. W. Chang, Generalization of pseudo-valuation rings, to appear in International J. of Commutative Rings (IJCR).
[23] J. Coykendall, D. E. Dobbs, and B. Mullins, On integral domains with no atoms, Comm. Algebra 27(1999), 5813-5831.
[24] D. E. Dobbs, Coherence, ascent of going-down, and pseudo-valuation domains, Houston J. Math. 4(1978), 551-567.
[25] D. E. Dobbs, Divided rings and going-down, Pacific J. Math. 67(1976), 353-363.
[26] D. E. Dobbs and R. Fedder, Conducive integral domains, J. Algebra 86(1984), 494-510.
[27] D. E. Dobbs, V. Barucci, and M. Fontana, Gorenstein conducive domains, Comm. Algebra 18(1990), 3889-3903.
[28] D. E. Dobbs and M. Fontana, Locally pseudo-valuation domains, Ann. Mat. Pura Appl. 134(1983), 147-168.
[29] D. E. Dobbs and M. Fontana, On pseudo-valuation domains and their globalizations, Lecture Notes in Pure and Appl. Math., Vol. 84(1983), 65-77.
[30] D. E. Dobbs, M. Fontana, J. A. Huckaba, and I. J. Papick, Strong ring extensions and pseudo-valuation domains, Houston J. Math. 8(1982), 167-184.
[31] J. R. Hedstrom and E. G. Houston, Pseudo-valuation domains, Pacific J. Math. 75(1978), 137-147.
[32] J. R. Hedstrom and E. G. Houston, Pseudo-valuation domains, II, Houston J. Math. 4(1978), 199-207.
[33] R. Matsuda, Note on overrings without a specified element, Math. J. of Ibaraki Univ. 30(1998), 9-14.
[34] R. Matsuda and T. Sugatani, Cancellation ideals in pseudo-valuation domains, Comm. Algebra 23(1995), 3983-3991.
[35] A. Okabe, Some results on pseudo-valuation domains, Tsukuba J. Math. 8(1984), 333-338.
[36] A. Okabe, On quasi-valuation domains, Math. J. of Ibaraki Univ. 32(2000), 29-31.
[37] I. J. Papick, Topologically defined classes of going down rings, Trans. Amer. Math. Soc. 219(1976), 1-37.
[38] M. Picavet-L'Hermitte, t-closed pairs, Lecture Notes Pure Appl. Math. Vol. 185, 401-415, Marcel Dekker, New York/Basel.
[39] A. Zaks, Half-factorial domains, Bull. Amer. Math. Soc. 82(1976), 721-724.
DEPARTMENT OF MATHEMATICS, BIRZEIT UNIVERSITY, BOX 14, BIRZEIT, PALESTINE, VIA ISRAEL
GRP SCHEMES FOR TIME-DEPENDENT 2-D A N D QUASI 1-D FLOWS 1 MATANIA BEN-ARTZI AND JOSEPH FALCOVITZ
Institute of Mathematics Hebrew University of Jerusalem Jerusalem 91904, Israel
ABSTRACT The Generalized Riemann Problem (GRP) scheme for compressible time-dependent flows is briefly presented. A 2-D (Strang-type) operator splitting method that uses 1-D GRP as its basic building block is outlined. The 1-D Sod test problem and a 2-D cylindrical blast problem serve to demonstrate the high-resolution capabilities of GRP methods. Additional two-dimensional sample case are briefly considered. One with experimental validation of shock diffraction through double-bend conduit. Another with a comparison between fully two-dimensional solutions of wave interaction with area contraction segment in a duct, and the corresponding quasi one-dimensional approximation. The main thrust of the paper is in the validation, both analytic and experimental, of the quasi 1-D approximation and the operator splitting method, in the context of 2-D non-planar flows.
(1) Talk given by MBA at the Second Palestinian Mathematical Conference, Bet-Lehem, Palestine, August 2000.
1. INTRODUCTION
The Generalized Riemann Problem (GRP) scheme for time-dependent compressible flow in one space dimension is a second-order accurate "analytic" extension to the classical (first-order accurate) Godunov scheme [7]. Over the past decade, aiming at practical simulation of shock wave phenomena, two-dimensional schemes (and most recently a three-dimensional one) were developed, using the GRP method as a fundamental building block. This presentation is intended to serve as a concise introduction to the basic GRP methodology. For a more comprehensive account of the GRP method and its diverse applications, we recommend as a first reference the extensive review [5]. Further details of the GRP analysis and scheme may be found in earlier GRP publications [1,2].
The notion of "scheme extensions" is a central concept with regard to GRP methods. While we refer to [5] for an overview of major existing extensions, we shall mention them here briefly. For the treatment of multi-fluid shock wave phenomena a Material Interface Tracking (MIT) GRP extension was developed in two space dimensions. Another versatile 2-D GRP extension is the Moving Boundary Tracking (MBT) scheme, for flows involving moving/deforming boundary surfaces in a Cartesian grid. Diverse physical extensions were also introduced into the basic GRP method. A dusty gasdynamics scheme was developed by Wang and Wu [14], and more recently by Falcovitz and Igra [6]. Flows involving additional energy sources, such as chemical energy (combustion) and potential energy (self-gravity), were treated by adequately extended GRP schemes. Also, a "singularity tracking" (1-D) extension to the GRP method was employed for producing simulations of shock wave phenomena with near-perfect accuracy and resolution.
This presentation starts in Section 2 with an outline of the basic (one-dimensional) GRP method, followed by a computation of Sod's shock-tube problem [11] that demonstrates the accuracy and high resolution of GRP simulation. In Section 3 the 2-D operator-split GRP scheme is outlined, followed by a cylindrical blast test problem. The good agreement between the 2-D simulation of that blast flow and the corresponding (cylindrically-symmetric) 1-D GRP simulation demonstrates the accuracy of the operator-split 2-D scheme. Here we also briefly consider two additional problems: one with experimental validation of shock diffraction through a double-bend conduit, the other with a comparison between a fully two-dimensional solution of wave interaction with an area contraction segment in a duct and the corresponding quasi one-dimensional approximation.
It is emphasized that the purpose of this paper is not to review the (huge) existing literature concerning high-resolution (second-order) methods for nonlinear conservation laws. There is definitely no attempt at comparing features of different schemes. It is intended to give a rather self-contained review of the techniques used in the GRP approach (indeed, the multidimensional extension by operator-splitting is very classical and common to many methods), followed by a report on recent results related to quasi 1-D flows. Such flows (like spherically symmetric flows) are quite common in various physical and engineering settings. However, we feel that they are under-represented in the (high-resolution) numerical literature. The test cases mentioned here offer a "cross-examination" of the validity of quasi 1-D high-resolution accuracy, on one hand, and its compatibility with full 2-D (split) calculations on the other hand.
2. OUTLINE OF THE GRP METHOD
A) Quasi 1-D Duct Flow Equations.
The equations governing the quasi-one-dimensional flow of an inviscid compressible fluid [4] through a duct having a smoothly varying cross-section area $A(r)$, as functions of the space coordinate $r$ and the time $t$, are
(2.1)  $\frac{\partial}{\partial t}(A U) + \frac{\partial}{\partial r}[A F(U)] + A \frac{\partial}{\partial r} G(U) = 0$,
$U(r,t) = (\rho,\ \rho u,\ \rho E)^T$,  $F(U) = (\rho u,\ \rho u^2,\ (\rho E + p)u)^T$,  $G(U) = (0,\ p,\ 0)^T$.
Here $\rho, p, u, E$ are, respectively, density, pressure, velocity and total specific energy, where $E = e + \frac{1}{2}u^2$, $e$ being the internal specific energy. In general, the thermodynamic variables $p, \rho, e$ are related by an "equation of state". We shall frequently refer to the most common case, that of an ideal "$\gamma$-law" gas, where
(2.2)  $p = (\gamma - 1)\rho e$,  $\gamma > 1$.
B) Quasi-Conservative GRP Scheme.
The following notation is adopted for the finite-difference approximation to Eq. (2.1). The spatial grid is $r_i = i\Delta r$, $i = 1, 2, \ldots, i_{max}$, where $\Delta r$ is constant. The (numerical) solution is sought at equally spaced time-levels $t_n = n\Delta t$. By "cell $i$" we refer to the interval extending between the "cell-boundaries" $r_{i \pm 1/2} = (i \pm \frac{1}{2})\Delta r$. We label by $Q_i^n$ the average value of a quantity (flow variable) $Q$ over cell $i$ at time-level $t_n$. Similarly, $Q_{i+1/2}^{n+1/2}$ is the value of $Q$ at the cell boundary $r_{i+1/2}$, averaged over the time interval $(t_n, t_{n+1})$. Flow variables are approximated as piecewise-linear in cells, where $\Delta Q_i^n$ denotes the "slope", i.e., the variation in $Q$ over the cell interval. The discretization scheme is shown schematically in Fig. 1.
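Figure 1. Distribution of Flow Variable per Cell (first-order: piecewise-constant $Q_i^n$; second-order: piecewise-linear with slope $\Delta Q_i^n$).
Taking the cell $i$ to be a "finite-volume", a "quasi-conservative" difference scheme for (2.1) is given by
(2.3)  $U_i^{n+1} - U_i^n = -\frac{\Delta t}{\Delta V_i}\left\{\left[A(r_{i+1/2})\,F(U)_{i+1/2}^{n+1/2} - A(r_{i-1/2})\,F(U)_{i-1/2}^{n+1/2}\right] + A(r_i)\left[G(U)_{i+1/2}^{n+1/2} - G(U)_{i-1/2}^{n+1/2}\right]\right\}$,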
where $\Delta V_i = \int_{r_{i-1/2}}^{r_{i+1/2}} A(r)\,dr$ is the volume of cell $i$. The scheme (2.3) is completely defined only when the fluxes
(2.4)  $F(U)_{i+1/2}^{n+1/2} = F(U_{i+1/2}^{n+1/2})$,  $G(U)_{i+1/2}^{n+1/2} = G(U_{i+1/2}^{n+1/2})$,
are specified as functions of the "state variables" $U_i^n$ and the respective slopes $\Delta U_i^n$. It is emphasized that once $U_i^{n+1}$ are evaluated by (2.3), they are never changed or modified in any way. The slopes of flow variables in cells are also updated to time-level $t_{n+1}$. They are, however, subject to monotonicity constraints [13], following the time-level updating. Here we will provide a sketch of the scheme, noting, in particular, the cases of the (first-order accurate) Godunov scheme, the (second-order accurate) GRP scheme and its simplified version $E_1$.
Godunov [7] proposed to solve (at every cell-boundary $r_{i+1/2}$) the "planar" Riemann Problem, obtained by solving Eq. (2.1) with $A(r) = 1$ and piecewise-constant initial data consisting of the states $U_i^n$, $U_{i+1}^n$ on the left and right of $r_{i+1/2}$, respectively (see Fig. 1). The solution to a Riemann problem [4] is self-similar (as shown schematically in Fig. 2), so that $U_{i+1/2}^{n+1/2}$ for the flux in Eq. (2.4) is the solution at $r_{i+1/2}$ and $t - t_n = 0+$. It is well known [10] that the resulting (first-order) scheme is stable and robust, but also that jump discontinuities are poorly resolved by it.
For the GRP scheme, the flow variable values $U_{i+1/2}^{n+1/2}$ for the flux in Eq. (2.4) are obtained by an analytic procedure based on solving at each grid point a Generalized Riemann Problem (GRP), which is the initial value problem for Eq. (2.1) having piecewise-linear initial data (see Fig. 1). The solution to a GRP is not self-similar (as shown schematically in Fig. 3), and the mid-step flow variables are evaluated from the following two-term Taylor expansion (in $t$):
(2.5)  $U_{i+1/2}^{n+1/2} = U_{i+1/2}^{n} + \frac{\Delta t}{2}\left[\frac{\partial U}{\partial t}\right]_{i+1/2}^{n}$.
The key idea of the GRP method is to derive the expressions for the mid-step fluxes analytically, according to Eqs. (2.4) and (2.5). This leads to a scheme where the fluxes are evaluated from plug-in expressions, thus constituting an "analytic" upgrading of the Godunov scheme to second-order accuracy level.
C) The $E_1$ Scheme.
By inspecting Eq. (2.5), it is clear that in order to get a second-order upgrading of Godunov's scheme, it suffices to determine the time-derivative with an $O(\Delta t)$ error, since then the error made in the evaluation of $U_{i+1/2}^{n+1/2}$ and the corresponding flux terms is of order $O(\Delta t^2)$. The resulting simplification, denoted as the $E_1$ scheme, leads to an extremely simple modification of Godunov's scheme. Indeed, it is the simplest possible modification that upgrades the Godunov scheme to a second-order accuracy level. Our experience with numerous GRP computations indicates that in the vast majority of cases (i.e., in all regions of smooth flow), it suffices to use the simplified version. We actually wrote our GRP codes with both the simplified ($E_1$) and the "fully analytic" ($E_\infty$) schemes, where the use of the latter is restricted to "difficult" grid-points (e.g., large jumps).
D) The Sod Shock-Tube Problem.
We now turn to the well-known shock-tube problem proposed by Sod [11], which has served as a standard test case for the evaluation of numerical schemes. The tube extends from $r = 0$ to $r = 100$ (with planar symmetry, $A(r) = 1$) and is divided into 100 equal cells. The fluid is a $\gamma$-law perfect gas with $\gamma = 1.4$. The initial conditions are $u = 0$, $\rho = p = 1$ for $0 < r < 50$; $u = 0$, $p = 0.1$, $\rho = 0.125$ for $50 < r < 100$. Two computations of that problem were performed, one using the Godunov scheme, the other with the GRP scheme. In Fig. 4 we show the flow profiles at $t = 20$ obtained from the Godunov scheme (the exact self-similar solution is given by the solid curve). In Fig. 5 we show the same profiles obtained from the $E_1/E_\infty$ GRP scheme.
It is evident that both the contact and the shock discontinuities are far better resolved by the GRP scheme than by the Godunov scheme. The overall improvement due to the enhanced accuracy and high-resolution of the GRP scheme is quite dramatic.
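To make the setup of the Sod test concrete, here is a minimal sketch of a first-order finite-volume computation on the same grid (100 cells on $0 < r < 100$, $\gamma = 1.4$, final time $t = 20$). It is not the Godunov or GRP scheme of this paper: the interface flux is a Rusanov (local Lax-Friedrichs) flux, swapped in because it needs no Riemann solver. The function names, the CFL number 0.4 and the transmissive boundary treatment are assumptions made for illustration only.

```python
GAMMA = 1.4  # ratio of specific heats, as in the Sod test above

def pressure(rho, mom, ene):
    """Ideal-gas pressure from conserved variables (rho, rho*u, rho*E)."""
    return (GAMMA - 1.0) * (ene - 0.5 * mom * mom / rho)

def flux(rho, mom, ene):
    """Planar Euler flux (A(r) = 1): (rho*u, rho*u^2 + p, (rho*E + p)*u)."""
    u = mom / rho
    p = pressure(rho, mom, ene)
    return (mom, mom * u + p, (ene + p) * u)

def max_speed(rho, mom, ene):
    """Largest characteristic speed |u| + c of a state."""
    return abs(mom / rho) + (GAMMA * pressure(rho, mom, ene) / rho) ** 0.5

def sod_first_order(ncells=100, length=100.0, t_end=20.0, cfl=0.4):
    """First-order conservative update with a Rusanov flux (illustrative only)."""
    dr = length / ncells
    cells = []
    for i in range(ncells):
        r = (i + 0.5) * dr
        rho, p = (1.0, 1.0) if r < 0.5 * length else (0.125, 0.1)
        cells.append((rho, 0.0, p / (GAMMA - 1.0)))  # u = 0 initially
    t = 0.0
    while t_end - t > 1e-12:
        dt = min(cfl * dr / max(max_speed(*c) for c in cells), t_end - t)
        ext = [cells[0]] + cells + [cells[-1]]  # transmissive ghost cells
        fluxes = []
        for left, right in zip(ext[:-1], ext[1:]):
            fl, fr = flux(*left), flux(*right)
            a = max(max_speed(*left), max_speed(*right))
            fluxes.append(tuple(0.5 * (fl[k] + fr[k]) - 0.5 * a * (right[k] - left[k])
                                for k in range(3)))
        cells = [tuple(c[k] - dt / dr * (fluxes[i + 1][k] - fluxes[i][k])
                       for k in range(3))
                 for i, c in enumerate(cells)]
        t += dt
    return dr, cells

dr, cells = sod_first_order()
total_mass = dr * sum(c[0] for c in cells)
```

Since no wave reaches the ends of the tube by $t = 20$, the total mass should stay at its initial value $50 \cdot 1 + 50 \cdot 0.125 = 56.25$, which gives a simple sanity check on the conservative update. A scheme of this kind is expected to smear the contact and the shock at least as much as the Godunov result of Fig. 4.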
Figure 2. Self-Similar Solution to Riemann Problem (contact and shock trajectories).
Figure 3. Solution to Generalized Riemann Problem (contact and shock trajectories).
Figure 4. Sod's Shock-Tube problem. Godunov's method. (Velocity and density profiles at T = 20.000.)
Figure 5. Sod's Shock-Tube problem. GRP method. (Velocity and density profiles at T = 20.000.)
3. THE GRP METHOD FOR TWO-DIMENSIONAL FLOW
A) The Governing Equations.
Assuming an inviscid compressible fluid and an ideal gas equation of state, the time-dependent two-dimensional flow is governed by the laws for conservation of mass, momentum and energy, expressed in Cartesian coordinates $(x, y)$ as
(3.1)  $\partial_t U + \partial_x F(U) + \partial_y G(U) = 0$,
(3.2)  $U(x,y,t) = (\rho,\ \rho u,\ \rho v,\ \rho E)^T$,  $F(U) = (\rho u,\ \rho u^2 + p,\ \rho u v,\ u(\rho E + p))^T$,  $G(U) = (\rho v,\ \rho u v,\ \rho v^2 + p,\ v(\rho E + p))^T$,
(3.3)  $p = (\gamma - 1)\rho e$,  $\gamma = \text{constant} > 1$,  $e = E - \frac{1}{2}(u^2 + v^2)$.
In (3.1) we denote by $\rho, p, e, E, u, v$ the density, pressure, specific energy, specific total energy and $(x, y)$-velocity components, respectively.
B) Operator Splitting.
The two-dimensional finite-difference approximation to (3.1) is formulated as a "Strang-type" operator-splitting [12], using the GRP scheme as the one-dimensional finite-difference operator. This splitting procedure can be outlined as follows. The system (3.1) is first split into the two simpler systems,
(3.4)(i)  $\partial_t U + \partial_x F(U) = 0$,
(3.4)(ii)  $\partial_t U + \partial_y G(U) = 0$.
Loosely speaking, the system (3.4) is taken to mean that the evolution of an initial state $U_0$ by (3.1) over a short time interval $\Delta t$ can be approximated by evolving $U_0$ first subject to (3.4)(i) (over time $\Delta t$), obtaining a state $U_1$, then evolving $U_1$ in accordance with (3.4)(ii), again over time $\Delta t$. Let $L_x(\Delta t)$, $L_y(\Delta t)$, $L(\Delta t)$ denote finite-difference approximation operators for the integration by a time-step $\Delta t$ of (3.4)(i), (3.4)(ii), (3.1), respectively. Then, as shown by Strang [12], the operator sequence
(3.5)  $L(\Delta t) = L_x(\tfrac{1}{2}\Delta t)\, L_y(\Delta t)\, L_x(\tfrac{1}{2}\Delta t)$
is a second-order finite-difference approximation to (3.1). The one-dimensional operators $L_x(\Delta t)$, $L_y(\Delta t)$ are given by the planar "GRP solver". We reiterate the basic idea (in terms of $L_x$) as follows. The grid consists of the sequence of points $x_{i+1/2} = (i + 1/2)\Delta x$, $i = 0, 1, 2, \ldots, i_{max}$, where $\Delta x$ is the grid spacing and cell $i$ is the interval $x_{i-1/2} < x < x_{i+1/2}$. $U(x,y,t)$ (for a fixed $y$) is approximated at time $t = t_n = n\Delta t$ by $U^n(x,y)$, a piecewise-linear distribution in cells, having the average value $U_i^n$ in cell $i$. The finite-difference GRP solver $L_x$, yielding $\{U_i^{n+1}\}_{i=1}^{i_{max}}$ in terms of $\{U_i^n\}_{i=1}^{i_{max}}$, is explicitly given by
(3.6)  $U_i^{n+1} = U_i^n - \frac{\Delta t}{\Delta x}\left[F(U)_{i+1/2}^{n+1/2} - F(U)_{i-1/2}^{n+1/2}\right]$,
where the time-centered fluxes $F(U)_{i+1/2}^{n+1/2}$ are determined analytically from solutions to Generalized Riemann Problems that arise at the cell-interfaces $x_{i+1/2}$, as explained in Section 2(B) above.
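The second-order property of the symmetric sequence (3.5) can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the GRP operators: $L_x$ and $L_y$ are replaced by the exact flows of two non-commuting 2x2 linear systems, $A = \begin{pmatrix}0&1\\0&0\end{pmatrix}$ and $B = \begin{pmatrix}0&0\\1&0\end{pmatrix}$, for which $e^{tA}$, $e^{tB}$ and $e^{t(A+B)}$ are known in closed form. It compares the sequential split described above ("first (i), then (ii)") with the symmetric sequence (3.5).

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def flow_a(t):
    """exp(t*A) for A = [[0, 1], [0, 0]] (plays the role of L_x)."""
    return [[1.0, t], [0.0, 1.0]]

def flow_b(t):
    """exp(t*B) for B = [[0, 0], [1, 0]] (plays the role of L_y)."""
    return [[1.0, 0.0], [t, 1.0]]

def exact_flow(t):
    """exp(t*(A + B)), the unsplit evolution operator."""
    return [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]

def splitting_error(n_steps, symmetric, t_end=1.0):
    """Max-entry error at t_end of the split evolution with n_steps steps."""
    h = t_end / n_steps
    if symmetric:   # L_x(h/2) L_y(h) L_x(h/2), as in (3.5)
        step = matmul(matmul(flow_a(0.5 * h), flow_b(h)), flow_a(0.5 * h))
    else:           # L_y(h) L_x(h): evolve by (i), then by (ii)
        step = matmul(flow_b(h), flow_a(h))
    approx = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_steps):
        approx = matmul(step, approx)
    ref = exact_flow(t_end)
    return max(abs(approx[i][j] - ref[i][j]) for i in range(2) for j in range(2))

# Halving the step should divide the error by about 2 (sequential)
# and by about 4 (symmetric), i.e. first versus second order.
ratio_sequential = splitting_error(20, False) / splitting_error(40, False)
ratio_symmetric = splitting_error(20, True) / splitting_error(40, True)
```

In this toy setting the only source of error is the splitting itself, so the error ratios isolate the order of the operator sequence; in the actual scheme the one-dimensional GRP operators contribute their own (second-order) truncation error as well.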
C) Cylindrical Blast Example.
This sample problem was chosen to illustrate the capability of the two-dimensional GRP scheme, in a case where cylindrical symmetry also permits the application of the one-dimensional GRP scheme (2.3) with $A(r) = r$. A more detailed account of this problem is given in the recent publication [3]. The initial data is as follows. The fluid is an ideal gas having $\gamma = 1.4$ and it is initially at rest everywhere. Inside the circle centered at $(x,y) = (0,0)$, of radius $R = 50$, there is a high-pressure state $(\rho_L, p_L) = (10, 20)$, while the low-pressure state outside the circle is $(\rho_R, p_R) = (1, 1)$. The computational domain is the square $(0 < x < 100,\ 0 < y < 100)$, which is divided into a grid of 100 x 100 square cells. The integration was carried out with a constant time-step $\Delta t = 0.16$, up to the final time of $T = 12.5$. Compliance with the Courant-Friedrichs-Lewy (CFL) stability condition [10] throughout the computation was verified. As a reference calculation, we used the quasi 1-D GRP scheme (2.3) to compute the flow profiles, as functions of the radial coordinate $r$. The pressure and density profiles thus obtained are shown in Fig. 6. They represent a sharp outgoing shock wave and an ingoing rarefaction, separated by a contact discontinuity.
Figure 6. One-dimensional profiles as function of r. (Pressure and density at T = 12.50, for 0 < r < 100.)
In the two-dimensional computation, the initial data in cells intersected by the circle was smoothed out by taking area-weighted density and pressure averages. The resulting density distributions are shown in Fig. 7(a) as a grey-scale plot (the amount of grey shading is proportional to the density), and in Fig. 7(b) as a diagonal cross-section (along the line $x = y$).
Figure 7(a). Density grey-scale plot on (x,y) plane, T=12.50.
Figure 7(b). Density diagonal cross-section, T=12.50.
The agreement between the one-dimensional cylindrically-symmetric computations and the corresponding two-dimensional results is very good (compare the density profile in Fig. 7(b) with that of Fig. 6). This agreement underlines the fact that even though the schemes (2.3) and (3.5)-(3.6) are quite different in formulation, with the latter being based on operator-splitting, the results agree quite well.
D) Further Analysis of 2-D Flows.
Presently, two-dimensional fluid dynamical phenomena are accurately simulated by the GRP method, and additionally, experimental techniques for flow visualization are well developed. Thus, experimental validation of numerical computations is readily performed in two-dimensional setups. A number of two-dimensional sample cases were studied during the past decade, some with experimental validation and some without. Here we present summarily two such studies, referring to the original publications [8,9] for detailed accounts.
In a recent experimental and computational study of this kind [8], the diffraction of a shock wave propagating through a double-bend passage was visualized by shadowgraph and double-exposure holographic methods. A notably good agreement was obtained between the experimental visualization and GRP computed results at a sequence of time points that covers the entire shock passing process [8].
In another study we considered the interaction of shock or rarefaction waves with a (smooth) contraction of the duct cross-sectional area [9]. Here the idea was to compare the fully two-dimensional solution with the corresponding one-dimensional approximation, obtained by solving numerically the quasi one-dimensional equations for compressible flow in a duct of varying cross-section area. The case of a rarefaction wave propagating through an area contraction segment is shown in Fig. 8 as a time sequence of isobar maps.
We note that at large time a stationary two-dimensional shock wave system, which appears to be a Mach reflection from the symmetry plane, has formed in order to match the pressure of the supersonic flow issuing from the narrow duct with the (higher) pressure prevailing downstream in the wider duct. Clearly this flow pattern is inherently two-dimensional, and cannot be approximated by a quasi one-dimensional duct flow. We refer to [9] for a more comprehensive analysis of this case.
Figure 8. Isobars time-sequence of rarefaction wave interaction with area change. (GRP, two-dimensional duct flow, area ratio = 0.5; rarefaction wave into quiescent gas, pressure ratio 0.1, $\gamma = 1.4$; frames at T = 0, 90, 180, 270, 360, 450, 540.)
4. REFERENCES
1) M. Ben-Artzi and J. Falcovitz, A second-order Godunov-type scheme for compressible fluid dynamics, J. Comp. Phys. 55 (1984), 1-32.
2) M. Ben-Artzi and J. Falcovitz, An upwind second-order scheme for compressible duct flows, SIAM J. Sci. Stat. Comp. 7 (1986), 744-768.
3) M. Ben-Artzi, J. Falcovitz and U. Feldman, Remarks on high-resolution split schemes computation, SIAM Journal on Scientific Computing, 22, 1008-1015, 2000.
4) R. Courant and K.O. Friedrichs, "Supersonic flow and shock waves", Springer-Verlag, New York 1976.
5) J. Falcovitz and M. Ben-Artzi, Recent Developments of the GRP Method, JSME International Journal, Series B, 38, No. 4, 497-517, 1995.
6) J. Falcovitz and O. Igra, Analysis of Shock Wave Structure in Dusty Suspension, presented at the 14th International Mach Reflection Symposium, Tohoku University, Sendai, Japan, 2-4 October, 2000.
7) S.K. Godunov, A finite difference method for the numerical computation of discontinuous solutions of the equations of fluid dynamics, Mat. Sbornik 47 (1959), 271-295.
8) O. Igra, J. Falcovitz, T. Meguro, K. Takayama and W. Heilig, Experimental and theoretical studies of shock wave propagation through double-bend ducts, Journal of Fluid Mechanics, 437, 255-282, 2001.
9) O. Igra, L. Wang and J. Falcovitz, Non-stationary compressible flow in ducts with varying cross-section, Proc. Instn. Mech. Engrs., Part G, 212, 225-243, 1998.
10) R.D. Richtmyer and K.W. Morton, "Difference Methods for Initial Value Problems", Interscience, New York 1967.
11) G.A. Sod, A survey of several finite difference methods for systems of non-linear hyperbolic conservation laws, J. Comp. Phys. 27 (1978), 1-31.
12) G. Strang, On the construction and comparison of difference schemes, SIAM J. Num. Anal. 5 (1968), 506-517.
13) B. van-Leer, Towards the ultimate conservative difference scheme V, J. Comp. Phys. 32 (1979), 101-136.
14) B.Y. Wang and R.S. Wu, Numerical investigation of dusty gas shock wave propagation along a variable cross-section channel, in "Shock Waves: Proc. of the 18th Int. Symp. on Shock Waves" (Ed. K. Takayama), Springer-Verlag 1991, pp. 521-526.
APPROXIMATING FIXED POINTS OF LIPSCHITZIAN GENERALIZED PSEUDO-CONTRACTIONS
Vasile BERINDE
Department of Mathematics and Computer Science, Faculty of Sciences, North University of Baia Mare, Victoriei 76, 4800 Baia Mare, ROMANIA; E-mail: [email protected]
Abstract. For a class of iterations approximating fixed points of Lipschitzian generalized pseudo-contractions in Hilbert spaces, the fastest iteration is found.
Mathematics Subject Classification: 47H10
Key words and phrases: fixed points, Lipschitzian operator, generalized pseudo-contractive operator, Hilbert space, simple iteration method
1. Introduction
Many of the most important nonlinear problems of applied mathematics reduce to finding solutions of nonlinear functional equations. These equations can then be formulated in terms of finding the fixed points of given nonlinear selfmappings. From the point of view of applications, it is essential not only to show the existence of fixed points of such mappings under suitable hypotheses, but also to develop systematic techniques for the construction or calculation of the fixed point(s).
The classical iteration scheme, also known as the method of successive approximations or the Picard iteration associated to the operator $T$,
$x_{n+1} = Tx_n$,  $n \ge 0$,
is known to converge to the (unique) fixed point of $T$ for some classes of contractive operators (strictly contractive, generalized contractive etc.; see, for example, [2]). If $T$ fulfills only a Lipschitz-type condition (that is, for example, the case when $T$ is nonexpansive), then generally the Picard iteration does not converge to a fixed point of $T$.
In order to remove these difficulties, other kinds of iterations were considered: the simple iteration method, the Mann iteration, the Ishikawa iteration [4] etc. Starting from the fact that the so-called simple iteration method is suitable for approximating fixed points of Lipschitzian generalized pseudo-contractions, the main aim of this paper is to find the fastest iteration (if any) in the family of simple iterations given by (5).
Let $K$ be a non-empty closed convex subset of a real Hilbert space $H$ and let $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ denote, respectively, the inner product and the norm on $H$. We need some notions and results from [7]. An operator $T : K \to K$ is said to be a generalized pseudo-contraction if there exists a constant $r > 0$ such that, for all $x, y$ in $K$,
(1)  $\|Tx - Ty\|^2 \le r^2 \|x - y\|^2 + \|Tx - Ty - r(x - y)\|^2$.
This is equivalent to
(2)  $\langle Tx - Ty, x - y \rangle \le r \|x - y\|^2$, for all $x, y$ in $K$,
or to
(3)  $\langle (I - T)x - (I - T)y, x - y \rangle \ge (1 - r)\|x - y\|^2$,
where $I$ is the identity. The operator $T$ is called Lipschitzian (or Lipschitz continuous) if there exists a constant $s > 0$ such that
(4)  $\|Tx - Ty\| \le s \|x - y\|$, for all $x, y$ in $K$.
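The equivalence of (1), (2) and (3) is stated without proof; a short derivation, using only the bilinearity of the inner product and $r > 0$, can be sketched as follows.

```latex
\begin{align*}
\|Tx - Ty - r(x-y)\|^2
  &= \|Tx - Ty\|^2 - 2r\,\langle Tx - Ty,\, x - y\rangle + r^2\|x - y\|^2, \\
\text{so (1)} \iff 0 &\le 2r^2\|x - y\|^2 - 2r\,\langle Tx - Ty,\, x - y\rangle
  \iff \langle Tx - Ty,\, x - y\rangle \le r\,\|x - y\|^2 \quad \text{(2)}, \\
\langle (I-T)x - (I-T)y,\, x - y\rangle
  &= \|x - y\|^2 - \langle Tx - Ty,\, x - y\rangle
  \;\ge\; (1 - r)\,\|x - y\|^2 \iff \text{(2)}.
\end{align*}
```

The division by $2r$ in the second line is where the assumption $r > 0$ is used.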
In order to approximate fixed points of Lipschitzian generalized pseudo-contractions $T : K \to K$, Verma [7] considered two classes of iterations defined by the following algorithms.
Algorithm 1.1. (Simple iteration method) For a given $x_0$ in $K$, compute $x_{n+1}$ by the iterative algorithm
(5)  $x_{n+1} = (1 - t)x_n + t \cdot Tx_n$, for $n \ge 0$,
where $t > 0$ is arbitrary.
Algorithm 1.2. (Mann iteration method) For a given $x_0$ in $K$, compute $x_{n+1}$ by
(6)  $x_{n+1} = (1 - a_n)x_n + a_n[(1 - t)x_n + t \cdot Tx_n]$, for all $n \ge 0$,
where the sequence $\{a_n\}$ lies in $[0,1]$ such that $\sum_{n=0}^{\infty} a_n = \infty$, and $t > 0$ is arbitrary.
Remark. For $a_n = 1$, $n = 0, 1, 2, \ldots$, the iteration given by (6) reduces to the iteration (5). In fact the simple iteration method corresponding to $T$ is just the Picard iteration method corresponding to the operator
$U = (1 - t)I + t \cdot T$,  $t > 0$,
associated to $T$.
One of the results given in [7] (Theorem 2.1) shows that any Lipschitzian generalized pseudo-contractive self operator $T$ of a nonempty closed convex subset $K$ of $H$ has a unique fixed point $z$ in $K$ and that for each $x_0$ in $K$ the sequence $\{x_n\}$ generated by (5) converges to $z$, for all $t$ belonging to $(0, a)$, where
(7)  $a = 2(1 - r)/(1 - 2r + s^2)$
and $s$ and $r$ are, respectively, the Lipschitzian and the generalized pseudo-contraction constants ($r < 1$).
Our main aim in the present paper is to answer the question: among all the iterations (5) associated to a given operator $T$ with $t$ belonging to the interval $(0, a)$, is there a fastest one? To this end we need some remarks and results concerning the Lipschitzian and generalized pseudo-contractive operators.
2. Lipschitzian operators are generalized pseudo-contractions
We begin this section with
LEMMA 2.1. Let $K$ be a non-empty subset of a real Hilbert space $H$. Then any Lipschitzian operator $T : K \to K$ is a generalized pseudo-contractive operator too.
Proof. By the Cauchy-Schwarz inequality
$|\langle Tx - Ty, x - y \rangle| \le \|Tx - Ty\| \cdot \|x - y\|$
and (4), we clearly obtain (2) with $r = s$, that is, $T$ is a generalized pseudo-contractive operator.
Remarks. 1) The previous Lemma shows that if $T$ is a Lipschitzian operator with constant $s$, then $T$ is also a generalized pseudo-contractive operator with the same constant. This, however, does not exclude the possibility that $T$ satisfies (2) with a constant $r$ less than $s$. This means that, for a given Lipschitzian operator $T$ with constant $s$, the only reason for considering explicitly a generalized pseudo-contractive condition with constant $r$ is that $r$ could be less than $s$.
2) On the other hand, the fixed point theorems established in [7] are essentially deduced under the assumptions $r < 1$ and $s \ge 1$. (In the case $s < 1$, $T$ is a contraction and then, by virtue of the fixed point theorem of Banach, the sequence of successive approximations, which may be obtained from (5) by taking $t = 1$, converges to the unique fixed point of $T$ in $K$.) Consequently, the fixed point results obtained in [7], as well as all our results in the present paper, do make sense mainly under the self-understood assumptions
(8)  $0 < r < 1$ and $r < s$.
Example 2.1. Let $H$ be $\mathbb{R}$, $K = [\frac{1}{2}, 2]$ and $T : K \to K$ given by $Tx = \frac{1}{x}$ for all $x$ in $K$. Then $T$ is Lipschitzian with constant $s = 4$ and also generalized pseudo-contractive with constant $r$, with $r > 0$ arbitrarily taken.
3. The fastest iteration
First we prove a result which completes Theorem 2.1 in [7] with the estimates (9) and (10). Our proof is formally different from that given in [7].
THEOREM 3.1. Let $K$ be a non-empty closed convex subset of a real Hilbert space $H$, and let $T : K \to K$ be generalized pseudo-contractive and Lipschitz continuous with corresponding constants $r$ and $s$ fulfilling (8). Then
(i) $T$ has a unique fixed point $z$ in $K$;
(ii) For each $x_0 \in K$, the sequence $\{x_n\}$ given by (5) converges strongly to $z$, for all $t$ in $(0, a)$ with $a$ given by (7);
(iii) The following estimations
(9)  $\|x_n - z\| \le (\theta^n/(1 - \theta)) \cdot \|x_1 - x_0\|$,  $n \ge 1$,
(10)  $\|x_n - z\| \le (\theta/(1 - \theta)) \cdot \|x_n - x_{n-1}\|$,  $n \ge 1$,
hold, where
(11)  $\theta = ((1 - t)^2 + 2t(1 - t)r + t^2 s^2)^{1/2}$.
Proof. We consider on $K$ the operator
(12)  $Fx = (1 - t)x + t \cdot Tx$,  $x \in K$,
for all $t > 0$. Since $K$ is convex, we have $F(K) \subset K$ for each $t \in (0, 1]$. As a closed subset of a Hilbert space, $K$ is a complete metric space. Since $T$ is generalized pseudo-contractive and Lipschitz continuous, the operator $F : K \to K$ given by (12) is a contraction. Indeed, from
$\|Fx - Fy\|^2 = \|(1 - t)x + tTx - (1 - t)y - tTy\|^2$
$= (1 - t)^2 \|x - y\|^2 + 2t(1 - t) \cdot \langle Tx - Ty, x - y \rangle + t^2 \cdot \|Tx - Ty\|^2$
$\le ((1 - t)^2 + 2t(1 - t)r + t^2 s^2)\|x - y\|^2$,
we obtain
$\|Fx - Fy\| \le \theta \cdot \|x - y\|$, for all $x, y$ in $K$,
where $\theta < 1$, in virtue of the fact that $t < a$. To obtain the conclusion we now apply the Banach fixed point theorem with both a priori and a posteriori error estimates (see [2] or [6], for example) for the operator $F$ considered on the complete metric space $K$.
Remark. The a priori estimate (9) in Theorem 3.1 shows that $\{x_n\}$ converges to $z$ at least as quickly as the geometric progression having the ratio $\theta$. Consequently, if two sequences $\{x'_n\}$, $\{x''_n\}$ converging to $z$ satisfy (9) with $\theta = \theta_1$ and $\theta = \theta_2$, respectively, $\theta_1, \theta_2 \in (0, 1)$, we shall say that $\{x'_n\}$ converges faster than $\{x''_n\}$ if $\theta_1 < \theta_2$.
We are now ready to present the main result of this paper.
THEOREM 3.2. Let all assumptions in Theorem 3.1 be satisfied. Then the fastest iteration $\{x_n\}$ obtained from (5) when $t \in (0, a)$ is the iteration given by
(14)  $t_{\min} = (1 - r)/(1 - 2r + s^2)$.
Proof. By the a priori estimate (9), with $\theta$ given by (11), we deduce that we have to find the minimum of the quadratic function
$f(t) = (1 - t)^2 + 2t(1 - t)r + t^2 s^2$,
that is,
$f(t) = (1 - 2r + s^2)t^2 - 2(1 - r)t + 1$,  $t \in (0, a)$,
with $a$ the number given by (7). This is an elementary task. Indeed, from (8) we deduce that $1 - 2r + s^2 > (1 - r)^2 > 0$, hence $f$ does possess a minimum, which is attained for $t = t_{\min}$, given by (14). The minimum value of $f(t)$ is $f_{\min} = (s^2 - r^2)/(1 - 2r + s^2)$ and, hence, the minimum value of $\theta$ given by (11) is
(15)  $\theta_{\min} = [(s^2 - r^2)/(1 - 2r + s^2)]^{1/2}$.
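The formulas (7), (11), (14) and (15) are easy to evaluate numerically. The following sketch does so for given constants $r$ and $s$; the function names are hypothetical and chosen here for illustration, and the sample values $r = 0.5$, $s = 4$ are those used later for the operator of Example 2.1.

```python
import math

def admissible_bound(r, s):
    """Upper end a of the interval (0, a) in (7)."""
    return 2.0 * (1.0 - r) / (1.0 - 2.0 * r + s * s)

def contraction_ratio(t, r, s):
    """theta from (11) for a given parameter t."""
    return math.sqrt((1.0 - t) ** 2 + 2.0 * t * (1.0 - t) * r + (t * s) ** 2)

def fastest_parameter(r, s):
    """t_min from (14), the minimizer of the quadratic f(t)."""
    return (1.0 - r) / (1.0 - 2.0 * r + s * s)

def smallest_ratio(r, s):
    """theta_min from (15)."""
    return math.sqrt((s * s - r * r) / (1.0 - 2.0 * r + s * s))

r, s = 0.5, 4.0
a = admissible_bound(r, s)          # 1/16
t_min = fastest_parameter(r, s)     # 1/32, the midpoint of (0, a)
theta_min = smallest_ratio(r, s)    # sqrt(63)/8, about 0.992
```

Note that $t_{\min} = a/2$ always, since the quadratic $f$ satisfies $f(0) = f(a) = 1$ and is symmetric about its vertex.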
Remarks. 1) Clearly, from (8) and (15), we always have $0 < \theta_{\min} < 1$.
2) It is easy to see that, if $s < 1$ (that is, $T$ is actually a contraction), then we have $a > 1$ and hence $t = 1 \in (0, a)$, that is, the family (5) does contain the sequence of successive approximations. This means that the sequence of successive approximations associated to $T$ may be obtained from (5) for $t = 1$ (which does not happen when $s \ge 1$) and it may be compared with the fastest iteration from (5). It is easy to check that if $r = s^2 < 1$, then
$\theta_{\min} = s$,
that is, in this case the fastest iteration from (5) does coincide with the sequence of successive approximations associated to the contraction $T$. Consequently, if $r \ne s^2$ and $s < 1$ then $\theta_{\min} < s$, which shows that the fastest iteration from (5) is faster than the sequence of successive approximations associated to $T$. In this case, the fastest iteration from (5) may be seen as an acceleration of the sequence of successive approximations (which is convergent too, because $T$ is a contraction).
3) The main benefit of this paper may be summarized as follows. If $T$ is a contraction (with constant $s$) and, in addition, $T$ is generalized pseudo-contractive with constant $r < 1$ such that $r \ne s^2$, then we may accelerate the successive approximations iteration by considering (5) with $t = t_{\min}$ (instead of $t = 1$).
4) A similar result to that in Theorem 3.2 may be obtained for the iteration (6), as in [8].
4. Numerical examples
Example 4.1. For $T$ as in Example 2.1 the Picard iteration method does not converge, for any initial approximation $x_0$, $x_0 \ne 1$ ($= z$, the unique fixed point of $T$ in $K$). By taking $r = 0.5$, we obtain $a = 1/16$ and hence (5) converges to $z = 1$ for each $t \in (0, 1/16)$. Among the iterations given by (5), with $r = 0.5$, the fastest one is obtained for $t_{\min} = 1/32$:
(16)  $x_{n+1} = \frac{1}{32}\left(31 x_n + \frac{1}{x_n}\right)$,  $n \ge 0$.
Since $\theta_{\min} = \sqrt{63}/8$ is close to 1, the iteration (16) converges slowly to $z = 1$, as shown in the table. If we take another $r > 0$, the fastest iteration will not be (16). Consequently, it is necessary to find the optimal value for $r$, in order that the fastest iteration given by (5) be as fast as possible.
Table 1. The first 31 values for the iteration (16) with $x_0 = 1.5$
  n    x_n    |   n    x_n
  0   1.5     |  16   1.203
  1   1.473   |  17   1.191
  2   1.449   |  18   1.180
  3   1.425   |  19   1.170
  4   1.402   |  20   1.160
  5   1.381   |  21   1.151
  6   1.360   |  22   1.142
  7   1.341   |  23   1.133
  8   1.322   |  24   1.126
  9   1.304   |  25   1.118
 10   1.287   |  26   1.111
 11   1.271   |  27   1.105
 12   1.256   |  28   1.098
 13   1.242   |  29   1.087
 14   1.228   |  30   1.082
 15   1.215   |  31   1.077
For example, the iteration obtained from (5) for $t = 1/17$ is faster than (16). Indeed, starting with $x_0 = 1.5$ we obtain, after 15 iterations, $x_{15} = 1.093$.
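The computations of Example 4.1 can be reproduced with a few lines of code. The sketch below implements the simple iteration (5) for a real-valued operator; the function name `simple_iteration` and its signature are assumptions made for illustration. It is applied to $Tx = 1/x$ on $K = [\frac{1}{2}, 2]$ with $t = 1/32$ (iteration (16)), $t = 1/17$, and $t = 1$ (Picard iteration).

```python
def simple_iteration(T, x0, t, steps):
    """Return [x_0, ..., x_steps] with x_{n+1} = (1 - t) x_n + t T(x_n), as in (5)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append((1.0 - t) * x + t * T(x))
    return xs

def reciprocal(x):
    """The operator of Example 2.1, T x = 1/x."""
    return 1.0 / x

slow = simple_iteration(reciprocal, 1.5, 1.0 / 32.0, 31)    # iteration (16)
fast = simple_iteration(reciprocal, 1.5, 1.0 / 17.0, 15)    # t = 1/17
picard = simple_iteration(reciprocal, 1.5, 1.0, 4)          # t = 1: x, 1/x, x, ...

print(round(slow[1], 3), round(fast[15], 3))
```

With $x_0 = 1.5$ the first step of (16) gives $x_1 = (46.5 + 2/3)/32 \approx 1.474$, and the run with $t = 1/17$ reaches $x_{15} \approx 1.093$, in line with the value quoted above. The Picard run simply alternates between $1.5$ and $2/3$, which illustrates why it fails to converge for $x_0 \ne 1$.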
REFERENCES
[1] Berinde, V., Φ-monotone and Φ-contractive operators in Hilbert spaces, Studia Universitatis "Babes-Bolyai", XXXVIII(1993), 4, 51-58
[2] Berinde, V., Generalized contractions and applications, CUB PRESS 22, 1997 (in Romanian)
[3] Browder, F.E., Petryshyn, W.V., Construction of fixed points of nonlinear mappings in Hilbert spaces, J. Math. Anal. Appl. 20(1967), 197-228
[4] Dieudonne, J., Foundations of Modern Analysis, Academic Press, 1969
[5] Ishikawa, S., Fixed points by a new iteration, Proceedings of the American Mathematical Society, 44(1974), 147-150
[6] Rus, I. A., Principles and applications of fixed point theory, Editura DACIA, 1979 (in Romanian)
[7] Verma, R. U., A fixed point theorem involving Lipschitzian generalized pseudo-contractions, Proceedings of the Royal Irish Academy, Vol. 97A(1997), No. 1, 83-86
[8] Berinde, V., The fastest Mann-type iteration for approximating fixed points of Lipschitzian generalized pseudo-contractions (submitted)
Univalent Harmonic mappings with Blaschke dilatations
D. Bshouty and W. Hengartner
September 26, 2001
Abstract
Let $f$ be a univalent harmonic mapping defined on the unit disk $U$ in $\mathbb{C}$ whose dilatation function $a$ is a finite Blaschke product of degree $N$. In contrast to the general theory of harmonic mappings when $\sup_{z \in U} |a(z)| \le k < 1$, the image domains of $U$ under $f$ are restricted. We shall present a review of known results in the field.
1. Introduction
A harmonic orientation-preserving mapping $f$ defined on the unit disk $U$ in $\mathbb{C}$ is a solution of the linear elliptic partial differential equation
(1)  $\overline{f_{\bar z}(z)} = a(z) f_z(z)$;  $a \in H(U)$, $|a| < 1$
on $U$. It is known that, for the special case where $|a| \le k < 1$ in $U$ and a given simply connected domain $\Omega$, there is a univalent solution of (1) which maps $U$ onto $\Omega$. It is natural to ask whether the same holds for each given dilatation $a(z)$, $a \in H(U)$, $|a| < 1$. Unfortunately the answer is negative. Indeed, it has been shown in [HS2] that if $a$ is a finite Blaschke product, there is no univalent harmonic mapping from $U$ onto any bounded strictly convex domain. In what follows we summarize what is known about the characterization of these domains.
2. The generalized Riemann mapping theorem
The following result given in [HS1] is the best one can do with regard to the Riemann mapping problem related to the equation (1).
Theorem 1: Let $\Omega$ be a given bounded domain of $\mathbb{C}$ such that its boundary $\partial\Omega$ is locally connected. Suppose that $a \in H(U)$ satisfies $|a| < 1$ on $U$. Choose $w_0$ in $\Omega$. Then there exists a univalent solution of (1) having the following properties:
• $f(0) = w_0$, $f_z(0) > 0$ and $f(U) \subset \Omega$.
• There is a countable set $E$ on $\partial U$ such that the unrestricted limits $f^*(e^{it}) = \lim_{z \to e^{it}} f(z)$ exist on $\partial U \setminus E$ and they are on $\partial\Omega$.
• The functions $f^*_-(e^{it}) = \mathrm{ess}\lim_{s \uparrow t} f^*(e^{is})$ and $f^*_+(e^{it}) = \mathrm{ess}\lim_{s \downarrow t} f^*(e^{is})$ exist on $\partial U$.
• The cluster set of $f$ at $e^{it}$ is the line segment from $f^*_-(e^{it})$ to $f^*_+(e^{it})$.
Remarks: (1) If $|a| < k < 1$ then $E$ is empty and $f$ admits a continuous extension to $\overline{U}$. Furthermore, we have $f(U) = \Omega$. If, in addition, $\Omega$ is a Jordan domain, then $f$ extends to a homeomorphism from $\overline{U}$ onto $\overline{\Omega}$. (2) There is no analogous theorem for multiply connected domains.

The uniqueness problem for mappings $f$ as defined in Theorem 1 is still open. There are several kinds of uniqueness theorems for $k$-quasiconformal mappings, but none of them applies to our case. Suppose that the boundary $\partial\Omega$ is smooth enough and $|a| < k < 1$. If one knows that two mappings $f$ and $F$ satisfy Theorem 1 and in addition $f_z(0) = F_z(0)$, then one can conclude that $f = F$ (see e.g. [GD1] and [GD2]). On the other hand if, in addition to the conditions of Theorem 1, $\Omega$ is a strictly starlike domain, then $f$ is uniquely determined [BHH1]. Uniqueness also holds for symmetric $\Omega$ if $a$ has real coefficients [BHH2]. Using minimal surfaces, A. Weitsman [W2] proves the uniqueness of the mapping in Theorem 1 for convex domains $\Omega$ and $a(z) = z^n$.
3. Univalent mappings with finite Blaschke dilatations onto bounded domains
From now on we suppose that the second dilatation function is a finite Blaschke product of degree $N$.

Definition 1: (1) We say that a prime end $q \in \partial\Omega$ is a point of convexity (with respect to $\Omega$) if there exists a neighborhood $V$ of $f^{-1}(q)$ and a line segment $L$ containing $q$ as an interior point such that $L \setminus \{q\}$ lies in the exterior of $f(U \cap V)$.
(2) We call $\beta(\tau)$ a regulated function on the interval $[a, b]$ if the one-sided limits $\beta(\tau + 0)$ and $\beta(\tau - 0)$ exist for all $\tau \in [a, b]$.
(3) Let $\Omega$ be a simply connected domain of $\mathbb{C}$ and suppose that the boundary $\partial\Omega$ is locally connected (every prime end is a singleton). Let $\phi$ be a conformal mapping from $U$ onto $\Omega$. We call $\Omega$ a regulated domain if for each prime end $q = w(\tau) = \phi(e^{i\tau})$ of $\partial\Omega$ the direction angle of the forward (half-)tangent at $w(\tau)$,
$$\beta(q) = \lim_{s \downarrow \tau} \arg[w(s) - w(\tau)] = \lim_{s \downarrow \tau} \arg[w(s) - q],$$
exists and defines a regulated function.

We have the following mapping theorem [BH].

Theorem 2: Let
$$a(z) = e^{i\gamma} z^{n_0} \prod_{k=1}^{m} \left( \frac{z - p_k}{1 - \overline{p_k} z} \right)^{n_k}, \quad n_0 \ge 0,\ n_k > 0,\ 0 < |p_k| < 1,$$
[...] Then $f$ is a univalent solution of
$$\overline{f_{\bar z}(z)} = a(z) f_z(z)$$
which maps $U$ onto $\Omega$, if and only if
$$\frac{1}{2\pi i} \int_0^{2\pi} \frac{a(z) - a(e^{it})}{z - e^{it}}\, df^*(e^{it}) = 0 \tag{2}$$
holds for all $z$ in $U$.

Remark: The condition (2) can be replaced by any set of $[N/2]$ linear functionals which form a linearly independent set.

Suppose that the Blaschke product $a$ contains a single factor. Then equation (2) is trivially satisfied and we have

Corollary 1: Let $\Omega$ be a regulated Jordan domain. Then there exists a univalent solution $f$ of
$$\overline{f_{\bar z}(z)} = e^{i\gamma}\, \frac{z - p}{1 - \bar p z}\, f_z(z)$$
which maps $U$ onto $\Omega$ if and only if $\partial\Omega$ has exactly three points of convexity. Moreover, $f$ is unique.
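As a concrete illustration of the single-factor case (an addition to the survey, not taken from it), consider the dilatation $a(z) = z$. With $h(z) = z$ and $g(z) = z^2/2$, the map $f = h + \bar g$, that is $f(z) = z + \bar z^2/2$, satisfies $\overline{f_{\bar z}} = g' = a h' = a f_z$, and its boundary values $f(e^{it})$ trace a hypocycloid with three cusps, which play the role of the three points of convexity. The Python sketch below checks equation (1) numerically with central differences and locates the cusps; the function names are hypothetical and chosen only for this example.

```python
import cmath
import math

def a(z):
    # Dilatation: a Blaschke product with a single factor (assumed example).
    return z

def f(z):
    # Harmonic map f = h + conj(g) with h(z) = z and g(z) = z^2 / 2.
    return z + z.conjugate() ** 2 / 2

def wirtinger(func, z, h=1e-5):
    # Central differences in x and y, combined into f_z and f_zbar.
    fx = (func(z + h) - func(z - h)) / (2 * h)
    fy = (func(z + 1j * h) - func(z - 1j * h)) / (2 * h)
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

def residual(z):
    # Size of conj(f_zbar) - a(z) f_z, which should vanish for a solution of (1).
    fz, fzbar = wirtinger(f, z)
    return abs(fzbar.conjugate() - a(z) * fz)

def boundary_speed(t):
    # |d/dt f(e^{it})| = |i e^{it} - i e^{-2it}|; it vanishes exactly at the cusps.
    return abs(1j * cmath.exp(1j * t) - 1j * cmath.exp(-2j * t))

sample = [0.3 + 0.2j, -0.5j, 0.7 * cmath.exp(2j)]
max_residual = max(residual(z) for z in sample)
cusps = [2 * math.pi * k / 3 for k in range(3)]
```

The boundary speed vanishes at $t = 0, 2\pi/3, 4\pi/3$, so the image has exactly three cusps, in line with the count of points of convexity in Corollary 1.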
Consider now the case of $a(z) = e^{i\gamma} z^2$. Then condition (2) in Theorem 2 reduces to the single equation
$$\int_0^{2\pi} e^{it}\, df^*(e^{it}) = 0,$$
which leads to the following result [BH].

Corollary 2: Let $\Omega$ be a regulated Jordan domain and let $a(z) = e^{i\gamma} z^2$.
• If $\partial\Omega$ contains three points of convexity, then there always exists at least one solution of $\overline{f_{\bar z}} = a(z) f_z(z)$ which maps $U$ onto $\Omega$, and there are at most six different ones.
• Suppose $\partial\Omega$ has four points of convexity, $w_1, w_2, w_3$ and $w_4$, oriented in the positive direction. Denote by $L_k$ the euclidean length of the boundary arc which joins $w_k$ to $w_{k+1}$, where $w_5 = w_1$. Then there is a solution of $\overline{f_{\bar z}} = a(z) f_z(z)$ which maps $U$ onto $\Omega$ if and only if $L_1 + L_3 = L_2 + L_4$.
• If $\partial\Omega$ contains more than four points of convexity, then there is no univalent solution of $\overline{f_{\bar z}} = a(z) f_z(z)$ which maps $U$ onto $\Omega$.

To each solution $f$ in Theorem 2 corresponds a minimal surface whose normal vector covers the upper half-sphere exactly once. Theorem 2 can be expressed in terms of minimal surfaces as follows.

Corollary 3: Let $\Omega$ be a simply connected regulated bounded domain of $\mathbb{C}$.
• If $\partial\Omega$ has three points of convexity, then there always exists a nonparametric regular minimal surface over $\Omega$ whose Gauss map has the property that its image is the upper half-sphere covered twice. There are at most three essentially different ones.
• Suppose $\partial\Omega$ has four points of convexity, $w_1, w_2, w_3$ and $w_4$, oriented in the positive direction. Then there exists a regular nonparametric minimal surface $S$ over $\Omega$ whose Gauss map has the property that its image is the upper half-sphere covered twice if and only if the $w_k$ divide $\partial\Omega$ into four parts such that the sums of the euclidean lengths of the opposite sides are equal. There is only one with this property.
• In all other cases there are no regular nonparametric minimal surfaces having the above property.
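The length condition $L_1 + L_3 = L_2 + L_4$ of Corollary 2 is easy to test on simple domains. The sketch below is an illustration added here, under the assumption that for a convex quadrilateral the four vertices are exactly its points of convexity, so that the boundary arcs are the four sides; the helper names are hypothetical. Under that assumption a square and a kite satisfy the condition, while a non-square rectangle does not.

```python
import math

def side_lengths(vertices):
    # Euclidean lengths of the sides w_k -> w_{k+1}, with w_5 = w_1.
    n = len(vertices)
    return [math.dist(vertices[k], vertices[(k + 1) % n]) for k in range(n)]

def satisfies_length_condition(vertices, tol=1e-12):
    # Checks L1 + L3 = L2 + L4 for four positively oriented vertices.
    L1, L2, L3, L4 = side_lengths(vertices)
    return abs((L1 + L3) - (L2 + L4)) <= tol

# Vertices listed counterclockwise (positive orientation).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
kite = [(0, 0), (2, -1), (3, 0), (2, 1)]
```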
4. Univalent harmonic mappings with Blaschke dilatations onto the exterior of a bounded domain
A univalent and orientation-preserving harmonic mapping defined on the exterior of the unit disk, $\Delta$, which keeps infinity fixed, is necessarily of the form
$$f(z) = A\left(z - \frac{1}{\bar z}\right) + B\left(\bar z - \frac{1}{z}\right) + C \log|z|^2 + F(z),$$
where $F$ is a bounded harmonic function on $\Delta$. A characterization of the domains $\Omega$ containing infinity such that there exists a univalent harmonic mapping from $\Delta$ onto $\Omega$ whose dilatation is a finite Blaschke product is given in the following theorem [BHN].

Theorem 3: Let $a$, with $0 \le n_0 \le N$, $0 < |p_k| < 1$, $1 \le k \le N - n_0$, be a finite Blaschke product of degree $N$ and let $\Omega$ be a piecewise concave domain of $\mathbb{C}$ which contains infinity and whose boundary has at most $N - 2$ points of convexity. Let $f^*(e^{it})$ be a positively oriented quasihomeomorphism from the unit circle $\partial\Delta$ onto $\partial\Omega$. Then the mapping
$$f(z) = A\left(z - \frac{1}{\bar z}\right) + B\left(\bar z - \frac{1}{z}\right) + C \log|z|^2 - \frac{1}{2\pi} \int_0^{2\pi} \operatorname{Re}\left( \frac{e^{it} + z}{e^{it} - z} \right) f^*(e^{it})\, dt$$
is a univalent solution of
$$\overline{f_{\bar z}(z)} = a(z) f_z(z)$$
which maps A onto fl if and only if the following conditions on f* hold. • S( A /a(e i ')d/*(e i ') = 0, is satisfied f* - a.e. on dA. • The coefficients A, B and C satisfy the relation, —j , „ aiao~A + aiA B = ctoA and C = — ;—^—. 1 - |ao| 2
.
h2/a(z)eH(A\oo).
• For 0 < \pk\ < 1 and 1 < j < nk, 1 f2n eiitdf*{eit) ,._. _. B: 1T~- I i / x^ = (j PkA + C)-\ 6ji.
2m Jo
(e^-W)1
Pi3'
• If $n_0 = 2$, then [...]
• If $n_0 \ge 3$, then
$$\frac{1}{2\pi i} \int_0^{2\pi} e^{-ikt}\, df^*(e^{it}) = 0, \quad 2 \le k \le n_0 - 2.$$
Remark: The last two conditions of the theorem can be replaced by any set of $[N/2]$ linear functionals which form a linearly independent set.

5. Univalent harmonic mappings with Blaschke dilatations of periodic type
Let $G$ be a domain in $\mathbb{C}$ of the following form
$$G = \{ w = u + iv;\ -\infty < u < u_0(v);\ v \in \mathbb{R} \},$$
where $u_0$ satisfies $u_0(v + 2\pi) = u_0(v)$. We are interested in studying univalent harmonic mappings $F(\zeta) = u(\zeta) + i v(\zeta)$ from the left half-plane $D = \{\zeta : \operatorname{Re} \zeta < 0\}$ onto $G$ satisfying
$$F(\zeta + 2\pi i) = F(\zeta) + 2\pi i; \quad \zeta \in D$$
and
$$\operatorname{Re}\{F(-\infty)\} = \lim_{\xi \to -\infty} \operatorname{Re}\{F(\xi + i\eta)\} = -\infty.$$
Hence, the partial derivatives of $F$ are periodic analytic functions on $D$ with period $2\pi i$. Without loss of generality, we may assume that $F$ is sense-preserving. It then follows (see [AH, Lemma 2.2]) that $F$ admits the representation
where
1. $\operatorname{Re} \beta > -1/2$,
2. $H$ and $G$ are analytic in $D$,
3. $G(-\infty) = \lim_{\xi \to -\infty} G(\xi + i\eta) = 0$,
4. $H(-\infty) = \lim_{\xi \to -\infty} H(\xi + i\eta)$ exists and is finite, and
5. $H(\zeta + 2\pi i) = H(\zeta) + 2\pi i$ and $G(\zeta + 2\pi i) = G(\zeta) + 2\pi i$ on $D$.

Furthermore, the second dilatation function $A = G'/H'$ of $F$ satisfies
1. $A \in H(D)$ and $|A| < 1$ on $D$,
2. $A(\zeta + 2\pi i) = A(\zeta)$ and
3. $\lim_{\xi \to -\infty} A(\xi + i\eta) = A(-\infty) = \bar\beta / (1 + \beta)$ exists and is finite.

Consider the mapping
$$f(z) = \exp(F(\log z)), \quad z \in U.$$
Observe that $f$ is of the form
$$f(z) = z |z|^{2\beta} h(z) \overline{g(z)}, \quad z \in U,$$
where $\operatorname{Re} \beta > -1/2$ and where $h$ and $g$ are nonvanishing analytic functions on $U$. Note that $F$ is univalent on $D$ if, and only if, $f$ is univalent on $U$. Furthermore, $f$ maps $U$ onto a bounded domain $\Omega$ and satisfies the nonlinear elliptic partial differential equation
$$\overline{f_{\bar z}} = \left( a\, \frac{\overline{f}}{f} \right) f_z; \quad a \in H(U),\ |a| < 1,$$
where $a(z) = A(\log z)$. The mapping $f$ is called a logharmonic mapping. The one-to-one correspondence between $f$ and $F$ is used in the following theorem [ABH].

Theorem 4: Let
r
-i nfc
nfc > 0 and \pk\ < 1, ifl
on dU. Then the mapping f{z) = z\z\2/3exp
hj'*fci)<>«r<.*)-w
89 where ft = " 1 _i a (o)W > ?* a univalent solution of k =
a(z)f-^fz(z),
which maps U onto 0 , if and only if r2
* o(e") - a{z) df*{eH) 2mJ eil-z f*(eu) liri 0Jo
JL /
_
l + ajfi) 1 - |a(0)| 2
a(z) - a(0)
holds on $U$.

Among the many interesting open problems is to discuss univalent harmonic mappings from $U$ onto an unbounded domain $\Omega$ where infinity belongs to $\partial\Omega$.

References
[ABH] Z. Abdulhadi, D. Bshouty and W. Hengartner, Minimal surfaces whose Gauss map covers periodically the pointed upper half-sphere exactly once. Preprint.
[AH] Z. Abdulhadi and W. Hengartner, Univalent harmonic mappings on the left half-plane with periodic dilatations, Univalent functions, fractional calculus and their applications, H. Srivastava and S. Owa, Ellis Horwood Limited, 1989, 3-28.
[BH] D. Bshouty and W. Hengartner, Boundary values versus dilatations of harmonic mappings, J. Analyse Math., 72, 1997, 141-164.
[BHH1] D. Bshouty, N. Hengartner and W. Hengartner, A constructive method for starlike harmonic mappings, Numer. Math., 54, 1988, 167-178.
[BHH2] D. Bshouty, W. Hengartner and O. Hossian, Harmonic typically real mappings, Math. Proc. Cambridge Philos. Soc., 119, 1996, 673-680.
[BHN] D. Bshouty, W. Hengartner and M. Nicole, A constructive method for univalent harmonic exterior maps with Blaschke dilatation, Computational methods and function theory (CMFT'1997), N. Papamichael, S. Ruscheweyh and E. Saff, World Scientific Publishing Co., 1999, 99-115.
[GD1] J. J. Gergen and F. G. Dressel, Mapping by p-regular functions, Duke Math. J., 18, 1951, 185-210.
[GD2] J. J. Gergen and F. G. Dressel, Uniqueness for p-regular mapping, Duke Math. J., 19, 1952, 435-444.
[HS1] W. Hengartner and G. Schober, Harmonic mappings with given dilatation, J. London Math. Soc., 33, 1986, 473-483.
[HS2] W. Hengartner and G. Schober, On the boundary behaviour of orientation-preserving harmonic mappings, Complex Variables Theory Appl., 5, 1986, 197-208.
[W2] A. Weitsman, On univalent harmonic mappings and minimal surfaces, Pacific J. Math., 192, 2000, 191-200.
Finite Dimensional Dynamics on Attractors
Alp Eden
Bogazici University, Mathematics Department, Bebek, Istanbul, Turkey
15 May 2001

Abstract
We construct a finite dimensional generalized dynamical system on the finite dimensional attractors of damped hyperbolic equations via Mañé projections. This extends the result obtained in this direction for dissipative parabolic partial differential equations in Eden et al. [1, Chapter 10].
1 Introduction

Consider a dissipative parabolic partial differential equation that can be expressed as an evolution problem on a separable Hilbert space $H$ of the form
$$u_t + Au + R(u) = 0, \quad u(0) = u_0 \in H. \tag{1}$$
We will assume that the initial value problem (1) always has a unique solution on $H$ and that the resulting solution semigroup
$$S_t(u_0) = u(t), \quad S_t : H \to H \tag{2}$$
is continuous. The dissipativity of the equation is usually revealed by the existence of a bounded absorbing set, i.e. a set $B \subset H$ such that for every bounded subset $F$ of $H$ there exists $t_0 = t_0(F)$ such that $S_{t_0}(F) \subset B$. For a general class of dissipative parabolic partial differential equations, including the initial boundary value problem for the 2-D incompressible Navier-Stokes equations on a bounded domain $\Omega \subset \mathbb{R}^2$, not only is the existence of such an absorbing set guaranteed but also the existence of a compact invariant set $X$ with finite fractal dimension. This set is called the (global) attractor of the solution semigroup. It is believed that the behaviour of the solution semigroup on the attractor can be described by a finite dimensional dynamical system. The existence of inertial manifolds for a general class of parabolic partial differential equations supports this belief. Unfortunately, up to now the existence of inertial manifolds for the 2-D Navier-Stokes equations has eluded researchers. In Eden et al. [1], an alternative approach was developed via Mañé projections. The theory rested on the existence of a Hölder-Mañé projection from the finite fractal dimensional attractor to $\mathbb{R}^N$ that has a Hölder continuous inverse. This was left as an open problem in that work. Later, Foias and Olson [2] succeeded in establishing this result and justifying the theory developed in Eden et al. [1].

The paper is organized as follows. In the first section we summarize the important results from [1] that apply to first order evolution equations. In the second section, we study the same problem for second order damped evolution equations on a separable Hilbert space $E_0 = V \times H$ of the form
$$u_{tt} + u_t + Au + g(u) = f, \quad u(0) = u_0, \quad u_t(0) = u_1 \tag{3}$$
with $(u_0, u_1) \in E_0$. Here we use the assumptions from [1, Chapter 6]. It turns out that the theory of second order evolution equations, when cast in the form of a system of first order equations, admits a completely parallel treatment in terms of their finite dimensional behaviour. We also mention some examples of damped hyperbolic equations to which the theory applies. Finally, in the appendix we recall the theorem of Foias and Olson [2] on the existence of Hölder-Mañé projections on sets of finite fractal dimension.
2 Dynamics on the Attractors of Parabolic Partial Differential Equations

We start with a first order evolution equation on a separable Hilbert space $H$ given by
$$u_t + Au + R(u) = 0, \quad u(0) = u_0. \tag{4}$$
Here $A : H \to H$ is a linear self-adjoint operator with domain $D(A)$ and has a compact inverse. It follows that we can define the spaces $D(A^\alpha)$ for any real $\alpha$. We assume that there is an absorbing ball $B$ for the solution semigroup that is bounded in $D(A^{3/2})$. This further regularity of the attractor is required because we would like to obtain the Hölder continuity of the non-linear term. In the case of the 2-D Navier-Stokes equations such an assumption is justified if the forcing term, here $R(0)$, is in $D(A^{1/2})$. We also assume that the non-linear term maps $D(A)$ into $H$ and satisfies a Lipschitz condition on this absorbing ball $B$, namely, there exists $c > 0$ such that for $u, v \in B$
$$\|R(u) - R(v)\|_{D(A^{1/2})} \le c\, \|u - v\|_{D(A^{3/2})}. \tag{5}$$
We follow the notation of [1] and set
$$F(u) = -Au - R(u). \tag{6}$$
Then it follows that $F$ also maps $D(A)$ into $H$ and is Hölder continuous with exponent $\theta = 1/3$ from $H \cap B$ into $H$. Let $X$ be the attractor for the evolution equation that is bounded in $D(A^{3/2})$. Setting $Z = X \cup Y$ where $Y = \{u + F(u) : u \in X\}$, one can deduce that $d_F(Z) \le 3 d_F(X)$. Let $P_M$ denote the Hölder-Mañé projection from $Z$ to $\mathbb{R}^N$, see Appendix. By using the Hölder-Mañé projection on $Z$ one can guarantee that the resulting generalized dynamical system will not have any new steady state solutions other than the ones that are projected from the evolution equation. A generalized dynamical system is defined on $\mathbb{R}^N$ by the initial value problem, denoted by IVP from now on,
$$\frac{dx}{dt} = T(x(t)), \quad x(0) = x_0 \in \mathbb{R}^N. \tag{7}$$
Here the non-linear map $T$ is defined by
$$T(x) = \begin{cases} P_M F(P_M^{-1} x) & \text{if } x \in P_M X, \\ \alpha (v(x) - x) + P_M F(P_M^{-1} v(x)) & \text{if } x \notin P_M X, \end{cases} \tag{8}$$
where the map $v : \mathbb{R}^N \to P_M X = Y$ is defined by $v(x) = x$ if $x \in Y$ and $v(x) = y$ where $\operatorname{dist}_{\mathbb{R}^N}(x, Y) = |x - y|_{\mathbb{R}^N}$ if $x \notin Y$, and $\alpha > 0$ is an arbitrary real number. When the set $Y$ is not convex, the function $v$ is not necessarily uniquely defined; however, once a choice of $v(x)$ is fixed for every $x \notin Y$, it is established in [1, Lemma 10.1] that $v$ is a Borel function which is the pointwise limit of continuous functions, and the set of continuity points of $v$ is a dense $G_\delta$ subset of $\mathbb{R}^N$. It follows that the same is true for the non-linear map $T$ on $\mathbb{R}^N$; moreover, $T$ is locally bounded [1, Lemma 10.2]. Since $T$ is not known to be Lipschitz continuous, and may not even be continuous, the system of ODEs thus obtained need not have a unique solution. This is a consequence of the artificial way that the dynamics is constructed outside the projected attractor $Y$. Since by the Hölder-Mañé theorem the projection operator is injective on the attractor $Z$ only, there is no canonical way of lifting the dynamics from $\mathbb{R}^N$ to $H$ when $x \notin P_M X$. Fortunately, $T$ being a locally bounded Borel function allows one to solve the corresponding integral equation on an appropriate space. Let
$$\Psi(x(t)) = x_0 + \int_0^t T(x(s))\, ds \tag{9}$$
be defined on
$$\mathcal{J} = \{ x \in C([0, t_0]; \mathbb{R}^N) : |x(t)| \le |x_0| + 1,\ t \in [0, t_0] \text{ and } x(0) = x_0 \} \tag{10}$$
where $t_0 = t_0(Y, \alpha, x_0)$. It is shown in [1, Lemma 10.3] that $\Psi : \mathcal{J} \to \mathcal{J}$ is compact and hence has a fixed point. Moreover, this fixed point satisfies
$$\operatorname{dist}_{\mathbb{R}^N}(x(t), Y) \le e^{-\alpha t} \operatorname{dist}_{\mathbb{R}^N}(x_0, Y) \tag{11}$$
for all $t > 0$. The fixed point of the integral operator $\Psi$ given in (9) need not be unique, hence one cannot define a dynamical system in the usual sense on the projected space $\mathbb{R}^N$. However, the trajectories do satisfy the following two conditions:
i) Superposition Principle: if $x(t)$ is a solution of the initial value problem IVP and $y(t)$ satisfies $\frac{dy}{dt} = T(y(t))$ with $y(t_0) = x_0$, then the trajectory defined by
$$z(t) = \begin{cases} y(t), & 0 \le t \le t_0, \\ x(t - t_0), & t > t_0 \end{cases}$$
is also a solution of the IVP.
ii) Continuity with respect to initial values: if $x_{0j} \to x_0$ and $x_j(t)$ is a solution of the IVP with $x_j(0) = x_{0j}$, then there exists a subsequence of the trajectories that converges uniformly on compact sets to $x(t)$, a solution of the IVP with $x(0) = x_0$.
A system that takes $x_0 \in X$ to a trajectory $x(t) \in C([0, \infty); X)$ satisfying i) and ii) will be called a generalized dynamical system on $X$. Using the Hölder continuity of the inverse of $P_M|_X$ one can lift the dynamics from the projected space $\mathbb{R}^N$ to the Hilbert space $H$ in such a way that the two systems are inertially equivalent (see [1], Section 10.2 for details).
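The way the vector field is extended outside the projected attractor can be illustrated on a toy example that is not taken from [1]. Assume, purely for illustration, that the projected attractor $Y$ is the unit circle in $\mathbb{R}^2$, that the projected dynamics on it is the rotation field $y \mapsto (-y_2, y_1)$, and that $v$ is the nearest-point map. The sketch below builds the extended field as "attraction towards $v(x)$ at rate $\alpha$ plus the dynamics at $v(x)$", integrates it with forward Euler, and compares the final distance to $Y$ with the quantity $e^{-\alpha t}\operatorname{dist}(x_0, Y)$ from the decay estimate. All names and parameter values are hypothetical.

```python
import math

ALPHA = 2.0  # arbitrary attraction rate alpha > 0

def nearest_point(x):
    # Nearest point v(x) on the unit circle; an arbitrary choice is made at the origin.
    r = math.hypot(*x)
    if r == 0.0:
        return (1.0, 0.0)
    return (x[0] / r, x[1] / r)

def field_on_circle(y):
    # Stand-in for the projected dynamics on Y: rigid rotation.
    return (-y[1], y[0])

def T(x):
    # On the circle v(x) = x, so the attraction term vanishes and T is the rotation field.
    v = nearest_point(x)
    g = field_on_circle(v)
    return (ALPHA * (v[0] - x[0]) + g[0], ALPHA * (v[1] - x[1]) + g[1])

def dist_to_circle(x):
    return abs(math.hypot(*x) - 1.0)

def integrate(x0, t_end, dt=1e-4):
    # Forward Euler; adequate for a qualitative check only.
    x = x0
    for _ in range(int(round(t_end / dt))):
        tx = T(x)
        x = (x[0] + dt * tx[0], x[1] + dt * tx[1])
    return x

x0 = (2.0, 0.0)
t_end = 1.0
x_end = integrate(x0, t_end)
decay_bound = math.exp(-ALPHA * t_end) * dist_to_circle(x0)
```

In this toy case the radial part of the flow solves $r' = -\alpha (r - 1)$, so the estimate holds with equality and the Euler result should agree with `decay_bound` up to discretization error.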
3 Dynamics on the Attractors of Hyperbolic Partial Differential Equations

The existence of exponential attractors for semilinear wave equations with damping was established in [3] (see also [1], Chapter 6). Later on the theory was extended to more general systems through the use of the concept of $\alpha$-contractions (see [4], [5] and [6]). Here we will follow the simpler framework of [3] and start with an initial value problem for a second order evolution equation defined on a separable Hilbert space $H$ of the form
$$u_{tt} + u_t + Au + g(u) = f, \tag{12}$$
with
$$u(0) = u_0 \quad \text{and} \quad u_t(0) = u_1. \tag{13}$$
Here the linear operator $A$ is, as before, positive self-adjoint and has a compact inverse, and $g : D(A^{1/2}) \to H$ is a $C^1$ non-linearity that is Lipschitz from $V = D(A^{1/2})$ into $H$ on bounded subsets $B$ of $D(A)$, i.e. there exists $L = L(B) > 0$ such that for $u_1, u_2 \in B$,
$$\|g(u_1) - g(u_2)\|_H \le L\, \|u_1 - u_2\|_V. \tag{14}$$
We let $E_0 = V \times H$; then the second order evolution equation can be written as a first order evolution equation on $E_0$. Setting
$$w(t) = [u(t), v(t)]^T \quad \text{and} \quad w_0 = [u_0, u_1]^T \tag{15}$$
the second order evolution equation takes the form of a first order evolution equation
$$w_t = G(w), \quad w(0) = w_0 \in E_0 \tag{16}$$
where
$$G(w) = \begin{bmatrix} v \\ -v - Au - g(u) + f \end{bmatrix}. \tag{17}$$
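The first order form $w_t = G(w)$ is also the natural starting point for numerical experiments. The sketch below is an illustration under assumptions that are not part of the text: $A$ is replaced by the standard finite-difference Dirichlet Laplacian on $(0,1)$, the non-linearity is $g(u) = \beta \sin u$, the forcing is $f = 0$, and time stepping uses the classical fourth order Runge-Kutta method. For this finite dimensional analogue the energy $E = h\left(\tfrac12 |v|^2 + \tfrac12 u \cdot Au + \beta \sum_i (1 - \cos u_i)\right)$ satisfies $dE/dt = -h|v|^2 \le 0$, which is the dissipation one expects from the damping term. All names are hypothetical.

```python
import math

N = 20                 # number of interior grid points (assumed)
H = 1.0 / (N + 1)      # mesh size on (0, 1)
BETA = 1.0             # coefficient of the sine non-linearity (assumed)

def apply_A(u):
    # Discrete Dirichlet Laplacian -d^2/dx^2: a positive, symmetric matrix.
    out = []
    for i in range(N):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < N - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / H**2)
    return out

def G(w):
    # Right-hand side of w_t = G(w) with w = (u, v), g(u) = BETA sin(u), f = 0.
    u, v = w
    Au = apply_A(u)
    dv = [-v[i] - Au[i] - BETA * math.sin(u[i]) for i in range(N)]
    return (list(v), dv)

def axpy(w, k, s):
    # Returns w + s * k componentwise.
    return ([a + s * b for a, b in zip(w[0], k[0])],
            [a + s * b for a, b in zip(w[1], k[1])])

def rk4_step(w, dt):
    k1 = G(w)
    k2 = G(axpy(w, k1, dt / 2))
    k3 = G(axpy(w, k2, dt / 2))
    k4 = G(axpy(w, k3, dt))
    incr = tuple([(a + 2 * b + 2 * c + d) / 6
                  for a, b, c, d in zip(k1[j], k2[j], k3[j], k4[j])]
                 for j in range(2))
    return axpy(w, incr, dt)

def energy(w):
    u, v = w
    Au = apply_A(u)
    kinetic = 0.5 * sum(x * x for x in v)
    elastic = 0.5 * sum(a * b for a, b in zip(u, Au))
    potential = BETA * sum(1.0 - math.cos(x) for x in u)
    return H * (kinetic + elastic + potential)

w = ([math.sin(math.pi * (i + 1) * H) for i in range(N)], [0.0] * N)
e_start = energy(w)
for _ in range(2000):          # integrate up to t = 2 with dt = 1e-3
    w = rk4_step(w, 1e-3)
e_end = energy(w)
```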
As before we will assume that the first order evolution equation gives rise to a dynamical system $\{S_t : t \ge 0\}$ on $E_0$ and that there exists an attractor of finite fractal dimension $X \subset E_0$. Now let us try to draw parallels with the parabolic case as described in the first section. $E_0$ with its natural inner product is a separable Hilbert space and the second order evolution equation has given rise to a dynamical system on this Hilbert space with an attractor $X \subset E_0$. The solution semigroup resulting from a second order evolution equation is not smoothing, i.e. regularizing, as in the parabolic case. Still, there is an asymptotic smoothing property that holds for the attractors. Namely, for $f \in V$, one can prove that the attractor of the dynamical system is bounded in $E_2 = D(A^{3/2}) \times D(A)$ (see [7] or [8]) and that there exists an invariant absorbing set $B_2 \subset E_2$ for initial values in $E_2$. Now, it is easy to see that the non-linear term $G$ satisfies a Hölder continuity condition from $E_0$ into itself on $B_2$, since for $w_1, w_2 \in B_2$
$$\begin{aligned} \|G(w_1) - G(w_2)\|_{E_0}^2 &= \|v\|_V^2 + \|v + Au + g(u_1) - g(u_2)\|_H^2 \\ &\le \|v\|_{D(A)} \|v\|_H + c \left( \|v\|_H^2 + \|u\|_{D(A)}^2 + L^2 \|u\|_V^2 \right) \\ &\le c_1 \|w\|_{E_0}, \end{aligned} \tag{18}$$
where we have set $w = (u, v)^T = (u_1 - u_2, v_1 - v_2)^T$ and in the second line we have used the Lipschitz continuity of the map $g : V \cap B_2 \to H$, the $E_2$ bounds on the solutions and an interpolation result for the triple $D(A) \hookrightarrow V \hookrightarrow H$; finally, in the third line we have used an interpolation inequality for the triple $D(A^{3/2}) \hookrightarrow D(A) \hookrightarrow V = D(A^{1/2})$,
$$\|Au\|_H^2 = \|u\|_{D(A)}^2 \le \|u\|_{D(A^{3/2})} \|u\|_{D(A^{1/2})} \le c_0 \|u\|_{D(A^{1/2})}.$$
It is at this stage that we really need the absorbing set in $E_2$, and our need stems from the control of the linear term. It follows from (18) that $G : E_0 \cap B_2 \to E_0$ is Hölder continuous with Hölder exponent $\theta = 1/2$. As before, consider $Z = X \cup Y$ where $Y = \{w + G(w) : w \in X\}$; then $d_F(Z) \le \max\{d_F(X), d_F(Y)\} \le 2 d_F(X)$. From this point on we can follow line by line the argument summarized above for the parabolic case and obtain a finite dimensional generalized dynamics on the attractor $X$. It is worth re-emphasizing that the only serious difference between the two theories stems from the lack of immediate regularization in the latter case. Fortunately, for the examples we are going to mention below the attractor, for $f \in V$, lies in an absorbing set $B_2 \subset E_2$ (see [7] or [8]).

Let $\Omega \subset \mathbb{R}^N$ be a bounded set with smooth boundary, and consider the damped sine-Gordon equation with homogeneous Dirichlet boundary conditions:
$$u_{tt} + u_t - \Delta u + \beta \sin u = f(x) \tag{19}$$
for $x \in \Omega$, $t > 0$, $u(x, t) = 0$ for $x \in \partial\Omega$ and $t > 0$, with initial values
$$u(x, 0) = u_0(x) \in H_0^1(\Omega), \quad u_t(x, 0) = u_1(x) \in L^2(\Omega). \tag{20}$$
Here $H = L^2(\Omega)$ and $A = -\Delta$, $D(A) = H_0^1(\Omega) \cap H^2(\Omega)$ and $g(u) = \beta \sin u$. In the example above one can take the non-linearity as $g(u) = u^3 + p(u)$, where $p(u)$ is any quadratic polynomial, and take the space dimension $N = 3$, to obtain the damped Klein-Gordon equations. Here the regularity of the attractor is a little more difficult to obtain, see [8]. The same theory applies without any change to systems of sine-Gordon equations as well (see e.g. [3] and [7]). It is worth mentioning that the theory should also apply to abstract evolution equations of the form $P u_{tt} + Q u_t + Au = g(u)$ given in [5], with minor modifications on the underlying spaces.
4 Appendix

The essential role in the above argument was played by the existence of a Hölder-Mañé projection on a set of finite fractal dimension. In the case when the underlying Hilbert space is finite dimensional this was proven by Ben-Artzi et al. [9]. The infinite dimensional case was proven by Foias and Olson [2]:

(Foias-Olson) Let $X$ be a compact subset of a separable Hilbert space $H$ with finite fractal (= upper box counting) dimension $d$, with $D = 2d < N - 1$. Let $P_0$ be an orthogonal projection from $H$ to $\mathbb{R}^N$ with rank $[D + 1]$. Then for every $\delta > 0$ and for some $\theta > 0$ there exists an orthogonal projection $P = P(\theta, \delta)$ from $H$ to $\mathbb{R}^N$ with the same rank as $P_0$ and a positive constant $C$ such that $\|P - P_0\|_{B(H)} < \delta$ and
$$\|x - y\|_H \le C\, \|Px - Py\|_{\mathbb{R}^N}^{\theta} \tag{21}$$
for every $x, y \in X$. Explicit estimates for the Hölder exponent $\theta$ of Hölder-Mañé projections are given in [10].
4.1 References

[1] Eden, A., Foias, C., Nicolaenko, B. and Temam, R., Exponential Attractors for Dissipative Evolution Equations, Research in Applied Mathematics, Masson Publications, 1994.
[2] Foias, C. and Olson, E., Finite fractal dimension and Hölder-Lipschitz parametrization, Indiana Univ. Math. J., 45, 3, 603-616, 1996.
[3] Eden, A., Milani, A.J. and Nicolaenko, B., "Finite Dimensional Exponential Attractors for Semilinear Wave Equations with Damping", Journal of Math. Anal. and Appl., 169, 2, 408-419, 1992.
[4] Eden, A., Foias, C. and Kalantarov, V., "A remark on two constructions of exponential attractors for α-contractions", Journal of Dyn. and Diff. Eqns., 10, 1, 37-45, 1998.
[5] Eden, A. and Kalantarov, V., "Finite dimensional attractors for a class of semi-linear wave equations", Turkish Jour. of Math., 20, 3, 425-450, 1996.
[6] Eden, A. and Kalantarov, V., "On the discrete squeezing property for semilinear wave equations", Turkish Jour. of Math., 22, 3, 335-341, 1998.
[7] Temam, R., Infinite Dimensional Dynamical Systems in Mechanics and Physics, Springer Verlag, 1988.
[8] Babin, A. V. and Vishik, M. I., Attractors for Evolution Equations, North-Holland, 1992.
[9] Ben-Artzi, A., Eden, A., Foias, C. and Nicolaenko, B., "Hölder continuity for the inverse of Mañé projection", Journal of Math. Anal. and Appl., 178, 22-29, 1993.
[10] Hunt, B.R. and Kaloshin, V. Y., "Regularity of embeddings of infinite dimensional fractal sets into finite-dimensional spaces", Nonlinearity, 12, 5, 1263-1275, 1999.
The Mathematics of Informational Asymmetry

Ivar Ekeland
CEREMADE and Institut de Finance, Universite de Paris-Dauphine, 75775 Paris CEDEX 16, France

Abstract
I describe a new type of problems in the Calculus of Variations, characterized by global convexity constraints, and I give the economic motivation for them.

1 Introduction

Considering the recent trend of mathematics in Palestine, and notably the new program in Mathematics Applied to Economics which has been set up at the University of Bir-Zeit, I have chosen to talk on some mathematical problems arising from economic theory. More precisely, I will show how the economics of informational asymmetry leads to mathematical models which are very close to those used in optimal transportation, such as the classical Monge-Kantorovitch problem, and to new problems in the calculus of variations.

Suppose I am dealing with an individual of characteristics $\theta = (\theta_1, ..., \theta_K) \in R^K$. It often happens that these characteristics are known to him, but not to me. This is the basic problem of informational asymmetry: I may want to know $\theta$, and he may want to hide it from me. Such situations are quite common in economics. One example is insurance: if I am an insurer, I would like to know if the prospective customer who enters my shop is a good risk or a bad risk. He probably knows himself, but it is useless to ask him directly, because he will simply lie about it if it gets him a better deal. Another example is pricing travel: if I operate a public transportation system, I would like the rich to subsidize the poor. I will try to charge more those who can afford to pay more, in order to make cheaper tickets available to those who cannot afford the full price. But just asking people how much they are willing to pay will lead me nowhere.

The way around is well-trodden. Insurance companies sell several types of contracts, some with low premium and high deductible, others with high premium and low deductible, and let customers choose the one they prefer. Presumably the bad risks will choose the second type. Railroads offer first class accommodation, that presumably only the richer will pick, and airlines offer standby tickets at low fares, that will interest people who cannot afford to pay very much. In other words, to overcome the problems of informational asymmetry, one should design contracts whereby customers sort themselves out according to their type. This is the idea we will explore in this section. One last word about jargon: the person who designs the contracts will be called the principal; the others, who either accept or refuse the contracts, will be called the agents. For an introduction to the economics of the theory, I refer to the textbooks [4], [6] and the treatise [5].
2 Incentive-compatible contracts

We are given a subset $\Omega \subset R^K$, a subset $X \subset R^N_+$, and a map $u : \Omega \times X \to R$. We interpret each point $\theta = (\theta_1, ..., \theta_K) \in \Omega$ as defining the type of an agent. A contract is a pair $(x, t) \in X \times R$, where $x = (x_1, ..., x_N)$ is a bundle of goods and $t$ a sum of money. An agent of type $\theta$ being allocated the contract $(x, t)$ obtains a total utility $U(\theta, x, t)$ given by:
$$U(\theta, x, t) = u(\theta, x) + t$$
Note that the model is flexible. A worker's contract is of this kind: agent $\theta$ will produce a certain $x$, and then $t > 0$ will be his wage. Then the utility derived from producing $x$ is negative (the more you work, the less you like it), but the personal cost of producing $x$ varies across agents: it is determined by their type $\theta$. In other words, the parameter $\theta$ represents the productivity of the agent. Another application of the model occurs when the goods are desired by all, so that every agent derives positive utility from being allocated the bundle $x$. In that case, $t$ should be negative, and represents the amount that the agent is paying to get $x$. However, the utility which the agent gets from consuming $x$ (and presumably the amount he is willing to pay for it) will depend on the parameter $\theta$, which then represents the tastes of the agent.

A contract menu is a map $\theta \to (x(\theta), t(\theta))$ from $\Omega$ to $X \times R$. An allocation is a map $\theta \to x(\theta)$ from $\Omega$ to $X$. Agents behave by maximizing utility. A contract menu will be called incentive-compatible if each type $\theta$ finds that the contract $(x(\theta), t(\theta))$ is the best for him: he prefers that contract to any other contract. In other words, he has no incentive to lie about his type. Mathematically, this is written as:
$$\forall \theta, \tau: \quad u(\theta, x(\theta)) + t(\theta) \ge u(\theta, x(\tau)) + t(\tau)$$
Finally, an allocation $x(\theta)$ will be called implementable if there is some $t(\theta)$ such that the contract menu $(x(\theta), t(\theta))$ is incentive-compatible. Contract menus are complicated objects: each one of them is defined by $(N + 1)$ real-valued functions on $\Omega$. Our first step is to show that, contrary to what seems at first glance to be the case, incentive-compatible contract menus are simpler: each of them can be represented by a single real-valued function on $\Omega$.

Definition 1 The potential associated with a contract menu $(x(\theta), t(\theta))$ is the maximum utility it yields to each type:
$$V(\theta) = \sup_{\tau \in \Omega} U(\theta, x(\tau), t(\tau)) \tag{1}$$
In particular, if the contract menu is incentive-compatible, we must have $V(\theta) = U(\theta, x(\theta), t(\theta))$ for every $\theta$. This observation lies at the heart of the following:

Proposition 2 An allocation $x(\theta)$ is implementable if and only if there exists a function $V : \Omega \to R$ such that:
$$V(\theta) \ge V(\tau) - u(\tau, x(\tau)) + u(\theta, x(\tau)) \quad \forall \tau, \theta \tag{2}$$
Proof. If $x(\theta)$ is implementable, then there exists some $t(\theta)$ such that the contract menu $(x(\theta), t(\theta))$ is incentive-compatible; taking $V(\theta)$ to be the corresponding potential, we get:
$$V(\theta) \ge u(\theta, x(\tau)) + t(\tau) = V(\tau) - u(\tau, x(\tau)) + u(\theta, x(\tau))$$
Conversely, assume that there is a function $V(\theta)$ such that (2) holds, and define $t(\theta)$ by:
$$t(\theta) = V(\theta) - u(\theta, x(\theta))$$
Then, replacing $V(\theta)$ and $V(\tau)$ by their value in (2), we get:
$$u(\theta, x(\theta)) + t(\theta) \ge u(\tau, x(\tau)) + t(\tau) - u(\tau, x(\tau)) + u(\theta, x(\tau)) = t(\tau) + u(\theta, x(\tau))$$
so that the contract menu $(x(\theta), t(\theta))$ is incentive-compatible and $V(\theta)$ is the corresponding potential. □
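Proposition 2 can be checked by hand on a finite set of types. The following sketch is an illustration added here, with assumed data: a one-dimensional type space, the bilinear utility $u(\theta, x) = \theta x$, the allocation $x(\theta) = \theta$ and the candidate potential $V(\theta) = \theta^2/2$. It verifies relation (2), builds the transfers $t(\theta) = V(\theta) - u(\theta, x(\theta))$ as in the proof, and confirms that the resulting menu is incentive-compatible with potential $V$. The function names are hypothetical.

```python
def u(theta, x):
    # Assumed utility for the example: u(theta, x) = theta * x.
    return theta * x

def is_incentive_compatible(types, x, t, tol=1e-12):
    # Every type weakly prefers its own contract to any other contract in the menu.
    return all(
        u(th, x[th]) + t[th] >= u(th, x[ta]) + t[ta] - tol
        for th in types for ta in types
    )

def satisfies_relation_2(types, x, V, tol=1e-12):
    # V(theta) >= V(tau) - u(tau, x(tau)) + u(theta, x(tau)) for all tau, theta.
    return all(
        V[th] >= V[ta] - u(ta, x[ta]) + u(th, x[ta]) - tol
        for th in types for ta in types
    )

def transfers_from_potential(types, x, V):
    # t(theta) = V(theta) - u(theta, x(theta)), as in the proof of Proposition 2.
    return {th: V[th] - u(th, x[th]) for th in types}

types = [0.0, 0.5, 1.0, 1.5, 2.0]
x = {th: th for th in types}                 # allocation x(theta) = theta
V = {th: 0.5 * th * th for th in types}      # candidate potential
t = transfers_from_potential(types, x, V)
potential = {th: max(u(th, x[ta]) + t[ta] for ta in types) for th in types}
zero_transfers = {th: 0.0 for th in types}   # a menu that is not incentive-compatible
```

With zero transfers every type would claim to be the highest type, which is why the same allocation fails the incentive-compatibility check until the transfers are derived from $V$.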
3
w-convex analysis
Motivated by Proposition2, we will introduce a generalization of convex analysis. Before we do so, let us see what becomes of relation (2) in the particular case when K = N, Q = X — RN and u (9, x) = 9'x (here and elsewhere, we denote by 8' the transpose of 9). Formula (1) then yields:
V(9) =
sup{ffx{T)+t{T)} rgfi
so that V is a convex function, and relation (2) becomes:
V (9) > V (r) + (9 -
T)'X
(r) Vr, 9
which means that x (r) belongs to the subgradient of V at the point r, for every r:
\[ x(\tau) \in \partial V(\tau) \quad \forall \tau \]
We now extend this notion of subgradient, and the other usual notions of convex analysis, to a certain class of functions depending on $u(\theta, x)$. The function $u : \Omega \times X \to \mathbb{R}$ is prescribed throughout, so that we will not recall the dependence of these various notions on $u$, and we shall denote the subgradient, for instance, by $\partial$ instead of $\partial_u$ as in most of the literature. Of course, when $u(\theta, x) = \theta' x$ we want to fall back on the usual definitions of convex analysis. The results we are going to describe largely belong to the mathematical folklore, meaning that many people have rediscovered them through the years; I follow the exposition in [1].

In the following, we shall have to deal with functions which may assume the value $+\infty$. Such functions will be called proper if they are not identically $+\infty$, that is, if there is a point where their value is finite.

Definition 3 A function $f : \Omega \to \mathbb{R} \cup \{+\infty\}$ will be called u-convex if there is a subset $Q \subset \mathbb{R}^{N+1}$ such that:
\[ f(\theta) = \sup_{(x,t) \in Q} \{ u(\theta, x) + t \} \tag{3} \]
and we shall say that $x$ belongs to the u-subgradient of $f$ at $\tau$, and write $x \in \partial f(\tau)$, if:
\[ f(\theta) \ge f(\tau) - u(\tau, x) + u(\theta, x) \quad \forall \theta \tag{4} \]
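Definition 4 The u-conjugate of a (not necessarily convex) proper function $f : \Omega \to \mathbb{R} \cup \{+\infty\}$ is the function $f^* : X \to \mathbb{R} \cup \{+\infty\}$ defined by:
\[ f^*(x) = \sup_{\theta \in \Omega} \{ u(\theta, x) - f(\theta) \} \tag{5} \]
Definition 5 The u-conjugate of a function $g : X \to \mathbb{R} \cup \{+\infty\}$ is the function $g^* : \Omega \to \mathbb{R} \cup \{+\infty\}$ defined by:
\[ g^*(\theta) = \sup_{x \in X} \{ u(\theta, x) - g(x) \} \tag{6} \]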
We then have the following propositions, which duplicate well-known results from convex analysis:

Proposition 6 (Fenchel's inequality) For all $(\theta, x)$, we have
\[ f(\theta) + f^*(x) \ge u(\theta, x) \]
Proposition 7 For given $(\theta, x)$, the following are equivalent:
1. $f(\theta) + f^*(x) = u(\theta, x)$
2. $x \in \partial f(\theta)$

The conjugacy is a correspondence between functions on $\Omega$ and functions on $X$. It reverses the order: if $f_1 \le f_2$, then $f_1^* \ge f_2^*$. We now introduce the biconjugate of a function $f : \Omega \to \mathbb{R} \cup \{+\infty\}$:
\[ f^{**} = (f^*)^* \]
Note that $f^{**}$ is again a function on $\Omega$.

Proposition 8 $f^{**}$ is a u-convex function (even if $f$ is not) and
\[ f^{**}(\theta) \le f(\theta) \quad \forall \theta \]
Proposition 9 (Separation theorem) If $f$ is u-convex, then $f^{**} = f$.

Proof. All the proofs follow directly from the definitions, except for this last one, which is a little bit more tricky. We already know that $f^{**} \le f$, so all we have to prove is that $f \le f^{**}$. In fact, we shall prove that whenever $(x, t)$ are such that
\[ u(\theta, x) + t \le f(\theta) \quad \forall \theta \tag{7} \]
then $u(\theta, x) + t \le f^{**}(\theta)$. Since $f$ is u-convex, and satisfies (3), this will give the result. So let $(x, t)$ satisfy (7). Define the function $\bar{f}$ by:
\[ \bar{f}(\theta) = u(\theta, x) + t \]
It is clear from the definition that $\bar{f}^*(x) = -t$. Since $\bar{f} \le f$, we must have $\bar{f}^* \ge f^*$:
\[ f^*(x) \le -t \]
and hence:
\[ f^{**}(\theta) \ge u(\theta, x) - f^*(x) \ge u(\theta, x) + t \]
∎

Corollary 10 If $f$ is u-convex, then
\[ f(\theta) = \sup_{x \in X} \{ u(\theta, x) - f^*(x) \} \tag{8} \]
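The conjugacy operations (5) and (6) are easy to experiment with when $\Omega$ and $X$ are replaced by finite grids. The following is a minimal sketch under assumptions that are not in the text: both sets are the grid $\{0, 0.1, \dots, 1\}$, the function is $u(\theta, x) = -(x - \theta)^2$, and the starting function $f$ is an arbitrary choice. It computes $f^*$, $f^{**}$ and $f^{***}$ so that Fenchel's inequality and the inequality $f^{**} \le f$ can be checked numerically.

```python
def u(theta, x):
    """Hypothetical non-bilinear choice u(theta, x) = -(x - theta)^2."""
    return -(x - theta) ** 2

# Finite grids standing in for Omega and X (an assumption of this sketch).
thetas = [i / 10 for i in range(11)]
xs = [j / 10 for j in range(11)]

def conj_f(f):
    """f*(x) = sup over theta of u(theta, x) - f(theta), as in (5)."""
    return {x: max(u(th, x) - f[th] for th in thetas) for x in xs}

def conj_g(g):
    """g*(theta) = sup over x of u(theta, x) - g(x), as in (6)."""
    return {th: max(u(th, x) - g[x] for x in xs) for th in thetas}

# An arbitrary function on the grid; it is not assumed to be u-convex.
f = {th: abs(th - 0.5) for th in thetas}

f_star = conj_f(f)      # conjugate
f_bi = conj_g(f_star)   # biconjugate, a function on the theta grid again
f_tri = conj_f(f_bi)    # conjugate of the biconjugate

TOL = 1e-9  # tolerance for floating-point rounding
fenchel_ok = all(f[th] + f_star[x] >= u(th, x) - TOL for th in thetas for x in xs)
below_f = all(f_bi[th] <= f[th] + TOL for th in thetas)
print(fenchel_ok, below_f)
```

On a finite grid the suprema are maxima, so the same algebra as in the propositions applies; in particular $f^{***}$ agrees with $f^*$ up to rounding.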
We now give some consequences of the envelope theorem. In the following proposition, we assume throughout that …

Proposition 11 Assume $f$ is subdifferentiable at $\theta$, with $x \in \partial f(\theta)$.
(a) if the function $\tau \to u(\tau, x)$ is differentiable at $\tau = \theta$, with derivative $u'_\theta(\theta, x)$, then $u'_\theta(\theta, x) \in \partial f(\theta)$
(b) if $f$ is differentiable at $\theta$, and the function $\tau \to u(\tau, x)$ is differentiable at $\tau = \theta$, then $u'_\theta(\theta, x) = f'(\theta)$
(c) if $f$ is twice differentiable at $\theta$, and the function $\tau \to u(\tau, x)$ is twice differentiable at $\tau = \theta$, then: $u''(\theta, x)$ …

This means that the map $x \to u'_\theta(\theta, x)$ is one-to-one. In the case when $\Omega$ and $X$ are intervals, so that $K = N = 1$, this amounts to the fact that the map is strictly monotone. A sufficient condition for that was introduced by Spence and Mirrlees in their seminal work on education and taxation:
\[ \frac{\partial^2 u}{\partial \theta \, \partial x} > 0 \]
and this is why we call this condition GSM, or Generalized Spence-Mirrlees. As a first example, consider the case of standard convex functions: $u(\theta, x) = \theta' x$. Then $u'_\theta(\theta, x) = x$, so that GSM certainly holds. As a second example, consider the case when $K = N$ and $u(\theta, x) = -c(x - \theta)$. Then $u'_\theta(\theta, x) = c'(x - \theta)$, and the question now is whether the map $c' : \mathbb{R}^N \to \mathbb{R}^N$ is one-to-one. This will be the case if $c(y) = \|y\|^\alpha$, $\alpha \ne 1$, but will not be the case when $\alpha = 1$, that is $c(y) = \|y\|$.
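The second example can be made explicit with a short computation. This is a sketch for $y \ne 0$ and $\alpha > 0$, with $\|\cdot\|$ assumed to be the Euclidean norm; the symbol $q$ is introduced here only for the illustration.

```latex
% Gradient of c(y) = \|y\|^\alpha away from the origin:
c'(y) = \alpha \, \|y\|^{\alpha - 2} \, y ,
\qquad
\|c'(y)\| = \alpha \, \|y\|^{\alpha - 1} ,
\qquad
\frac{c'(y)}{\|c'(y)\|} = \frac{y}{\|y\|} .

% Case \alpha \neq 1: r \mapsto \alpha r^{\alpha - 1} is strictly monotone on (0, \infty),
% so y can be recovered from q = c'(y):
y = \left( \frac{\|q\|}{\alpha} \right)^{\frac{1}{\alpha - 1}} \frac{q}{\|q\|} .

% Case \alpha = 1: the gradient only remembers the direction of y,
c'(y) = \frac{y}{\|y\|} = c'(\lambda y) \quad \text{for every } \lambda > 0 ,
% so c' is not one-to-one.
```

In other words, for $\alpha \ne 1$ both the direction and the length of $y$ can be read off from $c'(y)$, whereas for $\alpha = 1$ the length is lost.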
4
The reallocation principle
We now go back to economics, and we use the above machinery to address the following question: given $u(\theta, x)$, can one find implementable allocations, and if so, how many? The answer is given by the following reallocation principle, which was first stated by Carlier: given any allocation, there is a single implementable allocation which has the same distribution as the given one. The aim of this section is to prove this result.

For the sake of simplicity, we shall assume that $\Omega$ and $X$ are compact, that $u$ is continuous with respect to $(\theta, x)$ on $\Omega \times X$ and $C^1$ with respect to $\theta$ on the interior of $\Omega$, and that $u'_\theta$ extends continuously to $\Omega \times X$. We begin by considering a strange optimization problem. Let $\mu$ and $\rho$ be positive Radon measures on $\Omega$ and $X$, such that:
\[ \mu(\Omega) = \rho(X) \]
Denote by $B(\Omega)$ and $B(X)$ the sets of all Borel functions on $\Omega$ and $X$. Given $f \in B(\Omega)$ and $g \in B(X)$, set
\[ J(f, g) = \int_\Omega f(\theta) \, d\mu + \int_X g(x) \, d\rho \]
and consider the optimization problem:
\[ \inf \{ J(f, g) \mid f(\theta) + g(x) \ge u(\theta, x) \ \text{a.e.} \} \tag{9} \]
Proposition 13 There is an optimal solution $(f, g)$ for the problem, with $f$ Lipschitz and u-convex, and $g = f^* \ge 0$.

Proof. Remark first that, since $\mu(\Omega) = \rho(X)$, adding a constant to $f$, and subtracting the same constant from $g$, does not affect the criterion:
\[ \int_\Omega (f + a) \, d\mu + \int_X (g - a) \, d\rho = \int_\Omega f(\theta) \, d\mu + \int_X g(x) \, d\rho + a \, (\mu(\Omega) - \rho(X)) = \int_\Omega f(\theta) \, d\mu + \int_X g(x) \, d\rho \]
so that optimal solutions, if they exist, will be determined up to that constant. Let us take a minimizing sequence $(f_n, g_n)$. It is readily seen that we can lower $J(f_n, g_n)$ by taking $g_n = f_n^*$, and then again by taking $f_n = g_n^* = f_n^{**}$. In other words, we may assume that $f_n$ is u-convex and that $g_n = f_n^*$. Since $\Omega$ and $X$ are compact, $f_n$ and $g_n$ are uniformly Lipschitz and bounded from below. We may adjust the constant $a$ in the preceding observation so that $g_n \ge 0$, with $g_n(x_n) = 0$ for some $x_n \in X$. Since
\[ f_n(\theta) = \max_{x} \{ u(\theta, x) - g_n(x) \} \ge u(\theta, x_n) \]
it follows that $\min u \le f_n \le \max u$, so that $f_n$ is uniformly bounded. We now apply Ascoli's theorem to get a uniformly convergent subsequence. The limit $f$ clearly yields a minimizer for $J$. ∎

Proposition 14 If $u$ satisfies GSM, and $\mu$ is absolutely continuous with respect to the Lebesgue measure, then $\partial f(\theta) \in X$ is a singleton for almost every $\theta \in \Omega$, and the map $\partial f : \Omega \to X$ is measure-preserving:
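\[ \int_\Omega \varphi(\partial f(\theta)) \, d\mu = \int_X \varphi \, d\rho \quad \forall \varphi \]
Proof. Take any $\varphi \in C(X)$ and any $\varepsilon > 0$. Since $(f, g)$ is optimal, with $f = g^*$, and the pair $((g + \varepsilon \varphi)^*, g + \varepsilon \varphi)$ satisfies the constraint in (9), we have:
\[ \frac{1}{\varepsilon} \int_\Omega \left[ (g + \varepsilon \varphi)^* - g^* \right] d\mu + \int_X \varphi \, d\rho \ge 0 \tag{10} \]
For some $x_\varepsilon(\theta) \in \partial (g + \varepsilon \varphi)^*(\theta)$:
\[ (g + \varepsilon \varphi)^*(\theta) = u(\theta, x_\varepsilon(\theta)) - g(x_\varepsilon(\theta)) - \varepsilon \varphi(x_\varepsilon(\theta)) \tag{11} \]
For almost every $\theta$ with respect to the Lebesgue measure, the function $g^* = f$ is differentiable at $\theta$, so that $f'(\theta) = u'_\theta(\theta, x_0(\theta))$, with $x_0(\theta) \in \partial f(\theta)$. Using GSM, we find that $x_0(\theta)$ is uniquely defined, and that $x_\varepsilon(\theta) \to x_0(\theta)$ when $\varepsilon \to 0$. Since Lebesgue-negligible sets are $\mu$-negligible, (11) becomes:
\[ - \int_\Omega \varphi(x_0(\theta)) \, d\mu + \int_X \varphi \, d\rho \ge 0 \quad \forall \varphi \]
Changing $\varphi$ into $-\varphi$ gives the reverse inequality, hence equality. ∎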
Summing up, we have:

Theorem 15 If $u$ satisfies GSM, and $\mu$ is absolutely continuous with respect to the Lebesgue measure, then there is a unique (up to equality a.e.) measure-preserving map $s : \Omega \to X$ such that:
\[ s(\theta) = \partial f(\theta) \quad \text{a.e.} \]
for some u-convex $f : \Omega \to \mathbb{R}$.

Results of this type were first proved by Brenier in the standard convex case, although the proof we give here follows the line set by Gangbo:

Corollary 16 (Brenier) Given two bounded subsets $\Omega$ and $X$ of $\mathbb{R}^N$, there is a unique (up to equality a.e.) measure-preserving map $s : \Omega \to X$ which is the gradient of a convex function:
\[ \exists f \ \text{convex} : \ s(\theta) = f'(\theta) \quad \text{a.e.} \]
Corollary 17 (Polar decomposition) Given two bounded subsets $\Omega$ and $X$ of $\mathbb{R}^N$, with the same volume, and a measure-preserving map $s : \Omega \to X$, there is a convex function $f$ and a measure-preserving map $h : \Omega \to X$ such that:
\[ s(\theta) = f'(h(\theta)) \quad \text{a.e.} \]
Proof. Just define $\rho$ to be the image of $\mu$ by the map $s$, and apply the main theorem with $u(\theta, x) = \theta' x$. ∎

The economic interpretation is a reallocation principle. Given a map $s : \Omega \to X$, we denote by $s_\# \mu$ the image of $\mu$ by $s$, which is a measure on $X$, and call it the distribution of $s$.

Proposition 18 (Carlier) Assume $u$ satisfies GSM, and $\mu$ is absolutely continuous with respect to the Lebesgue measure. Given an allocation $x(\theta)$, there is a unique (up to equality a.e.) implementable allocation $\bar{x}(\theta)$ with the same distribution:
\[ \bar{x}_\# \mu = x_\# \mu \]
and it is the solution of the following problem:
\[ \max \int_\Omega u(\theta, s(\theta)) \, d\mu \tag{12} \]
\[ s_\# \mu = x_\# \mu \tag{13} \]
The optimization problem (12),(13) is well known in mathematics as the mass transportation problem: it consists of transporting a given mass from $\Omega$ to $X$ so as to maximize an integral profit (or minimize an integral cost). It has a long history, going back to Monge in the late 18th century, and to Kantorovitch around 1950.
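A discrete analogue of the mass transportation problem helps to see what the constraint on distributions means. The sketch below rests on assumptions that are not in the text: $\mu$ is the uniform measure on four hypothetical types, the target distribution is uniform on four hypothetical allocations, and $u(\theta, x) = \theta x$. Every map with the prescribed distribution is then a permutation, and the maximizer is found by brute force.

```python
from itertools import permutations

def best_assignment(thetas, xs, u):
    """Brute-force search for the permutation sigma maximising
    the sum over i of u(thetas[i], xs[sigma[i]])."""
    n = len(thetas)
    return max(permutations(range(n)),
               key=lambda s: sum(u(thetas[i], xs[s[i]]) for i in range(n)))

# Hypothetical data: four types and four allocations, each carrying mass 1/4.
thetas = [3, 1, 4, 2]
xs = [10, 40, 20, 30]

sigma = best_assignment(thetas, xs, lambda th, x: th * x)
matched = {thetas[i]: xs[sigma[i]] for i in range(len(thetas))}
print(matched)
```

With this bilinear $u$ the optimal map is increasing in $\theta$ (here $s(\theta) = 10\,\theta$, the derivative of the convex function $5\theta^2$), which is the discrete counterpart of Corollary 16. Brute force is used only because the example is tiny.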
5
Optimization under incentive constraints
We will now consider optimization under incentive constraints: the principal looks, not for the best contract menu, because it might not be sustainable in the face of informational asymmetry, but for the best among all contract menus that are incentive-compatible. This is a rich area for research: there are many models of this type in the economic literature, and we will be content with describing just one, due to Rochet and Chone [7].

The model aims to describe the following situation. The principal enjoys a monopoly on a certain type of product. It comes in different qualities, and he wants to price the different qualities so as to extract the maximum possible amount from consumers. Since he is a monopolist, their only recourse is not to buy. If they buy, they have to buy from him.

More precisely, a monopolist sells a product with two characteristics $(y_1, y_2)$. Each consumer buys zero or one unit, and one unit of quality $y$ costs $p(y)$. There is a continuum of consumers whose tastes are subsumed by two parameters $(\theta_1, \theta_2)$: the utility customer $\theta$ gets from buying $x$ at price $p(x)$ is
\[ \theta_1 x_1 + \theta_2 x_2 + u(x_1, x_2) - p(x) \]
The type $\theta$ is private information: each consumer knows his own type, but the monopolist does not know his customers' types. However, he knows the distribution of $(\theta_1, \theta_2)$, say $f(\theta) \, d\theta$. Knowing that, he must decide on a price schedule $p(x)$. Once this price schedule is announced, each buyer of type $\theta$ maximizes $\theta' x + u(x) - p(x)$ with respect to $x$; if the resulting utility is positive, he buys the resulting product $x(\theta)$; if it is negative, he doesn't buy at all.

The seller's problem is to find the price schedule which will maximize total profit. In the language of the first section, he must find a contract menu $(x, p(x))$ which maximizes:
\[ \int_\Omega p(x(\theta)) \, f(\theta) \, d\theta \]
Of course such a contract must be incentive-compatible, otherwise consumers will buy qualities intended for other types, and the above computation is meaningless. So attention must be restricted to such contracts, which are restricted by additional constraints. As we pointed out in the preceding sections, the trick is then to look, not for the contract $x(\theta)$ itself, but for its potential function $V(\theta)$. Setting $V(\theta) = \max_x \{ \theta' x + u(x) - p(x) \}$ yields $\nabla V(\theta) = x(\theta)$ by the envelope theorem. Substituting for $p(x)$, we find:
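\[ \max \int_\Omega \left[ \theta' \nabla V(\theta) + u(\nabla V(\theta)) - V(\theta) \right] f(\theta) \, d\theta \tag{14} \]
\[ V \ \text{convex}, \quad V(\theta) \ge 0 \quad \forall \theta \tag{15} \]
Particularizing to $u(x) = -\frac{1}{2} \|x\|^2$ and $\Omega = [a_1, b_1] \times [a_2, b_2]$ with $f(\theta) = 1$ gives:
\[ \max \int_{a_1}^{b_1} \int_{a_2}^{b_2} \left[ \theta' \nabla V(\theta) - \frac{1}{2} \|\nabla V(\theta)\|^2 - V(\theta) \right] d\theta_1 \, d\theta_2 \tag{16} \]
\[ V \ \text{convex}, \quad V(\theta) \ge 0 \quad \forall \theta \tag{17} \]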
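The role of the potential $V$ and of the envelope theorem can be illustrated numerically in one dimension. The sketch below makes several assumptions that are not in the text: a single characteristic $x \in [0, 1]$ on a grid, $u(x) = -x^2/2$ as in the particular case above, and a hypothetical price schedule $p(x) = x^2/2$ chosen only for the example (it is not claimed to be optimal). For each type it computes the chosen quality $x(\theta)$ and $V(\theta) = \max_x \{\theta x + u(x) - p(x)\}$.

```python
def indirect_utility(theta, qualities, u, p):
    """Return (x(theta), V(theta)), where V(theta) is the maximum over the grid
    of theta * x + u(x) - p(x) and x(theta) is a maximiser."""
    best = max(qualities, key=lambda x: theta * x + u(x) - p(x))
    return best, theta * best + u(best) - p(best)

qualities = [k / 1000 for k in range(1001)]  # grid on [0, 1] (assumption)

def u(x):
    return -0.5 * x * x      # u(x) = -x^2 / 2

def p(x):
    return 0.5 * x * x       # hypothetical price schedule

thetas = [k / 10 for k in range(11)]
results = [indirect_utility(th, qualities, u, p) for th in thetas]
x_of = [r[0] for r in results]   # chosen quality for each type
V = [r[1] for r in results]      # potential for each type

# Envelope property in discrete form: each slope of V between neighbouring
# types lies between the qualities chosen by those two types.
slopes = [(V[i + 1] - V[i]) / (thetas[i + 1] - thetas[i]) for i in range(10)]
print(x_of[5], V[5], slopes[5])
```

For this choice the buyer of type $\theta$ picks $x(\theta) = \theta/2$ and obtains $V(\theta) = \theta^2/4$, a convex nonnegative function whose derivative is $x(\theta)$, in line with constraint (15).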
This is a new type of problem in the calculus of variations. It turns out that both the convexity constraint and the positivity constraint are binding: $V(\theta) = 0$ on a set of positive measure, and $V''(\theta)$ is degenerate on another set of positive measure. Both have economic interpretations: the first one is the set of types which don't buy, and in the second one several types may buy the same quality (bunching). There is also a third set, where $V$ is strictly convex: in this region there is one contract per type, so that each customer is individually targeted (screening).

The mathematical situation, even in this simple case, is not fully understood. Let us review quickly the main results to date. Existence is easy, both for (14),(15) and for (16),(17) (it does not even require that $u$ be concave in $x$). It has been recently proved by Carlier and Lachand-Robert [2] that the solution is $C^1$ up to and including the boundary. Necessary conditions for optimality have been obtained by Lions: denote by $K_p$ the cone of convex functions in W…

References
[1] G. Carlier, PhD thesis, Universite Paris-Dauphine, 2001.
[2] G. Carlier, T. Lachand-Robert, "Regularity of solutions for some variational problems subject to a convexity constraint", Comm. Pure App. Math., 54 (2001), p. 583-594.
[3] G. Carlier, T. Lachand-Robert, B. Maury, "A numerical approach to variational problems subject to convexity constraint", Numer. Math., 88 (2001), no. 2, 299-318.
[4] D. Kreps, "A course in microeconomic theory", Princeton University Press, 1990.
[5] Laffont and Tirole, "A theory of incentives in procurement and regulation", MIT Press, 1993.
[6] A. Mas-Colell, M. Whinston, J.R. Green, "Microeconomic theory", Oxford University Press, 1995.
[7] J.R. Rochet and P. Chone, "Ironing, sweeping, and multidimensional screening", Econometrica, 66 (1998).
Recent Progress in the Theory of Generalized Closed Sets*
Jiling Cao, Maximilian Ganster and Ivan Reilly

Abstract
In this paper we present an overview of our research in the field of generalized closed sets (in the sense of N. Levine). We will demonstrate that certain key concepts play a decisive role in the study of the various generalizations of closed sets.
1
Introduction and Preliminaries
In recent years there has been considerable interest in the study of generalized closed sets in the sense of N. Levine, and their relationships to other classes of sets such as $\alpha$-open sets, semi-open sets and preopen sets. This investigation has led to significant contributions to the theories of separation axioms, covering properties and generalizations of continuity. In this paper we shall give an overview of our approach to these topics, thereby demonstrating that certain key notions seem to play a fundamental role in the overall discussion.

For the convenience of the reader we first review some basic concepts, although most of them are very well known from the literature. A subset $S$ of a topological space $(X, \tau)$ is called $\alpha$-open (semi-open, preopen, semi-preopen) if $S \subseteq int(cl(int S))$ ($S \subseteq cl(int S)$, $S \subseteq int(cl S)$, $S \subseteq cl(int(cl S))$). Moreover, $S$ is said to be $\alpha$-closed (semiclosed, preclosed, semi-preclosed) if $X \setminus S$ is $\alpha$-open (semi-open, preopen, semi-preopen) or, equivalently, if $cl(int(cl S)) \subseteq S$ ($int(cl S) \subseteq S$, $cl(int S) \subseteq S$, $int(cl(int S)) \subseteq S$). The $\alpha$-closure (semi-closure, preclosure, semi-preclosure) of $S \subseteq X$ is the smallest $\alpha$-closed (semiclosed, preclosed, semi-preclosed) set containing $S$. It is well known that $\alpha\text{-}cl S = S \cup cl(int(cl S))$ and $scl S = S \cup int(cl S)$, $pcl S = S \cup cl(int S)$ and $spcl S = S \cup int(cl(int S))$.

*2000 Math. Subject Classification: 54A05, 54D10, 54F65, 54G05. Key words and phrases: g-closed set, Hewitt decomposition, extremally disconnected, $\alpha$-open, semi-open, preopen, semi-preopen, submaximal.

Njastad [25] has shown that the collection of $\alpha$-open sets of a space $(X, \tau)$ is a topology $\tau^\alpha$ on $X$. Moreover, if $SO(X, \tau)$ denotes the collection of all semi-open sets of $(X, \tau)$, then $SO(X, \tau)$ is a topology if and only if $(X, \tau)$ is extremally disconnected, i.e. the closure of every open set is open. In this case, $SO(X, \tau) = \tau^\alpha$ (see [25]).
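On a finite space the four classes of sets can be enumerated directly from the definitions. The following sketch assumes a hypothetical three-point space whose topology is chosen only for illustration; it computes interiors and closures by brute force and then lists the $\alpha$-open, semi-open, preopen and semi-preopen sets.

```python
from itertools import combinations

# Hypothetical three-point space; the topology is an illustrative choice.
X = frozenset("abc")
tau = {frozenset(), frozenset("a"), frozenset("b"), frozenset("ab"), X}

def interior(S):
    """Union of all open sets contained in S."""
    return frozenset().union(*[U for U in tau if U <= S])

def closure(S):
    """Complement of the interior of the complement."""
    return X - interior(X - S)

def powerset(points):
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

alpha_open = {S for S in powerset(X) if S <= interior(closure(interior(S)))}
semi_open = {S for S in powerset(X) if S <= closure(interior(S))}
preopen = {S for S in powerset(X) if S <= interior(closure(S))}
semi_preopen = {S for S in powerset(X) if S <= closure(interior(closure(S)))}

def is_topology(family):
    """Finite check: contains the empty set and X, and is closed under
    pairwise unions and intersections."""
    return (frozenset() in family and X in family and
            all(A | B in family and A & B in family for A in family for B in family))

# Extremally disconnected: the closure of every open set is open.
extremally_disconnected = all(closure(U) in tau for U in tau)

print(is_topology(alpha_open), is_topology(semi_open), extremally_disconnected)
```

In this example the $\alpha$-open sets coincide with the open sets, while $\{a, c\}$ and $\{b, c\}$ are semi-open but their intersection $\{c\}$ is not. The space is not extremally disconnected, so this agrees with the criterion for $SO(X, \tau)$ quoted above.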
Recall that a space $(X, \tau)$ is called resolvable if there exists a pair of disjoint dense subsets. Otherwise it is called irresolvable. $(X, \tau)$ is said to be strongly irresolvable if every open subspace is irresolvable. Hewitt [17] has shown that every space $(X, \tau)$ has a decomposition $X = F \cup G$, where $F$ is closed and resolvable and $G$ is open and hereditarily irresolvable. We shall call this decomposition the Hewitt decomposition of $(X, \tau)$. There is another important decomposition of a space which we shall call the Jankovic-Reilly decomposition. Since every singleton $\{x\}$ of a space $(X, \tau)$ is either nowhere dense or preopen (see [18]), we clearly have $X = X_1 \cup X_2$, where $X_1 = \{x \in X : \{x\} \text{ is nowhere dense}\}$ and $X_2 = \{x \in X : \{x\} \text{ is preopen}\}$.

Remark 1.1. Throughout this paper, $F$ resp. $G$ will always refer to the Hewitt decomposition, and $X_1$ resp. $X_2$ always to the Jankovic-Reilly decomposition.

In 1970, N. Levine [19] called a subset $A$ of a space $(X, \tau)$ generalized closed, shortly g-closed, if $cl A \subseteq O$ whenever $A \subseteq O$ and $O$ is open. Complements of g-closed sets are called g-open.
It is obvious that every closed set is g-closed but not vice versa. A space $(X, \tau)$ is called $T_{1/2}$ [19] if every g-closed set is closed, or equivalently, if every singleton is either open or closed [15].

Definition 1. A subset $A$ of a space $(X, \tau)$ is called
(1) $\alpha$-generalized closed (briefly, $\alpha g$-closed) [20] if $\alpha\text{-}cl A \subseteq U$ whenever $A \subseteq U$ and $U$ is open,
(2) generalized $\alpha$-closed (briefly, $g\alpha$-closed) [21] if $\alpha\text{-}cl A \subseteq U$ whenever $A \subseteq U$ and $U$ is $\alpha$-open,
(3) generalized semiclosed (briefly, gs-closed) [1] if $scl A \subseteq U$ whenever $A \subseteq U$ and $U$ is open,
(4) semi-generalized closed (briefly, sg-closed) [2] if $scl A \subseteq U$ whenever $A \subseteq U$ and $U$ is semi-open,
(5) generalized semi-preclosed (briefly, gsp-closed) [13] if $spcl A \subseteq U$ whenever $A \subseteq U$ and $U$ is open,
(6) regular generalized closed (briefly, r-g-closed) [26] if $cl A \subseteq U$ whenever $A \subseteq U$ and $U$ is regular open.

In [14], J. Dontchev summarized the relationships between these notions in a beautiful diagram. He also pointed out that none of the implications can be reversed.

[Diagram: implications among the classes closed set, g-closed set, $\alpha g$-closed set, $\alpha$-closed set, $g\alpha$-closed set, gs-closed set, r-g-closed set, semi-closed set, sg-closed set, gsp-closed set, semi-preclosed set, preclosed set]
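Levine's definition and the singleton criterion for $T_{1/2}$ spaces can both be tested by brute force on finite spaces. The sketch below uses two hypothetical topologies on a three-point set, chosen only for illustration: one in which every singleton is open or closed, and one in which this fails.

```python
from itertools import combinations

def powerset(points):
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def closure(S, X, tau):
    """Intersection of all closed sets (complements of open sets) containing S."""
    result = X
    for U in tau:
        C = X - U
        if S <= C:
            result = result & C
    return result

def closed_sets(X, tau):
    return {X - U for U in tau}

def g_closed_sets(X, tau):
    """A is g-closed if cl A is contained in every open set that contains A."""
    return {A for A in powerset(X)
            if all(closure(A, X, tau) <= O for O in tau if A <= O)}

def is_T_half(X, tau):
    """Criterion quoted above: every singleton is either open or closed."""
    return all(frozenset({p}) in tau or (X - frozenset({p})) in tau for p in X)

X = frozenset("abc")
# Two hypothetical topologies on X.
tau1 = {frozenset(), frozenset("a"), frozenset("b"), frozenset("ab"), X}
tau2 = {frozenset(), frozenset("a"), X}

g1, g2 = g_closed_sets(X, tau1), g_closed_sets(X, tau2)
print(is_T_half(X, tau1), g1 == closed_sets(X, tau1))
print(is_T_half(X, tau2), frozenset("b") in g2)
```

In the second space the set $\{b\}$ is g-closed because its only open superset is $X$, yet it is not closed, so that space is not $T_{1/2}$.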
2
Results
Our starting point in the investigation of generalized closed sets was a pair of open questions that Dontchev posed in [14], namely: characterize those spaces where
(A) every semi-preclosed set is sg-closed, and
(B) every preclosed set is $g\alpha$-closed.
These questions have been solved by Cao, Ganster and Reilly in [4]. To our surprise, both decompositions mentioned before, i.e. the Hewitt decomposition and the Jankovic-Reilly decomposition, played a key role in our solution to these questions. Further studies have shown that these decompositions are important in many more questions concerning generalized closed sets.

Recall that a space $(X, \tau)$ is called submaximal (resp. g-submaximal) if every dense subset is open (resp. g-open). $(X, \tau)$ is said to be locally indiscrete if every open subset is closed.

Theorem 2.1. [4] For a space $(X, \tau)$ the following are equivalent:
(1) $(X, \tau)$ satisfies (A),
(2) $X_1 \cap scl A \subseteq spcl A$ for each $A \subseteq X$,
(3) $X_1 \subseteq int(cl G)$,
(4) $(X, \tau)$ is the topological sum of a locally indiscrete space and a strongly irresolvable space,
(5) $(X, \tau)$ satisfies (B),
(6) $(X, \tau^\alpha)$ is g-submaximal.

This result motivated us to look for other possible converses in Dontchev's diagram. Out of the many results we obtained we shall present here two key results.

Theorem 2.2. [5] For a space $(X, \tau)$ the following are equivalent:
(1) every semi-preclosed set is $g\alpha$-closed,
(2) $(X, \tau^\alpha)$ is extremally disconnected and g-submaximal.

Theorem 2.3. [5] For a space $(X, \tau)$ the following are equivalent:
(1) $X_1 \subseteq cl G$,
(2) every preclosed subset is sg-closed,
(3) $(X, \tau)$ is sg-submaximal,
(4) $(X, \tau^\alpha)$ is sg-submaximal.

Corollary 2.4. If $(X, \tau^\alpha)$ is g-submaximal then $(X, \tau^\alpha)$ is also sg-submaximal. The converse, however, is false (see [5]).
3
Lower Separation Axioms
We already mentioned that the closer investigation of generalized closed sets had great impact on the theory of separation axioms. If we again have a look at Dontchev's diagram, the search for converses of other implications leads to the consideration of certain lower separation axioms. Recall that Maki et al. [22] have called a space $(X, \tau)$ a $T_{gs}$ space if every gs-closed subset is sg-closed. We have been able to characterize $T_{gs}$ spaces in the following way.

Theorem 3.1. [6] For a space $(X, \tau)$, the following are equivalent:
(1) $(X, \tau)$ is a $T_{gs}$ space,
(2) every nowhere dense subset of $(X, \tau)$ is a union of closed subsets, i.e. $(X, \tau)$ is $T^*$ [16],
(3) every gsp-closed set is semi-preclosed, i.e. $(X, \tau)$ is semi-pre-$T_{1/2}$ [13],
(4) every singleton of $(X, \tau)$ is either preopen or closed.

A space $(X, \tau)$ is called semi-$T_1$ [23] if each singleton is semi-closed; it is called semi-$T_{1/2}$ [2] if every singleton is either semi-closed or semi-open. Let $\tau_s$ denote the semi-regularization topology of a space $(X, \tau)$. The closure of a subset $A \subseteq X$ with respect to $\tau_s$ will be denoted by $\delta\text{-}cl A$. A subset $A$ of $X$ is called $\delta$-generalized closed if $\delta\text{-}cl A \subseteq U$ whenever $A \subseteq U$ and $U$ is open in $(X, \tau)$. Moreover, $(X, \tau)$ is called a $T_{3/4}$-space [12] if every $\delta$-generalized closed subset of $(X, \tau)$ is closed in $(X, \tau_s)$. The well-known digital line, also called the Khalimsky line, is a $T_{3/4}$-space which fails to be $T_1$. We now have the following result.

Theorem 3.2. For every space $(X, \tau)$,
(1) $T_{3/4} = T_{gs}$ + semi-$T_1$ [12],
(2) $T_{1/2} = T_{gs}$ + semi-$T_{1/2}$ [22],
(3) $T_{gs} \Rightarrow$ every $\alpha g$-closed set is $g\alpha$-closed,
(4) $T_{gs}$ + extremally disconnected = every gs-closed set is preclosed.

Corollary 3.3. In a $T_{gs}$ space, every g-closed set is $g\alpha$-closed.

This has led to the natural question of characterizing those spaces where every $g\alpha$-closed set is g-closed.

Theorem 3.4. [6] For a space $(X, \tau)$ the following are equivalent:
(1) every $g\alpha$-closed set is g-closed,
(2) every nowhere dense subset is locally indiscrete as a subspace,
(3) every nowhere dense subset is g-closed,
(4) every $\alpha$-closed set is g-closed.

Observe, however, that there exist spaces in which every nowhere dense subset is g-closed but there exists a nowhere dense set which is not closed (see [6]).
4
Gp-closed Sets
Definition 2. A subset $A$ of a space $(X, \tau)$ is called generalized preclosed, briefly gp-closed, [24] if $pcl A \subseteq U$ whenever $A \subseteq U$ and $U$ is open.

Our study of generalized preclosed sets has been carried out in great detail in [7]. As one might expect, here also the Hewitt decomposition, the Jankovic-Reilly decomposition, submaximality and extremal disconnectedness play a significant role. Out of the many results that we obtained we mention here two important characterizations.

Theorem 4.1. [7] For a space $(X, \tau)$ the following are equivalent:
(1) $(X, \tau)$ is a $T_{gs}$-space,
(2) every gp-closed subset of $(X, \tau)$ is preclosed,
(3) every gsp-closed subset of $(X, \tau)$ is semi-preclosed,
(4) every gp-closed subset of $(X, \tau)$ is semi-preclosed.

Theorem 4.2. [7] For a space $(X, \tau)$ the following are equivalent:
(1) every gsp-closed subset of $(X, \tau)$ is gp-closed,
(2) every semi-preclosed subset of $(X, \tau)$ is gp-closed,
(3) $(X, \tau)$ is extremally disconnected.
5
Sg-compact Spaces
Definition 3. A topological space $(X, \tau)$ is called sg-compact if every cover by sg-open sets has a finite subcover.

The class of sg-compact spaces has been introduced by Caldas [3], Devi, Balachandran and Maki [9] and Tapi, Thakur and Sonwalkar [27]. Sg-compact spaces are quite interesting because sg-openness seems to be the weakest form of generalized openness for which there exists a nontrivial corresponding notion of compactness. For example, the cofinite topology on any infinite set yields an sg-compact space. Clearly, every sg-compact space is semicompact and thus hereditarily compact.

Dontchev and Ganster [10] called a subset $A$ of a space $(X, \tau)$ hereditarily sg-closed, briefly hsg-closed, if every subset of $A$ is sg-closed. A space $(X, \tau)$ is said to be a $C_3$-space [10] if every hsg-closed set is finite. It is easily observed that every nowhere dense set is hsg-closed. Moreover, $A \subseteq X$ is hsg-closed if and only if $X_1 \cap int(cl A) = \emptyset$ [10].

Theorem 5.1. [10] For a space $(X, \tau)$ the following are equivalent:
(1) $(X, \tau)$ is sg-compact,
(2) $(X, \tau)$ is a $C_3$-space.

The question concerning products of sg-compact spaces is rather tricky. It has been shown in [11] that there exists a space $(X, \tau)$ which is sg-compact but $X \times X$ fails to be sg-compact. In addition, the following result holds.

Theorem 5.2. [11]
(1) If $X = \prod \{X_i : i \in I\}$ is sg-compact then only finitely many $X_i$ are not indiscrete,
(2) Suppose that $X = \prod \{X_i : i \in I\}$ is sg-compact. Then either all $X_i$ are finite, or exactly one of them is infinite and sg-compact and the rest are finite and locally indiscrete.
6
Concluding Remark
We want to draw the attention of the reader to a forthcoming paper of Cao, Greenwood and Reilly [8] where all the various notions of generalized closedness considered in the literature so far have been brought under a common framework.
References
[1] S. Arya and T. Nour, Characterizations of s-normal spaces, Indian J. Pure Appl. Math., 21 (1990), 717-719.
[2] P. Bhattacharya and B.K. Lahiri, Semi-generalized closed sets in topology, Indian J. Math., 29 (1987), 375-382.
[3] M.C. Caldas, Semi-generalized continuous maps in topological spaces, Portugal. Math., 52 (4) (1995), 399-407.
[4] J. Cao, M. Ganster and I. Reilly, On sg-closed sets and $g\alpha$-closed sets, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 20 (1999), 1-5.
[5] J. Cao, M. Ganster and I. Reilly, Submaximality, extremal disconnectedness and generalized closed sets, Houston J. Math., 24 (4) (1998), 681-688.
[6] J. Cao, M. Ganster and I. Reilly, On generalized closed sets, to appear in Topology & Appl.
[7] J. Cao, M. Ganster, Ch. Konstadilaki and I. Reilly, On preclosed sets and their generalizations, to appear in Houston J. Math.
[8] J. Cao, S. Greenwood and I. Reilly, Generalized closed sets: a unified approach, preprint.
[9] R. Devi, K. Balachandran and H. Maki, Semi-generalized homeomorphisms and generalized semi-homeomorphisms in topological spaces, Indian J. Pure Appl. Math., 26 (3) (1995), 271-284.
[10] J. Dontchev and M. Ganster, More on sg-compact spaces, Portugal. Math., 55 (1998), 457-464.
[11] J. Dontchev and M. Ganster, On a stronger form of hereditary compactness in product spaces, to appear in Rend. Circ. Mat. Palermo.
[12] J. Dontchev and M. Ganster, On $\delta$-generalized closed sets and $T_{3/4}$-spaces, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 17 (1996), 15-31.
[13] J. Dontchev, On generating semi-preopen sets, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 16 (1995), 35-48.
[14] J. Dontchev, On some separation axioms associated with the $\alpha$-topology, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 18 (1997), 31-35.
[15] W. Dunham, $T_{1/2}$-spaces, Kyungpook Math. J., 17 (1977), 161-169.
[16] M. Ganster, I. Reilly and M. Vamanamurthy, Remarks on locally closed sets, Math. Panon., 3 (1992), 107-113.
[17] E. Hewitt, A problem of set-theoretic topology, Duke J. Math., 10 (1943), 309-333.
[18] D. Jankovic and I. Reilly, On semi-separation properties, Indian J. Pure Appl. Math., 16 (1985), 957-964.
[19] N. Levine, Generalized closed sets in topological spaces, Rend. Circ. Mat. Palermo, 19 (1970), 89-96.
[20] H. Maki, K. Balachandran and R. Devi, Associated topologies of generalized $\alpha$-closed sets and $\alpha$-generalized closed sets, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 15 (1994), 51-63.
[21] H. Maki, R. Devi and K. Balachandran, Generalized $\alpha$-closed sets in topology, Bull. Fukuoka Univ. Ed. Part III, 42 (1993), 13-21.
[22] H. Maki, K. Balachandran and R. Devi, Remarks on semi-generalized closed sets and generalized semi-closed sets, Kyungpook Math. J., 36 (1996), 155-163.
[23] S. Maheshwari and R. Prasad, Some new separation axioms, Ann. Soc. Sci. Bruxelles, 89 (1975), 395-402.
[24] H. Maki, J. Umehara and T. Noiri, Every topological space is pre-$T_{1/2}$, Mem. Fac. Sci. Kochi Univ. Ser. A, Math., 17 (1996), 33-42.
[25] O. Njastad, On some classes of nearly open sets, Pacific J. Math., 15 (1965), 961-970.
[26] N. Palaniappan and K.C. Rao, Regular generalized closed sets, Kyungpook Math. J., 33 (1993), 211-219.
[27] U.D. Tapi, S.S. Thakur and A. Sonwalkar, S.g. compact spaces, J. Indian Acad. Math., 18 (2) (1996), 255-258.

Addresses:
Department of Mathematical Science, Faculty of Science, Ehime University, 790-8577 Matsuyama, JAPAN.
Department of Mathematics, Graz University of Technology, Steyrergasse 30, A-8010 Graz, AUSTRIA.
Department of Mathematics, University of Auckland, Private Bag 92019, Auckland, NEW ZEALAND.
Misconceptions in Mathematics
Dr. Haifa Nassar Kuncar* and Mr. Mahmoud Breigheith*
Bethlehem University

Abstract
The purpose of the study was to determine the mathematics deficiencies of high school pupils and to find the reasons behind them. This will enable us to make better plans for choosing the contents of our mathematics courses, and to review the teaching methods and the adequacy of the curriculum at schools. To accomplish this purpose, an aptitude test was given to a sample of tenth grade pupils from different schools in the areas of Bethlehem, Hebron and Jerusalem. The test addressed several topics traditionally taught at pre-tenth level, including a number of tasks designed to measure pupils' analytical abilities. We then reviewed the mathematics entrance exam at Bethlehem University taken by all applicants for the year 1997, and an aptitude test given to twelfth grade science pupils in 1998, and did an item analysis for the three exams. Two surveys were prepared, one for a sample of schoolteachers and one for freshman students at Bethlehem University. The purpose of the two surveys was to determine what teachers and students believe are the reasons for the misconceptions in mathematics. We analyzed the results of the surveys and compared the responses of the two groups. The study showed that students' performance in the diagnostic test, the aptitude exam and the mathematics entrance exam was poor. Most of the percentages of concept comprehension were below 50%, and misconceptions are not a problem at one level only but a problem that teachers and students have to face at different levels. From the students' and teachers' surveys, it was found that all factors causing misconceptions are important. Some recommendations were to stress the importance of mathematics to the real world, and to encourage cooperation between mathematics schoolteachers at different levels and between mathematics schoolteachers and university mathematics departments.

Introduction
New directions for the teaching of mathematics have been proposed during the past several years. Numerous mathematics educators have called for more concentration on conceptual depth rather than manipulation of skills. Most school courses are like cookbooks, where students are given recipes for how to solve a problem rather than concentrating on the understanding of the concepts.

* Haifa Nassar Kuncar, Assistant Professor, Mathematics Department, Bethlehem University, Palestine. Mahmoud Breigheith, M.Sc, Mathematics Department, Bethlehem University, Palestine.

At the university, most students who take their first calculus course face problems and show weakness in the understanding of the concepts. Also, from our previous research (Low Grades In The Mathematics Entrance Exam Obtained By Science Students, 1998), the 1997 Bethlehem University applicants showed poor performance in the mathematics entrance exam, especially when it comes to problem solving skills. This issue seems to be universal, and is certainly not new. Some years ago, lecturers in the King's College Mathematics Department wrote: "Many students of science subjects arrive at university with little facility and less interest in mathematics" (Baker; Crampin and Nuttall, 1973). Descriptions of the areas of mathematical weakness over the years have had much in common: calculus, logarithms, trigonometry, algebraic manipulation (especially factorization), ratio and percentages, and logic. In a paper on the mathematical deficiencies of engineering undergraduates written nearly twenty years ago, the authors, having produced a list of mathematical weaknesses such as that listed above, wrote: "Our evidence is that the deficiencies are more fundamental than this list would suggest or than students are prepared to admit" (Howarth and Smith, 1980). Moreover, some educators blamed the teachers' preparation programs. Grossnickle (1951) showed that in many Teachers' Colleges the program that is provided is wholly inadequate for preparing prospective teachers to deal effectively with the widely varied situations they will face in their schools. It is evident that there is a problem in mathematics conceptual comprehension that needs to be studied and followed up by all involved sectors.

Objectives
Our objective is to determine the mathematics deficiencies and find the reasons behind them. This will enable us to make better plans for choosing the contents of our mathematics courses, especially the intensive mathematics course and the first business mathematics course.
It will also enable us review the teaching methods, the adequacy of the curriculum and evaluation at schools. Hypothesis There are several factors causing misconceptions: the student, the teacher, and difficulty of the subject, the curriculum, the school, the society, and evaluation methods. We list here some of the items concerning the above main factors. 1. Mathematics teachers are not well prepared to teach mathematics. 2. Intellectual abilities of the students and lack of follow up. 3. Problems in the curriculum such as logical arrangement of the material, presentation of the material, length of the curriculum... 4. School problems such as the number of students in each class, facilities...
5. Home and society problems, such as family problems, economic situations...
6. Evaluation methods and lack of feedback.

We foresee that students will blame the teachers more than themselves, and that the teachers will blame the students more.

Procedure and findings

1. We gave a diagnostic test to 112 tenth grade pupils from different schools in the areas of Bethlehem, Hebron and Jerusalem. The test addressed several topics traditionally taught at pre-tenth level, including a number of tasks designed to measure pupils' analytical abilities. The test consisted of 15 questions covering different concepts appropriate to their level. A space was provided for them to show their work so that we could recognize their weaknesses. The test questions are given in Appendix 1. Figure (1) gives the distribution of grades for those pupils.

[Figure (1): bar chart titled "Aptitude test grades"; vertical axis: frequency; horizontal axis: grades, in bins up to 2, 3 to 4, 5 to 6, 7 to 8, 9 to 10, 11 to 12 and 13 to 15.]
Figure (1): The distribution of grades of 112 tenth grade students

The test mean was 42.7% and the standard deviation was 3.68%.

2. We analyzed the questions of the test and grouped them into concepts, most of which were presented more than once, in different problems and in different forms. Table (1) gives the concepts and the percentage of correct answers obtained from the 112 students.

3. We reviewed the mathematics entrance exam at Bethlehem University taken by all applicants for the year 1997, and an aptitude test given to twelfth grade science pupils in 1998 (research done by Kuncar and Bregheith, 1998), and summarized the results of the three exams in table (2). The entrance exam, the diagnostic test, and the aptitude test were prepared to cover most of the concepts taught during the 12 years at school.
Table (1): The percentages of concept comprehension in the aptitude exam given to tenth grade students

| Concepts | Percentage of concept comprehension |
|---|---|
| Ratios, proportions and percentages | 38.1 |
| Fractions | 40.5 |
| Trigonometric functions | 8 |
| Triangles and the Pythagorean theorem | 49.1 |
| Critical thinking and logic | 44 |
| Constructing and solving equations | 53.6 |
| Areas of regular geometric shapes | 28.6 |
| Counting | 19.6 |
| Basic statistics | 65.2 |
| Angles | 56.3 |
| Exponents and roots | 48.7 |
| Functions | 19.6 |
| Three dimensional figures | 32.1 |
Table (2): Percentages of concept comprehension in a total of the three exams

| No. | Concepts | Number of occurrences | Total percentage of concept comprehension |
|---|---|---|---|
| 1 | Areas of Regular Geometric Figures | 13 | 40.5% |
| 2 | Analytic & Logic | 22 | 44.0% |
| 3 | Factorization of Polynomials & their Zeros | 7 | 34.3% |
| 4 | Addition & Subtraction of Fractions | 9 | 56.2% |
| 5 | Triangles & Their Similarity | 8 | 43.7% |
| 6 | Multiplication & Division of Fractions | 5 | 60.3% |
| 7 | Ratio & Percentages | 8 | 41.0% |
| 8 | The Circle | 5 | 44.7% |
| 9 | Constructing & Solving A System of Linear Equations | 9 | 46.4% |
| 10 | Distance & Vectors | 6 | 39.0% |
| 11 | Simplification of Rational Exponents | 6 | 57.2% |
| 12 | Sequences & Series | 3 | 51.4% |
| 13 | Inequalities & the Order of the Real Numbers | 3 | 50.3% |
| 14 | Straight Lines & Geometric Interpretation of A Curve | 2 | 41.2% |
| 15 | Odd Versus Even Integers | 1 | 29.4% |
| 16 | Multiplication of Matrices | 1 | 18.0% |
| 17 | Properties of Logarithms | 1 | 25.0% |
| 18 | Trigonometric Functions | 3 | 36.8% |
| 19 | Conditional Probability | 1 | 33.0% |
| 20 | Counting | 2 | 38.8% |
| 21 | Basic Statistics | 2 | 53.3% |
| 22 | Rational & polynomial functions | 3 | 63.2% |
| 23 | Three dimensional regular shapes | 1 | 32.1% |
4. To determine the reasons for the misconceptions, we prepared a survey consisting of 7 main factors, each followed by several items. We distributed the survey to a sample of 110 Palestinian mathematics schoolteachers. In the survey, different reasons were given for the misconceptions, and teachers were asked to list them in order from the most important to the least important. To give items their proper degree of importance, we assigned them weights 1, 2, 3 and so on, as follows: 1 is less important than 2, 2 is less important than 3, and so on. Then we calculated the weighted mean out of 100. Table (3) lists the main factors, the items to be listed in order, and the weighted percentage of each item. The sub items are sorted according to their weights.

Table (3): Teachers' responses to the questionnaire

| No. | Factors and items | Weights |
|---|---|---|
| 1 | The main factors causing misconceptions | |
| | The student | 84.5% |
| | Home and society | 68.2% |
| | The curriculum and the nature of the material | 60.9% |
| | The teacher | 60.2% |
| | The school | 47.1% |
| | Evaluation methods | 29.8% |
| 2 | The factors concerning the student | |
| | Negligence and lack of follow up by the student | 86.9% |
| | Intellectual abilities of the student | 77.3% |
| | Individual differences | 53.5% |
| | Relation between the student and the teacher | 51.1% |
| | Relation between the students themselves | 36.7% |
| 3 | The factors concerning the teacher | |
| | Teaching of mathematics by non-mathematics majors | 73.6% |
| | Inappropriate teaching methods and negligence of individual differences | 60.3% |
| | Poor economical situation for teachers | 59.0% |
| | Teacher's heavy teaching load | 56.8% |
| | Teachers are not well trained for teaching mathematics | 56.5% |
| | The personality and attitude of the teacher | 54.2% |
| | Different meanings of concepts and teaching methods applied by different teachers at different levels | 40.8% |
| 4 | The factors concerning the curriculum and nature of the material | |
| | Length of the curriculum | 74.8% |
| | Boredom from the abstraction of the subject, which causes students to prefer other subjects | 64.7% |
| | Inconsistency of the material with the students' intellectual abilities | 62.0% |
| | Mathematics being an accumulative subject causes difficulty in following it up and understanding it | 58.5% |
| | Weakness in the logical sequence between the chapters at the same level or between different levels | 48.6% |
| | Weakness in the presentation of the material and resorting to memorization rather than reasoning and critical thinking | 44.8% |
| 5 | The factors concerning the school | |
| | Large number of students in the classrooms | 89.3% |
| | Inappropriate time of the mathematics period | 72.5% |
| | Lack of teachers' office hours in schools, which does not allow private discussion between students and teachers | 53.3% |
| | Weakness in the follow up of students' performance by the school administration | 48.5% |
| | Weakness in cooperation between mathematics teachers in the same school | 37.5% |
| 6 | The factors concerning home and society | |
| | Indifference of the parents towards their children's performance | 83.5% |
| | Frustration of the social community from the academic process caused by the economical situation | 64.1% |
| | Family problems | 62.7% |
| | Difference between the students' and parents' academic preferences | 54.5% |
| | Different teaching methods between home and school | 50.9% |
| | Inappropriate studying atmosphere at home | 36.2% |
| 7 | The factors concerning students' and teachers' evaluation methods | |
| | The mandatory limited maximum percentage of failures at each level | 60.8% |
| | Not choosing suitable evaluation methods in order to determine the weaknesses of the students | 58.2% |
| | Evaluation of teachers by the mathematics coordinators by observing some of his/her classes rather than considering the students' performance | 55.8% |
| | Negligence in discussing the students' mistakes in their exams | 54.8% |
| | The concern of teachers to increase the number of students passing the exam (especially tawjihi) rather than the understanding of mathematical concepts | 50.9% |
| | Evaluation of students that depends mostly on testing techniques rather than the understanding of the concepts | 49.2% |
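The weighting step described in item 4 can be made concrete with a short sketch. The survey does not state its exact formula, so the version below is an assumption: an item ranked in position r among k items receives weight k - r + 1, and the weighted percentage is the mean weight divided by the maximum weight k, times 100. The function name and the sample ranks are hypothetical and are not survey data.

```python
def weighted_percentage(ranks, n_items):
    """Weighted mean out of 100 for one item (assumed formula, see lead-in).

    ranks   -- rank given to the item by each respondent (1 = most important)
    n_items -- number of items that were ranked together
    """
    # Assumption: rank 1 earns the largest weight, rank n_items the weight 1.
    weights = [n_items - r + 1 for r in ranks]
    return 100.0 * sum(weights) / (n_items * len(weights))


# Hypothetical example: four respondents rank one item among five items.
example = weighted_percentage([1, 1, 2, 3], n_items=5)
print(f"weighted percentage: {example:.1f}%")
```

Under this convention an item that every respondent ranks first scores 100%, and an item that every respondent ranks last among k items scores 100/k percent.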
5. To determine what students believe to be the causes of their misconceptions, we prepared a survey similar to the teachers' survey and distributed it to a sample of 130 freshman students at Bethlehem University. Again, the students were asked to list the items following each factor from the most important to the least important. We calculated the weighted percentages as before. A summary of the students' survey is given in Table (4).
Table (4): Students' responses to the questionnaire

| No. | Factors and items | Weights |
|---|---|---|
| 1 | The main factors causing misconceptions | |
| | The teacher | 78.6% |
| | The curriculum and the nature of the material | 74.7% |
| | The student | 71.4% |
| | The school | 54.6% |
| | Home and society | 42.2% |
| | Evaluation methods | 32.2% |
| 2 | Factors concerning the teacher | |
| | The inability of the teacher to present the concepts clearly, caused by his/her inappropriate specialty and/or lack of teachers' preparation | 70.5% |
| | Inadequate teaching methods and unsuitable examples to explain the concepts | 66.9% |
| | Distrust in the teacher and lack of interest in the material caused by a weak teacher's personality | 58.2% |
| | Lack of positive interaction between the students and the teacher ("respect and encouragement") | 58.1% |
| | Different meanings of concepts and teaching methods applied by different teachers at different levels | 52.2% |
| | Not stressing the importance of mathematics in real life | 52.1% |
| 3 | Factors concerning the student | |
| | Not knowing the appropriate way to study mathematics | 69.7% |
| | Students are not aware of the importance of mathematics to their future study and work | 62.8% |
| | Intellectual ability and individual differences | 59.8% |
| | Students prefer to play rather than study, especially in the elementary levels | 56.8% |
| | Frustration and lack of encouragement, especially from peers | 53.7% |
| 4 | Factors concerning the curriculum and the nature of the material | |
| | Since in learning mathematics students depend a lot on the teacher, any distraction in the classroom causes misunderstanding of the concepts | 65.7% |
| | Mathematics being an accumulative subject causes difficulty in following it up and understanding it | 64.3% |
| | Weakness in the logical sequence between the chapters at the same level or between the different levels | 64.0% |
| | Length of the curriculum | 58.9% |
| | Boredom from the subject due to its abstraction, which causes students to show more interest in other subjects | 52.2% |
| 5 | Factors concerning home and society | |
| | Difference between the students' and parents' academic preferences | 64.5% |
| | Indifference of the parents towards their children's performance | 63.2% |
| | Family problems ("economical, social...") | 60.9% |
| | Inappropriate studying atmosphere at home | 60.0% |
| | Frustration of the social community from the academic process caused by the economical situation | 56.5% |
| 6 | Factors concerning students' evaluation methods | |
| | The concern of teachers to increase the number of students passing the exam (especially tawjihi) rather than the understanding of concepts | 69.1% |
| | Negligence in discussing the students' mistakes in their exams | 68.5% |
| | Weakness in choosing suitable evaluation methods in order to determine the weaknesses of the students | 66.9% |
| | In most of the mathematics exam questions, the solution has only two possibilities ("right or wrong"), which causes frustration of students | 54.0% |
| | Evaluation of students depends mostly on testing techniques rather than the understanding of concepts | 45.8% |
| 7 | Factors concerning the school | |
| | Large numbers of students in the classrooms | 68.0% |
| | Weakness in the following up of students' performance by the school administration | 67.4% |
| | Weakness in cooperation between mathematics teachers in the same school | 65.4% |
| | Lack of teachers' office hours in schools, which does not allow private discussion between students and teachers | 54.2% |
| | Shortage of the weekly number of mathematics classes and the inappropriate time of the mathematics period | 51.7% |
Results

1. It can be seen from figure (1) and tables (1) and (2) that students' performance in the diagnostic test, the aptitude exam and the mathematics entrance exam is poor. Most of the concept comprehension percentages were below 50%, which agrees with our hypothesis that students do have misconceptions in mathematics. In appendix 2, we give some of the students' answers to some problems in the tenth grade diagnostic test and the twelfth grade aptitude test. Clearly, misconceptions are not a problem at one level only; they are a problem that teachers and students have to face at different levels. This is evident since table (1) gives the concept comprehension for the tenth grade students and table (2) for the tenth and twelfth grades and the 1997 university applicants, and the percentages of concept comprehension do not vary much.

2. From the students' and teachers' surveys, we notice that all mentioned factors causing misconceptions are important, since all weighted percentages are more than 29%. In the students' survey, the percentages range from 45.8% to 78.6%, and in the teachers' survey they range from 29.8% to 89.3%. Teachers have a bigger range than students, possibly because they have a wider view of the deficiencies in mathematics at all levels. Most of the sub items in the two surveys are identical; some factors were presented only to students and others only to teachers. We calculated the correlation r between teachers' and students' responses to the twenty-four identical items, and found that r = 0.27. We also tested this result for linearity by applying the correlation linearity test at significance level α = 0.05, and found that t = 1.31. Using the two-tailed test, we concluded that there is no significant linear correlation between teachers' and students' responses. In other words, teachers and students rated these items largely independently of each other. The following is a summary and comparison between the students' survey and the teachers' survey for the identical items.

a) A comparison between students' and teachers' responses to the main factors causing misconceptions is given in figure (2).
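The significance test reported in result 2 can be checked with a few lines of arithmetic. The sketch below assumes that the "correlation linearity test" is the usual t test for a Pearson correlation, t = r·sqrt(n - 2)/sqrt(1 - r²), with n = 24 paired items; that identification and the function name are assumptions. The critical value 2.074 is the standard two-tailed 5% point of Student's t distribution with 22 degrees of freedom.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


R_REPORTED = 0.27   # correlation reported in the text
N_ITEMS = 24        # number of identical items in the two surveys
T_CRITICAL = 2.074  # two-tailed, alpha = 0.05, df = 22 (standard t table)

t_value = correlation_t_statistic(R_REPORTED, N_ITEMS)
print(f"t = {t_value:.2f}")  # about 1.3, close to the reported t = 1.31
print("significant" if abs(t_value) > T_CRITICAL else "not significant")
```

Since the statistic falls well below the critical value, the sketch agrees with the conclusion that the linear correlation is not significant at the 5% level.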
[Figure (2): line chart titled "The Main Factors"; vertical axis: percentages, from 0% to 90%; horizontal axis: factors (evaluation methods, home and society, the school, the student, the curriculum, the teacher); series: teachers and students.]
Figure (2): Comparison between students' and teachers' responses to the main factors

From figure (2), we notice that a large percentage of students blame the teachers for their misconceptions and vice versa, as was expected. Also, the teachers give more importance than students to home and society. This could be because students are biased, or possibly because they do not recognize the effect of home on the learning process. Since students have deficiencies in mathematics and a lack of understanding, they blame the curriculum more than the teachers do.

b) A comparison between students' and teachers' responses to the items concerning the curriculum is given in figure (3).
[Figure (3): line chart titled "Curriculum"; vertical axis: percentages; horizontal axis: length of the curriculum, boredom from its abstraction, difficulty in follow up since it is accumulative, weakness in the logical sequence; series: teachers and students.]
Figure (3): Comparison between teachers' and students' responses concerning the curriculum.

Clearly, the teachers worry about finishing the curriculum, and they are aware that sometimes they have to rush through the material without giving their students enough time to grasp the concepts. So, they give more importance to the length of the curriculum, while students do not worry about finishing the curriculum. On the other hand, students feel that it is difficult to follow up the material because it is accumulative. We assume that teachers are more aware of the logical sequence of the curriculum, but students gave more importance to this item, which indicates the students' lack of understanding of the curriculum.

c) A comparison between the students' and teachers' responses to the items concerning home and society is given in figure (4).

[Figure (4): line chart titled "Home and society"; vertical axis: percentages, from 0% to 90%; horizontal axis: different academic preferences, indifference of the parents, family problems, inappropriate studying atmosphere, frustration caused by the economical situation; series: students and teachers.]
Figure (4): Comparison between teachers' and students' responses concerning home and society.
From the students' point of view, all items concerning home and society have almost equal importance, while teachers give different weights to different items. Educators are more aware of the parents' role in the learning process, so they emphasize parents' encouragement and follow up of their children. Since students are aware of the atmosphere at home, they gave it more importance than teachers did.

d) A comparison between teachers' and students' responses to items concerning the school is given in figure (5).

[Figure (5): line chart titled "School"; vertical axis: percentages, from 0% to 90%; horizontal axis: large numbers of students in classrooms, weakness in follow up of students' performance, weakness in cooperation between math teachers, lack of teachers' office hours in schools, inappropriate time of the math period; series: students and teachers.]
Figure (5): Comparison between teachers' and students' responses concerning school.

Obviously, teachers realize that it is easy to ignore individual differences when one has a large number of students in the class, which causes a number of students, especially those rated below average, to have mathematics deficiencies. As mentioned earlier, students tend to blame their teachers for their misconceptions, and hence they give more weight than the teachers to the item concerning the lack of cooperation between mathematics teachers at different levels, while not all teachers see the need for such cooperation. We believe that cooperation between teachers is important, since it helps the teacher to clear up any misconceptions carried over from previous levels.

e) A comparison between teachers' and students' responses to the items concerning the evaluation methods is given in figure (6).
Figure (6): Comparison between teachers' and students' responses concerning the evaluation methods.

We note that teachers give less importance to the item "the concern of teachers to increase the number of students passing the exam", a concern which makes teachers concentrate on the techniques of solving problems rather than the understanding of the concepts. Teachers should be aware of this problem but probably do not want to admit it, while students give this item the most weight. We believe that teachers would like to increase the number of students passing the exams, especially the Tawjihi, because of the competition between schools or between teachers at the same school, which forces most teachers to concentrate on the manipulation of skills. Again, students tend to blame the teachers for not discussing their mistakes or not giving them feedback. We agree that not giving students feedback or not returning the exam papers does not allow students to know their mistakes and thus builds up a set of deficiencies in mathematics. So, students gave this item more weight than the teachers.

Recommendations

From the above results, it can be seen that all the factors mentioned in the surveys (the teacher, the student, the school, the curriculum, home and society, and the evaluation methods), including the sub items, have an effect on the learning process of mathematics. So all factors need to be considered seriously, since in one way or another they cause mathematics misconceptions. Correcting the teaching and learning process of mathematics requires considerable efforts from the public and private sectors, and is not something we can tackle or solve in a short period of time. We give here some recommendations that do not need great efforts and can be applied at any time.
1. Stress the importance of mathematics to the real world.
2. Cooperation between mathematics schoolteachers at different levels.
3. Cooperation between mathematics schoolteachers and university mathematics departments.
4. Emphasize the importance of feedback, which involves returning the exam papers, discussing students' mistakes, and class discussion. Communicating results and reflecting on thinking processes are important aspects of investigative work.
5. Distribution of short pamphlets containing the most frequent misconceptions to mathematics schoolteachers. These pamphlets could be prepared by mathematics departments at the local universities, by the ministry of education, or by any interested teachers.
6. Distribution of sample tests that concentrate on the understanding of the concepts rather than the manipulation of skills.
7. Communication between teachers and pupils for diagnostic purposes.

The above recommendations were discussed thoroughly with mathematics directors in the Hebron and Bethlehem districts, including direct contacts with teachers. In addition, a copy of the diagnostic test and pertinent information were provided to the aforementioned directors and teachers.
References

Baker, J., Crampin, M. and Nuttall, J. (1973) 'A Crash Course in Calculus', International Journal of Mathematics Education in Science and Technology, 4.
Brueckner, et al. Developing Mathematical Understandings in the Upper Grades. New York: Holt, Rinehart and Winston, 1961.
Breigheith, M. and Kuncar, H. A Study on the Low Grades in the Mathematics Entrance Exam at Bethlehem University Obtained by Science Students. An internal research completed in 1998.
Davis, Robert. Learning Mathematics: The Cognitive Science Approach to Mathematics Education. London: Routledge, 1984.
Greer, B. and Mulhern, G. (Eds.). New Directions in Mathematics Education. London: Routledge, 1992.
Howarth, M. J. and Smith, B. J. (1980) 'Attempts to Identify and Remedy the Mathematical Deficiencies of Engineering Undergraduate Entrants at Polytechnic', International Journal of Mathematics Education in Science and Technology, 11 (3), pp. 377-383.
Lerch, Harold. Active Learning Experiences for Teaching Elementary School Mathematics. Dallas: Houghton Mifflin Company, 1981.
Polya, G. (1980) How to Solve It, New Jersey: Princeton Press, 1973.
On the Category-Theoretic Approach to Differential Geometry

Federico G. Lastaria
Abstract

The purpose of this paper is to outline the program - formulated by F.W. Lawvere in 1967 - of giving mathematical foundations for continuum mechanics, differential geometry and calculus of variations in category-theoretic terms. Among the remarkable features of this approach are the unification of the finite and the infinite-dimensional cases, and the rigorous systematic use of nilpotent infinitesimal quantities, widely used in mathematics at least until the end of the 19th century and still extensively used, at least in heuristic form, by physicists and engineers. Suitable axioms allow one to reduce differential calculus to simple algebra, fully combining rigor with intuition. The importance of such an approach in the teaching of calculus deserves the utmost attention. The construction of models for categories of smooth spaces is finally sketched.
Introduction

The category-theoretic approach to continuum mechanics, differential geometry and the calculus of variations was first formulated by F.W. Lawvere in a series of lectures in Chicago in 1967. In the sequel, such a theory will be referred to with the well-established phrase Synthetic Differential Geometry, though Categorical Dynamics - the title of Lawvere's 1967 Chicago lectures - might be more appropriate. In general terms, the issue is to give an axiomatic theory of continuous bodies and continuous variations: the very notion of smoothness and the decisive conceptual constructions that can be performed on continuous bodies are described as ways in which bodies are transformed or mapped into each other. Otherwise stated, fundamental concepts and constructions are described mathematically through consideration of morphisms in suitable categories, namely cartesian closed categories satisfying appropriate axioms, which are chosen to model objective properties of the physical world. Category theory provides the indispensable conceptual tools for such an approach, as well as for the construction of suitable models.

The paper is organized as follows. In the first section it is explained what cartesian closed categories are and why they are required, along with infinitesimal nilpotent quantities, for a mathematical description of the physical world. The second section briefly shows how elementary calculus may be developed in a purely algebraic way, by a systematic use, on a rigorous basis, of nilpotent infinitesimal quantities. Finally, section 3 accounts for the consistency of the whole theory, sketching the construction of models for Synthetic Differential Geometry.

This paper is an invitation to the topic, meant to provide the reader with some motivation to go through the specialized literature cited in the bibliography. In particular Lawvere's paper [7], his Introduction in [8], Kock's fundamental book [4], Bell's booklet [1], and [6] and [15] by Moerdijk and Reyes, from each of which the present paper has largely drawn, are suggested as further reading. For an outline of more recent developments of Synthetic Differential Geometry, see [11] and the references therein. It must finally be remarked explicitly, to avoid confusion, that Lawvere's categorical approach to calculus, outlined in this note, completely differs from A. Robinson's 'Nonstandard analysis' (1966), both in conceptual origins and in technical developments.

Acknowledgments. My thanks go to Anders Kock for his corrections and for his constructive criticisms, and to Bill Lawvere for his interest in this work and for his kindness and patience in explaining to me the fundamental philosophical ideas underlying the theory. Of course, I insist on retaining sole responsibility for any errors or misconceptions that may still remain.

Proceedings of the Third International Palestinian Conference on Mathematics and Mathematics Education, Bethlehem University, 9-12 August 2000.
1 Why Cartesian Closed Categories and Infinitesimal Quantities?
The category $\mathbf{Man}$ of finite-dimensional smooth manifolds is not a convenient setting for the study of continuum physics, differential geometry and the calculus of variations - more generally, for the study of quantities which undergo smooth variations - because it lacks good categorical properties, such as the existence of all finite limits and the existence of exponentials.

To begin with, the category $\mathbf{Man}$ lacks the property of being closed with respect to the fundamental construction of function spaces, or exponentials. Roughly speaking, the space of smooth maps between two smooth manifolds is not a smooth manifold. More precisely, $\mathbf{Man}$ is not cartesian closed, according to the terminology that we explain below. To illustrate the importance of such a concept, consider the following constructions, of fundamental importance in continuum physics ([7],[9]).

In order to describe the motion of bodies in space, we need to consider a category $\mathbf{E}$. A time interval, a body and the ordinary flat space are modelled as objects of $\mathbf{E}$, say $T$, $B$ and $E$ respectively. For any pair of objects $X$, $Y$, denote by $Y^X$ the set of morphisms from $X$ to $Y$. A particular motion of the body may be described in any of the following three ways:

• as a map $B \times T \to E$, that to each pair (particle, instant) associates a position in space;
• as a path $T \to E^B$ in the map space $E^B$, where the latter is the space of all possible placements of $B$ in the space $E$;
• as a placement $B \to E^T$ of $B$ in the path space $E^T$, the map space of all paths of $E$. Here one assigns to every particle of $B$ its path through $E$.

For the purposes of continuum physics, it is necessary that these three descriptions be all available and equivalent, because each of them has its own range of applications. Here are some examples.

1. Let $E \to R$ be a numerical function giving, for instance, the distance from a fixed reference point $P_0$ of $E$. Composition with the map $B \times T \to E$ gives the distance from $P_0$ of each particle at each instant during the motion.
2. The center of mass of a body depends only on the way the body is placed in space, and is therefore a map $E^B \to E$. By composing this map with the motion described as a map $T \to E^B$, one describes the motion $T \to E$ of the center of mass of the body.
3. Let $V$ be the object of velocities of $E$, i.e. the vector space of translations of the flat space $E$. Then differentiation gives a morphism $E^T \to V^T$ which associates a velocity path to every space path. Composing the latter with the motion of a body in the form $B \to E^T$, one obtains the velocity path of the particles of the body $B$ as a map $B \to V^T$.

The existence and equivalence of the different ways of representing a motion described above (for any three objects in $\mathbf{E}$) are ensured by the property of cartesian closedness, which defines the map-space construction as an adjoint functor. A category $\mathbf{E}$ is called cartesian closed if it has finite cartesian products (including an empty product 1, the final object) and has exponentials, right adjoint to product. This means that for any two objects $A$, $Y$ there is an object $Y^A$ such that for any object $X$ there is a natural bijection

$$\frac{A \times X \to Y}{X \to Y^A}$$

between the indicated maps. For every pair of objects $A$, $Y$, the points of the map object $Y^A$ are (identified with) the maps from $A$ to $Y$.
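The three descriptions of a motion, and the passage between them, can be illustrated in the most elementary cartesian closed setting, that of sets and ordinary functions. The sketch below is only an analogy under that assumption: Python functions carry no smoothness, and the two-particle body, the uniform translation and all function names are hypothetical. It shows the transposition of $B \times T \to E$ into $T \to E^B$ and into $B \to E^T$, and the composite $T \to E^B \to E$ giving the motion of the center of mass (equal masses are assumed).

```python
# Hypothetical finite "body": two particles with fixed starting positions.
START = {"p1": (0.0, 0.0, 0.0), "p2": (1.0, 0.0, 0.0)}


def motion(particle, t):
    """B x T -> E: position of a particle at instant t (uniform translation along x)."""
    x, y, z = START[particle]
    return (x + t, y, z)


def as_path_of_placements(m):
    """Transpose B x T -> E into a path T -> E^B."""
    return lambda t: (lambda particle: m(particle, t))


def as_placement_in_paths(m):
    """Transpose B x T -> E into a placement B -> E^T."""
    return lambda particle: (lambda t: m(particle, t))


def from_path_of_placements(path):
    """Inverse transpose: T -> E^B back to B x T -> E."""
    return lambda particle, t: path(t)(particle)


def center_of_mass(placement):
    """E^B -> E, assuming equal masses for the particles in START."""
    points = [placement(p) for p in START]
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def center_of_mass_motion(m):
    """Compose T -> E^B with E^B -> E to obtain T -> E."""
    path = as_path_of_placements(m)
    return lambda t: center_of_mass(path(t))


print(as_path_of_placements(motion)(2.0)("p2"))
print(as_placement_in_paths(motion)("p1")(0.5))
print(center_of_mass_motion(motion)(2.0))
```

Transposing and transposing back returns the original map, which is the set-level shadow of the natural bijection stated above.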
Another limitation of the category of manifolds is the absence of ways of dealing with infinitesimal quantities. Reasoning with infinitesimals - more generally, synthetic reasoning - was a fundamental tool in studying and teaching mathematics, physics and engineering at least until the end of the nineteenth century, when it was discredited and slandered as non-rigorous by the process of arithmetization. As A. Weil points out: "Depuis les premiers tâtonnements du calcul infinitésimal et de la géométrie différentielle, les géomètres ont pensé "points infiniment voisins" (ou "points proches"; le terme latin est "proximus"), quitte à rédiger tout autrement lorsque cette notion fut tombée en discrédit: il y a là un chapitre d'histoire qui mériterait d'être écrit." ([17].) [In English: "Ever since the first gropings of the infinitesimal calculus and of differential geometry, geometers have thought in terms of 'infinitely near points' (or 'near points'; the Latin term is 'proximus'), even if they wrote things up quite differently once this notion had fallen into discredit: there is a chapter of history here that would deserve to be written."]
Lawvere's proposal is to work in cartesian closed categories in which the real line $R$ is (no longer a field but) a commutative ring with non-zero nilpotents. Through an appropriate axiom, which will be stated in the next section, the powerful intuitive ways of thinking and computing in terms of infinitesimals are thus recovered and systematically used in a rigorous setting.

The consideration of cartesian closed categories finally allows the unification of the finite and the infinite-dimensional cases. For instance, along with any pair of (finite-dimensional) spaces $X$, $Y$, the (infinite-dimensional) map-space $Y^X$ is available inside the category. This issue is already explicit in B. Riemann's 1854 lecture "On the hypotheses which lie at the foundations of geometry", where it is pointed out that there are also "manifolds in which the determination of position requires not a finite number, but either an infinite sequence or a continuous manifold, of determinations of quantities."¹
2 Calculus in a Cartesian Closed Category of Smooth Objects
Differential and integral calculus, in one or several variables, may be developed in category-theoretic terms. We refer the reader to [1] and [4] for a thorough treatment and limit ourselves to some basic principles and examples. We assume that we work in a cartesian closed category $\mathbf{E}$. We further assume that in $\mathbf{E}$ there is a commutative ring object $R$, referred to as the real line. The object $D$ of nilpotent infinitesimals of first order is definable in terms of the smooth line $R$ as²

$$D = \{ d \in R \mid d^2 = 0 \}.$$

Geometrically, in the plane $R \times R$, $D$ may be thought of as the intersection of the circle $x^2 + (y - 1)^2 = 1$ with the tangent line $y = 0$. The fundamental axiom which allows one to develop calculus is the following.

Axiom 1 (Kock-Lawvere). For any $f : D \to R$ there exist unique $a, b \in R$ such that

$$\forall d \in D \quad f(d) = a + b \cdot d.$$

¹ "Es gibt indess auch Mannigfaltigkeiten, in welchen die Ortsbestimmung nicht eine endliche Zahl, sondern entweder eine unendliche Reihe oder eine stetige Mannigfaltigkeit von Grössenbestimmungen erfordert." [16]. (Quotation in [15].)
² Actually $D$ is not a set, but an object of $\mathbf{E}$, namely the equalizer of the pair of maps $(-)^2 : R \to R$ and $0 : R \to R$. Such an equalizer exists, because $\mathbf{E}$ has all finite limits. One can prove that a coherent semantics ([4],[14]) allows one to work in $\mathbf{E}$ as if it were the category of sets, once some caution is taken about the use of the logical principle of the excluded middle, as explained below.

We can phrase the axiom by saying that $D$ is so small that all functions $f : D \to R$ are affine, while, on the other hand, $D$ is large enough to determine the slope of any $f : D \to R$ uniquely.
Thus every function $R \to R$, restricted to $D$, is an infinitesimal element of a straight line, and this straight line is uniquely determined by such an infinitesimal element. Axiom 1 implies that $D \neq \{0\}$ (otherwise the uniqueness claim would fail). Therefore the ring of line type $R$ is required not to be a field (in a field, $d^2 = 0$ forces $d = 0$). Furthermore, notice that Axiom 1 implies

$$(\forall h \in D \;\; h \cdot b_1 = h \cdot b_2) \;\Rightarrow\; b_1 = b_2,$$

i.e. 'universally quantified first order nilpotents may be cancelled'.

Now there is a subtle logical point to acknowledge: if we insist on an axiomatic approach to smoothness, as we do by Axiom 1, then the law of excluded middle (roughly speaking: a statement is either true or false) has to be rejected, as shown by the following argument, due to Steve Schanuel. By using the law of excluded middle, define a function $f : D \to R$ by

$$f(h) = \begin{cases} 1 & \text{if } h \neq 0, \\ 0 & \text{if } h = 0. \end{cases}$$

If $h_0 \in D$, $h_0 \neq 0$, then, by the Kock-Lawvere axiom,

$$1 = f(h_0) = f(0) + h_0 \cdot b = h_0 \cdot b.$$

By squaring, $1 = 0$. Thus, when working in cartesian closed categories satisfying the Kock-Lawvere axiom, one needs to leave classical logic. The weaker logic that must be used in such categories is Heyting logic. "The moral to be learnt here, and that we believe is new in mathematics, is that weaker logics have their virtue, not by being more 'secure' or 'trustworthy' than full classical logic, but by their ability to carry theories (like Axiom 1) which are so strong as to be inconsistent in classical logic." (See Kock's paper in [8].)

Now define the derivative $f' : R \to R$ of a function $f : R \to R$ as the function (unique, by Axiom 1) defined by

$$f(x + h) = f(x) + f'(x) \cdot h$$

for all $x \in R$ and for all $h \in D$. The process of forming derivatives may be iterated indefinitely. Thus every function $R \to R$ has higher derivatives $f', f'', \ldots$ of any order. The supply of nilpotents ensured by the Kock-Lawvere axiom makes it possible to prove the basic formulas of calculus rigorously by means of straightforward algebraic manipulations. Here are some instances.

Theorem. For any $f, g : R \to R$ and for any $r \in R$:

1. $(f + g)' = f' + g'$
2. $(r \cdot f)' = r \cdot f'$
3. $(f \cdot g)' = f' \cdot g + f \cdot g'$
4. $(g \circ f)' = (g' \circ f) \cdot f'$

Proof. By way of example, we prove 3 and 4. For all $x \in R$ and all $d \in D$,

$$\begin{aligned}
(f \cdot g)(x + d) &= f(x + d) \cdot g(x + d) \\
&= (f(x) + f'(x) \cdot d) \cdot (g(x) + g'(x) \cdot d) \\
&= (f \cdot g)(x) + (f'(x) \cdot g(x) + f(x) \cdot g'(x)) \cdot d.
\end{aligned}$$

(We used $d^2 = 0$.) Analogously, as far as equality 4 is concerned, for any $x \in R$ and for any $d \in D$, we have

$$\begin{aligned}
(g \circ f)(x + d) &= g(f(x + d)) \\
&= g(f(x) + f'(x) \cdot d) \\
&= g(f(x)) + g'(f(x)) \cdot f'(x) \cdot d.
\end{aligned}$$

(Notice that $f'(x) \cdot d \in D$.) By definition of derivative, it follows that

$$(g \circ f)'(x) = g'(f(x)) \cdot f'(x)$$

for all $x \in R$. □
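The algebra used in this proof can be imitated on a computer by calculating in the ring of dual numbers, where a formal symbol $d$ satisfies $d^2 = 0$. The following is only a minimal sketch under stated assumptions: the class `Dual` and the helper `derivative` are hypothetical names introduced for illustration, and the code models dual numbers over ordinary integers, not the ring object $R$ of the category $E$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Hypothetical model of a + b*d with d*d = 0 (a dual number)."""
    a: int
    b: int = 0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b d)(a' + b' d) = a a' + (a b' + b a') d, because d*d = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def _lift(x):
    """View an ordinary number as a dual number with zero nilpotent part."""
    return x if isinstance(x, Dual) else Dual(x)


def derivative(f, x):
    """Read off f'(x) as the coefficient of d in f(x + d)."""
    return f(Dual(x, 1)).b


# Two sample polynomial functions (chosen only for illustration).
f = lambda t: t * t * t + 2 * t   # f(t) = t^3 + 2t
g = lambda t: 3 * t * t + 1       # g(t) = 3t^2 + 1

x = 2
product_rule_lhs = derivative(lambda t: f(t) * g(t), x)
product_rule_rhs = derivative(f, x) * g(x) + f(x) * derivative(g, x)
chain_rule_lhs = derivative(lambda t: g(f(t)), x)
chain_rule_rhs = derivative(g, f(x)) * derivative(f, x)
```

For these sample polynomials, both sides of the product rule and both sides of the chain rule agree at `x = 2`, which mirrors equalities 3 and 4 of the theorem in this restricted polynomial setting.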
Notice that proofs like the one reported above are qualitatively simpler than the usual ones and, on the other hand, are as rigorous as the latter. Algebraic manipulations of nilpotent quantities have long been known as effective techniques to argue in the 'infinitely small'. To give an idea of synthetic reasoning, here are two other instances of proofs, taken from Lawvere's paper [7], based on the use of nilpotent infinitesimals. Numerous other examples from geometry and physics may be found in [1].

• The length $C(r)$ of the boundary of a disk of radius $r$. Define $\pi$ to be the area of a unit disk. By homogeneity, the area of a disk of radius $r$ is $A(r) = \pi r^2$. If the radius is perturbed by a small quantity $h$, the increment of area is

$$A(r + h) - A(r) = \pi((r + h)^2 - r^2) = \pi(2rh + h^2) = 2\pi r h$$

if 'small' is intended to mean $h^2 = 0$. On the other hand, we may assume that

$$A(r + h) - A(r) = C(r)\,h.$$

Therefore $C(r)\,h = 2\pi r h$ for all $h$ such that $h^2 = 0$. Since universally quantified nilpotents can be cancelled, it follows that $C(r) = 2\pi r$.

• The electric field of a dipole. Two charged particles $+q$ and $-q$ are placed a small distance $2h$ apart ($1 \gg h$). The electric field $E$ at a substantial distance $x$ from the mid-point between the particles, on the line containing the particles, is given by Coulomb's law:

$$\begin{aligned}
E &= \frac{1}{(x - h)^2} - \frac{1}{(x + h)^2}
= \frac{1}{x^2 - 2xh} - \frac{1}{x^2 + 2xh} \\
&= \frac{1}{x^2}\left[\frac{1}{1 - \frac{2h}{x}} - \frac{1}{1 + \frac{2h}{x}}\right]
= \frac{1}{x^2}\left[\left(1 + \frac{2h}{x}\right) - \left(1 - \frac{2h}{x}\right)\right]
= \frac{4h}{x^3}.
\end{aligned}$$
Algebraic calculations based on heuristic arguments of this sort, often found in physics books, become rigorous in our context if assumptions like $1 \gg h$ are interpreted as $h$ being nilpotent (i.e. satisfying $h^{n+1} = 0$ for some $n = 1, 2, \ldots$), while 'substantial' quantities are considered to be invertible elements.

By cartesian closedness, for every object $M$ in $E$ one can consider the space $M^D$ of infinitesimal paths in $M$, which is the tangent bundle of $M$. Thus the tangent bundle of an object $M$ is representable as the object of maps from $D$ to $M$. The representability of the tangent bundle functor greatly simplifies constructions and computations in the theory of vector fields, flows and differential equations. For a recent account, see [11].
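The two computations earlier in this section (the boundary of a disk and the field of a dipole) use only the rule $h^2 = 0$ together with the invertibility of 'substantial' quantities. The sketch below replays them with exact rational arithmetic. It is an illustration under assumptions, not part of the text's development: the class `Dual` and the functions `disk_area_increment` and `dipole_field` are hypothetical names, the charges are normalized to 1, and areas are measured in units of $\pi$.

```python
from fractions import Fraction


class Dual:
    """Hypothetical model of a + b*h with h*h = 0; 'a' is the substantial part."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        other = as_dual(other)
        return Dual(self.a + other.a, self.b + other.b)

    def __sub__(self, other):
        other = as_dual(other)
        return Dual(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        other = as_dual(other)
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b h) = 1/a - (b/a^2) h, which requires a to be invertible
        if self.a == 0:
            raise ZeroDivisionError("a purely nilpotent element has no inverse")
        return Dual(1 / self.a, -self.b / self.a ** 2)


def as_dual(x):
    """View an ordinary number as a + 0*h."""
    return x if isinstance(x, Dual) else Dual(x)


def disk_area_increment(r):
    """(r + h)^2 - r^2, i.e. A(r + h) - A(r) in units of pi."""
    r_plus_h = Dual(r, 1)
    return r_plus_h * r_plus_h - Dual(r) * Dual(r)


def dipole_field(x):
    """1/(x - h)^2 - 1/(x + h)^2 with h*h = 0 and unit charges (assumed)."""
    h = Dual(0, 1)
    x_minus = Dual(x) - h
    x_plus = Dual(x) + h
    return (x_minus * x_minus).inverse() - (x_plus * x_plus).inverse()
```

In this model the coefficient of $h$ in `disk_area_increment(r)` is $2r$, matching $C(r) = 2\pi r$, and the coefficient of $h$ in `dipole_field(x)` is $4/x^3$, matching the dipole formula. Calling `inverse()` on a purely nilpotent element raises an error, reflecting the distinction between nilpotent and invertible quantities.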
3 A Sketch of the Construction of Models
Up to now, we did not consider the problem of proving the existence of models that may be used as a universe of discourse for Synthetic Differential Geometry. Good models do exist, and the basic principles for their construction as (dual) sheaves on finitely generated $C^\infty$-algebras were given by Lawvere in 1967 ([3]). We now roughly sketch the ideas underlying their construction.

As emphasized in section 1, such models are required to be cartesian closed categories, in order to guarantee the existence of function spaces. Actually, they will turn out to be toposes, namely cartesian closed categories in which the notion of subobject is representable by a truth-value object $\Omega$. This means that, for any object $X$, there is a natural bijection between subobjects $S \hookrightarrow X$ of $X$ and maps $X \to \Omega$, the characteristic functions of the subobjects. (In the topos Set of sets, for example, the truth-value object is the two-element set $\Omega = \{0, 1\}$. For other examples see [9].) Furthermore, these toposes are required to contain infinitesimal objects (like the space $D$ of nilpotent infinitesimals of square zero) and are subjected to suitable axioms (such as the Kock-Lawvere axiom or the integration axiom), chosen to suit different purposes.

To construct such models, one starts from the category Man of manifolds, whose objects are finite-dimensional smooth manifolds and whose morphisms are infinitely differentiable maps. This category is not cartesian closed and it obviously does not contain infinitesimal objects like $D$. Now the basic idea is to specify a category of spaces as the dual of a category of commutative rings. An instance of such a line of thought was the introduction of the category of affine schemes by Grothendieck (around 1957) as a specification of the dual of the category Rings of commutative rings. The use of non-reduced commutative rings, i.e. rings with non-zero nilpotent elements, made it possible to apply differential geometric concepts in algebraic geometry, by rendering infinitesimal variations of points on an algebraic variety ([17]). In the same vein, the basic guideline in Synthetic Differential Geometry is to construct categories of smooth spaces as a specification of the dual of suitable subcategories of the category of commutative rings. A space will be reconstructed as the spectrum of a ring of functions. In this way, for example, the infinitesimal object $D$ will be recovered as the spectrum of its coordinate ring $\mathbb{R}^D$ (the ring of smooth maps from $D$ to the reals), where the latter, by the Kock-Lawvere axiom, is isomorphic to the ring of dual numbers $\mathbb{R}[X]/(X^2)$.

To be more precise, as a first step for the construction of models, we need to select a category $A$ satisfying:

• its objects include all the coordinate rings of (ordinary) manifolds as well as all the coordinate rings of infinitesimal spaces. The rings whose spectra are infinitesimal objects are the local algebras or Weil algebras ([17]);

• the morphisms of $A$ are required to be exactly the ring homomorphisms induced by smooth maps.

One such candidate is the category $A$ of finitely generated $C^\infty$-rings ([15]). The functor

$$\mathrm{Man} \to A^{op},$$

taking each manifold $M$ to its coordinate ring $C^\infty(M)$, is a full and faithful embedding of Man in the category $A^{op}$. Now the category $A^{op}$ has all finite limits and contains all ordinary manifolds as well as all infinitesimal objects, but it lacks function spaces in general, i.e. it is not cartesian closed. A natural step to achieve cartesian closedness is to consider the functor category $\mathrm{Set}^A$ of presheaves of sets on $A^{op}$. The composition of the functors

$$\mathrm{Man} \to A^{op} \xrightarrow{\;Y\;} \mathrm{Set}^A$$

(where $Y$ is the Yoneda embedding) gives an embedding $\mathrm{Man} \to \mathrm{Set}^A$. Now the category of presheaves $\mathrm{Set}^A$ is cartesian closed, contains infinitesimal objects and can easily be seen to satisfy the Kock-Lawvere axiom. Nevertheless, the embedding $\mathrm{Man} \to \mathrm{Set}^A$ is still unsatisfactory, because it does not preserve such colimits of Man as open covers. Preservation of open covers is needed in order to ensure, for example, that compact manifolds are mapped to compact spaces in the model, or to allow the consideration of local questions. In order to overcome this difficulty, one defines a Grothendieck topology, or covering system, on $A^{op}$ and considers the category (Grothendieck topos) $\mathrm{Sh}(A^{op}) = E$ of sheaves with respect to the fixed covering system: only those presheaves are considered that 'perceive' that open covers in Man are still open covers in $\mathrm{Set}^A$. We finally have functors

$$\mathrm{Man} \to A^{op} \to \mathrm{Set}^A \to E$$

whose composition $J : \mathrm{Man} \to E$ is a full and faithful embedding satisfying:

1. $J(\mathbb{R}) = R$, where $\mathbb{R}$ denotes the real numbers and $R$ is the real line in $E$;
2. $J(\mathbb{R} \setminus \{0\})$ = the set of invertible elements of $R$;
3. for every $f : \mathbb{R} \to \mathbb{R}$ with derivative $f' : \mathbb{R} \to \mathbb{R}$, $J(f') = (J(f))'$;
4. for every manifold $M$, $J(TM) = (JM)^D$, where $TM$ is the tangent bundle of $M$;
5. $J$ preserves transversal pullbacks.

The category $E$ is said to be a well-adapted model, in the sense that the embedding $J : \mathrm{Man} \to E$ is suitable for comparison between properties and constructions already available in the category of manifolds Man and those available in the enlarged universe $E$. The notion and the construction of well-adapted models of Synthetic Differential Geometry, indispensable for the applications to classical differential geometry, are due to Eduardo Dubuc ([2]).
References

[1] Bell, J.L., A Primer of Infinitesimal Analysis, Cambridge University Press, 1998.
[2] Dubuc, E., Sur les Modeles de la Geometrie Differentielle Synthetique, Cahiers de Topologie et Geometrie Differentielle, Vol. XX-3, 1979.
[3] Kock, A. (ed.), Topos Theoretic Methods in Geometry, Various Publications Series No. 30, Aarhus Universitet, Matematisk Institut, 1979.
[4] Kock, A., Synthetic Differential Geometry, London Math. Soc. Lecture Note Series, 51, Cambridge University Press, 1981.
[5] Kock, A., Synthetic differential calculus, and nilpotent real numbers, "Meeting on Existence in Mathematics", University of Roskilde, Nov. 24-25 2000, available at http://home.imf.au.dk/kock/
[6] Lavendhomme, R., Basic Concepts of Synthetic Differential Geometry, Kluwer, 1996.
[7] Lawvere, F.W., Toward the description in a smooth topos of the dynamically possible motions and deformations of a continuous body, Cahiers de Topologie et Geometrie Differentielle, Vol. XXI-4, 1980.
[8] Lawvere, F.W. and Schanuel, S.H. (ed.), Categories in Continuum Physics, Lecture Notes in Mathematics N. 1174, Springer-Verlag, 1982.
[9] Lawvere, F.W. and Schanuel, S.H., Conceptual Mathematics: A First Introduction to Categories, Cambridge University Press, 1997.
[10] Lawvere, F. William, Toposes of Laws of Motion, Montreal 1997, available at http://www.acsu.buffalo.edu/~wlawvere/
[11] Lawvere, F. William, Outline of Synthetic Differential Geometry, 1998, available at http://www.acsu.buffalo.edu/~wlawvere/
[12] Lawvere, F.W., Everyday Physics of Extended Bodies or Why Functionals Need Analyzing, 1998, available at http://www.acsu.buffalo.edu/~wlawvere/
[13] Lawvere, F.W., Categorie e spazio: un profilo, Lettera Matematica Pristem, Springer, 31, 1999.
[14] McLarty, C., Elementary Categories, Elementary Toposes, Clarendon Press, Oxford, 1992.
[15] Moerdijk, I. and Reyes, G., Models for Smooth Infinitesimal Analysis, Springer-Verlag, 1991.
[16] Riemann, B., Gesammelte Mathematische Werke, Dover, New York, 1953.
[17] Weil, A., Theorie des points proches sur les varietes differentiables, Colloque Geom. Diff., Strasbourg, 1953. Oeuvres Scientifiques, Collected Papers, Vol. II, Springer-Verlag, 1979, p. 103-109. Commentaire, p. 534-536.
Federico G. Lastaria Dipartimento di Matematica Politecnico di Milano Piazza Leonardo da Vinci 32 20133 Milano, Italy [email protected]
Twenty-one characterizations of genus zero

Helmut Lenzing

Abstract

We deal with the class of smooth projective curves $C$ whose derived category of coherent sheaves is equivalent to the derived category of finite dimensional modules over a finite dimensional algebra. For an arbitrary base field $k$ this happens if and only if $C$ has genus zero. We collect a number of characterizations for $C$ to be of genus zero. Some of the characterizing properties appear in print for the first time; others, often in an implicit form, are scattered throughout the literature, where usually they are stated only for an algebraically closed base field. We approach the question from the point of view of hereditary noetherian categories with Serre duality. The investigation of such categories has recently attracted much attention. For $k$ algebraically closed, we refer in this context to the characterization of hereditary noetherian categories with Serre duality by Reiten and van den Bergh [17], and to the characterization of hereditary categories with a tilting object by Happel [7]; see also [13].
Introduction

We assume that $k$ is an arbitrary field and that $C$ is a smooth projective curve defined over $k$. We are going to give a number of equivalent conditions for $C$ to have genus zero. For the case of an algebraically closed base field, many of these conditions are classical, but other characterizations use concepts of a more recent origin, often related to the concept of tilting [8, 6] and inspired by investigations in the framework of finite dimensional algebras. Statements and proofs, where available in print, are scattered throughout the literature; it therefore seems worthwhile to make a coherent account of this topic available. Proofs are written in the spirit of hereditary noetherian categories. Wherever possible, proofs have been selected so as to work over an arbitrary base field. We have made a systematic attempt to keep the exposition to a large degree self-contained, starting from a set of basic properties given in the next section. Many arguments, standard either to experts from algebraic geometry or from the representation theory of finite dimensional algebras, are therefore included without an explicit reference to their original sources. Instead we recommend [9] and [1] as general references for algebraic geometry and representation theory of finite dimensional algebras, respectively.

The author thanks the Centre for Advanced Study in Oslo for support and hospitality during a stay in August-September 2001, where this paper was written.
1 Basic properties of coh(C)
For the rest of this paper $C$ is a smooth projective curve, defined over $k$, and $\mathcal{H} = \mathrm{coh}(C)$ is the category of coherent sheaves on $C$. By $\mathcal{O}$ we denote the structure sheaf. We first note a couple of general features, denoted (H 1) to (H 6), of the category $\mathcal{H}$ which arise from this setting. Considering these properties as axioms, the account of this and the next section is to a large extent self-contained.

(H 1) $\mathcal{H}$ is an abelian $k$-category with finite dimensional morphism and extension spaces.

The term $k$-category refers to the fact that the morphism spaces $\mathrm{Hom}(X, Y)$ in $\mathcal{H}$ are vector spaces over $k$ and, moreover, composition of morphisms is $k$-bilinear.

(H 2) $\mathcal{H}$ is noetherian, that is, each object of $\mathcal{H}$ satisfies the ascending chain condition for subobjects. We require in addition that $\mathcal{O}$ has infinite length.

(H 3) $\mathcal{H}$ satisfies Serre duality $\mathrm{D}\,\mathrm{Ext}^1(X, Y) = \mathrm{Hom}(Y, \tau X)$ for a self-equivalence $\tau$ of $\mathcal{H}$. In particular $\mathcal{H}$ is hereditary, that is, $\mathrm{Ext}^2(-, -)$ vanishes.

(H 4) If $S_x$ is the simple sheaf concentrated in a point $x$, then $\mathrm{Hom}(\mathcal{O}, S_x)$ has dimension one as a left vector space over $\mathrm{End}(S_x)$. If $x$ and $y$ are distinct points of $C$, then $S_x$ and $S_y$ are non-isomorphic with $\mathrm{Ext}^1(S_x, S_y) = 0$.

(H 5) $\mathrm{End}(\mathcal{O}) = k$.
Clearly, the (full) subcategory $\mathcal{H}_0$ of $\mathcal{H}$ consisting of all finite length objects is a Serre subcategory of $\mathcal{H}$, that is, $\mathcal{H}_0$ is closed under the formation of subobjects, quotients and extensions. Therefore the quotient category $\mathcal{H}/\mathcal{H}_0$ of $\mathcal{H}$ with respect to $\mathcal{H}_0$, in the sense of Serre-Grothendieck, cf. [3, 16], is an abelian category. The category $\mathcal{H}/\mathcal{H}_0$ has the same objects as $\mathcal{H}$, but morphisms are defined as the direct limit

$$\mathrm{Hom}_{\mathcal{H}/\mathcal{H}_0}(X, Y) = \varinjlim_{X',\,Y'} \mathrm{Hom}_{\mathcal{H}}(X', Y/Y'),$$

where $X'$ runs through all subobjects of $X$ with $X/X'$ in $\mathcal{H}_0$ and $Y'$ runs through all subobjects of $Y$ belonging to $\mathcal{H}_0$. Let $q : \mathcal{H} \to \mathcal{H}/\mathcal{H}_0$, $X \mapsto \widetilde{X}$, denote the (exact) quotient functor, where $\widetilde{X}$ denotes the object $X$ of $\mathcal{H}$ when viewed as an object in $\widetilde{\mathcal{H}} = \mathcal{H}/\mathcal{H}_0$.

We recall that an algebraic function field $K$ in one variable over $k$ is a finite field extension of the rational function field $k(x)$.

(H 6) The endomorphism ring $k(C)$ of $\widetilde{\mathcal{O}}$ in $\mathcal{H}/\mathcal{H}_0$ is an algebraic function field in one variable over $k$, called the function field of $C$. Moreover, each object in $\mathcal{H}/\mathcal{H}_0$ is isomorphic to $\widetilde{\mathcal{O}}^n$ for some integer $n$.

We may rephrase (H 6) as stating that the category $\mathcal{H}/\mathcal{H}_0$ is equivalent to the category $\mathrm{mod}(k(C))$ of finite dimensional vector spaces over $k(C)$. We define the rank of an object $X$ of $\mathcal{H}$ as the length (dimension) of $\widetilde{X}$ in $\widetilde{\mathcal{H}}$.

Properties (H 1) to (H 3) are assumptions of a quite general nature, while the remaining properties (H 4) to (H 6) express features of a more special nature, specific to categories of coherent sheaves on a non-singular projective curve. Actually, as we are going to show elsewhere, any category $\mathcal{H}$ satisfying (H 1) to (H 6) is equivalent to the category of coherent sheaves over a non-singular projective curve. Note that assumption (H 5) may afford the passage from the original base field to the field of constants of $k(C)$.

Next we are going to specify consequences of the "axioms" (H 1) to (H 6). Since $\mathcal{H}$ is abelian with finite dimensional morphism spaces, each object $X$ of $\mathcal{H}$ is a finite direct sum $X_1 \oplus \cdots \oplus X_n$ of indecomposable objects. Due to finite dimensionality, the endomorphism rings of indecomposable objects are local rings. In particular, $\mathcal{H}$ satisfies the Krull-Schmidt property: decompositions into indecomposable objects are unique up to order and isomorphism. To investigate $\mathcal{H}$ it is therefore to a large extent possible to restrict to indecomposable objects. By $\mathcal{H}_+$ we denote the full subcategory consisting of objects without non-zero subobjects of finite length. Such objects will be called bundles. Since, clearly, $\tau$ preserves $\mathcal{H}_0$, it also preserves $\mathcal{H}_+$. Recall that an object $U$ of finite length is called uniserial if it has a unique composition series.

(H 7) Each indecomposable object $X$ either belongs to $\mathcal{H}_+$ or to $\mathcal{H}_0$. Each indecomposable object of finite length is uniserial, and thus is uniquely determined by its simple socle and its length.

PROOF. By noetherianness we choose a maximal subobject $X_0$ of $X$ of finite length. By construction $X_+ = X/X_0$ belongs to $\mathcal{H}_+$. If one of $X_0$ or $X_+$ is zero we are done, hence assume that both are non-zero. This yields a non-split exact sequence $\mu : 0 \to X_0 \to X \to X_+ \to 0$, and by Serre duality $0 \neq \mu \in \mathrm{Ext}^1(X_+, X_0)$ yields a non-zero morphism $X_0 \to \tau X_+$, a contradiction. Given a simple object $S$ in $\mathcal{H}_0$ there is exactly one further simple $S_1$ (resp. $S_2$) with $\mathrm{Ext}^1(S, S_1) \neq 0$ (resp. $\mathrm{Ext}^1(S_2, S) \neq 0$), namely $S_1 = \tau S$ (resp. $S_2 = \tau^{-1} S$), and the two extension spaces have dimension one over $\mathrm{End}(S)$. The last assertion follows from this, see [4]. □

We are now going to recall the concept of an almost-split sequence. For the ubiquity of almost-split sequences in the representation theory of finite dimensional algebras we refer to [1]. This concept turns out to be of central importance also here. A non-split exact sequence $\eta : 0 \to A \xrightarrow{u} B \xrightarrow{v} C \to 0$ is called almost-split in $\mathcal{H}$ if its end terms $A$ and $C$ are indecomposable and, moreover, for any indecomposable object $X$ from $\mathcal{H}$, each non-isomorphism $h : X \to C$ lifts to a morphism $\bar{h} : X \to B$ (with $h = v\bar{h}$). It is equivalent to request that each non-isomorphism $k : A \to Y$, $Y$ indecomposable, extends to $B$ (via $u$). In this situation the isomorphism class of $A$ only depends on the isomorphism class of $C$, and the assignment $C \mapsto A$ is called Auslander-Reiten translation.
Let $\eta$ be an almost-split sequence as above, and represent $B = B_1 \oplus \cdots \oplus B_s$ as a direct sum of indecomposables; then the components $u_i : A \to B_i$ (respectively $v_i : B_i \to C$) of $u$ (respectively $v$) are irreducible morphisms. Here, a morphism $h : X \to Y$ between indecomposable objects is called irreducible if it is not an isomorphism and, moreover, only admits trivial factorizations in the sense that any factorization $[X \xrightarrow{h} Y] = [X \xrightarrow{\alpha} Z \xrightarrow{\beta} Y]$ forces $\alpha$ to be a split monomorphism or $\beta$ to be a split epimorphism. As an immediate consequence, each irreducible morphism is either a monomorphism or an epimorphism.
(H 8) The category $\mathcal{H}$ has almost-split sequences, that is, for each indecomposable object $C$ there is an almost-split sequence $0 \to A \to B \to C \to 0$. Moreover, the self-equivalence $\tau$ from Serre duality serves as Auslander-Reiten translation.

PROOF. The proof involves the category $(\mathcal{H}^{op}, \mathrm{Ab})$ of contravariant additive abelian group-valued functors on $\mathcal{H}$. As an indecomposable object, $\tau C$ has a local endomorphism ring, isomorphic by Yoneda's lemma to the endomorphism ring of the representable functor $\mathrm{Hom}(-, \tau C)$. This implies that $\mathrm{Hom}(-, \tau C)$ has a simple quotient functor $F$ with the property $F(\tau C) \neq 0$. Invoking Serre duality, $\mathrm{D}F$ becomes a simple subfunctor of $\mathrm{Ext}^1(C, -)$, again with $\mathrm{D}F(\tau C) \neq 0$. It is straightforward to show that any non-zero $\eta \in \mathrm{D}F(\tau C) \subseteq \mathrm{Ext}^1(C, \tau C)$ represents an almost-split sequence

$$0 \to \tau C \to B \to C \to 0.$$ □
Next we are going to derive useful properties of the rank. By $\mathrm{K}_0(C)$ we denote the Grothendieck group of $\mathcal{H} = \mathrm{coh}(C)$ with respect to short exact sequences. For an object $X$ in $\mathcal{H}$ we denote by $[X]$ its corresponding class in $\mathrm{K}_0(C)$.

(H 9) The rank yields a linear form on $\mathrm{K}_0(C)$ with the following properties:
(a) For $X \in \mathcal{H}$ we have $\mathrm{rk}\,X = 0$ if and only if $X$ belongs to $\mathcal{H}_0$;
(b) For each non-zero $Y$ from $\mathcal{H}_+$ we have $\mathrm{rk}\,Y > 0$;
(c) For each $X \in \mathcal{H}$ we have $\mathrm{rk}\,\tau X = \mathrm{rk}\,X$;
(d) The structure sheaf has rank one.

PROOF. For the first assertion we use that the quotient functor $q$ is exact. Properties (a), (b) and (d) are obvious. Concerning (c) we note that, as an equivalence, $\tau$ sends $\mathcal{H}_0$ to $\mathcal{H}_0$, hence induces a self-equivalence of $\mathcal{H}/\mathcal{H}_0$, which then preserves the length. □
Indecomposables of rank one are called line bundles. In particular, the structure sheaf is a line bundle. Next we derive information on the Grothendieck group. Since the inclusion of $\mathcal{H}_0$ into $\mathcal{H}$ is an exact functor, it induces a homomorphism $\mathrm{K}_0(\mathcal{H}_0) \to \mathrm{K}_0(\mathcal{H})$ whose image, denoted $\mathrm{K}'_0(C)$, is generated by the classes of simple sheaves.

(H 10) We have $\mathrm{K}_0(C) = \mathbb{Z}[\mathcal{O}] \oplus \mathrm{K}'_0(C)$. Moreover, $\tau$ induces the identity map on $\mathrm{K}'_0(C)$.
PROOF. By induction on the rank $r$ of $X$ from $\mathcal{H}$ we are going to show that the class $[X]$ belongs to $\mathbb{Z}[\mathcal{O}] + \mathrm{K}'_0(C)$. This is obvious for $r = 0$, hence assume that $r > 0$ and $X$ belongs to $\mathcal{H}_+$. Our assumption shows that $\widetilde{X}$ has length $r$ in $\widetilde{\mathcal{H}}$, and hence contains the simple object $\widetilde{\mathcal{O}}$. By the definition of morphisms in $\mathcal{H}/\mathcal{H}_0$ this yields an inclusion $L \hookrightarrow X$ for a line bundle $L$ contained in $\mathcal{O}$. Since $\mathcal{O}/L$ belongs to $\mathcal{H}_0$ and $X/L$ has rank $r - 1$, induction yields that $[X]$ belongs to $\mathbb{Z}[\mathcal{O}] + \mathrm{K}'_0(C)$. We have thus shown that $\mathrm{K}_0(C) = \mathbb{Z}[\mathcal{O}] + \mathrm{K}'_0(C)$. That, moreover, this sum is direct follows from the fact that the rank function vanishes on $\mathrm{K}'_0(C)$ but takes value one on $[\mathcal{O}]$. This proves the first assertion. Concerning the second assertion we claim that $\tau S \cong S$ for each simple sheaf $S$. Indeed, since $\tau$ is an equivalence, $\tau S$ is again simple. Since by Serre duality $\mathrm{D}\,\mathrm{Ext}^1(S, \tau S) = \mathrm{End}(S)$ is non-zero, (H 4) implies that $\tau S \cong S$. □
Since $\mathcal{H}$ is hereditary, there is a (non-symmetric) $\mathbb{Z}$-bilinear form $\langle -, - \rangle$, called the Euler form, on $\mathrm{K}_0(C)$ given on classes of objects by the expression

$$\langle [X], [Y] \rangle = \dim_k \mathrm{Hom}(X, Y) - \dim_k \mathrm{Ext}^1(X, Y).$$

The Euler form encodes major homological properties of $\mathcal{H}$ in a handy form. We will use the Euler form, in particular, to define the degree of an element $x$ from $\mathrm{K}_0(C)$ by the formula

$$\deg x = \langle [\mathcal{O}], x \rangle - \langle [\mathcal{O}], [\mathcal{O}] \rangle\, \mathrm{rk}\,x.$$

For an object $X$ we write $\deg X = \deg [X]$.

(H 11) The linear form $\deg : \mathrm{K}_0(C) \to \mathbb{Z}$ has the following properties:
(a) $\deg \mathcal{O} = 0$.
(b) $\deg U \geq \mathrm{length}(U) > 0$ for each non-zero object $U$ of finite length.

PROOF. (a) is obvious. Concerning (b) consider a finite filtration $0 = U_0 \subset U_1 \subset \cdots \subset U_l = U$ with simple factors $S_i = U_i/U_{i-1}$. Since $\langle [\mathcal{O}], [U] \rangle = \sum_{i=1}^{l} \langle [\mathcal{O}], [S_i] \rangle$ and since further, due to Serre duality, $\langle [\mathcal{O}], [S_i] \rangle = \dim_k \mathrm{Hom}(\mathcal{O}, S_i)$, the assertion follows from (H 4). □
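We define the degree $d_x$ of a point $x$ as the degree of the simple sheaf $S_x$ concentrated in $x$. To derive the Riemann-Roch formula, expressing the Euler form in terms of rank and degree, we introduce the concept of the genus $g_C = \dim_k \mathrm{Ext}^1(\mathcal{O}, \mathcal{O})$ of $C$. Note that $\langle [\mathcal{O}], [\mathcal{O}] \rangle = 1 - g_C$.

(H 12) (Riemann-Roch) For $x$ and $y$ from $\mathrm{K}_0(C)$ we have

$$\langle x, y \rangle = (1 - g_C)\, \mathrm{rk}\,x\; \mathrm{rk}\,y + \begin{vmatrix} \mathrm{rk}\,x & \deg x \\ \mathrm{rk}\,y & \deg y \end{vmatrix}.$$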
PROOF. Note that both sides of the above formula are linear expressions in $x$ and $y$. In view of (H 10) it therefore suffices to show that equality holds for $x$ and $y$ being equal to $[\mathcal{O}]$ or the class of a simple sheaf. For $x = [\mathcal{O}]$ and $y$ arbitrary the formula amounts to the definition of the degree. For $y = [\mathcal{O}]$ and $x$ from $\mathrm{K}'_0(C)$, we invoke Serre duality and (H 10) to obtain $\langle x, [\mathcal{O}] \rangle = -\langle [\mathcal{O}], \tau x \rangle = -\langle [\mathcal{O}], x \rangle = -\deg x$, which proves the claim also in this case. Finally, if $x$ and $y$ are classes of simple sheaves, both sides of the formula evaluate to zero. □
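Since the Riemann-Roch formula (H 12) expresses the Euler form purely through rank and degree, it is easy to evaluate mechanically. The following is a minimal sketch, assuming that a class in $\mathrm{K}_0(C)$ is recorded only by its pair (rank, degree); the names `KClass` and `euler_form` are hypothetical and introduced here for illustration.

```python
from typing import NamedTuple


class KClass(NamedTuple):
    """Hypothetical record of a class in K_0(C): only rank and degree are kept."""
    rank: int
    degree: int


def euler_form(x: KClass, y: KClass, genus: int) -> int:
    """Evaluate <x, y> = (1 - g) rk x rk y + (rk x deg y - deg x rk y)."""
    determinant = x.rank * y.degree - x.degree * y.rank
    return (1 - genus) * x.rank * y.rank + determinant


# Sample classes (illustrative values only).
structure_sheaf = KClass(rank=1, degree=0)   # class of the structure sheaf, by (H 9)(d) and (H 11)(a)
simple_sheaf = KClass(rank=0, degree=2)      # class of a simple sheaf at a point of degree 2
line_bundle_a = KClass(rank=1, degree=2)     # a rank-one class of degree 2
line_bundle_b = KClass(rank=1, degree=5)     # a rank-one class of degree 5
```

With these sample values, `euler_form(structure_sheaf, structure_sheaf, g)` returns $1 - g$, the pairing of the structure sheaf with the simple sheaf returns its degree in one order and minus its degree in the other, and two classes of rank zero pair to zero, in line with the cases treated in the proof. For genus zero, two rank-one classes of degrees $a$ and $b$ pair to $b - a + 1$.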
We next discuss the relationship between the genus $g_C$ of $C$ and the degree of the canonical sheaf $\tau\mathcal{O}$.

(H 13) The degree of the canonical sheaf $\tau\mathcal{O}$ and the genus of $C$ are related by the formula $\deg \tau\mathcal{O} = 2(g_C - 1)$.

PROOF. By definition of the Euler form we get $\langle [\mathcal{O}], [\tau\mathcal{O}] \rangle = \dim_k \mathrm{Hom}(\mathcal{O}, \tau\mathcal{O}) - \dim_k \mathrm{Ext}^1(\mathcal{O}, \tau\mathcal{O})$. Invoking Serre duality (twice) and the definition of the genus, this yields $\langle [\mathcal{O}], [\tau\mathcal{O}] \rangle = g_C - 1$. On the other hand, the Riemann-Roch formula yields $\langle [\mathcal{O}], [\tau\mathcal{O}] \rangle = (1 - g_C) + \deg \tau\mathcal{O}$. □

(H 14) Each line bundle $L$ has $\mathrm{End}(L) = k$ and first extensions $\mathrm{Ext}^1(L, L)$ of dimension $g_C$.
PROOF. It would be possible to derive this from the transitivity of the action of the Picard group on $\mathcal{H}$. We prefer, however, to derive it directly from the properties (H 1) to (H 6). Let $L$ be a line bundle and $S_x$ be the simple sheaf concentrated in a point $x$. By Riemann-Roch we get $\langle [L], [S_x] \rangle = d_x$. Hence $\mathrm{Hom}(L, S_x)$ and $\mathrm{Ext}^1(S_x, L)$ have dimension one over $\mathrm{End}(S_x)$. Choose a non-split exact sequence $\eta_x : 0 \to L \xrightarrow{u} L(x) \xrightarrow{v} S_x \to 0$. We claim that $L(x)$ belongs to $\mathcal{H}_+$, and then trivially is a line bundle. Indeed, assume that a simple sheaf $S$ is contained in $L(x)$; then the restriction of $v$ to $S$ is either zero or an isomorphism. In the first case $S$ becomes a subobject of $L$, which is impossible; in the second case the sequence splits, yielding another contradiction.

Next, we show that for each endomorphism $f$ of $L$ there exist endomorphisms $f(x)$ of $L(x)$ and $f_x$ of $S_x$ such that the diagram

$$\begin{array}{rccccccccc}
\eta_x : & 0 & \to & L & \xrightarrow{u} & L(x) & \xrightarrow{v} & S_x & \to & 0 \\
 & & & \downarrow f & & \downarrow f(x) & & \downarrow f_x & & \\
\eta_x : & 0 & \to & L & \xrightarrow{u} & L(x) & \xrightarrow{v} & S_x & \to & 0
\end{array}$$

is commutative. Indeed, the push-out $f.\eta_x$ of $\eta_x$ along $f$ belongs to $\mathrm{Ext}^1(S_x, L)$, which has dimension one over $\mathrm{End}(S_x)$, and hence can be written as a pullback $\eta_x.f_x$ for some endomorphism $f_x$ of $S_x$. This yields a commutative diagram

$$\begin{array}{rccccccccc}
\eta_x : & 0 & \to & L & \xrightarrow{u} & L(x) & \xrightarrow{v} & S_x & \to & 0 \\
 & & & \downarrow f & & \downarrow & & \| & & \\
f.\eta_x = \eta_x.f_x : & 0 & \to & L & \to & \bullet & \to & S_x & \to & 0 \\
 & & & \| & & \downarrow & & \downarrow f_x & & \\
\eta_x : & 0 & \to & L & \xrightarrow{u} & L(x) & \xrightarrow{v} & S_x & \to & 0
\end{array}$$

proving the claim. Further, $f(x)$ is uniquely determined by $f$, since there is only the zero morphism from $S_x$ to $L(x)$. Clearly, the map $\varphi : \mathrm{End}(L) \to \mathrm{End}(L(x))$, $f \mapsto f(x)$, is a homomorphism which is injective. Conversely, given $\bar{f} : L(x) \to L(x)$, we note that $v\bar{f}$ as a member of $\mathrm{Hom}(L(x), S_x)$ has the form $v\bar{f} = f_x v$ for some $f_x$ from $\mathrm{End}(S_x)$. This induces a morphism $f : L \to L$ with $f(x) = \bar{f}$. We have shown that $\mathrm{End}(L) \cong \mathrm{End}(L(x))$ holds for each line bundle $L$.

Assume now that $L' \hookrightarrow L$ is an inclusion of line bundles. By induction on the length $l$ of $L/L'$ we are going to show that $\mathrm{End}(L) \cong \mathrm{End}(L')$. For $l = 0$ there is nothing to show, hence assume that $S_x$ is a simple sheaf contained in $L/L'$. This yields inclusions $L' \hookrightarrow L'' \hookrightarrow L$ and a non-split sequence $0 \to L' \to L'' \to S_x \to 0$, such that $L'' \cong L'(x)$, hence $\mathrm{End}(L') \cong \mathrm{End}(L'')$ by the preceding argument. Since $L/L''$ has length $< l$, induction shows that $\mathrm{End}(L) \cong \mathrm{End}(L'')$, proving the claim.

Writing $L(nx)$ for the $n$-fold repetition of extending with $S_x$, such that $[L(nx)] = [L] + n[S_x]$, we see that $\langle [\mathcal{O}], [L(nx)] \rangle > 0$ for large $n$, yielding an inclusion $\mathcal{O} \hookrightarrow L(nx)$, hence finally that the endomorphism rings of $\mathcal{O}$ and $L$ agree.

To calculate the dimension of $\mathrm{Ext}^1(L, L)$ we use that $\langle [L], [L] \rangle = 1 - g_C$ by Riemann-Roch. Since $\mathrm{End}(L) = k$, by the preceding argument, we get that $\mathrm{Ext}^1(L, L)$ has dimension $g_C$. □

Given a line bundle $L$ and a simple sheaf $S_x$, we may also form the kernel $L(-x)$ of any non-zero homomorphism $v : L \to S_x$, yielding an exact sequence $0 \to L(-x) \to L \xrightarrow{v} S_x \to 0$. Since $\mathrm{Ext}^1(S_x, L)$ has dimension one over $\mathrm{End}(S_x)$, the isomorphism class of $L(-x)$ does not depend on the chosen map.
An important role in the investigation of the category H is played by the concept of stability (and semi-stability), derived from properties of rank and degree, see for instance [18]. First of all, for any non-zero bundle we define its slope as the quotient [iE = deg F / r k E. Then a non-zero bundle E is called stable (respectively semistable) if for each proper subobject 0 C F C E we have /.iF < \iE (respectively //F < fiE). It is straightforward to verify that the semistable bundles of a fixed slope q, including the zero bundle, form an exact abelian subcategory H^ of H such that, within "HSq\ each object E has finite length (bounded by rk E). Moreover, the simple objects of H^ are exactly the stable bundles of slope q. As is easily seen, each line bundle is stable. Assume that / : E —»• F is a non-zero morphism between semistable (respectively stable) bundles, then iiE < jiF (respectively \xE < fiF). We will repeatedly use that H(TF)
= HF +
deg O
(1)
holds for each bundle F. To obtain this formula we use that by Serre duality {[O], [TE]) + {[E], [O]) = 0. Invoking (H 13) and Riemann-Roch we obtain deg (TO) rk E = deg TE — deg E, implying the claim. Let E be a non-zero vector bundle. Among the non-zero subobjects of E we consider those of maximal slope. Among those there is a unique one, denoted Ess, having additionally maximal rank. By construction fiF < fiEss holds for each non-zero subobject F of E. In particular Ess is semistable, and is called the maximal semistable subbundle of E. We next discuss topics related to tilting theory, compare [8] for an account of tilting in a module category mod(A) of finite dimensional modules over a finite dimensional algebra A. We say that an object T of 7i is a tilting sheaf if T has no self-extensions, that is, Ext 1 (T,T) = 0 and, additionally, T generates H in a homological sense, meaning that for any X from H the two conditions Eom(T,X) = 0 and Ext x (T,X) = 0 imply that X = 0. An object E of 7i is called exceptional if E has no self-extensions and if its endomorphism ring is a division ring. Conversely, by a result from [8], each indecomposable object E from a hereditary category with Ext 1 (E, E) = 0 is exceptional. In particular, each indecomposable direct summand of a tilting object in 7i is exceptional. For a hereditary category H there is a particularly simple construction of its bounded derived category Db(H), cf. [6]. For each integer n we form a copy H[n] of Ti with objects denoted X[n], where X is from ~H. The union
of these copies is turned into a category, defining morphisms by

    $\mathrm{Hom}_{D^b(\mathcal{H})}(X[m], Y[n]) = \mathrm{Ext}^{n-m}(X,Y),$
and composing morphisms by the Yoneda product. The derived category $D^b(\mathcal{H})$ is the closure of $\bigcup_{n \in \mathbb{Z}} \mathcal{H}[n]$ under finite direct sums. The derived category is equipped with a translation functor sending $X[k]$ to $X[k+1]$ ($X \in \mathcal{H}$, $k \in \mathbb{Z}$). The $n$-th iteration of this functor is denoted $[n]$. It is customary to identify $\mathcal{H}$ with the full subcategory $\mathcal{H}[0]$ of $D^b(\mathcal{H})$. An object $T$ of $D^b(\mathcal{H})$ is called a tilting complex if $\mathrm{Hom}(T,T[n]) = 0$ holds for each non-zero integer $n$ and, moreover, for any $X$ from $D^b(\mathcal{H})$ the condition $\mathrm{Hom}(T,X[n]) = 0$ for all $n \in \mathbb{Z}$ implies $X = 0$. Obviously, a tilting complex $T$ lying in $\mathcal{H}$ is just a tilting sheaf. Let $T$ be a tilting complex with endomorphism ring $A$. Then the (right derived functor of the) functor $\mathrm{Hom}(T,-)$ induces an equivalence between the derived categories $D^b(\mathcal{H})$ and $D^b(\mathrm{mod}(A))$, thus relating the study of $\mathcal{H}$ and the representation theory of $A$. Conversely, any equivalence $\varphi : D^b(\mathrm{mod}(A)) \to D^b(\mathcal{H})$ of triangulated categories yields a tilting complex $\varphi(A)$ in $D^b(\mathcal{H})$.
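Returning to the slope formula (1), the computation behind it can be written out in a few lines. The sketch below assumes that Riemann-Roch takes the form $\langle [E],[F]\rangle = (1-g_C)\,\mathrm{rk}\,E\,\mathrm{rk}\,F + \mathrm{rk}\,E\,\deg F - \deg E\,\mathrm{rk}\,F$ (which agrees with the values of the Euler form used later in the proof of Proposition 2.1), that $\tau$ preserves the rank, and that (H 13) provides $\deg \tau\mathcal{O} = 2g_C - 2$. These normalizations are assumptions, since they are not restated in this section.

```latex
\begin{align*}
\langle[\mathcal{O}],[\tau E]\rangle &= (1-g_C)\,\mathrm{rk}\,E + \deg \tau E,\\
\langle[E],[\mathcal{O}]\rangle      &= (1-g_C)\,\mathrm{rk}\,E - \deg E,\\
0 = \langle[\mathcal{O}],[\tau E]\rangle + \langle[E],[\mathcal{O}]\rangle
  &= 2(1-g_C)\,\mathrm{rk}\,E + \deg \tau E - \deg E,\\
\deg \tau E - \deg E &= (2g_C-2)\,\mathrm{rk}\,E = \deg(\tau\mathcal{O})\,\mathrm{rk}\,E,\\
\mu(\tau E) &= \mu E + \deg(\tau\mathcal{O}) \qquad\text{(divide by } \mathrm{rk}\,E = \mathrm{rk}\,\tau E\text{)}.
\end{align*}
```

In genus zero this gives $\mu(\tau E) = \mu E - 2$, which is the form in which (1) is used throughout the proof of Proposition 2.1.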
2 The characterization
Proposition 2.1 Let $C$ be a smooth projective curve over an arbitrary base field and let $\mathrm{coh}(C)$ be the category of coherent sheaves on $C$. Then each of the following equivalent conditions characterizes genus zero:
(1) There exists a tilting sheaf.
(2) There exists a tilting complex.
(3) There exists an exceptional sheaf.
(4) Each line bundle $L$ is exceptional.
(5) Let $0 \to \tau\mathcal{O} \to E_C \to \mathcal{O} \to 0$ be almost-split; then $T = E_C \oplus \mathcal{O}$ is tilting.
(6) The middle term of the almost-split sequence $0 \to \tau\mathcal{O} \to E_C \to \mathcal{O} \to 0$ has no self-extensions.
(7) The degree of the canonical sheaf $\tau\mathcal{O}$ is $< 0$.
(8) Each indecomposable bundle is stable.
(9) The endomorphism ring of an indecomposable bundle is a skew field.
(10) Each indecomposable bundle has rank one or two.
(11) There is a bound for the ranks of indecomposable bundles.
(12) Each indecomposable bundle is exceptional.
(13) An indecomposable sheaf is exceptional if and only if it has rank one or two.
Moreover, in this situation, there are two cases to consider.
(a) If $E_C$ is decomposable, then $C$ has a point $x$ of degree one, $E_C$ is isomorphic to $\mathcal{O}(-x) \oplus \mathcal{O}(-x)$, and $C$ is isomorphic to the projective line $\mathbb{P}_1(k)$.
(b) If $E_C$ is indecomposable, then $C$ has no points of degree one but there exists a point of degree two. If the characteristic of $k$ is different from two, $C$ is isomorphic to the plane projective curve given by an anisotropic quadratic form $q(x_1,x_2,x_3) = -a x_1^2 - b x_2^2 + ab\, x_3^2$.

We recall that a quadratic form $q : k^3 \to k$ is called anisotropic if $q(x)$ is non-zero for each non-zero $x$. In characteristic $\neq 2$, up to similarity of quadratic forms, such a form can be written $q(x_1,x_2,x_3) = -a x_1^2 - b x_2^2 + ab\, x_3^2$, where $a$ and $b$ are non-zero elements from $k$. Anisotropy of $q$ is equivalent to the fact that the associated algebra $D = \left(\frac{a,b}{k}\right)$ of generalized quaternions is a skew field. By definition $D$ has two $k$-algebra generators $i$ and $j$ such that $i^2 = a$, $j^2 = b$ and $ij = -ji$, see [12] for further information on this topic.

Proof of Proposition 2.1. By (H 14) property (4) expresses that $C$ has genus zero.

(7) $\Rightarrow$ (8): Let $F$ be an indecomposable bundle and $F^{ss}$ its maximal semistable subbundle. If $F^{ss}$ is properly contained in $F$, then the sequence $0 \to F^{ss} \to F \to F/F^{ss} \to 0$ is non-split, and hence in view of Serre duality induces a non-zero morphism $\tau^{-1}F^{ss} \to F/F^{ss}$. By formula (1) we have $\mu(\tau^{-1}F^{ss}) = \mu F^{ss} - \deg(\tau\mathcal{O}) > \mu F^{ss}$. It follows that $F/F^{ss}$ has a non-zero subobject $F'/F^{ss}$ of slope $\mu(F'/F^{ss}) > \mu F^{ss}$, hence satisfying $\mu F' > \mu F^{ss}$, and contradicting the choice of $F^{ss}$. We thus have shown that $F$ is semistable.
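Let $q = \mu F$; we choose a simple, hence stable, subobject $S$ of $F$ in $\mathcal{H}^{(q)}$. If $F/S \neq 0$, then the sequence $0 \to S \to F \to F/S \to 0$ does not split, and Serre duality yields a non-zero morphism $\tau^{-1}S \to F/S$, implying $q - \deg(\tau\mathcal{O}) = \mu(\tau^{-1}S) \le \mu(F/S) = q$, a contradiction. Hence $F = S$ is stable.

Next we are going to show the implications

    (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (4) $\Rightarrow$ (5) $\Rightarrow$ (1),    (5) $\Rightarrow$ (6) $\Rightarrow$ (3),

implying that assertions (1) to (6) are equivalent.

(1) $\Rightarrow$ (2): Obvious.

(2) $\Rightarrow$ (3): Let $T$ be a tilting complex. Each indecomposable direct summand of $T$ then yields an exceptional sheaf.

(3) $\Rightarrow$ (4): Let $E$ be an exceptional sheaf. Then $0 < \langle [E],[E]\rangle = (1 - g_C)(\mathrm{rk}\, E)^2$ by Riemann-Roch's formula. It follows that $g_C = 0$.

(4) $\Rightarrow$ (5): We thus assume that $C$ has genus zero. We claim that $T = \mathcal{O} \oplus E_C$ is a tilting bundle. Applying $\mathrm{Hom}(\mathcal{O},-)$ to the almost-split sequence $\mu : 0 \to \tau\mathcal{O} \to E_C \to \mathcal{O} \to 0$, we obtain exactness of the sequence

    $0 = \mathrm{Hom}(\mathcal{O},\tau\mathcal{O}) \to \mathrm{Hom}(\mathcal{O},E_C) \to \mathrm{Hom}(\mathcal{O},\mathcal{O}) \xrightarrow{\delta} \mathrm{Ext}^1(\mathcal{O},\tau\mathcal{O}) \to \mathrm{Ext}^1(\mathcal{O},E_C) \to \mathrm{Ext}^1(\mathcal{O},\mathcal{O}) = 0.$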
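Since the almost-split sequence $\mu$ is not split, $\delta$ is a non-zero morphism between one-dimensional $k$-spaces. It follows that $\mathrm{Hom}(\mathcal{O},E_C) = 0 = \mathrm{Ext}^1(\mathcal{O},E_C)$. Next we apply $\mathrm{Hom}(-,\mathcal{O})$ to $\mu$. This yields exactness of $0 = \mathrm{Ext}^1(\mathcal{O},\mathcal{O}) \to \mathrm{Ext}^1(E_C,\mathcal{O}) \to \mathrm{Ext}^1(\tau\mathcal{O},\mathcal{O}) \to 0$. In view of genus zero we have $\deg \tau^2\mathcal{O} < 0$; the cokernel term $\mathrm{Ext}^1(\tau\mathcal{O},\mathcal{O}) = D\,\mathrm{Hom}(\mathcal{O},\tau^2\mathcal{O})$ hence vanishes. Therefore also $\mathrm{Ext}^1(E_C,\mathcal{O}) = 0$. Finally we apply $\mathrm{Hom}(-,E_C)$ to $\mu$. We thus obtain exactness of $0 = \mathrm{Ext}^1(\mathcal{O},E_C) \to \mathrm{Ext}^1(E_C,E_C) \to \mathrm{Ext}^1(\tau\mathcal{O},E_C) \to 0$. Each indecomposable summand of $E_C$ has slope $-1$; therefore, invoking stability, $\mathrm{Ext}^1(\tau\mathcal{O},E_C) = D\,\mathrm{Hom}(E_C,\tau^2\mathcal{O}) = 0$. This implies $\mathrm{Ext}^1(E_C,E_C) = 0$, hence the first property of a tilting sheaf. It remains to show that for any sheaf $X$ the conditions $\mathrm{Hom}(T,X) = 0 = \mathrm{Ext}^1(T,X)$ imply that $X$ is the zero object. Note that the condition implies $\langle [\mathcal{O}],[X]\rangle = 0 = \langle [E_C],[X]\rangle$, hence also $\langle [\tau\mathcal{O}],[X]\rangle = 0$. Since $\deg \tau\mathcal{O} = -2$ we obtain

    $0 = \langle [\mathcal{O}],[X]\rangle = \mathrm{rk}\, X + \deg X,$
    $0 = \langle [\tau\mathcal{O}],[X]\rangle = 3\,\mathrm{rk}\, X + \deg X.$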
This implies that $X$ has rank and degree zero, hence itself is zero.

(5) $\Rightarrow$ (1), (6): Obvious.

(6) $\Rightarrow$ (3): Since $E_C$ has no self-extensions, each indecomposable summand of $E_C$ is exceptional.

Next, we deal with the implications

    (4) $\Rightarrow$ (7) $\Rightarrow$ (8),    (8) $\Rightarrow$ (9) $\Rightarrow$ (4),    (8) $\Rightarrow$ (10) $\Rightarrow$ (11) $\Rightarrow$ (7),    (8) $\Rightarrow$ (12) $\Rightarrow$ (4),

then implying that assertions (1) to (12) are equivalent. Note that (7) $\Rightarrow$ (8) has been shown already.

(4) $\Rightarrow$ (7): Obvious because of (H 13).

(8) $\Rightarrow$ (9): Obvious.

(9) $\Rightarrow$ (4): We assume that $g_C \ge 1$. We claim first that $E_C$ is indecomposable. Otherwise $E_C$ is the direct sum of two line bundles $L_1$ and $L_2$. Since there are irreducible maps $\tau\mathcal{O} \to L_1$ and $L_1 \to \mathcal{O}$, we conclude $\deg \tau\mathcal{O} < \deg L_1 < \deg \mathcal{O} = 0$, a contradiction. Therefore $E_C$ is indecomposable. Since $\mathrm{Hom}(\mathcal{O},\tau\mathcal{O})$ has dimension $g_C$, there is a non-zero morphism $h : \mathcal{O} \to \tau\mathcal{O}$. Writing $u : \tau\mathcal{O} \to E_C$ and $v : E_C \to \mathcal{O}$ for the maps of the almost-split sequence, it follows that $uhv$ is a non-zero endomorphism of $E_C$ which is not an isomorphism, a contradiction.

(8) $\Rightarrow$ (10): Let $F$ be an indecomposable direct summand of $E_C$. Then there are irreducible maps $\tau\mathcal{O} \to F$ and $F \to \mathcal{O}$. Stability then implies that $\deg \tau\mathcal{O} < \mu F < \deg \mathcal{O} = 0$. By the implications already proved $C$ has genus zero, and therefore $T = \mathcal{O} \oplus E_C$ is tilting. Let $G$ be an indecomposable bundle. Since $\mu(\tau G) = \mu(G) + \deg \tau\mathcal{O}$, application of a suitable power of $\tau$ shows that without changing the rank we may assume that $\deg \tau\mathcal{O} < \mu G \le 0$. Stability implies that any two indecomposable bundles $E_1$ and $E_2$ with slopes in the range $\mu(\tau\mathcal{O}) < \mu E_1, \mu E_2 \le 0$ satisfy $\mathrm{Ext}^1(E_1,E_2) = D\,\mathrm{Hom}(E_2,\tau E_1) = 0$. In particular $T \oplus G$ has no self-extensions. By a standard argument from tilting theory, repeated below, this implies that $G$ lies in $\mathrm{add}(T)$, the closure of $T$ under taking direct sums and direct summands. Hence in view of indecomposability $G$ is isomorphic to $\mathcal{O}$ or to a direct summand of $E_C$, which then implies the claim.
In fact, let $h_1, \dots, h_n$ be a basis of $\mathrm{Hom}(T,G)$, and consider the morphism $h = (h_1, \dots, h_n) : T^n \to G$. By construction the map $\mathrm{Hom}(T,h) : \mathrm{Hom}(T,T^n) \to \mathrm{Hom}(T,G)$ induced by $h$ is an epimorphism. It follows that the cokernel $C$ of $h$ satisfies $\mathrm{Hom}(T,C) = 0 = \mathrm{Ext}^1(T,C)$, and hence $C = 0$ because $T$ is tilting. We then obtain a short exact sequence $\eta_1 : 0 \to N \to T^n \xrightarrow{h} G \to 0$. Since $\mathrm{Hom}(T,h)$ is an epimorphism, it follows that $\mathrm{Ext}^1(T,N) = 0$, which by the preceding argument shows the existence of another exact sequence $\eta_2 : 0 \to K \to T^m \to N \to 0$ with $\mathrm{Ext}^1(T,K) = 0$. Since $N$ embeds into $T^n$, we obtain an epimorphism $0 = \mathrm{Ext}^1(T^n,K) \to \mathrm{Ext}^1(N,K)$. Thus $\mathrm{Ext}^1(N,K) = 0$, and the sequence $\eta_2$ splits, which shows $N \in \mathrm{add}(T)$. Now also $\eta_1$ splits, since $\mathrm{Ext}^1(G,T) = 0$, and thus $G$ lies in $\mathrm{add}(T)$.

(10) $\Rightarrow$ (11): Obvious.

(11) $\Rightarrow$ (7): Let $E$ be an indecomposable bundle of maximal rank, and consider the almost-split sequence $0 \to \tau E \xrightarrow{(\varphi_i)} \bigoplus_{i=1}^{n} E_i \xrightarrow{(\psi_i)} E \to 0$, with the $E_i$ indecomposable, hence by assumption of rank less than or equal to $\mathrm{rk}\, E$. As an irreducible morphism, each $\psi_i : E_i \to E$ is a proper monomorphism or epimorphism. Because $\mathrm{rk}\, E_i \le \mathrm{rk}\, E$ the case of an epimorphism is excluded, hence each $\psi_i$ is a monomorphism. Let $E^{ss}$ denote the maximal semistable subbundle of $E$. For some index $i$ the image $\varphi_i(\tau E^{ss})$ is non-zero, therefore restriction of $\psi_i\varphi_i$ yields a non-zero morphism $\tau E^{ss} \to E$. This in turn shows $\mu E^{ss} + \deg \tau\mathcal{O} = \mu(\tau E^{ss}) \le \mu E^{ss}$, therefore $\deg \tau\mathcal{O} \le 0$. The conclusion will follow once we show that the property $\deg \tau\mathcal{O} = 0$ implies the existence of indecomposable bundles of arbitrarily large rank. For the rest of the argument we switch to the exact abelian subcategory $\mathcal{H}^{(0)}$ of semistable bundles of slope zero, where each object has finite length. Start with a line bundle $E_1$. Since $D\,\mathrm{Ext}^1(E_1,\tau E_1) = \mathrm{Hom}(E_1,E_1) \neq 0$ we obtain a non-split exact sequence

    $\eta_1 : 0 \to \tau E_1 \to E_2 \xrightarrow{u_1} E_1 \to 0.$
Since $D\,\mathrm{Ext}^1(E_1,\tau E_2) = \mathrm{Hom}(E_2,E_1) \neq 0$ we obtain a non-split exact sequence $\eta_2 : 0 \to \tau E_2 \to E_3 \xrightarrow{u_2} E_1 \to 0$. Continuing we obtain such a non-split sequence

    $\eta_{n-1} : 0 \to \tau E_{n-1} \to E_n \xrightarrow{u_{n-1}} E_1 \to 0$
for any $n > 1$. We are going to show that $\tau^{n-1}E_1$ is the unique simple subobject of $E_n$ in $\mathcal{H}^{(0)}$, which automatically implies indecomposability. Indeed, let
$U$ be a simple subobject of $E_n$. In view of Schur's lemma the restriction of $u_{n-1}$ to $U$ is either zero or an isomorphism. It cannot be an isomorphism because otherwise the sequence $0 \to \tau E_{n-1} \to E_n \to E_1 \to 0$ would split. We conclude that $U$ belongs to the kernel $\tau E_{n-1}$ of $u_{n-1}$, hence equals the unique simple subobject $\tau(\tau^{n-2}E_1)$ of $\tau E_{n-1}$.

(8) $\Rightarrow$ (12): If $E_C$ is indecomposable, then stability implies $\deg \tau\mathcal{O} < \mu E_C < \mu\mathcal{O} = 0$. Otherwise $E_C$ is the direct sum of two line bundles $L_1$ and $L_2$. Since there are irreducible maps $\tau\mathcal{O} \to L_1$ and $L_1 \to \mathcal{O}$, we obtain $\deg \tau\mathcal{O} < \deg \mathcal{O} = 0$ also in this case. Assume that $E$ is an indecomposable bundle, thus by assumption stable, hence $\mathrm{End}(E)$ is a skew field. Since $\mu(\tau E) = \mu E + \deg \tau\mathcal{O} < \mu E$, stability of $E$ implies $\mathrm{Hom}(E,\tau E) = 0$, hence $\mathrm{Ext}^1(E,E) = 0$.

(12) $\Rightarrow$ (4): Obvious.

We have shown so far that (1) to (12) are equivalent, and finally are going to show the implications (5) $\Rightarrow$ (13) $\Rightarrow$ (3). Since (13) $\Rightarrow$ (3) is obvious, we only need to show (5) $\Rightarrow$ (13): Let $X$ be an exceptional sheaf. Because of Riemann-Roch we see that $0 < \langle [X],[X]\rangle = (\mathrm{rk}\, X)^2$, hence $X$ is a bundle. By means of the equivalences already shown, we know that all indecomposable bundles are exceptional and stable and have rank at most two.

Concerning the last assertion, we need some preparation. Consider an algebra $R$ which is either the polynomial algebra $k[x,y]$ or else the quotient $k[x,y,z]/(q)$ of the polynomial algebra by an anisotropic quadratic form $q = -ax^2 - by^2 + abz^2$. In both cases $R$, equipped with the $\mathbb{Z}$-grading induced by total degree, is a positively $\mathbb{Z}$-graded algebra that is graded factorial, see [11] for further information. Graded factoriality means that each non-zero homogeneous element has a factorization into homogeneous prime elements. Here, a homogeneous element $0 \neq p \in R$ is called prime if the factor ring $R/(p)$ is a graded integral domain, meaning that the product of any two non-zero homogeneous elements is again non-zero.
It is not difficult to show that the quotient category $\mathrm{mod}^{\mathbb{Z}}(R)/\mathrm{mod}_0^{\mathbb{Z}}(R)$ is isomorphic to the category $\mathrm{coh}(X)$ of coherent sheaves over a smooth projective curve $X$. If $R = k[x,y]$ the curve $X$ is the projective line $\mathbb{P}_1(k)$ over $k$ with function field $k(x)$, while for $R = k[x,y,z]/(q)$ the curve $X$ is the plane projective curve given by $q(x,y,z) = 0$, whose function field is the quotient field of $k[x,y]/(q(x,y,1))$. We invoke that a smooth projective curve is determined by its function field up to isomorphism. In case (a), the cokernel of an irreducible morphism $\mathcal{O}(-x) \to \mathcal{O}$ is a (necessarily simple) sheaf of degree one. It is known [10, section 5.7] that a function field of genus zero in this case is isomorphic to the rational function
field. We conclude that $C$ is isomorphic to the projective line $\mathbb{P}_1(k)$ in this case, with $\mathrm{coh}(C)$ isomorphic to $\mathrm{mod}^{\mathbb{Z}}(k[x,y])/\mathrm{mod}_0^{\mathbb{Z}}(k[x,y])$. In case (b) $E_C$ is indecomposable and stable of slope $-1$. Further there are no simple sheaves of degree one. Otherwise there is a point $x$ such that $\mathcal{O}(-x)$ has slope $-1$. By the definition of almost-split sequences, the inclusion $\mathcal{O}(-x) \to \mathcal{O}$ lifts to a non-zero morphism $\mathcal{O}(-x) \to E_C$. As simple objects in $\mathcal{H}^{(-1)}$, then $\mathcal{O}(-x)$ and $E_C$ must be isomorphic, a contradiction. Since $\langle [\tau\mathcal{O}],[\mathcal{O}]\rangle = (1 - g) + 2 = 3$ we get an exact sequence $0 \to \tau\mathcal{O} \to \mathcal{O} \to S \to 0$ where $S$ has degree two, and hence is simple. Assume that $k$ does not have characteristic two. Since $g_C = 0$ it then follows [10, section 5.7] that $k(C)$ is isomorphic to the quotient field of $k[x,y]/(q(x,y,1))$ for some anisotropic form $q(x,y,z) = -ax^2 - by^2 + abz^2$, as above. It follows that $C$ is isomorphic to the plane projective curve with equation $q$. $\square$

Addendum 1
(a) If $C$ is isomorphic to $\mathbb{P}_1(k)$ then $\mathrm{coh}(C)$ has a tilting object whose endomorphism ring is isomorphic to the Kronecker algebra $\begin{pmatrix} k & k \oplus k \\ 0 & k \end{pmatrix}$.
(b) Assuming characteristic different from two, let $C$ be the plane projective curve given by an anisotropic quadratic form $q(x_1,x_2,x_3) = -a x_1^2 - b x_2^2 + ab\, x_3^2$. Then $\mathrm{coh}(C)$ has a tilting object whose endomorphism ring is isomorphic to $\begin{pmatrix} k & D \\ 0 & D \end{pmatrix}$, where $D = \left(\frac{a,b}{k}\right)$ is the skew field of generalized quaternions over $k$ attached to $a, b$.

PROOF. In case (a) $T = \mathcal{O}(-x) \oplus \mathcal{O}$ is tilting with endomorphism ring isomorphic to the Kronecker algebra. In case (b) we take, in the notation of the proposition, $T = \mathcal{O} \oplus E_C$ as a tilting object. It is shown in [11] that $D = \mathrm{End}(E_C)$, and the claim follows. $\square$
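The size of the off-diagonal entries in Addendum 1 can be cross-checked by a dimension count. The sketch below assumes the Riemann-Roch formula in the form $\langle [E],[F]\rangle = (1-g_C)\,\mathrm{rk}\,E\,\mathrm{rk}\,F + \mathrm{rk}\,E\,\deg F - \deg E\,\mathrm{rk}\,F$, genus zero, a point $x$ of degree one in case (a), and the vanishing of the relevant $\mathrm{Ext}^1$-terms (so that the Euler form equals $\dim_k \mathrm{Hom}$). It also uses $\mathrm{rk}\,E_C = 2$ and $\deg E_C = -2$, which follow from the almost-split sequence and $\deg\tau\mathcal{O} = -2$.

```latex
% Case (a): T = O(-x) \oplus O with deg O(-x) = -1.
\begin{align*}
\dim_k \mathrm{Hom}(\mathcal{O}(-x),\mathcal{O})
   &= \langle[\mathcal{O}(-x)],[\mathcal{O}]\rangle = 1 + 0 - (-1) = 2,\\
\dim_k \mathrm{Hom}(\mathcal{O},\mathcal{O}(-x)) &= 0
   \qquad\text{(a line bundle of negative degree has no non-zero sections)}.
\end{align*}
% Case (b): T = O \oplus E_C with rk E_C = 2, deg E_C = -2.
\begin{align*}
\dim_k \mathrm{Hom}(E_C,\mathcal{O})
   &= \langle[E_C],[\mathcal{O}]\rangle = 2 + 0 - (-2) = 4 = \dim_k D,\\
\dim_k \mathrm{Hom}(\mathcal{O},E_C) &= 0 .
\end{align*}
```

In case (a) the two-dimensional space of maps matches the two arrows of the Kronecker algebra; in case (b) the four-dimensional space is one-dimensional over the quaternion skew field $D$.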
Addendum 2 If $k$ is an infinite field, then the property
(14) There is a point $x$ such that $\mathrm{Hom}(\mathcal{O},\mathcal{O}(x))$ has $k$-dimension $1 + d_x$
also characterizes genus zero.
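PROOF. Assume that $C$ has genus zero, and $x$ is any point. Applying $\mathrm{Hom}(\mathcal{O},-)$ to the sequence $0 \to \mathcal{O} \to \mathcal{O}(x) \to S_x \to 0$ we obtain exactness of $0 \to k \to \mathrm{Hom}(\mathcal{O},\mathcal{O}(x)) \to \mathrm{Hom}(\mathcal{O},S_x) \to \mathrm{Ext}^1(\mathcal{O},\mathcal{O}) = 0$, hence $\dim_k \mathrm{Hom}(\mathcal{O},\mathcal{O}(x)) = 1 + d_x$. Assume conversely that for some $x$ we have $\dim_k \mathrm{Hom}(\mathcal{O},\mathcal{O}(x)) = 1 + d_x$. Each non-zero $u \in \mathrm{Hom}(\mathcal{O},\mathcal{O}(x))$ yields an exact sequence

    $u : 0 \to \mathcal{O} \xrightarrow{u} \mathcal{O}(x) \to C_u \to 0,$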
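where the cokernel term $C_u$ has finite length bounded by $d_x$. We show, by induction on the length $l$ of $C_u$, that the isomorphism class of $C_u$ determines $u$ up to a non-zero scalar from $k$. If $l = 1$ then $S = C_u$ is simple. Since $\mathrm{Ext}^1(S,\mathcal{O})$ has dimension one over the division ring $\mathrm{End}(S)$, any extension

    $\mu : 0 \to \mathcal{O} \to \mathcal{O}(x) \to S \to 0$

is equivalent to the push-out of $u$ along some isomorphism of $S$, yielding a commutative diagram

    $\begin{array}{rccccccccc}
    u : & 0 & \to & \mathcal{O} & \xrightarrow{\;u\;} & \mathcal{O}(x) & \to & S & \to & 0 \\
        &   &     & \|          &                     & \uparrow       &     & \uparrow & & \\
    \mu : & 0 & \to & \mathcal{O} & \to               & \mathcal{O}(x) & \to & S & \to & 0
    \end{array}$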
Since $\mathrm{End}(\mathcal{O}(x)) = k$ this proves the claim for $l = 1$. We now assume $l > 1$ and fix a simple subobject $S_y$ of $C_u$, yielding short exact sequences $0 \to \mathcal{O} \xrightarrow{\alpha} \mathcal{O}(y) \to S_y \to 0$ and $0 \to \mathcal{O}(y) \xrightarrow{\beta} \mathcal{O}(x) \to C_u/S_y \to 0$. By induction, both $\alpha$ and $\beta$ are determined (by the respective cokernel) up to a non-zero scalar, hence the same assertion holds for $u = \beta\alpha$. We thus have shown that different classes $k^*u$ and $k^*v$ from the projective $k$-space associated with $\mathrm{Hom}(\mathcal{O},\mathcal{O}(x))$ yield non-isomorphic cokernels $C_u$ and $C_v$. Since by assumption $k$ is an infinite field, we thus obtain an infinite number of pairwise non-isomorphic $C_u$'s. Recall that each $C_u$ has length less than or equal to $d_x$. Uniseriality of indecomposables from $\mathcal{H}_0$ now implies that there is an infinite number of pairwise non-isomorphic simple objects, each contained in some $C_u$.

Assume now that $g_C > 0$, therefore $\mathrm{Hom}(\mathcal{O},\tau\mathcal{O}) \neq 0$, yielding an inclusion $j : \mathcal{O} \to \tau\mathcal{O}$, which we fix for the rest of the proof. Applying $\mathrm{Hom}(\mathcal{O},-)$ to $u$ yields an exact sequence $0 \to \mathrm{Hom}(\mathcal{O},\mathcal{O}) \to \mathrm{Hom}(\mathcal{O},\mathcal{O}(x)) \to \mathrm{Hom}(\mathcal{O},C_u) \xrightarrow{\delta} D\,\mathrm{Hom}(\mathcal{O},\tau\mathcal{O}) \to D\,\mathrm{Hom}(\mathcal{O}(x),\tau\mathcal{O}) \to 0$. By the assumption on the dimension of $\mathrm{Hom}(\mathcal{O},\mathcal{O}(x))$ we get $\delta = 0$, hence the restriction map

    $\mathrm{Hom}(\mathcal{O}(x),\tau\mathcal{O}) \to \mathrm{Hom}(\mathcal{O},\tau\mathcal{O}), \quad h \mapsto hu$

is bijective. It follows that the inclusion $j : \mathcal{O} \to \tau\mathcal{O}$ has a factorization $\mathcal{O} \xrightarrow{u} \mathcal{O}(x) \xrightarrow{h} \tau\mathcal{O}$, with inclusions $u$ and $h$, implying that $\mathrm{coker}(u) = C_u$ embeds into $\tau\mathcal{O}/\mathcal{O}$. By the preceding argument therefore $\tau\mathcal{O}/\mathcal{O}$ has an infinite socle, contradicting noetherianness. Hence $g_C = 0$ as claimed. $\square$
3 Algebraically closed base field
We keep assumptions and notation from the previous section, but assume additionally that the base field $k$ is algebraically closed, yielding a number of additional characterizations, typically connected with assertions on the Grothendieck group.

Proposition 3.1 Assuming the base field $k$ algebraically closed, each of the following assumptions also characterizes genus zero.
(15) There is a $\mathbb{Z}$-graded factorial domain $R$, affine of Krull dimension two, such that $\mathrm{coh}(C)$ is equivalent to the quotient category $\mathrm{mod}^{\mathbb{Z}}(R)/\mathrm{mod}_0^{\mathbb{Z}}(R)$.
(16) The Euler form on $K_0(C)$ is non-degenerate.
(17) Each line bundle is determined by its degree.
(18) For any two points $x$ and $y$ the classes $[S_x]$ and $[S_y]$ agree in $K_0(C)$.
(19) There are two distinct points $x$ and $y$ with $[S_x] = [S_y]$ in $K_0(C)$.
(20) The Grothendieck group $K_0(C)$ is free.
(21) The Grothendieck group $K_0(C)$ is finitely generated.
Moreover, if any of these conditions is satisfied, $C$ is isomorphic to the projective line $\mathbb{P}_1(k)$ over $k$.

PROOF. The arrangement of the proof is illustrated by the scheme:

    ($g_C = 0$) $\Rightarrow$ (15) $\Rightarrow$ (17),    ($g_C = 0$) $\Rightarrow$ (16) $\Rightarrow$ (17),    (17) $\Rightarrow$ (18),
    (18) $\Rightarrow$ (19) $\Rightarrow$ (14) $\Rightarrow$ ($g_C = 0$),    (18) $\Rightarrow$ (20), (21) $\Rightarrow$ ($g_C = 0$).
($g_C = 0$) $\Rightarrow$ (15): Since $k$ is algebraically closed, each simple sheaf has endomorphism ring $k$, hence in view of (H 4) has degree one. Proposition 2.1 now shows that $C$ is isomorphic to the projective line, and the claim is satisfied with $R = k[x,y]$, graded by total degree.
($g_C = 0$) $\Rightarrow$ (16): Conceptually this follows from the existence of a tilting object, compare [8]. A direct argument, however, is very easy, calculating the values of the Euler form on a basis $[\mathcal{O}], [S_x]$ of $K_0(C)$.

(15) $\Rightarrow$ (17): By graded factoriality the Picard group $\mathrm{Pic}(C)$ is isomorphic to $\mathbb{Z}$. Moreover, the degree homomorphism $\deg : \mathrm{Pic}(C) \to \mathbb{Z}$, $L \mapsto \deg L$, is non-zero, hence injective. This proves (17).

(16) $\Rightarrow$ (17): Because of (16) and Riemann-Roch, each element in $K_0(C)$ is determined by its rank and degree. In particular, for each line bundle $L$ its degree determines its class in $K_0(C)$, and hence determines the isomorphism class of $L$ via the determinant homomorphism $\det : K_0(C) \to \mathrm{Pic}(C)$, which sends the class $[E]$ of a bundle $E$ of rank $r$ to the isomorphism class of its $r$-th exterior power $\bigwedge^r E$.

(17) $\Rightarrow$ (18): Since all points have degree one, the line bundles $\mathcal{O}(x)$ and $\mathcal{O}(y)$ have the same degree, hence are isomorphic, and $[\mathcal{O}] + [S_x] = [\mathcal{O}(x)] = [\mathcal{O}(y)] = [\mathcal{O}] + [S_y]$ follows.

(18) $\Rightarrow$ (19): Obvious.

(18) $\Rightarrow$ (20), (21): Invoking (H 10) the assumption implies that $K_0(C) = \mathbb{Z}[L] + \mathbb{Z}[S_x]$, where $S_x$ is any simple sheaf. Using the properties of rank and degree it follows that $K_0(C)$ is free abelian of rank two.

(19) $\Rightarrow$ (14): Since $S_x$ and $S_y$ have the same class it follows that $\mathcal{O}(x)$ and $\mathcal{O}(y)$ have the same class in $K_0(C)$, and hence are isomorphic. This yields short exact sequences $0 \to \mathcal{O} \xrightarrow{u} \mathcal{O}(x) \to S_x \to 0$ and $0 \to \mathcal{O} \xrightarrow{v} \mathcal{O}(x) \to S_y \to 0$. For $x \neq y$ the morphisms $u$ and $v$ are linearly independent over $k$, since their cokernels are non-isomorphic. Since $d_x = 1$, this implies that condition (14) is satisfied.

Any of (20) or (21) implies that $g_C = 0$: We use that $K_0(C)$ is the direct sum of $\mathbb{Z}^2$ and $\mathrm{Pic}_0(C)$, the subgroup of the Picard group consisting of (isomorphism classes of) line bundles of degree zero, see [9]. For $k$ algebraically closed it is known that $\mathrm{Pic}_0(C)$ is isomorphic to the Jacobian variety of $C$. Assume $g \ge 1$.
For $k = \mathbb{C}$, the field of complex numbers, it is classical that the Jacobian variety is isomorphic to $(\mathbb{C}/(\mathbb{Z} \times \mathbb{Z}))^g$ [9], but also for an arbitrary algebraically closed field the Jacobian variety is a (nontrivial) divisible abelian group, see [15], which makes it impossible that $K_0(C)$ is either free or finitely generated. $\square$
4 Comments
Assume that $k$ is algebraically closed. Then the hereditary noetherian $k$-categories with Serre duality, thus satisfying (H 1) to (H 3), which are additionally connected and have a tilting object (respectively a tilting complex) are exactly the categories of coherent sheaves $\mathrm{coh}(C)$ over a weighted projective line in the sense of [5], see [13]. These categories also satisfy properties (H 5) and (H 6). By contrast, property (H 4) in general is violated for a weighted projective line and, in fact, among the weighted projective lines, it characterizes $\mathbb{P}_1(k)$, the single curve of genus zero for $k$ algebraically closed. If $k$ is arbitrary, the concept of a weighted projective line has to be replaced by the concept of an exceptional curve, as defined in [14], since these are exactly the hereditary, noetherian, connected $k$-categories with a tilting complex [14]. The function field $k(C)$ of an exceptional curve $C$ is a finite central skew field extension of an algebraic function field of one variable. Assuming (H 4) and commutativity of $k(C)$ singles out the curves of genus zero studied in this paper. For the category of coherent sheaves over a weighted projective line and, more generally, over an exceptional curve, most properties listed in Proposition 2.1 are again satisfied. The exceptions are (H 5), (H 10) and (H 13). The maximal bound for the rank of an indecomposable bundle will be 6, a number that is actually reached for the weighted projective line of weight type (2,3,5). Concerning (H 5) we note that the number of non-isomorphic indecomposable summands of a tilting bundle can get arbitrarily large. Returning to the setting of the paper, Proposition 2.1 allows us to attach to each smooth projective curve $C$ of genus zero a finite dimensional algebra $A(C) = \mathrm{End}(\mathcal{O} \oplus E_C)$ such that non-isomorphic curves $C$ and $C'$ yield algebras $A(C)$ and $A(C')$ with non-equivalent module categories.
This follows from the fact that, for $g_C \neq 1$, the category $\mathrm{coh}(C)$ can be recovered from its derived category $D^b(\mathrm{coh}(C))$. The algebra $A(C)$ is a tame hereditary algebra, Morita-equivalent to an algebra of bimodule type $\begin{pmatrix} D & M \\ 0 & E \end{pmatrix}$, where $D$ and $E$ are finite skew field extensions of $k$ and ${}_D M_E$ is a $(D,E)$-bimodule such that $\dim_D M \cdot \dim_E M = 4$ and, moreover, $k$ acts centrally on $D$, $E$ and $M$. We refer to [2] for further information on the representation theory of tame hereditary algebras. If $k$ has characteristic different from two, Proposition 2.1 shows that for tame hereditary algebras of shape $A(C)$ exactly the bimodules ${}_k(k \oplus k)_k$ and ${}_k D_D$ occur, where $D$ is a generalized quaternion algebra attached to an anisotropic quadratic form over $k$.
References

[1] M. Auslander, I. Reiten, S. O. Smalø, Representation Theory of Artin Algebras. Cambridge Studies in Advanced Mathematics, v. 36, Cambridge University Press, Cambridge (1995).
[2] V. Dlab and C. M. Ringel, Indecomposable representations of graphs and algebras. Mem. Amer. Math. Soc. 173, 57 p. (1976).
[3] P. Gabriel, Des catégories abéliennes. Bull. Soc. Math. France 90 (1962), 323-448.
[4] P. Gabriel, Indecomposable representations II. Symposia Mat. Inst. Naz. Alta Mat. 11, 1973, 81-104.
[5] W. Geigle and H. Lenzing, A class of weighted projective curves arising in representation theory of finite dimensional algebras. In: Singularities, representations of algebras, and vector bundles, Lecture Notes Math. 1273, Springer 1987, 265-297.
[6] D. Happel, Triangulated Categories in the Representation Theory of Finite Dimensional Algebras. London Math. Soc. Lecture Notes Series 119, Cambridge, 1988.
[7] D. Happel, A characterization of hereditary categories with tilting object. Invent. Math. 144 (2001), 381-398.
[8] D. Happel and C. M. Ringel, Tilted Algebras. Trans. Amer. Math. Soc. 274 (1982), 399-443.
[9] R. Hartshorne, Algebraic Geometry. Springer Verlag 1977.
[10] H. Koch, Number Theory. Algebraic numbers and functions. Graduate Studies in Math. 24, Amer. Math. Soc., Providence 2000.
[11] D. Kussin, Factorial algebras, quaternions and preprojective algebras. CMS Conf. Proc. 24 (1998), 381-398.
[12] T. Y. Lam, The algebraic theory of quadratic forms. Benjamin, Reading, Massachusetts, 1973.
[13] H. Lenzing, Hereditary noetherian categories with a tilting complex. Proc. Amer. Math. Soc. 125 (1997), 1893-1901.
[14] H. Lenzing, Representations of finite dimensional algebras and singularity theory. CMS Conf. Proc. 22 (1998), 71-97.
[15] D. Mumford, Abelian Varieties. Oxford University Press, 1970.
[16] N. Popescu, Abelian Categories with Applications to Rings and Modules. Academic Press, London, New York (1973).
[17] I. Reiten and M. Van den Bergh, Noetherian hereditary categories satisfying Serre duality. Amer. J. Math., to appear.
[18] C. S. Seshadri, Fibrés vectoriels sur les courbes algébriques. Astérisque 96 (1982).

Helmut Lenzing
Fachbereich Mathematik-Informatik
Universität Paderborn
D-33095 Paderborn
Germany
Some recent developments on the hydrodynamic limit of the Boltzmann equation

Nader Masmoudi
Courant Institute, 251 Mercer Street, New York, NY 10012
From a physical point of view, we expect that a gas can be described by a fluid mechanics equation when the mean free path goes to zero. We present here some recent results concerning the (rigorous) derivation of incompressible fluid mechanics equations starting from the Boltzmann equation in the limit where the mean free path (Knudsen number) goes to zero.
1 The Boltzmann equation

1.1 The model
The molecules of a gas can be modeled by hard spheres that move according to the laws of classical mechanics. However, due to the enormous number of molecules (about $2.7 \times 10^{19}$ molecules in a cubic centimeter of gas at 1 atm and 0° C), it seems difficult to describe the state of the gas by giving the position and velocity of each individual particle. Hence, we must use some statistics and, instead of giving the position and velocity of each particle, we specify the density of particles $F(x,v)$ at each point $x$ and velocity $v$. This means that we describe the gas by giving for each point $x$ and velocity $v$ the number of particles $F(x,v)\,dx\,dv$ in the volume $(x, x+dx) \times (v, v+dv)$. Under some assumptions (rarefied gas, ...), it is possible to derive (at least formally) the Boltzmann equation from the classical Newton laws in an asymptotic regime where the number of particles goes to infinity (see also [19] and [8] for some rigorous results):

    (B)    $\partial_t F + v \cdot \nabla_x F = B(F,F)$    (1)
where the collision kernel $B(F,F)$ is a quadratic form which acts only on the $v$ variable. It describes the possible interaction between two different particles and is given by

    $B(F,F)(v) = \int_{\mathbb{R}^N} \int_{S^{N-1}} (F_1' F' - F_1 F)\, b(v - v_1, \omega)\, dv_1\, d\omega$    (2)
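where we have used the following notation for any function $\varphi$:

    $\varphi_1 = \varphi(v_1), \quad \varphi' = \varphi(v'), \quad \varphi_1' = \varphi(v_1')$    (3)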
and where the primed velocities are given by

    $v' = v + \omega\,[\omega \cdot (v_1 - v)], \quad v_1' = v_1 - \omega\,[\omega \cdot (v_1 - v)].$    (4)
Moreover, the Boltzmann cross-section $b(z,\omega)$ ($z \in \mathbb{R}^N$, $\omega \in S^{N-1}$) depends on the molecular interactions (intermolecular potential). It is a nonnegative, locally integrable function (at least when grazing collisions are neglected). The Galilean invariance of the collision implies that $b$ depends only on $v - v_1$ and $\omega$, and that

    $b(z,\omega) = |z|\, S(|z|, |\mu_c|), \quad \mu_c = \frac{z \cdot \omega}{|z|}$    (5)
where $S$ is the specific differential cross-section. We also insist on the fact that the relation (4) is equivalent to the following conservations:

    $v' + v_1' = v + v_1$    (conservation of momentum)    (6)
    $|v'|^2 + |v_1'|^2 = |v|^2 + |v_1|^2$    (conservation of kinetic energy)    (7)
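One direction of this equivalence can be checked by a short computation. The sketch below uses the abbreviation $a = \omega \cdot (v_1 - v)$, a symbol introduced here only for convenience, together with $|\omega| = 1$, and verifies that the primed velocities $v' = v + a\,\omega$ and $v_1' = v_1 - a\,\omega$ of (4) satisfy (6) and (7).

```latex
\begin{align*}
v' + v_1' &= (v + a\,\omega) + (v_1 - a\,\omega) = v + v_1,\\
|v'|^2 &= |v|^2 + 2a\,\omega\cdot v + a^2, \qquad
|v_1'|^2 = |v_1|^2 - 2a\,\omega\cdot v_1 + a^2,\\
|v'|^2 + |v_1'|^2 &= |v|^2 + |v_1|^2 + 2a\,\omega\cdot(v - v_1) + 2a^2\\
                  &= |v|^2 + |v_1|^2 - 2a^2 + 2a^2 = |v|^2 + |v_1|^2 .
\end{align*}
```

The cancellation in the last line uses $\omega \cdot (v - v_1) = -a$.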
We notice that the fact that two particles give two particles after the interaction expresses the conservation of mass. For a more precise discussion about the model, we refer to [7] and [8].

1.1.1 Conservation laws
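If $F$ satisfies the Boltzmann equation, we deduce (at least formally) the following local conservation laws:

    $\partial_t \left( \int_{\mathbb{R}^N} F\, dv \right) + \nabla_x \cdot \left( \int_{\mathbb{R}^N} v F\, dv \right) = 0$
    $\partial_t \left( \int_{\mathbb{R}^N} v F\, dv \right) + \nabla_x \cdot \left( \int_{\mathbb{R}^N} v \otimes v\, F\, dv \right) = 0$    (8)
    $\partial_t \left( \int_{\mathbb{R}^N} |v|^2 F\, dv \right) + \nabla_x \cdot \left( \int_{\mathbb{R}^N} v |v|^2 F\, dv \right) = 0$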
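A sketch of why (8) holds, at the same formal level: multiply (1) by $\varphi(v) \in \{1, v, |v|^2\}$ and integrate in $v$. The computation assumes the standard collision-invariant identity stated in the first line below, which follows from the symmetries of the collision integral (2) together with (6) and (7) and is not proved in this section; it also assumes enough decay of $F$ in $v$ to exchange derivatives and integrals.

```latex
\begin{align*}
\int_{\mathbb{R}^N} B(F,F)(v)\,\varphi(v)\,dv &= 0
   \qquad\text{for } \varphi(v) = 1,\ v,\ |v|^2,\\
\partial_t \int_{\mathbb{R}^N} \varphi\,F\,dv
   + \nabla_x\cdot\int_{\mathbb{R}^N} v\,\varphi\,F\,dv
   &= \int_{\mathbb{R}^N} \varphi\,\bigl(\partial_t F + v\cdot\nabla_x F\bigr)\,dv
    = \int_{\mathbb{R}^N} \varphi\,B(F,F)\,dv = 0 .
\end{align*}
```

The three choices of $\varphi$ give the three lines of (8) in order.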
These three equations describe respectively the conservation of mass, momentum and energy. They present a great resemblance with the compressible Euler equation. However, the third moment $\int_{\mathbb{R}^N} v |v|^2 F\, dv$ is not a function of the others and depends in general on the whole distribution $F(v)$. In the asymptotic regimes we want to study, the distribution $F(v)$ will be a Maxwellian. If we make this assumption, the third moment $\int_{\mathbb{R}^N} v |v|^2 F\, dv$ can be given as a function of $\rho = \int_{\mathbb{R}^N} F\, dv$, $\rho u = \int_{\mathbb{R}^N} v F\, dv$ and $\rho\left(\frac{1}{2}|u|^2 + \frac{N}{2}\theta\right) = \int_{\mathbb{R}^N} \frac{1}{2}|v|^2 F\, dv$. Moreover, for all $i$ and $j$, $\int_{\mathbb{R}^N} v_i v_j F\, dv$ can also be expressed as a function of $\rho$, $u$ and $\theta$.

1.1.2 Maxwellians

A Maxwellian $M_{\rho,u,\theta}$ is given by

    $M_{\rho,u,\theta}(v) = \frac{\rho}{(2\pi\theta)^{N/2}} \exp\left( -\frac{|v - u|^2}{2\theta} \right)$    (9)
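where $\rho$, $u$ and $\theta$ depend only on $t$ and $x$. If we assume that for all $t$, $F$ is a Maxwellian given by $F = M_{\rho,u,\theta}$, then (8) reduces to

    $\partial_t \rho + \nabla_x \cdot (\rho u) = 0$
    $\partial_t (\rho u) + \nabla_x \cdot (\rho u \otimes u) + \nabla_x (\rho\theta) = 0$    (10)
    $\partial_t \left( \frac{1}{2}\rho |u|^2 + \frac{N}{2}\rho\theta \right) + \nabla_x \cdot \left( \rho u \left( \frac{1}{2}|u|^2 + \frac{N+2}{2}\theta \right) \right) = 0$

Hence, we get the compressible Euler system for a mono-atomic perfect gas. This derivation can become rigorous if we take a sequence of solutions $F_\varepsilon$ of

    $\partial_t F_\varepsilon + v \cdot \nabla_x F_\varepsilon = \frac{1}{\varepsilon} B(F_\varepsilon, F_\varepsilon)$    (11)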
where $\varepsilon$ is the Knudsen number, which goes to 0 (see R. Caflisch [6]). Formally, the presence of the term $\frac{1}{\varepsilon}$ in front of $B(F_\varepsilon, F_\varepsilon)$ implies (at the limit) that $B(F,F) = 0$, which means that $F$ is a Maxwellian (see [7] and [8]).

1.1.3 The scaling
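We explain here the type of scalings we want to study and their meaning in terms of the Knudsen, Reynolds and Mach numbers.

Let $F_\varepsilon = M G_\varepsilon = M(1 + \varepsilon^m g_\varepsilon)$ be a solution of the following Boltzmann equation

    $\varepsilon^s \partial_t F_\varepsilon + v \cdot \nabla F_\varepsilon = \frac{1}{\varepsilon^q} B(F_\varepsilon, F_\varepsilon)$    (12)

which is also equivalent to

    $\varepsilon^s \partial_t G_\varepsilon + v \cdot \nabla G_\varepsilon = \frac{1}{\varepsilon^q} Q(G_\varepsilon, G_\varepsilon)$    (13)

where

    $Q(G,G)(v) = \int_{\mathbb{R}^N} \int_{S^{N-1}} (G_1' G' - G_1 G)\, b(v - v_1, \omega)\, M_1\, dv_1\, d\omega.$    (14)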
With this scaling, we can define

    $\mathrm{Ma} = \varepsilon^m, \quad \mathrm{Kn} = \varepsilon^q, \quad \mathrm{Re} = \varepsilon^{m-q}.$    (15)

Here $\varepsilon^s$ is a time scaling which allows us to choose the phenomenon we want to emphasize. By varying $m$, $q$ and $s$, we can formally derive the following systems (see the references for some rigorous mathematical results):

    $q = 1$, $m = 0$, $s = 0$    Compressible Euler system [6, 18, 28]
    $q = 1$, $m > 0$, $s = 0$    Acoustic waves [3]
    $q = 1$, $m = 1$, $s = 1$    Incompressible Navier-Stokes system [9, 3, 5, 24]
    $q = 1$, $m > 1$, $s = 1$    Stokes equation [2, 25]
    $q > 1$, $m = 1$, $s = 1$    Incompressible Euler system [25]
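The regimes listed above can be compared through the Reynolds number. Taking the definitions (15) at face value, $\mathrm{Re} = \varepsilon^{m-q}$, the following sketch records its behaviour as $\varepsilon \to 0$ in each case; the limits are read off directly from the exponents and are not taken from the cited references.

```latex
\begin{align*}
q=1,\ m=0 &: \quad \mathrm{Ma} = 1, \quad \mathrm{Re} = \varepsilon^{-1} \to \infty
          && \text{(compressible Euler)},\\
q=1,\ m=1 &: \quad \mathrm{Ma} = \varepsilon \to 0, \quad \mathrm{Re} = \varepsilon^{0} = 1
          && \text{(incompressible Navier-Stokes)},\\
q=1,\ m>1 &: \quad \mathrm{Ma} = \varepsilon^{m} \to 0, \quad \mathrm{Re} = \varepsilon^{m-1} \to 0
          && \text{(Stokes)},\\
q>1,\ m=1 &: \quad \mathrm{Ma} = \varepsilon \to 0, \quad \mathrm{Re} = \varepsilon^{1-q} \to \infty
          && \text{(incompressible Euler)}.
\end{align*}
```

A Reynolds number of order one with Mach number of order one would require $m = 0$ and $m = q$ at the same time, hence $q = 0$, so that the Knudsen number would not tend to zero.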
Note that the compressible Navier-Stokes system (with a viscosity of order 1) cannot be derived in this manner because of the following physical relation

    $\mathrm{Re} = C\, \frac{\mathrm{Ma}}{\mathrm{Kn}}.$    (16)

However, the compressible Navier-Stokes system with a viscosity of order $\varepsilon$ can be seen as a higher-order approximation of the Boltzmann system in the case $q = 1$, $m = 0$, $s = 0$.

1.1.4 Formal development
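Here, we want to explain (at least formally) how we can derive the incompressible Navier-Stokes system starting from the Boltzmann system with the scalings $q = 1$, $m = 1$, $s = 1$. Rewriting the equation satisfied by $g_\varepsilon$, we get

    $(B_\varepsilon)$    $\partial_t g_\varepsilon + \frac{1}{\varepsilon} v \cdot \nabla_x g_\varepsilon = -\frac{1}{\varepsilon^2} L g_\varepsilon + \frac{1}{\varepsilon} Q(g_\varepsilon, g_\varepsilon)$    (17)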
where $L$ is the linearized collision operator given by

    $L g = \int_{\mathbb{R}^N} \int_{S^{N-1}} (g + g_1 - g_1' - g')\, b(v - v_1, \omega)\, M_1\, dv_1\, d\omega$    (18)
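Equation (17) can be recovered from (13) by a direct expansion. The sketch below assumes that $Q(G,G)$ is extended to a symmetric bilinear operator $Q(G,H)$ (a polarization that is not written out in the text), that $Q(1,1) = 0$, and that $2Q(1,g) = -Lg$, which agrees with the integrands in (14) and (18).

```latex
\begin{align*}
G_\varepsilon &= 1 + \varepsilon\, g_\varepsilon, \qquad q = m = s = 1,\\
Q(G_\varepsilon,G_\varepsilon)
   &= Q(1,1) + 2\varepsilon\,Q(1,g_\varepsilon) + \varepsilon^2 Q(g_\varepsilon,g_\varepsilon)
    = -\varepsilon\,L g_\varepsilon + \varepsilon^2 Q(g_\varepsilon,g_\varepsilon),\\
\varepsilon^2\,\partial_t g_\varepsilon + \varepsilon\, v\cdot\nabla_x g_\varepsilon
   &= \frac{1}{\varepsilon}\Bigl(-\varepsilon\,L g_\varepsilon
      + \varepsilon^2 Q(g_\varepsilon,g_\varepsilon)\Bigr),\\
\partial_t g_\varepsilon + \frac{1}{\varepsilon}\, v\cdot\nabla_x g_\varepsilon
   &= -\frac{1}{\varepsilon^2}\,L g_\varepsilon
      + \frac{1}{\varepsilon}\,Q(g_\varepsilon,g_\varepsilon).
\end{align*}
```

The third line is (13) with $G_\varepsilon = 1 + \varepsilon g_\varepsilon$ inserted; dividing by $\varepsilon^2$ gives (17).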
We assume that $g_\varepsilon$ can be decomposed as follows: $g_\varepsilon = g + \varepsilon h + \varepsilon^2 k + O(\varepsilon^3)$, and we make the following formal development:

    $\frac{1}{\varepsilon^2}$ :    $L g = 0.$    (19)
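A simple study of the operator $L$ shows that it is formally self-adjoint and non-negative for the scalar product $\langle f, g \rangle = \langle f g \rangle$, where we use the notation $\langle g \rangle = \int_{\mathbb{R}^N} g\, M\, dv$, and that $\mathrm{Ker}(L) = \{ g : g = \alpha + \beta \cdot v + \gamma |v|^2, \ (\alpha, \beta, \gamma) \in \mathbb{R} \times \mathbb{R}^N \times \mathbb{R} \}$. Hence, we deduce that $g = \rho + u \cdot v + \theta \left( \frac{1}{2}|v|^2 - \frac{N}{2} \right)$, where $\rho$, $u$ and $\theta$ depend only on $t$ and $x$. At the next order we find

    $\frac{1}{\varepsilon}$ :    $v \cdot \nabla_x g = -L h + Q(g,g).$    (20)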
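Hence, we deduce that $u = \langle v g \rangle$ is divergence-free ($\mathrm{div}\, u = 0$). Besides, for the order 1, we have

    $\varepsilon^0$ :    $\partial_t \langle v g \rangle + \nabla_x \cdot \langle v \otimes v\, h \rangle = 0.$    (21)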
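To get a closed equation for $g$, we have to invert the operator $L$. We define the matrix
\[ \phi(v) = v\otimes v - \frac{1}{N}\,|v|^2 I \tag{22} \]
which is orthogonal to $\mathrm{Ker}(L)$ for the scalar product $\langle f,g\rangle = \langle f g\rangle$. We also define the viscosity $\nu$ by (23). We notice that $\nu$ only depends on $b$. Using that $L$ is formally self-adjoint, we deduce that
\[ \partial_t\langle g v\rangle + \nabla_x\cdot\bigl\langle L^{-1}\phi\,\bigl(Q(g,g) - v\cdot\nabla_x g\bigr)\bigr\rangle + \nabla_x\Bigl\langle \frac{|v|^2}{N}\, h\Bigr\rangle = 0. \tag{24} \]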
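The step from (21) to (24) can be spelled out as follows. This is a sketch under the assumption that $\phi$ denotes the traceless matrix $v\otimes v - \frac1N |v|^2 I$ and that $L^{-1}\phi$ is well defined because $\phi$ is orthogonal to $\mathrm{Ker}(L)$.

```latex
\begin{align*}
\langle v\otimes v\, h\rangle
  &= \langle \phi\, h\rangle + \frac{1}{N}\,\langle |v|^2 h\rangle\, I, \\
\langle \phi\, h\rangle
  &= \langle L(L^{-1}\phi)\, h\rangle
   = \langle L^{-1}\phi\; L h\rangle
   && \text{($L$ formally self-adjoint)} \\
  &= \bigl\langle L^{-1}\phi\,\bigl(Q(g,g) - v\cdot\nabla_x g\bigr)\bigr\rangle
   && \text{(by (20))}.
\end{align*}
% Substituting both lines into (21) gives (24); the isotropic part
% \nabla_x \langle |v|^2 h / N \rangle is a gradient and is absorbed in the pressure.
```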
A simple computation (but a long one) gives the Navier-Stokes equation, namely
\[ \partial_t u + u\cdot\nabla u - \nu\Delta u + \nabla p = 0 \tag{25} \]
where $u = \langle g v\rangle$ and the pressure $p$ is the sum of different contributions.

Remark 1.1 It is also possible to derive (at least formally) in the same way an evolution equation for $\theta$ which has the following form
\[ \partial_t\theta + u\cdot\nabla\theta - \kappa\Delta\theta = 0 \tag{26} \]
where $\kappa$ only depends on $b$.

1.2 The convergence towards the incompressible Navier-Stokes
The rigorous justification of the formal development 1.1.4 has a history which goes back to the work of C. Bardos, F. Golse and D. Levermore [1], where the stationary case was handled under different assumptions and restrictions. First, the heat equation was not treated because the heat flux terms could not be controlled. Second, local momentum conservation was assumed because DiPerna-Lions solutions are not known to satisfy the local conservation law of momentum (or energy) that one would formally expect. Third, the discrete-time case was treated in order to avoid having to control the time regularity of the acoustic modes. Fourth, unnatural technical assumptions were made on the Boltzmann kernel. Finally, a mild compactness assumption was required to pass to the limit in certain nonlinear terms. In collaboration with P.-L. Lions [24] and under two assumptions (which are not necessarily satisfied by the renormalized solutions of the Boltzmann equation), we were able to treat the time-dependent case. In the last few months, several works have tried to remove all the assumptions and also to recover the heat equation (26) (see [12], [20], [15]).

1.2.1 Mathematical difficulties
To prove a rigorous mathematical result concerning one of the above formal ones, we encounter many difficulties which we are going to analyze.
D1. The local conservation of momentum is not known to hold for the renormalized solutions of the Boltzmann equation. Indeed, the solutions constructed by R. DiPerna and P.-L. Lions [10] only hold in the renormalized sense, which means that
\[ \partial_t\beta(F) + v\cdot\nabla\beta(F) = Q(F,F)\,\beta'(F), \tag{27} \]
\[ \beta(F)(t = 0) = \beta(F^0), \tag{28} \]
and where $\beta$ is given, for instance, by $\beta(f) = \log(1 + f)$. Some ideas to solve this difficulty can be found in [12] (see also [20]).

D2. The lack of a priori estimates. Indeed, all we can deduce is that $g_\varepsilon$ is bounded in $L\log L$. However, we need a bound in $L^2$ to define all the products involved in the formal development. To pass to the limit in the different products, one has also to prove that $g_\varepsilon$ is compact in space and time, namely that $g_\varepsilon \in K$ where $K$ is a compact subset of some $L^p(0,T; L^1(\Omega))$. We split this into two difficulties.

D3. The compactness in space of $g_\varepsilon$. This was achieved in the stationary case by C. Bardos, F. Golse and D. Levermore [4], [1] using the averaging lemma and proving that $g_\varepsilon$ is in some compact subset of $L^1(\Omega)$.

D4. The compactness in time for $g_\varepsilon$. It turns out that in general $g_\varepsilon$ is not compact in time. Indeed, $g_\varepsilon$ presents some oscillations in time which can be analyzed and described precisely. Using this description and some compensation (due to some remarkable identity of the wave equation), it is possible to pass to the limit in the whole equation. This was done by P.-L. Lions and the author [24] and we will describe this work later.

1.2.2 Some assumptions
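Before giving the assumptions (A1) and (A2) which make it possible to circumvent the difficulties D1 and D2, we specify the conditions we impose on the initial data. It is supposed that $G^0_\varepsilon$ satisfies (we recall that $F^0_\varepsilon = M G^0_\varepsilon$)
\[ H(G^0_\varepsilon) = \int_\Omega\int_{\mathbb{R}^N} \bigl(G^0_\varepsilon\log G^0_\varepsilon - G^0_\varepsilon + 1\bigr)\, M\, dx\, dv \le C\varepsilon^2. \tag{29} \]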
This shows that we can extract a subsequence of the sequence $g^0_\varepsilon$ (defined by $G^0_\varepsilon = 1 + \varepsilon g^0_\varepsilon$) which converges weakly in $L^1$ towards $g^0$ such that $g^0 \in L^2$.
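To see why a bound of order $\varepsilon^2$ in (29) is close to an $L^2$ bound, note that with $G = 1 + \varepsilon g$ the integrand of (29) equals $h(\varepsilon g)$ with $h(z) = (1+z)\log(1+z) - z$, and that $h(z) = z^2/2 + O(z^3)$ for small $z$. The following numerical sketch illustrates this pointwise; the value of `g` is an arbitrary sample and the function name `rescaled_entropy` is hypothetical.

```python
import math


def h(z):
    """Entropy integrand h(z) = (1 + z) log(1 + z) - z, defined for z > -1."""
    return (1.0 + z) * math.log1p(z) - z


def rescaled_entropy(g, eps):
    """Pointwise value of h(eps * g) / eps**2, the quantity bounded through (29)."""
    return h(eps * g) / eps ** 2


g = 0.7  # hypothetical value of the fluctuation at one point
for eps in (1e-1, 1e-2, 1e-3):
    # The second column approaches the third one as eps decreases.
    print(eps, rescaled_entropy(g, eps), 0.5 * g * g)
```

For large arguments $h(z)$ grows only like $z\log z$, which is why the estimate controls $g^0_\varepsilon$ in $L^2$ only approximately.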
We also notice that (29) is equivalent to the fact that $\int_\Omega\langle h(\varepsilon g^0_\varepsilon)\rangle\, dx \le C\varepsilon^2$, where $h(z) = (1+z)\log(1+z) - z$, which is almost an $L^2$ estimate for $g^0_\varepsilon$. This shows at least that $g^0 \in L^2$. Then, we consider a sequence $G_\varepsilon$ of renormalized solutions of the Boltzmann equation $(B_\varepsilon)$. The convergence result we prove in [24] requires the following two hypotheses (A1) and (A2) on the sequence $G_\varepsilon$, which allow us to circumvent the difficulties D1 and D2.

(A1). The solution $G_\varepsilon$ satisfies the projection on divergence-free vector fields of the local momentum conservation law
\[ \partial_t P\langle v\, G_\varepsilon\rangle + \frac{1}{\varepsilon}\, P\nabla_x\cdot\langle v\otimes v\, G_\varepsilon\rangle = 0. \tag{30} \]
(A2). The family $(1+|v|^2)\, g_\varepsilon^2/N_\varepsilon$ is relatively compact in $w$-$L^1(dt\, M\, dv\, dx)$, where $N_\varepsilon = 1 + \ldots$
(31) (32)
This assumption on the Boltzmann kernel can be relaxed, see [12] and [20] where it is assumed (in the case of hard interparticle potential) that there exist $C_b \in (0,\infty)$ and $\eta \in [0,1]$ such that $b$ satisfies
\[ \int b(\omega, v)\, d\omega \le C_b\bigl(1 + \tfrac12 |v|^2\bigr)^{\eta} \quad\text{almost everywhere.} \tag{33} \]

1.2.3 The result and a sketch of the proof
As we are going to see, the difficulty D4 can be solved by a precise analysis of the possible oscillations in time of $\langle v g_\varepsilon\rangle$. Indeed the fluctuation $\langle v g_\varepsilon\rangle$ can be split up into a divergence-free part $P\langle v g_\varepsilon\rangle$, which is compact in time and which converges strongly to $u$, and a gradient part $Q\langle v g_\varepsilon\rangle$, which is oscillating in time (formed by the acoustic waves) and which can be analyzed within the same framework as for the compressible-incompressible limit (see [23]).

Theorem 1.2 Let $G_\varepsilon$ be a sequence of renormalized solutions of the Boltzmann equations $(B_\varepsilon)$ with initial condition $G^0_\varepsilon$. Then, the family $(1 + |v|^2) g_\varepsilon$ is relatively compact in $w$-$L^1(dt\, M\, dv\, dx)$. If $g$ is a weak limit of a subsequence (still denoted $g_\varepsilon$) then $Lg = 0$ and $g = \rho + u\cdot v + \theta\left(\frac12 |v|^2 - \frac N2\right)$ satisfies the limiting dissipation inequality
\[ \frac12\int \Bigl(|\rho(t)|^2 + |u(t)|^2 + \frac N2 |\theta(t)|^2\Bigr) dx + \int_0^t\!\!\int \frac{\nu}{2}\,\bigl|\nabla_x u + {}^t\nabla_x u\bigr|^2\, dx\, ds \le \liminf_{\varepsilon\to 0} \frac{1}{\varepsilon^2}\int \langle h(\varepsilon g^0_\varepsilon)\rangle\, dx = C^0. \tag{34} \]
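Moreover, if $b$ satisfies (A0) and $G_\varepsilon$ satisfies the additional conditions (A1) and (A2), then $u = \langle v g\rangle$ is a weak solution of the Navier-Stokes system
\[ (NS)\qquad \begin{cases} \partial_t u + u\cdot\nabla u - \nu\Delta u + \nabla p = 0, \qquad \nabla\cdot u = 0, \\ u(t = 0, x) = u^0(x) \end{cases} \]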
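with the initial condition $u^0 = P\langle v g^0\rangle$ and where the viscosity $\nu > 0$ depends only on the collision kernel and is given by (23).

Now, we give an idea of the proof of Theorem 1.2 (see [24] for a more complete proof). We start by recalling a few a priori estimates taken from [1].

Proposition 1.3 We have
i) The sequence $(1 + |v|^2) g_\varepsilon$ is bounded in $L^\infty(dt; L^1(M\, dv\, dx))$ and relatively compact in $w$-$L^1(dt\, M\, dv\, dx)$. Moreover, if $g$ is the weak limit of any converging subsequence of $g_\varepsilon$, then $g \in L^\infty(dt; L^2(M\, dv\, dx))$ and for almost every $t \in [0,\infty)$, we have
\[ \frac12\int \langle g^2(t)\rangle\, dx \le \liminf_{\varepsilon\to 0} \frac{1}{\varepsilon^2}\int \langle h(\varepsilon g_\varepsilon(t))\rangle\, dx \le C^0. \tag{35} \]
ii) Denoting $q_\varepsilon = \frac{1}{\varepsilon^2}\bigl(G'_{\varepsilon 1} G'_\varepsilon - G_{\varepsilon 1} G_\varepsilon\bigr)$, we have that the sequence $(1 + |v|^2)\, q_\varepsilon / N_\varepsilon$ is relatively compact in $w$-$L^1(dt\, d\mu\, dx)$, where $d\mu = b(v - v_1, \omega)\, d\omega\, M_1\, dv_1\, M\, dv$.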
Besides, if $q$ is the weak limit of any converging subsequence of $q_\varepsilon / N_\varepsilon$ then $q \in L^2(dt; L^2(d\mu\, dx))$ and $q$ inherits the same symmetries as $q_\varepsilon$, namely $q(v, v_1, \omega) = q(v_1, v, \omega) = -q(v', v_1', \omega)$.
iii) In addition, for almost all $(t,x)$, $Lg = 0$, which means that $g$ is of the form
\[ g(t,x,v) = \rho(t,x) + u(t,x)\cdot v + \theta(t,x)\Bigl(\frac12 |v|^2 - \frac N2\Bigr), \tag{36} \]
where $\rho, u, \theta \in L^\infty(dt; L^2(dx))$.
iv) Finally, from the renormalized equation, we deduce that
\[ v\cdot\nabla_x g = \int q\, b(v_1 - v, \omega)\, d\omega\, M_1\, dv_1 \tag{37} \]
which yields the incompressibility and Boussinesq relations, namely
\[ \nabla_x\cdot u = 0, \qquad \nabla_x(\rho + \theta) = 0. \tag{38} \]
We point out here that even though g G L2 the convergence of ge towards g only takes place in Ll. Next, using the hypothesis (Al) on the conservation of momentum, we get d,P(vg,} + PV,.(^(l-~)4>Lg,)+
+
(39)
P^9isJl).PVz.^Q^Ay
(40)
Then it is easy to deduce from Proposition 1.3 that
(^SvT1)
"> ^u+*Vu)
(41)
in w — Ljoc(dt dx) (see also Propositions 5.1 of [1]) On the other hand, using assumption (A2), we can deduce that ^(l-l-jLg,) in L]oc(dt; Ll(dx)) that
->
0
(42)
(see Propositions 5.2 of [1]). Again, using (A2), we get
(*(2%4*-««.,«.>))
-
0
(43)
where have used the following decomposition of g£ = g£ + eg£, with
(44)
i.-^.
fc = |
The theorem is then proved, provided we prove that Vx-(>Q(g£,g£))
-»•
div{u®u)
+ Vp
mV
(45)
- j)g£)
(46)
Then projecting g e , we can define Pe = (ge),
U£ = (vg£),
§£ = ^ ( { ^
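And using assumptions (A0) and (A2), it is easy to reduce the problem to the proof of the following one
\[ \mathrm{div}\,(u_\varepsilon\otimes u_\varepsilon) \to \mathrm{div}\,(u\otimes u) + \nabla q, \tag{47} \]
which is an easy consequence of the following proposition.

Proposition 1.4 Under the above assumptions, we have
\[ P u_\varepsilon \to u \quad\text{in}\quad L^2_{loc}(dt; L^2(dx)), \tag{48} \]
\[ \mathrm{div}\,(Q u_\varepsilon\otimes Q u_\varepsilon) \to \nabla q. \tag{49} \]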
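We only give an idea of the proof of the second assertion. Indeed the first one is a simple consequence of the averaging lemma [14], [13] (see also [1]). For the second part, we recall that $Qu_\varepsilon$ converges weakly to 0 in $L^2$ and hence we need to analyze the oscillations in time of $Qu_\varepsilon$. This is done by the following proposition.

Proposition 1.5 There exist two sequences $I_\varepsilon$ and $J_\varepsilon$ such that
\[ \begin{cases} \dfrac{\partial}{\partial t}(\rho_\varepsilon + \theta_\varepsilon) + \dfrac{1}{\varepsilon}\Bigl(\dfrac{N+2}{N}\Bigr)\,\mathrm{div}\, u_\varepsilon = \dfrac{1}{\varepsilon}\, I_\varepsilon, \\[2mm] \dfrac{\partial}{\partial t} u_\varepsilon + \dfrac{1}{\varepsilon}\,\nabla(\rho_\varepsilon + \theta_\varepsilon) = \dfrac{1}{\varepsilon}\, J_\varepsilon \end{cases} \tag{50} \]
and
\[ I_\varepsilon,\ J_\varepsilon \to 0 \quad\text{in}\quad L^1(0,T; H^{-s}) \tag{51} \]
for some $s \in \mathbb{R}_+$.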
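Now, ignoring the regularity issue in the $x$ variable and assuming that all the terms are smooth in $x$, we can perform the following computations, where $f_\varepsilon = \rho_\varepsilon + \theta_\varepsilon$ and $\nabla\psi_\varepsilon = Qu_\varepsilon$:
\begin{align*}
\mathrm{div}\,(\nabla\psi_\varepsilon\otimes\nabla\psi_\varepsilon)
 &= \frac12\nabla|\nabla\psi_\varepsilon|^2 + \Delta\psi_\varepsilon\nabla\psi_\varepsilon \\
 &= \frac12\nabla\bigl(|\nabla\psi_\varepsilon|^2\bigr) + \frac{N}{N+2}\Bigl(-\varepsilon\,\partial_t(f_\varepsilon\nabla\psi_\varepsilon) - f_\varepsilon\nabla f_\varepsilon + f_\varepsilon\, QJ_\varepsilon + I_\varepsilon\nabla\psi_\varepsilon\Bigr) \\
 &= \nabla\Bigl(\frac12|\nabla\psi_\varepsilon|^2 - \frac{N}{2(N+2)}\, f_\varepsilon^2\Bigr) + \frac{N}{N+2}\Bigl(-\varepsilon\,\partial_t(f_\varepsilon\nabla\psi_\varepsilon) + f_\varepsilon\, QJ_\varepsilon + I_\varepsilon\nabla\psi_\varepsilon\Bigr).
\end{align*}
Then, since $\varepsilon f_\varepsilon\nabla\psi_\varepsilon$, $f_\varepsilon\, QJ_\varepsilon$ and $I_\varepsilon\nabla\psi_\varepsilon$ converge strongly to 0 in $L^1(0,T; H^s)$ ($\forall s > 0$), we conclude easily. The fact that we can assume that $\nabla\psi_\varepsilon$ is smooth is a consequence of the compactness in $x$ of $Qu_\varepsilon$ (which is due again to the averaging lemma). This allows us to convolve $Qu_\varepsilon$ in $x$ with an error which is uniformly bounded in $\varepsilon$.

Remark 1.6 The assumptions (A1) and (A2) were made to circumvent the difficulties D1 and D2. Some recent works ([12], [20], [15]) try to remove these assumptions completely and also to relax the assumption (A0).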
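The computation above rests on one vector identity and on the elimination of the time derivatives through (50). A componentwise sketch, reading (50) as $\varepsilon\,\partial_t f_\varepsilon + \frac{N+2}{N}\,\mathrm{div}\,u_\varepsilon = I_\varepsilon$ and $\varepsilon\,\partial_t u_\varepsilon + \nabla f_\varepsilon = J_\varepsilon$ with $f_\varepsilon = \rho_\varepsilon + \theta_\varepsilon$ (the subscript $\varepsilon$ is dropped below), is:

```latex
% Identity, with summation over the repeated index j:
\[
\bigl[\operatorname{div}(\nabla\psi\otimes\nabla\psi)\bigr]_i
 = \partial_j(\partial_i\psi\,\partial_j\psi)
 = \partial_i\psi\,\Delta\psi + \partial_j\psi\,\partial_i\partial_j\psi
 = \Delta\psi\,\partial_i\psi + \tfrac12\,\partial_i|\nabla\psi|^2 .
\]
% Elimination, using div u = \Delta\psi and the gradient part of the second equation:
\begin{align*}
\Delta\psi &= \frac{N}{N+2}\,\bigl(I - \varepsilon\,\partial_t f\bigr),
\qquad
\varepsilon\,\partial_t\nabla\psi = QJ - \nabla f, \\
-\varepsilon\,\partial_t f\,\nabla\psi
 &= -\varepsilon\,\partial_t(f\,\nabla\psi) + f\,\varepsilon\,\partial_t\nabla\psi
  = -\varepsilon\,\partial_t(f\,\nabla\psi) + f\,QJ - \tfrac12\nabla f^2 .
\end{align*}
```

The gradient terms $\frac12\nabla|\nabla\psi|^2$ and $\nabla f^2$ are the ones collected into the pressure-like term $\nabla q$ of (49).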
1.3 The convergence towards the Stokes system
The second result [25] we want to present concerns the convergence towards the Stokes system. This can be justified without making the hypotheses (A1) and (A2). In fact the assumption (A2) has already been removed in the work of C. Bardos, F. Golse and D. Levermore (see [1], Proposition 3.3) by proving a new estimate (which is weaker than (A2)) but sufficient in the linear case, namely
\[ |v|^2\,\frac{g_\varepsilon^2}{N_\varepsilon} = o(|\ln(\varepsilon)|) \quad\text{in}\quad L^\infty(0,T; L^1(M\, dv\, dx)). \]
However, the hypothesis (A1) was removed only in the case where $m > 2$ and with strong restrictions on the kernel $b$ (see [2]). In a more recent work F. Golse and D. Levermore [12] were also able to control the conservation defects using another method.
1.3.1 Defect measures
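In [25], we manage to eliminate the assumption on the conservation of momentum by showing that this conservation can be recovered in the limit. Indeed, by looking at the construction of the renormalized solutions of DiPerna-Lions [10], one sees that one can write a kind of conservation of momentum (with a defect measure) which also intervenes in the energy inequality. Indeed, the solutions $F_\varepsilon$ built by DiPerna and Lions satisfy in addition
\[ \partial_t\int v F_\varepsilon\, dv + \frac{1}{\varepsilon}\,\mathrm{div}\int (v\otimes v) F_\varepsilon\, dv + \frac{1}{\varepsilon}\,\mathrm{div}(M_\varepsilon) = 0. \tag{52} \]
Besides, the following energy equality holds
\[ \frac12\int_\Omega\int_{\mathbb{R}^N} |v|^2 F_\varepsilon(t,x,v)\, dx\, dv + \frac12\int_\Omega \mathrm{tr}(M_\varepsilon)\, dx = \frac12\int_\Omega\int_{\mathbb{R}^N} |v|^2 F^0_\varepsilon(x,v)\, dx\, dv \tag{53} \]
which can be rewritten (with $\varepsilon^m m_\varepsilon = M_\varepsilon$)
\[ \partial_t\langle g_\varepsilon v\rangle + \frac{1}{\varepsilon}\,\nabla_x\cdot\langle g_\varepsilon\, v\otimes v\rangle + \frac{1}{\varepsilon}\,\nabla\cdot m_\varepsilon = 0, \tag{54} \]
\[ \int_\Omega \langle |v|^2 g_\varepsilon\rangle\, dx + \int_\Omega \mathrm{tr}(m_\varepsilon)\, dx = 0. \tag{55} \]

1.3.2 Entropy inequality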
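One can write the entropy inequality for $G_\varepsilon$ (as in the case of the limit towards the Navier-Stokes system) or write it for $F_\varepsilon$ as well. It turns out that the second choice gives a better estimate for the defect measure. Indeed, starting from the entropy inequality for $F_\varepsilon$, we can deduce
\[ \begin{aligned} &\int_\Omega\int_{\mathbb{R}^N} h(\varepsilon^m g_\varepsilon)\, dx\, M\, dv\,(t) - \int_\Omega\int_{\mathbb{R}^N} \varepsilon^m\,\frac{|v|^2}{2}\, g_\varepsilon\, dx\, M\, dv\,(t) \\ &\quad + \frac{1}{4\varepsilon^2}\int_0^t ds\int_\Omega dx\int_{\mathbb{R}^N} M\, dv\int_{\mathbb{R}^N} M_1\, dv_1\int_{S^{N-1}} d\omega\; b(v - v_1,\omega)\,\bigl(G'_{\varepsilon 1}G'_\varepsilon - G_{\varepsilon 1}G_\varepsilon\bigr)\log\Bigl(\frac{G'_{\varepsilon 1}G'_\varepsilon}{G_{\varepsilon 1}G_\varepsilon}\Bigr) \\ &\quad \le \int_\Omega\int_{\mathbb{R}^N} h(\varepsilon^m g^0_\varepsilon)\, dx\, M\, dv. \end{aligned} \tag{56} \]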
1.3.3 The result
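We take initial data satisfying
\[ \int_{\mathbb{T}^N}\int_{\mathbb{R}^N} F^0_\varepsilon\, dx\, dv = 1, \qquad \int_{\mathbb{T}^N}\int_{\mathbb{R}^N} v F^0_\varepsilon\, dx\, dv = 0, \qquad \int_{\mathbb{T}^N}\int_{\mathbb{R}^N} |v|^2 F^0_\varepsilon\, dx\, dv = N \tag{57} \]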
and logF? dxdv<-^-
+ Ce2m
(58)
N
JQJR
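We also assume that $b$ satisfies (A0).

Theorem 1.7 We take $b$ such that (A0) is satisfied. If $F_\varepsilon$ is a sequence of renormalized solutions of the Boltzmann equations $(B_\varepsilon)$ with initial condition $F^0_\varepsilon$ and satisfies the entropy inequality as well as the refined momentum equation, then the family $(1 + |v|^2) g_\varepsilon$ is relatively compact in $w$-$L^1(dt\, M\, dv\, dx)$. And, if $g$ is a weak limit of a subsequence (still denoted $g_\varepsilon$) then $Lg = 0$ and $g = \rho + u\cdot v + \theta\left(\frac12 |v|^2 - \frac N2\right)$ satisfies the limiting dissipation inequality
\[ \frac12\int \Bigl(|\rho(t)|^2 + |u(t)|^2 + \frac N2 |\theta(t)|^2\Bigr) dx + \int_0^t\!\!\int \frac{\nu}{2}\,\bigl|\nabla_x u + {}^t\nabla_x u\bigr|^2\, dx\, ds \le \liminf_{\varepsilon\to 0} \frac{1}{\varepsilon^{2m}}\int \langle h(\varepsilon^m g^0_\varepsilon)\rangle\, dx = C^0. \tag{59} \]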
Moreover $u = \langle v g\rangle$ is the solution of the Stokes system (S) with the initial condition $u^0 = P\langle v g^0\rangle$ and where the viscosity $\nu$ is given by (23). Besides, we have the following strong Boussinesq relation
\[ \rho + \theta = 0. \tag{60} \]
We notice that the relation (60) is more precise than the Boussinesq relation proved in [1], where it was proved that $\nabla_x(\rho + \theta) = 0$. Moreover the relation (60) also holds in the case where we only take $m > 0$.

1.3.4 Conservation of momentum at the limit
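We explain here briefly how we can recover the conservation of momentum in the limit. Indeed, starting from the entropy inequality, one deduces that
\[ \int_\Omega \langle h(\varepsilon^m g_\varepsilon)\rangle\, dx + \varepsilon^m\int_\Omega \mathrm{tr}(m_\varepsilon)\, dx + D(G_\varepsilon) \le C\varepsilon^{2m} \tag{61} \]
and since $m > 1$, we deduce
\[ \frac{1}{\varepsilon}\,\mathrm{tr}(m_\varepsilon) \to 0 \quad\text{and}\quad \frac{1}{\varepsilon}\, m_\varepsilon \to 0 \tag{62} \]
in $L^\infty(0,T; L^1(\Omega))$ since $m_\varepsilon$ is a positive matrix.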
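The passage from (61) to (62) can be sketched in two lines. The sketch assumes that the three terms on the left of (61) are nonnegative ($h \ge 0$, $D \ge 0$, $m_\varepsilon$ positive) and that the trace term in (61) is integrated over $\Omega$.

```latex
\[
\varepsilon^{m}\int_\Omega \operatorname{tr}(m_\varepsilon)\,dx \le C\,\varepsilon^{2m}
\quad\Longrightarrow\quad
\frac{1}{\varepsilon}\int_\Omega \operatorname{tr}(m_\varepsilon)\,dx \le C\,\varepsilon^{m-1}
\longrightarrow 0 \qquad (m > 1).
\]
% For a positive semidefinite matrix, |m_{ij}| \le \sqrt{m_{ii}\,m_{jj}} \le \operatorname{tr}(m),
% so the bound on the trace controls every entry of m_\varepsilon / \varepsilon.
```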
1.4 Convergence towards the Euler system
We present, here, a method of proof based on an energy method which uses the relative entropy. Indeed, contrary to the two preceding cases, we suppose here the existence of a strong solution for the Euler system and we show the convergence towards this solution. The technique used is based on a Gronwall lemma. In [25] (in collaboration with P.-L. Lions), we show this convergence with an assumption on high velocities (A2). The innovation with respect to the results presented in [11] and [24] is that one does not need to assume the conservation of momentum. Indeed we introduce defect measures which disappear in the limit. We take well-prepared initial data (i.e. there are no acoustic waves).

1.4.1 Entropic convergence
In addition to the assumptions on $G^0_\varepsilon$ which we imposed in the case of convergence towards the Navier-Stokes system, we suppose that $g^0_\varepsilon$ converges entropically towards $g^0$ and that $g^0 = u^0\cdot v$ (with $\mathrm{div}\, u^0 = 0$), i.e. that
\[ g^0_\varepsilon \to g^0 \quad\text{in}\quad w\text{-}L^1(M\, dv\, dx), \tag{63} \]
and
\[ \lim_{\varepsilon\to 0} \frac{1}{\varepsilon^2}\int \langle h(\varepsilon g^0_\varepsilon)\rangle\, dx = \frac12\int \langle (g^0)^2\rangle\, dx. \tag{64} \]
It is also supposed that $u^0$ is regular enough (for example $u^0 \in H^s$, $s > \frac N2 + 1$) to be able to build a strong solution $u$ of the Euler system with the initial data $u^0$. Then, we have $u \in L^\infty_{loc}([0,T^*); H^s)$.

1.4.2 Relative entropy
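We want to show that the distribution $F_\varepsilon$ is close to a Maxwellian $M_{(0,\varepsilon u,0)} = M\tilde G_\varepsilon$. But as $F_\varepsilon$ is only in $L\log L$, we have to estimate the difference between $F_\varepsilon$ and $M_{(0,\varepsilon u,0)}$ using the relative entropy
\[ H(G_\varepsilon, \tilde G_\varepsilon) = \int \Bigl\langle G_\varepsilon\log\Bigl(\frac{G_\varepsilon}{\tilde G_\varepsilon}\Bigr) - G_\varepsilon + \tilde G_\varepsilon\Bigr\rangle. \tag{65} \]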
Using the entropy inequality, we get
H{Ge,Ge) +e f tr(m e ) f dsD(G£)+ < H{G°e,G°e) Jn Jo
2 < Gedt\ogGe > +e2dt < gev > .u + e3dt < ge > '«L -^-ds
+
2 Jo Jn where m£ denotes the sequence of defect measures appearing in the conservation of momentum
1.4.3
The result
Theorem 1.8 Assume that (A0) holds. If $G_\varepsilon$ is a sequence of renormalized solutions of the Boltzmann equations with initial condition $G^0_\varepsilon$, satisfying (A1) and (A2) and such that $g^0_\varepsilon$ converges entropically to $g^0 = u^0\cdot v$, where $u^0 \in H^s$ ($s > \frac N2 + 1$), then, for all $0 \le t < T^*$,
\[ g_\varepsilon(t) \to u(t)\cdot v \quad\text{entropically} \tag{66} \]
where $u(t)$ is the unique solution of the Euler system in $L^\infty_{loc}([0,T^*); H^s)$ with the initial condition $u^0$. Moreover, the convergence is locally uniform in time.

We explain here the idea of the proof of the above result. It is based on a Gronwall lemma. Indeed, after some computation, we can rewrite the entropy inequality as follows
\\H(G£,Ge)+e
e•
+
L
f tr(m e )l +\ Jn
J
e
f dsD(Ge) < Jo
e
\H(G°£,G°£)
I ||Vu|| L o.i[tf(G ds e > G e ) + e [ tv(me)](s) £ L J Jo Jn '0
Hence, we deduce that $H(G_\varepsilon, \tilde G_\varepsilon)$ goes to 0 in $L^\infty_{loc}([0,T^*))$. We want to point out that the same type of argument can be used to prove the convergence towards the Navier-Stokes system in the case where a regular solution is known to exist.
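For reference, the integral form of the Gronwall lemma used in this kind of argument is recalled below. How it applies here is only a hypothetical reading: $y$ would stand for the rescaled relative entropy plus the defect term, $a$ for a multiple of $\|\nabla u\|_{L^\infty}$, and $A$ for the initial value plus a remainder that vanishes with $\varepsilon$.

```latex
% Gronwall lemma (integral form): y \ge 0 bounded, a \ge 0 integrable, A \ge 0 constant.
\[
y(t) \le A + \int_0^t a(s)\,y(s)\,ds \quad (0 \le t < T)
\qquad\Longrightarrow\qquad
y(t) \le A\,\exp\Bigl(\int_0^t a(s)\,ds\Bigr).
\]
% If A = A_\varepsilon \to 0 and a is locally integrable on [0, T^*),
% then y_\varepsilon \to 0 locally uniformly in time.
```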
References

[1] C. Bardos, F. Golse, and C. D. Levermore. Fluid dynamic limits of kinetic equations. II. Convergence proofs for the Boltzmann equation. Comm. Pure Appl. Math., 46(5):667-753, 1993.
[2] C. Bardos, F. Golse, and C. D. Levermore. Acoustic and Stokes limits for the Boltzmann equation. C. R. Acad. Sci. Paris Ser. I Math., 327(3):323-328, 1998.
[3] C. Bardos, F. Golse, and C. D. Levermore. The acoustic limit for the Boltzmann equation. Arch. Ration. Mech. Anal., 153(3):177-204, 2000.
[4] C. Bardos, F. Golse, and D. Levermore. Fluid dynamic limits of kinetic equations. I. Formal derivations. J. Statist. Phys., 63(1-2):323-344, 1991.
[5] C. Bardos and S. Ukai. The classical incompressible Navier-Stokes limit of the Boltzmann equation. Math. Models Methods Appl. Sci., 1(2):235-257, 1991.
[6] R. E. Caflisch. The fluid dynamic limit of the nonlinear Boltzmann equation. Comm. Pure Appl. Math., 33(5):651-666, 1980.
[7] C. Cercignani. The Boltzmann equation and its applications. Springer-Verlag, New York, 1988.
[8] C. Cercignani, R. Illner, and M. Pulvirenti. The mathematical theory of dilute gases. Springer-Verlag, New York, 1994.
[9] A. De Masi, R. Esposito, and J. L. Lebowitz. Incompressible Navier-Stokes and Euler limits of the Boltzmann equation. Comm. Pure Appl. Math., 42(8):1189-1214, 1989.
[10] R. J. DiPerna and P.-L. Lions. On the Cauchy problem for Boltzmann equations: global existence and weak stability. Ann. of Math. (2), 130(2):321-366, 1989.
[11] F. Golse. From kinetic to macroscopic models, preprint, 1998.
[12] F. Golse and C. D. Levermore. Stokes-Fourier and Acoustic Limits for the Boltzmann Equation: Convergence Proofs. Commun. Pure & Appl. Math. (submitted 2001).
[13] F. Golse, P.-L. Lions, B. Perthame, R. Sentis. Regularity of the Moments of the Solution of a Transport Equation. J. Funct. Anal. 76 (1988), 110-125.
[14] F. Golse, B. Perthame, R. Sentis. Un résultat de compacité pour les équations de transport et application au calcul de la limite de la valeur propre principale de l'opérateur de transport. C. R. Acad. Sci. Paris 301 (1985), 341-344.
[15] F. Golse and L. Saint Raymond. Navier-Stokes-Fourier Limit for the Boltzmann Equation: Convergence Proofs (preprint 2001).
[16] H. Grad. Principles of the Kinetic Theory of Gases, in Handbuch der Physik 12, S. Flügge ed., Springer-Verlag, Berlin, 1958, 205-294.
[17] D. Hilbert. Begründung der kinetischen Gastheorie. Math. Annalen 72 (1912), 562-577; English: Foundations of the Kinetic Theory of Gases, in Kinetic Theory 3, S. G. Brush (ed.), Pergamon Press, Oxford, 1972, 89-101.
[18] M. Lachowicz. On the initial layer and the existence theorem for the nonlinear Boltzmann equation. Math. Methods Appl. Sci., 9(3):342-366, 1987.
[19] O. Lanford III. Time evolution of large classical systems. Dynamical systems, theory and applications (Rencontres, Battelle Res. Inst., Seattle, Wash., 1974), pp. 1-111. Lecture Notes in Phys., Vol. 38, Springer, Berlin, 1975.
[20] C. D. Levermore and N. Masmoudi. From the Boltzmann Equation to an Incompressible Navier-Stokes-Fourier System, preprint 2001.
[21] P.-L. Lions. Mathematical Topics in Fluid Mechanics, Vol. 1: Incompressible Models. Oxford Lecture Series in Mathematics and its Applications, 3. Oxford Science Publications. The Clarendon Press, Oxford University Press, New York, 1996.
[22] P.-L. Lions and N. Masmoudi. Incompressible limit for a viscous compressible fluid. J. Math. Pures Appl. (9), 77(6):585-627, 1998.
[23] P.-L. Lions and N. Masmoudi. Une approche locale de la limite incompressible. C. R. Acad. Sci. Paris Ser. I Math., 329(5):387-392, 1999.
[24] P.-L. Lions and N. Masmoudi. From Boltzmann equations to incompressible fluid mechanics equation. I. Archive Rat. Mech. & Anal. 158(3):173-193, 2001.
[25] P.-L. Lions and N. Masmoudi. From Boltzmann equations to incompressible fluid mechanics equation. II. Archive Rat. Mech. & Anal. 158(3):195-211, 2001.
[26] T. Sideris. Formation of Singularities in Three Dimensional Compressible Fluids. Commun. Math. Phys. 101 (1985), 475-485.
[27] S. Ukai. The Incompressible Limit and the Initial Layer of the Compressible Euler Equation. J. Math. Kyoto Univ. 26 (1986), 323-331.
[28] S. Ukai and K. Asano. The Euler limit and initial layer of the nonlinear Boltzmann equation. Hokkaido Math. J., 12(3, part 1):311-332, 1983.
PIECEWISE PRIME RINGS

Gary F. Birkenmeier
Department of Mathematics, University of Louisiana at Lafayette, Lafayette, LA 70504-1010, U.S.A.
E-mail: gfb1127@louisiana.edu

and

Jae Keol Park
Department of Mathematics, Busan National University, Busan 609-735, Korea
E-mail: jkpark@hyowon.cc.pusan.ac.kr

0. Introduction

This paper presents a survey of results on the class of piecewise prime (or simply, PWP) rings. We begin with background material on several classes of rings such as the piecewise domains (PWD's), (quasi-) Baer rings, right PP rings, right p.q.-Baer rings, and rings with a generalized triangular matrix representation. This material provides the natural motivation for the concept of a PWP ring. After defining PWP rings, we provide examples which show that the class of PWP rings properly contains the class of PWD's. Next structural results are presented. Then PWP rings satisfying various finiteness conditions are described. Finally we consider the transfer of the PWP condition between a ring and some of its extensions. Throughout $R$ denotes an associative ring with unity.

1. Background

As in [GS], a ring $R$ is called a piecewise domain (or simply, PWD) if there is a complete set of primitive idempotents $\{e_1, \ldots, e_n\}$ such that $xy = 0$ implies either $x = 0$ or $y = 0$, whenever $x \in e_iRe_j$ and $y \in e_jRe_k$ for $1 \le i, j, k \le n$. Gordon and Small introduced this concept to extend and unify well-known results on hereditary Noetherian rings and hereditary semiprimary rings. The principal result of [GS] is the following.

Theorem 1.1. ([GS, Main Theorem]) Assume that $R$ is a PWD. Then
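\[ R \cong \begin{pmatrix} R_1 & R_{12} & \cdots & R_{1n} \\ 0 & R_2 & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R_n \end{pmatrix}, \]
where each $R_i$ is a prime PWD and each $R_{ij}$ is a left $R_i$- right $R_j$-bimodule. Furthermore
\[ R_i \cong \begin{pmatrix} D_1 & \cdots & D_{1n} \\ \vdots & \ddots & \vdots \\ D_{n1} & \cdots & D_n \end{pmatrix}, \]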
Similarly we define a set of right triangulating idempotents of $R$ using (i), $b_1 \in S_r(R)$, and $b_{k+1} \in S_r(c_kRc_k)$. From part (iii) of the above definition, a set of left (right) triangulating idempotents is a set of pairwise orthogonal idempotents. A set $\{b_1, \ldots, b_n\}$ of left (right) triangulating idempotents of $R$ is said to be complete if each $b_i$ is also semicentral reduced.

Definition 1.3. ([BHKP]) We say $R$ has a generalized triangular matrix representation if there exists a ring isomorphism
\[ \theta : R \to \begin{pmatrix} R_1 & R_{12} & \cdots & R_{1n} \\ 0 & R_2 & \cdots & R_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & R_n \end{pmatrix}, \]
where each $R_i$ is a ring with unity and $R_{ij}$ is a left $R_i$- right $R_j$-bimodule for $i < j$, and the matrices form a ring under usual matrix addition and multiplication. We say $R$ has a complete generalized triangular matrix representation if each $R_i$ is semicentral reduced.

Proposition 1.4. ([BHKP, Proposition 1.3]) A ring $R$ has a (respectively, complete) set of left triangulating idempotents if and only if $R$ has a (respectively, complete) generalized triangular matrix representation.

In [BHKP, Theorems 2.9 and 2.10], rings having a complete set of left triangulating idempotents are characterized, and the cardinality of a complete set of left triangulating idempotents is shown to be unique. This motivates the following definition: $R$ has triangulating dimension $n$, written Tdim$(R) = n$, if $R$ has a complete set of left triangulating idempotents with exactly $n$ elements. Note that $R$ is semicentral reduced if and only if Tdim$(R) = 1$. If $R$ has no complete set of left triangulating idempotents, then we say $R$ has infinite triangulating dimension, denoted Tdim$(R) = \infty$. Moreover [BHKP, Proposition 2.14] shows that a ring satisfying almost any finiteness condition has finite triangulating dimension, while [BHKP, Proposition 2.16] contracts the study of many well-known types of rings having finite triangulating dimension to the study of semicentral reduced rings of the corresponding type. The study of semicentral reduced rings has been initiated in [BKP3]. Complete sets of left triangulating idempotents are related to complete sets of primitive and centrally primitive idempotents in our next result.
Proposition 1.5. (i) ([BKP5, Lemma 1.5]) Let $\{e_1, \ldots, e_w\}$ be a complete set of primitive idempotents of a ring $R$. Then there exists a complete set of left triangulating idempotents $\{b_1, \ldots, b_n\}$ such that for each $b_i$, $1 \le i \le n$, there is a nonempty subset $A_i$ of $\{1, \ldots, w\}$ with $b_iR = \sum_{j \in A_i} e_jR$ and $\{A_i \mid i = 1, \ldots, n\}$ is a partition of $\{1, \ldots, w\}$.
(ii) ([BHKP, Proposition 2.20]) Let $\{b_1, \ldots, b_n\}$ be a complete set of left triangulating idempotents of a ring $R$. Then there exists a complete set of centrally primitive idempotents $\{c_1, \ldots, c_m\}$ such that for each $c_i$, $1 \le i \le m$, there is a nonempty subset $A_i$ of $\{1, \ldots, n\}$ with $c_iR = \sum_{j \in A_i} b_jR$ and $\{A_i \mid i = 1, \ldots, m\}$ is a partition of $\{1, \ldots, n\}$.

The second ingredient in the PWP concept is the notion of a quasi-Baer ring.

Definition 1.6. A ring $R$ is called (quasi-) Baer if the right annihilator of an (ideal) nonempty subset of $R$ is generated, as a right ideal, by an idempotent.

Important generalizations of the Baer and quasi-Baer concepts are given in the following definition.

Definition 1.7. A ring $R$ is called right (p.q.-Baer) PP if the right annihilator of a (principal right ideal) singleton set is generated, as a right ideal, by an idempotent.

The study of PP and Baer rings has its roots in Functional Analysis. The PP condition originated in the work of Rickart [R]. In generalizing the notion of a von Neumann algebra, he investigated C*-algebras with the property that the right annihilator of any element is generated, as a right ideal, by a projection (i.e., an idempotent $p$ such that $p = p^*$). To abstract the purely algebraic properties of a von Neumann algebra, Kaplansky defined the concept of a Baer ring in [K]. Clark in [Cl] defined quasi-Baer rings and used them to characterize when a finite dimensional algebra with unity over an algebraically closed field is isomorphic to a twisted matrix units semigroup algebra.
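On a small finite ring, Definitions 1.6 and 1.7 can be checked by brute force: compute every right annihilator and test whether it equals $eR$ for some idempotent $e$. The sketch below does this for the ring of $2\times 2$ upper triangular matrices over $\mathbb{Z}/2$ and for $\mathbb{Z}/n$. All helper names (`t2`, `zmod`, `is_baer`, `is_right_pp`) are hypothetical, and the encoding of ring elements as tuples is chosen only for illustration.

```python
from itertools import combinations, product


def t2(p):
    """2x2 upper triangular matrices over Z/p, stored as (a, b, c) for [[a, b], [0, c]]."""
    elems = list(product(range(p), repeat=3))

    def mul(x, y):
        a, b, c = x
        d, e, f = y
        return (a * d % p, (a * e + b * f) % p, c * f % p)

    return elems, mul, (0, 0, 0)


def zmod(n):
    """The ring Z/n, with elements stored as 1-tuples."""
    elems = [(k,) for k in range(n)]

    def mul(x, y):
        return (x[0] * y[0] % n,)

    return elems, mul, (0,)


def right_annihilator(subset, elems, mul, zero):
    """All y with x * y = 0 for every x in the subset."""
    return frozenset(y for y in elems if all(mul(x, y) == zero for x in subset))


def idempotent_generated(elems, mul):
    """All right ideals of the form eR with e an idempotent."""
    return {frozenset(mul(e, r) for r in elems) for e in elems if mul(e, e) == e}


def is_right_pp(elems, mul, zero):
    """Definition 1.7: annihilators of singletons are generated by idempotents."""
    ideals = idempotent_generated(elems, mul)
    return all(right_annihilator([x], elems, mul, zero) in ideals for x in elems)


def is_baer(elems, mul, zero):
    """Definition 1.6: annihilators of all nonempty subsets are generated by idempotents."""
    ideals = idempotent_generated(elems, mul)
    return all(
        right_annihilator(s, elems, mul, zero) in ideals
        for k in range(1, len(elems) + 1)
        for s in combinations(elems, k)
    )


print("T_2(Z/2) Baer:", is_baer(*t2(2)))
print("Z/4 right PP:", is_right_pp(*zmod(4)))
```

On these inputs the check reports the triangular matrix ring over $\mathbb{Z}/2$ as Baer, while $\mathbb{Z}/4$ fails even the right PP condition, because the annihilator of $2$ is $\{0, 2\}$ and the only idempotents are $0$ and $1$. The subset enumeration is exponential in the size of the ring, so the sketch is only usable for very small examples.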
For a survey on quasi-Baer rings, see [BKP2]. The right p.q.-Baer condition was defined in [BKP1] but its serious study began in [BKP4]. With these preliminaries behind us, we can now define a piecewise prime ring.

Definition 1.8. ([BHKP] and [BKP2, Definition 3.6]) A ring $R$ is called a piecewise prime ring (or simply a PWP ring), if there is a complete set of left triangulating idempotents $\{e_1, \ldots, e_n\}$ such that $xRy = 0$ implies either $x = 0$ or $y = 0$ whenever $x \in e_iRe_j$ and $y \in e_jRe_k$ for $1 \le i, j, k \le n$.

From the above definition and Proposition 1.5(i), it follows that every PWD is a PWP ring. Also since every prime ring is semicentral reduced, we see that every prime ring is a PWP ring. The following result brings together the key ingredients of the PWP concept.

Theorem 1.9. The following conditions are equivalent:
(i) $R$ is a PWP ring;
(ii) $R$ is a quasi-Baer ring with a complete set of left triangulating idempotents;
(iii) $R$ is a right p.q.-Baer ring with a complete set of left triangulating idempotents.

Proof. This result is a direct consequence of [BHKP, Theorem 4.11] and [BKP, p.q.-Baer, Theorem 3.7].

In [GS] several examples are provided to show that, although a right PP ring with a complete set of primitive idempotents is a PWD, not every PWD is a right PP ring. However every PWP ring (hence PWD) is right p.q.-Baer. Since the classes of right p.q.-Baer and quasi-Baer rings are quite extensive (see [BKP2] and [BKP4]) they provide a large source of examples. The following examples are PWP rings which fail to meet the various criteria of a PWD.

Example 1.10. The example of Zalesskii-Neroslavskii [ZN] or [CH, p. 179] provides a simple Noetherian ring $R$ which is not a domain and which has no idempotents except 0 and 1. Hence $R$ is a PWP ring which is not a PWD. Moreover, the ring $R$ is not a Baer ring.

Example 1.11. Let $R$ be the endomorphism ring of an infinite dimensional vector space over a field. Then $R$ is a prime Baer ring. Hence $R$ is a PWP ring. But it is not a PWD since it has no complete set of primitive idempotents.

2. Structures

In this section we describe various structural results for PWP rings.

Theorem 2.1. ([BHKP, Theorem 4.4]) Let $R$ be a quasi-Baer ring with Tdim$(R) = n$. Then $R = A \oplus B$ (ring direct sum) such that:
(i) $A = \bigoplus_{i=1}^{k} A_i$ is a direct sum of prime rings;
(ii) there exists a ring isomorphism
\[ \phi : B \to \begin{pmatrix} B_1 & B_{12} & \cdots & B_{1m} \\ 0 & B_2 & \cdots & B_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_m \end{pmatrix}, \]
where each $B_i$ is a prime ring, $B_{ij}$ is a left $B_i$- right $B_j$-bimodule, and $k + m = n$;
(iii) for each $i \in \{1, \ldots, m\}$ there exists $j \in \{1, \ldots, m\}$ such that either $B_{ij} \ne 0$ or $B_{ji} \ne 0$;
(iv) the rings $B_1, \ldots, B_m$ are uniquely determined by $B$ up to isomorphism (induced by an inner automorphism of $R$) and permutation;
(v) $B$ has exactly $m$ minimal prime ideals $P_1, \ldots, P_m$, $R$ has exactly $n$ minimal prime ideals of the form $A \oplus P_i$ or $C_i \oplus B$ where $C_i = \bigoplus_{j \ne i} A_j$ and these are mutually comaximal, $P(R) = P(B)$, and $(P(R))^m = 0$;
(vi) if $I$ is a minimal ideal of $R$, then either $I^2 \ne 0$ and $I \subseteq A_i$ for some $1 \le i \le k$, or $I^2 = 0$ and $\phi(I) \subseteq (B_{ij})$ for some $1 \le i \le m$ and $1 \le j \le m$, where $(B_{ij})$ is the set of $m$-by-$m$ matrices with entries from $B_{ij}$ in the $(i,j)$-th position and zero elsewhere.

Corollary 2.2. Let $R$ be a PWP ring with a complete set of primitive idempotents $E$. Let $C$ be any of the prime rings $A_i$ or $B_i$ of Theorem 2.1. Then $C$ has the form
\[ \begin{pmatrix} e_1Ce_1 & e_1Ce_2 & \cdots & e_1Ce_n \\ e_2Ce_1 & e_2Ce_2 & \cdots & e_2Ce_n \\ \vdots & \vdots & \ddots & \vdots \\ e_nCe_1 & e_nCe_2 & \cdots & e_nCe_n \end{pmatrix}, \]
where the $e_i$ are in $E$ and each $e_iCe_i$ is a prime ring whose only idempotents are 0 and $e_i$. In particular, if $R$ is a PWP semiperfect ring, then each $e_iCe_i$ is a prime local ring.

Observe that Theorem 2.1 provides a generalization of Theorem 1.1 and reduces the study of PWD's to the study of prime PWD's. Moreover, note that Example 1.10 provides a PWP ring with a complete set of primitive idempotents (in fact, $R_R$ is indecomposable) which is not a PWD. Several algebraic notions of dimension are defined in terms of prime ideals. Our next result makes this connection for triangulating dimension.

Theorem 2.3. ([BP, Theorem 3.4]) Let $R$ be a right p.q.-Baer ring. Then Tdim$(R) = n$ if and only if $R$ has exactly $n$ minimal prime ideals.

Corollary 2.4. ([BP, Corollary 3.6]) The following conditions are equivalent:
(i) $R$ is a semiprime PWP ring;
(ii) $R$ is a semiprime right p.q.-Baer ring with only finitely many minimal prime ideals;
(iii) $R$ is a finite direct sum of prime rings.
In particular, $R$ is a biregular ring with Tdim$(R) < \infty$ if and only if $R$ is a finite direct sum of simple rings.

Theorem 2.5. ([BP, Corollary 3.5]) The PWP property is a Morita invariant.

3. Finiteness Conditions

In this section we consider PWP rings which are either semiprimary or Noetherian. From [S], if $R$ is a right (or left) PP ring with no infinite set of orthogonal idempotents, then $R$ is a Baer ring. Thus every semiprimary right hereditary ring is a Baer ring. In [S], Small showed that a right perfect right PP ring is semiprimary. By different methods, Teply [T] showed that a right perfect right hereditary ring is semiprimary. We have the following proper generalization of the results of Small and Teply.

Theorem 3.1. For a right p.q.-Baer ring $R$, the following are equivalent:
(i) $R$ is right (or left) perfect;
(ii) $R$ is semiprimary;
(iii) $R/P(R)$ is semisimple Artinian, where $P(R)$ is the prime radical of $R$.

Proof. This result follows from Theorem 1.9 and [BKP5, Theorem 1.7].

The next example presents a semiprimary ring which is quasi-Baer (hence right and left p.q.-Baer) but not Baer (hence neither right nor left PP). Thereby it shows that Theorem 3.1 is a proper generalization of Small's result and Teply's result.

Example 3.2. ([BKP5, Example 1.8] or [PZ, p. 135]) For a field $F$, let $R = T_m(T_n(F))$ with $m, n > 1$, where $T_k(-)$ is the $k \times k$ upper triangular matrix ring. Then $R$ is a semiprimary quasi-Baer ring which is not Baer.

The following definition identifies an important subclass of the class of semiprimary rings.

Definition 3.3. Let $R$ have a generalized triangular matrix representation as indicated in Definition 1.3. If each $R_i$ is simple Artinian, then we say $R$
is a TSA ring. In [H] Harada shows that a semiprimary hereditary ring is a TSA ring.

Theorem 3.4. ([BKP5, Theorem 2.4]) If R is a semiprimary quasi-Baer ring, then R is a TSA ring.

Observe that Example 3.2 also shows that Theorem 3.4 is a proper generalization of Harada's result. However, we note that Harada gave further details on the components of the generalized triangular matrix representation. In [Ch] Chatters shows that if R is a left and right Noetherian hereditary ring, then R is a direct sum of prime rings and an Artinian ring. The next result was proved by methods different from Chatters', which employed the PWP concept. A more explicit description of the Artinian ring is also given.

Theorem 3.5. ([BKP5, Theorem 3.1]) If R is a left and right Noetherian hereditary ring, then R = A ⊕ B, where A is a finite direct sum of prime rings and B is an Artinian TSA ring.

4. Extensions

We describe, in this section, how the PWP condition passes between a ring R and several interesting types of extensions of R.

Proposition 4.1. ([BP, Proposition 1.7]) Let R = F[G] be a semiprime group algebra over a field F. Then R is quasi-Baer if and only if each annihilator ideal is finitely generated.

The following corollary provides an abundance of examples of group algebras which are PWP rings.

Corollary 4.2. ([BP, Corollary 1.9]) If R = F[G] is a semiprime group algebra over a field F satisfying any of the following conditions, then R is a finite direct sum of prime rings: (i) right Noetherian; (ii) DCC on annihilator ideals; (iii) ACC on annihilator ideals.

A monoid G is called a u.p.-monoid (unique product monoid) if for any two nonempty finite subsets A, B ⊆ G there exists an element x ∈ G uniquely presented in the form ab where a ∈ A and b ∈ B. The class of u.p.-monoids is quite large and important (see [O] and [P]). For example, this class includes the right or left ordered monoids, submonoids of a free group, and torsion-free nilpotent groups. Every u.p.-monoid is cancellative. In particular, group algebras of u.p.-groups are studied extensively in [P] in the investigation of the zero divisor problem.

Theorem 4.3. ([BP, Proposition 4.5(i)-(v)]) A ring R is quasi-Baer with Tdim(R) = n if and only if T is quasi-Baer with Tdim(T) = n, where T is any of the following extensions of R: (i) R[G], where G is a u.p.-monoid; (ii) R[X], where X is a nonempty set of not necessarily commuting indeterminates; (iii) R[[X]], where X is a nonempty set of not necessarily commuting indeterminates; (iv) $R[x, x^{-1}]$; (v) $R[[x, x^{-1}]]$.

Theorem 4.4. ([BP, Theorem 4.8]) Let R be a quasi-Baer ring with a complete set of left triangulating idempotents $B = \{b_1, \ldots, b_n\}$. If N is any of the following extensions of R, then N is a quasi-Baer ring with B determining a complete generalized triangular matrix representation for N in which each diagonal ring, $R_i$, is a prime ring: (i) R[G], where G is a u.p.-monoid; (ii) R[X], where X is a nonempty set of not necessarily commuting indeterminates; (iii) R[[X]], where X is a nonempty set of not necessarily commuting indeterminates; (iv) $R[x, x^{-1}]$; (v) $R[[x, x^{-1}]]$; (vi) $R[x; \alpha]$, where α is an automorphism such that α(bR) ⊆ bR for all b ∈ B; (vii) $R[[x; \alpha]]$, where α is an automorphism such that α(bR) ⊆ bR for all b ∈ B; (viii) $T_n(R)$.

Open Problems. (1) Characterize all PWP group algebras. (2) Enlarge the class of ring extensions of PWP rings which are also PWP rings.

Acknowledgments. The authors wish to express their gratitude to the organizers of the Third International Palestinian Conference on Mathematics and Mathematics Education, especially to Professor Mohammad Saleh, for their invitation and financial support. The second author was partially supported by the Korea Research Foundation with Research Grant Project No. DP0004 in 2000-2001.

REFERENCES
[B] G. F. Birkenmeier, Idempotents and completely semiprime ideals, Comm. Algebra 11 (1983) 567-580.
[BHKP] G. F. Birkenmeier, H. E. Heatherly, J. Y. Kim and J. K. Park, Triangular matrix representations, J. Algebra 230 (2000) 558-595.
[BKP1] G. F. Birkenmeier, J. Y. Kim and J. K. Park, A sheaf representation of quasi-Baer rings, J. Pure Appl. Algebra 146 (2000) 209-223.
[BKP2] G. F. Birkenmeier, J. Y. Kim and J. K. Park, On quasi-Baer rings, Algebras and Its Applications (D. V. Huynh, S. K. Jain and S. R. Lopez-Permouth (eds.)), Contemp. Math. 259, Amer. Math. Soc., Providence, 2000, 67-92.
[BKP3] G. F. Birkenmeier, J. Y. Kim and J. K. Park, Semicentral reduced algebras, The International Symposium on Ring Theory (edited by G. F. Birkenmeier, J. K. Park and Y. S. Park), Trends in Math., Birkhäuser, Boston, 2001, 67-84.
[BKP4] G. F. Birkenmeier, J. Y. Kim and J. K. Park, Principally quasi-Baer rings, Comm. Algebra, to appear.
[BKP5] G. F. Birkenmeier, J. Y. Kim and J. K. Park, Triangular matrix representations of semiprimary rings, Preprint.
[BP] G. F. Birkenmeier and J. K. Park, Triangular matrix representations of normalizing extensions, Preprint.
[Ch] A. W. Chatters, A decomposition theorem for Noetherian hereditary rings, Bull. London Math. Soc. 4 (1972) 125-126.
[CH] A. W. Chatters and C. R. Hajarnavis, Rings with Chain Conditions, Pitman, Boston, 1980.
[Cl] W. E. Clark, Twisted matrix units semigroup algebras, Duke Math. J. 34 (1967) 417-423.
[G] R. Gordon, Classical quotient rings of PWD's, Proc. Amer. Math. Soc. 36 (1972) 39-46.
[GS] R. Gordon and L. W. Small, Piecewise domains, J. Algebra 23 (1972) 553-564.
[H] M. Harada, Hereditary semi-primary rings and triangular matrix rings, Nagoya Math. J. 27 (1966) 463-484.
[K] I. Kaplansky, Rings of Operators, W. A. Benjamin, New York, 1968.
[O] J. Okniński, Semigroup Algebras, Marcel Dekker, New York, 1991.
[P] D. S. Passman, The Algebraic Structure of Group Rings, Wiley, New York, 1977.
[PZ] A. Pollingher and A. Zaks, On Baer and quasi-Baer rings, Duke Math. J. 37 (1970) 127-138.
[R] C. E. Rickart, Banach algebras with an adjoint operation, Ann. Math. 47 (1946) 528-550.
[S] L. W. Small, Semihereditary rings, Bull. Amer. Math. Soc. 73 (1967) 656-658.
[T] M. L. Teply, Right hereditary, right perfect rings are semiprimary, Advances in Ring Theory (eds. S. K. Jain and S. T. Rizvi), Trends in Math., Birkhäuser, Boston, 1997, 313-316.
[ZN] A. E. Zalesskii and O. M. Neroslavskii, There exists a simple Noetherian ring with divisors of zero but without idempotents (Russian), Comm. Algebra 5 (1977) 231-234.
ON THE EXISTENCE OF THE SOLUTION FOR THE EQUATIONS MODELLING CONTACT PROBLEMS

Nicolae POP
North University of Baia Mare, Department of Mathematics and Computer Science, Victoriei 76, 4800 Baia Mare, ROMANIA
Abstract. In this article we demonstrate the existence of a solution of the equation modelling the static problem of elastic contact, using Brezis' fundamental theorem on pseudo-monotone operators [4]. The existence of a solution of the equation modelling the dynamic contact problem with Coulomb friction is proved for a particular case, in which the material is viscoelastic with elastic behaviour and the displacement velocity is included in the bilinear form. We consider a linear viscoelastic material with short memory. The proof of the main result is based on the penalization and regularization methods.

MSC: 74M15, 74G40, 74H20, 74H30, 65N30

KEYWORDS: contact problem, variational inequality, penalty approximation, regularization, Faedo-Galerkin method.

1. THE DIFFERENTIAL AND THE VARIATIONAL FORMS OF THE STATIC CONTACT PROBLEM

Let us consider two elastic bodies which, at the moment $t = 0$, occupy disjoint domains $\Omega^1 \subset \mathbb{R}^d$ and $\Omega^2 \subset \mathbb{R}^d$, where $d = 2$ or $d = 3$. The boundaries of the bodies are given by
$$\partial\Omega^1 = \Gamma^1 = \Gamma^1_U \cup \Gamma^1_N \cup \Gamma^1_C \quad\text{and}\quad \partial\Omega^2 = \Gamma^2 = \Gamma^2_U \cup \Gamma^2_N \cup \Gamma^2_C,$$
which are open in the topological sense and pairwise disjoint, so that only $\Gamma^1_C$ and $\Gamma^2_C$ can have common points. The boundary values in displacement and in stress are given by $\bar u(t,x)$ on the boundary $\Gamma_U = \Gamma^1_U \cup \Gamma^2_U$ and by $h(t,x)$ on the boundary $\Gamma_N = \Gamma^1_N \cup \Gamma^2_N$, respectively. The contact boundary $\Gamma_C = \Gamma^1_C \cup \Gamma^2_C$ is considered to be stress-free at the beginning. We also define the vector $\sigma^{(n)}(u)$, oriented outwards from the boundary $\partial\Omega = \partial\Omega^1 \cup \partial\Omega^2$, and we assume that the initial displacement $u(0,x) = u_0(x)$ and the initial velocity $\dot u(0,x) = u_1(x)$ are known.

As long as the two bodies do not touch each other, the displacement field is the solution of a boundary value problem for the partial differential equations of elastodynamics. If the bodies touch each other, then forces arise in the contact areas which prevent their mutual penetration. The boundary condition which has to be formulated in this latter case is called the "contact condition". In addition, friction forces may appear on the contact area; they are described by a friction law.

The contact problem of two elastic bodies on a time interval $[0, t_E]$, $t_E > 0$, has the following differential form:
$$\rho\,\ddot u_i(t,x) - \sigma_{ij,j}(u(t,x)) = f_i(t,x) \quad\text{in } [0,t_E]\times\Omega, \tag{1.1}$$
where $\Omega = \Omega^1 \cup \Omega^2$. The boundary conditions are
$$u(t,x) = \bar u(t,x) \quad\text{on } [0,t_E]\times\Gamma_U \tag{1.2}$$
and
$$\sigma^{(n)}(u)(t,x) = h(t,x) \quad\text{on } [0,t_E]\times\Gamma_N. \tag{1.3}$$
The initial conditions are
$$u(0,x) = u_0(x) \quad\text{and}\quad \dot u(0,x) = u_1(x). \tag{1.4}$$
The contact condition has to contain both the condition of non-penetration of one body into the other (or of penetration according to a given law) and a correct description of the transmission of forces between the bodies. These processes must be formulated mathematically in such a way that they can be treated by variational methods.
Because of inherent difficulties, the contact condition is approximated by the Signorini condition [3]. Approximating the contact condition amounts to accounting for the assumptions of linear elasticity theory. We begin by parametrizing the two contact boundaries $\Gamma^1_C$ and $\Gamma^2_C$, which are assumed to be disjoint. To this end, we resort to two bijective mappings
$$x^{(1)} : P \to \Gamma^1_C \quad\text{and}\quad x^{(2)} : P \to \Gamma^2_C$$
of class $C^1$, defined on a domain $P$ whose dimension is $d-1$ for each contact boundary ($d = 2$ or $d = 3$). Consequently, at each point $x \in P$ one deals with:

- the normalized normal vector on $\Gamma^1_C$:
$$n(x) = \frac{x^{(2)}(x) - x^{(1)}(x)}{|x^{(2)}(x) - x^{(1)}(x)|}, \tag{1.5}$$
- the initial gap:
$$g(x) := |x^{(2)}(x) - x^{(1)}(x)|. \tag{1.6}$$

For a displacement field $u$ in $\Omega^1 \cup \Omega^2$ we define the relative displacement
$$u^R(x) := u^{(1)}(x^{(1)}(x)) - u^{(2)}(x^{(2)}(x)), \tag{1.7}$$
where $u^{(j)} = u|_{\partial\Omega^j}$, with $j = 1, 2$, denotes the trace of $u$ on $\partial\Omega^j$. The components of a vector field $v : P \to \mathbb{R}^d$ in the direction of $n$ (respectively perpendicular to $n$) are denoted by $v_N := v \cdot n$ (respectively $v_T := v - v_N n$). The condition of non-penetration of the bodies is understood as a geometric contact condition. It is approximated by the inequality
$$u^R_N(x,t) \le g(x). \tag{1.8}$$
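To make the geometric quantities (1.5)-(1.8) concrete, the following minimal Python sketch evaluates them for a single pair of parametrized contact points. The function names and the tolerance `tol` are illustrative assumptions and do not come from the original formulation; the normal is taken to point from body 1 towards body 2, as in (1.5).

```python
import math


def unit_normal_and_gap(x1, x2):
    """Return (n, g) for paired contact points x1 on body 1 and x2 on body 2."""
    diff = [b - a for a, b in zip(x1, x2)]
    g = math.sqrt(sum(c * c for c in diff))        # initial gap, cf. (1.6)
    if g == 0.0:
        raise ValueError("points coincide; the normal (1.5) is undefined")
    n = [c / g for c in diff]                      # unit normal, cf. (1.5)
    return n, g


def relative_normal_displacement(u1, u2, n):
    """Normal component of the relative displacement u^R = u1 - u2, cf. (1.7)."""
    return sum((a - b) * c for a, b, c in zip(u1, u2, n))


def is_non_penetrating(u1, u2, x1, x2, tol=1e-12):
    """Check the linearized contact condition u^R_N <= g, cf. (1.8)."""
    n, g = unit_normal_and_gap(x1, x2)
    return relative_normal_displacement(u1, u2, n) <= g + tol
```

For two points separated by a vertical gap of 0.1, a relative normal approach of 0.05 is admissible, whereas an approach of 0.2 violates (1.8).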
This inequality actually describes the contact condition if the points on the contact boundaries move in the direction $n(x)$. The contact condition must also correctly describe the transmission of forces between the bodies and fulfil the following requirements:
1°. Newton's balance of forces, i.e. the force $F_{12}$ exerted by the body $\Omega^1$ on the body $\Omega^2$ is opposite to the force $F_{21}$ which $\Omega^2$ exerts on $\Omega^1$;
2°. On the contact area only compressive forces can be transmitted;
3°. The forces can be transmitted only in the areas where the bodies touch.

Condition 1° means:
$$\sigma^{(n)}(x^{(1)}(x))\,J_1(x) = -\sigma^{(n)}(x^{(2)}(x))\,J_2(x) =: \sigma(x) \quad \forall x \in P, \tag{1.9}$$
where $J_k(x)$, $k = 1, 2$, are the determinants of the parametrization mappings. Condition 2° can be formulated as follows:
$$\sigma_N(x) \le 0 \quad \forall x \in P, \tag{1.10}$$
whereas condition 3° leads to
$$\sigma_N(x)\,\big(u^R_N(x) - g(x)\big) = 0 \quad \forall x \in P. \tag{1.11}$$
To summarise, the contact condition can be modelled as follows:
$$(\sigma^{(n)} \circ x^{(1)})\,J_1 = -(\sigma^{(n)} \circ x^{(2)})\,J_2 =: \sigma \tag{1.12}$$
and
$$u^R_N \le g; \quad \sigma_N \le 0; \quad \sigma_N\,(u^R_N - g) = 0. \tag{1.13}$$
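The three relations in (1.13) form a complementarity system: at every point either the gap is open and no normal stress is transmitted, or the gap is closed and the normal stress is compressive. A minimal pointwise check can be sketched as follows; the function name and the tolerance are assumptions made only for this illustration.

```python
def signorini_ok(u_rn, g, sigma_n, tol=1e-9):
    """Check u^R_N <= g, sigma_N <= 0 and sigma_N * (u^R_N - g) = 0 at one point."""
    gap_ok = u_rn <= g + tol                         # no penetration
    compressive = sigma_n <= tol                     # only pressure is transmitted
    complementary = abs(sigma_n * (u_rn - g)) <= tol  # stress only where bodies touch
    return gap_ok and compressive and complementary
```

An open gap with zero stress and a closed gap with compressive stress both pass; an open gap carrying stress, or a closed gap with tensile stress, does not.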
The variational form of eq. (1.13) is
$$u^R_N \le g, \qquad \sigma_N\,(v^R_N - u^R_N) \ge 0 \quad \forall\, v^R_N \le g.$$
The friction law that describes the dependence of the tangential stress on the normal stress as well as on the sliding speed is given by
$$\dot u^R_T = 0 \;\Longrightarrow\; |\sigma_T| \le \mathcal{F}(0)\,|\sigma_N|, \tag{1.14a}$$
and
$$\dot u^R_T \ne 0 \;\Longrightarrow\; \sigma_T = -\mathcal{F}(\dot u^R_T)\,|\sigma_N|\,\frac{\dot u^R_T}{|\dot u^R_T|}, \tag{1.14b}$$
where $\mathcal{F}$ denotes the friction coefficient, which depends on the speed $\dot u^R_T(x)$ with which the bodies slide along one another at the point $x$.

The differential (classical) formulation of the dynamic problem consists in finding the solution of the system of differential equations
$$\rho(x)\,\ddot u_i(t,x) - \sigma_{ij,j}(u(t,x)) = f_i(t,x) \quad\text{in } [0,t_E]\times\Omega$$
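with the boundary values:
$$u(t,x) = \bar u(t,x) \quad\text{on } [0,t_E]\times\Gamma_U,$$
$$\sigma^{(n)}(u(t,x)) = h(t,x) \quad\text{on } [0,t_E]\times\Gamma_N,$$
$$\left.\begin{aligned}
&(\sigma^{(n)} \circ x^{(1)})\,J_1 = -(\sigma^{(n)} \circ x^{(2)})\,J_2 =: \sigma,\\
&u^R_N \le g, \quad \sigma_N \le 0, \quad \sigma_N\,(u^R_N - g) = 0,\\
&\dot u^R_T = 0 \;\Longrightarrow\; |\sigma_T| \le \mathcal{F}(0)\,|\sigma_N|,\\
&\dot u^R_T \ne 0 \;\Longrightarrow\; \sigma_T = -\mathcal{F}(\dot u^R_T)\,|\sigma_N|\,\frac{\dot u^R_T}{|\dot u^R_T|}
\end{aligned}\right\} \quad\text{on } [0,t_E]\times P,$$
and the initial conditions:
$$u(0,x) = u_0(x), \qquad \dot u(0,x) = u_1(x).$$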
A correct physical interpretation of the static contact problem is possible only if it is considered as an incremental step of a temporal discretization of the problem. The initial problem is then equivalent to the iterative determination of the displacement $u$ at a given moment $t$, after the temporal derivatives $\ddot u$ and $\dot u$ have been approximated by finite differences. To simplify the solution of this static contact problem, we transform it into a boundary value problem and a contact problem with homogeneous boundary values, except on the contact boundary. Here we restrict ourselves to the second problem.

In the variational formulation, the stresses on the contact boundaries are defined as functionals in $H^{-1/2}(\Gamma^1_C, \mathbb{R}^d) \times H^{-1/2}(\Gamma^2_C, \mathbb{R}^d)$, and we take
$$f \in H_0^{-1}(\Omega^1, \mathbb{R}^d) \times H_0^{-1}(\Omega^2, \mathbb{R}^d), \quad u \in H^1(\Omega^1, \mathbb{R}^d) \times H^1(\Omega^2, \mathbb{R}^d)$$
and
$$h \in H^{-1/2}(\Gamma^1_N, \mathbb{R}^d) \times H^{-1/2}(\Gamma^2_N, \mathbb{R}^d).$$
The variational formulation of the contact problem with homogeneous boundary values, except on the contact boundary, consists in finding $u \in K$ such that, for all $v \in K$,
$$\langle \rho u, v - u \rangle_\Omega + a(u, v - u) + j(u, v) - j(u, u) \ge -\langle H_T, v^R_T - u^R_T \rangle_P, \tag{1.15}$$
where
$$a(u, v) = \int_\Omega \sigma_{ij}(u)\,e_{ij}(v)\,dx$$
describes the deformation energy,
$$j(u, v) = \int_P \mathcal{F}(u^R_T)\,|\sigma_N(u)|\,|v^R_T|\,ds$$
is the functional which describes the influence of friction, $H_T$ is the tangential stress obtained from the boundary value problem, and
$$K := \{\, v \in H^1(\Omega^1, \mathbb{R}^d) \times H^1(\Omega^2, \mathbb{R}^d) \;:\; v = 0 \text{ on } \Gamma_U \text{ and } v^R_N \le g \text{ on } P \,\}$$
is the set of admissible functions. The contact condition can also be approximated by the penalty method, in which case the penalty functional is
$$\Phi_\delta(u, v) = \int_P \frac{1}{\delta}\,[u^R_N - g]_+\, v^R_N\, ds,$$
where $[\cdot]_+$ denotes the positive part. The functional that models the friction reads:
$$j_\delta(u, v) = \int_P \mathcal{F}(u^R_T)\,\frac{1}{\delta}\,[u^R_N - g]_+\,|v^R_T|\, ds_x.$$
The next step is to approximate the variational inequality (1.15) by a variational equation, i.e. to approximate the modulus $|v^R_T|$ in the friction functional by a differentiable convex function $\phi_\varepsilon$ depending on a parameter $\varepsilon$ and fulfilling the conditions
$$|\operatorname{grad}\phi_\varepsilon(x)| \le 1, \qquad \big|\phi_\varepsilon(x) - |x|\big| \le \varepsilon \qquad \forall x \in \mathbb{R}^d.$$
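One function that meets both requirements is $\phi_\varepsilon(x) = \sqrt{|x|^2 + \varepsilon^2}$; this particular choice is an assumption made for illustration, since the text does not fix a specific $\phi_\varepsilon$. The sketch below implements it together with its gradient and the resulting regularized tangential stress $-\mathcal{F}\,|\sigma_N|\,\operatorname{grad}\phi_\varepsilon(u_T)$, with a constant friction coefficient `friction` as a further simplifying assumption.

```python
import math


def phi_eps(x, eps):
    """Smooth convex approximation sqrt(|x|^2 + eps^2) of the norm |x| (eps > 0)."""
    return math.sqrt(sum(c * c for c in x) + eps * eps)


def grad_phi_eps(x, eps):
    """Gradient x / phi_eps(x); its Euclidean norm is strictly below 1."""
    value = phi_eps(x, eps)
    return [c / value for c in x]


def regularized_friction_stress(u_t, sigma_n, friction, eps):
    """Regularized Coulomb law: sigma_T = -friction * |sigma_N| * grad phi_eps(u_T)."""
    return [-friction * abs(sigma_n) * c for c in grad_phi_eps(u_t, eps)]
```

Since $0 < \sqrt{|x|^2+\varepsilon^2} - |x| \le \varepsilon$, the approximation error is largest at $x = 0$, where the unregularized law is multivalued, and the regularized stress never exceeds the Coulomb bound $\mathcal{F}\,|\sigma_N|$.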
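So, the variational equation that approximates the variational inequality (1.15) is given by:
$$\langle \rho u, v \rangle_\Omega + a(u, v) + \Phi_\delta(u, v) + \psi_{\delta,\varepsilon}(u, v) = -\langle H_T, v^R_T \rangle_P, \tag{1.16}$$
where
$$\psi_{\delta,\varepsilon}(u, v) = \lim_{\lambda \to 0} \frac{1}{\lambda}\big(j_{\delta,\varepsilon}(u, u + \lambda v) - j_{\delta,\varepsilon}(u, u)\big) = \int_P \mathcal{F}(u^R_T)\,\frac{1}{\delta}\,[u^R_N - g]_+\,\operatorname{grad}\phi_\varepsilon(u^R_T) \cdot v^R_T\, ds_x$$
corresponds to the Coulomb friction law, which in turn reads:
$$\sigma_T(u) = -\mathcal{F}\,|\sigma_N(u)|\,\operatorname{grad}\phi_\varepsilon(u^R_T).$$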
This corresponds to the regularization of the friction law through the functions $\phi_\varepsilon$.

2. THE EXISTENCE OF A SOLUTION TO THE OPERATORIAL EQUATION MODELLING THE STATIC CONTACT PROBLEM

After the preliminaries above, we formulate the fundamental theorem on pseudo-monotone operators. To do this, let $X$ be a real, reflexive and separable Banach space, $A : X \to X^*$ a nonlinear operator and $f \in X^*$ a given functional. Further, $\{w_1, w_2, \ldots\}$ denotes a Galerkin basis, that is, a set of linearly independent elements fulfilling the condition
$$\overline{\bigcup_{k=1}^{\infty} X_k} = X, \quad\text{where } X_k = \operatorname{span}\{w_1, w_2, \ldots, w_k\}$$
is the space generated by the first $k$ vectors of the basis. We will analyze the operatorial equation
$$Au = f, \tag{2.1}$$
where $u \in X$, as well as the corresponding Galerkin equation
$$\langle A u_k, w_j \rangle = \langle f, w_j \rangle, \quad j = 1, \ldots, k, \quad\text{for which}\quad u_k = \sum_{j=1}^{k} c_j^{(k)} w_j \in X_k. \tag{2.2}$$
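As a toy illustration of the Galerkin equation (2.2), the sketch below treats the linear model problem $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, for which $\langle Au, w \rangle = \int_0^1 u'w'\,dx$. Both the model problem and the sine basis $w_j(x) = \sin(j\pi x)$ are assumptions chosen for this example only; they are not the contact operator of the paper. Because this basis is orthogonal with respect to $\langle A\cdot,\cdot \rangle$, the system (2.2) is diagonal and each coefficient follows from one division.

```python
import math


def galerkin_coefficients(k):
    """Coefficients c_j of u_k = sum_j c_j sin(j*pi*x) solving (2.2) for -u'' = 1."""
    coeffs = []
    for j in range(1, k + 1):
        stiffness = (j * math.pi) ** 2 / 2.0                   # <A w_j, w_j> = int_0^1 (w_j')^2 dx
        load = (1.0 - math.cos(j * math.pi)) / (j * math.pi)   # <f, w_j> = int_0^1 sin(j*pi*x) dx
        coeffs.append(load / stiffness)
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the Galerkin approximation u_k at the point x."""
    return sum(c * math.sin((j + 1) * math.pi * x) for j, c in enumerate(coeffs))
```

The exact solution is $u(x) = x(1-x)/2$, so $u(1/2) = 1/8$; the Galerkin approximations $u_k(1/2)$ approach this value as $k$ grows, which mirrors on a trivial case the convergence statement of the theorem below.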
Theorem 2.1 (The fundamental theorem of Brezis on pseudo-monotone operators [4]). Let $X$ be a real, reflexive and separable Banach space, and $A : X \to X^*$ a pseudo-monotone, continuous and coercive operator. Then for every $f \in X^*$:
1°. The Galerkin equation (2.2) has at least one solution $u_k$ for every $k \in \mathbb{N}$.
2°. If $(u_{k'})_{k'}$ is a weakly convergent subsequence of the sequence of solutions $(u_k)_k$ from point 1°, then its weak limit is a solution of the operatorial equation (2.1), as shown in [4].

The next step consists in showing that equation (1.16) satisfies the conditions of Theorem 2.1. For this purpose we define the operators
$$\langle Pu, v \rangle = \langle \rho u, v \rangle_\Omega + a(u, v), \qquad \langle Qu, v \rangle = \Phi_\delta(u, v) \qquad\text{and}\qquad \langle Ru, v \rangle = \psi_{\delta,\varepsilon}(u, v), \qquad \forall\, u, v \in V.$$
This opens the way to Lemma 2.1.

Lemma 2.1. If $X$ is a reflexive Banach space, $A : X \to X^*$ is a strongly monotone operator and $B : X \to X^*$ is a completely continuous operator, then the operator $T := A + B$ is pseudo-monotone [1].

The operators $P$, $Q$ and $R$ have the following properties.

Lemma 2.2. Under the assumptions of Lemma 2.1 and Theorem 2.1, we have:
1. $P : V \to V^*$ is a linear, continuous and elliptic operator;
2. $Q : V \to V^*$ is a continuous, monotone and Lipschitz continuous operator;
3. $R : V \to V^*$ is a completely continuous operator.
The proof can be found in [2]. With these results, according to Lemma 2.1, the operator $A := P + Q + R$ fulfils the conditions of Theorem 2.1, that is, $A$ is a pseudo-monotone, continuous and coercive operator. This shows the existence of a solution of the operatorial equation $Au = f$.

3. THE DIFFERENTIAL AND THE VARIATIONAL FORMS OF THE DYNAMIC CONTACT PROBLEM

Let us consider a domain $\Omega \subset \mathbb{R}^d$, where $d = 2$ or $d = 3$, whose boundary $\Gamma = \partial\Omega = \Gamma_U \cup \Gamma_N \cup \Gamma_C$ is Lipschitz and consists of nonintersecting measurable parts $\Gamma_C$, $\Gamma_N$ and $\Gamma_U$. The boundary values in displacement and in stress are given by $U_i(t,x)$ on the boundary $\Gamma_U$ and by $h_i(t,x)$, $i = 1, \ldots, d$, on the boundary $\Gamma_N$, respectively. The boundary $\Gamma_C$ is the contact boundary with a rigid foundation. The stress-strain relation is given by
$$\sigma_{ij} = \sigma_{ij}(u, \dot u) = C^{0}_{ijkl}\,e_{kl}(u) + C^{1}_{ijkl}\,e_{kl}(\dot u), \tag{3.1}$$
where the coefficients $C^{0}_{ijkl}$ and $C^{1}_{ijkl}$ are symmetric, bounded and coercive. Moreover, the domain $\Omega$ is assumed to fulfil the following requirements:
a) $\Omega$ is either bounded with $\Gamma = \partial\Omega$ of Lipschitz type, or unbounded with $\Gamma = \partial\Omega$ of class $C^1$ and representable through an atlas with a finite number of maps.
b) $\Omega$ is either a domain with $\Gamma_U \subset \Gamma = \partial\Omega$, or $\Omega$ has the shape of an infinite strip $\mathbb{R}^{d-1} \times (0,1)$ and $\Gamma_U$ contains a part of $\mathbb{R}^{d-1} \times \{0\}$.

The classical formulation of the dynamic contact problem with Coulomb friction is given by:
$$\ddot u_i - \sigma_{ij,j}(u, \dot u) = f_i \quad\text{in } Q_T := I_T \times \Omega, \tag{3.2}$$
$$u_i = U_i \quad\text{on } S_U := I_T \times \Gamma_U, \tag{3.3}$$
$$\sigma^{(n)}_i(u, \dot u) = h_i \quad\text{on } S_N := I_T \times \Gamma_N, \tag{3.4}$$
$$\left.\begin{aligned}
&\dot u_N \le 0, \quad \sigma_N \le 0, \quad \dot u_N\,\sigma_N = 0,\\
&\dot u_T = 0 \;\Longrightarrow\; |\sigma_T| \le \mathcal{F}(0)\,|\sigma_N|,\\
&\dot u_T \ne 0 \;\Longrightarrow\; \sigma_T = -\mathcal{F}(\dot u_T)\,|\sigma_N|\,\frac{\dot u_T}{|\dot u_T|}
\end{aligned}\right\} \quad\text{on } S_C := I_T \times \Gamma_C, \tag{3.5}$$
and
$$u_i(0, x) = \dot u_i(0, x) = 0 \quad\text{in } \Omega. \tag{3.6}$$
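Here $\sigma_N$ denotes the normal boundary stress, $\sigma_T$ denotes the tangential stress, $u_N$ and $u_T$ have the analogous meaning for the displacement, $\mathcal{F}$ is the friction coefficient depending on the displacement velocity, and $I_T = [0, T]$ is the observed time interval. The problem (3.2)-(3.6) has a weak formulation in terms of a variational inequality. Let $I \subset \mathbb{R}$ be an interval and $W$ a Banach space; we denote by $B_0(I, W)$ the set of bounded functions on $I$ with values in $W$, with the norm
$$\|u\|_{B_0(I,W)} := \sup_{t \in I} \|u(t)\|_W.$$
The set of admissible functions is
$$K = \{\, v \in L_2(I_T, H^1(\Omega, \mathbb{R}^d)) \;|\; v = \dot U \text{ on } S_U \text{ and } v_N \le 0 \text{ on } S_C \,\}.$$
The variational inequality is: find $u$ with $\dot u \in K \cap B_0(I_T, L_2(\Omega, \mathbb{R}^d))$ and $u(0, x) = \dot u(0, x) = 0$ such that, for all $v \in K$,
$$\int_{I_T} \Big\{ \langle \ddot u, v - \dot u \rangle_\Omega + a^{(1)}(u, v - \dot u) + a^{(2)}(\dot u, v - \dot u) + \big\langle \mathcal{F}(\dot u_T)\,|\sigma_N(u, \dot u)|,\; |v_T| - |\dot u_T| \big\rangle_{\Gamma_C} \Big\}\,dt \;\ge\; \int_{I_T} \langle L, v - \dot u \rangle\,dt, \tag{3.7}$$
with the bilinear forms
$$a^{(1)}(u, v) = \int_\Omega C^{0}_{ijkl}\,e_{ij}(u)\,e_{kl}(v)\,dx, \qquad a^{(2)}(u, v) = \int_\Omega C^{1}_{ijkl}\,e_{ij}(u)\,e_{kl}(v)\,dx$$
and the linear functional
$$\langle L, v \rangle = \langle f, v \rangle_\Omega + \langle h, v \rangle_{\Gamma_N}.$$
The problems (3.2)-(3.6) and (3.7) are equivalent.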
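4. THE PENALTY APPROXIMATION AND THE REGULARIZATION OF THE FRICTION FUNCTIONAL

The first step in our approach is the penalty approximation of the contact condition, obtained by prescribing the normal component of the boundary stress:
$$\sigma_N(u, \dot u) = -\frac{1}{\delta}\,[\dot u_N]_+.$$
The new set of admissible functions is
$$K = \{\, v \in L_2(I_T, H^1(\Omega, \mathbb{R}^d)) \;|\; v = 0 \text{ on } S_U \,\}$$
and the variational inequality for the penalized problem reads as follows: find $u$ such that
$$\int_{I_T} \Big\{ \langle \ddot u, v - \dot u \rangle_\Omega + a^{(1)}(u, v - \dot u) + a^{(2)}(\dot u, v - \dot u) + \Big\langle \frac{1}{\delta}[\dot u_N]_+,\; v_N - \dot u_N \Big\rangle_{\Gamma_C} + \Big\langle \mathcal{F}(\dot u_T)\,\frac{1}{\delta}[\dot u_N]_+,\; |v_T| - |\dot u_T| \Big\rangle_{\Gamma_C} \Big\}\,dt \;\ge\; \int_{I_T} \langle L, v - \dot u \rangle\,dt. \tag{4.1}$$
In (4.1) we replace the modulus function $|\cdot|$ by a convex function $\phi_\varepsilon(\cdot)$ that fulfils the requirements $\big|\phi_\varepsilon(v) - |v|\big| \le \varepsilon$ and $|\operatorname{grad}\phi_\varepsilon(v)| \le 1$. In order to transform the variational inequality into a variational equality, we insert the test function $v = \dot u + \lambda w$, divide by $\lambda$ and pass to the limit $\lambda \to 0$. We obtain: find $u$ with $\dot u \in (\dot U + V) \cap B_0(I_T, L_2(\Omega, \mathbb{R}^d))$ and $u(0, x) = \dot u(0, x) = 0$ such that, for all $v \in K$,
$$\int_{I_T} \Big\{ \langle \ddot u, v \rangle_\Omega + a^{(1)}(u, v) + a^{(2)}(\dot u, v) + \Big\langle \frac{1}{\delta}[\dot u_N]_+,\; v_N \Big\rangle_{\Gamma_C} + \Big\langle \mathcal{F}(\dot u_T)\,\frac{1}{\delta}[\dot u_N]_+\,\operatorname{grad}\phi_\varepsilon(\dot u_T),\; v_T \Big\rangle_{\Gamma_C} \Big\}\,dt = \int_{I_T} \langle L, v \rangle\,dt. \tag{4.2}$$
The problems (4.1) and (4.2) are equivalent.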
5. THE EXISTENCE OF A SOLUTION OF THE EQUATION MODELLING THE DYNAMIC CONTACT PROBLEM

The existence of a solution of this problem can be proved using the usual Faedo-Galerkin method.

Proposition 1. Let the domain $\Omega$ fulfil the requirements a) and b) from Section 3, and let $L \in V^*$ and $U \in H^2(I_T, L_2(\Omega, \mathbb{R}^d)) \cap H^1(I_T, H^1(\Omega, \mathbb{R}^d))$ with $U = 0$ on $S_C$ and $U(0, x) = \dot U(0, x) = 0$ in $\Omega$. Assume that the friction coefficient $\mathcal{F} = \mathcal{F}(x, \dot u_T)$ is bounded, has compact support with respect to $x$,
$$S_{\mathcal{F}} := \operatorname{supp}_{(x)}(\mathcal{F}) = \{\, (t, x) \in S_C \;|\; \exists\, w_T \text{ s.t. } \mathcal{F}(x, w_T) \ne 0 \,\},$$
and fulfils the Carathéodory condition. Then the variational equality (4.2) has at least one solution depending on $\delta$ and $\varepsilon$.

Proof. Let $\{w_1, w_2, \ldots\}$ be a Galerkin basis of the space $V_0 = \{\, v \in H^1(\Omega, \mathbb{R}^d) \;|\; v = 0 \text{ on } \Gamma_U \,\}$. In accordance with the Faedo-Galerkin method, we seek a solution of the variational equality (4.2) in the form
$$u^{(m)}(t, x) = U(t, x) + \sum_{j=1}^{m} c_j^{(m)}(t)\,w_j(x),$$
with $t \in I_T$, $x \in \Omega$, $m \in \mathbb{N}$. We find the functions $c_j^{(m)}(t)$ from the Faedo-Galerkin equation:
$$\langle \ddot u^{(m)}(t), w_j \rangle_\Omega + a^{(1)}(u^{(m)}(t), w_j) + a^{(2)}(\dot u^{(m)}(t), w_j) + \Big\langle \frac{1}{\delta}[\dot u^{(m)}_N(t)]_+,\; w_{j,N} \Big\rangle_{\Gamma_C} + \Big\langle \mathcal{F}(\dot u^{(m)}_T)\,\frac{1}{\delta}[\dot u^{(m)}_N]_+\,\operatorname{grad}\phi_\varepsilon(\dot u^{(m)}_T),\; w_{j,T} \Big\rangle_{\Gamma_C} = \langle L, w_j \rangle. \tag{5.1}$$
Using the orthogonality of the functions $w_j$, we obtain a system of second-order ordinary differential equations:
$$\|w_j\|^2_{L_2(\Omega)}\,\ddot c_j^{(m)}(t) = G_j(t, c(t), \dot c(t)), \qquad j = 1, 2, \ldots, m, \tag{5.2}$$
and from Peano's theorem it follows that the differential system (5.1) has at least one solution for each $t \in I_T$, q.e.d.
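Peano's theorem yields existence only; in practice a system of the form (5.2) is integrated numerically. The sketch below is a minimal illustration for the scalar case $m = 1$ with $\|w_1\| = 1$, using the classical fourth-order Runge-Kutta scheme on the equivalent first-order system. The right-hand side `example_rhs`, with its spring term, penalty term `max(v, 0)/delta` and regularized friction term, is a hypothetical stand-in for $G_j$ and is not taken from the paper.

```python
import math


def rk4_second_order(G, c0, v0, t_end, steps):
    """Integrate c'' = G(t, c, c') on [0, t_end] with classical RK4 (scalar case)."""
    h = t_end / steps
    t, c, v = 0.0, c0, v0
    for _ in range(steps):
        k1c, k1v = v, G(t, c, v)
        k2c, k2v = v + 0.5 * h * k1v, G(t + 0.5 * h, c + 0.5 * h * k1c, v + 0.5 * h * k1v)
        k3c, k3v = v + 0.5 * h * k2v, G(t + 0.5 * h, c + 0.5 * h * k2c, v + 0.5 * h * k2v)
        k4c, k4v = v + h * k3v, G(t + h, c + h * k3c, v + h * k3v)
        c += h * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return c, v


def example_rhs(t, c, v, delta=0.1, eps=0.01, friction=0.3, load=1.0):
    """Hypothetical G: load - stiffness term - penalty - regularized friction."""
    penalty = max(v, 0.0) / delta                  # (1/delta) * [v]_+
    reg_sign = v / math.sqrt(v * v + eps * eps)    # grad phi_eps(v) in one dimension
    return load - c - penalty - friction * penalty * reg_sign
```

An explicit scheme is adequate here only because `delta` is moderate; for very small `delta` the penalty term makes the system stiff and an implicit integrator would be the safer choice.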
REFERENCES
[1] C. Eck, Existenz und Regularität der Lösungen für Kontaktprobleme mit Reibung, Dissertation, Univ. Stuttgart, 1996.
[2] N. Pop, Applications of the Variational Inequalities in the Elastic Contact Problems, CUB PRESS 22, Baia Mare, 1998.
[3] A. Signorini, Sopra alcune questioni di elastostatica, Atti Soc. Ital. Progr. Sci., 1933.
[4] E. Zeidler, Nonlinear Functional Analysis and its Applications II/B: Nonlinear Monotone Operators, Springer-Verlag, New York - Berlin - Heidelberg - Tokyo, 1990.
[5] J. Jarušek, Contact problems with given time-dependent friction force in linear viscoelasticity, Comm. Math. Univ. Carolinae 31, pp. 257-262, 1990.
[6] J. Jarušek, C. Eck, Dynamic contact problems with friction in linear viscoelasticity, C. R. Acad. Sci. Paris, t. 322, Série I, pp. 497-502, 1996.
Applications of Stochastic Differential Equations to Ecological Data

By Haiganoush K. Preisler
USDA Forest Service, Pacific Southwest Research Station, Albany, CA, USA
Email: hpreisler@fs.fed.us

Summary

Ecologists, like physicists, are often interested in studying factors that might affect the movement patterns of particles or waves. Examples include studies on the effects of environmental or topographic conditions on the propagation of wildfires, the effect of roads on the movement patterns of free-ranging animals, or the effect of male pheromones on the chemotaxis of female beetles. One useful statistical approach for the analysis of movement data is that of stochastic differential equations (SDE). In this manuscript I describe some of the studies I have been involved with where SDEs were used to study the effects of explanatory variables, such as environmental factors, on movement.
Introduction

A stochastic process $Y = \{Y(t,\omega),\ t \ge 0,\ \omega \in \Omega\}$ is a collection of random variables $Y(t,\omega)$, where a random variable is defined as a function that assigns a real value to each outcome $\omega$ in the sample space $\Omega$. The variable $t$ is often interpreted as time and $Y(t,\omega)$ is called the state of the process at time $t$. A stochastic differential equation (SDE) is given by
$$dY(t,\omega) = \mu(Y(t,\omega);\theta)\,dt + \sigma(Y(t,\omega);\theta)\,dB(t,\omega) \qquad [1]$$
where $Y(t,\omega)$ is a random variable, $\{B(t,\omega),\ t \ge 0\}$ is a random process, and $\theta$ is a set of parameters, some known and some unknown. The SDE in equation [1] is interpreted as
$$Y(t) - Y(0) = \int_0^t \mu(Y(s))\,ds + \int_0^t \sigma(Y(s))\,dB(s) \qquad [2]$$
where the integrals exist in probability (Øksendal 1992). The Brownian or Wiener process, a particular random process, is a useful tool for describing motion and building complicated stochastic processes. A Brownian motion process, $\{B(t),\ t \ge 0\}$, is a stochastic process with the following properties: (i) $B(0) = 0$; (ii) $\{B(t),\ t \ge 0\}$ has stationary independent increments; (iii) for each $t > 0$, $B(t)$ is normally distributed with mean 0 and variance $\sigma^2 t$. The process is named after the 19th century English botanist Robert Brown, who observed that particles of pollen immersed in water did not settle down but kept moving in an unpredictable manner. Einstein gave the first physical explanation of the Brownian motion phenomenon in 1905. Norbert Wiener presented a concise mathematical definition of the process underlying Brownian motion in a series of papers in 1918. Since then it has been used to analyze data from such areas as quantum physics, the stock market, and animal movement. For example, the Black-Scholes formula
$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dB(t) \qquad [3]$$
has been used to study the behavior of the value of a stock, $S(t)$, over time. The SDE in [3] has also been used to model population growth, where $S(t)$ is assumed to be the size of a population at time $t$ and the random process $r(t) = \mu + \sigma B(t)$ the instantaneous rate of growth. Solving the SDE in [3], one arrives at the formula for the population size, or stock value, at time $t$, given (for $S(0) = 1$) by
$$S(t) = \exp\big[(\mu - \tfrac{1}{2}\sigma^2)\,t + \sigma B(t)\big].$$

One derivation of the Brownian motion is as a limit of a random walk. Suppose that in each small time interval $\Delta t$ a particle takes a step of size $\Delta y$ either to the left or to the right with equal probability. If $Y(t)$ is the location of the particle at time $t$, we have $\Pr[Y(t) = Y(t - \Delta t) + \Delta y] = \Pr[Y(t) = Y(t - \Delta t) - \Delta y] = 0.5$, the expected value $E[Y(t)] = 0$ and the variance $\operatorname{var}[Y(t)] = (\Delta y)^2\, t/\Delta t$. Next, if we let $\Delta y = \sigma\sqrt{\Delta t}$ for some $\sigma > 0$ and $\Delta t \to 0$, then $\operatorname{var}[Y(t)] \to \sigma^2 t$ and $Y(\cdot) \to B(\cdot)$ in distribution. It is of interest to note that while $B(t)$ has continuous paths, i.e. with probability 1 $B(t)$ is a continuous function of $t$, it is nowhere differentiable (Fig. 1).

Figure 1: Two simulated Brownian motion paths.

Also, given the relationship $\Delta y = \sigma\sqrt{\Delta t}$, it can be seen that the velocity of a particle in Brownian motion is not defined, because $\Delta y/\Delta t = \sigma/\sqrt{\Delta t} \to \infty$ with probability 1. In other words, the
Brownian motion model is not adequate for very small time intervals. Finally, note that the values of integrals like the ones in equation [2] above are not the same as the corresponding values for non-random functions. For example, for standard Brownian motion ($\sigma = 1$),
$$X(t) = \int_0^t B(s)\,dB(s) = \frac{B(t)^2}{2} - \frac{t}{2},$$
while standard integration by parts yields $X(t) = B(t)^2/2$ (Guttorp 1995).

Some Estimation Techniques

In practice one is usually interested in estimating the parameters of a process (e.g., $\theta$ in equation [1] or $\mu$ and $\sigma$ in equation [3]) given observations at discrete time points, $y_{t_1}, y_{t_2}, \ldots, y_{t_n}$. Some of the methods used to estimate the parameters in an SDE are:

Difference equations

Here the SDE in [1] is approximated by the difference equation
$$Y(t_i) - Y(t_{i-1}) = \mu(y;\theta)\,(t_i - t_{i-1}) + \sigma(y;\theta)\,\varepsilon_i \qquad [4]$$
where $(\varepsilon_i,\ i = 1, \ldots, n)$ is independent random noise, such as a standard Gaussian variate. The parameters in $\theta$ are then estimated using regression techniques. Example 2 below describes the use of difference equations to estimate effects of a heterogeneous environment on the movement of female elk.

Autoregressive models

When the realizations of a random process, $Y(t)$, are recorded at equal intervals ($t = 1, 2, 3, \ldots$) and the relationship between consecutive values of the process may be given by
$$Y_t = \rho\,Y_{t-1} + \eta_t \quad (t = 1, 2, 3, \ldots), \qquad \eta_t \sim \text{i.i.d.} \qquad [5]$$
then the relationship is called an autoregressive model of order 1. Equation [5] is a special case of the difference equation in [4] where $\mu(y;\theta) = (\rho - 1)\,y_{i-1}$ and $\sigma(y;\theta) = 1$. Time series estimation techniques may be used to estimate the parameter in the model (Shumway and Stoffer 2000). Example 1 below describes the use of an autoregressive type model to study chemotaxis (orientation in relation to gradients of chemicals) by walking bark beetles.

Maximum likelihood estimates for a diffusion process

Given the SDE in [1] with $\theta$ the parameter of interest, the log likelihood function of an observed path $\{y(s),\ 0 \le s \le T\}$ is given by
$$\ell(\theta) = \int_0^T \frac{\mu(y(s);\theta)}{\sigma^2(y(s))}\,dy(s) - \frac{1}{2}\int_0^T \frac{\mu^2(y(s);\theta)}{\sigma^2(y(s))}\,ds \qquad [6]$$
(Guttorp 1995, p. 302). The maximum likelihood estimate (MLE) of $\theta$ may be calculated in the usual way, i.e., by setting the derivative of the log likelihood with respect to $\theta$ to zero and solving for $\theta$. For example, to obtain the MLE of $\mu$ in the Black-Scholes formula in [3], where $\mu(y(t);\theta) = \mu\,y(t)$ and $\sigma(y(t)) = \sigma\,y(t)$, we solve the equation
$$\int_0^T \frac{dy(s)}{y(s)} - \mu\,T = 0, \quad\text{which gives}\quad \hat\mu = \frac{1}{T}\int_0^T \frac{dy(s)}{y(s)} \approx \frac{1}{T}\sum_{i=1}^{T-1} \frac{y_{i+1} - y_i}{y_i},$$
where $y_1, \ldots, y_T$ are observed values. An example of the use of the maximum likelihood technique in ecology is given in Brillinger and Stewart (1998), where the authors estimate speed and other parameters describing the migration path of elephant seals.
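The estimation techniques above can be tried out on simulated data. The following sketch simulates the SDE [3] with the difference equation [4] and then recovers the parameters from the relative increments $(y_i - y_{i-1})/y_{i-1}$. The function names, the seed handling and the scaling of the noise by $\sqrt{\Delta t}$ (the usual Euler scheme for a Brownian increment) are assumptions made for this illustration and are not taken from the studies cited in the text.

```python
import math
import random


def simulate_gbm(mu, sigma, y0, dt, n_steps, seed=0):
    """Euler scheme for dY = mu*Y dt + sigma*Y dB, a difference equation of type [4]."""
    rng = random.Random(seed)
    path = [y0]
    for _ in range(n_steps):
        y = path[-1]
        increment = rng.gauss(0.0, 1.0) * math.sqrt(dt)   # Brownian increment over dt
        path.append(y + mu * y * dt + sigma * y * increment)
    return path


def estimate_gbm(path, dt):
    """Estimate (mu, sigma) from the relative increments of an observed path."""
    rel = [(b - a) / a for a, b in zip(path, path[1:])]
    n = len(rel)
    mean = sum(rel) / n
    var = sum((r - mean) ** 2 for r in rel) / (n - 1)
    # mean/dt equals (1/T) * sum of relative increments, with T = n*dt.
    return mean / dt, math.sqrt(var / dt)
```

The drift estimate has standard deviation of about $\sigma/\sqrt{T}$, so it improves only with a longer observation period $T$, whereas the estimate of $\sigma$ improves with the number of observations.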
Two Examples from Ecology

1. Bark beetle response to pheromone.
Biologists are interested in statistical models to study the response of bark beetles to compounds involved in their chemical communication (i.e., pheromone) system. Pheromone systems are of concern to entomologists because they appear to be promising as nontoxic alternatives to insecticides for insect control. Insect pheromones control the orientation of one individual with respect to another. Chemicals emitted by the male guide the flight of the female to him for the purpose of mating. A useful measure for studying characteristics of an animal's track as it orients and moves towards a point source is the 'heading' angle between the direction toward the source and the direction along the animal's path. In the study described in Akers and Wood (1989) and Preisler and Akers (1995), female Ips paraconfusus bark beetles were introduced into a small experimental arena (36-cm diameter) with a pheromone source at one end. The progress of the beetles was tracked by marking their positions at one-second intervals. Figure 2 shows examples of individual tracks from one control (no pheromone emitted from source) and three treatment groups. The statistical model that best described the observed heading angle, $\theta_t$, of a beetle between times $t-1$ and $t$, given the history of the beetle's movements up to time $t-1$, was the second order autoregressive type model
$$\Theta_t = \beta_1\,\theta_{t-1} + \beta_2\,(\theta_{t-1} - \gamma_{t-2}) + \varepsilon_t \pmod{2\pi} \quad\text{with}\quad \varepsilon_t \sim VM(0, \kappa_t) \qquad [7]$$
where $\Theta_t$ is a random angular variate ranging between $-\pi$ and $\pi$; $VM$ is the von Mises distribution (Mardia, 1972); $\kappa_t = e^{\alpha_0 + \alpha_1 \cos\theta_{t-1}}$; $\gamma_{t-2}$ is the heading angle of a beetle traveling in a straight line between time $t-2$ and $t$; and $\{\beta_1, \beta_2, \alpha_0, \alpha_1\}$ is a set of unknown parameters.
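A simulation helps to see how the parameters in [7] shape a track. The sketch below generates heading angles under the simplifying assumption $\beta_2 = 0$, which removes the term involving $\gamma_{t-2}$ so that no positions need to be tracked; this restriction, the function names and the seed handling are assumptions made for illustration only. The von Mises noise is drawn with `random.vonmisesvariate` from the Python standard library.

```python
import math
import random


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def simulate_headings(beta1, alpha0, alpha1, theta0, n_steps, seed=0):
    """Simulate theta_t = beta1*theta_{t-1} + eps_t (mod 2*pi), eps_t ~ VM(0, kappa_t)."""
    rng = random.Random(seed)
    thetas = [wrap_angle(theta0)]
    for _ in range(n_steps):
        prev = thetas[-1]
        kappa = math.exp(alpha0 + alpha1 * math.cos(prev))   # concentration kappa_t
        eps = wrap_angle(rng.vonmisesvariate(0.0, kappa))    # noise centred at 0
        thetas.append(wrap_angle(beta1 * prev + eps))
    return thetas
```

With a large concentration (for instance `alpha0 = 10`) and `0 < beta1 < 1`, the simulated heading angle shrinks geometrically towards zero, that is, towards the source; with a small concentration the noise dominates and the headings look like those of a random walk.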
Figure 2: Tracks of 4 female bark beetles. C2 is the starting point of a beetle in the control group (no pheromone emitted at the source located at S2). t3, t4, t5 are the starting points of 3 beetles in the treatment groups with pheromone sources at S3, S4, and S5 respectively. Points on the track indicate locations of beetles at 1-s intervals.

When the concentration parameter $\kappa = 0$ the model in [7] corresponds to a random walk with no mean direction. If $\kappa \to \infty$, $0 < \beta_1 < 1$, and $\beta_2 = 0$ the beetle will approach the source S along an arc. One interesting result of the analysis was the fact that the concentration parameter, $\kappa$, was a function of the heading angle at time $t-1$ rather than of the distance to the source. It appears that when the absolute value of the heading angle is large (i.e., when a beetle is heading away from a pheromone source) the fluctuation around the mean direction is larger than when a beetle is heading toward the source.

2. Movement patterns of free-ranging female elk.
Studies on the movement of free-ranging animals provide essential information to a wide audience, from wildlife managers, to conservation biologists and population and landscape ecologists. In the study described in Rowland et al. 1997 and Brillinger et al. 2000, a telemetry system was used to monitor the locations of 53 radio-collared female elk foraging in a 9000 ha fenced experimental forest. Figure 3a shows the locations along the trajectory of one elk for a period of 9 months. The shaded background is the fenced region of the experimental forest. The statistical model developed for this study assumed that elk move in accordance with the stochastic differential equations
$dr(t) = v(r(t), t)\,dt + \Sigma(r(t), t)\,dB(t)$    [8]
Here, $r(t) = (x(t), y(t))'$ is the location of an elk at time t; $B(t)$ is a bivariate Brownian process; and $v(r,t) = E\{dr(t)\}/dt$ is the velocity in some direction (drift). The parameters and the Brownian process control the direction and speed of motion. A particular case of the SDE in [8] is the mean-reverting Ornstein-Uhlenbeck process (Dunn and Gipson, 1977), where $v(r,t) = A(a - r)$, $\Sigma(r,t) = I$, and a is the mean. Another special case is the random walk, where the drift term, $v(r,t)$, is zero. The drift term may be modeled parametrically as a function of covariates, such as distance to roads, distance to cover, time of day, and hunting season. Another approach is to estimate $v(r,t)$ by a smooth function using nonparametric regression procedures such as loess (Cleveland et al., 1992). Figure 3b displays one of the results of the nonparametric approach. Here, the drift term was assumed to be a smooth function of space and time of day. The estimation was done using the loess procedure within a generalized additive model (Hastie, 1992). Figure 3b is a plot of the estimated vector field v describing the expected movement of elk at 0600 hours.
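As an illustration of the mean-reverting special case of [8], the following is a minimal Euler-Maruyama sketch. It is not the estimation procedure used in the study. It assumes that A is a scalar rate (called `rate` below) rather than a matrix, and the function name, step size, and the optional `sigma` scaling are all hypothetical choices made for this example; `sigma=1.0` corresponds to $\Sigma = I$.

```python
import math
import random

def simulate_ou(r0, a, rate, dt, n_steps, rng, sigma=1.0):
    """Euler-Maruyama steps for dr = rate * (a - r) dt + sigma * dB in the plane."""
    x, y = r0
    path = [(x, y)]
    step_sd = sigma * math.sqrt(dt)  # standard deviation of a Brownian increment
    for _ in range(n_steps):
        # Drift pulls the position toward the mean a; noise is bivariate Gaussian.
        x += rate * (a[0] - x) * dt + step_sd * rng.gauss(0.0, 1.0)
        y += rate * (a[1] - y) * dt + step_sd * rng.gauss(0.0, 1.0)
        path.append((x, y))
    return path

rng = random.Random(0)
track = simulate_ou(r0=(10.0, 0.0), a=(0.0, 0.0), rate=0.5, dt=0.1,
                    n_steps=200, rng=rng)
print(track[-1])
```

Setting `rate=0.0` gives the random-walk special case mentioned above, since the drift term then vanishes.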
Figure 3: (a) The locations along the trajectory of one elk for a period of 9 months. The shaded background is the fenced region of the experimental forest. (b) A plot of the estimated vector field describing the expected movement of elk at 0600 hours. Data from 53 female elk were used to estimate the vector field.

ACKNOWLEDGEMENT

I would like to dedicate this paper to Professor Mary Hanania Regier and Nevart Krikorian, two Palestinian women who have inspired me and made my drift through life less random.
REFERENCES

Akers, R. P. and Wood, D. L. (1989). Olfactory orientation responses by walking female Ips paraconfusus bark beetles. I. Chemotaxis assay. Journal of Chemical Ecology, 15, 3-24.
Brillinger, D. R., Preisler, H. K., Ager, A. A. and Kie, J. G. (2000). The use of potential functions in modelling animal movement. Data Analysis from Statistical Foundations. Ed. A. K. Md. E. Saleh. Nova Science, New York.
Brillinger, D. R. and Stewart, B. S. (1998). Elephant seal movements: modelling migration. Canadian J. Statistics, 26, 431-443.
Cleveland, W. S., Grosse, E. and Shyu, W. M. (1992). Local regression models. Pp. 309-376 in Statistical Models in S (Eds. J. M. Chambers and T. J. Hastie). Pacific Grove: Wadsworth.
Dunn, J. E. and Gipson, P. S. (1977). Analysis of radio telemetry data in studies of home range. Biometrics, 33, 85-101.
Guttorp, P. (1995). Stochastic Modeling of Scientific Data. Chapman & Hall, London.
Hastie, T. J. (1992). Generalized additive models. Pp. 195-247 in Statistical Models in S (Eds. J. M. Chambers and T. J. Hastie). Pacific Grove: Wadsworth.
Mardia, K. V. (1972). Statistics of Directional Data. Academic Press, London.
Øksendal, B. (1992). Stochastic Differential Equations: An Introduction with Applications. Springer, New York.
Preisler, H. K. and Akers, R. P. (1995). Autoregressive-type models for the analysis of bark beetle tracks. Biometrics, 51, 259-267.
Rowland, M. M., Bryant, L. D., Johnson, B. K., Noyes, J. H., Wisdom, M. J. and Thomas, J. W. (1997). The Starkey Project: History, Facilities, and Data Collection Methods for Ungulate Research. Technical Report PNW-GTR-396, Forest Service, USDA.
Shumway, R. H. and Stoffer, D. S. (2000). Time Series Analysis and Its Applications. Springer, New York.
DECOMPOSITION OF MEASURES ON DIFFERENCE POSETS
Eissa D. Habil - Akram M. Radwan

ABSTRACT. We prove generalizations of the Yosida-Hewitt decomposition theorem and the Lebesgue decomposition theorem for positive finitely additive measures defined on difference posets (generalizing orthomodular posets and orthoalgebras) with values in a Dedekind complete Riesz space. Moreover, we provide some conditions for a difference poset to possess the Jordan-Hahn property and the approximate Jordan-Hahn property.
1 INTRODUCTION

One of the areas of noncommutative measure theory is the study of measures and states on algebraic structures less rich than a $\sigma$-field, which arose from the realization that quantum mechanical events fail to form a $\sigma$-field. Such structures are, for example, quantum logics (= orthomodular posets), orthoalgebras, or more generally difference posets [7, 8, 9, 10, 12]. In the last 20 years many classical decomposition theorems have been proven in the setting of finitely additive measures defined on orthomodular posets and orthoalgebras [2, 3, 4, 14, 15]. The main results of this paper are the following: 1) generalizations of the Yosida-Hewitt decomposition and the Lebesgue decomposition of finitely additive measures, $\sigma$-additive measures, and completely additive measures on difference posets with values in a Dedekind complete Riesz space [2, 3, 4]; 2) necessary or sufficient conditions for difference posets to possess the Jordan-Hahn property or the approximate Jordan-Hahn property [14, 15].

2 PRELIMINARIES
Let $(L, \le)$ be a partially ordered set (poset), and let D be a nonempty subset of L. If the supremum (resp., the infimum) of D in L exists, it will be denoted by $\bigvee D$ (resp., $\bigwedge D$). In particular, if $D = \{x, y\}$ we write $\bigvee D = x \vee y$ and $\bigwedge D = x \wedge y$.

Definition 2.1 [12, 13] Let $(L, \le)$ be a poset with a least element 0 and a greatest element 1. Let $\ominus$ be a partially defined binary operation on L such that $b \ominus a$ is defined if and only if $a \le b$. Then $(L, \le, \ominus, 0, 1)$ is called a difference poset (a DP, for short) if the following conditions are satisfied for all $a, b, c \in L$:

(DP1) For any $a \in L$, $a \ominus 0 = a$.

(DP2) If $a \le b \le c$, then $c \ominus b \le c \ominus a$ and $(c \ominus a) \ominus (c \ominus b) = b \ominus a$.
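To make the axioms concrete, the following sketch checks (DP1) and (DP2) by brute force on one small example: the power set of a three-element set, ordered by inclusion, with $b \ominus a$ taken to be the set difference. The choice of this example and all names in the code are assumptions made for illustration only.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})

def powerset(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

L = powerset(UNIVERSE)          # the poset: subsets ordered by inclusion
ZERO = frozenset()              # least element

def leq(a, b):
    return a <= b               # subset inclusion

def diff(b, a):
    """b (-) a, defined only when a <= b."""
    if not leq(a, b):
        raise ValueError("difference is defined only for a <= b")
    return b - a

def check_dp1():
    return all(diff(a, ZERO) == a for a in L)

def check_dp2():
    for a in L:
        for b in L:
            for c in L:
                if leq(a, b) and leq(b, c):
                    if not leq(diff(c, b), diff(c, a)):
                        return False
                    if diff(diff(c, a), diff(c, b)) != diff(b, a):
                        return False
    return True

print(check_dp1(), check_dp2())
```

A brute-force check of this kind only confirms the axioms for the chosen finite example; it is not a proof for general Boolean algebras.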
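The following statements have been proven in [10].

Proposition 2.2 Let a, b, c be elements in a DP L. Then
(i) $a \ominus a = 0$.
(ii) $a \le b$ implies $b \ominus a = 0 \Leftrightarrow b = a$.
(iii) $a \le b$ implies $b \ominus a = b \Leftrightarrow a = 0$.
(iv) $a \le b \le c \Rightarrow b \ominus a \le c \ominus a$ and $(c \ominus a) \ominus (b \ominus a) = c \ominus b$.
(v) $a \le b \Rightarrow b \ominus a \le b$ and $b \ominus (b \ominus a) = a$.
(vi) $b \le c$ and $a \le c \ominus b \Rightarrow b \le c \ominus a$ and $(c \ominus a) \ominus b = (c \ominus b) \ominus a$.
(vii) $a \le b \le c \Rightarrow a \le c \ominus (b \ominus a)$ and $(c \ominus (b \ominus a)) \ominus a = c \ominus b$.

Let $(L, \le, \ominus, 0, 1)$ be a difference poset. For any element $a \in L$ we put $a' := 1 \ominus a$. Then (i) $a'' = a$; (ii) $a \le b$ implies $b' \le a'$. Two elements $a, b \in L$ are orthogonal, and we write $a \perp b$, iff $a \le b'$ (or equivalently $b \le a'$). The properties of a DP enable us to define a partial binary operation $\oplus : L \times L \to L$ as follows: for every $a, b \in L$ with $a \perp b$, put

$a \oplus b := (a' \ominus b)' = (b' \ominus a)'$.    (2.1)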
The partial binary operation $\oplus$ on L is commutative and associative. Consequently, if $a, b \in L$ with $a \le b$, then there exists $c \in L$ such that $c \perp a$ and $b = a \oplus c$. Moreover, $c = b \ominus a$. Very important examples of difference posets are orthomodular posets, orthoalgebras, and effect algebras [6, 8, 10, 12].

Example 2.3 [2, 4] An orthomodular poset (OMP) is a poset $(L, \le)$ with smallest element 0 and greatest element 1, and with an orthocomplementation $' : L \to L$ such that
(OM1) $a'' = a$ for any $a \in L$;
(OM2) $a \vee a' = 1$ for any $a \in L$;
(OM3) if $a \le b$, then $b' \le a'$;
(OM4) if $a \le b'$ (and we write $a \perp b$), then $a \vee b \in L$;
(OM5) if $a \le b$, then $b = a \vee (a \vee b')'$ (the so-called orthomodular identity).
Note that every OMP L becomes a DP when $b \ominus a := b \wedge a'$ for $a \le b$ in L. An orthomodular lattice (OML) is an OMP which is also a lattice. A distributive orthomodular lattice is called a Boolean algebra. For more details concerning orthomodular posets and lattices see, e.g., [9].
Example 2.4 [7, 8] An orthoalgebra (OA) is a set L containing two special elements 0, 1 and equipped with a partially defined binary operation $\oplus : L \times L \to L$ such that for all $a, b, c \in L$ we have
(OA1) (Commutativity) If $a \oplus b$ is defined, then $b \oplus a$ is defined and $a \oplus b = b \oplus a$;
(OA2) (Associativity) If $b \oplus c$ is defined and $a \oplus (b \oplus c)$ is defined, then $a \oplus b$ is defined, $(a \oplus b) \oplus c$ is defined, and $a \oplus (b \oplus c) = (a \oplus b) \oplus c$;
(OA3) (Orthocomplementation) For any $a \in L$ there is a unique $b \in L$ such that $a \oplus b$ is defined and $a \oplus b = 1$;
(OA4) (Consistency) If $a \oplus a$ is defined, then $a = 0$.
Note that every OA $(L, \oplus, 0, 1)$ becomes a DP if we define $\ominus$ on L by $b \ominus a := (a \oplus b')'$ for all pairs $(a, b)$ with $a \perp b'$, where $b'$ is the unique element in L such that $b \oplus b' = 1$. We note that if L is an orthomodular poset and $a \oplus b := a \vee b$ whenever $a \perp b$ in L, then L with 0, 1, $\oplus$ is an orthoalgebra. The converse statement does not hold in general. We recall that an orthoalgebra L is an OMP iff $a \perp b$ implies $a \vee b \in L$. By [12], we conclude that a DP with 0, 1 and $\oplus$, defined by (2.1), is an orthoalgebra if and only if $a \le 1 \ominus a$ implies $a = 0$. Therefore, it is not hard to give many examples of DPs which are not orthoalgebras; such ones are sets of effects:

Example 2.5 [10] Let H be a Hilbert space, and let E(H) be the set of all self-adjoint operators A on H with $O \le A \le I$, where O and I are the zero and identity operators, respectively, on H. The partial order on E(H) is defined by setting $A \le B$ iff $(Ax, x) \le (Bx, x)$, $x \in H$, and $C = B \ominus A$ iff $(Bx, x) - (Ax, x) = (Cx, x)$, $x \in H$. We recall that $(E(H), \le, \ominus, O, I)$ is a difference poset which is not an orthoalgebra.

On the other hand, if in the definition of an orthoalgebra axiom (OA4) is replaced by the weaker axiom (EA4) $a \oplus 1$ is defined implies $a = 0$, we obtain the so-called effect algebra (EA), which generalizes the orthoalgebra. Moreover, an effect algebra is equivalent to a DP [6].

Definition 2.6 Let L be a DP. A subset $M \subseteq L$ is called jointly compatible iff M is contained in a Boolean subalgebra. A subset $M \subseteq L$ is called an orthogonal set if any two elements of M are orthogonal. A subset $M \subseteq L$ is said to be jointly orthogonal iff it is an orthogonal set and jointly compatible. An orthoalgebra is said to be locally finite if every jointly orthogonal subset is finite. A finite set $D \subseteq L$ is called a difference set if either D is empty or there exists a strictly increasing sequence $(p_i)_{i=0}^{n}$, $1 \le n$, in $(L, \le)$ such that $D = \{p_i \ominus p_{i-1} : i = 1, 2, \ldots, n\}$. We say that the sequence $(p_i)_{i=0}^{n}$ yields the difference set D. Notice that n is the cardinality of D. Singleton sets of nonzero elements of L are difference sets. If p, q is an orthogonal pair of nonzero elements of L, then the set $\{p, q\}$ is a difference set. A difference set is orthogonal, and a subset of a difference set of L is also a difference set.

The following theorem gives a necessary and sufficient condition for a subset of a DP to be jointly orthogonal.

Theorem 2.7 [14] Let L be a DP and let M be an orthogonal subset of L. Then M is a jointly orthogonal set if and only if every finite subset of $M \setminus \{0\}$ is a difference set.

A nonzero element p of a difference poset L is said to be an atom in L if, for elements $q, r \in L$, $p = q \oplus r$ implies that $q = 0$ or $r = 0$.

Lemma 2.8 [14, Lemma 6.2] Let L be a locally finite difference poset. Then:
(i) For each nonzero element p in L there exists an atom q and an element r in L such that $p = q \oplus r$.
(ii) For each nonzero element p in L there exists a difference set D consisting of atoms such that $p = \bigoplus D$.
(iii) A difference set D consisting of atoms is maximal as such if and only if $1 = \bigoplus D$.

Definition 2.9 [1] Let $(E, +, \cdot, \le)$ be a real vector space which is equipped with a partial ordering $\le$. We say that E is a Riesz space if the following axioms are satisfied: (i) $(E, \le)$ is a lattice; (ii) $x, y \in E$ and $x \le y$ imply $x + z \le y + z$ for all $z \in E$; (iii) $x, y \in E$ and $x \le y$ imply $\alpha x \le \alpha y$ for all $\alpha \ge 0$. The set of all positive elements of E will be denoted by $E^+$ (i.e., $E^+ := \{x \in E : x \ge 0\}$). For any vector x in a Riesz space we define

$x^+ := x \vee 0$;  $x^- := (-x) \vee 0$;  $|x| := x \vee (-x)$.
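As a concrete instance of these three definitions, the sketch below takes E to be $\mathbb{R}^3$ with the componentwise order, which is an assumption made only for this example (the paper works with a general Riesz space). In this case the supremum of two vectors is the componentwise maximum, and the code checks the standard identities $x = x^+ - x^-$ and $|x| = x^+ + x^-$ on one vector. All function names are hypothetical.

```python
def sup(x, y):
    """Supremum in R^n under the componentwise order."""
    return tuple(max(a, b) for a, b in zip(x, y))

def neg(x):
    return tuple(-a for a in x)

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def zero_like(x):
    return tuple(0 for _ in x)

def pos_part(x):
    return sup(x, zero_like(x))        # x+ = x v 0

def neg_part(x):
    return sup(neg(x), zero_like(x))   # x- = (-x) v 0

def abs_val(x):
    return sup(x, neg(x))              # |x| = x v (-x)

x = (3, -2, 0)
print(pos_part(x), neg_part(x), abs_val(x))
```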
The elements $x^+$, $x^-$ and $|x|$ are respectively called the positive part, the negative part and the absolute value of x. For any $x \in E$, we have (i) $x = x^+ - x^-$, (ii) $|x| = x^+ + x^-$, and (iii) $|x| = 0$ iff $x = 0$. A Riesz space E is called Dedekind complete (resp., $\sigma$-Dedekind complete) if, for every nonempty subset (resp., countable subset) B of E that is bounded from above, $\bigvee B$ exists in E. A Riesz space E is Archimedean if and only if, given $x, y \in E^+$ such that $nx \le y$ for all $n \in \mathbb{N}$, we have $x = 0$.
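We shall use the following properties of monotone nets $\{f_\alpha\}$, $\{g_\alpha\}$ in E:

$\{f_\alpha\} \uparrow f,\ \{g_\alpha\} \uparrow g \ \Rightarrow\ \{f_\alpha + g_\alpha\} \uparrow f + g$    (2.2)

$\{f_\alpha\} \downarrow f,\ \{g_\alpha\} \downarrow g \ \Rightarrow\ \{f_\alpha + g_\alpha\} \downarrow f + g$    (2.3)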
Finally, we say that a net $\{x_\alpha\}$ in E is order convergent to an element $x \in E$, and we write $x_\alpha \xrightarrow{o} x$, if there exists a downwards directed net $\{p_\alpha\} \subseteq E^+$ with $\{p_\alpha\} \downarrow 0$ such that $|x_\alpha - x| \le p_\alpha$ for all $\alpha$. If $x_\alpha \uparrow x$ (or $x_\alpha \downarrow x$), then $x_\alpha \xrightarrow{o} x$ [1].

3 DECOMPOSITION OF MEASURES
Throughout the rest of this paper, V is assumed to be a Dedekind complete Riesz space, and $L = (L, \le, \ominus, 0, 1)$ is assumed to be a DP for which the partial operation $\oplus : L \times L \to L$ is defined by (2.1). Consider the following binary relation $\le_n$ on $V^L$: $\mu_1 \le_n \mu_2$ iff $\mu_1(a) \le \mu_2(a)$ for all $a \in L$. Clearly the pair $(V^L, \le_n)$ is a partially ordered set. An element $\mu \in V^L$ is said to be positive if $\mu(a) \ge 0$ for all $a \in L$. We say that an element $\mu \in V^L$ is a finitely additive measure if $\mu(a \oplus b) = \mu(a) + \mu(b)$ whenever $a \oplus b$ is defined in L. Then $\mu(0) = 0$ and $\mu(a') = \mu(1) - \mu(a)$ for all $a \in L$. If $\mu$ is positive, then $a \le b$ in L implies $\mu(a) \le \mu(b)$ in V. One can easily show that an element $\mu \in V^L$ is a finitely additive measure iff $\mu(b \ominus a) = \mu(b) - \mu(a)$ whenever $a \le b$.

To define $\sigma$-additive and completely additive measures on L, we introduce the following notions. Let $\{a_1, \ldots, a_n\} \subseteq L$. Recursively we define for $n \ge 3$

$a_1 \oplus \cdots \oplus a_n := (a_1 \oplus \cdots \oplus a_{n-1}) \oplus a_n$,    (3.1)

supposing that $a_1 \oplus \cdots \oplus a_{n-1}$ and $(a_1 \oplus \cdots \oplus a_{n-1}) \oplus a_n$ exist in L. From the associativity of $\oplus$ in a DP we conclude that (3.1) is correctly defined, and we put $a_1 \oplus \cdots \oplus a_n = a_1$ if $n = 1$, and $a_1 \oplus \cdots \oplus a_n = 0$ if $n = 0$. Then for any permutation $(i_1, \ldots, i_n)$ of $(1, \ldots, n)$ and for any k with $1 \le k \le n$, we have

$a_1 \oplus \cdots \oplus a_n = a_{i_1} \oplus \cdots \oplus a_{i_n}$,
$a_1 \oplus \cdots \oplus a_n = (a_1 \oplus \cdots \oplus a_k) \oplus (a_{k+1} \oplus \cdots \oplus a_n)$.

We say that a finite subset $F = \{a_1, \ldots, a_n\}$ of L is $\oplus$-orthogonal if $a_1 \oplus \cdots \oplus a_n$ exists in L. In this case, we say that F has a $\oplus$-join, defined as $\bigoplus F = a_1 \oplus \cdots \oplus a_n$. It is clear that two elements a and b of L are orthogonal, i.e., $a \perp b$, iff $\{a, b\}$ is $\oplus$-orthogonal. An arbitrary subset G of L is $\oplus$-orthogonal if every finite subset F of G is $\oplus$-orthogonal. If G is $\oplus$-orthogonal, then any subset of G is $\oplus$-orthogonal. A $\oplus$-orthogonal subset G of L has a $\oplus$-join in L if

$\bigoplus G := \bigvee_{F \in \mathcal{F}(G)} \bigoplus F$

exists in L (where $\mathcal{F}(G)$ denotes the set of all finite subsets of G). We say that a DP L is a complete DP (resp., a $\sigma$-DP) if for any $\oplus$-orthogonal subset (resp., any countable $\oplus$-orthogonal subset) of L, there exists the $\oplus$-join in L [4, 6].

Definition 3.1 [2, 4] Let L be a complete DP. A mapping $\mu \in V^L$ is said to be a positive completely additive measure on L if, for any $\oplus$-orthogonal system $\{a_i : i \in I\}$ in L, we have for any finite subset F of I

$\left|\mu\left(\bigoplus_{i \in I} a_i\right) - \sum_{i \in F} \mu(a_i)\right| \le b_F$,    (3.2)
where {bF} 1 0 and bFl < bF2 whenever F2QFX. That is, J2iaF Kai) "^ M(0 i € j«»). and we shall write M(©i€/ a ') = X^e/M 0 *)If the index set I in (3.2) is only countable, we say that fj, is a positive o—additive measure (or a positive countably additive measure), and we write ^ ( 0 ^ ! ai) = Since every Dedekind complete Riesz space is Archimedean, we conclude that n(0) = 0. In fact, for any finite subset F of I with |M(0;<=i a i) ~ J2ieF Ma>)l ^ ^ > where a,- = 0 V« £ / and hence 0 i e / a ; = 0, we have that (card(F) — l)\/i(0)\ < bp I 0, and hence /i(0) = 0. We denote by a(L, V)+, oa(L, V)+, and ca(L, V)+ the sets of all ji e V+ which are finitely additive, a—additive, and completely additive measures, respectively. It can be shown that ca(L, V)+ CCTa(L,V)+ C a(L, V)+. It is not hard to prove that a positive additive measure /z on L is a—additive or completely additive iff n
oo
{£>( ai )} T K 0 < t=l
(3.3)
i=l
or {£>(a*)}F T M ® * ) >
(3-4)
where F runs over all finite subsets of / , whenever {at : i € / } is a 0 —orthogonal set in L for which 0 i e / a,- exists in L. Moreover, if // and v are elements of ca(L, V)+,
221 then fi + iy € ca(L, V)+, where (fi + v)(a) := n(a) +v(a), a e L. Indeed, this follows from (3.4) and (2.2). An element /J G a(L, V)+ is said to be weakly purely additive if V
V G ca{L, V)+=> rj = 0.
(3.5)
If (3.5) holds for r\ G aa{L,V)+, fi is said to be purely additive. An element \i G aa(L, V)+ is said to be purely a—additive, if it satisfies (3.5). The next result is a noncommutative version of the Yosida-Hewitt decomposition theorem for Dedekind complete Riesz space-valued measures on a complete difference poset. This result generalizes that one in [2] and [3] to the more general setting of difference posets. T h e o r e m 3.2 Let n G a(L,V)+ where L is a complete DP. Then // can be expressed as a sum \i = £ + r}, where f G ca(L,V)+, and 77 is a positive weakly purely additive measure on L. Proof. Define TM = {7 £ ca(L, V)+ : 7 < n /x}. Since 0 G TM, TM is nonempty. Let C = {TJ} be a chain in TM with respect to the natural ordering <„, and define 7o(c) :=\/'Yj(c), c€L. 3
Since 0 < 7,(0) < 7?(1) < /x(l) and V is Dedekind complete, 70(c) is defined correctly on L. Moreover, -y0 is finitely additive. Indeed, let a © 6 be defined in L. Then 7,-(a) T To (a), Tj(&) T To (6). Also, it can be shown that {Tj( a )} and (TJ(&)} are equidirected. By (2.2), we conclude that %(a ®b)= Tj(a © b) T= (T,-(O) + T # ) ) T= T » T +7j(&) T= To(o) + 7„(fr). From the definition of 70, we conclude that {70(c) — 7; (c)} J. 0 uniformly in c G L. To see this, let c G L. Then 7,(0) f , hence —7,(0) J. and therefore, {To(c) — 77(c)} JMoreover,
0 = ToW - y 7iW = A(^( c ) - ^ c ) ) ' J
which implies that {TO(C) — Tj( c )} J- 0. Now, we shall show that -y0 G ca(L,V)+. Let {aj}ig/ be a ©—orthogonal system in L with a = 0 i e / a i - Then for any finite subset F of / , we have 0 < 70(0) - Y, 1°^
= 1°(a
G
(0
a
*))
= (7o(a e ( 0 aj)) - 7j(o 9 ( 0 Oi))) + (7j(a 9 ( 0 *)), i€F
i€F
ieF
222 where {j0(a Q ( © i s F a ; ) ) - ^{a Q ( © i e F a i ) ) } I °( © i 6 F «i)) - iM
fl
© (©,eF '))
and
0 < 7o(a) - ^
:
Thus
letting pj := 7„(a e
% = 7?(a © ( © i 6 F *))> w e obtain 7o(fli) < Pj + VF
a
VF G ^ ( / ) ,
where {p,} j 0 and, by complete additivity of jj , {VF}F I 0 for each fixed j . This yields
0
and therefore 7 0 (a) = 2 i s / 7 o ( a ; ) - . Since j 0
223 where pt := 7„(a) - 74(a) and {pi} j 0 because {70(c) - 7J(C)} 1 0 uniformly in c € L. Therefore j0(a) < e. Thus % is a majorant of C in TM. It follows from Zorn's Lemma that FM contains a maximal element £. Hence £ -C A and £ < n /i. Let r) :— fi — £. Clearly, 77 £ a(L, V ) + . To finish the proof, it remains to show that 77 ± A. Let 7 £ a(L, V)+ be such that 7 < n 77 = p. — £ and 7 -C A. Then 7 + £ 6 a(L, V)+, 7 + f < n /J and 7 + £
4 THE JORDAN-HAHN DECOMPOSITION
The present section is concerned with a relaxed form of the Jordan-Hahn decomposition in the noncommutative setting of difference posets and orthoalgebras. In this section, we take $V = \mathbb{R}$. By a measure on a DP L, we mean an element $\mu \in \mathbb{R}^L$ which is a finitely additive measure on L. A measure $\mu$ on L is said to be bounded if the image $\mu(L)$ of L under the map $\mu$ is a bounded subset of $\mathbb{R}$, and it is said to be positive if $\mu(p) \ge 0$ for all elements p in L. We denote by $ba(L)$ the linear subspace of all bounded measures on L and by $a^+(L)$ the subset of $\mathbb{R}^L$ of all positive measures on L. Notice that $a^+(L)$ is a cone in $\mathbb{R}^L$ (i.e., (i) $a^+(L) + a^+(L) \subseteq a^+(L)$, (ii) $\mathbb{R}^+ a^+(L) \subseteq a^+(L)$, and (iii) $a^+(L) \cap -a^+(L) = \{0\}$). The subspace $a^+(L) - a^+(L)$ of $\mathbb{R}^L$ is denoted by $J(L)$. An element of $J(L)$ is called a Jordan measure.

Lemma 4.1 (i) The positive cone $a^+(L)$ is a subset of the linear space $ba(L)$. (ii) $J(L) \subseteq ba(L)$.

Proof. (i) Let $\mu$ be a positive measure on L and let p, q be elements in L with $p \le q$. Then there exists an element $r \in L$ such that $p \perp r$ and $q = p \oplus r$. Hence, $\mu(p) \le \mu(p) + \mu(r) = \mu(p \oplus r) = \mu(q)$. Therefore, for all elements $p \in L$, $0 = \mu(0) \le \mu(p) \le \mu(1)$.

(ii) It follows directly from (i).

For an element $\mu \in ba(L)$, we define $\mu^+$, $\mu^-$ and $\|\mu\|$ as follows:

$\mu^+(p) := \bigvee_{q \le p} \mu(q)$,  $\mu^-(p) := -\bigwedge_{q \le p} \mu(q)$,  $\|\mu\| := \bigvee_{p \in L} |\mu(p)|$.

Notice that $\mu^-$ is equal to $(-\mu)^+$ and that $\mu^+$ is a super-additive positive functional on L. Indeed, let p, q be elements in L such that $p \perp q$. If r, s are elements of L with $r \le p$ and $s \le q$, then $r \perp s$ and $r \oplus s \le p \oplus q$. It follows that $\mu(r \oplus s) \le \mu^+(p \oplus q)$, and by additivity of $\mu$ we conclude that $\mu(r) + \mu(s) \le \mu^+(p \oplus q)$. Therefore, $\mu^+(p) + \mu^+(q) \le \mu^+(p \oplus q)$.

A difference poset L is said to have the Jordan-Hahn property (JHP) if for each $\mu \in ba(L)$ there exist elements $\nu, \xi \in a^+(L)$ and an element $p \in L$ such that

$\mu = \nu - \xi$ and $\nu(p') = 0 = \xi(p)$.
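The definitions of $\mu^+$, $\mu^-$, $\|\mu\|$ and of the Jordan-Hahn property can be illustrated in the classical (Boolean) case, where the decomposition is the familiar Hahn decomposition. The sketch below assumes L is the power set of three atoms and $\mu$ is the signed measure determined by hypothetical atom weights; neither the example nor the names come from the paper. It takes $\nu = \mu^+$, $\xi = \mu^-$, and p the set of atoms with positive weight.

```python
from itertools import combinations

# Hypothetical atom weights defining a signed measure on a power set.
WEIGHTS = {"x": 2, "y": -3, "z": 1}
ATOMS = frozenset(WEIGHTS)

def subsets(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def mu(a):
    return sum(WEIGHTS[t] for t in a)

def mu_plus(p):
    """Supremum of mu(q) over all q <= p."""
    return max(mu(q) for q in subsets(p))

def mu_minus(p):
    """Minus the infimum of mu(q) over all q <= p."""
    return -min(mu(q) for q in subsets(p))

def norm():
    """Supremum of |mu(p)| over all p in L."""
    return max(abs(mu(p)) for p in subsets(ATOMS))

# Candidate element p for the Jordan-Hahn property: atoms of positive weight.
P = frozenset(t for t in ATOMS if WEIGHTS[t] > 0)
P_COMPLEMENT = ATOMS - P

print(mu_plus(P_COMPLEMENT), mu_minus(P), norm())
```

In a general difference poset $\mu^+$ is only super-additive, so it need not be a measure; the fact that it is additive here is a feature of the Boolean example.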
A difference poset L is said to have the approximate Jordan-Hahn property (AJHP) if for each $\mu \in ba(L)$ there exist elements $\nu, \xi \in a^+(L)$ such that (i) $\mu = \nu - \xi$, and (ii) for each $\varepsilon > 0$ there is $p \in L$ with $\nu(p') < \varepsilon$, $\xi(p) < \varepsilon$. If L has the Jordan-Hahn property, then it has the approximate Jordan-Hahn property. Moreover, the two properties coalesce provided that L is finite. The following result gives a necessary and sufficient condition for a difference poset to possess the approximate Jordan-Hahn property. This result generalizes Theorem 2.1 in [15].

Theorem 4.2 Let L be a difference poset. Then L has the approximate Jordan-Hahn property if and only if for each element $\mu$ of $ba(L)$ satisfying

$\mu(p) \le 1$ for all $p \in L$,    (4.1)

there exists an element $\nu$ of $a^+(L)$ such that

$\mu(q) \le \nu(q) \le 1$ for all $q \in L$.    (4.2)
Proof : (=>): Let /i e ba{L) and suppose that /x(p) < 1 Vp G L. Since L has the AJHP, there exist elements v, £ G a+(L) such that \i = v-t, and for every e > 0 there exists p G L such that v{p'),£{p) < e. Then for all e > 0,
i>n(p)=v(P)-m
= v(i)-v(ij)-t(p)
= Kl)-["(p0+£(p)]>i/(l)-2e;
and, therefore, i/(l) < 1. It follows that v(q) = v{\) - v{q') < v{l) < 1 \/q e L. (<=): Let fi e ba(L) be a non-zero element. Since fj,(q) < \\fi\\ for all q e L, we have
(p[)(«)Sl V 9 eL. By hypothesis, there exists an element v of o + (L) with (-rp-rr)(g) < v(q) < 1 Vg € L. IIWl Set £ := v - ^ . Then £ e a+{L), p = \\\x\\v - |H|£, and \\n\\v, |H|£ £ a+{L). Let e > 0 be given. Then, there exists p e L such that A*(P)>IH|-|
or
(4.3)
A*(P)<|-|HI-
Now, from above and (4.3), we have INK(p) = MHP) - M(P) < IHI - /i(p) < | < e. and I M K P ' ) < IMI < | - A*(P) < I + IIMIKCP) -
IHHP)
< | + \ - IHKP) = t - \\A\v(v) < e; which completes the proof that L has the AJHP. A positive measure fi G a+(L) is called a probability measure if /i(l) equals 1. We denote by fi(L) the collection of all probability measures on L. A difference poset L is said to be unital if for each non-zero element p in L, there exists an element ft of fi(L) which evaluates to one on p. Remark 4.3 Let L be a difference poset, and let H(ti(L)) and H+(Cl(L)) denote the linear hull and the positive hull of ft(L), respectively. The set U(L) is said to have the JHPR, the Jordan-Hahn property in the sense of Riittimann [14] if for each fi 6 H(Q,(L)), there exist elements v, £ G H+(fl(L)) and an element p G L such that H=v—£
and
v(p') = 0 = £(y>).
Note that, if L has the JHP, then Q,(L) has the J H P R . In fact, we note that Q(L) is a convex subset of itself and J(L) (resp., a+(L)) coincides with H(Q,(L)) (resp., H+(fl(L))). Moreover, by Lemma 4.1, every Jordan measure is bounded. It follows that Q(L) has the J H P R . Theorem 4.4 [14, Corollary 5.3] Let L be a unital orthoalgebra and let A(L) be a convex set of probability measures on L. If A(L) has J H P R then L is locally finite. The next result relates local finiteness of a difference poset to the Jordan-Hahn property.
226 T h e o r e m 4.5 Let L be a unital difference poset. If L has the Jordan-Hahn property, then L is locally finite. Proof. Apply Remark 4.3 and Theorem 4.4 (which is also valid for unital difference posets) to A(L) = Q,(L). The difference posets Sch2o [15], the Greechie-diagram of which is given in Figure 4.1(a) is unital and locally finite, but fails to have the Jordan-Hahn property. To see this, let the bounded measure fi be given as in Figure 4.1(b). It clearly satisfies condition (4.1). The only element v £ R L which satisfies (4.2) and which is a positive measure on every block in Scli2o is given in Figure 4.1(c). Notice that v is not a measure on Scli2o (becasues, i/(c) + v{d) = 0 ^ 1 = v(g') = v(c © d)).
+ 1.
r
4 - 1 0
c
-1
& +1.
2. 'i
C
a 0
0 >••
'! 0
+1A
Figure 4.1 Let L be a locally finite difference poset. We denote by A{L) the collection of all atoms in L and by O(L) the collection of all maximal difference sets consisting of atoms. By Lemma 2.8, A(L) and O(L) are not empty, and the pair (A(L), 0{L)) is called the atom-hypergraph of L. A locally finite difference poset is said to satisfy the outer point condition if for every element E in O(L) there exists an element p G A(L) such that, for all elements F in O(L),
peF
<$
E
F
(4.4)
The following theorem is a generalization of Theorem 3.1 of [15], which appears without proof, to difference posets. T h e o r e m 4.6 Let L be a locally finite difference poset. If L satisfies the outer point condition, then it has the approximate Jordan-Hahn property. Proof. Let fi be a bounded measure on L such that fj,(p) < 1 for all p € L. For each element E e 0{L) we define a scalar £# by
tB peE
Then, as fj,+ is a super-additive positive functional on L we have 0 < tF < 1. Also for each element E 6 0(L), we select, using the outer point condition, pE e A(L) which satisfies condition (4.4), and define a map LJ : A(L) —> R as follows / \ ._ f 1 - *B + (_ /i + (p),
if P = PE for some £ <= O(L) otherwise.
H+(PE),
Then w(p) > 0 and fi(p) < n+{p) < w(p) for all p e A(L). Define an element /j,u in RLby ( 0, ifp = 0 where TV is a difference set consisting of atoms such that 0 N equals p. We claim that Hv is a positive measure on L which extends w. To see this, let p, g be nonzero elements in L with p L q and let M, AT be difference sets consisting of atoms such that p = 0 M and q = ©AT. Then, M ( J N is a difference set and ®(M\JN) = ( © M) © ( © A7) = p © ?. Since M f] N is empty,
/*-(?©«) = 1 3 w ( r ) = £]<''('")+ £)<»(»•) rSMUJV
r€M
reN
Note that yuw(p) = w(p) for all p e ^4(L), since p = ©{p}. Moreover, /iw(p) > 0 because oj(p) > 0 for all p € L. Finally, it remains to show that, for every F € O(L), ]T) 6 f w(p) = 1. To this end, we have two cases to consider. If F is a singleton set, then F = {PF} and Y2peFw(P) ~ ^(PF) = 1- The other case is that F has more than one element. In this case, we have ^w(p) p€F
= uj{pF)+
J2 OJ(P) P6F\{PF}
= i-tF+ti+(PF)+
Y, ^+w P€F\{pF}
+
= i-tF+^2fj, (p)
= i-tF+tF
= i.
p€F
From the above we conclude \iw is a positive measure and ji{p) < fJ,u(p) < 1 for all p € L. Therefore the assertion follows from Theorem 4.2. The difference posets (in fact, orthoalgebras) Gu and Jig are locally finite. Figure 4.2 gives their atom-hypergraph (see [8,9]) where atoms are represented by points in the plane and a collection of points connected by a smooth line-segment forms a maximal difference set of atoms.
228 Gl4
Jis:
Figure 4.2 Notice that Gu and Ji 8 satisfy the outer point condition and therefore, by Theorem 4.6, possess the approximate Jordan-Hahn property.
References

[1] D. Aliprantis, O. Burkinshaw, Positive Operators, Academic Press, Inc. (1985).
[2] P. De Lucia, A. Dvurecenskij, Yosida-Hewitt decompositions of Riesz space-valued measures on orthoalgebras, Tatra Mount. Math. Publ. 2 (1993), 229-239.
[3] P. De Lucia and P. Morales, Noncommutative decomposition theorems in Riesz spaces, Proceedings of A.M.S. 120 (1994), 193-202.
[4] A. Dvurecenskij, B. Riecan, Decompositions of measures on orthoalgebras and difference posets, Inter. J. Theor. Phys. 33 (1994), 1387-1402.
[5] A. De Simone, Decomposition theorems in orthomodular posets: the problem of uniqueness, Proceedings of A.M.S. 126 (1998), 2919-2926.
[6] D. J. Foulis, M. K. Bennett, Effect algebras and unsharp quantum logics, Found. Phys. 24 (1994), 1325-1346.
[7] D. J. Foulis, R. J. Greechie and G. T. Rüttimann, Filters and supports in orthoalgebras, Inter. J. Theor. Phys. 31 (5) (1992), 789-807.
[8] E. D. Habil, Orthoalgebras and Noncommutative Measure Theory, Ph.D. dissertation, Kansas State University (1993).
[9] G. Kalmbach, Orthomodular Lattices, Academic Press, London/New York (1983).
[10] F. Kopka, F. Chovanec, D-posets, Math. Slovaca 44 (1994), 21-34.
[11] W. A. J. Luxemburg, A. C. Zaanen, Riesz Spaces I, North-Holland, Amsterdam, London, 1971.
[12] M. Navara, P. Ptak, Difference posets and orthoalgebras, BUSEFAL 69 (1997), 64-69.
[13] Z. Riecanova and D. Brsel, Counterexamples in difference posets and orthoalgebras, Inter. J. Theor. Phys. 33 (1994), 133-141.
[14] G. T. Rüttimann, The approximate Jordan-Hahn decomposition, Canad. J. Math. 41 (1989), 1124-1146.
[15] G. T. Rüttimann, The Jordan-Hahn property, Proceedings of the First Winter School on Measure Theory (Slovak Academy of Sciences). Eds. A. Dvurecenskij, S. Pullmanova, Bratislava, 1988, 138-145.
[16] G. T. Rüttimann, Weakly purely finitely additive measures, Canad. J. Math. 46 (1994), 872-885.
ON NONLINEAR WAVE EQUATIONS WITH DAMPING AND SOURCE TERMS
MOHAMMAD A. RAMMAHA

ABSTRACT. In this article we present some recent results and some open questions on the long-time behavior of solutions to a large class of nonlinear wave equations. The nonlinearity in the equation features a damping term that competes with a source term. Of central interest is the relationship of the source and damping terms to the behavior of solutions in the large.
1. INTRODUCTION
Various examples of the nonlinear wave equation

$u_{tt} - \Delta u + \mathcal{Q}(x, t, u, u_t) = \mathcal{F}(x, u)$,    (1.1)

satisfying the structural conditions $v\,\mathcal{Q}(x, t, u, v) \ge 0$, $\mathcal{Q}(x, t, u, 0) = \mathcal{F}(x, 0) = 0$, and $\mathcal{F}(x, u) \sim |u|^{p-1} u$ for large $|u|$ arise in physics. For instance, if $\mathcal{Q} = 0$ and $\mathcal{F}(x, u) = u^3$, or more generally, any positive odd power of u, then the equation arises in quantum field theory (cf. Jörgens [12] and Segal [31]). On the other hand, if $\mathcal{F} = 0$ and $\mathcal{Q}(x, t, u, u_t) = |u_t| u_t$, then the equation provides a model for a classical vibrating membrane with a resistance force that is proportional to the velocity $u_t$. In this article, we focus on the long-time behavior of solutions to initial-boundary value problems and the Cauchy problem for a class of nonlinear wave equations of the form (1.1).

Throughout the paper, we assume that $\Omega$ is an open, bounded, connected domain in $\mathbb{R}^n$ with a smooth boundary $\partial\Omega = \Gamma$. Further assume $\Gamma$ is the union of two disjoint, connected (n-1)-dimensional manifolds $\Gamma_0$ and $\Gamma_1$. We consider the PDE model

$u_{tt} - \Delta u + |u|^k |u_t|^m \,\mathrm{sgn}(u_t) = |u|^{p-1} u - q(x)^2 u$, in $\Omega \times (0, T)$,    (1.2)

$u(x, 0) = u^0(x)$, $u_t(x, 0) = u^1(x)$, in $\Omega$,    (1.3)

$u(x, t) = 0$ on $\Gamma_0 \times (0, T)$, $\frac{\partial u}{\partial \nu}(x, t) = h(x, t)$ on $\Gamma_1 \times (0, T)$,    (1.4)
Date: Submitted July 22, 2001, and revised November 4, 2001. 1991 Mathematics Subject Classification. Primary: 35L05, 35L20 Secondary: 58G16. Key words and phrases, wave equations, damping and source terms, weak solutions, blowup of solutions.
where $m > 0$, $k > 0$, $p > 1$, $h \in C^1([0, \infty), L^2(\Gamma_1))$, and $\frac{\partial}{\partial \nu}$ denotes the outward normal derivative on $\Gamma$. Hypotheses on the initial data $u^0$, $u^1$ and q will be specified whenever needed. The basic questions we address in this paper are:
• If $k + m > p$, does the initial-boundary value problem (1.2)-(1.4) have a unique global solution for all appropriate initial data?
• If $k + m < p$, does there exist initial data $u^0$, $u^1$ so that the solution to (1.2)-(1.4) blows up in finite time?

It is well known that when the damping term $|u|^k |u_t|^m \,\mathrm{sgn}(u_t)$ is absent from the equation, the source term $|u|^{p-1} u$ (where $q(x) = 0$) drives the solution of (1.2)-(1.4) to blow up in finite time (cf. [6, 17, 26, 35]). In addition, if the source term $|u|^{p-1} u - q(x)^2 u$ is removed from the equation, then damping terms of various forms are known to yield existence of global solutions (cf. [2, 3, 10]). However, the interaction between the damping and source terms is often difficult to analyze, as one can see from the work in [5, 19, 21, 28, 29]. There has been an extensive body of work on the long-time behavior of solutions to various hyperbolic equations. Of particular relevance to this paper is the work of Georgiev and Todorova [5] and Levine and Serrin [19]. We also note the fundamental work of Lasiecka and Triggiani [15, 16] and Lions and Strauss [24]. For other related papers we refer the reader to [3, 10, 13, 17, 21, 22, 27], and at the same time we apologize for any omitted references.

2. PRELIMINARIES
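Let $L^2(\Omega)$, $L^2(\Gamma_1)$, etc., denote the standard Lebesgue spaces and $H^s(\Omega)$, $H^s(\Gamma_0)$, $H^s(\Gamma_1)$, ..., denote the standard Sobolev spaces. By $H^s(\Gamma)$ we mean the space $H^s(\Gamma_0) \times H^s(\Gamma_1)$, and by $H^s_{\Gamma_0}(\Gamma)$, $s > 0$, we mean the subspace of $H^s(\Gamma)$ that is given by $H^s_{\Gamma_0}(\Gamma) = \{0\} \times H^s(\Gamma_1)$. Also, for $s > 1/2$ we set

$$H^s_{\Gamma_0}(\Omega) = \{u \in H^s(\Omega) : u|_{\Gamma_0} = 0\},$$

where the evaluation on $\Gamma_0$ is taken in the sense of traces. For $u \in H^s(\Omega)$, $s > 1/2$, we denote by $\gamma u$ the trace of $u$ on $\Gamma$, i.e., $\gamma u = u|_{\Gamma}$. Also, we set $\gamma_1 u = u|_{\Gamma_1}$.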
Throughout the paper, we let $A : L^2(\Omega) \to L^2(\Omega)$ be the operator given by $A = -\Delta$ with its domain

$$\mathcal{D}(A) = \Big\{u \in H^2(\Omega) : u|_{\Gamma_0} = \frac{\partial u}{\partial\nu}\Big|_{\Gamma_1} = 0\Big\}.$$

It is well known that $A$ is positive, self-adjoint, and $A$ is the inverse of a compact operator. Moreover, $A$ has an infinite sequence of positive eigenvalues $\{\lambda_n : n = 1, 2, \ldots\}$ and a corresponding sequence of eigenfunctions $\{e_n : n = 1, 2, \ldots\}$ that forms an orthonormal basis for $L^2(\Omega)$. Namely, if $u \in L^2(\Omega)$, then $u = \sum_{n=1}^{\infty} u_n e_n$, where the convergence is in $L^2(\Omega)$, with $\|u\|^2_{L^2(\Omega)} = \sum_{n=1}^{\infty} |u_n|^2$ and $u_n = (u, e_n)_{L^2(\Omega)}$. The powers of $A$ are defined as follows: $A^s : \mathcal{D}(A^s) \subset L^2(\Omega) \to L^2(\Omega)$, $A^s u = \sum_{n=1}^{\infty} \lambda_n^s u_n e_n$, with the domain of $A^s$ given by

$$\mathcal{D}(A^s) = \Big\{u \in L^2(\Omega) : u = \sum_{n=1}^{\infty} u_n e_n, \ \sum_{n=1}^{\infty} \lambda_n^{2s} |u_n|^2 < \infty\Big\}.$$
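The spectral definition of $A^s$ can be tried out on a finite-dimensional stand-in. The following is a minimal sketch, assuming a one-dimensional finite-difference Dirichlet Laplacian in place of the operator $A$ of the paper (which carries mixed boundary conditions on a domain in $\mathbb{R}^n$); the function names `dirichlet_laplacian` and `fractional_power` are illustrative and do not come from the source.

```python
import numpy as np

def dirichlet_laplacian(n):
    """Finite-difference matrix for -d^2/dx^2 on (0, 1) with n interior nodes.

    Assumption: this symmetric positive definite matrix stands in for A.
    """
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def fractional_power(A, s):
    """Return A^s via the eigen-expansion A^s u = sum_n lambda_n^s u_n e_n."""
    lam, E = np.linalg.eigh(A)      # columns of E are orthonormal eigenvectors
    return (E * lam**s) @ E.T       # E diag(lambda^s) E^T

if __name__ == "__main__":
    A = dirichlet_laplacian(20)
    root = fractional_power(A, 0.5)
    u = np.arange(1.0, 21.0)
    # |A^{1/2} u|^2 equals sum_n lambda_n |u_n|^2, i.e. the quadratic form u^T A u.
    print(np.dot(root @ u, root @ u), u @ A @ u)
```

In this discrete setting the squared norm of $A^{1/2}u$ coincides with $\sum_n \lambda_n |u_n|^2$, the discrete counterpart of a norm defined through the eigen-expansion.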
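We remark here that the results of Grisvard [8] and Seeley [30] give the following characterization for the fractional powers of $A$:

$$\mathcal{D}(A^s) = \begin{cases} H^{2s}(\Omega); & 0 \le s < \tfrac14, \\[2pt] \big\{u \in H^{2s}(\Omega) : u|_{\Gamma_0} = 0, \ \tfrac{\partial u}{\partial\nu}\big|_{\Gamma_1} = 0\big\}; & \tfrac34 < s \le 1. \end{cases} \tag{2.1}$$

Moreover, $\mathcal{D}(A^{1/4}) \hookrightarrow H^{1/2}(\Omega)$, $\mathcal{D}(A^{3/4}) \hookrightarrow H^{3/2}(\Omega)$, and the norm $\|u\|_{H^1_{\Gamma_0}(\Omega)}$ is equivalent to $\big(\sum_{n=1}^{\infty} \lambda_n |u_n|^2\big)^{1/2}$. Therefore, we set

$$\|u\|^2_{H^1_{\Gamma_0}(\Omega)} = \sum_{n=1}^{\infty} \lambda_n |u_n|^2.$$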
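Throughout the paper we set $G(u, u') = |u|^k |u'|^m \operatorname{sgn}(u')$ and $f(x, u) = |u|^{p-1}u - q(x)^2 u$, where $u' = u_t$. We shall use the weak formulation of the problem to define what we mean by a solution to the initial-boundary value problem (1.2)-(1.4).

Definition 2.1. Let $u^0 \in H^1_{\Gamma_0}(\Omega)$, $u^1 \in L^2(\Omega)$. We say that $u$ is a weak solution to the initial-boundary value problem (1.2)-(1.4) on $[0,T]$ if $u \in L^2(0,T; H^1_{\Gamma_0}(\Omega))$, $u' \in L^2(0,T; L^2(\Omega))$ and $u$ satisfies

$$(u'(t), \phi)_{L^2(\Omega)} - (u^1, \phi)_{L^2(\Omega)} + \int_0^t \big(A^{1/2}u(s), A^{1/2}\phi\big)_{L^2(\Omega)}\, ds - \int_0^t \big(h(s), \gamma_1\phi\big)_{L^2(\Gamma_1)}\, ds + \int_0^t \Big[\big(G(u(s), u'(s)), \phi\big)_{L^2(\Omega)} - \big(f(u(s)), \phi\big)_{L^2(\Omega)}\Big]\, ds = 0 \tag{2.2}$$

for all $t \in [0,T]$ and all $\phi \in H^1_{\Gamma_0}(\Omega)$.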
3. INITIAL-BOUNDARY VALUE PROBLEMS
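Let $u \in C([0,T], H^1_{\Gamma_0}(\Omega))$ and $u' \in C([0,T], L^2(\Omega))$ be such that $u$ is a weak solution to the initial-boundary value problem (1.2)-(1.4). Let $E(t)$ denote the total energy at time $t$, i.e.,

$$E(t) := \frac12\Big(\|u'(t)\|^2_{L^2(\Omega)} + \|A^{1/2}u(t)\|^2_{L^2(\Omega)}\Big) - \frac{1}{p+1}\|u(t)\|^{p+1}_{L^{p+1}(\Omega)} + \frac12\|q\,u(t)\|^2_{L^2(\Omega)} - \big(h(t), \gamma_1 u(t)\big)_{L^2(\Gamma_1)}. \tag{3.1}$$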
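To see how the damping term enters the energy balance, here is a purely formal computation, offered as a sketch only: it assumes that $u$ is smooth enough to justify multiplying (1.2) by $u_t$ and integrating by parts (a weak solution need not be), that $h$ is $C^1$ in time, and that $\|A^{1/2}u\|_{L^2(\Omega)} = \|\nabla u\|_{L^2(\Omega)}$. It is not taken from the source.

```latex
% Formal energy identity for (1.2)-(1.4); assumes u is smooth and h is C^1 in t.
\begin{align*}
\frac{d}{dt}\Big[\tfrac12\|u_t\|_{L^2(\Omega)}^2 + \tfrac12\|\nabla u\|_{L^2(\Omega)}^2
  - \tfrac{1}{p+1}\|u\|_{L^{p+1}(\Omega)}^{p+1} + \tfrac12\|q u\|_{L^2(\Omega)}^2\Big]
 &= \int_{\Gamma_1} h\,u_t\,d\Gamma - \int_\Omega |u|^k |u_t|^{m+1}\,dx, \\
\frac{d}{dt}\big(h(t),\gamma_1 u(t)\big)_{L^2(\Gamma_1)}
 &= \big(h'(t),\gamma_1 u(t)\big)_{L^2(\Gamma_1)} + \int_{\Gamma_1} h\,u_t\,d\Gamma, \\
E'(t) &= -\int_\Omega |u|^k |u_t|^{m+1}\,dx - \big(h'(t),\gamma_1 u(t)\big)_{L^2(\Gamma_1)}.
\end{align*}
```

The first line uses $u_t = 0$ on $\Gamma_0$ and $\partial u/\partial\nu = h$ on $\Gamma_1$; subtracting the second line from the first gives the third. In particular, when $h$ does not depend on $t$, the energy is formally nonincreasing.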
The following results have been recently established in [28].

Theorem 3.1. Let $u^0 \in H^1_{\Gamma_0}(\Omega)$, $u^1 \in L^2(\Omega)$, and $h \in C^1([0,\infty), L^2(\Gamma_1))$. Assume that $q = 0$ and $k, m, p$ satisfy $k, p > 1$, $0 < m < 1$, and (3.2). Then there exists a constant $T > 0$ such that the initial-boundary value problem (1.2)-(1.4) has a unique weak solution $u$ with $u \in C([0,T), H^1_{\Gamma_0}(\Omega))$ and $u' \in C([0,T), L^2(\Omega))$. In addition, we have:
(i) If $k + m > p$, then $u$ is a global solution, i.e., $T = \infty$.
(ii) If $k + m < p$ and $E(0) < 0$, then the local solution $u$ blows up in finite time.

Remark 3.2. The existence of a local weak solution to (1.2)-(1.4) in Theorem 3.1 was established by using a standard Galerkin scheme based on the eigenfunctions of the Laplacian. However, there are several technical difficulties in the passage to the limit. One difficulty lies in showing that the sequence of approximate solutions $\{u_N\}$ satisfies

$$|u_N'(t)|^m \operatorname{sgn}(u_N'(t)) \to |u'(t)|^m \operatorname{sgn}(u'(t)) \quad \text{weakly in } L^2(\Omega).$$
Another difficulty lies in proving uniqueness of solutions, which does not follow from the theory of ordinary differential equations.

Remark 3.3. The proof of the existence of a global solution in Theorem 3.1 when $k + m > p$ relies on obtaining an energy-type estimate for the sequence of approximate solutions $\{u_N\}$ which holds for each bounded time interval $[0,T]$. In addition, the blow-up result (ii) in Theorem 3.1 holds for small data. Indeed, if the initial data are sufficiently small and $E(0)$ is merely negative, then the following upper bound for the life span $T^*$ of the local solution holds (cf. [28]):

[Display (3.3): an upper bound for $T^*$ involving the exponent $p - (k+m)$ and a parameter $\alpha$ defined as a minimum of two quantities; the formula is not legible in this copy.]

where the constant in (3.3) is positive and does not depend on the initial data.

Remark 3.4. A special case of the initial-boundary value problem (1.2)-(1.4) has been studied in [5]. Specifically, in [5] the authors studied (1.2) with homogeneous Dirichlet boundary conditions and with $k = 0$, $m, p > 1$, $q = 0$. Although the blow-up result obtained in [5] is for large data, the authors' proof can be modified to yield the same blow-up result for small initial data. Essentially, the results in [5] show that the conclusions of Theorem 3.1 hold for the case $k = 0$, $m, p > 1$, $q = 0$.

Remark 3.5. In one space dimension, where $\Omega = [0,1]$, a special case of the initial-boundary value problem (1.2)-(1.4) has been studied in [29]. Specifically, the results in [29] show that the conclusions of Theorem 3.1 hold if $m = 1$, $k > 0$, $p > 1$, and $q = 0$.

Remark 3.6. In [19], Levine and Serrin proved several abstract theorems on the blow-up of solutions to a large class of nonlinear hyperbolic equations. In particular, their results apply to the initial-boundary value problem (1.2)-(1.4) in the case when $k > 0$, $m, p > 1$ and $q = 0$. In such a case, the results of Levine and Serrin [19] yield that no local solution to (1.2)-(1.4) can be global whenever $k + m < p$ and the initial energy is negative.

We end this section by posing the following questions:

Question 3.7. For all appropriate initial data, and a suitable function $q$, does conclusion (i) of Theorem 3.1 hold for the case $k > 0$, $m, p > 1$? Does it also hold for the case 0
[...] difficulty in answering the second part of Question 3.7 lies in proving the uniqueness of solutions.

4. THE CAUCHY PROBLEM
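In this section we focus on the Cauchy problem

$$u_{tt} - \Delta u + |u|^k |u_t|^m \operatorname{sgn}(u_t) = |u|^{p-1}u - q(x)^2 u, \quad \text{in } \mathbb{R}^n \times (0,\infty), \tag{4.1}$$

$$u(x,0) = u^0(x), \quad u_t(x,0) = u^1(x), \quad \text{in } \mathbb{R}^n, \tag{4.2}$$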
where $k > 0$, $m > 0$ and $p > 1$. The initial data $u^0$ and $u^1$ are of compact support, and $q$ is a locally bounded measurable function on $\mathbb{R}^n$. Until recently there has been very little work done on the question of global existence and blow-up of solutions to the Cauchy problem for nonlinear wave equations which have damping terms present. The interaction between nonlinear damping and source terms in the Cauchy problem (4.1)-(4.2) is much more complicated than in the case of bounded domains. The key property that one has to prove for the Cauchy problem (4.1)-(4.2) is the finite speed of propagation. However, when the damping contains sub-linear terms (the case when $0 < k < 1$ or $0 < m < 1$), the damping term is not locally Lipschitz, and thus the proof of the finite speed of propagation becomes difficult. Indeed, having locally Lipschitz terms in the equation was the key tool for Strauss' argument [32] in his derivation of the finite speed of propagation. One of the key tools used in studying the global well-posedness of the Cauchy problem is the Payne and Sattinger [26] theory of potential wells. Indeed, this theory has been used by many authors in studying special cases of (4.1)-(4.2). For instance, we refer the reader to Ikehata [11], Levine and Todorova [22], Ohta [25], Vitillaro [36], and the references therein. Having in mind the finite speed of propagation, we now give the definition of a weak solution to the Cauchy problem (4.1)-(4.2).

Definition 4.1. Let $u^0 \in H^1(\mathbb{R}^n)$, $u^1 \in L^2(\mathbb{R}^n)$ be of compact support in $\mathbb{R}^n$. We say that $u$ is a weak solution to the Cauchy problem (4.1)-(4.2) on $[0,T)$ if $u \in C([0,T), H^1(\mathbb{R}^n))$, $u' \in C([0,T), L^2(\mathbb{R}^n))$ and $u$ satisfies

$$(u'(t), \phi)_{L^2(\mathbb{R}^n)} - (u^1, \phi)_{L^2(\mathbb{R}^n)} + \int_0^t (\nabla u(s), \nabla\phi)_{L^2(\mathbb{R}^n)}\, ds + \int_0^t \Big[\big(G(u(s), u'(s)), \phi\big)_{L^2(\mathbb{R}^n)} - \big(f(u(s)), \phi\big)_{L^2(\mathbb{R}^n)}\Big]\, ds = 0, \tag{4.3}$$

for all $t \in [0,T)$ and all $\phi \in H^1(\mathbb{R}^n)$ of compact support in $\mathbb{R}^n$.
For a weak solution $u$ of the Cauchy problem (4.1)-(4.2), we let $\mathcal{E}(t)$ denote the total energy at time $t$, i.e.,

$$\mathcal{E}(t) := \frac12\Big(\|u'(t)\|^2_{L^2(\mathbb{R}^n)} + \|\nabla u(t)\|^2_{L^2(\mathbb{R}^n)}\Big) - \frac{1}{p+1}\|u(t)\|^{p+1}_{L^{p+1}(\mathbb{R}^n)} + \frac12\|q\,u(t)\|^2_{L^2(\mathbb{R}^n)}. \tag{4.4}$$

The following results have been independently established in [21] and [34].

Theorem 4.2. Let

$$u^0 \in H^1(\mathbb{R}^n), \quad u^1 \in L^2(\mathbb{R}^n) \tag{4.5}$$

be compactly supported in $\mathbb{R}^n$. Assume that $k = 0$, $m > 1$, $q = 0$, and

$$p > 1 \ \text{ for } n = 1, 2, \qquad 1 < p \le \frac{n}{n-2} \ \text{ for } n \ge 3. \tag{4.6}$$

Then there exists a constant $T > 0$ such that the Cauchy problem (4.1)-(4.2) has a unique weak solution $u$, where $u \in C([0,T), H^1(\mathbb{R}^n))$, $u' \in C([0,T), L^2(\mathbb{R}^n)) \cap L^{m+1}(\mathbb{R}^n \times [0,T))$. Furthermore, the following holds:
(i) If $m > p$, then $u$ is a global solution, i.e., $T = \infty$.
(ii) If $1 < m < p$, $m > \frac{np}{n+p+1}$, and $\mathcal{E}(0) < 0$, then the local solution $u$ blows up in finite time.
(iii) If $1 < m < p$, $m < \frac{np}{n+p+1}$, and $\mathcal{E}(0)$ is negative and sufficiently large in absolute value, then the local solution $u$ blows up in finite time.

Remark 4.3. Assume that $1 < m < p$ and the conditions (4.5)-(4.6) are fulfilled. Further assume that $q$ is a locally bounded measurable function on $\mathbb{R}^n$ satisfying $\lim_{|x|\to\infty} q(x) = 0$, and

$$q(x) \ge C(1 + |x|)^{-\alpha}, \quad \text{for } x \in \mathbb{R}^n, \tag{4.7}$$

where the exponent $\alpha > 0$ is subject to the upper bound (4.8) [not legible in this copy] and $C$ is some positive constant. Then, it has been shown in [34] that the local solution of the Cauchy problem (4.1)-(4.2) blows up in finite time provided $\mathcal{E}(0) < 0$.

Finally, we pose the following question.

Question 4.4. Do the conclusions of Theorem 4.2 hold for the general case $k, m > 0$, $p > 1$?
Acknowledgment. The author is grateful to David Pitts for some helpful remarks.

REFERENCES
[1] R.A. Adams, Sobolev Spaces, Academic Press, New York, 1975.
[2] K. Agre and M.A. Rammaha, Global solutions to boundary value problems for a nonlinear wave equation in high space dimensions, Differential & Integral Equations, in press.
[3] Dang Dinh Ang and A. Pham Ngoc Dinh, Mixed problem for some semi-linear wave equation with a nonhomogeneous condition, Nonlinear Analysis. Theory, Methods & Applications 12 (1988), 581-592.
[4] J. Ball, Remarks on blow-up and nonexistence theorems for nonlinear evolution equations, Quart. J. Math. Oxford (2) 28 (1977), 473-486.
[5] V. Georgiev and G. Todorova, Existence of a solution of the wave equation with nonlinear damping and source terms, J. Differential Equations 109 (1994), 295-308.
[6] R.T. Glassey, Blow-up theorems for nonlinear wave equations, Math. Z. 132 (1973), 183-203.
[7] J. Greenberg, R. MacCamy and V. Mizel, On the existence, uniqueness and stability of solutions of the equation $\sigma'(u_x)u_{xx} + \lambda u_{xtx} = \rho_0 u_{tt}$, J. Math. Mech. 17 (1968), 707-728.
[8] P. Grisvard, Caracterisation de quelques espaces d'interpolation, Arch. Rat. Mech. Anal. 25 (1967), 40-63.
[9] P. Grisvard, Equations differentielles abstraites, Ann. Sci. Ecole Norm. Sup. 2(4) (1969), 311-395.
[10] A. Haraux and E. Zuazua, Decay estimates for some semilinear damped hyperbolic problems, Arch. Rational Mech. Anal. 100 (1988), 191-206.
[11] R. Ikehata, Some remarks on the wave equations with nonlinear damping and source terms, Nonlinear Analysis: Theory, Methods, and Applications, to appear.
[12] K. Jorgens, Das Anfangswertproblem im Grossen fur eine Klasse nichtlinearer Wellengleichungen, Math. Z. 77 (1961), 295-308.
[13] K. Kawarada, On solutions of nonlinear wave equations, J. Phys. Soc. Jap. 30 (1971), 280-282.
[14] I. Lasiecka, J.L. Lions and R. Triggiani, Nonhomogeneous boundary value problem for second order hyperbolic operators, J. Math. Pures et Appl. 65 (1986), 149-192.
[15] I. Lasiecka and R. Triggiani, A cosine operator approach to modelling $L_2(0,T;L_2(\Gamma))$ boundary input hyperbolic equations, Appl. Math. Optim. 7 (1981), 35-83.
[16] I. Lasiecka and R. Triggiani, Regularity theory of hyperbolic equations with nonhomogeneous Neumann boundary conditions. II. General boundary data, J. Differential Equations 94 (1991), 112-164.
[17] H.A. Levine, Instability and nonexistence of global solutions of nonlinear wave equations of the form $Pu_{tt} = -Au + F(u)$, Trans. Amer. Math. Soc. 192 (1974), 1-21.
[18] H.A. Levine, Some additional remarks on the nonexistence of global solutions to nonlinear wave equations, SIAM J. Math. Anal. 5 (1974), 138-146.
[19] H.A. Levine and J. Serrin, Global nonexistence theorems for quasilinear evolution equations with dissipation, Arch. Rat. Mech. Anal. 137 (1997), 341-361.
[20] H.A. Levine, S.R. Park, and J.M. Serrin, Global existence and nonexistence theorems for quasilinear evolution equations of formally parabolic type, J. Differential Equations 142 (1998), 212-229.
[21] H.A. Levine, S.R. Park, and J.M. Serrin, Global existence and global nonexistence of solutions of the Cauchy problem for a nonlinearly damped wave equation, J. Math. Anal. Appl. 228 (1998), 181-205.
[22] H.A. Levine and G. Todorova, Blow up of solutions of the Cauchy problem for a wave equation with nonlinear damping and source terms and large initial energy, preprint.
[23] J.L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications I, II, Springer-Verlag, New York-Heidelberg-Berlin, 1972.
[24] J.L. Lions and W.A. Strauss, Some non-linear evolution equations, Bull. Soc. Math. France 93 (1965), 43-96.
[25] M. Ohta, Remarks on blow up of solutions for nonlinear evolution equations of second order, preprint.
[26] L.E. Payne and D. Sattinger, Saddle points and instability of nonlinear hyperbolic equations, Israel J. Math. 22 (1975), 273-303.
[27] P. Pucci and J. Serrin, Global nonexistence for abstract evolution equations with positive initial energy, J. Differential Equations 150 (1998), 203-214.
[28] M.A. Rammaha and D.R. Pitts, Global existence and non-existence theorems for nonlinear wave equations, preprint.
[29] M.A. Rammaha and T.A. Strei, Global existence and nonexistence for nonlinear wave equations with damping and source terms, Trans. Amer. Math. Soc., in press.
[30] R. Seeley, Interpolation in $L^p$ with boundary conditions, Stud. Math. XLIV (1972), 47-60.
[31] I.E. Segal, Non-linear semigroups, Annals of Math. 78 (1963), 339-364.
[32] W. Strauss, Nonlinear Wave Equations, CBMS Regional Conference Series in Mathematics, Amer. Math. Soc., 73 (1989).
[33] R. Temam, Navier-Stokes Equations, Theory and Numerical Analysis, North-Holland, 1984.
[34] G. Todorova, The Cauchy problem for nonlinear wave equations with nonlinear damping and source terms, C. R. Acad. Sci. Paris 326 Serie I (1998), 191-196.
[35] H. Tsutsumi, On solutions of semilinear differential equations in a Hilbert space, Math. Japonicae 17 (1972), 173-193.
[36] E. Vitillaro, Global nonexistence theorems for a class of evolution equations with dissipation and applications, Arch. Rational Mech. Anal. 149 (1999), 155-182.
[37] G.F. Webb, Existence and asymptotic behavior for a strongly damped nonlinear wave equation, Can. J. Math. 32 (1980), 631-643.

DEPARTMENT OF MATHEMATICS AND STATISTICS, UNIVERSITY OF NEBRASKA-LINCOLN, LINCOLN, NE 68588-0323, USA
E-mail address: rammaha@math.unl.edu
APPROXIMATION THEORY FOR PARAMETER IDENTIFICATION IN NONLINEAR DELAY EVOLUTION EQUATIONS
Azmy S. Ackleh
Department of Mathematics and Statistics
Texas Tech University
Lubbock, Texas 79409

Simeon Reich
Department of Mathematics
The Technion-Israel Institute of Technology
32000 Haifa, Israel
ABSTRACT
First, a brief survey of recent results concerning approximation methods for parameter identification in first order evolution equations is presented. These results are then extended to a class of nonautonomous nonlinear evolution equations with delay. Existence and uniqueness of solutions to these delay evolution equations are established. A convergence theory for Galerkin approximations to inverse problems involving the identification of parameters in these equations is given. Finally, an application to a nonautonomous nonlinear delay reaction-diffusion equation is discussed.
1 Survey of Recent Results
The goal of this section is to give a brief survey of results concerning approximation techniques for parameter identification in first order evolution equations. There has recently been a considerable amount of research devoted to the study of such problems (see, for example, [3, 5, 7, 10, 14, 15, 20] and the references cited therein). Such evolution equations arise in several scientific and technological fields including biology, physics and engineering. The main steps in developing an approximation theory for parameter identification can be summarized as follows: First, devise an approximation scheme that computes the solution to the evolution equation; second, choose a cost function that compares the computed solution with the observed data; third, iterate over the admissible parameter space until a minimizer of the cost function is reached; finally, show that the sequence of computed minimizers converges in some sense to the true one.

In the paper [9] the identification problem involving the following linear nonhomogeneous abstract equation was studied:

$$\dot u(t;q) + A(q)u(t;q) = f(t;q), \qquad u(0) = u_0(q), \tag{1.1}$$

where $q \in Q$ (a compact metric space) and the time-independent operator $A(q)$ is the infinitesimal generator of a $C_0$ semigroup $T(t;q)$ on a Hilbert space $H$. A convergence theory for Galerkin approximations to inverse problems involving these equations was given. In [16] the same problem was considered. However, a new type of approximation scheme (the so-called "weak-tau" method) for least-squares parameter estimation associated with the evolution equation (1.1) was proposed and then analyzed both theoretically and computationally. These results were extended to the case where $A(q)$ is a nonlinear operator in the paper [14]. Numerical results corroborating that theory were presented in [11]. To establish the convergence of the computed minimizers in the above papers, abstract approximation results for evolution systems in Banach space (such as the Trotter-Kato theorem (see [24]) or its nonlinear analog (see, e.g., [19, 21])) were applied.

In the paper [15] the authors considered a nonautonomous identification problem for an abstract evolution equation of type (1.1) with $A$ being a time-dependent (i.e., $A = A(t;q)$), nonlinear and hemicontinuous operator which is continuous in the parameter $q$. Furthermore, the operator was assumed to satisfy certain monotonicity, boundedness and time-measurability conditions. To establish convergence of computed minimizers, the authors relied on the theory of maximal monotone operators in Banach spaces (see, e.g., [17, 18]). This is a different approach from the one used in [14]. Although the approach in [14] can be applied to the nonautonomous case, this would require the resolvents of the time-dependent operators to satisfy a Lipschitz-like condition with respect to the time variable (see [14]). However, a condition on the resolvent is, in general, not easily verified for the infinite-dimensional system and can be especially difficult to check for the sequence of approximating finite-dimensional systems.
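The four steps summarized at the start of this section (approximate the solution, define a cost, minimize over the admissible parameters, and study the computed minimizers) can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the method of any cited paper: it takes the linear heat equation $u_t = q\,u_{xx}$ on $(0,\pi)$ with Dirichlet conditions, a sine Galerkin basis (for which the Galerkin system decouples), a least-squares cost, and a plain grid search over a compact interval standing in for $Q$. All function names are hypothetical.

```python
import numpy as np

def galerkin_solution(q, c0, times):
    """Coefficients of the N-mode Galerkin solution of u_t = q*u_xx on (0, pi).

    Basis: sin(k x), k = 1..N (Dirichlet conditions). In this basis each
    mode evolves independently: c_k(t) = c_k(0) * exp(-q k^2 t).
    Returns an array of shape (len(times), N).
    """
    k = np.arange(1, len(c0) + 1)
    return c0 * np.exp(-q * np.outer(times, k**2))

def cost(q, c0, times, z):
    """Least-squares cost: sum_i of the squared L2(0, pi) distance between
    the computed solution and the observation at time t_i, both given by
    their sine coefficients (the integral of sin^2(kx) over (0, pi) is pi/2)."""
    diff = galerkin_solution(q, c0, times) - z
    return 0.5 * np.pi * np.sum(diff**2)

def identify(c0, times, z, Q_grid):
    """Grid search over a compact admissible set (a stand-in for Q)."""
    values = [cost(q, c0, times, z) for q in Q_grid]
    return Q_grid[int(np.argmin(values))]

if __name__ == "__main__":
    c0 = np.array([1.0, 0.5, 0.25])           # initial sine coefficients
    times = np.array([0.1, 0.5, 1.0])         # observation times
    z = galerkin_solution(0.7, c0, times)     # synthetic data, true q = 0.7
    Q_grid = np.linspace(0.1, 2.0, 191)
    print(identify(c0, times, z, Q_grid))
```

A grid search is used only because it makes the role of compactness of the admissible set transparent; any optimizer could replace it. The fourth step, convergence of the computed minimizers as the approximation is refined, is the analytical question that the theory surveyed here addresses.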
In the paper [5] the authors considered an identification problem similar to that in [9] with time-dependent operators $A(t;q)$ satisfying coercivity, continuity in the parameter, and boundedness conditions. The novelty of that paper is that convergence results were established for the identification of discontinuous parameters. Their theory relies on the compact embedding of the space of bounded variation functions $BV([0,T]; Q)$ into $L^1(0,T; Q)$. In [6] the authors studied the parameter identification problem for the following semilinear nonautonomous evolution equation:

$$\dot u(t;q) + A(t;q)u(t;q) = F(t, u(t); q), \qquad u(0) = \xi(q). \tag{1.2}$$

Here $A(t;q)$ is a linear time-dependent operator which again satisfies coercivity, boundedness and continuity in the parameter conditions. The function $F$ is assumed to be locally Lipschitz continuous. These results were extended in [7] to the case where $A(t;q)$ is a nonlinear hemicontinuous operator satisfying the same conditions as in the paper [15]. The results in [7] are established using the theory in [15] together with a Picard type iteration. A detailed numerical study supporting this theory was presented in [4]. The identification problem involving the following autonomous initial value problem was studied in [8]:

$$\dot u(t;q) + A(q)u(t;q) = G(u;q)(t), \qquad u(0) = \xi(q). \tag{1.3}$$
Here the time-independent operator $A(q)$ is, in general, nonlinear and $G(u;q)$ is a nonlocal, nonlinear mapping satisfying a Lipschitz-like condition (see condition (A5) in Section 2). In [3] the focus was on the identification problem where the right-hand side in (1.2) is replaced with a nonlocal mapping $G$ satisfying similar conditions to those imposed on the right-hand side of (1.3). The motivation for considering such nonlocal evolution equations arises from several applications, including nonlinear (autonomous/nonautonomous) Volterra integral equations. The approach used in [3] is in the spirit of [7] and differs from the method used in [8] for the time-independent case. Numerical implementation of the methods discussed in [8] and [3] was presented in [1] and [2], respectively.

Recently the authors of [12] studied the well-posedness of the following delay abstract semilinear equation on the Hilbert space $H$:

$$\dot u(t;q) + A(q)u(t;q) + A_D(q)u(t - r_1; q) + g(u;q)(t) + g_D(u(t - r_2); q) = f(t;q), \qquad u(s) = \xi(s;q), \quad s \in [-r, 0]. \tag{1.4}$$

Such a delay equation arises in advanced toxicokinetic modeling (see [12] for an example of a problem). Here, the linear time-independent operators $A(q)$ and $A_D(q)$ satisfy similar conditions to those in [5]. The nonlinear function $g : H \to H$ is assumed to be convex with a Frechet derivative $g'$ satisfying $\|g'(\phi)\| \le M$ for all $\phi \in H$, while the function $g_D$ is assumed to be a nonlinear mapping from $H$ into $H$. The approach used in [12] to establish existence and uniqueness of weak solutions relies on a priori estimates of Galerkin approximations. In the paper [13] the problem of identifying parameters in (1.4) was discussed and convergence theory for the computed parameters along with numerical results were presented.

The goal of the present paper is to extend these results in several important directions. In our treatment we will study a nonautonomous evolution equation. We assume that the operators $A$ are nonlinear and time-dependent. Furthermore, we consider a more general function $g$ and show that the convexity condition imposed on the function $g$ is not necessary. This is important from the applications point of view since the model that motivated the work in the papers [12, 13] does not satisfy such a convexity condition. The authors of [12, 13] discuss this point. In addition, they show that they can remove the convexity condition on $g$ when additional regularity is imposed on the forcing function $f$. Our results indicate that one can remove this convexity condition without imposing any additional regularity on the forcing term $f$.

Our paper is organized as follows. In Section 2 we state the delay problem and formulate our precise assumptions. In Section 3 we discuss the existence and uniqueness of solutions to our abstract delay differential equation. Section 4 is devoted to results concerning the convergence of minimizers of the finite dimensional approximate cost functionals to a minimizer of the infinite dimensional one. In Section 5 we apply this theory to an example that arises in reaction-diffusion equations.
2 The Delay Problem
We consider the following abstract parameter identification problem:

(ID) Given observations $z \in Z$, find parameters $q \in Q$ which minimize the performance index $J(q) = \Phi(u(\cdot;q); z)$, where $u(\cdot;q)$ is the solution to the following initial value problem:

$$\dot u(t;q) + A(t;q)u(t;q) = G(u;q)(t) + F_D(t, u(t-r); q), \qquad u(s) = \xi(s;q), \quad s \in [-r, 0]. \tag{2.1}$$
Here, $T > 0$ is a fixed value, $t \in (0,T)$ and $0 < r < T$. The following notation will be used throughout the discussion. Let $\mathcal{D}$ be a metric space with $Q$ (the admissible parameter set) a compact subset of $\mathcal{D}$. Let the observation space $Z$ be a normed linear space with norm $|\cdot|_Z$. Let $H$ be a Hilbert space with inner product $(\cdot,\cdot)$ and corresponding induced norm $|\cdot|$. Let $V$ be a reflexive Banach space with norm $\|\cdot\|$ which is densely and continuously embedded into $H$. The latter assumption implies that there exists a constant $\mu > 0$ for which $|\phi| \le \mu\|\phi\|$ for all $\phi \in V$. Let $V^*$ be the space of continuous linear functionals defined on $V$ and denote the usual dual space norm on $V^*$ by $\|\cdot\|_*$. Identifying $H$ with its dual, we have $V \subset H = H^* \subset V^*$ with $H$ densely and continuously embedded into $V^*$. It follows that $\|\phi\|_* \le \mu|\phi|$ for all $\phi \in H$ and that $\|\phi\|_* \le \mu^2\|\phi\|$ for all $\phi \in V$. For
(Aft; q)4> - A(t; q)f,
P(i;?M|.
where 7 G L^O.TsR "). Furthermore, for each u G C([0,T];H), the map q -» G(u;g) is continuous from Q cT> into L2(0, T; i?).
(A6) For each $q \in Q$, the map $t \to F_D(t, \eta; q)$ is strongly measurable for all $\eta \in V$.
(A7) There exists a positive constant $\beta_2$ which does not depend on $q \in Q$ or $t \in [0,T]$ such that for any $\eta_1, \eta_2 \in V$, $\|F_D(t,\eta_1;q) - F_D(t,\eta_2;q)\|_* \le \beta_2\|\eta_1 - \eta_2\|$, a.e. $t \in [0,T]$.
(A8) There exists a $\beta_3 \in L^2(0,T;\mathbb{R}^+)$ which does not depend on $q \in Q$ such that $\|F_D(t,0;q)\|_* \le \beta_3(t)$ for a.e. $t \in [0,T]$.
(A9) For each $\eta \in V$, the map $q \to F_D(t,\eta;q)$ is continuous from $Q \subset \mathcal{D}$ into $V^*$ for almost every $t \in [0,T]$.

For each $q \in Q$, let $\xi(\cdot;q) \in L^2(-r,0;V)$ with $\dot\xi(\cdot;q) \in L^2(-r,0;V^*)$ and assume that the mapping $q \to \xi(\cdot;q)$ is continuous from $Q \subset \mathcal{D}$ into $L^2(-r,0;V)$. For each $z \in Z$, we assume that the mapping $\Phi(\cdot;z)$ is defined on $L^2(0,T;V)$ with range in $\mathbb{R}^+$ and that it is continuous when restricted to one or the other of the two spaces $C([0,T];H)$ or $L^2(0,T;V)$ endowed with their respective usual topologies.
3 Existence and Uniqueness
In this section we define the notion of weak solutions and establish the existence and uniqueness of such solutions. To this end, recall that a function $v \in L^2(0,T;V)$ with $\dot v \in L^2(0,T;V^*)$ is continuous from $[0,T]$ to $H$ and absolutely continuous from $[0,T]$ to $V^*$ (cf. [22], p. 19 and [25], p. 379). By a solution to the initial value problem (2.1) on the interval $[0,T]$ we mean a function $u(\cdot;q) \in L^2(0,T;V)$ with $\dot u(\cdot;q) \in L^2(0,T;V^*)$ which satisfies (2.1) for almost every $t \in [0,T]$. Clearly in this section we may and will ignore the dependence on the parameter $q$. We now prove the following theorem.

Theorem 3.1 The initial value problem (2.1) has a unique solution on the interval $[0,T]$.

Proof. Consider first the interval $[0,r]$. Let $f_1(t) = F_D(t, \xi(t - r))$, a.e. $t \in [0,r]$. Clearly, $f_1 \in L^2(0,r;V^*)$. On this interval, solving our system is equivalent to solving the following initial value problem:

$$\dot u_1(t) + A(t)u_1(t) = G(u_1)(t) + f_1(t), \qquad u_1(0) = \xi(0). \tag{3.1}$$

It follows from the results in [3] that (3.1) has a unique solution on $[0,r]$. Hence, so does the problem (2.1). Now let $f_2(t) = F_D(t, u_1(t - r))$, a.e. $t \in [r, 2r]$. Clearly, $f_2 \in L^2(r, 2r; V^*)$ is a known function on the interval $[r, 2r]$. One can easily observe that solving our system (2.1) on $[r, 2r]$ is equivalent to solving the following initial value problem:

$$\dot u_2(t) + A(t)u_2(t) = G(u_2)(t) + f_2(t), \qquad u_2(r) = u_1(r). \tag{3.2}$$

Again by the results in [3], problem (3.2) has a unique solution on $[r, 2r]$ and so does (2.1). Continuing by stepping the solution on intervals of length $r$, we can construct $u$, the unique solution to (2.1), the restriction of which to $[(n-1)r, nr]$ is equal to $u_n$. Here $u_n$ is the solution to the initial value problem analogous to (3.2), where $f_2(t)$ is replaced by $f_n(t) = F_D(t, u_{n-1}(t - r))$, and the given initial data are $u_n((n-1)r) = u_{n-1}((n-1)r)$. This establishes our result. •

Remark 3.2 We note that the assumption (A7) imposed on the function $F_D$ can be relaxed for the purpose of proving Theorem 3.1. In fact, we can replace (A7) with the following boundedness condition:

(A7') There exists a constant $\beta_2 > 0$ which does not depend on $q \in Q$ or $t \in [0,T]$ such that $\|F_D(t,\eta;q)\|_* \le$ [...] for all $\eta \in V$ and a.e. $t \in [0,T]$.

This is because by assumption (A7') we are still able to conclude that each $f_n$ defined in the proof of Theorem 3.1 belongs to $L^2((n-1)r, nr; V^*)$.
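The stepping argument in the proof of Theorem 3.1 has a direct numerical counterpart, often called the method of steps. The sketch below is an illustration only, under simplifying assumptions that are not in the source: the state is scalar, $A(t)$ is multiplication by a constant `a`, the nonlocal term $G$ is dropped, and each interval is integrated with Heun's second-order scheme. The name `solve_by_steps` is hypothetical.

```python
import numpy as np

def solve_by_steps(a, F_D, history, r, n_intervals, m=1000):
    """Integrate u'(t) + a*u(t) = F_D(t, u(t - r)) on [0, n_intervals * r].

    history(s) supplies u(s) for s in [-r, 0]. Every delay interval is cut
    into m steps of size h = r / m, so u(t - r) always sits on a grid point.
    """
    h = r / m
    # Grid values of the history on [-r, 0]; index m corresponds to t = 0.
    u = [history(-r + j * h) for j in range(m + 1)]
    for n in range(1, n_intervals + 1):
        start = len(u) - 1                 # grid index of t = (n - 1) * r
        t0 = (n - 1) * r
        # On [(n-1)r, nr] the delayed term is a known forcing f_n(t),
        # built from the solution already computed on the previous interval.
        f_n = [F_D(t0 + j * h, u[start - m + j]) for j in range(m + 1)]
        for j in range(m):
            uj = u[start + j]
            k1 = -a * uj + f_n[j]
            k2 = -a * (uj + h * k1) + f_n[j + 1]
            u.append(uj + 0.5 * h * (k1 + k2))   # Heun (explicit trapezoid) step
    t = np.linspace(-r, n_intervals * r, len(u))
    return t, np.array(u)

if __name__ == "__main__":
    # u'(t) = -u(t - 1) with u = 1 on [-1, 0]:
    # u(t) = 1 - t on [0, 1] and u(t) = 1 - t + (t - 1)^2 / 2 on [1, 2].
    t, u = solve_by_steps(0.0, lambda t, v: -v, lambda s: 1.0, 1.0, 2, m=100)
    print(t[-1], u[-1])
```

As in the proof, nothing on the current interval depends on unknown delayed values: the forcing `f_n` is assembled entirely from the previous interval before the interval is integrated.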
4 Convergence Theory
For each $N = 1, 2, \ldots$, let $H^N$ be a finite-dimensional subspace of $H$ which is also contained in $V$. Let $P^N : H \to H^N$ denote the orthogonal projection of $H$ onto $H^N$. We assume the following condition regarding our approximation elements:

(A10) For each $\phi \in V$, $\lim_{N\to\infty} \|P^N\phi - \phi\| = 0$ [...] all $\phi \in H$.
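A concrete instance of the projections $P^N$ may help fix ideas. The sketch below is illustrative only and rests on assumptions not made in the source: it takes $H = L^2(0,\pi)$, $H^N = \operatorname{span}\{\sin kx : k = 1,\ldots,N\}$, and a midpoint quadrature rule, and it measures the error in the $H$-norm rather than the $V$-norm that appears in (A10). The function name `project` is hypothetical.

```python
import numpy as np

def project(f, N, n_quad=2000):
    """L2(0, pi)-orthogonal projection of f onto span{sin(kx): k = 1..N}.

    Returns the quadrature grid, the projected values on it, and the L2
    error, all computed with a midpoint rule on n_quad cells.
    """
    h = np.pi / n_quad
    x = (np.arange(n_quad) + 0.5) * h               # cell midpoints
    fx = f(x)
    PNf = np.zeros_like(x)
    for k in range(1, N + 1):
        e_k = np.sqrt(2.0 / np.pi) * np.sin(k * x)  # orthonormal in L2(0, pi)
        PNf += (h * np.sum(fx * e_k)) * e_k         # (f, e_k) e_k
    err = np.sqrt(h * np.sum((fx - PNf) ** 2))
    return x, PNf, err

if __name__ == "__main__":
    f = lambda x: x * (np.pi - x)
    for N in (1, 3, 5, 9):
        print(N, project(f, N)[2])
```

For this particular $f$, which vanishes at both endpoints, the printed errors decrease as $N$ grows.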
For each $N = 1, 2, \ldots$, $q \in Q$, and a.e. $t \in [0,T]$, we define the operator $A^N(t;q) : H^N \to H^N$ to be the restriction of the operator $A(t;q)$ to $H^N$ with the image in $V^*$ of $\phi^N \in H^N$, $A(t;q)\phi^N$, considered to be a linear functional on $H^N$. Identifying $H^N$ with its dual, for $\phi^N \in H^N$ we obtain $A^N(t;q)\phi^N = \psi^N$, where $\psi^N$ is that element in $H^N$ which satisfies $\langle A(t;q)\phi^N, \chi^N\rangle = (\psi^N, \chi^N)$ for all $\chi^N \in H^N$. A similar definition holds for the approximating functions $F_D^N(t,\cdot;q) : H^N \to H^N$, $N = 1, 2, \ldots$, $q \in Q$, and almost every $t \in [0,T]$. Finally, we define the projections $\Pi^N : L^2(0,t;H) \to L^2(0,t;H^N)$ by $(\Pi^N g)(s) = P^N(g(s))$, a.e. $s \in [0,t]$, and for each $N = 1, 2, \ldots$ and $q \in Q$, we define $\xi^N(q) \in L^2(-r,0;H^N)$ by $(\xi^N(q))(s) = P^N(\xi(s;q))$, for a.e. $s \in [-r,0]$. Now, consider the following sequence of parameter identification problems:

(ID$^N$) Given observations $z \in Z$, find parameters $q^N \in Q$ which minimize the performance index $J^N(q) = \Phi(u^N(\cdot;q); z)$, where $u^N(\cdot;q)$ is the solution to the initial value problem

$$\dot u^N(t;q) + A^N(t;q)u^N(t;q) = \Pi^N G(u^N(\cdot;q);q)(t) + F_D^N(t, u^N(t-r;q); q), \qquad u^N(s) = \xi^N(s;q), \quad s \in [-r,0], \tag{4.1}$$

corresponding to $q \in Q$. The evolution problem (4.1) in $H^N$ is the standard Galerkin approximation to (2.1). The existence and uniqueness of the solution to (4.1) on the interval $[0,T]$ follows from similar arguments to those used in the proof of Theorem 3.1. Now we have the following theorem.

Theorem 4.1 Let $\{q^N\}$ be a sequence in $Q$ with $\lim_{N\to\infty} q^N = q$. If conditions (A1)-(A10) are satisfied, then
(i) $\lim_{N\to\infty} u^N(\cdot;q^N) = u(\cdot;q)$ in $C([0,T];H)$ and $L^2(0,T;V)$;
(ii) for each fixed $K = 1, 2, \ldots$, $\lim_{N\to\infty} u^K(\cdot;q^N) = u^K(\cdot;q)$ in $C([0,T];H)$ and $L^2(0,T;V)$.

Proof. We will only prove (i). Similar arguments can be used to establish (ii). For convenience we will use the notation $v^N(\cdot;q^N) = v^N(\cdot)$ and $v(\cdot;q) = v(\cdot)$. Consider first the interval $[0,r]$. Let $f_1^N(t) = F_D^N(t, \xi^N(t - r); q^N)$, a.e. $t \in [0,r]$. Then on this interval, a solution to (4.1) (with $q$ replaced by $q^N$) is equivalent to a solution to the following system:

$$\dot u_1^N(t) + A^N(t)u_1^N(t) = \Pi^N G(u_1^N; q^N)(t) + f_1^N(t), \qquad u_1^N(0) = \xi^N(0). \tag{4.2}$$
247 Using similar techniques to those used in the proof of Theorem 4.1 in [3], we can show that the present Theorem 4.1 holds on the interval [0, r] for the solutions wf and «i of the corresponding problems (4.2) and (3.1), respectively, provided that for any fixed t e [0, T], f? -> /i in L2(0, t; V) asN->oo.
(4.3)
To prove (4.3), consider
HA" - A I I L W ) = HJtfO, £";«")) - FD(., ft 9)11^0,^.) <\\FD(;e;qN))-FD(;t;qN)l\mo,ty*) + \\FD(-A;QN))
- f D (-,f; 9)11^(0,^-)
+
//.
The first term $I \to 0$ as $N \to \infty$ because $\|\xi^N - \xi\|_{L^2(-r,0;V)} \to 0$, while the second term $II \to 0$ as $N \to \infty$ by the continuity of $F_D$ in the parameter $q$ (see (A9)). Thus it follows that Theorem 4.1 holds on the interval $[0, r]$. Next consider the interval $[r, 2r]$. Clearly $f_2^N(t) = F_D(t, u_1^N(t - r); q^N)$ for a.e. $t \in [r, 2r]$ is a known function on the interval $[r, 2r]$. Hence, the solution to (4.1) (with $q$ replaced by $q^N$) on the interval $[r, 2r]$ is equivalent to the solution of the following system:
$\dot{u}_2^N(t) + A^N(t)u_2^N(t) = \Pi^N G^N(u_2^N)(t) + f_2^N(t)$, (4.4)
$u_2^N(r) = u_1^N(r)$.
Since we proved that $u_1^N \to u_1$ in $C([0,r]; H)$ and $L^2(0, r; V)$, we can show that for any fixed $t \in [r, 2r]$, $f_2^N \to f_2$ in $L^2(r, t; V^*)$ as $N \to \infty$, by employing similar arguments to those used before. Hence our theorem holds on the interval $[r, 2r]$ for the solutions $u_2^N$ and $u_2$ to the corresponding problems (4.4) and (3.2), respectively. Continuing to step the proof of Theorem 4.1 on intervals of length $r$, we establish the desired result in full. □
Using Theorem 4.1 we can conclude that the inverse problem (ID$^N$) has a solution. Indeed, since $u^N(\cdot; q)$ depends continuously on $q$ (by Theorem 4.1), we know that the approximate cost functional $J^N$ is continuous on $Q$. Since $Q$ is assumed to be a compact metric space, the existence of such a solution $q^N$ follows immediately. Furthermore, we can conclude that the infinite dimensional identification problem (ID) has a solution which is the limit of a subsequence of $\{q^N\}$. In fact, by the compactness of $Q$ there exists a convergent subsequence $\{q^{N_j}\}$ of $\{q^N\}$. Denoting its limit by $\bar q$ we obtain
$J(\bar q) = \Phi(u(\cdot; \bar q); z) = \Phi(\lim_{j\to\infty} u^{N_j}(\cdot; q^{N_j}); z) = \lim_{j\to\infty} J^{N_j}(q^{N_j})$
$\le \lim_{j\to\infty} J^{N_j}(q) = \lim_{j\to\infty} \Phi(u^{N_j}(\cdot; q); z) = \Phi(\lim_{j\to\infty} u^{N_j}(\cdot; q); z) = \Phi(u(\cdot; q); z) = J(q)$
for every $q \in Q$. Hence $\bar q$ is indeed a solution of the problem (ID).
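The two ingredients of the argument above, stepping a delay equation forward on intervals of length $r$ and minimizing a cost over a compact parameter set, can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: the scalar equation $u'(t) = -q\,u(t) + \sin(u(t-r))$, the constant history, the explicit Euler scheme, and the finite grid standing in for the compact set $Q$ are all hypothetical choices and are not taken from the paper.

```python
import numpy as np

def solve_delay_ode(q, r=1.0, t_end=3.0, steps_per_delay=200):
    """Explicit Euler for the toy equation u'(t) = -q*u(t) + sin(u(t - r)).

    The history is u = 1 on [-r, 0] (an assumption made for this example).
    Because the delayed term only uses values at least r in the past, the
    forcing on [r, 2r] is known once the solution on [0, r] is computed,
    which is the stepping idea used in the proof.
    """
    m = steps_per_delay              # number of grid steps per delay interval
    h = r / m
    n_steps = int(round(t_end / h))
    u = np.empty(m + n_steps + 1)    # u[j] approximates u((j - m) * h)
    u[: m + 1] = 1.0                 # history on [-r, 0]
    for k in range(m, m + n_steps):
        delayed = u[k - m]           # value at time t_k - r, already known
        u[k + 1] = u[k] + h * (-q * u[k] + np.sin(delayed))
    return u[m:]                     # values on [0, t_end]

def cost(q, obs_idx, z):
    """Least-squares misfit between the model and the observations z."""
    u = solve_delay_ode(q)
    return float(np.sum((u[obs_idx] - z) ** 2))

def identify(q_grid, obs_idx, z):
    """Minimize the cost over a finite grid standing in for the compact set Q."""
    costs = [cost(q, obs_idx, z) for q in q_grid]
    return q_grid[int(np.argmin(costs))]

if __name__ == "__main__":
    q_grid = np.linspace(0.5, 2.0, 31)
    obs_idx = np.array([100, 300, 500])          # observation times 0.5, 1.5, 2.5
    z = solve_delay_ode(q_grid[10])[obs_idx]     # synthetic, noise-free data
    print("identified parameter:", identify(q_grid, obs_idx, z))
```

With noise-free synthetic data generated from a grid point, the grid search returns that grid point; with noisy data it would return the grid point of least misfit.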
5 Applications
In this section we discuss a general example of a problem that arises in the study of nonlinear reaction-diffusion equations (see, e.g., [23], p. 86 for a semilinear version of this problem). Let $\Omega$ be a bounded region in $\mathbb{R}^l$ with a smooth boundary. Let $F : [0,T] \times \mathbb{R} \times \mathbb{R}^l \to \mathbb{R}$ satisfy the following:
(a) $F(\cdot, \sigma, \zeta)$ is measurable on $[0,T]$ for all $\sigma \in \mathbb{R}$ and $\zeta \in \mathbb{R}^l$.
(b) There are positive constants $\beta_1$ and $\beta_2$ such that for each $t \in [0,T]$, $|F(t,\sigma_1,\zeta_1) - F(t,\sigma_2,\zeta_2)| \le \beta_1|\sigma_1 - \sigma_2| + \beta_2|\zeta_1 - \zeta_2|$ for all $\sigma_1, \sigma_2 \in \mathbb{R}$ and $\zeta_1, \zeta_2 \in \mathbb{R}^l$.
(c) There is a function $\beta_3 \in L^2(0,T;\mathbb{R}_+)$ such that $|F(t,0,0)| \le \beta_3(t)$ for all $t \in [0,T]$.
We consider the following parameter identification problem: Given observations $z(t_i, x)$ at times $\{t_i\}_{i=1}^k$, with $0 \le t_1 < t_2 < \dots < t_k \le T$, and a position $x \in \Omega$, find a parameter $q \in Q$ (to be specified below) which minimizes the performance index
$\Phi(u; z) = \sum_{i=1}^k \int_\Omega |u(t_i, x; q) - z(t_i, x)|^2\,dx$,
where for each $q \in Q$, $u(q) = u(t, x; q)$ is the parameter dependent solution of the following delay reaction-diffusion equation:
$\frac{\partial u}{\partial t} - \nabla \cdot \{q(t, x, \nabla u(t,x)) \nabla u(t,x)\} = G(u(\cdot, x))(t) + F(t, u(t - r, x), \nabla u(t - r, x))$ (5.1)
with Dirichlet boundary conditions. We take $H = L^2(\Omega)$ and $V = H_0^1(\Omega)$. We assume that the mapping $G$ satisfies (A5). For each $q \in Q$ and almost every $t \in [0,T]$, we define the operator $A(t; q) : V \to V^*$ by
$\langle A(t; q)\varphi, \psi \rangle = \int_\Omega q(t, x, \nabla\varphi(x)) \nabla\varphi(x) \cdot \nabla\psi(x)\,dx$ for all $\varphi, \psi \in V$.
Let $\mathcal{V} = L^\infty([0,T] \times \Omega \times \mathbb{R}^l)$, and set the observation space $Z = C([0,T]; L^2(\Omega))$ with the norm $|z|_Z = \sup_{0 \le t \le T} (\int_\Omega |z(t,x)|^2\,dx)^{1/2}$. Choose $Q$ to be a compact subset of $\mathcal{V}$ with the property that $q \in Q$ if
1. The mapping $\theta \to q(t, x, \theta)$ is $C^1$ for almost every $(t,x) \in [0,T] \times \Omega$, and
2. There exists a constant $\delta > 0$ which does not depend on $q \in Q$ for which $(q(t,x,\theta)\theta - q(t,x,\eta)\eta) \cdot (\theta - \eta) \ge \delta|\theta - \eta|^2$ for almost every $(t,x) \in [0,T] \times \Omega$ and every $\theta, \eta \in \mathbb{R}^l$.
In [15] the hemicontinuity property and the conditions (A1)-(A4) have been verified for this operator $A$. We now show that the function $F$ considered above satisfies (A6)-(A9). To this end, define a function $F_D(t, \eta) : [0,T] \times V \to V^*$ by $[F_D(t,\eta)](x) = F(t, \eta(x), \nabla\eta(x))$ for all $x \in \Omega$. Since $F_D$ does not depend on $q$, (A9) is obviously satisfied. Conditions (A6) and (A8) also follow immediately from the above assumptions on $F$. Let $\mu_1$ be the embedding constant of $L^2(\Omega)$ into $H^{-1}(\Omega)$ and $\mu_2$ be the embedding constant of $H_0^1(\Omega)$ into $L^2(\Omega)$. Then using (b) we see that for any $\varphi, \psi \in H_0^1(\Omega)$,
$\|F_D(t, \varphi) - F_D(t, \psi)\|_{H^{-1}(\Omega)} \le \beta\|\varphi - \psi\|_{H_0^1(\Omega)}$,
where $\beta = \beta_1\mu_1\mu_2 + \beta_2\mu_1$. Consequently, all the results of the previous sections apply to equation (5.1).
ACKNOWLEDGMENTS The work of the second author was partially supported by the Israel Science Foundation founded by the Israel Academy of Sciences and Humanities (Grant 592/00), by the Fund for the Promotion of Research at the Technion, and by the Technion VPR Fund - E. and M. Mendelson Research Fund.
References
[1] A.S. Ackleh, S. Aizicovici, R.R. Ferdinand and S. Reich, "Numerical studies of parameter estimation techniques for nonlinear Volterra equations", Theory and Practice of Control and Systems (A. Tornambe, G. Conte and A.M. Perdon, eds.), World Scientific, Singapore, 1998, 310-315.
[2] A.S. Ackleh, S. Aizicovici, R.R. Ferdinand and S. Reich, "Parameter identification in a nonautonomous nonlinear Volterra integral equation", Proceedings of the 7th Mediterranean Conference on Control and Automation, Haifa, Israel, 1999, 2200-2206.
[3] A.S. Ackleh, S. Aizicovici and S. Reich, "Parameter identification in nonlocal nonlinear evolution equations", Numerical Functional Analysis and Optimization, 21 (2000), 553-570.
[4] A.S. Ackleh, R.R. Ferdinand and S. Reich, "Numerical studies of parameter estimation techniques for nonlinear evolution equations", Kybernetika, 34 (1998), 693-712.
[5] A.S. Ackleh and B.G. Fitzpatrick, "Estimation of discontinuous parameters in general nonautonomous parabolic systems", Kybernetika, 32 (1996), 543-556.
[6] A.S. Ackleh and B.G. Fitzpatrick, "Estimation of time dependent parameters in general parabolic evolution systems", Journal of Mathematical Analysis and Applications, 203 (1996), 464-480.
[7] A.S. Ackleh and S. Reich, "Parameter estimation in nonlinear evolution equations", Numerical Functional Analysis and Optimization, 19 (1998), 933-947. Corrigendum, Numerical Functional Analysis and Optimization, 20 (1999), 1003-1004.
[8] S. Aizicovici, S. Reich and I.G. Rosen, "An approximation theory for the identification of nonlinear Volterra equations", Numerical Functional Analysis and Optimization, 14 (1993), 213-227.
[9] H.T. Banks and K. Ito, "A unified framework for approximation and inverse problems for distributed parameter systems", Control-Theory and Advanced Technology, 4 (1988), 73-90.
[10] H.T. Banks and K. Kunisch, Estimation Techniques for Distributed Parameter Systems, Birkhauser, Boston, 1989.
[11] H.T. Banks, C.K. Lo, S. Reich, and I.G. Rosen, "Numerical studies of identification in nonlinear distributed parameter systems", International Series of Numerical Mathematics, 91 (1989), 1-20.
[12] H.T. Banks and C.J. Musante, "Well-posedness for a class of abstract nonlinear parabolic systems with time delay", Nonlinear Analysis, 35 (1999), 629-648.
[13] H.T. Banks, C.J. Musante and J.K. Raye, "Approximation methods for inverse problems governed by nonlinear parabolic systems", Numerical Functional Analysis and Optimization, 21 (2000), 791-816.
[14] H.T. Banks, S. Reich and I.G. Rosen, "An approximation theory for the identification of nonlinear distributed parameter systems", SIAM Journal on Control and Optimization, 28 (1990), 552-569.
[15] H.T. Banks, S. Reich and I.G. Rosen, "Galerkin approximation for inverse problems for nonautonomous nonlinear distributed systems", Applied Mathematics and Optimization, 24 (1991), 233-256.
[16] H.T. Banks and J.G. Wade, "Weak-tau approximations for distributed parameter systems in inverse problems", Numerical Functional Analysis and Optimization, 12 (1991), 1-31.
[17] V. Barbu, Nonlinear Semigroups and Differential Equations in Banach Spaces, Noordhoff, Leyden, 1976.
[18] H. Brezis, Operateurs Maximaux Monotones et Semigroupes de Contractions dans les Espaces de Hilbert, North-Holland, Amsterdam, 1973.
[19] M.G. Crandall and A. Pazy, "Nonlinear evolution equations in Banach space", Israel Journal of Mathematics, 11 (1972), 57-94.
[20] B.G. Fitzpatrick, "Analysis and approximation for inverse problems in contaminant transport and biodegradation models", Numerical Functional Analysis and Optimization, 16 (1995), 847-866.
[21] J.A. Goldstein, "Approximation of nonlinear semigroups and evolution equations", Journal of the Mathematical Society of Japan, 24 (1972), 558-573.
[22] J.L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications, Springer, New York, 1972.
[23] C.V. Pao, Nonlinear Parabolic and Elliptic Equations, Plenum, New York, 1992.
[24] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, New York, 1983.
[25] M. Renardy and R.C. Rogers, An Introduction to Partial Differential Equations, Springer, Berlin, 1993.
On mesoscopic superconducting samples JACOB RUBINSTEIN AND MICHELLE SCHATZMAN
Abstract. We survey recent progress on mathematical questions related to mesoscopic superconducting samples. We start with an existence and regularity result for the two dimensional Ginzburg Landau functional, and we proceed to study phase transitions in multiply connected domains.
1. Introduction
Some of the most fascinating properties of superconductors are observed in samples with nontrivial topology and small scales. In this article we review recent progress with respect to several such setups. We use throughout the Ginzburg Landau (GL) energy functional to model superconductivity [20]. We first establish existence and regularity for its minimizers in two dimensional domains. The proof is given in detail since, to our surprise, no general proof has been given so far. We then review results by ourselves and by our colleagues on Little Parks oscillations.
2. Existence and regularity of minimizers of the Ginzburg - Landau functional in two dimensions
We consider the two dimensional GL functional written (in non-dimensional units) in the form [4]:
$\mathcal{G}(u, A) = \int_O \left(\mu(|u|^2 - 1)^2/2 + |(i\,\mathrm{grad} + A)u|^2\right) dx + \kappa^2\mu^{-1} \int_{\mathbb{R}^2} |\mathrm{curl}\,A - H_e|^2\,dx.$ (2.1)
Here $O$ is an open bounded domain of $\mathbb{R}^2$, $u$ is a complex valued function, the potential $A$ determines the magnetic field $H$ through $\mathrm{curl}\,A = H$, and $H_e$ is a given vector field in the direction orthogonal to the plane; whenever necessary, it is identified with a scalar function. The positive parameter $\mu$ is proportional to $T_c - T$, where $T$ is the temperature and $T_c$ is the critical temperature of the material in the absence of magnetic fields. The GL parameter $\kappa$ is the ratio between the penetration length and the coherence length. It is a temperature independent material constant. We make the following assumption on $H_e$:
$H_e$ is the sum of a constant and of a Hölder continuous function with compact support. (2.2)
Let $A_e$ be a vector potential such that
$\mathrm{curl}\,A_e = H_e$, $\mathrm{div}\,A_e = 0$; (2.3)
Under assumption (2.2), Lemmas 1 and 2 of [16] guarantee that there exists a unique $A_e$, up to translation by a constant field, which satisfies (2.3) and the condition
$\int_{\mathbb{R}^2} (1 + |x|)^{-5} |A_e|^2\,dx < +\infty.$ (2.4)
Moreover, $A_e$ is continuous and its first derivatives are Hölder continuous. According to Lemma 2 of [16] we can choose $A_e$ given by the convolution product:
[displayed formula not legible in the source]
and this vector potential is bounded over $\mathbb{R}^2$. We prove the existence of a minimum of the functional (2.1) in an appropriate function space; in [7] the form of the GL functional is not given correctly, since the integral of $|\mathrm{curl}\,A - H_e|^2$ is taken on the domain of the superconductor while it should be taken on full space. In [15] the existence of a minimizer is proved by taking a priori $A$ in the space $H_{\mathrm{div}}(\mathbb{R}^2)$, which is the completion of the space of divergence free $C_0^\infty$ vector fields with respect to the Dirichlet norm. Here, we minimize on a larger functional space and we state precisely some regularity results. Denote by $W$ the set of pairs $(u, A)$ such that $u$ belongs to $H^1(O; \mathbb{C})$, $A$ belongs to $H^1_{\mathrm{loc}}(\mathbb{R}^2)^2$ and $\mathrm{curl}\,A - H_e$ belongs to $L^2(\mathbb{R}^2)$. Since we work in dimension 2, $u$ and $A|_O$ belong to $L^p(O)$ for all $p < \infty$ by Sobolev injections, and therefore $Au$ and $|u|^2$ are square integrable in $O$.
Theorem 2.1. The functional $\mathcal{G}$ attains its minimum over $W$. All minimizers $(u, A)$ satisfy the relations
$|u| \le 1$, a.e. in $O$, (2.5)
$\mathrm{curl}(A - A_e) = 0$ in the unbounded component of $\mathbb{R}^2 \setminus O$; (2.6)
The Euler - Lagrange equations satisfied by minimizers are:
$-\Delta u + \mathrm{div}(iAu) + iA \cdot \mathrm{grad}\,u + |A|^2 u + \mu u(|u|^2 - 1) = 0$, $x \in O$, (2.7)
$\mathrm{curl}\,\mathrm{curl}\,A - \mathrm{curl}\,H_e = \mu\kappa^{-2}\,\Im(\bar u(\mathrm{grad} - iA)u)\,1_O$, $x \in \mathbb{R}^2$, (2.8)
$(i\,\mathrm{grad} + A)u \cdot \nu = 0$, $x \in \partial O$. (2.9)
All minimizers can be subjected to a gauge transformation so that
$\mathrm{div}\,A = 0$ on $\mathbb{R}^2$, and $|A - A_e| \le C/|x|$ as $|x|$ tends to infinity. (2.10)
When (2.10) holds, $A$ is uniquely determined from the knowledge of $\mathrm{curl}\,A$. If the boundary of $O$ is of class $C^2$ and (2.10) holds, $u$ belongs to $H^2(O)$ and $A - A_e$ belongs to $H^2_{\mathrm{loc}}(\mathbb{R}^2)$.
Proof. Let $P$ be the mapping from $\mathbb{C}$ to itself defined by
$Pz = z$ if $|z| \le 1$, $Pz = z/|z|$ if $|z| > 1$.
Let us show that
$\mathcal{G}(Pu, A) \le \mathcal{G}(u, A)$. (2.11)
In particular, we observe that on $\omega$
$|(i\,\mathrm{grad} + A)Pu| = |iPu(A - \Im B)|$, $|(i\,\mathrm{grad} + A)u| = |u(iA - B)|$.
Therefore
$\int_O \left(\mu[(|Pu|^2 - 1)^2/2] + |(i\,\mathrm{grad} + A)Pu|^2\right) dx \le \int_O \left(\mu[(|u|^2 - 1)^2/2] + |(i\,\mathrm{grad} + A)u|^2\right) dx$ (2.12)
and the equality holds only if $\omega$ is negligible. Let $R$ be large enough for $O$ to be included in the ball $B_R$ of radius $R$ about 0. Define $\tilde A = A - A_e$. We will modify $\tilde A$ outside of $B_R$ so as to lower the GL functional while the divergence of the modified vector potential will be compactly supported. We construct $\hat A$ so that $\mathrm{curl}\,\hat A$ vanishes outside of $B_R$ and $\hat A$ coincides with $\tilde A$ on $B_R$. Using polar coordinates in $\mathbb{R}^2$, the circulation of $\tilde A$ along the positively oriented boundary $\partial B_R$ of $B_R$ is given by
$\Gamma = \int_{\partial B_R} \tilde A = \int_0^{2\pi} \tilde A_\theta(\theta, R)\,d\theta.$
Define a canonical vector $A_0$ on $\mathbb{R}^2 \setminus \{0\}$ by
$A_0 = \left(-\frac{x_2}{2\pi|x|^2}, \frac{x_1}{2\pi|x|^2}\right).$
The curl of $A_0$ vanishes, and its circulation along any closed curve turning once in the positive sense around 0 equals one. Set
$\alpha(\theta) = \int_0^\theta \left(\tilde A_\theta(\theta', R) - \frac{\Gamma}{2\pi}\right) d\theta'$,
where $\tilde A_\theta$ and $\tilde A_r$ are, respectively, the tangential and normal components of $\tilde A$ on $\partial B_R$. By definition of $\Gamma$, $\alpha$ is $2\pi$ periodic. We also know that it belongs to the Sobolev space $H^{3/2}(\partial B_R)$. It is possible therefore to find a function $\chi \in H^2(B_{R+1} \setminus B_R)$ whose traces are given by
$\chi|_{\partial B_R} = \alpha$, $\chi|_{\partial B_{R+1}} = 0$,
$\mathrm{grad}\,\chi(R\cos\theta, R\sin\theta) \cdot (\cos\theta, \sin\theta) = \tilde A_r(\theta, R)$,
$\mathrm{grad}\,\chi((R+1)\cos\theta, (R+1)\sin\theta) \cdot (\cos\theta, \sin\theta) = 0.$
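We extend $\chi$ by 0 in $\mathbb{R}^2 \setminus B_{R+1}$. We let now
$\hat A(x) = \tilde A(x)$ if $|x| \le R$, $\hat A(x) = \Gamma A_0 + \mathrm{grad}\,\chi$ if $|x| > R$.
It is clear that
$\mathcal{G}(u, A_e + \hat A) \le \mathcal{G}(u, A)$.
Consider the divergence of $\hat A$: the restriction of $\mathrm{div}\,\hat A$ to the exterior of the ball $B_R$ is equal to $\Gamma\,\mathrm{div}\,A_0 + \Delta\chi$; since $\mathrm{div}\,A_0$ vanishes, it follows that $\mathrm{div}\,\hat A$ has compact support. We shall use the gauge invariance of the GL functional: if $\chi_1$ is any function in $H^2(\mathbb{R}^2)$, it is immediate that $(u_1, A_1) = (\exp(i\chi_1)u, A + \mathrm{grad}\,\chi_1)$ belongs to $W$ and that $\mathcal{G}(u_1, A_1) = \mathcal{G}(u, A)$. Let us choose $\chi_1$ so that $\mathrm{div}(\hat A + \mathrm{grad}\,\chi_1) = 0$. For example, we can select $\chi_1 = K * \mathrm{div}\,\hat A$, where $K(x)$ is the two dimensional Coulomb potential
$K(x) = -\frac{1}{2\pi}\ln|x|.$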
Observe that the construction of $\hat A$ was essential; if we did not know that $\mathrm{div}\,\hat A$ had compact support, we would not have been able to convolve $K$ and $\mathrm{div}\,\hat A$. Moreover the decay of $\mathrm{grad}\,\chi_1$ at infinity can be estimated by:
$|\mathrm{grad}\,\chi_1|(x) = |\mathrm{grad}\,K * \mathrm{div}\,\hat A|(x) \le \frac{|\mathrm{div}\,\hat A|_{L^1(B_{R+1})}}{2\pi(|x| - R - 1)}$ if $|x| > R + 1$.
Thus we have obtained from an arbitrary $(u, A) \in W$ a new element $(u_1, A_1) \in W$ such that
$\mathcal{G}(u_1, A_1) \le \mathcal{G}(u, A)$, $\mathrm{div}\,A_1 = 0$ on $\mathbb{R}^2$, $\mathrm{curl}(A_1 - A_e) = 0$ and $|A_1 - A_e| \le C/(|x| - R - 1)$ if $|x| > R + 1$.
A vector potential $B$ which satisfies the relations
$B \in H^1_{\mathrm{loc}}(\mathbb{R}^2)^2$, $\mathrm{div}\,B = 0$, $|B| \le C/|x|$ as $|x|$ tends to infinity (2.13)
is completely determined by the data of a compactly supported $\mathrm{curl}\,B$. The uniqueness is guaranteed by Lemma 2 of [16]. The vector potential can be constructed as follows: from the relation $\mathrm{div}\,B = 0$ we know that there exists a distribution $\chi_2$ such that $B = \mathrm{curl}\,\chi_2$. Then we must have
$-\Delta\chi_2 = \mathrm{curl}\,B$,
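which we can solve by $\chi_2 = K * \mathrm{curl}\,B$. Next we estimate $B|_{B_S}$, where $S$ is some strictly positive number. Observe that if $|x| \le S$, if $\mathrm{curl}\,B$ is supported in $B_R$ and if $\chi_{R+S}$ is the characteristic function of the ball $B_{R+S}$, then
$\int \mathrm{grad}\,K(x - y)\,\mathrm{curl}\,B(y)\,dy = \int \chi_{R+S}(x - y)\,\mathrm{grad}\,K(x - y)\,\mathrm{curl}\,B(y)\,dy$;
but
$|(\chi_{R+S}\,\mathrm{grad}\,K) * \mathrm{curl}\,B|_{L^2(\mathbb{R}^2)} \le |\chi_{R+S}\,\mathrm{grad}\,K|_{L^1(\mathbb{R}^2)}\,|\mathrm{curl}\,B|_{L^2(\mathbb{R}^2)}.$
Moreover the derivatives of $B$ are second derivatives of $\chi_2$; straightforward Fourier analysis tells us that
$\left|\frac{\partial^2\chi_2}{\partial x_i \partial x_j}\right|_{L^2(\mathbb{R}^2)} \le |\Delta\chi_2|_{L^2(\mathbb{R}^2)}$ for $i, j \in \{1, 2\}$.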
Therefore, since $B$ satisfies (2.13), the following estimate holds:
$|B|_{H^1(B_S)} \le C(R + S)\,|\mathrm{curl}\,B|_{L^2(\mathbb{R}^2)}.$ (2.14)
Consider now a minimizing sequence $(u_n, A_n)$. The above study implies that we can assume that for all integer $n$
$|u_n| \le 1$, a.e. on $O$, $\mathrm{div}\,A_n = 0$,
and $A_n - A_e$ satisfies the conditions (2.13). There exists $C$ such that
$|i\,\mathrm{grad}\,u_n + A_n u_n|_{L^2(O)} \le C$, (2.15)
$|\mathrm{curl}\,A_n - H_e|_{L^2(\mathbb{R}^2)} \le C.$ (2.16)
Estimates (2.14) and (2.16) imply that for all positive $S$, $|A_n - A_e|_{H^1(B_S)}$ [...]
[...] is constant in the unbounded component of $\mathbb{R}^2 \setminus O$; but this constant must vanish for $\int |\mathrm{curl}(A - A_e)|^2\,dx$ to be finite. The construction of $\hat A$, $u_1$ and $A_1$ from the minimizer $(u, A)$ works exactly as above. We assume henceforth that $A$ satisfies (2.13). Then $\mathrm{curl}\,\mathrm{curl}\,A = -\Delta A$. Therefore $A - A_e$ is given by
$A - A_e = K * J$, $J = \mu\kappa^{-2}\,\Im(\bar u(\mathrm{grad} - iA)u)\,1_O.$ (2.17)
In particular, $A - A_e$ belongs to $H^2_{\mathrm{loc}}$, and $A|_O$ is bounded. These observations imply that $u$ and $A$ belong to $L^\infty(O) \cap H^1(O)$; thus $uA$ belongs also to $L^\infty(O) \cap H^1(O)$; this shows that the normal trace of $uA$ on $\partial O$ belongs to the Sobolev space $H^{1/2}$. On the other hand, (2.7) implies that $-\Delta u$ belongs to $L^2(O)$. Therefore by standard regularity theorems, if the boundary of $O$ is of class $C^2$, $u$ belongs to $H^2(O)$. □
Remark 2.2. Bootstrapping techniques can be used to upgrade the $H^2(O)$ regularity into $C^2(O)$ regularity. We refer to [11] for an exposition of such an upgrade for the GL functional.
Remark 2.3. In the rest of this paper we shall be interested in multiply connected domains. Therefore, it is important to point out that if there are holes in $O$, i.e. if $\mathbb{R}^2 \setminus O$ has bounded connected components, then $\mathrm{curl}(A - A_e)$ is constant on these connected components.
3. Little Parks oscillations
Little and Parks observed in 1961 [12] that the phase transition temperature in long and thin cylindrical shells is essentially a periodic function of the axial magnetic flux through the cylinder. This is a quantum mechanical effect, in fact a manifestation of the Aharonov-Bohm effect. More precisely, the $T_c(\Phi)$ curve, where $T_c$ is the critical temperature and $\Phi$ is the flux, has the form of a periodic function superimposed on a parabola. The pure effect (i.e. without the parabolic background) has a simple explanation in terms of one dimensional considerations. The parabolic background was shown by Groff and Parks [6] to be a consequence of the finite thickness of the shell. The theoretical work was based on the assumption that the order parameter has essentially constant amplitude. With the introduction of advanced fabrication methods, modern experiments are performed on essentially two dimensional domains (i.e. the shell is actually a flat ring). For a long time people found excellent agreement when they compared experimental results with theoretical calculations (e.g. [6]). A closer look, however, reveals problems. As an example we mention the experiment of Zhang and Price [21]. They measured $dI/d\Phi$ as a function of $\Phi$, where $I$ is the current flowing in the ring, and $\Phi$ is the flux through a disc defined by the average radius of the ring. Near the critical temperature the graphs have strong positive peaks for flux values that are approximately integer plus half (in normalized units). These peaks are unaccounted for by the usual theory (e.g. [20]). To understand these peaks, a more refined theory is needed. In fact, Berger and Rubinstein [2] observed several years ago that even a slight deviation from uniform thickness
implies an unusual behavior at flux values near $\mathbb{Z} + 1/2$. They showed that in this situation, and in a one dimensional setup, the order parameter has a zero whenever the flux is exactly in the set $\mathbb{Z} + 1/2$. Moreover, the assumption of uniform amplitude for the order parameter breaks down in a temperature interval near $T_c$, and for flux values near $\mathbb{Z} + 1/2$. They have further shown [3] that the new theory can explain, at least qualitatively, the anomalies in the Zhang-Price experiment. Recent theoretical progress [4], [8] revealed that the key term is symmetry breaking. The nonuniformity of the thickness of a ring indeed gives rise to a specific symmetry breaking. But it is only a special case of a more general situation. An extensive theory, valid for arbitrary one dimensional graphs, is given in [19]. Two dimensional setups are further considered in [9]. We comment that the classical literature on Little Parks oscillations assumes that the superconducting sample is multiply connected, typically with a ring-like geometry. Nevertheless, an oscillatory $T_c(\Phi)$ curve can also be obtained for simply connected domains. This has been shown experimentally by Buisson et al. [5], who considered a small disc. Later Moshchalkov, Bruyndoncx and their coworkers [13] measured the transition temperature $T_c(\Phi)$ for a mesoscopic square. Both groups reported an oscillatory phase boundary superimposed on a linear background. The heuristic reasoning for the oscillations is that as the flux increases, the wave function concentrates near the boundary, and the sample appears effectively as a thin ring. This gives rise to a topological quantization constraint on the phase of the wave function and hence the Little Parks oscillations. The analogy with the experiments in thin shells is not perfect, though. In the case of a simply connected sample under strong fields, the width of the 'effective' ring is not fixed. Rather, the width shrinks as the applied field increases.
Further theoretical study of this issue is provided in [1], [10].
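The "pure" periodic effect mentioned in this section can be made concrete with a minimal sketch. The assumptions here are not from the paper: an ideal one dimensional ring of unit radius, uniform thickness, and flux measured in units in which the period is 1. Under these assumptions the eigenfunctions of the linearized problem are $e^{ins}$ with energies $(n - \Phi)^2$, so the lowest energy, and hence the shift of the transition temperature, is periodic in $\Phi$ with maxima at $\mathbb{Z} + 1/2$. The function name `ring_ground_energy` is hypothetical.

```python
import numpy as np

def ring_ground_energy(flux, n_max=50):
    """Lowest eigenvalue min_n (n - flux)^2 for an ideal ring of unit radius.

    flux is given in normalized units (period 1); n ranges over the
    winding numbers -n_max..n_max, which is enough for |flux| << n_max.
    """
    flux = np.asarray(flux, dtype=float)
    n = np.arange(-n_max, n_max + 1)
    return np.min((n - flux[..., None]) ** 2, axis=-1)

if __name__ == "__main__":
    for phi in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"flux = {phi:4.2f}   lowest energy = {ring_ground_energy(phi):.4f}")
```

Since $\mu$ is proportional to $T_c - T$, a larger lowest energy corresponds to a lower transition temperature, so in this idealized picture the transition temperature is depressed most at half-integer flux.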
4. The Zero Set
One of the fascinating subjects in the study of the GL model in multiply connected domains is the zero set of the order parameter. We shall consider it now for the case of one dimensional models. Let $M$ be a planar graph. The edges of the graph are smooth arcs in $\mathbb{R}^2$. A precise definition of the graph as an analytical object is given in [18]. For our purposes, the cycles of the graph are of particular importance; the magnetic flux through them is the driving mechanism for the supercurrents. The GL model over $M$ is given by
$\mathcal{H}(\psi) = \int_M \left(|i\psi' + A_T\psi|^2 + \mu[(|\psi|^2 - 1)^2/2]\right) ds.$ (4.1)
Here we denote the order parameter by $\psi$ (to distinguish it from the order parameter in two dimensional domains we defined in section 2), $A_T$ is the tangential component of the applied magnetic potential $A_e$, and $s$ is a variable running along the edges of $M$. The model (4.1) is a nonlinear extension of the model proposed about 20 years ago by Alexander and de Gennes. A fundamental question is whether the order parameter $\psi$ has zeros. In a single uniform
loop, for example, it is natural to expect a solution with a uniform order parameter. However, it was shown in [2] that even slight deviations from uniformity imply (for certain flux values) large variations in the amplitude of the order parameter. The question is closely related to the nature of the transition between different types of circulation along cycles in the graph. Clearly the circulation is an integer. As the flux through a cycle is varied, we reach a value where the total circulation changes to an adjacent integer. The question is whether this change is continuous (and thus accompanied by a zero of $\psi$ somewhere along the cycle), or discontinuous. A partial answer was given in [3] for single narrow rings whose cross section deviates slightly from a constant. In the limit case of the one dimensional model it was found that in general a zero will form at some point along the loop whenever the flux is an odd multiple of $\pi$. A general theoretical framework to study this zero formation, together with extensions to two dimensional situations, was developed later in [4]. One consequence of this theory is that under a generic asymmetry assumption, a zero will form for the critical flux values even for rings with arbitrary cross section. The crucial point is to define an appropriate notion of asymmetry in this context, since the definition used in [4] is neither natural nor easy to check. The problem was resolved by Helffer et al. in [8]. They proposed to consider the GL equations over the double covering of the ring. In this space one can find a convenient gauge, and the equations simplify considerably. We shall apply the method of Helffer et al. to the problem of zeros in graphs. We assume that the flux through each cycle is an odd multiple of $\pi$. Furthermore, to simplify the exposition, we limit ourselves to linear problems.
Physically, this amounts to considering the phase transition from the normal state, where $\psi = 0$, to a superconducting state. In graphs this phase transition is always continuous, taking the form of a bifurcating branch. The bifurcation occurs at the lowest $\mu$ where the zero solution is no longer a local minimum. We denote this critical value by $\mu_c$. Calculating the second variation of the GL functional, one finds that $\mu_c$ is exactly the ground energy of the magnetic Schrödinger operator on $M$. The superconducting order parameter is proportional to the ground state. Therefore the problem we shall consider now is whether the ground state has zeros. The double covering of $M$ is denoted by $\tilde M$. In fact there are many double coverings, and we have to choose an appropriate one. For this purpose we lift both $\psi$ and $A_T$ into $\tilde M$. We denote a function $f$ on $M$ lifted to $\tilde M$ by $\tilde f$. We pick a covering such that the integral of $\tilde A_T$ over every cycle in $\tilde M$ is $k\pi$ for some even $k \in \mathbb{Z}$. Denoting the primitive of $\tilde A_T$ by $\zeta$, our choice of $\tilde M$ guarantees that $e^{i\zeta}$ is a single valued function on $\tilde M$. This is the crucial observation that facilitates the analysis. It is here that we need the condition that the flux through every cycle of $M$ is an integer times $\pi$. We can therefore define the new gauge
$v = e^{-i\zeta}\tilde\psi.$ (4.2)
In this gauge the GL functional over $\tilde M$ reads
$F_{GL} = \int_{\tilde M} w\left(|v'|^2 + \frac{\mu}{2}(|v|^2 - 1)^2\right) ds$, (4.3)
where we introduced a weight $w$ that models nonuniformities in the network thickness. The nonuniformity will enable us to investigate symmetry breaking even in single loops.
Consider further the mapping $G : \tilde M \to \tilde M$ which sends every point to its corresponding point on the other copy of $M$. Since $\tilde\psi$ is $G$ symmetric by construction, $v$ is $G$ antisymmetric. The weight $\tilde w$, on the other hand, is lifted into $\tilde M$ from the weight $w$ in $M$, and thus it is $G$ symmetric. Therefore the variational problem we consider is to minimize $F_{GL}$ over complex $G$ antisymmetric functions in $\tilde M$. The advantage of the new formulation is that a complicated problem on $M$ was reduced to a simpler problem on $\tilde M$, except that the minimization is now taken over a special class of functions. The critical temperature is determined by the eigenvalue problem of minimizing $\tilde\mu_c = \int_{\tilde M} w|v'|^2$ over all complex $G$ antisymmetric functions $v$ such that $\int_{\tilde M} w|v|^2 = 1$. The minimal value is the eigenvalue $\mu_c = \tilde\mu_c$, associated with a ground state $v_c$. Instead of minimizing over complex valued functions, we consider the eigenvalue problem of minimizing $\tilde\mu_{sc} = \int_{\tilde M} w u'^2$ over all real $G$ antisymmetric functions $u$ such that $\int_{\tilde M} w u^2 = 1$. Assume $\tilde\mu_{sc}$ is a simple eigenvalue, associated with the eigenfunction $u_{sc}$. We argue that in this case $\tilde\mu_c = \tilde\mu_{sc}$, and $v_c = u_{sc}$. For suppose, in contradiction, that $\tilde\mu_c < \tilde\mu_{sc}$. Since the spectral problem for $\tilde\mu_c$ is invariant under complex conjugation, both the real and imaginary parts of $v$ are real $G$ antisymmetric eigenfunctions. Hence our assumption on $\tilde\mu_{sc}$ implies that they are proportional to each other and to $u$. Consider now the special example where $M$ is a single loop, and assume that $\tilde\mu_{sc}$ is simple (this is why we need the nonuniformity; when $w = 1$, the eigenspace of $\tilde\mu_{sc}$ has dimension two). Clearly a $G$ antisymmetric function must have at least one zero on each of the copies of $M$ that comprise $\tilde M$. Using $G$ antisymmetry again, it is easy to verify that there cannot be two zeros on each copy, and a simple surgery argument implies that an eigenfunction associated with a smallest simple eigenvalue cannot have three or more zeros.
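The single loop case can be explored numerically with a small finite difference sketch. Several things below are assumptions made for illustration and are not part of the paper: the loop has length $2\pi$, a real $G$ antisymmetric function on the double cover is represented as an antiperiodic function on the loop ($u(s + 2\pi) = -u(s)$), the weight $1 + 0.3\cos(s - 0.7)$ is an arbitrary nonuniform choice, and the names `antiperiodic_modes` and `count_sign_changes` are hypothetical. The sketch minimizes the discrete Rayleigh quotient $\sum w\,(\Delta u)^2/h \,/\, \sum w\,u^2 h$.

```python
import numpy as np

def antiperiodic_modes(weight, n=400, length=2 * np.pi, k=2):
    """Lowest k modes of min int(w u'^2) / int(w u^2) with u(s + L) = -u(s)."""
    h = length / n
    s = h * np.arange(n)
    w_node = weight(s)              # weight at the grid nodes (mass term)
    w_edge = weight(s + 0.5 * h)    # weight at edge midpoints (stiffness term)
    stiff = np.zeros((n, n))
    for j in range(n):
        jn = (j + 1) % n
        # Antiperiodicity: the neighbour of the last node is -u[0].
        twist = -1.0 if jn == 0 else 1.0
        c = w_edge[j] / h
        stiff[j, j] += c
        stiff[jn, jn] += c
        stiff[j, jn] -= c * twist
        stiff[jn, j] -= c * twist
    # Diagonal mass matrix: symmetrize the generalized eigenproblem.
    scale = 1.0 / np.sqrt(w_node * h)
    sym = scale[:, None] * stiff * scale[None, :]
    lam, y = np.linalg.eigh(sym)
    return s, lam[:k], scale[:, None] * y[:, :k]

def count_sign_changes(u):
    """Sign changes over one loop, including the antiperiodic wrap-around."""
    ext = np.append(u, -u[0])
    return int(np.sum(ext[:-1] * ext[1:] < 0))

if __name__ == "__main__":
    _, lam_uniform, _ = antiperiodic_modes(lambda s: np.ones_like(s))
    print("uniform weight, two lowest eigenvalues:", lam_uniform)
    _, lam_w, u_w = antiperiodic_modes(lambda s: 1.0 + 0.3 * np.cos(s - 0.7))
    print("nonuniform weight, two lowest eigenvalues:", lam_w)
    print("zeros of the ground state on the loop:", count_sign_changes(u_w[:, 0]))
```

In this discretization the uniform weight gives a doubly degenerate lowest eigenvalue close to 1/4, while the nonuniform weight splits the pair, which matches the role the nonuniformity plays in the argument above.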
An interesting question is whether the zero(s) occur only in the linear bifurcating solution. It was shown in [4] (and in more detail in [9]) that, in fact, if $\tilde\mu_{sc}$ is simple, there is an interval $(\mu_c, \mu_c + \delta)$ (physically, a temperature interval) where the solution to the full nonlinear GL problem has a zero for $\Phi = k\pi$ for odd $k$. Another important question regards the nature of the zero in a truly two dimensional domain $\Omega$ with holes. The important aspect in our construction was our ability to "gauge out" the magnetic potential. This can be done whenever the potential is a gradient of some function. In one dimension this is always true. But in two dimensions we need the compatibility condition $\nabla \times A = 0$ to hold. This condition is equivalent to assuming that the magnetic field vanishes in $\Omega$ (although it should not vanish in the holes bounded by $\Omega$, in order to guarantee the desired flux value). Indeed, it can be shown ([9]) that the conclusion regarding the identification of $v$ with $u$, in the case where $\tilde\mu_{sc}$ is simple, holds in two dimensional multiply connected domains, under the special assumption that $H_e = 0$ in $\Omega$. Again the zero set of $\psi$ is the nodal set of $u$, the leading $G$ antisymmetric eigenfunction of the Laplacian on $\tilde\Omega$. Thus the zero set is of codimension one. When $H_e$ does not vanish in $\Omega$, there may still be a smooth transition between circulations along closed loops in $\Omega$. But now the transition is mediated by vortices. We have shown that when $M$ is a loop and $\tilde\mu_{sc}$ is simple, there is exactly one zero in the order parameter. We now set $w = 1$, i.e. assume uniform thickness, and consider a canonical version of the model in an arbitrary graph. Given a graph $M$ we face two
questions. First, is $\tilde\mu_{sc}$ simple? Then, if it is simple, what is the size and structure of the zero set? Clearly there will be at least one zero, but in sufficiently complex graphs we could expect more zeros. To make the discussion more concrete, let us analyze the special case of symmetric ladders. An $n$ symmetric ladder consists of $n$ identical squares in a row. Parks conjectured that the order parameter will have zeros if $n = 2m$ and there will be no zeros if $n = 3$. This conjecture was indeed verified experimentally. For the case of symmetric ladders, it can be checked that when $n$ is odd, the symmetries of the ladder imply that $\tilde\mu_{sc}$ is not simple. On the other hand, it can be verified that $\tilde\mu_{sc}$ is simple for $n$ even. Thus we have a proof of a generalization of Parks' conjecture. We emphasize, though, that the original conjecture did not take into account the strict symmetry requirement. By this we mean that $\psi$ can have zeros even in a 3 ladder, say, if we replace the squares by unequal loops, or if we introduce nonuniformities in $M$. We refer to [19] for an extensive discussion on double coverings of graphs, the simplicity of the ground state, and estimates on the number of zeros of $G$ antisymmetric functions there.
References
[1] A. Bernoff and P. Sternberg, J. Math. Phys. 39, 1272, 1998.
[2] J. Berger and J. Rubinstein, Phys. Rev. Lett. 75, 320, 1995.
[3] J. Berger and J. Rubinstein, SIAM J. Appl. Math. 75, 103, 1998.
[4] J. Berger and J. Rubinstein, Comm. Math. Phys. 202, 621, 1999.
[5] O. Buisson, P. Gandit, R. Rammal, Y.Y. Wang and B. Pannetier, Phys. Lett. 150, 36, 1990.
[6] R.P. Groff and R.D. Parks, Phys. Rev. 176, 567, 1968.
[7] Q. Du, M.D. Gunzburger and J. Peterson, SIAM Review 34, 54, 1992.
[8] B. Helffer, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof and M. Owen, Comm. Math. Phys. 202, 629, 1999.
[9] B. Helffer, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof and M. Owen, in Connectivity and Superconductivity, J. Berger and J. Rubinstein, Eds., Springer Lecture Notes in Physics M62, 63, 2000.
[10] H.T. Jadallah, J. Rubinstein and P. Sternberg, Phys. Rev. Lett. 82, 2935, 1999.
[11] S. Jimbo and P. Sternberg, "Non-existence of permanent currents in convex planar domains", SIAM Math. Anal., to appear.
[12] W.A. Little and R.D. Parks, Phys. Rev. Lett. 9, 9, 1962.
[13] V. Moshchalkov, V. Bruyndoncx, E. Rosseel, L. Van Look, M. Baert, M.J. Van Bael, T. Puig, C. Strunk and Y. Bruynseraede, in Euroschool on "Superconductivity and mesoscopic systems", Certosa di Pontignano, Siena, Italy, to appear.
[14] B. Pannetier, in Quantum Coherence in Mesoscopic Systems, ed. B. Kramer, 457, Plenum Press, 1991.
[15] J. Rubinstein, in Boundaries, Interfaces and Transitions, M. Delfour Ed., CRM Series, American Mathematical Society, 163, 1998.
[16] J. Rubinstein and M. Schatzman, J. Math. Pures Appl. 77, 801, 1998.
[17] J. Rubinstein and M. Schatzman, "Variational problems on multiply connected thin strips I: basic estimates and the convergence of the Laplacian spectrum", ARMA, to appear.
[18] J. Rubinstein and M. Schatzman, "Variational problems on multiply connected thin strips II: convergence of the Ginzburg Landau functional", ARMA, to appear.
[19] J. Rubinstein and M. Schatzman, "Spectral properties and symmetries of double coverings of graphs with applications to quantum wires", in preparation.
[20] M. Tinkham, Introduction to Superconductivity, McGraw Hill, 1996.
[21] X. Zhang and J.C. Price, Phys. Rev. B 55, 3128, 1997.
J.R.: Department of Mathematics, Technion, Haifa 32000, Israel; M.S.: UMR 5585 CNRS Analyse Numerique, Mathematiques, Universite Claude Bernard - Lyon 1, 69622 Villeurbanne Cedex, France;
Some Strange Numerical Solutions of the Non-stationary Navier-Stokes Equations in Pipes Bernd Rummler September 14, 2001
Abstract A general class of boundary-pressure-driven flows of incompressible Newtonian fluids in three-dimensional pipes with known steady laminar realizations is investigated. Considering the laminar velocity as a 3D vector function of the cross-section-circle arguments, we fix the scale for the velocity by the $L^2$-norm of the laminar velocity. The usual new variables are introduced to obtain dimension-free Navier-Stokes equations. The characteristic physical and geometrical quantities are subsumed in the energetic Reynolds number $Re$ and a parameter $\psi$, which involves the energetic ratio and the directions of the boundary-driven part and the pressure-driven part of the laminar flow. The solution of the non-stationary dimension-free Navier-Stokes equations is sought in the form $\underline{\mathbf{u}} = \mathbf{u}_L + \mathbf{u}$, where $\mathbf{u}_L$ is the scaled laminar velocity and periodic conditions in the center-line direction are prescribed for $\mathbf{u}$. An autonomous system (S) of ordinary differential equations for the time-dependent coefficients of the spatial Stokes eigenfunctions is obtained by applying the Galerkin method to the dimension-free Navier-Stokes equations for $\mathbf{u}$. The finite-dimensional approximations $\mathbf{u}_{N(\lambda)}$ of $\mathbf{u}$ are defined in the usual way. A class of time periodic solutions near to the laminar velocities but different from them was found by parameter studies for the numerical solution of finite-dimensional subsystems of (S). This class of time periodic strange numerical solutions seems to be one of the first links in the bifurcation chain to turbulence.
Keywords: Navier-Stokes equations, Stokes eigenfunctions, Galerkin methods, transition to turbulence
2000 Mathematics subject classifications: 34A34, 35Q30, 65M60, 76F06, 76F65
1 Introduction
The pipe flow of incompressible Newtonian fluids is a long-established and favoured object of theoretical and applied fluid-dynamic research. Owing to the simple geometry of the domain, pipe flow is ideally suited to studies of the transition to turbulence and to investigations that extract further deterministic features from a random, fine-grained turbulent flow. The purpose of our investigations is to check and to clarify the possibilities of using low-dimensional Galerkin spaces defined by Stokes eigenfunctions for uncovering the mechanism of the transition to turbulence in the non-stationary 3D Navier-Stokes equations. In particular, our studies target the behaviour of such approximations in the vicinity of critical Reynolds numbers. We explore a general class of scaled pipe flows of incompressible Newtonian fluids in unbounded pipes in $\mathbb{R}^3$, which can be described by the sum of a laminar boundary-driven Couette (angular momentum) flow $\mathbf{u}_{L,c}$, a laminar pressure-driven Poiseuille flow $\mathbf{u}_{L,p}$, and a time-dependent part $\mathbf{u}$. The first and second summands are used to define the energetic Reynolds number and a weighting parameter $\psi$ for the energetic ratio and the direction of action of the boundary- and pressure-driven parts of $\mathbf{u}_L$. Low-dimensional approximation spaces spanned by Stokes eigenfunctions with periodic conditions in the center-line direction are applied for the direct numerical study of systems of Galerkin equations. The essential notations and governing equations, supplemented with initial and boundary conditions, are given in section 2. After convenient scaling we decompose the velocity field $\underline{\mathbf{u}}(t,x)$ into the laminar flow $\mathbf{u}_L(x)$ and the remaining velocity $\mathbf{u}(t,x)$, which fulfils homogeneous Dirichlet conditions on the boundary of the pipe: $\underline{\mathbf{u}} = \mathbf{u}_L + \mathbf{u}$. Additionally, periodic conditions in the center-line direction are required for $\mathbf{u}$. The pressure is decomposed similarly.
The energetic Reynolds number and a weighting parameter $\psi$ for the energetic ratio and the direction of action of the boundary- and pressure-driven parts of $\mathbf{u}_L$ are also defined in this section. The weak formulation of the Navier-Stokes initial-boundary value problem for the remaining velocities $\mathbf{u}$ is introduced in section 3. We give the existence theorem for weak solutions of the Navier-Stokes initial-boundary value problem and tools for its proof in this section. In particular, we explain the Stokes operator and the Galerkin approximation $\mathbf{u}_N := \sum_{j=1}^{N} g_j\mathbf{w}_j$ of the weak solutions of the Navier-Stokes equations for the remaining velocities $\mathbf{u}$, where the $\{\mathbf{w}_j\}_{j=1}^{\infty}$ are the Stokes eigenfunctions. The Galerkin equations (as an autonomous system of ordinary differential equations for the coefficients $g_j(t)$ of the eigenfunctions $\{\mathbf{w}_j(x)\}_{j=1}^{N}$) are stated there. Section 4 is devoted to the numerical method and the results. At first a fixed period $2l = 2\cdot 2.69$ in the center-line direction was chosen for historical reasons (cf. [14]). With this choice the dimension of the Galerkin space, $N(\lambda) = N(\lambda_{max}) = 114$, is determined by a bound $\lambda_{max} := 50$ for the eigenvalues $\lambda$. $\lambda_{max}$ is taken in such a way that the Galerkin space includes two significant modes for the modification of the mean velocity, both for the pure Couette flow and for the pure Poiseuille flow. The calculation of the coefficients in forming the system of ordinary differential equations is very expensive. For this purpose we use universalized tools of combined C- and MAPLE-routines together with implemented rules of general addition theorems in the form of allocation lists. MAPLE tools are also utilized for the handling of Stokes eigenfunctions and for symbolic computations. The corresponding systems of ordinary differential equations were solved numerically for several values of the parameters $Re$, $\psi$ and a set of initial values $\{g_j(0)\}$, $j = 1,\dots,N$ (small $\mathbf{u}_{N,0} := \sum_{j=1}^{N} g_j(0)\mathbf{w}_j$), where the kinetic energy $E(t) := \sum_{j=1}^{N} g_j^2(t)$ of the Galerkin approximations $\mathbf{u}_N$ was used as a measure of turbulence. The evaluation of our numerical investigations for general pipe flows matches our experience with general channel flows. There is very good agreement with measurements for the transition from laminar to turbulent flows in the vicinity of the critical Reynolds numbers, and there are satisfactory results for the mean velocities of the turbulent flow (even for a small dimension of our approximation space). However, the agreement with other experimental data for the Reynolds stresses and the root-mean-square values of the fluctuating velocities is less satisfactory, most probably owing to the small dimension of our approximation space. The evaluation of our numerical investigations for general pipe flows also yields a new significant feature. Much to our surprise we found time periodic non-laminar solutions of a constant kinetic energy $E(t, Re)$ at fixed $\psi$-values out of a small interval of such arguments over a range of Reynolds numbers $Re$.
This set of time periodic strange numerical solutions can be interpreted as the numerical result of stabilization by the physical property of angular momentum in the flows. From the mathematical point of view it seems to be one of the first links in the bifurcation chain to turbulence.
2 The Basic Notations and Equations
The non-stationary Navier-Stokes equations describe the time evolution of an incompressible Newtonian fluid. We are interested in pipe flow, which means that the fluid fills an open unbounded cylindrical domain $\Omega'$ in $\mathbb{R}^3$:
$$\Omega' := \{\, y = (y_1,y_2,y_3)^T \in \mathbb{R}^3 : \sqrt{y_2^2 + y_3^2} < R \,\},$$
where $R$ in [m] is the radius of the pipe.
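The unknowns are the velocity field $\mathbf{v}$ in [m/s],
$$\mathbf{v}(\tau,y) = (v_1(\tau,y), v_2(\tau,y), v_3(\tau,y))^T, \quad y \in \Omega', \quad \tau \in [0,\Theta], \ (\Theta > 0),$$
and the kinematic pressure $p'(\tau,y)$ in [m²/s²], both depending on the actual time $\tau$ in [s]. The initial-boundary value problem is given by:
Problem 2.1 We seek $\mathbf{v}$ and $p'$ fulfilling:
$$\frac{\partial \mathbf{v}}{\partial \tau} + \sum_{j=1}^{3} v_j D'_j \mathbf{v} - \nu \Delta_y \mathbf{v} + \nabla_y p' = 0 \quad \text{in } (0,\Theta) \times \Omega', \quad (1)$$
$$\nabla_y \cdot \mathbf{v} = 0, \quad (2)$$
$$\mathbf{v}(0,y) = \mathbf{v}^{(0)}(y), \quad (3)$$
$$\mathbf{v}(\tau,\cdot)|_{\partial\Omega'} := c_c\,(0,-y_3,y_2)^T\big|_{y_2^2+y_3^2=R^2}, \quad (4)$$
where we have used the kinematic viscosity $\nu$ in [m²/s] and the abbreviation $D'_j := \partial/\partial y_j$, $j = 1,2,3$.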
It is a matter of common knowledge, that there exist stationary solutions of the Navier-Stokes equations (1), (2) for the boundary conditions (4). These solutions are given by the laminar velocity fields and the corresponding laminar pressures:
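$$\mathbf{v}_L(y) := c_c\,(0,-y_3,y_2)^T + c_p\,\chi R\left(1 - \frac{y_2^2+y_3^2}{R^2},\,0,\,0\right)^T = \mathbf{v}_{L,c} + \mathbf{v}_{L,p},$$
$$p'_L(y) = \frac{c_c^2}{2}\,(y_2^2+y_3^2) - \frac{4\nu}{R}\,c_p\,\chi\,y_1 \quad \text{(up to an additive constant)}, \quad (5)$$
where $c_c$, $c_p$ are velocity parameters in [1/s] and $\chi = \sqrt{3/2}$ is chosen in such a way that for $c_c = c_p$ the velocity fields $\mathbf{v}_{L,c}$ and $\mathbf{v}_{L,p}$ have the same kinetic energy in a control volume. The characteristic (laminar) velocity $v_{char}$ is introduced as the scale of the velocities by:
$$v_{char} := R\sqrt{c_c^2 + c_p^2}. \quad (6)$$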
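The value of $\chi$ can be checked directly against the equal-energy requirement. The following short calculation is a sketch under one assumption that the text does not spell out: the control volume is taken to be the pipe cross-section $r = \sqrt{y_2^2+y_3^2} < R$ per unit pipe length.

```latex
\begin{align*}
\int_{r<R} |\mathbf{v}_{L,c}|^2 \, dA
  &= c_c^2 \int_0^{2\pi}\!\!\int_0^R r^2 \, r \, dr \, d\varphi
   = \frac{\pi}{2}\, c_c^2 R^4, \\
\int_{r<R} |\mathbf{v}_{L,p}|^2 \, dA
  &= c_p^2 \chi^2 R^2 \int_0^{2\pi}\!\!\int_0^R
     \Bigl(1 - \frac{r^2}{R^2}\Bigr)^{2} r \, dr \, d\varphi
   = \frac{\pi}{3}\, c_p^2 \chi^2 R^4 .
\end{align*}
```

For $c_c = c_p$ the two integrals agree exactly when $\chi^2 = 3/2$. Since $\mathbf{v}_{L,c}$ and $\mathbf{v}_{L,p}$ have no common non-zero component, the energies add, and the cross-sectional integral of $|\mathbf{v}_L|^2$ equals $\frac{\pi}{2} R^2 v_{char}^2$ with $v_{char}$ as in (6).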
Additionally the parameter $\psi \in [0,2\pi)$ is defined for all $\mathbf{v}_L \neq 0$ as an indicator for the energetic ratio and the direction of action of the parts of the laminar velocity:
$$\cos\psi := \frac{c_c}{\sqrt{c_c^2+c_p^2}}, \quad \sin\psi := \frac{c_p}{\sqrt{c_c^2+c_p^2}}, \quad \psi \in [0,2\pi). \quad (7)$$
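Problem 2.1 is formulated in dimension-free terms by the use of $R$ as the length scale, $R^2/\nu$ as the time scale, $v_{char}$ as the velocity scale and
$$x = \frac{y}{R}, \quad t = \frac{\nu\tau}{R^2}, \quad \Omega := \{\, x \in \mathbb{R}^3 : \sqrt{x_2^2 + x_3^2} < 1 \,\}, \quad \underline{\mathbf{u}} = \frac{\mathbf{v}}{v_{char}}, \quad (8)$$
together with the correspondingly scaled pressure $\underline{p}$, for the variables. The velocities and pressures are handled by the use of splitting formulas:
$$\underline{\mathbf{u}}(\cdot,\cdot) = \mathbf{u}_L(\cdot) + \mathbf{u}(\cdot,\cdot), \quad \underline{p}(\cdot,\cdot) = p_L(\cdot) + p(\cdot,\cdot), \quad (9)$$
where $\mathbf{u}_L$ and $p_L$ are the scaled laminar fields (cf. (5)):
$$\mathbf{u}_L(x) := \cos\psi\,(0,-x_3,x_2)^T + \sin\psi\,\bigl(\chi(1 - x_2^2 - x_3^2),\,0,\,0\bigr)^T = \cos\psi\,\mathbf{u}_{L,c} + \sin\psi\,\mathbf{u}_{L,p},$$
$$p_L(x) := \frac{Re}{2}\cos^2\psi\,(x_2^2 + x_3^2) - 4\sin\psi\,\chi\,x_1.$$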
Finally the energetic Reynolds number is denoted by $Re$:
$$Re = Re(c_c,c_p) := \frac{R^2}{\nu}\sqrt{c_c^2 + c_p^2} \quad (10)$$
and periodic conditions for $\mathbf{u}$ and $p$ according to (9) are required instead of conditions at infinity:
$$(P): \quad \mathbf{u}(t,x_1,x_2,x_3) = \mathbf{u}(t,x_1+2l,x_2,x_3), \quad p(t,x_1,x_2,x_3) = p(t,x_1+2l,x_2,x_3).$$
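The scaling can be made concrete with a few lines of code. The sketch below maps the dimensional inputs to the characteristic velocity, the weighting angle and the energetic Reynolds number, and evaluates the scaled laminar velocity. It assumes the readings $v_{char} = R\sqrt{c_c^2+c_p^2}$, $Re = R\,v_{char}/\nu$ and $\chi = \sqrt{3/2}$; the function names are illustrative and not taken from the paper.

```python
import math

CHI = math.sqrt(1.5)  # assumed value of chi (equal-energy normalisation)


def laminar_parameters(c_c, c_p, radius, nu):
    """Return (v_char, psi, Re) for velocity parameters c_c, c_p in [1/s]."""
    norm = math.hypot(c_c, c_p)
    if norm == 0.0:
        raise ValueError("psi is only defined for a non-vanishing laminar flow")
    v_char = radius * norm                          # characteristic velocity
    psi = math.atan2(c_p, c_c) % (2.0 * math.pi)    # cos(psi)=c_c/norm, sin(psi)=c_p/norm
    reynolds = radius ** 2 * norm / nu              # equals radius * v_char / nu
    return v_char, psi, reynolds


def scaled_laminar_velocity(psi, x2, x3):
    """Scaled laminar velocity u_L at a cross-section point (x2, x3)."""
    couette = (0.0, -x3, x2)                             # boundary-driven part
    poiseuille = (CHI * (1.0 - x2 * x2 - x3 * x3), 0.0, 0.0)  # pressure-driven part
    return tuple(math.cos(psi) * c + math.sin(psi) * p
                 for c, p in zip(couette, poiseuille))
```

With this convention $\psi = 0$ is the pure Couette case and $\psi = \pi/2$ the pure Poiseuille case, and the Poiseuille part vanishes on the wall $x_2^2 + x_3^2 = 1$.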
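The initial-boundary value problem for the unknowns $\mathbf{u}(t,x)$ and $p(t,x)$ is given by:
Problem 2.2: We seek solutions $\mathbf{u}(t,x)$ and $p(t,x)$ fulfilling:
$$\frac{\partial \mathbf{u}}{\partial t} - \Delta_x \mathbf{u} + Re\Bigl(\sum_{j=1}^{3} u_j D_j \mathbf{u} + \cos\psi \sum_{j=1}^{3} u_{L,c,j} D_j \mathbf{u} + \sin\psi \sum_{j=1}^{3} u_{L,p,j} D_j \mathbf{u} + \cos\psi \sum_{j=1}^{3} u_j D_j \mathbf{u}_{L,c} + \sin\psi \sum_{j=1}^{3} u_j D_j \mathbf{u}_{L,p}\Bigr) + \nabla_x p = 0 \quad \text{in } (0,T) \times \Omega,$$
$$\nabla_x \cdot \mathbf{u} = 0,$$
$$\mathbf{u}(0,x) = \mathbf{u}_0(x) = \underline{\mathbf{u}}(0,\cdot) - \mathbf{u}_L(\cdot),$$
$$\mathbf{u}(t,x_1,x_2,x_3)\big|_{x_2^2+x_3^2=1} = 0,$$
and (P): cf. (11), \quad (11)
with $D_j := \partial/\partial x_j$, $j = 1,2,3$.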
3 Explanation of the Galerkin Approximations
We restrict the domain $\Omega$ to the open bounded sub-domain $T_l$, considering the presupposed periodic conditions (11):
$$T_l := \{\, x \in \Omega : x = (x_1,x_2,x_3)^T, \ |x_1| < l \,\}, \quad (12)$$
in which we suppose $l > 1$. Additionally the notation $\sigma$ for the slice plane will be used in our considerations:
$$\sigma := \{\, (x_2,x_3)^T : x_2^2 + x_3^2 < 1 \,\}. \quad (13)$$
Definition 3.1: Let $S$ $(= S^0)$ be the closure of the set
$$\mathcal{V} := \Bigl\{\, \mathbf{s} \in (C^\infty(T_l))^3 : \nabla_x \cdot \mathbf{s} = 0, \ \mathbf{s}(x) := \sum_{\kappa \in \mathbb{Z}} \mathbf{s}_\kappa(x_2,x_3)\exp\Bigl(\frac{i\pi\kappa x_1}{l}\Bigr), \ \mathbf{s}_\kappa(x_2,x_3) := \frac{1}{2l}\int_{-l}^{l} \mathbf{s}(x)\exp\Bigl(-\frac{i\pi\kappa x_1}{l}\Bigr)dx_1, \ \mathbf{s}_\kappa(\cdot) = \overline{\mathbf{s}_{-\kappa}(\cdot)}, \ \forall \kappa \in \mathbb{Z}, \ \mathbf{s}_\kappa(\cdot) \in (C_0^\infty(\sigma))^3 \,\Bigr\}$$
in the norm of $H$ $(= H^0) = (L^2(T_l))^3$. The space $S^1$ arises from the closure of $\mathcal{V}$ in the norm of $H^1 = (W_2^1(T_l))^3$, and the spaces $S^k$, $k = 2,3,4,\dots$, are explained by the relation $S^k = S^1 \cap H^k$, where $H^k = (W_2^k(T_l))^3$. The norms of the spaces $S^k$ agree for $k = 0,1,2,\dots$ on $S^k$ with the norms of $H^k$, denoted by $\|\cdot\|_k$, and the scalar product on $H$ is denoted by $(\cdot,\cdot)$. Finally let $S^{-1}$ be the dual of $S^1$.
Remark 3.2: One can also endow the space $S^1$ with the scalar product $(\cdot,\cdot)_{D,2}$,
$$(\mathbf{u},\mathbf{w})_{D,2} := \sum_{j=1}^{3}\sum_{k=1}^{3} (D_k u_j, D_k w_j)_{L^2(T_l)},$$
since the Poincaré inequality holds by the boundedness of the domain $\Omega$ in at least one direction and the vanishing traces of the elements of $S^1$ on $\partial\Omega \cap \partial T_l$.
Definition 3.3: Let $\pi_l$ be the Leray-Helmholtz projection of $H$ onto $S$: $\pi_l : H \to S$. The Stokes operator is denoted by $A$, $A := -\pi_l \Delta_x : D(A) = S^2 \to S$, where the defining equation is to be understood in the sense of the Friedrichs extension. The Stokes operator $A$ plays a central role in the treatment of the incompressible Navier-Stokes equations. The most important properties of $A$ are specified in the following statements.
Theorem 3.4: The Stokes operator $A$ is positive and self-adjoint. The inverse $A^{-1}$ of the Stokes operator is injective, self-adjoint and compact.
The proof of Theorem 3.4 is a simple modification of Theorems 4.3 and 4.4 in [3]. The essential tools for the proofs of these theorems are the theorems of Rellich and Lax-Milgram. Theorem 3.4 is proven in detail in [10]. By the use of a well-known theorem of Hilbert and regularity results like Proposition 1.2.2 of [15] one shows additionally:
Corollary 3.5: The Stokes operator is an operator with a pure point spectrum. All the eigenvalues $\lambda_j$ of $A$ are real and of finite multiplicity. The associated eigenfunctions $\{\mathbf{w}_j(x)\}_{j=1}^{\infty}$ of the Stokes operator $A$ (counted in multiplicity) are an orthogonal basis of $S$ and $S^1$. We obtain that
(i) $A\mathbf{w}_j = \lambda_j \mathbf{w}_j$, $\mathbf{w}_j \in D(A)$, $\forall j = 1,2,\dots$
(ii) $0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_j \le \dots$
(iii) $\lim_{j\to\infty} \lambda_j = \infty$
(iv) $\{\mathbf{w}_j(x)\}_{j=1}^{\infty}$, $\|\mathbf{w}_j\|_0 = 1$ $\forall j = 1,2,\dots$, form a complete orthonormal system in $S$.
The eigenfunctions $\mathbf{w}_j(x)$, $j = 1,2,\dots$, are elements of $C^\infty(T_l)$. Some explicitly calculated real eigenfunctions $\{\mathbf{w}_j(x)\}_{j=1}^{\infty}$ are used in our investigations. These eigenfunctions (as complex functions) were deduced in [11]. The real eigenfunctions are given in [9], where the following theorem is also proven:
Theorem 3.6: The explicitly calculated real eigenfunctions $\{\mathbf{w}_j(x)\}_{j=1}^{\infty}$ form a complete orthonormal system in $S$ and agree in this sense with the set of eigenfunctions of Corollary 3.5.
The Galerkin spaces are determined by the use of a bound $\lambda$ for the permitted eigenvalues. Let $N = N(\lambda)$ denote the number of eigenvalues of $A$ with $\lambda_j < \lambda$ (counted in multiplicity):
$$N(\lambda) := \sum_{\lambda_j < \lambda} 1. \quad (14)$$
The spaces $S^1$, $S$, $S^{-1}$ will be used as an evolution triple for the weak formulation of Problem 2.2. For an arbitrary Banach space $X$, let us denote by $L^p(0,T,X)$, $p \in [1,\infty]$, the usual evolutionary Lebesgue spaces, by $C^k(0,T,X)$, $k = 0,1,2$, the Banach spaces of $k$-times continuously differentiable functions with values in $X$, and by $C_w(0,T,X)$ the spaces of weakly continuous functions. In the following we give the weak formulation of Problem 2.2:
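Problem 3.7: Let $\mathbf{u}_0 \in S$. We seek $\mathbf{u} \in L^2(0,T,S^1) \cap C_w(0,T,S)$ such that:
$$\frac{d}{dt}(\mathbf{u},\mathbf{s}) + (\mathbf{u},\mathbf{s})_{D,2} + Re\Bigl\{\Bigl(\sum_{j=1}^{3} u_j D_j \mathbf{u}, \mathbf{s}\Bigr) + \cos\psi\Bigl[\Bigl(\sum_{j=1}^{3} u_{L,c,j} D_j \mathbf{u}, \mathbf{s}\Bigr) + \Bigl(\sum_{j=1}^{3} u_j D_j \mathbf{u}_{L,c}, \mathbf{s}\Bigr)\Bigr] + \sin\psi\Bigl[\Bigl(\sum_{j=1}^{3} u_{L,p,j} D_j \mathbf{u}, \mathbf{s}\Bigr) + \Bigl(\sum_{j=1}^{3} u_j D_j \mathbf{u}_{L,p}, \mathbf{s}\Bigr)\Bigr]\Bigr\} = 0 \quad \text{for all } \mathbf{s} \in S^1,$$
$$\mathbf{u}(0) = \mathbf{u}_0 \quad \text{in the sense of } C_w(0,T,S).$$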
Remark 3.8: We note that the boundary conditions on $\mathbf{u}$ and the vanishing divergence of $\mathbf{u}$ are included in the definition of the spaces $S$ and $S^1$. Finally, gradient fields like $\nabla_x p$ are included in the orthogonal complement of $S$ in $H$. The decomposition and trace theorems for $H$ and $H^1$, respectively $S$ and $S^1$, are given in [8], [9] and [10]. The following result on weak solutions of Problem 3.7 was proven in [10] as a straightforward extension of Leray's statement (cf. [10], Theorem 5.8 and Remark 5.9):
Theorem 3.9: There exists at least one weak solution of Problem 3.7 for every $\mathbf{u}_0 \in S$.
Note 3.10: Further results on the existence of weak and strong solutions, on the regularity of these solutions and on energetic estimates are also given and proven in [10]. The proofs of the existence of statistical solutions of the Navier-Stokes equations as probability measures on weighted Sobolev spaces on $(0,T) \times \Omega$ were achieved in [9]. Invariance properties of these statistical solutions were established from the availability of these invariance properties for the Galerkin approximations of weak solutions on $(0,T) \times T_l$ for all $l > 1$. In particular, the invariance under translations in the argument belonging to the unbounded direction of $\Omega$ has proved valuable in the evaluation of scalings and numerical methods.
The Galerkin method is an essential tool for the proof of Theorem 3.9. Some notations for the explanation of the Galerkin equations are declared in what follows.
Definition 3.11: Let $N$ (cf. (14)) be the dimension of the Galerkin space. The span of $\{\mathbf{w}_1,\mathbf{w}_2,\dots,\mathbf{w}_N\}$ as a subspace of $S$ is denoted by $M_N$:
$$M_N := \operatorname{span}\{\mathbf{w}_j(x)\}_{j=1}^{N} \subset S.$$
The orthogonal projector from $S$ to $M_N$ is denoted by $P_N$. $P_N$ will also be used in the sense $P_N : S^1 \to M_N$, since $\dim(M_N) < \infty$.
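For the actual formulation of the Galerkin approximations of Problem 3.7 one uses $b(\cdot,\cdot,\cdot)$,
$$b(\mathbf{u},\mathbf{q},\mathbf{s}) := \Bigl(\sum_{j=1}^{3} u_j D_j \mathbf{q}, \mathbf{s}\Bigr) \quad \forall (\mathbf{u},\mathbf{q},\mathbf{s}) \in L^2(0,T,S^1) \times L^2(0,T,H^1) \times L^2(0,T,S^1) \ (*), \quad (15)$$
as the notation of the trilinear form, where (*) means that permutations of the factors in the product space are allowed. The trilinear form $b(\cdot,\cdot,\cdot)$ is antisymmetric with respect to the permutation of $\mathbf{q}$ and $\mathbf{s}$.
Problem 3.12: Let $\mathbf{u}_0 \in S$. We seek a function $\mathbf{u}_N \in C^1(0,T,M_N)$ such that:
$$\frac{d}{dt}(\mathbf{u}_N,\mathbf{w}_j) + (\mathbf{u}_N,\mathbf{w}_j)_{D,2} + Re\bigl\{ b(\mathbf{u}_N,\mathbf{u}_N,\mathbf{w}_j) + \cos\psi\,[\,b(\mathbf{u}_{L,c},\mathbf{u}_N,\mathbf{w}_j) + b(\mathbf{u}_N,\mathbf{u}_{L,c},\mathbf{w}_j)\,] + \sin\psi\,[\,b(\mathbf{u}_{L,p},\mathbf{u}_N,\mathbf{w}_j) + b(\mathbf{u}_N,\mathbf{u}_{L,p},\mathbf{w}_j)\,] \bigr\} = 0 \quad \text{for all } j \in \{1,\dots,N\},$$
$$\mathbf{u}_N(0) = P_N \mathbf{u}_0.$$
A solution $\mathbf{u}_N$ of this problem is called the Galerkin approximation of the weak solution $\mathbf{u}$ of Theorem 3.9. Using energetic a priori estimates and the theorem of Picard-Lindelöf one shows in the proof of Theorem 3.9 that there exists a unique Galerkin approximation in the sense of Problem 3.12. Since $\mathbf{u}_N \in C^1(0,T,M_N)$, one obtains the equivalent representation of $\mathbf{u}_N$:
$$\mathbf{u}_N(t,x) := P_N \mathbf{u}(t,x) = \sum_{j=1}^{N} g_j(t)\,\mathbf{w}_j(x), \quad (16)$$
with $g_j(t) := (\mathbf{u}(t,x),\mathbf{w}_j)$, $j = 1,\dots,N$ (i.e. $\underline{\mathbf{u}}_N = \mathbf{u}_L + \mathbf{u}_N$).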
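Corollary 3.13: The solution of Problem 3.12 is equivalent to the solution of the initial value problem of the autonomous system of ordinary differential equations:
$$\dot g_k = -\lambda_k g_k - Re\Bigl(\sum_{i,s=1}^{N} b_{k,i,s}\,g_i g_s + \cos\psi \sum_{i=1}^{N} q'_{k,i}\,g_i + \sin\psi \sum_{i=1}^{N} q''_{k,i}\,g_i + \cos\psi \sum_{i=1}^{N} r'_{k,i}\,g_i + \sin\psi \sum_{i=1}^{N} r''_{k,i}\,g_i\Bigr), \quad k = 1,2,\dots,N,$$
$$g_k(0) = g_{k,0} := (\mathbf{u}^{(0)}(x),\mathbf{w}_k), \quad k = 1,2,\dots,N,$$
with
$$b_{k,i,s} := b(\mathbf{w}_i,\mathbf{w}_s,\mathbf{w}_k) \ (= -b_{s,i,k}), \quad i,k,s = 1,2,\dots,N,$$
$$q'_{k,i} := b(\mathbf{w}_i,\mathbf{u}_{L,c},\mathbf{w}_k), \quad r'_{k,i} := b(\mathbf{u}_{L,c},\mathbf{w}_i,\mathbf{w}_k) \ (= -r'_{i,k}),$$
$$q''_{k,i} := b(\mathbf{w}_i,\mathbf{u}_{L,p},\mathbf{w}_k), \quad r''_{k,i} := b(\mathbf{u}_{L,p},\mathbf{w}_i,\mathbf{w}_k) \ (= -r''_{i,k}), \quad i,k = 1,2,\dots,N.$$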
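The structure of this system can be illustrated with a small sketch. The code below evaluates the right-hand side for generic coefficient arrays and checks a consequence of the antisymmetries of $b_{k,i,s}$ and $r_{k,i}$: only the eigenvalue term and the $q$-terms change the kinetic energy $\sum_k g_k^2$. The coefficients here are random placeholders with the stated symmetries, not the Stokes-eigenfunction integrals of the paper, and `q`, `r` are assumed to hold the already combined matrices $\cos\psi\,q' + \sin\psi\,q''$ and $\cos\psi\,r' + \sin\psi\,r''$.

```python
import random


def galerkin_rhs(g, lam, b, q, r, reynolds):
    """Right-hand side dg_k/dt of the Galerkin system for coefficient vector g.

    b[k][i][s] plays the role of b(w_i, w_s, w_k); q and r are the combined
    linear coupling matrices (cos(psi)*q' + sin(psi)*q'', likewise for r).
    """
    n = len(g)
    rhs = []
    for k in range(n):
        quadratic = sum(b[k][i][s] * g[i] * g[s]
                        for i in range(n) for s in range(n))
        linear = sum((q[k][i] + r[k][i]) * g[i] for i in range(n))
        rhs.append(-lam[k] * g[k] - reynolds * (quadratic + linear))
    return rhs


def random_coefficients(n, seed=0):
    """Placeholder coefficients with b[k][i][s] = -b[s][i][k], r[k][i] = -r[i][k]."""
    rng = random.Random(seed)
    b = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for s in range(k + 1, n):
            r[k][s] = rng.uniform(-1.0, 1.0)
            r[s][k] = -r[k][s]
            for i in range(n):
                b[k][i][s] = rng.uniform(-1.0, 1.0)
                b[s][i][k] = -b[k][i][s]
    q = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    lam = sorted(rng.uniform(1.0, 50.0) for _ in range(n))
    return lam, b, q, r


def energy_rate(g, rhs):
    """Time derivative of E = sum_k g_k**2 along the flow: 2 * sum_k g_k * dg_k/dt."""
    return 2.0 * sum(gk * fk for gk, fk in zip(g, rhs))
```

In a real computation the arrays would be filled once in a preprocessing step and `galerkin_rhs` handed to an adaptive Runge-Kutta integrator; the cubic cost of the quadratic term in $N$ is what makes small Galerkin spaces attractive.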
Finally the kinetic energy of our Galerkin approximation of the remaining velocity and its rate of change in time are denoted by:
$$E(t) := \sum_{k=1}^{N} g_k^2(t) \quad (17)$$
and
$$\dot E(t) = -2\Bigl(\sum_{k=1}^{N} \lambda_k\,g_k^2(t) + Re\Bigl(\cos\psi \sum_{i,k=1}^{N} q'_{k,i}\,g_i g_k + \sin\psi \sum_{i,k=1}^{N} q''_{k,i}\,g_i g_k\Bigr)\Bigr). \quad (18)$$

4 Numerical Experiments and Results
The first step in the numerical treatment of the autonomous systems of ordinary differential equations of Corollary 3.13 consists of some preprocessing. One has to fix a period $2l$ and after that an upper bound $\lambda$ for the eigenvalues of the considered Stokes eigenfunctions, respectively $N$ as the dimension of the Galerkin space. The energetic Reynolds number $Re$ and the control parameter $\psi$ (cf. (7)) are varied with the flow. The variables $l$ and $\lambda$ are the necessary inputs for the first preparation procedure, which generates the Stokes eigenfunctions $\{\mathbf{w}_j(x)\}_{j=1}^{N}$ and the required partial derivatives of these functions. In a second preparation procedure the coefficients of the ordinary differential equations of Corollary 3.13 are calculated. We used universalized tools of combined C- and MAPLE-routines together with implemented rules of general addition theorems in the form of allocation lists. The main step in the numerical treatment of the Galerkin approximations of the Navier-Stokes equations consists of studies of the numerical solutions of the systems of Corollary 3.13 in dependence on the energetic Reynolds number $Re$, the control parameter $\psi$ (cf. (7)) and the initial values $g_k(0) := g_{k,0}$, $k = 1,2,\dots,N$, as free parameters. Dormand-Prince methods (DOPRI5) and Runge-Kutta-Fehlberg methods (RKF45) (cf. [4]) with step size control are implemented and used for the numerical solution of the systems of ordinary differential equations. The initial values $g_k(0) := g_{k,0} = 0$, $\forall k = 1,2,\dots,N$, stand for the laminar velocities $\mathbf{u}_L$ as initial values of the Navier-Stokes systems. From a physical point of view, small initial values $g_{k,0}$ are understood to be small perturbations of the laminar flow at the time $t = 0$. (This agrees with experimental results, where the laminar velocity field is also a realisation of the real flow in the domain of over-critical Reynolds numbers, e.g. if the walls are very smooth.)
In other words, the initial conditions for the $g_k$ have been chosen as small random values out of a ball of radius $a$ around the origin in $\mathbb{R}^N$. If the initial conditions $g_{k,0}$ have a distance smaller than $\rho < a$ from the origin in $\mathbb{R}^N$, where $\rho$ depends on the Reynolds number $Re$, our numerical solutions of the autonomous
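systems of ordinary differential equations of Corollary 3.13 tend to the zero of $\mathbb{R}^N$. The reason is the asymptotic stability of the laminar flow in the sense of Ljapunov. The radius $\rho$ of this ball of asymptotic stability depends nearly exponentially on $Re$. The kinetic energy $E(t)$ (cf. (17)) of the approximated remaining velocity was used for the evaluation of the solutions. The behavior in time of $E(t)$ shows whether or not the solution of the systems (Corollary 3.13) tends to the origin of $\mathbb{R}^N$ (asymptotic stability). The numerical experiments were the following: The numerical solutions of the Galerkin equations were calculated for several intervals of time and a set of initial values for varying parameters $Re$ and $\psi$. The properties of these solutions were studied with respect to the behavior of the kinetic energy $E(t)$ (our indicator measure of turbulence or bifurcations). In a second round of studies, the numerical solutions were calculated for a set of parameters $Re$ and $\psi$ with fixed initial values chosen as random values out of spherical shells around the origin of $\mathbb{R}^N$ and for fixed intervals of time. For a better understanding, let us present some of the eigenfunctions used and their corresponding eigenvalues. The period $2l$ in our present investigations was chosen to be $2l = 2 \times 2.69$ for historical reasons. In the following, cylindrical coordinates $x_1$, $r$, $\varphi$ are used. For $n = 0$, $\kappa = 0$ the eigenfunctions are
$$\mathbf{w}^{R}_{0,j,0}(x) = \frac{J_0(\sqrt{\lambda}\,r)}{\sqrt{2l\pi}\,J_1(\sqrt{\lambda})}\,\mathbf{e}_{x_1}, \quad J_0(\sqrt{\lambda}) = 0, \quad j \in \mathbb{N},$$
and
$$\mathbf{w}^{r}_{0,j,0}(x) = \frac{J_1(\sqrt{\lambda}\,r)}{\sqrt{2l\pi}\,J_2(\sqrt{\lambda})}\,\mathbf{e}_{\varphi}, \quad J_1(\sqrt{\lambda}) = 0, \quad j \in \mathbb{N},$$
with $\mathbf{e}_{x_1} = (1,0,0)^T$ and $\mathbf{e}_{\varphi} = (0,-\sin\varphi,\cos\varphi)^T$. The components of the eigenfunctions of type $g,r,R$ are built from the terms
$$2\Bigl(I_n\bigl(\tfrac{\pi\kappa}{l}r\bigr) - a_B J_n(\sqrt{\lambda_\kappa}\,r)\Bigr)\cos\bigl(\tfrac{\pi\kappa}{l}x_1\bigr)\sin n\varphi,$$
$$\Bigl(I_{n-1}\bigl(\tfrac{\pi\kappa}{l}r\bigr) - a_{B-} J_{n-1}(\sqrt{\lambda_\kappa}\,r)\Bigr)\sin\bigl(\tfrac{\pi\kappa}{l}x_1\bigr)\sin(n-1)\varphi, \quad \Bigl(I_{n-1}\bigl(\tfrac{\pi\kappa}{l}r\bigr) - a_{B-} J_{n-1}(\sqrt{\lambda_\kappa}\,r)\Bigr)\sin\bigl(\tfrac{\pi\kappa}{l}x_1\bigr)\cos(n-1)\varphi,$$
$$\Bigl(I_{n+1}\bigl(\tfrac{\pi\kappa}{l}r\bigr) - a_{B+} J_{n+1}(\sqrt{\lambda_\kappa}\,r)\Bigr)\sin\bigl(\tfrac{\pi\kappa}{l}x_1\bigr)\sin(n+1)\varphi, \quad -\Bigl(I_{n+1}\bigl(\tfrac{\pi\kappa}{l}r\bigr) - a_{B+} J_{n+1}(\sqrt{\lambda_\kappa}\,r)\Bigr)\sin\bigl(\tfrac{\pi\kappa}{l}x_1\bigr)\cos(n+1)\varphi,$$
with
$$a_B := I_n\bigl(\tfrac{\pi\kappa}{l}\bigr)\bigl(J_n(\sqrt{\lambda_\kappa})\bigr)^{-1}, \quad a_{B-} := I_{n-1}\bigl(\tfrac{\pi\kappa}{l}\bigr)\bigl(J_{n-1}(\sqrt{\lambda_\kappa})\bigr)^{-1} \quad \text{and} \quad a_{B+} := I_{n+1}\bigl(\tfrac{\pi\kappa}{l}\bigr)\bigl(J_{n+1}(\sqrt{\lambda_\kappa})\bigr)^{-1},$$
where $\sqrt{\lambda_\kappa}$ is the radial wave number listed in the second value column of the table of eigenvalues; the complete expressions are given in [9] and [11].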
The first 114 eigenvalues are written down in the following table, where M denotes the multiplicity of the eigenvalue $\lambda_j$.
Table of eigenvalues: cylinder l = 2.69

j          λ_j           sqrt(λ_j − (κπ/l)²)   n   κ   M   type
1*         5.78318596    2.40482555            0   0   1   R
2-5        14.00353173   3.55522038            1   1   4   g,r,R
6*         14.68197064   3.83170597            0   0   1   r
7-8        14.68197064   3.83170597            1   0   2   R
9-12       15.44219494   3.16013226            1   2   4   g,r,R
13-14      16.04591039   3.83170597            0   1   2   r
15-16      20.13772963   3.83170597            0   2   2   r
17-20      20.75026890   2.91115289            1   3   4   g,r,R
21-22      26.37461643   5.13562230            1   0   2   g,r
23-24      26.37461643   5.13562230            2   0   2   R
25-28*     26.53943566   5.01751890            2   1   4   g,r,R
29-30      26.95742836   3.83170597            0   3   2   r
31-32      27.10106525   5.07317706            0   1   2   g,R
33-36      28.19952031   5.18030699            1   1   4   g,r,R
37-40      28.28023245   4.77749656            2   2   4   g,r,R
41-44      29.49065461   2.76904652            1   4   4   g,r,R
45-46      29.69742364   4.92358250            0   2   2   g,R
47*        30.47126234   5.52007811            0   0   1   R
48-51      32.77224938   5.22651799            1   2   4   g,r,R
52-55      33.03561689   4.55633176            2   3   4   g,r,R
56-57      34.87484339   4.75388111            0   3   2   g,R
58-59      36.50500661   3.83170597            0   4   2   r
60-63      39.80837362   5.24718170            1   3   4   g,r,R
64-65      40.70646582   6.38016189            2   0   2   g,r
66-67      40.70646582   6.38016189            3   0   2   R
68-71      41.11877055   4.39269104            2   4   4   g,r,R
72-75      41.22711826   6.31372936            3   1   4   g,r,R
76-79      41.30088185   2.68372654            1   5   4   g,r,R
80-83*     42.45059017   6.40988692            2   1   4   g,r,R
84-85      43.01834131   4.60383593            0   4   2   g,R
86-89      43.37712401   6.15803256            3   2   4   g,r,R
90-93      47.18741703   6.46000449            2   2   4   g,r,R
94-97      48.09226325   5.98471432            3   3   4   g,r,R
98-101     48.52539446   6.86741980            1   1   4   g,r,R
102-103    48.78046433   3.83170597            0   5   2   r
104*       49.21845632   7.01558667            0   0   1   r
105-106    49.21845632   7.01558667            1   0   2   R
107-110    49.43922306   5.25511057            1   4   4   g,r,R
111-114    49.65210407   6.64803317            1   2   4   g,r,R
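The internal consistency of the table can be checked with a few lines of code. The sketch below rests on an inference from the numbers rather than on a statement in the text: the second value column is read as $\sqrt{\lambda_j - (\kappa\pi/l)^2}$, so that each eigenvalue splits into a radial part and an axial part $(\kappa\pi/l)^2$. It also confirms that the multiplicities add up to the dimension 114 of the Galerkin space. The constant and function names are illustrative.

```python
import math

HALF_PERIOD = 2.69  # l from the text (period 2l = 2 * 2.69)

# (lambda_j, second value column, kappa) for a few rows of the table
SAMPLE_ROWS = [
    (5.78318596, 2.40482555, 0),
    (14.00353173, 3.55522038, 1),
    (15.44219494, 3.16013226, 2),
    (20.75026890, 2.91115289, 3),
    (29.49065461, 2.76904652, 4),
    (41.30088185, 2.68372654, 5),
]

# Column M of the table, row by row
MULTIPLICITIES = [1, 4, 1, 2, 4, 2, 2, 4, 2, 2, 4, 2, 2, 4, 4, 4, 2, 1, 4, 4,
                  2, 2, 4, 2, 2, 4, 4, 4, 4, 2, 4, 4, 4, 4, 2, 1, 2, 4, 4]


def axial_part(kappa, half_period=HALF_PERIOD):
    """Contribution (kappa * pi / l)**2 of the axial wave number to an eigenvalue."""
    return (kappa * math.pi / half_period) ** 2


def residual(lam, radial, kappa):
    """lambda_j minus (radial**2 + axial part); close to zero if the reading holds."""
    return lam - (radial ** 2 + axial_part(kappa))


def galerkin_dimension(multiplicities):
    """N = number of eigenvalues below the bound, counted in multiplicity."""
    return sum(multiplicities)
```

For the rows with $\kappa = 0$ the second column is simply $\sqrt{\lambda_j}$, a zero of a Bessel function: 2.40482555 and 5.52007811 are the first two zeros of $J_0$, and 3.83170597 and 7.01558667 the first two positive zeros of $J_1$.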
The behavior of the kinetic energies $E(t)$ of the solutions of the systems for three values of the parameter $\psi$, fixed Reynolds number $Re = 1450$ and fixed initial values is illustrated in Figure 1. $\psi = 0$ stands for the pure boundary-driven Couette (angular momentum) flow and $\psi = \pi/2$ stands for the pure pressure-driven Poiseuille flow. Our numerical investigations provide good agreement with experimental results for critical Reynolds numbers, for initial values in the order of magnitude of physical perturbations. If the initial values $g_{k,0}$ are chosen not too small, the corresponding functions $E(t)$ tend to zero for $Re < Re_{crit}$ and remain at about the same level for $Re > Re_{crit}$. For values of $\psi$ close to $0$, $\pi$, $2\pi$ the laminar boundary-driven Couette flow is stable ($E(t) \to 0$). The same is true for the long-time behavior of the solutions. The time mean values of the coefficients of the remaining velocity field are defined by
$$\bar g_j := \frac{1}{T - t_0}\int_{t_0}^{T} g_j(t)\,dt \approx \frac{1}{n_t}\sum_{i=1}^{n_t} g_j(t_i), \quad t_i \in [t_0,T], \quad j = 1,\dots,N, \quad (19)$$
where $n_t$ is the number of Dormand-Prince, respectively Runge-Kutta-Fehlberg, time steps, the $t_i$ are the calculation points in time and $t_0$ is a time from which the kinetic energy starts to show an aperiodic oscillating behavior between an upper and a lower level. The mean velocities are given by
$$\bar{\underline{\mathbf{u}}}_N(r) = \frac{1}{4l\pi\,(T - t_0)}\int_{t_0}^{T}\int_{0}^{2\pi}\int_{-l}^{l} \underline{\mathbf{u}}_N(t,x)\,dx_1\,d\varphi\,dt = \cos\psi\,\mathbf{u}_{L,c}(r) + \sin\psi\,\mathbf{u}_{L,p}(r) + \sum_{i^*} \bar g_{i^*}\,\mathbf{w}_{i^*}(x), \quad (20)$$
where the significant coefficients are given by $i^* = 6, 104$ for the Couette flow and by $i^* = 1, 47$ for the Poiseuille flow. Let us call the corresponding eigenfunctions Stokes modes. It is worth noting that the time averages $\bar g_{i^*}$ according to (20) assume much larger values than the time averages of all the other coefficients $\bar g_j$ (19). The critical Reynolds numbers for our calculations are
$$Re_{Couette} \approx \infty, \quad Re_{Pois.} \in (1400, 20000).$$
Much to our surprise we found time periodic non-laminar solutions of a constant kinetic energy $E(t, Re)$ at fixed $\psi$-values out of a small $\psi$-interval ($\psi \approx \pi/3$) over a range of Reynolds numbers $Re$. These solutions are termed strange numerical solutions. An example of such strange numerical solutions is given in what follows. The Reynolds number was taken to be $Re = 5000$. Beginning with $t_0 = 1.03607$, we found constant values of the coefficients of the Stokes modes (1, 6, 47 and 104) and only eight other non-vanishing coefficients with:
$$g_{25}(t) \approx g_{28}(t), \quad -g_{26}(t) \approx g_{27}(t), \quad g_{80}(t) \approx g_{83}(t) \quad \text{and} \quad -g_{81}(t) \approx g_{82}(t) \quad \forall t \in [t_0,T].$$
(The modes with non-vanishing coefficients are marked by * in the table.) For the simple linear combinations of the corresponding eigenfunctions
$$\mathbf{w}_A(x) := \frac{\mathbf{w}_{27}(x) - \mathbf{w}_{26}(x)}{\sqrt{2}} \quad \text{and} \quad \mathbf{w}_B(x) := \frac{\mathbf{w}_{25}(x) + \mathbf{w}_{28}(x)}{\sqrt{2}}$$
as well as
$$\mathbf{w}_C(x) := \frac{\mathbf{w}_{82}(x) - \mathbf{w}_{81}(x)}{\sqrt{2}} \quad \text{and} \quad \mathbf{w}_D(x) := \frac{\mathbf{w}_{80}(x) + \mathbf{w}_{83}(x)}{\sqrt{2}},$$
the resulting coefficients
$$g_A(t) := \sqrt{2}\,g_{27}(t) \quad \text{and} \quad g_B(t) := \sqrt{2}\,g_{25}(t)$$
as well as
$$g_C(t) := \sqrt{2}\,g_{82}(t) \quad \text{and} \quad g_D(t) := \sqrt{2}\,g_{80}(t)$$
are shown in Figure 2. The mean velocity of a calculated time periodic non-laminar solution of constant kinetic energy $E(t)$ at $Re = 5000$ is finally compared with the corresponding laminar profile in Figure 3. The rotation of the pipe has a stabilizing effect on the pressure-driven part of the velocity, much as a bullet is stabilized by the rifled barrel of a gun. For a better illustration of this effect, we also include in this figure the mean velocity calculated with the same driving pressure but without any rotation of the pipe. The illustrated set of time periodic strange numerical solutions will be an object of our further research in the coming years.
Figure 1: Behavior of $E(t)$ over $t$ at $Re = 1450$ for $\psi = 0$ ($E(t) = 0$), $\pi/4$, $\pi/2$
Figure 2: Trajectories in the planes of modes
Figure 3: Flow in a pipe ($\psi = \pi/6$), $Re = 5000$: laminar profile and calculations of the quasistationary solution in comparison with calculations ($\psi = \pi/2$), $Re = 2500$
References
[1] Babin, A., Mahalov, A., Nicolaenko, B. Global splitting, integrability and regularity of 3D Euler and Navier-Stokes equations for uniformly rotating fluids, Eur. J. Mech. B/Fluids no. 3 (1996), 291-300
[2] Busse, F.H. Bounds on the Transport of Mass and Momentum by Turbulent Flow Between Parallel Plates, ZAMP 20 (1969), 1-14
[3] Constantin, P., Foias, C. Navier-Stokes Equations, Univ. of Chicago Press, Chicago (1988)
[4] Hairer, E., Norsett, S.P., Wanner, G. Solving Ordinary Differential Equations I, Springer, Berlin (1993)
[5] Iooss, G., Mielke, A. Bifurcating time periodic solutions of Navier-Stokes equations in infinite cylinders, J. Nonlin. Sc. 10 (1991), 106-146
[6] Kleiser, L., Zang, T.A. Numerical Simulation of Transition in Wall-Bounded Shear Flows, Annu. Rev. Fluid Mech. 23 (1991), 495-537
[7] Koppe, M. Galerkinapproximation der allgemeinen Rohrströmung, Studienarbeit, Univ. Magdeburg, 1999
[8] Krause, B. Translationshomogene statistische Lösungen der Navier-Stokes-Gleichungen im Kanal, Leipzig: Dissertation (1984)
[9] Rummler, B. Partiell-translationshomogene statistische Lösungen der Navier-Stokes-Gleichung im Rohr und die Berechnung der ebenen Couette-Strömung, Leipzig: Dissertation (1984)
[10] Rummler, B. Zur Lösung der instationären inkompressiblen Navier-Stokesschen Gleichungen in speziellen Gebieten, Magdeburg: Habilitation (1999/2000)
[11] Rummler, B. The Eigenfunctions of the Stokes Operator in Special Domains I, ZAMM 77 (1997) 8, 619-627
[12] Rummler, B. The Eigenfunctions of the Stokes Operator in Special Domains II, ZAMM 77 (1997) 9, 669-675
[13] Rummler, B., Noske, A. Direct Galerkin Approximation of Plane-Parallel-Couette and Channel Flows by Stokes Eigenfunctions, in Not. Num. Fl. Mech., Vol. 64 (ed. by Friedrich, R. and Bontoux, P.), Vieweg (1998)
[14] Rummler, B., Breitschuh, U. Berechnung der turbulenten Couette-Strömung mittels Galerkinapproximation der Navier-Stokes-Gleichung - Ergebnisse und Probleme, ZAMM 64 (1984) 10, M 489-492
[15] Temam, R. Navier-Stokes equations, theory and numerical analysis, Amsterdam: North Holland (1979)
[16] Visik, M.J., Fursikov, A.V. Mathematische Probleme der statistischen Hydromechanik, Leipzig: Geest & Portig (1986)
ADDRESS: PD Dr. Bernd Rummler, Dept. of Mathematics, Otto-von-Guericke-Universität Magdeburg, PF 4120, D-39016 Magdeburg, Germany, email: [email protected]
On Super and δ-Continuities Mohammad Saleh* Mathematics Department, Birzeit University, P.O. Box 14, Birzeit, West Bank, Palestine Email: [email protected]
Abstract In this paper, we further the study of super and δ-continuities using δ-open and θ-open sets. Among others, it is shown that the graph mapping $g$ of $f : X \to Y$ is super continuous iff $f$ is super continuous and $X$ is semi-regular.
*The author was supported by Birzeit University under grant 235-17-98-9. AMS Subject Classification: 54C08, 54D05, 54D30
1. INTRODUCTION. The concepts of δ-closure, θ-closure, δ-interior and θ-interior operators were first introduced by Velicko [14]. These operators have since been studied intensively by many authors. The collection of all δ-open sets in a topological space $(X, T)$ forms a topology $T_\delta$ on $X$, called the semiregularization topology of $T$, weaker than $T$, and the class of all regular open sets in $T$ forms an open basis for $T_\delta$. Similarly, the collection of all θ-open sets in a topological space $(X, T)$ forms a topology $T_\theta$ on $X$, weaker than $T_\delta$. So far, numerous applications of such operators have been found in studying different types of continuous-like maps, separation axioms, and above all, many important types of compact-like properties. In the present paper, we further the study of δ-continuity [8] and s.δ.c, or super continuity in the sense of Munshi and Bassan [6]. We give several characterizations of δ.c and s.δ.c maps, and we study the relations between these functions and their graphs. Theorem 2.13 proves that the graph mapping $g = (x, f(x)) : X \to X \times Y$ is s.δ.c iff $f : X \to Y$ is δ.c and $X$ is semi-regular. Also, it is shown that the preimage of a Hausdorff space under an injective δ.c function is Hausdorff, and the image of a compact space under a δ.c function is n.compact. Theorem 3.1 proves that a δ.c retract of a Hausdorff space is δ-closed. We get similar results to some of those in [3,...,13].
For a set $A$ in a space $X$, let us denote by $Int(A)$ and $\overline{A}$ the interior and the closure of $A$ in $X$, respectively. Following Velicko [14], a point $x$ of a space $X$ is called a θ-adherent point of a subset $A$ of $X$ iff $\overline{U} \cap A \neq \emptyset$ for every open set $U$ containing $x$. The set of all θ-adherent points of $A$ is called the θ-closure of $A$, denoted by $cls_\theta A$. A subset $A$ of a space $X$ is called θ-closed iff $A = cls_\theta A$. The complement of a θ-closed set is called θ-open. Similarly, the θ-interior of a set $A$ in $X$, written $Int_\theta A$, consists of those points $x$ of $A$ such that for some open set $U$ containing $x$, $\overline{U} \subset A$. A set $A$ is θ-open iff $A = Int_\theta A$, or equivalently, $X - A$ is θ-closed. A point $x$ of a space $X$ is called a δ-adherent point of a subset $A$ of $X$ iff $Int(\overline{U}) \cap A \neq \emptyset$ for every open set $U$ containing $x$. The set of all δ-adherent points of $A$ is called the δ-closure of $A$, denoted by $cls_\delta A$. A subset $A$ of a space $X$ is called δ-closed iff $A = cls_\delta A$. The complement of a δ-closed set is called δ-open. Similarly, the δ-interior of a set $A$ in $X$, written $Int_\delta A$, consists of those points $x$ of $A$ such that for some regularly open set $U$ containing $x$, $U \subset A$. A set $A$ is δ-open iff $A = Int_\delta A$, or equivalently, $X - A$ is δ-closed. It is well known that one of the weakest forms of compactness is closure compactness (QHC). A subset $A$ of a space $X$ is called a closure compact subset of $X$ if every open cover of $A$ has a finite subcollection whose closures cover $A$. A closure compact Hausdorff space is called H-closed, a notion first defined by Alexandroff and Urysohn.
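On a finite space these operators can be computed by brute force, which is a convenient way to test small examples. The sketch below implements closure, interior, θ-closure and δ-closure directly from the definitions above; the two three-point topologies at the end are chosen here purely for illustration, and all function and variable names are hypothetical.

```python
def closure(A, X, opens):
    """Points x such that every open set containing x meets A."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(U & A for U in opens if x in U))


def interior(A, opens):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    result = frozenset()
    for U in opens:
        if U <= A:
            result |= U
    return result


def theta_closure(A, X, opens):
    """Points x such that closure(U) meets A for every open U containing x."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(closure(U, X, opens) & A
                            for U in opens if x in U))


def delta_closure(A, X, opens):
    """Points x such that Int(closure(U)) meets A for every open U containing x."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(interior(closure(U, X, opens), opens) & A
                            for U in opens if x in U))


# Two illustrative topologies on the same three-point set
X = frozenset({1, 2, 3})
T_FINE = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 3], [1, 2, 3])]
T_COARSE = [frozenset(s) for s in ([], [1], [2], [1, 2], [1, 2, 3])]
```

In the finer topology the set {1} has closure {1, 3}, which is itself open, so {1} is open but not regularly open there; in the coarser topology the interior of the same closure shrinks back to {1}. Differences of this kind are what separate ordinary continuity from the δ-type notions.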
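A function $f : X \to Y$ is almost continuous (briefly, a.c) (resp., almost strongly $\theta$-continuous (briefly, a.s.c), closure or $\theta$-continuous in the sense of Fomin [2] (briefly, c.c), weakly continuous (briefly, w.c), $\delta$-continuous (briefly, $\delta$.c), strongly continuous or strongly $\theta$-continuous (briefly, s.c)) if for every $x \in X$ and every open set $V$ in $Y$ containing $f(x)$, there exists an open set $U$ in $X$ containing $x$ such that $f(U) \subset \operatorname{Int}(\overline{V})$ (resp., $f(\overline{U}) \subset \operatorname{Int}(\overline{V})$, $f(\overline{U}) \subset \overline{V}$, $f(U) \subset \overline{V}$, $f(\operatorname{Int}(\overline{U})) \subset \operatorname{Int}(\overline{V})$, $f(\overline{U}) \subset V$). A function $f : X \to Y$ is strongly $\delta$-continuous (briefly, s.$\delta$.c) (or super continuous in the sense of B.M. Munshi and D. Bassan) if for every $x \in X$ and every open set $V$ in $Y$ containing $f(x)$, there exists an open set $U$ in $X$ containing $x$ such that $f(\operatorname{Int}(\overline{U})) \subset V$.

A space $X$ is called Urysohn if for every $x \neq y \in X$, there exist an open set $U$ containing $x$ and an open set $V$ containing $y$ such that $\overline{U} \cap \overline{V} = \emptyset$. A space $X$ is called semi-regular if for every $x \in X$ and for every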
open set $U$ of $x$ there exists an open set $W$ such that $x \in W \subset \operatorname{Int}(\overline{W}) \subset U$, or equivalently, $T_s = T$. A space $X$ is called almost regular if for every regularly closed set $F$ and for every $x \notin F$ there exist disjoint open sets $U, V$ such that $x \in U$, $F \subset V$. Equivalently, a space $X$ is almost regular if for every $x \in X$ and for every regularly open set $U$ containing $x$ there exists a regularly open set $W$ containing $x$ such that $x \in W \subset \overline{W} \subset U$, or equivalently, $T_s$ is regular.

2. BASIC RESULTS. s.c $\Rightarrow$ s.$\delta$.c $\Rightarrow$ continuity $\Rightarrow$ a.c $\Rightarrow$ c.c $\Rightarrow$ w.c, s.c $\Rightarrow$ s.$\delta$.c $\Rightarrow$ $\delta$.c $\Rightarrow$ a.c, and s.c $\Rightarrow$ a.s.c $\Rightarrow$ a.c, but neither continuity implies $\delta$.c nor $\delta$.c implies continuity, neither a.s.c implies continuity nor continuity implies a.s.c, and neither a.s.c implies $\delta$.c nor s.$\delta$.c implies a.s.c.

Example 2.1. Let $X = \{1,2,3\}$ with $\Im_X = \{\emptyset, \{1\}, \{2\}, \{1,2\}, \{1,3\}, \{1,2,3\}\}$, and $Y = \{1,2,3\}$ with $\Im_Y = \{\emptyset, \{1\}, \{2\}, \{1,2\}, \{1,2,3\}\}$. Let $f : X \to Y$ be the identity map. Then $f$ is continuous, but $f$ is neither $\delta$.c nor a.s.c, since $\operatorname{Int}(\overline{\{1\}}) = \{1\}$ in $Y$ but $\operatorname{Int}(\overline{\{1\}}) = \{1,3\}$ in $X$, and $\overline{\{1\}} = \{1,3\}$ in $X$ but $\operatorname{Int}(\overline{\{1\}}) = \{1\}$ in $Y$.

Example 2.2. Let $f : \mathbb{R}_u \to \mathbb{R}_c$ be the identity map, where $\mathbb{R}_u$, $\mathbb{R}_c$ denote the reals with the usual and the cocountable topologies, respectively. Then $f$ is a.s.c and $\delta$.c but neither s.$\delta$.c nor continuous.

Example 2.3. Let $X = \mathbb{R}$ with the topology $\Im$ generated by a basis with members of the form $(a, b)$ and $(a, b) - K$, where $K = \{\frac{1}{n} : n \in \mathbb{Z}^+\}$. Let $f : (X, \Im) \to (X, \Im)$ be the identity map. Then $f$ is continuous but not strongly continuous.

The proofs of the next results follow directly from the definitions and thus will be omitted.

Theorem 2.1. [8, Theorem 4.6(2)]. Let $f : X \to Y$ be an a.c function and let $X$ be a semi-regular space. Then $f$ is $\delta$.c.

Theorem 2.2. Let $f : X \to Y$ be $\delta$.c and let $Y$ be a semi-regular space. Then $f$ is s.$\delta$.c.

Theorem 2.3. Let $f : X \to Y$ be an a.s.c function and let $Y$ be a semi-regular space. Then $f$ is s.c.

Theorem 2.4. Let $f : X \to Y$ be a c.c function and let $Y$ be an almost-regular space. Then $f$ is a.s.c.

Theorem 2.5. An open map $f$ is c.c iff $f$ is $\delta$.c.

Theorem 2.6. Let $f : X \to Y$ be $\delta$.c and let $X$ be an almost regular space. Then $f$ is a.s.c.

Theorem 2.7. Let $f : X \to Y$ be a $\delta$.c function and let $Y$ be a semi-regular space. Then $f$ is s.$\delta$.c.

Theorem 2.8. Let $f : X \to Y$ be a continuous function and let $X$ be an almost regular space. Then $f$ is s.$\delta$.c.

Theorem 2.9. Let $f : X \to Y$ be a c.c function and let $Y$ be a regular space. Then $f$ is s.$\delta$.c.

Theorem 2.10. Let $X, Y$ be regular spaces. Then the following concepts on a function $f : X \to Y$ are equivalent: w.c, c.c, a.c, a.s.c, s.c, $\delta$.c, s.$\delta$.c, continuity.

Most of the following characterizations of $\delta$.c and s.$\delta$.c are given in [8], [6].

Theorem 2.11. Let $f : X \to Y$. Then the following are equivalent:
a) $f(\operatorname{cls}_\delta A) \subset \operatorname{cls}_\delta f(A)$ for every $A \subset X$,
b) The inverse image of a regularly closed set is $\delta$-closed,
c) The inverse image of a regularly open set is $\delta$-open,
d) The inverse image of a $\delta$-closed set is $\delta$-closed,
e) The inverse image of a $\delta$-open set is $\delta$-open,
f) For each $x \in X$ and for each filter base $F$ that $\delta$-converges to $x$, $f(F)$ $\delta$-converges to $f(x)$,
g) For each $x \in X$ and for each net $\{x_\alpha\}_{\alpha \in D}$ that $\delta$-converges to $x$, $\{f(x_\alpha)\}_{\alpha \in D}$ $\delta$-converges to $f(x)$,
h) $f$ is $\delta$.c.

Theorem 2.12. Let $f : X \to Y$. Then the following are equivalent:
a) $f(\operatorname{cls}_\delta A) \subset \overline{f(A)}$ for every $A \subset X$,
b) The inverse image of a closed set is $\delta$-closed,
c) The inverse image of an open set is $\delta$-open,
d) For every $x \in X$, and for every open subset $V$ of $Y$ containing $f(x)$, there exists a regularly open set $U_x$ containing $x$ such that $f(U_x) \subset V$,
e) For each $x \in X$ and for each net $\{x_\alpha\}_{\alpha \in D}$ that $\delta$-converges to $x$, $\{f(x_\alpha)\}_{\alpha \in D}$ converges to $f(x)$,
f) For each $x \in X$ and for each filter base $F$ that $\delta$-converges to $x$, $f(F)$ converges to $f(x)$,
g) $f$ is s.$\delta$.c.

In [3] it is shown that a function $f$ is almost continuous iff its graph mapping $g$, where $g(x) = (x, f(x))$, is almost continuous. In [7], [8] this result was extended to weak continuity and $\delta$.c. In the present paper we extend this result to s.$\delta$.c.

Theorem 2.13. Let $f : X \to Y$ be a mapping and let $g : X \to X \times Y$ be the graph mapping of $f$ given by $g(x) = (x, f(x))$ for every point $x \in X$. Then $g : X \to X \times Y$ is s.$\delta$.c iff $f : X \to Y$ is s.$\delta$.c and $X$ is semi-regular.

Proof. Suppose $g$ is s.$\delta$.c. Let $x \in X$ and let $V$ be an open set in $Y$ containing $f(x)$. Then $X \times V$ is an open set in $X \times Y$ containing $g(x)$. Since $g$ is s.$\delta$.c, $g^{-1}(X \times V)$ is regularly open. But $g^{-1}(X \times V) = f^{-1}(V)$, since $g$ is the graph mapping of $f$. Hence, $f^{-1}(V)$ is regularly open, proving that $f$ is s.$\delta$.c. To prove that $X$ is semi-regular, let $x \in X$ and let $U$ be an open set containing $x$. Then $U \times Y$ is an open set containing $g(x)$. By s.$\delta$.c of $g$, there exists a regularly open set $W$ containing $x$ such that $g(W) \subset U \times Y$. Thus $x \in W \subset U$, proving that $X$ is semi-regular. Conversely, assume $f$ is s.$\delta$.c and let $A \subset X$. Then $g(\operatorname{cls}_\delta A) \subset \operatorname{cls}_\delta A \times f(\operatorname{cls}_\delta A)$, since $g$ is the graph mapping of $f$. By s.$\delta$.c of $f$, $f(\operatorname{cls}_\delta A) \subset \overline{f(A)}$. Since $X$ is semi-regular, $\overline{A} = \operatorname{cls}_\delta A$. Therefore, $g(\operatorname{cls}_\delta A) \subset \operatorname{cls}_\delta A \times f(\operatorname{cls}_\delta A) = \overline{A} \times f(\operatorname{cls}_\delta A) \subset \overline{A} \times \overline{f(A)} = \overline{A \times f(A)} = \overline{g(A)}$, proving that $g$ is s.$\delta$.c.

The next example shows that the graph mapping of an s.$\delta$.c function need not be s.$\delta$.c.

Example 2.4. Let $X = Y = \{1,2,3\}$ with topologies $\Im_Y = \{\emptyset, \{3\}, Y\}$, $\Im_X = \{\emptyset, \{1\}, \{2\}, \{1,2\}, \{1,3\}, X\}$, and $f(x) = 3$ for all $x \in X$. Then $f$ is s.$\delta$.c, but the graph mapping $g$ of $f$, where $g(x) = (x, f(x))$, is not s.$\delta$.c at $x = 1$.

Similar to $\delta$.c [8, Theorems 3.3, 3.4], and following a similar argument as in [4, Theorems 6, 7], we get the following results.

Theorem 2.14. Let $f : X \to \prod_{\alpha \in I} X_\alpha$ be given. Then $f$ is s.$\delta$.c iff the composition with each projection $\pi_\alpha$ is s.$\delta$.c.

Theorem 2.15. Define $\prod_{\alpha \in I} f_\alpha : \prod_{\alpha \in I} X_\alpha \to \prod_{\alpha \in I} Y_\alpha$ by $\{x_\alpha\} \mapsto$
$\{f_\alpha(x_\alpha)\}$. Then $\prod f_\alpha$ is s.$\delta$.c iff each $f_\alpha : X_\alpha \to Y_\alpha$ is s.$\delta$.c.

3. HAUSDORFF SPACES AND $\delta$-CONTINUITIES.

Definition 3.1. A space $X$ is said to be $\theta$.Hausdorff if for every $x \neq y \in X$, there exist $\theta$-open sets $U_x$, $V_y$ such that $U_x \cap V_y = \emptyset$.

Lemma 3.1. Let $X$ be a Hausdorff space. Then for every $x \neq y \in X$, there exist regularly open sets $U_x$, $V_y$ such that $U_x \cap V_y = \emptyset$.

Proof. Let $X$ be a Hausdorff space and let $x \neq y \in X$. Thus, there exist two open sets $U_x$, $V_y$ of $x$ and $y$, respectively, such that $U_x$
Then the set $A = \{x \in X : f(x) = g(x)\}$ is a $\delta$-closed set.

Proof. We will show that $X \setminus A$ is $\delta$-open. Let $x \in A^c$. Then $f(x) \neq g(x)$. Since $Y$ is Hausdorff, there exist regularly open sets $W_{f(x)}$ and $V_{g(x)}$ such that $W \cap V = \emptyset$. By $\delta$.c of $f$ and $g$ there exist regularly open sets $U_1$, $U_2$ of $x$ such that $f(U_1) \subset W$ and $g(U_2) \subset V$. Clearly $U = U_1 \cap U_2 \subset A^c$. Thus $A^c$ is $\delta$-open and hence $A$ is $\delta$-closed.

Theorem 3.3 leads to a generalization of a well-known principle of the extension of identities.

Definition 3.2. A subset $A$ of a space $X$ is called $\delta$-dense if $\operatorname{cls}_\delta A = X$.

Corollary 3.1. Let $f, g$ be $\delta$.c from a space $X$ into a Hausdorff space $Y$. If $f$ and $g$ agree on a regularly dense subset of $X$ then $f = g$ everywhere.

Theorem 3.4. Let $f : X \to Y$ be a $\delta$.c (resp., s.$\delta$.c) map and let $A \subset X$. Then $f|_A : A \to Y$ is $\delta$.c (resp., s.$\delta$.c).

Proof. Straightforward.
Remark. If a function $f : X \to Y$ is $\delta$.c, then $f : X \to f(X)$ need not be $\delta$.c. However, it is true for s.$\delta$.c.

Example 3.1. Let $X = \mathbb{R}$ with the usual topology, $Y = \mathbb{R}$ with the cocountable topology and let $f : X \to Y$ be defined by $f(\text{rationals}) = 1$, $f(\text{irrationals}) = 0$. Then $f$ is $\delta$.c, but $f : X \to f(X)$ is not $\delta$.c.

4. APPLICATIONS

It is well known that the image of a compact set is closure compact under weakly continuous functions, the image of a closure compact set is closure compact under closure continuous functions, and the image of a closure compact set is compact under strongly continuous functions. The following results are the analogues for $\delta$-continuities.

Definition 4.1. A subset $A$ of a space $X$ is said to be $\theta$-compact if every cover of $A$ by $\theta$-open sets has a finite subcover.

Definition 4.2. A subset $A$ of a space $X$ is called nearly compact (briefly,
n.compact) if every open cover of $A$ has a finite subcollection the interiors of whose closures cover $A$. Equivalently, a subset $A$ of a space $X$ is n.compact iff every cover of $A$ by regularly open sets has a finite subcover.

Lemma 4.1 [12, Corollary 2.1]. A subset $A$ of a space $X$ is n.compact iff every cover of $A$ by $\delta$-open sets has a finite subcover.

Remark 4.1. A function $f : (X, T) \to (Y, \Im)$ is s.$\delta$.c iff $f : (X, T_s) \to (Y, \Im)$ is continuous, and $f : (X, T) \to (Y, \Im)$ is $\delta$.c iff $f : (X, T_s) \to (Y, \Im_s)$ is continuous.

Remark 4.2. A subset $K \subset (X, T)$ is an n.compact subset iff $K \subset (X, T_s)$ is compact, and $K \subset (X, T)$ is a $\theta$.compact subset iff $K \subset (X, T_\theta)$ is compact.
The following results follow directly from Remarks 4.1 and 4.2.

Theorem 4.1. Let $f : X \to Y$ be $\delta$.c and let $K$ be an n.compact subset of $X$. Then $f(K)$ is an n.compact subset of $Y$.

Corollary 3.4. Let $f : X \to Y$ be an open a.c function and let $K$ be an n.compact subset of $X$. Then $f(K)$ is an n.compact subset of $Y$.

Theorem 4.2. Let $f : X \to Y$ be s.$\delta$.c and let $K$ be an n.compact subset of $X$. Then $f(K)$ is a compact subset of $Y$.

Theorem 4.3. An n.compact subset of a Hausdorff space is $\delta$-closed.

Theorem 4.4. Every $\delta$-closed subset of an n.compact space is n.compact.

Theorem 4.5. A $\theta$.compact subset of a $\theta$.Hausdorff space is $\theta$-closed.

Theorem 4.6. Every $\theta$-closed subset of a $\theta$.compact space is $\theta$.compact.

Theorem 4.7. Let $f : X \to Y$ be a surjective $\delta$.c function and let $X$ be connected. Then $Y$ is connected.

Proof. Suppose $Y$ is disconnected. Then there exist disjoint open sets $V, W$ such that $Y = \operatorname{Int}(\overline{V}) \cup \operatorname{Int}(\overline{W})$. By $\delta$-continuity of $f$, $f^{-1}(\operatorname{Int}(\overline{V})) = f^{-1}(V)$ and $f^{-1}(\operatorname{Int}(\overline{W})) = f^{-1}(W)$ are open in $X$. But $X = f^{-1}(W) \cup f^{-1}(V)$ and $f^{-1}(W) \cap f^{-1}(V) = \emptyset$. Thus $X$ is disconnected, a contradiction. Therefore, $Y$ is connected.

Theorem 4.8. Let $f : X \to Y$ be $\delta$.c, 1-1 and onto. If $X$ is n.compact and $Y$ is Hausdorff, then the image of every $\delta$-open set is $\delta$-open.
Proof. Let $U$ be a $\delta$-open subset of $X$, so that $X \setminus U$ is a $\delta$-closed subset of $X$. Theorem 4.4 implies that $X \setminus U$ is n.compact. Since $f$ is $\delta$.c, Theorem 4.1 implies that $f(X \setminus U)$ is n.compact. Therefore, Theorem 4.3 implies that $f(X \setminus U) = Y \setminus f(U)$ is $\delta$-closed, and thus $f(U)$ is $\delta$-open.

Theorem 4.9. Let $f : X \to Y$ be $\delta$.c. If $X$ is n.compact and $Y$ is Hausdorff, then the image of every $\delta$-closed set is $\delta$-closed.

Theorem 4.10. Let $f : X \to Y$ be s.$\delta$.c, 1-1 and onto. If $X$ is n.compact and $Y$ is Hausdorff, then the image of every $\delta$-open set is $\theta$-open.

Theorem 4.11. Let $f : X \to Y$ be s.$\delta$.c. If $X$ is n.compact and $Y$ is Hausdorff, then the image of every $\delta$-closed set is $\theta$-closed.

REFERENCES
1 P. Alexandroff and P. Urysohn, Memoire sur les espaces topologiques compacts, Verh. Nederl. Akad. Wetensch. Afd. Natuurk. Sect. I 14(1929), 1-96.
2 S. Fomin, Extensions of topological spaces, C. R. Dokl. Akad. Sci. URSS (N.S.), 32(1941), 114-116.
3 P. E. Long and D. A. Carnahan, Comparing almost continuous functions, Proc. Amer. Math. Soc. 38(1973), 413-418.
4 P. E. Long and L. Herrington, Strongly $\theta$-continuous functions, J. Korean Math. Soc. 18(1981), 21-28.
5 M.N. Mukherjee and S. Raychaudhuri, Some applications of $\theta$-closure operators, Indian Journal of Pure and Applied Mathematics, 26(1995), 433-439.
6 B.M. Munshi and D. Bassan, Super-continuous mappings, Indian Journal of Pure and Applied Mathematics, 13(1982), 229-236.
7 T. Noiri, On weakly continuous mappings, Proc. Amer. Math. Soc. 46(1), 1974, 120-124.
8 T. Noiri, On $\delta$-continuous functions, J. Korean Math. Soc. 18(1980), 161-166.
9 M. Saleh, Some remarks on closure and strong continuity, An Najah J., 1998, 7-20.
10 M. Saleh, Some applications of $\delta$-sets to H-closed spaces, Q&A in Topology, 1999, 203-211.
11 M. Saleh and M. Al Amleh, N-compactness and $\theta$-closed sets, Proceedings of the Mathematics Conference, World Scientific Press, 2000, 207-218.
12 M. Saleh and M. Al Amleh, On $\theta$-closed sets and faint continuity, Proceedings of the Mathematics Conference, World Scientific Press, 2000, 219-232.
13 M.K. Singal and A.R. Singal, Almost continuous mappings, Yokohama Math. J. 16(1968), 63-73.
14 N.V. Velicko, H-closed topological spaces, Transl. AMS, 78(1968), 103-118. 179(1973), 61-69.
Design of user interface for computer-aided instruction of mathematics

K. Tintarev, Uppsala University, P.O. Box 480, 751 06 Uppsala, Sweden

Abstract

Symbolic computation software designed for teaching college and high school mathematics should have a self-explanatory user interface that does not waste the learning effort on machine operation. The paper discusses available directions for such design within the constraints of available operating systems, GUI libraries and computer algebra algorithms.
1 Introduction
Use of technology in education is justified when it improves the quality of education and when such improvement is not offset by negative side-effects. This paper deals with the question of improving technology and decreasing negative side-effects as a precondition for reaping the benefits of the technology for mathematical education. Most software used in the instruction of mathematics falls into the following categories:
a) Fixed content software or "hyperbook" (usually a sequence of screens that present material and calculate scores for the student's answers);
b) Mathematical toolkits, ranging from software for programmable calculators to advanced symbolic computation packages and interactive geometric workareas;
c) Small specialized programs illustrating selected concepts;
d) Software designed for other purposes but found useful for instruction (for example, spreadsheet programs).

Software in category (d) is merely filling the space that could be occupied by software specially designed for education. Software in category (c), once its educational value becomes clear, eventually becomes a part of an annotated collection on a CD-ROM and later gets absorbed into packages in the first two categories, on which we can focus now.

The main disadvantage of a mathematical hyperbook at the present level of technology is that it cannot provide much feedback to the student. A discourse is essential for mathematical education, but with the typical hyperbook software all the machine knows about the student is her scores in preprogrammed multiple choice quizzes and her navigation history. We see the main educational benefit of hyperbooks in their capacity to present visual simulations that illustrate particular mathematical topics and in their role as a catalogue of mathematical knowledge. Both functionalities may evolve in the future: the software will become more perceptive, and online databases with mathematical content and two-way student access will probably emerge. However, the present hyperbooks, which have proved to be efficient in teaching fixed content, such as routines for equipment maintenance, cannot play a central role in mathematical education.

It seems that the main genre of software used for mathematical education in the immediate future will be the toolkit software that provides free-hand user access to symbolic computations, graph plotting and geometric manipulations.
Most of this software can be used not only on a desktop PC but also on the high-end pocket devices which in less than a decade are likely to phase out pocket calculators. From the developer's point of view the distinction between PC and calculator software for mathematical education has ceased to be of essence. Some of the interface ideas for mathematical toolkit software have been implemented by the author in the package MINOS SE.¹

¹ MINOS SE is licensed for free use by Palestinian educational institutions. Copies are available from the Palestinian Society of Mathematical Sciences and, upon request, from the MINOS SE website.
2 Natural object interface from the user viewpoint

2.1 The command line interface
The first generation of symbolic computation software could use only the console/command line interface: to perform a calculation one had to enter a command string, like
> Diff(a*x^2,{x,a},1..4);
This is a typical form for a command to compute all partial derivatives of ax^2 of order 1, 2, 3 and 4 with respect to the variables x and a. A user with programming skills who knows the syntactic conventions of several programming languages would immediately identify Diff as the name of the routine to be executed; the semicolon at the end of the line as a sign for the end of the statement; the commas as separators between arguments; the curly brackets as an enclosure of an argument which is itself a list of arguments; and the symbol ".." as a way of specifying a range of integers. A user without prior experience with computer languages might express his intent for the same calculation with a string like

> diff (ax^2; x,a; 1-4);
The parsing routine of the software won't treat this string kindly: it is not programmed to recognize an intent, but only to pick up pieces of data separated by correct delimiters. Its first response will be to treat everything before the first semicolon, that is, diff (ax^2; as a separate statement, the first of three. A typical parser will throw an error message about a missing bracket in the first "statement". Since changing the brackets would be contrary to the user's intent, the immediate feeling would be that the machine is blocking the human intent. Moreover, since most parsers report only the first discovered error, the user might not be able to deduce from a change in an error message whether the correction failed or the parser has advanced to the
next error. When all syntactic errors in the string above are rectified, the resulting command,

> diff(ax^2,x,a,1..4);
will be executed only if the parser regards diff as equivalent to Diff, and if it does, the output will be a collection of zeroes, because the multiplication intent in ax will not be recognized by a typical parser, which will treat ax as the name of a variable. For many tasks, such as plotting a three-dimensional parametric curve with a given resolution and color scheme, even a user with programming skills might not be able to write the correct command knowing only the command name: as a minimum, one will have to know the correct order of arguments and the way to identify a particular color scheme.
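The point about ax can be illustrated with a minimal sketch in Python. The tokenizer below is hypothetical and is not the parser of any particular package; it only assumes the common convention that a maximal run of letters and digits forms one identifier, so that ax is read as a single variable name rather than as the product of a and x.

```python
import re

# Hypothetical tokenizer: identifiers, integers, the range symbol "..",
# and single-character punctuation. Not taken from any real package.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|\d+|\.\.|.)")


def tokenize(text):
    """Split a command string into tokens, skipping whitespace."""
    return TOKEN.findall(text.strip())


def identifiers(text):
    """Return the set of identifier tokens occurring in the text."""
    return {t for t in tokenize(text) if re.fullmatch(r"[A-Za-z_]\w*", t)}


def depends_on(expression, variable):
    """An expression depends on a variable only if that exact name occurs."""
    return variable in identifiers(expression)


print(tokenize("diff(ax^2,x,a,1..4);"))
print(depends_on("ax^2", "x"))    # "ax" is one name, so x does not occur
print(depends_on("a*x^2", "x"))   # explicit multiplication exposes x
```

Under this convention the expression ax^2 contains neither x nor a, which is why every derivative with respect to x or a comes out as zero.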
2.2 Command system as an obstacle to educational use
In a single instance of successful use of mathematical software one can identify the following steps.
1. The user intends to perform a mathematical operation. ("I want to see the graph of y = x^3 - x.")
2. The user verifies if the capability to perform the operation is attributed to (or claimed by) the software. ("This program is supposed to draw graphs.")
3. The user identifies the command that enacts this capability. ("I guess it is this plot button that I have to press after I type the formula in this box.")
4. The user enacts the command. (Presses the button.)
5. The user assesses the outcome as the operation's success or failure. ("Here is the graph!" or: "Here comes the error message...")
6. The user interprets the output in relation to his intent. ("Why did I get a surface? Shouldn't it be a curve?" or "What does this error message mean?")
7. The user gives a mathematical interpretation of the outcome.
Only the first and the last steps are related to mathematics. The intermediate steps have to do only with operating a machine and can fail without producing a mathematically meaningful outcome. Here are examples of aborted action at the corresponding steps.
1. The user intends to perform a mathematical operation, but thinks it is too much trouble to use the computer and gives up.
2. The user fails to find a clue as to whether the operation is within the scope of the software. ("I can't figure out if this software can draw an intersection curve for two parametric surfaces or not!")
3. The user can't identify the right command. ("It can't be plot3d, it can't be plotparam, what the hell can it be?")
4. The user fails to understand the rules for enacting the command. ("The manual says that the second argument should be of the PLOTDATA type. I don't know what the PLOTDATA type is, all I have is my formula for the function!")
5. The user fails to understand if the output conveys a successful computation. ("Well, here some numbers are coming. Are these components of my matrix or error codes?")
6. In case of failure, the user cannot understand the error message. ("What's Index out of range? I have no index in my formula. What should I do with it?") In case of success the output looks totally foreign to what the user expects to see. ("If this is a vector, how many lines does the expression for the first component take?")
7. The mathematical outcome is not what was expected. ("What? Three local maxima? Pity, the whole model has to be reworked.") This is the only step where failure is didactically significant.

The words "user fails" above signify not only that the action could not be carried out, but also that the student's emotional state can be affected negatively. In case of a machine operation failure the student's mathematical intent is frustrated, and even if not, it yields no mathematically meaningful feedback.
Such a response to the student's intent would not be acceptable in a human teacher, and there is no reason to be any more indulgent towards
educational software. Moreover, even if the student is trained to use user-unfriendly programs, the attention she gives to operating the machine comes as an overhead cost on the learning of mathematics. Consequently, educational use puts a requirement on the interface to reduce the intermediate (machine operation) steps to a minimum. Ideally, the interaction between the student and the machine should take place in the natural terms of mathematical discourse, with expression of intent followed directly by mathematically meaningful output. This requirement does not amount to a demand that the software be capable of mathematical discourse. That is, we do not require the software to handle the user's intents mathematically; we require the software to handle the user's mathematical intents (that is, intents already expressed in a usual mathematical form) instead of their translations into a machine language.

This requirement might become redundant if a particular machine language becomes a part of universal education and merges with mathematical notation to such an extent that it becomes inseparable from the instruction of mathematics. In this case every mathematical intent will be expressed directly in a machine-readable form. However, it would be a mistake to promote any machine language used today (a majority opinion would probably favor the grammar of the C++ family, including Java and the command systems of Maple or Matlab) as a universal language. Unlike mathematical formulas, machine languages of the present are a product of an ongoing technological revolution and are subject to further change. They are essential for science and technical education because they are current industry standards, but not because they are a universal cultural asset like Euclidean geometry or Newton's laws.

Certain categories of students can benefit from computer-assisted instruction of mathematics even with user-unfriendly software.
In a Darwinistic approach one would simply tag as "good" students those who can endure computer-aided instruction of mathematics despite a frustrating, antipedagogic interface, while avoiding any computer-aided instruction when the survival ratios push the student enrollment statistics down, which is usually the case for first-year college students. It is also true of today's students that they are willing to use their own programmable calculators with limited and unfriendly software while shunning the school's desktops. However, this is not an argument in favor of a poor interface, since the students' pocket devices will very soon be computers par excellence, with no technological excuses for an inferior interface.
2.3 Menu-dialog interface
The menu-dialog box interface has laid the ground for a wide educational use of Derive, while high-end packages such as Maple and Mathematica have adopted dialog boxes for a part of their command system. Dialogs (palettes) are also used for entering values for vectors and matrices in editable tables, rather than as a list with delimiters. In a typical menu-dialog interface, a click on the menu word Derivative opens a form with three text fields, labeled, for example, function, variable(s) and order. The menu-dialog interface reduces errors that come from incorrect recollection of the function's name, from a wrong number of arguments, or from incorrect delimiters in the argument list. At the same time the menu-dialog interface remains subordinated to the structure of the underlying function call, and it provides no protection against syntactic errors in the arguments themselves.

The usage sequence with the menu-dialog interface is shorter than with the command line interface, if we define as a single step any complex action that is unlikely to be aborted in the middle.
1. Intention to perform a mathematical operation.
2. Identification and preliminary enaction of an appropriate command (opening the dialog box).
3. Entry of dialog parameters and completion of the command (closing the dialog box).
4. Interpretation of the output in relation to mathematical intent.
5. Mathematical interpretation of the outcome.

It should be noted that the menu-dialog interface can be implemented in any software, regardless of its purpose and content, simply by placing function names or mnemonic keywords on the menu and creating as many fields in the dialog as there are arguments. The menu-dialog interface is being superseded by the visual context (point-and-click, drag-and-drop) interface. While the visual context interface is also based on common GUI libraries, its implementation is much more dependent on the purpose and the content of the software.
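The division of labour in a menu-dialog interface can be sketched in a few lines of Python. The function name build_derivative_command and the output format (modeled on the Diff command string of section 2.1) are assumptions made for illustration only; they do not describe Derive, Maple or Mathematica.

```python
def build_derivative_command(function, variables, order):
    """Assemble a command string from the three dialog fields.

    The command name, the argument order and the delimiters are fixed
    here, so the user cannot get them wrong. The contents of the fields
    are passed through unchecked, apart from the order being an integer.
    """
    variable_list = ",".join(v.strip() for v in variables.split(","))
    return "Diff(%s,{%s},%d);" % (function.strip(), variable_list, int(order))


# The dialog supplies the brackets, commas and semicolon.
print(build_derivative_command("a*x^2", "x, a", "2"))

# A faulty argument passes straight through to the underlying call.
print(build_derivative_command("ax^2", "x, a", "2"))
```

The second call shows the limitation noted above: the dialog guarantees a well-formed call, but the missing multiplication sign inside the first field reaches the parser unchanged.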
2.4 Natural object interface
We would like to outline an interface that would reduce the usage sequence to three steps:
1. Intention to perform a mathematical operation.
2. Identification and enaction of an appropriate command.
3. Mathematical interpretation of the outcome.

This reduction is possible if one achieves two objectives: (a) a mathematically unambiguous visualisation of the state of the system; and (b) a command syntax that greatly reduces the possibility of a syntactically incorrect input. These objectives can be achieved by reification of models of the mathematical objects involved in computations: by creating these models in terms of object-oriented programming and providing these model objects with graphics and with event handlers associated with the visual coordinates of the graphics. We tentatively adopt the term eidolon for such objects.

Of course, reification does not change the fact that in most cases a class meant to represent a mathematical object is a fundamentally imperfect model. For example, it would be too ambitious to have a model object for any real valued function, and a class Function based on an algebraic expression may be a reasonable compromise. Besides the physical constraint in representing mathematical data (to be able to represent any given number with N binary digits one needs at least that many memory cells), there are harder questions, for instance whether a class RealNumbers should be accepted as a model for R without an algorithm that is capable of verifying whether a given set is open or not. For educational use at the present level of technology, the judgement of the models can follow the prevailing practice in education. For example, if in the standard calculus courses one uses only functions given by algebraic expressions, then one can grudgingly agree to have a Function class based on algebraic expressions, perhaps with an additional Domain member (which immediately prompts the question: can Domain take the value of the Cantor set?).
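The compromise described here can be made concrete with a small sketch in Python. The classes Interval and Function below are hypothetical and are not taken from MINOS SE; the sketch assumes that functions are restricted to polynomials in one variable and that a domain is a single closed interval.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Interval:
    """A closed interval [low, high]: a deliberately limited Domain."""
    low: float
    high: float

    def contains(self, x):
        return self.low <= x <= self.high


@dataclass(frozen=True)
class Function:
    """A function modeled by a polynomial expression with a Domain member."""
    coefficients: tuple   # (c0, c1, c2, ...) for c0 + c1*x + c2*x**2 + ...
    domain: Interval = Interval(-math.inf, math.inf)

    def value(self, x):
        if not self.domain.contains(x):
            raise ValueError("point lies outside the domain")
        result = 0.0
        for c in reversed(self.coefficients):   # Horner's scheme
            result = result * x + c
        return result


# x^3 - x restricted to [-2, 2]
f = Function((0, -1, 0, 1), Interval(-2.0, 2.0))
print(f.value(2.0))   # 6.0
```

With this choice of Domain the question about the Cantor set has a plain answer: a single Interval cannot represent it, and any richer Domain class would have to decide which sets it is prepared to model.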
Presenting mathematics by means of a self-contained "object toolbox" has both advantages and drawbacks. It creates a closed, limited view of mathematics (but so does a pocket calculator) and it should never be considered as a substitute for, say, a proper book in Real Analysis (but the same can
be said about a Calculus 101 text). A software package with a full power of a basic calculus book and the instant clarity of the road sign language could be the best answer to the hunger of the modern workplace for graduates who can build efficient and well-conceived mathematical models (usually from a small subset of mathematical machinery).
3 Visualized representations of abstract objects - eidola
The ad hoc choice of adopting 32 x 32-pixel icons for representing mathematical objects in MINOS SE (using Microsoft's List View control as a container) was not based on proper research into the matter. It was motivated by the success of this visual gadget in representing file systems, as well as by the degree of students' familiarity with this metaphor. At the same time it seems to meet some speculative criteria.

Text representation of mathematical objects, i.e. formulas, is in essence a compromise with the limitations of paper as a medium. On one hand, formulas provide compact, visually efficient representations of sets with complex structures, of multi-tier compositions of mappings etc.; on the other hand, they are ambiguous without the context of the given mathematical writing and, to a lesser extent, of the mathematical discipline to which the writing belongs. The meaning of mathematical formulas can be sustained only by the unabating diligence of the reader. In a formula, a letter is just a letter, and cannot sustain its meaning as a mathematical object without the broader context. This would not be so if, for example, every representation of a function contained a "hyperlink" to the list of its arguments. Such a requirement for a printed text would be ridiculous, but it is most natural for software once one accepts in principle that registration of mathematical objects by a computer does not always mean typesetting of a mathematical text.

Our notion of natural object interface rests on a requirement that the visual representation of a mathematical object in software has to provide an instant context in self-contained terms. Although we do not exclude the use of "classical" hyperlinks to text glossary entries, those should not make up the core of the instant context. An automatic display of the argument list under every Function icon in MINOS SE is a more relevant example, and so are the context (right mouse button) menus of properties and actions available
for a given eidolon. Moreover, eidola should have several view modes. A Function, for example, can appear by the user's choice as a moveable icon in a window (default), as a text formula, as a graph in the graph window and as a table of the function's values. Each additional representation serves as a gateway to additional instant context. For example, a graph of a function provides an interface for finding zeroes and local extrema by point-and-click. One can also imagine, for example, an eidolon of type Linear Operator with an "external" view as a member of an algebra and an "internal" view as a matrix in a specified basis (a Basis eidolon will be needed for that purpose).
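The idea of several view modes for one eidolon can be sketched as follows, in Python. The class FunctionEidolon and its method names are hypothetical illustrations and do not reproduce MINOS SE; the icon and graph views are reduced here to a text label and a table of values, since real graphics depend on the GUI library.

```python
class FunctionEidolon:
    """One model object offering several views of the same function."""

    def __init__(self, name, arguments, rule, formula):
        self.name = name            # label shown with the icon
        self.arguments = arguments  # tuple of argument names
        self.rule = rule            # Python callable computing the value
        self.formula = formula      # text form of the expression

    def as_icon(self):
        """Icon label with the argument list displayed under the name."""
        return "%s(%s)" % (self.name, ", ".join(self.arguments))

    def as_formula(self):
        """Text formula view."""
        return "%s = %s" % (self.as_icon(), self.formula)

    def as_table(self, points):
        """Table-of-values view over the given sample points."""
        return [(x, self.rule(x)) for x in points]


f = FunctionEidolon("f", ("x",), lambda x: x**3 - x, "x^3 - x")
print(f.as_icon())
print(f.as_formula())
print(f.as_table([-1, 0, 1, 2]))
```

Each view is computed from the same underlying object, so switching views never requires the user to re-enter the function.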
4
Natural object syntax
Translation of the user's mathematical intent into a command can be done effortlessly if the syntax of the command is somehow subordinated to the syntax of the intent. This is not the case with the command line interface: our intent does not suffer from such faults as typing errors or missing delimiters, and it can be expressed very laconically, like DifferentiateThis, provided that This is defined. If one needs to calculate a derivative of a specific order, the intent sentence can be easily rephrased as (IterateThatManyTimes)(DifferentiateThis). The question of to what extent one can develop a full programming language for mathematical intent based only on what can be expressed by mouse movements is not discussed in this paper. All that the syntactic conventions in MINOS SE have accomplished is to express an ad hoc set of intent sentences, which we exemplify below. To have an operation performed on an eidolon, one can use one of the following synonymous forms: click on an operation name in a context menu attached to the eidolon; select (highlight) the eidolon and press the operation button identified by mnemonic graphics and tooltip text; create an operation eidolon and drag it to the (argument) eidolon. A multiple selection of several eidola followed by pressing an operation button expresses a commutative operation such as sum, or an operation applied separately to each of the eidola (it can also be viewed as a commutative operation). Eidola that can be dragged freely in the window make a metaphor of an unordered set, but it is possible to supply a row of cells that serves as a metaphor for an ordered set and then extend the button metaphor to an operation where the order of arguments is of the essence. In some cases when parameters are normally required, but form a finite
(and visually not large) set, an operation may output results for every value of the parameter instead of prompting for a value. For example, differentiation of a function of several variables may produce all partial derivatives of the first order at once. An operation involving two arguments, particularly when their order is of the essence, can be expressed by the following sequence of actions. First the user presses the operation button. This changes the mouse cursor to the graphics on the button, for example, to the division sign. Then the user drags one eidolon onto another. The output of all operations on eidola is also in the form of eidola, so that the results can be reused in subsequent computations. If an operation fails to produce an expected eidolon, a blinking red light signals an error. This indiscriminate error signal is meant to convey that the user has exceeded the capability of the software, and has no connotations of the user's personal failure. In this command system, typical errors that the user can make are as follows. One can accidentally use a different object or a different operation from the one really intended. This error is normally noticed while it is being made, exactly as when one picks a dime from the wallet instead of a nickel. One can make a meaningless mouse movement that will produce no result. One can apply an operation to a pair of objects for which it is not defined, e.g. trying to add a scalar and a vector. Such an action will simply yield no result. Since algebraic expressions are defined in MINOS SE by formulas (although they can be "assembled" by subsequent algebraic operations on eidola), syntax errors can still be made at the formula level. At the same time the formula parser allows synonymous syntax: for example, both x(x + a) and x * (x + a) are legitimate and parse into identical data structures.
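The synonymous formula syntax mentioned above can be illustrated with a small sketch. It is not the MINOS SE parser: it borrows Python's own `ast` module as a stand-in syntax tree, and the helper names `normalize`, `parse` and `same_structure` are hypothetical. It also assumes that formulas contain no function calls, so that a name followed by a parenthesis always means multiplication.

```python
import ast
import re


def normalize(formula: str) -> str:
    """Insert an explicit '*' where a name, digit or ')' is directly followed by '('.

    Assumption for this sketch: formulas contain no function calls such as sin(x),
    so "x(" always denotes multiplication.
    """
    return re.sub(r"(?<=[\w)])\s*\(", "*(", formula)


def parse(formula: str) -> str:
    """Return a textual dump of the syntax tree of the normalized formula."""
    tree = ast.parse(normalize(formula), mode="eval")
    return ast.dump(tree)


def same_structure(first: str, second: str) -> bool:
    """True when both spellings parse into identical trees."""
    return parse(first) == parse(second)


if __name__ == "__main__":
    print(same_structure("x(x + a)", "x * (x + a)"))
    print(same_structure("x(x + a)", "x + a"))
```

Normalizing before parsing keeps the tolerance for alternative spellings in one place, so every later stage sees a single data structure per expression.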
5
Didactic remarks
A great deal of students' insecurity in mathematics might be traced to unreliable long calculations in which one loses the connection between the problem-solving strategy and the outcome. Delegating mechanical manipulations from time to time to software might shift the focus back from formal manipulations to solution strategy. If a modern high school student (unlike so many in the previous generation) can instantly answer the question "can a person moving along the equator at a constant speed of 4.5 km/hr circumnavigate the globe (40000 km) in less than one year," this is because she can reach for a pocket calculator. Obviously the crucial mathematical skill for answering this question is the ability to write down the successive divisions 40000/4.5/24 rather than to perform these divisions on paper. The symbolic computation environment also allows the use of dimensioned quantities, so that the computation can be set up as (40000 * km)/(4.5 * km/hr)/(24 * hr/day), with the cancellation of units carried out by the algebra routines. The student certainly has to know what operation the computer is performing, and she has to be able to solve the problem without the machine as well. However, it is not obvious that training in mental computation always has to precede computer-assisted computation. We see no immediate objections to the following sequence: introducing the notions; showing examples to the class in a computer-assisted demonstration; letting the students solve further problems with a computer; and only then, when the subject matter is sufficiently understood, letting the students do the calculation entirely mentally. In existing educational practice the latter sequence is rarely invoked, most often for organizational reasons. Few classrooms, even in rich countries, have on-site computer projection equipment, and technology-assisted instruction involves disruptions, such as checking out and checking in the projection equipment by the teacher and moving the class to and from the computer lab. With the eventual arrival of cheap pocket PCs and falling costs of computer display projectors, these disruptions should disappear. Many students' mathematical mistakes are rooted in the seeming triviality of substitution. The non-computer, formula-on-paper interface allows one to perform this operation in a formal, mindless, computer-like way by replacing a selected character in the formula with a specified expression.
This conceals from the student a deeper meaning of the operation as composition of mappings and allows him to forget about domains and ranges of the mappings involved. A computer interface can represent substitutions in a more eloquent and robust manner. For instance, the Function eidolon in MINOS SE displays automatically arguments of the function (more precisely, names of variables found in the expression that defines the function). A drag of the visual representation of y(x) (the formula is concealed from the user and is revealed only on demand, by a click on the icon) to the visual representation of z(y) creates a new eidolon with the result of substitution, which is automatically labeled z(x). In this way the operation of composition in MINOS SE draws the attention to the functional dependency, immediately answering the question what variable(s) the result depends on before one can see the
resulting formula. Besides substitution (a drag of one function icon to another), MINOS SE, solely because of the importance of the notion, also defines the operation of composition, which is initiated by dragging z(y) to y(x) with the ∘-cursor and uses the same background routine as substitution, only with the reverse order of arguments. The natural object interface is helpful in laying out the conditions of a problem and designing a solution strategy without suffering from the linearity of a paper record (or of a linear, although editable, session log that symbolic computation software usually generates), provided that it allows one to arrange representations of objects in an arbitrary geometric order, to use multiple virtual locations for categorization of data, or to link representations of objects with graphical markers (for example, long arrows) indicating planned or performed actions. The natural object interface can also support communication of the results, beyond a session log produced automatically or, preferably, by the student, by registering an eidolon as an OLE data type for data exchange between applications and by using Internet protocols either for mailing eidola to other users or for synchronizing the software's workspace between several computers.
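The dimensioned computation for the equator question discussed earlier in this section can be sketched as follows. The `Quantity` class is a hypothetical illustration of unit cancellation, not the algebra routines of MINOS SE; it tracks unit exponents in a dictionary and uses exact fractions for the numeric part.

```python
from fractions import Fraction


class Quantity:
    """A number with unit exponents, e.g. km/hr is {'km': 1, 'hr': -1}."""

    def __init__(self, value, units=None):
        self.value = Fraction(value)
        # Drop units whose exponent has cancelled to zero.
        self.units = {u: p for u, p in (units or {}).items() if p != 0}

    def __truediv__(self, other):
        units = dict(self.units)
        for unit, power in other.units.items():
            units[unit] = units.get(unit, 0) - power
        return Quantity(self.value / other.value, units)


equator = Quantity(40000, {"km": 1})
speed = Quantity("4.5", {"km": 1, "hr": -1})
hours_per_day = Quantity(24, {"hr": 1, "day": -1})

# (40000 km) / (4.5 km/hr) / (24 hr/day): km and hr cancel, days remain.
trip = equator / speed / hours_per_day

if __name__ == "__main__":
    print(float(trip.value), trip.units)
```

The result is 10000/27, roughly 370 days, so under these figures the walk takes slightly more than one year; the surviving unit `day` confirms that the divisions were set up in the right order.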
6
Program architecture behind natural object interface
Object-oriented programming is natural for natural object interfaces, and it allows gradual improvement of the mathematical models used. For example, one can create a class called Number that in a draft version would be based on the standard data types (e.g. 32-bit integers and floats), with member functions such as number.add (or number.op("+"), or the eventually overloaded "+") invoking nothing more than the system's integer or floating point addition. At a later step this class can be expanded, for example, by a long integer library, supplemented with rationals built on long integers, constructively defined real numbers, complex numbers etc. All that is needed is that the subsequent versions of the Number class keep the old monikers of function calls and that the results of computations always yield the current version of Number objects. The object hierarchy should be open to grow both upwards (old classes used as members in new classes) and downwards (new classes used as members in old classes). At the
"ground level" of MINOS SE we have placed, in a traditional manner, the Expression class, implementing basic algorithms of computer algebra. In general, member functions of Expression could be:
    Expression.in(string)
    string Expression.out
    string() Expression.variables
    Expression Expression.op(string operation, arguments())
used as follows (in pseudocode):
    E = new Expression;
    E.in("5*a*(x+2)");
    Eprime = E.op("differentiate", E.variables(1));
    Form.Print Eprime.out;
The user does not have to know anything about these commands; they are executed in the background. For the programmer, however, this division into steps is what allows the user's mouse movements to be turned into mathematical computations. Since the class Expression can be endowed with different data structures and different algebraic rules, the downward expansion of the system can be virtually limitless. Above Expression one can already have classes modelling mathematical objects, such as the Function class mentioned before. It is obvious that Expression itself should not be used to represent functions, because this would not allow future versions of Function to have other attributes such as Domain. As an example of upward expansion, the Vector class can be based on an array of Function objects. Eidola always come at the top of the hierarchy, desirably as extensions of a prototype Eidolon class whose methods, besides a constructor, a destructor and a cloner, should include
    Eidolon.Icon
    Eidolon.Open(viewMode)
    Eidolon.Operation(opSymbol, [args()])
    Eidolon.Print(printMode, [coordinates])
In MINOS SE we used instead a ready library class (Microsoft's Listview.Listitem) for visual representation of eidola and for processing mouse and keyboard events. Any creation of an instance of Listitem invokes the constructor of Eidolon and double-links the two objects (each has a field reserved for its counterpart), which, together with similar coordination of other methods, allows them to work as one object. If a mouse event results in a computation failure, the convention is that the resulting Eidolon is set to null, which in turn serves as a flag blocking a call to the constructor of Listitem. If the computation succeeds, an Eidolon is created first and its constructor initiates a new Listitem and links the two.
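The layering described in this section can be sketched in a few lines. Everything below is an assumption made for illustration: `Function`, `ListItem`, `Eidolon` and `substitute` are hypothetical stand-ins, not the MINOS SE classes or Microsoft's Listview.Listitem, and variables are assumed to be single letters. The sketch shows the double link between a model object and its visual counterpart, the null-on-failure convention, and the automatic relabeling of a substitution result as z(x).

```python
import re
from typing import Optional


class Function:
    """Hypothetical model class: a name plus a defining formula."""

    def __init__(self, name: str, formula: str):
        self.name = name
        self.formula = formula

    @property
    def variables(self):
        # Single-letter variable names found in the formula (assumption).
        return sorted(set(re.findall(r"[A-Za-z]", self.formula)))

    def label(self) -> str:
        return f"{self.name}({', '.join(self.variables)})"


def substitute(outer: Function, inner: Function) -> Optional[Function]:
    """Replace the variable named like `inner` inside `outer`; None on failure."""
    if inner.name not in outer.variables:
        return None
    pattern = rf"(?<![A-Za-z]){re.escape(inner.name)}(?![A-Za-z])"
    formula = re.sub(pattern, lambda _: f"({inner.formula})", outer.formula)
    return Function(outer.name, formula)


class ListItem:
    """Stand-in for the GUI library item; holds a caption and a back link."""

    def __init__(self, caption: str):
        self.caption = caption
        self.eidolon = None


class Eidolon:
    def __init__(self, model: Function):
        self.model = model
        self.item = ListItem(model.label())  # constructor creates the visual part
        self.item.eidolon = self             # double link between the two objects

    def operation(self, op_symbol: str, *args: "Eidolon") -> Optional["Eidolon"]:
        result = None
        if op_symbol == "substitute" and args:
            result = substitute(self.model, args[0].model)
        # A failed computation yields None, so no visual item is ever created.
        return Eidolon(result) if result is not None else None


z = Eidolon(Function("z", "y*y + 1"))
y = Eidolon(Function("y", "3*x"))
zx = z.operation("substitute", y)  # corresponds to dragging y(x) onto z(y)
```

Here `zx.item.caption` is "z(x)", so the dependence of the result on x is visible before the formula itself is inspected, while an undefined operation simply returns None.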
References
[1] B. Kutzler, Improving Mathematics Teaching with DERIVE, Chartwell-Bratt, 1996.
[2] C. Kynigos, M. Koutilis, T. Hadzilacos, Mathematics with component-oriented exploratory software, International Journal of Computers for Mathematical Learning 2, 229-250 (1997).
[3] J.-B. Lagrange, Mathématiques, calcul formel, programmation. Un point de vue didactique, Bulletin - APMEP (Paris) no. 429, 474-481, 2000.
[4] A. Sierpinska, T. Dreyfus, J. Hillel, Evaluation of a teaching design in linear algebra: the case of linear transformations, Recherches en Didactique des Mathématiques 19, 7-40, 1999.
[5] A. Sfard, Problems of reification: representations and mathematical objects, in: D. Kirshner (ed.), Psychology of Mathematical Education, vol. 1, 3-34, North American Chapter, 1994.
[6] Metaphorical objects in advanced mathematical thinking, International Journal of Computers for Mathematical Learning 2, 61-65, 1997.
ASYMPTOTIC BEHAVIOR OF VISCOUS 1-D SCALAR CONSERVATION LAWS WITH NEUMANN BOUNDARY CONDITIONS
CHONGSHENG CAO AND EDRISS S. TITI
ABSTRACT. In this paper we consider the long-time behavior of a generalized viscous Burgers equation - a one dimensional scalar conservation law with viscosity - subject to Neumann boundary conditions. We show that all the steady state solutions of this problem are constant functions. Furthermore, we prove that, for any initial data, the time dependent solution converges to a steady state solution, as the time grows unboundedly to infinity.
1. INTRODUCTION
Starting from the pioneering work of Burgers [3], [17] and Hopf [9], the Burgers equation has always been used as a paradigm for shedding light on turbulence and other nonlinear phenomena (see, for example, [12], [13], [19], [20], [22], [23] and references therein). In recent years, the Burgers equation has also been used as a model for studying boundary and distributed parameter feedback control of nonlinear partial differential equations (see, for example, [2], [4], [5], [6], [10], [14], [15], [16] and references therein). In this paper, we study and characterize the long-time behavior of solutions to the unforced viscous Burgers equation subject to Neumann boundary conditions. This problem was brought to our attention by Professor D.S. Gilliam [15]. The tools and results we present here are equally valid for a larger class of viscous one dimensional scalar conservation laws, of which the Burgers equation is a special case. Therefore, we will study here the long-time behavior of the following class of one dimensional generalized Burgers equations in the interval $\Omega = (0,1)$, subject to the homogeneous (no flux) Neumann boundary conditions,
\[ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} + \frac{\partial F(u)}{\partial x} = 0, \quad \text{for } x \in \Omega,\ t > 0, \tag{1} \]
\[ u_x(0,t) = u_x(1,t) = 0, \tag{2} \]
\[ u(x,0) = u_{in}(x), \quad \text{for } x \in \Omega, \tag{3} \]
where $u_{in}(x) \in C^0(\overline{\Omega})$, the initial data, is given. The function $F$ is assumed to be in $C^2(\mathbb{R})$ with $F'' > 0$. Notice that in the case of the Burgers equation $F(u) = \frac{1}{2}u^2$. It is well known that in the case of Dirichlet boundary conditions the viscous Burgers equation, with a source term, has a unique steady state which attracts all the time dependent solutions. That is, in the case of Dirichlet boundary conditions the global attractor consists of a single stable steady state (see, for example, [8] and [13]). However, in the case of the Neumann boundary conditions, the steady states are not unique. Every constant function is a steady state solution. Thus, the long-time behavior in this case is not necessarily trivial. Nonetheless, we will show, in section 3, that each time dependent solution of the above system (1)-(3) converges to a constant function, hence to a steady state solution. We are still unable, however, to find a simple connection between the asymptotic limiting constant function and the initial data. This interesting problem is a subject of future analytical and numerical work. After the completion of an earlier draft of this paper it was brought to our attention that in [24] the author proved the same results for a more general class of 1-D second order parabolic equations, including the system (1)-(3), by showing that there are nontrivial local Liapunov Functionals for those systems. We refer the reader to [25] for details. However, our approach is different and much more elementary. We rely substantially on the Maximum Principle. In the Liapunov Functional context of [24] we prove that
\[ \int_0^1 |w_x|^2\,dx \]
is a Liapunov Functional, where $w$ is an auxiliary function we introduce later in (65).
Date: September 8, 2001.
2. GLOBAL EXISTENCE AND REGULARITY
In this section we introduce some notions and establish the global existence and uniqueness of the strong solution to the initial Neumann boundary value problem (1)-(3). In particular, we prove the maximum principle, which plays an essential role in our analysis. We provide detailed proofs of these results for the sake of completeness. Moreover, as far as we are aware, such details for weak solutions of the Neumann boundary value problem are not available in the literature (see, however, [18] for the case of classical solutions). Denote by
\[ V = \{u \in C^\infty(\overline{\Omega}) : u_x(0) = u_x(1) = 0\}, \]
and
\[ \mathcal{F}(a) = \max_{|x| \le a}\{|F(x)|, |F'(x)|, |F''(x)|\}. \tag{4} \]
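Definition 1. Let $u_{in} \in C^0(\overline{\Omega})$ and let $T$ be any positive number. A function $u(x,t)$ is called a regular solution of (1)-(3) on $[0,T]$ if
\[ u \in L^2_{loc}((0,T]; H^3(\Omega)) \cap C((0,T]; H^2(\Omega)) \cap L^2([0,T]; H^1(\Omega)) \cap C([0,T]; L^2(\Omega)) \]
and if it satisfies
\[ (u(t),\phi)_{L^2} - (u(t_0),\phi)_{L^2} + \int_{t_0}^{t} (u_x(s),\phi_x)_{L^2}\,ds + \int_{t_0}^{t} ((F(u(s)))_x,\phi)_{L^2}\,ds = 0 \tag{5} \]
for every $\phi \in V$ and every $t, t_0 \in [0,T]$.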
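Proposition 2. (maximum principle) Let $u(x,t)$ be a regular solution of the system (1)-(3) in the interval $[0,T]$. Then,
\[ \max_{0 \le x \le 1} u(x,t) \le \max_{0 \le x \le 1} u(x,t_0), \tag{6} \]
\[ \min_{0 \le x \le 1} u(x,t) \ge \min_{0 \le x \le 1} u(x,t_0), \tag{7} \]
for every $0 \le t_0 \le t \le T$.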
Proof. Let $t_0 \in (0,T]$ be fixed and let $u(x,t)$ be a regular solution on $[0,T]$. First let us observe from the definition of a regular solution that $u, u_x \in C^0((0,T] \times \overline{\Omega})$. As a result $F(u(x,t)) \in C^0((0,T] \times \overline{\Omega})$ and $(F(u(x,t)))_x = F'(u(x,t))\,u_x(x,t) \in C^0((0,T] \times \overline{\Omega})$. Moreover, we also have $u_{xx} - F'(u)u_x \in C^0((0,T], L^2(\Omega))$. Therefore, we get
\[ u_t = u_{xx} - F'(u)u_x \in C^0((0,T], L^2(\Omega)). \]
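Finally, it is clear that since $u \in L^2_{loc}((0,T]; H^3(\Omega))$ we obtain
\[ \left| \int_{t_0}^{t} u_{xx}(x,s)\,ds - \int_{t_0}^{\tau} u_{xx}(y,s)\,ds \right| \le \left| \int_{\tau}^{t} u_{xx}(x,s)\,ds \right| + \left| \int_{t_0}^{\tau} |u_{xx}(x,s) - u_{xx}(y,s)|\,ds \right|. \]
By the Cauchy-Schwarz inequality, we have
\[ \left| \int_{\tau}^{t} u_{xx}(x,s)\,ds \right| \le |t-\tau|^{1/2} \left| \int_{\tau}^{t} \|u_{xx}(\cdot,s)\|_{L^\infty}^2\,ds \right|^{1/2}, \]
and
\[ |u_{xx}(x,s) - u_{xx}(y,s)| = \left| \int_{y}^{x} u_{xxx}(\xi,s)\,d\xi \right| \le \|u_{xxx}(\cdot,s)\|_{L^2(\Omega)}\,|x-y|^{1/2}. \]
Note that
\[ \|u_{xx}(\cdot,s)\|_{L^\infty}^2 \le C\,\|u_{xx}(\cdot,s)\|_{L^2(\Omega)}\,\|u(\cdot,s)\|_{H^3(\Omega)} \le C\,\|u(\cdot,s)\|_{H^3(\Omega)}^2. \]
Thus
\[ \left| \int_{\tau}^{t} u_{xx}(x,s)\,ds \right| \le C\,|t-\tau|^{1/2} \left| \int_{\tau}^{t} \|u(\cdot,s)\|_{H^3(\Omega)}^2\,ds \right|^{1/2}. \]
Therefore, we get
\[ \left| \int_{t_0}^{t} u_{xx}(x,s)\,ds - \int_{t_0}^{\tau} u_{xx}(y,s)\,ds \right| \le C\,|t-\tau|^{1/2} \left| \int_{\tau}^{t} \|u(\cdot,s)\|_{H^3(\Omega)}^2\,ds \right|^{1/2} + C \left| \int_{t_0}^{\tau} \|u(\cdot,s)\|_{H^3(\Omega)}\,ds \right| |x-y|^{1/2}. \]
Thus,
\[ \int_{t_0}^{t} u_{xx}(x,s)\,ds \in C^0((0,T] \times \overline{\Omega}), \]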
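as a function of $x$ and $t$. Combining the above we conclude that
\[ u(x,t) - u(x,t_0) - \int_{t_0}^{t} u_{xx}(x,s)\,ds + \int_{t_0}^{t} F'(u(x,s))\,u_x(x,s)\,ds \in C^0((0,T] \times \overline{\Omega}). \]
Hence,
\[ u(x,t) - u(x,t_0) - \int_{t_0}^{t} u_{xx}(x,s)\,ds + \int_{t_0}^{t} F'(u(x,s))\,u_x(x,s)\,ds = 0 \]
holds classically as functions in $C^0((0,T] \times \overline{\Omega})$. In particular, we have that $u_t - u_{xx} + F'(u)u_x = 0$ holds as a classical differential equation in the space $L^2(\Omega)$.
Let $\tilde{u}(x,t) = u(x,t) - \max_{0 \le x \le 1} u(x,t_0)$ and $\tilde{u}^+(x,t) = \max\{0, \tilde{u}(x,t)\}$. It is clear that $\tilde{u}$ satisfies
\[ \tilde{u}_t - \tilde{u}_{xx} + F'(u)\,\tilde{u}_x = 0, \tag{8} \]
\[ \tilde{u}_x(0,t) = \tilde{u}_x(1,t) = 0. \tag{9} \]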
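By taking the $L^2$ inner product of equation (8) with $\tilde{u}^+$, we get
\[ \frac{1}{2}\frac{d}{dt}\int_0^1 (\tilde{u}^+)^2\,dx + \int_0^1 ((\tilde{u}^+)_x)^2\,dx + \int_0^1 F'(u)\,\tilde{u}^+(\tilde{u}^+)_x\,dx = 0. \]
Since $u \in C((0,T] \times \overline{\Omega})$, we obtain
\[ \left| \int_0^1 F'(u)\,\tilde{u}^+(\tilde{u}^+)_x\,dx \right| \le \max_{0 \le x \le 1}|F'(u(x,t))|\;\|\tilde{u}^+\|_{L^2(\Omega)}\,\|(\tilde{u}^+)_x\|_{L^2(\Omega)}. \]
By the Cauchy-Schwarz inequality, we have
\[ \frac{1}{2}\frac{d}{dt}\int_0^1 (\tilde{u}^+)^2\,dx \le \max_{0 \le x \le 1}|F'(u(x,t))|^2 \int_0^1 |\tilde{u}^+|^2\,dx. \]
Thanks to Gronwall's inequality,
\[ \|\tilde{u}^+(t)\|_{L^2(\Omega)}^2 \le \|\tilde{u}^+(t_0)\|_{L^2(\Omega)}^2 \exp\left( 2\int_{t_0}^{t} \max_{0 \le x \le 1}|F'(u(x,\tau))|^2\,d\tau \right). \]
Since $\tilde{u}^+(t_0) = 0$, we have $\|\tilde{u}^+(t)\|_{L^2(\Omega)} = 0$ for all $t \ge t_0 > 0$. In other words,
\[ \max_{0 \le x \le 1} u(x,t) \le \max_{0 \le x \le 1} u(x,t_0), \quad \text{for all } t \ge t_0 > 0. \]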
Similarly, we can show (7), for all $t \ge t_0 > 0$. Next, we show that (6) and (7) hold for $t_0 = 0$. Let us denote by
\[ M(t) = \max_{0 \le x \le 1} u(x,t) \quad \text{and} \quad m(t) = \min_{0 \le x \le 1} u(x,t). \]
Since (6) and (7) hold for $t_0 > 0$, then $\|u(\cdot,t)\|_{L^\infty(\Omega)} = \max\{|M(t)|, |m(t)|\}$ is a monotonic nonincreasing function of time for $t > 0$. Therefore,
Urn | K , OIIL-(O) = < , < < » •
(10)
Moreover, since we proved that (6) holds for $t_0 > 0$, $M(t)$ is a monotonic nonincreasing function of time for $t > 0$. As a result of (10) we have $\lim_{t \to 0^+} M(t) = M^0$.
Thus,
\[ \lim_{t \to 0^+} \|(u(\cdot,t) - M^0)^+ - (u_{in}(\cdot) - M^0)^+\|_{L^2(\Omega)} = 0. \tag{11} \]
Observe, however, that
u(x,t)-M°
<M{t)-M°
and since M(t) — M° > 0 we conclude that 0 < (u(-,t) - M°)+ < M(t) - M°. Thus, hm||(W(-,t)-M0)+||£2(n)=0. Together with (11) implies that (Min(-)-M°)+=0, and hence M(0) = max uin(x) < M° < M{t)
for all t > 0,
which proves (6) for all to > 0. Once again, the proof of (7) is similar.
•
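The passage from the energy identity to Gronwall's inequality in the proof of Proposition 2 can be spelled out as follows. This is a sketch in notation introduced only here: $w$ stands for the positive part of $u(x,t) - \max_{0 \le x \le 1} u(x,t_0)$, $y(t) = \|w(\cdot,t)\|_{L^2(\Omega)}^2$, and $A(t) = \max_{0 \le x \le 1}|F'(u(x,t))|$. The second inequality is Young's inequality $ab \le a^2 + b^2/4$.

```latex
\begin{aligned}
\frac12\, y'(t) + \|w_x\|_{L^2}^2
  &= -\int_0^1 F'(u)\, w\, w_x \, dx
   \le A(t)\,\|w\|_{L^2}\,\|w_x\|_{L^2}
   \le \|w_x\|_{L^2}^2 + \frac{A(t)^2}{4}\,\|w\|_{L^2}^2, \\
y'(t) &\le \frac{A(t)^2}{2}\, y(t)
  \quad\Longrightarrow\quad
  y(t) \le y(t_0)\,\exp\!\Big(\frac12\int_{t_0}^{t} A(s)^2\,ds\Big).
\end{aligned}
```

Since $w(\cdot,t_0) = 0$ by construction, $y(t_0) = 0$ and therefore $y(t) = 0$ for all $t \ge t_0$, which is (6) for $t_0 > 0$. Any cruder constant in front of $A(t)^2$ serves equally well, because only the vanishing of $y(t_0)$ is used.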
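Theorem 3. Let $T$ be any positive number. Then, for every $u_{in}(x) \in C^0(\overline{\Omega})$, there is a unique regular solution $u(x,t)$.
We will give the proof of Theorem 3 later. But, first, let us consider the following initial boundary value problem (linearization of (1)-(3) about $\eta(x,t)$)
\[ \frac{\partial \xi}{\partial t} - \frac{\partial^2 \xi}{\partial x^2} + F'(\eta(x,t))\,\frac{\partial \xi}{\partial x} = 0, \quad x \in \Omega,\ t > 0, \tag{12} \]
\[ \xi_x(0,t) = \xi_x(1,t) = 0, \tag{13} \]
\[ \xi(x,0) = u_{in}(x), \quad x \in \Omega, \tag{14} \]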
where $\eta(x,t)$ is a given function such that
\[ \bar{\eta} = \sup_{0 \le x \le 1,\ t \ge 0} |\eta(x,t)| < \infty \tag{15} \]
and, for every T > 0, W=[l0
H-Mmdt)
(16)
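Definition 4. Let $u_{in} \in C^0(\overline{\Omega})$ and let $T$ be any positive number. A regular solution of (12)-(14) on $[0,T]$ is a function $\xi(x,t)$ such that:
\[ \xi \in L^2_{loc}((0,T]; H^3(\Omega)) \cap C((0,T]; H^2(\Omega)) \cap L^2([0,T]; H^1(\Omega)) \cap C([0,T]; L^2(\Omega)) \]
and
\[ (\xi(t),\phi)_{L^2} - (\xi(t_0),\phi)_{L^2} + \int_{t_0}^{t} (\xi_x(s),\phi_x)_{L^2}\,ds + \int_{t_0}^{t} (F'(\eta(s))\,\xi_x(s),\phi)_{L^2}\,ds = 0 \tag{17} \]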
for every
(18)
ll&(-.*)lli» + | j [ ' llf-.(-,-)lli»«d» < ^2(T,^)
(19)
ll«-.<
\\t«.(;t)\\h
+ ^f
ll^x(-,s)||2L2S2dS
(20)
where $K_1(T,\bar{\eta})$, $K_2(T,\bar{\eta})$ and $K_3(T,\bar{\eta},\bar{\eta}_T)$ will be specified later as in (26), (28) and (30), respectively. Furthermore, $\xi(x,t)$ satisfies the maximum principle:
\[ \|\xi(\cdot,t)\|_{L^\infty} \le \|u_{in}\|_{L^\infty} \quad \text{for all } t > 0. \tag{21} \]
Proof. We prove this Proposition by using the standard Galerkin procedure. It is clear that
F'(n(*,t)) dt dx2 (Cm).(0,t) = « m ) , ( l , t ) = 0 I f m (z,0) = P m u m (a;), "t" r m
-t
(22) (23) (24)
where £m(a;,i) = SfcLo ak(t)k{x) £ F m . The global existence of solution to this finite dimensional linear system (22)-(24) is obvious. We only show the regularity, i.e. the estimates (18)-(20). Let T > 0 be given. By taking the L2 inner product of equation (22) with £ m , we get |
^
^
+ llttm).|ll. = - / V ( f , ( * , 0 ) « m ) - f m « f a < J i ( f l ) | | « m | | L » l l « m ) . | | i > ,
where $\mathcal{F}$ and $\bar{\eta}$ are as in (4) and (15), respectively. By using Young's inequality we obtain
^ ^ + \\(UAl*<(Fm2\\u\h. Thanks to Gronwall's inequality we have for every t 6 [0,T], \\U(;t)\\h
+ I ll(6»).(*,«)lk ds < KtiT.rfi, Jo
(25)
where K1(T,rfi = \\uin\\lJFW)2T.
(26)
By taking the 1? inner product of equation (22) with — (£m)xx,
lmitxllh
we reach
+ [ KU**|2 dx = jf1 FJ(V(x,t)){UUU)..dx
<^«)ll(W.|M|tfm)..||L», where, again, F and rj are as in (4) and (15), respectively. Again, by using Young's inequality we obtain rf|l(
^'"i2
+ | | « m ) . . | & < (F(5)) 2 ||« r a ).||i,.
Thanks to Gronwall's inequality we get for 0 < r < t < T, ll«m).(-,*)lli»+j[t|l«m)..(-,»)lll.<e^)2r||ttm).(-,7-)||ia.
By integrating the above inequality, with respect to r over the interval [0, t], and applying estimate (25), we get \\(UU;t)\\h
+ \jQ
mmU(;s)\\2L,sds
(27)
MlMe(m)2Tt
(28)
where
K2{T^ =
and Ki(T,ff) is as in (26). Once more, by taking the L? inner product of equation (22) with (£m)xxxx, we reach
Hl(fc»)..|||,
/
(£m)*s(£m)xxsx dx +
2 dt Jo By integrating by parts, arts, we obtain
I
F'(j}(x,
t)) (£„,)» (£m)xxxx dx = 0.
Jo
l<*ll(6n)M|ll» , „ „ , „2 + ||(^m)ixa:|li2 2 dt = / F"(T](x,t))rix(x,t)(Zm)x(tm)xxxdx+ Jo
/ Jo
Ff{ri{x,t))(tm)xx(Zm)xxxdx
<^[ll«J.||y||«m).|U-+||ttm)..||L»]||«m).~lli;». By using Young's inequality, we get
Notice that $(\xi_m)_x(0,t) = 0$. Then
\[ \|(\xi_m)_x(\cdot,t)\|_{L^\infty} = \max_{0 \le x \le 1} \left| \int_0^x (\xi_m)_{xx}(y,t)\,dy \right| \le \int_0^1 |(\xi_m)_{xx}(y,t)|\,dy \le \|(\xi_m)_{xx}\|_{L^2}. \]
Therefore, d|ltf
" ^ " l l ' » + | | « m ) . . . l k < 2(F(fj)f
[IMh
+ 1] ||«n.).-lli..
Thanks to Gronwall's inequality, for 0 < r < t < T,
\\(U)«{;t)\\b + f ll«m).,«(-,«)|ll»d» ,nu ^ < Mi2 2 (^(^)) a / r [Itl-C-.^lli- + 1] *"
+ ^f\\(UUA;s)\\2L2
s2ds < K3{T,rj,rjT),
(29)
where K3(T,rj,rjr) = ^ f ^
e
2 T O )
2
[nr + T]^
(30)
and K2{T,rf) is as in (28). Finally, by using the appropriate "Compactness Theorems" one can extract a subsequence {£m< (x,t)} which converges to the limit function £(x, t) that solves the system (12)-(14), satisfies (17), and belongs to
e6Xj)C((o,r];Hs(n))nC7((o,T];fla(n))nL2([o)r];H1(n))nc([o,ri;La(n)). Furthermore, from (25), (27) and (29) we conclude that the limit function £ satisfies (18)-(20). To prove (21) (i.e. the maximum principle) one can follow, almost step by step, the proof of Proposition 2. _ Next, we show the uniqueness of £. Suppose that £ and f are two regular solutions of system (12)-(14). Let \ — £ — ?• Since the system (12)-(14) is a linear system, we have that x is a regular solution of the system (12)-(14) but with u-m(x) — 0. It is clear that x satisfies (18). However, in the case of u m (a;) = 0, by applying (26), we get Ki(T,rj) = 0. Therefore, H X ( - . O I I L » = O,
and x = £ — ? = 0. Therefore, the regular solution is unique. Now, we are ready to prove Theorem 3.
•
Proof. Let $T > 0$ be given. First, let us consider, for $n \ge 1$, the following systems
4 n ) ( 0 , t ) = «i n) (l,*) = 0,
(32)
n
u< >(x,0) = uin(x),
(33)
w(0) = 0.
(34)
Denote by {u^(x,t)}^=0, the sequence of functions, that solve the above systems (31)-(34) for n = 1,2, • • • , respectively. By applying Proposition 5 in the case of rj = 0, we get u(
1
'eLfoc((o,r];F3(n))nc((o,r];F2(f2))nL2([o,r];F1(n))nc([o,r];L2(fi)),
and the estimates (18)-(21) hold for £ = w*1'. Therefore, by induction, we apply Proposition 5 in the case of rj = u ' " - 1 ' and we are able to show that ) 2 3 2 2 1 2 u(" eL oc((o,r];H (n))n<7((o,T];i? (f2))nL ([o,r];if (n))nc([o,T];L (n)),
with it'"' satisfies the maximum principle, i.e.,
lk n) (-,t)IU~
(35)
and \\u(n)(;t)\)h
+ fh^H-^nhds
hin)(;t)\\h
+ \f*
\\u£(;s)\\hsds
\HD(;t)\\h
+^ f
\\v£l{;s)\\l.s2ds
<
tf,(r,Min) < K2(T,M-m)
(36) (37)
< KsiT.M-^K^M^)),
(38)
where M-m is as in (35), and K^T^-^K^T,M-m) and K3(T,Min,Kx{T,Min)) are as in (26), (28) and (30), respectively. Next, we show that {u (n) (x,£)}£L 0 is a Cauchy sequence in L2((0,T; H1^)) n C([0,T]; H). Denote by 5 («) =
„(") _ u ( " - i ) )
for n = 2,3, • • • .
It is easy to get
* £ - T • * ' < » < - , ^ + i * ^ - ^-'"-"'l ^ - » < » 5(0,t) = 5< B >(M)=0 5
(n)
(40)
(a;,0)=0.
(41) n
By taking the L? inner product of equation (39) with S' ', we get l#I ( r a ) || 2 L 2
l| ( g ( n ) ) II2
2
ll
at
II L 2
=
[1F<{u(n-l)){~{n))x~(n)dx_
_ y0
- C |>( u ( n_1) ) - •F>(n_2))] («("-1))Iu(") dx < F(Min)\\uM\\L>\0n))x\\L*
+
F{Mta)\\^-^\L,\\(u^-1\\\L.\\^\\L,,
where F and Mjn are as in (4) and (35), respectively. Recall that
Then, by applying the following version Sobolev inequality
ll/.IU- < ll/.H^II/.-lli?,
v/ev,
(42)
and by using the Cauchy-Schwarz inequality, we obtain
^w^
+
or
\\&n))>L * (nMin))2\&n)\\h + ML2
ii 1
+2F(Min)||2("- )||L2||(«(n-1))«lli/22||(«("-1))„llI/i2||a(n)IU= < (F(M in )) 2 [l + \\(u("-%x\\L,] ||S<»)||i, + ||S(n-1)|li.||(ti<"-1>).||i;.. Thanks to Gronwall's inequality, for every 0 < t0 < t < T,
\\u{n)(;t)\\h + f |l(S("))-(-,»)irra <*S < eh
T T
||«W(., to) 111,
+
(43)
+ f j . * T *"||u(-1)(.,a)||i,||(u(-1))«(-,a)||L»d. J to
where *(T) = (F(Min))2 [l + IK^- 1 )).^-^)!!^] . On the other hand,
(44)
<(iCl(r)Min)e(^in))2^§(ln!)i. Therefore, /•« (F(Mi n )) a /"||(u(- 1 ))..(-,r)|| £ ,d7-
/ e
A
IK^-^M)!!^
Jt0 ft
^
[F{M-m)f k ( T , M f a ) e ( ^ M m ) ) 2 ^ Lt)* K m» ^ j y sj || ( u (n- 1 ) ) x ( . ) S ) | U 2 d a
iKu(n_i))i( s)iii2ds :
•/to
-(r
'' )
1
5 X
' - 2(F(Min))2(^l(T,Min)e(^in))2^i Jto
(ln£)M (45)
In the case of to > 0, thanks to (36), we have ft (F(Min)f / e
f •/»
\\(u("-V)xx(;T)\\L*dT
| | ( w < n - 1]U;s)\\L>ds
Jto
(
2(F(M i n )) 2 (^(T,
Min)e^Mm)fT
?ny
(t -10)
v ^(^(r.JJf^X-^T.Mjn))*
(t-t0)i,
(46)
where 2(F(M.m))* (Kl(T, M.m)eWMin)fT\
Ar4(r,Min) = e
* A T\*
v
/
V W
(47)
In the case of to = 0, thanks to (36) and (45), we reach / e J t0 KiK^M^y
(
Il(u ( - 1 ) )x(-, s)\\L2ds
J.
J
' ( m ]A *
2(F(M i n )) 2 {KX (T, M-m)eWMin))2^
\
< (^(r.Mjn))* x ~
2 ( F ( M i n ) ) 2 ( i f 1 ( T , M i n ) e ( F ( M i n ) ) 2 r y (ln(j + l))i
where AT5(T,Min) = ~
„
i
2(F(M i n )) 2 ( ^ ( T , M i n ) e ( ^ ( M i n ) ) 2 r ) ' (ln(j + l)) 1/ 2 < oo.
j'O +1) Therefore, when to = 0, we get from the above and (36) \\n{n)(;t)\\h+
ft\\(u^)x(;s)f Jo "
"L
ds
<e(^(^in))2r(ifl(riMin)A:5(riMin))i
t\
max||2("-1)(,S)|||3,
(48)
and when t0 > 0, we obtain UL
Jto "
T+(Kl{T,M.m)e(^in))2Ty(,nI. L
<||«W(.,to)|l! 2 e M
+e^
2T
x
'
i
k
in)) (K1(T,Min)Ki(T,Min))
(t-t0) *
-
-'
J +
1
max llu*"" ^,*)!!!,. (49) 0<s
Let T* =
= 4 e 2(F(M i n )) TKi{TtMin)[K<(T,Min) Then, from the above we have max ||u(-I0lli»+ /
. +
(50)
K5(T,Min)]
Ik^"^^-,*)!2
*
max l l u ^ - ^ M I I i a . 2 0
V
/UL
By induction, we get o m|*,
||5<»>(,i)||i, + [
\\&n)U,t)ll
dt <
^M?n.
Thus, {uM(x,t)}%Lo is a Cauchy sequence in L2{0,T*;H1(U))nC([0,T*];L2(Q)). By taking to = T* in equation (49) and by noticing that {u^(x,T*)}^L0 is a Cauchy sequence in L2(Cl), one can easily see that {u^n\x,t)}'^L0 is also a Cauchy sequence in I?{T*, 2T*; Hl{Sl))r\C{[T*, 2T*]; L2(f))). We repeat the above procedure again and again until we cover the interval [0,T] with finitely many steps, which implies that {u'"'(x,t)}^L 0 is a Cauchy sequence in L2(0,T;Hl(U))r\ 2 C([0,T];L (fi)).I.e., u(x,t) = lim u (n) (a;,i), n-¥oo
in ^ ( O . r i H 1 ^ ) ) n C([0,T]; H). Moreover, u{x,t) satisfies the weak formulation (17) and max \u(x,t)\<M-m (51) 0<s< 1 t>0 /
H*,t)|lir.
(52)
Jo
Therefore, u(x,t) satisfies the condition (15)—(16). By applying the Proposition 5, we have
\[ u \in L^2_{loc}((0,T]; H^3(\Omega)) \cap C((0,T]; H^2(\Omega)) \cap L^2([0,T]; H^1(\Omega)) \cap C([0,T]; L^2(\Omega)). \]
Next we show the uniqueness. Suppose that $u_1$ and $u_2$ are two regular solutions. Let $u = u_2 - u_1$, which satisfies
\[ u_t - u_{xx} + F'(u_2)\,u_x + (F'(u_2) - F'(u_1))\,(u_1)_x = 0, \tag{53} \]
\[ u_x(0,t) = u_x(1,t) = 0, \tag{54} \]
\[ u(x,0) = 0. \tag{55} \]
By taking the L2 inner product of equation (53) with u, we get 1 # u | ! i + \\UxfL2 = _ f1 F\u2)uxudx l at Jo < F(Mia)\\u\M\um\\v Notice that
l«0M)l< /
(Ul)xudx
u
(y^)dy
+ ll««llL»
\Jo
therefore 112
[F'(u2) - F'(Ul)]
+ JWm)ll(ui).IMMU-IMIi»-
I f1
14^ii
- f Jo
+ ||ttx ||2 2
< F(Min)[(i + ||(u1).||Loilti|U»llu.||L» + IK«i).IMMlU
By using the Cauchy-Schwarz inequality, we obtain
« ;
< ( F ( y ) 2 [ l + ||(u1),||L,]2||u||ia + 2F(Min)||(Ul)I||L2||«H2^.
Thanks to Gronwall's inequality we have
\[ \|u\|_{L^2}^2 = 0. \]
Therefore, the regular solution is unique. □
Remark. In the case of the viscous Burgers equation, the authors of [16] take advantage of the special form of $F(u) = \frac{1}{2}u^2$ to show a similar global existence result for initial data $u_{in} \in L^2(\Omega)$.
3. LONG-TIME DYNAMICS
In the previous section we have established the global existence and uniqueness of the regular solution of (1)-(3). In this section we study the long-time behavior of this solution. But first let us identify the set of steady state solutions of (1)-(2).
Proposition 6. $u(x)$ is a steady state solution of (1)-(2) if and only if $u(x) = \mathrm{const}$.
Proof. First, notice that every constant function $u(x) = \mathrm{const}$ is a steady state solution. Next we show that all steady state solutions are constant functions. Let $u(x)$ be any steady state of equations (1)-(2). Then $u(x)$ satisfies
\[ -u_{xx} + (F(u))_x = 0, \tag{56} \]
\[ u_x(0) = u_x(1) = 0. \tag{57} \]
By integrating (56), we get
\[ u_x(x) = u_x(0)\,\exp\left( \int_0^x F'(u(s))\,ds \right). \]
Since $u_x(0) = 0$, then $u_x \equiv 0$. Therefore, $u(x)$ is a constant function.
□
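The statements proved so far, the maximum principle and the fact that constants are the only steady states, can be observed numerically. The following is a rough illustration only, not part of the analysis: it assumes the Burgers flux $F(u) = u^2/2$, the sample initial datum $0.5 + \cos(\pi x)$, and an explicit central finite-difference scheme with reflected ghost points for the Neumann condition. The function name `simulate` and all parameters are choices made for this sketch.

```python
import math


def simulate(n=50, dt=1e-4, t_end=2.0):
    """Explicit scheme for u_t - u_xx + F'(u) u_x = 0 on (0, 1), F(u) = u^2 / 2.

    Returns the final profile, the running maxima, and the running spreads
    (max - min) recorded at every time step.
    """
    dx = 1.0 / n
    u = [0.5 + math.cos(math.pi * i * dx) for i in range(n + 1)]  # sample data
    maxima = [max(u)]
    spreads = [max(u) - min(u)]
    for _ in range(int(round(t_end / dt))):
        # Ghost values u[-1] = u[1] and u[n+1] = u[n-1] enforce u_x = 0 at x = 0, 1.
        ext = [u[1]] + u + [u[n - 1]]
        u = [
            ext[i + 1]
            + dt / dx**2 * (ext[i + 2] - 2.0 * ext[i + 1] + ext[i])
            - dt / (2.0 * dx) * ext[i + 1] * (ext[i + 2] - ext[i])
            for i in range(n + 1)
        ]
        maxima.append(max(u))
        spreads.append(max(u) - min(u))
    return u, maxima, spreads


if __name__ == "__main__":
    profile, maxima, spreads = simulate()
    print("initial spread:", spreads[0])
    print("final spread:  ", spreads[-1])
    print("final values near:", profile[0])
```

With these step sizes each new grid value is a convex combination of its neighbours, so the discrete maximum cannot increase, mirroring (6); the spread between maximum and minimum shrinks toward zero, consistent with convergence to a constant. The value of that constant is simply read off from the run; the sketch says nothing about how it depends on the initial data.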
Since we are interested in the asymptotic behavior of solutions of the system (1)-(3), we will study the system (1)-(3) when $t$ is large enough. From now on we assume that $t \ge t_0 > 0$, for $t_0$ large enough, as large as needed. Recall that for almost every $t \ge t_0 > 0$, $u(\cdot,t) \in H^3(\Omega)$.
319 Denote by v = ux. Then v satisfies
m~d^
+ F {u)v2 + F {u) =0
"
' di
vxzn,t>t0,
v(0,t) = v(l,t)=0 v(x,t0) = vin{x)
Vx e Q,
(58) (59) (60)
where $v_{in}(x) = u_x(x,t_0)$.

Proposition 7. The solution $v(x,t)$ of system (58)-(60) converges to a nonpositive function as $t \to \infty$.

Proof. Let us denote by $v^+(x,t) = \max\{0, v(x,t)\}$. Since for each $t > t_0 > 0$, $v \in H^1(\Omega)$, we have $v^+ \in H^1(\Omega)$ (cf. [1], [7]). By taking the $L^2$ inner product of equation (58) with $v^+$, we get
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 (v^+)^2\,dx + \int_0^1 ((v^+)_x)^2\,dx + \int_0^1 F''(u)(v^+)^3\,dx + \int_0^1 F'(u)v_x v^+\,dx = 0. $$
Observe that
$$ \int_0^1 F'(u)v^+ v_x\,dx = -\frac{1}{2}\int_0^1 F''(u)(v^+)^3\,dx. $$
Thus,
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 (v^+)^2\,dx + \int_0^1 ((v^+)_x)^2\,dx + \frac{1}{2}\int_0^1 F''(u)(v^+)^3\,dx = 0. $$
Recall that $F''(u) > 0$. Also notice that $v^+(0,t) = v^+(1,t) = 0$; then by applying the Poincaré inequality (see, e.g., [1], [7]) we obtain
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 (v^+)^2\,dx + \pi^2\int_0^1 (v^+)^2\,dx \le 0. $$
Thanks to Gronwall's inequality,
$$ \|v^+(t)\|^2_{L^2(\Omega)} \le \|v^+(t_0)\|^2_{L^2(\Omega)}\, e^{-2\pi^2(t-t_0)}. \tag{61} $$
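For readers who want the intermediate step, the passage from the differential inequality to the decay estimate (61) can be sketched as below. The shorthand $y(t)$ is ours, and the sketch assumes the Poincaré constant $\pi^2$ on the unit interval with zero boundary values, as used in the text.

```latex
% Sketch: Gronwall step behind (61); y(t) is an illustrative shorthand.
\begin{align*}
  y(t) &:= \int_0^1 (v^+(x,t))^2\,dx, \qquad
  \tfrac12\,y'(t) + \pi^2 y(t) \le 0 \quad (t > t_0), \\
  \frac{d}{dt}\Bigl(e^{2\pi^2 (t-t_0)}\,y(t)\Bigr)
    &= e^{2\pi^2 (t-t_0)}\bigl(y'(t) + 2\pi^2 y(t)\bigr) \le 0, \\
  y(t) &\le y(t_0)\,e^{-2\pi^2 (t-t_0)}, \qquad\text{i.e.}\qquad
  \|v^+(t)\|_{L^2} \le \|v^+(t_0)\|_{L^2}\,e^{-\pi^2 (t-t_0)} .
\end{align*}
```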
Then $v^+$ converges to zero as $t \to \infty$. Therefore, $v(x,t)$ converges to a nonpositive function as $t \to \infty$. $\Box$

Let $u(x,t)$ be the solution of the initial boundary value problem (1)-(3). We denote by
$$ M(t) = \max_{0 \le x \le 1} u(x,t) \quad\text{and}\quad m(t) = \min_{0 \le x \le 1} u(x,t). $$
Notice that $u(x,t)$ satisfies the maximum principle (Proposition 2). Then $M(t)$ is a nonincreasing function and $m(t)$ is a nondecreasing function. We denote by
$$ \bar M = \lim_{t\to\infty} M(t) \quad\text{and}\quad \bar m = \lim_{t\to\infty} m(t). \tag{62} $$
Proposition 8. Let $u(x,t)$ be the solution of the initial boundary value problem (1)-(3). Then
$$ \lim_{t\to\infty}\bigl(\bar M - u(0,t)\bigr) = \lim_{t\to\infty}\bigl(u(1,t) - \bar m\bigr) = 0. \tag{63} $$
Proof. Recall that $v(x,t) = u_x(x,t)$. Then
$$ u(x,t) - u(0,t) = \int_0^x v(y,t)\,dy \le \int_0^x v^+(y,t)\,dy. $$
Therefore,
$$ M(t) - u(0,t) = \max_{0\le x\le 1}\,[u(x,t) - u(0,t)] \le \int_0^1 v^+(y,t)\,dy \le \left(\int_0^1 (v^+(y,t))^2\,dy\right)^{1/2}. $$
Since the right hand side goes to zero as $t \to \infty$, we conclude that
$$ \lim_{t\to\infty}\bigl(\bar M - u(0,t)\bigr) = 0. $$
Similarly, we can show that
$$ \lim_{t\to\infty}\bigl(u(1,t) - \bar m\bigr) = 0. $$
$\Box$

Theorem 9. For any initial data $u_{in}(x) \in C^0(\Omega)$, the solution $u(x,t)$ of (1)-(3) converges to a steady state as $t \to \infty$. In other words, $u(x,t)$ converges to a constant function asymptotically in time.

Proof. Let $u(x,t)$ be a solution of (1)-(3). It is obvious that the theorem is true when $\bar M = \bar m$, where $\bar M$ and $\bar m$ are as in (62). Therefore, we only prove the theorem when $\bar M > \bar m$. Again we denote by $v = u_x$ and by
$$ G(u) = F(u) - A_1 u - A_2, \tag{64} $$
$$ w = v - G(u), \tag{65} $$
where
$$ A_1 = \frac{F(\bar M) - F(\bar m)}{\bar M - \bar m}, \tag{66} $$
$$ A_2 = \frac{\bar m F(\bar M) - \bar M F(\bar m)}{\bar m - \bar M}, \tag{67} $$
where, again, $\bar M$ and $\bar m$ are as in (62). It is worth mentioning that
$$ G(\bar M) = G(\bar m) = 0. \tag{68} $$
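The identity (68) can be checked directly. The short verification below assumes $G(u) = F(u) - A_1 u - A_2$ (the form that also appears in (76)) together with the constants $A_1$, $A_2$ of (66)-(67); it is meant as a sanity check rather than as part of the original argument.

```latex
% Sketch: checking G(\bar M) = G(\bar m) = 0 from (66)-(67).
\begin{align*}
  G(\bar M) - G(\bar m)
    &= F(\bar M) - F(\bar m) - A_1(\bar M - \bar m) = 0
       \quad\text{by (66)}, \\
  G(\bar M)
    &= F(\bar M) - A_1\bar M - A_2 \\
    &= \frac{F(\bar M)(\bar M - \bar m) - \bar M\,\bigl(F(\bar M) - F(\bar m)\bigr)
             + \bar m F(\bar M) - \bar M F(\bar m)}{\bar M - \bar m} = 0 .
\end{align*}
```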
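By direct calculations, we get
$$ \frac{\partial w}{\partial t} - \frac{\partial^2 w}{\partial x^2} + F'(u)\frac{\partial w}{\partial x} = 0. \tag{69} $$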
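The "direct calculations" leading to the equation for $w$ can be sketched as follows. We assume here that $w = u_x - G(u)$ with $G(u) = F(u) - A_1 u - A_2$ (so that $G'' = F''$), that equation (1) reads $u_t = u_{xx} - F'(u)u_x$, and that $u$ is regular enough for the formal differentiation; these are our reading of the surrounding text, not a quotation.

```latex
% Sketch: formal derivation of w_t - w_xx + F'(u) w_x = 0 for w = u_x - G(u).
\begin{align*}
  w_t    &= u_{xt} - G'(u)\,u_t
          = u_{xxx} - F''(u)\,u_x^2 - F'(u)\,u_{xx}
            - G'(u)\bigl(u_{xx} - F'(u)\,u_x\bigr), \\
  w_x    &= u_{xx} - G'(u)\,u_x, \\
  w_{xx} &= u_{xxx} - G''(u)\,u_x^2 - G'(u)\,u_{xx}, \\
  w_t - w_{xx}
         &= \bigl(G''(u) - F''(u)\bigr)u_x^2
            - F'(u)\bigl(u_{xx} - G'(u)\,u_x\bigr)
          = -F'(u)\,w_x .
\end{align*}
```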
By taking the $L^2$ inner product of equation (58) with $v$, and of equation (1) with $G(u)G'(u)$, and by adding the resulting equations, we get
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 [v^2 + G^2(u)]\,dx - \int_0^1 v_{xx}v\,dx + \int_0^1 F''(u)v^3\,dx + \int_0^1 F'(u)v_x v\,dx - \int_0^1 u_{xx}G(u)G'(u)\,dx + \int_0^1 F'(u)u_x G(u)G'(u)\,dx = 0. \tag{70} $$
By integrating by parts, we reach

(1) $\displaystyle -\int_0^1 v_{xx}v\,dx = \int_0^1 v_x^2\,dx,$

(2) $\displaystyle \int_0^1 F''(u)v^3\,dx = \int_0^1 G''(u)\,v\,v^2\,dx = -2\int_0^1 G'(u)\,v\,v_x\,dx,$

(3) $\displaystyle \int_0^1 F'(u)v_x v\,dx = \frac{1}{2}\int_0^1 F'(u)(v^2)_x\,dx = -\frac{1}{2}\int_0^1 F''(u)v^3\,dx,$

(4) $\displaystyle -\int_0^1 u_{xx}G(u)G'(u)\,dx = \int_0^1 u_x^2(G'(u))^2\,dx + \int_0^1 u_x^2 G(u)G''(u)\,dx = \int_0^1 u_x^2(G'(u))^2\,dx + \int_0^1 v^2 G(u)F''(u)\,dx,$

(5) $\displaystyle \int_0^1 F'(u)u_x G(u)G'(u)\,dx = \frac{1}{2}\int_0^1 \bigl[(F'(u)(G(u))^2)_x - F''(u)(G(u))^2 v\bigr]\,dx = \frac{F'(u(1,t))[G(u(1,t))]^2 - F'(u(0,t))[G(u(0,t))]^2}{2} - \frac{1}{2}\int_0^1 F''(u)(G(u))^2 v\,dx.$

Therefore, equation (70) can be rewritten as
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 [v^2 + G^2(u)]\,dx + \int_0^1 w_x^2\,dx - \frac{1}{2}\int_0^1 F''(u)\,v\,w^2\,dx = \frac{F'(u(0,t))[G(u(0,t))]^2 - F'(u(1,t))[G(u(1,t))]^2}{2}. \tag{71} $$
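The regrouping that turns (70) into (71) is not written out in the text. A possible way to see it, assuming $w = v - G(u)$ and $G'' = F''$, is to expand the two $w$-terms of (71) and compare with the integration-by-parts identities; the sketch below uses the identity $\int_0^1 F''(u)v^3\,dx = -2\int_0^1 G'(u)\,v\,v_x\,dx$ listed as item (2).

```latex
% Sketch: expanding the w-terms of (71) in terms of v and G(u).
\begin{align*}
  w_x &= v_x - G'(u)\,v, \\
  \int_0^1 w_x^2\,dx
      &= \int_0^1 v_x^2\,dx - 2\int_0^1 G'(u)\,v\,v_x\,dx + \int_0^1 (G'(u))^2 v^2\,dx \\
      &= \int_0^1 v_x^2\,dx + \int_0^1 F''(u)\,v^3\,dx + \int_0^1 (G'(u))^2 v^2\,dx, \\
  -\tfrac12\int_0^1 F''(u)\,v\,w^2\,dx
      &= -\tfrac12\int_0^1 F''(u)\,v^3\,dx + \int_0^1 F''(u)\,G(u)\,v^2\,dx
         - \tfrac12\int_0^1 F''(u)\,(G(u))^2\,v\,dx .
\end{align*}
```

Adding the last two results reproduces exactly the interior integrals obtained from items (1), (3), (4) and (5), which leaves only the boundary term of item (5) on the right hand side.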
Denote by
$$ \alpha = \inf_{t \ge t_0} \int_0^1 w_x^2\,dx. $$
We will show that $\alpha = 0$. Suppose that $\alpha > 0$. First, let us consider the following two terms:
$$ F'(u(0,t))[G(u(0,t))]^2 - F'(u(1,t))[G(u(1,t))]^2 \quad\text{and}\quad \frac{1}{2}\int_0^1 F''(u)\,v^+ w^2\,dx. $$
By applying Proposition 8 and (68), we obtain
$$ \lim_{t\to\infty}\Bigl(F'(u(0,t))[G(u(0,t))]^2 - F'(u(1,t))[G(u(1,t))]^2\Bigr) = F'(\bar M)[G(\bar M)]^2 - F'(\bar m)[G(\bar m)]^2 = 0. \tag{72} $$
Notice that $|F''(u)| \le \mathcal{F}(M_{in})$. Since $w(x,t)$ satisfies the maximum principle (see equation (69) and Proposition 2), the regularity results of the previous section give
$$ \max_{0\le x\le 1}|w(x,t)| \le \max_{0\le x\le 1}|w(x,t_0)| \le \|v(t_0)\|_{L^\infty} + \|G(u(x,t_0))\|_{L^\infty} \le K_3(t_0, M_{in}, K_1(t_0, M_{in})) + \mathcal{F}(M_{in}) + |A_1| M_{in} + |A_2|, $$
where $\mathcal{F}$, $M_{in}$, $K_3$, $K_1$ are as in (4), (35), (30), (26), respectively. By applying Proposition 7, we get
$$ \lim_{t\to\infty}\frac{1}{2}\int_0^1 F''(u)\,v^+ w^2\,dx = 0. \tag{73} $$
As a result of (72) and (73), there is a $T$, large enough, such that for any $t > T$
$$ F'(u(0,t))[G(u(0,t))]^2 - F'(u(1,t))[G(u(1,t))]^2 < \alpha/4 \quad\text{and}\quad \frac{1}{2}\int_0^1 F''(u)\,v^+ w^2\,dx < \alpha/4. $$
Thus, from the above and equation (71) it follows that when $t > T$
$$ \frac{1}{2}\frac{d}{dt}\int_0^1 [v^2 + G^2(u)]\,dx < -\alpha/2. \tag{74} $$
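The arithmetic behind (74), and the reason it cannot hold for all large times, can be sketched as follows. The energy $E(t)$ is our shorthand; the sketch assumes $F'' \ge 0$, so that $F''(u)\,v \le F''(u)\,v^+$, and uses the definition of $\alpha$ as a lower bound for $\int_0^1 w_x^2\,dx$.

```latex
% Sketch: from (71) to (74), and why (74) contradicts E >= 0.
\begin{align*}
  E(t) &:= \tfrac12\int_0^1 \bigl[v^2 + G^2(u)\bigr]\,dx \;\ge\; 0, \\
  E'(t) &= -\int_0^1 w_x^2\,dx + \tfrac12\int_0^1 F''(u)\,v\,w^2\,dx
           + \tfrac12\Bigl(F'(u(0,t))[G(u(0,t))]^2 - F'(u(1,t))[G(u(1,t))]^2\Bigr) \\
        &< -\alpha + \tfrac{\alpha}{4} + \tfrac{\alpha}{8} < -\tfrac{\alpha}{2}
           \qquad (t > T), \\
  E(t) &< E(T) - \tfrac{\alpha}{2}\,(t - T) \;\longrightarrow\; -\infty
           \quad\text{as } t \to \infty .
\end{align*}
```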
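Since $\int_0^1 [v^2 + G^2(u)]\,dx \ge 0$, it would be impossible for inequality (74) to hold for all $t > T$. Therefore, $\alpha = 0$. As a result, there is a sequence $\{t_j\}_{j=1}^{\infty}$, $t_j \to \infty$, such that
$$ \lim_{j\to\infty}\int_0^1 w_x^2(x,t_j)\,dx = 0. $$
Applying the following version of the Poincaré inequality,
$$ \left\|\phi(\cdot) - \int_0^1 \phi(x)\,dx\right\|_{L^\infty} \le \left(\int_0^1 \phi_x^2\,dx\right)^{1/2} \qquad \forall \phi \in H^1(\Omega), $$
we get
$$ \max_{0\le x\le 1} w(x,t_j) - \min_{0\le x\le 1} w(x,t_j) \le 2\left(\int_0^1 w_x^2(x,t_j)\,dx\right)^{1/2}. $$
Thus,
$$ \lim_{t_j\to\infty}\Bigl(\max_{0\le x\le 1} w(x,t_j) - \min_{0\le x\le 1} w(x,t_j)\Bigr) = 0. \tag{75} $$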
Since $w(x,t)$ satisfies the maximum principle (see equation (69) and Proposition 2), then
$$ \max_{0\le x\le 1} w(x,t) - \min_{0\le x\le 1} w(x,t) $$
is a nonincreasing function of $t$. As a result of the above, we have
$$ \lim_{t\to\infty}\Bigl(\max_{0\le x\le 1} w(x,t) - \min_{0\le x\le 1} w(x,t)\Bigr) = 0. $$
This shows that $w(x,t)$ converges to a constant function uniformly in $x$ as $t \to \infty$. On the other hand, since
$$ \lim_{t\to\infty} w(0,t) = -\lim_{t\to\infty} G(u(0,t)) = -G(\bar M) = 0, $$
then
$$ \lim_{t\to\infty} w(x,t) = 0 \quad\text{uniformly in } x. $$
Since $F \in C^2(\mathbb{R})$ and
$$ G(\bar M) = F(\bar M) - A_1\bar M - A_2 = 0, \tag{76} $$
there is a well-defined continuous function $H$ such that $F(u) - A_1 u - A_2 = H(u)(u - \bar M)$. Denote by $\tilde u = u - \bar M$. Then, by the definition of $w$, we have $u_x - F(u) + A_1 u + A_2 = w$, and $\tilde u$ satisfies $\tilde u_x - H(u)\tilde u = w$. Solving this linear equation, with $\tilde u$ as the unknown, gives
$$ \tilde u = u - \bar M = \left(u(0,t) - \bar M + \int_0^x w(y,t)\, e^{-\int_0^y H(u(z,t))\,dz}\,dy\right) e^{\int_0^x H(u(z,t))\,dz}. $$
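The solution formula for the linear equation $\tilde u_x - H(u)\tilde u = w$ follows from the usual integrating factor argument. In the sketch below, $I(x)$ is our own notation, $t$ is treated as a fixed parameter, and $\tilde u = u - \bar M$ as in the text.

```latex
% Sketch: variation of constants for \tilde u_x - H(u)\tilde u = w at fixed t.
\begin{align*}
  I(x) &:= \exp\Bigl(-\int_0^x H(u(z,t))\,dz\Bigr), \\
  \bigl(I\,\tilde u\bigr)_x &= I\,\bigl(\tilde u_x - H(u)\,\tilde u\bigr) = I\,w, \\
  I(x)\,\tilde u(x,t) &= \tilde u(0,t) + \int_0^x I(y)\,w(y,t)\,dy, \\
  \tilde u(x,t) &= \Bigl(u(0,t) - \bar M + \int_0^x w(y,t)\,I(y)\,dy\Bigr)\,I(x)^{-1}.
\end{align*}
```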
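Observe that by the mean value theorem
$$ H(u(x,t)) = F'(\hat u(x,t)) - A_1, \tag{77} $$
where $\hat u(x,t)$ is between $u(x,t)$ and $\bar M$. From the maximum principle (Proposition 2, see also the statement of Theorem 3) we have
$$ \|u(\cdot,t)\|_{L^\infty} \le M_{in} \quad\text{and}\quad |\bar M| \le M_{in}, $$
and in particular we conclude that
$$ \|\hat u(\cdot,t)\|_{L^\infty} \le M_{in}. $$
Thus, from the above, (4) and (77) we conclude that $|H(u(x,t))| \le \mathcal{F}(M_{in}) + |A_1|$. Therefore,
$$ e^{-\int_0^y H(u(z,t))\,dz} \le e^{\mathcal{F}(M_{in}) + |A_1|} \quad\text{and}\quad e^{\int_0^x H(u(z,t))\,dz} \le e^{\mathcal{F}(M_{in}) + |A_1|}. $$
Applying (63) and (76), together with the uniform convergence of $w$ to zero, we conclude that the right hand side of the above formula for $u - \bar M$ goes to zero as $t \to \infty$. Thus,
$$ \lim_{t\to\infty}(u - \bar M) = 0. $$
Therefore, $u$ converges to a constant as $t \to \infty$. $\Box$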
ACKNOWLEDGEMENTS
We are thankful to Professor D.S. Gilliam for proposing this problem for us. E.S.T. acknowledges the support of the Varon Fellowship at the Weizmann Institute, Israel, and the Universite de Reims, France, where parts of this work were completed. This work was supported in part by the National Science Foundation grants number DMS-9706964 and DMS-9704632, and supported by the Department of Energy, under contract W-7405-ENG-36.
REFERENCES
[1] R.A. Adams, Sobolev Spaces, Academic Press, New York, 1975.
[2] A. Balogh and M. Krstic, Global boundary stabilization and regularization of Burgers' equation, http://www-mae.ucsd.edu/research/krstic/workshop.html, accessed September 17, 1999.
[3] J.M. Burgers, Nonlinear Diffusion Equation, Dordrecht, Netherlands: Reidel, 1974.
[4] J.A. Burns and S. Kang, A control problem for Burgers' equation with bounded input/output, Nonlinear Dynamics, 2 (1991), 235-262.
[5] C.I. Byrnes and D.S. Gilliam, Boundary feedback stabilization of a viscous Burgers' equation, Computation and Control III: Proc. of the Third Bozeman Conference, Bozeman, Montana, August 5-11, 1992, K.I. Bowers, J. Lund (eds.), Birkhäuser, Boston, 1993.
[6] H. Choi, R. Temam, P. Moin and J. Kim, Feedback control for unsteady flow and its application to the stochastic Burgers equation, J. Fluid Mech. 253 (1993), 509-543.
[7] D. Gilbarg and N.S. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer-Verlag, 1983.
[8] A.T. Hill and E. Süli, Dynamics of a nonlinear convection-diffusion equation in multidimensional bounded domains, Proc. Roy. Soc. Edinburgh, 125A (1995), 439-448.
[9] E. Hopf, The partial differential equation $u_t + uu_x = \mu u_{xx}$, Comm. Pure Appl. Math. 3 (1950), 201-230.
[10] H.R. Jauslin, H.O. Kreiss and J. Moser, On the forced Burgers equation with periodic boundary conditions, Differential equations: La Pietra 1996 (Florence), 133-153, Proc. Sympos. Pure Math., 65, Amer. Math. Soc., Providence, RI, 1999.
[11] F. John, Partial Differential Equations, Springer-Verlag, New York, 1982.
[12] R.H. Kraichnan, Note on forced Burgers turbulence, Physics of Fluids, 11 (1999), 3738-3742.
[13] G. Kreiss and H. Kreiss, Convergence to steady state of solutions of Burgers' equation, Appl. Numer. Math., 2 (1986), 161-179.
[14] M. Krstic, On global stabilization of Burgers' equation by boundary control, Systems & Control Letters, 37 (1999), 123-141.
[15] M. Krstic, T. Bewley and B. Bamieh, organizers: National Science Foundation Workshop on Control of Flows, http://www-mae.ucsd.edu/research/krstic/workshop.html, accessed September 17, 1999.
[16] H.V. Ly, K.D. Mease and E.S. Titi, Distributed and boundary control of the viscous Burgers' equation, Numer. Funct. and Optimiz., 18 (1997), 143-188.
[17] F.T.M. Nieuwstadt and J.A. Steketee, Selected Papers of J.M. Burgers, Kluwer, 1994.
[18] M.H. Protter and H.F. Weinberger, Maximum Principles in Differential Equations, New York, Springer-Verlag, 1984.
[19] Z. She, E. Aurell and U. Frisch, The inviscid Burgers equation with initial data of Brownian type, Commun. Math. Phys., 148 (1992), 623-641.
[20] Ya.G. Sinai, Statistics of shocks in solutions of inviscid Burgers equation, Commun. Math. Phys., 148 (1992), 601-621.
[21] M.E. Taylor, Differential Equations: Basic Theory, New York, Springer, 1996.
[22] E. Weinan and E. Vanden Eijnden, Another note on forced Burgers turbulence, Physics of Fluids, 12 (2000), 149-154.
[23] E. Weinan, K. Khanin, A. Mazel and Y. Sinai, Probability distribution functions for the random forced Burgers equation, Phys. Rev. Lett., 78 (1997), 1904-1907.
[24] T.J. Zelenyak, Stabilization of solutions of boundary value problems for a second order parabolic equation with one space variable, Diff. Equat. 4 (1968), 17-22.
[25] T.J. Zelenyak, M.M. Lavrentiev Jr. and M.P. Vishnevskij, Qualitative Theory of Parabolic Equations, Part 1, VSP BV, 1997.

(C. Cao) CENTER FOR NONLINEAR STUDIES, MS B258, LOS ALAMOS NATIONAL LABORATORY, LOS ALAMOS, NM 87545, USA
E-mail address: ccao@cnls.lanl.gov

(E.S. Titi) DEPARTMENT OF MATHEMATICS, AND DEPARTMENT OF MECHANICAL AND AEROSPACE ENGINEERING, UNIVERSITY OF CALIFORNIA, IRVINE, CA 92697-3875, USA
E-mail address: etiti@math.uci.edu
On the category of comodules over corings

Robert Wisbauer

Abstract. It is well known that the category $\mathcal{M}^C$ of right comodules over an $A$-coring $C$, $A$ an associative ring, is a subcategory of the category of left modules ${}_{{}^*C}\mathcal{M}$ over the dual ring ${}^*C$. The main purpose of this note is to show that $\mathcal{M}^C$ is a full subcategory of ${}_{{}^*C}\mathcal{M}$ if and only if $C$ is locally projective as a left $A$-module.

1 Introduction

For any coassociative coalgebra $C$ over a commutative ring $R$, the convolution product turns the dual module $C^* = \mathrm{Hom}_R(C, R)$ into an associative $R$-algebra. The category $\mathcal{M}^C$ of right comodules is an additive subcategory of the category ${}_{C^*}\mathcal{M}$ of left $C^*$-modules. $\mathcal{M}^C$ is an abelian (in fact a Grothendieck) category if and only if $C$ is flat as an $R$-module. Moreover, $\mathcal{M}^C$ coincides with ${}_{C^*}\mathcal{M}$ if and only if $C$ is finitely generated and projective as an $R$-module (e.g. [11, Corollary 33]). In case $C$ is projective as an $R$-module, $\mathcal{M}^C$ is a full subcategory of ${}_{C^*}\mathcal{M}$ and coincides with $\sigma[{}_{C^*}C]$, the category of submodules of $C$-generated $C^*$-modules (e.g. [9, 3.15, 4.3]).

It was well understood from examples that projectivity of $C$ as an $R$-module was not necessary to achieve $\mathcal{M}^C = \sigma[{}_{C^*}C]$ and that the equality holds provided $C$ satisfies the $\alpha$-condition, i.e., the canonical maps $N \otimes_R C \to \mathrm{Hom}_R(C^*, N)$ are injective for all $R$-modules $N$ (e.g. [1, Satz 2.2.13], [2, Section 2], [10, 3.2]). It will follow from our results that this condition is in fact equivalent to $\mathcal{M}^C = \sigma[{}_{C^*}C]$ and also to $C$ being locally projective as an $R$-module.

We investigate the questions and results mentioned above in the more general case of comodules over any $A$-coring, $A$ an associative ring, and it will turn out that the above observations remain valid almost literally in this extended setting.
2 Some module theory
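Let $A$ be any associative ring with unit and denote $(-)^* = \mathrm{Hom}_A(-, A)$. We write ${}_A\mathcal{M}$ ($\mathcal{M}_A$) for the category of unital left (right) $A$-modules. $I$ (or $I_N$) will denote the identity map (of the module $N$).

2.1. Canonical maps. For any left $A$-module $K$, consider the maps
$$ \varphi_K: K \to K^{**} \to A^{K^*}, \qquad k \mapsto [f \mapsto f(k)] \mapsto (f(k))_{f \in K^*}. $$
For any right $A$-module $N$ define the maps
$$ \alpha_{N,K}: N \otimes_A K \to \mathrm{Hom}_{\mathbb{Z}}(K^*, N), \qquad n \otimes k \mapsto [f \mapsto n f(k)], $$
$$ \psi_N: N \otimes_A A^{K^*} \to N^{K^*}, \qquad n \otimes (a_f)_{f \in K^*} \mapsto (n a_f)_{f \in K^*}. $$
By the identification $\mathrm{Map}(K^*, N) = N^{K^*}$ we have the commutative diagram
$$ \begin{array}{ccc} N \otimes_A K & \xrightarrow{\ I \otimes \varphi_K\ } & N \otimes_A A^{K^*} \\ \downarrow{\scriptstyle \alpha_{N,K}} & & \downarrow{\scriptstyle \psi_N} \\ \mathrm{Hom}_{\mathbb{Z}}(K^*, N) & \subset & N^{K^*} \end{array} $$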
2.2. Injectivity of $\alpha_{N,K}$. We stick to the notation above.
(1) The following are equivalent:
(a) $\alpha_{N,K}$ is injective;
(b) for $u \in N \otimes_A K$, $(I \otimes f)(u) = 0$ for all $f \in K^*$ implies $u = 0$.
(2) The following are equivalent:
(a) for every finitely presented right $A$-module $N$, $\alpha_{N,K}$ is injective;
(b) $\varphi_K: K \to A^{K^*}$ is a pure monomorphism.

Proof. (1) Let $u = \sum_{i=1}^r n_i \otimes k_i \in N \otimes_A K$. Then $(I \otimes f)(u) = \sum_{i=1}^r n_i f(k_i) = 0$ for all $f \in K^*$ if and only if $u \in \mathrm{Ke}\, \alpha_{N,K}$.
(2) For $N$ finitely presented, $\psi_N$ is injective (bijective) and so $\alpha_{N,K}$ is injective if and only if $I_N \otimes \varphi_K$ is injective. Injectivity of $I_N \otimes$
The module $K$ is called locally projective (see [12]) if, for any diagram of left $A$-modules with exact lines
$$ \begin{array}{ccccccc} 0 & \to & F & \xrightarrow{\ i\ } & K & & \\ & & & & \downarrow{\scriptstyle g} & & \\ & & L & \xrightarrow{\ f\ } & N & \to & 0 \end{array} $$
where $F$ is finitely generated, there exists $h: K \to L$ such that $g \circ i = f \circ h \circ i$. Clearly every projective module is locally projective. From Garfinkel [4, Theorem 3.2] and Huisgen-Zimmermann [12, Theorem 2.1] we have the following characterizations of these modules, which are also studied in Ohm-Bush [5] (as trace modules), and in Raynaud-Gruson [6] (as modules plats et strictement de Mittag-Leffler).

2.3. Locally projective modules. For the left $A$-module $K$, the following are equivalent:
(a) $K$ is locally projective;
(b) $K$ is a pure submodule of a locally projective module;
(c) $\alpha_{N,K}$ is injective, for any right $A$-module $N$;
(d) $\alpha_{N,K}$ is injective, for any cyclic right $A$-module $N$;
(e) for each $m \in K$, we have $m \in K^*(m)K$;
(f) for each finitely generated submodule $i: F \to K$, there exist $n \in \mathbb{N}$ and maps $\beta: R^n \to K$, $\gamma: K \to R^n$ with $\beta \circ \gamma \circ i = i$.

Recall the following observations. Notice that for a right noetherian ring $A$, every product of copies of $A$ is locally projective as a left $A$-module (e.g. [12, Corollary 4.3]).

2.4. Corollary. Let $K$ be a left $A$-module.
(1) Every locally projective module is flat and a pure submodule of some product $A^\Lambda$, $\Lambda$ some set.
(2) If $K$ is finitely generated, or $A$ is left perfect, then $K$ is locally projective if and only if $K$ is projective.
(3) For a right noetherian ring $A$, the following are equivalent:
(a) $K$ is locally projective;
(b) $K$ is a pure submodule of a product $A^\Lambda$, $\Lambda$ some set.
The following facts from general category theory will be helpful (e.g., [7]). In any category $\mathcal{A}$, a morphism $f: A \to B$ is called a monomorphism if for any morphisms $g, h: C \to A$ the identity $f \circ g = f \circ h$ implies $g = h$. In an additive category $\mathcal{A}$ a morphism $\gamma: K \to A$ is called a kernel of $f: A \to B$ provided $f \circ \gamma = 0$ and, for every $g: C \to A$ with $f \circ g = 0$, there is exactly one $h: C \to K$ such that $g = \gamma \circ h$. Recall the following well-known (and easily proved) observations.

2.5. Monomorphisms. Let $\mathcal{A}$ be any category and $f: A \to B$ a morphism in $\mathcal{A}$. The following are equivalent:
(a) $f$ is a monomorphism;
(b) the map $\mathrm{Mor}(L, f): \mathrm{Mor}(L, A) \to \mathrm{Mor}(L, B)$, $g \mapsto f \circ g$, is injective, for any $L \in \mathcal{A}$.
If $\mathcal{A}$ is additive and has kernels, then (a)-(b) are equivalent to:
(c) for the kernel $\gamma: K \to A$ of $f$, $K = 0$.

The basic properties of adjoint functors will be helpful.

2.6. Adjoint functors. Let $\mathcal{A}$ and $\mathcal{B}$ be any categories. Assume a functor $F: \mathcal{A} \to \mathcal{B}$ is right adjoint to a functor $G: \mathcal{B} \to \mathcal{A}$, i.e.,
$$ \mathrm{Mor}_{\mathcal{B}}(Y, F(X)) \simeq \mathrm{Mor}_{\mathcal{A}}(G(Y), X), \quad\text{for any } X \in \mathcal{A},\ Y \in \mathcal{B}. $$
Then
(1) $F$ preserves monomorphisms and products,
(2) $G$ preserves epimorphisms and coproducts.

For the study of comodules the following type of module categories is of particular interest.

2.7. The category $\sigma[K]$. For any left $A$-module $K$ we denote by $\sigma[K]$ the full subcategory of ${}_A\mathcal{M}$ whose objects are submodules of $K$-generated modules. This is the smallest full Grothendieck subcategory of ${}_A\mathcal{M}$ containing $K$ (see [8]). $\sigma[K]$ coincides with ${}_A\mathcal{M}$ if and only if $A$ embeds into some (finite) coproduct of copies of $K$. This happens, for example, when $K$ is a faithful $A$-module which is finitely generated as a module over its endomorphism ring (see [8, 15.4]).
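The trace functor $\mathcal{T}^K: {}_A\mathcal{M} \to \sigma[K]$, which sends any $X \in {}_A\mathcal{M}$ to
$$ \mathcal{T}^K(X) := \sum \{ f(N) \mid N \in \sigma[K],\ f \in \mathrm{Hom}_A(N, X) \}, $$
is right adjoint to the inclusion functor $\sigma[K] \to {}_A\mathcal{M}$ (e.g., [8, 45.11]). Hence, by 2.6, for any family $\{N_\lambda\}_\Lambda$ of modules in $\sigma[K]$, the product in $\sigma[K]$ is
$$ \prod\nolimits_\Lambda^K N_\lambda = \mathcal{T}^K\Bigl(\prod\nolimits_\Lambda N_\lambda\Bigr), $$
where the unadorned $\prod$ denotes the usual (cartesian) product of $A$-modules. It also follows from 2.6 that for $\{N_\lambda\}_\Lambda$ in $\sigma[K]$ the coproduct in $\sigma[K]$ and the coproduct in ${}_A\mathcal{M}$ coincide.

3 Corings and comodules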
As before, let $A$ be any associative ring with unit.

3.1. Corings and their duals. An $A$-coring is an $(A,A)$-bimodule $C$ with $(A,A)$-bimodule maps (comultiplication and counit)
$$ \Delta: C \to C \otimes_A C, \qquad \varepsilon: C \to A, $$
satisfying the identities
$$ (I \otimes \Delta) \circ \Delta = (\Delta \otimes I) \circ \Delta, \qquad (I \otimes \varepsilon) \circ \Delta = I = (\varepsilon \otimes I) \circ \Delta. $$
For elementwise description of these maps we adopt the $\Sigma$-notation, writing for $c \in C$, $\Delta(c) = \sum c_1 \otimes c_2$. Then coassociativity of $\Delta$ is written as
$$ \sum \Delta(c_1) \otimes c_2 = \sum c_{11} \otimes c_{12} \otimes c_2 = \sum c_1 \otimes c_{21} \otimes c_{22} = \sum c_1 \otimes \Delta(c_2), $$
and the conditions on the counit are
$$ \sum \varepsilon(c_1) c_2 = c = \sum c_1 \varepsilon(c_2). $$
Of course, when $A$ is commutative and $ac = ca$ for all $a \in A$, $c \in C$, the coring $C$ is just a coalgebra in the usual sense.
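As a minimal illustration of these axioms (our own example, not taken from the text), one may take $C = A$ itself, using the canonical identification $A \otimes_A A \simeq A$. The check below is a sketch of why the coassociativity and counit identities hold in this trivial case.

```latex
% Illustrative example (not from the source): the trivial A-coring C = A.
\begin{align*}
  \Delta &: A \to A \otimes_A A, \quad a \mapsto a \otimes 1 = 1 \otimes a,
  \qquad \varepsilon = I_A : A \to A, \\
  (I \otimes \Delta)\Delta(a) &= a \otimes 1 \otimes 1 = (\Delta \otimes I)\Delta(a), \\
  (I \otimes \varepsilon)\Delta(a) &= a\,\varepsilon(1) = a
     = \varepsilon(a)\,1 = (\varepsilon \otimes I)\Delta(a).
\end{align*}
```

For this coring a coaction on a right $A$-module $M$ is forced to be the canonical isomorphism $M \to M \otimes_A A$, so the right comodules are just the right $A$-modules.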
For any $A$-coring $C$, the maps $C \to A$ may be right $A$-linear or left $A$-linear and we denote these by
$$ C^* := \mathrm{Hom}_{-A}(C, A), \qquad {}^*C := \mathrm{Hom}_{A-}(C, A), $$
and for bilinear maps we have $\mathrm{Hom}_{A,A}(C, A) = {}^*C \cap C^*$. Both $C^*$ and ${}^*C$ can be turned into associative rings with unit $\varepsilon$ by the (convolution) products
(1) for $f, g \in C^*$ and $c \in C$ put $f *^r g\,(c) = \sum g(f(c_1)c_2)$,
(2) for $f, g \in {}^*C$ and $c \in C$ put $f *^l g\,(c) = \sum f(c_1 g(c_2))$.
Notice that for $f, g \in {}^*C \cap C^*$ this yields
$$ f * g\,(c) = \sum f(c_1)\,g(c_2), $$
a formula which is well known from coalgebras. It is easily verified that the maps
$$ \iota_l: A \to {}^*C, \quad a \mapsto [c \mapsto \varepsilon(c)a], \qquad\text{and}\qquad \iota_r: A \to C^*, \quad a \mapsto [c \mapsto a\varepsilon(c)], $$
are ring anti-morphisms and hence we may consider left ${}^*C$-modules as right $A$-modules and right $C^*$-modules as left $A$-modules.

3.2. Right comodules. Let $C$ be an $A$-coring and $M$ a right $A$-module. An $A$-linear map $\varrho_M: M \to M \otimes_A C$ is called a coaction on $M$, and it is said to be counital and coassociative provided
$$ (I \otimes \varepsilon) \circ \varrho_M = I \qquad\text{and}\qquad (I \otimes \Delta) \circ \varrho_M = (\varrho_M \otimes I) \circ \varrho_M. $$
A right $C$-comodule is a right $A$-module with a counital coassociative coaction. A morphism of right $C$-comodules $f: M \to N$ is an $A$-linear map such that
$$ \varrho_N \circ f = (f \otimes I) \circ \varrho_M. $$
We denote the set of comodule morphisms between $M$ and $N$ by $\mathrm{Hom}^C(M, N)$. It is easy to show that this is an abelian group and hence the category $\mathcal{M}^C$, formed by right $C$-comodules and comodule morphisms, is additive.
For any right $A$-module $X$, the tensor product $X \otimes_A C$ is a right $C$-comodule by
$$ I \otimes \Delta: X \otimes_A C \to X \otimes_A C \otimes_A C, $$
and for any $A$-morphism $f: X \to Y$, the map
$$ f \otimes I: X \otimes_A C \to Y \otimes_A C $$
is a comodule morphism.

3.3. The category $\mathcal{M}^C$. Let $C$ be an $A$-coring.
(1) The category $\mathcal{M}^C$ has direct sums and cokernels. It has kernels provided $C$ is flat as a left $A$-module.
(2) For the functor $- \otimes_A C: \mathcal{M}_A \to \mathcal{M}^C$ we have natural isomorphisms
$$ \mathrm{Hom}^C(M, X \otimes_A C) \to \mathrm{Hom}_A(M, X), \qquad f \mapsto (I \otimes \varepsilon) \circ f, $$
for $M \in \mathcal{M}^C$, $X \in \mathcal{M}_A$, with inverse map $h \mapsto (h \otimes I) \circ \varrho_M$, i.e., the functor $- \otimes_A C: \mathcal{M}_A \to \mathcal{M}^C$ is right adjoint to the forgetful functor $\mathcal{M}^C \to \mathcal{M}_A$ and hence it preserves monomorphisms and products.
(3) For the right comodule endomorphisms we have $\mathrm{End}^C(C) \simeq C^*$.
(4) $C$ is a subgenerator in $\mathcal{M}^C$.

Proof. (1) Consider a family $\{M_\lambda\}_\Lambda$ of right $C$-comodules. It is easy to prove that the direct sum $\bigoplus_\Lambda M_\lambda$ in $\mathcal{M}_A$ is a right $C$-comodule and has the universal property of a coproduct in $\mathcal{M}^C$. For any morphism $f: M \to N$ of right $C$-comodules, the cokernel of $f$ in $\mathcal{M}_A$ has a comodule structure and hence is a cokernel in $\mathcal{M}^C$. If $C$ is flat as a left $A$-module, similar arguments hold for the kernel.
(2) The proof of the corresponding assertion for coalgebras applies (e.g., [9, 3.12]); for the last statement refer to 2.6. Note that the adjointness, for example, was also observed in [3, Lemma 3.1].
(3) The group isomorphism $\mathrm{End}^C(C) \simeq C^*$ follows from (2) by putting $M = C$ and $X = A$. This is a ring isomorphism when writing the morphisms on the right.
(4) For any $M \in \mathcal{M}^C$, there is an epimorphism $A^{(\Lambda)} \to M$ in $\mathcal{M}_A$. Tensoring with $C$ yields an epimorphism $A^{(\Lambda)} \otimes_A C \to M \otimes_A C$ in $\mathcal{M}^C$. As easily checked the structure map $\varrho_M: M \to M \otimes_A C$ is a morphism in $\mathcal{M}^C$ and hence $M$ is a subobject of a $C$-generated comodule. $\Box$
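The statement that the two assignments in 3.3(2) are mutually inverse is left to the coalgebra case in the text. A direct check can be sketched as follows, assuming $h: M \to X$ is $A$-linear, $f: M \to X \otimes_A C$ is a comodule morphism for the coaction $I \otimes \Delta$ on $X \otimes_A C$, and the usual identifications $M \otimes_A A \simeq M$, $A \otimes_A C \simeq C$.

```latex
% Sketch: the maps f -> (I (x) eps) o f and h -> (h (x) I) o rho_M are inverse to each other.
\begin{align*}
  (I \otimes \varepsilon)\circ(h \otimes I)\circ\varrho_M
    &= h \circ (I \otimes \varepsilon)\circ\varrho_M = h, \\
  \bigl((I \otimes \varepsilon)\circ f \otimes I\bigr)\circ\varrho_M
    &= (I \otimes \varepsilon \otimes I)\circ(f \otimes I)\circ\varrho_M \\
    &= (I \otimes \varepsilon \otimes I)\circ(I \otimes \Delta)\circ f = f .
\end{align*}
```

The first line uses the counit condition on $\varrho_M$; the second uses that $f$ is a comodule morphism and the counit identity $(\varepsilon \otimes I) \circ \Delta = I$.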
3.4. $\mathcal{M}^C$ as Grothendieck category. For an $A$-coring $C$ the following are equivalent:
(a) $C$ is a flat left $A$-module;
(b) every monomorphism in $\mathcal{M}^C$ is injective;
(c) the forgetful functor $\mathcal{M}^C \to \mathcal{M}_A$ respects monomorphisms.
If these conditions are satisfied, $\mathcal{M}^C$ is a Grothendieck category.

Proof. (a) $\Rightarrow$ (b) $\Leftrightarrow$ (c) are obvious.
(c) $\Rightarrow$ (a) For any monomorphism $f: N \to L$ in $\mathcal{M}_A$, the map $f \otimes I: N \otimes_A C \to L \otimes_A C$ is a monomorphism in $\mathcal{M}^C$ (by 3.3(2)) and hence injective by assumption. This shows that $- \otimes_A C: \mathcal{M}_A \to \mathbb{Z}\text{-Mod}$ is exact and hence $C$ is a flat left $A$-module.
Now assume that (a)-(c) are satisfied. Then $\mathcal{M}^C$ is abelian and cocomplete. Since $C$ is a subgenerator it is routine to show that the subcomodules of $C^n$, $n \in \mathbb{N}$, form a generating set for $\mathcal{M}^C$. Hence $\mathcal{M}^C$ is a Grothendieck category. $\Box$

Every right $C$-comodule $M$ allows a left ${}^*C$-module structure by
$$ \rightharpoonup: {}^*C \otimes_{\mathbb{Z}} M \to M, \qquad f \otimes m \mapsto (I \otimes f) \circ \varrho_M(m). $$
With this structure any comodule morphism $M \to N$ is ${}^*C$-linear, i.e. $\mathrm{Hom}^C(M, N) \subset \mathrm{Hom}_{{}^*C}(M, N)$, and hence $\mathcal{M}^C$ is a subcategory of ${}_{{}^*C}\mathcal{M}$. As shown in [3, Lemma 4.3], $\mathcal{M}^C$ can be identified with ${}_{{}^*C}\mathcal{M}$ provided $C$ is finitely generated and projective as a left $A$-module. Notice that in any case $C$ is a faithful ${}^*C$-module, since $f \rightharpoonup c = 0$ for all $c \in C$ implies $f(c) = \varepsilon(f \rightharpoonup c) = 0$ and hence $f = 0$.

The question arises when, more generally, $\mathcal{M}^C$ is a full subcategory of ${}_{{}^*C}\mathcal{M}$, i.e., when $\mathrm{Hom}^C(M, N) = \mathrm{Hom}_{{}^*C}(M, N)$, for any $M, N \in \mathcal{M}^C$. The answer is given in our main theorem:

3.5. $\mathcal{M}^C$ as full subcategory of ${}_{{}^*C}\mathcal{M}$. For the $A$-coring $C$, the following are equivalent:
(a) $\mathcal{M}^C = \sigma[{}_{{}^*C}C]$;
(b) $\mathcal{M}^C$ is a full subcategory of ${}_{{}^*C}\mathcal{M}$;
(c) for all $M, N \in \mathcal{M}^C$, $\mathrm{Hom}^C(M, N) = \mathrm{Hom}_{{}^*C}(M, N)$;
(d) $C$ satisfies the $\alpha$-condition as left $A$-module;
(e) every ${}^*C$-submodule of $C^n$, $n \in \mathbb{N}$, is a subcomodule of $C^n$;
(f) $C$ is locally projective as left $A$-module.
If these conditions are satisfied we have, for any family $\{M_\lambda\}_\Lambda$ of right $A$-modules,
$$ \Bigl(\prod\nolimits_\Lambda M_\lambda\Bigr) \otimes_A C \simeq \prod\nolimits_\Lambda^C (M_\lambda \otimes_A C) \subset \prod\nolimits_\Lambda (M_\lambda \otimes_A C). $$
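Proof. The implications (a) $\Leftrightarrow$ (b) $\Leftrightarrow$ (c) $\Rightarrow$ (e) are obvious.
(a) $\Rightarrow$ (d) By 3.4, ${}_AC$ is flat. For any $N \in \mathcal{M}_A$ we prove the injectivity of the map
$$ \alpha: N \otimes_A C \to \mathrm{Hom}_{\mathbb{Z}}({}^*C, N), \qquad n \otimes c \mapsto [f \mapsto n f(c)]. $$
Considering $\mathrm{Hom}_{\mathbb{Z}}({}^*C, N)$ and the right $C$-comodule $N \otimes_A C$ as left ${}^*C$-modules in the canonical way, we observe that $\alpha$ is ${}^*C$-linear. So for any right $C$-comodule $L$ we have the commutative diagram
$$ \begin{array}{ccc} \mathrm{Hom}_{{}^*C}(L, N \otimes_A C) & \xrightarrow{\ \mathrm{Hom}(L,\alpha)\ } & \mathrm{Hom}_{{}^*C}(L, \mathrm{Hom}_{\mathbb{Z}}({}^*C, N)) \\ \downarrow{\scriptstyle \simeq} & & \downarrow{\scriptstyle \simeq} \\ \mathrm{Hom}_A(L, N) & \longrightarrow & \mathrm{Hom}_{\mathbb{Z}}(L, N) \end{array} $$
where the first vertical isomorphism is obtained by assumption and 3.3,
$$ \mathrm{Hom}_{{}^*C}(L, N \otimes_A C) = \mathrm{Hom}^C(L, N \otimes_A C) \simeq \mathrm{Hom}_A(L, N), $$
and the second one by canonical isomorphisms
$$ \mathrm{Hom}_{{}^*C}(L, \mathrm{Hom}_{\mathbb{Z}}({}^*C, N)) \simeq \mathrm{Hom}_{\mathbb{Z}}({}^*C \otimes_{{}^*C} L, N) \simeq \mathrm{Hom}_{\mathbb{Z}}(L, N). $$
This shows that $\mathrm{Hom}(L, \alpha)$ is injective and so (by 2.5) the corestriction of $\alpha$ is a monomorphism in $\mathcal{M}^C$. Since ${}_AC$ is flat this implies that $\alpha$ is injective (by 3.4).
(e) $\Rightarrow$ (a) First we show that every finitely generated module $N \in \sigma[{}_{{}^*C}C]$ is a $C$-comodule. There exist some ${}^*C$-submodule $X \subset C^n$, $n \in \mathbb{N}$, and an epimorphism $h: X \to N$. By assumption $X$ and the kernel of $h$ are comodules and hence $N$ is a comodule. Now for any $L \in \sigma[{}_{{}^*C}C]$ the finitely generated submodules are comodules and hence $L$ is a comodule.
For any ${}^*C$-morphism in $\sigma[{}_{{}^*C}C]$, the kernel is a ${}^*C$-submodule and hence a comodule. As easily verified this implies that monomorphisms and epimorphisms in $\sigma[{}_{{}^*C}C]$ are comodule morphisms and hence this is true for all morphisms in $\sigma[{}_{{}^*C}C]$.
(d) $\Leftrightarrow$ (f) follows by 2.3.
(d) $\Rightarrow$ (e) We show that for right $C$-comodules $M$, any ${}^*C$-submodule $N$ is a subcomodule. For this consider the map
$$ \rho_N: N \to \mathrm{Hom}_A({}^*C, N), \qquad n \mapsto [f \mapsto f \rightharpoonup n]. $$
With the inclusion $i: N \to M$ and the projection $p: M \to M/N$, we have the commutative diagram with exact lines
$$ \begin{array}{ccccccccc} 0 & \to & N & \xrightarrow{\ i\ } & M & \xrightarrow{\ p\ } & M/N & \to & 0 \\ & & & & \downarrow{\scriptstyle \varrho_M} & & & & \\ 0 & \to & N \otimes_A C & \xrightarrow{\ i \otimes I\ } & M \otimes_A C & \xrightarrow{\ p \otimes I\ } & M/N \otimes_A C & \to & 0 \\ & & \downarrow{\scriptstyle \alpha_{N,C}} & & \downarrow{\scriptstyle \alpha_{M,C}} & & \downarrow{\scriptstyle \alpha_{M/N,C}} & & \\ 0 & \to & \mathrm{Hom}({}^*C, N) & \to & \mathrm{Hom}({}^*C, M) & \to & \mathrm{Hom}({}^*C, M/N) & & \end{array} $$
where all the $\alpha$'s are injective and $\mathrm{Hom}({}^*C, i) \circ \rho_N = \alpha_{M,C} \circ \varrho_M \circ i$. This implies $(p \otimes I) \circ \varrho_M \circ i = 0$, and by the kernel property, $\varrho_M \circ i$ factors through $N \to N \otimes_A C$, thus yielding a $C$-coaction on $N$.
The final assertion follows by 2.6 and the characterization of products in $\sigma[{}_{{}^*C}C]$ (see 2.7). $\Box$

As a corollary we can show when all ${}^*C$-modules are $C$-comodules. This includes the reverse conclusion of [3, Lemma 4.3] and extends [11, Lemma 33].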
3.6. $\mathcal{M}^C = {}_{{}^*C}\mathcal{M}$. For any $A$-coring $C$, the following are equivalent:
(a) $\mathcal{M}^C = {}_{{}^*C}\mathcal{M}$;
(b) the functor $- \otimes_A C: \mathcal{M}_A \to {}_{{}^*C}\mathcal{M}$ has a left adjoint;
(c) ${}_AC$ is finitely generated and projective;
(d) ${}_AC$ is locally projective and $C$ is finitely generated as right $C^*$-module.

Proof. (a) $\Rightarrow$ (b) and (c) $\Rightarrow$ (d) are obvious.
(b) $\Rightarrow$ (c) By 2.6, $- \otimes_A C$ preserves monomorphisms (injective morphisms) and hence ${}_AC$ is flat. Moreover we obtain, for any family $\{M_\lambda\}_\Lambda$ in $\mathcal{M}_A$, the isomorphism
$$ \Bigl(\prod\nolimits_\Lambda M_\lambda\Bigr) \otimes_A C \simeq \prod\nolimits_\Lambda (M_\lambda \otimes_A C), $$
which implies that ${}_AC$ is finitely presented (e.g., [8, 12.9]) and hence projective.
(d) $\Rightarrow$ (a) Recall that $C^*$ is the endomorphism ring of the faithful module ${}_{{}^*C}C$. Hence $C_{C^*}$ finitely generated implies $\mathcal{M}^C = \sigma[{}_{{}^*C}C] = {}_{{}^*C}\mathcal{M}$ (see 2.7). $\Box$

Acknowledgement. The author is very grateful to Jawad Abuhlail for interesting and helpful discussions on the subject.
References
[1] Abuhlail, J.Y., Dualitätssätze für Hopf-Algebren über Ringen, Dissertation, Universität Düsseldorf (2001)
[2] Abuhlail, J.Y., Gómez-Torrecillas, J., Lobillo, F.J., Duality and rational modules in Hopf algebras over commutative rings, J. Algebra 240, 165-184 (2001)
[3] Brzeziński, T., The structure of corings, Algebras and Repr. Theory, to appear
[4] Garfinkel, G.S., Universally torsionless and trace modules, Trans. Amer. Math. Soc. 215, 119-144 (1976)
[5] Ohm, J., Bush, D.E., Content modules and algebras, Math. Scand. 31, 49-68 (1972)
[6] Raynaud, M., Gruson, L., Critère de platitude et de projectivité, Inventiones Math. 13, 1-89 (1971)
[7] Schubert, H., Categories, Springer, Berlin (1972)
[8] Wisbauer, R., Foundations of Module and Ring Theory, Gordon and Breach, Reading, Paris (1991)
[9] Wisbauer, R., Semiperfect coalgebras over rings, in Algebras and Combinatorics, ICA'97, Hong Kong, K.P. Shum, E. Taft, Z.X. Wan (ed), Springer Singapore, 487-512 (1999)
[10] Wisbauer, R., Weak Corings, J. Algebra, to appear
[11] Wischnewsky, M.B., On linear representations of affine groups I, Pac. J. Math. 61, 551-572 (1975)
[12] Zimmermann-Huisgen, B., Pure submodules of direct products of free modules, Math. Ann. 224, 233-245 (1976)

Mathematisches Institut, Heinrich-Heine-Universität, 40225 Düsseldorf
Regularity of the blow-up set and singular behavior for semilinear heat equations

Hatem Zaag
Courant Institute and CNRS ENS

Abstract: We consider $u(x,t)$ a blow-up solution of $u_t = \Delta u + |u|^{p-1}u$ where $u: \mathbb{R}^N \times [0,T) \to \mathbb{R}$, $p > 1$, $(N-2)p < N+2$ and either $u(0) \ge 0$ or $(3N-4)p < 3N+8$. The blow-up set $S \subset \mathbb{R}^N$ of $u$ is the set of all blow-up points. Under a non degeneracy condition, we show that if $S$ is continuous, then it is a $C^1$ manifold. The blow-up behavior of $u$ near non isolated blow-up points is derived as well. If the codimension of the blow-up set is one, then $S$ is $C^{1,\alpha}$ for any $\alpha \in (0, \frac{1}{2})$. If in addition $p > 3$, then $u$ is very close to a superposition of one dimensional solutions as functions of the distance to $S$.

AMS Classification: 35K55, 35B40

We are concerned in this note with blow-up phenomena arising in the following semilinear problem:
$$ u_t = \Delta u + |u|^{p-1}u, \qquad u(\cdot,0) = u_0 \in L^\infty(\mathbb{R}^N), \tag{1} $$
where $u(t): x \in \mathbb{R}^N \mapsto u(x,t) \in \mathbb{R}$ and $\Delta$ stands for the Laplacian in $\mathbb{R}^N$. We assume in addition that the exponent $p > 1$ is subcritical: if $N \ge 3$ then $1 < p < (N+2)/(N-2)$. Moreover, we assume that
$$ u_0 \ge 0 \quad\text{or}\quad (3N-4)p < 3N+8. \tag{2} $$
This problem has attracted a lot of attention because it captures features common to a whole range of blow-up problems arising in various physical situations, particularly the role of scaling and self-similarity. Without pretending to be exhaustive, we would like nonetheless to mention some related equations: the motion by mean curvature (Soner and Souganidis [20]), vortex dynamics in superconductors (Chapman, Hunton and Ockendon [6], Merle and Zaag [15]), surface diffusion (Bernoff, Bertozzi and Witelski [2])
and chemotaxis (Brenner et al. [4], Betterton and Brenner [3]). However, equation (1) is simple enough to be tractable in rigorous mathematical terms, unlike other physical equations.

A solution $u(t)$ to (1) blows up in finite time if its maximal existence time $T$ is finite. In this case,
$$ \lim_{t\to T}\|u(t)\|_{H^1(\mathbb{R}^N)} = \lim_{t\to T}\|u(t)\|_{L^\infty(\mathbb{R}^N)} = +\infty. $$
Let us consider such a solution. $T$ is called the blow-up time of $u$. A point $a \in \mathbb{R}^N$ is called a blow-up point if $|u(x,t)| \to +\infty$ as $(x,t) \to (a,T)$ (this definition is equivalent to the usual local unboundedness definition, thanks to Corollary 2 in [18]). $S$ denotes the blow-up set, that is the set of all blow-up points. From [18], we know that there exists a blow-up profile $u^* \in C^2_{loc}(\mathbb{R}^N \setminus S)$ such that
$$ u(x,t) \to u^*(x) \ \text{ in } C^2_{loc}(\mathbb{R}^N \setminus S) \ \text{ as } t \to T. \tag{3} $$
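As a concrete illustration of finite-time blow-up (a standard computation added here for orientation, not part of the original argument), consider positive solutions of (1) that do not depend on $x$. The symbol $U$ and the normalization of $T$ below are ours.

```latex
% Illustration (standard ODE computation): x-independent positive solutions of (1).
\begin{align*}
  U'(t) &= U(t)^p, \quad U > 0
     \quad\Longrightarrow\quad \frac{d}{dt}\,U^{1-p} = (1-p)\,U^{-p}\,U' = -(p-1), \\
  U(t)^{1-p} &= (p-1)(T - t), \qquad T := \frac{U(0)^{1-p}}{p-1}, \\
  U(t) &= \bigl[(p-1)(T-t)\bigr]^{-\frac{1}{p-1}}
     \;\longrightarrow\; +\infty \ \text{ as } t \to T .
\end{align*}
```

This is the rate $(T-t)^{-\frac{1}{p-1}}$ that appears in the rescaling used in (4) and in the Liouville theorem quoted later in this section.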
The blow-up problem has been addressed in different ways in the literature. An important direction was developed by authors looking for sufficient blow-up conditions on initial data or on the nonlinear term (see Fujita [10], Ball [1], Levine [13] and the review paper by Deng and Levine [7]). The behavior near singular time is a major direction too. More precisely, given $\hat a \in \mathbb{R}^N$ a blow-up point of $u$, two issues arise:
- the blow-up behavior of $u(x,t)$ near the singularity $(\hat a, T)$,
- the regularity of the blow-up set near $\hat a$.
The blow-up behavior issue has been extensively addressed in the literature when $\hat a$ is an isolated blow-up point (note that the second question is irrelevant then). See for example Weissler [25], Bricmont and Kupiainen [5], Herrero and Velázquez [12] and [22]. No relevant results were known when $\hat a$ is not isolated. As a matter of fact, we address in this note these two issues in a case where $\hat a$ is not isolated. These two issues are very closely related. See [26] and [27].

1 The regularity of the blow-up set
- A constructive question: Given a compact set $S \subset \mathbb{R}^N$, can one construct $u$ a solution of (1) blowing up at some time $T$ exactly on $S$? The answer is affirmative if $S$ is a sphere (see Giga and Kohn [11] for example) or a collection of $k$ points (see Merle [14] and Merle and Zaag [16]). The techniques of [16] give a solution when $S$ is a union of $k$ concentric spheres (which reduces to the case of $k$ points in the radial setting). The question remains open otherwise.
- A descriptive question: Given $u$ a solution of (1) that blows up at time $T$ on a set $S$, consider $\hat a$ a non isolated blow-up point. What is the regularity of $S$ near $\hat a$? We know from Velázquez [23] that the $(N-1)$-dimensional Hausdorff measure of $S$ is bounded on compact sets (as a matter of fact, this provides a necessary condition on $S$ in the constructive question above). No further information was available.

The descriptive question is our first concern in this note. Given $\hat a \in S$, we know from Velázquez [22] that up to some scalings, $u$ approaches a particular explicit function near the singularity $(\hat a, T)$. We consider the case where for all $K_0 > 0$,
$$ \sup_{|z| \le K_0} \left| (T-t)^{\frac{1}{p-1}}\, u\!\left(\hat a + Q_{\hat a} z \sqrt{(T-t)|\log(T-t)|},\, t\right) - f_{l_{\hat a}}(z) \right| \to 0 \tag{4} $$
as $t \to T$, where $Q_{\hat a}$ is an orthonormal $N \times N$ matrix, $l_{\hat a} = 1, \dots, N$, and
$$ f_{l_{\hat a}}(z) = \left( p - 1 + \frac{(p-1)^2}{4p} \sum_{i=1}^{l_{\hat a}} z_i^2 \right)^{-\frac{1}{p-1}}. \tag{5} $$
Other behaviors with the scaling $(T-t)^{-\frac{1}{2k}}(x - \hat a)$ where $k = 2, 3, \dots$ may occur (see [22]). We suspect them to be unstable.

If $l_{\hat a} = N$, then $\hat a$ is an isolated blow-up point. An extensive literature is devoted to this case (Weissler [25], Bricmont and Kupiainen [5], Herrero and Velázquez [12] and [22], ...). We have proved the stability of such a behavior with Fermanian and Merle in [8]. The key argument in our proof was the following Liouville Theorem proved by Merle and Zaag in [17] and [18]:

Consider $U$ a solution of (1) defined for all $(x,t) \in \mathbb{R}^N \times (-\infty, T)$ such that for all $(x,t) \in \mathbb{R}^N \times (-\infty, T)$, $|U(x,t)| \le C(T-t)^{-\frac{1}{p-1}}$. Then, either $U \equiv 0$ or $U(x,t) = [(p-1)(T^* - t)]^{-\frac{1}{p-1}}$ for some $T^* > T$.

The case $l_{\hat a} < N$ is known to occur, namely when $u$ is invariant with respect to some coordinates. However, when $l_{\hat a} < N$, we cannot even tell
340
whether a is isolated or not, or whether S is continuous near a. Therefore, we assume that a is non isolated and that S contains a continuum that goes through a. To make our presentation clearer, we restrict to the case N = 2 and assume that a = a(0) G I m a C S where a G C ( ( - l , 1),M 2 ) and for some ao, Ve > 0, a ( - e , e) intersects the complimentary of any connected closed cone with vertex at a and angle a G (0, ao]
, .
(this is in a way to ensure that $\hat a$ is not an endpoint). Assuming that $u$ behaves according to (4) near the singularity $(\hat a, T)$, we have the following result:

Theorem 1 (Regularity of the blow-up set at a point with the behavior (4) assuming S contains a continuum) Assume $N = 2$ and consider $u$ a solution of (1) that blows up at time $T$ on a set $S$. Consider $\hat a = a(0) \in \operatorname{Im} a \subset S$ where $a \in C((-1,1), \mathbb{R}^2)$ and $\hat a$ is not an endpoint (in the sense (6)). If $u$ behaves near $(\hat a, T)$ as stated in (4), then there are $\delta > 0$, $\delta_1 > 0$ and $\varphi \in C^1([-\delta_1, \delta_1], \mathbb{R})$ such that
$$S \cap B(\hat a, 2\delta) = \operatorname{graph} \varphi \cap B(\hat a, 2\delta) = \operatorname{Im} a \cap B(\hat a, 2\delta). \qquad (7)$$
In particular, $S$ is a $C^1$ manifold near the point $\hat a$. More precisely, there exist $C_0 > 0$ and $h_0$ such that for all $|\xi| < \delta_1$ and $|h| < h_0$ such that $|\xi + h| < \delta_1$, we have:
$$|\varphi(\xi + h) - \varphi(\xi) - h\varphi'(\xi)| \le \dots$$
Remark: The function $\varphi$ is actually $C^{1,\alpha}$ for any $\alpha \in (0, \frac{1}{2})$ (see Proposition 5 below). In higher dimensions, we proved $C^{1,\alpha}$ regularity only when the codimension of the blow-up set is one.

Remark: From [22], we know that the limit function at $(\hat a, T)$ stated in (4) has a degenerate direction, and that we cannot have two curves of blow-up points intersecting transversally at $\hat a$. With our contribution, we eliminate the possibility of two curves meeting tangentially at $\hat a$. In particular, there is no cusp at $\hat a$, and there is no sequence of isolated blow-up points converging to $\hat a \in S$.

Theorem 1 also holds in higher dimensions. We claim the following:

Theorem 1' (Regularity of the blow-up set near a point with the behavior (4) assuming S contains an $(N-l)$-dimensional continuum) Take $N \ge 2$ and $l \in \{1, \dots, N-1\}$. Consider $u$ a solution of (1) that blows up at time $T$ on a set $S$ and take $\hat a \in S$ where $u$ behaves locally as stated in (4) with $l_{\hat a} = l$. Consider $a \in C((-1,1)^{N-l}, \mathbb{R}^N)$ such that $\hat a = a(0) \in \operatorname{Im} a \subset S$ and $\operatorname{Im} a$ is at least $(N-l)$-dimensional. If $\hat a$ is not an endpoint, then there are $\delta > 0$, $\delta_1 > 0$ and
2 The blow-up behavior near a non isolated blow-up point
The behavior of $u(x,t)$ near the singularity $(\hat a, T)$ is our second concern in this paper. We claim the following:

Theorem 2 (Blow-up behavior and profile near a blow-up point where u behaves as in (4) assuming S contains a continuum) Under the hypotheses of Theorems 1 and 1', there exists $t_0 < T$ such that for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ s.t. $d(x,S) \le K_0 \sqrt{(T-t)|\log(T-t)|}$, we have
$$\left| (T-t)^{\frac{1}{p-1}} u(x,t) - f_1\left( \frac{d(x,S)}{\sqrt{(T-t)|\log(T-t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|} \qquad (8)$$
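where $f_1$ is defined in (5). Moreover, for all $x \in \mathbb{R}^N \setminus S$, $u(x,t) \to u^*(x)$ as $t \to T$ with
$$u^*(x) \sim U(d(x,S)) \text{ as } d(x,S) \to 0 \text{ and } x \in B(\hat a, \delta) \qquad (9)$$
where $U(z) = \left( \frac{8p}{(p-1)^2} \frac{|\log z|}{z^2} \right)^{\frac{1}{p-1}}$ for $z > 0$.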
Remark: This is the first time that the blow-up profile $u^*$ is derived near a non-isolated point. Indeed, in the earlier work of Velazquez, the behavior along the "tangential" direction of $S$ was not derived. Estimate (8) shows that in a tubular neighborhood of $S$, the main term in the blow-up asymptotics is the one dimensional blow-up profile $f_1$, function of only the normal coordinate $\pm d(x,S)$.
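To make the role of the normal coordinate concrete, here is a minimal numerical sketch of the leading term suggested by estimate (8). It rests on two assumptions that are not taken from the statement itself: the profile is taken to be the standard one, $f_1(z) = (p-1+\frac{(p-1)^2}{4p}z^2)^{-\frac{1}{p-1}}$, and the blow-up set is taken to be a hypothetical circle in $\mathbb{R}^2$ (the sphere case mentioned in the constructive question). The function names are illustrative only.

```python
import math


def f1(z, p):
    """Assumed one dimensional profile f_1(z) = (p-1 + (p-1)^2 z^2/(4p))^(-1/(p-1))."""
    return (p - 1.0 + (p - 1.0) ** 2 * z * z / (4.0 * p)) ** (-1.0 / (p - 1.0))


def dist_to_circle(x, y, radius=1.0):
    """Distance d(x, S) for a hypothetical blow-up set S = circle of the given radius."""
    return abs(math.hypot(x, y) - radius)


def leading_term(x, y, t, T, p, radius=1.0):
    """Leading-order approximation of u at (x, y), time t, as suggested by (8).

    Only the normal coordinate d(x, S) enters; the position along S does not.
    """
    tau = T - t
    if not 0.0 < tau < 1.0:
        raise ValueError("need 0 < T - t < 1 so that |log(T - t)| > 0")
    z = dist_to_circle(x, y, radius) / math.sqrt(tau * abs(math.log(tau)))
    return tau ** (-1.0 / (p - 1.0)) * f1(z, p)


if __name__ == "__main__":
    T, t, p = 1.0, 0.99, 3.0  # illustrative values, not from the paper
    for r in (1.0, 1.05, 1.2):
        print(f"distance to S = {abs(r - 1.0):.2f}  leading term = {leading_term(r, 0.0, t, T, p):.4f}")
```

In this sketch two points at the same distance from $S$ receive the same value, whichever side of $S$ they lie on and wherever they sit along it, which is the one dimensional structure the remark describes. It says nothing about the size of the error term in (8).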
The major step towards Theorems 1, 1' and 2 is the proof of the stability of the behavior (4) in a neighborhood of $\hat a$ in $S$. Without such a stability, no further result could be obtained after Velazquez's result in [23] about the Hausdorff measure of $S$. The key argument in getting this stability is the Liouville Theorem of [18], stated above.

The error term in (8) shows that we fall in logarithmic scales $\nu = -1/\log(T-t)$ of the blow-up small parameter $\epsilon = T - t$. Further refinements in this direction should give an expansion of the solution in terms of powers of $\nu$, i.e., in logarithmic scales of $\epsilon$ (see Stewartson and Stuart [21]). Logarithmic scales also arise in some singular perturbation problems such as low Reynolds number fluids and some vibrating membranes studies (see Ward [24] and the references therein, see also Segur and Kruskal [19] for a Klein-Gordon equation). Since $\nu$ goes to zero slowly, infinite logarithmic series may be of only limited practical use in approximating the exact solution. Relevant approximations, i.e., approximations up to lower order terms such as $\epsilon^\beta$ for $\beta > 0$, lie beyond all logarithmic scales.

When the codimension of the blow-up set is one, namely when $l_{\hat a} = 1$, we do better, and get to error terms of order $(T-t)^\beta$ with $\beta > 0$. Our idea to capture such relevant terms is to abandon the explicit profile function obtained as a first order approximation, and take a less explicit function as a first order description of the singular behavior. Both formulations agree to the first order. Through scaling and matching, we can reach the order $\epsilon^\beta$ by iterating the expansion around the less explicit function.
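The remark above that the logarithmic parameter goes to zero slowly can be checked with a few numbers. The following sketch simply tabulates $\nu = -1/\log\epsilon$ against a power $\epsilon^\beta$; the exponent $\beta = 0.5$ and the sample values of $\epsilon$ are arbitrary choices made for illustration, not values from the paper.

```python
import math


def log_scale(eps):
    """Logarithmic small parameter nu = -1 / log(eps), for eps = T - t in (0, 1)."""
    return -1.0 / math.log(eps)


def power_scale(eps, beta):
    """Power-law small parameter eps ** beta, with beta > 0."""
    return eps ** beta


if __name__ == "__main__":
    beta = 0.5  # hypothetical exponent, chosen only for the comparison
    for k in (2, 4, 8, 16, 32):
        eps = 10.0 ** (-k)
        print(f"eps = 1e-{k:02d}   nu = {log_scale(eps):.4f}   eps^beta = {power_scale(eps, beta):.3e}")
```

Even at $\epsilon = 10^{-16}$, $\nu$ is still larger than $0.02$ while $\epsilon^{1/2}$ is already $10^{-8}$, which is why an error term of order $(T-t)^\beta$ is a much stronger statement than any finite number of terms in powers of $\nu$.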
3 Further refinements when the codimension of the blow-up set is one
A natural candidate for this non explicit function is simply a one dimensional solution of (1) that has the same profile $f_1$. It is classical that there exists a one dimensional even function $\hat u(x_1, t)$, solution of (1), which decays on $(0, \infty)$ and blows up at time $T$ only at the origin, with the profile $f_1$, in the sense that for all $K_0 > 0$ and $t \in [t_0, T)$, if $|x_1| \le K_0 \sqrt{(T-t)|\log(T-t)|}$, then
$$\left| (T-t)^{\frac{1}{p-1}} \hat u(x_1, t) - f_1\left( \frac{x_1}{\sqrt{(T-t)|\log(T-t)|}} \right) \right| \le C_0(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|} \qquad (10)$$
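(see Appendix A in [27] for a proof of this fact). Hence, it follows from (8) and (10), by the triangle inequality, that for all $K_0 > 0$, $t \in [t_0, T)$ and $x \in B(\hat a, \delta)$ such that
$$d(x,S) \le K_0 \sqrt{(T-t)|\log(T-t)|},$$
we have
$$(T-t)^{\frac{1}{p-1}} \left| u(x,t) - \hat u(d(x,S), t) \right| \le 2 C_0(K_0) \frac{\log|\log(T-t)|}{|\log(T-t)|}. \qquad (11)$$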
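This estimate remains valid even if we replace $\hat u(d(x,S), t)$ by any $\hat u_{\sigma(x,t)}(d(x,S), t)$ where $\hat u_\sigma$ is defined by
$$\hat u_\sigma(x_1, t) = e^{-\frac{\sigma}{p-1}}\, \hat u\left( e^{-\frac{\sigma}{2}} x_1,\; T - e^{-\sigma}(T-t) \right), \qquad (12)$$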
provided that |
which corresponds to a codimension 1 blow-up set. We claim the following:

Theorem 3 (The N dimensional solution seen as a superposition of one dimensional solutions of the normal variable to the blow-up set, with a suitable dilation) Under the hypotheses of Theorems 1 and 1' and if $l_{\hat a} = 1$ and $p > 3$, then for all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) < \epsilon_0$ for some $t_1 < T$, $\delta > 0$ and $\epsilon_0 > 0$, we have
$$\left| u(x,t) - \hat u_{\sigma(P_S(x))}(d(x,S), t) \right| \le h(x,t) \le M < +\infty, \qquad (13)$$
where $P_S(x)$ is the projection of $x$ over $S$ and $h(x,t) \to 0$ as $d(x,S) \to 0$ and $t \to T$.
Thus, when $p > 3$, all the singular terms of $u$ in a neighborhood of $(\hat a, T)$ are contained in the rescaled one dimensional solution $\hat u_{\sigma(P_S(x))}(d(x,S), t)$,
which shows that in a tubular neighborhood of the blow-up set $S$, the space variable splits into 2 independent variables:
- A primary variable, $d(x,S)$, normal to $S$. It accounts for the main singular term of $u$ and gives the size of $u(x,t)$, as already shown in the formulation (11), which follows directly from Theorem 2.
- A secondary variable, $P_S(x)$, whose effect is sharper. Through the optimal choice of the dilation $\sigma(P_S(x))$, it absorbs all next singular terms in the normal direction to $S$ at $P_S(x)$.

Similar ideas are used by Betterton and Brenner [3] in a chemotaxis model; see section 5 in [27] for a short discussion of connections with that work. We would like to mention that we have successfully used this idea of modulation of the dilation with Fermanian in [9] to prove that for $N = 1$ and $p > 3$, there is only one blow-up solution of (1) with the profile (4), up to a bounded function and to the invariances of the equation (the dilation and translations in space and in time).

Theorem 3 is a direct consequence of the following result which is valid also for $1 < p < 3$.

Theorem 4 (Blow-up behavior and profile near a blow-up point where u behaves as in (4) assuming S is locally an $(N-1)$-dimensional manifold) Under the hypotheses of Theorems 1 and 1' and without the restriction $p > 3$, if $l_{\hat a} = 1$, then there exist $t_1 < T$, $\delta > 0$ and $\epsilon_0 > 0$ such that:
i) For all $t \in [t_1, T)$ and $x \in B(\hat a, \delta)$ such that $d(x,S) < \epsilon_0$,
$$\left| u(x,t) - \hat u_{\sigma(P_S(x))}(d(x,S), t) \right| \le C\, m_M\left( (T-t)^{\frac{p-3}{2(p-1)}} |\log(T-t)|^{\frac{3}{2} + C_0},\; d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0} \right), \qquad (14)$$
where $P_S(x)$ is the projection of $x$ over $S$, $m_M = \min$ if $1 < p < 3$ and $m_M = \max$ if $p > 3$.
ii) If $x \notin S$, then $u(x,t) \to u^*(x)$ as $t \to T$ and
$$\left| u^*(x) - e^{-\frac{\sigma(P_S(x))}{p-1}}\, \hat u^*\left( e^{-\frac{\sigma(P_S(x))}{2}} d(x,S) \right) \right| \le C\, d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0},$$
where $\hat u^*(x_1) = \lim_{t \to T} \hat u(x_1, t)$.
Remark: In view of Theorem 2, we see from our new estimate that up to a suitable dilation, all the next terms in the expansion of $u^*$ up to the order $d(x,S)^{\frac{p-3}{p-1}} |\log d(x,S)|^{\frac{p}{p-1} + C_0}$ are the same as those of the particular one dimensional solution.

The splitting of the space variable $x$ into $d(x,S)$ and $P_S(x)$, as shown in (14), induces a geometric constraint on the blow-up set $S$, leading to more regularity on $S$.

Proposition 5 ($C^{1,\frac{1}{2}-\eta}$ regularity for $S$ and $C^{1-\eta}$ regularity for the dilation $\sigma$) Under the hypotheses of Theorems 1 and 1' and if $l_{\hat a} = 1$, then $S$ is the graph of a function $\varphi \in C^{1,\frac{1}{2}-\eta}(B_{N-1}(0, \delta_1), \mathbb{R})$, locally near $\hat a$, and $\sigma$ is a $C^{1-\eta}$ function, for any $\eta > 0$. More precisely, there is an $h_0 > 0$ such that for all $|\xi| < \delta_1$ and $|h| < h_0$ such that $|\xi + h| < \delta_1$, we have
$$|\varphi(\xi + h) - \varphi(\xi) - h \cdot \nabla\varphi(\xi)| \le \dots,$$
$$|\sigma(\xi, \varphi(\xi)) - \sigma(\xi + h, \varphi(\xi + h))| \le C |h|\, |\log|h||^{3 + C_0}.$$

The reader is referred to the papers [26] and [27] for proofs and details.

Acknowledgement: The author wants to thank Professors Mohammad Saleh and Edriss Titi for their invitation to the third Palestinian conference on math and math education where this work has been presented.
References

[1] J. M. Ball. Remarks on blow-up and nonexistence theorems for nonlinear evolution equations. Quart. J. Math. Oxford Ser. (2), 28(112):473-486, 1977.
[2] A. J. Bernoff, A. L. Bertozzi, and T. P. Witelski. Axisymmetric surface diffusion: dynamics and stability of self-similar pinchoff. J. Statist. Phys., 93(3-4):725-776, 1998.
[3] M. D. Betterton and M. P. Brenner. Collapsing bacterial cylinders. Preprint.
[4] M. P. Brenner, P. Constantin, L. P. Kadanoff, A. Schenkel, and S. C. Venkataramani. Diffusion, attraction and collapse. Nonlinearity, 12(4):1071-1098, 1999.
[5] J. Bricmont and A. Kupiainen. Universality in blow-up for nonlinear heat equations. Nonlinearity, 7(2):539-575, 1994.
[6] S. J. Chapman, B. J. Hunton, and J. R. Ockendon. Vortices and boundaries. Quart. Appl. Math., 56(3):507-519, 1998.
[7] K. Deng and H. A. Levine. The role of critical exponents in blow-up theorems: the sequel. J. Math. Anal. Appl., 2000.
[8] C. Fermanian Kammerer, F. Merle, and H. Zaag. Stability of the blow-up profile of non-linear heat equations from the dynamical system point of view. Math. Annalen, 317(2):195-237, 2000.
[9] C. Fermanian Kammerer and H. Zaag. Boundedness up to blow-up of the difference between two solutions to a semilinear heat equation. Nonlinearity, 13(4):1189-1216, 2000.
[10] H. Fujita. On the blowing up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$. J. Fac. Sci. Univ. Tokyo Sect. I, 13:109-124, 1966.
[11] Y. Giga and R. V. Kohn. Nondegeneracy of blowup for semilinear heat equations. Comm. Pure Appl. Math., 42(6):845-884, 1989.
[12] M. A. Herrero and J. J. L. Velazquez. Blow-up behaviour of one-dimensional semilinear parabolic equations. Ann. Inst. H. Poincaré Anal. Non Linéaire, 10(2):131-189, 1993.
[13] H. A. Levine. Some nonexistence and instability theorems for solutions of formally parabolic equations of the form $Pu_t = -Au + F(u)$. Arch. Rational Mech. Anal., 51:371-386, 1973.
[14] F. Merle. Solution of a nonlinear heat equation with arbitrarily given blow-up points. Comm. Pure Appl. Math., 45(3):263-300, 1992.
[15] F. Merle and H. Zaag. Reconnection of vortex with the boundary and finite time quenching. Nonlinearity, 10(6):1497-1550, 1997.
[16] F. Merle and H. Zaag. Stability of the blow-up profile for equations of the type $u_t = \Delta u + |u|^{p-1}u$. Duke Math. J., 86(1):143-195, 1997.
[17] F. Merle and H. Zaag. Optimal estimates for blowup rate and behavior for nonlinear heat equations. Comm. Pure Appl. Math., 51(2):139-196, 1998.
[18] F. Merle and H. Zaag. A Liouville theorem for vector-valued nonlinear heat equations and applications. Math. Annalen, 316(1):103-137, 2000.
[19] H. Segur and M. D. Kruskal. Nonexistence of small-amplitude breather solutions in