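and $n_1 = \lfloor \beta n/\alpha \rfloor$. Let $V = V_1 \cup V_2$, $|V_1| = n_1$, $|V_2| = n - n_1$. Then the probability that two vertices of $V_1$ are joined by an edge of $G_p$ is at most $\beta/n_1$. Hence for every fixed $\varepsilon > 0$ a.e. $G_p$ is such that $V_1$ contains a set $W_1$ of $m_1 = \lfloor (f(\beta) - \varepsilon) n_1 \rfloor$ independent vertices. Furthermore, a.e. $G_p$ has at least
$n_2 = \lfloor (1 - \varepsilon)(n - n_1)(1 - p)^{m_1} \rfloor$
vertices in $V_2$ which are independent of the vertices in $W_1$. The probability that two of these $n_2$ vertices are joined by an edge of $G_p$ is
$\frac{\alpha}{n} = \frac{\alpha n_2/n}{n_2}$,
so a.e. $G_p$ is such that at least $m_2 = \lfloor (f(\alpha n_2/n) - \varepsilon) n_2 \rfloor$ of these $n_2$ vertices are independent. Thus a.e. $G_p$ satisfies
$\beta_0(G_p) \ge m_1 + m_2$.
Since this inequality holds for every $\varepsilon > 0$, we have
$\alpha f(\alpha) \ge \beta f(\beta) + \gamma f(\gamma)$,
where $\gamma = (\alpha - \beta) e^{-\beta f(\beta)}$. Consequently
$g(\alpha) \ge g(\beta) + g(\gamma)$
if $\alpha = \gamma e^{g(\beta)} + \beta$ and so (14) holds.
Set $c = e^{g(1)}$ and $u_k = 1 + c + \dots + c^{k-1}$. Then $u_1 = 1$ so $g(u_1) = g(1)$. Suppose that $g(u_{k-1}) \ge (k-1) g(1)$ for some $k \ge 2$. Setting $\beta = 1$, $\gamma = u_{k-1}$ and $\alpha = \gamma e^{g(\beta)} + \beta = u_k$,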
Random graphs of small order
from (14) we find that
$g(u_k) \ge g(1) + g(u_{k-1}) \ge k\,g(1)$,
as claimed.
The function $f(\alpha)$:
$0.717 \le f(1) \le 0.872$, $0.563 \le f(2) \le 0.750$, $0.411 \le f(4) \le 0.583$.
Relation (14) implies that if $\alpha > 1$ then
$f(\alpha) \ge \frac{\log(\alpha + 1) + f(1) - \log 2}{\alpha} > \frac{\log(\alpha + 1) + 0.02}{\alpha}$.
a-1
Indeed,g(a)>log(a+l)fora
ee(l , and suppose that
c=6(”=
B. Bollobás, A. Thomason
so
+
>log(a+ 1) g (1) - log2. On the other hand, ifg(a)
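Theorem 15. Suppose $\alpha > 1$ and $b > 1$ are such that
$2b^2 \log b - 2b(b-1)\log(b-1) < \alpha$.   (16)
Then $f(\alpha) \le 1/b$.
Proof. In the proof of this result we may suppose that $k = n/b$ is an integer. Set $p = \alpha/n$ and write $E_k$ for the expected number of independent $k$-sets in $G_p$. It suffices to show that $\lim_{n \to \infty} E_k = 0$. Clearly $E_k = \binom{n}{k}(1-p)^{\binom{k}{2}}$, and by Stirling's formula $\frac{1}{n}\log E_k \to \log b - \frac{b-1}{b}\log(b-1) - \frac{\alpha}{2b^2}$, which is negative by (16).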
For small values of $\alpha$ the upper bound given by Theorem 15 is shown in Table 11. For large values of $\alpha$ the upper bound implied by Theorem 15 is essentially $\frac{2\log\alpha}{\alpha}$.
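The bound in Theorem 15 is easy to evaluate numerically. The sketch below assumes that the conclusion of the theorem reads $f(\alpha) \le 1/b$ (this reading reproduces the upper bounds 0.872, 0.750 and 0.583 quoted earlier for $\alpha = 1, 2, 4$); the function names are illustrative and not taken from the paper. Since the left-hand side of (16) increases with $b$, the best bound comes from the root of the equation obtained by replacing "<" with "=", which is found by bisection.

```python
import math


def lhs16(b: float) -> float:
    """Left-hand side of condition (16): 2 b^2 log b - 2 b (b - 1) log(b - 1)."""
    return 2.0 * b * b * math.log(b) - 2.0 * b * (b - 1.0) * math.log(b - 1.0)


def theorem15_upper_bound(alpha: float) -> float:
    """Return 1/b*, where b* > 1 solves lhs16(b*) = alpha.

    Assumption: Theorem 15 gives f(alpha) <= 1/b whenever lhs16(b) < alpha,
    so 1/b* is the limiting value of the bound.
    """
    lo = 1.0 + 1e-12          # lhs16 is close to 0 here
    hi = 2.0
    while lhs16(hi) < alpha:  # bracket the root
        hi *= 2.0
    for _ in range(200):      # plain bisection
        mid = 0.5 * (lo + hi)
        if lhs16(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 1.0 / lo


if __name__ == "__main__":
    for a in (1.0, 2.0, 4.0):
        print(a, round(theorem15_upper_bound(a), 4))
```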
Corollary 16. For a> 2.27 we have
a
Proof. For c! e'/' and b = ~the left-hand-side of(16) is 2 log @
(
2 b ( b - 1)log I +
a =-
--
log c!
For 2.27
,:I)
+2blogb
(1 +log@- log 2 -log loga) < a the corollary follows from Theorem 15 by a computer check.
0
Perhaps the easiest algorithm for the construction of a large independent set in a random graph is the so-called greedy algorithm. In fact, the same algorithm can be used to colour a random graph with relatively few colours (see Grimmett and McDiarmid [40], Bollobás and Erdős [17]). Let $G$ be any graph on $V = \{1, 2, \ldots, n\}$. The greedy algorithm constructs an independent set by running through the vertices in this order and selecting a vertex whenever it can be selected, i.e. whenever it is independent of the vertices selected so far. Thus vertex 1 belongs to the independent set, vertex 2 belongs to it iff it is not joined to 1, etc. Denote by $\beta_g(G)$ the cardinality of the independent set constructed by the greedy algorithm. The natural definition of $\chi_g(G)$ is not suitable for random graphs $G_p$ when $p$ is rather small. Indeed, if we defined $\chi_g(G)$ analogously to $\beta_g(G)$ then we would find that
for all constants $k \in \mathbb{N}$ and $c > 0$. This can be seen as follows. By applying induction on $k$ it is easily proved that for all $k \ge 1$ and $t \ge 2^{k-1}$ there is a tree $T$ with vertex set $W = \{1, 2, \ldots, t\}$ such that $\chi_g(T) \ge k$, where $\chi_g$ is taken with respect to the natural order on $W$. Since $\lim_{n \to \infty} P(G_{c/n} \supset T) > 0$, relation (17) does hold.
Hence it is more natural to define $\chi_g(G)$ as follows. Use the greedy algorithm to construct the colour classes one by one. If at any stage the graph $H$ spanned by the remaining vertices is such that all its components are trees and unicyclic
graphs then colour $H$ with $\chi(H)$ colours. (Clearly $\chi(H) = 3$ if $H$ contains an odd cycle, otherwise $\chi(H) \le 2$.) Denote by $\chi_g(G)$ the number of colours used by this algorithm. By definition
for every graph G on V.
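The greedy construction of an independent set described above can be sketched in a few lines. This is only an illustration: the graph representation (a dictionary of neighbour sets) and the helper `random_graph` are assumptions of the sketch, not part of the paper.

```python
import math
import random


def random_graph(n, p, rng):
    """Sample G(n, p) on vertices 1..n as a dict of neighbour sets."""
    adj = {v: set() for v in range(1, n + 1)}
    for u in range(1, n + 1):
        for v in range(u + 1, n + 1):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def greedy_independent_set(adj):
    """Run through the vertices in natural order, selecting a vertex whenever
    it is joined to none of the vertices selected so far."""
    chosen = []
    blocked = set()  # neighbours of the vertices selected so far
    for v in sorted(adj):
        if v not in blocked:
            chosen.append(v)
            blocked |= adj[v]
    return chosen


if __name__ == "__main__":
    n, alpha = 1000, 3.0
    g = random_graph(n, alpha / n, random.Random(1))
    # Compare the greedy set size with n * log(alpha + 1) / alpha.
    print(len(greedy_independent_set(g)), n * math.log(alpha + 1) / alpha)
```

For $p = \alpha/n$ the size of the greedy set is typically close to $n\log(\alpha+1)/\alpha$, which is the quantity printed for comparison.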
Theorem 17. Let $p = \alpha(n)/n$, $\alpha(n) = o(n/\log n)$. Then, for
. rf
holds with probability 1 - 0
euerji jLml positive c
c4>0 is a coilstant theri with probability
1- O ( n - l )
f (43 /J,(G,) 3
log(a+ 1)
n.
c4
Proof. Suppose that when the greedy algorithm tests vertex $j$, there are $k - 1$ vertices in the independent set selected so far. Then the probability that $j$ will be added to the independent set as the $k$-th vertex of that set is exactly
$p_k = (1 - p)^{k-1}$.
Let $X_1, X_2, \ldots$ be independent geometric random variables with $E(X_k) = p_k^{-1} = (1 - \alpha/n)^{-(k-1)}$. (Thus $P(X_k = s) = p_k(1 - p_k)^{s-1}$, $s = 1, 2, \ldots$.) Set
x=
1
I =I
Then by (1 8)
The geometric random variable X, has inean p;' riogu-& Hence with 1= -___ n we have
1
and variance ( I - p k ) p k 2 .
-1+1
U
logn - loga+(l- 1) Simple calculations show that our conditions imply that
for some fixed ?1>0, provided n is sufficiently large. We have the following estimate of the variance of Y,
By Chebyshev’s inequality, (20) and (21) give P(/I,(G,)n)=O
(r) (:) =O
-
,
as claimed. Suppose now that a is constant. Set I=
1-
(
I--
<@(l-’I)
(
1---
for some $\eta > 0$, so the first part of (19) implies (20), and the assertion follows from (21). It is rather interesting that Theorem 17 gives the same lower bound for $f(\alpha)$ as Corollary 16. In fact, the proof shows also that the lower bounds in Theorem 17
are essentially best possible: for $\alpha(n) > 0$, $\alpha(n) = o(n/\log n)$ a.e. $G_p$ is such that
$\beta_g(G_p) = \frac{\log(\alpha + 1)}{\alpha}\, n + o\!\left(\frac{n}{\alpha}\right)$.
However, the main advantage of Theorem 17 is that the graph obtained after the deletion of the independent set is still a random graph, so the process can be repeated and we can obtain a lower bound for $\chi_g(G_p)$.
Theorem 18. Let $p = \alpha(n)/n$ and suppose $\alpha(n)$
a
Proof. If $\alpha < 1$ is a constant and $p = \alpha/n$ then a.e. $G_p$ is a union of components each of which is a tree or a unicyclic graph. Hence in this case $\chi_g(G_p) \le 3$. Let $\alpha > 1$ be a constant. Then by the second part of Theorem 17 a.e. $G_p$ is such that the first colour class found by the greedy algorithm has at least
n l =[‘og(a+’)-crz-/ - -___ a most n , = n - n , CI ii
-
vertices. In the subgraph spanned by the remaining at
vertices the probability of an edge is a
112
n2
ii
a--log(Cc+1)+c - a’
Hence if a.e. G,,,,, satisfies
n2
‘2 2
for all &
x (G,,,,)
k +3
whenever a
< 1 5 log 15, -log + c
a1 LYl
1
for 1 6 6 1 6 k
and
c a,=o(n) k
I = 16
then
for a.e. Gnk/,,.If a,=Ilog f < n l l z for 15,
+
i n f (u1- - a1 log ul) 16614k
=
i n f :(l-l)Iog(l-I)-lIogl+Iogl+logIogl) 16Sldk
= 15 log (15/16)
+log log 16> 0 .
This completes the proof of our theorem.
0
If $\alpha$ is not too small then the greedy algorithm works rather well, even without a good colouring of the sparse subgraph at the end.
Theorem 19. Suppose $\alpha(n)$
where q = 1 - p .
3lo.glogn _. log c1
&=---
Proof. What is the probability that the simple greedy algorithm uses the $(k+1)$-st colour to colour vertex $j + 1$? It is at most
$(1 - q^{j/k})^k$.
Hence
$P(\chi_g(G_p) > k) \le \sum_{j=0}^{n-1} (1 - q^{j/k})^k$,
proving the theorem. □
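The simple greedy colouring analysed in this proof can be sketched as follows; it gives each vertex, in natural order, the first colour not yet used on its neighbours, so its first colour class is exactly the greedy independent set. The bound function assumes that the estimate in the proof reads $\sum_{j=0}^{n-1}(1 - q^{j/k})^k$, which is a reconstruction; all names are illustrative.

```python
def first_fit_colouring(adj):
    """Colour vertices in natural order with the smallest colour (1, 2, ...)
    that does not appear on an already coloured neighbour."""
    colour = {}
    for v in sorted(adj):
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour


def greedy_colour_bound(n, p, k):
    """Assumed form of the bound on P(more than k colours are used):
    sum over j = 0..n-1 of (1 - q^(j/k))^k with q = 1 - p."""
    q = 1.0 - p
    return sum((1.0 - q ** (j / k)) ** k for j in range(n))


if __name__ == "__main__":
    cycle5 = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
    print(first_fit_colouring(cycle5))
```

The bound is only informative once it drops below 1, which happens for $k$ somewhat larger than the typical number of colours used.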
Once again, it is very easy to see that the upper bound is essentially best posslog U l q ) ibfe: a.e. G, satisfies xy(G,)--- / I . Thus for O
- log (1 - a / n )
x0(C,,J=(l +o(l)) - -
-
log(Cc+ 1)
n+0(1)
for a.e. $G_{\alpha/n}$. If $p$ is constant (say $p = 1/2$) then the independence number of $G_{n,p}$ is a.s. about $2\log_{1/q} n$ (see Matula [64], [65], Grimmett and McDiarmid [40], Bollobás and Erdős [17], Chvátal [20]). This is proved by estimating $E(X_k)$ and $\sigma(X_k)$, where $X_k$ is the number of $k$-sets of independent vertices, and then applying Chebyshev's inequality. For small values of $n$ the independence number does not have a very sharp peak, but the exact values of $E(X_k)$ and $\sigma(X_k)$ still tell us a fair amount about the distribution of $\beta_0(G_{n,p})$. Clearly, with $q = 1 - p$,
$E(X_k) = \binom{n}{k} q^{\binom{k}{2}}$
and
$\sigma^2(X_k) = E(X_k)^2 \sum_{l=2}^{k} \frac{\binom{k}{l}\binom{n-k}{k-l}}{\binom{n}{k}} \left(q^{-\binom{l}{2}} - 1\right)$,   (22)
and by Chebyshev's inequality
$P(X_k = 0) \le \sigma^2(X_k)/E(X_k)^2 = U(X_k)$.   (23)

Table 12. Independent k-sets in $G_{n,1/2}$

     n    k      E(X_k)    U(X_k)
    40    5      642.58     0.181
    40    6      117.13     0.601
    40    7        8.89     2.511
    40    8        0.29    23.502
    60    6     1527.83     0.198
    60    7      184.16     0.589
    60    8        9.53     2.504
    60    9        0.22    77.810
    80    7     1514.78     0.250
    80    8      107.99     0.746
    80    9        3.37     4.079
    80   10        0.05    84.320
   100    7     7633.00     0.138
   100    8      693.23     0.346
   100    9       27.68     1.224
   100   10        0.49    12.818
   200    9    17101.98     0.105
   200   10      638.10     0.236
   200   11       10.76     1.103
   200   12        0.08    37.313
   400   11    25386.89     0.057
   400   12      401.84     0.119
   400   13        2.93     1.359
   600   12    55133.78     0.033
   600   13      608.83     0.063
   600   14        3.12     0.967
   800   12      > 10^5     0.017
   800   13    26484.02     0.026
   800   14      181.74     0.059
   800   15        0.58     3.591
  1000   13      > 10^5     0.015
  1000   14     4228.26     0.022
  1000   15       16.96     0.178
  1000   16        0.03    50.685
  2000   17        3.95     0.434
  3000   18        5.04     0.305
  4000   19        0.72     1.841
  5000   19       50.62     0.031
  5000   20        0.02    50.261
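The quantities tabulated in Table 12 can be recomputed directly. The sketch below uses the standard first and second moment expressions for the number $X_k$ of independent $k$-sets in $G_{n,p}$, namely $E(X_k) = \binom{n}{k} q^{\binom{k}{2}}$ and $U(X_k) = \sigma^2(X_k)/E(X_k)^2 = \sum_{l=2}^{k} \binom{k}{l}\binom{n-k}{k-l}\binom{n}{k}^{-1}(q^{-\binom{l}{2}} - 1)$ with $q = 1 - p$; the function names are illustrative.

```python
from math import comb


def expected_independent_k_sets(n, k, p=0.5):
    """E(X_k): expected number of independent k-sets in G(n, p)."""
    return comb(n, k) * (1.0 - p) ** comb(k, 2)


def chebyshev_ratio(n, k, p=0.5):
    """U(X_k) = Var(X_k) / E(X_k)^2, an upper bound for P(X_k = 0).

    Two k-sets sharing l vertices share C(l, 2) potential edges, which is
    where the factor q^(-C(l, 2)) - 1 comes from.
    """
    q = 1.0 - p
    total = 0.0
    for l in range(2, k + 1):
        weight = comb(k, l) * comb(n - k, k - l) / comb(n, k)
        total += weight * (q ** (-comb(l, 2)) - 1.0)
    return total


if __name__ == "__main__":
    for n, k in ((40, 5), (40, 8), (100, 8)):
        print(n, k, expected_independent_k_sets(n, k), chebyshev_ratio(n, k))
```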
Table 12 shows the information one can derive from (22) and (23) in the case $p = 1/2$ for all those values of $k$ for which $E(X_k) > 0.01$ and $U(X_k) > 0.01$.

Table 13. Independent sets found by greedy algorithms in random graphs of order n with probability P(edge) = c/n

n = 5000, c = 3
  Hammer   Simple greedy   Low first   High first
    2574            2071        2319         1731
    2541            2177        2526         1922
    2548            2206        2541         1930
    2550            2197        2543         1905
    2524            2188        2525         1915
    2522            2221        2506         1903
    2564            2190        2540         1883
    2539            2239        2540         1879
    2541            2206        2538         1915
    2540            2229        2536         1927

n = 5000, c = 4
  Hammer   Simple greedy   Low first   High first
    2246            1923        2156         1630
    2237            1950        2242         1697
    2250            1981        2232         1682
    2243            1937        2248         1716
    2253            1964        2240         1673
    2262            1999        2265         1685
    2248            1953        2234         1698
    2239            1966        2249         1732
    2252            1950        2239         1714
    2268            1999        2258         1702

n = 10000, c = 3
  Hammer   Simple greedy   Low first   High first
    5089            4101        4595         3534
    5084            4363        5038         3875
    5068            4398        5047         3853
    5081            4391        5083         3823
    5111            4452        5091         3848
    5089            4421        5070         3797
    5120            4421        5111         3846
    5089            4410        5088         3887
    5092            4109        5082         3835
    5114            4313        5091         3893
For large values of $n$ (5000 and 10000) we ran four algorithms on random graphs $G_{n,c/n}$, $c = 3$ and 4, searching for large independent sets. All four algorithms choose vertices one by one. The first algorithm, called hammer, always chooses an available vertex of minimum degree. In the other three algorithms the vertices are ordered at the beginning and at each stage we choose the first available vertex. In simple greedy the natural order is taken, in low first (greedy) we choose an order in which the degrees are non-decreasing and in high first (greedy) we choose an order in which the degrees are non-increasing. The outcomes are shown in Table 13. Corresponding colouring results are shown in Table 14. In addition to the number of colours used, we show the size and the first element of each colour class.

Table 14. Colouring members of $\mathcal{G}(n, c/n)$ by greedy algorithms

n = 4000, c = 3.0
  Algorithm  Colours  Class sizes and first members
  Simple     5        (1853, 1) (1280, 38) (669, 235) (186, 1133) (12, 3336)
  Low        6        (2025, 1) (1083, 287) (601, 1147) (245, 2609) (43, 3495) (3, 3937)
  High       4        (1614, 1) (1524, 32) (807, 101) (55, 588)
  Simple     5        (1760, 1) (1366, 55) (651, 432) (206, 947) (17, 2609)
  Low        6        (2036, 1) (1084, 40) (566, 1718) (252, 2624) (58, 3542) (4, 3975)
  High       4        (1566, 1) (1574, 14) (790, 90) (70, 325)

n = 4000, c = 4.0
  Algorithm  Colours  Class sizes and first members
  Simple     6        (1596, 1) (1233, 38) (781, 317) (338, 1179) (50, 2299) (2, 3728)
  Low        6        (1794, 1) (1092, 160) (634, 1094) (339, 2171) (125, 3305) (16, 3872)
  High       5        (1358, 1) (1359, 16) (1035, 120) (243, 516) (5, 1433)
  Simple     6        (1606, 1) (1227, 49) (759, 375) (345, 1103) (62, 1967) (1, 3520)
  Low        7        (1798, 1) (1065, 48) (661, 1145) (324, 2331) (132, 3154) (19, 3588) (1, 3957)
  High       5        (1348, 1) (1360, 23) (990, 87) (295, 314) (7, 1030)

n = 10000, c = 2.5
  Algorithm  Colours  Class sizes and first members
  Simple     5        (5006, 1) (3235, 38) (1441, 1216) (300, 3329) (18, 6868)
  Low        6        (5453, 1) (2687, 910) (1325, 3353) (464, 6900) (68, 8903) (3, 9919)
  High       4        (4371, 1) (3927, 8) (1643, 267) (59, 1017)

n = 10000, c = 3.0
  Algorithm  Colours  Class sizes and first members
  Simple     5        (4605, 1) (3216, 38) (1655, 621) (483, 3282) (41, 4816)
  Low        6        (5098, 1) (2706, 703) (1453, 3262) (595, 6397) (140, 8564) (8, 9797)
  High       4        (4038, 1) (3745, 21) (2050, 263) (167, 1443)
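The four selection rules compared in Tables 13 and 14 can be sketched as below. Two points are assumptions of this sketch rather than statements of the paper: ties are broken by vertex label, and in hammer the "degree" of an available vertex is measured in the graph induced by the vertices that are still available.

```python
def ordered_greedy(adj, order):
    """Scan `order`, keeping each vertex not joined to a vertex already kept."""
    chosen, blocked = [], set()
    for v in order:
        if v not in blocked:
            chosen.append(v)
            blocked |= adj[v]
    return chosen


def simple_greedy(adj):
    return ordered_greedy(adj, sorted(adj))


def low_first(adj):
    # degrees non-decreasing along the order
    return ordered_greedy(adj, sorted(adj, key=lambda v: (len(adj[v]), v)))


def high_first(adj):
    # degrees non-increasing along the order
    return ordered_greedy(adj, sorted(adj, key=lambda v: (-len(adj[v]), v)))


def hammer(adj):
    """Repeatedly pick an available vertex of minimum degree (assumed to mean
    degree among the still available vertices) and discard its neighbours."""
    avail = {v: set(nb) for v, nb in adj.items()}
    chosen = []
    while avail:
        v = min(avail, key=lambda x: (len(avail[x]), x))
        chosen.append(v)
        removed = avail[v] | {v}
        for u in removed:
            for w in avail[u]:
                if w not in removed:
                    avail[w].discard(u)
        for u in removed:
            del avail[u]
    return chosen


if __name__ == "__main__":
    star = {1: {2, 3, 4, 5}, 2: {1}, 3: {1}, 4: {1}, 5: {1}}
    for rule in (hammer, simple_greedy, low_first, high_first):
        print(rule.__name__, rule(star))
```

The quadratic-time minimum search in `hammer` keeps the sketch short; a bucket queue keyed by degree would be the natural choice for graphs of order 5000 or more.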
6. Colouring large random graphs
If O
The lower bound is the smallest number of colours for which the expected number of colourings is not vanishingly small and the upper bound is the number of colours used by the greedy algorithm (see Grimmett and McDiarmid [40], Bollobás and Erdős [17]). It is generally believed that the lower bound gives the correct value for $\chi(G)$. This section presents the results of an attempt to create an efficient colouring algorithm for colouring random graphs, which apart from its intrinsic usefulness might cast light on the asymptotic value of $\chi(G)$. The graphs used as input to the algorithm were random graphs with $n = 1000$, $p = 1/2$. The value 1000 was chosen as the largest manageable on the machine used, the IBM 3081 at Louisiana State University. For these graphs, we can show
$80 \le \chi(G) \le 126$.
The expected number $E(k)$ of colourings with $k$ colours, a sum over all sequences $(a_i)$ of length $k$ with $\sum a_i = n$ in which $n_j$ denotes the number of $a_i$ which equal $j$, can be computed when $n = 1000$ and $k = 79$ or 80. One finds that $E(79)$ is vanishingly small whereas $E(80) > 10^{14}$, so $\chi \ge 80$ almost surely. (Incidentally, the expected number of equitable colourings with 80 colours, having 40 colour classes of size 12 and 40 of size 13, is less than 0.08. The "most likely" sequence $(a_i)$ is 26 classes each of sizes 12 and 13, 12 each of sizes 11 and 14 and 2 each of sizes 10 and 15.) The upper bound of 126 represents the average number of colours used by the greedy algorithm. Table 15 gives the number of colours used by the greedy algorithm in 1000 simulated colourings. The distribution of this data is very close to normal with $\mu = 126.55$, $\sigma = 1.332$. The problem above has been studied by Johri and Matula [43], Johnson [42] and Leighton [56], among others. The best results known so far for graphs of this type appear to be those of Johri and Matula [43], who used an average of 95.9 colours. Our algorithm used an average of 86.9 colours. This clearly indicates the inefficiency of the greedy algorithm.
Table 15. Colouring 1000 graphs of order 1000 by the greedy algorithm

  Colours         122  123  124  125  126  127  128  129  130  131
  No. of graphs     1   14   54  151  259  281  157   65   17    1

Mean number of colours 126.6. Time 1 min 30 sec.
The basic strategy of our algorithm was the natural one: find the largest independent set you can, remove it, and repeat the process on the remaining graph. The difficult part is finding a large independent set. It should be noted that this procedure is very unlikely to produce a colouring with only 80 colours, even if one could find the actual largest independent set at each stage. The data in a fuller version of Table 12 would indicate that such a procedure certainly uses 85 colours and almost surely 86. Thus our average of 86.9 is close to best possible. One simple refinement of this strategy however was of fundamental importance. Namely, if several independent sets of order $j$ were found but none of order $j + 1$, the set of order $j$ removed was not just the first one found but the one with the largest degree sum. That is, the set was chosen to minimize the density of the remaining graph. An average of six colours were saved compared with the basic strategy of removing the first large set found. The effectiveness of this modification is also seen in that the density of the remaining graph in the basic algorithm increased from 0.50 to 0.52 or more, whereas in the modified algorithm it eventually went down to below 0.47. The method of finding independent sets should be mentioned. The procedure here was again the most natural, producing all independent sets of a graph with vertex set $\{1, 2, \ldots, n\}$. When at some stage the current independent set is $\{v_1, v_2, \ldots, v_k\}$ the next one is formed by adding $v_{k+1}$, the smallest vertex greater than $v_k$ and joined to none of $v_1, v_2, \ldots, v_k$. If no such vertex exists, $v_k$ is replaced by $v_k'$, the least vertex greater than $v_k$ joined to none of $v_1, v_2, \ldots, v_{k-1}$. This process is clearly of exponential order, and the time needed increases rapidly with $n$.
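The exhaustive enumeration and the "largest degree sum" refinement can be sketched as follows. The generator visits the independent sets in the same lexicographic order as the procedure described above; the helper names are illustrative, and the sketch is only practical for small graphs because the enumeration is exponential.

```python
def independent_sets(adj):
    """Yield every non-empty independent set as an increasing list of vertices.
    Extending by the smallest admissible larger vertex, then backtracking,
    reproduces the order described in the text."""
    def extend(current, candidates):
        for i, v in enumerate(candidates):
            new = current + [v]
            yield new
            # later candidates must also be non-adjacent to v
            yield from extend(new, [u for u in candidates[i + 1:] if u not in adj[v]])
    yield from extend([], sorted(adj))


def best_independent_set(adj):
    """A largest independent set; among those, one with the largest degree sum."""
    best, best_key = [], (0, -1)
    for s in independent_sets(adj):
        key = (len(s), sum(len(adj[v]) for v in s))
        if key > best_key:
            best, best_key = s, key
    return best


def colour_by_removal(adj):
    """Repeatedly remove the set chosen by best_independent_set."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    classes = []
    while remaining:
        s = best_independent_set(remaining)
        classes.append(s)
        for v in s:
            del remaining[v]
        for v in remaining:
            remaining[v] -= set(s)
    return classes


if __name__ == "__main__":
    path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(list(independent_sets(path)))
    print(colour_by_removal(path))
```

Removing the set with the largest degree sum deletes as many edges as possible, which is what lowers the density of the remaining graph.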
On an IBM 3081 (5 MIPS) with a program written in FORTRAN, a random graph of order 180 took 5 seconds and one of order 280 took 60 seconds (see Table 16). A thorough examination of a graph of order 400 by this method was impossible.

Table 16. The time taken to find the independence number $\beta_0$ of $G_{n,1/2}$

    n   Mean β0   Mean time
  100       9.2        0.34
  120       9.6        0.78
  140      10.1        1.61
  150      10.4        2.15
  160      10.4        3.14
  170      10.5        4.29
  180      10.7        5.70
  190      10.8        7.79
  200      11.0        9.57
  210      11.1       13.0
  220      11.2       16.5
  230      11.2       20.7
  240      11.4       26.7
  250      11.0       35.6
  260      11.8       41.8
  270      11.8       50.5
  280      12.2       57.7
  290      12.0       76.6
  300      12.0       92.3

Hence a restricted search was required to find independent sets in graphs of order up to 1000. Of a great number of strategies tried for this restricted search, one was clearly superior, and was the only one ever to find an independent set of order 15 in a graph of order 1000. The strategy was this: at some stage, we have an independent set of vertices $v_1, \ldots, v_k$. Let $W_k$ be the subgraph induced by the vertices joined to none of $v_1, \ldots, v_k$. Instead of choosing $v_{k+1}$ to be the first vertex in $W_k$, we choose the first vertex $u$ with
$|W_k| - d_{W_k}(u) - 1 \ge f(k+1, n)$,
where $f$ is some predetermined function. This guarantees that at the next stage there will be many vertices joined to none of $v_1, \ldots, v_{k+1}$. Notice that since $v_k$ was chosen subject to the same rule we have $|W_k| \ge f(k, n)$. The trick lies in choosing a suitable function $f$. Setting $f \equiv 0$ makes the restricted search the same as the exhaustive search. The function we used is given by
$f(k+1, n) = q f(k, n) + \lambda \sqrt{p q f(k, n)}$,
where $f(0, n) = n$, $p = 1 - q$ is the density of the graph and $\lambda$ is some parameter. The idea of this is that if $|W_k| = f(k, n)$ then the mean value for $|W_k| - d_{W_k}(u) - 1$ is about $q f(k, n)$ and the standard deviation is $\sqrt{p q f(k, n)}$. Variation of the parameter $\lambda$ allows for a greater or lesser thoroughness of search. The performance of this algorithm is illustrated by Table 17; some of the entries can be compared with those in Table 16. In both tables the times are given in seconds. Quite possibly other choices of $f$ would give better results, but computer time was not available to allow experimentation.

Table 17. Large independent sets found in $G_{n,1/2}$ using the restricted search

     n      λ   Mean β0   Mean time
   150    0.6       7.7        0.02
   150    0.4       9.9        0.05
   150    0.2       9.9        0.07
   150    0.0      10.1        0.15
   150   -0.2      10.2        0.20
   200    0.6      10.2        0.04
   200    0.4      10.3        0.11
   200    0.2      10.7        0.20
   200    0.0      10.6        0.22
   200   -0.2      10.8        0.58
   250    0.6      10.7        0.10
   250    0.4      10.7        0.18
   250    0.2      10.9        0.40
   250    0.0      11.0        0.88
   250   -0.2      11.0        1.43
   300    0.6      11.1        0.17
   300    0.4      11.1        0.42
   300    0.2      11.6        1.18
   300    0.0      11.8        1.38
   300   -0.2      11.9        2.89
   500    0.5      12.3        1.32
   500    0.4      12.5        2.03
   500    0.3      13.0        3.95
   500    0.2      13.0        6.32
  1000    0.5      14.0       24.2
  1000    0.4      14.1       45.4
  1000    0.3      14.4       84.4
  1000    0.2      14.4      127.6

The colouring algorithm then consisted of repeated applications of the above restricted search algorithm. A slight improvement was obtained by, at each stage, swapping a vertex in the remaining graph with one in a previous colour class if this was possible and if it decreased the density of the remaining graph. When the remaining graph became small the exhaustive algorithm took over from the restricted search to find independent sets; the number of vertices below which this was applied is denoted in the tables by $r$. Moreover, the very last few classes tended to be very small and it was usually worthwhile finding the exact chromatic number of the remaining graph once it was small enough. The order of the subgraph so coloured is denoted by $s$ in the tables. (Subgraphs of order 40 could be coloured exactly in a few seconds and it was often quite feasible to colour up to 50.) The value of $\lambda$ used in the restricted search was not a constant, but was allowed to decrease as the remaining graph decreased. This was done by specifying two parameters $\alpha$ and $\beta$, and letting $\lambda$ decrease linearly from $\alpha$ to $\beta$ as the order of the remaining graph decreased from 1000 to $s$. A non-linear variation would probably give better results but again we were unable to experiment. This more or less completes the description of the algorithm. In practice our aim was to get the best colouring we could inside one hour, and to this end we tried three variants of the basic algorithm.
A) Choose $\alpha$ and $\beta$ small enough to consume one hour of computer time.
B) Bound the size of the independent set to be chosen (say by 13) in the hope that, although this may not be the largest such set, the large number of sets of this size would offer an opportunity to choose one which would greatly decrease the density of the remaining graph.
C) Compare, at each stage, the size of the independent set found with the size one would expect to find in a graph of that order and density. If the expected size is greater, make another more thorough search using a smaller value of $\lambda$ in the hope of finding a larger set.
The value of $\lambda$ is computed by replacing $\alpha$ and $\beta$ by smaller parameters $\gamma$ and $\delta$.

Table 18. Colourings of ten graphs of order 1000, with α = 0.5, β = -0.1, γ = 0.2, δ = -0.5, r = 230. (For an explanation of these parameters, see the text.)

  Graph number    s   Time taken   Colours used
             1   36        48:38             87
             2   34        51:56             86
             3   40        54:58             87
             4   34        49:11             87
             5   34        55:23             87
             6   39        43:45             87
             7   48        50:16             87
             8   36        55:31             87
             9   40        56:16             87
            10   40        67:10             87

Mean time: 53:23, mean colours used: 86.9.
Table 19 Six colourings of the same graph of order 1000. The parameter 6 denotes a bound on the size of an independent set chosen
r
S
a
B
230 230 230 230 230 230
38 36 40 39 43 39
0.5 0.5 0.3 0.3 0.4 0.3
-0.11 -0.1 -0.3 -0.3 -0.2 -0.5
6
Y
b 13 13
0.2 0.0
-0.5 -1.0
Time
Colours
25:07 28:40 60: I5 83:43 60:05 11454
88 88 88 88 87 87
Table 20. Quick colourings of ten graphs of order 1000, with α = 1.0, β = 0.1 and r = 100

  Graph number    s   Time taken   Colours used
             1   28         3:07             92
             2   25         3:24             90
             3   35         2:46             91
             4   25         3:22             90
             5   27         3:09             91
             6   30         3:05             91
             7   28         3:13             91
             8   25         3:07             91
             9   36         3:18             90
            10   35         3:35             90

Mean time: 3:13, mean colours used: 90.7
The third of these variants appears to be best. The results of colouring ten graphs this way are presented in Table 18. Some results from all the variants are given in Table 19 for comparison. Finally, Table 20 gives results of colouring ten graphs using large values of $\lambda$ where the aim was to produce a good but quick colouring. The times taken are shown as min:sec.
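The restricted search of this section can be sketched as a pruned version of the exhaustive backtracking search. Several details are assumptions of the sketch, not facts from the paper: the threshold recursion is taken to be $f(k+1, n) = q f(k, n) + \lambda\sqrt{pqf(k, n)}$ with $f(0, n) = n$ (clipped at 0), the set $W_k$ is taken to contain only vertices larger than the last vertex chosen, and the search simply returns the largest set it meets.

```python
import math


def thresholds(n, p, lam, depth):
    """f(0), ..., f(depth) under the assumed recursion
    f(k+1) = q f(k) + lam * sqrt(p q f(k)), f(0) = n, clipped at 0."""
    q = 1.0 - p
    f = [float(n)]
    for _ in range(depth):
        prev = f[-1]
        f.append(max(0.0, q * prev + lam * math.sqrt(p * q * prev)))
    return f


def restricted_search(adj, p, lam):
    """Backtracking over independent sets in which a vertex u may extend
    v_1..v_k only if at least f(k+1) candidates remain afterwards.
    With thresholds identically 0 this is the exhaustive search."""
    vertices = sorted(adj)
    f = thresholds(len(vertices), p, lam, len(vertices))
    best = []

    def extend(current, w):
        nonlocal best
        if len(current) > len(best):
            best = list(current)
        k = len(current)
        for u in w:
            w_next = [x for x in w if x > u and x not in adj[u]]
            if len(w_next) >= f[k + 1]:
                extend(current + [u], w_next)

    extend([], vertices)
    return best


if __name__ == "__main__":
    cycle5 = {1: {2, 5}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 1}}
    print(restricted_search(cycle5, 0.5, -10.0))  # thresholds are 0: exhaustive
    print(restricted_search(cycle5, 0.5, 10.0))   # thresholds too high: nothing found
```

Larger values of `lam` prune more branches and so give a quicker but less thorough search, which matches the way the parameter is used in Tables 17 to 20.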
7. Random regular graphs
Let $\mathcal{G}(n, r\text{-reg})$ be the set of $r$-regular graphs with vertex set $V = \{1, 2, \ldots, n\}$. Turn $\mathcal{G}(n, r\text{-reg})$ into a probability space by giving all members the same probability. A point of this space is a random $r$-regular graph of order $n$ and is denoted by $G_{r\text{-reg}}$. Properties of $G_{r\text{-reg}}$ are considerably harder to study than those of $G_M$ and $G_p$, mostly because the set $\mathcal{G}(n, r\text{-reg})$ cannot be constructed nearly as simply as, say, $\mathcal{G}(n, M)$. Bender and Canfield [4] gave an asymptotic formula for $|\mathcal{G}(n, r\text{-reg})|$ as $n \to \infty$. Bollobás [9] gave a simpler proof by using many probabilistic ideas. The method also enables one to generate regular graphs rather easily. Several properties of the space $\mathcal{G}(n, r\text{-reg})$ have been studied by Wormald [83], [84]. Two important questions about regular graphs concern the minimal diameter of an $r$-regular graph of order $n$ and the minimal order of an $r$-regular graph of given girth (see Bermond and Bollobás [6]). It is not inconceivable that random regular graphs can be used to tackle both questions.
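One common way to generate random regular graphs is the pairing (configuration) model with rejection, sketched below together with a diameter computation. Whether this matches the program the authors actually used is an assumption; the function names are illustrative.

```python
import math
import random
from collections import deque


def random_regular_graph(n, r, rng, max_tries=1000):
    """Pair up the n*r 'half-edges' uniformly at random and accept the result
    only if it has no loops and no multiple edges. Accepted graphs are
    uniformly distributed over the r-regular graphs on {1, ..., n}."""
    if (n * r) % 2:
        raise ValueError("n * r must be even")
    points = [v for v in range(1, n + 1) for _ in range(r)]
    for _ in range(max_tries):
        rng.shuffle(points)
        adj = {v: set() for v in range(1, n + 1)}
        simple = True
        for i in range(0, len(points), 2):
            u, v = points[i], points[i + 1]
            if u == v or v in adj[u]:
                simple = False
                break
            adj[u].add(v)
            adj[v].add(u)
        if simple:
            return adj
    raise RuntimeError("no simple pairing found")


def diameter(adj):
    """Largest distance between two vertices (math.inf if disconnected)."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(adj):
            return math.inf
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    g = random_regular_graph(50, 3, random.Random(7))
    print(diameter(g))
```

For fixed $r$ the acceptance probability stays bounded away from 0 as $n$ grows (roughly $e^{-(r^2-1)/4}$), so rejection is cheap for the small degrees considered here.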
In fact, for a fixed $r$ and large $n$ the best upper bound for the first function was obtained by Bollobás and de la Vega [18], with the aid of random graphs. For the sake of simplicity we state a rather crude form of this result.
Theorem 20. Let r > 3 be fixed and rn even. Then, as n+ m, a.e. log,-,(nlogn)-log,-,
6r r-2
--
is such that
-1
+
6 diam (Grhreg)
dig 3 5 7 6 49 I 14 8 2 N=
100,
d\g 3 8 20 9 48 5 10
4
d\g 3 4
6 7 8
6 22 0 0 S=
100
5 4 10 2 0 0
4
11
n=50, s5.100 d\g 3 4 5 6 4 4 0 7 47 15 1 8 2 4 2 0 9 3 0 0
n=40, s=100 5
3 0 9 3 4 7 6 1 4 0 0
n=200, s=100
n=150, s=100 d\g 3 4 9 10
5
6
35 18 1 1 37 8 0 0
d\g 3 4
9 10 11
5
2 1 1 71 18 11 5
1
0
n=300, s = 4 0
n=500, s=40
n = 1000, s = 15
3 4 5 5 2 1 25 7 0
d\g3 4 5 11 4 6 4 12 17 7 2
d\s 3 4 12 1 0 13 11 3
n=3000, s = 2
n=4000, s = 2
10 11
n=2000, s = 2 d\g 14
3 4
1
1
d\s 3 4 1 1
15
d\s 15
3 2
Table 22. The diameter and girth of a random sample of s four-regular graphs of order n
(n, s): (30, 100), (40, 100), (50, 100), (100, 100), (150, 100), (200, 100), (300, 40), (500, 40), (1000, 15), (2000, 2), (3000, 2)

Table 23. The diameter and girth of a random sample of s five-regular graphs of order n
(n, s): (30, 100), (40, 100), (50, 100), (100, 40), (150, 40), (200, 40), (300, 40), (500, 15), (1000, 2), (2000, 2)

Table 24. The diameter and girth of a random sample of s six-regular graphs of order n
n: 30, 40, 50, 100, 160, 200, 300, 500, 1000, 2000
graphs of order 50, 15 had diameter 7 and girth 4. In Tables 22-24 the first two numbers stand for the order and sample size and an entry $(d, g)\,t$ means that we found $t$ graphs of diameter $d$ and girth $g$. For example, in a sample of 100 4-regular graphs of order 30, 80 were found to have diameter 4 and girth 3.

Table 25. A cubic graph of order 52 and girth 7 given by its adjacency lists

   1:  2  3  4     14:  6 29 30     27: 13 45 52     40: 19 34 44
   2:  1  5  6     15:  7 31 32     28: 13 24 33     41: 20 45 36
   3:  1  7  8     16:  7 33 34     29: 14 35 25     42: 20 24 49
   4:  1  9 10     17:  8 35 36     30: 14 39 50     43: 21 37 50
   5:  2 11 12     18:  8 37 38     31: 15 46 49     44: 21 40 25
   6:  2 13 14     19:  9 39 40     32: 15 26 36     45: 22 27 41
   7:  3 15 16     20:  9 41 42     33: 16 37 28     46: 22 31 48
   8:  3 17 18     21: 10 43 44     34: 16 40 47     47: 23 34 48
   9:  4 19 20     22: 10 45 46     35: 17 29 48     48: 35 46 47
  10:  4 21 22     23: 11 38 47     36: 17 32 41     49: 31 42 50
  11:  5 23 24     24: 11 42 28     37: 18 43 33     50: 30 43 49
  12:  5 25 26     25: 12 29 44     38: 18 23 52     51: 26 39 52
  13:  6 27 28     26: 12 32 51     39: 19 30 51     52: 27 38 51
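The properties claimed for the graph of Table 25 can be checked mechanically. The sketch below rebuilds the graph from the tree of radius 4 centred at vertex 1 together with the remaining edges read off the adjacency lists, then computes the girth by breadth-first search from every vertex. One entry is an assumption: the list of vertex 35 is read as 17, 29, 48, since vertex 48 lists 35 as a neighbour and vertex 45 does not.

```python
from collections import deque


def table25_graph():
    """Cubic graph of order 52: a tree on 1..46 plus 33 further edges."""
    adj = {v: set() for v in range(1, 53)}

    def join(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for c in (2, 3, 4):          # the centre has three children
        join(1, c)
    for v in range(2, 23):       # every other branch vertex has two children
        join(v, 2 * v + 1)
        join(v, 2 * v + 2)
    extra = [
        (23, 38), (23, 47), (24, 28), (24, 42), (25, 29), (25, 44),
        (26, 32), (26, 51), (27, 45), (27, 52), (28, 33), (29, 35),
        (30, 39), (30, 50), (31, 46), (31, 49), (32, 36), (33, 37),
        (34, 40), (34, 47), (35, 48), (36, 41), (37, 43), (38, 52),
        (39, 51), (40, 44), (41, 45), (42, 49), (43, 50), (46, 48),
        (47, 48), (49, 50), (51, 52),
    ]
    for u, v in extra:
        join(u, v)
    return adj


def girth(adj):
    """Length of a shortest cycle, found by BFS from every vertex."""
    best = float("inf")
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # a non-tree edge closes a walk through the root
                    best = min(best, dist[u] + dist[w] + 1)
    return best


if __name__ == "__main__":
    g = table25_graph()
    print(sorted({len(nb) for nb in g.values()}), girth(g))
```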
Tables 21-24 show that random regular graphs are not too useful in our search for graphs of given degree of regularity, given girth and small order. Nevertheless, a computer can be of some use when it comes to completing a construction after a promising start. Using this method we managed to construct a cubic graph of girth 7 and order 52; as far as we know, no such graphs have been found previously. In Table 25, the graph is given by its adjacency lists. The graph was constructed by starting with a tree of radius 4, centre 1, vertex set $\{1, 2, \ldots, 46\}$, in which every branch vertex has degree 3, and three independent edges (47, 48), (49, 50), (51, 52).
Acknowledgements
We are very grateful to Mr A. J. Harris for his generous assistance with the preparation of the final form of this paper.
References
[1] D. Angluin and L. G. Valiant (1979), Fast probabilistic algorithms for Hamilton circuits and matchings, Journal of Computer and System Sciences 18, 155-193.
[2] T. L. Austin, R. E. Fagen, W. F. Penney and J. Riordan (1959), The number of components in random linear graphs, Ann. Math. Statist. 30, 747-754.
[3] J. Beck (1983), On size Ramsey numbers of paths, trees and cycles. I, J. Graph Theory 7, 115-129.
[4] E. A. Bender and E. R. Canfield (1978), The asymptotic number of labelled graphs with given degree sequences, Journal of Combinatorial Theory (A) 24, 296-307.
[5] C. Berge (1958), Sur le couplage maximum d'un graphe, C. R. Acad. Sci. Paris 247, 258-259.
[6] J.-C. Bermond and B. Bollobás (1982), The diameter of graphs - a survey, Proc. Twelfth Southeastern Conf. on Combinatorics, Graph Theory and Computing, Congressus Numerantium 32, 3-27.
[7] B. Bollobás (1978), Extremal Graph Theory, Academic Press, London, New York and San Francisco, xx + 488 pp.
[8] B. Bollobás (1979), Graph Theory - An Introductory Course, Graduate Texts in Mathematics, Springer-Verlag, New York, Heidelberg and Berlin, x + 180 pp.
[9] B. Bollobás (1980), A probabilistic proof of an asymptotic formula for the number of labelled regular graphs, European Journal of Combinatorics 1, 311-316.
[10] B. Bollobás (1981), Degree sequences of random graphs, Discrete Math. 33, 1-19.
[11] B. Bollobás (1981), Random graphs, in: Combinatorics (H. N. V. Temperley, ed.), London Math. Soc. Lecture Note Series 52, Cambridge University Press, 80-102.
[12] B. Bollobás (1982a), Vertices of given degree in a random graph, Journal of Graph Theory 6, 147-155.
[13] B. Bollobás, The evolution of random graphs, Trans. Amer. Math. Soc.
[14] B. Bollobás, Geodesics in oriented graphs.
[15] B. Bollobás (1984), The evolution of sparse graphs, in: Combinatorics and Graph Theory, Proc. Cambridge Combinatorial Conf. in honour of Paul Erdős (B. Bollobás, ed.), Acad. Press.
[16] B. Bollobás, P. A. Catlin and P. Erdős (1980), Hadwiger's conjecture is true for almost every graph, European Journal of Combinatorics 1, 195-199.
[17] B. Bollobás and P. Erdős (1976), Cliques in random graphs, Math. Proc. Cambridge Phil. Soc. 80, 419-427.
[18] B. Bollobás and W. F. de la Vega (1982), The diameter of random regular graphs, Combinatorica 2, 125-134.
[19] N. Christofides (1971), An algorithm for the chromatic number of a graph, Computing J. 14, 38-39.
[20] V. Chvátal (1977), Determining the stability number of a graph, SIAM Journal of Computing 6, 643-662.
[21] L. Comtet (1974), Advanced Combinatorics, D. Reidel Publishing Company, Dordrecht, Holland.
[22] D. G. Corneil and B. Graham (1973), An algorithm for determining the chromatic number of a graph, SIAM J. Computing 2, 311-318.
[23] P. Erdős (1959), Graph theory and probability, Canadian Journal of Mathematics 11, 34-38.
[24] P. Erdős (1961), Graph theory and probability. II, Canadian Journal of Mathematics 13, 346-352.
[25] P. Erdős (1963), Some applications of probability to graph theory and combinatorial problems, in: Theory of Graphs and Applications, Proc. Symp. held in Smolenice in June 1963, Academia, Praha, 1964, pp. 133-136.
[26] P. Erdős and S. Fajtlowicz (1979), On the conjecture of Hajós, Tenth S. E. Conf. on Combinatorics, Graph Theory and Computing, Boca Raton.
[27] P. Erdős and A. Rényi (1959), On random graphs. I, Publ. Math. Debrecen 6, 290-297.
[28] P. Erdős and A. Rényi (1960), On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5, 17-61.
[29] P. Erdős and A. Rényi (1961), On the evolution of random graphs, Bull. Inst. Internat. Statist. Tokyo 38, 343-347.
[30] P. Erdős and A. Rényi (1961), On the strength of connectedness of a random graph, Acta Math. Acad. Sci. Hungar. 12, 261-267.
[31] P. Erdős and A. Rényi (1964), On random matrices, Publ. Math. Inst. Hungar. Acad. Sci. 8, 455-461.
[32] P. Erdős and A. Rényi (1966), On the existence of a factor of degree one of a connected random graph, Acta Math. Acad. Sci. Hungar. 17, 359-368.
[33] P. Erdős and A. Rényi (1968), On random matrices. II, Studia Sci. Math. Hungar. 3, 459-464.
[34] T. I. Fenner and A. M. Frieze (1983), On the existence of hamiltonian cycles in a class of random graphs, Discrete Math. 45, 301-305.
[35] S. Fillenbaum and A. Rapoport (1971), Subjects in the Subjective Lexicon, Academic Press, New York.
[36] Y. Fu and S. S. Yau (1962), A note on the reliability of communication networks, J. Soc. Indust. Appl. Math. 10, 469-474.
[37] M. R. Garey and D. S. Johnson (1976), The complexity of near-optimal graph colouring, J. Assoc. Computing Mach. 23, 43-49.
[38] E. N. Gilbert (1959), Random graphs, Annals Math. Stat. 30, 1141-1144.
[39] G. R. Grimmett (1980), Random graphs, in: Further Selected Topics in Graph Theory (L. Beineke and R. J. Wilson, eds.), Academic Press, London, New York, San Francisco.
[40] G. R. Grimmett and C. J. H. McDiarmid (1975), On colouring random graphs, Math. Proc. Cambridge Phil. Soc. 77, 313-324.
[41] G. I. Ivchenko (1973), The strength of connectivity of a random graph, Theory of Prob. and Appl. 18, 396-403.
[42] D. S. Johnson (1974), Worst case behaviour of graph colouring algorithms, Fifth Southeastern Conf., pp. 513-527.
[43] A. Johri and D. W. Matula (1982), Probabilistic bounds and heuristic algorithms for colouring large random graphs, Techn. Rept., Dept. of Computer Science and Engineering, Southern Meth. University.
[44] M. Karoński (1982), A review of random graphs, Journal of Graph Theory 6, 349-389.
[45] R. M. Karp (1972), Reducibility among combinatorial problems, in: Complexity of Computer Computations (R. E. Miller and J. W. Thatcher, eds.), Plenum Press, New York, pp. 85-104.
[46] A. K. Kelmans (1965), Some problems of the analysis of reliability of nets (in Russian), Automatika i Telemechanika 26, 567-571.
[47] A. K. Kelmans (1967), On the connectedness of random graphs (in Russian), Automatika i Telemechanika 28, 98-116.
[48] A. K. Kelmans (1972), Asymptotic formulas for the probability of k-connectedness of random graphs, Theory of Prob. and Appl. 17, 243-254.
[49] A. K. Kelmans (1977), Comparison of graphs by their probability of connectedness (in Russian), in: Kombinator. i Asimpt. Analiz, Krasnoyarsk, 69-81.
[50] D. E. Knuth (1973), The Art of Computer Programming, Vol. 3, Sorting and Searching, Addison-Wesley, Reading, Mass.
[51] J. Komlós and E. Szemerédi (1975), Hamiltonian cycles in random graphs, in: Infinite and Finite Sets (A. Hajnal, R. Rado and V. T. Sós, eds.), Colloq. Math. Soc. J. Bolyai 10, North-Holland, Amsterdam, 1003-1011.
96
B. Bollobds, A . Thomason
[52] J. Koml6s and E. Szemerddi (1983). Limit distributions for the existence of Hamilton cycles in a random graph, Discrete Math. 43, 55-63. [53] A. D. Korshunov (1976), Solution of a problem of Erdos and Rknyi o n Hamilton cycles in non-oriented graphs, Soviet Mat. Dokl. 17, 760-764. [54] A. D. Korshunov (1977). A solution of a problem of P. ErdBs and A. Renyi about Hamilton cycles in non-oriented graphs (in Russian), Metody Diskr. Anal. v Teoriy Upr. Syst., Sbornik Trudov Novosibirsk 31, 17-56. [ 5 5 ] E. L. Lawler, A note on the complexity of the chromatic number problem, Inform. Processing. Lett. 5, 66-67. I561 F. T. Leighton (1979), A graph colouring algorithm for large scheduling problems, J. Rcs. Nat. Bur. Stand. 84, 489-496. [57] K. 1.: Ling (1973), The expected number of components in random iinear graphs, The Annals of Probability 1, 876-881. (581 R. F. Ling (l973), A prokibility theory of cluster analysis, Annals of the American Statistical Association 68, 159-164. [59] R. F. Ling (19751, An exact probability distribution on the connectivity of random graphs, J. Math. Psych. 12, 90-98. [60] K. F. Ling and G . G . Killough (1976), Probability tables for cluster analysis based on a theory of random graphs, J . Amer. Stat. Assoc. 71, 293-300. [61] C. J. H. McDimnid (1979), Determining the chromatic number of a graph, SIAM J. Computing 8, 1-14. [62] G. A. Margulis (1 974), Probabilistic characteristics of graphs with large connectivity, Problems of lnformation Transmission 10, 101-108. [63] D. W. Mntula (1970), On the complete subgraph of a random graph, Combinatory Mathematics and its Applications, Chapel Hill, N. C., 356-369. [64] D . W. Matula (1972), The employee party problem, Notices A. M. S. 19, A-382. [65] D. W. Matula (1976), The largcst clique size in a random graph, Technical Rep. Dcpt. Conip. Sci., Southern Methodist Univ., Dallas. [66] H . S. Na and A. 
Rapoport (1967), A formula for the probability of obtaining a tree from a graph constructed randomly except for “exogamous bias,” The Annals of Math. Stat. 38, 226-241. [67] I. Palksti (1963), On the connectedness of bichroniatic random graphs, Publ. Math. Inst. Hungnr. Acad. Sci. 8, 431-440. [68] I. PalBsti (1968), On the connectedness of random graphs, in: Studies in Math. Stat. and Appl., Akad. Kiado, Budapest, 105-108. [69] L. Posa (1976), Haniiltonian circuits in random graphs, Discrete Math. 14, 359-364. [70] A. Rapoport and S . Fillenbaum (1972), An experimental study of semantic structures, in: Multidimensional Scaling (A. K . Romney, R. N . Shepharci and S . €3. Ncrlovc, eds.), Vol. 11, Seminar I’rcss, New York, 93-131. [71] A. Rucinski (1981), On k-connectedness of a n r-partite random graph, Bull. Acact. Polon. Sci., Ser. Sci. Math. 29, 321-330. [72] J. W. Schultz and L. Hubert (1973), Data analysis and the connectivity of random graphs, J. Mathematical Psychology 10, 421428. 1731 J . V. Schultz and L. Hubert (1975), An empirical evaluation of a n approximate result in random graph theory, The British J. of Motliematical and Statistical Psychology 28, 103-1 11. 1741 E. Shaniir, A sharp threshold for Hamilton paths in random graphs. 1751 E. Shamir, How many random edges make a graph hamilLonian?
Random graphs of small order [76] J. Spencer (1975), Ramsey’s theorem
97
- a new lower bound, J. Cornbinatorial Theory (A)
18, 108-115.
[77] V. E. Stepanov (1969), Combinatorial algebra and random graphs, Theory of Prob. and its Applications 14, 373-399. 1781 V. E. Stepanov (1970), On the probability of connectedness of a random graph G,,,(t), Theory of Probability and its Applications 15, 55-67. [79] R. E. Tarjan (1972). Finding a maximum clique, Techn. Rep. 72-123, Computer Sci. Dept., Corncll University, lthaca, N. Y . [SO] R. E. Tarjan and A. E. Trojanowski (1977), Finding a maximum independent set, SlAM J. Comput. 6, 537-546. [81] A. C . Thomason (1978), Hamiltonian cyclcs and uniquely edge colourable graphs, in: Advances in Graph Theory (B. Bollobas, cd.), Ann. Discr. Math., North-Holland, 259-268. I821 W. T. Tutte (1947). The factorization of linear graphs, J. London Math. SOC. 22, 107-1 1 1 . [83] N. C. Wormald (1981), The asymptotic connectivity of labelled regular graphs, .I.Combinatorial Theory (B) 31, 156-167. [84] N. C. Wormald (1981),The asymptotic distribution of short cycles in random regular graphs, J . Combinatorial Theory (B) 31, 168-182. [85] E. M. Wright (1968), Asymptotic enumeration of connected graphs, Proc. Royal SOC. Edinburgh, Sect. A, 68, 298-305. [86] E. M. Wright (1972), The probability of connectedness of a n unlabelled graph can be less for more edges, Proc. Amer. Math. SOC.35, 21-25. [87] E. M. Wright (1973), The probability of connectedness of a large unlabelled graph, Bull. Amer. Math. SOC. 79, 767-769. [88] E. M. Wright (1975), The probability of connectedness of a large unlabelled graph, J. London Math. SOC.(2) 1 1 , 13-16.
This Page Intentionally Left Blank
Annals of Discrete Mathematics 28 (1985) 99-105 © Elsevier Science Publishers B.V. (North-Holland)
VERTEX-DEGREES IN STRATA OF A RANDOM RECURSIVE TREE
Marian DONDAJEWSKI, Peter KIRSCHENHOFER* and Jerzy SZYMAŃSKI
Technical University of Poznań, Poznań, Poland, and *Technical University of Vienna, Vienna, Austria
A stratum $S_k$ of a rooted tree is the set of vertices whose distance from the root is equal to $k$. We shall derive exact and asymptotic formulas for the expected value $A_r(n, k)$ of the number of vertices of degree $r$ in $S_k$ of a random recursive tree with $n$ vertices.
1. Introduction
A tree is a connected graph which has no cycles (see [3] or [5] for definitions not given here). A tree $R_n$ with $n$ vertices labelled 1, 2, ..., n is a recursive tree if for each $k$ such that $2 \le k \le n$ the labels of the vertices on the unique path from the 1st vertex to the k-th vertex of the tree form an increasing subsequence of {1, 2, ..., k}. The vertex with label 1 is the root of a recursive tree. The set of all vertices whose distance from the root equals $k$ is called the k-th stratum of a recursive tree. A random recursive tree (RRT) with $n$ vertices is a tree picked at random from the family of all recursive trees with $n$ vertices. We assume that each of the $(n-1)!$ possible choices of a tree is equiprobable. Let $A_r(n, k)$ denote the expected value of the number of vertices of degree $r$ in the k-th stratum of the RRT with $n$ vertices. Our main objective is to determine a formula for $A_r(n, k)$. To do this we first find the generating function for $A_r(n, k)$ and then determine some recursive relations needed to prove the final formula. In the last section we find an asymptotic formula for $A_r(n, k)$.
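The definition of $A_r(n, k)$ can be made concrete for small $n$ by brute force. The following is a minimal sketch, not taken from the paper: it enumerates all $(n-1)!$ recursive trees by choosing for each vertex $v \ge 2$ a parent with smaller label, and it assumes that "degree" means the ordinary graph degree (number of children plus the edge to the parent). The function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def recursive_trees(n):
    """Yield {vertex: parent} for each of the (n-1)! recursive trees on 1..n."""
    choices = [range(1, v) for v in range(2, n + 1)]  # parent of v is any label < v
    for parents in product(*choices):
        yield dict(zip(range(2, n + 1), parents))


def expected_count(n, k, r):
    """A_r(n, k): expected number of degree-r vertices in stratum k (graph degree assumed)."""
    total = 0
    trees = 0
    for parent in recursive_trees(n):
        depth = {1: 0}
        degree = {v: 0 for v in range(1, n + 1)}
        for v in range(2, n + 1):          # parents have smaller labels, so depth is known
            depth[v] = depth[parent[v]] + 1
            degree[v] += 1
            degree[parent[v]] += 1
        total += sum(1 for v in degree if depth[v] == k and degree[v] == r)
        trees += 1
    return Fraction(total, trees)
```

For instance, `expected_count(4, 2, 1)` averages the number of leaves in the second stratum over the six recursive trees with four vertices. The enumeration grows like $(n-1)!$, so it is only meant for checking formulas on small cases.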
2. The expectation $A_r(n, k)$

Let us consider the family of the generating functions for the expectation $A_r(n, k)$ of the form

[...] (1)

and prove the following result.

Theorem 1. Let $G_k(x, y)$ be as in (1). Then

[...] $+ (-1)^k y (1-y)^{-k} (1-x)^{-y}$ (2)

for $|x|$ [...] $\cdots (k-i+1)$.
Proof. It is shown in [2] that $A_r(n, k)$ fulfils the following recursive equations:

$A_r(n, k) = \frac{1}{n-1} \sum_{i=k+r-1}^{n-1} A_{r-1}(i, k)$, for $r \ge 2$ and $n \ge k+1$ (3)
and

$A_1(n, k) = \frac{1}{n-1} \sum_{i=k}^{n-1} \mu(i, k-1)$, (4)

where $\mu(n, k)$ denotes the expected value of the number of vertices in the k-th stratum of the RRT with $n$ vertices. Moreover, if we assume that $A_0(n, k) = \mu(n, k-1)$, then relation (3) is also true for $r = 1$. Let $D$ denote the operator $\partial/\partial x$. Then

[...]
Using formula (3), changing the order of summation, and summing the geometric series, we get

[...]

Finally,

[...]
Meir and Moon [4] proved that $\mu(n, k)$ fulfils the following relation:

[...]

Using the above equation, we can rewrite (5) in the form

[...]

Solving this differential equation with the boundary condition $G_k(0, y) = 0$, we arrive at the thesis. □

From Theorem 1 one can deduce the following recursive relations for $A_r(n, k)$.

Corollary 1.
[...] (7)

for $n \ge k+1$ and

$A_r(n+1, k+1) = A_{r-1}(n+1, k+1) - A_r(n+1, k)$ (8)

for $r \ge 2$ and $n \ge k+r$, where $s(n, i)$ denotes the Stirling number of the first kind.
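Relation (8) can be checked numerically on small trees. The sketch below is an illustration only, under two assumptions that are not stated explicitly in the surviving text: degree is the graph degree, and the check is restricted to strata $k \ge 1$ (the root stratum is excluded). It tabulates $A_r(n, k)$ for all $k$ and $r$ in one pass over all recursive trees; the helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def degree_profile(n):
    """Return {(k, r): A_r(n, k)} by enumerating all (n-1)! recursive trees on 1..n."""
    counts = Counter()
    trees = 0
    for parents in product(*(range(1, v) for v in range(2, n + 1))):
        depth = [0] * (n + 1)    # depth[1] = 0 is the root; index 0 is unused
        degree = [0] * (n + 1)
        for v, p in zip(range(2, n + 1), parents):
            depth[v] = depth[p] + 1
            degree[v] += 1
            degree[p] += 1
        for v in range(1, n + 1):
            counts[(depth[v], degree[v])] += 1
        trees += 1
    return {key: Fraction(c, trees) for key, c in counts.items()}


def relation_8_holds(size, k, r):
    """Check A_r(size, k+1) == A_{r-1}(size, k+1) - A_r(size, k), with size = n + 1."""
    A = degree_profile(size)
    zero = Fraction(0)
    lhs = A.get((k + 1, r), zero)
    rhs = A.get((k + 1, r - 1), zero) - A.get((k, r), zero)
    return lhs == rhs
```

A call such as `relation_8_holds(5, 1, 2)` compares both sides of (8) exactly, using rational arithmetic, for trees with five vertices.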
Proof. By (2) it is easy to see that $G_k(x, y)$ fulfils the following recursive equation

[...] (9)

for $k \ge 1$. Since

$\frac{1}{k!} \ln^k(1-x) = \sum_{n=k}^{\infty} s(n, k) \frac{(-x)^n}{n!}$

and

$(1-x)^{-1} = \sum_{n=0}^{\infty} x^n$

we are able to write (9) in the following form:

[...] (10)
Now using the definition of $G_k(x, y)$ and comparing coefficients of the series in (10), one can easily get the relations (8) and

$A_1(n+1, k+1) = \sum_{i=k}^{n} \frac{|s(i, k)|}{i!} - A_1(n+1, k)$. (11)
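The sum of unsigned Stirling numbers appearing in (11) can be evaluated in closed form. As a small check, the sketch below builds $|s(n, k)|$ from the standard recurrence $|s(n+1, k)| = n\,|s(n, k)| + |s(n, k-1)|$ and compares $\sum_{i=k}^{n} |s(i, k)|/i!$ with $|s(n+1, k+1)|/n!$. The upper summation limit $n$ is an assumption here, since it is not legible in the source; the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def unsigned_stirling_first(n_max):
    """Table c[n][k] = |s(n, k)| for 0 <= n, k <= n_max, with c[0][0] = 1."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(n_max):
        for k in range(1, n + 2):
            c[n + 1][k] = n * c[n][k] + c[n][k - 1]
    return c


def partial_sum(c, n, k):
    """sum_{i=k}^{n} |s(i, k)| / i!  (upper limit n is an assumption)."""
    return sum(Fraction(c[i][k], factorial(i)) for i in range(k, n + 1))


def closed_form(c, n, k):
    """|s(n+1, k+1)| / n!"""
    return Fraction(c[n + 1][k + 1], factorial(n))
```

For example, with $n = 2$ and $k = 1$ both expressions equal $3/2$.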
To prove (7) we have to show that

[...] (12)

holds. It is easy to see, however, that both sides of (12) are equal to $|s(n+1, k+1)|/n!$, which completes the proof. □

Now we are ready to prove the main theorem of this paper.
Theorem 2.
Proof. To prove the above result it is sufficient to show that the right-hand side of (13) fulfils the recursive relations (7) and (8) with the respective boundary condition. In fact,

$A_{r-1}(n+1, k+1) - A_r(n+1, k+1) =$ [...]
Similarly, one can see that the right-hand side of (13) fulfils (7). Moreover, it is easy to see that the boundary condition

[...]

is also fulfilled, which completes the proof. □

Corollary 2. For arbitrary $k$, $r$ and $n$

[...] (14)
Proof. By Vandermonde's convolution in the form

[...]

it follows immediately from (13) that

[...]

By the application of the following identity (see (12))

[...]

we arrive at the thesis. □
Note that (14) is much better suited for the asymptotics of $A_r(n, k)$ than (13) since the bounds of summation do not depend on $n$.

Theorem 3. If $r$ and $k$ are fixed and $n \to \infty$ then for $k \ge 2$

$A_r(n+1, k+1) = \frac{(\ln n)^k}{k!} + (\gamma - r) \frac{(\ln n)^{k-1}}{(k-1)!} + O((\ln n)^{k-2})$.

Furthermore,

$A_r(n+1, 2) = \ln n + \gamma - r + O($[...]$)$

and

$A_r(n+1, 1) = 1 + O($[...]$)$,

where $\gamma = 0.5772\ldots$ is the Euler constant.
Proof. To prove the asymptotic formulas for $A_r(n, k)$ we have to deal with the asymptotics of $|s(n+1, j+1)|/n!$ for fixed $j$ and $n \to \infty$. It is known that

[...]

where $Y_j$ are the Bell polynomials and $\zeta_n(s) = \sum_{i=1}^{n} i^{-s}$ (see [1, p. 217]). But

$Y_j(x_1, \ldots, x_j) = x_1^j + \binom{j}{2} x_1^{j-2} x_2 + \cdots$

and

$\zeta_n(1) = \ln n + \gamma + O(n^{-1})$, $\zeta_n(s) = O(1)$ for $s > 1$,

so

[...]

and by (14) the proof is complete. □
Note that in the asymptotic formula for $A_r(n, k)$ only the second order term depends on the vertex degree $r$.
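The asymptotic behaviour can be illustrated numerically. The sketch below is not the authors' method: it evaluates $A_r(n, k)$ exactly from the recursion $A_r(n, k) = \frac{1}{n-1}\sum_{i \le n-1} A_{r-1}(i, k)$ with the starting values $A_0(i, k) = \mu(i, k-1)$ (both as reconstructed here, so they should be read as assumptions), and it obtains $\mu$ from the observation that vertex $i+1$ attaches to a uniformly chosen earlier vertex, which gives $\mu(i+1, j) = \mu(i, j) + \mu(i, j-1)/i$. It requires $k \ge 1$; the names are illustrative.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329


def stratum_sizes(n_max, k_max):
    """mu[i][j]: expected number of vertices in stratum j of an RRT with i vertices."""
    mu = [[Fraction(0)] * (k_max + 1) for _ in range(n_max + 1)]
    mu[1][0] = Fraction(1)
    for i in range(1, n_max):
        for j in range(k_max + 1):
            # vertex i+1 lands in stratum j iff its parent lies in stratum j-1
            mu[i + 1][j] = mu[i][j] + (mu[i][j - 1] / i if j >= 1 else 0)
    return mu


def degree_expectations(n_max, k, r_max):
    """Return {r: [A_r(0, k), ..., A_r(n_max, k)]} for 1 <= r <= r_max and k >= 1."""
    mu = stratum_sizes(n_max, k)
    prev = [mu[i][k - 1] for i in range(n_max + 1)]   # assumed convention A_0(i, k)
    result = {}
    for r in range(1, r_max + 1):
        cur = [Fraction(0)] * (n_max + 1)
        running = Fraction(0)
        for n in range(2, n_max + 1):
            running += prev[n - 1]        # terms below i = k + r - 1 are zero anyway
            cur[n] = running / (n - 1)
        result[r] = cur
        prev = cur
    return result
```

With `A = degree_expectations(201, 2, 2)`, the value `float(A[2][201])` can be compared with `log(200) + EULER_GAMMA - 2`, the leading terms stated for the second stratum; the two agree to about two decimal places.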
References

[1] L. Comtet, Advanced Combinatorics (D. Reidel Publ. Comp., Dordrecht, 1974).
[2] M. Dondajewski and J. Szymański, On the distribution of vertex-degrees in strata of a random recursive tree, Bull. Acad. Polon. Sci. Sér. Sci. Math. Vol. XXX, No. 5-6 (1982) 205-209.
[3] F. Harary and E. Palmer, Graphical Enumeration (Academic Press, New York, 1973).
[4] A. Meir and J. W. Moon, On the altitude of nodes in random trees, Canad. J. Math. 30 (1978) 997-1015.
[5] J. W. Moon, Counting Labelled Trees, Canadian Mathematical Congress, Montreal, 1970.
Annals of Discrete Mathematics 28 (1985) 107-124 © Elsevier Science Publishers B.V. (North-Holland)
RELIABILITY-ESTIMATION IN STOCHASTIC GRAPHS WITH TIME-ASSOCIATED ARC-SET RELIABILITY PERFORMANCE PROCESSES
Wolfgang GAUL
University of Karlsruhe (TH), West Germany

In this paper situations are considered in which the reliability-behaviour of the arcs of stochastic graphs is described by time-associated reliability performance processes. Reliability-estimations, e.g. lower (and for special cases upper) reliability bounds - additional to the known minimal cut lower bound of Esary and Proschan - are yielded by a proper decomposition of the underlying stochastic graph. This decomposition allows a successive determination of the reliability estimation by using reliability estimations of stochastic subgraphs, which should be of interest when the underlying stochastic graph is large. Comparisons of the different bounds are made within an example of simplest form.
1. Introduction

There are some interesting directions concerning stochastic aspects within application-relevant graphtheoretical problems; one of them consists in stochastic programming on graphs, see e.g. Cleef and Gaul [6], [7], another in models of the reliability-behaviour of graphs. In most papers dealing with reliability problems in stochastic graphs the model description is done from a static stochastic viewpoint allowing that the elements of the graph (nodes, arcs) can take only two states - functioning or having failed - with probabilities independent of time. Graphtheoretical reliability measures depend on an appropriate connectivity notation (well-known measures are given e.g. by the probability that a specified pair of nodes or a specified subset of nodes belong to a "connected functioning subgraph", a notation which has to be defined according to whether directed or undirected graphs are used in which nodes and/or arcs are subject to random failure). Of course, the simpler the structure of the underlying stochastic graph is, the more realistic stochastic descriptions are possible (see e.g. Barlow and Proschan [1], where - if the underlying stochastic graph is a path with reliable nodes (a series-system built by the arcs of the path) - tools from availability theory in connection with renewal theory can be applied).

However, standard reliability problems in stochastic graphs don't use time-associated reliability performance processes for model description as will be assumed throughout the rest of this paper. Thus, literature concerning some known directions of reliability graph problems is not explicitly mentioned here, but see Frank and Gaul [14] (where connectedness probabilities in stochastic graphs with randomly failing nodes and arcs are considered), Gaul and Hartung [18] (where bounding distribution functions are computed when the arcs of the underlying stochastic graph can take several states of reliability between complete failure and perfect functioning, see also Barlow and Wu [3], El-Neweihi, Proschan and Sethuraman [9] for the first description of multistate reliability models) and the references cited there.

With respect to the dependence structure of the random variables used for model description of reliability problems, association (which was first mentioned in Esary, Proschan and Walkup [13], a more recent paper is Jogdeo [20]) can be used. Weakening the usual and restrictive independence assumption to the case of time-associated performance processes was done in Esary and Proschan [12] and is adopted here to derive estimators, e.g. lower (and for special cases upper) bounds for the nodebasis - nodecontrabasis (see Harary, Norman and Cartwright [19] for graphtheoretical notations) connectedness probability in stochastic acyclic digraphs - additional to the known minimal cut lower bound of Esary and Proschan. This bound was first established in Esary and Proschan [11] and improved by Bodin [4] using modular decompositions (for the use of modules, which are also of interest in fields other than reliability theory, see e.g. Butterworth [5]); its generalization to the time-associated case was given in the already mentioned paper by Esary and Proschan [12].

In this paper reliability-bounds are constructed by means of a proper decomposition of the underlying stochastic acyclic digraph first described in Gaul [16] for project digraphs. Dependent on the used decomposition, improvements of some of the bounds (including those of Esary and Proschan) can be obtained. Comparisons of the different bounds are made within an example of simplest form.
2. Problem formulation
For ease of description some of the frequently used graphtheoretical notations are given in the following; for more detailed and additional explanations see e.g. Harary, Norman and Cartwright [19]. Let $D = (N, A, f)$ describe a digraph where $N \ne \emptyset$ denotes the set of nodes, $A$ the set of arcs, $f = (f^1, f^2)$ with $f^i: A \to N$, $i = 1, 2$, the incidence mapping with $f^1(a)$, $f^2(a)$ as starting-, end-node of $a \in A$. For abbreviation, sometimes, only $D$ is written for a digraph, in which case $N(D)$, $A(D)$ is used to denote the nodes, arcs of $D$. The incidence mapping is mostly omitted. In this case the tuple $(N(D), A(D))$ is used instead of $D$. For two digraphs $D_i$, $i = 1, 2$, call

[...]

In the following it suffices to consider only gsp (generalized series-parallel) digraphs of the form

[...]
which are finite, acyclic, weakly connected directed graphs the nodebasis (nodecontrabasis) of which consists of the single node $p \in N(D_{pq})$ ($q \in N(D_{pq})$). gsp-digraphs are of importance because they generalize the description of two terminal series-parallel systems which obviously can be represented by gsp-digraphs. They also enclose project digraphs (when parallel arcs are not allowed), structural properties of which have been described in Gaul [16] and can be used in the following context. One question is whether for a given pair of nodes $i, j \in N(D_{pq})$ there exists a gsp-subdigraph $D_{ij} \subset D_{pq}$. If this is the case, special gsp-subdigraphs of interest are the maximal gsp-subdigraph denoted by $\hat{D}_{ij}$ and the minimal gsp-subdigraphs denoted by $P_{ij}$ and called paths from $i$ to $j$. $(P_{ij})^k$, $(P_{ij})_k$, $k \in N(P_{ij})$, gives the subpath of $P_{ij}$ from $i$ to $k$, $k$ to $j$, respectively. The set of paths belonging to $D_{ij}$ is denoted by $P(D_{ij})$. For a gsp-digraph $D_{pq}$ there exists a bijective mapping called (ascending) level-assignment

$l: N(D_{pq}) \to \{0, 1, \ldots, m\}$ ($m = |N(D_{pq})| - 1$) with $a \in A \Rightarrow l(f^1(a)) < l(f^2(a))$ (and $l(p) = 0$, $l(q) = m$).
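A level-assignment of this kind is a topological ordering of the nodes. The following sketch, given only as an illustration, computes one with the standard queue-based (Kahn) procedure; the node names and the small arc list in the usage note are hypothetical and not taken from the paper.

```python
from collections import deque


def level_assignment(nodes, arcs):
    """Return a bijection node -> {0, ..., m} with level[u] < level[v] for every arc (u, v)."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(sorted(v for v in nodes if indegree[v] == 0))
    level = {}
    while ready:
        u = ready.popleft()
        level[u] = len(level)             # next free level
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:          # all predecessors already have a level
                ready.append(v)
    if len(level) != len(nodes):
        raise ValueError("digraph contains a cycle")
    return level
```

For a digraph with nodes `["p", "a", "b", "c", "d", "q"]` and arcs `p->a, p->b, a->c, b->c, b->d, c->q, d->q`, the single nodebasis node `p` receives level 0 and the single nodecontrabasis node `q` receives level 5.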
With the identification $N(D_{pq}) := \{0, 1, \ldots, m\}$ the nodes of $D_{pq}$ are assumed to be topologically ordered according to such a level assignment (this assumption is needed for the successive determination of the reliability estimation), and, from now on,

the notation $\hat{D}_{0m}$ is used instead of $D_{pq}$. (1)

$N(\hat{D}_{0m})$ is assumed to consist of perfect nodes only (which never fail). $A(\hat{D}_{0m})$ consists of unreliable elements the reliability behaviour of which is described by the (vector) reliability performance process

[...]

where $X_a(t)$ are Bernoulli-distributed random variables on a given probability space $(\Omega, G, \Pr)$ with

$X_a(t) = 1$ if arc $a$ is functioning, and $X_a(t) = 0$ otherwise, $a \in A(\hat{D}_{0m})$, $t \in T \subset [0, \infty)$. (2)
For fixed $\omega \in \Omega$ the sample functions $X_a(t, \omega)$ are assumed to be continuous from the right on $T$. Now, for fixed time $t \in T$, the stochastic graph is described by the tuple

[...]

but, again, at least for mainly graphtheoretical considerations the incidence mapping and the reliability performance processes are omitted. With respect to the dependence structure of the random variables describing the reliability-behaviour it is presupposed that the reliability performance process is time-associated, which means that for all finite sets of times $T_k = \{t_1, \ldots, t_k\} \subset T$

$\{X_a(t), t \in T_k, a \in A(\hat{D}_{0m})\}$ is a set of associated random variables

(see [2], [12], [13], [20] for properties of association or/and time-association and the discussion of special cases as independence and positively total dependence and a variety of maintenance situations). Of course, for two nodes $i, j \in N(\hat{D}_{0m})$ for which gsp-subdigraphs $D_{ij}$ exist, an interesting question is whether there will be a functioning path $P_{ij}$ (a path with functioning arcs) from $i$ to $j$.
More formally, for every gsp-subdigraph $D_{ij}$ a so-called structure function

$\varphi_{D_{ij}}(X(t)) = 1$ if there exists a functioning path $P_{ij} \in P(D_{ij})$, and $0$ otherwise,

can be defined with its path-representation

[...] (3)

Using the minimal cuts of $D_{ij}$ - a minimal cut $C_{ij} \subset A(D_{ij})$ is a minimal set of arcs with the property $C_{ij} \cap A(P_{ij}) \ne \emptyset$, $\forall P_{ij} \in P(D_{ij})$ - the following cut representation

$\varphi_{D_{ij}}(X(t)) = \prod_{C_{ij} \in C(D_{ij})} \varphi_{C_{ij}}(X(t))$ (with $\varphi_{C_{ij}}(X(t)) = 1 - \prod_{a \in C_{ij}} (1 - X_a(t))$) (4)

is equivalent to (3), where $C(D_{ij})$ denotes the set of cuts of $D_{ij}$. For fixed time $t \in T$ the structure function $\varphi_{D_{ij}}$ is a binary non-decreasing function with $\varphi_{D_{ij}}(0, \ldots, 0) = 0$, $\varphi_{D_{ij}}(1, \ldots, 1) = 1$. For fixed $\omega \in \Omega$ the sample function $\varphi_{D_{ij}}(X(t, \omega))$ is continuous from the right on $T$.
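The equivalence of a path representation and the cut representation (4) can be checked exhaustively on a small digraph. The sketch below is an illustration under assumptions: the arc list is a hypothetical seven-arc acyclic digraph, and the path representation is written in one common form (maximum over paths of the product of the arc states), which may differ in notation from the one printed in the original.

```python
from itertools import combinations, product
from math import prod

# Hypothetical acyclic digraph with source 0 and sink 5.
ARCS = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 5)]


def all_paths(arcs, source, sink):
    """Arc sets of all directed paths from source to sink (acyclic digraph)."""
    if source == sink:
        return [frozenset()]
    return [frozenset({(u, v)}) | rest
            for (u, v) in arcs if u == source
            for rest in all_paths(arcs, v, sink)]


def minimal_cuts(arcs, paths):
    """Minimal arc sets that meet every path, found by increasing size."""
    cuts = []
    for size in range(1, len(arcs) + 1):
        for cand in map(frozenset, combinations(arcs, size)):
            if all(cand & p for p in paths) and not any(c <= cand for c in cuts):
                cuts.append(cand)
    return cuts


def phi_by_paths(paths, x):
    """1 iff some path consists of functioning arcs only."""
    return max(prod(x[a] for a in path) for path in paths)


def phi_by_cuts(cuts, x):
    """Cut representation (4): every minimal cut contains a functioning arc."""
    return prod(1 - prod(1 - x[a] for a in cut) for cut in cuts)


PATHS = all_paths(ARCS, 0, 5)
CUTS = minimal_cuts(ARCS, PATHS)
STATES = [dict(zip(ARCS, s)) for s in product((0, 1), repeat=len(ARCS))]
AGREE = all(phi_by_paths(PATHS, x) == phi_by_cuts(CUTS, x) for x in STATES)
```

On this digraph the enumeration finds three paths and nine minimal cuts, and the two representations agree on all $2^7$ arc states.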
$R_{D_{ij}}(\tau) = \Pr(\varphi_{D_{ij}}(X(t)) = 1, \forall t \in T(\tau))$ (5)

is an intuitive reliability measure for a gsp-digraph $D_{ij}$, but for larger and more complex structured graphs the determination of the exact value of (5) can be difficult. In this situation one can calculate the minimal cut lower bound of Esary and Proschan

$EP_{D_{ij}}(\tau) = \prod_{C_{ij} \in C(D_{ij})} R_{C_{ij}}(\tau)$ (6)

(with $R_{C_{ij}}(\tau) = \Pr(\varphi_{C_{ij}}(X(t)) = 1, \forall t \in T(\tau))$) for which

$\{X(t), t \in T\}$ time-associated $\Rightarrow R_{D_{ij}}(\tau) \ge EP_{D_{ij}}(\tau)$ (7)

is valid, and, of course, additional bounding possibilities for the reliability estimation would be of interest.
3. gsp-digraph decomposition

With respect to the given gsp-digraph $\hat{D}_{0m}$ the following notations are useful: Let be

$\delta_n = \{D_{in} : D_{in}$ is gsp-subdigraph, $i$ [...]$\}$

a system of gsp-subdigraphs which all have the same nodecontrabasis $n \in N(\hat{D}_{0m})$, $1 \le n \le m$,

$B(\delta_n) = \{i : i \in N(\hat{D}_{0m}), D_{in} \in \delta_n\}$

the set of the nodebasis-nodes of the gsp-subdigraphs of $\delta_n$. Call $\delta_n$ proper if

$\forall D'_{i_1 n}, D''_{i_2 n} \in \delta_n$: $D'_{i_1 n} \cap D''_{i_2 n} = (\{i, n\}, \emptyset)$ if $i_1 = i_2 = i$, and $(\{n\}, \emptyset)$ otherwise, (8)

[...] (9)
Such proper gsp-subdigraph systems always exist, e.g. $\delta_n = \{\hat{D}_{0n}\}$ is proper. Because of (8) the gsp-subdigraphs of the proper $\delta_n$ are arc-disjoint and node-disjoint except for the common nodecontrabasis $n$ and, eventually, a common nodebasis $i$; (9) establishes a relation between $P(\hat{D}_{0n})$ and $\delta_n$. For properties of proper systems of project digraphs see Gaul [16]; the following theorem (for the proof of which see Gaul and Hartung [18] in the more general multistate reliability framework) gives a hint why proper systems of gsp-subdigraphs could be useful.

Theorem 1. If $\delta_n = \{D_{in}\}$ is a proper gsp-subdigraph system then

[...]

From Theorem 1 it follows that a successive determination of the structure function (for $n = m$ one gets the structure function of the underlying gsp-digraph $\hat{D}_{0m}$) is possible which depends on the chosen proper gsp-subdigraph system $\delta_n$.
Among different proper gsp-subdigraph systems the following relation will be of interest:

If $\delta_n = \{D_{in}\}$, $\delta_n^* = \{D_{i^*n}\}$ are two proper gsp-subdigraph systems with $A(\delta_n) = A(\delta_n^*)$ then there exists a proper gsp-subdigraph system $\delta_n^{**} =$ [...] with

[...] (10)

For $\delta_n^*$, $\delta_n^{**}$ fulfilling (10) the notation $\delta_n^* \subset \delta_n^{**}$ is used.
4. Reliability estimation via lower (upper) bounds
The results of the following lemmas are needed.
Lemma 1.

[...]

Proof. (i) is obvious for less restrictive assumptions.

(ii) follows with $y_i = \prod_{k=1}^{s} a_{ik}$ by induction with respect to $r$.

(iii) Nothing has to be shown if $\max_{1 \le i \le r} \{a_{ik_0}\} = 0$ for some $k_0 \in \{1, \ldots, s\}$, but $\max_{1 \le i \le r} \{a_{ik}\} = 1$, $\forall k$, causes the existence of $i_0 \in \{1, \ldots, r\}$ with $a_{i_0 s} = 1$ and, because of the non-increasing property, $a_{i_0 k} = 1$, $\forall k$. □
With the definition (see Esary and Marshall [10]) that a device with reliability performance process $\{X(t), t \in T\}$ has a life in $T(\tau)$ if $\Pr(X(t) = 1, \forall t \in [0, s) \mid X(s) = 1) = 1$ holds for all $s \in T(\tau)$ with $\Pr(X(s) = 1) > 0$, one has

Lemma 2. If $D$ is a gsp-digraph with life in $T(\tau)$ then

[...]

Proof. Let be

[...]

follows. □

Remark. It is Lemma 1(i) which always allows to get lower reliability bounds. Lemma 1(ii) indicates a possibility which can yield improvements for lower bounds, but only if special gsp-subdigraphs have lives in $T(\tau)$ are upper reliability bounds obtainable according to Lemma 1(iii) by using the non-increasing property of Lemma 2.
For the proofs of the following theorems notice that if

$E \subset T(\tau)$ is a countable, dense subset of $T(\tau)$, $T_k = \{t_1, \ldots, t_k\} \subset E$ is a finite subset of $E$ with $T_k \uparrow E$ $(k \to \infty)$, and $D$, $D_1$, $D_2$ are gsp-subdigraphs of a proper $\delta$-system, (11)

then

$\Pr(\varphi_D(X(t)) = 1, \forall t \in T_k) \downarrow \Pr(\varphi_D(X(t)) = 1, \forall t \in E)$ $(k \to \infty)$ (12)

by monotone convergence, and

$R_D(\tau) = \Pr(\varphi_D(X(t)) = 1, \forall t \in E)$ (13)
because $\varphi_D(X(t, \omega))$ is continuous from the right for fixed $\omega \in \Omega$. Furthermore, if

$\{X(t), t \in T\}$ is time-associated

then

$\{\varphi_D(X(t)), t \in T\}$ is time-associated (14)

because $\varphi_D(X(t))$ is a binary non-decreasing function for fixed $t \in T$, and, as $\{\varphi_D(X(t)), t \in T_k\}$ is a set of associated binary random variables, for $D_1, D_2 \in \delta_n$

[...] (15)

is valid. Now, assume that with respect to a proper gsp-subdigraph system $\delta_n = \{D_{in}\}$ lower and upper reliability bounds for $R_{\hat{D}_{0i}}(\tau)$, $i \in B(\delta_n)$, are known, i.e.

[...] (16)

and define

[...] (17), (18)
Theorem 2. If $\delta_n = \{D_{in}\}$ is a proper gsp-subdigraph system, if $\{X(t), t \in T\}$ is a time-associated reliability performance process then

[...]

Proof. In consideration of (11)-(18) one gets for the finite set of times $T_k$

[...]

where after application of Theorem 1 the first inequality follows from Lemma 1(i), the second inequality from Jensen's inequality because of the convexity of the max-operator, and the third inequality from (15). Applying (12), (13) and (16), (17) gives

[...] □
Theorem 3. If $\delta_n = \{D_{in}\}$ is a proper gsp-subdigraph system, if $\{X(t), t \in T\}$ is a time-associated reliability performance process, if the component processes [...] are independent then

(i) [...]

(ii) [...]

if, additionally, $\hat{D}_{0i}$, $i \in B(\delta_n)$, $D_{in} \in \delta_n$, have lives in $T(\tau)$.

Proof. (i) In consideration of (11)-(18) one gets for the finite set of times $T_k$

[...]

where the first inequality follows from arguments used in the proof of Theorem 2, the second inequality from Lemma 1(ii), the third inequality from (15) (as

[...]

is a set of associated binary random variables) and the last equality from the independence assumption of the component processes. Applying (12), (13) and (16), (17), (18) gives the result.

(ii) Similarly,

[...]

where the first equality follows from Theorem 1, the second equality follows from Lemmas 1(iii) and 2 because the gsp-subdigraphs have lives in $T(\tau)$ (where $t_1 < \ldots < t_k$ is assumed for the times of $T_k$) and the last inequality, again, from (15) and the independence assumption of the component processes. Applying (12), (13) and (16), (18) gives

[...] □
Of course, an interesting question is whether it is possible to get improved bounds by changing from one proper gsp-subdigraph system $\delta_n^*$ to another $\delta_n^{**}$. The answer is given by the following.

Theorem 4. If $\delta_n^* = \{D_{i^*n}\}$, $\delta_n^{**} = \{D_{i^{**}n}\}$ are proper gsp-subdigraph systems with $\delta_n^* \subset \delta_n^{**}$, if $\{X(t), t \in T\}$ is a time-associated reliability performance process then

(i) [...]

(ii) [...]

if, additionally, the gsp-subdigraphs involved [...] have lives in $T(\tau)$, and if the component processes [...] are independent.
Here, to include the dependence of the lower (upper) reliability bounds on the chosen proper gsp-subdigraph system, the notation for the bounds is slightly different from (17), (18).

Proof. First, notice that for $\delta_n^*$, $\delta_n^{**}$ with $\delta_n^* \subset \delta_n^{**}$ (see (10))

[...]

is a proper gsp-subdigraph system (with respect to the gsp-digraph $D_{i^{**}n}$) and Theorem 1 gives

[...] (19)

(i) Although this is the more relevant statement of Theorem 4, the proof follows directly as a consequence of replacing $\varphi_{D_{i^{**}n}}$ by (19) and is omitted.

(ii) One has

[...] (20)

where the equality follows from Lemma 1(iii) applied to the max-expression (because the corresponding gsp-subdigraphs have lives in $T(\tau)$) and an exchange of $U_{i^*}(\tau)$ (which is not affected by the max-operation) with the max- and the $\prod$-operator, and the inequality from Lemma 1(ii).
Now

[...]

where the first equality follows from (19), the inequality from (20) and an association property of the form

$\{Z_1, \ldots, Z_r\}$ non-negative associated $\Rightarrow E \prod_{i=1}^{r} Z_i \ge \prod_{i=1}^{r} E Z_i$

(which is similar to (15) and can be proved along the lines of the induction argument used in the proof of (3.1), Chapter 2 in Barlow and Proschan [2]) and the last equality from the independence assumption. Applying (12), (13) and (16), (18) gives the result. □
Of course, the decision which of the proper gsp-subdigraph systems $\delta_n^*$, $\delta_n^{**}$ with $\delta_n^* \subset \delta_n^{**}$ one should choose will depend on how good the given bounds $L_{i^*}$, $U_{i^*}$, $i^* \in B(\delta_n^*)$, and $L_{i^{**}}$, $U_{i^{**}}$, $i^{**} \in B(\delta_n^{**})$, respectively, are, as well as on the knowledge about and the difficulties of determining the reliability of the gsp-subdigraphs of the proper systems. Instead of formulating further conditions for such an optimal choice, the next section shows that the just developed approach allows remarkable improvements even in an example of simplest form. From the preceding theorems one gets the following.

Procedure

Step n. Choose $n \in N(\hat{D}_{0m})$, 1 <
5. Example
To illustrate the theoretical results the following example is given to show that bounds additional to the minimal cut lower bound of Esary and Proschan could be of interest for reliability estimation. In the gsp-digraph of Fig. 1 the numbers attached to the nodes give a possible level assignment. The cross-connection arc (2,3) from 2 to 3 hinders straightforward series-parallel reduction.
FIG. 1.
The arcs are assumed to have lives in $T(\tau)$ with lifetimes independently identically exponentially distributed with parameter $\lambda$, i.e.

[...] (21)

where $\mathrm{Life}_a$ is the random variable describing the lifetime of arc $a$. As there are three paths

[...]

and nine minimal cuts

[...]
one easily gets from (21)

$R_{P_i}(\tau) = e^{-3\lambda\tau}$, $i \in \{1, \ldots, 3\}$,

$R_{C_i}(\tau) = 1 - (1 - e^{-\lambda\tau})^2$ for $i \in \{1, \ldots, 5\}$, and $R_{C_i}(\tau) = 1 - (1 - e^{-\lambda\tau})^3$ for $i \in \{6, \ldots, 9\}$. (22)
The easiest bounds using path representation are

[...] (23)

For the proper gsp-subdigraph system $\delta_5 = \{D_{25}, D_{35}\}$ with

$D_{25} = (\{2, 4, 5\}, \{(2,4), (4,5)\})$, $R_{D_{25}}(\tau) = e^{-2\lambda\tau}$,

and

$D_{35} = (\{3, 5\}, \{(3,5)\})$, $R_{D_{35}}(\tau) = e^{-\lambda\tau}$,

no lower (upper) bounds for $R_{\hat{D}_{0i}}$, $i \in B(\delta_5)$, are needed because the exact reliability

[...]

is easily obtainable; the corresponding bounds (see (17), (18)) are

[...] (24)
For the minimal cut lower bound of Esary and Proschan one gets from (22)

$EP_{\hat{D}_{05}}(\tau) = (1 - (1 - e^{-\lambda\tau})^2)^5 (1 - (1 - e^{-\lambda\tau})^3)^4$. (25)
The bounds (23), (24), (25) are given in Table 1 together with the exact reliability value (denoted by $R$) for different values of $\lambda$ and $\tau$. The results indicate that for high reliabilities the EP-bound can be better, but that the suggested decomposition of gsp-digraphs with respect to proper gsp-subdigraph systems (for which improvements according to Theorem 4 are possible) becomes more and more interesting for increasing time.
Table 1
L_i, U_i, EP reliability bounds for R with L1 < L2 < R < U2 < U1 and EP < R

  λ      τ      L1       L2       U2       U1       R        EP
 0.10    1.0  0.74082  0.87511  0.96763  0.98259  0.95717  0.95224
 0.10    2.0  0.54881  0.72974  0.87806  0.90815  0.85608  0.82617
 0.10    3.0  0.40657  0.59001  0.75670  0.79102  0.73061  0.65840
 0.10    4.0  0.30119  0.46705  0.62757  0.65875  0.60300  0.48613
 0.10    5.0  0.22313  0.36418  0.50605  0.53114  0.48563  0.33540
 0.10    6.0  0.16530  0.28081  0.39969  0.41844  0.38399  0.21796
 0.10    7.0  0.12246  0.21472  0.31088  0.32422  0.29943  0.13436
 0.10    8.0  0.09072  0.16312  0.23904  0.24821  0.23099  0.07907
 0.10    9.0  0.06721  0.12330  0.18222  0.18837  0.17672  0.04468
 0.10   10.0  0.04979  0.09284  0.13800  0.14205  0.13432  0.02435

 0.20    1.0  0.54881  0.72974  0.87806  0.90815  0.85608  0.82617
 0.20    2.0  0.30119  0.46705  0.62757  0.65875  0.60300  0.48613
 0.20    3.0  0.16530  0.28081  0.39969  0.41844  0.38399  0.21796
 0.20    4.0  0.09072  0.16312  0.23904  0.24821  0.23099  0.07907
 0.20    5.0  0.04979  0.09284  0.13800  0.14205  0.13432  0.02435
 0.20    6.0  0.02732  0.05217  0.07807  0.07975  0.07649  0.00660
 0.20    7.0  0.01500  0.02908  0.04364  0.04432  0.04299  0.00162
 0.20    8.0  0.00823  0.01612  0.02422  0.02449  0.02396  0.00037
 0.20    9.0  0.00452  0.00891  0.01339  0.01349  0.01329  0.00008
 0.20   10.0  0.00248  0.00491  0.00738  0.00742  0.00734  0.00002

 0.30    1.0  0.40657  0.59001  0.75670  0.79102  0.73061  0.65840
 0.30    2.0  0.16530  0.28081  0.39969  0.41844  0.38399  0.21796
 0.30    3.0  0.06721  0.12330  0.18222  0.18837  0.17672  0.04468
 0.30    4.0  0.02732  0.05217  0.07807  0.07975  0.07649  0.00660
 0.30    5.0  0.01111  0.02166  0.03253  0.03296  0.03212  0.00078
 0.30    6.0  0.00452  0.00891  0.01339  0.01349  0.01329  0.00008
 0.30    7.0  0.00184  0.00365  0.00547  0.00550  0.00545  0.00001
 0.30    8.0  0.00075  0.00149  0.00223  0.00224  0.00223  0.00000
 0.30    9.0  0.00030  0.00061  0.00091  0.00091  0.00091  0.00000
 0.30   10.0  0.00012  0.00025  0.00037  0.00037  0.00037  0.00000

 0.40    1.0  0.30119  0.46705  0.62757  0.65875  0.60300  0.48613
 0.40    2.0  0.09072  0.16312  0.23904  0.24821  0.23099  0.07907
 0.40    3.0  0.02732  0.05217  0.07807  0.07975  0.07649  0.00660
 0.40    4.0  0.00823  0.01612  0.02422  0.02449  0.02396  0.00037
 0.40    5.0  0.00248  0.00491  0.00738  0.00742  0.00734  0.00002
 0.40    6.0  0.00075  0.00149  0.00223  0.00224  0.00223  0.00000
 0.40    7.0  0.00022  0.00045  0.00067  0.00067  0.00067  0.00000
 0.40    8.0  0.00007  0.00014  0.00020  0.00020  0.00020  0.00000
 0.40    9.0  0.00002  0.00004  0.00006  0.00006  0.00006  0.00000
 0.40   10.0  0.00001  0.00001  0.00002  0.00002  0.00002  0.00000

 0.50    1.0  0.22313  0.36418  0.50605  0.53114  0.48563  0.33540
 0.50    2.0  0.04979  0.09284  0.13800  0.14205  0.13432  0.02435
 0.50    3.0  0.01111  0.02166  0.03253  0.03296  0.03212  0.00078
 0.50    4.0  0.00248  0.00491  0.00738  0.00742  0.00734  0.00002
 0.50    5.0  0.00055  0.00110  0.00165  0.00166  0.00165  0.00000
 0.50    6.0  0.00012  0.00025  0.00037  0.00037  0.00037  0.00000
 0.50    7.0  0.00003  0.00006  0.00008  0.00008  0.00008  0.00000
 0.50    8.0  0.00001  0.00001  0.00002  0.00002  0.00002  0.00000
 0.50    9.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 0.50   10.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000

 0.75    1.0  0.10540  0.18728  0.27294  0.28404  0.26330  0.10364
 0.75    2.0  0.01111  0.02166  0.03253  0.03296  0.03212  0.00078
 0.75    3.0  0.00117  0.00233  0.00350  0.00351  0.00349  0.00000
 0.75    4.0  0.00012  0.00025  0.00037  0.00037  0.00037  0.00000
 0.75    5.0  0.00001  0.00003  0.00004  0.00004  0.00004  0.00000
 0.75    6.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 0.75    7.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 0.75    8.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 0.75    9.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 0.75   10.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000

 1.00    1.0  0.04979  0.09284  0.13800  0.14205  0.13432  0.02435
 1.00    2.0  0.00248  0.00491  0.00738  0.00742  0.00734  0.00002
 1.00    3.0  0.00012  0.00025  0.00037  0.00037  0.00037  0.00000
 1.00    4.0  0.00001  0.00001  0.00002  0.00002  0.00002  0.00000
 1.00    5.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 1.00    6.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 1.00    7.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 1.00    8.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 1.00    9.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
 1.00   10.0  0.00000  0.00000  0.00000  0.00000  0.00000  0.00000
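The path and cut quantities of this example can be recomputed numerically. The following is a minimal sketch, not the author's procedure: Fig. 1 is not reproduced here, so the arc labels `a` to `g` and the three path sets are hypothetical, chosen only so that there are three paths of three arcs each and nine minimal cuts, as stated in the text. The simple path bounds (largest path reliability as lower bound, parallel combination of the paths as upper bound) are likewise an assumption; the EP bound follows the product form of (25).

```python
import math
from itertools import combinations, product

# Hypothetical arc labels; the path sets are an assumption chosen to match
# "three paths" of three arcs and "nine minimal cuts" in the text.
ARCS = "abcdefg"
PATHS = [set("abc"), set("ade"), set("fge")]


def minimal_cuts():
    """Enumerate minimal arc sets that intersect every path."""
    cuts = []
    for size in range(1, len(ARCS) + 1):
        for combo in combinations(ARCS, size):
            cand = set(combo)
            hits_all = all(cand & path for path in PATHS)
            # Smaller cuts were found first, so this test ensures minimality.
            if hits_all and not any(cut <= cand for cut in cuts):
                cuts.append(cand)
    return cuts


def bounds(lam, tau):
    """Return (path lower bound, path upper bound, exact R, EP bound)."""
    p = math.exp(-lam * tau)  # survival probability of a single arc
    path_rel = [p ** len(path) for path in PATHS]
    lower = max(path_rel)
    upper = 1 - math.prod(1 - r for r in path_rel)
    ep = math.prod(1 - (1 - p) ** len(cut) for cut in minimal_cuts())
    # Exact reliability by enumerating all up/down states of the arcs.
    exact = 0.0
    for state in product((0, 1), repeat=len(ARCS)):
        up = {arc for arc, s in zip(ARCS, state) if s}
        if any(path <= up for path in PATHS):
            exact += p ** len(up) * (1 - p) ** (len(ARCS) - len(up))
    return lower, upper, exact, ep


if __name__ == "__main__":
    print(sorted(len(c) for c in minimal_cuts()))
    print([round(v, 5) for v in bounds(0.1, 1.0)])
```

Under these assumptions the enumeration yields five cuts of size two and four of size three, in line with the exponents in (25), and the values for $\lambda = 0.1$, $\tau = 1$ can be compared with the columns L1, U1, R and EP of Table 1.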
References
[1] R. E. Barlow and F. Proschan, Availability theory for multicomponent systems, in: P. R. Krishnaiah, ed., Multivariate Analysis III (Academic Press, New York, 1973) 319-335.
[2] R. E. Barlow and F. Proschan, Statistische Theorie der Zuverlässigkeit (Verlag Harry Deutsch, Frankfurt/Main, 1978).
[3] R. E. Barlow and A. S. Wu, Coherent systems with multistate components, Math. Oper. Research 3 (1978) 275-281.
[4] L. D. Bodin, Approximations to systems reliability using a modular decomposition, Technometrics 12 (1970) 335-344.
[5] R. W. Butterworth, A set theoretic treatment of coherent systems, SIAM J. Appl. Math. 22 (1972) 590-598.
[6] H. J. Cleef and W. Gaul, A stochastic flow problem, J. Information & Optimization Sc. 1 (1980) 229-270.
[7] H. J. Cleef and W. Gaul, Project scheduling via stochastic programming, Mathem. Operationsf. & Statistik, Ser. Optimization 13 (1982) 449-468.
[8] E. El-Neweihi and F. Proschan, Multistate reliability models: A survey, in: P. R. Krishnaiah, ed., Multivariate Analysis V (North-Holland Publishing Company, 1980) 523-541.
[9] E. El-Neweihi, F. Proschan and J. Sethuraman, Multistate coherent systems, J. Appl. Probability 15 (1978) 675-688.
[10] J. D. Esary and A. W. Marshall, System structure and the existence of a system life, Technometrics 6 (1964) 459-462.
[11] J. D. Esary and F. Proschan, Coherent structures of nonidentical components, Technometrics 5 (1963) 191-209.
[12] J. D. Esary and F. Proschan, A reliability bound for systems of maintained interdependent components, J. Amer. Statist. Association 65 (1970) 329-338.
[13] J. D. Esary, F. Proschan and D. W. Walkup, Association of random variables, with applications, Ann. Math. Statistics 38 (1967) 1466-1474.
[14] O. Frank and W. Gaul, On reliability in stochastic graphs, Networks 12 (1982) 119-126.
[15] K. W. Gaede, Zuverlässigkeit, Mathematische Modelle (Verlag Karl Hanser, München, 1977).
[16] W. Gaul, Some structural properties of project digraphs, J. Combinat., Infor. & System Sc. 3 (1978) 217-222.
[17] W. Gaul, Stochastische Aspekte bei anwendungsrelevanten Graphenproblemen, Habilitationsschrift (Universität Bonn, 1980).
[18] W. Gaul and J. Hartung, Multistate reliability problems for gsp-digraphs, Lecture Notes in Economics and Mathematical Systems 240 (1985) 41-53.
[19] F. Harary, R. Z. Norman and D. Cartwright, Structural Models: An Introduction to the Theory of Directed Graphs (John Wiley & Sons, New York, 1965).
[20] K. Jogdeo, Association and probability inequalities, Ann. Statistics 5 (1977) 495-504.
Annals of Discrete Mathematics 28 (1985) 125-130 © Elsevier Science Publishers B. V. (North-Holland)
ELECTRICAL NETWORKS WITH RANDOM RESISTANCES
Geoffrey GRIMMETT
School of Mathematics, University of Bristol, Bristol, England
An electrical network is a graph $G = (V, E)$ with two sets $I$, $O$ of vertices, called input and output vertices, such that each edge $e$ has some electrical resistance $R(e)$ ohms. We suppose that the family $\{R(e): e \in E\}$ is a collection of independent, identically distributed random variables, and we are interested in the effective (random) resistance $R(G)$ of the network $G$ between $I$ and $O$. There are three main cases of interest, when $G$ is a branching tree, or a complete graph or a subsection of some crystalline lattice; for these cases, we discuss the asymptotic properties of $R(G)$ in the limit as $|V| \to \infty$. For the special case when each edge-resistance takes the values 1 and $\infty$ ohms with probabilities $p$ and $1-p$ respectively, these problems deal with the strength of connectivity of random graphs.
1. Introduction
An electrical network contains terminals, some pairs of which are connected by wires of specified resistances. If the positive and negative terminals of a battery are connected to two terminals of the network then a potential is induced at each terminal and currents flow along the wires; these currents, potentials and resistances satisfy certain well-known rules called Kirchhoff's Laws and Ohm's Law. Such a network has an effective resistance which may be calculated using the series and parallel laws for combining resistances, and, from the battery's point of view, the total current flow would remain unchanged if the network were replaced by a single wire with this effective resistance. The computation of the effective resistance of a network can be intricate since the underlying graph may have a complicated structure (although electrical engineers have certain devices for easing this task). We are interested here in studying the effective resistance of a typical network, and towards this goal we shall study electrical networks whose resistances are random variables. Of course, such networks may be very far from being typical in the sense of their being in a general category of practical interest; also, it seems to be a hopeless task to seek exact results
about most specified finite networks with random resistances. Consequently, we restrict our attention to certain types of networks with particularly regular structures, and we discuss limit theorems for the effective resistances of large networks as the number of their terminals approaches infinity. We may think of an electrical network as a type of random graph; we shall suppose that each edge $e$ of a graph $G = (V, E)$ on $n$ vertices has some random resistance $R(e)$ ohms. In common with the theory of random graphs (see Grimmett [1]), we shall study the effective resistance $R(G)$ as $n \to \infty$ for the two cases when $G$ is either a complete labelled graph or a subsection of the square lattice $\mathbb{Z}^2$; similar results hold for other crystalline lattices in dimensions $d \ge 2$. Similar results may be derived for the effective resistances of certain graphs without circuits. Stinchcombe [2] has studied the effective resistance of $G$ when $G$ is a regular branching tree; the absence of circuits in these graphs is a great advantage in applying the laws of probability theory dealing with combinations of random resistances. We shall assume that the family $\{R(e): e \in E\}$ of edge-resistances is a collection of independent, identically distributed random variables with some common distribution function $F(x) = P(R \le x)$ satisfying $F(0-) = 0$. We are particularly interested in the special case when each resistance takes either the value 1 ohm, with probability $p$, or the value $\infty$ ohms, with probability $1 - p$, where $0 \le p \le 1$. This corresponds exactly to the usual situation in the theory of random graphs, in which each edge of some initial graph $G$ is deleted with probability $1 - p$ (and thus transmits no electricity, and has infinite resistance) or remains present with probability $p$ (and has some standard resistance, say 1 ohm). In such a case of Bernoulli resistances, we may sometimes allow the parameter $p$ to be a function $p = p(n)$ of the number $n$ of vertices of $G$.
Various references to the random resistance problem exist in the physical literature. Apart from Stinchcombe [2], who studied trees, several authors have considered electrical currents which flow through subsections of the square lattice (see Stauffer [3] for example), but few rigorous results are known. A related problem is the question of Ford/Fulkerson network flows through graphs with random edge capacities. The rules for combining the capacities of edges in parallel and series differ from those for combining the conductances of an electrical network, and it is not clear that the relationship between the two problems is anything more than superficial. However, Menger's Theorem has an application in the proofs of Section 4. The flows through randomly-capacitated trees and complete graphs have been studied by Grimmett and Welsh [4] and by Grimmett and Suen [5]. The case of randomly-capacitated crystalline lattices is treated by Grimmett and Kesten [6] as an application of first-passage percolation theory. The results of this paper are drawn from Grimmett and Kesten [6, 7].
2. Preliminaries
In this section we establish some notation and review those laws of physics which bear on the problem of electrical networks. Let $G = (V, E)$ be a finite connected graph without loops or multiple edges and with vertex set $V = \{0, 1, 2, \ldots, n, \infty\}$. Vertices 0 and $\infty$ are specially designated as those to which our battery will be connected. We write $i \sim j$ if vertices $i$ and $j$ are adjacent and denote by $e_{ij}$ the edge joining them in this case. We shall often discuss ordered pairs of adjacent vertices and denote such a pair by $(i, j)$; if the context is clear, then we may sometimes use $(i, j)$ to represent the edge $e_{ij}$ also. Suppose that the vertices of $G$ are terminals in an electrical network and that the edges are connections between pairs of terminals. Suppose further that each edge $e$ of $G$ has some non-random non-negative resistance $r(e)$ ohms, and that a potential difference of 1 volt is imposed across $G$ by the connection of a battery to the terminals labelled 0 and $\infty$ (see Figure 1). The following experimental facts are well-known.
If $e_{ij}$ joins the ordered pair $(i, j)$ of adjacent vertices, then some current of (say) $c_{ij}$ amps flows from $i$ to $j$ along $e_{ij}$ (note that $c_{ij} = -c_{ji}$). (2.1)
Kirchhoff's First Law. There exists a potential function $\varphi: V \to \mathbb{R}$ such that $\varphi(0) = 0$, $\varphi(\infty) = 1$, so that the potential difference between the pair $(i, j)$ is $p_{ij} = \varphi(i) - \varphi(j)$ (note that $p_{ij} = -p_{ji}$). (2.2)
Fig. 1. A simple electrical network.
Kirchhoff's Second Law. The aggregate current entering any vertex other than 0 or $\infty$ equals zero. That is to say, for all $i \ne 0, \infty$,
$$\sum_{j \sim i} c_{ji} = 0. \tag{2.3}$$
Ohm's Law. For each adjacent pair $(i, j)$ the potential difference, current and resistance satisfy
$$p_{ij} = c_{ij}\, r(e_{ij}). \tag{2.4}$$
It may be shown (see Bollobás [8, p. 32] and Kesten [9, Ch. 11]) that there is a set of potentials and currents such that these laws hold. The total current flowing through $G$ is
$$c(G) = \sum_{j \sim 0} c_{j0} = \sum_{j \sim \infty} c_{\infty j}, \tag{2.5}$$
and the effective resistance of $G$ is defined to be
$$r(G) = \frac{1}{c(G)} \tag{2.6}$$
in accordance with Ohm's Law. The series and parallel laws for combining resistances are simple consequences of the above laws, and state that the effective resistance of wires in series is the sum of their resistances, and the reciprocal of the resistance of wires in parallel is the sum of their reciprocals: two wires with resistances $r_1$ and $r_2$ in series have effective resistance $r = r_1 + r_2$, and two such wires in parallel have effective resistance $r$ such that
$$\frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2}.$$
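It is an important observation that the potential function $\varphi$ is a harmonic function in the following sense. If $i$ is a vertex other than 0 or $\infty$ then its potential $\varphi(i)$ is a weighted average of the potentials of its neighbours:
$$\varphi(i) = \frac{\sum_{j \sim i} \varphi(j)\, r(e_{ij})^{-1}}{\sum_{j \sim i} r(e_{ij})^{-1}} \quad \text{for } i \ne 0, \infty. \tag{2.7}$$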
This is an immediate consequence of the above laws (2.3), (2.4), and is useful in the analysis of the problem. Equation (2.7) provides a useful link between electrical network theory and random walks. We do not explore this here, but note only the following. Suppose for simplicity that $r(e) = 1$ for all $e \in E$; in this case $\varphi(i)$ is the average of the potentials of the neighbours of $i$:
$$\varphi(i) = \frac{1}{d(i)} \sum_{j \sim i} \varphi(j), \tag{2.8}$$
where $d(i)$ is the degree of $i$. Now consider a particle which performs a random walk about the vertices of $G$. If at some epoch of time it is at vertex $i$ ($\ne 0, \infty$) then we assume that it moves during the next time interval to one of the neighbours of $i$, each such vertex being chosen with equal probability $1/d(i)$. Let 0 and $\infty$ be absorbing vertices, so that the particle stops moving once it has visited either of them, and let $\psi(i)$ be the first-passage probability that, starting from $i$, the particle is absorbed at $\infty$ rather than at 0. It is easy to see that $\psi$ satisfies (2.8) in place of $\varphi$, subject to the same boundary condition $\psi(0) = 0$, $\psi(\infty) = 1$. But there is a unique such solution to (2.8), and so $\psi = \varphi$. Electrical potential theory now provides insight into the random walk: an example of this is an elegant approach to the recurrence and transience of symmetric random walks on $\mathbb{Z}^2$ and $\mathbb{Z}^3$, respectively. See Griffeath and Liggett [10], Lyons [11] and Grimmett and Stirzaker [12, p. 273] for more details.
In the following, we shall assume that the resistances of $G$ are a family $\{R(e): e \in E\}$ of non-negative independent random variables with the shared distribution function $F(x) = P(R \le x)$, where $R$ denotes a typical resistance. Our results concern convergence of sequences of random variables. We usually consider convergence in probability only, although other modes of convergence may hold in addition. We often use upper case letters to denote random variables and lower case letters to denote non-random quantities. Thus, in a deterministic network there exist potentials $\varphi$, currents $c$ and resistances $r$, whilst in a random network these quantities are represented by $\Phi$, $C$ and $R$.
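The laws of this section translate directly into a small linear system. The sketch below is an illustration only, not a method from the paper: it fixes the potential at 0 on the source and 1 on the sink, solves the weighted-average condition at every other vertex exactly with rational arithmetic, and then divides the unit potential difference by the total current into the source. The function names, the edge-dictionary input format and the label `"inf"` for the second battery terminal are all assumptions made for this example, and every non-terminal vertex is assumed to be connected to a terminal.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A is square)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def effective_resistance(edges, source=0, sink="inf"):
    """edges maps a pair (i, j) to a positive resistance.

    Returns (effective resistance, potentials) for a 1 volt battery
    with potential 0 at `source` and 1 at `sink`.
    """
    interior = sorted({v for e in edges for v in e} - {source, sink}, key=str)
    idx = {v: k for k, v in enumerate(interior)}
    n = len(interior)
    A = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(0)] * n
    # Each interior vertex: sum over neighbours of (phi(i) - phi(j)) / r = 0.
    for (i, j), r in edges.items():
        g = 1 / Fraction(r)  # conductance of the edge
        for u, v in ((i, j), (j, i)):
            if u in idx:
                A[idx[u]][idx[u]] += g
                if v in idx:
                    A[idx[u]][idx[v]] -= g
                elif v == sink:
                    b[idx[u]] += g  # phi(sink) = 1 moves to the right side
    phi = {source: Fraction(0), sink: Fraction(1)}
    phi.update(zip(interior, solve(A, b)))
    # Total current entering the source, by Ohm's law on incident edges.
    current = Fraction(0)
    for (i, j), r in edges.items():
        if source in (i, j):
            other = j if i == source else i
            current += (phi[other] - phi[source]) / Fraction(r)
    return 1 / current, phi


if __name__ == "__main__":
    # Two wires in series: 2 ohms then 3 ohms.
    print(effective_resistance({(0, 1): 2, (1, "inf"): 3})[0])
```

For two wires of 2 and 3 ohms in series the potential at the middle vertex is 2/5 and the effective resistance is 5, which agrees with the series law. With unit resistances, the same potentials are also the absorption probabilities of the random walk described above.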
3. Random complete networks
Suppose that $G_n$ is the complete graph on the vertex set $\{0, 1, 2, \ldots, n, \infty\}$. We write $R(e)$ for the (random) resistance of an edge $e$, and $F$ for the common distribution function of the $R$'s. We impose a potential difference of 1 volt between vertices 0 and $\infty$ and denote by $R_n$ the effective resistance of the network (see Figure 2). We consider a general situation in which the distribution function $F$
Fig. 2. The network $G_n$ with the battery.
may depend upon $n$ in the following way. Let $(p(n): n \ge 1)$ be a sequence of numbers from $[0, 1]$ and let $H$ be the distribution function of a probability measure with support $[0, \infty)$, and assume that $F = F_n$ is given by
$$F_n(x) = p(n) H(x) \quad \text{for } 0 \le x < \infty.$$
Such a distribution function corresponds to the case when each edge of $G_n$ is deleted from the network with probability $1 - p(n)$ independently of all other edges, and the resistances of the edges which remain are distributed as $H$. It is well-known (and easily verified) that, for the simplest case when $p(n) = 1$ for all $n$ and $H$ is concentrated on $\{1\}$, then $nR_n \to 2$ almost surely (a.s.) as $n \to \infty$. Our first result generalizes this.
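The "easily verified" special case can be worked out from (2.8). The short derivation below is a sketch; the symmetry argument is supplied here and is not taken from the source.

```latex
% Unit resistances on the complete graph with vertex set {0, 1, ..., n, \infty}.
% By symmetry all interior vertices share one potential x; (2.8) at an
% interior vertex (degree n + 1) gives
\[
  x = \frac{0 + 1 + (n-1)x}{n+1}
  \quad\Longrightarrow\quad x = \tfrac{1}{2}.
\]
% Current into vertex 0: one unit along the edge from \infty, plus 1/2 from
% each of the n interior vertices, so
\[
  c(G_n) = 1 + \frac{n}{2} = \frac{n+2}{2},
  \qquad
  R_n = \frac{1}{c(G_n)} = \frac{2}{n+2},
  \qquad
  nR_n = \frac{2n}{n+2} \to 2 .
\]
```

In this deterministic case the convergence is of course non-random; the theorem that follows treats the case of random resistances.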
Theorem 1. If $np(n) \to \infty$ as $n \to \infty$, then
$$np(n) R_n \to 2 \left( \int_{[0, \infty)} x^{-1}\, dH(x) \right)^{-1} \quad \text{in probability}$$
as $n \to \infty$, where the limit is interpreted as 0 if the integral equals $\infty$.
The proof is quite long and, together with the proofs of subsequent results of this section, may be found in Grimmett and Kesten [7]. The condition that $np(n) \to \infty$ is a natural one for the following reason. The degree $d_n(i)$ of vertex
$i$ is binomially distributed with parameters $(n+1)$ and $p(n)$, and has mean value $(n+1)p(n)$. If $p(n) \to 0$ as $n \to \infty$, but $np(n) \to \lambda$, where 0,
(iii) if $\lambda = \infty$ then $d_n(i)$ is such that
is asymptotically normally distributed with zero mean and unit variance.
In particular, if $np(n) \to \lambda \in [0, \infty)$ then each vertex degree remains a.s. finite as $n \to \infty$, whilst if $np(n) \to \infty$ then they grow a.s. beyond all bounds. This is closely related to results of Erdős and Rényi [13], who studied the connectedness of random graphs. As they observed, a type of "critical phenomenon" occurs at the value $\lambda = 1$ of the parameter $\lambda$. We describe this in the next theorem.
Theorem 2. Suppose that $np(n) \to \lambda$, where $0 \le \lambda < \infty$. (a) If $\lambda < 1$ then
$$P(R_n = \infty) \to 1 \quad \text{as } n \to \infty.$$
(b) If $\lambda > 1$ then there exists a distribution function $J$ with support $[0, \infty]$ such that
$$R_n \to X + Y \quad \text{in distribution},$$
where $X$ and $Y$ are independent random variables with distribution function $J$. Also, the atom of $J$ at $\infty$ equals the smaller root $q$ of the equation $q = \exp(-\lambda(1 - q))$.
The distribution function $J$ of part (b) above arises in the following way. Let $T$ be the family tree of a Bienaymé-Galton-Watson branching process in which the offspring-distribution is the Poisson distribution with parameter $\lambda$ ($\ge 0$), and let $T_n$ be the subtree of $T$ induced by the members of the first $n$ generations of $T$. To each edge $e$ of $T$ we assign a resistance $R(e)$, where $\{R(e): e \in T\}$ is a family of independent random variables with distribution function $H$. Consider the electrical network obtained by connecting a battery between the root of $T$ and a "collective" vertex obtained by short-circuiting all the vertices in the $n$-th generation of $T$; we write $R(T_n)$ for the effective resistance of this network. It is clear (though not quite so easy to prove - see Kesten [9]) that $R(T_n) \le R(T_{n+1})$, and hence $R = \lim_{n \to \infty} R(T_n)$ exists (but may be infinite). Then $J$ is defined to be the distribution function of $R$:
$$J(x) = P(R \le x) \quad \text{for } 0 \le x \le \infty.$$
Of course, $R$ has an atom at $\infty$ which is at least as large as the probability $q(\lambda)$ that $T$ is finite (and the branching process becomes extinct). In fact, equality holds here in that
$$P(R = \infty) = q(\lambda),$$
and $q(\lambda)$ is the smaller solution to the equation $q = \exp(-\lambda(1 - q))$. The proof of Theorem 2 is long, although the basic idea is quite simple and proceeds as follows. We call an edge $e$ of $G_n$ insulating (respectively conducting) if $R(e) = \infty$ (respectively $R(e) < \infty$). The subgraph $G_n(C)$ of $G_n$, containing the conducting edges only, is a random graph (in the usual sense of the expression) in which each edge is present with probability $p(n)$. We denote by $T_k^{0}$ (respectively $T_k^{\infty}$) the set of vertices of $G_n(C)$ which have distance $k$ from 0 (respectively $\infty$) in $G_n(C)$. Suppose that $np(n) \to \lambda$, where $0 \le \lambda < \infty$, and let $K$ be a fixed positive integer. It turns out that the two subgraphs of $G_n(C)$ induced by the vertex sets $\{0\} \cup T_1^{0} \cup \ldots \cup T_K^{0}$ and $\{\infty\} \cup T_1^{\infty} \cup \ldots \cup T_K^{\infty}$ converge in distribution, as $n \to \infty$, to independent copies of the first $K$ generations of the family tree of a Bienaymé-Galton-Watson branching process in which the offspring-distribution is the Poisson distribution with parameter $\lambda$. If $\lambda < 1$ then such a branching process becomes extinct a.s., implying that for large $n$ there is, with probability $1 - o(1)$, no path from 0 to $\infty$ in $G_n(C)$; thus if $\lambda < 1$,
$$P(R_n = \infty) \to 1 \quad \text{as } n \to \infty.$$
If $\lambda > 1$ then there is positive probability that such a branching process is infinite, and it is now not too difficult to see that
$$\liminf_{n \to \infty} R_n \overset{D}{\ge} R_1(T_K) + R_2(T_K), \tag{3.2}$$
where $R_1(T_K)$ and $R_2(T_K)$ are independent copies of $R(T_K)$, and we write $X \overset{D}{\ge} Y$ if $P(X \ge z) \ge P(Y \ge z)$ for all $z$. We let $K \to \infty$ in (3.2) to obtain part of Theorem 2(b). The bulk of the proof of Theorem 2(b) is devoted to showing a corresponding upper bound for $\limsup_{n \to \infty} R_n$.
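The atom of $J$ at $\infty$ is the smaller root of $q = \exp(-\lambda(1 - q))$, which is easy to evaluate numerically. The following is a minimal sketch, assuming plain fixed-point iteration started at $q = 0$ (a standard way to reach the smallest non-negative fixed point of a probability generating function; it is not a procedure from the paper). The function name and tolerance are choices made for this illustration.

```python
import math


def extinction_probability(lam, tol=1e-13, max_iter=100_000):
    """Smallest root in [0, 1] of q = exp(-lam * (1 - q)).

    Iterating from q = 0 gives an increasing sequence that converges to
    the extinction probability of a Poisson(lam) branching process.
    """
    q = 0.0
    for _ in range(max_iter):
        nxt = math.exp(-lam * (1.0 - q))
        if abs(nxt - q) < tol:
            return nxt
        q = nxt
    return q  # not converged within max_iter (slow when lam is near 1)


if __name__ == "__main__":
    for lam in (0.5, 1.5, 2.0, 3.0):
        print(lam, extinction_probability(lam))
```

For $\lambda < 1$ the iteration converges to 1, matching certain extinction, while for $\lambda > 1$ it returns a value strictly below 1. Convergence is slow near the critical value $\lambda = 1$.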
4. Lattice graphs
Physicists are mostly concerned with the case when the underlying graph $G$ is part of a crystalline lattice, and we consider this case next. We restrict ourselves to subgraphs of the plane square lattice, but note that similar results should hold for other two-dimensional lattices. Let $\mathbb{Z}^2$ be the plane square lattice with vertices $\{(x, y): x, y = 0, \pm 1, \pm 2, \ldots\}$ and edges joining all pairs of vertices which are unit distance apart. Let $p$ satisfy $0 \le p \le 1$, and let $H$ be the distribution function of a probability measure concentrated on $[0, \infty)$. To each edge $e$ of $\mathbb{Z}^2$ we assign a resistance $R(e)$, where $\{R(e): e \in \mathbb{Z}^2\}$ is a family of independent, identically distributed random variables whose common distribution function $F$ is given by
$$F(x) = pH(x) \quad \text{for } 0 \le x < \infty.$$
Thus, any given edge is insulating (in that its resistance is infinite) with probability $1 - p$, whilst with probability $p$ it is conducting and its resistance has distribution function $H$. Just as the networks of the previous section are related directly to random graphs, so are these lattice networks related to the bond percolation model (see Kesten [9, 14] for recent results). In the percolation model on $\mathbb{Z}^2$, edges of $\mathbb{Z}^2$ are deleted independently with probability $1 - p$; percolation theory studies the way that the structure of the remaining components depends on the numerical value of the parameter $p$. Let $S_n$ be the square section of $\mathbb{Z}^2$ induced by the vertices {(x, y): 0 ≤ x, y
Fig. 3. $S_n$ as an electrical network.
left-hand side to the right-hand side of $S_n$ approaches 0 as $n \to \infty$; thus $P(R_n = \infty) \to 1$ as $n \to \infty$ if $p < \frac{1}{2}$. On the other hand, we can show (Grimmett and Kesten [6] or Kesten [9]) that $R_n$ is uniformly bounded in $n$ if $p > \frac{1}{2}$.
Theorem 3. (a) If $p < \frac{1}{2}$ then $P(R_n = \infty) \to 1$ as $n \to \infty$. (b) If $p > \frac{1}{2}$ then there exists $\nu(p) < \infty$ such that
$$\left( p \int x^{-1}\, dH(x) \right)^{-1} \le \liminf_{n \to \infty} R_n \le \limsup_{n \to \infty} R_n \le \nu(p) \int x\, dH(x) \quad \text{a.s.},$$
the first term being interpreted as 0 if $\int x^{-1}\, dH(x) = \infty$.
Conjecture. If $p > \frac{1}{2}$ then $R_\infty = \lim_{n \to \infty} R_n$ exists a.s.
Results similar to Theorem 3 should hold for any planar two-dimensional lattice $L$, the critical value $p = \frac{1}{2}$ being replaced by the critical probability of the bond percolation process on $L$. A related open problem concerns cubic lattices in dimensions $d \ge 3$. Consider $\mathbb{Z}^3$ for example, and suppose (for simplicity) that each edge of $\mathbb{Z}^3$ is a wire whose resistance equals 1 with probability $p$, or equals $\infty$ with probability $1 - p$, independently of all other edge-resistances. Let $R_n(3)$ be the effective resistance of a cube, side length $n$, between opposite faces of which a battery has been connected. Denoting by $W$ the set of vertices of $\mathbb{Z}^3$ which are attainable from the origin by paths of conducting edges, we define the critical probability
We can prove the following.
We are tempted to conjecture the following: (I) the conclusion of (b) remains valid under the condition $p > p_T$, (II) the sequence $\{nR_n(3)\}$ converges as $n \to \infty$, if $p > p_3$.
References
[1] G. R. Grimmett, Random graphs, in: Selected Topics in Graph Theory II, L. Beineke and R. Wilson, eds. (Academic Press, London, 1983) 201-235.
[2] R. B. Stinchcombe, Conductivity and spin-wave stiffness in disordered systems - an exactly soluble model, J. Physics C: Solid State Phys. 7 (1973) 179-203.
[3] D. Stauffer, Scaling theory of percolation clusters, Physics Reports 53 (1979) 1-79.
[4] G. R. Grimmett and D. J. A. Welsh, Flow in networks with random capacities, Stochastics 7 (1982) 205-229.
[5] G. R. Grimmett and W. C. Suen, The maximal flow through a directed graph with random capacities, Stochastics 8 (1982) 153-159.
[6] G. R. Grimmett and H. Kesten, First-passage percolation, network flows and electrical networks, Z. Wahrsch'theorie verw. Geb. 66 (1984) 335-366.
[7] G. R. Grimmett and H. Kesten, Random electrical networks on complete graphs, J. London Math. Soc. 30 (1983) 171-192.
[8] B. Bollobás, Graph Theory, An Introductory Course (Springer, Berlin, 1979).
[9] H. Kesten, Percolation Theory for Mathematicians (Birkhäuser, Boston, 1982).
[10] D. Griffeath and T. Liggett, Critical phenomena for Spitzer's reversible nearest particle systems, Ann. Probability 10 (1982) 881-895.
[11] T. Lyons, A simple criterion for transience of a reversible Markov chain, Ann. Probability 11 (1983) 393-402.
[12] G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes (Clarendon Press, Oxford, 1982).
[13] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5A (1960) 17-61.
[14] H. Kesten, Exact results for percolation, Advances in Mathematics, to appear.
Annals of Discrete Mathematics 28 (1985) 137-158 © Elsevier Science Publishers B. V. (North-Holland)
A RANDOM BIPARTITE MAPPING
A random bipartite mapping $(T; P_j, Q_i)$ of a finite set $V = V_1 \cup V_2$ into itself is considered. We determine the exact distributions of several numerical characteristics (for example the number of connected components, cyclical points, predecessors and successors of a given point) of such a random mapping. The asymptotic behaviour of the above random variables is studied in the special case ($P_j = 1/|V_2|$, $Q_i = 1/|V_1|$).
1. Introduction
Let $V_1 = \{1, 2, \ldots, K\}$ and $V_2 = \{K+1, K+2, \ldots, K+L\}$. Consider a random bipartite mapping $(T; P_j, Q_i)$ of $V = V_1 \cup V_2$ into itself which was introduced in [3], i.e. such that for all $i \in V$ independently
$$\Pr\{T(i) = j\} = \begin{cases} P_j & \text{for } j \in V_2,\ i \in V_1, \\ Q_j & \text{for } j \in V_1,\ i \in V_2, \\ 0 & \text{otherwise,} \end{cases}$$
and $Q_1 + Q_2 + \ldots + Q_K = 1$; $P_{K+1} + P_{K+2} + \ldots + P_{K+L} = 1$. Each mapping $T$ is a digraph $G_T$, the vertices of which belong to the set $V_1 \cup V_2$; the vertices $i$ and $j$ are joined by an arc iff $j = T(i)$. It is well known that each connected component of $G_T$ consists of exactly one oriented cycle and trees.
Denote by $S_T(i)$ the set of successors of $i$ in $T$, i.e.
$$S_T(i) = \{i, T(i), T^2(i), \ldots\} = \{j \in V;\ T^k(i) = j \text{ for some non-negative integer } k\}$$
and by $P_T(i)$ the set of predecessors of $i$ in $T$, i.e.
$$P_T(i) = \{j \in V;\ T^k(j) = i \text{ for some non-negative integer } k\}.$$
In [3] we have determined the probability of the connectedness of a random bipartite mapping $(T; P_j, Q_i)$ and developed a recursive formula for the distribution of the number $C$ of connected components. We have also derived an expression for the expectation $E(C)$ and variance $\mathrm{Var}(C)$. Finally, we have applied these results in the special case when $P_j \equiv 1/L$ and $Q_i \equiv 1/K$. Using the same method we have applied for a random mapping $(T, P_j)$ [3], we shall determine here exact formulas for distributions of several numerical characteristics related to random bipartite mappings. We shall consider distributions of the following random variables:
$C$: the number of connected components,
$C_l$: the number of cycles of length $2l$,
$q$: the number of cyclical vertices,
$s(i)$: the number of successors of a given vertex $i$,
$p(i)$: the number of predecessors of a given vertex $i$,
$l(i)$: the number of cyclical vertices in the connected component to which a given vertex $i$ belongs.
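To make these random variables concrete, the sketch below samples one bipartite mapping in the uniform special case and extracts its successor sets and cycles. It is an illustration only; the function names are hypothetical, and the uniform choice of images (each vertex of $V_1$ maps to a uniform vertex of $V_2$ and vice versa) is the special case mentioned in the text, not the general model.

```python
import random


def sample_bipartite_mapping(K, L, rng):
    """Uniform case: V1 = {1..K} maps into V2 = {K+1..K+L} and back."""
    T = {}
    for i in range(1, K + 1):
        T[i] = rng.randint(K + 1, K + L)   # image of a V1 vertex lies in V2
    for i in range(K + 1, K + L + 1):
        T[i] = rng.randint(1, K)           # image of a V2 vertex lies in V1
    return T


def successors(T, i):
    """Vertices i, T(i), T^2(i), ... listed until the first repetition."""
    order, seen = [], set()
    while i not in seen:
        seen.add(i)
        order.append(i)
        i = T[i]
    return order


def cycles(T):
    """All cycles of the digraph of T, each as a list in traversal order."""
    found, on_cycle = [], set()
    for start in T:
        path = successors(T, start)
        entry = T[path[-1]]                # first vertex visited twice
        cycle = path[path.index(entry):]
        if cycle[0] not in on_cycle:       # cycles are disjoint
            on_cycle.update(cycle)
            found.append(cycle)
    return found


if __name__ == "__main__":
    rng = random.Random(2024)
    K, L = 5, 8
    T = sample_bipartite_mapping(K, L, rng)
    cyc = cycles(T)
    print("C =", len(cyc))                       # components = cycles
    print("q =", sum(len(c) for c in cyc))       # cyclical vertices
    print("s(1) =", len(successors(T, 1)))
```

Since arcs alternate between $V_1$ and $V_2$, every cycle found this way has even length and contains equally many vertices from each side.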
Obviously, in the bipartite model, $p(i) = p_1(i) + p_2(i)$, where $p_1(i) = |P_T(i) \cap V_1|$ and $p_2(i) = |P_T(i) \cap V_2|$ (as usual, for $S \subset V$, $|S|$ denotes the cardinality of the set $S$). In an analogous way we can define $q_1$, $q_2$, $s_1(i)$, $s_2(i)$, $l_1(i)$, $l_2(i)$, but it should be noted that $q_1 = q_2$, $l_1(i) = l_2(i)$ and $s_1(i) = s_2(i) + c(i)$, where $c(i) \in \{-1, 0, 1\}$. It is easy to see that the events "$j \in S_T(i)$," "$i \in P_T(j)$" and "there exists a path from $i$ to $j$" are identical. Note also that the event $\{C = k\}$ ($\{q = l\}$) is determined by the products of independent events: "$S = S_1 \cup S_2$ is a sum of $k$ cycles" ("$S = S_1 \cup S_2$ is a sum of cycles, $|S| = l$") and "there is no cycle in $S_1^c \cup S_2^c$," where $S_i \subset V_i$, $S_i^c = V_i \setminus S_i$ for $i = 1, 2$, and by "$S$ is ..." we mean that "vertices of $S$ form ... ." Exact formulas for probabilities of the above events are given in the next section. In Section 3 exact distributions of random variables $C$, $C_l$, $q$, $s(i)$, $p(i)$, $l(i)$ are obtained. In the special case, when $P_j \equiv 1/L$ and $Q_i \equiv 1/K$, which is studied in Section 4, we investigate also the asymptotic behaviour of random variables for $L \ge K \to \infty$ (more exactly: $L(n) \ge K(n) \to \infty$ as $K + L = n \to \infty$). Finally, we discuss briefly a generalization of the model $(T; P_j, Q_i)$. Throughout the paper we shall use the following notation for events:
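$OC_k[S_1, S_2]$: "$S_1 \cup S_2$ is a sum of $k$ cycles"
$OC_k^l[S_1, S_2]$: "$S_1 \cup S_2$ is a sum of cycles which has exactly $k$ cycles of the length $2l$"
$OC[S_1, S_2]$: "$S_1 \cup S_2$ is a sum of cycles"
$NC[S_1, S_2]$: "there is no cycle in $S_1 \cup S_2$"
and probabilities
$$P(S) = \prod_{i \in S} P_i \quad \text{for } \emptyset \ne S \subset V_2, \qquad P(\emptyset) = 1,$$
$$Q(S) = \prod_{i \in S} Q_i \quad \text{for } \emptyset \ne S \subset V_1, \qquad Q(\emptyset) = 1,$$
$$R(S, P) = \sum_{i \in S} P_i \quad \text{for } \emptyset \ne S \subset V_2, \qquad R(\emptyset, P) = 0,$$
$$R(S, Q) = \sum_{i \in S} Q_i \quad \text{for } \emptyset \ne S \subset V_1, \qquad R(\emptyset, Q) = 0.$$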
2. Basic lemmas
Let $I(i, j)$ be a random variable such that for $i, j \in V$
$$I(i, j) = \begin{cases} 1 & \text{if there exists a path from } i \text{ to } j, \\ 0 & \text{otherwise.} \end{cases}$$
Pr { I ( i , j ) = I ] =
Proof. Let i e V , , ~ E1f2 and S = S , US?, s , c v , - { ~ S} ~, C V , - ( ~ }I f. S u { i , i ) is a vertex-set o f a direcleci pat11 from i to,j, ttien anci there are [ J s , ~ ! ] ~ ways of obtaining this path. Moreover-, cvcr-y path of this type exists with probability P,P(S‘,)Q(S,). S’nce there exists exactly one directed path from i to j in GT, we obtain the lirst l‘orinula in Leinina 1. I n an analogous way one can determine the probabilities of the event “ I ( i , j ) = 1” in the other cases.
Is, I =Is, I
140
J. Jaworski
Proof. Note t h a t if S , LI S 2 is ;i siiiii of k cycle!; t.heii every siim o f iltis type exists with probability P(S,)Q(S,) and, obviously, only one sucli sum can occur in G,.. Hence we have to pl-ove only that there are ISI]!(.(ISI(,k)]ways to forin ;1 siiiii of k cycles fro111 vertices S1 u S2. Let o be a permutation on the set ( I , 2, ..., r } and G, be a digraph reprcsenting 0 (it consists of cyclcs only). Replace each or I' arcs of G, by two arcs in the Tollowing way:
There are $r!$ ways to label the $r$ new points in the resulting bipartite digraph. Therefore, from any permutation on $r$ points we can obtain $r!$ different digraphs on $2r$ points which consist of cycles only. Moreover, the number of these cycles is the same as in $G_\sigma$ (more exactly: the number of cycles of length $2l$ in this digraph is equal to the number of cycles of length $l$ in $G_\sigma$). It is obvious that if $S_1 \cup S_2$ is a sum of cycles then it is a bipartite digraph of the same type as discussed above. Therefore, using the fact that there are $|s(r, k)|$ permutations on $r$ points with exactly $k$ cycles (see [6, p. 203]), we obtain the first desired probability. Note that there are
permutations on $r$ points with $k$ cycles of length $l$ and
Hence the two other probabilities can be obtained by arguments similar to those used for $\Pr(OC_k[S_1, S_2])$. □
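The counting step in the proof above can be checked by brute force for small $r$. The following is a minimal sketch, not the author's method: it assumes that a sum of cycles on $S_1 \cup S_2$ with $|S_1| = |S_2| = r$ is encoded as a pair of bijections (the names `f1`, `f2` and all function names are illustrative), and it compares the number of such pairs having exactly $k$ cycles with $r!\,|s(r, k)|$.

```python
from itertools import permutations
from math import factorial

def stirling_unsigned(r, k):
    """Unsigned Stirling number of the first kind |s(r, k)|, computed with
    the recurrence c(r, k) = c(r-1, k-1) + (r-1) * c(r-1, k)."""
    if r == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return stirling_unsigned(r - 1, k - 1) + (r - 1) * stirling_unsigned(r - 1, k)

def count_cycles(perm):
    """Number of cycles of a permutation given as perm[i] = image of i."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = perm[v]
    return cycles

def bipartite_cycle_sums(r):
    """For |S1| = |S2| = r, count pairs of bijections (f1: S1 -> S2,
    f2: S2 -> S1) by the number of cycles of the resulting digraph."""
    counts = {}
    for f1 in permutations(range(r)):
        for f2 in permutations(range(r)):
            # Two consecutive arcs give a permutation of S1; a cycle of
            # length l of that permutation is a cycle of length 2l here.
            composed = tuple(f2[f1[i]] for i in range(r))
            k = count_cycles(composed)
            counts[k] = counts.get(k, 0) + 1
    return counts

r = 3
brute = bipartite_cycle_sums(r)
formula = {k: factorial(r) * stirling_unsigned(r, k) for k in range(1, r + 1)}
print(brute == formula)
```

The enumeration grows like $(r!)^2$, so it is only meant as a sanity check for very small $r$.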
Remark. It is easy to see that the probabilities of the events $OC[S_1, S_2]$, $OC_k[S_1, S_2]$, $OC_k^l[S_1, S_2]$, where $S_1 \subset V_1$, $S_2 \subset V_2$, are equal to $0$ if $|S_1| \ne |S_2|$.
Lemma 3. Let $U_1 \subset V_1$, $U_2 \subset V_2$; then
$$\Pr\{NC[U_1, U_2]\} = R(U_1^c, Q) + R(U_2^c, P) - R(U_1^c, Q) R(U_2^c, P) = 1 - R(U_1, Q) R(U_2, P).$$
Proof. The proof is by induction on $r = |U_1| + |U_2|$, and as it is obvious for $r = 1$, assume that the result holds for all values less than $r$. But
and from the independence of the events $NC[U_1 - S_1, U_2 - S_2]$ and $OC[S_1, S_2]$, by Lemma 2 and the induction hypothesis, we have
where the last equation follows from the fact that
Remark. The above lemma is equivalent to the lemma from [3].
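Lemma 3 can be verified exactly on small examples by enumerating all mappings. The sketch below is only an illustration under stated assumptions: vertices of $V_1$ and $V_2$ are numbered from 0, `P[b]` is the probability that a vertex of $V_1$ chooses $b \in V_2$, `Q[a]` is the probability that a vertex of $V_2$ chooses $a \in V_1$, and "no cycle in $U_1 \cup U_2$" is read as "no cycle of $G_T$ lies entirely inside $U_1 \cup U_2$". All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def has_cycle_within(f1, f2, U1, U2):
    """True if some cycle of the mapping lies entirely inside U1 | U2.
    f1[a] is the image in V2 of a in V1; f2[b] is the image in V1 of b in V2."""
    for start in U1:
        a = start
        for _ in range(len(U1)):
            b = f1[a]
            if b not in U2:
                break
            a = f2[b]
            if a not in U1:
                break
            if a == start:
                return True
    return False

def prob_no_cycle(P, Q, U1, U2):
    """Pr{NC[U1, U2]} obtained by summing the probabilities of all mappings
    that have no cycle inside U1 | U2."""
    K, L = len(Q), len(P)
    total = Fraction(0)
    for f1 in product(range(L), repeat=K):
        for f2 in product(range(K), repeat=L):
            weight = Fraction(1)
            for b in f1:
                weight *= P[b]
            for a in f2:
                weight *= Q[a]
            if not has_cycle_within(f1, f2, U1, U2):
                total += weight
    return total

def lemma3(P, Q, U1, U2):
    """Right-hand side of Lemma 3: 1 - R(U1, Q) * R(U2, P)."""
    return 1 - sum(Q[a] for a in U1) * sum(P[b] for b in U2)

P = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # targets in V2, L = 3
Q = [Fraction(1, 4), Fraction(3, 4)]                   # targets in V1, K = 2
U1, U2 = {0, 1}, {0, 2}
print(prob_no_cycle(P, Q, U1, U2), lemma3(P, Q, U1, U2))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.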
3. Probability distributions related to random bipartite mappings

First let us consider the number $C$ of connected components of $G_T$. Obviously, $C$ is also the number of cycles in $G_T$, so we can state
Theorem 1. For $k = 1, 2, \ldots, \min(K, L)$
Proof. The expectation $E(C)$ and variance $\operatorname{Var}(C)$ were obtained earlier in [3]. Note that
Therefore, using Lemmas 2 and 3, we arrive at the thesis.
□
Consider now the number $C_l$ of cycles of length $2l$ in a random bipartite mapping $(T; P_j, Q_i)$.
Theorem 2. For $l = 1, 2, \ldots, \min(K, L)$; $k = 0, 1, \ldots,$
where by $E_k(\cdot)$ we mean the $k$-th factorial moment.
Proof. Note that
so by Lemmas 2 and 3 we have
Since $\lfloor |S_1|/l \rfloor$ and $\lfloor (|S_1| - 1)/l \rfloor$ are different only when $|S_1|$ is a multiple of $l$, the first part of our theorem follows from the last formula. Let $S^{(i)} = S_1^{(i)} \cup S_2^{(i)}$; $S_1^{(i)} \subset V_1$, $S_2^{(i)} \subset V_2$ and $|S_1^{(i)}| = |S_2^{(i)}| = l$ for $i = 1, 2, \ldots, k$. It is easy to see that if $S^{(i)}$ is a cycle for $i = 1, 2, \ldots, k$ then $S^{(1)}, \ldots, S^{(k)}$ have to be disjoint. Hence by Lemma 2, after simple calculations we can obtain the $k$-th factorial moment of the random variable $C_l$. □
Denote now by $q$ the number of cyclical vertices of $G_T$.
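Theorem 3. For $i = 1, 2, \ldots, \min(K, L)$
$$\Pr\{q = 2i\} = [i!]^2 \sum_{\substack{S_1 \subset V_1,\ S_2 \subset V_2 \\ |S_1| = |S_2| = i}} P(S_2) Q(S_1) \; - \; [(i+1)!]^2 \sum_{\substack{S_1 \subset V_1,\ S_2 \subset V_2 \\ |S_1| = |S_2| = i+1}} P(S_2) Q(S_1).$$
Proof. Obviously,
$$\Pr\{q = 2i\} = \sum_{\substack{S_1 \subset V_1,\ S_2 \subset V_2 \\ |S_1| = |S_2| = i}} \Pr\big(OC[S_1, S_2] \cap NC[S_1^c, S_2^c]\big)$$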
and the proof follows in a way analogous to the proofs of Theorems 1 and 2. $E(q)$ and $\operatorname{Var}(q)$ can be calculated directly from the exact distribution or by using an indicator method since
Assume now that $i \in V_1$ and consider the number $s(i)$ of successors of $i$ in $T$ and the number $l(i)$ of cyclical vertices in the connected component to which $i$ belongs.
Theorem 4. For $k = 1, 2, \ldots, \min(K, L)$, $i \in V_1$,
and for $k = 1, 2, \ldots, \min(K - 1, L)$, $i \in V_1$
Theorem 5. For $j = 1, 2, \ldots, \min(K, L)$, $i \in V_1$
h i = iS1i
$\ldots - E^2(l(i))$.
Proof of Theorems 4 and 5. The moments of $s(i)$ and $l(i)$ can be calculated directly from the exact distributions, but for $s(i)$ it is easier to use an indicator method since
$$s(i) = \sum_{j \in V - \{i\}} I(i, j) + 1$$
and $E(I(i, j)) = \Pr\{I(i, j) = 1\}$ is given by Lemma 1. Consider now, as in [4], the joint distribution of $s(i)$ and $l(i)$. Then
$$\Pr\{s(i) = 2k,\ l(i) = 2k\} = \Pr\{i \text{ belongs to a cycle of length } 2k\}$$
and for $k > j$
$$\Pr\{s(i) = 2k,\ l(i) = 2j\} = k!\,(k-2)! \sum_{\substack{S_1 \subset V_1 - \{i\},\ S_2 \subset V_2 \\ |S_1| = k-1,\ |S_2| = k}} P(S_2)\, Q(S_1)\, R(S_1, Q),$$
and calculating the marginal distributions we arrive at the thesis.
□
Remark. One can note that, as in [4], we have the following equation: $E(l(i)) = 1 + \tfrac{1}{2} E(q)$.
Consider finally the distribution of the number of predecessors of a given vertex $i \in V_1$.
Theorem 6. For $k = 1, 2, \ldots, K + L$, $i \in V_1$
Proof. Notice that
$$p(i) = \sum_{j \in V - \{i\}} I(j, i) + 1.$$
Hence the formula for the expectation $E(p(i))$ follows by Lemma 1. Denote by $B[S_1, S_2]$ the event
$$T(S_1^c - \{i\}) \cap S_2 = \emptyset, \quad T(S_2^c) \cap (S_1 \cup \{i\}) = \emptyset, \quad T(S_1) \cap S_2^c = \emptyset, \quad T(S_2) \cap (S_1^c - \{i\}) = \emptyset$$
for $S_1 \subset V_1 - \{i\}$, $S_2 \subset V_2$.
Then
Replacing $P_j$ and $Q_j$ by $P_j / R(S_2, P)$ and $Q_j / R(S_1 \cup \{i\}, Q)$, respectively, for $j \in S_2$ and $j \in S_1 \cup \{i\}$, and using Lemma 3, one can check that
Using the joint distribution of $p_1(i)$ and $p_2(i)$, we arrive at the thesis.
□
Remarks. We have omitted a formula for $\operatorname{Var}(p(i))$ since it is long and complicated. For the same reason we do not consider the following random variables: $k(i)$, the number of vertices in the connected component to which $i$ belongs, and $p(A_1, A_2)$, the number of predecessors of vertices from $A_1 \cup A_2$, where $A_1 \subset V_1$ and $A_2 \subset V_2$. In fact, the distributions of these random variables come directly from the basic lemmas, Theorem 1 and the method of the proof of Theorem 6. Finally, note that for $i \in V_2$ the distributions of $s(i)$, $l(i)$ and $p(i)$ can be obtained in the same way as in Theorems 4, 5 and 6.
4. Special case: $P_j \equiv 1/L$, $Q_i \equiv 1/K$

All formulas which were obtained in the previous section can be considered as functions of two vectors $Q = (Q_1, Q_2, \ldots, Q_K)$ and $P = (P_{K+1}, P_{K+2}, \ldots, P_{K+L})$. It is straightforward to verify (see [1, 5]) that, for example, $\Pr\{C = 1\}$, $-E(C)$, $-E(q)$ and $-E_k(C_l)$ are Schur-convex functions of the vectors $Q$ and $P$, separately. Hence, the probability that the graph $G_T$ is connected is minimized, and the expected numbers of cycles, of cycles of length $2l$ and of cyclical vertices are maximized, for $P_j \equiv 1/L$ and $Q_i \equiv 1/K$. This is the reason why the uniform case $(T; 1/L, 1/K)$ is considered. Let $P_j = 1/L$, $Q_i = 1/K$ and $K \le L$. Then the following corollaries can be obtained directly from Theorems 1-6.
Corollary 1.
where by $(n)_j$ we mean $n!/(n-j)!$.
Corollary 2.
Corollary 3.
$i = 1, 2, \ldots, K$,
$$\operatorname{Var}(q) = 8 \sum_{i=1}^{K} i\, \frac{(K)_i (L)_i}{K^i L^i} - 2E(q) - E^2(q).$$
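The uniform-case distribution of $q$ can be checked numerically. The sketch below is an illustration, not the text of Corollary 3: it assumes that, for $P_j = 1/L$ and $Q_i = 1/K$, Theorem 3 reduces to $\Pr\{q = 2i\} = a_i - a_{i+1}$ with $a_i = (K)_i (L)_i / (K^i L^i)$, and it compares this with a full enumeration of all mappings for small $K$, $L$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def falling(n, j):
    """Falling factorial (n)_j = n! / (n - j)!; equals 0 when j > n."""
    out = 1
    for t in range(j):
        out *= n - t
    return out

def tail(K, L, i):
    """a_i = (K)_i (L)_i / (K^i L^i)."""
    return Fraction(falling(K, i) * falling(L, i), K**i * L**i)

def q_distribution_formula(K, L):
    """Assumed uniform-case form of Theorem 3: Pr{q = 2i} = a_i - a_{i+1}."""
    return {2 * i: tail(K, L, i) - tail(K, L, i + 1)
            for i in range(1, min(K, L) + 1)}

def var_q_formula(K, L):
    """Var(q) = 8 * sum_i i * a_i - 2 E(q) - E(q)^2, with E(q) = 2 * sum_i a_i."""
    m = min(K, L)
    mean = 2 * sum(tail(K, L, i) for i in range(1, m + 1))
    return 8 * sum(i * tail(K, L, i) for i in range(1, m + 1)) - 2 * mean - mean**2

def q_distribution_brute(K, L):
    """Exact distribution of the number of cyclical vertices by enumeration."""
    dist = {}
    weight = Fraction(1, L**K * K**L)
    for f1 in product(range(L), repeat=K):        # images of V1 in V2
        for f2 in product(range(K), repeat=L):    # images of V2 in V1
            cyclic = 0
            for start in range(K):
                a = start
                for _ in range(K):
                    a = f2[f1[a]]
                    if a == start:
                        cyclic += 1
                        break
            q = 2 * cyclic      # cyclical vertices split evenly between V1 and V2
            dist[q] = dist.get(q, 0) + weight
    return dist

K, L = 2, 3
print(q_distribution_brute(K, L) == q_distribution_formula(K, L))
```

With $K = 2$, $L = 3$ there are only 72 mappings, so the comparison is instantaneous; the enumeration is not intended for larger parameters.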
It is obvious that in the uniform case $(T; 1/L, 1/K)$, $s(i)$, $l(i)$, $p(i)$, $p_1(i)$ and $p_2(i)$ do not depend on the choice of $i$ from $V_1$. Therefore, we shall use the notation $s[1]$, $l[1]$, $p[1]$, $p_1[1]$ and $p_2[1]$. Similarly, for $i \in V_2$, $s(i)$, $l(i)$, $p(i)$, $p_1(i)$ and $p_2(i)$ will be denoted by $s[2]$, $l[2]$, $p[2]$, $p_1[2]$ and $p_2[2]$, respectively.
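Corollary 4.
$$\Pr\{s[1] = 2k\} = \frac{(K)_k (L)_k}{K^k L^k}\,\frac{k}{K}, \qquad k = 1, 2, \ldots, K,$$
$$\Pr\{s[1] = 2k+1\} = \frac{(K)_k (L)_k}{K^k L^k}\,\frac{k}{L}\left(1 - \frac{k}{K}\right), \qquad k = 1, 2, \ldots, K-1,$$
$$E(s[1]) = 1 + 2\sum_{k=1}^{K} \frac{(K)_k (L)_k}{K^k L^k} - \frac{1}{K}\sum_{k=1}^{K} k\,\frac{(K)_k (L)_k}{K^k L^k} = 1 + E(q) - \delta_1,$$
and
$$\Pr\{s[2] = 2k\} = \frac{(K)_k (L)_k}{K^k L^k}\,\frac{k}{L}, \qquad k = 1, 2, \ldots, K,$$
$$\Pr\{s[2] = 2k+1\} = \frac{(K)_k (L)_k}{K^k L^k}\,\frac{k}{K}\left(1 - \frac{k}{L}\right), \qquad k = 1, 2, \ldots, \min(K, L-1),$$
$$E(s[2]) = 1 + 2\sum_{k=1}^{K} \frac{(K)_k (L)_k}{K^k L^k} - \frac{1}{L}\sum_{k=1}^{K} k\,\frac{(K)_k (L)_k}{K^k L^k} = 1 + E(q) - \delta_2,$$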
Corollary 5.
Corollary 6.
$k_1 = 0, 1, \ldots, K$, $k_2 = 1, 2, \ldots, L$,
where $0 < \delta_1 < 1$, $0 < \delta_2 < 1$. Denote by $(T; 1/n)$ the uniform case of a random mapping $(T; P_j)$ of a finite set $V$, $|V| = n$, into itself, i.e. such that for all $i \in V$ independently and $j \in V$
$$\Pr\{T(i) = j\} = 1/n.$$
Now, using Corollaries 1-6 and the methods applied for a random mapping $(T; 1/n)$ as $n \to \infty$ (see [2, 7 and 8]) we can investigate the asymptotic behaviour of random variables related to a random bipartite mapping $(T; 1/L, 1/K)$.
Theorem 7. For $L \ge K \to \infty$
and
where $C_E$ is Euler's constant. Let $Z = (C - \tfrac{1}{2}\log K)(\tfrac{1}{2}\log K)^{-1/2}$. Then the random variable $Z$ is asymptotically normal with expectation 0 and variance 1.
Proof. Let
$$g_{K,L}(z) = \sum_{i=1}^{K} z^i \Pr\{C = i\};$$
then by Corollary 1
Let $L \ge K \to \infty$; then we have the following asymptotic representation for the generating function of $C$
where $o(1) \to 0$ uniformly for all $z$, $1 - \delta < z < 1 + \delta$, $0 < \delta < 1$. The above fact is an analogue of the lemma which was proved for $(T; 1/n)$ (see [7, p. 239]) and can be obtained by a similar approach. Since it is useful for the next theorems, we outline a proof of this fact here. Let
where
$\dfrac{\delta}{2(\delta + 2)} < \varepsilon < \dfrac{1}{2}$. Then, since $\dfrac{(L)_i}{L^i} < 1$ and $1 + \delta < 2$,
The sums on the right-hand side of the inequalities are exactly the same as those estimated in the cited lemma ([7]). Hence,
uniformly for all $z$, $1 - \delta < z < 1 + \delta$,
where the last equation follows from the fact (see [7, p. 194]) that
and $o(1) \to 0$ uniformly for all $z$, $1 - \delta < z < 1 + \delta$,
as $L \ge K \to \infty$; $o(1) \to 0$ uniformly for all $z$, $1 - \delta < z < 1 + \delta$, and the asymptotic expression for $g_{K,L}(z)$ is proved. Let $G_{K,L}(t)$ be the generating function of moments of $C$. Then using the asymptotic formula for $g_{K,L}(\exp(t/\sigma))$, where $\sigma^2 = \tfrac{1}{2}\log K$, one can check that for any $t$, $-\delta' < t < \delta'$, $\delta' > 0$, and $L \ge K \to \infty$
Hence, by Curtiss' theorem we have
Asymptotic expressions for $E(C)$ and $\operatorname{Var}(C)$ can be obtained directly from the exact formulas (Corollary 1) or by an asymptotic approximation of the derivatives of $g_{K,L}(z)$ at $z = 1$. □
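The growth of $E(C)$ can be explored numerically. The sketch below rests on an assumption that is not quoted from Corollaries 1 and 2: in the uniform case the expected number of cycles of length $2l$ is taken to be $E(C_l) = (K)_l (L)_l / (l K^l L^l)$ (there are $\binom{K}{l}\binom{L}{l}\, l!\,(l-1)!$ possible cycles of that length, each present with probability $K^{-l} L^{-l}$), and $E(C)$ is the sum over $l$. The function names are hypothetical.

```python
from fractions import Fraction
from math import log

def falling(n, j):
    """Falling factorial (n)_j = n! / (n - j)!; equals 0 when j > n."""
    out = 1
    for t in range(j):
        out *= n - t
    return out

def expected_cycles_of_length(K, L, l):
    """Assumed closed form E(C_l) = (K)_l (L)_l / (l K^l L^l) for cycles of length 2l."""
    return Fraction(falling(K, l) * falling(L, l), l * K**l * L**l)

def expected_components(K, L):
    """E(C) as the sum of E(C_l) over all admissible l."""
    return sum(expected_cycles_of_length(K, L, l)
               for l in range(1, min(K, L) + 1))

# Compare the exact expectation with (1/2) log K, the centering used for Z.
for K in (5, 20, 80):
    L = 2 * K
    print(K, L, float(expected_components(K, L)), 0.5 * log(K))
```

For fixed $l$ the value of `expected_cycles_of_length(K, L, l)` approaches $1/l$ as $K$ and $L$ grow, which is the parameter one expects for a Poisson limit of $C_l$.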
Theorem 8. The number $C_l$ of cycles of fixed length $2l$ has in the limit a Poisson distribution with parameter $1/l$ as $L \ge K \to \infty$.
Proof. By Corollary 2
Therefore, we arrive at the thesis by a standard method of moments (see e.g. [7]). □
Theorem 9. The normalized random variable $\dfrac{q}{2}\sqrt{\dfrac{K+L}{KL}}$ has a limit distribution with the density function $x \exp(-x^2/2)$, $x > 0$, as $L \ge K \to \infty$. Moreover,
and
Proof. Using Corollary 3, we obtain for $x > 0$
Thus, the asymptotic density of the normalized variable $q/2$ is $x e^{-x^2/2}$ for $x > 0$. Note that
and
Hence, by Corollary 3, we obtain the asymptotic expression for the expected number of cyclical vertices. Using the estimates for $H_1(1)$, $H_2(1)$, $H_3(1)$ from Theorem 7, one can check that
Therefore, again by Corollary 3, we obtain the asymptotic expression for $\operatorname{Var}(q)$ and Theorem 9 is proved. □
Theorem 10. The normalized random variable $\dfrac{s[1]}{2}\sqrt{\dfrac{K+L}{KL}}$ has a limit distribution, as $L \ge K \to \infty$, with the cumulative distribution function
$$F(y) = 1 - e^{-y^2/2}, \qquad y > 0,$$
and
$$E(s[1]) = \sqrt{2\pi\,\frac{KL}{K+L}}\,(1 + o(1)),$$
$$\operatorname{Var}(s[1]) = (4 - \pi)\,\frac{2KL}{K+L}\,(1 + o(1)).$$
Moreover, the above results are also true for the random variable $s[2]$.
Proof. The results for moments and the last assertion follow immediately from Corollary 4 and Theorems 4 and 9. Note that
$$\Pr\{s[1] \le 2k\} = \sum_{i=1}^{k-1} \frac{(K)_i (L)_i}{K^i L^i}\,\frac{i}{KL}\,(K + L - i) + \frac{(K)_k (L)_k}{K^k L^k}\,\frac{k}{K}.$$
Hence, by simple calculations, we arrive at the thesis.
□
Let $l = l[1] = l[2]$.
Theorem 11. The normalized random variable $\dfrac{l}{2}\sqrt{\dfrac{K+L}{KL}}$ has a limit distribution with the density function $\sqrt{2\pi}\,(1 - \Phi(y))$, $y > 0$, as $L \ge K \to \infty$, where
$$\Phi(y) = (2\pi)^{-1/2} \int_{-\infty}^{y} e^{-x^2/2}\,dx.$$
Moreover,
$$E(l) = \sqrt{\frac{\pi}{2}\,\frac{KL}{K+L}}\,(1 + o(1))$$
and
Proof. The thesis of our theorem follows from Corollary 5 and Theorem 9 by routine calculations. □
Theorem 12. Let $k_1 = 1, 2, \ldots$; $k_2 = 0, 1, \ldots$; $L \ge K \to \infty$, then
and
n (K+L)K (1 + 0 (1)) . 2
L
Proof. Replacing the factorials in the formula for $\Pr\{p_1[1] = k_1,\ p_2[1] = k_2\}$ in Corollary 6 by Stirling's approximation, we obtain after standard calculations the desired result. The asymptotic formulas for the expected number of predecessors follow immediately from Corollary 6 and Theorem 9. □
5. Final remarks
Let $T$ be a random bipartite mapping $(T; P_j, Q_i)$ of $V = V_1 \cup V_2$ into itself; $|V_1| = K$, $|V_2| = L$. Consider a random mapping $T^*$ of $V_1$ into itself such that for all $i, j \in V_1$
Then we have
On the other hand consider a random mapping $(T_0; Q_j)$ of $V_1$ into itself, which was studied by Ross [5] and the author [4]. Namely, let $T_0 : V_1 \to V_1$ assign, independently, to each $i \in V_1$ one of the image points $j \in V_1$ with probability $Q_j$. It is obvious that, in general, the choices of images in $T^*$ are not independent, but we can prove an asymptotic independence and, in this sense, an asymptotic equivalence between $T^*$ and $T_0$ in the following cases.
Theorem 13. Let $P_j \equiv 1/L$, $K = \text{const}$. Then $T^*$ is asymptotically equivalent to $(T_0; Q_j)$ as $L \to \infty$.
Proof. One can check that we have to prove only that for any $j_1, j_2, \ldots, j_K \in V_1$
as $L \to \infty$. But
$$\Pr\{T^*(1) = j_1,\ T^*(2) = j_2,\ \ldots,\ T^*(K) = j_K\}$$
Using a reasoning similar to that in the proof of the above theorem, we are able also to show
Theorem 14. Let $P_j \equiv 1/L$, $Q_i \equiv 1/K$, $K = K(L) = o(\sqrt{L})$. Then $T^*$ is asymptotically equivalent to $(T_0; 1/K)$ as $L \to \infty$.
Obviously, the structure of $G_T$ is closely related to the structure of $G_{T^*}$. For example, the number of cycles (i.e. the number of connected components) is the same in both digraphs, and the number of cycles of length $2l$ in $G_T$ is equal to the number of cycles of length $l$ in $G_{T^*}$; $q_1$, $l_1(i)$, $s_1(i)$ and $p_1(i)$ (see Section 1), which were defined for $T$, correspond to $q$, $l(i)$, $s(i)$ and $p(i)$ for $T^*$, respectively. Therefore by Theorems 13 and 14 and the well known results for a random mapping $(T; 1/K)$ ([2, 7, 8]), one can extend the results of Section 4 to the case $K = o(\sqrt{L})$.
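The dependence between the images under $T^*$, and its disappearance as $L$ grows with $K$ fixed, can be seen on a tiny example. The sketch below is illustrative only and assumes the uniform case with $T^*(i) = T(T(i))$ for $i \in V_1$; vertices are numbered from 0 and the function name is hypothetical. It computes the exact joint probability $\Pr\{T^*(0) = 0,\ T^*(1) = 0\}$ for $K = 2$ and compares it with the value $1/K^2 = 1/4$ that independent choices would give.

```python
from fractions import Fraction
from itertools import product

def joint_tstar(K, L, targets):
    """Exact Pr{T*(0) = targets[0], ..., T*(K-1) = targets[K-1]} in the
    uniform case, by enumerating all mappings; T*(i) = T(T(i))."""
    hits = 0
    for f1 in product(range(L), repeat=K):        # images of V1 in V2
        for f2 in product(range(K), repeat=L):    # images of V2 in V1
            if all(f2[f1[i]] == targets[i] for i in range(K)):
                hits += 1
    return Fraction(hits, L**K * K**L)

independent = Fraction(1, 4)                      # Q_0 * Q_0 for K = 2
for L in (2, 4, 8):
    joint = joint_tstar(2, L, (0, 0))
    print(L, joint, joint - independent)
```

The difference from $1/4$ comes from the event that both vertices of $V_1$ choose the same intermediate vertex of $V_2$, which has probability $1/L$ and therefore vanishes as $L \to \infty$.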
Finally, let us consider one of the possible generalizations of a random bipartite mapping $(T; P_j, Q_i)$. Let $V_1, V_2, \ldots, V_k$ be finite sets, $k = 1, 2, \ldots$. We define a random mapping $(T; P_{1,i_1}, P_{2,i_2}, \ldots, P_{k,i_k})$ as follows: $T : V \to V$, $V = \bigcup_{i=1}^{k} V_i$; for each $i \in V$, independently, we choose its image point $j \in V$ and
and
Thus we have $T(V_t) \subset V_{t+1}$, $t = 1, 2, \ldots, k$, and it is easy to see that for $k = 1$ and $k = 2$ we obtain the random mappings $(T; P_j)$ and $(T; P_j, Q_i)$, respectively. One can check that the basic lemmas can be obtained in the considered model by the same arguments as in Section 2. We are able to prove for example that
for $S_i \subset V_i$, $P_i(S_{i+1}) = \prod_{j \in S_{i+1}} P_{i,j}$, $i = 1, 2, \ldots, k$, $|S_1| = |S_2| = \cdots = |S_k|$, as well as
$$\Pr\{NC[U_1, U_2, \ldots, U_k]\} = 1 - R(U_1, P_1) \times R(U_2, P_2) \times \cdots \times R(U_k, P_k)$$
where
Therefore, we are also able to prove results similar to those of Sections 3 and 4. Obviously, the formulas will be more complicated. It seems to be interesting to study a corresponding random mapping $T^*$, i.e. such that $T^* : V_1 \to V_1$ and $T^*(i) = j$ iff $T^k(i) = j$. It is interesting that such a corresponding random mapping for $k = 1$ is a random mapping $(T; P_j)$ and for $k > 1$ can be treated as a random mapping with restrictions on the independence of the choices of images in $T^*$.
References
[1] G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities (Cambridge Univ. Press, Cambridge, MA, 1952).
[2] B. Harris, Probability distribution related to random mappings, Ann. Math. Statist. 31 (1960) 1045-1062.
[3] J. Jaworski, On the connectedness of a random bipartite mapping, Graph Theory, Łagów 1981, Lecture Notes in Math. No. 1018 (1983) 69-74.
[4] J. Jaworski, On a random mapping (T, P_j), J. Appl. Prob. 21 (1984) 186-191.
[5] S. M. Ross, A random graph, J. Appl. Prob. 16 (1981) 309-316.
[6] V. N. Sachkov, Combinatorial Methods in Discrete Mathematics (Nauka, Moscow, 1977) (in Russian).
[7] V. N. Sachkov, Probabilistic Methods in Combinatorial Analysis (Nauka, Moscow, 1978) (in Russian).
[8] V. E. Stepanov, Limit distribution of certain characteristics of random mappings, Theory Prob. Appl. 14 (1969) 612-626.
Annals of Discrete Mathematics 28 (1985) 159-170 © Elsevier Science Publishers B.V. (North-Holland)
PROBABILISTIC INEQUALITIES FROM EXTREMAL GRAPH RESULTS (A SURVEY)
G. O. H. KATONA
Mathematical Institute of the Hungarian Academy of Sciences, H-1053 Budapest, Hungary
The aim of the paper is to survey the probabilistic inequalities proved by the method based on extremal combinatorial theorems.
1. Introduction
To illustrate the main idea of the field surveyed in the present paper, let us sketch the proof of the following theorem:
Theorem 1. [5] If $\xi$ and $\eta$ are independent identically distributed random variables taking values from a Hilbert-space $X$, then
$$P(\|\xi + \eta\| \ge x) \ge \tfrac{1}{2} P^2(\|\xi\| \ge x),$$
where $\|\ \|$ is the norm of $X$.
Proof. 1. We start with stating the following special case of the Turán theorem [17]: If a simple graph with $n$ vertices contains no empty triangle (= for any 3 different vertices there is at least one edge) then the graph has at least $\left[\frac{(n-1)^2}{4}\right]$ edges.
2. We need the following simple statement from geometry: If $a_1, a_2, a_3 \in X$ are of norm $\ge x$ ($\ge 0$) then $\|a_i + a_j\| \ge x$ holds for a pair $1 \le i < j \le 3$. The three vectors span a 3-dimensional Euclidean space. It is easy to see that the angle between $a_i$ and $a_j$ is $\le 120°$ for some $1 \le i < j \le 3$. Now it is easy to verify in the plane determined by them that $\|a_i + a_j\| \ge x$.
3. The following trivial inequality will be used:
$$P(\|\xi + \eta\| \ge x) \ge P(\|\xi + \eta\| \ge x,\ \|\xi\| \ge x,\ \|\eta\| \ge x). \qquad (1)$$
Suppose for a while that $\xi$ (and $\eta$) can have only $m$ values, with equal probabilities: $P(\xi = a_i) = \frac{1}{m}$ $(1 \le i \le m)$. Let the $a_i$ be ordered in the following way: $\|a_1\| \ge x, \ldots, \|a_n\| \ge x$, $\|a_{n+1}\| < x, \ldots, \|a_m\| < x$. Consider the following graph $G$. Let $a_1, \ldots, a_n$ be the vertices of $G$. Two vertices of $G$ are connected with an edge iff the norm of their sum is $\ge x$. Then
$$P(\|\xi + \eta\| \ge x,\ \|\xi\| \ge x,\ \|\eta\| \ge x) = m^{-2}\,(\text{the number of pairs } a_i, a_j\ (1 \le i, j \le n) \text{ with } \|a_i + a_j\| \ge x) = m^{-2}\,(2(\text{the number of edges of } G) + n) \qquad (2)$$
holds since $\|a_i + a_i\| = 2\|a_i\| \ge 2x \ge x$. The graph $G$ has no empty triangle by Section 2 of the proof. Applying the Turán theorem for $G$, we obtain a lower estimate for (2):
$$m^{-2}\left(2\left[\frac{(n-1)^2}{4}\right] + n\right) \ge \frac{1}{2}\left(\frac{n}{m}\right)^2 = \frac{1}{2} P^2(\|\xi\| \ge x).$$
Theorem 1 is proved for this special case.
4. To prove the general case two approaches offer themselves:
a) Having an arbitrary distribution for $\xi$, let us approximate it with the discrete distributions used in Section 3. This method was applied in [5] but the roughness of the elaboration led to unnecessary conditions for the distribution of $\xi$. Later Sidorenko [16] worked out this method properly. We do not treat it here in detail.
b) The other method can be found in [6] and [7]. Suppose that the distribution of $\xi$ is arbitrary. Let the vertex-set $X$ of $G$ consist of the vectors satisfying $\|a\| \ge x$. Two vertices $a$ and $b$ are connected if $\|a + b\| \ge x$. $G$ is, in general, an infinite graph and it contains no empty triangle. The right-hand side of (1) is the measure, in a certain sense, of the set of edges of $G$. Namely, take the direct product of $X$ with itself. Any edge $(a, b)$ means two elements in the direct product $X^2$: the pairs $(a, b)$ and $(b, a)$. The set of edges is consequently a symmetric set in $X^2$. The measure $P$ on $X$ determines a product measure on $X^2$. The right-hand side of (1) is the measure of the above symmetric set according to this product measure. We have to give a good lower estimate of the measure of this set in terms of $P(\|\xi\| \ge x)$ (the measure of $X$) under the condition that $G$ contains no empty triangle.
If $X$ has finitely many, $n$ elements, let the measure of each element be equal to 1. Then the Turán theorem says that the measure of the edge-set (= 2(the number of unoriented edges) + $n$) is $\ge \frac{1}{2} n^2$, that is, the half of the measure of $X^2$. We may expect the same statement for the general case. We will call the generalization of a discrete combinatorial statement for the product measures its "continuous version." Later we will precisely show that there is a transition (under very general conditions) for continuous versions. Accepting the veracity of this statement, Theorem 1 follows easily by (1). □
The above sketch of the proof can be illustrated with the diagram in Fig. 1. The aim of the present paper is to survey the results proved by this method.
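The discrete case treated in step 3 of the proof can be checked exactly on concrete data. The sketch below is an illustration under stated assumptions: $\xi$ and $\eta$ are independent and uniform on a finite list of integer vectors, and squared norms are compared with $x^2$ so that no rounding is involved. The function name `check_theorem1` is hypothetical.

```python
import random
from fractions import Fraction

def check_theorem1(vectors, x_squared):
    """Return (P(|xi + eta| >= x), P(|xi| >= x)^2 / 2) for xi, eta independent
    and uniform on `vectors`; norms are compared through their squares."""
    def norm2(v):
        return sum(c * c for c in v)
    m = len(vectors)
    big = sum(1 for a in vectors if norm2(a) >= x_squared)
    pairs = sum(1 for a in vectors for b in vectors
                if norm2(tuple(s + t for s, t in zip(a, b))) >= x_squared)
    p = Fraction(big, m)
    return Fraction(pairs, m * m), p * p / 2

# Two opposite unit vectors: the bound is attained for this distribution.
print(check_theorem1([(1, 0), (-1, 0)], 1))

# Random integer vectors in three dimensions: the inequality should hold.
random.seed(0)
sample = [tuple(random.randint(-5, 5) for _ in range(3)) for _ in range(40)]
lhs, bound = check_theorem1(sample, 16)
print(lhs >= bound)
```

For the two opposite unit vectors both returned values are equal, so the constant $\frac{1}{2}$ cannot be improved for that particular distribution.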
2. Continuous versions of extremal theorems in combinatorics

The next lemma shows the connection between the continuous and finite graphs. Before stating it, let us give some definitions. Let $M = (X, \sigma, \mu)$ be a measure space, where $\sigma$ is a $\sigma$-algebra on $X$ and $\mu$ is a finite measure defined on $\sigma$. $M^2$ is the product of $M$ with itself, that is, $M^2 = (X^2, \sigma^2, \mu^2)$, where $\sigma^2$ is induced by the products of the members of $\sigma$ and $\mu^2$ is the product measure. If $E \subset X^2$ is measurable, that is, $E \in \sigma^2$, then $G = (X, E)$ is called a (directed) graph. Let $Y$ be a subset of $X$; then $G_Y$ denotes the graph induced by $Y$ in $G$, that is, $G_Y = (Y, E_Y)$ where $E_Y = \{(a, b) : a, b \in Y,\ (a, b) \in E\}$. $M$ or $\mu$ is atomless if for any $A \in \sigma$, $\mu(A) > 0$, there is a $B \subset A$, $B \in \sigma$, satisfying $0 < \mu(B) < \mu(A)$.
Lemma 1. Let $G = (X, E)$ be a graph on the atomless measure space $M = (X, \sigma, \mu)$. Suppose that
$$|E_Y| \ge c\,|Y|^2$$
holds for any finite $Y \subset X$ satisfying $|Y| > n_0$. Then
$$\mu^2(E) \ge c\,\mu(X)^2.$$
Proof. Introduce the notation $M^n = (X^n, \sigma^n, \mu^n)$ generalizing the case $n = 2$. On the other hand, define
This function is obviously measurable since $E$ is measurable. Take the integral
(3)
when $a \ne b$. If $a = b$ we use
(4)
Summing up (3) and (4) for all pairs $1 \le a, b \le n$,
(5)
Observe that $\sum_{1 \le a, b \le n} I(a, b; y_1, \ldots, y_n)$ is nothing else but $|E_{\{y_1, \ldots, y_n\}}|$, that is, the number of edges of the subgraph induced by $\{y_1, \ldots, y_n\}$ if $y_1, \ldots, y_n$ are all distinct. This latter condition holds with the exception of a set of measure 0. This is heuristically obvious and can be rigorously proved (see e.g. [7]). Using the assumption $|E_{\{y_1, \ldots, y_n\}}| \ge c n^2$ $(n > n_0)$, (5) and (6) imply
If $n \to \infty$ this leads to $\mu^2(E) \ge c\,\mu(X)^2$.
□
Let $\mathcal{G}$ be an arbitrary class of graphs $G = (X, E)$ determined on the measure space $M = (X, \sigma, \mu)$. $\mathcal{G}$ is called hereditary if $G \in \mathcal{G}$ implies $G_Y \in \mathcal{G}$ for any measurable $Y \subset X$. Then
can be considered as the continuous analogue of the "minimum number" of edges in $\mathcal{G}$. Analogously, let us define
$$H(n, \mathcal{G}) = \min \frac{|E|}{n^2} \qquad (7)$$
where the minimum runs over all members of $\mathcal{G}$ having exactly $n$ vertices. It is proved in [7] that (7) has a limit if $n \to \infty$. The inequality
$$H(M, \mathcal{G}) \ge \lim_{n \to \infty} H(n, \mathcal{G})$$
is an easy consequence of Lemma 1 if $M$ is atomless. However, this inequality holds for measures with atoms supposing that $\mathcal{G}$ has certain properties. Let $G = (\{x, x_1, \ldots\}, E)$ be a graph, and define $G^x = (\{x', x'', x_1, \ldots\}, E^x)$, where $E^x$ consists of the pairs obtained by substituting $x$ either by $x'$ or $x''$ in any way in any pair which is in $E$. In other words, we form two copies of $x$ in all edges of $G$. $\mathcal{G}$ is called doublable iff $G \in \mathcal{G}$ implies $G^x \in \mathcal{G}$ for any vertex $x$ of $G$.
Theorem 2. [7] Suppose that $\mathcal{G}$ is a hereditary class of graphs on the measure space $M$; then
$$H(M, \mathcal{G}) \ge \lim_{n \to \infty} H(n, \mathcal{G}) \qquad (8)$$
if $M$ is atomless or $\mathcal{G}$ is doublable.
For our applications we need this direction of the inequality. One may guess, however, that equality holds in (8) under some reasonable conditions. Indeed, if $\mathcal{G}$ is doublable then (8) holds with equality (see [7]). However, there is another class of $\mathcal{G}$'s for which the equality in (8) is proved. $\mathcal{G}$ is called strongly hereditary if (i) $\mathcal{G}$ is hereditary, (ii) adding a new edge to a member of $\mathcal{G}$, the new graph is also in $\mathcal{G}$, (iii) adding a new vertex $x$ to a member of $\mathcal{G}$ (until a certain fixed cardinality) with all the possible edges containing $x$, the new graph is also in $\mathcal{G}$. It is proved in [11] that to any strongly hereditary class $\mathcal{G}$ there is another class $\mathcal{G}^0$ of graphs such that a graph $H$ has all its induced subgraphs from $\mathcal{G}$ iff the complement $\bar{H}$ contains no subgraph from $\mathcal{G}^0$. $\mathcal{G}^0$ is called in the literature the class of forbidden graphs. The equality in (8) for strongly hereditary graphs is an easy consequence of a theorem of Brown, Erdős and Simonovits [3]. (The conditions of this theorem and of Theorem 2 are stated incorrectly in
[7].) The above results are formulated for directed graphs but, in fact, we need them for undirected graphs. The connection is obvious: each edge $(a, b)$ $(a \ne b)$ of an undirected graph is replaced by two oppositely directed edges $(a, b)$, $(b, a)$. Let us remark that Bollobás [2] independently proved (8) with equality for strongly hereditary classes of undirected graphs on an atomless measure space. His proof is easier for this special case.
Let us see how we can obtain the "continuous version" of the Turán theorem by Theorem 2. Let $\mathcal{G}$ be the class of all graphs $G = (X, E)$ such that (i) $(a, b) \in E$ iff $(b, a) \in E$, (ii) $(a, a) \in E$ for all $a \in X$, (iii) if $a, b, c$ are different vertices ($\in X$) then at least one of $(a, b)$, $(b, c)$, $(c, a)$ is in $E$. By the usual Turán theorem, the graph $G = (X, E)$, $|X| = n$, $G \in \mathcal{G}$ must contain at least $\left[\frac{(n-1)^2}{4}\right]$ pairs of edges $(a, b)$, $(b, a)$ $(a \ne b)$. Hence, by property (ii), the number of edges is
$$|E| \ge 2\left[\frac{(n-1)^2}{4}\right] + n \ge \frac{n^2}{2}.$$
This implies $\lim H(n, \mathcal{G}) \ge \frac{1}{2}$ and (8) implies $H(M, \mathcal{G}) \ge \frac{1}{2}$. The proof of Theorem 1 can be completed if this inequality is used for the measure space induced by the set $\{a : \|a\| \ge x\}$ in the probability measure $P$, and for the set
is a consequence of $H(M, \mathcal{G}) \ge \frac{1}{2}$. Let us remark that [7] states the results on the "continuous versions" for $g$-graphs; however, the equality in (8) is known for strongly hereditary graphs only when $g = 2$. [7] also contains some results for the case when $M$ has atoms and $\mathcal{G}$ is not doublable. Finally, [8] gives the "continuous versions" of a completely different class of combinatorial extremal problems: a transformation $T$ of $g$-graphs to $h$-graphs is given; the number of vertices and $g$-edges is fixed, and the number of edges of the transformed graph has to be minimized.
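The finite Turán-type statement used above can be confirmed by exhaustive search for very small $n$. The sketch below is an illustration, not part of the survey: it enumerates all simple graphs on $n$ labelled vertices, keeps those in which every three vertices span at least one edge, and reports the minimum number of edges, to be compared with $\left[\frac{(n-1)^2}{4}\right]$. The function name is hypothetical.

```python
from itertools import combinations

def min_edges_without_empty_triangle(n):
    """Smallest number of edges of a simple graph on n vertices in which
    every 3 vertices span at least one edge (exhaustive search)."""
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    best = len(pairs)                      # the complete graph always qualifies
    for mask in range(1 << len(pairs)):
        edges = {pairs[t] for t in range(len(pairs)) if mask >> t & 1}
        if len(edges) >= best:
            continue
        if all((a, b) in edges or (a, c) in edges or (b, c) in edges
               for a, b, c in triples):
            best = len(edges)
    return best

for n in range(3, 6):
    print(n, min_edges_without_empty_triangle(n), (n - 1) ** 2 // 4)
```

The search visits $2^{\binom{n}{2}}$ graphs, so it is practical only for $n$ up to about 6; for larger $n$ the bound follows from the Turán theorem applied to the complement graph.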
3. Two random variables

One can see with an easy construction that Theorem 1 is sharp in the following sense. For any $p$ $(0 < p < 1)$ and any $x > 0$ there is a distribution of $\xi$ (and $\eta$) in a more than two-dimensional space such that $P(\|\xi + \eta\| \ge x) = \frac{1}{2} p^2$ and $p = P(\|\xi\| \ge x)$. In other words, $P(\|\xi + \eta\| \ge x)$ has no better lower estimate in terms of $P(\|\xi\| \ge x)$.
Theorem 3. ([16] and [9] independently) Let $X$ be an infinite-dimensional Hilbert-space, and let $\xi$ and $\eta$ be $X$-valued independent, identically distributed random variables; then the best possible functions $f$ in the inequality $P(\|\xi + \eta\| \ge x) \ge f(P(\|\xi\| \ge cx))$ are the following ones:
{
f(PI= -E2 p (
if ~ 2 3 , -p ) o t h e r w i s e ,
. 2 p - p 2 if p > + , otherwise,
f (P)={p‘”
f(P)=3P2
when + < c < 3
w ~ i e n
when l < c < JS 2-
>
Each row of the theorein can be proved following the proof of Theorem 1. that is, the scheme g’ven in Fig. 1. We show a new phenomenon of the proof in the case J3/2 d c < 312. We start with a very brief sketch of the proof. Fix the real number x>O and i - x } , X z = X - X 1 . The graph G = ( X , E ) is defined put X , = { a : L I E X ,Ilall< 4s by E={(a, b) : Ila+bll>x}. The following simple geometric statement is true. J 5-
J5
If a,,a,,a, are vectors in a H’lbert-space and ~ ~ ~ z , ~ Ila,ll> ~ > 2 -- ~ then there is apair ifjsalisfying ( ( a , + a , i ( (I3. Hence thegraph G has noernpty triangle with at least two vertices in X,.If G‘ i s finite and IX,=n,, lXzl = n 2 then according to Lemma 2 of [6] the number of edges is at least
I
if n 2 > n , and
(’i2)
the statement for
otherwise. Thc “continuous version” of this Iunma proves
J5 -2
(11 11
JF
The first novelty here is that we cannot disregard the small vectors (those in $X_1$), like in the case of Theorem 1. This causes the trouble with two classes, so new types of extremal graph results are needed. Finally, we need a generalization of Theorem 2 for two (or more) classes. This generalization is straightforward and can be found in [7] (Theorem 3). The proof of this row and any other row is valid for any (lower-dimensional) Hilbert-space. But the estimates are not the best, in general. The constructions do not work.
In the $d$-dimensional space the necessary geometric problems are unsolved (even for $d = 3$). Namely, the quantities
$$\delta(2, k, X) = \inf \max_{1 \le i < j \le k} \|a_i + a_j\|,$$
where the infimum is taken over all vectors a,, ..., ak E X satisfying Ila,ll> 1 (1
I
P(I la5 + bql 3 x)
in terms of P(ll(ll > e x ) are determined, where a and b are fixed reals, t and v are independent, identically distributed real random variables. Another theorem (Theorem 22 of [16]) deals with the case We think that our method is helpful in a more general context. Let f i and f2 be a one-variable and a two-variable function, respectively. A lower estimate is needed for Pcf,((, q)>.x) in terms P ( f , ( < ) > e x ) . Examples are,fl(<)=ltl, f2=&, orf,(() is the vector of the coordinates of (.
4. More random variables
Probabilists claim that the real task of probability theory is to say something about a large set of random variables. Thus, they would need a generalization of Theorem 3 for $\|\xi_1 + \xi_2 + \cdots + \xi_l\|$ rather than for $\|\xi_1 + \xi_2\|$. Let us try the case $l = 3$. If we copy the proof of Theorem 1, the geometry works: $\|a_1\| \ge 1$, $\|a_2\| \ge 1$, $\|a_3\| \ge 1$, $\|a_4\| \ge 1$ implies that there are 3 distinct ones of them so that $\|a_i + a_j + a_k\| \ge 1$. However, there is a little trouble with the combinatorics. We need the minimum number $T(n, 4, 3)$ of 3-element subsets of an $n$-element set under the condition that any 4-element subset contains one of them. It is conjectured that $T(n, 4, 3) = \left(\frac{4}{9} + o(1)\right)\binom{n}{3}$. The proof of this conjecture would imply
(9)
for any 3 independent, identically distributed random variables in a Hilbert-space (see [12]). The first problem is that even the order of magnitude of this estimate is not correct. It is proved in [10] that
holds if P ( \ l ( , l l > x ) < : . (10) IS 111iidi stronger for \ J l l d l values of P(ll;ll[ax) than (9). However, the constant i i s not the be,t possible. The reason why the situatuon here I F d IrcIent fiorii the case 1=2 i s that the srnall vectors alw play role. This piobleni is circumvented if we considel P(ll<, +52+5311, 115rlll, 115111, 1 1 t 3 1 1 2 - 4 . Indeed,
is proved for independent, identically distributed two-dimensional random variables. In fact (11) is proved in [10] only for one-dimensional variables and with 0.44444. Since then Bereznai and Varecza [1] proved that (37) of [10] tends to $\frac{4}{9}$ and Ha Le Anh [4] proved Lemmas 2.2 and 2.4 for two dimensions. Ha Le Anh also gave counterexamples for those lemmas if the dimension is higher. So the method of [10] does not work for higher dimensions. However, we still conjecture
(12)
for any Hilbert-space. Sidorenko [16, 15] found similar results for the case when the lower estimate uses $P(\|\xi\| \ge cx)$. Finally, let us mention another result of Sidorenko [16]. He gives lower estimates of
and
in terms of $P(\|\xi_1\| \ge x)$, where $\eta$ also runs in the min and max.
169
5. Open problems
Although the papers i n this field contain many opcn questions (actually, they contain more open questions than rcsdts) l i e would like I0 cinphusize sojiie of t hClT1
f
1. Do we always have equalily in (X)? It is known that if Y is doublable and if g=2, 3 is strongly hcreditnry and A! is atonilcss. So, it i s unlinmvn for some ca\es, even i f g = 3 , and very little is known lor g > 3 .
2. It is easy to see that if $|E_Y| \ge c|Y|^2$ is replaced by $|E_Y| \le c|Y|^2$ in Lemma 1 then $\mu^2(E) \le c\,\mu(X)^2$ can be concluded. Suppose now $|E_Y|$
where $E \subset \bigcup_{i=1}^{\infty} S_i$ and $\mu(S_i) < \rho$. $\lambda(E, \alpha, \rho)$ is a non-decreasing function of $\rho$,
therefore the limit
exists. This is called the $\alpha$-dimensional outer Hausdorff measure of $E$. It is easy to see that there is an $\alpha_0$ such that $\lambda(E, \alpha) = \infty$ if $\alpha < \alpha_0$ and $\lambda(E, \alpha) = 0$ if $\alpha > \alpha_0$. This $\alpha_0$ is the Hausdorff-dimension of $E$.)
3. Our estimates determine the "best" function $f$ of the distribution function of $\|\xi\|$ at a given place ($cx$). Another problem is to find the best operator, where $P(\|\xi\| \ge x)$ is considered to be a function of $x$. A modest result in this direction can be found in [6]:
where $p_1 = P($
the infimum is taken over all vectors $a_1, \ldots, a_k \in X$ satisfying $\|a_i\| \ge 1$ $(1 \le i \le k)$.
A survey of some results can be found in [16, 15] and for the 3-dimensional $X$ in [6].
5. Prove (12).
A final remark. Most of the work in this field is done by Sidorenko and by the author. I know the results of the latter one better so this survey is based on them. Consequently, the interested reader should carefully study the quoted and forthcoming papers of Sidorenko.
References
[1] G. Bereznai and A. Varecza, On the limit of a sequence, to appear in: Publ. Math. Debrecen.
[2] B. Bollobás, Measure graphs, J. London Math. Soc. 21 (1980) 401-407.
[3] W. G. Brown, P. Erdős and M. Simonovits, Extremal problems for directed graphs, J. of Combinatorial Theory 15 (1973) 77-93.
[4] Ha Le Anh, Inequalities for the lengths of vector-sums, to appear in: Studia Sci. Math. Hungar.
[5] G. O. H. Katona, Graphs, vectors and probabilistic inequalities, Math. Lapok 20 (1969) 123-127 (in Hungarian).
[6] G. O. H. Katona, Inequalities for the lengths of sums of random vectors, Teor. Verojatnost. i Primenen. 22 (1977) 466-481 (in Russian).
[7] G. O. H. Katona, Continuous versions of some extremal hypergraph problems, Coll. Math. Soc. Bolyai 18 Combinatorics (1976) 653-678.
[8] G. O. H. Katona, Continuous versions of some extremal hypergraph problems. II, Acta Math. Acad. Sci. Hungar. 35 (1980) 67-77.
[9] G. O. H. Katona, "Best" estimations on the distribution of length of sums of two random vectors, Z. Wahrsch. Verw. Gebiete 60 (1982) 411-423.
[10] G. O. H. Katona, Sums of vectors and Turán's problem for 3-graphs, European J. of Combinatorics 2 (1981) 145-154.
[11] G. O. H. Katona, Continuous versions of extremal combinatorial theorems with applications in probability theory, Academic Doctor Thesis, Hungarian Academy of Sciences, Budapest, 1981.
[12] G. O. H. Katona and B. S. Stečkin, Combinatorial numbers, geometric constants and probabilistic inequalities, Dokl. Acad. Nauk SSSR 251 (1980) 1293-1296 and Soviet Math. Dokl. 21 (1980) 610-613.
[13] A. F. Sidorenko, Classes of hypergraphs and probabilistic inequalities, Dokl. Acad. Nauk SSSR 254 (1980) 530-543 (in Russian).
[14] A. F. Sidorenko and B. S. Stečkin, Extremal geometrical constants, Mat. Zamet. 19 (1981) 691-709 (in Russian).
[15] A. F. Sidorenko, Extremal constants and inequalities for the distribution of sums of random vectors, Thesis for "Candidate" Degree, Moscow State University, Moscow, 1982 (in Russian).
[16] A. F. Sidorenko, Extremal estimates of probability measures and their combinatorial nature, Izv. Acad. Nauk SSSR 46 (1982) 535-568 (in Russian).
[17] P. Turán, Egy gráfelméleti szélsőértékfeladatról, Mat. Fiz. Lapok 48 (1941) 436-452.
Annals of Discrete Mathematics 28 (1985) 171-180 © Elsevier Science Publishers B. V. (North-Holland)
A NEW VERSION OF THE SOLUTION OF A PROBLEM OF ERDŐS AND RÉNYI ON HAMILTONIAN CYCLES IN UNDIRECTED GRAPHS
Aleksej D. KORSHUNOV
Mathematical Institute of the Soviet Academy of Sciences, Siberian Section, Novosibirsk, U.S.S.R.
We present a new proof of the following fact. If $k = \frac{1}{2} n(\log n + \log\log n + \varphi(n))$, $\varphi(n) \to \infty$, then almost every graph with $n$ vertices and $k$ edges contains a hamiltonian cycle.
Let us denote by $\mathcal{G}(n, k)$ the set of all graphs on $n$ vertices $v_1, \ldots, v_n$ and $k$ edges, $0 \leq k \leq \binom{n}{2}$, and assume that each such graph can be chosen, independently, with the same probability. In [4] the following problem is stated: what is the least possible value $k_0 = k_0(n)$ such that there is a hamiltonian cycle in almost every graph from $\mathcal{G}(n, k_0)$? Since having a hamiltonian cycle is an increasing property, there would then be a hamiltonian cycle in almost every graph from $\mathcal{G}(n, k)$, where $k \geq k_0$. The following theorem is a solution of that problem.
Theorem 1. Almost every graph from $\mathcal{G}(n, k)$ contains at least one hamiltonian cycle if and only if $k = \frac{1}{2} n(\log n + \log\log n + \varphi(n))$, where $\varphi(n) \to \infty$ as $n \to \infty$.
Necessity follows immediately from [5] since graphs with pendant vertices cannot possess a hamiltonian cycle. Concerning the sufficiency, let us make two remarks. Firstly, it is enough to assume that $\varphi(n) <$
$\mathcal{G}_p(n)$ the set of random graphs on vertices $\{v_1, \ldots, v_n\}$ such that each edge appears with probability $p$. It is known (see [2]) that for an increasing property $F$, the problem of finding the minimal $p$ such that the probability that we choose from $\mathcal{G}_p(n)$ a graph with property $F$ tends to 1 as $n \to \infty$, and the problem of finding the minimal $k$ such that almost all graphs from $\mathcal{G}(n, k)$ possess the property $F$, are equivalent under the relation $k = p\binom{n}{2}$. Thus, sufficiency of Theorem 1 follows from
infiriity as n-+ i r ~atrcl ( i ( t 7 )
12-
'Q,
Before we prove Theorem 2 let us show some lemmas. Let $(v_{i_1}, v_{i_2})(v_{i_2}, v_{i_3}) \ldots (v_{i_{s-1}}, v_{i_s})$ be a path in $G$. We denote it $T_G = T_G(v_{i_1}, \ldots, v_{i_s})$. Vertices $v_{i_1}$ and $v_{i_s}$ are said to be initial and final in this path, respectively. If there is the edge $(v_{i_s}, v_{i_j})$, $1 \leq j < s-1$, in $G$ then, of course, there is in $G$ the path $T'_G = T'_G(v_{i_1}, \ldots, v_{i_j}, v_{i_s}, v_{i_{s-1}}, \ldots, v_{i_{j+1}})$ with $v_{i_1}$ as the initial and $v_{i_{j+1}}$ as the final vertex. The transformation of the path $T_G$ into $T'_G$ is called a permissible transformation. The set of all paths in $G$ which can be obtained from $T_G$ by a sequence of permissible transformations (with fixed initial vertex $v_{i_1}$) is denoted by $\mathcal{A}(T_G)$. We say that a path $T_G$ in $G$ is stable if no path from $\mathcal{A}(T_G)$ can be extended by its final vertex, i.e. the final vertex of each path from $\mathcal{A}(T_G)$ is allowed to be adjacent to vertices of $T_G$ only. The set of all vertices of $T_G$ which are final vertices of paths from $\mathcal{A}(T_G)$ is denoted by $H(T_G)$. By $U(T_G)$ we denote the set of vertices of $T_G$ which do not belong to $H(T_G)$ but which are adjacent in $T_G$ to at least one vertex from $H(T_G)$. Finally, let $X(T_G) = V \setminus (H(T_G) \cup U(T_G))$.
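The definitions above can be made concrete on a small graph. The following is a minimal sketch, not the author's procedure: the function names, the adjacency-dictionary representation, and the reading of "adjacent in $T_G$" as "neighbouring along the original path" are assumptions made for illustration. It enumerates $\mathcal{A}(T_G)$ by brute force, so it is only meant for tiny examples.

```python
from collections import deque

def rotations(path, adj):
    """Yield every path obtained from `path` by one permissible transformation."""
    s = len(path)
    last = path[-1]
    for j in range(s - 2):  # 0-based j = 0..s-3, i.e. j = 1..s-2 in the text
        if path[j] in adj[last]:
            # keep v_1..v_j, then walk the remaining tail backwards
            yield path[: j + 1] + path[:j:-1]

def reachable_paths(path, adj):
    """All paths obtainable by sequences of permissible transformations (the set A(T))."""
    start = tuple(path)
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in rotations(cur, adj):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def analyse(path, adj):
    """Return (H, U, X, stable) for the path, following the definitions in the text."""
    path = tuple(path)
    paths = reachable_paths(path, adj)
    on_path = set(path)
    H = {p[-1] for p in paths}
    U = set()
    for i, v in enumerate(path):
        if v in H:
            for k in (i - 1, i + 1):  # neighbours along the original path (assumption)
                if 0 <= k < len(path) and path[k] not in H:
                    U.add(path[k])
    X = set(adj) - H - U
    # stable: no final vertex of a path in A(T) has a neighbour off the path
    stable = all(adj[p[-1]] <= on_path for p in paths)
    return H, U, X, stable

# Hypothetical example: path 0-1-2-3 with the extra edge (3, 1) and an isolated vertex 4.
example_adj = {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}, 4: set()}
H, U, X, stable = analyse((0, 1, 2, 3), example_adj)
```

In this example no vertex of `H` has a neighbour in `X`, which is the situation the proof that follows is concerned with.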
Proof. Let $v' \in H(T_G)$ and $v'' \in X(T_G)$. If $v'' \notin T_G$ then $v'$ and $v''$ cannot be adjacent since $T_G$ is stable. Let us suppose now that $v'' \in T_G$ and $v'$ and $v''$ are adjacent in $G$. Consider a path $T^*_G \in \mathcal{A}(T_G)$ with the final vertex $v'$. There are two possibilities: (1) $v''$ is adjacent in $T^*_G$ to the same vertices as in $T_G$, (2) there is at least one vertex adjacent to $v''$ in $T^*_G$ but not in $T_G$. In the first case it is possible to get from $T^*_G$, by a permissible transformation, a path $T'_G$ whose final vertex is adjacent to $v''$ in $T^*_G$. Thus $v''$ should not belong to $X(T_G)$, which contradicts the choice of $v''$. Suppose that the second case holds and $v''$ is adjacent to $v_i$ and $v_j$ in $T^*_G$ in such a way that at least one of the edges $(v'', v_i)$ and $(v'', v_j)$ does not belong to $T_G$.
Consequently, this edge should appear in a path obtained during the process of permissible transformations turning the path $T_G$ into $T^*_G$. However, then one of the vertices which are adjacent to $v''$ in $T_G$ is the final vertex of that path. Hence $v''$ is adjacent in $T_G$ with a vertex from $H(T_G)$. As before, we arrive at a contradiction with the fact that $v'' \in X(T_G)$. □
Let $\mathcal{G}^1_p(n, \varphi(n))$ denote the set of all graphs $G \in \mathcal{G}_p(n)$ such that there are in $G$ at least two vertices at distance at most 2 with degrees at most $\log n/\sqrt{\varphi(n)}$.
Lemma 2. Let $p = \frac{1}{2} n(\log n + \log\log n + \varphi(n))/\binom{n}{2}$, where $\varphi(n) \to \infty$ as $n \to \infty$, and $\varphi(n)$
Proof. It is easy to see that the probability of getting a random graph from $\mathcal{G}_p(n)$ in which two fixed vertices $v_i$ and $v_j$ are adjacent and $d(v_i) = s$, $d(v_j) = t$ is equal to
$$\binom{n-2}{s-1}\binom{n-2}{t-1} p^{s+t-1}(1-p)^{2n-s-t-2} < \frac{n^{s+t-2}\, p^{s+t-1}(1-p)^{2n-s-t-2}}{(s-1)!\,(t-1)!}.$$
Therefore, the probability $p_1$ that a random graph belongs to $\mathcal{G}^1_p(n, \varphi(n))$ and possesses two adjacent vertices with degrees at most $\log n/\sqrt{\varphi(n)}$ is less than
$$\cdots \times (1-p)^{2n-s-t-2} / \big((s-1)!\,(t-1)!\big).$$
Now we find an upper bound for the probability $p_2$ that a random graph belongs to $\mathcal{G}^1_p(n, \varphi(n))$ and possesses two vertices at distance 2 with degrees at most $\log n/\sqrt{\varphi(n)}$. It is easy to see that the probability of obtaining a graph from $\mathcal{G}^1_p(n, \varphi(n))$ in which two fixed vertices $v_i$ and $v_j$ are at distance 2, $d(v_i) = s$ and $d(v_j) = t$, is not
greater than
$$\frac{n^{s+t-1}\, p^{s+t}(1-p)^{2n-s-t-4}}{(s-1)!\,(t-1)!}.$$
Thus
$$p_1 + p_2 = o(1) \tag{1}$$
according to the form of the probability $p$. This completes the proof of Lemma 2. □
Let $\mathcal{G}^2_p(n)$ denote the set of graphs $G \in \mathcal{G}_p(n)$ such that there is in $G$ at least one stable path $T_G$ with $|H(T_G)| < \frac{1}{4}n$.
Lemma 3. Let $p = \frac{1}{2} n(\log n + \log\log n + \varphi(n))/\binom{n}{2}$, where $\varphi(n) \to \infty$ as $n \to \infty$ and $\varphi(n)$
Let $G$ be a graph from $\mathcal{G}_p(n, s, t)$ and $T_G$ a stable path in $G$ with $|H(T_G)| = s$, $|U(T_G)| = t$. Obviously, there are at least $s + \lfloor t/2 \rfloor$ edges in $T_G$ with both ends in the set $H(T_G) \cup U(T_G)$. Based on this fact and Lemma 1, it is easy to see that one can obtain all graphs from $\mathcal{G}_p(n, s, t)$ (and some others) in the following way.
1. We choose from $V = \{v_1, \ldots, v_n\}$ an $s$-element subset $V_1$, $\lfloor \log n/\sqrt{\varphi(n)} \rfloor < s < n/4$. There are $\binom{n}{s} < n^s/s! < (en/s)^s$ possibilities.
2. We choose from $V - V_1$ a $t$-element subset $V_2$, $1 \leq t \leq 2s$. Given $s$ and $t$, there are $\binom{n-s}{t} < n^t/t!$ possibilities.
3. We choose $s + \lfloor t/2 \rfloor$ pairs from the set of all pairs of vertices of $V_1 \cup V_2$. There are $\binom{\binom{s+t}{2}}{s + \lfloor t/2 \rfloor}$ possibilities.
4. Each chosen pair is adjacent. The probability of this event is equal to $p^{s + \lfloor t/2 \rfloor}$.
5. If $v_i \in V_1$ and $v_j \in V \setminus (V_1 \cup V_2)$ then $v_i$ and $v_j$ are not adjacent. The probability of this event is equal to $(1-p)^{s(n-s-t)} \leq (1-p)^{s(n-3s)} < e^{-ps(n-3s)}$.
6. Other pairs of vertices from $V$ are adjacent with probability $p$ and nonadjacent with probability $1-p$.
It follows from (1) and 1-6 that the probability of getting a random graph from $\mathcal{G}^2_p(n)$ is less than
Since
$$p = \frac{1}{n-1}\big(\log n + \log\log n + \varphi(n)\big) < 3\log n/n$$
for $n$ large enough, then
Lemma 4. Let $p = \frac{1}{2} n(\log n + \log\log n + \varphi(n))/\binom{n}{2}$, where $\varphi(n) \to \infty$ as $n \to \infty$ and $\varphi(n)$
1. We choose from $V$ an $s$-element subset $V_1$, $s = \lfloor n/4 \rfloor$. There are $\binom{n}{s}$ possibilities.
2. We choose from $V - V_1$ an $r$-element subset $V_2$, $r = [3n/\log n]$. There are $\binom{n-s}{r}$ possibilities.
3. There are no edges between $V_1$ and $V_2$. The probability of this event is equal to $(1-p)^{sr} < e^{-srp}$.
4. Other edges are present with probability $p$ and are absent with probability $1-p$.
Hence for large $n$ we have
Suppose that a stable path $T_G$ of a random graph $G \in \mathcal{G}_p(n)$ is such that $|H(T_G)| > n/4$. By Lemma 3, the probability of this event tends to 1 as $n \to \infty$.
Consider $s = \lfloor n/4 \rfloor$ paths with distinct final vertices which one can get from $T_G$ by permissible transformations. We denote by $V_1$ the set of these final vertices. If there are in $T_G$ fewer than $n - \frac{3n}{\log n}$ vertices then, according to (2), with probability tending to 1 as $n \to \infty$, there is a vertex in $V_1$ adjacent with a vertex outside $T_G$. This contradicts the stability of the path $T_G$. □
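The counting in steps 1-4 of this proof suggests a union bound of the form $\binom{n}{s}\binom{n-s}{r}e^{-srp}$ for the probability that some pair of sets $V_1$, $V_2$ has no edge between them. The sketch below evaluates the logarithm of that expression numerically. It is only an illustration under stated assumptions: the function names are hypothetical, $r$ is taken as the integer part of $3n/\log n$, $\varphi$ is passed in as a plain number, and the expression is the bound read off from the steps, not necessarily the displayed inequality (2) of the source.

```python
import math

def log_binom(n, k):
    """Natural logarithm of C(n, k) via log-gamma; also usable for very large n."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def edge_probability(n, phi):
    """p = (n/2)(log n + log log n + phi) / C(n, 2), which simplifies to (...)/(n - 1)."""
    return (math.log(n) + math.log(math.log(n)) + phi) / (n - 1)

def log_union_bound(n, phi=0.0):
    """log of C(n, s) * C(n - s, r) * exp(-s r p) with s = floor(n/4), r = floor(3n/log n)."""
    s = math.floor(n / 4)
    r = math.floor(3 * n / math.log(n))
    p = edge_probability(n, phi)
    return log_binom(n, s) + log_binom(n - s, r) - s * r * p

small_n = log_union_bound(1e6)   # still positive: the bound is not yet below 1
huge_n = log_union_bound(1e40)   # negative: the bound is below 1
```

With these particular constants the expression drops below 1 only for extremely large $n$, which is consistent with the statement being asymptotic ("for large $n$").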
Proof of Theorem 2. It consists of two parts. First, we will prove that, with probability tending to 1 as $n \to \infty$, there is in a random graph from $\mathcal{G}_p(n)$ a hamiltonian path, i.e. a path crossing each vertex once.
Part 1. Let us denote by $\mathcal{G}^3_p(n)$ the set of those random graphs $G \in \mathcal{G}_p(n)$ in which every stable path $T_G$ contains at least $n - \frac{3n}{\log n}$ vertices and $|H(T_G)| > \frac{n}{4}$. By Lemmas 2-4 it follows that, with probability tending to 1 as $n \to \infty$, a random graph from $\mathcal{G}_p(n)$ belongs to the set $\mathcal{G}^3_p(n)$.
Put $k = \ldots$, $n' = n - k$ and $V' = \{v_1, \ldots, v_{n'}\}$. It is easy to see that $p$ can be represented in the form
$$p = \frac{1}{2} n'\big(\log n' + \log\log n' + \varphi'(n')\big)\Big/\binom{n'}{2},$$
where $\varphi'(n') \to \infty$ as $n \to \infty$. Thus we can use Lemmas 3 and 4 and therefore, with probability tending to 1 as $n \to \infty$, a random graph from $\mathcal{G}_p(n')$ belongs to the set $\mathcal{G}^3_p(n')$. Let $G$ be a graph from $\mathcal{G}^3_p(n')$, $T_G$ a stable path in $G$ of maximal length, $V_1$ the set of vertices lying on $T_G$ and $|V_1| = l$. The vertices outside $V_1$ are called peripheral. According to Lemma 4, the number $l$ fulfils the inequality
$$l > n' - 3n'/\log n' > n' - 3n/\log n.$$
Put $V_2 = \{v_{n'+1}, \ldots, v_n\} = \{v_{n-k+1}, \ldots, v_n\}$, $V = V' \cup V_2$. Here we describe a process of generating random edges between the sets $V_2$ and $V'$, and at the same time of lengthening the path $T_G$, that allows us to conclude in a simple way that, with probability tending to 1 as $n \to \infty$, there is in a random graph from $\mathcal{G}_p(n)$ a hamiltonian path. These edges are generated in a few steps. Now we present the first one. For each vertex $v_i \in H(T_G)$ we choose from the set $\mathcal{A}(T_G)$ a path with $v_i$ as the final vertex. Then we take a vertex from $H(T_G)$ with the minimal index (say $v_{i_1}$) and join by an edge every pair of vertices $v_{i_1}, v_j$, $j = n'+1, \ldots, n$, with probability $p$. These pairs of vertices we call utilized. If no edge has appeared then in a
similar way we look at pairs $v_{i_2}, v_j$, where $v_{i_2}$ is the next vertex from $H(T_G)$. This process continues until there appears at least one edge. Suppose $(v_{i_1}, v_{j_1})$, $v_{j_1} \in V_2$, $v_{i_1} \in H(T_G)$, is such an edge. If all vertices of $H(T_G)$ are utilized and no edge has appeared then the process is terminated. If the edge $(v_{j_1}, v_{i_1})$ has appeared then we consider a path with $v_{j_1}$ as the initial vertex, consisting of the edge $(v_{j_1}, v_{i_1})$ and that path from $\mathcal{A}(T_G)$ which has $v_{i_1}$ as the final vertex. We lengthen this path to a stable one in $G$ and denote it by $T'$. We associate with $T'$ the set $\mathcal{A}(T')$ consisting of paths with the initial vertex $v_{j_1}$, obtainable from $T'$ by a sequence of permissible transformations. Every unutilized pair $v_{j_1}, v_i$, where $v_i \in V'$, we join by an edge with probability $p$ and call it utilized. If at least one edge $(v_{j_1}, v_i)$ appears, where $v_i \in H(T')$, then we get a cycle consisting of the edge $(v_{j_1}, v_i)$ and a path from $\mathcal{A}(T')$ which has $v_i$ as the final vertex. If no cycle has appeared but $d(v_{j_1}) \geq 2$ then we include $v_{j_1}$ into the set of peripheral vertices. Otherwise we stop the process of lengthening the path $T_G$.
Let a cycle appear. Since $G$ is connected, there is a vertex $v'$ among the peripheral vertices which is adjacent with a vertex from this cycle (say $v''$). Let $v''$ be adjacent to vertices $v_i$ and $v_j$ on the cycle, $i < j$. Instead of $T_G$ we will consider now a path consisting of the edge $(v', v'')$ and that path which is formed from the cycle by removing the edge $(v'', v_j)$. Let $v'$ be the final vertex of this path. We lengthen this path to a stable one in $G$. It contains at least two more vertices than $T_G$: a nonperipheral one from $V_2$ and at least one peripheral vertex. We exclude these new vertices from the set of peripheral vertices. The first step is complete.
A number of further steps are analogous to the previous one. Each time we get a new stable path which contains one new nonperipheral vertex and at least one other new vertex which has been peripheral up to that moment and which we now exclude from the set of peripheral vertices. These steps are complete when either all peripheral vertices disappear or the process of lengthening the current path comes to an end. If all peripheral vertices have disappeared, then the process of including the remaining vertices into the present path is as follows. Let $T'$ be the selected stable path with $v_i$ as the initial vertex and $v_j$ the vertex with minimal index outside $T'$. Then every unutilized pair $v_j, v_r$, where $v_r \in V$, is joined by an edge with probability $p$ and called utilized. Suppose now that $v_j$ turned out to be nonadjacent with any vertex from $H(T')$. Then $v_j$ becomes a peripheral vertex and it will be included into the present path on a forthcoming step in a similar way as above. Let us assume now the opposite, i.e. $v_j$ is adjacent with a vertex from $H(T')$ (say $v_r$). Then $v_j$ belongs to a path consisting of the edge $(v_j, v_r)$ and that path from $\mathcal{A}(T')$ which has $v_r$ as final vertex. In this case instead of $T'$ we consider the new path, and the next vertex outside it is included in a similar way. This process of extension of a path continues until it breaks according to the rules described above or the path becomes hamiltonian.
It is easy to see that this process of lengthening $T_G$ to a hamiltonian path has, in fact, binomial distribution. Hence, it is not too difficult to show that, with probability tending to 1 as $n \to \infty$, there is in a random graph from $\mathcal{G}_p(n)$ a hamiltonian path. And now we come to the second part of the proof.
Part 2. Consider a random graph from … . From the first part and Lemma 3 it follows that the probability of obtaining such a graph from … such that $|H(T_G)| >$ … an edge each pair $v_n, v_j$ with probability $p$. Since $|H(T_G)| > \frac{n-1}{4}$, with probability tending to 1 as $n \to \infty$ the vertex $v_n$ is adjacent with at least one vertex from $H(T_G)$, say $v_j$. Then we get a path consisting of the edge $(v_n, v_j)$ and a path from $\mathcal{A}(T_G)$ with $v_j$ as the final vertex. We denote this path by $T'$. Now, with probability $p$ we join each pair $v_n, v_j$ where $v_j \notin H(T_G)$. It is easy to see that, with probability tending to 1 as $n \to \infty$, the vertex $v_n$ becomes adjacent with at least one vertex from $H(T')$, say $v_r$. Hence we have obtained a hamiltonian cycle consisting of the edge $(v_n, v_r)$ and a hamiltonian path from $\mathcal{A}(T')$ with $v_r$ as final vertex. □
Comment. The first result on a hamiltonian cycle in random graphs was obtained in [15]. There an algorithm was described which selects a hamiltonian cycle with probability tending to 1 as $n \to \infty$, if $p \geq 2\sqrt{\log n / n}$. Improved versions of that algorithm were later described in [6, 7, 16]. Next this problem was investigated in [1, 3, 8, 10, 14, 17-21]. In [10], using the method of second moment based on Chebyshev's inequality, it was shown that if $k = n^{3/2}\varphi(n)$, $\varphi(n) \to \infty$ as $n \to \infty$, then there is a hamiltonian cycle in almost every graph from $\mathcal{G}(n, k)$. (An analogous fact was proved in [18].) Additionally, it was shown in [10] that this method fails for values less than $k$. Slightly weaker results than those of [11, 12] were obtained in [13, 21]. In [8] it was proved that if $k > n \exp(2\sqrt{\log n \log\log n})$, then almost every graph from
Acknowledgements
I would like to express my thanks to Andrzej Ruciński for his translation of this paper.
References
[1] D. Angluin and L. Valiant, Fast probabilistic algorithms for hamiltonian circuits and matchings, J. Comput. System Sci. 18 (2) (1979) 155-193.
[2] B. Bollobás, Graph Theory - An Introductory Course (Springer, New York, 1979).
[3] V. Chvátal and P. Erdős, A note on hamiltonian circuits, Discrete Math. 2 (2) (1972) 111-113.
[4] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5 (1-2) (1960) 17-61.
[5] P. Erdős and A. Rényi, On the strength of connectedness of a random graph, Acta Math. Acad. Sci. Hungar. 12 (1961) 261-267.
[6] E. H. Gimadi and V. A. Perepelica, A statistically effective algorithm for the selection of a hamiltonian contour (cycle), Diskret. Analiz 22 (1973) 15-28 (in Russian).
[7] E. H. Gimadi and V. A. Perepelica, An asymptotic approach to the solution of the travelling salesman problem, Upravljaemye Sistemy 12 (1974) 35-45 (in Russian).
[8] J. Komlós and E. Szemerédi, Hamiltonian cycles in random graphs, in: A. Hajnal, R. Rado and V. T. Sós, eds., Infinite and Finite Sets, Colloq. Math. Soc. J. Bolyai 10 (North-Holland, Amsterdam, 1975) pp. 1003-1011.
[9] J. Komlós and E. Szemerédi, Limit distribution for the existence of hamiltonian cycles in a random graph, Discrete Math. 43 (1) (1983) 55-63.
[10] A. D. Korshunov, The number of pairs of hamiltonian cycles of complete graph having the prescribed number of edges with a given cycle, Upravljaemye Sistemy 13 (1974) 40-54 (in Russian).
[11] A. D. Korshunov, Solution of a problem of Erdős and Rényi on hamiltonian cycles in undirected graphs, Dokl. Akad. Nauk SSSR 228 (3) (1976) 529-532 (in Russian).
[12] A. D. Korshunov, Solution of a problem of Erdős and Rényi on hamiltonian cycles in undirected graphs, Metody Diskret. Analiz. 31 (1977) 17-56 (in Russian).
[13] Le Cong Thanh and Phan Dinh Dieu, Asymptotic estimates for some parameters of finite graphs and their applications, Acta Math. Vietnam. 3 (1) (1978) 51-79 (in Russian).
[14] I. Palásti, On Hamilton-cycles of random graphs, Period. Math. Hungar. 1 (2) (1971) 107-112.
[15] V. A. Perepelica, On two problems from graph theory, Dokl. Akad. Nauk SSSR 194 (6) (1970) 1269-1272 (in Russian).
[16] V. A. Perepelica, An asymptotic approach to the solution of certain extremal problems on graphs, Problemy Kibernet. 26 (1973) 291-314 (in Russian).
[17] L. Pósa, Hamiltonian circuits in random graphs, Discrete Math. 14 (4) (1976) 359-364.
[18] E. M. Wright, For how many edges is a graph almost certainly hamiltonian?, J. London Math. Soc. 8 (1) (1974) 44-48.
[19] E. M. Wright, The proportion of unlabelled graphs which are hamiltonian, Notices Amer. Math. Soc. 159 (1975) A-1.
[20] E. M. Wright, The proportion of unlabelled graphs which are hamiltonian, Bull. London Math. Soc. 8 (3) (1976) 241-244.
[21] P. E. O'Neil, Asymptotics in random (0, 1)-matrices, Proc. Amer. Math. Soc. 25 (2) (1970) 290-296.
Annals of Discrete Mathematics 28 (1985) 181-188 © Elsevier Science Publishers B. V. (North-Holland)
LOCALLY DEPENDENT RANDOM GRAPHS AND THEIR USE IN THE STUDY OF EPIDEMIC MODELS
Kari KUULASMAA*
Department of Applied Mathematics and Statistics, University of Oulu, SF-90570 Oulu 57, Finland
A locally dependent random graph is a generalization of directed bond and site percolation processes, where the dependence structure of the bonds from any particular site being open is general. The marginal probabilities for different bonds being open need not be equal, so that it is possible that there are an infinite number of bonds from a site. This paper summarizes some results about locally dependent random graphs and shows how they have been used to investigate the threshold behaviour of a continuous time spatial general epidemic model.
1. Introduction
Let $(V, E)$ be a graph, where $V$ is a countable set of vertices and $E$ is a set of edges between the vertices. A bond percolation process (see also Welsh [12] or Wierman [13]) can be defined as such a graph with random black and white colouring of the edges such that any particular edge is black with probability $\pi$, $0 < \pi < 1$, and white with probability $1 - \pi$, independently of the colours of the other edges. Assuming that a finite set $S$ of source vertices is given, we are interested in the percolation properties of the system: what is the set of vertices which can be reached from the source vertices by following black edges only. In the case of an infinite graph a question of particular interest is the probability that this set is infinite, the so-called percolation probability. Unlike in many other applications, such as percolation of fluid through a porous material, in the situation of biological and epidemiological spread the randomness is not really implied by the edges but by the vertices at the ends of them, for example by the infected individuals, the edges now indicating neighbourship. Thus it may not be reasonable to assume that the colours of different edges from a particular vertex are independent. This motivates us to study percolation models where we allow dependence between edges from the same vertex. We shall, however, still assume that the colours of edges from different vertices are independent.
* Now at National Public Health Institute, Mannerheimintie 166, SF-00280 Helsinki, Finland.
This can be interpreted as the assumption that different infectious individuals behave independently, but it excludes the effect of possible external random factors such as temperature or wind. In order that each edge comes from a unique vertex we have to restrict attention to directed graphs. We call such percolation models locally dependent random graphs. In addition to the directed bond percolation models they also include site percolation processes, where every edge from any particular vertex has the same colour; they are all black, with probability $\pi$, or all white, with probability $1 - \pi$. Other important locally dependent random graphs, with less simple dependence structure, are those generated by certain spatial general epidemic processes (see Section 5). A bond or site percolation process
We shall introduce some notation and give a rigorous definition for a locally dependent random graph. Let $G = (V, E)$ be a directed graph such that the set $V$ of vertices is countable, and the set $E$ of directed edges contains no loops. For every $v \in V$, $E_v \subset E$ is the set of edges with $v$ as their initial vertex. We assume that every $E_v$ is countable. Hence $E = \bigcup_{v \in V} E_v$ is countable. $\mathcal{E}$ is the set of all subsets of $E$, $\mathcal{E}_v$ is the set of all subsets of $E_v$, and $\mathcal{F}_v$ is the set of all finite subsets of $E_v$. The elements of $\mathcal{E}$ we call configurations. For each $v \in V$, $P_v$ is a probability measure defined on $(\mathcal{E}_v, \mathcal{A}_v)$, where $\mathcal{A}_v$ is the $\sigma$-field generated by the cylinder sets in $\mathcal{E}_v$ (note that an alternative definition for $\mathcal{E}_v$ is $\{0, 1\}^{E_v}$). We assume that $P_v(\mathcal{F}_v) = 1$ for every $v \in V$. In other words, the random subset of $E_v$ determined by $P_v$ is finite almost certainly. $P$ denotes the product measure on $\mathcal{E}$ defined by $P = \prod_{v \in V} P_v$. The pair $(G, P)$ is called a locally dependent random graph, or for short, in this paper, a random graph. For $P_v$, $v \in V$, we define two useful functions, the avoidance function $p_v$ and the distribution function $d_v$ on $\mathcal{F}_v$, the set of finite subsets of $E_v$, by
$$p_v(I) = P_v(\{J \subset E_v : I \cap J = \emptyset\}),$$
$$d_v(I) = P_v(\{J \subset E_v : J \subset I\}).$$
By making use of the principle of inclusion-exclusion (e.g. Rota [11]) and Kolmogorov's consistency theorem, one can show that either one of $p_v$ and $d_v$ defines $P_v$ uniquely. We interpret the random graph $(G, P)$ as the graph $G$ with all its edges coloured black or white, such that the set of black edges is determined by the probability measure $P$. For $v \in V$ and $I \in \mathcal{F}_v$, the avoidance function $p_v(I)$ gives the probability that every edge of the set $I$ of edges from $v$ is white. The distribution function $d_v(I)$ indicates the probability that every black edge from $v$ is included in $I$.
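For a vertex with finitely many outgoing edges, the two functions and the inclusion-exclusion step can be written out directly. The following is a minimal sketch under assumptions made for illustration: the law $P_v$ is represented as a dictionary from frozensets of black edges to probabilities, and all function and variable names are hypothetical.

```python
from itertools import combinations

def subsets(edges):
    """All subsets of a finite edge collection, as frozensets."""
    edges = list(edges)
    for k in range(len(edges) + 1):
        for combo in combinations(edges, k):
            yield frozenset(combo)

def avoidance(law, I):
    """p_v(I): probability that no edge of I is black."""
    I = frozenset(I)
    return sum(pr for J, pr in law.items() if not (I & J))

def distribution(law, I):
    """d_v(I): probability that every black edge lies in I."""
    I = frozenset(I)
    return sum(pr for J, pr in law.items() if J <= I)

def law_from_distribution(d, edges):
    """Recover P_v({J}) from d_v by inclusion-exclusion over the subsets of J."""
    return {
        J: sum((-1) ** (len(J) - len(K)) * d(K) for K in subsets(J))
        for J in subsets(edges)
    }

# Hypothetical vertex with two outgoing edges 'a' and 'b', marginal probability 1/4.
pi = 0.25
edges = ("a", "b")
bond_law = {J: pi ** len(J) * (1 - pi) ** (2 - len(J)) for J in subsets(edges)}
site_law = {frozenset(): 1 - pi, frozenset(edges): pi}
recovered = law_from_distribution(lambda K: distribution(site_law, K), edges)
```

For a finite $E_v$ the two functions carry the same information, since $p_v(I) = d_v(E_v \setminus I)$.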
3. Paths and the percolation probability
A nonempty set $\xi = \{e_1, \ldots, e_n\}$ of edges is called a (self-avoiding) path if there are distinct vertices $v_1, \ldots, v_{n+1}$ such that $e_i \in E_{v_i}$ and $v_{i+1}$ is the terminating vertex of $e_i$ for $i = 1, \ldots, n$. If $v_1 \in S \subset V$ the path is said to be from $S$ to $v_{n+1}$. An infinite set $\{e_1, e_2, \ldots\} \subset E$ is a path if $\{e_1, \ldots, e_n\}$ is a path for every $n$. The truncations $\xi_n$, $n = 1, 2, \ldots$, of a path $\xi = \{e_1, e_2, \ldots\}$ are defined by $\xi_n = \{e_1, \ldots, e_n\}$.
We say that a path $\xi$ is black if it is contained in a given configuration $\omega \in \mathcal{E}$, or in other words, if all of its edges are black in configuration $\omega$. For any set $\Xi$ of paths we define two subsets $\mathcal{B}^{\Xi}$ and $\mathcal{B}^{\Xi+}$ of $\mathcal{E}$ by
…
and
…
$\{\xi_n : \xi \in \Xi\}$. We assume by convention that … . $\mathcal{B}^{\Xi}$ is the set where at least one path of $\Xi$ is black. It is measurable at least if $\Xi$ is countable, whereas $\mathcal{B}^{\Xi+}$ is always measurable. In a finite graph $\mathcal{B}^{\Xi}$ and $\mathcal{B}^{\Xi+}$ are always equal. We shall assume that there is always given a fixed finite set $S$ of vertices, the set of source vertices. For a vertex $v \in V \setminus S$ we denote the set of all paths from $S$ to $v$ by $\Xi_v$. $\mathcal{B}^{\Xi_v}$, which is measurable, is the event in the Introduction we called "$v$ can be reached from the source vertices by following black edges only."
For any $\omega \in \mathcal{E}$ let $\sigma(\omega) \subset E$ denote the set of edges which in configuration $\omega$ belong to some black path from the source vertices. The set $\{\omega \in \mathcal{E} : \#\sigma(\omega) = \infty\}$ is measurable, and we call its probability the percolation probability. Since the number of black edges from any particular vertex is assumed to be finite almost certainly, the percolation probability also indicates the probability that an infinite number of vertices can be reached from the source vertices along black paths. Let $\Xi_\infty$ denote the set of all infinite paths from the source vertices. The percolation probability can be expressed by means of $\Xi_\infty$:
Lemma 3.1. The percolation probability is equal to $P(\mathcal{B}^{\Xi_\infty +})$.
Proof. If $\#\sigma(\omega)$ is infinite then either $\omega$ contains an infinite path from $S$, i.e. $\omega \in \mathcal{B}^{\Xi_\infty +}$, or $\omega$ has an infinite number of edges from some vertex. Since the latter event has probability zero, the set $\{\omega : \#\sigma(\omega) = \infty\} \setminus \mathcal{B}^{\Xi_\infty +}$ is a null set. But since
$$\mathcal{B}^{\Xi_\infty +} \subset \{\omega : \#\sigma(\omega) = \infty\}$$
we have
$$P(\{\omega : \#\sigma(\omega) = \infty\}) = P(\mathcal{B}^{\Xi_\infty +}).$$
□
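On a finite configuration, the set of vertices that can be reached from the source vertices along black edges is an ordinary directed reachability computation. The following is a minimal sketch with hypothetical names; a configuration is assumed to be given as a list of black directed edges `(u, w)`. Reachability along arbitrary black walks coincides with reachability along self-avoiding black paths, so a breadth-first search suffices.

```python
from collections import deque

def reached_vertices(black_edges, sources):
    """Vertices reachable from `sources` along black directed edges (sources included)."""
    out = {}
    for u, w in black_edges:
        out.setdefault(u, []).append(w)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in out.get(u, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

# Hypothetical configuration: the edge (3, 4) is black but cannot be reached from 0.
example_reached = reached_vertices([(0, 1), (1, 2), (2, 0), (3, 4)], [0])
```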
The following theorem, a version of the "General Clutter Percolation Theorem" of McDiarmid [8], is very useful for comparisons of percolation probabilities on random graphs with different probability measures.
Theorem 3.2. Let $(G, P)$ and $(G, Q)$ be two locally dependent random graphs, defined on the same directed graph $G$, with avoidance functions $\{p_v\}$ and $\{q_v\}$, respectively. If $p_v \geq q_v$ for every $v \in V$ we have $P(\mathcal{B}^{\Xi+}) \leq Q(\mathcal{B}^{\Xi+})$.
The proof for the case where the probabilities of $\mathcal{B}^{\Xi}$ are compared for a set $\Xi$ of finite paths only can be found in Kuulasmaa [5], and the theorem can easily be extended to the more general $\Xi$. As an application of Theorem 3.2 we get immediately the result of Hammersley [2] that the percolation probability in a bond percolation process is higher than in the corresponding site process.
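The bond versus site comparison can be checked at the level of avoidance functions. In a bond process with marginal probability $\pi$ the edges from a vertex are independent, so a set of $m$ edges is avoided with probability $(1-\pi)^m$; in the site process all edges from a vertex share one colour, so any nonempty set is avoided with probability $1-\pi$. The sketch below (hypothetical function names) checks on a grid of values that the site avoidance function is never smaller than the bond one, which is the hypothesis of Theorem 3.2 with $P$ the site measure and $Q$ the bond measure.

```python
def bond_avoidance(pi, m):
    """Avoidance probability of a set of m edges from one vertex, independent edges."""
    return (1 - pi) ** m

def site_avoidance(pi, m):
    """Avoidance probability of a set of m edges from one vertex, all edges sharing one colour."""
    return 1.0 if m == 0 else 1 - pi

# Compare the two avoidance functions for pi = 0.00, 0.05, ..., 1.00 and m = 0..7.
comparison_holds = all(
    site_avoidance(k / 20, m) >= bond_avoidance(k / 20, m)
    for k in range(21)
    for m in range(8)
)
```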
4. The product representation of a locally fiaite random graph
We call a locally dependent random graph a product random graph if all the edges can be grouped in such a way that within each group the edges have the same colour with probability one, and between groups the colours of edges
are mutually independent. A product random graph is always simple in the sense that its measure can be expressed by means of a set of independent binary measures. Examples of product random graphs are bond and site percolation processes. Sometimes to a random graph $((V, E), P)$ there corresponds a product random graph $((V, E'), Q)$ with the same set of vertices such that the probability distributions which determine the sets of terminating vertices of black paths from the source vertices are the same in these random graphs. Then we say that the graph $((V, E), P)$ has a product representation. As the price of the simplicity of its measure, the product representation $((V, E'), Q)$ usually has more edges than the original random graph $((V, E), P)$. The following theorem tells us when a random graph has a product representation.
Theorem 4.1. A locally dependent random graph $((V, E), P)$ with distribution functions $\{d_v\}$ has a product representation if and only if the following two conditions are valid for every $v \in V$:
(i) there exists a finite subset $E_{0v} \subset E_v$ such that $d_v(K) > 0$ if and only if $E_{0v} \subset K$,
and
(ii) $$\pi_v(K) = \prod_{J \subset K} d_v(J \cup E_{0v})^{(-1)^{\#(K \setminus J) + 1}} \leq 1$$
for every finite and nonempty $K \subset E_v \setminus E_{0v}$.
Note that if $d_v(\emptyset) > 0$ then $E_{0v}$ of condition (i) is the empty set. For more details of the product representation and for a proof of Theorem 4.1 see Kuulasmaa [6].
5. A spatial general epidemic model
Mollison (see [9, 10]) has defined a spatial general epidemic $GE(Z^d, \alpha, p, F)$ as follows. Let the set of sites be $Z^d$, the $d$-dimensional integer lattice, and let $S$ be a finite subset of $Z^d$. We assume that $\alpha$ is a strictly positive real number, $p$ is a probability density defined on $Z^d$ such that $p(0) = 0$, and $F$ is a probability distribution function concentrated on $(0, \infty)$. At time zero there is an infectious individual at each site of $S$, and the rest of the sites are occupied by healthy individuals. The infectives emit germs independently in Poisson processes with rates $\alpha$ until they are removed, each independently after having been infectious for a random length of time with distribution $F$. After an individual has been removed, her site remains empty for ever. Each emitted germ goes independently to a site whose location with respect to the location of the parent is chosen according to the contact distribution
$p$. If a healthy individual gets a germ she becomes infected and starts to emit germs until she is removed after an infectious time with distribution $F$. If an infected individual or an empty site receives a germ nothing happens. Mollison [9, 10] has studied the velocity of the front of the corresponding simple epidemic, where the infected individuals remain infectious for ever. His results provide upper bounds also for the velocity of the general epidemic.

The most important question about the general epidemic is whether it is possible that the infection never dies out. This happens almost surely if and only if infinitely many individuals are ultimately infected. Indeed, we can make a simple comparison to find out that the process never explodes, or in other words, that only a finite number of individuals will become infected in a finite time: the number of infections in a general epidemic is dominated by the corresponding Yule process, where every emitted germ causes a new infection and the infectives remain infectious for ever. The probability of explosion of a Yule process is zero (Feller [1, Sections XVII 3 and 4]).

The problem of extinction of the epidemic reduces to one of random graphs. Let $G = (V, E)$ be the simple graph where $V = Z^d$ and for each $v \in V$, $E$ contains an edge from $v$ to $w$ if and only if $p(w - v) > 0$. Corresponding to the general epidemic $GE(Z^d, \alpha, p, F)$, we define a locally dependent random graph $(G, P)$ such that, assuming there is an infective at every vertex of $V$, the edge from $v$ to $w$ is black if and only if the infective at $v$ sends a germ to $w$ before she is removed. The random graph $(G, P)$ determines the ultimate spread of the epidemic: the individual at $v \in V$, $v \notin S$, will sooner or later be infected if and only if in the random graph there is a black path from $S$ to $v$. The percolation probability (see Lemma 3.1) indicates the probability that the infection never becomes extinct.
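The construction of the random graph $(G, P)$ can be illustrated by a small simulation. The sketch below is only an illustration under stated assumptions: the lattice $Z^2$ is replaced by a finite torus (an assumption made so that the code terminates), and the names `simulate_final_size`, `CONTACT` and `lifetime_sampler` are hypothetical. It uses the fact that germs aimed at an offset $z$ form a Poisson process of rate $\alpha p(z)$, so that, given the infectious period $T$ of the sender, the edge is black with probability $1 - e^{-\alpha p(z) T}$.

```python
import math
import random

# Hypothetical nearest-neighbour contact distribution p on Z^2 (p(0) = 0).
CONTACT = {(1, 0): 0.25, (-1, 0): 0.25, (0, 1): 0.25, (0, -1): 0.25}


def simulate_final_size(side, alpha, contact, lifetime_sampler, sources, rng):
    """Return the set of sites joined to `sources` by a black path.

    The lattice is replaced by a side x side torus (an assumption of this
    sketch). Each site gets one infectious period T drawn from F, as if an
    infective were present at every vertex.
    """
    lifetime = {(x, y): lifetime_sampler(rng)
                for x in range(side) for y in range(side)}
    infected = set(sources)
    frontier = list(infected)
    while frontier:
        x, y = frontier.pop()
        t = lifetime[(x, y)]
        for (dx, dy), pz in contact.items():
            # Edge (x, y) -> (x + dx, y + dy) is black with this probability;
            # all edges leaving the same site share the same T.
            p_black = 1.0 - math.exp(-alpha * pz * t)
            if rng.random() < p_black:
                target = ((x + dx) % side, (y + dy) % side)
                if target not in infected:
                    infected.add(target)
                    frontier.append(target)
    return infected
```

For instance, `simulate_final_size(20, 2.0, CONTACT, lambda r: r.expovariate(1.0), [(0, 0)], random.Random(1))` samples the ultimately infected set for exponentially distributed infectious periods. Because every site is expanded at most once, the colours of the edges leaving a site are sampled exactly once, jointly through the shared lifetime.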
Theorem 3.2 and the knowledge about site percolation processes can be used to prove a threshold theorem for the general epidemic process (Kuulasmaa [5]). It states that if in the general epidemic $GE(Z^d, \alpha, p, F)$, $d \ge 2$ and $p$ is properly at least two-dimensional, then there exists a critical infection rate $\alpha_c$ such that for $\alpha < \alpha_c$ the probability of extinction is one and for $\alpha > \alpha_c$ the probability of extinction is less than one. It is interesting to note that if $p$ is one-dimensional and has finite mean and if also $F$ has finite mean then the probability of extinction is always one (Kelly [3]).

Compared with an arbitrary general epidemic, one with constant lifetime is remarkably simple, since the colours of the edges of the random graph corresponding to a general epidemic $GE(Z^d, \alpha, p, F)$, where $p > 0$ at at least two sites, are mutually independent if and only if $F$ is degenerate. A proof of this statement is included at the end of this section.

Let $GE(Z^d, \alpha, p, F)$ be an arbitrary general epidemic with $(G, P)$ as the corresponding random graph. We can define two constant lifetime epidemics such that in the random graph, $(G, P^*)$ say, of one of them the marginal probability for
any edge to be black is the same as in $(G, P)$, and in the other, which has contact distribution $p$, the probability that an infective emits no germs is the same as in $GE(Z^d, \alpha, p, F)$. Let $(G, P_0)$ be the random graph of the latter constant lifetime process. We can use Theorem 3.2 to find out (Kuulasmaa [5], Kuulasmaa and Zachary [7]) that these constant lifetime epidemics provide both an upper bound and a lower bound for the probability of no extinction of $GE(Z^d, \alpha, p, F)$:
The random graph $(G, P)$ of $GE(Z^d, \alpha, p, F)$ has a product representation if $(-1)^{n-1}\psi^{(n)}(x) > 0$ in the interval $0 < x$ …

If $p > 0$ at at least two sites then the colours of the edges of the general epidemic $GE(Z^d, \alpha, p, F)$ are mutually independent if and only if $F$ is degenerate. The "if" part is simple (see Kuulasmaa [5]). For the "only if" part we need the "other" Chebyshev inequality, which states that if $T$ is a random variable and if $g$ and $h$ are two non-increasing functions such that $g(T)$ and $h(T)$ are integrable then $E[g(T)h(T)] \ge E[g(T)]\,E[h(T)]$, with equality only if $g(T)$ or $h(T)$ is degenerate (for a short proof of this inequality see Khatri [4]). Let $v$ and $w$ be two distinct sites such that $p(v) > 0$ and $p(w) > 0$ and let $F$ be nondegenerate. If $T$ is a random variable with distribution $F$ we can make use of the inequality above to see that
where $p(\{v, w\})$ is the probability that the edges from the origin to $v$ and $w$ are both white and $p(\{v\})$ and $p(\{w\})$ are the corresponding marginal probabilities for the edges. Hence the colours of these two edges are not independent.
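The dependence can be checked numerically. The following sketch assumes, as the Poisson description of germ emission suggests, that given the infectious period $T$ the edge from the origin to a site $z$ is white with probability $e^{-\alpha p(z) T}$, independently for different sites; the function name `white_probabilities` and the discrete lifetime laws used below are hypothetical illustrations, not taken from the paper.

```python
import math


def white_probabilities(alpha, pv, pw, lifetimes):
    """Joint and product-of-marginals probabilities that two edges are white.

    `lifetimes` is a list of (value, probability) pairs describing a discrete
    lifetime distribution F; pv and pw are the contact probabilities p(v), p(w).
    """
    def expect(rate):
        # E[exp(-rate * T)] for T distributed according to `lifetimes`.
        return sum(prob * math.exp(-rate * t) for t, prob in lifetimes)

    joint = expect(alpha * (pv + pw))
    product_of_marginals = expect(alpha * pv) * expect(alpha * pw)
    return joint, product_of_marginals


# Nondegenerate F: the joint probability strictly exceeds the product.
joint, prod = white_probabilities(1.0, 0.25, 0.25, [(1.0, 0.5), (3.0, 0.5)])
print(joint > prod)
```

With a degenerate lifetime, for example `[(2.0, 1.0)]`, the two returned values coincide up to rounding, in agreement with the "if" part of the statement.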
References
[1] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I, 3rd ed. (Wiley, New York, 1968).
[2] J. M. Hammersley, Comparison of atom and bond percolation processes, J. Math. Phys. 2 (1961) 728-733.
[3] F. P. Kelly, In discussion of Mollison [9] (1977) 318-319.
[4] C. G. Khatri, On certain inequalities for normal distributions and their applications to simultaneous confidence bounds, Ann. Math. Statist. 38 (1967) 1853-1867.
[5] K. Kuulasmaa, The spatial general epidemic and locally dependent random graphs, J. Appl. Prob. 19 (1982) 745-758.
[6] K. Kuulasmaa, The product representation of a locally dependent random graph, Stochastic Processes Appl. 17 (1984) 147-158.
[7] K. Kuulasmaa and S. Zachary, On spatial general epidemics and bond percolation processes, J. Appl. Prob. 21 (1984) 911-914.
[8] C. McDiarmid, General percolation and random graphs, Adv. Appl. Prob. 13 (1981) 40-60.
[9] D. Mollison, Spatial contact models for ecological and epidemic spread, J. R. Statist. Soc. B 39 (1977) 283-326.
[10] D. Mollison, Markovian contact processes, Adv. Appl. Prob. 10 (1978) 85-108.
[11] G.-C. Rota, On the foundations of combinatorial theory. I. Theory of Möbius functions, Z. Wahrscheinlichkeitsth. 2 (1963) 340-368.
[12] D. J. A. Welsh, Percolation and related topics, Science Progress 64 (1977) 65-83.
[13] J. C. Wierman, Critical percolation probabilities, Proc. of the Seminar "Random Graphs '83", Poznań, Poland, 1983, in: Ann. of Discrete Math. 28 (1985) 349-359 (North-Holland, Amsterdam, 1985).
Annals of Discrete Mathematics 28 (1985) 189-197 © Elsevier Science Publishers B.V. (North-Holland)
A RANDOM SAMPLING PROCEDURE FROM A FINITE POPULATION AND SOME APPLICATIONS
Ljuben MUTAFCIEV
The paper presents a random sampling procedure from a finite population and its applications to random graphs and epidemic processes on random mappings.
1. Introduction
Let $Z_n$ be a finite set of $n$ objects, a random part of which, say a subset $A \subset Z_n$, has a specific property $S$. We assume that $\xi_n = |A|$ is a random variable on a given probability space $(\Omega_n, \mathcal{A}_n, P_n)$. A sample with replacement is taken from $Z_n$ until the first object with property $S$ is discovered. Models of this sort are investigated from the statistical point of view in many papers devoted to sampling inspection plans [2]. Let $\tau(\xi_n)$ denote the number of the trial on which the sample has terminated. Clearly, for $\xi_n = l$, the conditional distribution of $\tau(\xi_n)$ is geometric with parameter $l/n$. It is easy to see that the unconditional distribution of $\tau(\xi_n)$ has the form of a mixture
where the kernel $K_n(x, y)$ is constructed by the last geometric probabilities and $Q_n$ is a probability measure on $[0, \infty)$ depending on $P_n$ and $\xi_n$ (see formulas (4) and (5) of Section 2). The asymptotic behaviour of mixtures, provided that the kernels and the measures are convergent in various senses, as $n \to \infty$, is investigated in detail in [11]. In Section 2 we assume that $\xi_n$, appropriately normalized, converges weakly, as $n \to \infty$, to a random variable $\xi$ and determine the asymptotic distribution of $\tau(\xi_n)$ by the distribution function of $\xi$. The proof is based on a limit theorem
of Serfozo [11] for integrals of type (1), analogous to a consequence of Vitali's classical theorem for a fixed measure (see, for example, [6]). A similar problem for mixtures of binomial distributions is considered in Section 2.3 of [11]. In Section 3 of our paper, $Z_n$ is assumed to be the vertex set of a random graph. The asymptotic result for the sampling size $\tau(\xi_n)$ is interpreted in terms of threshold functions (the concept of a threshold function was introduced by Erdős and Rényi [7]). In Section 4 we put $Z_n = \{1, 2, \dots, n\}$ and consider the family $\mathcal{T}_n$ of all single-valued mappings $T$ from $Z_n$ into itself, equipped with the uniform probability measure. Some applications to epidemic models on the graphs $G_T$ representing mappings $T \in \mathcal{T}_n$ (see [8, 5, 10]) are discussed.
2. A limit theorem for the sampling size $\tau(\xi_n)$
Our first aim is to show how the limiting distribution of $\tau(\xi_n)$ can be obtained as a consequence of more general asymptotic results for Lebesgue integrals with varying measures. Suppose that $X$ is an arbitrary metric space, and let $\mathcal{B}(X)$ be its Borel $\sigma$-field. The following lemma is a special case of a sufficient condition given in [11].

Lemma 1. Suppose that $\{\mu_n\}$ and $\{f_n\}$ are sequences of probability measures and measurable, real-valued and uniformly bounded functions on $(X, \mathcal{B}(X))$, respectively, satisfying the following conditions: (i) for any $\varepsilon > 0$,
$$\lim_{n\to\infty} \mu_n\{x : |f_n(x) - f(x)| > \varepsilon\} = 0,$$
where $f$ is a bounded and continuous function; (ii) the sequence $\{\mu_n\}$ converges weakly to a probability measure $\mu$. Then
$$\lim_{n\to\infty} \int_X f_n \, d\mu_n = \int_X f \, d\mu. \qquad (2)$$
The proof of this lemma coincides with the sufficiency of Theorem 2.8 in [11]. It is based on a result analogous to Vitali's classical integral convergence theorems for a fixed measure [6].

Note. We restricted ourselves to the case of uniformly bounded $\{f_n\}$, continuous $f$ and weakly convergent $\{\mu_n\}$, which is sufficient for our purpose. Serfozo's assumptions of tightness and uniform $\mu_n$-integrability for $\{f_n\}$ are more general and follow
immediately from the facts above. The convergence
is the only more restrictive condition in Serfozo's theorem. However, taking into account that $f$ is continuous and bounded, it is easy to see that (3) can be replaced by the weak convergence of $\{\mu_n\}$.
Consider now the sampling procedure and the random variable $\xi_n$ described in the Introduction. Let $R_+$ denote the real subinterval $[0, \infty)$ and let $\mathcal{B}_+$ be its Borel $\sigma$-field. Denote by $\{a_n\}$ a sequence of positive constants, and let
We shall use also the following notations: $N_n = n/a_n$; $y_{n,l} = l/a_n$, $l = 0, 1, \dots, n$;
The next result gives the convergence $K_n(x, y) \to e^{-xy}$ in $Q_n$-measure (see also condition (i) of Lemma 1).
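The limit $e^{-xy}$ can be made plausible by a short numerical check. The sketch below assumes that the kernel is the geometric tail probability $(1 - l/n)^k$ evaluated at $k = \lfloor x n / a_n \rfloor$ trials and $l = y a_n$ marked objects; this reading of the kernel, and the function name `geometric_tail`, are assumptions of the illustration rather than the paper's definition (5).

```python
import math


def geometric_tail(n, a_n, x, y):
    """Probability of no success in k trials with success probability l/n,

    where k = floor(x * n / a_n) and l = round(y * a_n). Under the assumed
    scaling with a_n = o(n), this should be close to exp(-x * y).
    """
    k = math.floor(x * n / a_n)
    l = round(y * a_n)
    return (1.0 - l / n) ** k


# Example with a_n = sqrt(n): the two values agree to several decimals.
n = 10 ** 8
print(geometric_tail(n, 10 ** 4, 1.5, 2.0), math.exp(-3.0))
```

The agreement improves as $a_n / n$ decreases, which is the role played by the assumption $a_n = o(n)$ in the lemma that follows.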
Lemma 2. Let $a_n \to \infty$ in such a way that $a_n = o(n)$, as $n \to \infty$. Suppose that there exists a sequence $\{b_n\}$ such that
where

Remembering that the integrands are uniformly bounded, by virtue of (7), for $I_2$ we get

To estimate $I_1$, observe first that for a fixed $x$ and $y \le b_n$, according to (5), we have

for $l$ such that $y_{n,l} \le y < y_{n,l+1}$, where

Hence

The inequalities $q_n(y_{n,l})$ …
Hence, $e^{x q_n(b_n)} - 1 = o(1)$, as $n \to \infty$, and therefore
$$\sup_{0 \le y \le b_n} |K_n(x, y) - e^{-xy}| = o(1),$$
which implies

Now relation (8) follows from (9)-(12). □
After these lemmas we are able to determine the asymptotic distribution of the sampling size $\tau(\xi_n)$.

Theorem. (a) Suppose the hypothesis of Lemma 2 for the numerical sequences $\{a_n\}$ and $\{b_n\}$ holds and the sequence of probability measures $\{Q_n(a_n; \cdot)\}$, given by (4), converges weakly, as $n \to \infty$, to a probability measure $Q(\cdot)$ on $(R_+, \mathcal{B}_+)$. Then

(b) Suppose $a_n = n$, $n = 1, 2, \dots$, and the sequence of probability measures $\{Q_n(n; \cdot)\}$, given by (4), converges weakly, as $n \to \infty$, to a probability measure $Q(\cdot)$ on $(R_+, \mathcal{B}_+)$. Then
$$\lim_{n\to\infty} P_n\{\tau(\xi_n) = k\} = \int_0^1 y(1-y)^{k-1}\, Q(dy), \quad k = 1, 2, \dots .$$
Proof. (a) Clearly, for $\xi_n = l$ the distribution of $\tau(\xi_n)$ is geometric with parameter $l/n$, so that the unconditional distribution of $\tau(\xi_n)$ is a mixture of the following type:
$$P_n\{\tau(\xi_n) > k\} = \sum_{l=0}^{n} \left(1 - \frac{l}{n}\right)^{k} P_n\{\xi_n = l\}, \quad k = 0, 1, \dots .$$
Putting here $k = xn/a_n = xN_n$, $x \in R_+$, and taking into account definition (4) and notations (5), we may write this in terms of Lebesgue integrals as follows.
Conditions (i) and (ii) of Lemma 1 are fulfilled by virtue of Lemma 2. Hence, by (2), we get

completing the proof of part (a) of the theorem.

(b) Let $E_n(\cdot)$ denote the mathematical expectation in the probability space $(\Omega_n, \mathcal{A}_n, P_n)$ (see Introduction). Then, by the formula of total probability, it is easy to check that
$$P_n\{\tau(\xi_n) = k\} = E_n\left\{\frac{\xi_n}{n}\left(1 - \frac{\xi_n}{n}\right)^{k-1}\right\}, \quad k = 1, 2, \dots$$
The random variables $\xi_n/n$ are uniformly bounded for $n = 1, 2, \dots$. So, by Theorem 5.4 in [4], we obtain assertion (b) of the theorem. □
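Part (b) can be illustrated with an exact computation. The sketch below assumes, purely for illustration, that $\xi_n$ is uniformly distributed on $\{1, \dots, n\}$, so that $\xi_n/n$ converges weakly to the uniform law on $[0, 1]$; in that hypothetical case the limit is $\int_0^1 y(1-y)^{k-1}\,dy = 1/(k(k+1))$. The function name `sampling_size_pmf` is likewise an assumption of the example.

```python
def sampling_size_pmf(n, k, law):
    """Exact P(tau = k) as the mixture E[(xi/n) * (1 - xi/n)**(k - 1)].

    `law[l]` is P(xi_n = l) for l = 0, ..., n.
    """
    return sum(w * (l / n) * (1 - l / n) ** (k - 1)
               for l, w in enumerate(law))


# Hypothetical example: xi_n uniform on {1, ..., n}.
n = 2000
uniform_law = [0.0] + [1.0 / n] * n
for k in (1, 2, 3):
    print(k, sampling_size_pmf(n, k, uniform_law), 1.0 / (k * (k + 1)))
```

The two printed columns should agree up to an error that vanishes as $n$ grows, which is what assertion (b) predicts for this choice of $Q$.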
3. Sampling from the vertex set of a random graph
Let $Z_n$ be the vertex set of a family of graphs $\mathcal{G}_n$. Introducing a probability measure $P_n$ on a $\sigma$-field $\mathcal{F}_n$ of subsets of $\mathcal{G}_n$, we obtain a probability space $(\mathcal{G}_n, \mathcal{F}_n, P_n)$ of a type of random graphs on $n$ vertices. Thus all conceivable characteristics of the graphs $G \in \mathcal{G}_n$ become random variables on $(\mathcal{G}_n, \mathcal{F}_n, P_n)$. Let $S$ be a vertex property of a random graph $G \in \mathcal{G}_n$ and let $\xi_n(G)$ be the number of vertices possessing $S$. Suppose that for each graph-realization $G \in \mathcal{G}_n$, a sample with replacement of size $\tau = \tau(n)$ is taken from $Z_n$, and let $H_\tau$ denote the event that at least one vertex from this sample has the property $S$. Now we shall introduce the concept of regular threshold functions, coming from Erdős and Rényi [7] and serving the study of structural properties of random graphs as the number of their vertices and edges increases. However, we shall use regular threshold functions to describe the asymptotics of $\{\tau(n)\}$ and $P_n(H_{\tau(n)})$, as $n \to \infty$.
Definition. The monotone non-decreasing function $\psi_H(n)$ with $\lim_{n\to\infty} \psi_H(n) = \infty$ is said to be the regular threshold function of the event $H_\tau$ if there exists a probability distribution function $\Psi_H(x)$, called the threshold distribution function of $H_\tau$, such that for every point of continuity $x \in R_+$ of $\Psi_H(x)$ the relation
$$\lim_{n\to\infty} P_n(H_{\tau(n)}) = \Psi_H(x)$$
holds, when
$$\lim_{n\to\infty} \frac{\tau(n)}{\psi_H(n)} = x.$$
The results of Section 2 may be applied to find $\psi_H(n)$ and $\Psi_H(x)$ for the class of events $\{H_\tau\}$. Actually, if we suppose that there exist two numerical sequences $\{a_n\}$ and $\{b_n\}$ such that $\xi_n/a_n$ converges weakly to a random variable with distribution function $\Phi(x)$ and the assumptions of Lemma 2 hold, then from part (a) of the theorem of Section 2 it follows that

Such problems for a special type of random graphs are solved by Bagaev [1].
4. Some applications to epidemic models on a special type of random graphs
Suppose now that $Z_n = \{1, 2, \dots, n\}$, and let $\mathcal{T}_n$ be the set of all single-valued mappings $T$ of $Z_n$ into itself. Clearly, $|\mathcal{T}_n| = n^n$. Introducing the uniform probability measure on $\mathcal{T}_n$, i.e. considering $T$ as picked at random from $\mathcal{T}_n$ with probability $|\mathcal{T}_n|^{-1}$, we obtain the random mapping $T$. In this way all numerical characteristics of $T$ become random variables. Each mapping $T$ can be represented by a digraph $G_T$ having $Z_n$ as a vertex set, in which for each $i, j \in Z_n$ an oriented arc goes from $i$ to $j$ iff $j = T(i)$. ($G_T$ consists of a number of components with exactly one cycle in each.) Let $T^{-1}$ be the inverse mapping of $T$, and let $T^*$ be such that $T^*(i) = \{T(i)\} \cup \{T^{-1}(i)\}$ for $i \in Z_n$. Denote by $\hat{T}$, $\hat{T}^{-1}$ and $\hat{T}^*$ the transitive closures of $T$, $T^{-1}$ and $T^*$, respectively, so that, for example, $\hat{T}(i) = \{i\} \cup \{T(i)\} \cup \{T(T(i))\} \cup \dots$, for $i \in Z_n$. Let $C_m$ denote an $m$-element subset of $Z_n$, and let
Gertsbakh [8] suggested some models of epidemic spread on the graphs $G_T$, basic properties of which depend on these random variables. Namely, $Z_n$ is considered as a population; a contagious disease is initially confined to a subpopulation $C_m \subset Z_n$, and it is then spread to other vertices (individuals) along arcs of $G_T$. Three versions were described, regarding whether the infection is spread only forward, i.e. according to the orientation of the arcs, or only backward, or in both directions. Clearly, the distributions of the random variables $\xi_{n,m}$, $\eta_{n,m}$ and $\zeta_{n,m}$ describe the infected size of the population in these three models, respectively. The exact and asymptotic distributions of $\eta_{n,m}$, as $m, n \to \infty$, regarding all possible assumptions on $m$, are obtained by Burtin [5]. For $\xi_{n,m}$ and $\zeta_{n,m}$ the same problem is solved in detail by Pittel [10]. Small parts of it are contained in Berg [3] and Ignatov [9]. These results and the theorem of Section 2 make it possible to obtain the asymptotic distributions of the corresponding sampling sizes $\tau(\xi_{n,m})$, $\tau(\eta_{n,m})$ and $\tau(\zeta_{n,m})$, as $m, n \to \infty$, under various assumptions on $m$. For an illustration we give here three examples.

(I) Let $m$ be fixed; then
(see Berg [3] and Pittel [10]), and
$$\lim_{n\to\infty} P_n\{\tau(\xi_{n,m})/\sqrt{n} < x\} = 1 - \frac{1}{2^{m-1}(m-1)!} \int_0^\infty y^{2m-1} \exp(-xy - y^2/2)\, dy, \quad x \in R_+.$$
(II) Let $m, n \to \infty$ so that $m = o(\sqrt{n})$; then

(see Burtin [5]), and

(III) Let $m$ be fixed; then
$$\lim_{n\to\infty} P_n\{\zeta_{n,m}/n < x\} = \frac{1 \cdot 3 \cdots (2m-1)}{2^m (m-1)!} \int_0^x y^{m-1}(1-y)^{-1/2}\, dy, \quad 0 \le x \le 1$$
(see Pittel [10]), and
$$\lim_{n\to\infty} P_n\{\tau(\zeta_{n,m}) = k\} = \frac{2m \cdot 1 \cdot 3 \cdots (2m-1)}{(2k-1)(2k+1) \cdots [2(k+m)-1]}, \quad k = 1, 2, \dots .$$
The last asymptotic formula for $k = m = 1$ was established earlier by Stepanov [12] and Pittel [10].
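The three spreading rules of this section (forward, backward, and in both directions along the arcs of $G_T$) can be made concrete with a short sketch. It is only an illustration: the mapping is stored as a Python dictionary, and the helper names `closure` and `epidemic_sets` are hypothetical. The sizes of the three returned sets are the kind of quantities described by the random variables of this section when $T$ is chosen uniformly at random.

```python
def closure(start, neighbours):
    """Smallest set containing `start` and closed under `neighbours`."""
    seen = set(start)
    stack = list(seen)
    while stack:
        i = stack.pop()
        for j in neighbours(i):
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def epidemic_sets(T, start):
    """Infected sets when the disease spreads forward, backward, or both ways.

    T is a mapping of a finite set into itself, given as a dict i -> T(i).
    """
    preimages = {i: [] for i in T}
    for i, j in T.items():
        preimages[j].append(i)
    forward = closure(start, lambda i: [T[i]])
    backward = closure(start, lambda i: preimages[i])
    both = closure(start, lambda i: [T[i]] + preimages[i])
    return forward, backward, both


# A mapping on {1, ..., 5} with components {1, 2, 3, 5} and {4}.
T = {1: 2, 2: 3, 3: 2, 4: 4, 5: 1}
print(epidemic_sets(T, {5}))
```

Spreading in both directions always yields the union of the components of $G_T$ that meet the initial set, since each component is connected once the orientation of the arcs is ignored.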
Acknowledgements
I would like to thank the referee for some comments. I am also grateful to Dr. Michał Karoński and Dr. Andrzej Ruciński for their helpful suggestions concerning the exposition.
References
[1] G. N. Bagaev, Limit distributions of metric characteristics of a random indecomposable mapping, Combinatorial and asymptotic analysis, Krasnojar. Gos. Univ., Krasnojarsk (1977) pp. 55-61 (in Russian).
[2] Y. K. Belyaev, Probability Methods of Sampling (Nauka, Moskva, 1975) (in Russian).
[3] S. Berg, On snowball sampling, random mappings and related problems, J. Appl. Probab. 18 (1981) 283-290.
[4] P. Billingsley, Convergence of Probability Measures (John Wiley, New York, 1968).
[5] Y. D. Burtin, On a simple formula for random mappings and its applications, J. Appl. Probab. 17 (1980) 403-414.
[6] N. Dunford and J. Schwartz, Linear Operators, Part I: General Theory (Interscience, New York, 1957).
[7] P. Erdős and A. Rényi, On the evolution of random graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 (1960) 17-61.
[8] I. B. Gertsbakh, Epidemic processes on a random graph: Some preliminary results, J. Appl. Probab. 14 (1977) 427-438.
[9] Z. Ignatov, Asymptotic results for an epidemic process on random graphs, Trans. 9th Prague Conf. on Inf. Theory, Statist. Dec. Functions, Random Processes, Prague, June 28 - July 2, 1981 (Academia, Prague, 1983) pp. 301-306.
[10] B. Pittel, On distributions related to transitive closures of random finite mappings, Ann. Probab. 11 (1983) 428-441.
[11] R. Serfozo, Convergence of Lebesgue integrals with varying measures, Sankhyā, Ser. A 44 (1982) 380-402.
[12] V. E. Stepanov, Random mappings with a single attracting centre, Theory Probability Appl. 16 (1971) 155-161.
Annals of Discrete Mathematics 28 (1985) 199-207 © Elsevier Science Publishers B.V. (North-Holland)
THREE REMARKS ON DIMENSIONS OF GRAPHS
Jaroslav NEŠETŘIL
Charles University, 110 00 Prague 1, Czechoslovakia
Vojtěch RÖDL
Czech Technical University, 110 00 Prague 1, Czechoslovakia
The paper is concerned with the representation of graphs by various products and with the related dimensions.
0. Introduction
In this note we are interested in the representation of graphs by various products and in the related dimensions. We complement the earlier research in this area by three remarks. In Section 1 we prove that every dimension (with respect to an arbitrary product and an arbitrary class of generators) is either trivial (i.e. identically equal to 1) or unbounded. This extends our earlier result for the direct product dimension stated in [5]. In Section 2 we sharpen the result of Section 1 for the case of the direct product and prove a general lower bound of order $\log\log\log n$ for a graph with $n$ vertices. On the other hand, one can achieve that the dimension of every graph with $n$ vertices is $\le \log n$. In Section 3 we prove the existence of bipartite graphs with large dimension and without a large induced matching. This result is stated without proof in [5].
1. General dimension for general products
Let $p$ be a fixed mapping $\{+, -, =\} \times \{+, -, =\} \to \{+, -, =\}$, where $p(i, j)$ equals $=$ iff $i$ and $j$ equal $=$. Given a graph $G = (V, E)$ and a pair of vertices $x, y$, set

$s(x, y) = +$ iff $\{x, y\} \in E$,
$s(x, y) = -$ iff $\{x, y\} \notin E$ and $x \ne y$,
$s(x, y)$ equals $=$ iff $x = y$.

(Thus $s : V \times V \to \{+, -, =\}$.) Given graphs $G = (V, E)$, $G' = (V', E')$, define the product $G \times_p G' = (W, F)$ as follows:
$$W = V \times V', \qquad \{(x, x'), (y, y')\} \in F \ \text{iff}\ p(s(x, y), s(x', y')) = +.$$
This definition covers all products of graphs. For example, if $p(i, j) = +$ iff $i = j = +$ then $\times_p$ is the direct product, and if the function $p$ satisfies $p(i, j) = +$ iff either $i$ equals $=$ and $j = +$ or $i = +$ and $j$ equals $=$ then $\times_p$ is the Cartesian product. The following then is the key definition of this section. Let $\mathcal{A}$ be a class of graphs and let $p$ be a function with the above properties. Then define the dimension $\dim_{p,\mathcal{A}}(G)$ of a graph $G$ as the minimal number $k$ of graphs $A_1, \dots, A_k$ belonging to $\mathcal{A}$ such that $G$ is an induced subgraph of the product
$$A_1 \times_p A_2 \times_p \dots \times_p A_k.$$
If there is no such $k$ then we put $\dim_{p,\mathcal{A}}(G) = \infty$.
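The definition of the product can be sketched in a few lines. The code below is an illustration only: graphs are given as a vertex list and a set of `frozenset` edges, and the names `sign`, `p_product`, `p_direct` and `p_cartesian` are hypothetical. The edge rule follows the reading that a pair of product vertices is an edge iff $p$ applied to the two coordinate relations equals $+$; the two functions $p$ encode the direct and the Cartesian product as described above.

```python
from itertools import combinations, product


def sign(edges, x, y):
    """The relation s(x, y) in {'+', '-', '='} of a graph with edge set `edges`."""
    if x == y:
        return "="
    return "+" if frozenset((x, y)) in edges else "-"


def p_product(V1, E1, V2, E2, p):
    """Vertex list and edge set of the product determined by the function p."""
    W = list(product(V1, V2))
    F = {frozenset((a, b)) for a, b in combinations(W, 2)
         if p(sign(E1, a[0], b[0]), sign(E2, a[1], b[1])) == "+"}
    return W, F


def p_direct(i, j):
    # '+' iff both coordinates are adjacent; '=' only for ('=', '=').
    if i == j == "=":
        return "="
    return "+" if i == j == "+" else "-"


def p_cartesian(i, j):
    # '+' iff one coordinate is equal and the other one is adjacent.
    if i == j == "=":
        return "="
    return "+" if (i, j) in {("=", "+"), ("+", "=")} else "-"


K2 = ([0, 1], {frozenset((0, 1))})
print(len(p_product(*K2, *K2, p_direct)[1]), len(p_product(*K2, *K2, p_cartesian)[1]))
```

For two copies of $K_2$ the direct product is a perfect matching on four vertices (two edges) and the Cartesian product is a 4-cycle (four edges).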
We prove here

Theorem 1.1. For every $p$ and $\mathcal{A}$ the following holds: either $\dim_{p,\mathcal{A}}(G) = 1$ for every graph $G$ or for every positive integer $k$ there exists a graph $G$ such that $\dim_{p,\mathcal{A}}(G) > k$.
This theorem follows easily from the following stronger result (cf. [5]).

Theorem 1.2. For every $p$ and for every graph $G$ there exists a graph $H$ with the following property: if $H$ is an induced subgraph of $A \times_p B$ then $G$ is an induced subgraph of either $A$ or $B$.

Proof. Let $p$ and $G$ be fixed. It is easy to find $G_0$ such that both $G_0$ and its complement $\bar{G}_0$ contain an induced subgraph isomorphic to $G$ and, moreover, both $G_0$ and $\bar{G}_0$ are connected.
Let $H = (W, F)$ be a graph with the following property: for every partition
$$[W]^2 = P_{++} \cup P_{+-} \cup P_{-+} \cup P_{--} \cup P_{=+} \cup P_{=-} \cup P_{+=} \cup P_{-=}$$
there exist $i, j, k, l$ and an induced subgraph $G' = (V', E')$ of $H$, $G'$ isomorphic to $G_0$, such that $E' \subset P_{ij}$ and $[V']^2 - E' \subset P_{kl}$. This is a Ramsey type result which follows from any of our earlier papers on this subject, e.g. [4]. We prove that $H$ has the desired property. Let $H$ be an induced subgraph of $A \times_p B$. Given a pair $\{x, x'\} \in [W]^2$, $x = (a, b)$, $x' = (a', b')$, we put $\{x, x'\} \in P_{ij}$ iff $s(a, a') = i$ and $s(b, b') = j$. Clearly, this defines a partition into 8 parts. Thus there exist $i, j, k, l \in \{+, -, =\}$ and an induced subgraph $G' = (V', E')$ of $H$, $G'$ isomorphic to $G_0$, such that $E' \subset P_{ij}$ and $[V']^2 - E' \subset P_{kl}$.

We shall distinguish two cases:
1. One of the indices $i, j, k, l$ is equal to $=$, say $i$ equals $=$. As both $G_0$ and $\bar{G}_0$ are connected we have that also $k$ equals $=$. Moreover, as $G_0$ is neither complete nor empty, we have that in this case $p(=, +) \ne p(=, -)$. It follows that the graph $B$ contains an induced subgraph isomorphic either to $G_0$ or to $\bar{G}_0$. By the choice of $G_0$ this, in turn, means that $B$ contains an induced subgraph isomorphic to $G$.
2. Assume that all indices $i, j, k, l$ are distinct from $=$. It follows from $E' \subset P_{ij}$ that $p(i, j) = +$ and similarly $p(k, l) = -$. Thus either $k \ne i$ or $l \ne j$. Without loss of generality assume that $k \ne i$. But then either $G_0$ (if $(i, k) = (+, -)$) or $\bar{G}_0$ (if $(i, k) = (-, +)$) is an induced subgraph of $A$. In both cases this means that $G$ is an induced subgraph of $A$. □
2. General dimension for direct product
Let $p(i, j) = +$ iff $i = j = +$. The product $\times_p$ (defined in Section 1) is, in this particular case, called the direct product and it will be denoted by $\times$. Also, we put $\dim_{p,\mathcal{A}}(G) = \dim_{\mathcal{A}}(G)$. According to Theorem 1.1 we know that the function $\dim_{\mathcal{A}}$ is either constant ($= 1$) or unbounded. In this section we give bounds for its growth. Let $\mathcal{A}$ be a class of graphs. Denote by $g_{\mathcal{A}}(n)$ the maximal $\dim_{\mathcal{A}}(G)$, where $G$ has $n$ vertices. Let $\mathcal{B}$ be a class of bipartite graphs. Denote by $b_{\mathcal{B}}(n)$ the maximal $\dim_{\mathcal{B}}(G)$, where $G$ is a bipartite graph with $n$ vertices. We show the following two theorems.
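Theorem 2.1. (i) For every $\varepsilon > 0$ there exists a class $\mathcal{A}_\varepsilon$ of graphs such that $g_{\mathcal{A}_\varepsilon}(n) < \varepsilon \log n$ for every positive $n$. (ii) For every class $\mathcal{A}$ of graphs either $g_{\mathcal{A}}(n) = 1$ for every $n$ or $g_{\mathcal{A}}(n) \ge (1 - o(1)) \log\log\log n$.

Theorem 2.2. (i) For every $\varepsilon > 0$ there exists a class $\mathcal{B}_\varepsilon$ of bipartite graphs such that $b_{\mathcal{B}_\varepsilon}(n) < \varepsilon \log n$ for every positive integer $n$. (ii) For every class $\mathcal{B}$ of bipartite graphs either $b_{\mathcal{B}}(n) = 1$ for every $n$ or $b_{\mathcal{B}}(n) > (1 - o(1)) \log\log n$.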
In the proof of Theorem 2.1 we shall use the following theorem of Erdős and Hajnal (see [1]).

Theorem 2.3. For every positive integer $t$ there exists an integer $n(t) = n$ such that the following holds: every graph $H$ with $t$ vertices is an induced subgraph of every graph $G$ which has 1) $n$ vertices, and 2) neither $G$ nor its complement contains a complete graph with
vertices. Before we start with the proof of Theorem 2.1 let us introduce the following.
We have $f_1(n) > 2^{c(\log n)^{1/2}}$ and for $k \ge 2$ there holds by induction
Proof of Theorem 2.1. First, we prove statement (i): for every $\varepsilon > 0$ consider all graphs $A$ which have the following structure: the vertex set of $A$ can be decomposed into $p$ classes $U_1, U_2, \dots, U_p$ such that the graphs induced on $U_i$ are complete. Here $p = p(\varepsilon)$ is a sufficiently large positive integer. It can be seen easily that every graph $G$, $|V(G)| = n$, is isomorphic to an intersection of $r = \varepsilon \log n$ such graphs $A_1, A_2, \dots, A_r$ and thus also $G$ is an induced subgraph of $A_1 \times A_2 \times \dots \times A_r$. (By the intersection of graphs $(W, E_1), \dots, (W, E_r)$ we understand the graph $(W, \bigcap_{i=1}^{r} E_i)$; clearly we can suppose that the graphs $A_1, \dots, A_r$ have the same vertex set.)

Now we prove the second statement. Let $\mathcal{A}$ be a fixed class of graphs. Suppose that there is $H$ which fails to be an induced subgraph of any $A \in \mathcal{A}$ and thus $\dim_{\mathcal{A}}(H) > 1$. On the other hand, we may suppose without loss of generality that $\dim_{\mathcal{A}}(G)$ is finite for every graph $G$, as otherwise statement (ii) is trivial.
Let $H$ be a graph with $\dim_{\mathcal{A}}(H) > 1$, where $|V(H)|$ is as small as possible. Set $|V(H)| = t$. Take now $n$ very large and consider the "random graph" $G$ with $n$ vertices where the edges are chosen independently, each with probability $1/2$. The following fact is well known (cf. [2]).
Claim 1. For every $\delta > 0$, Prob[$G$ or $\bar{G}$ contains a complete graph on $(2 \log n)(1 + \delta)$ vertices] $= o(1)$.

Take now $G$ with the properties of Claim 1: let neither $G$ nor $\bar{G}$ contain a complete graph with $3 \log n$ vertices. Let $A_1, A_2, \dots, A_l$ be graphs belonging to $\mathcal{A}$ such that $G$ is an induced subgraph of $A_1 \times A_2 \times \dots \times A_l$. Let $V_i$ be the vertex set of the graph $A_i$, $i = 1, \dots, l$. Clearly, we may assume that the sets $V_1, \dots, V_l$ are pairwise disjoint and that $|V_i|$ …
$$V(\mathcal{H}) = V_1 \cup V_2 \cup \dots \cup V_l$$
and $e = \{v_1, \dots, v_l\} \subset V(\mathcal{H})$ is an edge of $\mathcal{H}$ iff $\varphi(v) = (v_1, v_2, \dots, v_l)$ for some $v \in V(G)$. As the complement of $G$ does not contain a complete graph with $3 \log n$ vertices, we have that any vertex of $V(\mathcal{H})$ is contained in at most $3 \log n$ hyperedges. As $|E(\mathcal{H})| = n$ we get
for every $i = 1, 2, \dots, l$. As $\mathcal{H}$ has maximum degree smaller than $3 \log n$, we can find

pairwise disjoint hyperedges $e_1, e_2, \dots, e_m$ of $\mathcal{H}$. This fact, formulated in the language of $G$ and the graphs $A_1, \dots, A_l$, can be summarized as follows.
Claim 2. If $G$ is an induced subgraph of a product of $l$ graphs $A_1, A_2, \dots, A_l \in \mathcal{A}$ then there are graphs $(W, F_1), \dots, (W, F_l)$, $|W| = m$, isomorphic to subgraphs of $A_1, \dots, A_l$ such that
$$G' = \Bigl(W, \bigcap_{i=1}^{l} F_i\Bigr)$$
for some induced subgraph $G'$ of $G$, $|V(G')| = m$.

Suppose now for the contrary
$$l \le \log\log\log n - 2\log\log\log\log n. \qquad (2)$$
As $(W, F_1)$ does not contain $H$ as an induced subgraph, the graph $(W, F_1)$ has to contain (according to Theorem 2.3) a complete graph with $f_1(n)$ vertices. (Note that the complement of $(W, F_1)$ does not contain a complete graph with $f_1(n)$ vertices, as this would imply that the complement of $G$ contains a complete graph with $> 3 \log n$ vertices.) Denote by $W_1$ the set of these vertices (i.e. those which form a complete graph in $(W, F_1)$ of size $f_1(n)$) and consider the graph $(W_1, [W_1]^2 \cap F_2)$. This graph again does not contain $H$ as an induced subgraph and hence contains a complete subgraph with $f_2(n)$ vertices. Iterating this procedure, we get a subset $W_l \subset W$ with $f_l(n)$ vertices which form a complete subgraph in $G$. Hence, according to (1), we have
$$l \log[4(2t+1)\log(3\log n)] > \tfrac{1}{2}\log\log n$$
and thus, for $n > n_0$,
$$l > \frac{1}{2}\,\frac{\log\log n}{(\log\log\log n)^2},$$
which contradicts (2).

In [1] Erdős and Hajnal state also the following.
Theorem 2.4. For every positive integer $t$ there exist $\varepsilon > 0$ and $n_0 = n_0(t)$ such that for every graph $H$ with $t$ vertices and every graph $G$ with $n > n_0(t)$ vertices and with the property that neither $G$ nor its complement contains a complete bipartite graph $A, B$, $|A| = |B| = n^{\varepsilon}$, the graph $G$ contains $H$ as an induced subgraph.

Imitating the above proof of Theorem 2.1 and using Theorem 2.4 instead of 2.3, we obtain a proof of Theorem 2.2.
Corollary 2.5. Fix $\varepsilon > 0$. Let $\mathcal{A}$ be a class of graphs closed on induced subgraphs. Assume that there exists a graph which does not belong to $\mathcal{A}$. Then there exists a graph $G = (V, E)$ which fails to be an induced subgraph of a product of less than $(1 - \varepsilon) \log\log\log |V|$ graphs belonging to $\mathcal{A}$. Particularly, if $G = \bigcap_{i \in I} A_i$ ($A_i \in \mathcal{A}$), then
$$|I| > (1 - \varepsilon) \log\log\log n.$$
3. An example of graphs with high dimensions
Given a graph $G$ with $n$ vertices, denote by $\dim(G)$ the smallest number $d$ such that $G$ is an induced subgraph of the graph
$$\underbrace{K_n \times \dots \times K_n}_{d} = (K_n)^d.$$
Here $\times$ denotes the direct product of graphs. This concept was defined in [3] and [4]. Clearly, if $\mathcal{A}$ denotes the class of all complete graphs then $\dim(G) = \dim_{\mathcal{A}}(G)$ for every graph $G$, where the symbol $\dim_{\mathcal{A}}(G)$ was introduced in Section 2. It is proved in [3] that $\dim nK_2 = \lceil \log n \rceil + 1$ and $\dim(K_n + K_1) = n$, where $nK_2$ is the matching of size $n$ and $K_n + K_1$ denotes the complete graph with $n$ vertices together with an isolated vertex. These results are complemented by the following.
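An embedding into $(K_n)^d$ amounts to assigning to every vertex a vector of $d$ coordinates so that two distinct vertices are adjacent exactly when their vectors differ in every coordinate. The checker below is a minimal sketch of this criterion; the function name `is_product_representation` and the sample coordinates are hypothetical illustrations.

```python
from itertools import combinations


def is_product_representation(edges, coords):
    """Check that `coords` embeds the graph as an induced subgraph of (K_n)^d.

    `edges` is a set of frozenset pairs; `coords` maps each vertex to a tuple
    of length d. Distinct vertices need distinct tuples, and adjacency must
    coincide with differing in every coordinate.
    """
    for u, v in combinations(coords, 2):
        if coords[u] == coords[v]:
            return False
        differ_everywhere = all(a != b for a, b in zip(coords[u], coords[v]))
        if differ_everywhere != (frozenset((u, v)) in edges):
            return False
    return True


# The matching 2K_2 on {a, b, c, d} with two coordinates.
matching = {frozenset("ab"), frozenset("cd")}
coords = {"a": (0, 0), "b": (1, 1), "c": (0, 1), "d": (1, 0)}
print(is_product_representation(matching, coords))
```

The example uses two coordinates for $2K_2$, in agreement with $\dim nK_2 = \lceil \log n \rceil + 1$ for $n = 2$ (logarithm to base 2). One coordinate cannot suffice, since in $K_n$ any two distinct vertices are adjacent.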
Theorem 3.1. For every positive integer $n$ there exists a graph $G_n$ with the following properties:
(1) $G_n$ is bipartite (and consequently does not contain $K_3 + K_1$),
(2) $G_n$ does not contain $3K_2$ as an induced subgraph,
(3) $\dim G_n > n$.

Proof. Let $n$ be fixed. Let $G = (V, E)$ be a graph which does not contain a cycle of length $\le 6$ and which has chromatic number at least $2^{2^{2n}} + 1$. Denote by $G_n = (W, F)$ the following graph: $W = V \times \{0, 1\}$,
$$\{(v, i), (v', j)\} \in F \ \text{iff}\ i \ne j,\ v \ne v' \ \text{and}\ \{v, v'\} \notin E.$$
We prove that the graph $G_n$ satisfies the above conditions. (1) is trivial. (2) follows from the fact that $G$ does not contain short cycles: if the disjoint edges
$$\{(v_1, i_1), (v_2, i_2)\}, \quad \{(v_3, i_3), (v_4, i_4)\}, \quad \{(v_5, i_5), (v_6, i_6)\}$$
of $G_n$ form an induced subgraph then it is easy to see that at least 4 of the vertices $v_i$, say $v_1, v_2, v_3, v_4$, are distinct. But then these vertices form a rectangle in $G$.

In order to prove (3) let us first recall the following fact, which is made explicit in [4]. If $\dim H \le k$ then the edges of the complement of the graph $H$ may be coloured by $k$ colours in such a way that the graph formed by the edges of any of the colours is a disjoint union of complete graphs and each edge is coloured at least once. Thus assume on the contrary that $\dim G_n \le n$.
This equivalence clearly defines the partition of the edges $E$ of $G$ into $2^n$ classes and thus one of the classes of the partition induces a graph with chromatic number $\ge 3$. Explicitly, there exist an index $i$ and a subset $E' \subset E$ such that $\chi(V, E') \ge 3$ and for all u
However, it is easy to see that the graph ( V , E’) contains vertices u1 ,u2 ,u3, u4 such that u1 < u 2 , u1 < u 4 , u3
References
[1] P. Erdős and A. Hajnal, On spanned subgraphs of graphs, in: Beiträge zur Graphentheorie und deren Anwendungen, Konferenz Oberhof 1977, pp. 180-196.
[2] P. Erdős and J. Spencer, Probabilistic Methods in Combinatorics (Akadémiai Kiadó, Budapest, 1974).
[3] L. Lovász, J. Nešetřil and A. Pultr, On a product dimension of a graph, J. Comb. Th. B 29 (1980) 47-67.
[4] J. Nešetřil and V. Rödl, A simple proof of the Galvin-Ramsey property of graphs and a dimension of a graph, Discrete Math. 23 (1978) 49-55.
[5] J. Nešetřil and V. Rödl, Products of graphs and their applications, Lecture Notes in Mathematics 1018 (1983) 151-160.
Annals of Discrete Mathematics 28 (1985) 209-219
© Elsevier Science Publishers B.V. (North-Holland)
BIPARTITE COMPLETE INDUCED SUBGRAPHS OF A RANDOM GRAPH

Zbigniew PALKA
Institute of Mathematics, Adam Mickiewicz University, 60-760 Poznań, Poland
Let $G(n, p)$ denote a random graph on $n$ labelled vertices in which the edges are chosen independently and with a fixed probability $p$. We study the number of vertices in the largest bipartite complete induced subgraph of a random graph $G(n, p)$. Also we find those natural numbers which are likely to occur as orders of maximal bipartite complete induced subgraphs of $G(n, p)$.
1. Introduction
We will be concerned with the probability space $\mathcal{G}(n, p)$ consisting of all graphs with a given set of $n$ labelled vertices, in which the edges are chosen independently and with probability $p = 1 - q$. An element of $\mathcal{G}(n, p)$ will be denoted by $G(n, p)$. In this paper we will always take $\mathcal{G}(n, p)$, where 0
as $n \to \infty$.
Several papers have dealt with the important problem of determining the order of an induced subgraph of a certain kind in a random graph. For example, the complete subgraphs were considered in great detail in [1, 3] and [5]. Recently, some problems on induced trees were also taken into account in [2] and [4]. The aim of this paper is to determine the order of the largest bipartite complete induced subgraph as well as to find those natural numbers which are likely to occur as orders of maximal bipartite complete induced subgraphs in a random graph $G(n, p)$. Since we are concerned only with induced subgraphs, the term "subgraph" here means "induced subgraph." Logarithms are to base $e$. For any real $x$, $[x]$ and $\{x\}$ denote the greatest integer not greater than $x$ and the least integer not less than $x$, respectively.
2. The largest bipartite complete subgraphs

Let $K_{r,w}$ be a bipartite complete graph with a bipartition $(R, W)$, $|R| = r$, $|W| = w$. Assume that there are two given non-decreasing sequences of positive integers $r = (r_1, r_2, \ldots)$ and $w = (w_1, w_2, \ldots)$, such that the sequence $r + w$ is increasing. Let

$$B_n(r, w) = \max\{r_k + w_k : \text{there is a copy of } K_{r_k, w_k} \text{ in } G(n, p)\}.$$
In this section we prove, using the second moment method and the Borel-Cantelli Lemma, that if $w_k = o(r_k)$ as $k \to \infty$ then the sequence of random variables $\{B_n(r, w)\}$ behaves exactly like both the independence and tree numbers (see [3] and [4]). As a matter of fact, the following result is true.

Theorem 1. If $w_k = o(r_k)$ as $k = k(n) \to \infty$ then the sequence $\{B_n(r, w)\}$ satisfies

$$\lim_{n \to \infty} \frac{B_n(r, w)}{\log n} = \frac{2}{\log(1/q)}$$

with probability one and in any mean.

Proof. Denote by $X_n(k)$ the number of copies of a graph $K_{r_k, w_k}$ in $G(n, p)$. Let $k$ be fixed. For the sake of simplicity put $X$, $r$ and $w$ instead of $X_n(k)$, $r_k$ and $w_k$, respectively. The probability that a given set of $r + w$ vertices, split into classes of sizes $r$ and $w$, spans $K_{r,w}$ is

$$p^{rw} q^{\binom{r}{2} + \binom{w}{2}}.$$

Consequently, if $w \ne r$ then the expectation of $X$ is

$$E(X) = \binom{n}{r} \binom{n-r}{w} p^{rw} q^{\binom{r}{2} + \binom{w}{2}}. \qquad (2)$$
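The first-moment count can be checked mechanically on a tiny case. The sketch below assumes the standard value $\binom{n}{r}\binom{n-r}{w} p^{rw} q^{\binom{r}{2}+\binom{w}{2}}$ for the expected number of induced copies of $K_{r,w}$ when $w \ne r$ (an assumption wherever the printed display is not legible) and compares it with a brute-force average over all graphs on $n$ vertices. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def expected_copies(n, r, w, p):
    """Assumed closed form for E(X): induced copies of K_{r,w} in G(n, p), r != w."""
    q = 1 - p
    return comb(n, r) * comb(n - r, w) * p ** (r * w) * q ** (comb(r, 2) + comb(w, 2))

def count_induced_copies(n, edges, r, w):
    """Count pairs (R, W) of disjoint vertex sets that span an induced K_{r,w}."""
    edge_set = {frozenset(e) for e in edges}
    total = 0
    for R in combinations(range(n), r):
        rest = [v for v in range(n) if v not in R]
        for W in combinations(rest, w):
            # No edges inside R or inside W, and every edge between R and W.
            inside_ok = all(frozenset(e) not in edge_set
                            for side in (R, W) for e in combinations(side, 2))
            across_ok = all(frozenset((a, b)) in edge_set for a in R for b in W)
            if inside_ok and across_ok:
                total += 1
    return total

def exact_mean_by_enumeration(n, r, w, p):
    """Average the copy count over all graphs on n vertices, weighted as in G(n, p)."""
    q = 1 - p
    pairs = list(combinations(range(n), 2))
    mean = 0
    for present in product((0, 1), repeat=len(pairs)):
        edges = [e for e, bit in zip(pairs, present) if bit]
        weight = p ** len(edges) * q ** (len(pairs) - len(edges))
        mean += weight * count_induced_copies(n, edges, r, w)
    return mean
```

Using `Fraction` for $p$ keeps the comparison exact; the enumeration is only feasible for very small $n$, since it visits all $2^{\binom{n}{2}}$ graphs.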
Now, to find the second moment of $X$, let us consider two bipartite complete graphs with bipartitions $(R_1, W_1)$ and $(R_2, W_2)$, respectively, where $|R_1| = |R_2| = r$, $|W_1| = |W_2| = w$ and $w \ne r$. If $|R_1 \cap R_2| = i$ and $|W_1 \cap W_2| = j$, where $0 \le i \le r$, $0 \le j \le w$ and 0,
ordered pairs of such $K_{r,w}$'s and each pair occurs with probability

Analogously, assuming that $|R_1 \cap W_2| = i$ and $|W_1 \cap R_2| = j$, where $0 \le i \le w$, $0 \le j \le w$ and $1 \le i + j \le 2w$, one can choose

$$A_2(i, j) = \binom{n}{r} \binom{n-r}{w} \binom{r}{i} \binom{w}{j} \binom{n-r-w}{w-i} \binom{n-r-2w+i}{r-j}$$

ordered pairs of desired $K_{r,w}$'s with probability, as in the previous case, equal to $P_2(i, j)$. Thus
where $\Sigma'$ is over all pairs $(i, j)$ such that $0 \le i \le r$, $0 \le j \le w$ and $0 \le i + j \le r + w$, whereas $\Sigma''$ is over $(i, j)$ such that $0 \le i \le w$, $0 \le j \le w$ and $1 \le i + j \le 2w$. Therefore, denoting the variance of $X$ by $\mathrm{Var}(X)$ we have

$$\frac{\mathrm{Var}(X)}{E(X)^2} = \frac{E(X^2)}{E(X)^2} - 1 = {\sum}' F(i, j, r, w) + {\sum}'' F(i, j, w, r) - 1,$$
where
and $P_2(i, j)$ is given by (1). Further, if $w = r$ then there is no need to distinguish the vertex classes of $K_{r,w}$ and the expression for $\mathrm{Var}(X)/E(X)^2$ is deduced along the same lines as before, except for a few changes, namely $E(X)$ given by (2) and $A_1(i, j)$ given by (3) have to be divided by 2 and 4, respectively. Moreover, consideration of $A_2(i, j)$ is unnecessary now. Consequently, if $w = r$ then $\mathrm{Var}(X)/E(X)^2 = \Sigma' F(i, j, r, w) - 1$. Further, it is easy to check that
F ( O , O , r , w ) + F ( O , l,r,w)+F(l,OJr,w)+F(O,l,wJr) + F ( l , 0 , w , r)
(4)
since $P_2(i, j) = 1$ for $0 \le i + j \le 1$. Also, for $2 \le i + j \le r + w$
Now, let $k = k(n)$ depend on $n$ in such a way that $w_k = o(r_k)$ and

$$r_k + w_k = r + w = \left[ (2 - \varepsilon) \frac{\log n}{\log(1/q)} \right], \qquad (6)$$

where $\varepsilon > 0$ is an arbitrarily small constant. Then, for sufficiently large $n$ and $i + j > 3$ we have
and
for any fixed $\gamma > 0$. As a matter of fact, in the worst case, when $q > p$, $i = i(n) \to \infty$ and $j = j(n) \to \infty$ as $n \to \infty$, the left-hand side of (7) is at most $(q/p)^{w + o(1)} = o(n^{\gamma})$, since $w = o(\log n)$. Therefore, we deduce that the right-hand side of inequality (5) is $< 1$ and consequently
where the summation is over all pairs $(i, j)$ such that $2 \le i + j \le r + w$ and $0 < \delta < 1$ is a constant. Similarly, one can check that

where $2 \le i + j \le 2w$. Thus, by (4) and the above facts we have

$$\mathrm{Prob}(X = 0) \le \mathrm{Var}(X)/E(X)^2 = o(n^{-1-\delta}),$$

where $r$ and $w$ satisfy (6). This implies that
Now, let us choose $k = k(n)$ such that
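Then, by (2),

$$\mathrm{Prob}(X \ge 1) \le E(X) \le \frac{1}{r!\,w!} \left( n \left( \frac{p}{q} \right)^{rw/(r+w)} q^{(r+w-1)/2} \right)^{r+w} = o(n^{-1})$$

for any integer $i$. Therefore,

$$\mathrm{Prob}\left( B_n(r, w) > \left\{ (2 + \varepsilon) \frac{\log n}{\log(1/q)} \right\} \right) = o(n^{-1}). \qquad (9)$$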
Consequently, by the Borel-Cantelli Lemma we deduce from (8) and (9) convergence of $B_n(r, w)$ with probability one. Now, to prove convergence in any mean, let us fix $t > 1$ and $0 < \varepsilon < 1$. Then
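$$E\left| \frac{B_n(r, w)}{\log n} - \frac{2}{\log(1/q)} \right|^t = \sum_s \left| \frac{s}{\log n} - \frac{2}{\log(1/q)} \right|^t \mathrm{Prob}(B_n(r, w) = s) = \Sigma_1 + \Sigma_2 + \Sigma_3,$$

where $\Sigma_1$ and $\Sigma_3$ are taken over $s < (2 - \varepsilon) \frac{\log n}{\log(1/q)}$ and $s > (2 + \varepsilon) \frac{\log n}{\log(1/q)}$, respectively. Consequently, by (8) and (9), $\Sigma_1 = o(1)$ and $\Sigma_3 = o(1)$, whereas $\Sigma_2 \le (\varepsilon / \log(1/q))^t$, since every term of $\Sigma_2$ satisfies $|s/\log n - 2/\log(1/q)| \le \varepsilon / \log(1/q)$.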
This completes the proof of our thesis. □
In a particular case, setting $w_k = 1$ and $r_k = k$, we obtain that the number of vertices in the largest star of $G(n, p)$ is, with probability one, $\frac{2 \log n}{\log(1/q)} + o(\log n)$ as $n \to \infty$. This improves the result of [2].
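For small graphs the order of the largest induced star can be computed directly: an induced star with centre $c$ is $c$ together with a set of pairwise non-adjacent neighbours of $c$. The following Python sketch samples $G(n, p)$ and evaluates this quantity by simple branching. The function names are hypothetical, and the search is exponential in the worst case, so it is meant only for experimenting with small $n$, not for verifying the asymptotic statement.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Adjacency sets of G(n, p): each of the C(n, 2) edges is present independently."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def max_independent_size(vertices, adj):
    """Size of a largest independent subset of `vertices` (simple branching)."""
    vertices = list(vertices)
    if not vertices:
        return 0
    v, rest = vertices[0], vertices[1:]
    without_v = max_independent_size(rest, adj)
    with_v = 1 + max_independent_size([u for u in rest if u not in adj[v]], adj)
    return max(without_v, with_v)

def largest_induced_star_order(adj):
    """Largest r + 1 such that some vertex has r pairwise non-adjacent neighbours.

    Returns 1 for a graph without edges (no star with at least one leaf exists).
    """
    return max(1 + max_independent_size(adj[c], adj) for c in adj)

# Example run on a small sample; the seed is arbitrary.
sample = sample_gnp(12, 0.5, random.Random(1))
star_order = largest_induced_star_order(sample)
```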
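Now let us examine the behaviour of $B_n(r, w)$ in the case when the sequences $r$ and $w$ grow to infinity in such a way that $w_k / r_k \to c$ when $k \to \infty$, where $0 < c \le 1$. Let

$$f = \max\left( \frac{1}{p}, \frac{1}{q} \right) \quad \text{and} \quad h = \min\left( \frac{1}{p}, \frac{1}{q} \right).$$

The following result is available.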
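Theorem 2. Let $w_k \sim c r_k$ when $k \to \infty$ and $0 < c \le 1$. Then for every $\varepsilon > 0$

$$(2 - \varepsilon) \frac{\log n}{\log f} < B_n(r, w) < (2 + \varepsilon) \frac{\log n}{\log h} \quad \text{a.s.}$$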
Proof. The argument follows the same lines as the proof of Theorem 1 except for a few minor changes. It appears that under the assumption $w_k \sim c r_k$, (7) is not satisfied for some values of the edge probability $p$. Therefore, to show that the right-hand side of (5) is $< 1$, let us assume that $p < 1/2$ and
Then, for sufficiently large $n$ and $2 \le i + j$,

Consequently (cf. the proof of Theorem 1)

$$\mathrm{Prob}\left( B_n(r, w) < \left[ (2 - \varepsilon) \frac{\log n}{\log(1/p)} \right] \right) = o(n^{-1-\delta}),$$

where $0 < \delta < 1$. On the other hand, if p
then

for any integer $i$. Therefore,

$$\mathrm{Prob}\left( B_n(r, w) > \left\{ (2 + \varepsilon) \frac{\log n}{\log(1/q)} \right\} \right) = o(n^{-1}).$$

This completes the proof for $p < 1/2$. In a similar way one can obtain the desired bounds on $B_n(r, w)$ for $p > 1/2$. □
From Theorem 2 one can deduce that when $w$ is of the same order of magnitude as $r$, then for small as well as large values of the edge probability $p$, the lower and upper bounds on $B_n(r, w)$ are wide apart. The best situation holds when $p = 1/2$. As a matter of fact, the following sharp result is true.

Theorem 3. Let $B_n^*(r, w)$ stand for $B_n(r, w)$ when considering a random graph $G(n, 1/2)$. If $w_k \sim c r_k$ when $k \to \infty$ and $0 < c \le 1$ then the sequence $\{B_n^*(r, w)\}$ satisfies

$$\lim_{n \to \infty} \frac{B_n^*(r, w)}{\log n} = \frac{2}{\log 2}$$

with probability one and in any mean.
3. Bipartite cliques

For the given two non-decreasing sequences of positive integers $r = (r_1, r_2, \ldots)$ and $w = (w_1, w_2, \ldots)$, where $r + w$ is increasing, define a maximal bipartite complete subgraph $K_{r_k, w_k}$ (a bipartite clique) of a graph $G$. A complete bipartite subgraph $K_{r_k, w_k}$ of $G$ is maximal if it is not contained in any other subgraph of $G$ being a copy of $K_{r_{k+1}, w_{k+1}}$. Let

$$b_n(r, w) = \min\{r_k + w_k : \text{there is a copy of maximal } K_{r_k, w_k} \text{ in } G(n, p)\}.$$
First, we will show that bipartite cliques of order less than $(1 - \varepsilon) \frac{\log n}{\log f}$ are unlikely to occur. Remember that $f$ and $h$ stand for the maximum and minimum of the two numbers $1/p$ and $1/q$, respectively. We have the following result.
Theorem 4. For every $\varepsilon > 0$

$$b_n(r, w) > (1 - \varepsilon) \frac{\log n}{\log f} \quad \text{a.s.}$$
Proof. Let $Y = Y_n(k)$ be the number of all maximal copies of $K_{r,w}$ in $G(n, p)$, where $r = r_k$ and $w = w_k$. Let $k$ be fixed and assume that $r \ne w$ and $p < 1/2$. Then

It is clear that the same upper bound can be used in the case when $w = r$ and $p \le 1/2$. Now let us put

$$t = \left[ (1 - \varepsilon) \frac{\log n}{\log(1/p)} \right]$$

and assume that $k = k(n)$ is such that $r_k + w_k < t$. Then for sufficiently large $n$
Consequently, Prob ( b n ( v ,w )6 [(I
-8)
5f-I) c'
log 1 l P
<
E(Y)
C(n-">'
I
i=2
c'
where $\Sigma'$ is over all $k$ such that $r_k + w_k < t$. This completes the proof for $p < 1/2$. Analogous arguments imply the assertion for $p > 1/2$. □

Now we will find those natural numbers which are likely to occur as orders of bipartite cliques. Let us begin with the case when both independent sets are of the same order of magnitude. Here the following result is available.
Theorem 5. Let $0 < p < 1$ and $\varepsilon > 0$ be chosen so that the inequality

holds. If $w_k \sim c r_k$ ($0 < c \le 1$) and $k$ depends on $n$ in such a way that for sufficiently large $n$

$$(1 + \varepsilon) \frac{\log n}{\log h} < r_k + w_k < (2 - \varepsilon) \frac{\log n}{\log f} \qquad (12)$$

then a random graph $G(n, p)$ contains a bipartite clique $K_{r_k, w_k}$ a.s.

Proof. We will use a method described in [6]. First of all, let us notice that by the inequality (11) we must have

$$\frac{3 - \sqrt{5}}{2} < p < \frac{-1 + \sqrt{5}}{2}.$$
The second moment method used in the proof of Theorems 1 and 2 shows that

Thus, by Chebyshev's inequality we have

$$\mathrm{Prob}(X > 0.9\,E(X)) = 1 - o(1). \qquad (13)$$

Now assume that $\frac{3 - \sqrt{5}}{2} < p \le 1/2$. Then the probability that a given bipartite complete $K_{r,w}$ is not maximal is, at most,

$$(n - r - w)(p^r q^w + p^w q^r) \le 2n q^{r+w} \le 2n^{-\varepsilon} = o(1).$$

Thus, by (10),

$$E(Y) = E(X)(1 - o(1))$$

and by Markov's inequality

$$\mathrm{Prob}(X - Y < 0.1\,E(X)) = 1 - o(1).$$
Combining that and (13), we deduce $\mathrm{Prob}(Y > 0.8\,E(X)) = 1 - o(1)$, which implies $\mathrm{Prob}(Y > 0) = 1 - o(1)$ for, by the assumption $p \le 1/2$ and (12), we obviously get $E(X) \to \infty$ as $n \to \infty$. In other words, a random graph $G(n, p)$ where $\frac{3 - \sqrt{5}}{2} < p \le \frac{1}{2}$ contains a.s. at least one bipartite clique $K_{r_k, w_k}$ such that $w_k \sim c r_k$ and (12) is satisfied. An analogous argument implies our thesis for $\frac{1}{2} < p < \frac{-1 + \sqrt{5}}{2}$.
In Theorem 5 the assumption $w_k \sim c r_k$ has influence only on the upper bound of $r_k + w_k$. Therefore, assuming that $w_k = o(r_k)$ as $k \to \infty$ and applying Theorem 1, one can formulate the following variant of the last result.

Theorem 6. Let $0 < p < 1$ and $\varepsilon > 0$ be chosen so that the inequality (14) holds. If $w_k = o(r_k)$ and $k$ depends on $n$ in such a way that for sufficiently large $n$

$$(1 + \varepsilon) \frac{\log n}{\log h} < r_k + w_k < (2 - \varepsilon) \frac{\log n}{\log(1/q)}$$

then a random graph $G(n, p)$ contains a bipartite clique $K_{r_k, w_k}$ a.s.

Remark. Inequality (14) implies $0 < p < \frac{-1 + \sqrt{5}}{2}$.
Acknowledgements

The author thanks Michał Karoński and Andrzej Ruciński for their helpful comments.

Note added in proof. In his recent paper (see [7]), Ruciński has generalized results on the orders of cliques, bipartite cliques and maximal trees in a random graph $G(n, p)$ to a wider class of graphs.
References
[1] B. Bollobás and P. Erdős, Cliques in random graphs, Math. Proc. Cambridge Phil. Soc. 80 (1976) 419-427.
[2] P. Erdős and Z. Palka, Trees in random graphs, Disc. Math. 46 (1983) 145-150; Addendum: ibid. 48 (1984) 331.
[3] G. R. Grimmett and C. J. H. McDiarmid, On colouring random graphs, Math. Proc. Cambridge Phil. Soc. 77 (1975) 313-324.
[4] A. Marchetti-Spaccamela and M. Protasi, The largest tree in a random graph, Th. Computer Sci. 23 (1983) 273-286.
[5] D. W. Matula, On the complete subgraphs of a random graph, in: Comb. Math. and its Appl. (Chapel Hill, N. C., 1970) pp. 356-369.
[6] Z. Palka, A. Ruciński and J. Spencer, On a method for random graphs (to appear).
[7] A. Ruciński, Induced subgraphs of a random graph (to appear).
Annals of Discrete Mathematics 28 (1985) 221-229
© Elsevier Science Publishers B.V. (North-Holland)
SUBGRAPHS OF RANDOM GRAPHS: A GENERAL APPROACH

Andrzej RUCIŃSKI
Institute of Mathematics, Adam Mickiewicz University, 60-769 Poznań, Poland
We present a general result dealing with the existence and distribution of the number of subgraphs of a random graph of either binomial or uniform type. It is shown that many known facts about subgraphs of a random complete graph, random lattice, random regular graph, random mapping as well as random permutation, partition and matching, are simple consequences of our theorems. Some new results which follow from our general approach and deal with a random out-regular directed graph, random tree and forest are also indicated.
Let us define a random graph as a pair $(\mathcal{G}, P)$, where $\mathcal{G}$ is a family of spanning subgraphs of a given graph, called the initial graph and denoted further by ING, and $P$ is a probability distribution on $\mathcal{G}$. We say that a random graph is of a binomial type if the family $\mathcal{G}$ consists of all spanning subgraphs of ING and for every graph $G \in \mathcal{G}$

$$P(G) = p^{e(G)} (1 - p)^{e(\mathrm{ING}) - e(G)},$$

where $e(\cdot)$ denotes the number of edges of a given graph. Further, we say that a random graph is of a uniform type if the family $\mathcal{G}$ consists of all spanning subgraphs of ING, each of them having a given property $A$, and for every graph $G \in \mathcal{G}$

$$P(G) = |\mathcal{G}|^{-1}.$$
We will denote such random graphs by $\mathrm{ING}_p$ and $\mathrm{ING}_A$, respectively. It should be noted that the majority of random graph models are covered by the above classification. One of the most intensively studied problems in the literature (for a review see [7]) is the problem of the existence and numerical distribution of given subgraphs in a random graph. In the next section we shall give some general theorems which solve this problem for small subgraphs of random graphs of binomial and uniform type. Several known as well as some new results can be deduced from our theorems. The scope of the possible applications of our approach is presented in the last section of the paper.

Suppose that $X_n$, $n = 1, 2, \ldots$, is a sequence of random variables. Throughout this paper the notation $X_n \to \mathrm{Po}(\lambda)$ and $X_n \to N(0, 1)$ means that $X_n$ has asymptotically (as $n \to \infty$) the Poisson and standard normal distribution, respectively. Moreover, $EX$ denotes the expectation of the random variable $X$ whereas $\mathrm{Var}\,X$ stands for its variance. For convenience, we shall denote $\tilde{X} = (X - EX)/\sqrt{\mathrm{Var}\,X}$ and write $f(n) \sim g(n)$ if $f(n)/g(n) \to 1$ as $n \to \infty$. Finally, by $V(G)$ we always mean the vertex set of a given graph $G$.
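The two types of random graph can be made concrete for a small initial graph. The sketch below assumes the usual weights for the binomial type, $p^{e(G)}(1-p)^{e(\mathrm{ING}) - e(G)}$, and the uniform weight $1/|\mathcal{G}|$ for the uniform type; the function names and the representation of a spanning subgraph as a frozenset of edges are illustrative choices, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

def binomial_type_distribution(ing_edges, p):
    """Map each spanning subgraph (a frozenset of edges) of ING to its probability."""
    ing_edges = list(ing_edges)
    m = len(ing_edges)
    dist = {}
    for size in range(m + 1):
        for chosen in combinations(ing_edges, size):
            # Each edge of ING is kept independently with probability p.
            dist[frozenset(chosen)] = p ** size * (1 - p) ** (m - size)
    return dist

def uniform_type_distribution(ing_edges, has_property):
    """Uniform distribution over the spanning subgraphs that satisfy `has_property`."""
    ing_edges = list(ing_edges)
    family = [frozenset(c)
              for size in range(len(ing_edges) + 1)
              for c in combinations(ing_edges, size)
              if has_property(frozenset(c))]
    return {g: Fraction(1, len(family)) for g in family}

# Hypothetical initial graph: the 4-cycle on vertices 0, 1, 2, 3.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
binomial = binomial_type_distribution(cycle_edges, Fraction(1, 3))
two_edge_graphs = uniform_type_distribution(cycle_edges, lambda g: len(g) == 2)
```

In both cases the probabilities sum to 1; for the 4-cycle the binomial type has 16 outcomes, and the uniform type restricted to two-edge subgraphs has 6.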
Let $\{\mathrm{ING}(n)\}$, $\{A(n)\}$ and $\{p(n)\}$, $n = 1, 2, \ldots$, be sequences of initial graphs on $n$ vertices ($|V(\mathrm{ING}(n))| = n$), graph properties and numbers from $(0, 1)$, respectively. Additionally, denote by $\mathcal{A}_n$ the set of all spanning subgraphs of $\mathrm{ING}(n)$, each having the property $A(n)$, and put $a_n = |\mathcal{A}_n|$, $n = 1, 2, \ldots$. Let $H$ stand for a given connected graph with $k$ vertices, $l$ edges and $a(H)$ automorphisms. Any graph isomorphic to $H$ is called an $H$-graph. We will write $rH$ for a union of $r$ vertex disjoint $H$-graphs and $\mathcal{H}_r$ for any other union of $r$ $H$-graphs, $r = 2, 3, \ldots$. For a given graph $K$ on a vertex set $\{u_1, \ldots, u_i\}$ we say that a graph $F$ on a vertex set $\{s_1, \ldots, s_i\}$ is a copy of $K$ if the bijection $u_j \to s_j$, $j = 1, \ldots, i$, is an isomorphism between $K$ and $F$. Let us denote by $b_n(K)$ the number of all sequences $(s_1, \ldots, s_i)$ of distinct vertices of $\mathrm{ING}(n)$ which induce a subgraph containing a copy of $K$. Moreover, let

$$u_n(K) = \sum_{(s_1, \ldots, s_i)} N(s_1, \ldots, s_i),$$

where $N(s_1, \ldots, s_i)$ is the number of all $G \in \mathcal{A}_n$ such that the vertex set $\{s_1, \ldots, s_i\}$ induces in $G$ a subgraph containing a copy of $K$.

First we formulate a result for a random graph $\mathrm{ING}_{p(n)}(n)$. We will say that for given $\{\mathrm{ING}(n)\}$, $\{p(n)\}$ and a graph $H$ the condition B(r) holds if
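(i) $b_n(rH) \sim (b_n(H))^r$

and (ii) for every $\mathcal{H}_r$

$$b_n(\mathcal{H}_r) (p(n))^{e(\mathcal{H}_r)} = o\left( (b_n(H) (p(n))^l)^r \right), \quad n \to \infty, \quad r = 2, 3, \ldots.$$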
Moreover, let us denote by B*(r) the condition obtained from B(r) by putting -r instead of r in the right-hand side of (ii).
Theorem 1. Let $X_n$ be the number of all $H$-graphs contained as subgraphs in a random graph $\mathrm{ING}_{p(n)}(n)$, $\mu = b_n(H)(p(n))^l$ and $n \to \infty$. Then
(A) If $\mu \to 0$ then $P(X_n > 0) = o(1)$;
(B) If $\mu \to c > 0$ and for every $r = 2, 3, \ldots$ the condition B(r) holds, then $X_n \to \mathrm{Po}(c/a(H))$;
(C) If $\mu \to \infty$ and the condition B(2) holds, then $P(X_n = 0) = o(1)$;
(D) If $\mu \to \infty$ and for every $r = 2, 3, \ldots$ the condition B*(r) holds, then $\tilde{X}_n \to N(0, 1)$.
An analogous theorem can be stated for a random graph $\mathrm{ING}_{A(n)}(n)$. First, let us start with an appropriate reformulation of the conditions B(r) and B*(r). We will say that for given $\{\mathrm{ING}(n)\}$, $\{A(n)\}$ and a given graph $H$ the condition U(r) holds if

(i) $a_n^{r-1} u_n(rH) \sim u_n^r(H)$

and (ii) for every $\mathcal{H}_r$

$$u_n(\mathcal{H}_r)/a_n = o\left( (u_n(H)/a_n)^r \right), \quad n \to \infty, \quad r = 2, 3, \ldots.$$
Moreover, let U*(r) denote the condition obtained from U(r) by putting -r instead of r in the right-hand side of (ii).

Theorem 1'. Let $X_n$ be the number of all $H$-graphs contained as subgraphs in a random graph $\mathrm{ING}_{A(n)}(n)$, $\mu = u_n(H)/a_n$ and $n \to \infty$. Then the assertions of Theorem 1 hold if we replace B(r) and B*(r) by U(r) and U*(r), respectively.
Proof. Let us note that $EX_n = \mu/a(H)$. Thus (A) is trivial because the inequality $P(X_n \ne 0) \le EX_n$ holds. In order to prove (B) we will show that the $r$-th factorial moment of the random variable $X_n$, $E_r = E(X_n(X_n - 1) \cdots (X_n - r + 1))$, tends to $(c/a(H))^r$ for $r = 1, 2, \ldots$ as $n \to \infty$. Then the thesis follows from the fact that the Poisson distribution is uniquely determined by its moments. It is easily seen that $E_r$ is the expectation of the number of all ordered $r$-tuples of $H$-graphs contained in $\mathrm{ING}_p(n)$. Therefore, we can express $E_r$ as $E_r = E'_r + E''_r$, where $E'_r$ counts all ordered $r$-tuples of mutually disjoint $H$-graphs. Observe that

$$E'_r = b_n(rH) p^{rl} / a^r(H), \quad r = 1, 2, \ldots,$$
and for every $\mathcal{H}_r$

$$E''_r = O\left( b_n(\mathcal{H}_r) p^{e(\mathcal{H}_r)} \right), \quad r = 2, 3, \ldots.$$

Thus, from the assumptions it follows that $E'_r \to (c/a(H))^r$ whereas $E''_r = o(1)$ as $n \to \infty$, $r = 1, 2, \ldots$, and the proof of (B) is complete. To prove (C) we shall use the inequality

$$P(X_n > 0) \ge (EX_n)^2 / E(X_n^2) = \left( (E'_2 + E''_2)/(EX_n)^2 + 1/EX_n \right)^{-1}. \qquad (1)$$
One can check that under the assumptions of (C) the right-hand side of (1) tends to 1 as $n \to \infty$. In order to prove the last statement of Theorem 1, we have to determine the $r$-th moment of the random variable $\tilde{X}_n$,

$$E(\tilde{X}_n^r) = \Sigma_1 + \Sigma_2,$$

where $S(\cdot, \cdot)$ is the Stirling number of the second kind. From the condition B*(r) it follows that $\Sigma_2 = o(1)$ whereas $\Sigma_1$ can be expressed (see [3]) as

$$\Sigma_1 = (\mathrm{Var}\,X_n)^{-r/2} \sum_{k=0}^{\infty} (EX_n)^k e^{-EX_n} (k!)^{-1} (k - EX_n)^r.$$
Since in our case $\mathrm{Var}\,X_n \sim EX_n$, if $EX_n \to \infty$ then $\Sigma_1$ tends to the $r$-th moment of the standard normal distribution and the theorem follows. The proof of Theorem 1' is analogous and therefore omitted.

One of the possible extensions of Theorems 1 and 1' is to count all $H$-graphs in a random graph such that $H$ is an element of a specified family $\mathcal{H}$ of graphs with not necessarily fixed order, i.e. with order which depends on $n$ and tends to infinity as $n \to \infty$. (For particular models see [2, 9, 10 and 17].) We are also able to state similar theorems for some other random structures such as random directed graphs, random multigraphs, random hypergraphs, random signed graphs as well as for some other kinds of subgraphs, e.g. induced subgraphs and isolated subgraphs.

Now we turn our attention to some possible applications of our results to different random graphs and other random structures. First, notice that the several known facts on subgraphs of complete random graphs $K_{n,p}$ and $K_{n,N}$ ([2, 3 and 8]), bipartite random graph $K_{n,m,p}$ ([10]), random lattice ([13]), and random graph with given vertex degrees $G_{n,d}$ ([16]), follow immediately from our main result. In this section we present a number of corollaries of Theorem 1' dealing with an out-regular directed random graph, a random tree and some random covers such as random permutations, partitions, matchings and forests.

1. An out-regular directed random graph

Suppose that each of $n$ vertices chooses randomly $d$ neighbours, $d \ge 1$, so that all $\binom{n-1}{d}$ possible outcomes are equiprobable. A resulting directed random graph $D_{n,d}$ is of the uniform type with $\mathrm{ING}(n) = \vec{K}_n$, the directed complete graph, and $A(n) = \{D : d^+ \equiv d\}$, where $d^+$ means the out-degree sequence of a directed graph $D$. Let us notice that
$$a_n = \binom{n-1}{d}^n$$

and if $F$ is a directed graph with out-degree sequence $(d_1, \ldots, d_m)$, $d_i \le d$, $m = |V(F)|$, then

$$u_n(F) = (n)_m \binom{n-1-d_1}{d-d_1} \cdots \binom{n-1-d_m}{d-d_m} \binom{n-1}{d}^{n-m}.$$

Thus,

where $(x)_y = x(x-1) \cdots (x-y+1)$, $y \ge 1$ and $(x)_0 = 1$. Moreover, any sum of not all pairwise vertex disjoint cycles has more edges than vertices. So, we arrive at the following corollaries.
Corollary 1. If a directed graph $F$ has more arcs than vertices then $P(D_{n,d} \text{ contains an } F\text{-graph}) = o(1)$, $n \to \infty$.
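To get a feel for the model $D_{n,d}$, the mean number of directed $k$-cycles can be computed exactly. The sketch below assumes that each arc $u \to v$ is present with probability $d/(n-1)$ and that choices made by different vertices are independent, which gives the mean $\frac{(n)_k}{k} \left( \frac{d}{n-1} \right)^k$; this tends to $d^k/k$, the Poisson parameter appearing in Corollary 2. The brute-force enumeration is there only to confirm the closed form on tiny cases, and all names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations, product

def exact_mean_cycles(n, d, k):
    """Assumed closed form: (n)_k / k * (d / (n - 1))^k directed k-cycles on average."""
    falling = 1
    for i in range(k):
        falling *= n - i
    return Fraction(falling, k) * Fraction(d, n - 1) ** k

def count_directed_cycles(out_sets, k):
    """Count directed k-cycles, each cycle once (rotation fixed by its smallest vertex)."""
    n = len(out_sets)
    total = 0
    for cyc in permutations(range(n), k):
        if cyc[0] != min(cyc):
            continue
        if all(cyc[(i + 1) % k] in out_sets[cyc[i]] for i in range(k)):
            total += 1
    return total

def enumerated_mean_cycles(n, d, k):
    """Average the cycle count over all equiprobable outcomes of D_{n,d} (tiny n only)."""
    choices = [[frozenset(c) for c in combinations([u for u in range(n) if u != v], d)]
               for v in range(n)]
    outcomes = list(product(*choices))
    return Fraction(sum(count_directed_cycles(o, k) for o in outcomes), len(outcomes))
```

For example, `exact_mean_cycles(4, 1, 3)` and `enumerated_mean_cycles(4, 1, 3)` agree, and for large $n$ the closed form is already very close to $d^k/k$.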
Let X, be the number of H-graphs contained in Dn,d, where H is a cycle of the length k having out-degree sequence ( d l , ,..,dk) (o
Corollary 2. If $X_n$ is the number of all directed $k$-cycles in $D_{n,d}$ then $X_n \to \mathrm{Po}(d^k/k)$.

The above results are essentially new except for that of Corollary 2 for $d = 1$, i.e. for a random mapping (see [14]). It is interesting to compare these results with a result on a random $d$-regular graph $G_{n,d}$ ([16]), which says that the number of all $k$-cycles in $G_{n,d}$ has asymptotically ($n \to \infty$) the Poisson distribution with the expectation $(d-1)^k/2k$. Also, there are no fixed subgraphs in $G_{n,d}$ with more edges than vertices almost surely as $n \to \infty$.

2. A random tree
Let $\mathrm{ING}(n) = K_n$ and $A(n) = \{G : G \text{ is a tree}\}$, $n = 1, 2, \ldots$. Such a uniform random graph is called a random tree and denoted by $T_n$. It is known that a random graph $K_{n,1/2}$ (simply a graph chosen at random from among the family of all graphs on $n$ given vertices) contains any graph almost surely as $n \to \infty$ (see [3, pp. 130-131]). A similar result is true for a random tree.

Corollary 3. For every tree $T$, $P(T_n \text{ contains } T) \to 1$ as $n \to \infty$.

Proof. Let $T$ be a tree on $k$ vertices. One can check (cf. [15, p. 81]) that

and for every sum $\mathcal{H}_2$ of two not-vertex-disjoint $T$-graphs

Therefore the assumptions of part (C) of Theorem 1' are fulfilled and the thesis follows. □
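A uniform random tree $T_n$ is easy to sample in practice. One standard route, which is not the method of the paper and is used here only as an illustration, is the Prüfer correspondence: the $n^{n-2}$ sequences of length $n-2$ over the vertex set are in bijection with the labelled trees on $n$ vertices, so decoding a uniformly random sequence yields a uniformly random tree. The function names below are hypothetical.

```python
import heapq
import random
from itertools import product

def prufer_to_tree(seq, n):
    """Decode a Prufer sequence (length n - 2, entries in range(n)) into n - 1 edges."""
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)  # smallest current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

def random_tree(n, rng):
    """Sample T_n uniformly by decoding a uniformly random Prufer sequence (n >= 2)."""
    return prufer_to_tree([rng.randrange(n) for _ in range(n - 2)], n)

# Example: one random tree on 6 labelled vertices; the seed is arbitrary.
example_tree = random_tree(6, random.Random(0))
```

Enumerating all sequences for $n = 4$ produces 16 distinct trees, in line with Cayley's formula $n^{n-2}$.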
3. Random covers

Let $\mathcal{F}$ be a family of connected graphs. We call a graph isomorphic to a member of $\mathcal{F}$ an $\mathcal{F}$-graph. A spanning subgraph of a graph $G$ such that all its components are $\mathcal{F}$-graphs is called an $\mathcal{F}$-cover of $G$ (see [4]). We restrict ourselves to covers of the complete graph $K_n$. A random $\mathcal{F}$-cover is a random graph of the uniform type with $\mathrm{ING}(n) = K_n$ and $A(n) = \{G : G \text{ is an } \mathcal{F}\text{-cover of } K_n\}$, $n = 1, 2, \ldots$. It can be easily seen that for every $k$-vertex graph $F \in \mathcal{F}$
and the conditions U(r) and U*(r) are equivalent to the following asymptotic equation

where $a_n$ is the number of all graphs on $n$ given vertices such that all their components are $\mathcal{F}$-graphs. (Here by a subgraph of an $\mathcal{F}$-cover we mean an isolated subgraph, i.e. a component of the $\mathcal{F}$-cover.)

Now we turn our attention to particular examples of random covers. Putting $\mathcal{F} = \{\vec{C}_1, \ldots, \vec{C}_n\}$, where $\vec{C}_k$ means a directed $k$-vertex cycle, $k = 1, \ldots, n$, we obtain the case when any $\mathcal{F}$-cover corresponds to an $n$-element permutation. Thus $a_n = n!$ and

Putting $\mathcal{F} = \{K_1, \ldots, K_n\}$, any $\mathcal{F}$-cover is simply a partition of an $n$-element set and $a_n = B_n$, the Bell number. It is known from [6] that $B_{n+k}/B_n \sim (n/\log n)^k$, so

$$u_n(K_k)/a_n \to \infty, \quad k = 1, 2, \ldots, \quad n \to \infty.$$

If we put $\mathcal{F} = \{K_1, K_2\}$ then any $\mathcal{F}$-cover is a matching of the complete graph $K_n$ and (see [5]) $a_n/a_{n-1} = \sqrt{n} + 1/2 + o(1)$. Thus, also in this case
Note that in all the above cases condition (2) holds. So, if we denote by $X_n$, $Y_n$ and $Z_n$ the number of cycles of length $k$ in a random permutation, the number of $k$-element blocks in a random partition, and the number of edges in a random matching, respectively, then we obtain that $X_n \to \mathrm{Po}(1/k)$, $\tilde{Y}_n \to N(0, 1)$ and $\tilde{Z}_n \to N(0, 1)$. All these results were already known (see [5 and 12]). A generalization of the last result is given in [11].

Finally, let us suppose that the family $\mathcal{F}$ contains all trees. Then any $\mathcal{F}$-cover is a forest and $a_n \sim \sqrt{e}\, n^{n-2}$ (see [12]). For every tree $T_k$ on $k$ vertices condition (2) is true and
Let $X_n(T)$ be the number of components of a random forest which are isomorphic to a given tree $T$ and let $\mathcal{T}$ denote the family of all pairwise nonisomorphic trees on $k$ vertices. Then the random variable $X_n(\mathcal{T}) = \sum_{T \in \mathcal{T}} X_n(T)$ counts $k$-vertex components of a random forest.
Corollary 4. For every tree $T_k$ on $k$ vertices

where $a(T_k)$ is the number of automorphisms of $T_k$. Moreover,

Proof. The first statement follows directly from part (B) of Theorem 1', whereas the second one is a consequence of the fact that all random variables $X_n(T)$, $T \in \mathcal{T}$, are asymptotically ($n \to \infty$) independent and that $a(T) = k!/c(T)$, where $c(T)$ counts the number of ways one can label the unlabelled tree $T$. □

Acknowledgements

The author wishes to express his thanks to Michał Karoński and Zbigniew Palka for their helpful comments.
References
[1] B. Bollobás, Graph Theory - An Introductory Course (Springer, Berlin, 1979).
[2] B. Bollobás, Threshold functions for small subgraphs, Math. Proc. Cambr. Phil. Soc. 90 (2) (1981) 197-206.
[3] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5 (1960) 17-61.
[4] E. J. Farrell, On a general class of graph polynomials, J. Comb. Theory (B) 26 (1979) 111-122.
[5] C. D. Godsil, Matching behaviour is asymptotically normal, Combinatorica 1 (4) (1981) 369-376.
[6] J. Haigh, Random equivalence relations, J. Comb. Theory (A) 13 (1972) 287-295.
[7] M. Karoński, A review of random graphs, J. Graph Theory 6 (1982) 349-389.
[8] M. Karoński and A. Ruciński, On the number of strictly balanced subgraphs of a random graph, Graph Theory, Łagów 1981, Lecture Notes in Math. No. 1018 (Springer, Berlin, 1983) 79-83.
[9] M. Karoński and A. Ruciński, Medium subgraphs of a random graph, unpublished.
[10] M. Karoński and A. Ruciński, Subgraphs of random bipartite graphs, to appear.
[11] A. Ruciński, The behaviour of $\binom{n}{k, \ldots, k, n-ik} c^i/i!$ is asymptotically normal, Discrete Math. 40 (1984) 287-290.
[12] V. N. Sachkov, Probabilistic Methods in Combinatorial Analysis (Izdat. Nauka, Moskva, 1978) (in Russian).
[13] K. Schürger, On the evolution of random graphs over expanding square lattices, Acta Math. Sci. Hungar. 27 (3) (1976) 281-292.
[14] V. E. Stepanov, Limit distributions of certain characteristics of random mappings, Th. Prob. Appl. 14 (4) (1969) 612-626.
[15] I. Tomescu, Introduction to Combinatorics (Collet's Publ. Ltd., London and Wellingborough, 1975).
[16] N. C. Wormald, The asymptotic distribution of short cycles in random regular graphs, J. Comb. Theory (B) 31 (2) (1981) 168-182.
[17] E. M. Wright, Large cycles in large labelled graphs, Math. Proc. Cambr. Phil. Soc. 78 (2) (1975) 7-17.
Annals of Discrete Mathematics 28 (1985) 231-241
© Elsevier Science Publishers B.V. (North-Holland)
MATCHMAKING BETWEEN TWO COLLECTIONS

Ipe H. SMIT
I.F.L.O., Free University Amsterdam, 1081 HV Amsterdam, Holland
The paper presents exact and asymptotic distributions of the number of matches (two-cycles) for a model of a random bipartite digraph.
1. Introduction

Let $B$ and $G$ be disjoint sets, with $|B| = b \ge 1$ and $|G| = g \ge 1$. Consider the set of all bipartite (labelled) digraphs with bipartition $(B, G)$ such that each vertex of $G$ has outdegree $l$ (1
2. Notation and terminology
The set of positive integers is denoted by $\mathcal{N}$ and $\mathcal{N}' = \mathcal{N} \cup \{0\}$. If $n \in \mathcal{N}$ then $\mathcal{N}_n = \{x \in \mathcal{N} : x \le n\}$ and $\mathcal{N}'_n = \mathcal{N}_n \cup \{0\}$. If $m \in \mathcal{N}'$, then a partition of $m$ is often given in the form $\bar{p} = 1^{p_1} 2^{p_2} \cdots m^{p_m}$, where $p_i \in \mathcal{N}'$ and $\sum i p_i = m$. The set of all partitions of $m$ is indicated by $\Lambda_m$. A partition is called $j$-restricted when it has no part greater than $j$ (so $p_i = 0$ for all $i > j$). Such a partition is often written in the form $1^{p_1} \cdots j^{p_j}$. Moreover, we define $\Lambda_m^j = \{\bar{p} \in \Lambda_m : \bar{p} \text{ is } j\text{-restricted}\}$. For the number of parts of a partition (i.e. $\sum p_i$) our notation is $|\bar{p}|$.

We also need multinomial coefficients in connection with partitions. If $n \in \mathcal{N}$, $m \in \mathcal{N}'$ and $\bar{p} \in \Lambda_m$, then by $\binom{n}{\bar{p}}$ we mean $(n)_{|\bar{p}|} (p_1!\, p_2! \cdots p_m!)^{-1}$. In the last form $(n)_{|\bar{p}|}$ is a falling factorial, that is $n(n-1) \cdots (n - |\bar{p}| + 1)$ if $|\bar{p}| > 0$ and 1 if $|\bar{p}| = 0$. If $r$ is a real number, then $[r]$ stands for the greatest integer not greater than $r$. Finally, if $X$ and $Y$ are random variables, then by $X \approx Y$ we mean that $X$ and $Y$ are equally distributed.
3. Elementary properties

Let $I_{ij}$ be the indicator of the event that $b_i \in B$ is adjacent to $g_j \in G$ and let $J_{ij}$ be the indicator of adjacency of $b_i$ from $g_j$; then $X^+ = \sum_{i,j} I_{ij} J_{ij}$. From this relation the following propositions can be derived easily.
Proposition 3.1. $E(X^+) = kl$ and $\mathrm{Var}(X^+) = kl(1 - k/g)(1 - l/b)$.

Furthermore, let $X^0(l, k)$ be the number of $(x, y) \in B \times G$ such that there is an arc from $x$ to $y$ and there is no arc from $y$ to $x$. Then $X^0(l, k) = \sum_{i,j} I_{ij}(1 - J_{ij})$ and $X^0(l, k) \approx X^+(b - l, k)$ and $X^0(l, k) + X^+(l, k) = bk$. Using these statements we can prove

Proposition 3.2. $X^0(l, k) - E(X^0(l, k)) \approx E(X^0(b - l, k)) - X^0(b - l, k)$. In particular, if $l = \frac{1}{2} b$, $X^0(l, k)$ is symmetrically distributed.

Proposition 3.3. The range of $X^+$ equals $\{\max\{gl + bk - bg, 0\}, \ldots, \min\{gl, bk\}\}$.
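The moments and the range stated in Propositions 3.1 and 3.3 can be confirmed by exhaustive enumeration for small parameters. The sketch below assumes the model in which every vertex of $B$ chooses $k$ out-neighbours in $G$ and every vertex of $G$ chooses $l$ out-neighbours in $B$, uniformly and independently, which is the reading consistent with the variance formula above. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def match_distribution(b, g, k, l):
    """Exact distribution of X+ (number of two-cycles), as {value: probability}.

    Assumed model: each vertex of B picks k out-neighbours in G and each vertex
    of G picks l out-neighbours in B, all choices uniform and independent.
    """
    b_choices = [frozenset(c) for c in combinations(range(g), k)]
    g_choices = [frozenset(c) for c in combinations(range(b), l)]
    counts = {}
    total = 0
    for from_b in product(b_choices, repeat=b):
        for from_g in product(g_choices, repeat=g):
            # A match at (i, j) needs both arcs b_i -> g_j and g_j -> b_i.
            matches = sum(1 for i in range(b) for j in from_b[i] if i in from_g[j])
            counts[matches] = counts.get(matches, 0) + 1
            total += 1
    return {x: Fraction(c, total) for x, c in counts.items()}

def mean_and_variance(dist):
    """Mean and variance of a finite distribution given as {value: probability}."""
    mean = sum(x * pr for x, pr in dist.items())
    var = sum((x - mean) ** 2 * pr for x, pr in dist.items())
    return mean, var
```

For $b = g = 3$, $k = 2$, $l = 1$ the enumeration gives mean $kl = 2$, variance $kl(1 - k/g)(1 - l/b) = 4/9$ and the value set $\{0, 1, 2, 3\}$.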
4. Principle of inclusion and exclusion

Enumerate the elements of $B \times G$ 1 to $bg$. By $A_i$ ($i \in \mathcal{N}_{bg}$) we mean the event that in element $i$ of $B \times G$ a match occurs. Then, according to the generalized principle of inclusion and exclusion, the distribution of $X^+$ is given by

$$P(X^+ = s) = \sum_{j=0}^{bg-s} (-1)^j \binom{s+j}{j} S_{s+j}, \qquad (1)$$
where S,dZfXP(Ai,Arz... A,,,,), 111 ~ . / l r & a n dwhere the last summation is over all combinationsof(i,, ..., i,,,] of. 1'6,takiIlgnzdistinct integers at a time. Inparticular, we have So=1. 11 is known that S,,,is also the /wth binoniial momenl of X + ; i.e. E ( ( a r ) ] . I t ~ollowsfrom Proposition 3.3 that the In-th binomial \
I
I ,
\
moment of X' vanislieswlienever 117 E ( H E . 1 r f i > . m i n ( g l , h k } } . So, the bound of summation in ( I ) may be replaced by Jnin ( g I , bli) --s. On the other hand, if we define S,, =
{(
for all
112
€.A'"', this boundmay be set to infinity. How-
ever, we set out tentatively with version (1) using bound b g - . ~ . Consider P ( A I I A , ,... A,,,) with 112 E . N & Except for the parameters b, g , k and 1 this probability depends on two partitions of 111. To see this, regard the undirected graph ( B u C , (il, . . . , i,,,}). The degrees occurring in B form a partition of rn ( p say), as do the degrees of the vertices i n G (constituting i ) .Clearly, if (p, ;)$ A i , x A!,, or ( p ( > b or (g, then P ( A , , ._.A,,,)=O. Furthermore, i f ?,= 1P12P1 ... kpkand ;= 1q'2q2... P',then P ( A , , ... A,,,) equals
4)
For all $(\tilde p, \tilde q) \in \Lambda_m^k \times \Lambda_m^l$ (whether the pair is graphical or not) we define the weight $w(\tilde p, \tilde q)$ by expression (2). Let $N(\tilde p, \tilde q)$ be the number of bipartite graphs with respect to $(B, G)$ such that $\tilde p$ arises in $B$ and $\tilde q$ in $G$; then we may state
or for short $S_m = \sum (Nw)(\tilde p, \tilde q)$. Summarizing, in order to obtain explicit expressions for $P(X^+ = x)$ it suffices to evaluate the $N(\tilde p, \tilde q)$'s. Generating functions can be found for $N(\tilde p, \tilde q)$ applying methods given by Read [3]. However, we prefer to follow another approach, which is more straightforward and leads more easily to explicit expressions.
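The inclusion-exclusion relation of this section can be checked numerically on a tiny instance. The sketch below is an illustration only, not part of the original derivation. It assumes the model implied by Propositions 3.1 and 3.3: each element of $B$ sends arcs to $k$ elements of $G$, each element of $G$ sends arcs to $l$ elements of $B$, all choices are uniform and independent, and a match is a pair joined in both directions. It also assumes the standard binomial-moment inversion $P(X^+ = x) = \sum_j (-1)^j \binom{x+j}{j} S_{x+j}$ as the form of relation (1). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def match_distribution(b, g, k, l):
    """Exact law of the number of matches X+, by enumerating every configuration."""
    b_choices = list(combinations(range(g), k))  # targets in G chosen by one element of B
    g_choices = list(combinations(range(b), l))  # targets in B chosen by one element of G
    counts = {}
    for out_b in product(b_choices, repeat=b):
        for out_g in product(g_choices, repeat=g):
            # a match at (i, j): i points to j and j points back to i
            x = sum(1 for i in range(b) for j in out_b[i] if i in out_g[j])
            counts[x] = counts.get(x, 0) + 1
    total = sum(counts.values())
    return {x: Fraction(c, total) for x, c in counts.items()}


def binomial_moment(dist, m):
    """S_m = E[C(X+, m)], the m-th binomial moment."""
    return sum(comb(x, m) * p for x, p in dist.items())


def invert_moments(dist, x, upper):
    """Recover P(X+ = x) from S_x, ..., S_upper by inclusion and exclusion."""
    return sum((-1) ** j * comb(x + j, j) * binomial_moment(dist, x + j)
               for j in range(upper - x + 1))


if __name__ == "__main__":
    dist = match_distribution(3, 3, 1, 2)
    mean = sum(x * p for x, p in dist.items())
    var = sum(x * x * p for x, p in dist.items()) - mean ** 2
    print(dist, mean, var)
```

Exact rational arithmetic is used so that the mean and variance can be compared with Proposition 3.1 without rounding; the enumeration is only feasible for very small $b$ and $g$.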
5. Generating polynomial functions
Let $F$ be a function of $b + g + 1$ variables $x_1, \ldots, x_b$, $y_1, \ldots, y_g$ and $t$, defined by
$$F(x_1, \ldots, x_b, y_1, \ldots, y_g; t) = \prod_{i=1}^{b} \prod_{j=1}^{g} (1 + x_i y_j t),$$
and let $h_m$ be the polynomial coefficient of $t^m$ in the expansion of $F$. The polynomials $h_m$ are invariant under permutation of the indices of the $x$'s. The same holds with respect to the $y$'s. In the natural expansion of $h_m$, each additive term is the product of $m$ (not necessarily distinct) $x$'s and $m$ $y$'s and finally an integral coefficient. If we let $x_i$ correspond to $b_i \in B$ and $y_j$ to $g_j \in G$, then a term such as $c\, x_1^{r_1} x_2^{r_2} \cdots x_b^{r_b}\, y_1^{s_1} y_2^{s_2} \cdots y_g^{s_g}$ ($\sum r_i = \sum s_j = m$ of course) corresponds in an obvious way to the set consisting of all bipartite graphs with bipartition $(B, G)$ with $m$ lines and such that $d(b_i)$, i.e. the degree of vertex $b_i$, equals $r_i$ and $d(g_j) = s_j$. Furthermore, the coefficient $c$ clearly equals the cardinality of that set. If $\tilde p \in \Lambda_m^k$, $\tilde q \in \Lambda_m^l$ such that $|\tilde p| \le b$ and $|\tilde q| \le g$, then
where $C_{\tilde p, \tilde q}\{h_m(x_1, \ldots, x_b; y_1, \ldots, y_g)\}$ denotes the coefficient of
in the expansion of $h_m$. Since both sides of (4) vanish whenever $|\tilde p| > b$ or $|\tilde q| > g$ we drop the conditions $|\tilde p| \le b$ and $|\tilde q| \le g$ and (4) holds for all $(\tilde p, \tilde q) \in \Lambda_m^k \times \Lambda_m^l$. Using methods very similar to those in the proof of Lemma 3 in Smit [4], $h_m$ can be expressed as follows.
Proposition 5.1. Let $m \in \mathcal{N}_{bg}$. Then,
where $X_\alpha = \sum_{i=1}^{b} x_i^\alpha$ and $Y_\alpha = \sum_{j=1}^{g} y_j^\alpha$.
With this result and (4) the following theorem arises.
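The correspondence between the terms of $h_m$ and bipartite graphs can be verified directly for small $b$ and $g$. The following sketch, with hypothetical helper names, expands $\prod_{i,j}(1 + x_i y_j t)$ one factor at a time and compares every coefficient with a brute-force count of the bipartite graphs having the corresponding degree sequences. It is meant as a sanity check of the interpretation, not as the method of Proposition 5.1.

```python
from collections import Counter
from itertools import combinations


def expand_F(b, g):
    """Expand prod_{i,j} (1 + x_i y_j t).

    Returns {(r, s): coefficient}, where r and s are the exponent tuples of the
    x's and of the y's; the power of t equals sum(r) == sum(s).
    """
    poly = Counter({((0,) * b, (0,) * g): 1})
    for i in range(b):
        for j in range(g):
            nxt = Counter()
            for (r, s), c in poly.items():
                nxt[(r, s)] += c  # take the term 1 from this factor
                r2 = r[:i] + (r[i] + 1,) + r[i + 1:]
                s2 = s[:j] + (s[j] + 1,) + s[j + 1:]
                nxt[(r2, s2)] += c  # take the term x_i y_j t
            poly = nxt
    return poly


def count_graphs(b, g, r, s):
    """Count bipartite graphs on (B, G) with degree sequence r in B and s in G."""
    cells = [(i, j) for i in range(b) for j in range(g)]
    hits = 0
    for edges in combinations(cells, sum(r)):
        deg_b = [0] * b
        deg_g = [0] * g
        for i, j in edges:
            deg_b[i] += 1
            deg_g[j] += 1
        if tuple(deg_b) == r and tuple(deg_g) == s:
            hits += 1
    return hits


if __name__ == "__main__":
    poly = expand_F(2, 3)
    print(poly[((1, 1), (1, 1, 0))])
```

Summing the coefficients over all degree sequences that realize the same pair of partitions gives the quantity the text calls $N(\tilde p, \tilde q)$.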
Theorem 5.2. Let $m \in \mathcal{N}^0_{bg}$, $\tilde p \in \Lambda_m^k$, $\tilde q \in \Lambda_m^l$ and $n = \min\{k, l\}$, then
n X p ) denotes the coeficient
n
where C;(
Similarly C;(
of
n.):Y
s= 1
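Proof. If $m = 0$ the statement obviously holds. Suppose $m \in \mathcal{N}_{bg}$. We have $N(\tilde p, \tilde q) = \binom{b}{\tilde p}\binom{g}{\tilde q} C_{\tilde p, \tilde q}(h_m)$. Using Proposition 5.1 and some simple properties of $C_{\tilde p, \tilde q}$, we obtain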
m
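Since clearly $C_{\tilde p}\bigl(\prod_{\alpha=1}^{m} X_\alpha^{r_\alpha}\bigr) = 0$ whenever $\tilde r$ is not $k$-restricted and $C_{\tilde q}\bigl(\prod_{\alpha=1}^{m} Y_\alpha^{r_\alpha}\bigr) = 0$ whenever $\tilde r \notin \Lambda_m^l$, we may restrict ourselves to partitions $\tilde r \in \Lambda_m^k \cap \Lambda_m^l = \Lambda_m^n$. So the proof is complete. □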
In the next section we will show how this theorem can be used to derive explicit expressions for $N(\tilde p, \tilde q)$ and so for $P(X^+ = x)$, whenever $k$ and $l$ are not too large.
6. Exact results
With the aid of Theorem 5.2 we are able to obtain explicit expressions for $C_{\tilde p, \tilde q}(h_m)$ and $N(\tilde p, \tilde q)$ in the cases $k, l \le 3$, $k \le 2$ and $l \le 2$. It seems that most other cases are too difficult to deal with in a similar way. As an example we consider $k, l \le 3$. In this case, we only have to determine the quantity $C_{\tilde p}(X_1^{r_1} X_2^{r_2} X_3^{r_3})$ for $\tilde p$ and $\tilde r = 1^{r_1} 2^{r_2} 3^{r_3} \in \Lambda_m^3$. A little reflection shows that $C_{\tilde p}(X_1^{r_1} X_2^{r_2} X_3^{r_3})$ vanishes whenever $p_1 > r_1$, $p_3 < r_3$, $p_2 + p_3 < r_2 + r_3$ or $|\tilde p| > b$. Let $N_1 = r_1 - p_1$, $N_2 = p_2 + p_3 - r_2 - r_3$ and $N_3 = p_3 - r_3$ and suppose $N_i \ge 0$, $i = 1, 2, 3$,
where M,=rnax{N3-r2,
[:I]]
1
and M , = m i n ~ [ ~ ~ ]N, , , N 3 ]
Since the proof is tedious, elementary combinatorics, it will be omitted. Statement (6) makes it easy to formulate and to prove the following two results for the numbers $N(\tilde p, \tilde q)$.
where $r_1 = m - 2r_2 - 3r_3$, $Q_3 = \min(p_3, q_3)$ and
Furthermore,
Proof. By the bounds of summation $Q_3$ and $Q_2(r_3)$ in the assertion, the condition $N_i \ge 0$, $i = 1, 2, 3$ for (6) holds for all $r_2$ and $r_3$. By Theorem 5.2 the proof can be completed, using the fact that the product $\binom{b}{\tilde p}\binom{g}{\tilde q}$ vanishes whenever $|\tilde p| > b$ or $|\tilde q| > g$.
For $\tilde p, \tilde q \in \Lambda_m^2$ we have the following corollary.
Corollary 6.2. Let $m \in \mathcal{N}^0_{bg}$, $\tilde p$ and $\tilde q \in \Lambda_m^2$. Then,
Proof. By Theorem 6.1, setting $u = r_2$. □
Obviously it is possible to formulate theorems about the exact probability distribution of $X^+_{b,g}(l, k)$ whenever $k \le 3$ and $l \le 3$, but this will not be done here. Another application of Theorem 5.2 is about moments of $X^+$. For example we can derive expressions for $S_3$ and $E\{(X^+)^3\}$.
7. Asymptotic results
In this section we pay attention to an asymptotic expression for moments and to results about the asymptotic behavior of $X^+_{b,g}(l, k)$ ($b$ and/or $g \to \infty$). We begin with a theorem about moments.
Theorem 7.1. Let $k$ and $l$ be arbitrary, but fixed. Then,
Proof. For $j = 0, 1$ this statement obviously holds. Let $2 \le j \le bg$ and suppose $k \ge 2$, $l \ge 2$. Then $S_j = (Nw)(1^j, 1^j) + (Nw)(1^{j-2} 2^1, 1^j) + (Nw)(1^j, 1^{j-2} 2^1) + \sum (Nw)(\tilde p, \tilde q)$, where the last summation is over all other pairs $(\tilde p, \tilde q)$ of admitted $j$-partitions $\tilde p$ and $\tilde q$. Clearly, the number of terms in this sum is independent of $b$ and $g$. Furthermore, for each pair $(\tilde p, \tilde q)$ like this $(Nw)(\tilde p, \tilde q) = O\{(b^{-1} + g^{-1})^2\}$, so $S_j^{(4)} \stackrel{\text{def}}{=} \sum (Nw)(\tilde p, \tilde q) = O\{(b^{-1} + g^{-1})^2\}$. This is even true if $k = 1$ or $l = 1$. If
and
then it follows from Corollary 6.2 that
and from the definition of the $S_j^{(i)}$'s it is clear that $S_j = \sum_i S_j^{(i)}$ even holds if $k = 1$ or $l = 1$. Because
and
the result follows. 0
Theorem 7.2. Let $k$, $l$ and $g$ be arbitrary, but fixed. Then,
$x = 0, \ldots, gl,$
where
and
Proof. Since $S_j = 0$ whenever $j > gl$, the upper bound of summation in (1) may be replaced by $gl - x$. Let $m \in \mathcal{N}_{gl}$ and $\tilde q \in \Lambda_m^l$, then clearly
Furthermore,
Since
the proof is complete.
0
As a corollary we state the case $l = 1$.
Corollary 7.3. Let $k$ and $g$ be arbitrary, but fixed. Then,
Proof. By Theorem 7.2, substituting $l = 1$. □
For the case that $b$ and $g$ simultaneously tend to infinity we present a further result on the asymptotic behavior of $P(X^+ = x)$.
Theorem 7.4. Let $k$ and $l$ be arbitrary, but fixed. Then, for $x \in \mathcal{N}^0$,
$$P(X^+ = x) = \pi_x \{1 + C(-x^2 + x + 2xkl - k^2 l^2)\} + O\{(b^{-1} + g^{-1})^2\}, \quad (b, g \to \infty),$$
where $\pi_x$ denotes the Poisson probability $\frac{(kl)^x e^{-kl}}{x!}$ and $C = \tfrac12\bigl((bk)^{-1} + (gl)^{-1}\bigr)$.
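A short numerical sketch may help in reading Theorem 7.4. It evaluates the Poisson probability with the first-order correction term, dropping the $O\{(b^{-1}+g^{-1})^2\}$ remainder, and compares the variance of this approximation with the exact variance of Proposition 3.1. The function names and the parameter values in the usage lines are illustrative assumptions, not taken from the paper.

```python
from math import exp, factorial


def poisson_with_correction(x, k, l, b, g):
    """First-order approximation of P(X+ = x) from Theorem 7.4 (remainder dropped)."""
    lam = k * l
    c = 0.5 * (1.0 / (b * k) + 1.0 / (g * l))
    poisson = lam ** x * exp(-lam) / factorial(x)
    return poisson * (1.0 + c * (-x * x + x + 2 * x * lam - lam * lam))


def exact_variance(k, l, b, g):
    """Var(X+) = kl (1 - k/g)(1 - l/b), as stated in Proposition 3.1."""
    return k * l * (1 - k / g) * (1 - l / b)


if __name__ == "__main__":
    k, l, b, g = 2, 3, 50, 60
    probs = [poisson_with_correction(x, k, l, b, g) for x in range(80)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum(x * x * p for x, p in enumerate(probs)) - mean ** 2
    print(sum(probs), mean, var, exact_variance(k, l, b, g))
```

The correction term sums to zero against the Poisson weights and leaves the mean $kl$ unchanged, so the approximation is a signed redistribution of Poisson mass; its variance differs from the exact one only at second order in $b^{-1}$ and $g^{-1}$.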
Proof. Let $T_j = \frac{(kl)^j}{j!}\{1 - (j)_2 C\}$ and $R_j = S_j - T_j$, $j \in \mathcal{N}^0$. Furthermore, let $n = \min\{bk, gl\} \ge x$. Then one can show that
$$P(X^+ = x) = \sum_{i=0}^{n-x} (-1)^i \binom{x+i}{i} S_{x+i} = B_1 - B_2 + B_3,$$
where
$$B_1 = \sum_{i=0}^{\infty} (-1)^i \binom{x+i}{i} T_{x+i} = \pi_x \{1 + C(-x^2 + x + 2xkl - k^2 l^2)\},$$
$$B_2 = \sum_{i=n-x+1}^{\infty} (-1)^i \binom{x+i}{i} T_{x+i},$$
obviously also a convergent series, and
$$B_3 = \sum_{i=0}^{n-x} (-1)^i \binom{x+i}{i} R_{x+i}.$$
For $B_2$,
$$|B_2| \le \frac{(kl)^x}{x!} \sum_{i=n-x+1}^{\infty} \frac{(kl)^i}{i!} \bigl|1 - (x+i)_2 C\bigr|$$
holds. Using $i! \ge (n - x + 1)! \times (i - n + x - 1)!$, it follows that $B_2 = O\{(b^{-1} + g^{-1})^2\}$. In order to prove that $B_3 = O\{(b^{-1} + g^{-1})^2\}$ we split up the corresponding $R_j$ into 4 parts. Consider the definitions of the $S_j^{(i)}$'s in the proof of Theorem 7.1.
Obviously, Sj=
Sj" holds for all j
E .&;,
even f o r j < I . Let
i= 1
Ri" = Sj('!-
(kl)'
7-r
{I - & ( j ) , ( b -
J-
Then
Rj=
+g -
I)},
1Ry) . i=l
By methods similar to those used in the proof of Theorem 8 in Smit [4], it is possible to show that for $i = 1, \ldots, 4$ there is a positive function $f: B \times G \to \mathbb{R}$ independent of $j$ and of the order $O\{(b^{-1} + g^{-1})^2\}$, and a positive real number $D$ which is constant with respect to $j$, $b$ and $g$, such that
$$|R_j^{(i)}| \le \frac{(kl)^j}{j!} (1 + D)^j f(b, g), \quad j = 0, \ldots, n.$$
Obviously, the same holds for $R_j$. And now it follows easily that $B_3 = O\{(b^{-1} + g^{-1})^2\}$. □
Finally, we state a corollary which is an immediate consequence of Theorem 7.4.
Corollary 7.5. Let $k$ and $l$ be arbitrary, but fixed. Then, $X^+_{b,g}(l, k) \to \mathrm{Po}(kl)$, $(b, g \to \infty)$, where $\mathrm{Po}(kl)$ denotes the Poisson distribution with parameter $kl$.
References
[1] J. Jaworski, On the connectedness of a random bipartite mapping, Lecture Notes in Math. 1018 (1983) 69-74.
[2] J. Jaworski, A random bipartite mapping, Ann. Discrete Math. (to appear).
[3] R. C. Read, The enumeration of locally restricted graphs (II), J. London Math. Soc. 35 (1960) 344-351.
[4] I. H. Smit, The distribution of the number of two-cycles in certain kinds of random digraphs, Wiskundig Seminarium, Vrije Universiteit, Amsterdam, 1979, Rapport nr. 116.
[5] S. S. Wasserman, Random directed graph distribution and the triad census in social networks, J. Math. Sociology 5 (1977) 61-86.
Annals of Discrete Mathematics 28 (1985) 243-250
© Elsevier Science Publishers B.V. (North-Holland)
FOUR ROADS TO THE RAMSEY FUNCTION
Joel SPENCER
Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794, U.S.A.
In this paper we are concerned with lower bounds to the Ramsey function R(k). We examine four arguments and the bounds they yield. All arguments we consider are variants of the probabilistic method.
The Ramsey function $R(k)$ is defined as the smallest $n$ such that if the edges of $K_n$ are two-colored there necessarily exists a monochromatic $K_k$. In this paper we are concerned with lower bounds to $R(k)$. Thus, for $n$ as large as possible, we wish to two-color $K_n$ so that there does not exist a monochromatic $K_k$. We examine four arguments and the bounds they yield. All arguments we consider are variants of the probabilistic method.
In 1946 Paul Erdős [2] published a seminal paper on the probabilistic method. He showed that
if $\binom{n}{k} 2^{1-\binom{k}{2}} < 1$ then $R(k) > n$. (1)
A calculation using Stirling's formula shows that this implies
$R(k) > \frac{1}{e\sqrt{2}}\, k 2^{k/2} (1 + o(1))$. (2)
Here is his argument in modern, i.e. probabilistic, terms. Consider a random coloration of $K_n$. That is, each edge is colored Red or Blue (our colors) with equal probability and these probabilities are mutually independent. Each $k$-set has probability $2^{1-\binom{k}{2}}$ of being monochromatic. There are $\binom{n}{k}$ such $k$-sets. Hence the expected number of monochromatic $k$-sets is $\binom{n}{k} 2^{1-\binom{k}{2}}$.
Basically, we are here invoking the linearity of expectation. For each $k$-set $A$ let $X_A$ be the indicator random variable for the event "$A$ is monochromatic."
Let $X = \sum X_A$, the summation over all $k$-sets $A$. Then $X$ is the number of monochromatic $k$-sets. The variables $X_A$, $X_{A'}$ can be quite dependent but linearity of expectation does not require independence of the variables. Thus $E(X) = \sum E(X_A)$ which, by assumption, is less than unity. As $X$ is integral valued there is some "point" in the probability space for which $X = 0$. The points of the probability space are the colorings. Thus there is some coloring of $K_n$ with no monochromatic $K_k$, completing the proof.
Since 1946 the only improvement on the bound (2) has been in the constant term $1/(e\sqrt{2})$. As the best known upper bound for $R(k)$ is of the order $(4 + o(1))^k$ one could say that no significant improvement on the Erdős bound has been found. Indeed, the problem of finding the real order of $R(k)$, e.g. the value of $\lim R(k)^{1/k}$, is, in this author's opinion, the most vexing problem involving the probabilistic method. Our three other methods, while not shedding light on this question, do bring basic methodologies of the probabilistic method into sharp focus.
An improvement of the Erdős bound is given by the Deletion Method. We show that
if $\binom{n}{k} 2^{1-\binom{k}{2}} < \varepsilon n$ then $R(k) > n(1 - \varepsilon)$. (3)
Taking $\varepsilon = o(1)$ appropriately, this implies
$R(k) > \frac{1}{e}\, k 2^{k/2} (1 + o(1))$. (4)
Again consider a random coloration of $K_n$. As before the expected number $X$ of monochromatic $K_k$ is $\binom{n}{k} 2^{1-\binom{k}{2}}$, which is now less than $\varepsilon n$. Thus there is a particular coloring of $K_n$ with less than $\varepsilon n$ monochromatic $K_k$. Select one point arbitrarily from each of the monochromatic $K_k$ and delete it from the vertex set. At least $n(1 - \varepsilon)$ vertices still remain. All of the monochromatic $K_k$ have been destroyed so we are left with a coloring of the edges on at least $n(1 - \varepsilon)$ vertices with no monochromatic $K_k$.
The above argument was discovered by Jim Shearer in 1982. The result is not as good as that obtained by the Lovász Local Lemma (to be described) discovered in 1975 and has not been published. Still, it is surprising that this result, using a fairly well understood technique, was not commented on (to this author's knowledge) between 1946 and 1975.
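For concrete values of $k$ the two arguments so far can be evaluated exactly. The sketch below is an illustration with hypothetical function names. It takes the basic condition to be $\binom{n}{k} 2^{1-\binom{k}{2}} < 1$, and for the Deletion Method it uses the observation that some coloring has at most $\lfloor E(X) \rfloor$ monochromatic $K_k$, so that $n - \lfloor E(X) \rfloor$ vertices survive the deletion. Integer arithmetic keeps the comparison exact; $k \ge 3$ is assumed.

```python
from math import comb


def expected_mono(n, k):
    """Return (numerator, denominator) of C(n,k) * 2^(1 - C(k,2)), kept exact."""
    return 2 * comb(n, k), 2 ** comb(k, 2)


def erdos_bound(k):
    """Largest n with C(n,k) 2^(1-C(k,2)) < 1, so R(k) > n by the basic argument (k >= 3)."""
    n = k
    while True:
        num, den = expected_mono(n + 1, k)
        if num >= den:
            return n
        n += 1


def deletion_bound(k):
    """Best value over n of n - floor(E[X]): the vertices left after deleting one
    vertex from each monochromatic K_k of a coloring with at most E[X] of them."""
    best = 0
    n = k
    while True:
        num, den = expected_mono(n, k)
        left = n - num // den
        if left <= 0:  # the expectation has overtaken n; larger n cannot help
            return best
        best = max(best, left)
        n += 1


if __name__ == "__main__":
    for k in range(3, 11):
        print(k, erdos_bound(k), deletion_bound(k))
```

For small $k$ the two values are often equal; the factor $\sqrt{2}$ between (2) and (4) is an asymptotic statement.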
Our third argument uses a Recoloring Method. This method was used effectively by József Beck [1] to find bounds on the function $m(n)$ for Property B. We show that
if $\binom{n}{k} 2^{1-\binom{k}{2}} < n^{2-\varepsilon}$ then $R(k) > n$. (5)
Here $\varepsilon$ is an arbitrarily small but fixed positive real and $n$ approaches infinity. This implies
$R(k) > \frac{\sqrt{2}}{e}\, k 2^{k/2} (1 + o(1))$.
Again we begin with a random coloring of $K_n$ (which we call the First Coloring) yielding $n^{2-\varepsilon}$ monochromatic $K_k$. For somewhat technical reasons we set $s = [100\,\varepsilon^{-2}]^2$ and call a $k$-set $A$ nearmono if all but less than $s$ edges are the same color. (Nearmono includes monochromatic.) The expected number of nearmono $K_k$ is then
The additional factor is less than $k^{2s}$. As $k \sim c \ln n = n^{o(1)}$ and $c$, $s$ are fixed with $n$ approaching infinity this term is $n^{o(1)}$. Thus there are $n^{2-\varepsilon+o(1)}$ nearmono $K_k$.
Set $p = 100/k$. Given the First Coloring call an edge critical if it is a red edge in a nearred $k$-set or a blue edge in a nearblue $k$-set. We change the color of each such edge with probability $p$, these probabilities being independent. (Note: if an edge lies in many nearmono $k$-sets it still changes color with probability only $p$.) We shall show that the probability of a monochromatic $k$-set in the new coloring, which we call the Recoloring, is $o(1)$. For convenience, we look at the probability of a red $K_k$. There are three ways that a $k$-set $A$ can be red in the Recoloring.
(i) $A$ was nearred in the First Coloring and none of its red edges changed in the Recoloring. Given the First Coloring with $A$ nearred, $A$ has probability at most $(1 - p)^{\binom{k}{2} - s}$ of remaining red. (All red edges, lying in $A$, were critical.) This factor is roughly $\exp(-pk^2/2) = \exp(-50k)$
as $n \sim 2^{k/2}$. The expected number of times (i) occurs is thus
as desired. In fact, p was chosen large enough to make this term negligible. (ii) A was nearblue in the First Coloring and changed to red. This possibility is most unlikely, occurring an expected number of times less than
since kk1/2-
-(2
k/2 klnk
)
-n
klnk
.
(iii) $A$ was neither nearred nor nearblue in the First Coloring but became Red in the Recoloring. Say $A$ had $t$ blue edges in the First Coloring, so $s < t < \binom{k}{2} - s$.
Given the First Coloring the probability of $A$ becoming Red is at most $p^t$ as all Blue edges must change. The expected number of times this occurs is
(approximating
(7)
by
e)
which is $o(1)$ when $t \ge 5000k$.
Now we assume s
(1;) ( k If.4 ) (nk)
-4 +o( 1 )
=
potential partners $B$. The coloring of $A$ affects only six edges of $B$ so the chance of $B$ being nearmono is $2^{-\binom{k}{2}} n^{o(1)}$. Hence the expected number of pairs is
$n^{2-\varepsilon}\, n^{2-\varepsilon}\, n^{-4}\, n^{o(1)} = n^{-2\varepsilon+o(1)}$
which is $o(1)$. There will be pairs $A$, $B$ with $|A \cap B| = 2$ or 3 but we now show that they break into small groups. Let $s_0 = [2\varepsilon^{-1}] + 1$ so that $s_0 \varepsilon > 2$. Consider the graph whose vertices are the nearmono sets and where $A$, $B$ are adjacent if $|A \cap B| = 2$ or 3. We claim this graph has no components of size greater than $s_0$. If such a component did exist there would be an ordered family $A_0, A_1, \ldots, A_{s_0}$ of nearmono sets such that for each $i > 0$ there would exist $j$
$\binom{n}{k}$ choices for $A_0$. There are less than
$\binom{k}{2}\binom{n}{k-2} = n^{-2+o(1)}\binom{n}{k}$
choices for each succeeding $A_i$ as two of its points must come from a prescribed $A_j$. Each $A_i$ has at most $3s_0$ edges in common with previous $A_j$. Thus the probability that $A_i$ is nearmono conditional on the previous $A_j$ being nearmono is at most $2^{-\binom{k}{2}} k^{2s} 2^{3s_0} = 2^{-\binom{k}{2}} n^{o(1)}$. (That is, the common edges have a negligible effect.) The expected number of families is less than
which is $n^{2-\varepsilon-\varepsilon s_0+o(1)}$. We have selected $s_0$ such that this term is $o(1)$.
At the opposite extreme we claim there will not be pairs $A$, $B$ of nearmono sets with $|A \cap B| = k - 2$. Again there are $n^{2-\varepsilon+o(1)}$ nearmono $A$. Each $A$ has $n^{2+o(1)}$ potential partners $B$. Given that $A$ is, say, nearred, among the $2(k-2)$ edges from $B - A$ to $A \cap B$ at most $s$ can be colored blue. As $s <$
if $|A \cap B| = k - 1$ or $k$ and neardisjoint if $|A \cap B| = 0, 1, 2,$ or 3. Each pair is one or the other. If $A$, $B$ are nearequal and $B$, $C$ are nearequal then $A$, $C$ can surely not be neardisjoint so they must be nearequal. That is, nearequal is an equivalence relation on the nearmono sets. Furthermore, we claim that any collection of mutually nearequal nearmono sets has a total of either $k$ or $k + 1$ points. Otherwise there would be nearmono $A_1$, $A_2$, $A_3$, mutually nearequal with a union of $k + 2$ points. On these $k + 2$ points at most $3s + 3$ edges could have the "other" color. The usual expected value argument shows that this will not happen.
Suppose a $k$-set $A$ has $t$ blue edges $e_1, \ldots, e_t$, s
The a priori probability that $B$ has less than $2s$ red edges is $n^{o(1)} 2^{-\binom{k}{2}}$ if $|B| = k$ and even less if $|B| = k + 1$. When we condition on the coloration of $A$ this probability may increase. The increase is by at most a factor of $2^t$ since at best we are presupposing a particular $t$ edges of $B$ to be blue. Thus the expected number of such pairs $A$, $B$ would be less than
which is $o(1)$, as the negative power of $n$ dominates in the range $s < t < 5000k$.
Set $u = \sqrt{t}/s_0$ for convenience. We consider the possibility of a $k$-set $A$ with $t$ blue edges and mutually disjoint nearblue $A_1, \ldots, A_u$ with all $|A \cap A_i| \ge 2$. There are $\binom{n}{k}$ potential $A$ and, given $A$, $n^{-2+o(1)}\binom{n}{k}$ potential $A_i$ for each $i$. $A$ has probability less than $k^{2t} 2^{-\binom{k}{2}}$ to have $t$ blue edges. Each $A_i$ has a priori probability $n^{o(1)} 2^{-\binom{k}{2}}$ to be nearblue. Conditioning on the coloring of $A$ increases this by at most a factor of $2^{r_i}$ where $r_i$ is the number of blue edges in $A \cap A_i$. The events "$A_i$ is nearblue" are mutually independent since, critically, the $A_i$ are edge disjoint. The
factors $2^{r_i}$ combine to a factor of at most $2^t$ (which will prove negligible) since $\sum r_i \le t$. Thus the expected number of $A, A_1, \ldots, A_u$ is less than
which is less than
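This is $o(1)$ as the term $n^{-\varepsilon u} = n^{-\varepsilon\sqrt{t}/s_0}$ dominates in the range $s < t < 5000k$.
if $4\binom{k}{2}\binom{n}{k-2} 2^{1-\binom{k}{2}} < 1$ then $R(k) > n$.
This implies
$R(k) > \frac{\sqrt{2}}{e}\, k 2^{k/2} (1 + o(1))$.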
As this result has been described in detail previously [3] our treatment shall be brief. Let $E_1, \ldots, E_m$ be events in a probability space, each with probability $p$. Let $d$ be an integer and suppose that for each $E_i$ there is a family of at most $d$ other events $E_{i_1}, \ldots, E_{i_s}$, $s \le d$, such that $E_i$ is mutually independent of the $E_j$, $j \ne i, i_1, \ldots, i_s$. Assume $4dp < 1$. The Lovász Local Lemma then states that the complements $\bar{E}_i$ have a nonempty intersection. That is,
To apply this, consider a random coloring of $K_n$. For each $k$-set $A$ let $E_A$ be the event "$A$ is monochromatic." Then $p = 2^{1-\binom{k}{2}}$. The event $E_A$ is mutually independent of all $E_B$ with $|A \cap B| < 2$ since these events involve only the color of the edges outside of $[A]^2$. We apply the Lovász Local Lemma with $d$ equal to the number of $k$-sets $B$ with $|A \cap B| \ge 2$. This $d$ is less than $\binom{k}{2}\binom{n}{k-2}$. When $4dp < 1$ the complements $\bar{E}_A$ have nonvoid intersection. There is a point in the probability space (i.e. a coloring of $K_n$) for which no $E_A$ hold (i.e. no $A$ are monochromatic), completing the proof. The Lovász Local Lemma acts as a sieve, filtering out the events $E_A$ to find a point in none of them. For this reason, this method is sometimes called the Lovász Sieve.
[Chart: the four methods and the resulting bound on $R(k)$.]
We summarize our methods and conclude our paper with the chart (see above).
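The Local Lemma condition can also be evaluated exactly for a given $k$. The following sketch, with hypothetical function names, checks $4dp < 1$ using the bound $d < \binom{k}{2}\binom{n}{k-2}$ and $p = 2^{1-\binom{k}{2}}$ from the text, rewritten as the integer inequality $8\binom{k}{2}\binom{n}{k-2} < 2^{\binom{k}{2}}$.

```python
from math import comb


def lll_condition(n, k):
    """True when 4 * C(k,2) * C(n,k-2) * 2^(1 - C(k,2)) < 1, in exact integer arithmetic."""
    return 8 * comb(k, 2) * comb(n, k - 2) < 2 ** comb(k, 2)


def lll_bound(k):
    """Largest n >= k satisfying the condition, or None if even n = k fails."""
    if not lll_condition(k, k):
        return None
    n = k
    while lll_condition(n + 1, k):
        n += 1
    return n


if __name__ == "__main__":
    for k in range(3, 13):
        print(k, lll_bound(k))
```

For small $k$ the condition is restrictive (for $k = 6$ it certifies only $n = 10$); the improvement it gives is asymptotic.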
References
[1] J. Beck, On 3-chromatic hypergraphs, Discrete Math. 24 (1978) 127-137.
[2] P. Erdős, Some remarks on the theory of graphs, Bull. Amer. Math. Soc. 53 (1947) 292-294.
[3] J. Spencer, Ramsey's theorem - a new lower bound, J. Comb. Th. (A) 18 (1975) 108-115.
Annals of Discrete Mathematics 28 (1985) 251-262
© Elsevier Science Publishers B.V. (North-Holland)
RANDOM GRAPH PROBLEMS IN POLYMER CHEMISTRY
John L. SPOUGE
T-10, Theoretical Biology, D465, Los Alamos National Laboratory, New Mexico 87545, U.S.A.
We examine some random graph problems suggested by polymer chemistry. These problems generalize the random graph that Erdős and Rényi examined. In certain cases the generalization is known to lead to novel graph behaviour and mathematical techniques. The Appendix also gives the polymer chemistry references which suggest the graphical problems.
1. Introduction
Random graphs can provide useful models for polymer chemists. This paper reviews some graph-theoretic problems suggested by polymer chemistry. Some of the problems are unsolved (or more accurately, unattempted) but may well require novel mathematical techniques for solution. All problems are stated in graph-theoretic form. The Appendix contains the polymer chemistry references which suggested the problems.
We usually follow Harary's [15] graphical terminology, see Fig. 1. Our graphs are always labelled and undirected, but they may be disconnected unless specified otherwise. Most problems have straightforward extensions to directed graphs.
The notion of graphical partition is central to this paper. The partition of a graph is a list of the degrees of its vertices. We use $\prod_j H_j^{n_j}$ to denote such a list. Here $H_j$ is the marker variable for a vertex of degree $j$ and there are $n_j$ vertices of degree $j$, $j = 0, 1, 2, \ldots$. Often the $H_j$ are numbers as well but the use of the symbol $H_j$ will be clear from context.
The text allows two types of graph: ordered and unordered. The edges emanating from a vertex in an unordered graph are indistinguishable. In an ordered graph, those edges are assigned numbers $1, 2, \ldots, j$; different assignments produce different ordered graphs.
We shall examine three types of graphical model: tree, graph and pseudomultigraph. Let us begin with the graph, since this has strong analogy to the work of Erdős and Rényi [3].
Fig. 1. (a), (b) and (c) are a tree, a graph and a pseudomultigraph, respectively. A pseudomultigraph may have loops, cycles and multiple bonds. A graph may have cycles only; a tree has neither loops, cycles nor multiple bonds. The graph has labelled vertices and the pseudomultigraph ordered half-edges. The degree of a vertex is the number of edges for which it is an endpoint, for example, the degree of all the pseudomultigraph vertices is 6. We also use "graph" as a general term to include any type of collection of vertices and edges. The meaning will be clear from context. A component of a graph (in the second sense) is any maximal connected subgraph of the original graph, e.g. (a), (b) and (c) are the 3 components of Fig. 1 taken as a single disconnected graph.
2. The graph model
Erdős and Rényi considered a random graph on $n$ labelled vertices, where each of $M = \tfrac12 n(n-1)$ possible edges was independently present (probability $p$) or absent (probability $q = 1 - p$). Let $G$ be an unordered graph on the $n$ vertices which has $b$ edges. Then $G$ is the selected graph with probability $p^b q^{M-b}$.
Let us generalize the probability measure. Assume $G$ has partition $\prod_j H_j^{n_j}$. The following constraints on vertex and edge numbers hold.
Give the graph $G$ the weight (not probability)
where $\{H_j\}$ and $\beta$ are preassigned numbers. We now select $G$ from all possible graphs with probability
where the sum runs over all possible graphs satisfying Eq. (1). In Eq. (3), $H_j$ is the intrinsic tendency of the vertices to be of degree $j$. The parameter $\beta$ determines the expected number of edges in the random graph. The Erdős-Rényi study is the specialization
$H_j = 1, \quad j = 0, 1, 2, \ldots; \qquad \beta = p/q.$ (5)
To extend that study, one could investigate threshold functions and limit theorems for general $\{H_j\}$ as $n \to \infty$. Grimmett [14] gives a good introduction to the Erdős-Rényi results.
In the Graph Model, ordered graphs present no new features: every unordered graph of partition $\prod_j H_j^{n_j}$ corresponds to exactly $\prod_j (j!)^{n_j}$ ordered graphs. (The half-edges around each vertex of degree $j$ can be numbered in $j!$ ways.) The ordered case with degree weights
$H_j' = \frac{H_j}{j!}$ (6)
is equivalent to the unordered case with degree weights $H_j$. The Graph Model is the least explored of the three models.
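The Erdős-Rényi specialization can be checked by direct enumeration on a few vertices. In the sketch below the weight of a graph is assumed to be $\beta^b \prod_j H_j^{n_j}$, which is the form suggested by the surrounding description; the weight formula as used here and all function names are assumptions made for illustration. With $H_j = 1$ and $\beta = p/q$ the normalized weights should reduce to $p^b q^{M-b}$.

```python
from fractions import Fraction
from itertools import combinations


def all_graphs(n):
    """Yield every labelled simple graph on n vertices as a tuple of edges."""
    pairs = list(combinations(range(n), 2))
    for size in range(len(pairs) + 1):
        for edges in combinations(pairs, size):
            yield edges


def weight(edges, n, H, beta):
    """Assumed weight: beta^(number of edges) times the product of H(degree) over vertices."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    w = Fraction(beta) ** len(edges)
    for d in deg:
        w *= H(d)
    return w


def graph_probabilities(n, H, beta):
    """Normalize the weights over all labelled graphs on n vertices."""
    weights = {edges: weight(edges, n, H, beta) for edges in all_graphs(n)}
    total = sum(weights.values())
    return {edges: w / total for edges, w in weights.items()}


if __name__ == "__main__":
    p = Fraction(1, 3)
    q = 1 - p
    probs = graph_probabilities(4, lambda d: 1, p / q)
    print(probs[()], q ** 6)
```

Passing a non-constant `H` (for example one that penalizes high degrees) gives a small playground for the generalized measure, at the cost of enumerating all $2^M$ graphs.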
3. The pseudomultigraph model
Polymer chemists have used special cases of this model and the following one extensively. Mathematical analysis of this model is due to Whittle [17-22] and begins at Eq. (9). The formulation of the model as a random ordered pseudomultigraph as given here is new. I give Whittle's results in some detail as they involve novel mathematical techniques (Eq. (15)) and have potential for generalization (Eq. (14)). In this section, we assign probabilities according to Eqs. (1-4) of §2, but now we do so to ordered pseudomultigraphs $G$ instead of unordered graphs.
Fig. 2. Both unordered pseudomultigraphs in (2a) have the same partition, but they correspond to different numbers of ordered pseudomultigraphs in (2b).
Eq. (6) provided an easy passage from unordered graphs to ordered graphs. Fig. 2 shows that no such passage is possible for pseudomultigraphs, since the partition does not uniquely define the correspondence between ordered and unordered pseudomultigraphs. The problem for unordered pseudomultigraphs is harder than the one we examine.
There are an infinite number of pseudomultigraphs on $n$ vertices, so one may suspect already that $\beta$, the bonding parameter, must be "small." The number of distinct ordered pseudomultigraphs of partition $\prod_j H_j^{n_j}$ is
$U_n\{\prod_j H_j^{n_j}\} = \frac{n!}{\prod_j n_j!} \cdot \frac{(2b)!}{2^b\, b!}.$ (7)
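The two factors in the count of ordered pseudomultigraphs can be confirmed by brute force for small partitions. The sketch below, with hypothetical function names, enumerates the distinct assignments of degrees to labelled vertices and, for each one, every pairing of the resulting labelled half-edges; the total is compared with $\frac{n!}{\prod_j n_j!} \cdot \frac{(2b)!}{2^b b!}$. It assumes that the total degree is even.

```python
from itertools import permutations
from math import factorial


def pairings(items):
    """Yield every way of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def count_by_enumeration(degree_counts):
    """degree_counts maps degree j -> n_j; enumerate degree assignments and pairings."""
    degrees = [j for j, nj in degree_counts.items() for _ in range(nj)]
    total = 0
    for assignment in set(permutations(degrees)):
        # half-edge (v, h): the h-th numbered half-edge at labelled vertex v
        half_edges = [(v, h) for v, d in enumerate(assignment) for h in range(d)]
        total += sum(1 for _ in pairings(half_edges))
    return total


def count_by_formula(degree_counts):
    """n!/prod(n_j!) * (2b)!/(2^b b!) for a partition with even total degree 2b."""
    n = sum(degree_counts.values())
    two_b = sum(j * nj for j, nj in degree_counts.items())
    b = two_b // 2
    assignments = factorial(n)
    for nj in degree_counts.values():
        assignments //= factorial(nj)
    return assignments * factorial(two_b) // (2 ** b * factorial(b))


if __name__ == "__main__":
    print(count_by_enumeration({1: 2, 2: 1}), count_by_formula({1: 2, 2: 1}))
```

Loops and multiple bonds are counted automatically, because a pairing may join two half-edges of the same vertex or join the same two vertices more than once.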
Proof: Recall constraints (1) and (2) on the partition. The first factor gives the number of ways of assigning the degrees to the $n$ labelled vertices. The second factor gives the number of ordered pseudomultigraphs after the assignment of vertex degrees: if the vertex $V$ is to have degree $j$, make $j$ copies of the letter $V$, i.e. $V_1, V_2, \ldots, V_j$. There are now $2b$ letters. Permute them ($(2b)!$ ways) and pair the letters starting from the beginning of the sequence (i.e. position $2k - 1$ is paired with $2k$, $k = 1, 2, \ldots, b$). Join the paired half-edges in the pseudomultigraph. This overcounts the pseudomultigraphs by a factor $2^b b!$ since order within letter-pairs is immaterial ($2^b$) as is the order of the pairs themselves ($b!$). Note that
Eq. (7) shows that the exponential generating function of the pseudomultigraph weights is
$$U(v, \beta) = 1 + \sum_{n=1}^{\infty} \frac{v^n}{n!} \sum_{\{n_j\}} \frac{n!}{\prod_j n_j!} \frac{(2b)!}{2^b\, b!} \prod_j H_j^{n_j} \beta^b = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp[vH(\beta^{1/2} x)]\, e^{-\frac12 x^2}\, dx, \qquad (9)$$
where
$$H(x) = \sum_{j=0}^{\infty} H_j x^j.$$
The second sum in (9) runs over all $\{n_j\}$ satisfying constraints (1) and (2). The second equality results from substitution from (8), reversal of integration and summation and then elementary manipulation of the sum. Eq. (9) must usually be interpreted formally; the two sides diverge in general. Taking coefficients of $v^n$ on both sides generally yields a convergent expression for the weights of pseudomultigraphs on $n$ vertices however:
The expected proportion of the $n$ vertices which have degree $j$ is
since the effect of the operator $H_j\, \partial/\partial H_j(\cdot)$ is to weight each term of $U_n$ by the number of vertices of degree $j$ present in the corresponding pseudomultigraph.
We assume that $\beta = 1/(\gamma n)$, where $\gamma$ is a fixed constant and $n$ is the number of vertices in the random pseudomultigraph. This assumption usually ensures that the expected number of edges in the pseudomultigraph is proportional to $n$, a result desirable to chemists. Hence $p_j$ is the ratio of two integrals. We apply the method of steepest descents as $n \to \infty$ (Carrier, Krook and Pearson [2]) to find $p_j$:
where $s$ satisfies the saddle-point condition
The behaviour of the vertex degree distribution as $\gamma$ decreases (i.e. as $\beta$ and edge numbers increase) is dependent on the analytic behaviour of $s$. This may or may not be smooth as Whittle demonstrated, see Fig. 3.
If $C(v, \beta)$ is the exponential (in $v$) generating function of the weights $C_n(\beta)$ for the connected pseudomultigraphs on $n$ vertices, then $U(v, \beta)$ of Eq. (9) satisfies
Heuristic. Every pseudomultigraph can be decomposed into $k$ connected components. $C^k(v, \beta)$ is the exponential generating function for ordered $k$-tuples of connected pseudomultigraphs whose vertices are drawn from $n$ labelled points. $1/k!$ removes the $k$-tuple ordering. Percus [16] gives a full proof of this very general relationship between graphical and component generating functions.
Consider again the random pseudomultigraph on $n$ vertices. Let
Fig. 3. The graphs of $y = H'(s)/H(s)$ and $y = \gamma s$ are shown for $H(s) = e^s$ (3a) and $H(s) = \cosh s$ (3b) and 2 different values of $\gamma$. In (3a) the intersection of the two graphs does not change behaviour as $\gamma$ decreases. By contrast, the intersection in (3b) remains at the origin until $\gamma < 1$ when it bifurcates and moves steadily away from the origin.
where $E_n$ is the expected number of vertices of the component containing a random vertex. Using Eq. (15) and standard generating function techniques, Whittle [19] showed that $\varphi_n$ obeys the recursion
If $H(x) = e^x$, then Eq. (10) shows
Whittle derived the asymptotics of $\varphi_n$ for this case by the following novel technique: the terms $T_{m,k}$ have maxima amongst the roots of
tn ~
l-prn
,
pm61.
For $\beta m > 1$, $T_{m,k}$ has maxima at the two non-zero roots of Eq. (18), which are $O(m)$. For these $k$
Because $\beta(m - k) < 1$ for the positive root of Eq. (18), Eq. (19) implies $\varphi_{m-k} = O(m)$. $\varphi_{m-k}$ is therefore dominated by $\varphi_{m+k}$ in Eq. (20). If
$m + k = N$ and $\eta = \frac{2k}{N}$
then $\beta N > 1$ and Eq. (20) becomes
Eq. (18) yields
In this model Whittle's results are consistent with the following: as $\beta$ increases from zero all components are finite trees until $\beta n > 1$, when a component of size $\eta n$ appears. All cycles, loops and multiple bonds remain confined to this giant component. These statements are probably only approximately true as more precise statements have not been investigated yet.
4. The tree model
A model for trees, similar to those for graphs and pseudomultigraphs using the degree weights $\{H_j\}$ of §2 and §3, has not been investigated. Instead standard models begin with the distribution of vertex degree $\{p_j\}$. $p_j$ is the probability that a vertex has degree $j$ (cf. Eq. (12)). Let $P(s)$ be the ordinary generating function of $\{p_j\}$:
$$P(s) = \sum_{j=0}^{\infty} p_j s^j.$$
Because the correspondence between ordered and unordered trees is a function of tree partition alone (see Eq. (6)), we examine unordered trees as representative of both cases. The probability that the vertex on the end of a random edge has degree $k$ is
$$f_k = \frac{k p_k}{P'(1)}, \quad k = 1, 2, 3, \ldots.$$
Proof. The probability is the a priori probability of the vertex being of degree $k$ ($p_k$) weighted by the number of edges from the vertex ($k$). $1/P'(1)$ normalizes the $\{f_k\}$ into probabilities.
This effectively specifies the distribution of the trees as a branching process (Athreya and Ney [1]), see Fig. 4. Choose a vertex at random. Eq. (22) gives the distribution of the degree of this "progenitor" vertex. The progenitor's degree is the number of 1st generation "offspring" (those vertices adjacent to the progenitor). Each offspring vertex in the 1st generation has degree $k$ (i.e. $k - 1$ 2nd generation offspring and 1 parent) independently with probability $f_k$, as do offspring vertices in subsequent generations.
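The size-biased distribution $f_k$ is easy to compute from a given degree distribution. The sketch below uses hypothetical function names and an arbitrary example distribution; it also evaluates the mean number of next-generation offspring of a non-progenitor vertex, $\sum_k (k-1) f_k$, which by the standard theory of branching processes separates the subcritical and supercritical regimes at the value 1.

```python
from fractions import Fraction


def end_of_edge_distribution(p):
    """Given p[j] = P(degree j), return f with f[k] = k p[k] / P'(1) for k >= 1."""
    mean_degree = sum(j * pj for j, pj in enumerate(p))  # this is P'(1)
    return {k: k * pk / mean_degree for k, pk in enumerate(p) if k >= 1}


def mean_later_offspring(p):
    """Mean offspring number of a non-progenitor vertex: sum over k of (k - 1) f_k."""
    f = end_of_edge_distribution(p)
    return sum((k - 1) * fk for k, fk in f.items())


if __name__ == "__main__":
    p = [Fraction(1, 4)] * 4  # degrees 0, 1, 2, 3 equally likely (illustrative only)
    print(end_of_edge_distribution(p), mean_later_offspring(p))
```

In this example the mean is $4/3$, so the corresponding branching process would be supercritical.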
Fig. 4. (4b) shows the branching process resulting from choosing a random vertex in the tree in (4a); the generations are labelled GEN 0, GEN 1 and GEN 2.
Results about tree distributions in this model are derived from the corresponding branching process results. Straightforward extensions of this model are possible: multiple vertex and edge colours, directing, etc. Perhaps the most interesting extension is to assign to the vertices independently identically chosen random masses ($a$) and to let the vertex degree probabilities $\{p_j\}$ be functions of $a$. This last notion has obvious extension to the degree weights $\{H_j\}$ of §2 and §3.
It is likely that an approach to trees through degree weights $\{H_j\}$ would yield results similar to the branching process until a tree containing $O(n)$ of the $n$ vertices formed (this corresponds to a supercritical branching process). Thereafter (in certain chemical models), the branching process method yields results consistent with those at the end of §3 (despite our exclusion of cycles!).
Grimmett [13] has used special branching processes to enumerate trees; the branching process enumeration of trees by partition is central to the tree model of this section (Conclusion, Spouge [27]).
5. Conclusion

This paper attempts to bring some combinatorial problems suggested by polymer chemistry to the attention of random graph theorists. In §2 we assign probabilities to the graphs of the Erdős-Rényi scheme on the basis of graphical partition as well as number of edges. If applied to pseudomultigraphs (§3), this new scheme may produce non-smooth changes in the distribution of vertex degrees as the expected number of edges is increased. Whittle's analysis, which required novel asymptotic methods, showed that almost all finite components of a random pseudomultigraph are trees until the threshold for the appearance of a component of $O(n)$, where $n$ is the number of vertices of the pseudomultigraph. In §4 we examine the branching process model for trees and indicate a connection between branching processes and enumeration of trees by partition. There is considerable scope for mathematical investigation of these models since most of the work on them appears in the literature of polymer chemistry. I give polymer chemistry references in the Appendix.
Appendix

Flory's [5-7] $RA_f$ model is the paradigm of polymerization models. Stockmayer [24] gave the size distribution for this model. Flory [8, p. 192] disagreed with the interpretation of Stockmayer's result. Falk and Thomas [4] resolved the resulting debate by computer simulation (see also Ziff and Stell [23]). I gave analytic results for Flory's model (Spouge [29]), not realizing that it is the special case of Whittle's [19] pseudomultigraph model. The Stockmayer interpretation is equivalent to a tree model employing the same degree weights $\{H_k\}$. Gordon [10] and Good [9] introduced the Branching Process Tree Model and Gordon et al. [11] give a rigorous justification of its applicability. Gordon and Scantlebury [12] and Spouge [27] gave refinements equivalent to multicolouring vertices and edges respectively. Spouge [28] also allowed the vertices to have random mass. Flory's [8, Ch. 9] $A_g RB_{f-g}$ model is equivalent to employing directed trees. Spouge [25, 26, 28] gives solutions and refinements for this model.
References

[1] K. B. Athreya and P. E. Ney, Branching Processes (Springer-Verlag, New York, 1972).
[2] G. F. Carrier, M. Krook and C. E. Pearson, Functions of a Complex Variable, Theory and Technique (McGraw-Hill, New York, 1966).
[3] P. Erdős and A. Rényi, Math. Inst. Hung. Acad. Sci. Hung. 5A (1960) 17-61.
[4] M. Falk and R. E. Thomas, Can. J. Chem. 52 (1974) 3285.
[5] P. J. Flory, J. Am. Chem. Soc. 63 (1941) 3083-3090.
[6] P. J. Flory, J. Am. Chem. Soc. 63 (1941) 3091-3096.
[7] P. J. Flory, J. Am. Chem. Soc. 63 (1941) 3096-3100.
[8] P. J. Flory, Principles of Polymer Chemistry (Cornell University Press, Ithaca, New York, 1953).
[9] I. J. Good, Proc. R. Soc. Lond. A272 (1963) 54-59.
[10] M. Gordon, Proc. R. Soc. Lond. A268 (1962) 240-259.
[11] M. Gordon and T. G. Parker, Proc. R. Soc. Edinb. A69 (1970/1971) 181-192.
[12] M. Gordon and G. R. Scantlebury, Proc. R. Soc. Lond. A292 (1966) 380-402.
[13] G. R. Grimmett, J. Austral. Math. Soc. A30 (1980) 229-237.
[14] G. R. Grimmett, in: L. Beineke and R. Wilson, eds., Further Selected Topics in Graph Theory (Academic Press, 1983).
[15] F. Harary, Graph Theory (Addison-Wesley, London, 1969).
[16] J. K. Percus, Combinatorial Methods (Springer-Verlag, New York, 1971).
[17] P. Whittle, Proc. Camb. Phil. Soc. 61 (1965) 475-495.
[18] P. Whittle, Proc. R. Soc. Lond. A285 (1965) 501-519.
[19] P. Whittle, Adv. Appl. Prob. 12 (1980) 94-115.
[20] P. Whittle, Adv. Appl. Prob. 12 (1980) 116-134.
[21] P. Whittle, Adv. Appl. Prob. 12 (1980) 135-153.
[22] P. Whittle, Theory Prob. Appl. 26 (1980) 350-361.
[23] R. M. Ziff and G. Stell, J. Chem. Phys. 73 (1980) 3492-3499.
[24] Stockmayer.
[25] J. L. Spouge.
[26] J. L. Spouge.
[27] J. L. Spouge.
[28] J. L. Spouge.
[29] J. L. Spouge.
Annals of Discrete Mathematics 28 (1985) 263-304 © Elsevier Science Publishers B. V. (North-Holland)
FLOWS THROUGH COMPLETE GRAPHS
W-C. S. SUEN
School of Mathematics, University of Bristol, Bristol, England
We consider Ford and Fulkerson network flows [2] through a complete graph $G$ of which the edge capacities form a family of independent and identically distributed random variables on $[0, \infty)$. We study the case when $G$ has vertex set $\{0, 1, \ldots, n-2, \infty\}$, and edges which are independently directed so that for each edge joining a pair $i, j$ of vertices with $i < j$,

$$P(\text{the edge is directed from } i \text{ to } j) = r = 1 - P(\text{the edge is directed from } j \text{ to } i),$$

where $r \in (0, 1]$. We obtain asymptotic results concerning the maximum flow when a typical edge capacity $C$ has distribution

$$P(C > t) = p(n)(1 - F(t)), \qquad t \in [0, \infty),$$

where $0 < p(n) < 1$ and $F$ is a known distribution function concentrated on $(0, \infty)$. The problem studied here is a generalized version of a problem considered by Grimmett and Welsh in [7].
1. Introduction
We begin with a brief description of the concept of flows through a capacitated network. Suppose that $G = (V, E)$ is a directed graph with vertex set $V$ and edge set $E$, which is a set of ordered pairs of vertices in $V$. A capacitated network is obtained from the graph $G$ by associating each edge $e = (i, j)$ in $E$ with a non-negative number $C_{ij}$, called the capacity of the edge. Let $s$ and $t$ be two special vertices acting as the source and the sink respectively. A feasible flow $f$ of value $u = u(f)$ from $s$ to $t$ through $G$ is a non-negative function on $E$ so that the following conditions are satisfied.

$$\sum_{j : (i,j) \in E} f_{ij} - \sum_{j : (j,i) \in E} f_{ji} = \begin{cases} u & \text{if } i = s, \\ -u & \text{if } i = t, \\ 0 & \text{otherwise,} \end{cases} \tag{1.1}$$

$$f_{ij} \le C_{ij} \quad \text{for all } (i, j) \in E. \tag{1.2}$$
Therefore, condition (1.1) is effectively Kirchhoff's current law in that flow is conserved at each intermediate vertex, and condition (1.2) specifies the maximum amount of flow permitted to pass through each edge. The maximum flow $u^*$ is defined as

$$u^* = \sup\{u(f) : f \text{ is a feasible flow}\}.$$
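For deterministic capacities the supremum above can be computed by augmenting along shortest paths in the residual network. The following is a minimal sketch using the Edmonds-Karp variant of the Ford and Fulkerson method, which is a standard algorithm and not one described in this paper; the dictionary encoding of the capacities and the example network are illustrative assumptions.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Maximum flow from s to t; capacity maps directed edges (i, j) to C_ij >= 0."""
    residual = {}
    neighbours = {}
    for (i, j), c in capacity.items():
        residual[(i, j)] = residual.get((i, j), 0.0) + c
        residual.setdefault((j, i), 0.0)          # reverse residual edge
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)
    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            i = queue.popleft()
            for j in neighbours.get(i, ()):
                if j not in parent and residual[(i, j)] > 1e-12:
                    parent[j] = i
                    queue.append(j)
        if t not in parent:
            return flow                            # no augmenting path is left
        # Bottleneck residual capacity along the path found.
        bottleneck = float("inf")
        j = t
        while parent[j] is not None:
            bottleneck = min(bottleneck, residual[(parent[j], j)])
            j = parent[j]
        # Push the bottleneck amount along the path.
        j = t
        while parent[j] is not None:
            i = parent[j]
            residual[(i, j)] -= bottleneck
            residual[(j, i)] += bottleneck
            j = i
        flow += bottleneck


# Hypothetical five-edge network used only as an illustration.
example = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value = max_flow(example, "s", "t")
```

In the example all five units leaving the source can reach the sink, so conditions (1.1) and (1.2) are met with equality on the edges out of `s`.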
The investigation of the maximum flow through a capacitated network has wide applications in various operations research problems (see for example [2]). A number of efficient algorithms for finding the maximum flow when the edge capacities are deterministic have been found, and the subject is in general well understood. This is, however, not so when the edge capacities are random variables. In considering flows through complete graphs of $n$ vertices, Grimmett and Welsh [7] obtained asymptotic results when the capacity of each edge is drawn from a known distribution independent of $n$. (See also Grimmett and Suen [6].) In this paper, we shall consider a similar problem. Suppose that $K_n$ is a complete graph with vertex set $V_n = \{0, 1, \ldots, n-2, \infty\}$, in which every edge $e$ has a capacity $C_n(e)$. We assume that the collection of all edge capacities of $K_n$ is a family of independent random variables with a common distribution so that

$$P(C_n(e) = 0) = 1 - p(n), \qquad P(C_n(e) > t) = p(n)(1 - F(t)), \quad 0 \le t < \infty.$$
We assume that the edges of $K_n$ are directed according to the following rule: Let $r$ be a constant in $[0, 1]$. For each pair of vertices $i$ and $j$ with $i < j$, the edge joining $i$ and $j$ is directed from $i$ to $j$ with probability $r$, or from $j$ to $i$ with probability $1 - r$. Each edge is directed independently of all other edge directions as well as the edge capacities. With the vertices $0$ and $\infty$ acting as the source and sink, we denote by $X_n$ the maximum flow from $0$ to $\infty$ through $K_n$. We aim to obtain asymptotic results concerning the random variables $\{X_n\}$ in terms of the asymptotic behaviour of the sequence $\{p(n)\}$. Note that when $r = 0$, the problem is trivial and not interesting.
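The random network just described can be sampled directly. The sketch below is only illustrative: it encodes the vertex $\infty$ as the integer $n - 1$ so that the ordering $i < j$ is the usual one, and it takes $F$ to be the exponential distribution with mean 1, which is an assumption made here and not a choice of the paper.

```python
import random


def sample_network(n, p, r, draw_capacity, rng):
    """Sample the open edges of K_n with their directions and capacities.

    Vertices are 0, 1, ..., n-2 and n-1, where n-1 stands for the vertex
    'infinity' (an encoding chosen for this sketch). Returns a dict mapping
    each open directed edge (i, j) to its positive capacity.
    """
    capacity = {}
    for i in range(n):
        for j in range(i + 1, n):
            forward = rng.random() < r        # i -> j with probability r
            if rng.random() < p:              # the edge is open with probability p
                edge = (i, j) if forward else (j, i)
                capacity[edge] = draw_capacity(rng)
    return capacity


def exponential_capacity(rng):
    # Hypothetical choice of F: exponential with mean 1.
    return rng.expovariate(1.0)


rng = random.Random(0)
network = sample_network(6, 0.5, 1.0, exponential_capacity, rng)
```

With $r = 1$ every open edge in `network` points from the smaller label to the larger one, which is the situation of Theorems 3, 4 and 5.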
Let $G_n$ be the graph obtained by deleting the blocked edges (an edge is blocked if it has a zero capacity; otherwise it is said to be open) from the graph $K_n$. For integer $h \ge 1$, let $A(n, 0, h)$ be the subgraph of $G_n$ induced by all vertices of distance not exceeding $h$ from the vertex $0$. (A vertex $j$ is said to be of distance not exceeding $h$ from a vertex $i$ if $j$ is joined to $i$ by a path, directed from $i$, of length not greater than $h$.) Clearly the maximum flow from $0$ to $\infty$ through $K_n$ depends heavily on the structure of the graph $A(n, 0, h)$. We shall see later that the graph $A(n, 0, h)$ is "similar" to a multi-type branching process in a certain way. (See [3] for the concept of branching processes.) By exploring the relationship between a random graph and a multi-type branching process, we obtain results concerning maximum flows. (The papers [1] and [4] are two other examples where branching processes are considered in the study of random graph problems.) We are able to obtain the following results.
Theorem 1. Suppose that $r \in (0, 1)$, and that $np(n) \to \infty$ as $n \to \infty$. Then
Theorem 2. Suppose that $r \in (0, 1)$ and that $np(n) \to \alpha \in (0, \infty)$ as $n \to \infty$. Let $\alpha_c$ be a constant given by

$$\alpha_c = \begin{cases} \dfrac{1}{1-2r} \log \dfrac{1-r}{r} & \text{if } r \ne \tfrac{1}{2}, \\ 2 & \text{if } r = \tfrac{1}{2}. \end{cases}$$

(I) If $\alpha \le \alpha_c$, then $\lim_{n \to \infty} P(X_n = 0) = 1$.

(II) If $\alpha > \alpha_c$, then $X_n \to \min(T', T'')$ in distribution as $n \to \infty$, where $T'$ and $T''$ are two independent copies of a random variable $T_0(\alpha, r, F)$, which is the maximum flow through a branching tree (to be defined in the next section).
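The critical constant of Theorem 2 is easy to tabulate. The sketch below assumes that it is read as $\alpha_c = (1-2r)^{-1}\log((1-r)/r)$ for $r \ne 1/2$ and $\alpha_c = 2$ for $r = 1/2$, the value 2 being the limit of the first expression as $r \to 1/2$; the function name is illustrative.

```python
import math


def alpha_c(r):
    """Critical value of alpha for a direction parameter r in (0, 1).

    Assumed form: log((1 - r) / r) / (1 - 2r) for r != 1/2, and 2 at r = 1/2.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    if abs(r - 0.5) < 1e-9:
        return 2.0
    return math.log((1.0 - r) / r) / (1.0 - 2.0 * r)


# The threshold is symmetric about r = 1/2 and grows as r moves away from it.
table = {r: alpha_c(r) for r in (0.1, 0.3, 0.5, 0.7, 0.9)}
```

Under this reading the threshold is smallest at $r = 1/2$, so an unbalanced direction rule requires a larger mean degree before a non-zero flow becomes possible.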
Remarks. When $r = 1/2$, the results in Theorems 1 and 2 can be translated to corresponding results concerning maximum flows through undirected complete graphs. (See [2] for the concept of flows through undirected graphs.) This is due to a theorem from McDiarmid [8] which relates undirected graphs to directed graphs. Therefore, the analysis we give for flows through directed graphs could be adapted to deal with flows through undirected graphs. When $r = 1$, the edge joining two vertices $i$ and $j$ in $K_n$ is directed (almost surely) from $i$ to $j$ if and only if $i < j$. This method of directing edges means that some vertices can have many outgoing open edges while others can have few. This imbalance greatly reduces the chances of having a non-zero flow.

Theorem 3. Suppose that $r = 1$, and that $np(n)/\log n \to \infty$ as $n \to \infty$. Then
Theorem 4. Suppose that $r = 1$. (i) If for some $\alpha > 1$, $np(n)$ [...] then

$$\lim_{n \to \infty} P(X_n = 0) = 1.$$

[...] $X_n / (np(n)) \to$ [...] $(\alpha - 1)/\alpha$ in probability as $n \to \infty$.
We have not been able to obtain sharp results concerning the asymptotic behaviour of $\{X_n\}$ for the case where $r = 1$ and $np(n)/\log n \to 1$ as $n \to \infty$. Nevertheless, the following result concerning the connectedness of the vertices $0$ and $\infty$ by open paths is obtained.

Theorem 5. Suppose that $r = 1$, and $np(n) = \log n - \log\log(n^{\beta})$, with $\beta > 0$. Then

$$\lim_{n \to \infty} P(X_n = 0) = \int_0^{\infty}\!\!\int_0^{\infty} e^{-xy/\beta}\, e^{-x}\, e^{-y}\, dx\, dy.$$

Remarks. The integral in Theorem 5 can be written as

$$\beta e^{\beta} \int_{\beta}^{\infty} \frac{e^{-z}}{z}\, dz.$$
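The identity in the remark can be checked numerically. The sketch below assumes the double integral is $\int_0^\infty\int_0^\infty e^{-xy/\beta}e^{-x}e^{-y}\,dx\,dy$ and the single integral is $\beta e^{\beta}\int_\beta^\infty z^{-1}e^{-z}\,dz$; both are evaluated with a plain midpoint rule on a truncated range, and the function names, step sizes and truncation points are illustrative choices.

```python
import math


def double_integral(beta, upper=30.0, n=600):
    """Midpoint rule for the double integral over [0, upper]^2."""
    step = upper / n
    points = [(k + 0.5) * step for k in range(n)]
    weights = [math.exp(-x) for x in points]          # e^{-x} and e^{-y} factors
    total = 0.0
    for x, wx in zip(points, weights):
        inner = 0.0
        for y, wy in zip(points, weights):
            inner += math.exp(-x * y / beta) * wy
        total += wx * inner
    return total * step * step


def single_integral(beta, upper=40.0, n=40000):
    """Midpoint rule for beta * e^beta * integral of e^{-z}/z over [beta, beta + upper]."""
    step = upper / n
    total = 0.0
    for k in range(n):
        z = beta + (k + 0.5) * step
        total += math.exp(-z) / z
    return beta * math.exp(beta) * total * step


lhs = double_integral(1.0)
rhs = single_integral(1.0)
```

For $\beta = 1$ the two quadratures agree to about three decimal places, which is consistent with integrating out $x$ first to obtain $\beta\int_0^\infty e^{-y}/(\beta + y)\,dy$ and then substituting $z = \beta + y$.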
Section 2 is a collection of useful results concerning a multi-type branching process. Section 3 relates the graph $G_n$ to a branching process; it also contains the proof of Theorem 5. Section 4 is devoted to the proofs of Theorems 2 and 4. The proofs of Theorems 1 and 3 are not given in this paper.
2. A multi-type branching process

This section is a collection of appropriate results dealing with branching processes, which will be useful to our analysis of maximum flows through complete graphs. We shall first describe a multi-type branching process which is an example of the general branching process defined by Harris in [3]. Thinking of a branching process as a tree, we then consider flows through a branching process in the sense of Grimmett and Welsh [7]. We do not intend to present a complete picture of multi-type branching processes as that would be beyond the scope of this paper, but we shall explain, without rigorously proving, the results stated in this section.

A nonhomogeneous Poisson process (see also [5] for a definition) with rate function $\lambda(\cdot)$ is a jump process $\{NPP(x); x \ge 0\}$ taking values in $N = \{0, 1, \ldots\}$ such that

(i) $NPP(t)$ = number of jumps the process makes in the interval $[0, t]$;

(ii) if $p(k, x, y)$ is the probability that the process makes $k$ jumps in the interval $[x, y]$, then $p(k, x, y)$ satisfies
(a) $p(0, x, x + dx) = 1 - \lambda(x)\,dx + o(dx)$,
(b) $p(1, x, x + dx) = \lambda(x)\,dx + o(dx)$,
(c) $\sum_{i=2}^{\infty} p(i, x, x + dx) = o(dx)$;
(iii) the numbers of jumps the process makes over disjoint intervals are independent.

Suppose that $\alpha$ and $r$ are constants satisfying $\alpha \in (0, \infty)$ and $r \in [0, 1]$. Let $\gamma : [0, 1] \times [0, 1] \to [0, \infty)$ be a function given by

(2.1)
For any $t \in [0, 1]$, we denote by $NPP_t$ a nonhomogeneous Poisson process with rate function $\lambda_t(\cdot)$, where $\lambda_t(x)$ satisfies for all $t \in [0, 1]$,
We denote by $S(t)$ the set of points at which the process $NPP_t$ makes a jump. It is clear that $S(t)$ is almost surely a finite set of points in $[0, 1]$. We next define a discrete-time branching process $\{Z_m\}_{m=0}^{\infty}$ with rate function $\gamma$ of the form given by (2.1). Each particle in the process has a type which is a number in the interval $[0, 1]$. The state $Z_m$ of the process at epoch $m$ is the set of types of all particles born at epoch $m$. These particles are called the $m$-th generation particles. We shall always assume that the process $\{Z_m\}$ starts with a single particle. (The type of the progenitor may not be specified if no confusion would arise from the omission.) At each epoch $m \ge 1$, every particle in the $(m-1)$-th generation gives birth to a set of offspring by looking at a nonhomogeneous Poisson process $NPP_x$, where $x$ is the type of the particle concerned. If $S(x)$ is the set of all jump points of the process $NPP_x$, then the particle $x$ (that is, the particle with type $x$) branches into $|S(x)|$ children so that each child has a different type from the set $S(x)$. We make the following two assumptions.

(i) Conditioned on $Z_m$, the family $\{NPP_x; x \in Z_m\}$ is a collection of independent nonhomogeneous Poisson processes, as defined after (2.1).

(ii) The process $\{Z_m\}$ is Markov and time homogeneous in that the future states of the process depend only on the present state of the process.
It is clear that with probability 1, no two particles in the process can have the same type. We denote by $P_t$ and $E_t$ respectively the probability measure and the expectation operator of the process $\{Z_m\}$ with a progenitor of type $t \in [0, 1]$.

Remarks. When $r = 1/2$, the function $\gamma(t, x)$ equals $\alpha/2$ for all $t, x$ in $[0, 1]$. The branching process $\{Z_m\}$ defined above can be reduced to a Galton-Watson process (see [3] for a definition) having a typical family size distributed as a Poisson random variable with parameter $\alpha/2$. Although results of this process are in general well-known, they are nevertheless included in the following theorems which will be needed when we deal with flows.

For any integer $m \ge 1$, we denote by $q_m(t)$ the probability of extinction by the $m$-th generation of the process $\{Z_m\}$ with a progenitor of type $t$ in $[0, 1]$. That is,

$$q_m(t) = P_t(Z_m = \emptyset).$$

By conditioning on $Z_1$, it can be shown that
Routine analysis shows that $q_m(t)$ converges uniformly to a limit $q(t)$ on $[0, 1]$ as $m \to \infty$, where $q(t)$ is known as the probability of ultimate extinction of the process $\{Z_m\}$. We therefore have that

(2.3)
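The limit $q(t)$ can be approximated by iterating the extinction recursion on a grid. The sketch below rests on two assumptions that are labelled as such: the rate function is taken to be $\gamma(t, x) = \alpha r$ for $x > t$ and $\alpha(1-r)$ for $x < t$, a form consistent with Theorem 6 and with the remark that $\gamma \equiv \alpha/2$ when $r = 1/2$, and the recursion is taken in the standard Poisson-offspring form $q_{m+1}(t) = \exp\{-\int_0^1 \gamma(t, x)(1 - q_m(x))\,dx\}$ with $q_0 \equiv 0$. The function name, grid size and iteration count are illustrative.

```python
import math


def extinction_profile(alpha, r, n=400, iterations=500):
    """Approximate q(t) at the midpoints t_i = (i + 1/2) / n of a uniform grid.

    Assumed rate function: alpha * r for x > t and alpha * (1 - r) for x < t.
    """
    h = 1.0 / n
    q = [0.0] * n                                   # q_0(t) = 0
    for _ in range(iterations):
        survive = [1.0 - value for value in q]
        total = sum(survive) * h                    # integral of 1 - q over [0, 1]
        updated = []
        below = 0.0                                 # integral over the cells left of cell i
        for i in range(n):
            lower = below + 0.5 * survive[i] * h    # integral over [0, t_i]
            upper = total - lower                   # integral over [t_i, 1]
            exponent = alpha * ((1.0 - r) * lower + r * upper)
            updated.append(math.exp(-exponent))
            below += survive[i] * h
        q = updated
    return q


q_balanced = extinction_profile(3.0, 0.5)           # Galton-Watson case
q_skewed = extinction_profile(4.0, 0.3)
v = sum(q_skewed) / len(q_skewed)                    # approximates the integral of q
```

Under these assumptions `q_balanced` is constant in $t$ and solves $s = e^{-\alpha(1-s)/2}$, while the average `v` of `q_skewed` can be compared with the smallest root of the equation in Theorem 7(i).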
By substituting (2.1) into (2.3), we obtain the following two theorems.

Theorem 6. Suppose that $r = 1$. Then $q(t) = 1$ for all $t \in [0, 1]$. Furthermore, if $M(t)$ is the size of the entire population of the process with a progenitor $t \in [0, 1]$, then $M(t)$ is a geometric random variable with parameter $\theta_t = \exp\{-\alpha(1 - t)\}$. That is,

$$P_t(M(t) = j) = \theta_t (1 - \theta_t)^{j-1}, \qquad j = 1, 2, \ldots. \tag{2.4}$$
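Theorem 6 lends itself to a quick Monte Carlo check. The sketch below assumes that for $r = 1$ a particle of type $x$ has children at the points of a homogeneous Poisson process of rate $\alpha$ on $(x, 1]$, which is the reading of the rate function suggested by the parameter $\theta_t = \exp\{-\alpha(1-t)\}$; the function name, seed and sample size are illustrative.

```python
import math
import random


def total_population(alpha, t, rng):
    """Size of the whole population started by a progenitor of type t when r = 1.

    Assumption: a particle of type x has children at the points of a
    homogeneous Poisson process of rate alpha on the interval (x, 1].
    """
    pending = [t]
    size = 0
    while pending:
        x = pending.pop()
        size += 1
        # Successive exponential gaps generate the Poisson points above x.
        position = x + rng.expovariate(alpha)
        while position <= 1.0:
            pending.append(position)
            position += rng.expovariate(alpha)
    return size


rng = random.Random(1)
alpha, t = 1.0, 0.0
theta = math.exp(-alpha * (1.0 - t))
sizes = [total_population(alpha, t, rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)          # compare with 1 / theta
fraction_alone = sizes.count(1) / len(sizes)  # compare with theta
```

A geometric variable with parameter $\theta_t$ has mean $1/\theta_t$ and takes the value 1 with probability $\theta_t$, so both empirical quantities should be close to these values.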
Theorem 7. Suppose that $r \in (0, 1)$. We have two cases:

(i) For $r \ne 1/2$, let $v = \int_0^1 q(t)\,dt$. Then $v$ is equal to the smallest root of the following equation in $s$:

$$e^{-\alpha r s} - e^{-\alpha r} = e^{-\alpha(1-r)s} - e^{-\alpha(1-r)},$$

and $q(t)$ is given by

(ii) For $r = 1/2$, we have

$$q(t) = q \quad \text{for all } t \in [0, 1],$$

where $q$ equals the smallest root of the following equation in $s$:

Moreover, we have a critical phenomenon in that there is a constant $\alpha_c$, where

(2.5)
such that, (a) if $\alpha$

such that the random variables

form a martingale (see [5] for a definition). The martingale convergence theorem (see [5]) gives that this sequence of random variables converges almost surely as $m \to \infty$. The convergence suggests that the process $\{Z_m\}$ grows exponentially if it does not die out. The following theorem gives estimates of the population sizes of the branching process $\{Z_m\}$.
Theorem 8. Suppose that $\alpha > \alpha_c$. For $t \in [0, 1]$, let $N(t, m)$ be the size of the $m$-th generation of the process $\{Z_m\}$ with a progenitor of type $t$. Then

(2.7)

and if $\varepsilon$ is a constant in $(0, 1)$, then

(2.8)

Furthermore, if $M(t, m)$ is the total number of particles in the first $m$ generations of the process, then as $m \to \infty$,

(2.9)

and

(2.10)
Remarks. Readers who would like to explore further into the subject of branching processes could consult Harris [3]. It should be noted that Harris considered [3, Ch. III] multi-type branching processes in a general setting. He placed certain conditions on the process and such conditions, when translated to our process $\{Z_m\}$, cover only the case where $r$ is in $(0, 1)$.

The rest of the section is concerned with flows through branching trees. (See Grimmett and Welsh [7] for flows through Bethe lattices.) The results stated below will only be needed when we prove Theorem 2. For any $t \in [0, 1]$, let $\omega_t$ be a realization of a branching process with a progenitor $t$ and rate function $\gamma$ given by (2.1). We think of $\omega_t$ as a tree (in which two particles are joined by an edge if they have a parent-child relationship) with the progenitor acting as the root of the tree. We associate each edge $e$ of $\omega_t$ with a non-negative random variable $U(e)$ as the edge capacity of $e$. We assume that the random variables $\{U(e); e \text{ is an edge in } \omega_t\}$ are independent and have common distribution function $F$ concentrated on $(0, \infty)$ with finite mean. Let $\omega' = \omega'(\omega_t)$ be a realization of the edge capacities of the tree $\omega_t$. A network $\Theta_{m,t} = \Theta_{m,t}(\omega_t, \omega')$ is obtained from the capacitated tree by connecting each $m$-th generation vertex of $\omega_t$ to a new vertex $\infty'$ by an edge of infinite capacity. We denote by $T_{m,t} = T_{m,t}(\alpha, r, F)$ the maximum flow through $\Theta_{m,t}$ from the root of $\omega_t$ to the vertex $\infty'$, where the parameters $\alpha$ and $r$ in $T_{m,t}(\alpha, r, F)$ specify the rate function $\gamma$ (see (2.1)) of the underlying branching process, and the function $F$ is the distribution function of a typical edge capacity. If $\omega_t$ has become extinct by the $m$-th generation, then $T_{m,t} = 0$. Suppose that $\omega_t$ and $\omega'$ are as defined above. Then we have that for $m \ge 1$,

$$T_{m+1,t} \le T_{m,t}, \tag{2.11}$$
since a flow in $\Theta_{m+1,t}$ must pass through the $m$-th generation vertices in the tree $\omega_t$. Thus the sequence of random variables, $T_{m,t} = T_{m,t}(\alpha, r, F)$, converges for every realization $\omega_t$ of the process $\{Z_m\}$ and every realization $\omega' = \omega'(\omega_t)$ of the edge capacities of $\omega_t$. We let $T_t = T_t(\alpha, r, F)$ be the limit. (Note. This gives the definition of the random variable $T_0(\alpha, r, F)$ which appeared in Theorem 2.) Writing $Z_{1,t}$ as the set of all particles in the first generation of the process $\{Z_m\}$ where $Z_0 = \{t\}$, it is easy to check that for $t \in [0, 1]$,

$$T_{m+1,t} = \sum_{x \in Z_{1,t}} \min(U_x, \tilde{T}_{m,x}), \tag{2.12}$$

where for $x \in Z_{1,t}$, $U_x$ is the capacity of the edge joining the progenitor $t$ to the particle $x$, and $\tilde{T}_{m,x}$ is the maximum flow from $x$ to $\infty'$ through $\Theta_{m+1,t}$ without passing through the edge joining $t$ and $x$. It is clear that $\tilde{T}_{m,x}$ is equal to $T_{m,x}$ in distribution and that when $Z_{1,t} = Z \ne \emptyset$, the random variables in $\{\tilde{T}_{m,x}, U_x; x \in Z\}$ are independent.

Theorem 9. With notation defined as before, we have that
$$P(T_t = 0) = q(t), \tag{2.13}$$

where $q(t)$ is the probability of ultimate extinction of the underlying process $\{Z_m\}$ with a progenitor of type $t$ and rate function $\gamma$ given by (2.1).

Eq. (2.13) is proved by first noting that

$$\{T_t = 0\} \supseteq \{Z_m = \emptyset \text{ for some } m \ge 1\},$$

which implies that $P(T_t = 0) \ge q(t)$. This shows (2.13) when $\alpha$

where $q(\delta, t)$ is the probability of ultimate extinction of a process $\{Z'_m\}$ which starts with a progenitor of type $t$ and has a rate function $\gamma'$ satisfying $\gamma'(t, x) = (1 - F(\delta))\gamma(t, x)$ for all $t, x \in [0, 1]$. Because $q(\delta, t)$ converges to $q(t)$ as $\delta$ decreases to $0$, Eq. (2.13) is shown.

We next consider a generalized version of the network $\Theta_{m,t}$. Suppose that $\omega_t$ and $\omega'$ are defined as before. We form a network $\Theta'_{m,t}$ by joining each $m$-th generation vertex $x$ of $\omega_t$ to a new vertex $\infty''$ with an edge having a capacity $U'_x$. We assume that the family $\{U'_x; x \in Z_m\}$ is a collection of independent and identically distributed non-negative random variables. We denote by $T'_{m,t}$ the maximum flow from the root of $\omega_t$ to the vertex $\infty''$ through $\Theta'_{m,t}$. Similar to (2.12), we have the following equations.

(2.15)
Theorem 10. With notation defined as above, suppose that the random variables $\{U'_x\}$ have a common distribution given by

$$P(U'_x = \delta) = 1 - P(U'_x = 0) = p,$$

where $\delta \in (0, \infty)$ and $p \in (0, 1]$. Then the sequence of random variables $T'_{m,t}$ converges in distribution to the random variable $T_t$ as $m \to \infty$.
The central idea of the proof of Theorem 10 is as follows. For $m \ge 1$, let $E_{m,t}$ be the event that the network $\Theta'_{m,t}$ has a minimum separating cutset (see [2] for cutset definitions) that contains only edges in the tree $\omega_t$. For those realizations in which $\omega_t$ has an empty $m$-th generation, we adopt the convention that $E_{m,t}$ has occurred. If $E_{m,t}$ occurs, then replacing by infinity the capacities of those edges joining the $m$-th generation vertices to the vertex $\infty''$ would not alter the maximum flow through $\Theta'_{m,t}$. Therefore, when conditioned on $E_{m,t}$ having occurred,

$$T'_{m,t} = T_{m,t}.$$

Theorem 10 is thus established if

$$\lim_{m \to \infty} P(E_{m,t}) = 1. \tag{2.16}$$

Eq. (2.16) is true because a branching process grows exponentially if it does not die out, and therefore "bottlenecks" are likely to occur in the early generations.
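The truncated tree flows $T_{m,t}$ used in this section can be evaluated recursively on any finite realization. The sketch below assumes the recursion is read as $T_{m+1,t} = \sum_{x} \min(U_x, \tilde{T}_{m,x})$ over the first-generation particles $x$, with the convention that a vertex of generation $m$ is joined to the sink by an edge of infinite capacity. The nested-list encoding of the tree, the function name and the example tree are illustrative assumptions.

```python
import math


def tree_flow(children, m):
    """Maximum flow from a vertex to the sink attached at depth m below it.

    children is a list of (capacity, subtree) pairs, where each subtree has
    the same form. A vertex at depth m is joined to the sink by an edge of
    infinite capacity; a childless vertex above depth m passes no flow.
    """
    if m == 0:
        return math.inf
    return sum(min(capacity, tree_flow(subtree, m - 1))
               for capacity, subtree in children)


# Hypothetical realization: the root has two children, A and B.
# A has two children, one of which has a single child; B is childless.
tree_a = [(0.5, []), (1.0, [(0.25, [])])]
tree_b = []
root = [(2.0, tree_a), (1.0, tree_b)]
flows = [tree_flow(root, m) for m in (1, 2, 3, 4)]
```

On this realization the values are non-increasing in $m$, in line with the monotonicity used to define $T_t$, and they reach zero once the truncation depth exceeds the height of the tree, which is the extinct case.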
Remarks. Although we have not been able to find, in general, the distribution of the random variable $T_t(\alpha, r, F)$ when $\alpha > \alpha_c$, certain estimates can be obtained. From (2.14), we have that for sufficiently small $\delta > 0$,

$$P(T_t > \delta) \ge 1 - q(\delta, t) \ge \inf\{1 - q(\delta, t);\ t \in [0, 1]\} = \eta,$$

which is positive by Theorem 7 for $\alpha > \alpha_c$. Thus, if the random variables $\{U'_x\}$ defined before Theorem 10 have a common distribution given by

$$P(U'_x = \delta) = 1 - P(U'_x = 0) = \eta,$$

it is easy to show that

$$T'_{m,t} \le T_t \quad \text{in distribution for } m = 1, 2, \ldots.$$

Also, from (2.11), we have for every realization,

$$T_t \le T_{m,t}.$$

Because $T'_{m,t} \to T_t$ in distribution, and $T_{m,t} \to T_t$ for every realization, it is therefore possible to obtain good upper and lower bounds for $E(T_t)$ by using Eqs. (2.12) and (2.15). The atom of $T_t$ at zero is given by Theorem 9.
3. Subgraphs of $G_n$ and Proof of Theorem 5

Let $G_n$ be the graph obtained by deleting all blocked edges from the capacitated network $K_n$. Notice that the distribution of $G_n$ depends on two parameters, $r$ and $\alpha(n) = np(n)$. For $i \in V_n$, and for any integer $h \ge 1$, let $A(n, i, h)$ be the subgraph of $G_n$ induced by all vertices of distance not more than $h$ from $i$. We shall obtain certain properties of the graph $A(n, i, h)$ by applying the results stated in the previous section on branching processes. This is done by constructing two rooted trees. We shall be mainly interested in the following two cases:

(i) $r \in (0, 1)$ and $np(n) \to \alpha \in (\alpha_c, \infty)$ as $n \to \infty$,

(ii) $r = 1$ and $np(n)/\log n \to \alpha \in (0, \infty)$ as $n \to \infty$.
Construction $C'(m, \alpha(m), r, i, h, \varphi)$. Given positive integers $m$, $h$, and a function $\varphi : V_m \times V_m \to [0, 1]$, we construct, for each realization $\omega$ of the graph $G_m$ with parameters $r$ and $\alpha(m) = mp(m)$, a tree $GT(m, i, h) = GT(\omega; m, i, h)$ with root $i \in V_m$ and height not greater than $h$ (where $h$ may depend on $m$). The tree $GT(m, i, h)$ can be regarded as a "random graph" whose randomness is derived from the graph $G_m$ and from the probability rules of the colouring process set out below in the construction.

(i) The vertex $i$ is coloured red, and acts as the root of the tree $GT(m, i, h)$.

(ii) We construct each stratum of the tree $GT(m, i, h)$ in turn. Suppose that we have formed the $l$-th stratum ($l = 0, 1, \ldots, h-1$). We form the $(l+1)$-th stratum by performing a procedure COLOUR' (to be specified below) on each vertex in the $l$-th stratum, starting from the vertex with the smallest label, and working through to the vertex with the biggest label in ascending order. The procedure COLOUR' has an input variable $j$, where $j$ is the label of the vertex on which the procedure is performed.

Procedure COLOUR'($j$). Let $S_j$ be the set of vertices joined to the vertex $j$ by an edge (in $G_m$) directed from $j$. For each $k \in S_j$, if $k$ is not coloured, then the vertex $k$ and the edge $(j, k)$ are coloured red with probability $\varphi(j, k)$. Each such colouring is done independently of all other colourings. The vertices coloured by the procedure COLOUR'($j$) are the offspring vertices of the vertex $j$ in the tree $GT(m, i, h)$.

(iii) The colouring process given in (ii) is first applied to the root $i$ to obtain the first stratum, and then to the vertices in the first stratum to form the second stratum. This is continued in the manner set out in (ii) until one of the following occurs: (a) a tree of height $h$ is formed, or (b) the construction gives an empty stratum at some stage.
Construction $C''(m, i, h, \gamma_m)$. Given a positive integer $m$, a vertex $i \in V_m$, and a function $\gamma_m : [0, 1] \times [0, 1] \to [0, \infty)$, let $\{Z_l^{(m)}\}$ be a multi-type branching process with a progenitor $t$, where $t = i/m$ if $i \ne \infty$ and $t = (m-1)/m$ if $i = \infty$, and rate function $\gamma_m$. For each realization $\omega_t$ of the process $\{Z_l^{(m)}\}$, we shall construct a labelled tree $BT(m, i, h) = BT(\omega_t; m, i, h)$ with root $i$ and height not exceeding $h$. To do that, we first associate each vertex $j \in V_m$ with an interval $I(j) \subset [0, 1]$ so that

$$I(j) = \left[\frac{j}{m}, \frac{j+1}{m}\right) \quad \text{for } j = 0, 1, \ldots, m-2, \qquad I(\infty) = \left[\frac{m-1}{m}, 1\right].$$

Let $J : [0, 1] \to V_m$ be a function so that

$$J(x) = j \quad \text{if } x \in I(j).$$

The construction of the tree $BT(m, i, h)$ is as follows.

(i) The progenitor is coloured red, and is labelled with $i$.

(ii) We colour the particles in each generation in turn, starting from the first generation. When every particle in a generation is coloured, we proceed to colour the particles in the next generation. To colour the particles in the $(l+1)$-th generation ($l = 0, 1, \ldots, h-1$), we perform a procedure COLOUR'' (to be specified below) on each particle in the $l$-th generation, starting from the particle with the smallest type, and working through to the particle with the biggest type in ascending order. The procedure COLOUR'' has an input variable $x$, where $x$ is the type of the particle on which the procedure is performed.

Procedure COLOUR''($x$). Let $S(x)$ be the set of all immediate offspring of the particle $x$. There are two possibilities: (a) If $x$ is red, then we sample each point in $S(x)$, starting from the one with the smallest type. If $y \in S(x)$ and the label $J(y)$ has not been used, then the particle $y$ is coloured red and labelled with $J(y)$. If $y \in S(x)$ and $J(y)$ has been used, then particle $y$ is coloured blue. (b) If $x$ is blue, then every particle in $S(x)$ is coloured blue.

(iii) The colouring and labelling process is first performed on the progenitor so that every particle in the first generation is coloured. The construction is continued by following the method set out in (ii) until one of the following occurs: (a) every particle in the first $h$ generations of the process $\{Z_l^{(m)}\}$ is coloured, or (b) every particle in the process is coloured if the process has died out by the $h$-th generation.

(iv) We think of the red labelled particles as vertices. For each pair of vertices $J(x)$ and $J(y)$ (that is, particles $x$ and $y$), an edge is drawn from $J(x)$ to $J(y)$ if and only if the particle $x$ is the immediate parent of the particle $y$. The resulting graph is named $BT(m, i, h)$. Clearly it is a tree with root $i$, and has height not exceeding $h$.
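Steps (i) to (iv) of Construction C'' can be sketched for a given realization of the branching process. The dictionary encoding of the realization (particle type to list of offspring types), the use of the integer $m - 1$ as the label of $\infty$, and the names below are assumptions made for this illustration. The root label is computed as $J(t)$, which agrees with $i$ when the progenitor has type $t = i/m$.

```python
def construct_bt(children, progenitor, m, height):
    """Colour a realization of the process and return (edges, blue particles).

    children maps a particle type to the list of types of its offspring.
    J(x) = min(floor(x * m), m - 1), where m - 1 plays the role of infinity.
    """
    def label_of(x):
        return min(int(x * m), m - 1)

    label = {progenitor: label_of(progenitor)}   # red particles and their labels
    used = {label[progenitor]}
    blue = set()
    edges = []
    generation = [progenitor]
    for _ in range(height):
        next_generation = []
        for x in sorted(generation):             # ascending order of types
            # Procedure COLOUR''(x), cases (a) and (b) handled together.
            for y in sorted(children.get(x, ())):
                if x in label and label_of(y) not in used:
                    label[y] = label_of(y)
                    used.add(label[y])
                    edges.append((label[x], label[y]))
                else:
                    blue.add(y)
                next_generation.append(y)
        if not next_generation:                  # the process has died out
            break
        generation = next_generation
    return edges, blue


# Hypothetical realization with m = 4 and a progenitor of type 0.
realization = {0.0: [0.3, 0.35, 0.6], 0.3: [0.8], 0.35: [0.9], 0.6: [0.85]}
edges, blue = construct_bt(realization, 0.0, 4, 2)
```

In the example the particle of type 0.35 is blue because the label 1 has already been given to the particle of type 0.3; its child is then blue as well, and the particle of type 0.85 is blue because the label 3 was used first by the particle of type 0.8.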
The idea behind the construction is as follows. The constructions serve as means of finding "approximations" to a graph $A(m, i, h)$ by a tree $GT(m, i, h)$ and to a branching process $\{Z_l^{(m)}\}$ by a tree $BT(m, i, h)$. The trees $GT$ and $BT$ are random. By choosing suitable parameters in the constructions, we force the trees $GT$ and $BT$ to have the same distribution. Certain inferences concerning the graph $A(m, i, h)$ can now be drawn by looking at the process $\{Z_l^{(m)}\}$.

We now proceed to prove Theorem 5 by applying the constructions given above. We shall make use of three lemmas, proofs of which can be found in the Appendix. Note that by setting $h \ge m$, we place no external restriction on the heights of the trees $GT(m, i, h)$ and $BT(m, i, h)$ because their heights can never exceed $m$.
Lemma 11. Suppose that $r = 1$, and that $\alpha(m)$ satisfies

Let $\varphi : V_m \times V_m \to [0, 1]$ be the constant function,

Set $h = m$ and $i = 0$. Let $M_2(m, 0)$ be the size of the tree $GT(m, 0, m)$ obtained from the construction $C'(m, \alpha(m), 1, 0, m, \varphi)$. Let $M_1(m, 0)$ be the total number of vertices joined to the vertex $0$ by paths directed from $0$ in the graph $G_m$ with parameters $r$ and $\alpha(m)$ as given above. Then

$$P(M_1(m, 0) = M_2(m, 0)) = 1. \tag{3.3}$$

Lemma 12. Suppose that $r = 1$, and that $\alpha(m)$ satisfies

Let $\gamma_m : [0, 1] \times [0, 1] \to [0, \infty)$ be a function given by

0 otherwise.

Consider the construction $C''(m, i, h, \gamma_m)$. Set $h = m$ and $i = 0$. Let $D(m, 0)$ be the number of particles coloured blue in the construction. Then, as $m \to \infty$, we have
Lemma 13. Let $\Omega(m, 0)$ be the set of all labelled trees with root $0$ and with vertex sets which are subsets of the set $V_m$. Consider the constructions $C'(m, \alpha(m), 1, 0, m, \varphi)$ and $C''(m, 0, m, \gamma_m)$, where $\alpha(m)$ and $\varphi$ satisfy the hypotheses of Lemma 11 and $\gamma_m$ is given by (3.4). Then for each $W$ in $\Omega(m, 0)$, we have

$$P(\{\omega : GT(\omega; m, 0, m) = W\}) = P_0(\{\omega_0 : BT(\omega_0; m, 0, m) = W\}).$$

That is, both trees have the same distribution.

With the help of the lemmas, the proof of Theorem 5 is now straightforward. From this point onwards, we shall sometimes use non-integral quantities in places where integers are required. Such an aberration makes no essential difference to our analysis.
Proof of Theorem 5. We first divide the vertex set V,, of the graph G,, into two sets V,,o and V,,m so that
and V,,, = V,,- V,,,o . Let M I
(respectively M I
) be the number of
vertices in the set V,,, (respectively Vn,), joined to the vertex 0 (respectively co) by paths directed from the vertex 0 (respectively towards the vertex a).It is clear are independent, and that P ( X n = O ) = P ( 3 no path directed from 0 to
00
i n G,,)
of G,, on the vertex set V n , o This . graph cr'(m) 1 where -4has -- vertices with parameters r = l and a' 2 logm 2 as m+co by the hypotheses of Theorem 5. We now apply construction Consider the induced subgraph Gn
2'0
n
d(+, a'(+),
1,0,;.
q ) to the graph G ; , , , where 1 is given by (3.21, and
FIows through complete graphs
219
(q)
to the branching process {Zl } with a progenitor of type 0, and rate function
yfl
as given by (3.4) in which a'(m) is substi-
2
("z )
tuted for a(m). Let M --, 0
1 -i-,
(i
and let D -, 0
C"( +,O,
n
y;).
be the size of the entire population of the process,
be the number of particles coloured blue in the construction Then by Lemmas 11 and 13, we have that
where each inequality holds in distribution only. Since a'(m) -+Iognz
1
2
asm+co,
we have from Lemma 12 that
(;
We note also from Theorem 6 that the variable M -, 0 is distributed as a geometric variable with parameter O(n), where
O(n)=exp
1I [
- --log 1 -
Consider the variable M ,
(5,
m). By relabelling the vertices and by reversing
280
W-C. S. Suen
the edge directions in the graph $G_n$, it will be clear that $M_1(n/2, \infty)$ has a distribution similar (identical if $n$ is even) to that of $M_1(n/2, 0)$. Thus, from Eq. (3.7), we have that

where the random variables $Y_i$ and $Y_i'$ are independent and have the same distribution as the variable $M(n/2, 0)$.
Theorem 5 now follows because
and $\theta(n) M(n/2, 0)$ converges in distribution to
(l-p(n))-e@"'+exp
an exponential variable with parameter 1. □
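The convergence used in the last step can be checked numerically. The sketch below is only an illustration: it assumes the convention that a geometric variable $Y$ with parameter $\theta$ takes values in $\{1, 2, \ldots\}$ with $P(Y > k) = (1-\theta)^k$ (the paper's own convention is not restated here), takes $\theta = 1/N$, and uses the hypothetical function name `scaled_geometric_tail`.

```python
import math

def scaled_geometric_tail(N: int, x: float) -> float:
    """Exact P(theta * Y > x) for theta = 1/N and Y geometric on {1, 2, ...}.

    Assumed convention: P(Y > k) = (1 - theta)**k for integer k >= 0.
    Since Y is an integer and x >= 0, theta * Y > x  <=>  Y > floor(x * N).
    """
    theta = 1.0 / N
    k = math.floor(x * N)
    return (1.0 - theta) ** k

if __name__ == "__main__":
    for N in (10, 100, 10_000):
        # The tail should approach exp(-x), the tail of an exponential
        # variable with parameter 1, as theta = 1/N tends to 0.
        print(N, scaled_geometric_tail(N, 1.0), math.exp(-1.0))
```

As $N$ grows, the printed tail at $x = 1$ moves towards $e^{-1}$, in line with the limit stated above.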
We next show a theorem which will be useful in the proof of Theorem 4.
Theorem 14. Consider the graph $G_m$ in which $r = 1$ and $\alpha(m)$ satisfies $\alpha(m)/\log m \to \alpha \in (0, \infty)$. Let $M_1(m, 0)$ be the number of vertices joined to the vertex 0 by paths in $G_m$ directed from 0. Then for each $m$, $M_1(m, 0) \le Y_m$ in distribution, where $Y_m$ is a geometric variable with parameter $\theta(m)$ given by
$\theta(m) = (1 - \alpha(m)/m)^m$.
Furthermore, if a E ( t ,$), then for any large m ,
E
in
(4, a), we
have that, f o r sl&cienlly
Proof. The first part of the theorem is a direct application of Theorem 6 and Lemmas 11 and 13. (See also the proof of Theorem 5.)
Let $M(m, 0)$ be the size of the entire population of the branching process $\{Z_l^{(m)}\}$ which starts with a progenitor of type 0 and has rate function $\gamma_m$ given by
otherwise.
10
Let $D(m, 0)$ be the number of particles coloured blue in the construction $C''(m, 0, m, \gamma_m)$. Then from Lemmas 11 and 13,

where each inequality holds in distribution. Since, by Theorem 6, $M(m, 0)$ is distributed as a geometric random variable with parameter $\theta(m)$, where
$\theta(m) = (1 - \alpha(m)/m)^m = m^{-\alpha + o(1)}$, Eq. (3.8) now follows from (3.9) and Lemma 12. Note that the error probability in (3.8) can be improved considerably.
We next turn our attention to the graph $G_m$ in which $r \in (0, 1)$ and $\alpha(m)$ satisfies
We collect a result which will be useful in the proof of Theorem 2.
Theorem 15. Suppose that $r \in (0, 1)$, and that $\lim_{m \to \infty} \alpha(m) = \alpha \in (\alpha_c, \infty)$, where $\alpha_c$ is given by (2.5). Suppose also that $h = h(m)$ is given by
$h = h(m) = \frac{19}{32} \cdot \frac{\log m}{\log \alpha - \log \alpha_c}$.
Then for any vertex $i$ in the vertex set $V_m$ of the graph $G_m$, we have the following statements concerning the subgraph $A(m, i, h)$ of $G_m$.
(i) If $M_1(m, i, h)$ is the number of vertices in the graph $A(m, i, h)$, then
$P(M_1(m, i, h) > m^{2/3}) = O(m^{-1/24})$ as $m \to \infty$. (3.10)
(ii) If $N_1(m, i, h)$ is the number of vertices of distance $h$ from the vertex $i$, then for $t = i/m$ if $i \ne \infty$ and $t = (m-1)/m$ if $i = \infty$, we have
$P(N_1(m, i, h) < m^{9/16}) = q(t) + o(1)$ as $m \to \infty$, (3.11)
where $q(x)$, $x \in [0, 1]$, is the probability of ultimate extinction of the process $\{Z_l\}$ which starts with a progenitor of type $x$ and has rate function $\gamma$, where $\gamma$ is given by
The proof of the theorem makes use of three lemmas. These lemmas help relate the graph A(m, i, h) to the first h generations of a multi-type branching process.
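To make the two quantities in Theorem 15 concrete, here is a minimal sketch that computes $M_1$ (the number of vertices within directed distance $h$ of $i$, root included) and $N_1$ (the number at distance exactly $h$) by breadth-first search. It assumes the digraph is supplied as an adjacency dictionary; the random model for $G_m$ is not reproduced, and the function name `ball_sizes` is hypothetical.

```python
from collections import deque

def ball_sizes(adj, root, h):
    """Return (M1, N1) for the directed ball of radius h around root.

    adj : dict mapping each vertex to an iterable of out-neighbours
    M1  : number of vertices at directed distance <= h from root (root included)
    N1  : number of vertices at directed distance exactly h from root
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if dist[v] == h:
            continue  # do not explore beyond radius h
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    m1 = len(dist)
    n1 = sum(1 for d in dist.values() if d == h)
    return m1, n1

if __name__ == "__main__":
    # Toy digraph: 0 -> 1 -> 3, 0 -> 2 -> 3, 3 -> 4
    toy = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
    print(ball_sizes(toy, 0, 2))
```

Because breadth-first search assigns each vertex its shortest directed distance, the vertices at distance $l$ found this way are exactly the $l$-th stratum of a search tree rooted at $i$.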
Lemma 16. In addition to the hypotheses of Theorem 15, let $\eta : V_m \times V_m \to [0, 1]$ be the function given by
r-
. , m
(3.12)
Let $A(m, i, h)$ be the event that during the formation of the tree $GT(m, i, h)$ using the construction $C'(m, \alpha(m), r, i, h, \eta)$ there is a vertex $j \in V_m$ such that at least one vertex in $S_j$ remains uncoloured immediately after the procedure COLOUR'($j$) is performed on the vertex $j$. Then
Lemma 17. In addition to the hypotheses of Theorem 15, let $\gamma_m : [0, 1] \times [0, 1] \to [0, \infty)$ be a function given by (3.14)

For $i \in V_m$, let $D(m, i, h)$ be the number of particles coloured blue in the construction $C''(m, i, h, \gamma_m)$ for the tree $BT(m, i, h)$. Then

where $t$ is the type of the progenitor of the process $\{Z_l^{(m)}\}$ to which the construction is applied.
Lemma 18. For $i \in V_m$, let $\Omega_T(m, i, h)$ be the set of all labelled trees with root $i$, with heights not greater than $h$, and whose vertex sets are subsets of the set $V_m$. Suppose that the hypotheses of Lemmas 16 and 17 hold. Then for any $\bar{\omega} \in \Omega_T(m, i, h)$, we have that
$P(\{\omega : GT(\omega; m, i, h) = \bar{\omega}\}) = P_0(\{\omega_0 : BT(\omega_0; m, i, h) = \bar{\omega}\})$.
That is, both trees have the same distribution.
It should be noted that these lemmas are in the same spirit as Lemmas 11, 12 and 13. The proofs of Lemmas 16 and 17 are given in the Appendix. The proof of Lemma 18 is not given as it is similar to the proof of Lemma 13.
Proof of Theorem 15. We first note that if $A(m, i, h)$ has not occurred, then for $l = 1, 2, \ldots, h$, every vertex of distance $l$ from the vertex $i$ in the graph $G_m$ is in the $l$-th stratum of the tree $GT(m, i, h)$. Therefore, we have from (3.13) that
P ( N ( m , i , 11) < !?a9"
+ 0 (M-~~'), ') = P (N,(m , i , h ) < m9/' ') + 0 ( m 3 / 8 ) , = P ( M , ( m , i , / I ) > rn2I3)
-
(3.16) (3.17)
where $N_2(m, i, h)$ and $M_2(m, i, h)$ respectively are the size of the $h$-th stratum and the total number of vertices in the tree $GT(m, i, h)$. Let $N(m, i, h)$ and $M(m, i, h)$ respectively be the size of the $h$-th generation and the total number of particles in the first $h$ generations of the process $\{Z_l^{(m)}\}$ given in Lemma 17. Then by Lemma 18, we have that
$M_2(m, i, h) \le M(m, i, h)$ in distribution, and $N(m, i, h) - D(m, i, h) \le N_2(m, i, h) \le N(m, i, h)$ in distribution.
It follows from Theorem 8 that as $m \to \infty$
= 0 (m-1 / 2 4 ) ,
(3.18)
and
where $q^{(m)}(x)$, $x \in [0, 1]$, is the probability of ultimate extinction of the process $\{Z_l^{(m)}\}$ with a progenitor of type $x$ and rate function $\gamma_m$ given by (3.14). Eq. (3.10) now follows from (3.16) and (3.18). Since $\alpha(m) \to \alpha$ as $m \to \infty$, it is easy to deduce from Theorem 7 that $q^{(m)}(t) = q(t) + o(1)$ as $m \to \infty$. Thus Eq. (3.11) follows from (3.17) and (3.19). □
4. Proofs of Theorem 2 and Theorem 4
Proof of Theorem 2. Consider, for any fixed positive integer $l$, the subgraphs $A(n, 0, l)$ and $A'(n, \infty, l)$ of the graph $G_n$, where $A(n, 0, l)$ is defined as before, and $A'(n, \infty, l)$ is the subgraph of $G_n$ induced by all vertices that are joined to the vertex $\infty$ by paths, directed towards $\infty$, of length not exceeding $l$. Let $M_1(n, 0, l)$ (respectively $M_1'(n, \infty, l)$) be the size of the graph $A(n, 0, l)$ (respectively $A'(n, \infty, l)$). It is easy to show that as $n \to \infty$,

This suggests that for any fixed $l$, the graphs $A(n, 0, l)$ and $A'(n, \infty, l)$ resemble two branching trees for large $n$. This is the central idea of our proof of the theorem.
Suppose that $\alpha(n) = (1 + f(n))\alpha$, where $f(n) \to 0$ as $n \to \infty$. Let $\{p(n)\}$ be a sequence of positive integers tending to $\infty$ slowly so that the sequences $\{n^{-1/24} p(n)\}$ and $\{f(n) p(n)\}$ converge to 0 as $n \to \infty$. For any labelled and rooted tree $\omega$ with vertex set $V(\omega)$, we denote by $u(\omega)$ the size of the tree. For any vertex $i \in V_n$, let $\Omega_T(n, i, l)$ be the set of all labelled trees with root $i$, with heights not exceeding $l$, and satisfying $V(\omega) \subseteq V_n$. We write $\Omega_T'(n, i, l)$ as the set

and $\Gamma(n, l)$ as the set
$\Gamma(n, l) = \{(\omega_1, \omega_2) : \omega_1 \in \Omega_T'(n, 0, l),\ \omega_2 \in \Omega_T'(n, \infty, l),\ \text{and } V(\omega_1) \cap V(\omega_2) = \emptyset\}$.
We shall need the following lemmas.
Lemma 19. For $\omega_1 \in \Omega_T(n, 0, l)$ and $\omega_2 \in \Omega_T(n, \infty, l)$, let $B(n, \omega_1, \omega_2)$ be the event that $A(n, 0, l) = \omega_1$ and $A'(n, \infty, l) = \omega_2$. Writing

we have that as $n \to \infty$,
$P(B(n)) = 1 - o(1)$.
Lemma 20. Let $\gamma : [0, 1] \times [0, 1] \to [0, \infty)$ be a function given by

Consider the construction $C''(n, 0, l, \gamma)$ of the tree $BT(n, 0, l)$. For $\omega \in \Omega_T(n, 0, l)$, let $B(n, \omega)$ be the event that $BT(n, 0, l) = \omega$ and no particle is coloured blue in the construction. Then as $n \to \infty$,
$\sum_{\omega \in \Omega_T'(n, 0, l)} P_0(B(n, \omega)) = 1 - o(1)$. (4.3)
Lemma 21. For $\omega \in \Omega_T(n, \infty, l)$, let $\tau(\omega)$ be the tree obtained from $\omega$ by relabelling each vertex $i$ in $\omega$ with the label $k(i)$ where

and $k(0) = \infty$, $k(\infty) = 0$.
Then for $(\omega_1, \omega_2) \in \Gamma(n, l)$,
[ y].
where A ( n ) = p ( n ) f(n)+-
Note that (4.1) says that the graphs $A(n, 0, l)$ and $A'(n, \infty, l)$ are, with probability $1 - o(1)$, two disjoint trees, each having a size not more than $p(n)$. Eq. (4.4) shows that the graphs $A(n, 0, l)$ and $A'(n, \infty, l)$ "resemble" two independent branching trees when $n$ is large. Lemmas 19 and 20 are more or less obvious because the sizes of the graphs $A(n, 0, l)$, $A'(n, \infty, l)$ and $BT(n, 0, l)$ are (almost surely) of $O(1)$ as $n \to \infty$.
Proof of Lemma 21. Let $W(\omega)$ be the set of vertices of $\omega$ not in the $l$-th stratum of the tree $\omega$, and for $j \in W(\omega)$, let $S_j(\omega)$ be the set of offspring vertices of $j$ in $\omega$. Now the event $B(n, \omega_1, \omega_2)$ specifies that
(i) there is an edge $(j, k)$ for $j \in W(\omega_1)$ and $k \in S_j(\omega_1)$, or for $k \in W(\omega_2)$ and $j \in S_k(\omega_2)$,
(ii) there are $\frac{1}{2}(u(\omega_1) - 1)(u(\omega_1) - 2)$ pairs of vertices in $\omega_1$ and $\frac{1}{2}(u(\omega_2) - 1)(u(\omega_2) - 2)$ pairs of vertices in $\omega_2$ not joined by an edge in either direction,
(iii) there is no edge $(j, k)$ for $j \in W(\omega_1)$ and $k \in V_n - V(\omega_1)$, or for $j \in V_n - V(\omega_1) - V(\omega_2)$ and $k \in W(\omega_2)$.
Let $\eta, \eta_n : V_n \times V_n \to [0, 1]$ be functions given by
tl
r-
n
Then we have that
if i < j ,
giving that
We next turn our attention towards $P_0(B(n, \omega_1))$. The event $B(n, \omega_1)$ specifies that for $j \in V(\omega_1)$, there is exactly one particle of type in $I(j)$ amongst all particles in the first $l$ generations of the process $\{Z_l\}$, which starts with a progenitor of type 0 and has rate function $\gamma$ given by (4.2). It is easy to check that

The proof of Lemma 21 is thus complete.
We now consider flows through trees in a way similar to what we have done in Section 2. For $\omega \in \Omega_T(n, 0, l)$, we associate each edge $e$ in $\omega$ with a capacity $U(e)$ so that the family $\{U(e) : e \text{ is an edge in } \omega\}$ is a collection of independent random variables with common distribution $F$, where $F$ is given by the distribution of a typical edge capacity of $K_n$. We next form a network $O_l(\omega)$ by joining each vertex in the $l$-th stratum of $\omega$ to a new vertex $\infty'$ by an edge of infinite capacity. Let $T_l(\omega)$ be the maximum flow from 0 to $\infty'$ through $O_l(\omega)$. We adopt the convention that if $\omega$ has an empty $l$-th stratum, then $T_l(\omega) = 0$. For $x \in [0, \infty)$, let
Suppose that $B(n, \omega_1, \omega_2)$ has occurred. It is easy to show that $X_n \le \min(T_l(\omega_1), T_l'(\omega_2))$ in distribution, where $T_l'(\omega_2)$ is independent of $T_l(\omega_1)$ and has a distribution identical to that of $T_l(\tau(\omega_2))$. Hence, by using Lemmas 19, 20 and 21, we have that
$P(X_n > x) = P(X_n > x, B(n)) + o_1(1)$
whereT,,,=T,,,(u, r , F)isdefinedinSection2and the functions ol(l), ..., o,(l) all tend to 0 as n403. This shows that for x E [0, 03) and for I = 1 , 2 , ... , we have
The proof of part (i) of Theorem 2 is now complete because when uu,, that (4.7)
To show (4.7), we propose to find a lower bound for $X_n$ by constructing a subgraph $Q$ of $K_n$. Suppose that the event $B(n)$ has occurred. Then the graphs $A(n, 0, l)$ and $A'(n, \infty, l)$ are trees, each having a size not greater than $p(n)$. Let $R_1(0)$ and $R_1(\infty)$ be the sets of vertices in the $l$-th strata of $A(n, 0, l)$ and $A'(n, \infty, l)$. Note that if either $R_1(0)$ or $R_1(\infty)$ is empty, then $X_n = 0$. Consider the case when neither $R_1(0)$ nor $R_1(\infty)$ is empty. Let $r_0 = |R_1(0)|$ and $r_\infty = |R_1(\infty)|$. Suppose that
$R_1(0) = \{a(1), \ldots, a(r_0)\}$ and $R_1(\infty) = \{a'(1), \ldots, a'(r_\infty)\}$,
where a(l)U,.
We then delete from the graph $K_n$ all edges with capacities not exceeding $\delta$. This results in a graph $G_n'$. We now form, for each $i \in R_1(0)$, sets $R_2(0, i)$ and $W(0, i)$.
(i) Let $h$ be an integer where $h = \frac{19}{32} \cdot \frac{\log n}{\log\{\alpha(1 - F(\delta))\} - \log \alpha_c}$. Let $H(1) = V_n - V(0) - V(\infty)$, where $V(0)$ and $V(\infty)$ are the vertex sets of $A(n, 0, l)$ and $A'(n, \infty, l)$ respectively.
(ii) For $j = 1, 2, \ldots$, until $r_0$, we perform the following:
(a) Consider the induced subgraph $G_n'(j)$ of $G_n'$ on the vertex set $H(j) \cup \{a(j)\}$. We define $W(0, a(j))$ (respectively $R_2(0, a(j))$) as the set of all vertices of distance not exceeding (respectively, distance exactly) $h$ from the vertex $a(j)$.
(b) Put $H(j+1) = H(j) \setminus W(0, a(j))$.
To form the sets $W(\infty, i)$ and $R_2(\infty, i)$, $i \in R_1(\infty)$, we change our definition of distance here so that a vertex $j$ is said to be of distance not exceeding $k$ from a vertex $i$ if $j$ is joined to $i$ by a path, directed from $j$, of length not more than $k$.
(iii) For $j = 1, 2, \ldots$, until $r_\infty$, we perform the following:
(a) Consider the induced subgraph $G_n'(r_0 + j)$ of $G_n'$ on the vertex set $H(r_0 + j) \cup \{a'(j)\}$. We define $W(\infty, a'(j))$ (respectively $R_2(\infty, a'(j))$) as the set of all vertices of distance not exceeding (respectively, distance exactly) $h$ from the vertex $a'(j)$.
(b) Put $H(r_0 + j + 1) = H(r_0 + j) \setminus W(\infty, a'(j))$.
We write $W(0) = \bigcup_{i \in R_1(0)} W(0, i)$ and $W(\infty) = \bigcup_{i \in R_1(\infty)} W(\infty, i)$. We define $Q$ to be the induced subgraph of $K_n$ on the vertex set $V(0) \cup V(\infty) \cup W(0) \cup W(\infty)$.
Let $A_1(k)$ be the event that each of the sets $W(0, a(1)), \ldots, W(0, a(k))$ is of size not exceeding $n^{2/3}$. If $A_1(k-1)$ occurs, then the graph $G_n'(k)$ has at least $n(1 - o(1))$ vertices, with parameters $r \in (0, 1)$ and $\alpha_n$, where
Thus, by Theorem 15, we have that
For $t \in [0, 1]$, let $q(\delta, t)$ be the probability of ultimate extinction of the process $\{Z_l'\}$ which starts with a progenitor of type $t$ and has rate function $\gamma'$, where $\gamma'(x, y) = (1 - F(\delta))\gamma(x, y)$ for all $x, y \in [0, 1]$. Then by Theorem 7, we have that $\beta_1 = \inf\{1 - q(\delta, t) : t \in [0, 1]\} > 0$. Thus by Theorem 15, we have that
I
Therefore, if $A_2(0, i)$ is the event that $|W(0, i)| > n^{9/16}$, then for $i_1, \ldots, i_k$ in $R_1(0)$, we have from (4.8) and (4.9) that as $n \to \infty$,
P ( A , ( O , i , ) , ..., A , (0, i k ) ) > p P : ( l + O ( l ) ) k + O ( l ) . Since E(ro)
where p satisfies O
P ( A , ( m i d , .*. A,(m 9
7
3
ik))>/j:,
(4.1 I)
where $\{i_1, \ldots, i_k\} \subseteq R_1(\infty)$. Let $A$ be the event that there exist $i \in R_1(0)$ and $j \in R_1(\infty)$ such that (i) the events $A_2(0, i)$ and $A_2(\infty, j)$ have occurred, and (ii) the graph $Q$ does not contain an edge $(k_0, k_\infty)$, of capacity exceeding $\delta$, joining a vertex $k_0 \in R_2(0, i)$ and a vertex $k_\infty \in R_2(\infty, j)$. Since $r_0, r_\infty \le p(n)$, we have that

$= o(1)$ as $n \to \infty$. (4.12)
Let $\infty''$ be a new vertex. Consider the network $Q'$ obtained from $Q$ by
(i) deleting from $Q$ all vertices not in $A(n, 0, l)$ or $A'(n, \infty, l)$,
(ii) drawing, for each $i \in R_1(0)$, an edge $(i, \infty'')$ of capacity $\delta$ if and only if the event $A_2(0, i)$ has occurred,
(iii) drawing, for each $i \in R_1(\infty)$, an edge $(\infty'', i)$ of capacity $\delta$ if and only if the event $A_2(\infty, i)$ has occurred.
Let $Y_n$ be the maximum flow from 0 to $\infty$ through $Q'$. It is easy to check that when conditioned on $\bar{A}$, the network $Q$ permits a flow from 0 to $\infty$ of value at least $Y_n$. (Note. We use the notation $\bar{E}$ to denote the complement of an event $E$.)
For $\omega \in \Omega_T(n, 0, l)$, we form a network $O_l'(\omega)$ which is the same as the network $O_l(\omega)$ except that each edge $e$ joining a vertex in the $l$-th stratum of $\omega$ to the vertex $\infty'$ has a capacity $U'(e)$, with distribution

where $\delta$ and $\beta$ are as defined above. We assume that all edge capacities in $O_l'(\omega)$ are independent and denote by $T_l'(\omega)$ the maximum flow from 0 to $\infty'$ through $O_l'(\omega)$. We adopt the convention that if $\omega$ has an empty $l$-th stratum, then $T_l'(\omega) = 0$. Let
$P'(\omega, x) = P(T_l'(\omega) > x)$, $x \in [0, \infty)$.
From (4.10) and (4.11), it follows that if $n$ is sufficiently large, then for $(\omega_1, \omega_2) \in \Gamma(n, l)$ and for $x \in [0, \infty)$,
Similar to ( 4 4 , we have that
= P(TL, I > x y 1-0 (1)
,
where TAvlis defined in Section 2. S:nce T i p T 0 in distribution as I+co (by Theorem lo), and since TA,l
This completes the proof of the theorem. □
Proof of Theorem 4. We shall show part (i) first. Consider the induced subgraph $G'_{n-1}$ of $K_n$ on the vertex set $\{0, 1, \ldots, n-2\}$ containing open edges only. Let $M_1'(n-1, 0)$ be the number of vertices in $G'_{n-1}$ joined to the vertex 0 by open paths directed from 0. Then clearly

Now the graph $G'_{n-1}$ has parameters $r = 1$ and $\alpha'(n-1) = (n-1)p(n)$, and so by Theorem 14, we have that
$M_1'(n-1, 0) \le Y_{n-1}$ in distribution, (4.14)
where $Y_{n-1}$ is a geometric variable with parameter $\theta(n-1)$ given by

It therefore follows from (4.13) and (4.14) that
$P(X_n = 0) \ge E((1 - p(n))^{Y_{n-1}}) = 1 - o(1)$,
by the hypotheses of the theorem.
To show part (ii), we first choose a constant $\epsilon_1$ in $(0, \frac{1}{\alpha - 1})$. Let $\beta = (1 + \epsilon_1)\frac{\alpha - 1}{\alpha}$. Consider the induced subgraph $G'_{(1-\beta)n}$ of $K_n$ on the vertex set $\{0, 1, \ldots, (1-\beta)n - 2, \infty\}$ containing open edges only. Now the graph $G'_{(1-\beta)n}$ has parameters $r = 1$ and $\alpha'((1-\beta)n) = (1-\beta)n\,p(n)$, where $\alpha'(m)/\log m \to 1 - \epsilon_1(\alpha - 1) < 1$ as $m \to \infty$. Therefore, if $E(n)$ is the event that the vertex $\infty$ is joined to the vertex 0 by an open path (directed from 0) in the graph $G'_{(1-\beta)n}$, then we have from part (i) that
$P(E(n)) = o(1)$ as $n \to \infty$.
Suppose that the event $E(n)$ has not occurred. Then

where $C_i$ is the capacity of the edge joining the vertices $i$ and $\infty$ in $K_n$. The law of large numbers shows that for $\epsilon$ satisfying $\epsilon > \epsilon_1$,

As $\epsilon_1$ can be made as close to 0 as possible, we have that for any $\epsilon > 0$

$= o(1)$. It therefore remains to show that for any $\epsilon > 0$, (4.15)
a
To show (4.15), we construct a subgraph $Q$ of $K_n$ so that $Q$ allows, with probability $1 - o(1)$, a flow of value not less than $(1 - \epsilon)\mu(\alpha - 1)\log n$ to pass through. Before we give the method of construction, we define some notation. With regard to the fact that $P(E(n)) = o(1)$ as $n \to \infty$, we consider the sets
{
L(O)= I , ... , a i l n } , ~
i:
L(co)=
-i1-3,
... , n - 2
I
,
where $L(0) \cap L(\infty) = \emptyset$ need not be true. Suppose that $l$ is a large positive integer. We divide each of the sets $L(0)$ and $L(\infty)$ into $l$ sets, $L(0, 1), \ldots, L(0, l)$ and $L(\infty, 1), \ldots, L(\infty, l)$, so that for $k = 1, 2, \ldots, l$,
k a-l
. . . , - !- n ]a,
1-ka-I
n,
1 l-k+la-l ... , - n - 2 + __ ci l a
Let $\epsilon_2$ be a constant in $(0, 1)$. We choose a positive constant $\delta$ small enough so that $\kappa = 1 - F(\delta)$ is close to 1 and $\kappa\alpha > 1$, where $F$ is given by the distribution of a typical edge capacity in $K_n$.
The construction is basically a process of picking vertices. If, at any stage of the construction, we fail to pick an appropriate set of vertices, then the construction is stopped and is said to have failed. For $k = 1, 2, \ldots$, until $l - 5$, we perform the following:
(i) We pick a set $R_1(0, k)$ of $\rho(n) = \frac{1}{l}(1 - \epsilon_2)(\alpha - 1)\log n$ vertices from the set $L(0, k)$ so that each of the vertices picked is joined to the vertex 0 by an open edge. This is done subject to the following condition.
Condition (I). Only vertices with smallest possible labels are picked, and once picked, they are no longer available to future selection in the construction.
(ii) Suppose that $R_1(0, k) = \{a(1), \ldots, a(\rho(n))\}$ where $a(1) < \ldots$

vertices from the set $L(0, k+1)$ so that each vertex picked is joined to $a(j)$ by an edge of capacity exceeding $\delta$. We write

and

Suppose that the sets $R_1(0)$ and $R_2(0)$ are formed. We continue our construction to form, for $k = 1, 2, \ldots, l - 5$, sets $R_1(\infty, k)$ and, for $i \in R_1(\infty, k)$, sets $R_2(\infty, k, i)$. This is done by performing, for $j = 1, 2, \ldots$, until $l - 5$, steps (i) and (ii) with "$\infty$" substituted for "0." We write
and

Having formed the sets $R_1(0)$, $R_2(0)$, $R_1(\infty)$ and $R_2(\infty)$, we proceed to pick for $k = 1, 2, \ldots, l - 5$, and for $i \in R_2(0, k)$, a set $W(0, k, i)$. Let $G_n''$ be the edge-subgraph of $K_n$ containing only the edges with capacities exceeding $\delta$. For $k = 1, 2, \ldots$, until $l - 5$, we perform the following:
(iii) Let $H(0, k)$ be the set given by
H ( 0 , k’ j has not been picked in the construction
} b(1)
{ b ( j ) ) u I f ( O , k)\
u W ( 0 ,k , b ( 9 ) .
i= I
We define $W(0, k, b(j))$ to be the set of all vertices that are joined to $b(j)$ by paths, directed from $b(j)$, in the graph $G_n''(0, k, j)$. Having picked the sets $W(0, \cdot, \cdot)$, we continue our construction to form corresponding sets $W(\infty, \cdot, \cdot)$. This is done by performing for $j = 1, 2, \ldots$, until $l - 5$, step (iii) but with "$\infty$" substituted for "0," "directed towards" for "directed from," and
n
1-k-2ix-1
n n<j<--2+-CI
I-k-1 1
U-1 ~
U
(4.17) n,
and j has not been picked i n the construction ‘1
I
for (4.16). Note that the $W$ sets are all disjoint and that the sets $H(0, \cdot)$ and $H(\infty, \cdot)$ are carefully chosen so that for $k = 1, 2, \ldots, l - 5$,
$H(0, k) \cap H(\infty, l - k - 4) = \emptyset$.
We shall show that, with probability $1 - o(1)$, each $H$ set has a size close to $\frac{n}{2\alpha} + \frac{1}{l} \cdot \frac{\alpha - 1}{\alpha} n$.
We write
$R(0) = R_1(0) \cup \left(\bigcup_{k=1}^{l-5} W(0, k)\right)$ and $R(\infty) = R_1(\infty) \cup \left(\bigcup_{k=1}^{l-5} W(\infty, k)\right)$.
We define the graph $Q$ to be the induced subgraph of $G_n$ on the vertex set $\{0, \infty\} \cup R(0) \cup R(\infty)$.
We now estimate the probability of failure of the construction. The construction fails only when picking sets of sizes $\rho(n)$ or $\nu(n)$ in step (i) or step (ii). If the construction succeeds, then there are $2(l - 5)(1 + \rho(n))$ such sets, each having a size at most $\rho(n)$. Thus, the number of vertices not available for selection when sets of sizes $\rho(n)$ or $\nu(n)$ are picked is at most $2l\rho(n)^2$, whereas each set $L(0, \cdot)$ or $L(\infty, \cdot)$ has $\frac{1}{l} \cdot \frac{\alpha - 1}{\alpha} n$ vertices. Therefore, if $A$ is the event that the construction fails, then

where $Y_1$ (respectively, $Y_2$) is a binomial random variable with parameters $\frac{1}{l} \cdot \frac{\alpha - 1}{\alpha} n - 2l\rho(n)^2$ and $p(n)$ (respectively, $p(n)(1 - F(\delta))$). Routine analysis shows that for all large $n$,
and
where p l , p2 are constants in ( 0 , 1). It follows that P ( A ) = o ( l ) as n-+co. 1 1 a-1 Suppose that theconstruction succeeds. Choose aconstant e3 in and let B, be the event that there is a set W(0, * , whose size exceeds n* or is less than ne3.We shall show that P(B,)=o(l) as n+m. Suppose W ( 0 ,k , a ( j ) ) is the first of such sets appearing in the construction. Then the graph CY(0, k , j ) , from which W ( 0 ,k, a ( j ) ) is formed, has a)
1 a-1 n n +---0(~~/~(10gn)*) 1 Q 2cr
--
vertices. Furthermore, the vertex $a(j)$ has the smallest label among the vertices in $G_n''(0, k, j)$. Also, each edge in $G_n''(0, k, j)$ appears independently (conditioned on what has happened in the construction so far) with probability $p(n)(1 - F(\delta))$. Thus the graph $G_n''(0, k, j)$ has parameters $r = 1$ and $\alpha''_{kj}$, where
$\alpha''_{kj}/\log m \to \frac{\kappa}{2} + \frac{\kappa(\alpha - 1)}{l}$
as $m \to \infty$. Since $\kappa$ is close to 1 (when $\delta$ is small) and since $l$ is assumed to be very large, we have that
a-1 -+-2 1
K .
K E
(4,2). Thus, by Theorem
14,
we have
=o(l)
as n+oo
Let B , be the event that there exists a set W ( m , * , .) whose size exceeds n* or is less than nea.Following arguments similar to those used above, we have that as n+ 03,
$P(B_\infty \mid \bar{B}_0, \bar{A}) = o(1)$.
We now proceed to show (4.15). For $k = 1, 2, \ldots, l - 5$, let $A(k)$ be the event that there are vertices $i, j$, where $i \in R_2(0, k)$ and $j \in R_2(\infty, l - k - 4)$, such that no vertex in $W(0, k, i)$ is joined to at least one vertex in $W(\infty, l - k - 4, j)$ by an edge, in $Q$, of capacity exceeding $\delta$. Since the sets $H(0, k)$ and $H(\infty, l - k - 4)$ are disjoint, it is easy to show that as $n \to \infty$,

giving that
$P(A(k) \mid \bar{A}) = o(1)$.
Thus we have that
$P\left(\bigcup_{k=1}^{l-5} A(k) \mid \bar{A}\right) = o(1)$.
For $k = 1, 2, \ldots, l - 5$, let $Q(k)$ be the induced subgraph of $Q$ on the vertex set
$\{0, \infty\} \cup R_1(0, k) \cup R_1(\infty, l - k - 4) \cup W(0, k) \cup W(\infty, l - k - 4)$.
Let $X(n, k)$ be the maximum flow from 0 to $\infty$ through the graph $Q(k)$. Then conditioned on $\bar{A}(k)$ and $\bar{A}$ we have that
$X(n, k) = \min(Y(0, k), Y(\infty, k))$,
where
$Y(0, k) = \sum_{i \in R_1(0, k)} \min(U_i, \nu(n)\delta)$,
$Y(\infty, k) = \sum_{i \in R_1(\infty, k)} \min(U_i, \nu(n)\delta)$
and $U_i$ is the edge capacity of an open edge $(0, i)$ or $(i, \infty)$ depending on whether $i \in R_1(0, k)$ or $i \in R_1(\infty, k)$. Since the family $\{U_i : i \in R_1(0, k) \cup R_1(\infty, k)\}$ is a collection of independent and identically distributed random variables with finite mean $\mu$, it is easy to show that $Y(0, k)/\rho(n)$ and $Y(\infty, k)/\rho(n)$ converge to $\mu$ in probability as $n \to \infty$. This implies that as $n \to \infty$,
$\frac{X(n, k)}{\log n} \to \frac{1}{l}(1 - \epsilon_2)\mu(\alpha - 1)$ in probability.
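As a small illustration of the expression for $X(n, k)$, the sketch below evaluates $\min(Y(0, k), Y(\infty, k))$ for given capacity samples. The argument `cap` stands for the truncation level $\nu(n)\delta$; the function names and the sample numbers are made up for illustration and are not taken from the paper.

```python
def truncated_sum(capacities, cap):
    """Sum of min(U_i, cap) over the given edge capacities U_i."""
    return sum(min(u, cap) for u in capacities)

def layer_flow(caps_from_source, caps_to_sink, cap):
    """min(Y(0, k), Y(inf, k)), where each Y is a truncated sum of capacities."""
    return min(truncated_sum(caps_from_source, cap),
               truncated_sum(caps_to_sink, cap))

if __name__ == "__main__":
    # Source side: 0.5 + 1.0 + 1.0 = 2.5; sink side: 1.0 + 0.25 = 1.25.
    print(layer_flow([0.5, 2.0, 1.5], [3.0, 0.25], cap=1.0))
```

The truncation matters only for edges whose capacity exceeds `cap`; for the rest each summand is just $U_i$, which is why a law-of-large-numbers argument applies when the truncation level is large.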
Since the graphs $Q(k)$, $k = 1, 2, \ldots, l - 5$, are edge disjoint, we have that if $A_1 = \bigcup_{k=1}^{l-5} A(k)$, then

$= 1 - o(1)$ as $n \to \infty$,
where $\epsilon_4$ is a constant in $(0, 1)$. As we can make $\epsilon_2$ and $\epsilon_4$ as small as possible, and $l$ as big as possible, we have for any $\epsilon$ in $(0, 1)$,

It follows that
$P(X_n > (1 - \epsilon)\mu(\alpha - 1)\log n) = P(X_n > (1 - \epsilon)\mu(\alpha - 1)\log n, \bar{A}_1, \bar{A}) + o(1)$

This implies (4.15), and the proof of Theorem 4 is thus complete. □
Acknowledgements. The author would like to thank Dr. G . Grimmett for his helpful comments.
Appendix
A. Proof of Lemma 11. The lemma is obvious because when the function $\eta(j, k)$ equals 1 for all $j, k \in V_m$, the construction $C'(m, \alpha(m), 1, 0, m, \eta)$ gives a vertex, say $k$, in the $l$-th stratum of $GT(m, 0, m)$ if and only if the vertex $k$ is of distance $l$ from the vertex 0 in the graph $G_m$. □
B. Proof of Lemma 12. We first note that if $Y$ is a geometric variable with parameter $\theta$, then
Consider the process $\{Z_l^{(m)}\}$ to which the construction $C''(m, 0, m, \gamma_m)$ is applied. Notice that $\{Z_l^{(m)}\}$ starts with a progenitor of type 0 and has rate function $\gamma_m$ given by (3.4). Let $H$ be the set of particles in $\{Z_l^{(m)}\}$ given by
$H = \{x : \text{particle } x \text{ is blue but its parent is not}\}$.
For $x \in H$, let $n(x)$ be the size of the set containing the particle $x$ and its subsequent descendants in the process $\{Z_l^{(m)}\}$. Then clearly
$D(m, 0) = \sum_{x \in H} n(x)$.
Let $\epsilon_1 = \frac{1}{3} + \frac{1}{4\alpha}$. We partition the set $H$ into sets $H_1$, $H_2$ and $H_3$ so that
$H_1 = \{x \in H : \text{the particle } x \text{ has at least one brother whose type is in the interval } I(J(x))\}$,
$H_2 = \{x \in H - H_1 : x \le \epsilon_1\}$,
$H_3 = H - H_1 - H_2$.
For $k = 1, 2$, or 3, let $D_k(m) = \sum_{x \in H_k} n(x)$, giving that $D(m, 0) = D_1(m) + D_2(m) + D_3(m)$.
For the rest of the proof, we drop the suffixes in the probability measure $P_0$ and the expectation operator $E_0$, writing $P$ and $E$ respectively.
Consider a particle $x$ in the process $\{Z_l^{(m)}\}$. Let $p_x$ be the probability that the particle gives birth to two or more particles whose types are in an interval $I(j)$ for some $j \in V_m$. Then by the Markov inequality,

Let $B(m)$ be the event that there exists such a particle. Let $M(m, 0)$ be the size of the entire population of $\{Z_l^{(m)}\}$, and let $M(m, 0, \epsilon_1)$ be the number of particles in $\{Z_l^{(m)}\}$ of types not exceeding $\epsilon_1$. Then

Consider particles $y_1$ and $y_2$ in the branching process. Let $\chi(y_1, y_2)$ be the indicator function of the event that the particles $y_1$ and $y_2$ each gives birth to exactly one child of type in $I(j)$ for some $j \in V_m$. Then

$\le m^{-1}\alpha(m)^2$. Since $M(m, 0, \epsilon_1)$ (respectively $M(m, 0)$) is distributed as a geometric variable with parameter $\exp(-\epsilon_1\alpha(m))$ (respectively $\exp(-\alpha(m))$), we have that

where $n_j = \sup_{x \in H_j} E(n(x))$. It is easy to show that
giving that
Therefore, (A.1), (A.2), (A.3) and (A.4) give Eq. (3.5) since $D(m, 0) = D_1(m) + D_2(m) + D_3(m)$. □
C. Proof of Lemma 13. Let $j$ be a vertex in the tree $\bar{\omega} \in \Omega_T(m, 0)$. Suppose that we are in the process of applying procedure COLOUR'($j$) to the vertex $j$ in the construction $C'(m, \alpha(m), 1, 0, m, \eta)$ of the tree $GT(m, 0, m)$. Let $GC(j)$ be the set of vertices coloured immediately before procedure COLOUR'($j$) is applied, and let $GS(j)$ be the set of vertices coloured by the procedure. In contrast, consider the procedure COLOUR''($x$) where $x$ is the type of the red particle labelled with $J(x) = j$ in the construction $C''(m, 0, m, \gamma_m)$. Corresponding to the sets $GC(j)$ and $GS(j)$ we have sets $BC(j)$ and $BS(j)$ of labels, used before procedure COLOUR''($x$) is applied and used by procedure COLOUR''($x$) respectively. Then for $k \in V_m - GC(j)$,

and for $k \in GC(j)$, we have by construction that
$P(k \in GS(j)) = 0$.
Also for $k \in V_m - BC(j)$,
$P_0(k \in BS(j)) = P(\text{particle } x \text{ gives birth to at least one child of type in } I(k))$
= 1 -exp
(0
m
otherwise,
by Eq. (3.4), and for $k \in BC(j)$, $P_0(k \in BS(j)) = 0$, because no label can be used twice in $C''(m, 0, m, \gamma_m)$.
Furthermore, when conditioned on $GC(j)$ (respectively $BC(j)$), the family $\{\{k \in GS(j)\} : k \in V_m\}$ (respectively $\{\{k \in BS(j)\} : k \in V_m\}$) is a collection of independent events. Thus, on the condition that $GC(j) = BC(j)$, the sets $GS(j)$ and $BS(j)$ have the same distribution. As the trees $GT(m, 0, m)$ and $BT(m, 0, m)$ have roots of the same label, that is 0, we conclude that Eq. (3.6) holds. □
D. Proof of Lemma 17. The proof of the lemma is similar to that of Lemma 12. Consider the process $\{Z_l^{(m)}\}$ to which the construction $C''(m, i, h, \gamma_m)$ is applied. Notice that $\{Z_l^{(m)}\}$ starts with a progenitor of type $t$ and has rate function $\gamma_m$ given by (3.14). Suppose that $H$ is a subset of the set of particles in the first $h$ generations of the process $\{Z_l^{(m)}\}$, where $H$ is given by
$H = \{x : \text{particle } x \text{ is blue but its parent is not}\}$.
For $x \in H$, let $n(x)$ be the size of the set containing the particle $x$ and its subsequent descendants in the first $h$ generations of the process $\{Z_l^{(m)}\}$. Let $\epsilon = 5/48$ and let $h_1 = \frac{1}{2}(1 - \epsilon) \cdot \frac{\log m}{\log \alpha - \log \alpha_c}$. We partition the set $H$ into sets
$H_1 = \{x \in H : \text{the particle } x \text{ has at least one brother whose type is in the interval } I(J(x))\}$,
$H_2 = \{x \in H - H_1 : \text{the particle } x \text{ is in the first } h_1 \text{ generations of the process } \{Z_l^{(m)}\}\}$,
$H_3 = H - H_1 - H_2$.
Let M ( m , i , h , ) (respectively M ( m , i, h)) be the number of particles in the first hl (respectively h) generations of the process {Z:'")}.Then by Theorem 8, E [ M ( m ,i , /
i ) ] = ~ ( r n ~ / ~ ) ) ,
E [ M ( m ,i , h ) * ] = ~ ( r n ~ / ~ ) ,
E [ M ( m , i,
/i1)2]=~(rn1-a).
For $k = 1, 2$, or 3, let $D_k(m, i, h) = \sum_{x \in H_k} n(x)$. By following a method similar to the method used in proving Lemma 10, we have that, as $m \to \infty$,

where $n_j = \sup_{x \in H_j} E(n(x))$.
For a particle $x$ in some $h_2$-th generation of the process $\{Z_l^{(m)}\}$, we have from the basic properties of branching processes that

where $M(m, h - h_2)$ is the number of particles in the first $h - h_2$ generations of a process $\{\tilde{Z}_l^{(m)}\}$ which is the same as the original process $\{Z_l^{(m)}\}$ except that the process $\{\tilde{Z}_l^{(m)}\}$ starts with a progenitor of type $x$. Thus
n, = sup E ( / Z ( X ) ) I E
Ha
Since $D(m, i, h) = \sum_{k=1}^{3} D_k(m, i, h)$, Eqs. (A.5), (A.6), (A.7) and (A.8) give Eq. (3.15). □
E. Proof of Lemma 16. For any vertex $j$ in the tree $GT(m, i, h)$, let $p(j)$ be the probability that at least one vertex in $S_j$ remains uncoloured immediately after the procedure COLOUR'($j$) is performed on $j$. (Note. $S_j$ is the set of vertices joined to the vertex $j$ by an edge directed from $j$ in the graph $G_m$.) Then by the Markov inequality

Furthermore, if $M_2(m, i, h)$ is the size of the tree $GT(m, i, h)$, it is easy to deduce from Eq. (2.9) and Lemmas 17 and 18 that
which gives that
References
[1] M. Ajtai, J. Komlós and E. Szemerédi, The longest path in a random graph, Combinatorica 1 (1981) 1-12.
[2] L. R. Ford and D. R. Fulkerson, Flows in Networks (Princeton University Press, Princeton, New Jersey, 1962).
[3] T. E. Harris, The Theory of Branching Processes (Springer, Berlin, 1963).
[4] G. R. Grimmett and H. Kesten, Random electrical networks on complete graphs, preprint (1983).
[5] G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes (Clarendon Press, Oxford, 1982).
[6] G. R. Grimmett and W-C. Suen, The maximal flow through a directed graph with random capacities, Stochastics 8 (1982) 153-159.
[7] G. R. Grimmett and D. J. A. Welsh, Flows in networks with random capacities, Stochastics 7 (1982) 205-229.
[8] C. J. H. McDiarmid, General first passage percolation, Adv. Appl. Prob. 15 (1983) 149-161.

Annals of Discrete Mathematics 28 (1985) 305-310
© Elsevier Science Publishers B.V. (North-Holland)

ON THE NUMBER OF TREES HAVING k EDGES IN COMMON WITH A CATERPILLAR OF MODERATE DEGREES
Ioan TOMESCU
Faculty of Mathematics, University of Bucharest, 70109 Bucharest, Romania
In this paper it is shown that the number $T(T_s; n, k)$ of spanning trees of $K_n$ having $k$ edges in common with a fixed caterpillar $T_s$ with $s$ edges such that $\max_{x \in V(T_s)} \deg(x) \le r$ ($r$ is fixed), satisfies $\lim_{n \to \infty} T(T_s; n, k)/n^{n-2} = 2^k \lambda^k e^{-2\lambda}/k!$, where $\lambda = \lim_{n \to \infty} s/n$. This implies that the random variable taking the value $k$ with the probability $T(T_s; n, k)/n^{n-2}$ is distributed asymptotically in accordance with the Poisson law whenever $\lim_{n \to \infty} s/n$ exists.
For a connected graph $G$ with at least three vertices, the derived graph $G'$ defined in [1] is formed by deleting the endvertices of $G$. A caterpillar is a tree $T$ whose derived graph $T'$ is a path. Harary [1] proved that for any tree $T$ with at least three vertices, the following statements are equivalent: (1) $T$ is a caterpillar. (2) $T^2$ is hamiltonian. (3) $T$ does not contain the subdivision graph $S(K_{1,3})$ as a subtree, where $S(G)$, the subdivision graph of $G$, is the graph obtained from $G$ by inserting just one new vertex on each edge of $G$.

Consider now a caterpillar $T_s$ with $s$ edges $e_1, \dots, e_s$ which is a subgraph of the complete graph $K_n$ with $n$ vertices for $n \ge s+1$. For any selection $K$ of $i$ edges of $T_s$, i.e. $K \subset E(T_s) = \{e_1, \dots, e_s\}$ with $|K| = i$, these edges span a forest $F_K$ of $K_n$ composed of $p = n - i$ trees. If these trees contain respectively $m_1, \dots, m_p$ vertices ($m_1 + \dots + m_p = n$), then it is well known [2] that the number $T(F_K)$ of spanning trees of $K_n$ that contain $F_K$ is given by
\[ T(F_K) = \Bigl(\prod_{j=1}^{p} m_j\Bigr) n^{p-2}, \tag{1} \]
hence it depends only on the sizes of the components of $F_K$ and not on their individual structure. It is clear that the product $\prod_{j=1}^{p} m_j$ depends only on the choice of $K$; further this product will be denoted by $p(K)$.

Theorem 1. If $T_s$ is a caterpillar with $s$ edges such that $\max_{x \in V(T_s)} \deg(x) \le r$, then
\[ 2^i \binom{s-(r-1)(i-1)}{i} \le \sum_{K \subset E(T_s),\ |K| = i} p(K) \le \binom{2s-i+1}{i}. \tag{2} \]
The upper bound is an equality if and only if $i = 1$ or $i \ge 2$ and $T_s$ is a path.
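The quantities in Theorem 1 are easy to check by brute force on small trees. The sketch below computes the sum of p(K) over all i-subsets K of the edges of a small caterpillar and compares it with the two bounds as we read them from the proof: the lower bound 2^i C(s-(r-1)(i-1), i) and the upper bound C(2s-i+1, i), the latter being our own evaluation of the path case. The helper names (`caterpillar`, `sum_p`, `lower_bound`, `upper_bound`) are hypothetical and not part of the paper. Vertices of K_n outside the caterpillar form components of size 1, so they do not change p(K) and are omitted.

```python
from itertools import combinations
from math import comb, prod


def component_sizes(num_vertices, edges):
    """Sizes of the connected components of the forest (V, edges), via union-find."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in range(num_vertices):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return list(sizes.values())


def sum_p(tree_edges, i):
    """Sum of p(K) (product of component sizes) over all i-subsets K of the tree's edges."""
    num_vertices = len(tree_edges) + 1
    return sum(prod(component_sizes(num_vertices, K))
               for K in combinations(tree_edges, i))


def caterpillar(spine_length, legs):
    """Hypothetical helper: spine 0-1-...-(spine_length-1), with legs[j] pendant vertices at spine vertex j."""
    edges = [(j, j + 1) for j in range(spine_length - 1)]
    next_vertex = spine_length
    for j, count in enumerate(legs):
        for _ in range(count):
            edges.append((j, next_vertex))
            next_vertex += 1
    return edges


def lower_bound(s, r, i):
    # Assumed form of the lower bound; the binomial is 0 when its top entry is not positive.
    return 2 ** i * comb(max(s - (r - 1) * (i - 1), 0), i)


def upper_bound(s, i):
    # Assumed closed form of the sum for the path P_s.
    return comb(2 * s - i + 1, i)
```

For example, `caterpillar(3, [1, 2, 1])` has s = 6 edges and maximum degree r = 4, and `sum_p` of a path built with `caterpillar(k, [0] * k)` can be compared directly with `upper_bound`.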
Proof. Lower bound. We shall define a labelling of the edges of $T_s$ with the labels $1, \dots, s$, i.e. an injective function $f: E(T_s) \to \{1, \dots, s\}$, such that if $u$ and $v$ are two adjacent edges in $T_s$ then $|f(u) - f(v)| \le r - 1$. Suppose the vertices of the path $T_s'$ are labelled $1, 2, \dots, k$, where we assume vertices $i$ and $i+1$ are adjacent for $i = 1, \dots, k-1$. Now label the edges of $T_s$ consecutively from 1 to $s$, labelling first (i) the edges joining endvertices to vertex 1 (in any order), then (ii) the edge joining vertices 1 and 2, then (iii) the edges joining endvertices to vertex 2 (in any order, if there are such edges), then (iv) the edge joining vertices 2 and 3, and so on. Let $f(u)$ denote the label assigned in this way to edge $u$ and let $K$ be any subset of $i$ edges of $T_s$. If $|f(u) - f(v)| \ge r$ for all pairs of distinct edges $u$ and $v$ in $K$, then it is not difficult to see that no two edges of $K$ are adjacent in $T_s$, whence $p(K) = 2^i$. Since the number of $i$-subsets $Y = \{n_1, \dots, n_i\} \subset \{1, \dots, s\}$ such that $|a - b| \ge r$ for every $a, b \in Y$, $a \ne b$, is equal to $\binom{s-(r-1)(i-1)}{i}$ (see e.g. [3]), the lower bound follows.

Upper bound. Suppose that $T_s$ is not the path $P_s$ of length $s$; so there is a vertex $x$ such that $\deg(x) \ge 3$. Let $u = xy \in E(T_s) \setminus E(T_s')$ be an edge incident with $x$ such that $\deg(y) = 1$. If $T_s = K_{1,s}$, let $v = xz$ be any edge incident with $x$ such that $v \ne u$. Otherwise, let $v = xz \in E(T_s) \cap E(T_s')$ and let $W$ denote the set of all edges incident with $x$ which are different from $xy$ and $xz$. It follows that $|W| \ge 1$.
FIG. 1. The caterpillars $T_s$ and $U_s$ (vertices $x$, $y$, $z$; edges $u$, $v$; edge set $W$).
From $T_s$ we shall obtain a new caterpillar $U_s$ with $s$ edges by deleting $xy$ and inserting vertex $y$ on the edge $xz$ of $T_s$ (see Fig. 1). In $U_s$ let $v$ and $u$ denote the edges $zy$ and $xy$. We shall show that
\[ \sum_{K_1 \subset E(T_s),\ |K_1| = i} p(K_1) \le \sum_{K_2 \subset E(U_s),\ |K_2| = i} p(K_2). \tag{3} \]
If $i = 1$ it is obvious that these two sums are equal to $2s$ since $|E(T_s)| = |E(U_s)| = s$. Let $i \ge 2$. In this case (3) holds and the inequality is strict. To see this denote by $S_1$ the left member and by $S_2$ the right member of (3).
It is clear that
\[ \sum_{K_1 :\ v \notin K_1} p(K_1) = \sum_{K_2 :\ v \notin K_2} p(K_2) \]
because if we delete edge $v = xz$ from $T_s$, respectively edge $v = zy$ from $U_s$, we obtain isomorphic graphs. Also, by a similar argument we deduce
\[ \sum_{K_1 :\ u, v \in K_1} p(K_1) = \sum_{K_2 :\ u, v \in K_2} p(K_2). \]
It remains to consider the case when $v \in K_1$ but $u \notin K_1$, where $K_1 \subset E(T_s)$; let $K_2$ denote the set of edges in $U_s$ with the same labels as those in $K_1$. Since $u = xy \notin K_2$ it follows that vertices $x$ and $y$ belong to different components of the forest induced by $K_2$ in $U_s$; if $\alpha$ and $\beta + 1$ denote the number of vertices in these two components, respectively, then $\alpha \ge 1$ and $\beta \ge 1$ since $v = zy \in K_2$. It is not difficult to see that
\[ p(K_1) = (\alpha + \beta) P \le \alpha(\beta + 1) P = p(K_2), \]
where $P \ge 1$ denotes the product of the numbers of vertices in the remaining components, with equality holding only if $\alpha = 1$. It follows that $S_2 > S_1$ if $i \ge 2$. After some transformations of this type a path $P_s$ is obtained, hence
\[ \sum_{K \subset E(T_s),\ |K| = i} p(K) \le \sum_{K \subset E(P_s),\ |K| = i} p(K), \]
with equality if and only if $i = 1$ or $T_s = P_s$. But the last sum is equal to
\[ \sum_{\substack{m_1 + \dots + m_q = s+1 \\ m_1, \dots, m_q \ge 1}} m_1 \cdots m_q, \]
where $q = s - i + 1$. This sum equals the coefficient of $x^{s+1}$ in the development
\[ (x + 2x^2 + 3x^3 + \cdots)^q = x^q (1 - x)^{-2q}, \]
which is $\binom{2s-i+1}{i}$.
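The sum over compositions that closes the proof can be checked numerically. The sketch below enumerates all ways of writing s+1 as an ordered sum of q = s-i+1 positive integers, adds up the products, and compares the result with C(2s-i+1, i). That closed form is our own reading of the coefficient of x^{s+1} in x^q (1-x)^{-2q}, so it should be treated as an assumption; the function names are hypothetical.

```python
from math import comb, prod


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    # The first part may be at most total - (parts - 1), leaving 1 for each other part.
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def path_sum(s, i):
    """Sum of m_1 * ... * m_q over compositions of s + 1 into q = s - i + 1 parts."""
    q = s - i + 1
    return sum(prod(m) for m in compositions(s + 1, q))


def closed_form(s, i):
    # Assumed closed form: coefficient of x^(s+1) in x^q (1 - x)^(-2q).
    return comb(2 * s - i + 1, i)
```

For s = 3 and i = 2 the compositions of 4 into two parts are (1,3), (2,2), (3,1), giving 3 + 4 + 3 = 10 = C(5, 2).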
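Theorem 2. For any fixed $k$ and $r$ let $T(T_s; n, k)$ denote the number of spanning trees of $K_n$ having $k$ edges in common with a caterpillar $T_s$ with $s$ edges for which
\[ \max_{x \in V(T_s)} \deg(x) \le r \quad\text{and}\quad \lim_{n\to\infty} \frac{s}{n} = \lambda \in [0, 1]. \]
The following relation holds:
\[ \lim_{n\to\infty} \frac{T(T_s; n, k)}{n^{n-2}} = \frac{(2\lambda)^k}{k!}\, e^{-2\lambda}. \]
Proof. Let $A_i$ denote the set of all spanning trees of $K_n$ containing edge $e_i$ of $T_s$ for $1 \le i \le s$, and write
\[ s(T_s; n, i) = \sum_{K \subset \{1, \dots, s\},\ |K| = i} \Bigl|\bigcap_{j \in K} A_j\Bigr| = n^{n-i-2} \sum_{K \subset E(T_s),\ |K| = i} p(K), \]
since $|\bigcap_{j \in K} A_j| = T(F_K)$ and $T(F_K)$ is given by (1). If $\lim_{n\to\infty} s/n = \lambda$, then from Theorem 1 it follows that
\[ \lim_{n\to\infty} s(T_s; n, i)/n^{n-2} = \frac{2^i \lambda^i}{i!} \tag{4} \]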
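for any fixed $i$. From the general theory of the principle of inclusion and exclusion it is known that the numbers $T(T_s; n, k)$ satisfy the Bonferroni inequalities:
\[ \sum_{i=k}^{k+2m+1} (-1)^{i-k} \binom{i}{k} s(T_s; n, i) \le T(T_s; n, k) \le \sum_{i=k}^{k+2m} (-1)^{i-k} \binom{i}{k} s(T_s; n, i) \]
for $m \ge 0$ and $k + 2m + 1 \le s$. Hence
\[ \lim_{n\to\infty} \frac{T(T_s; n, k)}{n^{n-2}} = \sum_{i \ge k} (-1)^{i-k} \binom{i}{k} \frac{(2\lambda)^i}{i!} = \frac{(2\lambda)^k}{k!}\, e^{-2\lambda}. \]
Because $n^{n-2}$ is the number of spanning trees of $K_n$, this result implies that the random variable taking the value $k$ with the probability $T(T_s; n, k)/n^{n-2}$ is distributed in accordance with the Poisson law as $n \to \infty$. When $s \to \infty$, Theorem 2 holds not only for fixed $r$ but also for every $r$ such that $r/s \to 0$, since in this case (4) is true. Note that the condition on the degrees of $T_s$ is essential for Theorem 2. Indeed, if $T_s$ is the star $K_{1,s}$, then
\[ T(K_{1,s}; n, 0) = n^{n-2} \Bigl(1 - \frac{1}{n}\Bigr)^{s-1} \Bigl(1 - \frac{s+1}{n}\Bigr) \]
[2], hence
\[ \lim_{n\to\infty} T(K_{1,s}; n, 0)/n^{n-2} = (1 - \lambda) e^{-\lambda} \quad\text{whenever } \lim_{n\to\infty} \frac{s}{n} = \lambda. \]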
However, it can be conjectured that the conclusion of Theorem 2 still holds if we replace the caterpillar $T_s$ by any tree with $s$ edges whose degrees are bounded as $n \to \infty$. □
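For the path $P_s$ the distribution in Theorem 2 can be computed exactly and compared with the Poisson law. The sketch below assumes the normalised sums $s(P_s; n, i)/n^{n-2} = \binom{2s-i+1}{i}/n^i$ (our evaluation of formula (1) summed over the $i$-subsets of the path's edges) and applies inclusion and exclusion with exact rational arithmetic. The function names are hypothetical, and `n >= s + 1` is assumed.

```python
from fractions import Fraction
from math import comb, exp, factorial


def path_overlap_distribution(n, s):
    """Exact probabilities that a uniform random spanning tree of K_n shares
    exactly k edges (k = 0..s) with a fixed path of s edges.

    Assumes s(P_s; n, i) / n^(n-2) = C(2s - i + 1, i) / n^i.
    """
    moments = [Fraction(comb(2 * s - i + 1, i), n ** i) for i in range(s + 1)]
    # Inclusion-exclusion: P(k) = sum_{i >= k} (-1)^(i-k) C(i, k) * moments[i].
    return [sum((-1) ** (i - k) * comb(i, k) * moments[i] for i in range(k, s + 1))
            for k in range(s + 1)]


def poisson_pmf(mean, k):
    """Poisson probability of the value k."""
    return exp(-mean) * mean ** k / factorial(k)
```

With n = 3 and s = 2 the path is itself one of the three spanning trees of $K_3$, and the other two share exactly one edge with it, which the function reproduces. For n = 400 and s = 200 (so $\lambda = 1/2$) the exact values are already close to the Poisson probabilities with mean $2\lambda = 1$.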
Acknowledgements
The author wishes to thank the referee for his comments and some helpful suggestions.
References
[1] F. Harary, Recent results on graphical enumeration, in: Graphs and Combinatorics, Proc. Conf. on Graph Theory and Combinatorics, June 18-22, 1973, R. A. Bari and F. Harary, eds., Lecture Notes in Mathematics 406 (Springer-Verlag, 1974) 29-36.
[2] J. W. Moon, Counting labelled trees, Canad. Math. Monographs, No. 1 (W. Clowes and Sons, London and Beccles, 1970).
[3] I. Tomescu, Introduction to Combinatorics (Collet's, London and Wellingborough, 1975).
Annals of Discrete Mathematics 28 (1985) 311-317 © Elsevier Science Publishers B.V. (North-Holland)
RANDOM GRAPHS ALMOST OPTIMALLY COLORABLE IN POLYNOMIAL TIME
W. Fernandez de la VEGA
Université de Paris-Sud, Centre d'Orsay, Laboratoire de Recherche en Informatique, 91405 Orsay, France
Let $r$ be an integer $\ge 3$. If the sequence $(p_n)$ satisfies the conditions $(1 - p_n) n^{2/r} \to \infty$ and $(1 - p_n) n^{2/(r+1)} \to 0$, then the random graph $G = G(n, p_n)$ on $n$ vertices where each edge is present with probability $p_n$ is almost surely colorable in polynomial time using no more than $(1 + o(1))\chi(G)$ colors. The coloring algorithm used is the greedy algorithm for a maximal system of pairwise disjoint sets applied to a previously computed list of the independent sets of size $r$ in $G(n, p_n)$.
1. Introduction
It is well known that the problem of coloring a graph using a number of colors which does not exceed the minimum required by a factor greater than $2 - \varepsilon$ is NP-complete (Garey and Johnson [2, 3]) and, in fact, the current best performance guarantee for coloring algorithms running in polynomial time is much worse (Wigderson [8]). Random graphs with constant edge probability can be colored in polynomial time with a performance guarantee $2 + \varepsilon$ (Grimmett and McDiarmid [5]; see Korshunov [7] for estimates of the chromatic number of these graphs). The same is true for random graphs with average degree $o(n)$, where $n$ denotes the number of vertices (Fernandez de la Vega [1]). In this paper we display classes of random graphs which can be almost optimally colored (i.e. with a performance guarantee $1 + o(1)$ "almost everywhere") in polynomial time. Within each of these classes the graphs enjoy the property that, for almost every vertex, the size of the greatest independent set containing it is equal to a fixed constant $r$ and, in fact, one can obtain an almost optimal coloring with almost all color classes of size $r$. We denote by $G(n, p)$ the random graph on $n$ vertices with independent edges, each one present with probability $p$.
Theorem. Let $r$ be an integer $\ge 3$. If the sequence $(p_n)_{n=1,2,\dots}$ satisfies the conditions
\[ (1 - p_n) n^{2/r} \to \infty \]
and
\[ (1 - p_n) n^{2/(r+1)} \to 0, \]
then the chromatic number of the random graph $G(n, p_n)$ is a.s. asymptotically equivalent to $n/r$. Moreover, $G(n, p_n)$ can then be colored using no more than $(n/r)(1 + o(1))$ colors in polynomial time.

Next we shall prove this theorem. Conclusions and open problems will be presented in Section 2.
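The coloring procedure behind the theorem, a greedy pass over a fixed list of all r-subsets that keeps each independent r-set disjoint from those already kept, can be sketched as follows. This is a minimal illustration and not the author's implementation: the function names are hypothetical, and giving every leftover vertex its own color is our assumption, since the text only requires that almost all color classes have size r. For fixed r the pass examines O(n^r) subsets, hence runs in polynomial time.

```python
import random
from itertools import combinations


def random_graph(n, p, rng):
    """Adjacency sets of a sample of G(n, p)."""
    adj = [set() for _ in range(n)]
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def is_independent(adj, vertices):
    """True if no two of the given vertices are adjacent."""
    return all(v not in adj[u] for u, v in combinations(vertices, 2))


def greedy_r_set_coloring(adj, r):
    """Greedy selection of pairwise disjoint independent r-sets as color classes."""
    n = len(adj)
    color = [None] * n
    next_color = 0
    # The list L of all r-subsets, here in lexicographic order.
    for subset in combinations(range(n), r):
        if all(color[v] is None for v in subset) and is_independent(adj, subset):
            for v in subset:
                color[v] = next_color
            next_color += 1
    # Assumption: each vertex left uncovered receives a private color.
    for v in range(n):
        if color[v] is None:
            color[v] = next_color
            next_color += 1
    return color
```

The number of colors used is `max(color) + 1`; under the conditions of the theorem one expects it to be close to n/r for large n, although small samples will not show the asymptotic behaviour.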
Proof. Let us suppose that the conditions
\[ (1 - p_n) n^{2/r} \to \infty \tag{1} \]
and
\[ (1 - p_n) n^{2/(r+1)} \to 0 \tag{2} \]
are satisfied by the sequence $(p_n)$ and let us denote by $\chi_n$ the chromatic number of $G(n, p_n)$. We shall first prove that the inequality
\[ \chi_n \ge \frac{n}{r}(1 - o(1)) \tag{3} \]
holds in probability as $n \to \infty$. Let us denote by $Z_{r+1}$ the number of independent sets of size $r+1$ in $G(n, p_n)$. Its expectation is
\[ EZ_{r+1} = \binom{n}{r+1} (1 - p_n)^{\binom{r+1}{2}}. \]
Condition (2) gives immediately $EZ_{r+1} = o(n)$, which implies by Markov's inequality that $Z_{r+1}$ is $o(n)$ in probability. Similarly (2) implies $EZ_{r+2} = o(1)$, so that, again with probability tending to 1, there are no independent sets of size $\ge r+2$. Clearly, (3) follows. For the converse inequality we require firstly the following lemma, which expresses an inequality of FKG type. A proof can be found in a paper of Harris [6]. See Graham [4] for a general view on FKG inequalities.
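The role of conditions (1) and (2) can be seen numerically from the expected number of independent k-sets, $\binom{n}{k}(1-p_n)^{\binom{k}{2}}$. The sketch below works with logarithms to avoid overflow. The choice $1 - p_n = n^{-\alpha}$ with $2/(r+1) < \alpha < 2/r$ is our own illustrative parametrisation (it satisfies both conditions), and the function names are hypothetical.

```python
from math import lgamma, log


def log_binom(n, k):
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_expected_independent_sets(n, q, k):
    """log of C(n, k) * q^C(k, 2), where q = 1 - p is the non-edge probability."""
    return log_binom(n, k) + (k * (k - 1) / 2) * log(q)


def log_ratios(n, r, alpha):
    """Illustrative helper: returns log(E Y_r / n), log(E Z_{r+1} / n), log(E Z_{r+2})
    for q = n^(-alpha)."""
    q = n ** (-alpha)
    ln_n = log(n)
    return (log_expected_independent_sets(n, q, r) - ln_n,
            log_expected_independent_sets(n, q, r + 1) - ln_n,
            log_expected_independent_sets(n, q, r + 2))
```

For r = 3 and alpha = 0.6 the first entry grows with n (independent r-sets are plentiful compared with n), while the second and third decrease without bound (independent sets of size r+1 cover a vanishing fraction of the vertices, and larger ones disappear).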
Avoidance Lemma. Let $E$ be a finite set and $F: 2^E \to \mathbb{R}$ a set function satisfying the condition $Y \subseteq Z \Rightarrow F(Y) \le F(Z)$. Suppose that $X$ is a random subset of $E$ in which each element is present with probability $p$, independently of the others. Let $A_1, A_2, \dots, A_k$ be given subsets of $E$. Then we have, for every $t$,
\[ P(F(X) \ge t \mid A_i \not\subseteq X,\ i = 1, \dots, k) \le P(F(X) \ge t). \]
Following current usage, we shall express the above inequality by saying that the conditional distribution of $F(X)$ (for the left-hand side conditioning event) is dominated by its unconditional distribution.

Returning to the proof, let us show that we can almost surely find $(n/r)(1 - o(1))$ disjoint independent $r$-subsets of vertices in $G(n, p_n)$. Clearly this will imply the required inequality $\chi_n \le (n/r)(1 + o(1))$. For convenience, let us introduce the complementary graph, say $H$, in which each edge is present with probability $q_n = 1 - p_n$, and look for disjoint $K_r$'s in this graph. Let $L = S_1, S_2, \dots, S_m$, $m = \binom{n}{r}$, be a list of the $r$-subsets of the vertex set $[n]$, arbitrarily ordered. We shall extract from this list a sequence $S_{I_1}, S_{I_2}, \dots, S_{I_T}$ of pairwise disjoint $K_r$'s of $H$, using a greedy strategy: $S_{I_1}$ is the first set in $L$ which spans a $K_r$ of $H$, and so on. Suppose that $I_1, I_2, \dots, I_k$ have been found. Then $S_{I_{k+1}}$ is the first set next to $S_{I_k}$ which spans a $K_r$ vertex disjoint from the previously selected ones, or the sequence ends with $T = k$ if no such set is found. We shall prove that the random number $T_n = T$ of $K_r$'s found satisfies
\[ T_n \ge \frac{n}{r}(1 - o(1)) \]
in probability. Suppose that exactly $j$ $K_r$'s have been found and let $R_j$ and $E_j$ denote the set of vertices and the set of edges of these $K_r$'s. The other $K_r$'s of $H$ fall into three disjoint classes: those which are vertex disjoint from $R_j$; those which intersect $R_j$ but have no edge in $E_j$; those which have edges in $E_j$. Let us denote by $A_j$, $B_j$, $C_j$, in this order, the cardinalities of these sets and by $Y_r$ the total number of $K_r$'s in $H$. We have obviously
\[ Y_r = j + A_j + B_j + C_j, \]
and $T_n = j$ iff $A_j = 0$. Let us denote by $C$ the number of pairs $(U, V)$ of distinct $K_r$'s of $H$ which share at least one edge. Clearly $C$ is a common upper bound for all the $C_j$'s. For every $\varepsilon > 0$ and integer $h$ we have the following relation between events:
\[ \{T_n \le h\} \subseteq \{Y_r \le (1 - \varepsilon) EY_r\} \cup \{C \ge \varepsilon EY_r\} \cup \bigcup_{j=0}^{h} \{B_j \ge (1 - 2\varepsilon) EY_r - j,\ T_n > j - 1\}. \]
314
Indeed, it is readily checked that if the left-hand side event occurs, at least one of the right-hand side events has to occur. This implies
+C
P(Bj>(1-2e)EY,-jI
j=O
Tn>j-l). (4)
It is easily checked that, under conditions (1) and (2), we have VarY,=O(EY,) and this implies
p (r,d (1- E ) EY,)= 0 (1).
(5)
We shall prove below that we have
P (C >EEY,) = 0 (1)
(6)
and, for h=(n/r)(1-(4~)”~) and jdh,
I
P (Bj>(l - 2 ~ )EYr-j Tn> j - 1) = o ( l / n )
(7)
This implies, with (4) and (5),
P (T,2 ( n / r )(1 -( 4 ~ ) ’ ~ ~=)1) - o ( I ) , and, as desired,
n in probability, since E is arbitrarily small. Now it remains to prove (6) and (7).
Proof of (6). We have
Because of (2), the bracketed expression is $o(1)$ for $1 < k < r - 2$. This implies $EC = o(EY_r)$ and (6) follows by Markov's inequality.
Proof of (7). Let us denote by $L_j = (i_1, i_2, \dots, i_j)$ any admissible set of values for $I_1, I_2, \dots, I_j$ and let now $R_j$ denote the set $\bigcup_{k=1}^{j} S_{i_k}$. Let $E_j$ denote the set of edges of the corresponding $K_r$'s and let $F_j = [n]^2 - E_j$ denote the set of edges of the graph complementary to $E_j$. We claim that the conditional distribution of $B_j$ (relatively to the conditioning event $I_1 = i_1, \dots, I_j = i_j$) is dominated by the unconditional distribution of the number $B_j'$ of $K_r$'s of $H$ which intersect the set $R_j$. Indeed, the distribution of the set $H_j$ of edges of $H$ belonging to $F_j$ coincides with that of a random subset of $F_j$ (with individual probabilities equal to $q_n$), subject to the condition that, for each $i$ from a collection of indices whose precise definition from $L_j$ can easily be written down, it does not contain all the edges of the complete graph with vertex set $S_i$. This implies by the Avoidance Lemma that the distribution of $B_j$ is dominated by the unconditional distribution of the number of $K_r$'s which intersect $R_j$ and have their edges in $F_j$, which is in turn dominated by the distribution of the number of $K_r$'s which intersect $R_j$, and this concludes the proof of our claim.

Let us now consider the additional random variables $B_j''$ and $B_j'''$, defined on the same probability space as $B_j'$, by
\[ B_j'' = \#\{K_r\text{'s of } H \text{ which do not intersect } R_j\} \]
and
\[ B_j''' = \#\{K_r\text{'s of } H\}. \]
Then $B_j''' = B_j' + B_j''$ while $B_j'$ and $B_j''$ are positively correlated. Hence
\[ \mathrm{Var}(B_j') \le \mathrm{Var}(B_j''') = O(EY_r). \]
We have
\[ EB_j' \le (1 - 4\varepsilon + o(1))\, EY_r \]
for $j \le h$, because at least a fraction $(1 - jr/n)^r (1 - o(1)) \ge 4\varepsilon (1 - o(1))$ of all $r$-subsets of $[n]$ avoid $R_j$. By Chebyshev's inequality and, since by (1) we have $EY_r/n \to \infty$,
\[ P(B_j' \ge (1 - 2\varepsilon) EY_r - j) = o(1/n), \quad j \le h, \]
and (7) follows by dominance. □
2. Conclusions and open problems
We have displayed a class of random graphs for which an almost optimal coloring can be obtained by a sequential selection of the color classes, each new color class being an independent set of maximum (or nearly maximum) size within the not yet selected vertices. It would be interesting to know if such a coloring procedure (to be specific, let us refer to the algorithm which at each step selects the new color class at random among the independent sets of maximum cardinality within the not yet selected vertices) could also be almost optimal for other kinds of random graphs and, in particular, for the ordinary random graphs with edge probability 1/2, though, for classes of graphs with unbounded stability number, this procedure will not run in polynomial time. Our theorem leaves untouched the case $(1 - p_n) n^{2/(r+1)} \to C$, where $C$ is a positive constant. In this case the proportion of vertices which belong to independent sets of size $r+1$ does not vanish and we are no longer able to locate accurately the chromatic number. We do not expect that in this case $G(n, p_n)$ could be almost optimally colored in polynomial time. Indeed, it might be the case that conditions (1) and (2) define essentially the only classes of random graphs which can be almost optimally colored in polynomial time (other known classes are the class $p_n = c/n$, $c < 1$, where the chromatic number is 3 or 2 according to the presence or absence of odd circuits, and the class $p_n = 1 - c/n$, for which the almost optimal coloring problem equates the maximum matching problem in the complementary graph).
References
[1] W. Fernandez de la Vega, On the chromatic number of sparse random graphs, in: Graph Theory and Combinatorics, a volume in honour of Paul Erdős, B. Bollobás, ed. (Academic Press, 1984) 321-328.
[2] M. R. Garey and D. S. Johnson, The complexity of near-optimal graph coloring, J. Assoc. Comput. Mach. 23 (1976) 43-49.
[3] M. R. Garey and D. S. Johnson, Computers and Intractability (Freeman, San Francisco, 1979).
[4] R. L. Graham, Applications of the FKG inequality and its relatives, in: Mathematical Programming, The State of the Art, A. Bachem, M. Grötschel and B. Korte, eds. (Springer, 1983) 115-131.
[5] G. R. Grimmett and C. J. H. McDiarmid, On colouring random graphs, Math. Proc. Camb. Phil. Soc. 77 (1975) 313-324.
[6] T. E. Harris, A lower bound for the percolation probability in a certain percolation process, Proc. Camb. Phil. Soc. 56 (1960) 13-20.
[7] A. D. Korshunov, The chromatic number of n-vertex graphs, Diskret. Analiz 35 (1980) 15-44 (in Russian).
[8] A. Wigderson, Improving the performance guarantee for approximate graph coloring, J. Assoc. Comput. Mach. 30 (1983) 729-735.
Annals of Discrete Mathematics 28 (1985) 319-336 © Elsevier Science Publishers B.V. (North-Holland)
SUBCUBE COVERINGS OF RANDOM GRAPHS IN THE n-CUBE
Karl WEBER
Sektion Mathematik der Wilhelm-Pieck-Universität Rostock, 2500 Rostock, German Democratic Republic
We give a survey about results and proof techniques concerning coverings of the vertex set of random subgraphs of the n-cube by subcubes.
0. Introduction
The n-cube $E^n$ is the graph consisting of the $N = 2^n$ vertices $a = (a_1, \dots, a_n)$, $a_i \in \{0, 1\}$, and the $N' = n2^{n-1}$ edges between vertices differing in exactly one coordinate. A spanning subgraph $g$ of $E^n$ has the same vertex set as $E^n$. An induced subgraph $f$ of $E^n$ (also called a Boolean function) with the vertex set $A \subseteq E^n$ contains all edges of $E^n$ between vertices of $A$. (Note that $E^n$ and $f$ denote not only the graphs but also their vertex sets; $g$ also stands for the edge set of $g$.) Choosing the edges of $g$ (the vertices of $f$) at random, we arrive at a random spanning subgraph (a random Boolean function). The corresponding probability spaces will be denoted by $G(n, p)$ ($F(n, p)$) if the edges of $g$ (the vertices of $f$) are chosen independently and with the same probability $p$. That means the probabilities are defined as $P(f) = p^{|f|} q^{N - |f|}$ and $P(g) = p^{|g|} q^{N' - |g|}$, respectively, where $q = 1 - p$. In all which follows let $h$ denote a random (spanning or induced) subgraph of $E^n$. We say almost all random subgraphs $h$ have a certain property $Q$ if $P(h \text{ satisfies } Q) \to 1$ as $n \to \infty$. For example, if the probability $p > 1/2$ ($p < 1/2$) is fixed, then almost all $g \in G(n, p)$ are connected (not connected); if $p = 1/2$, then $P(g \text{ is connected}) \to 1/e$ as $n \to \infty$ (cf. [3] and [13]).

There are only few results for random graphs in the n-cube. In this paper we restrict ourselves to a special covering problem. A covering $C$ of $h$ is a covering of the vertex set of $h$ by subcubes of $E^n$ being also subgraphs of $h$. The number $L(C)$ of subcubes in a covering $C$ is called the length of $C$. A minimal covering of $h$ has minimum length among all coverings of $h$. This minimum is called the length of $h$ and is denoted by $L(h)$. A subcube $K \subseteq h$ (if $h = g$, then also the edges of $K$ have to be contained in $h$) is said to be a maximal subcube of $h$ if there is no subcube $K^*$ with $K \subset K^* \subseteq h$. The number of maximal subcubes of $h$ is denoted by $S(h)$. An irredundant covering of $h$ contains only maximal subcubes of $h$, and none of them is redundant. It is clear that $L(h) = \min L(C)$ taken over all irredundant coverings of $h$. Define $L^+(h) = \max L(C)$ taken over all irredundant coverings $C$ of $h$ and denote by $T(h)$ the number of all irredundant coverings of $h$.

To construct minimal coverings for a given $h$ there is a unifying approach: first determine all maximal subcubes of $h$, then construct all irredundant coverings of $h$, and finally find the minimal coverings among the irredundant ones. The complexity of this process may be characterized by parameters such as $S(h)$, $L(h)$ and $T(h)$. If we renounce the determination of minimal coverings and replace it by an arbitrary irredundant one, then the ratio $L^+(h)/L(h)$ yields the possible deviation from the minimum length. Obviously, the parameters introduced above are random variables on $F(n, p)$ or $G(n, p)$, respectively. In this paper bounds for these and similar characteristics are summarized.

In the case of $F(n, p)$ our investigations were motivated by the minimization of Boolean functions. Recall that minimization means to construct disjunctive normal forms (DNF) of minimum complexity for given Boolean functions. A customary complexity measure is the number $L(D)$ of conjunctions in the DNF $D$ (for other complexity measures see [14]). Using this measure, we arrive at our covering problem. In the case of $G(n, p)$ we note the analogy between the coverings defined above and the well-known coverings of the vertex set of spanning subgraphs of the complete graph by cliques. The last ones are closely related to the chromatic number. Our study was initiated by the work of Glagolev (cf. [4]) and Saposhenko (cf. [6-11]). We have used essentially their ideas and constructions.
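The notions of subcube, maximal subcube and covering for an induced subgraph (a Boolean function) can be made concrete on small cubes. The sketch below encodes vertices of the n-cube as integers 0..2^n-1, enumerates the subcubes contained in f, filters the maximal ones, and builds a covering greedily by always taking a maximal subcube that covers the most still uncovered vertices. The function names are hypothetical, and the greedy covering is only an illustration: it need not be minimal or even irredundant.

```python
from itertools import combinations, product


def all_subcubes(n):
    """Yield every subcube of the n-cube as a frozenset of vertices (bit masks)."""
    for d in range(n + 1):
        for free in combinations(range(n), d):
            fixed = [c for c in range(n) if c not in free]
            for values in product((0, 1), repeat=len(fixed)):
                base = sum(v << c for v, c in zip(values, fixed))
                yield frozenset(
                    base | sum(b << c for b, c in zip(bits, free))
                    for bits in product((0, 1), repeat=d))


def maximal_subcubes(f, n):
    """Subcubes contained in the vertex set f that lie in no larger subcube of f."""
    inside = [K for K in all_subcubes(n) if K <= f]
    return [K for K in inside if not any(K < other for other in inside)]


def greedy_covering(f, n):
    """Cover f by maximal subcubes, each step taking one that covers most new vertices."""
    candidates = maximal_subcubes(f, n)
    uncovered, covering = set(f), []
    while uncovered:
        best = max(candidates, key=lambda K: len(K & uncovered))
        covering.append(best)
        uncovered -= best
    return covering
```

For n = 3 and f = {000, 001, 011, 111} (the integers 0, 1, 3, 7) the maximal subcubes are the three edges {0,1}, {1,3}, {3,7}; the minimal covering uses the first and the last, so L(f) = 2.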
1. Preliminaries
Note that all limits, asymptotics, etc., are considered as $n \to \infty$. For two sequences $\alpha = \alpha(n)$ and $\beta = \beta(n)$ we write $\alpha \lesssim \beta$ if $\alpha \le (1 + o(1))\beta$. The expectation and the variance of a random variable $X$ are denoted by $EX$ and $D^2X$, respectively. For $x > 0$ we have
\[ P(X \ge x\,EX) \le 1/x \tag{1.1} \]
(inequality of Markov). It implies, for every $\varphi = \varphi(n) \to \infty$,
\[ P(X < \varphi\,EX) \ge 1 - 1/\varphi \to 1, \tag{1.2} \]
and in particular, since $X$ is non-negative and integer-valued,
\[ P(X = 0) \to 1 \quad\text{if } EX \to 0. \tag{1.3} \]
The following inequality of Chebyshev is an immediate consequence of (1.1):
\[ P(|X - EX| \ge x) \le D^2X/x^2. \tag{1.4} \]
It implies for
\[ D^2X = o((EX)^2) \tag{1.5} \]
that there is an $\varepsilon = \varepsilon(n) \to 0$ such that
\[ P(|X - EX| < \varepsilon\,EX) \to 1. \tag{1.6} \]
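For (1.6) we will use the short notation $P(X \sim EX) \to 1$, and we say $X$ is asymptotically equal to $EX$ for almost all $h$. If $EX \to \infty$, then (1.5) can be replaced by $E(X)_2 \sim (EX)^2$, where $E(X)_2 = EX^2 - EX$ is the second factorial moment of $X$. Recall that the convergence of the $r$-th factorial moments $E(X)_r \to \lambda^r$ for $r = 1, 2, \dots$, where $\lambda$ is a positive real constant, implies $P(X = t) \to \lambda^t e^{-\lambda}/t!$, $t = 0, 1, \dots$, i.e. $X$ is asymptotically Poisson distributed with mean value $\lambda$. Given the random variables $X_0, X_1, X_2, \dots$ (the index $n$ is omitted again) and the integer sequences $\alpha = \alpha(n)$, $\beta = \beta(n)$, $0 \le \alpha \le \beta$, suppose that
\[ D^2X_d \le c_d (EX_d)^2 \ \text{ for } d = \alpha, \dots, \beta, \quad\text{where } \sum_{d=\alpha}^{\beta} c_d \to 0. \tag{1.7} \]
Then there exists an $\varepsilon = \varepsilon(n) \to 0$ such that
\[ P\Bigl(\bigcap_{d=\alpha}^{\beta} \{|X_d - EX_d| < \varepsilon\,EX_d\}\Bigr) \to 1 \tag{1.8} \]
(short notation: $P(\bigcap_{d=\alpha}^{\beta} \{X_d \sim EX_d\}) \to 1$). Indeed, by (1.1) in the form (1.4),
\[ P\Bigl(\bigcup \{|X_d - EX_d| \ge \varepsilon\,EX_d\}\Bigr) \le \sum \frac{D^2X_d}{\varepsilon^2 (EX_d)^2} \le \varepsilon^{-2} \sum c_d, \]
and assuming (1.7), we find an $\varepsilon \to 0$ such that the last sum tends to zero too ($\bigcup$ and $\sum$ are taken over $d = \alpha, \dots, \beta$). Note that (1.8) implies
\[ \sum_{d=\alpha}^{\beta} X_d \sim \sum_{d=\alpha}^{\beta} EX_d \tag{1.9} \]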
(in the sense of (1.6)). Clearly, the number $|f|$ of vertices of $f$ and the number $|g|$ of edges of $g$ are binomially distributed random variables on $F(n, p)$ (with parameters $N$ and $p$) and $G(n, p)$ (with parameters $N'$ and $p$), respectively. Thus, in particular, almost all $f \in F(n, p)$ ($g \in G(n, p)$) have asymptotically $pN$ vertices ($pN'$ edges) provided that these numbers tend to infinity. That means the properties of almost all $f \in F(n, p)$ ($g \in G(n, p)$) may be interpreted as typical properties of those $f$ ($g$) with nearly $pN$ vertices ($pN'$ edges).

For a fixed $d$-cube $K \subseteq E^n$ there are exactly $n - d$ $d$-cubes $K^*$, each of which forms together with $K$ a $(d+1)$-cube. These subcubes are called the neighbours of $K$. Clearly $K$ is a maximal $d$-cube of $f$ (of $g$) if none of the $n - d$ neighbours of $K$ belongs to $f$ (if no neighbour $K^*$ together with the $2^d$ edges between $K$ and $K^*$ belongs to $g$). A subcube $K$ of $h$ is said to be isolated if there are no edges between $K$ and $E^n - K$ in $h$. The vertex sets of connected subgraphs of $E^n$ with exactly $m$ vertices are called $m$-cells (cf. [1]). By applying the well-known upper bound $4^{m-1}$ for the number of pairwise nonisomorphic rooted trees with $m$ vertices, the following bound for the number $\zeta(m)$ of $m$-cells in $E^n$ is easily verified:
\[ \zeta(m) \le 2^n (4n)^{m-1} \tag{1.10} \]
(cf. [11]). This bound gives the order of $\zeta(m)$ for $m = O(1)$.

In all that follows $d$ stands for the dimension of subcubes, i.e. it is integer-valued and non-negative. Mostly it depends on $n$; a fixed dimension will often be denoted by $k$. The probability is denoted by $p$. It depends on $n$ too. For a certain smoothness of $p = p(n)$ we suppose that not only $p$ but also $\log(1/p)/\log n$ converges, possibly to infinity. The random variables on $F(n, p)$ are denoted by $I$, $J_d$, $S$, $L$, $T$, ... and the corresponding random variables on $G(n, p)$ are written with bar: $\bar{I}$, $\bar{J}_d$, $\bar{S}$, $\bar{L}$, $\bar{T}$, etc. The binary and the natural logarithm are denoted by $\log$ and $\ln$, respectively. For any real $x$, $\lfloor x \rfloor$ and $\lceil x \rceil$ denote the greatest integer not greater than $x$ and the least integer not less than $x$, respectively. The most important notations and abbreviations are listed at the end of the paper.
2. Subcubes
We define the following random variables on $F(n, p)$: $I_d(f)$, $S_d(f)$ and $J_d(f)$ denote the number of $d$-cubes, maximal $d$-cubes and isolated $d$-cubes of $f$, respectively,
\[ I(f) = \sum_{d=0}^{n} I_d(f) \quad\text{and}\quad S(f) = \sum_{d=0}^{n} S_d(f). \]
For the expectations we have
\[ EI_d = uz, \quad ES_d = uz(1 - z)^{n-d}, \quad EJ_d = uz\,q^{(n-d)2^d}, \quad d = 0, 1, \dots, n, \tag{2.1} \]
where $u = \binom{n}{d} 2^{n-d}$ is the number of $d$-cubes in $E^n$, $z = p^{2^d}$ is the probability that a fixed $d$-cube $K$ is contained in $f$; $(1 - z)^{n-d}$ and $q^{(n-d)2^d}$ are the probabilities that $K$ is furthermore maximal or isolated, respectively. Thresholds (cf. [2]) for small (maximal, isolated) subcubes are given in the following.

Theorem 2.1. ([16, 17]) Let $k$ be an arbitrary fixed non-negative integer and $p$
I n n - rp Tlien almost d1.f E F(tz, p ) contain: (i) no (maxiniul, isolutecl) k-cubc jiw p < g p , , (ii) asymptotically Elk k-cvbcs for p ~ p , , (iii) asymptotically ES, niuxitnul k-cubes f o r p , <
For the variances D21d and the second factorial moments E(Sd), we can prove the following upper bound> by standard techniques (cf. [15], [17]): (2.2
where
Setting Q O = O , the bound (2.3) remains true for d=O.
K. Weber
324
.Using (2.1)-(2.3), EX-+O or D * X = O ( ( E X ) ~is) shown for X = I d , S, or Jd and the corresponding domains of probabilities. Then by (1.3) and (l.6), Theorem 2.1 follows. We will say (maximal, isolated) k-cubes appear at p = p , and maximal (isolated) k-cubes disappear at p=p6 (p=pJ). Let k = O , 1, ... be fixed, x#O a real constant and p - x p , . Then the number of (maximal, isolated) k-cubes is asymptotically Poisson distributed with mean value 1,= ~ * " / 2 ~(cf. k ! [16, 171): Indeed E(Jk),-+Ai, r = I , 2, ..., is easily shown. On the other hand, the expectat'on of the number of m-cells contained in f is [ ( I H ) ~ =$2"n"'-'prn (cf. (1.10)). For t r ~ = 2 ~ +the 1 last term tends to zero and thererore, by (l.3), P(&=Jk)+l, i.e. P ( I k - = t ) - P ( J k = ~ )for p - x p l . But using (1.10) it is not difficult to prove the result for Ikdirectly, i.e. independently of the result for J, (cl'. [16]). S'nce isolated subcubes are special maximal ones, the result concerning S, now immediately follows. Furthermore, for
is easily shown (cf. [16]). Thus, the number of isolated k-cubes i s asymptotically Poisson distributed for these probabilities too. The special case k = 1 and x=O is contained in [12].I:ork=Oweget 1 for p = - - ,
2
P(J,=t)-+
1
J.
(W' - , t=O, t!
and1,=ex/2,inparticular 1,
... for the nuinber of isolated vertices
of random Boolean functions. As Saposhenko has shown that if p = 1/2, then almost all Boolean functions consist of one large component and at most p isolated vertices (cf. [lo, ll]), we obtain Theorem 2.2. ([16]) Let p = 1/2. Then
\[ \lim_{n\to\infty} P(f \text{ is connected}) = 1/\sqrt{e}. \]
Consider given probabilities $p = p(n)$. Which dimensions of subcubes are available for almost all functions $f \in F(n, p)$? The following theorem gives a general answer to this question.
Theorem 2.3. ([16]) Giuen p = p (n) such that
~
and put do = [log n - log log (I/p)l. Then almost a11f E F(n, p ) contain asyinptotically E/,,(+co) d-cubes of all rlitizensions d = O , ..., do - 1 (in the sense of ( I .8)) and no subcube of a ditneiuion d> do. d22d
Put
d
c
do- 1
and show that
cd+0 (cf. (1.7) and (2.2)). For d>do l-JN E l , d= 1 Hd+0 is easily obtained. A similar theorem can be formulated for maximal subcubes (cf. [17]). But additionally there are no maximal subcubes of dimensions 0, ..., d3 - 1, d3 =[-loglog(l/p)l, assuming d3- 130. Now the number of all subcubes of almost all f E F ( n , p ) is attacked. E~=--+---
Theorem 2.4. (1161) Let p=p(n) satisfy the inequalities
and put d, = [log log n - log log (l/p)J. Then we have for the number of subcubes of almost a l l f E F(n, p)
For np-+ 00 and log (I/p) x log ti we have
for p z l/n (including the case p
< I/ti when d2 is not dejned)
and for 2-"<
do- 1
do- 1
By (1.9) and Theorem 2.3,
2
d=O
Id(f)-
C
E I , for almost all f E F(n,p). Now
d=O
the g ven estimates are shown by examining the rat;o Eld+,/EId. We note that, assuming (2.5), (2.6) can not be miproved s nce neither E / d z = o ( E / d 2 , , )nor the converse IS true in general. Even for g ven probab lity p there are eas ly determined subsequences of natural number:, for both cases. Let, for example, p= 1/2, Then for n,=2" by (2.6), I(j)-E/,,,, but already for n,=2"- 1, / ( f ) - € / d 2 + ,
In both cases
But there is no simple asymptotic formula in n for J ( f ) even i f p=1/2. For n =(22'-2)3 we evaluate
-
I (J') E l , , + *
-
3n 5/3- log 3 --
--
4( log log n )2 - 1 0 g 3 V B V .
and V does not give even the order of the number I ( f ) . Further we mention that for p such that d,-+co and logd2=o(logn) (i.e. n - o , r 1 ) d p1-2-20""y")), 6 (2.6) implies ~ ( f ) = & "-"('))2'',i.e. in pariicular for p = I / 2 , ~(1') = i i l o g l u ~ n ( l- 0 ( 1 "2" (this last result is contained in [4]). Also the other bounds in Theorem 2.4 can not be improved under the given assumptions. So, for example, for p+O such that log(l/p)alogn, assertion (2.6) is not true i n general: Let p = l/,,h, then d,=l and l ( f ) - E / ~ 2 + E l , ~ + Now l. let l/J"p
-
-
-
-
(for more details see [21]). Similar results can be established for subcubes of random spanning subgraphs (see [19]). Roughly speaking, one has to replace pt,p5,&,, EJ,, ES,, EJ,, A,, lZ,do, dz, d3 by the corresponding parameters with bar (see Appendix). For example, we have that for p = p 4 (defined as above) Ji is asymptotically Poisson 2 - k k2k-1 distributed with mean value &=e"(l-2In particular, that implies ) P(J;=O) ( - P ( g is connected)-see [3, 13])-+l/e forp=1/2+0(l/n) (cf. Theorem 2.2).
3. Minimal coverings
In this section bounds for the length of almost all random subgraphs of the n-cube are given. Our first theorem answers the question which dimensions of subcubes are essential for the covering problem. For $d \ge 1$ and $a \in E^n$ define $V_{d,a}(f)$ as the number of $d$-cubes of $f$ containing the vertex $a$. Denoting the conditional expectation $E(V_{d,a} \mid a \in f)$ by $EV_d$ we have, independently of $a$,
\[ EV_d = \binom{n}{d} y, \]
where $y = p^{2^d - 1}$.
Theorem 3.1. (i) Let k 2 1 be an arbitrnry fixed integer and p sirch that
Then asymptotically p2" vertices of almost all f E F(n, p ) have the property that each of them will be covered by asymptoticdy EV, d-cubes o f f for all d = 1, .. k (see below for a more precise formulation). Furthermore, subcubes of dimensions d > k + 1 cover only o ( p 2 " ) vertices. (ii) Let p satisfy the inequalities .¶
$$\frac{\log\log n}{\log n} \lesssim p < 1 - \frac{1}{\sqrt{\log n}} \qquad (3.3)$$
and set $d_1 = [\log\log n + \log\log\log n - \log\log(1/p)]$. Then for almost all $f \in F(n,p)$ asymptotically $p2^n$ vertices of f have the property that each of them is covered by asymptotically $EV_d$ d-cubes of f for all dimensions $d = 1, \dots, d_1 - 1$ (more precisely, there are $\varepsilon, \delta \to 0$ such that
And d-cubes of f of dimensions $d \ge d_1 + 1$ cover only $o(p2^n)$ vertices.
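As a rough numerical illustration of the threshold in Theorem 3.1(ii), the following Python sketch evaluates the expected number of d-cubes through a vertex. It assumes the closed form $EV_d = \binom{n}{d}p^{2^d-1}$ suggested by the definition of y, base-2 logarithms, and the bracket in $d_1$ read as the integer part; the function names are ours and purely illustrative.

```python
import math

def expected_cubes(n, d, p):
    """EV_d under the assumed form C(n, d) * p**(2**d - 1)."""
    return math.comb(n, d) * p ** (2 ** d - 1)

def d1(n, p):
    """d_1 = [loglog n + logloglog n - loglog(1/p)], logs base 2 (assumed), 0 < p < 1."""
    ll = math.log2(math.log2(n))
    return math.floor(ll + math.log2(ll) - math.log2(math.log2(1 / p)))

if __name__ == "__main__":
    n, p = 2 ** 16, 0.5
    # EV_d is large up to d_1 and collapses one dimension later.
    for d in range(1, d1(n, p) + 2):
        print(d, expected_cubes(n, d, p))
```

For n = 2^16 and p = 1/2 the value of EV_d is still very large at d = d_1 and is already far below 1 at d = d_1 + 1, in line with the statement that higher-dimensional subcubes cover only a negligible share of the vertices.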
We sketch the proof: Recall that $|f| \sim p2^n$. By (1.2), d-cubes of f cover at most $\sim 2^d\,EI_d = EV_d\,p2^n$ vertices, so that the second part of (i) and (ii) follows from $EV_d \to 0$ for $d > k+1$ or $d > d_1+1$, respectively. Denote by $D^2V_d$ the conditional variance $D^2(V_{d,a} \mid a \in f)$ (also independent of a) and suppose $D^2V_d < \varepsilon_d(EV_d)^2$.
Setting (for a suitable $\varepsilon \to 0$) $F_d(f) = |\{a : a \in f \text{ and } |V_{d,a}(f) - EV_d| > \varepsilon EV_d\}|$ and using (1.4), we derive $EF_d \le p2^n\varepsilon_d/\varepsilon^2$. Furthermore, by (1.2), Y ( n { . f 'Fdcf')
l--
a,
, and for
fE
n {Fd
V D d
F ~ ( ~ )
($\bigcap$ and $\sum$ taken over $d = \alpha, \dots, \beta$). In accordance with these considerations our theorem follows from the upper bound
(cf. [15]) by setting $\varepsilon_d = d^3/EV_1 + d/EV_d$ and showing that $\sum_{d=1}^{k}\varepsilon_d \to 0$ or $\sum_{d=1}^{d_1-1}\varepsilon_d \to 0$, respectively. Indeed, then we can choose q>>jl-
CI
1
E~
d= 1
and simultaneously
A
Remark. In order to show that $EV_d \to 0$ for $d > d_1 + 1$ only the upper bound in (3.3) is necessary. Thus the second part of (ii) is actually true for $p < 1 - 1/\sqrt{\log n}$. This will be used in Section 4.

From Theorem 3.1 lower bounds for the length $L(f)$ of almost all $f \in F(n,p)$ can easily be derived. To construct coverings of small length we use two methods of Saposhenko ([8, 9]). The upper bound in the following theorem is obtained by a greedy algorithm: in every step take a subcube of f which, among all subcubes of f, covers the most vertices not covered up to this step.

Theorem 3.2. ([15]) (i) Let $k = 1, 2, \dots$ be arbitrarily fixed and p satisfy condition (3.2) from Theorem 3.1(i). Then for the length of almost all $f \in F(n,p)$ we have
(ii) Let p satisfy (3.3). Then we have
for almost all $f \in F(n,p)$ (for $d_1$ see Theorem 3.1(ii)).
(3.3) implies $\log\log n < d_1 < \tfrac{3}{2}\log\log n + \log\log\log n$, and so the ratio between upper and lower bounds in (3.6) is of order $\log\log n$. The lower bounds in Theorem 3.2 immediately follow from Theorem 3.1. To prove the upper bounds, the following lemma of Saposhenko is used.
Lemma 3.3. ([8]) Let a 0-1 matrix M of size $t \times r$ be given which has a $(1-\varepsilon)t \times r$ submatrix $M^*$ such that every row of $M^*$ contains at least $v$ 1's (cf. Fig. 1). Then every greedy covering of M contains at most
columns. (A set of columns covers a matrix M if every row of M has a common 1 with a chosen column.)
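The greedy covering referred to in Lemma 3.3 can be sketched as follows. This is a minimal illustration in Python, not the construction of [8]: the matrix is a list of 0/1 rows, ties between columns are broken by lowest index (our choice), and all-zero rows are ignored since no column can cover them. In the application to Theorem 3.2 one may think of the rows as the vertices of f and the columns as the subcubes of f.

```python
def greedy_column_cover(matrix):
    """Greedily choose columns until every coverable row shares a 1 with a chosen column."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    # Rows containing at least one 1; an all-zero row can never be covered.
    uncovered = {i for i in range(n_rows) if any(matrix[i])}
    chosen = []
    while uncovered:
        # Column that covers the most still-uncovered rows (first such column on ties).
        best = max(range(n_cols),
                   key=lambda j: sum(matrix[i][j] for i in uncovered))
        chosen.append(best)
        uncovered -= {i for i in uncovered if matrix[i][best]}
    return chosen

if __name__ == "__main__":
    M = [[1, 1, 0],
         [1, 0, 1],
         [0, 1, 1],
         [1, 0, 0]]
    print(greedy_column_cover(M))
```

Each pass removes at least one row from the uncovered set, so the loop terminates after at most as many steps as there are coverable rows.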
FIG. 1. The matrix M (t rows) and its submatrix $M^*$ ($(1-\varepsilon)t$ rows).
Fig. 2 shows how to derive from this statement the upper bounds in Theorem 3.2.
If ~ 2 v , j < c d ( E v d ) 2and 0 2 / d = o ( ( E / d ) 2 ) ,then
2 ~ d) = k or But setting &d=d3/Ev1+d//EVd (see (3.4)), we find that ~ ~ = o ( d /for d=d, - 1, respectively.
FIG. 2.

For small probabilities the upper bounds in the previous theorem can often be improved by applying another algorithm. We will call it the d-matching algorithm since it is based on the construction of a set of pairwise disjoint d-cubes: Let $x_1, \dots, x_n$ be the coordinates of the n-cube and let $Q_i$, $1 \le i \le l = \lfloor n/d \rfloor$, denote the set of all d-cubes of $E^n$ with edges in directions $x_{(i-1)d+1}, \dots, x_{id}$. The algorithm works as follows. In the first step we take all subcubes of $Q_1$ contained in f, then we take all subcubes of $Q_2$ contained in f and without common vertices with already taken subcubes, and so on. Let $\delta_i(a)$ be the conditional probability that a fixed vertex $a \in E^n$ is not covered by any chosen subcube after i steps of the prescribed algorithm provided that $a \in f$. Obviously $\delta_i(a)$ is the same for all $a \in E^n$ and therefore we will omit the argument a. By applying the recursion
we proved the following.

Lemma 3.4. ([15, 9]) We have

$$\delta_l \le (1 + y\nu l)^{-1/\nu}. \qquad (3.7)$$
Theorem 3.5. ([15]) Suppose $d \ge 1$ and let p satisfy
Then for the length of almost all $f \in F(n,p)$ we have

$$L(f) \lesssim p2^{n-d}. \qquad (3.9)$$
Since the expectation of the number of noncovered vertices after l steps of the d-matching algorithm is $\delta_l\,p2^n$, by (1.2) we arrive at $L(f) \lesssim p2^{n-d} + \delta_l\,p2^n$. Now easy calculations show that (3.7) implies $\delta_l = o(2^{-d})$ for p with (3.8). (A set of $p2^{n-d}$ pairwise disjoint d-cubes of f covering all but $o(p2^{n-d})$ vertices of f will be called an optimal d-matching.) Note that for $d = O(1)$ (3.8) may be replaced by $pn^{1/(2^d-1)} \to \infty$.
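The d-matching algorithm described above can be sketched in a few lines of Python. The encoding is our own assumption: vertices of the n-cube are the integers 0, ..., 2^n - 1, coordinate $x_j$ is bit j - 1, and f is a set of such integers. The helper names (`cube_vertices`, `d_matching`, `random_function`) are hypothetical.

```python
import random

def cube_vertices(base, block):
    """All vertices obtained from base by varying the bits of block (base has them 0)."""
    verts, sub = [], block
    while True:
        verts.append(base | sub)
        if sub == 0:
            return verts
        sub = (sub - 1) & block  # next submask of block

def d_matching(f, n, d):
    """Run the l = n // d steps; return the chosen d-cubes and the set of covered vertices."""
    covered, chosen = set(), []
    for i in range(n // d):
        block = ((1 << d) - 1) << (i * d)  # the d free coordinates of step i + 1
        for base in sorted(f):
            if base & block:
                continue  # each cube is represented by its vertex with free bits 0
            cube = cube_vertices(base, block)
            if all(v in f and v not in covered for v in cube):
                chosen.append(cube)
                covered.update(cube)
    return chosen, covered

def random_function(n, p, rng):
    """Random vertex subset of the n-cube: each vertex kept independently with probability p."""
    return {a for a in range(1 << n) if rng.random() < p}

if __name__ == "__main__":
    rng = random.Random(0)
    f = random_function(10, 0.8, rng)
    chosen, covered = d_matching(f, 10, 2)
    print(len(f), len(chosen), len(f) - len(covered))
```

Cubes taken within one step are automatically disjoint, because every vertex lies in exactly one cube of each $Q_i$; disjointness across steps is enforced by the `covered` check. The fraction of vertices of f left uncovered after the last step is an empirical counterpart of $\delta_l$.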
By the combination of Theorems 3.2 and 3.5 we obtain the following results for small probabilities p (if we compare the bounds (3.6) and (3.9), then we mean of course the upper bound from (3.6)). ~/n<(p<
$$L(f) \sim p2^{n-1}. \qquad (3.10)$$
again (3.9) is better than (3.6).
here (3.6) is better. 1111
'
'3
<
5:
$p2^{n-3} \lesssim L(f) \lesssim p2^{n-2}$,
by (3.9); (3.6) yields the weaker bound $L(f) \lesssim p2^{n-3}(3\ln 2 + 1)$. Note that $pn\ (= EV_1) \to 0$ implies that edges cover only $o(p2^n)$ vertices, i.e. asymptotically $p2^n$ vertices are isolated and $L(f) \sim p2^n$ trivially follows. On the other hand, for $pn \to \infty$ there is an optimal 1-matching and $L(f) \sim p2^{n-1}$. (3.10) is the only nontrivial asymptotic formula for L we know.¹ It is natural to ask for a maximum dimension d of an optimal d-matching for almost all $f \in F(n,p)$ for given p. Clearly the prescribed method yields only lower bounds for this dimension, e.g. assuming $p \le 1/2$, $d_4 = [\log\log n - \log\log\log n - \log\log(1/p)]$ can easily be established as such a lower bound. Of course, for $p > 1/2$ there exist optimal d-matchings for $d = \lfloor \log\log n - \log\log\log n \rfloor$ too, but we cannot prove their existence for $d = d_4$. Suppose $p = 1/2$, then $d_4 = [\log\log n - \log\log\log n]$ and easy calculations show that (3.8) is already not valid for $d = d_4 + 2$. Hence Theorem 3.5 gives an upper bound exceeding the upper bound from Theorem 3.2(ii) by the factor $\log\log n$ ($d_1 = [\log\log n + \log\log\log n]$ for $p = 1/2$). Similar results can be derived for almost all $g \in G(n,p)$ (cf. [19]). One has to - 112" and replace $EV_d$ by $\overline{EV}_d$, $d_1$ by $\bar d_1$ (see Appendix), (3.2) by n-'/'"-' <
4. Irredundant coverings
In this section bounds for the number T and the maximum length $L^+$ of irredundant coverings for almost all subgraphs of the n-cube are given. The following theorem shows that almost all $f \in F(n,p)$ have many irredundant coverings of great length.
¹ In [5], $2^n/(\log n\,\log\log n)$ is determined as the order of magnitude of $L(f)$ for almost all $f \in F(n, 1/2)$.
Theorem 4.1. ([17]) Let p satisfy
and put $d = d_2 + 1$. Then almost all $f \in F(n,p)$ have
irredundant coverings, each containing asymptotically $pN$ maximal subcubes. Further assuming $d = d_2 + 1$, (4.2) leads in particular to
$$\log T(f) \sim pN\,d\log n \qquad (4.3)$$
for $p \ge n^{-o(1)}$ (it implies $d_2 \to \infty$) and
$$\log T(f) \sim \bigl(d\log n - (2^d-1)\log(1/p)\bigr)\,pN \qquad (4.4)$$
for $\log(1/p) \asymp \log n$ (it implies $d_2 = O(1)$). For $p = 1/2$ we have $d_2 = [\log\log n]$ and (4.3) gives the result of Saposhenko [7]. As a further example let $p = 1/\sqrt{n}$. Then $d_2 = 1$ and by (4.4) $\log T(f) \sim N\log n/(2\sqrt{n})$.
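The two examples above can be checked numerically. The sketch below is only an illustration under stated assumptions: logarithms are taken to base 2, the bracket in $d_2 = [\log\log n - \log\log(1/p)]$ is read as the integer part, and (4.4) is read as $\log T(f) \sim (d\log n - (2^d-1)\log(1/p))\,pN$; the function names are ours.

```python
import math

def d2(n, p):
    """d_2 = [loglog n - loglog(1/p)] with base-2 logs and floor brackets (assumed)."""
    return math.floor(math.log2(math.log2(n)) - math.log2(math.log2(1 / p)))

def log_T_coefficient(n, p, d):
    """Factor multiplying pN in (4.4): d*log n - (2**d - 1)*log(1/p)."""
    return d * math.log2(n) - (2 ** d - 1) * math.log2(1 / p)

if __name__ == "__main__":
    n = 2 ** 16
    p = 1 / 256  # p = 1/sqrt(n) for this n
    d = d2(n, p) + 1
    # The coefficient should equal (log n)/2, giving log T ~ N log n / (2 sqrt(n)).
    print(d2(n, p), d, log_T_coefficient(n, p, d), math.log2(n) / 2)
```

With $p = 1/\sqrt{n}$ one has $\log(1/p) = \tfrac12\log n$, so $d = 2$ and the coefficient is $2\log n - \tfrac32\log n = \tfrac12\log n$; multiplying by $pN = N/\sqrt{n}$ gives the stated value.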
The upper bound in (4.2) is obtained from T ( f ) < .
.
considering that $S(f) \lesssim EV_{d_2+1}\,pN$. Now let us sketch the lower bound. First we introduce the probability space $F(n,p,p^*)$ of all pairs of Boolean functions $(f,f^*)$ with $f^* \subseteq f \subseteq E^n$. The probability of such a fixed pair is defined as $\mathrm{Prob}(f,f^*) = P(f)P_f(f^*)$, where $P(f) = p^{|f|}q^{N-|f|}$ is the customary probability measure on $F(n,p)$ and $P_f(f^*) = (p^*)^{|f^*|}(q^*)^{|f|-|f^*|}$, $q^* = 1 - p^*$, is the probability for a function $f^* \subseteq f$ if the vertices of $f^*$ are chosen independently and with the same probability $p^*$ among the vertices of f. We say almost all $(f,f^*) \in F(n,p,p^*)$ have a certain property Q if $\mathrm{Prob}((f,f^*) \text{ has } Q) \to 1$. By the properties of the binomial distribution it is clear that, assuming $pp^*2^n \to \infty$, almost all $(f,f^*) \in F(n,p,p^*)$ satisfy both $|f| \sim pN$ and $|f^*| \sim pp^*N$. Furthermore, the following assertion is evident.
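The two-stage construction of a pair $(f, f^*)$ is easy to simulate. The following Python sketch is an illustration only: vertices are encoded as integers 0, ..., N - 1 with N = 2^n, and the function name `sample_pair` is ours.

```python
import random

def sample_pair(n, p, p_star, rng):
    """Draw f from F(n, p), then keep each vertex of f in f* independently with probability p*."""
    f = {a for a in range(1 << n) if rng.random() < p}
    f_star = {a for a in sorted(f) if rng.random() < p_star}
    return f, f_star

if __name__ == "__main__":
    rng = random.Random(0)
    n, p, p_star = 12, 0.5, 0.25
    f, f_star = sample_pair(n, p, p_star, rng)
    N = 1 << n
    # Compare |f| with pN and |f*| with p p* N.
    print(len(f), p * N, len(f_star), p * p_star * N)
```

Since $|f|$ is binomial with mean $pN$ and, given f, $|f^*|$ is binomial with mean $p^*|f|$, both counts concentrate around $pN$ and $pp^*N$ once $pp^*N$ is large.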
Lemma 4.2. ([17]) Suppose $pp^*N \to \infty$ and let $A \subseteq F(n,p,p^*)$ satisfy $\mathrm{Prob}(A) \to 0$. Then almost all $f \in F(n,p)$ contain an $f^*$ ($\subseteq f$), $|f^*| \sim p^*|f|$ ($\sim pp^*N$), such that $(f,f^*) \notin A$.

Exactly the maximal subcubes of f are called maximal subcubes of $(f,f^*)$. A subcube K is said to be an embedded subcube of $(f,f^*)$ if $|K - f^*| \le 1$. For $d \ge 1$ and each vertex $a \in E^n$ we define $W_{d,a}(f,f^*)$ as the number of maximal embedded d-cubes K of $(f,f^*)$ with $a \in K$ and $K \subseteq f^* \cup a$. Both the conditional expectation $E(W_{d,a} \mid a \in f)$ and the variance $D^2(W_{d,a} \mid a \in f)$ are independent of a, thus we employ the abbreviations $EW_d$ and $D^2W_d$ for them. Clearly, we have
$$EW_d = \binom{n}{d}\,y\,y^*(1-z)^{n-d},$$
where $y = p^{2^d-1}$, $y^* = (p^*)^{2^d-1}$, $z = p^{2^d}$; $y(1-z)^{n-d}$ is the probability that a fixed d-cube K containing the arbitrary fixed vertex a is a maximal subcube of $(f,f^*)$ under the condition $a \in f$, and $y^*$ is the probability for $K - a \subseteq f^*$. Set for given $\varepsilon$ and d (which will be chosen later) $B(f,f^*) = \{a : a \in f \text{ and } |W_{d,a}(f,f^*) - EW_d| \ge \varepsilon EW_d\}$ and $b(f,f^*) = |B(f,f^*)|$. If $D^2W_d < \varepsilon_d(EW_d)^2$, then we find for the expectation $Eb$
I
If*/
If*l
Assuming $p < 1 - 1/\sqrt{\log n}$, these subcubes cover at most p ~ , p N 2 ~ ' $+\,o(pN)$ vertices, since subcubes of dimensions greater than $d_1$ cover only $o(pN)$ vertices (cf. the remark after Theorem 3.1). The remaining vertices may be covered by maximal embedded d-cubes. First construct an irredundant covering $C_2$ of $f^* - B(f,f^*)$; it covers at most $|f^*|$ vertices. There remain ~ > p N - c p ~ ~ , p N- 2 p p~ *' N - o ( p N ) noncovered vertices $a \in f - (f^* \cup B(f,f^*))$ which can be covered by maximal embedded d-cubes in at least (EW,( I - o( I))" different ways. It is clear that, possibly omitting certain members of $C_1$ and $C_2$, all these different ways yield different irredundant coverings of f. Now let p satisfy (4.1) and put $d = d_2 + 1$. Then using the bound
(cf. [17]) there is a $p^* \to 0$ such that with $\varepsilon_d = d^3/EW_1 + d/EW_d$, ~,2,'+0 and consequently ~ = p N ( 1-o(l)) and f has
irredundant coverings of length $\ge pN(1 - o(1))$.
Now $L^+(f) \sim pN$ follows immediately by the trivial upper bound $L^+(f) \lesssim pN$. Furthermore, $p^* \to 0$ can be chosen such that $EW_d = (EV_d)^{1-o(1)}$, and the lower bound in (4.2) is established too. Moreover, we can formulate the following.
Corollary 4.3. ([17]) Suppose the conditions of Theorem 4.1 are satisfied and put $d = d_2 + 1$. Then there exists a probability $p^*$ such that for almost all $f \in F(n,p)$ we have

$$\log T(f) \sim pN\log EW_d.$$
An analogous result can be derived for almost all $g \in G(n,p)$ (see [18]). Instead of $F(n,p,p^*)$ the probability space $H(n,p,p^*) = G(n,p) \times F(n,p^*)$ consisting of all pairs $(g,f)$, $g \in G(n,p)$, $f \in F(n,p^*)$, is considered. Thereby the probability of a pair is defined as $\mathrm{Prob}(g,f) = P(g)P^*(f)$, where $P$ ($P^*$) is the probability on $G(n,p)$ ($F(n,p^*)$). $EV_d$ and $EW_d$ have to be replaced by $\overline{EV}_d$ and $\overline{EW}_d$, respectively (see Appendix). The role of $d_1$ and $d_2$ is played here again by the corresponding dimensions with bar.
Concluding remark. The paper [20] contains a discussion of random graphs in the n-cube and its relations to other random graph models.

Acknowledgements

The author wishes to thank the editors for their helpful comments.
Appendix - Notations and abbreviations
A*=-
II
292d
N(1-z)'
Probabilities
1 +x2-' --+o
(+))
Dimensions

$d_0 = [\log n - \log\log(1/p)]$, $\bar d_0 = [\log n - \log\log n - \log\log(1/p)]$
$d_1 = [\log\log n + \log\log\log n - \log\log(1/p)]$, $\bar d_1 = [\log\log n - \log\log(1/p)]$
$d_2 = [\log\log n - \log\log(1/p)]$, $\bar d_2 = [\log\log n - \log\log\log n - \log\log(1/p)]$
$d_3 = [1 - \log\log(1/p)]$, $\bar d_3 = [-\log\log(1/p) - \log(-\log\log(1/p))]$
References

[1] M. Ajtai, J. Komlós and E. Szemerédi, Largest random component of a k-cube, Combinatorica 2 (1982) 1, 1-8.
[2] P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci., Ser. A, 5 (1960) 17-61.
[3] P. Erdős and J. Spencer, Evolution of the n-cube, Comput. Math. Appl. 5 (1979) 33-39.
[4] W. W. Glagolev, Some estimations for disjunctive normal forms of Boolean functions, Problemy Kibernet. 19 (1967) 75-94 (in Russian).
[5] A. D. Korshunov, On the length of the shortest disjunctive normal form for Boolean functions, Diskretny Analiz 37 (1981) 9-41 (in Russian).
[6] A. A. Saposhenko, Disjunctive Normal Forms-Metric Theory (Moscow, 1975) (in Russian).
[7] A. A. Saposhenko, On the greatest length of an irredundant disjunctive normal form of almost all Boolean functions, Mat. Zametki 4 (1967) 6, 649-658 (in Russian).
[8] A. A. Saposhenko, On the complexity of disjunctive normal forms obtaining by a greedy algorithm, Diskretny Analiz 21 (1972) 62-71 (in Russian).
[9] A. A. Saposhenko, Lectures about minimization theory (Moscow, 1973) (unpublished).
[10] A. A. Saposhenko, Metric properties of almost all Boolean functions, Diskretny Analiz 10 (1967) 91-119 (in Russian).
[11] A. A. Saposhenko, Geometric structure of almost all Boolean functions, Problemy Kibernet. 30 (1975) 227-261 (in Russian).
[12] E. Toman, Geometric structure of random Boolean functions, Problemy Kibernet. 35 (1979) 111-132 (in Russian).
[13] E. Toman, On the probability of connectedness of random subgraphs of the n-cube, Math. Slovaca 30 (1980) 3, 251-265 (in Russian).
[14] K. Weber, On some concepts for minimality of disjunctive normal forms, Problemy Kibernet. 36 (1979) 129-158 (in Russian).
[15] K. Weber, The length of random Boolean functions, Elektron. Informationsverarb. u. Kybernet. EIK 18 (1982) 12, 659-668.
[16] K. Weber, Subcubes of random Boolean functions, EIK 19 (1983) 7/8, 365-374.
[17] K. Weber, Prime implicants of random Boolean functions, EIK 19 (1983) 9, 449-458.
[18] K. Weber, Irredundant disjunctive normal forms of random Boolean functions, EIK 19 (1983) 10/11, 529-534.
[19] K. Weber, Subcube coverings of random spanning subgraphs of the n-cube, Math. Nachrichten 120 (1985) 327-345.
[20] K. Weber, Random graphs - a survey, Rostock. Math. Kolloq. 21 (1982) 83-98.
[21] K. Weber and B. Klipps, Boolean functions of maximum length and Sperner-type conditions about the set of faces of the n-cube, EIK 19 (1983) 3, 187-193.
Annals of Discrete Mathematics 28 (1985) 337-348
© Elsevier Science Publishers B.V. (North-Holland)

RANDOM GRAPHS AND POLYMERISATION PROCESSES

Peter WHITTLE
Statistical Laboratory, University of Cambridge, Cambridge, England
It is shown that many distributional problems for random graphs (and so for polymerisation processes) can be quickly resolved by appeal to a classic combinatorial identity. It is also shown that certain natural equilibrium distributions for configuration in an arc-building/breaking process also supply the time-dependent solution for a stochastic model of pure arc-building.
1. Introduction

A graph on N labelled nodes can be regarded as a pattern of bonding between N distinguishable particles, or units. It thus provides a representation of a pattern of interaction between these units in which bonding or its absence is the only relationship, notions such as physical distance or interaction as a function of distance being absent. This skeletal version of interaction is that generally adopted in the literature on polymerisation: the study of the molecules formed by bond-formation between units. The nodes of the graph represent units, and are supposed distinguishable, i.e. labelled. The nodes of the graph may be coloured if different types of unit are possible. The arcs of the graph represent bonds. If the arcs are directed then they indicate a directional property of the bond; perhaps that of being initiated by the unit from which the arc emerges. Multiple bonding and self-bonding are not excluded, in general. The connected components of the graph represent molecules, or polymers.

Suppose that one allows the bonding pattern to be random, so that one is effectively considering random graphs; i.e. a probability distribution over the set of possible graphs on the N assigned nodes. Then there is considerable interest in the nature of the statistics of the polymers induced by the prescribed statistics on bonding. This interest has been followed in both the mathematical and the physical literature, i.e. in the literature on random graphs and in the literature on polymerisation.
The prescription adopted most often in the random graph literature (see e.g. Erdős and Rényi [1], Erdős and Spencer [2], Stepanov [3, 4]) is that only single, undirected arcs between distinct nodes may occur, and occur independently (for distinct node-pairs) with constant probability p. This may seem like the simplest prescription, but is not (see Section 3). On the other hand, it is much simpler than physical realism would demand (see Section 4). However, it does produce the important phenomenon of a phase transition. If we set $p = c/N$ then for N large there is a critical value C of c. For c
2. The fundamental identity and its application

If there is no question of units changing their type (i.e. of nodes having a variable colour) then the configuration $\mathcal{C}$ of the graph is described completely by the pattern of directed arcs between the N labelled nodes. That is, by $s = (s_{ab})$ where $s_{ab}$ is the number of arcs from node a to node b ($a, b = 1, 2, \dots, N$). By $\sum_{\mathcal{C}}$ we indicate a summation over all configurations for prescribed N, by $\sum_N$ we indicate a summation over $N = 1, 2, 3, \dots$. The lemma is one that appears in various contexts in combinatorics, and for the case of the enumeration of graphs can be given the following expression.
Lemma 1. Suppose that to each graph $\mathcal{C}$ on an arbitrary number of labelled nodes can be attached a weight $w(\mathcal{C})$ with the properties: (i) that $w(\mathcal{C})$ is invariant under permutation of the nodes, and (ii) that $w(\mathcal{C}) = w(\mathcal{C}_1)w(\mathcal{C}_2)$ if $\mathcal{C}$ can be decomposed into mutually unconnected graphs $\mathcal{C}_1$ and $\mathcal{C}_2$. Then
where the sum $\sum'$ covers all distinct connected graphs, N is the number of nodes of $\mathcal{C}$, and the term for $N = 0$ is taken as unity.
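Assuming identity (1) is the usual exponential formula, i.e. that the exponential generating function of all weighted graphs equals the exponential of the exponential generating function of the connected ones, it can be checked by brute force in a small case. The Python sketch below takes the simplest admissible weight, $w \equiv 1$ on simple undirected labelled graphs; this is an illustrative choice of ours rather than the directed multigraph setting of the text, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def connected(n, edges):
    """Depth-first search from node 0; True if all n nodes are reached."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in adj[v] - seen:
            seen.add(u)
            stack.append(u)
    return len(seen) == n

def count_graphs(n):
    """(number of all graphs, number of connected graphs) on n labelled nodes."""
    pairs = list(combinations(range(n), 2))
    total = conn = 0
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if mask >> i & 1]
        total += 1
        conn += connected(n, edges)
    return total, conn

def exp_series(c, order):
    """Coefficients of exp(sum_k c[k] x^k) up to x^order, using E' = C' E."""
    e = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        e[n] = sum(k * c[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def check_identity(max_n):
    counts = {n: count_graphs(n) for n in range(1, max_n + 1)}
    c = [Fraction(0)] + [Fraction(counts[n][1], factorial(n)) for n in range(1, max_n + 1)]
    e = exp_series(c, max_n)
    return all(e[n] == Fraction(counts[n][0], factorial(n)) for n in range(1, max_n + 1))

if __name__ == "__main__":
    print(check_identity(5))
```

The constant term 1 of the exponential corresponds to the convention that the term for N = 0 is taken as unity.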
For proofs of the fundamental identity (1) we can refer to, for example, Percus [19, p. 52]. Suppose now that $P_N(s)$ is the probability of configuration s on the given N nodes, and that
where i indexes the possible types of connected graph (i.e. of polymer at the completest level of prescription) and $m_i$ is the number of times component i occurs in the graph. The proportionality constant in (2) will be a function of N alone which normalises the distribution over s for fixed N. The weighting $Q_N(s)$ evidently satisfies the conditions of the lemma, and identity (1) then becomes
Now, typically, one will classify polymers at some lower level of description, indexed by r, say. We shall give examples in Sections 3 and 4. We shall refer to a polymer of type r as an r-mer. Presumably the description will also determine the number of units in the r-mer; let us denote this by R. Let $r(i)$ and $R(i)$ denote the polymer type and size of a component i and let us define

$$g_r = \sum_{i:\, r(i) = r} \gamma_i.$$

Let us denote the number of r-mers in the configuration by $n_r$, and define the probability generating function

$$\Pi_N(z) = E\Bigl(\prod_r z_r^{n_r}\Bigr).$$

This defines the polymer statistics, which we assume to be the object of interest.
Theorem 1. Assume the distribution of complete configurations given by (2). Then $\Pi_N(z)$ is proportional to the coefficient of $\theta^N$ in the expansion of
in powers of the scalar $\theta$.

Proof. We take the identity (1) with
and obtain
The coefficient of $\theta^N$ in the left-hand member is indeed proportional to $\Pi_N(z)$. □
It is not necessary that either of the infinite series in (3) should converge for any $\theta$. It is simply asserted that coefficients of $\theta^N$ on the two sides are equal for all N. However, if the series do converge for some $\theta$, then the identity has an interesting interpretation. It asserts that there is an open version of the process (i.e. a version in which N is also a random variable) in which the $n_r$ are independent Poisson variables. The two conditions of Lemma 1, which lead to this conclusion, are: (i) the units are statistically identical, even if distinguishable, and (ii) the appearances of polymers would be independent events were it not for constraints on N.

The solution for $\Pi_N(z)$ yielded by Theorem 1 is moderately explicit if the coefficients $g_r$ can in fact be calculated. What can indeed often be calculated is the sum

$$G(\theta) = \sum_r \theta^R g_r = \sum_i \theta^{R(i)}\gamma_i.$$
If we set all the $z_r$ equal to unity in identity (3) then this reduces to
where
and these expressions are often computable. It may then be that $g_r$ can be calculated from $G(\theta)$ if the $\gamma_i$'s include variables which can serve as marker variables. Indeed, the appropriate level of description of the polymers is fixed by the level at which such marker variables exist, and $g_r$ is extractable from $G(\theta)$. Otherwise expressed, one distinguishes just those polymers which are energetically distinguishable. (Use of the term “energy” is a reference to statistical mechanics. A factor in the configuration distribution which reflects interaction can be regarded as related to the potential energy of the configuration.) We shall give examples in the next two sections.
3. Poisson bonding

Suppose that we choose
Here V is the “volume” of the region within which the N units are constrained, and we assume that
where $\lambda$ is a parameter measuring intensity of bonding. The factor $V^N$ included in (5) is constant in that it is independent of s, but it becomes important when we consider a distribution of polymers over more than one “compartment” in space. The assumption implicit in (5) is that the $N^2$ quantities $s_{ab}$ are independent Poisson variables, each with expectation $h/2$. Later we shall consider the thermodynamic limit, in which N and V become infinite, the density $\rho = N/V$ being held fixed. The point of the dependence (6) is then that the total expected number of bonds issuing from a prescribed unit remains fixed at $Nh = \lambda\rho$. This Poisson assumption seems the simplest possible: polymer statistics are “purely combinatorial” in that all polymers with R units and L bonds have the same “energy”. These two statistics then constitute the natural description level for polymers in this model. The Poisson assumption leads to more natural mathematics than the assumption of independently occurring single undirected bonds mentioned in the Introduction. The assumption allows the phenomena of self-bonding and multiple bonding, which may seem unnatural. It is not clear that they are, but their occurrence has negligible effect in the thermodynamic limit.
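The Poisson prescription is straightforward to simulate. The Python sketch below is an illustration only, assuming that the $N^2$ arc counts $s_{ab}$ are independent Poisson variables with mean $h/2$ and that $h = \lambda/V$; the sampler is Knuth's elementary method (adequate for the small means used here), and all function names are ours. Each polymer is reported by the pair (R, L) of its number of units and number of bonds.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method for a Poisson variate with small mean."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_bonds(n_units, lam, volume, rng):
    """Matrix s with s[a][b] ~ Poisson(h/2), h = lam / volume (assumed form of (6))."""
    h = lam / volume
    return [[poisson(h / 2, rng) for _ in range(n_units)] for _ in range(n_units)]

def polymers(s):
    """Connected components of the bond pattern, as sorted (R, L) pairs."""
    n = len(s)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a in range(n):
        for b in range(n):
            if s[a][b]:
                parent[find(a)] = find(b)
    size, bonds = {}, {}
    for a in range(n):
        root = find(a)
        size[root] = size.get(root, 0) + 1
        bonds[root] = bonds.get(root, 0) + sum(s[a])  # arcs emerging from unit a
    return sorted(((size[r], bonds[r]) for r in size), reverse=True)

if __name__ == "__main__":
    rng = random.Random(0)
    print(polymers(sample_bonds(50, 0.8, 50.0, rng))[:5])
```

Every polymer found this way satisfies $L \ge R - 1$, with equality exactly for trees; self-bonds and multiple bonds show up as an excess of L over R - 1.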
We have
so that
The series (8) defines $G(\theta)$ as a power series in $\theta V$ and $\lambda/V$,
where $\omega_{RL}$ is a purely combinatorial term. The pair R, L constitutes the natural prescription r at this level, with $\theta$ and $h$ being the marker variables for R and L respectively. We have the identification
Note now that

$$L \ge R - 1,$$

with equality if and only if the polymer is a tree. We see then from (10) that tree polymers have an abundance of order V while other polymers (let us call them ring-polymers) have an abundance of lower order, the order decreasing with the amount of cyclisation of the polymer. One can say that there are so many ring-polymers that the abundance of any given one must be low, although their total effect may be appreciable. Note that the series (8) always diverges, as must then the series (9). This is because of the multitude of contributions from ring-polymers. If one considers the contributions to these series from tree-polymers alone then the series converge so long as $\theta$ does not exceed a value which is related to the phenomenon of gelation.

As indicated in [16], by far the easiest way to detect gelation is to examine the quantity $P_{N+n}P_{N-n}$ as a function of n, where $P_N = Q_N/N!$. This quantity is the probability that 2N units should distribute themselves over two disjoint regions each of volume V as $(N+n, N-n)$, bonding being allowed only within each volume according to the prescription (5). In the sol-state $P_{N+n}P_{N-n}$ has its maximum at $n = 0$ (corresponding to statistical equi-distribution between the two regions); in the gel-state it has symmetric maxima at $n = \pm n_0$, say (corresponding to a tendency for the units to clump in one region or the other). Using then the criterion
as a criterion for gelation and the evaluation (7), we find that the condition for gelation in the thermodynamic limit is
This agrees with the value deduced by much less direct arguments.
4. Bond interactions
Let a unit which has formed j bonds (in either direction) be denoted an $a_j$, and let $n_j$ be the number of $a_j$'s in the N given units ($j = 0, 1, 2, \dots$). Suppose the Poisson specification (5) now modified to
The assumption is then that the basic Poisson statistics are modified by bond-number dependent terms, the term $H_j$ being high or low according as a unit with j bonds is relatively favoured or disfavoured. Alternatively expressed, the energy of the configuration is a function only of the bond-numbers j, and is low or high according as $H_j$ is high or low. It is shown in Whittle [IS] that $Q_N$ now has the evaluation
where
The natural description r of a polymer is now $r = (r_0, r_1, r_2, \dots)$, where $r_j$ is the number of $a_j$'s the r-mer contains. The $H_j$ are natural marker variables for the $r_j$.
So still are $\theta$ and $h$ for R and L, although these are unnecessary in that the $r_j$ determine R and L
The series (4) with the evaluation (14) for $Q_N$ determines $G(\theta)$ as a power series in $\theta V$, $\lambda/V$ and the $H_j$:

$$G(\theta) = \sum_r \omega_r V^{R-L}\lambda^L\theta^R \prod_j H_j^{r_j},$$

where $\omega_r$ is purely combinatorial, and we have
We use the criterion (11) for criticality, with Q N having evaluation (14). In the thermodynamic limit this yields the criterion
for gelation, where
5 is determined by
All this work can be generalised naturally to the case of several unit-types, and to the case when there is a differential effect between inter- and intra-polymer bonding - an effect which encourages or discourages cyclisation (see Whittle [17, IS]). Reference might be made to the paper by Spouge [20] which relates the methods of this paper to Gordon's branching process model. It should perhaps be emphasised that in the configuration s of (5) and (13) nodes are distinguishable, the ends of a loop are distinguishable (since the loop is directed), but the $s_{ab}$ bonds between given nodes a and b are not distinguishable. It is just because of this latter fact that the $(s_{ab}!)^{-1}$ term occurs in (5) and (13).
5. Equilibrium solutions as time-dependent solutions
Flory's original approach was not to consider the equilibrium version of association and dissociation, but rather to consider the time-dependent version of a purely associative process. That is, a process in which bonds form but do not break. He then deduced the occurrence of gelation at a time when the degree of bonding had reached the critical value. This view expresses the experience of the polymer chemist: when physical conditions are created which favour polymerisation, then polymerisation proceeds progressively, and, after some time, gelation is observed as a rather definite event. Flory's mathematical approach was a mixture of statistical mechanics and heuristics. However, one is impelled to ask whether equilibrium distributions such as (5) and (13) might not be seen as time-dependent distributions for an appropriate stochastic process of pure bonding. Consider a time-dependent version of (5)
Here the bonding parameter $h = \lambda/V$ has been replaced by $\lambda t/V$. We have also dropped the N subscript for simplicity. Distribution (15) implies that the $s_{ab}$ are independent Poisson variables with expectation $\lambda t/2V$ at time t. This is exactly the distribution that would hold for a process, completely dissociated at $t = 0$, in which for any a, b there is a fixed probability intensity $\lambda/2V$ that a new bond will form from unit a to unit b, independent of the previous history of the process. This is the simplest stochastic bonding process one can imagine, and one sees from (12) that it gels at time
Let us write the transition in which $s_{ab}$ is increased by one as $s \to s + e_{ab}$, and denote the corresponding transition intensity as $\Lambda(s, s+e_{ab})$. Then the Poisson model we have just considered is characterised by

$$\Lambda(s, s+e_{ab}) = \lambda/2V \qquad (a, b = 1, 2, \dots, N),$$

these being the only transitions possible.
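The pure bonding process with constant intensity can be simulated directly as a continuous-time Markov chain. The sketch below is a minimal illustration in Python, assuming the intensity $\lambda/2V$ for every ordered pair (a, b), so that events arrive at total rate $N^2\lambda/2V$ and each event picks an ordered pair uniformly; the function name is ours.

```python
import random

def simulate_pure_bonding(n_units, lam, volume, t_end, rng):
    """Start fully dissociated; each ordered pair (a, b) gains bonds at rate lam / (2 * volume)."""
    s = [[0] * n_units for _ in range(n_units)]
    total_rate = n_units * n_units * lam / (2 * volume)
    t = rng.expovariate(total_rate)  # time of the first bonding event
    while t <= t_end:
        a, b = rng.randrange(n_units), rng.randrange(n_units)
        s[a][b] += 1
        t += rng.expovariate(total_rate)
    return s

if __name__ == "__main__":
    rng = random.Random(0)
    n_units, lam, volume, t_end = 20, 1.0, 20.0, 2.0
    runs = [simulate_pure_bonding(n_units, lam, volume, t_end, rng) for _ in range(200)]
    mean_bonds = sum(sum(map(sum, s)) for s in runs) / len(runs)
    # Total bond count should average N^2 * lam * t / (2V) if each s_ab is Poisson(lam t / 2V).
    print(mean_bonds, n_units ** 2 * lam * t_end / (2 * volume))
```

Because the transition intensity does not depend on the current state, the entries of s at time t are independent Poisson counts, which is the time-dependent reading of the Poisson specification described above.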
Consider now the more general distribution
in which the time-dependent Poisson statistics are modified by a time-independent term $\Phi(s)$, representing interaction effects associated with the potential energy of configuration s. This can indeed be seen as the time-dependent solution, either exactly or in the thermodynamic limit, of a natural stochastic bonding process.
Theorem 2. Consider the pure bonding process for which
and let $P(s,t)$ denote the distribution of s at time t for this process, given complete dissociation at time $t = 0$. Let (18)
denote the normalised version of expression (16). Then P, P* respectively satisfy
$$\frac{\partial P^*(s,t)}{\partial t} = \sum_a \sum_b P^*(s - e_{ab}, t)\,\Lambda(s - e_{ab}, s) - \Lambda^*(t)P^*(s,t), \qquad (20)$$

where

$$\Lambda^*(t) := \sum_s P^*(s,t)\,\Lambda(s).$$

If for every t, $\Lambda(s)$ is constant (in s) for all s such that $P^*(s,t) > 0$, then $P(s,t) = P^*(s,t)$.
Proof. Equations (19), (21) constitute the usual Kolmogorov forward equation for the Markov process specified by (17). One readily verifies that expression (16) satisfies
and hence that its normalised version satisfies (20). The two equations (19), (20) evidently agree under the condition stated (and only under an almost-everywhere version of it). More explicitly, suppose we have established that $P(s,\tau) = P^*(s,\tau)$ for $\tau < t$. To establish equality of rate of change we then require, by (19) and (20), that

$$(\Lambda(s) - \Lambda^*(t))\,P^*(s,t) = 0,$$

which is equivalent to the condition of the theorem. □
The condition is quite a restrictive one. Unless there are configurations which are actually forbidden then $P^*(s,t) > 0$ for all s and all $t > 0$, and the condition of the theorem would require that
be independent of s. However, one can imagine that, under appropriate conditions on $\Phi$, the random variable $\Lambda(s)$ would converge (in the appropriate probabilistic sense) to a constant in the thermodynamic limit under the distribution $P^*(s,t)$, for fixed t. That is, that the condition of the theorem is satisfied, and so $P = P^*$, in the thermodynamic limit. Justification of this assertion obviously requires a careful argument, which I hope to supply in a later publication. Van Dongen and Ernst [21] consider a deterministic polymerisation model with reversible rates of a particular form, and show that for this model some non-equilibrium solutions are indeed equilibrium solutions conditioned by a kinetic parameter.
References

[1] Erdős, P. and Rényi, A., On the evolution of random graphs, Mat. Kutató Int. Közl. 5 (1960) 17-60.
[2] Erdős, P. and Spencer, J., Probabilistic Methods in Combinatorics (Academic Press, 1974).
[3] Stepanov, V. E., On the probability of connectedness of a random graph, Teoriya Veroyatnostei i ee Prim. 15 (1970) 55-67.
[4] Stepanov, V. E., Phase transitions in random graphs, Teoriya Veroyatnostei i ee Prim. 15 (1970) 187-203.
[5] Flory, P. J., Principles of Polymer Chemistry (Cornell University Press, 1953).
[6] Stockmayer, W. H., Theory of molecular size distribution and gel formation in branched chain polymers, J. Chem. Phys. 11 (1943) 45-55.
[7] Stockmayer, W. H., Theory of molecular size distributions and gel formation in branched polymers. II. General cross-linking, J. Chem. Phys. 12 (1944) 125-131.
[8] Goldberg, R. J. J., A theory of antigen-antibody reactions. I., J. Amer. Chem. Soc. 74 (1952) 5715.
[9] Goldberg, R. J. J., A theory of antigen-antibody reactions. II., J. Amer. Chem. Soc. 75 (1953) 3127.
[10] Watson, G. S., On Goldberg's theory of the precipitin reaction, J. Immunology 80 (1958) 182-185.
[11] Good, I. J., Cascade theory and the molecular weight averages of the sol fraction, Proc. Roy. Soc. A 272 (1963) 54-59.
[12] Gordon, M., Good's theory of cascade processes applied to the statistics of polymer distributions, Proc. Roy. Soc. A 268 (1962) 240-259.
[12a] Gordon, M. and Temple, W. B., Chemical Applications of Graph Theory, ed. A. T. Balaban (Academic Press, 1976) 300-332.
[13] Whittle, P., Statistical processes of aggregation and polymerisation, Proc. Camb. Phil. Soc. 61 (1965) 475-495.
[14] Whittle, P., The equilibrium statistics of a clustering process in the uncondensed phase, Proc. Roy. Soc. Lond. A 285 (1965) 501-519.
[15] Whittle, P., Polymerisation processes with intra-polymer bonding. I. One type of unit, Adv. Appl. Prob. 12 (1980) 94-115.
[16] Whittle, P., Polymerisation processes with intra-polymer bonding. II. Stratified processes, Adv. Appl. Prob. 12 (1980) 116-134.
[17] Whittle, P., Polymerisation processes with intra-polymer bonding. III. Several types of unit, Adv. Appl. Prob. 12 (1980) 135-153.
[18] Whittle, P., A direct derivation of the equilibrium distribution for a polymerisation process, Teoriya Veroyatnostei 26 (1981) 350-361.
[19] Percus, J. K., Combinatorial Methods (Springer, 1971).
[20] Spouge, J. L., Polymers and random graphs: asymptotic equivalence to branching processes, J. Stat. Phys. (1984) (to appear).
[21] Van Dongen, P. and Ernst, M. H., Pre- and post-gel size distributions in (ir)reversible polymerisation, J. Phys. A: Math. Gen. 16 (1983) L327-L332.
Annals of Discrete Mathematics 28 (1985) 349-359 © Elsevier Science Publishers B.V. (North-Holland)
CRITICAL PERCOLATION PROBABILITIES *
John C. WIERMAN
This paper summarizes the current state of knowledge of critical probabilities in percolation models. There are four common definitions of critical probability in the literature, which are known to be unequal for some graphs. A heuristic method of Sykes and Essam [17] has produced correct critical probability values for a few planar lattice models. However, counterexamples have been constructed to some conclusions of the Sykes and Essam method, and the method is valid only for two-dimensional graphs. Exact critical probabilities have been rigorously determined for a few two-dimensional percolation models, using techniques of Seymour and Welsh [15], Russo [14], and Kesten [10] involving crossing probabilities of rectangles.
1. Introduction
Percolation theory developed from a probabilistic model for fluid flow in a medium proposed by Broadbent and Hammersley [1]. Their model associated the random mechanism only with the medium, with no randomness associated with the fluid as in the more familiar diffusion models. Consequently, percolation models typically exhibit drastic changes in behavior as a parameter threshold is crossed, which leads to their popularity as models for the study of phase transitions and critical phenomena in statistical mechanics. There has been dramatic growth in the physics literature on percolation in recent years, consisting primarily of conjectures, numerical evidence, and a wide range of specific applications. Substantial mathematical progress has been made, providing rigorous determination of the critical probability in several models. New tools, such as the Fortuin-Kasteleyn-Ginibre correlation inequality and sub-additive processes, originated within percolation theory. However, most results apply only to two-dimensional models, and there is little knowledge of models in three or more dimensions. The medium is represented by a graph $\mathcal{G}$ in a percolation model, from which sites or bonds are deleted at random to form a network of channels through which
* Research supported by the National Science Foundation under Grant No. MCS-8303238.
fluid may flow. Early research centered on two standard models: the bond percolation model and the site percolation model. In a bond percolation model on a graph $\mathcal{G}$, each edge of $\mathcal{G}$ is randomly open with probability $p$, $0 \le p \le 1$, the parameter of the model, and closed with probability $1-p$, independently of all other edges. Fluid may pass through only open bonds. In a site percolation model on $\mathcal{G}$, each vertex of $\mathcal{G}$ is randomly open with probability $p$, $0 \le p \le 1$, and closed with probability $1-p$, independently of all other vertices. A bond is open (closed) if and only if both its endpoints are open (closed) sites. Thus, in a site model, some bonds, those with one open and one closed endpoint, are neither open nor closed. For either model, let $P_p$ and $E_p$ denote the corresponding probability measure and expectation operator on configurations of $\mathcal{G}$. In either site or bond models, the focus is on the extent of fluid flow possible. Of particular interest, if $\mathcal{G}$ is infinite, is the probability that the fluid reaches an infinite set of sites from a single fluid source site. Fisher [5] introduced a bond-to-site transformation which converts a bond percolation model on $\mathcal{G}$ into an equivalent site percolation model on the covering graph of $\mathcal{G}$. A path on $\mathcal{G}$ is an alternating sequence of sites and bonds of the form $(v_0, e_1, v_1, e_2, v_2, \ldots, e_n, v_n)$, where $e_i$ is incident on $v_{i-1}$ and $v_i$ for each $i$. If $v_0 = v_n$ and $e_i \ne e_j$ for $i \ne j$, the path is called a circuit. A path or circuit is open (closed) if all its sites (in a site model) or bonds (in a bond model) are open (closed). The open cluster containing a specific site $v$, denoted $W_v$, is the set of all sites which are connected to $v$ by an open path. Let $\# W_v$ denote the cluster size of $W_v$, which is the number of sites in $W_v$. The classic research problem in mathematical percolation theory is the determination of the critical probabilities of various infinite lattice graphs.
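The definitions of a site model and of an open cluster can be made concrete on a finite piece of the square lattice. The following is a minimal sketch, not taken from the paper: the box $[-n, n]^2$, the function names, and the convention that the cluster of a closed source site is empty are all assumptions made for illustration. A finite box can only hint at the behavior on the infinite lattice.

```python
import random
from collections import deque


def sample_site_configuration(n, p, rng):
    """Open each site of the box [-n, n]^2 independently with probability p."""
    return {(x, y): rng.random() < p
            for x in range(-n, n + 1)
            for y in range(-n, n + 1)}


def open_cluster(is_open, source):
    """Sites joined to `source` by an open path (assumed empty if source is closed)."""
    if not is_open.get(source, False):
        return set()
    cluster = {source}
    queue = deque([source])
    while queue:
        x, y = queue.popleft()
        # Nearest neighbours on the square lattice; sites outside the box are absent.
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if is_open.get(nb, False) and nb not in cluster:
                cluster.add(nb)
                queue.append(nb)
    return cluster


def mean_cluster_size(n, p, trials, seed=0):
    """Monte Carlo average of the cluster size of the origin inside the box."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        config = sample_site_configuration(n, p, rng)
        total += len(open_cluster(config, (0, 0)))
    return total / trials
```

Calling `mean_cluster_size(20, p, 200)` for a range of `p` gives a rough finite-box picture of how the cluster size grows with the parameter.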
The critical probability represents a threshold in the parameter space between an interval corresponding to local flow of fluid only and the "percolative region" where fluid penetrates the medium. While there are often no distinctions made in the physics literature, there are actually four common definitions of critical probability, which do not always agree in value. The two most common approaches define critical probabilities in terms of the probability of existence of infinite open clusters and the expected cluster size. (Throughout the remainder of this paper, we assume that the graph $\mathcal{G}$ is infinite and connected.) Choosing an arbitrary site $v \in \mathcal{G}$, we define the cluster size critical
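probability by
$$p_H(\mathcal{G}) = \inf\{ p : P_p(\# W_v = \infty) > 0 \}$$
and the mean cluster size critical probability by
$$p_T(\mathcal{G}) = \inf\{ p : E_p(\# W_v) = \infty \}.$$
(If $\mathcal{G}$ is connected, these definitions are independent of the choice of $v$.) From the definitions, $p_T(\mathcal{G}) \le p_H(\mathcal{G})$.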
2. The clusters-per-site approach
A third critical probability is the clusters-per-site critical probability proposed by Sykes and Essam for use in a heuristic method of determining critical probabilities for two-dimensional site models. Sykes and Essam's approach is based on the concept of matching graphs. Let $\mathcal{M}$ be a planar graph imbedded in $\mathbb{R}^2$, and $F$ be a face of $\mathcal{M}$. Close-packing $F$ means adding an edge to $\mathcal{M}$ between any pair of vertices on the perimeter of $F$ which are not yet adjacent. If $\mathcal{F}$ is a subset of the collection of faces of $\mathcal{M}$, let $\mathcal{G}$ be the graph obtained from $\mathcal{M}$ by close-packing all faces in $\mathcal{F}$, and let $\mathcal{G}^*$ be the graph obtained from $\mathcal{M}$ by close-packing all faces not in $\mathcal{F}$. $\mathcal{G}$ and $\mathcal{G}^*$ are called a matching pair of graphs. Note that any triangular face is already close-packed, so if a graph $\mathcal{G}$ is fully-triangulated, it is self-matching. The matching property is useful for site models because an open cluster on $\mathcal{G}$ is finite if and only if it is surrounded by a closed circuit on $\mathcal{G}^*$.
For bond percolation models on planar graphs, the role of the matching graph is played by the Whitney dual graph. (For this reason, the dual graph will be denoted by $\mathcal{G}^*$ also.) For a planar graph $\mathcal{G}$, the dual graph $\mathcal{G}^*$ is constructed by placing a site of $\mathcal{G}^*$ in each face of $\mathcal{G}$, then connecting by a bond of $\mathcal{G}^*$ each pair of sites of $\mathcal{G}^*$ which lie in faces of $\mathcal{G}$ sharing a common edge. This creates a one-to-one correspondence between edges of $\mathcal{G}$ and edges of $\mathcal{G}^*$. If a bond percolation model is defined on $\mathcal{G}$, a bond percolation model is induced on $\mathcal{G}^*$ by letting each bond of $\mathcal{G}^*$ be open (closed) if and only if the corresponding bond of $\mathcal{G}$ is open (closed). As for site models, an open cluster on $\mathcal{G}$ is finite if and only if it is surrounded by a closed circuit on $\mathcal{G}^*$. If the bond-to-site transformation is applied to a dual pair of graphs, $\mathcal{G}_1$ and $\mathcal{G}_2$, the covering graphs $\mathcal{G}_1^c$ and $\mathcal{G}_2^c$ form a matching pair of graphs.
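The close-packing operation and the resulting matching pair can be sketched on a finite planar graph given by an explicit edge list and face list. This is an illustrative sketch only: the representation (edges as vertex pairs, faces as tuples of distinct perimeter vertices) and the names `close_pack` and `matching_pair` are assumptions, not notation from the paper.

```python
from itertools import combinations


def close_pack(edges, face):
    """Return the edge set after joining every pair of perimeter vertices of `face`."""
    packed = {frozenset(e) for e in edges}
    for u, v in combinations(face, 2):
        # Adding an edge that is already present leaves the set unchanged.
        packed.add(frozenset((u, v)))
    return packed


def matching_pair(edges, faces, chosen):
    """Build the matching pair from a base graph.

    `chosen` holds the indices of the faces that are close-packed in the first
    graph; all remaining faces are close-packed in the second graph.
    """
    g = {frozenset(e) for e in edges}
    g_star = set(g)
    for index, face in enumerate(faces):
        if index in chosen:
            g = close_pack(g, face)
        else:
            g_star = close_pack(g_star, face)
    return g, g_star
```

For a single square face with vertices 0, 1, 2, 3, close-packing adds the two diagonals, while a triangular face is returned unchanged, matching the remark that triangles are already close-packed.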
To describe the clusters-per-site approach to a definition of critical probability, consider an expanding sequence of rectangular regions in a site percolation model on $\mathcal{G}$, say $R_n = [0, an] \times [0, bn]$, where $0 < a, b < \infty$. Let $\lambda_n(p)$ denote the expected number
of open clusters contained in $R_n$ divided by the number of sites of $\mathcal{G}$ in $R_n$. Assuming that $\lambda_n(p)$ converges to an asymptotic clusters-per-site function $\lambda(p)$ for $\mathcal{G}$ as $n \to \infty$, Sykes and Essam [17] used a modified version of Euler's Law to show that
$$\lambda(p) - \lambda^*(1-p) = \phi(p),$$
where $\lambda^*$ is the clusters-per-site function for $\mathcal{G}^*$, and $\phi$ is a polynomial function known as the matching polynomial for $\mathcal{G}$. Sykes and Essam assumed that there is a unique singularity of $\lambda$ for each graph, which is presumably the common value of the critical probability of the graph in all senses, which we denote here by $p_E(\mathcal{G})$. Then, since $\phi$ is a polynomial,
$$p_E(\mathcal{G}) + p_E(\mathcal{G}^*) = 1.$$
If $\mathcal{G}$ is self-matching, then $p_E(\mathcal{G}) = p_E(\mathcal{G}^*) = 1/2$, which provides conjectured values for the critical probabilities of the square lattice bond model and the site model on any fully-triangulated graph. In a non-self-matching case, an additional argument using the star-triangle transformation derived the values of $2\sin(\pi/18)$ and $1 - 2\sin(\pi/18)$ for the triangular and hexagonal lattice bond model critical probabilities.
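The two conjectured values are easy to check numerically. The sketch below evaluates them and confirms that they sum to one. It also checks that $2\sin(\pi/18)$ is a root of $1 - 3p + p^3$; that cubic is not quoted in the text above and is introduced here only as a consequence of the triple-angle identity $\sin 3\theta = 3\sin\theta - 4\sin^3\theta$ with $\theta = \pi/18$. The variable and function names are illustrative.

```python
import math

# Conjectured bond critical probabilities (Sykes and Essam values).
p_triangular = 2 * math.sin(math.pi / 18)
p_hexagonal = 1 - p_triangular


def triple_angle_cubic(p):
    """Cubic 1 - 3p + p^3, which vanishes at p = 2 sin(pi/18)."""
    return 1 - 3 * p + p ** 3


print(f"triangular lattice bond value: {p_triangular:.6f}")
print(f"hexagonal lattice bond value:  {p_hexagonal:.6f}")
print(f"cubic residual at the triangular value: {triple_angle_cubic(p_triangular):.2e}")
```

The two values are approximately 0.3473 and 0.6527.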
Unfortunately, $p_E$ has not been evaluated for common graphs, and the conjectured values are incorrect for $p_T$ and $p_H$ for some graphs. Van den Berg [18] constructed a fully-triangulated graph for which the site percolation model has critical probabilities $p_H = p_T = 1$. The graph consists of a skeleton of an expanding sequence of triangles $\Delta_n$, with additional bonds inserted to obtain a fully-triangulated graph, such that there are six sites on each triangle (see Fig. 1). If all six sites on triangle $\Delta_n$ are closed, the open cluster containing the central site must be finite, containing at most $3n$ sites. Since the random variable $\inf\{ n : \text{all six sites of } \Delta_n \text{ are closed} \}$ is a geometrically distributed random variable with parameter $(1-p)^6$, it is finite almost surely, and has finite expected value, so the open cluster size is finite and has finite expected value. Wierman [21] modified Van den Berg's example to construct a family of graphs with more unexpected percolative behavior. For each $x \in (0, 1)$ there is a fully-triangulated graph $\mathcal{G}_x$ for which the site model has critical probability values $p_H(\mathcal{G}_x) = 1$ and $p_T(\mathcal{G}_x) = x$, showing that all critical probability definitions need not share a common value. In addition, the clusters-per-site function for these graphs can be explicitly evaluated, and is a polynomial, so there is no singularity and $p_E$ is not defined. The clusters-per-site function is identical for all $\mathcal{G}_x$, $x \in (0, 1)$, so a graph is not uniquely identified by its clusters-per-site function. The graphs $\mathcal{G}_x$ are constructed by inserting sites and bonds in Van den Berg's example which greatly increase the cluster size, but have no effect on the probability of existence of an infinite cluster (see Fig. 2). While it is well known that $p_T(\mathcal{G}) \le p_H(\mathcal{G})$, these
Fig. 2.
examples (plus a similar example with $p_T(\mathcal{G}) = 0$, $p_H(\mathcal{G}) = 1$) show that $p_T$ may assume any value in $[0, p_H]$. Since the preceding examples all satisfy $p_H(\mathcal{G}) \ge 1/2$, one might conjecture that this inequality holds for all site models on fully-triangulated graphs. However, Wierman [21] also constructed a family of fully-triangulated graphs $\mathcal{G}_k$, $k = 2, 3, \ldots$, for which the site model critical probability $p_H(\mathcal{G}_k) \le 1/(k-1)$. The graphs are modifications of Van den Berg's example which include Bethe trees (see Fig. 3).
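Returning to Van den Berg's example, the finiteness argument can be written out as a short worked bound. This is a sketch under one assumption that the text does not state explicitly: distinct triangles $\Delta_n$ carry disjoint sets of six sites, so the events that they are entirely closed are independent. The symbols $N$ and $q$ are introduced here for illustration, and the factor 3 is the stated bound of at most $3n$ sites.

```latex
N = \inf\{ n : \text{all six sites of } \Delta_n \text{ are closed} \}, \qquad q = (1-p)^6,

P_p(N = n) = (1-q)^{n-1} q, \qquad E_p(N) = \frac{1}{q} = (1-p)^{-6},

\# W_v \le 3N \quad \Longrightarrow \quad E_p(\# W_v) \le 3\,(1-p)^{-6} < \infty \quad \text{for every } p < 1 .
```

Because the bound holds for every $p < 1$, neither an infinite cluster nor an infinite mean cluster size can occur below $p = 1$, which is consistent with $p_H = p_T = 1$ for this graph.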
For the square lattice, its covering lattice, and its matching lattice, it is known that $\lambda(p)$ is analytic except at $p = p_H$, and has two continuous derivatives in $p$ on all of $[0, 1]$ (see Kesten [12, Ch. 9]). It has been conjectured that the third derivative of $\lambda$ does not exist at $p_H$. However, as yet there is no proof that a singularity of $\lambda$ exists for any periodic graph in $\mathbb{R}^d$, $d \ge 2$.
3. Crossing critical probability
The fourth version of the critical probability, defined in terms of crossing probabilities of rectangles, has been the key to the rigorous determination of critical probability values. The definition is formulated for periodic graphs. A periodic graph is a graph which is imbedded in $\mathbb{R}^d$ for some $d \ge 2$, is connected, contains no loops, has at most $z$ ...
$$R = \prod_{i=1}^{d} [m_i, n_i] = \{ x = (x(1), \ldots, x(d)) : m_i \le x(i) \le n_i,\ 1 \le i \le d \}.$$
An $i$-crossing on $\mathcal{G}$ of $R$ is a path $(v_0, e_1, v_1, \ldots, v_n)$ such that $v_0(i) \le m_i$, $v_n(i) \ge n_i$, $e_1$ intersects $\{x(i) = m_i\} \cap R$, $e_n$ intersects $\{x(i) = n_i\} \cap R$, and $v_1, e_2, \ldots, v_{n-1}$ are contained in the interior of $R$. The crossing probability in the $i$-th direction of $\prod_{i=1}^{d} [m_i, n_i]$ is
$$\sigma(\bar{m}, \bar{n}; i, p) = P_p\Big( \exists \text{ an open } i\text{-crossing on } \mathcal{G} \text{ of } \prod_{i=1}^{d} [m_i, n_i] \Big).$$
(Here $\bar{m}$ and $\bar{n}$ denote $(m_1, \ldots, m_d)$ and $(n_1, \ldots, n_d)$ respectively.) The crossing critical probability is
$$p_S(\mathcal{G}) = \sup\Big\{ p \in [0, 1] : \lim_{n \to \infty} \sigma\big(\bar{0}, (3n, \ldots, 3n, n, 3n, \ldots, 3n); i, p\big) = 0,\ 1 \le i \le d \Big\},$$
where the one component equal to $n$ in $(3n, \ldots, 3n, n, 3n, \ldots, 3n)$ is the $i$-th component. The crossing critical probability was introduced by Seymour and Welsh, who considered crossings of squares in the square lattice bond model, and independently by Russo [14] in the context of the square lattice site model. The present definition was a modification due to Kesten which is appropriate for models with less symmetry or in dimensions greater than two. Kesten [11] proved that for any periodic graph $\mathcal{G}$ imbedded in $\mathbb{R}^d$, $d \ge 2$,
$$p_T(\mathcal{G}) = p_S(\mathcal{G}).$$
Seymour and Welsh [15] used crossing probabilities and the self-duality of the square lattice bond model when $p = 1/2$ to show that $p_T + p_H = 1$ for the square lattice bond model, which for generalization to the site percolation matching lattice setting should be interpreted as
$$p_T(\mathcal{G}) + p_H(\mathcal{G}^*) = 1.$$
The usefulness of crossing probabilities is in capturing quantitative information from duality (and matching) properties. Application of duality to a rectangular region shows that either there is an open crossing of $\mathcal{G}$ in one direction or a closed crossing of $\mathcal{G}^*$ in the opposite direction, so the crossing probabilities of the two graphs are related. If $p < p_T$, a simple argument using duality with crossing probabilities is used to link together closed crossings of a sequence of rectangles to form an infinite closed
cluster in $\mathcal{G}^*$. Thus $1 - p \ge p_H(\mathcal{G}^*)$, so $1 \ge p + p_H(\mathcal{G}^*)$ for all $p < p_T(\mathcal{G})$ ...
$$P(\exists \text{ an open path in } [0, n] \times [0, n] \text{ from } x = 0 \text{ to } x = n) > \delta > 0 \qquad (*)$$
then
$$P(\exists \text{ an open path in } [0, 3n] \times [0, n] \text{ from } x = 0 \text{ to } x = 3n) > f(\delta) > 0,$$
where $f$ is a strictly increasing function independent of $n$. The proof relies on the use of F-K-G correlation inequalities to obtain lower bounds on probabilities of linking crossings together, and uses the fact that both coordinate axes are symmetry axes for the square lattice. Linking together open crossings of similar rectangles, an open circuit is constructed in a square annulus $A_n = [-3n, 3n]^2 \setminus [-n, n]^2$ surrounding the origin, with probability at least $[f(\delta)]^4$. If $p > p_S$, there is a sequence of integers $n_k$, $k \ge 1$, for which (*) holds and the annuli $A_{n_k}$ are disjoint, so there exists an open circuit around the origin with probability one. In the context of multiparameter site percolation models on two-dimensional periodic lattices, Kesten [12, Ch. 6] showed that one axis of symmetry is sufficient for the construction of circuits.
Kesten [10] proved the final step in the rigorous evaluation of the square lattice bond model critical probability, using a technique of closing bonds in independent stages. When the parameter $p$ was less than the candidate 1/2 for the critical probability, this technique was used to cut open crossings of rectangles, showing that $p_S \ge 1/2$. Since it was known that $p_S = p_T \le 1/2$, we then have $p_S = p_T = p_H = 1/2$, by recalling that $p_T + p_H = 1$. These methods were used to evaluate the critical probability of the site model on any periodic fully-triangulated graph with one symmetry axis as $p_T = p_S = p_H = 1/2$, where the value 1/2 is a consequence of the self-matching property (Kesten [12]). For the triangular and hexagonal lattice bond models, Wierman [19] verified the Sykes and Essam values of $2\sin(\pi/18)$ and $1 - 2\sin(\pi/18)$ for the common value of the critical probabilities $p_T$, $p_S$, and $p_H$, using the star-triangle transformation
to relate crossing probabilities on the dual pair of graphs. Wierman [22] has also used the star-triangle transformation to evaluate the critical probabilities $p_H$, $p_T$, and $p_S$ of the bond model on a lattice with both square and triangular faces (see Fig. 4) as the root $p_0$ of $1 - p - 6p^2 + 6p^3 - p^5$ in $[0, 1]$, which is approximately .404518. The critical probabilities of the dual graph are then equal to $1 - p_0 \approx .595482$. Kesten [12] generalized the methods to the setting of multiparameter site models on periodic graphs in $\mathbb{R}^2$ with one symmetry axis. His results require a relationship between crossing probabilities in different directions on the same lattice, or between crossing probabilities in the lattice and its matching lattice.
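The quoted root of $1 - p - 6p^2 + 6p^3 - p^5$ can be reproduced with a few lines of bisection. This is a minimal numerical check, not the method used in [22]; the function names and the tolerance are illustrative choices. Bisection applies because the polynomial equals 1 at $p = 0$ and $-1$ at $p = 1$, so it changes sign on $[0, 1]$.

```python
def square_triangle_polynomial(p):
    """Polynomial 1 - p - 6p^2 + 6p^3 - p^5 quoted for the square-and-triangle lattice."""
    return 1 - p - 6 * p ** 2 + 6 * p ** 3 - p ** 5


def bisect_root(f, lo, hi, tol=1e-12):
    """Locate a root of f in [lo, hi] by bisection; f must change sign on the interval."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f must have opposite signs at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # the sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # the sign change lies in [mid, hi]
    return (lo + hi) / 2


p0 = bisect_root(square_triangle_polynomial, 0.0, 1.0)
print(f"p0     = {p0:.6f}")
print(f"1 - p0 = {1 - p0:.6f}")
```

The computed root agrees with the values .404518 and .595482 given above to the digits shown.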
4. Additional models
Since the early development of the theory, there has been a proliferation of additional percolation models. Hammersley and Welsh [8] introduced the first-passage percolation model, in which each bond has an associated random travel time required for the fluid to flow
through the bond. Topics of interest in the theory are the asymptotic velocity of the spread of fluid, the length of optimal (shortest travel time) paths, and the asymptotic shape of the region wetted from a single fluid source site. See the monograph by Smythe and Wierman [16] for an introduction and basic results, and Cox and Kesten [3], Cox and Durrett [2], and Kesten [9] for specific recent advances.
In oriented (or directed) percolation models, each bond may allow fluid to pass through in only one direction (which is non-random) if it is open. In the completely oriented bond model on the square lattice, each horizontal bond allows fluid to pass through only to the right and each vertical bond allows fluid to pass only upwards. Successive improvements in the lower bound on the critical probability for this model are due to Mauldon [13], Gray, Smythe, and Wierman [6] and Dhar [4], who determined that the critical probability is at least .6300, but the exact value has not been determined. Wierman [20] has rigorously proved two claims of Dhar.
In a mixed model, both sites and bonds, or even sites, bonds, and faces, are all open or closed at random. For some results on mixed models, see Hammersley [7] and Wierman [23].
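The completely oriented bond model is simple to simulate, because fluid started at the origin can only move right or up and therefore advances one diagonal level $x + y = k$ at a time. The sketch below is illustrative only: the function names, the finite depth `n`, and the use of "reaches level `n`" as a stand-in for percolation are assumptions, and a finite-depth estimate says nothing rigorous about the critical probability.

```python
import random


def reaches_level(n, p, rng):
    """Return True if fluid from the origin wets some site with x + y = n."""
    wet = {0}  # x-coordinates of the wet sites on the current level x + y = k
    for _ in range(n):
        next_wet = set()
        for x in wet:
            # Each bond leaving a wet site is examined exactly once, so sampling
            # it on the fly is equivalent to fixing the configuration in advance.
            if rng.random() < p:   # horizontal bond, fluid passes to the right
                next_wet.add(x + 1)
            if rng.random() < p:   # vertical bond, fluid passes upwards
                next_wet.add(x)
        if not next_wet:
            return False
        wet = next_wet
    return True


def estimate_reach_probability(n, p, trials, seed=0):
    """Monte Carlo estimate of the probability that the fluid reaches level n."""
    rng = random.Random(seed)
    hits = sum(reaches_level(n, p, rng) for _ in range(trials))
    return hits / trials
```

Evaluating `estimate_reach_probability(200, p, 500)` for values of `p` on either side of the bounds quoted above gives a rough picture of the threshold behavior.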
References
[1] S. R. Broadbent and J. M. Hammersley, Percolation processes, Proc. Camb. Phil. Soc. 53 (1957) 629-641, 642-645.
[2] J. T. Cox and R. Durrett, Some limit theorems for percolation processes with necessary and sufficient conditions, Ann. Probability 9 (1981) 583-603.
[3] J. T. Cox and H. Kesten, On the continuity of the time constant of first-passage percolation, J. Appl. Probability 18 (1981) 809-819.
[4] D. Dhar, Diode-resistor percolation in two and three dimensions: I. Upper bounds on critical probability, J. Phys. A: Math. Gen. 15 (1982) 1849-1858.
[5] M. E. Fisher, Critical probabilities for cluster size and percolation problems, J. Math. Phys. 2 (1961) 620-627.
[6] L. Gray, R. T. Smythe and J. C. Wierman, Lower bounds for the critical probability in percolation models with oriented bonds, J. Appl. Prob. 17 (1980) 979-986.
[7] J. M. Hammersley, A generalization of McDiarmid's theorem for mixed Bernoulli percolation, Math. Proc. Camb. Phil. Soc. 88 (1980) 167-170.
[8] J. M. Hammersley and D. J. A. Welsh, First-Passage Percolation, Subadditive Processes, Stochastic Networks, and Generalized Renewal Theory. Bernoulli-Bayes-Laplace Anniversary Volume, J. Neyman and L. M. LeCam, eds. (Springer, Berlin, 1965).
[9] H. Kesten, On the time constant and path length of first-passage percolation, Adv. Appl. Probability 12 (1980) 848-863.
[10] H. Kesten, The critical probability of bond percolation on the square lattice equals 1/2, Comm. Math. Phys. 74 (1980) 41-59.
[11] H. Kesten, Analyticity properties and power law estimates of functions in percolation theory, J. Stat. Phys. 25 (1981) 717-756.
[12] H. Kesten, Percolation Theory for Mathematicians (Birkhäuser, Boston, 1982).
[13] J. G. Mauldon, Asymmetric oriented percolation on a plane, Proc. 4th Berkeley Symp. Math. Stat. 2 (1961) 337-345.
[14] L. Russo, A note on percolation, Z. Wahrsch. verw. Geb. 43 (1978) 39-48.
[15] P. D. Seymour and D. J. A. Welsh, Percolation probabilities on the square lattice, Ann. Discrete Math. 3 (1978) 227-245.
[16] R. T. Smythe and J. C. Wierman, First-Passage Percolation on the Square Lattice. Lecture Notes in Mathematics, Vol. 671 (Springer, Berlin, 1978).
[17] M. F. Sykes and J. W. Essam, Exact critical percolation probabilities for site and bond problems in two dimensions, J. Math. Phys. 5 (1964) 1117-1127.
[18] J. Van den Berg, Percolation theory on pairs of matching lattices, J. Math. Phys. 22 (1981) 152-157.
[19] J. C. Wierman, Bond percolation on honeycomb and triangular lattices, Adv. Appl. Prob. 13 (1981) 293-313.
[20] J. C. Wierman, On square lattice directed percolation and resistance models, J. Phys. A: Math. Gen. 16 (1983) 3545-3551.
[21] J. C. Wierman, Counterexamples in percolation, J. Phys. A: Math. Gen. 17 (1984) 631-646.
[22] J. C. Wierman, A bond percolation critical probability determination based on the star-triangle transformation, J. Phys. A: Math. Gen. 17 (1984) 1525-1530.
[23] J. C. Wierman, Mixed percolation on the square lattice, J. Appl. Prob. 21 (1984) 247-259.
Annals of Discrete Mathematics
Previous Volumes in this Series
Vol. 1: Studies in Integer Programming, edited by P.L. HAMMER, E.L. JOHNSON, B.H. KORTE and G.L. NEMHAUSER, 1977, viii + 562 pages
Vol. 2: Algorithmic Aspects of Combinatorics, edited by B. ALSPACH, P. HELL and D.J. MILLER, 1978, out of print
Vol. 3: Advances in Graph Theory, edited by B. BOLLOBÁS, 1978, viii + 295 pages
Vol. 4: Discrete Optimization, Part I, edited by P.L. HAMMER, E.L. JOHNSON and B. KORTE, 1979, xii + 299 pages
Vol. 5: Discrete Optimization, Part II, edited by P.L. HAMMER, E.L. JOHNSON and B. KORTE, 1979, vi + 453 pages
Vol. 6: Combinatorial Mathematics, Optimal Designs and their Applications, edited by J. SRIVASTAVA, 1980, viii + 391 pages
Vol. 7: Topics on Steiner Systems, edited by C.C. LINDNER and A. ROSA, 1980, x + 349 pages
Vol. 8: Combinatorics 79, Part I, edited by M. DEZA and I.G. ROSENBERG, 1980, xxii + 309 pages
Vol. 9: Combinatorics 79, Part II, edited by M. DEZA and I.G. ROSENBERG, 1980, viii + 309 pages
Vol. 10: Linear and Combinatorial Optimization in Ordered Algebraic Structures, edited by U. ZIMMERMANN, 1981, x + 380 pages
Vol. 11: Studies on Graphs and Discrete Programming, edited by P. HANSEN, 1981, viii + 395 pages
Vol. 12: Theory and Practice of Combinatorics, edited by A. ROSA, G. SABIDUSSI and J. TURGEON, 1982, x + 266 pages
Vol. 13: Graph Theory, edited by B. BOLLOBÁS, 1982, viii + 204 pages
Vol. 14: Combinatorial and Geometric Structures and their Applications, edited by A. BARLOTTI, 1982, viii + 292 pages
Vol. 15: Algebraic and Geometric Combinatorics, edited by E. MENDELSOHN, 1982, xiv + 378 pages
Vol. 16: Bonn Workshop on Combinatorial Optimization, edited by A. BACHEM, M. GRÖTSCHEL and B. KORTE, 1982, x + 312 pages
Vol. 17: Combinatorial Mathematics, edited by C. BERGE, D. BRESSON, C. CAMION and F. STERBOUL, 1983, x + 660 pages
Vol. 18: Combinatorics '81: in Honour of Beniamino Segre, edited by A. BARLOTTI, P.V. CECCHERINI and G. TALLINI, 1983, xii + 824 pages
Vol. 19: Algebraic and Combinatorial Methods in Operations Research, edited by R.E. BURKARD, R.A. CUNINGHAME-GREEN and U. ZIMMERMANN, 1984, viii + 382 pages
Vol. 20: Convexity and Graph Theory, edited by M. ROSENFELD and J. ZAKS, 1984, xii + 340 pages
Vol. 21: Topics on Perfect Graphs, edited by C. BERGE and V. CHVÁTAL, 1984, xiv + 370 pages
Vol. 22: Trees and Hills: Methodology for Maximizing Functions of Systems of Linear Relations, Rick GREER, 1984, xiv + 352 pages
Vol. 23: Orders: Description and Roles, edited by M. POUZET and D. RICHARD, 1984, xxviii + 548 pages
Vol. 24: Topics in the Theory of Computation, edited by M. KARPINSKI and J. VAN LEEUWEN, 1985, x + 188 pages