World Scientific Series in Computer Science Vol. 33
Mathematical Foundations of Parallel Computing
World Scientific Series in Computer Science Vol. 33

Mathematical Foundations of Parallel Computing

Valentin V. Voevodin
Russian Academy of Sciences

World Scientific
Singapore • New Jersey • London • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 9128
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 73 Lynton Mead, Totteridge, London N20 8DH

MATHEMATICAL FOUNDATIONS OF PARALLEL COMPUTING
Copyright © 1992 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

ISBN 981-02-0820-0

Printed in Singapore by General Printing Services Pte. Ltd.
To my faithful, kind, and loving wife Sima
Contents

Preface

Chapter 1. Algorithm and Its Graph
§ 1. General notion of algorithm
§ 2. Algorithm notations
§ 3. Graph of algorithm
§ 4. Topological sorting
§ 5. Schedules and graph machine
§ 6. Examples

Chapter 2. Algorithm Execution Time
§ 7. Vector properties of schedules
§ 8. Number semirings and other sets
§ 9. Minimax properties of schedules
§ 10. Optimal and high-speed schedules
§ 11. Examples

Chapter 3. Algorithms and Computer Memory
§ 12. Examples
§ 13. Total required memory size
§ 14. Hierarchical memory
§ 15. Sectioning of memory
§ 16. Decomposition of algorithm and of its graph

Chapter 4. Matrix Investigation of Algorithm Structure
§ 17. Graphs and matrices
§ 18. Recovering the linear functional
§ 19. Computing gradient and derivative
§ 20. Roundoff error analysis
§ 21. Examples

Chapter 5. Functional Investigation of Algorithm Structure
§ 22. Space-time schedules
§ 23. Regular graphs
§ 24. Passage to the limit
§ 25. Data streams
§ 26. Examples

Chapter 6. Algorithm Graph and Schedules Building
§ 27. Some statistics
§ 28. Order relation
§ 29. Notation particularities
§ 30. Guidelines for algorithm graph building
§ 31. Splitting the algorithm graph
§ 32. Linear index expressions
§ 33. Branching
§ 34. Linear information closure
§ 35. Examples

Afterword
References
Index
Preface

The subject matter of this book is nontraditional for mathematicians. A lot of things still feel unfamiliar, in spite of the fact that the objects of our examination are well known: we will study algorithms. However, our interest in algorithms is of a rather specific nature. We focus on studying the structure of algorithms and on particularities of their implementation on parallel computers. It is this aspect of algorithm investigation that is nontraditional and poorly known to mathematicians. The most nontraditional part of it is research methodology.
Before the reader proceeds with the book, the author's viewpoint should be made clear. Hopefully both the research goals and the ways to achieve them shall be easily understood after that.

For several decades, the author's research interests have been in the field of computational mathematics and numerical software. The way things went, more than a dozen different computers had to be learned over that period. One and the same problem persisted every time the programs were transported to a new computer: various changes in the source code had to be made, although all programs were written in strict accordance with the specifications of machine-independent algorithmic languages. This state of things is highly abnormal from the point of view of a mathematician for whom a computer is merely a tool he uses in his work. Recall that algorithmic languages were originally intended to spare algorithm designers the necessity to scrutinize the peculiarities of computers. Operating systems together with compilers were supposed to handle all activities required to adjust application programs to current machine settings.

This initial goal has not been achieved yet, and it is unlikely that it will be reached in the near future. One does not have to be an expert in algorithmic languages to come to this conclusion. Just try to name a language that guarantees that programs in it will be implemented efficiently on different computers without any changes. You will pretty soon be convinced that no such languages exist.
Of course, we may blame language, compiler, and system developers for poor appreciation of the algorithm designers' needs. The rebuke is partly justified, as the developers' vested interests have certainly got dominant. It must also be owned, however, that the algorithm designers' interests are far from being simple. The most important one is related to such issues as the overall accuracy of computational processes.

It is well known that the accuracy of the results obtained is influenced by a lot of factors, number representation and the mode of rounding-off among them. In most algorithmic languages the latter characteristics are not reflected in language notations, and different computers round numbers off in different ways. Strictly speaking, programs that should be viewed as machine-independent are eventually machine-dependent. In practice this brings about the present diversity of results. Then we are left with a bundle of questions, as: "What do we actually mean by a machine-independent language? What rules should we follow to make sure that programs can be treated as machine-independent (maybe with certain reservations)? How are they to be designed from the point of view of accuracy?" And so on.
Huge amounts of effort have gone into finding the answers to the above questions. Standards were adopted on number representation and round-off modes. Error bounds were obtained for many algorithms. On the whole, however, the results are not spectacular. The backward error analysis approaches are not constructive, and forward and interval analysis is too time-consuming. No other approaches currently exist. It is possible that the chief achievement is the full realization of the fact that the influence of roundoff error cannot be ignored during numerical software development.
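The dependence of a result on the order and mode of rounding can be seen in a few lines. The following Python sketch is an illustration added here, not an example from the original text; it assumes IEEE 754 double precision arithmetic with round-to-nearest, and the function names are hypothetical. The same four numbers are summed serially and in the tree-shaped order a parallel computer might use.

```python
import math


def sum_left_to_right(values):
    """Serial summation: ((v0 + v1) + v2) + v3 ..."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_pairwise(values):
    """Tree-shaped summation: the two halves are summed independently."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return sum_pairwise(values[:mid]) + sum_pairwise(values[mid:])


values = [1e16, 1.0, -1e16, 1.0]

serial = sum_left_to_right(values)   # 1.0 in IEEE 754 double precision
tree = sum_pairwise(values)          # 0.0 in IEEE 754 double precision
exact = math.fsum(values)            # 2.0, the correctly rounded sum

print(serial, tree, exact)
```

Both orders are mathematically equivalent, yet neither returns the exact sum, and they disagree with each other. This is the kind of effect that makes a formally machine-independent program produce machine-dependent results.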
For a long time it seemed that the accuracy problem was the only fundamental difficulty concerning the implementation of algorithms on different computers. Though its easy solution could not be counted on, certain prospects nonetheless came into view. Gradually the process went on of building up the confidence that the development of machine-independent numerical
software was feasible.

The advent of parallel computers and supercomputers brought new grave problems, which have undermined this confidence. Once again every new computer forces another reconsideration and refurbishment of the entire numerical software. Certainly, much has been done. High-level language compilers exist for individual parallel computers. However, it is generally recognized that they are not very effective: the code they generate does not exploit the resources of parallel systems to full extent. Papers on parallelization come in thousands, and the associated buzzword has sprung up in algorithmic languages, numerical methods, and software development. But what has actually been achieved, what the bottlenecks of parallelization are, and what is to be done next is not clear.

The author's initial interest in parallel computations was rather limited. The author just wanted to know how parallel computers can be made use of to solve large problems in such traditional application fields as numerical analysis, linear algebra, mathematical physics, optimization, etc. However, getting familiar with this new field of knowledge stretched out long. Reading papers was not illuminating: there were many questions and no answers. Moreover, the number of questions grew, the area they regarded enlarged, and the questions acquired profundity. Gradually it dawned upon the author that the answers could not be found in literature simply because those answers were not yet known.

Effective numerical software design for parallel computers proved to be difficult. The difficulties arose primarily due to the wide variety of parallel system architectures, which causes the corresponding variety of data processing organizations. These in their turn imply the necessity of
various numerical methods and algorithms, software, and new languages.

Restructuring existing software to meet the requirements of parallel computers is not an easy task. Moreover, it is not always clear whether such restructuring is feasible at all. Despite the large number of papers on program parallelization methods, so far there is no general constructive methodology in this field. For that reason, new algorithmic languages and numerical methods are most often developed for large parallel computers, and all programs are written anew. Mathematicians can hardly accept the prospect of revising again and again the methods and algorithms they develop and use, and of tailoring them to various new computers, whether with the fashionable prefix "super" or not.

Any computational process is based on some algorithm. A really important fact must be recognized: mathematicians know very little about the structure of most algorithms, in spite of the long period of existence of computer-related sciences. Over several decades of dominant influence of the von Neumann architecture, the algorithm designers' attention was riveted to just two characteristics: operation count and required memory size. Even rounding errors are most often ignored in practical projects. The exploration of structural properties of algorithms, as e.g. modularity, has remained rudimentary.

With the arrival of parallel computers, large problems cannot be solved without adequate knowledge of computational structures. Related sciences, in particular computer architecture, language, compiler, and algorithmic development, were found in a similar plight.

It is easy to foresee many objections to this. However, let us go beyond the long list of achievements. Let us pose questions and try to find their answers. Let us try to unearth the obstructions that prevent us from finding these answers. Then we shall realize how little we in fact know about parallel computations.

This reasoning motivated the author to start a thorough investigation of numerous problems arising in parallel computations. The impulse was to understand and to investigate mathematically the bottlenecks of
analyzing and implementing parallelism in algorithms, from formulas and programs to compilers and computational systems.

The joint investigation of computational systems and algorithms is an attractive idea. It is also an exceptionally complicated task. Perhaps the most difficult stage of it is the choice of mathematical concepts to describe both computers and algorithms. This choice is difficult to make not so much due to the necessity to provide uniform mathematical descriptions of certain fixed objects, but primarily because we must be able to use computers to set and recover various characteristics of these objects, and also to collate and transform individual object descriptions.

It is almost evident that algorithms can be described by graphs (with a few reservations). It can hardly be doubted that it is in principle possible to use graphs to describe functional units of computational systems. So it is not surprising that graphs came into the author's view at a certain moment. The beautiful graph theory with its numerous applications contributed to this. Of course, attempts to make use of available graph-theoretic results ended in a disappointment. Research goals implied taking into account specific features of algorithms and computational systems as much as possible. This forced us to consider graphs whose number of nodes is of the same order as the number of algorithm operations. As a rule, the number of algorithm operations is huge. Moreover, it typically depends on certain parameters that are not known at compile time.

Correlating properties of algorithms and computers implies determining whether their graphs are isomorphic, homomorphic, etc. Graph theory has practically nothing to suggest for the solution of such problems but for some kind of search. However, even linear time search is unacceptable for us, as it will take about the same time as running the algorithms themselves on serial computers. Actually most problems cannot be solved in linear time, requiring the exhaustive search. This is of course unacceptable in terms of time cost.

Finally it became clear that graph theory can only lend us the terminology and a few basic facts. Nonetheless, a very important inference was drawn from the accomplished research. Specifically, the joint
investigation of computational systems and algorithms will only be effective if the characteristic features of our graphs are taken into account. But what are these features? What is graph structure in general? There were no answers to these questions.

Although the first attempts to use graphs did not yield constructive results, it was undesirable to abort their use as the mathematical means of research. The chief pro argument was that graphs were actually being used for long to describe both algorithms and computational systems. Certainly, the use was only fragmentary and not adequately elaborate. Moreover, many different types of graphs were used to describe the same objects. Those graphs differed in size and in the identification of arcs and nodes. The relation between different graphs denoting the same objects was not always clear. However, there were no viable alternatives for the role of the mathematical instrument.

Several papers boosted the author's confidence that an elegant mathematical treatise could be put together using graphs. They belonged to different authors, were published at different times, and had no connection whatever. Yet each paper used graphs. Taken together, they covered the entire scope of our consideration: algorithms, programs, languages, compilers, parallel computers. These works had great impact on forming the author's viewpoint, and they deserve to be mentioned here specially.

The first paper is a review [32] by A.P.Ershov. It sums up the research results in program schemes theory. There are also numerous examples illustrating the use of graphs. In particular, all essential types of program graphs are listed there (control flow, data flow, etc.). Another graph type mentioned in the review is the so-called program implementation graph. Its properties and potential applications are not discussed in [32]. Note however that the most interesting research results of parallel structure of algorithms and programs are obtained using various modifications of that graph.

Next go the papers [61,62] by L.Lamport. They demonstrate the feasibility of analysis of the parallel structure of algorithms written down as sequential programs. For a class of programs, two analysis methods were suggested: the coordinate method and the hyperplane method.
These methods proved to be sufficiently effective and were incorporated in a number of compilers for parallel computers. Ignoring the terminological differences between [32] and [61,62], we notice that, for the selected class of programs, an important graph associated with the program was introduced, and a way to place nodes in it was specified constructively. In particular, [61,62] laid the foundation for using computers to analyze the structure of algorithms.

Next we mention a series of papers [54,55] by D.J.Kuck and others describing the development of the Parafrase. Parafrase is essentially a FORTRAN-to-FORTRAN converter that automatically restructures programs so that they become easily implementable on parallel computers. Program structure is analyzed both on macro- and microlevel. Graphs are widely used in Parafrase as a research tool. Parafrase itself is a conclusive evidence that machine-independent tools may be developed to study a wide variety of algorithm features related to parallelism. Development of such tools gives rise to a lot of new and meaningful mathematical problems concerning the structure of graphs. However, mathematicians have yet to investigate them.

Finally we mention the paper [68] by D.I.Moldovan. It is dedicated to constructing systolic arrays for a rather narrow class of algorithms. The paper's chief message lies not in the treatment of that class but rather in the rules used to construct systolic arrays. Borrowing the terminology from [32], we may state that the paper [68] suggests to construct systolic arrays formally using projections of the program implementation graph. This may be viewed as an attempt to bring into accord algorithm structures and parallel system architecture.

Naturally, the discussed papers did not furnish answers to all the questions. Yet taken together they strengthened the opinion that graphs provide a sound base for a global research into parallelism ranging from mathematical formulas to computational systems.

Of course, the author's positions were influenced by other papers as well. The author would like to gratefully mention here the works of Barlow R.H., Brent R.P., Demmel J.W., Dennis J.B., Dongarra J.J., Duff I.S., Ershov A.P., Evans D.J., Faddeev D.K. and Faddeeva V.N., Gurd J.,
Heller D., Hockney R.W., Hwang K., Kuck D.J., Kung H.T., Kung S.Y., Lamport L., Maslov V.P., Moldovan D.I., Plemmons R.J., Sameh A.H., Siegel H.J., Stone H.S., Traub J.F., and many others. The impact of their work was not always direct, but they have helped and are helping to steer the right course in studying the mathematical foundations of parallelism.

In the book [88] the author essayed to state his opinions on these topics. The chief objective was to understand the roots of the difficulties that hinder the investigations. The chief research object was the algorithm graph. To a few reservations, it is identical to the program implementation graph of [32]. It was demonstrated that a systematic study of algorithm graphs enables us to solve a lot of problems, including the problem of mapping algorithms optimally onto parallel computer architectures. It was also established that the main hurdle on the way to solve most important problems was the total lack of information on the structure of algorithm graphs. This information can only be derived from some algorithm notations, as mathematical formulas, algorithmic language programs, etc.
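As a rough illustration of what deriving a graph from an algorithm notation can mean, the Python sketch below builds a small dependence graph from a straight-line notation and measures its depth. It is not the author's construction, which the book defines later. It assumes that nodes are operations, that an arc leads from the operation producing a value to each operation consuming it, and that the notation is a list of `(result, operands)` pairs. The function names `build_algorithm_graph` and `parallel_levels` are hypothetical.

```python
from collections import defaultdict


def build_algorithm_graph(operations):
    """Return the arcs (producer, consumer) between operation indices.

    `operations` is a list of (result_name, operand_names) pairs given in
    execution order; names never produced by an operation are input data.
    """
    last_writer = {}  # variable name -> index of the operation that last produced it
    arcs = set()
    for index, (result, operands) in enumerate(operations):
        for name in operands:
            if name in last_writer:
                arcs.add((last_writer[name], index))
        last_writer[result] = index
    return arcs


def parallel_levels(operation_count, arcs):
    """Earliest step of each operation if unlimited processors are available."""
    predecessors = defaultdict(list)
    for producer, consumer in arcs:
        predecessors[consumer].append(producer)
    levels = [0] * operation_count
    # Arcs always lead from a smaller index to a larger one, so one pass suffices.
    for node in range(operation_count):
        levels[node] = 1 + max((levels[p] for p in predecessors[node]), default=0)
    return levels


# Two notations of the same sum a + b + c + d.
chain = [("t1", ["a", "b"]), ("t2", ["t1", "c"]), ("s", ["t2", "d"])]   # ((a+b)+c)+d
tree = [("t1", ["a", "b"]), ("t2", ["c", "d"]), ("s", ["t1", "t2"])]    # (a+b)+(c+d)

chain_levels = parallel_levels(len(chain), build_algorithm_graph(chain))
tree_levels = parallel_levels(len(tree), build_algorithm_graph(tree))
print(chain_levels, tree_levels)
```

The two notations compute the same sum with the same operation count, but their graphs differ: the chain needs three sequential steps, while the tree needs two. Operation count and memory size alone do not reveal this difference; it becomes visible only once the graph is built.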
The need to conduct theoretical research in these uncertain circumstances obliged us to start developing software tools to study algorithms, programs, and computational systems. These tools aim at building the algorithm graph, exploring its structure, parallelizing sequential programs, and restructuring them in such a way as to make them implementable on parallel computers. They can also be counted upon to design mathematical models of computational systems optimally suited to implement classes of algorithms, to develop numerical software that may be easily transported onto various computational systems, and so on.

Simultaneous theoretical and practical efforts were fruitful. We mention here just a few of the acquired results. It turned out that all existing program parallelization methods are in fact methods of finding solutions to a certain inequality of Bellman type, the coefficients whereof are readily determined using the algorithm graph. Some previously unknown bottlenecks of parallelization processes were exposed. Links were established between the problem of studying the parallel structure of an algorithm and other problems having no apparent relation to that one, as studying round-off errors propagation during the algorithm execution, complexity bounds for gradient evaluation and similar processes. The connection between algorithm graph structure and the possibility of the effective algorithm implementation on systems with hierarchical memory was also revealed. The list of examples could be continued. The reader will find other examples in the corpus of the book.
This book presents the results of the author's research of algorithm structures. The choice of notation language to represent algorithms is not material. However, to respond to practical demands we bring into focus FORTRAN-like language programs. Algorithm structure investigation is carried out using a model of computational system which we called the graph machine. This approach is accounted for by the desire to conduct joint investigations of algorithms and computational systems. Transformation of the graph machine produces other models of computational systems, among which we may choose those most strongly suited for the implementation of algorithms of a given class. The efficiency of the approach was substantiated earlier in the course of development of a systolic array design system [88].

We make a point of keeping our research machine-independent. Therefore one of the requirements that the author believed necessary to observe was separation of algorithm structure investigation from implementation considerations. This requirement implies treating all input data of algorithms as unknown symbol variables. It is, however, rather evident that the values of input data have some impact on algorithm structure. For example, they may influence the algorithm branching process and the number of executable operations. It is natural to surmise that they may also affect other characteristics of algorithms, as the total amount of parallel branches, the efficiency of distributed or hierarchical memory use, etc. We do not now know concrete ways of input data influence on algorithm structure. If we are to implement algorithm structure investigations on a computer, we must develop special methods for symbol manipulations. These methods are nontraditional for this research field and prove to be rather complicated.

We now make a few observations about the distinctive features of this book.
The author tried to let the reader feel the indeterminate nature of the grounds on which the investigation of structure of parallel algorithms has to be built. Therefore any additional assumptions on research conditions are made only when it becomes clear that previously made assumptions are used up and no further progress is possible. The author hopes this way of exposition should help the reader realize whence the considered problems originate and why such-and-such methods are used to solve them. Of course, the author strived to maintain rigor at an acceptable level. This is primarily required so as not to lose certain important relations between individual facts. However, excessive dwelling on minute details may eclipse the most important facts. The wish to steer a midway course also had impact on the manner of exposition.

The chief points we make are put down as Statements. The text itself also contains a good deal of information. Statements are not necessarily more complex than the surrounding text; rather, we make Statements to draw the reader's attention to a particular result. Many proofs are just skeletons devoid of all details; the simplest proofs are altogether omitted. Each chapter contains a number of meaningful examples to help the reader understand the matter.

This book is not a review of papers. It presents chiefly the point of view of the author. It is due to this circumstance that the list of references is relatively short. The author apologizes to all readers who will not like this list. The number of papers in parallel computing is huge and keeps growing. Nevertheless this scientific field has no sound theoretical basement yet. At this juncture it is not easy to understand the scale of this or that paper's contribution to parallel computing theory. A large reference list including well above one thousand references may be found in another book by the same author [88].

This book may be read in several different ways, depending on the reader's preferences. If the reader wishes to get familiar with the theoretical foundations, to realize the prospects and pitfalls of mapping algorithms onto computer architectures, he should marshall his patience and work through the book with pencil and notepad. In this case, examples may be viewed as illustrations to theoretical material.
If the reader merely wishes to achieve a general understanding of the topics considered in the book, he may just read through the Examples sections. Only a modicum of additional information is required to understand the examples. In this case the book may be viewed as an exposition of the fundamentals of parallel computing through the solution of selected problems.

This book is based on the research that the author has been carrying on for many years now at the Department of Numerical Mathematics of the USSR Academy of Sciences. The author is pleased to express his gratefulness to G.I.Marchuk for constant support of this research. The manuscript was read by E.E.Tyrtyshnikov. The author gratefully notes the exceptional usefulness of his remarks and of the discussions he took part in. This book is also a base for the lectures the author is reading at the Moscow University and at Moscow College for Physics and Technology.

Valentin V. Voevodin
Chapter 1
Algorithm and Its Graph

Before we start our study of structure of algorithms, we must specify precisely the subject and
problem
formulation
search
area
stantial Clearly,
dealing
we
shall
describe The ous,
success
have
to restrict
answer
however,
to this
question
that i t involves
t e n t we s h a l l
a class
number o f w a y s t h e y o f programs
characterize
which
of algorithms
basic we
I ti s this
will
call
studying
permits
have
devise
amounts t o s p e c i f y i n g a c o m p u t e r s . We In full
portions an
w i l l not
detail. I n -
computational
we w i l l
r e v i s e o u r problem:
we w i l l
study
the key
of algorithms).
abstract
machine. Moreover,
the structure of algorithms
to specify a
structure describing
important
will
a graph
that
that
build
a
instead of
the functioning of
machines. Putting
of
This
and programs
some b a s i c
( o r t h e most
s t r u c t u r e we
choice
we e s s e n t i a l l y
or hypothetical
of algorithms
t r y t o discover
l a r g e , ex-
the algorithms
plan.
w h o l e s e t o f s u c h m a c h i n e s . T h e n we w i l l
graph
class o f
I t i s obvi-
probably
t o analyze only
c a n be p u t i n w r i t i n g .
of algorithms
that
system
sub-
b e , a n d how d o we
i s not immediately clear.
computers.
f o rexisting
classes
we w i l l
kernels Using
to a certain
of that class
a
algorithms.
some k i n d o f c o m p r o m i s e b e t w e e n o u r d e -
the f o l l o w i n g research
define
stead,
accurate
loose r e -
t o achieve
arbitrary
t o meet t h e m . To a c e r t a i n ,
be implemented on v a r i o u s
class
hope
our research
b e n e f i t by p r e f e r r i n g
to outline To
cannot
studying
the nature
The
i n the rather
it?
mands a n d o u r c a p a c i t y
can
We
while
o f our investigations
i n the process.
importance
with algorithms.
But what should
the goal
t o be used
i s of special
utilitarian
algorithms.
us
f o rinquiry,
the mathematical apparatus
Putting this plan into effect will allow us to solve two problems of paramount importance. In the first place, we will obtain the exact description of the class of algorithms that we study. This opens up the possibility to construct a mathematical apparatus to explore their
2
structures. to
As we
analyze
place,
fabricate
the s t r u c t u r a l
the
optimally
possibility
properties
arises
of
f o r us
our
to
algorithms.
study
Certainly, structure.
bulk o f our e f f o r t w i l l
above-listed
problems.
The
out
we
will
not
put
It just
is difficult
particular
the
relevant features of algorithms.
implementations hardly
t i o n s o f an a l g o r i t h m on
the graph
preserving
have
sketched
we
p o i n t on what
start
into
implementathose
interest,
units,
more r e a l i s t i c
basing because
t o comprehend a l l
shall determine of
after
first.
architecture
h e l p one
the
f o r i t points
studying the set of
number o f f u n c t i o n a l
as
e t c . T h e n we
computer models,
im-
time, will while
characteristics. the plan
carrying
i s t o be
ever
By
only
investigation
the c h a r a c t e r i s t i c s
machine
the achieved
r i t h m s . Now
problem,
of
algorithms.
computer, p r i m a r i l y
t h e g r a p h m a c h i n e we
minimize
communications,
transform
We
that
second
particular
the
storage,
the
s t r u c t u r e deserve
a
second
i t a l l depends
tackled
t o choose the o p t i m a l p a r a l l e l
on one's e x p e r i e n c e w i t h
plementations
s h o u l d be
able
architectures
the s t r u c t u r e of p r a c t i c a l
aside
which aspects of a l g o r i t h m
be
to the s o l u t i o n of the f i r s t
second problem
h a v e g a i n e d some i n s i g h t i n t o
Nevertheless
go
will I n the
computer
algorithms.
we
for specific
a p p a r a t u s , we
h e a v i l y on t h e f o r t u n a t e c h o i c e o f t h e b a s i c The
tailored
the mathematical
of
algo-
i t o u t . This chapter e l a b o r a t e s our
to
view-
investigated
investigate
and
the
structure
by w h a t means.
1. General Notion of Algorithm

In computer-based research we constantly encounter all kinds of algorithms. The solution of computational problems is described by algorithms. Information processing performed by a compiler is of quite different nature, yet it is also described by algorithms. Even the man who works at a terminal follows some algorithm in his activity. We see that the notion of algorithm is very widespread. Its use is often ambiguous and can be interpreted in more than one way.

Consider, for example, the algorithm of summing several numbers. What does it exactly mean? From the point of view of pure mathematics, the result does not
depend on the order of terms. Therefore, we can accept any order of operands in this context. But that order becomes significant once the addition is performed by a computer. This can be due to the influence of rounding errors, or the idiosyncrasies of the particular computer's instruction set, etc. In that case we must settle on some definite order of summing, discarding all others. The selected order of summing can be expressed in an algorithmic language, say, in FORTRAN. Due to that, we can imagine that the algorithm is essentially a FORTRAN program in this case. Unfortunately, if we run this program on different computers, we may obtain different results. Of course, we exclude the case of the program being incorrect or used wrongly. The different results are accounted for by the different ways computers use to represent numbers, and different rounding-off schemes. Thus, even the possession of a program in an algorithmic language does not provide the full information on the underlying algorithm.

What is, then, an algorithm? The structure of what shall we study in this book? To find the answer to these questions let us discuss the meaning of the word "algorithm" itself.

One can think that the difficulty is caused by our attempting to make use of the most general definition, while we have agreed to study only those algorithms that can be implemented on a computer. More specifically, we are interested in studying the structure of algorithms of computational processes. Let us consult the Mathematical Encyclopaedia - perhaps it can offer a more strict definition.

The Mathematical Encyclopaedia defines an algorithm to be formal instructions specifying the computational process that starts with arbitrary input data (that can vary within certain limits) and aims at achieving the result that corresponds to these data. The Encyclopaedia also defines an algorithm to be a finite set of rules that make it possible to solve mechanically any individual problem from a set of analogous problems. It is understood that the application of rules is defined unambiguously, and that we know the result of each step of the rules application.

It is difficult to start our study basing on such a clouded definition. The Encyclopaedia then proceeds to explain what formal instructions are,
and what the computational process is, and so on. It looks like it is futile to continue our search for a rigorous mathematical definition of algorithm. The notion of algorithm in general is one of the primary notions that cannot be reduced to more basic concepts. We can only rationalize or clarify it, detail it in some way, write it down using some notation, etc. While the hope existed that all mathematical problems can be solved algorithmically, there was no good reason to refine the concept of algorithm. Once a recipe was suggested to solve a problem or a class of problems, there was a tacit agreement that this recipe was an "algorithm". The necessity to refine the notion of algorithm only materialized when it was proven that there exist problems that cannot be solved using algorithms from a specified class.

We start our elaboration of the concept of algorithm by enumerating its features. Here is the list provided by the Mathematical Encyclopaedia:

- the set of possible input data;
- the set of possible results;
- the set of possible intermediate results;
- the rule to start the algorithm execution;
- the rule of processing;
- the rule to end the algorithm execution;
- the rule to obtain the result.

There are several commonly accepted refinements of the general concept of algorithm. We mention the Turing machine, recursive functions and normal algorithms. Strictly speaking, each of the refinements curtails the intuitive notion of algorithm, although in a sense all the refinements are equivalent. Every refinement consists in specifying exactly the range within which each of the seven parameters listed above can vary. The choice of ranges is what distinguishes one refinement from another.

Our list of parameters is minimum minimorum for computational processes. It is often insufficient for the description of many properties of algorithms. Notice that all seven parameters bear entirely or in part on the data, and the algorithm consists in transforming these data. The data have to be stored somewhere during the algorithm execution.
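To make the seven features more tangible, here is a minimal sketch in Python. It is an illustration only: the names `AlgorithmSpec`, `execute` and `EUCLID` are hypothetical and do not come from the Encyclopaedia or from this book, and the two sets are approximated by membership tests. The variable `state` plays the role of the data that must be stored somewhere between steps.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AlgorithmSpec:
    """Hypothetical container for the seven features (an assumption)."""
    valid_input: Callable[[Any], bool]         # the set of possible input data
    valid_result: Callable[[Any], bool]        # the set of possible results
    valid_intermediate: Callable[[Any], bool]  # the set of possible intermediate results
    start: Callable[[Any], Any]                # the rule to start the execution
    process: Callable[[Any], Any]              # the rule of processing
    finished: Callable[[Any], bool]            # the rule to end the execution
    extract: Callable[[Any], Any]              # the rule to obtain the result


def execute(spec: AlgorithmSpec, data: Any) -> Any:
    """Run a specification; `state` is the data kept in memory between steps."""
    if not spec.valid_input(data):
        raise ValueError("input outside the set of possible input data")
    state = spec.start(data)
    while not spec.finished(state):
        state = spec.process(state)
        if not spec.valid_intermediate(state):
            raise ValueError("state outside the set of intermediate results")
    result = spec.extract(state)
    if not spec.valid_result(result):
        raise ValueError("result outside the set of possible results")
    return result


# Euclid's algorithm for two positive integers, written as the seven features.
EUCLID = AlgorithmSpec(
    valid_input=lambda d: len(d) == 2 and all(isinstance(x, int) and x > 0 for x in d),
    valid_result=lambda r: isinstance(r, int) and r > 0,
    valid_intermediate=lambda s: s[0] > 0 and s[1] >= 0,
    start=lambda d: (d[0], d[1]),
    process=lambda s: (s[1], s[0] % s[1]),
    finished=lambda s: s[1] == 0,
    extract=lambda s: s[0],
)

print(execute(EUCLID, (48, 18)))  # prints 6
```

Narrowing or widening any of the seven entries, for instance allowing zero in the input, gives a different specification of what is intuitively the same recipe.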
The media include a sheet of paper, magnetic tape, electronic or optical storage devices, etc. The methods to store data range from icons on paper to physical state of devices. Some kind of memory is involved in any algorithm, either explicitly or implicitly. An agreement must always exist on the rules by which the memory data are accessed. All in all, it seems that introducing memory as part of the concept of algorithm is inevitable. Bypassing trivial cases, if we regard an algorithm as a process then some operations are accomplished first, and many others are performed later. Thus, the results of the earlier operations must be stored somewhere until they are needed. Yet there is no mention of memory either in the general description of algorithms or in the seven features listed above. We will discuss this matter later.

The issue of memory arises again in one of the commonly accepted refinements of the notion of algorithm in mathematics, the Turing machine. We make it our starting point [...], taking into account our decision to consider only those algorithms that can be implemented on computers.

The Turing machine can be thought of as an automatic device that can in general be in a finite quantity of states. There are two special states: the initial and the finite ones. The machine works step-by-step, each step being the transition from one state into another. Note that the subsequent state can coincide with the previous one. The algorithm execution process starts at the initial state and ends at the finite state.

All the information is stored in memory. The memory is a tape stretching infinitely in both directions and partitioned into cells. Each cell can store one letter of some alphabet. The alphabet contains a "void" letter. If a cell is empty then we say that it stores the "void" letter.

The Turing machine has a multifunction read/write head. We can think of it as hanging above the tape. During each step the head is on top of only one cell of the tape. Each step can involve reading/writing one letter.

Performing each step involves the following actions. Suppose the
read/write head is above some cell and the current state of the Turing machine is not finite. Then, depending on the state of the machine and the letter in the cell, some letter is written into the same cell and the machine itself goes to its next state. The written letter can be the same as stored in the cell previously. After that the tape is shifted one cell right or left, or remains motionless. The process comes to end when the Turing machine is in the finite state.

A program of the Turing machine is the enumeration of all possible steps of the machine. This is a precise mathematical definition. In general the contents of the program depends on the alphabet that is being used.

Now let us single out that part of the above description which is important for us right now. A particular specification of the Turing machine can differ from ours in small details. For example, the memory tape can stretch infinitely in only one direction; some other devices may be present, such as the executing one, the tape-pulling one, etc.; there can be several read/write heads with separate tapes, and so on. However, in all cases the functioning of the Turing machine is described by some program. Since Turing machines mirror adequately the intuitive notion of algorithm, this means that exploring algorithm structures is closely associated with exploring the structures of programs that describe the algorithms using Turing machines.

In spite of their immense theoretical value, Turing machines have very little in common with modern computers. Algorithms used in practice are hardly ever written down as Turing machine programs. Therefore it is not expedient to explore the structures of such programs. Nonetheless, our discussion shows that apparently the chief target of algorithm structure investigation must be programs for real or hypothetical machines.
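The description of the Turing machine given in this section can be sketched as a small simulator. This is a hedged illustration in Python, not a definition taken from the book: the function `run_turing_machine`, the state names and the sample program `INVERT` are hypothetical, and the sketch moves the head over the tape, which is equivalent to shifting the tape under a fixed head.

```python
from collections import defaultdict

VOID = " "  # the "void" letter held by every empty cell


def run_turing_machine(program, letters, initial, final, max_steps=10_000):
    """Execute `program` until the final state is reached.

    `program` enumerates the possible steps as a mapping
    (state, letter read) -> (letter to write, head move, next state),
    where the move is -1, 0 or +1. Moving the head by +1 is equivalent
    to shifting the tape one cell to the left under a fixed head.
    """
    tape = defaultdict(lambda: VOID, enumerate(letters))  # unbounded both ways
    head, state, steps = 0, initial, 0
    while state != final:
        if steps == max_steps:
            raise RuntimeError("final state not reached within max_steps")
        letter, move, state = program[(state, tape[head])]
        tape[head] = letter
        head += move
        steps += 1
    used = [cell for cell, letter in tape.items() if letter != VOID]
    if not used:
        return ""
    return "".join(tape[cell] for cell in range(min(used), max(used) + 1))


# Sample program (an assumption): invert every binary digit, stop at the first void cell.
INVERT = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("0", +1, "scan"),
    ("scan", VOID): (VOID, 0, "halt"),
}

print(run_turing_machine(INVERT, "1011", "scan", "halt"))  # prints 0100
```

The dictionary `INVERT` is the whole program: it is nothing more than the enumeration of the steps the machine can take, which is why its contents depends on the alphabet in use.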
2. Algorithm Notations

While analyzing various algorithms, it is easy to notice that we actually analyze not algorithms as processes, but rather some formal notations. There is a grave reason to this: if there were no formal notations for algorithms, then it would be impossible to spread the exact
knowledge of algorithms among the scientific community and consequently no accumulation of knowledge in that area would take place. That is why the choice to analyze a large enough class of algorithms actually leads to analyzing the corresponding formal notations; we have mentioned this while discussing Turing machines.

The absence of the unique interpretation of an algorithm is usually accounted for by the incompleteness of its description. The algorithm designer pursues some specific goal, e.g. to obtain the problem solution with given accuracy; to guarantee sufficient execution speed; to exploit resources of prescribed type and amount; to express the algorithm via operations with rigidly determined properties, etc. But the entire set of conditions under which the designer's goal is achieved is rarely written down together with the algorithm itself. Most often many details are omitted and some conditions are tacitly assumed to hold. Now if that incomplete description of the algorithm is used by somebody or some machine that is guided by other conventions, the algorithm description actually acquires quite different meaning. The consequences of that discrepancy are not at all easy to track.

Here are a few typical examples. If mathematical notation is used to describe an algorithm, then as a rule the execution order of individual operations is not specified precisely. Sometimes it is not specified at all, as in our earlier example of finding the sum of several numbers. This means that the algorithm designer was not concerned about the consequences of different execution orders and left the decision to the end user. Accordingly, the exact execution order is chosen by a programmer who writes code based on mathematical formulas. He usually reasons as follows: "Since the algorithm designer did not specify the execution order, all execution orders must be equivalent, and I can take any one of them following my own preferences". Such arguments can involve considerable danger from the computational point of view. True, the algorithm designer can allow all valid execution orders of operations. They are equivalent from his point of view. Yet this equivalence usually relies on such properties of operations as commutativity, associativity, and distributivity. In this case there is no
need for a meticulous clarification of execution order, as the result would be one and the same under those conditions.

Due to rounding errors the above-listed properties of operations do not hold while the program is executed by a computer. It follows that the contents of "equivalence" implied in the algorithm specification undergoes considerable change. Now the algorithms that were equivalent under exact arithmetics would manifest unlike properties as regards e.g. their numerical stability. If this circumstance is not taken into account then even an experienced user, well aware of the quirks of computer arithmetics, can overlook the instability of some particular algorithm, being quite sure that its stability was accurately studied and certified. The disregard of that phenomenon can produce incorrect results. In spite of this, mathematical formulas are quite often used to write computer programs without careful examination of their applicability.

Rounding errors unavoidably accompany computer-based research. It follows from our discussion that best-suited for our analysis are those algorithm notations that do not rely on any undeclared algebraic properties of operations.

One of the goals of the development of algorithmic languages was to rule out the possibility of multiple interpretations of the notation. Existing algorithmic languages always require the exact specification of the execution order, except for the cases where the independence of some operations is known a priori. Notice that virtually all the languages offer no clues as to the inner structure of operations. Due to different methods of rounding-off, one and the same program would generate different intermediate results on different computers. This essentially implies that even existing algorithmic languages do not provide the means to specify algorithms precisely. Unfortunately we have to get along with this, because no theory of rounding errors is elaborate enough to contribute to algorithmic language design. That is one of the main reasons why it is so difficult to write portable numeric software.

All in all, programs in algorithmic languages are good enough, chiefly in that they supply the exhaustive descriptions of algorithms. However, an unexpected obstacle emerges: most algorithmic languages support but relatively simple algorithm notations.
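The loss of associativity under rounding can be observed directly. The sketch below is an illustration only; it assumes Python floats are IEEE 754 double-precision numbers, which holds on common platforms, and the helper `sum_in_order` is a hypothetical name introduced here.

```python
def sum_in_order(values):
    """Add the values strictly left to right, as a sequential program would."""
    total = 0.0
    for value in values:
        total += value
    return total


big, small = 1.0e16, 1.0

# Mathematically both orders give 1; in double precision they differ,
# because adding 1.0 to a number of magnitude 1e16 is lost to rounding.
first = sum_in_order([big, -big, small])   # (big - big) + small
second = sum_in_order([big, small, -big])  # (big + small) - big

print(first, second)  # prints 1.0 0.0
```

Both calls use the same three terms, so a programmer reasoning that all orders are equivalent would have no warning that the second order discards the small term entirely.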
Host e x i s t i n g languages were c o n c e i v e d Neumann c o m p u t e r execution execution
of
Moreover,
the
m o s t no
architectures.
t i m e shows l i t t l e
way
Since the
individual
d e p e n d s on
mutual
whether
languages
The
algorithmic that
been
knowledge.
" d u s t y deck"
shall
used
that
information
on
dern p a r a l l e l dependent
write
do
a
information
not
is a
and
to
refurbish
of
manually
languages.
and
a
memory
traditional amounts
of
portion
of
looms t h a t
we
small
Danger
data.
implemen-
via
enormous
even
task.
in a l -
share
indirectly
convenient,
of
operations.
traditional
accumulate
formidable
to dig
algorithmic
this
problem. the
I f we
cannot
tasks
sprout
it,
we
out
to
derive
mentations
of
algorithms
hope up
an
contrive we
mo-
achieve
to
exist-
somebody
extract
s h a l l have
the
s t r u c t u r e of
full from
i s , do
achieve
a
to
extensively
information sequential
success. Moreover, use
Can
we
sequential
algorithms
e x e c u t i o n order preserves the
to
it.
hope t o
of e x i s t i n g algorithm
i n d i v i d u a l o p e r a t i o n s ? I f the
expect
fixing
of
not
information,
must d e v e l o p c o n s t r u c t i v e
Another question
original
of
t i o n order of
if
that
notations,
not
can
use
that
to
re-
manually.
c h a n c e t o do
change
efficient
require
about such branches. Since
language
computational branches out
have t h e
not
P a r a l l e l computers would
incorporate
i t out.
The
do
simultaneous execution of data i n -
information
way
architectures
operations.
implies
explicitly
l o t of d i f f i c u l t
parallel
von-Neumann
branches.
the
from
most p r o g r a m s A
opens
program order
help faster
Is described
simple
d a t a dependence o f
computational
must f i n d
solve
To
architectures
programs
ally
subset
r e f l e c t e d i n the
is
traditional
maximum s p e e d w i t h o u t
we
some
von-
h a v e t o embark on I t . Recall
ing
from
worldwide
programs
the the
t h e s e o p e r a t i o n s exchange or
operations
arrangement
have
classic
architectures,
d a t a dependence does n o t
of
of
to accomplish a subset of operations
i t i s not
influence
references.
operations
on
t a t i o n of algorithms, The
those
era
o r none w h a t e v e r dependence on
time required
information
With
i n the
by
notations?
If
procedures
to
languages
fixing
answer i s y e s , on
possible
programs. algorithm in
this
e x i s t i n g sequential
On
extract
the
t h e n we
parallel the
actuexecu-
other
canimplehand,
s t r u c t u r e , then
case
the
algorithmic
we
opportunity languages
on parallel computers. This possibility is currently widely discussed.

It is very important while solving any kind of problem to understand its bottlenecks, or at least to recognize the factors that shape the bottlenecks. Parallel computers themselves do not create bottlenecks if we run sequential programs on them. Yet they highlight the insufficiency of sequential notations to guarantee the efficient implementation of algorithms.

We have mentioned that every description of an algorithm must include the declaration of memory used to store the data, and of the data access method. In the time when the first computers were built, the most popular way to specify an algorithm was to use mathematical formulas. In mathematical formulas memory manifests itself quite clearly: it is associated with the set of all used variables, including indexed variables. The notation used to specify every individual variable can be regarded as the description of a memory cell that is used to store the value of that variable. Mathematical formulas essentially tell us which variables have to be used to evaluate some other variable, and in precisely which way. In other words, they tell us to extract some data from specified memory locations, perform some manipulations, and then store the result into another memory location. It is this memory access scheme that is reflected in classic von-Neumann architectures and in traditional algorithmic languages. Decades later, with the advent of parallel computers it has become clear that to implement algorithms efficiently we must also explicitly specify what operations have impact on these data. It is not enough to specify where the required data are stored. It is the absence of this information that constitutes the narrowest bottleneck in the problem of using sequential programs on parallel computers. Consequently, the successful solution of many problems will depend on obtaining and storing the information about the interconnections of individual operations of algorithms.

To get closer to the answers to our questions let us consider some aspects of algorithm notations in more detail. Let us investigate various methods to compute
$$y = (a_1 a_2 + a_3 a_4)(a_5 a_6 + a_7 a_8). \tag{2.1}$$
We can regard this expression as a statement of some algorithmic language. The execution order of operations of the right-hand side is not unique; indexing does not have anything to do with the structure of the expression. Actually (2.1) spawns several mathematically equivalent algorithms. We will lay out the computation in layers, assuming that operations within each level are executed simultaneously and levels are executed serially in one-by-one fashion. Suppose we have implemented three execution orders:

[Tables: Method 1 (Input data, Layers 1-3), Method 2 (Input data, Layers 1-4), Method 3 (Input data, Layers 1-5)]
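A layered execution order of this kind can be written down and checked mechanically. The sketch below is hedged on several assumptions: it takes (2.1) to have the form y = (a1*a2 + a3*a4)*(a5*a6 + a7*a8), and the three schedules are hypothetical examples with three, four and five layers, not a transcription of the book's tables. The names `OPS`, `run_layers` and `METHODS` are introduced here for illustration.

```python
import operator

# Assumed operation graph: each entry is (function, left operand, right operand).
OPS = {
    "m1": (operator.mul, "a1", "a2"),
    "m2": (operator.mul, "a3", "a4"),
    "m3": (operator.mul, "a5", "a6"),
    "m4": (operator.mul, "a7", "a8"),
    "s1": (operator.add, "m1", "m2"),
    "s2": (operator.add, "m3", "m4"),
    "y": (operator.mul, "s1", "s2"),
}


def run_layers(layers, inputs):
    """Execute layers one by one; operations inside a layer are independent.

    An operand that is neither an input nor the result of an earlier layer
    raises KeyError, so an invalid layering is rejected.
    """
    values = dict(inputs)
    for layer in layers:
        results = {}
        for name in layer:
            func, left, right = OPS[name]
            results[name] = func(values[left], values[right])
        values.update(results)  # results become visible only to later layers
    return values["y"]


# Hypothetical schedules: wide and short, narrow, and narrow with a wasted step.
METHODS = {
    "Method 1": [["m1", "m2", "m3", "m4"], ["s1", "s2"], ["y"]],
    "Method 2": [["m1", "m2"], ["m3", "m4"], ["s1", "s2"], ["y"]],
    "Method 3": [["m1", "m2"], ["s1", "m3"], ["m4"], ["s2"], ["y"]],
}

INPUTS = {f"a{i}": i for i in range(1, 9)}

for name, layers in METHODS.items():
    width = max(len(layer) for layer in layers)
    print(name, len(layers), width, run_layers(layers, INPUTS))
```

Each schedule performs the same seven operations on the same operands; only the number of layers and the number of operations per layer differ.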
These tables give some insight into the structure of algorithms to compute (2.1). We can see that Method 1 produces the result in three steps but it requires four independent processors to carry out individual operations. Methods 2 and 3 use only two processors but they take longer to obtain the result. Notice that with Method 3 the unfortunate choice of execution order incurred a redundant step of computation.

Our tables are cumbersome and not suited for analysis, so we now substitute graphs for tables. While investigating algorithm structures, the contents of individual operations is not important. We assume that graph nodes correspond to algorithm operations and input data. We lay them out in layers in accordance with the tables and place input data in layer 0. Graph arcs correspond to the link of input and output of operations. The resulting graphs that correspond to Methods 1-3 are shown in Fig. 2.1.

[Fig. 2.1]

Notice that the graphs for Methods 1-3 are the same for any input data. More important for us is that the three pictures in Fig. 2.1 actually represent one and the same graph. The three graphs are isomorphic, i.e. there exists a one-to-one mapping of their
13 n o d e s t h a t a l s o maps a r c s of
roundoff
the
same
ters
errors
computer.
as w e l l ,
operations out
is
o n same
Our
important
The
structures.
formulas
involving
scribed
exact
how
same
have
or
we
could
to
store
safely
the operations
(2.1) that
by t h e parentheses i n ( 2 . 1 ) .
This
indicates
a few data
(2.1)
describe
the algorithm
some q u a n t i t i e s .
replaced invent
The i n p u t
regard
o f those
a ,..., a , y 1 8 a totally
as a sequence o f opera-
valid
explicitly variables
from
specified they
plementations
i s o f no
what
that
a a +a a
on which So
The memory
that
locations,
variables
of (2.1) are equivalent
2. 1 c o n -
i f we
carry
out
graphs
we
depend and w h i c h
that
a l l v a l i d im-
as f a r as t h e d a t a would
them i n
I t i s n o t immed-
algorithm
our variables clear
form
processing
i t makes
By b u i l d i n g
i t became q u i t e
the
S
dependencies
between
variables
a r e concerned.
Therefore
they
results
e v e n when r o u n d i n g e r r o r s
influence
the computation.
Consider
locations y e t we c a n
e t c . as d e s c r i b i n g
t o a memory l o c a t i o n . difference
we
1, 2 , . . . , 8, 9
specified,
the operations
of operations.
influence.
importance;
3 4 1 2 3 4
f r o m some memory
( 2 . 1)
sequence
values are stored.
i n ( 2 . 1 ) by i n t e g e r s
a a , a a ,
I t follows
i n r e t r i e v i n g data
clear
addresses
different notation.
12
3
locations.
bet-
d a t a and t h e r e s u l t a r e de-
results are not e x p l i c i t l y
the symbols
S
links
(2.1).
some w a y , a n d s t o r i n g t h e r e s u l t iately
f o r same
are carried
f o r a l l methods t o compute
F i g . 2. 1 m e r e l y
intermediate
required
another
results
O f c o u r s e , n o h y p o t h e s e s a r e made a s t o t h e compu-
representation
could
sist
Precisely
produce
b y t h e a d d r e s s e s o f memory c e l l s w h e r e t h e i r
I
on
be i d e n t i c a l o n d i f f e r e n t compu-
order defined
that would evaluate
The
identical results
e x a m p l e shows t h a t g r a p h s p r o v i d e a handy a p p a r a t u s t o e x p l o r e
algorithm
tions
t h a t even i n t h e presence
yield
property.
ween t h e o p e r a t i o n s . ter
would
computers
This holds
the execution
a very
these
numbers.
I t follows
methods w o u l d
The r e s u l t s
provided
i s not important.
preserve
t o arcs.
the three
yield
Consider again our example of summing numbers. Let us investigate various methods to compute

$$\sum_{i=1}^{n} a_i \qquad (2.2)$$
The exact addition of numbers obeys the laws of commutativity and associativity. Thus the result does not depend on the order of the terms or on the order of execution of the operations. Obviously, for large n the total number of orders to evaluate (2.2) would be quite large. They are all equivalent from the point of view of exact computation. With roundoff errors influence, a substantial difference shows up. Which order shall we choose? If we ignore the absolute values of a_i while choosing the order of summing, then consecutive summing is the worst in accuracy. For n = 8, it looks like this:

    Input data:  a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8
    Layer 1:     a_1 + a_2
    Layer 2:     (a_1 + a_2) + a_3
    Layer 3:     (a_1 + a_2 + a_3) + a_4
    Layer 4:     (a_1 + a_2 + a_3 + a_4) + a_5
    Layer 5:     (a_1 + a_2 + a_3 + a_4 + a_5) + a_6
    Layer 6:     (a_1 + a_2 + a_3 + a_4 + a_5 + a_6) + a_7
    Layer 7:     (a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7) + a_8

The so-called doubling scheme is the best in accuracy. Here we depict it for n = 8:
    Input data:  a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8
    Layer 1:     a_1 + a_2,   a_3 + a_4,   a_5 + a_6,   a_7 + a_8
    Layer 2:     (a_1 + a_2) + (a_3 + a_4),   (a_5 + a_6) + (a_7 + a_8)
    Layer 3:     (a_1 + a_2 + a_3 + a_4) + (a_5 + a_6 + a_7 + a_8)

It is a well-known fact that consecutive summing is worse in accuracy than the doubling scheme by a factor of about n/log_2 n. There is also an instructive difference in data dependencies between individual operations. The graphs in Fig. 2.2 describe consecutive summing and the doubling scheme, respectively. Each node stands for a single addition operation. Even though both graphs incorporate the same number of nodes and correspond to the evaluation of one and the same expression (2.2), they are not isomorphic.
Isomorphic graphs always have identical critical path lengths (the critical path is the longest path in a graph). The first of our graphs has a critical path length of 7, while it equals 3 for the second one. The first graph indicates that consecutive summing has no parallel branches of computation. The second graph demonstrates that the doubling scheme contains a large number of independent operations.

Fig. 2.2
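The comparison of the two summation graphs can be checked mechanically. The following is a minimal sketch, not the author's construction: the function names `consecutive_graph`, `doubling_graph` and `critical_path_nodes` are hypothetical, nodes are numbered by us, and the critical path is measured as the number of addition nodes on the longest path, which is one convention that agrees with the values 7 and 3 quoted above for n = 8.

```python
from collections import defaultdict

def consecutive_graph(n):
    """Arcs of the addition graph for ((a1 + a2) + a3) + ... + an.
    Nodes 1..n-1 stand for the additions; addition k feeds addition k+1."""
    return [(k, k + 1) for k in range(1, n - 1)]

def doubling_graph(n):
    """Arcs of the addition graph for the doubling scheme (n a power of two).
    Nodes 1..n/2 add the input pairs; later nodes add pairs of partial sums."""
    arcs = []
    level = list(range(1, n // 2 + 1))
    next_id = n // 2 + 1
    while len(level) > 1:
        new_level = []
        for i in range(0, len(level), 2):
            arcs.append((level[i], next_id))
            arcs.append((level[i + 1], next_id))
            new_level.append(next_id)
            next_id += 1
        level = new_level
    return arcs

def critical_path_nodes(num_nodes, arcs):
    """Number of nodes on the longest path of a DAG with nodes 1..num_nodes."""
    preds = defaultdict(list)
    for u, v in arcs:
        preds[v].append(u)
    longest = {}

    def depth(v):
        # Longest chain of additions ending at node v (memoised).
        if v not in longest:
            longest[v] = 1 + max((depth(u) for u in preds[v]), default=0)
        return longest[v]

    return max(depth(v) for v in range(1, num_nodes + 1))

# Both graphs for n = 8 contain 7 addition nodes.
consecutive_length = critical_path_nodes(7, consecutive_graph(8))
doubling_length = critical_path_nodes(7, doubling_graph(8))
print(consecutive_length, doubling_length)
```

Since isomorphic graphs must have equal critical path lengths, unequal lengths are enough to conclude that the two graphs are not isomorphic; equal lengths alone would prove nothing.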
Arcs in graphs represent data transfers in the process of algorithm execution. The graphs in Fig. 2.2 illustrate the crucial difference between the two schemes with respect to data transfers.

Comparing the graphs in Figs. 2.1 and 2.2 we readily notice that the second graph in Fig. 2.2 is isomorphic to the graphs in Fig. 2.1. That means that the algorithms they represent (the doubling scheme to sum 8 numbers and the evaluation of (2.1)) have identical structures, even though the outward appearance of formulas (2.1) and (2.2) is quite different. It is clear that algorithms with identical graphs have a lot of properties in common, at least as regards the flow of data during the execution.
Our graphs and tables are merely another notation to express algorithms. They feature some properties that the formula notations do not have. For example, our tables explicitly specify operations that may be executed in parallel. The formula notations lack that explicitness. The drawback of tables is that they poorly reflect the data dependencies between operations on different layers. That kind of information is immediately provided by graphs. On the other hand, the formulas specify the exact structure of every operation, while graphs lack that kind of information altogether.
However, that information is not always needed. If it becomes necessary, the description of graph nodes can be expanded to include the relevant information.

Graph representation of algorithms explicitly specifies the links between individual operations and fully describes all data movements that take place while executing them. Using the graph we can determine whether some operations can be executed simultaneously, which operations must be executed before or after some other operations, etc. The graph of algorithm determines all possible implementations of the algorithm. It is precisely this information that is of vital importance for efficient use of parallel computers, and it is this information that the algorithmic language notation lacks.

Now, the following idea naturally comes to mind. Let us use the existing algorithm notations to build algorithm graphs, and then use the obtained graphs to analyze structures of algorithms. This idea, as a general idea, can hardly be disputed, but its implementation is far from being obvious. Almost immediately we come up against difficulties.

An arbitrary picture of a graph would not help us understand the structure of algorithm. The graphs in Figs. 2.1, 2.2 give a sufficiently clear-cut picture, yet that depends heavily upon a fortunate choice of mapping of graph nodes onto a plane. An arbitrary choice of that mapping could result in a wildly intricate picture of the graphs. The formulas (2.1), (2.2) offer no hints as to how this problem is to be solved. The problem of finding the best mapping can certainly be solved if the number of nodes is small. We could just look over all the possibilities. We only mention this because for an arbitrary algorithm this could prove the only method to build and explore its graph. That is why we stress the smallness of the number of nodes.

As a rule, programs in algorithmic languages describe a huge number of individual operations. Moreover, that number is often not known beforehand, as it depends on some parameters. At this juncture there is no chance to build and explore algorithm graphs by looking over all possible variants.
Actual programs, however, are arbitrary only on the individual statements level. The statements themselves tend to be rather simple. As regards the program as a whole, there are no reasons to believe that the algorithm it describes does not possess any peculiar features. The conciseness of a program due to loops must in some way reflect the structure of algorithms and their graphs. We do not know yet the exact nature of that reflection, but that is another matter.

Mathematical notation, programs, schemes, graphs can all be used to express algorithms. Directly or indirectly, those notations imply memory usage to store data. The ways various notations make use of memory may differ significantly. With mathematical notation, an algorithm is described as a set of equalities. The left-hand side and the right-hand side of any equality must not contain identical variables (taken together with superscripts or subscripts), or else the process would not be specified. It follows that in the memory model that is implied by mathematical notation, no memory cells ever get overwritten. This property holds good for graphs. Naturally, that is not the case with computer programs, however. The necessity to use memory sparingly, i.e. reducing overall memory requirements, resulted in the concept of "assignment" in most programming languages, which has replaced the concept of mathematical "equality". It allowed multiple re-use of the same memory cells, at the same time making algorithm graph building much more difficult if we use a program instead of the mathematical notation.

Our discussion proves that we cannot expect the problem of exploring algorithm structures to be simple. Therefore we shall elaborate our goals and our means before we embark on its solution.
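The difficulty caused by re-used memory cells can be made concrete with a small sketch. It is only an illustration under our own assumptions: the program is a straight-line list of statements (no loops or branches), each statement is written as a hypothetical `(target, sources)` pair, and the function name `build_graph` is ours. An arc links the statement that last wrote a cell to the statement that reads it, so the name of a cell alone does not identify the producer once the cell has been overwritten.

```python
def build_graph(program):
    """Return the arcs (producer, consumer) of a straight-line program.

    `program` is a list of (target, sources) pairs executed in order;
    statements are identified by their position in the list.
    """
    last_writer = {}   # cell name -> index of the statement that last wrote it
    arcs = []
    for i, (target, sources) in enumerate(program):
        for cell in sources:
            if cell in last_writer:      # otherwise the cell holds input data
                arcs.append((last_writer[cell], i))
        last_writer[target] = i          # the old contents are overwritten
    return arcs

# The cell "s" is assigned three times, as an assignment-based language allows.
program = [
    ("s", ("a1", "a2")),   # 0: s = a1 + a2
    ("s", ("s", "a3")),    # 1: s = s + a3
    ("t", ("a4", "a5")),   # 2: t = a4 + a5
    ("s", ("s", "t")),     # 3: s = s + t
]
arcs = build_graph(program)
print(arcs)
```

In the equality-based mathematical notation every intermediate value would carry its own name, and the same arcs could be read off directly from the names.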
3. Graph of Algorithm

A particular computer implementation of an algorithm inevitably requires that changes be made either in the algorithm as a whole, or in some of its fragments. Formally, this adjustment of the algorithm to a particular computer means that the algorithm undergoes some transformation. This transformation may change some properties of the algorithm in an undesirable or, worse, inadmissible way.
18 A l g o r i t h m s were always t a i l o r e d ities.
Yet
this
activity
vent
of
parallei
fact
that parallel
putation
gained
computers.
11
the
requirements
matter
will
quite
imperceptible. For
an
search. about tool ved
end
the
the
any
the
user
actual
result
would
must be
From
that
goal of
computational but
late
of
the
a l l allowable
only
from
results. We
may
the Now,
have
notation. erations
of
some
target
or
operations
the
machine,
Turing
of
the
target
interpret guously.
a
his
as
is
goals
can
i s why
re-
possible
to
result
That
use
the sol-
the be
should
acmain
quite
we
stipupreserve
modification
of
preserve
accuracy
an
grasp
the
the
and
exactly,
any
i s complete in
we
algorithm of
etc.)
a
via
some
that
can
is
language, e t c .
non-contradictory
the
corresponding
to
that
I f the
and
hypo-
execute
able
machine
op-
assumes
(possibly
such machines
that
hypothetical
designer
of
the
describe I t ?
algorithm
algorithm
existence
machine
a
do
an
i m p l i e s p e r f o r m i n g some
automaton,
hypothetical
the
how
s p e c i f i e s . Examples o f
notations
conduct
become
to a guaranteed
a
to
subject to
problem being
algorithms
algorithm
i n some a l g o r i t h m i c
machine
algorithm
of
that
Actually
(or
he
operations
programs w r i t t e n
secondary.
appropriate
explicitly)
machine
sequence o f
mathematical
such
our
wishes
of
approximate
e s s e n t i a l l y we
objects.
just
kind
the
results.
modifications
that
He
to
com-
properties
as
little
development. Other
s p e c i f i c a t i o n o f an
implicitly
thetical)
the
as
i s the nature of t h a t set
stated
Any on
(either
set
the
tool
obtaining
are
the
algorithms
shaped
a
the
the
Otherwise
know
tool.
transformations
choose
what
view
same t h e y
the guaranteed accuracy o f we
to
that
ad-
in
explore
transforming
vaguely
is just
D e p e n d i n g on
algorithms
important,
Thus
of
the
1ies
to
must p i n p o i n t
so
e i t h e r exact or
point
just
get
like
with
this
starting
computers.
can
computer
result.
curacy.
that
and
construction
the
we
preserved while
particular
user,
Ideally,
to obtain
of
ill-defined
swing
for
peculiar-
s p e c i f i c a t i o n of p a r a l l e l
before
detail
computer
large
reason
the
that
in full
t h a t s h o u l d be
be
primary
follows
structures of algorithms
fit
particularly
The
computers r e q u i r e
branches.
of algorithms
to f i t various
the
include perform executes
description then
i t must
language
unambi-
However, even a perfunctory description of a target machine is seldom included in algorithm specifications. Most often the target machine is not even mentioned. The algorithm designer wordlessly assumes that the algorithm notation itself determines the underlying target machine. As we have stated earlier, even if such a description is used, misinterpretation with different assumptions on the target machine can easily occur. We stress again that the differences in interpretation can lead to unpredictable consequences. The unspoken assumptions as to the target machine are the chief source of ambiguity in algorithm notations.

We have noted that an algorithm tailored to a specific machine is fairly often used on another machine, possibly with appropriate modifications. A characteristic example of this is writing FORTRAN programs based on mathematical formulas. FORTRAN programs are inherently intended to be used on uniprocessor computers, yet attempts are made to run them on parallel computers. We must have the proper description of different machines to ensure the correct transportation of programs from one machine to another. It is important to distinguish between the vital requirements of the algorithm and the special needs of a particular target machine.

Suppose our task is to develop an algorithm that will find the matrix inverse. On the design stage what we have is the designer's input and output objects; both are matrices. It is immaterial where these objects are stored and in what manner. The only important thing is that the operations must be performed upon matrix entries. However, if we wish to write down the algorithm, we have to decide on a particular way to identify matrix entries as well as all intermediate results. The usual way to do this is through subscripts. If memory considerations are not important, then any subscripting will do. If the amount of available memory is limited, the algorithm must allow for overwriting the contents of a memory cell by values of other subscripted variables. This complicates the notation and imposes limitations on the choice of subscript organization. Yet the entire set of operations on matrix entries remains unchanged. It is through these operations that the designer's goal is achieved. The crucial point of this discussion is that the designer's goals
20 are
not reflected
i n algorithm notations
They a r e c a m o u f l a g e d and
i t s language
gorithmic is
of
creep
into
execution
order
and r e p e a t e d
should
be h a d i n m i n d , h o w e v e r ,
the execution
order
ideas
that
from
order
scription
as a h i n d r a n c e .
i f he t h i n k s t h a t
parallelism
and i s less
burdened
turn
By d o i n g
Generally
algorithm that
coating
Given following
originating
even w i t h
than
greater t h e FOR-
cloud
has to
t h e mathematical
some
important
somebody's d i s c o v e r i n g
reflected
i n particular
prothem,
properties
by i t . The q u e s t i o n
arises
while
by o u r preoccupation
are responsible
for
with
e f f e c t i v e im-
suppose
rules.
that
This
we c a n e x e c u t e
the algorithm
means t h a t some s e t o f o p e r a t i o n s
a n d f o r e v e r y o p e r a t i o n we c a n d e t e r m i n e , w h i c h
functions
i n some f i x e d
on t h evalues
characteristics
those
their
i t s a r g u m e n t s . Suppose t h a t a l l o p e r a t i o n s
depend o n l y
preserving the
o u t t o be p o s s i b l e . The
computers.
the input data, some p r e s c r i b e d
vector
that
machines
I tturns
i s determined
on p a r a l l e l
is f u l l y defined,
time
garbage
intention.
of the results.
o f algorithms
ations provide
no
c o n c e p t , a n d he
a n y a l g o r i t h m n o t a t i o n masks t h o s e
a r enot d i r e c t l y
accuracy
plementation
sults
thwart
i t n e v e r was h i s
o f the solution
properties
as
o r even
exec-
i t i s p o s s i b l e t o f r e e the a l g o r i t h m n o t a t i o n o f a l l t h e redun-
guaranteed nature
therigid
the a l g o r i t h m designer
orders
s o , he c a n i n v o l u n t a r i l y
speaking,
dedicated
t o t h e m a t h e m a t i c a l de-
by u n e s s e n t i a l
execution
o f the algorithm,
though, o f course,
whether
for a
t h e mathematical d e s c r i p t i o n o f f e r s
a set o f possible
notation. perties
issue
t o u s e a FORTRAN p r o g r a m t o
c o m p u t e r , he c a n r e g a r d
He c a n t h e n
TRAN p r o g r a m . Due t o a number o f r e a s o n s , specify
i t i s n o t always easy t o
o f t h e a l g o r i t h m i n s e a r c h o f a more p u r i f i e d
right
dant
With a l -
t h e c h a f f . The d e t e r m i n a t i o n
i s b y no means a s e c o n d a r y
solve h i s problem on a p a r a l l e l
of
form.
machine
o v e r w r i t i n g o f the contents o f
FORTRAN p r o g r a m m e r . Y e t i f somebody w i s h e s
is
pure
t h ea l g o r i t h m s p e c i f i c a t i o n .
the core o f the designer's
ution
original
cells.
It tell
that
i n their
idiosyncrasies of the target
l a n g u a g e s s u c h a s FORTRAN, ALGOL, e t c . t h e c h i e f i n t e r f e r e n c e
rigid
memory
by numerous
number
o f v a r i a b l e s , and t h e i r r e -
o f t h e s e v a r i a b l e s . Suppose a l s o
influence
oper-
c a n be v i e w e d
the output
of the algorithm.
that I t
i s
21 only
Important
i.e.
a l l the
that
a l l the
operations
completed. A l l i n a l l , rithm
the
set
of
arguments
that
we
have
for
these
postulate that
variables that
are
every
operation
arguments
as
output
an a l g o r i t h m
modified
be
i n the
process of
that
are performed d u r i n g the
the arguments f o r every Note
that
we
do
not
t y p e o f v a r i a b l e s and some a l p h a b e t ,
that (in
the
number
terms o f
T h u s we
do
impose
any
boolean values, of
substantial
o p e r a t i o n s . Our
i n s and
matrices,
outs
admit
the
operation of
summing n
Typically,
class
a l l v a r i a b l e s and
number o f
ferent
the
i s i m m a t e r i a l . For
of
numbers.
each o t h e r ,
i n which
ried out.
I f such classes
and
variables
basic
are
can
are
be
we
f o r given
tions. tion,
I f a we
out o f
link
having
are
an
We
l a c k one
of
data
could the
into
f e a t u r e s . The
sig-
to belong
variables
to the
difsame be
Just
an
of
or
how
real
operations
shall
identify
operation
corresponding
several
input
n e v e r u s e d . We or
the
data. of
ac-
say
numbers
are
differ
actually
that basic
car-
operations
defined.
possible as
nodes.
nodes w i t h
argument
nodes w i t h
an
these
opera-
f o r another
opera-
arc.
The
arc
goes
argument. conventions
arguments,
agree t h a t
that are a c t u a l l y p e r f o r -
graph
i s an
graph
t h e node t h a t p r o d u c e s the
There tion
input
result
vari-
a l g o r i t h m can
the
fixed,
is
become
additions, multiplications,
important manner
algorithm.
algorithm f a l l
operations
Consider the set of a l g o r i t h m operations med
an
example, a l l v a r i a b l e s o f
I t i s not
nor
of
some s p e c i f i c
d i f f e r e n c e between
a l l operations
fixed
numbers.
operations
possessing
assume
Is
i f n
t h i s o p e r a t i o n can
i s f o r v a r i a b l e s and
while
numbers and
from
classes
distinction classes,
divisions
numbers, y e t
numbers
the
letters
only
results]
t h e c h o s e n o p e r a t i o n s ) f o r each o p e r a t i o n o f an
not
on
numbers,
a r r a y s , e t c . We
i f i n p u t v a r i a b l e i s an a r r a y o f n
nificant
be
( i . e . a r g u m e n t s and
ceptable
small
operations
restrictions
v a r i a b l e s can
input v a r i a b l e s are
real
algo-
operation.
a b l e and
a
be
execution;
t h e c o r r e s p o n d e n c e w h i c h shows w h i c h r e s u l t s o f w h i c h
of
must
determines:
execution; the set of operations
are
ready,
or
producing
the r e l a t e d
Unless
f o r the
otherwise
arcs are
case o f the
result
either
specified,
an
we
operathat
is
nonexistent assume
that
all input/output goes via special i/o devices. In that case only those graph nodes that correspond to i/o operations will have no incoming/outgoing arcs, respectively. We do not require one-to-one correspondence between individual i/o data items and i/o operations. The number of such operations can vary to meet the demands of a particular situation. If the pictorial image of a graph becomes annoyingly cumbersome, we would drop some of our agreements to achieve better quality of illustrations.

We will often treat operations and data as vectors. Whenever necessary we will tag arcs and nodes that belong to different classes, thus recognizing operations and data transfers of various kinds.

Strictly speaking, the described graph should be called "the data dependence graph of algorithm execution corresponding to given input data". Clearly, this unwieldy expression is unfit to use. Therefore we shall simplify it to just the graph of algorithm or algorithm graph. Though this term does not fully reveal the nature of the underlying mathematical object, we will see that it adequately reflects the essence of our analysis.

The graph of algorithm that we introduced is a directed acyclic multigraph. We will make use of the standard notation G = (V, E), where V is the set of nodes and E the set of arcs of the graph G. We need several facts from graph theory, chiefly, the terminology and a few basic results. We assume the reader is familiar with basic notions of graph theory (the book [80] easily covers them).

Since the graph of algorithm only fixes the set of executable operations and data links between them, it no longer comprises any information peculiar to the target machine for which the algorithm was originally written. To be more precise, there will remain but a few traces of the target machine, since it can influence the choice of operations and data types. The related graph parameters are the total number of nodes, degrees of nodes, and such. However, most often the choice of data and operations is made before any notation is selected to represent the algorithm. Thus the said parameters are not necessarily connected with the target machine.

We make a point that the graph of algorithm is the kernel of the
designer's idea as represented in the algorithm notation. The algorithm designer could well be aware of that kernel. Quite possibly, he had to mask it because the target machine language did not allow any adequate description of it. But in the vast majority of cases the algorithm designer is not aware of that kernel. Moreover, he doesn't even have the smallest idea of its existence. It is quite understandable since the graph of algorithm is really the informational kernel of it. Up to a short while ago, the knowledge of how exactly the information flows during algorithm execution had no practical value. No wonder nobody strived to acquire it. With the advent of parallel computers this knowledge turns out to be of paramount importance. The graph of algorithm comprises the relevant information explicitly.

By choosing to consider the graph of algorithm rather than the algorithm itself, we are free not to take into account any notation peculiarities, nor the idiosyncrasies of the target machine. After we have used a FORTRAN program to build the graph of algorithm, we are no longer bothered by the serial nature of that language, nor by the multiple re-use of memory cells. The most important thing is that this consideration opens up the possibility to explore the entire set of algorithm implementations that are informationally equivalent. Basing on that research we will be able to select the best implementation for any particular computer, including parallel computers.

The only restriction the graph of algorithm places on the execution order is that before the execution of any operation may start, all the operations that provide its arguments must be completed. Strictly speaking, this is not a restriction, but rather the necessary and sufficient condition for the algorithm to be correct. Therefore the graph of algorithm defines a set of permissible execution orders. However, all permissible execution orders possess a singularly important invariant. Namely, the effect of roundoff errors accumulation does not depend on the execution order. The following statement holds:

STATEMENT 3.1. Suppose that for all operations the roundoff errors are fully determined by argument values. Then for a given set of input data all algorithm implementations that correspond to one and the same
graph yield identical results, including all roundoff errors.

Graph arcs define a partial order on the set of graph nodes. Exploiting that order, we partition the set of graph nodes into subsets $E_0, E_1, \dots, E_k$, where $E_0$ contains all input nodes, and the nodes of $E_{i+1}$ may be pointed to only by the arcs originating from nodes belonging to $E_0, \dots, E_i$, $0 \le i \le k-1$. Obviously, at least one such partitioning exists. As we agreed, input data do not depend on execution order. Now suppose that the results of the operations from $E_0, \dots, E_{i-1}$, $i \ge 1$, do not depend on execution order. Then we can regard those results as input data for the $E_i$ operations. Regardless of execution order, the $E_i$ operations produce the same results, since they depend only on the values of arguments, and the roundoff errors are determined by these values. It follows that the results of the $E_i$ operations do not depend on execution order either.

The statement that we just proved opens up wide research opportunities. We are now free from notation peculiarities, and we can explore the structure of algorithm graphs instead of dealing with algorithms themselves. From now on, we will not distinguish algorithms from their graphs, since they are equivalent from some point of view. In practice an algorithm is seldom specified by its graph, and building the graph from some other algorithm notation is a difficult task. For now, we assume this problem has been solved in some manner.

We have mentioned that an algorithm must be adjusted to the requirements of a particular computer, either entirely or in part. These adjustments amount to a certain transformation of the algorithm. Now we can answer which transformations are permissible: they are those transformations that preserve the graph of algorithm (assuming the basic variables and basic operations remain unchanged) and thus preserve the accuracy of the results.

We now make several remarks bearing on the introduced notion of algorithm graph. First, note that the graph of algorithm is defined only for a given set of input data. If an algorithm is described as an execution schedule for some target machine, then any record of it practically always deals with input data in either direct or indirect way.
For example, Gauss elimination with pivoting depends both on the order of the linear system being solved and on the values of the pivots. The order of the system is specified explicitly on the input, while the values of the pivots depend on input data indirectly. Any branching conditions in a FORTRAN program (or a program in any other algorithmic language) also depend, directly or indirectly, on input data.

One may get the impression that our study is aimed at unconditional algorithms only, that is, at algorithms that have no conditional branches. We stress that we do not impose any substantial restrictions on operations. Thus conditional branches may be relegated to the inner structure of operations. Sometimes the description of algorithm may contain a lot of conditional branches, even though the graph of algorithm does not depend on input data. For example, as we shall see later, a typical FORTRAN program for fast Fourier transform of fixed order includes conditional control transfers fairly often, yet the graph does not depend on input data. Conditional branches are often used just to start or to terminate some iteration process. In that case they do not affect much the structure of algorithm graph, modifying it slightly along the borders of regions where graph nodes are situated.

Strictly speaking, we must regard any algorithm containing conditional branches as a family of similar unconditional algorithms. It is interesting to investigate the general structure of such a family. We will not pursue this issue further now. We can always take the union of all the graphs from a family and investigate the structure of that union. Of course, we shall obtain coarser results, but they often prove to be sufficient for practical purposes.

For unconditional algorithms, it is always possible to build the graph and study it without executing the algorithm. For generic algorithms, this is not always possible.

So we have plenty of reasons to begin studying algorithms from those algorithm graphs that do not depend on input data. Besides, if we try to investigate all possible algorithms, we are risking to waste plenty of effort delving into situations that one hardly ever comes across in practice.

Choosing not to distinguish an algorithm from its graph, we will assume without further notice that all objects and dependencies that may
influence the implementation of algorithm are reflected in the graph. For example, to take into account the overhead for interprocessor communication or memory traffic, we must include new nodes and arcs into the graph that reflect those operations. Their distinguishing feature is that they do not change any data but take time to complete.
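
The inclusion of such auxiliary nodes can be sketched in Python. This is only an illustration under assumptions that are not in the text: the graph is stored as a list of arcs (u, v), and a hypothetical predicate `needs_transfer` decides which arcs stand for interprocessor or memory traffic. Every such arc is split by a node that changes no data and only accounts for the time of the transfer.

```python
def add_transfer_nodes(arcs, needs_transfer):
    """Return a new arc list in which selected arcs carry a transfer node.

    arcs           -- list of (u, v) pairs of an algorithm graph
    needs_transfer -- hypothetical predicate: True if the arc (u, v)
                      stands for interprocessor or memory traffic
    """
    new_arcs = []
    for u, v in arcs:
        if needs_transfer(u, v):
            # The auxiliary node changes no data; it only takes time.
            t = ("transfer", u, v)
            new_arcs.append((u, t))
            new_arcs.append((t, v))
        else:
            new_arcs.append((u, v))
    return new_arcs


# Illustrative data: operations a and b run on processor 0, c on processor 1.
processor = {"a": 0, "b": 0, "c": 1}
arcs = [("a", "b"), ("b", "c")]
expanded = add_transfer_nodes(arcs, lambda u, v: processor[u] != processor[v])
```

In this example only the arc from b to c crosses processors, so only that arc receives an auxiliary node.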
4. Topological Sorting

Our research is aimed at the efficient implementation of algorithms on parallel computers. Determining which operations can be executed in parallel (i.e. simultaneously) is therefore one of our major tasks. We
can p o i n t
grounded.
Given a p a r t i c u l a r which
out the general
The e x e c u t i o n
operations
that
i s being
only
from
Thus
any
implementation, are executed
executed
those
idea
have
sorting
of i t s operations.
sorting
i s that
a l l operations
executed
long
us f r o m
groups a r e connected
of
sorting graph
will
speaking,
a t any time an
operation
i t s arguments
to this induces
moment. a
well-
feature of
among g r o u p s
that
the operations
i nparallel. wherein
be
i n time.
this must
t h a t be-
However, n o t h i n g
the operations
from
i n some way. I t g o e s w i t h o u t s a y i n g
of algorithm operations
of algorithms
completed
The d i s t i n g u i s h i n g
the sortings
nodes a n d v i c e v e r s a .
structure just
considering
receive
f o r execution
are distributed
consecutively. Generally
specific
Obviously,
could
been
t o t h e same g r o u p a r e t o be e x e c u t e d
prevents
any
moment
algorithm
defined
be
t o determine
simultaneously.
that
o f an
our research
necessarily unfold
we a r e a b l e
a t a given
operations
scheduling
on which
o f a l g o r i t h m must
This
i s closely
induces means
related
the corresponding
that
that
sorting
exploring the parallel
to studying
the sortings
we
described.

So far we don't know anything about algorithm structure except for the fact that any algorithm can be described by a directed acyclic graph. Any process that unfolds in time has its beginning and end. Some algorithm notations, such as programs in serial algorithmic languages, offer conspicuous descriptions of the first and the last operation of algorithm. In the graph of algorithm, these operations correspond to the node that has no incoming arcs and to the node that has no outgoing arcs, respectively. Suppose that the graph of an algorithm was not built using those notations that explicitly specify initial and terminal operations. At least one input node and one output node always exist, as the following statement shows.

STATEMENT 4.1. Any directed acyclic graph comprises at least one node to which no arcs point and at least one node from which no arcs originate.

The statement obviously holds for empty graphs (i.e. for those graphs that have no arcs at all). Consider any critical path in the graph. Suppose that its starting node has an incoming arc. There are two possibilities. If the preceding node belongs to the same critical path then we have found a circuit in our acyclic graph. If it is not one of the nodes of the critical path then we have found a path that is longer than the critical path. In both cases there is a contradiction with the assumptions of the statement. The existence of a node that has no outgoing arcs is proved similarly.

Statement 4.1 defines the structure of a simple partitioning of algorithm operations into three groups that can be executed consecutively. The first group contains all the nodes that have no predecessors, the third group contains all the nodes out of which no arcs depart. All the rest belong to the second group. Statement 4.1 does not supply any information on the second group. Nevertheless, provided that the critical path length is not below 2, all three groups are not empty.

STATEMENT 4.2. Consider a directed acyclic graph of n nodes. There exists an integer s ≤ n such that all the nodes can be labeled with integers from {1, 2, ..., s} so that if an arc originates from a node labeled i and points to a node labeled j then i < j.

Choose any number of graph nodes that have no predecessors and label them with "1". Remove all labeled nodes and arcs incident on them from the graph. The resulting graph is acyclic, so by Statement 4.1 it has at least one node that has no predecessors. Now we repeat the labeling process with "2", etc. Since at least one node is labeled on each step, the total number of labels cannot exceed the number of graph nodes.
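
The labeling process used in the proof can be sketched in Python. This is a minimal illustration, not the author's code: the function name and the representation of the graph (a list of nodes plus a list of arcs) are assumptions, and at every step the sketch labels all nodes that currently have no predecessors, which is one admissible choice among the "any number" allowed by the proof.

```python
from collections import defaultdict


def topological_labels(nodes, arcs):
    """Label nodes 1..s so that every arc (u, v) has labels[u] < labels[v].

    At each step all nodes without predecessors receive the next label
    and are removed together with their outgoing arcs.
    Raises ValueError if the graph contains a circuit.
    """
    indegree = {v: 0 for v in nodes}
    successors = defaultdict(list)
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1

    labels = {}
    current = [v for v in nodes if indegree[v] == 0]
    step = 0
    while current:
        step += 1
        for u in current:
            labels[u] = step
        next_group = []
        for u in current:
            # Removing u deletes its outgoing arcs.
            for v in successors[u]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    next_group.append(v)
        current = next_group

    if len(labels) != len(indegree):
        raise ValueError("graph contains a circuit")
    return labels


labels = topological_labels(
    ["a", "b", "c", "d"],
    [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")],
)
```

For this small graph the nodes a and b receive label 1, c receives label 2 and d receives label 3, so three labels are used while the longest path (a, c, d) consists of two arcs.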
The marking of nodes described by Statement 4.2 is called the topological sorting of the graph.

COROLLARY. No identically labeled nodes are connected by an arc.

COROLLARY. The minimal number of labels required to mark all nodes equals the critical path length plus 1.

COROLLARY. For any integer s in the interval between the minimal number of labels and the total number of nodes there exists a topological sorting that utilizes precisely s labels.

Note that, whatever the topological sorting, the paths terminating in a node labeled by k are not longer than k-1. If in the proof of 4.2 we label on each step all the nodes that have no predecessors, then the maximum length of paths terminating in a node labeled by k equals k-1, and the number of used labels exceeds the critical path length by 1. The proof of 4.2 thus essentially yields a constructive method to find all critical paths.

Statement 4.2 still holds if we replace the inequality i < j by i ≤ j. However, not one of the corollaries remains valid. We will refer to the corresponding sorting of nodes as the generalized topological sorting.

There are two reasons to introduce generalized topological sortings. As we shall see later, the set of generalized topological sortings is in a sense a natural closure of the set of topological sortings. Besides, generalized topological sortings are useful as intermediate objects during the search for a specific topological sorting. Given a non-trivial generalized topological sorting, we do not have to consider the entire graph once again. Agreeable results can be obtained by finding appropriate topological sortings for the subgraphs defined by the groups of nodes of the generalized topological sorting. The topological sorting for the entire graph can be composed of the sortings for
subgraphs.

A topological sorting is called linear if all nodes are labeled differently. In that case the directed graph is called linearly ordered. If a (generalized) topological sorting is not linear, other sortings with greater number of labels can be readily derived from it. Certainly, consider a group of identically labeled nodes. Suppose that group includes more than one node. At least one such group must exist, or else our graph would have been linearly ordered. Now perform a (generalized) topological sorting for that group, partitioning it into two groups. For the first of those, the labels remain unchanged. For the second one, and for all the nodes that possess labels with greater value, we increase the values of labels by 1. Clearly, the new marking yields the sorting with the number of labels increased by 1.

We see that the topological sorting is precisely that kind of sorting that we discussed earlier in this section. Consider some topological sorting of the algorithm graph. It induces a partitioning on the set of graph nodes and corresponding algorithm operations. Enumerate the groups of nodes by integers 1, 2, 3, ... . In view of Statement 4.2 the nodes of the first group do not possess incoming arcs. Consequently the first group of operations consists entirely of input operations. As for all other groups, the arcs either do not arrive at them at all or they originate from the groups having smaller numbers. The nodes of the last group do not have outcoming arcs, so the last group consists entirely of output operations. The first Corollary to 4.2 states that there are no arcs that link the nodes of one and the same group. This indicates that all groups of operations can be executed consecutively, starting from the first one. Hence it is possible to execute the operations within each group in
parallel.

Finding topological sortings in general, to say nothing of finding those possessing the desired properties, is a challenging task. The more intricate the structure of the graph, the more tricky this task grows. We need some general method for the reduction of finding topological sortings for complex graphs to finding topological sortings for simpler graphs. By "simpler graphs" we mean those graphs for which our problem is solved easier. A simpler graph can involve more nodes and more arcs than the original graph; in fact, it can even comprise the original graph as its subgraph. In spite of that, it can turn out to be more practical to search for topological sortings for this new graph.

Here is a typical example. We have noted that an algorithm notation defines the graph of algorithm unambiguously only for fixed input data. Conditional branching inside the algorithm should be regarded as specification of a family of similar algorithms. This specification creates in its turn a family of similar graphs. For a given set of input data we may even be unable to tell to which graph of the family they correspond. This would surely be the case if the conditions depend on the input data in a complicated manner, e.g. via intermediate results. If we are not able to establish the correspondence between the graphs and the sets of input data, we will be forced to make use only of the general properties of all graphs of the family.

To get around these obstacles, consider the union of all graphs from the family and investigate its topological sortings. Suppose that the union is an acyclic graph, and suppose that we know some topological sorting for that union. Then topological sortings for all graphs of the family can be readily built. This approach is justified by the following reasoning.

Consider the union G of acyclic graphs G_1, ..., G_s. Suppose that G is acyclic and some topological sorting of G is known. Take any graph G_i. All nodes of G_i are distributed in some way among the groups of topological sorting of G, since they are nodes of G. Quite possibly, some of the groups would contain no nodes of G_i at all. Nevertheless, the topological sorting of G induces a partitioning of the set of nodes of G_i into disjoint subsets, all nodes in a subset belonging to the same group. Now we tag every subset with the label of the corresponding group. Within each subset the nodes of G_i are not linked by arcs, since they belong, as nodes of G, to the same group of topological sorting. The arcs of G_i either do not point to nodes of a particular subset at all, or they originate from a subset with smaller label value. Hence the following statement holds:

STATEMENT 4.3. If the union G of acyclic graphs G_1, ..., G_s is an acyclic graph, then any topological sorting of G induces some topological sorting for every G_i, i = 1, ..., s.

Now suppose that we know some topological sortings for acyclic graphs G_1, ..., G_s. If their union G is acyclic, we may try to build from them a topological sorting for the union. The difficulty lies just in determining which nodes of G should be grouped. The characteristic feature of the topological sortings that we considered earlier is that the sorting of G that generates them can be readily restored from the sortings produced for G_i. To do this we have to bind in groups all nodes of G that belong to identically
labeled subsets.

To make use of the described technique we do not have to have a family of similar graphs and explore their union. Suppose that finding the topological sorting with the desired properties for some graph is complicated. Suppose further that by adding a few nodes and arcs to that graph we obtain another graph for which the desired sorting is easy to find. Then the found sorting will induce the required topological sorting on the original graph.

Note that this expansion of the graph does not have to be performed after the graph is explicitly built. We can as well do that indirectly, by modifying the algorithm notation before building the graph in such a way as to guarantee that the modified graph is built in place of the original one. Besides expansion of the graph, we can try to use graph reduction. Suppose the origin and the end point of an arc are additionally linked by some path. It is clear that deleting that arc from the graph would not alter the set of all topological sortings in any way. Of course, not every reduction would yield useful results. However, this technique, especially after the preliminary expansion of the graph, proves to be quite effective. At least, we can always give it a try. We will extensively use this reduction.

Another kind of reduction that helps explore topological sortings is based on homomorphism operations. Given a graph G = (V, E) we choose any two nodes u, v from V and replace them by a new node. Denote the resulting set of nodes by V'. Define the new set of arcs E' as follows: if an arc from E is incident neither on u nor on v then it gets transferred to E' without change. If an arc from E is incident on one of the nodes u, v then we substitute the new node for u and v to obtain the corresponding arc in E'. We say that the graph G' = (V', E') is derived from G = (V, E) by an elementary homomorphism operation. In terms of dot-and-line representation we can view this operation as merging the two nodes together, transforming accordingly the arcs. The elementary homomorphism operation is associative. A sequence of elementary homomorphisms is called homomorphic convolution or just homomorphism. We apply the term homomorphic convolution to the resulting graph as well.
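
The elementary homomorphism operation can be sketched in Python as follows. The sketch rests on assumptions that are not in the text: the graph is given as a list of nodes and a list of arcs (u, v), and the function name and the argument `merged` (the name of the new node) are chosen here only for illustration. Arcs are kept as a list, so nothing is removed or fused after the merge.

```python
def elementary_homomorphism(nodes, arcs, u, v, merged):
    """Merge the two nodes u and v of a graph into the new node `merged`.

    Arcs not incident on u or v are transferred without change; in all
    other arcs the new node is substituted for u and v.
    """
    if u == v or u not in nodes or v not in nodes:
        raise ValueError("u and v must be two distinct nodes of the graph")
    if merged in nodes:
        raise ValueError("the merged node must not already be in the graph")

    def rename(w):
        return merged if w in (u, v) else w

    new_nodes = [w for w in nodes if w not in (u, v)] + [merged]
    new_arcs = [(rename(a), rename(b)) for a, b in arcs]
    return new_nodes, new_arcs


new_nodes, new_arcs = elementary_homomorphism(
    [1, 2, 3], [(1, 2), (2, 3), (1, 3)], 2, 3, "m"
)
```

In this example the arc (2, 3) turns into the arc ("m", "m") that starts and ends at the new node, while the arcs (1, 2) and (1, 3) both turn into (1, "m").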
An elementary homomorphism can produce parallel arcs, self-loops, cycles, and circuits, even if the original graph had neither of these. We distinguish two modes of elementary homomorphism. If all self-loops are removed and all parallel arcs are fused together after the operation, then we say it is a stern elementary homomorphism. Otherwise we refer to it as multiple elementary homomorphism. A sequence of stern elementary homomorphisms is called a stern homomorphic convolution or just stern homomorphism.

We say that the graph H is homomorphic (sternly homomorphic) to the graph G if it is isomorphic to some homomorphic (stern homomorphic) convolution of G. Graph homomorphism is not a symmetric relation. The graph H is called a homomorphic image of G, and G is called a homomorphic pre-image of H. This terminology also refers to their respective arcs and nodes.

Consider a stern homomorphic convolution H of an acyclic graph G. Suppose that H is acyclic and we know some topological sorting of it. Partition all nodes of G into disjoint groups, ascribing to the same group those nodes whose images in H belong to the same group of topological sorting. Mark the groups of G by corresponding labels of groups of H. It is easy to observe that our partitioning defines a generalized topological sorting of graph G. Certainly, the nodes of any group of G are pre-images for the nodes of a group of H. Since with the stern homomorphic convolution some arcs may lack their images, some nodes belonging to the same group of G may be connected by arcs. That is why we speak of a generalized sorting of G. Consider any arc from G that connects nodes from different groups. Its image in H necessarily connects the nodes from different groups of H. If an arc's image is not a self-loop, then a homomorphic convolution always preserves the direction of an arc. That is why the initial node of the arc of G belongs to a group with smaller label value than the terminal node. Thus, the following statement holds:

STATEMENT 4.4. If a stern homomorphic convolution H of an acyclic graph G is itself an acyclic graph then any (generalized) topological sorting of H induces a generalized topological sorting of G.

We have mentioned earlier that generalized topological sortings
facilitate the investigation of algorithm structures. Homomorphic convolution plays similar role. To illustrate this, consider the following example. Suppose we investigate the parallel structure of an algorithm specified by a FORTRAN program. To solve our problem we have to build the graph of algorithm. First thing, we have to decide which operations should be represented by graph nodes. If we regard individual statements as operations, we will obtain the most detailed description of the algorithm. Graph consideration will incur no loss of information on the structure of the algorithm. However, the meticulous description of the minutest details can cause considerable difficulties while building and exploring the graph. The natural way out is to consider larger operations, such as subroutines, basic blocks or other fragments of the program. On this level we have a simpler graph. We shall refer to that graph as macrograph, stressing that its nodes (and possibly arcs) represent large operations and data transfers.

Of course, the macrograph loses some of the information on the structure of algorithm. But we really do not have to worry about it. Sometimes it is sufficient to obtain the information on the parallel structure of the macrograph. Besides, we can always expand the macronodes, replacing them by their detailed graphs.

Actually, our choice of levels of description is based on homomorphic convolution. Large operations correspond to some subgraphs of the detailed graph where nodes represent individual statements. The macrograph can be derived from the detailed graph by stern homomorphic convolution of each of the subgraphs into a single node. Since stern homomorphism is associative, it does not matter when to perform that convolution. In particular, it can be done before building the graph of algorithm; this amounts to the preliminary choice of large operations. According to Statement 4.4 any topological sorting of the macrograph induces a generalized topological sorting of the detailed graph of algorithm.

A (generalized) topological sorting is fully defined by partitioning graph nodes into groups and ordering these groups. Despite the significant role that the sortings play in exploring the algorithm structures, this does not mean that we can dispense with the algorithm graph. The algorithm graph comprises very important information about the arcs. It describes the exact manner in which the nodes exchange data. Topological sortings lack this information altogether. That is why we investigate topological sortings only as far as they help explore the structure of algorithm graphs.

We conclude this section with a short discussion of the terminology. The notion of topological sorting comes from graph theory [80]. We are justified in borrowing it since we, too, investigate graphs. However, our interest towards graphs is of a rather specific nature, as it is motivated by the implementation of algorithms on parallel computers. In that particular research field an alternative terminology sprang up. In particular, the topological sorting is called the parallel form, and the groups of topological sorting are called the layers of the parallel form. Obviously, the explicit reference to parallelism in the equivalent definition underlines its origin.

The number of layers in a parallel form is called its height; the number of nodes in a layer is the layer's width; the maximum layer width is called the width of the parallel form. A parallel form is called canonical if all input nodes belong to the same layer and at least one path of length i-1 terminates at each node of the i-th layer. The maximum parallel form of algorithm is the one with minimum height; its height is called the height of algorithm. The word "maximum" in this definition stresses that it corresponds to maximally parallel computation from some point of view. We will make use of both terminologies depending on what facet of our research is to be stressed.
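
The height and width of a parallel form are easy to compute once the layers are known. The following Python sketch is an illustration only; it assumes (this is not in the text) that a parallel form is given as a dictionary `labels` mapping every node to the number of its layer, and the function names are hypothetical.

```python
from collections import Counter


def is_parallel_form(labels, arcs):
    """True if every arc (u, v) leads to a layer with a greater number."""
    return all(labels[u] < labels[v] for u, v in arcs)


def parallel_form_metrics(labels):
    """Return (height, layer_widths, width) of a parallel form.

    height       -- the number of (non-empty) layers
    layer_widths -- dict: layer number -> number of nodes in the layer
    width        -- the maximum layer width
    """
    layer_widths = dict(sorted(Counter(labels.values()).items()))
    height = len(layer_widths)
    width = max(layer_widths.values(), default=0)
    return height, layer_widths, width


# Illustrative graph and layer assignment.
arcs = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]
labels = {"a": 1, "b": 1, "c": 2, "d": 3}
valid = is_parallel_form(labels, arcs)
height, layer_widths, width = parallel_form_metrics(labels)
```

Here the parallel form has three layers, the first of which holds two nodes, so its height is 3 and its width is 2.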
5. Schedules and Graph Machine

From a theoretical point of view, the only feature of graphs of algorithm that distinguishes them among all directed multigraphs is that they are acyclic, i.e. they have no circuits. Any directed acyclic graph can be regarded as a graph of some algorithm if we view its nodes as some operations and its arcs as data transfers between the operations. It may seem to be an attractive idea to equate the set of all algorithms to be explored and the set of all directed acyclic graphs, so as to embark on the study of the latter. Our initial results concerning the structure of topological sortings push us in that direction, too.

However, we come up against an unforeseen grave impediment. Most of the important non-trivial problems concerning graphs belong to the class of so-called NP-complete problems. Their solution implies the exhaustive search and analysis of all admissible candidates. As a rule, the time required to perform the task grows exponentially with the number of nodes. Since in real-life cases the number of nodes of algorithm graphs is quite huge, the exhaustive search is impossible from a practical point of view.

As an example consider the following situation. Suppose our task is to find the time-effective implementation of a given algorithm on some multi-processor computer. Since the algorithm is known, the total set of operations to be performed is fully defined. Assume that we know the time required to perform each of them, including all auxiliary operations and data transfers. The situation is rather idealized as regards practice, yet let us simplify it further. Suppose that memory traffic is instantaneous and the amount of memory is unlimited. Now let us try to answer the following outwardly simple question: 'Can a given algorithm be executed on a given number of processors in a given time?' It turns out [39] that this problem is NP-hard.

Unfortunately, practically all worthwhile problems concerning optimal implementation of algorithms on parallel computers are NP-hard. The formulation of such problems in terms of graph theory usually does not pose any difficulties. As a rule, they are tantamount to finding the topological sortings possessing some required properties. For example, the above question is tantamount to the following one: 'Does a given directed graph have a topological sorting that obeys the specified upper bounds for the number of groups and the number of nodes within each group?'

Two extreme opinions exist as regards the NP-hardness of the arising problems. The "pessimists" conclude that no practical solution is possible. The "optimists" suggest searching for effective approximate
methods,
viewpoints simists
using
various
l e a d n o w h e r e as
t h e n we
shall
regards
i s t i n g p r o g r a m s . The
best
we
can
w h e r e a s we
strive
v e s t i g a t e and
make t h i s
c o u n t on
assumption
length of
judgments are
then
have
a
just
sample
do
not
actually
possess tence
That
i s why
a number o f
of
repeated
specifics
we
and
get
required to i n -
out
of
structures. be
polynomial
the
put
i f we
down o n l y
by
dependencies. number
of
that
of
kind
t o w r i t e d o w n i n comarbitrary.
features associated
with
operations.
to p r a c t i c a l
t h e assum-
total
r e a l l y not
of
plight.
However,
large algorithms
impossible
the
algorithms,
neither
p r o p o r t i o n a l to the
around
pes-
i t s execution,
inter-operation
sequences
sticking
can
such
than
the consequence o f
encounter
characteristic
grounds t o surmise t h a t their
t h e way
a c t u a l algorithms are
periodically
the
the o p t i m i s t s i s
time
a l g o r i t h m can
p r a c t i c e p r e c i s e l y because they are
both
( i n t h e number o f
Even w i t h
amount o f
arbitrary
s u c h r e c o r d w o u l d be
o p e r a t i o n s . We in
follow
complexity
methods p r o v i d e
a l g o r i t h m graphs
pact form.
comply w i t h
r e q u i r e much more t i m e
the enumeration o f a l l i t s o p e r a t i o n s The
i f we
polynomial
t o reduce the o v e r a l l
These r e g r e t t a b l e that
I f we
solve a p p l i c a t i o n s problems. Therefore
nor even l i n e a r c o m p l e x i t y
ption
Unfortunately,
h a n d o r t o c r e a t e anew a l l e x -
the d e s i r e d graph problems.
the a l g o r i t h m a n a l y s i s w i l l
etc.
practice.
have t o r e w r i t e by
t h e development of methods w i t h nodes) t o s o l v e
heuristics,
the
Thus
a l g o r i t h m s and
NP-hardness
of
They exis-
we
have
extracting
large
discrete
problems. NP-hardness t h e way
o f our
tional agree
restrictions to
consider
algorithmic create
we
of
risk
enforce
very
on only
languages.
To the
t a n g i b l e hindrance circumvent class
those
of
i t we
However,
the
limitations
i n an
algorithm graph.
this
that
to shrink very
strict
having
little
answer
lies
too
much t h e
limitations
practical
value
real
constantly gets have t o impose
t o be are
restriction
axiomatic
The
that
shall
algorithms
algorithms
the o p p o r t u n i t y to conduct e f f e c t i v e
the a d d i t i o n a l ties
is a
research.
i s not
research.
way
this
of
admissible
then
only
trivial
remain
in
can
i n our
some
approach
As
to
properis
algorithms.
class.
can
postulate
algorithms
somewhere i n b e t w e e n t h e N P - h a r d n e s s a n d
We
existing
sufficient
We
stipulating
danger o f
class
may
explored.
written
in
addi-
that If
or
usual,
triviality.
we
those
Can
the we
37 pinpoint i t ? We
have
stated
i n the preface
even o f t h e most p o p u l a r search of
into
time
search study
into
process
i n order
take a l o t
Right
now t h e r e -
be a c c o m p a n i e d b y t h e
t o shape
i t i n t h e most
research
i s p r i m a r i l y aimed a t those a l g o r i t h m s I ti s expedient
o f s u c h a l g o r i t h m s . To d o t h i s ,
that
t o computerize
include a
the explora-
we must map a l l e x e c u t a b l e
tions onto
some s e t s o f n u m b e r s , i . e . i n t r o d u c e some c o o r d i n a t e
and
coordinates
assign
obtain
their
computers plies
t o e a c h node o f t h e g r a p h . Graph a r c s
coordinates
t o analyze
too.
Having
done
that,
(not necessarily ways
nodes.
t h e n we s h a l l structure, to
I f we
t o introduce
h a r d l y be a b l e
since
to discover
most Any
a coordinate
of
H o w e v e r , we d o n o t know t h e s t r u c t u r e starting
b u t t h e most
nodes
associated
with
whether
vector
general
arcs
by
f o r an e x i s t i n g defines
i s t o be e x e c u t e d . t^ t h e moment
i s a c c o m p l i s h e d . The r e s u l t i n g
More
i s preserved, t h e
a t which
moment o f
Take any e n u m e r a t i o n
o f time vector
t h e development o f e x e c u t i o n
i s r u n time.
or a hypothetical
a t which
the i t h
t - ( t ^ t ^ , ...) i s
every actual or hypothetical algorithm
t itself
s y s t e m , we
later.
o f algorithm implementation
unavoidably
used
o f algorithms
coordinate
properties of algorithms.
be d i s c u s s e d
designed
computer,
and denote
the simplest
accuracy o f t h e r e s u l t s
characteristic
This vector describes The
from
systems w i l l
or serial
graph
ordering
o f graph
of algorithm
time every a l g o r i t h m o p e r a t i o n
node o p e r a t i o n
i.e. the linear
the directions
i sthe
systems
implementation,
parallel
altogether
system
t o the coordinate
the guaranteed
important
t h i s ap-
t h e deepest t r a i t s
coordinate
Provided
then using
t o unearth
s p e c i f y graphs.
"natural"
will
can discuss
i ti s closely related
beforehand. Therefore, hope
ignore
operasystems
large).
s t r a i g h t f o r w a r d enumeration o f a l l operations, graph
we
t h e s t r u c t u r e s o f a l g o r i t h m s . Of c o u r s e ,
to a l l algorithms One o f t h e s i m p l e s t
of
The r e -
way.
l a r g e number o f o p e r a t i o n s . tion
begun and I t w i l l
Inevitably
itself
structure
i s n o t known t o d a y .
t o t h e above q u e s t i o n .
a l g o r i t h m s t r u c t u r e s must
o f the research
Our
the informational
a l g o r i t h m s t r u c t u r e s has J u s t
t o o b t a i n t h e answer
efficient
that
current algorithms
implementation.
process i n time.
does n o t i n c o r p o r a t e any i n f o r m a t i o n as t o t h e
peculiarities of a particular computer, so any restrictions upon its components (if they exist) are determined solely by the graph of algorithm. We will refer to the vector $t$ as the generalized schedule. The word "schedule" informs that we consider the development of the execution process in time, while by "generalized" we mean that no restrictions are imposed on the used computer. In particular, the time required to accomplish an operation can be arbitrarily small or even equal zero; data dependent operations may be executed simultaneously. Clearly, such assumptions are impractical; to compensate for this, we will place additional restrictions on the set of admissible schedules. It turns out that such computer models allow us to solve quite realistic problems concerning algorithm structures.

There are various ways to make sure that our schedules are feasible in practice. For example, we can specify the time $h_j$ required to accomplish the $j$th operation. If an arc goes out of the $i$th node into the $j$th node of an algorithm graph, then the inequality $t_j - t_i - h_j \ge 0$ must hold. The nonnegative vector $h$ will be called the implementation vector. For generalized schedules we have $h_j = 0$ for all $j$. In the general case it is obvious that the condition $h_j > 0$ must hold for all $j$. If two operations are data dependent, then they must be executed sequentially. This requirement cuts a subset out of the set of generalized schedules. In this case we will refer to schedules from that subset as conventional, or strict. For shortness, we will often omit the adjectives "generalized", "conventional", "strict" that qualify a schedule. Our precise meaning is always clear from the context.

An arc originating from the $i$th node of the graph and pointing to its $j$th node allows of the following interpretation. Some data item is generated in the $i$th node at the moment $t_i$ and is transmitted to the $j$th node to be used there. At some moment between $t_i$ and $t_j$ this data item is consumed by the $j$th node that uses it up to the moment $t_j$. The data item being transmitted is not the only thing related to a graph arc. Each schedule determines unambiguously the time when the item first appears and the time it exists (possibly transformed in part by the $j$th node operation). Consequently, we postulate that at the
moment $t_j$ the old data item ceases to exist and the new item is born. If at the moment $t_i$ one and the same data item is broadcast to several nodes, then we assume that it has its own existence on every arc.

Implementing algorithms on actual computers imposes restrictions on data communication: memory access, required overheads, etc. We can allow for those restrictions by introducing lower bounds on the delays. We assume that the delay on the arc connecting the $i$th and the $j$th nodes is not less than some nonnegative number $w_{ij}$, i.e. for all schedules the inequality $t_j - t_i - h_j \ge w_{ij}$ holds for all $i, j$ that correspond to graph arcs. The vector $w$ composed of $w_{ij}$ is called the delay vector. Strictly speaking, the generalized schedule does not place any restrictions on the delays except for their nonnegativeness.

Besides that, we should take into account the i/o processes. The input nodes are often situated close to the places where i/o ports, i/o devices, terminals, etc. reside. They are fairly often bottlenecks for algorithm implementation. That is why we have to explore the possibilities of accounting for the restrictions imposed by the i/o processes. Denote by $r$ the vector whose components correspond to the times at which the input operations start. The problem may consist, for example, in studying the schedules satisfying the inequality $r \ge s$ componentwise. Here, $s$ is a given nonnegative vector. Most often the input nodes represent the border of the region where all graph nodes reside, and this inequality is a necessary condition of algorithm execution. That is why the vector $s$ may be interpreted as the boundary or initial conditions vector.

Given an algorithm and its graph, denote by $R$ the set of all generalized schedules for that algorithm. Denote by $\Omega$ the set of all (existing or hypothetical) computers on which the algorithm can be implemented. It is evident that each particular computer belongs to the set $\Omega$ and defines a subset in $R$ which comprises all possible schedules of the algorithm implementations on that computer. Any constraints imposed on the computer system define a subset of schedules in $R$. For example, the specification of the $h$ or $w$ vectors defines the subsets $R_h$, $R_w$ in $R$. Clearly, we have $R_h \subset R$ and $R_w \subset R$. Other restrictions may in their turn carve out some subsets in $R_h$ and $R_w$ as well. Note that all the sets listed here depend on the algorithm.

We can now draw up the guidelines for our research. Suppose that we are looking for the best way to implement an algorithm on a particular computer, or our task is to design a computer system tailored optimally for that algorithm. Take the minimum set of computer parameters that reflects to some extent the characteristics we are interested in. This determines the set $\Omega_1$ of actual and hypothetical computers on which our algorithm can be implemented and the set $R_1$ of schedules that describe the implementations. Suppose that we can select a model machine in $\Omega_1$ on which all the schedules from $R_1$ can be implemented. Assuming that we have found in $R_1$ the implementation possessing the desired properties, we will transform that model machine into the relevant computer system, preserving the desired properties of the found schedule. Naturally, the set of all schedules will be modified during that transformation.

It is essential that our research will never require any detailed descriptions either of $\Omega$ or of $\Omega_1$. The model machine will prove to be sufficient for all the problems that we will consider. It turns out that such a model machine can be explicitly specified for the set of all generalized schedules. Of course, it will retain its properties for any subset of that set.

Consider a graph of algorithm. We insert a functional unit that is able to perform the required operation into every node of the graph. We assume that each unit receives data required for its work according to the graph, i.e. from those units only whose nodes emit arcs to the considered node. If no ingoing arcs are present then we assume that the
arguments are fed from outside once they are required, without any additional delay, and that the units are enumerated in the same manner as graph nodes. Assume also that each algorithm operation is performed instantly. Therefore, from now on $t_i$ means the moment at which the $i$th operation is performed. We will balance the fictitious nature of the assumption by placing additional restrictions on the set of admissible schedules. The resulting system is referred to as the graph machine.

The graph machine is a mathematical model of a hypothetical computational system. It can carry out the entire algorithm just once. The graph machine has no memory and can therefore store intermediate results only inside the units. Every unit works but once. The graph machine has its advantages and disadvantages as a model of actual computers; they are discussed thoroughly in [88]. For now, we need only that the graph machine helps us explore the algorithms themselves.

Let us stress again that the entire scope of actual implementations, including the hypothetical ones, is accounted for by the fact that the set of all possible working modes of the graph machine fully describes the set of all admissible schedules: we have already mentioned that these two sets are identical. Essentially, the investigation of the minimum time implementation on some, possibly not yet built, computer amounts to the investigation of one of the working modes of the graph machine.

Most often we take as the minimum set of parameters the implementation vector $h$, the delay vector $w$ and the initial conditions vector $s$. These parameters define a subset in the set of all schedules. Of course, we can always search for the required schedule among all generalized schedules, but this search is more effective if the set $R$ is shrunk in accordance with the specified characteristics. The choice of $h$, $w$, and $s$ as our parameters is explained by the following reasons. In the first place, they allow us to describe many problems that have practical value. In the second place, these parameters still enable the efficient exploration of the set of schedules. Any other parameters can be taken into account while transforming the graph machine or by placing more restrictions on the set of admissible schedules.
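The way the parameters h, w and s shrink the set of schedules can be illustrated with a short Python sketch. It is an illustration only: the function name `is_admissible`, the dictionary-based data layout, and the reading of the restrictions as t_j - t_i - h_j >= w_ij on every arc together with t_i >= s_i on every listed input node are assumptions made for this example, not notation taken from the text.

```python
def is_admissible(arcs, t, h=None, w=None, s=None):
    """Check a schedule t against an algorithm graph.

    arcs : iterable of (i, j) pairs, an arc from node i to node j
    t    : dict node -> moment at which the node operation is executed
    h    : dict node -> operation time (assumed 0 where missing)
    w    : dict (i, j) -> lower bound on the delay (assumed 0 where missing)
    s    : dict input node -> earliest allowed moment (assumed layout)
    """
    h = h or {}
    w = w or {}
    s = s or {}
    for i, j in arcs:
        # Assumed form of the restriction on every arc.
        if t[j] - t[i] - h.get(j, 0) < w.get((i, j), 0):
            return False
    # Assumed form of the initial conditions on input nodes.
    return all(t[i] >= bound for i, bound in s.items())


# A small hypothetical graph: nodes 1 and 2 feed node 3, node 3 feeds node 4.
arcs = [(1, 3), (2, 3), (3, 4)]
unit_h = {1: 1, 2: 1, 3: 1, 4: 1}
all_at_once = {1: 0, 2: 0, 3: 0, 4: 0}
step_by_step = {1: 0, 2: 0, 3: 1, 4: 2}

print(is_admissible(arcs, all_at_once))             # generalized schedule
print(is_admissible(arcs, all_at_once, h=unit_h))   # fails once h > 0
print(is_admissible(arcs, step_by_step, h=unit_h))  # strict schedule
```

With h = 0 the schedule that executes everything at the same moment passes the check, which mirrors the remark that generalized schedules let data dependent operations run simultaneously; a positive h or a nonzero delay bound removes such schedules from the admissible set.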
Before we proceed to similar research tasks, consider briefly schedules in their relation to parallel forms of algorithms. Take $w$ zero. To allow for the time required to perform the operations, we replace the moment $t_j$ by the interval $[t_j - h_j, t_j]$ for the $j$th node. If the intervals $[t_i - h_i, t_i]$ and $[t_j - h_j, t_j]$ have non-empty intersection for some pair $i, j$, then the operations corresponding to the $i$th and $j$th graph nodes are performed simultaneously. It follows that nodes linked by an arc cannot be performed simultaneously (provided that $h_j \ne 0$). Thus schedules reveal the opportunity to execute operations in parallel.

The most important inference can be drawn in the case when it takes one and the same time to perform the operations, e.g. $h = (1, 1, \ldots, 1)$. Let every $t_i$ be a positive integer and suppose that the schedule obeys the described conditions. Using the schedule, partition the set of graph nodes into disjoint subsets so that nodes fall into the same subset only if their operations are performed at the same time moment, while the nodes of other subsets are performed at other time moments. Label these subsets by the corresponding time moments. Clearly, no two nodes linked by an arc occur in the same subset. It is easy to recognize that this partitioning of graph nodes defines a parallel form of algorithm. Thus each such schedule gives rise to some parallel form. The reverse statement holds also: every parallel form gives rise to a certain schedule.

We have shown that there exists a one-to-one mapping between the set of all parallel forms of algorithm and a subset of schedules corresponding to the special choice of $h$ and of the delay vector. There are plenty of reasons that make studying schedules much more convenient than studying parallel forms and topological sortings. Even if schedules yielded only the opportunity to explore parallel forms, introducing them would be justified. Actually, as we shall see later, schedules allow us to solve many problems concerning the peculiarities of algorithm implementation on computers in synchronous as well as in asynchronous modes.
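The correspondence between schedules and parallel forms can be made concrete with a small Python sketch. It assumes unit operation times, h = (1, ..., 1), zero delays and integer moments, and it builds only one particular schedule, the one in which every operation is executed as early as the arcs allow; the names `earliest_schedule` and `layers_from_schedule` are introduced here for illustration and do not come from the text.

```python
from collections import defaultdict, deque


def earliest_schedule(nodes, arcs):
    """Return t[v], the earliest integer moment for each node when h = 1.

    Nodes without ingoing arcs get moment 1; every other node is executed
    one time unit after the latest of its predecessors.
    """
    succ = defaultdict(list)
    indeg = {v: 0 for v in nodes}
    for i, j in arcs:
        succ[i].append(j)
        indeg[j] += 1
    t = {v: 1 for v in nodes}
    queue = deque(v for v in nodes if indeg[v] == 0)
    processed = 0
    while queue:
        i = queue.popleft()
        processed += 1
        for j in succ[i]:
            t[j] = max(t[j], t[i] + 1)
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    if processed != len(indeg):
        raise ValueError("the graph contains a cycle")
    return t


def layers_from_schedule(t):
    """Group nodes that share a time moment; each group is one layer."""
    layers = defaultdict(list)
    for v, moment in t.items():
        layers[moment].append(v)
    return [sorted(layers[m]) for m in sorted(layers)]


# Hypothetical graph: nodes 1 and 2 feed node 3, node 3 feeds node 4.
nodes = [1, 2, 3, 4]
arcs = [(1, 3), (2, 3), (3, 4)]
t = earliest_schedule(nodes, arcs)
print(layers_from_schedule(t))
```

Any other integer schedule that respects the arcs yields another partition of the same kind, so the grouping step is the whole of the mapping from schedules to parallel forms in this restricted setting.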
6. Examples

In this section we consider several examples that illustrate the principal issues of the developed theory.

EXAMPLE 6.1. Consider the classical problem of finding the product A = BC of two square matrices B, C of order n. Assume that the matrix entries are numbers and denote them $a_{ij}$, $b_{ij}$, $c_{ij}$, respectively. By definition, we have

\[ a_{ij} = \sum_{k=1}^{n} b_{ik} c_{kj}, \quad i, j = 1, 2, \ldots, n. \tag{6.1} \]

These formulas are fairly often used for the straightforward evaluation of entries of A. They alone do not define any algorithm to evaluate $a_{ij}$ unambiguously, since the order to sum the $b_{ik} c_{kj}$ terms is not specified. Notice that the parallelism is manifested by the absence of any constraints on the order of indices $i, j$. Provided that the additions and multiplications of numbers are performed exactly, all orders to sum are equivalent and generate the same result. Suppose that for some reason the following algorithm is selected to evaluate (6.1):

\[ a_{ij}^{(0)} = 0, \quad a_{ij}^{(k)} = a_{ij}^{(k-1)} + b_{ik} c_{kj}, \quad i, j, k = 1, 2, \ldots, n, \quad a_{ij} = a_{ij}^{(n)}. \tag{6.2} \]

Again the parallelism is explicitly specified since no constraints are placed on $i, j$. Now write this algorithm in a FORTRAN-like language:

      DO 1 i = 1, n
      DO 1 j = 1, n
      a_ij = 0
      DO 1 k = 1, n
      a_ij = a_ij + b_ik c_kj
    1 CONTINUE                                        (6.3)

This notation does not specify the parallelism explicitly, therefore the independence of the indices $i, j$ is not self-evident. If we are interested in implementing algorithms on parallel computers, we cannot do without the information as to which operations can be performed simultaneously. The notation (6.2) specifies the parallelism explicitly, while it is not even mentioned in the program (6.3). Moreover, the serial nature of FORTRAN may suggest that algorithm (6.3) has no parallelism whatever. Still, we would not jump to conclusions.

Using program (6.3), we will build the algorithm graph. We choose a+bc as the basic operation. For large n, quite a lot of such operations are performed during program execution. Therefore we cannot afford for graph nodes to be arbitrarily scattered lest we create undue impediments for future investigation. The program itself suggests an adequate layout for graph nodes. Consider a rectangular grid in 3-D space with coordinates $i, j, k$. For $1 \le i, j, k \le n$ we will put graph nodes into integer nodes of the grid. We assume they correspond to operations of the form a+bc. To keep the picture clear, we will not draw input/output nodes and arcs incident on them. Analyzing (6.3), it is easy to observe that for $k > 1$ the result of the operation situated in the node $(i, j, k-1)$ will be sent to the node $(i, j, k)$.

The structure of the algorithm graph is fairly simple. The graph breaks down into $n^2$ disjoint subgraphs, each subgraph consisting of $n$ nodes and being actually a single path stretched along the $k$ axis. Consequently, we have found in (6.3) the same parallelism as the one specified explicitly in (6.1) and (6.2). For n=2 the graph is shown in Fig. 6.1(a). The full graph would include multiple data broadcasts. In particular, $b_{ik}$ is broadcast to all nodes having the same $i$ and $k$ coordinates. Similarly, $c_{kj}$ is broadcast to all nodes having the same $k$ and $j$ coordinates. Fig. 6.1(b) illustrates these broadcasts for the case n=2. All nodes correspond to operations of the form a+bc, but when k=1 there is the difference: the argument a is not produced by
some other operation; it is set to 0. We can regard this 0 just as another input data item and broadcast it to all nodes in the plane k=1.

Fig. 6.1 (panels a and b)

This example is a good illustration of various ways to choose coordinate systems and algorithm graph layouts. When considering topological sortings and schedules, we confined ourselves to a mere enumeration of the nodes. Of course, this reflects the assumption of arbitrariness of the graph. To be more precise, we do not wish to impose any restrictions the roots of which are not yet clear. If we had merely enumerated the nodes in our example, the graph would have got very obscure and it is not clear at all whether we would have been able to discern the parallelism. With the chosen layout the parallelism is self-evident. Moreover, any layer of nodes of any parallel form can be readily described. In particular, it is a set of nodes that does not include more than one node with the same $i$ and $j$ coordinates. It is quite natural that sooner or later we shall have to exploit the dependence of nodes layout on the algorithm notation. This example anticipates that dependence.

EXAMPLE 6.2. Suppose that we use backward substitution to solve the system of linear algebraic equations Ax=b where A is a nonsingular triangular matrix of order n. Denote by $a_{ij}$, $b_i$ the matrix and right-hand side entries respectively. For simplicity, let A be lower triangular with diagonal entries equal to 1. Then we have

\[ x_1 = b_1, \quad x_i = b_i - \sum_{j=1}^{i-1} a_{ij} x_j, \quad 2 \le i \le n. \tag{6.4} \]

This notation does not define the algorithm unambiguously, since the order of summing is not specified. Consider, for example, consecutive summing in the order of increasing of the $j$ index. The corresponding algorithm in FORTRAN-like language looks like this:

    1 x_1 = b_1
      DO 4 i = 2, n
    2 x_i = b_i
      DO 4 j = 1, i-1
    3 x_i = x_i - a_ij x_j
    4 CONTINUE                                        (6.5)

The dominant operation of the algorithm has the form of the scalar function u-vw. This operation has label 3. It is performed upon various arguments for all allowable values of indices $i, j$. Other operations are non-productive assignment operations. Again, the straightforward enumeration of operations is of no use for graph building. We take a Cartesian coordinate system with axes $i, j$ and put the nodes corresponding to scalar operations u-vw into integer nodes for $2 \le i \le n$, $1 \le j \le i-1$. By analyzing the dependencies between operations we build the graph of algorithm, including the input nodes in it. A sample graph is shown in Fig. 6.2. The parallel form is also displayed; its layers are marked by dashed lines. The input operations of the vector b entries occupy a single layer. The total number of layers containing the u-vw operations equals n-1; these layers acquire consecutive labels.

The order of summing to evaluate (6.4) (in the order of increasing of $j$) seems to have been chosen due to some reasons. However, the choice of the direction could be perfectly casual. Since there
are no obvious advantages in the latter choice, we can construct the backward substitution algorithm using summing in the order of decreasing of $j$:

    1 x_1 = b_1
      DO 4 i = 2, n
    2 x_i = b_i
      DO 4 j = i-1, 1, -1
    3 x_i = x_i - a_ij x_j
    4 CONTINUE                                        (6.6)

The corresponding graph is shown in Fig. 6.3.

Fig. 6.2          Fig. 6.3

If we try to scatter the nodes corresponding to u-vw operations among the layers of some parallel form, we find out that only one node can now reside in any layer. This is accounted for by the fact that all nodes of the graph in Fig. 6.3 belong to one and the same path. That path is represented by dashed arrows. Thus the number of layers in the graph of algorithm (6.6) with respect to u-vw operations equals $(n^2 - n + 2)/2$, which is much more than n-1 layers in the graph of algorithm
(6.5). The obtained result could hardly be expected. Both algorithms (6.5) and (6.6) are designed to solve the same problem. They are based on the same mathematical formulas (6.4) and are equivalent with respect to exact computations. Both algorithms are totally alike as regards their implementation on serial computers, since they require the same amount of additions and multiplications and the same amount of computer memory. Even their roundoff errors are alike: for every system solved by (6.5) there exists such a system solved by (6.6) that all roundoff errors arising in both processes match (if we ignore lower-order terms). The similarity of the algorithms is further emphasized by the fact that the programs are nearly identical.

For all that, these algorithms are completely different. If both algorithms are to be implemented on a parallel system having n general-purpose processors, then (6.5) takes $O(n)$ time units to execute, while (6.6) takes $O(n^2)$ time units. We assume that each operation takes one time unit. In the first case the average usage of processors is about 0.5, in the second case it is close to zero. Thus algorithms that are totally alike as regards their implementation on serial computers can be completely different as regards their implementation on parallel computers.

We present this example to stress that it is that fact that poses the gravest obstacle for the mathematical "mastering" of parallel computers. Specialists in various fields that have been working on serial computers for decades are accustomed (or were trained) to assess algorithms by three chief characteristics: the number of operations, the amount of memory required, and the accuracy. Literally everything was grounded on these features: science, education, fundamental parameters of computers, development of numerical methods and algorithms, efficiency measurements, development of algorithmic languages and their compilers, and so on. The development of parallel computers required knowledge of altogether different algorithm properties and characteristics that were just unimportant for serial computers. Yet we have no reason to count on rapid changes in the attitude towards computer-based research and
algorithm development on the part of the people who acquired strong habits over the decades. That is why it is not only very important to study the parallel properties of algorithms thoroughly, but also to develop a constructive and powerful methodology to explore and extract them exhaustively.

EXAMPLE 6.3. While solving partial differential equations by grid methods, the linear algebraic problem of solving systems of linear equations with block bidiagonal matrices arises fairly often. Such systems occur, for example, during each semi-iteration of SSOR. The individual blocks of a block bidiagonal matrix are quite often bidiagonal or diagonal matrices themselves. For the sake of definiteness we assume that the matrix of the system is lower block bidiagonal of block order m and that the blocks are of order n. Thus we consider a system of linear algebraic equations of the form:

\[
\begin{pmatrix}
B_1 & & & \\
D_2 & B_2 & & \\
 & \ddots & \ddots & \\
 & & D_m & B_m
\end{pmatrix}
\begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_m \end{pmatrix}
=
\begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_m \end{pmatrix}
\tag{6.7}
\]

where $B_k$ and $D_k$ are the blocks with entries $b_{ik}$, $e_{ik}$ and $d_{ik}$, and $U_k = (u_{1k}, u_{2k}, \ldots, u_{nk})^T$, $F_k = (f_{1k}, f_{2k}, \ldots, f_{nk})^T$. The solution to block bidiagonal system (6.7) is determined by the recurrence
\[ U_k = B_k^{-1} (F_k - D_k U_{k-1}), \quad 1 \le k \le m, \tag{6.8} \]

if we set $U_0 = 0$, $D_1 = 0$. We build the graph of algorithm (6.8), assuming that each node corresponds to the macrooperation $X = B^{-1}(F - DU)$ that computes the vector X using the vectors F, U and the matrices B, D. Obviously the graph will be as shown in Fig. 6.4. The picture uses large nodes, emphasizing that the nodes and the data being transmitted on arcs have complex structure.

Fig. 6.4

The structure of the graph clearly shows that the algorithm (6.8) is serial and admits no parallelization on the macrooperation level. Each macrooperation represents the computation of a right-hand side and the solution of a bidiagonal system. An individual macrooperation cannot be parallelized either, since it, too, requires solving bidiagonal systems, and we solve these systems using a recurrence similar to that of (6.8); only the matrix-vector operations can be parallelized. These facts suggest that there is no good parallelization of the algorithm to solve system (6.7) and, hence, that there is no parallelization of the above-mentioned semi-iteration methods. However, such a conclusion would be premature.

Consider a FORTRAN-like notation of the algorithm for block bidiagonal system solving. Using the notation we introduced earlier for matrix and vector entries, we have
      DO 1 i = 1, n
    1 u_i0 = 0
      DO 2 k = 1, m
      u_0k = 0
      DO 2 i = 1, n
      u_ik = b_ik^(-1) (f_ik - e_ik u_(i-1,k) - d_ik u_(i,k-1))
    2 CONTINUE                                        (6.9)

The dominant operation of (6.9) (essentially it is the only one) is

\[ u = b^{-1} (f - e x - d y). \tag{6.10} \]

This operation computes $u$ using $f, e, d, b, x, y$. Again, the straightforward enumeration of operations is no good. To build the graph, consider a rectangular grid on the $i, k$ plane. We put graph nodes into integer nodes of the grid for $1 \le i \le n$, $1 \le k \le m$, assuming that all nodes correspond to the operation (6.10). By the analysis of (6.9) it can be shown that the node with coordinates $(i, k)$ receives the output of operations located in the nodes with coordinates $(i-1, k)$ and $(i, k-1)$. All other data required to perform the operation with coordinates $(i, k)$ is input data that is not needed to execute any other operations. Unlike (6.1), the present graph does not require any broadcast of input data. We put initial zeroes as arguments for some operations. We omit the input nodes and the arcs that feed input data. The graph of algorithm is shown in Fig. 6.5 for the case n=5, m=9.

Despite our apprehensions the graph can be readily parallelized. The layers of the maximum parallel form are drawn in dashed lines. Clearly, the height of the algorithm equals m+n-1, the width of the algorithm is min(m,n). The groups of nodes hemmed by dashed lines in Fig. 6.6 correspond to individual nodes of the graph in Fig. 6.4. They are also the layers of a generalized parallel form. This collection of drawings illustrates how an easily parallelizable algorithm can be turned into
non-parallelizable by the unfortunate choice of macrooperations. Merging operations is widely used while analyzing algorithm structures, since omitting superfluous details can greatly facilitate the investigation. This example warns us to exercise caution while merging operations, so as not to lose important information on algorithm structure.

Fig. 6.5          Fig. 6.6
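The height and width quoted for Example 6.3 can be checked with a short Python sketch. It assumes only what the example states about the graph: nodes sit at integer points (i, k) with 1 <= i <= n, 1 <= k <= m, and node (i, k) depends on nodes (i-1, k) and (i, k-1) where those exist. The function name `wavefront_layers` and the choice to place every node in the earliest possible layer are assumptions of this illustration.

```python
def wavefront_layers(n, m):
    """Place each grid node in the earliest layer its predecessors allow.

    Returns (layer, height, width) where layer maps (i, k) to a layer number,
    height is the number of layers and width is the size of the largest layer.
    """
    layer = {}
    for k in range(1, m + 1):
        for i in range(1, n + 1):
            # Predecessors (i-1, k) and (i, k-1), when they lie inside the grid.
            preds = [layer[p] for p in ((i - 1, k), (i, k - 1)) if p in layer]
            layer[(i, k)] = 1 + max(preds, default=0)
    height = max(layer.values())
    sizes = [0] * (height + 1)
    for level in layer.values():
        sizes[level] += 1
    return layer, height, max(sizes)


# The case drawn in Fig. 6.5: n = 5, m = 9.
layer, height, width = wavefront_layers(5, 9)
print(height, width)
```

Node (i, k) ends up in layer i + k - 1, so the layers are the anti-diagonals of the grid. Merging each column k into one macrooperation, as in (6.8), replaces this picture by a chain of m nodes, which is why the parallelism disappears at the macrooperation level.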
This example is noteworthy for one more reason. Notice that arcs originating from any node can be obtained by shifting those originating from the node with coordinates (1,1). We shall refer to such graphs as regular graphs. The regularity is the reason due to which the maximum parallel form, the height and the width of the algorithm were easily found. Later we shall see that regular graphs occur fairly often in practical algorithms and that regular graphs can in many respects be scrutinized exhaustively.

The graphs in three previous examples have the following common property: for given size, the graphs did not depend on input data. Naturally, this circumstance facilitated graph building and exploration. We shall presently consider a different example.

EXAMPLE 6.4. Given two arrays of numbers $a_i$, $b_i$, $1 \le i \le n$, we wish to compute two numbers a, b using the following program:

      a = 0
      b = 0
      DO 1 i = 1, n
      a = max(a,b) + a_i
      b = min(a,b) + b_i
    1 CONTINUE                                        (6.11)

If we do not know $a_i$ and $b_i$ beforehand, then we cannot foretell whether a or b will be greater than the other for every $i$. Consequently, the graph of algorithm cannot be independent of input data. We will build an expanded graph in such a way as to guarantee that the graph of algorithm for any set of input arguments will be its subgraph. To do this, we will ignore the contents of operations inside the loop. We assume that for all $i$, their arguments are $a, b, a_i$ and $a, b, b_i$. Actually, only one of the numbers a, b is used to update e.g. a, but both numbers a, b are required to determine which of the formulas $a = a + a_i$ and $a = b + a_i$ should be used to compute a. After the choice has been made, either a or b is no longer required. To be more precise, the expansion of the algorithm graph consists in ignoring this circumstance and assuming that both a and b are
The expanded graph for $n = 6$, together with input and output nodes, is depicted in Fig. 6.7. The maximum parallel form of the expanded graph is easy to build. Its layers are shown in dashed lines. It is obvious that the parallel form of the actual algorithm graph for any given set of input data can be formed from the parallel form of the expanded graph by discarding half of its arcs. Namely, only one of either pair of crossing arcs between any adjacent layers of the maximum parallel form remains. We cannot foresee which arcs remain, unless the values of $a_i$, $b_i$ are known in advance. Nevertheless, the expanded graph serves for any set of input data.

[Fig. 6.7]

Of course, it is a very simple example. Yet our point here was to demonstrate the technique of algorithm graph exploration in the presence of conditional operations or conditional branching. We could as well use this technique to study a more complicated graph. We do not consider more complex examples for the only reason that the resulting graphs can hardly be drawn on paper. Unfortunately, this obstacle severely curbs the choice of examples where graphs are to be built.
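The dependence of the algorithm graph of Example 6.4 on input data can be observed directly. The sketch below is only an illustration under stated assumptions: it reads program (6.11) literally (so the second statement sees the already updated $a$), the function name `run_program` is hypothetical, and ties between $a$ and $b$ are resolved in favour of $a$, which the text does not specify.

```python
def run_program(a_in, b_in):
    """Run program (6.11) and record which of a, b each max/min actually selected."""
    a, b = 0, 0
    used = []  # per iteration: (argument chosen by max, argument chosen by min)
    for a_i, b_i in zip(a_in, b_in):
        src_max = "a" if a >= b else "b"       # tie-breaking is an assumption
        a_new = max(a, b) + a_i
        src_min = "a" if a_new <= b else "b"   # min sees the updated a
        b = min(a_new, b) + b_i
        a = a_new
        used.append((src_max, src_min))
    return a, b, used


# Two input sets of the same size select different arcs of the expanded graph.
print(run_program([1, 2, 3], [4, 0, 0]))
print(run_program([1, 2, 3], [0, 0, 0]))
```

In the expanded graph every operation keeps both arcs from $a$ and $b$; each recorded pair tells which of them survives in the actual graph for the given data.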
Chapter 2 Algorithm Execution Time

The time required to execute an algorithm on a computer is one of its major efficiency characteristics. Of course, that time does not depend solely on the algorithm. It also depends considerably on the structure and the characteristics of the computer. The mutual influence of computer structure and algorithm structure on the execution time is rather complex and equivocal. The entire history of the development of computational mathematics and computer technology testifies to this. Some algorithms that were deemed effective on serial computers are often very ineffective on existing parallel computers. However, these same algorithms can become effective once again if new computers appear.

How can we assess the algorithm execution time, then? Clearly, direct measurements on a particular computer are not suitable: the comparison would yield different results on another computer. Therefore we must introduce some abstract method to compare algorithm execution times. In other words, we must compare them on some abstract machine that should not depend on current hardware peculiarities, but should still reflect general trends in computer system design.

Serial computers long ago created a well-defined procedure for the comparison of algorithms with respect to execution time. This well-known procedure consists in comparing the operation counts that both algorithms require to obtain the solution to a given accuracy. This procedure reflects the most important characteristics of implementations of algorithms on serial uniprocessor computers reasonably well and hardly depends at all on hardware modifications. For all these reasons the operations count became an almost universally accepted criterion of algorithm efficiency. Formally, we can treat the operations count as the time required to execute an algorithm on an abstract uniprocessor computer that performs every operation in unit time, while all other activities, such as data i/o, memory traffic, transmitting data via communication channels, etc., do not take any time at all. It is also assumed that operations are executed one-by-one without any breaks.
As we have demonstrated in §6, the operations count becomes altogether inadequate from both theoretical and practical viewpoints for comparing algorithm execution times on parallel computers. Consequently, the abstract serial machine that was used to compare execution times becomes inapplicable either.

Now an appropriate abstract parallel machine can easily be introduced. Suppose that it has an infinite number of processors, each processor performs any operation in unit time, and all other work, including establishing the needed communications, takes no time at all. We could use that machine to compare algorithm execution times on parallel computers. However, we should not rush to introduce such a machine. Before examining the differences between implementations of various algorithms we must understand in which way various implementations of one and the same algorithm differ. Of course, an abstract parallel machine can be used to cope with the latter task. Yet the graph machine described in §5 is quite suitable, too. The fundamental difference between the abstract serial machine and the graph machine is that all algorithms can be implemented on the former, while only one algorithm can be implemented on the latter. Moreover, compared with the abstract parallel machine, the graph machine makes the problem much easier, since it firmly establishes the number of processors, their types, and the communication network. Having studied the set of implementations of the same algorithm with the help of the graph machine, we can readily proceed to the comparison of implementations of various algorithms, i.e. use the abstract parallel machine as our model computer.
In this chapter we set out to explore the set of schedules $R_\omega$. Our principal goal is to investigate the structure of the set of minimums of the functional that represents algorithm execution time on the graph machine. That set is trivial for serial implementations: it consists of those implementations for which there is no idle run of the graph machine. Since eliminating idle runs does not change the execution order, the structure of the set of minimums of the time functional is essentially the same as the set of all implementations. As regards parallel implementations, we cannot eliminate all idle runs for all processors. The only thing that can be safely assumed is that at any moment at least one of the processors is busy executing some algorithm operation. Therefore, with parallel implementations, the structure of the set of minimums of the time functional is rather complicated algebraically.
7. Vector Properties of Schedules

Before we begin to study the properties of schedules, it is necessary to obtain the mathematical relations describing all schedules as a set of vectors. We begin with the description of the set $R_\omega$ of all schedules corresponding to a given delay vector $\omega$. Particular choices of the delay vector define particular kinds of schedules. For example, $\omega = 0$ defines the set of generalized schedules. Other cases will be considered later.

At this point we make a refinement. We have mentioned that any two nodes can be connected by more than one arc. In that case some additional tag should be attached to the notation $\omega_{ij}$ for the delay on the arc connecting the $i$th and the $j$th nodes in order to distinguish between several arcs. We will not do this in order to keep the notation simple. That amounts to the assumption that any two nodes can be linked by not more than one arc; the notation is correct solely in case only one arc connects those two nodes. Our preference is justified by the fact that this assumption does not add any essential constraints.

Given the specification of an algorithm, we observe that not every vector $t$ can be a schedule. In general, the restrictions on $t$ are imposed only by the delay vector. For the vector $t$ to be a valid schedule, we have to make sure that its components allow the execution of the operations together with the delays. If the delay vector is given, we can assume that operations are performed instantly at the moments defined by the schedule components.
STATEMENT 7.1. Let $\omega$ be a given delay vector. The following condition is necessary and sufficient for the vector $t$ to be a schedule corresponding to $\omega$: if an arc originates from the $i$th node and points to the $j$th node then the inequality

$$t_j - t_i - \omega_{ij} \ge 0 \qquad (7.1)$$

must hold, where $\omega_{ij}$ is the component of the delay vector corresponding to that arc.
If an arc goes out of the $i$th node into the $j$th node then one of the results of the $i$th operation is an argument for the $j$th operation. The specification of a nonnegative vector $\omega$ implies that the time interval between the two operations must be greater than or equal to $\omega_{ij}$. This is precisely the condition expressed by (7.1). Thus, the necessity is merely a reformulation of the fact that the schedule $t$ corresponds to the delay vector $\omega$. To prove the sufficiency consider a canonical parallel form of the algorithm graph. For given vectors $t$ and $\omega$ we can always execute the operations of the first layer, since the vector $\omega$ does not impose any restrictions on the moments at which they can be executed. Suppose we can execute all operations up to the $k$th layer. It follows that if (7.1) is satisfied, the operations of the $(k+1)$th layer can be executed, and so on, until all algorithm operations are executed.

Thus, the specification of $\omega$ does not impose any constraints on $t$, except for (7.1). Therefore the set of schedules $R_\omega$ defined by a delay vector is fully specified by (7.1).

While investigating schedules, we will use the notation $\{u\}_s$ to denote the $s$th component of a vector $u$ whenever our formulas get too involved. The index expression may consist of more than one variable, depending on the way the vector components are indexed. As a rule, the schedule vector components have one index, while the delay vector components have two indices.
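Condition (7.1) translates directly into a membership test for $R_\omega$. The following sketch assumes a representation that is not in the text: nodes are numbered from 0, and the delay vector is a dictionary mapping each arc `(i, j)` to $\omega_{ij}$; the name `is_schedule` is hypothetical.

```python
def is_schedule(t, omega):
    """Check (7.1): t[j] - t[i] - omega[(i, j)] >= 0 for every arc (i, j)."""
    return all(t[j] - t[i] - w >= 0 for (i, j), w in omega.items())


# Hypothetical three-node graph with arcs 0->1, 1->2, 0->2 and unit delays.
omega = {(0, 1): 1, (1, 2): 1, (0, 2): 1}

print(is_schedule([0, 1, 2], omega))  # every arc leaves a gap of at least 1
print(is_schedule([0, 1, 1], omega))  # arc 1->2 has no time to deliver its result
```

The first vector satisfies all three inequalities; the second violates the one for the arc from node 1 to node 2.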
STATEMENT 7.2. Let vectors $u$, $v$ be schedules corresponding to delay vectors $\tau$, $\nu$. Then the vector $u + v$ is a schedule corresponding to the delay vector $\tau + \nu$, and for any $\lambda \ge 0$ the vector $\lambda u$ is a schedule corresponding to the delay vector $\lambda\tau$.

For any valid pairs $i,j$ the inequalities $u_j - u_i \ge \tau_{ij}$, $v_j - v_i \ge \nu_{ij}$ hold. Therefore

$$\{u+v\}_j - \{u+v\}_i = (u_j - u_i) + (v_j - v_i) \ge \tau_{ij} + \nu_{ij} = \{\tau+\nu\}_{ij},$$
$$\{\lambda u\}_j - \{\lambda u\}_i = \lambda(u_j - u_i) \ge \lambda\tau_{ij} = \{\lambda\tau\}_{ij},$$

i.e. inequalities analogous to (7.1) also hold for the vectors $u + v$ and $\lambda u$.

This statement establishes the relationship between schedules corresponding to different delay vectors. However, we are more interested in studying the set of schedules corresponding to one and the same delay vector. The following statement, which holds for the set $R_0$ of all generalized schedules, is almost trivial.

STATEMENT 7.3. The set $R_0$ of generalized schedules is a linear cone.

Certainly, let $u$ and $v$ be generalized schedules. By virtue of (7.1) the following inequalities hold:

$$u_j - u_i \ge 0, \qquad v_j - v_i \ge 0 \qquad (7.2)$$

for all valid pairs of indices $i,j$. According to Statement 7.2, the components of the vectors $u + v$ and $\lambda u$, where $\lambda \ge 0$ is arbitrary, satisfy inequalities analogous to (7.2). It follows that the vectors $u + v$ and $\lambda u$ are generalized schedules, so the set $R_0$ is a linear cone.

COROLLARY. The set $R_0$ of generalized schedules is convex and closed.

If $u$ and $v$ are generalized schedules then the inequalities (7.2) hold. Therefore for any $\lambda$, $0 \le \lambda \le 1$, we have

$$\{\lambda u + (1-\lambda)v\}_j - \{\lambda u + (1-\lambda)v\}_i = \lambda(u_j - u_i) + (1-\lambda)(v_j - v_i) \ge 0,$$

i.e. $\lambda u + (1-\lambda)v$ is a generalized schedule and $R_0$ is convex. Suppose that a vector $z$ is an accumulation point of $R_0$. It means that there exists a sequence of generalized schedules $z^k$ that converges to $z$. Then

$$z_j - z_i = \lim_{k\to\infty} z^k_j - \lim_{k\to\infty} z^k_i = \lim_{k\to\infty} (z^k_j - z^k_i) \ge 0,$$

i.e. $z$ turns out to be a generalized schedule.
The set of generalized schedules is a cone due to the fact that it corresponds to the zero delay vector. If a vector $t$ is a schedule for the delay vector $\omega$ then it would still be a schedule for all nonnegative delay vectors whose components do not exceed those of $\omega$. This is proven by the fact that the inequalities (7.1) hold good as $\omega_{ij}$ decreases. In particular, since the components of delay vectors are nonnegative, we conclude that for the delay vectors $\tau$, $\nu$ in Statement 7.2 the inequalities

$$\{\tau+\nu\}_{ij} \ge \tau_{ij}, \qquad \{\lambda\tau\}_{ij} \ge \tau_{ij}$$

hold for $\lambda \ge 1$ and all valid pairs $i,j$. Taking this into account and using Statement 7.2, we obtain

STATEMENT 7.4. The set of schedules corresponding to a given delay vector possesses the following properties:
the sum of schedules is a schedule;
the product of a schedule and a number $\lambda \ge 1$ is a schedule;
the set of schedules is convex and closed.

All proofs are similar to the proofs of Statements 7.2 and 7.3. Prove, for example, the convexity. We have $u_j - u_i \ge \omega_{ij}$, $v_j - v_i \ge \omega_{ij}$. For any $\lambda$, $0 \le \lambda \le 1$, we find that

$$\{\lambda u + (1-\lambda)v\}_j - \{\lambda u + (1-\lambda)v\}_i = \lambda(u_j - u_i) + (1-\lambda)(v_j - v_i) \ge \lambda\omega_{ij} + (1-\lambda)\omega_{ij} = \omega_{ij},$$

i.e. the set of schedules for a given delay vector is indeed convex.
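The properties listed in Statement 7.4 can be tried out on a small instance. The sketch below is an illustration only: the chain graph, the delays and the two schedules are hypothetical values chosen for the example, and `is_schedule` is the obvious test of (7.1) on a dictionary of arcs.

```python
def is_schedule(t, omega):
    """Condition (7.1): t[j] - t[i] >= omega[(i, j)] for every arc (i, j)."""
    return all(t[j] - t[i] >= w for (i, j), w in omega.items())


omega = {(0, 1): 1, (1, 2): 2}   # hypothetical delays on the chain 0 -> 1 -> 2
u = [0, 1, 3]                    # two schedules for omega
v = [0, 2, 5]

total = [x + y for x, y in zip(u, v)]              # sum of schedules
scaled = [2 * x for x in u]                        # product with lambda = 2 >= 1
mixed = [0.5 * x + 0.5 * y for x, y in zip(u, v)]  # convex combination
shrunk = [0.5 * x for x in u]                      # lambda = 0.5 < 1

print(is_schedule(total, omega), is_schedule(scaled, omega),
      is_schedule(mixed, omega), is_schedule(shrunk, omega))
```

The first three vectors are schedules, as the statement predicts. The last one is not, which shows on this instance why the statement requires $\lambda \ge 1$ and why $R_\omega$ with $\omega \ne 0$ is not a cone.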
For a given non-zero delay vector $\omega$ the set of schedules $R_\omega$ is no longer a cone. However, for any $\omega$ the schedules from $R_\omega$ belong to the set of generalized schedules $R_0$. In spite of the fact that $R_0$ includes all the sets $R_\omega$, it also, in a sense, is included in each $R_\omega$.

STATEMENT 7.5. Let $t$ be a given schedule from $R_\omega$. Then for any generalized schedule $u$ the vector $t + u$ is also a schedule from $R_\omega$.

Certainly, for all valid pairs $i,j$ we have

$$t_j - t_i \ge \omega_{ij}, \qquad u_j - u_i \ge 0.$$

Using these inequalities, we obtain

$$\{t+u\}_j - \{t+u\}_i = (t_j - t_i) + (u_j - u_i) \ge \omega_{ij}.$$

Thus, (7.1) holds for the vector $t + u$, and this vector is therefore a schedule from $R_\omega$.
We see that the cone of generalized schedules, when shifted by an arbitrary schedule from $R_\omega$, is a subset of $R_\omega$ for all $\omega$. It is natural to try and find out on what condition the sets $R_\omega$ and $R_0$ are identical to a parallel translation.

STATEMENT 7.6. Suppose the system of linear algebraic equations

$$p_j - p_i = \omega_{ij} \qquad (7.3)$$

is compatible and the vector $p$ is its solution. Then the sets $R_\omega$ and $R_0$ are identical to a shift by the vector $p$.

The vector $p$ is a schedule from $R_\omega$. By Statement 7.5, the set $R_0$ shifted by the vector $p$ is contained in $R_\omega$. It remains to prove that any schedule from $R_\omega$ can be represented as some generalized schedule plus a shift by the vector $p$. Take any schedule $t$ in $R_\omega$ and consider the vector $t - p$. We have $t_j - t_i \ge \omega_{ij}$. Subtracting the equations (7.3) from these inequalities, we conclude that

$$\{t-p\}_j - \{t-p\}_i = (t_j - t_i) - (p_j - p_i) \ge 0,$$

i.e. the vector $t - p$ is a generalized schedule. Since $t = p + (t - p)$, the schedule $t$ is obtained from a generalized schedule by the shift by $p$.

The compatibility of (7.3) provides a sufficient condition for the sets $R_\omega$ and $R_0$ to be identical.
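The two inclusions used in the proof of Statement 7.6 can be followed on numbers. In this sketch the three-node graph, its delays, the solution `p` and the vectors `t` and `u` are all hypothetical values picked so that (7.3) is compatible; the helper names are assumptions as well.

```python
def is_schedule(t, omega):
    """Condition (7.1) for the delay vector omega."""
    return all(t[j] - t[i] >= w for (i, j), w in omega.items())


def is_generalized_schedule(u, omega):
    """Condition (7.2): the same arcs with zero delays."""
    return all(u[j] - u[i] >= 0 for (i, j) in omega)


omega = {(0, 1): 1, (1, 2): 2, (0, 2): 3}   # 3 = 1 + 2, so (7.3) is compatible
p = [0, 1, 3]                                # a solution of (7.3)

t = [0, 2, 6]                                # some schedule from R_omega
diff = [a - b for a, b in zip(t, p)]         # t - p
print(is_schedule(t, omega), is_generalized_schedule(diff, omega))

u = [5, 5, 7]                                # some generalized schedule
shifted = [a + b for a, b in zip(p, u)]      # p + u
print(is_schedule(shifted, omega))
```

Subtracting `p` from a schedule gives a generalized schedule, and adding `p` to a generalized schedule gives a schedule, which is the content of the statement on this instance.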
Now we inquire what necessary conditions are workable. Suppose that $R_\omega$ is generated from $R_0$ via a shift by a vector $s$. What properties are inherent to the schedule $s$? Consider any pair of nodes $i,j$ connected by an arc. Suppose that the arc goes out of the $i$th node into the $j$th node. For any schedule $t$ in $R_\omega$ the vector $t - s$ is a generalized schedule. Hence, the inequality

$$\{t-s\}_j - \{t-s\}_i \ge 0, \quad \text{i.e.} \quad t_j - t_i \ge s_j - s_i,$$

holds. Since $R_\omega$ is closed, there exists such a schedule $\tau$ that $t_j - t_i$ achieves its minimum over $R_\omega$ on it. Due to the fact that $\tau - s$ is a generalized schedule, we have $\tau_j - \tau_i \ge s_j - s_i$. On the other hand, $s$ itself belongs to $R_\omega$, being the shift of the zero generalized schedule, so the minimum cannot exceed $s_j - s_i$. Hence $\tau_j - \tau_i = s_j - s_i$. Thus, we have obtained

STATEMENT 7.7. Suppose the set $R_\omega$ is generated from $R_0$ via a shift by the vector $s$. Then for all pairs $i,j$ corresponding to algorithm graph arcs the minimum value of $t_j - t_i$ over all schedules $t$ in $R_\omega$ is $s_j - s_i$.
Thus, if the sets $R_\omega$ and $R_0$ are identical to a shift by a schedule $s$, the components of $s$ possess a very special property. Namely, they must satisfy the inequalities (7.1) "with minimal gaps", i.e. making each of those inequalities as close to an equality as possible. If the system (7.3) is compatible then all inequalities (7.1) for the schedule $s$ actually become equalities.

We have mentioned in §5 that the specification of some delay vector directs a special mode of algorithm implementation. If the schedule $t$ and the delay vector $\omega$ are selected in an "ill-fitting" way, then some result of the $i$th operation is not consumed by the $j$th operation immediately after the delay $\omega_{ij}$ and must be stored somewhere during the extra time $t_j - t_i - \omega_{ij}$. Thus the residuals $t_j - t_i - \omega_{ij}$ characterize the discrepancy between the schedule $t$ and the vector $\omega$; they can be regarded as unplanned storage times or idle run times caused by that discrepancy. Similar treatment is also valid for both input and output data. Statement 7.7 asserts that for an arbitrary delay vector $\omega$ the schedule $s$ minimizes each of those times.

An important special case is that of a compatible system of equations (7.3). The schedule $p$ corresponding to its solution produces a very special mode of algorithm execution: every data item, whether it is an input or an intermediate value, is used right after it is produced and the delay defined by the vector $\omega$ has elapsed; no additional wait occurs.
Certainly, this circumstance influences the algorithm implementation efficiency. Therefore we will determine which properties of algorithm graph ensure the compatibility of (7.3).

Given a delay vector $\omega$, choose some sequence of graph nodes which form a cycle. In that sequence every two neighboring nodes are connected by an arc, as well as the first and the last node. Now set the direction to traverse the cycle. We assume that an arc is traversed in the positive direction if we first encounter its origin and then its end point; otherwise the arc is traversed in the negative direction. Suppose that during a cycle traverse the $ij$th arc is traversed $r_{ij}$ times in the positive direction and $b_{ij}$ times in the negative direction. We refer to the sum of values $(r_{ij} - b_{ij})\,\omega_{ij}$ taken over all arcs of the cycle as the directed delay of the cycle. The cycle with zero directed delay is called balanced.

STATEMENT 7.8. The system of linear algebraic equations (7.3) is compatible if and only if the algorithm graph has either no cycles at all, or all its cycles are balanced.

Suppose that (7.3) is compatible and $p$ is its solution. If the algorithm graph has cycles, take any one of them and set the traverse direction. With every traverse of the $ij$th arc in the positive direction we associate the equality $p_j - p_i = \omega_{ij}$, and with every traverse of that same arc in the negative direction we associate the equality $p_i - p_j = -\omega_{ij}$. Sum up all the equalities in accordance with the cycle traverse order. The right-hand side of the sum will be the directed delay of the cycle, while the left-hand side will be zero. Certainly, for every $i$ the value $p_i$ that corresponds to the node incident to two neighboring arcs will always be included twice, in neighboring equations (or in the first and the last equation), with both "+" and "-" signs. During the summation, these $p_i$ will annihilate. Hence the cycle is balanced.

Now suppose that the graph of algorithm has no cycles, or all its cycles are balanced. Without losing generality, we can assume the graph to be connected. Take any node and ascribe some value to the corresponding $p_i$. Suppose a connected subgraph is built with such values $p_i$ ascribed to its nodes that the equations (7.3) are satisfied for all arcs of the subgraph. Add one new arc such that at least one of its terminal nodes belongs to the built subgraph.
If the other node does not belong to the subgraph, then (7.3), taken for the added arc, determines the unique value that should correspond to the other node. Let the other node belong to the subgraph, too. Since we have assumed our subgraph to be connected, the added arc has to constitute some cycles together with some of the arcs of the subgraph. Take any of such cycles and set the traverse direction in such a way as to traverse the added arc in the positive direction. Suppose that $r$ is the number of the origin of the added arc, and $l$ is the number of its end point. The corresponding values $p_r$ and $p_l$ must have been defined previously. Traverse all arcs of the cycle except the added one, starting at the $l$th node and finishing at the $r$th node. Again we associate the equality $p_j - p_i = \omega_{ij}$ with every traverse of the $ij$th positively directed arc, and the equality $p_i - p_j = -\omega_{ij}$ with every traverse of that negatively directed arc. Now sum up these equalities. By the same argument as above, we find that the sum of left-hand sides will be $p_r - p_l$. Remembering that the cycle is balanced, we find that the sum of right-hand sides will be $-\omega_{rl}$. Consequently, the equation (7.3) holds for the $rl$th arc. Going on with the process of adding new arcs, we finally ascribe such values $p_i$ to all nodes of algorithm graph that all equations (7.3) will be satisfied. Thus we have obtained the solution of (7.3), which proves the compatibility of that system.

COROLLARY. If the algorithm graph is connected and the system (7.3) is compatible, then its solutions occupy the line directed by the vector $(1,1,\ldots,1)$.

This is proven by the fact that for a connected graph the solution to (7.3) is determined uniquely once any of its components $p_i$ is fixed, and the sum of any solution with the vector $(1,1,\ldots,1)$ is again the solution of (7.3).
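The constructive part of the proof of Statement 7.8 suggests a simple procedure: fix one value per connected component, propagate values along arcs, and check every arc that closes a cycle. The sketch below follows that idea with a breadth-first traversal. It is an illustration rather than the author's algorithm; the function name `solve_potentials`, the 0-based node numbering and the dictionary of arcs are assumptions.

```python
from collections import deque


def solve_potentials(n, omega):
    """Try to solve p[j] - p[i] = omega[(i, j)] for all arcs; return p or None."""
    adj = {v: [] for v in range(n)}
    for (i, j), w in omega.items():
        adj[i].append((j, w))     # arc traversed in the positive direction
        adj[j].append((i, -w))    # the same arc traversed in the negative direction
    p = [None] * n
    for root in range(n):
        if p[root] is not None:
            continue
        p[root] = 0               # one free value per connected component
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, w in adj[u]:
                if p[v] is None:
                    p[v] = p[u] + w
                    queue.append(v)
                elif p[v] != p[u] + w:
                    return None   # the arc closes an unbalanced cycle
    return p


print(solve_potentials(3, {(0, 1): 1, (1, 2): 2, (0, 2): 3}))  # balanced cycle
print(solve_potentials(3, {(0, 1): 1, (1, 2): 2, (0, 2): 1}))  # unbalanced cycle
```

For a connected graph the returned solution is the one with `p[0] = 0`; by the Corollary, all other solutions differ from it by a multiple of $(1,1,\ldots,1)$.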
For any algorithm graph the set of delay vectors $\omega$ that make all cycles of the graph balanced is easy to describe. If all cycles are balanced then (7.3) is compatible and there exists a generalized schedule that is its solution. In that case (7.3) can be regarded as the formulas to derive the components of the delay vector using the components of the schedule. If we take an arbitrary generalized schedule $t$ and define the vector $\omega$ by the formulas $\omega_{ij} = t_j - t_i$, then, obviously, (7.3) will be compatible and therefore all cycles will be balanced; any balancing delay vector belongs to the set described by these formulas. In this sense a generalized schedule balances the cycles of the algorithm graph.

If the algorithm graph cycles are not balanced, the algorithm cannot be implemented without additional waits caused by the discrepancy between the delay vector and the moments at which individual data items are used. Such a situation is described by Statement 7.7. In that case there may exist a schedule that minimizes the wait time for each individual data item. Yet there are cases when even that schedule does not exist. Consider the following example. Let the algorithm graph be specified by Fig. 7.1(a).

[Fig. 7.1, panels a) and b)]

The system (7.3) has the form

$$p_2 - p_1 = \omega_{12}, \qquad p_3 - p_2 = \omega_{23}, \qquad p_3 - p_1 = \omega_{13}.$$

Clearly, it is compatible if and only if $\omega_{13} = \omega_{12} + \omega_{23}$. In this case the set $R_\omega$ is obtained from $R_0$ using the shift by the vector $p$. Even if $\omega_{13} < \omega_{12} + \omega_{23}$, a vector $s$ such that $R_\omega$ is obtained from $R_0$ using the shift by $s$ exists; by virtue of Statement 7.7,

$$s_2 - s_1 = \omega_{12}, \qquad s_3 - s_2 = \omega_{23}, \qquad s_3 - s_1 = \omega_{12} + \omega_{23}.$$

If the inequality $\omega_{13} > \omega_{12} + \omega_{23}$ holds, then the sets $R_\omega$ and $R_0$ are altogether different and no such vector $s$ is possible.

We have mentioned that the compatibility of (7.3) implies the existence of such an algorithm implementation that every data item is used as soon as possible. This does not mean that this implementation is the fastest one for the algorithm on the whole.
Let the algorithm graph be specified by Fig. 7.1(b). We assume $\omega_{ij} = 1$ for all arcs. As the graph has no cycles, the system (7.3) is compatible. Here is one of its solutions:

$$p_1 = 0,\ p_2 = 1,\ p_3 = 2,\ p_4 = 3,\ p_5 = 2,\ p_6 = 3,\ p_7 = 4,\ p_8 = 5.$$

Obviously, the time needed to execute this algorithm is 5. Now consider the schedule

$$t_1 = 0,\ t_2 = 1,\ t_3 = 2,\ t_4 = 3,\ t_5 = 0,\ t_6 = 1,\ t_7 = 2,\ t_8 = 3.$$

It does not satisfy the system (7.3), but it satisfies the inequalities (7.1). The execution time for this schedule is only 3. It is easy to see that this schedule produces the fastest algorithm implementation.
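The contrast between a solution of (7.3) and the fastest schedule can be reproduced numerically. Fig. 7.1(b) is not available here, so the arc list below is a hypothetical graph chosen only to be consistent with the values quoted above (a chain 1, 2, 3, 4, an arc from node 5 to node 4, and a chain 5, 6, 7, 8, renumbered from 0); it should not be read as the figure itself. The helper `earliest_schedule` is likewise an assumption: it computes the earliest moments allowed by (7.1) on an acyclic graph.

```python
def earliest_schedule(n, omega):
    """Earliest moments satisfying t[j] - t[i] >= omega[(i, j)] on an acyclic graph."""
    t = [0] * n
    for _ in range(n):            # n passes are enough for longest paths in a DAG
        for (i, j), w in omega.items():
            t[j] = max(t[j], t[i] + w)
    return t


# Hypothetical arcs with unit delays (0-based node numbers).
arcs = [(0, 1), (1, 2), (2, 3), (4, 3), (4, 5), (5, 6), (6, 7)]
omega = {arc: 1 for arc in arcs}

p = [0, 1, 2, 3, 2, 3, 4, 5]      # the solution of (7.3) quoted in the text
t = earliest_schedule(8, omega)

print(all(p[j] - p[i] == w for (i, j), w in omega.items()), max(p) - min(p))
print(t, max(t))
```

On this graph `p` turns every inequality (7.1) into an equality yet spans 5 time units, whereas the earliest schedule violates the equality on the arc entering node 4 from node 5 and finishes in 3.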
So far we have investigated the set $R_\omega$ of schedules using the algebraic operations of vector addition and scalar-vector multiplication. However, the set of schedules is not particularly rich in meaningful properties with respect to these operations. With respect to addition it is but a commutative semigroup, since schedules addition has no inverse operation. For $\omega \ne 0$, this semigroup does not even have the zero element. The fact that the set of schedules is sometimes a cone (to a parallel translation) has no significant consequences. It is not clear in which manner the particular shape of that cone, specified by the inequalities (7.1), can be exploited. Sticking to the operations of vector addition and scalar-vector multiplication, it is difficult to study the dependence of the schedule properties on the way the input data are fed into the algorithm. These, and also many other difficulties of studying schedules are accounted for by the fact that the usual linear operations of vector addition and scalar-vector multiplication are not intrinsic for schedules. We have formally used these operations since schedules can of course be treated just as other vectors. Yet there are other operations that are more suitable for the investigation of the set $R_\omega$ of schedules. We now proceed to getting acquainted with them.
8. Number Semirings and Other Sets

Consider an arbitrary interval $[a,b]$ on the real numbers axis. We do not demand that the interval's end points be finite; we allow either $a = -\infty$, or $b = +\infty$, or both. Whenever necessary, we assume that the real numbers axis is augmented by either $-\infty$ or $+\infty$ or both of them. Let $x, y \in [a,b]$. We introduce the operations $\oplus$, $\otimes$ by

$$x \oplus y = \max(x,y), \qquad x \otimes y = \min(x,y). \qquad (8.1)$$

Obviously, the result of either operation always belongs to $[a,b]$. Hence any interval $[a,b]$ is closed under operations (8.1).

The operations $\oplus$ and $\otimes$ are commutative. It is easy to verify that they are also associative. Indeed,

$$(x \oplus y) \oplus z = \max(\max(x,y),z) = \max(x,y,z),$$
$$x \oplus (y \oplus z) = \max(x,\max(y,z)) = \max(x,y,z),$$

and by analogy

$$(x \otimes y) \otimes z = \min(\min(x,y),z) = \min(x,y,z),$$
$$x \otimes (y \otimes z) = \min(x,\min(y,z)) = \min(x,y,z).$$

Both operations possess identity elements on $[a,b]$. If we name the $\oplus$ operation "addition" and the $\otimes$ operation "multiplication", then the corresponding identity elements should be called zero and unity, respectively, and designated $\mathbf{0}$ and $\mathbf{1}$. Clearly, in our case $\mathbf{0} = a$, $\mathbf{1} = b$. For all $x \in [a,b]$ the following equalities hold:

$$\mathbf{0} \oplus x = x, \qquad \mathbf{1} \otimes x = x, \qquad \mathbf{1} \oplus x = \mathbf{1}, \qquad \mathbf{0} \otimes x = \mathbf{0}.$$

Note that none of the $\oplus$ and $\otimes$ operations has an inverse operation. Therefore they are essentially semigroup operations.
$$x \otimes (y \oplus z) = (x \otimes y) \oplus (x \otimes z), \qquad x \oplus (y \otimes z) = (x \oplus y) \otimes (x \oplus z).$$

To prove this, we need to verify that for any three numbers $x$, $y$, $z$ the following equalities always hold:

$$\begin{aligned}
\min(x,\max(y,z)) &= \max(\min(x,y),\min(x,z)),\\
\max(x,\min(y,z)) &= \min(\max(x,y),\max(x,z)).
\end{aligned}$$

Any three numbers $x$, $y$, $z$ always satisfy at least one of the following inequalities: $x \le y \le z$, $x \le z \le y$, $z \le x \le y$, $z \le y \le x$, $y \le x \le z$, $y \le z \le x$. Both of the above equalities are verified by the straightforward consideration of each of the six cases. For example, if $x \le y \le z$, then

$$\begin{aligned}
\min(x,\max(y,z)) &= \min(x,z) = x, & \max(\min(x,y),\min(x,z)) &= \max(x,x) = x,\\
\max(x,\min(y,z)) &= \max(x,y) = y, & \min(\max(x,y),\max(x,z)) &= \min(y,z) = y.
\end{aligned}$$

Other cases are treated in the same way.

It follows that every number interval is an Abelian semiring under the operations (8.1). Actually, this semiring possesses some additional properties. Grouping together all the properties of the operations (8.1) we obtain the following list:

$$\begin{aligned}
x \oplus x &= x, & x \otimes x &= x,\\
x \oplus y &= y \oplus x, & x \otimes y &= y \otimes x,\\
(x \oplus y) \oplus z &= x \oplus (y \oplus z), & (x \otimes y) \otimes z &= x \otimes (y \otimes z),\\
x \otimes (x \oplus y) &= x, & x \oplus (x \otimes y) &= x,\\
(x \oplus y) \otimes z &= (x \otimes z) \oplus (y \otimes z). &&
\end{aligned} \tag{8.2}$$

A set whereof the operations satisfy all but the last of the above equalities is called a lattice. If the last equality also holds, then the set is called a distributive lattice.
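The relations (8.2) can be spot-checked numerically. The sketch below is only an illustration, not a proof: it tests every relation on all triples from a small, arbitrarily chosen sample that includes both infinite end points. The function names `oplus`, `otimes` and `satisfies_8_2` are hypothetical.

```python
from itertools import product


def oplus(x, y):
    """'Addition' from (8.1): the maximum of two numbers."""
    return max(x, y)


def otimes(x, y):
    """'Multiplication' from (8.1): the minimum of two numbers."""
    return min(x, y)


def satisfies_8_2(samples):
    """Check every relation of (8.2) on all triples drawn from samples."""
    for x, y, z in product(samples, repeat=3):
        checks = [
            oplus(x, x) == x,
            otimes(x, x) == x,
            oplus(x, y) == oplus(y, x),
            otimes(x, y) == otimes(y, x),
            oplus(oplus(x, y), z) == oplus(x, oplus(y, z)),
            otimes(otimes(x, y), z) == otimes(x, otimes(y, z)),
            otimes(x, oplus(x, y)) == x,
            oplus(x, otimes(x, y)) == x,
            otimes(oplus(x, y), z) == oplus(otimes(x, z), otimes(y, z)),
        ]
        if not all(checks):
            return False
    return True


# Arbitrary sample points, including the infinite end points.
SAMPLES = [float("-inf"), -2.5, -1, 0, 1, 3.5, float("inf")]
print(satisfies_8_2(SAMPLES))
```

A finite sample cannot replace the six-case argument given above; it only guards against transcription slips in the list.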
We have not yet considered the first and the next-to-last lines of the relations (8.2). The former holds trivially for the operations (8.1), while the latter is easily verified by considering all possible cases. If $x \ge y$ then

$$\begin{aligned}
x \otimes (x \oplus y) &= \min(x,\max(x,y)) = \min(x,x) = x,\\
x \oplus (x \otimes y) &= \max(x,\min(x,y)) = \max(x,y) = x.
\end{aligned}$$

If $x \le y$ then

$$\begin{aligned}
x \otimes (x \oplus y) &= \min(x,\max(x,y)) = \min(x,y) = x,\\
x \oplus (x \otimes y) &= \max(x,\min(x,y)) = \max(x,x) = x.
\end{aligned}$$

It is easy to verify that the distributiveness of the $\oplus$ and $\otimes$ operations is equivalent to either of the following properties:

$$\begin{aligned}
(x \otimes y) \oplus z &= (x \oplus z) \otimes (y \oplus z),\\
(x \otimes y) \oplus (x \otimes z) \oplus (y \otimes z) &= (x \oplus y) \otimes (x \oplus z) \otimes (y \oplus z).
\end{aligned}$$

These properties are also readily verified.

The operations (8.1) relate in a certain way to the usual number addition ($+$) and multiplication ($\times$). The interval $[a,b]$ is closed under the $+$ operation in the following cases: $a \ge 0$, $b=+\infty$; $a=-\infty$, $b \le 0$; $a=-\infty$, $b=+\infty$. It is closed under the $\times$ operation for the cases $a=0$, $b=+\infty$; $a=-\infty$, $b=+\infty$; $a^2 \le b \le 1$, $a \le 0$.

We cannot expect our interval to be a lattice for the $+$ and $\times$ operations. That is quite clear: suffice it to mention the fact that the first of the properties (8.2) does not hold for the $+$ and $\times$ operations. However, we can suppose that for some cases our interval would be a semiring at least for the pairs of operations $\oplus$, $+$; $\oplus$, $\times$; $\otimes$, $+$; $\otimes$, $\times$. To establish that, we consider the distributivity of the operations. Since all operations are commutative, it is sufficient to verify the right distributivity only.

Consider the pair of operations $\oplus$ and $+$. The distributivity relations have the form
$$(x \oplus y) + z = (x + z) \oplus (y + z), \qquad (x + y) \oplus z = (x \oplus z) + (y \oplus z),$$

i.e.

$$\max(x,y) + z = \max(x+z,\,y+z), \qquad \max(x+y,\,z) = \max(x,z) + \max(y,z).$$

Obviously, the first of these properties always holds. The second property does not hold. Consider the case $a \ge 0$, $b=+\infty$ and take, for example, such $x$, $y$, $z$ that $x < z$, $y < z$, but $x+y > z$. Then the left-hand side would be $x+y$, but the right-hand side would be $2z$. Similar examples may be built for other kinds of intervals as well. It follows that the $+$ operation is distributive with respect to the $\oplus$ operation, but the $\oplus$ operation is not distributive with respect to the $+$ operation.

Consider the pair of operations $\oplus$ and $\times$. The distributivity relations have the form

$$(x \oplus y) \times z = (x \times z) \oplus (y \times z), \qquad (x \times y) \oplus z = (x \oplus z) \times (y \oplus z),$$

i.e.

$$\max(x,y) \times z = \max(x \times z,\,y \times z), \qquad \max(x \times y,\,z) = \max(x,z) \times \max(y,z).$$

Obviously, the first of these properties holds solely in case the interval $[a,b]$ consists of nonnegative numbers only. If the interval contains negative numbers, then take $x < y < z < 0$. We would have $yz$ on the left-hand side and $xz$ on the right-hand side. Choosing the numbers $x$, $y$, $z$ appropriately, we can easily verify that the second property never holds for all the numbers in the interval. Thus the $\times$ operation is distributive with respect to the $\oplus$ operation solely in case our interval consists of nonnegative numbers only, and the $\oplus$ operation is not distributive with respect to the $\times$ operation.
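The counterexamples described above are easy to reproduce. The following sketch evaluates both sides of each relation for the pairs $\oplus$, $+$ and $\oplus$, $\times$ on concrete integers; the function names are illustrative only.

```python
def plus_over_max(x, y, z):
    """First relation for (max, +): max(x, y) + z == max(x + z, y + z)."""
    return max(x, y) + z == max(x + z, y + z)


def max_over_plus(x, y, z):
    """Second relation for (max, +): max(x + y, z) == max(x, z) + max(y, z)."""
    return max(x + y, z) == max(x, z) + max(y, z)


def times_over_max(x, y, z):
    """First relation for (max, x): max(x, y) * z == max(x * z, y * z)."""
    return max(x, y) * z == max(x * z, y * z)


def max_over_times(x, y, z):
    """Second relation for (max, x): max(x * y, z) == max(x, z) * max(y, z)."""
    return max(x * y, z) == max(x, z) * max(y, z)


# x < z, y < z, but x + y > z: left side is x + y = 4, right side is 2z = 6.
print(plus_over_max(2, 2, 3), max_over_plus(2, 2, 3))

# x < y < z < 0: left side is yz = 2, right side is xz = 3.
print(times_over_max(3, 2, 1), times_over_max(-3, -2, -1))

# Nonnegative numbers still break the second relation: 2 versus 4.
print(max_over_times(1, 1, 2))
```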
Consider the pair of operations $\otimes$ and $+$. The distributivity relations have the form

$$(x \otimes y) + z = (x + z) \otimes (y + z), \qquad (x + y) \otimes z = (x \otimes z) + (y \otimes z),$$

i.e.

$$\min(x,y) + z = \min(x+z,\,y+z), \qquad \min(x+y,\,z) = \min(x,z) + \min(y,z).$$

Again, the $+$ operation is distributive with respect to the $\otimes$ operation, yet the $\otimes$ operation is not distributive with respect to the $+$ operation.

Consider the pair of operations $\otimes$ and $\times$. The distributivity relations have the form

$$(x \otimes y) \times z = (x \times z) \otimes (y \times z), \qquad (x \times y) \otimes z = (x \otimes z) \times (y \otimes z),$$

i.e.

$$\min(x,y) \times z = \min(x \times z,\,y \times z), \qquad \min(x \times y,\,z) = \min(x,z) \times \min(y,z).$$

Once again, the $\times$ operation is distributive with respect to the $\otimes$ operation solely in case the interval consists of nonnegative numbers only, and the $\otimes$ operation is not distributive with respect to the $\times$ operation.

Notice yet one more property binding together operations of various kinds. For any three numbers $x$, $y$, $z$ the equality
$$z \times (x + y) = (z \times x) \oplus (z \times y) + (z \times x) \otimes (z \times y) \tag{8.3}$$

always holds. Let $z \ge 0$. If $x \ge y$ then

$$(z \times x) \oplus (z \times y) = z \times x, \qquad (z \times x) \otimes (z \times y) = z \times y.$$

If $x \le y$ then

$$(z \times x) \oplus (z \times y) = z \times y, \qquad (z \times x) \otimes (z \times y) = z \times x.$$

In case $z \le 0$ we have the following cases. If $x \ge y$ then

$$(z \times x) \oplus (z \times y) = z \times y, \qquad (z \times x) \otimes (z \times y) = z \times x.$$

If $x \le y$ then

$$(z \times x) \oplus (z \times y) = z \times x, \qquad (z \times x) \otimes (z \times y) = z \times y.$$

Our investigations have clarified that not every operation is distributive with respect to each of the other operations. Of course, there is nothing spectacular about that. Even with the conventional operations $+$ and $\times$, the $\times$ operation is distributive with respect to the $+$ operation, but the $+$ operation is not distributive with respect to the $\times$ operation.

Based on the operations $\oplus$, $\otimes$, $+$, and $\times$ we can build various number semirings. In general, any of the mentioned operations can be called "addition" or "multiplication". The only limitation is due to the necessity to satisfy the following requirement: "multiplication" must be distributive with respect to "addition". The total number of such semirings is 7. The valid combinations of operations are shown in Table 8.1 by hatching. This same table also shows the identity elements "zero" and "unity" which correspond to the valid operation pairs. We denote them by the characters 0 and 1 as usual.

The choice of operations that may be accepted in our number semiring imposes certain constraints on the interval $[a,b]$.
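The count of 7 valid operation pairs and the identity (8.3) can both be checked by brute force. The sketch below assumes nonnegative integer samples for the pair test, since the text notes that $\times$ is distributive with respect to $\oplus$ and $\otimes$ only on nonnegative numbers. A finite sample is evidence rather than proof, and the names `OPS`, `distributes`, `valid_pairs` and `identity_8_3` are hypothetical.

```python
import operator
from itertools import permutations, product

# The four candidate operations, keyed by a short display name.
OPS = {"max": max, "min": min, "+": operator.add, "x": operator.mul}


def distributes(mul, add, samples):
    """True if (x add y) mul z == (x mul z) add (y mul z) on every triple."""
    return all(
        mul(add(x, y), z) == add(mul(x, z), mul(y, z))
        for x, y, z in product(samples, repeat=3)
    )


def valid_pairs(samples):
    """Ordered (addition, multiplication) pairs of distinct operations that pass."""
    return [
        (a, m)
        for a, m in permutations(OPS, 2)
        if distributes(OPS[m], OPS[a], samples)
    ]


def identity_8_3(samples):
    """Check z*(x+y) == max(z*x, z*y) + min(z*x, z*y), i.e. relation (8.3)."""
    return all(
        z * (x + y) == max(z * x, z * y) + min(z * x, z * y)
        for x, y, z in product(samples, repeat=3)
    )


NONNEGATIVE = range(0, 5)   # assumption: the interval holds nonnegative numbers
PAIRS = valid_pairs(NONNEGATIVE)
print(len(PAIRS), PAIRS)
print(identity_8_3(range(-3, 4)))
```

Integers are used so that every comparison is exact; with floating-point samples the equality tests would need a tolerance.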
These constraints may be due to the necessity of keeping the interval closed under the chosen operations, as well as to the distributivity requirement. Also, the existence of the identity elements 0 and 1 is required. It is frequently assumed, in addition, that the "product" of zero and any other number of the semiring is always zero. For example, if we take the $\oplus$ operation as "addition", then the identity element 0 is the left end point of the interval, so that end point (possibly $-\infty$) need be included in the interval. [...] In what follows we would chiefly consider the semirings of Table 8.1 on the interval $[0,+\infty]$, with "infinity" included.
[Table 8.1: valid combinations of the "addition" and "multiplication" operations, with the corresponding identity elements 0 and 1.]
Various partial order relations $(\ge)$ can easily be introduced in all the semirings that we consider. Of course, the following properties should hold for any $a$, $b$, $c$: $a \,(\ge)\, a$; if $a \,(\ge)\, b$ and $b \,(\ge)\, a$, then $a = b$; if $a \,(\ge)\, b$ and $b \,(\ge)\, c$, then $a \,(\ge)\, c$. It turns out, however, that each semiring admits a "natural" partial order that satisfies a number of additional properties. Namely:

- if two numbers are ordered by $(\ge)$, the order would be preserved if another number is "added" to both of them;
- if two numbers are ordered by $(\ge)$, the order would be preserved if they are "multiplied" by another number which "is greater than or equals" zero;
- every number "multiplied" by itself produces the result which "is greater than or equals" zero.

Such a partial order can be introduced either by the standard order of real numbers or by its reverse.

Consider, for example, the semiring in the interval $[0,+\infty]$ defined by the "addition" $\otimes$ and the "multiplication" $+$. Clearly, $+\infty$ plays the role of zero, since $\min(+\infty, x) = x$ for every $x$. We assume that the inequality $x \,(\ge)\, y$ is the same as the usual inequality $x \le y$. Obviously, all properties of the partial order relation $(\ge)$ hold.

The first additional property means that the inequality $x \,(\ge)\, y$ must imply $x \otimes z \,(\ge)\, y \otimes z$ for all $z$. This amounts to the following demand: the usual inequality $x \le y$ must imply $\min(x,z) \le \min(y,z)$. If $z \le x \le y$, then $\min(x,z)=z$ and $\min(y,z)=z$, so $\min(x,z) \le \min(y,z)$ holds. In case $x \le z \le y$ we find that $\min(x,z)=x$ and $\min(y,z)=z$, so again $\min(x,z) \le \min(y,z)$. Finally, if $x \le y \le z$, then $\min(x,z)=x$ and $\min(y,z)=y$, so the relation holds as well.

The property concerning a number "multiplied" by itself is equivalent to the inequality $x + x \,(\ge)\, 0$, which has the usual meaning $2x \le +\infty$ and is self-evident. The property concerning "multiplication" states that the inequality $x \,(\ge)\, y$ must imply $x + z \,(\ge)\, y + z$ for all $z$ such that $z \,(\ge)\, 0$. Recall that $z \,(\ge)\, 0$ amounts actually to the usual inequality $z \le +\infty$, which holds for all nonnegative $z$. It follows that this property demands that for all nonnegative $z$ the inequality
$x \le y$ must imply $x + z \le y + z$, which is obviously true. For other semirings the additional properties of the partial order are investigated in much the same way.

So far we have implicitly assumed that the elements of the semirings are arbitrary real numbers in the interval $[a,b]$. This is by no means a principal requirement. The only necessary condition is the closedness of the number set under the chosen operations. For the considered operations integer numbers, rational numbers, numbers proportional to a given number, etc. are all valid number sets.

Now we proceed to defining vector operations. We will discuss only the most important kinds of vector operations. We introduce the operations $\oplus$, $\otimes$, $\circ$, which we will refer to as vector "addition", vector "multiply", and "scalar-vector multiplication", respectively. These operations are analogous to the scalar operations and are defined as follows:

$$(u \oplus v)_j = u_j \oplus v_j, \qquad (u \otimes v)_j = u_j \otimes v_j, \qquad (\lambda \circ u)_j = (u \circ \lambda)_j = u_j + \lambda. \tag{8.4}$$

Of course, these componentwise relations hold for all $j$. [...] The following properties are obvious:

$$\begin{aligned}
\lambda \circ (u \oplus v) &= (\lambda \circ u) \oplus (\lambda \circ v), & \lambda \circ (u \otimes v) &= (\lambda \circ u) \otimes (\lambda \circ v),\\
(\lambda \oplus \mu) \circ u &= (\lambda \circ u) \oplus (\mu \circ u), & (\lambda \otimes \mu) \circ u &= (\lambda \circ u) \otimes (\mu \circ u).
\end{aligned}$$
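The vector operations (8.4) and the four distributivity properties listed after them can be illustrated on plain Python lists. The sketch below is a minimal illustration on arbitrary sample vectors; the helper names `vec_oplus`, `vec_otimes` and `scal` are hypothetical.

```python
def vec_oplus(u, v):
    """Vector 'addition': componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]


def vec_otimes(u, v):
    """Vector 'multiply': componentwise minimum."""
    return [min(a, b) for a, b in zip(u, v)]


def scal(lam, u):
    """'Scalar-vector multiplication': add the number lam to every component."""
    return [lam + a for a in u]


u = [1, 4, 2]
v = [3, 0, 5]
lam, mu = 2, 7

# lam o (u (+) v) == (lam o u) (+) (lam o v)
print(scal(lam, vec_oplus(u, v)) == vec_oplus(scal(lam, u), scal(lam, v)))
# lam o (u (x) v) == (lam o u) (x) (lam o v)
print(scal(lam, vec_otimes(u, v)) == vec_otimes(scal(lam, u), scal(lam, v)))
# (lam (+) mu) o u == (lam o u) (+) (mu o u)
print(scal(max(lam, mu), u) == vec_oplus(scal(lam, u), scal(mu, u)))
# (lam (x) mu) o u == (lam o u) (x) (mu o u)
print(scal(min(lam, mu), u) == vec_otimes(scal(lam, u), scal(mu, u)))
```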
Other properties of vector operations shall be discussed as the need for such discussion arises. Unless otherwise mentioned, we assume that all numbers and vector components belong to the nonnegative half of the real number axis with the attached number $+\infty$. As we have mentioned, not all real numbers need be exploited: we can do with certain subsets closed under the used operations. These subsets may be discrete. We assume also that nonnegative numbers are ordered in the usual way.

In the general case, each component of the vector belongs to some semiring, and the semirings corresponding to different components are not necessarily identical. Suppose that the set to which the $j$th component belongs comprises a "zero" element $0_j$ with respect to the $\oplus$ operation. Then there exists the "zero vector" 0, i.e. the vector whose $j$th component is $0_j$. The "unit vector" with respect to the $\otimes$ operation is defined similarly; we denote it 1.

The primary constraint imposed on the vector operations $\oplus$ and $\otimes$ is that the sets of vectors should be closed under these operations. Since they are performed componentwise, mutual independence of individual vector components can be tolerated in case no other operations are needed. [...] If "scalar-vector multiplication" is defined, then the values of the numbers by which the vectors are "multiplied" must be related to the values of individual vector components. If distributivity is required, this connection grows stronger.

Let $A$ be an associative semiring. A semiring $R$ is called a semialgebra over $A$ if the product of $R$ elements by $A$ elements is defined and this operation is distributive with respect to the "addition" in $A$. Whenever we consider the "scalar-vector multiplication" operation, we assume that the values of individual vector components belong to one and the same number set. In that case the vector semirings with operations $\oplus$ and $\otimes$ are always semialgebras over the semiring whereof the elements belong to that number set.
We conclude this section by discussing the possibility of introducing dot products and vector norms. The "vector addition" and "scalar-vector multiplication" operations imply that the $\oplus$ operation defines the "sum" of numbers and the $\circ$ operation (or, equivalently, the $+$ operation) defines the product of numbers. We define the functional $(u,v)$ to be the sum of pairwise products of the components of $u$ and $v$ in terms of the $\oplus$ and $\circ$ operations. We have

$$(u,v) = \bigoplus_j u_j \circ v_j = \max_j (u_j + v_j). \tag{8.5}$$

Check that the properties of dot product are satisfied:

$$\begin{aligned}
&(u,u) > \theta \ \text{ if } u \ne 0, \qquad (0,0) = \theta,\\
&(u,v) = (v,u),\\
&(\lambda \circ u, v) = \lambda \circ (u,v),\\
&(u \oplus w, v) = (u,v) \oplus (w,v).
\end{aligned} \tag{8.6}$$

We assume that all vector components and the number $\lambda$ are nonnegative. Suppose there exists the "zero vector" 0 with $0_j$ being the left end point of the interval to which the $j$th component belongs. The value $(0,0)$ is defined by $\theta = \max_j 2 \cdot 0_j$, whereas $(u,u)$ equals the maximum of $2u_j$, so $(u,u) \ge \theta$ always holds. The equality here can be obtained for nonzero vectors: in the general case it is sufficient that not all of the components of the "zero vector" equal one another. Consequently, the first property of dot product does not always hold. The second property easily follows from the definition (8.5). Further we have

$$\begin{aligned}
(\lambda \circ u, v) &= \max_j \big((\lambda \circ u)_j + v_j\big) = \max_j (u_j + v_j) + \lambda,\\
\lambda \circ (u,v) &= \max_j (u_j + v_j) + \lambda.
\end{aligned}$$
Finally we have

$$\begin{aligned}
(u \oplus w, v) &= \max_j \big((u \oplus w)_j + v_j\big) = \max_j \big(\max(u_j, w_j) + v_j\big)\\
&= \max_j \max(u_j + v_j,\; w_j + v_j) = \max\big(\max_j (u_j + v_j),\; \max_j (w_j + v_j)\big)\\
&= \max\big((u,v), (w,v)\big) = (u,v) \oplus (w,v).
\end{aligned}$$

Thus only the first of the properties (8.6) does not hold to full extent for the functional (8.5). The value $(u,u)$ may be $\theta$ even if $u \ne 0$. The functional (8.5) suggests that a seminorm (rather than a norm) be introduced in our set of vectors. Let

$$\|u\| = \bigoplus_j u_j = \max_j u_j. \tag{8.7}$$

The following properties should hold for a seminorm:

$$\|u\| \ge \theta, \qquad \|u \oplus v\| \le \|u\| \oplus \|v\|, \qquad \|\lambda \circ u\| = |\lambda| \circ \|u\|. \tag{8.8}$$

Here $\theta = \max_j 0_j$. The first and the last properties are self-evident. The second property can actually be replaced by a stronger one, since the equality always holds:

$$\|u \oplus v\| = \max_j (u \oplus v)_j = \max_j \max(u_j, v_j) = \max\big(\max_j u_j,\; \max_j v_j\big) = \|u\| \oplus \|v\|.$$

With respect to the operation of vector "multiply" the seminorm (8.7) possesses the following property:

$$\|u \otimes v\| \le \|u\| \otimes \|v\|. \tag{8.9}$$
Indeed,

$$\|u \otimes v\| = \max_j (u \otimes v)_j = \max_j \min(u_j, v_j) \le \min\big(\max_j u_j,\; \max_j v_j\big) = \|u\| \otimes \|v\|.$$

In practice the set of vector components and the set of numbers $\lambda$ may have a great variety of structures. The difference between these structures can show in the existence or nonexistence of identity elements that we called "zero" and "unity", and also in their exact nature. We have already noted that certain properties exist which may or may not be true depending on the nature of identity elements. Thus proper attention should be paid to the structures of the set of vector components and the set of numbers $\lambda$ while investigating the properties of dot product.

A particularly interesting case is that of all these sets being the set of nonnegative numbers. We will often consider this case on the pages to follow. While investigating the set of schedules we will make use of the newly introduced operations, as well as of the traditional vector operations. The application of both kinds of operations is accounted for by the fact that not all properties of a set of vectors can be described in terms of only one class of operations. For example, the convexity of a set is readily described in terms of the operations $+$ and $\times$ but cannot be described in terms of the operations $\oplus$, $\otimes$, and $\circ$. This is a consequence of a major difference between the $\times$ and $\circ$ operations. There exists a number $\lambda = 0$ such that $0 \cdot u = 0$ for any vector $u$. However, there is no such $\lambda$ for which $\lambda \circ u = 0$ for an arbitrary $u$. Moreover, there is no such $\lambda$ even for two different vectors. On the other hand, the operations $\oplus$, $\otimes$, $\circ$ prove to be very helpful while exploring certain sets of vectors, particularly the set of schedules.
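The functional (8.5) and the seminorm (8.7) are short enough to evaluate directly. The sketch below checks the last three properties of (8.6), the strengthened form of (8.8) and the inequality (8.9) on sample vectors. It also shows a case where $(u,u) = \theta$ for a nonzero $u$, assuming a hypothetical "zero vector" whose components differ. All helper names are illustrative.

```python
def vec_oplus(u, v):
    """Vector 'addition': componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]


def vec_otimes(u, v):
    """Vector 'multiply': componentwise minimum."""
    return [min(a, b) for a, b in zip(u, v)]


def scal(lam, u):
    """'Scalar-vector multiplication': add lam to every component."""
    return [lam + a for a in u]


def dot(u, v):
    """Functional (8.5): the maximum of the pairwise sums."""
    return max(a + b for a, b in zip(u, v))


def seminorm(u):
    """Seminorm (8.7): the largest component."""
    return max(u)


u, v, w, lam = [1, 4, 2], [3, 0, 5], [2, 2, 2], 2

print(dot(u, v) == dot(v, u))                                   # symmetry
print(dot(scal(lam, u), v) == lam + dot(u, v))                  # homogeneity
print(dot(vec_oplus(u, w), v) == max(dot(u, v), dot(w, v)))     # additivity
print(seminorm(vec_oplus(u, v)) == max(seminorm(u), seminorm(v)))
print(seminorm(vec_otimes(u, v)) <= min(seminorm(u), seminorm(v)))

# Assumed example: component intervals [0, +inf) and [3, +inf).
zero = [0, 3]
theta = dot(zero, zero)
nonzero = [1, 3]
print(theta, dot(nonzero, nonzero))   # both are 6 although nonzero != zero
```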
9. Minimax Properties of Schedules

The conditions on which a vector $t$ can be a schedule of an algorithm graph can be expressed in a number of ways. The traditional representation (7.1) was useful while investigating the properties of schedules defined by a delay vector. However, it is no longer adequate as we proceed to explore the operations $\oplus$, $\otimes$, and $\circ$ applied to vectors of the same set of schedules. Let $g_j$ denote the set of numbers of nodes which emit arcs into the $j$th node. We use the standard notation $\emptyset$ for the empty set. Obviously, $g_j = \emptyset$ for input nodes only.

STATEMENT 9.1. Given a delay vector $\omega$, the necessary and sufficient conditions for a vector $t$ to be a schedule corresponding to the delay vector $\omega$ are: if the $j$th node of the algorithm graph is pointed to by the arcs originating from nodes whose numbers are in the set $g_j$, then the following relations

$$\begin{aligned}
t_j &\ge \max_{i \in g_j} (t_i + \omega_{ij}), && \text{if } g_j \ne \emptyset,\\
t_j &= s_j, && \text{if } g_j = \emptyset
\end{aligned} \tag{9.1}$$

must hold. Here $s_j$ are arbitrary numbers.

Let $t$ be a schedule corresponding to the delay vector $\omega$. That means that the inequalities (7.1) hold. For all $j$ such that $g_j = \emptyset$ we set $s_j = t_j$. Now suppose that the $j$th node is not an input one. Consider those of the inequalities (7.1) which contain $t_j$ with the plus sign. We have

$$t_j - t_i - \omega_{ij} \ge 0 \quad \text{for all } i \in g_j. \tag{9.2}$$

The inequalities (9.2) imply that the following inequality also holds:

$$t_j \ge \max_{i \in g_j} (t_i + \omega_{ij}). \tag{9.3}$$
Therefore the relations (9.1) hold. Suppose further that the relations (9.1) hold for the components $t_j$ of some vector $t$. If the $j$th node is not an input one, then the inequality (9.3) holds. Consequently the inequalities (9.2) hold for every $i \in g_j$ or, equivalently, so do those of the inequalities (7.1) which correspond to the arcs going into the $j$th node. It follows that all the inequalities (7.1) hold. According to Statement 7.1, $t$ is a schedule corresponding to the delay vector $\omega$.

The second group of relations (9.1) concerns the input nodes only. These relations determine the moments of time when the initial data are fed into the algorithm during its execution. [...] That is why we will occasionally refer to the vector $s$ with components $s_j$ as the initial conditions vector. In general, additional constraints may be placed upon the set of valid vectors $s$. [...] The set of admissible vectors $s$ may coincide with the entire linear space of the appropriate dimension, or it can be bounded, discrete, etc. It can be defined by inequalities or in some other way.

STATEMENT 9.2. The set of schedules corresponding to a given delay vector $\omega$ is closed under each of the operations $\oplus$, $\otimes$, $\circ$, provided that the set of admissible initial conditions vectors is closed under these operations.

This statement is trivial as regards the $\circ$ operation, since it merely shifts all moments of execution by one and the same amount of time. Let $u$ and $v$ be schedules. By virtue of Statement 9.1 the relations (9.1) hold. Clearly the vectors of initial conditions for the schedules produced by the application of the $\oplus$ and $\otimes$ operations are obtained by the application of these same operations to
the vectors of initial conditions for the original schedules. If $g_j \ne \emptyset$ then we have

$$\begin{aligned}
(u \oplus v)_j = \max(u_j, v_j) &\ge \max\Big(\max_{i \in g_j}(u_i + \omega_{ij}),\; \max_{i \in g_j}(v_i + \omega_{ij})\Big)\\
&= \max_{i \in g_j} \max(u_i + \omega_{ij},\; v_i + \omega_{ij}) = \max_{i \in g_j} \big(\max(u_i, v_i) + \omega_{ij}\big)\\
&= \max_{i \in g_j} \big((u \oplus v)_i + \omega_{ij}\big),
\end{aligned}$$

$$\begin{aligned}
(u \otimes v)_j = \min(u_j, v_j) &\ge \min\Big(\max_{i \in g_j}(u_i + \omega_{ij}),\; \max_{i \in g_j}(v_i + \omega_{ij})\Big)\\
&\ge \max_{i \in g_j} \min(u_i + \omega_{ij},\; v_i + \omega_{ij}) = \max_{i \in g_j} \big(\min(u_i, v_i) + \omega_{ij}\big)\\
&= \max_{i \in g_j} \big((u \otimes v)_i + \omega_{ij}\big).
\end{aligned}$$

Thus it is quite suitable to introduce the operations $\oplus$, $\otimes$, and $\circ$ into the set of schedules. The $\circ$ operation is distributive with respect to the $\oplus$ and $\otimes$ operations; the operations $\oplus$ and $\otimes$ are distributive with respect to each other. These, as well as other properties of the operations $\oplus$, $\otimes$, and $\circ$, were considered in detail in §8.

Consider the set $R_\omega$ of all schedules corresponding to one and the same delay vector $\omega$. Partition the set $R_\omega$ into classes $R_\omega(s)$ by grouping together those and only those schedules that correspond to one and the same initial conditions vector $s$. Different classes $R_\omega(s)$ have empty intersection. Indeed, suppose there exists a schedule belonging to at least two different classes. Substituting this common schedule into the formulas (9.1) we see that the initial conditions vector is determined uniquely for both classes, i.e. the classes coincide. So we have

$$R_\omega = \bigcup_s R_\omega(s).$$
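The conditions (9.1) and the closure property of Statement 9.2 can be tried out on a small example. The sketch below assumes the graph is given as a dictionary `preds` mapping each node $j$ to its predecessors $i \in g_j$ with the delays $\omega_{ij}$; this representation, the sample graph and all function names are hypothetical.

```python
def is_schedule(t, preds):
    """Check the first group of relations (9.1) for every non-input node."""
    return all(
        t[j] >= max(t[i] + w for i, w in arcs.items())
        for j, arcs in preds.items()
        if arcs  # input nodes (empty g_j) impose no inequality
    )


def oplus(u, v):
    """Componentwise maximum of two schedules."""
    return {j: max(u[j], v[j]) for j in u}


def otimes(u, v):
    """Componentwise minimum of two schedules."""
    return {j: min(u[j], v[j]) for j in u}


def shift(lam, u):
    """The 'scalar-vector multiplication': delay every node by lam."""
    return {j: lam + u[j] for j in u}


# Nodes 0 and 1 are inputs; node 2 depends on both; node 3 depends on node 2.
PREDS = {0: {}, 1: {}, 2: {0: 1, 1: 2}, 3: {2: 1}}

u = {0: 0, 1: 1, 2: 3, 3: 6}
v = {0: 2, 1: 0, 2: 3, 3: 4}
bad = {0: 0, 1: 0, 2: 1, 3: 1}   # node 2 starts before its data can arrive

print(is_schedule(u, PREDS), is_schedule(v, PREDS), is_schedule(bad, PREDS))
print(is_schedule(oplus(u, v), PREDS), is_schedule(otimes(u, v), PREDS))
print(is_schedule(shift(5, u), PREDS))
```

The dictionary form keeps the sets $g_j$ explicit, which mirrors the way (9.1) is stated; any other adjacency structure would serve equally well.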
STATEMENT 9.3. Every class $R_\omega(s)$ is closed under the operations $\oplus$ and $\otimes$.

Certainly, let the schedules $u$ and $v$ be in $R_\omega(s)$. Since they correspond to one and the same initial conditions vector, the operations $\oplus$ and $\otimes$ would yield a schedule having that same initial conditions vector, as $s \oplus s = s$, $s \otimes s = s$. Consequently, the schedules $u \oplus v$ and $u \otimes v$ are in $R_\omega(s)$.

Now let us introduce the operations $\oplus$, $\otimes$, and $\circ$ over classes. For example, we assume that the notation $R_\omega(s) \oplus R_\omega(p)$ defines a set of schedules from $R_\omega$ consisting of all the "sums" of schedules from $R_\omega(s)$ with schedules from $R_\omega(p)$. The notations $R_\omega(s) \otimes R_\omega(p)$ and $\lambda \circ R_\omega(s)$ are to be understood in the same way. Formally, Statement 9.2 ensures that the result of application of any of the mentioned operations to any classes belongs to $R_\omega$. However, while proving Statement 9.2 we have actually established the following relations:

$$\begin{aligned}
R_\omega(s) \oplus R_\omega(p) &\subseteq R_\omega(s \oplus p),\\
R_\omega(s) \otimes R_\omega(p) &\subseteq R_\omega(s \otimes p),\\
\lambda \circ R_\omega(s) &= R_\omega(\lambda \circ s).
\end{aligned} \tag{9.4}$$

The last property of the $\circ$ operation allows us to assume, when studying any class $R_\omega(s)$, that one of the components of the initial conditions vector is fixed. For example, we can assume that $\min_j s_j = 0$. Taking into account the nonnegativeness of the components of the delay vector, we find that the formulas (9.1) imply $\min_j t_j = 0$ for all schedules in $R_\omega(s)$ in that case.

We have already mentioned that we assume numbers to be ordered in the usual way while we examine the properties of schedules. In that case, the usual concepts of limit, closed sets, bounded sets, etc. are applicable to the sets of schedules. The upper and lower bounds for schedules are to be treated as the respective bounds for their components.

STATEMENT 9.4. The class $R_\omega(s)$ is bounded from below and closed for all valid vectors $s$ and $\omega$.
The components of the delay vector are nonnegative, so the formulas (9.1) imply that all components of any schedule in $R_\omega(s)$ are not less than the minimal component of the initial conditions vector $s$. This proves the class $R_\omega(s)$ to be bounded from below. Now let $z$ be an accumulation point for $R_\omega(s)$. There exists a sequence of schedules $z^k$ in $R_\omega(s)$ such that $z^k \to z$ as $k \to \infty$. Since $z^k_j = s_j$ if $g_j = \emptyset$, we obtain $z_j = s_j$ for the case $g_j = \emptyset$. For the schedules $z^k$ the inequalities (7.1) hold. By evaluating limits we have

$$z_j \ge z_i + \omega_{ij}$$

for all $i \in g_j$, $g_j \ne \emptyset$. Consequently the vector $z$ is a schedule. Since the initial conditions vector for $z$ is the vector $s$, $z$ belongs to the class $R_\omega(s)$, i.e. the class $R_\omega(s)$ is closed.

STATEMENT 9.5. The class $R_\omega(s)$ contains the schedule $0(s)$ such that

$$0(s) \oplus u = u, \qquad 0(s) \otimes u = 0(s) \tag{9.5}$$

for any $u \in R_\omega(s)$.

Take any number $j$. The class $R_\omega(s)$ is closed and bounded from below. It follows that a schedule $u^j \in R_\omega(s)$ exists such that the greatest lower bound of values of the $j$th components of all schedules $u \in R_\omega(s)$ is reached on it. Consider the schedule

$$0(s) = \bigotimes_j u^j, \tag{9.6}$$

where the "product" is taken over all $j$. Clearly, $0(s) \in R_\omega(s)$. Furthermore, the nature of the $\otimes$ operation implies that each component of $0(s)$ is the exact lower bound of values of that component over all schedules in $R_\omega(s)$. As a consequence of that, we obtain

$$\begin{aligned}
(0(s) \oplus u)_j &= \max(0_j(s), u_j) = u_j,\\
(0(s) \otimes u)_j &= \min(0_j(s), u_j) = 0_j(s)
\end{aligned}$$

for any $u \in R_\omega(s)$, i.e. the relations (9.5) hold good for the schedule $0(s)$ defined by (9.6).
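For a concrete graph the schedule $0(s)$ can be computed rather than merely shown to exist. The sketch below assumes an acyclic graph and assumes that the componentwise lower bounds are attained when the inequalities in (9.1) are taken as equalities; it propagates start times in topological order using the standard-library `graphlib` module. The `preds` layout and the names `zero_schedule`, `PREDS` and `S` are hypothetical.

```python
from graphlib import TopologicalSorter


def zero_schedule(s, preds):
    """Earliest schedule for the initial conditions s.

    preds[j] maps each predecessor i of node j to the delay on arc (i, j);
    input nodes have an empty mapping.
    """
    order = TopologicalSorter({j: set(p) for j, p in preds.items()}).static_order()
    t = {}
    for j in order:
        if preds[j]:
            # Take the inequality of (9.1) as an equality.
            t[j] = max(t[i] + w for i, w in preds[j].items())
        else:
            # Input node: the start time is prescribed by s.
            t[j] = s[j]
    return t


PREDS = {0: {}, 1: {}, 2: {0: 1, 1: 2}, 3: {2: 1}}
S = {0: 0, 1: 1}
zero = zero_schedule(S, PREDS)

# Another schedule of the same class: same inputs, later inner nodes.
u = {0: 0, 1: 1, 2: 5, 3: 7}
u_oplus_zero = {j: max(zero[j], u[j]) for j in u}
u_otimes_zero = {j: min(zero[j], u[j]) for j in u}

print(zero)
print(u_oplus_zero == u, u_otimes_zero == zero)   # the relations (9.5)
```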
(9.6). COROLLARY. Every the
zero
element
Indeed, These
respect
obey
the
is
to
c l a s s ft is)
the
operations
laws, y e t
R^ls)
class
with
i s closed
fact
that
the
the
schedule
minimum
(9.1)
are
under
commutative,
values
o f components
"
is)
vector by tor.
that
However,
still
a r e cases
com-
to
ob¬
for a given
s
inequalities i n
* 0, •
(9.7) = 0.
i s not
closed
can when
alter the
the
This
under the
i s accounted f o r
initial
result
usual
c o n d i t i o n s vec-
o f vector
operations
belongs t o the c l a s s ft [«).
tional
vector
Let
i n ft i s ) . "
Every
addition
class
is)
is
convex
with
respect
to
conven-
multiplication.
v b e l o n g t o t h e c l a s s ft i s ) . A c c o r d i n g t o S t a tu A, 0 £ A £ 1 , t h e v e c t o r \u + ( l - A ) v i s a s c h e d u l e
I f g . = a then J
J that not
corresponds v.
ts
and scalar-vector
{ A u + ( 1 - A ) v } , = Au
see
ft
s c h e d u l e s u and
teraent 7.4 f o r any
and
) If g
scalar-vector multiplication.
STATEMENT 9 . 6 .
Me
formulas
that
j
these operations
there
testify.
i f the
•*
case t h e c l a s s R ^ l s )
a d d i t i o n and
the f a c t
imply
the
we h a v e
+ ia
lea e g
(9.5)
Is
t h e minimums o f t h e
reached
O j ( s ) - s . i f gj
the general
e and » .
and d i s t r i b u t i v e
us t o b u i l d
(9.1)
Therefore
= max(0 J
are
allows
are
by e q u a l i t i e s .
0 .is)
In
possessing
operations
as the r e l a t i o n s
0 ( s ) . The r e l a t i o n s
replaced
the
associative,
c o m p o n e n t s o f Ois)
the
p o n e n t s o f a l l s c h e d u l e s f r o m ft is) tain
semiring e and e .
h a v e i n v e r s e o p e r a t i o n s . T h e s c h e d u l e Ois)
they do not
"zero element" o f t h a t semiring, The
a commutative
the operations
only
t o the
Therefore
Mi-Xi\r, = J J
i s the v e c t o r same
initial
As + ( 1 - A ) s . = s
J
Au. + ( l - A ) v a s c h e d u l e ,
conditions
f o r a n y A, OSASl,
J J
the
vector
vector
as the
Au + ( 1 - A ! v
but
also i t
schedules u indeed be-
86 longs
to R i s ) . w
According to Statement 7.4 the conventional sum of schedules from all the classes $R_\omega(s)$ always belongs to $R_\omega$. Yet it is only the class $R_\omega(0)$ that is closed under conventional vector addition, as the equality $s + s = s$ holds for $s = 0$ only. For $\lambda \ge 1$ the product of a schedule from $R_\omega$ and $\lambda$ always belongs to $R_\omega$. Again, only $R_\omega(0)$ is closed under that operation, as for $\lambda > 1$ the equality $\lambda s = s$ holds good for $s = 0$ only. We see that the class $R_\omega(0)$ possesses certain exceptional properties as regards the conventional vector operations.

The ultimate reason for these special properties of the class $R_\omega(0)$ is the unique position of the zero vector in the set of vectors with conventional vector operations. However, that zero vector ceases to be exceptional in the set of vectors with operations $\oplus$ and $\otimes$. Its place is taken over by another "zero" vector. Accordingly, the special position is now occupied by the corresponding class $R_\omega(s^0)$.
STATEMENT 9.7. Let the set of initial conditions vectors comprise the vector $s^0$ that is the "zero" vector with respect to the $\oplus$ operation. Then any schedule from $R_\omega(s)$ can be represented as a "sum" (with respect to the $\oplus$ operation) of the schedule $O(s)$ and some schedule from the class $R_\omega(s^0)$. Furthermore, the "sum" of $O(s)$ and any schedule from $R_\omega(s^0)$ yields a schedule in $R_\omega(s)$.
Consider some class $R_\omega(s)$ and let $t$ be some schedule in $R_\omega(s)$. The relations (9.1) hold for $t$ and (9.7) for $O(s)$. We define the vector $u$ by the equalities

$$u_j = t_j \ \text{ if } g_j \ne \emptyset, \qquad u_j = s_j^0 \ \text{ if } g_j = \emptyset.$$

By definition of $s^0$ the equalities $s_j \oplus s_j^0 = s_j$ hold if $g_j = \emptyset$. By definition of $O(s)$ the equalities $O_j(s) \oplus t_j = t_j$ hold if $g_j \ne \emptyset$. Therefore $O(s) \oplus u = t$. It remains to verify that $u \in R_\omega(s^0)$. Since in the case of $g_j = \emptyset$ the components of $u$ exactly match those of the vector $s^0$, the only thing that is not yet proven is that the vector $u$ is a schedule in $R_\omega$. The proof of that can readily be obtained by collating the first group of relations of (9.1) for the vectors $t$ and $u$. The only difference is that for $g_i = \emptyset$ the quantity $t_i$, which is equal to $s_i$, is replaced for the vector $u$ by a not greater quantity $s_i^0$ in the relations. This replacement obviously preserves the inequality. Consequently, $u \in R_\omega(s^0)$.

Now let $v$ be an arbitrary schedule in $R_\omega(s^0)$. Consider the "sum" $O(s) \oplus v$. Since $O(s)$ and $v$ both belong to $R_\omega$, the vector $O(s) \oplus v$ is a schedule in $R_\omega$ as well. If $g_j = \emptyset$ then we have

$$(O(s) \oplus v)_j = \max(O_j(s), v_j) = \max(s_j, s_j^0) = s_j,$$

i.e. $O(s) \oplus v$ belongs to the class $R_\omega(s)$.

"Adding" (using the $\oplus$ operation) a given vector to a set of vectors can be regarded as a sort of "parallel translation" of that set by the given vector. In these terms the message of Statement 9.7 is that any class $R_\omega(s)$ can be generated from the class $R_\omega(s^0)$ by the "parallel translation" by the vector $O(s)$ using the $\oplus$ operation. We introduce the notation $u \oplus R_\omega(p)$ to designate the "parallel translation" of the class $R_\omega(p)$ by the vector $u$. Then we have

$$R_\omega(s) = O(s) \oplus R_\omega(s^0). \qquad (9.8)$$

This yields the following representation of the set of schedules $R_\omega$:

$$R_\omega = \bigcup_s O(s) \oplus R_\omega(s^0).$$
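The decomposition in Statement 9.7 can be tried out on a small graph. The sketch below is only an illustration under stated assumptions: the four-node graph, the encoding of predecessor sets $g_j$ as lists of `(i, w)` pairs with a delay `w` on each arc, and the function names `optimal_schedule`, `in_class` and `oplus` are all hypothetical and not taken from the text. It computes $O(s)$ from (9.7), builds the vector $u$ as in the proof, and forms the "sum" $O(s) \oplus u$.

```python
def optimal_schedule(preds, s):
    """Return O(s) computed from (9.7).

    preds[j] is a list of (i, w) pairs: predecessor node i and the delay w
    on the arc i -> j (hypothetical encoding). Input nodes have no
    predecessors and take their value from s. Nodes are assumed to be
    numbered so that every predecessor has a smaller index.
    """
    t = []
    for j, incoming in enumerate(preds):
        if incoming:
            t.append(max(t[i] + w for i, w in incoming))
        else:
            t.append(s[j])
    return t


def in_class(preds, s, t):
    """Check relations (9.1): t is a schedule with initial conditions s."""
    for j, incoming in enumerate(preds):
        if incoming:
            if any(t[j] < t[i] + w for i, w in incoming):
                return False
        elif t[j] != s[j]:
            return False
    return True


def oplus(u, v):
    """The "sum" operation: componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]


# Toy graph: nodes 0 and 1 are inputs, node 2 depends on both, node 3 on 2.
preds = [[], [], [(0, 1), (1, 2)], [(2, 1)]]
s = {0: 3, 1: 1}    # initial conditions vector
s0 = {0: 0, 1: 0}   # "zero" vector, s0 <= s componentwise

O_s = optimal_schedule(preds, s)   # [3, 1, 4, 5]
t = [3, 1, 6, 9]                   # some schedule in the class of s
# u copies t on non-input nodes and s0 on input nodes, as in the proof.
u = [t[j] if preds[j] else s0[j] for j in range(len(preds))]
```

With these values `u` lies in the class of `s0` and `oplus(O_s, u)` reproduces `t`; translating any other schedule of that class by `O_s` lands in the class of `s`.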
Summing it all up, we state that exploring the set of schedules $R_\omega$ can be reduced to the study of the set of "zero" schedules $O(s)$ and of the class $R_\omega(s^0)$. Notice the weakness of the constraints imposed on the vector $s^0$. The only thing required is that it should be the "zero vector" with respect to the $\oplus$ operation in some set comprising the vector $s$ that defines our class $R_\omega(s)$. Given a vector $s$, take any vector $s^0$ satisfying the inequality $s^0 \le s$ (the inequality is to be taken componentwise). Then the set consisting of the two vectors $s$ and $s^0$ will be closed under the $\oplus$ operation, and $s^0$ will be the "zero vector" in that set. In other words, every class $R_\omega(s)$ can be generated by the "parallel translation" by the vector $O(s)$ using the $\oplus$ operation of any other class $R_\omega(s^0)$, provided the inequality $s^0 \le s$ holds.

In case the set of nonnegative initial conditions vectors does not contain the zero vector, it can always be added to that set. The inequality $0 \le s$ holds good for any nonnegative initial conditions vector $s$. Consequently, any class $R_\omega(s)$ with nonnegative initial conditions vector can be generated from the class $R_\omega(0)$ by a "parallel translation" by the schedule $O(s)$. We have already noted that the class $R_\omega(0)$ is somewhat special. Now the representation of the set of schedules $R_\omega$ has the form

$$R_\omega = \bigcup_s O(s) \oplus R_\omega(0).$$

We have mentioned earlier in this section that while examining a particular class $R_\omega(s)$ we can assume without loss in generality that all components of $s$ are nonnegative and $\min_j s_j = 0$. This assumption is often helpful for a number of reasons. Therefore it sounds like a good idea to shift each of the valid initial conditions vectors in such a way as to satisfy this condition, and to restrict ourselves to the consideration of the corresponding subset of $R_\omega$. This technique can be made use of. However, it should be kept in mind that the new set of initial conditions vectors may prove not to be closed under operations $\oplus$ and $\otimes$ even if the original set was closed. For example, the set of vectors $(1,0)$, $(1,2)$ is closed under operations $\oplus$ and $\otimes$, yet the shifted set $(1,0)$, $(0,1)$ is no longer closed. Of course, we can always expand the new set of initial conditions vectors to make it closed, if only such expansion is not undesirable for some other reasons.

Schedules describe algorithm execution processes as they unfold in time. For a number of reasons the admissible values of time may sometimes be not arbitrary. Time can be either continuous or discrete, and there can be a number of ways to get rid of discreteness. The processes may well be either synchronous or asynchronous. Generally speaking, the values of components of schedules, delay vectors, and initial conditions vectors must belong to the set of admissible values of time moments. That is why it is necessary for this set to be closed under the operations $\oplus$, $\otimes$, $\circ$, and $\times$. As we have stated earlier, this condition is certainly true if time is real, or nonnegative, or integer, or rational, or proportional to a given number, etc. This circumstance is of momentous importance for most practical problems.
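The closure example with the vectors $(1,0)$ and $(1,2)$ is small enough to check by direct enumeration. The following sketch does so; the helper names `vmax`, `vmin`, `is_closed` and `shift_to_zero_min` are illustrative choices, not notation from the text, and the shift is assumed to subtract the minimal component from every component.

```python
from itertools import product


def vmax(u, v):
    """Componentwise maximum of two vectors (the "sum" operation)."""
    return tuple(max(a, b) for a, b in zip(u, v))


def vmin(u, v):
    """Componentwise minimum of two vectors (the "product" operation)."""
    return tuple(min(a, b) for a, b in zip(u, v))


def is_closed(vectors, op):
    """True if op applied to any two vectors of the set stays in the set."""
    vs = set(vectors)
    return all(op(u, v) in vs for u, v in product(vs, repeat=2))


def shift_to_zero_min(v):
    """Shift all components by one number so that the minimum becomes 0."""
    m = min(v)
    return tuple(x - m for x in v)


original = [(1, 0), (1, 2)]
shifted = [shift_to_zero_min(v) for v in original]   # [(1, 0), (0, 1)]
```

Here `is_closed(original, vmax)` and `is_closed(original, vmin)` both hold, while the shifted set fails both checks: the maximum of $(1,0)$ and $(0,1)$ is $(1,1)$ and the minimum is $(0,0)$, and neither belongs to the shifted set.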
10. Optimal and High-Speed Schedules

Run time is one of the major characteristics of an algorithm implementation. If the implementation is described by a schedule $t$ from $R_\omega$ then the time required to execute the algorithm is given by

$$T(t) = \max_i t_i - \min_i t_i. \qquad (10.1)$$

Of course, this formula may require some adjustment for particular cases. For example, if the $j$th functional unit takes time $h_j$ to accomplish its job then (10.1) may be modified as follows:

$$T(t) = \max_i t_i - \min_i (t_i - h_i).$$

On the other hand, this formula allows for the delays on input devices. Obviously, both formulas yield close values of time, provided that the input devices do not take too long to complete their job. Since no constraints on the delays on functional units are imposed within the set $R_\omega$, we will make use of the formula (10.1).

Note that the time for algorithm execution (10.1) is continuous in the set of schedules and does not change if all components of a schedule are shifted by one and the same number, i.e. $T(\lambda \circ t) = T(t)$ for all $\lambda$. Therefore we can assume $\min_i t_i = 0$ whenever we need it. As we have mentioned earlier, we can arrange for this condition to be satisfied for the schedules from any class $R_\omega(s)$. In that case the time functional (10.1) formally coincides with the seminorm (8.6). The seminorm provides an upper bound for the time functional for nonnegative schedules.

Let $s$ be an initial conditions vector. We pose the problem to find
a schedule in $R_\omega(s)$ which minimizes algorithm execution time. As the delay vector $\omega$ is nonnegative, the minimum of $t_i$ is reached for such $i$ that $g_i = \emptyset$, and the maximum of $t_i$ is reached for such $i$ that $g_i \ne \emptyset$. Now notice that the minimal component is the same for all schedules in $R_\omega(s)$, since it is equal to the minimal component of the initial conditions vector $s$. Therefore for a given $s$ the minimum algorithm execution time is reached on those schedules from $R_\omega(s)$ that minimize the maximum component. The "zero" schedule $O(s)$ certainly satisfies this condition, since all its components are as small as possible. We obtain the following

STATEMENT 10.1. Given an initial conditions vector $s$, the schedule $O(s)$ achieves the minimum algorithm execution time in $R_\omega(s)$.

The schedule $O(s)$ not only ensures that the algorithm as a whole is executed in the smallest time, but also that each individual operation is executed as soon as possible. This is another consequence of the fact that all components of schedules reach their minimums on $O(s)$. For that reason we will refer to schedules $O(s)$ as optimal schedules. The representation (9.7) makes it possible for us to explore common properties of the entire set of optimal schedules.

STATEMENT 10.2. Provided that the set of initial conditions vectors is closed under the $\oplus$ and $\circ$ operations, the set of optimal schedules is also closed under these operations, and the following equalities hold:

$$O(s) \oplus O(p) = O(s \oplus p), \qquad \lambda \circ O(s) = O(\lambda \circ s). \qquad (10.2)$$
The $\circ$ operation shifts all components of a schedule by one and the same number. Notice that the relations (9.7) are invariant to such shifts. Consider two schedules $O(s)$ and $O(p)$. In case $g_j \ne \emptyset$ we have

$$(O(s) \oplus O(p))_j = \max(O_j(s), O_j(p)) = \max\bigl(\max_{i \in g_j}(O_i(s) + \omega_{ij}),\ \max_{i \in g_j}(O_i(p) + \omega_{ij})\bigr) =$$
$$= \max_{i \in g_j}\bigl(\max(O_i(s) + \omega_{ij},\ O_i(p) + \omega_{ij})\bigr) = \max_{i \in g_j}\bigl(\max(O_i(s), O_i(p)) + \omega_{ij}\bigr) =$$
$$= \max_{i \in g_j}\bigl((O(s) \oplus O(p))_i + \omega_{ij}\bigr),$$

i.e. the first group of relations (9.7) holds. If $g_j = \emptyset$ then the second group of relations (9.7) holds good by force of definition of operation $\oplus$.

COROLLARY. The set of optimal schedules is isomorphic to the set of initial conditions vectors with respect to the pair of operations $\oplus$ and $\circ$.

The $\otimes$ operation of two optimal schedules stays within $R_\omega$, yet it may go beyond the set of optimal schedules. Indeed, suppose that all but one components of nonnegative initial conditions vectors $s$ and $p$ are zero, and that the non-zero components correspond to different input nodes of the algorithm graph. Then the vector $s \otimes p$ is zero. If both input nodes are linked by paths to one and the same node, then the values of corresponding components of the schedule $O(s) \otimes O(p)$ may exceed those of the optimal schedule corresponding to the zero initial conditions vector. Note also that the conventional addition and scalar-vector multiplication do not go beyond $R_\omega$ (provided the scalar factor is greater than 1), yet they do not stay within the bounds of the set of optimal schedules. The set of optimal schedules is not convex with respect to these operations.

STATEMENT 10.3. Suppose the set of initial conditions vectors contains the vector $s^0$ which is "zero vector" with respect to the $\oplus$ operation. Then the set of optimal schedules also contains the "zero schedule" that is equal to $O(s^0)$.

The formulas (9.7) imply that all components of an optimal schedule do not decrease as long as none of the components $s_j$ decreases. By definition of "zero vector" with respect to the $\oplus$ operation any component of any initial conditions vector $s$ is not less than the corresponding component of the vector $s^0$. It follows that each component of the optimal schedule $O(s)$ is not less than the corresponding component of the optimal schedule $O(s^0)$. This proves that $O(s^0)$ is the "zero schedule" with respect to the $\oplus$ operation in the set of all optimal
schedules.

Let $P$ be an Abelian semigroup and $R$ be a commutative semiring. Suppose that the operation of multiplication by semiring elements is introduced for the semigroup elements. Suppose further that this operation is distributive. Then the semigroup $P$ is called a semimodule over the semiring $R$. Summarizing our results on the properties of optimal schedules, we obtain

STATEMENT 10.4. If the set of initial conditions vectors is a semimodule with respect to the pair of operations $\oplus$, $\circ$, then the set of optimal schedules is also a semimodule with respect to that same pair of operations.

The relation (9.7) implies that the schedule $O(s)$ can be regarded as the image of the initial conditions vector $s$ for some mapping $A_\omega$, i.e. $O(s) = A_\omega(s)$. By virtue of (10.2) the operator $A_\omega$ is linear with respect to operations $\circ$, $\oplus$, since the equality

$$A_\omega(\alpha \circ u \oplus \beta \circ v) = \alpha \circ A_\omega u \oplus \beta \circ A_\omega v \qquad (10.3)$$

holds. Moreover, Statement 10.3 ensures that the operator $A_\omega$ maps the "zero vector" from its domain onto the "zero vector" in its range. The fact that no components of an optimal schedule decrease as any component $s_j$ changes without decreasing means that the operator $A_\omega$ is nondecreasing, that is, if $s \le p$ then $A_\omega s \le A_\omega p$. As usual, the vector inequality should be regarded componentwise. Obviously, the operator $A_\omega$ is continuous and bounded from below. Statements 10.2-10.4 imply that this operator maps a semigroup onto a semigroup and a semimodule onto a semimodule. The operator $A_\omega$ possesses the inverse operator, since it maps different initial conditions vectors onto different optimal schedules. The operator $A_\omega$ can be represented as the direct sum of the identity operator $E$ corresponding to the second group of relations of (9.7) and of the operator $A'_\omega$ corresponding to the first group of (9.7). The operator $A'_\omega$ possesses no inverse, so it is only because the identity operator $E$ is reversible that the operator $A_\omega$ possesses the inverse.
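The properties of the operator $A_\omega$ listed above lend themselves to a quick numerical check. The sketch below is a minimal illustration under assumptions of its own: the four-node graph, the encoding of delays on arcs, and the names `A`, `shift`, `oplus` and `otimes` are hypothetical. Initial conditions are given as a list over the input nodes, which are assumed to be numbered first.

```python
def A(preds, s):
    """Optimal schedule O(s) computed from (9.7).

    preds[j] lists (i, w) pairs, predecessor i and delay w on the arc
    i -> j (hypothetical encoding); input nodes come first and read s[j].
    """
    t = []
    for j, incoming in enumerate(preds):
        t.append(max(t[i] + w for i, w in incoming) if incoming else s[j])
    return t


def shift(lam, vec):
    """The scalar operation: add lam to every component."""
    return [lam + x for x in vec]


def oplus(u, v):
    """Componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]


def otimes(u, v):
    """Componentwise minimum."""
    return [min(a, b) for a, b in zip(u, v)]


preds = [[], [], [(0, 1), (1, 2)], [(2, 1)]]   # nodes 0, 1 are inputs
u, v = [3, 1], [0, 4]
alpha, beta = 2, -1

# Both sides of (10.3) for this particular choice of u, v, alpha, beta.
lhs = A(preds, oplus(shift(alpha, u), shift(beta, v)))
rhs = oplus(shift(alpha, A(preds, u)), shift(beta, A(preds, v)))

# The minimum operation does not commute with the operator in general.
s, p = [3, 0], [0, 3]
min_of_optimal = otimes(A(preds, s), A(preds, p))   # [0, 0, 4, 5]
optimal_of_min = A(preds, otimes(s, p))             # [0, 0, 2, 3]
```

For these values both sides of (10.3) equal `[5, 3, 6, 7]`, whereas `min_of_optimal` exceeds `optimal_of_min` in the last two components, which matches the earlier remark that the $\otimes$ operation may leave the set of optimal schedules.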
The new information on the properties of optimal schedules allows us to refine the formulas (9.4).

STATEMENT 10.5. The following equalities hold:

$$R_\omega(s) \oplus R_\omega(p) = R_\omega(s \oplus p), \qquad \lambda \circ R_\omega(s) = R_\omega(\lambda \circ s). \qquad (10.4)$$

We only have to show that any vector from $R_\omega(s \oplus p)$ can be represented as a "sum" of vectors from $R_\omega(s)$ and $R_\omega(p)$. Let $z \in R_\omega(s \oplus p)$. By virtue of representation (9.8) we have $z = O(s \oplus p) \oplus u$, where $u \in R_\omega(s^0)$ and $s^0$ is any "zero vector" for the vectors $s$, $p$, $s \oplus p$. Taking into account (10.2), the equality $u \oplus u = u$ and the commutativity and associativity of the $\oplus$ operation, we conclude that

$$z = O(s \oplus p) \oplus u = (O(s) \oplus O(p)) \oplus (u \oplus u) = (O(s) \oplus u) \oplus (O(p) \oplus u).$$

In accordance with the representation (9.8), $O(s) \oplus u \in R_\omega(s)$ and $O(p) \oplus u \in R_\omega(p)$.

COROLLARY. The set of classes $R_\omega(s)$ is isomorphic to
the set of initial conditions vectors with respect to the pair of operations $\oplus$ and $\circ$.

Now we are able to describe the macrostructure of the set $R_\omega$ of schedules corresponding to one and the same delay vector $\omega$ and some set of initial conditions vectors. It is the set of classes $R_\omega(s)$. Provided the set of initial conditions vectors comprises the "zero" element $s^0$, each class $R_\omega(s)$ can be represented as the "sum" (9.8). The operations $\circ$ and $\oplus$ may be introduced in the set of classes using formulas (10.4). If the set of initial conditions vectors is a semigroup with respect to operation $\oplus$ or a semimodule with respect to the pair of operations $\oplus$, $\circ$, then the set $R_\omega$ is a semigroup or a semimodule with respect to these same operations, respectively, as it is the union of classes $R_\omega(s)$. The existence of the "zero" initial condition $s^0$ implies the existence of the "zero" class: it is the class $R_\omega(s^0)$.

The microstructure of the set $R_\omega$ is described via its macrostructure and the microstructure of each of the classes $R_\omega(s)$. Every class $R_\omega(s)$ is a commutative semiring with respect to operations $\oplus$ and $\otimes$ with the "zero" schedule $O(s)$. According to the representation (9.8) it can be generated by a "parallel translation" in terms of the $\oplus$ operation of a fixed class $R_\omega(s^0)$ by its "zero" schedule $O(s)$. The set of "zero" schedules $O(s)$ is isomorphic to the set of classes $R_\omega(s)$ that contain these schedules with respect to operations $\oplus$ and $\circ$. Metrically every class is bounded from below and closed. Besides that, it is convex with respect to the conventional vector addition and scalar-vector multiplication.

We proceed with our investigation of individual classes. Given the initial conditions vector, the optimal schedule describes the fastest algorithm implementation. Besides that, the class may contain other schedules that yield the same execution time as the optimal schedule. However, the algorithm execution time will change with every modification of the initial conditions vector, i.e. with every change of moments of initial data input. We will refer to schedules for which the time functional $T(t)$ achieves its minimum as high-speed schedules in the class $R_\omega(s)$, or in the entire set of schedules $R_\omega$, or in some other
set.

STATEMENT 10.6. Given any schedules $u$, $v$ in $R_\omega$, the inequalities

$$T(u \oplus v) \le T(u) \oplus T(v), \qquad T(u \otimes v) \le T(u) \oplus T(v) \qquad (10.5)$$

always hold.

Consider the system of obvious inequalities

$$\min_i u_i \le u_i \le \max_i u_i, \qquad \min_i v_i \le v_i \le \max_i v_i.$$

As both max and min are non-decreasing functions, we have

$$\max(u_i, v_i) \le \max(\max_i u_i, \max_i v_i), \qquad \max(u_i, v_i) \ge \max(\min_i u_i, \min_i v_i),$$
$$\min(u_i, v_i) \le \min(\max_i u_i, \max_i v_i), \qquad \min(u_i, v_i) \ge \min(\min_i u_i, \min_i v_i).$$

Taking into account (10.1) we find that

$$\max(\max_i u_i, \max_i v_i) = \max(T(u) + \min_i u_i,\ T(v) + \min_i v_i) \le$$
$$\le \max(T(u) \oplus T(v) + \min_i u_i,\ T(u) \oplus T(v) + \min_i v_i) = T(u) \oplus T(v) + \max(\min_i u_i, \min_i v_i).$$

Now we obtain the first of the inequalities (10.5):

$$T(u \oplus v) = \max_i(\max(u_i, v_i)) - \min_i(\max(u_i, v_i)) \le \max(\max_i u_i, \max_i v_i) - \max(\min_i u_i, \min_i v_i) \le T(u) \oplus T(v).$$

Similarly we have

$$\min(\max_i u_i, \max_i v_i) = \min(T(u) + \min_i u_i,\ T(v) + \min_i v_i) \le$$
$$\le \min(T(u) \oplus T(v) + \min_i u_i,\ T(u) \oplus T(v) + \min_i v_i) = T(u) \oplus T(v) + \min(\min_i u_i, \min_i v_i).$$

Now we obtain the second of the inequalities (10.5):

$$T(u \otimes v) = \max_i(\min(u_i, v_i)) - \min_i(\min(u_i, v_i)) \le \min(\max_i u_i, \max_i v_i) - \min(\min_i u_i, \min_i v_i) \le T(u) \oplus T(v).$$
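The inequalities (10.5) can also be spot-checked numerically. The following sketch is not part of the proof; it merely samples random integer vectors and compares both sides. The function names and the sampling ranges are arbitrary choices made for illustration, and the scalar "sum" $T(u) \oplus T(v)$ is taken to be the ordinary maximum of the two numbers.

```python
import random


def T(t):
    """Time functional (10.1): spread between largest and smallest component."""
    return max(t) - min(t)


def oplus(u, v):
    """Componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]


def otimes(u, v):
    """Componentwise minimum."""
    return [min(a, b) for a, b in zip(u, v)]


def shift(lam, t):
    """Add lam to every component."""
    return [lam + x for x in t]


def check_bounds(trials=1000, seed=0):
    """Return True if (10.5) and shift invariance hold on all samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 6)
        u = [rng.randint(-10, 10) for _ in range(n)]
        v = [rng.randint(-10, 10) for _ in range(n)]
        bound = max(T(u), T(v))
        if T(oplus(u, v)) > bound or T(otimes(u, v)) > bound:
            return False
        if T(shift(rng.randint(-5, 5), u)) != T(u):
            return False
    return True
```

A call such as `check_bounds()` exercises both inequalities together with the invariance $T(\lambda \circ t) = T(t)$. The bound need not be tight: for `u = [0, 5]` and `v = [5, 0]` both combined vectors are constant, so their execution time is 0 while the bound is 5.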
The time functional is nonnegative for all schedules. By Statement 10.6 there is an upper bound for it that is additive in the $\oplus$ operation. It resembles an additive seminorm in that aspect, yet due to the absence of uniformity with respect to the $\circ$ operation it is not actually a seminorm. We have already noted that the time functional does not depend on the $\circ$ operation.

STATEMENT 10.7. For any set of schedules $\Omega$ that is closed under each of the operations $\oplus$, $\otimes$, and $\circ$, the set of minimums of the time functional in $\Omega$ is closed under these same operations.

Let $T_\Omega$ denote the minimum value of the time functional in $\Omega$. Let schedules $u$ and $v$ be minimums, i.e. $T(u) = T(v) = T_\Omega$. Essentially we have to prove that $T(u \oplus v) = T(u \otimes v) = T_\Omega$. According to the inequalities (10.5) we have

$$T(u \oplus v) \le T(u) \oplus T(v) = \max(T_\Omega, T_\Omega) = T_\Omega,$$
$$T(u \otimes v) \le T(u) \oplus T(v) = \max(T_\Omega, T_\Omega) = T_\Omega.$$

The schedules $u \oplus v$ and $u \otimes v$ are in $\Omega$, as we have assumed the set $\Omega$ to be closed under operations $\oplus$ and $\otimes$. Neither $T(u \oplus v)$ nor $T(u \otimes v)$ can be less
than $T_\Omega$, since $T_\Omega$ is the minimum value of the time functional in $\Omega$. It follows that the equalities $T(u \oplus v) = T(u \otimes v) = T_\Omega$ must hold.

If the set $\Omega$ is closed under one of the operations $\oplus$ or $\otimes$, then the set of high-speed schedules in $\Omega$ is a semigroup. It is a semiring (a semimodule, a semialgebra) if $\Omega$ is closed under operations $\oplus$ and $\otimes$ ($\oplus$, $\circ$ or $\oplus$, $\otimes$ and $\circ$). By applying different terms to sets of vectors we make it clear which operations are currently considered.

STATEMENT 10.8. If the semiring of the minimums of a functional is metrically closed and bounded from below (from above) then it contains the "zero element" (the "unit element") with respect to operation $\oplus$ (with respect to operation $\otimes$).

Let the semiring be metrically closed and bounded from below (from above). Repeating with hardly any changes the proofs of Statements 9.4 and 9.5, we establish the existence of the schedule on which the greatest lower bounds (the least upper bounds) of all components are reached. It is this schedule that is "zero" ("unit") with respect to operation $\oplus$ (or $\otimes$).

COROLLARY. If the semiring of the minimums of the time functional is bounded from below (from above) then the "product" of any schedule from that semiring and the "zero" schedule (the "sum" of any schedule from that semiring and the "unit" schedule) is the "zero" (the "unit") schedule of that same semiring.

STATEMENT 10.9. If the set of schedules $\Omega$ is convex with respect to operations $+$ and $\cdot$, then the algorithm execution time is a convex functional in $\Omega$.

Let $u$, $v$ be arbitrary schedules in $\Omega$. For any $\lambda$, $0 \le \lambda \le 1$, we have

$$T(\lambda u + (1-\lambda)v) = \max_i(\lambda u_i + (1-\lambda)v_i) - \min_i(\lambda u_i + (1-\lambda)v_i) \le$$
$$\le \max_i(\lambda u_i) + \max_i((1-\lambda)v_i) - \min_i(\lambda u_i) - \min_i((1-\lambda)v_i) =$$
$$= \lambda(\max_i u_i - \min_i u_i) + (1-\lambda)(\max_i v_i - \min_i v_i) = \lambda T(u) + (1-\lambda)T(v).$$
This proves the convexity of the time functional.

COROLLARY. If the set of schedules $\Omega$ is convex with respect to operations $+$, $\cdot$, then the set of minimums of the time functional is also convex with respect to these operations.

Let schedules $u$ and $v$ be minimums and $T(u) = T(v) = T_\Omega$. By Statement 10.9 for any $\lambda$, $0 \le \lambda \le 1$, we conclude that

$$T(\lambda u + (1-\lambda)v) \le \lambda T(u) + (1-\lambda)T(v) = T_\Omega.$$

Since $T_\Omega$ is the minimum value of the time functional in $\Omega$, $T(\lambda u + (1-\lambda)v)$ cannot be less than $T_\Omega$. Therefore the equality $T(\lambda u + (1-\lambda)v) = T_\Omega$ must hold. Consequently the set of minimums is convex.

Now we are able to describe some sets of high-speed schedules. Consider the class $R_\omega(s)$. It is closed under operations $\oplus$ and $\otimes$. It follows that the set of high-speed schedules in $R_\omega(s)$ is a semiring with respect to operations $\oplus$ and $\otimes$. The schedule $O(s)$ is a high-speed one in $R_\omega(s)$. This same schedule is the "zero" in the semiring of high-speed schedules from $R_\omega(s)$. This semiring is metrically closed and bounded from below. Actually, an even stronger property holds: $\min_i t_i$ remains constant for all schedules $t$ in $R_\omega(s)$. The fact that the time functional is constant in the semiring implies that the semiring itself is bounded from above. Hence the "unit" schedule exists in the semiring. The semiring of high-speed schedules from $R_\omega(s)$ is convex with respect to operations $+$, $\cdot$. According to (9.8) any class $R_\omega(s)$ can be obtained as the result of the "parallel translation" of $R_\omega(s^0)$ by the schedule $O(s)$. The semirings of high-speed schedules corresponding to these classes do not possess that property. Moreover, in the general case they are not even isomorphic.

The set of high-speed schedules from $R_\omega$ has the following structure. Taken as a whole, it is a semi-algebra with respect to operations $\oplus$, $\otimes$, $\circ$. It is convex with respect to operations $+$, $\cdot$ and metrically
closed. Algebraically this set is a union of semirings of high-speed schedules from some classes $R_\omega(s)$, but in general not from all of them. To be more precise, only those classes are taken for which the corresponding optimal schedules are high-speed schedules in $R_\omega$. All such optimal schedules form a semi-algebra. If the set of initial conditions vectors is bounded from below (from above) then the semi-algebra of high-speed schedules from $R_\omega$ contains the "zero" ("unity"). This "zero" (or "unity") would remain to be the "zero" ("unity") in the semi-algebra of high-speed optimal schedules.

In the general case finding high-speed schedules may prove to be a rather complicated task requiring the use of numerical methods. Such methods can be based upon, e.g., Statement 10.9. Therefore it is important to find sufficient conditions for a schedule, in particular an optimal schedule, to be a high-speed one.

To each arc of the algorithm graph going from the $i$th node to the $j$th node one may ascribe the weight $\omega_{ij}$. We define the weighted length of a path to be the sum of the weights of the arcs constituting it. The path of maximum weighted length is called the critical path of the algorithm graph.

STATEMENT 10.10. If the set of initial conditions vectors contains a vector whose components are equal to one another, then the corresponding optimal schedule is a high-speed one, and the algorithm execution time which it yields is equal to the weighted length of the critical path of the algorithm graph.

Take $s = 0$ and build the optimal schedule $O(0)$. Consider the $j$th node of the algorithm graph. If $g_j \ne \emptyset$ then in accordance with (9.7) there exists a $j'$th node such that an arc goes from it into the $j$th node and $O_j(0) = O_{j'}(0) + \omega_{j'j}$. It follows that a path exists leading to the $j$th node from an input node such that

$$O_j(0) = \omega_{j'j} + \omega_{k'j'} + \dots + \omega_{p'l'}, \qquad (10.6)$$
where $j'$, $k'$, $\dots$, $l'$, $p'$ are the numbers of nodes belonging to that path. In accordance with the properties of optimal schedules the quantity $O_j(0)$ is minimal for schedules corresponding to $s = 0$. Consequently, the weighted length of any path linking any input node and the $j$th node cannot exceed the value of the sum in the right-hand side of (10.6). If we take the last operation of the algorithm as the $j$th node then the sum in the right-hand side of (10.6) will be equal to the weighted length of the critical path of the graph. Clearly, the algorithm execution time cannot be less than the weighted length of the critical path, and this limit is reached on the schedule $O(0)$. Now let the set of initial conditions vectors contain a vector whose components all equal some nonzero value. This case can be reduced to the one we have considered by taking into account the relations

$$\lambda \circ O(s) = O(\lambda \circ s), \qquad T(\lambda \circ O(s)) = T(O(s)).$$

COROLLARY. If all components of the initial conditions vector $s$ equal one another then the optimal schedule $O(s)$ is a high-speed one in $R_\omega$.

Notice that the arbitrariness of the initial conditions vector can only hinder our achieving the minimum algorithm execution time. To get around it we have to take a vector with identical components as the initial conditions vector. For particular cases other initial conditions may exist on which the minimum execution time specified by Statement 10.10 is reached.
11. Examples

While finding schedules, one of the most important points is investigating the subsets of algorithm graph nodes that are mapped onto the functional units of the machine which do their jobs at the same time moment. These subsets define some parallel forms of the algorithm graph. Therefore we will refer to these subsets as the layers of a schedule corresponding to the time moment that defines the subset. A layer is sometimes called a wavefront. The following geometric interpretation accounts for this term. Suppose that the algorithm graph nodes are situated at points of some space and that time is discrete. The nodes at which operations are executed at a given time moment correspond to some surface in that space. If we regard these surfaces as a function of time, we have a process which resembles wave propagation. This grounds the term "wavefront" that is applied to the layers.

We illuminate the behavior of the layers, or, which is the same thing, of the wavefronts, with typical examples of schedules. We will primarily examine optimal and high-speed schedules. The results of this and the forthcoming chapter will be applied to the investigation of computational algorithms.
EXAMPLE 11.1. Consider the algorithm graph from Example 6.3, which is shown in Fig. 6.5. Let the graph nodes be situated in the nodes of an integer grid with coordinates $i, k$, where $1 \le i \le m$, $1 \le k \le n$. Suppose that the node in the upper left corner of Fig. 6.5 has coordinates (1,1), the $i$ axis is oriented from top to bottom, and the $k$ axis is oriented from left to right. We assume that all components of the vector $u$ equal 1 and that time takes positive integer values. Recall that the input nodes are not shown in Fig. 6.5. Without losing generality we can assume they are situated in the integer points of the axes of the coordinate system.
Let $t_{ik}$ be the component of the schedule corresponding to the algorithm graph node with coordinates $i, k$. Then the formulas (9.1) will have the form

$$t_{ik} \ge \max(t_{i-1,k},\ t_{i,k-1}) + 1, \quad \min(i,k) \ge 1,$$
$$t_{ik} = s_{ik}, \quad \min(i,k) = 0. \qquad (11.1)$$

Here, $s_{ik}$ are the components of the initial conditions vector specifying the moments of data input.

The general solution of the problem (11.1) is easy to describe. The quantity $t_{ik}$ can be regarded as a function $t(i,k)$ of two variables $i$ and $k$. The relations (11.1) imply that $t(i,k)$ takes integer values in integer points and strictly increases with $i, k$. The function $t(i,k)$ is defined for $i, k \ge 0$ and takes given values $s_{ik}$ if $\min(i,k) = 0$. No other constraints are imposed on the solution of (11.1).
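The relations (11.1) can be probed with a short computation. The sketch below is an illustration only, not code from the book: it assumes zero boundary values $s_{ik} = 0$ and takes equality in (11.1) everywhere, which gives the earliest admissible moment for every node. The function names `earliest_schedule` and `layers` are hypothetical.

```python
def earliest_schedule(m, n):
    """Earliest moments t(i,k) on an m x n grid, assuming zero boundary values.

    Row 0 and column 0 hold the boundary values s_ik = 0 (an assumption made
    for this illustration); interior nodes take equality in (11.1).
    """
    t = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for k in range(1, n + 1):
            t[i][k] = max(t[i - 1][k], t[i][k - 1]) + 1
    return t


def layers(t):
    """Group interior nodes (i, k) by the time moment assigned to them."""
    grouped = {}
    for i in range(1, len(t)):
        for k in range(1, len(t[0])):
            grouped.setdefault(t[i][k], []).append((i, k))
    return grouped


schedule = earliest_schedule(3, 4)
wavefronts = layers(schedule)
```

Under these assumptions every interior node receives the moment $i + k - 1$, so each layer collects the nodes of one line $i + k = \text{const}$.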
102 For
given
boundary
values
s..
the
set
of
such
f u n c t i o n s forms
a
1K
semiring our
with
respect
particular
Identity
function
semiring.
to operations
c a s e . The 0(i,k)
I t is given
©
and
a.
with
respect
Oli.k) =
a l l s
fact
i s obvious
the class
In
fi^ts).
t o t h e si o p e r a t i o n e x i s t s
The
i n the
by
O(i.k) = nax(0(i-l,k),0(r,Jc-l))+l
If
This
said semiring Is actually
are equal
t o 0,
sik
2 1
i f minli.k)
i f min(i,k) =
the f u n c t i o n 0 ( i , k )
0.
i s given e x p l i c i t l y
by
a
Ik
simple
formula • U.k)
The tained
f o r constant
schedule
I n R^
of
surfaces
of
tions
describe
lines
in Fig.
by
is
optimal
schedule
optimal
of
form
form are given
the form
nodes
by
each
that
are
the
algo-
constant of
the
const.
0(i,k) = of
ob-
high-speed
of
components
a l l solutions
sets of
i t is a
parallel
i . e . the
schedule
of
va-
nodes
For
these
the
equa-
shown b y
dashed
i n F i g . 6,5 propagates
i t can
the wavefront i n much
be d e s c r i b e d we
will
by
the
described same
way
a hyperplane.
often attempt
by
as
a
Such
to describe
a
highplanar
situation Mavefronts
hyperplanes.
nodes
We
together w i t h
resulting
nents ger
those
f r e q u e n t l y encountered;
EXAMPLE 1 1 . 2 .
The
maximum
0(i,k),
(11.2)
precisely
i s an
(11.2)
6.5.
wave. C o n s e q u e n t l y ,
by
the
the equations
given
Thus f o r t h e g r a p h speed
(11,2)
layers of that p a r a l l e l the f u n c t i o n
0(i,k)
by
1.
conditions. Therefore
describes
each l a y e r s a t i s f y
function
given
initial
that
r i t h m g r a p h . The lue
0(i,k)
function
= i+k-1 i f min(i,k) £
of
graph
now
arcs
modify
the
incident
on
i s shown i n F i g .
t h e v e c t o r u t o be
values.
zero values
Consider of s I t s
an
equal
optimal
a l g o r i t h m graph d e l e t i n g
several
them f r o m
corner.
11.1.
t o 1 and high-speed
layers are
shown by
the
upper
A g a i n , we time
left
assume a l l compo-
to take p o s i t i v e
schedule dashed
inte-
corresponding
lines
in Fig.
to
11.1.
103 We
can observe
the
graph
11.2
6.5
the
though
that
the r e g u l a r i t y of the wavefront that
i s severely
layers
i t i s not
of
marred.
another
lines
I t is a
11.1
we
high-speed
modification
s i g n i f i c a n t changes
i n the layers
ing s u i t a b l e schedules.
Irregular layers
scribe
and
A viable
We
that
the graph
i n F i g . 11.1
can
therefore
t r y t o expand
6.5.
We
examine then
investigate.
i t s schedules,
consider their
mentioned The general,
that
should
i n Figs.
exactly
o f some o f
before their
possess "regular"
high-speed while
i s a subgraph a given
o f the graph
graph of
findt o de-
by o u r p i c t u r e s . inFig.
before s t a r t i n g
t h e expanded g r a p h . We
graph
have
a
11.2 a r e b o t h h i g h - s p e e d
importance f o r the p a r a l l e l i s f o u n d . By a n d l a r g e ,
the t o t a l
dramatically
our graphs
may r e -
a r e much more d i f f i c u l t
out i s suggested
onto the o r i g i n a l
11.1 a n d
i s that
should admit
way
the schedules
what s c h e d u l e
requirement
not d i f f e r
too,
to and
i n fact
i n §4.
i t i s not o f v i t a l
themselves
graphs
study
reduction
approach
schedules
vestigation nificant
one,
of a graph of algorithm
schedules. This circumstance causes additional difficulties
see
enjoyed I n
represent i n Fig.
F i g . 11.2
Thus an i n s i g n i f i c a n t i n quite
schedule.
dashed
optimal.
Fig.
sult
The
from
number o f l a y e r s
the minimal
simple description.
exploration, schedules.
the only in a
one, w h i l e I f we
choose
i t i s very important
ones. I n
structure I n sig-
schedule
the
layers
t o expand
t o know
what
104 EXAMPLE 1 1 . 3 . C o n s i d e r graph
nodes
w h e r e 2^i^n, vector
u
values.
laJSi-l,
t o equal Suppose
k=0,
the input
j=0,
k-0,
in
,
schedule
than
that
0(i,j,0)
of
i n R
are
on
6.2. L e t
coordinates i , j , k
i,j,0.
The
situated
on
the
A are
i , j , 1 feeds
situated
input
) + 1 i f j*o
or
i*0, k=l
initial
o f the problem
conditions
(11.3)
(11.1).
vector
i s much more
However,
obtained e x p l i c i t l y
that
the
difficult high-speed
i n case
a l l s . ., IJK
have
OU.j.O) - max(0(i,J-1,0), OU, j,0)
We
conclude
The
layers
schedule
satisfy
= 0 i f >0.
= j
i f j*0.
i s a high-speed
i n F i g . 6.2.
again
a
hyperplane,
The
wavefront
a l though
(11.4)
one
g r a p h . The
t h e e q u a t i o n s j = const.
lines
We
(11.4)
form o f the algorithm
outwardly
0(J.J-1.0)) + 1 i f j * 0
that
0(j',j,0)
parallel
data
t i l . 3)
D
e q u a l 0, we
line
input.
the problem be
the
J.J.i
. , i f j - 0 , k=0
can
occupy
the matrix
, t . .
of
i n t h e nodes f o r b
formulas (9.1) take the form
J.j-l.o
data
take p o s i t i v e ' integer
the vector
components
with
t h e components
general solution
obtain
nodes data
. t . .
s.
f r o m Example with
and
data
t h e moments o f i n i t i a l
The to
discrete
node
.
. , -
grid
providing
J.J-1,0
s . ., a r e
specify
nodes
e max(t.
t,
Here,
graph
components
J.J.O
t o be
providing
f o r k = l . The
i n F i g . 6.2
integer
Once again we assume all components of the
time
a l l inner nodes
t h e node w i t h
t.
k=0,1.
1 and
the input
t h e nodes
into
the graph
occupy t h e nodes o f an
i t defines
a maximum
nodes b e l o n g i n g t o
individual
The
and
layers
d e s c r i b e d by
the graphs
a r e shown by
the schedule
i n Figs.
6.2
and
dashed
(11.4) i s
6. 5
are not
alike.
have
succeeded
in finding
the general
solution
of
(11.1),
yet
105 this find
I s more
other particular
be
based
to
the graph,
linked the
difficult
f o r (11.3).
solutions
o f (11.3)
on t h e approach
by
(11.3)
This
the graph,
t o nodes w i t h
mains a c y c l i c
adding
arcs
(
s max(t
t . . i.j.o
2.1,0
The q u a n t i t y
.
i and j . ( 1 1 . 5 )
integer
points values
of
applled
> sj
from
search
will
a d d new
arcs
repeatedly
. f
1,0,0
t(I,j,0)=Sj ,i
I f j=0,k=0
c a n be r e g a r d e d implies that
be r e w r i t t e n
, t
Q
these
follows:
)+l
i f i s 3 , J<0
(11.5)
or j*0,k=l.
as a f u n c t i o n
I f j = 0 , and
as
2,1,1
t(i,j,0)
increases with j
the graph r e -
b y some p a t h . D e l e t i n g
(11.3) w i l l
2,0,0
Clearly,
coordinates
The e n d p o i n t s o f a n y a r c n o t
be l i n k e d
i.J.k
and s t r i c t l y
r(i.j,0)
takes integer
i and j . I t a l s o satisfies
the
otherwise. There are no other constraints
i n two
values i n takes pre-
inequalities on t h e so-
(11.5).
A high-speed (11.5) providing
optimal
schedule
c a n be f o u n d
t h a t a l l s . .. a r e e q u a l
among t h e s o l u t i o n s o f
t o 0. We
have
IJK
0(2.
0(1.j,0)
1,0) = 1
= max(0(i,j-1,0),
Oli-l,j,0))+l
O(i.j.O) = 0 i f j - 0 .
Hence we
until
whose s o l u t i o n s a r e
nodes w i t h
i max(t . . , t. . , t, , )+l i , J-1,0 i-i,j,o i . J . i t, i.j.k
defined
going
expansion.
t h e problem
lution
Our
t h o s e a r c s whose e n d p o i n t s a r e c a n be
c o o r d i n a t e s I + l . J . O i f j*0.
throughout this
t o c o o r d i n a t e axes w i l l
£(i.j,0)
(11.4).
t o a s i m p l e r problem
arcs from the graph,
variables
than
can a t t e m p t t o
i n §4. N a m e l y , we w i l l
procedure
i s reduced
H o w e v e r , we
t h e s o l u t i o n s o f ( 1 1 . 3 ) as w e l l .
Expand
parallel
suggested
simultaneously discarding
some p a t h s .
problem
certainly
i.j.O
t o do
obtain
i f i=3,j*0
0{i,j,0)
This
schedule
i s neither
(11.3),
Nevertheless,
(11.6),
i t can be e a s i l y
Is
slower
sible.
by about
the graph
dinates
high-speed
verified
t h e graph
0
nor optimal
f o r t h e problem
the implementation
o f 2 than
the fastest by a
(11.4),
i t describes
implementation
pos-
hyperplane.
us to demonstrate how careful one should be
o f a l g o r i t h m . Consider
i n F i g . 6.2, a d d i n g
i , j ,
(11.6)
bad a t a l l . Comparing
that
i s again described
This example can serve
of
i f i i 2 , j*0.
i t i s not that
a factor
The w a v e f r o n t
while expanding
= i+j-2
t o t h e nodes
arcs
with
going
a different
from
coordinates
expansion
t h e nodes w i t h i - 1 , j ,
coor-
0. The
graph
would remain acyclic and the end points of nearly all arcs not parallel to
c o o r d i n a t e axes would
ception
i s the arcs
«,j+1,0.
Apparently
"long"
arcs
be l i n k e d
connecting
coordinates
be s i m p l i f i e d ,
j*l,j,0
and
since nearly a l l
can be removed. Yet the expansion we made, followed by the
2
be n / 2 (we w r i t e
execution
o f l e n g t h 2 . The o n l y ex-
t h e nodes w i t h
t h e g r a p h may w e l l
r e m o v a l o f some a r c s , r e s u l t e d would
by paths
time
T h u s we h a v e
would
lost
i n t h e g r a p h whose c r i t i c a l
t h e most s i g n i f i c a n t
accordingly
a l l parallelism
term
be o f t h e same
path
length
only).
The a l g o r i t h m
order
o f magnitude.
by t h e u n f o r t u n a t e expansion
o f the
graph. Notice hyperplane.
that
even
i n that
0(i,j,0)
Note t h a t size.
Also
schedule.
case t h e w a v e f r o n t
c a n be d e s c r i b e d
by a
I ti s g i v e n by
= i+nj-n-1
the c o e f f i c i e n t s o f the linear note We w i l l
that
the linear
discuss
related
function topics
i f
j*0.
f u n c t i o n depend on t h e problem does n o t d e f i n e a later
i n t h e book.
high-speed
Chapter 3
Algorithms and Computer Memory

Memory is one of the basic elements of a computer system. From the end user's viewpoint, memory is merely data storage equipment, irrespective of the technology used to produce it. The only user requirements are sufficient capacity and relatively fast access. Most often mathematical models of computer memory assume it to be arbitrarily large and data read/write operations to be instantaneous. These assumptions are, in a sense, contradictory and need some elaboration.

The existing high-capacity data storage devices do not provide sufficiently low access times. So far their access time is substantially larger than the average time required to perform an arithmetic operation. On the other hand, ultrahigh-speed memories exist, yet their capacity is limited. That is why the amount of various available kinds of memory is to be taken into account while developing algorithms if we strive to solve our problems as fast as possible.

Existing computer systems incorporate storage devices of various kinds. These devices possess varying characteristics as regards their capacity and access times. Besides, computer systems may include several banks of memory of the same kind. All banks of memory and functional units communicate via a network or a switch with limited throughput of communication channels. Obviously, we cannot hope to implement our algorithms efficiently without taking into account the limitations of the memory structure and of the communication network. This chapter is dedicated to the investigation of the mutual influence of algorithm structures and computer memory peculiarities.
12. Examples

We start with a qualitative analysis of several examples which illuminate the problems of high-capacity storage devices usage.

EXAMPLE 12.1. Given three arrays A, B, and C of dimensions N×L, M×L, and M×N, respectively, consider the following algorithm (we use a FORTRAN-like language to present our algorithms):

      DO 1 t=1,M
      DO 1 i=1,N
      DO 1 j=1,L
      A(i,j)=A(i,j)+A(i-1,j)+A(i,j-1)                    (12.1)
    1 CONTINUE

We assume that the following equalities are definitions:

      A(0,j)=B(t,j),   A(i,0)=C(t,i).

The algorithm (12.1) reflects some characteristic features of a large group of computational methods. Here belong many explicit and explicit-implicit methods for the solution of grid equation systems stemming from non-stationary problems of computational physics, a number of iteration methods for the solution of grid equations in stationary problems, and also some direct methods for the solution of systems of linear algebraic equations, such as Gauss elimination, the Givens method, etc. The precise nature of the operation performed in the innermost loop of (12.1) is of no importance; what really matters is the loop structure and the form of index expressions. Having studied the key points of implementation of (12.1) we shall be able to understand those of many methods and algorithms possessing similar structure.

Let us build the graph of the algorithm. Let graph nodes occupy the integer points of the parallelepiped $1 \le t \le M$, $1 \le i \le N$, $1 \le j \le L$. Assume that all nodes correspond to one and the same operation of the form u=a+b+c. The arcs of the algorithm graph are easily found using (12.1). Fig. 12.1 displays a subgraph of the algorithm graph corresponding to t=1,2. The whole graph can be obtained by forming a stack of such subgraphs parallel to the t axis. We have suppressed the input and output arcs so as not to clutter Fig. 12.1 by unessential details. The said arcs describe the initial data input from the arrays A, B, C and the output of results from the terminating graph nodes.
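The loop nest (12.1) can be transcribed directly. The following is a minimal sketch under stated assumptions rather than the book's own code: arrays are 0-based Python lists, the working grid is padded with a row 0 and a column 0 so that the boundary definitions A(0,j)=B(t,j) and A(i,0)=C(t,i) can be written literally, and the name `run_12_1` is hypothetical.

```python
def run_12_1(A0, B, C):
    """Execute the loop nest (12.1).

    A0 is N x L (initial A), B is M x L, C is M x N, all as lists of lists.
    Returns the final N x L array A.
    """
    N, L, M = len(A0), len(A0[0]), len(B)
    # Padded copy: A[i][j] for 0 <= i <= N, 0 <= j <= L; row 0 and column 0
    # carry the boundary values defined in the text.
    A = [[0] * (L + 1) for _ in range(N + 1)]
    for i in range(1, N + 1):
        for j in range(1, L + 1):
            A[i][j] = A0[i - 1][j - 1]
    for t in range(1, M + 1):
        for j in range(1, L + 1):
            A[0][j] = B[t - 1][j - 1]      # A(0,j) = B(t,j)
        for i in range(1, N + 1):
            A[i][0] = C[t - 1][i - 1]      # A(i,0) = C(t,i)
        for i in range(1, N + 1):
            for j in range(1, L + 1):
                # One graph node (t,i,j): an operation of the form u = a + b + c.
                A[i][j] = A[i][j] + A[i - 1][j] + A[i][j - 1]
    return [row[1:] for row in A[1:]]


result = run_12_1([[1, 1]], [[10, 20]], [[5]])
```

In this tiny instance (N=1, L=2, M=1) the first entry becomes 1+10+5 and the second reuses it: 1+20+16.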
Fig. 12.1

The variables we have introduced have the following counterparts in non-stationary problems occurring in practice: $t$ corresponds to time, $M$ to the number of time steps, $i$ and $j$ to space coordinates, $NL$ to the total number of space grid nodes. We will borrow from the field of non-stationary problems the term "time level" to refer to graph nodes corresponding to a fixed value of $t$. The notation (12.1) describes a typical mode of computation whereby the operations are carried out level by level.

To execute the algorithm, $MNL$ operations of the form u=a+b+c must be performed. For large $M$, $N$, $L$ this may take considerable time, requiring the use of some parallel computer. On the first impression, using a parallel system should not incur any unexpected complications. Certainly, the graph of every time level has the same structure as the graph in Fig. 6.5 considered in Example 6.3. We have found out that such a graph admits of a very good parallelization. Consequently, the algorithm (12.1) can be parallelized within each time level to a theoretically sufficient extent. It looks like if we simply parallelize the computations we can expect to achieve the necessary speedup by using a parallel computer.
However, these expectations may be frustrated. The presence of parallel branches of computation is but one of many factors pertaining to the efficiency of an algorithm implementation on a parallel computer. The overall efficiency depends heavily on the data traffic between individual units of a computer system, particularly between the functional units and memory. Level-by-level implementation of (12.1) may be ineffective with respect to the latter.

Indeed, the operations of each level may be executed either sequentially or in parallel. Their execution takes some time. During that time, about $NL$ data items from the previous level must be fetched, and as many values resulting on the current level must be stored, as they will be needed to perform the operations of the next level. We may also have to store and fetch some intermediate results. For the execution time to be minimal, it is clearly necessary that the accumulated memory access time should not exceed the time needed to perform all the operations. It follows that the RAM size required for level-by-level implementation is of the order of $NL$. To be more precise, the time of access to memory of such capacity should be of the same order as the average time required to perform operations like u=a+b+c. If this condition does not hold, memory access will curb the potential algorithm execution speed.

Thus level-by-level implementations of algorithm (12.1) imply the necessity to have RAM of size proportional to the number of operations of the form u=a+b+c on a time level. In practice this requirement is easily met for one-dimensional (in space) problems of computational physics. It is much less easy to fulfill for large two-dimensional problems. Finally, it becomes a major obstacle for solving problems of space dimension 3 or more.

The question naturally arises whether the large RAM size requirement mirrors some inherent properties of the algorithm (12.1) or it is merely a consequence of executing it level-by-level. The knowledge of the algorithm graph proves to be helpful in finding the answer. Partition the graph into subgraphs contained in the parallelepipeds bounded by hyperplanes parallel to the coordinate hyperplanes and not comprising the graph nodes. The macrograph consisting of such subgraphs will be acyclic, its structure resembling that of the original graph. Obviously, algorithm implementations exist wherein groups of operations correspond, on the whole, to individual parallelepipeds or, equivalently, to individual macronodes. These groups may be executed one-by-one or concurrently depending on the available hardware. The operations within every group may also be executed sequentially or in parallel.
Suppose that only the results of the operations belonging to an individual macronode are stored in RAM, and the macronodes are executed one-by-one. Suppose also that all data exchanges between macronodes are performed via external memory. The number of external memory references needed to execute a macronode is proportional to the surface area of the corresponding parallelepiped, while the number of arithmetic operations and RAM references is proportional to the volume of that parallelepiped. Since the volume grows faster than the surface area with the dimensions of the parallelepiped, we can always find a partitioning of (12.1) into subgraphs such that the time of references to external memory will be insignificant while executing each macronode.

The described implementations of algorithm (12.1) are exceptionally efficient. Though we chiefly use external memory to store data, the overall effect would be as if all memory were fast. This effect does not depend on whether the operations within each macronode are executed serially or in parallel. If macronodes are executed sequentially in one-by-one fashion, then the amount of RAM required to execute an individual macronode is sufficient to execute the entire algorithm. If the parallelism within individual macronodes does not result in sufficient speedup, nothing prevents us from executing several macronodes in parallel. The total amount of required RAM increases proportionally to the number of macronodes being processed simultaneously. However, this memory is distributed among the processors executing individual macronodes, so that intercommunication can again be established via external memory.
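The surface-versus-volume argument can be checked by direct counting. The sketch below is an illustration under assumptions, not the book's procedure: it reads the arcs of (12.1) off the loop body, so that node (t,i,j) depends on (t-1,i,j), (t,i-1,j) and (t,i,j-1), it considers a block lying strictly inside the graph, and it counts every arc entering the block from outside as one external reference. The names `DEPS` and `block_costs` are hypothetical.

```python
from itertools import product

# Dependence offsets of graph (12.1), read off the loop body (an assumption
# of this sketch): A(i,j) from the previous t, A(i-1,j) and A(i,j-1).
DEPS = [(-1, 0, 0), (0, -1, 0), (0, 0, -1)]


def block_costs(bt, bi, bj):
    """Return (operations, external references) for one interior block."""
    block = set(product(range(bt), range(bi), range(bj)))
    operations = len(block)  # volume of the parallelepiped
    external = sum(
        1
        for (t, i, j) in block
        for (dt, di, dj) in DEPS
        if (t + dt, i + di, j + dj) not in block  # arc crosses a face
    )
    return operations, external


small = block_costs(4, 4, 4)
large = block_costs(8, 8, 8)
```

Doubling the edge of a cubic block multiplies the operation count by eight but the external references only by four, so the share of external references per operation halves.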
We see that the pessimistic prognosis as to the amount of RAM required to execute (12.1), which was based on the consideration of level-by-level implementations, is not corroborated by further analysis. It turned out that (12.1) admits of highly parallel implementations that do not require large amounts of RAM. De facto these implementations are tantamount to splitting the algorithm into parts that can be executed using small amounts of RAM. It is also important that the time required to exchange data between those parts is insignificant as compared to the overall execution time, even though the exchange is performed via external memory.

Recall that (12.1) mirrors the characteristic features of most explicit and explicit-implicit methods to solve systems of grid equations arising in non-stationary problems of computational physics. To increase the efficiency of such methods we can try to split them in the same way as we did for (12.1).

Level-by-level implementations that correspond to row or column elimination methods in linear algebraic problems are just as ineffective with respect to memory usage as in (12.1): these algorithms do not use RAM efficiently. Therefore the efficiency of these methods is increased by building block versions of the algorithms. The block algorithms perform about the same operations as the corresponding counterparts, yet in an essentially different execution order. In many cases the efficiency of block algorithms is due to selecting a different order of operations; formally, it would be equivalent to the above-described splitting of the algorithm. There are a number of other algorithms wherein the efficiency of implementation can be increased by splitting, as for the algorithm (12.1) that we considered above.

EXAMPLE 12.2. Let A, B, and C be arrays of dimensions N×L, M×L, and M×N, respectively. Suppose that the algorithm rewrites the entries of A and can be expressed in a FORTRAN-like language as follows:

      DO 2 t=1,M/2
      DO 1 i=1,N
      DO 1 j=1,L
      A(i,j)=A(i,j)+A(i-1,j)+A(i,j-1)
    1 CONTINUE
      DO 2 i=N,1,-1
      DO 2 j=L,1,-1
      A(i,j)=A(i,j)+A(i+1,j)+A(i,j+1)                    (12.2)
    2 CONTINUE

We assume that by definition

      A(0,j)=B(2t-1,j),   A(i,0)=C(2t-1,i),
      A(N+1,j)=B(2t,j),   A(i,L+1)=C(2t,i).

Assume also that M is even.

The algorithm (12.2) also mirrors the characteristic features of a large group of computational methods. Here belong many implicit methods for the solution of grid equation systems arising in non-stationary problems of computational physics. For example, this group includes such well-known and widely used methods as SSOR. A number of iteration methods of solving grid systems in stationary problems and direct methods of solving linear algebraic systems are closely related to (12.2) in their structure. Therefore we again hope to achieve better understanding of many similar methods by studying the key points of implementation of algorithm (12.2).
Before comparing (12.1) and (12.2) let us transform (12.2) to make it closer to (12.1). The algorithm (12.1) can be regarded as a loop in $t$ whose body is a doubly-nested loop. The algorithm (12.2) differs from (12.1) in that its body is not a single doubly-nested loop, but two successive doubly-nested loops. We make a substitution of the independent variable for the outer loop so that the first doubly-nested loop is performed for its odd values and the second one for its even values. We also make the linear substitutions $i \to N-i+1$, $j \to L-j+1$ for both variables of the second loop to achieve identical indexing for the doubly-nested loops. We have

      DO 1 t=1,M
      DO 1 i=1,N
      DO 1 j=1,L
      A(i,j)=A(i,j)+A(i-1,j)+A(i,j-1)                          for odd t
      A(N-i+1,L-j+1)=A(N-i+1,L-j+1)+
     *      A(N-i+2,L-j+1)+A(N-i+1,L-j+2)                      for even t     (12.3)
    1 CONTINUE

We assume that by definition

      A(0,j)=B(t,j),     A(i,0)=C(t,i)       for odd t,
      A(N+1,j)=B(t,j),   A(i,L+1)=C(t,i)     for even t.
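That the notation (12.3) describes the same computation as (12.2) can be confirmed numerically. The sketch below is an illustration under assumptions, not code from the book: arrays are 0-based lists, the working grid is padded on all four sides, the second boundary column is taken at index L+1 (the column actually referenced by the backward sweep), and the names `_padded`, `_interior`, `run_12_2` and `run_12_3` are hypothetical.

```python
def _padded(A0):
    """Copy an N x L array into an (N+2) x (L+2) grid with a zero border."""
    N, L = len(A0), len(A0[0])
    A = [[0] * (L + 2) for _ in range(N + 2)]
    for i in range(N):
        for j in range(L):
            A[i + 1][j + 1] = A0[i][j]
    return A


def _interior(A, N, L):
    return [row[1:L + 1] for row in A[1:N + 1]]


def run_12_2(A0, B, C):
    """Two successive doubly-nested sweeps per outer step, as in (12.2)."""
    N, L, M = len(A0), len(A0[0]), len(B)
    A = _padded(A0)
    for t in range(1, M // 2 + 1):
        for j in range(1, L + 1):
            A[0][j] = B[2 * t - 2][j - 1]       # A(0,j) = B(2t-1,j)
        for i in range(1, N + 1):
            A[i][0] = C[2 * t - 2][i - 1]       # A(i,0) = C(2t-1,i)
        for i in range(1, N + 1):
            for j in range(1, L + 1):
                A[i][j] += A[i - 1][j] + A[i][j - 1]
        for j in range(1, L + 1):
            A[N + 1][j] = B[2 * t - 1][j - 1]   # A(N+1,j) = B(2t,j)
        for i in range(1, N + 1):
            A[i][L + 1] = C[2 * t - 1][i - 1]   # A(i,L+1) = C(2t,i)
        for i in range(N, 0, -1):
            for j in range(L, 0, -1):
                A[i][j] += A[i + 1][j] + A[i][j + 1]
    return _interior(A, N, L)


def run_12_3(A0, B, C):
    """Single triple loop with a branch on the parity of t, as in (12.3)."""
    N, L, M = len(A0), len(A0[0]), len(B)
    A = _padded(A0)
    for t in range(1, M + 1):
        odd = t % 2 == 1
        for j in range(1, L + 1):
            A[0 if odd else N + 1][j] = B[t - 1][j - 1]
        for i in range(1, N + 1):
            A[i][0 if odd else L + 1] = C[t - 1][i - 1]
        for i in range(1, N + 1):
            for j in range(1, L + 1):
                if odd:
                    A[i][j] += A[i - 1][j] + A[i][j - 1]
                else:
                    p, q = N - i + 1, L - j + 1   # substituted indices
                    A[p][q] += A[p + 1][q] + A[p][q + 1]
    return _interior(A, N, L)


A0 = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0, 2], [3, 1, 0], [0, 2, 1], [1, 1, 1]]
C = [[2, 1], [0, 3], [1, 1], [2, 0]]
same = run_12_2(A0, B, C) == run_12_3(A0, B, C)
```

With integer data the two routines agree exactly, since the substitution only renames the loop variables and leaves the order of operations unchanged.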
We no longer assume M to be even.

Comparing (12.1) and (12.3) we cannot fail to notice their similarity. Both notations present triple tightly nested loops with identical indexing. The loop bodies, although different, have the same arithmetic complexity and involve the same arrays. That means that both algorithms involve the same amount of arithmetic operations and make the same memory requirements. The fact that the second algorithm involves some conditional branching and slightly more complicated index expressions for even $t$ is not important. This becomes quite clear if we replace the summing of three numbers by an involved function in three variables.

Thus the algorithms (12.1) and (12.3) are alike as regards their execution times and memory requirements if they are implemented on serial computers in accordance with the specified notations. If we ignore the influence of rounding errors, there is no conspicuous difference between the two algorithms.

Let us build the graph of algorithm (12.3). Place nodes into integer points of the parallelepiped $1 \le t \le M$, $1 \le i \le N$, $1 \le j \le L$. This time we deviate from our general rules: for odd $t$ we map operations corresponding to loop variables $t, i, j$ onto the nodes with coordinates $t, i, j$, and for even $t$ we map those operations onto the nodes with coordinates $t, N-i+1, L-j+1$. We again assume all nodes to correspond to one and the same operation of the form u=a+b+c. The subgraph of the graph of algorithm (12.3) for t = 1,2 is shown in Fig. 12.2. The whole graph can be obtained by forming a stack of such subgraphs parallel to the $t$ axis. The input and output arcs are not depicted to keep the picture clear from unessential details.

Comparing the graphs of both algorithms it is easy to observe that the subgraphs corresponding to odd values of $t$ are identical. The subgraphs corresponding to even values of $t$ differ only in the directions of arcs lying in the coordinate plane: their directions are opposite. As with algorithm (12.1) we shall refer to these subgraphs as time levels, though this term may turn out to be inappropriate with respect to implicit methods. For example, with splitting methods it would be more natural to apply the term "time level" to the subgraph corresponding to several successive values of $t$. However this is a purely terminological issue that has no real importance for us now.
Fig. 12.2
Note that all time levels of (12.3) are parallelized as effectively as those of the algorithm (12.1). Therefore there are no formal problems about finding parallel branches within a level in (12.3). Moreover, staying within the class of all sequential or parallel level-by-level implementations we must state that the algorithms (12.1) and (12.3) are practically indistinguishable as regards their execution time and memory requirements.

The question whether the algorithms (12.1) and (12.3) are fundamentally different remains open yet. Searching for the answer, let us investigate the structure of memory required by algorithm (12.3).
We know that level-by-level implementations of (12.1) require to have RAM of size proportional to the number of operations of the form u=a+b+c in one time level. It is easy to verify that that requirement must be satisfied for all level-by-level implementations of (12.3) as well. Yet we have found that (12.1) admits implementations that require a much lesser amount of RAM. Let us try to obtain analogous implementations for (12.3).

Consider a family of hyperplanes parallel to the coordinate hyperplanes and not containing graph nodes. They partition the graph of (12.3) into subgraphs residing in the parallelepipeds bounded by the hyperplanes. However, we see now that the macrograph formed by such subgraphs would be acyclic only if each subgraph belonged entirely to some time level or encompassed entirely several consecutive time levels. In all other cases the macrograph would not be acyclic and it would be impossible to schedule independent execution of operations constituting separate macronodes.

With algorithm (12.3) the valid partitionings do not allow data exchanges via external memory without considerable increase of the overall execution time. Certainly, let some subgraph belong entirely to a time level. The amount of data it exchanges with other subgraphs is proportional to the number of nodes of the subgraph. Therefore these data cannot reside in external memory if the operations are to be executed without long delays. In case the subgraph consists of several consecutive time levels the necessity for large RAM is obvious. Therefore the valid partitionings do not allow to reduce memory requirements.

Thus the algorithm (12.3) admits only the level-by-level implementations. In other words, any implementations of it require RAM of size proportional to the number of operations of the form u=a+b+c in each time level. We have already mentioned that this requirement is a major obstacle for the solution of computational physics problems of space dimension 3 or more. The use of external memory inevitably causes substantial increase in the execution time of algorithms (12.3) and (12.2). Here lies the cardinal difference between the two examples.
EXAMPLE 1 2 . 3 . The n o t a t i o n t h e language
nested
this
3 o r m o r e . The u s e o f e x t e r n a l
stantial
in
that
of i trequire
implemen-
portional
mension
Therefore
the valid
The a m o u n t
into
do n o t a l l o w t o r e -
memory w i t h o u t
cannot r e s i d e i n e x t e r n a l
Thus t h e a l g o r i t h m ( 1 2 . 3 ) tations.
words,
execution time. Certainly,
time
hyper-
I n a l l o t h e r cases the
partitionings
via external
subgraphs i s p r o p o r t i o n a l
Therefore
that
hyperplanes.
and i t would be i m p o s s i b l e
the valid
requirements.
allow data
long
(12.3)
the graph
by t h e
independent e x e c u t i o n o f o p e r a t i o n s c o n s t i t u t i n g separate
duce
of
t h e macrograph formed by such subgraphs
o r encompassed s e v e r a l c o n s e c u t i v e
With
o f the require-
o f implementations
i n t h e p a r a l l e l e p i p e d s bounded
H o w e v e r , we s e e now t h a t be
that
f o r (12.3).
and n o t c o n t a i n i n g g r a p h
subgraphs
that
L e t u s t r y t o o b t a i n a n a l o g o u s Im-
Consider a f a m i l y o f hyperplanes planes
the necessity
of operations
I t i s easy t o v e r i f y
ment must b e s a t i s f i e d f o r a l l l e v e l - b y - l e v e l as
Imply
constructs
I t i s known
(12.1)
i t uses.
that
drastically
The n o t a t i o n
the general
differs (12.1)
from is a
methods o f f i n d i n g
(12.2) tightly
parallel
117 computational
branches
above d i s c u s s i o n good
a r e wel1 developed
h a s shown
parallelization,
such very get
parallei
constructs.
(12.2)
We
have
important properties
ties
are determined
tors,
tively,
contain
that
algorithm
by t h e language
some
of entries
The of a
(12.1)
constructs
developed f o r
(12.2)
lacks
possesses.
o f an a l g o r i t h m
i s no d i r e c t
data,
admits
c o n s i d e r a b l e RAM r e -
algorithm
used
One
some might
and i t s p r o p e r i n the algorithm
c o n n e c t i o n between these two f a c -
shows.
t h e a r r a y s A, B, C o f d i m e n s i o n s
evaluation
constructs.
(12.1)
a r e by f a r less
out that
the structure
However, t h e r e
n o t make
found
as t h e f o l l o w i n g example Let
does
f o r such
algorithm
i s m o r e c o m p l e x . The g e n e r a l m e t h o d s o f
c o m p u t a t i o n a l branches
the impression that
notation.
not only
but i talso
q u i r e m e n t s . The n o t a t i o n Finding
that
Let
NxL,
HxL,
the algorithm
a n d MxN,
consist
of A according t o the following
respec-
i n the r e -
FORTRAN-like no-
tation:
      DO 2 t=1,M/2
      DO 1 i=1,N
      DO 1 j=1,L
    1 A(i,j)=A(i,j)*A(i-1,j)*A(i,j-1)                      (12.4)
      DO 2 i=1,N
      DO 2 j=1,L
      A(i,j)=A(i,j)+A(i-1,j)*A(i,j-1)
    2 CONTINUE
We assume that by definition A(i,0)=C(2t-1,i), A(0,j)=B(2t-1,j) for the first assignment statement and A(i,0)=C(2t,i), A(0,j)=B(2t,j) for the second assignment statement. Transform the notation (12.4) in the same manner as we did for (12.2) and build the graph of the algorithm. The resulting graph will exactly match the graph 12.1 provided we stick to the usual mapping of graph nodes onto the integer nodes of the grid. The only difference lies in the nature of the operations represented by the nodes. For odd time levels the nodes correspond to operations of the form u=a*b*c, for even time levels to operations of the form u=a+bc. Obviously, this difference does not alter the memory requirements in any way.

The last example is rather trivial. Our point is that there is no direct correspondence between the complexity of algorithm structure and the complexity of the notation representing it. This correspondence is much more complicated and it often can be understood only after building the algorithm graph.

EXAMPLE 12.4. Let the three-dimensional array A of size MxNxL and two-dimensional arrays B, C, D of sizes MxL, MxN, and NxL, respectively, contain some data. Consider the following algorithm:

      DO 1 t=1,M
      DO 1 i=1,N
      DO 1 j=1,L                                           (12.5)
      A(t,i,j)=A(t,i,j)+A(t-1,i,j)*A(t,i-1,j)+A(t,i,j-1)
    1 CONTINUE

We assume that by definition A(0,i,j)=D(i,j), A(t,0,j)=B(t,j), A(t,i,0)=C(t,i).

The algorithms (12.1) and (12.5) resemble one another to such an extent that it is difficult to tell whether any meaningful difference exists between them. Moreover, the graph of algorithm (12.5) exactly matches the graph 12.1. Despite that, there is a difference. The graphs of (12.1) and (12.5) coincide only as long as we omit the input and output arcs. These arcs involve only boundary nodes for (12.1), yet they involve all nodes for (12.5). Therefore there are substantially more i/o operations during the execution of (12.5) than during the execution of (12.1). It follows that the input data and the results of (12.5) cannot be stored in external memory, as external memory references can take more time than performing the operations of the algorithm. Again RAM of large size is needed to minimize the algorithm execution
time. EXAMPLE 1 2 . 5 . The e x a m p l e s common sults
that
the total
memory
we h a v e
size
considered
required
i s almost independent o f t h e implementation.
ment was v e r i f i e d cessarily
f o r level-by-level
the evaluation
of
t h e sum
i s s h o w n i n F i g . 2.2 ( b e l o w )
t h e same
layer
a r e performed
a r e needed
rithm
i s executed on a u n i p r o c e s s o r a l
allel
form
that
t o store
this
Intermediate
Therefore
situation
different
store the
i f we
take
t h e Input
end r e s u l t s For
the i n i t i a l ted
may
t h e memory
gorithm.
data
i s shown
results.
i n F i g . 12.3 f o r n=8. I t i s
I tfollows
that
implementations n
memory
can
cells
intermediate
results
of
be into
and
are input-
both
then rewhere
and end
the algorithm
be s t o r e d . Now
useful
we
can draw
inferences
some
from t h e
above d i s c u s s i o n . There a r e
i n this
case
on t h e a l g o r i t h m implementation, i . e .
o f the a l -
quire
pal—
r e q u i r e s o n l y a b o u t l o g 2 n memo-
example, i f
data
belonging
S u p p o s e t h a t same a l g o -
used t o
simultaneously
both
ne-
n / 2 memory
c o m p u t e r . The c o r r e s p o n d i n g
on t h e s c h e d u l e .
account
state-
i s not
the doubling
f o r n=S. A l l o p e r a t i o n s
implementation
t h e s i z e o f r e q u i r e d RAM d e p e n d s
The
this
This
(2.2)using
t o store intermediate results.
o f t h e a l g o r i t h m graph
easy t o v e r i f y cells
At l e a s t
form o f t h ecorresponding a l -
simultaneously.
cells
ry
i n
t h e case f o r a l l a l g o r i t h m s .
Consider
to
I t
intermediate r e -
Implementations.
scheme. T h e g r a p h o f t h e m a x i m a l p a r a l l e l gorithm
so f a r have
to store
F i g . 12.3
120 algorithms
a n d m e t h o d s whose
implementations
t h e c o m p u t e r s y s t e m p o s s e s s l a r g e RAM, put,
output,
tion,
and
intermediate data
arithmetic
popular
implicit
computational can
always
These
generated
a n d t h e a c c e s s t i m e must n o t e x c e e d
perform
like
splitting
problems.
implemented
methodl
On
hand,
using
a
methods and a l s o
used
t o solve
kind
methods,
the other
efficiently
i n c l u d e many e x p l i c i t
the Gauss-Seidel
during
algorithm
t h e average
o p e r a t i o n s . Methods o f t h i s methods,
physics
be
necessarily require
that
T h i s memory must c o n t a i n a l l i n -
i n c l u d e e . g . many used
to solve the
methods
small
some
execu-
time required to
exist
amount
implicit
the computational
of
that RAM
ones
(like
physics
prob-
lems. N o t e t h a t d i f f e r e n t demands a s t o t h e s t r u c t u r e are not d i r e c t l y of computations ference
connected w i t h
the (im)possibility
at the individual
arithmetic
Thus size
operations level.
i n t h e demands o n t h e memory s t r u c t u r e
propagation processes d u r i n g a l g o r i t h m various
algorithms
and t h e q u a l i t y
investigation
make
mirrors that
o f mathematical
different
problems
on a l g o r i t h m
The d i f -
i n t h e data
execution. requirements
o f c o m p u t e r memory. T h i s
memory s i z e a n d s t r u c t u r e
o f t h e u s e d memory of parallelization
suggests
concerning
both
a more
on t h e thorough
t h e dependence o f
properties.
13. Total Required Memory Size

Before we start the research concerning the use of computer memory we must agree precisely what is to be stored and in what manner. Let us assume as before that all operations of the algorithm are performed instantaneously. This amounts to fixing some properties of a hypothetical machine. We have seen that the unreality of the assumption can be compensated for by introducing delay vectors. We have substituted the actual computer by a hypothetical machine and considered algorithm implementations for that machine. Among all such implementations we have singled out a subset of schedules corresponding to a given delay vector. We have studied both general properties of schedules and those concerning algorithm execution time. To investigate memory-related issues we have to elaborate our model hypothetical machine. Of course, we shall again tend to express memory usage problems in terms of schedules and algorithm graphs.

By
S t a t e m e n t 7.1
lay vector U
i f and
a v e c t o r t i s a schedule
only i f the
t
hold
for a l l pairs
an a r c o r i g i n a t e s lation ith
(13.1)
o f graph
from
implies
t h e moment
precise
nature
corresponding bytes, for
of
or
The
rage
data
device
viously,
can
determine
quired
Knowing
cells,
tioning,
connected
of
that a l l data
data
fixed
during
algorithm
device
the
Items
We
execution
are
as
the
be
total
bits,
struc-
must
be sto-
number stored.
time of
kept
The
assume
of Ob-
individ-
i n memory, number o f
t h e number o f l e v e l s o f h i e r a r c h y , t h e n e c e s s i t y o f a l l these
con-
a data
some
w o r d may
words
the
be
include
s y s t e m has
i s the storage
many memory c h a r a c t e r i s t i c s ,
etc. Certainly,
re-
operations
etc.
comprise
t i m e moment o n l y one
individual
shall
i t e m s a r e o f t h e same
the computational
times
of
size,
generated
that
Suppose
the j t h o p e r a t i o n .
words.
Let
that
nature
t o them a s
given
the
the
nodes. V a l i d arrays
arcs.
, i.e. immediately a f t e r
are generated
on
by
t o t h e j t h n o d e . The
refer
memory.
a t any
de-
(13.1)
are
t h e most i n t e r e s t i n g p a r a m e t e r
words.
the
and
called
ual
we re-
sec-
c h a r a c t e r i s t i c s depend i n g e n e r a l
on
schedule. Given a schedule
that
a word
pears be
graph
numbers,
will
to the
J
points
depends
somewhere. Suppose t h a t
c e l l s wherein
}
that
some d a t a
items
real
used
corresponding
d u r i n g the execution of
tj
data
H e n c e f o r w a r d we
stored
. u
i
t h a t a t t h e moment t
to algorithm
integer
t
f
nodes
the sake o f d e f i n i t e n e s s
ture.
schedules
inequalities
t h e i t h n o d e and
o p e r a t i o n i s completed,
sumed a t
of
graphs.
and
reading
fetched, assume
another
precedes
are
the a l g o r i t h m
written
i s i n s t a n t l y f e t c h e d as
f e t c h e d and
cells
describing
is instantly
used.
writing. I n case
t h e r e I s no
that
value
the data
must
That one
into
s o o n as be
order
and
the
i m p l e m e n t a t i o n we
a memory c e l l
transmitted
s o o n as
assume i t ap-
i t i s needed. I f a v a l u e
written ensures
at
the
that
same d a t a
no
are
n e e d t o r e f e r e n c e memory a t a l l . are
as
between
same moment excessive t o be
must then
memory
written
I n such cases
the f u n c t i o n a l
and we
units d i -
122 rectly,
t o eliminate
r e d u n d a n t memory r e f e r e n c e s
c r e a s e o f memory s i z e . two
that
s i t u a t i o n can occur
data dependent o p e r a t i o n s a r e performed
responding itself
arcs a r e a s c r i b e d zero
i s irrelevant
about
f o r us h e r e .
the properties
also of
Of c o u r s e ,
o f memory
and t h e p o t e n t i a l i n -
delays.
and t h e p e c u l i a r i t i e s
Certainly, sible
throughput,
i n practice.
of using
However
instant
a c c e s s t i m e by i n c r e a s i n g t h e components o f t h e d e l a y that
vector.
Taking
m a c h i n e we ponding
Therefore
that
of
into
specification
implementa-
o f t h e model
hypothetical corres-
vector.
of a delay
vector only p a r t i a l l y I n spite
a l l o w s f o r the
of that
we
contend
implementations
o f a n a l g o r i t h m o n a n a c t u a l com-
i s always d e s c r i b e d by a subset
o f the s e t o f a l l implementations
o f schedules
disregarded understand graph.
corresponding
peculiarities their
impact
sessing quired
the specified
research
12.1 shows t h a t
gorithm
given
properties
i n precisely
manner.
of finding
the re-
c o n d u c t o u r mem-
In particular, implementation
t h e exo f an a l -
t w o - l e v e l memory c a n be r e d u c e d
f i n d i n g a s p e c i a l k i n d o f a l g o r i t h m graph i s a peculiarity
can
t h e s c h e d u l e pos-
We w i l l
the problem o f e f f i c i e n t system w i t h
and a l g o r i t h m
algorithm implementation
o r by t h e problem
that
o f the
Knowing t h e
i t i s possible to
o f schedules
o f t h e a l g o r i t h m graph.
on a computational
There
vector.
computer,
the properties t h e proper
i . e . by t h e subset delay
replaced by t h e problem o f f i n d i n g
characteristics
ory-related
machine,
to a
of the actual on
The p r o b l e m o f f i n d i n g
t h e r e f o r e be a g a i n
tics
i n t o ac-
o f the s e t o f schedules
o f algorithm implementation.
the set of v a l i d
ample
of algorithm
our refinements
our i n v e s t i g a t i o n
t h a t a l g o r i t h m on a h y p o t h e t i c a l
set
to
the peculiarities
account
continue
peculiarities
puter
v e c t o r . He-
c o m p u t e r s y s t e m c a n be a l l o w e d f o r u s i n g t h e d e l a y
t o a given delay
The
i s impos-
t h e time r e q u i r e d t o execute t h e o p e r a t i o n s i s taken
similarly.
We
i t i s easy t o a l l o w f o r t h e non-
zero
t i o n on a p a r t i c u l a r
i t .
t h e number
t h e number o f i / o p o r t s , e t c .
call count
access
assumptions
o f memory,
t h e a s s u m p t i o n o f memory a c c e s s b e i n g
to f u l f i l l
i n case
The p r o c e s s o f memory
We d o n o t make a n y o t h e r
do n o t assume a n y t h i n g a s t o - t h e s t r u c t u r e
c h a n n e l s and t h e i r
only
s i m u l t a n e o u s l y and t h e c o r -
splitting.
i n the investigation
based on s t u d y i n g t h e a l g o r i t h m graph.
o f memory
As a r u l e ,
characteris-
the traditional
123 notation sults
describes
are
multiplication, tion.
As
the
programs are as a r u l e The
algorithms
division,
result
imply
that
tion.
I t i s obvious,
the word
operation
sides
composed o f
may
be
the considered
will
an
that
precisely
o f any
scalar operation
i s t o be
many t i m e s
we
pay
information building
on
a given
arcs,
as
i t e m s may
simply
t h e same
oper-
result term
to a l l functional
I t i s not be
taken
seen
The
broadcasts.
specified,
every
scalar
A bus
t o them.
actually
and
may
important
used, or
e t c . Yet into
the
there
account.
B r o a d c a s t s m u s t be
may
be
adare That
carefully
[88! u s i n g a l g o r i t h m graphs. broadcasts
an
arc.
argument.
be
systolic arrays
t o one
of
Is clearly
i n terms of
such r e s u l t
must be
opera-
Typically
more t h a n one
or
their
broadcasts
are
each s c a l a r
case the r e s u l t
operations.
are described
i s performed.
special attention
into
obtained
The
while
the
potential
t o a n s w e r m o r e p r e c i s e l y w h a t memory i s n e e d e d t o
implement
a l g o r i t h m . Yet p r o b l e m s on pose a g a i n s t
Our
t h e way.
immediate goal
of
On
this
algorithm In other
same
number
we
shall
must b a l a n c e we
are
the
have
corresponds
complexity
i t can
be
independent
assumed words
as
a
of
more the
the q u a l i t a t i v e c h a r a c t e r i s -
reasonable to
to solve
pursuing.
r e q u i r e d memory p r o p e r t i e s on
i t is quite
graph
of
that
is to investigate
stage
words,
We
the goals
t h e dependence o f
perties.
c o n s i d e r a t i o n indeed provides
i t i s obvious
complicated
the
other
corresponding
result
p r o b l e m s we
data.
by
assumptions
the a l g o r i t h m graph.
opportunity
the
they
needed d a t a
Taking broadcasts
of
i n the general
t o t r a n s m i t every
what a r c s
i s a w o r d . Our
r e a d more t h a n once. T h i s
broadcasting
handled w h i l e b u i l d i n g
tics
be
presence of
statements
s t o r e d i n memory a f t e r
however, t h a t
that
evalua-
also
broadcasting
cases when t h e why
as
re-
addition,
s c a l a r o p e r a t i o n s , most g r a p h nodes w o u l d
to the arcs
consume how
dresses of o f t e n
is
assignment
operations.
the n e c e s s i t y
whose
include
real
examples:
refer
operation
stresses units
most
y e t a l m o s t e v e r y g r a p h node e m i t s
We of
of
scalar
used
operations
in
w o r d s t o r e d i n memory w i l l in
scalar
1. T h e s e o p e r a t i o n s
a l o t of boolean operations, f u n c t i o n
right-hand
represent end
ations,
i n terms o f
v a r i a b l e s of dimension
algorithm
t o assume t h a t
transmission that
every
the
number
of
every
arc
independent
operation of
pro-
arcs
produces that
the
124 corresponding responds
a l g o r i t h m g r a p h node e m i t s . Of c o u r s e ,
t o each a r c .
assumption
the q u a n t i t a t i v e about the
yet
variety
characteristics
2 - 3 . The a s s u m p t i o n
results
rigid
A great
u s u a l l y makes no c h a n g e s
of operations
requirements t h e demands
may
increase
undergo
a number
significantly
by a f a c t o r o f
C l e a r l y , more
i s actually
only
this while
t o store not only
o f doubles.
o n memory t h a n
cor-
that
picture,
a change
t o t h e agreement
but also
may be p l a c e d
testifies
i n the q u a l i t a t i v e
only
amounts
o f examples
o n l y one w o r d
necessary,
i n case
there
are
heavily broadcasting nodes. This can well be the case, as the graph 6.2 from Example 6.2 shows. Despite all that, the number of such nodes is usually relatively small, which allows us not to consider the corresponding broadcasts, taking them into account in a different way.

For the purpose of investigating the computer memory required to execute an algorithm we assume that the algorithm graph does not include broadcasts. We also assume that only those arcs are specified that may have effect on the memory characteristics under examination. If we want to take into account the memory needed to store the input data and the algorithm results, then the i/o nodes must be included in the graph. The absence of input nodes means that we are only interested in the memory required to store the intermediate and final results. If neither input nor output nodes are specified, then we only study the memory required to store intermediate results. Other variants are also possible.

Denote by $N_u(t)$ the minimal number of memory cells that an algorithm implementation described by the schedule $u$ requires at time moment $t$. Let $\alpha_k$ be the number of arcs going out of the kth graph node and $\beta_k$ be the number of arcs going into the kth node. We will refer to $\beta_k$ as the indegree, $\alpha_k$ as the outdegree, $\alpha_k+\beta_k$ as the degree of the kth node, and $\alpha_k-\beta_k$ as its defect. The equality $\beta_k=0$ describes the input nodes. The execution of corresponding operations inevitably generates new data that must be stored, resulting in the increase of the used memory amount. The equality $\alpha_k=0$ describes the output nodes. The execution of corresponding operations implies the output of some data, thereby decreasing the used memory amount. Denote $\delta_k=\alpha_k-\beta_k$ and introduce the step function

\[ \gamma(t) = \begin{cases} 0, & t < 0, \\ 1, & t \ge 0. \end{cases} \]

STATEMENT 13.1. The function $N_u(t)$ has the form

\[ N_u(t) = \sum_k \delta_k \, \gamma(t-u_k). \qquad (13.2) \]
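As a minimal sketch of formula (13.2), the Python fragment below evaluates $N_u(t)$ for a small graph given as a list of arcs. The graph (pairwise summing of four numbers), the node names and the schedule values are illustrative assumptions and are not taken from the text.

```python
from collections import Counter

def defects(arcs):
    """Return {node: outdegree - indegree} for a list of (source, target) arcs."""
    out_deg = Counter(src for src, _ in arcs)
    in_deg = Counter(dst for _, dst in arcs)
    nodes = set(out_deg) | set(in_deg)
    return {k: out_deg[k] - in_deg[k] for k in nodes}

def step(t):
    """Step function: 0 for t < 0, 1 for t >= 0."""
    return 1 if t >= 0 else 0

def memory_in_use(arcs, schedule, t):
    """Evaluate (13.2): the sum over all nodes of defect * step(t - u_k)."""
    d = defects(arcs)
    return sum(d[k] * step(t - schedule[k]) for k in d)

# Assumed example graph: pairwise summing of x1..x4; s3 is the output node.
arcs = [("x1", "s1"), ("x2", "s1"), ("x3", "s2"), ("x4", "s2"),
        ("s1", "s3"), ("s2", "s3")]
# Assumed schedule: all inputs at time 0, then the additions level by level.
schedule = {"x1": 0, "x2": 0, "x3": 0, "x4": 0, "s1": 1, "s2": 1, "s3": 2}

profile = [memory_in_use(arcs, schedule, t) for t in range(-1, 4)]
print(profile)
```

The list `profile` holds the values of $N_u(t)$ for t = -1, 0, 1, 2, 3 under these assumptions.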
The function $N_u(t)$ must be piecewise constant, as it changes only at the times when memory is referenced. Whenever a number of words is fetched from memory, the required memory amount decreases at that moment by that number. If a number of words is stored at some time moment, then the required memory amount accordingly increases from that moment on by that number. As the operation corresponding to the kth node is performed at time moment $u_k$, $\beta_k$ words are fetched from memory to become arguments of the operation, and $\alpha_k$ words are stored in memory as operation results. The formula (13.2) reflects these observations. Obviously, the formula (13.2) dutifully ignores data transfers between the operations that are performed simultaneously.
STATEMENT 13.2. The equality

\[ \sum_k \delta_k = 0 \qquad (13.3) \]

always holds.

Indeed, let the graph comprise $m$ arcs. Each arc points to, and originates from, only one node. Consequently

\[ \sum_k \alpha_k = \sum_k \beta_k = m. \]

It follows that

\[ \sum_k \delta_k = \sum_k \alpha_k - \sum_k \beta_k = 0. \]
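The counting argument behind (13.3) can be checked numerically. The sketch below generates random acyclic arc lists (the generator, the seed and the size limits are arbitrary assumptions made for illustration) and sums the node defects of each graph.

```python
import random
from collections import Counter

def defects(arcs):
    """Return {node: outdegree - indegree} for a list of (source, target) arcs."""
    out_deg = Counter(src for src, _ in arcs)
    in_deg = Counter(dst for _, dst in arcs)
    nodes = set(out_deg) | set(in_deg)
    return {k: out_deg[k] - in_deg[k] for k in nodes}

random.seed(0)  # fixed seed so that the run is repeatable
defect_sums = []
for _ in range(100):
    n = random.randint(2, 10)   # number of nodes
    m = random.randint(1, 20)   # number of arcs
    arcs = []
    for _ in range(m):
        # every arc goes from a lower to a higher node number, so no cycles arise
        i, j = sorted(random.sample(range(n), 2))
        arcs.append((i, j))
    defect_sums.append(sum(defects(arcs).values()))

print(set(defect_sums))
```

Every arc adds one to exactly one outdegree and one to exactly one indegree, which is why each sum vanishes regardless of the graph shape.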
COROLLARY. If all internal graph nodes have zero (nonnegative, nonpositive) defects then the total number of algorithm input data is equal to (not greater than, not less than) the total number of its results.

Suffice it to note that the sum of the defects of input nodes equals the total number of input data, and the sum of the defects of output nodes is opposite to the total number of results. Hence the corollary immediately follows from (13.3).

COROLLARY. The sum of the defects of internal graph nodes is equal to the difference between the total number of algorithm results and the total number of input data.

The equality (13.3) has the straightforward interpretation. For any schedule $u$ all functions $\gamma(t-u_k)$ are unity for sufficiently large $t$. Taking (13.3) into account, we have $N_u(t)=0$ for such $t$. For large $t$ the algorithm execution must have been completed, all results read from memory by the output device, and nothing valuable is stored in memory any more.

The step function is nonnegative and not greater than 1. Using (13.2) we obtain the simplest estimate of required memory size in terms of the defects of algorithm graph nodes:

\[ N_u(t) \le \sum_{k:\,\delta_k>0} \delta_k. \qquad (13.4) \]

Although this estimate is trivial, it is actually reached for some algorithm graphs. Consider e.g. the graph of pairwise summing of $n$ numbers. We have $\delta_k=1$ for $n$ input nodes, $\delta_k=-1$ for $n-2$ internal nodes, and $\delta_k=-2$ for the single output node. The estimate (13.4) implies that $N_u(t)\le n$. We have observed that this estimate is reached at least for those algorithm implementations that input all initial data simultaneously.

The estimate (13.4) is at once notable and crude due to the fact that its right-hand side is independent of the schedule and of time.
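The defect counts quoted for pairwise summing, and the resulting value of the bound (13.4), can be reproduced with a short sketch. The particular pairing order built by `pairwise_sum_arcs` and the node names are assumptions; any other pairing yields the same defects.

```python
from collections import Counter, deque

def pairwise_sum_arcs(n):
    """Arcs of one possible pairwise-summing graph for n >= 2 inputs."""
    queue = deque(f"x{i}" for i in range(1, n + 1))
    arcs, count = [], 0
    while len(queue) > 1:
        a, b = queue.popleft(), queue.popleft()
        count += 1
        s = f"s{count}"              # node of the addition a + b
        arcs += [(a, s), (b, s)]
        queue.append(s)
    return arcs

def defects(arcs):
    """Return {node: outdegree - indegree} for a list of (source, target) arcs."""
    out_deg = Counter(src for src, _ in arcs)
    in_deg = Counter(dst for _, dst in arcs)
    nodes = set(out_deg) | set(in_deg)
    return {k: out_deg[k] - in_deg[k] for k in nodes}

def bound_13_4(arcs):
    """Right-hand side of (13.4): the sum of the positive defects."""
    return sum(d for d in defects(arcs).values() if d > 0)

arcs = pairwise_sum_arcs(8)
histogram = sorted(Counter(defects(arcs).values()).items())
bound = bound_13_4(arcs)
print(histogram, bound)
```

Here `histogram` pairs each defect value with the number of nodes that have it, and `bound` is the schedule-independent estimate for n = 8.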
Consequently, it only estimates the maximum value $N^*$ of all functions $N_u(t)$, i.e. the maximum amount of memory that various algorithm implementations may require. (13.4) does not help to answer whether implementations requiring less memory exist. Using the additional information, better estimates can be obtained for e.g. the maximum amount $N_u^*$ of memory that the algorithm implementation described by a schedule $u$ may require. Suppose that for the schedule $u$ the algorithm graph nodes are enumerated so that the corresponding operations execution times do not decrease, and when those times coincide, the defects of the nodes do not decrease. Then (13.2) implies that

\[ N_u^* = \max_l \sum_{k=1}^{l} \delta_k. \qquad (13.5) \]

Following Fig. 12.3 consider the serial implementation of the pairwise summing algorithm for $n=8$. The estimate (13.4) yields 8. Now let the initial data be input one by one, as soon as they are needed, and let all delays corresponding to input nodes be non-zero. In that case we have

\[ N_u^* = \max(1, 2, 1, 2, 3, 2, 1, 2, 3, 2, 3, 4, 3, 2, 0) = 4. \]

In case the arcs of the input operations are ascribed zero delays, it is obvious that a schedule exists whereby the initial data corresponding to any addition operation may be input during the operation execution itself. Then

\[ N_u^* = \max(-1, 0, 1, 0, 1, 2, 1, 0, 1, 2, 1, 2, 3, 2, 0) = 3. \]

Different values of $N_u^*$ were given by (13.5) for different enumerations of graph nodes corresponding to different algorithm schedules. To derive certain estimates and relations it is expedient to represent the function $N_u(t)$ in a somewhat different fashion than (13.2). Taking into account the nature of the step function, we obtain the following statement.
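The two values found for the serial implementation with $n=8$ can be reproduced by applying (13.5) mechanically. In the sketch below the node names, the concrete time stamps and the helper `peak_memory` are assumptions introduced for illustration; only the order of operations follows the example in the text.

```python
from itertools import accumulate

# Defects for pairwise summing of x1..x8; a7 is the output node (assumed names).
defect = {f"x{i}": 1 for i in range(1, 9)}
defect.update({f"a{i}": -1 for i in range(1, 7)})
defect["a7"] = -2

def peak_memory(schedule, defect):
    """Apply (13.5): sort nodes by (time, defect) and take the largest running sum."""
    order = sorted(schedule, key=lambda k: (schedule[k], defect[k]))
    running = list(accumulate(defect[k] for k in order))
    return max(running), running

# Serial order with inputs read one by one (non-zero delays on input arcs).
serial = ["x1", "x2", "a1", "x3", "x4", "a2", "a3",
          "x5", "x6", "a4", "x7", "x8", "a5", "a6", "a7"]
delayed = {node: time for time, node in enumerate(serial, start=1)}

# Zero delays on input arcs: operands are read at the time of their addition.
direct = {"a1": 1, "x1": 1, "x2": 1, "a2": 2, "x3": 2, "x4": 2, "a3": 3,
          "a4": 4, "x5": 4, "x6": 4, "a5": 5, "x7": 5, "x8": 5, "a6": 6, "a7": 7}

peak_delayed, sums_delayed = peak_memory(delayed, defect)
peak_direct, sums_direct = peak_memory(direct, defect)
print(peak_delayed, peak_direct)
```

The lists `sums_delayed` and `sums_direct` are the partial sums that appear under the maximum sign in the two displayed formulas.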
STATEMENT 13.3. The function $N_u(t)$ has the form

\[ N_u(t) = \sum_{k:\,u_k\le t} \delta_k. \qquad (13.6) \]

If the defects $\delta_k$ of all nodes were nonnegative then the formula (13.6) would imply that $N_u^*$ is independent of the schedule $u$ and equals the sum of the defects of all nodes. That would mean that all algorithm implementations are equivalent with respect to the required memory amount. This amount would easily be estimated using the algorithm graph. Generally speaking, (13.3) provides no grounds for our surmise. Any graph includes both positive defect and negative defect nodes. Nonetheless there are some practically important situations that are close to the one described above. The defects are necessarily negative for all output nodes and necessarily positive for all input nodes. Besides, the internal nodes turned out to have defects of the same sign in all the examples we considered.

STATEMENT 13.4. Let the defects of all internal nodes of the algorithm graph be negative (positive) or zero. Then the memory amount that any algorithm implementation requires does not exceed the total number of input data (the total number of results).

Let the schedule $u$ correspond to some algorithm implementation. Using it, enumerate algorithm graph nodes so as to satisfy the conditions of (13.5). Suppose the maximum of the right-hand side of (13.5) is reached for $l=l'$. Then

\[ N_u^* = \sum_{k=1}^{l'} \delta_k. \]

We represent this sum as follows:

\[ \sum_{k=1}^{l'} \delta_k = \Sigma'_{inp} + \Sigma'_{out} + \Sigma'_{inn}, \]

where $\Sigma'_{inp}$, $\Sigma'_{out}$, and $\Sigma'_{inn}$ stand for the sums of defects of input, output, and internal nodes, respectively, provided that the numbers of the involved nodes are not greater than $l'$. The sum of defects of all nodes is represented analogously:

\[ \sum_k \delta_k = \Sigma_{inp} + \Sigma_{out} + \Sigma_{inn}. \]

Note that $\Sigma'_{inp}$ and $-\Sigma'_{out}$ are nonnegative, and $\Sigma'_{inp} \le \Sigma_{inp}$. According to the hypothesis of the statement, $\Sigma'_{inn}$ and $\Sigma_{inn}$ have the same sign, and the way we defined these sums implies the inequality $|\Sigma'_{inn}| \le |\Sigma_{inn}|$. Furthermore, $\Sigma_{inp}$ equals the total amount of input data and $-\Sigma_{out}$ equals the total number of algorithm results. In case the defects of internal nodes are nonpositive we conclude that

\[ N_u^* = \Sigma'_{inp} + \Sigma'_{out} + \Sigma'_{inn} \le \Sigma'_{inp} \le \Sigma_{inp}. \]

In case the defects of internal nodes are nonnegative, taking into account (13.3) we obtain

\[ N_u^* = \Sigma'_{inp} + \Sigma'_{out} + \Sigma'_{inn} \le \Sigma_{inp} + \Sigma_{inn} = -\Sigma_{out}. \]

The obtained estimates of $N_u^*$ prove our Statement.

Thus an upper bound for the required memory amount can easily be found using the algorithm graph if Statement 13.4 is applicable. The bound itself does not depend on the implementation. If the implementation enforces storage of all input data in memory and also keeping all results in memory then this bound yields the size of actually used memory. This demand is easily allowed for by restricting the function (13.2) to the schedules whereby the input of all data is completed before, and the output of results starts after, all the operations are performed. Hence the following statement holds:

STATEMENT 13.5. Let the defects of all internal nodes of an algorithm graph have the same sign. If the input data and the algorithm results must be stored in memory, then the required memory amount does not depend on the algorithm implementation and is equal to the maximum of the total amount of input data and the total number of results.

We are now able to assess the quality of the estimate (13.4). If the hypotheses of Statement 13.5 hold then it is exact and yields the size of actually used memory. If the defects of internal nodes are nonnegative then the sum in the right-hand side of (13.4) is equal to the total number of algorithm results. In case the defects of internal nodes are nonpositive that sum equals the total amount of input data. If the input data and the results must needs be stored in memory, the estimate (13.4) is quite reasonable provided that the majority of internal nodes have defects of the same sign, and the number of nodes whose defects are of the opposite sign is negligible. The latter defects can always be made zero by adding several fictitious input and output nodes to the algorithm graph. The conditions of Statement 13.5 would thus be met and the estimate (13.4) would become exact again. We can always satisfy the hypotheses of Statements 13.4 and 13.5 by adding fictitious input and output nodes. The procedure accordingly increases the number of algorithm results, or of input data, or both. The number of added fictitious nodes is thus a measure of roughness of the estimate (13.4).

In
case s t o r i n g
a l l the input data
t h e amount o f memory a c t u a l l y
and r e s u l t s
t i v e d e f e c t o p e r a t i o n s were p e r f o r m e d b e f o r e ations. this.
The
pairwise
In particular,
summing
example
the smaller
before
general,
certain
i f some nega-
some p o s i t i v e d e f e c t
considered
memory
c a s e when c e r t a i n a d d i t i o n o p e r a t i o n s ecuted
i s not obligatory,
used w o u l d be s m a l l e r o n l y
amount
above
was
oper-
testifies
sufficient
to
i n the
(having n e g a t i v e d e f e c t s ) g o t ex-
input operations
(having
positive
defects),
t o r e d u c e t h e r e q u i r e d memory amount t h e n e g a t i v e d e f e c t
In
oper-
a t i o n s a r e t o be e x e c u t e d a s s o o n a s p o s s i b l e . The tional is
needed
equal
r e q u i r e d memory amount may
u n i t s exchange d a t a d i r e c t l y . f o r those
one a n o t h e r .
hand
side
were
t o be s t o r e d ,
schedules
The f o r m u l a
becomes z e r o . then
But
also
be s m a l l e r
whereby
a l l operation
(13.5) c o r r o b o r a t e s i f a l l input data
memory w o u l d
i n case
F o r e x a m p l e , i f u=Q
the func-
t h e n n o memory
execution
this,
and a l g o r i t h m
be n e e d e d o n c e
times
as i t s r i g h t -
again.
results
Statements
131 13.4
a n d 13.5 s t i l l
h o l d For t h a t s i n g u l a r
only
some o f t h e c o m p o n e n t s o f a> a r e 2 e r o
tions are performed According is
Instantly
stantly the
t o t h e above a s s u m p t i o n s each r e s u l t a memory c e l l
upon
of
the delay
as f u n c t i o n a l
itself,
opera-
So f a r we h a v e
Therefore
and h y p o t h e t i c a l
units,
e t c . Nothing
vector.
communication
prevents
us f r o m
disregarded
the notion
devices
memory may r e -
i/o ports,
that
data
memory
may
travel
between t h e d e v i c e s ,
or within
presumption
p o s s i b l e s i n c e s o f a r we n e v e r made u s e o f a n y d e -
tails
i s quite
o f memory s t r u c t u r e .
The s t u d y
devices
of
where d a t a
channels,
presuming
individual
o f every o p e r a t i o n
i t s appearance and I s i n -
f e t c h e d as soon as i t i s r e q u i r e d .
include a l l actual
side,
and t h e c o r r e s p o n d i n g
simultaneously.
stored into
formation
should
c a s e . They a l s o h o l d i n case
w i t h o u t change.
of the functional
(13.2)
This
produces
good r e s u l t s a s l o n g a s i t i s known t h a t o n l y a r e l a t i v e l y
s m a l l number
of
memory c e l l s
itself)
be
used
belonging
f o r storage.
we e i t h e r
have
Consider we
pose
present
I n case
t o take
vector o r modify
the following
a question:
'What
systolic
sample s i t u a t i o n . properties
hardware
i s not included
times,
required to perform
should consider
an a l g o r i t h m
o f the algorithm that
graph
graph
must
be
does n o t r e -
processing
elements,
[ 8 8 ] . Of c o u r s e ,
(13.2)
alone
cannot
help
that
i s , on the d e l a y
i n the functional
some
properties
us d e t e c t t h e
vector, but this i n -
( 1 3 . 2 ) . Denote
t h e j t h o p e r a t i o n . The t i m e s hj
storage
and a l s o
the said
T h e r e q u i r e d memory amount may d e p e n d o n t h e
execution
as d a t a
Given
verification.
the functional
properties.
formation
counted
i s considerable
functional.
arrays, pipeline
operation
time
t h e number o f s u c h c e l l s
must
consideration the formation of the delay
of a constructive
Obviously, corresponding
( b e s i d e s memory
intermediate results?' This question arises while
kinds o f parallel
should admit
units
t h e e x i s t e n c e o f an i m p l e m e n t a t i o n
q u i r e memory t o s t o r e
other
into
o u r memory
t o ensure
building
t o other
times.
the functional
I t follows
that
b y h.
the
a r e n o t t o be
instead of
(13.2)
we
o f the form
(13.7) k
Here $u_k$ is the moment of completion of the $k$th operation. It remains to find out on what conditions this functional is identically equal to zero. Instead of (13.7) consider a more general functional

$$R_u(t) = \sum_{i,j} \{\, y(t - (u_j - \tau_{ij})) - y(t - (u_i + \nu_{ij})) \,\}. \qquad (13.8)$$

Here the sum is taken over all $i, j$ for which there is an arc going out of the $i$th node into the $j$th node. The meaning of the functional (13.8) is downright simple. Whenever a result of the $i$th operation is passed as an argument to the $j$th operation, it is stored when $\nu_{ij}$ time units elapse after its appearance and is fetched from memory $\tau_{ij}$ time units earlier than it is used. As reading must not precede storage, the numbers $\tau_{ij}$ and $\nu_{ij}$ must be nonnegative and the functional $R_u(t)$ should be restricted to those schedules $u$ for which $u_j - u_i \ge \tau_{ij} + \nu_{ij}$. In particular, if $\omega_{ij} \ge \tau_{ij} + \nu_{ij}$ for all valid pairs $i, j$ then the conditions $u_j - u_i \ge \tau_{ij} + \nu_{ij}$ hold automatically. Clearly, $R_u(t) = N_u(t)$ if $\tau_{ij} = \nu_{ij} = 0$ for all $i, j$, and $R_u(t)$ coincides with the functional (13.7) if $\nu_{ij} = 0$, $\tau_{ij} = h_j$ for all $i, j$. It is also obvious that under our assumptions the functional $R_u(t)$ is nonnegative.

STATEMENT 13.6.
The functional $R_u(t)$ is identically equal to zero for at least one schedule if and only if the algorithm graph either does not have cycles at all or all its cycles are balanced with respect to the weights $\tau_{ij} + \nu_{ij}$.

Suppose there exists such schedule $u$ for which the functional $R_u(t)$ is identically equal to zero. This means that nothing is ever stored in memory. This is possible only in case the moments of storing coincide with the moments of reading for all results of all operations, i.e. the equalities $u_i + \nu_{ij} = u_j - \tau_{ij}$ hold for all $i, j$. According to Statement 7.8 the system of linear algebraic equations $u_j - u_i = \tau_{ij} + \nu_{ij}$ is compatible only if the algorithm graph either does not have cycles or its cycles are balanced with respect to the weights $\tau_{ij} + \nu_{ij}$. Suppose now that the graph does not have cycles or is balanced. By virtue of Statement 7.8 the corresponding system is compatible. Taking one of its solutions as a schedule we have that the moments of storing coincide with the moments of reading for all results. Thus the functional $R_u(t)$ would be identically equal to zero for that schedule.

Consequently, for the existence of an algorithm implementation that does not require memory to store intermediate results it is necessary that the algorithm graph be balanced with respect to operation execution times. This condition could not have been deduced using the functional (13.2) only.

Generally speaking, different problems beget various functionals describing the amount of required memory. However tempting it may be to continue their investigation, we will not set out to do it. Our immediate goals are covered by the functional (13.2).

STATEMENT 13.7. Given an algorithm graph, the following identities hold for any $t$ for any two schedules $u$ and $v$:

$$N_u(t) + N_v(t) = N_{u \oplus v}(t) + N_{u \otimes v}(t) = N_u(t) \oplus N_v(t) + N_u(t) \otimes N_v(t). \qquad (13.9)$$
Given an algorithm graph, (13.6) yields that for any $t$ the value of the memory functional is solely determined by the set of nodes for which the corresponding algorithm operations have been completed by the moment $t$. Let $A$ be the set of nodes that determines $N_u(t)$ and $B$ the set of nodes that determines $N_v(t)$. Then one of the sets $A \cup B$, $A \cap B$ determines $N_{u \oplus v}(t)$ and the other determines $N_{u \otimes v}(t)$. The sum $N_u(t) + N_v(t)$ is determined by both sets $A$ and $B$ or, equivalently, by $A \cup B$ and $A \cap B$. It follows that the first identity of (13.9) indeed holds. The first and the last terms of (13.9) are certainly identical, which readily follows from the obvious identity

$$a + b = \max(a, b) + \min(a, b),$$

which is true for any two numbers $a$, $b$.

Unlike the time functional, the memory functional depends not only on schedules, but also on the defects of the nodes. In general the extreme values of both functionals would be reached on different schedules. It follows that it is impossible to find in the general case the schedule which would minimize both algorithm execution time and the required memory amount.

Let
$\omega$ be a delay vector. Consider the set of schedules $u$ for which all initial data are inputted at $t = 0$ and all results are outputted at $t = T$. Both zero and unity (with respect to the operations $\oplus$, $\otimes$) exist in such set for any algorithm graph. We have observed while proving Statement 13.7 that for any two schedules $u$ and $v$ the sets of nodes determining $N_{u \oplus v}(t)$ and $N_{u \otimes v}(t)$ are the intersection and the sum of the sets of nodes determining $N_u(t)$ and $N_v(t)$. Therefore under our assumptions the following inequalities hold for any schedule $u$, provided that the defects of internal nodes are nonnegative:

$$N_1(t) \le N_u(t) \le N_0(t).$$

In case the defects of internal nodes are nonpositive

$$N_0(t) \le N_u(t) \le N_1(t).$$

The "zero" vector schedules each operation to be executed as soon as possible, implying the minimization of the time functional. However at each moment of its execution maximum memory is required in the first case and minimum memory in the second case. The following relations obviously hold for the first case:

$$N_{u \oplus v}(t) \le N_u(t) \otimes N_v(t), \qquad N_{u \otimes v}(t) \ge N_u(t) \oplus N_v(t),$$

and the opposite relations hold for the second case:

$$N_{u \oplus v}(t) \ge N_u(t) \oplus N_v(t), \qquad N_{u \otimes v}(t) \le N_u(t) \otimes N_v(t).$$

This example suggests that a more thorough scrutiny of the memory functional properties requires more detailed information both on the algorithm graph and the set of valid schedules.
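Identity (13.9) of Statement 13.7 can be checked numerically on a small graph. The sketch below is only an illustration under stated assumptions: the memory functional is taken to be the number of arcs whose source operation is completed by time t while the target is not, the operations on schedules are assumed to be the componentwise maximum and minimum, and the graph, the two schedules and all function names are hypothetical.

```python
# Hypothetical example graph: an arc (i, j) means "the result of i is an argument of j".
ARCS = [(0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]

def memory(arcs, u, t):
    """Count arcs whose source is completed by time t and whose target is not."""
    return sum(1 for i, j in arcs if u[i] <= t < u[j])

def is_valid(arcs, u):
    """Assumed validity rule: every arc has a delay of at least 1."""
    return all(u[j] - u[i] >= 1 for i, j in arcs)

def join(u, v):
    """Assumed meaning of the first schedule operation: componentwise maximum."""
    return [max(a, b) for a, b in zip(u, v)]

def meet(u, v):
    """Assumed meaning of the second schedule operation: componentwise minimum."""
    return [min(a, b) for a, b in zip(u, v)]

def check_identity(arcs, u, v, horizon):
    """Check both equalities of (13.9) at every integer time up to horizon."""
    for t in range(horizon + 1):
        nu, nv = memory(arcs, u, t), memory(arcs, v, t)
        combined = memory(arcs, join(u, v), t) + memory(arcs, meet(u, v), t)
        if not (nu + nv == combined == max(nu, nv) + min(nu, nv)):
            return False
    return True

# Two valid schedules, neither of which dominates the other.
u = [1, 1, 2, 3, 4]
v = [2, 1, 3, 2, 4]
assert is_valid(ARCS, u) and is_valid(ARCS, v)
print(check_identity(ARCS, u, v, horizon=5))
```

The check relies on both schedules being valid: for a valid schedule the set of completed nodes contains every predecessor of each of its members, so no arc can join the two set differences and the arc counts add up exactly.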
14. Hierarchical Memory

Suppose that a computational system does not have enough RAM to store all needed data. External memory has to be used in that case. As a rule, external memory access time by far exceeds both that of RAM and the average arithmetic operation execution time. To avoid the substantial increase in the overall problem solution time due to external memory references, they should often be performed in background as the main computation goes on. Obviously, in that case the cumulative external memory access time should be considerably less than the total time required to perform all arithmetic operations. Alternating computations and external memory exchanges proves to be more practical; in that case the cumulative external memory access time should not be much greater than the total time required to perform all arithmetic operations.

Not every algorithm admits of an effective use of external memory. Example 12.2 that we considered earlier shows that there exist algorithms for which the number of external memory references is on the whole proportional to the number of arithmetic operations. If the external memory access time is not very small, no implementations of such algorithms yield negligible overall time of external memory referencing. It is therefore very important to understand what algorithms admit of the effective use of external memory and what properties of algorithm graphs are responsible for it.

To
avoid overcrowding our research by unessential details we are making several simplifying assumptions. Let the computational system have two-level memory and any arithmetic operation be performed in unit time. Suppose that RAM has limited storage capacity and allows instantaneous conflict-free access to any of its cells to any number of functional units. This implies that only those schedules will be considered that correspond to delay vectors whereof all components are not less than 1. Let external memory be arbitrarily large. We shall not be currently concerned about the precise access mode, the number of communication channels, whether these channels are pipelined, etc. The only thing that matters is that data exchanges be invariant with respect to time.
In particular, consider a set of exchanges and let them start at time moments $t_1, t_2, \ldots$ Assume that for any $\tau$ similar exchanges may also be performed at time moments $t_1 + \tau, t_2 + \tau, \ldots$, provided that no other exchanges occur at these moments. The main characteristic of external memory is the time required to perform a single exchange. Denote that time by $q$, where $q > 1$ or even $q \gg 1$. Assume also that all external memory data exchanges go via RAM.

We have
noted that external memory exchanges go either in background or intermittently as the main computation goes on. Suppose that the exchanges are performed in background. We modify the computation process in the following way. Partition the time interval in which the algorithm is executed into smaller intervals. Assume that the operations performed within each of the smaller intervals do not use any data that are written in external memory in that same time interval. Now modify the execution process on every smaller interval as follows. First, read from external memory everything that is to be read, then perform all the arithmetic operations, and finally, perform all writes to external memory.

These transforms result in a new algorithm implementation that alternates external memory references and computation. Besides, these references are now grouped. Switching from background to intermittent exchanges generally causes an increase in overall algorithm execution time. Yet it is easy to see that, the partitioning intervals having length $2q$, the increase factor is not greater than 4. Indeed, all data needed to execute any fragment can be read in $3q$ time units. The fragment itself can be executed in $2q$ time units. Finally, all data that are to be stored in external memory can be written in time $3q$. The overall time needed to complete the fragment does not exceed $8q$. Note that this upper bound is guaranteed for any algorithm graph, any $q$, any algorithm execution time, both with synchronous and asynchronous modes of computation, any mode of access to external memory, etc. The only important thing is that the exchanges should be invariant in time, as we have mentioned earlier.

As a rule, little increase in algorithm execution time actually occurs. Of course, having formally brought computation and external memory exchanges apart, we can again perform the exchanges as a background job in practice. But now these exchanges can be grouped.

Thus we will consider those algorithm implementations whereby computation and external memory references go as alternating processes. That means that first data are read from external memory into RAM, then some computations are performed, then data are written from RAM into external memory, new data are read in, etc.
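The arithmetic behind the factor 4 can be written out explicitly. The following is a minimal sketch, assuming the original run of length T is cut into intervals of length 2q and each interval is replaced by a read phase of at most 3q, a compute phase of at most 2q and a write phase of at most 3q; the function name and the sample values of T and q are hypothetical.

```python
import math

def alternating_time_bound(total_time, q):
    """Upper bound on run time after regrouping background exchanges.

    The original run is cut into intervals of length 2q. Each interval is
    replaced by three phases: read (<= 3q), compute (<= 2q), write (<= 3q).
    """
    intervals = math.ceil(total_time / (2 * q))
    per_interval = 3 * q + 2 * q + 3 * q  # 8q in total
    return intervals * per_interval

T, q = 1000, 10
bound = alternating_time_bound(T, q)
print(bound, bound / T)
```

When T is a multiple of 2q the ratio of the bound to T is exactly 4. Otherwise the last, incomplete interval is rounded up to a full one, so the ratio can slightly exceed 4 for short runs.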
Every algorithm has a set of its implementations. With each implementation, some data are stored in external memory, some other data are stored in RAM. It follows that each implementation requires a definite amount of RAM. The arising problems include:
- given an upper bound for run time, find the algorithm implementation that minimizes RAM usage;
- find out what properties of algorithm graph influence the amount of required RAM.
The list of such problems can be carried on. We will investigate only a certain number of them.

Recall
Statement 13.5. The estimate it yields is implementation-independent. In particular, this estimate would be the same for all algorithm implementations both on serial and parallel computers. It is worth while to consider those specific cases when the required memory estimate can be sharpened. The following simple statement is of particular importance for our problems.

STATEMENT 14.1. Let an algorithm graph consist of $s$ disjoint subgraphs and certain implementations of the corresponding fragments require RAM amounts $p_1, \ldots, p_s$ respectively. Then the estimate

$$N_u(t) \le \max_{1 \le i \le s} p_i \qquad (14.1)$$

holds at least for those schedules $u$ that execute the said fragments consecutively.

Indeed, the memory is freed on completion of each fragment provided the fragments are executed consecutively. Therefore the estimate (14.1) certainly holds.
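Estimate (14.1) can be illustrated on two disjoint fragments. The sketch below is not the author's construction: it assumes that memory usage at time t equals the number of arcs whose source is completed and whose target is not, and it uses hypothetical "fork-join" fragments (one source, several middle nodes, one sink) with hypothetical helper names.

```python
def peak_memory(arcs, schedule):
    """Largest number of arcs (i, j) with schedule[i] <= t < schedule[j] over integer t."""
    times = range(min(schedule.values()), max(schedule.values()) + 1)
    return max(sum(1 for i, j in arcs if schedule[i] <= t < schedule[j]) for t in times)

def fork_join(name, width):
    """Hypothetical fragment: one source feeds `width` nodes that feed one sink."""
    src, dst = f"{name}_src", f"{name}_dst"
    mids = [f"{name}_m{k}" for k in range(width)]
    arcs = [(src, m) for m in mids] + [(m, dst) for m in mids]
    return src, mids, dst, arcs

def schedule_fragment(src, mids, dst, start):
    """Run the source at `start`, all middle nodes one step later, the sink after that."""
    sched = {src: start, dst: start + 2}
    sched.update({m: start + 1 for m in mids})
    return sched

a = fork_join("a", 3)  # needs 3 words on its own
b = fork_join("b", 5)  # needs 5 words on its own
arcs = a[3] + b[3]

# Fragment b starts only after fragment a has finished.
consecutive = {**schedule_fragment(*a[:3], start=1), **schedule_fragment(*b[:3], start=4)}
# Both fragments run side by side.
simultaneous = {**schedule_fragment(*a[:3], start=1), **schedule_fragment(*b[:3], start=1)}

print(peak_memory(arcs, consecutive))   # bounded by max(3, 5)
print(peak_memory(arcs, simultaneous))  # the two requirements add up
```

With consecutive execution the peak equals the larger of the two fragment requirements, as (14.1) predicts; running the fragments side by side needs their sum.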
COROLLARY. L e t a n a l g o r i t h m g r a p h
consist
of
s disjoint
subgraphs
138 and
the
mum
of
ith
fragment
is
that
execute
the
defects the
of
p^
13.5
of
the
fragments
( 1 4 . 11
provided
a
the requirement
s m a l 1 amount
of
In that
fied
as
use
f a r as
bottleneck
partitioned
into
[u,
v)
whose
end
I f we
two
its
end
placing
into
of
the
schedules
sharper
the
and
time
The
that
large
time
fragments
they
are
on
be
each i n -
may
prove
to
one
by
i s well as
our
to
consecutive
executed
i s concerned,
usefulness
of
data
the
of
fragment
result
input
requirement
same
as
than
o f each graph
course
that the
two
no
justi-
other
t i m e . The
of Statement
point.
The
the data
that
and
}
of
the
such
different
set
that
sets
i s a directed
im-
chief
14.1
conexter-
o f nodes V
f o r a l l arcs the
relations
of
the graph
i t from
t h e graph
cut
V ) then d e l e t i n g
(V ,
use
and
first
G^.
The
corresponding
i n f o r m a t i o n b o r n e by the f o l l o w i n g
attachment
transfer then
of
admits
corresponding
to
nodes
to
in G . I t
of the algorithm into
consecutively.
these
described inputting
by
the arcs of the
procedure.
o u t p u t node t o i t s o r i g i n
these d a t a
graph C always
a l l operations
cut defines „ s p l i t t i n g
be e x e c u t e d
loss of
d e l e t e d we
cut
a l l operations
can
a new
s e t s Jf
s u b g r a p h s G]
directed
the
Suppose
elements
execute
any
that
graph.
set o f such a r c s
that
link
outputting
all
PJ.
then
that
being
at
resources
disjoint
and
fragments
deleted,
(IT,
G
To a v o i d cut
two
p o i n t s are
schedules
i n G^
follows
results
to s u b s t a n t i a l l y decrease that
know a d i r e c t e d
G would s p l i t
nodes
imposes
Sufficiently
a directed
v e l ^ h o l d . The
i s denoted
such
of for
maxi-
efficiently.
is
of
i s of
forcing
memory u s a g e .
be
and
This
computer
the
consecutively.
significantly
14.1
If
i t shows t h e ways t o i m p l e m e n t t h e a l g o r i t h m u s i n g
G={V,E)
and
holds
implementation
RAM.
sign.
number
of a l l algorithm results
likely
i s now
memory
the
(14.1)
algorithm execution
are
i n that
Let
of
and
same
case t h e c o n s e c u t i v e e x e c u t i o n o f f r a g m e n t s
plementations
uel^
of
the fragments.
make e f f i c i e n t one.
nal
be the
a l g o r i t h m fragment,
execution
sists
may
i n memory. S t a t e m e n t
dividual
data
the
algorithm
that
requires
have
estimate
of
dropping stored
nodes
input
then
estimate
Statement
internal
number
total
The
all
new
and
As
nodes can
the arc
them i n t o
each arc
a new
being
be
directed i s being
input
node to
viewed
deleted
as
by
the o t h e r subgraph.
Whenever we deal with algorithm graph splitting using directed cuts we assume that the corresponding input and output nodes are always added.

So far we restrict our consideration to the consecutive execution of individual fragments. We also assume that any fragment execution begins by reading in memory all input data and terminates by outputting all the results. It follows that the same set of memory cells may be used for the consecutive execution of individual fragments.

STATEMENT 14.2. Let internal nodes of an algorithm graph have nonnegative (nonpositive) defects. Consider any directed cut that does not split output (input) nodes. Then any implementation that executes both fragments consecutively requires one and the same amount of memory as the execution of the entire algorithm.

Consider for the sake of definiteness the case of nonnegative defects of internal nodes. Consider a directed cut. Any splitting of the algorithm graph induced by the cut, followed by the attachment of corresponding input and output nodes, does not change the defects of internal nodes. In case the cut does not divide the output nodes, their number is the same for the second subgraph and for the entire graph of the algorithm. By virtue of Statement 13.5 the same amount of memory is needed to execute the algorithm itself and its second fragment. The total number of output nodes of the first subgraph is not greater than the total number of input nodes of the second subgraph, the latter in its turn not exceeding the total number of its output nodes. It follows that the implementation of the first fragment does not require more memory than the implementation of the second one. The case of nonpositive defects of internal nodes is treated in exactly the same way, the required memory amount being now determined by the total number of input nodes.

COROLLARY. Let the defects of all internal nodes of an algorithm graph be of the same sign or zero. Suppose a directed cut splitting the graph into two subgraphs involves $p$ arcs. Then any implementation of either algorithm fragment requires at least $p$ words of memory.
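The quantity p in the Corollary is easy to compute for a given partition of the nodes. The following is a minimal sketch under stated assumptions: the function name `directed_cut` and the small grid graph are hypothetical, and a partition is accepted as a directed cut only if no arc leads from the second part back into the first.

```python
def directed_cut(arcs, first_part):
    """Return the arcs going from `first_part` to the remaining nodes.

    Raises ValueError if some arc points back into `first_part`, because then
    the two parts cannot be executed one after the other.
    """
    first = set(first_part)
    backward = [(i, j) for i, j in arcs if i not in first and j in first]
    if backward:
        raise ValueError(f"not a directed cut, backward arcs: {backward}")
    return [(i, j) for i, j in arcs if i in first and j not in first]

# Hypothetical 2 x 3 grid graph: node (r, c) feeds (r, c + 1) and (r + 1, c).
nodes = [(r, c) for r in range(2) for c in range(3)]
arcs = [((r, c), (r, c + 1)) for r, c in nodes if c + 1 < 3]
arcs += [((r, c), (r + 1, c)) for r, c in nodes if r + 1 < 2]

# Take the first column as the first fragment.
cut = directed_cut(arcs, first_part=[(0, 0), (1, 0)])
print(len(cut))
```

When the defects of all internal nodes have the same sign or are zero, the number printed is, by the Corollary, a lower bound on the memory needed by either fragment after the matching output and input nodes have been attached.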
If we use a sequence of directed cuts to slice an algorithm graph into subgraphs then the implementation of the whole algorithm can be broken into a sequence of implementations of individual fragments. Supposing every fragment requires a small amount of memory, the implementation of the whole algorithm will also require small memory amount.

Statement 14.2 shows that partitioning into subgraphs may not typically result in the decrease of required memory amount. That amount can remain unchanged for a long time as graph splitting progresses. Its decrease usually does not begin until the process is close to its end, when the sets of both input and output nodes get split to a substantial extent.

Notice that the reduction of the required memory amount which we describe is to a certain extent fictitious. To execute the algorithm as a whole, data exchange between fragments must be performed. These data must be stored somewhere, and some additional memory is therefore needed. By memory we always mean random-access memory here. As for the input and output nodes added in the course of the splitting process, we will assume that they represent data referencing external memory devices.

The algorithm graph splitting process may be viewed as a search for the implementation requiring a specified amount of RAM. Obviously, such an implementation always exists. Furthermore, RAM usage can in fact be reduced to the minimum if only the arguments and the results of each individual operation are stored. The fragments of the corresponding algorithm graph splitting are individual nodes.

If RAM amount is given, then the time of exchanges with external memory becomes critical. Each cut requires that a number of additional writes and reads equal to the number of arcs in that cut be performed. Consequently the effect of the time of external memory references on overall algorithm execution time can be effectively controlled as graph splitting progresses. Exchanges with external memory will not take considerable time if we manage to find such splitting that the number of input data and results of each fragment is small in comparison with the number of operations within that fragment. Of course, this comparison is valid only in case the computational system has only one processor and one external memory communication channel. If this is not the case then our arguments must be properly adjusted.

Thus we have
essentially reduced the problem of efficient use of external memory to the following question: 'Does algorithm graph admit of directed cuts such that the number of arcs in it is considerably less than the number of nodes in the corresponding subgraphs?' If at least one such cut exists, the implementation of the entire algorithm may be reduced to the implementation of the two fragments defined by it. The relatively small increase in the overall algorithm execution time is guaranteed by the smallness of the number of arcs belonging to the cut. Any further splitting of the algorithm involves answering the same question with respect to each of the fragments.

Suppose further that the algorithm is executed effectively, i.e. exchanges with external memory are performed and the cumulative time of external memory referencing is small as compared with the time of arithmetic operations execution. Establishing the correspondence between groups of external memory references (or individual references) and directed cuts we obtain a splitting of our algorithm into consecutively executed fragments. The total number of all arcs of all cuts is small with respect to the total number of graph nodes. It follows that at least one directed cut exists whose number of arcs is substantially smaller than the number of nodes in the subgraphs it defines. The search for other cuts involves the analysis of the fragments.

The geometric
interpretation of a directed cut is self-evident if we regard algorithm graphs as geometric objects. Directed cuts may also be treated from the functional point of view. Let some directed cut split algorithm graph nodes into the subsets $V_1$ and $V_2$. Suppose it comprises $p$ arcs. Consider any schedule corresponding to delay vector $\omega$ whereby the operations from $V_1$ are executed first. Let $t$ be any time moment between the execution of different groups of operations. The quantity $p$ determines the data that are transferred from $V_1$ to $V_2$ and must be stored at the moment $t$. Therefore the amount of memory the algorithm implementation requires at the moment $t$ is $p$. Yet it is only the value of the memory functional at that moment. Consequently the existence of a directed cut comprising $p$ arcs implies the existence of such schedule $u$ and time moment $t$ that the value of memory functional $N_u(t)$ is $p$.

Given an algorithm graph, consider some schedule $u$ and time moment $t$ such that $N_u(t) = p$. Suppose no operations are performed at the moment $t$. All algorithm graph nodes fall into two disjoint groups for which $u_i < t$ and $u_i > t$, respectively. It is easy to see that $N_u(t)$ is the number of algorithm graph arcs that originate from the nodes of the first group and point to nodes of the second group. Therefore all such arcs form a directed cut for any $t$, $N_u(t)$ being the number of arcs in
it.

It is important to have various criteria to determine what directed cuts (if any) a given algorithm graph admits. One of these is procured by

STATEMENT 14.3. Let an algorithm graph comprise $p$ pairs of nodes $(z_1, w_1), \ldots, (z_p, w_p)$ with distinct end points. Suppose that there are paths linking $z_i$, $w_i$ and that these paths possess a common node. Then any algorithm implementation under the schedule corresponding to a delay vector with nonzero components requires at least $p$ words of memory.

Denote by $u_k$ the moment at which the common node operation is to be executed. A certain amount of data is generated by the nodes $z_1, \ldots, z_p$ previously to the moment $u_k$. These data will be consumed by the nodes $w_1, \ldots, w_p$ later than $u_k$. Consequently, $N_u(u_k) \ge p$.

This criterion implies that we cannot do with small amount of RAM while implementing the algorithm defined by the graph in Fig. 12.2. To prove this it is sufficient to take nodes of adjoining levels as $z_1, \ldots, z_p$ and $w_1, \ldots, w_p$ in Statement 14.3. The nodes having the maximum values of coordinates $i$, $j$ are common for all paths linking these pairs of nodes. The amount of data that must be stored at the moment of execution of the corresponding operations equals the number of points in one level. Reading and writing these data must be done during the execution of operations of both these levels. Obviously, it is necessary that external memory access time should not be much greater than operation time for the memory exchange process not to slow down significantly the algorithm execution. This implies that all or almost all data has to be stored in RAM.

Notice that the consideration of parallel implementations is not obligatory for bounding the required memory amount. In the general case the study of serial implementations is sufficient.

STATEMENT 14.4. The global maximum of required memory amount is reached on serial implementations.
Let $N(t_0)$ be the global maximum, reached on some schedule at the moment $t_0$. We can assume without loss in generality that no operations are being executed at the moment $t_0$. Then $N(t_0)$ is equal to the number of arcs carrying data at the moment $t_0$, i.e. the arcs originating from the nodes whose execution was over by the moment $t_0$ and pointing to the nodes that are to be executed after $t_0$. The said maximum would remain unchanged if we were to perform all operations before and after $t_0$ sequentially (as the number of arcs would not change).

Our treatment of two-level memory can easily be expanded to include the case of multi-level memory. Suppose that memory capacity grows with the level number. As a rule, memory access time rapidly decreases as the level number decreases. Furthermore, it can usually be assumed with little loss in generality that the cumulative capacity of the first $l$ levels is much less than the capacity of the $(l+1)$th level. At this juncture multi-level memory usage can be investigated recursively. First we regard the highest-level memory as "external" memory, and all other memory as "RAM". Having obtained all available information, we proceed to refining the structure of "RAM", splitting off the last but one level memory, etc. This recursive process of course amounts to splitting of the algorithm graph into disjoint fragments by directed cuts.
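The quantity $N(t)$ used in this section lends itself to direct computation. The following Python sketch is not taken from the book: the names `arcs_crossing` and `peak_memory`, and the representation of a schedule as a dictionary of execution moments, are assumptions made purely for illustration. It counts the arcs that lead from nodes executed before $t$ to nodes executed after $t$, and takes the maximum over the gaps between consecutive execution moments.

```python
def arcs_crossing(arcs, moment, t):
    """Count arcs (a, b) with moment[a] < t < moment[b]: the directed cut at time t."""
    return sum(1 for a, b in arcs if moment[a] < t < moment[b])


def peak_memory(arcs, moment):
    """Maximum of N(t) over moments t at which no operation is being performed."""
    times = sorted(set(moment.values()))
    # One probe per gap between consecutive execution moments suffices,
    # because N(t) is constant inside each gap.
    probes = [(lo + hi) / 2 for lo, hi in zip(times, times[1:])]
    return max((arcs_crossing(arcs, moment, t) for t in probes), default=0)


# Hypothetical example: three producers z1..z3 feed a common node c,
# whose result is consumed by w1..w3; the schedule is serial.
arcs = [("z1", "c"), ("z2", "c"), ("z3", "c"),
        ("c", "w1"), ("c", "w2"), ("c", "w3")]
moment = {"z1": 1, "z2": 2, "z3": 3, "c": 4, "w1": 5, "w2": 6, "w3": 7}
```

In this toy graph three arcs are cut just before the common node executes, so `peak_memory(arcs, moment)` cannot be lowered below three by reordering the producers.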
15. Sectioning of Memory

We now suppose that for some reason or other it is necessary to use algorithms requiring very large amounts of RAM. The question naturally arises whether it is possible to weaken the demands on the RAM structure by taking into account more properties of practical algorithms.

Recall that by RAM we actually mean the memory whose access time is of the same degree of magnitude as algorithm operation time. We are not concerned about the technological and structural means to achieve fast memory access. If a relatively small number of memory references take somewhat longer, or even substantially longer, little drop in efficiency will actually occur in practice. We set out to make effective use of that circumstance.

Let the computer memory consist of sections. Assume that the functional units reference memory via a switch. Suppose that the average memory access time is roughly the same as operation time provided the functional units reference one and the same section of memory for a long time, and the time required to switch between memory sections is considerably greater. We will refer to RAM structured like this as sectioned memory. In case the data exchange between memory and the functional units is chaotic, the average memory access time will be large due to frequent switching between sections. However, if switching takes place only once in a while, then an acceptable average memory access time will ensue. Whether an algorithm admits of an effective implementation on a sectioned memory computer system is determined by the algorithm structure.

Consider any algorithm that can be effectively implemented on such a system. We have seen that switching between memory sections has to be infrequent in that case. It follows that almost all algorithm operations may be broken into such groups that all operation arguments for each group are taken from one and the same memory section. Assume the results of these operations are stored in that same section. It is generally not important where the results of operations that take arguments from different sections are stored, as their number is small. Nonetheless let us agree that the results of an operation are always stored into the section whence the last argument was drawn. Now group together those algorithm graph nodes whose results belong to one and the same memory section. Switching will then be caused only by those arcs whose end points are elements of different groups. Deleting these arcs from the graph we obtain a splitting of the graph into disjoint subgraphs, their number being the same as the number of used memory sections. An algorithm implementation on a sectioned memory computer system is effective if the number of arcs whose deletion splits the graph into disjoint subgraphs is relatively small.

Let $G = (V, E)$ be a graph (not necessarily directed). Suppose the set of its nodes (vertices) is partitioned into two disjoint subsets $V_1$ and $V_2$. The set of arcs (edges) whose end points are elements of different sets is referred to as undirected cut, or just cut, of the graph, and denoted by $(V_1, V_2)$.
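The definition of an undirected cut can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `undirected_cut` and the edge-list representation are assumptions, not notation from the book.

```python
def undirected_cut(edges, v1):
    """Return the edges having exactly one end point in the node subset v1.

    The other subset of the partition is implied: it consists of all
    remaining nodes, so only v1 needs to be passed.
    """
    v1 = set(v1)
    return [(a, b) for a, b in edges if (a in v1) != (b in v1)]


# Hypothetical example: a 4-cycle with one chord.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
cut = undirected_cut(edges, {1, 2})
```

Edge direction plays no role here: an edge belongs to the cut whenever its end points lie in different subsets, whichever way it points.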
The possibility to build an effective algorithm implementation on a sectioned memory computer system means the following. Undirected cuts exist in the algorithm graph that consist of a relatively small number of arcs. Deleting these cuts from the graph splits the graph into disjoint subgraphs whose number is the same as the number of used memory sections and whose size is determined by section size. Clearly, the reverse statement also holds: if such undirected cuts exist in the algorithm graph then the algorithm can be effectively implemented on a sectioned memory computer system (as regards run time).

So far we have chiefly assumed the existence of a switch connecting functional units and memory sections. Actually the switch was introduced to make our discussion more illustrative. The essential requirements are in fact as follows:

- the computer system architecture admits of sectioned memory;
- the average memory access time is roughly the same as operation time if the functional units reference one and the same section of memory for a long time;
- switching functional units between different memory sections is considerably slower than performing operations.

These requirements are apparently satisfied by all computer systems that have sectioned, or distributed, memory. The actual method of data transfer between memory sections and functional units is unimportant as regards the above-listed requirements. Maybe a switch is used, or a communication network, or something else. It is also not important whether the functional units are distributed among memory sections or not. Provided the requirements are satisfied, the restrictions imposed on the algorithm by a given system architecture depend solely on the sizes of undirected cuts we choose. Various system architectures will in general yield different usage of functional units. Both algorithm structure and all peculiarities of individual system communications contribute to this. The types of connections between memory sections and functional units play the main role.
We have shown in §14 that the algorithm in Example 12.2 cannot be effectively implemented on any multi-level memory computer system in case RAM size is insufficient to store all data, including the intermediate results. However, it admits of a good implementation on a sectioned memory computer system. Indeed, define a set of undirected cuts by hyperplanes parallel to the coordinate hyperplanes and not containing graph nodes. The end points of arcs of any of such cuts lie on the different sides of the corresponding hyperplane. The graph is split into subgraphs contained within the parallelepipeds delimited by the hyperplanes. The ratio of the number of arcs in each cut to the total number of algorithm graph arcs contained within the parallelepipeds is proportional to the ratio of the surface area of the parallelepiped to its volume. The latter is small for sufficiently large parallelepipeds. It follows that, establishing the correspondence between the parallelepipeds and memory sections, we achieve a good implementation of the algorithm in Example 12.2 on a sectioned memory computer system. Thus using sectioned memory broadens the scope of effectively implemented algorithms.

Our next questions are, what algorithms admit of effective implementation on sectioned memory computer systems and what algorithms cannot be implemented effectively on such systems? The example we just considered suggests an answer to the first question.

Let algorithm graph nodes be points in some metrical space. Suppose that the nodes are situated within a region with sufficiently smooth boundary in such a way that their number is a good estimate of the volume of that region. We will call a graph local if the lengths of all of its arcs (edges) are small with respect to the size of the region containing the graph. It is easy to introduce various quantitative characteristics describing this notion, yet they are not of interest for us now.

STATEMENT 15.1. Any sufficiently large algorithm whose graph is local can be effectively implemented on a sectioned memory computer system.
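The surface-to-volume argument behind this statement can be checked numerically on a model graph. The sketch below is an illustration only, under stated assumptions: the graph is a cubic lattice of `n**dim` nodes whose arcs join nearest neighbours (so it is local in the sense just defined), the cutting hyperplanes are placed every `b` nodes along each axis, and the name `cut_fraction` is hypothetical.

```python
from itertools import product


def cut_fraction(n, b, dim=3):
    """Fraction of nearest-neighbour arcs of an n**dim lattice whose
    end points fall into different blocks of side b."""
    total = crossing = 0
    for p in product(range(n), repeat=dim):
        for axis in range(dim):
            if p[axis] + 1 < n:          # arc from p to its neighbour along this axis
                total += 1
                if p[axis] // b != (p[axis] + 1) // b:
                    crossing += 1        # the arc is cut by a hyperplane
    return crossing / total


fractions = {b: cut_fraction(12, b) for b in (2, 3, 6)}
```

As the block side `b` grows, the share of arcs that cross block boundaries falls roughly like `1/b`, which is the behaviour the surface-area-to-volume ratio predicts.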
Given a local graph, consider the hyperplanes parallel to the coordinate hyperplanes and not containing graph nodes. They define a partitioning of the region into parallelepipeds and of the graph into subgraphs. The situation is rather close to that considered during our analysis of Example 12.2. Since the graph is local, the lengths of its arcs are small with respect to the sizes of the parallelepipeds provided the parallelepipeds themselves are sufficiently large. Therefore the end points of arcs linking different parallelepipeds are situated only within relatively thin layers encompassing the adjoining facets of the parallelepipeds. It follows that the number of arcs whose end points belong to different parallelepipeds is proportional to the total size of those layers. Consequently, the number of such arcs is relatively small.

Thus the possibility of effective algorithm implementation on a sectioned memory computer system is determined by whether the algorithm graph is isomorphic to a local graph or is close to some graph that is isomorphic to a local one. The important thing is that the analysis of the structure of popular practical algorithms shows that most often their graphs are either local already or isomorphic to local graphs.

Generally there are also algorithms whose graphs do not possess that property. Here belong, for example, the conjugate gradient method (and the like), whereby the evaluation of dot products causes a lot of trouble while solving large systems of grid equations. However, dot product computation need not be the bottleneck, the number of operations involved being relatively small. Note that studying the influence of algorithm structure upon the use of sectioned memory involves the investigation of graph cuts, the undirected cuts now playing the main role.

It is not always easy to tell whether a given algorithm graph is isomorphic to a local one. The traditional algorithm and program notations involve loops. Loops often encompass computational fragments that have clear-cut mathematical meaning. Despite that, their graphs may not be local due to the existence of "long" links between distinct computational fragments. Such algorithm graphs may be transformed into local graphs and then split into subgraphs that no longer have the familiar mathematical meaning. Here lies the methodological difficulty of dealing with local graphs.

Consider
for example SSOR for the solution of systems of linear grid equations. The structure of such methods is well mirrored by Example 12.2. These methods are iterative. The $t$ loop in (12.2) describes individual iterations. On each iteration, upper and lower triangular systems are alternatively solved. The solution processes are described by the two $i$, $j$ double loops in (12.2). The notation (12.2) has thus a clear-cut mathematical structure.

Applying the standard methods of building the algorithm graph to (12.2) would not yield a local graph. Locality is violated by the "long" arcs that correspond to data exchanges between individual loops in (12.2). In fact, the absence of locality was the main reason of our preferring the equivalent notation (12.3) to notation (12.2). The corresponding graph is shown in Fig. 12.2, and it is local. It fully reflects the structure of the algorithm, and therefore the mathematical structure of (12.2). In particular, the odd $t$ levels correspond to the upper double loop in (12.2), while the even $t$ levels correspond to the lower one. If we split this graph into subgraphs contained within parallelepipeds, then the resulting subgraphs would no longer reflect the structure of the original algorithm (12.2). Moreover, it is not obvious at all whether these subgraphs can be identified with any familiar mathematical objects and processes.
16. Decomposition of Algorithm and of Its Graph

Algorithm graph splitting into subgraphs results in the decomposition of the algorithm into partial algorithms. The graph of every partial algorithm is the corresponding subgraph of the original graph and it can be regarded as some new independent algorithm. The independence manifests itself in memory traffic peculiarities. In a number of cases partial algorithms are implemented independently of one another. There are also other facts that corroborate this viewpoint.

We will regard partial algorithms as new large operations of the original algorithm. We will use the term macrooperation to refer to
such operations.

Data dependencies between macrooperations are determined by algorithm graph cuts. Take any two distinct subgraphs and consider the set of arcs that originate in one subgraph and sink in the other. We will turn each such set into a separate arc and call it a macroarc. Obviously, all arcs of all cuts may be divided into disjoint sets so that each set would be turned into a macroarc. We regard macroarcs as arcs connecting the macronodes of a new graph, the macronodes being the subgraphs of the original graph. We will refer to the new graph as macrograph. Provided the macrograph is acyclic, it can be regarded as the original algorithm graph described in terms of larger operations and data transfers.

The concept of macrograph allows us to build algorithm implementations in two stages. First, the macrograph implementations are described on the macrolevel, then the implementations of partial algorithms corresponding to individual macronodes are considered. The implementation of individual macronodes requires a lesser amount of resources than that of the entire algorithm. Moreover, special choices of macronodes can considerably weaken the requirements on data transfers. Taking into account the peculiarities of individual algorithm graphs it is possible to choose macronodes in such a way as to weaken the demands on data exchange between them as well. This results in less strict requirements on computer system architectures. Various compromises are possible while choosing macronodes.

The natural question arises, on what conditions the macrograph is acyclic.

STATEMENT 16.1. If the algorithm graph is split into subgraphs by directed cuts, then the resulting macrograph is acyclic.

Take any macroarc. It is an element of some directed cut. Deleting that cut splits the graph into disjoint parts, the end macronodes of the macroarc belonging to different parts. It follows that any directed cut that participated in the forming of the macrograph induces a directed cut of the macrograph. Consequently, the set of directed cuts in the algorithm graph induces a set of directed cuts in the macrograph such that every macroarc belongs to one of those cuts, and deleting that cut isolates the corresponding macronodes.
Suppose the macrograph contains a circuit. During the elimination of the cuts this circuit must be broken at some moment due to the elimination of some macroarc. As the end points of that macroarc must belong to non-connected parts of the graph, no other path linking the macronodes may exist in this case. Therefore the macrograph cannot contain circuits.

Thus the use of directed cuts yields an acyclic macrograph. Therefore the macrograph can actually be regarded as an algorithm graph macrodescription. Parallel forms and schedules for the macrograph may be built and investigated in just the same way as for the "usual" algorithm graph. Each macrooperation may be executed independently of those macrooperations that do not influence its arguments at the moment of its execution. The set of macrooperations may be executed sequentially or in parallel. Each individual macrooperation may also be implemented in any suitable way. For example, the macrooperations may be executed consecutively, while the execution of each of them is parallelized.

Suppose that a macroprocessor is chosen to execute the macronodes. The macroprocessor has some resources, including memory. Generally an optimization problem has to be solved while choosing directed cuts. The memory requirements decrease as individual macronodes get smaller, but the requirements on communication channels throughput grow; vice versa, choosing larger subgraphs results in milder requirements on communication channels but larger macroprocessor memory is needed.

The possible macroprocessor organizations are various. Regarding the macroprocessors as functional units we can use them to build computational systems on which the macrograph can be implemented. These include systems with a single macroprocessor, pipelined processors, systolic arrays, etc. [88]. If we are able to arrange the directed cuts in such a way as to weaken the data dependencies between the macronodes, then the requirements on the communication network linking them will be mild. After we decide on the number of macroprocessors they can be linked practically arbitrarily. The topology of links will not have significant impact on the overall algorithm implementation efficiency.

Whenever there is a possibility to split the algorithm into loosely coupled subgraphs it is worth doing in most cases.
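The macrograph construction and the role of directed cuts in keeping it acyclic can be illustrated with a short Python sketch. Everything named here is an assumption made for the example: the functions `macrograph` and `is_acyclic`, the 4 x 4 lattice graph whose arcs point towards increasing coordinates, and the two partitions being compared. The block partition is produced by coordinate hyperplanes, each of which is a directed cut; the checkerboard partition is not produced by directed cuts.

```python
def macrograph(arcs, part):
    """Collapse every subgraph to a macronode; part maps node -> subgraph id.
    All arcs leading from one subgraph to another become a single macroarc."""
    return {(part[a], part[b]) for a, b in arcs if part[a] != part[b]}


def is_acyclic(nodes, arcs):
    """Kahn's algorithm: repeatedly remove nodes that have no incoming arcs."""
    indegree = {v: 0 for v in nodes}
    for _, b in arcs:
        indegree[b] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for a, b in arcs:
            if a == v:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == len(nodes)


n = 4
grid_arcs = ([((i, j), (i + 1, j)) for i in range(n - 1) for j in range(n)]
             + [((i, j), (i, j + 1)) for i in range(n) for j in range(n - 1)])
blocks = {(i, j): (i // 2, j // 2) for i in range(n) for j in range(n)}
checker = {(i, j): (i + j) % 2 for i in range(n) for j in range(n)}
```

With the block partition the four macronodes can be ordered and executed as macrooperations; with the checkerboard partition the two macronodes depend on each other, so no such ordering exists.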
The splitting relegates the main difficulties of mapping algorithms onto computer architectures to the macroprocessor. Note that we never imposed any restrictions on the macroprocessor structure. The functional units of the macroprocessor may be very simple or very complex, they may work in synchronous or asynchronous modes, etc. Their structure and complexity are ultimately determined by both the structure of operations and the computer technology advances.

We have merely sketched the ways to solve the problem of mapping algorithms onto computer architectures. A more detailed consideration of the problem is not currently our intention. Yet even our superficial treatment of the mapping problem reveals the remarkably tight connection between the overall properties of the algorithm graph and the structure of the computational system that optimally suits the algorithm.

Algorithm graph decomposition using directed cuts is so important that we would like once again to express it in our notation. Consider Example 12.1 and the splitting induced by a set of hyperplanes parallel to the coordinate planes. It is easy to see that the splitting generates subgraphs whose structure is analogous to that of the algorithm graph. Therefore the partial algorithms should admit of a description that is analogous to (12.1). Denote by EXP(A, B, C, M, N, L) the subroutine that operates upon the arrays A, B, and C of dimensions N×L, M×L, and M×N respectively and is written in the following FORTRAN-like language:

      DO 1 t=1,M
      DO 2 i=1,N
      DO 3 j=1,L
      A(i,j) = A(i,j) + A(i-1,j) + A(i,j-1)
      IF (i ≠ N) GO TO 3
      B(t,j) = A(N,j)
    3 CONTINUE                                        (16.1)
      C(t,i) = A(i,L)
    2 CONTINUE
    1 CONTINUE

As usual, we assume that by definition
$A(0,j) = B(t,j)$, $A(i,0) = C(t,i)$.

It can be easily verified that the additional modification of entries of B, C in (16.1) does not change the entries of A computed in (12.1). With the above-described partitioning of the algorithm graph in Example 12.1 every subgraph corresponds to a partial algorithm also described by (16.1), the values M, N, L now being the sizes of the subgraph. Besides, the arrays A, B, and C now store the input data and results also belonging to the subgraph. These input data and results provide explicit information on what data may be stored in external memory and what can be gained by that. Now it remains to put down the algorithm as a whole.

Suppose that the numbers M, N, L in (12.1) are represented as sums of integers

      $M = m_1 + \ldots + m_p$,   $N = n_1 + \ldots + n_q$,   $L = l_1 + \ldots + l_s$.

According to these representations partition the matrix A into blocks $A_{rk}$ of sizes $n_r \times l_k$, the matrix B into blocks $B_{hk}$ of sizes $m_h \times l_k$, and the matrix C into blocks $C_{hr}$ of sizes $m_h \times n_r$. The algorithm can now be written in the form

      DO 1 h=1,p
      DO 2 r=1,q
      DO 3 k=1,s
    3 EXP(A_rk, B_hk, C_hr, m_h, n_r, l_k)            (16.2)
    2 CONTINUE
    1 CONTINUE

The initial contents of the arrays $A_{rk}$, $B_{hk}$ and $C_{hr}$ must match that of the corresponding blocks of A, B, and C. The algorithm results will be stored in the array A consisting of the blocks $A_{rk}$, grouped in the appropriate manner.

The notation (16.2) renders unnecessary any further research concerning the possibilities of algorithm graph splitting using directed cuts, as it provides the full information about it explicitly.
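The equivalence of the block form (16.2) and a single call of EXP over the whole arrays can be tried out in Python. This is a sketch under loudly stated assumptions: the recurrence inside `exp_block` (each entry is increased by its upper and left neighbours, with B supplying the row above the block and C the column to its left) is a reading of the scanned listing (16.1), not a verified copy of it; the names `exp_block`, `run_blocked` and `make_data` are hypothetical; and blocks are addressed by offsets into the full arrays rather than passed as separate arrays.

```python
def exp_block(A, B, C, t0, m, i0, n, j0, l):
    """Assumed EXP: sweep the n-by-l block of A at (i0, j0) for steps t0..t0+m-1.
    B[t][j] holds the value entering from above at step t and is overwritten
    by the value leaving below; C[t][i] does the same for left and right."""
    for t in range(t0, t0 + m):
        for i in range(i0, i0 + n):
            for j in range(j0, j0 + l):
                up = B[t][j] if i == i0 else A[i - 1][j]
                left = C[t][i] if j == j0 else A[i][j - 1]
                A[i][j] += up + left
                if i == i0 + n - 1:
                    B[t][j] = A[i][j]        # bottom row of the block
            C[t][i] = A[i][j0 + l - 1]       # right column of the block


def run_blocked(A, B, C, ms, ns, ls):
    """Triple loop over blocks in the order h, r, k, as in (16.2)."""
    t0 = 0
    for m in ms:
        i0 = 0
        for n in ns:
            j0 = 0
            for l in ls:
                exp_block(A, B, C, t0, m, i0, n, j0, l)
                j0 += l
            i0 += n
        t0 += m


def make_data(M, N, L):
    """Small deterministic integer test arrays (illustrative values only)."""
    A = [[(3 * i + j) % 5 for j in range(L)] for i in range(N)]
    B = [[(t + 2 * j) % 3 for j in range(L)] for t in range(M)]
    C = [[(2 * t + i) % 4 for i in range(N)] for t in range(M)]
    return A, B, C
```

Calling `exp_block` once over the full index ranges and calling `run_blocked` with any block sizes that sum to M, N and L leave the three arrays in the same state, because each block finds the data produced by its upper and left neighbours in B and C.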
I t
Is
153 not
even
that
necessary
i s t o be Now
ting
taken
consider
using
cutting
t o know t h e
Inner
i n t o account
Example
undirected
hyperplanes
As
Example 12.1,
algorithms that
with
produces
plementation the
nodes w o u l d
process
into
of
algorithms.
This
undirected
Any
it
a set
of
between
of
linked
graph then
them.
In
a partial
operations.
This
is
typical
an
algorithm
Algorithm
closely
nearly
used
to
usually
dominates the
effective
the
cuts
The
problem
that
overall
the
same
im-
reduce partial
splittings
represent
of
i n various
typically
the
used
of
dependen-
decompose
i n the
I n t h a t c a s e we
set
of
i s the
fact
algorithms.
execution
portabi1ity
is
libraries.
only
that
software
For
time
of
time.
The
therefore
Since
the
directly
are
example,
subroutines problem largely
overall
of the
size
of
performed. beget
tremendous neglected.
that some
the
inevitably
on
i t is
c i r c u m s t a n c e o f c o u r s e c a n n o t be
have
will
ex-
systems, Macrooperations
algorithm execution
fields
cuts
cannot but
transporting applications
The
the
macro-
i n some a p p r o p r i a t e mode.
complicated
the
algo-
i s whether
the macrograph. I f u n d i r e c t e d
role.
the analysis reveals
area
are
t r a n s p o r t c a n n o t be
amounts o f a l g o r i t h m s . T h i s Nonetheless
graph
t h e n o t a t i o n s f o r i t bear n o t
describe
play
i s huge, the
progress
to
to exist
including parallel
the p o r t a b i l i t y
libraries
using
to
main q u e s t i o n
e v e n more i m p o r t a n t
applications software
problem o f
algorithm
The
i s guaranteed
i s d e f i n e d by
the
subroutines
that
to data
computers,
always
partial
algorithm
to
directed
d e c o m p o s i t i o n and
various
I t follows the
graph.
the
the macrooperations according
such guarantees e x i s t .
to
the the
case
I t seems t h a t
related
onto various
Provided
i t i s impossible
allows
ecute a l l macrooperations simultaneously
memory u s a g e .
split-
hyperplanes,
identifying
to break
for
macrooperations.
order
order
u s e d , t h e n no
algorithms.
circuits.
i . e.
thing
time.
t o that o f the e n t i r e
impossible
stages,
only
algorithm graph
coordinate
similar
The
cuts.
i s possible to order
cies
are
situation
decomposition
as
two
EXP.
the whole a l g o r i t h m to implementations of
p r o d u c e d by
rithm
the
have
I t i s i n general
implementation
partial
to
be
the
t h e m a c r o g r a p h p r o d u c e d by
new
macrograph
before,
parallel
s t r u c t u r e of a l l subgraphs w i l l Unlike
i s i t s execution
12.2.
cuts
are
s t r u c t u r e of
set
common
of
algorithms
kernel,
usually
of
one
not
and very
154 large.
Moreover, t h i s
sufficiently
large
o p e r a t i o n s , as plication,
I f a l l linear
a l g o r i t h m s w o u l d be
of
such a
the
algebraic
reduced
course,
operations.
in
are
fields
t h e d i f f e r e n c e may totally
example,
rithm
various
the
identical,
common. The
different
graphs
of
though
would turn
one
of
graph
to
the key
and
parts of
yield
algorithms
(12.1)
and
linear
of
their
the
may
the
described
i n the block
block
of
that,
form. Yet
rather
identical
are
have
graphs. algonothing
revealed
book i s d e v o t e d .
based
on
cuts
we than
multiplication
these a l g o r i t h m s
analysis,
of
kernels.
functional
situation
to which t h i s
algorithms
In spite
possess
algebra
background
development
theory.
be
of
theoretical
matrix
investigation
the s t r u c t u r a l
multi-
algebra a l -
carefully
different
to
of
functionally
essential traits
building
matrix-vector
were
The
of
out
several
matrix-vector
problem f o r l i n e a r
seldom d e s c r i b e d
s t r u c t u r a l analysis of algorithms, rithm
i n terms of simple
to the problem of p o r t a b i l i t y
matrix-vector
i s a w e l l - e s t a b l i s h e d branch
that
structural: For
algorithms
the p o r t a b i l i t y
l i n e a r a l g e b r a i c a l g o r i t h m s are Of
the
multiplication,
description Is closely related
m e t h o d s and
observe
described
example,
such k e r n e l f o r numerical
terms of such o p e r a t i o n s ,
implementing
o f t e n be For
m a t r i x a d d i t i o n and
etc. constitute
gorithms. in
k e r n e l can
operations.
by
the
Algo-
constitute
Chapter 4
Matrix Investigation of Algorithm Structure

Various questions arise as we are investigating an algorithm. Attempts to find answers to them follow. The scope of questions is usually quite wide. The questions may touch upon complexity bounds, upon the feasibility of certain transformations, upon computing various characteristics, etc. Algorithm investigation actually amounts to the investigation of its record. It is intuitively clear that the success of this investigation depends largely upon the structure of the algorithm itself. Hence the importance of the chosen algorithm notation and its structure. The
discussion
some o b j e c t s
of algorithm
are specified
the
emerging q u e s t i o n s .
use
already proved
the
graph
notation.
depend 11
beget
associated
it,
with
algorithm
matrix
only
Clearly,
o f data
consider
fully
We
algorithm
shall
parti-
including
structure graph
graph
notation c a n be
must be r e -
we
shall
associate
start
with
an o b j e c t
a
with
a number o f p r o b l e m s b e a r i n g o n a l This
connections
a particular kind
reflects
being useful
i t s de-
o f algorithm
algorithm
to generality,
notation.
investigation.
as v a r i a t i o n m a t r i x
entries
which
that
object.
properties
weighted
its
object
Fairly often
notation,
the choice
that helps solve constructively
will
and i t s
s p e c i f i c a t i o n s . Many
of exploited that
o u r commitment
general
answers t o
i s one o f s u c h o b j e c t s ,
i n i t s pure form.
i n some way o r o t h e r .
i n that
gorithm
to
assumed
a specific
be m e a n i n g f u l i f
constructive
H o w e v e r , we h a v e o b s e r v e d many t i m e s
on t h e k i n d
c a n be
Following rather
A l g o r i t h m graph
fruitful.
would only
provide
i s a c c o m p a n i e d b y some a d d i t i o n a l
cularities
flected
would
i s not always applied
scription
should
structure
that
o f algorithm. the algorithm
object
i s a
o f algorithm.
matrix
In this
o f such m a t r i c e s . The
structure
graph.
chapter
a we
I t i s referred
of i t s
Therefore
called
nontrivial
we c a n c o u n t
on
f o r t h e s o l u t i o n o f problems concerning t h e i n v e s t i g a -
t i o n o f a l g o r i t h m s and t h e i r
structures.
17. Graphs and Matrices

Consider any algorithm data
change.
algorithm tations such input
We
have
the order
data
property.
within
the o r i g i n a l
graph.
on n u m b e r s . consists
an a l g o r i t h m
Denote
u
= F
k
k
l u
a l l F^
heavy
those
algorithms this
T h u s we evaluation
we
by
the value
that
evaluate
assume
that
p variables
information cluding
u^
be e a s i l y
number
certain
one o p e r a -
o f operations
data.
the algorithm
Suppose
*
< k.
(17.1)
of
their
arguments.
b u t some s e t o f v a l u e s u ^ . that
there
amounts
functions
Is quite
u
theory
F besides
propagation
and o f g r a d i e n t
obtained only
This
the recurrent
U^j...,u . Both
error
to a
to just
i s only
one r e -
to considering a t given
only
points.
Ob-
large.
relations
(17.1)
describe the
function
on t h e f u n c t i o n
roundoff
derivatives
c a n , up
functions
results
v • FUj
of
be-
as a sequence
fc
c a n assume
class of algorithms
of a certain
even i f
i s established
computations:
smooth
Without
viously,
we
k
are sufficiently
restrictions,
n o t change
k
c a n be t a k e n f o r a l g o r i t h m
represented
sequences
i n generality.
p
Nothing
sult,
that
the input
) ,
11 s
Here,
loss
out the f o l l o w i n g
k
a l l implemen-
by o p e r a t i o n
relation
i s described
a ,...,u
by
i n carrying
Consequently,
our consideration
t i o n sequence, w i t h o u t s u b s t a n t i a l Assume t h a t
a l g o r i thms o r
g r a p h nodes, any sequence w o u l d de-
I t follows
restrict
even i f input
computational
them d o e s
an a p p r o p r i a t e
and a l g o r i t h m
of investigation,
most
c a n be d e s c r i b e d
of operations
change. P r o v i d e d
tween t h e o p e r a t i o n s termine
that
fragments possess t h a t o f such an a l g o r i t h m
that
stage
whose g r a p h r e m a i n s u n c h a n g e d
mentioned
p
) ,
and p r a c t i c e i t s values
require
there
many
other
a t some p o i n t s , i n -
characteristics,
at certain points,
i n case
(17.2)
values
of
partial
e t c . A l l these d a t a can
i s an e x p l i c i t
representation
of
$F$ in terms of $u_1,\dots,u_p$, and that representation is reasonably simple. If the function (17.2) is computed using the process (17.1), then the explicit representation of the function in terms of input data can hardly ever be acquired if $n$ is large. In that case we cannot but explore the function via the recurrent relations that determine it.

The relations (17.1) can be viewed as a system of equations determining the set of values $u_k$. We rewrite this system as follows:

$$F_k(u_{k_1},\dots,u_{k_{s_k}}) - u_k = 0, \qquad k_1,\dots,k_{s_k} < k. \tag{17.3}$$

In the general case this system is nonlinear. While investigating the set of solutions of (17.3), we have to consider nearby solutions. They satisfy the system

$$F_k(u_{k_1}+\Delta u_{k_1},\dots,u_{k_{s_k}}+\Delta u_{k_{s_k}}) - (u_k+\Delta u_k) = 0, \qquad k_1,\dots,k_{s_k} < k, \tag{17.4}$$

where all variations $\Delta u_k$ are in a sense small. Subtracting (17.3) from (17.4), we obtain, to the second-order terms, the linear system for the variations:

$$\sum_{i=1}^{s_k} \frac{\partial F_k}{\partial u_{k_i}}\,\Delta u_{k_i} - \Delta u_k = 0, \qquad k_1,\dots,k_{s_k} < k. \tag{17.5}$$

The matrix $\Phi$ of the linear system (17.5) is a Jacobi matrix for the functions $F_k(u_{k_1},\dots,u_{k_{s_k}}) - u_k$ in (17.3), taken in the variables $u_1,\dots,u_n$. As the number of relations in (17.4) equals $n-p$, and the number of variables equals $n$, the matrix $\Phi$ is $(n-p)\times n$. As a rule, it is very sparse, since each function $F_k$ usually involves only a small number of variables.
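As a small worked illustration of the linear system (17.5), consider a hypothetical algorithm (chosen here only as an example, it is not taken from the text) with $p=2$ input values and $n=4$: $u_3 = u_1 u_2$, $u_4 = u_3 + u_1$. Under that assumption the equations for the variations and the resulting $2\times 4$ matrix would read:

```latex
\begin{aligned}
u_2\,\Delta u_1 + u_1\,\Delta u_2 - \Delta u_3 &= 0,\\
\Delta u_1 + \Delta u_3 - \Delta u_4 &= 0,
\end{aligned}
\qquad
\Phi = \begin{pmatrix} u_2 & u_1 & -1 & 0\\ 1 & 0 & 1 & -1 \end{pmatrix}.
```

Each row holds the partial derivatives of one operation in the columns of its arguments and $-1$ in the column of the value being computed.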
The matrix $\Phi$ has full rank. Suffice it to say that it is lower triangular with respect to the diagonal originating from the lower right corner of the matrix. The diagonal entries all equal $-1$. We will refer to the matrix $\Phi$ as variation matrix of algorithm (17.1). Of course, in the general case its entries depend on the variables $u_1,\dots,u_{n-1}$. They can be described by

$$\varphi_{ij} = \begin{cases} -1 & \text{if } j = i+p,\\[2pt] \dfrac{\partial F_{i+p}}{\partial u_j} & \text{if } j \text{ is one of } (i+p)_1,\dots,(i+p)_{s_{i+p}},\\[2pt] 0 & \text{otherwise.} \end{cases} \tag{17.6}$$
As we shall see later, the variation matrix will often emerge in various problems concerning the investigation of algorithm (17.1).

The variation matrix is closely related to the algorithm graph. Ascribe the evaluation of $u_k$ to the $k$th graph node. Clearly, the first $p$ nodes signify the input of initial data $u_1,\dots,u_p$, and all the rest signify the evaluation of $F_k$ in (17.1) to obtain $u_k$. We postulate that an arc originates from the $i$th node and points to the $j$th node if and only if $u_i$ is used as an argument while evaluating $u_j$. According to (17.1), no arcs point to the $k$th node for $k \le p$. If $k>p$, then the $k$th node is pointed to by arcs originating from nodes $k_1,\dots,k_{s_k}$.
Now we will describe our graph by an $(n-p)\times n$ matrix $\Phi$ with entries $\varphi_{ij}$. Define

$$\varphi_{ij} = \begin{cases} -1 & \text{if } j = i+p,\\ 1 & \text{if } j \text{ is among } (i+p)_1,\dots,(i+p)_{s_{i+p}},\\ 0 & \text{otherwise.} \end{cases} \tag{17.7}$$

It is easy to see that the $k$th column of the matrix $\Phi$ corresponds to the quantity $u_k$, and the $k$th row corresponds to the quantity $u_{k+p}$. The $k$th row has entry $-1$ in the column whose number corresponds to the number of the quantity $u_{k+p}$ being evaluated. It has entries $+1$ in columns whose numbers are argument numbers of $u_{k+p}$. The matrix $\Phi$ describes the informational interconnection of the quantities $u_k$. Therefore we will refer to it as the information connection matrix of algorithm (17.1).

The relation of the information connection matrix to the variation matrix is obvious. The former is derived from the latter by setting the entries representing partial derivatives of $F_k$ to unity. The structure of nonzero entries is the same for both matrices.
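The definition (17.7) can be sketched in a few lines of code. The function name, the dictionary layout of the argument lists, and the sample algorithm below are assumptions made for illustration only; they do not come from the text.

```python
def information_connection_matrix(p, args):
    """Build the (n-p) x n matrix of (17.7).

    p    -- number of input values u_1..u_p
    args -- hypothetical layout: args[k] lists the (1-based) argument
            numbers k_1..k_s of u_k, for every k = p+1..n
    """
    n = p + len(args)
    phi = [[0] * n for _ in range(n - p)]
    for k, arg_numbers in args.items():
        if not p < k <= n:
            raise ValueError("operation numbers must run over p+1..n")
        row = k - p - 1               # row i = k - p, stored 0-based
        phi[row][k - 1] = -1          # entry -1 in column j = i + p
        for j in arg_numbers:
            if not 1 <= j < k:
                raise ValueError("arguments must be computed earlier")
            phi[row][j - 1] = 1       # entry +1 in each argument column
    return phi


# Sample algorithm (an assumption): u_3 = F_3(u_1, u_2), u_4 = F_4(u_1, u_3)
example = information_connection_matrix(2, {3: [1, 2], 4: [1, 3]})
for matrix_row in example:
    print(matrix_row)
```

Replacing every `1` by the corresponding partial derivative would give the variation matrix (17.6) for the same algorithm.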
The information connection matrix spawns a whole family of matrices related to the algorithm graph. Substitute any numbers for all its nonzero entries. We refer to any such matrix as weighted information connection matrix. A sample matrix of this kind is the variation matrix. All the matrices of this kind define the graph of the algorithm uniquely, so the structure of their nonzero entries fully reflects the structure of algorithm. That is why here we will limit our discussion to this matrix.

We
have
many
times
graph nodes can s i m p l i f y hope t o a c h i e v e rix
observed
a simpler
o f graph
the
layerwise:
operations
of m a t r i x
matrix)
first,
the operations layer,
new
a r g u m e n t s o f t h e new o p e r a t i o n
operation
in
F .
Then
we
except
Fig.
trix
with
1 7 . 1 . Each
nonzero
defines
has o n l y
i s selected
and columns
t h e new
r o w o f F.
e n t r i e s equal
layer,
then
the enumeration
has a t l e a s t
1. Each
one nonzero e n t r y
we
corresponding
to
have n o t been c o u n t e d y e t , so as t o match
t h e new r o w
t o the i n i t i a l
of the information
enumeration
fol-
t o t h e arguments o f
a l l columns
F^ t h a t
f o r columns corresponding
t h e rows
accordance
Enumerate a l l
i n the f i r s t
e t c . That
enumerate
T h e new c o l u m n e n u m e r a t i o n
terchanging
mat-
t h e e n u m e r a t i o n o f c o l u m n s we p r o c e e d a s
the
enumeration,
connection
(17.1).
we e n u m e r a t e a l l c o l u m n s c o r r e s p o n d i n g
the
etc.
t h a t we c a n
by c h o o s i n g a s u i t a b l e enumera-
form o f the algorithm
i n t h e second
r o w s . To d e f i n e
lows. F i r s t ,
enumeration o f
nodes.
Consider any p a r a l l e l operations
an a p p r o p r i a t e
s t r u c t u r e o f the information
(and hence o f t h e v a r i a t i o n
tion
that
the graph d e s c r i p t i o n . I tf o l l o w s
obtain
connection
the matrix
one n o n z e r o
entry
row o f t h e h a t c h e d - o v e r that equals - 1 .
data. I n -
part
matrix
shown i n
and a l l t h e o f t h e ma-
b) Fig.
In trix ith
c a n h a v e more column
their is
the general
that
tion
than
column
one e n t r y
of the information
equal
i n c l u d e s 5 j such e n t r i e s
arguments.
He h a v e m e n t i o n e d
some a r c s
transport
case each
17.1
originating
of identical
data
[
t h e meaning
certain
graph
items. Recall that
a s one o f
of this
nodes
we r e f e r
situation
stand
for
to this
the
situa-
as d a t a b r o a d c a s t i n g . Thus i f t h e r e a r e d a t a
broadcasts
i n the algorithm
then
umns o f t h e i n f o r m a t i o n c o n n e c t i o n m a t r i x may h a v e n o n z e r o w i t h more t h a n o n e m a t r i x r o w . T h i s p r o p e r t y h o l d s umn p e r m u t a t i o n s . information exists If
ma-
I n p a r t i c u l a r , the
o p e r a t i o n s use Uj
i f5 that
from
to unity.
connection
I tfollows
connection
at least
matrix
one column
t h e r e a r e no d a t a
that
broadcasts
l o o k as i n F i g . 17.1b).
goes
shown
i n Fig.
t h r o u g h more
i n the algorithm
Ho c o l u m n g o e s
col-
f o r a l l row and c o l -
t h e above p e r m u t a t i o n s
t o t h e form
that
some
intersection
t r a n s f o r m the 17.1a).
There
t h a n o n e o f t h e P.. then
t h r o u g h more
the matrix
will
t h a n one o f t h e P j
matrices.

Matrices are traditionally used to represent graphs. We will discuss several kinds of such matrices, taking into account the most general properties of algorithm graphs. Consider a directed graph $G=(V,E)$ without loops and multiple arcs. Suppose it has $n$ nodes and $m$ arcs, and all nodes and arcs are marked. An $n\times n$ square matrix $B$ with entries $b_{ij}$ is called adjacency matrix of the graph if

$$b_{ij} = \begin{cases} 1 & \text{if an arc originates from the } i\text{th node and points to the } j\text{th node},\\ 0 & \text{otherwise.} \end{cases}$$

Note that the main diagonal of the adjacency matrix is zero. Rows corresponding to output nodes and columns corresponding to input nodes are zero as well.

The adjacency matrix of the algorithm graph is closely related to its information connection matrix. Specifically, the information connection matrix $\Phi$ is the submatrix formed by the last $n-p$ rows of the matrix $B'-E$, where $B$ is the adjacency matrix, $E$ is the identity matrix, and the superscript $'$ stands for transposing.

As shown above, the information connection matrix can be transformed by the appropriate choice of enumeration of operations so as to facilitate the exploration of the algorithm. In fact this amounts to transforming the adjacency matrix by enumerating graph nodes in an appropriate way.

STATEMENT 17.1. A loopless directed graph defined by the adjacency matrix $B$ has no circuits if and only if there exists a permutation matrix $P$ such that $P'BP$ is upper triangular.

Supposing there are no circuits in the graph, various parallel forms of it must exist. Take any of them and enumerate the nodes layerwise, starting from layer 1. With that enumeration any arc would point to a node with greater number. This means that the adjacency matrix is upper triangular. Now suppose that the adjacency matrix is upper triangular. Tracing any directed path in the graph we would always pass by nodes with increasing numbers. Therefore we would never find ourselves back at the starting node. It follows that there are no circuits in the graph. Since the graph has no loops, all diagonal entries of the adjacency matrix are zero.

Actually this proof allows to establish an even more close relationship between a particular parallel form and the graph's adjacency matrix.
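The argument behind Statement 17.1 can be tried out on a small graph. The sketch below is only an illustration under stated assumptions: nodes are numbered from 0, the adjacency matrix is a list of lists, and the helper names are hypothetical. It enumerates the nodes layer by layer and checks that the renumbered adjacency matrix is upper triangular with a zero diagonal.

```python
def layerwise_order(B):
    """Enumerate nodes layer by layer; return None if the graph has a circuit."""
    n = len(B)
    indegree = [sum(B[i][j] for i in range(n)) for j in range(n)]
    layer = [j for j in range(n) if indegree[j] == 0]   # first layer: no incoming arcs
    order = []
    while layer:
        order.extend(layer)
        next_layer = []
        for i in layer:
            for j in range(n):
                if B[i][j]:
                    indegree[j] -= 1
                    if indegree[j] == 0:                # all predecessors are numbered
                        next_layer.append(j)
        layer = next_layer
    return order if len(order) == n else None


def renumber(B, order):
    """Adjacency matrix after renumbering the nodes, i.e. P'BP for that permutation."""
    return [[B[i][j] for j in order] for i in order]


def is_strictly_upper_triangular(M):
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i + 1))


# Sample graph (an assumption): arcs 3->1, 2->1, 3->0, 1->0
B = [[0, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [1, 1, 0, 0]]
order = layerwise_order(B)
print(order, renumber(B, order))
```

For a graph containing a circuit the first function returns `None`, which corresponds to the case where no suitable permutation exists.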
STATEMENT 17.2. An acyclic directed graph without loops and multiple arcs, defined by the adjacency matrix $B$, has a parallel form of height $l$ and width $s$ if and only if there exists a permutation matrix $P$ such that $P'BP$ is a block upper triangular matrix of block order $l$ with zero square diagonal blocks of order not exceeding $s$.

For an arbitrary acyclic directed graph, little details of the structure of the corresponding adjacency matrix can be discovered. As we specify additional graph properties, further properties of the adjacency matrix can be obtained. We had a similar situation when we discussed the information connection matrix.

A rectangular $n\times m$ matrix $A$ with entries $a_{ij}$ is called incidence matrix if

$$a_{ij} = \begin{cases} 1 & \text{if the } j\text{th arc originates from the } i\text{th node},\\ -1 & \text{if the } j\text{th arc points to the } i\text{th node},\\ 0 & \text{otherwise.} \end{cases}$$

Only two entries within each column are nonzero. They equal $1$ and $-1$ since there are no loops in the graph. No two columns are identical because there are no multiple arcs.

Suppose that an algorithm whose graph is defined by its incidence matrix $A$ is implemented on a computational system. Assume the implementation satisfies the conditions postulated in Chapters 2 and 3. Consider a schedule $t=(t_1,\dots,t_n)$ and a delay vector $w$. A reformulation of Statement 7.1 gives

STATEMENT 17.3. Let $A$ be the incidence matrix of a graph and $w$ be the delay vector. For a vector $t$ to be a schedule corresponding to the delay vector $w$ it is necessary and sufficient that the inequality

$$-A't \ge w \tag{17.8}$$

holds.

We cannot hope that the matrix-vector description of schedules would significantly simplify its investigation. The incidence matrix hardly ever accompanies an algorithm's description; besides, its size is immense. However, special features of an algorithm graph must necessarily be reflected in the structure of nonzero entries of the incidence matrix. This can be useful when exploring the inequality (17.8).
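The test in Statement 17.3 is easy to state in code. The following sketch assumes a graph given as a list of arcs `(source, target)` with 0-based node numbers; the function names and the sample data are hypothetical and serve only to illustrate the inequality (17.8).

```python
def incidence_matrix(n, arcs):
    """n x m incidence matrix: +1 where arc j originates, -1 where it points."""
    A = [[0] * len(arcs) for _ in range(n)]
    for j, (source, target) in enumerate(arcs):
        A[source][j] = 1
        A[target][j] = -1
    return A


def is_schedule(A, t, w):
    """Check -A't >= w componentwise, one component per arc."""
    n = len(A)
    for j in range(len(w)):
        # (-A't)_j equals t[target] - t[source] for the j-th arc
        lhs = -sum(A[i][j] * t[i] for i in range(n))
        if lhs < w[j]:
            return False
    return True


# Sample graph (an assumption): arcs 0->2, 1->2, 0->3, 2->3, unit delays
arcs = [(0, 2), (1, 2), (0, 3), (2, 3)]
A = incidence_matrix(4, arcs)
w = [1, 1, 1, 1]
print(is_schedule(A, [0, 0, 1, 2], w), is_schedule(A, [0, 0, 1, 1], w))
```

The second vector fails because the arc 2->3 would need a delay of at least 1 between its end nodes.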
If the $l$th component of the vector inequality (17.8) refers to the $i$th and $j$th algorithm graph nodes then it has the familiar form

$$t_j - t_i \ge w_{ij}.$$

Therefore Statement 7.8 can be reformulated as follows:

STATEMENT 17.4. For all algorithm graph cycles to be balanced it is necessary and sufficient that the system of linear algebraic equations

$$-A't = w \tag{17.9}$$

be compatible.

Certainly, all our earlier results can be reformulated in terms of certain properties of the vector inequality (17.8). The question is whether it is worthwhile, and in case it is, what the purpose of it may be.

Algorithm graph is fairly often balanced with respect to the delay vector $w=e$, where all components of $e$ equal 1.

STATEMENT 17.5. An acyclic directed graph without loops and multiple arcs, defined by the adjacency matrix $B$, has cycles balanced with respect to the vector $e$ if and only if there exists a permutation matrix $P$ such that $P'BP$ is block upper bidiagonal with zero square diagonal blocks.

Consider a solution to (17.9). Using the vector $t$ we can build a parallel form of the graph, grouping into one layer the nodes with the same execution moments (assuming the nodes correspond to operations of some algorithm). Each arc can link only nodes from neighboring layers, as the corresponding execution moments differ only by 1. Enumerating the nodes layerwise we observe that the adjacency matrix has the form specified in the statement. Now suppose the adjacency matrix has the prescribed form. Build a parallel form of the graph, ascribing to one and the same layer the nodes corresponding to each diagonal block. Each arc will link only nodes from neighboring layers, so all cycles of the algorithm graph will be balanced.

Statements concerning the adjacency matrix can be helpful when investigating the information connection matrix and the variation matrix of an algorithm. They point out the way of enumerating the operations of the algorithm (17.1) to facilitate the investigation of its structure.
18. Recovering the Linear Functional

Let us consider in greater detail a special case of algorithm (17.1) where all functions $F_k$ are linear. This implies that the execution of algorithm consists in computing

$$u_k = \sum_{i=1}^{s_k} a_{k_i} u_{k_i}, \qquad k_1,\dots,k_{s_k} < k. \tag{18.1}$$

Assume that all $a_{k_i}$ are known before the computation (18.1) starts. Obviously, this computation determines a way to find values of some function (17.2) that has to be linear in the variables $u_1,\dots,u_p$. Consequently, it has the form

$$v = F(u_1,\dots,u_p) = \sum_{j=1}^{p} \beta_j u_j. \tag{18.2}$$

Of course, the direct evaluation of $F$ using (18.2) is much more effective than implementing (18.1). However, $\beta_j$ may not be known. That is why it may become necessary to evaluate $F$ indirectly via (18.1). The natural question arises: what is the best way to do it?

The total number of additions, subtractions, and multiplications to execute (18.1) equals

$$N = 2\sum_{k=p+1}^{n} s_k - (n-p). \tag{18.3}$$

The total number of additions, subtractions, and multiplications to compute (18.2) using the $\beta_j$ equals

$$M = 2p - 1. \tag{18.4}$$
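The counts (18.3) and (18.4) follow from the fact that a linear combination of $s$ terms takes $s$ multiplications and $s-1$ additions. A minimal sketch of this arithmetic is given below; the function names and the sample sizes are assumptions chosen for illustration.

```python
def cost_recurrence(s):
    """N of (18.3); s lists the numbers of arguments s_k for k = p+1..n."""
    # each u_k costs s_k multiplications and s_k - 1 additions
    return 2 * sum(s) - len(s)


def cost_direct(p):
    """M of (18.4): p multiplications and p - 1 additions."""
    return 2 * p - 1


# Sample sizes (an assumption): p = 3 inputs, four operations
s = [2, 2, 3, 2]
N = cost_recurrence(s)
M = cost_direct(3)
print(N, M)
```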
Clearly, for large $n$ we have $N \gg M$. If (18.1) is to be executed many times for different input data $u_1,\dots,u_p$ and same $a_{k_i}$, then it may be less costly to precompute the numbers $\beta_j$ and then evaluate (18.2) directly as many times as necessary. Again there is the problem of efficient evaluation of $\beta_j$.

Seemingly this problem can be readily solved. Indeed, consider (18.1). We see that $u_{p+1}$ is already represented as a linear combination of $u_1,\dots,u_p$. Suppose we have the analogous representations for all preceding $u_k$. Substituting them into (18.1) and gathering like terms we obtain the required representation (18.2) for $u_n$. Provided all $s_k$ are bounded, finding the explicit representation (18.2) would require for large $n$ about

$$K \approx 2p\sum_{k=p+1}^{n} s_k \tag{18.5}$$

additions, subtractions, and multiplications. This is about $p$ times greater than the number of operations involved in a single computation (18.1). Looking at (18.3)-(18.5) we can conclude that if (18.1) is to be executed more than $p$ times for same $a_{k_i}$, then it is advantageous to precompute $\beta_j$ and then evaluate the linear function (18.2) directly.

It turns out, however, that this easily obtained result is far from the best. The $\beta_j$ can actually be computed at a much smaller computational expense.

The recurrent relations (18.1) can be regarded as a system of linear algebraic equations

$$\sum_{i=1}^{s_k} a_{k_i} u_{k_i} - u_k = 0 \tag{18.6}$$

with respect to variables $u_{p+1},\dots,u_n$, the variables $u_1,\dots,u_p$ being free. Actually, we only have to find $u_n$. Note that the matrix of the system (18.6) is the variation matrix $\Phi$ of the algorithm (18.1). It has the form
$$\varphi_{ij} = \begin{cases} -1 & \text{if } j = i+p,\\ a_{(i+p)_l} & \text{if } j = (i+p)_l \text{ is among } (i+p)_1,\dots,(i+p)_{s_{i+p}},\\ 0 & \text{otherwise.} \end{cases} \tag{18.7}$$

Now write (18.6) as follows:

$$Px + Qy = 0. \tag{18.8}$$

Here, $P$ is an $(n-p)\times p$ matrix made up of the first $p$ columns of $\Phi$; $Q$ is an $(n-p)\times(n-p)$ matrix made up of the last $n-p$ columns of $\Phi$; $x$ is a column $p$-vector with components $u_1,\dots,u_p$; $y$ is a column $(n-p)$-vector with components $u_{p+1},\dots,u_n$. By virtue of (18.7) the matrix $Q$ is lower triangular with diagonal entries equal to $-1$. It is therefore nonsingular, so the representation

$$y = -Q^{-1}Px \tag{18.9}$$

is valid. Recall that we only have to find the last component of $y$. Using (18.9) we find that the $\beta_j$ from (18.2) constitute the last row of $-Q^{-1}P$. It follows that the complexity of computation of $\beta_j$ is the same as that of computing the last row of $-Q^{-1}P$.

The matrix $Q$ is lower triangular. There is a special factored form representation for the inverse of a lower triangular matrix [95]. It can be obtained, for example, by applying Gauss elimination to $Q$ to transform it to the identity matrix. Taking into account the actual structure of $Q$ we obtain the following representation for $-Q^{-1}$:

$$-Q^{-1} = R_{n-p-1}\cdots R_2 R_1. \tag{18.10}$$

The matrix $R_i$ differs from the identity matrix only in its $i$th column. The entries above diagonal are zero, the diagonal entry equals $+1$, the subdiagonal entries match those of the $i$th column of $Q$.

We will compute the last row of $-Q^{-1}$ using the representation (18.10). The last row of $R_{n-p-1}$ is known.
Suppose we know the last row of the product $R_{n-p-1}\cdots R_i$. Then the last row of $R_{n-p-1}\cdots R_i R_{i-1}$ would equal the product of the last row of $R_{n-p-1}\cdots R_i$ and the matrix $R_{i-1}$. Taking into account the special structure of $R_{i-1}$, we see that only the $(i-1)$th component is to be recomputed. If there are $\delta_{i-1}$ nonzero subdiagonal entries in $R_{i-1}$ then this computation requires $2\delta_{i-1}-1$ additions, subtractions, and multiplications. We have taken into account that the first $i-1$ components of the last row of $R_{n-p-1}\cdots R_i$ are zero for any $i$. With the described algorithm, the last row of $-Q^{-1}$ can be computed using

$$L' = 2\sum_{i=1}^{n-p-2} \delta_i - (n-p-2) \tag{18.11}$$

additions, subtractions, and multiplications.

Suppose that the $i$th column of $P$ includes $\sigma_i$ nonzero components. We can assume without loss in generality that $\sigma_i>0$ for all $i$, since if that is not so, the result of (18.1) just does not depend on some input data item. Clearly, to obtain the last row of $-Q^{-1}P$ it is necessary to perform

$$L'' = 2\sum_{i=1}^{p} \sigma_i - p \tag{18.12}$$

additions, subtractions, and multiplications.

Summing up (18.11) and (18.12) we conclude that the computation of the last row of $-Q^{-1}P$ or, equivalently, the computation of $\beta_j$ from (18.2) requires

$$L = L' + L'' = 2\Bigl(\sum_{i=1}^{n-p-2} \delta_i + \sum_{i=1}^{p} \sigma_i\Bigr) - (n-2)$$

additions, subtractions, and multiplications. But

$$\sum_{k=p+1}^{n} s_k = \sum_{i=1}^{p} \sigma_i + \sum_{i=1}^{n-p-2} \delta_i + \delta_{n-p-1},$$

since the expressions on the left hand and right hand sides represent the total number of coefficients $a_{k_i}$ involved in the computation (18.1) or, which is the same thing, the number of nontrivial entries of the variation matrix of algorithm. As $\delta_{n-p-1}$ equals either 0 or 1, we have, taking into account (18.3):

$$L \le N - p + 2. \tag{18.13}$$
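The backward computation of the last row of $-Q^{-1}P$ can be sketched as follows. This is only an illustration under stated assumptions: the coefficients of (18.1) are stored in a hypothetical dictionary `coeffs[k] = {k_i: a}` with 1-based numbers, and the sample algorithm is invented for the example. The loop recomputes one component of the row per step, as in the description above, and never forms $Q^{-1}$ explicitly.

```python
def run_forward(p, n, coeffs, x):
    """Execute (18.1) once for the input values x = [u_1, ..., u_p]."""
    u = {j + 1: x[j] for j in range(p)}
    for k in range(p + 1, n + 1):
        u[k] = sum(a * u[j] for j, a in coeffs[k].items())
    return u[n]


def recover_beta(p, n, coeffs):
    """Coefficients beta_1..beta_p of (18.2), i.e. the last row of -Q^{-1} P."""
    # column view of the variation matrix: users[j] = [(k, a)] where u_k uses u_j
    users = {j: [] for j in range(1, n + 1)}
    for k, terms in coeffs.items():
        for j, a in terms.items():
            users[j].append((k, a))
    r = {n: 1.0}                       # last component of the last row of -Q^{-1}
    for j in range(n - 1, p, -1):      # one recomputed component per step
        r[j] = sum(a * r[k] for k, a in users[j])
    # multiply the finished row by the columns of P
    return [sum(a * r[k] for k, a in users[j]) for j in range(1, p + 1)]


# Sample algorithm (an assumption): u3 = 2u1 + 3u2, u4 = u1 - u3, u5 = 4u3 + 5u4
coeffs = {3: {1: 2.0, 2: 3.0}, 4: {1: 1.0, 3: -1.0}, 5: {3: 4.0, 4: 5.0}}
beta = recover_beta(2, 5, coeffs)
value = run_forward(2, 5, coeffs, [1.0, 2.0])
print(beta, value)
```

Each coefficient $a_{k_i}$ is touched once in the backward pass, which is why the cost stays close to that of a single forward execution.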
Thus the computation of $\beta_j$ using the above algorithm involves the same amount of arithmetic as a single execution of (18.1). This algorithm is about $p$ times more efficient than the obvious algorithm corresponding to the estimate (18.5). Therefore, if (18.1) is to be executed more than once with the same $a_{k_i}$, the precomputation of $\beta_j$ using the above algorithm followed by the evaluation of (18.2) is computationally better than multiple executions of (18.1). If both $p$ and the number of required executions of (18.1) are small compared to $n$ then the overall computational cost but slightly differs from that of a single execution of (18.1).

This research can be summed up as follows. Suppose that a linear process computes a scalar function of the form (18.2). The function (18.2) is a linear functional; however it is not known. Suppose it is possible to compute its values by means of (18.1). The question is, can the linear functional be recovered, and if the answer is yes, at what computational cost? The traditional method has complexity of $p$ evaluations of the functional. The method we have described actually yields

STATEMENT 18.1. Suppose that the values of a linear functional are computed via a linear algorithm. Then the complexity of the recovery of the functional does not exceed the complexity of a single evaluation of the functional values by the specified algorithm.

We stress that the construction of the effective recovery procedure relies heavily on the structure of the graph of algorithm or, equivalently, on the structure of the variation matrix.
19. Computing Gradient and Derivative

Consider of
the
most
function tional it
f
once more important
from
approach
functions
(17.2)
data.
V *
\ \
=
%
K
grad u
i s the k t h unit (19.1)
Gradients plexity total
k
, . . . ,k
want
being the
t o compute
total
execute
derivatives
themselves.
of
H e r e we
We
the processes o f F.
can
be
have a l l t h e neces-
( 1 9 . 1)
will
in detail.
not discuss Mote
t h e corn-
however
that
the
r e q u i r e d t o compute t h e g r a d i e n t o f u ^ u s i n g
t o p.
there i s nothing surprising i s a p-vector.
computational
cost
t o p. B u t i n f a c t
involving
t o compute this
about
I t i s perfectly
computed v i a a r e c u r r e n t process
proportional ives
sight,
I f we
this.
implementation
first
k,
k = l , 2 , . . . , p.
= e ,
o f u. a r e p - v e c t o r s . et
i s proportional
At
< %
t h e f u n c t i o n s F^
t o do
(19.1)
i
coordinate vector.
amount o f a r i t h m e t i c
(19,1)
we
of
relations
k
concurrently, a l lpartial
computed a l o n g w i t h sary information
tradiimplic-
grad u
1
and
the
i s
'
p
(17.1)
A l l ufc a r e
the recurrent
One
of
o f i t s g r a d i e n t . The
Differentiating
m
Here
form.
investigation
f u n c t i o n s we o b t a i n
3
U j c
the
p r o b l e m p r o c e e d s as f o l l o w s .
i n input
grad
i n i t sgeneral
concerning
i s the computation
to this
(17.1) as i m p l i c i t
t h e a l g o r i t h m 117.1) problems
that
natural
p-vectors.
the gradient of
i s a premature
fact. that
What i tis
Apparently should
be
conclusion.
For l a r g e n, t h e c o m p l e x i t y o f c o m p u t i n g u. and p a r t i a l d e r i v a t ¬ o f F. i s p r a c t i c a l l y i n d e p e n d e n t o f p . The d e p e n d e n c e o n p a p K
peared scribed
i n the reasoning
above
i n terms o f p - v e c t o r s .
only
because
However,
the process
(19.1)
i s de-
the evaluation of the gradient
of $u_n$ can be organized in a different way.

Assume that the computation (17.1) is accomplished. This means that all $u_k$ are evaluated and hence all partial derivatives of all functions $F_k$ that do not depend on the gradients of $u_k$ can be computed. We introduce the notation

$$a_{k_i} = \frac{\partial F_k}{\partial u_{k_i}}, \qquad v_k = \operatorname{grad} u_k \tag{19.2}$$

for all valid $k$ and $k_i$. Now (19.1) can be rewritten as follows:

$$v_k = \sum_{i=1}^{s_k} a_{k_i} v_{k_i}, \tag{19.3}$$

where $v_1,\dots,v_p$ are given. In our case $v_1,\dots,v_p$ equal $e_1,\dots,e_p$, respectively.

With the notation (19.2) the process (19.3) is essentially the same as the process (18.1). If we take the variables $v_1,\dots,v_p$ to be free then for fixed $a_{k_i}$ the variable $v_n$ is a linear combination of the variables $v_1,\dots,v_p$, i.e.

$$v_n = \sum_{j=1}^{p} \beta_j v_j. \tag{19.4}$$

The unknown coefficients $\beta_j$ are to be determined. Since the right-hand side of the equality is to be computed for unit coordinate vectors $v_j$, the $\beta_j$ are the components of the gradient of $u_n$.

We see that the problem of finding the gradient of a function is essentially the same as the problem of recovering the linear functional considered earlier in this chapter. Therefore the complexity of gradient computation should be of the same order as that of the scalar process (19.3).
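The reduction of gradient computation to the recovery of a linear functional can be illustrated with a short sketch. Everything specific here is an assumption made for the example: operations are limited to `+`, `-` and `*` on two arguments, they are stored as hypothetical triples `(op, i, j)` with 1-based argument numbers, and the sample function is invented. The forward pass records the coefficients $a_{k_i}$ of (19.2); the backward pass then recovers the $\beta_j$ of (19.4) with scalar work only.

```python
def gradient(p, ops, x):
    """Return (u_n, [du_n/du_1, ..., du_n/du_p]) for a list of binary operations."""
    u = [None] + list(x)                 # u[1..p] hold the input values
    a = [None] * (p + 1)                 # a[k] = [(k_i, dF_k/du_{k_i}), ...]
    for op, i, j in ops:                 # forward pass: values and partial derivatives
        if op == '+':
            u.append(u[i] + u[j])
            a.append([(i, 1.0), (j, 1.0)])
        elif op == '-':
            u.append(u[i] - u[j])
            a.append([(i, 1.0), (j, -1.0)])
        elif op == '*':
            u.append(u[i] * u[j])
            a.append([(i, u[j]), (j, u[i])])
        else:
            raise ValueError("unsupported operation: " + op)
    n = len(u) - 1
    w = [0.0] * (n + 1)                  # w[k] accumulates du_n/du_k
    w[n] = 1.0
    for k in range(n, p, -1):            # backward pass over the recorded coefficients
        for j, d in a[k]:
            w[j] += d * w[k]
    return u[n], w[1:p + 1]


# Sample function (an assumption): u3 = u1*u2, u4 = u3 + u1, u5 = u4*u3
ops = [('*', 1, 2), ('+', 3, 1), ('*', 4, 3)]
value, grad = gradient(2, ops, [2.0, 3.0])
print(value, grad)
```

The backward loop visits every recorded coefficient once, so its cost does not grow with the number of inputs $p$, in contrast to propagating $p$-vectors through (19.1).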
We again choose to regard the recurrent relations (19.3) as a system of linear algebraic equations

\[ \sum_{i=1}^{s_k} a_{k_i} v_{k_i} - v_k = 0, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k, \tag{19.5} \]

this time with respect to the $p$-vectors $v_k$. Only the variable $v_n$ is to be found. The vector variables $v_1, \dots, v_p$ are considered free. The matrix of the system (19.5) is the tensor product $\Phi \times E_p$, where $\Phi$ is the variation matrix of the algorithm (17.1) (if (19.2) is taken into account) and $E_p$ is a $p \times p$ identity matrix. We rewrite (19.5) as follows:

\[ (P \times E_p) X + (Q \times E_p) Y = 0, \tag{19.6} \]

where $P$ and $Q$ are the same matrices as in (18.8); $X$ is a $p^2$-vector formed by chaining together $v_1, \dots, v_p$; $Y$ is an $(n-p)p$-vector formed by chaining together $v_{p+1}, \dots, v_n$. Making use of the properties of the tensor product of matrices we obtain from (19.6)

\[ Y = -(Q^{-1}P \times E_p) X. \tag{19.7} \]

Only the last $p$ components of $Y$ must be determined. It follows from (19.7) that to find them it is sufficient to find the last row of $-Q^{-1}P$. The entries of the last row are exactly the components of $v_n$ or, which is the same thing, the components of the gradient of $u_n$. The problem of computing the last row of $-Q^{-1}P$ was considered in detail in §18. It was shown that it can be solved at a computational cost that is essentially independent of $p$.

Let us dwell on a particular case where the functions $F_k$ stand for one of the following operations: addition or subtraction of two numbers, multiplication of two numbers, finding the inverse of a number. For simplicity assume that these operations are equivalent as regards the complexity of their implementation. To compute $u_n$, $n-p$ such operations are required. To compute the gradient of $u_n$ using the above algorithm, we must find the nonzero entries of the variation matrix of the algorithm and compute the last row of $-Q^{-1}P$.

Let $F_k = u_{k_1} \pm u_{k_2}$. With the notation (19.2), (19.3) we have $a_{k_1} = 1$, $a_{k_2} = \pm 1$. Thus, no overhead is involved in finding $a_{k_i}$. Let $F_k = u_{k_1} \times u_{k_2}$. Then $a_{k_1} = u_{k_2}$, $a_{k_2} = u_{k_1}$. If the process (17.1) is accomplished then $u_{k_1}$ and $u_{k_2}$ are known and there is no overhead associated with evaluating $a_{k_i}$. Finally, let $F_k = u_{k_1}^{-1}$. We have $a_{k_1} = -u_k^2$. If the process (17.1) is accomplished then a single multiplication is required to find $a_{k_1}$.

Thus for all considered operations and for every $k$ the number of coefficients $a_{k_i}$ plus the number of extra operations required to compute them equals 2. In case the $F_k$ stand for addition, subtraction, or multiplication by a constant, we can either treat constants as input data or consider these cases separately. However, in all cases the number of coefficients $a_{k_i}$ plus the number of extra operations required to compute them is not greater than 2 for every $k$.

It follows from (18.3) and (18.13) that the last row of $-Q^{-1}P$ can be computed in $L_1$ operations, where
\[ L_1 = 2 \sum_{k=p+1}^{n} s_k - (n-2). \]

By the above remark, $L_2$ additional operations are required to compute the entries of the variation matrix of the algorithm, where

\[ L_2 = \sum_{k=p+1}^{n} (2 - s_k). \]

Consequently, the number $L$ of additional operations required to compute the gradient of $u_n$ is not greater than

\[ L_1 + L_2 = \sum_{k=p+1}^{n} s_k + n - 2p + 2. \]

Finally, taking into account that $s_k \le 2$ for all $k$ for the considered set of operations, we have that in all cases

\[ L \le 3(n-p) - p + 2. \]

Note that the process (17.1) has to be accomplished before the gradient can be evaluated. That process requires $n-p$ operations. Therefore we have the following

STATEMENT 19.1. Suppose that evaluating a function at a given $p$-dimensional point involves only additions, subtractions, multiplications, and inverse value computations. Then both the function and its gradient can be computed at that point in a number of operations not exceeding four times the number of operations required to evaluate the function.

The major benefit of this result is that the cost of gradient computation is independent of the dimension. Unfortunately, the described algorithm makes heavy demands on memory. The overall memory size is proportional to the number of operations required to evaluate the function. Although large, the memory is used in a very simple mode. Roughly, an array of data is saved in memory and then retrieved
in the reverse order. Our gradient algorithm may not be very efficient in the solution of multidimensional problems if the functions involved are not very complicated.

We have proven Statement 19.1 for the case where the basic operations are additions, subtractions, multiplications, and inverse value computations. We could consider division instead of inverse value computation. The overall scheme of fast gradient computation would not undergo any changes. Some care should be exercised when computing the last row of $-Q^{-1}P$. Statement 19.1 would hold if we substituted "divisions" for "inverse value computations" in the premise and "five times" for "four times" in the last sentence. This is consistent with the result of [12]. Note that most computers do not perform division as a basic operation.

If $p = 1$ then computing the gradient amounts to computing a derivative. There is no need now to apply the above algorithm; rather, we can use the recurrent relations (19.1) directly. The right-hand side quantities $u_{k_i}$ are used to compute the partial derivatives of $F_k$. It follows that we must have $u_{k_i}$ and the derivatives $u'_{k_i}$ in memory at the same time. Storing and retrieving of them in the derivative computation match those in the function computation (to a few unessential details). Note that computing the derivative using (19.1) requires memory to store $u'_k$ of the same size as that required to store $u_k$. This does not include the memory required to compute the $u_k$ themselves. If an algorithm is a sequence of additions, subtractions, multiplications, and inverse value computations, then the overhead of computing the derivative using (19.1) is not greater than three times the time of evaluating the function.

If a derivative of order greater than one is to be computed, little change occurs. The required memory size grows proportionally to the order of the derivative being increased by one. However, the computational cost grows much faster.
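The scheme behind Statement 19.1 (save the intermediate values during the function evaluation, then retrieve them in the reverse order) can be illustrated with a minimal sketch. It uses the standard reverse accumulation of the coefficients $a_{k_i}$, which yields the same gradient as the last row of $-Q^{-1}P$ but is not claimed to reproduce the factorization of §18. The function name `evaluate_with_gradient` and the encoding of the algorithm as a list of `(operation, i, j)` triples are assumptions made for this illustration only.

```python
def evaluate_with_gradient(ops, x):
    """Evaluate a straight-line algorithm and its gradient.

    ops is a list of (name, i, j) triples; i and j are 0-based indices of
    earlier values (inputs first, then results in order); j is None for 'inv'.
    Hypothetical encoding, chosen only for this sketch.
    """
    u = list(x)          # u[0..p-1] are the input data
    coeffs = []          # coeffs[m] lists (index, a) with a = dF_k / du_index
    for name, i, j in ops:
        if name == "add":
            u.append(u[i] + u[j])
            coeffs.append([(i, 1.0), (j, 1.0)])      # no extra operations
        elif name == "sub":
            u.append(u[i] - u[j])
            coeffs.append([(i, 1.0), (j, -1.0)])     # no extra operations
        elif name == "mul":
            u.append(u[i] * u[j])
            coeffs.append([(i, u[j]), (j, u[i])])    # operands already known
        elif name == "inv":
            v = 1.0 / u[i]
            u.append(v)
            coeffs.append([(i, -v * v)])             # one extra multiplication
        else:
            raise ValueError("unsupported operation: " + name)

    p = len(x)
    w = [0.0] * len(u)   # w[k] accumulates d u_n / d u_k
    w[-1] = 1.0
    # Reverse sweep: the stored coefficients are retrieved in reverse order.
    for k in range(len(u) - 1, p - 1, -1):
        for index, a in coeffs[k - p]:
            w[index] += a * w[k]
    return u[-1], w[:p]
```

For instance, the function $x_1 x_2 + 1/x_1$ would be encoded as `[("mul", 0, 1), ("inv", 0, None), ("add", 2, 3)]`. The work of the reverse sweep is proportional to the number of stored coefficients, not to the number of inputs, which is the point of the statement; the stored list `coeffs` is the memory overhead discussed above.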
20. Roundoff Error Analysis

One of the most difficult problems related to the exploration and development of numerical methods is the estimation of the influence of roundoff errors on the computational process. Despite the significant progress in that field, roundoff error analysis is still a tedious and difficult task, relying at that on some ingenious trickery for every concrete algorithm.

This is so due to the absence of a general theory of estimating roundoff error influence. Moreover, there are no standalone theorems explaining the similarity of error estimation processes for different algorithms, pointing out bottlenecks, determining whether an error analysis is feasible at all, etc. The forward and backward error analysis approaches are not constructive and provide no help as to the best way of obtaining estimates. It is not even clear what mathematical problem is being solved during error estimating and what it has to do with the underlying computational process. From that point of view, all profound achievements of roundoff error analysis are united by the uniformity of representation of research results rather than by methods of obtaining them.

Suppose that an operator $F$ (nonlinear in the general case) has a $p$-dimensional domain and an $r$-dimensional range. Suppose that the image $v = F(u)$ of an element $u$ is evaluated. Due to roundoff errors, the computed element $\tilde v$ will seldom be exactly equal to $v$. This $\tilde v$ is naturally assumed to be the approximate image of $u$. Of course, the question arises, "How can we assess the quality of approximation of $v$ by $\tilde v$?" The answer seems to be self-evident: obtain an estimate of $v - \tilde v$ in some norm and arrive at a verdict. Since $v$ is not known exactly, our only option is to estimate $v - \tilde v$ dynamically as the computation unfolds. Updating the estimate step by step we can eventually acquire the estimate for the norm of $v - \tilde v$. This is the basic idea of forward error analysis. There are no ulterior concepts behind it.

It is intuitively clear that forward error analysis can always be carried through. If the resulting error estimate is sufficiently small then the analysis succeeded. However, a large estimate does not necessarily imply that forward error analysis was carried out poorly. A large estimate may merely signify that the operator $F$ is unstable in the neighborhood of $u$. In this case no modification of the computational
process would improve the situation. It becomes clear that we must focus on the properties of the operator $F$ itself.

The desire to single out the above situation prompted the development of backward error analysis. Its underlying idea is also fairly simple. It consists in the attempt to represent the computed element $\tilde v$ as an exact image of some $\tilde u$, obtained by applying $F$ exactly. If the attempt is successful then roundoff error influence can be estimated by the norm of $u - \tilde u$, which is referred to as the equivalent perturbation. There is nothing else to the concept of backward error analysis.

Despite the simplicity of the general idea, backward error analysis yielded fantastic results. It was discovered that for a large number of algorithms equivalent perturbation estimates do not depend on whether the operator $F$ is stable or unstable. The identification of such algorithms is of paramount importance for computational mathematics. It gives us the confidence that no other algorithms can possibly yield results far superior as regards accuracy.

From a formal viewpoint, backward error analysis is the process of evaluating the equivalent perturbation $u - \tilde u$, provided it exists. To make the discussion of existence conditions concrete, let $F$ map a $p$-dimensional neighborhood of $u$ onto an $r$-dimensional neighborhood of $v = F(u)$. Such a neighborhood surely exists if the Jacobian of $F$ has rank $r$ at $u$. If $\tilde v$ is in the neighborhood of $v$ then at least one inverse image of $\tilde v$ exists in the neighborhood of $u$. It may seem that we can thus always reach a conclusion on the feasibility of backward error analysis. In practice, however, the issue is not clear. These a priori criteria of existence are only local ones, whereas our methods of equivalent perturbation estimating involve a lot of details of evaluating $F$, and it is not clear whether the conditions for the analysis to exist remain the same. To understand the emerging problems, to formulate questions and find solutions, we must understand the relation of those problems to the particular details of the algorithm, the evaluation of $F$, the properties of the operator $F$ itself,
etc.

Consider error propagation in the algorithm (17.1) of evaluating the function (17.2). The formulas (17.1) represent exact computations. Actual computations satisfy

\[ \tilde u_k = \tilde F_k(\tilde u_{k_1}, \dots, \tilde u_{k_{s_k}}), \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k, \]

or, equivalently,

\[ \tilde u_k = F_k(\tilde u_{k_1}, \dots, \tilde u_{k_{s_k}}) + \eta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \tag{20.1} \]

Here $\tilde F_k$ is a function "close to" $F_k$ that is actually computed instead of $F_k$; $\tilde u_k$ is the quantity actually computed instead of $u_k$ from (17.1); $\eta_k$ is the equivalent absolute error of the computed value of $F_k$. In the general case the errors may or may not depend on $\tilde u_{k_1}, \dots, \tilde u_{k_{s_k}}$. For now we assume that the errors $\eta_k$ are known.

We strive to represent the computation (20.1) as follows:

\[ \tilde u_k + \varepsilon_k = F_k(\tilde u_{k_1} + \varepsilon_{k_1}, \dots, \tilde u_{k_{s_k}} + \varepsilon_{k_{s_k}}), \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \tag{20.2} \]

If that can be done, then (20.2) allows of a simple interpretation. Take the perturbed input data $u_1 + \varepsilon_1, \dots, u_p + \varepsilon_p$ and carry out the computation (17.1) exactly. Then we will obtain exactly $\tilde u_k + \varepsilon_k$ at each step of that computation. All computed $\tilde u_k$, including the input data, are perturbed by $\varepsilon_k$. Thus we are dealing with three sequences: $u_k$ corresponds to exact computations over exact input data $u_1, \dots, u_p$; $\tilde u_k$ is the sequence actually computed from the input data $u_1, \dots, u_p$; finally, the sequence
$\tilde u_k + \varepsilon_k$ corresponds to exact computations over perturbed input data $u_1 + \varepsilon_1, \dots, u_p + \varepsilon_p$. These sequences satisfy the recurrent relations (17.1), (20.1), and (20.2). The sequences $u_k$ and $\tilde u_k$ exist since we assume the algorithm can be executed both exactly and in the presence of roundoff errors. The description of the set of valid sequences $\varepsilon_k$ is an open question for now.

If (20.2) exists then the perturbations $\varepsilon_k$ of the quantities $\tilde u_k$ make $\tilde u_k + \varepsilon_k$ the results of an exact implementation of (17.1) over perturbed input data. Since rounding errors have thus disappeared from the computation we can regard $\varepsilon_k$ as perturbations of the computed $\tilde u_k$ balancing the influence of roundoff errors. In other words, the $\varepsilon_k$ have the same effect as roundoff errors and they are therefore called equivalent perturbations.

Note that here we extend the commonly accepted notion of equivalent perturbations, traditionally applied to input data, to all intermediate data and to the final results. This is a principal difference allowing us to grasp the subtle points of error propagation in concrete computational processes.

At least one sequence $\varepsilon_k$ satisfying (20.2) always exists. Indeed, set $\varepsilon_1 = \dots = \varepsilon_p = 0$. Analysing (17.1) and (20.2) we see that there is the unique solution $\varepsilon_k = u_k - \tilde u_k$ for all the remaining $\varepsilon_k$. It follows that in this case the sequence $\varepsilon_k$ is actually the absolute error of the computed $\tilde u_k$. This corresponds to forward error analysis.

In case $\varepsilon_n$ is large we have to try and unearth the reasons for the appearance of the large error and look for the ways to eliminate it. A large error may be caused by the poor quality of the algorithm (17.1) of evaluating the function (17.2), or by the instability of the function (17.2) itself, or by a combination of both factors. Investigation of the influence of rounding errors is a very complicated task. It is therefore important to have guarantees that a large error is not caused by some peculiarities of the algorithm.

Suppose that we managed to find such $\varepsilon'_k$ satisfying (20.2) that $\varepsilon'_n = 0$. The computation (20.2) is exact for perturbed input data. Consequently, $\tilde u_n$ computed in the presence of roundoff errors may be represented as follows:
\[ \tilde u_n = f(u_1 + \varepsilon'_1, \dots, u_p + \varepsilon'_p). \tag{20.3} \]

If the equivalent perturbations $\varepsilon'_1, \dots, \varepsilon'_p$ are small then the quantity $\tilde u_n$ is the result of the exact evaluation of the function (17.2) with weakly perturbed input data. Now a large error in $\tilde u_n$ unambiguously testifies that the function (17.2) is unstable. If the equivalent perturbations $\varepsilon'_1, \dots, \varepsilon'_p$ in input data from (20.3) are of magnitude not much greater than those of rounding errors then no other algorithm to evaluate (17.2) can yield substantially more accurate results. This approach to the investigation is referred to as backward error analysis, as mentioned above. We also note that it is this approach that helped to develop very accurate algorithms for a wide variety of linear algebraic problems [89, 95].

All sequences $\varepsilon_k$ satisfying (20.2) are sequences of equivalent perturbations. The sequence with zero perturbations of the input data describes forward error analysis; it always exists, i.e. forward error analysis can always be carried through. A sequence with zero perturbation of the result describes backward error analysis; the feasibility of this analysis is an open question. Other sequences describe a mixed error analysis, with the influence of roundoff errors reflected both in the input data and in the results. All in all, the problem of practical determination of equivalent perturbations remains, regardless of the approach.

To describe the equivalent perturbations, we obtain relations that make it possible to determine the quantities $\varepsilon_k$. We compare (20.1) and (20.2), eliminate $\tilde u_k$, and arrive at the following system of equations:

\[ F_k(\tilde u_{k_1} + \varepsilon_{k_1}, \dots, \tilde u_{k_{s_k}} + \varepsilon_{k_{s_k}}) - F_k(\tilde u_{k_1}, \dots, \tilde u_{k_{s_k}}) - \varepsilon_k = \eta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k, \tag{20.4} \]

where the errors $\eta_k$ are determined by the computation
(20.1). It follows that setting arbitrary equivalent perturbations $\varepsilon_1, \dots, \varepsilon_p$ of input data we can find (at least in theory) successively and uniquely the rest of the equivalent perturbations $\varepsilon_k$ for $k > p$. Given $\tilde u_k$ and $\eta_k$, the system (20.4) is in the general case nonlinear with respect to $\varepsilon_k$.

To obtain a description of the set of solutions of (20.4) we have to make some assumptions. There will be two of them. The first assumption is based on the smallness of local errors $\eta_k$. This justifies limiting the discussion to small equivalent perturbations only. The second assumption is that the functions $F_k$ are sufficiently smooth in the neighborhood of $\tilde u_{k_1}, \dots, \tilde u_{k_{s_k}}$. In practice the functions $F_k$ implement exactly computed addition, multiplication, etc. and thus obviously satisfy the requirement.

Taking all these assumptions into account, we replace the nonlinear system (20.4) by the linear system

\[ \sum_{i=1}^{s_k} \frac{\partial F_k}{\partial \tilde u_{k_i}} \varepsilon_{k_i} - \varepsilon_k = \eta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \]

This system is always compatible as the rank of its matrix equals $n - p$, i.e. the number of equations in the system. It follows that for small $\eta_k$ all $\varepsilon_k$ can certainly be chosen small. Therefore if we substitute the exact values $u_k$ for the computed quantities $\tilde u_k$ we only make an error of higher order of smallness for small $\varepsilon_k$. Now we consider the following system:

\[ \sum_{i=1}^{s_k} \frac{\partial F_k}{\partial u_{k_i}} \varepsilon_{k_i} - \varepsilon_k = \eta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \tag{20.5} \]
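The successive determination of the $\varepsilon_k$ from the linear system (20.5), once the input perturbations $\varepsilon_1, \dots, \varepsilon_p$ and the local errors $\eta_k$ are fixed, can be sketched as follows. This is only an illustration under stated assumptions: each equation of (20.5) is supplied as a list of `(index, coefficient)` pairs with 0-based indices, and the name `propagate_perturbations` is hypothetical.

```python
def propagate_perturbations(rows, eps_in, eta):
    """Solve (20.5) successively for eps_{p+1}, ..., eps_n.

    rows[m] lists (index, a) pairs, a = dF_k/du_index, for k = p + 1 + m
    (hypothetical encoding); eps_in holds the input perturbations;
    eta[m] is the local error of the same operation.
    """
    eps = list(eps_in)
    for row, eta_k in zip(rows, eta):
        # eps_k = sum_i a_{k_i} * eps_{k_i} - eta_k, rearranged from (20.5)
        eps.append(sum(a * eps[index] for index, a in row) - eta_k)
    return eps
```

As a usage example, take $u_3 = u_1 u_2$, $u_4 = u_3 + u_1$ at $u_1 = 2$, $u_2 = 3$, so that `rows = [[(0, 3.0), (1, 2.0)], [(2, 1.0), (0, 1.0)]]`. With zero input perturbations the last entry of the returned list is the first-order value of $u_4 - \tilde u_4$, i.e. the forward error analysis case.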
If the $\eta_k$ are small compared to unity then all small solutions $\varepsilon_k$ of the linear system (20.5) describe all small solutions of the nonlinear system (20.4) if we ignore second-order terms. Hence they describe the set of all small equivalent perturbations from (20.2). Of course, there may be large equivalent perturbations for small $\eta_k$. However, such perturbations are not particularly interesting for investigation. Note that the matrix of the system (20.5) is precisely the variation matrix of the algorithm (17.1) where the error propagation is being studied.

We will be forced to impose some restrictions on equivalent perturbations in the course of our error analysis. This in its turn produces some constraints on the possibility of the analysis itself, i.e. on conditions of system compatibility. Let us consider the backward error analysis in greater detail.

STATEMENT 20.1. For the backward error analysis with small equivalent perturbations to exist for all small local errors $\eta_k$ in the computation of $u_n$ from (17.1), it is necessary and sufficient that the gradient of the function (17.2) at $u_1, \dots, u_p$ be nonzero.

For the backward error analysis with small equivalent perturbations to exist it is necessary and sufficient that the linear system (20.5) have the solution satisfying $\varepsilon_n = 0$ for arbitrary small local errors. This is equivalent to the demand that the submatrix formed by the first $n-1$ columns of the variation matrix of the algorithm have rank $n-p$. This condition is consistent since $p \ge 1$ and therefore we always have $n - p \le n - 1$, i.e. the number of rows of the submatrix is not greater than that of its columns. We denote the said submatrix by $\Phi'$.

In order to determine the rank of $\Phi'$ we can multiply it by any nonsingular matrix. The matrix can be chosen so as to obtain the simplest structure. We consider $Q^{-1}\Phi'$, where $Q$ is from (19.6). In this case the last $n-p-1$ columns of $Q^{-1}\Phi'$ would match the first $n-p-1$ columns of the identity matrix of order $n-p$. In particular, the last $n-p-1$ entries of the last row of $Q^{-1}\Phi'$ are zero. If the first $p$ entries of the last row are zero as well, then $Q^{-1}\Phi'$ (and hence the matrix $\Phi'$) cannot have rank $n-p$. It follows that for $\Phi'$ to have rank $n-p$ it is necessary that there be at least one nonzero entry among the first $p$ entries of the last row of $Q^{-1}\Phi'$. It is easy to see that this condition is sufficient as well as necessary. If, for instance, the $i$th of those entries is nonzero then the minor of $Q^{-1}\Phi'$ composed of its $i$th column and of its last $n-p-1$ columns is nonzero. Now we only have to recall that the first $p$ entries of the last row of $Q^{-1}\Phi'$ are opposite to the components of the gradient of the function (17.2) at $u_1, \dots, u_p$.

Thus we have exhaustively explored the issue of the existence of backward error analysis for algorithms that evaluate functions. It should be stressed that whether backward error analysis is feasible or not does not depend on the algorithm in any way. It is but the complexity of the analysis that the algorithm has an effect on. This is a consequence of our treatment of backward error analysis as the solution process for the system (20.5).

Both in theory and in practice the result of execution of the algorithm (17.1) is seldom the single value $u_n$. As a rule, the result is a set of values $u_{q_1}, \dots, u_{q_r}$, where $p < q_1, \dots, q_r \le n$. Of course, $u_n$ should be among $u_{q_1}, \dots, u_{q_r}$, or else there would be no reason to carry the computation (17.1) out up to its end. For now, however, we ignore this circumstance. Certainly, forward error analysis in this case can be performed just as before and its existence is not questioned. As for backward error analysis, the conditions for it to exist are formulated in a different way.

STATEMENT 20.2. Let the set of values $u_{q_1}, \dots, u_{q_r}$ be the result of the algorithm (17.1). For the backward error analysis with small equivalent perturbations to exist for all small local errors $\eta_k$ in the computation of $u_{q_1}, \dots, u_{q_r}$ from (17.1), it is necessary and sufficient that the Jacobi matrix of $u_{q_1}, \dots, u_{q_r}$ taken at $u_1, \dots, u_p$ have rank $r$.

The proof is almost the same as that of Statement 20.1. Therefore we will be brief. For the backward error analysis with small equivalent perturbations to exist for small local errors $\eta_k$ it is necessary and sufficient that the linear system (20.5) have the solution satisfying
\[ \varepsilon_{q_1} = \dots = \varepsilon_{q_r} = 0 \]

for arbitrary small local errors. This is equivalent to the demand that the submatrix obtained from the variation matrix of the algorithm by removing the columns numbered $q_1, \dots, q_r$ have rank $n-p$. Denote that submatrix by $\Phi'$ and consider $Q^{-1}\Phi'$. The first $p$ entries of the rows of $Q^{-1}\Phi'$ corresponding to $u_{q_1}, \dots, u_{q_r}$ are opposite to the components of the gradients of $u_{q_1}, \dots, u_{q_r}$. Taken together, these entries form (possibly with different sign) the Jacobi matrix of $u_{q_1}, \dots, u_{q_r}$ at $u_1, \dots, u_p$. The last $n-p-r$ entries of those rows are zero. The Statement is thus verified.

COROLLARY. If an algorithm has more results than input data then backward error analysis may not exist even if all local errors $\eta_k$ are small.

Certainly, under these conditions the Jacobi matrix has more rows than columns. Hence its rank cannot equal the number of its rows.

So far we have been considering scalar algorithms (17.1). This is not always adequate. In some cases it is convenient to regard $\tilde u_k$, $u_k$, $\eta_k$ as vectors. A sample situation of this kind is the use of subroutines in an algorithmic language (say, FORTRAN). Sometimes we do not even know, or are not interested in, just how this or that portion of the algorithm is implemented. Only the net result of its execution matters. The overall scheme of error analysis remains the same, but in the relations (20.1)-(20.4) the functions are now vector functions. Let $u_k$ be a $q_k$-vector. The system (20.5) for the determination of scalar equivalent perturbations is now replaced by a system of the form

\[ \sum_{i=1}^{s_k} \alpha_{k,k_i} \varepsilon_{k_i} - \varepsilon_k = \eta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \tag{20.6} \]

Here $\varepsilon_k$ and $\eta_k$ are $q_k$-vectors, $\alpha_{k,k_i}$ are $q_k \times q_{k_i}$ rectangular matrices. Each $\alpha_{k,k_i}$ is the Jacobi matrix of the components of $u_k$ over the components of $u_{k_i}$. The overall
matrix $\mathcal{F}_k$ composed of $\alpha_{k,k_1}, \dots, \alpha_{k,k_{s_k}}$ is the Jacobi matrix of the components of $u_k$ over the components of $u_{k_1}, \dots, u_{k_{s_k}}$. The columns of $\mathcal{F}_k$ are scattered among the columns of the matrix of the system (20.6). Enumerate the block rows and columns of that matrix in some way determined by the enumeration of the variables, in the same manner as we did to obtain the scalar matrices shown in Fig. 17.1. Rows and columns pertaining to the components of one and the same vector $u_k$ are enumerated continuously. If there are no data broadcasts in the algorithm then the resulting matrix will have the form as in Fig. 20.1.

[Fig. 20.1]

This is an upper block triangular matrix. All diagonal blocks are block diagonal and composed of the matrices $\mathcal{F}_k$. The block column corresponding to the algorithm results belongs to the hatched-over portion of the matrix. Removing it would not change the matrix structure. Hence we have

STATEMENT 20.3. Suppose there are no data broadcasts in the algorithm. For the backward error analysis with small equivalent perturbations to exist for all small local errors $\eta_k$ in the computation of vector or scalar $u_{q_1}, \dots, u_{q_r}$ from (17.1) it is necessary and sufficient
that this same computation allow of a local backward error analysis of each operation $F_k$.

Indeed, for the global backward error analysis to exist it is necessary and sufficient that the variation matrix have full row rank. If the matrix is as in Fig. 20.1 then it has full row rank if and only if each $\mathcal{F}_k$ has full row rank. By Statement 20.2 this is the condition of existence of local backward error analysis for each operation $F_k$.

Until recently, the issue of existence of backward error analysis was directly linked to its constructiveness. There were many pitfalls in the course of analysis. Some of them were due to the non-uniqueness of local backward error analysis, some others appear after data broadcasts. Suppose, for instance, that there are data broadcasts in an algorithm. Then some equivalent perturbations have to be related to each other in some way. Those relations may be deadlocked due to incorrect values ascribed to the equivalent perturbations, i.e. due to incorrect attempts to solve the systems (20.5) and (20.6), and not because of the theoretical impossibility of the analysis.

Statements 20.1 and 20.2 solve exhaustively the question of the existence of backward error analysis that is viewed as the solution process of the systems (20.5) and (20.6). It follows that the practical and theoretical feasibility of backward error analysis coincide. These issues are related to properties of the systems (20.5) and (20.6), i.e. to properties of the variation matrix of the algorithm. The algorithm itself only affects the values of equivalent perturbations and the complexity of their determination.

The use of the systems (20.5), (20.6) is somewhat peculiar. The first peculiarity is that the matrix entries are generally not available until the completion of the algorithm. This means that the systems (20.5) and (20.6) are more suited for a posteriori estimating of equivalent perturbations, although they are helpful with the a priori analysis as well. The second peculiarity is that the right-hand sides are not known: we only have
upper bounds for their absolute values. Moreover, even these bounds are usually known a posteriori. The implication of it all is that a posteriori error estimating is probably more natural than a priori one.

Let $|\eta_k| \le \theta_k$, where $\theta_k$ is the vector of a posteriori upper bounds of the corresponding components of the error vector $\eta_k$. Then using (20.6) we have

\[ \Bigl| \sum_{i=1}^{s_k} \alpha_{k,k_i} \varepsilon_{k_i} - \varepsilon_k \Bigr| \le \theta_k, \qquad p < k \le n, \quad k_1, \dots, k_{s_k} < k. \tag{20.7} \]

It follows that the equivalent perturbation vector belongs to some linear polyhedron that can be determined a posteriori. We are to choose among the equivalent perturbations from (20.7) those with specific additional properties. These properties depend on the particular method of analysis and on its objective. In particular, backward error analysis requires that the equivalent perturbations corresponding to the results should be zero. The objective of it is to minimize the equivalent perturbations of the input data.

(20.7) shows that the analysis of small roundoff errors is essentially, to an approximation of order 1, a mathematical programming problem with linear constraints. The objective function may be either linear or nonlinear. If we base on the exact equations (20.4) we will have a problem with nonlinear constraints. It should be noted, however, that this nonlinearity is rather special.
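To make the backward error analysis of a scalar algorithm more tangible, here is a minimal sketch for the simplified case where the local errors $\eta_k$ are known exactly rather than only bounded as in (20.7). Requiring $\varepsilon_n = 0$ in (20.5) leaves one linear condition on the input perturbations, $\sum_j g_j \varepsilon_j = \sum_{k>p} w_k \eta_k$, where $g$ is the gradient of $u_n$ and $w_k = \partial u_n / \partial u_k$; the sketch picks the solution of least Euclidean norm, which is one possible objective and not necessarily the one intended in the text. The function names and the `(index, coefficient)` row encoding are assumptions of this illustration.

```python
def backward_input_perturbations(rows, p, eta):
    """Least-norm input perturbations giving eps_n = 0 in (20.5).

    rows[m] lists (index, a) pairs, a = dF_k/du_index, for k = p + 1 + m
    (hypothetical encoding); eta[m] is the local error of that operation.
    """
    n = p + len(rows)
    w = [0.0] * n                      # w[k] = d u_n / d u_k
    w[-1] = 1.0
    for k in range(n - 1, p - 1, -1):  # reverse sweep over the operations
        for index, a in rows[k - p]:
            w[index] += a * w[k]
    grad = w[:p]
    c = sum(w[p + m] * eta[m] for m in range(len(rows)))
    g2 = sum(g * g for g in grad)
    if g2 == 0.0:
        # Statement 20.1: a zero gradient rules out the analysis in general.
        raise ValueError("gradient is zero")
    return [c * g / g2 for g in grad]


def propagate(rows, eps_in, eta):
    """Successive solution of (20.5), used here only to check the result."""
    eps = list(eps_in)
    for row, eta_k in zip(rows, eta):
        eps.append(sum(a * eps[index] for index, a in row) - eta_k)
    return eps
```

Feeding the returned input perturbations into `propagate` should give a final perturbation that vanishes up to rounding, which is the defining property of backward error analysis. With only the bounds $\theta_k$ available, the same condition would become a constraint set of the kind described by (20.7).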
21. Examples

EXAMPLE 21.1. We demonstrate the technique of fast gradient evaluation. Suppose the algorithm consists in computing

\[ u_k = \alpha u_{k-p} + \beta, \tag{21.1} \]

where the notation is the same as in §17, $\alpha$ and $\beta$ are some given numbers. The exact representation can be readily obtained:

\[ u_n = \alpha^{\pi} u_{\delta} + (\alpha^{\pi-1} + \alpha^{\pi-2} + \dots + 1)\beta. \tag{21.2} \]
H e r e IT I s t h e q u o t i e n t o f n d i v i d e d by p , 6 i s t h e r e s i d u a l . S u p p o s e were n o t a b l e ate
to obtain
the gradient
would
of
the exact
we
r e p r e s e n t a t i o n and s t a r t e d t o e v a l u -
by t h e d i r e c t
d i f f e r e n t i a t i n g o f ( 2 1 . 1 ) . Then
we
have
g r a d it
I f we
ignore
(n-p)p
operations
Now
u.
t h e s p e c i a l f o r m o f t h i s r e l a t i o n we
t h i s computation
Forming
= a grad
to evaluate i s linear
consider
the gradient
i n p as i t should
the gradient
the v a r i a t i o n
evaluation
have t o p e r f o r m
o f u . The
complexity
algorithm discussed
n-p
of
be.
m a t r i x * o f the a l g o r i t h m
<- - p - -» *
shall
( 2 1 . 1 ) , we
i n §19.
obtain
-»
-1
* =
n-p
r
». -1
a
The
matrices
last
n-p
P and Q f r o m
c o l u m n s o f * r e s p e c t i v e l y . The
diagonals.
Specifically,
According 1
To
-Q square
ft. may then p>l
do
= a. i f i - j
-t.j
t o t h e a l g o r i t h m we
this,
make u s e
that only
be n o n z e r o .
now
(n-p)x(ri-p)
* has o n l y
Taking
two
nonzero
have
= -p.
t o compute t h e l a s t
i n t o account
(18. 10).
row o f
A l l ft^ a r e
the structure of 0
we
one o f t h e s u b d l a g o n a l e n t r i e s o f t h e i t h column o f I f i£n-2p then
I t I s I n r o w p + i a n d e q u a l s a.
t h e ft a r e i d e n t i t y m a t r i c e s the last
matrix
p a n d by t h e
= 0 and S ^ j = - 1 i f i - j
of the representation
matrices.
[n-p)x(n-p)
conclude
(18.8) a r e formed by t h e f i r s t
f o r n-2p
i s
t h e
multiplication
of
row o f t h e p r o d u c t "n_p_,-• • identity
matrix.
Bight
R
< i ^ n-p-1.
n
_
I f p>l
therefore f o r
l a s t
r o u
o f
t h e
2 p + 1
that
row
by
the
188 matrix R - _ n
H. R i g h t
2 p
will
R
modifies
only
multiplications
w i l 1
n o t change
m o d i f v
t n e
("
R
n-3p
that
t h e nonzero
n-pl, stance an R^
i n c o l u m n n-2p m a k i n g
i t s entries. -3
entries
w h e r e 1=1,2
one e n t r y
PHh
Only
entry,
o f the last
c , a n d be e q u a l
to a
3
Now
special
t o compute t h e l a s t
1
will
we
columns
t h e circum-
of
we i g n o r e
r o w o f -Q
1
equal
zero,
except
result with
2(n-p)
the gradient
operations.
Statement
EXAMPLE
this
Thus,
a l l components o f the ff
which
equals a .
1
r o w o f -Q
t o compute t h e g r a d i e n t i t took
o f u ^ . The
Statement
19. 1
Again,
we w i l l
i f
have t o
components.
n o t more t h a n p o p e r a -
computation
( 2 1 . 1)
i s corroborated,
takes
as w e l l
as
18. 1 .
U j , • - • .Up
algorithm
that
account the
(21.2) guarantees i t s c o r r e c t n e s s .
n o t more t h a n p o p e r a t i o n s
by
a r e t o be
i t remains to Into
f o r the 5 t h entry
the special structure of the last
t o evaluate
of
b y t h e m a t r i x P. T a k i n g
Summing u p a l l e x p e n s e s we s e e t h a t tions
n-p o p e r a t i o n s
.
s t r u c t u r e o f t h e f a c t o r s we o b s e r v e
Comparing t h i s
perform
r o w o f -Q
by
conclude
occupy
I f we i g n o r e
\
t o o b t a i n t h e components o f t h e g r a d i e n t the last
gradient
21.2. Consider
A l l components means t h a t
the evaluation
of the gradient
backward e r r o r
f o r a l l u^
sufficiently
tion
multiplication
i tto a .
r o w o f -Q
" n _ 2 p _ ] •••
t h a t o u r row e n t r i e s do n o t change w i t h e v e r y m u l t i p l i c a t i o n
multiply
are
the right
setting
, i t i s e a s y t o show t h a t n o t more t h a n
performed
20.1
i t equal to
o f t h e u p d a t e d row by t h e m a t r i c e s
u
smooth
p
of
analysis
provided
t h e sum
o f g equal
that
f u n c t i o n s . Consider
exists
s
o f numbers
1 . By
Statement
f o r any summation
the algorithm's the traditional
operations accumula-
algorithm:
v r
w
V
i
=
V i - i
t u
i . i '
" i *
P
- i .
(2i.3)
Here s=ti 2P-i According of
t o the research
the quantities
straint form
satisfy
i n §20 t h e e q u i v a l e n t
t h e system
p e r t u r b a t i o n s e.
(20.5) w i t h t h e a d d i t i o n a l
£ - , p _ : = 0 . The v a r i a t i o n m a t r i x 4 o f t h e a l g o r i t h m
con-
(21.3) has t h e
«- • - P
p-l
- -»
- --*
I
'1 -
p-l 1
The
additional
turbations
C
Ej
* with
i t s last
system
(20.5)
constraint
ap-2
s a t I s
i
fV
the
=0
Implies
that
the
equivalent
s y s t e m whose m a t r i x *
c o l u m n r e m o v e d . The
i s always
-1
rank of
compatible
even
*
i s the
per-
matrix
i s p - l . Consequently
i f e
=0.
This
the
corroborates
ap-i our
conclusion
ence o f 20.3,
global
since
the
analysis tem
that
a
for
backward
backward algorithm
the
(20.5) w i t h
the
equivalent
data
= T)
= n
3 + E
1
of
no
=
J)
=
Y)
2
- c
Zp-l
2p-2
p*2
- C -
C
+
E
p-l
accuracy of
perturbations
exists.
follows
The
from
b r o a d c a s t s and
numbers i s p o s s i b l e .
e
E
the
two
column crossed out
C
estimate
of
always
also
last
p E p-l
To
analysis
analysis
( 2 1 . 3 ) has
addition the
error
error
we
Solving
exist-
Statement the
local
the
sys-
find
, Zp-2
2p-3
+ E +
p>l
E
2p-2'
p*2
,
p*l s u m m a t i o n we of
input
p
p-i
1=1
1=1
o n l y n e e d t o know t h e
d a t a . So
we
sum
have
[ ei " [ V This
result
In
I £
is well
E I U
+U
eter the
+ . . . + U .
12
p*i
describing number
minimize
known
For
I , where c
floating is a
point
c o m p u t a t i o n we
machine-dependent
small
have param-
i+i the
accuracy
representation
the
195).
error
i n the
of
computer
method. floating
The
arithmetic.
implications
point
I t is related
are
summation o f
as
follows.
nonnegative
to To num-
190 bers
they
should
be summed
well-known r e s u l t error
a n a l y s i s c a n be c a r r i e d o u t u s i n g EXAMPLE
Its
21.3. Consider
gradient
(it +u ) sible
2
is a
. According
i f u +u
lowing
vector
order.
with
of a
20.1 backward
* 0. Suppose t h a t
variation
*
of u
5
= u + u , u
1
2'
2 ( 1 ^ + 1 ^ ) ^ ,
analysis
the function i sevaluated
5
i s fea-
by t h e f o l -
= u u , u = u u 4 3' 6 4 5
a n d 11 i n v o l v e s s
matrix
1
o n e a n d t h e same a r g u m e n t u
4
u
u 3
4
-1
u
u 5
equivalent
perturbations
the additional constraint c
E,,..,Es
c a n be o b t a i n e d
system
t o be c o m p a t i b l e
should
be e q u a l
analysis
Taking
u]+u2*0.
This
satisfy
6
= 0. The m a t r i x
6
i s also
4
sides,
the condition
t h e rank o f i
f o r backward
account
(21.4) with
we
values
of a l l
o u t by a
zero
direct
conclude
that
4
conclusion
has rank
3i f
on t h e condi-
analysis.
computations,
p r e s e n t e d as p r o d u c t s c^u^ p r o v i d e d
error
with
.
our e a r l i e r
o f backward e r r o r point
i s borne
3 1 5
i s i n accord
The a b s o l u t e
(20.5)
4 o f t h e system f o r
right-hand
3
into
floating
t h e system
by c u t t i n g o f f t h e l a s t column o f 4. F o r t h i s
u u , u , a n d u u +u . T h i s
tions o f existence With
1
-1
1
The 3 - m i n o r s o f 4 t h a t a r e n o t i d e n t i c a l
4 5 check.
c ,...,c
f o r arbitrary
t o 3. T h i s
to exist.
t h e values
i n t h e a l g o r i t h m . The
-1
* =
The
(21.4)
o f t h e a l g o r i t h m has t h e form
1
hold.
tUt^Ht^S
function
error
that
rules.
2(uj+u2)u3l
components
C o n s e q u e n t l y , we h a v e a c a s e o f d a t a b r o a d c a s t i n g
have
i s also a
algorithm
computation
with
This
a s e t o f formal
the evaluation
t o Statement
u
The
i n the nondecreasing
[ 9 5 ] , We u s e d t h i s e x a m p l e m e r e l y t o d e m o n s t r a t e
the local
errors
c a n be r e -
t h e c o n d i t i o n s u^+u^ 1 0 a n d will
have
a common u p p e r
* 0
bound c
t h a t d e p e n d s o n l y o n t h e mode o f number r e p r e s e n t a t i o n a n d r o u n d i n g o f f adopted on a given
c o m p u t e r . We w i l l
seek
the equivalent
perturbations
191 as
products
product
e''u.
We
to this
used
small
C ' + E ' ,
2
4 6
relative,
and
the
find
can
be
sought
as
a
that
n o t d e f i n e d u n i q u e l y . One
not
the
the
problem.
C ,
c"=
4
6
5
fact
only absolute,
entries
e"=
6
of
the
solu-
of We
that
we
can
find
0.
sometimes
equivalent perturbations.
problem
= b of order
Ax
the
e'-c',
5
stress
Consider
column o n l y
of
solving a
system
p where A i s l o w e r
A equal represent
linear
triangular.
1. S u p p o s e Gauss the
of
As-
elimination
corresponding
algorithm
= b,
sign By
u
k
differs
= L
k-i
from
a.
k-i
the
,
1 < k £ p.
identity
(21,5)
matrix
in
i t s
(k-l)th
(k-l)th
column o f
1
. This
the
nonsingular.
:
here
L K
reverting
( 2 1 . 4 ) we
also
form:
u
matrix
to
diagonal
to solve
in a vector
E " =
3
example
algebraic equations
The
Ej+e
is
e"=
EXAMPLE 2 1 . 4 .
i s used
account
p e r t u r b a t i o n s are system
this
sume t h a t
i^3. Clearly,
4
equivalent
tions
for
Taking into
£
The
e^'u^
column of
virtue
i s obtained
from
i t s subdiagonal
the
entries.
o f S t a t e m e n t 20.2
A l l matrices
L^
backward e r r o r a n a l y s i s
A
by are
always
exists. (21.5) i s a vector be has
solved the
to determine
form
a l g o r i t h m . Hence t h e the
equivalent
block
system
(20.6)
perturbation vectors.
is to
I t s matrix
192 L
-E
1
I
-E
2
(21.6) L
Mote k-l
that
the
linear
transform
u =L
u
of
,
that
fact
the
zero.
As
components
local
li.
error vectors
tp k
Due can
to be
ward e r r o r a n a l y s i s
equivalent
should hold.
into
L.
,
1
'2
and
Taking of
taken
account
right-hand
i£k
number ter.
Thus
(953,
where c
representation
T). .
we
modify
the
looking
from
of
f o r back-
the e q u a l i t y
the matrix
derive
first
components
k-1
are
the s t r u c t u r e of
sides
i s a small and
Taking i n t o account = 0,
components the
of
relations
perturbations. o b t a i n an
estimate
an
a priori
In
this
and
the
eyO
(21.6),
(20.6)
second s u b s c r i p t s .
u
are
k
(21.7)
of
estimates.
| +...+
computed yield
a the
equivalent
estimate.
system
o f f adopted
t h e s t r u c t u r e o f vj we K
l e ,| ^ e (
e x a m p l e we
block
error
by
in
wished
only
on
a
of
that
|uJi I ) ,
course
of
I n the
on
the
X>1.
(21.7!
algorithm
execution,
estimates
of
relations
(21.5)
quality
tools
i n terms of w o u l d be scalar
compu-
that
recurrent
to s t r e s s t h a t the
u
' ^i'
mode o f
posteriori
perturbations
case
e
\~nki I *
particular
conclude
However, t h a t e s t i m a t e
120.6) a r e
have
number d e p e n d i n g
rounding
Certainly, using
i.e.
riori
we
p o i n t c o m p u t a t i o n w i t h a c c u m u l a t i o n we
e
The
not
first
'p
floating
for
does
perturbation vectors,
Denote t h e components o f v e c t o r s of
-E
p-l
equivalent
input
we
can data,
coarser. system
specially for
a
(20.5) poste-
Chapter 5
Functional Investigation of Algorithm Structure

So
f a r we
have
without
placing
any
establish of
restrictions
parallel upon
t h e most g e n e r a l p r o p e r t i e s
of real-life
ditional
algorithm
Our
r e s e a r c h h a s shown
their
cuts
of a c t i o n
their
graphs.
and f e e l
i s that
helped
of that
the special
us kind
proper-
f o r t h e u s e o f some a d -
a v a r i e t y o f tasks requires
studied
o f graphs.
t h e problem
able methodology
cuts
particular
directed
class
common
traits.
functional
a
algorithm
o f graphs
o f algorithms B y now, we
i n our
onto
graphs
goals o f exploring
characteristics
enough
features
The m a i n r e a s o n f o r t h a t
of finding
of arbitrary
o f unraveling
One o f t h e c h i e f titative
t h e know-
H o w e v e r we d i d n o t s o f a r g e t r o u n d
cuts
t h a t problem NP-hard. T h e r e f o r e i t i s u n p r a c t i c a l
focus
That
algorithms
and
course
p o s s e s s i n g de-
r e q u i r e s a heavy search o f t h e graph. Hot i n f r e q u e n t l y
ing f o r directed
forward
of
the limits
account
This calls
but rather
characteristics
properties
that
i n graphs.
determination,
quantitative
sired
graphs.
structures
m a t h e m a t i c a l a p p a r a t u s t o r e p r e s e n t and s t u d y o u r g r a p h s .
ledge o f d i r e c t e d
is
studying
r e s e a r c h . Now we a r e g o i n g t o t a k e i n t o
ties
to
been
investigation
functional
particular
feature
of algorithm
methods t o e x p l o r e t h e i r
features
t o segregate
enough
description
a
work-
structure.
was
whose g r a p h s possess
t o engage i n l o o k -
i f we a r e t o c r e a t e
a number
information
important
of
tangible
t o make
structures.
of algorithm
and quan-
some
We
graphs
a
step
shift
our
and
employ
structures.
22. Space-Time Schedules

So far we were not much concerned about the particular form of representation of schedules. Because of that we allowed arbitrary enumerations of algorithm graph nodes and regarded schedules as general vectors. That way, we were able to establish a number of most general properties of schedules.
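
The view of a schedule as a general vector over an arbitrary enumeration of nodes can be illustrated by a minimal sketch in Python. It assumes that a schedule is valid when every arc of the algorithm graph leads from an earlier moment to a strictly later one. The small graph, the node numbering, and the function names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def is_valid_schedule(arcs, t):
    """Return True if every arc (u, v) goes from an earlier time to a later one.

    Assumption of this sketch: t[i] is the time step of node i, and an
    operation may start only after all operations it depends on are done.
    """
    return all(t[u] < t[v] for u, v in arcs)


def layers(t):
    """Group node numbers by scheduled time, in increasing order of time."""
    groups = defaultdict(list)
    for node, time in enumerate(t):
        groups[time].append(node)
    return [groups[time] for time in sorted(groups)]


# Hypothetical algorithm graph for (a + b) * (c + d), nodes numbered arbitrarily:
# node 0 computes a + b, node 1 computes c + d, node 2 multiplies the two sums.
arcs = [(0, 2), (1, 2)]

# The schedule is just a vector indexed by the node enumeration.
schedule = [1, 1, 2]
```

Renumbering the nodes permutes the entries of the vector but changes neither its validity nor the grouping of operations by time, which is why nothing about the algorithm structure can be read off such a vector directly.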
Sometimes, however, we deviated from that general representation. For example, when discussing Examples in §11, we regarded schedules as functions of two or three integer variables. Obviously, the choice was dictated by the particular forms of algorithm graph representation. Imagine to what extent the consideration of Examples from §11 would have been more complicated if schedules had been treated as functions of only one integer variable.

One-dimensional representation of schedules corresponds to the form of algorithms (17.1) or some other form where the entire set of algorithm operations is parameterized by a single integer variable. Certainly, (17.1) is sufficiently general and it is often used in theoretical studies. However, it is hardly ever employed in actual algorithm specifications. Take, for example, the mathematical notations of algorithms: it can be readily seen that a large number of parameters is often used to describe the set of an algorithm's operations.

There is something almost imperceptible, and hence poorly formalizable, about the choice of algorithm notation. It has a bearing on the question, 'Why is this particular form of notation used, and not some other?' It is intuitively clear that the choice is made so as to achieve adequate ease in working with the algorithm. As a consequence, the nature of the algorithm has some impact on the notation choice.

Algorithms intended for computer implementation are fixed in languages specially designed for the purpose. The corresponding notations are called programs and are widely used today. We observe that there is a lot of similarity between different programs implementing the same algorithm. For example, nearly all programs for matrix multiplication with scalar addition and multiplication as basic operations are written in the form of three nested loops controlled by independent indexes. This fact simply cannot be accidental. It is hard to invent a practical matrix multiplication algorithm in the form (17.1) with the single controlling parameter, due to the tremendous clumsiness of single variable parameterization. A wide variety of programs typically use a large number of parameters to describe the set of executable operations.
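
The contrast between the two kinds of notation can be sketched in Python. The code below is an illustration under stated assumptions, not notation taken from (17.1): `matmul_single_parameter` is a hypothetical flattening of the three loops into one controlling parameter `q`, and `schedule` is one assumed choice of a schedule written as a function of three integer variables, in the spirit of the Examples from §11.

```python
from itertools import product


def matmul_nested(a, b):
    """Multiply square matrices with three nested loops over independent indexes."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_single_parameter(a, b):
    """Same operations, enumerated by a single integer q (hypothetical flattening)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for q in range(n ** 3):
        # The vector index (i, j, k) has to be decoded from q by hand.
        ij, k = divmod(q, n)
        i, j = divmod(ij, n)
        c[i][j] += a[i][k] * b[k][j]
    return c


def schedule(i, j, k):
    """Assumed schedule as a function of three integer variables: t = k + 1.

    Operation (i, j, k) uses the partial sum produced by (i, j, k - 1),
    so it is placed one time step later.
    """
    return k + 1


def layer_sizes(n):
    """Count how many operations the assumed schedule places at each time step."""
    sizes = {}
    for i, j, k in product(range(n), repeat=3):
        t = schedule(i, j, k)
        sizes[t] = sizes.get(t, 0) + 1
    return sizes
```

With the vector index kept explicit, the dependence of operation (i, j, k) on (i, j, k - 1) and the grouping of operations by time are visible at a glance; in the single-parameter version the same information is hidden in the decoding of `q`.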
Compactness and readability of notation is another factor contributing to ease of work with the algorithm. Again, it is intuitively clear that this requirement will be fulfilled if the algorithm notation reflects the algorithm structure in some way. The relationship can easily be traced in each particular case but it is not at all clear how to give a general description of it. Apparently a compact notation for basic operations must necessarily reflect the algorithm structure. Computer programs are just such compact notations. If we want our schedules to contain more information about the algorithm structure, we must relate them to compact rather than (17.1)-like notations.

Notice that we have already been inconvenienced by one-dimensional representation of schedules. Specifically, some relevant graph properties were dependent on special node enumerations and not related in any way to schedule properties. For example, when studying exploitation of sectioned memory we introduced the notion of graph locality. Linear ordering of graph nodes is inadequate for the description of this notion, so we assumed graph nodes were immersed into some space in a special manner. However, our representation of schedules remained unchanged. There is a certain inconsistency in this.

Thus, to make our research more efficient we have to establish tighter connections between algorithm graphs, schedule representations, and algorithm notations other than (17.1). Hopefully, our notions will get more meaningful and effective if we do this. Special care should be taken to relate our notions to compact notations of algorithms and in particular to algorithmic language programs.

An analysis of many algorithm notations used in practice evinces a number of features they have in common. These features are either specified explicitly or they can be introduced artificially in rather simple ways. Right now, several of them are important for us:
- a set of integer vector indexes whose components are loop variables (indexes);
- the set occupies a possibly multiply connected domain which may consist of subdomains of varying dimensions;
- a partial order in the set of vector indexes;
- the bijection between the set of vector indexes and the set of executable operations of the algorithm;
- the vector indexes-based correspondence that shows what variables are used and what are modified by each operation of the algorithm;
- the guarantee of correctness of results if operations are executed in accordance with the partial order in the set of vector indexes.

The features listed here are characteristic for FORTRAN-like algorithmic languages (for instance). Even if no other notations had similar properties we would be justified in trying to incorporate them in our representations of algorithm graphs and schedules. The reason is, of course, the widespread use of such languages. Many other notations, including mathematical relations, have the same features. Later we shall see that the features listed above have an important role to play in building algorithm graphs using algorithm notations.

The bijection between the set of vector indexes and the set of algorithm operations suggests a natural way to situate graph nodes in space. We identify the nodes with vector indexes and regard them as points of finite-dimension spaces. According to the above list of features, the set of nodes will occupy a domain which may be multiply connected and consist of subdomains of varying dimensions. Now the elements (=nodes) have integer coordinates, although this is not in general a necessary requirement. Recall that schedules can be viewed as functions defined on the set of nodes.

Let algorithm graph nodes lie in some domain D. Given a schedule, each node x is ascribed some unique value which is the moment in time when the corresponding operation is executed. It follows that schedules can be viewed as functions t(x) defined on a discrete set of points x in D. The range of each of the t(x) is also discrete. We will refer to schedules represented in that form as space-time schedules. This term stresses the special method of putting graph nodes in a space and the relation of it to the way schedules are represented.

One of the main tasks of algorithm structure investigation is the study of the set of parallel forms. Every parallel form defines groups of operations that are executed in parallel and are called layers, and the sequence of execution of layers. Clearly, any parallel form defines a number of schedules. If we represent them by the functions t(x), then
constituents of parallel forms, and in particular layers, will have a clear-cut geometric meaning. Certainly, consider a level surface of the function t(x), i.e. the set of x satisfying the equation

$$t(x) = \mathrm{const}.$$

Suppose some algorithm graph nodes are in that set. Then the corresponding operations are executed at the same time, and therefore form a layer of the parallel form. Various layers correspond to various level surfaces of space-time schedules. Thus, the description of the set of algorithm graph parallel forms amounts to the description of space-time schedules and their level surfaces.

The simpler the structure of the set of admissible functions t(x), the more effective would be the investigation of parallel forms. Because of this, we cannot do with the point-by-point specification of these functions. This representation can be used in theoretical study, but it is totally unacceptable in the development of practical methods of algorithm structure investigation due to the tremendous amount of executable operations involved. Recall that the number of operations is equal to the number of points on which space-time schedules are defined. The only means to get around this difficulty is to specify space-time schedules by some rules.

Schedules are chiefly used for scheduling the work of processors and communication networks of computational systems. Creating such schedulings should not take very long; typically, it is required that it should take much less time than the execution of the algorithm itself. This requirement cannot be met if schedules are specified point-by-point. Again, we can overcome the obstacle by using some rules to specify schedules.

We could provide a number of other arguments for the description of schedules via some rules. However, even the arguments we already expressed should suffice. Clearly, the simpler are the rules specifying a schedule, the simpler is its practical use. That is why we sought to achieve a stronger connection between graph and schedule representation and algorithm notations; in particular, to introduce space-time schedules. The chief motivation is as follows: algorithm kernels most often are recorded in some simple way, and we must try to maintain that simplicity in our representations of graphs of these kernels and of schedules.

Formally,
space-time schedules are
c r e t e s e t o f p o i n t s i n some d o m a i n D. ify
such f u n c t i o n s probably
d i n g on for it
individual not
are
points,
be
extended
Assume T h e n we
but
that
can
For
complicated
of
them t o c e r t a i n
a
executed
before
either data
t . The of
l e t D be
structure.
continuous time
these
sections with
moment lying
groups.
the
simply
connected. Consider These
w e l l be
f from on
ted
multiply
that
surface
Thus, o n l y
and
S u p p o s e an It
defines
of
directed
shown
those
surface of
are
executed may
after
be
have odd
can
states
family
t are
regard
22.1
The
of
level
mentioned
may
a ) , b ) . Level
family
o f some s u r f a c e
are o f t e n
has
at
least
surfaces,
surfaces
referred
of
or
one
or,
may
number
f o r any
to
transfer of
inter-
level
appropriate
equivalently,
not
surfaces
level
evolving
r e s e m b l e s wave p r o p a g a t i o n faces
time
sur-
direc-
show t h e d i s t r i b u t i o n
be
simply
corresponding
l a b e l e d w i t h numbers. R e c a l l i n g t h a t the
are
ascribed
a l g o r i t h m graph arcs
surfaces
surhave
t h e nodes t h a t
Obviously,
tlx).
may
of
transfer data during algorithm execution.
a l g o r i t h m graph
cuts.
in Fig.
values of we
the
ob-
c o n n e c t e d . However, i f
itself
nodes t h a t
cut of the a l g o r i t h m graph. Level streams i n D that
con-
smooth.
the l e v e l
surfaces
always separate those
the
groups of
level
sufficiently
face of a space-time schedule the set o f a l l such a r c s d e f i n e s a
data
sched-
domain D
introduced earlier,
schedule
They may
between d i f f e r e n t
only this,
space-time
the e n t i r e
f [ x ] are
notions
i n t , they w i l l
nodes
depen-
interpretation.
space-time
they are
over
schedules
geometric
simplicity,
formulas
I n view of
assume t h a t
way
dis-
t o spec-
nodes.
space-time
relate
t=t(x)
I f we
some
rule
u s u a l l y h o l d good not
a l s o f o r whole subdomains.
i n some r e a s o n a b l e
t a i n a meaningful
moment
involves providing e x p l i c i t
a grave r e s t r i c t i o n
t a i n i n g a l g o r i t h m graph
faces
most e f f i c i e n t
t h e p o i n t c o o r d i n a t e s . Such f o r m u l a s
will
ules
f u n c t i o n s d e f i n e d on
The
surfaces
i n time.
The
i n t h e d o m a i n D.
t o as wave
fronts,
t h a t when d i s c u s s i n g E x a m p l e s f r o m
as
a
that
to
family
family
f o r time,
of
fronts,
as
successive
discrete
that
reason,
computation §11.
the
connected,
t stands
movement o f For
schedule.
surface
such
sur-
etc.
We
3u a)
b) Fig.
The
family
parallel
form
of
of
level
ule
i s not
splitting
very
of
sets
of
within
set
subset
Things
schedules
dimension (see
Fig.
any
can
readi l y
t h i s,
we
subgraphs
graph. shou I d
the
have
can
state
i n one
and
executed executed
i n mind
schedule
defines
mapping
algorithms
s u r f a c e s o f o n l y one
that
these
only
one
level
surface.
s u c c e s s i v e l y and
in parallel.
lot
the D
the
surfaces
reduce
the
t o a macro-
To
me r g e
selected
co-
achieve
delimited
the by
surfaces subfam-
Fig.
subsets
22.2
the
a
sched-
surfaces define
operations into
Choosing
neighboringlevel from
a
isomorphic to the
ordinate
space-time
indepen-
among
level
a l g o r i t h m graph graph
c a n be
domain
22.2).
of
be
equals
the
subfami 1ies
fami1ies we
of
We
a l l algorithm
being
change
once t h e number o f dent
of
o p e r a t i o n s can
every subset
I f we
a
such f a m i l y o f l e v e l
helpful.
the
nodes o f each
surfaces of
the algorithm.
onto computer systems,
22.1
—
a the
A l l sub-
operations
200 ilies
into
graph
i s o b v i o u s . As a n i l l u s t r a t i o n
its
macronodes.
parallel
priate
subfamilies
of the ratio
originating
we c a n u n d e r
o f t h e average
f r o m a macronode
a hierarchical
then special
the
cases
conditions of arcs
number
efficient
independent
may e m e r g e
above
shows
consideration
convincingly
that
good d e a l o f i n f o r m a t i o n simplicity
satisfy
the relations
for
schedules
site,
on a l g o r i t h m
the relations
(9.7),
obeying
we o f t e n
thing tem
the parallel
graph
i n §14. I f
s t r u c t u r e of
some g i v e n
are able
structure.
A l o t depends
initial
initial
themselves,
group
i n (9.7).
discrete
said
systems
Bellman
system
i s often
of
number o f i t s e l e m e n t s
Quite
t h e oppo-
Because o f t h a t ,
maintenance
the only
t o t h e sysforming the
of a
particu-
the discrete
t o seek
a r e c a l l e d the Bellman
space-time
system
schedules i n
x i n t h e d o m a i n D, t h e n a l g o r i t h m
a set of distinct
satis-
have t o look
s c h e d u l e s do o r do n o t
inequalities
and
I f we w i s h
form o f functions o f point
Consider
and
inequalities
a r c s and d e l a y v e c t o r s a r e a l s o
on t h e
not necessary.
of equalities
respectively.
equations,
schedules
of equalities
The r i g i d
a
a n d o n w h a t num-
of the set of solutions
( 9 . 1 ) and t h e system
of relations
form o f delay vectors The
t o reveal
C e r t a i n l y , a l l space-
conditions.
space-time
conditions.
that matters I s the structure
However,
schedules
(9.1) and o p t i m a l
n e e d n o t know w h e t h e r
of inequalities
first
the
t o or
than t h e dimension of
N o t e h o w e v e r t h a t we d o n o t a l w a y s
a d m i t o f some p a r t i c u l a r
of
pointing
of algorithm
i s not mathematically rigorous.
space-time
o f t h e s t r u c t u r e s o f schedules
time schedules
lar
appro-
achieve the
We d i s c u s s e d t h i s
schedules
as r e g a r d s
macro-
layers of
algorithm implementation
b e r o f a p p r o p r i a t e s c h e d u l e s we a r e a b l e t o f i n d .
fy
coordinate
a l g o r i t h m . We d i s c u s s e d t h e s e m a t t e r s i n E x a m p l e 1 2 . 2 . The
it
number
memory c o m p u t e r s y s t e m .
an a l g o r i t h m g r a p h h a s l e s s D
of a
we show s e v e r a l
certain
t o t h e average
n o d e s i n one m a c r o n o d e . T h i s e n s u r e s on
structure of this,
f o r m b y v a r i o u s h a t c h i n g s i n F i g . 2 2 . 2 . By c h o o s i n g
surface
smallness
The p a r a l l e l
graph
t o b e r e p r e s e n t e d a s f u n c t i o n s o f x. positive
i s not large
integer
numbers.
Assume t h e
a n d d o e s n o t d e p e n d o n t h e number
of
a l g o r i t h m g r a p h n o d e s . S u p p o s e t h a t f o r e a c h p o i n t x i n D a s e t g{x)
is
defined
for
certain
that x.
i s a s u b s e t o f t h e a b o v e s e t . The s e t g{x)
may be empty
L e t f Ax)
o f x,
a n d UAx)
be s e t s
of functions
where
201 The
legtx).
range
Mj(x)
is a
nodes
of
input
initial
of
subset of
the
consists
f^x)
of
points
from
D and
n o n n e g a t i v e n u m b e r s . D e n o t e by
algorithm data.
graph.
Let
a
We
assume h e r e
number
function
the
that
s{x)
the
they be
range
set
of
only
input
serve
defined
of
on
to
those
points.
Let
Now
we
the
set
fine
the
f i l l of
the
introduced
positive
integers
p a r t i t i o n i n g of
meration of
arcs
pointing
this
is
their of
arguments
and
only
not
interested
input
one.
t o x.
(9.1)
can
For
The
graph arcs
the
Therefore
corresponding
arguments of
example, t h i s
function
The
function
now
be
tij(x)
g(x)
turns
the
and
de-
the
class
nu-
number.
operations
and
enumeration
The
i s empty i f
out
set
g(x)
t o be
of
that
e m p t y i f we
corresponding
i f the
origin
d e l a y on
following
sets g(x)
their
operation
the
meaning.
r e f l e c t s the
situation arises
sets
w r i t t e n i n the
the
i n t o classes
t o x.
the
returns
following
p a r t i t i o n i n g algorithm
I n o t h e r words, g ( x )
when
the
i n accordance w i t h
x by
classes.
operation
i f xeFp.
appear f i r s t .
ing
into
arguments o f
to
defined
with
i n t r o d u c e d a b o v e and
algorithm
Of
course,
notation
operation
the
Jth arc
arc.
The
are to is
x an
point-
relations
form:
$$\min_{i \in g(x)} \bigl( t(x) - t(x - \varphi_i(x)) - w_i(x) \bigr) \ge 0 \quad \text{if } x \in D \setminus \Gamma_0, \qquad t(x) \ge s(x) \quad \text{if } x \in \Gamma_0. \tag{22.1}$$

The relations (9.7) can be rewritten as follows:

$$\min_{i \in g(x)} \bigl( t(x) - t(x - \varphi_i(x)) - w_i(x) \bigr) = 0 \quad \text{if } x \in D \setminus \Gamma_0, \qquad t(x) = s(x) \quad \text{if } x \in \Gamma_0. \tag{22.2}$$

If we presumed that the functions $\varphi_i(x)$, $w_i(x)$, and $s(x)$ take arbitrary values not related to the position of $x$, and that the arc enumeration is chaotic, then our new relations (22.1) and (22.2) would differ from (9.1) and (9.7) only in form. However, our experience with real-life algorithms tells us that this is actually not the case. The examples from §11, as well as many other examples, demonstrate this clearly.
If we derive our coordinate systems, node positions, and arc enumeration from the algorithm notation in some natural way, then $g(x)$, $\varphi_i(x)$, and $w_i(x)$ often turn out to be not very complicated functions of $x$. It is important that these functions are now determined by some general rules rather than individually for each point. Every rule is valid either for the entire domain $D$ or within some subdomain of it. Hence the relations (22.1) and (22.2) cease to be pointwise and become functional as well.

Formally, our functional relations are defined only over a discrete set of points. Since the representation of $g(x)$, $\varphi_i(x)$, and $w_i(x)$ is now simple, it is natural to try and extend the functional relations (22.1) and (22.2) over the whole domain $D$ and replace discrete problems by continuous (in some sense) problems. Since the functions describing the problem are not complicated, we can hope to obtain a simple procedure to find the functions $t(x)$. If we narrow the solutions to continuous problems (22.1) and (22.2) to our discrete sets of points then the space-time schedules shall be determined.

Suppose we wish to study the structure of parallel forms. In that case we do not need to know anything about the function $s(x)$ that specifies the initial conditions. If the delays depend only on the time required to perform the operation corresponding to $x$ then the functions $w_i(x)$ are independent of $i$. Therefore we arrive at studying the set of solutions of a functional inequality

$$\min_{i \in g(x)} \bigl( t(x) - t(x - \varphi_i(x)) \bigr) \ge w(x) \tag{22.3}$$

or of a functional equation

$$\min_{i \in g(x)} \bigl( t(x) - t(x - \varphi_i(x)) \bigr) = w(x). \tag{22.4}$$

If all algorithm operations have roughly the same execution time we can assume $w(x) = 1$ without severe loss in generality.

Thus, tying up algorithm graphs and schedules with algorithm notations opens up new prospects in studying the structure of algorithms. We can primarily benefit from investigating the properties of solutions $t(x)$ to functional inequalities and equations (22.1)-(22.4).
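As a rough illustration of equation (22.4), the following Python sketch computes a schedule on a small finite domain. Everything named here is an assumption made for the example: the function `earliest_schedule`, the callable `phi` that returns the arc vectors at a point (playing the role of the functions indexed by g(x)), the choice w(x) = 1, and the convention that nodes without predecessors receive the value 0 in place of the initial conditions s(x). Equation (22.4) is equivalent to t(x) = w + max t(x - phi_i(x)), which is what the recursion evaluates.

```python
from itertools import product


def earliest_schedule(domain, phi, w=1, s=0):
    """Solve min_i (t(x) - t(x - phi_i(x))) = w on a finite, circuit-free domain.

    domain : iterable of integer points (tuples)
    phi    : hypothetical callable returning the arc vectors entering x
    w      : common delay (assumed constant, as in w(x) = 1)
    s      : assumed value for nodes that have no predecessor inside the domain
    """
    points = set(domain)
    t = {}

    def visit(x):
        if x not in t:
            preds = [tuple(a - b for a, b in zip(x, v)) for v in phi(x)]
            preds = [p for p in preds if p in points]
            # The minimum of the differences equals w exactly when
            # t(x) exceeds the largest predecessor value by w.
            t[x] = s if not preds else w + max(visit(p) for p in preds)
        return t[x]

    for x in points:
        visit(x)
    return t


# A 3 x 3 grid in which every node depends on its left and lower neighbours.
domain = list(product(range(3), repeat=2))
schedule = earliest_schedule(domain, lambda x: [(1, 0), (0, 1)])
layers = sorted(set(schedule.values()))
```

For this grid the schedule is t(x) = x1 + x2, so the nodes fall into five layers. The recursion assumes the graph has no circuits; on a graph with circuits it would not terminate.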
additional
information
m u s t know and
the
supplied through topic
This
with the
fined
form
kind
the
of
the
used
resulting
in
book.
the
D
spaces.
notation
In
and
the
of
only
way
notation.
We
ip1ix).
I f linear (17.1),
Most
that
we
attempt
intrinsically
to
describe
i t is this
are
de-
operations
was
functions
fact
the
that are
seldom
usually
they
of
we gix)
discuss
often
the
is
notations
ed
by
place,
obtain
will
c o m p l i c a t e d even f o r simple a l g o r i t h m s .
for
to
ordering
then
However,
functions
n o n t r a d i t l o n a l and
( p j l x ) w o u l d be
functions
first
the
Compact a l g o r i t h m
gix).
functions
i n multidimensional
is
algorithm
(22.1)-(22.4).
this.
domain
d e s c r i p t i o n . The
I n the
simple
t o do
the
information
of
later
rather
of
algorithm
analysis
equations
i s required
precise
in detail
produce
i n e q u a l i t i e s and
This
i n one
gix)
and
i s account-
dimension
the
multidimensional.
23. Regular Graphs
Algorithms
written
p r o b a b l y among t h e
as
recurrent
most o f t e n used
relations with
i n mathematics of
s i d e r a f i n i t e - d i m e n s i o n a l space of lexicographic a nonempty s e t
order
i s defined
of vector
u
are
called
vector F.
recurrent
indices
are
f
(23.1) are i n d e x f.
To
—
executed ensure
,.
i n D o r be
some v e c t o r
vided to
as
an
less
f-f,
are
they
.,u
),
r
given
may
i n the that
or
may
process the
computation.
i n d i c e s . Suppose a domain D
indices
be
data
exist i t is sufficient
i n D,
item. that
the
the
containing
provided f.
linear.
lexicographic
The The
f-f ,...,f-f
that
the
functions assignments
growth of
i s c o r r e c t l y defined
vectors
Con-
(23.1)
independent o f not
are
relations
feD.
linear
and
order of
this
The
indices
the
vector
i t i s neces-
e i t h e r should
t h a n f l e x i c o g r a p h i c a l l y . I t i s assumed t h a t
i s not
initial
given.
l
sary t h a t f o r a l l f each of be
i n t h a t space. Let
relations with f
arbitrary
integer vector
i n d i c e s be
pfiu
=
linear
corresponding variable u
not
i n case is
pro-
f-fi I t i s known t h a t f o r t h e
process
(23.1)
t h e most s i g n i f i c a n t ( l e x i c o g r a p h i c a l l y )
f
n o n z e r o component o f e v e r y details
of exploiting
i s positive.
the lexicographic
order
Ue w i l l
not dwell
i n the set of
on
vectors
now. I f we e n u m e r a t e v e c t o r ing
order
that
a l l results derived
the
indices
then t h e n o t a t i o n
transformation
particularities
using
t o (17.1)
mentioned
f r o m D i n l e x i c o g r a p h i c a l l y ascend-
(23.1) w i l l (17.1)
transform
t o (17.1).
hold
f o r (23.1).
good
i s impractical
i n §22.
since
g r a p h and s c h e d u l e s f o l l o w i n g t h e g u i d e l i n e s We p o s i t i o n a l g o r i t h m function the
F^ corresponds
node
f-f
f
i s
g r a p h nodes
l
r
inside
D i t stands
to
from
by
i n integer
arcs
these.
However,
retains a l l
the algorithm
there. points
t o t h e node f . I f f e D t h e n
pointed
f - f and o n l y
(23.1)
T h e r e f o r e we c a n b u i l d
I t follows
i n s i d e D. The
(23.1)
originating
I n case a v e c t o r
implies
from
the
that nodes
f - f , does n o t p o i n t i
f o r t h e f u n c t i o n F.
that
inputs
the variable i
The
graph b u i l t
by t h i s
method
g a r d g r a p h a r c s as v e c t o r s into
gular
from
one a n o t h e r
structure o f arcs
some o t h e r
specific
pointing
v i aa parallel may be d e r a n g e d
being
particularities
properties
set of
vectors
formally described
this re-
nodes
are positioned
illustrates
t h e importance
be r e f l e c t e d i n
graph
schedules. explore
they
whether
a priori kind
clear
occurs
(23.1) i s f u l l y
must
The
graph
itself
g r a p h and s c h e d u l e
Similar
defined
graphs
ones o f t e n o c c u r
by r e c u r r e n t
not this
that
notation.
f^,...,f^.
from s i m i l a r
ways known w h e t h e r
by
i f graph
n o d e s c a n be ob-
Note
of algorithm
o f space-time
graph of a l g o r i t h m
slightly differ not
to different
funnels
would proper-
simultaneously. The
the
I f we r e -
i n accord w i t h algorithm
p o s s e s s s p e c i f i c p r o p e r t i e s . We w i l l ties
structure.
translation.
way t h a n d e s c r i b e d a b o v e . T h i s
of graph representation Specific
simple
t h e n o n e and t h e same s e t o f v e c t o r s
e v e r y g r a p h n o d e . Two s e t s
tained
in
has a v e r y
they
have
when we a t t e m p t
and
However,
o f some a l g o r i t h m s circuits.
A sample
t o "approximate"
reason,
f ^ do n o t d i r e c t l y we
start
our discussion
involve
with
that
finding
only
that are
i ti s not a l since
i t
is
situation of
the algorithm
another graph. A l o t o f problems on graphs d e f i n e d
vectors
graphs
i n many p r o b l e m s
relations.
c a n be g r a p h s
b y t h e d o m a i n D and
by a f i x e d
schedules.
graph s e t of
For that reason, we start our discussion with a specification of the class of graphs to be examined without binding it to any particular algorithms. We will use the recurrent relations (23.1) only to illustrate our results.

Let an $n$-dimensional vector space be given. Let $f_1, \ldots, f_r$ be some fixed integer vectors. We will refer to them as basic vectors. Consider an infinite graph whose nodes are all integer vectors. Let one and the same bunch of arcs fan out of every node, the arcs being defined by the vectors $f_1, \ldots, f_r$. We call such a graph an infinite regular graph of degree $r$. Any finite subgraph of it is said to be a regular graph of degree $r$.
o f some
ar u
of introducing
i s regular.
f - f
"*
n o t
o f (23.1).
regularities,
We
especially
the notion
also
our
study,
the
s e t o f schedules s t r u c t u r e may
arise
to find
must d e s c r i b e g r a p h s
introduce
may s i g n i f i c a n t l y
as t h e g r a p h
become " g o o d " .
e n
r e
t '
sub-
s e t
-
i s n o t merely
domain,
easlly.
produce
where
regular
graph
splits
every
subgraph
into
no s u b g r a p h graph
disjoint
of i t .
subgraphs
schedules,
prompted
priori
about
I f the i n f i n i t e
graph
A schedule
induces schedules f o r i t s subgraphs,
us
I t s use i s p a r t i c u a
o f r e g u l a r graphs
regular
may
being
f o r the original
t h e n t h e same s p l i t t i n g
o f i t c a n have c i r c u i t s .
that
the opportunity
These
graph.
graphs.
I f the i n f i n i t e
struc-
the objects of
s c h e d u l e s . Note
I s known
The p r o p e r t i e s
infinite
the graph
irregularities
regular
nothing
the
of
o f t e n have l o c a l I r -
schedules
from boundary
of infinite
i n cases
those
may be a u g -
I s b e i n g augmented, and t h e
structure of irregularities.
then
o n some
that
derange
In particular,
schedules
to abstract
the notion
justified
regular
t h e
influence
the set of appropriate
narrows
to the original
main
n
of regularity
they but s l i g h t l y
the required
g r a p h . The d e s i r e
larly
often
i n particular,
graph
to
o
Moreover, I t
only
i n t h e neighborhood o f t h e domain boundary. I n
these i r r e g u l a r i t i e s
However, q u i t e
confined
will
r
mented t o have s i m i l a r s t r u c t u r e . R e a l - l i f e g r a p h s
ture.
we
t o s i n g l e o u t g r a p h s whose s t r u c t u r e I s as b e a u t i f u l as t h a t
the graph
general
algorithm
F f depends
l motive
by t h e
regular
(23.1)
i f each o f t h e f u n c t i o n s
sets o f the s e t o f variables The
infinite
regular.
the graph
regular
i s called
subgraph
algorithm
Obviously, remains
Such
f^.
has no
the
mirror i n regular holds f o r circuits
f o r the i n f i n i t e
etc. All these facts prompt us to embark on an in-depth examination of infinite regular graphs. An
infinite
regular graph
have r e g u l a r
structure.
combinations
o f basic
principal
regular
STATEMENT graph is
that
23.1.
are
the
two
trans!at either
these
case
as
the
subgraph
graph. of
the
also linear
subgraph
the principal
regular
of
are integer
to this
subgraphs
ions
an
infinite
principal
two subgraphs
a n d G^ b e t h e s a i d
principal
linear
subgraph,
the p r i n c i p a l
regular
subgraph.
There
identical,
are
a r e connected subgraph
or
they
subgraph Perform zero
a r econnected
Any i n f i n i t e that
with
an i n f i n i t e
integer
regular
node
t o g.
regular
translation
by an a r c
linear
means t h a t u
t h e nodes
of
o f basic must
have
the
veccommon
i s not identical
one node g t h a t
translation
the
as principal
I f t h e two subgraphs
the
Again,
with
subgraph,
do n o t cover
moving t h e
t h e whole
one node u t h a t
a parallel
i t s principal
i s n o t i n t h e subgraph.
o f the principal
then there i s a t least perform
of
components,
graph
s u b g r a p h , m o v i n g t h e z e r o node
can be represented
translations
regular graph, subgraphs.
as a integer
Now s u p p o s e t h e
combinations
graph
are paraiiel
then there i s a t least a parallel
This
one a r c . Since linear
a common
t h e y must b e i d e n t i c a l .
subgraphs
by vectors
If
have
translation of
t h e sum o f t h a t
are identical.
by a t least
that
that
STATEMENT 2 3 . 2 .
subgraph
G( a n d
are a l l integer
t h e subgraphs
nodes. I t f o l l o w s
of
Consequently
a n d t h e v e c t o r g must b e a n o d e f r o m
subgraphs
union
they
v e c t o r s . B u t G^ i s a p a r a l l e l
subgraph as w e l l .
principal
the
Suppose
G^ i s a p a r a l l e l
t h e v e c t o r u-g c a n b e r e p r e s e n t e d
o fbasic
i n G^ i . e . t h e s u b g r a p h s
tors,
subgraphs.
a n o d e u i n G^, S i n c e
combination
combination
the
subgraphs that
disjoint. Let
is
infinite
Consider
parallel
n o d e g. C o n s i d e r
of
refer
I nthe general
t o the entire
an alternative:
are
v e c t o r s . We w i l l
subgraph.
is not identical
may c o n t a i n i n f i n i t e
One o f t h e s e h a s n o d e s t h a t
translation
infinite
i s i nneither of o f the principal
t o u . By r e p e a t i n g t h i s p r o c e s s
we p r o v e
statement. Note
that
t h e subgraphs
that
c a n n o t be c o n n e c t e d by any a r c s .
form
the union
i n S t a t e m e n t 23.2.
Consider all linear combinations of basic vectors of the form

$$f = \sum_{i=1}^{r} \alpha_i f_i,$$

where all $\alpha_i$ satisfy $0 \le \alpha_i < 1$. This set of vectors forms a semiopen polyhedron that we will call base polyhedron.
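The integer points that fall inside such a semiopen polyhedron can be listed directly in small cases. The sketch below is only an illustration under simplifying assumptions that the text does not make: it handles two linearly independent basic vectors in the plane, and the function name `base_polyhedron_points` is hypothetical. The coefficients are found by Cramer's rule in exact rational arithmetic and then tested against the condition that each lies in [0, 1).

```python
from fractions import Fraction
from itertools import product


def base_polyhedron_points(f1, f2):
    """List integer points a1*f1 + a2*f2 with 0 <= a1 < 1 and 0 <= a2 < 1.

    Assumes f1 and f2 are linearly independent integer vectors in the plane.
    """
    det = f1[0] * f2[1] - f1[1] * f2[0]
    if det == 0:
        raise ValueError("basic vectors must be linearly independent")
    # Bounding box of the parallelogram spanned by f1 and f2.
    xs = [0, f1[0], f2[0], f1[0] + f2[0]]
    ys = [0, f1[1], f2[1], f1[1] + f2[1]]
    points = []
    for x, y in product(range(min(xs), max(xs) + 1),
                        range(min(ys), max(ys) + 1)):
        # Cramer's rule for a1*f1 + a2*f2 = (x, y), kept exact with Fraction.
        a1 = Fraction(x * f2[1] - y * f2[0], det)
        a2 = Fraction(f1[0] * y - f1[1] * x, det)
        if 0 <= a1 < 1 and 0 <= a2 < 1:
            points.append((x, y))
    return points


inside = base_polyhedron_points((2, 0), (1, 2))
```

In this example the determinant of the two basic vectors is 4 and four integer points are found, which matches the area of the parallelogram.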
nodes as
bearing
as
splits
that into
principal
of
Let
the
the
dimension
entire
disjoint into
some
of
principal
subgraph
into
disjoint.
Consider
the union
there union.
g
The v e c t o r
basic
the
bearing
of
the
basic
contains
refer
t o these
be
the
regular
graph
translations
of
the
nodes.
are parallel
into
vectors
infinite
parallel
translations
nodes a r e e i t h e r
o f a l l subgraphs
subgraph
that
identical or
are parallel
the bearing
trans-
nodes. Suppose
g has i n t e g e r components and b e l o n g s
t o t h e span o f
of basic
polyhedron.
This
which
can a l w a y s add t o i t such
vectors
means t h a t
graph
that
i s not i n the
we
regular
of the
that
vectors. Obviously,
our union,
that
of the i n f i n i t e
combination
in
are
any two b e a r i n g
of the principal
i s a node
span
Then
that
Statement 23.1 subgraphs
lations
of
space.
subgraphs
subgraph
By
r e g u l a r g r a p h . We w i l l
nodes.
STATEMENT 2 3 . 3 . same
The b a s e p o l y h e d r o n
polyhedron.
a number o f n o d e s o f t h e i n f i n i t e
poly-
that
the result
there exists
i s impossible
will
be
such b e a r i n g
integer Inside
linear
t h e base
node t h a t
because o f t h e method used
i snot
to build
it. If
basic
vectors
are linearly
independent
then
b e a r i n g n o d e s may be i n t h e same s u b g r a p h o b t a i n e d lation
of the principal
subgraph.
I f basic
d e n t t h e n s u c h p a i r s o f b e a r i n g n o d e s may Thus any i n f i n i t e graphs
that
is actually allel
a parallel
structure
ficient In cuits.
regular
are isomorphic
translation.
t o examine t h e p a r a l l e l case
basic
two
trans-
are linearly
depen-
vectors
always
splits
into
subgraph.
disjoint
The
i ti s nearly
structure of i t s principal may
define
sub-
isomorphism
As a c o n s e q u e n c e , t o s t u d y
regular graph
vectors
distinct
exist.
t o the principal
o f an i n f i n i t e
the general
graph
no
as a p a r a l l e l
the par-
always
suf-
subgraph.
a graph with circuits. Such graphs cannot be graphs of algorithms. Therefore it is important to have a criterion that verifies the absence of circuits in a regular graph.

A set $K$ of vectors is called a cone if the following conditions are true:
$s, q \in K$ implies $s + q \in K$;
$q \in K$ implies $\lambda q \in K$ for all $\lambda \ge 0$;
$q \in K$ implies $-q \notin K$ for $q \ne 0$.
The dimension of a cone is defined as the dimension of the linear space of minimal dimension containing that cone. Obviously, a cone is always a convex set. Let vectors $e_1, \ldots, e_p$ be given. The set of vectors $q$ that satisfy the system of inequalities $(e_i, q) > 0$ for all $i$ is an open cone and the set of vectors $q$ that satisfy the system of inequalities $(e_i, q) \ge 0$ is a closed cone. We will need the following statement called the Farkas-Minkowski lemma [35]; we omit the proof.

STATEMENT 23.4. Let a cone $K$ be defined by the system of inequalities $(e_i, q) \ge 0$ for $1 \le i \le p$. Then all vectors $e$ which satisfy the inequalities $(e, q) \ge 0$ for all $q \in K$, and only these vectors, can be represented in the form

$$e = \sum_{i=1}^{p} \lambda_i e_i,$$

where $\lambda_i \ge 0$.

Now we are ready to investigate the conditions for the absence of circuits in a regular graph.

STATEMENT 23.5. An infinite regular graph has no circuits if and only if there exists an integer vector forming acute angles with all basic vectors.
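The criterion of Statement 23.5 can be tried out on small examples with a brute-force search. The sketch below is not a method taken from the text: the function name `find_acute_vector` and the search bound are assumptions, and a failed search inside a bounded box does not by itself prove that no suitable vector exists. In the second example, however, the two basic vectors sum to zero, so the graph has a circuit and no such vector can exist.

```python
from itertools import product


def dot(a, b):
    """Dot product of two integer vectors."""
    return sum(x * y for x, y in zip(a, b))


def find_acute_vector(basic_vectors, bound=3):
    """Search a small integer box for q with (f_i, q) > 0 for every basic vector.

    Returns the shortest such q found, or None if the box contains none.
    """
    n = len(basic_vectors[0])
    candidates = sorted(product(range(-bound, bound + 1), repeat=n),
                        key=lambda q: dot(q, q))
    for q in candidates:
        if all(dot(f, q) > 0 for f in basic_vectors):
            return q
    return None


# Basic vectors without circuits: an acute-angle vector is found.
q = find_acute_vector([(1, 0), (0, 1), (1, 1)])

# f1 + f2 = 0 gives a circuit, so no vector q can exist.
no_q = find_acute_vector([(1, 0), (-1, 0)])
```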
Let $f_1, \ldots, f_r$ be basic vectors. Suppose that there exists the specified vector $q$ but the regular graph has a circuit. Traversing that circuit in the positive direction we establish the equality

$$\sum_{i=1}^{r} \alpha_i f_i = 0,$$

where all $\alpha_i$ are nonnegative integers at least one of which is nonzero. Forming the dot product of both sides of this equality and the vector $q$ and taking into account that $(f_i, q) > 0$ for all $i$ we have a contradiction. Consequently, if there is a vector $q$ that forms acute angles with all basic vectors then no subgraph of the infinite regular graph may have circuits.
suppose t h a t
for given basic vectors f
with
graph
may
no
specified
property
e x i s t . Take t h e
maximal subsystem o f
t o r s f o r which such v e c t o r e x i s t s . Without sume
that
i t is
open cone for
of
l^isi.
vectors
a l l j>l
in
i t s closure
and
f.+J,.••»f_
+ 1
that
=
the
+ •• •
vector
E-*f|j
f s
s
negative "here
'
umns o f
$ are
vectors f
us
examine
the
integer
numbers.
sequently $u=0 has
with a
linear
this
tain
a
of
the
some o f for
negative.
4>u=0 w h e r e
Using
s y s t e m we
the
from
Kramer's
detail.
formulas
The to
is
a
be
c h o s e n t o be We
have
solution
nents
by
tion.
I t follows
to
the
perturbing that
fundamental
continuous
fundamental
system of
system
siightly the
means
is a the
socol-
matrix
find
conclude that a l l vectors of
can
the
the
example,
This
p o s i t i v e components equations
and
vectors
f.
components.
vectors
combination of
,
open cone K
Suppose,
a l l fi. a r e with
holds
with
found
out
a
*u=0 the
system of
function solutions with
of are
positive
c o e f f i c i e n t s of
4
is
funda-
the
fun-
r a t i o n a l and
con-
that
the
system
s o l u t i o n w i t h p o s i t i v e components. T h i s s o l u t i o n i s a
combination of
vectors
and 1)
s y s t e m *u=0 i n g r e a t e r
solutions
integer
f
each
an
(fj,q)>0
,q)sQ
linear combination of
linear algebraic
composed o f
solution to
S*l
, . . . ,
i n the
{f
as-
K be
inequalities
23.4
coefficients.
(-8...,-8,
system of
of
vec-
can
i < r . Let
inequality
for a l l q
a
to the
mental
-0
as
=
damental system of
q)
the
represented
lution
Let
system
Statement
S
u
the
of
wi t h +
by
where
assumption
with
basic
l o s i n g g e n e r a l i t y we
f
virtue
f^,.... . , f:.' pjfj
our
Hence By
K. b e
vectors
defined
to
qeK.
c a n
by
q
According
for
vectors
formed
vectors
r
1
f j
a
circuits. Now
the
the vector
have
linear
solutions.
Since
i t s coefficients integer,
we
can
and ob-
and
r a t i o n a l compo-
the
linear
s y s t e m $ u = 0 must h a v e a p o s i t i v e
combina-
integer
so-
lution. If
y ,..., r
, n
are
the
values of
corresponding variables
then
the
vectors
nonexistence implies
f o r m
yf^,...,
the existence
The
vector
of
Inequalities
be
chosen
q forming
c I r c u i t
of circuits
Consequently, the a l l basic
In the I n f i n i t e
point
regular
of the said
By
the continuity
and due t o t h e u n i f o r m i t y
vectors
graph.
The
vector.
o f t h e cone d e f i n e d
q)>0 f o r l a i s r .
(f.,
t o be r a t i o n a l
-
acute angles w i t h
guarantees the existence i s an i n n e r
q
a
+1
by t h e system
argument
i t can
o f dot products i t
be chosen to be integer.

Let basic vectors $f_1, \ldots, f_r$ be given. Suppose no linear combination of them with nonnegative integer coefficients equals zero. According to Statement 23.5 there exists a vector $q$ forming acute angles with all basic vectors, i.e. the inequalities $(f_i, q) > 0$ hold for $1 \le i \le r$. Let $x$ be an arbitrary vector from the same space as graph nodes. Consider linear functions of the form $t(x) = (x, q) + \gamma$ where $\gamma$ is an arbitrary constant. Level surfaces $t(x) = a$ for such functions are hyperplanes.

Let $G$ be an arbitrary regular graph with basic vectors $f_1, \ldots, f_r$. Suppose that there is an arc $f_i$ originating from some node $u$ and pointing to another node $v$. We have

$$t(v) - t(u) = (v, q) - (u, q) = (v - u, q) = (f_i, q) > 0.$$

This means that $t(x)$ is a linear space-time schedule for the graph $G$. Moreover, if we set the delay on $f_i$ to be $(f_i, q)$ then $t(x)$ will be an optimal space-time schedule. Consequently the following statement is true:

STATEMENT 23.6. Consider a regular graph with basic vectors $f_1, \ldots, f_r$. Take any vector $q$ that forms acute angles with all basic vectors and set the delay $(f_i, q)$ for every arc $f_i$. Then the function $t(x) = (x, q) + \gamma$ will be an optimal space-time schedule for the graph.
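The identity t(v) - t(u) = (f_i, q) behind Statement 23.6 is easy to confirm numerically. The following sketch checks it on a finite regular graph cut out of a 4 x 4 box; the basic vectors, the vector q, the constant gamma, and all function names are choices made for this illustration only.

```python
from itertools import product


def dot(a, b):
    """Dot product of two integer vectors."""
    return sum(x * y for x, y in zip(a, b))


def linear_schedule(q, gamma=0):
    """Return the linear function t(x) = (x, q) + gamma."""
    return lambda x: dot(x, q) + gamma


basic_vectors = [(1, 0), (1, 1)]
q = (2, -1)  # (f_1, q) = 2 and (f_2, q) = 1, so all angles are acute
t = linear_schedule(q, gamma=5)

nodes = set(product(range(4), repeat=2))
# An arc f leads from u to v = u + f whenever both ends lie in the box.
arcs = [(u, tuple(a + b for a, b in zip(u, f)), f)
        for u in nodes for f in basic_vectors
        if tuple(a + b for a, b in zip(u, f)) in nodes]

# Time difference along every arc, paired with the dot product (f, q).
differences = [(t(v) - t(u), dot(f, q)) for u, v, f in arcs]
```

Every arc advances the schedule by exactly (f_i, q), independently of where the arc sits in the graph and of the constant gamma.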
Note that there is no mention of initial conditions for the schedule. They can be easily reconstructed using the schedule itself.

The space-time schedule $t(x)$ defines a parallel form of the graph. The layers of that form are defined by those level surfaces of the function $t(x) = (x, q) + \gamma$ that contain at least one node. But any level surface is a hyperplane directed by the vector $q$. Therefore the entire set of parallel form layers is described by the set of hyperplanes directed by one and the same vector $q$ and containing all graph nodes. Since the hyperplanes are directed by one and the same vector they are parallel. If our graph is the graph of some algorithm then the number of distinct hyperplanes covering it determines the algorithm execution time. Hence the number of hyperplanes is determined by the distance between neighboring hyperplanes. At this point we will again consider the infinite regular graph so as not to get entangled in particularities of concrete regular graphs. We draw a hyperplane with the directing vector $q$ across every integer point in space and consider the distance between neighboring hyperplanes.

STATEMENT 23.7. Hyperplanes directed by an integer vector $q$ and containing integer points form a family of equidistant hyperplanes. The distance between neighboring hyperplanes equals $d \, \|q\|_E^{-1}$ where $d$ is the greatest common divisor of nonzero components of the vector $q$ and $\|q\|_E$ is the Euclidean norm.

Consider the hyperplane $(x, q) = \delta$. The vector $q = (q_1, \ldots, q_n)$ has integer components. Since this hyperplane contains at least one integer point, the number $\delta$ must be integer. Let $z = (z_1, \ldots, z_n)$ be an integer vector. Denote $q_i = d q'_i$ where $q'_i$ is integer and consider the residual

$$r = d \, (q'_1 z_1 + \ldots + q'_n z_n) - \delta. \tag{23.2}$$

It is known that the equation $q_1 z_1 + \ldots + q_n z_n = \delta$ has an integer solution if and only if $\delta$ is a multiple of the greatest common divisor of coefficients. The number of solutions is always infinite. The numbers $q'_1, \ldots, q'_n$ are mutually prime so the equation $q'_1 z_1 + \ldots + q'_n z_n = 1$ must have an integer solution. It follows that the value of the expression in brackets in (23.2) can run through all integer numbers. This implies that for integer $z_i$ the absolute value of the minimal nonzero residual $r$ in (23.2) equals $d$ and the distance between neighboring hyperplanes equals $d \, \|q\|_E^{-1}$.

The desire to minimize algorithm execution time leads us to minimizing the number of hyperplanes covering the graph. If we ignore particularities of concrete graphs then the vector $q$ should be chosen to maximize $d \, \|q\|_E^{-1}$.
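The distance formula of Statement 23.7 can be evaluated and cross-checked numerically. In the sketch below the function name `hyperplane_spacing` and the sample vector are assumptions for illustration; the brute-force part simply collects the values (x, q) over a small box of integer points and looks at the gaps between consecutive values, which should all equal d.

```python
from functools import reduce
from itertools import product
from math import gcd, sqrt


def hyperplane_spacing(q):
    """Distance d / ||q|| between neighbouring hyperplanes (x, q) = const
    that pass through integer points, for a nonzero integer vector q."""
    nonzero = [abs(c) for c in q if c != 0]
    if not nonzero:
        raise ValueError("q must be a nonzero vector")
    d = reduce(gcd, nonzero)
    return d / sqrt(sum(c * c for c in q))


q = (2, 4)
spacing = hyperplane_spacing(q)

# Brute-force check: values of (x, q) at integer points of a small box.
levels = sorted({sum(a * b for a, b in zip(x, q))
                 for x in product(range(-3, 4), repeat=2)})
gaps = {b - a for a, b in zip(levels, levels[1:])}
```

For q = (2, 4) the greatest common divisor is 2, so the spacing coincides with that of the shorter vector (1, 2) pointing in the same direction.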
Thus t h e r e lel
structure
graph
There high-speed account
hyperplanes
the
graph
forms
acute i s not
may
tell
on
to
a
as
split
that
a
high-
the
regular
so
disjoint
subgraphs
may
t o high-speed
the of
may
can
cannot
be
in
accordance
i n general however,
change
t h e change
ones.
same
time
the
vector
the
graph.
Even
exist.
In that
investigated
Peculiar
t h e d e t e r m i n a t i o n o f q b u t n o t on i t i s only
to
t h e o t h e r hand, a
still be
graphs.
due
f o r m s a c u t e a n g l e s w i t h a l l ba-
i s at
graph
into
case our f a m i l y of
r e g u l a r g r a p h . On that
obtaining
f a r taken
subfamilies
disjoint
vector
of
that
ones.
t o become
into
a l l the arcs
of
ones
paral-
particular
high-speed
In that
graph
such
be
not
s c h e d u l e s . Most o f t e n ,
I n t h e g e n e r a l case,
high-speed
have
into
the vector q t h a t
with
not
modified
regular graph.
that are close
regular
the
f o r every
c a n p r e v e n t us f r o m
some p e c u l i a r i t i e s
regular
angles
may
siightly
h a n d , we
can
split
have
structure
guidelines
t h e one
o f high-speed
of
graph
ules.
chosen to
f o r determining
course,
method
be
of the i n f i n i t e
Notice also that
parallei
q s h o u l d be
schedules.
graph
also
i n schedules
vectors
this
of the i n f i n i t e
structure
sic
by
Of
two o b s t a c l e s t h a t
a regular
must
procedure
graph.
always
t o high-speed
the s p l i t t i n g
results
found
nearly
s c h e d u l e s . On
that
regular
regular
are i n f a c t
splitting
with
constructive
any
t h e y can
speed o r c l o s e
the
is a of
the schedules
However,
then the vector
1
dllqll^. .
i n case
of
the
t h e method o f b u i l d i n g
Graphs
f o r which
the same
the
the closeness of found
proven.
the
case
along
features
that
graph sched-
schedules there
Graphs for which there are vectors forming acute angles with all graph arcs are substantially simpler as regards the investigation of their properties. Such graphs are called strictly oriented graphs and the corresponding vectors $q$ are called orienting vectors.
Consider
strictly
oriented
points.
Using
any
special
graph. Moreover, arcs
are
circuits
can
strictly
oriented.
be
graph
transformations,
i t can
canonical
their
be
reduced
vectors.
reduced
to
we
and
whose can
coordinate
properties.
t h e c o r r e s p o n d i n g vec-
nodes
reduce
to a coordinate
In particular, a
parallel
occupy
integer
i t to a
regular
regular
any
regular
graph
since
graph
graph the
whose
without
former
is
Let that
the
vectors
f o r some v e c t o r
Inequalities S
«
1
priate. taken
allow
such
n
There are
q
For
that
every g
i we
,
By
select
the
we
system
the
space.
build
a l l arcs Isj.ql^O
various
inequalities
the
a basis of
vector
Inequalities
to
s y s t e m s and
example,
t o be
the
us
many s u c h
For ing
describe
of
we
contains space
can
no
Integer
into
gi,..,gn
assume w i t h o u t
Now
lelepipeds within
by
tion
an
form
The
direction The
of
Note t h a t cussed
since
the
new
a similar
a l g o r i t h m on
g r a p h s h a v e so
about
them.
t h i s p r o b l e m can
be
Let
the
cone
that other
I f we
q =
problems are While
(1 solved
investigating
For
dissects
the to
these
graph
the
1) just
graph. This graph
that
vector
of
regular
of
the
problem
vector
I s the s o l u t i o n simply
facet. of
of
as
a l g o r i t h m graphs,
this we
dis-
to
Im-
Regular
everything
is
finding
the
to the
graphs, following
must f i n d a v e c t o r
to this
the
graph.
i s maximum.
1
dtlqa"
we
intersect
that
memory c o m p u t e r s y s t e m .
the
a
graph
construc-
possibility
u s i n g S t a t e m e n t 23.7,
as
fall
t h e a l g o r i t h m g r a p h was
example,
that
paral-
that
original
the d i r e c t i n g
v e c t o r s . T h e n we
such
that
orthogonal
structure that v i r t u a l l y
for
hyper-
hyperplane
ignore p a r t i c u l a r i t i e s of concrete
canonical
(f,,q)>0, lsjsr,
the vector
hierarchical
transformed,
be
f ,... f
by
a coordinate
Investigation
simple
Consider,
form.
one.
a
of
original
directing
transformation of
coordinate
direct-
g r a p h w i t h t h e nodes o f arc
t h e new
the
i s defined
2 d u r i n g our
maximum p a r a l l e l
t h e new
the
be
original
be
a d j o i n i n g p a r a l l e l e p i p e d s then
of
with
arc
the
nodes of
built will
plement an
known
two
small.
are
i.j.
appro-
always
every
hyperplanes facets
one
most
between d i s t i n c t
i t s n o d e s t o be
of
least
a l l arcs
angles
thus
i n Chapter
I f at
separating
acute
graph
nodes
corresponding
is consistent facet
of
whose
the nodes o f
grid.
facet two
system
vectors
for a l l
can
arbitrarily
graph, s e t t i n g those
Identify
a
arc
one
facet.
a new with
rectangular
goes t h r o u g h link
build
hold
generality that
paralleleplpeds
together
t h e m . We
regular
This
Integer
hyperplanes w i t h
the distance
losing
points.
semiopen
will
Suppose
f o r a l l i . These
of
g , . . . ,g
vectors
p l a n e s c o n t a i n i n g i n t e g e r p o i n t s c a n n o t be reason,
graph.
f r e e t o c h o o s e one
a system o f p a r a l l e l
S t a t e m e n t 23.7,
a
hold
systems
(s^gy^O
are
of
It is
q in
obvious
problem. A l o t o f
one.
a l m o s t never drew
a dls-
214 tinction
between
individual
graph
n o d e s may r e p r e s e n t d i f f e r e n t transfers of different a
good
illustration
of varying
transfers. different We graphs
operations
By
introduce
sizes,
identifying
implying
various
set
g(x)
sizes
by t h e f o l l o w i n g n o t e .
(22.4).
does
a r c s may s t a n d f o r
The p a r a l l e l e p i p e d s of operations t o take
conducted
into
Our
and
with may data
account the
investigation of
on a g e o m e t r i c
on
x
basis.
and
i s actually
regular
regular
The same
f u n c t i o n a l methods. Consider
I n t h e case o f an i n f i n i t e
n o t depend
various
nodes.
r e s e a r c h c a n be c a r r i e d o u t u s i n g (22.1)-
Actually
the parallelepipeds
new o p e r a t i o n s .
contents o f separate
conclude
arcs.
and v a r i o u s
I n some c a s e s i t becomes n e c e s s a r y
s t r u c t u r e was c h i e f l y
lations
and
o f d a t a . The a b o v e g r a p h t r a n s f o r m a t i o n i s
of this.
new n o d e s we i n f a c t be
kinds
nodes
the re-
g r a p h the
the set of
numbers
1 , 2 , . . . , r . The f u n c t i o n s v^ix)
do n o t depend on x e i t h e r and a r e iden-
tical
we s e a r c h f o r s c h e d u l e s
with vectors
f
(
f u n c t i o n s of the form ned
by i n i t i a l
. Suppose
i (x) = (x, q)
conditions
f o r s c h e d u l e s , we o b t a i n u s i n g
m i n ( ( 1f i:lsr This q. ule
i s a Bellman
will
q)
D
-
Ix))
1
*
(22.1)
0.
i n e q u a l i t y w i t h constant c o e f f i c i e n t s w i t h respect to
I t i s immediately c l e a r (x,q)+T)
that are linear
S i n c e we a r e c u r r e n t l y n o t c o n c e r -
that
i f we s e t
be a n o p t i m a l
o n e . Now
(x)=(f,q)
l e tw^(x)=l.
the f o l l o w i n g i n e q u a l i t y f o r t h e determination
m i n ( f , q) i=i*r
or, e q u i v a l e n t l y , t h e system o f l i n e a r
then
t h e sched-
T h e n we w i l l
have
o f q:
i 1
inequalities
If,,
q)
i 1,
(f .
q)
* 1-
(23.3) r
It follows that the set of admissible vectors q is part of a truncated cone. In particular, we see that if the components of $f_i$ are positive integers then the vector $q = (1, \ldots, 1)$ is certainly admissible. The maximum number of linearly independent schedules can be readily obtained. It is equal to the dimension of the cone (23.3).

As a rule, we are not concerned about the entire cone (23.3); we are interested only in a discrete set of points inside it. If the vectors $f_i$ are integer, the vector q also has to be integer. Adding to the conditions (23.3) the condition for q to be optimal we shall have an integer programming problem with linear constraints. According to Statement 23.7 the maximum value of the quantity $d\|q\|$ can be a condition for q to be optimal. This functional research can be continued.
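The integer programming problem mentioned above can be illustrated on a toy instance. This is only a sketch under stated assumptions: the domain D is taken to be a box $[0, d_1] \times \ldots \times [0, d_n]$, and the objective is the spread of $(x, q)$ over that box. Both choices, as well as the names `schedule_span` and `best_integer_q`, are our own illustrative assumptions and not the optimality condition of Statement 23.7.

```python
"""Brute-force search for an integer vector q satisfying (f_i, q) >= 1."""
from itertools import product


def dot(a, b):
    """Scalar product (a, b) of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def schedule_span(q, box):
    """max (x, q) - min (x, q) over the box [0, d_1] x ... x [0, d_n]."""
    return sum(abs(qk) * dk for qk, dk in zip(q, box))


def best_integer_q(arc_vectors, box, bound=3):
    """Smallest-span admissible integer q with components in [-bound, bound].

    Returns a pair (span, q), or None when no admissible q lies in the range.
    """
    best = None
    for q in product(range(-bound, bound + 1), repeat=len(box)):
        if all(dot(f, q) >= 1 for f in arc_vectors):
            candidate = (schedule_span(q, box), q)
            if best is None or candidate < best:
                best = candidate
    return best


# Hypothetical data: two unit arcs in a 4 x 4 box.
print(best_integer_q([(1, 0), (0, 1)], (4, 4)))
```

Exhaustive enumeration is used only because the instance is tiny; a realistic problem would call for a proper integer programming method.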
24. Passage to the Limit

Although
the functional relations
some o f t h e i r p e c u l i a r i t i e s points
x with vector
Integer. to
much l e s s this
then
point
close,
In
particular,
as
t h e number o f a l g o r i t h m o p e r a t i o n s As o u r p o i n t s
ules w i l l lution
large
from a formal
cannot
apply
etc. while
the process
problems,
identify
be as a
rule
considered
p o i n t o f view.
o f t h e d o m a i n D must
such
Increase
unless
we
important
studying
of finding
involve a l o t of complications
discrete
I f we
will
simple,
increases.
we
differentiation,
Consequently,
inevitably
of
the size
are not close,
t i o n s as c o n t i n u i t y , (22.1)-(22.4).
that
coordinates
i n s i d e o u r d o m a i n D c a n be
arbitrarily
means
are quite
are not immediately obvious.
indices
Therefore no two points
be close,
(22.1)-(22.4)
the relations
space-time germane
impose
no-
sched-
t o t h e so-
some a d d i t i o n a l
constraints. The
underlying
idea
analysis o f r e a l - l i f e always
include
computation, this
kind
rithms, lems,
explicitly
research
i s a s f o l l o w s . The
that their descriptions
the parameters
and v e c t o r
and space s t e p s
the accuracy
eterized
reveals
that determine
nearly
the amount of
i.e. the dimensions of the domain D. Sample parameters of
are matrix
time
o f our subsequent
algorithms
dimensions
i n algorithms
of the solution
amount of computation
in linear for finite
in iterative
is a most typical
algebraic
algo-
difference
prob-
methods,
e t c . Param-
feature of the majority
of
algorithm
transformation pendent o f nodes
descriptions.
t h e s a i d p a r a m e t e r s . The
reside
may
w e l l be
can
now
will
be
r e g a r d e d as
determining
grounds
to
a
simple
task
to
find
a new
t h e new
amount
the
discrete
set
of
graph
points
that
s e t o f a l g o r i t h m g r a p h nodes. However,
points of
the
consider
another
of
set w i l l
computation
situation
of
the
condense as grow,
limit
of
so
the
that
algorithm
a
inde-
set o f p o i n t s where a l g o r i t h m
mapped o n t o
hope t h a t t h e
eters
It is usually
that maps D onto some domain whose dimensions are
we
param-
we
have
graph
se-
quence. When goal.
i n v e s t i g a t i n g the
I t Is practically
lems on I f we
algorithm
could
plexity
find
In
sample
rithm execution graphs, noted
graphs r e l y i n g algorithms
is linear
examine a
this
are
course,
the
o f one
and
ameters,
we
similarity sage t o t h e ent
the
of
certain
a
in
an
rithm
not
we
may
overlook
of
like
the
time
much g r e a t e r
required
than
the
real-life
unproductive.
I t
to
algo-
algorithm
should
r e f e r s t o graphs
do
not
be
also
linear
exist.
be
t h a t have
know p r e c i s e l y
arbitrary
graphs,
Apparently
the
com-
corresponding
to d i f f e r e n t
exceptional
to exploit
discrete
arbi-
i n what
large
although
some
difference
q u i t e s u b s t a n t i a l . As
concerned w i t h
f o r the
be-
graphs
values of
par-
similarity.
I t is
this
when p a s s i n g t o t h e
limit.
Pas-
joint
attempt
Investigation of to replace
algorithms
by
an
the
differ-
investiga-
i n v e s t i g a t i o n of
problems.
the
complexity
depend on
parameters
algorithm.
Even
p r o b l e m s whose com-
immense s i z e o f
I t I s e s s e n t i a l l y an
continuous
nodes,
be
from
their
going
prob-
exist.
clearly
i s not
family
Hopefully will
not
graphs d i f f e r
are
limit
algorithms.
tion
do
same a l g o r i t h m
t h a t we
graph
obviously
algorithms
cannot
well-defined
d i s c r e t e mathematics methods.
above d i s c u s s i o n
features
tween d i f f e r e n t
a
optimization
v e r y many d i s c r e t e p r o b l e m s f o r w h i c h
simply
algorithm
distinguishing
pursue
general
s o l u t i o n of our
graph would
structure. Unfortunately,
real-life
on
number o f
Is
we
to solve
What w i t h t h e
approach
that there
trary
the
time.
situation
f o r the
algorithm
plexity algorithms Of
limit
impossible
I f this
of
the
s o l u t i o n of
that determine
turns
structure examination w i l l
out be
t o be
the
true
continuous
amount o f
then
independent of
the
the
problems
computation
time
of
algorithm
algo-
execution
time. There are sufficient
to f i n d ity
the solutions
of important
quite
o f continuous
cases.
acceptable
grounds to presume that we will
Note
provided
problems e x p l i c i t l y
that
their
numerical
t h e amount o f t i m e
be a b l e
f o r t h e major-
solution
I s also
required to obtain
I t
Is
not t o o l a r g e . A l o t of questions problems.
They
solution
solutions,
etc.
low. B e f o r e like
arise
We
will
Let
t o continuous
f e a t u r e s o f these
the corresponding
t r y t o p r o v i d e answers
t o these
attention
t o the fact
that
we
have
algorithm
t o do
graphs.
that
a family
of like
increases
algorithms parameterized
s o t h a t when c t e n d s
to infinity.
nodes a r e s i t u a t e d domain a t t h e l i m i t .
algorithm
q u e s t i o n s bewould
i t basing
on
i s apparent.
t o t h e amount o f c o m p u t a t i o n
graph
problems
So far, however,
by some ε be given.
This can in general be a vector parameter. We assume that
computation
problems,
discrete
the problems themselves should be posed. We
fuzzy knowledge of real-life
no other approach
as we are moving
to characteristic
reconstructing
we d o i t ,
t o draw
rather
pertain
methods,
naturally
inside
that
some g i v e n
t o zero
into
domain
disjoint
i n D. T h e l a t t e r
c
algorithm
0 and a r e dense i n that
subgraphs
condition
t h e amount o f
f o r every
F o r now we a l s o s u p p o s e
g r a p h s do n o t s p l i t
hood o f a n y p o i n t
Suppose
e i s related
f o r a l l small
c
i n the neighbor-
i s t o be s u b s e q u e n t l y r e -
placed by a m i l d e r one. It flect
graphs. To
Is intuitively
itself
Anyway, t h i s
refine
life any
clear
that
the likeness o f algorithms should r e -
I n a likeness o f arc description
this
i s how we w i l l
understand
otic
x i n D c a n be e i t h e r
behavior
crete
the arcs
discontinuities continuity
on a r e g u l a r l y
c a n be e i t h e r of regularity.
concentrate along
D i s independent Of
course,
regular
the corresponding
set residing
sides,
the notion
arcs
o f arcs
As a r u l e ,
with
realof
I n t h e case o f cha-
are asymptotically
I n some
surface, line,
short o r long.
their
i n t h e neighborhood
or chaotic.
changing
of
o f likeness.
notion we draw once again from our experience
algorithms. The asymptotic behavior point
for the family
I n general
dis-
o r c o n e . Bethere
may be
however, these p o i n t s o f d i s -
some s u r f a c e s a n d l i n e s
whose p o s i t i o n i n
o f e. there a r e a l g o r i t h m s I n which
algorithm
graph
arcs behave differently. Nonetheless we will not discuss potential differences. Our experience in algorithm graph building is not sufficient to give practical grounds for a substantial expansion of the list of peculiar features. Also note that many differences can be extinguished or alleviated by an appropriate modification of the graph. In our opinion, simple cases should be treated first. Our assumptions as to the asymptotic behavior of arcs are subject to revisions as our experience with algorithm graphs grows. The discussion of simple cases is specifically aimed at acquiring such experience.

Let a finite set of piecewise smooth functions $f_j(x)$ be given in domain D. In general each of these functions may be defined only in some part of the domain D. We assume that these functions describe in some way the behavior of algorithm graph arcs. Suppose that for every ε each arc either is described by a single function $f_j(x)$ or it is described by some function from a certain subset of functions (and we do not know which function it is). In the former case we tag the arc with the number of the corresponding function and in the latter case we tag it with the number of any function from the subset. Now we clarify the nature of arc description.

Let three disjoint subsets $g_1(x)$, $g_2(x)$, and $g_3(x)$ of the set of numbers of functions $f_j(x)$ be given for every x
in D. We assume these subsets to be piecewise constant in D. Suppose for every ε in the neighborhood of the point x the jth arc is specified by the function $\varphi_j^{(\varepsilon)}(x)$ and that

$\varphi_j^{(\varepsilon)}(x)$ has an asymptotical representation of the form $f_j(x)$ if $j \in g_1(x)$,

$\varphi_j^{(\varepsilon)}(x)$ has an asymptotical representation $|\varepsilon|^{\alpha_j} f_j(x)$, $\alpha_j > 0$ if $j \in g_2(x)$,

and, if $j \in g_3(x)$, by one of the functions that have asymptotical representation of the form $|\varepsilon|^{\alpha_j} f_j(x)$, $\alpha_j > 0$.

The set of numbers $g_1(x)$ corresponds to long arcs. These arcs asymptotically have a direction and a nonzero length. The arcs that have numbers from $g_2(x)$ are short. They also have a direction asymptotically, but their lengths decrease as $|\varepsilon| \to 0$. The set of numbers $g_3(x)$ corresponds to short chaotic arcs. Generally speaking the case where
arcs are long and chaotic is also possible. long
arcs
having
asymptotic
augmentation
o f the graph.
of
respect
this with Our
fore
Our i n t r o d u c i n g
are specified.
Clearly,
often
algorithm
to
in
that
the We
have
denote
short
tend
unsplittable paths
i n some
i n some
schedules part
neighbor-
t o short arcs
I E 1
( x ) . Let D.
The
we assumed o u r g r a p h s
f o ra l l small r i t follows length
by T
o f t h e domain
0, S i n c e
o f a r b i t r a r i l y large
as e
that 0.
have a p o s i t i v e
t h e graphs
I f the funclower
bound
the quantities = sup T xeO
T
become a r b i t r a r i l y l a r g e a s e
normalized
(x)
0 . T h e r e f o r e we h a v e
i n some way b e f o r e t h e i r
. T h e i r maximum v a l u e
8
space-time
arcs
(x) corresponding
consider
bounded
be imposed on t h e s e f u n c t i o n s as
t o z e r o as c
same p a r t o f D t h e n
will
c a n be d e -
they can f a i r -
arises.
graphs
schedules
they
on
a s f^lx)
constraints w i l l
8
will
to believe
i n D. M o r e o v e r ,
functions
to the
d e f i n e d i n t h e same p o i n t s
some c , we w i l l
contain (c)
t i o n s iiij
by c o n s t a n t
passing
1. Suppose
Additional
be l o c a l l y
functions
when
o n t h e way d e l a y s
nonnegative
l e n g t h s o f such a r c s
will
depends
There-
e q u a l i n g e. g.
need f o r t h a t Given
we c a n e x p e c t
schedules.
by a
e
the
i s an example
on t h e 1 t h arc i s specified
function wj '(x) hood o f x.
the set gg(x)
So f a r we h a v e g r o u n d s
be s p e c i f i e d
the delay
what
something
s c r i b e d by bounded n o n n e g a t i v e ly
by an a p p r o p r i a t e
task i s t o s t u d y t h e s e t o f space-time
f o r schedules.
that
Such a r c s c a n b e r e d u c e d t o
and l e n g t h s
t o short arcs.
l e t us t r y t o understand
limit arcs
chief
directions
schedules
limit
behavior
o f t h e form
t
t o normalize
c a n be d i s c u s s e d . ( C
'(x)
• r
< e l
(x) /
i n D i s 1 f o r a l l e.
IE)
X
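According to (22.1) the set of all schedules $T^{(\varepsilon)}(x)$ satisfies the relation

\[ \min_{j \in g(x)} \bigl( T^{(\varepsilon)}(x) - T^{(\varepsilon)}(x - \varphi_j^{(\varepsilon)}(x)) - w_j^{(\varepsilon)}(x) \bigr) \ge 0. \]

Moving to normalized schedules $t^{(\varepsilon)}(x) = T^{(\varepsilon)}(x) / \theta^{(\varepsilon)}$, where $\theta^{(\varepsilon)} = \sup_{x \in D} T^{(\varepsilon)}(x)$, and taking into account the assumptions about the asymptotic behavior of arcs we have

\[ \min_{j \in g_i(x)} \Bigl( t^{(\varepsilon)}(x) - t^{(\varepsilon)}(x - \varphi_j^{(\varepsilon)}(x)) - \frac{w_j^{(\varepsilon)}(x)}{\theta^{(\varepsilon)}} \Bigr) \ge 0, \]

i = 1, 2, 3. Note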
ative,
that
leg
(t
plifies
will
(x)
I E 1
(x)
- t'
c
>
(x
-
limit
our investigation
certain
C ,
v<E'(x)))
i s nonneg-
s a t i s f y weak-
B 0
to the relations
( x ) and unknown p a r a m e t e r s
some s t a g e s o f i n v e s t i g a t i o n
quite
satisfying
(24.2)
functions
t
( E 1
extra
Suppose piecewise the
certainly
(24.2)
i = 1 , 2 , 3. We
< e l
(x) will
Ax)
sence o f t h e f u n c t i o n s ( j j
t
t
always
C)
(x)/B^'
( E ,
o f the form
min
for
I E >
the r a t i o
so t h e normalized schedules
relations
(24.1)
E
legtW for
have
(x) satisfying
a l o t . The s e t o f f u n c t i o n s i s larger
the inequalities
(24.1),
than
the set of
L a t e r we w i l l add
that
normalized
a
sequence
function
(24.2)
For
t h e case
tic
representations.
(x)-t'
schedules
as i t s l i m i t .
fix)
detail.
i s obvious here
a n d we
I fthe l i m i t
ic>
ip
(x)
function
6
G
( z j has a
L e t us consider
Start
(t(x)-i(x-fAx)))
the functions
leg(x)
borhood o f x then
of
i n greater
min
( C ,
sim-
t o make t h e c h o i c e o f a s c h e d u l e .
Passage t o t h e l i m i t
t
The ab-
i n them
c o n d i t i o n s t o (24.2)
smooth
relations
the inequalities
(24,2), E
QT '
with
each o f
t h e case
leg(x).
a r e s m a l l and have
asympto-
have
^ 0.
l ( x ) i s smooth
i n t h e neigh-
asymptotically
E 1
=
U-^
E ,
(x))
<X, |e| ( g r a d
- (grad
£(x),V]
E ,
(x))+0(|c|
20.
i(x),f
(x))+0(|ej
).
2 0 ; i
)
=
Suppose that $(\operatorname{grad} t(x), f_i(x)) \ne 0$ at x. Ignoring higher order terms we obtain

\[ \min_{i \in g_2(x)} (\operatorname{grad} t(x), f_i(x)) \ge 0. \]

For the case $i \in g_3(x)$ the number j may in general depend on ε. Nonetheless the functions $\varphi_j^{(\varepsilon)}(x)$ are small and have asymptotic representations for all j for every ε. Hence we have as in the previous case

\[ t^{(\varepsilon)}(x) - t^{(\varepsilon)}(x - \varphi_j^{(\varepsilon)}(x)) = |\varepsilon|^{\alpha_j} (\operatorname{grad} t(x), f_j(x)) + O(|\varepsilon|^{2\alpha_j}) \]

and finally

\[ \min_{j \in g_3(x)} (\operatorname{grad} t(x), f_j(x)) \ge 0. \]

Thus, if a limit function t(x) for normalized schedules exists, we can expect it to satisfy the following system of inequalities:

\[ t(x) \ge t(x - f_i(x)) \quad \text{if } i \in g_1(x), \]
\[ (\operatorname{grad} t(x), f_i(x)) \ge 0 \quad \text{if } i \in g_2(x),\ g_3(x). \qquad (24.5) \]
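A candidate limit schedule can be tested against the system (24.5) at sample points. The sketch below assumes the reading of (24.5) as $t(x) \ge t(x - f_i(x))$ for long arcs and $(\operatorname{grad} t(x), f_i(x)) \ge 0$ for short and chaotic arcs. The function names, the finite-difference gradient, the tolerance, and the sample arcs in the unit square are all hypothetical choices made for illustration.

```python
"""Pointwise check of a candidate limit schedule t(x) against (24.5)."""


def grad(t, x, h=1e-6):
    """Central-difference approximation of grad t at the point x."""
    g = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        g.append((t(xp) - t(xm)) / (2 * h))
    return g


def satisfies_limit_system(t, long_arcs, short_arcs, points, tol=1e-6):
    """True when both groups of inequalities hold at every sample point.

    long_arcs and short_arcs are lists of functions x -> f_i(x).
    The caller must supply points for which x - f_i(x) stays inside D.
    """
    for x in points:
        for f in long_arcs:
            shifted = [xi - fi for xi, fi in zip(x, f(x))]
            if t(x) < t(shifted) - tol:
                return False
        g = grad(t, x)
        for f in short_arcs:
            if sum(gi * fi for gi, fi in zip(g, f(x))) < -tol:
                return False
    return True


# Hypothetical arcs in the unit square: one long arc, two short directions.
long_arcs = [lambda x: (0.5, 0.0)]
short_arcs = [lambda x: (1.0, 0.0), lambda x: (0.0, 1.0)]
points = [(i / 4, j / 4) for i in range(2, 5) for j in range(5)]


def good(x):
    """Candidate that increases along every arc direction."""
    return (x[0] + x[1]) / 2


def bad(x):
    """Candidate that decreases along the direction (0, 1)."""
    return (x[0] - x[1]) / 2


print(satisfies_limit_system(good, long_arcs, short_arcs, points))
print(satisfies_limit_system(bad, long_arcs, short_arcs, points))
```

A pointwise check of this kind can only refute a candidate; passing it at finitely many points does not prove that the inequalities hold throughout D.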
lays
The f a c t
that
on a r c s
can be
presume t h a t of
some
£ ( x ) , f^x))
these
limit.
The
follows
that
time o f data limit
the will
schedule given
transfer
a
l i e on limit.
functions. level
remains
by z e r o
those
arcs.
I n general
we
of a
intersecting level
a s we
traverse
i t i s impossible T h i s means t h a t
to
surfaces
such
schedule that
i n the l i m i t
arcs.
t o know t h e
passing
we d o n o t k n o w p r e c i s e l y
space-time
can a n t i c i p a t e
functions
f o r de-
natural
s u r f a c e s a f t e r we p a s s t o t h e
unchanged
schedule,
the functions
I t is perfectly
by t h e replacement o f c e r t a i n weight
surfaces
Consequently,
be r e p l a c e d
a l l ) arcs
l i e on these
limit
along
may b e a c c o m p a n i e d
(Jj(x) by zero will
as f o l l o w s .
( o r even almost
limit
g^x).
r e l a t i o n s do n o t i n v o l v e
interpreted
space-time schedules w i l l
It
? 0 i fi e g ^ x ) .
after
a l l weight
to the
functions what
we
arcs
pass t o functions
s i t u a t i o n . This
is exactly
what
happened.
The i m p l i c a t i o n
schedules may have a generalized circumstance
by I t s e l f
i s not surprising
delays on arcs a r e o f t e n generalized
i s that
schedules
a sequence
o f space-time
space-time s c h e d u l e as i t s l i m i t .
specified
i f we
take
into
account
i n a haphazard fashion.
i n §5. T h e y
correspond
t o zero
This
We
that
discussed
delays
on a l l
arcs. In
case
schedule along be
points a
|e| ) ing
we
arcs
pass
do n o t l i e on
to the limit,
them c a n be t o a n e x t e n t
ascribed
end
certain
after
the weight equal
recovered.
i n c positive
lower
weights
weight.
arbitrarily
A more
refined
large.
As lated
a rule,
to algorithm
limit
sense
have
a n y s u c h a r c may
(to a factor
recovery
i s what
involving
o f t h e form
everywhere.
structure.
we
Multiply-
mean b y r e c o v e r i n g a
the r e s t o r a t i o n
i s seldom
o r graph
structure.
Most
with
graph
often
Concrete weight values structures.
i tmerely
The c o r r e c t i o n
are not relevant
of structure
while
that
with
same
limit
tion
and f o r c h a o t i c
t h e assumptions
r e l a t i o n both
f o r arcs
the operations
( u n d e r some a d d i t i o n a l erations operation
will
of the family. will
not only
fa-
o f algorithms but
made
we
have
that
have
an asymptotic
I n particuo n e and t h e representa-
arcs.
Formally the set o f solutions
under
is in
algorithm
investigating
we
o f t h e system o f i n e q u a l i t i e s
possesses a good d e a l o f t h e p r o p e r t i e s closed
of a family
allow us t o recognize b o t t l e n e c k s
note
with
It only matters whether they are zero or nonzero.
the investigation
also w i l l
reflects
As we p a s s t o
and t h e r e f o r e
that reason we can hope that passage to the limit
cilitate
i s i n no way r e -
execution.
corrected.
structure
of relative
feasible.
of individual operations
a l l weights are automatically correlated
algorithm
lar
transfer
o f values o f t l x ) at i t s
the s p e c i f i c a t i o n o f weight functions
time c h a r a c t e r i s t i c s
For
Certainly,
bound a l m o s t
This
v a l u e s o f w e i g h t s on d i f f e r e n t a r c s
the
of the limit
o f data
t(x) by an appropriate positive constant we can make all corresponding
a
surfaces
the time
to the difference
f o r e v e r y e. T h i s w e i g h t w i l l
a uniform
level
then
of generalized
<s>, ® , a n d o,
is a
schedules.
cone,
hypotheses) t h e zero element w i t h
® and ® , e t c . C o n s i d e r , and c o n v e x i t y .
L e t t' (x)
f o r example, and t "
Ix)
(24.5) I t Is
i s c o n v e x , has respect
closedness
be s o l u t i o n s
under
t o opthe e
of the system (24.5).
Then we find for the function $t(x) = \min(t'(x), t''(x))$

\[ t(x) = \min(t'(x), t''(x)) \ge \min\bigl(t'(x - f_i(x)),\ t''(x - f_i(x))\bigr) = t(x - f_i(x)) \quad \text{if } i \in g_1(x), \]

and further

\[ (\operatorname{grad} t(x), f_i(x)) = (\operatorname{grad} t'(x), f_i(x)) \ge 0 \quad \text{if } t(x) = t'(x) \text{ and } i \in g_2(x),\ g_3(x), \]
\[ (\operatorname{grad} t(x), f_i(x)) = (\operatorname{grad} t''(x), f_i(x)) \ge 0 \quad \text{if } t(x) = t''(x) \text{ and } i \in g_2(x),\ g_3(x). \]

Hence $\min(t'(x), t''(x))$ is also a solution of (24.5). Take $0 \le \lambda \le 1$ and let

\[ t(x) = \lambda t'(x) + (1 - \lambda) t''(x). \]

Then

\[ t(x) = \lambda t'(x) + (1 - \lambda) t''(x) \ge \lambda t'(x - f_i(x)) + (1 - \lambda) t''(x - f_i(x)) = t(x - f_i(x)) \quad \text{if } i \in g_1(x), \]

and finally

\[ (\operatorname{grad} t(x), f_i(x)) = \lambda (\operatorname{grad} t'(x), f_i(x)) + (1 - \lambda)(\operatorname{grad} t''(x), f_i(x)) \ge 0 \quad \text{if } i \in g_2(x),\ g_3(x) \]

for any λ, i.e. the set of solutions of the system of inequalities (24.5) is convex.

Now we describe
weight functions Note t h a t
the
equalities the
function
some p a r t vectors
the
respectively
tions
did
they are
tial
or
in fact
hold
in
the
at
taken
as
were concerned p r i m a r i l y limit
schedules.
They
p r o b l e m we
solve
into
that
the
do
solution
of
inBoth
D,
some s y s t e m only,
of in
d e p e n d i n g on
the
i n t o account i r -
tlx). we
derived
the
limit
rela-
a b o u t w h a t f o r m t h e y can
not
of
limit.
system of
points
taken
relations
are
related
explicitly
requires
a c c o u n t when c h o o s i n g
the
the
the
dimension of
w h o l e d o m a i n D,
rigor
correction
satisfy
choice of
that
of
individual
the
s t r i v e to maintain
the
passage t o
the
o f what d e t e r m i n e d
I f the
noted before
may
can
than
a l w a y s t o be
space-time
of
solutions
less
are
I t t u r n e d out
conditions.
be
i t s gradient
equalities d o m a i n D,
( 2 4 . 5 ) . We
ralized
cone o f
may
Such e q u a l i t i e s
not
s i b l y have.
x
manifestation
takes place during
the
point
and
t{x)
fjlx).
We
at
These
of
particular
actually
dimension of
(24.51
equalities.
a
that
their
the
is
include
Initial
specification
target
many p r o b l e m s
pos-
t o gene-
schedule.
then
It
was
independent of
ini-
conditions. Ue
will
now
regard
space-time
schedule
(24.5). Let
t l x ) be
cessarily ities priate
as
is
to
solution.
process of
the
solution
finding
a
of
system
the
a p i e c e w i s e smooth s o l u t i o n
normalized.
(24,5)
the
The
be
In
question
solved
search
and
of
now
an
of
i s , how
what
is
answer
to
we
generalized
that
the be
strict
inequalities
system, not
system
taken
again
of
or
of
for
resort
ne-
inequal-
i t s appro-
to
plausible
reasoning It
i s not
at
a l l clear
system
(24.5)
is
hurdle
is
group of
will of
investigate
additional
iately
note
group of no
the
are
the
be
that
we
then
looking
an
constructed functional
system
conditions
differential
long arcs
I f we
to
how
can
effective in
the
fairly
general
inequalities.
(24.5) t a k i n g
imposed
solution
on
often
inequalities
in
limit that
functional
case.
Whenever
i n t o account
schedules. our
group of
for
l i n e a r schedules then the
the
H o w e v e r , we
The
I f the
inequalities
the
greatest
precise can
we
form
Immedto
graphs
I s not
functional
for
necessary
consideration
system.
the
method
the have
present.
inequalities
t(x) = t(x - f_j(x)) inequalities every of
become t r i v i a l
can
long arc
short
tional
be
i n the graphs
arcs.
This
inequalities.
that
are
linked
circumstance
i n m i n d we
The
will
Suppose
that
we
sense a h i g h - s p e e d relations
Strictly
form.
The
data
limit
list
of
wish
one
to
speaking,
chief
find
a
limit
We
level
of
the
same
level
links
between
a schedule number p.
schedule.
that
start
with
t(x).
performed
point
If
we
mal
ignore higher order i n the neighborhood
vectors equal
to grad
W j l x ) . Suppose fied
by
vectors path
that grad
terms,
and
that
to
f. ( x ) .
Is situated
the
is in a differen-
in
recovering the
level
that
i t s general
surfaces the
weight
asymptotically
This
times of
amounts
to
the
funcl i e on
the
to
fol-
one
and
set out to f i n d
we
one
weight
will
correction
may
a c t u a l l y search f o r
i n a sense. Take any of which
small
grad t l x )
smooth.
the l e v e l be
surfaces of
regarded
are defined
(tlx),fjlx))
as
> 0. F o r
space-time
planes w i t h
nor-
of
of
smooth f u n c t i o n s
fjlx),
small c a l l paths
speci-
the weighted
I n the neighborhood
f a c e s b e i n g a t d i s t a n c e p f r o m one
by
l i e along straight
Consequently
of
a lo-
Suppose t h e a r c s i n t h e n e i g h b o r h o o d
t h e v e c t o r f u n c t i o n f^x) equal
of
i n the neighborhood
o f x can
t(x).
x have a s y m p t o t i c d i r e c t i o n
con-
the f a s t e s t a l -
solved
additional
and o t h e r u s e d f u n c t i o n s a r e s u f f i c i e n t l y
schedules
Having
simultaneously irrespectively
certain
x
differen-
(24.5)
operations corresponding
the search. Consequently
any
func-
( o r almost any) p o i n t x i n be
correcting
schedule
I s close t o a high-speed
Consider
of
t l x ) that
i t directs
l i e on
zero
A
of
composed
continued.
the system
them. Under these conditions we
during that
cannot
them t o be
algorithm
surface are
high-speed
required
limit
al1
path
satisfying
impossibility
arcs
we
those o f the
assumption:
any
cally
those
Therefore
surfaces
lowing
along
consider
be
points
certain
the group
schedule
o f any
problem
o b s t a c l e i s the
transfer
functional
only.
example, suppose
this
one
exclude
may
i n the set of schedules For
The
i f end
least
to
such examples
inequalities
(24.5).
schedule.
tions.
be
at
a l l o w s us
gorithm execution i n the neighborhood
of
by
(24.5)
assume i n w h a t f o l l o w s t h a t
sists of differential
D.
t(fj(x))£0.
inequalities at the expense of augmenting
tial
tial
inequalities
d i s r e g a r d e d i n the system
x
lines length
between
directed of
by
any
such
adjacent
sur-
another, does not exceed asymptotically
ptii A x )
i |/J(x)|cos(grad
\c\
If
the path
equal
this
i s connected
tion
equals
o f Statement
t h e weighted
path
maximum o v e r for
length
taken
above
only
over
t h a t we s h o u l d we o b t a i n
length
of x
t o minimize
path
o f t h e graph.
i s not greater
maximum
The
than the
the a l g o r i t h m execution that
those
lies
schedule
J that
time
by t h e c h o i c e o f
then
among
This fact
we
again
(24.6)
i s rather
along
a
straight
line
the d i r e c t i o n
by
values o f I
are taken
length
obvious.
grad
I t implies
that
reached
by
one
that
of
t h e s e t Llx)
.I f
surfaces
tlx).
Of
t h e maxiasympto-
the vectors
of the target
limit
problem
Ax) (24.7)
| f I x ) I cos(grad J
from
tlx),
t h e maximum s e t Llx)
f lx)) }
d e f i n e d by t h e p r o -
perties: -
clear
o f t h e path
level
of
i s actually directed
that
a t the minimization of
the choice
of the gradient
u
I t i s also
t o t h e maximal
weighted
arrive
arcs
t h e maximum i s t o be
between adjacent
at point x we have the following
min max grad f ( x ) J e ' ( x )
f o r those
of S2(x).
correspond
f o r t h e maximal
length of a chaotic path
to find tlx)
of x
only
i t seems t h a t
are elements
those
bound
function
leg^lx).
weighted
Thus
1 that
take o n l y
formally
Therefore
o f c h a o t i c a r c s and s i t u a t e d
maximal
tically
is valid
direction.
the neighborhood
course,
The
w i l l asymptotically
t i m e o f a l g o r i t h m execu-
of the c r i t i c a l
i n the neighborhood
reasoning
an upper
consisting
mal
length
tlx). The
in
i t s weighted
10.10 t h e m i n i m a l
E we now h a v e
have a s y m p t o t i c
the
ttxj.f^lx))
a l l i o f ( 2 4 . 6 ) . To m i n i m i z e
a l l small
grad
then
(24.6]
quantity.
By v i r t u e
critical
.
i s p a r t o f t h e u n i o n o f g Ix)
and g
(x);
the mal
values
valid -
of a;
there exists
inequality holds The
at
l e a s t one
formula
(24.7)
(24.5).
dition and
In that
(24.7)
do
the graph
by
the
these
case
not
to
arcs
of
schedule
along
only.
function
we
that
maxi-
the
strict
asymptotically smallest In general
we
can
(24.5)
dimension
together with
of
thin
can
attempt
the
the gradient
along
the con-
uniquely,
certain
directions
I f graph arc lengths
g e n e r a l l y count to
algorithm into
directions.
the
less than the dimension of
to large arcs.
the d i r e c t i o n
proportional
neighborhood increment
scalar
of
p
from
factor.
the
equalize
the
Usually
£
we
equals plgrad one
So,
side of
(24.8)
the T
the
on
the
d o m a i n D,
this
dif-
finding
a
lengths
of
stretching
i s no
D
impediment
in
according
to
into
is
be
will
always l e )
T
schedule
of
the
level
surfaces
sufficient (24.6)
determined
T
function
to and
from
L E >
that
(24.7)
This
at
tlx) we
Al-
i n the
(x).
are
define
the
take
(x).
time of a l g o r i t h m execution
account
t ( x ) can
the to
a
conclude
inequality
£
inequality
gives
space-time
for
It
i s determined e v e r y E we
For
increment
another.
locally
i t . This
a
the
(x)|
min max grad t ( x ) JeL(x)
use
to
f i x ) as
l e l
taking
f(x)|
gradient of
biguity
to
tlx)
of grad
i t s length.
correction
length of grad
Igrad
If
x
equals
distance
time
the
the d i r e c t i o n
We
have t o f i n d
lowing f o r the weight
the
be
relatively
then
the
certain
Suppose t h a t
that
are
research.
( 2 4 . 7 ) . Now a
only
a r c s may
magnitude
l l x ) such
schedule.
the e q u a l i t i e s
t o be
when m a p p i n g
infinity
our
and
(x))^0 f o r a l l ieX(x).
limit
determine
turns out
degrees
generalized basic
leUx)
for a l l
vector grad
t(x),f
that a s y m p t o t i c a l l y correspond fer
another
implies that
to forming
the cone d e f i n e d by
cone
one
i n costgrad
arcs c o n t r i b u t e of
equal
values;
• |fj(x)|cos(grad
(24.8)
high-speed
i s accounted (to a
of algorithm execution
to determine
schedule then f o r by
scalar factor)
the but
(24.8)
t(x),fj(x))
the absolute value there i s a certain
fact the
that upper
i n t h e n e i g h b o r h o o d o f x.
the
of am-
right-hand
bound f o r
In general
we
the do
not
know w h e t h e r
this
bound
i s sharp
p r e s c r i p t i o n c a n be g i v e n as of
the gradient
mation the
ule
graph
p l a y an
we
have
of
(24.8).
tually ient
The
guarantee
exist
first
influenced x.
This
sched-
that by
the
whether
condition
I t i s by
introduced c o n d i t i o n of a graph
to satisfy
this
t o be
component o f g r a d over
that
minimizes we
a
scalar
there
exists vector.
and
grad
t l x ) by
use
is
con-
local-
the
that
the
t l x ) whose
of
grad-
function
(24.9)
sufficient
first
vectors
ac¬
that
partial
the i t h v a r i a b l e .
class
right-hand
t l x ) = 0.
and
Sup-
does n o t
f o r such
that
for a l l I ,j
t h e i t h component o f g r a d
to
(24.7).
here
function
I t i s known
sufficient
of
equal
the
and
that
i t i s necessary
be
(24.8)
tlx)
derivative
should
both
tlx)
vector
the found
words,
minimized
Note
cri-
right-
the l i m i t
schedule.
of
that
length of the
i s , the b e t t e r
is strongly
infor-
state
between the
grad
that
to
partial
variable
the weighted
can
notat ion grad
a
i t i s necessary
other
general
Additional
i n a number o f s i t u a t i o n s .
rot
In
no
c a n be r e p l a c e d .
found
i s equal
this,
tlx).
the d i f f e r e n c e
high-speed
(24.7)
I s somewhat d i f f i c u l t
side
to
H o w e v e r , we
i n the neighborhood
the previously
pose
Due
schedule
(24.8)
locally
important role
unsplittable
not.
problem.
the less
sides of
the
are connected
that
It
this
approximation of
paths
dition
and
the left-hand
of
high-speed
solve
(24.7) approximates
t l x ) describes
qual1ty
to
to
path of the graph
h a n d and
ly
locally
more a c c u r a t e l y
tical
to
of
i s required
or
to the determination of the absolute value
that
t l x ) by
derivative
Therefore satisfy
of
(24.7) both
the
the j t h the
Jth
i s to
(24.5)
be and
(24.9). Let to
(24.9) h o l d good.
In that
case the function
t(x) is determined
a constant term by the integral
of the derivative
in the tangent direction along any ral
we
can
only
continuously said in
count
linking a fixed on
Therefore
b u t p i e c e w i s e . We
z e r o . Suppose a l s o
p o i n t and
t h e components
differentlable.
integrals
0 be
curve
that
of
grad
t l x ) may
demand t h a t f o r any
t h e p o i n t x. t l x ) t o be be
piecewise
r e p r e s e n t e d by
the minimal
two p o i n t s
I n gene-
in D
the
value of t l x ) there exists
a
curve all
linking
them such
constant
terms,
as
that
tlx)
well
as
i s continuous along that the
schedule
will
tlx),
c u r v e . Then
be
determined
uniquely. Suppose t h a t tional
t h e system
inequalities.
continuous
in
g r a p h c a n be
the
Let
(24.5) does n o t i n c l u d e t h e group o f
vector
functions
neighborhood
of
x.
r e g a r d e d as a s u b g r a p h
In case the algorithm graph has
be regular
in the neighborhood
of
know how
If
of
i f we
weights
that
The
assume t h a t
t ^ t x ) be
the
algorithm
i n the neighbor-
no c h a o t i c a r c s t h e n i t w i l l only difference
I s t h a t we
situated.
However, t h e
the graph
Is unspllttable
do
knowledge i n the
x.
the system
equalities
o f x.
graph nodes a r e a c t u a l l y
i t Is immaterial
neighborhood
and
means
of a regular graph
hood o f x.
not
fj(x)
This
func-.
(24.5) does n o t
then the d i r e c t i o n
d e t e r m i n e d by s e v e r a l
the group
of the gradient
factors.
(grad
include
First,
tlx),
the
of functional I n -
of the
limit
schedule
(x)) =0
f
is
equalities
(24.10)
i can h o l d g o o d f o r some s e t o f should
be
true.
mension o f D be maximal s y s t e m
Finally, equal
to n
mension o f t h e cone where
ities
and
of the form
p. T h i s c o n e c o r r e s p o n d s
t h e n any
the d i r e c t i o n
maximum
ratio
a n o t h e r f o r n-p for
(24.9)
equal
the c o n d i t i o n t o leLlx).
in
solution
r(x),fJ(x))>0,
of
the system
o f g r a d tlx). (24.7)
values i
a l l t h e r e s t o f I.
will
should hold. of solution
Let
(24.7) the d i -
space o f
t o p. S u p p o s e t h a t
(24.7)
Since
s h o u l d h o l d f o r t h e v e c t o r g r a d tlx)
n-p"l,
fine
J . , Second, t h e c o n d i t i o n
the dimension
( 2 4 . 1 0 ) be
(grad
If
indices
the e q u a l i t y
i s considered equals
i t i s closed,
strict
that determines
the
the d i n-
inequal-
(24.7):
ieL(x).
(24.10)
can
be
taken to
de-
I f n-p>l then the minimal value o f the be
reached
f r o m L ( x ) and
I t follows
that
when
these
ratios
equal
are not greater than that the e q u a l i t i e s
should hold
one
value
230 (grad
t(x),fj
[grad
Ix))
t(x),f
l
Ix)
I
)
(24.11) Ix)
1
l J
i
n-p
The a b s o l u t e v a l u e o f t h e g r a d i e n t
el(x).
i s determined a f t e r
i t s direction i s
found. The vector
above
investigation i s valid
functions
nonzero s o l u t i o n o f (24.10), p e n d e n t o f x. The c o n s t a n t function
only
i f (24.9)
t ( x ) so t h a t
(24.11) vector
several
fact,
the informal
bottlenecks
( I f i texists) w i l l
grad
t l x ) corresponds
a b o v e was n o t f o r m a l ,
level
of discussion
the
be
inde-
t o the linear
i t h e l p e d us t o d i s -
render our e x p o s i t i o n
we
A meticulous
imposed i n c o u r s e
rigorous,
In
was t o a s u b s t a n t i a l e x t e n t d e -
t e r m i n e d by o u r unawareness o f these b o t t l e n e c k s .
but this
arrange-
o f the investigation
i s not l i k e l y
t o b r i n g us
a deeper u n d e r s t a n d i n g o f t h e m a t t e r . Let
us l a y a p a r t i c u l a r
gradients sion
o f schedules.
o f D and t h a t
priate that
schedules.
notation build
properties
the graph.
The g r e a t e r
Numerous
i s a measure
stress
o f t h e cone,
on
the difference
show
o f "good"
are influenced
-- r e c a l l
the dimension
the narrower
examples
of declining
g r a p h . Graph p r o p e r t i e s
that
that
between t h e dimeno f appro-
i t i s this
difference
properties
by b o t h a l g o r i t h m
some a l g o r i t h m
r e l a t i o n s have n o t h i n g
p r o b l e m f o r whose s o l u t i o n a l g o r i t h m s limit
properties
mentations.
We
closures
algorithms.
of
o f an
algorithm
p r o p e r t i e s and
notation
i s used t o
I t i s n o t always easy t o u n d e r s t a n d t h e reason f o r the
Note t h a t a l l l i m i t
scribe
o f t h e cone o f
i s our choice
d e c r e a s e o f t h e d i m e n s i o n o f t h e cone o f g r a d i e n t s
may
also
o f t h e process o f schedule determination.
ment o f a l l t h e c o n s t r a i n t s
to
I f
(24.9) i s s a t i s f i e d .
Although the discussion cover
can
i s true.
f ^ t x ) and w e i g h t s L I ^ X ) a r e independent o f x t h e n t h e
limit
t o do w i t h t h e o r i g i n a l
a r e d e s i g n e d . T h e s e r e l a t i o n s de-
of information
can say t h a t
propagation
relations
information
closure.
f o r various
specify
Characteristically, quite
h a v e o n e a n d t h e same
o f schedules.
imple-
the information
different
Note a l s o
algorithms that
we d i d
231 not
discuss
the
structure
specified lated
the precise
mappings
of algorithms
o f t h a t domain. Consequently,
i n terms
i n an a l g o r i t h m i c
t h e y c a n be r e l a t e d
both
gorithms better
i n t h e problem
t o solve ( i n some
tioned
that
this
of
the graph.
the
kind.
graph
criterion
algorithm equalizing
a fairly
gAx)
large c l a s s o f graphs. Consider,
x
either
speed
then
lar
graphs.
form
be
solution
I n this
i s constant
I fthe delays
independent
This
case
with
from
i n D. A l l f u n c -
o n a r c s do n o t d e p e n d o n
the high-
linear
schedules
o f algebraic equations on arcs
schedule
(24.11). According
func-
i n t h e case o f r e g u -
t h e g r a d i e n t s o f high-speed t h e systems
the best
o f t h e systems
o f the high-
o f x. Consequently
are not present
p=0. I f t h e delays
t o choose
the sets
g r a p h s c a n be s e a r c h e d f o r a s l i n e a r
means t h a t
which
the i n -
t o be s i m p l e r f o r c o n c r e t e
graphs.
be
f o rregular
situation
c a n be a p p l i e d t o
t o ( 2 4 . 9 ) - ( 2 4 . 111 t h e g r a d i e n t
determined
(24.11)
criterion
scheme t h a t
I t may p r o v e regular
The ( 2 4 . 1 0 1 - l i k e e q u a l i t i e s
men-
detail.
will
also
i s a
already
investigate
according
speed s c h e d u l e s
always
selection We
such v a r i a b l e s f o r
i n greater
research
d o n o t d e p e n d o n x.
schedule
tions.
f o r example,
to find
A l l that helps
a r e e m p t y a n d t h e s e t gAx)
and £ g ( x )
tions f j ( x )
mathe-
t h e d e s c r i p t i o n s o f some p o r t i o n s
formation structure of the algorithm
graphs.
description.
o f a r c l e n g t h s i s a sample
regular.
a general
f o r variable
graph
we c a n a t t e m p t
becomes p i e c e w i s e
We h a v e o u t l i n e d
of the a l -
I n t h a t case i n f o r m a t i o n c l o s u r e i s
We c a n t r y t o " b a l a n c e " Finally,
language n o t a t i o n
a n a l y s i s a n d i n t h e d e v e l o p m e n t o f new a l -
i t . Another
sense)
asymptotic
of
D and
t o v a r i a b l e s used I n o r i g i n a l
m a t i c a l d e s c r i p t i o n o f t h e problem. helpful
t h e domain
o f v a r i o u s c o o r d i n a t e s . The c o o r d i n a t e s may be r e -
t o v a r i a b l e s used
gorithm. Often
into
I n f o r m a t i o n c l o s u r e s c a n be
a r e n o t known
can
of the
then the
may n o t be r e l a t e d
tothe
t o S t a t e m e n t 2 3 . 7 i t c a n be
based on e.g. t h e m a x i m i z a t i o n o f t h e q u a n t i t y
dllqll£'
25. Data Streams

When discussing various issues concerning data transfers we already dealt with cuts in algorithm graphs. Both the positions of cuts and the number of arcs in them mattered. The latter characteristic proved to be of exceptional importance when exploring memory traffic.

Consider once again the limit situation. Let a smooth manifold be given in the domain $D$. For every $\varepsilon$ there exists some set of graph arcs that intersect that manifold. Clearly, the number of arcs in such a set may grow infinitely as $\varepsilon$ decreases. However, we probably can estimate the density distribution of the number of arcs that intersect the manifold in the neighborhoods of some of its points. We will be interested only in manifolds defined by level surfaces of some scalar function $t(x)$. This function is not necessarily related to schedules.

Suppose that a piecewise smooth function $n(x)$ is defined in $D$. Denote by $N_\varepsilon(x, v)$ the number of algorithm graph nodes that are inside the semiopen parallelepiped of volume $v$ in the neighborhood of $x$. Assume that for all $\varepsilon$ the relation
\[ N_\varepsilon(x, v) = |\varepsilon|^{-n}\, v\, n(x) \tag{25.1} \]
holds to higher order small terms for almost all points and parallelepipeds.

Let a scalar function $t(x)$ be defined in the domain $D$. Take any point $x \in D$ such that (25.1) holds in its neighborhood and the gradient $\operatorname{grad} t(x)$ is continuous. The manifold defined by a level surface of $t(x)$ containing $x$ is given in the neighborhood of $x$ by a hyperplane with normal vector $\operatorname{grad} t(x)$. Take a parallelogram of area $\sigma$ on that hyperplane. We will try to estimate the number of arcs of the algorithm graph that intersect this parallelogram. It depends only on $\varepsilon$ and grows to infinity as $\varepsilon$ decreases.

It is once again not clear how the arcs are to be counted in the general case. As before, the prime impediment is due to long arcs. Of course, everything depends on the asymptotic behavior of arcs. We will take these arcs into account, if it becomes necessary, utilizing their exact form. For now we will assume that the algorithm graph has no long
arcs. Consider only
one i n d e x
sentation grad
those arcs J from
o f the form
i n the neighborhood
g ^ x i . A l l these arcs |c|
t l x ) i s traversed
by
fAx). these
of x have
that
are defined
an a s y m p t o t i c
The h y p e r p l a n e w i t h t h e n o r m a l arcs
i n only
one d i r e c t i o n
by
repre¬ vector i n the
233 neighborhood o f x.
The
origins
Id
from
the hyperplane.
that
Intersect
tically
our
allow
tU),t[
l f j ( x ) | |cos(grad
Hence t h e t o t a l
not f a r t h e r
than
(x)) |
number o f a r c s o f t h e s e l e c t e d s e t area
the
hyperplane
asympto-
(25.2)
corresponding This we
sum
corresponding
over
to
a l l Jeg^tx).
to minimal
result have
this
l
values
changes
to consider
but
of
area
of a
t h e sum
on
a l l 1
Note
the
from
that
will
slightly
be
1
(x)) |.
g2(x)
as
125.2)
we
must
E decreases,
dominant
i f we
allow
of functions
i s an u p p e r bound o f
parallelogram
t (x). f
/ y r a U H f j t x )I Icosfgrad
f o r arcs
functions
Now
arcs are
equals
Id
Again
these
parallelogram of
a
To
of
i n that
for
(25,2)
over
rather
than
the terms
sum.
chaotic
t h e number o f a r c s t h a t
hyperplane,
sum
the
arcs.
a l l Jeg^x). intersect
the
our
number i t -
self. Thus terested
the
number o f
arcs
i n i s estimated
N
V
=
\ei ajWft*3
i n the
neighborhood
of x
that
we
are i n -
by
£
t(x), f
I f j (x) I |cos(grad
l
(x)) 1 .
(25.3)
ieW(x)
The
values
lowing
of
- M(x) lid
i s part of
the values
there
(grad If
no
taken from
t h e m a x i m a l s e t W(x)
d e f i n e d by
the
fol-
o f a}
t h e u n i o n o f gix) equal
s
and
g3<x);
f o r a l l letf(x)
and
are
the minimal
va-
values; -
cos
1 are
properties:
at
fIx),fj(x))*0
t ( x ) i s an
significant
where a
Is
least holds
arbitrary
one
such
that
the
inequality
f o r some a d m i s s i b l e v e c t o r g r a d t ( x ) . function
simplification.
simplification
Jertlx)
then the formula
However,
i s possible. Let
there t ( x ) be
(25.3)
is
an
a
limit
admits
important schedule.
of
case Ac-
234 cording
t o (24.5)
leglx),
g^lx).
grad
the inequalities
This
means
that
Now
Given an a l g o r i t h m graph, well
defined.
t h e number o f a r c s r e l a t e d
(25.4)
the level
to level
as a f u n c t i o n
tlx)
surfaces
i n time
G i v e n an a l g o r i t h m g r a p h , in
t h e domain D t h a t
In
other
surface case one
words, will
surfaces
intersect
then
them will
the level
sity
surface
o f bilateral
Suppose level
that
tions
be s m o o t h
sity,
the greater
level
surface,
moving should which
stream.
lution
We
refer
i n that
may move down
the level
s u r f a c e moves
to the right-hand
sides of
streams
(25.3)
intersection
specifies
125.4)
schedule.
t h e den-
s p e c i f i e s the
neighborhood.
Consider
a l l graph
the faster
that
i s t h e same t h i n g
Seemingly,
(24.7).
arcs
will
the algorithm
maximizing i n this
the higher
conclusion
that
be will
the right-hand
case, o f (25.3)
This
t h e movement o f
o f x. L e t a l l r e f e r e n c e d
t h e number o f a l g o r i t h m g r a p h a r c s
o f t h e problem
level
i s the limit
the formula
i n t h e neighborhood
then
nodes.
The
D.
intersection.
£(x) i s the l i m i t
and
fields
I f tlx)
a s t h e d e n s i t y o f data
and
surfaces.
I n t h e general
i.e. the surface
a t t h e same t i m e .
will
regard
between
i n t h e domain
o f i t s movement.
be b i l a t e r a l ,
the faster
surface, follow
i n course
intersection
surface
L e t us
tlx).
d e f i n e some v e c t o r
a t p o i n t x . The f o r m u l a
density of uniiateral
Its
are defined
stream
schedule i ( x )
f^lx).
of the f u n c t i o n
t h e r e a r e no p o i n t s i n D a t w h i c h
some d a t a
(25.3) and (25.4) w i t h by
of the limit
f ( x ) and £
t(x)
algorithm
s p e c i f y i n g t h e movement o f t h e s e
streams
vector i n the
but f o r grad
f o r a given
show t h e i n f o r m a t i o n a l c o n n e c t i o n s
data
the intersection
against
t h e normal
all
follows:
that
the functions f j ( x )
s t r e a m and up a n o t h e r
schedule up
as
and v e c t o r s
implies
depends o n l y on t h e a n g l e between g r a d Consider
with
for
Ix)
a l l quantities
Therefore
hold
Ix)JaO
i n o n e a n d t h e same d i r e c t i o n
( 2 5 . 3 ) c a n be w r i t t e n
I e | n 0-7I
are
t(x),f
the hyperplane
t l x ) i s t r a v e r s e d by a l l a r c s
n e i g h b o r h o o d o f x.
(grad
t h e den-
i n t e r s e c t the
traversed be
side
func-
by t h e
executed. I t of
[25.4) o r ,
i s t h e same a s t h e s o -
seems t o be f u r t h e r c o r -
235 roborated
by
the f a c t
deed i d e n t i c a l The
that
the s o l u t i o n s
f o r a number o f t y p i c a l
actual
situation
is different.
example,
a l w a y s t h e c a s e i f n o t a l l n u m b e r s o.^ e q u a l
right-hand should
be
tf(x)
side
to
be
t o be
chosen
speaking, Suppose
are
identical.
enumerated f r o m
1 t o n. that
further
and
Let
of
the
t h e s e t Llx)
(25.4). one
another.
direction
of
longest f ; ( x ) , then
of
from
This w i l l
the shortest
i f the
grad and
the d i r e c t i o n
be
Suppose
( 2 5 . 3 ) shows t h a t
the
minimized
tIx)
i f the of
fj(x).
grad Gene-
contradictory.
a
the
(jj(x)=l
constraints
then
Mix)
form
from
formula
close to that
tlx) and
x
Suppose
these h a r d
i s t o be
t o be
sets of
The
maximized
t h e s e demands a r e
the
independent
s e t Mix)
close to that
i n 124.7)
s h o u l d be
tlx) rally
Is
chosen
maximum r a t i o
are
to the
For
almost
and
identical
problems are i n -
graphs.
may
Llx)
be
two
(24.7)
that
not
t o these
are
identical,
b a s i s . Assume inequality
f o r a l l 1.
the s o l u t i o n s
the
that
(£ f ^ . / ^ J i O We
will
t o t h e two
vectorsf j ( x )
these
vectors
are
h o l d f o r a l l m.
show
that
problems are
even
with
i n general
different. The (25.4)
condition
1JJ f
i s m a x i m i z e d by
implies
J P
the f o l l o w i n g
that
the
choice of grad
right-hand side
of
llx):
n
tlx)
grad
- T
(25.5)
1=1 For
the r i g h t - h a n d side of
( 2 5 . 4 ) we
have n
n
(grad
£ /,)*!e
tlx),
rad
t ( x )
=
I £ f,|.
Any
other
s i g n by
choice
less
case
i f grad
that
the
streams
or
of
equal
tlx) will
sign
solution
to this
i f the
than
solution
p o s s i b l e when t h e s o l u t i o n
cause
i n (25.6).
t l x ) i s chosen
intersection
same o n l y
grad
t o be
system the of of
(25.6)
1= 1
1= 1 the
replacement
In particular,
the solution cannot
choice (24.11) (24.11)
yield
(25.5).
of
this
of will
satisfies is collinear
be
the
( 2 4 . 1 1 ) . T h i s means
greater density The
equality
density (25.6). to £ f
will This
of be is
data the only
, i.e. i f the
equalities
\[ \Bigl(f_1, \sum_{j=1}^{n} f_j\Bigr) = \dots = \Bigl(f_n, \sum_{j=1}^{n} f_j\Bigr) \]
hold. Obviously, such equalities hold not at all often. In particular they hold if the vectors $f_j$ are orthonormal.

Investigating memory traffic prompts us to search for surfaces that minimize the density of data streams intersection. The formula (25.4) implies that if the density of data streams intersection is to be minimized then the vector $\operatorname{grad} t(x)$ should be chosen to form the largest possible angle with $\sum f_j$. Suppose the assumptions we made above hold. Then the target $\operatorname{grad} t(x)$ should be orthogonal to as many of $f_j$ as possible. For every $f_j$ consider the perpendicular with origin at the end of $f_j$ dropped onto the span of all the rest $f_i$. A vector opposite to such a perpendicular will be a valid $\operatorname{grad} t(x)$. To minimize the density of data streams intersection we have to choose $\operatorname{grad} t(x)$ to be the antiperpendicular that forms the largest angle with $\sum f_j$ or, equivalently, has the smallest length.

In the general case, to maximize (minimize) the density of data streams intersection we have to solve the following problem: find a scalar function $t(x)$ that maximizes (minimizes) the dot product
\[ \Bigl(\operatorname{grad} t(x), \sum_{j \in M(x)} f_j(x)\Bigr) \tag{25.7} \]
under constraints
\[ (\operatorname{grad} t(x), f_j(x)) \ge 0, \quad j \in M(x), \qquad |\operatorname{grad} t(x)| = 1, \qquad \operatorname{rot} \operatorname{grad} t(x) = 0. \tag{25.8} \]
In particular, if all $f_j(x)$ do not depend on $x$, the problem (25.7), (25.8) may be solved by appropriate modifications of the simplex method for linear programming problems.

One of the most important problems concerning data streams is the
choice of the parallelepiped that minimizes the ratio of the number of arcs intersecting its surface to the number of graph nodes it encompasses.

Suppose the cone of gradients of schedules has dimension $n$ in the neighborhood of $x$. Let the vectors $s_1(x), \dots, s_n(x)$ form the edges of a nonsingular parallelepiped of volume $v$. Denote by $h_i(x)$ the perpendicular to the facet that contains all the rest $s_j(x)$, ending at the end point of the vector $s_i(x)$. Let the area of the facet be equal to $\sigma_i(x)$. We have
\[ |h_1(x)|\,\sigma_1(x) = \dots = |h_n(x)|\,\sigma_n(x) = v. \]
Denote by $S(x)$ the matrix whose columns are the vectors $s_i(x)$. Consider the matrix $R(x) = (S^{-1}(x))^{T}$ and its columns $r_1(x), \dots, r_n(x)$. By definition of the matrix $R(x)$ we have $(s_i(x), r_j(x)) = \delta_{ij}$, where $\delta_{ij}$ is Kronecker's symbol. It follows that the vectors $h_i(x)$ and $r_i(x)$ have the same directions for all $i$. If $h_i(x) = a_i(x)\, r_i(x)$, where $a_i(x)$ is a positive scalar, then $|h_i(x)|$ can be regarded as the length of the projection of the vector $s_i(x)$ onto $r_i(x)$. Therefore
\[ |h_i(x)| = \cos(s_i(x), r_i(x))\,|s_i(x)| = \frac{(s_i(x), r_i(x))}{|r_i(x)|} = \frac{1}{|r_i(x)|}. \tag{25.9} \]
From $|h_i(x)| = a_i(x)\,|r_i(x)|$ we obtain $a_i(x) = |r_i(x)|^{-2}$, that is,
\[ h_i(x) = |r_i(x)|^{-2}\, r_i(x). \tag{25.10} \]
Taking into account (25.4), (25.9), (25.10) and the notation introduced, we conclude that the number of arcs intersecting the surface of the parallelepiped equals
\[ N = 2|\varepsilon|^{a-n}\, v\, n(x) \sum_{i=1}^{n} \frac{\bigl(h_i(x), \sum_{j \in M(x)} f_j(x)\bigr)}{|h_i(x)|^{2}} = 2|\varepsilon|^{a-n}\, v\, n(x) \sum_{i=1}^{n} \Bigl(r_i(x), \sum_{j \in M(x)} f_j(x)\Bigr) = 2|\varepsilon|^{a-n}\, v\, n(x) \Bigl(\sum_{i=1}^{n} r_i(x), \sum_{j \in M(x)} f_j(x)\Bigr). \]
The number of nodes within the parallelepiped is given by the formula (25.1). For the determinants we have
\[ |\det S(x)| = |\det R(x)|^{-1} = v. \]
It follows that the target parallelepiped can be chosen by solving the following problem: for given $v > 0$ find vectors $r_1(x), \dots, r_n(x)$ such that
\[ \min_{r_i(x)} \Bigl(\sum_{i=1}^{n} r_i(x), \sum_{j \in M(x)} f_j(x)\Bigr) \tag{25.11} \]
with constraints
\[ (r_i(x), f_j(x)) \ge 0, \qquad |\det R(x)|^{-1} = v, \qquad \operatorname{rot}(\eta_i(x)\, r_i(x)) = 0, \qquad j \in M(x), \quad i = 1, 2, \dots, n, \tag{25.12} \]
where $\eta_i(x)$ are arbitrary scalar functions. Suppose we have found the vectors $r_i(x)$ and the functions $\eta_i(x)$. Then the schedule $t_i(x)$ that defines the $i$th facet of the parallelepiped will satisfy
\[ \operatorname{grad} t_i(x) = \eta_i(x)\, r_i(x). \]
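The relations between the edges $s_i$, the dual vectors $r_i$ and the facet sizes can be checked numerically. The following is a minimal two-dimensional sketch, not the author's procedure: the function names are hypothetical, the "parallelepiped" is a parallelogram, and the common factor $|\varepsilon|^{a-n} n(x)$ is left out, so only the geometric part of the arc count is compared.

```python
import math


def dual_vectors(s1, s2):
    """Return (r1, r2, v) for the parallelogram with edges s1, s2.

    r1, r2 are the columns of R = (S^{-1})^T, where S has columns s1, s2,
    and v = |det S| is the area of the parallelogram.
    """
    det = s1[0] * s2[1] - s1[1] * s2[0]
    if det == 0:
        raise ValueError("edges must be linearly independent")
    r1 = (s2[1] / det, -s2[0] / det)
    r2 = (-s1[1] / det, s1[0] / det)
    return r1, r2, abs(det)


def dot(a, b):
    """Dot product of two planar vectors."""
    return a[0] * b[0] + a[1] * b[1]


def norm(a):
    """Euclidean length of a planar vector."""
    return math.hypot(a[0], a[1])


def crossing_sum_by_facets(s1, s2, f_sum):
    """2 * sum_i sigma_i * (unit normal of facet i, f_sum)."""
    r1, r2, v = dual_vectors(s1, s2)
    total = 0.0
    for r in (r1, r2):
        sigma = v * norm(r)                       # from |h_i| sigma_i = v and |h_i| = 1/|r_i|
        total += sigma * dot(r, f_sum) / norm(r)  # unit normal is r_i / |r_i|
    return 2 * total


def crossing_sum_closed_form(s1, s2, f_sum):
    """The same quantity written as 2 * v * (r_1 + r_2, f_sum)."""
    r1, r2, v = dual_vectors(s1, s2)
    return 2 * v * dot((r1[0] + r2[0], r1[1] + r2[1]), f_sum)
```

For any pair of independent edges the two functions should agree up to rounding, which is the step that turns the facet-by-facet count into the objective of (25.11).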
239 The
schedules
requirement Note with
are not
of their that
respect
nonnegative invariant
nonnegative
also
of
a
can
respect
the
problem
constant of be
factor.
chosen
in
second
whose p r o d u c t
(25.12) should
The
of
of
invariant arbitrary
the
1.
exists
of
by
i f the
f o r some v ,
the
so
is
i t will
f o r various
that
x
constraints same v e c t o r s
Therefore
scalar
that
the
In this
possess the f o l l o w i n g
v
differ
vectors f j ( x )
r ^ l x ) and
automatically.
constants
is
r . ( x ) by
these
solutions
vectors
independent
( 2 5 . 12)
equals
(25.12)
v.
the
(25.12) holds
positive
t o the absence o f
vectors
Suppose f u r t h e r
t o be
of
arbitrary
the
The
BAx).
both
In
(25.11),
of
(25.11),
Then
x.
constraints
to the m u l t i p l i c a t i o n
constraints
and
the
f o r a l l nonnegative
independent T/Ax)
of
multiplication
scalar functions
exist
o n l y by
first
the
p a r t l y due
normalized.
scalar functions
with
solution
being
the
to
uniquely determined
case
third the
of
the
solution
property: for a l l
8^. whose p r o d u c t e q u a l s
are
functions
i*j
1 the i n -
equalities
hold. This
is possible if and only if the dot products $\bigl(r_i, \sum_j f_j\bigr)$ are the same for all $i$. Thus, if the vectors $f_j(x)$ are independent of $x$, the solution of the problem (25.11), (25.12) guarantees that the projections of the vectors $r_i$ onto $\sum f_j$ have the same lengths and directions. Besides that, for a fixed absolute value of the determinant of the matrix composed of $r_i$, the lengths of the projections should be minimal.

We conclude by stressing once again the following circumstance. The solution of the problem (25.11), (25.12) allows us to partition the domain into parallelepipeds that are informationally weakly connected. Using these parallelepipeds we can build various directed macrographs and implement algorithms according to the structure of these macrographs.
Individual macronodes may be composed of individual parallelepipeds, of groups of parallelepipeds, of "tubes" formed by chains of parallelepipeds, etc. Of course, the macronodes may be implemented in any suitable manner: serially, or in parallel, or using pipelining. The required memory size was investigated in Chapter 3, the execution time is determined in Chapters 2 and 5.

Once again, the dimension of the cone of schedules is of great importance. As the difference between the dimension of the domain $D$ and that of the cone grows, our chances to partition $D$ into informationally weakly connected parallelepipeds diminish.

The discussed problems do not cover the entire scope of questions arising in the study of information propagation processes in algorithm implementations. Other problems will be investigated whenever necessary, and also upon the acquisition of new information about algorithm graph properties.
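The role of equal projections in the problem (25.11), (25.12) can be illustrated with a small numeric experiment. This is only a sketch under stated assumptions: the arcs are taken to be the planar vectors (1,1), (1,0), (1,-1), the candidate vectors $r_1$, $r_2$ are picked by hand so that their projections onto $\sum f_j$ coincide, and the helper names are hypothetical. Multiplying $r_1$ by $\theta$ and dividing $r_2$ by $\theta$ keeps $|\det R|$ and the sign constraints intact, so comparing objective values shows what unequal projections cost.

```python
def objective(r_vectors, f_sum):
    """Value of (sum_i r_i, f_sum), the quantity minimized in (25.11)."""
    total = (sum(r[0] for r in r_vectors), sum(r[1] for r in r_vectors))
    return total[0] * f_sum[0] + total[1] * f_sum[1]


def rescale(r1, r2, theta):
    """Multiply r1 by theta and r2 by 1/theta; det R is unchanged."""
    return (r1[0] * theta, r1[1] * theta), (r2[0] / theta, r2[1] / theta)


def det2(r1, r2):
    """Determinant of the 2x2 matrix with columns r1, r2."""
    return r1[0] * r2[1] - r1[1] * r2[0]


v = 1.0                       # prescribed volume, chosen for illustration
c = (2 * v) ** -0.5
r1, r2 = (c, c), (c, -c)      # hand-picked pair with equal projections onto f_sum
f_sum = (3.0, 0.0)            # (1,1) + (1,0) + (1,-1)
```

With these values, `objective([r1, r2], f_sum)` can be compared against `objective(list(rescale(r1, r2, theta)), f_sum)` for any positive `theta` other than 1; the rescaled pair has the same determinant but a larger objective.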
26. Examples

EXAMPLE 26.1. Consider a boundary value problem for one-dimensional heat transfer equation. Suppose that $u(y, z)$ is to be found where $0 \le z \le 1$. Build a uniform grid with step $h$ along $z$ and step $\tau$ along $y$. Suppose an explicit scheme is selected due to some reasons. Suppose the algorithm is implemented by the formula
\[ \frac{u_j^{i+1} - u_j^{i}}{\tau} = \frac{u_{j+1}^{i} - 2u_j^{i} + u_{j-1}^{i}}{h^2}. \]
To build the algorithm graph, introduce a rectangular coordinate system with axes $i$ and $j$. The variables $y$ and $z$ are related to $i$ and $j$ through the equalities $y = \tau i$, $z = hj$. Put a graph node inside every node of the integer grid and let it correspond to a scalar operation of the form
\[ a\Bigl(1 - \frac{2\tau}{h^2}\Bigr) + \frac{\tau}{h^2}(b + c), \]
which is performed for various values of $a$, $b$, and $c$. The algorithm graph is regular with respect to the coordinates $i$, $j$. It is shown in Fig. 26.1 for the case $h = 1/8$, $\tau = 1/6$. The nodes that are in the same time levels are situated along the dotted lines.

Fig. 26.1
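The node operation and the dependence pattern of the explicit scheme can be written out directly. The sketch below is illustrative only: the function names are hypothetical, and copying the two end values unchanged (fixed boundary values) is an assumption made here, since the boundary conditions are not spelled out in the text.

```python
def node_op(a, b, c, tau, h):
    """Scalar operation of one graph node: a*(1 - 2*tau/h**2) + (tau/h**2)*(b + c)."""
    r = tau / h ** 2
    return a * (1 - 2 * r) + r * (b + c)


def explicit_step(level, tau, h):
    """Compute time level i+1 from time level i.

    The two end values are copied unchanged; treating them as fixed
    boundary values is an assumption of this sketch.
    """
    nxt = list(level)
    for j in range(1, len(level) - 1):
        nxt[j] = node_op(level[j], level[j + 1], level[j - 1], tau, h)
    return nxt


def incoming_arcs(i, j):
    """Arcs entering node (i, j) as (source node, arc vector) pairs."""
    sources = [(i - 1, j - 1), (i - 1, j), (i - 1, j + 1)]
    return [(src, (i - src[0], j - src[1])) for src in sources]
```

Every interior node reads three values of the previous time level, so in the $i$, $j$ coordinates the graph has exactly three arc vectors that do not depend on the node, which is what makes it regular.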
As $\tau$ and $h$ decrease, the algorithm graph expands infinitely if we use the $i$, $j$ coordinates. Naturally, the distance between neighboring nodes does not decrease. To pass to the limit, let us explore the algorithm graph with respect to the new coordinates
\[ i' = \varepsilon i, \qquad j' = \varepsilon j, \]
where $\varepsilon$ is a small parameter. If $\tau$ and $h$ are related to each other linearly then the domain $D$ that contains all algorithm graphs will be a finite rectangle. For example, we can take $\varepsilon = \max(\tau, h)$. If the relation between $\tau$ and $h$ is nonlinear then one of the dimensions of $D$ may be infinite. For example, if $\tau = O(h^2)$ then the domain will stretch to infinity along the $i'$ axis.

It is easy to see that all arcs are short and are asymptotically represented with respect to the new variables as follows:
\[ \varepsilon f_1(i', j') = \varepsilon(1, 1), \qquad \varepsilon f_2(i', j') = \varepsilon(1, 0), \qquad \varepsilon f_3(i', j') = \varepsilon(1, -1). \tag{26.1} \]
Let $t(i', j')$ be the limit schedule. Denote by $g_1(i', j')$ and $g_2(i', j')$ the components of $\operatorname{grad} t(i', j')$. In accordance with (24.5) we have
\[ g_1(i', j') + g_2(i', j') \ge 0, \qquad g_1(i', j') \ge 0, \qquad g_1(i', j') - g_2(i', j') \ge 0. \tag{26.2} \]
The coefficients by $g_1(i', j')$ and $g_2(i', j')$ in these inequalities do not depend on $i'$ and $j'$. Therefore we can try to choose $g_1$ and $g_2$ to be independent of $i'$ and $j'$ as well. If we succeed in this then we will obtain the following representation for the limit linear schedules:
\[ t(i', j') = g_1 i' + g_2 j' + g_3, \tag{26.3} \]
where $g_3$ is an arbitrary constant. Using (26.2) we find that $g_1$, $g_2$ may be any numbers satisfying the conditions
\[ g_1 \ge 0, \qquad |g_2| \le g_1. \tag{26.4} \]
The formulas (26.3) and (26.4) define the entire set of linear generalized schedules. This set is described in $y$, $z$ variables as
\[ t(y, z) = \alpha y + \beta z + \gamma, \tag{26.5} \]
where $\alpha > 0$, $\alpha \ge |\beta| h / \tau$ or, equivalently, $-\tau/h \le \beta/\alpha \le \tau/h$.

One of the implications of this is as follows. If $\tau = O(h^2)$ then there remains only one independent generalized schedule as $h \to 0$. This schedule corresponds to $\beta = 0$, $\alpha > 0$, and its level curves are shown by dotted lines in Fig. 26.1. This means that certain care is to be exercised whenever we use coordinate systems where arcs have asymptotical representations of different orders of magnitude, or else we can suffer a loss of information about some of the potentially very interesting schedules.

EXAMPLE 26.2. Suppose we want to find the high-speed schedule for the graph of the previous example. Let us consider the graph in $i'$, $j'$ coordinates. Assume that all weights equal 1. According to (24.7), (26.1) we have to find
\[ \min_{|\operatorname{grad} t(x)| = 1} \max \bigl\{ (\operatorname{grad} t(x), f_1)^{-1},\ (\operatorname{grad} t(x), f_2)^{-1},\ (\operatorname{grad} t(x), f_3)^{-1} \bigr\}. \]
We have
\[ (\operatorname{grad} t(x), f_1) = g_1(i', j') + g_2(i', j'), \qquad (\operatorname{grad} t(x), f_2) = g_1(i', j'), \qquad (\operatorname{grad} t(x), f_3) = g_1(i', j') - g_2(i', j'). \]
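The min-max problem of this example is small enough to be cross-checked by brute force. The sketch below is not the method of the text: it simply scans unit vectors $g = (\cos\theta, \sin\theta)$ on a uniform grid of angles, keeps those with $(g, f_j) > 0$ for all arcs, and picks the one with the smallest value of $\max_j (g, f_j)^{-1}$. The function name and the grid size are assumptions, and all weights are taken equal to 1.

```python
import math


def high_speed_direction(arcs, steps=3600):
    """Scan unit vectors g and return (g, value) minimizing max_j 1/(g, f_j).

    Only directions with (g, f_j) > 0 for every arc are considered;
    all weights are taken equal to 1.
    """
    best_g, best_val = None, math.inf
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        g = (math.cos(theta), math.sin(theta))
        dots = [g[0] * f[0] + g[1] * f[1] for f in arcs]
        if min(dots) <= 0:
            continue  # direction violates the sign constraints
        val = max(1 / d for d in dots)
        if val < best_val:
            best_g, best_val = g, val
    return best_g, best_val


# Arc vectors taken from (26.1).
arcs_26_1 = [(1, 1), (1, 0), (1, -1)]
```

Calling `high_speed_direction(arcs_26_1)` selects the direction $(1, 0)$ with value 1, since for a unit vector the smallest of the three dot products equals $g_1 - |g_2|$ and cannot exceed 1.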
244 S i n c e a l l t h e s e e x p r e s s i o n s a r e n o n n e g a t i v e we
max
He
obtain
{*,
',
•}
( g ( i - . j ' )
minimum i s r e a c h e d
min max t(x)1=1
on
this
lue
of
v e c t o r max the
that
Igrad
in
be
t(x)|=l
In
from
satisfy
the
condition
relevant
The
factor,
i n our
case.
Igrad
schedule
so
the
con-
We
conclude
have t h e f o r m
= i'+g
Accordingly, f o r the
original
coordinates
z) =
y+y.
I t i s easy t o v e r i f y
that
this
will
certainly
be
the
schedule.
this
equalities.
(26.6)
have t h e f o r m
F i g . 26.1
high-speed
to
of a normalizing
i s a c t u a l l y not schedule w i l l
£ (y,
Using
j ' 1=0-
*, * ) d o e s n o t c h a n g e i t s v a l u e i n D.
c o o r d i n a t e s i ' , j' .
i twill
g2(i' ,
chosen
to the accuracy
the high-speed
the
vector
) = 1,
i(i',j")
y,2
, f
2
" ) - 1. A c c o r d i n g t o ( 2 4 . 8 ) t h e a b s o l u t e va-
should
{*,
be d e t e r m i n e d
dition
{*,
gradient
t ( x ) 1 = 1 . B u t max can
\g W
(*, •, • } = I ,
the unique
g(i',J' For
-
further
Igrsd The
=
have
example D has Consequently
dimension
2
the d i r e c t i o n
the e q u a t i o n s (24.10) w i t h
and of
there grad
j"
(24.10)-style be
determined
n = 2, p = 0. T h r e e s y s t e m s o u t o f
e q u a t i o n s h o u l d be c o n s i d e r e d :
g/j".
a r e no
t l x ) can
)+g2ii'
• J" > =
'" J
1
one
245
To a n o r m a l i z i n g f a c t o r
their
solutions are
g3(i',J')=0, )=i.
e ^ i ' . f
g 3 (i',j"i=o. We
again
there
obtained
the solution
(26.6),
i s e v e n n o n e e d t o make a c h o i c e
as
i tshould
among v a l i d
be. I n t h i s
case
s o l u t i o n s as a l lthe
solutions are identical. EXAMPLE 2 6 . 3 . I n t h e c o n t e x t parallelepiped its
surface
problem are
(25.11).
constant
scalar
minimizing
t o t h e number (25.12)
r
that solve
the ratio
o f t h e number o f a r c s
o f nodes
i t encompasses.
i n the coordinates
i n D, s o we w i l l
f u n c t i o n s tAx).
o f t h e same e x a m p l e , l e t u s f i n d t h e
i',
We
intersecting
must
solve the
j ' . The v e c t o r s f j ( x )
t r y t o obtain constant
v e c t o r s r Ax)
and
We h a v e t o f i n d
1
= (r ,r ) , r = (r ,r ) 1 1 ' 12 ' 2 21 22
t h e problem
r.
Taking
into
account
( 2 6 . 4 ) we c a n r e w r i t e
the equivalence
of
the constraints 0
'n' ' R
2,
> 0
'
r
r
the inequalities
(26.2)
and
(25.12) i n t h e form t
r
1
ii ' ia ' zr' 22 R
L
(26.7)
246 The
third
relation
admissible
i f we
ample
r
K
that
preserve sum for and
implies
take
r ,
that
lesser
and
values
of r
and I "
signs.
S u p p o s e f o r ex-
w i t h opposite
will
be
> |r By i n c r e a s i n g | r I a n d d e c r e a s i n g r • we can 11 12 t h e t h i r d r e l a t i o n i n ( 2 6 , 7 ) , However, t h i s a l s o d e c r e a s e s the
r
+ r , Hence t h e e q u a l i t i e s r • I r |, r = | r I should hold 11 21 11 12 21 22 t h e s o l u t i o n o f t h e p r o b l e m . The s o l u t i o n , t o a n i n t e r c h a n g e o f r , will 2
be
r
, ._
,-1/2 , , (2v) },
.-1/3
= ((2v)
(26.8) r2
-
((2y)-
A possible partitioning of D into the
coordinates
theory
i, j
predicts that
identical.
This
i s depicted
l / 3
.-(2v)-
J.
the corresponding by
dotted
the projections o f
i s certainly
, / 2
lines
parallelepipedsfor i n F i g . 26.2.
and r ^ o n t o
t h e case f o r t h e v e c t o r s
Fig.
26.2
£
(26.8).
should
The be
247 To data our
find
schedules
streams
corresponding t o minimal
i n t e r s e c t i o n we h a v e
example
this
means
gli'.J')
that
o r maximal
t o solve problems
density o f
(2S.7),
i s t o be m i n i m i z e d
C25.8). I n
o r maximized
under t h e c o n s t r a i n t s
gli',
e^i'.J'^O.
j")Mg2(i'.
8\w.J')+g\u'.y) = 1. Obviously,
the vector
g(i'.j') g(i'.j').
maximizes
= 1. gAi'.f)
= 0
and each o f t h e v e c t o r s
(26.9) *,
= 2"
, / a
.
«a(i'.J')
=-2-
, / 2
.
minimizes i t . In tion
this
section. place. not
case,
the direction
coincides with We
that
have mentioned
The c o l l i n e a r i t y
obligatory
that
defined
independent schedules. weights the
Introduce
The
c o i n c i d e n c e does
implementa-
streams
inter-
not always
(26.8) and (26.9)
the principal
by two v e c t o r s
subgraph
=
f j ( 0 , 1), /2=(Z, 0).
i n t h e domain,
the notation
grad
we
take
i s i n general
that
gradient
t h e graph of
a
of a planar Since
will t
schedule
into solves
{(grad
t(x), f
(grad '
t(x),f
arcs are linear
that a l l
I s dictated disjoint according
(24.7) t h e problem
min max |,r-d r(x)[=l
regular
study
Assume
subgraph
c o u l d n o t be s p i l t
high-speed
graph
only
f(x)=(g g ).
1 . The c h o i c e o f t h e p r i n c i p a l
requirement
graphs.
algorithm of data
either.
of the points
equal
this
of the vectors
EXAMPLE 2 6 . 4 . C o n s i d e r graph
of the fastest
o f t h e maximal d e n s i t y
) )
_ 1
by
subto
248 under c o n s t r a i n t s (grad
For
our
example, t h i s
tUL/M^O.
min
u n d e r c o n s t r a i n t s g^ a 0,
max
the problem find
the
intersection.
1
{g^
g^ z 0.
C l e a r l y , the
ttxJ
to a normalizing plane
/Z)
2
grad
Now
ilxl.f^aO.
r e q u i r e s t o compute
2
solves
Igrad
that
According to
(1,2)
(26.10)
factor.
maximizes
(25.7),
max
-
vector
the
( 2 5 . 8 ) we
(grad
density have t o
of
data
streams
find
( ( x j . f ^ f ^
under c o n s t r a i n t s
(grad
For
our
11*).,*, im,
example,
straints
this
(grad
requires =
g SO,
g -°>
8
2
*£
1-
t i x i . f ^ Q .
to
Igrad
compute
Clearly,
the
max
t(x)|=l.
(Zg^g^i
problem
under
I s solved
con-
by
the
vector grad
The
vectors
corroborates is
in
streams
( 2 6 . 1 0 ) and
general
-
(26.11) are
t h a t the d i r e c t i o n of different
than
(1,0).
not
(26,11)
collinear.
Thus,
the f a s t e s t a l g o r i t h m
that
of
the
maximal
this
example
implementation
density
of
data
intersection.
EXAMPLE 2 6 . 5 . porates
(|x)
long arcs.
study the
limit
Consider
the
case
A sample g r a p h o f
situation,
introduce
where
the
that kind t h e new
algorithm
graph
incor-
i s shown i n F i g . 6.2. coordinates
To
where n
Is the order
graphs w i l l
be
square.
arcs
new
The
of
t h e m a t r i x . The
contained have
i n the
the
= eCO.l).
1
Here, c -
(n-1)" ,
schedules w i l l
i'*f.
The
The can
have the
form
vector f t i ' . j * )
try
to
l(x) = (gt.g2) (t'-J')*0,
(grad
We where
vectors
the
unit the
linear arcs
that
arcs
grad
i'fcf'
a l l linear
on
limit
z
schedules w i t h
are
schedule.
present.
Let
grad
imply
that
holds
i n the domain c o n t a i n i n g
generalized
schedules
z
w o u l d be
negative
do
not
intersects
Consequently,
the
level
time
a
min Igrad tlx) u n d e r c o n s t r a i n t s g i O , g ^0.
tlx),
f
)
that
i f short
arcs
1
| =1
I n o t h e r words,
i t should
of
see
(24.7)
maxtgrad
case
linear
surface
the g r a d i e n t of a high-speed schedule
i n accordance w i t h
I t is
i n the
( 2 4 . 6 ) , we
algorithm execution
long
lost.
t r y to find
formula
de-
exist.
f o r high-speed schedules l e t us
are
Obviously,
g ~0.
the
determine
we
( 2 6 . 13)
and
arc
not
so
(26.13).
Nonetheless, any
13)
satisfying
long
do
Clearly,
space,
of
( 2 6 . 12)
w h e r e g^O,
t{x)={g^,g )
d e r i v e the formula
the set
(26.
the p o i n t of
s c h e d u l e o n l y once. A n a l y z i n g
present.
guarantee,
that
( 2 6 . 12)
(24.5) t h a t describe
depend
+g-
inequality
(i'-J'.O).
i g n o r e d , o r e l s e t h e c o n d i t i o n g^O
d i d not long
high-speed
are
The
z
easy t o see
long
of
representations i n
t ( x ) , f ( i ' , j ' ) ) * 0 .
schedules 2
I t follows
by
does not
1inear
=
t j l ' . j ' )
t i i ' , j ' )=g^i'+g f
g -0.
graph.
scribed
find and
a r c s c a n n o t be
a
is a half
asymptotic
relations
t l i ' , J ' ) * t U ' , J ' ) ,
the
situation
which algorithm
coordinates:
c f ( i ' . j ' )
g
limit
following
domain D w i t h i n
guarantee
should
min
w This problem
i s s o l v e d by
corresponding
schedule
Generally that
from
the
the
paths.
Let
graph
as
high-speed
only
that
their
12.2.
for
For
end
0
has
The
level
by
in Fig.
approach
i n Fig.
that
we
will
be
found schedule
coordinate
through
as
A
the
a
is
that
the
result
of
i s , however, e f f e c t i v e
case where
sample
t/L,
which
can
cfAf
w h e r e c - L *. C h a o t i c
graph
of
the
enough.
t o i t may
a l g o r i t h m graph
that To
kind
is
shown
study the l i m i t
i'=
i/L,
incori n Fig.
situation,
j' = j / L .
a l g o r i t h m graphs
will
be
, i'
t ' , i ' ,f
N/L,
1.
asymptotic
i n the set of arcs possessing
asymptotic
i t has
the
, j ' )=e( 1,0.0).
arcs are
e d g e s N/L,
a r c has
i n the
an
assume t h a t
= a. O n l y one
contained
form
gz(x)= ( l )
representations
E f ( f,
other
two.
l o n g a r c s so g^ix)
r e p r e s e n t a t i o n . We
ef(
arcs
schedule
I t follows lost
i s a rectangular parallelepiped with
no
6.2.
adding
connected
was
the
investigating
whose h i g h - s p e e d
graph
of
d e l e t e a l l long
the vector g = ( l , l ) .
original
to
6.2
can
lines
coordinates
within
situation
graph
graph
points
the
Consider
t'=
limit
the
d e f i n i t e n e s s assume t h a t M^N^L.
domain
,
the a l g o r i t h m execution time corresponding
arcs.
i n t r o d u c e t h e new
g
2
another
i n a c o o r d i n a t e graph
i n c r e a s e by a f a c t o r o f
chaotic
is
i t . After
i s specified
EXAMPLE 2 6 . 6 .
The
to
t r a n s f o r m a t i o n . The
porates
The
there
augment
schedule
I t guarantees
min
2
a r e d e p i c t e d by d o t t e d l i n e s
(1,0)
This results
w e l l k n o w n and
graph
us
form
-l -
g
the vector g = ( 0 , l ) .
speaking,
problem.
arcs of
max
2
2
) = e ( l . 1,0),
i ' , y )=E(i,o, I),
E f 3 ( C , i ' ,j ' ) = c ( l , - l , c f A t ' . i ' , y
g3U)=<2,3,4.5).
)=c(i,
0),
o.-i),
According to (24.5) the gradients of limit schedules satisfy the system of inequalities

$(\operatorname{grad} t(x), f_i) \ge 0, \quad i = 1, 2, 3, 4, 5.$

But $f_1 = -f_2$, $f_3 = -f_4$, so the equalities

$(\operatorname{grad} t(x), f_2) = 0, \quad (\operatorname{grad} t(x), f_4) = 0$

must always hold. Since the vectors $f_2$ and $f_4$ are linearly independent, the admissible value of the gradient of a limit schedule is unique up to a normalizing factor. Specifically, $\operatorname{grad} t(x) = (1, 0, 0)$. This fact is already known to us from §12.
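The step from the inequalities to the equalities can be written out explicitly. The following is only a sketch: it writes $g$ for $\operatorname{grad} t(x)$ and assumes that the opposite pairs of arc vectors are $f_1 = -f_2$ and $f_3 = -f_4$.

```latex
\begin{aligned}
(g, f_1) \ge 0,\ (g, f_2) \ge 0,\ f_1 = -f_2
  &\;\Longrightarrow\; (g, f_2) = -(g, f_1) \le 0
   \;\Longrightarrow\; (g, f_2) = 0, \\
(g, f_3) \ge 0,\ (g, f_4) \ge 0,\ f_3 = -f_4
  &\;\Longrightarrow\; (g, f_4) = -(g, f_3) \le 0
   \;\Longrightarrow\; (g, f_4) = 0.
\end{aligned}
```

In three-dimensional space two independent homogeneous equations leave a one-dimensional set of solutions, which is why the gradient is determined up to a normalizing factor.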
Chapter 6
Algorithm Graph and Schedules Building

Our investigations have shown that the algorithm graph contains plenty of information about both algorithm structure and fine aspects of algorithm implementation on computers. However, little was said on the process of graph building itself. In the examples considered so far the graph was guessed rather than built. Graph building processes are the subject matter of this chapter.

While discussing those processes we will from the very beginning have in mind their computer implementation. We will pay special attention to the time issue. The principal requirement is that the time of graph building should not depend on the number of algorithm operations or, equivalently, on the total number of graph nodes.

In practice algorithms are recorded in various forms. This circumstance does not, however, add grave impediments to graph building processes. Recall that neither data types nor kinds of operations are explicitly reflected in graphs. Analyzing, for example, a number of notations most often used to record computational algorithms, we see that if we ignore differences in data and operations, their essential features are the same. Specifically, every notation specifies the computation of some indexed variables and employs a number of ways to organize loops and branches. The greatest difference concerns the possibility or impossibility of re-evaluation of variables. Some notations (as programs in ALGOL, FORTRAN, etc.) allow such re-evaluation, others (mathematical formulas, programs in single-assignment languages) do not. Actually, this difference is not a principal one. It tells not on the graph but on the complexity of the graph building process.

We will consider the algorithm graph building process based on a FORTRAN-like notation. Language details will be provided as the need for them arises. We will exclude those language features that prevent graph building. We will just ignore those language features that have no bearing on the process. Of course, we will hardly be able to take into account all particularities of other languages, but we will allow for the most important ones.

All existing program descriptions include some formal parameters. These parameters should be regarded as unknown quantities during graph building. Therefore the dependence on parameters is inherited by graphs and all graph-related objects. This significantly complicates the research, and in particular finding proper schedules. At the end of this chapter we discuss a mathematical apparatus that enables us to overcome these difficulties.
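To make the role of re-evaluation concrete, here is a minimal sketch of the most naive way to build an algorithm graph: replay a recorded list of operations and link every read of a memory cell to the operation that wrote it last. The trace format and the names `build_graph` and `summation_trace` are hypothetical and serve only for illustration. Note that this sketch visits every operation, so its running time grows with the number of graph nodes, which is exactly what the principal requirement above rules out.

```python
def build_graph(trace):
    """Return arcs (producer, consumer) of the algorithm graph.

    trace is a list of (operation, written_cell, read_cells) records,
    listed in execution order (a hypothetical format for this sketch).
    """
    last_writer = {}          # memory cell -> operation that wrote it last
    arcs = []
    for op, target, sources in trace:
        for cell in sources:
            producer = last_writer.get(cell)
            if producer is not None:      # input data have no producer
                arcs.append((producer, op))
        last_writer[target] = op          # re-evaluation overwrites the cell
    return arcs


def summation_trace(n):
    """Trace of the loop 's = s + a(i), i = 1..n', where s is re-evaluated."""
    return [(f"op{i}", "s", ("s", f"a{i}")) for i in range(1, n + 1)]


arcs = build_graph(summation_trace(4))
print(arcs)
```

Although the variable `s` is a single memory cell, the graph is a chain of four distinct nodes: re-evaluation affects how hard the arcs are to find, not the graph itself.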
27. Some Statistics

If a program uses the whole repertoire of FORTRAN constructs, then it is almost evident that the algorithm graph cannot be built. In any case, it cannot be built before program execution. One of the preventing factors is the dependence of indexes of used variables on input data or intermediate results. In that case it is impossible to know before executing the program which variable values are arguments of algorithm operations.
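The effect of data-dependent indexes can be illustrated with a small sketch. It assumes a hypothetical loop `x[i] = x[index_of[i]] + 1` and records which iteration supplies the argument of which; the function name `dependence_arcs` is introduced here for illustration only.

```python
def dependence_arcs(index_of):
    """Arcs between iterations of the loop
           for i in range(n): x[i] = x[index_of[i]] + 1
    An arc (j, i) means that iteration i reads the value written by iteration j.
    """
    written_by = {}                  # array element -> iteration that wrote it
    arcs = set()
    for i, src in enumerate(index_of):
        if src in written_by:        # the argument was produced inside the loop
            arcs.add((written_by[src], i))
        written_by[i] = i            # iteration i writes x[i]
    return arcs


# The program text is the same; only the input data differ.
chain = dependence_arcs([0, 0, 1, 2])   # each iteration reads the previous result
free = dependence_arcs([3, 3, 3, 3])    # every iteration reads an initial value
print(chain, free)
```

With the first index array the iterations form a chain and must run one after another; with the second they are independent. The graph therefore cannot be known before the values of `index_of` are.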
Consequently, we have to settle for some compromise if we are to develop graph building procedures. As we wish to be able to process as many programs as possible, we should strive to expand the set of admissible language constructs, but the necessity to develop effective graph building procedures makes us narrow this set. To find the compromise, observe that not all language constructs are extensively used in practice.
In the literature on program parallelization it is (explicitly or implicitly) assumed that the algorithm graph or something that approximates it is built. To make our first step in choosing the set of admissible language constructs, let us look at what classes of programs are studied in the literature. It is easy to see that in the majority of cases the program fragments that are analyzed are tightly nested loops. As a rule, index expressions of variables are linear functions of loop indexes. There are also other restrictions. But even for this narrow class of programs there are neither vital methods nor theorems describing the quality of proposed parallelization processes.

This signals us that either graph building or the determination of its parallel structure must be a difficult task. However, no matter how difficult these tasks may be, limiting the discussion to tightly nested loops is subject to criticism. Our experience tells us that in the solution of large problems such constructs do not determine program execution time in the majority of cases. What constructs are then the most important?
To find the answer to this one and other similar questions, a special software tool was created for amassing statistical data for FORTRAN programs [93]. It was aimed at the determination of frequency of use of various FORTRAN constructs. A number of real-life programs in linear algebra, plasma physics, minimization, and electrodynamics were gathered. These programs were written by various people from various institutions, so deviations due to individual programming styles could not have a significant impact on the resulting statistics. Over 12,000 statements of FORTRAN were analyzed. The analysis was carried out for program units, i.e. subroutines and functions. We will hereafter refer to them as programs, for simplicity.

The statistical data on the frequency of use of various types of DO loops was the first to be estimated. The fraction of tightly nested loops turned out to be very small, only a few percent. For nesting level 3 or more, no tightly nested loops at all were encountered. About 40% of all constructs are loops whose bodies include a large number of other loops in a row. The maximal sequence contains 19 such loops (5 on the average).

Eventually, program execution time is determined by maximally nested loop constructs. We define program power to be the maximum of its loop nesting levels. It was found out that even general constructs where there is only one loop at each nesting level do not determine program execution time. Among programs of power 3 and more, the fraction of programs whose maximally nested constructs are of that kind is not more than 9%.

These, as well as other data from [93], testify convincingly that we can limit our consideration neither to tightly nested loops nor to their generalization mentioned in the previous paragraph. Statistics show that loops are actually embedded in one another in a ramified fashion. Because of that, in what follows we will study arbitrarily nested loops.

We also gathered statistical data on program loops specified via the GOTO statement. It was revealed that such loops are used in a third of all programs, and in almost every program consisting of more than 100 statements. In the overwhelming majority of programs, loops formed by GOTO statements direct the computation in about the same way as DO loops, and only about 1% of programs incorporate overlapping loops. This means that if we allow a modification of the DO statement, then GOTO statements may be excluded from loop constructs. We will make use of this observation later.

We also looked at the types of statements in innermost loop bodies. It was found out that the bodies of innermost loops of nesting level 2 or more consist entirely of assignment statements in 15% of cases. The fraction of such loop bodies grows with the nesting level. In particular, for nesting level 3 or more it exceeds 90%. It was also found out that such loops contain almost no calls to subroutines and user-defined functions. It is noteworthy that the fraction of inner loops that contain a conditional exit is very small and rapidly decreases as the nesting level grows.

The paper [93] includes a lot of other interesting statistical data. Here we are not going to discuss them at great length. It is only the inferences drawn from [93] that matter for us now:
- while developing graph building procedures we cannot limit the discussion to any fixed kind of loop constructs;
- program fragments that determine algorithm execution time usually have a relatively simple structure.

In the following discussion we will be guided by these inferences. In accordance with the first inference, we will not impose heavy restrictions on usable language constructs at the beginning of research. Any restricting assumptions shall be made only after it becomes clear that the previously made assumptions are insufficient to carry the research on. The second inference inspires optimism as regards the solution of the posed problem. For practical purposes, it is not necessary to build the graph of the whole algorithm. It is sufficient to build graphs of its critical fragments.
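The two loop characteristics used in this section can be stated precisely on a toy model. The sketch below assumes a hypothetical representation in which a loop is a dictionary with a `"body"` list whose items are either statement strings or nested loops. It takes "tightly nested" to mean that every loop except the innermost one contains exactly one item, which is itself a loop; this reading is an assumption, since the text does not define the term formally.

```python
def power(body):
    """Program power: the maximum loop nesting level found in a body."""
    return max((1 + power(item["body"]) for item in body
                if isinstance(item, dict)), default=0)


def is_tightly_nested(loop):
    """True if each loop but the innermost holds a single nested loop and nothing else."""
    body = loop["body"]
    inner = [item for item in body if isinstance(item, dict)]
    if not inner:
        return True                      # innermost loop: statements only
    return len(body) == 1 and is_tightly_nested(inner[0])


# A matrix-product-like triple loop: tightly nested, power 3.
tight = {"body": [{"body": [{"body": ["c(i,j) = c(i,j) + a(i,k)*b(k,j)"]}]}]}

# A ramified construct: statements and several loops in a row inside one body.
ramified = {"body": ["s = 0",
                     {"body": ["s = s + 1"]},
                     {"body": [{"body": ["t = t + s"]}]}]}

print(power([tight]), is_tightly_nested(tight))
print(power([ramified]), is_tightly_nested(ramified))
```

Both constructs have power 3, yet only the first one belongs to the class of tightly nested loops that the literature usually studies.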
28. Order Relation

Building algorithm graphs is a fairly difficult task. The most difficult part of it is fixing the execution order of operations and thereby the re-evaluation of contents of memory cells. The general problem is simplified, however, by our orientation at studying programs in algorithmic languages like FORTRAN, ALGOL, etc. This is owing to the fact that in all languages of this kind the execution order of operations is fixed in a uniform way, called lexicographic ordering. Therefore we will consider certain properties of this order before proceeding with the description of algorithm graph building.

Let a binary relation be given on the set $A$. We will denote it by one of the symbols $\le$ or $\preceq$ and refer to it as an order relation if it has the following properties:
- reflexivity: $a \le a$ for any $a \in A$;
- transitivity: $a \le b$ and $b \le c$ imply $a \le c$ for all $a, b, c \in A$;
- antisymmetry: $a \le b$ and $b \le a$ imply $a = b$ for all $a, b \in A$.

If $\le$ is an order relation, then $<$ is the strict order: $a < b$ if $a \le b$ and $a \ne b$. An order is called linear if for all $a, b \in A$ either $a \le b$ or $b \le a$ holds. If there are incomparable elements in $A$ then the order is called partial. An incomparable pair of elements $a, b$ is denoted by $a \parallel b$.

A sample order relation is the "greater or equal" relation on the real number axis. Obviously, this order is linear. Partial order is more natural for algorithm graphs. Given any directed acyclic graph, we set $a < b$ for two nodes $a, b$ if there is a path linking $a$ and $b$. Clearly, this relation determines a partial order. Any two nodes inside the same layer of a parallel form will be incomparable. We will call this order informational.

Consider a finite ordered system of indexes $a_1, a_2, \ldots$. We will say one index to be junior (senior) to another if it has greater (smaller) number in that system. Let index $a_i$ be an element of the set $A_i$. Assume that there is a linear order on each of these sets.
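The informational order on graph nodes can be sketched in a few lines. The example below assumes the graph is given as a hypothetical adjacency dictionary (node to list of successors) and decides comparability by searching for a path; the helper names are illustrative only.

```python
def reachable(graph, a, b):
    """True if a directed path with at least one arc leads from a to b."""
    stack, seen = list(graph.get(a, ())), set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False


def compare(graph, a, b):
    """Relation between nodes a and b in the informational order."""
    if reachable(graph, a, b):
        return "a < b"
    if reachable(graph, b, a):
        return "b < a"
    return "incomparable"


# Two input operations u1, u2 feed v, which feeds w.
graph = {"u1": ["v"], "u2": ["v"], "v": ["w"]}
print(compare(graph, "u1", "w"))    # a path u1 -> v -> w exists
print(compare(graph, "u1", "u2"))   # same layer of the parallel form
```

Nodes `u1` and `u2` lie in the same layer of a parallel form and are incomparable, which shows that the order is partial rather than linear.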
Suppose that elements of some set $A$ are identified by a group of foremost indexes from the system $a_1, a_2, \ldots$. In general, various elements of $A$ may be identified by varying number of indexes. Introduce a strict binary relation $\prec$ on $A$, setting $a \prec b$ if there exists such a number $i$ that

$a_1 = b_1, \ \ldots, \ a_{i-1} = b_{i-1}, \ a_i < b_i.$

If no such number $i$ exists, we will still suppose $a \prec b$ if all indexes of $a$ match the corresponding senior indexes of $b$ and $b$ is identified by a greater number of indexes. It can be easily verified that the binary relation $\prec$ is a linear order. We will hereafter refer to it as lexicographic order.

Lexicographic order is used in a variety of tasks. Various kinds of $A$'s elements and the methods of introducing linear orders in the sets $A_i$ are determined by the structure of problems. In all our problems $A_i$ are lower bounded sets of real numbers with the usual "less than" relation. We will assume this, unless special reservations are made.

We can readily move from the general case to the case where all elements of $A$ are identified by the same number of indexes, while the lexicographic order is preserved. This transition can be accomplished in many ways. Here is one of them. Since all $A_i$ are lower bounded, there exists a number $\alpha$ strictly less than the lower bounds of all $A_i$. Let us augment the index set of every element, adding a number of junior indexes equaling $\alpha$ to it, so as to make the total number of indexes the same for all $A$'s elements. We will demonstrate that this operation does not change the lexicographic order. Suppose $a \prec b$. If there was such a number $i$ that $a_1 = b_1, \ldots, a_{i-1} = b_{i-1}, a_i < b_i$, then the relation $a \prec b$ will still hold, no matter how the indexes junior to the $i$th one are changed or added. Now suppose that all indexes of $a$ match the corresponding senior indexes of $b$. Then the first new index of $a$ equals $\alpha$ and is strictly less than the corresponding index of $b$ due to our choice of $\alpha$, so the relation $a \prec b$ is preserved. Thus, adding junior indexes equaling $\alpha$ certainly does not change the lexicographic order, no matter how many times such adding is performed.
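The padding argument can be checked on a small example. The sketch below represents elements as tuples of indexes of varying length and assumes that all indexes are non-negative, so that `alpha = -1` lies strictly below every lower bound; the names `lex_less` and `pad` are illustrative.

```python
def lex_less(a, b):
    """Strict lexicographic order on index tuples of possibly different length."""
    for x, y in zip(a, b):
        if x != y:
            return x < y          # the first differing index decides
    return len(a) < len(b)        # a matches the senior indexes of b


def pad(a, length, alpha):
    """Append junior indexes equal to alpha until the tuple has the given length."""
    return tuple(a) + (alpha,) * (length - len(a))


elements = [(1,), (1, 0), (1, 2, 5), (0, 7), (2,)]
alpha = -1                         # assumed to be below all lower bounds
width = max(len(e) for e in elements)

order_preserved = all(
    lex_less(a, b) == lex_less(pad(a, width, alpha), pad(b, width, alpha))
    for a in elements for b in elements
)
print(order_preserved)
```

After padding, all elements carry the same number of indexes and the plain component-by-component comparison gives the same answers as before.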
Thus, we can assume without losing generality that all elements of $A$ are identified by one and the same number of indexes. We will make this assumption in the discussion below.

If we are only interested in the lexicographic order on $A$, we can regard the elements of $A$ as the corresponding sets of indexes themselves. Consequently, every element of $A$ can be identified with a point of a real linear arithmetic space, the indexes of the element being the point coordinates. We have agreed that the lexicographic order is introduced over the entire space.

Coordinates may be introduced in linear space in various ways. In what follows we will actually deal with lexicographic orders that are introduced (directly or indirectly) basing on point coordinates. Therefore it is important to know what coordinate systems preserve the lexicographic order.

STATEMENT 28.1. Suppose that two coordinate systems are introduced in a real linear arithmetic space. For the lexicographic orders in these two coordinate systems to be identical it is necessary and sufficient that the transition matrix be lower triangular with positive diagonal entries.

Denote by $S$ the matrix of transition from old to new coordinates. Let $x, y$ be any two distinct vectors of the first coordinate system. They will be specified by the vectors $Sx$ and $Sy$ in the new coordinate system. Note that the conditions $x \prec y$ and $Sx \prec Sy$ are equivalent to the conditions $(x - y) \prec 0$ and $S(x - y) \prec 0$. The latter conditions are in their turn equivalent to negativity of the first nonzero components of $x - y$ and $S(x - y)$. Consider the set $Z$ of vectors whose first nonzero component is negative. We have to show that it is necessary and sufficient for the set $Z$ to be invariant with respect to multiplication by $S$ that $S$ be lower triangular with positive diagonal entries.

The sufficiency is obvious. Suppose that $z \in Z$ implies $Sz \in Z$. Suppose that for some $j > 1$ the entry $s_{1j}$ is nonzero. Take a subset of vectors whose first component is $-1$, the $j$th component is arbitrary, and all other components are zero. This subset is part of $Z$. However, as the vectors of the subset are multiplied by $S$, the first component of their images cannot remain nonpositive since $s_{1j}$ is nonzero. Consequently, $Z$ is not invariant with respect to multiplication by $S$. As $S$ is nonsingular, the diagonal entry of the first row of $S$ is positive, and all entries to the right of the diagonal are zero.

Suppose that for $i \ge 1$ the first $i$ rows of $S$ possess the desired property. Suppose that for some $j > i+1$ the $(i+1, j)$ entry of $S$ is nonzero. Consider a subset of vectors whose first $i$ components are zero, the $(i+1)$th component equals $-1$, the $j$th component is arbitrary, and the rest are zero. As the vectors of the subset are multiplied by $S$, the first $i$ components of images will always be zero. We have only to repeat our argument concerning the first row mutatis mutandis to show that the $(i+1)$th row of $S$ has the required form. The matrix $S^{-1}$ will have similar structure.

Thus, if we wish to find the coordinate system possessing some additional properties while preserving the lexicographic order, we do not have a wide choice. Specifically, we can search only through the coordinate systems that are related to each other by lower triangular transition matrices with positive diagonal entries.

Among various problems concerning the lexicographic order we often have to consider the problem of finding the lexicographically maximal point in a closed domain. We will refer to such a point as the lexicographical maximum point. Let the domain $D$ wherein the lexicographical maximum point is searched be a general polyhedron defined by the system of linear inequalities

$(a_1, x) \le \pi_1, \ \ldots, \ (a_m, x) \le \pi_m,$    (28.1)

where $a_1, \ldots, a_m$ are $n$-vectors and $m \ge n$. Denote the maximum point by $x_0$. The point $x_0$ must be one of the vertices of the polyhedron. Indeed, if $x_0$ were not a vertex then a segment containing $x_0$ as its inner point would exist inside the polyhedron. Such a segment would certainly contain points lexicographically greater than $x_0$. It follows that to find the lexicographic maximum of a polyhedron only its vertices are in general to be searched.
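Statement 28.1 can be tried out numerically. The sketch below compares the lexicographic order of integer points before and after multiplication by a matrix; the two sample matrices and the helper names are chosen here for illustration and do not come from the text.

```python
from itertools import product


def matvec(S, x):
    """Product of the matrix S (list of rows) and the vector x."""
    return tuple(sum(s * v for s, v in zip(row, x)) for row in S)


def preserves_lex_order(S, points):
    """True if x < y lexicographically exactly when Sx < Sy, for all sample points."""
    return all((x < y) == (matvec(S, x) < matvec(S, y))
               for x in points for y in points)


points = list(product(range(-2, 3), repeat=3))    # 125 integer points

lower = [[2, 0, 0], [-3, 1, 0], [5, 4, 3]]        # lower triangular, positive diagonal
upper = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]         # nonzero entry right of the diagonal

print(preserves_lex_order(lower, points))
print(preserves_lex_order(upper, points))
```

For the second matrix the pair $x = (-1, 2, 0)$, $y = (0, 0, 0)$ already breaks the order: $x \prec y$, but $Sx = (1, 2, 0)$ is lexicographically greater than $Sy$. A finite sample can of course only refute invariance, not prove it; the proof above covers the general case.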
The vertices are solutions to the systems of linear algebraic equations

$(a_{i_k}, x) = \pi_{i_k}, \quad k = 1, \ldots, n,$    (28.2)

with a nonsingular matrix. They also should satisfy the system of inequalities (28.1).

Seemingly, we can find the lexicographic maximum of a polyhedron by direct inspection of all polyhedron vertices. To do this, we only need to go through all systems of linear algebraic equations of the form (28.2), solve those with nonsingular matrices, select the solutions belonging to the polyhedron by checking the inequalities (28.1), and pick the lexicographically maximal solution among them. The total amount of work will not be large provided the number of inequalities in the system (28.1) is small. In our problems, this number is never large. However, the problem is that in our case the right-hand sides of (28.1) depend on parameters. Because of that, systems like (28.2) can be solved explicitly but their solutions cannot be directly compared. Checking the inequalities and comparing two solutions spawns a large number of new systems of inequalities involving the parameters. Therefore we should strive to find a criterion for picking the maximal solution that is based on certain properties of the matrices of the systems (28.2).

We can assume without losing generality that $x_0$ is a solution to the following system of linear algebraic relations:

$(a_1, x_0) = \pi_1, \ \ldots, \ (a_p, x_0) = \pi_p,$    (28.3)

where $p \ge n$. Suppose that all remaining inequalities hold strictly as $x_0$ is substituted into the system (28.1). Subtracting the corresponding equalities (28.3) from the first $p$ inequalities (28.1) and setting $y = x - x_0$, we obtain the inequalities

$(a_1, y) \le 0, \ \ldots, \ (a_p, y) \le 0.$    (28.4)
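The direct inspection of vertices described above can be sketched as follows, for the case of purely numerical right-hand sides (no parameters). The function names are illustrative; exact rational arithmetic is used so that feasibility checks and comparisons are not disturbed by rounding.

```python
from fractions import Fraction
from itertools import combinations


def solve(A, b):
    """Gauss-Jordan elimination over rationals; None if the matrix is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return tuple(row[n] for row in M)


def lex_max_vertex(a, pi):
    """Lexicographic maximum of the polyhedron (a[i], x) <= pi[i] by vertex inspection."""
    n, best = len(a[0]), None
    for rows in combinations(range(len(a)), n):      # every system of the form (28.2)
        x = solve([a[i] for i in rows], [pi[i] for i in rows])
        if x is None:
            continue                                  # singular matrix, no vertex
        feasible = all(sum(c * v for c, v in zip(a[i], x)) <= pi[i]
                       for i in range(len(a)))
        if feasible and (best is None or x > best):   # tuple comparison is lexicographic
            best = x
    return best


# x1 <= 2, x2 <= 3, x1 >= 0, x2 >= 0, x1 + x2 <= 4
a = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1)]
pi = [2, 3, 0, 0, 4]
print(lex_max_vertex(a, pi))
```

For this pentagon the vertices are $(0,0)$, $(2,0)$, $(2,2)$, $(1,3)$, $(0,3)$, and the lexicographic maximum is $(2,2)$. The number of systems grows combinatorially with the number of inequalities, and the comparison `x > best` is exactly the step that stops being straightforward once the right-hand sides depend on parameters.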
For the solution of the system (28.3) to be the lexicographic maximum of the polyhedron (28.1) it is necessary, and if the solution of (28.3) is unique, it is also sufficient, that all solutions of the system of inequalities (28.4) be lexicographically less than zero.

STATEMENT 28.2. Suppose that all solutions of the system of inequalities (28.4) are lexicographically less than zero. Then there exists a subsystem of (28.4) consisting of $n$ inequalities with linearly independent vectors $a_i$ such that all its solutions are also lexicographically less than zero.

Let $S$ be an arbitrary lower triangular matrix with unit diagonal entries. Perform the substitutions

$z = Sy, \quad b_i = (S^{-1})' a_i$    (28.5)

for all $i$. Then the system of inequalities (28.4) will transform into

$(b_1, z) \le 0, \ \ldots, \ (b_p, z) \le 0.$    (28.6)

The systems (28.4) and (28.6) are lexicographically equivalent. In particular, if $y \prec 0$ then $z \prec 0$ and vice versa. This is a consequence of Statement 28.1. Choosing a suitable matrix $S$ and re-enumerating the vectors $a_i$, we will reduce the system (28.4) to the system (28.6) where all $b_i$ will have a special form.

Take a vector $y$ such that all its components are zero but for the last one that equals 1. This vector is lexicographically greater than zero. Substituting this vector into (28.4) must cause the change from "$\le$" to "$>$" in at least one of the inequalities. Hence there is at least one positive number among the last components of the vectors $a_i$. Without losing generality we can assume $i = n$. We can always satisfy this by changing the numeration of $a_i$. Suppose that the matrix $S$ is such that among its nondiagonal entries only the entries of the last row may be nonzero. Obviously, $S$ can be chosen in such a way that all components of $b_n$ but for the last one are zero. The last component of $b_n$ is equal to that of $a_n$ and is therefore positive. Suppose that $a_n$ has the same form as the obtained $b_n$.

Now take a vector $y$ such that all its components are zero but for the next to the last one that equals 1. This vector is also lexicographically greater than zero. Substituting this vector into (28.4) we find that there is at least one positive number among the next to the last components of the vectors $a_i$. Since the next to the last component of $a_n$ is zero, we may also assume that $i = n-1$. This can be achieved by changing the numeration of the vectors while leaving $a_n$ in its place. Suppose that the matrix $S$ is such that among its nondiagonal entries only the entries of the next to the last row may be nonzero. The matrix $S$ can be chosen in such a way that all components of $b_{n-1}$ but for the two last ones become zero. The next to the last component of $b_{n-1}$ will be positive. The vector $a_n$ will not be changed by the multiplication by that matrix $S$.

Carrying the process on, we conclude that by changing the numeration of the vectors $a_i$ and choosing a suitable matrix $S$ we can transform the system (28.4) to the system (28.6) where the vectors $b_1, \ldots, b_n$ form a lower triangular matrix with positive diagonal entries. Obviously, all nonzero solutions of the subsystem of inequalities $(b_i, z) \le 0$, $1 \le i \le n$, are lexicographically less than zero. These vectors $b_1, \ldots, b_n$ determine the proper subsystem of inequalities in (28.4).

In accordance with Statement 28.2 we will henceforward assume that $p = n$ in (28.4). Denote $u = -y$. Obviously, $y \prec 0$ implies $u \succ 0$ and vice versa. Let $e_k$ be the $k$th base vector, i.e. all its components are zero but for the $k$th that equals 1. Denote $a_{n+2k-1} = e_k$, $a_{n+2k} = -e_k$ and consider the system of inequalities
( a n , u ) i 0,
1
' W For
e a c h k=0,1
set
out o f the set of solutions
such the
that
n-1 t h e a d d i t i o n
the f i r s t
solution
that
of
greater [28.7)
o f 2k l a s t
zero.
Therefore
According
k=0,1,...,n-1
component.
This
( e . + ,u)a0 holds
(28.71.
to
inequalities are
f o r each
(k+l)th
n-1 t h e i n e q u a l i t y
inequalities,
are zero.
o f t h e system o f n f i r s t
than
l u t i o n s o f t h e system o f I n e q u a l i t i e s ment 2 3 . 4 we c o n c l u d e
i n e q u a l i t i e s c u t s a sub-
of solutions
has n o n n e g a t i v e
f o r each k=0,I
° -
o f t h e system o f n f i r s t
k components
hypothesis, a l l solutions
lexicographically any
2
(28. 7)
means
f o r a l l so-
I n accordance w i t h
State-
that n+zk
= [
Aj
V ,
(28.8)
1=1 where A j * Let
+ l }
*0
f o r a l l i , k.
A be t h e m a t r i x whose c o l u m n s
a
are the vectors a l
matrix ith
I s nonsingular
column o f A
t i o n o f (28.8)
1
according
t o t h e hypotheses.
. Assume b y d e f i n i t i o n
by t h e m a t r i x A
k+\
1
L
1
that
Denote
'=0. L e f t
. This
.n^. by a ;
the
multiplica-
yields
i
1=1
i
L
i
j
j=0
Ik+i)
w h e r e v\ If
a r e some
scalars.
any s o l u t i o n o f (28.4)
the
r e p r e s e n t a t i o n (28.9)
all
entries
A
1
ding
except
of the f i r s t f o rthe f i r s t
holds
i s lexicographically
less than zero
f o r t h e columns o f A
. In particular,
column o f A ' a r e nonnegative. o n e a r e sums o f l i n e a r
c o l u m n s a n d some v e c t o r w i t h
nonnegative
then
A l l columns o f
combinations
components.
o f prece-
This
implies
that
the f i r s t
pose
that
nonzero
the columns
entry of
i
o f each
1
have
row
the
of A
1
is positive.
representation
show t h a t a n y s o l u t i o n o f ( 2 8 . 4 ) i s l e x i c o g r a p h i c a l l y Suppose t h a t
a
nonzero
vector
y
is a
solution
Now
sup-
We
will
(28.9). less
of
than zero.
(28.4).
Then
we
have
(
<
w h e r e a l l S^O.
V
•
y )
5=(<5 (
5^)
i s not
the
components o f y
equal
V
= a„,
Since the vectors
tor
are l i n e a r l y
i n d e p e n d e n t , t h e vec-
t o z e r o . L e t y=ly^
yn> • Expressing
i n terms o f e n t r i e s o f A
(a
Forming
y )
V
the dot product o f
(
-
l )
,6)
( 2 8 . 9 ) and
= y
.
5
and
and
5 we
find
denoting y =(a*
that
- 1
?
6),
we
have k
for
k=0,1,. .. , n - l .
Obviously, yQ=0
and
n (*+i i=l The
vector y
components
i s nonzero by t h e h y p o t h e s i s . C o n s e q u e n t l y ,
are
component o f y Now sented
we
zero. According i s negative,
will
show t h a t
i n the form
to
128.10),
(28.11)
not a l lof I t s
the
first
nonzero
i . e . y-<0. t h e columns o f any m a t r i x
(28.9) provided
the f i r s t
nonzero
1
A
may
entry
be
repre-
in a l l
its
rows
i s positive.
columns i n w h i c h have
numbers
Consider
i
some n u m b e r s . A l l tors ai a
w! *
The r o w s
components w i t h
,...,a^
containing
other
(Jc+i)
also (-1}
a
k-i
k-i raically all
such
' ,
t
(28.9).
to
t o new f i r s t
i ,1
that
viously,
we
equality
(28.9)
Let
us
can
,
2
s
inequalities
and
will
the f i r s t
we
the
first
nonzero
1
A
1
sum
In
less than W ex} . . . ,a^~''.
'
( I < + 1
such
Ob-
that
the
where
nonzero
entry
matrix
with
multiplication entries
facts
help
matrix
of
a nonsingular
of
the
polyhedron
each
row
of
A
1
it is
m a t r i x A possesses row o f A
1
verified
that
by any upper
non-
positive.
L-property Obviously,
entries
possesses
permutation
of matrix
triangular
the L-property. particular
is
the
i s positive.
diagonal
preserves
for
the
A describing
positive
t o examine
search
holds. maximum
the
we
d e f i n e d by t h e system o f
Statement
lexicographical it
Suppose
the polyhedron
I t c a n be e a s i l y
diagonal these
=0 f o r
t h e second
be a l g e b r a i c a l l y
A" *
e n t r y o f each
triangular
and l e f t
positive
obtain
discussion.
inside
of
say t h a t
L-property.
i> 'a'" '* s, s, k k row e n t r i e s a r e a l g e b -
nonzero
will
«. 1
p r o c e s s on and s e t t i n g
nonnegative
2 8 . 3 . The
the
than
t J c + 1
of
i n a l l of the vectors «•
t h e above
vertex
upper
columns
up
maximum
any
components
less
k
( 2 8 . 1 ) . The f o l l o w i n g
i s that
We
correspon-
k
holds.
STATEMENT
singular
are zero
choose
sum
lexicographical
simple,
t h e new
f o r a l l veccomponent o f
( - l )
are algebraically
A l l c o m p o n e n t s o f t h e sum w i l l
f o r those
(28.1)
that
l e s s t h a n M. C a r r y i n g t h a t
1
cept
row e n t r i e s
t k + 1
have
s k
corresponding
unequal
j
i >s
J
s
nonzero
entries
minimal
t h e components o f t > ' * ' a
k
Choose
these
k
that
the previous
numbers a r e z e r o
S
t o the f i r s t
that
o f m a t r i x rows a r e s i t u a t e d ,
. Denote by H t h e a l g e b r a i c a l l y
'
if
nonzero e n t r i e s < k.
• Choose s u c h
ding
"
t h e c o l u m n a^~*'. S u p p o s e
the f i r s t
matrix
Although
with quite
matrices
f o r the pre-
of picking
out t h e sys-
sence o f L - p r o p e r t y . Note
that
the L-property
tems 1 2 8 . 2 ) may r e s u l t termine take
which
I s t h e proper
the solution
that
based
criterion
i n the selection o n e we m u s t
belongs
o f m o r e t h a n o n e s y s t e m . To d e solve
a l l s e l e c t e d systems and
t o the polyhedron.
I f the right-hand
sides of the inequalities the
(28.1) depend on p a r a m e t e r s then c h e c k i n g f o r
solution
t o belong
t o the polyhedron
inequalities
dependent
on p a r a m e t e r s .
causes
However,
number o f them a s o n l y a f e w o f t h e m a t r i c e s
t h e appearance there
being
o f new
i s but a
small
checked possess t h e
L-property. STATEMENT 2 8 . 4 . exists
such
a matrix
I f
permutation
P
that
A possesses all
the
leading
L-property
minors
of
then 1
(AP)
there
are
pos-
(28.4)
where
itive.
Consider again a
t h e system o f i n e q u a l i t i e s o f t h e form
a r e t h e columns
i
tions
o f (28.4)
of A . Since
A possesses
are lexicographically
m a t r i x o f t h e system
the L-property.
less than
i n course o f o b t a i n i n g
28.2
implies that
t h e system
( 2 8 . 6 ) . Our p r o o f
(S'V-IP =
m a t r i x fl i s l o w e r
matrix
(S ' ) '
(28.12)
implies that
of
(AP)
1
fl.
with
triangular
o f Statement
positive
with
unit
(28.12)
diagonal diagonal
e n t r i e s , the entries.
t h e minor o f o r d e r k o f (AP) ' e q u a l s
k diagonal
entries
1
o f B . Hence
Now
t h e product
the leading
minors o f
are positive.
We system
conclude i s given
graphic order be
triangular
i s upper
the f i r s t
a l l solu-
D e n o t e b y fl t h e
[ 2 8 . 6 ) . L e t P be t h e m a t r i x t h a t permutes A ' s c o l -
umns
The
zero.
specified
by t h e f o l l o w i n g i n a real
linear
remark.
arithmetic
i s introduced using that
Suppose
that
Obviously,
coordinate the lexico-
s y s t e m . L e t some s e t o f p o i n t s fi
i n t h e s p a c e . L e t u s r e g a r d e l e m e n t s o f fi a s g r a p h
Suppose t h a t w i t h o u r c o o r d i n a t e s a l l g r a p h a r c s o r i g i n a t e graphically
a
space and t h a t
s m a l l e r nodes and p o i n t i n this
case
from
nodes. lexico-
t o l e x i c o g r a p h i c a l l y g r e a t e r nodes.
t h e g r a p h h a s no c i r c u i t s .
Hence i t may be r e -
g a r d e d a s t h e g r a p h o f some a l g o r i t h m . L e t u be t h e o r i g i n a n d v be t h e end
point
our
hypothesis
scribed
o f an a r c . The a r c i s d e s c r i b e d u-
by v e c t o r s
we h a v e v-u>-0. that
b y t h e v e c t o r v - u . S i n c e by
Consequently a l l graph arcs greater
than
a r e de-
zero.
How-
ever, n o t a l l vectors that a r e l e x i c o g r a p h i c a l l y greater than zero
cor-
r e s p o n d t o some a r c s .
are lexicographically
We let
x
i n t r o d u c e the
informational
I f there
path
leading from
x
relation
is a partial
order.
formational fied
that
i f x
in general The the
choice take
Is
often
earlier
t h e n x-
true.
degree
to
is a
order
of
into
of
the
simplicity coordinate
account
certain
that
preserved
are
obtained
t h e m a t r i x S.
described then,
by
triangular tion
informational
that sible
By
i n t h e new
vectors
that
heavily
choice For
we
zero.
introduced
s p e c i f i e d by
Let
i n the o l d system
vectors by
a
are
lexicographically
have
example, i t
relation
s y s t e m be
than
on
I t is
vectors
i n the
new
multiplication that
greater
they
than
are
zero,
Additional
informa-
be
If a
the choice
vectors with be
o f S.
nonnegative
In particular,
components
the choice
of S w i l l
the e n t i r e from
every
i n the o l d
r e m a i n unchanged
space, and point
of
are
coordinate nonnegative
i f we
the whole bunch o f
s p a c e . The
lower
i f the arcs
c h o s e n among a l l n o n s i n g u l a r m a t r i c e s w i t h
originates
this
obtained
assume admisinfinite
regular. r e m a r k we
meant t o draw a t t e n t i o n
coordinate
describe
the graph,
of
arcs
graph
that
y.
matrices with positive diagonal entries.
£1 c o i n c i d e s w i t h
stance.
depends
i s not
Into
to
Obviously,
graph w i l l
reverse
c o o r d i n a t e system. S p e c i f i c a l l y ,
greater
veri-
is limited
by
arcs
order
the i n -
easily
from x
restrictions.
x.yefl
of S
system, S can entries.
no p a t h s
While making
that
be
l y - x ) > - 0 . The be
t o S t a t e m e n t 28. 1, t h e c h o i c e
about arcs widens
described
system.
Recall
I t can
description
additional
the
from
graph
y.
o n EJ. F o r
I f n o t h i n g i s known a b o u t g r a p h a r c s b u t
vectors
according
of
i n t h e new
lexicographically
s y s t e m be by
into
or, equivalently,
necessary t h a t a l l graph arcs that
relation
I f ( y - x ) > - 0 t h e n t h e r e may
required
be
order
system
then a
i s t o be
admissible coordinate
and
a
thorough
undertaken
to the f o l l o w i n g
l e x i c o g r a p h i c order
study
i f we
of
wish
peculiarities
are of
circumused
to
the
set
t o have a w i d e c h o i c e
of
systems.
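The closing remark of this section concerns changes of coordinates under which every arc vector stays lexicographically greater than zero. The following is a minimal Python sketch of the underlying fact, not the author's procedure: a lower triangular matrix with positive diagonal entries maps every lexicographically positive vector to a lexicographically positive one, while a matrix with a nonzero entry above the diagonal may fail to do so. The helper names and the matrices S and T are hypothetical and chosen only for illustration.

```python
from itertools import product


def lex_positive(v):
    """Return True if the first nonzero component of v is positive."""
    for c in v:
        if c != 0:
            return c > 0
    return False  # the zero vector is not lexicographically positive


def mat_vec(m, v):
    """Product of a matrix (given as a list of rows) and a vector."""
    return [sum(a * c for a, c in zip(row, v)) for row in m]


# Hypothetical coordinate change: lower triangular, positive diagonal.
S = [[2, 0, 0],
     [-5, 1, 0],
     [7, -3, 4]]

# All lexicographically positive integer vectors in a small box.
arcs = [v for v in product(range(-2, 3), repeat=3) if lex_positive(v)]

# Every such arc vector stays lexicographically positive under S.
preserved = all(lex_positive(mat_vec(S, v)) for v in arcs)

# Hypothetical counterexample: one nonzero entry above the diagonal.
T = [[1, -1, 0],
     [0, 1, 0],
     [0, 0, 1]]
broken = [v for v in arcs if not lex_positive(mat_vec(T, v))]
```

Because the lexicographic order is a strict total order, a graph whose arcs all point from lexicographically smaller to lexicographically greater nodes cannot contain a circuit; the check above only illustrates which coordinate changes keep that property when nothing else is known about the arcs.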
29. Notation Particularities
Any process of algorithm graph building using an algorithm notation must include the following stages:
- determination of graph nodes and of their functionality;
- mapping graph nodes onto points of some space;
- determination of graph arcs and of their functionality.
The complexity of implementation of the separate stages and the very possibility of carrying out the whole process depend on a number of factors. However, it is obvious that very much, if not everything, is determined by the choice of the language of algorithm description and by the nature of the graph building method.
We assume that the algorithm is recorded using such FORTRAN constructs as arrays, indexed variables, expressions, assignment statements, DO loops, labels, conditional and unconditional GOTOs, and the empty statement CONTINUE. This selection is accounted for by several reasons. The chief reason is the desire to guarantee the automatization of the analysis of existing software with respect to computational systems having numerous functional units. Even a superficial acquaintance with developed software shows that the principal fragments of most programs consist of such constructs entirely or almost entirely. Besides that, similar constructs appear also in many other languages, including the language of mathematical algorithm descriptions. Consequently, having developed a method for algorithm graph building for a FORTRAN-like language with the aforementioned constructs, we will be able to easily extend it to other languages having similar constructs.
Certain FORTRAN constructs are not included in the above list. This means that they either have a negligible effect on algorithm graph building or, conversely, render the process impossible. We will later clarify to which group this or that construct belongs.
Take special note that the CALL statement is not in our list. This statement invokes subroutines and is used fairly often. Generally speaking, subroutines may be viewed as macrooperations of the algorithm. However, this cannot be done without additional information, as the parameter lists of subroutines offer no hints as to which parameters are arguments of the macrooperation and which ones are its results. This information is present in assignment statements: all arguments are to the right of the assignment symbol, and the result is to the left of it. Therefore, instead of using CALLs and subroutines we will extend the notion of assignment statement, making it comparable to
subroutines and multidimensional functions in power. Our list also contains no i/o statements. They are also to be treated as extended assignment statements.
In the course of our investigation we will make other extensions as well. Whenever necessary we will change our notation, e.g. making it closer to the mathematical one. The language we use is not, strictly speaking, a subset of FORTRAN. However, it can be regarded as an expansion of a subset of FORTRAN. Despite this difference we will use the term FORTRAN to refer to the used language, thereby stressing the material connection.
We will enumerate only those aspects of algorithm notation that are relevant to algorithm graph building. Our immediate research goals do not require a rigorous language specification. For the time being, we are only striving to comprehend what in algorithm notation helps and what hampers algorithm graph building.
Consider any quantity that participates in algorithm execution. This quantity may take different values. We will refer to it as a plain variable and to its values as plain variable values. Plain variable values may be numbers, matrices, operators, etc. A plain variable that remains unchanged during the algorithm execution is called a constant.
Let every plain variable be ascribed a name or an identifier denoted by any appropriate symbol. Assume that identifiers of different plain variables are different. Consider a finite set of plain variables. We also assume that they are enumerated by independent integer indexes, and we call the set itself an array. An array as a whole may be denoted by any appropriate symbol. To denote arrays we will also use identifiers. To distinguish between the plain variables constituting the array, each of them has its own name composed of the array identifier and the ordered set of indexes of the variable. The precise structure of such names is not important; it can be arbitrarily complicated. We will refer to such a variable as an indexed variable. Two indexed variables are different if either their identifiers are different or they have different indexes. The number of independent indexes used to enumerate array elements is said to be the array dimension. A plain variable is the single element of a zero dimension
array. Arrays described i.e.
by d i f f e r e n t
t h e y h a v e no common p l a i n
portant
for a particular
plain
variable.
a.
assume
We
and
ables
we c a n a g a i n
introduce
regard
a special
i t s presence does n o t t e l l
impose no r e s t r i c t i o n s
of different
actually joint
ral
i s n o t im-
t h e a r r a y as a
"empty"
variable
i n a n y way o n a n y t h i n g ,
we
reasons
have
There
n o t presume
s
on p a r a m e t e r s ,
specified by
for different yj
y
the that
the given
variables
other
examples.
value
of
variables
8^..
values.
any
We w i l l
be d e c l a r e d
o f an u n c o n d i t i o n a l
may
thing
i s that
be
given.
actually
t h e value
function
refer
t o be
o f a.
y^ using
to this
f u n c t i o n as a r e placed o f uncondi-
a new o p e r a t i o n .
operation
i n terms
incorporate conditional
be
F(x^
y
No r e s t r i c t i o n s
i n general
The
o f some
branches.
The
t h e s e b r a n c h e s d o n o t m o d i f y t h e o r d e r and
number o f a c t u a l a r g u m e n t s a n d r e s u l t s .
We w i l l
D e n o t e b y x.
operation.
representation
"unconditional" referred
shall
may
the set of results
operations
the a t t r i b u t e
8^
o f indexes
Consider
uniquely
can always
we
vari-
sets
other
condition
8^
o f o p e r a t i o n s . Any s u p e r p o s i t i o n o r u n i o n
operations
reasons
variable or different
T h e i n d e x e s o f v a r i a b l e s may de-
operation, or just
important
For these
inner nature
i n a n y s p e c i a l way.
exact
only
i n their
of different
sub-
arrays.
s e t o f a r g u m e n t s X.,.,.,X .
the complexity
use
varying
parameter
determines
often
v a r i a b l e s . They may b e l o n g t o one
;
unconditional
tional
so t h a t
i n FORTRAN
are not interested
o r indexed
t h e same o r t o d i f f e r e n t
and
and i n t e g -
and r
plain
consideration i s
and v e c t o r s , d i f f e r e n t i a l
of a single
variables
vari-
t o conduct
we
often
introduce
we h a v e
to consider
values
We
simultaneous
i n t h e same a r r a y a r e i n t e r r e l a t e d Let
pend
their
written
a r e many
that
T h e s e may be e i t h e r and
o f matrices
again
simultaneously.
ables
because
e t c . Algorithms
I f f o r some
structure,
upon v a r i a b l e t y p e s .
case. For example, q u i t e
investigation
operators,
will
nature
the typical
routines.
on
intersection,
v a r i a b l e s . I farray s t r u c t u r e
case study
For convenience,
that
always have empty
n o t h i n g c a n be done w i t h i t . We
a
identifiers
This
i s a l l that
i s i m p l i e d by
t o an o p e r a t i o n . L a t e r even
this
be w e a k e n e d .
refer
t o the expression
F(ct
a j 8 ,.,.,8 ) a s
assignment
The
statement.
Bl,...,Br ted
variables
I t s o u t p u t s . We
as
follows.
The
a__ a r e
assume t h a t
current
an
said
t o be
i t s inputs
assignment statement
values x , . . . , x
of variables a
ments the
to evaluate y
,y^ 8^
v a r i a b l e s Sj
detail
what v a l u e s
«
s
i n what
made
their
below.
indexes
Generally the
I f there
m u s t be
speaking,
Indexes
of
However, any operation
ments.
and
Input
m e n t s . Output I s not
the
s y m b o l a. may
we
are
sion
F(a
We
i/o
that
the
m e n t s . How
be
will
values
a
recognize
determine
some i n p u t s
and
the
the
im-
statement. as
function
are
then
themselves.
assignment with
F
distinguished be
no
argu-
results.
among
other
specified
o u t p u t s o f any
empty. T h e i r p r e s e n c e has
state-
empty
t o a f u n c t i o n F w i t h empty
statements
to
execution.
i s d i s a l l o w e d as
a l s o viewed to
a]P . . . , 8 ^
variables
that
of
statement
o p e r a t i o n F can of
not vari-
the assign-
The
statement
e x a m p l e , e m p t y v a r i a b l e s may
assume t h a t
We
.
that
reservations
among
of variables
are
corresponds
i n general
by
assignment
effect
on
the
they are j u s t superfluous.
can
extend
the n o t i o n s of
the function
determined
not
the values
f o r the v a r i a b l e s
corresponds
n o t know e x a c t l y sults
that
indexes
statement
ment. S u p p o s e t h a t uniquely
variables
the
statement
implementation; Now
by
statements
i m p o r t a n t how
After
assume t h a t
the assignment
areu*
identical
i n a p r o p e r way.
the c o r r e s p o n d i n g assignment
assignment s t a t e m e n t s . For
statement
We
to
are discarded with
r
assume
performed
output
y
assigned
B^.
indexed
variables
modification of
plementation of
B
are
can
o p e r a t i o n F.
the e x e c u t i o n of the assignment
known b e f o r e
we
used
i s being
Input
are
s u c h among 8^
changed by
the
the values
manner
t h e p r e v i o u s v a l u e s o f S(
be
It
performing
c a r r i e s out t h i s f u n c t i o n
are not
0-
t
and
by
are assigned
and
a b l e s , s h o u l d t h e r e be ment s t a t e m e n t
s
1 s b e f o r e t h e e x e c u t i o n . T h e n x , . . . ,x a r e u s e d as l s
right
are
a
i
determined
and.
i s execu-
once
a
values
i s such
F
subset
of
what arguments a r e
determined. a
;
Bjt..
of
an
—
that
a]
a
we
will
«
Is another
assignment
subset
and
still
are
state-
results
exactly
may
Is
even
what r e -
call
the
expres-
We
will
assume
statement. s
of
i s g i v e n . We
specified
assignment
a l l variables
they a r e used
that
arguments
not
Nonetheless,
, ,8^)
o p e r a t i o n and
always
q u e s t i o n . We
used will
as
argu-
also
assume that the values of all variables $\beta_1,\dots,\beta_r$ are always determined. If some variables actually remain undetermined, we set them to the values they had before the execution of the assignment statement. Later we will show that such an extension of the notions of operation and assignment statement results in building an extension of the algorithm graph. However, it will facilitate building the algorithm graph before the algorithm is executed.
The assignment statement we described incorporates FORTRAN assignment, function and subroutine calls, and i/o statements. Separating inputs and outputs is the principal part of the extension. The ways of describing the functional contents of operations and the index expressions are not important. They will be detailed if necessary. For now, we only note that it is reasonable to separate the functional contents and the index expressions of algorithm descriptions. Later we will show that the functional contents of operations determine the graph nodes, and the index expressions of variables determine the nodes that are to be connected by arcs.
One of the most essential aspects of algorithm notation choice is to ensure the possibility of a compact specification of multiple execution of the same operations depending on the values of indexes of input and output variables. The FORTRAN DO statement is one way to achieve this. We will make use of an analogous statement to specify periodical repetition of one and the same operation.
We assume that some or all of the statements constituting the algorithm notation are marked by distinct positive integers that we will call labels. Every statement may have any number of labels. It is unimportant where labels are situated, so we assume that they are situated to the left of the operation. The DO loop has a header of the form
DO $n$ $i = m_1, m_2, m_3$
where $n$ is a label of a statement that is below the DO statement. The loop body is everything between the DO statement and the statement labeled $n$, the latter included. The loop body depends in general on the parameter $i$, called the loop index. The execution of the DO statement consists of successive executions of the loop body for all values of $i$ within the
range
from
i=mi,
to
then
will
not
for
be
with
equal
t o 1 . The All
the
otherwise
DO
is
value
the
new
us
of
loop
index
expressions
DO
loop
be
i s executed then
the
t h e DO
initial
as
values
Initial
values
current value of We
o f m^,
loop
first
loops
reason
in
*i'
B3
We
the view
assignment
Consider
the
next
Here A
we
comparison by
arise
due
loop
index
and
First,
the
terminates,
the
i t s o l d value
and
must be
two
as
t o be
Is either
with another
kinds
GOTO n .
label
empty
K
can ism^,
index
and
modification
loop
f o r two
one
Another
variable
or
a
one
has
the
the values
of
building.
outputs.
the of
statement form any
1F(A,
state-
cases
i n no
statements. the
This
There are
I t differs
i n p u t s and
function
reaspec-
concerns
of setting
carrier.
transfer
of
header.
a^,
second
to
setting
v a r i o u s methods o f
The
i t i s executed,
i s set
assume t h a t
when
way
from
Therefore
s p e c i a l case o f assignment
control
After
executed. a
of
a
specified
3
t o s p e c i f y m^.
advantageous.
used I n
a r e u p d a t e d and
of
t h e DO
on a l g o r i t h m g r a p h
proves
statement
empty s t a t e m e n t
the form
may
of
follows.
m^
execu-
that
value of loop
languages.
i s merely
empty s t a t e m e n t s
has
each
t h e s y m b o l CONTINUE t o d e n o t e e m p t y s t a t e m e n t .
ment d o e s n o t h i n g and using
initial
«3,
after
statement
2
t o what p a r t i c u l a r i t i e s
have impact
use
the
sum
Is the d e s i r e t o u n i f y
as
possibly involving
as
m^,
the
i s performed
FORTRAN-like
current uncertainty m2'
m^,
index
are
of I , 1 , 1
in generality
m a i n t a i n wide o p p o r t u n i t i e s
s o n s . The ifying
loss
be
not
f o r t h e v a l u e o f 1 t h a t was
the q u a n t i t i e s
l o o p e x e c u t i o n . The
significant
i t to
do
DO
1
Without
assume
body state-
they
updated
3
before
loop
i n FORTRAN. We
of computation
i>a>2
If
i s evaluated
The
we
resolve the ambiguity
checked.
of m .
the
for
labeled.
may
t h e l o o p body i s executed
value
I f nya^
c o n t a i n o t h e r DO
w o r k s e x a c t l y as
of order
Finally,
i s executed
i n t e g e r c o n s t a n t s . Suppose t h a t
and
assume t h a t
body
is omitted
a l s o be
complicated
interpretations
loop
Hm^.
while
If
may
be
the
l o o p b o d y may
statement
and
c o m p a r i s o n '-m^.
new
on,
The
means t h a t
isn^
First
Indexes.
l o o p body. L e t
l o o p b o u n d s . We Inequality
so
statement
mg,
This
different
the
loop
DO
by a r b i t r a r i l y
index.
tion of
and
at a l l .
I n a l l our
specified
to
other
t h a t m^,
require
loop
i^+n^,
executed
ments
s t e p m^.
with
we
statement.
The
first
one
labeled n is ^.....n^).
other variables
and
loop indexes, and $n_1,\dots,n_s$ are integers. The precise method of specifying $A$ is not important now. We assume that if $A$ equals $i$ at the moment of execution of the IF statement then the statement labeled $n_i$ is the next to be executed. If $A$ is not equal to any of the numbers $1,\dots,s$ then the statement immediately following IF is executed. Transfer of control to DO loops is allowed only via their headers. Both control transfer statements are analogous to the corresponding FORTRAN statements.
Now let arrays be defined, indexes be introduced, and the functional contents of every statement be specified in some way. Consider a line-by-line notation, viewed from top to bottom, where every line contains either a DO statement, or an assignment statement, or an empty statement, or one of the control transfer statements, each possibly labeled. This notation specifies some algorithm consisting in the sequential execution of statements, starting from the top line. We will call this notation the algorithm program.
An algorithm is said to be unconditional if it can be described by a program without control transfer statements. If the algorithm program includes control transfer statements then the algorithm is called conditional. An unconditional algorithm is described by DO statements and assignment statements only. The order of execution of these statements is determined by the DO statements and is independent of the intermediate results. For conditional algorithms, the execution order may be more complicated and determined to a considerable extent by intermediate results.
Note that whether an algorithm is conditional or unconditional depends significantly upon the level of detailing at which it is considered. An algorithm that is conditional on the level of basic operations often turns out to be unconditional when described in terms of higher level operations. Therefore we intentionally make no reservations as regards the functional contents of operations. Generally speaking, the implementation processes of operations, viewed as algorithms, may be conditional.
A program and the corresponding algorithm where multiple execution of individual statements is governed solely by a system of DO loops embedded into one another is called a nested loop. If the loop bodies of individual DO statements in a nested loop differ only in the DO statement headers then the nested loop is said to be tightly nested. Otherwise it is called a composite nested loop. The number of DO statements describing a nested loop is called the nested loop dimension. In accordance with our agreement as to the algorithm notation, every statement outside the scope of all DO statements may be viewed as a nested loop of zero dimension. The set of executions of any statement of an unconditional algorithm is determined by a unique nested loop. This nested loop can be obtained by keeping only those DO statements whose bodies contain the considered statement. We will refer to this nested loop as bearing for the given statement.
When
h a v e more r e a s o n s
the
program
of
an
unconditional
t o hope f o r a success i n a l g o r i t h m
compared t o t h e g e n e r a l c a s e .
I f we
the
then algorithm
graph of s u f f i c i e n t
tational
system
implementation
with of
c o u l d h a v e i f we
tations the
number
t h a t may
investigation
overall
regarded
consists
of
as
a p a r a l l e l form of
i m p l e m e n t a t i o n o n a compu-
units
will
informationally
situation
be
reduced
graph
building
i s simpler
o f streams
though.
For
techniques
to
independent
than
v i a conone
The
we sim-
o f u n i f o r m compu-
these reasons with
the
opera-
the
t h e e n t i r e a l g o r i t h m as c o n d i t i o n a l .
i n the specification
involve conditions, of
we
building
Even i f t h e s e o p e r a t i o n s a r e p e r f o r m e d
a l g o r i t h m s the
plification
are able to f i n d
numerous f u n c t i o n a l
large
t i o n s o f t h e same k i n d . ditional
width
algorithm
graph
we
start
unconditional a l -
gorithms.
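The two control constructs of this section, the DO header and the IF statement with a list of labels, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the bounds are taken to be fixed integers evaluated once, the step $m_3$ is assumed positive with default 1, and the loop body is assumed to be skipped when $m_1 > m_2$.

```python
def do_range(m1, m2, m3=1):
    """Index values taken by i in a loop with header 'DO n i = m1, m2, m3'.

    Assumptions of this sketch: m3 is a positive integer defaulting to 1,
    and no value is produced (the body is skipped) when m1 > m2.
    """
    if m3 <= 0:
        raise ValueError("this sketch only handles a positive step")
    i = m1
    while i <= m2:
        yield i
        i += m3


def computed_if(a, labels):
    """Label selected by IF(A, n1, ..., ns), or None to fall through.

    If a equals i for some 1 <= i <= s, control passes to the statement
    labeled labels[i - 1]. Otherwise the statement immediately following
    the IF is executed, which None stands for here.
    """
    if 1 <= a <= len(labels):
        return labels[a - 1]
    return None


index_values = list(do_range(1, 5, 2))
target = computed_if(2, [10, 20, 30])
```

A program that never calls anything like `computed_if` corresponds to an unconditional algorithm in the sense above: the sequence of executed statements then depends only on the DO headers and not on intermediate results.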
30. Guidelines for Algorithm Graph Building
Let an arbitrary unconditional algorithm be given. The program describing it is a sequence of DO statements and (possibly empty) assignment statements. Some statements may be labeled. According to the previously set rules of program execution, all assignment statements are executed in some order. This order is determined uniquely by the DO loop indexes of the statements and the positions of the assignment statements within the program.
Consider the general principles of algorithm graph building using a program that describes the algorithm. For given input data, we assign a graph node to every instance of every assignment statement execution.
This defines the set of algorithm graph nodes. The order of execution of assignment statements induces a linear order on the set of nodes. This order does not by itself yield any special information on the algorithm graph structure, but it allows us to determine the set of graph arcs. Fix a graph node. This fixes a well-defined instance of execution of a well-defined assignment statement. The implementation of the assignment statement corresponding to the node determines uniquely the ordered set of variables whose values are arguments of the corresponding operation.
If some of the variables are indexed, then the indexes will be known by the moment of the assignment statement execution, no matter how complicated their evaluation may be. In the course of program execution any variable may either be recomputed several times or never be recomputed at all. Among the nodes that precede the fixed one (with respect to the induced linear order) we choose those where the fixed argument variables are last modified. "Last" in the previous sentence is understood in terms of our linear order, of course. Now we draw arcs from the chosen nodes into the fixed node.

The graph building process ends here as far as general principles are concerned. We have but a few remarks to make.
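The general principles above can be illustrated with a small sketch. It is not taken from the text: the loop nest in the docstring, the function name `build_algorithm_graph`, and the tuple encoding of variable identifiers are all assumptions made for the example. Each execution of the assignment statement becomes a node, and for every argument an arc is drawn from the node where that variable was last modified.

```python
from itertools import product


def build_algorithm_graph(n):
    """Build the algorithm graph of a hypothetical loop nest

        DO i = 1, n
          DO j = 1, n
            a(i, j) = a(i - 1, j) + a(i, j - 1)

    Nodes are index tuples (i, j). An arc (src, dst) means that the
    operation at dst uses a value last modified at src.
    """
    last_writer = {}  # variable identifier -> node that last modified it
    nodes, arcs = [], []
    # product() yields (i, j) in the order the loop nest executes them.
    for i, j in product(range(1, n + 1), repeat=2):
        node = (i, j)
        nodes.append(node)
        for arg in (("a", i - 1, j), ("a", i, j - 1)):
            src = last_writer.get(arg)
            if src is not None:
                arcs.append((src, node))
            # Otherwise the argument was never modified: it is input
            # data, and no arc is drawn for it.
        last_writer[("a", i, j)] = node
    return nodes, arcs


nodes, arcs = build_algorithm_graph(3)
print(len(nodes), len(arcs))
```

Arguments that were never modified (the boundary elements here) produce no arc, which matches the remark about input variables; an extra input node could be added for them if the input process is of interest.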
Again, fix a node. Suppose some variable whose value is used as an argument was never modified by the moment of execution of the corresponding operation. According to the described graph building process, there will be no arc related to that variable pointing to our node. If we are not interested in how the variable is defined, then the absence of such an arc can be ignored. But if we are interested, for example, in the data input process, we have to register in some way that this variable is an input variable. In that case we can add an extra node to the graph to represent the input of the variable and draw an arc from this node to the node we fixed. This in effect amounts to introducing an input operator to our program. The variable of our interest can also be registered in some other way.

We do not always need to analyze the whole graph that we built. Mark those graph nodes where results of interest are obtained. Clearly, the process of obtaining them is fully described by the subgraph formed by nodes that are connected through some paths to the marked nodes. In particular, this choice of subgraph can be realized by crossing out some of the program statements. If we are interested in the process of outputting algorithm results, we can add output statements to the program. The subgraph may not include nodes corresponding to parameter updates in DO loop headers.

In what follows, we assume that all necessary modifications have been performed. In particular, input/output statements have been introduced, the statements whose action is not of interest to us have been deleted, etc. We will build the complete graph of our program without further concern about the relation of the program to any particular algorithm.

cannot be given A constructive algorithm graph before we proaches loop all
constructively to
whose DO
ing
the
solution
of
body
consists
of
loops
loop
junior
from
by
al
The
functional
loop
contents
that
nodes
examples. Example The
points.
sequential induces, Obviously,
viewed
as
d i n a t e o f y.
i
a
loop
a
placed
index,
execution
nested
Enumerate correspond-
loop
Index
(greater]
is
value.
i s uniquely deter-
i t is perfectly an
ap-
natur-
n-dimensional
arith-
the coordinates o f the
points.
i n the
i s determined
l o o p b o d y . Of
by
course,
vectors.
described
here
i n almost
a l l our
exception.
execution
of
the
precedes y of these
i s computed
to describe
to describe
as
before,
point x
one
smaller
Therefore
by
tightly
Denote the
n o d e i s t h e same and
i s the only of
a
workable
statement.
that
points of
be d e s c r i b e d
coordinates
Therefore
is sufficient
say
the assignment
stated
study
consider
notation.
will
. , 1 ^ being
,.
To
assignment
values.
o f every
order
as
among t h e d i f f e r i n g
it
index
were
12.2
procedure
i f i t s number has
implementing
a l g o r i t h m g r a p h a r c s can
program
We
a l g o r i t h m g r a p h n o d e s as
the o p e r a t i o n
x,
I .
space, the indexes
Note
problem,
single
the assignment statement
the set o f
to view
metic
I.
to another
Every i n s t a n c e o f mined by
this a
t o p t o bottom i n our
indexes
{senior)
building
s p e c i f y node p o s i t i o n s .
loop
by
the
set
of
order
I f
points the f i r s t coordinate
of
with
the
order
i n which
specified
order
respect
before
the l i n e a r
the order
body
1inear
in to
the
that
corresponding
coor-
i n the set of
nodes
individual
loop
indexes
278 are
computed.
set
of
The
knowledge algorithm
number o f
of
the
linear
nodes
Is
important problems.
The
more f a c i l e
graph
i t s employment
order
we
case
the l i n e a r
o n l y need
t o know
the
loops are
always
the
and
make
point
set
parameters
parameters
positive
formulas f o r indexes o f v a r i a b l e s We
observed
i n Chapter
cerning a l g o r i t h m graph special graphs grid. in
properties
To
study data
s p a c e , and
so
on.
placement. graph
Our
words,
will
from
be
step
1 a l o n g each a x i s .
proper
we
come up
specified
i n the
modification against
form,
the
contrary,
far, eters
of
now
on
that
nodes o f Our
and
i t usually
a
Now Enumerate
consider
con-
information to
about
study
regular regular
the
locations
i f we
nodes can set
knew
be of
i s integer
be-
situated. admissible
of
relevant
o n tn^
formulas
while means
m3
and
graph
grid
can
each nodes
that
has
a l w a y s be
met
f o r index expressions. I f
transforming a that
and
algorithm
rectangular integer
w h i l e b u i l d i n g and
we
will
exploring
program
probably
to
have
the algorithm
statements
f o r a l g o r i t h m graph
i t i s reasonable
loop
many p r o b l e m s
facilitated
each
requirements
of
the only s i g n i f i c a n t inside
parameters loop indexes).
t h e nodes o f a
to determine
N o t e t h a t n o t a l l e x t e n s i o n s o f DO i n §29 p r o v e d
the
must know t h e d e n s i t y o f n o d e
difficulties
then
greater d i f f i c u l t i e s
rality
The
example,
be
In and
can
l o o p h e a d e r s . T h i s means t h a t
will
by
order,
priori.
assume
situated
positive.
l o o p h e a d e r s . We
solution
research would
the
corres-
t o r e s o l v e precedence.
modifying
For
a
i n DO
nodes occupy
i t i s worthwhile
1 i n a l l DO
are
lexicographic
i n what p o i n t s o f space a l g o r i t h m g r a p h
other
equals
the
of
i s described,
description
requires additional
that
s t r e a m s we
loop index values a We
that
by
i n the
solution
( i f t h e y depend on
node
i t i s necessary
forehand In
of
5
structure
order
the
induces
f o r the
of a l l loop indexes
coordinates
by
program
simplest order
order coincides with
s t e p s o f DO
a
simpler this
i s . The
ponds t o t h e case where t h e s t e p s that
that
indispensable
the even
graph.
i n t r o d u c e d f o r geneb u i l d i n g purposes.
On
t o n a r r o w some o f t h e o p p o r t u n i t i e s .
So
extension
i s the a b i l i t y
to update m
param-
bodies. the
a l l loop
general
indexes
notation from
top
of
an
to
bottom
unconditional algorithm. and
denote
them
by
279 Ij
and
denote
them b y F^,....F^. E a c h i n s t a n c e o f e x e c u t i o n o f e a c h a s s i g n m e n t
*n-
Similarly,
enumerate
state-
ment i s u n i q u e l y d e t e r m i n e d bearing
nested
described
loop.
a l l assignment
by t h e s e t o f v a l u e s o f l o o p i n d e x e s o f t h e
Let the bearing
by i n d e x e s
I
I
. S
corresponding
to
s .-dimensional
arithmetic
statements
instances
nested
We
will
loop
treat
f o r statement algorithm
as
loops
then
tightly
coordinates assume
nested
of
execution
space.
The
of
indexes
space, g r a p h
the
i t h space.
where
spaces.
F;
as
points
I .
graph
I f Fj
space
nodes
some
be r e -
s,
i s outside
has zero
a l l nested
dimension.
points
unconditional algorithm
D e n o t e b y If,
in
will
I .
occupied
Unlike
of a
single
are scattered
t h e s e t o f nodes b e l o n g i n g t o
The s e t V o f a l l a l g o r i t h m
two l o o p
smaller
Different
indexes,
(greater!
graph
sets
assignment
rical
o r o f two assignment
number
i s said
V . correspond
V.,
'
J
nodes
i s the union of
statements
t o be junior
t h e one
to different
statements
F ,. I f
F .,
j
among t h o s e d e s c r i b i n g b e a r i n g n e s t e d
statements
t h e n t h e s e t s V.,
V, w i l l
that
t o another.
(senior)
i
t h e r e a r e common i n d e x e s two
nodes
V,.
Of has
points.
nodes o f a g e n e r a l
among m a r i t h m e t i c
all
o f these
the corresponding
loops
graph
l
1
garded
be
F
share
loops o f
some g e o m e t -
characteristics. STATEMENT 3 0 . 1 . Suppose
described
by
the
bearing
nested
I \ ,. ..,I .
indexes
i
and
loop thaS
of of
statement Fj
F,
by
is
indexes
s ,
1 J
,J .
Ji
J
.
Suppose,
for
instance,
that
i<j.
Then
s .
J -
I f
the
index
sets
I
I. l
tersection
-
then
I f
the
tersection not in to any
index
then the I .
any
index
sets
any
intersection; not in the
According
index
I.
is
I
to
not
any
index
I
I .
k
o f DO
is in
the
loops
have
S
intersection
any of I. intersection.
to the properties
I . 1
and
the
1
i
junior
I.
in
and g
empty
in-
j
I
;
have
junior
intersection
nonempty
to
in¬
any
index
is
junior
t h e s c o p e s o f a n y t w o DO
280 s t a t e m e n t s may b e e i t h e r Consequently, nested
i f there
nested
i s a t l e a s t o n e common i n d e x ,
l o o p s h a v e a t l e a s t o n e common DO s t a t e m e n t .
statement
containing
sent I n e i t h e r other.
t h e common DO s t a t e m e n t
o f the bearing nested
COROLLARY. I f some then
through
they
the same
Consider
set
Comparing
first,
a l l indexes
their the
of
in
Thus,
signment
o f the fixed l o o p s . Then
in
t h e s e t V o f a l l graph
induces nested a
sequential a linear
loop with
criterion
course,
state
this
that,
bearing indexes
of all
i n d e x e s , and
t o the total
number o f
nested
nodes o f an a r b i t r a r y between
notation
specification
unconditional
graph
nodes
i s established.
with
indexes l o o p be-
loops.
respect
and asWe d i d
t o each other
i s n o t always
r e q u i r e d . We
necessary. statements
i n t h e program
i n t h e s e t V. J u s t a s i n t h e c a s e
o fa tightly
t h e s i n g l e a s s i g n m e n t s t a t e m e n t , we w o u l d
t o resolve
nested
t h e f o l l o w i n g . At
t h e number o f common
e x e c u t i o n o f assignment
order
runs
l o o p a r e s e n i o r t o a l l indexes
i t i s equal
i n the algorithm
and V . a r e
loops o f assign-
a p p e a r common j u n i o r
how t h e s e t s I ' , a r e p o s i t i o n e d
t h e s e t V. H o w e v e r ,
V.
coordinate
loop w i t h
t o a l l indexes o fa l l the rest nested
statements
sets
a l l indexes o f t h e f i x e d
c a n p r o v i d e i t i n c a s e i t becomes The
the
and = f i x e d
nested
nested
there
i n the
index.
common
we c a n i n g e n e r a l
i s described, and t h e r e l a t i o n
specify
i t s body and pre-
and V
i
notation
f i x e d nested loop indexes. A f t e r
algorithm
of
V
o f the fixed
does n o t i n c r e a s e , and f i n a l l y
not
both
number d o e s n o t d e c r e a s e u n t i l
come j u n i o r
then both bearing
ordered set o f bearing nested
i n succession,
nested
loop
and each
other.
I n t h a t c a s e a n y DO
inside
coordinates
groups,
values
indexes
loops
initial
node
i n the algorithm
nested
of
of
junior
the entire
statements
loop.
form
each
l o o p s must a l s o b e p r e s e n t
The new common DO s t a t e m e n t h a s j u n i o r
common
ment
I n t o one a n o t h e r o r f o l l o w
precedence
using
t h e node
like
t o have
c o o r d i n a t e s . Of
t h e n u m b e r s o f s p a c e s must be a d d e d t o n o d e c o o r d i n a t e s . R e c a l l
t h a t we h a v e assumed e a c h m( t o b e i n t e g e r a n d e a c h ma t o b e e q u a l t o 1 for
a l l DO l o o p s . We e x t e n d
indexes
V.
andsenior
t o n o d e c o o r d i n a t e s a n d s p a c e n u m b e r s . We h a v e
STATEMENT 3 0 . 2 . set
o u r n o t i o n s o f common, j u n i o r ,
with
respect
For a node to
the
linear
in
the set order
V
induced
to precede by
the
a node program,
in i t
the is
281 necessary
and
sufficient
that
one
of
the
following
conditions
should
hold: there the
are
values
coordinate of
the
set
all
some
to
follow
-
are
common
values there
tem o f DO
match,
are
no
are
executed
always
ment i t s e l f
in
smaller
for
sets
of
first
the
node
node
coordinates,
i n v e s t i g a t i o n o f how a s y s -
assignment
I f a l l values
those
o f common
of junior
the location
junior
Indexes
order
induced
match o r state-
i n t h e s e t o f a l g o r i t h m graph
determined o f these
nested
nodes
by t h e numbers o f t h e s e t s sets.
Suppose an a l g o r i t h m
i s spec-
l o o p a n d t h e l o o p body c o n s i s t s o f a
We h a v e s e e n t h a t
i n this
case t h e o r d e r
statements
and
single
i s fully
the lexicographic ori n the t i g h t l y
nested
node c a n be i d e n t i f i e d b y a system
c o m p o s e d o f n o d e c o o r d i n a t e s a n d t h e number o f t h e s e t Vy order w i l l
to that
index
again
system. Note
i s identified
an a l g o r i t h m
coincide with that
the lexicographic order i n both
b y o n e a n d t h e same
i s described
by a nested
cases every
number loop
that
and does n o t have n e i g h b o r i n g assignment
statements.
again
identify
In this
every
node
by i t s c o o r d i n a t e s .
system a g a i n
defines the lexicographic
ever,
belonging
number o f i n d e x e s .
to different
I ti s i n general
order
sets
with
algorithm
o f indexes.
label
nodes
indexes
o f t h e assignment
by node c o o r d i n a t e s and c o i n c i d e s w i t h
node
suppose
is
i tincludes. Therefore
I f t h e r e a r e s e v e r a l assignment
respect
coordinates, the
irrespect-
loop body t h e n e v e r y a l g o r i t h m graph
graph
node
o f t h e l o o p body a r e executed
indexes,
i n each
assignment s t a t e m e n t . determined
linear
and
t o smaller values
earlier.
i s fully
by a t i g h t l y
Indexes
match,
icj.
by a d i r e c t
loops
correspond
node c o o r d i n a t e s
The
the
coordinates,
i sverified
Thus, t h e l i n e a r
of
of
them
i n t h e p r o g r a m becomes t h e d e c i s i v e f a c t o r .
t h e program
der.
coordinates
common
t h e r e a r e n o common
ified
coordinates
sets
i<j;
o f how many o t h e r that
by
matching
a l l statements
statements
if
the among
loops works. R e c a l l t h a t before t h e c u r r e n t value o f loop i n -
i s updated,
ively
in
coordinates
the
This statement
dex
coordinates
junior
V^;
there
their
common
of
How
has o n l y one L e t us o n c e
case
t h e index
i n t h e s e t o f n o d e s . How-
are identified
by
different
i m p o s s i b l e t o compose s u c h a n i n d e x
282 system order
using
node
induced
by
graphic
order. This
d e f i n i t i o n given In s p i t e ral
coordinates the program
There
are
i s , o f course,
of t h i s ,
several
as
the order
algorithm possible
Besides
lexicographic this
does n o t
The
from
Now
in
For
these
o r higher
below
any
or
this.
the
linear
the
lexico-
a d h e r e t o the
as
One
or
of
In particular, from
is
will
apply
in
fact
the
term
i n t h e s e t o f nodes, i f We
will
also
i f i t i s j u n i o r or
we
will to
above
every
operation
by
are
I t follows
that
point coordinates
scheme.
also
given
node.
by
indexes
point
of a l l variables i n -
unambiguously
determined
numbers o f spaces. T h i s
rather
simple
p l e x i t y of i t s c o n s t r u c t i v e implementation
deter-
coordinates
a l l a l g o r i t h m graph arcs
and
I t is essentially
say
senior
s p e a k a b o u t t h e node a
specified
The
the
nested
loop i s exactly
notations
contradiction.
than another
these
to a t i g h t l y
we
induced
i n a gene-
lexicographic.
i n a nested
reasons,
order
closest
to
algorithm
node i s u n a m b i g u o u s l y
quantities.
termined ral
be
of statements
transformed
particular
number o f t h e s p a c e w h e r e i t l i e s .
these
that
above d i s c u s s i o n s e t s grounds f o r a l g o r i t h m g r a p h a r c s
mination.
volved
to
be
create ambiguity
n o d e i s lower
closest
V^
strictly
referred
explanations
that,
the o t h e r , r e s p e c t i v e l y .
being
l o n g as we
often
t o the l i n e a r
order
t h a t one
sets
nodes w o u l d
of execution
is
produce l e x i c o g r a p h i c order.
only
of
the execution order of statements
lexicographic.
and
set
t h a t an a r b i t r a r y a l g o r i t h m c a n
l o o p , and
to
numbers o f
i n §28.
unconditional
fact
and
i n the
in
de-
i s t h e gene-
principle.
i s another
through
must be
The
com-
matter.
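For the simplest case discussed in this section, a tightly nested loop with a single assignment statement and unit positive steps, the linear order on nodes is the lexicographic order on their coordinates. The following sketch spells out that comparison. The function name `precedes` is an assumption for the example, and the sketch does not cover the general case of Statement 30.2, where space numbers and common coordinates must also be compared.

```python
from itertools import product


def precedes(x, y):
    """Return True if node x is executed before node y.

    x and y are coordinate tuples of equal length, senior index first.
    The first differing coordinate decides the order.
    """
    for a, b in zip(x, y):
        if a != b:
            return a < b
    return False  # identical coordinates: the same node


# Sanity check on a 3 x 3 iteration space: the predicate agrees with
# Python's built-in lexicographic comparison of tuples.
space = list(product(range(1, 4), repeat=2))
agree = all(precedes(x, y) == (x < y) for x in space for y in space)
print(agree)
```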
31. Splitting the Algorithm Graph

We do not need to know the functional contents of assignment statements corresponding to nodes to determine the set of algorithm graph arcs. It is not even important what concrete variables represent. The only thing that matters is what variables are used on the inputs and outputs of the statements. We only have to distinguish these variables by their identifiers. It follows that the algorithm graph $G$ is actually fully determined by the description of the set $V$ of nodes $I$ and by the definition of two vector functions $P(I)$ and $Q(I)$ on that set. The components $Q_i(I)$, $P_k(I)$ of the vector functions are identifiers of variables that determine the $i$th input and the $k$th output of the assignment statement corresponding to the node $I$. Stressing the dependence of $G$ on $V$, $P(I)$, $Q(I)$, we will denote $G = G(I \in V, P(I), Q(I))$.

As stated before, the set $V$ is the union of the sets $V_i$ consisting of nodes in bearing nested loops. The program induces the linear order on $V$. Without loss in generality we can assume that it is determined by Statement 30.2. The algorithm operation corresponding to a node $I$ evaluates some variables, using the values of (other) variables as arguments. If we know in what set $V_i$ the node $I$ is, then the assignment statement is uniquely determined, and consequently the identifiers of variables used on inputs and outputs of the statement at the moment of execution, that is $P(I)$ and $Q(I)$, are known as well. Different assignment statements in the algorithm notation have in general a different number of inputs and outputs, so the dimension of the vector functions $P(I)$, $Q(I)$ can change with $I$. The dimension remains constant over each $V_i$ for unconditional algorithms, since the number of inputs and outputs of each assignment statement does not change.

One of the decisive conditions of finding algorithm graph arcs is the possibility to compare identifiers of used variables, i.e. the possibility to compare components of $P$ and $Q$ at various points of $V$. Generally speaking, some actions may be performed upon the identifiers in the course of program execution. A typical case is the dependence of an indexed variable's identifier on loop indexes. The identifier may be the value of a variable, etc. Nonetheless, at the moment a variable is used its identifier must be determined. As we wish to investigate the algorithm graph before program execution, we have to ensure that identifiers of all used variables may be compared before program execution. This compels us to impose the following restrictions:
- identifiers of all plain variables and of all arrays are known before program execution and are not changed during program execution;
- indexes of variables are expressions depending only on loop indexes and global constants whose values are known before program execution and are not modified in any way during program execution.
In what follows we assume that the said restrictions are satisfied. This amounts to the assumption that the components of the vector functions $P(I)$, $Q(I)$ depend only on the coordinates of $I$ and some parameters that do not change in the course of algorithm execution. We shall probably have to impose further restrictions on the complexity of index expressions and identifiers as our investigation progresses. We will discuss that matter later.

Algorithm graphs are built under the assumptions made and reflect the inner structure of algorithms. These graphs admit a natural splitting into graphs of special structure that are relatively simple, which helps investigate the structure of an algorithm graph taken as a whole. We stress, however, that it is algorithm graphs that are split, not the algorithms or programs themselves.

Recall that the union of two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ is the graph $G = (V_1 \cup V_2, E_1 \cup E_2)$. This union is denoted by $G = G_1 \cup G_2$.

Consider an algorithm graph $G = G(I \in V, P(I), Q(I))$. Partition the set of inputs to a node $I$ into two disjoint groups. This induces the splitting of the vector function $Q(I)$ into the direct sum of vector functions $Q_1(I)$, $Q_2(I)$ at every node $I$. To denote this, we use the notation $Q(I) = Q_1(I) \oplus Q_2(I)$. Dimensions of the functions $Q_1(I)$ and $Q_2(I)$ may vary from node to node. In particular, one of the functions may be empty. Some of the components of the vector functions may be empty as well. The splitting introduced may be viewed as substituting two hypothetical operations for the operation corresponding to the node $I$. Each of those operations evaluates the same variables as the original operation, and its arguments are distributed between the new operations. It is important that the arguments are not duplicated as they are distributed, so that none of them can be an argument of both new operations.

STATEMENT 31.1. The splitting

$G(I \in V, P(I), Q_1(I) \oplus Q_2(I)) = G_1(I \in V, P(I), Q_1(I)) \cup G_2(I \in V, P(I), Q_2(I))$   (31.1)

always holds.

The graph on the left-hand side of the equality certainly is a subgraph of the graph on the right-hand side of the equality. Their node sets are identical, so the two graphs may differ only in their arcs. Every arc pointing to any node is uniquely determined by the corresponding input variable and some set of output variables described by the vector function $P(I)$. Since no input variables were duplicated in the course of partitioning into groups, the union on the right-hand side of the equality cannot contain any superfluous arcs, including multiple arcs.
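Statement 31.1 can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the function `graph_arcs`, the dictionary encoding of $P$ and $Q$, the use of `None` for an empty component, and the sample recurrence a(i) = a(i - 1) * a(i - 2) are all introduced here and do not come from the text. Arcs carry the input position they feed, so multiple arcs would be visible if they appeared.

```python
def graph_arcs(order, P, Q):
    """Compute the arcs of G(I in V, P(I), Q(I)).

    order: nodes listed in execution order.
    P[I]:  tuple of output identifiers of node I (None means empty).
    Q[I]:  tuple of input identifiers of node I (None means empty).
    Returns a set of triples (source node, target node, input position).
    """
    last_writer = {}
    arcs = set()
    for node in order:
        for k, var in enumerate(Q[node]):
            if var is not None and var in last_writer:
                arcs.add((last_writer[var], node, k))
        for var in P[node]:
            if var is not None:
                last_writer[var] = node
    return arcs


# Hypothetical one-dimensional loop: a(i) = a(i - 1) * a(i - 2), i = 1..5.
order = list(range(1, 6))
P = {i: (("a", i),) for i in order}
Q = {i: (("a", i - 1), ("a", i - 2)) for i in order}

# Split the inputs into two disjoint groups: Q = Q1 (+) Q2.
Q1 = {i: (Q[i][0], None) for i in order}
Q2 = {i: (None, Q[i][1]) for i in order}

g = graph_arcs(order, P, Q)
g1 = graph_arcs(order, P, Q1)
g2 = graph_arcs(order, P, Q2)
print(g == g1 | g2)
```

Keeping the node set and $P$ unchanged in both parts mirrors the statement: only the inputs are distributed, and no input appears in both groups.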
Now partition into two disjoint groups the set of outputs of the assignment statement corresponding to node $I$. We make an additional assumption that no two plain variables have identical names and no two indexed variables are in the same array in these two groups. This induces the splitting of the vector function $P(I)$ into the direct sum of vector functions $P_1(I)$ and $P_2(I)$ at every node $I$.

STATEMENT 31.2. With the above partitioning we have

$G(I \in V, P_1(I) \oplus P_2(I), Q(I)) = G_1(I \in V, P_1(I), Q(I)) \cup G_2(I \in V, P_2(I), Q(I))$.   (31.2)

Indeed, the graph on the left-hand side of the equality is once again a subgraph of the graph on the right-hand side of the equality, and their node sets are identical. Suppose the left-hand side graph has an arc pointing to node $I$. This arc determines uniquely some variable that is an input of the assignment statement corresponding to $I$. But this variable, being an output of the assignment statement, cannot be present in both $P_1(I)$ and $P_2(I)$ owing to our assumption. It follows that no superfluous arcs can appear in the right-hand side graphs.

We have pointed out more than once that only one variable has effect on the determination of every arc. The difficulty lies in recognizing which variable is that. Input and output variables affect the recognition process in different ways. Let the node $I$ and the argument number of the corresponding operation be given. They determine uniquely
the identifier of the used variable. If it is a plain variable, then its name is fixed, and if it is indexed, then the name of the array and the indexes are fixed. The value of that variable may be updated by many assignment statements. It is difficult to predict what statement with what loop index values performs the last update. Nonetheless, we can select from the set of assignment statements the subset of statements that have an output variable with the same name. In the case of a plain variable, the last update is performed by the statement of that subset that is executed last before the node $I$. If our variable is indexed, then we can select the statements that may update variables of the same array. However, now our variable will not in general be updated during every execution of the selected statements. We are only interested in the subset of all executions where the output variable is identical to our fixed variable in both name and index values. The last update will be performed by the execution in this subset which is the last to be executed before the node $I$.

This scheme details the general procedure of algorithm graph arcs determination outlined above. Now it is clear that the principal difficulties are concentrated in the two following tasks. Let a statement that may update our variable be given. The first task consists in the description of the set where our variable is updated by that statement. Under the assumptions made, this amounts to finding all integer solutions to the systems of equations obtained by comparing index expressions. The second task consists in finding the node where our variable is updated last before it is used at the node $I$. This amounts to finding the solution lexicographically closest to $I$ from below among all solutions to the systems of equations. We will consider the procedures to solve both problems in greater detail later. Here we will apply our scheme of algorithm graph arcs determination to obtain a detailed description of the splitting of the graph itself.

We fix an assignment statement $F_i$ and its $l$th argument. Define the splitting $Q(I) = Q_1(I) \oplus Q_2(I)$ as follows. Let the vector function $Q_1(I)$ be empty everywhere except for the set $V_i$. Let all its components but the $l$th one be empty at all points $I$ in $V_i$, and let its $l$th component be the same as that of the vector function $Q(I)$. These conditions uniquely determine the vector function $Q_2(I)$. The splitting $Q(I) = Q_1(I) \oplus Q_2(I)$ induces the graph splitting $G = G_1 \cup G_2$ according to Statement 31.1. However, we are now able to construct formal programs to which the graphs $G_1$ and $G_2$ correspond. Specifically, the program for $G_1$ is derived from the program for $G$ by substituting empty inputs for all inputs in all assignment statements except for the $l$th input of the $i$th assignment statement. The program for $G_2$ is derived from the program for $G$ by substituting an empty input for the $l$th input of the $i$th assignment statement. Suppose that the program for $G$ incorporates more than one nonempty input. Then each of the programs for $G_1$ and $G_2$ has fewer nonempty inputs than the original program. Assume that both new programs inherit the set of labels of the original program.

Fix the name of an array. The array may be of zero dimension, which means the name of a plain variable. Define the splitting $P(I) = P_1(I) \oplus P_2(I)$ as follows. Let all components of $P_1(I)$ be empty at each node $I$ in $V$, except for those which correspond to evaluation of some variable from our array with the fixed name. These conditions determine uniquely the vector function $P_2(I)$. The splitting $P(I) = P_1(I) \oplus P_2(I)$ produces the graph splitting $G = G_1 \cup G_2$ according to Statement 31.2. Formal programs to which these graphs correspond can also be constructed. Specifically, the program for $G_1$ is derived from the program for $G$ by substituting empty outputs for all outputs of all assignment statements that do not evaluate some variable from our array with the fixed name. The program for $G_2$ is derived from the program for $G$ by substituting empty outputs for all outputs of all assignment statements that do evaluate some variable from our array with the fixed name. Suppose the program for $G$ uses variables from more than one array as outputs. Then each of the programs for $G_1$ and $G_2$ will use as outputs variables from fewer arrays than the original program. Assume once again that both new programs inherit the set of labels of the original program.

Thus, the described algorithm graph splittings are accompanied by formal program splittings. On the other hand, if we perform the described program splittings, we will obtain algorithm graph splittings that obey (31.1), (31.2). We stress that program splitting is purely formal. The functional contents of assignment statements is of no importance. The only thing that matters is that all DO loop statements work in all formal programs just as they did in the original program. We will assume this condition to hold.
Consider a program of an arbitrary unconditional algorithm. Let us split it recursively following the above rules. Depending on the type of splitting, we decrease the total number of either nonempty inputs or array names whose variables are used as outputs. This decrease cannot continue infinitely. The process must stop at some moment in both constituents produced by splitting. All formal programs will have the same structure of DO loop statements and the same positions of assignment statements as the original program. There is not more than one nonempty input in each of the obtained formal programs. All variables corresponding to nonempty outputs are in the same array. The sets of nonempty outputs of any two formal programs either have empty intersection or are identical. Algorithm graphs corresponding to formal programs have one and the same node set V, but their arc structures may be considerably different. A lot of assignment statements in formal programs are empty. These, as well as other properties of formal programs, can be established by an analysis of the splitting process. Here we wish to point out that the graphs of many formal programs will be empty, i.e. they will have no arcs.

STATEMENT 31.3. If in some formal program all inputs are empty, or all outputs are empty, or no input-output pair is in the same array, then the algorithm graph is empty.

Indeed, under the hypotheses no output of any assignment statement can be an input of any instance of execution of any assignment statement. Therefore the algorithm graph will have no arcs.

Formal programs that possess the properties listed in Statement 31.3 have no effect on the algorithm graph. Empty statements, as well as DO loops with empty bodies, have no effect on it either. The set of formal programs that are to be taken into account while building the algorithm graph can be described as follows.
Consider a program of an arbitrary unconditional algorithm. Build a family of formal programs according to the following rules. Fix a nonempty input of an assignment statement. Substitute empty inputs for all other inputs of all assignment statements. Substitute empty outputs for all outputs that are not in the same array as the input we fixed. Delete all empty assignment statements, moving labels onto the closest higher statement whenever labeled statements are deleted. Delete all DO loops with empty bodies. The resulting formal program, the corresponding algorithm, and its graph are said to be primitive. Our investigation implies that the following statement holds:

STATEMENT 31.4. Suppose an unconditional algorithm program has no empty statements. Then its graph can be represented as the union of graphs of all primitive algorithms that it spawns.

The requirement for the algorithm to have no empty statements is not too restrictive. It is used merely to ensure that empty statements of the original program do not get mixed up with empty statements produced in the course of splitting. If necessary, empty statements of the original program can be taken into account separately.

Programs of primitive algorithms are lucid. All primitive programs spawned by the original program can be readily written down. In primitive programs the bearing nested loops of assignment statements are identical to the bearing nested loops of the corresponding statements of the original program. Note that once we are able to build graphs of primitive algorithms, we will be able to obtain the graph of any unconditional algorithm using Statement 31.4.

We make one more step in splitting. Consider an arbitrary primitive program. Fix the single nonempty input and any nonempty output. Replace all the remaining outputs by empty outputs and delete all empty assignment statements from the program, moving labels onto the closest higher statement whenever labeled statements are deleted. Delete all DO loops with empty bodies. The resulting formal program, the corresponding algorithm, and its graph are said to be elementary. An elementary program contains at most two assignment statements and at most one nonempty input and output, which are variables from one and the same array.
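The two splitting steps can be imitated on a toy model. The sketch below is only an illustration under simplifying assumptions that are not in the text: a statement is reduced to a hypothetical triple (label, input arrays, output arrays), `None` stands for an empty input or output, and DO loops and label moving are ignored. The names `PROGRAM`, `primitive_programs` and `elementary_programs` are invented for the example.

```python
# Hypothetical model: a statement is (label, inputs, outputs), where inputs
# and outputs are lists of array names and None marks an empty slot.
PROGRAM = [
    ("S1", ["a", "b"], ["a"]),   # a(...) = f(a(...), b(...))
    ("S2", ["a"], ["b"]),        # b(...) = g(a(...))
    ("S3", ["b"], ["a"]),        # a(...) = h(b(...))
]


def is_empty(inputs, outputs):
    """A statement is empty when all its inputs and outputs are empty."""
    return all(x is None for x in inputs + outputs)


def primitive_programs(program):
    """Yield one primitive program per nonempty input of the program."""
    for i, (_, inputs, _) in enumerate(program):
        for k, array in enumerate(inputs):
            if array is None:
                continue
            prim = []
            for s, (label, ins, outs) in enumerate(program):
                # keep only the fixed input; empty all the others
                new_ins = [x if (s, t) == (i, k) else None
                           for t, x in enumerate(ins)]
                # keep only outputs lying in the array of the fixed input
                new_outs = [x if x == array else None for x in outs]
                if not is_empty(new_ins, new_outs):   # delete empty statements
                    prim.append((label, new_ins, new_outs))
            yield prim


def elementary_programs(prim):
    """Yield one elementary program per nonempty output of a primitive one."""
    for j, (_, _, outputs) in enumerate(prim):
        for m, array in enumerate(outputs):
            if array is None:
                continue
            elem = []
            for s, (label, ins, outs) in enumerate(prim):
                # keep only the fixed output; empty all the others
                new_outs = [x if (s, t) == (j, m) else None
                            for t, x in enumerate(outs)]
                if not is_empty(ins, new_outs):
                    elem.append((label, ins, new_outs))
            yield elem


primitives = list(primitive_programs(PROGRAM))
elementaries = [e for p in primitives for e in elementary_programs(p)]
```

In this toy program every elementary program ends up with one or two statements, in agreement with the bound stated above.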
The bearing nested loops of assignment statements in elementary programs are identical to those of the corresponding statements of the original program. We obtain the following

STATEMENT 31.5. Suppose that an unconditional algorithm has no empty statements. Then its graph can be represented as a spanning subgraph of the union of graphs of all elementary algorithms that it spawns.

Recall that a subgraph is said to be spanning if it contains all nodes of the original graph. The union of all elementary graphs is thus an expansion of the algorithm graph. As stated before, the knowledge of the structure of primitive and elementary graphs yields valuable information about the structure of the algorithm graph itself. Of course, we do not always need to build primitive graphs. It may be expedient for some reason to consider the union of elementary graphs only. We should keep in mind that in general we cannot substitute the union of elementary graphs for the algorithm graph. It is useful to know the cases where the spanning subgraph of Statement 31.5 will in fact coincide with the union. Two such cases are described by the statements that follow.

STATEMENT 31.6. Suppose that an algorithm has no empty statements and every element of every array is evaluated only once (or not at all). Then the union of all elementary graphs is identical with the algorithm graph.

Consider an arc of some elementary graph. The assignment statement corresponding to the arc's origin evaluates an element of some array. This element cannot be updated by any other executions of this, or any other, assignment statement, owing to the hypothesis that every element of every array cannot be evaluated more than once. Consequently, our arc is certainly an algorithm graph arc.

STATEMENT 31.7. Suppose that an algorithm has no empty statements. Suppose that all outputs, whether of one and the same assignment statement or of different ones, determine distinct variables. Then the union of all elementary graphs is identical with the algorithm graph.

Consider again an arc of some elementary graph. The assignment statement corresponding to the arc's origin evaluates an element of some array or, in other words, the value of some variable. This variable cannot be updated by any executions of any other assignment statement, nor by any other executions of the same assignment statement on other outputs. This is asserted by the fact that different outputs of assignment statements correspond to different variables. Consequently, the arc is in the algorithm graph.

The hypotheses of Statement 31.6 are characteristic of mathematical notation of algorithms, since it usually does not involve "memory cells update". We shall see later that the hypotheses of Statement 31.7 often hold for portions of real-life programs.

Obviously, the knowledge of individual elementary graphs facilitates finding the algorithm graph. Consequently, elementary graphs have a major role to play in algorithm graph building. There are one or two assignment statements in an elementary program, so only two cases are possible. If an elementary program has only one assignment statement, then the algorithm is described by a tightly nested loop. Any such algorithm may have a relatively complicated graph; in particular, the graph may contain paths of length greater than one. This is one of the reasons why the investigation of tightly nested loops has drawn considerable attention. The structure of elementary graphs whose programs are described by two assignment statements differs significantly.

STATEMENT 31.8. If an elementary algorithm is described by two assignment statements, then its graph has no paths of length greater than one.

Different bearing nested loops have no common nodes even if all their loop indexes are the same. Therefore the sets of origins and end points of arcs have empty intersection. It follows from this fact that paths of length greater than one cannot exist. Isolated nodes may well exist.

The structure of the graph of an elementary algorithm that is described by two assignment statements is characterized not only by path lengths but also by arc directions. The set of graph nodes coincides in this case with the nodes of two bearing nested loops. The arcs always originate from the nodes of the nested loop corresponding to the output and point to the nodes of the nested loop corresponding to the input.
Thus, our research allows us to break down the algorithm graph building process into the following stages:
- splitting an unconditional algorithm program into primitive programs;
- splitting primitive programs into elementary programs;
- building graphs of elementary algorithms;
- synthesis of primitive graphs from elementary graphs;
- forming the union of primitive graphs to obtain the graph of the original unconditional algorithm.

Only the third and fourth stages are yet to be investigated. We have completed all preliminary work.
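The relation between the algorithm graph and the union of elementary graphs (Statement 31.5) can be checked by brute force on a toy program. The sketch below is an illustration under assumptions of its own: the program is replaced by a hypothetical execution trace of three statements S1, S2, S3 inside one loop, every statement has at most one input, and an arc is drawn from the last writer of a memory cell to its reader. The names `trace` and `arcs` are invented for the example.

```python
def trace(n=3):
    """Execution trace of a hypothetical loop body, in execution order:
         S1: x[t] = ...          S2: x[t] = F(x[t])          S3: y[t] = G(x[t])
    Each entry is (node, cells read, cell written); a node is (statement, t)."""
    ops = []
    for t in range(n):
        ops.append((("S1", t), [], ("x", t)))
        ops.append((("S2", t), [("x", t)], ("x", t)))
        ops.append((("S3", t), [("x", t)], ("y", t)))
    return ops


def arcs(ops, writer_stmt=None, reader_stmt=None):
    """Arcs 'last writer -> reader'. With both filters set to None this is
    the full algorithm graph; with a fixed writer and reader statement it
    imitates the elementary graph for that output/input pair."""
    last_writer = {}
    result = set()
    for node, reads, written in ops:
        if reader_stmt in (None, node[0]):
            for cell in reads:
                if cell in last_writer:
                    result.add((last_writer[cell], node))
        if writer_stmt in (None, node[0]):
            last_writer[written] = node
    return result


ops = trace()
graph = arcs(ops)
statements = ["S1", "S2", "S3"]
union = set()
for w in statements:
    for r in statements:
        union |= arcs(ops, writer_stmt=w, reader_stmt=r)

extra = union - graph   # arcs of elementary graphs absent from the real graph
```

Here the array element x[t] is evaluated twice, so the hypothesis of Statement 31.6 fails, and the union contains the extra arcs from S1 to S3 that are overridden by S2 in the actual algorithm graph.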
32. Linear Index Expressions

We embark on a more thorough investigation of problems arising in algorithm graph determination. We limit the discussion to primitive and elementary algorithms.

Consider a program of an elementary algorithm. It determines uniquely the set of graph nodes. Moreover, it determines the set of nodes whence arcs can originate and the set of nodes to which they can point. Both sets may coincide. It is important to notice that the program induces a linear order in the set of nodes. The terms senior and junior refer to this induced linear order.

Suppose an arc flows from J into I. This means that the value of some variable is evaluated at J and consumed at I. Of course, the variable cannot be updated at nodes that are senior to J but junior to I. It follows that if an arc flows into I, then it originates from the node that is closest to I from below among the nodes where the variable is updated. As we know, the arc pointing to I is single.

STATEMENT 32.1. Suppose that an elementary algorithm is described by a single assignment statement whose input and output are one and the same plain variable. Then all nodes of the algorithm graph are in a single path.
Indeed, the variable used at every node is updated at the lower neighboring node. If we order all nodes in accordance with the linear order induced by the program, each node will be pointed to by an arc originating from the lower neighboring node.

This statement reveals a possible bottleneck concerning the parallel structure of the algorithm. Suppose a plain variable is both input and output for the ith assignment statement of the program. Then all nodes in the set $V_i$ will be connected by a path. Hence no parallelizing of this portion of computation is possible no matter what the rest of the algorithm is. If the set $V_i$ constitutes a considerable part of the set of all nodes, then the presence of a single assignment statement of the above kind renders the algorithm practically unparallelizable or poorly parallelizable.

Now suppose that the input and output of an elementary algorithm are described by an indexed variable. The input and output variables are in the same array but can have different indexes. The indexes of the output variable are some expressions involving loop indexes of the bearing nested loop corresponding to the output of the elementary algorithm. They may also depend on some external integer parameters. Denote by $p(J)$ the vector whose components are the indexes of the output variable. The indexes of the input variable are again expressions involving loop indexes of the bearing nested loop, now corresponding to the input of the elementary algorithm, and they may depend on external parameters as well. Denote by $q(I)$ the vector whose components are the indexes of the input variable.

Let the set of input nodes $I$ be $\Omega_i$ and the set of output nodes $J$ be $\Omega_j$. Both are sets of integer points. If the elementary algorithm is described by a single assignment statement, the sets $\Omega_i$ and $\Omega_j$ are identical. In particular, if the indexes depend only on external parameters, we are again essentially under the assumptions of Statement 32.1. If the elementary algorithm is described by two statements, then these sets are different. The sizes of $\Omega_i$ and $\Omega_j$ may also depend on external parameters.

Fix an input node $I$. This uniquely determines the vector index $q(I)$. For an arc to flow from $J$ into $I$ it is necessary that the vector index $q(I)$ be equal to the vector index $p(J)$. Hence the first thing we should do is to require that the equality

$$p(J) = q(I) \qquad (32.1)$$

should hold. Actually our problem is far more complicated. Not only is the equation (32.1) to be solved with respect to $J$, but also the conditions $J \in \Omega_j$ and $J \prec I$ should be satisfied. Here the symbol "$\prec$" implies that $J$ is junior to $I$ with respect to the linear order induced by the program. Recall that we refer to it as lexicographic order (with certain reservations). The constraints we imposed determine the set of nodes $J$ where the variable with indexes $q(I)$ is updated before it is used at $I$. If the problem (32.1) has no solutions under these constraints, then no arc points to $I$, and the initial value of the variable is consumed at $I$ as input. The node $J$ whence the arc pointing to $I$ originates is the solution to (32.1) that obeys our constraints and is lexicographically closest to $I$ from below. The last condition adds more difficulties to problem solution.

We have stated more than once that the number of nodes in real-life algorithm graphs is immense. Moreover, it is usually not known prior to execution, since it depends on some external parameters such as matrix dimensions, grid step, accuracy of computations, etc. Therefore we cannot hope to solve the problem (32.1) for every individual node $I$. Our only hope is to find the solutions in analytical form. This requires, at the very minimum, that the vector functions $p(J)$, $q(I)$ and all the constraints be specified in an explicit analytical form. The specification cannot be too complicated, or else we will not be able to obtain explicit solutions to our problems. The necessity to solve the arising problems analytically is probably the source of the gravest restrictions on the set of algorithms whose graphs can be built prior to their execution.

According to the assumptions we made in the previous section, the components of $p(J)$ and $q(I)$ depend only on the coordinates of $J$ and $I$ respectively, and possibly on some external parameters. Clearly, the vector equality (32.1) defines a system of equations. The number of equations equals the dimension of the array used in the elementary algorithm.
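For a small index domain the problem (32.1) with its constraints can be solved by plain enumeration, which is useful as a reference when checking analytical solutions. The sketch below assumes a hypothetical elementary algorithm given by the single statement a(j) = F(a(j+1)) inside a double loop over (i, j) with 1 ≤ i, j ≤ n, so that p(J) = (J_2) and q(I) = (I_2 + 1). For a single statement the order induced by the program is the lexicographic order of loop index vectors, which Python tuples already implement. The function name `closest_writer` is invented for the example.

```python
from itertools import product


def closest_writer(I, omega_j, p, q):
    """Solve p(J) = q(I), J in omega_j, J junior to I, and return the
    solution lexicographically closest to I from below (None if the
    initial value of the variable is consumed at I)."""
    candidates = [J for J in omega_j if p(J) == q(I) and J < I]
    return max(candidates, default=None)


n = 4
omega = list(product(range(1, n + 1), repeat=2))   # nodes (i, j)


def p(J):
    return (J[1],)        # index of the output variable a(j)


def q(I):
    return (I[1] + 1,)    # index of the input variable a(j + 1)


arc_origin = {I: closest_writer(I, omega, p, q) for I in omega}
```

For this example the enumeration agrees with the closed form J = (I_1 - 1, I_2 + 1), valid for I_1 ≥ 2 and I_2 ≤ n - 1; all other nodes consume input data.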
To build the algorithm graph, we in any case have to solve this system with respect to the coordinates of $J$ explicitly. There is only one representative class of systems for which the solutions can be explicitly found. That is the class of linear vector functions $p(J)$ and $q(I)$. This is reflected also on the class of sets $\Omega_i$ and $\Omega_j$ that we consider.

Assume that all indexes of all variables in the program are linear functions of loop indexes and external parameters. Assume further that the coefficients of linear functions and constant terms are integer. Let $\Omega_i$ and $\Omega_j$ stand for the sets of all integer points within polyhedrons that determine the nodes. Assume also that these polyhedrons are defined by linear functions, so that their facets are planes or semiplanes. The coefficients that determine polyhedron facets may also depend on external parameters and maybe on some computed parameters that determine the operation of the system of DO loops.

The precedence condition $J \prec I$ means that, for a fixed node $I$, the node $J$ is in one of the alternative polyhedrons. They are determined by Statement 30.2 and can be specified explicitly. Let node $I$ correspond to the ith assignment statement, and node $J$ to the jth one. Denote the coordinates of these nodes by $I_1, I_2, \dots$ and $J_1, J_2, \dots$. Assume for definiteness that the first $s$ loop indexes corresponding to the bearing nested loops of nodes $I$, $J$ are common. If there are no common loop indexes, the precedence check $J \prec I$ becomes trivial and consists in comparing $i$ and $j$. Suppose, for example, $i > j$. The first alternative polyhedron is described by a plane

$$J_1 = I_1, \;\dots,\; J_s = I_s. \qquad (32.2)$$

The second polyhedron is as follows:

$$J_1 = I_1, \;\dots,\; J_{s-1} = I_{s-1}, \quad J_s \le I_s - 1. \qquad (32.3)$$
The number of equalities describing each subsequent polyhedron will decrease steadily. We will always have only one inequality. The third polyhedron is specified as follows:

$$J_1 = I_1, \;\dots,\; J_{s-2} = I_{s-2}, \quad J_{s-1} \le I_{s-1} - 1. \qquad (32.4)$$

The last alternative polyhedron will be

$$J_1 \le I_1 - 1. \qquad (32.5)$$

Any node that satisfies the condition $J \prec I$ is in one of the alternative polyhedrons. Moreover, any node of any alternative polyhedron with a smaller number is lexicographically closer to $I$. This circumstance allows us to solve the problem (32.1) by replacing the condition $J \prec I$ with the condition for $J$ to be in the first of the alternative polyhedrons, then in the second one, etc.

Consider the first of the alternative polyhedrons. The problem (32.1) becomes now as follows: find the lexicographical maximum $J$ of the polyhedron specified by the equality (32.1), the condition $J \in \Omega_j$, and the relations (32.2). This problem is a discrete counterpart of the continuous problem considered in §28. The problem we consider now is characterized by the dependence of the right-hand sides of the corresponding inequalities (28.1) on the coordinates of $I$, on external parameters of the algorithm, and maybe on some computed parameters determining the operation of the system of DO loops. This dependence is linear according to our assumptions. Assume for the time being that all variables and parameters are continuous. The lexicographic order extends onto continuous variables in a natural way.
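The decomposition of the precedence condition into the alternative polyhedrons (32.2)-(32.5) can be verified by enumeration. The sketch below assumes that all s loop indexes are common and that i > j, so that the case J = I of (32.2) is admissible; the numbering k = 0, ..., s of the polyhedrons and the name `alternative_polyhedron` are conventions of this example only.

```python
from itertools import product


def alternative_polyhedron(k, I, J):
    """Membership test for the k-th alternative polyhedron, k = 0..s.
    k = 0 is (32.2); k >= 1 keeps the first s - k equalities and adds
    the single inequality on the next coordinate, as in (32.3)-(32.5)."""
    s = len(I)
    if k == 0:
        return J == I
    m = s - k                          # number of leading equalities
    return J[:m] == I[:m] and J[m] <= I[m] - 1


def classify(I, J):
    """Numbers of all alternative polyhedrons that contain J."""
    return [k for k in range(len(I) + 1) if alternative_polyhedron(k, I, J)]


I = (2, 3, 1)
box = list(product(range(1, 4), repeat=3))
membership = {J: classify(I, J) for J in box}
```

Each node J that does not exceed I lexicographically falls into exactly one polyhedron, and nodes of a polyhedron with a smaller number are lexicographically senior to nodes of any polyhedron with a greater number, which is what justifies processing the polyhedrons one after another.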
The process of solving the continuous problem consists of the following stages:
- all linear constraints on $J$'s coordinates are listed in the form (28.1); the left-hand sides involve the coordinates of $J$, and the right-hand sides involve the coordinates of $I$ and unknown parameters;
- using the list of constraints, all systems of linear algebraic equations (28.2) are found whose matrices possess the L-property;
- all found systems of linear algebraic equations are solved with respect to the coordinates of $J$, yielding the nodes $J'$ that are candidates for problem solution; the coordinates of all candidate nodes are linear functions of the coordinates of $I$ and unknown parameters;
- the coordinates of every candidate node $J'$ are substituted into the linear constraints on the coordinates of the target node $J$; the resulting set of equalities and inequalities defines a linear polyhedron $\Omega_i'$ in which the node $I$ must lie. The dimensions of the polyhedron may depend on external parameters and maybe on some computed parameters that determine the operation of the system of DO loops;
- the intersection of $\Omega_i$ and $\Omega_i'$ is considered; if it is nonempty for some admissible parameter values, then for these parameter values the corresponding node $J'$ is the sought solution $J$ to our problem; if the intersection is empty, then $J'$ is excluded from the list of candidates for problem solution for the same parameter values;
- the union of intersections of polyhedrons $\Omega_i$ and $\Omega_i'$ for all candidate nodes $J'$ is formed.

All steps are repeated after replacing the first alternative polyhedron by the second one and the polyhedron $\Omega_i$ by the portion of it that was not involved in intersections. This process goes on repeatedly until all alternative polyhedrons are depleted. After this happens some portion of $\Omega_i$ may remain untouched. This, and only this, portion corresponds to the use of input data by the elementary algorithm. Indexes of the input data variables are specified by the vector $q(I)$.

We make a few remarks. Checking whether the intersection of polyhedrons $\Omega_i$ and $\Omega_i'$ is nonempty brings about constraints on unknown parameters that determine polyhedron dimensions. These parameters are often formal parameters of the program, so that their values cannot be elicited from the program text. Therefore checking the intersection for nonemptiness lets us obtain additional information about the parameter domain. In general, the intersections of polyhedrons $\Omega_i$ and $\Omega_i'$ for the same parameter values may intersect one another for different candidate nodes $J'$.
Recall, however, that the problem has a unique lexicographical maximum. Therefore the candidate nodes must coincide at the points where such intersections overlap.

At first sight it looks like we have to perform a heavy search to determine all systems of equalities (28.2) with matrices possessing the L-property. As a rule, however, this is not actually the case. The coordinates of the target node $J$ always satisfy some system of equalities. Choose a maximal subsystem of it with linearly independent left-hand sides. We can limit the search to various expansions of the chosen subsystem. Clearly, this reduces the overall search amount. The search process can be greatly simplified further if we express senior variables in terms of junior ones using our maximal subsystem and substitute the obtained expressions into all constraints. This will reduce both the search amount and the dimension of the problem.

Recall that before describing the process of continuous problem solution we assumed that the variables and parameters are continuous. Actually they are integer, so the problem we have to solve is discrete. There is an important special case where the solution of the discrete problem is directly related to the solution of the continuous one. Suppose that in the course of solving the continuous problem we have found out that all candidate vectors $J'$ are integer for all integer vectors $I$. Then the solutions of the continuous problem and of the discrete problem coincide at integer points. This situation occurs very often while analyzing real-life programs.

In the case where at least one of the candidate vectors is not integer, the method of finding the solution has to be changed. It becomes more complicated since we have to solve a more difficult problem of finding the integer lexicographical maximum of a polyhedron. We cannot do with the single assumption that the vectors $a_i$ are integer if we want to construct an effective procedure to solve this problem. In the general case it is uncertain how the problem is to be solved if we want to find an analytical solution.
However, the vectors $a_i$ are rather specific in the problem (28.1) arising from (32.1). Indeed, the linear constraints can be broken down into three groups. The first group is formed by the equation (32.1); its coefficients are integer. The second group consists of the relations (32.2)-(32.4) describing the current alternative polyhedron. The third group includes the inequalities describing the polyhedron $\Omega_j$. They set bounds on the values of loop indexes. According to our assumptions, these bounds are linear functions with integer coefficients. Program structure implies that the bounds of a loop index depend only on the indexes of enclosing loops, so the description of the polyhedron $\Omega_j$ has (up to a permutation) a triangular matrix with unit diagonal entries. Though small, it is this specialty that can be exploited while solving both the continuous and the discrete problem. In particular, this structure enables us to check whether the intersection of $\Omega_i$ and $\Omega_i'$ is nonempty in a simpler way, and it affects the complexity of finding candidate vectors as well.

There are no theoretical hurdles that prevent us from finding the set of integer solutions to the discrete problem. However, a lot of details complicate the description of this case. Unfortunately, insufficient experience of real-life program analysis prevents us from giving an accurate view of the practical complexity of finding candidate vectors in the discrete case. We do not discuss these details further.

It is noteworthy that even the very simple technique described here for building algorithm graphs in the case of linear index expressions opens up possibilities to discover a lot of new facts. For example, it can be shown that the functions describing graph arcs may be discontinuous. Such discontinuities actually occur in practice and cannot be ignored while studying algorithm structure.

To conclude, we note that our research allows us to build all elementary graphs in the case of linear dependencies.
If either index expressions or domain descriptions become nonlinear, additional research is required. We will not dwell on the subject further. The overall research scheme is preserved.

Assume that all elementary graphs are built. We now have to gather the whole primitive graph from elementary graphs. A primitive algorithm has a single input. The number of outputs is arbitrary. Consequently, all elementary algorithms pertaining to the primitive algorithm will be described by one and the same vector function $q(I)$ and one and the same set $\Omega_i$, while the vector functions $p(J)$ and the sets $\Omega_j$ will be in general different. Every elementary graph is specified by the vector function $J(I)$ that determines the node $J(I)$ whence an arc flows into node $I$. Therefore the set of elementary graphs pertaining to the primitive algorithm is described by a set of vector functions $J_i(I)$. The index $i$ distinguishes between these functions. Each of these functions is defined on some portion of one and the same set $\Omega_i$. Their ranges are in general different. A lexicographic order is established in the union of ranges; it is induced by the original program.

The problem of synthesis of the primitive graph from elementary graphs can now be formulated as follows. Suppose vector functions $J_i(I)$ are defined on portions of the set $\Omega_i$. Suppose a lexicographic order is introduced in the union of their ranges. The task is: find the vector function $J(I)$ such that at each point $I$ it coincides with the lexicographically superior of the functions $J_i(I)$. The superior function is chosen over all valid values of the index, i.e. among the functions that are defined at $I$.

Since the choice of the lexicographical maximum is an associative and transitive operation, we first consider the case of two functions $J_1(I)$, $J_2(I)$. Denote by $\Omega$ the intersection of their domains. Outside $\Omega$ the lexicographical maximum coincides with the single function that is defined. The function $J(I)$ will be undefined where both $J_1(I)$ and $J_2(I)$ are undefined. The values of each of the functions $J_1(I)$, $J_2(I)$ are nodes determined by one of the bearing nested loops. If the bearing nested loops have no common indexes, the seniority of functions on $\Omega$ is immediately determined by the seniority of the corresponding bearing nested loop.
301 How
suppose
there
a r e common
indexes.
common c o m p o n e n t s o f J^ll) a n d JiDinto
equalize
lexicographically
To
split
pair of
T h i s d e f i n e s a s u r f a c e c u t t i n g Sl}
obtain and
s e n i o r , on t h e o t h e r
the surface
equalize
itself
t h e second p a i r
JZ(D
bearing
m a t c h . On
nested
signment JAI),
that
loops
statement
J(I)
refer
If
constraints choice
will
o f the corresponding
graph
spawns
graph
will
indexes
every
Note
Primitive
carried
graphs
through
constraints.
graph
both
functions
statement,
the choice
graphs,
expressions
will
t h e syn-
function.
of linear
This
1inear linear.
functions
f a c i l i t a t e s the
of arc functions
may
graphs.
process
index
are related
linear
completes
This
procedures and
be p i e c e w i s e
discontinuities
essentially
loop. I f
inner operation.
index
function
building.
i n t h e case
I t sbottlenecks
i s again de-
c o n s i s t o f a sequence o f
linear
synthesis
algorithm
where
maximum among p i e c e w i s e
that
will
o f t h e corresponding as-
elementary
of linear
a piecewise
process.
more
also appear during the synthesis of primitive
unconditional
seniority
I n t h e case
by t h e statement's
I n t h e case
produce
building
J (/).
seniority,
bearing nested
t o o n e a n d t h e same a s s i g n m e n t
on loop
the function
a l l common c o m p o n e n t s o f J II)
set the lexicographic
of lexicographical
always
—
to lexicographic
coincide, the seniority
thesis o f the p r i m i t i v e above.
a
t which
becomes d e c i s i v e .
the primitive
described
side
respect
of common components, etc. Eventually we
between them i s d e t e r m i n e d
graph
with
t h e s e t o f n o d e s f r o m £i
termined by t h e s e n i o r i t y
be
the f i r s t
t w o h a l v e s . On t h e o n e s i d e o f t h e s u r f a c e t h e f u n c t i o n J II) w i l l
be
The
We
only
the process
can n e a r l y
expressions to
and
of
always linear
the necessity of
making certain choices when some parameters are undetermined.
33. Branching The
set o f unconditional algorithms
their
notation.
would
be f e a s i b l e
venient life
algorithm
v i a the form o f
This form was selected so that algorithm graph and the corresponding
implementation. Therefore
effectively
i s defined
structure
conditional algorithms.
admit
building o f con-
t h e success o f i n v e s t i g a t i o n o f r e a l -
depends h e a v i l y
transform their
processes would
o n w h e t h e r we w i l l
notations into
be a b l e t o
e q u i v a l e n t n o t a t i o n s o f un-
302 The
principal
scarci ty change
of
shortcoming of
means
the
order
of
of
transfer
statements
they
used
are
tion.
execution
serve
two
that,
they
I f we
assume
that
to
bottom ' of
volves
upward
control
the
the
hazardly
of
above
two
throughout
the
that
program
will
place
the
following
s t a t e m e n t s , we t h a t we
The
the
Implement
the
We
A
place,
of
computa-
are
executed
looping
branching
of
transfer
"from
primarily
involves
will
restrictions looping
of
on
typical
use
In-
downward
control
algorithm
via
transfer always to
that
DO
meet
to
or
Re-
property organize
control
them
we
transfer
This
satisfied to
struc-
loops only.
bound.
statements
be
hap-
reason,
control
transfers
notation
occur
algorithm
of
l o o p upper
"down-
the can
transbe
per-
statements
that
analysis.
program
includes control
c o m p u t a t i o n s and
an
how
such a
unconditional i n §29 w i l l
that
are
in
the
transfer
program
transformation
transfer
transfer
always
such
can
are
the
descriptions
loop of
control
be
algorithm. play
on
"down-
transformed
Our
key
Note
c a u s e d by
intermediate
I n d e x e s t h e n t h e y can domains
to
expansion
role.
primarily
contingent
I f t h e s e c o n d i t i o n s d e p e n d o n l y on incorporated
the
is organized
branching
show when and
control
statements of
very
these r e s t r i c t i o n s
branching of
g r a p h nodes a r e
cluding
first
branching
For
original
impediments of
sults.
to
Control
computa-
complicated.
e q u i v a l e n t program of
easily
then
while
the
investigation
assignment statement discussed the
notation,
loops.
is
of
statements
control
get
that
Suppose t h a t o u r
ditions
implement
way
port ions
may
formed d u r i n g p r e l i m i n a r y
the
In
of
notation only
DO
program, the
use
further
check f o r
of
of
assume t h a t
Assume
formation
wards"
purposes.
a l l o w dynamic u p d a t i n g of
unnecessary
wards"
to
control,
uses
using
loops.
used
The
using
execut ion
unconditional
algorithm
transfers
ture
makes
are
i s by
main
repeated
algorithms
expression.
transfers.
If
call
for
organize
1
unconditional
transfer
statement
Besides
tions. top
to
control
where
of
that conrebe
algorithm
situated. example
a condition
of
branching
check. Consider
is
the
evaluation following
of
a
program
function fragment:
in-
303
      IF (A, 1)
      $F_1(\alpha_1; \beta)$
      GO TO 2
    1 $F_2(\alpha_2; \beta)$        (33.1)
    2
The variable $\beta$ is updated here. The update is performed either by operation $F_1$ or by operation $F_2$, depending on $A$. However, nothing prevents us
from
regarding
statement t
, a-
will
z
*.
tlo^,
The
never
Is
another
,
transition
ment
o f assignment will
(33.1)
statement
always be t r e a t e d
t h e graph
as a
single
assignment
the variables
implles
that
the values of
as a r g u m e n t s . T h e i r a c t u a l use
o f graphs
lowing fragment
o f fragment
(33.1)
d e p e n d s o n t h e v a l u e o f A.
t o • I s accompanied by u n i t i n g
corresponding
union
fragment
It is immaterial that actually
matter.
Obviously, The
A ; 8).
be used simultaneously as arguments of the operation
expansion
variables
the entire
<*, ,
to different
i s obtained
values
i f instead
p r o d u c i n g t h e same
IF
of
(33.1)
one g r a p h .
The same
we c o n s i d e r
the f o l -
results:
{A,
W
s = «
8
1) 2
B)
GO TO
1
t h e graphs o f t h e f r a g -
of A into
2
= a,
2 Setting sample one.
This
algorithm and
t h e fragment
transformation involves
(33.1)
of a
enlargement
algorithm
statement
t o an
enlargement
Therefore
isa
unconditional
o f t h e o p e r a t i o n s i n terms
i s described. A global
n o t always possible.
t o be a new a s s i g n m e n t
conditional
o f which t h e
I s n o t always
l e t us c o n s i d e r a n o t h e r
convenient transforma-
304 tion. and
Denote by A t h e q u a n t i t y
i s n o t e q u a l t o 1 if
that
A equals
IF F 1
1. Now
1 i f A i s not equal
rewrite
(33.1) as
t o 1,
follows:
1)
{A,
(a ; 8) 1 l
I F {A, F
equals
(33.2)
2)
(
2 V
L e t us r e g a r d t h e f r a g m e n t s
IF
(A,
1)
I F ( 4 , 2)
a s a s s i g n m e n t s t a t e m e n t s * t 1 ° ^ . 4; 8 ) the fragment
(33.2) w i l l
and
4>2 ( o ^ . A; 8 )
respectively.
Then
have t h e f o r m
1 9 a t « 3 . i*i
8)
2
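The two rewritings of fragment (33.1) discussed here, a single enlarged assignment and a pair of guarded statements, can be sketched in Python. This is only an illustration under stated assumptions: the bodies of `F1` and `F2` are hypothetical stand-ins, the function names are invented for the sketch, and the convention that $F_2$ is applied when `A == 1` (and $F_1$ otherwise) is assumed, since the text does not fix the meaning of the test.

```python
def F1(alpha1, beta):
    # Hypothetical stand-in for the operation F1(alpha1; beta).
    return alpha1 + beta

def F2(alpha2, beta):
    # Hypothetical stand-in for the operation F2(alpha2; beta).
    return alpha2 * beta

def conditional(A, alpha1, alpha2, beta):
    """Conditional form in the spirit of (33.1): exactly one operation updates beta."""
    if A == 1:                      # assumed meaning of the test on A
        beta = F2(alpha2, beta)
    else:
        beta = F1(alpha1, beta)
    return beta

def phi(alpha1, alpha2, A, beta):
    """One enlarged assignment: both alpha1 and alpha2 are formal arguments,
    although only one of them is actually used for a given A."""
    return F2(alpha2, beta) if A == 1 else F1(alpha1, beta)

def phi1(alpha1, A, beta):
    """Guarded assignment: applies F1 or leaves beta unchanged."""
    return F1(alpha1, beta) if A != 1 else beta

def phi2(alpha2, A_bar, beta):
    """Guarded assignment driven by the complementary quantity A_bar."""
    return F2(alpha2, beta) if A_bar != 1 else beta

def two_near_steps(A, alpha1, alpha2, beta):
    """Two guarded statements executed one after another, with no far transfer."""
    A_bar = 0 if A == 1 else 1      # A_bar equals 1 exactly when A differs from 1
    beta = phi1(alpha1, A, beta)
    beta = phi2(alpha2, A_bar, beta)
    return beta
```

All three forms compute the same value of `beta`; the price of the enlarged forms is that the graph now records dependences on both `alpha1` and `alpha2`, whichever branch is taken.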
This is another sample transformation of the conditional algorithm (33.1) to the unconditional one. By that method we can replace "far" control transfers by a sequence of "near" ones, at the expense of creating new variables determining the operation of "near" control transfers.
We used the fragment (33.1) to demonstrate possible ways of transforming programs. Apparently an accompanying transformation of operations may be required.
Consider any continuous sequence of statements belonging to one and the same bearing nested loop. Suppose first that the sequence has no alternative exits. This means that upon its execution
always
transferred
rality
we c a n assume t h a t
control
transfer
mines p o s s i b l e be
a new
into of
t o o n e a n d t h e same s t a t e m e n t . W i t h o u t the last
statement. input
Build
and o u t p u t
assignment
statement.
statement
loss
i n gene-
o f t h e sequence
i s not a
t h e graph o f t h e sequence. variables.
We
We
include
declare
results.
I f an o u t p u t
i t as an i n p u t
the graph's
during
will
the execution
assume i t r e t a i n s
that
t h e sequence
transformed
variable
of the selected
itself
The
sequence
(33.1)
i s a sample
sequence.
algorithm
is
preserved during
as
follows.
their
by
inside
sequence
exits.
done
assignment variables The to
ling
irregularities
algorithm
structure
as
replace
after
statement
listed
from
below
t o t h e added empty t h e empty
statement
assignment
of control
additional
results.
of control
control exits
has
no
The
will
modi-
Add a l l
variables
t o t h e new
speaking,
such
distinct.
portion
was d e c l a r e d
portion
the
alternative
statements
Generally
transfer
inside
statement.
statement.
transfer
all
branches i n -
transfer
a l l alternative
list
i t , preserving
above t r a n s f o r m a t i o n o f t h e sequence a l l o w s us t o s p l i t
the executive
portion
with
i t t o be a new
statement
transfer
to the closest
that,
the operation
m u s t be
be that
t h e sequence
t o i t s e n d . Then
t h e sequence
the corresponding
transfers
together
Declare
control
local
The o v e r a l l
statement
from
I f some c o n t r o l
of control
Having
also
o u r sequence has a l t e r n a t i v e e x i t s . T r a n s f o r m I t
statements
sequence by c o n t r o l fied
operations.
a d d an empty
r e l a t i v e order.
the transfer
we
the process.
t h e sequence, r e p l a c e
statement.
that
larger
that
First,
transfer
can
Declaring
be a new assignment statement allows us to hide
control
of statements,
f o r example, due t o t h e n e c e s s i t y
to
suppose
also
have identical values on output.
fragment
Now
we w i l l
i s not updated a t
sequence
i s transformed.
reasons,
variables
i n t o I t s set
v a l u e . T h i s a s s u m p t i o n means i n f a c t
of
side
input
i s not always updated,
its initial
f o r other
identical variables The
variable
o n e . I n case an o u t p u t
I t deter-
t h e sequence t o
I t s s e t o f arguments and t h e g r a p h ' s o u t p u t v a r i a b l e s
view all
the control i s
and
the c o n t r o l l i n g
portion.
The
i t in-
executive
t o be a new a s s i g n m e n t s t a t e m e n t , a n d t h e c o n t r o l -
be c o n s i d e r e d
separately.
Of c o u r s e ,
such
splitting
306 can
be o b t a i n e d b y a n y d i f f e r e n t means. The s e q u e n c e
nested fers
o f statements
a r e i n o n e a n d t h e same
loop can i n c l u d e v e r y f a r c o n t r o l
c a n be e l i m i n a t e d
sequences h a v i n g
incorporating
i n general
q u e n c e c a n be d e c l a r e d replacing
transfers.
the f a r control
transfer
of that
kind
transfer that
by a sequence
i s replacement
o f GOTO
i n t o sub-
every
statement.
c a n be p e r f o r m e d
trans-
i s s p l i t t i n g the
alternative exits. After
t o be a new a s s i g n m e n t
bearing
Such c o n t r o l
the f a rcontrol
An a n a l o g o u s r e p l a c e m e n t
replacement
transfers.
b y v a r i o u s means. One o f t h o s e
sequence o f s t a t e m e n t s
In
that
subse-
This
results
o f near
directly.
control A sample
i n 133.1)
by
IF i n
(33.2). Let level
a p r o g r a m be g i v e n . S e l e c t a n y n e s t e d
and c o n s i d e r
contain
other
signment
DO
t h e body o f t h e innermost
statements,
statements
and
i t must
control
one where
ments
assignment
transferring
control
the
Besides
statement Thus,
nermost or
that,
duplicate
a l l control
define
s e q u e n c e by
transfer
statestate-
t h a t d y n a m i c a l l y u p d a t e t h e l o o p upper a l l control
transfer
from
statements
be h i d d e n
t h e loop
body.
peated
f o r bodies
entire
program i s transformed
ter
this
o f as-
transfer
statements
after
l a b e l e d as end o f l o o p .
loop can e i t h e r
removed
that
f o l l o w e d by
d o w n w a r d s . Add a l l c o n t r o l
ments t o t h e g r o u p o f s t a t e m e n t s bound.
replace
are foremost,
i t cannot
sequence
statements
techniques,
statements
loop. Since
be a c o n t i n u o u s
transfer
b r a n c h e s . Making use o f t h e above the
loop o f maximal n e s t i n g DO
of loops
inside
within larger
Obviously,
o f smaller
assignment
this
nesting
t h e body o f t h e
procedure level
in-
statements c a n be r e -
until
either the
t o a n u n c o n d i t i o n a l p r o g r a m o r we e n c o u n -
t h e s i t u a t i o n where a t r a n s f e r
of control
encloses
some DO
statement.
Any hiding of conditional control transfer statements results in an explicit or implicit expansion of the algorithm graph. Algorithm graph expansion brings about the reduction of the set of parallel forms. This reduction may be so significant that the only admissible implementation of the transformed algorithm will be its serial implementation. Consequently, we have to keep an eye on two contradictory tendencies while transforming programs. Specifically, notation simplification generally results in graph structure simplification while at the same time reducing the opportunities of parallelization.
If a sequence of statements includes a DO statement then it cannot be declared to be a new assignment statement. The major obstacle is a large number of input and output variables in such a sequence. This number may be dependent on loop bounds and may not even be determined. Despite all this, control transfer statements can be (if necessary) encapsulated even in the case of such sequences. We show how to achieve this on a simple example. Consider the following program fragment:
      IF (A, 1)
      DO 2 i = 1, N
    2 $F(\alpha_1, \ldots, \alpha_s; \beta_1, \ldots, \beta_r)$
    1
where the DO loop body consists of a single assignment statement $F(\alpha_1, \ldots, \alpha_s; \beta_1, \ldots, \beta_r)$. Assume for definiteness that $N$ is independent of $i$. Obviously, this fragment is equivalent to the following one:
      DO 2 i = 1, N
      IF (A, 2)
      $F(\alpha_1, \ldots, \alpha_s; \beta_1, \ldots, \beta_r)$        (33.3)
    2 CONTINUE
    1
Now the DO loop body in (33.3) consists of three statements. They can be declared to form a new assignment statement. This is achieved in the same way as with fragment (33.1).
At this point we end our discussion of various situations arising in the course of program transformation to equivalent unconditional programs.
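The step from a branch that encloses a whole DO loop to a branch hidden inside the loop body, as in (33.3), can be illustrated by a small Python sketch. It rests on assumptions that are not in the text: the test is read as `A == 1`, the loop body is an arbitrary hypothetical function `F(i, beta)`, and `A` is not modified by the body.

```python
def skip_whole_loop(A, N, beta, F):
    """The control transfer encloses the DO statement: either the whole loop runs or none of it."""
    if A == 1:                       # assumed meaning of the test on A
        return beta
    for i in range(1, N + 1):
        beta = F(i, beta)
    return beta

def guard_inside_loop(A, N, beta, F):
    """Form in the spirit of (33.3): the test is repeated in every iteration,
    so the loop body (test, operation, CONTINUE) can be viewed as one statement."""
    for i in range(1, N + 1):
        if A == 1:                   # same test, now local to the body
            continue
        beta = F(i, beta)
    return beta
```

The second form executes N redundant tests when the loop is skipped, which is the kind of cost that hiding control transfers introduces.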
34. Linear Information Closure
We introduced the notion of information closure in Chapter 5. The purpose was to simplify the investigation of schedules possessing some properties common (in a sense) to a class of like algorithms. The information closure is described by a system of relations that are satisfied by these schedules. The relations may vary with the properties of schedules we are interested in and with the properties of the class of algorithms.
So far we did not have vital reasons to fix properties of a class of algorithms. Our earlier research on information closures was based on inferences drawn from consideration of individual examples. The absence of sufficient information about algorithm graph properties brought about our imposing certain restrictions. In particular, we excluded algorithms whose graphs have "long" arcs. As a consequence, the properties of schedules in which we are interested were not accurately described.
Now we are able to consider information closure in greater detail. Note that to find it we need to know algorithm graphs rather than algorithms. In the case of linear index expressions algorithm graphs can be obtained using the above-described procedures. Some properties of these graphs follow from the implementation of those procedures. We formulate some of them here.
Let a system of real parameters $N_1, \ldots, N_s$ be given that defines sizes of algorithms and, correspondingly, of their graphs. Assume that parameter values are in some linear polyhedron, in general unbounded. Suppose that for every set of parameter values the corresponding graph nodes are in a finite system of bounded linear polyhedrons $V_1, \ldots, V_m$. These polyhedrons are disjoint and may in fact lie in different real arithmetic spaces. Suppose each polyhedron $V_i$ is covered by a finite system of other polyhedrons $V_{ij}$ that may intersect one another and have different dimensions. Suppose further that the coordinates of all vertices of all polyhedrons are linear nonuniform functions of $N_1, \ldots, N_s$. Suppose that graph arcs are described by linear nonuniform functions of points of arithmetic spaces. The domain of each of these functions is
309 one
t h e p o l y h e d r o n s V.
of
V%
f
arc,
•
and
The the
function
function function
It
i s easy
verify
I s i n one
corresponds
I t s origin.
o f W(
«
of these
to
I t s range
value to
i s Independent
nonuniform f u n c t i o n
. and
argument
and
s
to
The
of the
the
polyhedrons
end
point
uniform
part
the c o n s t a n t term
of
Of
We
that
are not i n t e r e s t e d
will
build
algorithm
i n them
graphs
obtained
using
t h e g r a p h s w e r e o b t a i n e d p l a y s no
not
assume o u r
any
suitable
graphs
t o be
some
irregularities,
note
that
the
moreover
expansion important
In
their
of the
Chapter
nodes Of
be
not
due
He
inside
sufficient
a
closure
i t i n any
d o m a i n whose
on
way.
was
Important new
nodes
p a r a m e t e r s . Such
to and
values
situations
s i z e depends on
some com-
the
built
b a s i n g on a
Specifically,
size
does n o t
we
properties. original
I t remains
Now
depend
we
family
to define
can
of a
spec-
assumed on
a l s o chosen because o f t h e
graph
for
be
intro-
descriptions
I t is
to adding
c l o s u r e was
of graphs.
approach
information
information
transforming
this
need
priori.
the information
of the family
course,
before,
i s the expansion o f the set o f parameter
i s n o t known a 5,
stated
branches.
only
so we
example, t h e y can
As
I n cases where a l g o r i t h m
the i n t r o d u c t i o n of f i c t i t i o u s
arrangement
eters.
graphs. For graphs.
particularly
may
i n the f o l l o w i n g ,
c o n s i d e r e d i f , f o r example, graph
p u t e d v a l u e s and
ial
algorithm
i s useful
have
h a v e t o be
properties.
now.
role
extensions of algorithm
ducing expanded graphs
and
the
formula-
the information closure for these classes of graphs.
How
arcs. Equally
linear
parameters.
course, these graphs possess also a number of other
H o w e v e r , we
an
every
is a
a b o v e - d e s c r i b e d p r o c e d u r e s c e r t a i n l y p o s s e s s t h e p r o p e r t i e s we ted.
of
paramabsence
t r y to
graphs
suitable
that
build
without class
of
schedules. We
will
nonuniform alized vectors
consider generalized
functions
schedule a.
that
that are l i n e a r Take a the
has
Suppose
each o f the polyhedrons V
the form
are
la.,xi)+iri,
independent
x
of
.,
.,
of arcs. and
their
Suppose origins
t h e end are
they V
t h e n we and
N^,
nonuniform f u n c t i o n s of these
family
polyhedron V
on
schedules.
.
are
linear
I f a
gener-
will
constant
seek
the
terms
parameters. points
in V
.
of
I t s arcs are
L e t us h a v e a
look
In at
310 the
information
closure of
this
to
avoid cumbersome multilevel
on
V..
Let
the schedule
Suppose
that
family.
indices
have the f o r m
our
family
We
will
i n our
simplify
notation
formulas.
( b , y ) + 5 o n V^
of arcs
our
and
the form a
(a,x)+j
i s described
by
linear
func-
the
i s independent
K tion
of
N^,...,N
the form and
s
the
these parameters.
on
x=Jy+tp
V^ ., w h e r e
constant The
term
ip i s a
matrix J
linear
nonuniform
function
of of
inequality
$$(a, x) + \gamma \le (b, y) + \delta$$
should hold for generalized schedules for all $y \in V_{ij}$. In other words,
$$(J'a - b,\, y) \le -(a, \varphi) + \delta - \gamma.$$
Denote point
by
inside
y
the
a finite
combination
vertices
of
polyhedron
of i t s vertices.
be
represented
H j
1
yelf_.
schedules
The
should
to the f o l l o w i n g
satisfy
on
K , . , . , S
functions are
of
,
are
looking
and
f o r constant parameters.
that
the
(34.1)
inequality
for a l l (34.1)
* - *0 0
terms
j,6
that to
the
o f N , . . ,N
+
J
Let
+
iV--- W
+ S N *. , . +5
11
is
inequalities
According
functions
5 = 5
inequality
f o r v e c t o r s a and
these
l i n e a r nonuniform
every linear
(34.2)
< -[a,v)+5-r.
1
t h a t we
the
shows
system o f
(J'a-b.y )
Recall
that
convex
1
r e p r e s e n t a t i o n [34.2)
equivalent
a
i 0, [ « j = 1-
1
target
[35]
as
In p a r t i c u l a r ,
y = J] a.^ ,
All
I t i s known
V^ ,.
can
(34.1)
s
N s
(34.3)
b
t h a t do
are
linear
hypotheses,
not
depend
nonuniform 1
y
and
tf
311 Rewrite the inequalities
(34.3) as
where a l l f . a r e l i n e a r
According
follows:
uniform functions
t o the hypotheses,
I n the coordinates of a
the s e t o f parameters
i s i n some
and
linear polyhedron, generally unbounded. A representation analogous to (34.2) holds for points lying inside such a polyhedron.
There
i s always a f i n i t e
•
a finite
= (W*
set of vectors
M
such
( N . . . . . N ) . 1 S
set of points
IT*
and
Denote
q
=
q
(if
n ) s
that
*
=
I
* I
k
11
H < 7 ,
(34.5)
q
where
« , E 0, V a = 1, 8 t 0. (c L k q
Indeed, subset
l e t the parameter that
i s also
a
f o r m N £ ±0 f o r s u i t a b l y ft of
a
representation
be d e r i v e d f r o m The
polyhedron
polyhedron, large
o f t h e form
(34.6)
be u n b o u n d e d . C o n s i d e r adding
proper
Q. P o i n t s i n s i d e (34.2).
The
its finite
inequalities this
of the
polyhedron
representation
admit
(34.5)
can
( 3 4 . 2 ) b y l e t t i n g Q=.+
Inequalities (34.4) should hold for all $N$ from (34.5), (34.6).
312 In
particular,
the
inequalities
(34.7)
should a
k
=1,
hold S
q
all q
for
equalities
a l l k, q.
for
=0
no
134.5)-(34.7)
the
generalized N
N
l
K
be
obtained
q
=1,
B =+c° f o r Hg.
v e c t o r s a,
imply
that
b and
for
inequalities
single
In
spite
constant
these
schedules
and
(34.7)
whose
constant
are
(34.4) by q.
fixed of
t e r m s ?,5
setting
These
this,
same a.b.j.S
i f we
from
the
in¬ have
(34.7),
inequalities
»fi
are
satisfied
direction
terms
from
a
hold f o r a l l parameters H
(34.4) w i l l Thus,
a
l o n g e r d e p e n d on
somehow d e t e r m i n e d t h e then
They can
and
by
vectors
linear
all
are
nonuniform
linear
on
V.
independent
of
functions
of
these
of
arcs.
s
parameters.
These
If
analogous
we
list
inequalities
gether, they w i l l From a bing
the
define
formal
typically
be
considerably simpler.
the
small.
J-E,
All
For
number large.
graphs
example,
that
a
single
family
families,
information
total
is very
schedules
to
all
is rather
concrete
inequalities
strict
the
v e c t o r ip d o e s n o t
the
system of
For
for
target
closure
is
h a v e a=b,
the
viewpoint
information
correspond
inequalities
the
then,
The
inequalities number o f
system of
i n the
to-
closure. of
case of
d e p e n d on
taken
descri-
parameters
inequalities regular
parameters.
may
graphs
In that
we
case
simple.
are
linear
on
V . and
whose
direction
i
v e c t o r s are uniform
independent of W
functions
easy t o v e r i f y be
of
that
these
and
constant
parameters
only for
them t h e
satisfy
first
terms are (34.7)
linear
as
inequalities
well.
in
nonIt
Is
( 3 4 . 7 ) must
strict. Building
the
information
common p r o p e r t i e s
for
portant for a
l o t of problems,
ical
that
software
tems o f Let
various a
can
closure
ti c l a s s
be
of
like
and
finding
algorithms
in particular,
transported
onto
schedules
possessing
is exceptionally
f o r development of
parallel
im-
numer-
computational
sys-
architectures.
program
d e s c r i p t ion
include
some
parameters
defining
the
313 computational
process
scheme a n d t h e amount
Suppose
that
the algorithm
ficient
inner
resources
rithm
onto
program
a particular
i n t h e form
perties.
This
new i n d e x e s ,
computer
t h e program
I n that
essentially
that e x p l i c i t l y
reveals
Involved.
possesses
suf-
case mapping t h e a l g o -
amounts
t o rewriting the
therequired algorithm
e t c . The p r e s e n c e o f p a r a m e t e r s
orders,
new l o o p
i n t h e program
pro-
indexes,
makes
this
difficult.
use of information closure and schedules opens up the prospect of automating the process of program transformation. During
formation
many
Therefore
program
the opportunity
guages, a n d w r i t e from
by
I n v o l v e s c h o o s i n g new e x e c u t i o n
task e s p e c i a l l y The
described
of p a r a l l e l i s m .
o f computation
t h every
arises
programs so t h a t
beginning.
new c o m p u t e r s y s t e m s come a t l a s t
and a l g o r i t h m
This
shall
bottlenecks
t o design
algorithms,
this
be
trans-
uncovered.
create
lan-
they possess a l l r e q u i r e d p r o p e r t i e s
i n s p i r e s hopes t h a t
c e a s e t o be a d u l l
a creative challenge
may
sometime p o r t i n g
and t e d i o u s
onto
t a s k and be-
f o r mathematicians.
35. Examples
Consider several examples of algorithm graph building using its notation.
EXAMPLE 35.1. Consider the following notation of the algorithm of solving a block bidiagonal system of linear algebraic equations:
      DO 1 j = 1, m
    1 $u_{j0} = 0$
      DO 4 k = 1, n
    2 $u_{0k} = 0$
      DO 4 i = 1, m
V 3
e
i - a
" i k =
Vuk^i.k-A.k-i " ik
b
4 CONTINUE .
We are already familiar with this notation from Example 6.3. The only
variable that is evaluated here is indexed variable $u_{ik}$. The evaluation is carried out by three statements, two of which merely initialize the variable setting it to zero. There are but two input variables $u_{i-1,k}$ and $u_{i,k-1}$. Consequently, the algorithm graph splits into two primitive graphs. We append the symbol ' to coordinates of output variables.
Elementary graph 35.1.1. It is specified by the following vector indexes and inequalities defining the domains:
$$p(j') = \begin{pmatrix} j' \\ 0 \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i-1 \\ k \end{pmatrix},$$
$$1 \le j' \le m, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(j') = q(k,i)$. This produces the system of equations
$$j' = i-1, \qquad 0 = k.$$
The inequalities imply that $k = 0$ never holds. Because of that, the elementary graph 35.1.1 does not exist.
Elementary graph 35.1.2. It is specified by the following vector indexes and inequalities defining the domains:
$$p(k') = \begin{pmatrix} 0 \\ k' \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i-1 \\ k \end{pmatrix},$$
$$1 \le k' \le n, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(k') = q(k,i)$. This produces the system of equations
$$0 = i-1, \qquad k' = k,$$
whence $k' = k$, $i = 1$. The solution to this system with respect to $k'$ is unique. Besides that, any execution of assignment statement 2 with parameter $k'$ is always lexicographically junior to the execution of assignment statement 3 with parameters $k$, $i$. Therefore elementary graph 35.1.2 is defined by the function
$$k' = k, \tag{35.1}$$
where
$$1 \le k' \le n, \qquad 1 \le k \le n, \qquad i = 1. \tag{35.2}$$
Elementary graph 35.1.3. It is specified by the following vector indexes and inequalities defining the domains:
$$p(k',i') = \begin{pmatrix} i' \\ k' \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i-1 \\ k \end{pmatrix},$$
$$1 \le k' \le n, \qquad 1 \le i' \le m, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(k',i') = q(k,i)$. This produces the system of equations
$$i' = i-1, \qquad k' = k.$$
The solution to the system is again unique. The execution of assignment statement 3 with parameters $k'$, $i'$ is always lexicographically junior to the execution of the same assignment statement with parameters $k$, $i$. Therefore the elementary graph 35.1.3 is defined by the vector function
$$k' = k, \qquad i' = i-1, \tag{35.3}$$
where
$$1 \le k' \le n, \qquad 1 \le i' \le m-1, \qquad 1 \le k \le n, \qquad 2 \le i \le m. \tag{35.4}$$
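The construction of elementary graphs 35.1.1 to 35.1.3 amounts to solving the index equation $p(\cdot) = q(k,i)$ over the stated domains. A minimal Python sketch can enumerate the integer solutions directly. It assumes the index vectors $p(j') = (j', 0)$, $p(k') = (0, k')$, $p(k',i') = (i', k')$ and $q(k,i) = (i-1, k)$ as read from the example; the helper name `elementary_arcs` and the small sizes `m`, `n` are chosen only for illustration. The sketch checks the index equation alone and does not test lexicographic seniority.

```python
from itertools import product

def elementary_arcs(p, p_domain, q, q_domain):
    """Map each reader point to the list of writer points whose output index
    p(writer) coincides with the input index q(reader)."""
    producers = {}
    for w in p_domain:
        producers.setdefault(p(*w), []).append(w)
    return {r: producers[q(*r)] for r in q_domain if q(*r) in producers}

m, n = 3, 4                                            # illustrative sizes, chosen arbitrarily
J = [(j,) for j in range(1, m + 1)]                    # parameters j' of statement 1
K = [(k,) for k in range(1, n + 1)]                    # parameters k' of statement 2
KI = list(product(range(1, n + 1), range(1, m + 1)))   # parameters (k, i) of statement 3

def q(k, i):
    # Index of the input u_{i-1,k} read by statement 3.
    return (i - 1, k)

g1 = elementary_arcs(lambda j: (j, 0), J, q, KI)       # elementary graph 35.1.1
g2 = elementary_arcs(lambda k: (0, k), K, q, KI)       # elementary graph 35.1.2
g3 = elementary_arcs(lambda k, i: (i, k), KI, q, KI)   # elementary graph 35.1.3
```

Here `g1` comes out empty, `g2` has arcs only into points with `i == 1`, and `g3` only into points with `i >= 2`, in agreement with the domains (35.2) and (35.4) having no common points.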
Elementary graph 35.1.4. It is specified by the following vector indexes and inequalities defining the domains:
$$p(j') = \begin{pmatrix} j' \\ 0 \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i \\ k-1 \end{pmatrix},$$
$$1 \le j' \le m, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(j') = q(k,i)$. This produces the system of equations
$$j' = i, \qquad 0 = k-1.$$
We find that $j' = i$, $k = 1$. Again, the solution to the system is unique and the execution of assignment statement 1 with parameter $j'$ is always lexicographically junior to the execution of assignment statement 3 with parameters $k$, $i$. Therefore the elementary graph 35.1.4 is defined by the function
$$j' = i, \tag{35.5}$$
where
$$1 \le j' \le m, \qquad k = 1, \qquad 1 \le i \le m. \tag{35.6}$$
Elementary graph 35.1.5. It is specified by the following vector indexes and inequalities defining the domains:
$$p(k') = \begin{pmatrix} 0 \\ k' \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i \\ k-1 \end{pmatrix},$$
$$1 \le k' \le n, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(k') = q(k,i)$. This produces the system of equations
$$0 = i, \qquad k' = k-1.$$
The inequalities imply that $i = 0$ never holds. Because of that, the elementary graph 35.1.5 does not exist.
Elementary graph 35.1.6. It is specified by the following vector indexes and inequalities defining the domains:
$$p(k',i') = \begin{pmatrix} i' \\ k' \end{pmatrix}, \qquad q(k,i) = \begin{pmatrix} i \\ k-1 \end{pmatrix},$$
$$1 \le k' \le n, \qquad 1 \le i' \le m, \qquad 1 \le k \le n, \qquad 1 \le i \le m.$$
We must satisfy the equation $p(k',i') = q(k,i)$. This produces the system of equations
$$i' = i, \qquad k' = k-1.$$
The solution to the system is unique and the execution of assignment statement 3 with parameters $k'$, $i'$ is always lexicographically junior to the execution of the same assignment statement with parameters $k$, $i$. Therefore the elementary graph 35.1.6 is defined by the vector function
$$k' = k-1, \qquad i' = i, \tag{35.7}$$
where
$$1 \le k' \le n-1, \qquad 1 \le i' \le m, \qquad 2 \le k \le n, \qquad 1 \le i \le m. \tag{35.8}$$
Primitive graph 35.1.1. It corresponds to input variable $u_{i-1,k}$ and is the lexicographical maximum of functions (35.1) and (35.3). However, the domains (35.2) and (35.4) of these functions have empty intersection. Consequently, the primitive graph 35.1.1 is the union of elementary graphs 35.1.2 and 35.1.3.
Primitive graph 35.1.2. It corresponds to input variable $u_{i,k-1}$ and is the lexicographical maximum of functions (35.5) and (35.7). Again, the domains (35.6) and (35.8) of these functions have empty intersection. Consequently, the primitive graph 35.1.2 is the union of elementary graphs 35.1.4 and 35.1.6.
The algorithm graph is shown in Fig. 35.1 for $m = 5$, $n = 8$. The algorithm has three bearing nested loops. The nodes of these nested loops are enclosed in dashed line frames. They are denoted by $V_1$, $V_2$ and $V_3$ in accordance with statement labels. The nodes of $V_1$ and $V_2$ correspond to statements that assign zero values to the variables. The subgraph with nodes from $V_3$ is identical with the graph in Figs. 6.5, 6.6, as it should be.
<+2
rrrrrrn rrrrrrn l
x l
r l
r l
Fig.
T h i s example does n o t r e v e a l f o r us. I t merely demonstrates there
i s one t h i n g
t h a t ought
algebraic equations that ing
elementary
graphs
» l
35.1
a n y new
facts
and s t o r e s no
surprises
the techniques o f graph b u i l d i n g .
Still,
t o be n o t i c e d h e r e . A l l s y s t e m s o f
linear
have t o be s o l v e d I n t h i s either
» l
have
no
solution
example w h i l e f o r given
find-
parameter
319 bounds
o r have e x a c t l y
one s o l u t i o n .
variables
are not re-evaluated.
algorithm
notations.
In this
This
This
case
i s always
t h e case
when t h e
i s c h a r a c t e r i s t i c o f mathematical
algorithm
graph
building
i s greatly
facilitated. EXAMPLE 3 5 . 2 . Now c o n s i d e r
the following FORTRAN-like notation:

      DO 1 i = 1,n
      DO 1 j = 1,n
    1 $u_{i+j} = u_{2n+1-i-j}$

Quite probably, this fragment never occurs in any existing algebra programs. However, this fact is not of momentous importance. The notation is perfectly valid from the FORTRAN language viewpoint. The corresponding program merely shuffles some data. Let us attempt to build the graph formally.

Assume for definiteness that $n \ge 3$. This assumption lets us simplify the notation to come and exclude the special cases $n = 1, 2$. There is one input variable and one output variable in the program. Therefore there is only one elementary graph, and the whole graph coincides with it in this case. Equalizing the indexes of variables we conclude that we shall have to solve the following equation while building the graph:

$$i' + j' = 2n + 1 - i - j, \tag{35.9}$$

where

$$1 \le i' \le n, \quad 1 \le j' \le n, \quad 1 \le i \le n, \quad 1 \le j \le n. \tag{35.10}$$

We have to find among the solutions to (35.9) under the constraints (35.10) the point with coordinates $i'$, $j'$ such that it is lexicographically the closest from below to the point with coordinates $i$, $j$.

The first alternative polyhedron determining the conditions of lexicographic precedence is described as follows:

$$i' = i, \quad j' \le j - 1. \tag{35.11}$$
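Before the problem is solved symbolically, the object being sought can be made concrete by simply tracing the fragment. The sketch below is an illustration only, not the method of this section: the function name `trace_graph` and the dictionary-based bookkeeping are assumptions introduced here. It executes the two loops in order, remembers which iteration wrote each cell of $u$ last, and records for every iteration $(i, j)$ the earlier iteration $(i', j')$ whose result it reads.

```python
def trace_graph(n):
    """Trace u[i+j] = u[2n+1-i-j] and return, for each iteration (i, j),
    the iteration that produced the value it reads (None = input data)."""
    last_writer = {}   # cell index -> iteration (i', j') that wrote it last
    graph = {}         # iteration (i, j) -> its predecessor or None
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            # The right-hand side is read before the left-hand side is written.
            graph[(i, j)] = last_writer.get(2 * n + 1 - i - j)
            last_writer[i + j] = (i, j)
    return graph


if __name__ == "__main__":
    for point, source in sorted(trace_graph(3).items()):
        print(point, "<-", source)
```

Such a trace gives the graph only for one fixed numerical value of $n$; the derivation that follows aims at formulas valid for every $n$.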
The equality (35.9) and the equality in (35.11) form the system of linear algebraic equations

$$i' + j' = 2n + 1 - i - j, \quad i' = i.$$

Solving it, we find that

$$i' = i, \quad j' = -2i - j + 2n + 1. \tag{35.12}$$

Substituting these formulas for $i'$, $j'$ into the inequality in (35.11) and into the inequalities (35.10) we conclude that the following inequalities must hold:

$$1 \le i \le n, \quad 1 \le j \le n, \quad 1 \le 2n + 1 - 2i - j \le n, \quad n + 1 \le i + j. \tag{35.13}$$

For the values of $i$, $j$ that satisfy that system of inequalities the graph will be specified by the vector function (35.12). Rewrite the system (35.13) as follows:

$$j \le n, \quad 2i + j \le 2n, \quad n + 1 \le i + j. \tag{35.14}$$

Now it is obvious that it is compatible. It is also clear that the set of solutions of the system (35.13) (or of the system (35.14), which is the same thing) does not cover the entire integer quadrate $1 \le i \le n$, $1 \le j \le n$.

Consider the second alternative polyhedron determining the conditions of lexicographic precedence. It is described as follows:

$$i' \le i - 1.$$
From the equation (35.9) we find that

$$j' = 2n + 1 - i - j - i'. \tag{35.15}$$

Substitute this formula for $j'$ into (35.10). Gather all inequalities for $i'$ and write them in the form (28.1). We have

$$i' \le n, \quad -i' \le -1, \quad -i' \le i + j - n - 1, \quad i' \le -i - j + 2n, \quad i' \le i - 1, \tag{35.16}$$

where

$$1 \le i \le n, \quad 1 \le j \le n. \tag{35.17}$$

We have to choose such subsystems from the system of inequalities (35.16) whose matrices (with respect to parameters we wish to find) possess the L-property. Since there is only one unknown parameter, the matrices will be $1 \times 1$. Such matrices possess the L-property if and only if their only entry is positive. It follows that to determine $i'$ from (35.16) we have to consider only the first, fourth, and fifth inequalities.

The first inequality $i' \le n$ yields $i' = n$. If this is true, the last inequality in (35.16) never holds, if we take into account (35.17).

The fourth inequality $i' \le -i - j + 2n$ yields $i' = -i - j + 2n$. From (35.15) we find that $j' = 1$. Under these conditions the inequalities (35.10), (35.16) become

$$1 \le i \le n, \quad 1 \le j \le n, \quad n \le i + j \le 2n - 1, \quad 2n + 1 \le 2i + j. \tag{35.18}$$
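The passage from (35.16) to (35.18) can be written out term by term. The display below is only a worked restatement of the substitution $i' = -i - j + 2n$, $j' = 1$; it uses nothing beyond (35.10), (35.15) and (35.16).

```latex
\begin{aligned}
j' &= 2n + 1 - i - j - i' = 2n + 1 - i - j - (2n - i - j) = 1, \\
1 \le i' \le n &\iff 1 \le 2n - i - j \le n \iff n \le i + j \le 2n - 1, \\
i' \le i - 1 &\iff 2n - i - j \le i - 1 \iff 2n + 1 \le 2i + j, \\
-i' \le i + j - n - 1 &\iff -(2n - i - j) \le i + j - n - 1 \iff n \ge 1 .
\end{aligned}
```

The last line holds for every admissible $n$, and $1 \le j' \le n$ is satisfied trivially by $j' = 1$, so only the conditions listed in (35.18) remain.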
The set of solutions of the system (35.18) does not intersect the set of solutions of the system (35.14). It is also described by the inequalities

$$i \le n, \quad j \le n, \quad i + j \le 2n - 1, \quad 2n + 1 \le 2i + j. \tag{35.19}$$

On that set the graph will be described by the vector function

$$i' = -i - j + 2n, \quad j' = 1. \tag{35.20}$$

The fifth inequality $i' \le i - 1$ yields $i' = i - 1$. From (35.15) we find that $j' = -2i - j + 2n + 2$. Under these conditions the inequalities (35.10), (35.16) become

$$2 \le i \le n, \quad 1 \le j \le n, \quad n + 2 \le 2i + j \le 2n + 1. \tag{35.21}$$

The set of those solutions of (35.21) that do not satisfy (35.14) is described by the inequalities

$$1 \le j, \quad i + j \le n, \quad n + 2 \le 2i + j. \tag{35.22}$$

Besides that, it includes the points of the straight line

$$2i + j = 2n + 1. \tag{35.23}$$

On that set the graph will be described by the vector function

$$i' = i - 1, \quad j' = -2i - j + 2n + 2. \tag{35.24}$$
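The three pieces (35.12), (35.20) and (35.24), taken on the sets (35.14), (35.19) and (35.22), can be compared with a direct search for the lexicographically closest solution of (35.9) under (35.10). The sketch below is a check under stated assumptions, not part of the derivation: the function names are hypothetical, and points belonging to none of the three sets are assumed to have no predecessor at all, which the check confirms for the values of $n$ it is run on.

```python
def predecessor_bruteforce(n, i, j):
    """Lexicographically largest (i', j') < (i, j) with i' + j' = 2n+1-i-j
    and 1 <= i', j' <= n, or None if there is no such point."""
    target = 2 * n + 1 - i - j
    candidates = [(ip, target - ip) for ip in range(1, n + 1)
                  if 1 <= target - ip <= n and (ip, target - ip) < (i, j)]
    return max(candidates, default=None)


def predecessor_formula(n, i, j):
    """Piecewise formulas of Example 35.2; None means input data is read."""
    if j <= n and 2 * i + j <= 2 * n and n + 1 <= i + j:        # set (35.14)
        return (i, 2 * n + 1 - 2 * i - j)                       # (35.12)
    if (i <= n and j <= n and i + j <= 2 * n - 1
            and 2 * n + 1 <= 2 * i + j):                        # set (35.19)
        return (2 * n - i - j, 1)                               # (35.20)
    if 1 <= j and i + j <= n and n + 2 <= 2 * i + j:            # set (35.22)
        return (i - 1, 2 * n + 2 - 2 * i - j)                   # (35.24)
    return None


def check(n):
    """True if formulas and brute force agree on the whole n-by-n square."""
    return all(predecessor_formula(n, i, j) == predecessor_bruteforce(n, i, j)
               for i in range(1, n + 1) for j in range(1, n + 1))


if __name__ == "__main__":
    print(all(check(n) for n in range(3, 12)))
```

Tuple comparison in Python is lexicographic, which is why `(ip, jp) < (i, j)` expresses the precedence condition directly.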
However, the functions (35.20) and (35.24) coincide along the straight line (35.23). Therefore the graph specification (35.24) is valid only on the set (35.22).

The set of solutions of systems (35.14), (35.19), (35.22) does not cover the entire integer quadrate $1 \le i \le n$, $1 \le j \le n$. The remaining domain is described by the inequalities

$$1 \le i, \quad 1 \le j, \quad 2i + j \le n + 1.$$
The point with coordinates $i = j = n$ also remains uncovered. At all these points one of the input data items is taken as an operation argument.

This example shows that even very simple algorithm notations can produce graphs with complicated structure of arcs. Arcs may even be described by discontinuous functions, which is easy to see analyzing the formulas (35.12), (35.20), and (35.24). Consequently, simplicity of notation does not guarantee simplicity of informational connections. Maybe this is the origin of numerous difficulties that arise while studying algorithm and program behavior.

EXAMPLE 35.3. Consider linear information closures of schedules for algorithms described by programs (6.5), (6.6). Their graphs are known; they are shown in Figs. 6.2, 6.3. To simplify the discussion we will not investigate the orders of data input and results output. In that case the nodes of both graphs are in the same domain $V$. In the $i$, $j$ coordinates the domain is a triangle with vertices at $(2,1)$, $(n,1)$, and $(n,n-1)$. We will seek generalized schedules of the form

$$f(x) = (a, x) + \gamma, \quad \text{where } a = (a_1, a_2), \quad x = (i, j),$$

on the domain $V$.
324 The unknown q u a n t i t i e s the
representation
a^, tQ.
ajP
(34.5),
r
3
a r e t o be f o u n d ,
1
(34.6) has t h e f o l l o w i n g
x
Obviously,
f o r m f o r n:
n - 2*13, 8 £ 0 .
Note t h a t lations
the constant
(34.1).
(35.25)
terras y and 8 f o r m a d i f f e r e n c e
I n t h e examples
that
we
consider
i n the r e -
t h e nodes
o f the
g r a p h s a r e i n t h e same d o m a i n V^. M o r e o v e r , t h e s c h e d u l e s h a v e o n e and the
same r e p r e s e n t a t i o n
a=b a n d y=8 i n ( 3 4 . 1 ) . must n o t r e s t r i c t
over
t h e whole domain. T h e r e f o r e
I t follows
that
the choice o f t h e constant
The g r a p h i n F i g .
1
(n,2),
I s defined
(n,n-l).
It
i snatural
that
(34.3),
-1
at
(3,2),
(34.4),
s 0.
(35.26)
and (34.7) w i l l
have
t h e same
form.
1
1
0 S
0
corresponding
linear
the Inequalities
Substituting
the vertices
the i n e q u a l i t i e s
-1
function
1
o f IT
(34 4)
9•
1
(34.3) w i l l
-ay
tain
vertices
" l o n g ' a r c s we h a v e
0
Then
with
V
(34.1) have t h e f o r m
J =
The
"short"
•
1
a
For
f
on t h e t r i a n g l e
The r e l a t i o n s
Horizontal
that
0 ,
0
have
closures
term y.
0
7 =
function
such
we w i l l
information
6.2 h a s t w o f a m i l i e s o f a r c s .
a r c s a r e g i v e n by a l i n e a r f u n c t i o n
This
the target
i s defined
o n V^.
Let y
=
(yj.y^'-
have t h e form
+ a y
1
s a ,
f o r y* a n d g a t h e r i n g
like
t e r m s we ob-
325
"V*2
°'
£
( 2 a j - a 2 ) • (-« ) n £ 0.
Now,
taking
i n t o account
They a r e a s
( 3 5 . 2 5 ) , we d e t e r m i n e t h e i n e q u a l i t i e s
-a
-a
£ 0,
-a
£ 0,
1 2 Joining
1
schedules.
on
i s easy t o v e r i f y graphs
directly
i n F i g 6.2
n. R e c a l l
graphs
that
we
i n Example
such
that
But
a
that
already
11.3.
studying structures
we
arcs
"long"
arcs
to replace
a r c s we
at
( 3 4 . 4 ) , and
" long"
arcs,
reason
we
now.
have
0
0
. v • 1
1
function
(n,l),
i s defined
(n.rt-2).
The
on
the t r i a n g l e
relations
(34.1),
with (34.3),
(34.7) a r e i d e n t i c a l and have the f o r m
a
For
tools f o r
horiz-
linear
(3,1),
effective
6,3 h a s t w o f a m i l i e s o f a r c s a s w e l l . F o r
0
vertices
f o r these
losing
1
corresponding
do n o t depend
schedules
then. For t h a t
of "short"
schedules
thereby
J =
The
linear
d i d n o t y e t have
chains
linear
direction vectors
to find
h i g h - s p e e d s c h e d u l e . T h e r e a r e no l o s s e s
"short"
the r e -
& 0.
the
The g r a p h i n F i g .
by
( 3 5 . 2 6 ) we o b t a i n
t h e r e a r e no o t h e r
their
tried
o f graphs w i t h
"long"
2
had
ontal
a 0
2
In particular
0,
for
-a
*
these i n e q u a l i t i e s and t h e i n e q u a l i t y
lations f o r the target
It
(34.7).
follows:
a r c s we
have
£ 0.
2
(35.27)
326
J
The
corresponding
vertices the
at
linear
(3,2),
0
1
0
0
0
1
•
function
(n,2),
is defined
(n,n-1).
The
obtain
the
the
vertices
inequalities
of
for
inequalities
and
1
y
l
-a
£
2
(34.3)
will
with have
(-a
+2a
1 taking
i n t o account
) + (-a
SO,
£ 0,
target
schedules.
constant
term
that
•) and
generalized.
It
is
possible f o r graphs i n Fig. nodes o f
such graphs
ponding a l g o r i t h m s are
to
6.3.
are
serial
(34.7):
0.
( 3 5 . 2 7 ) we
obtain
the
re-
particular a
=
0.
2
of
easy
inequalities
2
In
a l l schedules scaling
the
£
inequality
1
follows
-a
1 the
a a 0,
It
we
0.
determine
-a
i n e q u a l i t i e s and
l a t i o n s f o r the
terms
2
1 2 these
like
0,
)n £
2
( 3 5 . 2 5 ) , we
-a -a
Joining
gathering
0,
(2ai-a2)+(-a])n £
all
t r i a n g l e 7ja
134.4]
-a
are
the
form
Substituting
Now,
on
the
identical
vector
verify This
linked and
are
a.
a
h a v e no
that
c o u r s e due
single inner
an
accuracy
Moreover,
directly
i s of by
to
path,
all
the
and
parallelism.
a
schedules
nothing to
of
else fact
the
is that
corres-
Afterword

So you are at the end of this book. Whether you read everything or just browsed through the text, you probably have questions. The main question is likely to be, 'What is there new?' To answer this question, let's briefly review the existing methods of discovering parallelism in algorithms and programs and sum up some of the end results of our own research.

Consider a rectangular coordinate system in a plane. Establish a one-to-one correspondence between the executable statements of a program and a set of straight lines parallel to the y axis. Suppose a program is being executed. Every time a statement is executed, we put a dot somewhere on the corresponding straight line in accordance with a certain rule. Provided that all dots are distinct, there is a one-to-one correspondence between our dots and all the operations performed during program execution.

If an argument of an operation is the result of another operation, we draw an arc connecting the corresponding dots. The arc points to the dot where the result is used. The resulting graph is isomorphic to the algorithm graph, which is the central object of this book.

There is another graph that can be defined using the same set of nodes. We connect two dots with an arc if and only if one and the same variable (or array element) is updated twice in a row in these two dots. The arc points to the dot where the variable is updated last. This graph can be called memory dependence graph. Notice that mathematical formulas and programs written in single-assignment languages yield empty memory dependence graphs. Indeed, no memory cell is updated twice in these cases.

Now form the union of both graphs and project it onto the x axis. The resulting graph (to an isomorphism) is referred to as data dependence graph [72]. Each of its nodes corresponds to an executable statement of the program. Two nodes are connected with an arc if and only if the corresponding statements are data or memory dependent. The data dependence graph is really a rather simple object. The number of nodes is not large and does not depend on the peculiarities of the program under examination.
Every arc
328 just
registers
nothing
the
existence
more. There
of
a
dependence
between
tests
heip detect
i s a number o f
that
statements, such
and
depen-
dences. Methods o f d a t a dependence g r a p h g r a m s as
their
computer s c i e n t i s t s . allel
computers
q u i t e a few For
in
various
methods have been d e v e l o p e d ,
graph
program
the
contain
[96],
are
symbolic
scripts;
weak o n c e we
has
of
traditional
graph we
unknown
so
these
situations
in
algorithms
The
and
b u l k o f our
robust
methods
Any
principal
nontrivial
may
these
there
regarded
present,
little
algorithm on
and the
are
variables
graph.
coupled
typical
at
sub-
1, e t c .
least
for
approach to the inves-
this
We
discard
o b j e c t has
little
i s t a r g e t e d at the
We
and
these
exploring
can
be
graphs,
the
valu-
development
c e r t a i n l y had
for
the
algo-
i n mind
the
that
corresponding
easily derived.
that
i s common f o r a l l m e n t i o n e d
t h e amount o f c o m p u t a t i o n . grid
steps,
are not
Moreover,
there
functions
of
certain the e f f e c t
structure.
structure
of
The
data
some
S u c h param-
the d e s i r e d accuracy,
known a t
analysis time.
t h e number o f n o d e s i n a l g o r i t h m
i s known a b o u t program
are
situa-
expressions
t h a n 0 and
programs.
are
i n f l u e n c e d by p a r a m e t e r v a l u e s . as
following index
data
p r o g r a m o r s u b r o u t i n e i n e v i t a b l y d e p e n d s on
they c e r t a i n l y determine
arcs are
and
because effort
difficulty
include array sizes,
memory d e p e n d e n c e
effect
the
the
ignored.
memory d e p e n d e n c e g r a p h . efficient
is a
values of
be
of
Though
efficient.
exploring
loops;
methods f o r b u i l d i n g
formal parameters that determine
graph
of
values;
t h e y c a n n o t be
d a t a dependence graph
have
There
eters
i n one
and
from
f o r par-
tools.
they are not always
structure
methods f o r d a t a dependence g r a p h
graphs.
are
i n compilers
introduces a completely d i f f e r e n t
mathematically
rithm
that
parallelism
information.
once
with
p r o g r a m s and
T h i s book
able
complex
terms
I t i s obvious
tigation
sequential pro-
restructuring
loop indexes appear w i t h c o e f f i c i e n t s o t h e r
numerical
to
program
example, t h e w e l l - k n o w n methods o f b u i l d i n g
tions:
The
t h a t use
These methods a r e used b o t h
and
dependence
of
building
s t a r t i n g p o i n t have r e c e i v e d c o n s i d e r a b l e a t t e n t i o n
numerous
symbolic
graph
examples
Therefore
etc.
However, and
a l l graphs
variables.
Up
to
of formal parameter values
paper
[96]
dependence
testifies
graph
is
that also
in
where are the on
their little
329 studied. we
To
cope w i t h t h i s d i f f i c u l t y
i n b u i l d i n g and
have t o d e v e l o p s p e c i a l i z e d methods o f s y m b o l i c People
who
information sons t o
treat
as w e l l .
work
to
one
using
of
the
dependence
graphs
distinguish
memory d e p e n d e n c e . T h e r e
a l g o r i t h m g r a p h s and
graphs,
are
a
between
l o t of
memory d e p e n d e n c e
both kinds details. two
of graphs are
Due
graphs.
to
this,
In
either
we
this
can
book
arising
identical or
limit
we
rea-
graphs
the mathematical f o r m u l a t i o n s o f problems
i n unsubstantial just
data
and
separately
Fortunately,
I n b u i l d i n g and fer
with
dependence
exploring
computations.
our
study
dif-
discussion
the
algorithm
graph. This duces
book p r e s e n t s a method o f
the
graph
in explicit
mathematical formulas. ficiently
l a r g e program
considered methods. its
The
by
Our
method
efficiency
is
symbolic
data
I n any
the
case,
dependence
is efficlent not
form;
graph
method i s a p p l i c a b l e
class.
well-known
algorithm graph b u i l d i n g
impaired
for by
pro-
i s described
by
t o programs from a
suf-
i t c o v e r s a l m o s t a l l cases
graph
building
a l l programs any
that
of
the
and
within
factors
exploring
that we
class;
mentioned
earlier. The
explicit
symbolic
representation
to point out
i t s characteristic
define
class
a new
described
of
i n symbolic
graphs.
graphs corresponding
are
able
not
to
plore a certain the unknown The
chical with
the
arcs.
We
symbolic
show i n t h i s
macro- and
function
for a
will
wider
still
than
the wider
class
be
we
us can
explicitly
the class
of admissible
particular
representation of book t h a t such
microlevel, efficient
the
set
defined
time.
complexity
of directed cuts
on
Such f u n c t i o n s d e f i n e in
to the class
graph
memory, c o m m u n i c a t i o n
studying
ferently
be
these,
of a l -
p r o g r a m s . I f we
program, instead
we
can
ex-
of
exploring
algorithm graph
is highly
graph.
show t h a t e a c h d i r e c t e d c u t tain
algorithm graph allows
class
i t will
graph chosen from
explicit
promising. t i o n on
build
S u c h new
form, yet
gorithm
of
f e a t u r e s . F i x i n g some o f
nodes
algorithm
These f u n c t i o n s
are
as
p a r a l l e l i s m detec-
of d i s t r i b u t e d
estimates, of
i s determined graph
issues
use
etc.
algorithm by
and
a
level
and
hierar-
a l l have
graph.
We
surface
nondecreasing
of
to
do
further a
along
cergraph
I m p l e m e n t a t i o n s t h a t behave
dif-
c a l l e d schedules
and their properties are thoroughly studied in this book. For practical reasons we are most interested in piecewise linear schedules that are additive with respect to symbolic parameters. We show that the coefficients of such schedules satisfy a system of linear inequalities with a constant matrix, and we present an efficient method of building this matrix, using the symbolic representation of algorithm graph. Thus the problem of mapping an algorithm onto a computer architecture is reduced to choosing an appropriate solution of a system of linear inequalities. The system itself is not large.

In this book we also demonstrate that algorithm graph is related to certain tasks that have no direct connection to it. For illustration, we consider fast computation of derivatives, recovery of linear functional, and roundoff error propagation. It seems that the list of such tasks can be readily expanded.

The desire to use algorithm graph and memory dependence graph for a variety of purposes makes it necessary to have both precise mathematical formulations for emerging problems and efficient methods of solution of these problems. Neither can be found in existing literature in adequate amount, though it abounds in descriptions of separate facts and approaches, and the objects under study were poorly defined. Therefore we essentially had to form anew the theoretical foundations of research in parallel computations. Nevertheless, this job was done. In our opinion, this is the main feature of this book that distinguishes it from other publications.

The impact of results published in this book is not limited to parallelism detection in algorithms and programs. Our results will also be useful in a number of other fields, as e.g. ensuring portability of numerical software, design of new numerical methods, development of problem description languages and programming languages, etc.

No theory should satisfy the skeptic and the pragmatic if its efficiency cannot be practically demonstrated. Research results presented in this book were materialized in a software tool called V-RAY SYSTEM that explores and restructures FORTRAN programs. It performs macro- and microanalysis, builds algorithm graph (and some other graphs) for program fragments, chooses appropriate solutions of the system of inequalities of information closure, singles out vectors and partitions algorithms, discovers program bottlenecks, gives valuable advice and a lot more. V-RAY SYSTEM is written in C and runs under MS-DOS on IBM PC/AT and compatibles. The performance is satisfying; the full analysis (including algorithm graph and information closure building) of a program consisting of 200-300 FORTRAN statements often takes just a few seconds (very rarely, several minutes). Certainly, V-RAY SYSTEM is being improved continuously. Its chief designer and programmer is my son Vladimir V. Voevodin. His work inspires confidence that these mathematical ideas will be transformed into a valuable and practical software tool.

The study of problems arising in parallel computations does not end at the last page of this book. Rather, we have the opposite case: only now can we fully assess their difficulty and variety. Apparently their solution will take a lot of time and effort. Well, as the Romans put it, amat victoria curam.
References

1. L. Adams and T.W. Crockett, Modeling algorithm execution time on processor arrays, Computer 17 (1984), No. 7, p. 38-43.
2. L. Adams, Reordering computations for parallel execution, Commun. Appl. Numer. Methods 2 (1986), No. 3, p. 263-272.
3. R. Allen and K. Kennedy, Automatic translation of FORTRAN programs to vector form, ACM Trans. Program. Lang. Syst. 9 (1987), No. 4, p. 491-542.
4. M. Annaratone et al., Warp architecture: from prototype to production, AFIPS Conf. Proc. 56 (AFIPS Press, Reston, VA, USA, 1987) p. 133-140.
5. J.-L. Baer, A survey of some theoretical aspects of multiprocessing, Comput. Surv. 5 (1973), No. 1, p. 31-80.
6. J.-L. Baer, Computer systems architecture (Pitman, London, 1980).
7. U. Banerjee, S.-C. Chen, D.J. Kuck, and R.A. Towle, Time and parallel processor bounds for FORTRAN-like loops, IEEE Trans. Comput. C-28 (1979), No. 9, p. 660-670.
8. U. Banerjee and D. Gajski, Fast execution of loops with IF statements, IEEE Trans. Comput. C-33 (1984), No. 11, p. 1030-1033.
9. P. Barahona and J.R. Gurd, Simulated performance of the Manchester multi-ring dataflow machine, Parallel Computing 85 (North-Holland, Amsterdam, 1986) p. 419-424.
10. R.H. Barlow, Performance measures for parallel algorithms, in Parallel processing systems (Cambridge Univ. Press, 1982) p. 179-192.
11. R.H. Barlow, D.J. Evans, and J. Shanehchi, An asynchronous parallel version of the power method, Int. J. Comput. Math. 11 (1982), No. 2, p. 143-154.
12. W. Baur and V. Strassen, The complexity of partial derivatives, Theor. Comput. Sci. 22 (1983), p. 317-330.
13. F. Berman and L. Snyder, On mapping parallel algorithms into parallel architectures, J. Parallel Distrib. Comput. 4 (1987), No. 5, p. 439-458.
14. A. Bossavit, The "Vector machine": An approach to simple programming on CRAY-1 and other computers, in PDE software: modules, interfaces and systems (North-Holland, Amsterdam, 1984) p. 103-121.
15. R.P. Brent, The parallel evaluation of arithmetic expressions in logarithmic time, in Complexity of sequential and parallel numeric algorithms (Acad. Press, New York, 1973) p. 83-102.
16. R.P. Brent, H.T. Kung, and F.T. Luk, Some linear-time algorithms for systolic arrays, Proc. IFIP Congr. 83, Paris, 1983 (North-Holland, Amsterdam, 1983) p. 865-876.
17. R.P. Brent and F.T. Luk, A systolic array for the linear-time solution of Toeplitz systems of equations, J. VLSI and Comput. Syst. 1 (1983), No. 1, p. 1-22.
18. B. Buzbee, A strategy for vectorization, Parallel Comput. 3 (1986), No. 3, p. 187-192.
19. D.A. Calahan, Algorithmic and architectural issues related to vector processing, Proc. Int. Symp. Large Eng. Syst., Winnipeg, 1976 (Pergamon Press, Oxford, 1977) p. 327-339.
20. S.-C. Chen and D.J. Kuck, Time and parallel processor bounds for linear recurrence systems, IEEE Trans. Comput. C-24 (1975), No. 7, p. 701-717.
21. Z. Chen and C.-C. Chang, Iteration-level parallel execution of DO loops with a reduced set of dependence relations, J. Parallel and Distributed Comput. 4 (1987), p. 488-504.
22. J.S. Conery, Parallel execution of logic programs (Kluwer Acad. Publ., Boston, 1987).
23. W.R. Cowell and C.P. Thompson, Transformation of FORTRAN DO loops to improve performance on vector architecture, ACM Trans. on Math. Software 12 (1986), No. 4, p. 324-353.
24. J. Demmel, LAPACK: A portable linear algebra library for supercomputers, Proc. 1989 IEEE CACSD (Tampa, FL, 1989).
25. J. Demmel and J. Dongarra, et al., Prospects for the development of a linear algebra library for high-performance computers, Mathematical and Computer Science Division Rpt. ANL/MCS-TM-97 (Argonne National Laboratory, Argonne, IL, 1987, LAPACK Working Note No. 1).
26. J.B. Dennis, Data flow supercomputers, Computer 13 (1980), No. 11, p. 48-56.
27. J.J. Dongarra, Redesigning linear algebra algorithms, in Proc. 1-st Int. Colloq. Vector and Parallel Comput. Sci. Appl., Paris, 1983 (EDF Bull. Dir. Etud. et Rech., 1983, C, No. 1) p. 51-60.
28. J.J. Dongarra, F.G. Gustavson, and A. Karp, Implementing linear algebra algorithms for dense matrices on a vector pipeline machine, SIAM Rev. 26 (1984), No. 1, p. 91-112.
29. J.J. Dongarra and D.C. Sorensen, Linear algebra on high-performance computers, Parallel Computing 85 (North-Holland, Amsterdam, 1986) p. 3-32.
30. I.S. Duff, Parallel implementation of multifrontal schemes, Parallel Comput. 3 (1986), No. 3, p. 193-204.
31. I.S. Duff, The parallel solution of sparse linear equations, in CONPAR 86, Lecture Notes in Comput. Sci., 237 (Springer-Verlag, Berlin, 1986) p. 18-24.
32. A.P. Ershov, Current state of program schemes theory, Problems of Cybernetics 27 (1973) p. 87-110 (in Russian).
33. D.J. Evans, Parallel numerical algorithms for linear systems, in Parallel processing systems (Cambridge Univ. Press, 1982) p. 357-383.
34. V.N. Faddeeva and D.K. Faddeev, Parallel computations in linear algebra 1, 2, Cybernetics (1977) No. 6, p. 28-40; (1982) No. 3, p. 18-31 (in Russian).
35. A.V. Fiacco and G.P. McCormick, Nonlinear programming: Sequential unconstrained minimization techniques (Research Analyses Corp., New York, 1968).
36. J.A.B. Fortes and D.I. Moldovan, Parallelism detection and transformation techniques useful for VLSI algorithms, J. Parallel and Distributed Comput. (1985), No. 1, p. 227-301.
37. K.A. Gallivan, R.J. Plemmons, and A.H. Sameh, Parallel algorithms for dense linear algebra computations, SIAM Rev. 32 (1990), No. 1, p. 54-135.
38. D.B. Gannon and J. van Rosendale, On the impact of communication complexity on the design of parallel numerical algorithms, IEEE Trans. Comput. C-33 (1984), No. 12, p. 1180-1194.
39. M.R. Garey and D.S. Johnson, Computers and Intractability. A Guide to the Theory of NP-Completeness (W.H. Freeman and Company, San Francisco, 1979).
40. J. Gurd, Dataflow architectures and implicit parallel programming, in Multiprocessing in Meteorological Models (Springer-Verlag, Berlin, 1988) p. 255-282.
41. J. Gurd and C. Kirkham, Dataflow: Achievements and prospects, Proc. IFIP Congr. 86, Dublin, 1986 (North-Holland, Amsterdam, 1986) p. 61-68.
42. W. Handler, Innovative computer architecture - how to increase parallelism but not complexity, in Parallel processing systems (Cambridge Univ. Press, 1982) p. 1-42.
43. W. Handler, Simplicity and flexibility in concurrent computer architecture, in High speed computation (Springer-Verlag, Berlin, 1984) p. 69-88.
44. D. Heller, A survey of parallel algorithms in numerical linear algebra, SIAM Rev. 20 (1978), No. 4, p. 740-777.
45. R.W. Hockney, Supercomputer architecture, in Future systems, 2 (Infotech Int., Maidenhead, 1977) p. 277-305.
46. R.W. Hockney and C.R. Jesshope, Parallel computers: Architecture, programming and algorithms (Adam Hilger, Bristol, 1981).
47. F. Hossfeld, Parallel algorithms: The impact of communication complexity, Theory of algorithms, Pecs, Hungary, 1984 (North-Holland, Amsterdam, 1986) p. 207-232.
48. K. Hwang and F. Briggs, Computer architectures and parallel processing (McGraw-Hill, New York, 1984).
49. K. Hwang, S.-P. Su, and L.M. Ni, Vector computer architecture and processing techniques, in Advances in computers, 20 (Acad. Press, New York, 1981) p. 116-199.
50. C.R. Jesshope, R.J. O'Gorman, and J.M. Stewart, Parallel processing. State of the art report (Pergamon Infotech, Maidenhead, UK, 1987).
51. D.J. Kuck, A survey of parallel machine organization and programming, Comput. Surv. 9 (1977), No. 1, p. 29-59.
52. D.J. Kuck, High-speed machines and their compilers, in Parallel processing systems (Cambridge Univ. Press, 1982) p. 193-214.
53. D.J. Kuck, Parallel processing of ordinary programs, in Advances in computers, 15 (Acad. Press, New York, 1976) p.119-179.
54. D.J. Kuck et al., The effect of program restructuring, algorithm change and architecture choice on program performance, Proc. 1984 Int. Conf. Parallel Process. (IEEE, New York, 1984) p.129-138.
55. D.J. Kuck et al., The structure of an advanced vectorizer for pipelined processors, Proc. 4-th Int. COMPSAC, 1980, p.709-715.
56. H.T. Kung, The structure of parallel algorithms, in Advances in computers, 19 (Acad. Press, New York, 1980) p.65-112.
57. H.T. Kung, Special-purpose supercomputers, Proc. IFIP Congr. 86, Dublin, 1986 (North-Holland, Amsterdam, 1986) p.565-570.
58. H.T. Kung, Why systolic architecture? Computer 15 (1982), No.1, p.37-46.
59. S.Y. Kung et al., Wavefront array processor: Language, architecture and applications, IEEE Trans. Comput. C-31 (1982), No.11, p.1054-1066.
60. M.S. Lam, A systolic array optimizing compiler (Kluwer Acad. Publ., Boston, 1989).
61. L. Lamport, The coordinate method for parallel execution of DO loops, Proc. 1973 Sagamore Comput. Conf. Parallel Process. (IEEE, New York, 1973) p.1-12.
62. L. Lamport, The parallel execution of DO loops, Commun. ACM 17 (1974), No.2, p.83-93.
63. G. Lee, C.P. Kruskal, and D.J. Kuck, The effectiveness of automatic restructuring on nonnumerical programs, Proc. 1985 Int. Conf. Parallel Process. (IEEE, New York, 1985) p.607-613.
64. B. Lisper, Synthesizing synchronous systems by static scheduling in space-time, Lecture Notes in Comput. Sci., 362 (Springer-Verlag, Berlin, 1989).
65. V.P. Maslov, Asymptotic solution methods for pseudodifferential equations (Nauka, Moscow, 1987) (in Russian).
66. J.R. McGraw, A debate: Retire FORTRAN? Physics Today 37 (1984), No.5, p.66-75.
67. J.J. Modi, Parallel algorithms and matrix computations (Clarendon Press, Oxford, 1988).
68. D.I. Moldovan, On the design of algorithms for VLSI systolic arrays, Proc. IEEE (1983).
69. D.I. Moldovan, A comparison between parallel processing of numeric and symbolic algorithms, in Parallel algorithms and architectures (North-Holland, Amsterdam, 1986) p.325-334.
70. J.M. Ortega, Introduction to parallel and vector solution of linear systems (Plenum Publ. Corp., New York, 1988).
71. C.D. Polychronopoulos, Loop coalescing: A compiler transformation for parallel machines, Proc. 1987 Int. Conf. Parallel Process. (Pennsylvania State Univ. Press, Univ. Park, PA, USA, 1987) p.235-242.
72. C.D. Polychronopoulos, Parallel programming and compilers (Kluwer Acad. Publ., Boston, 1988).
73. P. Quinton, Automatic synthesis of systolic arrays from uniform recurrent equations, Proc. 11-th Annu. Int. Symp. Comput. Archit., Ann Arbor, Mich., 1984 (IEEE, New York, 1984), p.208-214.
74. A.H. Sameh, An overview of parallel algorithms in numerical linear algebra, Proc. 1-st Int. Colloq. Vector and Parallel Comput. in Sci. Appl., Paris, 1983 (EDF Bull. Dir. Etud. et Rech., 1983, C, No.1) p.129-134.
75. A.H. Sameh, Numerical parallel algorithms - a survey, in High speed computer and algorithm organization (Acad. Press, New York, 1977) p.207-228.
76. H.J. Siegel, A model of SIMD machines and a comparison of various interconnection networks, IEEE Trans. Comput. C-28 (1977), No.12, p.907-917.
77. L.J. Siegel, H.J. Siegel, and P.H. Swain, Performance measures for evaluating algorithms for SIMD machines, IEEE Trans. Softw. Eng. SE-8 (1982), No.4, p.319-330.
78. H.S. Stone, Parallel computers, in Introduction to computer architecture (Sci. Research Assoc., Palo Alto, Cal., 1975) p.318-374.
79. H.S. Stone, Problems of parallel computation, in Complexity of sequential and parallel numerical algorithms (Acad. Press, New York, 1973) p.1-16.
80. M.N.S. Swamy and K. Thulasiraman, Graphs, Networks, and Algorithms (Wiley, New York, 1981).
81. J.F. Traub, Parallel algorithms and parallel computational complexity, Proc. IFIP Congr. 74, Stockholm, 1974 (North-Holland, Amsterdam, 1974) p.685-687.
82. J.F. Traub, G.W. Wasilkowski, and H. Wozniakowski, Information-based complexity (Acad. Press, New York, 1988).
83. P.C. Treleaven, Future parallel computers, in CONPAR 86, Lecture Notes in Computer Sci., 237 (Springer-Verlag, Berlin, 1986) p.40-47.
84. U. Trottenberg and K. Solchenbach, Parallel algorithms and their mapping on parallel computer architectures, Inf. tech. - IT 30 (1988), No.2, p.71-82.
85. E.E. Tyrtyshnikov, New approaches to deriving parallel algorithms, Parallel Comput. 15 (1990), p.261-265.
86. V.V. Voevodin, Impact of algorithms on new computer architecture, Proc. IFIP Congr. 86, Dublin, 1986 (North-Holland, Amsterdam, 1986) p.1043-1048.
87. V.V. Voevodin, Information structure of algorithms, Proc. 8-th France - U.S.S.R. - Italy Joint Symposium Comp. Math. and Appl. (Istit. Anal. Numer., Pubbl. No.730, Pavia, 1989) p.381-390.
88. V.V. Voevodin, Mathematical models and methods for parallel processes (Nauka, Moscow, 1986) (in Russian).
89. V.V. Voevodin, Computational basics of linear algebra (Nauka, Moscow, 1977) (in Russian).
90. V.V. Voevodin, Computational methods of algebra (Theory and Algorithms) (Nauka, Moscow, 1966) (in Russian).
91. V.V. Voevodin, Parallel computations and algorithm structure, Sov. J. Numer. Anal. Math. Modelling 4 (1989), No.4, p.327-339.
92. V.V. Voevodin and E.E. Tyrtyshnikov, Computational processes with Toeplitz matrices (Nauka, Moscow, 1987) (in Russian).
93. Vl.V. Voevodin, Statistic estimates of the possibility to discover parallelism in sequential programs, Programming (1990) No.4 p.44-54 (in Russian).
94. Y. Wallach, Alternating sequential/parallel processing, Lecture Notes in Comput. Sci., 127 (Springer-Verlag, Berlin, 1982).
95. J. Wilkinson, The Algebraic Eigenvalue Problem (Clarendon Press, Oxford, England, 1965).
96. Zhiyu Shen, Zhiyuan Li, and Pen-Chung Yew, An empirical study of FORTRAN programs for parallelizing compilers, IEEE Trans. on Parallel and Distributed Systems (1990) p.350-364.
Index
A
Bellman equation
adjacency matrix
160
algorithm 274
elementary of
broadcast
289
linear
cone 230
open
unconditional
274
D
algorithm graph
defect
22
delay
289
linearly ordered
primitive
28
equivalent perturbation
205
error 212
176
forward
176 175
269
F
statement forward
error analysis
271
output
234
analysis
backward
269
assignment
39
E
205
dimension of
input
39
289
infinite
124
density of data streams
strictly oriented array
of a node
delay vector
146
regular
269
158
34
elementary
208
208
constant
matrix of
width of
208
dimension of
289
variation
208
closed
308
primitive
123 C
34
information closure
local
discrete
boundary conditions vector
conditional
height
200
Bellman inequality,
G
271 graph
B
critical backward error analysis balanced cycle base vector
63
205
basic
operation
basic
variable
21 21
176
path
directed cut expansion
31
reduction
31
undirected
cut
99 138
145
175
39
200
O
graph machine 40 order
H homomorphic convolution 31 homomorphic
relation 256
information 256 lexicographic 257
image 32
strict
homomorphic pre-image 32
256
orienting vector
homomorphism 31
of a graph 212
outdegree 124
I implementation incidence
vector
P
38 parallel
matrix 162
canonical
indegree 124 infinite
form
regular graph 205
information connection
matrix 159
maximal 34
principal conditions vector
39
regular subgraph 206
program 274
L L-property
5
of a matrix 265
label 272 lattice
generalized
38
high-speed 94
68
lexicographically maximum point 2! lexicographic order order 256
loop body 272 loop
schedule 38
68
distributive
linear
34
height of 34
width of 34
weighted 159 initial
34
index
272 M
macrograph 33 macroarc 149 macrooperation 148
N
282
space-time 196 strict
38
sectioned
memory 144
semimodule 92 seniority of
loop
indexes 277
of
statements 279
statement assignment 271 empty 273 goto 273
nest dimension 275
input 271
nested
output 271
l o o p 274
bearing
275
T tightly time
V
nested
level
topological
variable
275
109
empty
sorting
indexed
generalized linear
loop
28
269 270
value of
28
variation
28
U unconditional
269 269 matrix W
algorithm
274
wavefront
198
of algorithm
158