DEADLINE SCHEDULING FOR REAL-TIME SYSTEMS
EDF and Related Algorithms
THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

REAL-TIME SYSTEMS
Consulting Editor: John A. Stankovic

HARD REAL-TIME COMPUTING SYSTEMS: Predictable Scheduling Algorithms and Applications, by Giorgio C. Buttazzo; ISBN: 0-7923-9994-3
REAL-TIME DATABASE AND INFORMATION RESEARCH ADVANCES, by Azer Bestavros, Victor Wolfe; ISBN: 0-7923-8011-8
REAL-TIME SYSTEMS: Design Principles for Distributed Embedded Applications, by Hermann Kopetz; ISBN: 0-7923-9894-7
REAL-TIME DATABASE SYSTEMS: Issues and Applications, edited by Azer Bestavros, Kwei-Jay Lin and Sang Hyuk Son; ISBN: 0-7923-9897-1
FAULT-TOLERANT REAL-TIME SYSTEMS: The Problem of Replica Determinism, by Stefan Poledna; ISBN: 0-7923-9657-X
RESPONSIVE COMPUTER SYSTEMS: Steps Toward Fault-Tolerant Real-Time Systems, by Donald Fussell and Miroslaw Malek; ISBN: 0-7923-9563-8
IMPRECISE AND APPROXIMATE COMPUTATION, by Swaminathan Natarajan; ISBN: 0-7923-9579-4
FOUNDATIONS OF DEPENDABLE COMPUTING: System Implementation, edited by Gary M. Koob and Clifford G. Lau; ISBN: 0-7923-9486-0
FOUNDATIONS OF DEPENDABLE COMPUTING: Paradigms for Dependable Applications, edited by Gary M. Koob and Clifford G. Lau; ISBN: 0-7923-9485-2
FOUNDATIONS OF DEPENDABLE COMPUTING: Models and Frameworks for Dependable Systems, edited by Gary M. Koob and Clifford G. Lau; ISBN: 0-7923-9484-4
THE TESTABILITY OF DISTRIBUTED REAL-TIME SYSTEMS, Werner Schütz; ISBN: 0-7923-9386-4
A PRACTITIONER'S HANDBOOK FOR REAL-TIME ANALYSIS: Guide to Rate Monotonic Analysis for Real-Time Systems, Carnegie Mellon University (Mark Klein, Thomas Ralya, Bill Pollak, Ray Obenza, Michael González Harbour); ISBN: 0-7923-9361-9
FORMAL TECHNIQUES IN REAL-TIME FAULT-TOLERANT SYSTEMS, J. Vytopil; ISBN: 0-7923-9332-5
SYNCHRONOUS PROGRAMMING OF REACTIVE SYSTEMS, N. Halbwachs; ISBN: 0-7923-9311-2
REAL-TIME SYSTEMS ENGINEERING AND APPLICATIONS, M. Schiebe, S. Pferrer; ISBN: 0-7923-9196-9
SYNCHRONIZATION IN REAL-TIME SYSTEMS: A Priority Inheritance Approach, R. Rajkumar; ISBN: 0-7923-9211-6
CONSTRUCTING PREDICTABLE REAL TIME SYSTEMS, W. A. Halang, A. D. Stoyenko; ISBN: 0-7923-9202-7
FOUNDATIONS OF REAL-TIME COMPUTING: Formal Specifications and Methods, A. M. van Tilborg, G. M. Koob; ISBN: 0-7923-9167-5
FOUNDATIONS OF REAL-TIME COMPUTING: Scheduling and Resource Management, A. M. van Tilborg, G. M. Koob; ISBN: 0-7923-9166-7
REAL-TIME UNIX SYSTEMS: Design and Application Guide, B. Furht, D. Grostick, D. Gluch, G. Rabbat, J. Parker, M. McRoberts; ISBN: 0-7923-9099-7
DEADLINE SCHEDULING FOR REAL-TIME SYSTEMS
EDF and Related Algorithms

by

John A. Stankovic
University of Virginia, Charlottesville, Virginia, USA

Marco Spuri
Merloni Elettrodomestici Spa, Fabriano, Italy

Krithi Ramamritham
University of Massachusetts, Amherst, Massachusetts, USA

Giorgio C. Buttazzo
Scuola Superiore S. Anna, Pisa, Italy

KLUWER ACADEMIC PUBLISHERS
Boston / Dordrecht / London
Distributors for North, Central and South America:
Kluwer Academic Publishers
101 Philip Drive, Assinippi Park
Norwell, Massachusetts 02061 USA
Telephone (781) 871-6600
Fax (781) 871-6528
E-Mail

Distributors for all other countries:
Kluwer Academic Publishers Group
Distribution Centre, Post Office Box 322
3300 AH Dordrecht, THE NETHERLANDS
Telephone 31 78 6392 392
Fax 31 78 6546 474
E-Mail
Library of Congress Cataloging-in-Publication Data

Deadline scheduling for real-time systems: EDF and related algorithms / by John A. Stankovic ... [et al.].
p. cm. -- (The Kluwer international series in engineering and computer science; SECS 460. Real-time systems)
Includes bibliographical references and index.
ISBN 0-7923-8269-2 (alk. paper)
1. Real-time data processing. 2. Computer algorithms. 3. Scheduling. I. Stankovic, John A. II. Series: Kluwer international series in engineering and computer science; SECS 460. III. Series: Kluwer international series in engineering and computer science. Real-time systems.
QA76.54.D43 1998
004'.33 -- dc21
98-39004
CIP
Copyright © 1998 by Kluwer Academic Publishers

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061.

Printed on acid-free paper.

Printed in the United States of America.

The cover of this book was designed by Curtis Buyrn.
CONTENTS

LIST OF FIGURES    ix
LIST OF TABLES    xiii
PREFACE    xv

1  INTRODUCTION    1
   1.1  Real-Time Systems    1
   1.2  Common Misconceptions    4
   1.3  A Typical Example of a Real-Time Application    4
   1.4  Purpose of this Book    7
   1.5  Format of the Book    8
   REFERENCES    11

2  TERMINOLOGY AND ASSUMPTIONS    13
   2.1  Task Models, Assumptions and Notation    14
   2.2  Static versus Dynamic Scheduling    19
   2.3  Metrics    22
   REFERENCES    25

3  FUNDAMENTALS OF EDF SCHEDULING    27
   3.1  Optimality on Uni-Processor Systems    28
   3.2  Feasibility Analysis    31
   3.3  Summary    61
   REFERENCES    63

4  RESPONSE TIMES UNDER EDF SCHEDULING    67
   4.1  Finding Local Maxima    68
   4.2  Deadline Busy Periods    71
   4.3  Algorithm Description    73
   4.4  Extended Task Modeling    74
   4.5  Case Study    82
   4.6  Summary    83
   REFERENCES    85

5  PLANNING-BASED SCHEDULING    87
   5.1  Preliminaries: Load, Metrics, Value Functions    91
   5.2  Steps in a Dynamic Planning-Based Scheduling Approach    98
   5.3  Algorithms for Dynamic Planning    101
   5.4  Timing of the Planning    111
   5.5  Implementing Planning-Based Scheduling    113
   5.6  Dispatching Jobs in a Planning-based Schedule    114
   5.7  Summary    116
   REFERENCES    117

6  EDF SCHEDULING FOR SHARED RESOURCES    121
   6.1  The Nature of Resources and the Resulting Scheduling Problems    122
   6.2  The Priority Inversion Problem    125
   6.3  The Priority Inheritance Protocol    127
   6.4  The Dynamic Priority Ceiling Protocol    130
   6.5  The Stack Resource Policy    142
   6.6  Resource Scheduling in Planning-based Schedulers    144
   6.7  Summary    147
   REFERENCES    149

7  PRECEDENCE CONSTRAINTS AND SHARED RESOURCES    151
   7.1  Scheduling Dependent Tasks with EDF    152
   7.2  The Notion of Quasi-Normality    155
   7.3  Integration of Shared Resources and Precedence    158
   7.4  Extended Task Model    161
   7.5  Summary    164
   REFERENCES    167

8  APERIODIC TASK SCHEDULING    169
   8.1  Dynamic Priority Exchange server    170
   8.2  Dynamic Sporadic Server    175
   8.3  Total Bandwidth Server    179
   8.4  Earliest Deadline Late server    182
   8.5  Improved Priority Exchange server    187
   8.6  Performance Results    190
   8.7  Summary    193
   REFERENCES    195

9  DISTRIBUTED SCHEDULING - PART I    197
   9.1  Distributed Systems - An Overview    198
   9.2  Holistic Scheduling Based on EDF    201
   9.3  Performance    223
   9.4  Summary    224
   REFERENCES    225

10 DISTRIBUTED SCHEDULING - PART II    229
   10.1  The Spring Complex Task Set Allocation and Scheduling Algorithm    229
   10.2  Focussed Addressing and Bidding    246
   10.3  Summary    256
   REFERENCES    259

11 SUMMARY AND OPEN QUESTIONS    263
   11.1  Summary    263
   11.2  Open Questions    265
   REFERENCES    267

INDEX    269
LIST OF FIGURES

Chapter 1
1.1  A Fly-By-Wire Flight Control System.    5

Chapter 2
2.1  Example of EDF schedule.    17

Chapter 3
3.1  A set of three real-time jobs.    33
3.2  EDF schedule (with time overflow) of the job set of Figure 3.1.    34
3.3  EDF schedule of two periodic tasks with different initial phases.
3.4  EDF schedule of two tasks: a sporadic one and a periodic one.    40
3.5  EDF schedule of periodic tasks with deadlines different from their periods.    44
3.6  Pseudo-code of the algorithm for the feasibility analysis of hybrid task sets.    50
3.7  Worst case arrival pattern for a task set with release jitter.    52
3.8  Worst arrival pattern of sporadically periodic tasks.    53
3.9  Non-preemptive non-idling EDF schedule of two jobs.    58
3.10 Non-preemptive idling schedule of two jobs.    58
3.11 Busy period preceding an overflow in a non-preemptive non-idling EDF schedule.    60

Chapter 4
4.1  Busy period preceding an instance completion time.    69
4.1  Synchronous arrival pattern possibly giving the worst case response time for task Ti.    69
4.2  Pseudo-code of the algorithm for the computation of the maximum deadline busy period lengths.    73
4.3  EDF schedule with deadline tolerance.    75
4.4  Arrival pattern for the evaluation of task Ti's response time.    76
4.5  Local response time maxima when non-preemptive EDF scheduling is assumed.    79

Chapter 5
5.1  (a) Feasible schedule with Earliest Deadline First, in normal load condition. (b) Overload with domino effect due to the arrival of job J0.    88
5.2  Utility functions that can be associated with a job to describe its importance.    94
5.3  No optimal on-line algorithms exist in overload conditions, since the schedule that maximizes Γ depends on the knowledge of future arrivals.    95
5.4  Situation in which EDF has an arbitrarily small competitive factor.    97
5.5  Scheduling schemes for handling overload situations. a. Best Effort. b. Admission Control. c. Robust.    102
5.6  The RED acceptance test.    105
5.7  Basic guarantee algorithm.    109

Chapter 6
6.1  Priority inversion under EDF scheduling.    126
6.2  Priority inheritance under EDF scheduling.    127
6.3  Different relative priorities under EDF scheduling.    130
6.4  Example of transitive blocking.    134
6.5  Anomalous push-through blocking.    135
6.6  Algorithm for the computation of blocking critical sections.    136

Chapter 7
7.1  The schedule of jobs Jk and Jj violates EDF*.    154
7.2  The new schedule of Jk and Jj, EDF* compliant.    154
7.3  Example of a not normal schedule produced by EDF and a resource access protocol.    157
7.4  A situation in which an EDF scheduler without priority inheritance violates quasi-normality and precedence constraints.    160
7.5  Algorithm of relative deadline modification.    163

Chapter 8
8.1  Dynamic Priority Exchange server example.    171
8.2  DPE server schedulability.    173
8.3  DPE server resource reclaiming.    175
8.4  Dynamic Sporadic Server example.    176
8.5  Total Bandwidth Server example.    180
8.6  Availability function under EDL.    183
8.7  a) Idle times available at time t = 8 under EDL. b) Schedule of the aperiodic request with the EDL server.    185
8.8  Improved Priority Exchange server example.    188
8.9  Performance of dynamic server algorithms.    192

Chapter 9
9.1  End-to-end computation delay components.    202
9.2  Assumed network topology.    204
9.3  a) Busy period preceding an instance completion time. b) Arrival pattern possibly giving the worst case response time for task i.    209
9.4  Local network scenario for the evaluation of message worst case communication delays.    212
9.5  Maximum delayed token visits at host processor p when full Timed Token protocol is assumed.    216
9.6  Maximum delayed token visits at host processor p when restricted Timed Token protocol is assumed.    217
9.7  Components of the end-to-end computation delay.    222

Chapter 10
10.1 Structure of Two Periodic Tasks.    232
10.2 Subtask Communication Graph.    239
10.3 Assignment and Schedule for the Two Periodic Tasks.    244

Chapter 11
LIST OF TABLES

Chapter 1
Chapter 2

Chapter 3
3.1  Loading factor computation for the job set of Figure 3.1.    32
3.2  Computation of the synchronous busy period length for the task set of Figure 3.5.    48

Chapter 4
4.1  List of semaphores and locking pattern for the GAP task set.    82
4.2  GAP task set parameters.    83

Chapter 5

Chapter 6
6.1  Job parameters for example.    123
6.2  Structure of accesses to resources.    138

Chapter 7

Chapter 8
8.1  Idle times under EDL.    184

Chapter 9
Chapter 10
Chapter 11
PREFACE
Many real-time systems rely on static scheduling algorithms. These include cyclic scheduling, rate monotonic scheduling, and fixed schedules created by off-line scheduling techniques such as dynamic programming, heuristic search, and simulated annealing. However, for many real-time systems static scheduling algorithms are quite restrictive and inflexible. For example, highly automated agile manufacturing, command, control and communications, and distributed real-time multimedia applications all operate over long lifetimes and in highly non-deterministic environments. Dynamic real-time scheduling algorithms are more appropriate for these systems and are used in such systems. Many of these algorithms are based on earliest deadline first (EDF) policies.

There exists a wealth of literature on EDF based scheduling, with many extensions to deal with sophisticated issues such as precedence constraints, resource requirements, system overload, multi-processors, and distributed systems. This book aims at collecting a significant body of knowledge on EDF scheduling for real-time systems, but it does not try to be all inclusive (the literature is just too extensive). The book primarily presents the algorithms and associated analysis, but guidelines, rules, and implementation considerations are also discussed, especially for the more complicated situations where mathematical analysis is difficult.

In general, it is very difficult to codify and taxonomize scheduling knowledge because there are many performance metrics, task characteristics, and system configurations. Adding to the complexity is the fact that a variety of algorithms have been designed for different combinations of these considerations. In spite of the recent advances there are still gaps in the solution space, and there is a need to integrate the available solutions. For example, a list of issues to consider includes:

•  preemptive versus non-preemptive tasks,
•  uni-processors versus multi-processors,
•  using EDF at dispatch time versus EDF-based planning,
•  precedence constraints among tasks,
•  resource constraints,
•  periodic versus aperiodic versus sporadic tasks,
•  scheduling during overload,
•  fault tolerance requirements, and
•  providing guarantees and levels of guarantees (meeting quality of service requirements).
Chapter 1 defines real-time systems and gives a brief example of a real-time system. Chapter 2 contains the terminology and assumptions used throughout the book. The fundamental results of EDF scheduling for independent tasks are first presented in Chapters 3 and 4. These Chapters include results on preemption, non-preemption, uni-processors and multi-processors. The overall approach taken in this book is to consider preemption and non-preemption and uni-processors and multi-processors throughout the book and not as separate Chapters. Using EDF in planning mode is discussed in Chapter 5. Chapter 6 discusses results and algorithms involving tasks having general resource requirements. Chapter 7 presents results on scheduling tasks with precedence constraints and resource requirements. Chapter 8 considers problems where periodic and aperiodic tasks must both be scheduled. Scheduling for distributed systems is presented in Chapters 9 and 10. Chapter 11 summarizes the book and discusses open issues.

This book should be of interest to researchers, real-time system designers, and to instructors and students, either as a focussed course on deadline based scheduling for real-time systems, or, more likely, as part of a more general course on real-time computing. While reading the Chapters in order from 1 through 11 is recommended, this is not necessary. After reading Chapters 1, 2, 3, and 4, a reader can move to any of the other Chapters in any order, depending on his or her interest. One exception to this rule is that Chapter 6 should be read before Chapter 7.

Special thanks go to Sang Son and Worthy Martin for helpful comments on early versions of this book.
1 INTRODUCTION
Over the last 10 years there has been an explosion of real-time scheduling results. Many of these results are based on either the rate monotonic (RM) or earliest deadline first (EDF) scheduling algorithms. The RM results have been collected and presented in an excellent book [3]. To date, a similar effort has been lacking for EDF. Since EDF has many valuable properties, it is important to address EDF in a comprehensive manner. Consequently, this book presents a compendium of results on earliest deadline first (EDF) scheduling for real-time systems. The simplest results presented utilize pure EDF scheduling. As system situations become more complicated, EDF is used as a key ingredient, but is combined with other solutions, e.g., to deal with shared resources. This Chapter defines real-time systems, discusses several common misconceptions, presents an example of a real-time application, and gives a detailed purpose and outline for the book.
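The EDF policy itself is simple to state: at every scheduling decision point, the ready job with the earliest absolute deadline runs. As a concrete (if informal) preview of the chapters that follow, the sketch below simulates preemptive EDF for independent jobs on a single processor. It is an illustrative sketch only, not the book's formal treatment; the job names, arrival times, and deadlines are invented for the example.

```python
import heapq

def edf_schedule(jobs):
    """Simulate preemptive EDF on one processor.

    jobs: list of (name, arrival, wcet, absolute_deadline) tuples.
    Returns (slices, all_met): a list of (start, end, name) execution
    slices, and whether every job completed by its deadline.
    """
    pending = sorted(jobs, key=lambda j: j[1])  # jobs ordered by arrival time
    ready = []                                  # min-heap keyed on absolute deadline
    t, i, slices, all_met = 0, 0, [], True
    while i < len(pending) or ready:
        # Admit every job that has arrived by time t.
        while i < len(pending) and pending[i][1] <= t:
            name, _arr, wcet, dl = pending[i]
            heapq.heappush(ready, (dl, name, wcet))
            i += 1
        if not ready:                           # processor idles until next arrival
            t = pending[i][1]
            continue
        dl, name, rem = heapq.heappop(ready)    # earliest-deadline ready job runs
        # Run until the job finishes or the next arrival (a possible preemption).
        next_arr = pending[i][1] if i < len(pending) else float("inf")
        run = min(rem, next_arr - t)
        slices.append((t, t + run, name))
        t += run
        if rem - run > 0:                       # preempted: back into the ready queue
            heapq.heappush(ready, (dl, name, rem - run))
        elif t > dl:                            # completed after its deadline
            all_met = False
    return slices, all_met
```

For instance, with jobs J1 (arrival 0, execution 2, deadline 4) and J2 (arrival 1, execution 1, deadline 3), J2 preempts J1 at time 1 because its deadline is earlier, and both jobs finish on time.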
1.1
REAL-TIME SYSTEMS
Real-time systems are defined as those systems in which the correctness of the system depends not only on the logical result of the computation, but also on the time at which the results are produced [6]. Many real-time systems are characterized by the fact that severe consequences will result if timing as well as logical correctness properties of the system are not satisfied. Typically, a real-time system consists of a controlling system and a controlled system. In an automated factory, the controlled system is the factory floor with its robots, assembling stations, and the assembled parts, while the controlling system is the
computer and human interfaces that manage and coordinate the activities on the factory floor. Thus, the controlled system can be viewed as the environment with which the computer interacts. The controlling system interacts with its environment based on the information available about the environment from various sensors attached to it. It is imperative that the state of the environment, as perceived by the controlling system, be consistent with the actual state of the environment. Otherwise, the effects of the controlling systems' activities may be disastrous. Hence, periodic monitoring of the environment as well as timely processing of the sensed information is necessary. Timing correctness requirements in a real-time system arise because of the physical impact of the controlling systems' activities upon its environment. For example, if the computer controlling a robot does not command it to stop or turn on time, the robot might collide with another object on the factory floor. Needless to say, such a mishap can result in a major catastrophe. Real-time systems span many application areas. In addition to automated factories, applications can be found in control of automobile engines, avionics, undersea exploration, process control, robot and vision systems, military command and control, and space stations. In other words, the complexity of real-time systems spans the gamut from very simple control of laboratory experiments, to process control applications, to very complicated projects such as a space station. Recently, the need to process continuous streams of audio and video data has given rise to exciting new possibilities for real-time applications. Timing constraints for tasks can be arbitrarily complicated, but the most common timing constraints for tasks are either periodic or aperiodic. An aperiodic task has a deadline by which it must finish or start, or it may have a constraint on both start and finish times.
In the case of a periodic task, a period might mean 'once per time interval T' or 'exactly T units apart'. Low-level application tasks, such as those that process information obtained from sensors or those that activate elements in the environment, typically have stringent timing constraints dictated by the physical characteristics of the environment. A majority of sensory processing is periodic in nature. A radar that tracks flights and produces data at a fixed rate is one example. A temperature monitor of a nuclear reactor core should be read periodically to detect any changes promptly. Some of these periodic tasks may exist from the point of system initialization and remain permanent, while others may come into existence dynamically. The temperature monitor is an instance of a permanent
task. An example of a dynamically created task is a (periodic) task that monitors a particular flight; this comes into existence when the aircraft enters an air traffic control region and ceases to exist when the aircraft leaves the region. More complex types of timing constraints also occur. For example, spray painting a car on a moving conveyor must be started after time t1 and completed before time t2. Aperiodic requirements can arise from dynamic events such as an object falling in front of a moving robot, or a human operator pushing a button on a console. In addition, time related requirements may also be specified in indirect terms. For example, a value may be attached to the completion of each task where the value may increase or decrease with time; or a value may be placed on the quality of an answer whereby an inexact but fast answer might be considered more valuable than a slow but accurate answer. In other situations, missing X deadlines might be tolerated, but missing X + 1 deadlines cannot be tolerated. This raises the question of what options the system might have in dealing with the (timing) exception. For more detailed descriptions of real-time systems see [1, 2, 5, 7, 8].
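For the common case of periodic tasks whose deadlines equal their periods, there is a crisp schedulability condition that Chapter 3 develops formally: on a single processor, preemptive EDF meets every deadline exactly when the total utilization, the sum of computation time over period across all tasks, does not exceed one. A minimal sketch of that test follows; the task parameters (e.g., a 5 ms sensor read every 25 ms) are invented for illustration.

```python
from fractions import Fraction

def edf_feasible(tasks):
    """Utilization-based EDF feasibility test for periodic tasks with
    deadlines equal to their periods, on one processor.

    tasks: list of (wcet, period) pairs in the same time unit.
    """
    # Exact rational arithmetic avoids floating-point rounding at the boundary.
    utilization = sum(Fraction(c, t) for c, t in tasks)
    return utilization <= 1

# Three illustrative loops: 5/25 + 10/50 + 20/100 = 0.6, so the set is feasible.
print(edf_feasible([(5, 25), (10, 50), (20, 100)]))
```

Note that the test is this simple only under the stated assumptions; release jitter, shared resources, and deadlines shorter than periods (all treated later in the book) require more elaborate analysis.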
1.2
COMMON MISCONCEPTIONS
Real-time systems have unique sets of requirements usually requiring novel solutions. This fact is not always understood or appreciated, which has given rise to a number of misconceptions, including the belief that a sufficiently fast computer can satisfy the requirements, i.e., that real-time computing is equal to fast computing. This is wrong. The point is that speed helps a real-time system in achieving the required responsiveness, but in general does not support predictability [7], which is one main objective of real-time systems. Another key point in which a real-time system differs from a conventional one is fairness. In conventional systems, resource allocation is usually done in a way that avoids starvation of any possible task: sooner or later, a task that needs a resource gets it. In a real-time system this policy is not adequate. In case of resource contention, more important tasks should precede tasks with lower importance, and fairness is not important. If a deadline must be missed, it is better to miss the deadline of a less important task, or to increase its response time, than to miss the deadline of an important task. For a full discussion of these and other misconceptions about real-time computing see [6].
1.3
A TYPICAL EXAMPLE OF A REAL-TIME APPLICATION
One typical real-time application is the flight control program for an aircraft, such as the EFA (European Fighter Aircraft), which incorporates a Fly-By-Wire system. This application is presented to provide the reader with a brief overview of the major aspects of a real-time system, and to indicate which aspects of a real-time system this book addresses. See [5, 8] for additional discussions of real-time systems and applications. This example is taken from [1]; also see [4].
Figure 1.1  A Fly-By-Wire Flight Control System.
In the EFA high performance aircraft the traditional mechanical links with which the pilot usually interacts have been replaced by actuators controlled by a computer system. This is due to a new design approach that is no longer based on the principle of flight mechanical stability, but on dynamically unstable behavior [4]. The advantage of such a design is high maneuverability; on the other hand, the aircraft is so unstable that it cannot be flown at all without its computer systems. Instead, an Active Control Technology is needed. In Figure 1.1 the typical main control loop is depicted. This system includes a set of sensors, pilot inputs from control sticks, pedals and consoles, a data processing subsystem, a set of actuators to control the aircraft, and a display for the pilot. Note that these system components are the typical main ingredients of a real-time system, with suitable substitutions depending on the application. For example, a chemical process control plant has sensors measuring acidity, pressure, volume, etc.; chemical engineers dynamically adjusting various controls and system settings; a possibly large and distributed data processing subsystem; a set of actuators to control temperature, oxygen levels, etc. that in turn control the chemical reaction; and monitoring displays to allow humans to follow the progress of the production of the plant.
In the EFA the software of the data processing subsystem is decomposed into six modules:

•  physical device access;
•  storage and retrieval of pilot data;
•  storage and retrieval of sensor data;
•  computation of speed, angular position, acceleration and angular velocity of the aircraft;
•  computation of control surface commands; and
•  arbitration among the redundant computers.
In the EFA these six modules execute as 15 processes. Five of them are periodic with deadlines equal to their periods. They are used to probe sensors, whose data are then filtered, averaged and stored. In this application, it is necessary to probe the temperature sensor at least once every 25 milliseconds. All other processes are aperiodic and have deadlines. They are activated by pilot generated interrupts, such as when he moves the control stick or pedal, or are activated conditionally on the availability of data. Almost all the modules have serialized access to their data in order to ensure data consistency. Moreover, the communication module, which provides interprocessor communication (in order to have high reliability the application runs on at least three computers), can lead to communication contention. Note that understandability, predictability, and analyzability are all complicated when there are aperiodic processes contending over device, data and communication sharing. Of course, many real-time applications are much more complicated (e.g., air traffic control and nuclear power plants) and larger (hundreds or thousands of tasks) than this example aircraft application. While many issues must be addressed to build a real-time system, how timing and resource contention requirements are satisfied is mainly the responsibility of the scheduling algorithm of the system [9, 10]. This book concentrates on the scheduling algorithms and analysis for such real-time systems.
1.4
PURPOSE OF THIS BOOK
Many real-time systems rely on static scheduling algorithms. These include cyclic scheduling, rate monotonic scheduling and fixed schedules created by off-line scheduling techniques such as dynamic programming, heuristic search, and simulated annealing. One main advantage of static off-line scheduling is that a careful and complete analysis is often possible. However, for many real-time systems static scheduling algorithms are quite restrictive and inflexible. For example, highly automated agile manufacturing, command, control and communications, and distributed real-time multimedia applications all operate over long lifetimes and in highly non-deterministic environments. Dynamic real-time scheduling algorithms are more appropriate and are used in such systems. Many of these algorithms are based on earliest deadline first (EDF) policies. There exists a wealth of literature on EDF based scheduling with many extensions to deal with complex issues such as precedence constraints, resource requirements, system overload, multi-processors, and distributed systems. In many cases, formal analysis (as shown in this book) is possible. This book aims at collecting the significant amount of knowledge that has been developed on EDF scheduling. Rather than just presenting the algorithms, the book also provides proofs, analysis, and sometimes guidelines, rules, and implementation considerations. Besides learning what these important EDF-based results are, the reader should be able to answer, at least, the following questions:

•  what is known about uni-processor EDF scheduling problems,
•  what is known about multi-processor EDF scheduling problems,
•  what is known about distributed EDF scheduling,
•  what anomalous behavior can occur and can it be avoided,
•  where is the boundary between polynomial and NP-hard scheduling in EDF problems, and
•  what is the influence of overloads on the schedulability of tasks?
It is known that the Rate Monotonic algorithm (RMA) [3] is among the most effective uni-processor real-time scheduling algorithms. This algorithm is one of the best representatives of fixed priority algorithms. To date, a major effort
has been devoted to the study of RMA. Less attention has been paid to EDF, even though EDF theoretically allows higher utilization. Another purpose of this book is to expand the comprehension and use of the EDF algorithm.
1.5
FORMAT OF THE BOOK
In general, it is very difficult to codify scheduling knowledge because there are many performance metrics, task characteristics, and system configurations, and a variety of algorithms have been designed for different combinations of these considerations. In spite of the recent advances there are still gaps in the solution space and there is a need to integrate the available solutions. A list of issues includes:
•  preemptive versus non-preemptive tasks,
•  uni-processors versus multi-processors,
•  using EDF at dispatch time versus EDF-based planning,
•  precedence constraints among tasks,
•  resource constraints,
•  periodic versus aperiodic versus sporadic tasks,
•  scheduling during overload,
•  fault tolerance requirements, and
•  providing guarantees and supporting levels of guarantees (meeting quality of service requirements).
Chapter 1 defines real-time systems and gives a brief example of a real-time system. Chapter 2 contains the terminology and assumptions used throughout the book. The fundamental results of EDF scheduling for independent tasks are first presented in Chapters 3 and 4. These Chapters include results on preemption, non-preemption, uni-processors and multi-processors. The overall approach taken in this book is to consider preemption and non-preemption and uni-processors and multi-processors throughout the book rather than in separate Chapters. Using EDF in planning mode is discussed in Chapter 5; how to handle system overload is also discussed in that Chapter. Chapter 6 discusses results
and algorithms relating to general resource requirements. Chapter 7 presents results on scheduling tasks with precedence constraints and resource requirements. Chapter 8 considers problems where periodic and aperiodic tasks must both be scheduled. Scheduling for distributed systems is presented in Chapters 9 and 10. Chapter 11 summarizes the book and discusses open issues.
REFERENCES
[1] W. Halang and A. Stoyenko, Constructing Predictable Real Time Systems, Kluwer Academic Publishers, Boston, 1991.
[2] K. Kavi (Ed.), Real-Time Systems: Abstractions, Languages, and Design Methodologies, IEEE Computer Society Press, Los Alamitos, 1992.
[3] M. Klein, et al., A Practitioner's Handbook for Real-Time Analysis, Kluwer Academic Publishers, Boston, 1993.
[4] D. Langer, J. Rauch, M. Roaler, "Fly-by-Wire Systems for Military High Performance Aircraft," in Real-Time Systems Engineering and Applications, edited by M. Schiebe and S. Pferrer, Kluwer Academic Publishers, Boston, 1992.
[5] J. Stankovic and K. Ramamritham, Hard Real-Time Systems, IEEE Computer Society Press, Los Alamitos, 1988.
[6] J. Stankovic, "Misconceptions About Real-Time Computing," IEEE Computer 21(10), October 1988.
[7] J. Stankovic and K. Ramamritham, "What is Predictability for Real-Time Systems?," Real-Time Systems 2, 1990.
[8] J. Stankovic and K. Ramamritham, Advances in Real-Time Systems, IEEE Computer Society Press, Los Alamitos, 1993.
[9] J. Stankovic, M. Spuri, M. Di Natale and G. Buttazzo, "Implications of Classical Scheduling Results for Real-Time Systems," IEEE Computer, Vol. 28, No. 6, pp. 16-25, June 1995.
[10] A. van Tilborg and G. Koob (Eds.), Foundations of Real-Time Computing: Scheduling and Resource Management, Kluwer Academic Publishers, Boston, 1991.
2 TERMINOLOGY AND ASSUMPTIONS
Real-time scheduling involves the allocation of resources and time intervals to tasks in such a way that certain timeliness performance requirements are met. Scheduling has been perhaps the most widely researched topic within real-time systems. This is due to the belief that the basic problem in real-time systems is to make sure that tasks meet their time constraints. This Chapter introduces the basic terminology, assumptions, notation, and metrics necessary to fully understand the remaining Chapters of the book.

It should be mentioned that two different research communities have examined real-time scheduling problems from their own perspectives. Scheduling in the Operations Research (OR) community has focused on job-shop and flow-shop problems, with and without deadlines. For instance, manpower scheduling, project scheduling, and scheduling of machines are some of the topics studied in OR [5, 6, 7, 8]. The types of resources assumed by OR researchers (machines, factory cells, etc.) and how jobs use those resources (e.g., a job may be required to use every machine in some specified order) are quite different from those assumed by Computer Science researchers (CPU cycles, memory, etc., where jobs typically use only a single machine). Activities on a factory floor typically have larger time granularities than those studied by computer scientists. The metrics of interest to the OR community, such as minimizing maximum cost, minimizing the sum of completion times, minimizing schedule length, minimizing tardiness, and minimizing the number of tardy jobs, are often not of interest to real-time system designers. Rather, real-time system designers attempt to prove that all tasks meet their deadlines, or in less stringent situations, they try to minimize the number of tasks which miss their deadlines. OR techniques are geared towards static (off-line) methods, whereas those developed in Computer
Science focus more on dynamic techniques. In this book, scheduling problems are examined from the perspective of Computer Science.
2.1
TASK MODELS, ASSUMPTIONS AND NOTATION
Real-time systems can be quite complex, with many different types of tasks, time and reliability requirements, and metrics. The basic terminology, assumptions, and notation used throughout the book are now defined and described. Other terminology is introduced in later Chapters when it applies to a particular algorithm or system configuration.
Definition 2.1 A real-time task is an executable entity of work which, at a minimum, is characterized by a worst case execution time and a time constraint.

Definition 2.2 A job is an instance of a task.

There are three types of real-time tasks: periodic, aperiodic, and sporadic. Each type normally gives rise to multiple jobs.
Definition 2.3 Periodic tasks are real-time tasks which are activated (released) regularly at fixed rates (periods). In keeping with common notation, the period is designated by T. Normally, periodic tasks have constraints which indicate that instances of them must execute once per period T. The time constraint for a periodic task is a deadline d that can be less than, equal to, or greater than the period. The most common case is when the deadline equals the period.

Definition 2.4 Synchronous periodic tasks are a set of periodic tasks where all first instances are released at the same time, usually considered time zero.

Definition 2.5 Asynchronous periodic tasks are a set of periodic tasks where tasks can have their first instances released at different times.

Definition 2.6 Aperiodic tasks are real-time tasks which are activated irregularly at some unknown and possibly unbounded rate. The time constraint is usually a deadline d.
Definition 2.7 Sporadic tasks are real-time tasks which are activated irregularly with some known bounded rate. The bounded rate is characterized by a minimum interarrival period, that is, a minimum interval of time between two successive activations. This is necessary (and achieved by some form of flow control) in order to bound the workload generated by such tasks. The time constraint is usually a deadline d.

Definition 2.8 A hybrid task set is a task set containing both periodic and sporadic tasks.

Time constraints can be release times or deadlines, or both.
Definition 2.9 A release time, r, is a point in time at which a real-time job becomes ready to (or is activated to) execute.

Definition 2.10 A deadline, d, is a point in time by which the task (job) must complete. Usually, a deadline d is an absolute time. Sometimes, d is also used to refer to a relative deadline when there is no confusion. To emphasize relative deadlines, D is used.

The deadline can be hard, soft, or firm.
Definition 2.11 A hard deadline means that it is vital for the safety of the system that this deadline is always met.

Definition 2.12 A soft deadline means that it is desirable to finish executing the task (job) by the deadline, but no catastrophe occurs if there is a late completion.

Definition 2.13 A firm deadline means that a task should complete by the deadline, or not execute at all. There is no value to completing the task after its deadline.

Accordingly, real-time tasks are often distinguished as hard, soft, and firm tasks. Sometimes, soft tasks do not have deadlines at all. Their requirement is then to complete as soon as possible.
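To summarize, the task and deadline taxonomy above can be captured in a small model (a hypothetical illustration; the class and field names are our own, not the book's notation):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Criticality(Enum):
    HARD = "hard"    # missing the deadline endangers the system
    FIRM = "firm"    # a late result has no value; abandon instead
    SOFT = "soft"    # a late completion is undesirable but tolerable

@dataclass
class Task:
    wcet: int                     # worst case execution time C_i
    relative_deadline: int        # relative deadline D_i
    criticality: Criticality
    period: Optional[int] = None  # T_i for periodic/sporadic tasks

# A periodic hard task whose deadline equals its period.
sensor = Task(wcet=2, relative_deadline=6, criticality=Criticality.HARD, period=6)
assert sensor.relative_deadline == sensor.period  # the most common case: D_i = T_i
```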
Scheduling constraints are sometimes expressed with respect to tasks, when they do not depend on particular instances, and with respect to jobs otherwise. With these definitions presented, the notation used in the remainder of the book is as follows. The i-th task in the system is denoted by τ_i. Its j-th instance is denoted by J_{i,j}. Sometimes it is only important to distinguish jobs and it is not important which tasks they are instances of. In these cases, J_i is used to denote the i-th unique job. In two clearly marked areas of the book, J is redefined to mean the amount of jitter that a job experiences. This should not cause any confusion. Each task usually has a worst case or maximum execution time C_i. A periodic task also has a period, denoted by T_i. The minimum interarrival time of sporadic tasks is also designated by T_i.
Definition 2.14 A job has release time r if its execution can begin only at time t ≥ r.
Definition 2.15 A job has deadline d if its execution must complete by d.
The release time of the j-th job of the periodic task τ_i is most commonly given as:

r_{i,j} = (j − 1)T_i,

and its deadline is:

d_{i,j} = r_{i,j} + T_i = jT_i,

that is, the deadline of one instance is the release time of the next instance. For sporadic tasks, the assumption is that the release times of two consecutive instances must be separated by at least the minimum interarrival time, that is:

r_{i,j} ≥ r_{i,j−1} + T_i.

The deadline is often assumed to be equal to the earliest possible release time of the next instance, that is:

d_{i,j} = r_{i,j} + T_i.
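These formulas translate directly into code; the sketch below (invented helper names, 1-based job index j as in the text) computes periodic release times and deadlines and checks the sporadic interarrival constraint:

```python
def periodic_release(j: int, T: int) -> int:
    """Release time r_{i,j} = (j - 1) * T_i of the j-th job (j >= 1)."""
    return (j - 1) * T

def periodic_deadline(j: int, T: int) -> int:
    """Deadline d_{i,j} = r_{i,j} + T_i = j * T_i: the release of the next job."""
    return j * T

# Example: a periodic task with period T_i = 6.
assert periodic_release(1, 6) == 0 and periodic_deadline(1, 6) == 6
assert periodic_release(4, 6) == 18 and periodic_deadline(4, 6) == 24

def sporadic_releases_ok(releases: list, T: int) -> bool:
    """Check r_{i,j} >= r_{i,j-1} + T_i for consecutive sporadic arrivals."""
    return all(b - a >= T for a, b in zip(releases, releases[1:]))

assert sporadic_releases_ok([5, 17], 5)      # the arrivals used in Figure 2.1
assert not sporadic_releases_ok([5, 8], 5)   # violates the minimum interarrival time
```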
An example of an EDF schedule is depicted in Figure 2.1. The first task in the schedule, which is sporadic, has maximum execution time 2 and minimum interarrival time 5. The example shows two arrivals of τ_1, at times 5 and 17.
Figure 2.1 Example of EDF schedule. [The figure shows the schedule on three horizontal time axes over the interval [0, 24], one axis per task, with parameters C_1 = 2, T_1 = 5 (sporadic), C_2 = 3, T_2 = 6, and C_3 = 2, T_3 = 8.]
The other two tasks are periodic; they have maximum execution times 3 and 2, and periods 6 and 8, respectively. The schedule is represented on three horizontal time axes, one for each task. Along the axes, release times of sporadic task instances are represented by upward arrows, while deadlines are represented by downward arrows. For instance, the first job of τ_1 has release time 5 and deadline 10. Deadlines of periodic task instances are represented by downward arrows as well. Release times are usually represented by a vertical segment. When they coincide with deadlines and there is no ambiguity they are not shown. For instance, the first job of τ_2 is released at time t = 0 and has deadline 6; 6 is also the release time of the second job. The assignment of jobs to the processor is represented by filled rectangular boxes drawn along the axes. For instance, at time t = 0 the job with the earliest deadline in the system is J_{2,1}. This job gets the processor and completes at time t = 3. At this point the processor is assigned to J_{3,1}. An example of preemption is represented at time t = 17. At time t = 16 the processor is assigned to J_{3,3}. At time t = 17, J_{1,2} is released. J_{1,2} has deadline 22 and becomes the job with the earliest deadline in the system, hence it preempts the execution of J_{3,3} and gets the processor. When it completes at time t = 19, J_{3,3} can resume its execution. Note that meanwhile J_{2,4} has also been released. However, its deadline is equal to the deadline of J_{3,3} and it is executed later. Actually, ties could be broken arbitrarily.
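The schedule of Figure 2.1 can be reproduced with a small discrete-time simulator (an illustrative sketch, not code from the book; the function name, job encoding, and FIFO tie-break are our own choices):

```python
import heapq

def edf_preemptive(jobs, horizon):
    """Simulate preemptive EDF on a uni-processor over unit time slots.
    jobs: list of (release, wcet, abs_deadline) tuples.
    Returns {job_index: completion_time}. Deadline ties are broken by
    release time (FIFO), matching the behavior described in the text."""
    remaining = {i: c for i, (_, c, _) in enumerate(jobs)}
    by_release = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    ready, finish, k = [], {}, 0
    for t in range(horizon):
        # Add every job released by time t to the ready queue.
        while k < len(by_release) and jobs[by_release[k]][0] <= t:
            i = by_release[k]
            heapq.heappush(ready, (jobs[i][2], jobs[i][0], i))
            k += 1
        if ready:  # run the pending job with the earliest deadline
            _, _, i = ready[0]
            remaining[i] -= 1
            if remaining[i] == 0:
                heapq.heappop(ready)
                finish[i] = t + 1
    return finish

# The job set of Figure 2.1 (indices 0-1: tau_1; 2-5: tau_2; 6-8: tau_3).
jobs = [(5, 2, 10), (17, 2, 22),                            # sporadic, T_1 = 5
        (0, 3, 6), (6, 3, 12), (12, 3, 18), (18, 3, 24),    # periodic, T_2 = 6
        (0, 2, 8), (8, 2, 16), (16, 2, 24)]                 # periodic, T_3 = 8
done = edf_preemptive(jobs, 25)
assert all(done[i] <= jobs[i][2] for i in range(len(jobs)))  # all deadlines met
assert done[1] == 19 and done[8] == 20   # J_{1,2} preempts J_{3,3} at t = 17
```

With the tie on deadline 24 broken in favor of the earlier-released J_{3,3}, the simulator reproduces the behavior described above: J_{3,3} is preempted at t = 17, resumes at t = 19, and J_{2,4} runs last.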
Another key issue in real-time scheduling involves the underlying assumptions made. Initially, in Chapter 3, the assumptions made are the same as those made by Liu and Layland [9], because theirs is the seminal paper on real-time scheduling. These assumptions are:
• A1: All hard tasks are periodic.

• A2: Jobs are ready to run at their release times.

• A3: Deadlines are equal to periods.

• A4: Jobs do not suspend themselves.

• A5: Jobs are independent in that there are neither synchronizations between them, nor shared resources other than the CPU, nor relative dependencies or constraints on release times or completion times.

• A6: There are no overhead costs for preemption, scheduling, or interrupt handling.

• A7: Processing is fully preemptable at any point.
These assumptions are acceptable as a first step in the study of real-time scheduling theory. However, they are not practical and are not adequate for the analysis of most actual systems. For this reason, one goal of this book is to present results in which one or more of these assumptions is relaxed. For example, in addition to timing constraints, a task may also possess the following types of constraints and requirements:

• Resource constraints - A task may require access to certain resources other than the CPU, such as I/O devices, networks, data structures, files, and databases [4, 11].

• Precedence relationships - A complex task, for example, one requiring access to many resources, is better handled by breaking it up into multiple subtasks related by precedence constraints, each requiring a subset of the resources.

• Concurrency constraints - Tasks are allowed concurrent access to resources provided the consistency of the resources is not violated.

• Communication requirements - Sets of cooperating tasks are the norm for distributed, hard real-time systems. The communication requirements are a function of the semantics of the communication (synchronous, asynchronous) and of their timing requirements.

• Placement constraints - When multiple instances of a task are executed for fault-tolerance, the different instances should be executed on different processors.

• Criticalness - Depending on the functionality of a task, meeting the deadline of one task may be considered more critical than another. For example, a task that reacts to an emergency situation, such as a fire on the factory floor, is more critical than a task that controls the movements of a robot under normal operating conditions.
2.2
STATIC VERSUS DYNAMIC SCHEDULING
Most classical scheduling theory deals with static scheduling. Static scheduling refers to the fact that the scheduling algorithm has complete knowledge regarding the task set and its constraints, such as deadlines, computation times, precedence constraints, and future release times. This set of assumptions is realistic for many real-time systems. For example, real-time control of a simple laboratory experiment or a simple process control application might have a fixed set of sensors and actuators, and a well defined environment and set of processing requirements. In these types of real-time systems, the static scheduling algorithm operates on this set of tasks and produces a single schedule that is fixed for all time. Sometimes there is confusion regarding future release times. If all future release times are known when the algorithm is developing the schedule, then it is still a static algorithm.

In contrast, a dynamic scheduling algorithm (in the context of this book) has complete knowledge of the currently active set of tasks, but new arrivals may occur in the future, not known to the algorithm at the time it is scheduling the current set. The schedule therefore changes over time. Dynamic scheduling is required for real-time systems such as teams of robots cleaning up a chemical spill or military command and control applications. Fewer theoretical results are known about real-time dynamic scheduling algorithms than about static algorithms.
Off-line scheduling is often equated to static scheduling, but this is wrong. In building any real-time system, off-line scheduling (analysis) should always be done regardless of whether the final runtime algorithm is static or dynamic. In many real-time systems, the designers can identify the maximum set of tasks with their worst case assumptions and apply a static scheduling algorithm to produce a static schedule. This schedule is then fixed and used on-line with well understood properties such as, given that all the assumptions remain true, all tasks meet the deadlines. In other cases, the off-line analysis might produce a static set of priorities to use at runtime. The schedule itself is not fixed, but the priorities that drive the schedule are fixed. This is common in the rate monotonic approach. If the real-time system is operating in a more dynamic environment, then it
is not feasible to meet the assumptions of static scheduling (i.e., everything is known a priori). In this case an algorithm is chosen and analyzed off-line for the expected dynamic environmental conditions. Usually, less precise statements about the overall performance can be made. On-line, this same dynamic algorithm executes.

Generally, a scheduling algorithm (possibly with some modifications) can be applied to static scheduling or dynamic scheduling and used off-line or on-line. The important difference is what is known about the performance of the algorithm in each of these cases. As an example, consider earliest deadline first (EDF) scheduling. When applied to static scheduling it is known that EDF is optimal in many situations (to be enumerated in this book), but when applied to dynamic scheduling on multi-processors it is not optimal; in fact, it is known that no algorithm can be optimal.

Predictability is one of the primary issues in real-time systems. Schedulability analysis or feasibility checking of the tasks of a real-time system has to be done to predict whether the tasks meet their timing constraints. Several scheduling paradigms emerge, depending on (a) whether a system performs schedulability analysis, (b) if it does, whether it is done statically or dynamically, and (c) whether the result of the analysis itself produces a schedule or plan according to which tasks are dispatched at run-time. Based on this, the following classes of algorithms are identified:

• Static table-driven approaches: These perform static schedulability analysis and the resulting schedule (or table, as it is usually called) is used at run-time to decide when a task must begin execution.

• Static priority-driven preemptive approaches: These perform static schedulability analysis, but unlike in the previous approach, no explicit schedule is constructed. At run-time, tasks are executed highest-priority-first.

• Dynamic planning-based approaches: Unlike the previous two approaches, feasibility is checked at run-time, i.e., a dynamically arriving task is accepted for execution only if it is found feasible, i.e., it will make its deadline. Such a task is said to be guaranteed to meet its time constraints. This is sometimes called admission control. One of the results of the feasibility analysis is a schedule or plan that is used to decide when a task can begin execution. However, similar to the static case, the feasibility check and schedule creation can be separated. For example, in classical real-time systems it has been common that a schedule is created, while in real-time multimedia scheduling it is common to separate the feasibility check from the scheduling.

• Dynamic best-effort approaches: In this approach no feasibility checking is done. The system tries to do its best to meet deadlines. But since no guarantees are provided, a task may be aborted during its execution.
It must be pointed out that even though four categories have been identified, some scheduling techniques possess characteristics that span multiple paradigms. Each of these categories is now briefly elaborated.

Static table-driven approaches are applicable to tasks that are periodic (or have been transformed into periodic tasks by well known techniques). Given task characteristics, a table is constructed, using one of many possible techniques (e.g., various search heuristics), that identifies the start and completion times of each task, and tasks are dispatched according to this table. This is a highly predictable approach, but it is highly inflexible, since any change to the tasks and their characteristics may require a complete overhaul of the table.
The approach traditionally used in non-real-time systems is the priority-based preemptive scheduling approach. Here, tasks have priorities that may be statically or dynamically assigned, and at any time the task with the highest priority executes. It is the latter requirement that necessitates preemption: if a low priority task is in execution and a higher priority task arrives, the former is preempted and the processor is given to the new arrival. If priorities are assigned systematically in such a way that timing constraints can be taken into account, then the resulting scheduler can also be used for real-time systems. For example, using the rate-monotonic approach [9], utilization bounds can be derived such that if a set of tasks does not exceed the bound, the tasks can
be scheduled without missing any deadlines using such a static priority-driven preemptive scheduler.

Cyclic scheduling, used in many large-scale dynamic real-time systems [3], is a combination of both table-driven scheduling and priority scheduling. Here, tasks are assigned one of a set of harmonic periods. Within each period, tasks are dispatched according to a table that just lists the order in which the tasks execute. It is slightly more flexible than the table-driven approach because no start times are specified, and it is amenable to a priori bound analysis if the maximum requirements of tasks in each cycle are known beforehand. However, pessimistic assumptions are necessary for determining these requirements. In many actual applications, rather than making worst-case assumptions, confidence in a cyclic schedule is obtained by very elaborate and extensive simulations of typical scenarios. This approach is both error-prone and expensive
[10].

The dynamic planning-based approaches provide the flexibility of dynamic approaches with some of the predictability of approaches that check for feasibility. Here, after a task arrives, but before its execution begins, an attempt is made to create a schedule that contains the previously guaranteed tasks as well as the new arrival. If the attempt fails and the attempt was made sufficiently ahead of the deadline, time is available to take alternative actions. This approach provides for predictability with respect to individual arrivals. In contrast, if a purely priority-driven preemptive approach is used, say, by using task deadlines as priorities, and without any planning, a task could be preempted at any time during its execution. In this case, until the deadline arrives, or until the task finishes, whichever comes first, it is not known whether the timing constraint is met. This is the major disadvantage of the dynamic best-effort approaches. If, however, the worst case performance characteristics of such a scheduler can be analyzed, then perhaps the worst case can be recognized and avoided. Such worst case analyses are in their infancy, being applicable only to tasks with very simple characteristics [2].
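The planning-based acceptance test can be made concrete for the simplest case, a uni-processor with preemptive EDF and all candidate jobs already released (an illustrative sketch with invented helper names, not an algorithm from the book):

```python
def admit(t, guaranteed, new_job):
    """guaranteed: list of (remaining_wcet, abs_deadline) already accepted.
    new_job: (wcet, abs_deadline). Accept only if EDF can finish everything:
    processing jobs in deadline order, the cumulative demand must fit before
    each deadline (sufficient and necessary on one preemptive processor
    when all jobs under consideration are already released)."""
    jobs = sorted(guaranteed + [new_job], key=lambda j: j[1])
    work = 0
    for c, d in jobs:
        work += c
        if t + work > d:
            return False  # reject: some deadline would be missed
    return True

# At t = 0 two jobs are guaranteed; a third arrival is admitted only if it fits.
pending = [(2, 4), (3, 10)]
assert admit(0, pending, (3, 9))      # fits: jobs finish at 2, 5, 8 by 4, 9, 10
assert not admit(0, pending, (3, 4))  # 3 more units cannot fit by time 4
```

A rejected job is known to be infeasible before it starts, so an alternative action (e.g., shedding it or degrading service) can be taken while there is still time.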
2.3
METRICS
Classical scheduling theory typically uses metrics such as minimizing the sum of completion times, minimizing the weighted sum of completion times, minimizing schedule length, minimizing the number of processors required, or minimizing the maximum lateness. In most cases, deadlines are not even considered in these results. When deadlines are considered, they are usually added as constraints, where, for example, one creates a minimum schedule length subject to the constraint that all tasks must meet their respective deadlines. If one or more tasks miss their deadlines, then there is no feasible solution.

Which of these classical metrics (where deadlines are not included as constraints) are of most interest to real-time systems designers? The sum of completion times is generally not of interest because there is no direct assessment of timing properties (deadlines or periods). However, the weighted sum is very important when tasks have different values that they impart to the system upon completion. Using value is often (erroneously) overlooked in many real-time systems where the focus is simply on deadlines and not on a combination of value and deadline. Minimizing schedule length has secondary importance in possibly helping minimize the resources required for a system, but does not directly address the fact that individual tasks have deadlines. The same is true for minimizing the number of processors required. Minimizing the maximum lateness metric can be useful at design time, where resources can be continually added until the maximum lateness is equal to zero; in this case no tasks miss their deadlines. On the other hand, the metric is not always useful because minimizing the maximum lateness doesn't necessarily prevent one, many, or even ALL tasks from missing their deadlines.
In the static real-time scheduling problem, an off-line schedule is to be found that meets all deadlines. If more than one such schedule exists, a secondary metric, such as maximizing the average earliness, is used to choose among them. When a task completes, its earliness is the amount of time still remaining before its deadline. If no such schedule exists, one which minimizes the average tardiness or lateness may be chosen. Tardiness is the amount of time by which a task misses its deadline. In these cases, an algorithm's ability to achieve optimality is with respect to these secondary metrics. In real-time systems, scheduling results are often presented in terms of schedulability or feasibility analysis.

Definition 2.16 A set of jobs is schedulable or feasible if all timing constraints are met, that is, all hard real-time jobs complete by their respective deadlines.

Definition 2.17 An optimal real-time scheduling algorithm is one which may fail to meet a deadline only if no other scheduling algorithm can meet the deadline.
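For completed jobs these metrics are simple functions of finish time and deadline; a minimal sketch (hypothetical helper names) makes their relationships explicit:

```python
def lateness(finish, deadline):
    """l = f - d; negative when the job completes early."""
    return finish - deadline

def earliness(finish, deadline):
    """Time still remaining before the deadline at completion (0 if late)."""
    return max(deadline - finish, 0)

def tardiness(finish, deadline):
    """Amount by which the deadline is missed (0 if met)."""
    return max(finish - deadline, 0)

finish_times = [3, 7, 12]
deadlines    = [5, 7, 10]
assert [lateness(f, d) for f, d in zip(finish_times, deadlines)] == [-2, 0, 2]
assert [tardiness(f, d) for f, d in zip(finish_times, deadlines)] == [0, 0, 2]
# Maximum lateness over the set: here positive, so at least one deadline missed.
assert max(lateness(f, d) for f, d in zip(finish_times, deadlines)) == 2
```

Note that tardiness is the positive part of lateness and earliness the positive part of its negation, which is why a maximum lateness of zero or less implies that no task misses its deadline.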
This definition of optimality is the typical one used in real-time scheduling. A common (non-real-time) definition of optimality says that an algorithm is optimal if it minimizes (maximizes) some cost function. It is important to be familiar with both definitions.

In dynamic real-time systems, since, in general, it cannot be guaranteed a priori that all deadlines are met, maximizing the number of arrivals that meet their deadlines is often used as a metric. Some of the results presented utilize the metric of minimizing the number of tasks that miss their deadlines, which is the dual of maximizing the number that meet their deadlines. The variety of metrics that have been suggested for real-time systems is indicative of the different types of real-time systems that exist in the real world as well as the types of requirements imposed on them. This sometimes makes it hard to compare different scheduling algorithms.

Related to metrics is the complexity of the various scheduling problems themselves. Many scheduling problems are NP-complete or NP-hard [4]. NP is the class of all decision problems that can be solved in polynomial time by a nondeterministic machine. A recognition problem R is NP-complete if R ∈ NP and all other problems in NP are polynomial transformable to R. A recognition or optimization problem R is NP-hard if all problems in NP are polynomial transformable to R, but it cannot be shown that R ∈ NP. The complexity of the various problems presented in this book is mentioned throughout. The reader should take special note throughout the text regarding the types of task constraints that move the scheduling problem from P to NP; e.g., in some problem situations allowing preemption moves a problem from NP-hard to polynomial, and in other problems adding a release time constraint might move the problem from polynomial to NP-hard.
REFERENCES
[1] N. Audsley, A. Burns, M. Richardson, and A. Wellings, "Hard Real-Time Scheduling: The Deadline Monotonic Approach," IEEE Workshop on Real-Time Operating Systems, 1992.
[2] S. Baruah, G. Koren, D. Mao, B. Mishra, A. Raghunathan, L. Rosier, D. Shasha, and F. Wang, "On the Competitiveness of On-Line Real-Time Task Scheduling," Proceedings of the Real-Time Systems Symposium, December 1991.
[3] G. Carlow, "Architecture of the Space Shuttle Primary Avionics Software System," CACM, 27(9), September 1984.
[4] M. Garey and D. Johnson, "Complexity Results for Multiprocessor Scheduling Under Resource Constraints," SIAM Journal of Computing, 1975.
[5] R. Graham, Bounds on the Performance of Scheduling Algorithms, chapter in Computer and Job Shop Scheduling Theory, John Wiley and Sons, pp. 165-227, 1976.
[6] E. Lawler, "Optimal Sequencing of a Single Machine Subject to Precedence Constraints," Management Science, 19, 1973.
[7] E.L. Lawler, "Recent Results in the Theory of Machine Scheduling," Mathematical Programming: the State of the Art, A. Bachem et al. (eds.), Springer-Verlag, New York, 1983.
[8] J. Lenstra and A. Rinnooy Kan, "Optimization and Approximation in Deterministic Sequencing and Scheduling: A Survey," Ann. Discrete Math. 5, pp. 287-326, 1977.

[9] C. Liu and J. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the Association for Computing Machinery, 20(1), 1973.
[10] D. Locke, "Software Architectures for Hard Real-Time Applications: Cyclic Executives versus Fixed Priority Executives," Real-Time Systems, Vol. 4, No. 2, March 1992.

[11] W. Zhao, K. Ramamritham and J. Stankovic, "Scheduling Tasks with Resource Requirements in Hard Real-Time Systems," IEEE Transactions on Software Engineering, Vol. SE-12, No. 5, 1987.
3 FUNDAMENTALS OF EDF SCHEDULING
This chapter discusses the basic results for the EDF scheduling algorithm regarding optimality and feasibility analysis. The optimality of a real-time scheduling algorithm means that whenever a task set can be scheduled to meet all its deadlines, then it will be feasibly scheduled by the optimal algorithm. Usually, a real-time scheduling algorithm must guarantee a priori that all deadlines of a particular task set are met. The problem is thus to establish whether a given task set can be feasibly scheduled by the chosen algorithm. In the literature, a solution to this problem is termed feasibility analysis. In this chapter, the optimality of the EDF algorithm, and a feasibility analysis for task sets when EDF scheduling is assumed, are described in detail.

The description in this chapter and the next is meant to present a basic, but comprehensive, theory of the EDF algorithm for independent tasks. The results are presented for task models of increasing complexity, but always retaining the notion of independent tasks. Three different concepts used in the analysis are described: processor utilization, processor demand, and busy period. Each concept is valuable in different contexts. In the remainder of the book more sophisticated task models are presented, including task sets with dependencies of various types. One practical result of the basic theory presented in Chapters 3 and 4 is an algorithm that real-time system designers can use to analyze the feasibility of EDF scheduled systems.
3.1
OPTIMALITY ON UNI-PROCESSOR SYSTEMS
The first result concerning the optimality of the EDF scheduling algorithm was originally given in [13]. In this work the reference (system) model is quite simple: there are n independent jobs in the system (i.e., n jobs with a single instance each), all ready at time t = 0, with each job J_i having a deadline d_i. For any given scheduling sequence, the lateness of a job i is defined as l_i = f_i − d_i, where f_i is its completion time. If the goal is to minimize the maximum lateness of all jobs, assuming the schedule is non-preemptive, a simple solution is the earliest deadline first algorithm:
Theorem 3.1 (Jackson's Rule) Any sequence that puts the jobs in order of non-decreasing deadlines is optimal.
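Jackson's rule is easy to check exhaustively on a tiny instance; the sketch below (invented job data) compares the non-decreasing-deadline order against all permutations:

```python
from itertools import permutations

def max_lateness(order, jobs):
    """jobs: {name: (wcet, deadline)}; all jobs ready at t = 0, no preemption."""
    t, worst = 0, float("-inf")
    for name in order:
        c, d = jobs[name]
        t += c                       # job runs to completion at time t
        worst = max(worst, t - d)    # lateness l_i = f_i - d_i
    return worst

jobs = {"a": (2, 5), "b": (1, 2), "c": (3, 9)}
edf_order = sorted(jobs, key=lambda n: jobs[n][1])   # Jackson's rule
best = min(max_lateness(p, jobs) for p in permutations(jobs))
assert max_lateness(edf_order, jobs) == best         # no order does better
```

Brute force over all 3! orders confirms that the deadline-sorted sequence attains the minimum maximum lateness on this instance.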
The proof of the theorem can be given by a simple interchange argument; however, it is not presented here, since it is similar to other arguments described later, and applicable to more general optimality results.

Now consider some changes to the reference model. When release times are introduced in the model, the problem becomes NP-hard [18]. However, if preemption is allowed, even with release times the scheduling problem remains easy, and, in particular, the EDF algorithm is one possible solution. Note that release times are complications to task models which make scheduling problems difficult, and that allowing preemption tends to make the scheduling problems easier.

The optimality of the EDF preemptive scheduling algorithm was first described for a set of synchronous periodic tasks (i.e., all tasks share the same start (release) time) by Liu and Layland [21], whose paper is considered a milestone in the field of real-time scheduling. In particular, any synchronous periodic task set, with deadlines equal to their respective periods, is feasibly scheduled by EDF if and only if the processor utilization U = Σ_{i=1}^{n} C_i/T_i is not larger than 1. The optimality is given by the fact that the condition is necessary for any algorithm. The condition is also sufficient for feasibility under EDF scheduling.
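The Liu and Layland condition translates directly into a one-line feasibility test (a sketch; the function name is our own, and tasks are given as (C_i, T_i) pairs):

```python
def edf_feasible_implicit_deadlines(tasks):
    """Liu-and-Layland test: a synchronous periodic task set with deadlines
    equal to periods is EDF-schedulable iff U = sum(C_i / T_i) <= 1."""
    return sum(c / t for c, t in tasks) <= 1.0

assert edf_feasible_implicit_deadlines([(1, 4), (2, 6), (3, 10)])    # U ~ 0.88
assert not edf_feasible_implicit_deadlines([(2, 4), (2, 6), (2, 5)]) # U ~ 1.23
assert edf_feasible_implicit_deadlines([(1, 2), (1, 2)])             # U exactly 1
```

Note that the test is exact only for this task model (synchronous release, deadlines equal to periods, preemption, a single processor); relaxing any of these assumptions requires the more refined analyses developed later in the book.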
The optimality result (though not the feasibility condition) was later extended to asynchronous periodic task sets, with D_i ≤ T_i for any i, where D_i is a relative deadline, by Labetoulle [17]. (The proof is given in a later section.) Another result is the proof of Dertouzos [7], in which the optimality of the EDF scheduling algorithm (Theorem 3.2) is shown for tasks with:

• arbitrary release times and deadlines, and

• arbitrary and unknown (to the scheduler) execution times.
Theorem 3.2 (Dertouzos) The EDF algorithm is optimal in that if there exists any algorithm that can build a valid (feasible) schedule on a single processor, then the EDF algorithm also builds a valid (feasible) schedule.
Proof. By using a "time slice swapping" technique, it can be shown that any valid schedule for the task set can be transformed into a valid EDF schedule. In particular, by induction on t, the transformation is shown for any interval [0, t) (note that all the parameters of the problem are assumed to be integers). The theorem is trivially true for t = 0. Assume now that it is true for the interval [0, t), that a task's instance with absolute deadline d_j is executed in the interval [t, t + 1), and that the earliest deadline among all instances pending at time t is d_i < d_j. Let t' be the first time at which the instance with deadline d_i is executed after t. By definition, t < t'. Furthermore, since this is a valid schedule, t' < d_i < d_j. It follows that by swapping the executions in the intervals [t, t + 1) and [t', t' + 1), a valid EDF schedule is obtained in the interval [0, t + 1). □

Note that with a similar "time slice swapping" technique, the Least Laxity First (LLF) algorithm was also proven optimal by Mok [23]. However, the LLF algorithm has the disadvantage of a potentially very large number of preemptions, and it is no longer optimal if preemption is not allowed [10].

When preemption is not allowed, the scheduling problem is known to be NP-hard. However, if only non-idling schedulers are considered, the problem is again tractable. Namely, a scheduler is non-idling if it is not allowed to leave the processor idle whenever there are pending jobs. In this subclass of non-preemptive schedulers, the EDF algorithm is optimal. This result was first shown by Kim and Naghibzadeh [16], who term the non-preemptive non-idling EDF algorithm the relative urgency non-preemptive (RUNP) strategy.
CHAPTER 3
Theorem 3.3 (Kim and Naghibzadeh) The RUNP strategy is an optimal non-preemptive scheduling strategy in the sense that if a system runs without a task overrun under any non-preemptive scheduling strategy, the system also runs without a task overrun under the RUNP strategy.
The theorem is proven assuming systems of sporadic tasks, with relative deadlines equal to the respective minimum interarrival times (D_i = T_i in our notation). The generalization to more general task sets, that is, the equivalent of Dertouzos' theorem, is due to George et al. [9].
Theorem 3.4 (George et al.) Non-preemptive non-idling EDF is optimal.
Proof. By using a "swapping" technique, it is shown that any finite valid schedule can be transformed into a valid EDF schedule. In particular, let t_1 and t_2 be the execution start times of two successive jobs in the valid schedule. Assume both jobs were pending at t_1, and d_1 > d_2, where d_1 and d_2 are the absolute deadlines of the two jobs, respectively. That is, at t_1 the scheduler has not chosen the job with the earliest deadline. If the executions of the two jobs are swapped, the resulting schedule is still valid, since due to the condition d_1 > d_2 the two jobs are still completed by their respective deadlines, while the rest of the schedule remains unchanged. In a finite number of such swaps the schedule is transformed into a valid EDF schedule. □
What Theorems 3.2 and 3.4 show is that the EDF algorithm theoretically dominates any other in the field of real-time uni-processor scheduling when there is no system overload and all jobs are independent. Other practical considerations may reduce its advantages, but its optimality still makes it one of the best choices for real-time system designers. It is also worth remarking that the EDF algorithm is not only optimal in the sense of Theorems 3.2 and 3.4, but also "under various stochastic conditions" [28, 11]; these results, however, are not treated in this book. Furthermore, because of EDF optimality, in the following sections the terms "feasibility" and "feasibility under EDF scheduling" are used interchangeably.
Fundamentals of EDF Scheduling

3.2 FEASIBILITY ANALYSIS
This section focuses on techniques for the assessment of task set feasibility under EDF scheduling. To assess the feasibility of a task set means to establish whether the task deadlines are always going to be met. Historically, the first feasibility analysis for synchronous periodic task sets was given by Liu and Layland [21]. Afterwards, new approaches have been introduced in order to relax some of the assumptions and to analyze more complex task sets. At present, sophisticated analysis procedures exist that are able to precisely assess the feasibility of "almost" any task set in pseudo-polynomial time. Whether the problem can be solved in fully polynomial time is still an open question [2]. In this section, the most important results concerning the feasibility analysis of real-time task sets are described. The discussion spans from the relatively simple model of synchronous periodic task sets, with relative deadlines equal to their respective periods, to more complex models in which extensions like deadlines not related to task periods, release jitter, and system overheads are taken into account. The analysis of non-preemptive non-idling systems is also described.
3.2.1 The Notion of Loading Factor
Two concepts are particularly helpful when analyzing the feasibility of real-time task (job) sets: the processor demand and the loading factor. The processor demand is a focused measure of how much computation is requested, with respect to timing constraints, in a given interval of time, while the loading factor is the maximum fraction of processor time possibly demanded by the task (job) set in any interval of time.
Definition 3.1 Given a set of real-time jobs and an interval of time [t_1, t_2), the processor demand of the job set on the interval [t_1, t_2) is

    h_{[t_1,t_2)} = Σ_{t_1 ≤ r_k, d_k ≤ t_2} C_k,

where r_k, d_k, and C_k are the release time, absolute deadline, and execution time of job J_k, respectively.
Table 3.1  Loading factor computation for the job set of Figure 3.1 (loading factors on the most significant intervals [3,18), [5,12), [5,14), and [6,14)).
Namely, the processor demand on [t_1, t_2) represents the amount of computation time that is requested by all jobs with release time at or after t_1 and deadline before or at t_2.
Definition 3.2 Given a set of real-time jobs, its loading factor on the interval [t_1, t_2) is the fraction of the interval needed to execute its jobs, that is,

    u_{[t_1,t_2)} = h_{[t_1,t_2)} / (t_2 − t_1),

and its

Definition 3.3 Absolute loading factor, or simply loading factor, is the maximum over all possible intervals, that is,

    u = sup_{0 ≤ t_1 < t_2} u_{[t_1,t_2)}.
In other words, a job set (i.e., a set of task instances) has loading factor u if in each interval of time [t_1, t_2) the maximum amount of demanded cpu time is at most u(t_2 − t_1). For example, the loading factor of the job set of Figure 3.1 is computed in Table 3.1. Note that only the computation of the loading factor on the most significant intervals is shown. It is easy to verify that the loading factor on any other interval is less than those shown in Table 3.1. Intuitively, a necessary condition for feasibility under any scheduling algorithm is that the loading factor is not greater than 1. In fact, not only is this claim true, but the condition is also sufficient for feasibility under EDF scheduling [25].
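Definitions 3.1 through 3.3 translate directly into code for a finite job set. A minimal sketch (the helper names and the sample jobs are illustrative, not from the text); it relies on the observation that the supremum of Definition 3.3 is attained on an interval delimited by a release time and a deadline, so only those endpoints need to be enumerated:

```python
from fractions import Fraction

def demand(jobs, t1, t2):
    """Processor demand h_[t1,t2) (Definition 3.1): total execution time
    of the jobs released at or after t1 with deadline at or before t2."""
    return sum(c for (r, d, c) in jobs if t1 <= r and d <= t2)

def loading_factor(jobs):
    """Loading factor u (Definition 3.3): the maximum of
    h_[t1,t2) / (t2 - t1) over all intervals; for a finite job set it
    suffices to let t1 range over release times and t2 over deadlines."""
    releases = sorted({r for (r, _, _) in jobs})
    deadlines = sorted({d for (_, d, _) in jobs})
    return max(Fraction(demand(jobs, t1, t2), t2 - t1)
               for t1 in releases for t2 in deadlines if t2 > t1)
```

For the two jobs (release, deadline, C) = (0, 5, 2) and (2, 6, 3), the critical interval is [0, 6) and the loading factor is 5/6.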
Figure 3.1  A set of three real-time jobs.
Theorem 3.5 (Spuri) Each set of real-time jobs is feasibly scheduled by EDF if and only if u ≤ 1.

Proof. "If": Assume there is an overflow, that is, a deadline miss, at time t. The overflow must be preceded by a cpu busy period, that is, a period of continuous processor utilization, in which only jobs with deadlines less than t are executed. Let t_2 = t and t_1 be the last instant preceding t such that there are no pending execution requests of jobs released before t_1 and having deadlines less than or equal to t. Both t_1 and t_2 are well defined. See Figure 3.2 for an example. In particular, after t_1, which must be the release time of some job, the processor is allocated to jobs released after t_1 and having deadlines less than t_2. Since there is an overflow at t_2, the amount of cpu time demanded in the interval [t_1, t_2) must be greater than the interval itself, that is,

    Σ_{t_1 ≤ r_k, d_k ≤ t_2} C_k > t_2 − t_1.

It follows that u_{[t_1,t_2)} > 1, hence u > 1, a contradiction.
Figure 3.2  EDF schedule (with time overflow) of the job set of Figure 3.1.
"Only If": Since the schedule is feasible, the amount of time demanded in each interval of time must be less than or equal to the length of the interval, that is, for any interval [t_1, t_2),

    h_{[t_1,t_2)} ≤ t_2 − t_1.

It follows that u_{[t_1,t_2)} ≤ 1 on every interval, hence u ≤ 1. □
Note that the result just shown also confirms the optimality of the EDF algorithm for uni-processor systems. From this point of view the theorem is equivalent to that by Dertouzos, previously described in Section 3.1. The main outcome of the theorem, however, is that the problem of assessing the feasibility of a task set is equivalent to the problem of computing the loading factor of the same task set. This fact is used in the following sections to show several other results.
3.2.2 Synchronous and Asynchronous Periodic Tasks
As already mentioned, the first researchers to address the feasibility analysis of a task set under EDF scheduling were Liu and Layland [21]. In their remarkable work, the reference model consists of synchronous independent periodic tasks (i.e., all tasks share the same start time, which, without loss of generality, can be assumed to be 0), with relative deadlines equal to their respective periods (i.e., ∀i, D_i = T_i). With this hypothesis, a necessary and sufficient condition for the feasibility of a task set is that the processor utilization is not greater than 1. This result can now be easily seen as a consequence of Theorem 3.5.
Corollary 3.1 (Liu and Layland) Any set of n synchronous periodic tasks with processor utilization U = Σ_{i=1}^{n} C_i/T_i is feasibly scheduled by EDF if and only if U ≤ 1.
Proof. It is sufficient to show that the loading factor u of the task set (i.e., the loading factor of the instances generated by the task set) is equal to U. By Theorem 3.5 the thesis then follows. For any interval [t_1, t_2):

    h_{[t_1,t_2)} ≤ Σ_{i=1}^{n} ⌊(t_2 − t_1)/T_i⌋ C_i ≤ (t_2 − t_1) Σ_{i=1}^{n} C_i/T_i = (t_2 − t_1) U,

that is, u_{[t_1,t_2)} ≤ U. Now, let t_1 = 0 and t_2 = lcm(T_1, ..., T_n):

    u_{[t_1,t_2)} = (Σ_{i=1}^{n} (t_2/T_i) C_i) / t_2 = U.

It follows u = U. □
o In the previous corollary it is assumed that all the periodic tasks have a null initial phasing, that is, for each task Ti, the first instance is released at time
36
CHAPTER
'2
=0
10
n
10 II 12
n
16
19
3
22
C 2 =3 T 2 =5
o
Figure 3.3
I
3
4
5
6
7
,
9
14 15 16 17 18 192021 1123 2425
t
EDF schedule of two periodic tasks with different initial phasing.
t = O. Liu and Lay land's result is very simple and efficient to use, however, in
actual systems it may not be practical to start all periodic tasks at the same time. A first relaxation of their reference model is thus to let each task have its own start time. Namely, each task τ_i is allowed to have a start time s_i, the time at which the first job of the task is released. In this way, the k-th job of τ_i, J_{i,k}, has release time r_{i,k} = s_i + (k − 1)T_i and deadline

    d_{i,k} = s_i + (k − 1)T_i + D_i,

where D_i is the relative deadline of task τ_i. Again, such task sets, where the first instances may have non-zero start times, are termed asynchronous. It is still assumed that ∀i, D_i = T_i. See Figure 3.3 for an example of EDF scheduling with different initial phasing. Coffman [6] showed that the condition of having a processor utilization not exceeding 1 is still sufficient for the feasibility under EDF of asynchronous periodic task sets. His result, too, can be seen as a consequence of Theorem 3.5.

Corollary 3.2 (Coffman) Any set of n asynchronous periodic tasks with processor utilization U = Σ_{i=1}^{n} C_i/T_i is feasibly scheduled by EDF if and only if U ≤ 1.
Proof. Once again, it is sufficient to show that u = U. For any interval [t_1, t_2):

    h_{[t_1,t_2)} ≤ Σ_{i=1}^{n} ⌊(t_2 − t_1)/T_i⌋ C_i ≤ (t_2 − t_1) U,

that is, u_{[t_1,t_2)} ≤ U. Consider now the interval [0, s + mH), where m is an integer greater than 0, s = max{s_1, ..., s_n}, and H = lcm(T_1, ..., T_n):

    h_{[0,s+mH)} ≥ Σ_{i=1}^{n} (mH/T_i) C_i = mH Σ_{i=1}^{n} C_i/T_i = mHU.

It follows

    u_{[0,s+mH)} ≥ mHU / (s + mH),

and, with m arbitrary, u ≥ U, from which it can be finally concluded that u = U. By Theorem 3.5, the thesis follows. □
The importance of this result is that the feasibility condition introduced by Liu and Layland can be efficiently used in asynchronous systems as well, as long as the relative deadlines are equal to the respective periods. When relative deadlines are allowed to be shorter than the periods, however, the feasibility problem becomes much harder.
Theorem 3.6 (Leung and Merrill) The problem of deciding whether an asynchronous periodic task set with deadlines less than the periods is feasible on one processor is NP-hard.
The proof of the theorem is given by showing a polynomial reduction of the Simultaneous Congruences Problem to the given feasibility decision problem. The Simultaneous Congruences Problem is shown to be NP-complete by Leung and Whitehead [20]. Further, Baruah et al. [2] show that the Simultaneous Congruences Problem is NP-complete in the strong sense, which means that the feasibility decision problem is even harder than initially proven. This extra difficulty is shown in the following theorem.

Theorem 3.7 (Baruah et al.) The problem of deciding whether an asynchronous periodic task set with deadlines less than the periods is feasible on one processor is NP-hard in the strong sense.

The importance of this negative result is that it precludes the existence of pseudo-polynomial time algorithms for the solution of this feasibility decision problem, unless P=NP. In fact, the problem remains NP-hard in the strong sense even if the task sets are restricted to have processor utilization bounded above by any fixed positive constant. Asynchronous task sets (which are defined as having known start times for the tasks) are termed complete by Baruah et al. [2]. Complete task sets are in contrast to incomplete task sets, where start times are not specified. According to Baruah et al., an incomplete task set is feasible if there is some choice of start times such that the resulting complete task set is feasible. So far, it has been shown that the feasibility analysis of complete task sets is a very difficult problem. As can be easily guessed, the analysis of incomplete task sets is exponentially more difficult.

Theorem 3.8 (Baruah et al.) The problem of deciding whether an incomplete task set is feasible on one processor is Σ_2^p-complete.

According to the notation of Garey and Johnson [8], Σ_2^p is the class NP^NP.
The difficulty of this problem can be circumvented if the feasibility of incomplete task sets is defined in different terms. In fact, due to the strictness of timing constraints, real-time system design is usually based on a worst case analysis. If it is assumed that the actual start times are fixed by the run-time system, and hence are not known a priori, it seems reasonable to determine the feasibility of the task set in any possible scenario.
Definition 3.4 An incomplete task set is feasible if for any choice of start times the resulting complete task set is feasible.
With this new definition, the feasibility problem for incomplete task sets is slightly simplified. In fact, it turns out that, as our intuition suggests, the most constraining scenario is when all tasks share the same start time. This permits the restriction of attention to synchronous task sets. The proof of this fact, which is also common to sporadic task sets, is given in the following section.
3.2.3 Sporadic Tasks
Not all the activities of a real-time system can be modeled with strictly periodic tasks. Some tasks can be activated by external events or anomalous situations, which do not necessarily occur at a fixed rate. Thus, it is necessary to introduce into the reference model some form of aperiodic task, that is, a task released irregularly. The introduction of aperiodic tasks in a hard real-time system must be subject to some form of restriction, such as a maximum rate. If a guarantee on the execution of periodic task instances is still desired, as well as a deterministic responsiveness of the aperiodic workload, the computational bandwidth demanded by the aperiodic tasks must be restricted in some way. The goal is achieved by using the notion of sporadic tasks, a term introduced by Mok [23], although the concept was already known earlier (see for example [16]). Without ambiguity, the minimum interarrival time of a sporadic task τ_i is denoted T_i, as is the period of a periodic task. Similarly, C_i and D_i denote its worst case execution time and its relative deadline, respectively. Unless otherwise stated, no particular relation is assumed between periods and corresponding deadlines, which are thus arbitrary. According to the definition, the k-th instance of τ_i has release time

    r_{i,k} ≥ r_{i,k−1} + T_i
Figure 3.4  EDF schedule of two tasks: a sporadic one and a periodic one.
and deadline

    d_{i,k} = r_{i,k} + D_i.
See Figure 3.4 for an example of an EDF schedule with one sporadic task and one periodic task. Since the release time of any sporadic task instance is not known a priori, in order to guarantee the feasibility under any possible scenario, the definition of feasibility must take all situations into account.
Definition 3.5 A sporadic task set is feasible if for any choice of release times compatible with the specified minimum interarrival times, the resulting job set is feasible.
Note that this definition is quite similar to Definition 3.4, in that both refer to all possible situations allowed by the problem specifications. Note also that according to Dertouzos' theorem, the EDF algorithm is also optimal for sporadic task sets. Even if the feasibility of a sporadic task set may seem hard to study, it is not difficult to see that the analysis can be limited to a particular set of task instances [3]. In particular, it turns out that the worst possible scenario occurs when each sporadic task τ_i is activated synchronously (i.e., all tasks share the same start time, without loss of generality) and at its maximum rate, that is, it behaves like a synchronous periodic task.
Lemma 3.1 Given a set of n sporadic tasks τ, and the corresponding synchronous periodic task set τ′, for any set of instances of tasks in τ: u ≤ u′.
Proof. It is sufficient to prove that for any set of instances and any interval [t_1, t_2), there exists an interval [t′_1, t′_2) such that h_{[t_1,t_2)} ≤ h′_{[t′_1,t′_2)}. For any interval [t_1, t_2):

    h_{[t_1,t_2)} ≤ Σ_{D_i ≤ t_2 − t_1} (1 + ⌊(t_2 − t_1 − D_i)/T_i⌋) C_i = h′_{[0,t_2−t_1)}.

It follows u ≤ u′. □
The lemma proves that among all sets of instances possibly issued by a sporadic task set, those with synchronous periodic releases have the maximum loading factor. By Theorem 3.5, the feasibility of a sporadic task set is thus equivalent to the feasibility of the corresponding synchronous periodic task set, as previously addressed. This proves the following lemma.
Lemma 3.2 A sporadic task set is feasible if and only if the corresponding synchronous periodic task set is feasible.
Note that Lemma 3.1 applies also to incomplete periodic task sets, whose feasibility, as defined in Definition 3.4, can thus be similarly analyzed by looking at the corresponding synchronous complete periodic task sets.
Lemma 3.3 An incomplete task set is feasible if and only if the corresponding synchronous periodic task set is feasible.
According to the previous lemmas, the feasibility analysis of periodic and sporadic task sets can be unified, since in both cases it is reduced to the feasibility analysis of synchronous periodic task sets. In the following sections, all results are thus discussed with respect to hybrid task sets (i.e., task sets containing both periodic and sporadic tasks).
3.2.4 The Processor Demand Approach
Unfortunately, whether the problem of deciding the feasibility of a hybrid task set is tractable is still an open question. In fact, while polynomial or pseudo-polynomial time solutions are known for particular cases, it is also not clear whether the problem in its general formulation is NP-hard [2, 3]. A clear and intuitive necessary condition for the feasibility of any hybrid task set is that the processor utilization is not larger than 1 [3].

Theorem 3.9 (Baruah et al.) If a given hybrid task set is feasible under EDF scheduling, then U ≤ 1.
Proof. Consider the synchronous periodic task set corresponding to the given hybrid task set. If the processor utilization is shown to be not greater than its loading factor, by Theorem 3.5 the thesis follows. Let H = lcm(T_1, ..., T_n) and m be an integer greater than zero. In the interval [0, mH + D_max), where D_max is the maximum relative deadline, the processor demand is

    h_{[0,mH+D_max)} ≥ Σ_{i=1}^{n} (mH/T_i) C_i = mHU,

from which

    u_{[0,mH+D_max)} ≥ mHU / (mH + D_max).

With m arbitrary, it follows u ≥ U. □

It is not difficult to see that the condition of the previous theorem is no longer sufficient for the feasibility of generic hybrid task sets. But it is sufficient when relative deadlines are not shorter than the corresponding periods [3].
Theorem 3.10 (Baruah et al.) Any hybrid task set τ with D_i ≥ T_i, ∀i, is feasible under EDF scheduling if and only if U ≤ 1.

Proof. Consider the synchronous periodic task set corresponding to τ, and let [t_1, t_2) be any interval of time. The processor demand in the interval is

    h_{[t_1,t_2)} ≤ Σ_{D_i ≤ t_2−t_1} (1 + ⌊(t_2 − t_1 − D_i)/T_i⌋) C_i ≤ Σ_{i=1}^{n} ((t_2 − t_1)/T_i) C_i = (t_2 − t_1) U ≤ t_2 − t_1,

from which u ≤ 1. By Theorem 3.5 the thesis follows. □
When dealing with generic hybrid task sets, the approach of the previous theorem leads only to a sufficient (but not necessary) condition for feasibility under EDF scheduling.
Theorem 3.11 Given any hybrid task set τ, if

    Σ_{i=1}^{n} C_i / min{D_i, T_i} ≤ 1,

then τ is feasible under EDF scheduling.
Proof. Consider the synchronous periodic task set corresponding to τ, and let [t_1, t_2) be any interval of time. The processor demand in the interval is

    h_{[t_1,t_2)} ≤ Σ_{D_i ≤ t_2−t_1} (1 + ⌊(t_2 − t_1 − D_i)/T_i⌋) C_i ≤ Σ_{i=1}^{n} ((t_2 − t_1)/min{D_i, T_i}) C_i
Figure 3.5  EDF schedule of periodic tasks with deadlines different from their periods.
    ≤ (t_2 − t_1) Σ_{i=1}^{n} C_i / min{D_i, T_i} ≤ t_2 − t_1,

from which u ≤ 1. By Theorem 3.5 the thesis follows. □
An example of an EDF schedule with deadlines different from the corresponding periods is depicted in Figure 3.5. Note that in the example, the condition of the previous theorem does not hold, since Σ_i C_i/min{D_i, T_i} = 3/4 + 2/18 + 1/3 = 43/36 > 1; however, the task set is feasible, as can be clearly seen. In order to develop a procedure for the feasibility assessment of generic hybrid task sets, Theorem 3.5 can be reformulated. It has been shown that the feasibility of a hybrid task set is equivalent to the feasibility of the corresponding synchronous periodic task set. For any such set, the ratio between the processor demand and the length of the relative interval is maximized in the intervals [0, t), for any t greater than zero². The processor demand of a synchronous periodic task set in the interval [0, t), denoted for simplicity as h(t) from now on, is

    h(t) = Σ_{D_i ≤ t} (1 + ⌊(t − D_i)/T_i⌋) C_i.
²As in Lemma 3.1, it is simple to see that h_{[t_1,t_2)} ≤ h_{[0,t_2−t_1)} for any interval [t_1, t_2).
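The demand function h(t) is immediate to compute. A minimal sketch (the task triples used below are illustrative values, not from the text):

```python
def h(t, tasks):
    """Processor demand of the synchronous arrival pattern in [0, t):
    only tasks with D_i <= t contribute, each with
    (1 + floor((t - D_i)/T_i)) instances due by time t."""
    return sum((1 + (t - D) // T) * C
               for (C, D, T) in tasks if D <= t)
```

For the two tasks (C, D, T) = (1, 3, 5) and (2, 4, 8), h(4) = 1 + 2 = 3, and feasibility requires h(t) ≤ t at every absolute deadline t.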
The condition u ≤ 1, necessary and sufficient for the feasibility of the task set, is thus equivalent to h(t) ≤ t, ∀t. The condition is formally stated in the following theorem.
Theorem 3.12 Any given hybrid task set is feasible under EDF scheduling if and only if ∀t, h(t) ≤ t.

Unfortunately, testing the processor demand on every interval [0, t) is not practical. However, Baruah et al. [2, 3] show that it is a valuable approach to find pseudo-polynomial solutions for the feasibility problem. In fact, they show that it is sufficient to test the processor demand for a finite number of intervals, which gives a pseudo-polynomial complexity in "most of the cases."

Theorem 3.13 (Baruah et al.) If the given hybrid task set is not feasible and U < 1, then h(t) > t implies t < D_max or t < (U/(1 − U)) max_{i=1,...,n} {T_i − D_i}.

Proof. Assume h(t) > t and t ≥ D_max. Then

    t < h(t) ≤ Σ_{i=1}^{n} (1 + ⌊(t − D_i)/T_i⌋) C_i
             ≤ Σ_{i=1}^{n} (1 + (t − D_i)/T_i) C_i
             = t Σ_{i=1}^{n} C_i/T_i + Σ_{i=1}^{n} (C_i/T_i)(T_i − D_i)
             ≤ tU + max_{i=1,...,n} {T_i − D_i} U,

from which

    t(1 − U) < U max_{i=1,...,n} {T_i − D_i}.

It follows

    t < (U/(1 − U)) max_{i=1,...,n} {T_i − D_i}. □
o A valuable consequence of this theorem is that whenever the processor utilization of the hybrid task set is less than or equal to a fixed positive constant c
smaller than 1, then the complexity of evaluating the feasibility of the task set is pseudo-polynomial. In fact, U ≤ c implies

    U/(1 − U) ≤ c/(1 − c).

According to the result just shown, the condition of Theorem 3.12 can then be efficiently tested in O(n max_{i=1,...,n} {T_i − D_i}). Note that when U = 1 the upper bound for t is H + D_max [3], which can lead to an exponential complexity. A suggestion for a further practical improvement in testing the condition of Theorem 3.12 is given by Zheng and Shin [31]. Accordingly, since the value of ⌊(t − D_i)/T_i⌋ changes only on the set {mT_i + D_i : m = 0, 1, ...}, the inequality h(t) ≤ t needs only to be checked on the set S = ∪_{i=1}^{n} S_i, where S_i = {mT_i + D_i : m = 0, 1, ..., ⌊(t_max − D_i)/T_i⌋}, and t_max is the upper bound on the values to be checked. In their work, t_max is found by stopping the algebraic manipulation of Theorem 3.13 a step earlier, thus obtaining a potentially smaller value

    t_max = Σ_{i=1}^{n} (1 − D_i/T_i) C_i / (1 − U).

A third upper bound is similarly obtained by George et al. [10]:

    t_max = Σ_{D_i ≤ T_i} (1 − D_i/T_i) C_i / (1 − U).
In all cases, however, the resulting complexity of the feasibility analysis is pseudo-polynomial only when U ≤ c < 1. Note that this is enough to make the processor demand analysis a practical tool in many situations.
3.2.5 Busy Period Analysis
By using a completely different argument, a further upper bound on the values of t for which the condition of Theorem 3.12 must be evaluated can be determined. Its rationale is found again in the work of Liu and Layland [21], in which it is proven that if a synchronous periodic task set is not feasible, then a deadline is missed in the first period of processor activity, that is, before any processor idle time.
Later, Spuri [25] and Ripoll et al. [24] independently found that the result also applies to synchronous periodic task sets with D_i ≤ T_i, ∀i, while the case of generic synchronous periodic task sets is discussed by Spuri [26].

Theorem 3.14 (Liu and Layland) If a synchronous periodic task set is not feasible under EDF scheduling, then in its schedule there is an overflow without idle time prior to it.

Proof. The same argument given by Liu and Layland can be applied with the new model. Assume there is an overflow at time t. Let t′ be the end of the last processor idle period before t, or 0 if there is none. t′ must be the arrival time of at least one instance. If all instances arriving after t′ are "shifted" left up to t′, the processor workload in the interval [t′, t) cannot decrease. Since there was no processor idle time between t′ and t, there is no processor idle time after the shift. Furthermore, an overflow still occurs at or before t. By considering the new pattern only from time t′ on, the thesis follows. □
The immediate consequence of this theorem is that when checking the feasibility of a hybrid task set, the evaluation of Theorem 3.12's condition can be limited to the interval of time preceding the first processor idle time in the schedule of the corresponding synchronous periodic task set. Any interval of time in which the processor is not idle is termed a busy period. The interval of time preceding the first processor idle time³ in the schedule of a synchronous periodic task set is termed a synchronous busy period. The new upper bound mentioned previously is the length L of the synchronous busy period. Given a hybrid task set and its corresponding synchronous periodic task set, L can be computed by means of a simple procedure. Given any interval [0, t), the idea is to compare the cumulative workload W(t), i.e., the sum of the computation times of the task instances arriving before time t, with the length of the interval: if W(t) is greater than t then the duration of the busy period is at least W(t). The argument is then recursively applied to W(t), W(W(t)), ..., until two consecutive values are found equal. Formally, L is the fixed point of the following iterative computation:

    L(0) = Σ_{i=1}^{n} C_i,    L(m+1) = W(L(m)),        (3.1)

³Note that a processor idle period can have zero duration, if the last pending instance completes and at the same time a new one is released.
Table 3.2  Computation of the synchronous busy period length for the task set of Figure 3.5.

    L(0) = 3 + 2 + 1 = 6
    L(1) = W(6)  = 2·3 + 1·2 + 1·1 = 9
    L(2) = W(9)  = 3·3 + 1·2 + 1·1 = 12
    L(3) = W(12) = 3·3 + 1·2 + 2·1 = 13
    L(4) = W(13) = 4·3 + 1·2 + 2·1 = 16
    L(5) = W(16) = 4·3 + 1·2 + 2·1 = 16
    L = 16
where

    W(t) = Σ_{i=1}^{n} ⌈t/T_i⌉ C_i,

and L(m) is the value computed at the m-th step. The computation in Equation (3.1) is stopped when two consecutive values are found equal, that is, L(m+1) = L(m). L is then set to L(m). Accordingly, the value found for L is the smallest positive solution of the equation

    x = W(x).
In Table 3.2 the computation of L for the task set of Figure 3.5 is shown. Note that L is correctly assigned the value 16 and not 19. The reason is that at time t = 16 all instances arriving earlier are completed. Even if at the same time another instance is released, the situation is like having an idle period of length 0.

Note that the value of L does not depend on the scheduling algorithm, as long as it is non-idling, since neither its definition nor its computation are related to any particular algorithm. Furthermore, the synchronous busy period turns out to be the longest one [26]. It can be easily proven that the sequence L(m) converges to L in a finite number of steps if the overall processor utilization of the task set does not exceed 1 [26] (recall that if this condition does not hold the task set is not feasible).
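The iterative computation (3.1) is a few lines of code. A sketch; in the usage below the (C_i, T_i) pairs are chosen to reproduce the arithmetic of Table 3.2 (the individual periods are inferred from that table, so treat them as illustrative):

```python
import math

def busy_period(tasks):
    """Length L of the synchronous busy period: the smallest positive
    fixed point of L = W(L), with W(t) = sum_i ceil(t/T_i) * C_i.
    By Lemma 3.4 the iteration converges whenever U <= 1."""
    assert sum(C / T for (C, T) in tasks) <= 1
    L = sum(C for (C, _) in tasks)          # L(0) = sum of the C_i
    while True:
        W = sum(math.ceil(L / T) * C for (C, T) in tasks)
        if W == L:
            return L
        L = W
```

With the (C, T) pairs (3, 4), (2, 20), and (1, 10), the iteration visits 6, 9, 12, 13, 16 and returns L = 16, matching Table 3.2.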
Lemma 3.4 If U :::; 1, then the computation (3.1) converges in a finite number of steps.
Proof. Let H = lcm(T_1, ..., T_n). It follows that L ≤ H, since W(t) is a non-decreasing function and W(0+) > 0. Furthermore, at each step L(m) is either increased by at least C_min or is unchanged. The final value is thus achieved in a finite number of steps. □

As reported by Ripoll et al. [24], the new bound L can speed up the feasibility analysis of a task set. However, the worst case complexity of the analysis is not improved. In particular, even with this different approach, the analysis has pseudo-polynomial complexity if the processor utilization is U ≤ c, with c a fixed positive constant smaller than 1 [26]. In this case:
    L = Σ_{i=1}^{n} ⌈L/T_i⌉ C_i ≤ Σ_{i=1}^{n} (1 + L/T_i) C_i = Σ_{i=1}^{n} C_i + L Σ_{i=1}^{n} C_i/T_i ≤ Σ_{i=1}^{n} C_i + Lc,

from which

    L ≤ (Σ_{i=1}^{n} C_i) / (1 − c).
Hence, the feasibility analysis has time complexity O(n Σ_{i=1}^{n} C_i), which is, as claimed, pseudo-polynomial. Furthermore, since each step of the iterative formula (3.1) takes O(n) time, the computation of L also takes O(n Σ_{i=1}^{n} C_i) time. Whether there exists a fully polynomial time solution for the feasibility problem thus remains open.
3.2.6 Feasibility Analysis Algorithm
A practical algorithm for the feasibility analysis of hybrid task sets can be developed by collecting the results described in the previous sections. The algorithm, whose pseudo-code is reported in Figure 3.6, first checks whether the processor utilization of the given task set is greater than 1. If this is the case, according to Theorem 3.9 the task set is not feasible. Otherwise, the analysis continues by checking the condition of Theorem 3.12, h(t) ≤ t, on any interval [0, t), with t limited by the minimum among the three upper bounds previously discussed. Only the values corresponding to actual deadlines of the synchronous periodic arrival pattern are taken into account. Recall that if U ≤ c, with c a fixed positive constant smaller than 1, the algorithm has pseudo-polynomial complexity.
Analyze(τ):
    if U > 1 then return("Not Feasible"); endif
    t_1 = max{ D_max, Σ_{i=1}^{n} (1 − D_i/T_i) C_i / (1 − U) };   [Zheng and Shin bound]
    t_2 = Σ_{D_i ≤ T_i} (1 − D_i/T_i) C_i / (1 − U);               [George et al. bound]
    L = synchronous busy period length;
    t_max = min{ t_1, t_2, L };
    S = ∪_{i=1}^{n} { mT_i + D_i : m = 0, 1, ... } = { e_1, e_2, ... };
    k = 1;
    while e_k < t_max
        if h(e_k) > e_k then return("Not Feasible"); endif
        k = k + 1;
    endwhile
    return("Feasible");

Figure 3.6  Pseudo-code of the algorithm for the feasibility analysis of hybrid task sets.
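A runnable rendering of Figure 3.6's pseudo-code may help; this is our Python transcription, not the authors' code. For brevity it combines a Zheng-Shin-style bound with the busy period length L (omitting the third bound), and uses exact rational arithmetic for U:

```python
import math
from fractions import Fraction

def analyze(tasks):
    """EDF feasibility analysis of a hybrid task set (sketch of Figure 3.6).

    tasks: list of (C, D, T): worst case execution time, relative
    deadline, and period (or minimum interarrival time).
    """
    U = sum(Fraction(C, T) for (C, _, T) in tasks)
    if U > 1:
        return False                               # Theorem 3.9

    def h(t):                                      # demand in [0, t)
        return sum((1 + (t - D) // T) * C
                   for (C, D, T) in tasks if D <= t)

    # Synchronous busy period length L, Equation (3.1).
    L = sum(C for (C, _, _) in tasks)
    while True:
        W = sum(math.ceil(L / T) * C for (C, _, T) in tasks)
        if W == L:
            break
        L = W

    t_max = L
    if U < 1:                                      # Zheng-Shin style bound
        t1 = sum(C * (1 - Fraction(D, T)) for (C, D, T) in tasks) / (1 - U)
        t_max = min(t_max, max(max(D for (_, D, _) in tasks), t1))

    # Theorem 3.12: h(t) <= t need only hold at absolute deadlines.
    deadlines = sorted({m * T + D for (_, D, T) in tasks
                        for m in range(int(t_max // T) + 1)
                        if m * T + D <= t_max})
    return all(h(t) <= t for t in deadlines)
```

For example, analyze([(1, 2, 2), (1, 4, 4)]) reports a feasible set (U = 3/4), while analyze([(1, 1, 2), (1, 1, 2)]) reports an infeasible one, since h(1) = 2 > 1.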
3.2.7 Extended Task Models
In the reference model of hybrid task sets studied thus far, the tasks are characterized by three timing parameters, namely maximum execution time, relative deadline, and period, or minimum interarrival time if the task is sporadic. However, some systems may require even more complex models (such as release jitter and sporadic periodic tasks) in order to be analyzed. Furthermore, in some situations the additional costs of an actual system implementation must be taken into account, and in others the feasibility of systems in which preemption is not allowed may need to be analyzed. These aspects are briefly covered in this section.
Release Jitter

Given a hybrid task set, task instances may arrive at any time, provided the periodicity or minimum distance (time between two consecutive arrivals) constraints are respected. However, in an actual system, an arrival must be recognized by a run-time dispatcher, which then places the instance in some run-time queue. The instance is then said to be released. The time between an instance arrival and its release is known as release jitter. Note that the release of a task instance can also be delayed by other factors, such as the communication of tasks executing on different nodes of a distributed system.

So far, the analysis has been described with the implicit assumption of null release jitter. In this section this assumption is removed. It is now assumed that after each arrival, any instance of a task τ_i may be delayed for a maximum time J_i before actually being released. When a task experiences jitter, there can be arrival patterns in which two consecutive releases of the same task are separated by an interval of time shorter than T_i. Thus, intuitively, the worst case arrival pattern is one in which all tasks experience their shortest inter-release times at the beginning of the schedule. That is, the first instance of each task is released at time t = 0, and all others are then released at time t = max{kT_i − J_i, 0}, ∀i and ∀k > 0. See Figure 3.7 for an example. Indeed, it can be shown that the maximum loading factor is obtained with this arrival pattern, since the maximum ratio between processor demand and interval length is obtained in the intervals [0, t) of this scenario.

Figure 3.7  Worst case arrival pattern for a task set with release jitter.

The processor demand on such intervals becomes

h(t) = Σ_{D_i ≤ t + J_i} (1 + ⌊(t + J_i − D_i)/T_i⌋) C_i,

since the situation is like having the first instance of any task τ_i arrive at time t = −J_i, with all others equally spaced by T_i; all instances arriving at time t < 0 are actually released at time t = 0. By applying the argument of Theorem 3.13, Baruah et al.'s upper bound becomes

t_max = max{ max_{i=1,...,n} {D_i − J_i}, U/(1 − U) · max_{i=1,...,n} {T_i + J_i − D_i} }.
The upper bounds given by Zheng and Shin, and by George et al., change in a similar way. The argument of Theorem 3.14 can still be applied to show that if the task set is not feasible, then a time overflow is found in the initial busy period of the EDF schedule, when the described arrival pattern is considered [26]. The equations concerning the busy period analysis must be modified accordingly. In particular, the iterative computation (3.1) must be modified in the definition of the cumulative workload W(t), which becomes

W(t) = Σ_{i=1}^n ⌈(t + J_i)/T_i⌉ C_i.
The complexity of the feasibility analysis is not affected by the introduction of release jitter in the task model.
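A minimal sketch of how the jitter terms enter the workload and demand functions, assuming tasks are represented as (C, D, T, J) tuples; all names are illustrative:

```python
from math import floor, ceil

def workload(tasks, t):
    """Jitter-aware cumulative workload W(t) = sum(ceil((t + J_i) / T_i) * C_i)."""
    return sum(ceil((t + J) / T) * C for C, D, T, J in tasks)

def demand(tasks, t):
    """Jitter-aware processor demand h(t). An instance ideally arriving at
    k*T_i - J_i has absolute deadline k*T_i - J_i + D_i, so instances with
    deadline in [0, t] are counted whenever D_i <= t + J_i."""
    return sum((1 + floor((t + J - D) / T)) * C
               for C, D, T, J in tasks if D <= t + J)
```

With J = 0 for every task, both functions reduce to the workload and demand of the jitter-free model.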
Figure 3.8  Worst arrival pattern of sporadically periodic tasks.
Sporadically Periodic Tasks

The reference model can also be extended by introducing the notion of sporadically periodic tasks [1]. These sorts of tasks are intended to model the behavior of events which may arrive at a certain rate for a number of times, and then not re-arrive for a longer time. For example, there are interrupts which behave in this way (they are also termed bursty sporadics). Sporadically periodic tasks are assigned two periods: an inner period (t) and an outer period (T). The outer period is the worst case inter-arrival time between bursts. The inner period is the worst case inter-arrival time between task instances within a burst. There is a bounded number of arrivals to each burst (n). It is assumed that for each task τ_i, the total time for any burst (i.e., n_i t_i, the number of inner arrivals multiplied by the inner period) must be less than or equal to the outer period T_i. Each instance may suffer a maximum release jitter J_i. "Ordinary" periodic and sporadic tasks, which are not bursty, are simply modeled by assigning inner periods equal to the corresponding outer periods, and by allowing at most one inner arrival.
It can be easily realized that for any given sporadically periodic task set, the worst arrival pattern in terms of processor loading factor is obtained by "packing" as much as possible the releases of task instances at the beginning of the schedule. An example of such an arrival pattern is depicted in Figure 3.8. As previously, it is such that the first instances of all tasks are released at time t = 0, while ideally experiencing their maximum jitter. All the following instances are then released as soon as possible.
In particular, once again the argument of Theorem 3.14 can be applied to prove that if the task set is not feasible, then a deadline is missed in the initial busy period of the described arrival pattern. That is, the feasibility analysis developed in the previous sections can be adapted to the new model of sporadically periodic task sets [26]. The iterative computation (3.1) of the busy period length must again be modified in the definition of W(t) according to the new model. The definition of the cumulative workload released up to time t is now a bit trickier. If it is assumed that the first instance of each task τ_i ideally arrives at time t = −J_i, but is actually released at time t = 0, as shown in Figure 3.8, the number of instances of τ_i arrived and released by time t, I_i(t), can be computed as the sum of:

•  n_i times the number of outer periods entirely fitting within an interval of t + J_i units of time, and

•  the minimum between n_i and the number of inner periods (rounded to the smallest larger integer) which fit in the last part of the interval (t + J_i − ⌊(t + J_i)/T_i⌋ T_i wide) preceding t.

That is,

I_i(t) = ⌊(t + J_i)/T_i⌋ n_i + min{ n_i, ⌈(t + J_i − ⌊(t + J_i)/T_i⌋ T_i) / t_i⌉ }.

W(t) then becomes⁴

W(t) = Σ_{i=1}^n I_i(t) C_i.
With a similar argument, the number of instances of task τ_i with deadline before or at t, H_i(t), can also be determined, which is thus the sum of:

•  n_i times the number of outer periods entirely fitting within an interval of t + J_i − D_i units of time, and

•  the minimum between n_i and the number of inner periods (rounded to the largest smaller integer) which fit in the last part of the interval (t + J_i − D_i − ⌊(t + J_i − D_i)/T_i⌋ T_i wide) preceding t, increased by 1.

That is,

H_i(t) = ⌊(t + J_i − D_i)/T_i⌋ n_i + min{ n_i, 1 + ⌊(t + J_i − D_i − ⌊(t + J_i − D_i)/T_i⌋ T_i) / t_i⌋ }.

Finally, the processor demand on any interval [0, t) becomes

h(t) = Σ_{D_i ≤ t + J_i} H_i(t) C_i.

The condition found in Theorem 3.12, evaluated in the interval [0, L), can still be utilized to test the feasibility of sporadically periodic task sets. The resulting algorithm is basically similar to that described in Section 3.2.6.

⁴ It is not difficult to see that, since the new model is more general than the previous one, the computation of W(t) reduces to those previously shown when used with simpler models, in which tasks have neither bursty behavior nor release jitter.
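The two counting formulas can be sketched as follows, assuming a task is described by the tuple (C, D, T, t_in, n, J), with t_in the inner period and n the number of inner arrivals per burst; all names are illustrative:

```python
from math import floor, ceil

def released(task, t):
    """I_i(t): instances of the task released in [0, t)."""
    C, D, T, t_in, n, J = task
    outer = floor((t + J) / T)     # complete outer periods in the interval
    rest = (t + J) - outer * T     # tail of the interval preceding t
    return outer * n + min(n, ceil(rest / t_in))

def with_deadline_by(task, t):
    """H_i(t): instances of the task with absolute deadline <= t."""
    C, D, T, t_in, n, J = task
    if D > t + J:
        return 0
    outer = floor((t + J - D) / T)
    rest = (t + J - D) - outer * T
    return outer * n + min(n, 1 + floor(rest / t_in))
```

For an "ordinary" task (t_in = T, n = 1, J = 0), `released` reduces to ⌈t/T⌉ and `with_deadline_by` to 1 + ⌊(t − D)/T⌋, matching the simpler model.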
Tick Scheduling

Feasibility analysis can be further extended in order to take into account the costs of an actual EDF preemptive scheduler implementation. From this point of view, the considerations are very similar to those made for fixed priority scheduling. Thus, in what follows the approach is the same as that described by Tindell et al. in [29]. According to [29], "Tick scheduling is a common way of implementing a priority preemptive scheduler: a periodic clock interrupt runs a scheduler which polls for the arrivals of tasks; any arrived tasks are placed in a priority ordered run queue. The scheduler then dispatches the highest priority task on the run queue." In the most general case, task instances can arrive at any time, and hence can suffer a worst case release jitter of T_tick, the period of the tick scheduler (unless there are periodic tasks with periods which are multiples of T_tick).
Normally, the tick scheduler uses two queues: the pending queue, which holds a deadline ordered list of tasks awaiting their start conditions, and the run queue, a priority-ordered list of runnable tasks. "At each clock interrupt the scheduler scans the pending queue for tasks which are now runnable and transfers them to the run queue." The system overhead that must be taken into account is
the time needed to handle the two queues, and more precisely the time needed to move tasks from one queue to the other. In particular, as in the work of Tindell et al., the following implementation costs are considered:

C_tick  The worst case computation time of the periodic timer interrupt.

C_QL  The cost to take the first task from the pending queue.

C_QS  The cost to take any possible subsequent task from the pending queue.
According to Tindell et al.'s analysis, the tick scheduling overheads over a window of width w are

OV(w) = T(w) C_tick + min{T(w), K(w)} C_QL + max{K(w) − T(w), 0} C_QS,    (3.2)

where T(w) is the number of timer interrupts within the window:

T(w) = ⌈w / T_tick⌉,

and K(w) is the worst case number of times tasks move from the pending queue to the run queue:

K(w) = Σ_{i=1}^n ⌈(w + J_i)/T_i⌉.
In order to extend the feasibility analysis to include this overhead, Theorem 3.14 must be generalized. Again, the generalization is achieved by looking at the paper of Liu and Layland [21]: it is simply sufficient to reformulate their theorem in order to fulfill our needs. This theorem is proven with respect to a task set scheduled by the deadline driven algorithm, in a system in which the processor time is accumulated by a certain availability function, that is, only a fraction of the processor time is devoted to the task schedule. The attention of Liu and Layland is on sublinear functions, that is, functions f for which, for all t and T,

f(T) ≤ f(t + T) − f(t).

The reason is that when there is a task set scheduled by fixed priority scheduling and another task set scheduled when the processor is not occupied by tasks of the first set (i.e., in background), then the availability function for the second task set can be shown to be sublinear. The new model, in which all tasks are
scheduled when the processor is not busy executing tick scheduler code, fits perfectly in this description.

Theorem 3.15 (Liu and Layland) When the deadline driven scheduling algorithm is used to schedule a set of tasks on a processor whose availability function is sublinear, if there is an overflow for a certain arrival pattern, then there is an overflow without idle time prior to it in the pattern in which all task instances are released as soon as possible.
Proof. Similar to that of Theorem 3.14. □
The feasibility analysis must be modified accordingly. The computation of the busy period length must take into account the additional load due to the tick scheduler. Thus the workload arriving by time t becomes⁵

W(t) = OV(t) + Σ_{i=1}^n ⌈(t + J_i)/T_i⌉ C_i.

Equation (3.2) can be used to evaluate the availability function:

a(t) = max{t − OV(t), 0}.

A sufficient condition for the feasibility of the task set is then

h(t) ≤ a(t)
for all absolute deadlines in the initial busy period. Note that the condition is a generalization of Theorem 9 of [21]. Jeffay and Stone [15] extended the analysis to account for interrupt handling costs. In this work, the authors analyze the feasibility of a set of hard deadline tasks which execute in the presence of h interrupt handlers. The interrupt handlers are treated as sporadic tasks running at the highest priority.
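Equation (3.2) and the availability function can be sketched as follows, assuming tasks are (C, D, T, J) tuples and that the three costs are supplied by the caller; all names are illustrative:

```python
from math import ceil

def overhead(tasks, w, T_tick, C_tick, C_QL, C_QS):
    """Tick scheduling overhead OV(w) of Equation (3.2)."""
    n_ticks = ceil(w / T_tick)                                 # T(w): timer interrupts
    n_moves = sum(ceil((w + J) / T) for C, D, T, J in tasks)   # K(w): queue transfers
    return (n_ticks * C_tick
            + min(n_ticks, n_moves) * C_QL
            + max(n_moves - n_ticks, 0) * C_QS)

def availability(tasks, t, T_tick, C_tick, C_QL, C_QS):
    """a(t) = max{t - OV(t), 0}: processor time left to the application."""
    return max(t - overhead(tasks, t, T_tick, C_tick, C_QL, C_QS), 0)
```

The sufficient feasibility test then compares the processor demand h(t) against `availability` at each absolute deadline in the initial busy period.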
Non-Preemptive Non-Idling EDF Scheduling

When preemption is not allowed in the schedule of a task set, the problem becomes much more difficult. If a priority based scheduler like EDF is utilized, non-preemption may be the source of undesired priority inversions. In Figure 3.9, a typical situation is depicted. When the second job is released at time r2, the execution of the first job cannot be preempted. The second job is forced to wait for the completion of the first one, even though it has a shorter deadline. In this case the effect of the priority inversion is so bad as to invalidate the schedule, since the second job misses its deadline.

Figure 3.9  Non-preemptive non-idling EDF schedule of two jobs.

The situation can be improved if the scheduler is allowed to keep the processor idle, even when there are pending jobs. In the example, if the processor is left idle between r1 and r2, the second job can be executed first. As shown in Figure 3.10, if such an idling decision is taken, no deadline is missed, that is, the schedule is valid.

Figure 3.10  Non-preemptive idling schedule of two jobs.

⁵ The analysis is described by assuming hybrid task sets with release jitter. The argument can be similarly applied to sporadically periodic task sets [26].
Unfortunately, as mentioned in Section 3.1, the feasibility problem for idling systems, which are more general than non-idling ones, is NP-hard. Further, Howell and Venkatrao [12] show on one hand that "the decision problem of determining whether a periodic task system is schedulable for all start times with respect to the class of idling algorithms is NP-hard in the strong sense, even when the deadlines are equal to the periods." On the other hand, they also formally prove that there cannot exist an optimal on-line idling algorithm for scheduling sporadic tasks, and that if a sporadic task set is schedulable by an on-line idling algorithm, then it is also schedulable by an on-line non-idling algorithm. These arguments, as well as practical reasons, motivate the choice of restricting the attention only to the suboptimal class of non-idling scheduling algorithms, among which EDF is optimal.

The analysis of task sets scheduled by the non-preemptive non-idling EDF algorithm (termed simply non-preemptive EDF in the following) is not much more difficult than that discussed thus far for preemptive systems. The processor demand and busy period approaches are still useful and can be adapted without much effort. The impact of non-preemption, and hence of priority inversions, is in fact limited and can be easily taken into account in the equations. It turns out that any task instance may be subject to at most only a single priority inversion, which can be easily bounded. Assume there is an overflow in the schedule at time t, as shown in Figure 3.11. An argument similar to that of Theorem 3.14 can be applied. Let t' be the last time before t such that there are no pending instances with arrival time earlier than t' and absolute deadline before or at t. By choice, t' must be the arrival time of a task instance, and there is no idle time between t' and t. A lower priority instance (i.e., with deadline after t) may be executing before
t'. If it is not completed by t', owing to its non-preemptability, a priority inversion occurs. However, after its completion only task instances arrived at
t' or later, and having deadline before or at t are executed. In the schedule of such instances there may be several priority inversions, but only the first one has an impact on the deadline missed at time t.
If all "higher priority" instances are "packed" to the left, so that the corresponding tasks are released synchronously from time t', an overflow is still found at t or earlier. This proves the following lemma [10].
Lemma 3.5 (George et al.) If a hybrid task set is not feasible under non-preemptive EDF scheduling, then there is an overflow at time t in the initial busy period when all tasks with relative deadline less than or equal to t are released synchronously from time 0, and all others, if any, are released one unit of time earlier.

Figure 3.11  Busy period preceding an overflow in a non-preemptive non-idling EDF schedule.
Accordingly, having assumed a discrete scheduling model, for any deadline at time t the maximum penalty introduced by the non-preemption is

max_{D_j > t} {C_j − 1},

with the convention that the value is zero if ∄j : D_j > t. The processor demand approach must then be corrected by adding this value to h(t). The condition of Theorem 3.12 now becomes

∀t, h(t) + max_{D_j > t} {C_j − 1} ≤ t.
A similar feasibility condition is shown by Kim and Naghibzadeh [16] for sporadic task sets with deadlines equal to periods, although the assumed scheduling model is continuous. Jeffay et al. [14] also prove similar results, but assume periodic and sporadic task sets within a discrete scheduling model. Zheng and Shin [31] extend their feasibility condition for preemptive EDF, described in Section 3.2.4, to non-preemptive EDF. All these results are finally generalized by George et al. [10] in the following theorem.
Theorem 3.16 (George et al.) Any hybrid task set with processor utilization U ≤ 1 is feasible under non-preemptive EDF if and only if

∀t ∈ S, h(t) + max_{D_j > t} {C_j − 1} ≤ t,

where

S = ∪_{i=1}^n { kT_i + D_i : k = 0, ..., ⌊(t_max − D_i)/T_i⌋ },

and t_max = min{L, t_1, t_2}.

As for the preemptive case, L is the length of the initial busy period in a synchronous arrival pattern. According to Lemma 3.5, any possible deadline miss is found in a busy period. Since L is the length of the longest busy period, it is an upper bound on the absolute deadlines to check. The other upper bounds, t_1 and t_2, are obtained by algebraic manipulations of the condition

h(t) + max_{D_j > t} {C_j − 1} > t.

In particular,

t_1 = max{ D_max, Σ_{i=1}^n (1 − D_i/T_i) C_i / (1 − U) },
t_2 = Σ_{D_i < T_i} (1 − D_i/T_i) C_i / (1 − U).
The algorithm described in Section 3.2.6 can then be used to check the feasibility of any task set according to Theorem 3.16. Similar to the preemptive model, the complexity is pseudo-polynomial whenever U ≤ c, with c a fixed positive constant less than 1.
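A sketch of the resulting test, assuming (C, D, T) triples, a discrete time model, and a precomputed bound t_max (here passed in directly rather than derived from t_1, t_2 and L); all names are illustrative:

```python
from math import floor

def np_feasible(tasks, t_max):
    """Check the condition of Theorem 3.16 at every absolute deadline in S."""
    U = sum(C / T for C, D, T in tasks)
    if U > 1:
        return False
    deadlines = sorted({k * T + D
                        for C, D, T in tasks if D <= t_max
                        for k in range(floor((t_max - D) / T) + 1)})
    for t in deadlines:
        h = sum((1 + floor((t - D) / T)) * C for C, D, T in tasks if D <= t)
        # Blocking by at most one already-started lower priority instance.
        blocking = max((C - 1 for C, D, T in tasks if D > t), default=0)
        if h + blocking > t:
            return False
    return True
```

The first example below is preemptively feasible but fails under non-preemptive EDF, since the short-deadline task can be blocked for up to C_j − 1 units by a longer, lower priority instance.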
3.3 SUMMARY
A theory for EDF feasibility analysis has been described in this chapter. In the simplest case, it has been shown that the feasibility of an independent periodic task set can be established by just computing the task set processor utilization U: the task set is then recognized as feasible if and only if U ≤ 1. If asynchronous periodic tasks with deadlines not necessarily equal to their periods are considered, the problem becomes intractable, since it has been shown to be NP-hard in the strong sense, which excludes even the existence of a pseudo-polynomial solution, unless P=NP.
In the general case, U ≤ 1 is only a necessary condition, as expected. New interesting techniques based on processor demand and busy period approaches have been developed to analyze more complex task sets. The result is a pseudo-polynomial solution for the analysis of hybrid task sets, that is, task sets including both periodic and sporadic tasks. Whether a fully polynomial solution exists is still an open question. The algorithm for the pseudo-polynomial solution has been described. Extensions necessary to handle models including release jitter, sporadically periodic tasks and tick schedulers have also been discussed. The description of how to apply the approach to non-preemptive non-idling EDF schedulers has been given.
REFERENCES
[1] N. Audsley, A. Burns, M. Richardson, K. Tindell, and A.J. Wellings, "Applying New Scheduling Theory to Static Priority Pre-emptive Scheduling," Software Engineering Journal, September 1993.

[2] S.K. Baruah, L.E. Rosier, and R.R. Howell, "Algorithms and Complexity Concerning the Preemptive Scheduling of Periodic, Real-Time Tasks on One Processor," Real-Time Systems 2, 1990.

[3] S.K. Baruah, A.K. Mok, and L.E. Rosier, "Preemptively Scheduling Hard-Real-Time Sporadic Tasks on One Processor," Proc. of IEEE Real-Time Systems Symposium, 1990.

[4] G.C. Buttazzo and J.A. Stankovic, "RED: A Robust Earliest Deadline Scheduling Algorithm," Proc. of 3rd Int. Workshop on Responsive Computing Systems, 1993.

[5] G.C. Buttazzo and J.A. Stankovic, "Adding Robustness in Dynamic Preemptive Scheduling," in Responsive Computer Systems: Toward Integration of Fault Tolerance and Real-Time, Kluwer Press, 1994.

[6] E.G. Coffman, Jr., "Introduction to Deterministic Scheduling Theory," in E.G. Coffman, Jr., Ed., Computer and Job-Shop Scheduling Theory, Wiley, New York, 1976.

[7] M.L. Dertouzos, "Control Robotics: the Procedural Control of Physical Processes," Information Processing 74, North-Holland Publishing Company, 1974.

[8] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Company, 1979.

[9] L. George, P. Muhlethaler, and N. Rivierre, "Optimality and Non-Preemptive Real-Time Scheduling Revisited," Rapport de Recherche RR-2516, INRIA, Le Chesnay Cedex, France, 1995.

[10] L. George, N. Rivierre, and M. Spuri, "Preemptive and Non-Preemptive Real-Time Uni-Processor Scheduling," Rapport de Recherche RR-2966, INRIA, Le Chesnay Cedex, France, 1996.

[11] J. Hong, X. Tan, and D. Towsley, "A Performance Analysis of Minimum Laxity and Earliest Deadline Scheduling in a Real-Time System," IEEE Transactions on Computers, Vol. 38, No. 12, Dec. 1989.

[12] R.R. Howell and M.K. Venkatrao, "On Non-Preemptive Scheduling of Recurring Tasks Using Inserted Idle Times," Information and Computation 117, 1995.

[13] J.R. Jackson, "Scheduling a Production Line to Minimize Maximum Tardiness," Research Report 43, Management Science Research Project, University of California, Los Angeles, 1955.

[14] K. Jeffay, D.F. Stanat, and C.U. Martel, "On Non-Preemptive Scheduling of Periodic and Sporadic Tasks," Proc. of IEEE Real-Time Systems Symposium, 1991.

[15] K. Jeffay and D.L. Stone, "Accounting for Interrupt Handling Costs in Dynamic Priority Task Systems," Proc. of IEEE Real-Time Systems Symposium, 1993.

[16] K.H. Kim and M. Naghibzadeh, "Prevention of Task Overruns in Real-Time Non-Preemptive Multiprogramming Systems," Proc. of Performance 80, Association for Computing Machinery, 1980.

[17] J. Labetoulle, "Some Theorems on Real-Time Scheduling," Computer Architectures and Networks, E. Gelenbe and R. Mahl (Eds.), North Holland Publishing Company, 1974.

[18] J.K. Lenstra and A.H.G. Rinnooy Kan, "Optimization and Approximation in Deterministic Sequencing and Scheduling: A Survey," Ann. Discrete Math. 5, 1977.

[19] J.Y.-T. Leung and M.L. Merrill, "A Note on Preemptive Scheduling of Periodic, Real-Time Tasks," Information Processing Letters 11(3), 1980.

[20] J.Y.-T. Leung and J. Whitehead, "On the Complexity of Fixed-Priority Scheduling of Periodic, Real-Time Tasks," Performance Evaluation 2, 1982.

[21] C.L. Liu and J.W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the Association for Computing Machinery 20(1), 1973.

[22] C.D. Locke, D.R. Vogel, and T.J. Mesler, "Building a Predictable Avionics Platform in Ada: A Case Study," Proc. of IEEE Real-Time Systems Symposium, 1991.

[23] A.K. Mok, "Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment," Ph.D. Dissertation, MIT, 1983.

[24] I. Ripoll, A. Crespo, and A.K. Mok, "Improvement in Feasibility Testing for Real-Time Tasks," Real-Time Systems 11, 1996.

[25] M. Spuri, "Earliest Deadline Scheduling in Real-Time Systems," Doctorate Dissertation, Scuola Superiore S. Anna, Pisa, Italy, 1995.

[26] M. Spuri, "Analysis of Deadline Scheduled Real-Time Systems," Rapport de Recherche RR-2772, INRIA, Le Chesnay Cedex, France, 1996.

[27] M. Spuri, "Holistic Analysis for Deadline Scheduled Real-Time Distributed Systems," Rapport de Recherche RR-2873, INRIA, Le Chesnay Cedex, France, 1996.

[28] J.A. Stankovic, M. Spuri, M. Di Natale, and G. Buttazzo, "Implications of Classical Scheduling Results for Real-Time Systems," IEEE Computer, June 1995.

[29] K. Tindell, A. Burns, and A.J. Wellings, "An Extendible Approach for Analysing Fixed Priority Hard Real-Time Tasks," Real-Time Systems 6(2), 1994.

[30] K. Tindell and J. Clark, "Holistic Schedulability Analysis for Distributed Hard Real-Time Systems," Microprocessors and Microprogramming 40, 1994.

[31] Q. Zheng and K.G. Shin, "On the Ability of Establishing Real-Time Channels in Point-to-Point Packet-Switched Networks," IEEE Trans. on Communications 42(2/3/4), 1994.
4 RESPONSE TIMES UNDER EDF SCHEDULING
In the previous chapter it has been shown how to analyze a set of independent real-time tasks, in order to assess their feasibility when scheduled by the EDF algorithm. In almost all of its formulations, this feasibility assessment problem has been solved by examining the schedule of a subset of task instances in a bounded interval of time. It has been shown that if any deadline is missed in this interval, then any other instantiation of the same task set is not feasibly scheduled by EDF.

A different solution to feasibility analysis is found by introducing the notion of worst case response time: the worst case response time of a task is the maximum time elapsed between the release and the completion times of any of its instances. An alternative solution to the feasibility assessment problem would be to compute, for each task, the worst case response time and to compare it with its relative deadline. However, it is worth remarking that the two problems, feasibility assessment and worst case response time computation, are not equivalent when EDF scheduling is assumed. In fixed priority systems, the feasibility analysis of a task set is indeed carried out by computing task worst case response times [12]. In contrast, when EDF scheduling is assumed the two problems have a different complexity. The feasibility problem is simpler, since it can be solved by examining a single specific set of instances. Instead, the worst case response time computation requires the analysis of several different scenarios, although it too has a pseudo-polynomial time complexity.

In spite of being slightly more complex, the problem of determining the task worst case response times is very interesting. In particular, it is a very useful
tool not only for the analysis of uni-processor systems, but also for the analysis of distributed real-time systems (see Chapter 9). Distributed applications are, in fact, characterized by precedence relationships among their tasks. If the tasks are statically allocated to single processors, end-to-end timing constraints can be analyzed by a theory which assumes release jitter [1]: "All tasks are defined to arrive at the same time, but a precedence constrained task on one processor can have its release delayed awaiting the arrival of a message from its predecessors. The worst case release jitter of such a subtask can be computed by knowing the response times of predecessor tasks located on other processors." This approach leads to a global analysis of distributed real-time systems, which is termed holistic schedulability analysis by Tindell and Clark [13]. It is clear that the central issue of such a distributed analysis is the ability to compute task worst case response times in the presence of release jitter.
In this chapter an algorithm for the computation of task worst case response times is described, assuming the processor is scheduled according to the EDF algorithm. The algorithm can also be used to extend the holistic approach to the analysis of deadline scheduled distributed systems [10] which is done in Chapter 9. Extensions to the algorithm for handling more complex, but still independent task models on a uni-processor, and an extensive case study are discussed at the end of this chapter.
4.1 FINDING LOCAL MAXIMA
The worst case response time rt_i of a task τ_i is the maximum time between a τ_i instance's arrival and its completion. Finding rt_i is not a trivial task when EDF scheduling is assumed. Contrary to what may be an intuitive idea, the worst case response time of a task is not always found in the initial busy period of a synchronous schedule, which is thus not exactly the equivalent of the critical instant under fixed priority scheduling [6]. However, the concept of busy period is still useful. The idea is that the completion time of a task's instance with deadline d must be the end of a busy period in which all executed instances have deadlines less than or equal to d. An argument similar to that of Theorem 3.14 can then be applied in order to locally maximize the task response time, for a given subset of all possible scenarios.
Figure 4.1a  Busy period preceding an instance completion time.

Figure 4.1b  Synchronous arrival pattern possibly giving the worst case response time for task τ_i.
Lemma 4.1 (Spuri) The worst case response time of a task τ_i is found in a busy period in which all other tasks are released synchronously at the beginning of the period and then at their maximum rate (see Figure 4.1b).

Proof. Consider a τ_i instance, as in Figure 4.1a, with arrival time a and deadline d = a + D_i. Let t_2 be its completion time, according to the EDF schedule. Let t_1 be the last time before t_2 such that there are no pending instances with arrival time earlier than t_1 and deadline less than or equal to d. Since τ_i's instance is released at t = a, t_1 ≤ a. Furthermore, by choice of t_1 and t_2, t_1 must be the arrival time of a task's instance, and there is no idle time in [t_1, t_2). That is, [t_1, t_2) is a busy period in which only instances with deadlines less than or equal to d execute. At this point, if all instances of tasks different from τ_i are "shifted left," in such a way as to obtain a synchronous arrival pattern starting at t_1, as in Figure 4.1b, the workload of instances with deadlines less than or equal to d cannot diminish in [t_1, t_2). That is, the completion time of the τ_i instance released at t = a can only increase. □

The lemma states a precise characterization of the scenarios that give the local maxima of a task response time. In other words, the local maximum response times of a task τ_i can be found in the following way. Only arrival patterns like that shown in Figure 4.1b are taken into account. The only characteristic of these patterns is that all tasks, but τ_i, are released synchronously and at their maximum rate starting from time t = 0. The maximum response time of the τ_i instance released at time a is determined by computing the length of the busy period of all task instances with deadlines less than or equal to d, that is, of all task instances that precede the τ_i instance being considered. The parameter that distinguishes these patterns is the release time a.
Accordingly, let L_i(a) be the length of the busy period relative to the deadline d = a + D_i, and rt_i(a) be the maximum response time relative to a. Since C_i is an obvious lower bound for any τ_i instance response time, it is

rt_i(a) = max{C_i, L_i(a) − a}.    (4.1)

The worst case response time of τ_i is finally

rt_i = max_{a ≥ 0} {rt_i(a)}.    (4.2)
Put in this way, Equation (4.2) is not quite useful. In order to make it practical for actual tools, an upper bound on the significant values of the parameter a
is needed. Furthermore, the upper bound has to be small enough to make the complexity of the procedure at least pseudo-polynomial in time. The argument of Lemma 4.1 indirectly answers this question, too. Accordingly, the maximum response time of a task is found in a busy period, and it is known that the longest busy period of a given task set is the initial one in a completely synchronous arrival pattern (refer to Section 3.2.5). Hence, the significant values of a are in the interval [0, L − C_i). In Section 3.2.5 it is shown that when the processor utilization of the task set is bounded by a fixed positive constant less than 1, L is upper bounded by a pseudo-polynomial value. This gives the algorithm explained in the following sections an overall pseudo-polynomial time complexity.

It is also not difficult to see that the local maxima of L_i(a) are found for those values of a such that in the arrival pattern there is at least an instance of a task different from τ_i with deadline equal to d, or all tasks are synchronized. This further reduces the set of significant values to be examined when evaluating Equation (4.2), considerably speeding up the overall procedure. The details of the computation are given in Section 4.3.
4.2 DEADLINE BUSY PERIODS
The search interval for Equation (4.2) can be further restricted by introducing the notion of deadline busy period [5]. Namely, a deadline-d busy period is a busy period in which only task instances with absolute deadline smaller than or equal to d execute. From the argument given in the proof of Lemma 4.1, it is easily realized that the local maxima of a task's response times are actually found in deadline busy periods. That is, L_i(a) denotes the length of the deadline-(a + D_i) busy period found in the EDF schedule of the arrival pattern in which all tasks but τ_i are released synchronously, while τ_i has its first instance released at time

    s_i(a) = a − ⌊a/T_i⌋ T_i

(so as to have an instance released at time t = a).
The computation of L_i(a) can be carried out by means of a useful recursive approach. In particular, given the described arrival pattern, by time t, ⌈t/T_j⌉ instances of τ_j are released, for each j ≠ i. However, at most 1 + ⌊(a + D_i − D_j)/T_j⌋ of them can have a deadline less than or equal to d = a + D_i.¹ That is, the "higher priority workload" having arrived by time t is

    W_i(a, t) = Σ_{j ≠ i, D_j ≤ a+D_i} min{ ⌈t/T_j⌉, 1 + ⌊(a + D_i − D_j)/T_j⌋ } C_j,

and the number of τ_i instances released up to time t = a is

    1 + ⌊a/T_i⌋.

The length L_i(a) can then be computed with the following iterative formula:

    L_i^(0)(a) = 0,
    L_i^(m+1)(a) = W_i(a, L_i^(m)(a)) + (1 + ⌊a/T_i⌋) C_i.    (4.3)
As for Equation (3.1), the convergence of Equation (4.3) in a finite number of steps is ensured by the condition

    Σ_{i=1}^n C_i/T_i < 1.
The reason is that the maximum length of any busy period is L, the length of the initial busy period of the synchronous arrival pattern. At each step of Equation (4.3), either L_i^(m+1)(a) is equal to L_i^(m)(a), in which case the computation is halted, or L_i^(m+1)(a) exceeds L_i^(m)(a) by at least the quantity C_min, the minimum computation time among all tasks. Since L_i(a) is bounded by L, its value is thus reached in a finite number of steps.

As remarked by George et al. [5], not all deadline busy periods are interesting, but only those that include the reference instance. Namely, given a τ_i instance released at time t = a, hence with deadline d = a + D_i, the deadline-d busy period includes this instance if and only if it is longer than a.² The length of the longest such deadline busy period is denoted by L_i. L_i is clearly an upper bound for the significant values of the variable a in Equation (4.2). Its computation can be sped up by utilizing the following two properties.

¹ Note that no particular policy is assumed for breaking deadline ties. Hence, in the worst case, instances sharing the same deadline should be considered as having higher priority.
² Note that when L_i(a) ≤ a the arrival pattern is not interesting with respect to the worst case response time computation.
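As a concrete illustration, the fixed point iteration of Equation (4.3), together with Equation (4.1), can be sketched in Python as follows. This is a minimal sketch, not the authors' code: tasks are assumed to be given as hypothetical (C, T, D) tuples, and the function names are mine.

```python
import math

def W(tasks, i, a, t):
    """Higher priority workload arrived by time t, relative to deadline a + D_i.

    tasks: list of (C, T, D) tuples (computation time, period, relative deadline).
    """
    Ci, Ti, Di = tasks[i]
    d = a + Di
    total = 0
    for j, (Cj, Tj, Dj) in enumerate(tasks):
        if j == i or Dj > d:
            continue  # only instances with absolute deadline <= d count
        total += min(math.ceil(t / Tj), 1 + (d - Dj) // Tj) * Cj
    return total

def L(tasks, i, a):
    """Length L_i(a) of the deadline-(a + D_i) busy period (Equation 4.3)."""
    Ci, Ti, Di = tasks[i]
    own = (1 + a // Ti) * Ci  # tau_i instances released up to time a
    t = 0
    while True:
        nxt = W(tasks, i, a, t) + own
        if nxt == t:          # fixed point reached
            return t
        t = nxt

def rt(tasks, i, a):
    """Response time relative to a (Equation 4.1)."""
    Ci, _, _ = tasks[i]
    return max(Ci, L(tasks, i, a) - a)
```

For instance, with the hypothetical set τ1 = (1, 4, 4), τ2 = (2, 6, 6), τ3 = (2, 8, 8) and a = 0, the iteration converges to L_3(0) = 6, so rt_3(0) = 6.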
Response Times under EDF Scheduling
73
Maximum Deadline Busy Period Lengths(τ):
    L_{n+1} = L;
    for i = n downto 1
        let k be such that e_k ≤ L_{i+1} − C_i + D_i < e_{k+1};
        a = e_k − D_i;
        while L_i(a) ≤ a
            let k be such that e_k ≤ L_i(a) − C_i + D_i < e_{k+1};
            a = e_k − D_i;
        endwhile
        L_i = L_i(a);
    endfor
    return (L_1, ..., L_n)

Figure 4.2 Pseudo-code of the algorithm for the computation of the maximum deadline busy period lengths.
Lemma 4.2 (George et al.) D_i ≤ D_j ⇒ L_i ≤ L_j.
Lemma 4.3 (George et al.) For all i, L_i(a) is non-decreasing in a.

Let E = ∪_{j=1}^n {mT_j + D_j : m ≥ 0} = {e_1, e_2, ...}, let L be the length of the initial busy period of the synchronous arrival pattern, and assume that for all i, D_i ≤ D_{i+1}. The algorithm for the computation of L_i starts by placing an instance of τ_i in such a way as to have its deadline at the largest e_k smaller than or equal to L_{i+1} − C_i + D_i (as previously remarked, these are the patterns that give the longest deadline busy periods). If the resulting deadline-e_k busy period includes this last instance of τ_i, L_i is found; otherwise the computation is repeated by choosing a smaller deadline, this time according to the value of L_i(a). Note that the algorithm always stops. The pseudo-code is reported in Figure 4.2.
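The algorithm of Figure 4.2 can be sketched in Python roughly as follows. This is a sketch under stated assumptions: tasks are hypothetical (C, T, D) tuples sorted by non-decreasing D, the synchronous busy period and L_i(a) are recomputed from scratch, and the deadline set E is enumerated only up to the range the search can touch.

```python
import math

def busy_period(tasks):
    """Length of the initial busy period of the synchronous arrival pattern."""
    t = sum(C for C, T, D in tasks)
    while True:
        nxt = sum(math.ceil(t / T) * C for C, T, D in tasks)
        if nxt == t:
            return t
        t = nxt

def L_i(tasks, i, a):
    """Deadline busy period length L_i(a), per the iterative formula (4.3)."""
    Ci, Ti, Di = tasks[i]
    d = a + Di
    own = (1 + a // Ti) * Ci
    t = 0
    while True:
        w = sum(min(math.ceil(t / Tj), 1 + (d - Dj) // Tj) * Cj
                for j, (Cj, Tj, Dj) in enumerate(tasks)
                if j != i and Dj <= d)
        if w + own == t:
            return t
        t = w + own

def max_deadline_busy_periods(tasks):
    """Figure 4.2: maximum deadline busy period lengths L_1 .. L_n.

    Assumes tasks are sorted by non-decreasing relative deadline D.
    """
    Lsync = busy_period(tasks)
    # absolute deadlines e_k = m*T + D, enumerated far enough for the search
    E = sorted({m * T + D for C, T, D in tasks
                for m in range(int(Lsync // T) + 2)})
    def largest_e_at_most(x):
        return max(e for e in E if e <= x)
    result = [0] * len(tasks)
    L_next = Lsync  # plays the role of L_{n+1}
    for i in reversed(range(len(tasks))):
        Ci, Ti, Di = tasks[i]
        a = largest_e_at_most(L_next - Ci + Di) - Di
        while L_i(tasks, i, a) <= a:   # busy period misses the instance: shrink a
            a = largest_e_at_most(L_i(tasks, i, a) - Ci + Di) - Di
        result[i] = L_i(tasks, i, a)
        L_next = result[i]
    return result
```

With the example set τ1 = (1, 4, 4), τ2 = (2, 6, 6), τ3 = (2, 8, 8), the synchronous busy period has length L = 6 and the algorithm returns L_1 = L_2 = L_3 = 6.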
4.3 ALGORITHM DESCRIPTION
Once the bound L_i has been computed for each task τ_i, the worst case response times can be computed by evaluating the following equation:

    rt_i = max_{a ∈ A} {rt_i(a)}.
A is the set of the significant values for the parameter a. As previously addressed, it is the set of instants at which a + D_i coincides with the absolute deadline of at least one other task instance. Namely,

    A = ( ∪_{j=1}^n {kT_j + D_j − D_i : k ≥ 0} ) ∩ [0, L_i).

Finally, rt_i(a) is computed by means of Equation (4.1) and the iterative formula (4.3). Note that, as for the problem of deciding the feasibility of a task set, the complexity of the worst case response time computation is pseudo-polynomial whenever the processor utilization of the task set is upper bounded by a fixed positive constant smaller than 1 [9]. Similarly, whether the problem has a fully polynomial solution is an open question.
4.4 EXTENDED TASK MODELING
In Section 3.2.7, the feasibility analysis for hybrid task sets was extended to more complex models in which release jitter, sporadically periodic tasks, tick scheduling, and non-preemptive scheduling are, in turn, taken into account. These extensions, as well as a new one termed deadline tolerance, are now discussed with respect to the computation of worst case response times.
4.4.1 Deadline Tolerance
The concept of deadline tolerance was introduced by Buttazzo and Stankovic [2, 3]. The idea is to accept a schedule as feasible even if some deadlines are missed, provided that the lateness of any task instance does not exceed a certain value called its tolerance.

Definition 4.1 A task τ_i has deadline tolerance δ_i if any of its instances with deadline d must complete within d + δ_i.
Note that for a task instance, having a tolerance greater than zero is different from having a longer deadline. The deadline is the point in time by which the instance should complete, and it can be used by the operating system to drive the scheduling algorithm, while the tolerance is the maximum lateness tolerated in case of a late job. Figure 4.3 shows an example of an EDF schedule with deadline tolerance (task computation times C_1 = 4, C_2 = 4, C_3 = 6). The first and the last jobs both miss their respective deadlines. However, since they have enough tolerance, 4 and 2 respectively, the schedule is accepted as feasible.

Figure 4.3 EDF schedule with deadline tolerance.

If each task is allowed to have its own tolerance, the analysis of the system becomes very difficult. In fact, the amount of lateness that a job can have depends strictly on the sequence of job arrivals. A possible approach is the run-time solution proposed in [2], where at each job arrival the complete schedule of the current job set is examined in order to evaluate each possible lateness. This algorithm is useful when job executions can be dynamically rejected, but not when the feasibility of the task set has to be checked a priori.
In the latter case, an obvious solution is now given by the worst case response time computation. To check the feasibility of a task τ_i, it suffices to compute its maximum lateness l_i = max{0, rt_i − D_i} and to compare it with the task tolerance δ_i.
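The check is a one-liner; a sketch (function and parameter names are mine):

```python
def lateness_ok(rt_i, D_i, delta_i):
    """Deadline-tolerance test: the worst case lateness l_i = max(0, rt_i - D_i)
    must not exceed the tolerance delta_i."""
    l_i = max(0, rt_i - D_i)
    return l_i <= delta_i
```

For example, a task with rt_i = 14 and D_i = 12 has lateness 2, so it is accepted with tolerance δ_i = 4 but rejected with δ_i = 1.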
Figure 4.4 Arrival pattern for the evaluation of a task τ_i's response time.

4.4.2 Release Jitter
When release jitter is introduced in the task model, the computation of worst case response times is extended just as shown for the feasibility analysis. The argument of Lemma 4.1 can be applied to the extended model too; the difference is again in the most constraining arrival patterns, owing to the initial release jitter. An example of the patterns examined to find the worst case response time of a task τ_i is shown in Figure 4.4. As previously, given a, all possible τ_i instances are included in the pattern, so that there is one arrival at time t = a. By forcing the first instance to experience a release jitter J_i, the number of τ_i instances released up to time a now is

    1 + ⌊(a + J_i)/T_i⌋.

Similarly, the computation of the higher priority workload arrived up to time t is

    W_i(a, t) = Σ_{j ≠ i, D_j ≤ a+D_i+J_j} min{ ⌈(t + J_j)/T_j⌉, 1 + ⌊(a + D_i + J_j − D_j)/T_j⌋ } C_j.
The length of the resulting busy period relative to the deadline d = a + D_i can still be computed by means of the iterative computation (4.3), accordingly modified. The worst case response time relative to a is then

    rt_i(a) = max{J_i + C_i, L_i(a) − a}.
Equation (4.2) can finally be applied to find rt_i. There is a slight difference, however, in the meaning of the variable a: in the previous case a denoted both the arrival and the release time of the instance considered; in the presence of release jitter it may be only the arrival time, with the release time possibly being up to J_i units of time later. In fact, the value of rt_i(a) in Equation (4.2) must be evaluated for all significant values of a such that the release time of the τ_i instance considered is greater than or equal to 0. Thus the computation can be limited to the interval [−J_i, L_i − J_i − C_i). Namely, the set A on which rt_i(a) must be evaluated is now

    A = ( ∪_{j=1}^n {kT_j + D_j − J_j − D_i : k ≥ 0} ) ∩ [−J_i, L_i − J_i − C_i).
4.4.3 Sporadically Periodic Tasks
The extension to sporadically periodic tasks can be handled with a very similar approach. Assuming the definitions of I_j(t) and H_j(t) given in Section 3.7, the number of τ_i instances released up to time a is H_i(a + D_i). Also, the number of τ_j instances which have deadlines before or at a + D_i is H_j(a + D_i). Similarly, the number of τ_j instances released by time t is I_j(t). The higher priority workload having arrived by time t is thus

    W_i(a, t) = Σ_{j ≠ i, D_j ≤ a+D_i+J_j} min{ I_j(t), H_j(a + D_i) } C_j.

The rest of the procedure, i.e., the computation of the busy period relative to d = a + D_i, the computation of the worst case response time relative to a, and the computation of the overall worst case response time, remains basically unchanged. Refer to [9] for a full description.
4.4.4 Tick Scheduling
Applying the same argument as for Theorem 3.15, Lemma 4.1 can also be generalized to the mixed scheduling model, in which the cost of an actual scheduler implementation is taken into account. That is, the worst case response times can still be evaluated with the described approach; however, the new availability function must be considered. In practice, only the definition of W_i(a, t), the higher priority workload arriving by time t, needs to be modified by taking into account the additional term due to the tick scheduler overheads:

    W_i(a, t) = Σ_{j ≠ i, D_j ≤ a+D_i+J_j} min{ ⌈(t + J_j)/T_j⌉, 1 + ⌊(a + D_i + J_j − D_j)/T_j⌋ } C_j + OV(t).

All the rest is unchanged. As for the previous section, refer to [9] for a more detailed description.
4.4.5 Non-Preemptive Non-Idling EDF Scheduling
In Section 3.8 it has been shown that the feasibility check of a task set scheduled by a non-preemptive non-idling EDF scheduler is very similar to the preemptive case. As expected, the similarity also holds for the computation of task worst case response times, as reported in [5]. There are, however, two differences worth mentioning:

•  The first is also addressed in Theorem 3.16: owing to the absence of preemption, a task instance with a later absolute deadline can cause a priority inversion, which must be accounted for.

•  The second difference is in some way more subtle. Again owing to the non-preemptability of task instance executions, the attention has to be on the busy period preceding the execution start time of the instance, and not on the busy period preceding its completion time, as is the case in the preemptive model. The reason is that when preemption is not allowed, once a task instance has gained the processor, it cannot be preempted, even if a higher priority instance becomes ready during its execution.

Before showing the details of the worst case response time computation, it is first necessary to characterize the scenarios, and in particular the relative deadline busy periods, which provide the local response time maxima.

Lemma 4.4 (George et al.) The worst case response time of a task τ_i is found in a deadline busy period for τ_i in which τ_i has an instance released at time a (and possibly others released before), all tasks with relative deadline smaller than or equal to a + D_i are released from time t = 0 on at their maximum rate, and finally a further task with relative deadline greater than a + D_i, if any, has an instance released at time t = −1.
Figure 4.5 Local response time maxima when non-preemptive EDF scheduling is assumed.
In Figure 4.5 an example of such a scenario is depicted. Note that, with respect to the deadline a + D_i, there can be at most a single priority inversion before the corresponding deadline busy period. Other inversions may be present within this busy period. However, since it includes by definition only instances with deadlines less than or equal to a + D_i, possible inversions between these instances are irrelevant as far as the τ_i instance released at time a is concerned: all have to be executed before it anyway, regardless of their execution schedule.

As suggested by the lemma, if the length of the busy period starting at time t = 0 and preceding the execution of the τ_i instance released at time a is termed L_i(a), the response time of the instance is

    L_i(a) + C_i − a.

Since the computation of L_i(a) may occasionally give a value smaller than a, more generally it is

    rt_i(a) = max{C_i, L_i(a) + C_i − a}.
The length L_i(a) can be determined by finding the smallest solution of the equation

    t = max_{D_j > a+D_i} {C_j − 1} + W_i(a, t) + ⌊a/T_i⌋ C_i,    (4.4)

where the first term on the right side accounts for the worst case priority inversion with respect to the deadline a + D_i, while the second and third terms represent the time needed to execute the largest deadline-(a + D_i) busy period that precedes the execution of the τ_i instance released at time a. More precisely, the second term is the time needed to execute instances of tasks other than τ_i with absolute deadlines smaller than or equal to a + D_i and release times before this τ_i instance's execution start time. Finally, the third term is the time needed to execute the τ_i instances released before a. The rationale of the equation is to compute the time needed by the τ_i instance released at time a to get the processor: every other higher priority instance released before this event is executed earlier, thus its execution time must be accounted for. For the same reason, the function W_i(a, t) must account for all higher priority instances released in the interval [0, t], thus also including those possibly released at time t. For any task τ_j, the maximum number of instances released in [0, t] is 1 + ⌊t/T_j⌋.
However, at most 1 + ⌊(a + D_i − D_j)/T_j⌋ among them can have an absolute deadline before or at a + D_i. It follows that

    W_i(a, t) = Σ_{j ≠ i, D_j ≤ a+D_i} min{ 1 + ⌊t/T_j⌋, 1 + ⌊(a + D_i − D_j)/T_j⌋ } C_j.

Since W_i(a, t) is a monotonic non-decreasing step function, the smallest solution of Equation (4.4) can be found using the following fixed point computation:

    L_i^(0)(a) = 0,
    L_i^(m+1)(a) = max_{D_j > a+D_i} {C_j − 1} + W_i(a, L_i^(m)(a)) + ⌊a/T_i⌋ C_i.
According to the argument of Lemma 4.4, the response time relative to a, rt_i(a), is defined as a function of the busy period length L_i(a), which is upper bounded by L, the length of the synchronous busy period [5]. Hence, the computation of rt_i(a) can be coherently limited to values of a smaller than L. That is, the worst case response time of τ_i is finally

    rt_i = max {rt_i(a) : 0 ≤ a < L}.

The number of evaluations of rt_i(a) necessary to compute rt_i can be further reduced by observing that the right side of Equation (4.4) is a step function whose discontinuities in a are at values equal to kT_j + D_j − D_i, for some task τ_j and some integer k. The significant values of a in the interval [0, L) are reduced accordingly.

Moreover, as for the preemptive case, it is possible to further restrict the interval in which the worst case response time of τ_i has to be looked for, to [0, L_i], with L_i being the maximum length of a deadline busy period including a τ_i instance (see [5] for a detailed description).

Note also that, similarly to the feasibility condition, the computation of worst case response times has pseudo-polynomial time complexity, since L_i is upper bounded by L, whose length is pseudo-polynomial whenever U ≤ c, with c a positive constant smaller than 1 [5].
    Semaphore   Locked by   Time held
        1        Task 9        900
        2        Task 9        300
        2        Task 15      1350
        3        Task 6        400
        3        Task 10       400
        4        Task 3        100
        4        Task 9        300
        5        Task 11       750
        5        Task 15       750

Table 4.1 List of semaphores and locking pattern for the GAP task set.

4.5 CASE STUDY
The theory presented in this chapter has been applied to the GAP (Generic Avionics Platform) task set, a small avionics case study described by Locke et al. in [7]. There are seventeen tasks in the GAP set, of which all but one are strictly periodic, with periods multiple of T_tick; therefore they do not suffer release jitter. Task 11 is a sporadic task, whose arrival is assumed to be polled by the tick scheduler, hence it may suffer a worst case release jitter equal to T_tick. Some tasks also share resources that are accessed by locking and unlocking semaphores according to a hypothetical pattern, which is assumed to be equal to that described by Tindell et al. in [12], and which is reported in Table 4.1. Similarly, a tick scheduler is assumed with the same parameters as in their description, that is:

    C_tick = 66 μs    T_tick = 1000 μs    C_QL = 74 μs    C_QS = 40 μs.

According to Tindell et al.'s approach [12], for this task set there is no optimal priority assignment able to guarantee all tasks under a fixed priority system. However, according to the EDF analysis, the GAP task set is indeed feasible under EDF scheduling. The worst case response times computed with the theory described in this chapter are reported in Table 4.2, where all times are given in microseconds. Note that all tasks with the same relative deadline have the same worst case response time. This case study demonstrates the potential advantage of EDF scheduling over fixed priority scheduling.
    Task      D_i       T_i      C_i    B_i    J_i     rt_i      a_i
      1      5000    200000    3000      0      0     4180        0
      2     25000     25000    2000    300      0    12280        0
      3     25000     25000    5000    300      0    12280        0
      4     40000     40000    1000    300      0    20226    40000
      5     50000     50000    3000    400      0    30226    30000
      6     50000     50000    5000    400      0    30226    30000
      7     59000     59000    8000    400      0    39226    21000
      8     80000     80000    9000   1350      0    60226        0
      9     80000     80000    2000   1350      0    60226        0
     10    100000    100000    5000   1350      0    74150        0
     11    200000    200000    1000   1350   1000   168558        0
     12    200000    200000    3000      0      0   168558        0
     13    200000    200000    1000      0      0   168558        0
     14    200000    200000    1000      0      0   168558        0
     15    200000    200000    3000      0      0   168558        0
     16   1000000   1000000    1000      0      0   198760        0
     17   1000000   1000000    1000      0      0   198760        0

Table 4.2 GAP task set parameters.

4.6 SUMMARY
This chapter presented solutions for computing the worst case response times of tasks. These worst case response times can then be compared to task deadlines to check for feasibility. Another advantage of this approach is that it is extensible to distributed real-time computing, as shown in Chapter 9.
REFERENCES
[1] N. Audsley, A. Burns, M. Richardson, K. Tindell, and A.J. Wellings, "Applying New Scheduling Theory to Static Priority Pre-emptive Scheduling," Software Engineering Journal, September 1993.

[2] G.C. Buttazzo and J.A. Stankovic, "RED: A Robust Earliest Deadline Scheduling Algorithm," Proc. of 3rd Int. Workshop on Responsive Computing Systems, 1993.

[3] G.C. Buttazzo and J.A. Stankovic, "Adding Robustness in Dynamic Preemptive Scheduling," in Responsive Computer Systems: Toward Integration of Fault Tolerance and Real-Time, Kluwer Press, 1994.

[4] L. George, P. Muhlethaler, and N. Rivierre, "Optimality and Non-Preemptive Real-Time Scheduling Revisited," Rapport de Recherche RR-2516, INRIA, Le Chesnay Cedex, France, 1995.

[5] L. George, N. Rivierre, and M. Spuri, "Preemptive and Non-Preemptive Real-Time Uniprocessor Scheduling," Rapport de Recherche RR-2966, INRIA, Le Chesnay Cedex, France, 1996.

[6] C.L. Liu and J.W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the Association for Computing Machinery 20(1), 1973.

[7] C.D. Locke, D.H. Vogel, and T.J. Mesler, "Building a Predictable Avionics Platform in Ada: A Case Study," Proc. of IEEE Real-Time Systems Symposium, 1991.

[8] M. Spuri, "Earliest Deadline Scheduling in Real-Time Systems," Doctorate Dissertation, Scuola Superiore S. Anna, Pisa, Italy, 1995.

[9] M. Spuri, "Analysis of Deadline Scheduled Real-Time Systems," Rapport de Recherche RR-2772, INRIA, Le Chesnay Cedex, France, 1996.

[10] M. Spuri, "Holistic Analysis for Deadline Scheduled Real-Time Distributed Systems," Rapport de Recherche RR-2873, INRIA, Le Chesnay Cedex, France, 1996.

[11] J.A. Stankovic, M. Spuri, M. Di Natale, and G. Buttazzo, "Implications of Classical Scheduling Results for Real-Time Systems," IEEE Computer, June 1995.

[12] K. Tindell, A. Burns, and A.J. Wellings, "An Extendible Approach for Analyzing Fixed Priority Hard Real-Time Tasks," Real-Time Systems 6(2), 1994.

[13] K. Tindell and J. Clark, "Holistic Schedulability Analysis for Distributed Hard Real-Time Systems," Microprocessors and Microprogramming 40, 1994.
5 PLANNING-BASED SCHEDULING
Many real-time applications are deployed in dynamic environments and hence require support for scheduling jobs as they arrive. Dynamic scheduling allows more flexibility in dealing with problems faced in practice, such as the need to alter scheduling decisions based on the occurrence of overloads, e.g., when

•  the environment changes,
•  there is a burst of job arrivals, or
•  a part of the system fails.
If system overloads are assumed to be impossible, then schedulability analysis based on EDF can be used. If overloads do not occur, when a job is preempted there is an implicit guarantee that the remainder of the job will be completed before its deadline. Unfortunately, EDF can rapidly degrade system performance during overloads [23]. The arrival of a new job may result in all the previous jobs missing their deadlines. Such an undesirable phenomenon, called the Domino Effect, is depicted in Figure 5.1.
In particular, Figure 5.1a shows a feasible schedule of a job set executed under the EDF scheduling algorithm. However, if at time t_0 job J_0 is executed, all previous jobs miss their deadlines (see Figure 5.1b). In such a situation, EDF does not provide any guarantee on which jobs meet their timing constraints. This is very undesirable behavior in those control applications in which a guaranteed minimum level of performance is necessary. In order to avoid domino effects, the operating system and the scheduling algorithm must be explicitly designed to handle overloads in a more controlled fashion, so that the damage due to a deadline miss can be minimized.
Figure 5.1 (a) Feasible schedule with Earliest Deadline First, in normal load condition. (b) Overload with domino effect due to the arrival of job J_0.
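The domino effect is easy to reproduce in a few lines. Below is a minimal sketch with a hypothetical job set of my own choosing; for jobs that all arrive at time 0, preemptive EDF reduces to executing jobs in deadline order, which the sketch exploits.

```python
def edf_same_arrival(jobs):
    """Count deadline misses under EDF for jobs that all arrive at time 0.

    jobs: list of (C, d) pairs (computation time, absolute deadline).
    With equal arrival times, EDF simply executes jobs in deadline order."""
    t = 0
    misses = 0
    for C, d in sorted(jobs, key=lambda job: job[1]):
        t += C
        if t > d:
            misses += 1
    return misses
```

With the set {(2, 4), (2, 6), (2, 8), (2, 10)} every job completes on time; adding a single urgent overload job (3, 3) lets the newcomer meet its deadline while all four previously feasible jobs miss theirs, which is exactly the domino behavior described above.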
One possibility is to design a system for the worst case loads. Taking this approach, it is usually assumed that worst case times are known and, therefore, that overloads and failures never occur. Unfortunately, even with this assumption it is often inefficient to determine schedulability or to construct a schedule for such a system a priori. Also, in a dynamic system it is not possible to guarantee a priori that all job arrivals will be able to meet their deadlines: if the arrival times of jobs are not known, the schedulability of all the jobs cannot be guaranteed.
Of more practical interest is the scheduling of jobs with timing constraints, precedence constraints, resource constraints, and arbitrary importance on multiprocessors. Unfortunately, most instances of the scheduling problem for real-time systems are computationally intractable. Non-preemptive scheduling is desirable as it avoids context switching overheads, but determining such a
schedule is an NP-hard problem even on uniprocessors when jobs can have arbitrary ready times [14]. The presence of other constraints exacerbates the situation. This makes it clear that it serves no effective purpose to try to obtain an optimal schedule, especially when decisions are made dynamically. Dertouzos and Mok studied multiprocessor on-line scheduling of real-time jobs [27, 12], noting that for most real-world circumstances, optimal dynamic algorithms do not exist [19, 8, 28]. With multiprocessors, no dynamic scheduling algorithm is optimal and can guarantee all jobs without prior knowledge of job deadlines, computation times, and arrival times [27]. Such knowledge is not available in dynamic systems, so it is necessary to resort to approximate algorithms or to use heuristics to construct the schedules. Any real-time system must exhibit graceful degradation under failures and overloads. To achieve this, not only must the fact that a job did not meet its deadline be detected, but the fact that this is going to occur must be detected as soon as possible so that, by signaling this exception, the job can be substituted by one or more contingency jobs. Thus on-line schedulability analysis must have an early warning feature which provides sufficient lead time for the timely invocation of contingency jobs, making it possible for the scheduler to take account of a continuously changing environment. An advantage of dynamic scheduling is that fairly complex priority assignment policies can be used. This gives dynamic algorithms a lot of flexibility and aids in their ability to deal with a wide variety of job and resource characteristics. But a priority-based scheduler may incur substantial overheads in calculating the priorities of jobs and in selecting the job of highest priority. When dynamic priorities are used, the relative priorities of jobs can change as time progresses, as new jobs arrive, or as jobs execute.

Whenever one of these events occurs, the priority of all the remaining jobs must be recomputed. This can make the use of dynamic priorities more expensive in terms of run-time overheads, and in practice these overheads must be kept as small as possible. Trying to minimize scheduling overheads conflicts with the goal of providing the early warning feature. The earlier an arriving job is checked for feasibility, the sooner it can be known whether it will meet its deadline. However, if it is scheduled early but remains in the system for too long, say because its deadline is far away, then scheduling costs can be high for subsequent scheduling decisions, since they must ensure the schedulability of this and other previously accepted jobs.
How practical planning-based approaches address these issues is discussed in subsequent sections. In preparation for this, the definitions of load in dynamic systems and the metrics relevant for dynamic real-time systems are discussed.
5.1 PRELIMINARIES: LOAD, METRICS, VALUE FUNCTIONS

5.1.1 Definition of Load in Dynamic Systems
In a real-time system, the definition of computational workload depends on the temporal characteristics of the computational activities. For non-real-time or soft real-time jobs, a commonly accepted definition of workload refers to the standard queueing theory definition. Here the load ρ, also called traffic intensity, is

    ρ = λC,

where C is the mean service time and λ is the average arrival rate of the jobs. Notice that this definition does not take deadlines into account, hence it is not particularly useful for describing real-time workloads. In a hard real-time environment, a system is overloaded when, based on worst case assumptions, there exists an interval during which the work exceeds the available capacity, so one or more jobs might miss their deadlines.
If the job set consists of n independent preemptable periodic tasks whose relative deadlines are equal to their periods, then the system load ρ is equivalent to the processor utilization factor

    U = Σ_{i=1}^n C_i/T_i,

where C_i and T_i are the computation time and the period of task τ_i, respectively. In this case, a load ρ > 1 means that the total computation time requested by the periodic activities in their hyperperiod H = lcm(T_1, T_2, ..., T_n) exceeds the available time on the processor; therefore the task set cannot be scheduled by any algorithm.
A general method for calculating the load in an aperiodic real-time environment has been proposed in [6]. According to this method, the load is computed at each job activation time r_i, and the number of intervals in which the computation of load is done is limited by the number of job deadlines d_i. The method for computing the load is based on the consideration that, for a single job J_i, the load is given by the ratio of its computation time C_i to its relative deadline D_i = d_i - r_i. For example, if C_i = D_i, i.e., the job has no slack time, the load in the interval [r_i, d_i] is one. When a new job arrives, the load can be computed from the last request time, which is also the current time t, and the longest deadline, say d_n. In this case, the intervals that need to be considered for the computation are [t, d_1], [t, d_2], ..., [t, d_n]. In general, the processor load in the interval [t, d_i] is given by

ρ_i(t) = ( Σ_{d_k ≤ d_i} c_k(t) ) / (d_i - t)

where c_k(t) refers to the remaining execution time of job J_k at time t. Hence, the total load in the interval [t, d_n] can be computed as the maximum among all the ρ_i(t), that is:

ρ = max_{i=1,...,n} ρ_i(t).
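With the jobs sorted by deadline, ρ can be computed in a single pass. A minimal sketch (the job values below are illustrative):

```python
def processor_load(jobs, t=0.0):
    """rho(t) = max_i rho_i(t), where rho_i(t) sums the remaining execution
    times c_k(t) of all jobs with d_k <= d_i and divides by (d_i - t).

    jobs: list of (c, d) pairs, c = remaining execution time,
          d = absolute deadline (all d > t)."""
    rho, acc = 0.0, 0.0
    for c, d in sorted(jobs, key=lambda j: j[1]):
        acc += c                      # all jobs with deadline <= d contribute
        rho = max(rho, acc / (d - t))
    return rho

print(processor_load([(3, 3)]))           # a job with no slack gives load 1.0
print(processor_load([(2, 4), (4, 5)]))   # 6/5 = 1.2, an overload
```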
5.1.2 Performance Metrics under Overloads
When a real-time system is underloaded and dynamic activation of jobs is not allowed, there is no need to consider a job's importance in the scheduling policy, since there exist optimal scheduling algorithms that can guarantee a feasible schedule under a given set of assumptions. For example, Dertouzos [11] proved that EDF is an optimal algorithm for preemptive, independent jobs when there is no overload. On the contrary, when jobs can be activated dynamically and an overload occurs, there are no algorithms that can guarantee a feasible schedule of the job set. Since one or more jobs may miss their deadlines, it is preferable, from the viewpoint of achieving graceful degradation, that the least important jobs are the ones that get delayed. Hence, in overload conditions it is important to distinguish between the time constraints and the importance of jobs. In general, the importance of a job is not related to its deadline or its period; thus a job with a long deadline could be much more important than another one with an earlier deadline. For example, in a chemical process, monitoring the temperature every ten seconds is certainly more important than updating the clock picture on the user console every second. This means that, during an overload involving these
two tasks, it is better to skip one or more clock updates rather than missing the deadline of a temperature reading, since the latter could have a major impact on the controlled environment. In order to specify importance, an additional parameter, its value, is usually associated with each job and can be used by the system to make scheduling decisions.
5.1.3 Value/Utility Functions
The value associated with a job reflects its importance with respect to the other jobs in the set. The specific assignment depends on the particular application. For instance, there are situations in which the value is set equal to the job computation time; in other cases it is an arbitrary integer number in a given range; in other applications it is set equal to the ratio of an arbitrary number (which reflects the importance of the job) and the job computation time (this ratio is called value density). In a real-time system, however, the actual value of a job also depends on the time at which the job is completed, hence the job's importance can be better described by a utility function. Figure 5.2 illustrates some utility functions that can be associated with a job in order to describe its importance. According to this view, a non-real-time job, which has no time constraints, has a constant (low) value, since it always contributes to the system value whenever it completes its execution. On the contrary, a hard real-time job contributes a value only if it completes within its deadline and, since a deadline miss would jeopardize the behavior of the whole system, the value after its deadline can be considered minus infinity in many situations. A job with a soft deadline can still give a value to the system if executed after its deadline, although this value may decrease with time. Firm real-time activities are those that do not unduly jeopardize the system, but give zero value, if completed after their deadline. Once the importance of each job has been defined, the performance of a scheduling algorithm can be measured by accumulating the values of the job utility functions computed at their completion times. Specifically, the cumulative value of a scheduling algorithm A is defined as the following quantity:
Γ_A = Σ_i v(f_i)

where f_i is the completion time of job J_i.
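The four job classes of Figure 5.2 can be modeled directly as utility functions of the finishing time f. In the sketch below, the linear decay used for the soft class is only one illustrative choice; the text does not prescribe a particular shape:

```python
NEG_INF = float("-inf")

def make_utility(kind, value, deadline=None):
    """Return v(f), the utility of finishing at time f, for one job."""
    def v(f):
        if kind == "non-real-time":
            return value                              # constant (low) value
        if kind == "hard":
            return value if f <= deadline else NEG_INF
        if kind == "firm":
            return value if f <= deadline else 0.0
        if kind == "soft":                            # illustrative linear decay
            return value if f <= deadline else max(0.0, value - (f - deadline))
        raise ValueError(kind)
    return v

def cumulative_value(completions):
    """Gamma_A = sum of v_i(f_i) over (utility, finishing-time) pairs."""
    return sum(v(f) for v, f in completions)

firm = make_utility("firm", 6, deadline=7)
hard = make_utility("hard", 10, deadline=11)
assert firm(8) == 0.0 and hard(10) == 10
```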
Figure 5.2 Utility functions that can be associated with a job to describe its importance.
Given this metric, a scheduling algorithm is optimal if it maximizes the cumulative value achievable for a job set. Notice that if a job with a hard constraint misses its deadline, the cumulative value achieved by the algorithm is minus infinity, even though all other jobs complete before their deadlines. For this reason, all activities with hard time constraints should be guaranteed a priori by assigning them dedicated resources (including processors). If all hard deadlines are guaranteed a priori, the objective of a real-time scheduling algorithm for soft and firm jobs should be to guarantee a feasible schedule in underload conditions and to maximize the cumulative value during overloads. If more general utility functions are possible, then it may also be a requirement for soft and firm jobs to maximize the cumulative value in underload conditions. Given a set of n jobs J_i(C_i, D_i, V_i), where C_i is the worst case computation time, D_i the relative deadline, and V_i the importance value gained by the system when the job completes within its deadline, the maximum cumulative value achievable on the job set is equal to the sum of all values V_i, i.e., Γ_max = Σ_{i=1}^{n} V_i. In overload conditions, this value cannot be achieved, since one or more jobs will miss their deadlines. Hence, if Γ* is the maximum cumulative value that can be achieved by any algorithm on a job set in overload conditions, the performance
Figure 5.3 No optimal on-line algorithm exists in overload conditions, since the schedule that maximizes Γ depends on the knowledge of future arrivals.
of a scheduling algorithm A can be measured by comparing the cumulative value Γ_A obtained by A with the maximum achievable value Γ*. Consider for example the job set shown in Figure 5.3, consisting of three jobs J_1(10, 11, 10), J_2(6, 7, 6), J_3(6, 7, 6). Without loss of generality, assume that the importance values associated with the jobs are equal to their execution times (V_i = C_i) and that jobs are firm, so no value is accumulated if a job completes after its deadline. If J_1 and J_2 simultaneously arrive at time t_0 = 0, there is no way to maximize the cumulative value without knowing the arrival time of J_3. In fact, if J_3 arrives at time t = 4 or before, the maximum cumulative value is Γ* = 10 and can be achieved by scheduling job J_1 (Figure 5.3a). However, if J_3 arrives between time t = 5 and time t = 8, the maximum cumulative value is Γ* = 12 and
can be achieved by scheduling jobs J_2 and J_3 and discarding J_1 (Figure 5.3b). Notice that if J_3 arrives at time t = 9 or later, then the maximum cumulative value is Γ* = 16 and can be accumulated by scheduling jobs J_1 and J_3. In this brief section on value-based real-time scheduling only simple situations were described. In general, the value of a job may depend on many things, such as the system mode, the sets of other jobs, the availability of input data, the presence of faults, etc. Little is known about scheduling under such situations. Normally, system designers approximate all these complicated issues with a simple value function.
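The three cases above can be checked by brute force: enumerate every subset and ordering of the firm jobs, simulate a non-preemptive run, and keep the best cumulative value. This is a verification sketch, not a practical scheduler (job tuples are (r, C, d, V)):

```python
from itertools import permutations

def max_cumulative_value(jobs):
    """Clairvoyant optimum Gamma*: best value over all subsets/orderings of
    firm jobs (r, C, d, V) executed non-preemptively on one processor."""
    best = 0
    for n in range(1, len(jobs) + 1):
        for order in permutations(jobs, n):
            t, gained, ok = 0, 0, True
            for r, C, d, V in order:
                start = max(t, r)
                if start + C > d:       # a firm job would miss: schedule invalid
                    ok = False
                    break
                t, gained = start + C, gained + V
            if ok:
                best = max(best, gained)
    return best

def job_set(t3):
    """J1(10,11,10) and J2(6,7,6) released at 0; J3(6,7,6) released at t3."""
    return [(0, 10, 11, 10), (0, 6, 7, 6), (t3, 6, t3 + 7, 6)]

assert max_cumulative_value(job_set(4)) == 10   # schedule J1 only
assert max_cumulative_value(job_set(5)) == 12   # schedule J2 then J3
assert max_cumulative_value(job_set(9)) == 16   # schedule J1 then J3
```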
5.1.4 On-line vs. Clairvoyant Scheduling
What the previous example shows is that, without a priori knowledge of the job arrival times, no on-line algorithm can guarantee the maximum cumulative value Γ*. This value can only be achieved by an ideal clairvoyant scheduling algorithm, which knows the future arrival time of every job. Although the optimal clairvoyant scheduler is a pure theoretical abstraction, it can be used as a reference model to evaluate the performance of real on-line scheduling algorithms in overload conditions.
Definition 1 A scheduling algorithm A has a competitive factor φ_A if and only if it can guarantee a cumulative value

Γ_A ≥ φ_A Γ*

where Γ* is the cumulative value achieved by the optimal clairvoyant scheduler.
From the above definition (given in [1]), notice that the competitive factor is a real number φ_A ∈ [0, 1]. If an algorithm A has a competitive factor φ_A, it means that A can achieve a cumulative value Γ_A at least φ_A times the cumulative value achievable by the optimal clairvoyant scheduler on any job set. If the overload has an infinite duration, then no on-line algorithm can guarantee a competitive factor greater than zero. In real situations, however, overloads are intermittent and usually have a short duration, hence it is desirable to use scheduling algorithms with a high competitive factor.
Figure 5.4 Situation in which EDF has an arbitrarily small competitive factor.
Unfortunately, without any form of guarantee, the basic EDF algorithm has a zero competitive factor. To show this fact it is sufficient to find an overload situation in which the cumulative value obtained by EDF can be arbitrarily reduced with respect to that achieved by the clairvoyant scheduler. Consider the example shown in Figure 5.4, where jobs have a value proportional to their computation time. This is an overload condition because both jobs cannot be completed by their deadlines. When job J_2 arrives, EDF preempts J_1 in favor of J_2, which has an earlier deadline, so it gains a cumulative value of C_2. On the other hand, the clairvoyant scheduler always gains C_1 > C_2. Since the ratio C_2/C_1 can be arbitrarily small, it follows that the competitive factor of EDF is zero. An important theoretical result found in [1] is that there is an upper bound on the competitive factor of any on-line algorithm. In particular, the following theorem has been proved.
Theorem 1 If a job's value is proportional to its computation time, then no on-line algorithm can guarantee a competitive factor greater than 0.25.
The proof is done by using an adversary argument, in which the on-line scheduling algorithm is identified as a player and the clairvoyant scheduler as the adversary. In order to create worst case conditions, the adversary dynamically generates jobs depending on the player's decisions. At the end of the game, the adversary reveals its schedule and the two cumulative values are computed. Since the player does its best in worst case conditions, the ratio of the cumulative values gives the upper bound on the competitive factor for any on-line algorithm. Baruah et al. [1] also showed that, when using value density metrics (where the value density of a job is its value divided by its computation time), the best
that an on-line algorithm can guarantee is

φ_max = 1 / (1 + √k)²

where k is the importance ratio between the highest and the lowest value density jobs in the system. Koren and Shasha [20] also found an on-line scheduling algorithm, called D^over, having the best possible competitive factor.
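The bound is easy to evaluate numerically; note that for k = 1 (all jobs with the same value density, e.g., value equal to computation time) it reduces to the 1/4 of Theorem 1:

```python
from math import isclose, sqrt

def best_competitive_factor(k):
    """Upper bound on the competitive factor of any on-line scheduler when the
    ratio between the highest and lowest value density in the system is k [1]."""
    return 1.0 / (1.0 + sqrt(k)) ** 2

assert isclose(best_competitive_factor(1), 0.25)   # Theorem 1's bound
assert isclose(best_competitive_factor(4), 1 / 9)  # the bound drops as k grows
```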
It is worth pointing out, however, that the above bounds are achieved under very restrictive assumptions, such as: all jobs have zero laxity, the overload can have an arbitrary (but finite) duration, and a job's execution time can be arbitrarily small. In most real-world real-time applications, job characteristics are much less restrictive, and a lot is known about the actual job set. Therefore, the 1/4 bound has only theoretical validity, and more work is needed to derive other bounds based on more knowledge of the actual real-time job set of a given system.
5.2 STEPS IN A DYNAMIC PLANNING-BASED SCHEDULING APPROACH
In general, dynamic scheduling has three basic steps: feasibility checking, schedule construction, and dispatching. Dynamic planning-based scheduling combines these steps in various ways. Depending on the kind of application for which the system is designed, the programming model adopted, and the scheduling algorithm used, not all of these steps may be needed. Often, the boundaries between the steps may also not be clearly delineated. The basic steps are described first, followed by how they are combined into planning-based scheduling.

Feasibility Analysis

Feasibility, or schedulability, analysis is the process of determining whether the timing requirements of a set of jobs can be satisfied, usually under a given set of resource requirements and precedence constraints. Dynamic systems perform feasibility checking on-line, as jobs arrive.
Feasibility checking using schedulability formulae is most suited for periodic activities. Here, a dynamically arriving periodic task is accepted for execution if the utilization bound for the new task together with the currently existing tasks is not exceeded. Planning-based approaches provide similar support for aperiodic tasks. In dynamic planning-based approaches, execution of a job is begun only if it passes a feasibility test. The feasibility test can be based on a model of executing jobs according to the EDF scheduling discipline. Often, a result of the feasibility analysis is a schedule or plan that determines when a job should begin execution. Schedulability analysis is especially important for activities for which recovery following an abortion after partial execution can be complicated. Error handlers are complicated in general, and abnormal termination may produce inconsistent system states. This is likely to be the case especially if the activity involves inter-process interaction. In such situations, it is better to allow an activity to take place only if it can be guaranteed to complete by its deadline. If such a guarantee cannot be provided, then the program can perform an alternative action. To provide sufficient time for executing the alternative action, a deadline may be imposed on the determination of schedulability. This can be generalized so that there are N versions of the activity and the algorithm attempts to guarantee the execution of the best possible version. 'Best' refers to the value of the results produced by a particular version; typically, the better the value of the result, the longer the execution time.
Schedule Construction

Schedule construction is the process of ordering the jobs to be executed and storing this ordering in a form that can be used by the dispatching step. Whereas approaches that perform schedulability analysis by checking utilization bounds do not construct explicit schedules, for planning-based approaches schedule construction is usually a direct consequence of feasibility checking. In the former case, priorities are assigned to jobs and, at run time, the job in execution has the highest priority. Planning-based approaches can also be considered to assign priorities to jobs: these are used to decide which job should be placed next in the plan being constructed. However, the resultant schedule may or may not rely on priorities. For example, the final plan may be a priority-ordered list of jobs, or it may be a list of start and finish times for each job without any explicit priority identified. In the rest of this chapter the terms plan and schedule are used interchangeably.
Dispatching
Dispatching is the process of deciding which job to execute next. The complexity and requirements of the dispatching step depend on:

1. the scheduling algorithm used in the feasibility checking step;
2. whether a schedule is constructed as part of the schedulability analysis step;
3. the kinds of jobs, e.g., whether they are independent or have precedence constraints, and whether their execution is preemptive or non-preemptive; and
4. the nature of the execution platform, e.g., whether it has one processor or more, and how communication takes place.

For example, with non-preemptive scheduling a job is dispatched exactly once; with preemptive scheduling, a job is dispatched once when it first begins execution and again whenever it is resumed.

These three steps are combined in a dynamic planning-based approach as follows. When a job arrives, an attempt is made to guarantee the job by constructing a plan in which this new job executes to meet its timing constraints and all previously guaranteed jobs continue to meet their timing constraints. A job is guaranteed subject to a set of assumptions, for example, about its worst case execution time and resource needs, overhead costs, and the nature of the faults in the system. If these assumptions hold, once a job is guaranteed it will meet its timing requirements. Thus, feasibility is checked with each arrival. If the attempt to guarantee fails, the job is not feasible and a timing fault is forecast. If this is known sufficiently ahead of the deadline, there may be time to take alternative actions. For example, it may be possible to trade off quality for timeliness by attempting to schedule an alternative job which has a shorter computation time or smaller resource needs. In a distributed system, it may be possible to transfer the job (or an alternative job) to a less-loaded node. The alternative job must itself be guaranteed, to avoid its impacting previously guaranteed work. In the rest of this chapter, the issues underlying planning-based scheduling are discussed, followed by the details of plan construction.
Since the run-time cost of a dynamic approach is an important practical consideration, several techniques are discussed for efficient dynamic scheduling.
5.3 ALGORITHMS FOR DYNAMIC PLANNING
Most of the scheduling algorithms proposed in the literature use one of the following scheduling schemes, illustrated in Figure 5.5. Of these, the second and third schemes predict and handle overloads by controlling the entry of new jobs, and they are the subject of this section.

1. Best Effort Scheme. This scheme includes those algorithms with no prediction for overload conditions. At its arrival, a new job is always accepted into the ready queue, so the system performance can only be controlled through a proper priority assignment.
2. Admission Control Scheme. This scheme includes those algorithms in which the load on the processor is controlled by an acceptance test executed at each job arrival. Typically, whenever a new job enters the system, a guarantee routine verifies the schedulability of the job set based on worst-case assumptions. If the job set is found schedulable, the new job is accepted into the ready queue; otherwise, it is rejected.

3. Robust Scheme. This scheme includes those algorithms that separate timing constraints and importance by considering two different policies: one for job acceptance and one for job rejection. Typically, whenever a new job enters the system, an acceptance test verifies the schedulability of the new job set based on worst-case assumptions. If the job set is found schedulable, the new job is accepted into the ready queue; otherwise, one or more jobs are rejected based on a different policy.

The admission control scheme is able to avoid domino effects by sacrificing the execution of newly arriving jobs. Basically, the acceptance test acts as a filter that controls the load on the system. Once a job is accepted, the algorithm guarantees that it will complete by its deadline (assuming that no job exceeds its estimated worst-case computation time). Admission control schemes, however, do not take a job's importance into account and, during overloads, always reject the newly arrived job, regardless of its value. In certain conditions (such as when jobs have very different importance levels), this scheduling strategy may exhibit poor performance in terms of cumulative value, whereas a robust algorithm can be much more effective. In this section, two algorithms based on a dynamic planning approach are described. The first deals with jobs having deadline constraints and where
Figure 5.5 Scheduling schemes for handling overload situations. a. Best Effort. b. Admission Control. c. Robust.
each job also has a deadline tolerance. This algorithm incorporates features to select and reject less important jobs when an important job arrives but cannot be admitted because of the current load conditions. The second algorithm deals with jobs having deadline and resource constraints. Other related algorithms, together with their simulation results and performance comparisons, can be found in [5].
5.3.1 The RED Algorithm
RED (Robust Earliest Deadline) [6] is a robust scheduling algorithm for dealing with firm aperiodic jobs in overloaded environments. The algorithm synergistically combines many features, including graceful degradation in overloads, deadline tolerance, and resource reclaiming. It operates in normal and overload conditions and is able to predict not only deadline misses, but also the size of the overload, its duration, and its overall impact on the system. In RED, each job J_i(C_i, D_i, M_i, V_i) is characterized by four parameters: a worst case execution time (C_i), a relative deadline (D_i), a deadline tolerance (M_i), and an importance value (V_i). The deadline tolerance is the amount of time by which a job is permitted to be late, i.e., the amount of time that a job may execute after its deadline and still produce a valid result. This parameter can be useful in many real applications, such as robotics and multimedia systems, where the deadline timing semantics is more flexible than scheduling theory generally permits. Deadline tolerances also provide a sort of compensation for the pessimistic evaluation of the worst case execution time. For example, without tolerance, it could be that a job set is not feasibly schedulable, and hence a job is rejected, when in reality the jobs could have been scheduled within the tolerance levels. Another positive effect of tolerance is that various jobs could actually finish before their worst case times, so a resource reclaiming mechanism could then compensate, and the jobs with tolerance could actually finish on time. In RED, the primary deadline plus the deadline tolerance (which provides a sort of secondary deadline) are used to run the acceptance test in overload conditions. Notice that having a tolerance greater than zero is different from having a longer deadline: jobs are scheduled based on their primary deadline, but accepted based on their secondary deadline. In this framework, a schedule is said to be strictly feasible if all jobs complete before their primary
deadline, while it is said to be tolerant if there exists some job that executes after its primary deadline but completes within its secondary deadline. The acceptance test performed in RED is formulated in terms of residual laxity. The residual laxity L_i of a job is defined as the interval between its estimated finishing time (f_i) and its primary (absolute) deadline (d_i). Each residual laxity can be efficiently computed using the result of the following lemma.
Lemma 5.1 Given a set J = {J_1, J_2, ..., J_n} of active aperiodic jobs ordered by increasing primary (absolute) deadline, the residual laxity L_i of each job J_i at time t can be computed as:

L_i = L_{i-1} + (d_i - d_{i-1}) - c_i(t)    (5.1)

where L_0 = 0, d_0 = t (i.e., the current time), and c_i(t) is the remaining worst case computation time of job J_i at time t.
Proof. By definition, the residual laxity is L_i = d_i - f_i. Since the jobs in the set J are ordered by increasing deadlines, job J_1 is executing at time t, and its estimated finishing time is given by the current time plus its remaining execution time (f_1 = t + c_1). As a consequence, L_1 is given by:

L_1 = d_1 - f_1 = d_1 - t - c_1.

Any other job J_i, with i > 1, starts as soon as J_{i-1} completes and finishes c_i units of time after its start (f_i = f_{i-1} + c_i). Hence,

L_i = d_i - f_i = d_i - f_{i-1} - c_i = d_i - (d_{i-1} - L_{i-1}) - c_i = L_{i-1} + (d_i - d_{i-1}) - c_i

and the lemma follows. □
Notice that if the current job set J is schedulable and a new job J_a arrives at time t, the acceptance test for the new job set J' = J ∪ {J_a} requires the computation of only the residual laxity of job J_a and of the jobs J_i such that d_i > d_a. This is because the execution of J_a does not influence those jobs having deadlines less than or equal to d_a, which are scheduled before J_a. It follows that the acceptance test has O(n) complexity in the worst case. To simplify the description of the RED acceptance test, the Exceeding Time E_i is defined as the time that job J_i executes after its secondary deadline:¹
E_i = max(0, -(L_i + M_i)).    (5.2)

¹ If M_i = 0, the Exceeding Time is also called Tardiness.
Algorithm RED_acceptance_test(J, J_new)
{
    E = 0;    /* Maximum Exceeding Time */
    L_0 = 0;
    d_0 = current_time();
    J' = J ∪ {J_new};
    k = <position of J_new in the job set J'>;
    for each job J_i in J' such that i >= k do {
        /* compute the maximum exceeding time */
        L_i = L_{i-1} + (d_i - d_{i-1}) - c_i;
        if (L_i + M_i < -E) then E = -(L_i + M_i);
    }
    if (E > 0) {
        <select a set J* of least value jobs to be rejected>;
    }
}

Figure 5.6 The RED acceptance test.
The Maximum Exceeding Time E_max is defined as the maximum among all the E_i's in the job set, that is, E_max = max_i(E_i). Clearly, a schedule is strictly feasible if and only if L_i ≥ 0 for all jobs in the set, while it is tolerant if and only if there exists some L_i < 0 but E_max = 0. The RED algorithm uses the acceptance test (shown in Figure 5.6) to determine if a job J_new is likely to miss its deadline. If so, it computes the amount of processing time required above the capacity of the system, that is, the maximum exceeding time. Otherwise, the job is accepted and executes according to the EDF policy. The global view provided by the maximum exceeding time allows planning an action to recover from the overload condition that would occur if the job were accepted. Many recovery strategies can be used to solve this problem [2]. The simplest one is to reject the least value jobs that can remove the overload situation. In general it is assumed that, whenever an overload is detected, some
rejection policy searches for a subset J* of least value jobs that are rejected so as to maximize the cumulative value of the remaining subset. A resource reclaiming mechanism [35] can be used to take advantage of those jobs that complete before their worst case finishing time. To reclaim the spare time, rejected jobs are not removed, but temporarily parked in a queue, called the Reject Queue, ordered by decreasing value. Whenever a running job completes its execution before its worst case finishing time, the algorithm tries to re-accept the highest value jobs in the Reject Queue having positive laxity. Jobs with negative laxity are removed from the system.
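Putting the pieces together, the acceptance test of Figure 5.6 plus the simplest rejection policy can be sketched as follows. This is plain Python with illustrative job values, and it omits the Reject Queue parking used for reclaiming:

```python
def max_exceeding_time(jobs, t=0.0):
    """E_max over jobs sorted by primary deadline; each job is a dict with
    remaining computation c, absolute deadline d, tolerance m, value v."""
    E, L, d_prev = 0.0, 0.0, t
    for j in sorted(jobs, key=lambda j: j["d"]):
        L = L + (j["d"] - d_prev) - j["c"]   # residual laxity (Lemma 5.1)
        d_prev = j["d"]
        E = max(E, -(L + j["m"]))            # time past the secondary deadline
    return E

def red_accept(active, new_job, t=0.0):
    """Accept new_job, rejecting least-value jobs while an overload persists.
    Returns (accepted, rejected)."""
    jobs = active + [new_job]
    rejected = []
    while jobs and max_exceeding_time(jobs, t) > 0:
        victim = min(jobs, key=lambda j: j["v"])
        jobs.remove(victim)
        rejected.append(victim)
    return new_job not in rejected, rejected

J1 = {"c": 4, "d": 5, "m": 0, "v": 10}
ok, _ = red_accept([J1], {"c": 4, "d": 6, "m": 0, "v": 1})    # overload: newcomer rejected
tol, _ = red_accept([J1], {"c": 4, "d": 6, "m": 2, "v": 1})   # tolerance absorbs the lateness
assert ok is False and tol is True
```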
5.3.2 The Spring Algorithm
This section describes an admission control algorithm that can also accommodate resource requirements of jobs beyond CPU resources. Admission is granted if a schedule can be constructed, that is, if execution can be planned for a given set of jobs [32, 3] such that they meet their deadlines. Schedule construction can be viewed as a search for a feasible schedule in a tree in which the leaves represent schedules, some of which may be feasible. The root is the empty schedule. An internal node is a partial schedule for a job set with one more job than that represented by its parent. Given the NP-completeness of the scheduling problem, it would serve little purpose to search exhaustively for a feasible schedule, so heuristics are used to direct the search. A planning algorithm [32, 41] starts at the root of the search tree and repeatedly tries to extend the current partial plan (with one more job) by moving to one of the vertices at the next level in the search tree, until a full feasible schedule is derived. The rest of the section examines the details of this planning-based admission control algorithm, as implemented in the Spring Kernel [38], given the arrival or release time r, deadline d, and worst case computation time C of each job. Jobs require one CPU. For generality, a multiprocessor system with m processors is assumed and, for simplicity, the provision of absolute guarantees is considered first. In subsequent chapters, when precedence and resource constraints are examined, the basic search algorithm is extended to deal with these additional considerations. Also, jobs are scheduled to execute non-preemptively. The algorithm computes the earliest start time, est_i, at which job J_i can begin execution, after accounting for processor availability given the jobs scheduled thus
far. Given a partial schedule, the earliest available time for a resource (a CPU in this case) is denoted erat_j. This time can be determined after each job is assigned to a resource for its worst case duration. Then the earliest time at which a job J_i that is yet to be scheduled can begin execution is

est_i = max(r_i, min_{j=1..m} erat_j)
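As a concrete reading of this formula: with the per-processor earliest-available times collected in a list, est is one line (values are illustrative):

```python
def earliest_start_time(r, erat):
    """est = max(release time, earliest time any of the m processors frees up)."""
    return max(r, min(erat))

assert earliest_start_time(3, [5, 8]) == 5   # must wait for a processor
assert earliest_start_time(9, [5, 8]) == 9   # must wait for its release
```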
Even though EDF is a good priority assignment policy for jobs which need just the CPU, a more sophisticated priority assignment policy becomes necessary later, when resources beyond the CPU are considered. Hence, for generality it is assumed that each job J_i has a priority computed dynamically and denoted by Pr(J_i). At each level of the search, Pr is computed for all the jobs that remain to be scheduled. The job with the highest priority is selected to extend the current partial schedule. While extending the partial schedule at each level of the search, the algorithm determines whether the current partial schedule is strongly-feasible or not. A partial feasible schedule is said to be strongly-feasible if all the schedules obtained by extending the current schedule with any one of the remaining jobs are also feasible. Thus, if a partial feasible schedule is found not to be strongly-feasible because, say, job J misses its deadline when the current partial schedule is extended by J, then it is appropriate to stop the search, since none of the future extensions involving job J will meet its deadline. In this case, the set of jobs cannot be scheduled given the current partial schedule. (In the terminology of branch-and-bound techniques, the search path represented by the current partial schedule is bounded or pruned, since it does not lead to a feasible complete schedule.) However, it is possible to backtrack to continue the search even after a non-strongly-feasible schedule is found. Backtracking is done by discarding the current partial schedule, returning to the previous partial schedule, and extending it with a different job, e.g., the job with the second highest priority. When backtracking is used, the overheads can be restricted by limiting either the maximum number of possible backtracks or the total number of re-evaluations of priorities. The fact that a priority is computed for all remaining jobs at each level makes it an n + (n - 1) + ... + 2 = O(n²)
search algorithm, where n is the number of jobs in the set. The complexity can be reduced to O(n) if only a maximum of k jobs that remain to be scheduled are considered at each level of the search [32]. These k jobs can be selected by taking the k jobs with the earliest deadlines; k is a constant (in practice small compared to n). In both cases, the job with the highest priority is selected to extend the current schedule. Here is a description of the complete algorithm (see Figure 5.7). Besides Pr, introduced earlier, the following variables are useful in precisely describing the algorithm's steps.
TR, the jobs that remain to be scheduled, in order of increasing deadline;

N(TR), the number of jobs in TR;

M(TR), the maximum number of jobs considered at each step of scheduling;

N_TR, the actual number of jobs in TR considered at each step of scheduling, where

    N_TR = M(TR), if N(TR) ≥ M(TR)
         = N(TR), otherwise;

and TC, the first N_TR jobs in TR.
The algorithm starts with an empty partial schedule. At each step, the est (earliest start time) of each job is first computed. To determine the job with the highest priority, the priority value of each job is computed next. As a prerequisite for extending the partial schedule with the highest priority job, strong-feasibility is determined with respect to all the jobs in TC. After a job J is selected to extend the current partial schedule, its scheduled start time sst is set equal to its est and the resource availability vectors are updated. So far, the jobs were assumed to be independent, with just deadline and release time specifications. The extensions necessary to deal with periodic tasks are considered now. Extensions to the basic planning-based approach needed to deal with resources other than CPUs are discussed in Chapter 6, precedence constraints are considered in Chapter 7, and distributed systems in Chapter 10.
TR := job set to be scheduled; partial schedule := empty; Result := Success;
while TR ≠ empty ∧ Result ≠ Failure do
    TC := first NTR jobs in TR;
    est calculation:
        for each job J in TR, compute est(J) given the partial schedule;
    Priority value generation:
        for each job J in TR, compute Pr(J);
    Job selection:
        find job minJ with highest priority in TC;
    Update partial schedule or backtrack:
        if (partial schedule ⊕ minJ) is feasible and strongly feasible
            partial schedule := partial schedule ⊕ minJ;
            TR := TR ⊖ minJ;
            Update resource availability vector;
        else if backtracking is allowed and possible
            backtrack to a previous partial schedule;
            choose a job not yet chosen;
        else Result := Failure

Figure 5.7  Basic guarantee algorithm.
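The loop of Fig. 5.7 can be sketched in Python under strong simplifying assumptions: one cpu, no resources beyond it, no backtracking, and Pr(J) equal to the job's deadline (plain EDF). The job representation and the strong-feasibility test below are illustrative only, not the algorithm as implemented in any particular kernel.

```python
# Sketch of the basic guarantee algorithm of Fig. 5.7 under simplifying
# assumptions: one cpu, no extra resources, Pr = deadline (EDF), and no
# backtracking (a non-strongly-feasible partial schedule just fails).

def guarantee(jobs, k):
    """jobs: list of (name, release, wcet, deadline); k plays the role of
    M(TR), the maximum number of jobs examined at each step."""
    TR = sorted(jobs, key=lambda j: j[3])     # TR in increasing deadline order
    schedule, t = [], 0                       # t: time the cpu becomes free
    while TR:
        TC = TR[:min(k, len(TR))]             # the first NTR jobs of TR
        # strong feasibility: extending the partial schedule with any one
        # job of TC must leave that job able to meet its deadline
        for name, r, c, d in TC:
            if max(r, t) + c > d:
                return None
        # job selection: highest priority = earliest deadline in TC
        name, r, c, d = min(TC, key=lambda j: j[3])
        sst = max(r, t)                       # sst := est of the chosen job
        schedule.append((name, sst))
        t = sst + c
        TR.remove((name, r, c, d))
    return schedule

print(guarantee([("J1", 0, 3, 10), ("J2", 0, 4, 5), ("J3", 2, 2, 9)], 3))
# [('J2', 0), ('J3', 4), ('J1', 6)]
```

With k = N(TR) this degenerates to the O(n²) search described above; a small constant k gives the O(n) variant.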
There are several ways of guaranteeing periodic tasks when they are executed together with aperiodic tasks. Assume that when a periodic task is guaranteed, every instance of the task is guaranteed. Consider a system with only periodic tasks. A schedule can be constructed using the basic planning algorithm: given n periodic tasks with periods T1, ..., Tn, the length of the schedule = LCM(T1, ..., Tn). The earliest start time of the j-th release of the i-th task is (j - 1) x Ti and its deadline is j x Ti. In other words, all instances of the periodic tasks are created with release times and deadlines and the entire set is handed to the planning algorithm at once.
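As a concrete sketch (task names and periods hypothetical), the job set handed to the planner for one schedule length can be generated as follows:

```python
from math import lcm

def expand_periodic(periods):
    """periods: {task: T_i}. Returns (schedule_length, jobs), where each job
    is (task, release, deadline) for one instance inside the LCM window."""
    L = lcm(*periods.values())          # length of schedule = LCM(T1, ..., Tn)
    jobs = []
    for task, T in periods.items():
        for j in range(1, L // T + 1):
            # j-th release: earliest start time (j-1)*T_i, deadline j*T_i
            jobs.append((task, (j - 1) * T, j * T))
    return L, jobs

L, jobs = expand_periodic({"A": 4, "B": 6})
print(L)       # 12
print(jobs)    # three instances of A and two of B over the hyperperiod
```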
If a periodic task arrives dynamically, an attempt can be made to construct a new template. The new task is guaranteed if the attempt succeeds. This new template begins execution at the end of the current LCM of the previous periodic tasks. If this is too long to wait, it is possible to modify this algorithm to handle a quicker mode change between the two templates; this is not discussed further in this book. Suppose there are periodic and aperiodic tasks in the system. If the resources needed by the two sets of tasks are disjoint, then the processors in the system can be partitioned, with one set used for the periodic tasks. The remaining processors are used for aperiodic tasks, guaranteed using the dynamic planning algorithm. If, however, periodic and aperiodic tasks need common resources, a more complicated scheme is needed. If a periodic task arrives in a system consisting of previously guaranteed periodic and aperiodic tasks, an attempt is made to construct a new schedule. If the attempt fails, the new task is not guaranteed and its introduction has to be delayed until either the guaranteed aperiodic tasks complete or its introduction does not affect the remaining guaranteed jobs. Suppose a new aperiodic task arrives. Given a schedule for periodic tasks, the new task can be guaranteed if there is sufficient time in the idle slots of the template. Alternatively, applying the dynamic guarantee scheme, an aperiodic task can be guaranteed if all releases of the periodic tasks and all previously guaranteed aperiodic tasks can also be guaranteed. So far all jobs have been assumed to have the same level of importance. In Biyabani et al. [2] the planning-based algorithm was extended to deal with jobs having different values, and various policies were studied to decide which
jobs should be dropped when a newly arriving job could not be guaranteed. This referenced work extends the algorithm described thus far to make it more robust.
5.4  TIMING OF THE PLANNING
As the number of jobs increases, so does the cost of planning, and there is less time available for planning. Needless to say, planning-based schemes must be cognizant of the time available for planning, so when a system overload is anticipated, use of a method that controls scheduling overheads is essential. Thus, it is important to address the issue of when to plan the execution of a newly arrived job. Two simple approaches are:

1. when a job arrives, attempt to plan its execution along with previously scheduled jobs: this is scheduling-at-arrival-time, and all jobs that have not yet executed are considered for planning when a new job arrives;
2. postpone the feasibility check until a job is chosen for execution: this is scheduling-at-dispatch-time and can be done very quickly for non-preemptive job execution by checking whether the new job will finish by its deadline.

The second approach is less flexible and announces job rejection very late. Consequently, it does not provide sufficient lead time for considering alternative actions when a job cannot meet its timing constraints. Both approaches avoid resource wastage, since a job does not begin execution unless it is known that it will complete before its deadline. To minimize scheduling overhead while giving enough lead time to choose alternatives, instead of scheduling jobs when they arrive or when they are dispatched, they should be scheduled somewhere in between, at the most opportune time. They can be scheduled at some punctual point, which limits the number of jobs to be considered for scheduling and avoids unnecessary scheduling (or rescheduling) of jobs that have no effect on the order of jobs early in the schedule. Choice of the punctual point must consider the fact that the larger the mean laxity and the higher the load, the more jobs are ready to run. The increasing number of jobs imposes growing scheduling overhead for all but a scheduler with constant overheads. The punctual point is the minimum laxity value,
i.e., the value to which a job's laxity must drop before it becomes eligible for scheduling. In other words, the guarantee of a job with laxity larger than the punctual point is postponed at most until its laxity reaches the punctual point. Of course, if the system is empty a job becomes eligible for scheduling by default. By postponing scheduling decisions, the number of jobs scheduled at any time is kept under control, reducing the scheduling overhead and potentially improving the overall performance. The main benefit of scheduling using punctual points is the reduced scheduling overhead compared to scheduling at arrival time. This is due to the smaller number of relevant jobs (the jobs with laxities smaller than or equal to the punctual point) that are scheduled at any given time. Clearly, when the computational complexity of a scheduling algorithm is higher than the complexity of maintaining the list of relevant jobs, the separation into relevant/irrelevant jobs reduces the overall scheduling cost; that is, the scheduling becomes more efficient. Consider the following scheme for jobs with deadlines that are held on a dispatch queue, Q1(n), maintained in minimum laxity order, together with a variant of the FCFS queue. When a job arrives, its laxity is compared with that of the n jobs in the queue Q1(n), and the job with the largest laxity among the n + 1 jobs is placed at the end of the FCFS queue. When a job in Q1 is executed, the first job on the FCFS queue is transferred to Q1. Analysis [15, 18, 29, 30] shows that performance within 5% of the optimal LLF algorithm is achieved even for small values of n. A more experimental way to limit the number of scheduled jobs is to have a HIT queue and a MISS queue [17]: the number of scheduled jobs in the HIT queue is continuously adjusted according to the ratio of jobs that complete on time (the 'hit' ratio). This method is adaptive, handles deadlines and values, and is easy to implement.
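The Q1(n)/FCFS scheme just described can be sketched as follows (a simplification: laxities are taken as fixed at arrival, ties are ignored, and job execution itself is elided):

```python
import heapq

class Q1n:
    """Sketch of the Q1(n)/FCFS hybrid: at most n jobs sit in Q1, kept in
    minimum laxity order; the overflow waits in FCFS order."""
    def __init__(self, n):
        self.n = n
        self.q1 = []      # heap of (laxity, job)
        self.fcfs = []    # overflow queue, FCFS order

    def arrive(self, job, laxity):
        heapq.heappush(self.q1, (laxity, job))
        if len(self.q1) > self.n:
            # the job with the largest laxity among the n+1 goes to FCFS
            worst = max(self.q1)
            self.q1.remove(worst)
            heapq.heapify(self.q1)
            self.fcfs.append(worst)

    def dispatch(self):
        laxity, job = heapq.heappop(self.q1)   # run the minimum-laxity job
        if self.fcfs:
            heapq.heappush(self.q1, self.fcfs.pop(0))   # refill from FCFS head
        return job

q = Q1n(2)
for job, laxity in [("A", 5), ("B", 3), ("C", 9)]:
    q.arrive(job, laxity)
print(q.dispatch(), q.dispatch(), q.dispatch())   # B A C
```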
However, the HIT/MISS scheme does not define a punctual point. The weakness of both of these approaches is the lack of analytical methods to adjust the number of scheduled jobs: the parameters that control the number of schedulable jobs must be obtained through simulation, and a newly arrived job can miss its deadline before it gets considered for execution. In contrast, if the punctual point is derived analytically, then it can be ensured that every arrived job will be considered for execution [42]. The number of schedulable jobs must be controlled using timing constraints, rather than by explicitly limiting the number of schedulable jobs; this ensures that every job is considered for scheduling when its laxity reaches the most
opportune moment, the punctual point. The approach is especially beneficial for systems where jobs have widely different values and rejecting a job without considering it for scheduling might result in a large value loss, something that can happen easily when the number of schedulable jobs is fixed. The features of a 'well-timed scheduling framework' are summarized below.

• Newly arrived jobs are classified as relevant or irrelevant, depending on their laxity.

• Irrelevant jobs are stored in a D-queue (the delay queue), where they are delayed until their laxity becomes equal to the punctual point, at which time they become relevant.

• Relevant jobs are stored in an S-pool (the scheduling pool) as jobs eligible for immediate scheduling.

• When a job is put into the S-pool, a feasibility check is performed; if it is satisfied, the job is transferred into the current feasible schedule. Otherwise, it can be placed in the reject queue to await possible resource reclamation.
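The classification step of the framework can be sketched as follows (an illustration only: it assumes laxity = deadline - now - computation time and a given punctual point P; job names are hypothetical):

```python
def classify(jobs, now, punctual_point):
    """Split jobs into the S-pool (relevant) and the D-queue (irrelevant).
    Sketch definition: laxity = deadline - now - computation time."""
    s_pool, d_queue = [], []
    for name, c, d in jobs:
        laxity = d - now - c
        if laxity <= punctual_point:
            s_pool.append(name)     # eligible for immediate scheduling
        else:
            d_queue.append(name)    # delayed until laxity drops to P
    return s_pool, d_queue

print(classify([("J1", 2, 8), ("J2", 3, 30)], now=0, punctual_point=10))
# (['J1'], ['J2'])
```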
It is important to observe that, apart from reducing the scheduling cost, the separation of relevant and irrelevant jobs also reduces the scheduling overhead due to queue handling operations. A simple analytical model is developed in [42], but a formally derived punctual point awaits further work.
5.5  IMPLEMENTING PLANNING-BASED SCHEDULING
In implementing planning-based scheduling, there are two main considerations: feasibility checking and schedule construction. In a multi-processor system, feasibility checking and dispatching can be done independently, allowing these system functions to run in parallel. The dispatcher works with a set of jobs that have been previously guaranteed to meet their deadlines, and feasibility checking is done on the set of currently guaranteed jobs plus any newly invoked jobs. See the Spring kernel [38] for a discussion of how to implement this parallelism in a predictable manner and avoid race conditions. One of the crucial issues in dynamic scheduling is the cost of scheduling: the more time that is spent on scheduling, the less there is for job executions.
In a single processor system, feasibility checking and job executions compete for processing time. If feasibility checking is delayed, there is less benefit from the early warning feature; however, if feasibility checking is performed as soon as each job arrives, the unplanned-for processing it consumes may lead to guaranteed jobs missing their deadlines. Thus, when jobs are guaranteed, some time must be set aside for scheduling-related work, and a good balance must be struck depending on job arrival rates and job characteristics such as computation times. One way is to provide for the periodic execution of the scheduling activity. Whenever invoked, the scheduler attempts to guarantee all pending jobs. In addition, if needed, the scheduler could be invoked sporadically, whenever these extra invocations affect neither guaranteed jobs nor the minimum guaranteed periodic rate of other system jobs. Another alternative, applicable to multi-processor systems, is to designate a scheduling processor whose sole responsibility is to deal with feasibility checking and schedule construction. Guaranteed jobs are executed on the remaining 'application' processors. In this case, feasibility checking can be done concurrently with job execution. Recall that a job is guaranteed as long as it can be executed to meet its deadline and the deadlines of previously guaranteed jobs remain guaranteed. Guaranteeing a new job might require re-scheduling of previously guaranteed jobs, and so care must be taken to ensure that neither currently running jobs nor jobs that might execute before the guarantee algorithm completes are re-scheduled. These considerations suggest that scheduling costs should be computed based on the total number of jobs in the schedule plus the newly arrived jobs, the complexity of the scheduling algorithm, and the cost of scheduling one job.
Jobs with scheduled start times before the current time plus the scheduling cost are not considered for rescheduling; the remaining jobs are candidates for re-scheduling to accommodate new jobs.
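The cutoff just described can be sketched as follows, assuming a (hypothetical) linear cost model in which total scheduling cost grows with the number of jobs in the schedule:

```python
def rescheduling_candidates(schedule, now, per_job_cost):
    """schedule: list of (job, sst). Jobs whose scheduled start time falls
    before 'now' plus the estimated scheduling cost are frozen; the rest
    may be re-scheduled to accommodate new arrivals."""
    cost = per_job_cost * len(schedule)    # cost grows with schedule size
    cutoff = now + cost
    frozen = [job for job, sst in schedule if sst < cutoff]
    movable = [job for job, sst in schedule if sst >= cutoff]
    return frozen, movable

print(rescheduling_candidates([("J1", 2), ("J2", 5), ("J3", 40)],
                              now=0, per_job_cost=3))
# (['J1', 'J2'], ['J3'])
```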
5.6  DISPATCHING JOBS IN A PLANNING-BASED SCHEDULE
Planning-based schedulers typically use non-preemptive schedules. Dispatching depends on whether the jobs are independent and whether there are resource constraints.
If the jobs are independent and have no resource constraints, dispatching can be extremely simple: the job to be executed next is the next job in the schedule, and this job can always be executed immediately even if its scheduled start time has not arrived. Note that a scheduled start time (when the job is actually scheduled to run) is not the same as a job release time (which is the earliest time a job is eligible to run).
On the other hand, precedence constraints and resource constraints may increase the complexity of dispatching. If jobs have resource constraints and/or precedence constraints, the dispatching process must take these into account. When the actual computation time of a job differs from its worst case computation time in a non-preemptive multi-processor schedule with resource constraints, run time anomalies [13, 14] may occur, causing some of the scheduled jobs to miss their deadlines. There are two possible kinds of dispatchers.

1. Dispatch jobs exactly according to the given schedule. In this case, upon the completion of one job, the dispatcher may not be able to immediately dispatch another job, because idle time intervals may have been inserted by the scheduler to conform to the precedence constraints, release times, or resource constraints. One way to construct a correct dispatcher is to use a hardware (count-down) timer to enforce the start time constraint.

2. Dispatch jobs taking into consideration the fact that, given the variance in jobs' execution times, some jobs complete earlier than expected. The dispatcher tries to reclaim the time left by early completion and uses it to execute other jobs. Clearly, non-real-time jobs which do not use resources needed by the real-time jobs can be executed in idle time slots. More valuable is an approach that improves the guarantees of jobs that have time constraints. Complete rescheduling of all remaining jobs is an available option but, given the complexity of scheduling, it is usually expensive and ineffective. Resource reclaiming algorithms used in systems that do dynamic planning-based scheduling must maintain the feasibility of guaranteed jobs, must have low overheads (a resource reclaiming algorithm is invoked whenever a job finishes), and must have costs that are independent of the number of jobs in the schedule. They must also be effective in improving the performance of the system. Simple but effective resource reclaiming algorithms are described in [35] for independent jobs having resource requirements and in [25] for jobs having, in addition, precedence constraints.
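The first kind of dispatcher can be sketched as follows; a sleep stands in for the hardware count-down timer, and the scheduled start times are purely illustrative:

```python
import time

def dispatch_exact(schedule, run):
    """Kind-1 dispatcher: start each job exactly at its scheduled start time
    (sst, in seconds from dispatcher start); idle intervals inserted by the
    scheduler are simply waited out. Early completions are not reclaimed."""
    t0 = time.monotonic()
    for sst, job in sorted(schedule):          # in scheduled-start order
        delay = (t0 + sst) - time.monotonic()
        if delay > 0:
            time.sleep(delay)                  # stand-in for a count-down timer
        run(job)

order = []
dispatch_exact([(0.02, "J2"), (0.0, "J1")], order.append)
print(order)    # ['J1', 'J2']
```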
5.7  SUMMARY
While utilization bounds analyses support predictable job executions, their applicability is restricted to jobs whose release times are known a priori. Best-effort approaches are applicable to jobs with arbitrary needs, but do not offer predictability. Planning-based scheduling offers the best of both worlds. It must be mentioned that the predictability it offers is on a per-job basis; what is needed is a definition and determination of a global, system-wide formal schedulability notion based on individual guarantees. This chapter presented a planning-based paradigm for scheduling jobs and presented two different algorithms: the RED algorithm and the Spring algorithm. Since overload handling is such an important issue, this chapter ends with a brief look at other work related to overloads. In 1986, Locke [23] developed an algorithm which makes a best effort at scheduling jobs based on earliest deadline, with a rejection policy based on removing jobs with the minimum value density. He also suggested that removed jobs remain in the system until their deadline has passed. The algorithm computes the variance of the total slack time in order to find the probability that the available slack time is less than zero. The calculated probability is used to detect a system overload: if it is less than the user-prespecified threshold, the algorithm removes jobs in increasing value density order. Real-time Mach [40] uses a similar approach: jobs are ordered by EDF and overload is predicted using a statistical guess. If overload is predicted, jobs with least value are dropped. In other related work, Sha and his colleagues [37] showed that the rate monotonic algorithm has poor properties under overload. Thambidurai and Trivedi [39] studied overloads in fault-tolerant real-time systems, building and analyzing a stochastic model for such a system; however, they provided no details on the scheduling algorithm itself.
Finally, Haritsa, Livny and Carey [17] presented the use of a feedback-controlled EDF algorithm for real-time database systems. The purpose of their work was to obtain good average performance for transactions even in overload. Since they were working in a database environment, they assumed no knowledge of transaction characteristics and considered jobs with soft deadlines that are not guaranteed.
REFERENCES
[1] S. Baruah, G. Koren, D. Mao, B. Mishra, A. Raghunathan, L. Rosier, D. Shasha and F. Wang, "On the Competitiveness of On-Line Real-Time Task Scheduling," Real-Time Systems, 4(2), June 1992.
[2] S. Biyabani, J. Stankovic and K. Ramamritham, "The Integration of Deadline and Criticalness in Hard Real-Time Scheduling," Proceedings of the Real-Time Systems Symposium, December 1988.
[3] B. A. Blake and K. Schwan, "Experimental Evaluation of a Real-Time Scheduler for a Multiprocessor System," IEEE Transactions on Software Engineering, 17(1), January 1991.
[4] A. Burns, "Scheduling Hard Real-Time Systems: A Review," Software Engineering Journal, May 1991.
[5] G. Buttazzo, M. Spuri, and F. Sensini, "Value vs. Deadline Scheduling in Overload Conditions," Proceedings of the Real-Time Systems Symposium, December 1995.
[6] G. Buttazzo and J. Stankovic, "Adding Robustness in Dynamic Preemptive Scheduling," in Responsive Computer Systems: Steps Toward Fault-Tolerant Real-Time Systems, Edited by D. S. Fussell and M. Malek, Kluwer Academic Publishers, Boston, 1995.
[7] S. Cheng, J. Stankovic and K. Ramamritham, "Dynamic Scheduling of Groups of Tasks with Precedence Constraints in Distributed Hard Real-Time Systems," Proceedings of the Real-Time Systems Symposium, December 1986.

[8] H. Chetto and M. Chetto, "Some Results of the Earliest Deadline Scheduling Algorithm," IEEE Transactions on Software Engineering, 15(10), October 1989.

[9] H. Chetto, M. Silly, and T. Bouchentouf, "Dynamic Scheduling of Real-Time Tasks under Precedence Constraints," Real-Time Systems Journal, 2(3), September 1990.
[10] E. G. Coffman, Jr., editor, Computer and Job-Shop Scheduling Theory, John Wiley & Sons, 1976.

[11] M. L. Dertouzos, "Control Robotics: the Procedural Control of Physical Processes," Information Processing 74, North-Holland Publishing Company, 1974.

[12] M. L. Dertouzos and A. K-L. Mok, "Multiprocessor On-Line Scheduling of Hard-Real-Time Tasks," IEEE Transactions on Software Engineering, 15(12), December 1989.

[13] M. R. Garey and D. S. Johnson, "Complexity Results for Multiprocessor Scheduling Under Resource Constraints," SIAM Journal of Computing, 4, 1975.

[14] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, 1979.

[15] P. Goli, J. Kurose, and D. Towsley, "Approximate Minimum Laxity Scheduling Algorithms for Real-Time Systems," Technical Report COINS 90-88, University of Massachusetts, Amherst, Department of Computer and Information Science, 1990.

[16] N. Gehani and K. Ramamritham, "Real-Time Concurrent C: A Language for Programming Dynamic Real-Time Systems," Real-Time Systems Journal, 3, pp. 377-405, 1991.

[17] J. R. Haritsa, M. Livny and M. J. Carey, "Earliest Deadline Scheduling for Real-Time Database Systems," Proceedings of the Real-Time Systems Symposium, December 1991.

[18] J. Hong, X. Tan, and D. Towsley, "A Performance Analysis of Minimum Laxity and Earliest Deadline Scheduling in a Real-Time System," IEEE Transactions on Computers, C-38(12), December 1989.

[19] K. S. Hong and J. Y-T. Leung, "On-Line Scheduling of Real-Time Tasks," Proceedings of the Real-Time Systems Symposium, December 1988.

[20] G. Koren and D. Shasha, "D-over: An Optimal On-Line Scheduling Algorithm for Overloaded Real-Time Systems," Proceedings of the Real-Time Systems Symposium, December 1992.

[21] G. Koren and D. Shasha, "Skip-Over: Algorithms and Complexity for Overloaded Systems that Allow Skips," Proceedings of the Real-Time Systems Symposium, December 1995.
[22] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the Association for Computing Machinery, 20(1), 1973.

[23] C. D. Locke, Best-Effort Decision Making for Real-Time Scheduling, Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA, May 1985.

[24] G. Manimaran, S. R. Murthy and K. Ramamritham, "A New Algorithm for Dynamic Scheduling of Parallelizable Tasks in Real-Time Multiprocessor Systems," Real-Time Systems Journal, Vol. 15, 1998, pp. 39-60.

[25] G. Manimaran, S. R. Murthy and K. Ramamritham, "New Algorithms for Resource Reclaiming from Precedence Constrained Tasks in Multiprocessor Real-Time Systems," Journal of Parallel and Distributed Computing, Vol. 44, No. 2, Aug. 1997, pp. 123-132.

[26] R. McNaughton, "Scheduling With Deadlines and Loss Functions," Management Science, 6, 1959.

[27] A. K. Mok and M. L. Dertouzos, "Multiprocessor Scheduling in a Hard Real-Time Environment," Proceedings of the Seventh Texas Conference on Computing Systems, 1978.

[28] A. K. Mok, Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment, Ph.D. Dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, May 1983.

[29] S. S. Panwar and D. Towsley, "On the Optimality of the STE Rule for Multiple Server Queues that Serve Customers with Deadlines," Technical Report COINS 88-81, University of Massachusetts, Amherst, Department of Computer and Information Science, July 1988.

[30] S. S. Panwar, D. Towsley, and J. K. Wolf, "Optimal Scheduling Policies for a Class of Queues with Customer Deadlines until the Beginning of Service," Journal of the Association for Computing Machinery, 35(4), October 1988.

[31] K. Ramamritham and J. Stankovic, "Dynamic Task Scheduling in Hard Real-Time Distributed Systems," IEEE Software, pp. 65-75, July 1984.

[32] K. Ramamritham, J. A. Stankovic and P. Shiah, "Efficient Scheduling Algorithms for Real-Time Multiprocessor Systems," IEEE Transactions on Parallel and Distributed Systems, 1(2):184-194, April 1990.
[33] K. Ramamritham and J. A. Stankovic, "Scheduling Algorithms and Operating Systems Support for Real-Time Systems," Proceedings of the IEEE, pp. 55-67, January 1994.

[34] K. Ramamritham, "Dynamic Priority Scheduling," in Real-Time Systems: Specification, Verification and Analysis, Edited by Mathai Joseph, Prentice-Hall, 1996.

[35] C. Shen, K. Ramamritham and J. A. Stankovic, "Resource Reclaiming in Multiprocessor Real-Time Systems," IEEE Transactions on Parallel and Distributed Systems, Vol. 4, No. 4, pp. 382-397, April 1993.

[36] K. Schwan and H. Zhou, "Dynamic Scheduling of Hard Real-Time Tasks and Real-Time Threads," IEEE Transactions on Software Engineering, Vol. 18, No. 8, pp. 736-748, August 1992.

[37] L. Sha, J. Lehoczky, and R. Rajkumar, "Solutions for Some Practical Problems in Prioritized Preemptive Scheduling," Proceedings of the Real-Time Systems Symposium, December 1986.

[38] J. A. Stankovic and K. Ramamritham, "The Spring Kernel: A New Paradigm for Hard Real-Time Operating Systems," IEEE Software, 8(3):62-72, May 1991.

[39] P. Thambidurai and K. S. Trivedi, "Transient Overloads in Fault-Tolerant Real-Time Systems," Proceedings of the Real-Time Systems Symposium, December 1989.

[40] H. Tokuda, J. Wendorf, and H. Wang, "Implementation of a Time-Driven Scheduler for Real-Time Operating Systems," Proceedings of the Real-Time Systems Symposium, December 1987.

[41] W. Zhao and K. Ramamritham, "Simple and Integrated Heuristic Algorithms for Scheduling Tasks with Time and Resource Constraints," Journal of Systems and Software, 7:195-205, 1987.

[42] G. Zlokapa, Real-Time Systems: Well-Timed Scheduling and Scheduling with Precedence Constraints, Ph.D. Thesis, University of Massachusetts, February 1993.
6 EDF SCHEDULING FOR SHARED RESOURCES
Most scheduling algorithms are primarily concerned with cpu scheduling. When jobs are allowed to access shared resources, the accesses need to be controlled, as in any concurrent system, through the use of appropriate protocols to ensure the integrity of the resources despite potential concurrent access. The problem of accessing shared resources is well known and there is a vast literature that discusses solutions to this problem (see [8, 9, 17] for a general treatment). Solutions for the shared resource access problem usually adopt some form of semaphores [5], critical sections [2], or monitors [6]. The scheduling of accesses to resources other than the cpu is a particularly difficult problem in the context of real-time scheduling. In the presence of shared resources, the complexity of feasibility analysis becomes exponential. Specifically, the problem of deciding whether a set of periodic tasks is feasibly schedulable when semaphores are used to enforce mutual exclusion is NP-hard [11]. The difficulty arises as a result of the inability to preempt a job at arbitrary points. Suppose a job is preempted while it is accessing a resource. Clearly, the cpu can be taken away from the job and assigned to the preempting job. However, the same cannot be done for many other resources, because it is quite possible that the resource state reflects partial changes made to it by the preempted job. This chapter discusses the details of scheduling tasks which have resource constraints in addition to deadlines and periodicity constraints. For pedagogical reasons, it is assumed that shared resources are protected by means of critical sections. To maintain the integrity of a resource, accesses to shared resources must be serialized. Hence, if a lower priority job is within a critical section and a higher priority job tries to enter the same critical section, the higher
priority job is blocked until the lower priority job leaves the critical section. In general, any job that needs to enter a critical section to access a resource must wait until no other job is currently within the critical section, holding the resource; otherwise, it proceeds by entering the critical section and holds the resource. When a job leaves a critical section, the resource associated with the critical section becomes free and the system can then allocate it to one of the waiting jobs, if any. The protocol used to choose among waiting jobs depends on the specific algorithm used for controlling access and on the way in which priorities are assigned. This chapter covers both priority driven algorithms and planning based scheduling algorithms. Specifically, Section 6.1 discusses the scheduling problem introduced by the presence of resources and Section 6.2 presents the specific issue of priority inversion resulting from the need to preserve resource consistency. Section 6.3 presents the Priority Inheritance Protocol (PIP), Section 6.4 presents the dynamic priority ceiling protocol, and Section 6.5 briefly discusses the stack resource policy. Resource scheduling in planning mode is the subject of Section 6.6. It is interesting to note that planning mode solutions have the potential to eliminate explicit locks at runtime by scheduling tasks in such a manner as to avoid contention.
6.1  THE NATURE OF RESOURCES AND THE RESULTING SCHEDULING PROBLEMS
To understand the additional considerations introduced by the presence of resources, let us consider the problem of non-preemptively scheduling a set of aperiodic tasks with deadlines and resource requirements. System resources include processors, memory, and shared data structures. In this book, processing resources are referred to simply as cpus (or processors), and all the other serially reusable resources are referred to as resources. Consider the following example: there are two processors and two copies of a resource, each of which is used only in exclusive mode. A set of jobs whose parameters are listed below is being scheduled using a dynamic priority driven approach. A job Ji's dynamic priority is denoted by Pri where, the smaller the value of Pri, the higher the priority. Assume that Pri is given by (di + 6 * erati)
Table 6.1  Job parameters for example

Job                 J1            J2            J3
computation time    9             10            1
resource request    either copy   either copy   both
deadline            9             74            11
where erati is the earliest resource available time, i.e., the time when the resources needed by Ji will be available, given the resource needs of jobs in execution, and the number 6 is a weighting factor. This priority assignment function, which extends EDF, has good performance characteristics when planning-based approaches are used with resource constraints. The weighting factor 6 used in the priority function is chosen purely for the purposes of illustration. The schedule produced by a greedy priority driven approach is first determined. When a processor is idle, the highest priority job which does not violate the resource constraints is assigned to the processor. Initially, job priorities are Pr1 = 9, Pr2 = 74 and Pr3 = 11. So J1 has the highest priority and it is scheduled to start at time=0. Then, because one processor is still idle, another job that can start at time=0 is sought. Recomputing the priorities of the remaining jobs gives Pr2 = 74 and Pr3 = 65. Although J3 has the higher priority, only J2 can start at time=0 and so it is chosen. Finally, J3 is scheduled to start at time=10, when both copies of the resource are available. Thus jobs are scheduled according to their priority, but the algorithm is greedy about keeping the resources fully used. Now suppose a pure priority-driven approach, one that is not greedy, is used instead. After J1 is scheduled, the remaining job priorities are recomputed and J3 is chosen to be executed next at time=9, followed by J2 at time=10. Thus, with greed, the higher priority job J3 is delayed by one time unit, while without greed, J2 is delayed by 10 time units. The example shows that if the resources are to be better utilized, the execution of higher priority jobs may have to be delayed. This results in a form of priority inversion, since a higher priority job is made to wait for a resource held by a lower priority job.
Since the priority of a job reflects its time constraints and other characteristics of importance, it is usually desirable in real-time systems to place more emphasis on priorities than on avoiding the underutilization of resources.
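The greedy variant just described can be sketched as a small simulation. The code below is our own illustration, not pseudocode from the book: the job parameters come from the example table, the priority function is Pr_i = d_i + 6·erat_i (smaller value means higher priority), and all names are assumptions. The two processors are not the binding constraint here, so only the two resource copies are modeled.

```python
TOTAL_COPIES = 2
WEIGHT = 6  # illustrative weighting factor, as in the text

# name -> (computation time, copies needed, deadline)
JOBS = {"J1": (9, 1, 9), "J2": (10, 1, 74), "J3": (1, 2, 11)}

def priority(name, now, copy_free):
    """Pr = deadline + WEIGHT * erat, where erat is the earliest time at
    which the needed number of copies are simultaneously available."""
    _, copies, deadline = JOBS[name]
    avail = sorted(max(now, t) for t in copy_free)
    return deadline + WEIGHT * avail[copies - 1]

def greedy_schedule():
    copy_free = [0] * TOTAL_COPIES   # time at which each copy becomes free
    pending, start = set(JOBS), {}
    now = 0
    while pending:
        # rank pending jobs by the dynamic priority function
        ranked = sorted(pending, key=lambda n: priority(n, now, copy_free))
        started = False
        for name in ranked:          # greedy: start any job that can run now
            c, copies, _ = JOBS[name]
            free = [i for i, t in enumerate(copy_free) if t <= now]
            if len(free) >= copies:
                start[name] = now
                for i in free[:copies]:
                    copy_free[i] = now + c
                pending.discard(name)
                started = True
                break
        if not started:              # nothing can start: advance time
            now = min(t for t in copy_free if t > now)
    return start

print(greedy_schedule())  # J1 and J2 start at time=0, J3 at time=10
```

Recomputing `priority` at each decision point reproduces the values in the text: Pr3 rises from 11 to 65 once J1 holds a copy until time 9.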
The above discussion assumed that the tasks were non-preemptable. The best of both worlds can be achieved by adopting preemptive priority-driven scheduling. If this is done, then when J1 completes execution, J2 could be preempted by J3. Unfortunately, the decision to preempt may not be simple. There may be jobs which, once preempted, need to be restarted, losing all the computation up to the point of preemption. For example, in a communicating job, if a communication is interrupted it may have to be restarted from the beginning: the communication line represents an exclusive resource that is required for the complete duration of the job. A job that is preempted while reading a shared data structure can resume from the point of preemption provided the data structure has not been modified. A job that is preempted while modifying a shared data structure may leave it in an internally inconsistent state; one way to restore consistency is to wait for the (to be) preempted job to complete before allowing further use of the resource. An alternative is to roll back the changes made by the preempted job but, in general, it is difficult to keep a record of all such changes. A rollback can add considerably to the overhead and have a subsequent impact on meeting job deadlines. Returning to our example, J2 uses the resource in exclusive mode. So there are two possible ways in which priority-driven scheduling can proceed, depending on the nature of the resource.

1. If the resource is like the communication line, J2 can be preempted at time=9 and J3 can begin using it immediately. This is equivalent to not having started execution of J2 at all at time 0, so allowing J2 to execute ahead of its turn by being greedy has not helped. But, if J2's computation time is less than or equal to that of J1, greed can be used. In any case, the execution of J3 is not delayed any further than under pure priority-driven scheduling.

2. If the resource is a modifiable data structure, J3's execution is delayed, either by the need to roll back J2's changes or by the need to wait for J2 to complete execution. In either case, J3 completes later than under pure priority-driven scheduling.

This suggests that a limited form of greed can be used, in which job computation times and the nature of the resources, as well as their use, are considered when making scheduling decisions. The goal is then to ensure as much as possible that priorities are not violated, or at least to limit the duration of priority inversion. Before describing protocols that aim to achieve this goal, the priority inversion problem is discussed in further detail.
6.2
THE PRIORITY INVERSION PROBLEM
In systems with a preemptive priority driven scheduling mechanism, a high priority job must be able to preempt a lower priority job as soon as it is ready for execution. This has been the assumption in the previous chapters. When shared resources are used, this assumption is no longer valid: if a lower priority job holds a resource and a higher priority job then tries to hold the same resource, the higher priority job is blocked until the lower priority job releases that resource. When such priority inversion occurs, a high priority job is subjected to unnecessary blocking, since it must wait for the processing of a lower priority job. This blocking increases its response time and must be explicitly considered in analyzing the feasibility of a schedule, since it violates the spirit of a priority-driven approach. Moreover, the blocking time cannot be easily bounded if a specific protocol is not used. Consider the situation of Figure 6.1. Here the schedule of three preemptable jobs J1, J2 and J3 is depicted when Pri = di, i.e., EDF is used. Here J1 and J3 use a shared resource, access to which is controlled by a critical section. At time t = 3, J3 enters the system and begins its execution. At time t = 5, when J3 is in the middle of the critical section, J1 enters the system. Since the deadline of J1 is earlier, a preemption occurs and J1 starts its execution. When it needs to enter its critical section at time t = 7, J1 finds that the associated resource is held by J3. Thus, it blocks and releases the processor. Meanwhile, a third job J2 enters the system with an earlier deadline than J3. As per the EDF algorithm J2 gets the processor and executes. At this point J1 must also wait for the execution of a medium priority job. The presence of many such medium priority jobs can lead to uncontrolled blocking. In our example J2 completes at time t = 11. Then J3 is allowed to exit its critical section and at time t = 12 J1 can finally resume its execution.
But it is too late and it misses its deadline. Several solutions have been proposed to deal with the problem of scheduling tasks accessing shared resources. One approach is to use a simple scheduling
Figure 6.1    Priority inversion under EDF scheduling. [Timeline of J1, J2 and J3 over t = 0..22; the priority inversion intervals and J1's time overflow at t = 16 are marked. Legend: critical section.]
algorithm, such as one based on priorities, but embellished to control access to critical sections. Essentially, these embellishments determine when preemption can take place and who can preempt whom. The Priority Inheritance Protocol and the Priority Ceiling Protocols were developed for fixed priority systems [18]. They have been extended for EDF in [19] and in [3], respectively. In [1] Baker describes the Stack Resource Policy, a protocol suitable both for static and dynamic priority systems. The priority inheritance protocol, the dynamic priority ceiling protocol, and the stack resource policy are described in Sections 6.3 through 6.5, respectively. In his thesis [11], Mok proposed the kernelized monitor, in which the processor is assigned in time quanta of fixed length equal to the size of the largest critical section. This essentially schedules the resource problem away, because it schedules job executions such that when one job is holding a resource, no other job can. In a way, jobs do not have to use any special protocol to ensure correct access to the resources, since the way they are scheduled guarantees access correctness. However, in a multi-processor system this is insufficient: when one job is in execution using a resource, another job needing the same resource could also be in execution on a different processor. Hence, the multi-processor schedule must be explicitly constructed so as to exclude, in time, jobs that need the same resource for mutually exclusive use. This is what is done in the planning-based
Figure 6.2    Priority inheritance under EDF scheduling. [Timeline of J1, J2 and J3 over t = 0..22. Legend: critical section.]
approaches to scheduling, as exemplified by the algorithm used in the Spring multi-processor kernel [14]. This algorithm is discussed in Section 6.6.
6.3
THE PRIORITY INHERITANCE PROTOCOL
The basic idea behind the Priority Inheritance Protocol (PIP) is that when a job blocks one or more higher priority jobs, it temporarily assumes the highest priority of the blocked jobs, that is, it inherits a higher priority. When the job exits its critical section, it resumes the original priority it had when it entered. For instance, if the three jobs of Figure 6.1 are scheduled in this way, the schedule depicted in Figure 6.2 is obtained. When the highest priority job J1 needs to enter its critical section at time t = 7, J3 inherits the priority of J1. That is, within its critical section J3 executes with a priority corresponding to the deadline of J1, which is the earliest in the system. Consequently, J2 cannot preempt J3, as happened in Figure 6.1. As soon as J3 exits its critical section and releases the resource, it resumes its original priority and J1 gets both the resource and the CPU. In this way J1 completes at time t = 11, J2 at time t = 15 and J3 at time t = 17. All the completion times being earlier than the corresponding deadlines, the schedule is feasible.
6.3.1
Assumptions and Terminology
The analysis presented in this chapter is mainly inspired by the work described in [18]. Hence, the terminology and the notation are very similar. The difference is that our interest is in systems with EDF schedulers, while the work in [18] applies to fixed priority systems. The assumptions are the following. A set of periodic and/or sporadic tasks, with deadlines less than or equal to their periods, is being scheduled. As in [18], jobs do not suspend themselves. In order to avoid deadlocked jobs, it is also assumed that critical sections are properly nested and that resources are accessed by all jobs following the same total order. A resource is denoted by Ri. The j-th critical section of the task τi is denoted by zi,j. The resource associated with zi,j is denoted by Ri,j. zi,j ⊂ zi,k indicates that zi,j is entirely contained in zi,k. Finally, the computation time of the critical section zi,j is denoted by ci,j. The assumption that critical sections are properly nested means that 1) given any pair zi,j and zi,k, either zi,j ⊂ zi,k, zi,k ⊂ zi,j, or zi,j ∩ zi,k = ∅, and 2) the order in which the resources associated with the nested critical sections are freed is the opposite of the order in which they are acquired. A job J is said to be blocked by the critical section zi,j of job Ji,h (a job of τi) if J waits for Ji,h to exit zi,j in order to continue execution, and Ji,h has a later deadline than J. Furthermore, J is said to be blocked due to resource R if the critical section zi,j blocks J and Ri,j = R.
In the description that follows, the concept of preemption levels, originally defined by Baker in [1], is useful. The preemption level of a task τi is denoted by πi. The essential property of preemption levels is that a job Ji,h is not allowed to preempt another job Jj,k unless πi > πj. In particular:

(6.3.1.1) if Ji,h has higher priority than Jj,k and arrives later, then Ji,h must have a higher preemption level than Jj,k.
In a system with EDF scheduling, the property just mentioned is satisfied if decreasing preemption levels are assigned to tasks with increasing relative deadlines. That is:

πi = 1/Di.

Note that if release jitter is considered, then πi = 1/Di becomes πi = 1/(Di − Ji) (see Chapter 3). The reason for distinguishing preemption levels from priorities is that preemption levels are fixed values that can be used to statically evaluate potential blocking in dynamic priority driven systems [1]. Finally, even though for simplicity of exposition the access protocols are presented assuming that all resource accesses occur in a mutually exclusive fashion, one can envisage extensions to the protocols that can deal with resources that are used in shared mode as well.
6.3.2
Definition of the Priority Inheritance Protocol
The Priority Inheritance Protocol for EDF is defined in the following way [19]:
•
When a job Ji,h tries to enter a critical section zi,j and the resource Ri,j is already held by a lower priority job Jj,k, Ji,h waits and Jj,k inherits the priority of Ji,h.
•
The queue of jobs waiting for a resource is ordered by decreasing priority.
•
Priority inheritance is transitive. That is, if a job J3 blocks a job J2, and J2 blocks a job J1, then J3 inherits the priority of J1 via J2.
•
At any time, the priority at which a critical section is executed is always equal to the highest priority of the jobs that are currently blocked on it.
•
When a job exits a critical section it usually resumes the priority it had when it entered the critical section. The exception occurs when a new higher priority job is blocked for the critical section that contains the critical section exited by a job.
•
When released, a resource is granted to the highest priority job, if any, waiting for it.
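The inheritance bookkeeping in the rules above can be sketched in a few lines. This is our own minimal illustration, not kernel code from the book: priorities are EDF deadlines (smaller means higher), the deadlines are illustrative, and transitive inheritance through a chain of owners is omitted for brevity.

```python
class Job:
    def __init__(self, name, deadline):
        self.name, self.base = name, deadline
        self.inherited = []          # priorities inherited from blocked jobs

    @property
    def priority(self):              # effective priority = highest (smallest)
        return min([self.base] + self.inherited)

class Resource:
    def __init__(self):
        self.owner, self.waiting = None, []

    def acquire(self, job):
        if self.owner is None:
            self.owner = job
            return True
        # job blocks; the owner inherits the blocked job's priority
        self.waiting.append(job)
        self.waiting.sort(key=lambda j: j.priority)  # queue by priority
        self.owner.inherited.append(job.priority)
        return False

    def release(self):
        old = self.owner
        old.inherited.clear()        # resume the priority held at entry
        self.owner = self.waiting.pop(0) if self.waiting else None
        return self.owner

# Scenario in the spirit of Figure 6.2 (illustrative deadlines):
# J3 holds the resource when J1, with the earliest deadline, requests it.
j1, j2, j3 = Job("J1", 10), Job("J2", 14), Job("J3", 16)
r = Resource()
r.acquire(j3)
r.acquire(j1)                        # J1 blocks; J3 inherits priority 10
assert j3.priority == 10             # so J2 (deadline 14) cannot preempt J3
r.release()
assert j3.priority == 16 and r.owner is j1
```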
Figure 6.3    Different relative priorities under EDF scheduling. [Two periodic tasks with C1 = 3, T1 = 6 and C2 = 7, T2 = 14.]
In a manner similar to [18], in the analysis of the protocol, two types of blocking can be identified: 1. direct blocking, which occurs when a higher priority job tries to acquire a resource already held by a lower priority job.
2. push-through blocking, which occurs when a medium priority job is blocked by a lower priority job that has inherited a higher priority from a job it directly blocks.
6.3.3
Properties of the Priority Inheritance Protocol
There are only a few differences between the properties of the Priority Inheritance Protocol implemented under fixed priority scheduling and under EDF scheduling. In fixed priority systems the analysis is simplified by the fact that the relative priority of task instances does not change, since it is fixed for all jobs of a task. When an EDF scheduling mechanism is used, however, this is not true. In fact, at a certain time the current job of a task may have higher priority than the current job of a second task, but at a different time in the schedule the situation may be reversed for two other jobs of the same tasks. This is illustrated in Figure 6.3. At time t = 5, job J1,1 is released and, having the earliest deadline in the system, it preempts J2,1. However, at time t = 11, when the second job of τ1 is released, preemption does not occur, because J1,2 has a later deadline than J2,1.
Even though it appears from the previous observation that, when evaluating the possible worst case blocking for the jobs of any single task, the critical sections of all other tasks must also be considered, this is not necessary. In fact, looking at the example of Figure 6.3, even if J1,2 has a lower priority than J2,1, it cannot block J2,1, since it cannot run before the completion of J2,1. That is, jobs of τ1 can be blocked by jobs of τ2, but not the other way around, which is a property similar to that found in fixed priority systems. The claim is more formally stated in the following two lemmas.

Lemma 6.1 A job JH can be blocked by a lower priority job JL only if JL is within a critical section that can block JH when JH is released.

Proof. Suppose that when JH is released, JL is not within a critical section which can directly block JH or can lead to the inheritance of a priority higher than that of JH. Then JL can be preempted by JH and it does not execute again until JH completes, that is, it can never block JH. □

Lemma 6.2 A job Ji,h of the task τi can be blocked by a lower priority job Jj,k of the task τj only if τi has a greater preemption level than τj, that is, only if πi > πj.
Proof. By Lemma 6.1, Ji,h can be blocked by Jj,k only if Jj,k is within a critical section when Ji,h is released. Hence, Jj,k must have been released earlier. By condition (6.3.1.1), Ji,h must have a higher preemption level than Jj,k. □

Similar to [18], βi,j denotes the set of all critical sections of jobs of τj that can block jobs of the task τi. That is,

βi,j = {zj,k : πj < πi and zj,k can block Ji,h}.

Given the assumption that critical sections are properly nested, the set βi,j is partially ordered by inclusion. The following concentrates only on the maximal elements of βi,j, denoted by β*i,j.

It can now be proven that each job of τi may be blocked at most by one critical section of β*i,j.

Lemma 6.3 A job Ji,h can be blocked by a lower priority job Jj,k for at most the duration of one critical section of β*i,j.
Proof. By Lemma 6.1 and Lemma 6.2, in order to block Ji,h, Jj,k must be within a critical section of β*i,j when Ji,h is released. Once Jj,k exits this critical section it can be preempted by Ji,h, which later cannot be blocked by Jj,k again. □

This property enables the computation of an upper bound on the possible blocking time experienced by any job in the schedule.
Theorem 6.1 Under PIP and EDF scheduling, each job of a task τi can be blocked for at most the duration of one critical section in each of the β*i,j, 1 ≤ j ≤ n and πi > πj.
Proof. The theorem follows immediately from Lemma 6.3 and Lemma 6.2.
□

The argument of Lemma 6.3 can be applied to any critical section that can block a job. Intuitively, any resource accessed by a job can cause at most one priority inversion, that is, a single blocking. If this claim is expanded, an upper bound can be found for the number of blockings a single job experiences. Following the notation of [18], ζi,j,k denotes the set of all longest critical sections of τj associated with resource Rk which can block τi's jobs, either directly or via push-through blocking. That is,

ζi,j,k = {zj,h : zj,h ∈ β*i,j and Rj,h = Rk}.

The set of all longest critical sections associated with resource Rk that can block τi's jobs is then:

ζi,·,k = ∪_{πj < πi} ζi,j,k.

The goal is to prove that each job of τi can be blocked by at most one critical section in ζi,·,k.

Theorem 6.2 A job of τi can be blocked by at most one critical section in ζi,·,k, for each resource Rk.

Proof. By Lemma 6.1 and Lemma 6.2, a job Ji,h can be blocked by a lower priority job Jj,p only if πj < πi and Jj,p is in the middle of a critical section of β*i,j when Ji,h is released. Without loss of generality, let us assume that the
resource associated with the critical section is Rk. Hence Jj,p holds Rk. Once Jj,p exits its critical section, Rk is granted to Ji,h in the case of direct blocking, or to a higher priority job in the case of push-through blocking. In either case, Ji,h can no longer be blocked by Jj,p, nor by other lower priority jobs with critical sections associated with Rk. □
Corollary 6.1 Under the priority inheritance protocol and EDF scheduling, if there are m resources that can block the jobs of a task τi, each of these jobs can experience a maximum blocking time:

Bi = min( Σ_{πj < πi} max{cj,p : zj,p ∈ β*i,j} , Σ_{k=1}^{m} max{cj,p : zj,p ∈ ζi,·,k} ).    (6.1)

Proof. It follows immediately from Theorem 6.1 and Theorem 6.2. □

6.3.4
Computation of Blocking Times
The evaluation of equation (6.1) is not as trivial as it might look. The determination of the sets β*i,j can be done with a relatively simple procedure. However, the procedure must be carefully designed in order to take the correct critical sections into account. Once these sets are available, the computation of the sets ζi,·,k is straightforward.
The identification of all critical sections in β*i,j is made difficult by the existence of push-through blocking and transitive inheritance. Critical sections that cause direct blocking are easily identified, while those that can cause push-through or transitive blocking require a deeper analysis. Transitive blocking can be caused by nested critical sections. Consider the example of Figure 6.4. There are three jobs in the schedule, J1, J2 and J3, with decreasing preemption levels. J1 accesses resource R2. J2 accesses two resources in a nested fashion, first R2 and then R1. Finally, J3 accesses R1. When the job J1 is released at time t = 8, J2 already holds R2 and waits for R1, previously held by J3. At time t = 10, J1 tries to hold R2 and it blocks. The priority of J1 is then transitively inherited by J3, until it exits its critical section and releases R1, which is immediately granted to J2. This job has inherited the priority of J1 through direct blocking, hence it accesses
Figure 6.4    Example of transitive blocking. [Timeline of J1, J2 and J3 over t = 0..22. Legend: critical sections S1 and S2.]
R1, and then completes its outer critical section releasing R2, which is finally available to J1 at time t = 13. In general, given three tasks τi, τk and τj with preemption levels πi > πk > πj, a job of τi can be blocked by a job of τk, which can be blocked by a job of τj: this can only happen if τk has two nested critical sections such that the outermost critical section can block τi's jobs, and on the innermost critical section τk can be blocked by τj. More formally:

βi,j ⊇ {zj,h ∈ βk,j : Rj,h = Rk,p and zk,p ⊂ zk,q and zk,q ∈ βi,k} = θi,k,j.
Note that the critical sections in θi,k,j are also considered in the determination of push-through blocking. Thus, they are not further analyzed. The resource accesses that can cause push-through blocking can be characterized by observing that a resource R can cause this kind of blocking to a job J only if R is accessed both by a lower priority job and by a job which has inherited or can inherit a priority equal to or higher than that of J. Unfortunately, this fact is not as useful as it is for fixed priority systems. Furthermore, the notion of preemption levels is, in this case, not very useful: a job with a high preemption level can in fact be subjected to push-through blocking by two jobs with lower preemption levels, as shown in Figure 6.5. Here, the job J2 has the highest preemption level, however J1 has the earliest
Figure 6.5    Anomalous push-through blocking. [Timeline of J1, J2 and J3 over t = 0..22. Legend: critical section.]
deadline in the system. Thus, when J2 is released it cannot preempt J1. J1 is later blocked by J3 on the shared resource. This blocking is also experienced by J2. Hence, the set ψi,j of critical sections of τj that can block jobs of τi must include the critical sections that can also block jobs of any other task in the system. That is:

ψi,j = ∪_{πh > πj} βh,j.

Note that ∀k, θi,k,j ⊆ ψi,j. Hence, βi,j only takes into account the critical sections in ψi,j.

Let σi be the set of resources accessed by jobs of τi. The critical sections of τj that can directly block these jobs are those which share resources with τi. For each j such that πj < πi, βi,j can be finally written as:

βi,j = {zj,k : Rj,k ∈ σi} ∪ ψi,j.

Given this formulation of the sets βi,j, the algorithm shown in Figure 6.6 can be used. Without loss of generality, it is assumed that the n tasks are ordered with decreasing preemption levels, that is, πi ≥ πj for all i < j. The complexity of the algorithm is O(cn³), where c is the maximum number of critical sections of each task in the system.
Algorithm BCS:
Begin
    Step 1: Direct blocking.
    for i = 1 to n − 1
        for j = i + 1 to n
            βi,j = {zj,k : Rj,k ∈ σi}
        endfor
    endfor
    Step 2: Transitive and push-through blocking.
    for j = 2 to n
        γj = ∪_{h=1}^{j−1} βh,j
        for i = 1 to j − 1
            βi,j = βi,j ∪ γj
        endfor
    endfor
End

Figure 6.6    Algorithm for the computation of blocking critical sections.
Let us describe the algorithm by means of an example. Assume four tasks τ1, τ2, τ3 and τ4, ordered with decreasing preemption levels, that is, with increasing relative deadlines. Assume that τ1 accesses resource R3, τ2 accesses resources R3 and R2, τ3 accesses resources R2 and R1, and τ4 accesses resources R1 and R3. Assume also that τ3 accesses its two resources in a nested fashion, first R2 and then R1. The key aspects of the code of the four tasks are represented in Table 6.2.
In the first step, the algorithm shown in Figure 6.6 determines the critical sections that can cause direct blocking:

β1,2 = {z2,1}        β1,4 = {z4,2}
β1,3 = {}            β2,4 = {z4,2}
β2,3 = {z3,1}        β3,4 = {z4,1}
In the second step, the determination of the sets is completed with the addition of the critical sections that can cause transitive blocking and push-through blocking:

β1,2 = {z2,1}        β1,4 = {z4,1, z4,2}
β1,3 = {z3,1}        β2,4 = {z4,1, z4,2}
β2,3 = {z3,1}        β3,4 = {z4,1, z4,2}
Once the sets βi,j have been determined, the sets of maximal elements β*i,j, and later the sets ζi,·,k of blocking critical sections accessed through the same resource Rk, can be easily determined. The blocking time of each task in the system is then computed with equation (6.1). An alternative to the algorithm presented in this section is to modify the procedure described in [12]. That procedure is based on the examination of all possible blockings of a job, by traversing a tree specifically built for this goal. It can find better bounds than those found by the algorithm presented in this section; however, it has an exponential complexity. In order to work for systems with EDF scheduling the procedure must be modified by substituting the fixed priorities with the preemption levels of the tasks. The rest remains unchanged. In the following, it is assumed that the blocking times have been determined as described in this section. The blocking time of the jobs of τi is denoted by Bi.
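Algorithm BCS is short enough to transcribe directly. The sketch below is our own Python rendering, not the book's code: each critical section is represented simply by the resource it uses, and the task/section names mirror the four-task example above.

```python
# sigma[i]: resources accessed by tau_i (from the example)
sigma = {1: {"R3"}, 2: {"R3", "R2"}, 3: {"R2", "R1"}, 4: {"R1", "R3"}}

# critical sections of each task: section name -> resource it guards
# (tau_3's z3,1 on R2 nests an access to R1 inside; tau_1 has no section
# that can block anyone of interest here)
z = {
    2: {"z2,1": "R3"},
    3: {"z3,1": "R2"},
    4: {"z4,1": "R1", "z4,2": "R3"},
}

n = 4
beta = {(i, j): set() for i in range(1, n + 1) for j in range(1, n + 1)}

# Step 1: direct blocking
for i in range(1, n):
    for j in range(i + 1, n + 1):
        beta[i, j] = {name for name, res in z.get(j, {}).items()
                      if res in sigma[i]}

# Step 2: transitive and push-through blocking
for j in range(2, n + 1):
    gamma = set().union(*(beta[h, j] for h in range(1, j)))
    for i in range(1, j):
        beta[i, j] |= gamma

print(sorted(beta[1, 4]))  # ['z4,1', 'z4,2'], as computed in the text
```

Step 2 reproduces the second table above: every βi,4 picks up both of τ4's critical sections, and β1,3 gains z3,1 by push-through.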
τ1:              τ2:              τ3:              τ4:
...              ...              ...              ...
acquire(R3)      acquire(R3)      acquire(R2)      acquire(R1)
...              ...              ...              ...
release(R3)      release(R3)      acquire(R1)      release(R1)
...              ...              ...              ...
                 acquire(R2)      release(R1)      acquire(R3)
                 ...              ...              ...
                 release(R2)      release(R2)      release(R3)
                 ...              ...              ...

Table 6.2    Structure of accesses to resources.

6.3.5
Feasibility Check
After the identification of the tasks' blocking times, the feasibility of the system under PIP must be checked. In the following theorem the tasks are assumed to be ordered by decreasing preemption levels (by increasing relative deadlines), that is, πi ≥ πj for all i ≤ j.

Theorem 6.3 Given a set 𝒥 of n periodic and/or sporadic tasks, any job set generated by 𝒥 is feasibly scheduled under EDF scheduling and the Priority Inheritance Protocol if

∀i = 1, …, n    Σ_{j=1}^{i−1} Cj/Dj + (Ci + Bi)/Di ≤ 1.    (6.2)
Proof. By induction on i, let us prove that if the first i conditions in (6.2) are true, then the subset of the first i tasks is feasibly scheduled even if the jobs can also encounter blockings normally due to jobs of tasks with lower preemption levels.
For i = 1 the hypothesis is true, since each job of τ1 can be blocked at most for B1 units of time and it is known that:

(C1 + B1)/D1 ≤ 1.
That is, by the basic theorem relating utilization to feasibility, even if the computation time of τ1 is increased by B1, feasible schedules result. Assume now that the hypothesis is true for the first i − 1 conditions, and that also the i-th condition is true. Let Jj,k be any job in the schedule S of the first i tasks. Let t be the last point in time, not later than rj,k, the release time of Jj,k, such that from time t up to the completion time fj,k of Jj,k, only jobs released at t or later and with deadlines less than or equal to dj,k are executed, except for some possible critical sections of jobs with lower preemption levels that can cause blocking. If J(t, fj,k) is the set of such jobs, the completion time of Jj,k satisfies:

fj,k ≤ t + B(t, fj,k) + Σ_{Jh,l ∈ J(t, fj,k)} Ch,l,
where B(t, fj,k) is the possible blocking time spent in [t, fj,k] executing critical sections of jobs with lower preemption levels, that is, jobs of tasks τp with p > i. If in J(t, fj,k) there are no jobs of τi, then S coincides with the schedule of the first i − 1 tasks. By induction, this schedule is feasible, that is, fj,k ≤ dj,k.
If there are jobs of τi in J(t, fj,k), the blocking time of τi's jobs in B(t, fj,k) must be taken into account. The blockings of the first i − 1 tasks' jobs due to jobs with preemption level lower than πi are push-through blockings for τi, hence they are included in Bi. If the first i − 1 tasks are scheduled without blocking and the i-th task has blocking Bi, a new schedule exists in which Jj,k does not complete earlier (note that blockings due to jobs in the same set J(t, fj,k) do not affect the completion time of Jj,k). Since the i-th condition in (6.2) is true, by the basic theorem relating utilization to feasibility, this new schedule is feasible. Again fj,k ≤ dj,k follows. □
Similar to Chapter 3, the sufficient feasibility conditions can be checked in O(n 2 ) time. Hence they are appropriate for use both off-line, at design time, and on-line, when tasks arrive dynamically.
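The sufficient test of Theorem 6.3 can be sketched as follows. The function name and the task parameters are ours, chosen purely for illustration; tasks must be supplied in order of increasing relative deadline, i.e., decreasing preemption level.

```python
def pip_edf_feasible(tasks):
    """tasks: list of (C, D, B) tuples ordered by increasing relative
    deadline D. Checks, for every i,
        sum_{j < i} C_j/D_j + (C_i + B_i)/D_i <= 1,
    the condition (6.2) of Theorem 6.3 (sufficient, not necessary)."""
    for i, (Ci, Di, Bi) in enumerate(tasks):
        load = sum(Cj / Dj for Cj, Dj, _ in tasks[:i])
        if load + (Ci + Bi) / Di > 1:
            return False
    return True

# Illustrative task set (C, D, B):
print(pip_edf_feasible([(1, 4, 1), (2, 8, 2), (2, 16, 0)]))  # True
```

The test is O(n²) as stated, since each of the n conditions sums at most n − 1 utilization terms.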
6.4
THE DYNAMIC PRIORITY CEILING PROTOCOL
Priority inversion is not the only problem present in real-time systems with resources. There is another bad situation, which is not avoided by the Priority Inheritance Protocol, that can cause a high priority job to experience a large amount of blocking. This is so-called chained blocking, which happens because a high priority job is likely to be blocked whenever it wants to enter a critical section. If the job has several critical sections, it can be blocked for a considerable amount of time. Sha et al. introduced the Priority Ceiling Protocol (PCP) to avoid chained blocking [18]. It works with a Rate Monotonic scheduler [10]. Chen and Lin [3] have extended the protocol to apply to an EDF scheduler. Later, the same authors further extended the protocol in order to handle multiple-instance resources [4]. The main goal of these protocols is to reduce the occurrence of priority inversions beyond the reductions achieved with PIP alone. The key ideas are to prevent multiple priority inversions by means of early blocking of jobs that could cause priority inversion, and to minimize as much as possible the length of the priority inversion by allowing a temporary rise of the priority of the blocking job, that is, by using priority inheritance. The early blocking of jobs is realized by dynamically keeping track of the priority ceiling of each resource, i.e., the priority of the highest priority job that may hold the resource at that time. When a job tries to hold a resource, the resource is made available only if the resource is free, and only if the priority of the job is greater than the current highest priority ceiling in the system. Such a rule can cause early blockings in the sense that a job can be blocked even if the resource it wants to access is free. This is not the case with PIP. The main advantages of early blocking are that it saves unnecessary context switches and that it affords the possibility of a simple and efficient implementation.
This access rule guarantees that any possible future job with higher priority is blocked at most once by the job which is currently holding a resource. Intuitively, chained blocking is prevented by ensuring that "among all resources needed by a future job, at most one of them is held by jobs with lower priorities at any time" [18]. This is the key to preventing multiple priority inversions experienced by a job. Assuming an EDF priority assignment, the earlier a job's deadline, the higher its priority. Following the description given in [3], the PCP has two parts
which define the priority ceiling of a resource and the handling of requests for resources: "Ceiling Protocol. At any time, the priority ceiling of a resource R, c(R), is equal to the original priority of the highest priority job that currently holds or will hold the resource. Resource Allocation Protocol. A job J; requesting a resource R is allowed to access R only if p; > C(RH)' where pr; is the priority of J; and RH is the resource with the highest priority ceiling among the resources currently held by jobs other than 1;. Otherwise, J; waits and the job Jl which holds R H inherits the priority of Ji until it releases RH."
The protocol has been shown to have the following properties:
• A job can be blocked at most once before it enters its first critical section.

• It prevents deadlocks.
Of course, the former property is used to evaluate the worst case blocking times of the jobs. Given this protocol the schedulability formula of Liu and Layland [10] has been extended by Chen and Lin [3] to obtain the following condition.
Theorem 6.4 (Chen and Lin) A set of n periodic tasks can be scheduled by EDF using the Dynamic Priority Ceiling Protocol if the following condition is satisfied:
    sum_{i=1}^{n} (C_i + B_i) / T_i <= 1,

where C_i is the worst case execution time, B_i is the worst case blocking length, and T_i is the period of the i-th task. □
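The condition of Theorem 6.4 is a one-line utilization test. A minimal sketch, with illustrative (C_i, B_i, T_i) values that are not from the text:

```python
# Sufficient schedulability test of Theorem 6.4 (Chen and Lin):
# the task set is schedulable if sum((C_i + B_i) / T_i) <= 1.

def chen_lin_test(tasks):
    """tasks: iterable of (C, B, T) triples."""
    return sum((c + b) / t for c, b, t in tasks) <= 1.0

tasks = [(2, 1, 10), (3, 0, 15), (5, 2, 40)]
# (2+1)/10 + (3+0)/15 + (5+2)/40 = 0.3 + 0.2 + 0.175 = 0.675
print(chen_lin_test(tasks))  # True
```

Being only sufficient, a False answer does not prove the task set unschedulable; the per-task subtests mentioned below are sharper.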
A more precise guarantee test is also possible by considering n subtests, one for each task, as is done under the static PCP. This test is not presented here. Compared to the PIP, the PCP has a higher processor utilization assuming the same task set, since the blocking time of each task is shorter. However, it is not suitable for task sets in which both the average number of resources accessed by a task and the average number of tasks that may access a resource
are large. PCP can also unnecessarily delay jobs because the conditions it uses are sufficient, but not necessary. To rectify this, an optimal protocol has been developed in [13]. It delays jobs only if it helps to avoid potential priority inversions.
6.5 THE STACK RESOURCE POLICY
In [1], Baker introduced the Stack Resource Policy (SRP) that handles a more general situation in which multiunit resources, both static and dynamic priority schemes, and sharing of runtime stacks are all allowed. This section describes the protocol briefly. The protocol relies on the following two conditions [1]:
(6.5.1) To prevent deadlocks, a job should not be permitted to start until the resources currently available are sufficient to meet its maximum requirements.

(6.5.2) To prevent multiple priority inversions, a job should not be permitted to start until the resources currently available are sufficient to meet the maximum requirement of any single job that might preempt it.

The key idea is that when a job needs a resource which is not available, it is blocked at the time it attempts to preempt, rather than later, when it actually needs the shared resource. As with the PCP, the SRP saves unnecessary context switches through earlier blocking and makes simple and efficient implementations possible. The SRP derives its name from the fact that it can easily be implemented using a stack. It should be noted that the basic premise underlying (6.5.1) also motivated the development of planning based scheduling algorithms (which were actually developed before the SRP), in that they too do not schedule a job unless all the resources it needs in the worst case are available. However, the details of how the two types of algorithms make use of this requirement differ.
The SRP uses a notion called the resource ceiling, which can be defined as the maximum preemption level of all the jobs that may be blocked directly by the currently available units of a resource. The protocol keeps a system-wide
ceiling defined as the maximum of the current resource ceilings. A job with the earliest deadline is allowed to preempt only if its preemption level is greater than the current system-wide ceiling. This enforces the above conditions (6.5.1) and (6.5.2).
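The preemption test and the stack of system ceilings can be sketched as follows. The stack-based bookkeeping follows the description above, but the function names and numeric preemption levels are illustrative assumptions.

```python
# Minimal sketch of the SRP system ceiling and preemption test.
# Larger number = higher preemption level (shorter relative deadline).

system_ceiling_stack = [0]  # 0 = no resource currently constrains preemption

def may_preempt(preemption_level):
    """A job may preempt only if its preemption level is strictly
    greater than the current system-wide ceiling."""
    return preemption_level > system_ceiling_stack[-1]

def acquire(resource_ceiling):
    # Locking a resource pushes a new (possibly higher) system ceiling.
    system_ceiling_stack.append(max(system_ceiling_stack[-1],
                                    resource_ceiling))

def release():
    # Releasing restores the previous system ceiling.
    system_ceiling_stack.pop()

acquire(3)              # running job locks a resource with ceiling 3
print(may_preempt(2))   # False: blocked at preemption time, not at request
print(may_preempt(4))   # True: its level exceeds the current ceiling
release()
```

Because the test happens when a job attempts to preempt, the acquire and release operations themselves never block, which is the source of the implementation simplicity discussed below.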
The preemption test has the effect of imposing priority inheritance (that is, an executing job that is holding a resource resists preemption as though it inherits the priority of any jobs that might need that resource). What is noteworthy is that this effect is accomplished without modifying the original priority of the job [1]. The SRP has been shown to have properties similar to those of the PCP. That is, it prevents chained blocking and is deadlock free. Furthermore, assuming n tasks ordered by decreasing preemption levels, that is, by increasing relative deadlines, Baker [1] has developed a sufficient condition to check the feasibility of a task set.
Theorem 6.5 (Baker) A set of n tasks (periodic and sporadic) is schedulable by EDF scheduling with SRP if, for all k = 1, ..., n,

    sum_{i=1}^{k} C_i / D_i + B_k / D_k <= 1,

where C_i and D_i are the worst case execution time and the relative deadline of task i, and B_k is the maximum blocking time of task k. □
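Baker's condition can be checked with one pass per task. A sketch, assuming the tasks are already indexed by increasing relative deadline D_i (the (C, D, B) tuples are illustrative values, not from the text):

```python
# Sufficient feasibility check in the spirit of Theorem 6.5 (Baker):
# for every k, sum_{i<=k} C_i / D_i + B_k / D_k <= 1.

def baker_test(tasks):
    """tasks: list of (C_i, D_i, B_i), sorted by increasing D_i
    (i.e., by decreasing preemption level)."""
    for k in range(len(tasks)):
        util = sum(c / d for c, d, _ in tasks[:k + 1])
        if util + tasks[k][2] / tasks[k][1] > 1.0:
            return False
    return True

tasks = [(1, 4, 1), (2, 10, 2), (3, 20, 0)]
# k=0: 1/4 + 1/4 = 0.5; k=1: 0.45 + 0.2 = 0.65; k=2: 0.6 + 0 = 0.6
print(baker_test(tasks))  # True
```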
The implementation complexity of the SRP is much lower than that of the dynamic PCP. In fact, in this case the ceilings are static, and so may be precomputed and stored in a table. The current ceiling may be simply kept on a stack. Furthermore, the acquire and release primitives are simpler because resource requests cannot block: they do not require any blocking test or a context switch. A concurrency control protocol similar to SRP has been independently developed for sporadic task sets [7]. In this paper, a necessary and sufficient schedulability test is derived based on the knowledge of a minimum and maximum execution time for each critical section. See the referenced paper for details.
6.6 RESOURCE SCHEDULING IN PLANNING-BASED SCHEDULERS
Recall that a planning algorithm works by attempting to construct a schedule for a set of jobs. It starts with an empty schedule and extends it in steps, one job at a time, until, if it succeeds, it comes up with a complete feasible schedule. Here the additional considerations that enter the picture when jobs have resource constraints are presented. In the presence of resources, the planning algorithm must compute the earliest start time, est_i, at which job J_i can begin execution after accounting for resource contention among jobs. Assume that a job can use a resource either in shared mode or in exclusive mode and holds a requested resource as long as it executes. It is assumed that jobs are non-preemptive. Given a partial schedule, erat_j, the earliest time at which resource R_j is available in shared or exclusive mode, can be determined. The erat values are computed taking into account the requirement that when a resource is held by a job in exclusive mode, no other job can be using it in either exclusive or shared mode. Consider how this is accomplished. When resources are taken into account, several data structures are required to keep track of the availability of resources. To simplify the discussion, the data structures used when the system has one instance of each resource are presented first. Subsequently, extensions to handle multiple instances are discussed. When only one instance exists for each resource, the planner maintains two vectors, erat^s and erat^e, to indicate the earliest available times of the r resources in the system for shared and exclusive modes, respectively:

    erat^s = (erat^s_1, erat^s_2, ..., erat^s_r)  and  erat^e = (erat^e_1, erat^e_2, ..., erat^e_r)
Here erat^s_i (or erat^e_i) is the earliest time when resource R_i will become available for shared (or exclusive) usage. After the partial schedule is extended by one task, the planner updates erat^s and erat^e using the task's start time, computation time and resource requirements.
Here is a simple example to illustrate the computation of new erat^s and erat^e values. Assume a system has 5 resources, R_1, R_2, ..., R_5. Let the current erat^s and erat^e be

    erat^s = (erat^s_1, erat^s_2, erat^s_3, erat^s_4, erat^s_5) = (5, 25, 10, 5, 10), and
    erat^e = (erat^e_1, erat^e_2, erat^e_3, erat^e_4, erat^e_5) = (5, 25, 10, 10, 15).

Suppose job J_i is selected to extend the partial schedule. Assume C_i = 10, r_i = 0, and J_i requests R_1 and R_4 for exclusive use and R_5 for shared use. Then the earliest time at which J_i can start is the latest of the earliest available times of the resources it needs. So,

    est_i = max(r_i, max_j erat^u_j) = max(0, max(5, 10, 10)) = 10,

and J_i can start at 10.
The algorithm updates the erat^s and erat^e vectors as follows:

    erat^s = (erat^s_1, erat^s_2, erat^s_3, erat^s_4, erat^s_5) = (20, 25, 10, 20, 10), and
    erat^e = (erat^e_1, erat^e_2, erat^e_3, erat^e_4, erat^e_5) = (20, 25, 10, 20, 20).

Note that the two entries for R_5 are treated differently. erat^s_5 remains 10 because J_i uses R_5 in shared mode and it is therefore possible for some other task to utilize R_5 in parallel, in shared mode. However, erat^e_5 = 20 because another task which requires R_5 in exclusive mode cannot be permitted to execute in parallel with J_i. Based on the above discussion, it is easy to observe that given a task's earliest start time, its finish time can be determined, and thus the planner can decide whether the task will finish by its deadline. Now, the extensions that allow each distinct resource to have multiple instances are discussed. In this case, a vector no longer suffices to represent the two erat's: erat^s and erat^e have to be matrices so that the earliest available time of every instance of each resource can be represented.
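The single-instance bookkeeping just described can be replayed in a few lines. The function name and list representation are illustrative assumptions; the numbers reproduce the worked example above (resources are 0-indexed, so R_1 is index 0).

```python
# Sketch of the single-instance erat update used by the planner.

def extend(erat_s, erat_e, r_i, C_i, shared, exclusive):
    """Schedule one non-preemptive job against a partial schedule.
    Returns (est, new erat_s, new erat_e)."""
    needed = [erat_e[j] for j in exclusive] + [erat_s[j] for j in shared]
    est = max([r_i] + needed)
    finish = est + C_i
    erat_s, erat_e = list(erat_s), list(erat_e)
    for j in exclusive:          # exclusive use delays every other user
        erat_s[j] = erat_e[j] = finish
    for j in shared:             # shared use delays only exclusive users
        erat_e[j] = finish
    return est, erat_s, erat_e

est, s, e = extend([5, 25, 10, 5, 10], [5, 25, 10, 10, 15],
                   r_i=0, C_i=10, shared=[4], exclusive=[0, 3])
print(est)  # 10
print(s)    # [20, 25, 10, 20, 10]
print(e)    # [20, 25, 10, 20, 20]
```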
    erat^s = ( (erat^s_11, erat^s_12, ..., erat^s_1n),
               (erat^s_21, erat^s_22, ..., erat^s_2m),
               ...
               (erat^s_r1, erat^s_r2, ..., erat^s_rp) )

and

    erat^e = ( (erat^e_11, erat^e_12, ..., erat^e_1n),
               (erat^e_21, erat^e_22, ..., erat^e_2m),
               ...
               (erat^e_r1, erat^e_r2, ..., erat^e_rp) )

where n, m and p are the number of instances of resources R_1, R_2 and R_r, respectively.
Given the representations of erat^s and erat^e as matrices, est_i becomes

    est_i = max(r_i, max_{j=1..r} min_{k=1..q} erat^u_{jk}),

where u is s when R_j is used in shared mode or e when R_j is used in exclusive mode, and q is the number of instances of R_j. The additional change when resources are included involves being resource cognizant while assigning priorities to jobs when jobs are being considered to extend the partial schedule. Two possible priority assignment policies are:

1. Minimum earliest start time first (Min_S): Pr(J_i) = est_i.

2. Min_D + Min_S: Pr(J_i) = d_i + W_1 * est_i, where W_1 is a weighting constant.
The first policy does not consider tasks' timing constraints at all but does take into account resource availability. The second extends EDF to consider both time and resource constraints. This policy has been shown to result in good real-time performance [14] measured in terms of the planning algorithm's ability to find feasible schedules. In [16], the above planning algorithm has been extended to deal with tasks that can be parallelized, that is, mapped into subtasks that are executed concurrently on multiple processors.
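The multiple-instance start-time formula and the two priority policies can be sketched together. The data layout and the value of W_1 are illustrative assumptions, not from the text.

```python
# Sketch of est_i over multiple resource instances, plus the two
# priority assignment policies discussed above.

def est(r_i, needs, erat):
    """needs: list of (j, mode) pairs with mode in {'s', 'e'};
    erat: dict (j, mode) -> list of earliest available times, one per
    instance of resource R_j. Any free instance will do, hence the min."""
    times = [min(erat[(j, mode)]) for j, mode in needs]
    return max([r_i] + times)

def priority_min_s(est_i):
    return est_i                    # Min_S: smallest est first

def priority_min_d_min_s(d_i, est_i, W1=0.5):
    return d_i + W1 * est_i         # Min_D + Min_S

erat = {(0, 'e'): [5, 12], (1, 's'): [3, 3, 9]}
e = est(0, [(0, 'e'), (1, 's')], erat)
print(e)                            # max(0, min(5,12), min(3,3,9)) = 5
print(priority_min_d_min_s(20, e))  # 20 + 0.5 * 5 = 22.5
```

Smaller priority values are served first under both policies, matching the "minimum first" reading in the text.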
6.7 SUMMARY
A spectrum of possibilities, starting from one that uses a non-preemptive job model to one that allows arbitrary preemptions, emerges when several approaches to ensuring the integrity of shared resources are considered.

1. Adopt the non-preemptive job model, where a job acquires and holds on to a resource throughout its execution. This is the approach taken by the planning based approaches. Jobs that need the same resource are scheduled to exclude each other in time. This way, all blocking that could occur while jobs execute is avoided. However, if jobs are of long duration, or if a job uses different resources during different parts of its execution, then concurrency can be reduced and competing jobs can be made to wait longer than resource consistency requires. To alleviate this problem, a job is modeled as a set of precedence-related components, with each component having its own resource requirements. Such jobs can then be scheduled using a planning based algorithm that can exploit precedence constraints of jobs.

2. Allow a job to be preempted in a controlled fashion: either after it releases the shared resources it currently holds, or only if the preempting job does not require the resources needed by the preempted job. This characterizes the approach taken by priority-driven algorithms such as PIP, PCP, and SRP. Accesses to the resources are controlled in such a way that the duration of priority inversion is bounded and is as small as possible. As a result, feasibility checks can be undertaken to make sure that jobs meet their time constraints in spite of the blocking.

3. Allow a job to be preempted by a higher priority job at any point during its execution. The resources it currently holds are released after undoing all the changes made by the preempted job. When the job resumes, it re-executes from the beginning.
In this case, the system must somehow be prepared to reinstate the resource's state as it existed when the preempted job began execution. This adds to the system's overheads and requires additional mechanisms for ensuring undoability. Hence, this solution is not preferred except in real-time transaction systems, where such mechanisms are essential [15]. The above list shows that there is no "universal" solution to the problem of scheduling shared resource access in real-time systems. All of the approaches
have their place depending on the nature of resources, jobs, and scheduling overheads that can be considered acceptable. More experience is necessary in achieving practical realizations of the algorithms described in this chapter. In addition, a comprehensive set of solutions for the real-time scheduling of different types of resources is still being sought. In particular, good integrated scheduling and resource access policies that span processors, input/output resources, communication channels, and other resources need to be developed. The challenge lies in making them simple enough to implement and yet produce acceptable resource utilization levels. As a final note, a protocol that shares many features with the Dynamic PIP and the SRP has been defined by Jeffay for scheduling sporadic tasks that access shared resources [7]. His model is more restrictive than that adopted by PIP and SRP. It assumes that a job can access at most one shared resource at any given time.
REFERENCES
[1] T.P. Baker, "Stack-Based Scheduling of Real-Time Processes," Real-Time Systems Journal 3, 1991.
[2] P. Brinch Hansen, "Structured Multiprogramming," Communications of the ACM 15(7), 1972.
[3] M. Chen and K. Lin, "Dynamic Priority Ceilings: A Concurrency Control Protocol for Real-Time Systems," Real-Time Systems Journal 2, 1990.
[4] M. Chen and K. Lin, "A Priority Ceiling Protocol for Multiple-Instance Resources," Proc. of the Real-Time Systems Symposium, 1991.
[5] A.N. Habermann, "Synchronization of Communicating Processes," Communications of the ACM 15(3), 1972.
[6] C.A.R. Hoare, "Monitors: An Operating System Structuring Concept," Communications of the ACM 17(10), 1974.
[7] K. Jeffay, "Scheduling Sporadic Tasks with Shared Resources in Hard Real-Time Systems," Proc. of the Real-Time Systems Symposium, 1992, pp. 89-99.
[8] L. Lamport, "The Mutual Exclusion Problem: Part I - A Theory of Interprocess Communication," Journal of the ACM 33(2), 1986.
[9] L. Lamport, "The Mutual Exclusion Problem: Part II - Statement and Solutions," Journal of the ACM 33(2), 1986.
[10] C.L. Liu and J.W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment," Journal of the ACM 20(1), 1973.
[11] A.K. Mok, Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment, Ph.D. Dissertation, MIT, 1983.
[12] R. Rajkumar, Synchronization in Real-Time Systems: A Priority Inheritance Approach, Kluwer Academic Publishers, Boston, 1991.
[13] R. Rajkumar, L. Sha, J.P. Lehoczky and K. Ramamritham, "An Optimal Priority Inheritance Protocol for Real-Time Synchronization," in Principles of Real-Time Systems, Sang Son, Ed., Prentice-Hall, 1994.
[14] K. Ramamritham, J.A. Stankovic and P. Shiah, "Efficient Scheduling Algorithms for Real-Time Multiprocessor Systems," IEEE Transactions on Parallel and Distributed Systems, 1(2):184-194, April 1990.
[15] K. Ramamritham, "Real-Time Databases," Journal of Distributed and Parallel Databases, 1(2), 1993, pp. 199-226.
[16] G. Manimaran, S.R. Murthy and K. Ramamritham, "A New Algorithm for Dynamic Scheduling of Parallelizable Tasks in Real-Time Multiprocessor Systems," Real-Time Systems Journal, to appear.
[17] M. Raynal, Algorithms for Mutual Exclusion, MIT Press, Cambridge (MA), 1986.
[18] L. Sha, R. Rajkumar and J.P. Lehoczky, "Priority Inheritance Protocols: An Approach to Real-Time Synchronization," IEEE Transactions on Computers 39(9), 1990.
[19] M. Spuri, "Efficient Deadline Scheduling in Real-Time Systems," Ph.D. Thesis, Scuola Superiore S. Anna, July 1995.
[20] J.A. Stankovic and K. Ramamritham, "The Spring Kernel: A New Paradigm for Real-Time Systems," IEEE Software, May 1991.
[21] W. Zhao, K. Ramamritham and J. Stankovic, "Preemptive Scheduling Under Time and Resource Constraints," IEEE Transactions on Computers 36(8), 1987.
7 PRECEDENCE CONSTRAINTS AND SHARED RESOURCES
In many hard real-time systems, due to the strict deadlines that must be met, communications among jobs are implemented in a completely deterministic manner. One approach is to model communication requirements as precedence constraints among jobs; that is, if a job J_i has to communicate the result of its computation to another job J_j, the pair (J_i, J_j) is introduced in a partial order ≺, and the jobs are scheduled in such a way that if J_i ≺ J_j the execution of J_i precedes the execution of J_j. Good examples of this modeling can be found in the MARS operating system [6, 7], in which the basic concept of a real-time transaction is described exactly in this way, and in Mok's kernelized monitor [8], in which a rendez-vous construct is used to handle similar situations. In both cases, shared resources among tasks are also considered. However, in the former work the whole schedule is statically generated, that is, produced in advance before the system runs. The schedule is then stored in a table that, at run-time, is consulted by a dispatcher to actually schedule the jobs without any other computational effort. In the latter, although dynamic, the schedule is basically non-preemptive, or at least it can be said that the preemption points are chosen very carefully, since the processor is assigned in quanta of time of fixed length equal to the size of the largest critical section. This can often be inefficient. This chapter describes a simple technique that can handle precedence constraints and shared resources for dynamic preemptive systems. Preemptive systems are generally much more efficient than non-preemptive ones, dynamic systems are more robust than static solutions, and, finally, the solution also has a formal basis and associated analytical formulas.
Several protocols, such as the Priority Inheritance Protocol (PIP), the Priority Ceiling Protocol (PCP) [9, 3] and the Stack Resource Policy (SRP), handle shared resources. These have been described in the previous chapter. They are well studied and characterized with respect to sufficient conditions for the schedulability of a set of tasks. However, they have been described using a simple independent task model. In this chapter the results for these protocols are extended to handle precedence constraints as well as shared resources. There have also been several key solutions that deal with precedence constraints, but not with shared resources. Blazewicz [2] shows the optimality of a preemptive earliest deadline first (EDF) scheduler assuming the release times and the deadlines are modified according to the partial order among the tasks. The same technique is used by Garey et al. [5] to optimally schedule unit-time tasks. In [4], Chetto et al. show sufficient conditions for the EDF schedulability of a set of tasks, assuming the release times and the deadlines are modified as above. The following aspects of scheduling with precedence constraints are discussed in this chapter: Blazewicz's technique to "transparently" enforce precedence constraints when deadline driven algorithms are used; the characterization of EDF-like schedulers that can be used to correctly schedule precedence constrained tasks; and the way preemptive algorithms, even those that deal with shared resources, can be easily extended to deal with precedence constraints too. The notion of quasi-normality [10], which is an extension to [5], is recalled for this purpose. Furthermore, an application of these results to the Priority Ceiling Protocol (PCP) and the Stack Resource Policy (SRP) is also described.¹

¹ Most of the material of this chapter is taken from [10].
7.1 SCHEDULING DEPENDENT TASKS WITH EDF
One of the first results concerning the scheduling of tasks with precedence constraints is given by Blazewicz [2]. In his work, Blazewicz shows that an EDF scheduler can easily deal with such constraints if the task deadlines are suitably modified. Assume n real-time preemptable jobs are given. Let r_j, d_j, and C_j be the release time, the deadline and the worst case computation time of job J_j, respectively. The precedence constraints among the jobs can be mathematically modeled by
a partial order ≺. Thus J_i ≺ J_j means that J_i must be completed before J_j can start. Because of this, without loss of generality it can also be assumed that J_i ≺ J_j implies r_i <= r_j. With these hypotheses, when J_i ≺ J_j, a shorter deadline for J_i drives the EDF scheduler to execute J_i before J_j, thus enforcing almost "transparently" the precedence constraint between the two jobs. This intuitive idea can be easily formalized in the following way. For each job J_i compute

    d*_i = min(d_i, min{d*_j - C_j : J_i ≺ J_j}).

It follows that for all i, d*_i <= d_i, and

    J_i ≺ J_j  implies  d*_i < d*_j.

Also note that in each feasible schedule J_i must actually complete within d*_i, otherwise at least one of its successors would fail to meet its deadline and the schedule would not be feasible. If the jobs are now scheduled by EDF according to the modified deadlines, the new algorithm, referred to as EDF*, is optimal.
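The deadline modification can be computed by a backward pass over the precedence graph. A sketch, with an illustrative three-job example (the dict-based representation is an assumption):

```python
# Sketch of the EDF* deadline modification:
# d*_i = min(d_i, min over immediate successors J_j of (d*_j - C_j)).

def modified_deadlines(d, C, succ):
    """d, C: dicts job -> deadline / worst-case execution time.
    succ: dict job -> list of immediate successors in the partial order
    (assumed acyclic)."""
    dstar = {}

    def compute(i):
        # Memoized recursion from each job toward the sinks of the graph.
        if i not in dstar:
            dstar[i] = min([d[i]] + [compute(j) - C[j] for j in succ[i]])
        return dstar[i]

    for i in d:
        compute(i)
    return dstar

d = {'J1': 12, 'J2': 10, 'J3': 14}
C = {'J1': 2, 'J2': 3, 'J3': 4}
succ = {'J1': ['J2', 'J3'], 'J2': [], 'J3': []}
print(modified_deadlines(d, C, succ))
# J2* = 10, J3* = 14, J1* = min(12, 10 - 3, 14 - 4) = 7
```

Scheduling the jobs by plain EDF on these d* values then yields the EDF* behavior discussed next.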
Theorem 7.1 (Blazewicz) EDF* is optimal, in the sense that it finds a feasible schedule whenever one exists.
Proof. Assume there exists a feasible schedule not compliant with EDF* scheduling. That is, there is at least one pair of jobs J_k and J_j such that, as shown in Figure 7.1, the execution of J_k, or part of it, precedes the execution of J_j, even though d*_j < d*_k and J_j is ready before J_k completes (r_j < f_k). Since d*_j < d*_k, J_k ⊀ J_j. On the other hand, since the schedule is feasible, J_j ⊀ J_k. That is, there is no precedence relation between the two jobs. Thus, it is possible to invert the executions on the intervals [max(t_k, r_j), f_k) and [t_j, f_j) without violating any precedence constraint. See Figure 7.2. Also, the feasibility of the schedule is unaffected by the inversion, since the completion time of J_j can only be shortened, while the new completion time of J_k becomes f'_k = f_j <= d*_j < d*_k <= d_k. With a finite number of such inversions a feasible schedule compliant with the EDF* algorithm is found. □
Figure 7.1 The schedule of jobs J_k and J_j violates EDF*.

Figure 7.2 The new schedule of J_k and J_j is EDF* compliant.
The technique of deadline modification is used later on in this chapter to enforce precedence constraints in deadline scheduled systems, where shared resources are also considered. In order to easily deal with both aspects, though, a more formal approach is needed. Specifically, the notion of quasi-normality is given in order to characterize the deadline driven algorithms able to easily deal with precedence constraints.
7.2 THE NOTION OF QUASI-NORMALITY
A nice analytical result concerning precedence-constrained real-time scheduling can be found in [5]. In this paper, Garey et al. describe a scheduling algorithm for unit-time tasks with arbitrary release times, deadlines, and precedence constraints, using the concept of normality. Their idea can be extended to more general dynamic systems using preemptive EDF schedulers without unit-time constraints.
Definition 7.1 Given a partial order ≺ on the jobs, the release times and the deadlines are consistent with the partial order if

    J_i ≺ J_j  implies  r_i <= r_j and d_i < d_j.
Note that the idea behind the consistency with a partial order is to enforce a precedence constraint by using an earlier deadline. The following definition formalizes the concept of a preemptive EDF schedule.
Definition 7.2 A schedule of a set of jobs is said to be normal (with respect to EDF) if for all portions δ_i and δ_j of two jobs J_i and J_j, respectively,

    s_δj < s_δi  implies  d_j <= d_i or r_i > s_δj,

where s_δ is the start time of the portion δ.
What this definition says is that at any time among all those jobs eligible to execute (a job J i is eligible for execution only if the current time t is greater
than or equal to the release time r_i), the job with the earliest deadline is always scheduled. In [5] Garey et al. show that the consistency of release times and deadlines can be used to integrate precedence constraints into the task model: just use an algorithm that produces normal schedules. This result is proven only for unit-time tasks. However, it can be easily extended to tasks of arbitrary length running on a preemptive system.

Lemma 7.1 If the release times and deadlines are consistent with a partial order, then any normal schedule that satisfies the release times and deadlines must also obey the partial order.

Proof. Consider any normal one-processor schedule and suppose that J_i ≺ J_j but that s_j < f_i, where f_i is the completion time of J_i. The last expression implies that there are two portions δ_j and δ_i of J_j and J_i, respectively, such that s_δj < s_δi. Since the schedule is normal, this means that d_j <= d_i or r_i > s_δj (recall that by the feasibility assumption s_δj >= s_j >= r_j). However, by the consistency assumption r_i <= r_j and d_i < d_j, hence in both cases there is a contradiction. □

Now the question is whether this result can be extended to handle the more general situation in which tasks share resources. Unfortunately, a direct generalization to an EDF-like scheduling algorithm, using some resource access protocol, does not hold. In fact, the produced schedules are not necessarily normal (see Figure 7.3 for an example). The motivation is very simple: even if possibly bounded, all such protocols allow priority inversion, that is, during the evolution of the system there may be a lower priority task blocking another higher priority one. In this case the condition for the schedule to be normal is violated. Hence, the conclusion is that as long as shared resources are used, normality must be weakened in some way.
That is, a less restrictive policy, with respect to scheduling decisions, but that still preserves the property of normality shown in Lemma 7.1, is needed.
Definition 7.3 A schedule of a set of jobs is said to be quasi-normal (with respect to EDF) if for all portions δ_i and δ_j of two jobs J_i and J_j, respectively,
Figure 7.3 Example of a non-normal schedule produced by EDF and a resource access protocol.
    r_i <= r_j and s_δj < s_δi  implies  d_j <= d_i.
In other words, the definition establishes that in a quasi-normal schedule the decision to preempt a job is left to the scheduler (recall that in a normal schedule, whenever there is an eligible job with an earlier deadline, a preemption is mandatory). However, if the scheduler chooses to preempt a job J_i and assigns the processor to a job J_j, the deadline of J_j must be earlier than the deadline of J_i (without loss of generality it can be assumed that tasks with equal deadlines are scheduled in FIFO order). So with quasi-normality, more freedom is given to the scheduler (so that it can obey shared resource requirements) and a somewhat weaker condition is obtained, as established by the following lemma.
Lemma 7.2 If a feasible schedule is normal then it is also quasi-normal.
Proof. Consider two portions δ_i and δ_j of the jobs J_i and J_j, respectively, with r_i <= r_j. If s_δj < s_δi, by the normality of the schedule either d_j <= d_i or r_i > s_δj. Since the schedule is feasible, s_δj >= r_j >= r_i, hence it cannot be that r_i > s_δj. It follows that d_j <= d_i, that is, the schedule is quasi-normal. □
Note that the opposite is not true (see again Figure 7.3 for an example of a quasi-normal, but not normal schedule). The result of Lemma 7.1 can now be generalized in the following way.
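The quasi-normality condition of Definition 7.3 can be checked mechanically on a recorded schedule. A sketch, in which the portion-list representation and the job names are illustrative assumptions:

```python
# Checker for quasi-normality: for all portions of J_i and J_j with
# r[i] <= r[j] and s_dj < s_di, it must hold that d[j] <= d[i].

def is_quasi_normal(portions, r, d):
    """portions: list of (job, start time) pairs, one per executed
    portion; r, d: dicts job -> release time / deadline."""
    for ji, si in portions:
        for jj, sj in portions:
            if r[ji] <= r[jj] and sj < si and d[jj] > d[ji]:
                return False
    return True

r = {'Ji': 0, 'Jj': 1}
d = {'Ji': 8, 'Jj': 20}
# A portion of the later-deadline job Jj runs before a portion of the
# earlier-released, earlier-deadline job Ji: not quasi-normal.
print(is_quasi_normal([('Ji', 5), ('Jj', 2)], r, d))  # False
```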
Theorem 7.2 Given a set of jobs with release times and deadlines consistent with a partial order ≺, any feasible schedule (i.e., one that satisfies both the release times and the deadlines) obeys the partial order ≺ if and only if it is quasi-normal.

Proof. "If". Consider any quasi-normal schedule and suppose that J_i ≺ J_j, but s_j < f_i, where s_j is the start time of J_j. By the consistency assumption r_i <= r_j and d_i < d_j. Given that the schedule is quasi-normal, then d_j <= d_i, which is a contradiction.

"Only if". Suppose now that the schedule obeys the partial order ≺ and that there are two portions δ_i and δ_j of the jobs J_i and J_j, respectively, with r_i <= r_j, whose start times are s_δj < s_δi. If the condition of quasi-normality is violated, d_j > d_i. This means that the release times and the deadlines of J_i and J_j are consistent with a partial order in which J_i precedes J_j. Hence, even if ≺ does not contain the relation J_i ≺ J_j, it can be forced without changing the problem. But this is a contradiction to the fact that a portion of J_j precedes a portion of J_i in the given schedule. □

The theorem just proven is a strong characterization of the scheduling algorithms that produce feasible schedules compliant with the given partial order, provided that release times and deadlines are consistent with it. Quasi-normality is the only property strictly required for this class of algorithms. The next step to achieve the integration of precedence constraints and shared resources in a single algorithm is to show that the protocols used to handle shared resources do not violate this property.
7.3 INTEGRATION OF SHARED RESOURCES AND PRECEDENCE
In this section it is shown how the PIP, PCP and SRP protocols can be used with an extended task model in which precedence constraints between tasks can be specified, as well as shared resources. It is first shown that quasi-normality is the essential property of a certain class of EDF schedulers. This, together with the results shown in the previous section, gives an analytical basis for the extended protocols.
Theorem 7.3 Any schedule produced by a policy or protocol that uses an EDF priority assignment is quasi-normal if and only if at any time t the executing job is in the set

    S_t = { J_j : pr_j >= pr_i for every job J_i active at time t with r_i <= r_j },

where pr_j is the priority of job J_j (recall that according to EDF higher priorities are given to jobs with earlier deadlines).
Proof. "If". Consider two jobs J_i and J_j, with r_i <= r_j and s_δj < s_δi. At time t = s_δj, by assumption pr_j >= pr_i, i.e., d_j <= d_i. Hence the schedule is quasi-normal.

"Only if". At any time t consider the executing job J_j. Let R_t be the set of all jobs with release time less than or equal to r_j, i.e., for any job J_i in R_t, r_i <= r_j. Given that such a J_i is still present in the system, at least a portion δ_i will be executed later than the portion δ_j of J_j currently executing, that is, s_δj < s_δi. By the quasi-normality of the schedule, d_j <= d_i. Hence, J_j is in S_t. □

Note that in case of priority inversion the condition for the schedule to be quasi-normal is not violated, since the blocking job, even if it does not have the highest priority in the system, is in S_t. Furthermore, whenever a job has entered S_t, it does not leave the set until it completes its execution. This allows the condition given by Theorem 7.3 to be proven by testing it only at the beginning of each job's execution. Theorem 7.3 states a general result that, together with Theorem 7.2, allows the modeling of precedence constraints among jobs by just enforcing consistency with respect to Definition 7.1, even in complex systems with shared resources. In what follows it is shown how these considerations can be applied to three well-known protocols, PIP, PCP and SRP, thereby extending them to task models with both shared resources and precedence constraints.
Corollary 7.1 Any schedule produced by the PIP, used with an EDF priority assignment, is quasi-normal.
Proof. It is sufficient to prove that the executing job is always in St and then the result is achieved by applying Theorem 7.3. The condition is always true
Figure 7.4 A situation in which an EDF scheduler without priority inheritance violates quasi-normality and precedence constraints.
whenever a job begins its execution, because at this time the job has the highest priority in the system (each job executes at a priority different from its original one only if it is blocking a higher priority job, but this cannot occur at the beginning of its execution). From that instant on, the task will always be in St, until it completes its execution. □

Note that some form of priority inheritance is necessary. Otherwise, there could be a situation like that shown in Figure 7.4, in which quasi-normality and a precedence constraint are violated, because the medium priority job, which is not in St, is allowed to start when the higher priority job blocks. So, by deadline modification and some form of inheritance, the integration of precedence constraints and shared resources can be obtained. The same argument used for the PIP holds for the PCP.

Corollary 7.2 Any schedule produced by the PCP, used with an EDF priority assignment, is quasi-normal.

Proof. Similar to Corollary 7.1. □
Corollary 7.3 Any schedule produced by the SRP, used with an EDF priority assignment, is quasi-normal.
Proof. Again, it is sufficient to prove that at any time the executing job is in St; the result is then achieved by applying Theorem 7.3. From the definition of the SRP, each job execution request is blocked from starting execution until it is the oldest, highest priority pending request, and the resources currently available are sufficient to meet its maximum requirements, as well as those of any single job that might preempt it. Hence, whenever a job begins its execution it is in St. By the way it is defined, the set St is not left by any job before its completion. It follows that at all times the executing job is in St. □
Note that even in this case there is a form of priority inheritance: "an executing job holding a resource resists preemption as though it inherits the priority of any job that might need that resource" [1]. Finally, it can be shown that consistency can be used with the PIP, the PCP or the SRP and an EDF priority assignment to enforce precedence constraints.

Corollary 7.4 If the release times and the deadlines are consistent with a partial order, any schedule produced by the PIP, the PCP or the SRP, used with an EDF priority assignment, obeys the partial order.

Proof. Follows directly from Corollary 7.1, Corollary 7.2, Corollary 7.3 and from Theorem 7.2. □
7.4 EXTENDED TASK MODEL
Corollary 7.4 allows the extension of the programming model to handle a partial order among tasks; it is only necessary to use a consistent assignment of release times and deadlines. The resulting protocol consists of two basic steps:

1. modify at run-time the release times and the deadlines in accordance with the given partial order, and

2. execute one of the known protocols (PIP, PCP or SRP).

In the rest of this section it is assumed that accesses to shared resources are controlled by the SRP protocol (the same extended model with a slightly different analysis can be used with the PIP and the PCP protocols). The activities
of the system are now modeled by means of processes. A process Pi (periodic or sporadic) is defined as a 6-tuple (τi, Gi, Ti, Di, Ci, μi), where:

•  τi is the set of tasks that form the process;

•  Gi is a directed acyclic graph that models a partial order among the tasks in τi (there is an arc from node j to node k if and only if τj ≺ τk);

•  Ti is the period of the process (if the process is sporadic, it is the minimum interval of time between two successive execution requests of the same process);

•  Di is the relative deadline of the process;

•  Ci is its worst case computation time, that is, Ci = Σ_{τj ∈ τi} cj, where cj is the worst case computation time of τj; and

•  μi is a function that represents the maximum shared resource requirements of each task in τi.
Furthermore, it is assumed that processes arrive dynamically and are dynamically scheduled. In order to make use of the previous results, the consistency of the release times and the deadlines with the partial order must be enforced. A technique similar to those which have already appeared in several papers [2, 5, 8, 4] can be used. Two different assignments of deadlines to tasks are proposed in this section. Both guarantee consistency with the given partial order, but they have a different impact in terms of schedulability analysis.

In the first solution, the same relative deadline Di is first assigned to each task of the process Pi. The deadlines are then modified by processing the tasks in reverse topological order. The algorithm is described in Figure 7.5. The algorithm runs in O(Σ_{i=1}^{n} (mi + ni)) time, where mi is the number of arcs in Gi, ni is the number of tasks in τi, and n is the number of processes in the system. Then at run-time, whenever a request of execution for the process Pi arrives at time t, the value t + dj is assigned to the absolute deadline of the current job of τj, for all τj ∈ τi.

Now, considering that each job of a task τj can be blocked if it makes use of shared resources, the value bj of its worst case blocking time must be estimated
Algorithm DM:
Begin
  Step 1: Relative deadline initialization.
    for all τj ∈ τi: dj = Di
  Step 2: Relative deadline modification.
    2.1 If all tasks in τi have been processed, halt.
    2.2 Let τj be a task not already processed and whose immediate
        successors, if any, have been processed; assign
            dj = min({dj} ∪ {dk − ck : τj ≺_{Gi} τk}),
        and go to step 2.1.
End
Figure 7.5  Algorithm of relative deadline modification.
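The procedure of Figure 7.5 can be rendered directly in Python. The following is an illustrative sketch, not the authors' implementation; the graph representation (a successor map) is an assumption:

```python
def modify_deadlines(D, succ, c):
    """Algorithm DM: assign each task of a process the relative deadline D,
    then shrink deadlines in reverse topological order so that
    d_j <= d_k - c_k whenever tau_j immediately precedes tau_k.

    succ: dict mapping each task to the list of its immediate successors
    c:    dict mapping each task to its worst case computation time
    Returns a dict of modified relative deadlines d_j."""
    d = {j: D for j in succ}              # Step 1: initialization
    order = []                            # reverse topological order
    visited = set()

    def visit(j):                         # post-order DFS over the DAG
        if j in visited:
            return
        visited.add(j)
        for k in succ[j]:
            visit(k)
        order.append(j)                   # successors are appended first

    for j in succ:
        visit(j)
    for j in order:                       # Step 2: modification
        d[j] = min([d[j]] + [d[k] - c[k] for k in succ[j]])
    return d
```

For a chain τ1 ≺ τ2 ≺ τ3 with D = 10 and computation times 2, 3, 4, this yields the modified deadlines 3, 6 and 10, so each job leaves its successors enough time to meet the common process deadline.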
as usual. Hence, assuming all the tasks in the system have been ordered by increasing relative deadlines, the formula proposed by Baker [1] can be used to check the schedulability of the whole set:

    ∀k = 1, ..., N:   Σ_{j=1}^{k} cj/dj + bk/dk ≤ 1,

where N = Σ_{i=1}^{n} |τi|. Note that in this approach the schedulability check is performed on a task basis, using the modified deadlines, without considering the process as a whole. If the schedulability test is positive, the formula works correctly. However, if the test is negative, it is pessimistic because of the following anomaly. When modifying deadlines of tasks on a per-process basis, it is possible that tasks from different processes are interleaved. This means that a task from a process with a late deadline might execute before tasks from a process with an earlier deadline, possibly causing unnecessary missed deadlines.

A tighter set of conditions can be obtained using an alternative deadline assignment. It again starts by assigning to each task of the process Pi a relative deadline equal to Di. These deadlines are then modified according to the following argument: make the tasks within a process consistent with the given partial order, and ensure that deadlines of tasks pertaining to different processes are not interleaved. In effect, this approach uses EDF scheduling for the process
as a whole, and uses modified deadlines to ensure the partial order among the tasks of the process itself. This can be easily implemented as follows. The mentioned interleaving can be avoided by assuming that the original deadlines are expressed as integer numbers. Then it is quite simple to find for each process Pi a sufficiently small positive number δi < 1 such that, modifying the deadlines by processing the tasks in reverse topological order as in algorithm DM, that is,

    dj = min({dj} ∪ {dk − δi : τj ≺_{Gi} τk}),

the smallest deadline of any task of this process is greater than Di − 1, and even with equal deadlines between two or more processes there will not be interleaving between the deadlines of their tasks. Now, during the estimation of the blocking times and the evaluation of the schedulability of the system, each process can be considered as a whole. That is, the blocking time of a process Pi is at most
    Bi = max_{τj ∈ τi} bj,
and, assuming again that the processes are ordered by increasing relative deadlines, the set of schedulability conditions becomes

    ∀k = 1, ..., n:   Σ_{i=1}^{k} Ci/Di + Bk/Dk ≤ 1.        (7.1)
This formula is very similar to the one proposed by Baker [1] in his schedulability analysis of the SRP. However, it is tighter and accounts for groups of tasks with precedence constraints. Note that even though processes consist of sets of tasks with precedence constraints, the internal details of a process are kept hidden in the schedulability conditions (7.1).
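Conditions (7.1) translate into a simple linear scan over the processes. The sketch below is an illustration (the triple layout for a process is an assumption), checking the conditions for processes given as (Ci, Di, Bi):

```python
def srp_edf_schedulable(processes):
    """Check conditions (7.1): for each k = 1..n,
        sum_{i<=k} C_i/D_i + B_k/D_k <= 1,
    where processes are (C_i, D_i, B_i) triples and are considered
    in order of increasing relative deadline D_i."""
    procs = sorted(processes, key=lambda p: p[1])    # order by D_i
    for k in range(len(procs)):
        util = sum(C / D for (C, D, _) in procs[:k + 1])
        _, Dk, Bk = procs[k]
        if util + Bk / Dk > 1:
            return False
    return True
```

Each process contributes only its aggregate worst case computation time Ci and blocking time Bi, which mirrors the remark above: the internal structure of a process stays hidden from the test.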
7.5 SUMMARY
Many real-time systems have tasks with precedence constraints. Because of this, results dealing with precedence constraints are extremely important. In this Chapter, results were first presented that consider precedence constraints for tasks that do not share resources, and then for tasks that also share resources. In Chapters 9 and 10 results for distributed systems are presented. In these results there are also precedence relations that must be satisfied: the
sending task must precede the message transmission, which must precede the task that receives the message. Precedence constraints can also be handled in the planning-based algorithms introduced in the previous Chapter, but due to space limits, details on these extensions are not presented. The reader is referred to [12].
REFERENCES
[1] T.P. Baker, "Stack-Based Scheduling of Real-Time Processes," The Journal of Real-Time Systems 3, 1991.

[2] J. Blazewicz, "Scheduling Dependent Tasks with Different Arrival Times to Meet Deadlines," in E. Gelenbe, H. Beilner (eds.), Modeling and Performance Evaluation of Computer Systems, North-Holland, Amsterdam, 1976.

[3] M. Chen and K. Lin, "Dynamic Priority Ceilings: A Concurrency Control Protocol for Real-Time Systems," The Journal of Real-Time Systems 2, 1990.

[4] H. Chetto, M. Silly, and T. Bouchentouf, "Dynamic Scheduling of Real-Time Tasks under Precedence Constraints," The Journal of Real-Time Systems 2, 1990.

[5] M.R. Garey, D.S. Johnson, B.B. Simons, and R.E. Tarjan, "Scheduling Unit-Time Tasks with Arbitrary Release Times and Deadlines," SIAM Journal on Computing 10(2), 1981.

[6] H. Kopetz, A. Damm, C. Koza, M. Mulazzani, W. Schwabl, C. Senft, and R. Zainlinger, "Distributed Fault-Tolerant Real-Time Systems: The MARS Approach," IEEE Micro 9(1), 1989.

[7] H. Kopetz, R. Zainlinger, G. Fohler, H. Kantz, P. Puschner, and W. Schutz, "The Design of Real-Time Systems: from Specification to Implementation and Verification," Software Engineering Journal 6(3), 1991.

[8] A.K. Mok, "Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment," Ph.D. Dissertation, MIT, 1983.

[9] L. Sha, R. Rajkumar, and J.P. Lehoczky, "Priority Inheritance Protocols: An Approach to Real-Time Synchronization," IEEE Transactions on Computers 39(9), 1990.

[10] M. Spuri and J.A. Stankovic, "How to Integrate Precedence Constraints and Shared Resources in Real-Time Scheduling," IEEE Transactions on Computers, Dec. 1994.

[11] W. Zhao, K. Ramamritham, and J.A. Stankovic, "Scheduling Tasks with Resource Requirements in Hard Real-Time Systems," IEEE Transactions on Software Engineering 12(5), 1987.

[12] G. Zlokapa, Real-Time Systems: Well-Timed Scheduling and Scheduling with Precedence Constraints, Ph.D. Thesis, University of Massachusetts, February 1993.
8 APERIODIC TASK SCHEDULING
This chapter deals with the problem of scheduling soft aperiodic tasks and hard periodic tasks under a deadline-based priority assignment. Different service methods are presented, whose objective is to reduce the average response time of aperiodic requests without compromising the schedulability of hard periodic tasks. Periodic tasks are scheduled by the Earliest Deadline First (EDF) algorithm. With respect to fixed-priority assignments, dynamic scheduling algorithms are characterized by higher schedulability bounds, which allow a better utilization of the processor, an increase in the size of aperiodic servers, and an enhancement of aperiodic responsiveness. Consider, for example, a periodic task set with a processor utilization factor Up = 0.6. If priorities are assigned to periodic tasks based on RM and aperiodic requests are served by a Sporadic Server, the maximum server size that guarantees periodic schedulability is about Us = 0.1, as imposed by Liu and Layland's bound. However, if periodic tasks are scheduled by EDF, the processor utilization bound goes up to 1.0, so the maximum server size can be increased to Us = 1 − Up = 0.4.

For the sake of clarity, all properties of the algorithms presented in this chapter are proven under the following assumptions:
•  all periodic tasks τi : i = 1, ..., n have hard deadlines;

•  all aperiodic tasks Ji : i = 1, ..., m do not have deadlines;

•  each periodic task τi has a period Ti, a computation time Ci, and a relative deadline Di equal to its period;

•  all periodic tasks are simultaneously activated at time t = 0;

•  each aperiodic request has a known computation time, but an unknown arrival time.
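The server-size comparison made at the opening of the chapter can be reproduced numerically. The following sketch is an illustration added here, not part of the original text; it uses the Liu and Layland least upper bound, which tends to ln 2 ≈ 0.69 for a large number of tasks under RM, against the bound Up + Us ≤ 1 under EDF:

```python
import math

def max_server_size_rm(up):
    # Under RM, for a large task set the Liu & Layland bound gives
    # Up + Us <= ln 2, so the residual server bandwidth is ln 2 - Up.
    return max(0.0, math.log(2) - up)

def max_server_size_edf(up):
    # Under EDF the utilization bound is 1.0, hence Us = 1 - Up.
    return max(0.0, 1.0 - up)

print(round(max_server_size_rm(0.6), 3))   # about 0.09, i.e., "about 0.1"
print(max_server_size_edf(0.6))            # 0.4
```

For Up = 0.6 this gives roughly 0.09 under RM versus 0.4 under EDF, matching the figures quoted above.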
This task model can easily be extended to handle periodic tasks with arbitrary phasing and relative deadlines different from their periods. Shared resources can also be included in the model assuming an access protocol like the Stack Resource Policy [1]. In this case, the schedulability analysis has to be modified to take into account the blocking factors due to the mutually exclusive access to resources. The rest of the chapter is organized as follows. In the next two sections, two fixed-priority service algorithms, namely the Priority Exchange and the Sporadic Server algorithms, are extended to work under the EDF priority assignment. Then, three new aperiodic service algorithms are introduced. They are based on dynamic deadline assignments and greatly improve the performance of the fixed-priority extensions. One of these algorithms, the EDL server, is shown to be optimal, in the sense that it minimizes the average response time of aperiodic requests.
8.1 DYNAMIC PRIORITY EXCHANGE SERVER
The Dynamic Priority Exchange (DPE) server is an aperiodic service technique proposed by Spuri and Buttazzo in [13, 15], which can be viewed as an extension of the Priority Exchange server [6], adapted to work with a deadline-based scheduling algorithm. The main idea of the algorithm is to let the server trade its run-time with the run-time of lower priority periodic tasks (under EDF this means a longer deadline) in case there are no aperiodic requests pending. In this way, the server run-time is only exchanged with periodic tasks, but never wasted (unless there are idle times). It is simply preserved, even if at a lower priority, and it can later be reclaimed when aperiodic requests enter the system. The algorithm is defined as follows:

•  the DPE server has a specified period Ts and a capacity Cs;

•  at the beginning of each period, the server's aperiodic capacity C^d, where d is the deadline of the current server period, is set to Cs;
Figure 8.1  Dynamic Priority Exchange server example.
•  each deadline di associated with the instances (completed or not) of the i-th periodic task has an aperiodic capacity C^{di}, initially set to 0;

•  aperiodic capacities (those greater than 0) receive priorities according to their deadlines and the EDF algorithm, like all the periodic task instances (ties are broken in favor of capacities, i.e., aperiodic requests);

•  whenever the highest priority entity in the system is an aperiodic capacity C^d of C units of time, the following happens:

   –  if there are aperiodic requests in the system, these are served until they complete or the capacity is exhausted (each request consumes a capacity equal to its execution time);

   –  if there are no aperiodic requests pending, the periodic task having the shortest deadline is executed; a capacity equal to the length of the execution is added to the aperiodic capacity of the task's deadline and is subtracted from C (i.e., the deadlines of the highest priority capacity and the periodic task are exchanged);

   –  if neither aperiodic requests nor periodic task instances are pending, there is an idle time and the capacity C is consumed until, at most, it is exhausted.

An example of a schedule produced by the DPE algorithm is illustrated in Figure 8.1. Two periodic tasks, τ1 and τ2, with periods T1 = 8 and T2 = 12 and worst case execution times C1 = 2 and C2 = 3, and a DPE server with period Ts = 6 and capacity Cs = 3, are present in the system.
At time t = 0, the aperiodic capacities C^8 (associated with the deadline of the first instance of τ1) and C^12 (associated with the deadline of the first instance of τ2) are set to 0, while the server capacity, with deadline 6, is set to Cs = 3. Since no aperiodic requests are pending, the first two periodic instances of τ1 and τ2 are executed, and Cs is consumed in the first three units of time. In the same interval, two units of time are accumulated in C^8 and one unit in C^12.

At time t = 3, C^8 is the highest priority entity in the system. Again, since no aperiodic requests are pending, τ2 keeps executing and the two units of C^8 are consumed and accumulated in C^12. In the following three units of time the processor is idle and C^12 is completely consumed. Note that at time t = 6 the server capacity C^12 is set at value 3 and is preserved until time t = 8, when it becomes the highest priority entity in the system (ties among aperiodic capacities are assumed to be broken in a FIFO order). At time t = 8, two units of C^12 are exchanged with C^16, while the third unit of the server is consumed since the processor is idle.

At time t = 14, an aperiodic request, J1, of 7 units of time enters the system. Since C^18 = 2, the first two units of J1 are served with deadline 18, while the next two units are served with deadline 24, using the capacity C^24. Finally, the last three units are also served with deadline 24, because at time t = 18 the server capacity C^24 is set to 3.
8.1.1 Schedulability analysis
The schedulability condition for a set of periodic tasks scheduled together with a DPE server is now analyzed. Intuitively, the server behaves like any other periodic task. The difference is that it can trade its run-time with the run-time of lower priority tasks. When a certain amount of time is traded, one or more lower priority tasks are run at a higher priority level and their lower priority time is preserved for possible aperiodic requests. This run-time exchange, however, does not affect schedulability; thus the periodic task set can be guaranteed using the classical Liu and Layland condition

    Up + Us ≤ 1,

where Up is the utilization factor of the periodic tasks and Us is the utilization factor of the DPE server. In order to prove this result, given a schedule σ produced using the DPE algorithm, consider a schedule σ' built in the following way:
Figure 8.2  DPE server schedulability.
•  replace the DPE server with a periodic task τs with period Ts and worst case execution time Cs, so that in σ' τs executes whenever the server capacity is consumed in σ;

•  the execution of periodic instances during deadline exchanges is postponed until the capacity decreases;

•  all other executions of periodic instances are left as in σ.

Note that, from the definition of the DPE algorithm, at any time at most one aperiodic capacity decreases in σ, so σ' is well defined. Also observe that, in each feasible schedule produced by the DPE algorithm, all the aperiodic capacities are exhausted before their respective deadlines. Figure 8.2 shows the schedule σ' obtained from the schedule σ of Figure 8.1. Note that all the periodic executions corresponding to increasing aperiodic capacities have been moved to the corresponding intervals in which the same capacities decrease. Also note that the schedule σ' does not depend on the aperiodic requests, but only on the characteristics of the server and on the periodic task set. Based on this observation, the following theorem can be proved.
Theorem 2 Given a set of periodic tasks with processor utilization Up and a DPE server with processor utilization Us, the whole set is schedulable by EDF if and only if

    Up + Us ≤ 1.
Proof. For any aperiodic load, all the schedules produced by the DPE algorithm have a unique corresponding EDF schedule σ', built according to the definition given above. Moreover, the task set in σ' is periodic with a processor utilization U = Up + Us. Hence, σ' is feasible if and only if Up + Us ≤ 1. Now, it will be shown that σ is feasible if and only if σ' is feasible.

Observe that in each schedule σ the completion time of a periodic instance is always less than or equal to the completion time of the corresponding instance in the schedule σ'. Hence, if σ' is feasible, then σ is also feasible, that is, the periodic task set is schedulable with the DPE algorithm. Vice versa, observing that σ' is a particular schedule produced by the DPE algorithm when there are enough aperiodic requests, if σ is feasible, then σ' is also feasible. Hence the theorem holds. □
8.1.2 Reclaiming spare time
In hard real-time systems, the guarantee test of critical tasks is done by performing a worst case schedulability analysis, i.e., assuming the maximum execution time for all task instances. However, when such a peak load is not reached, because the actual execution times are lower than the worst case values, it is not always obvious how to reclaim the spare time efficiently [11]. Using a DPE server, the spare time unused by periodic tasks can easily be reclaimed for servicing aperiodic requests. Whenever a periodic task completes, it is sufficient to add its spare time to the corresponding aperiodic capacity. An example of this reclaiming mechanism is shown in Figure 8.3. As can be seen from the capacity plot, at the completion time of the first two periodic instances, the corresponding aperiodic capacities are incremented by an amount equal to the spare time saved. Thanks to this reclaiming mechanism, the first aperiodic request can receive immediate service for all the seven units of time required, completing at time t = 11. Without reclaiming, the request would complete at time t = 12. Note that reclaiming the spare time of periodic tasks as aperiodic capacities does not affect the schedulability of the system. In fact, any spare time is already "allocated" to a priority level corresponding to its deadline when the task set has been guaranteed. Hence, the spare time can safely be used if requested with the same deadline. Designers must be careful about the practical implementation issues involved in reclaiming execution time. In some solutions
Figure 8.3  DPE server resource reclaiming.
the time is reclaimed automatically (because it is the nature of the algorithm). In other cases the reclaiming can consume considerable execution time. See [11].
8.2 DYNAMIC SPORADIC SERVER
The Dynamic Sporadic Server¹ (DSS) is an aperiodic service strategy proposed by Spuri and Buttazzo [13, 15] which extends the Sporadic Server [12] to work under a dynamic EDF scheduler. Similarly to other servers, DSS is characterized by a period Ts and a capacity Cs, which is preserved for possible aperiodic requests. Unlike other server algorithms, however, the capacity is not replenished at its full value at the beginning of each server period, but only when it has been consumed. The times at which the replenishments occur are chosen according to a replenishment rule, which allows the system to achieve full processor utilization. The main difference between the classical SS and its dynamic version consists in the way the priority is assigned to the server. SS has a fixed priority chosen according to the RM algorithm (that is, according to its period Ts); DSS has a dynamic priority assigned through a suitable deadline. The deadline assignment and the capacity replenishment are defined by the following rules:

¹A similar algorithm, called Deadline Sporadic Server, has been independently developed by Ghazalie and Baker in [5].
Figure 8.4  Dynamic Sporadic Server example.
•  When the server is created, its capacity Cs is initialized at its maximum value.

•  The next replenishment time RT and the current server deadline ds are set as soon as Cs > 0 and there is an aperiodic request pending. If tA is such a time, then RT = ds = tA + Ts.

•  The replenishment amount RA to be done at time RT is computed when the last aperiodic request is completed or Cs has been exhausted. If tI is such a time, then the value of RA is set equal to the capacity consumed within the interval [tA, tI].
Figure 8.4 illustrates an EDF schedule obtained on a task set consisting of two periodic tasks with periods T1 = 8, T2 = 12 and execution times C1 = 2, C2 = 3, and a DSS with period Ts = 6 and capacity Cs = 3. At time t = 0, the server capacity is initialized at its full value Cs = 3. Since there are no aperiodic requests pending, the processor is assigned to task τ1, which has the earliest deadline. At time t = 3, an aperiodic request with execution time 2 arrives and, since Cs > 0, the first replenishment time and the server deadline are set to RT1 = ds = 3 + Ts = 9. Since ds is the earliest deadline, DSS becomes the highest priority task in the system and the request is serviced until completion. At time t = 5, the request is completed and no other aperiodic requests are pending; hence a replenishment of two units of time is scheduled to occur at time RT1 = 9.
At time t = 6, a second aperiodic request arrives. Since Cs > 0, the next replenishment time and the new server deadline are set to RT2 = ds = 6 + Ts = 12. Again, the server becomes the highest priority task in the system (ties among tasks are always resolved in favor of the server) and the request receives immediate service. This time, however, the capacity has only one unit of time available and it gets exhausted at time t = 7. Consequently, a replenishment of one unit of time is scheduled for RT2 = 12, and the aperiodic request is delayed until t = 9, when Cs again becomes greater than zero. At time t = 9, the next replenishment time and the new deadline of the server are set to RT3 = ds = 9 + Ts = 15. As before, DSS becomes the highest priority task, thus the aperiodic request receives immediate service and finishes at time t = 10. A replenishment of one unit is then scheduled to occur at time RT3 = 15. Note that, as long as the server capacity is greater than zero, all pending aperiodic requests are executed with the same deadline. In Figure 8.4 this happens at time t = 15, when the last two aperiodic requests are serviced with the same deadline ds = 20.
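The replenishment bookkeeping described by the rules above can be sketched as follows. This is an illustrative Python model under the stated rules, not the authors' implementation; the event interface (ready, consume, idle/exhausted, replenish) is an assumption:

```python
class DynamicSporadicServer:
    """Tracks DSS capacity, deadline, and pending replenishments."""

    def __init__(self, period, capacity):
        self.Ts = period
        self.max_Cs = capacity
        self.Cs = capacity            # rule 1: start at the maximum value
        self.deadline = None
        self._rt = None               # next replenishment time RT
        self._consumed = 0            # capacity consumed since last ready
        self._pending = []            # scheduled (RT, RA) pairs

    def on_ready(self, t):
        # rule 2: Cs > 0 and an aperiodic request is pending at time t
        if self.Cs > 0:
            self._rt = self.deadline = t + self.Ts
            self._consumed = 0

    def consume(self, delta):
        assert 0 <= delta <= self.Cs
        self.Cs -= delta
        self._consumed += delta

    def on_idle_or_exhausted(self):
        # rule 3: schedule a replenishment of the consumed amount at RT
        if self._consumed > 0:
            self._pending.append((self._rt, self._consumed))
            self._consumed = 0

    def replenish_due(self, t):
        due = sum(a for rt, a in self._pending if rt <= t)
        self._pending = [(rt, a) for rt, a in self._pending if rt > t]
        self.Cs = min(self.max_Cs, self.Cs + due)
```

Replaying the example of Figure 8.4: the request at t = 3 sets ds = 9 and consumes two units, which return at t = 9; the request at t = 6 sets ds = 12, exhausts the remaining unit at t = 7, and that unit returns at t = 12.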
8.2.1 Schedulability analysis
In order to prove the schedulability bound for the Dynamic Sporadic Server, it is first shown that the server behaves like a periodic task with period Ts and execution time Cs. Given a periodic task τi, in any generic interval [t1, t2] such that τi is released at t1, the total demand of computation requested by τi is such that

    Ci(t1, t2) ≤ ⌊(t2 − t1)/Ti⌋ Ci.

In fact, under EDF the total demand of τi in [t1, t2] corresponds to the computation time scheduled with deadline less than or equal to t2. The following lemma shows that the same property holds for DSS.

Lemma 8.1 In each interval of time [t1, t2], such that t1 is the time at which DSS becomes ready (i.e., an aperiodic request arrives and no other aperiodic requests are being served), the maximum aperiodic time Cape executed by DSS in [t1, t2] satisfies the following relation:

    Cape(t1, t2) ≤ ⌊(t2 − t1)/Ts⌋ Cs.
Proof. Since replenishments are always equal to the time consumed, the server capacity is at any time less than or equal to its initial value. Also, the replenishment policy establishes that the consumed capacity cannot be reclaimed before Ts units of time after the instant at which the server became ready. This means that, from the time t1 at which the server becomes ready, at most Cs time can be consumed in each subsequent interval of time of length Ts; hence the thesis follows. □

Given that DSS behaves like a periodic task, the following theorem states that full processor utilization is still achieved.
Theorem 3 Given a set of n periodic tasks with processor utilization Up and a Dynamic Sporadic Server with processor utilization Us, the whole set is schedulable if and only if

    Up + Us ≤ 1.
Proof. "If". Assume Up + Us ≤ 1 and suppose there is an overflow at time t. The overflow is preceded by a period of continuous utilization of the processor. Furthermore, from a certain point t' on (t' < t), only instances of tasks ready at t' or later and having deadlines less than or equal to t are run (the server may be one of these tasks). Let C be the total execution time demanded by these instances. Since there is an overflow at time t, the following relation holds:

    t − t' < C.

Moreover,

    C ≤ Σ_{i=1}^{n} ⌊(t − t')/Ti⌋ Ci + ⌊(t − t')/Ts⌋ Cs
      ≤ Σ_{i=1}^{n} ((t − t')/Ti) Ci + ((t − t')/Ts) Cs
      ≤ (t − t')(Up + Us).

Thus, it follows that

    Up + Us > 1,

a contradiction.
"Only if". Since DSS behaves as a periodic task with period Ts and execution time Cs, the server utilization factor is Us = Cs/Ts and the total utilization factor of the processor is Up + Us. Hence, if the whole task set is schedulable, from the EDF schedulability bound [9] it can be concluded that Up + Us ≤ 1. □
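Lemma 8.1 and Theorem 3 can be codified in a few lines. The sketch below is an illustration (function names are assumptions):

```python
import math

def dss_demand_bound(t1, t2, Ts, Cs):
    # Lemma 8.1: maximum aperiodic time DSS can execute in [t1, t2],
    # counting demand with deadline no later than t2
    return math.floor((t2 - t1) / Ts) * Cs

def schedulable_with_dss(periods, wcets, Ts, Cs):
    # Theorem 3: the whole set is schedulable iff Up + Us <= 1
    up = sum(c / t for c, t in zip(wcets, periods))
    return up + Cs / Ts <= 1
```

For the task set of Figure 8.4 (periods 8 and 12, execution times 2 and 3, server Ts = 6, Cs = 3), Up = 0.5 and Us = 0.5, so the bound is met exactly.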
8.3 TOTAL BANDWIDTH SERVER
Looking at the characteristics of the Sporadic Server algorithm, it can easily be seen that, when the server has a long period, the execution of the aperiodic requests can be delayed significantly, regardless of the aperiodic execution times: when the period is long, the server is always scheduled with a distant deadline. There are two possible approaches to reduce the aperiodic response times. The first is, of course, to use a Sporadic Server with a shorter period. This solution, however, increases the run-time overhead of the algorithm because, to keep the server utilization constant, the capacity has to be reduced proportionally, which causes more frequent replenishments and increases the number of context switches with the periodic tasks. A second, less obvious, approach is to assign a possibly earlier deadline to each aperiodic request. The assignment must be done in such a way that the overall processor utilization of the aperiodic load never exceeds a specified maximum value Us. This is the main idea behind the Total Bandwidth Server (TBS), a simple and efficient aperiodic service mechanism proposed by Spuri and Buttazzo in [13, 15]. The name of the server comes from the fact that, each time an aperiodic request enters the system, the total bandwidth of the server, whenever possible, is immediately assigned to it. In particular, when the k-th aperiodic request arrives at time t = rk, it receives a deadline

$$d_k = \max(r_k, d_{k-1}) + \frac{C_k}{U_s},$$

where Ck is the execution time of the request and Us is the server utilization factor (i.e., its bandwidth). By definition d0 = 0. Note that in the deadline assignment rule the bandwidth allocated to previous aperiodic requests is considered through the deadline dk−1.
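The deadline assignment rule can be sketched in a few lines of code. The class and method names below are illustrative, not from the text; only the rule dk = max(rk, dk−1) + Ck/Us is taken from the definition above.

```python
class TotalBandwidthServer:
    """Sketch of the TBS deadline-assignment rule (illustrative names)."""

    def __init__(self, Us):
        self.Us = Us               # server bandwidth (utilization factor)
        self.last_deadline = 0.0   # d_0 = 0 by definition

    def assign_deadline(self, r_k, C_k):
        """d_k = max(r_k, d_{k-1}) + C_k / U_s for the k-th request."""
        d_k = max(r_k, self.last_deadline) + C_k / self.Us
        self.last_deadline = d_k
        return d_k

# Reproducing the example of Figure 8.5 (Us = 0.25; the aperiodic
# execution times 1, 2, 1 are inferred from the deadlines in the figure):
tbs = TotalBandwidthServer(0.25)
print(tbs.assign_deadline(3, 1))   # d1 = 3 + 1/0.25 = 7
print(tbs.assign_deadline(9, 2))   # d2 = 9 + 2/0.25 = 17
print(tbs.assign_deadline(14, 1))  # d3 = max(14, 17) + 1/0.25 = 21
```

Note how the state kept between calls is a single number, the last assigned deadline, which is what makes the run-time overhead of TBS practically negligible.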
Figure 8.5  Total Bandwidth Server example.
Once the deadline is assigned, the request is inserted into the ready queue of the system and scheduled by EDF like any other periodic instance. As a consequence, the implementation overhead of this algorithm is practically negligible. Figure 8.5 shows an example of EDF schedule produced by two periodic tasks with periods T1 = 6, T2 = 8 and execution times C1 = 3, C2 = 2, and a TBS with utilization Us = 1 − Up = 0.25. The first aperiodic request arrives at time t = 3 and is serviced with deadline d1 = r1 + C1/Us = 3 + 1/0.25 = 7. Since d1 is the earliest deadline in the system, the aperiodic request is executed immediately. Similarly, the second request, which arrives at time t = 9, receives a deadline d2 = r2 + C2/Us = 9 + 2/0.25 = 17, but it is not serviced immediately, since at time t = 9 there is an active periodic task, τ2, with a shorter deadline, equal to 16. Finally, the third aperiodic request arrives at time t = 14 and gets a deadline d3 = max(r3, d2) + C3/Us = 17 + 1/0.25 = 21. It does not receive immediate service, since at time t = 14 task τ1 is active and has an earlier deadline (18).
8.3.1
Schedulability analysis
In order to derive a schedulability test for a set of periodic tasks scheduled by EDF in the presence of a TBS, it is first shown that the aperiodic load executed by TBS cannot exceed the utilization factor Us defined for the server.
Lemma 8.2 In each interval of time [t1, t2], if Cape is the total execution time demanded by aperiodic requests which arrived at t1 or later and are served with deadlines less than or equal to t2, then

$$C_{ape} \leq (t_2 - t_1)U_s.$$
Proof. By definition

$$C_{ape} = \sum_{t_1 \leq r_k,\; d_k \leq t_2} C_k.$$

Given the deadline assignment rule of the TBS, there must exist two aperiodic requests with indexes k1 and k2 such that

$$C_{ape} = \sum_{t_1 \leq r_k,\; d_k \leq t_2} C_k = \sum_{k=k_1}^{k_2} C_k.$$

It follows that

$$C_{ape} = \sum_{k=k_1}^{k_2} C_k = \sum_{k=k_1}^{k_2}\left[d_k - \max(r_k, d_{k-1})\right]U_s \leq \left[d_{k_2} - \max(r_{k_1}, d_{k_1-1})\right]U_s \leq (t_2 - t_1)U_s.$$
□

The main result on TBS schedulability can now be proven.

Theorem 4 Given a set of n periodic tasks with processor utilization Up and a TBS with processor utilization Us, the whole set is schedulable by EDF if and only if

Up + Us ≤ 1.
Proof. "If". Assume Up + Us ≤ 1 and suppose there is an overflow at time t. The overflow is preceded by a period of continuous utilization of the processor. Furthermore, from a certain point t' on (t' < t), only instances of tasks ready at t' or later and having deadlines less than or equal to t are run. Let C be the total execution time demanded by these instances. Since there is an overflow at time t, it must be that t − t' < C.
Moreover,

$$C \leq \sum_{i=1}^{n}\left\lfloor\frac{t-t'}{T_i}\right\rfloor C_i + C_{ape} \leq \sum_{i=1}^{n}\frac{t-t'}{T_i}\,C_i + (t-t')U_s \leq (t-t')(U_p + U_s).$$

Thus, it follows that

Up + Us > 1,
a contradiction. "Only If". If an aperiodic request enters the system periodically, with period
Ts and execution time Cs = TsUs, the server behaves exactly as a periodic task with period Ts and execution time Cs, and the total utilization factor of the processor is Up + Us. Hence, if the whole task set is schedulable, from the EDF schedulability bound [9] it can be concluded that Up + Us ≤ 1. □
8.4
EARLIEST DEADLINE LATE SERVER
The Total Bandwidth Server is able to provide good aperiodic responsiveness with extreme simplicity. However, better performance can still be achieved through more complex algorithms. For example, looking at the schedule in Figure 8.5, one can argue that the second and third aperiodic requests could be served as soon as they arrive, without compromising the schedulability of the system. This is possible because, when the requests arrive, the active periodic instances have enough slack time (laxity) to be safely preempted. Using the available slack of periodic tasks to advance the execution of aperiodic requests is the basic principle adopted by the EDL server [13, 15]. This aperiodic service algorithm can be viewed as a dynamic version of the Slack Stealing algorithm [8]. The definition of the EDL server makes use of some results presented by Chetto and Chetto in [2]. In that work, two complementary versions of EDF, namely EDS and EDL, are proposed: under EDS the active tasks are processed as soon as possible, whereas under EDL they are processed as late as possible. An important property of EDL is that in any interval [0, t] it guarantees the
Figure 8.6  Availability function under EDL.
maximum available idle time. In the original paper, this result is used to build an acceptance test for aperiodic tasks with hard deadlines, while here it is used to build an optimal server mechanism for soft aperiodic activities. To simplify the description of the EDL server, w_J^A(t) denotes the following availability function, defined for a scheduling algorithm A and a task set J:

$$w_J^A(t) = \begin{cases} 1 & \text{if the processor is idle at } t \\ 0 & \text{otherwise.} \end{cases}$$

The integral of w_J^A(t) on an interval of time [t1, t2) is denoted by Ω_J^A(t1, t2) and gives the total idle time in the specified interval. The function w_J^{EDL} for the task set of Figure 8.5 is depicted in Figure 8.6.
The result of optimality addressed above is stated in Theorem 2 of [2], which is recalled here.
Theorem 5 (Chetto and Chetto) Let J be any aperiodic task set and A any preemptive scheduling algorithm. For any instant t,

$$\Omega_J^{EDL}(0, t) \geq \Omega_J^A(0, t).$$

This result allows the development of an optimal server using the idle times of an EDL scheduler. In particular, given a periodic task set J, the function w_J^{EDL}, which is periodic with hyperperiod H = lcm(T1, ..., Tn), can be represented by means of two vectors. The first, E = (e0, e1, ..., ep), represents the times at which idle times occur, while the second, D = (Δ0, Δ1, ..., Δp), represents
184
CHAPTER
o ei
<3.;
Table 8.1
0 3
1 8 1
2 12 1
8
3 18 1
Idle times under EDL.
the lengths of these idle times. The two vectors for the example of Figure 8.6 are shown in Table 8.1. Note that idle times only occur after the release of a periodic task instance. The basic idea behind the EDL server is to use the idle times of an EDL schedule to execute aperiodic requests as soon as possible. When there are no aperiodic activities in the system, periodic tasks are scheduled according to the EDF algorithm. Whenever a new aperiodic request enters the system (and no previous aperiodic task is still active), the idle times of an EDL scheduler applied to the current periodic task set are computed, and then are used to schedule the pending aperiodic requests. Figure 8.7 shows an example of the EDL service mechanism. Here, an aperiodic request with an execution time of 4 units arrives at time t =: 8. The idle times of an EDL schedule are recomputed using the current periodic tasks, as shown in Figure 8.7a. The request is then scheduled according to the computed idle times (Figure 8.7b). Notice that the server automatically allocates a bandwidth 1- Up to aperiodic requests. The response times achieved by this method are optimal, so they cannot be reduced further. The procedure to compute the idle times of the EDL schedule is described in [2] and is not reported here. However, it is interesting to observe that not all the idle times have to be recomputed, but only those preceding the deadline of the current active periodic task with the longest deadline. The worst case complexity of the algorithm is O(Nn), where n is the number of periodic tasks and N is the number of distinct periodic requests that occur in the hyperperiod. In the worst case, N can be very large and, hence, the algorithm may be of little practical interest. As for the "Slack Stealer", the EDL server will be used to provide a lower bound to the aperiodic response times, and to build a nearly optimal (implementable) algorithm, as described in the next section.
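As an illustration, the vectors E and D of Table 8.1 can be obtained by a time-reversal construction: every job (release r, deadline d) of the hyperperiod is mirrored to (H − d, H − r), ordinary as-soon-as-possible EDF is run on the mirrored set, and the resulting idle slots are mirrored back. This is a sketch under that assumption, not the procedure of [2]; names are illustrative and the unit-time simulation assumes integer task parameters.

```python
from math import gcd

def edl_idle_times(tasks):
    """tasks: list of (C, T) pairs. Returns the idle intervals [(e_i, delta_i)]
    of the as-late-as-possible (EDL) schedule over one hyperperiod, computed
    by mirroring the job set and running plain ASAP EDF in unit time steps."""
    H = 1
    for _, T in tasks:
        H = H * T // gcd(H, T)            # hyperperiod H = lcm of the periods
    # Mirrored jobs: job (r, r + T) becomes (H - (r + T), H - r).
    jobs = []
    for C, T in tasks:
        for r in range(0, H, T):
            jobs.append({"r": H - (r + T), "d": H - r, "rem": C})
    idle = []                              # idle unit slots of the mirrored schedule
    for t in range(H):
        ready = [j for j in jobs if j["r"] <= t and j["rem"] > 0]
        if ready:
            min(ready, key=lambda j: j["d"])["rem"] -= 1   # EDF: earliest deadline
        else:
            idle.append(t)
    # Mirror idle slots back (slot t maps to slot H - 1 - t) and coalesce.
    slots = sorted(H - 1 - t for t in idle)
    intervals = []
    for s in slots:
        if intervals and intervals[-1][0] + intervals[-1][1] == s:
            intervals[-1][1] += 1
        else:
            intervals.append([s, 1])
    return [tuple(iv) for iv in intervals]

print(edl_idle_times([(3, 6), (2, 8)]))   # [(0, 3), (8, 1), (12, 1), (18, 1)]
```

For the task set of Figure 8.5 (C1 = 3, T1 = 6; C2 = 2, T2 = 8) this reproduces exactly the values of Table 8.1: idle times of length 3, 1, 1, 1 starting at 0, 8, 12 and 18.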
Figure 8.7  (a) Idle times available at time t = 8 under EDL. (b) Schedule of the aperiodic request with the EDL server.
8.4.1
EDL Server Properties
The schedulability analysis of the EDL server is quite straightforward. In fact, all aperiodic activities are executed using the idle times of a particular EDF schedule, thus the feasibility of the periodic task set cannot be compromised. This is stated in the following Theorem.
Theorem 6 Given a set of n periodic tasks with processor utilization Up and the corresponding EDL server (whose behavior strictly depends on the characteristics of the periodic task set), the whole set is schedulable if and only if

Up ≤ 1.

Proof. "If". Since the condition (Up ≤ 1) is sufficient for guaranteeing the schedulability of a periodic task set under EDF, it is also sufficient under EDL, which is a particular implementation of EDF. The algorithm schedules the periodic tasks according to one or the other implementation, depending on the particular sequence of aperiodic requests. When aperiodic requests are pending, they are scheduled during precomputed idle times of the periodic tasks. In both cases the timeliness of the periodic task set is unaffected and no deadline is missed.

"Only If". If a periodic task set is schedulable with an EDL server, it is also schedulable without the EDL server, and hence (Up ≤ 1). □

In the following it is shown that the EDL server is optimal, that is, the response times of the aperiodic requests under the EDL algorithm are the best achievable.
Lemma 8.3 Let A be any on-line preemptive algorithm, Γ a periodic task set, and Ji an aperiodic request. If f_{Γ∪{Ji}}^A(Ji) is the finishing time of Ji when Γ ∪ {Ji} is scheduled by A, then

$$f_{\Gamma\cup\{J_i\}}^{EDL\,server}(J_i) \leq f_{\Gamma\cup\{J_i\}}^{A}(J_i).$$

Proof. Suppose Ji arrives at time t, and let Γ(t) be the set of the current active periodic instances (ready but not yet completed) and the future periodic instances. The new task Ji is scheduled together with the tasks in Γ(t). In particular, consider the schedule σ of Γ ∪ {Ji} under A. Let A' be another algorithm that schedules the tasks in Γ(t) at the same time as in σ, and σ'
be the corresponding schedule. Ji is executed during some idle periods of σ'. Applying Theorem 5 with the origin of the time axis translated to t (this can be done since A is on-line), it can be written that, for each t' ≥ t,

$$\Omega_{\Gamma(t)}^{EDL}(t, t') \geq \Omega_{\Gamma(t)}^{A'}(t, t').$$

Recall now that, when there are aperiodic requests, the EDL server allocates the executions exactly during the idle times of EDL. Since

$$\Omega_{\Gamma(t)}^{EDL}\!\left(t, f_{\Gamma\cup\{J_i\}}^{EDL\,server}(J_i)\right) \geq \Omega_{\Gamma(t)}^{A'}\!\left(t, f_{\Gamma\cup\{J_i\}}^{EDL\,server}(J_i)\right),$$

it follows that

$$f_{\Gamma\cup\{J_i\}}^{EDL\,server}(J_i) \leq f_{\Gamma\cup\{J_i\}}^{A}(J_i).$$

That is, under the EDL server, Ji is never completed later than under the A algorithm. □
8.5
IMPROVED PRIORITY EXCHANGE SERVER
Although optimal, the EDL server has too much overhead to be considered practical. However, its main idea can be usefully adopted to develop a less complex algorithm which still maintains a nearly optimal behavior. The expensive computation of the idle times can be avoided by using the mechanism of priority exchange. With this mechanism, in fact, the system can easily keep track of the time advanced to periodic tasks and possibly reclaim it at the right priority level. The idle times of the EDL algorithm can be precomputed off-line and the server can use them to schedule aperiodic requests, when there are any, or to advance the execution of periodic tasks. In the latter case, the idle time advanced can be saved as aperiodic capacity at the priority levels of the periodic tasks executed. The idea described above is used by the algorithm called Improved Priority Exchange (IPE) [13, 15]. In particular, the DPE server is modified using the idle times of an EDL scheduler. There are two main advantages to this approach. First, a far more efficient replenishment policy is achieved for the server. Second, the resulting server is no longer periodic and it can always run at the highest priority in the system. The IPE server is thus defined in the following way:
Figure 8.8  Improved Priority Exchange server example.
• the IPE server has an aperiodic capacity, initially set to 0;

• at each instant t = ei + kH, with 0 ≤ i ≤ p and k ≥ 0, a replenishment of Δi units of time is scheduled for the server capacity; that is, at time t = e0 the server will receive Δ0 units of time (the two vectors E and D have been defined in the previous section);

• the server priority is always the highest in the system, regardless of any other deadline;

• all other rules (aperiodic request and periodic instance executions, exchange and consumption of capacities) are the same as for a DPE server.
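The replenishment rule t = ei + kH can be made concrete with a small sketch. The function name is illustrative; the vectors are the E and D of Table 8.1, with hyperperiod H = 24.

```python
def ipe_replenishments(E, D, H, horizon):
    """List the IPE server replenishments (time, amount) at the instants
    t = e_i + k*H, for k = 0, 1, ..., up to the given horizon."""
    out = []
    k = 0
    while k * H < horizon:
        for e_i, delta_i in zip(E, D):
            t = e_i + k * H
            if t < horizon:
                out.append((t, delta_i))
        k += 1
    return out

# Vectors from Table 8.1 (hyperperiod H = 24), first 30 time units:
print(ipe_replenishments([0, 8, 12, 18], [3, 1, 1, 1], 24, 30))
# [(0, 3), (8, 1), (12, 1), (18, 1), (24, 3)]
```

The point of the rule is visible in the output: unlike DPE or DSS, the replenishments are not periodic within the hyperperiod, yet they repeat exactly every H units.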
The same task set of Figure 8.7 is scheduled with an IPE server in Figure 8.8. Note that the server replenishments are set according to the function w_J^{EDL}, illustrated in Figure 8.6. When the aperiodic request arrives at time t = 8, one unit of time is immediately allocated to it by the server. Two further units are available at the priority level corresponding to the deadline 12, due to previous deadline exchanges, and are allocated right after the first one. The last unit is allocated later, at time t = 12, when the server receives a further unit of time. Even in this situation the response time remains optimal. As is shown later,
there are only rare situations in which the optimal EDL server performs slightly better than IPE. That is, IPE almost always exhibits a nearly optimal behavior.
8.5.1
Schedulability analysis
In order to analyze the schedulability of an IPE server, it is useful to define a transformation among schedules similar to that defined for the DPE server. In particular, given a schedule σ produced by the IPE algorithm, the schedule σ' is built in the following way:

• each execution of periodic instances during deadline exchanges (i.e., increases in the corresponding aperiodic capacity) is postponed until the capacity decreases;

• all other executions of periodic instances are left as in σ.
In this case, the server is not substituted with another task. Again σ' is well defined and is invariant, that is, it does not depend on σ, but only on the periodic task set. Moreover, σ' is the schedule produced by EDL applied to the periodic task set (compare Figure 8.6 with Figure 8.8). The optimal schedulability is stated by the following Theorem.

Theorem 7 Given a set of n periodic tasks with processor utilization Up and the corresponding IPE server (the parameters of the server depend on the periodic task set), the whole set is schedulable if and only if

Up ≤ 1

(the server automatically allocates the bandwidth 1 − Up to aperiodic requests).
Proof. "If". The condition is sufficient for the schedulability of the periodic task set under EDF, thus also under EDL, which is a particular implementation of EDF. Now, observe that in each schedule produced by the IPE algorithm the completion times of the periodic instances are never greater than the completion times of the corresponding instances in σ', which is the schedule of the periodic task set under EDL. That is, no periodic instance can miss its deadline. The thesis follows.

"Only If". Trivial, since the condition is necessary even for the periodic task set alone. □
8.5.2
Remarks
The reclaiming of unused periodic execution time can be done in the same way as for the DPE server. When a periodic task completes, its spare time is added to the corresponding aperiodic capacity. Again, this behavior does not affect the schedulability of the system; the reason is the same as for the DPE server. To implement the IPE server, the two vectors E and D must be precomputed before the system is run. The replenishments of the server capacity are no longer periodic, but this does not change the complexity, which is comparable with that of DPE. What can change dramatically is the memory requirement. In fact, if the periods of the periodic tasks are not harmonically related, a huge hyperperiod H = lcm(T1, ..., Tn) can be obtained, which would require a large memory space to store the two vectors E and D.
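The hyperperiod blow-up can be checked directly. The sketch below (illustrative name, example periods chosen only for contrast) computes H = lcm(T1, ..., Tn): harmonically related periods give H equal to the largest period, while nearly co-prime periods make H explode.

```python
from math import gcd
from functools import reduce

def hyperperiod(periods):
    """H = lcm(T1, ..., Tn): the interval over which E and D must be stored."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

print(hyperperiod([10, 20, 40]))   # harmonic periods: H = 40
print(hyperperiod([10, 21, 41]))   # non-harmonic periods: H = 8610
```

Since the number of idle intervals stored in E and D grows with H, the second task set needs two orders of magnitude more table space than the first, even though both have only three tasks.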
8.6
PERFORMANCE RESULTS
The algorithms described in this chapter have been simulated on a synthetic workload in order to compare the average response times achieved on soft aperiodic activities. For completeness, a dynamic version of the Polling Server has also been compared with the other algorithms. The plots shown in Figure 8.9 have been obtained with a set of ten periodic tasks with periods ranging from 100 to 1000 units of time and utilization factor Up = 0.65. The aperiodic load was varied across the range of processor utilization unused by the periodic tasks, in particular from 3% to 33%. The interarrival times for the aperiodic tasks were modeled using a Poisson arrival pattern with average Ta, and the aperiodic computation times were modeled using an exponential distribution. The processor utilization of the servers was set to all the utilization left by the periodic tasks, that is, Us = 1 − Up. The period of the periodic servers, namely Polling, DPE and DSS, was set equal to the average aperiodic interarrival time (Ta) and, consequently, the capacity was set to Cs = TaUs. Unless otherwise stated, the data plotted for each algorithm represent the ratio of the average aperiodic response time relative to the response time of background aperiodic service. The average is computed over ten simulations, in
which a total of one hundred thousand aperiodic requests were generated. In this way, an average response time equivalent to background service has a value of 1.0; hence, a value less than 1.0 corresponds to an improvement in the average aperiodic response time over background service. The lower the response time curve lies on these graphs, the better the algorithm is at improving aperiodic responsiveness. Note that the EDL server is not reported in the graph since it has basically the same behavior as IPE for almost any load condition. In particular, simulations showed that for small and medium periodic loads the two algorithms do not have significant differences in their performance. Even for a high periodic load, the difference is so small that it can reasonably be considered negligible for any practical application. Although IPE and EDL have very similar performance, they differ significantly in their implementation complexity. As mentioned in previous sections, the EDL algorithm needs to recompute the server parameters quite frequently (namely, when an aperiodic request enters the system and all previous aperiodic requests have been completely serviced). This overhead can be too expensive in terms of CPU time to use the algorithm in practical applications. On the other hand, in the IPE algorithm the parameters of the server can be computed off-line and used at run-time to replenish the server capacity. In Figure 8.9, the performance of all algorithms is shown as a function of the aperiodic load. The load was varied by changing the average aperiodic service time, while the average interarrival time was set at the value Ta = 100. In the graph, the average aperiodic response time of each algorithm is plotted with respect to that of background service as a function of the mean aperiodic load Uape = Ca/Ta, where Ca denotes the average aperiodic service time.
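A synthetic workload of this kind can be reproduced with a standard sketch: exponential interarrival times (Poisson arrivals) with mean Ta, and exponentially distributed service times with mean Ca = Uape·Ta. The function and parameter names are illustrative, not taken from the original experiments.

```python
import random

def aperiodic_workload(Ta, Uape, n, seed=1):
    """Generate n aperiodic requests as (arrival_time, service_time) pairs:
    Poisson arrivals with mean interarrival Ta, exponential service times
    with mean Ca = Uape * Ta so that the offered load is Uape."""
    rng = random.Random(seed)
    Ca = Uape * Ta
    t, reqs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / Ta)                 # interarrival ~ Exp(mean Ta)
        reqs.append((t, rng.expovariate(1.0 / Ca)))    # service ~ Exp(mean Ca)
    return reqs

reqs = aperiodic_workload(Ta=100, Uape=0.25, n=100_000)
mean_service = sum(c for _, c in reqs) / len(reqs)
print(round(mean_service))   # close to Ca = 25
```

With one hundred thousand requests, as in the experiments, the sample mean of the service times settles very close to its target, so the measured aperiodic load matches the nominal Uape.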
As can be seen from the graph, the TBS and IPE algorithms can provide a significant reduction in average aperiodic response time compared to background or polling aperiodic service, whereas the performance of the DPE and DSS algorithms depends on the aperiodic load. For low aperiodic load, DPE and DSS perform as well as TBS and IPE, but as the aperiodic load increases their performance tends to become similar to that shown by the Polling server. Note that TBS and IPE have about the same responsiveness when the aperiodic load is low, and they exhibit a slightly different behavior for heavy aperiodic loads.
Figure 8.9  Performance of dynamic server algorithms (periodic load = 65%, mean aperiodic interarrival time = 100; curves for Polling, DSS, DPE, TBS, and IPE).
All algorithms perform much better when the aperiodic load is generated by a large number of small tasks rather than a small number of long activities. Moreover, note that, as the interarrival time Ta increases and the tasks' execution times become longer, the IPE algorithm shows its superiority with respect to the others, which tend to have about the same performance.
These algorithms have been compared with different periodic loads Up as well. For very low periodic loads all aperiodic service algorithms show a behavior similar to background service. As the periodic load increases, their performance improves substantially with respect to background service. In particular, DPE and DSS have a comparable performance, which tends to approach that of the Polling server for high periodic loads. On the other hand, TBS and IPE outperform all other algorithms in all situations. The improvement is particularly significant with medium and high workloads. With a very high workload, TBS is not able to achieve the same high level of performance as IPE, even though it is much better than the other algorithms. More extensive simulations are reported in [13, 15].
8.7
SUMMARY
A set of algorithms that provide good response times to aperiodic requests in the presence of hard real-time tasks has been presented. The algorithms differ in their performance and implementation complexity. The experimental simulations have established that, from a performance point of view, IPE and EDL show the best results. Although optimal, EDL is far from being practical, due to its overall complexity. On the other hand, IPE is able to achieve comparable performance with much less computational overhead. Both algorithms may have significant memory demands when the periods of the periodic tasks are not harmonically related. The Total Bandwidth algorithm also demonstrates good performance, sometimes comparable to that of the nearly optimal IPE. Since its implementation complexity is among the simplest, it is a good candidate for practical systems. The DPE and DSS algorithms, even though a bit more complex, show slightly worse performance, although they both provide better responsiveness than the Polling server and naive background service.
REFERENCES
[1] T.P. Baker, "Stack-Based Scheduling of Real-Time Processes," The Journal of Real-Time Systems 3(1), 1991, pp. 67-100.
[2] H. Chetto, and M. Chetto, "Some Results of the Earliest Deadline Scheduling Algorithm," IEEE Trans. on Software Engineering, 15(10), 1989, pp. 1261-1269.
[3] H. Chetto, M. Silly, and T. Bouchentouf, "Dynamic Scheduling of RealTime Tasks under Precedence Constraints," The Journal of Real-Time Systems 2, 1990, pp. 181-194.
[4] R. I. Davis, K. Tindell, and A. Burns, "Scheduling Slack Time in Fixed Priority Preemptive Systems," Proc. of Real-Time Systems Symposium, 1993, pp. 222-231.
[5] T. M. Ghazalie, and T. P. Baker, "Aperiodic Servers in a Deadline Scheduling Environment," Real-Time Systems, 9, 1995.
[6] J. P. Lehoczky, L. Sha, and J. K. Strosnider, "Enhanced Aperiodic Responsiveness in Hard Real-Time Environments," Proc. of Real-Time Systems Symposium, 1987, pp. 261-270.
[7] J. P. Lehoczky, L. Sha, and Y. Ding, "The Rate Monotonic Scheduling Algorithm: Exact Characterization and Average Case Behavior," Proc. of Real-Time Systems Symposium, 1989, pp. 166-171.
[8] J. P. Lehoczky, and S. Ramos-Thuel, "An Optimal Algorithm for Scheduling Soft-Aperiodic Tasks in Fixed-Priority Preemptive Systems," Proc. of Real-Time Systems Symposium, 1992, pp. 110-123. [9] C. L. Liu, and J. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM 20(1), 1973, pp. 40-61. [10] A. K. Mok, Fundamental Design Problems of Distributed Systems for the Hard-Real-Time Environment, Ph.D. Dissertation, MIT, 1983.
198
CHAPTER
9
• fault tolerance requirements,

• tight and loose deadlines,

• normal and overload conditions, and

• tasks with different QoS requirements (ranging from deterministic to probabilistic guarantees).

The solution must be integrated enough to handle the interfaces between:

• CPU scheduling and resource allocation,

• I/O scheduling and CPU scheduling,

• CPU scheduling and real-time communication scheduling,

• local and distributed scheduling, and

• static scheduling of critical tasks and dynamic scheduling of essential and non-essential tasks.
Given the above scope and complexity, it is clear that comprehensive results will not exist for many years. In the next two Chapters one very important aspect (results for distributed real-time systems) that pervades a number of the issues in the above lists is presented. The presentation is confined to solutions that involve EDF.
9.1
DISTRIBUTED SYSTEMS - AN OVERVIEW
In managing the tasks and resources in a distributed real-time system, there are three phases to consider:

• allocation - the assignment of the tasks and resources to the appropriate nodes or processors in the system,

• scheduling - ordering the execution of tasks and network communication such that timing constraints are met and the consistency of resources is maintained, and
Distributed Scheduling - Part I
• dispatching - executing the tasks in conformance with the scheduler's decisions.
Each of the above phases cannot be dealt with in total isolation, since the mechanism used in one phase may greatly affect the performance of others. For example, the CPU scheduling algorithm used influences the design of the allocation scheme, how the allocation is done affects the run-time performance of the scheduling algorithm and network communication requirement, and whether and how time and resources allocated to a task are reclaimed (when it finishes early) affects the correctness and performance of the scheduling algorithm. It is also necessary to distinguish local from distributed scheduling, and the scheduling algorithm itself from the application tasks being scheduled. For example, in the first two algorithms to be presented, each node of the distributed system runs a local scheduling algorithm, but the tasks themselves are distributed and communicate with each other. In this case the network communication must also be scheduled. In the third algorithm presented, the scheduling algorithm itself is distributed and the application tasks are independent (they do not communicate with each other). The network must be accounted for in the distributed scheduling process itself, but not between application tasks.
The performance of distributed scheduling is highly interrelated with what modules (functions) exist and where they are allocated. If a designer is not able to develop a distributed scheduling solution that meets task deadlines, then the options are to create different functions, perform a different allocation, add more hardware resources, or look for a better algorithm; or all of the above. Most of the results for distributed scheduling assume that modules have been designed, that some initial hardware configuration has been chosen, and that the network communication delays are bounded and known. These same assumptions are made for the work presented in Chapters 9 and 10. Three types of distributed scheduling solutions are presented (the first in this Chapter and the next two in Chapter 10):

• holistic scheduling based on EDF. This solution illustrates how to analyze the interaction between the CPUs and the network, and considers jitter requirements and communication between tasks. This work is an extension of the holistic approach [39] that was based on fixed priority analysis. It is applicable to static allocation of tasks which are dynamically activated. This solution requires task set characteristics to be fairly simple. It supports the scheduling of application-level tasks which communicate over the network. The resulting algorithm handles preemptable tasks, periodic tasks, non-periodics with known interarrival times, end-to-end timing constraints, communication requirements, and resource requirements. These results, therefore, should prove valuable in many situations. On the other hand, the solution does not address non-preemptable tasks, tasks with multiple levels of importance, general precedence constraints, placement constraints, fault tolerance requirements, tight and loose deadlines, overload, or supporting different QoS requirements.

•
the Spring complex task set allocation and scheduling algorithm.¹ When task set characteristics are sophisticated, the holistic scheduling approach may not apply. The Spring complex task set allocation and scheduling algorithm may then be appropriate. This algorithm can handle non-preemptable tasks, periodic tasks, tasks with multiple levels of importance, groups of tasks with a single deadline, end-to-end timing constraints, precedence constraints, communication requirements, resource requirements, placement constraints, and fault tolerance requirements. This algorithm is used for static real-time systems. In dealing with complicated situations it is necessary to use more than EDF. Since this solution is for static systems, complete a priori guarantees are possible. The resulting solution supports application tasks which communicate over a network, with redundancy in the tasks and communication. Extensions to handle more dynamic distributed systems have also been developed, but are not presented in this book.
• focussed addressing and bidding. When dynamic distributed real-time systems have soft real-time constraints and relatively long deadlines, the previous static solutions are not appropriate. One possible solution for distributed real-time scheduling is based on focussed addressing and bidding [33]. This solution builds upon local guarantees as presented in the previous Chapters. Delays in the network are accounted for in the scheduling process itself, but the application tasks are all independent and
1 Note that the Spring complex task set allocation and scheduling algorithm for distributed systems presented in this Chapter is different than the Spring scheduling algorithm presented in earlier Chapters.
do not communicate with each other. Future work is necessary to extend these results to more sophisticated task set characteristics. In particular, the resulting algorithm handles only non-preemptable tasks, non-periodics, resource requirements, and overload. The solution does not address preemptable tasks, periodic tasks, tasks with multiple levels of importance, groups of tasks with a single deadline, precedence constraints, communication requirements, placement constraints, fault tolerance requirements, tight and loose deadlines, or supporting different QoS requirements.
9.2 HOLISTIC SCHEDULING BASED ON EDF
The goal in distributed real-time systems design is to analyze the system for feasibility in meeting deadlines. In such systems, among the most challenging requirements are the so-called end-to-end timing constraints. There is an end-to-end timing constraint for each pair of communicating tasks²: it is intended as the maximum time available for producing a message at the sender end, transmitting the message (which may be composed of multiple packets) over the network, and processing it at the receiving end. The time needed for an end-to-end computation is composed of five major components, depicted in Figure 9.1:

• Generation delay - the time needed by the sender task to generate and queue the message.

• Queueing delay - the time needed by the message to gain access to the network.

• Transmission delay - the time needed for the transmission of the message (a message may be sent in more than one packet).

• Delivery delay - the time needed by the destination processor to deliver the message to the destination task.

• Processing delay - the time needed by the destination task to process the message, that is, to complete its execution.
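The five components above simply add up to the end-to-end bound. As a minimal illustration (the class and the numeric bounds are invented for this example, not taken from the analysis):

```python
from dataclasses import dataclass

@dataclass
class EndToEndDelay:
    """Worst case bounds for the five delay components (the class and
    all values, in ms, are illustrative)."""
    generation: float     # sender task generates and queues the message
    queueing: float       # message gains access to the network
    transmission: float   # all packets of the message are transmitted
    delivery: float       # destination processor delivers the message
    processing: float     # destination task completes its execution

    def total(self) -> float:
        # The end-to-end bound is the sum of the five components.
        return (self.generation + self.queueing + self.transmission
                + self.delivery + self.processing)

d = EndToEndDelay(generation=3.0, queueing=2.5, transmission=1.5,
                  delivery=0.5, processing=4.0)
assert d.total() == 11.5
```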
²More generally, there can exist a timing constraint for a whole chain of communicating tasks.
Figure 9.1  End-to-end computation delay components.
Tindell and Clark [39] have described a very interesting approach for the analysis of end-to-end computations in real-time distributed systems where host processors schedule tasks and messages according to fixed priorities (e.g., those chosen by the rate monotonic policy). They have termed this approach holistic, since it addresses the problem of analyzing the system as a whole. In this section the holistic approach is followed, but it is based on the EDF algorithm at each host and for message queues, rather than the rate monotonic algorithm. The holistic approach elegantly overcomes the difficulty of a global distributed analysis by means of a very simple concept: attribute inheritance. The message sent by a task inherits two of the task's temporal attributes, namely the period and the release jitter. If each instance of the task communicates, the message inherits a period equal to that of the task. Furthermore, if the message can be queued at any time by the sender task, the difference between its earliest and latest releases is bounded by the sender's worst case response time. This is the release jitter inherited by the message. Similarly, the destination task inherits its period and release jitter from the message it receives. Based on this notion of attribute inheritance, the overall analysis is decoupled by assuming initial conditions without jitter, analyzing worst case response times of tasks and messages for each host processor and network separately (based on the initial conditions), computing the jitter that these worst case times cause, and then iterating until all the jitters converge. The iteration is required because the analyses of the different subsystems are mutually dependent: the release jitter of a message depends on the response time of its sender, and the release jitter of the destination task depends on the response time of
the message. It is also important to keep in mind that each host has tasks which send messages and other tasks that receive messages. Consequently, all hosts are, in general, mutually dependent on each other. Fortunately, this mutual dependency is expressed in terms of non-decreasing functions. This permits an analysis based on successive recursive steps as described in the formulas below. Note that in the recursive formulas below, m refers to the stage of the recursion, n refers to the n hosts, and RT_edf and RT_net are sets of response times for all the tasks on each node and on the network, using the EDF and network scheduling policies, respectively. The J_i's are worst case release jitters per site, the P's are the network propagation delays and the RT's are the response times. Initially, J_1^{(0)}, ..., J_n^{(0)}, J_net^{(0)} are assumed zero for the first stage of the recursion. At this point new response times are computed for the m + 1 stage of the recursion. The recursive equations on the right are then used to compute the new jitters for the next stage of the computation.
    RT_1^{(m+1)} = RT_{edf_1}(J_1^{(m)})            J_1^{(m+1)} = P_1(RT_{net}^{(m+1)})
        ...                                             ...
    RT_n^{(m+1)} = RT_{edf_n}(J_n^{(m)})            J_n^{(m+1)} = P_n(RT_{net}^{(m+1)})
    RT_{net}^{(m+1)} = RT_{net}(J_{net}^{(m)})      J_{net}^{(m+1)} = P_{net}(RT_1^{(m+1)}, ..., RT_n^{(m+1)})
In summary, these formulas demonstrate that at each step, the worst case response times of tasks and messages are computed for the n host processors and for the network, assuming the release jitters and the other known attributes computed in the previous step. The values of the release jitters are then updated before the next step. The computation is halted either when the equations stabilize, that is, when the values of worst case response times and release jitters no longer change, or when at least one subsystem is found unschedulable.
In the end, the response times computed can be compared with the original timing constraints, in order to establish the feasibility of the whole system. The algorithm and analysis are now presented in detail.
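The overall iteration can be sketched as follows. This is only a skeleton under simplifying assumptions: the four families of functions (the per-host EDF analyses, the network analysis, and the two attribute-inheritance mappings) are caller-supplied placeholders for the real analyses, and convergence is only guaranteed when they are non-decreasing, as discussed above.

```python
def holistic_iteration(rt_edf, rt_net, p_host, p_net, n, max_rounds=1000):
    """Iterate the holistic analysis until the release jitters converge.

    rt_edf[i](j) - host i response-time analysis, given release jitter j
    rt_net(j)    - network response-time analysis, given message jitter j
    p_host[i](r) - jitter inherited by host i's destination tasks
    p_net(rts)   - jitter inherited by messages from sender response times
    All four are caller-supplied placeholders for the real analyses.
    """
    j_hosts, j_net = [0] * n, 0            # J^(0) = 0 everywhere
    for _ in range(max_rounds):
        rt_hosts = [rt_edf[i](j_hosts[i]) for i in range(n)]
        r_net = rt_net(j_net)
        new_hosts = [p_host[i](r_net) for i in range(n)]
        new_net = p_net(rt_hosts)
        if new_hosts == j_hosts and new_net == j_net:
            return rt_hosts, r_net         # equations have stabilized
        j_hosts, j_net = new_hosts, new_net
    raise RuntimeError("no convergence: some subsystem may be unschedulable")

# Toy monotone placeholders for a single host.
rts, r_net = holistic_iteration(
    rt_edf=[lambda j: 2 + j], rt_net=lambda j: 1 + j,
    p_host=[lambda r: min(r, 5)], p_net=lambda r: min(r[0], 4), n=1)
assert rts == [7] and r_net == 5
```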
Figure 9.2  Assumed network topology.

9.2.1 The System Model
The system considered is composed of several host processors (nodes), connected by a physical or logical ring (in the latter case the actual network may be a shared broadcast bus), as shown in Figure 9.2. In each station there is a set of statically allocated tasks, which possibly communicate over the network with other tasks on different processors. A task i consists of an infinite number of requests, or instances, whose arrival times are separated by a minimum time T_i, called the period (according to the conventional notation, this assumption is common to periodic and sporadic tasks). Task instances may arrive at any time. However, the arrival must be recognized by a run-time dispatcher, which then places the instance in the system ready queue. The instance is then said to be released. Note that the release of a task can be delayed by a distributed synchronization if the task is the destination of a message. The time between a task's arrival and its release is known as release jitter. Each task instance may execute for a bounded amount of computation C_i, called the worst case execution time. The computation should complete within a time D_i (relative deadline) after the arrival. The ready queue is ordered according to the actual task deadlines, earliest first; that is, an EDF [27] preemptive
dispatching is assumed. Tasks may locally share resources, by locking and unlocking semaphores according to a protocol like the Priority Ceiling [35, 7] or the Stack Resource Policy [4]. Communicating tasks can send messages at any time, that is, as soon as they start executing, or as late as when they complete. Each message m, sent by task i, may be sent at most once every n_m invocations of i, and has a unique destination. Each task may receive at most one message. When queued, messages may be broken down into a number of packets of fixed size (message m is assumed to be broken into C_m packets). The queue of the outgoing packets, locally shared between the host processor and the communications processor, is ordered by increasing deadlines. Access to the ring, or the bus, is arbitrated by using the Timed Token medium access control protocol [20]. The Timed Token protocol normally provides two classes of service: the synchronous class and the asynchronous class [21]. The former class is intended for messages with regular arrivals and delivery time constraints. The latter class is intended for messages with arbitrary arrival laws and without time constraints. All messages involved in end-to-end computations with strict deadlines are serviced by the synchronous class. Access control among the hosts is provided by a special bit pattern called a token that circulates around the ring. At network initialization time, all hosts negotiate a common value for the target token rotation time (TTRT). Each host p is then assigned a fraction H_p of TTRT, termed synchronous bandwidth, which is the maximum time the host can transmit synchronous messages upon reception of the token. After the transmission of synchronous packets, if any, the station can send asynchronous messages only if it has received the token earlier than TTRT units of time after the last token visit.
The duration of the possible asynchronous message transmission is limited to TTRT minus the time elapsed between the previous and the current visit. The token is immediately released after the transmission of the last packet. Target token rotation time and synchronous bandwidths are related by the following inequality:

    \sum_{p=1}^{n} H_p + \tau \le TTRT,
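This admission constraint is straightforward to check directly; the numeric values below are illustrative only:

```python
def bandwidth_allocation_valid(H, tau, ttrt):
    """Timed Token constraint: the synchronous bandwidths plus the
    protocol overhead tau must fit within one TTRT."""
    return sum(H) + tau <= ttrt

# Illustrative values: three hosts sharing a TTRT of 10 ms.
assert bandwidth_allocation_valid([2.0, 3.0, 2.5], tau=1.0, ttrt=10.0)
assert not bandwidth_allocation_valid([4.0, 4.0, 2.5], tau=1.0, ttrt=10.0)
```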
where τ is the sum of protocol overheads and ring latency, that is, the fraction of TTRT not available for message transmission. In this section the analysis of the Timed Token MAC protocol is presented in two different situations. In the first situation full utilization of synchronous and asynchronous class services is allowed at any node. In the second situation only the synchronous service is allowed, a situation that should give more responsiveness to the system. Due to the similarity with the holistic approach originally described by Tindell and Clark in [39] and Tindell et al. in [40], this section uses the same notation as the original work as far as possible. A glossary follows:

C_i  The worst case computation time of task i on each release.

D_i  The deadline of task i, measured relative to the arrival time of the task.

B_i  The worst case blocking time task i can experience due to the operation of the concurrency control protocol.

J_i  The worst case release jitter of task i (i.e., the worst case delay between the arrival and its release).

T_i  The period of task i.

rt_i  The worst case response time of task i, measured from the arrival time to the completion time.

C_m  The number of packets comprising message m.

D_m  The deadline of message m, measured relative to its queueing time.

J_m  The worst case release jitter of message m (i.e., the difference between its possible latest and earliest queueing times).

T_m  The period of message m.

r_m  The worst case communication delay of message m, measured from the time it is queued by its sending task, to the arrival at the destination processor.

p  The time to transmit a packet.

H_p  The synchronous bandwidth of node p, that is, the maximum time node p is allowed to transmit synchronous packets upon receipt of the token.
τ  The ring latency and the protocol overheads; it is the fraction of TTRT not available for message transmission.

TTRT  The target token rotation time. TTRT ≥ τ + Σ_p H_p. Note that in the restricted version of the Timed Token protocol equality can be assumed, and TTRT becomes the worst case token rotation time.

P  The network propagation delay.

The overall solution is to analyze each CPU using EDF, analyze the network, and then combine the results recursively to obtain the end-to-end responsiveness. Each of these parts is treated in the next three subsections, respectively.
9.2.2 Uniprocessor EDF Scheduling

For uniprocessor EDF scheduling there are standard procedures for deciding feasibility and for computing worst case response times. However, these have to be generalized because they now need to be used within a distributed setting. First, for the feasibility analysis of each CPU a generalization of the original result described by Liu and Layland [27] is necessary. This generalization is given in the following theorem.
Theorem 9.1 When the deadline driven scheduling algorithm is used to schedule a set of tasks on a processor, if there is an overflow for a certain arrival pattern, then there is an overflow without idle time prior to it in the pattern in which all task instances are released as soon as possible (according to their attributes).
According to the original theorem, only the schedule of the most demanding arrival pattern in the first busy period, i.e., in the first interval from time t = 0 up to the first processor idle time, need be studied. This concept of busy period, already known in the literature [26], is also very useful for the computation of the worst case response times, as will be seen shortly. The length L of the busy period can be computed by means of a simple iterative formula, where the superscript m refers to the stage of the computation:

    L^{(0)} = \sum_{i=1}^{n} C_i,
    L^{(m+1)} = W(L^{(m)}),        (9.1)
where W(t) is the cumulative workload at time t, i.e., the sum of the computation times of the task instances arrived before time t:

    W(t) = \sum_{i=1}^{n} \left\lceil \frac{t + J_i}{T_i} \right\rceil C_i.

The computation in Equation (9.1) is stopped when two consecutive values are found equal, that is, L^{(m+1)} = L^{(m)}. L is then set to L^{(m)}. It can be easily proven that the sequence L^{(m)} converges to L in a finite number of steps as long as the overall processor utilization of the task set is less than or equal to 1, that is, if

    \sum_{i=1}^{n} \frac{C_i}{T_i} \le 1.
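The busy period iteration of Equation (9.1) can be sketched directly; the task sets used in the checks below are invented examples:

```python
from math import ceil

def busy_period_length(C, T, J):
    """Length L of the synchronous-arrival busy period, by iterating
    L^(m+1) = W(L^(m)) of Equation (9.1) until two consecutive
    values coincide."""
    assert sum(c / t for c, t in zip(C, T)) <= 1, "utilization must not exceed 1"
    L = sum(C)  # L^(0): one instance of each task
    while True:
        W = sum(ceil((L + j) / t) * c for c, t, j in zip(C, T, J))
        if W == L:
            return L
        L = W

# Example task sets with no release jitter.
assert busy_period_length([1, 2], [4, 6], [0, 0]) == 3
assert busy_period_length([2, 3], [4, 6], [0, 0]) == 12
```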
The feasibility of the task set can then be established by checking whether there are missed deadlines in the busy period. The actual deadline d of a task's instance scheduled within L must be preceded by the execution of all task instances with deadline before or at d. That is, a sufficient condition for the feasibility of the task set is that for all actual deadlines d within L of a task's instance
    d \ge \sum_{i : D_i \le d + J_i} \left( 1 + \left\lfloor \frac{d + J_i - D_i}{T_i} \right\rfloor \right) C_i + B_{k(d)},        (9.2)

where³

    k(d) = \max \{ k : D_k - J_k \le d \}.
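A sketch of this demand test, under the simplifying assumption that the blocking terms B_{k(d)} are zero; the task sets below are invented examples:

```python
from math import floor, ceil

def edf_feasible(C, T, D, J, L):
    """Demand check of Equation (9.2) for every absolute deadline d
    within the busy period of length L (blocking terms B_k(d) omitted
    for simplicity)."""
    n = range(len(C))
    # Absolute deadlines within [0, L] when all instances are released
    # as soon as possible: d = k*T_i + D_i - J_i.
    deadlines = sorted({k * T[i] + D[i] - J[i]
                        for i in n for k in range(ceil(L / T[i]) + 1)
                        if k * T[i] + D[i] - J[i] <= L})
    for d in deadlines:
        demand = sum((1 + floor((d + J[i] - D[i]) / T[i])) * C[i]
                     for i in n if D[i] <= d + J[i])
        if demand > d:
            return False
    return True

# Invented task sets: the first is feasible, the second overloads at d = 6.
assert edf_feasible([1, 2], [4, 6], [3, 5], [0, 0], L=3)
assert not edf_feasible([2, 3], [4, 6], [2, 6], [0, 0], L=12)
```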
The worst case response time rt_i of a task i is the maximum time between an instance's arrival and its completion. As already stated, the computation of rt_i is a fundamental tool for the holistic analysis of a distributed system. Finding rt_i is not a trivial task when EDF scheduling is assumed. In fact, contrary to intuition, the worst case response time of a task is not always found in the first busy period. Under fixed priority scheduling, where the notion of critical instant has been known since the presentation of Liu and Layland's work [27], the worst case response time is indeed found in the first busy period; this is not true for EDF scheduling, however, as pointed out in [36]. The equivalent of the critical instant for a deadline scheduled task set is not necessarily the time when all tasks are released synchronously, and in general it is different for each task. The concept of busy period, however, still helps to solve the problem. The idea is that the completion time of a task's instance with deadline d must be the end of a busy period in which all executed instances have deadlines less than or equal to d. By examining all such periods, and by taking the maximum length, the worst case response time of a task can be found. The following lemma characterizes the interesting busy periods.

³Assume the tasks are ordered by non-decreasing values of D_i − J_i, that is, i < j ⇒ D_i − J_i ≤ D_j − J_j.

Figure 9.3  a) Busy period preceding an instance completion time. b) Arrival pattern possibly giving the worst case response time for task i.
Lemma 9.1 The worst case response time of a task i is found in a busy period in which all other tasks are released synchronously at the beginning of the period and then at their maximum rate (see Figure 9.3).
The lemma gives the algorithm for computing the worst case response time of a task i: compute the length of the busy periods of task instances with deadlines less than or equal to that of the i's instance considered, for a number of arrival patterns like the one shown in Figure 9.3b, where instance arrivals are
represented by upward arrows and deadlines by downward arrows. All tasks but i are released synchronously and at their maximum rate from time t = 0. The attention is on the i's instance released at time t = a, a ≥ 0, possibly preceded by other instances of task i (these instances may contribute to increase the busy period length). In particular, given a ≥ −J_i, consider an arrival pattern with all possible i's instances, so that there is one arrival at time t = a. In order to include the most possible instances, force the first one to experience a release jitter J_i and require its release time to be greater than or equal to 0 (its ideal arrival time may be before time 0). In this way, the first i's instance has release time

    s_i(a) = a + J_i - \left\lfloor \frac{a + J_i}{T_i} \right\rfloor T_i.

Since only those busy periods that include all i's instances are of interest, from the one released at time t = s_i(a) to the one with arrival time t = a, the overall workload of all such instances can immediately be taken into account in the computation of the busy period length; it is

    \left( 1 + \left\lfloor \frac{a + J_i}{T_i} \right\rfloor \right) C_i.
If all other tasks are initially released at time t = 0, then at time t, \lceil (t + J_j) / T_j \rceil instances of j will have been released, for each j ≠ i (recall that all tasks ideally experience their maximum jitter at their first arrival). However, at most

    1 + \left\lfloor \frac{a + D_i + J_j - D_j}{T_j} \right\rfloor

of them can have a deadline less than or equal⁴ to d = a + D_i. That is, the higher priority workload, relative to deadline d, which arrived up to time t is

    W_i(a, t) = \sum_{j \ne i,\; D_j \le a + D_i + J_j} \min \left\{ \left\lceil \frac{t + J_j}{T_j} \right\rceil,\; 1 + \left\lfloor \frac{a + D_i + J_j - D_j}{T_j} \right\rfloor \right\} C_j.
The length L_i(a) of the resulting busy period relative to the deadline d can then be computed with the following iterative formula:

    L_i^{(0)}(a) = \left( 1 + \left\lfloor \frac{a + J_i}{T_i} \right\rfloor \right) C_i,
    L_i^{(m+1)}(a) = W_i(a, L_i^{(m)}(a)) + \left( 1 + \left\lfloor \frac{a + J_i}{T_i} \right\rfloor \right) C_i + B_i.        (9.3)

⁴There is no particular assumption made for breaking deadline ties. Hence, in the worst case, instances sharing the same deadline should be considered as having higher priorities.
As for Equation (9.1), the convergence of Equation (9.3) in a finite number of steps is ensured by the condition

    \sum_{i=1}^{n} \frac{C_i}{T_i} \le 1.

Once L_i(a) is determined, the worst case response time relative to a is

    rt_i(a) = \max \{ J_i + C_i + B_i,\; L_i(a) - a \}.

Finally, according to Lemma 9.1, the worst case response time of task i is

    rt_i = \max_{a \ge -J_i} \{ rt_i(a) \}.        (9.4)

Fortunately, there is no need to evaluate rt_i(a) for all a ≥ −J_i in Equation (9.4). In fact, it is known that L, the length of the first busy period, is the maximum of all busy period lengths. Hence, the significant values of a are in the interval [−J_i, L − J_i − C_i − B_i]. Furthermore, it is not difficult to see that the local maxima of L_i(a) are found for those values of a such that in the arrival pattern there is at least one instance of a task different from i with deadline equal to d, or all tasks are synchronous, i.e., s_i(a) = 0. These considerations significantly speed up the evaluation of Equation (9.4).
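The whole procedure of this subsection can be sketched as follows, in the simplified case of zero release jitter and no blocking (J_i = B_i = 0); the example task set is invented:

```python
from math import floor, ceil

def edf_response_time(i, C, T, D):
    """Worst case response time of task i under EDF (Equation (9.4)),
    in the simplified case of zero release jitter and no blocking.
    Total utilization is assumed to be at most 1."""
    n = len(C)

    def workload(t):                        # W(t) of Equation (9.1), J = 0
        return sum(ceil(t / T[j]) * C[j] for j in range(n))

    L = sum(C)                              # longest busy period length
    while workload(L) != L:
        L = workload(L)

    def hp_workload(a, t):                  # W_i(a, t), J = 0
        d = a + D[i]
        return sum(min(ceil(t / T[j]), 1 + floor((d - D[j]) / T[j])) * C[j]
                   for j in range(n) if j != i and D[j] <= d)

    def rt_for(a):
        own = (1 + floor(a / T[i])) * C[i]  # i's instances up to t = a
        La = own                            # iterate Equation (9.3)
        while hp_workload(a, La) + own != La:
            La = hp_workload(a, La) + own
        return max(C[i], La - a)

    # Significant a: another task has a deadline at a + D_i, plus a = 0.
    cands = {0}
    for j in range(n):
        if j == i:
            continue
        for k in range(ceil(L / T[j]) + 1):
            a = k * T[j] + D[j] - D[i]
            if 0 <= a < L:
                cands.add(a)
    return max(rt_for(a) for a in cands)

# Invented task set with implicit deadlines: C = (1, 2), T = D = (4, 6).
assert edf_response_time(0, [1, 2], [4, 6], [4, 6]) == 1
assert edf_response_time(1, [1, 2], [4, 6], [4, 6]) == 3
```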
9.2.3 Communications Analysis

As mentioned above, attribute inheritance is a fundamental aspect of the holistic analysis. Accordingly, a message m inherits from its sender task s(m) a period and a release jitter: T_m = n_m T_{s(m)}, if m is sent once every n_m periods of s(m), and J_m = rt_{s(m)}, the worst case response time of s(m)⁵. This technique helps simplify the analysis, since it isolates the network subsystem, as far as the computation of message worst case communication delays is concerned. The communication delay of a message is the sum of its queueing and transmission delays. A procedure for this computation can be determined following an approach similar to that described in the previous section for the application tasks. Assume that messages are locally scheduled by the communication processors

⁵More generally, the release jitter of the message could be smaller if a minimum relative queueing time greater than zero, and a maximum relative queueing time less than rt_{s(m)}, is found. The release jitter of message m is the difference between the two values.
Figure 9.4  Local network scenario for the evaluation of message worst case communication delays.
according to the EDF algorithm. Thus the argument of Lemma 9.1 can be extended to communication scheduling, because messages are locally scheduled in the same way as tasks on a uniprocessor. The main differences are the network access, which is governed by the Timed Token protocol, and the fact that packets are non-preemptable. The network access can be taken into account by locally considering the time not available to the transmission of synchronous messages as higher priority interference, which in turn can be precisely bounded. The worst case communication delay of a message m is

    r_m = \max_{a \ge -J_m} \{ r_m(a) \},        (9.5)

the maximum among all scenarios like that shown in Figure 9.4, in which one occurrence of message m arrives at time t = a and all other messages are released, locally and remotely, synchronously and at their maximum rate. How to bound the values of a for which r_m(a) has to be evaluated in Equation (9.5) is discussed later in this section. Since the transmission of packets is non-preemptable, in order to effectively evaluate the communication delay of message m, r_m(a), the approach described in [40] is followed: the queueing and the transmission delays of m are separated by exactly bounding the communication delay of the first k − 1 packets and the transmission delay of the kth packet. The idea is to exactly bound the worst case time needed by the kth packet to gain access to the network and then to
include the time needed for its transmission to the destination end. In practice, the communication delay of the first k packets of m is thus

    r_{m,k}(a) = \max \{ J_m + B_m + kp + P,\; L_{m,k}(a) + p + P - a \},
where L_{m,k}(a) is the length of the higher priority network busy period which precedes the transmission of the kth packet of m, that is, the maximum time needed by the kth packet to reach the network, while p + P is the time for its transmission and for its propagation through the network to its destination. B_m is the blocking time of m: due to the non-preemptive character of packet transmission, m may be delayed by the transmission of a lower priority packet. Hence, B_m = p. The maximum among the two quantities is necessary because J_m + B_m + kp + P is an obvious lower bound to the value of r_{m,k}(a). The worst case communication delay of the whole message m can be easily computed by substituting k = C_m, that is, r_m(a) = r_{m,C_m}(a). The higher priority network busy period preceding the transmission of m's kth packet can be divided into three components:

• the local higher priority workload of messages which have deadlines before or at a + D_m;

• the previous packets of message m;

• the interference of other hosts and of the possible local asynchronous traffic on the synchronous network access.
The assumption is that the first two components do not depend on the particular version of the protocol, thus no distinction is made between the two versions for these components. The higher priority workload locally preceding the transmission of m's kth packet can be exactly taken into account by defining the function HW_m(a, t). In the interval [0, t] a message m′ is released 1 + \lfloor (t + J_{m′}) / T_{m′} \rfloor times, but at most

    1 + \left\lfloor \frac{a + D_m + J_{m′} - D_{m′}}{T_{m′}} \right\rfloor

occurrences have deadlines less than or equal to a + D_m, thus

    HW_m(a, t) = p \sum_{m′ \ne m,\; m′ \in Out(p),\; D_{m′} \le a + D_m + J_{m′}} \min \left\{ 1 + \left\lfloor \frac{t + J_{m′}}{T_{m′}} \right\rfloor,\; 1 + \left\lfloor \frac{a + D_m + J_{m′} - D_{m′}}{T_{m′}} \right\rfloor \right\} C_{m′},
where Out(p) is the set of outgoing messages of host processor p, with m ∈ Out(p). Note that in the equation that defines r_{m,k}(a), due to the non-preemptability of packet transmissions, the queueing time of the kth packet, L_{m,k}(a), is separated from its transmission delay to the destination end, p + P. However, this implies that the definition of HW_m(a, t) has to include higher priority messages possibly released at time t, since they precede the transmission of the packet of interest⁶. The second component of the higher priority network busy period can be easily taken into account. In the scenario, the occurrence of message m which arrived at time t = a is preceded by \lfloor (a + J_m) / T_m \rfloor other occurrences, hence the kth packet is preceded by

    \left\lfloor \frac{a + J_m}{T_m} \right\rfloor C_m + k - 1

other packets of the same message. Since only the busy periods whose lengths are greater than a are of interest, the transmission time of all these packets may be included as a whole in the definition of L_{m,k}(a). The third and final component is essentially due to token visits delayed as much as possible by the network accesses of other hosts. The maximum delay of any visit can be exactly computed by carefully analyzing the protocol. In the following subsections two different analyses, for the full and the restricted versions of the protocol, respectively, are presented.
9.2.4 Full Timed Token Protocol

When asynchronous service is allowed at any host, subsequent token visits at node p may be delayed by synchronous and asynchronous packets transmitted

⁶Note that in case of preemptive scheduling the transmission time of the kth packet is included in L_{m,k}(a), but the definition of HW_m(a, t) includes only those higher priority messages released before t.
by other hosts, as well as by local asynchronous packets. Zhang and Burns [41] have derived an exact upper bound on the time between any v consecutive token arrivals at host processor p:

    t_{l+v-1,p} - t_{l,p} \le (v - 1) \, TTRT + \sum_{q \ne p} H_q + \tau,

where t_{l,p} is the time the token makes its lth arrival at host p, and τ is the portion of TTRT unavailable for transmitting messages. Note that the equal sign holds in the worst case. The visits of the token at host p, useful for synchronous transmissions, are then delayed most when:

• the first visit occurs at time t = 0⁻, that is, just before any synchronous packet has been released, thus making this first visit useless, and

• the asynchronous capacity is fully utilized at any host, whenever available.

See Figure 9.5 for a graphical representation, where the visits useful for synchronous message transmissions at host p are t_{1,p}, t_{2,p}, .... Accordingly, if v is such that t_{v−1,p} + H_p ≤ t < t_{v,p} + H_p, the interference of the other processors and the local asynchronous traffic in the interval [0, t] can then be defined as

    I_p(t) = t - (v - 1) H_p,

where, in the worst case scenario,

    t_{v,p} = v \cdot TTRT + \sum_{q \ne p} H_q + \tau.
The length of the higher priority network busy period can be finally computed by solving the equation
    L_{m,k}(a) = HW_m(a, L_{m,k}(a)) + \left( \left\lfloor \frac{a + J_m}{T_m} \right\rfloor C_m + k - 1 \right) p + I_p(L_{m,k}(a)).        (9.6)

Since I_p(t) and HW_m(a, t) are both monotonically non-decreasing step functions, Equation (9.6) can be practically solved by means of an iterative
Figure 9.5  Maximum delayed token visits at host processor p when the full Timed Token protocol is assumed.
fixed point computation:

    L_{m,k}^{(0)}(a) = 0,
    L_{m,k}^{(i+1)}(a) = HW_m(a, L_{m,k}^{(i)}(a)) + \left( \left\lfloor \frac{a + J_m}{T_m} \right\rfloor C_m + k - 1 \right) p + I_p(L_{m,k}^{(i)}(a)).

The computation is halted when two consecutive values are found equal.
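The same fixed point pattern serves Equations (9.1), (9.3), (9.6) and (9.7). A generic sketch, with invented stand-in step functions in place of the real HW_m and I_p (which depend on the protocol details above):

```python
def fixed_point(rhs, start=0):
    """Solve x = rhs(x) by iteration; rhs must be a monotone
    non-decreasing step function for the iteration to converge."""
    x = start
    while True:
        nxt = rhs(x)
        if nxt == x:
            return x
        x = nxt

# Stand-ins for Equation (9.6): hw and interference are invented step
# functions, not the real HW_m / I_p; p and own_packets are example values.
p = 1                              # packet transmission time
own_packets = 3                    # floor((a + J_m)/T_m) * C_m + k - 1
hw = lambda t: 2 * (t // 4 + 1)    # stand-in for HW_m(a, t)
interference = lambda t: t // 3    # stand-in for I_p(t)
L = fixed_point(lambda t: hw(t) + own_packets * p + interference(t))
assert L == hw(L) + own_packets * p + interference(L)
```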
9.2.5 Restricted Timed Token Protocol

When asynchronous service is not allowed, the analysis becomes simpler. Without loss of generality, assume that

    TTRT = \sum_{p=1}^{n} H_p + \tau.
The worst case condition which leads to the greatest delayed token visits is depicted in Figure 9.6. Note that the first visit of the token to host p may be partially wasted for the transmission of a lower priority packet, which is not preemptable. This potential blocking time B_m(a) is thus at most the transmission time p of a single packet. However, it must be accounted for only if in host processor p there are messages that, when queued at time t = 0⁻, cannot have an actual deadline before or at a + D_m (i.e., messages not included in HW_m(a, t)):

    B_m(a) = p  if ∃m′ ∈ Out(p) : D_{m′} > a + D_m + J_{m′},
    B_m(a) = 0  otherwise.
Figure 9.6  Maximum delayed token visits at host processor p when the restricted Timed Token protocol is assumed.
Note that in the busy period, blocking relative to a + D_m can only occur at the beginning. Similarly to what was seen in the previous subsection, the worst case interference of the other host processors on the network accesses of host processor p in an interval of time t can be exactly computed by carefully defining the function I_p(t). Each host q has a synchronous bandwidth H_q. Consequently, by using a similar argument as for the local higher priority workload, in the interval [0, t] the interference of q is upper bounded by

    \left( 1 + \left\lceil \frac{t}{TTRT} \right\rceil \right) H_q.
However, this upper bound may be pessimistic, because it assumes that the communications processor at host processor q is always busy transmitting messages for a period of time H_q whenever it receives the token. This may not be true if there are not enough outgoing messages queued by q. Since the queueing policy of these messages is known, it is possible to determine a second bound on the interference of q in the interval [0, t], by computing the message workload at its communications processor in the same interval:

    \bar{W}_q(t) = p \sum_{m′ \in Out(q)} \left( 1 + \left\lfloor \frac{t + J_{m′}}{T_{m′}} \right\rfloor \right) C_{m′}.
The actual bound is the minimum between the two values. The overall interference on host processor p is then

    I_p(t) = \sum_{q \ne p} \min \left\{ \left( 1 + \left\lceil \frac{t}{TTRT} \right\rceil \right) H_q,\; \bar{W}_q(t) \right\}.
Hence, in this case the equation for the computation of L_{m,k}(a) is given as:

    L_{m,k}(a) = HW_m(a, L_{m,k}(a)) + \left( \left\lfloor \frac{a + J_m}{T_m} \right\rfloor C_m + k - 1 \right) p + B_m(a) + I_p(L_{m,k}(a) - B_m(a)).        (9.7)

Note that the interference I_p(t) is evaluated in the interval [B_m(a), L_{m,k}(a)], since the busy period may be preceded by a potential blocking time B_m(a) due to the transmission of a lower priority packet. The solution of the equation can be found, as previously, by a fixed point computation.
9.2.6 Search Interval

The last issue to be addressed in this description is how to identify the significant values of the variable a in Equation (9.5). The same approach is used as for the evaluation of task worst case response times. Since in the computation of r_{m,k}(a) the length of a busy period, L_{m,k}(a), is evaluated, an upper bound on the significant values of a is computed by finding the maximum length of any possible processor busy period, L_p. The length L_p can be determined by taking into account the local message workload and the remote interference, and by using the well known fixed point computation:

    L_p^{(0)} = p \sum_{m \in Out(p)} C_m,
    L_p^{(m+1)} = \bar{W}_p(L_p^{(m)}) + I_p(L_p^{(m)}),
where \bar{W}_p(t) and I_p(t) are defined on the interval [0, t[, that is, they only include instances that arrived before t:

    \bar{W}_p(t) = p \sum_{m \in Out(p)} \left\lceil \frac{t + J_m}{T_m} \right\rceil C_m,

and where v is such that

    t_{v-1,p} + H_p \le t < t_{v,p} + H_p

if the full version of the Timed Token protocol is assumed, and

    (v - 1) \, TTRT \le t < v \cdot TTRT

if, vice versa, the restricted version is assumed. The significant values of a are then found in the interval [−J_m, L_p − J_m − B_m − p C_m). In order to further reduce the number of evaluations of r_m(a), observe that the local maxima of the higher priority busy period length L_{m,k}(a) are found for those values of a for which there is at least another local message with one occurrence having its deadline at time a + D_m, or for which message m also has an occurrence queued, i.e., released, at time t = 0, that is, all messages are first released synchronously. Hence, the significant values of a are those in the interval [−J_m, L_p − J_m − B_m − p C_m) for which ∃m′, ∃k such that a = −J_{m′} + k T_{m′} + D_{m′} − D_m. These are, in fact, the only points at which the right sides of Equations (9.6) and (9.7) have discontinuities in a.
9.2.7
End-To-End Response Times
In the previous sections it was shown how the generation, queueing and transmission delays of an end-to-end computation are tightly bounded. In particular, the generation delay is bounded by the worst case response time of the sender task, while the queueing delay and the transmission delay are included in the worst case message communication delay. The final steps are to bound the delivery delay and the processing delay, that is, the time to deliver the message to the destination task in the destination processor and the time to complete its execution. According to Tindell et al. [40], the delivery delay can be implicitly included in our analysis by accurately bounding the overhead needed to handle the interrupts raised by packet arrivals at a destination processor. The packet interrupt handler is in charge of message reconstruction at the receiving end, and message delivery, that is, release of the actual destination task, after the arrival of the last packet. By including these overheads in the computation of the worst case response time of the destination task, the delivery and the processing delays of the end-to-end computation are both bounded. A first rough upper bound on the packet handling overhead can be found by observing that the packet transmission time, p, is the minimum time between two consecutive packet arrivals at any host processor. Hence, in any interval
of time t the packet handling overhead is bounded by ⌈t/p⌉ Cpacket, where Cpacket is the worst case execution time of the packet handler.
A second bound can be obtained by describing the packet handling relative to each message m as a sporadically periodic task [3], a model particularly suited for the analysis of bursty tasks. According to this model, the interrupts handling m's packets are described by means of four parameters:

nh(m): The number of handler invocations in a burst; in this case, the number of packets message m is composed of. We have nh(m) = Cm.

th(m): The minimum time between arrivals within a burst. We have th(m) = p.

Th(m): The periodicity of the burst. It is equal to the period of message m, that is, Th(m) = Tm.

Jh(m): The release jitter of packet arrivals. It is the difference between the earliest and latest arrival of the whole message m, that is, Jh(m) = rm - (pCm + p).
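The four parameters above, together with the first rough bound ⌈t/p⌉ Cpacket, can be sketched as below. The function names are illustrative assumptions; Cm is taken to be the number of packets of message m, as in the text.

```python
from math import ceil

def handler_params(C_m, T_m, r_m, p):
    """Sporadically periodic parameters of the packet-interrupt handler for
    message m: C_m packets, period T_m, worst case response time r_m,
    packet transmission time p."""
    return {
        "n_h": C_m,                   # invocations per burst = packets per message
        "t_h": p,                     # min inter-arrival within a burst
        "T_h": T_m,                   # burst periodicity = message period
        "J_h": r_m - (p * C_m + p),   # jitter: latest minus earliest arrival
    }

def rough_overhead_bound(t, p, C_packet):
    """First rough bound: at most one packet arrival every p time units,
    so the handler overhead in any interval t is at most ceil(t/p) * C_packet."""
    return ceil(t / p) * C_packet
```

For example, a message of 4 packets with period 50, worst case response time 30, and packet time 2 yields a handler jitter of 30 - (8 + 2) = 20.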
Using this description, Tindell et al. [40] have derived the following bound on the number of packets that can arrive at a host processor p in an interval of time t:
where In(p) is the set of incoming messages in host processor p. The packet handling overhead at p can finally be bounded by the function
(9.8) As suggested by Tindell et al., the packet handling overhead would be much less if the communications processor had some processing capabilities and were able to distinguish the last packet of a message from the others. In this way the host
processor is interrupted only once, after the arrival of the last packet, thus significantly reducing the delivery overhead.

The overhead described by Equation (9.8) can be included in the host processor analysis of Section 9.2.2, both for the feasibility evaluation and the worst case response time computation aspects, as illustrated in [36], where the overhead of a tick driven scheduler is similarly taken into account. The approach is based on the simple idea that the overhead is just to be considered as higher priority load for any application task. Having tightly bounded such load, it is sufficient to accurately include it in all equations where all tasks having a higher priority are taken into account, owing either to a shorter deadline or, as in this case, to a different class, in order to check a deadline or to compute a busy period length.

The final step is to bound the processing delay of the end-to-end computation by evaluating the worst case response time of the destination task d(m). This can be done as described in Section 9.2.2, with the extension just mentioned in the previous paragraph. The attributes inherited by d(m) must be described. This is quite straightforward: the period Td(m) is equal to the period Tm of message m, while the release jitter is equal to the difference between the latest and the earliest arrival times of m, that is, Jd(m) = rm - (pCm + p).

The description of the procedure is now complete. Note that there is a strong dependency between host processor and network scheduling. The attribute inheritance is such that the message communication delays strongly depend on the sender task response times and, vice versa, the worst case response times of destination tasks strongly depend on message communication times. This dependency is the basis of the holistic analysis. In order to coherently analyze the system as a whole, several iterative steps are used. At each step all subsystems are examined using the methodologies described in Section 9.2.2 for host processors and in Section 9.2.3 for the network, and the attributes computed at the previous step. The computation is halted either when all values stabilize or when a subsystem is found unschedulable.

Once the holistic procedure is halted, the last step is to check the worst case end-to-end computation delays against their requirements. In particular, the five components of an end-to-end computation delay can finally be summed up as in the following formula:

rs(m) + (rm - Jm) + (rd(m) - Jd(m)).     (9.9)

Figure 9.7  Components of the end-to-end computation delay (sender task rs(m), minimum generation delay; message rm, minimum communication delay; destination task rd(m)).

The formula is graphically illustrated in Figure 9.7. The worst case response time of the sender task, rs(m), is the generation delay. (rm - Jm) includes the
queueing and the transmission delays. The delivery delay (implicitly) and the processing delays are included in the last term (rd(m) - Jd(m)).

One last issue remains to be addressed. So far it has been assumed that the distributed feasibility problem is specified in terms of maximum end-to-end computation delays. It was also assumed that the subsystems, host processors and local communications processors, are scheduled according to the EDF algorithm. Thus the question is how to assign deadlines to the intermediate steps of each end-to-end computation (sender task, message and destination task) in order to always achieve the maximum schedulability. In other words, what is needed is a deadline assignment that guarantees the feasibility of the system whenever this is possible with EDF scheduling. Note that this is no longer a simple analysis of the system, but a task aimed at a correct and effective design of the system. At present an answer to this problem does not exist. However, a sensible deadline assignment can easily be found by looking at Figure 9.7. If Dend-to-end is the relative deadline of the whole end-to-end computation, the destination task d(m) is assigned the relative deadline
Dd(m) = min {Dd(m), Dend-to-end - mgdm - mcdm},
where mgdm is the minimum generation delay and mcdm is the minimum communication delay of message m^7. Note that the minimum is necessary to preserve a possible stronger constraint specified by the system designer. Similarly, message m is assigned the relative deadline
Finally, for the source task s(m),

Ds(m) = min {Ds(m), Dm - mcdm + mgdm}.
This assignment is not only simple and intuitive, it has also been found effective. It is likely that it can be improved. The idea is to shorten the deadlines of those tasks or messages for which shorter response times are needed. Doing this consistently with the system requirements and without jeopardizing the other system components is not trivial, and is the subject of current research.
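A minimal sketch of this deadline assignment, with hypothetical parameter names; the analogous formula for the message deadline Dm is not reproduced in this excerpt, so it is deliberately omitted here.

```python
def assign_deadlines(D_e2e, D_d, D_s, D_m, mgd, mcd):
    """Deadline assignment for one end-to-end computation, as in the text.

    D_e2e: relative deadline of the whole end-to-end computation
    D_d, D_s, D_m: designer-specified deadlines of destination task, source
    task, and message; mgd, mcd: minimum generation and communication delays.
    The min() preserves any stronger constraint given by the system designer.
    """
    D_d_new = min(D_d, D_e2e - mgd - mcd)   # destination task deadline
    D_s_new = min(D_s, D_m - mcd + mgd)     # source task deadline
    return D_d_new, D_s_new
```

For instance, with D_e2e = 100, mgd = 5 and mcd = 15, a destination deadline of 95 is tightened to 80.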
9.3
PERFORMANCE
The analysis described in this Chapter is very new. As far as we know, it has not yet been used in practice. However, a simple case study based on a three node system connected by an FDDI ring was undertaken in [37]. The case study was based on a hypothetical aircraft control system. The performance results demonstrated that the analysis can be used to verify deadlines and that better processor and network utilizations can be achieved than are available with holistic analysis based on RM scheduling. In spite of the limited use of this work to date, it is expected that this analysis will prove valuable in many distributed real-time system situations. These results, together with the holistic analysis based on RM scheduling, are part of a growing literature developing a science of distributed real-time scheduling.

7 In this section it has been assumed that mgdm = 0 and mcdm = pCm + p.
9.4
SUMMARY
Distributed real-time systems are becoming more and more commonplace. However, relatively few scheduling results exist for this important area. In this Chapter one example of a complete analysis for distributed real-time systems based on EDF scheduling is presented. The resulting algorithm handles preemptable tasks, periodic tasks, non-periodic tasks with known interarrival times, end-to-end timing constraints, communication requirements, and resource requirements. It can also be said that the solution addresses the interfaces between CPU scheduling, resource allocation, and communication scheduling. These results, therefore, should prove valuable in many situations. On the other hand, the solution does not address non-preemptable tasks, tasks with multiple levels of importance, precedence constraints, placement constraints, fault tolerance requirements, tight and loose deadlines, overload, or support for different QoS requirements.

In the next Chapter two additional algorithms for distributed real-time systems are presented. These algorithms address different sets of issues and, therefore, provide additional coverage of the types of distributed systems that can be handled.
REFERENCES
[1] T. Abdelzaher and K. Shin, "Optimal Combined Task and Message Scheduling in Distributed Real-Time Systems," Proc. of the IEEE RealTime Systems Symposium, December 1995.
[2] G. Agrawal, B. Chen, W. Zhao, and S. Davari, "Guaranteeing Synchronous Message Deadlines with the Timed Token Medium Access Control Protocol," IEEE Trans. on Computers 43(3), March 1994.
[3] N. Audsley, A. Burns, M. Richardson, K. Tindell, and A. Wellings, "Applying New Scheduling Theory to Static Priority Preemptive Scheduling," Software Engineering Journal, September 1993.
[4] T. Baker, "Stack-Based Scheduling of Real-Time Processes," Real-Time Systems Journal 3, pp. 67-99, 1991.
[5] R. Bettati and J. Liu, "End-to-End Scheduling to Meet Deadlines in Distributed Systems," Proc. of the 12th Distributed Computing Systems Conference, 1992.

[6] S. Biyabani, J. Stankovic and K. Ramamritham, "The Integration of Deadline and Criticalness in Hard Real-Time Scheduling," Proc. Real-Time Systems Symposium, December 1988.

[7] M. Chen and K. Lin, "Dynamic Priority Ceilings: A Concurrency Control Protocol for Real-Time Systems," Real-Time Systems Journal 2, pp. 325-346, 1990.

[8] S. Cheng, J. Stankovic and K. Ramamritham, "Scheduling Groups of Tasks in Distributed Hard Real-Time Systems," IEEE Trans. on Computers, November 1987.

[9] H. Chetto and M. Chetto, "Some Results of the Earliest Deadline Scheduling Algorithm," IEEE Transactions on Software Engineering, 15(10), October 1989.
DEADLINE SCHEDULING FOR REAL-TIME SYSTEMS
[10] W. Chu, C. Sit, and K. Leung, "Task Response Time For Real-Time Distributed Systems With Resource Contentions," IEEE Trans. on Software Engineering 17(10), October 1991.

[11] M. Di Natale and J. Stankovic, "Dynamic End-to-end Guarantees in Distributed Real-Time Systems," Proc. of the IEEE Real-Time Systems Symposium, 1994.

[12] M. Di Natale and J. Stankovic, "Applicability of Simulated Annealing Methods to Real-Time Scheduling and Jitter Control," Real-Time Systems Symposium, December 1995.

[13] K. Efe, "Heuristic Models of Task Assignment Scheduling in Distributed Systems," IEEE Computer, June 1982, pp. 50-56.

[14] D. Ferrari, "A New Admission Control Method for Real-Time Communication in an Internetwork," in S. Son, Ed., Advances in Real-Time Systems, Prentice-Hall, Englewood Cliffs, NJ, 1995.

[15] M. Garey and D. Johnson, "Complexity Results for Multiprocessor Scheduling Under Resource Constraints," SIAM Journal of Computing, 4, 1975.

[16] M. Garey and D. Johnson, "Scheduling Tasks with Nonuniform Deadlines on Two Processors," Journal of the ACM, 23(3):397-411, July 1976.

[17] M. Garey and D. Johnson, "Two-Processor Scheduling with Start-Times and Deadlines," SIAM Journal of Computing, 6(3), September 1977.

[18] M. Garey and D. Johnson, "Strong NP-Completeness Results: Motivation, Examples, and Implications," JACM, Vol. 25, 3, July 1978, 499-508.

[19] R. Graham, "Bounds on Multiprocessing Timing Anomalies," SIAM J. Appl. Math., 17(2), March 1969.

[20] R. Grow, "A Timed Token Protocol for Local Area Networks," Proc. Electro/82, May 1982.

[21] R. Jain, FDDI Handbook: High-Speed Networking Using Fiber and Other Media, Addison-Wesley, Reading, Massachusetts, 1994.

[22] H. Kasahara and S. Narita, "Practical Multiprocessor Scheduling Algorithms for Efficient Parallel Processing," IEEE Trans. on Computers, Vol. C-33, No. 11, November 1984, 1023-1029.
[23] M. Klein, et al., A Practitioner's Handbook for Real-Time Analysis: A Guide to Rate Monotonic Analysis for Real-Time Systems, Kluwer Academic Publishers, Boston, 1993.

[24] H. Kopetz, et al., "Distributed Fault-Tolerant Real-Time Systems: The MARS Approach," IEEE Micro 9(1), February 1989.

[25] C. Koza, "Scheduling of Hard Real-Time Tasks in the Fault Tolerant, Distributed, Real-Time System MARS," Proc. 4th IEEE Workshop on Real-Time Operating Systems, pp. 31-36, July 1987.

[26] J. Lehoczky, "Fixed Priority Scheduling of Periodic Task Sets with Arbitrary Deadlines," Proc. of the 11th IEEE Real-Time Systems Symposium, December 1990.

[27] C. Liu and J. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM 20(1), pp. 40-61, January 1973.

[28] N. Malcolm and W. Zhao, "Guaranteeing Synchronous Messages with Arbitrary Deadline Constraints in an FDDI Network," Proc. of the IEEE Conf. on Local Computer Networks, 1993.

[29] D. Peng and K. Shin, "Static Allocation of Periodic Tasks with Precedence," in Distributed Computing Systems, IEEE, June 1989.

[30] K. Ramamritham, "Allocation and Scheduling of Precedence-Related Periodic Tasks," IEEE Trans. on Parallel and Distributed Systems 6(4), April 1995.

[31] K. Ramamritham, "Allocation and Scheduling of Complex Periodic Tasks," Proc. 10th International Conference on Distributed Computing Systems, Paris, France, June 1990.

[32] K. Ramamritham and J. Stankovic, "Dynamic Task Scheduling in Distributed Hard Real-Time Systems," IEEE Software, 1(3):65-75, July 1984.

[33] K. Ramamritham, J. Stankovic and W. Zhao, "Distributed Scheduling of Tasks with Deadlines and Resource Requirements," IEEE Trans. on Computers, 38(8):1110-23, August 1989.

[34] L. Sha, R. Rajkumar, J. Lehoczky, and K. Ramamritham, "Mode Change Protocols For Priority Driven Preemptive Scheduling," Real-Time Systems Journal, Vol. 1, pp. 243-264, 1989.
[35] L. Sha, R. Rajkumar, and J. Lehoczky, "Priority Inheritance Protocols: An Approach to Real-Time Synchronization," IEEE Trans. on Computers 39(9), September 1990.

[36] M. Spuri, "Analysis of Deadline Scheduled Real-Time Systems," Rapport de Recherche 2772, INRIA Rocquencourt, Le Chesnay Cedex, France, January 1996, submitted to IEEE Trans. on Software Engineering.

[37] M. Spuri, "Holistic Analysis for Deadline Scheduled Real-Time Distributed Systems," Rapport de Recherche 2873, INRIA Rocquencourt, Le Chesnay Cedex, France, April 1996.

[38] J. Stankovic, K. Ramamritham and S. Cheng, "Evaluation of a Flexible Task Scheduling Algorithm for Distributed Hard Real-Time Systems," IEEE Trans. on Computers, C-34(12):1130-43, 1985.

[39] K. Tindell and J. Clark, "Holistic Schedulability Analysis for Distributed Real-Time Systems," Microprocessing and Microprogramming, 40:117-134, 1994.

[40] K. Tindell, A. Burns, and A. Wellings, "Analysis of Hard Real-Time Communications," Real-Time Systems Journal 9, 1995.

[41] S. Zhang and A. Burns, "An Optimal Synchronous Bandwidth Allocation Scheme for Guaranteeing Synchronous Message Deadlines with the Timed-Token MAC Protocol," IEEE/ACM Trans. on Networking 3(6), December 1995.
10 DISTRIBUTED SCHEDULING PART II
The algorithms and analysis of distributed real-time scheduling presented in Chapter 9 assume a fairly simple task set model. As a result, formulas for the analysis of these systems can be derived. In this Chapter, two real-time scheduling algorithms are presented where the task set model is more complex and no such formulas exist. The first algorithm handles static real-time distributed scheduling where tasks have deadlines, periods, precedence, communication, and even replication requirements. The algorithm is easily modified so that it can be used for many static, distributed real-time systems. The second algorithm is appropriate for dynamic distributed real-time systems, but only for the non-critical tasks. This algorithm is based on a combination of focused addressing and bidding. The current state of the art in dynamic distributed real-time systems is rather primitive.
10.1
THE SPRING COMPLEX TASK SET ALLOCATION AND SCHEDULING ALGORITHM
Safety-critical tasks in real-time systems must meet their deadlines under all circumstances; otherwise the result could be catastrophic. Resources needed to meet the deadlines of safety-critical tasks are typically preallocated. These tasks are usually statically scheduled such that their deadlines are met even under worst case conditions. The Spring complex task set allocation and scheduling algorithm [22] allocates and schedules the components of a task across nodes in a distributed system, as well as the communication among these components. Besides periodicity constraints, tasks handled by the algorithm can have resource requirements and can possess precedence, communication, as well as replication constraints. Once this algorithm has been programmed, many simple modifications to it are possible to handle variations in the scheduling problem. This algorithm works off-line and hence is a static real-time algorithm.¹
10.1.1
Assumptions and Problem Statement
The assumption is made that each node in the distributed system has one processing element and a given set of (passive) resources. The nodes are connected by a multiple-access network. The algorithm is designed to work with communication media and protocols such that, knowing the arrival time and characteristics of a message at the sending node, the time when the message is delivered at the receiving node can be predicted. For instance, point-to-point networks or multi-access networks employing a Time Division Multiple Access (TDMA) protocol have such predictability, as does the FDDI protocol used with the holistic algorithm of Chapter 9. Communication from one node to another occurs at prespecified times, as per the schedule generated. Since the scheduler preschedules the communication on the network, no contention occurs for the multiple-access network at run time. In this approach, a complex program with communicating modules is translated into a task composed of a set of communicating subtasks, where each subtask has resource requirements, involves the execution of sequential code, and has communication as well as precedence constraints with other subtasks. Such a translation can be accomplished by a compiler [20]. Specifications of periodic tasks include the following (see Figure 10.1 for the specification of two sample periodic tasks):
¹ Note that this algorithm is different from the Spring scheduling algorithm described in earlier Chapters.

1. The period of the task. The semantics assumed is that one instance of all subtasks of a task is executed every period.

2. The precedence relationship among subtasks of the task. This is expressed as a graph where the nodes represent subtasks and a directed arc exists from a subtask to its successor. (Here it is assumed that communication or
precedence relationships do not exist between subtasks of different periodic tasks. See [23] for solutions when this assumption is relaxed.) 3. Computation times of subtasks are expressed via values attached to each graph node. These represent worst case computation times. (It is assumed that the execution of each subtask cannot be preempted [28]).
4. The maximum amount of information communicated from a subtask to its successor is expressed via a value attached to the corresponding arc. This is used to determine the communication delays incurred due to information transfer from a subtask to its successor if they are scheduled on different nodes in a distributed system. This information is used to schedule the communication between communicating subtasks placed on different nodes. (Communication within a node is assumed to incur zero delay.) The values associated with arcs in the graph are the communication times for the corresponding messages. This simplifies subsequent discussion.

5. The replication requirement of subtasks (given by RR) is specified by a value attached to each subtask indicating the number of replicates needed for the subtask. (Thus, it is assumed that fault tolerance is achieved via replication. The results of the replicates of a subtask are sent to each successor of the subtask which, depending on the fault model assumed, may vote on the results to determine the valid input. If voting is done, it is assumed that the voting overheads incurred by the successor are either negligible or are already accounted for in the worst case computation time of the successor.)

6. Resource constraints attached to each subtask express any specific resources needed by that subtask. These include the CPU, sensors, I/O devices, data structures, files, and databases. The resource constraints restrict the nodes to which a subtask can be assigned: these nodes should have the resources required by the subtask. For simplicity, it is assumed that all the specified resources are needed by the subtask throughout its execution and that resources allocated to a subtask are released at the end of its execution. (With some minor changes, the algorithm can be made to handle situations where these assumptions are relaxed.)
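The task specification above can be captured in a small data model. This is an illustrative sketch only; the class names, fields, and the sample numbers below are assumptions, not the Spring implementation or the exact values of Figure 10.1.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    name: str
    wcet: float                                   # worst case computation time (non-preemptable)
    resources: set = field(default_factory=set)   # CPU, sensors, I/O devices, files, ...
    replication: int = 1                          # RR: number of replicates needed

@dataclass
class PeriodicTask:
    period: float
    subtasks: dict                                # name -> Subtask
    # precedence graph: (predecessor, successor) -> communication time of the message
    comm: dict = field(default_factory=dict)

# A task shaped like Periodic Task 2 of Figure 10.1 (illustrative numbers)
t2 = PeriodicTask(
    period=25,
    subtasks={"2.1": Subtask("2.1", 4), "2.2": Subtask("2.2", 6), "2.3": Subtask("2.3", 9)},
    comm={("2.1", "2.2"): 2, ("2.2", "2.3"): 3},
)
```

The arcs of the precedence graph double as the communication-time annotations, matching the simplification made in item 4 above.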
The purpose of this algorithm is to allocate subtasks of a set of tasks across nodes in a distributed system and to schedule the subtasks such that the tasks meet their periodicity requirements. It is well-known that even some of the simplest scheduling problems are NP-hard in the strong sense and hence, in practice, it is not possible to determine optimal schedules efficiently.
Periodic Task 1: Subtasks = 1.1, 1.2, 1.3, and 1.4; Period = 50. Subtask 1.2 is required to be executed in triplicate; subtask 1.4 votes on the results communicated by the three copies of 1.2.

Periodic Task 2: Subtasks = 2.1, 2.2, and 2.3; Period = 25.

The number to the right of a node indicates the computation time of the subtask corresponding to the node. The number attached to the arc connecting two subtasks indicates the amount of information communicated from one subtask to its successors.
Figure 10.1
Structure of Two Periodic Tasks.
When allocating and scheduling subtasks that communicate, the following issues have to be dealt with in conjunction. (1) Given a set of communicating subtasks, should they be placed on the same node? (2) Which node should a subtask be allocated to and when should it begin execution? An optimal solution should consider the cross product of the solution space of these queries. This is impractical for non-trivial distributed systems and for periodic tasks with complex characteristics. Consider the first issue. Ideally, subtasks of a task should be clustered such that (a) the cost of communication among subtasks within a cluster is higher than that between subtasks in different clusters and (b) the costs of communication among subtasks within a cluster negate the advantages of the parallel execution of the subtasks at different nodes. Subtasks belonging to the same cluster should be allocated to the same node. The basic idea then is to cluster together subtasks that have "substantial" amounts of communication among them. This strategy attempts to eliminate the large communication costs. While more elaborate clustering mechanisms are possible, the results for clusters of size two are presented. In this case, suppose there are n pairs of communicating subtasks. If an optimal solution has to be found, the 2^n different clustering possibilities that exist must be examined. The complexity of finding an optimal solution poses a further practical problem when the second issue is considered. Hence, the algorithm addresses the two issues raised above in separate phases and utilizes heuristic solutions. In addition, it provides a way by which it is possible to iterate over the allocation and scheduling decisions to find a feasible solution.
10.1.2
Overview of the Algorithm
The Spring static allocation and scheduling algorithm [22] consists of two parts. The first part decides whether a cluster of communicating subtasks should be assigned to the same node. This decision is based on the computation times of the subtasks in a cluster and the amount of communication between them. This part is heuristic in nature and is composed of what is referred to as Step I and Step II (see pseudo code below). Given the clustering done in the first part, the second part assigns the clusters of subtasks to the nodes in a system and determines a feasible schedule, if possible, for the subtasks as well as the communication between them. This is done in Step III using a search driven by task characteristics, where, at each point in the search, subtasks eligible for execution are considered in accordance with task characteristics
such as latest-start-times and precedence constraints. Since the first part of the algorithm eliminates some of the communication (by deciding that certain subtasks should be assigned to the same node), the search space in the second part is considerably reduced. A high level summary of the algorithm is as follows:
PSEUDO CODE FOR STATIC ALLOCATION AND SCHEDULING ALGORITHM

PART I

Step I:   Construct comprehensive graph containing all task instances
          (includes task replicas)

Step II:  Cluster communicating subtasks

PART II

Step III: Allocate clusters of subtasks to nodes
          Schedule subtasks and communication

If a feasible allocation and schedule are not found, alter the CF factor and repeat Steps II and III.
Given a set of periodic tasks, the algorithm attempts to assign subtasks of the tasks to nodes in a distributed system and to construct a schedule of length L where L is the least common multiple of the task periods. A real-time system with the given set of tasks then repeatedly executes its tasks according to this schedule every L units of time. Given the graph depicting each task, Step-I constructs the comprehensive graph containing all instances of the tasks that execute in an interval of length L. The comprehensive graph includes the replicates of the subtasks that have replication requirements.
Step-II involves clustering subtasks in the comprehensive graph. Specifically, based on the amount of communication involved between a pair of communicating subtasks and the computation time of the subtasks, a decision is made as to whether the two subtasks should be assigned to the same node, thereby eliminating the communication costs involved. The algorithm makes its decision based on whether the fraction

(sum of the computation times of the two subtasks) / (cost of communication)
is lower than a tunable parameter called communication factor, CF. Applying the above scheme to every pair of communicating subtasks in the comprehensive graph derived in Step-I, a communication graph is generated with the current value of CF. Step-III allocates the subtasks to nodes in the system, schedules these subtasks as well as communication, and if possible, determines a feasible schedule. This is done using a heuristic search technique that takes into account the various task characteristics, in particular, subtask computation times, communication costs, deadlines, and precedence constraints. It allocates a subtask to a node, determines the order in which each node processes its subtasks, and schedules communication. The allocation and scheduling decisions are made in conjunction. Specifically, allocation and scheduling decisions about a subtask are made only after all its predecessors have been allocated and scheduled. These decisions take into account the communication and computational needs of the subtasks that follow. If at the end of Step-III, a feasible allocation and schedule is not possible, the value of CF is altered, and Steps II and III are repeated.
Now consider each of these three steps in detail.
Step I - Construction of the Comprehensive Graph

The following semantics is associated with periodic tasks: all subtasks of the jth instance of a periodic task with period P should be completed between (P × (j - 1)) and (P × j). Given these semantics, the algorithm attempts to construct a feasible schedule for all task instances that should execute within the interval (0, L), where L is the Least Common Multiple of the periods of all the periodic tasks involved. The comprehensive graph is composed of Ni = L/Pi instances of the ith periodic task (with period Pi). The first instance of each periodic task is ready to begin execution at time 0. In addition to the other constraints, the jth instance of task i, for j = 1, 2, ..., Ni, has a start time constraint whereby its first subtask(s) (i.e., those that have no predecessors) cannot be scheduled to begin before ((j - 1) × Pi). The last subtask(s) of
the jth instance of task i (i.e., those which have no successors) have a deadline of (j × Pi). The comprehensive graph is then modified to consider replication requirements of subtasks. The additional replicates of a subtask added to the graph are endowed with the same specifications as the original subtask.
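The start-time and deadline constraints of the instances can be sketched as follows; this is an illustrative sketch that assumes integer periods (the function name is an assumption).

```python
from math import lcm

def instance_windows(periods):
    """For each periodic task i, the j-th instance (j = 1..N_i, N_i = L/P_i)
    may not start before (j-1)*P_i and must finish by j*P_i, where L is the
    least common multiple of all the periods."""
    L = lcm(*periods)
    windows = {}
    for i, P in enumerate(periods):
        N = L // P
        windows[i] = [((j - 1) * P, j * P) for j in range(1, N + 1)]
    return windows
```

For the two tasks of Figure 10.1 (periods 50 and 25), L = 50, so task 1 contributes one instance and task 2 contributes two.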
Step II - Constructing the Communication Graph

As mentioned earlier, the primary considerations in determining whether two subtasks of a task should be placed on the same node are the computation times of the subtasks and the amount of communication between them. If the two subtasks are placed on the same node, the costs of communication between the two subtasks are avoided and hence the time required to complete this pair of subtasks is reduced [10]. However, this has a number of implications. First, if a subtask has two successors (for example, consider subtask 1.1 of Figure 10.1) and if the subtask and both its successors are assigned to the same node, the potential for parallel execution of the successors is not exploited. The second implication is that the load on the node to which both subtasks are allocated increases, which can prevent the node from taking on a subtask (of another task) whose characteristics make this node more suitable for its execution. In general, requiring that some of the subtasks be assigned to the same node reduces the options for the remaining subtasks. Clearly, the scheme that decides whether communicating subtasks must be assigned to the same node must be flexible enough to take into account the specific characteristics of the tasks being allocated. Suppose there are two pairs of subtasks where the computational characteristics of one pair are the same as the other. Then, it is better to assign the subtasks with the higher communication costs to the same node. This is the basis for this scheme. Specifically, two subtasks with computation times Ci and Cj, where the communication from the first subtask to the second takes commij units of time, are placed on the same node if the following holds: (Ci + Cj) < (CF × commij), where CF is a tunable parameter. For a given value of CF, this scheme tends to assign the pair of subtasks with higher communication costs to the same node.
It should be clear that the maximum value of CF that needs to be considered is

max_{i,j} ((Ci + Cj) / commij) + ε

for all communicating subtasks i and j. Let us call this maxcf. ε is some positive non-zero value.
Assume that commij for every pair of communicating subtasks i and j is nonzero. This is true in practice, since completion of a subtask's execution has to be notified to its successor, say via an "enabling signal". Thus, assigning the value maxcf to CF forces all communicating subtasks to be allocated to the same node. More and more pairs of communicating subtasks are separated as the value of CF is decreased from maxcf. A CF value of 0 forces communicating subtasks to be allocated to different nodes. By making commij infinity (zero) between a pair of subtasks i and j, even if it is not, the subtasks involved can be forced to be allocated to the same (different) node. CF values can be chosen in different ways. One choice is to start with a CF value of maxcf. If a feasible schedule is not found for a given value of CF, the scheduling algorithm reattempts with a lower CF value, obtained by decrementing the current CF value by maxcf/n, where n is a constant. Given a comprehensive graph, traversing the graph top-down and left to right, the "pairwise heuristic" is applied to every pair of communicating subtasks to determine whether the subtasks should be assigned to the same node. In addition, in order to produce consistent clustering decisions, a pair of communicating subtasks, say t1 and its successor t2, may have to be placed on different nodes:

• Suppose t2 has a predecessor t0, in addition to t1. If prior decisions require that t0 and t2 must be allocated to the same node and that t0 and t1 must be on different nodes, then, clearly, t1 and t2 have to be allocated to different nodes. One specific instance of the above is the case where t0 and t1 are replicates with a successor t2, and t2 and t0 are assigned to the same node.

• For the general case of replicates, consider a subtask t1 that is replicated. At most one of the replicates of t1 can be assigned to the same node as t1's predecessor. Similarly, at most one of the replicates of t1 can be assigned to the same node as t1's successor.

• Suppose subtasks t1 and t2 have a common predecessor t0. If prior decisions require that t0 and t1 must be on the same node, and that t0 and t2 must be on different nodes, then t1 and t2 must be on different nodes.
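The CF search described above (start at maxcf, decrement by maxcf/n) might be sketched as follows; the pair data, the variable names, and the placeholder feasibility test are all illustrative assumptions:

```python
# Sketch of the CF search: start with CF = maxcf and decrement by
# maxcf/n until a feasible schedule is found or CF falls below zero.
pairs = {            # (i, j) -> (Ci, Cj, commij); illustrative values
    ("1.1", "1.2"): (4.0, 6.0, 5.0),
    ("1.2", "1.3"): (6.0, 3.0, 9.0),
}
EPS = 1e-6  # the positive non-zero value added to the maximum ratio
maxcf = max((ci + cj) / comm for ci, cj, comm in pairs.values()) + EPS
n = 4       # number of CF steps to try (a tunable constant)

cf = maxcf
while cf >= 0:
    # Pairs satisfying (Ci + Cj) < CF * commij go on the same node.
    clustered = {p for p, (ci, cj, comm) in pairs.items() if ci + cj < cf * comm}
    feasible = False  # placeholder for the actual scheduling attempt
    if feasible:
        break
    cf -= maxcf / n   # retry with fewer pairs forced onto one node

print(sorted(clustered))  # no pairs remain clustered once CF nears 0
```

At CF = maxcf every communicating pair passes the test (all pairs clustered), consistent with the text; as CF shrinks, pairs with lower communication costs are separated first.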
The graph that results from applying the pairwise heuristic as well as the above rules is called the communication graph. Figure 10.2 shows the communication graph of the graph in Figure 10.1 for CF = maxcf. Subtasks which are connected by arcs that do not have any associated times are the ones that must be allocated to the same node. In this case, since CF = maxcf, two subtasks that communicate are assigned to the same node. When applied to every pair of communicating subtasks, this scheme eliminates all communication except those to and from the replicates of a subtask. Once the communication graph has been derived, the latest start time of each subtask t in the communication graph is determined. Assume t is a subtask of periodic task (instance) T with deadline D. Define the length of a path between two subtasks in the communication graph to be the sum of the computation times of all the subtasks, including the subtasks in consideration, plus the sum of the communication times, if any, associated with the arcs that lie along the path. Let LP be the length of the longest path from t up to and including the last subtask of T. The latest start time of t is defined to be D - LP. Latest start times of tasks are used in ordering tasks for consideration during scheduling. As the term "latest start time" implies, if a subtask is started any later than this time, the task it belongs to definitely misses its deadline. However, as the following example indicates, this is an optimistic latest start time. Consider periodic task 1 of Figure 10.1 without the replication requirement on subtask 1.2. Assume that all the communication subtasks of this task have been eliminated, i.e., all the subtasks must be scheduled on the same node. Suppose the task has a deadline of D. Then subtask 1.1 should start by
LST' = D - (C1.1 + C1.2 + C1.3 + C1.4)
in order for the task to meet its deadline. This is because all four subtasks need to be completed by time D on the same node. Clearly, LST' is less than the optimistic latest start time of 1.1. But, if subtasks 1.2 and 1.3 can execute on different nodes, use of the optimistic start time is appropriate. Since this is the more general case, the algorithm uses this latest start time. However, to accommodate cases illustrated by the above example, additional checks are done while making allocation and scheduling decisions to make sure that a subtask starts early enough. To ease further discussions, the communications that have to be scheduled are referred to as communication sub tasks and the subtasks that must be allocated to nodes as CPU subtasks. This nomenclature recognizes the additional resource constraint imposed by subtasks: CPU subtasks are allocated and sched-
Distributed Scheduling - Part II
Figure 10.2 Subtask Communication Graph. Here the lth replicate of the kth instance of the jth subtask of periodic task i is identified by i.j.k.l. 2.3.1.1 has a deadline of 25. 2.1.2.1 has a start time constraint of 25, i.e., it cannot start executing before 25. 1.4.1.1 and 2.3.2.1 have a deadline of 50. The rectangle next to each subtask indicates the latest start time of the subtask. Subtasks connected by an arc with no numbers attached to them should be assigned to the same node. Where such numbers are attached, they denote communication costs.
CPU subtasks are allocated and scheduled on (processing) nodes, and communication subtasks are scheduled on the (communication) network. Viewing them in this fashion allows all (types of) subtasks to be dealt with uniformly, given that the algorithm takes resource constraints into account.
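The latest-start-time computation described earlier (deadline D minus the length LP of the longest remaining path) can be sketched as follows, using a hypothetical subtask graph loosely modeled on task 1 of Figure 10.1; all times are made up:

```python
# Sketch: latest start time of subtask t is D - LP, where LP is the
# length of the longest path from t to the last subtask, counting the
# computation times of the subtasks on the path plus the communication
# times on the arcs along it.

comp = {"1.1": 4, "1.2": 6, "1.3": 6, "1.4": 5}          # computation times
succ = {"1.1": ["1.2", "1.3"], "1.2": ["1.4"], "1.3": ["1.4"], "1.4": []}
comm = {("1.1", "1.2"): 2, ("1.1", "1.3"): 2,
        ("1.2", "1.4"): 3, ("1.3", "1.4"): 3}             # communication times
D = 50                                                    # task deadline

def longest_path(t):
    """Length of the longest path from t to the last subtask,
    including t's own computation time."""
    if not succ[t]:
        return comp[t]
    return comp[t] + max(comm[(t, s)] + longest_path(s) for s in succ[t])

lst = {t: D - longest_path(t) for t in comp}
print(lst["1.1"])  # D - (4 + 2 + 6 + 3 + 5) = 30
```

As the text notes, this value is optimistic when several of the subtasks are forced onto one node, which is why the additional checks during scheduling are needed.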
Step III - Making Allocation and Scheduling Decisions A subtask becomes enabled only when all its predecessors have completed execution. Thus, at any given time, only some of the subtasks are eligible for consideration. A subtask becomes ready only if it is enabled and its start-time constraint is met. Given a list of ready subtasks, the order in which they are considered for allocation and scheduling determines whether or not a feasible schedule is derived. For example, assume that at a given node two subtasks are enabled at time 20. Each subtask has computation time 10; the first has a deadline of 30 and the second 40. In this case, if the subtasks are not considered according to their latest start times or deadlines, a feasible schedule cannot be generated. In general, delaying the execution of the subtask with the least latest start time delays the overall completion time of the set of tasks. Hence, the ready list is ordered according to increasing latest start time of the subtasks in the list. If there is a tie, the subtask with a greater number of successors is placed first. This is equivalent to using the LST/MISF (Latest Start Time/Maximum Immediate Successors First) heuristic during the search for a feasible allocation and schedule. Assuming that each node has only one processor, when the processor is busy (idle), the node is busy (idle). Some nodes may be busy executing previously scheduled CPU subtasks that are yet to finish. Clearly, since CPU subtasks are assumed to be non-preemptable, ready subtasks can be scheduled only on currently idle nodes. For uniformity, communication subtasks are not permitted to be preempted either. Thus, a communication subtask can be scheduled only if the communication channel is idle. When making allocation and scheduling decisions, an idle node or an idle communication channel is referred to as a schedulable resource.
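A minimal sketch of the LST/MISF ordering of the ready list (the Subtask record and its values are assumptions, not from the text):

```python
# LST/MISF ordering: increasing latest start time; ties broken in
# favor of the subtask with more immediate successors.
from collections import namedtuple

Subtask = namedtuple("Subtask", "name lst n_succ")

ready = [
    Subtask("a", lst=30, n_succ=1),
    Subtask("b", lst=20, n_succ=2),
    Subtask("c", lst=20, n_succ=3),
]

# Sort by LST ascending; for equal LSTs, more successors first.
ready.sort(key=lambda t: (t.lst, -t.n_succ))
print([t.name for t in ready])  # ['c', 'b', 'a']
```

Subtasks b and c share the same latest start time, so c (three successors) precedes b (two), as the tie-breaking rule prescribes.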
After initializing the ready list with the root node of the communication graph, the search proceeds as follows. At each search point, the algorithm first checks whether the allocation and scheduling decisions made thus far rule out a feasible schedule. These checks are discussed below under "Testing for Infeasibility".
• If the checks indicate that a feasible schedule is likely, then the subtasks in the ready list are mapped to schedulable resources. Obviously, there are a number of possible mappings, and they are generated and considered in order (as discussed below in the paragraphs labeled "Systematic Generation of Mappings"). If the current mapping is valid, i.e., meets certain requirements (discussed under "Testing the Validity of a Mapping"), the search path is extended by one more level and the search proceeds. The "time" corresponding to the new level is set to the smaller of min(earliest start time of currently enabled tasks) and min(earliest completion time of subtasks currently occupying resources). Subtasks that were in the previous ready list but not scheduled are placed in the new ready list. Tasks that have just become ready are added to the new ready list.

If the current mapping is invalid, the next mapping is generated and its validity determined. If no more valid mappings exist at the current point of search, the algorithm discards the current search point. Once this occurs, if the algorithm is allowed to backtrack, it backtracks to the previous search point. When going back to the previous search point, the next valid mapping, if any, at that point is pursued.

• If it is found that the current set of allocation and scheduling decisions does not lead to a feasible schedule, the current search point is "bound", i.e., is discarded. Here again, if backtracking is allowed, the algorithm backtracks to the previous search point and proceeds (if possible) with the next valid mapping at that point.
Experimental results show that the LST/MISF based ordering of the ready list works effectively in conjunction with the systematic generation of mappings, the tests used to validate a given mapping, and the checks used to determine whether the current search path leads to a feasible schedule. Systematic Generation of Mappings: Given the subtasks in the ready list, a mapping defines the assignment of subtasks to schedulable resources. To simplify the generation of mappings, the notion of "idle subtasks" is introduced. Sometimes, resources may remain idle because none of the ready tasks requires them. Because of the characteristics of subtasks that become enabled or ready at a future time, it may at times be better to allow a resource to remain idle even if a currently ready task can be scheduled on it. For instance, a task that is yet to become ready may have an earlier latest start time than another that is ready. Further, if the resource requirements of the former conflict with those of the latter, then immediately scheduling the latter may affect the schedulability of the former. Thus, not only the subtasks, but also the idle subtasks have to be considered for scheduling. Idle subtasks represent time slots during which one or
more schedulable resources are allowed to remain idle. The notion of idle subtasks treats resource assignment uniformly: some subtask is always assigned to a schedulable resource; if it happens to be an idle subtask, the resource remains idle. To facilitate this scheme, a number of idle subtasks, equal to the number of nodes idle at this point, are appended to the ready list. Suppose there are n subtasks (excluding idle subtasks) in the ready list and k idle schedulable resources at a certain point in the search. Considering idle subtasks, the effective size of the ready list is (n + k). There are O((n + k)^k) possible mappings from subtasks to schedulable resources, considering idle subtasks as well. A mapping (m1, m2, ..., mk) represents the assignment of the mi-th subtask in the ready list to the ith idle schedulable resource. If the possible mappings from subtasks to schedulable resources are generated systematically, then, given a certain ready list and a particular mapping, the next possible mapping can be determined without any other information. This scheme, inspired by the one used in [15], considerably reduces the amount of information that has to be maintained during search: only the most recently used mapping at this point in the search needs to be kept as part of the search structure. For example, suppose subtasks t1 and t2 are ready, t1 has a lower latest start time, and schedulable resources s1 and s2 are available. Then n = 2 and k = 2. The mappings from the ready list (t1, t2, idle, idle) to the list of idle schedulable resources (s1, s2) are generated in the following sequence: (1,2), (1,3), (2,1), (2,3), (3,1), (3,2), (3,4). This is a lexicographically ordered sequence where each mapping has k elements and each element in a mapping is between 1 and n + k. Mappings that have the same effect as a previously generated mapping are not generated. The above sequence corresponds to the following assignments to (s1, s2): (t1, t2), (t1, idle), (t2, t1), (t2, idle), (idle, t1), (idle, t2), (idle, idle).
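The systematic generation just described can be reproduced with a short sketch. Note one assumption: requiring idle indices to appear in increasing, compact order is one plausible way to suppress mappings with the same effect as an earlier one; the book does not spell out its exact mechanism:

```python
# Sketch of systematic mapping generation: k-tuples of distinct
# indices from 1..n+k (indices > n denote idle subtasks), in
# lexicographic order, skipping mappings whose effect duplicates an
# earlier one.
from itertools import permutations

def mappings(n, k):
    out = []
    for m in permutations(range(1, n + k + 1), k):
        idles = [x for x in m if x > n]
        # Idle indices must be used compactly and in increasing order,
        # so e.g. (2,4) is dropped because it has the same effect as (2,3).
        if idles == list(range(n + 1, n + 1 + len(idles))):
            out.append(m)
    return out

print(mappings(2, 2))
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 4)]
```

For n = 2 ready subtasks and k = 2 idle resources, this reproduces exactly the seven-mapping sequence given in the text.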
Observe that mapping (2,4) (corresponding to (t2, idle)) is not generated since it has the same effect as (2,3). For similar reasons, (4,1), (4,2), and (4,3) are not generated.

Testing the Validity of a Mapping: The following conditions have to be met for a mapping to be valid.

• As noted earlier, schedulable resources can be allowed to remain idle. However, when there are ready tasks, not all resources in the system can remain idle. If the subtask with the lowest latest start time among subtasks yet to be scheduled is ready and is schedulable, it has to be scheduled.

• Resource constraints must be met: when a CPU subtask is mapped to a node, the resources needed by the subtask should be available at that node. Communication subtasks can be allocated only to the communication channels. Two subtasks must be scheduled on different nodes if and only if the two subtasks are separated by a communication subtask. Replicates of a subtask (if any) should be scheduled on different nodes. Subtasks t1 and t2 should be assigned to the same node if they have a CPU subtask t as their common successor.
Testing for Infeasibility: The current search point does not lead to a feasible schedule, i.e., a schedule that meets all the timing constraints, if one of the following holds. First, the time at the current search point is greater than the latest start time of a task in the ready list. Second, the total time available on a particular resource between the current time and L is less than that required by the subtasks that will execute between now and L and require that resource. For example, assume that communication is via a multiple-access network and the current time is t. If the time needed for all the communication subtasks that have not yet been scheduled at this point is greater than (L - t), then the current search does not lead to a feasible schedule.
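A sketch of these two infeasibility tests; the function shape and the single-unit-per-resource assumption are ours, not the book's:

```python
# Two infeasibility tests at a search point:
#  1. the current time exceeds some ready subtask's latest start time;
#  2. the outstanding demand on some resource exceeds the time left
#     on it before L (assumes one unit of each resource).

def infeasible(now, ready_lsts, L, remaining_demand):
    """True if the current search point cannot lead to a feasible schedule."""
    if any(now > lst for lst in ready_lsts):
        return True
    return any(demand > L - now for demand in remaining_demand.values())

# 35 units of unscheduled network traffic, but only 50 - 20 = 30 units
# of time remain on the (single) multiple-access network.
print(infeasible(20, [25, 30], 50, {"net": 35}))  # True
```

Either test firing prunes the current search point, which is what makes early detection valuable.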
In addition to these two cases, by looking ahead, a potentially infeasible schedule can be detected sooner. Once again, consider periodic task 1 of Figure 10.1 without the replication requirement on subtask 1.2. Assume that all the communication subtasks of this task have been eliminated, i.e., all the CPU subtasks must be scheduled on the same node. Suppose subtasks 1.2 and 1.3 are currently on the ready list. Then the following condition should hold:

(current time + C1.2 + C1.3 <= LST1.4)
where Ci is the computation time of subtask i and LSTi is the latest start time of i. The reason for the above condition should be obvious: since both subtasks must execute on one node, they should complete execution before the
Figure 10.3 Assignment and Schedule for the Two Periodic Tasks.
latest start time of the successor task. If the above condition does not hold, the current partial schedule does not lead to a feasible schedule. Suppose instead that subtask 1.1 is on the ready list. Then the following condition should be satisfied for the search to proceed:

(current time + C1.1 + C1.2 + C1.3 <= LST1.4)

Clearly, the above specific cases can be generalized. However, overheads are involved in detecting whether the conditions for the "look ahead" apply. In general, the sooner it can be determined that the search path being pursued does not lead to a feasible schedule, the less time and resources are wasted on scheduling. Experiments [22] have indicated that the LST/MISF heuristic, in conjunction with the mapping generation and validation scheme and the search path bounding scheme, determines the initial search path so effectively that if the initial path does not lead to a feasible schedule for a given set of tasks, chances are high that this is an infeasible task set. Figure 10.3 shows the schedule that results when the algorithm is applied to the communication graph of Figure 10.2. It shows the system resource to which each subtask is allocated along with the scheduled start time of the subtask. Here all subtasks of Task 1 execute on node 1, all subtasks of Task 2 execute on node 0, and the redundant copies of subtask 1.2.1.1 execute on nodes 0 and 2. Derivation of this schedule required no backtracking. In summary, this algorithm presents a very powerful heuristic that can be easily modified and used for scheduling many types of static distributed real-time systems. It is worth programming the algorithm and making it part of a collection of real-time scheduling analysis algorithms available to designers.
10.1.3 Performance Evaluation
The algorithm presented in this section was implemented in C++ and tested over a wide range of workloads as represented by various parameter settings (such as deadline distributions, worst case computation times, communication costs, amount of redundancy to support fault tolerance, etc.) and task types (such as periodic and aperiodic tasks, tasks with precedence and communication requirements, etc.). In the performance studies [22, 23, 26] it was assumed that
there was a network of six nodes connected by a multiple-access communications network. In these studies it was shown that this algorithm is an effective one for distributed real-time allocation and scheduling. The cost of running the algorithm without backtracking on an INTEL 386 was from 30 ms to 12 seconds depending on the particular task load involved. Since this is an off-line algorithm, the algorithm execution time is very impressive. Other performance results showed that the use of the CF factor in clustering tasks and the explicit use of a deadline factor were both important and effective. Using backtracking improved the success rate of finding feasible schedules, but only slightly for the experiments discussed in [22]. This implies that the heuristics are quite effective in finding feasible schedules in one pass through the algorithm. See [22, 23, 26] for a full discussion on the performance of the algorithm.
10.2 FOCUSSED ADDRESSING AND BIDDING
Most solutions used today for distributed real-time scheduling are static in that they assume complete and prior knowledge of all tasks and make static scheduling and allocation decisions for the tasks. These solutions offer high levels of guarantee, but can suffer from inflexibility, poor resource utilization, and high overall system cost in dollars. On the other extreme are dynamic solutions that assign priorities to tasks at runtime. The dynamic scheduling approach suffers from a number of problems. First, one figure, namely a task's priority, has to reflect a number of characteristics of the task, including its deadline and level of importance. This assignment is error-prone and causes several well known anomalies because deadline and importance are not always compatible. Second, the fact that a task has missed its deadline is known only when the deadline occurs. This does not allow time for any corrective actions. Third, priority scheduling (as commonly defined) only addresses the CPU resource. This is a mistake. What value is there to immediately scheduling a task with a close deadline if the first thing that the task does is ask for a locked resource and therefore must wait? What is required is an integrated approach to CPU scheduling and resource allocation.
Fourth, there is no easy way to obtain an overall performance evaluation; simulation is normally used. The scheduling decisions made by the distributed scheduling schemes to be discussed here are based on task deadline and resource requirements. The notion of guarantee underlies all the scheduling decisions: when a task arrives at a node, the local scheduler at that node attempts to guarantee that the task completes execution before its deadline, on that node. If the attempt fails, the scheduling components on individual nodes cooperate to determine which other node in the system has sufficient resource surplus to guarantee the task. If such a node is not found, corrective action can be attempted before the deadline is missed. This guarantee-based scheme significantly improves the predictability and fault tolerance properties of the system. The performance metric of interest here is the guarantee ratio, defined as the ratio of the number of tasks which can be guaranteed to complete before their deadlines to the number of tasks which are invoked.² The goal of the scheduling algorithms is to maximize the guarantee ratio on a systemwide basis. It is important to note that this metric is not applied to critical tasks. Critical tasks should be scheduled with static scheduling schemes, thereby achieving 100% guarantees, subject to the assumptions. Due to the real-time constraints on tasks, the scheduling algorithm itself should be efficient. That is, the scheduling delay must be minimized and the scheduling overheads incurred by the system should be minimized. This implies that decisions, such as whether a task can be completed before its deadline as well as the node to which the task should be sent, must be made efficiently. The problem of determining an optimal schedule even in a multi-processor system is known to be NP-hard. A distributed system introduces further problems due to communication delays.
²If tasks belong to different categories, or have different levels of importance, it is possible to define a suitable weighted guarantee ratio.

These factors often necessitate a heuristic approach to dynamic distributed real-time scheduling. The basic strategy for scheduling essential hard real-time tasks in a loosely coupled dynamic distributed system is as follows. Assume that when a task is invoked (this is also referred to as the task arriving) at a node of a distributed system, the task's deadline, computation time, and resource requirements are known. When the task arrives, the scheduler component local to that node decides if the new task can be guaranteed at this node. The guarantee means that, no matter what happens (except failures), this task will finish execution by
its deadline, and that all previously guaranteed tasks still meet their deadlines.³ If the new task cannot be guaranteed locally, then the scheduling components on individual nodes cooperate to determine whether another node has sufficient surplus in all those resources required by the task to guarantee it; if such a node exists, the task is sent to that node; otherwise the task is rejected. Note that other options are possible, but not discussed here. Algorithms for local guarantee under resource and timing requirements have already been described in Chapters 5 and 6. This chapter now focuses on the algorithms for selecting a remote node to which a task should be sent when the task cannot be guaranteed locally. For presenting the global scheduling algorithms, consider the following system model. There are n nodes, N1, N2, ..., Nn, in a loosely coupled distributed system. Let each node contain a set of distinct resources, R1, R2, ..., Rr. A resource is an abstraction and can include CPU, I/O devices, main memory resident files, data structures, etc. A resource is active if it has processing power; otherwise, it is passive. For example, a CPU or a physical device is active, but a file is passive. Thus, a passive resource must always be used with some active resource. Some resources can be (simultaneously) shared by multiple tasks while others, such as a CPU, have to be assigned exclusively to one task. Further, if a sharable resource, such as a file, is modified by a task, the resource should be exclusively assigned to the task. A task is a scheduling entity and its execution cannot be preempted. The following characteristics of a task, T, are assumed known when it arrives:
• the worst case computation time, C(T);

• the deadline, D(T), by which the task must complete;

• the resource requirements of the task. It is assumed that a task needs all its resources throughout its execution. A task requests at least one active resource and zero or more passive resources.
³If a task's execution has to be guaranteed in spite of failures, a guarantee should be obtained from multiple nodes. Such multiple guarantees are not discussed.

There are two types of tasks: nonperiodic tasks and periodic tasks. A nonperiodic task arrives at any node dynamically and has to be executed before its deadline. An instance of a periodic task with period P should be executed once
every P units of time. Periodic tasks are assigned to nodes at system initialization time and their timely execution is guaranteed. Since periodic tasks remain on the node where they are initially assigned, this part of this Chapter focuses on the distributed scheduling of nonperiodic tasks. The effect of periodic tasks on local scheduling is, however, taken into account. In addition to resource requirements and timing constraints, tasks in real-time systems are characterized by their priority and precedence constraints. The priority of a task encodes its level of importance relative to other tasks. Precedence constraints enter the picture when tasks communicate or when a complex task is viewed in terms of a number of subtasks related by precedence constraints. Whereas distributed scheduling of tasks with precedence constraints is the subject of [6, 8] and prioritized tasks are considered in [5], this section focuses on tasks that are independent and have equal priority. This focus was chosen in order to carefully study a set of algorithms that vary in their complexity, but have general applicability. On each node, there are 3 components involved. The local scheduler handles scheduling of tasks that arrive at a given node. The dispatcher invokes the next task to be executed based on the schedule determined by the local scheduler. The global scheduler interacts with the schedulers on other nodes in order to perform distributed scheduling. Nodes are connected by a communication network. In describing the algorithms, no specific communication topology is assumed. However, depending on the topology of a given network, an algorithm could be optimized. Some of the algorithms presented here implicitly take topology information into account, for instance, in determining the nodes to which local state information is sent.
10.2.1 Algorithms For Global Scheduling
When a local task, T, arrives at a node Ni, the local scheduler of Ni is invoked to try to guarantee the newly arrived task on the node. If the task can be guaranteed, it is added to the schedule, which contains all the guaranteed tasks on the node. This section discusses five algorithms for dealing with a task that is not guaranteed locally. One plausible algorithm is the Non-Cooperative algorithm (NC). When a task cannot be guaranteed locally, it is rejected. No attempt is made to send the
task to other nodes. If all nodes are heavily loaded, a non-cooperative strategy is the best. The second possible algorithm is the Random Scheduling Algorithm (RSA). In this algorithm, when a task cannot be locally guaranteed, the node sends the task to a randomly selected node. The advantage of this algorithm is that it uses minimal communication overhead to determine the node to which a task should be sent. The disadvantage is that, because of the randomness, a task can easily be sent to an unsuitable node. A more informed choice can be made by a node if it has information about the state of other nodes, in particular, about the resources available on a node and their surplus. The three algorithms to be discussed next only consider a remote node as a candidate for receiving a task if that node contains the resources needed by the task. Among the nodes that meet this criterion, a node is chosen based on its surplus with respect to the required needs of the task. Each node periodically calculates its surplus and sends it to a subset of the nodes in the system. The node surplus provides information about the available time on resources, after taking into account the resource utilization of local tasks, i.e., the tasks that arrived at a node directly from the external environment and not from other nodes. A node's surplus is a vector, with one entry per resource on that node. Each entry indicates the total amount of time, in a (past) window, during which the resource was not used by local tasks. The window is a time interval [(t - WL), t], where t is the current time and WL is the length of the window, an adjustable parameter of the algorithm. For example, within the recent window, suppose 400 time units is the sum of the lengths of all the time intervals during which a resource R on a node is not used by any task. Then the surplus on that node for resource R is said to be 400 time units.
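The windowed surplus estimate can be sketched as follows; the busy-interval representation is an assumption (the text does not prescribe a data structure), and the numbers reproduce the 400-unit example above:

```python
# Surplus of one resource: the idle time within the window [t - WL, t],
# i.e., WL minus the time the resource was busy with local tasks.
# busy_intervals lists non-overlapping (start, end) busy periods.

def surplus(busy_intervals, t, WL):
    lo = t - WL
    busy = 0.0
    for s, e in busy_intervals:
        busy += max(0.0, min(e, t) - max(s, lo))  # overlap with the window
    return WL - busy

# Window of length 1000 ending at t = 1000; the resource was busy for
# 350 + 250 = 600 units inside it, so the surplus is 400.
print(surplus([(0, 350), (500, 750)], t=1000, WL=1000))  # 400.0
```

The full node surplus would be a vector of such values, one per resource on the node.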
WL should not be too short, since then it may reflect only the transient behavior of a node, and not too long, since then it may not reflect the changes that are of legitimate interest to other nodes. A node sorts other nodes according to the number of tasks received from them that were guaranteed on this node in the past time window. Then, according to this sorted node list, the node selects a subset of nodes to which to send information on its own current node surplus. The subset is chosen such that nodes in the subset potentially use this information in deciding whether or not to send a task to this node. Hence, the nodes which recently sent more tasks to this node are more likely to be selected. The above strategy minimizes the overheads of exchanging surplus information. One effect of this is that not all nodes have the
same state information about other nodes. If the network is small, the surplus information can be sent to all the other nodes. Recall that a node has an estimate of the surplus of a given resource on other nodes. Thus, knowing the resources needed by a task and the computation time of the task, the node can determine whether one or more nodes are in a position to meet the task's needs. The three algorithms discussed next differ in the way they select the node to which the task should be sent. The three algorithms are the bidding algorithm, the focussed addressing algorithm, and the flexible algorithm. First, a very high-level overview of the algorithms is presented. All the details necessary to fully understand the workings of these algorithms are then provided. It is also shown how the bidding algorithm and the focussed addressing algorithm are special cases of the flexible algorithm. The flexible algorithm is designed to reap the benefits of both focussed addressing and bidding and to overcome the shortcomings inherent in using each by itself. The Focussed Addressing (FA) algorithm works as follows. When node Ni has a task that is not locally guaranteed, it determines the node with the highest surplus in the resources needed by the task. If this surplus is greater than the Focussed Addressing Surplus (FAS), a tunable system parameter, the task is immediately sent to that node. If no such node is found, the task is rejected. In the Bidding algorithm, k nodes with sufficient surplus in the resources needed by the task are selected. The value of k is chosen to maximize the chances of finding a node for the task. A request-for-bid message is sent to these nodes. When a node receives the request-for-bid message, it calculates a bid, indicating the likelihood that the task can be guaranteed on the node, and if the bid is higher than MB, the minimum required bid, it sends the bid to the node which issued the request-for-bid.
After receiving the bids, Ni sends the task to the node offering the best bid. If no good bid is available for the task, it is assumed that no node in the network is able to guarantee the task.

The Flexible algorithm works as follows for a task T that is not locally guaranteed.

•

Ni selects k nodes with sufficient surplus in the resources needed by T. If the largest surplus among these k nodes is greater than FAS, the node with that surplus is chosen as the focussed node, and T is immediately sent to it. In addition to sending the task to the focussed node, Ni sends, in parallel, a request-for-bid
message to the remaining k - 1 nodes. The request-for-bid message also contains the identity of the focussed node, if there is one.
•
When a node receives the request-for-bid message, it calculates a bid, indicating the likelihood that T can be guaranteed on the node, and sends the bid to the focussed node if there is one; otherwise, it sends the bid to the original node that issued the request-for-bid.
•
When the task reaches the focussed node, that node first invokes its local scheduler to try to guarantee T. If it succeeds, all the bids for T are ignored. If it fails, the bids for T are evaluated and T is sent to the node responding with the highest bid. A message about whether and where T is finally guaranteed is sent to the original node, which then modifies its surplus information about other nodes accordingly.
•
In case there is no focussed node, the original node receives the bids for T and sends T to the node that offers the best bid.
•
If the focussed node cannot guarantee T and if there is no good bid available for T, then corrective actions can be taken.
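The sender-side control flow described by the steps above can be sketched as follows. This is an illustrative sketch, not the book's implementation: the function name and the representation of the surplus estimates as a simple node-to-ES(T, j) mapping are assumptions. The function returns the chosen focussed node (if any) and the list of nodes that would receive request-for-bid messages.

```python
def distribute_task(surplus_estimates, FAS, k):
    """Sketch of the flexible algorithm's sender side.

    surplus_estimates maps node id -> ES(T, j), the estimated number of
    instances of task T that node j can guarantee.  Returns a pair
    (focussed_node, bidders): focussed_node is None when no node's
    estimate exceeds FAS, in which case the scheme degenerates to pure
    bidding over the k candidate nodes.
    """
    # Rank candidate nodes by estimated surplus and keep the top k.
    candidates = sorted(surplus_estimates.items(),
                        key=lambda kv: kv[1], reverse=True)[:k]
    if not candidates:
        return None, []          # no candidates: corrective action needed

    best_node, best_es = candidates[0]
    focussed = best_node if best_es > FAS else None

    # The task goes to the focussed node immediately; the remaining
    # candidates receive request-for-bid messages in parallel (each
    # message would also carry the focussed node's identity).
    if focussed is not None:
        bidders = [node for node, _ in candidates[1:]]
    else:
        bidders = [node for node, _ in candidates]
    return focussed, bidders
```

With FAS = 1.5, for instance, a node whose estimate is 2.0 becomes the focussed node and only the other candidates are asked to bid; if no estimate exceeds FAS, all k candidates are asked to bid, which is exactly the pure bidding behavior.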
Next, the estimation and proper use of node surplus, the heuristics for choosing focussed nodes, the strategies for making bids, and the evaluation of bids are described. The details are provided in the context of the flexible algorithm. The conditions under which the flexible algorithm behaves like the focussed addressing and the bidding algorithms are also discussed.
Focussed Addressing and Requesting Bids

The global scheduler at each node of the distributed system is responsible for performing focussed addressing and requesting bids. For j = 1, ..., n and j ≠ i, the global scheduler on node Ni estimates

    ES(T, j) = number of instances of task T that node Nj can guarantee    (10.1)
This estimate is made according to the node surplus information available on node Ni, and it provides a good indication of the likelihood that a node can guarantee a given task. The global scheduler then uses this estimate to decide whether or not to try focussed addressing and/or bidding. For example, assume that the computation time of task T is 250 time units, and suppose node Ns is estimated to have a minimum surplus of 400 time units
on each of the resources needed by T in the time interval in which T must run. The surplus of Ns with respect to the resources needed by T is then 400, and ES(T, s) = 400/250 = 1.6.

The global scheduler on node Ni sorts the nodes according to their ES(T, j), in descending order. The first k nodes are selected to participate in focussed addressing and bidding. The value of k is chosen such that the sum of ES(T, j) over the k nodes is larger than or equal to SGS, the System-wide Guarantee Surplus, another tunable parameter of the system. If the first node Nf among the k nodes has ES(T, f) larger than FAS, the Focussed Addressing Surplus, node Nf is selected as the focussed node and the task is immediately sent to it. The remaining k - 1 nodes are sent request-for-bid messages in parallel, to handle the case where the focussed node cannot guarantee the task. A request-for-bid message includes information about the deadline and computation time of the task as well as the latest bid arrival time, i.e., the time by which bids should reach the focussed/requesting node to be eligible for further consideration. The latest bid arrival time for a task T, L(T), is estimated as follows:
    L(T) = D(T) - C(T) - (TD + SD)    (10.2)
where D(T) is the deadline of T, C(T) is the computation time of T, TD is the (network-wide) average transmission delay between two nodes, and SD is the average scheduling delay on a bidder node. Thus, on average, at or before L(T) there is sufficient time to send the task to a bidder node, have it scheduled there, and then execute it before its deadline.

System performance is sensitive to the values assigned to SGS and FAS. FAS should be set so that the chance of a focussed node guaranteeing a task is high. SGS should not be too high, otherwise too many messages are transmitted in the network; nor should it be too low, since this may result in too many tasks (i.e., tasks that were not locally scheduled) not being guaranteed, because request-for-bid messages may not be sent to the nodes that can guarantee the task.
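The two quantities above can be made concrete with a short sketch. The helper names are hypothetical; the first function is Equation 10.2 verbatim, and the second chooses k as described: the smallest prefix of the surplus-sorted node list whose ES(T, j) values sum to at least SGS.

```python
def latest_bid_arrival(D, C, TD, SD):
    """Equation 10.2: L(T) = D(T) - C(T) - (TD + SD)."""
    return D - C - (TD + SD)

def select_candidates(es, SGS):
    """Pick the fewest highest-surplus nodes whose ES values reach SGS.

    es maps node id -> ES(T, j).  The returned list is sorted by
    decreasing surplus, so its head is the focussed-node candidate.
    """
    ranked = sorted(es.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for node, surplus in ranked:
        chosen.append(node)
        total += surplus
        if total >= SGS:
            break
    return chosen
```

Using the book's example, a node with 400 units of surplus for a task with computation time 250 has ES = 1.6; with SGS = 2.5 the scheduler would also enlist the next-best node, so that the summed surplus crosses the threshold.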
Bidding, Bid Evaluation, and Response to an Awarded Task

When a node receives a request-for-bid message for a task, it calculates a bid for the task. The bid indicates the number of instances of the task the bidder node can guarantee. The calculation is done in two steps. First, an upper bound
of the bid, Max-Bid, is determined by:

    Max-Bid = Min(free time of each resource required by the task) / (computation time of the task)    (10.3)

The free time is calculated as the sum of the lengths of the free time slots between the estimated earliest arrival time of the task on this node and the task's deadline. The earliest arrival time is estimated optimistically as the sum of the current time, the minimum message delay in transmitting the bid, and the minimum message delay in sending the task to the bidder. Max-Bid is thus calculated optimistically as the best possible bid that this node can make, i.e., assuming ideal availability of the resources that the task needs: all the time slots when a resource is idle appear together, and all the needed resources are concurrently idle.

The second step calculates the actual bid. In this step, a binary search between 0 and Max-Bid is performed. In each stage of the binary search, a given number of instances of task T are temporarily inserted into the current schedule of this node, and it is checked whether the inserted instances can be guaranteed. The maximum number of instances of the remote task T that this node can actually guarantee without jeopardizing previously guaranteed tasks is obtained at the end of the search. This number, if above the minimum bid (MB), a tunable parameter, becomes the bid. The bid is sent to the node selected for focussed addressing, if there is one; otherwise, it is sent to the original node that issued the request-for-bid message.

The inserted instances of the remote task are then removed from the schedule on the bidder's node. Hence the schedule on the bidder's node is not affected by the bid it makes; that is, a node does not reserve the resources needed by the tasks for which it bids. Since a node typically bids for multiple tasks and multiple bids are received for a task, reserving resources would result in pessimistic bids and hence could reduce system performance.
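The two-step bid calculation can be sketched as follows. Here `can_guarantee` stands in for the bidder's local scheduler, which tentatively inserts instances of T and reports whether they all fit; its name, like the argument format of `max_bid`, is an assumption made for illustration.

```python
def max_bid(free_time, comp_time):
    """Equation 10.3: optimistic upper bound on the bid.

    free_time maps each resource required by the task to its total free
    time before the deadline; comp_time is the task's computation time.
    """
    return min(free_time.values()) // comp_time

def actual_bid(can_guarantee, upper, MB):
    """Binary search on [0, upper] for the largest instance count the
    local scheduler accepts.  Returns 0 if the result is not above MB,
    in which case no bid is sent."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2       # bias upward so the loop terminates
        if can_guarantee(mid):
            lo = mid                    # mid instances fit; try for more
        else:
            hi = mid - 1                # too many; shrink the range
    return lo if lo > MB else 0
```

Because `can_guarantee` only inserts instances tentatively, the bidder's own schedule is left untouched, matching the no-reservation policy described above.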
When a node receives a bid for a given task and the bid is higher than High Bid (HB), a tunable parameter, the node awards the task to the bidding node immediately, and all other bids for this task, whether they arrived earlier or arrive later, are discarded. If all the bids that have arrived for a given task are lower than HB, the node postpones the awarding decision until L(T), the latest bid arrival time of the task. At time L(T), the task is awarded to the highest bidder, if any.

When the awarded task arrives at the highest bidder, the local scheduler on that node is invoked to see if the task can be guaranteed. Clearly, the state of
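The awarding policy can be sketched as a small state machine. The class and method names are illustrative assumptions; the two entry points mirror the two decision points in the text: bid arrival (immediate award above HB) and the timeout at L(T) (award to the highest bidder seen so far).

```python
class BidCollector:
    """Collects bids for one task and applies the HB awarding rule."""

    def __init__(self, HB):
        self.HB = HB
        self.bids = {}
        self.awarded = None

    def on_bid(self, node, bid):
        """Called when a bid arrives; returns the awardee or None."""
        if self.awarded is not None:
            return None                  # late bids are discarded
        if bid > self.HB:
            self.awarded = node          # award immediately
            return node
        self.bids[node] = bid            # hold until L(T)
        return None

    def on_deadline(self):
        """Called at time L(T); awards to the highest bidder, if any."""
        if self.awarded is None and self.bids:
            self.awarded = max(self.bids, key=self.bids.get)
        return self.awarded
```

Note that an award is only provisional: as the text explains next, the winning node re-runs its local scheduler on arrival of the task, and the task may still be rejected there.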
the node may have changed since the bid was made, and since the resources needed by the task were not reserved, the task may or may not be guaranteed. If the task is not guaranteed, it is rejected.

Note that this algorithm requires five tunable parameters: WL, FAS, SGS, MB, and HB. If this seems like many, consider the multi-level feedback queue often found in commercial operating systems. In that algorithm many parameters must also be chosen, including the number of levels by which to raise the priority for each I/O device, the number of levels by which to lower the priority upon hitting the end of a time slice, the policies for execution within each priority level, the total number of priority levels, and so on.
Dealing with Unguaranteed Tasks

In all these algorithms, the action to be taken when a task is not guaranteed depends on the application requirements as well as on the characteristics of the task. If the task has sufficient laxity, another attempt at global scheduling may be made; however, this increases the scheduling and communication overheads. In general, the invoker of a task that is not guaranteed may invoke the same task with an extended deadline, or may invoke another task that produces less precise results but at lower computational cost.
Relationship between the Algorithms

Note that if FAS is small and SGS is equal to FAS, then k is at most 1 and the flexible algorithm behaves like the focussed addressing algorithm. If FAS is large, it behaves like the bidding algorithm. Thus, the flexible algorithm combines features from focussed addressing and bidding, utilizing them in an opportunistic manner. The advantage of the flexible algorithm lies in the fact that its choice of focussed addressing, bidding, or both is made on a per-task basis. For example, for a particular task, if a focussed node cannot be found, a rather large subset of nodes, perhaps all the nodes in the network, is sent the request-for-bid message; in this case, the scheme converges to the bidding scheme. On the other hand, if the surplus of the focussed node is sufficiently large, the subset of nodes to which the request-for-bid message is sent can be relatively small, perhaps even empty; in this case, the scheme converges to focussed addressing.

In summary, the focussed addressing, bidding, and flexible algorithms may be used in limited situations where missing deadlines is not critical. These
algorithms are good examples of distributed system algorithms being modified to explicitly address timing constraints.
10.2.2  Performance Evaluation
In [25] a simulation-based performance analysis of these algorithms was performed. The simulations show the effectiveness of these algorithms for the situations in which they apply. The simulations assumed a six-node network with both periodic and aperiodic tasks, where tasks required general resources in addition to the CPU. Message delays and task-moving delays were incorporated into the performance study. This set of complex requirements is often found in realistic systems. The performance results demonstrated the degradation in performance due to increased message delays, but the flexible algorithm was able to ameliorate this degradation by applying focussed addressing as delays became greater. Overall, the flexible algorithm was shown to be better than all the others. Its performance was also shown to be close to that of a perfect-state-information algorithm that ideally assumed zero delays and accurate state information about the remote sites. Performance was studied over various distributions of task laxity, message delay, and computation time laxity. See [25] for a full discussion of these results, and see [6, 27, 8, 31] for other related results. It is important to note that in some of these other studies it was also shown that the random algorithm is just as good as the flexible algorithm in many situations.
10.3  SUMMARY
Distributed real-time scheduling is an important area, but few formal results exist. For static scheduling, some solutions based on EDF as well as other heuristics do exist; these solutions can handle quite sophisticated sets of task requirements. This chapter presents one such algorithm; others can be found in [3, 16, 17, 21]. For dynamic scheduling, much more work is needed. Presented here are solutions for independent dynamic tasks based on focussed addressing, bidding, and a combined solution called the flexible algorithm. Such results must be extended to include more sophisticated sets of task characteristics, including sets of communicating tasks, and lower-overhead solutions. Readers may want to investigate [1, 6, 7, 8, 9, 23, 24, 25, 26, 27, 29] for additional work in this area.
We are also beginning to see very exciting work on applying EDF to admission control in network communications, to support continuous traffic characterized by peak rate, average rate, and burst size. For example, algorithms and associated analysis have been developed for flow admission control across a set of EDF message queues that reside across a network [12, 18]. Discussion of these results is beyond the scope of this book, but they further illustrate the wide applicability of EDF scheduling.
REFERENCES
[1] T. Abdelzaher and K. Shin, "Optimal Combined Task and Message Scheduling in Distributed Real-Time Systems," Proc. of the IEEE Real-Time Systems Symposium, 1995.

[2] G. Agrawal, B. Chen, W. Zhao, and S. Davari, "Guaranteeing Synchronous Message Deadlines with the Timed Token Medium Access Control Protocol," IEEE Trans. on Computers 43(3), March 1994.

[3] N. Audsley, A. Burns, M. Richardson, K. Tindell, and A. Wellings, "Applying New Scheduling Theory to Static Priority Pre-emptive Scheduling," Software Engineering Journal, September 1993.

[4] R. Bettati and J. Liu, "End-to-End Scheduling to Meet Deadlines in Distributed Systems," Proc. of the 12th Distributed Computing Systems Conference, 1992.

[5] S. Biyabani, J. Stankovic, and K. Ramamritham, "The Integration of Deadline and Criticalness in Hard Real-Time Scheduling," Proc. Real-Time Systems Symposium, December 1988.

[6] S. Cheng, J. Stankovic, and K. Ramamritham, "Scheduling Groups of Tasks in Distributed Hard Real-Time Systems," IEEE Trans. on Computers, November 1987.

[7] W. Chu, C. Sit, and K. Leung, "Task Response Time For Real-Time Distributed Systems With Resource Contentions," IEEE Trans. on Software Engineering 17(10), October 1991.

[8] M. Di Natale and J. Stankovic, "Dynamic End-to-end Guarantees in Distributed Real-Time Systems," Proc. of the IEEE Real-Time Systems Symposium, 1994.

[9] M. Di Natale and J. Stankovic, "Applicability of Simulated Annealing Methods to Real-Time Scheduling and Jitter Control," Proc. Real-Time Systems Symposium, December 1995.
[10] K. Efe, "Heuristic Models of Task Assignment Scheduling in Distributed Systems," IEEE Computer, June 1982, pp. 50-56.

[11] D. Ferrari, "A New Admission Control Method for Real-Time Communication in an Internetwork," in S. Son, Ed., Advances in Real-Time Systems, Prentice-Hall, Englewood Cliffs, NJ, 1995.

[12] V. Firoiu, J. Kurose, and D. Towsley, "Efficient Admission Control for EDF Schedulers," Proc. IEEE INFOCOM '97, Kobe, Japan, April 1997.

[13] M. Garey and D. Johnson, "Complexity Results for Multiprocessor Scheduling Under Resource Constraints," SIAM Journal of Computing, 4, 1975.

[14] R. Jain, FDDI Handbook: High-Speed Networking Using Fiber and Other Media, Addison-Wesley, Reading, Massachusetts, 1994.

[15] H. Kasahara and S. Narita, "Practical Multiprocessor Scheduling Algorithms for Efficient Parallel Processing," IEEE Trans. on Computers, Vol. C-33, No. 11, November 1984, pp. 1023-1029.

[16] H. Kopetz et al., "Distributed Fault-Tolerant Real-Time Systems: The MARS Approach," IEEE Micro 9(1), February 1989.

[17] C. Koza, "Scheduling of Hard Real-Time Tasks in the Fault Tolerant, Distributed, Real-Time System MARS," Proc. 4th IEEE Workshop on Real-Time Operating Systems, pp. 31-36, July 1987.

[18] J. Liebeherr, D. Wrege, and D. Ferrari, "Exact Admission Control for Networks with Bounded Delay Services," IEEE/ACM Trans. on Networking, Vol. 4, No. 6, pp. 885-901, December 1996.

[19] N. Malcolm and W. Zhao, "Guaranteeing Synchronous Messages with Arbitrary Deadline Constraints in an FDDI Network," Proc. of the IEEE Conf. on Local Computer Networks, 1993.

[20] D. Niehaus, "Program Representation and Execution in Real-Time Multiprocessor Systems," PhD Thesis, University of Massachusetts, January 1994.

[21] D. Peng and K. Shin, "Static Allocation of Periodic Tasks with Precedence," in Distributed Computing Systems, IEEE, June 1989.

[22] K. Ramamritham, "Allocation and Scheduling of Precedence-Related Periodic Tasks," IEEE Trans. on Parallel and Distributed Systems 6(4), April 1995.
[23] K. Ramamritham, "Allocation and Scheduling of Complex Periodic Tasks," Proc. 10th International Conference on Distributed Computing Systems, Paris, France, June 1990.

[24] K. Ramamritham and J. Stankovic, "Dynamic Task Scheduling in Distributed Hard Real-Time Systems," IEEE Software, 1(3):65-75, July 1984.

[25] K. Ramamritham, J. Stankovic, and W. Zhao, "Distributed Scheduling of Tasks with Deadlines and Resource Requirements," IEEE Trans. on Computers, 38(8):1110-23, August 1989.

[26] K. Ramamritham and J. Adan, "Load Balancing During the Static Allocation and Scheduling of Complex Periodic Tasks," TR, Univ. of Massachusetts, October 1990.

[27] J. Stankovic, K. Ramamritham, and S. Cheng, "Evaluation of a Flexible Task Scheduling Algorithm for Distributed Hard Real-Time Systems," IEEE Trans. on Computers, C-34(12):1130-43, 1985.

[28] J. Stankovic and K. Ramamritham, "The Spring Kernel: A New Paradigm for Real-Time Systems," IEEE Software, May 1991, pp. 62-72.

[29] K. Tindell and J. Clark, "Holistic Schedulability Analysis for Distributed Real-Time Systems," Microprocessing and Microprogramming, 40:117-134, 1994.

[30] S. Zhang and A. Burns, "An Optimal Synchronous Bandwidth Allocation Scheme for Guaranteeing Synchronous Message Deadlines with the Timed-Token MAC Protocol," IEEE/ACM Trans. on Networking 3(6), December 1995.

[31] W. Zhao, K. Ramamritham, and J. Stankovic, "Scheduling Tasks with Resource Requirements in Hard Real-Time Systems," IEEE Trans. on Software Engineering, SE-12(5):567-77, 1987.
11 SUMMARY AND OPEN QUESTIONS
In this final chapter a brief summary and a list of open questions are presented. Throughout the text many books, articles, and journals have been referenced. Readers are encouraged to pursue some of these in depth since rich and useful intellectual challenges await.
11.1  SUMMARY
Scheduling for real-time systems is generally based on either cyclic scheduling, the rate monotonic (RM) algorithm, or the earliest deadline first (EDF) algorithm. Cyclic schedules are difficult to create and maintain. The RM approach has been fully developed, and many of its results have been collected in an excellent handbook [2]. Prior to this EDF-based book, no such collection of results for the EDF algorithm existed. In this book a comprehensive set of results on EDF and its extensions is presented. Many of these extensions follow the key extensions to the basic RM algorithm, but with EDF as the underlying policy instead of RM. For example, priority inheritance, priority ceiling, the stack resource policy, and holistic scheduling were all solutions developed first under the RM paradigm; they have all been extended for use with EDF. While a wide variety of key results are presented in this book, many other results on EDF scheduling exist. Further, because of the popularity of both RM and EDF, new results appear continuously. In particular, the widespread use of distributed multimedia, where audio and video data are transmitted, is creating new interest in both RM and EDF.
EDF, being a dynamic scheduling algorithm, is preferable to RM for dynamic real-time systems. For static real-time systems the choice between EDF and RM is not always clear. RM is a fixed-priority scheme and has low implementation costs in most cases (although some of the extensions of RM may require fairly complicated implementations). EDF is a dynamic scheme and theoretically requires greater implementation costs than RM. However, for relatively small task sets the actual implementation cost of EDF is low and can be very similar to that of RM. In these cases, with RM and EDF having similar implementation costs, using EDF provides greater utilization. The case study described at the end of Chapter 4 is one example where such extra utilization makes a task set that is unschedulable under RM schedulable under EDF.

This book carefully defines key terms used in real-time scheduling. It presents fundamental results regarding the optimality of EDF and feasibility testing. The book describes how to compute worst-case execution times for tasks and how this can be used to analyze many types of situations, including deadline tolerance, release jitter, sporadically periodic tasks, tick scheduling, non-preemptive tasks, and distributed systems. The use of EDF in a planning mode is described. This is valuable for all those systems that require a form of admission control or early warning of the possibility that a deadline might be missed. This approach is also useful in dealing with overloads. In fact, planning-based real-time scheduling can be considered a fourth key real-time scheduling paradigm, with cyclic scheduling, RM, and EDF (not used in planning mode) being the other three. Extending EDF to handle shared resources is possibly the most important extension, because it deals with how to analyze a system in the presence of blocking, a most common and practical issue. Results are also presented that integrate solutions for both precedence constraints and shared resources.
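The utilization argument can be illustrated with the classical schedulability tests. The sketch below uses the well-known Liu and Layland bounds for independent, preemptive periodic tasks with deadlines equal to periods: EDF succeeds exactly when total utilization is at most 1, while the standard RM test is only sufficient and its bound shrinks toward ln 2 ≈ 0.69 as the number of tasks grows.

```python
def utilization(tasks):
    """tasks is a list of (C, T) pairs: computation time and period."""
    return sum(c / t for c, t in tasks)

def edf_schedulable(tasks):
    # Necessary and sufficient for independent, preemptive periodic
    # tasks with deadlines equal to periods.
    return utilization(tasks) <= 1.0

def rm_sufficient(tasks):
    # Liu-Layland sufficient (not necessary) test for RM:
    # U <= n * (2^(1/n) - 1).
    n = len(tasks)
    return utilization(tasks) <= n * (2 ** (1 / n) - 1)
```

For example, the task set {(2, 5), (3, 6), (1, 12)} has U ≈ 0.983: it passes the EDF test but fails the RM sufficient test (whose bound for three tasks is ≈ 0.780), illustrating the extra utilization EDF can deliver.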
Most dynamic real-time systems have both periodic and aperiodic tasks, and key results are presented that permit analysis of these systems. All of these results (presented in the first 8 chapters) deal with uni-processor or multi-processor systems. In Chapters 9 and 10, three main results are presented for distributed real-time systems. The first solution is an extension of the holistic scheduling approach (which was originally based on RM); in the extension, EDF is used as the underlying algorithm. In order to obtain end-to-end analysis, the solution makes use of recursion, precedence constraints, and worst-case computation time evaluation. The second algorithm for distributed systems handles the allocation and scheduling of a complex task set that includes deadline, resource, communication, and fault tolerance requirements; this is a heuristic approach, and the algorithm runs off-line. The third algorithm is an on-line distributed scheduling algorithm that uses focussed addressing and bidding modified to consider timing issues. For further information on real-time scheduling the reader should investigate the following survey articles [1, 3, 4].
11.2  OPEN QUESTIONS
In general, it is very difficult to codify scheduling knowledge because there are many performance metrics, task characteristics, and system configurations, and many algorithms handle various combinations of task characteristics. In spite of the recent advances and all the results presented in this book, there are still gaps in the solution space and there is a need to integrate the available solutions. A list of issues to consider in various combinations includes:
• preemptive versus non-preemptive tasks,
• uni-processors versus multi-processors,
• using EDF at dispatch time versus EDF-based planning,
• precedence constraints among tasks,
• resource constraints,
• periodic versus aperiodic versus sporadic tasks,
• scheduling under overload,
• fault tolerance requirements, and
• providing guarantees and supporting levels of guarantees (meeting quality of service requirements).
Results in some areas of EDF scheduling are somewhat limited and, yet, these areas are very important. These areas include:

• multi-processor EDF scheduling,
• distributed EDF scheduling,
• detecting and avoiding anomalous behavior,
• dealing with overload,
• integrating fault tolerance and reliability into EDF scheduling,
• using EDF algorithms in real-time databases, and
• developing statistical guarantees needed for transmission of audio and video data under time constraints.
REFERENCES
[1] S. Cheng, J. Stankovic, and K. Ramamritham, Hard Real-Time Systems, Chapter 5.1 on Scheduling, IEEE CS Press, Los Alamitos, Calif., 1988.

[2] M. Klein et al., A Practitioner's Handbook for Real-Time Analysis, Kluwer Academic Publishers, Boston, 1993.

[3] K. Ramamritham and J. Stankovic, "Scheduling Algorithms and Operating Systems Support for Real-Time Systems," Proceedings of the IEEE, Vol. 82, No. 1, pp. 55-67, January 1994.

[4] J. Stankovic, M. Spuri, M. Di Natale, and G. Buttazzo, "Implications of Classical Scheduling Results for Real-Time Systems," IEEE Computer, Vol. 28, No. 6, pp. 16-25, June 1995.
INDEX

A
Admission Control, 21, 101, 106
Applications, 2, 4, 6, 82
Asynchronous Task Set, 14, 28
Avoid Contention, 122

B
Best Effort, 21, 101
Bidding, 251
Busy Period, 27, 33, 47, 61, 68, 207

C
Chained Blocking, 140, 143
Clairvoyant Scheduling, 96
Clustering, 233, 235
Communication Delays, 199, 211, 231, 236
Competitive Factor, 96, 98
Critical Section, 121, 127, 131
Cyclic Scheduling, 22, 263

D
Deadline Tolerance, 74, 103
Deadline, 15-16
Deadlock, 142-143
Distributed Scheduling, 198, 229, 247
Domino Effects, 87, 101
Dynamic Priority Exchange, 170
Dynamic Sporadic Server, 175, 177

E
EDL Server, 170, 182, 187
End-to-End, 200, 219

F
Firm Deadline, 15
Flexible Algorithm, 251
Focussed Addressing, 251

G
Guarantee, 87, 99, 110, 114, 174, 246

H
Hard Deadline, 15
Holistic Scheduling, 68, 199, 202, 264
Hybrid Task Set, 15

I
Improved Priority Exchange, 187

M
Metrics, 22, 94, 247
Misconceptions, 4

N
Network, 200, 204, 207, 211, 230
Normality, 155
NP-hard, 24, 28-29, 38, 42, 59, 90, 121, 231

O
Open Questions, 263, 265
Overloads, 87, 90, 98, 101, 201

P
Planning, 98, 100, 106, 111, 122, 144
Predictability, 3, 20-21
Priority Ceiling, 126, 140-141, 159
Priority Inheritance, 126-127, 129-130, 159
Priority Inversion, 122-123, 125, 140, 159

Q
Quasi-Normality, 152, 155, 157

R
Random Scheduling Algorithm, 250
RED Algorithm, 103
Release Jitter, 51, 68, 76, 202, 204, 210
Release Time, 15-16
Replication, 231
Resource Reclaiming, 115, 174

S
Sensors, 5-6, 19
Soft Deadline, 15
Spring Algorithm, 106, 200, 233
Stack Resource Policy, 142, 159
Static Scheduling, 19, 21, 229
Synchronous Task Set, 14, 28

T
Table Driven, 20
Tick Scheduling, 55, 77, 82
Total Bandwidth Server, 179

V
Value Density, 93, 97
Value, 93

W
Worst Case Response Time, 67, 76
Many real-time systems rely on static scheduling algorithms. These include cyclic scheduling, rate monotonic scheduling, and fixed schedules created by off-line scheduling techniques such as dynamic programming, heuristic search, and simulated annealing. However, for many real-time systems, static scheduling algorithms are quite restrictive and inflexible. For example, highly automated agile manufacturing, command, control and communications, and distributed real-time multimedia applications all operate over long lifetimes and in highly non-deterministic environments. Dynamic real-time scheduling algorithms are more appropriate for these systems and are used in such systems. Many of these algorithms are based on earliest deadline first (EDF) policies. There exists a wealth of literature on EDF-based scheduling with many extensions to deal with sophisticated issues such as precedence constraints, resource requirements, system overload, multi-processors, and distributed systems.

DEADLINE SCHEDULING FOR REAL-TIME SYSTEMS: EDF and Related Algorithms aims at collecting a significant body of knowledge on EDF scheduling for real-time systems, but it does not try to be all-inclusive (the literature is too extensive). The book primarily presents the algorithms and associated analysis, but guidelines, rules, and implementation considerations are also discussed, especially for the more complicated situations where mathematical analysis is difficult. In general, it is very difficult to codify and taxonomize scheduling knowledge because there are many performance metrics, task characteristics, and system configurations. Also adding to the complexity is the fact that a variety of algorithms have been designed for different combinations of these considerations. In spite of the recent advances there are still gaps in the solution space and there is a need to integrate the available solutions. A list of issues to consider in various combinations includes:
• preemptive versus non-preemptive tasks,
• uni-processors versus multi-processors,
• using EDF at dispatch time versus EDF-based planning,
• precedence constraints among tasks,
• resource constraints,
• periodic versus aperiodic versus sporadic tasks,
• scheduling during overload,
• fault tolerance requirements, and
• providing guarantees and levels of guarantees (meeting quality of service requirements).
DEADLINE SCHEDULING FOR REAL-TIME SYSTEMS: EDF and Related Algorithms should be of interest to researchers, real-time system designers, and instructors and students, either as the text for a focussed course on deadline-based scheduling for real-time systems or, more likely, as part of a more general course on real-time computing. The book serves as an invaluable reference in this fast-moving field.
KLUWER ACADEMIC PUBLISHERS SECS460
0-7923-8269-2