Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany
6955
Ralf Lehnert (Ed.)
Energy-Aware Communications 17th International Workshop, EUNICE 2011 Dresden, Germany, September 5-7, 2011 Proceedings
Volume Editor
Ralf Lehnert
Technische Universität Dresden
Institut für Nachrichtentechnik
01062 Dresden, Germany
E-mail: [email protected]
ISSN 0302-9743 e-ISSN 1611-3349
ISBN 978-3-642-23540-5 e-ISBN 978-3-642-23541-2
DOI 10.1007/978-3-642-23541-2
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2011935040
CR Subject Classification (1998): C.2, H.4, D.2, H.3, F.2, K.4
LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI
© Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
It is my great pleasure to welcome you to the proceedings of the 17th EUNICE workshop, held in Dresden. EUNICE has a long tradition of bringing together young researchers in communication network modeling and design from all over Europe. The single-track structure with sufficient time for presentations has always provided a platform for stimulating discussions. This year's focus was the timely topic of "energy-aware communications." Communication networks today account for 2% of the worldwide emissions of CO2, with an exponentially rising trend. Currently, electrical energy is generated in a centralized fashion in a few power plants and distributed to the user. Efficient use of energy requires distributed generation and distributed control of alternative energy generators. Therefore, the power grid has to become a so-called smart grid, which in turn requires a communication network for controlling the power grid. Smart grid communications can be realized by all communication technologies, wired or wireless. Applications range from meter reading to home automation and entertainment and the control of distributed power plants. EUNICE 2011 addressed research issues of energy-aware communication networks and communications for smart grids. EUNICE 2011 featured three keynotes, on smart planet communications, network coding, and resource allocation. Sixteen full papers in seven sessions were accepted. Furthermore, there was a session with seven poster presentations of ongoing research. My deep thanks go to our sponsors Comarch, Cracow, Poland; Detecon, Bonn, Germany; and Elcon Systemtechnik, Hartmannsdorf, Germany. Their generous support helped to reduce the registration fees significantly. Many people worked hard to prepare this workshop: I would like to recognize the work of Stanislav Mudriievskyi, Stefan Türk, Volker Richter, Roland Schingnitz, Rico Radeke, and Jorge Robles.

September 2011
Ralf Lehnert
Organization
EUNICE 2011 was organized by the Chair for Telecommunications, Technische Universität Dresden.
Executive Committee

Conference Chair:
Ralf Lehnert, Chair for Telecommunications, TU Dresden, Germany

Local Organization:
Stanislav Mudriievskyi, TU Dresden, Germany
Martin Schuster, TU Dresden, Germany
Stefan Türk, TU Dresden, Germany
Volker Richter, TU Dresden, Germany
Rico Radeke, TU Dresden, Germany
Jorge Robles, TU Dresden, Germany

Finance Chair:
Roland Schingnitz, TU Dresden, Germany
Technical Program Committee
Finn Arve Aagesen, NTNU, Trondheim, Norway
Sebastian Abeck, KIT, Karlsruhe, Germany
Marco Ajmone Marsan, University of Milan, Italy
Laurie Cuthbert, University of London, UK
Jörg Eberspächer, TU München, Germany
Claudia Eckert, TU München, Germany
Markus Fiedler, BIT, Blekinge, Sweden
Carmelita Görg, University of Bremen, Germany
Annie Gravey, TELECOM Bretagne, France
Jarmo Harju, Tampere University, Finland
Yvon Kermarrec, TELECOM Bretagne, France
Paul Kühn, University of Stuttgart, Germany
Oivind Kure, NTNU, Trondheim, Norway
Ralf Lehnert, TU Dresden, Germany
Maurizio Munafo, PT Turin, Italy
Miquel Oliver, University of Pompeu Fabra, Spain
Michal Pioro, University of Warsaw, Poland
Aiko Pras, University of Twente, The Netherlands
Burkhard Stiller, University of Zürich, Switzerland
Robert Szabo, Budapest University of Technology, Hungary
Andreas Timm-Giel, TU Harburg, Germany
Samir Tohme, University of Versailles, France
Sponsors Platinum sponsor: Comarch, Cracow, Poland
Gold sponsor: Detecon, Bonn, Germany
Silver sponsor: Elcon Systemtechnik, Hartmannsdorf, Germany
Other sponsors: Die Informationstechnische Gesellschaft im VDE (ITG), Germany
EUNICE
Technische Universität Dresden, Germany
Chair for Telecommunications, TU Dresden, Germany
Table of Contents

Keynote Talks

Physical Layer Network Coding for Improved Energy Efficiency ......... 3
   Eduard Jorswieck

A Sense of a Smarter Planet .......................................... 4
   Matthias Kaiserswerth

Resource Management in a New Green-IT World .......................... 5
   Mauro Biagi

Session Papers

Network Architectures

On the Benefit of Forward Error Correction at IEEE 802.11 Link Layer
Level ................................................................ 9
   Floris van Nee and Pieter-Tjerk de Boer

Simple Modifications in HWMP for Wireless Mesh Networks with Smart
Antennas ............................................................ 21
   Muhammad Irfan Rafique, Marco Porsch, and Thomas Bauschert

Ad-Hoc and Wireless Networks

On the Evaluation of Self-addressing Strategies for Ad-Hoc Networks . 31
   Ricardo de O. Schmidt, Aiko Pras, and Reinaldo Gomes

Considerations in the Design of Indoor Localization Systems for
Wireless Sensor Networks ............................................ 43
   Jorge Juan Robles

System Simulation

Backoff Algorithms Performance in Burst-Like Traffic ................ 54
   Ievgenii Tsokalo, Yamnenko Yulia, and Stanislav Mudriievskyi

New IEEE 802.16-2009 Compliant Traffic Shaping Algorithms for WiMAX
Networks ............................................................ 65
   Volker Richter and Stefan Türk

Network Planning, Optimization, and Migration

Multiple-Layer Network Planning with Scenario-Based Traffic
Forecast ............................................................ 77
   Shu Zhang and Ulrich Killat

Optimization of Energy Efficient Network Migration Using Harmony
Search .............................................................. 89
   Stefan Türk and Rico Radeke

Self-management of Hybrid Networks: Introduction, Pros and Cons .... 100
   Tiago Fioreze and Aiko Pras

Traffic Engineering

Evaluation of Different Decrease Schemes for LEDBAT Congestion
Control ............................................................ 112
   Mirja Kühlewind and Stefan Fisches

Comparative Traffic Analysis Study of Popular Applications ......... 124
   Zoltán Móczár and Sándor Molnár

Flow Monitoring Experiences at the Ethernet-Layer .................. 134
   Rick Hofstede, Idilio Drago, Anna Sperotto, and Aiko Pras

Quality of Experience

A Survey of Quality of Experience .................................. 146
   Qin Dai

Investigation of Quality of Experience for 3D Streams in Gigabit
Passive Optical Network ............................................ 157
   Ivett Kulik and Tuan Anh Trinh

Energy Efficient Architectures

A SystemC-Based Simulation Framework for Energy-Efficiency
Evaluation of Embedded Networking Devices .......................... 169
   Daniel Horvath and Tuan Anh Trinh

Energy Considerations for a Wireless Multi-homed Environment ....... 181
   German Castignani, Nicolas Montavont, and Alejandro Lampropulos

Poster Session

Method for Linear Distortion Compensation in Metallic Cable Lines .. 195
   Albert Sultanov, Anvar Tlyavlin, and Vladimir Lyubopytov

Multimedia Services Differentiation in 4G Mobile Networks under Use
of Situational Priorities .......................................... 199
   Alexander Dyadenko, Olga Dyadenko, Larisa Globa, and
   Andriy Luntovskyy

Downlink Femtocell Interference in WCDMA Networks .................. 203
   Zoltán Jakó and Gábor Jeney

Techno-economic Analysis of Inhouse Cabling for FTTH ............... 209
   Navneet Nayan, Rong Zhao, and Kai Grunert

Impact of Incomplete CSI on Energy Efficiency for Multi-cell OFDMA
Wireless Uplink .................................................... 213
   Alessio Zappone, Giuseppa Alfano, Stefano Buzzi, and Michela Meo

An Efficient Centralized Localization Method in Wireless Sensor
Networks ........................................................... 217
   Mohamadreza Shahrokhzadeh, Abolfazl T. Haghighat, and
   Behrooz Shahrokhzadeh

Mechanisms for Distributed Data Fusion and Reasoning in Wireless
Sensor Networks .................................................... 221
   Ioannis Papaioannou, Periklis Stavrou, Anastasios Zafeiropoulos,
   Dimitrios-Emmanuel Spanos, Stamatios Arkoulis, and Nikolas Mitrou

Author Index ....................................................... 225
Physical Layer Network Coding for Improved Energy Efficiency Eduard Jorswieck TU Dresden, Germany
In 2000, the concept of network coding was introduced by Ahlswede et al., and it led to paradigm changes on several layers. On the physical layer, it changed the way interference is handled: interference is now exploited rather than avoided or canceled. On the network and transport layers, it changed the way packets are forwarded: instead of separate packets, combinations of packets flow through the network. The concept of network coding has been successfully applied to several research scenarios, ranging from ad-hoc and sensor networks to peer-to-peer file sharing.

(Figure: block diagram of a network-coded transmission chain, from information sources through the network-code and channel-code encoders at the transmitter to the corresponding decoders at the information sinks.)
First, we will review the basic concept of network coding using two basic example scenarios: the butterfly network and the two-way relaying network. Then, the two classes of network coding over finite fields (linear network coding, deterministic and random network coding) and physical layer network coding (also known as wireless network coding) are discussed. Finally, we apply the idea to a multiple antenna interference network scenario and show how the energy efficiency of the overall network can be significantly improved. The presentation concludes by pointing out open questions and giving an overview of relevant references.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, p. 3, 2011. © Springer-Verlag Berlin Heidelberg 2011
A Sense of a Smarter Planet Matthias Kaiserswerth IBM Research, Zurich, Switzerland
In the next couple of years, 2 billion people are expected to be connected to the Internet. At the same time, the instrumentation and interconnection of the world's human-made and natural systems is exploding, which could mean that there will soon be more things connected to the Internet than people. This Internet of Things promises to give people a much better understanding of how complex systems work, so they can be tinkered with to make them work better. But it also opens up a whole new sphere of opportunity: to mine the data to create anything from smarter traffic to smarter healthcare to smarter food supply chains.
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, p. 4, 2011. c Springer-Verlag Berlin Heidelberg 2011
Resource Management in a New Green-IT World Mauro Biagi Department of Information, Electronics and Telecommunications Engineering (DIET), Sapienza University of Rome
Wireless mobile, cabled xDSL, fiber-optical systems, and power line communications share a common problem: bandwidth and power efficiency. The spectrum-efficiency requirements differ between wireless and wireline systems, since the former must respect the recommendations of standardization bodies while the latter is mainly tied to channel-quality aspects; power efficiency, in contrast, is linked to electrical power generated from sources such as carbon, gas, oil, wind, solar, and water. Unfortunately, only a few of these sources are renewable, and they require institutional investments for the medium-to-far future. In the meantime, the only viable way forward appears to be efficient use of what is available. Hence the need, in a Green-IT vision, for schemes, architectures, protocols, and algorithms able to optimally manage bandwidth and power so as to guarantee the expected performance at a reasonable cost. This is the direction currently followed by the scientific community, both for already existing systems and for new technologies under development. Each optimization procedure with constraints on power/bandwidth usage, or with the goal of power minimization or spectrum allocation, falls within the framework of smart resource management; thus bit-loading for multi-carrier systems, coding and access techniques, and scheduling and diversity schemes can all be interpreted as instances of Green-IT. In this context, the dualism between convex (closed-form) optimization and recursive algorithms also arises, since the complexity of an algorithm has a power aspect of its own: the power used for processing. This keynote presents different approaches to efficiency, highlighting strengths and drawbacks for the present and near future.
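As one concrete instance of the constrained bandwidth/power management discussed above (our own sketch, not from the keynote), the classical water-filling allocation distributes a power budget across parallel sub-channels so that better channels receive more power; all numeric values below are illustrative:

```python
# Water-filling power allocation across parallel sub-channels, the
# classical solution of sum-rate maximization under a total power
# constraint. Channel gains and the power budget are made-up values.

def water_filling(gains, total_power, eps=1e-9):
    """Allocate `total_power` over sub-channels with SNR gains `gains`
    (gain = |h|^2 / noise power) to maximize sum of log(1 + g * p)."""
    # Bisect on the water level mu; the optimal p_i = max(0, mu - 1/g_i).
    lo, hi = 0.0, total_power + max(1.0 / g for g in gains)
    while hi - lo > eps:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    return [max(0.0, lo - 1.0 / g) for g in gains]

powers = water_filling([4.0, 1.0, 0.25], total_power=2.0)
assert abs(sum(powers) - 2.0) < 1e-6           # the budget is fully used
assert powers[0] > powers[1] > powers[2] >= 0  # better channels get more power
```

With these example gains the weakest sub-channel falls below the water level and receives no power at all, which is exactly the kind of smart spectrum/power decision the keynote frames as Green-IT.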
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, p. 5, 2011. c Springer-Verlag Berlin Heidelberg 2011
On the Benefit of Forward Error Correction at IEEE 802.11 Link Layer Level Floris van Nee and Pieter-Tjerk de Boer Centre for Telematics and Information Technology (CTIT) Faculty of Electrical Engineering, Mathematics and Computer Science University of Twente, Enschede, The Netherlands
[email protected],
[email protected]
Abstract. This study examines the error distribution of aggregated MPDUs in 802.11n networks and whether or not forward error correction like raptor coding at the link layer would be useful in these networks. Several experiments with Qualcomm 4x4 802.11n hardware were performed. Two devices were used in a data link, while a third device sniffed all transmitted packets. The collected data was analyzed and used to calculate the packet error rate which would be obtained if FEC was used in order to determine whether FEC is useful at the link layer. It is shown that the error distribution of A-MPDUs does not follow the binomial distribution. Because of this, the performance of FEC in real networks is worse than for theoretical cases where a binomial distribution is assumed. Therefore, other ways to decrease the packet error rate have more impact than forward error correction.
1 Introduction
Wireless Local Area Networks (WLAN) play an important role in home video networking, because they are cost-effective to install and easy to set up. However, there are still areas where WLAN needs to improve in order to deliver the performance needed to stream multiple High Definition (HD) streams simultaneously. One of the reasons for this is that video streaming poses different requirements on the transfer link than normal web browsing does. For example, a much lower Packet Error Rate (PER) is required, because video applications are more sensitive to errors in the communications link. Lost packets require retransmission, which causes higher latency and lower throughput. To the end user this may be noticeable as artifacts on the video display. Over the past few years there have been many studies on improving the link quality to allow streaming of multiple video streams. As latency is the most important aspect of video streaming, most studies focus on keeping latency as low as possible by preventing packet loss. One study tried to achieve this by adjusting the rate control algorithm to use the Signal-to-Noise Ratio (SNR) in determining the ideal transmission rate [4]. Another study used forward error correction at the application layer in order to correct errors which occur during transmission [1]. Both studies show some positive results; however, they do not
use devices which support the latest version of the IEEE 802.11 standard. This standard, IEEE 802.11n, introduces a couple of important improvements for transmitting large amounts of data which make it especially useful for video streaming. The most important improvements in the 802.11n standard for video streaming are higher data rates and the possibility to use frame aggregation [5]. MPDU aggregation makes it possible to send up to 64 data frames in one single Aggregated MPDU (A-MPDU) frame. Each of these MPDUs in the aggregated frame can be individually acknowledged using a method called block acknowledgement. This makes MPDU aggregation especially suited for real-time streaming, because an error in one MPDU does not result in the loss of one complete transmit opportunity (TXOP). The lost MPDU can simply be retransmitted together with new data frames as part of the next A-MPDU. One study on forward error correction did note these improvements. They investigated the use of forward error correction for multicast video data by doing simulations [9]. However, their study is completely theoretical and also makes the false assumption that an A-MPDU can contain as many as 1000 MPDUs. Moreover, 802.11a PHY rates are used in their simulation instead of the 802.11n rates. Because of the increasing data rate of HD video and the need to transmit more video streams simultaneously, it is important to achieve even higher throughput and lower latency. This research investigates whether the use of a forward error correction code at the link layer level would improve the performance of the link. In order to determine the usefulness of such an error correction code, one important aspect of the wireless link should be carefully examined. This aspect is whether individual errors in an A-MPDU are correlated. A forward error correction code is most valuable when the errors are not correlated, but instead follow a binomial distribution with an independent error probability p.
If errors are correlated, there is a higher probability that there are too many errors to correct in an A-MPDU. In addition, the probability that the additional error correction MPDUs are received in error also increases. Therefore, there is little benefit in using forward error correction when errors are strongly correlated, because only a small percentage of errors can be corrected. In that case, the reduction in error probability may not justify the overhead of the extra error correction data.

Two research questions are introduced:

- What is the distribution of errors in an A-MPDU and which factors influence this distribution?
- Would the performance of the link increase when a forward error correction code is introduced at the link layer?

The answer to the first question will give insight into the distribution of errors in A-MPDUs. This information can be used to answer the second question. In order to obtain the information necessary to answer the research questions, several experiments are conducted. Qualcomm 4x4 802.11n devices are used for
these experiments, because these devices have full support for aggregation of MPDUs and are therefore well suited for the tests. They use up to four spatial streams to send and receive data enabling them to transmit at a physical layer data rate of up to 600 Mbps. The setup for the experiment is as follows. One device continuously sends data to another device, while a third device which is placed close to the sender, sniffs all packets. The data is sent without using FEC at the link layer. The experiment is performed for several types of data. The captured data is then analyzed for A-MPDU length and error rate to obtain an answer to the first research question. This data is then used to calculate the improvement in PER if forward error correction would be used. This paper first presents an introduction to the topic of A-MPDUs in the 802.11n standard and forward error correction codes. After that, the methods for measuring are explained as well as the methods for analyzing the data. Then, the results and a discussion of the results are presented and finally a conclusion is drawn and an answer to the research questions is formulated.
2 Background

2.1 Frame Aggregation in the IEEE 802.11n Standard
Using the full potential of the 802.11n standard, physical layer data rates of 600 Mbps can be obtained. Another important improvement is introduced at the link layer. This improvement is the aggregation of data link frames. By combining several link layer frames, the overhead at the link layer is decreased. At the highest data rates, aggregation of MPDUs can lead to a three to four times increase in channel utilization [3]. The MAC-PHY (medium access control-physical) interface combines a maximum of 64 MPDUs into one large aggregated MPDU. The receiver of the packet deaggregates the A-MPDU in its MAC-PHY interface and sends a special type of acknowledgement back, called a block acknowledgement. This block acknowledgement contains a bitmap of 64 bits, in which each bit corresponds to the success of the transmission of one individual MPDU. In the next A-MPDU, the sender retransmits each MPDU which was not received correctly, possibly together with new MPDUs. This is also illustrated in Figure 1.

Fig. 1. Simplified flow of transmission of aggregated MPDUs
The most important improvement when using aggregation is the following. In the old approach only one MPDU could be sent after which the sender had to wait before the acknowledgement was received. This process takes one TXOP, so sending ten MPDUs using the old approach takes at least ten TXOPs. In the new approach, these ten MPDUs can be sent in one aggregated MPDU, therefore decreasing the time it takes to transmit the ten MPDUs to only one TXOP.
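As an illustration of the block-acknowledgement bookkeeping just described, the following sketch (our own, with simplified bookkeeping rather than the actual 802.11n frame format) shows how a sender could derive the retransmission set from the starting sequence number and the 64-bit bitmap:

```python
# Sketch of block-acknowledgement processing: the receiver reports a
# starting sequence number (SN) plus a 64-bit bitmap, and the sender
# schedules every unacknowledged MPDU for retransmission in the next
# A-MPDU. Function and variable names are illustrative.

def mpdus_to_retransmit(sent_sns, start_sn, bitmap):
    """Return the sequence numbers from `sent_sns` whose bit in the
    64-bit block-ack `bitmap` (bit i <-> SN start_sn + i) is 0."""
    lost = []
    for sn in sent_sns:
        offset = (sn - start_sn) % 4096  # 802.11 SNs wrap at 2^12
        if offset < 64 and not (bitmap >> offset) & 1:
            lost.append(sn)
    return lost

# The A-MPDU carried SNs 1022..1024; the block ack covers SNs starting
# at 962 and marks only SN 1023 as missing.
bitmap = (1 << 64) - 1          # start with every bit set (all acked)
bitmap &= ~(1 << (1023 - 962))  # clear the bit for SN 1023
assert mpdus_to_retransmit([1022, 1023, 1024], 962, bitmap) == [1023]
```

The returned SNs would be aggregated together with new MPDUs in the next transmit opportunity, which is exactly why a single MPDU error does not cost a whole TXOP.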
2.2 Forward Error Correction
Forward error correction has traditionally been used in wireless networks at the physical layer. Here, convolutional codes are used to encode each m-bit information symbol into an n-bit symbol, where n is larger than m. Convolutional codes are not block codes, but process a continuous bitstream instead. This makes them unsuitable for use at the data link layer, because at this layer the individual MPDUs need to be corrected. Normally, the failure to receive an MPDU leads to a retransmission of that MPDU. However, it is possible to add specific error correction MPDUs to an A-MPDU which are able to recover these lost MPDUs. This type of forward error correction code is called a block erasure code. In addition, when an error correcting algorithm is such that a message of k information symbols can be encoded into a potentially limitless number of encoding symbols, the code is called a fountain code. One well-known class of fountain codes is raptor codes. The latest generation of raptor codes, RaptorQ [8], is used in this paper when calculations are done on the performance of error correction codes in the measurements. RaptorQ is a systematic code, meaning that the symbols of the original message are included in the set of encoding symbols. Using RaptorQ coding on a source block of size k, the message can be decoded with 99% certainty from the reception of k encoded symbols. For k+1 received symbols, the decoding probability is 99.99%, and for k+2 it is 99.999999%.
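To make the impact of these decoding probabilities concrete, the sketch below (our own illustration, not part of the paper's analysis) combines them with an idealized model in which MPDU erasures are independent; the `DECODE_OK` table encodes exactly the three figures quoted above, and all parameter values in the example are hypothetical:

```python
# Rough model of the RaptorQ figures quoted above: decoding succeeds
# with probability 0.99, 0.9999, or 0.99999999 when k, k+1, or k+2
# encoded MPDUs arrive. Combined with *independent* MPDU erasures of
# probability p, this gives the best-case residual failure rate of a
# protected A-MPDU; correlated errors would perform worse.

from math import comb

DECODE_OK = {0: 0.99, 1: 0.9999, 2: 0.99999999}  # extra symbols -> P(decode)

def block_failure(k, r, p):
    """P(a block of k source + r repair MPDUs cannot be decoded),
    assuming each MPDU is erased independently with probability p."""
    n = k + r
    fail = 0.0
    for received in range(n + 1):
        p_recv = comb(n, received) * (1 - p) ** received * p ** (n - received)
        ok = 0.0 if received < k else DECODE_OK.get(received - k, 1.0)
        fail += p_recv * (1.0 - ok)
    return fail

# Example: 32 source MPDUs at a 10% MPDU error rate. Adding 12 repair
# MPDUs cuts the residual block failure rate dramatically.
assert block_failure(32, 12, 0.10) < block_failure(32, 0, 0.10) / 10
```

This best-case calculation is exactly what a binomial error distribution would justify; the measurements in Section 4 show why real A-MPDU error patterns fall short of it.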
3 Measurements

3.1 Equipment and Setup
All the measurements were done with Qualcomm 4x4 802.11n wireless devices. These devices use four spatial streams to send and receive data and have full support for aggregation of frames. The setup of the measurements is displayed in Figure 2. In essence, two wireless devices are sending data to each other, while a third one is sniffing all packets. As can be seen in Figure 2, A is connected to the access point with a gigabit ethernet connection. Consequently, when A sends data to B, the data is first sent to the access point and then the data is wirelessly transmitted to B. B wirelessly sends an acknowledgement back which is received by the access point. Because it is a MAC level acknowledgement, it is not propagated all the way back to A. The sniffer passively participates in the link, capturing all wireless communications between the access point and B. The captured packets are collected on a computer using Wireshark [2]. Because the sniffer should capture as many packets as possible to get accurate results, care has to be taken where to place the sniffer. Data frames are sent at a much higher data rate than (block) acknowledgements. Therefore, acknowledgements have a much larger range than data frames. In the setup of the measurements, the sniffer should be placed such that both data and acknowledgement frames are captured. The best place for this is close to the sender of the data frames.
Fig. 2. The setup of the measurements
3.2 Procedure
In order to obtain the data necessary for analyzing, several tests were performed. For all except one test, Iperf was used as the tool to generate the data to transmit [6]. For the remaining test, a high-definition video was streamed from A to B. One requirement for the data set was to be sufficiently large to draw conclusions. There have to be several hundred captured A-MPDUs for each possible A-MPDU size to get meaningful results. After some initial tests using different sizes of data sets, it was concluded that a set of one million packets would be sufficiently large. Therefore, the sniffer was set to capture approximately one million packets per test. Another important aspect to mention is the variety of the tests. It is important to perform tests with different parameters to obtain a broad overview of the problem. Therefore, tests are performed at several application layer data rates and also using different network layer protocols. The difference between using TCP and UDP as the network protocol is that UDP is one-way communication and TCP is two-way. When using TCP, acknowledgements are also sent at network layer level. These acknowledgements are normal data frames at the link layer. Because the communication is two-way, there is a probability that collisions occur. Therefore, three tests were performed: one using Iperf to transmit UDP data at 60 Mbps, one using Iperf to transmit TCP data at the maximum available rate (which was 200 Mbps), and one last test where an HD movie was streamed at 15 Mbps.

3.3 Information Extraction
To obtain an answer to the research questions, the distribution of errors in an A-MPDU needs to be extracted from the acquired data. A plugin for Wireshark was developed for this purpose. The plugin collects statistics of the data set by having a function which is called once per captured packet.
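The per-packet tallying such a plugin performs can be sketched in a few lines; the frame representation below is invented for illustration (the real plugin operates on dissected 802.11 headers inside Wireshark, not on Python tuples):

```python
# Sketch of the statistics gathered per captured packet: for each
# A-MPDU size s, count how many A-MPDUs contain exactly x MPDUs
# flagged as retransmissions. The synthetic `capture` below stands in
# for a parsed trace.

from collections import defaultdict

# Each A-MPDU is a list of (sequence_number, is_retransmission) pairs.
capture = [
    [(100, False), (101, False), (102, False)],
    [(101, True), (103, False), (104, False)],  # SN 101 was lost once
    [(105, False), (106, False)],
]

# errors_by_size[s][x] = number of size-s A-MPDUs containing exactly
# x MPDUs flagged as retransmissions.
errors_by_size = defaultdict(lambda: defaultdict(int))
for ampdu in capture:
    size = len(ampdu)
    retx = sum(1 for _, is_retx in ampdu if is_retx)
    errors_by_size[size][retx] += 1

assert errors_by_size[3][1] == 1  # one size-3 aggregate with one retransmission
assert errors_by_size[3][0] == 1
assert errors_by_size[2][0] == 1
```

Normalizing each inner tally by the number of A-MPDUs of that size yields the per-size error distributions plotted in Section 4.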
The easiest way to collect statistics about the error distribution of A-MPDUs is to simply split the data set by the size of A-MPDUs. Then, the probability can be calculated that exactly x MPDUs in an A-MPDU of size s are received in error. This leads to the error distribution per A-MPDU size. One difficulty in this approach is how to detect whether or not the transmission of a packet was successful. There are two methods to do this. First, it is possible to examine the block acknowledgements transmitted by the receiver of the data A-MPDU. This block acknowledgement contains information about which MPDUs were received. Second, the retransmission flag in an MPDU can be used to detect whether the packet is retransmitted. Logically, when a packet is retransmitted, it must have been sent before, though not received. The second approach is better suited for the analysis, because it is more accurate. Sometimes block acknowledgements are not sent. For example, in the case that one complete A-MPDU fails, the receiver does not receive anything, thus it does not send a block acknowledgement back. Using the first approach, this would be treated as if nothing happened. The second approach does detect such cases. The approach still relies on the assumption that the sniffer captures all packets. However, this is not the case. Sometimes the sniffer does not receive the packet, though the packet does arrive at the receiver. Also, sometimes the packet is received neither by the sniffer nor by the intended receiver. The first case is not a problem, as the packet reaches the intended receiver, so the sniffer does not have to count an error. The second case poses a larger problem. When the retransmission of the lost packets occurs, the sniffer does not know in which A-MPDU the original packets were sent. Therefore, it does not know what the size of the original A-MPDU was and in which error distribution the retransmission should be placed. This problem is illustrated in Figure 3.
For the analysis, these retransmissions were placed in a special group with unknown A-MPDU size.

Fig. 3. Analysis difficulty: the original A-MPDU is not received by the sniffer, making it impossible to obtain the size of that original A-MPDU
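The bookkeeping described above can be sketched in Python (a minimal sketch; the tuple-based trace format and the function name are our assumptions, not the paper's actual tooling):

```python
from collections import Counter

def error_distribution(ampdus):
    """Estimate the per-size error distribution from a sniffer trace.

    Hypothetical trace format: `ampdus` is a list of captured A-MPDUs,
    each a list of (sequence_number, retry_flag) tuples in capture order.
    A set retry flag means the MPDU failed earlier; the error is charged
    to the A-MPDU in which that sequence number was first captured.
    """
    first_seen = {}     # sequence number -> index of its original A-MPDU
    sizes = []          # size of each captured A-MPDU
    errors = Counter()  # A-MPDU index -> number of MPDUs that failed in it
    unknown = 0         # retries whose original A-MPDU was never captured
    for i, ampdu in enumerate(ampdus):
        sizes.append(len(ampdu))
        for sn, retry in ampdu:
            if retry:
                if sn in first_seen:
                    errors[first_seen[sn]] += 1
                else:
                    unknown += 1  # the special group with unknown A-MPDU size
            if sn not in first_seen:
                first_seen[sn] = i
    dist = Counter((sizes[i], errors[i]) for i in range(len(ampdus)))
    return dist, unknown
```

Here `dist[(s, x)]` counts A-MPDUs of size s with exactly x failed MPDUs; normalising each size class gives the error distribution per A-MPDU size discussed above.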
The data was also split per sender. For example, if TCP data is sent from A to B, the actual data goes from A to B, but TCP acknowledgements are also transmitted from B to A. The distribution of errors may differ greatly between these two links, so the results are reported per link.
4 Results
The results of the three experiments are presented in this section. Error distribution plots have been created for each experiment and A-MPDU size. However, because there is a lot of similarity between the figures for different A-MPDU sizes within the same experiment, not all of them are shown. Instead, only a few representative figures are shown to provide a clear overview of the results. Each figure contains three lines, two of which show the results of the experiment. They differ in the interpretation of cases where an entire A-MPDU was missed by the sniffer (typically due to a collision). For the 'best-case' line, such events are not counted as errors at all, thus estimating what the distribution would be in the absence of collisions. For the 'worst-case' line, it is assumed that the missed A-MPDU consisted of k MPDUs that were all received in error, with k being the number of retransmissions in the next A-MPDU (obviously, the real size of the missed A-MPDU is not known). For comparison, the third line shows the theoretical probability that exactly n errors occur, given that at least one error occurs, under the assumption that they follow a binomial distribution (i.e., are independent). The parameter p for this binomial distribution is set to the overall probability that a single MPDU fails in an A-MPDU of that size, as calculated from the measured data for that case.

4.1 UDP 60 Mbps
The error distribution for A-MPDUs of size 5 is shown in Figure 6. 97% of the A-MPDUs in this measurement had a size between 1 and 11; as the results for these sizes are similar, they are not shown here. Too few A-MPDUs with a size larger than 11 were sent for those results to be meaningful.

4.2 TCP at Maximum Rate
There was a large spread in the size of A-MPDUs in these results. 93% of the TCP data A-MPDUs were sent with a size between 1 and 30. The TCP acknowledgement A-MPDUs were smaller, but showed similar results. Figure 7 shows the distribution for TCP data A-MPDUs of size 20, and Figure 4 that of TCP acknowledgement A-MPDUs of size 10. The results for other A-MPDU sizes are similar.

4.3 TCP 15 Mbps Video
89% of the A-MPDUs of the TCP video data had a size between 1 and 20. Figure 5 shows the distribution for TCP video A-MPDUs of size 20. Figure 8 shows the individual error probability of an MPDU plotted against the A-MPDU size in this measurement. Similar plots were obtained for the other experiments, but are not shown here because of the similarity of the results. Figure 9 illustrates the error probability at a specific location in the A-MPDU: the horizontal axis gives the location in the A-MPDU, and the vertical axis shows the number of errors at that location, relative to the number of errors of the first MPDU in the A-MPDU.
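The relative per-position error counts shown in Figure 9 can be computed as follows (a sketch with assumed names; `error_positions` holds the 0-based positions of all observed MPDU errors):

```python
def position_profile(error_positions, size):
    """Number of errors per A-MPDU position, normalised by the error
    count at the first position (the quantity plotted in Figure 9)."""
    counts = [0] * size
    for pos in error_positions:
        counts[pos] += 1
    base = counts[0] if counts[0] else 1  # avoid division by zero
    return [c / base for c in counts]
```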
F. van Nee and P.-T. de Boer: On the Benefit of Forward Error Correction
Fig. 4. Error distribution for TCP acknowledgements at maximum rate and A-MPDU size 10

Fig. 5. Error distribution for TCP video data traffic and A-MPDU size 20
Fig. 6. Error distribution for 60 Mbps UDP traffic and A-MPDU size 5

Fig. 7. Error distribution for TCP data traffic at maximum rate and A-MPDU size 20
5 Discussion

5.1 Distribution of the Errors
First, the results show a significant difference from the theoretical distribution. In almost all cases, the theoretical probability of a single error is larger than the measured probability, while for higher numbers of MPDU errors per A-MPDU the theoretical probability is almost always smaller than the measured one. The theoretical distribution reaches zero quickly, whereas the measurements show a floor of a few percent for almost all numbers of errors. This floor could be caused by an error during the training phase of reception: when such an error occurs, the probability of successfully decoding an MPDU decreases significantly, thereby increasing the probability that a large number of MPDUs in the A-MPDU fail. One clear result is that the error probability of individual MPDUs is strongly related to the size of the A-MPDU in which they were transmitted: MPDUs that are part of a large A-MPDU have a higher error probability than MPDUs in a small A-MPDU.

Fig. 8. The error probability of a single MPDU plotted against the A-MPDU size for TCP video traffic at 15 Mbps

Fig. 9. The number of errors relative to the number of errors at the first position, plotted against the location of an MPDU in an A-MPDU

The UDP traffic measurements follow the theoretical (binomial) distribution more closely than the TCP traffic results, though the curve is still flatter than the binomial distribution. This indicates that even with one-way traffic such as UDP, errors are still slightly correlated. There is almost no difference between the best-case and worst-case scenarios, which indicates that the sniffer captured almost all packets. The TCP traffic shows larger differences from the theoretical model. The best case, which filters out all completely failed A-MPDUs, is still similar to the best case for UDP traffic. However, the worst-case series shows a large peak at the end, indicating that the probability that all MPDUs in an A-MPDU fail increases dramatically with TCP traffic. A reasonable explanation is that this peak is caused by collisions: TCP is two-way communication, whereas UDP is one-way, so there is a probability that A starts sending TCP data at exactly the same time as B starts sending a TCP acknowledgement. This results in a collision in which both complete A-MPDUs are lost. Such collisions can be avoided by using RTS/CTS [7], which is why the best-case situation, in which the collisions are filtered out of the results, should be considered when determining the usefulness of error correction. Not only does the distribution of the number of errors per A-MPDU differ between reality and the binomial model, but Figure 9 also shows that MPDUs transmitted in the latter part of an A-MPDU have a greater probability of failure. A possible explanation is that small variations in the channel since the channel measurement in the preamble of the packet lead to greater inaccuracies later in the reception of the A-MPDU.
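The theoretical reference curve in the figures, the binomial probability of exactly n errors given at least one error, can be reproduced as follows (a sketch; the function name is illustrative):

```python
from math import comb

def conditional_binomial(s, p):
    """P(exactly n errors | at least one error), n = 1..s, for an A-MPDU of
    size s under the independence assumption with per-MPDU failure prob. p."""
    p_any = 1.0 - (1.0 - p) ** s  # P(at least one error)
    return [comb(s, n) * p ** n * (1.0 - p) ** (s - n) / p_any
            for n in range(1, s + 1)]
```

Under independence this curve decays quickly with n, which is exactly where the measurements deviate: they show a floor and a peak at complete A-MPDU failure.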
In general, when using TCP at the highest data rate, larger A-MPDUs are sent than at the lower data rate of TCP video. However, UDP traffic is sent in even smaller A-MPDUs, even though the data rate of the UDP traffic was higher. This is explained by the fact that UDP A-MPDUs can be sent with less waiting time between the packets. With TCP, the link also has to be used for transmitting the acknowledgements, which means the sender has less time to transmit the frames.

Table 1. Estimated packet error rates (PER) with various amounts of FEC and data rate reduction

                     A-MPDU  Measured PER  1 FEC symbol per A-MPDU     2 FEC symbols per A-MPDU    3 FEC symbols per A-MPDU
                     size    without FEC   FEC     binom.  rate red.   FEC     binom.  rate red.   FEC     binom.  rate red.
UDP                    5       6.71%        1.82%   0.84%  <0.01%       0.47%   0.05%  <0.01%       0.09%  <0.01%  <0.01%
                      10      12.35%        6.96%   5.02%   1.24%       3.92%   1.47%   0.01%       2.28%   0.31%  <0.01%
TCP data at max rate  10       5.13%        2.59%   1.03%   0.52%       1.49%   0.13%  <0.01%       0.92%   0.01%  <0.01%
                      20       7.30%        5.27%   3.41%    –          3.99%   1.23%   0.73%       3.18%   0.35%   0.73%
                      30      13.56%       11.56%  10.27%    –         10.05%   7.18%    –          8.73%   4.54%   1.36%
TCP video data        10       6.04%        3.47%   1.40%   0.6%        2.04%   0.21%  <0.01%       1.25%   0.02%  <0.01%
                      20      13.38%       11.19%   8.67%    –          9.45%   4.83%   1.34%       7.99%   2.26%   1.34%
                      25      20.38%       17.79%  16.41%    –         15.53%  12.51%    –         13.52%   8.87%   2.04%

5.2 Estimating the Usefulness of FEC
In order to judge the usefulness of forward error correction, the main questions are how much the PER would decrease when FEC is introduced, and how this compares to the reduction in PER when switching to a lower physical layer data rate. We have used the measured error rates and distributions to estimate both of these PER reductions. The resulting PER when using FEC is computed by using the measured error distributions to calculate how many retransmissions are needed. With n FEC symbols it is possible to repair up to n errors in an A-MPDU. Therefore, when x errors occur, a retransmission is only needed when x > n. In that case, x − n MPDUs need to be retransmitted in order to obtain enough data to decode all MPDUs. (The 1% error rate for decoding RaptorQ encoding symbols when x = n is not taken into account here, but this makes only a small difference.) We can also do this computation under the assumption that the errors would be binomially distributed, by simply taking the measured PER as the parameter for the binomial distribution. The reduction in PER resulting from reducing the physical data rate can be obtained from Figure 5.12 in Perahia and Stacey (2008). For example, this figure indicates that for the high data rates in the range of 405 Mbps to 600 Mbps that were used in these measurements, switching to a 10% lower rate would lead to an approximately 10 times smaller PER. The underlying assumption here is that the channel is AWGN. In our tests, this provided a good approximation, because the path did not change during the tests. However, in real situations this may not be true, a fact that needs to be taken into account when drawing conclusions. The results are shown in Table 1. For several of the A-MPDU sizes occurring in each of our experiments, both the measured PER without FEC is shown, and
the calculated PER for 1, 2 and 3 FEC MPDUs per A-MPDU. For comparison, for each of the FEC cases the calculated PER with FEC under the binomial assumption is also shown, along with the PER resulting from an equivalent physical data rate reduction. For example, for 1 FEC symbol per A-MPDU and an A-MPDU size of 10, using FEC would effectively reduce the data rate by 10%, so we compare it to a 10% physical data rate reduction without FEC. This is only possible if such a data rate is available in 802.11n; where this is not the case, the table shows a '–'. As can be seen, the PER would not benefit much from adding forward error correction: in all cases, FEC performs worse than switching to an equivalent lower physical data rate. Also, it can be seen that at a given PER, the effectiveness of FEC is reduced due to the correlation of the packet errors (i.e., the deviation from the binomial distribution).
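The two PER estimates underlying Table 1 can be sketched as follows (function names and the rule-of-thumb constant are our reading of the text, not the authors' code):

```python
def per_with_fec(dist, size, n_fec):
    """Residual MPDU error rate with n_fec repair symbols per A-MPDU.
    dist[x] = P(exactly x of the `size` MPDUs fail); up to n_fec errors
    are repaired, beyond that x - n_fec MPDUs must be retransmitted."""
    return sum(p * (x - n_fec) for x, p in dist.items() if x > n_fec) / size

def per_after_rate_reduction(per, rate_cut):
    """Rule of thumb read from Perahia and Stacey (2008) for AWGN at
    405-600 Mbps: each 10% physical rate reduction cuts the PER ~10-fold."""
    return per * 10.0 ** (-rate_cut / 0.10)
```

Feeding the measured distributions into `per_with_fec`, and the binomial distribution with the same overall PER for comparison, yields the FEC columns of Table 1.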
6 Conclusion
Several conclusions can be drawn from the results of the measurements in the previous section. First, consider the first research question, on the error distribution in A-MPDUs. It is shown that the distribution of errors in A-MPDUs does not follow a binomial distribution with independent probability p that a single MPDU fails. Comparing the binomial distribution to the distribution found in the results reveals two important differences. In the first place, the measurements lead to the conclusion that errors within an A-MPDU are correlated: the presence of one error in an A-MPDU leads to a higher probability of a second error in the same A-MPDU. Therefore, the slope of the distribution function of errors is flatter than that of the binomial distribution function. Secondly, there is a probability that the entire A-MPDU fails due to a collision, which can be observed in the distribution function as a peak at the end. Several factors determine the distribution of errors observed in the measurements. The most obvious factor is the size of the A-MPDU itself: larger A-MPDUs are more likely to contain errors, and the probability that a single MPDU fails also increases with A-MPDU size. Another important factor is the presence of another device transmitting on the same frequency, and hence whether the link is bidirectional. The probability of a collision is much higher on such links, which shows up in the distribution function as a higher peak for complete A-MPDU failure. Furthermore, the application layer data rate does not have much impact on the error rate of the MPDUs, because it does not change the physical layer data rate at which the packets are sent. For the second research question, on the usefulness of forward error correction codes, the following can be concluded.
The performance gain when using a forward error correction code on the MAC layer depends heavily on the premise that errors on the link are binomially distributed and not correlated. When errors are correlated, there are often too many errors in the packet to correct.
The results show that the distribution of errors cannot be approximated by the binomial distribution function, because of the correlation of errors. In addition, A-MPDUs are sometimes received completely in error. While the latter can be avoided by using RTS/CTS, the former remains a problem for FEC. Therefore, the performance of Raptor coding is worse than suggested by previous studies that assume that MPDUs in an A-MPDU fail independently [9]. It can be concluded that it is generally better to switch to a lower physical layer data rate than to use Raptor coding.

Acknowledgements. The authors would like to thank Qualcomm for making their latest 802.11n wireless devices available for this research.
References

1. Alay, O., Korakis, T.: An Experimental Study of Packet Loss and Forward Error Correction in Video Multicast over IEEE 802.11b Network. In: Proceedings of IEEE CCNC (2009)
2. Combs, G.: Wireshark (December 2010), http://www.wireshark.org
3. Ginzburg, B., Kesselman, A.: Performance analysis of A-MPDU and A-MSDU aggregation in IEEE 802.11n. In: 2007 IEEE Sarnoff Symposium, pp. 1–5 (April 2007)
4. Haratcherev, I., Taal, J.: Automatic IEEE 802.11 rate control for streaming applications. Wireless Communications and Mobile Computing 5, 421–437 (2005)
5. IEEE Computer Society: Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE Draft P802.11-REVmb/D4.0 (June 2010)
6. NLANR/DAST: Iperf (December 2010), http://sourceforge.net/projects/iperf
7. Perahia, E., Stacey, R.: Next Generation Wireless LANs. Cambridge University Press, Cambridge (2008)
8. Qualcomm: RaptorQ Technical Overview (December 2010), http://www.qualcomm.com/documents/files/raptorq-technical-overview.pdf
9. Samokhina, M., Moklyuk, K., Choi, S., Heo, J.: Raptor Code-Based Video Multicast over IEEE 802.11 WLAN. In: IEEE APWCS (2008)
Simple Modifications in HWMP for Wireless Mesh Networks with Smart Antennas

Muhammad Irfan Rafique, Marco Porsch, and Thomas Bauschert

Chemnitz University of Technology, 09126 Chemnitz, Germany
{irfan.rafique,marco.porsch,thomas.bauschert}@etit.tu-chemnitz.de
Abstract. In this paper we present simple modifications of the HWMP protocol for wireless mesh networks to incorporate different smart antenna transmission schemes. Our proposed amendments enable mesh nodes to select transmission techniques (spatial multiplexing, beamforming) adaptively according to the channel conditions. The advantages of diversity are exploited by transmitting control packets prior to beamforming communication. Simulation results illustrate that the modified HWMP (MHWMP) leads to significantly better throughput of the mesh network. Moreover, it provides a high degree of robustness with respect to wireless link failures in stationary mesh networks. Keywords: Wireless Mesh Networks, Routing, HWMP, Smart Antennas.
1 Introduction

Wireless Mesh Networks (WMNs) have been recognized as a low-cost, reliable and scalable last-mile access network solution. However, the uncertain nature of the wireless channel and the multi-hop structure constrain the end-to-end performance of WMNs. In general, the efficiency of a wireless network can be improved up to a certain limit with suitable techniques at the PHY layer. The IEEE 802.11n amendment [2] (Draft Part 11.0) is the latest addition under development within IEEE 802.11, providing a significant increase in the data rate, up to 600 Mbps, compared to traditional one-hop WLANs (IEEE 802.11a/g). This increase is primarily achieved through PHY layer modifications: IEEE 802.11n relies on multiple input multiple output (MIMO) transmission and 40 MHz channel bandwidth, while legacy IEEE 802.11 works with single input single output (SISO) transmission and 20 MHz channel bandwidth. In recent years, smart antennas have been proposed as a technology that besides MIMO (spatial multiplexing, MUX) transmission also allows for transmit/receive diversity, beamforming (BF) and null-steering. The current IEEE 802.11s draft standard for WMNs [3] contains some amendments at layer 2 - MAC modifications and a path selection protocol (layer 2 routing protocol) called Hybrid Wireless Mesh Protocol (HWMP) - but still relies on the legacy PHY with omnidirectional antennas.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 21–30, 2011. © Springer-Verlag Berlin Heidelberg 2011

The incorporation of smart antenna features in IEEE 802.11s is quite challenging due to the mutual influence of the smart antenna
transmission scheme and the MAC mechanism. Furthermore, the decision which transmission technique to apply strongly depends on the current mesh network topology and channel conditions, see [16]. For example, beamforming enables directional transmission with extended range and is thus suitable for sparse network topologies, whereas spatial multiplexing enables high-bitrate omnidirectional transmission with lower range, thus exploiting its advantages in dense networks. Moreover, the diversity gain can be utilized to increase the reception range, which might lead to improvements in non-stationary networking scenarios. Thus, a transmitting node might influence a different number of neighboring nodes depending on whether beamforming, spatial multiplexing or diversity is applied. Hence it is advisable that the MAC protocol, the path selection mechanism and the PHY layer transmission technique operate in a well-aligned manner. A number of mesh routing protocols exploiting some advantages of smart antennas in WMNs have already been proposed in the literature. But in our opinion they do not exploit the full potential of smart antenna transmission techniques with respect to dynamically changing networking environments and channel conditions. We present simple modifications of HWMP to incorporate the different capabilities of smart antennas in WMNs. These modifications enable mesh nodes equipped with smart antennas to choose different transmission schemes depending on the channel conditions. We consider spatial multiplexing to increase the transmission rate, diversity for range extension (of broadcast transmissions) and beamforming for range extension (and also for link outage mitigation). We assume that VBLAST [18] is applied for spatial multiplexing and that diversity gain is achieved via simple space time block codes (STBC) [4].
Furthermore we determine the antenna gain for bidirectional beamforming according to the following formula, where M and N denote the number of antenna elements at the transmitter and receiver side [16]:
G_array = (M + N) / 2    (1)
As already stated above, the MAC protocol should be aligned with the PHY transmission scheme. However, most of the currently existing MAC layer amendments proposed in the IEEE standards [3], [2] and [1] are agnostic to the PHY layer transmission scheme. For simplification, as an initial step we consider DCF at the MAC layer and make some changes to cope with the different behaviors of the PHY transmission schemes of smart antennas. The paper is organized as follows: Section 2 describes the related work. Section 3 addresses the amendments of the MAC protocol, whereas in Section 4 the HWMP modifications are explained. The results of our simulations are discussed in Section 5. Section 6 concludes the paper.
2 Related Work

The MIR routing protocol [15] is the first one that incorporates some smart antenna features like spatial multiplexing and diversity. However, MIR does not exploit the benefits of beamforming. In [12] a routing protocol is presented that combines the advantages of multiplexing gain and interference cancellation. Lin et al. [11] propose
an algorithm that coordinates the scheduling, routing and power control mechanisms. Cai et al. [7] enhance an AODV-based routing protocol to consider spatial multiplexing at the PHY layer and focus on minimizing the route establishment time. Hu and Zhang [9] propose an extension of the IEEE 802.11 based MAC protocol that is aware of spatial diversity and analyze its impact on the routing. To our knowledge, our work is the first one that considers the combination of multiple PHY layer transmission modes enabled by smart antenna technology and HWMP routing in an integrated approach for improving the performance of WLAN mesh networks.
3 MAC Amendments

Before a data transmission with spatial multiplexing starts, we apply omnidirectional transmission for the RTS/CTS exchange, assuming that the transmission range of spatial multiplexing and single omnidirectional antenna transmission is the same. However, RTS/CTS transmission with an omnidirectional antenna is not suitable for data transmissions with beamforming, for two reasons. First, due to the larger range of beamforming transmission, a sender can cause interference to nodes that are beyond the default omnidirectional antenna transmission range. Second, due to the directional beamforming transmission, a sender only interferes with those nodes that lie in the direction of the beam. To cope with the larger interference range of beamforming, MHWMP broadcasts RTS/CTS frames with STBC (S-RTS/S-CTS) prior to data transmission with beamforming - a similar approach is described in [14]. Here we assume that the range extension due to beamforming and diversity gain is approximately the same. The omnidirectional exchange of S-RTS/S-CTS might block more nodes from communication than necessary (thus decreasing the network throughput), as the directional beamforming transmission does not affect nodes that lie outside the beam. To avoid such situations we adopt the concept of a directional network allocation vector (DNAV) as suggested in [17]. The use of DNAV enables parallel communication of nearby nodes with non-interfering beams.
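The DNAV overlap test can be sketched as a simple angular-sector check (our assumed representation, not the scheme from [17]; angles in degrees, each DNAV entry a (direction, width, expiry) tuple):

```python
def dnav_blocks(dnav, beam_dir, beam_width):
    """Return True if a candidate beam overlaps any reserved sector.
    Only overlapping beams are blocked, so nearby nodes with
    non-interfering beams may transmit in parallel."""
    def overlap(a, wa, b, wb):
        diff = abs((a - b + 180.0) % 360.0 - 180.0)  # smallest angular distance
        return diff < (wa + wb) / 2.0
    return any(overlap(beam_dir, beam_width, d, w) for d, w, _expiry in dnav)
```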
4 Modified HWMP (MHWMP)

The basic routing functions of MHWMP are the same as those of the original HWMP. A description of the original HWMP draft is beyond the scope of this paper; interested readers are referred to [6]. Here we only outline our HWMP modifications.

4.1 Neighbor Discovery (ND)

As the transmission ranges of beamforming and multiplexing differ and the former operates in directional mode, the neighboring nodes might be different for these two schemes. To exploit the advantages of beamforming and multiplexing, a node should have explicit information about its respective neighbors. A number of neighbor discovery (ND) algorithms applicable to directional transmission can be found in the literature, see [13], [10], [8], [5] and [19], but so far there exists no suitable ND algorithm that can cope both with beamforming and
multiplexing at the same time. In our new ND algorithm, each node has two neighbor tables - the Omnidirectional Neighbor Table (ONT) and the Directional Neighbor Table (DNT). In the DNT the neighbors are stored together with their respective directions. To find the directional and omnidirectional neighbors, beacons are broadcast with and without STBC, respectively (similarly to S-RTS/S-CTS and normal RTS/CTS). Both beacons are transmitted alternately in 0.5 second intervals. We assume that the PHY layer is capable of differentiating both beacons and that a node can estimate the direction of arrival (DOA) from the received beacons.

4.2 Path Selection Metric

Each node has two path tables, i.e., an Omnidirectional Path Table (OPT) and a Directional Path Table (DPT). The former is used for communication with spatial multiplexing while the latter is used for communication with beamforming. The path tables are calculated based on the minimum airtime link metric C:
C = [O_ca + O_p + B_t / r] * 1 / (1 - e_fr)    (2)
where O_ca and O_p are the channel access and MAC protocol overheads, B_t is the number of bits of a test frame and e_fr denotes the frame error probability. The DPT is calculated based on (2) by taking r as the transmission rate without spatial multiplexing; r is determined by the rate adaptation algorithm RBAR [20] considering the channel conditions. The OPT is calculated based on (2) by using r_mux instead of r, where r_mux denotes the transmission rate in case of spatial multiplexing and can be obtained from r as follows:
r_mux = min(M, N) * r    (3)
where M and N denote the number of antennas in the antenna array of the transmitter and receiver, respectively.

4.3 Path Discovery

The path discovery procedure is depicted in Fig. 1. When a source S wants to communicate with a destination D, it first looks in both path tables, OPT and DPT. In case S does not find any entry for D, it broadcasts path request (PREQ) frames with STBC (S-PREQ) and without STBC, i.e., with standard omnidirectional transmission (O-PREQ). The structure of both PREQ frames is the same as in HWMP. The destination-only flag of the corresponding PREQ frames is not set (DO=0); therefore an intermediate node responds with a Path Reply (PREP) frame to a PREQ frame if it has a valid path entry. When an intermediate node receives an S-PREQ, it looks into its DPT and, in case a valid entry for the destination D exists, it replies back to the transmitter (with beamforming). Otherwise it updates its DPT and forwards the S-PREQ, provided that the sequence number of the S-PREQ is higher or, in case of equal sequence numbers, the airtime metric value of the S-PREQ is lower than that of the corresponding path entry. If an intermediate node receives an O-PREQ, it looks into its OPT and, in case an entry for the destination D exists, it replies back to the
transmitter (in omnidirectional mode). Otherwise it updates its OPT and forwards the O-PREQ if the conditions are met. If a valid entry for D exists only in the OPT or the DPT, the source starts the data transmission using either spatial multiplexing or beamforming and additionally initiates a path request to fill the missing entry in the respective path table. Thus, in the absence of a valid destination entry in one table, the source does not need to wait for the completion of the route discovery before beginning the transmission. When the destination node receives a new S-PREQ or O-PREQ, it sends a Path Reply (PREP) on the return path towards the originator, applying beamforming or omnidirectional transmission, respectively. While processing PREP frames, the role of the intermediate nodes is almost the same as explained in the original HWMP draft, except that they update their respective path tables (OPT or DPT) and forward the PREP using the same transmission scheme with which they received it.

4.4 Data Transmission

Fig. 1 also explains the process of data transmission. Before transmitting a data packet to destination D, the source S compares the airtime metric values of the paths available in both path tables. It selects the next hop that belongs to the path with the lower airtime metric and starts its data transmission using the corresponding scheme (spatial multiplexing or beamforming). In case the destination entry in the OPT has the lower path metric value, the node selects the next hop from the OPT and uses spatial multiplexing for transmission. If the destination entry in the DPT has the lower path metric value, the node selects the next hop from the DPT and sends the data packet with beamforming. Additionally it starts an O-PREQ in order to find a new path with a possibly lower path metric - this is because applying spatial multiplexing for data transmission is always preferred due to the higher achievable data rate.
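The choice between the two tables can be sketched with the airtime metric (2) (a sketch; the function names and parameter values are illustrative, not from the MHWMP implementation):

```python
def airtime(o_ca, o_p, b_t, r, e_fr):
    """Airtime link metric (2): protocol overheads plus frame transmission
    time, scaled by the expected number of attempts 1/(1 - e_fr)."""
    return (o_ca + o_p + b_t / r) / (1.0 - e_fr)

def choose_scheme(opt_metric, dpt_metric):
    """MHWMP data transmission: prefer the path table whose entry has the
    lower airtime metric (multiplexing via OPT, beamforming via DPT)."""
    return "MUX" if opt_metric <= dpt_metric else "BF"
```

Since r_mux = min(M, N) * r from (3), the OPT metric is normally lower whenever the multiplexing path is usable, which is why multiplexing is the preferred scheme.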
The transmission with beamforming continues until a path with a lower airtime metric is found through this new request. The described method increases the overall throughput of the mesh network because the selected path always guarantees the highest transmission rate. If a valid destination entry exists only in the OPT or the DPT, the source initiates the data transmission using either spatial multiplexing or beamforming and additionally starts a path request to fill the missing entry in the respective path table.

4.5 Path Maintenance

As we assume a stationary mesh network (with non-moving mesh nodes), mesh link outages might only occur due to bad radio channel conditions, i.e., poor SNR. In the original draft of HWMP, path interruptions due to link outages lead to the generation of path error (PERR) frames. In our algorithm, if the data frames were transmitted using spatial multiplexing, a node sends - after a link outage - a PERR frame towards the source applying omnidirectional transmission. It also checks in the DPT whether a valid entry for the same destination exists and, if so, the node continues the transmission of received packets on that path using beamforming. This approach avoids the loss of queued and on-air data frames. When the source receives the PERR frame, it erases the corresponding entry in the OPT and starts a path discovery procedure in
Fig. 1. Path Discovery and Data Transmission of MHWMP
omnidirectional mode (O-PREQ). Meanwhile, the source uses the next-hop entry from the DPT and continues the transmission towards the destination with beamforming until a better path is found through the O-PREQ. Intermediate nodes that receive a PERR frame delete the corresponding entries in their OPT and forward the PERR frame towards the source. If data frames are transmitted with beamforming, a node sends - after a link outage - a PERR frame towards the source applying beamforming. Intermediate nodes that receive the PERR frame delete the corresponding entries in their DPT and forward the PERR towards the source accordingly.
5 Performance Evaluation

The network simulator ns-3 is used for performance evaluation. MHWMP is compared with the original HWMP working with spatial multiplexing (O-MUX). For fairness, O-MUX also uses standard RTS/CTS.
We assume constant bit rate (CBR) traffic flows between source and destination nodes and a constant data packet size of 512 bytes. Each node is equipped with a uniform circular array comprising four antenna elements. For the simulation of beamforming, we apply the keyhole model, where the gain of the side lobes is set to zero. The maximum transmission rate for a single antenna element is 54 Mbps (according to IEEE 802.11a). For simplification, we assume that the nodes have perfect channel knowledge and that packet losses occur only due to insufficient SNR. Moreover, it is assumed that the PHY layer is capable of differentiating between standard and STBC control frames. The maximum number of packets buffered in the output queue during route resolution is set to 255. After 3 successive PREQ retries, the destination is considered unreachable; the time duration between two successive PREQs is set to 102 ms. The same parameter values are specified in [3]. We consider a simple scenario where two nodes (source and destination) are placed at a distance of 50 m. The lifetime of the active path and the path maintenance interval are set to 5 s and 2 s, respectively. The total simulated time is 15 s. The source generates traffic at 5 Mbps and starts sending packets towards the destination at 4 s. The SNR of the radio link is decreased by 10 dB in the time interval between 8 s and 11 s. Fig. 2 shows the data throughput for both MHWMP and O-MUX. It illustrates that MHWMP shows robust behavior against link losses and maintains the throughput of the network even in situations where O-MUX fails.
Fig. 2. Data Throughput
In case of MHWMP, at the beginning, the destination entry in the source's OPT has a lower airtime metric than that in the DPT. This is due to the higher PHY layer data rate of transmissions with multiplexing. Thus MHWMP chooses
M.I. Rafique, M. Porsch, and T. Bauschert
multiplexing for the transmission. When the link deteriorates at 8s, the high frame losses lead to a link outage. The source then selects the available path in the DPT and continues packet transmission with beamforming. At the same time it initiates a path discovery (for the path entry in the OPT) by sending an O-PREQ. However, this path discovery remains unsuccessful in the time interval between 8s and 11s. At 11s, when the original SNR of the link is restored, the source successfully discovers a new path that is feasible for transmission with multiplexing (OPT). It compares the airtime metric with that of the DPT entry and finally selects multiplexing for the transmission because of the lower metric value. In the time interval from 8s to 11s O-MUX suffers from a complete link failure without any possibility of recovery. However, the source re-establishes the path when the link recovers at 11s, showing a temporary throughput peak. This can be explained as follows: during the link outage the source continuously generates traffic at the rate of 5 Mbps. When the path is re-established, most of the packets generated during the link degradation have already been dropped from the output buffer due to its limited size. At 11s the buffered packets are sent out at the maximum transmission rate of O-MUX, which is much higher than 5 Mbps (the assumed source traffic rate), leading to the temporary throughput peak.
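The backlog-flush effect behind the throughput peak can be reproduced with a small per-second toy model. This is our own sketch: the 255-packet buffer and the 5 Mbps source follow the scenario (1220 pkt/s of 512-byte packets is roughly 5 Mbps), while the link rate is an assumed placeholder:

```python
def simulate_outage_peak(src_rate=1220, buf_cap=255, link_rate=6000,
                         outage=range(8, 11), end=15):
    """Per-second toy model of the post-outage throughput peak (pkt/s)."""
    queue, throughput = 0, []
    for t in range(end):
        backlog = queue + src_rate                 # the source keeps generating traffic
        sent = 0 if t in outage else min(backlog, link_rate)
        queue = min(backlog - sent, buf_cap)       # packets beyond the 255-slot buffer are dropped
        throughput.append(sent)
    return throughput

tp = simulate_outage_peak()
# zero throughput during the outage, a brief peak above the source rate
# right after recovery (buffered backlog plus new traffic), then the
# steady source rate again
```

The model shows why the peak is only temporary: the buffer can hold at most 255 packets of the outage-time traffic, so the burst above the source rate lasts just long enough to flush that backlog.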
Fig. 3. Packet Time Delay
Fig. 3 shows the per-packet delay behavior of O-MUX and MHWMP. As expected, both schemes show a short delay peak at the beginning of the transmission due to the path discovery process. MHWMP and O-MUX exhibit similar time delay when the SNR of the link is not reduced. After the link deteriorates, MHWMP successfully switches to the path from the DPT and continues its transmission with
beamforming. Since the transmission rate for beamforming is lower than for multiplexing, and additionally a lower PHY layer data rate is selected by the rate adaptation algorithm (due to the lower SNR), the packet delay is comparatively higher. At 11s, when the path is re-established, MHWMP switches back to transmission with spatial multiplexing and resumes the low delay. O-MUX shows worse performance during the link deterioration period, as no packets are sent but queued in the output buffer (or dropped if the buffer size is exceeded). As soon as the path is re-established at 11s, the source first transmits the buffered packets; this causes the delay peak (for these packets). After the successful transmission of the queued packets, O-MUX again achieves the same packet delay as MHWMP. Since it is assumed that the nodes have perfect channel knowledge, the degradation of the SNR can also be interpreted as being caused by a larger distance between the two nodes. Thus, by exploiting the benefits of beamforming transmission, MHWMP is expected to outperform O-MUX in sparse mesh network scenarios (with larger node distances) as well.
6 Conclusion

In this paper we propose some modifications to HWMP for stationary wireless mesh networks with smart antennas. As a first step, we evaluated the performance of MHWMP by means of a simple two-node scenario. It turns out that MHWMP improves the throughput and shows quite robust behavior against link outages caused by SNR degradations. It exploits the advantages of smart antennas by adaptively using either spatial multiplexing or beamforming for data transmission and STBC for sending control packets prior to beamforming. Due to the different reach of these transmission techniques, MHWMP requires two separate path tables and also two separate sets of control frames. In our future work we will study the performance of MHWMP in medium and large scale wireless mesh network scenarios. Furthermore, we intend to analyze the overhead caused by MHWMP in more detail, especially for wireless mesh networks with non-stationary (moving) nodes. For stationary mesh networks this overhead is not expected to be a major hurdle, as path failures and, consequently, path discovery actions normally do not occur frequently.
References

1. IEEE 802.11 Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 8: Medium Access Control (MAC) Quality of Service Enhancements (2005)
2. IEEE P802.11n/D11.0, Part 11: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 5: Enhancement for Higher Throughput (2009)
3. IEEE P802.11s/D3.0, Part 11: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 10: Mesh Networking (2009)
4. Alamouti, S.M.: A Simple Transmit Diversity Technique for Wireless Communications. IEEE J. Sel. Areas Commun. 16(8), 1451–1458 (1998)
5. An, X., Hekmat, R.: Self-adaptive Neighbor Discovery in Ad hoc Networks with Directional Antennas. In: Mobile and Wireless Communications Summit, pp. 1–5 (2007)
6. Bahr, M.: Proposed Routing for IEEE 802.11s WLAN Mesh Networks. In: 2nd Annual International Workshop on Wireless Internet, WICON 2006, New York, NY, USA (2006)
7. Cai, X., Li, J., Chen, D., Zhang, Y.: Cross-layer Design of AODV-based Routing in Ad hoc Networks with MIMO Links. In: IEEE 20th International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 2586–2590 (2009)
8. Cain, J.B., Billhartz, T., Foore, L., Althouse, E., Schlorff, J.: A Link Scheduling and Ad hoc Networking Approach using Directional Antennas. In: IEEE MILCOM 2003, pp. 643–648 (2003)
9. Hu, M., Zhang, J.: MIMO Ad hoc Networks: Medium Access Control, Saturation Throughput, and Optimal Hop Distance. J. Communications and Networks, 317–330 (2004)
10. Huang, Z., Shen, C.C., Srisathapornphat, C., Jaikaeo, C.: Topology Control for Ad hoc Networks with Directional Antennas. In: Eleventh International Conference on Computer Communications and Networks, pp. 16–21 (2002)
11. Lin, Y., Javidi, T., Cruz, R.L., Milstein, L.B.: Distributed Link Scheduling, Power Control and Routing for Multi-hop Wireless MIMO Networks. In: IEEE Asilomar Conference on Signals, Systems, and Computers, pp. 122–126 (2006)
12. Mundarath, J.C., Ramanathan, P., Van Veen, B.D.: A Quality of Service Aware Cross Layer Approach for Wireless Ad hoc Networks with Smart Antennas. J. Ad Hoc Networks 7(5), 891–903 (2009)
13. Ramanathan, R.: On the Performance of Ad hoc Networks with Beamforming Antennas. In: ACM MobiHoc, pp. 95–105 (2001)
14. Rossetto, F., Zorzi, M.: On Gain Asymmetry and Broadcast Efficiency in MIMO Ad hoc Networks. In: IEEE ICC 2006, pp. 3862–3867 (2006)
15. Sundaresan, K., Sivakumar, R.: Routing in Ad-hoc Networks with MIMO Links: Optimization Considerations and Protocols. J. Computer Networks 52(14), 2623–2644 (2008)
16. Sundaresan, K.: Network Protocols for Ad-Hoc Networks with Smart Antennas. PhD Thesis, Georgia Institute of Technology, USA (2006)
17. Takai, M., Martin, J., Bagrodia, R., Ren, A.: Directional Virtual Carrier Sensing for Directional Antennas in Mobile Ad hoc Networks. In: 3rd ACM International Symposium on Mobile Ad hoc Networking & Computing, MobiHoc 2002, New York, NY, USA, pp. 183–193 (2002)
18. Wolniansky, P.W., Foschini, G.J., Golden, G.D., Valenzuela, R.A.: V-BLAST: An Architecture for Realizing Very High Data Rates over the Rich-Scattering Wireless Channel. In: ISSSE 1998, pp. 295–300 (1998)
19. Zhang, Z., Ryu, B., Nallamothu, G., Huang, Z.: Performance of All-Directional Transmission and Reception Algorithms in Wireless Ad hoc Networks with Directional Antennas. In: IEEE MILCOM 2005, pp. 225–230 (2005)
20. Holland, G., Vaidya, N., Bahl, P.: A Rate-Adaptive MAC Protocol for Multi-hop Wireless Networks. In: MobiCom 2001, New York, NY, USA, pp. 236–251 (2001)
On the Evaluation of Self-addressing Strategies for Ad-Hoc Networks

Ricardo de O. Schmidt¹, Aiko Pras¹, and Reinaldo Gomes²

¹ Design and Analysis of Communication Systems, University of Twente, The Netherlands
{r.schmidt,a.pras}@utwente.nl
² Systems and Computing Department, Federal University of Campina Grande, Brazil
[email protected]
Abstract. Ad-hoc networks are supposed to operate autonomously and, therefore, self-* technologies are fundamental to their deployment. Several such solutions have been proposed during the last few years, covering most layers and functionalities of networking systems. Addressing is one of the critical network operations supporting others, such as routing and even the security of IP-based communications. The goal of this paper is to put together different strategies for the self-addressing problem by evaluating five self-addressing protocols and pointing out their strengths and drawbacks. At this time, we do not intend to come up with a novel approach to the problem, but to evaluate the existing ones in a non-isolated manner, also considering the critical ad-hoc networking situations of partition and merging. Conclusions about the evaluated approaches and their applicability are drawn at the end of this paper.
1 Introduction
Autonomous networking systems represent a new paradigm mostly based on an innovative cooperation model of network devices to create and manage communication environments automatically. The idea of creating autonomous networking systems relies heavily on the concept of autoconfiguration. The autoconfiguration problem, also known as self-management, can be seen as the main reason for the emergence of self-* technologies. Due to its importance for proper network operation, addressing can be seen as one of the most important (and challenging) issues in self-* solutions. According to documents, like [1], published within the IETF working group AUTOCONF [2], among the goals of autoconfiguration in dynamic networks, like ad hoc networks, we must consider the configuration of unique addresses for nodes. The well-known DHCP (Dynamic Host Configuration Protocol) has very limited applicability considering the dynamic characteristics found in these networks, such as mobility, an unpredictable number of nodes, and an undefined topology. Addressing is a critical functionality in IP networks given that it provides basic configuration for other network functionalities such as routing and security.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 31–42, 2011.
© Springer-Verlag Berlin Heidelberg 2011
R. de O. Schmidt, A. Pras, and R. Gomes
Attempting to solve the addressing problem in dynamic networking systems, several self-addressing solutions have already been proposed, like the ones surveyed in [3] and [4]. These solutions implement different methodologies to allow self-configuration of node interfaces with tentative, valid and unique addresses within the network. Such methodologies range from simple random selection of addresses from predefined addressing spaces, to mathematical efforts where equations are defined to enable nodes to calculate their own interface address(es). These solutions are roughly classified as stateless, stateful or hybrid approaches, discussed in more detail in the next section. The main goal of this paper is to contribute to the specific area of addressing by comparing completely different self-addressing approaches in critical ad hoc networking situations, like network merging and partition. Past works have evaluated several strategies for self-addressing. However, most of them only submit the proposed solutions to simple network behaviors, which may not be the case in dynamic ad hoc networks. It is important to state that we do not intend to propose a novel approach for self-addressing, but to evaluate those we consider important methodologies for approaching the problem. With the results in this paper and further experiments with specific functionalities (like merging/partition management), we plan to come up with a stable, scalable and reliable self-addressing solution, developing new ideas or using already existing ones that have proved to be efficient. The rest of this paper is organized as follows. The concepts of self-addressing are introduced in the next section. Section 3 presents the methodology used in the experiments as well as a brief description of the implemented and simulated self-addressing protocols. Results obtained from the simulations are presented in Section 4. Finally, in Section 5, conclusions about the evaluated strategies are drawn.
2 Self-addressing
Self-addressing is tightly related to the concepts of autoconfiguration. It can be part of a set of technologies that enable a network to operate autonomously. Basically, a self-addressing protocol must provide a node with the ability of either generating its own addressing configuration or retrieving such configuration from another network entity. Self-addressing protocols can roughly be classified into one of the following categories: stateless, stateful or hybrid. In short, stateless approaches like Strong DAD [5] are those where nodes do not keep track of addresses in use in the network (i.e., the state of addresses is unknown). For instance, a stateless protocol can randomly select an address from a predefined range and test it within the network to ensure that no other node is configured with that address. This testing procedure is generally named Duplicate Address Detection (DAD). Stateful approaches like SooA [6], on the other hand, allow nodes to be aware of address state. Usually, in a stateful approach, the nodes that keep track of addresses are responsible for performing addressing tasks on assignment and management of resources. For instance, information
about addresses can be stored in tables. However, alternative strategies like Prophet Allocation [7] define mathematical approaches which allow nodes to track addresses that may be in use, thus also characterizing a stateful solution. Finally, hybrid approaches mix properties of both stateless and stateful protocols. Hybrid solutions usually implement random address selection, DAD testing, and address registration with one or more addressing authorities (as in HCQA [8]) or in tables spread throughout the network (as in MANETconf [9]).
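The stateless pattern, a random tentative address followed by DAD, can be sketched as follows. This is a generic illustration, not any specific protocol's code; in a real network the `in_use` set is unknown to the node and is probed with flooded DAD request/reply messages rather than read directly:

```python
import random

def stateless_allocate(in_use, addr_space=256, max_retries=10, rng=random):
    """Generic stateless allocation sketch: pick a random tentative address
    from the predefined range and run DAD against the network; retry on
    conflict, give up after max_retries attempts."""
    for _ in range(max_retries):
        tentative = rng.randrange(addr_space)  # random tentative address
        if tentative not in in_use:            # DAD: no conflict reply received
            in_use.add(tentative)
            return tentative
    return None                                # retry limit exhausted
```

Note the trade-off this section describes: the node learns nothing about the global address state, so every allocation pays the full DAD probing cost, whereas stateful approaches amortize that cost into table maintenance.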
3 Experiments

3.1 Evaluated Protocols
Five self-addressing protocols were compared in our experiments: three stateless protocols, Strong DAD, AIPAC and AROD; one stateful, Prophet Allocation; and one hybrid, MANETconf. The protocols AROD and AIPAC were fully evaluated; that is, we also considered their mechanisms for handling network merging and partition. All protocols were implemented in C++ and added to the well-known network simulator NS2 (Network Simulator 2) [10]. We opted for these protocols due to their different proposals on address assignment, DAD procedures and partition/merging management. In our experiments we aimed at a fair comparison among the five protocols, following AUTOCONF's [2] goal of proposing a common and general self-addressing solution for ad-hoc networks. The protocols were submitted to the same network scenarios, disregarding whether or not they implement advanced operations for handling complex networking situations. Next, a brief description of each evaluated protocol is given. Strong DAD [5] is a stateless addressing protocol which implements random selection of addresses followed by DAD procedures. Each starting node is responsible for its own address configuration. Upon starting, a node randomly selects, from a predefined range, two addresses: a temporary and a tentative address. The former is used as the node's address during the DAD procedure on behalf of the latter. On succeeding in the DAD procedure, the tentative address is used to configure the node's interface. Otherwise, tentative address selection and testing is repeated until no conflicts are identified. Strong DAD does not deal with network partition and merging. Consequently, its applicability to more critical networking scenarios is restricted. AIPAC [11] implements the concept of requester and initiator nodes. The former is a starting node and the latter an already configured node which negotiates an address within the network on behalf of the former.
As in Strong DAD, a starting node randomly selects a temporary address and uses it to communicate with its initiator node during the configuration procedure. This temporary address is used until a negotiated one is received from the initiator. The negotiation procedure is executed by the initiator node: it randomly selects an address from a predefined range and tests it within the network through a DAD procedure. AIPAC also defines functions for dealing with partition and
merging of networks. A network identifier, defined by the first node (i.e., the network's founder), is added to the protocol's messages, enabling nodes to identify different networks that come into each other's range. AIPAC's main drawback is its dependency on routing messages and tables. It requires modifications to the routing protocol operating in the network, making AIPAC unsuitable for dynamic networks that may even negotiate and change the routing protocol. AROD [12], although also based on a DAD procedure and the requester-initiator scheme, focuses on reducing configuration time and control overhead. To do so, it implements address reservation and optimistic DAD. The main difference is that an initiator node may have reserved addresses, previously validated within the network, which can be allocated to a requester node without the DAD delay, reducing the configuration time. A DAD procedure is usually executed for more than a single tentative address at the same time, which allows nodes to obtain spare addresses for future configurations of requester nodes. Prophet Allocation [7] is the only evaluated approach in this paper that does not implement a DAD procedure (at least not during the procedures specified by the authors). Unlike the other described solutions, Prophet Allocation does not select addresses randomly, but obtains addresses inside a predefined range from a formula. This formula is based on the fundamental theorem of arithmetic, which says that every positive integer may be expressed uniquely as a product of primes. Each node has a 2-tuple identifier, composed of its address and the formula's state. Every time an address is calculated through the formula, its state is incremented by one. However, as stated by the authors, this formula might generate duplicated addresses within a network.
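The 2-tuple (address, state) mechanics can be illustrated with a toy generator. To be clear, this is not Prophet Allocation's actual f(n), which the description above does not reproduce; it is a hypothetical stand-in showing how a deterministic, prime-based formula advances a state counter, and how folding the sequence into a finite address space can eventually produce duplicates:

```python
def formula_sequence(seed_prime=7, addr_space=256):
    """Toy stand-in for a Prophet-style address formula (hypothetical):
    powers of a per-node seed prime, folded into the finite address space."""
    state = 1
    while True:
        state += 1                                # each allocation advances the state
        yield (seed_prime ** state) % addr_space  # fold into the address range

gen = formula_sequence()
first = [next(gen) for _ in range(4)]
# a deterministic sequence: distinct at first, but duplicates must appear
# once the folded sequence starts to cycle within the 256-address space
```

The cycling behavior is exactly why the authors concede that conflicts are possible, and why we treat the number of equation states as a tunable parameter later in the evaluation.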
Therefore, when a node starts (requester) it receives from an already configured node (initiator) an address, generated by the formula, and a network identifier. This identifier is added to the protocol's messages, enabling nodes to identify messages from different networks (merging situations). Unfortunately, no additional information on how the protocol deals with the merging problem was provided by the authors in [7]. Consequently, due to this lack of information, we did not consider management of network partition and merging as a Prophet Allocation function. Finally, MANETconf [9] is a completely distributed, hybrid addressing protocol, which implements allocation tables and a DAD procedure. Its functionality is simple, but it requires a lot of broadcasting for DAD and table synchronization. MANETconf also implements requester and initiator nodes. The difference is that, after successfully negotiating an address for a requester node, the initiator floods the network announcing that such address has been allocated. With this flooding, all network nodes update their respective allocation tables. The allocation tables are used to increase the reliability of the DAD procedure. Despite flooding the network with broadcasts, MANETconf has the advantage of not depending on any other technology.

3.2 Scenarios and Evaluation Metrics
The experiments were planned to evaluate the implemented protocols under critical situations of ad hoc networks. The protocols were set to operate with very
limited addressing resources (256 available addresses) and submitted to conditions of random deployment and mobility of nodes, leading to situations of network isolation, partition and merging. Two different scenarios were simulated, as described below. In the first scenario, named Scenario A, 100 nodes were randomly positioned in a predefined area, forcing situations of node isolation at the beginning of the simulation due to the geographical distance between nodes. In this scenario nodes are static, and merging resulted from the deployment of intermediate nodes between the isolated networks. In the second scenario, named Scenario B, two initially isolated 10-node networks were deployed and later merged, resulting in a 20-node network. The area of these networks was defined so that no isolated nodes were deployed in the initial formation. Simulation time and node arrival rate did not play an important role in our scenarios. In addition, from previous experience, like [6] and [12], we learned that variations in the addressing space have a very low impact on a protocol's operation. We made use of a methodology described in [13] to determine the necessary number of simulation runs so that our results would fall sufficiently close to the mean, within a confidence interval of 95%. This guarantees a fair comparison between two or more solutions, mainly due to the use of random variables (i.e., node positioning and random mobility). To evaluate the protocols, we considered three basic metrics: (a) control overhead, which quantifies the data generated and transmitted during the protocols' operation; (b) configuration delay, which represents the time between the node deployment and its final configuration; and (c) address uniqueness, which measures the efficiency of the protocols in avoiding problems with duplicated configurations (i.e., conflicting configurations).
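The replication-count rule can be sketched as follows. This is our own simplification of the idea; [13] describes the full methodology, and using the normal quantile z = 1.96 instead of Student's t is an approximation:

```python
import math

def runs_needed(sample_mean, sample_std, rel_error=0.05, z=1.96):
    """Number of independent simulation runs so the ~95% confidence interval
    half-width stays within rel_error of the mean:
    n >= (z * s / (rel_error * mean))^2, from pilot-run statistics."""
    half_width = rel_error * sample_mean
    return math.ceil((z * sample_std / half_width) ** 2)

# e.g. a pilot study with mean 100 and standard deviation 10 yields the
# number of replications needed for results within 5% of the true mean
```

In practice one iterates: run a pilot batch, estimate mean and standard deviation, compute the required n, and add runs until the achieved half-width meets the target.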
4 Results Analysis

4.1 Scenario A: Highly Populated Network
The graphs in Fig. 1 present the traffic generated and transmitted by the protocols while configuring all 100 nodes. However, the lines do not represent all the traffic generated during the entire simulation time, which was 2000 seconds, but the traffic from the beginning of the simulation until the configuration of the last node. Regarding the traffic in number of packets, as presented in Fig. 1(a) and 1(b), and the traffic in number of bytes, as presented in Fig. 1(c) and 1(d), one can conclude at first that the best performances were reached by the protocols Strong DAD and AROD. However, we must consider that the former implements only allocation of addresses and no further maintenance operations. When the last node is configured with Strong DAD, the protocol concludes its operation. Therefore, all the traffic generated by the protocol is related to address assignment and the DAD procedure. Thus, if after configuring all nodes, n nodes happen to be configured with duplicated addresses, such a problem will remain unsolved until the end of the simulation or, in another possible scenario,
Fig. 1. Protocols performance in Scenario A (four panels: packets and bytes, sent and received, versus the number of configured nodes, 1–100)
only one of the conflicting nodes remains active in the network. Even being a flood-based approach, Strong DAD generated significantly less traffic than the other approaches. This is due to the fragmentation of the network resulting from the random positioning of nodes in the simulation area: the fewer neighbors a node has, the smaller the impact caused by its flooding procedure. In addition, the number of bytes transmitted by Strong DAD grows at a different rate than the number of packets. This occurs because in the DAD procedure a list of previous hops (reverse path) must be kept within a request message to allow a node which identifies a conflict to reply correctly to the request generator. The same applies to other protocols that implement a DAD procedure with a reverse path. As one can observe in Fig. 1, regarding traffic overhead, AROD performed better than Strong DAD, even though it executes DAD procedures. The main reason for this is the address reservation implemented by AROD: for testing several addresses with DAD, the protocol generates practically the same overhead as Strong DAD would generate for testing a single address. With the address reservation, AROD avoids further DAD executions when configured nodes provide new ones with already tested addresses. It is important to note that in our experiment nodes operating AROD were able to reserve only one address in addition to the address for their own configuration. One can infer that the protocol's performance may improve if, in cases where the addressing space is larger, the nodes are allowed to reserve more than one address for further allocations.
Prophet Allocation, which does not implement a DAD procedure on address allocation, did not generate high overhead to configure nodes. The simple handshake procedure for allocating an address allows the protocol to quickly configure new nodes. Given that in these experiments we did not consider cooperation between addressing and routing protocols, the strategy for address maintenance implemented by Prophet Allocation was responsible for most of the overhead. The acks periodically broadcast by all nodes were the reason for the high control-overhead values achieved by Prophet Allocation. The difference between the sent and received values achieved by the protocol is due to the number of neighbors each node has: the higher the number of neighbors, the higher the number of received packets/bytes. In addition, due to the small payload of ack messages, the number of packets is much higher than the number of bytes, as observed in Fig. 1. The traffic generated by AIPAC when configuring nodes is similar to Strong DAD. However, the excess traffic generated by AIPAC, as observed in Fig. 1, is due to the execution of the merging management procedure called "gradual merging". As one can observe in Fig. 1, in general MANETconf generated the most traffic when configuring all nodes. This is due to the excessive flooding during the procedures of DAD and table synchronization. The need for all nodes to reply to a received DAD request generates very high control overhead, on top of the broadcast message used to update allocation tables for every new address that is allocated. Due to post-allocation address maintenance, Prophet Allocation, AROD and AIPAC continued to generate traffic even after concluding the configuration of all nodes. However, simulation parameters (e.g., node arrival rate) were set so that the configuration of the last node happened close to 2000 seconds.
This way, the additional traffic did not impact the final results significantly. Fig. 2 illustrates the average delay achieved by the protocols for configuring the nodes in Scenario A. Strong DAD usually has an average of 15 seconds for node configuration, due to three rounds of the 5-second DAD procedure. However, given that several nodes were isolated at the beginning of the simulation, Strong DAD configured such nodes without executing the DAD procedure and, consequently, the average delay at the end of the simulation dropped to 9 seconds.
Fig. 2. Configuration delay (average delay in seconds versus the number of configured nodes, 1–100)
The configuration delay in MANETconf is strictly connected to the duration of the DAD procedure. Consequently, the average delay achieved by MANETconf was 5 seconds. In this case, the time for updating the allocation tables was not considered part of the configuration delay, because such a procedure is only triggered after the new node is already configured. The merging procedure implemented by MANETconf solves conflicts locally and, consequently, did not impact the protocol's performance significantly. Unlike MANETconf, the highly fragmented network had a strong impact on AROD's operation. The average configuration delay achieved by AROD was 8 seconds, even operating with the address reservation scheme. The control overhead for address maintenance and the merging management in AROD compromised its performance. In the end, the reliable merging aimed at by AROD took much time and generated excessive traffic. Prophet Allocation achieved the best results regarding configuration delay due to its simple handshake for address allocation. However, we must consider that the protocol's authors assume that the implemented mathematical equation may generate duplicate addresses at some point. One can infer that executing procedures for conflict correction, which would be a flooding-based DAD, would demand a much higher maintenance load from the nodes, resulting in performance degradation in overhead and configuration delay. In this case, we believe that the averages presented as results in Fig. 2 for Prophet Allocation would be increased by 5 seconds. AIPAC achieved the highest configuration delay. Its DAD procedure lasted 15 seconds on average. Two factors mainly contributed to such results. First, although merging problems were corrected later by the protocol, the procedures for gradual merging were unnecessarily triggered several times during node configuration.
The second reason is that AIPAC, unlike the other protocols, does not consider networks of a single node. This means that an isolated node does not configure itself if it does not have at least one neighbor. Therefore, isolated nodes at the beginning of the simulation did not configure themselves until another node was deployed in their neighborhood. This waiting time increased the average configuration delay for the first nodes, as shown in Fig. 2. A serious problem identified in the results of this experiment was the number of nodes that, at the end of the simulation, were configured with conflicting addresses. Both network fragmentation and the reduced addressing space drastically impacted the configuration consistency. As presented in Table 1, all evaluated protocols concluded their operation with conflicting node configurations. The weak performance of AIPAC, if compared to the other protocols, was rewarded by its success in merging a highly fragmented network and finishing the simulation with the lowest number of conflicts. When compared to the other protocols, MANETconf also achieved good results on the number of conflicts. This is due to the allocation tables implemented in all nodes. As with AIPAC, the costly performance of the protocol was justified by the reliability of its configurations at the end of the simulations.
Table 1. Address Conflicts in Scenario A

Protocol             Conflicts
Strong DAD              29
Prophet Allocation      45
MANETconf                8
AROD                    30
AIPAC                    5
Strong DAD configured initially isolated nodes and later did not handle merging situations by locating and correcting possible conflicts. Consequently, the number of conflicts was very high with Strong DAD. Prophet Allocation achieved the highest number of conflicts at the end of the simulation. Given that it does not implement a procedure for handling merging, the fragmented network was also one of the reasons for its bad performance. In addition, Prophet Allocation allows an implementation with more states in its equation, which would generate more distinct sequences of addresses and, according to its authors, fewer conflicts. However, we decided to implement its equation with only one state due to our goal of testing the protocols in very limited conditions. Although AROD ended the simulation with a high number of conflicts, it is important to mention that the protocol's procedure for conflict resolution was not implemented in this experiment. However, AROD successfully identified all the conflicts in the network. Such conflicts resulted from the merging of all isolated nodes and small networks. Considering that AROD implements a DAD procedure for resolving address conflicts, triggering such an operation would degrade the protocol's performance by increasing control overhead and configuration delay. This leads us to conclude that it would bring AROD's results close to AIPAC's.

4.2 Scenario B: Handling Networks Merging
Past works have evaluated self-addressing protocols under more controlled networking scenarios, considering mainly the allocation procedure rather than resource maintenance and management after allocation. However, real ad-hoc network scenarios may be quite different: nodes arbitrarily come and go, and in some situations not only individual nodes move, but entire networks may move from one place to another, temporarily or permanently merging with each other. In this second scenario, we tested the merging procedures of AROD and AIPAC during a merger of two networks, each composed of 10 nodes. At a predefined moment, after the initial configuration of all nodes, network N1 moved towards network N2, merging into a single network N3 with 20 nodes. Both AROD and AIPAC successfully performed the network merging. The difference between the merging strategies implemented by the two protocols is clearly observed in the results presented in Fig. 3. The conflict
R. de O. Schmidt, A. Pras, and R. Gomes
Fig. 3. Merging traffic Vs. Allocation traffic
resolution procedures were not considered in this experiment but are left as future work. As soon as AROD identifies that two networks start to overlap, the merging mechanism is triggered and a single network is quickly formed. The merging procedure in AROD affects all the nodes in one of the overlapping networks. On identifying that it is necessary to join the other network, a node announces the merging to its network's leader, and the latter floods the network with a reconfiguration message. This is a quick process and does not generate high control overhead: as one can observe in Fig. 3, on average only 5% of the total traffic generated by AROD resulted from the merging procedure. AIPAC, on the other hand, needs more time to execute the merging of networks and, as illustrated in Fig. 3, a much higher percentage of its traffic is related to the merging procedure: after successfully merging the two networks, merging-related traffic accounted for 80% of the total traffic generated by the protocol. This is due to the gradual merging strategy implemented in AIPAC, in which some nodes migrated to the other network and later migrated back to their original network, doubling the traffic these nodes generated during the merging procedure. Regarding the procedure delay, AROD detected and performed merging faster than AIPAC: it started the merging procedure as soon as the first nodes identified the network overlap and needed on average 40 seconds to complete the merging process. With its gradual merging strategy, AIPAC took longer to detect the merging and concluded the procedure with an average delay of 94 seconds.
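The leader-flood step of AROD's merging described above can be sketched as a breadth-first flood in which every node rebroadcasts the reconfiguration message exactly once, so the cost grows only linearly with the network size — consistent with the low merge-traffic share in Fig. 3. The topology below is a made-up example, not a trace from the experiment.

```python
from collections import deque

def flood_reconfiguration(adjacency, leader):
    """Flood a reconfiguration message from the network leader.
    Every node rebroadcasts once; returns (reached nodes, broadcasts)."""
    reached = {leader}
    queue = deque([leader])
    broadcasts = 0
    while queue:
        node = queue.popleft()
        broadcasts += 1                    # one broadcast per node
        for neighbor in adjacency[node]:
            if neighbor not in reached:    # ignore duplicate receptions
                reached.add(neighbor)
                queue.append(neighbor)
    return reached, broadcasts

# Hypothetical 5-node topology of the network that reconfigures.
topology = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
reached, broadcasts = flood_reconfiguration(topology, leader=0)
```

Regardless of link density, the flood costs one broadcast per node, which is why the merge announcement stays cheap relative to the allocation traffic.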
5 Conclusion
Several self-addressing solutions were already proposed for ad-hoc networks. From experiments, like the one presented in this paper, we can observe that
such solutions still fall short of providing alternatives for handling complex networking situations. Consistency of addresses is important due to its influence on other operations such as routing. However, the strategies designed to deal with these issues result in excessive control overhead and degraded performance of basic addressing. In addition, dependency on other technologies is not a good approach for self-addressing protocols when considering Future Network scenarios. For instance, addressing solutions like AIPAC and Prophet Allocation, which depend on routing protocols and even require modifications to routing operations, have their applicability drastically limited to scenarios using the specific routing protocol. This is even more problematic if we consider scenarios where routing is also dynamic and two or more routing protocols may coexist and cooperate. This paper presented the first steps in the evaluation of self-addressing approaches. As future work, the authors plan to carry out more extensive and complete simulations with self-addressing solutions, also covering other network scenarios. Our main goals are: (a) to contribute to the decision on the best self-addressing solution (or combination of two or more solutions) for different ad-hoc networks; (b) to develop a general evaluation framework for self-addressing solutions, based on the AUTOCONF [2] documentation on requirements and guidelines; and (c) to propose an alternative self-addressing solution for SooA [6], to be used in the absence of addressing servers, providing nodes with temporary configuration.
References

1. Baccelli, E.: Address Autoconfiguration for MANET: Terminology and Problem Statement. IETF Internet Draft (2008)
2. AUTOCONF: Ad-Hoc Network Autoconfiguration. IETF Working Group, http://www.ietf.org (accessed in April 2011)
3. Bernardos, C., Calderon, M., Moustafa, H.: Ad-Hoc IP Autoconfiguration Solution Space Analysis. IETF Internet Draft (2008)
4. Weniger, K., Zitterbart, M.: Address autoconfiguration in mobile ad hoc networks: current approaches and future directions. IEEE Network Magazine, Special Issue on Ad Hoc Networking: Data Communications & Topology Control 18(4), 6–11 (2004)
5. Perkins, C.E., Malinen, J.T., Wakikawa, R., Belding-Royer, E.M., Sun, Y.: IP Address Autoconfiguration for Ad Hoc Networks. IETF Internet Draft (2001)
6. de O. Schmidt, R., Gomes, R., Sadok, D., Kelner, J., Johnsson, M.: An Autonomous Addressing Mechanism as Support for Autoconfiguration in Dynamic Networks. In: Proceedings of the Latin American Network Operations and Management Symposium, LANOMS (2009)
7. Zhou, H., Ni, L.M., Mutka, M.W.: Prophet Address Allocation for Large Scale MANETs. In: Proceedings of the 22nd Annual Joint Conference of IEEE Computer and Communication Societies (INFOCOM), vol. 2, pp. 1304–1311 (2003)
8. Sun, Y., Belding-Royer, E.M.: Dynamic Address Configuration in Mobile Ad Hoc Networks. Technical Report 2003-11, University of California at Santa Barbara (2003)
9. Nesargi, S., Prakash, R.: MANETconf: Configuration of Hosts in a Mobile Ad Hoc Network. In: Proceedings of the 21st Annual Joint Conference of IEEE Computer and Communication Societies (INFOCOM), vol. 2, pp. 1059–1068 (2002)
10. Fall, K., Varadhan, K.: The ns Manual (formerly ns Notes and Documentation). The VINT Project (May 2010), http://www.isi.edu
11. Fazio, M., Villari, M., Puliafito, A.: AIPAC: Automatic IP Address Configuration in Mobile Ad Hoc Networks. Elsevier Computer Communications (COMCOM) 29(8), 1189–1200 (2006)
12. Kim, N., Ahn, S., Lee, Y.: AROD: An address autoconfiguration with address reservation and optimistic duplicated address detection for mobile ad hoc networks. Elsevier Computer Communications (COMCOM) 30, 1913–1925 (2007)
13. Jain, R.: The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation and Modeling. Wiley-Interscience, New York (1991), ISBN 0471503361
Considerations in the Design of Indoor Localization Systems for Wireless Sensor Networks

Jorge Juan Robles

Chair for Telecommunications, Dresden University of Technology, Dresden, 01069 Germany
[email protected]
Abstract. The design of an indoor sensor network to support localization-based services is a challenging issue, in that the protocols and localization algorithms have to be adapted to the capabilities of resource-constrained sensor nodes. Furthermore, the achieved position accuracy depends on many factors, like the amount of signaling, the quality of inter-node measurements, and the mathematical algorithm used for estimating the position. In this paper we present experiences and considerations in the design of such systems for IEEE 802.15.4 indoor sensor networks.

Keywords: localization algorithm, position estimation, RSSI, wireless sensor networks, ranging protocols.
1 Introduction

A Wireless Sensor Network (WSN) consists of many sensor nodes that gather information from the environment, enabling interesting applications like animal monitoring, control of industrial processes, and home automation. Position information plays an important role in many context-based applications, where measurements have to be accompanied by the node position to be useful. Perhaps the simplest way to determine the position is to include a GPS receiver in the sensor node. Unfortunately, this is expensive in terms of energy consumption; furthermore, the GPS signal cannot be received correctly in indoor scenarios. For these reasons, alternative localization systems have to be designed for low-power sensor nodes. In general, we can consider a localization system as the integration of three main functional blocks: data management, localization algorithms, and network protocols. Data management refers to the acquisition and processing of the data necessary for the position estimation, such as connectivity, distance, and motion information. The localization algorithms are the mathematical methods used for position estimation, e.g. the well-known Multilateration [21], Weighted Centroid Localization [22], or proposals based on particle filters. Finally, the network protocols enable the communication required by the localization process, for example the routing and neighbor-discovery protocols.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 43–53, 2011. © Springer-Verlag Berlin Heidelberg 2011

Note that each functional block interacts
with each other. If the goal is to keep the complexity as low as possible while achieving acceptable position accuracy, optimizations in all three functional blocks are recommended. At the Chair for Telecommunications of the Dresden University of Technology, different research works related to localization in WSNs have been conducted [1-14]. This paper presents an overview of the main conclusions obtained from these works and provides a guideline for the design of localization systems in 802.15.4 sensor networks. An indoor scenario is taken as reference in our work. Here, fixed sensor nodes with known positions, called Anchors (ANs), are externally powered. The Mobile Nodes (MNs), whose positions are unknown, are battery powered. Our interest is to determine their positions and extend the lifetime of their batteries as much as possible.
2 Data Management

Most IEEE 802.15.4 transceivers have the capability to measure the received signal strength without additional hardware. In this section we briefly discuss the suitability of this data for use by localization algorithms.

2.1 Signal Strength Measurements for Localization

The Received Signal Strength Indicator (RSSI) is an inter-node measurement and can be used as a proximity indicator in localization algorithms, as well as to determine the link quality between nodes. The IEEE 802.15.4 standard does not clearly specify the format of the RSSI, so this indicator can differ between device manufacturers. In all our experiments we use the sensor node RCB230 from Dresden Elektronik [20]. It contains the low-power 802.15.4 transceiver AT86RF230 [18], operating at 2.45 GHz, and the 8-bit microcontroller ATmega1281 [19]. As the RSSI indicator we use an internal register of the transceiver called Energy Detection (ED), which averages the signal strength measurements over 128 µs. This register can take 85 different values and has a resolution of 1 dBm. The RSSI measurements are influenced by the antenna characteristics, temperature, obstacles, human interaction, and the multipath effect. Because RSSI measurements fluctuate, it is advisable to take several RSSI samples and apply statistical methods. Usually, the average over a certain number of RSSI samples is used in localization algorithms. However, in cases with high dispersion, the median can be very useful because it filters out unusual values or outliers, yielding a stable value. In order to analyze the spatial correlation of the RSSI, signal strength measurements were taken at defined distances under five different conditions. The first two tests, shown in Fig. 1, were carried out in an outdoor scenario.
The first one was conducted without the influence of persons (Outdoor 1), whereas in the second someone held a node while the measurements were taken (Outdoor 2). The next three tests, shown in Fig. 2, were carried out in an indoor scenario (floor) without the influence of persons, with the nodes at three different heights (h=0 m; h=10 cm; h=1.2 m). In each
point we averaged more than 400 measurements. The error bars show the minimum and maximum values. In Fig. 1 the error bars for Outdoor 2 indicate a high dispersion in the measurements compared to Outdoor 1, principally due to the contact of the person with the node. Furthermore, a big difference between the curves of the different tests is clearly visible: in multiple cases there is a difference of more than 20 dB for the same distance. This means that if only a general attenuation model is used to represent the signal strength in all scenarios, the error in the distance estimation is very high. Particularly in indoor scenarios, the multipath effect makes it difficult to estimate the distance because of the ambiguity of distance values in some RSSI regions. Our experience shows that the RSSI is a poor indicator for distance determination. In some cases the RSSI could be used if the stationary characteristics of the scenario are taken into account and a previous calibration is carried out.
Fig. 1. Relationship between RSSI and distance in outdoor scenarios
Fig. 2. Relationship between RSSI and distance in indoor scenarios
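The advice in Section 2.1 to prefer the median under high dispersion can be made concrete: a few deep-fade outliers (as in the Outdoor 2 test) drag the mean away, while the median stays at the typical value. The sample values below are invented for illustration, not measured data.

```python
from statistics import mean, median

# Hypothetical ED/RSSI samples in dBm: a stable level around -70 dBm
# plus two outliers caused, e.g., by a person touching the node.
samples = [-70, -69, -71, -70, -68, -72, -70, -95, -93, -69]

avg = mean(samples)     # pulled toward the outliers: -74.7 dBm
med = median(samples)   # robust central value: -70.0 dBm
```

Two out of ten outliers shift the mean by almost 5 dB, while the median is unaffected — which is why the median is preferred when the dispersion is high.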
3 The Localization Algorithms

3.1 Distance-Based Localization Algorithms

Many localization algorithms use the distances to reference nodes to estimate the position. Algorithms like Multilateration [21][15][12] provide the exact coordinates of the blind node when the distance information and the anchor positions contain no errors. On the other hand, there are algorithms that only approximate the position of the node even when there is no error in the estimated distances and AN positions. Two well-known algorithms of this latter group are Min-Max [21] and WCL [22]. One of their advantages is that they are more robust against distance errors than the exact methods, and given their low complexity they can easily be implemented on sensor nodes. The position accuracy of Min-Max and WCL decreases considerably when the blind node is outside the region bounded by the anchors of the periphery: these algorithms tend to place their estimate inside that region even when the blind node is outside of it. Therefore, the further the blind node is from the internal region, the higher the position error. In contrast, Multilateration is able to provide the coordinates of a blind node located outside the region delimited by the anchors. By simulation, we analyzed the average position error of distance-based localization algorithms for different distance errors (Fig. 3). The 2D reference scenario is a square of 40 m × 40 m, with four anchors placed at the corners. The estimated distance is modeled by adding a zero-mean Gaussian random variable to the real distance. The standard deviation defines the distance error and is given as a percentage of the real distance. The MN takes different locations inside the scenario, where its position is estimated. More than 1500 estimations are averaged at each position.
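A minimal sketch of the two approximate algorithms, using the same four-corner anchor layout as the simulation above (the WCL weight exponent and the Min-Max box construction follow the usual formulations of [21][22]; the concrete values are assumptions):

```python
def wcl(anchors, dists, g=1.0):
    """Weighted Centroid Localization: nearer anchors get larger weights."""
    weights = [1.0 / d ** g for d in dists]
    total = sum(weights)
    x = sum(w * ax for w, (ax, _) in zip(weights, anchors)) / total
    y = sum(w * ay for w, (_, ay) in zip(weights, anchors)) / total
    return x, y

def min_max(anchors, dists):
    """Min-Max: intersect the per-anchor bounding boxes, take the center."""
    left   = max(ax - d for (ax, _), d in zip(anchors, dists))
    right  = min(ax + d for (ax, _), d in zip(anchors, dists))
    bottom = max(ay - d for (_, ay), d in zip(anchors, dists))
    top    = min(ay + d for (_, ay), d in zip(anchors, dists))
    return (left + right) / 2.0, (bottom + top) / 2.0

# Four anchors at the corners of the 40 m x 40 m square; with exact
# distances from the center (20, 20), both algorithms recover it.
anchors = [(0.0, 0.0), (40.0, 0.0), (0.0, 40.0), (40.0, 40.0)]
dists = [800.0 ** 0.5] * 4   # ~28.28 m to every corner
wcl_est = wcl(anchors, dists)
mm_est = min_max(anchors, dists)
```

Both estimators are bounded by the anchor region by construction, which illustrates why their error grows once the blind node leaves that region.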
Trilateration uses only the information of three distances to ANs to estimate the position, whereas Multilateration exploits all available distances. From Fig. 3 we can see that WCL and Min-Max perform better than the lateration-based algorithms when the standard deviation is higher than 10%. Finally, it is important to note that the relative positions of the ANs impact the achieved position accuracy. For instance, consider a worst case where three ANs are collinear: if a MN computes its position using the distances to these ANs, the result is ambiguous, because the MN cannot decide between two possible position estimates. Two important indicators that characterize the geometry of the ANs are the Cramer-Rao Lower Bound (CRLB) and the Geometric Dilution of Precision (GDOP). They can be used to determine the optimal locations of the ANs inside the scenario. For further information see [15].

3.2 The Fingerprint Technique

The "fingerprint technique" is often used in indoor scenarios because it takes the stationary characteristics of the scenario into account (e.g. the attenuation due to walls). This technique consists of two phases: the calibration phase and the online phase. During the calibration phase, RSSI measurements are taken at
Fig. 3. Robustness of localization algorithms against the distance error
known positions. These positions, together with their related RSSI measurements, represent a "fingerprint" of the scenario. The fingerprints are saved in a database. In the online phase, which is the normal operation of the system, a blind node takes a new set of RSSI measurements. By comparing the online measurements with the stored fingerprints, it is possible to approximate the position of the blind node. The position accuracy achieved by the fingerprint technique is in the range of 2-4 meters in an indoor scenario [23][17]. By simulation, we compared a low-complexity fingerprint-based algorithm with WCL in a 3D scenario [1]. Our results show that the fingerprint technique can improve the position accuracy, although it is more sensitive to the RSSI dispersion than WCL.
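In its simplest form, the online phase reduces to a nearest-neighbor search in RSSI space. A sketch with an invented three-anchor database (positions, layout, and RSSI values are assumptions, not the algorithm evaluated in [1]):

```python
def locate_by_fingerprint(online_rssi, database):
    """Return the stored position whose calibration fingerprint is
    closest (squared Euclidean distance in RSSI space) to the new
    online measurement vector."""
    def dist2(fingerprint):
        return sum((a - b) ** 2 for a, b in zip(fingerprint, online_rssi))
    position, _ = min(database, key=lambda entry: dist2(entry[1]))
    return position

# Calibration-phase database: (position, RSSI vector from three ANs).
db = [
    ((0.0, 0.0), [-50, -70, -80]),
    ((5.0, 0.0), [-60, -60, -75]),
    ((5.0, 5.0), [-70, -55, -60]),
]
estimate = locate_by_fingerprint([-68, -57, -62], db)
```

Because the database encodes the real propagation conditions (walls, furniture), no attenuation model is needed — but the same dependence on measured RSSI vectors is what makes the method sensitive to RSSI dispersion.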
4 Network Protocols

The IEEE 802.15.4 standard [16] has been designed for low-rate sensor networks and defines the PHY and MAC layers of the communication model. In this section we analyze the features of this standard most relevant to localization systems.

4.1 MAC Layer in 802.15.4

In 802.15.4, CSMA-CA is used for the communication between two nodes. The main drawback of the CSMA process is that the transceiver consumes extra energy while it waits a random time (backoff delay) and checks whether the channel is idle or busy during the Clear Channel Assessment (CCA). If the channel is free the MN can transmit; if not, it has to repeat the aforementioned process. After a certain number of unsuccessful attempts, the MAC layer notifies the application layer by sending a Channel Access Failure (CAF). The application layer can then discard the packet or try a retransmission. For further information about the operation of CSMA, please see [16].
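The unslotted CSMA-CA behaviour described above can be sketched as follows; timings are abstracted to backoff units, the default parameters (macMinBE = 3, aMaxBE = 5, macMaxCSMABackoffs = 4) follow the standard, and the `channel_idle` callback is a modeling assumption.

```python
import random

def csma_ca_attempt(channel_idle, max_backoffs=4, min_be=3, max_be=5):
    """Unslotted CSMA-CA sketch: wait a random backoff, perform a CCA,
    and widen the backoff window after every busy channel; give up with
    a Channel Access Failure after max_backoffs + 1 busy CCAs.
    Returns (success, total backoff units waited)."""
    be = min_be
    waited = 0
    for _ in range(max_backoffs + 1):
        waited += random.randint(0, 2 ** be - 1)  # random backoff delay
        if channel_idle():                        # Clear Channel Assessment
            return True, waited                   # channel free: transmit
        be = min(be + 1, max_be)                  # exponentially wider window
    return False, waited                          # CAF reported to app layer

# An always-busy channel exhausts all attempts and yields a CAF;
# an idle channel succeeds on the first CCA.
failed, _ = csma_ca_attempt(lambda: False)
succeeded, _ = csma_ca_attempt(lambda: True)
```

The accumulated `waited` value is the energy-relevant quantity: the transceiver is awake during the whole backoff-and-CCA sequence even when the payload transmission itself lasts only about 1 ms.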
Using an oscilloscope, we measured the voltage across a resistance (2 Ohms) in series between the battery (2.8 V) and the sensor node. Thus, it was possible to investigate the energy consumption during the transmission process [3]. Fig. 4 shows a snapshot in which the microcontroller is always active and the transceiver wakes up to transmit a packet. Note that it performs three CCAs before the effective packet transmission; after this task, the transceiver goes into sleep mode again. Here, the duration of the transmission is about 1 ms, while the CSMA process takes more than 10 ms. In such cases the whole transmission process can mean high energy consumption, especially if the channel is frequently busy. In a previous work of the authors [3], an analytical model for 802.15.4 sensor networks is proposed that considers the influence of the backoff delay on the energy consumption of the transmissions. For example, assume that each MN randomly broadcasts 8 packets of 1 ms during a period of 300 ms. Between the transmissions the MN sleeps, saving energy. The ANs take signal strength measurements, which are used for the MN's position estimation. Fig. 5 shows the expected energy consumption of the MN during the period; the values of the energy consumed in the different MN operation modes are taken from [3]. Additionally, Fig. 5 depicts the probability that the CSMA process is successful, leading to an effective transmission. The negative impact of a high number of MNs on both variables can be clearly identified from the figure.
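The shunt-resistor measurement above converts an oscilloscope trace into power by Ohm's law: the node current equals the shunt voltage divided by 2 Ω, and multiplying by the 2.8 V supply gives the instantaneous power. In this sketch the small voltage drop over the shunt itself is neglected, and the 34 mV sample value is invented for illustration.

```python
R_SHUNT = 2.0    # ohms, in series with the battery
V_BATTERY = 2.8  # volts

def power_from_shunt(v_shunt):
    """Instantaneous power drawn by the node, from the shunt voltage."""
    current = v_shunt / R_SHUNT       # I = V_R / R
    return V_BATTERY * current        # P = V_bat * I (shunt drop neglected)

# A 34 mV reading corresponds to 17 mA, i.e. about 47.6 mW.
p_tx = power_from_shunt(0.034)
```

Integrating this power over the oscilloscope trace yields the per-operation energies (transition, CCA, transmission) reported later in the paper.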
Fig. 4. CSMA process during one transmission
Fig. 5. MN's energy consumption with increasing number of transmitting nodes
4.2 Listen or Transmit?

As mentioned, RSSI measurements can be used for position estimation in 802.15.4 networks. Basically, there are two ways to obtain RSSI measurements: either the MN listens for a certain time and measures the RSSI of packets transmitted by the ANs (scheme 1), or the MN broadcasts packets and the ANs take the corresponding RSSI measurements (scheme 2). Either scheme can be adequate depending on the scenario and application. In scheme 1, if many ANs broadcast a large number of packets, the MN may have to listen for a long time to receive the necessary RSSI samples from each AN. The number of required RSSI samples depends on the application; in general, the more RSSI samples are taken, the more stable the obtained RSSI value. If the transmissions of the ANs are not immediate, the MN can consume a large amount of energy during the idle listening periods. Using our 802.15.4 sensor nodes we investigated the listening period at the MNs. Fig. 6 is taken from a previous work [3]. In this test the MN sends a packet to indicate to the ANs that they should start transmitting. Each AN then tries to broadcast 10 short packets (p1, p2, ..., p10), and the MN registers the arrival of each packet and its source node. Although the ANs use CSMA in the MAC layer, an additional random delay is included in the application layer: before the first transmission a random time of [0-6 ms] is used to avoid collisions, and between subsequent transmissions another random time ([0-1 ms]) is used by the ANs to increase the probability of a successful transmission. We define Tm as the instant when the MN registers the last pX sample, referenced to the MN's first transmission. By way of example, according to Fig. 6, if there are three ANs (A=3), the MN is expected to listen for 20 ms (average Tm = 20 ms) to receive two samples (X=2) from each AN (assuming that all ANs successfully transmitted at least p1 and p2). Fig. 6 suggests a linear relationship between Tm and X when there are few ANs, although this relationship changes with an increasing number of ANs, principally due to the packets that cannot be successfully transmitted.
Fig. 6. Listening period according to the required number of RSSI samples [3]. The confidence intervals (95%) are shorter than 2.5ms in all cases.
The information of the last arrival can be very useful to determine the expected listening period, and thus the MN's energy consumption. One advantage of scheme 1 is that the MN is able to estimate its position using its own measured RSSI values. In scheme 2, if many nodes are transmitting packets, the MN's energy consumption can increase due to the time spent in the backoff process (see Fig. 5). Furthermore, all the RSSI measurements taken by the ANs have to be sent to an operation unit, like a cluster head or a central computer, for the position calculation. This generates traffic between the ANs, reducing the scalability of the network. However, scheme 2 can be very attractive in terms of MN energy consumption compared to scheme 1. For instance, assume in scheme 1 that the node RCB230 listens for RSSI samples during 40 ms and estimates its position in 1 ms; afterwards, it sleeps 959 ms and repeats the cycle. In this configuration the lifetime of its batteries (1500 mAh) is about two months. As an example of scheme 2, imagine the following configuration: the MN is sleeping, wakes up to transmit a short packet (0.8 ms) once per second, and then goes back to sleep mode. Here, the lifetime of the batteries is greater than two years (assuming that the channel is not frequently busy). The energy consumption in the transmission and reception modes depends on the amplification and modulation used in the transceivers. In fact, both modes present similar energy consumption in many 2.4 GHz 802.15.4 transceivers: the transceiver AT86RF230 from Atmel [18] consumes 45 mW in the reception mode and 50 mW in transmission at 3 dBm, and the transceiver CC2430 from Texas Instruments [24] consumes 80 mW during reception and 81 mW in transmission at 0 dBm. It is very important to consider the transitions between the different modes, because they are not negligible in terms of energy consumption and time.
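The two lifetime figures quoted above can be checked with duty-cycle arithmetic. This is a sketch under stated assumptions: the 45 mW receive power is the AT86RF230 datasheet value, and the microcontroller's consumption during listening is ignored, so the scheme 1 estimate comes out somewhat above the quoted two months.

```python
BATTERY_J = 1.5 * 2.8 * 3600   # 1500 mAh at 2.8 V ~= 15.1 kJ

def lifetime_days(energy_per_cycle_j, cycle_s=1.0):
    """Battery lifetime given the energy spent per duty cycle."""
    average_power_w = energy_per_cycle_j / cycle_s
    return BATTERY_J / average_power_w / 86400.0

# Scheme 1: 40 ms listening at ~45 mW plus ~960 ms sleeping at 72 uW.
scheme1 = lifetime_days(0.040 * 45e-3 + 0.959 * 72e-6)

# Scheme 2: one wake-up transition (81 uJ) and one 0.8 ms beacon
# (54 uJ) per second, sleeping (72 uW) for the rest of the cycle.
scheme2 = lifetime_days(81e-6 + 54e-6 + 0.999 * 72e-6)
```

With these assumptions scheme 2 reaches well over two years, matching the text, while scheme 1 lands at roughly three months; adding the microcontroller's consumption during the listening window moves the latter toward the quoted two months.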
In the RCB230 we measured a transition time of 2 ms between the sleep mode, in which transceiver and microcontroller are inactive and consume 72 µW, and the transmission mode. The energy consumed in this transition alone is about 81 µJ, while the transmission of a typical beacon of 0.8 ms consumes 54 µJ. In short, scheme 1 can be appropriate if there are many MNs in the network that require localization. In this scheme, the MN's energy consumption rises with
increasing numbers of ANs and required RSSI samples. On the other hand, scheme 2 allows the MNs to consume very little energy. However, the high traffic generated in the network can reduce the number of admissible MNs: the more MNs are transmitting, the higher the expected backoff delay, and thus the MN's energy consumption. In [5] we propose a new hybrid protocol which combines the advantages of schemes 1 and 2.

4.3 Mobility in 802.15.4 Sensor Networks

802.15.4 sensor nodes have two different addresses. The large address (64 bits) is unique and does not depend on the associated network. In contrast, the short address (16 bits) is dynamically assigned by a coordinator and can be used, e.g., by a routing algorithm. By analyzing the header of a transmitted packet it is possible to obtain the address of the source. In 802.15.4 networks, an association process is executed when a new node joins the network: the new node has to scan the channel and, after detecting a message from the coordinator, initiates an exchange of signaling to receive a short address (16 bits) from the coordinator (parent node). If the node loses contact with its parent node and a new short address is necessary, the orphan node attempts a re-association with another coordinator of the network. The re-association process can mean high energy consumption at the MN. Therefore, efficient handover mechanisms have to be developed to support communication when the MN's level of mobility is high. A simple alternative mechanism is implemented in [3], where the packets transmitted for localization purposes are also exploited by the handover process. In this proposal, the ANs use the short address and the MN uses the large address. In this way, it is possible to implement a low-complexity address-based routing algorithm between the fixed ANs, which decides the next hop according to the short address of the destination.
The MN mainly makes use of the large address to avoid the re-association process defined in the standard. During the first phase of the localization protocol OLP described in [3], the ANs broadcast packets and the MNs take RSSI measurements for their position estimation (scheme 1). By comparing the RSSI measurements, the MN can also detect the nearest AN. Then, if the MN wants to transmit a packet to the sink, it sends its packet directly to the nearest AN without generating additional signaling. This "selected AN" sends a packet to the sink containing the MN's large address and the received information. Conversely, if the sink wants to transmit a packet to the MN, it has to send the packet to the selected AN for subsequent delivery. In order to minimize the impact of the MN's mobility on the communication, the selection of the nearest AN has to be updated periodically. The disadvantage of using the large address is that the header of the transmitted packets is larger than with the short address.
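The periodic nearest-AN selection above reduces to picking the anchor with the strongest averaged RSSI. A sketch (the AN short addresses and sample values are invented for illustration):

```python
from statistics import mean

def select_nearest_an(rssi_by_an):
    """Pick the AN with the strongest average RSSI as the MN's
    next-hop anchor; rssi_by_an maps an AN short address to its
    recent RSSI samples in dBm."""
    return max(rssi_by_an, key=lambda an: mean(rssi_by_an[an]))

# Hypothetical samples collected during one broadcast phase from three ANs.
rssi_log = {
    0x0001: [-62, -60, -64],
    0x0002: [-55, -57, -54],   # strongest on average -> selected
    0x0003: [-80, -78, -81],
}
selected_an = select_nearest_an(rssi_log)
```

Averaging over several samples before comparing mitigates the RSSI fluctuations discussed in Section 2.1; re-running the selection each broadcast phase tracks the MN's movement without extra signaling.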
5 Conclusions

This paper deals with the design of localization systems for resource-constrained 802.15.4 sensor nodes. In such networks the RSSI can be used for localization
purposes. Due to the dispersion of the RSSI measurements and other problems such as the multipath effect, only a coarse position can be estimated. Min-Max and WCL are two localization algorithms that are very robust against distance errors. This characteristic makes them appropriate when the RSSI is used for the distance estimation. The fingerprint technique is normally used in indoor scenarios because it can take the stationary characteristics of the scenario into account. The transmission of packets and the taking of RSSI measurements are two different schemes that require optimization. During transmission, the backoff delay in the MAC layer can be very large if the channel is often busy. On the other hand, if the arrival time between RSSI samples is large, the idle listening at the receiver increases. Both effects can mean high energy consumption at the sensor node. Mobility can also impact the energy consumption, principally due to the association process between nodes. In this paper we discussed the above-mentioned problems and suggested some solutions.
References

1. Robles, J.J., Deicke, M., Lehnert, R.: 3D Fingerprint-based Localization for Wireless Sensor Networks. In: International Workshop on Positioning, Navigation and Communication (WPNC), Dresden, Germany (March 2010)
2. Robles, J.J., Tromer, S., Quiroga, M., Lehnert, R.: A Low-power Scheme for Localization in Wireless Sensor Networks. In: Aagesen, F.A., Knapskog, S.J. (eds.) EUNICE 2010. LNCS, vol. 6164, pp. 259–262. Springer, Heidelberg (2010)
3. Robles, J.J., Tromer, S., Quiroga, M., Lehnert, R.: Enabling Low-power Localization for Mobile Sensor Nodes. In: International Conference on Indoor Positioning and Indoor Navigation (IPIN), Zurich, Switzerland (September 2010)
4. Radeke, R., Robles, J.: Node Degree Improved Localization Algorithms for Ad-hoc Networks. In: 10th International Conference on Ad-Hoc Networks and Wireless (ADHOC-NOW), Paderborn, Germany (July 2011) (to be presented)
5. Robles, J.J., Tromer, S., Perez Hidalgo, J., Lehnert, R.: A High Configurable Protocol for Indoor Localization Systems. In: International Conference on Indoor Positioning and Indoor Navigation (IPIN), Guimaraes, Portugal (September 2011) (to be presented)
6. Martinez, J.: Leistungsbewertung eines Lokalisierungsalgorithmus in drahtlosen Sensoren. Diploma thesis, TU Dresden (February 2009)
7. Benavidez, M.: Investigation of the RSSI Value for Ranging. Student work, TU Dresden (December 2009)
8. Tromer, S.: Implementation of an Energy Efficient Indoor Localization Algorithm. Student thesis, TU Dresden (February 2010)
9. Quiroga, M.: Graphical Remote Application for an Indoor Localization System. Student thesis, TU Dresden (February 2010)
10. Mangili, F.: Evaluation of Energy Consumption of Sensor Nodes in Localization Applications. Student thesis, TU Dresden (July 2010)
11. Oñate, R.: Performance Analysis of an Energy Efficient Localization Algorithm. Student thesis, TU Dresden (July 2010)
12. Lorenz, M.: Optimierung von kollaborativen Lokalisierungsalgorithmen. Diploma thesis, TU Dresden (July 2010)
13. Miralles, P.: Adaptive Localization in a Localization System. Student thesis, TU Dresden (February 2011)
Considerations in the Design of Indoor Localization Systems for WSNs
53
14. Iribarren, M.: Central Localization for a Wireless Sensor Network - Diploma thesis, TU Dresden (November 2010) 15. Mao, G., Fidan, B.: Localization Algorithms and Strategies for Wireless Sensor Networks. Idea Group Publishing, USA (2009) 16. IEEE standard 802 (April 15, 2006), http://standards.ieee.org/getieee802/802.15.html 17. Lorincz, K., Welsh, M.: MoteTrack: a robust, decentralized approach to RF-based location tracking. In: Personal and Ubiquitous Computing, vol. 11(6), Springer, London (2007) 18. Datasheet IEEE 802.15.4 transceiver AT86RF230, http://www.atmel.com 19. Datasheet microcontroller ATmega1281, http://www.atmel.com 20. Datasheet RCB230 sensor node, http://www.dresden-elektronik.de 21. Langendoen, K., Reijers, N.: Distributed localization in wireless sensor networks: a quantitative comparison. The International Journal of Computer and Telecommunications Networking - Special issue: Wireless Sensor Networks Archive 43(4) (November 2003) 22. Blumenthal, J., Grossmann, R., Golatowski, F., Timmermann, D.: Weighted centroid localization in Zigbee-based sensor networks. In: Proceedings of the IEEE International Symposium on Intelligent Signal Processing (WISP 2007), pp. 1–6 (October 2007) 23. Liu, H., Darabi, H., Banerjee, P., Liu, J.: Survey of Wireless Indoor Positioning Techniques and Systems. IEEE Transactions on System, Man and Cybernetics 37(6) (November 2007) 24. Datasheet IEEE 802.15.4 transceiver CC2430, http://www.ti.com
Backoff Algorithms Performance in Burst-Like Traffic

Ievgenii Tsokalo¹, Yamnenko Yulia², and Stanislav Mudriievskyi³

¹ Chair for Industrial Electronics, National Technical University of Ukraine "Kyiv Polytechnic Institute", Prospect Peremogy 37, 03056 Kiev, Ukraine
[email protected]
² Chair for Industrial Electronics, National Technical University of Ukraine "Kyiv Polytechnic Institute", Prospect Peremogy 37, 03056 Kiev, Ukraine
[email protected]
³ Chair for Telecommunications, Technische Universität Dresden, 01062 Dresden, Germany
[email protected]
Abstract. In the present paper several backoff algorithms are considered in order to analyze their behavior under the burst-like traffic of a smart metering network: the YITRAN, binary exponential and polynomial backoff algorithms. The comparison was performed by estimating the probability of successful transmission using Markov chain models for each algorithm. The results of the calculation show that the YITRAN algorithm outperforms the other ones for almost all kinds of burst-like traffic. The developed software can be used to implement other backoff algorithms and to compare them. A novel backoff algorithm was developed based on the results of the comparison. Keywords: backoff algorithms, Markov chains, Smart Grid.
1 Introduction

The Smart Grid is currently one of the most intensively investigated topics. This high attention was drawn by the prediction that the introduction of the Smart Grid will increase the share of "green" energy in consumption and will reduce overall energy usage. In order to implement the Smart Grid, distributed energy generation is foreseen. In this context a Microgrid is defined as "a localized grouping of electricity generation, energy storage, and loads" [10]; the Smart Grid is expected to consist of a number of Microgrids. One of the communication technologies considered for control of the Smart Grid, and especially the Microgrid, is Narrow-band Power Line Communications (NB-PLC) [3]. However, most of the existing MAC (Medium Access Control) protocol scheduling mechanisms for such communication networks [1] are very sensitive to the kind of traffic. Thus, the development of a new MAC protocol with reduced traffic sensitivity and stable operation under any kind of traffic is an urgent and challenging task. R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 54–64, 2011. © Springer-Verlag Berlin Heidelberg 2011
Here it was assumed that the traffic consists of traffic bursts. It can be described by two parameters: the average variation of the number of active nodes and the number of packets in their queues. For each pair of these parameters there are certain scheduling mechanisms which are most preferable. However, it is not possible to obtain accurate values of these parameters before each transmission frame; they can only be tracked by collecting statistical data. This does not allow fast reaction to a traffic change, but the most suitable scheduling mechanism for a given kind of network and time of day can be chosen. Switching between different mechanisms should also be possible if the traffic kind changes and remains so for a long time. So, the target of this work is the development of a scheduling mechanism map that permits choosing the most suitable mechanism for a certain kind of traffic. The map shows the preference of one or another algorithm based on the two traffic parameters mentioned above. Usually the variation of the number of active nodes in a Microgrid [5] is high. CSMA/CA is chosen as the channel access mechanism because it shows one of the best abilities to accommodate burst-like traffic [1, 6]. The main part of the scheduling mechanism in CSMA/CA is the backoff algorithm. Thus, comparing different backoff algorithms and choosing the most effective one is a necessary condition for developing a proper scheduling mechanism. In the second section of the paper, models of different backoff algorithms are proposed. In the third and fourth sections, static and dynamic characteristics of several algorithms are presented.
2 Backoff Algorithms Models

Three algorithms were considered: binary exponential backoff (BEB), polynomial backoff (POL) and the YITRAN backoff algorithm [2, 4, 7]. In the last one the contention window width (CW) depends on the number of active nodes. YITRAN's algorithm has good performance [4] and can be used effectively to evaluate traffic parameters, so it was considered as the basic algorithm. The performance of backoff algorithms can be estimated using the method of Markov chains [8] or by simulation. The first one allows the backoff algorithm to be analyzed separately from the rest of the scheduling mechanism and thus gives more precise results; that is why it is considered in detail in the following. A one-dimensional Markov chain can be characterized by a transition matrix π, where the rows represent previous states and the columns new states:
$$\pi = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1\,CW_{max}} \\
p_{21} & p_{22} & \cdots & p_{2\,CW_{max}} \\
\vdots & \vdots & \ddots & \vdots \\
p_{CW_{max}\,1} & p_{CW_{max}\,2} & \cdots & p_{CW_{max}\,CW_{max}}
\end{pmatrix}$$

with rows indexed by CW_old and columns by CW_new.
Therefore, the matrix is square and each state corresponds to a certain value of CW. The matrix dimension is equal to the difference between the maximum (CW_max) and
minimum (CW_min) width of the CW (in π, CW_min is shown as 1 for simplicity of visualization). Hereinafter, according to [4], CW_max = 500 and CW_min = 15. Each element of the matrix corresponds to the probability P_{CW_old,CW_new} of transition from the old (CW_old) to the new (CW_new) state. The algorithm for calculating the probability P_{CW_old,CW_new} is based on the method from [4]. First, the parameter of the probability density function (PDF) has to be estimated. This is the probability that the estimated number of active nodes r equals the real value R:

$$p(R=r) = \frac{S-1}{CW_{old}} \cdot \left(\frac{CW_{old}-S+1}{CW_{old}}\right)^{r}, \qquad S = \frac{CW_{old}}{r+1} \tag{1}$$
where S is the expected value of the minimal backoff counter among all nodes in the network in one transmission frame. The new value of CW is calculated using the autoregression function CW_new = f(CW_old, r):

$$f(CW_{old}, r) = \chi \cdot CW_{old} + (1-\chi) \cdot 12 \cdot r \tag{2}$$
Using (1) and (2), the matrix π was filled with the probabilities P_{CW_old,CW_new} by the following algorithm (the value of r was varied in the range from 1 to 350):

$$\text{for } CW_{old} \in (1, CW_{max}): \quad CW_{new} = f(CW_{old}, r), \quad P_{CW_{old}\,CW_{new}} \mathrel{+}= p(r) \tag{3}$$
Here f(CW_old, r) gives the new state and p(r) the probability of transition to this state. The sum of the items in each row of the matrix π should be equal to one. Since p(r) is defined on an infinite range of r, each row of π has to be normalized. The fixed row vector is found from π using Theorem 11.8 of [8]:

$$w \cdot \pi = w \tag{4}$$
where w = (w_1, w_2, ..., w_{CW_max}) is a row vector of probabilities of size [1 × CW_max]. Its maximum element should correspond to the expected average CW value for a certain real number of active nodes R, which can be found from the following formula [4]:

$$CW_{opt} = 12 \cdot R \tag{5}$$
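The construction of Eqs. (1)-(4) can be sketched as follows. This is a minimal illustration, not the authors' software: the autoregression weight chi of Eq. (2) is not given in this excerpt, so the value 0.9 below is an assumption; negative values of Eq. (1), which lie outside the model's validity range, are clipped to zero; and the fixed vector of Eq. (4) is approximated by power iteration rather than solved exactly.

```python
import numpy as np

CW_MIN, CW_MAX = 15, 500   # window limits from [4], as quoted in the text
CHI = 0.9                  # autoregression weight chi -- assumed value
R_RANGE = range(1, 351)    # range of r used to fill the matrix (Sec. 2)

def p_estimate(cw_old, r):
    """Eq. (1): probability that the estimated node count equals r."""
    s = cw_old / (r + 1)                       # expected minimal backoff counter
    p = (s - 1) / cw_old * ((cw_old - s + 1) / cw_old) ** r
    return max(p, 0.0)                         # clip values outside validity range

def cw_update(cw_old, r):
    """Eq. (2): autoregressive contention-window update, clamped to the limits."""
    cw_new = int(CHI * cw_old + (1 - CHI) * 12 * r)
    return min(max(cw_new, CW_MIN), CW_MAX)

n_states = CW_MAX - CW_MIN + 1
pi = np.zeros((n_states, n_states))
for cw_old in range(CW_MIN, CW_MAX + 1):
    for r in R_RANGE:                          # Eq. (3): accumulate transition mass
        pi[cw_old - CW_MIN, cw_update(cw_old, r) - CW_MIN] += p_estimate(cw_old, r)
pi /= pi.sum(axis=1, keepdims=True)            # normalize each row

w = np.full(n_states, 1.0 / n_states)          # fixed row vector of Eq. (4),
for _ in range(500):                           # found here by power iteration
    w = w @ pi
print("most probable CW state:", CW_MIN + int(np.argmax(w)))
```

The peak of the resulting vector w is what the paper compares against CW_opt = 12·R in Fig. 1.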
In Fig. 1 the fixed row vector is plotted versus the CW state for several values of R.

[Figure: fixed vector values (%) versus CW state for R = 6, 19, 32 and 50, with maxima near CW_opt = 72, 228 and 384.]

Fig. 1. Fixed probability values versus number of Markov chain state
The values calculated by (5) almost coincide with the extreme points in Fig. 1. The transition matrix for the BEB algorithm was filled according to the following sequence:

$$\text{for } CW_{old} \in [CW_{min}, CW_{max}]: \quad \begin{cases} CW_{new} = CW_{old} \gg 1: & \pi_{CW_{old},CW_{new}} = P_S \\ CW_{new} = CW_{old} \ll 1: & \pi_{CW_{old},CW_{new}} = P_{C\_CW_{old}} \end{cases} \tag{6}$$

CW_min and CW_max have the same values for all algorithms, as noted above. P_{C_CW_old} is the probability of collision in state CW_old, calculated as follows [9]:

$$P_{C\_CW_{old}} = 1 - \left(1 - \frac{1}{CW_{old}}\right)^{R-1} \tag{7}$$
The transition matrix for the polynomial algorithm was calculated by (6), but using the following formula for the new CW:

$$CW_{new} = (i+1)^{a} \cdot CW_{min} \tag{8}$$

where the minimal and maximal values of i are as follows [7]:
58
I. Tsokalo, Y. Yulia, and S. Mudriievskyi
$$i_{min} = 0, \qquad i_{max} = \left\lfloor \frac{1}{a} \ln\left(\frac{CW_{max}}{CW_{min}}\right) - 1 \right\rfloor, \qquad a = 1.5$$

In case of a collision, i is incremented; otherwise it is decremented.
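For comparison, the BEB and polynomial update rules and the collision probability of Eq. (7) can be expressed as small helper functions. This is a sketch: the clipping of i to the largest value that keeps CW below CW_max is our interpretation of the i_max definition.

```python
CW_MIN, CW_MAX = 15, 500
A = 1.5                                          # polynomial exponent a from Sec. 2
I_MAX = int((CW_MAX / CW_MIN) ** (1 / A)) - 1    # largest i with (i+1)^a * CW_min <= CW_max

def collision_prob(cw, r_nodes):
    """Eq. (7): collision probability in state CW with R active nodes."""
    return 1.0 - (1.0 - 1.0 / cw) ** (r_nodes - 1)

def beb_update(cw, collided):
    """BEB rule of Eq. (6): shift the window up on collision, down on success."""
    cw = cw << 1 if collided else cw >> 1
    return min(max(cw, CW_MIN), CW_MAX)

def pol_update(i, collided):
    """Polynomial rule of Eq. (8): CW = (i+1)^a * CW_min, i clipped to [0, I_MAX]."""
    i = min(max(i + (1 if collided else -1), 0), I_MAX)
    return i, min(int((i + 1) ** A * CW_MIN), CW_MAX)

print(beb_update(100, True), pol_update(0, True))   # -> 200 (1, 42)
```

With these parameters I_MAX evaluates to 9, so the polynomial window saturates at (9+1)^1.5 · 15 ≈ 474, just below CW_max.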
3 Static Mode Evaluation

To compare the performance of the considered algorithms, their models were investigated in two regimes: saturated throughput and dynamic traffic. In the first regime the average number of successful transmissions N_succ was calculated as the performance measure. It is averaged over the number of transitions n → ∞ and is given by:

$$N_{succ} = \sum_{i=CW_{min}}^{CW_{max}} w_i \cdot \left(1 - P_{C\_i}\right) \tag{9}$$

where w_i is the i-th value of the fixed vector and P_{C_i} is calculated by (7).
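Given a fixed vector w, Eq. (9) reduces to a single weighted sum. A minimal sketch (the helper name and array-based layout are our own):

```python
import numpy as np

def saturated_success_rate(w, r_nodes, cw_min=15):
    """Eq. (9): expected fraction of successful transmissions in saturation.

    w[i] is the fixed-vector probability of the state CW = cw_min + i;
    the per-state collision probability follows Eq. (7).
    """
    cw = cw_min + np.arange(len(w))
    p_col = 1.0 - (1.0 - 1.0 / cw) ** (r_nodes - 1)
    return float(np.sum(w * (1.0 - p_col)))
```

Feeding in the fixed vector computed for each algorithm yields the kind of comparison shown in Fig. 2.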
As can be observed, the proposed measure is unitless. Thus, in the saturated throughput mode no preliminary conditions are required apart from the dimension of the transition matrix, which was defined above. In Fig. 2 this number is plotted versus the real number of active nodes.
[Figure: successful transmissions (maximum 1) versus R for BEB, the polynomial algorithm and the YITRAN algorithm.]

Fig. 2. Average number of successful transmissions versus number of active nodes
It can be observed that the YITRAN algorithm considerably outperforms BEB and POL. Since the YITRAN algorithm uses the actual probability distribution of the number of active nodes, it adapts better to changes in this number and yields a lower average number of collisions.
4 Dynamic Mode Evaluation

The saturated throughput performance does not give full information about the algorithms' behavior under real traffic. Therefore, a dynamic mode evaluation was performed as the reaction of an algorithm to a traffic burst. The burst can be completely defined by the two parameters that define the traffic kind: the number of active nodes R that appear instantaneously (the average variation of the number of active nodes) and the number of packets K in their queues. In the initial state there is no traffic at all; in terms of the transition matrix π this corresponds to the state CW = CW_min. Then R active nodes appear instantaneously. In the beginning CW rises and then decreases due to the decrease of R. The process continues until the traffic is fully served. The time interval from the initial state until the traffic is fully served is measured as a qualitative dynamic characteristic (hereinafter referred to as the traffic burst width). The Markov chain model has to be updated in order to take the decrease of R into account. R can change in case of a successful transmission. Usually the MAC layer receives several packets to transmit at a time, so R is decremented after K successful transmissions. This process can also be characterized by a Markov chain, but each state is now defined by two variables: CW and R. The transition matrix now has order (CW_max − CW_min) · R_max; let us denote it as π^R. There is the following connection between the above-defined π and π^R:
$$\pi^R\big|_{R=const} = \pi$$

The matrix π for a certain R = const will be denoted as π_R. The transition matrix for one burst of traffic is:
$$\pi^R = \begin{pmatrix}
(1-p_s)\,\pi_{R_{max}} & p_s\,\pi_{R_{max}} & 0 & \cdots & 0 \\
0 & (1-p_s)\,\pi_{R_{max}-1} & p_s\,\pi_{R_{max}-1} & \cdots & 0 \\
0 & 0 & (1-p_s)\,\pi_{R_{max}-2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \pi_0
\end{pmatrix}$$
where R_max is the number of active nodes that appear instantly and p_s is the probability of successful transmission of all K packets after one attempt.
Using (7):
$$p_s = \frac{1}{K}\left(1 - \left(1 - \frac{1}{CW_{old}}\right)^{R-1}\right) \tag{10}$$

The observed time interval is calculated as the average number of transmission frames required to send all packets of all active nodes. Taking these states as the end point of the modeling, we regard all states of π_0 as absorbing (no active nodes remain) [8]. Software for these calculations would have to solve a system of linear equations of order (CW_max · (R_max − 1)); for the YITRAN algorithm this value is 4500. To simplify the calculations, instead of determining the whole width of the traffic burst, only the time interval needed for each decrement of R is determined. For this purpose π^R is transformed and R_max new matrices are created from it. Each of these new matrices corresponds to one decrement of R. They are typical absorbing matrices:
$$\pi^{R_i} = \begin{pmatrix} (1-p_s)\,\pi_{R_i} & p_s\,\pi_{R_i} \\ O & I \end{pmatrix} \tag{11}$$
where i ∈ [R_max, 1], O is the zero matrix and I is the identity matrix; O and I have dimensions equal to the number of absorbing states. To find the traffic burst width, the average time for each decrement of R is found. For this purpose the fundamental matrix N of the absorbing Markov chain π^{R_i} is computed as N = (I − Q)^{−1}. The item N_ij denotes the expected number of times the chain visits state j, starting from state i, before being absorbed (usually a non-integer value). The absorption time vector t (the time for one decrement of R) can be found as the sum of the rows of N. In order to find the average absorption time from the time vector, a vector of weighting coefficients is needed; the probabilities of being absorbed in each state play the role of these coefficients. According to Theorem 11.6 of [8], the absorption probabilities can be calculated as:

$$B = N \cdot H \tag{12}$$
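The fundamental-matrix bookkeeping around Eq. (12) can be illustrated on a toy absorbing chain. The 2x2 matrices Q and H below are made-up numbers for illustration only, not the paper's π_{R_i}:

```python
import numpy as np

# Toy absorbing chain: Q holds transitions among the two transient states,
# H the transitions from transient states to the single absorbing state.
Q = np.array([[0.5, 0.2],
              [0.1, 0.6]])
H = np.array([[0.3],
              [0.3]])

N = np.linalg.inv(np.eye(2) - Q)   # fundamental matrix N = (I - Q)^-1
t = N.sum(axis=1)                  # absorption-time vector: row sums of N
B = N @ H                          # Eq. (12): absorption probabilities

print(t)                           # expected steps to absorption per start state
print(B)                           # each row sums to 1 (one absorbing state here)
```

In this toy chain both transient states have a per-step absorption probability of 0.3, so both entries of t equal 1/0.3 = 10/3.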
where H = p_s · π_{R_i}. Each item B_ij denotes the probability of being absorbed in state j if i is the initial state.

In the dynamic modeling, the preliminary parameters are given by three variables: the starting number of active nodes, the number of sent packets, and the dimension of the transition matrix. At the beginning of the modeling we have R = R_max and the number of sent packets pack = 0. In this case we know that the initial state in the transition matrix is CW = CW_min, so the time of absorption is t = t_0. Hereinafter the average time of absorption t will be marked with two indices, R and pack; e.g. t|_{R=R_max, pack=0} = t^{00}. For the case R = R_max and pack = 1, the initial state is determined by a probability vector A, which is found from the matrix of probability vectors B^{00}. The first probability vector is taken as the first row of B^{00}, because for R = R_max and pack = 0 the process starts from the first state CW = CW_min. For the following cases, A is calculated using all values of B, because there is a non-zero probability that the next transitions start from any state. The probability vector is always calculated from the matrix B of the previous step, so A^{01} = B^{00}_0, where the lower index indicates the first row and the upper indices indicate the case R = R_max, pack = 0 for B and R = R_max, pack = 1 for A. The time of absorption is found as:

$$t = \sum_{i=0}^{CW_{max}-CW_{min}-1} A_i \cdot t_i \tag{13}$$
For the cases (R = R_i, R_min ≤ R_i < R_max, 0 ≤ pack ≤ pack_max) and (R = R_max, pack > 1), the initial state is determined by the probability vector A^{nm}. Each item of A^{nm} is calculated as:

$$A^{nm}_k = \frac{\displaystyle\sum_{i=0}^{CW_{max}-CW_{min}-1} B_{ik} \cdot A^{n[m-1]}_i}{\displaystyle\sum_{i=0}^{CW_{max}-CW_{min}-1} A^{nm}_i} \tag{14}$$

when the pack variable changes while R remains constant, and

$$A^{n0}_k = \frac{\displaystyle\sum_{i=0}^{CW_{max}-CW_{min}-1} B_{ik} \cdot A^{[n-1]\,pack_{max}}_i}{\displaystyle\sum_{i=0}^{CW_{max}-CW_{min}-1} A^{nm}_i} \tag{15}$$

when R changes too. R does not change until some node has sent all K packets. A^{n[m−1]} plays the role of a weighting vector to find A^{nm}. A^{nm} is normalized, so after performing (14) or (15), Σ_{i=0}^{CW_max−CW_min−1} A^{nm}_i = 1. The length of this vector is equal to k_max = CW_max − CW_min.
The time of absorption is calculated as follows:

$$t^{nm} = \sum_{i=0}^{CW_{max}-CW_{min}-1} A^{n[m-1]}_i \cdot t^{nm}_i \tag{16}$$
Then the width of the traffic burst is calculated as:

$$BW = t^{00} + t^{01} + \sum_{n=0}^{R_{max}-1} \sum_{m=2}^{pack_{max}} t^{nm} \tag{17}$$
For the BEB and polynomial backoff algorithms the same procedure for dynamic modeling was used. The traffic burst width versus K for several values of R for BEB and the polynomial algorithm, together with the values calculated above for the YITRAN algorithm, is plotted in Fig. 3.

[Figure: BW (number of transmission frames) versus K for R = 10, 8, 6, 4, 2 for the YITRAN, BEB and polynomial algorithms, with two marked special points.]

Fig. 3. Ratio of traffic burst width for BEB and polynomial to improved YITRAN algorithm versus K for several R
Special point 1 in Fig. 3 is the intersection of the YITRAN (R = 2) and polynomial (R = 2) characteristics and is a limit point. For K < 4 and R = 2 the polynomial algorithm outperforms YITRAN, i.e., for these combinations of K and R the polynomial algorithm shows better performance. For all other results shown in Fig. 3 the YITRAN algorithm is the most effective. Special point 2 in Fig. 3 is the limit point between BEB and the polynomial algorithm; however, BEB shows worse performance than the YITRAN algorithm for all K and R.
[Figure: R (number of active nodes) versus K (starting number of packets in the queue), showing the regions where the YITRAN and polynomial algorithms are preferable.]

Fig. 4. Preferable backoff algorithms sections of K and R values
Fig. 4 plots the line of special points between the polynomial and YITRAN algorithms, obtained by performing the modeling for all discrete values of R and K. The area to the left of the line is the area of R and K combinations where the polynomial algorithm is preferable. BEB is not competitive with the others and has no area of use. With the developed software, other backoff algorithms can be investigated and their areas of effective application can be determined.
5 Conclusions

A modified and improved backoff algorithm which can adapt to the kind of network traffic is proposed. It was assumed that any traffic kind can be characterized by two parameters: the average variation of the number of active nodes and the number of packets that these nodes initially have to send. Markov chain models were created for all considered backoff algorithms (BEB, polynomial and YITRAN's). Software was developed to compute the performance characteristics of the algorithms for static and dynamic traffic modes. In the static mode, the YITRAN algorithm substantially outperformed BEB and the polynomial one. In the dynamic traffic modeling, a map was created that shows where it is preferable to use one or another algorithm. The map is defined by the traffic parameters discussed above. These parameters can be tracked during network operation, and a backoff algorithm can be chosen accordingly. If the traffic kind changes for a considerable amount of time, it is possible to switch between the algorithms to achieve better performance. The minimal switching time is defined by the features of the MAC protocol.
BEB found no place on this map, as the computation proved it to be not competitive with the others. The polynomial algorithm outperforms YITRAN only in a small area. Therefore, among the considered algorithms, YITRAN is the dominant one. The developed software can be used to add new algorithms to the map; in this way all existing backoff algorithms can be compared. Until now, only error-based algorithms have been modeled with Markov chains. The reason for this is that traffic-based algorithms do not allow the use of closed-form mathematical expressions for the Markov chain representation; they can be interpreted by a Markov transition matrix only.
References
1. Tanenbaum, A.S.: Computer Networks, 4th edn., p. 891. Pearson Education International, Upper Saddle River (2003)
2. Raphaeli, D., Grauer, A.: A New Power-Line-Communications Modem based on a Novel Modulation Technique. Tel Aviv University, Israel, Itran Communications
3. Lee, J.-h., Park, J.-h., Lee, H.-S., Lee, G.-W., Kim, S.-c.: Measurement, modeling and simulation of power line channel for indoor high-speed data communications. School of Electrical and Computer Engineering, Seoul National University
4. Gazit, R., Avidan, A.: An Innovative Adaptive Channel Access Mechanism. In: ISPLC (2004)
5. Carron, G., Ness, R., Deneire, L., Van der Perre, L., Engels, M.: Comparison of two modulation techniques using frequency domain processing for in-house networks. Interuniversity Micro Electronics Center (IMEC), Kapeldreef 75, 3001 Heverlee, Belgium
6. Lee, K.-R., Lee, J.-M., Kwon, W.-H., Ko, B.-S., Kim, Y.-M.: Performance Evaluation of CSMA/CA MAC Protocol in Low-speed PLC Environments. School of Electrical Engineering and Computer Science, Seoul National University
7. Xu, D., Sakurai, T.: An Analysis of Different Backoff Functions for an IEEE 802.11 WLAN. Department of Electrical & Electronic Engineering, The University of Melbourne
8. Grinstead, C.M., Snell, J.L.: Introduction to Probability. The CHANCE Project. American Mathematical Society, Providence (July 4, 2006)
9. Vu, H.L., Sakurai, T.: Collision Probability in Saturated IEEE 802.11 Networks. Centre for Advanced Internet Architectures (CAIA), ARC Special Research Centre for Ultra-Broadband Information Networks (CUBIN)
10. Microgrid, http://en.wikipedia.org/wiki/Distributed_generation#Microgrid
New IEEE 802.16-2009 Compliant Traffic Shaping Algorithms for WiMAX Networks

Volker Richter and Stefan Türk

Dresden University of Technology, Mommsenstrasse 13, 01062 Dresden
{richter,tuerk}@ifn.et.tu-dresden.de
www.ifn.et.tu-dresden.de/tk
Abstract. WiMAX, based on the IEEE 802.16 standard, allows the deployment of cellular mobile broadband access networks delivering all services through the Internet Protocol. An essential feature of such networks is Quality of Service (QoS) support, including traffic shaping. Based on a detailed analysis of the IEEE 802.16-2009 QoS concept, we propose two traffic shaping algorithms. The first one conforms to the standard at the expense of high memory consumption. The second, approximate algorithm has been developed for memory-critical systems. We present parameter studies obtained with a detailed ns-2 WiMAX simulation environment. The simulation results show that both algorithms can be used as a basis for QoS-aware WiMAX scheduling algorithms. Keywords: WiMAX; IEEE 802.16; QoS; Traffic Shaping; Traffic Policing.
1 Introduction

The IEEE 802.16 standard family guarantees the compatibility of WiMAX systems from different vendors. WiMAX is one of the candidates for 4th generation mobile networks, fulfilling the ITU-R IMT-Advanced criteria [2]. The standard IEEE 802.16-2009 [1] specifies three physical layers, namely WirelessMAN-SC, WirelessMAN-OFDM and WirelessMAN-OFDMA, supporting microwave radio relay connections through single carrier (SC) transmissions, cellular networks for fixed subscribers, and, with WirelessMAN-OFDMA, fully mobile cellular networks. On top of all these physical layers a mostly common Medium Access Control (MAC) layer has been specified. An integral feature of this MAC layer is Quality of Service (QoS) support. For this purpose, the standard defines five service classes adjusted to application requirements and specific traffic characteristics. The detailed requirements and constraints of a connection are signaled through QoS parameter sets during the connection establishment procedure. Assertion and observance of the accepted QoS parameters are tasks of the scheduling and traffic shaping algorithms, which have been left unspecified by the standard for vendor differentiation. R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 65–76, 2011. © Springer-Verlag Berlin Heidelberg 2011
In recent years many scheduling algorithms have been proposed [6, 3], but only few authors have taken the QoS parameters specified in the IEEE 802.16-2009 standard into consideration. Much less work has been done on traffic shaping algorithms. The authors of [5] propose a traffic shaping algorithm based on token buckets, disregarding the standardized integration interval of the data rates; token bucket approaches do not comply with the IEEE 802.16-2009 design. Therefore, the aim of this paper is to propose a traffic shaping algorithm that accurately fulfills the IEEE 802.16-2009 specification. Furthermore, a second, approximate algorithm is presented to overcome the drawback of the accurate algorithm in terms of memory consumption. In the first section, the IEEE 802.16-2009 QoS concept is introduced based on our detailed study of the standard document. Subsequently, we describe our proposed algorithms in detail. These algorithms have been implemented in the WiMAX model for the network simulator ns-2 provided by the Application Working Group of the WiMAX Forum. Our simulation results are presented in section 4. Finally, we conclude our paper and give an outlook on future work.
2 The IEEE 802.16-2009 QoS Concept

The IEEE 802.16-2009 standard specifies five service classes to fulfill various application requirements and to make use of their specific traffic characteristics. For the Downlink (DL) direction the service classes are denoted as data delivery service types (cp. [1], p. 422), and for the Uplink (UL) direction they are specified as scheduling services (cp. [1], p. 291). The service class Unsolicited Grant Service (UGS) has been designed for real-time applications generating traffic with fixed packet sizes and constant inter-arrival times. Thus, it is especially suitable for Voice over Internet Protocol (VoIP) without silence suppression. To satisfy the needs of important real-time applications, such as VoIP with silence suppression, the service class Extended Real-Time Variable Rate (ERT-VR) has been created; it is denoted as Extended Real-Time Polling Service (ertPS) for UL transmissions. This kind of application offers packets with constant sizes and inter-arrival times within certain time periods; in contrast to UGS, these parameters change during the connection time. Common video codecs produce data streams with variable packet sizes and inter-arrival times. The real-time service class Real-Time Variable Rate (RT-VR) for DL, respectively Real-Time Polling Service (rtPS) for UL transmissions, is suitable for such video streams. To support delay-tolerant applications which need minimum data rates for timeout avoidance, the service class Non Real-Time Variable Rate (NRT-VR) or Non Real-Time Polling Service (nrtPS) has been designed. No QoS guarantees are given to applications using the Best Effort (BE) service class; for example, web traffic is delivered by this service class. Each service class is characterized by a set of mandatory and optional QoS parameters. To police incoming data streams, as required by the IEEE 802.16-2009
standard, the parameters Maximum Sustained Traffic Rate (MSTR), Minimum Reserved Traffic Rate (MRTR), Time Base (TB) and Maximum Traffic Burst size (MTB) have to be taken into consideration. The MSTR defines the average maximum data rate which can be allocated for the corresponding connection. It is measured in bit per second, where MAC and physical overhead are excluded (cp. [1], p. 1285). This parameter allows the data rate of a connection to be restricted according to the bandwidth contracted with the user. It is therefore important for network operators in order to offer contracts with different maximum bandwidths and to limit their costs for backbone traffic. As shown in the left part of figure 1, it specifies the upper data rate bound. Even in an idle WiMAX cell, a connection shall not receive more physical allocation than is necessary to fulfill the MSTR. Because of its definition as an average over time, the MSTR can be temporarily exceeded. This is allowed when the previous data rate of the offered traffic did not reach the MSTR; this case is depicted in figure 1 before the second traffic maximum in the idle example. In contrast to the MSTR, the MRTR describes the average
[Figure: data rate over time in an idle and in an overloaded cell, relative to the MSTR and MRTR bounds.]

Fig. 1. System behaviour in idle and overload state
minimum data rate which must be allocated for connections if data is available in their queues. It is also expressed in bit per second without MAC and physical overhead (cp. [1], p. 1285). This parameter defines a guaranteed rate for a connection, which is important for the functionality of the corresponding application. Even in the case of a fully loaded system, all MRTRs have to be reached, as illustrated in the right part of figure 1. As stated before, both parameters are average values over time. Therefore, an integration interval has to be specified, which is done in the IEEE 802.16-2009 standard by defining the QoS parameter Time Base (TB), given in ms (cp. [1], p. 1306). Furthermore, the optional QoS parameter Maximum Traffic Burst size (MTB) may have an influence on the traffic shaping. The MTB defines the maximum data volume which shall be allocated in one frame (cp. [1], p. 1285). With the help of this parameter a more regular data transmission can be achieved. Table 1 summarizes all service classes including their mandatory (X) and optional (O) QoS parameters. Because of its constant data rate, the MSTR parameter is not intended for UGS. In contrast, BE does not support the MRTR parameter, as guaranteed rates are not provided.

Table 1. Service classes and their mandatory (X) and optional (O) QoS parameters

  Service classes DL   UGS   ERT-VR   RT-VR   NRT-VR   BE
  Service classes UL   UGS   ertPS    rtPS    nrtPS    BE
  MSTR                 -     X        X       X        X
  MRTR                 X     X        X       X        -
  TB                   X     X        X       X        X
  MTB                  O     O        O       O        O

Our two traffic shaping algorithms are based on the described QoS parameters and will be introduced in the following section.
3 WiMAX Traffic Shaping Algorithm

In the following, we first introduce our exact traffic shaping approach; afterwards, our approximate algorithm is described. Both algorithms calculate the maximum admissible data rate based on the MSTR parameter as well as the minimum guaranteed data rate based on the MRTR parameter for each connection. In the description of the algorithms we refer to the MSTR calculation.

3.1 Accurate Algorithm
In order to fulfill the average maximum data rate R_MS defined by the MSTR QoS parameter, the data volume S must be transmitted through the WiMAX system within the duration of one Time Base, denoted as T_B:

$$S = R_{MS} \cdot T_B \tag{1}$$

Therefore, transmission opportunities have to be distributed over a number of n frames:

$$n = \frac{T_B}{T_F} \tag{2}$$

where T_F denotes the frame duration in ms. To determine the maximum data volume S_k which may be allocated by the scheduler in the k-th frame, all previous allocations S_i within the current T_B period have to be subtracted. Hence, all allocations made in frames k − n to k − 1 have to be considered. This can be expressed by the following equation:

$$S_k = R_{MS} \cdot T_B - \sum_{i=k-n}^{k-1} S_i \tag{3}$$

S_k is the maximum data volume allowed to be scheduled. To enable the calculation of the sum in equation (3), the previous allocation sizes of n − 1 frames have to be kept in memory.
New IEEE 802.16-2009 Compliant Traffic Shaping Algorithms
As mentioned in Section 2, the IEEE 802.16-2009 standard requires that only resources for available data shall be assigned. Therefore, a minimum function between S_k and the current queue size S_Q of a connection is introduced. The resulting data volume allowed to be scheduled is denoted as S'_k:

S'_k = min( S_k , S_Q )    (4)
The optional QoS parameter MTB is denoted as S_MTB and avoids large allocation sizes after periods in which the offered traffic of a connection dropped below the MSTR or the MRTR, respectively. Figure 2 depicts an example where the TB duration is four times longer than the frame duration.

Fig. 2. Example of the Accurate Traffic Shaping Algorithm showing the influence of the MTB parameter

Immediately after an idle period, during which no packets have been transmitted, the queue is flooded. At this moment, the entire data volume of one TB interval is admitted. In the example of Fig. 2, the transmitted traffic equals four times the average volume. Therefore, the average data volume of four frames is transmitted in the first frame k = 1. In the next call of the algorithm, for the following frame k = 2, the possible data volume has already been used up. Therefore, no further data can be sent until the first transmission entry of frame k = 1 becomes older than one TB duration (= 4 frames) and is discarded. Due to that effect, no further data is transmitted in the frames k = 2, 3, 4. Afterwards, this procedure repeats, beginning with frame k = 5, and results in an uneven distribution of data transmissions. With the help of the QoS parameter MTB, the maximum transmitted amount of data in one frame is limited to S_MTB, assuring a more homogeneous transmission, as illustrated in Fig. 2. Compliance with the MTB QoS parameter can also be realized with a minimum function, where the maximum possible allocation size for scheduling is defined as S''_k:

S''_k = min( S'_k , S_MTB )    (5)

The MTB value has to be greater than or equal to the average allocation size in order to be able to assign enough transmission opportunities to fulfill
the MSTR QoS parameter. Otherwise, this parameter is violated and the agreed QoS parameters for the corresponding connection cannot be achieved.

S_MTB ≥ R_MS · T_F    (6)
The scheduling algorithm may allocate less data than S''_k, according to the current system load and latency constraints of other connections. After segmentation and/or concatenation of incoming packets into MAC Protocol Data Units (PDUs) and their placement into the frame structure, the effectively transmitted payload volume S*_k is saved to calculate the data volume for the following frames.

0 ≤ S*_k ≤ S''_k    (7)

To avoid oscillations after the establishment of a connection, as shown in Fig. 2, all previous allocation sizes are set to the average allocation size:

S*_k = R_MS · T_F    for k ≤ 0    (8)
In conclusion, the accurate traffic shaping can be described with the following formula:

S*_k ≤ S''_k = min( S_MTB , S_Q , R_MS · T_B − Σ_{i=k−n}^{k−1} S*_i )    (9)
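The accurate shaper reduces to a small amount of per-connection state. The sketch below is an illustrative Python model (the class name and interface are our own, not from the paper), assuming the Time Base is an integer multiple of the frame duration and that both are given in ms:

```python
from collections import deque


class AccurateShaper:
    """Illustrative model of the accurate traffic shaper (Eq. 9).

    r_ms   -- MSTR in bit/s
    t_b_ms -- Time Base in ms (assumed an integer multiple of t_f_ms)
    t_f_ms -- frame duration in ms
    s_mtb  -- optional Maximum Traffic Burst in bit (infinity = disabled)
    """

    def __init__(self, r_ms, t_b_ms, t_f_ms, s_mtb=float("inf")):
        self.budget = r_ms * t_b_ms / 1000.0   # S = R_MS * T_B (Eq. 1)
        self.s_mtb = s_mtb
        self.n = t_b_ms // t_f_ms              # frames per Time Base (Eq. 2)
        avg = r_ms * t_f_ms / 1000.0           # average allocation per frame
        # Eq. 8: pre-fill the n - 1 stored allocation sizes with the average
        # allocation size to avoid oscillations right after connection setup.
        self.history = deque([avg] * (self.n - 1), maxlen=self.n - 1)

    def max_allocation(self, queue_size):
        # Eq. 3: budget left in the current Time Base window, then the
        # minima of Eq. 4 (queue size) and Eq. 5 (Maximum Traffic Burst).
        s_k = self.budget - sum(self.history)
        return max(0.0, min(s_k, queue_size, self.s_mtb))

    def feedback(self, transmitted):
        # Eq. 7: the scheduler reports the effectively transmitted volume;
        # appending drops the entry that is now older than one Time Base.
        self.history.append(transmitted)
```

With the parameters used later in Section 4 (MSTR 14 Mbit/s, frame duration 5 ms), the steady-state grant per frame is 70 kbit, i.e. 8750 bytes.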
This proposal accurately fulfills the definitions of the IEEE 802.16-2009 standard. It requires memory to save n − 1 allocation sizes. Therefore, the memory usage of the proposed exact traffic shaping algorithm scales linearly with the Time Base duration T_B. In contrast, the computational complexity remains constant, as the previous sum value is also saved and the next value used in Eq. 3 can be obtained as given in Eq. 10:

Σ_{i=k−n+1}^{k} S*_i = Σ_{i=k−n}^{k−1} S*_i + S*_k − S*_{k−n}    (10)

3.2 Approximate Algorithm
For WiMAX systems where memory usage is crucial, we developed an approximate traffic shaping approach based on [4]. To achieve a constant memory usage, the average effective data rate of the previous frames R_{k−1} is saved instead of all allocation sizes S*_i within one TB duration. Therefore, only memory for one value is needed. This algorithm is subdivided into two steps. Firstly, we determine the maximum allocation size S_k for the current frame. Secondly, we use the amount of effectively transmitted data S*_k to obtain the average rate R_k for the next iteration. The maximum allocation size S_k for the current frame k results from the difference between the data volume fulfilling the MSTR R_MS within the TB duration T_B and the data volume allocated in the n − 1 previous frames, calculated with the help of the previous average data rate R_{k−1}:

S_k = R_MS · T_B − R_{k−1} · (T_B − T_F)    (11)
In accordance with the exact algorithm, the constraints of the available data size in the queue S_Q as well as the optional MTB size can be realized with a minimum function, obtaining S''_k:

S''_k = min( S_MTB , S_Q , R_MS · T_B − R_{k−1} · (T_B − T_F) )    (12)

To avoid start oscillations, as described for the accurate traffic shaping algorithm, R_0 is set to the value of the MSTR parameter R_MS for the first frame k = 1:

R_k = R_MS    for k ≤ 0    (13)
After the scheduling and frame building process, the transmitted payload volume S*_k is fed back to the algorithm. We adopted the time sliding window approach from [4] to calculate the future average rate R_k:

R_k = ( R_{k−1} · T_B + S*_k ) / ( T_B + T_F )    (14)
Due to the time sliding window, the memory duration also depends on the scheduled data volumes and not only on the TB QoS parameter. Therefore, this approach cannot be considered strictly standard-compliant, but it is more efficient in terms of memory consumption.
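The approximate variant needs only the single averaged rate as state. Again an illustrative Python sketch (class name and interface are our own), with times given in ms:

```python
class ApproxShaper:
    """Illustrative model of the approximate traffic shaper (Eqs. 11-14)."""

    def __init__(self, r_ms, t_b_ms, t_f_ms, s_mtb=float("inf")):
        self.t_b = t_b_ms / 1000.0   # Time Base in s
        self.t_f = t_f_ms / 1000.0   # frame duration in s
        self.r_ms = r_ms             # MSTR in bit/s
        self.s_mtb = s_mtb           # optional MTB in bit
        self.rate = float(r_ms)      # Eq. 13: R_0 = R_MS, no start oscillation

    def max_allocation(self, queue_size):
        # Eq. 11: budget from the averaged rate; Eq. 12: queue and MTB minima.
        s_k = self.r_ms * self.t_b - self.rate * (self.t_b - self.t_f)
        return max(0.0, min(s_k, queue_size, self.s_mtb))

    def feedback(self, transmitted):
        # Eq. 14: time sliding window update of the average effective rate.
        self.rate = (self.rate * self.t_b + transmitted) / (self.t_b + self.t_f)
```

Transmitting exactly the average volume R_MS · T_F each frame leaves the rate estimate at R_MS; after an idle phase the estimate decays geometrically by the factor T_B / (T_B + T_F) per frame, which is the source of the overshoot discussed with Fig. 5.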
4 Simulation Results
Both algorithms have been implemented in a simulation environment consisting of the network simulator ns-2 and the corresponding add-on for WiMAX modelling [8] developed by the Application Working Group of the WiMAX Forum. They are applied to the MSTR and MRTR of each connection, calculating the maximum allowed allocation sizes and the minimum necessary data volumes to meet the application requirements. Our simulation environment includes a two-phase scheduling algorithm. In the first phase, it allocates only the data volumes needed to achieve the guaranteed rates. Afterwards, MSTR-based requests are handled. To study the proposed algorithms, one subscriber station has been placed next to the base station, so that the highest possible burst profile, 64-QAM 3/4, is used. One DL connection applying the RT-VR service class has been configured. The MSTR and MRTR were set to 14 Mbit/s and 10 Mbit/s, respectively, in order to stay below the system capacity. In the first presented simulation results, the MTB size was set to infinity. The frame duration of 5 ms and the other, non-decisive parameters have been chosen according to [7]. The offered traffic was simulated by the ns-2 constant bit rate source model generating packets with a size of 576 bytes.
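A constant bit rate source is parameterized by a packet interval rather than a rate; for reference, the interval corresponding to the 576-byte packets can be computed as follows (plain Python; the offered rate is chosen here, as an example, equal to the MSTR):

```python
packet_size_bits = 576 * 8           # 576-byte packets from the scenario
rate_bps = 14_000_000                # example offered load equal to the MSTR
interval_us = packet_size_bits / rate_bps * 1e6
print(round(interval_us, 1))         # packet interval in microseconds -> 329.1
```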
Algorithm Comparison

Figure 3 shows the achieved goodput over time without any activated traffic shaping algorithm. The goodput therefore corresponds to the offered traffic.

Fig. 3. Goodput over Time without Traffic Shaping

In the six test cases shown, the offered traffic alternates through all possible relations to the MSTR and MRTR boundaries. Thereby, it is possible to study the behaviour of the algorithms in typical situations. In Fig. 4, the accurate traffic shaping algorithm was activated and the TB length was set to 20 ms, equivalent to 4 frame durations. The comparison between the two figures shows that the achieved goodput is limited to the MSTR boundary. Due to the short TB duration, unused data volumes from the idle periods are not used to temporarily exceed the MSTR. The MRTR has no influence on the graph, as the system is not overloaded. The simulation result obtained with the approximate approach is shown in Fig. 5. In periods with constant offered traffic, the MSTR constraint is also kept by the approximate algorithm. But especially after 40 s of simulation time, in test case three, the disadvantage of this algorithm can be observed: If the offered traffic changes heavily, the average data rate calculated by the sliding window method differs considerably from the real one, leading to an overshoot of the achieved goodput.
Fig. 4. Accurate Traffic Shaping Algorithm
Fig. 5. Approximate Traffic Shaping Algorithm
Parameter Study

After the comparison of the algorithms, we analyze the influence of the TB period and the MTB size QoS parameters on the accurate traffic shaping approach. For this purpose, the TB parameter was set to an extreme value of 1 s. Hence, the goodput may exceed the MSTR within one TB duration. Figure 6 shows the goodput over time in the scenario introduced above with a TB interval of 1 s.

Fig. 6. Accurate Traffic Shaping Algorithm - Time Base 1 s

The simulation result shows that the goodput is periodically higher than the MSTR in test cases three, five and six, where the offered traffic exceeds this boundary. Between 40 s and 50 s, in case three, the goodput reaches the system capacity at approximately 17 Mbit/s, followed by a sharp decline to 0 Mbit/s. This behavior is repeated 10 times in this 10 s long test case. During the idle period, no data has been allocated for this connection. When the packet queue is filled after 40 s, the whole queue size may be allocated in the first frame, limited only by the system capacity, as described in Section 3.1. Therefore, the following frames are also fully utilized. This leads to the allocation of the entire data volume of one TB within the first frames. When this limit is reached, no further data volumes are admitted until the first saved allocation entry greater than zero is older than one TB. This effect results in the repeated oscillations visible in Fig. 6. The IEEE 802.16-2009 specification permits such an allocation scheme. However, it should be mentioned that these irregular transmissions
can increase the average delay. Therefore, the TB value should not be greater than the QoS parameter maximum tolerable latency defined in IEEE 802.16-2009. To achieve a more regular transmission, the MTB QoS parameter can be applied. In Fig. 7, the MTB has been set to 8750 bytes, corresponding to the average data volume required to fulfill the MSTR.

Fig. 7. Accurate Traffic Shaping Algorithm - Maximum Traffic Burst 8750 Byte

Due to this additional limitation of the allocated data volumes, as described in Eq. 5, the goodput is always equal to or below the MSTR. The allocations are equally distributed between all frames, leading to a simulation result similar to that shown in Fig. 4.
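The MTB of 8750 bytes is exactly the lower bound of Eq. 6 for the configured MSTR and frame duration; a quick arithmetic check in Python:

```python
r_ms = 14_000_000                   # MSTR in bit/s
t_f_ms = 5                          # frame duration in ms
s_mtb_bits = r_ms * t_f_ms // 1000  # Eq. 6 lower bound: R_MS * T_F
print(s_mtb_bits, s_mtb_bits // 8)  # -> 70000 bit = 8750 bytes per frame
```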
5 Conclusion and Future Work
In this paper, we have presented two approaches for traffic shaping in WiMAX networks as specified by the IEEE 802.16-2009 standard. The first proposed, accurate traffic shaping algorithm fulfills the specification in an exact manner, but its memory requirement scales linearly with the duration of the Time Base Quality of Service parameter. The second, approximate proposal has a constant memory usage, but its memory duration additionally depends on the scheduled data. The simulation results show that both algorithms enforce the Maximum Sustained Traffic Rate boundary. The accurate algorithm shows a better behavior in cases of heavy changes of the offered traffic. Based on both
algorithms, IEEE 802.16-2009 compatible schedulers can be developed which satisfy the QoS requirements of all types of applications. Future research includes detailed investigations of the behavior of our algorithms in complex traffic scenarios and improvements of our scheduling algorithm to support all QoS parameters. Furthermore, an improvement of the accurate algorithm is in development to reduce the oscillations of the transmission after idle periods in an adaptable manner. We are going to extend our simulation environment with these algorithms to enable realistic WiMAX investigations.

Acknowledgments. We would like to acknowledge our former student Mr. Tung Nguyen Khac for his contribution to the approximate traffic shaping algorithm. We would like to take this opportunity to thank our project partner Detecon Consulting for sponsoring this work. Special thanks go to Dr. Petry and Dr. Knospe from Detecon Bonn for their great support.
References

1. IEEE Standard for Local and Metropolitan Area Networks Part 16: Air Interface for Broadband Wireless Access Systems. IEEE Std 802.16-2009 (Revision of IEEE Std 802.16-2004) (2009)
2. ITU-R Document 5D/558: Endorsement of candidate IMT-Advanced RIT based on IEEE 802.16. WP 5D Meeting (October 2009)
3. Ali-Yahiya, T., Beylot, A.L., Pujolle, G.: Radio resource allocation in mobile WiMAX networks using service flows. In: IEEE 18th International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2007 (September 2007)
4. Fang, W., Seddigh, N., Nandy, B.: A Time Sliding Window Three Colour Marker (TSWTCM). RFC 2859 (2000)
5. Ghazal, S., Ben-Othman, J.: Traffic policing based on token bucket mechanism for WiMAX networks. In: IEEE International Conference on Communications, ICC 2010 (May 2010)
6. Wang, W., Sharif, H., Hempel, M., Zhou, T., Wysocki, B., Wysocki, T.: Implementation and performance evaluation of QoS scheduling algorithms in mobile WiMAX ns-2 simulator. In: 4th International Conference on Signal Processing and Communication Systems, ICSPCS 2010 (December 2010)
7. WiMAX Forum: WiMAX System Evaluation Methodology, Version 2.1 (July 2008)
8. WiMAX Forum - Application Working Group: ns-2 MAC + PHY Add-on for WiMAX (IEEE 802.16) (May 2009)
Multiple-Layer Network Planning with Scenario-Based Traffic Forecast

Shu Zhang¹ and Ulrich Killat²

¹ The Quality Group it vision GmbH
² Hamburg University of Technology
[email protected], [email protected]
Abstract. One of the major tendencies in network planning is to take the interactions of multiple network layers into consideration, so that solutions much better than those restricted to single layers can be achieved. This also applies when planning the extension of the network infrastructure to meet the growth of traffic in the future, in which minimizing the capital expenditure for new devices and the corresponding energy consumption is among the greatest concerns of network operators. However, the planning relies on the forecast of future traffic demands, where some degree of uncertainty must be accounted for. A special forecast mechanism currently used by some network operators is to make a series of forecasts, each being a scenario which describes a possible future situation. In this paper, we will introduce this mechanism, analyze its differences to other traffic uncertainty models, and suggest corresponding planning methods based on Integer Linear Programming (ILP) models. The performance is shown in tests at the scale of real problems.

Keywords: Network Planning, Multi-Layer, Traffic Uncertainty.
1 Introduction
As a basic principle, telecommunication networks are multi-layer architectures. Much research has shown that the interrelations of multiple network layers can play an important role in network planning, and that planning restricted to single layers is generally sub-optimal and may even fail to meet some critical requests. In this paper, we will present our multi-layer network planning method, developed to set a sensible trade-off between rigid multi-layer optimization and algorithmic complexity. Without loss of generality, we take the following abstract model as reference (see the layers in Fig. 1): At the bottom of the network is the physical layer, which includes network elements (nodes) and the physical links (cables/fibers) connecting them; a logical link layer with a quite different topology is established above it; on top, there are a number of end-to-end traffic flows according to the user demands, and they should be routed over the logical layer. Besides, special care must also be taken for the paths of traffic demands in the physical layer, so that resilience against single-component failures (a physical link or a node) in the network can be achieved when required.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 77–88, 2011.
© Springer-Verlag Berlin Heidelberg 2011
In the short term, the main task of the planner is to assign the paths and resource allocations for emerging new demands, as well as to create new logical links using spare resources in the physical layer when necessary. This is called Operative Planning. In the medium or long term, because the current network is exposed to rapidly increasing services and bandwidth demands, the extension of the network infrastructure must also be taken into consideration. The corresponding task is referred to as Tactical Planning. The concepts of the planning task and our reference model of the tactical planning problem are shown in Fig. 1.
Fig. 1. The reference model of multi-layer tactical planning
For network operators, tactical planning is of great importance because it involves a large amount of investment (CAPEX) when the decision to install new physical infrastructure is made, as well as the corresponding operational costs (OPEX, e.g. maintenance effort, the energy consumption of the installed new devices, etc.). The planning decisions are made according to the forecasted traffic matrix. However, because making a precise forecast is practically impossible, some degree of uncertainty must be included in the planning model. Much research has been done in this area, mainly focusing on two different models: (1) the hose model [1][2][3], which only requires the in/out bound of the traffic across the network, but yields quite loose results; (2) the polyhedral model, where multiple traffic matrices are considered [4][5]. However, a special traffic uncertainty mechanism, which is currently used by some network operators in Europe, differs from both of the above. With this mechanism, a series of traffic matrix forecasts is made, each referred to as a scenario. Unlike the polyhedral model, these matrices follow a sequential order, and each scenario is a subset of the following one. Typically three scenarios are made, which represent the minimum, the average, and the maximum estimation of the future traffic matrix. In the considered time range, the demands are expected to reach the minimum scenario first, and then continue moving to the following scenarios in order. However, it is not certain whether the average
and the maximum scenario will be reached in reality. A suitable plan must be optimized for all scenarios. Note that the solutions for the demands common to all scenarios must not differ too much, in order to minimize reconfigurations when moving from one scenario to the next. With these settings, the network planning task includes a suggestion for the network infrastructure extension as well as the routing and resource allocation for each demand in all scenarios. The objective of the planning is to minimize the overall cost (CAPEX and OPEX) in each scenario, while the cost of reconfiguration between neighboring scenarios must also be as low as possible. These features make our planning problem fundamentally different from that of the standard polyhedral models. To the best of our knowledge, no research on this kind of problem setting has been done so far. In this paper, we will introduce our planning method for the above problem setting. The method is based on an Integer Linear Programming (ILP) model; a compromise variant to cope with scalability issues will also be discussed. The rest of the paper is organized as follows: In Section 2, the problem setting is defined in detail. Then our ILP formulation and the two-phase optimization method are introduced in Section 3. Section 4 shows the numerical results from some test cases. A conclusion is given in the last section.
2 Definition of the Planning Model

2.1 Network and Traffic Modelling
According to the concept of Fig. 1, we give the following definitions. A logical link l_log is routed over one or multiple physical links, occupying resource C_log at each hop. This capacity can be used to support end-to-end traffic demands. Generally it is enough to set up one path for a demand. For some critical traffic with resilience requests, 1 + 1 protection must be established according to some policy of the network operator, so that any single failure in the physical layer (e.g. device failure, cable cut) can be tolerated. In a telecom network, generally a number of logical links with different capacities have already been established, and their resources are already utilized to some extent. To route a new traffic demand, the spare capacity C(l_log) of a logical link can directly be used. Besides, since the current network configuration is known, the routing of every logical link in the physical layer is also available. Denote the set of physical links that the path of a logical link l_log traverses as S_phys(l_log), and the set of logical links which traverse the same physical link l_phys as S_log(l_phys). It is quite common that the raw resources of physical links have not been fully occupied by the logical links at the moment. Therefore a physical link l_phys may also have some spare resource C(l_phys). In our model, both physical and logical links are considered as abstract links l ∈ L, each with spare resource C(l). Therefore, we are planning the routings for the forecasted traffic demands D_i, i ∈ I, in a graph G(V, L), where V is the set of all nodes.
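The two sets S_phys(l_log) and S_log(l_phys) can be derived once from the known routing of the logical links before the model is built. The following Python sketch illustrates this with a hypothetical data layout (the link names and the dict-based routing table are our own assumptions):

```python
# Known routing of each logical link over physical links (illustrative data).
routing = {
    "log1": ["phys_a", "phys_b"],
    "log2": ["phys_b", "phys_c"],
}

def s_phys(l_log):
    """S_phys(l_log): the physical links traversed by a logical link."""
    return set(routing[l_log])

# S_log(l_phys): the logical links traversing a physical link, obtained by
# inverting the routing table once before building the planning model.
s_log = {}
for l_log, path in routing.items():
    for l_phys in path:
        s_log.setdefault(l_phys, set()).add(l_log)

print(sorted(s_log["phys_b"]))  # -> ['log1', 'log2']
```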
2.2 The Optimization Model for Network Planning
The Plan in Each Scenario. For each individual scenario, the plan includes the following suggestions to the planner:

1. The routing for each future demand, in which efficient resource utilization is preferred. In our model, the cost of resource utilization is expressed by a certain amount of OPEX.
2. The necessary hardware extensions, which claim a certain amount of CAPEX (purchasing and installation) as well as OPEX (maintenance, energy consumption, etc.).
3. Optionally, the modification of the logical links, which can help to reduce the cost of 1) and 2) but also requires some updating work and therefore a corresponding cost (considered as CAPEX in this paper).

Since the hardware extension involves the majority of the cost, a suitable plan for it is the central part of our model. Due to limitations in practice (device types, free interfaces, etc.), we cannot arbitrarily install any type of physical link between a pair of nodes in the network. A specific hardware extension must be selected from a set of choices, which is referred to as an extension pattern in this paper. The pattern should be predefined by the network operator (see for example Table 1).

Table 1. Example of an extension pattern

  Option  Type           Amount  Unit Cost
  1       1 Gbps cable   1       1 million
  2       1 Gbps ...     2       2 million
  3       10 Gbps ...    1       4 million
  ...     ...            ...     ...
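In code, an extension pattern is just a small option table. The sketch below uses our own data layout, mirroring the values of Table 1 (total added capacity in Gbps, cost in millions; the cable type for options 2 and 3 is an assumption), and maps an option index k to the extra capacity and its cost, with k = 0 standing for "no extension":

```python
# Extension pattern for one physical link (illustrative values from Table 1).
PATTERN = [
    {"type": "1 Gbps cable", "amount": 1, "capacity": 1, "cost": 1},
    {"type": "1 Gbps cable", "amount": 2, "capacity": 2, "cost": 2},
    {"type": "10 Gbps cable", "amount": 1, "capacity": 10, "cost": 4},
]

def extension(k):
    """Added capacity C_ext(l, k) and cost of option k; k = 0 means that
    no extension is installed (all t(s, l, k) = 0 in the ILP)."""
    if k == 0:
        return 0, 0
    option = PATTERN[k - 1]
    return option["capacity"], option["cost"]

print(extension(3))  # -> (10, 4)
```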
In our model, a hardware extension is considered as an extra resource to be installed at a physical link l_phys. Note that the link can be either an existing one or a planned link which is a candidate for the possible future network infrastructure. The optimization algorithm is responsible for making a choice from the options of the extension pattern at each link, according to the principle of ensuring enough resources to support the forecasted traffic matrix at minimum cost. We define a binary variable t(s, l, k), which takes the value 1 when the kth option is chosen at link l in scenario s, and 0 otherwise. Note that the decision can be 0 for all k, which means no extension at this link is needed.

The Reconfiguration Between Scenarios. If each scenario is planned independently, a waste of investment may occur. Assume that the following plan for hardware extension was made for scenario s1: t(s1, l1, 1) = 1 and t(s1, l, k) = 0, ∀l ≠ l1, ∀k. Besides, an independent plan was suggested for scenario s2: t(s2, l2, 1) = 1 and t(s2, l, k) = 0, ∀l ≠ l2, ∀k. Although it seems that the cost in each scenario is minimized, because there is only one hardware extension per scenario, the whole plan is still problematic: The extension at
l1 has been installed in the first scenario s1, but is no longer utilized in the next scenario s2, which means the investment in l1 is wasted. This is defined as a hard reconfiguration and should be avoided in the planning. Furthermore, in case each scenario is planned independently, the planned routing of demands and the modifications of logical links can be quite different. Although these different configurations require only some parameter modifications, it is still a tedious task when the amount of necessary reconfiguration is large, and the operation can be a significant source of errors. Therefore, the following differences between neighboring scenarios should also be minimized: 1) the routing of each demand; 2) the configuration of each logical link. Because the two operations are very similar in nature (parameter updates to be made by technicians), they are both referred to as soft reconfigurations.

2.3 The Optimization Model
Based on the above analysis, we suggest a 2-phase optimization model.

Phase 1. Each scenario is planned independently with the objective of minimizing the cost. The optimum solutions yield the minimum cost Cost_opt(s) of each scenario s ∈ S, which is used as a reference value in the next phase.

Phase 2. Now, the objective is to minimize the differences (i.e., in terms of hard and soft reconfigurations) between neighboring scenarios. We require that the overall cost in each individual scenario is not significantly larger than the optimum cost obtained in the first phase. Practically, a relaxation factor F (e.g. F = 10%) is defined so that the final cost Cost(s) in each scenario is bounded by:

Cost(s) ≤ Cost_opt(s) · (1 + F)    (1)

3 The ILP Formulation

3.1 Model of Phase 1
The Flow Continuity Constraint. For each demand, an end-to-end path should be established. We define a binary decision variable x(s, i, l, p), which takes the value 1 if the path p of demand i traverses link l in scenario s, and 0 otherwise. By the node-link formulation of Eq. 2, p end-to-end paths can be established (p ∈ P(i), where P(i) is the number of paths required by demand i for resilience):

Σ_{l(m,j)∈L} x(s, i, l, p) − Σ_{l'(j,n)∈L} x(s, i, l', p) = { 1 if j = v; −1 if j = u; 0 otherwise }    (2)
∀i(u, v) ∈ I(s), ∀j ∈ V, ∀p ∈ P(i)

Here, I(s) is the set of all demands in scenario s, and i(u, v) is the demand i with source node u and destination node v. l(m, n) denotes a directional link from node m to n. The equation means that for a node j, the incoming traffic must be equal to the outgoing traffic, while the traffic at the entrance and the exit is fixed to 1.
The Link-Disjointness Constraint. For a demand requiring multiple paths, the paths should be disjoint in the physical layer. However, we are making the plan in a multi-layer network, where both logical links and physical links are considered as abstract links in the set L. Therefore, the end-to-end paths obtained by Eq. 2 may contain a mix of both physical and logical links. In order to check the disjointness in the physical layer, we define a binary variable y(s, i, l, p) to track the exact path of demand i in the physical layer. It has the same meaning as x(s, i, l, p) but is defined only on physical links. There are two cases to be checked. The first case is when path p traverses a physical link. Eq. 3 ensures that y(s, i, l_phys, p) is equal to 1 if x(s, i, l_phys, p) takes the value 1 (in this case the path is routed over the physical link l_phys), and is unrestricted otherwise:

y(s, i, l_phys, p) ≥ x(s, i, l_phys, p), ∀l_phys ∈ L_phys, ∀i ∈ I(s), ∀p ∈ P(i)    (3)

The second case is when path p traverses a logical link. Now, as a tracking variable, y(s, i, l_phys, p) must be 1 if x(s, i, l_log, p) takes the value 1 and l_phys is a link over which the logical link l_log is routed, and is unrestricted otherwise. The relationship between logical links and the physical links they are routed on is recorded in the set S_phys(l_log) (see Section 2.1), which can be obtained by analyzing the routing before establishing the ILP model:

y(s, i, l_phys, p) ≥ x(s, i, l_log, p)    (4)
∀l_log ∈ L_log, ∀l_phys ∈ S_phys(l_log), ∀i ∈ I(s), ∀p ∈ P(i)

Since y(s, i, l, p) records the paths in the physical layer, it is straightforward to establish the link-disjointness constraint. Eq. 5 ensures that a physical link l cannot be traversed more than once by the paths p ∈ P(i) of demand i:

Σ_{p∈P(i)} y(s, i, l, p) ≤ 1, ∀l ∈ L_phys, ∀i ∈ I(s)    (5)
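The role of the tracking variable y can be illustrated outside the ILP: expand each end-to-end path, which may mix logical and physical links, into the physical links it touches, and then check the condition of Eq. 5. A hypothetical Python sketch (link names and data layout are our own):

```python
# S_phys(l_log) for the logical links (illustrative data).
S_PHYS = {"log1": ["p1", "p2"], "log2": ["p3", "p4"]}

def physical_links(path):
    """Expand a path mixing logical and physical links into the set of
    physical links it traverses -- the information carried by y(s, i, l, p)."""
    links = set()
    for l in path:
        links |= set(S_PHYS.get(l, [l]))  # a logical link expands to its route
    return links

def disjoint_in_physical_layer(paths):
    """Eq. 5: a physical link may appear in at most one path of a demand."""
    seen = set()
    for path in paths:
        phys = physical_links(path)
        if phys & seen:
            return False
        seen |= phys
    return True

# Two paths of one demand: via logical link log1, and via p3 and p4.
print(disjoint_in_physical_layer([["log1"], ["p3", "p4"]]))  # -> True
print(disjoint_in_physical_layer([["log1"], ["p2", "p5"]]))  # -> False
```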
The Resource Utilization Constraint. This is the basic rule that the total resources claimed by new demands on a link must not exceed the spare resources of the link. A logical link l has spare resource C(l), which can be modified if necessary. We use the variable δ(s, l) to indicate the amount of modification at link l in scenario s, so that the actual spare resource becomes C(l) − δ(s, l). Note that δ(s, l) can take positive as well as negative values, which implies a reduction or increment of spare resources. Let R(s, i) be the resource requirement of flow i in scenario s; the resource utilization constraint is then as follows:

Σ_{i∈I(s), p∈P(i)} x(s, i, l, p) · R(s, i) ≤ C(l) − δ(s, l), ∀l ∈ L_log    (6)

For a physical link, there are two ways in which its spare resource may be modified: 1) from the modification of logical links routed over it; 2) due to the installation of a new hardware extension. Let the kth choice of the extension pattern at link l provide an extra resource C_ext(l, k). Then, the constraint on physical links is:

Σ_{i∈I(s), p∈P(i)} x(s, i, l, p) · R(s, i) ≤ C(l) + Σ_{l_log∈S_log(l)} δ(s, l_log) + Σ_k t(s, l, k) · C_ext(l, k), ∀l ∈ L_phys    (7)
For the discussion of the set S_log(l), see Section 2.1. Eq. 6 and Eq. 7 imply that the modification of logical links is restricted to the adjustment of capacity. A more general model would also allow the rerouting of logical links. Although it is theoretically feasible to establish an equation set of flow continuity conditions for each logical link similar to Eq. 2, such a multi-layer model can result in an explosive enlargement of the problem scale, which makes even much smaller problems than our test cases (see Section 4) unsolvable in acceptable time. In this paper, we will focus our discussion on the model analyzed above. The variable δ(s, l) determines the amount of capacity change at a logical link. Since a modification to any value involves the same amount of reconfiguration work, a binary variable d(s, l) is used to track whether any modification has taken place. It should take the value 1 if δ(s, l) is nonzero, and 0 otherwise. Eq. 8 describes this relationship between d(s, l) and δ(s, l):

d(s, l) · W ≥ δ(s, l),  d(s, l) · W ≥ −δ(s, l), ∀l ∈ L_log    (8)
Here, the constant W satisfies W ≥ |δ(s, l)| for all l.

The Objective. The optimization objective of phase 1 is to minimize the realization cost in the given scenario s. There are three types of cost: 1) the utilization of resources at each link (OPEX); 2) the modification of logical links (CAPEX); 3) the implementation of hardware extensions (CAPEX and OPEX). Denote the cost for demand i to use a unit of resource at link l as Wx(i, l), the cost of modifying a logical connection as Ws, and the cost of implementing the kth choice of hardware extension at link l as Wt(l, k). The cost function is then:

Cost(s) = Σ_{i∈I(s), p∈P(i), l∈L} x(s,i,l,p) · R(s,i) · Wx(i,l) + Σ_{l∈Llog} d(s,l) · Ws + Σ_{l∈Lphys, k∈Kl} t(s,l,k) · Wt(l,k)   (9)
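Once a candidate solution is fixed, Eq. 9 is a plain weighted sum and can be evaluated directly. The sketch below uses invented toy data (one scenario, two demands, a handful of links); all names and numbers are our own illustrative choices, not the paper's test instances:

```python
# Toy evaluation of the phase-1 cost function (Eq. 9); all values are invented.
# x[(i, l, p)] = 1 if demand i uses link l on path p; R[i] = demanded resource;
# d[l] = 1 if logical link l was modified; t[(l, k)] = 1 if extension k chosen at l.

x = {("i1", "l1", "p1"): 1, ("i1", "l2", "p1"): 1, ("i2", "l1", "p1"): 1}
R = {"i1": 10, "i2": 5}
Wx = {("i1", "l1"): 2, ("i1", "l2"): 1, ("i2", "l1"): 2}   # per-unit OPEX weights
d = {"l2": 1}          # one logical link was modified
Ws = 50                # cost per logical-link modification
t = {("l1", 1): 1}     # one hardware extension installed
Wt = {("l1", 1): 100}  # cost of that extension choice

cost = (sum(x[i, l, p] * R[i] * Wx[i, l] for (i, l, p) in x)   # resource OPEX
        + sum(d[l] * Ws for l in d)                            # modification CAPEX
        + sum(t[l, k] * Wt[l, k] for (l, k) in t))             # extension cost
print(cost)  # 10*2 + 10*1 + 5*2 + 50 + 100 = 190
```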
Through the calculation in phase 1, the cost of the optimum solution Costopt(s) in each scenario s ∈ S can be obtained. These costs serve as references for the calculation in the next phase.

3.2 Model of Phase 2
In phase 2, the network planning constraints of phase 1 remain unchanged, while there are the following new features:
84
S. Zhang and U. Killat
– All scenarios must be considered in one integrated planning problem, rather than each scenario being planned independently of the others.
– The soft and hard reconfigurations between neighboring scenarios should be minimized.
– The actual cost of each scenario is upper-bounded. In the ideal case, it should also be minimized.

Define the set SN of all pairs of neighboring scenarios, (s1, s2) ∈ SN, between which comparisons are made. We further require that the pairs in SN are ordered, where the first scenario is a subset of the second one.

The Soft Reconfiguration. If the capacity modifications of a logical link differ between the compared pair of scenarios, a reconfiguration is necessary. Following the style of Eq. 8, the corresponding inter-scenario reconfigurations can be tracked by the binary variable ds(s1, s2, l) with the following equations:

ds(s1,s2,l) · W ≥ δ(s1,l) − δ(s2,l)
ds(s1,s2,l) · W ≥ δ(s2,l) − δ(s1,l),  ∀(s1,s2) ∈ SN, ∀l ∈ Llog   (10)
The constant W should again be larger than any possible value of δ(s, l). For the routing of demands, a reconfiguration dr(s1, s2, i) takes place when any segment of the paths in the neighboring scenarios differs, as in Eq. 11:

dr(s1,s2,i) ≥ x(s1,i,l,p) − x(s2,i,l,p)
dr(s1,s2,i) ≥ x(s2,i,l,p) − x(s1,i,l,p)
∀(s1,s2) ∈ SN, ∀i ∈ I(s1) ∩ I(s2), ∀p ∈ P(i), ∀l ∈ L   (11)

The Hard Reconfiguration. A hard reconfiguration happens when different choices of hardware extension are made at a link. Define the binary variable dh(s1, s2, l) for this check:

dh(s1,s2,l) ≥ t(s1,l,k) − t(s2,l,k),  ∀(s1,s2) ∈ SN, ∀l ∈ Lphys, ∀k   (12)
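Eqs. 10–12 in effect count, per neighboring scenario pair, which variables changed. A hedged sketch with invented data structures (dictionaries keyed by our own choice of tuples) illustrates the three reconfiguration types:

```python
# Counting inter-scenario reconfigurations (Eqs. 10-12) for one neighbour pair
# (s1, s2); the data layout and all values here are illustrative only.

delta = {"s1": {"l1": 0, "l2": 30}, "s2": {"l1": 0, "l2": 50}}      # logical links
route = {"s1": {("i1", "l1", "p1"): 1, ("i1", "l2", "p1"): 0},
         "s2": {("i1", "l1", "p1"): 0, ("i1", "l2", "p1"): 1}}      # x variables
ext   = {"s1": {("l1", 1): 0}, "s2": {("l1", 1): 1}}                # t variables

# Eq. 10: a soft reconfiguration iff delta differs on a logical link.
soft = sum(delta["s1"][l] != delta["s2"][l] for l in delta["s1"])

# Eq. 11: a routing reconfiguration iff any path segment of a demand differs.
demands = {key[0] for key in route["s1"]}
rerouted = sum(any(route["s1"][key] != route["s2"][key]
                   for key in route["s1"] if key[0] == i) for i in demands)

# Eq. 12: a hard reconfiguration iff an extension present in s1 is absent in s2
# (only this direction is checked, since s1 is a subset of s2).
hard = sum(max(ext["s1"][key] - ext["s2"][key], 0) > 0 for key in ext["s1"])

print(soft, rerouted, hard)  # 1 1 0
```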
Note that the above equation only checks whether a hardware extension applied in s1 also remains in s2, not the reverse direction. This is because s1 is generally a subset of s2, so extra hardware extensions in s2 are natural.

The Cost in Each Scenario. We wish the final solution for each scenario not to be too far away from the optimum one, i.e., to stay within a slightly relaxed bound as in Eq. 13. The excess cost, recorded by the variable Costexc(s) (≥ 0), will later be counted into the objective function of phase 2:

Cost(s) ≤ Costopt(s) · (1 + F) + Costexc(s)   (13)
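The excess cost of Eq. 13 is just the amount by which Cost(s) overshoots the relaxed bound. A tiny worked illustration (toy numbers; F = 0.25 is chosen here purely so the arithmetic is exact):

```python
# Excess cost of Eq. 13 (toy numbers): whatever exceeds the relaxed bound
# Cost_opt * (1 + F) is recorded in Cost_exc and later penalized via Eq. 15.

def excess_cost(cost, cost_opt, F):
    return max(0.0, cost - cost_opt * (1.0 + F))

assert excess_cost(120.0, 100.0, 0.25) == 0.0  # within the relaxed bound of 125
assert excess_cost(130.0, 100.0, 0.25) == 5.0  # 5 units above the bound
```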
Multiple-Layer Network Planning with Scenario-Based Traffic Forecast
85
The Optimization Objective. As the objective, we mainly wish to reduce the number of reconfigurations. Let Ws and Wh be the costs of a soft and a hard reconfiguration, respectively. The objective function can then be defined as:

Costcfg = Σ_{(s1,s2)∈SN, l∈Llog} ds(s1,s2,l) · Ws + Σ_{(s1,s2)∈SN, i∈I(s1)∩I(s2)} dr(s1,s2,i) · Ws + Σ_{(s1,s2)∈SN, l∈Lphys} dh(s1,s2,l) · Wh   (14)
Although the actual cost in each scenario is bounded by Eq. 13, in the final solution it may lie anywhere below the bound. If F is loose, e.g., 10%, there is still much room for improvement. Therefore, it is meaningful to let the costs of each scenario also appear in the objective function so that they can be minimized as well. In this case, the values of Ws and Wh should be carefully chosen (generally large values) so that minimizing the reconfigurations remains our main objective in this phase. Besides, the excess cost from Eq. 13 should also be counted with a punishment factor Wp. The overall objective function now becomes:

Cost = Costcfg + Σ_{s∈S} Cost(s) + Σ_{s∈S} Costexc(s) · Wp   (15)

4 Numerical Results
Here we show some examples of numerical results obtained with the network topology derived from DFN XWiN [7], a scientific research network with nodes in major cities of Germany and fibres connecting them to form a meshed topology. Our test network consists of 54 nodes and 81 physical links from the core of XWiN. The capacities of the links range from 1 Gbit/s to 20 Gbit/s, as established in XWiN. Then, 100 logical links are randomly generated using the resources of the physical links, which takes 70% to 80% of the physical resources into the logical layer. Finally, 20% to 80% (uniformly distributed) of the total capacity at each logical link is marked as occupied to emulate the current utilization. As a comparison, we first try to solve the same planning problem with a greedy method based on sequential shortest-path calculation for each demand (for those requiring physically disjoint paths, the algorithm suggested in [6] is used). When a solution does not exist for the current demand, first a soft reconfiguration is tried, then a hardware extension. This behavior is very close to that of a human planner. We denote the algorithm as the Greedy Sequential Algorithm (GSA). The same procedure is repeated ten thousand times and the best solution recorded. Then, we solve the same problem with our ILP model introduced in the previous section. There are two different models: the exact model, which strictly follows the formulation of Section 3, and the compromised model, in which the paths for
identical demands are the same in each scenario; this model is designed to alleviate the scalability problem. The results shown in the following are collected from repeating the same procedure 10 times: randomly generate a specific number of demands (to make the task heavy, each of them requires a pair of physically disjoint paths). One third of the demands are assigned to the minimum scenario, two thirds (including those in the minimum scenario) are assigned to the average scenario, and all demands are assigned to the maximum scenario. The capacities of the demands are randomly generated with a uniform distribution, where the value range is carefully selected so that the resources of the current network are only sufficient to accommodate roughly 80% to 90% of the new demands, i.e., hardware extension is necessary. In this test, we assume that all physical links follow hardware extension patterns with 3 different choices as defined in Table 1.
[Figure: cost relative to the GSA reference (y-axis: "Cost in comparison to Greedy (%)", 50–100) versus the number of demands (x-axis: 20–160) for ILP, ILP Compromised, and GSA.]
Fig. 2. Comparison of OPEX
To compare the performance of the planning methods, it is straightforward to compare the realization costs using Eq. 14 and Eq. 15. However, the optimization objective of phase 2 indicates an abstract cost that strongly depends on the values of the weight parameters Ws, Wh, and Wp; different parameter settings may lead to significantly different results. Therefore, in Fig. 2 we separately compare the cost of resource utilization in the maximum scenario, taking the cost of the GSA solution as reference (100%). The situations in the average and minimum scenarios are very similar. This result shows the saving of OPEX by the optimized routing solution obtained through the ILP model. Note that the solution qualities of the exact and the compromised models are very close. The plan of hardware extension accounts for the majority of the costs in network management. The results in Fig. 3 show fewer hardware extensions in the solution obtained by our ILP model (the exact and the compromised model yield the same result), which indicates substantial capital savings in practice.
[Figure: number of hardware extensions relative to the GSA reference (y-axis: "Number of extensions in comparison to Greedy (%)", 0–100) versus the number of demands (x-axis: 20–160) for GSA and ILP.]
Fig. 3. Comparison of the number of hardware extensions
Finally, the calculation time of each method is compared in Fig. 4. The scalability of the compromised ILP model, where the routings of demands in all scenarios are the same, is much better than that of the exact model, which allows routing reconfiguration. However, the qualities of both solutions are very close, as shown in Fig. 2 and Fig. 3, and both are much better than that of the greedy algorithm. Therefore, the compromised model is generally more suitable for solving practical problems, especially when the scale of the problem is large.
[Figure: calculation time until the optimality gap falls below 0.01 (y-axis: seconds, 0–20000) versus the number of demands (x-axis: 20–160) for ILP, ILP Compromised, and GSA (1E+4 rounds).]
Fig. 4. Comparison of scalability
5 Conclusion
In this paper, we introduced the concept of tactical planning for a telecom network against the background of a multi-layer network optimization model. Here, the plans for the routing and resource allocation of future demands, as well as for the infrastructure extension in the middle to long term, are to be made. Since the extension of a network involves a large amount of investment and directly influences the energy consumption, it plays a central role in the suggested planning method. Furthermore, the problem of the uncertain estimation of future traffic has been introduced, and a specific scenario-based forecast mechanism currently used by some network operators has been discussed. To solve the corresponding network planning tasks, we proposed a two-phase optimization method using ILP models. Through the tests with an XWiN network topology, it has been shown that our optimization model can find satisfactory solutions in acceptable calculation time, while the quality of the planning solution is much better than that obtained by repeated calculation with greedy algorithms, which are close to the behavior of a human planner. Generally, the planning solution suggested by our ILP model requires far fewer hardware extensions than the greedy algorithms, which means a great reduction of one of the major cost factors for a network operator.
References

1. Duffield, N.G., Goyal, P., Greenberg, A., Mishra, P., Ramakrishnan, K., van der Merwe, J.: A flexible model for resource management in virtual private networks. SIGCOMM Comput. Commun. Rev. 29(4), 95–108 (1999)
2. Fingerhut, J.A., Suri, S., Turner, J.S.: Designing least-cost nonblocking broadband networks. Journal of Algorithms 24(2), 287–309 (1997)
3. Mulyana, E., Killat, U.: Optimizing IP networks for uncertain demands using outbound traffic constraints. In: Proceedings of INOC 2005, pp. 695–701 (2005)
4. Ben-Ameur, W., Kerivin, H.: Routing of uncertain demands. Optimization and Engineering 3, 283–313 (2005)
5. Applegate, D., Cohen, E.: Making intra-domain routing robust to changing and uncertain traffic demands: Understanding fundamental tradeoffs. In: Proceedings of ACM SIGCOMM, pp. 313–324 (2003)
6. Xu, D., Xiong, Y., Li, G.: Trap avoidance and protection schemes in networks with shared risk link groups. IEEE Network 18(3), 36–41 (2004)
7. German Research Network (Deutsches Forschungsnetz DFN): The Scientific Network X-WiN, http://www.dfn.de/xwin/
Optimization of Energy Efficient Network Migration Using Harmony Search

Stefan Türk and Rico Radeke
Dresden University of Technology, Mommsenstrasse 13, 01062 Dresden
{tuerk,radeke}@ifn.et.tu-dresden.de
Abstract. In this paper we describe the basic network migration problem in backbone networks: moving from an existing technology to a new one. Furthermore, we use a generic harmony search algorithm to optimize the solution in terms of energy efficiency and costs. Harmony search is a probabilistic meta-heuristic that has been successfully adapted to many optimization problems. We analyze how harmony search can be used to calculate migration sequences with minimized energy consumption and financial costs. Results of parameter studies for the heuristic are shown to evaluate the method. The achieved resource utilization to cover increasing network demands, as well as the point of introduction within a certain time interval, are also presented.
1 Introduction
Traffic demand in optical backbone networks is expected to increase rapidly in the next years [1]. At the same time, the income of network providers is constantly decreasing due to shrinking tariffs for end users. Furthermore, the hardware utilization, and therefore the energy consumption, at each Point of Presence will rise strongly. This perspective offers opportunities for significant cost and energy reduction. Our paper describes how network operators can cope with the increase of backbone traffic in the next years. Network migration describes the technical process of upgrading and exchanging existing hardware/software (infrastructure) for other network devices, generating calculable cost savings for its owner. The shift of classical Synchronous Digital Hierarchy (SDH) services towards cost-efficient Ethernet services is currently under way, e.g., for high-data-rate Internet access or the global interconnection of company locations [2, 3]. The introduction of Carrier Ethernet services is heavily investigated by most providers, since it promises significant cost savings and simplifications in terms of administration and maintenance [4]. In this paper we will describe how a generic Harmony Search (HS) meta-heuristic can be used to optimize the multi-layer and multi-period network migration problem for core networks. As a result, a proposal for a possible resource utilization and scaling will be given. HS was chosen to be investigated since

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 89–99, 2011.
© Springer-Verlag Berlin Heidelberg 2011
90
S. Türk and R. Radeke
the heuristic is adaptable to the initial problem (see Section 2) and was described as a fast-converging algorithm by its authors [5]. Previously, an Ant Colony Optimization (ACO) meta-heuristic [6] had been applied to the problem [7]. The system model of the migration approach was applied to a widely used IP/MPLS/SDH/DWDM (Internet Protocol / Multiprotocol Label Switching / Synchronous Digital Hierarchy / Dense Wavelength Division Multiplex) network. A possible future scenario and migrated network is an IP/MPLS/ETH (Ethernet)/DWDM architecture. Both architectures are discussed in detail in [8]. Previous investigations on energy consumption in backbone networks have been presented in [9]. An energy optimization approach using the Nobel-2 [10] architecture has been discussed in [11]. The energy consumption of different devices was analyzed in [12, 13] and used in our investigation. The remainder of the paper is structured as follows. Section 2 describes the harmony search algorithm and its adaptation to the migration process. In Subsection 2.4 we shortly present the used simulation system and the defined scenarios. In Section 3 we discuss the achieved results and conclude the paper in Section 4.
2 Algorithm Description

2.1 Network Migration
The process of network migration can be described as the stepwise insertion, replacement or removal of components in a network in order to change the technology. In our current approach (multi-layer migration), Optical Cross Connects (OXCs), Carrier Grade Ethernet switches (MPLS Transport Profile (MPLS-TP) switches) and IP routers can be added to a network to satisfy yearly increasing demands. Within an incremental network migration, only the next planning state is used to calculate the necessary hardware [14]. In our approach, potential future traffic is also considered. Our network migration model implements, for instance, two important market-driven factors: on the one hand the network demand increase (40% annual traffic increase [1]), and on the other hand a cost erosion of devices (20% yearly Capital Expenditures (CAPEX) decrease [15]). The detailed description of this cost model can be found in [8]. The energy model is device specific and was approximated for our investigations using the assumptions made by [13] (Fig. 1). Basically, it has been stated that an IP Router (IPR) consumes about 1200 Watt per 100 Gbit/s, an ETH switch about 800 Watt per 100 Gbit/s, and an OXC about 180 Watt per 100 Gbit/s.

2.2 Basic Solution
The basic Migration Solution (MS) can be generated using different approaches. The first option is to directly calculate a minimal necessary device situation at
Fig. 1. Energy consumption for different switch types [13]
each node for each time step. The second option is to generate a completely randomized but feasible solution within the solution space of the migration that is also able to cover all appearing demands for each step. Both approaches include the greenfield planning result or a current device situation in the network as a starting point. We denote this starting situation as migration step 0 (see Fig. 2).
Fig. 2. Exemplary network migration solution structure
The implemented device situation in each field of the matrix in Fig. 2 contains information about each device that has to be present at a node in time step i to cover the demands of the backbone. These device granularities z_{k,i} of node k mark the decision variables later used in Section 2.3. Each variable has a minimum and maximum possible value (as defined by the flags, Tab. 1), which is determined by the used system model. The decision variables are stated in Tab. 1; a value of 0 indicates that a device of this type is not present at the node in this time step i. A "(v)" denotes a virtually meshed device, which would in practice be a multi-chassis installation of IPRs, ETHs or OXCs. The cost or
Table 1. Device granularities with cost model and power consumption [10, 12, 13]

Type              Var       Flag        CAPEX [CU]   Power [kW]
IPR  0            z^IPR_k   f^IPR_MIN   0            0
IPR  640                                16.67        2.92
IPR  1280                               111.67       14.94
...
IPR  5760                   f^IPR_MAX   315.83       35.83
ETH  0            z^ETH_k   f^ETH_MIN   0            0
ETH  240                                12.5         0.73
ETH  1200                               50           9.33
ETH  2400 (v)                           100          18.66
...
ETH  9600 (v)               f^ETH_MAX   400          74.64
OXC  0            z^OXC_k   f^OXC_MIN   0            0
OXC  3 Port                             21.24        2.16
OXC  4 Port                             28.32        2.87
...
OXC  9 Port (v)                         63.72        6.48
...
OXC  18 Port (v)            f^OXC_MAX   127.44       12.96
energy model for virtual devices remains unclear, as coupled devices lose capacity due to internal connections. These effects are subject to further investigations. The generated basic MS can be optimized using meta-heuristics. One approach is to use harmony search as the optimization algorithm, which is described in the following.

2.3 Harmony Search
Harmony Search was first introduced in 2001 by Geem et al. [5] and has been improved in several other works [16, 17]. The algorithm is described very well in [5], but shall be shortly explained in the following in order to show how it can be adapted to the network migration problem. HS is based on the idea that musicians have several options to generate new songs. Basically, they can modify an existing song, or they can implement copied parts of multiple songs in a new song. They can also improvise a totally new (random) harmony. HS was proposed as a very effective optimization heuristic for the Traveling Salesman Problem, routing and timetabling, and consists of five major steps [5]:

I. Initialize solution variables and parameters
II. Fill the harmony memory with randomized solutions
III. Generate a feasible random solution within the search space and apply changes from a solution found in the Harmony Memory (HM)
IV. Vary the generated solution according to predefined probabilities (called pitch adjusting)
V. Repeat steps III and IV until a stop criterion is reached

The defined steps of the meta-heuristic can be applied to the initial network migration problem. In the first step, all parameters for the problem have to be set (e.g., the size of the HM or the probabilities for algorithm decisions). The fret width
fw, which is the range of possible granularity changes, also has to be set in step I. After the initialization of variables, the HM is filled with randomized solutions until the HM's maximum size L_HM,max is reached. Randomized solutions have to be provided here, since otherwise the best solution in the HM would always be the semi-optimal deterministic "minimal necessary device" solution. The first part of step III is equal to the description provided in Section 2.2, as a new solution Sn is constructed here. The second part of the third step delivers the most important changes towards a new and potentially good migration solution Sn. According to the parameter Harmony Memory Considering Rate (HMCR), the algorithm decides to fetch an existing decision variable z^d_{k,i} (where d is the investigated device type) from the HM. The decision variable for the current step and current node is randomly taken from the HM. With a probability of Pnew = 1 − HMCR, a new randomized decision variable is generated instead. After the restructuring of the solution, either via HM variables or with randomized variables, a pitch adjustment can be applied. This means the current decision variable of the current step can be tuned up or down (in our case in terms of device granularities). After pitch adjusting, the solution can be compared to the worst solution Sw of the HM. If Sn has a smaller Total Cost of Ownership (TCO) (where TCO can also mean total power consumption) than Sw, it is a potentially good solution for the next iteration and replaces Sw in the HM. When the stop criterion is reached, the algorithm terminates and returns the best found migration solution. These major steps are stated in Alg. 1.
Algorithm 1. Pseudo code for HS implementation

  Specify parameters
  while L_HM < L_HM,max do
    Generate balanced random solution
    Add solution to HM
  end while
  Sort HM descending
  for all iterations do
    (Calculate PAR)
    Generate new solution Sn
    for all solution elements e ∈ E of Sn do
      for all device types d ∈ D of e do
        Generate new random variable rnd1
        if rnd1 < HMCR then
          Return random device z^d_{k,i} from HM
          Generate new random variable rnd2
          if rnd2 < PAR then
            Do pitch adjusting for z^d_{k,i} with fret width fw
          end if
          Set device type z^d_{k,i} for element e
          Balance traffic at node
        else
          Generate new random device type z^d_{k,i}
          Balance traffic at node
        end if
        Store e in Sn
      end for
    end for
    Calculate cost cn of Sn
    Fetch worst solution Sw from HM
    Calculate cost cw of Sw
    if cn < cw then
      Replace Sw in HM with Sn
    end if
  end for
  print best S of HM
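Stripped of the migration-specific bookkeeping (device granularities, traffic balancing), the control flow of Alg. 1 reduces to a generic harmony search. The following minimal Python sketch is our own illustrative implementation on an integer toy problem, not the authors' code; all parameter values are arbitrary:

```python
import random

def harmony_search(cost, lower, upper, n_vars,
                   hm_size=10, hmcr=0.95, par=0.3, fret=1, iters=2000, seed=1):
    """Minimal harmony search over integer vectors in [lower, upper]^n_vars."""
    rng = random.Random(seed)

    def random_solution():
        return [rng.randint(lower, upper) for _ in range(n_vars)]

    # Step II: fill the harmony memory (HM) with randomized solutions.
    hm = sorted((random_solution() for _ in range(hm_size)), key=cost)
    for _ in range(iters):
        new = []
        for j in range(n_vars):
            if rng.random() < hmcr:           # step III: take a value from the HM ...
                v = rng.choice(hm)[j]
                if rng.random() < par:        # step IV: ... and maybe pitch-adjust it
                    v += rng.choice((-fret, fret))
            else:                             # ... or improvise a random value
                v = rng.randint(lower, upper)
            new.append(min(max(v, lower), upper))
        if cost(new) < cost(hm[-1]):          # replace the worst HM member if better
            hm[-1] = new
            hm.sort(key=cost)
    return hm[0]

# Toy objective: minimize the squared distance to a target vector.
target = [3, -2, 7, 0]
cost_fn = lambda s: sum((a - b) ** 2 for a, b in zip(s, target))
best = harmony_search(cost_fn, lower=-10, upper=10, n_vars=4)
print(cost_fn(best))  # close to 0 after enough iterations
```

In the migration problem, the decision variables are the device granularities z^d_{k,i} and the cost function is the TCO or total power consumption of the resulting migration sequence.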
2.4 Scenarios
Two backbone networks were investigated: the German 17-node backbone network (Fig. 3) and the European 67-node backbone. Due to space limitations, only the results obtained for GER-17 are discussed in this paper. To compare the results with a simple hardware insertion algorithm, a Selective Random Search Heuristic (SRSH) has been implemented as well. This algorithm represents an optimization based only on feasible, randomly generated solutions and their direct comparison, using either TCO or power consumption only. The traffic model of both scenarios is based on a population investigation that has been described in more detail in [7]. The total migration period is set to 5 years, where the first year is a greenfield planning result or an existing network. Since the HMCR and the Pitch Adjusting Rate (PAR) strongly affect the migration solution, both parameters have been investigated in detail (Tab. 2) in order to configure the algorithm optimally for the migration problem and to find potential dependencies in the scenario.

Table 2. Evaluated heuristic configurations

Configuration  HMS  HMCR  PAR version   PAR_max  PAR_min
C-I            10   0.85  no PAR        0        0
C-IIa          10   0.85  fix           0.1      –
C-IIb          10   0.85  fix           0.01     –
C-IIIa         10   0.85  PAR_v1        0.5      0.1
C-IIIb         10   0.85  PAR_v1        0.05     0.01
C-IVa          10   0.85  PAR_v2        0.5      0.1
C-IVb          10   0.85  PAR_v2        0.05     0.01
C-V            10   0.95  no PAR        0        0
C-VIa          10   0.95  fix           0.1      –
C-VIb          10   0.95  fix           0.01     –
C-VIIa         10   0.95  PAR_v1        0.5      0.1
C-VIIb         10   0.95  PAR_v1        0.05     0.01
C-VIIIa        10   0.95  PAR_v2        0.5      0.1
C-VIIIb        10   0.95  PAR_v2        0.05     0.01
C-X            0    –     no migration  –        –
Improved dynamic PAR equations (Eq. 1, proposed by [17], and Eq. 2) have been applied to the problem to enhance the overall optimization process:

PAR_v1(it) = PAR_min + ((PAR_max − PAR_min) / N_I) · it   (1)

PAR_v2(it) = PAR_max − ((PAR_max − PAR_min) / N_I) · it   (2)

with:
it       ... current iteration step
N_I      ... total number of iterations (predefined)
PAR(it)  ... probability within iteration step it to do a pitch adjusting
PAR_max  ... maximum predefined probability to do a pitch adjusting
PAR_min  ... minimum predefined probability to do a pitch adjusting
[Figure: (a) map of the German 17-node backbone with numbered nodes (Norden, Bremen, Hamburg, Hanover, Berlin, Essen, Dortmund, Duesseldorf, Cologne, Frankfurt, Mannheim, Leipzig, Karlsruhe, Stuttgart, Nuremberg, Ulm, Munich); (b) load distribution matrix between all node pairs.]
Fig. 3. German 17 (GER-17) node backbone scenario (a) and demand situation (b) [8]
3 Results
The results of our research regarding migration algorithms are shown using different visualisations (algorithm performance and a detailed view of the migration result). All parameters have been investigated twice, once for cost and once for power consumption. The stop criterion for the HS has been defined as a maximum of 100k iterations. The optimization calculations have been repeated 80 times independently to guarantee statistical confidence. In Fig. 4 and 5 the performance of the algorithms and the HS variations (CI-CIIIb) is presented. Here, CX is the best found cost solution if a network migration is not performed (just IP routers are upgraded); CX is used to compare our results to a provider's usual incremental device upgrade. It can be seen that using a too low HMCR can result in too many random changes; the final power usage of the migration solution will therefore be higher than necessary. Preliminary tests showed that, on the other hand, using no random decisions at all results in a very poor algorithm performance. When using an HMCR of 0.95 (Fig. 5), the total algorithm performance is much better, as the best algorithm configuration CIIIb was found. An HMCR of 0.95 was chosen according to the default value of the HS algorithm; there might be HMCR values that perform better for the given network scenario. PAR_v1 does not enhance the migration result in our case, due to the many random decisions in the beginning of the iteration phase. PAR_v2 improves the power savings of the migration solution by up to 6.5% (compared to C-VIb). The size of the HM has not been changed for the results of this paper. Fig. 6 and 7 present a detailed view of the respective best migration results. For cost-efficient solutions, Ethernet switches (ETH Basic Nodes (ETHBN)) have to be purchased early (here in year 2); for the energy-optimal solution, electric devices (also IP Port Cards (IPPC) and ETH Port Cards (ETHPC)) have to be bought as late as possible, while optical equipment (Optical Basic Nodes (OPBN)) is preferred early. In both cases the algorithm sells IP Basic Nodes (IPBN) early if possible.
[Figure: total power consumption (kW, 3000–5000) versus number of iterations (10–100000, log scale) for HS with HMCR=0.85 (configurations CI–CIVb), SRSH, and CX; OPT=POW, 1−α=0.95.]
Fig. 4. Algorithm performance for HMCR=0.85 for different configurations

[Figure: total power consumption (kW, 2000–5000) versus number of iterations (10–100000, log scale) for HS with HMCR=0.95 (configurations CV–CVIIIb), SRSH, and CX; OPT=POW, 1−α=0.95.]
Fig. 5. Algorithm performance for HMCR=0.95 for different configurations
[Figure: partial power consumption (kW, 0–500) per migration step (years 1–4) for POWERSUM, OPBN, IPBN, IPPC, ETHBN, ETHPC; OPT=POW, best of 100000 HS iterations.]
Fig. 6. Best found energy efficient solution for GER-17

[Figure: partial expenses (CU, 0–1600) per migration step (years 1–4) for CAPEX, IMPEX, OPEX and the device types OPBN, IPBN, IPPC, ETHBN, ETHPC; OPT=TCO, best of 100000 HS iterations.]
Fig. 7. Best found cost efficient solution for GER-17
4 Conclusion and Future Work
In this paper we have presented a methodology to solve the network migration problem using a harmony search algorithm. The evaluation of the results showed that harmony search is adaptable to the initial problem formulation and performs better than SRSH. The comparison of different parameters for this approach showed promising working conditions for the algorithm, which will be further investigated. Generally, it was shown that simple improvements of the harmony search method can result in significant energy savings for network providers. Future research in the field of optimization of network migration includes the evaluation and comparison of other types of algorithms (e.g., genetic) as well as the improvement of existing approaches. The investigation of device meshing and its influence on total migration cost and energy consumption will also be carried out.

Acknowledgment. The work presented in this paper is a result of the CELTIC project 100GET-E3, which has been partially supported by Nokia Siemens Networks GmbH & Co. KG and the German Federal Ministry of Education and Research (BMBF) under grant 01BP0740.
References

1. Cisco: Cisco visual networking index: Forecast and methodology, 2008-2013. Tech. Rep. (2009)
2. Kiy, N.: Carrier-Ethernet: Transportnetz für Next Generation Networks. ntz 3-4, 28–29 (2009)
3. Michaelis, T., Duelli, M., Chamania, M., Lichtinger, B., Rambach, F., Türk, S.: Network planning, control and management perspectives on dynamic networking. In: 35th European Conference on Optical Communication, Vienna, Austria, p. 7.7.2 (2009)
4. Ciena: The value of OTN for network convergence and IP/Ethernet migration (2009), http://www.ciena.com/files/
5. Geem, Z.W., Kim, J.H., Loganathan, G.V.: A new heuristic optimization algorithm: Harmony search. Simulation 76(2), 60–68 (2001), http://sim.sagepub.com/content/76/2/60.abstract
6. Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Computational Intelligence Magazine 1(4), 28–39 (2006)
7. Türk, S., Radeke, R., Lehnert, R.: Network migration using ant colony optimization. In: 9th Conference of Telecommunication, Media and Internet Techno-Economics (CTTE) (June 2010)
8. Türk, S., Sulaiman, S., Haidine, A., Lehnert, R., Michaelis, T.: Approaches for the migration of optical backbone networks towards carrier ethernet. In: IEEE Workshop on Enabling the Future Service-Oriented Internet - Towards Socially-Aware Networks, Honolulu, Hawaii, USA (2009)
9. Baliga, J., Ayre, R., Hinton, K., Sorin, W., Tucker, R.: Energy consumption in optical IP networks. Journal of Lightwave Technology 27(13), 2391–2403 (2009)
10. Ferreiro, A.: Nobel 2 project: Migration guidelines with economic assessment and new business opportunities generated by NOBEL phase 2. Tech. Rep. (2008)
11. Palkopoulou, E., Schupke, D.A., Bauschert, T.: Energy efficiency and CAPEX minimization for backbone network planning: is there a tradeoff? In: ANTS 2009: Proceedings of the 3rd International Conference on Advanced Networks and Telecommunication Systems, pp. 34–36. IEEE Press, USA (2009)
12. Idzikowski, F.: Power consumption of network elements in IP over WDM networks. TU Berlin, TKN Group, Tech. Rep. TKN-09-006 (2009)
13. Tamm, O.: Scaling and energy efficiency in next generation core networks and switches. In: ECOC, Vienna (2009)
14. Meusburger, C., Schupke, D., Eberspacher, J.: Multiperiod planning for optical networks: approaches based on cost optimization and limited budget. In: IEEE International Conference on Communications, ICC 2008, pp. 5390–5395 (2008)
15. Verbrugge, S.: Strategic planning of optical telecommunication networks in a dynamic and uncertain environment. Ph.D. dissertation, University of Ghent (2007)
16. Lee, K.S., Geem, Z.W.: A new structural optimization method based on the harmony search algorithm. Computers & Structures 82(9-10), 781–798 (2004), http://www.sciencedirect.com/science/article/B6V28-4BWMP8Y-2/2/cd0c48f22f516cc7df98796923cbe99a
17. Mahdavi, M., Fesanghary, M., Damangir, E.: An improved harmony search algorithm for solving optimization problems. Applied Mathematics and Computation 188(2), 1567–1579 (2007), http://www.sciencedirect.com/science/article/B6TY8-4MM8B85-1/2/99776af451ff9cf34fec0d93a358aa6d
Self-management of Hybrid Networks: Introduction, Pros and Cons Tiago Fioreze and Aiko Pras University of Twente Enschede, The Netherlands
[email protected],
[email protected]
Abstract. In the last decade ‘self-management’ has become a popular research theme within the networking community. While reading papers, one could get the impression that self-management is the obvious solution to many of the current network management problems. There are hardly any publications, however, that discuss the drawbacks of self-management. In this paper we therefore introduce self-management for the specific case of hybrid networks, and discuss some pros and cons. In particular, this paper investigates the feasibility of employing self-management functions within hybrid optical and packet switching networks. In such networks, large IP flows can be moved from the IP level to the optical level, in an attempt to reduce the load at the IP layer and enhance the quality of service (QoS) of the flow that is moved to the optical level. One of the typical management tasks within such networks is the establishment and release of lightpaths. This paper identifies the advantages and disadvantages of introducing self-management to control such lightpaths. Keywords: Future Internet, self-management, hybrid networks.
1
Introduction
In recent years there has been considerable interest in what is called the Future Internet. Two fundamentally different approaches are under discussion: evolution versus revolution. The evolutionary approach aims at moving the Internet from one state to another through incremental patches. The revolutionary approach, on the other hand, proposes a radical redesign of the current Internet architecture, and is therefore also called the clean-slate approach [2]. Whichever approach prevails, we can already foresee a future Internet in which optical communication will play a major role. At this moment we can already observe that the core Internet, which once solely relied on IP routing to deliver end-to-end communications, is moving towards a hybrid optical-IP network. Such a network takes data forwarding decisions simultaneously at both the IP and the optical level [11]. It is composed of intermediate multi-service devices that are both switches at the optical level and traditional routers at the IP level. In such an environment, data flows can traverse a hybrid network through either R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 100–111, 2011. © Springer-Verlag Berlin Heidelberg 2011
an IP path or a lightpath. In this paper, we consider a lightpath as a direct connection over an optical fiber; the lightpath can consist of the whole fiber, a wavelength within the fiber (lambda), or a TDM-based channel within a lambda. Traditionally, the establishment and release of lightpaths are controlled by human managers. In this article we investigate the feasibility of removing the human manager from the loop and of introducing self-management capabilities. Such capabilities enable an autonomic decision process to configure lightpaths, based on measurement data received from the hybrid network. The human manager expresses what the self-managing system is expected to achieve, but not how this should be done. The human manager is therefore moved to a higher level in the management hierarchy, where he controls the autonomic decision process, rather than the whole hybrid network. Figure 1 shows this self-management approach.
Fig. 1. The self-management approach
The goal of this article is to introduce, in a tutorial style, the main advantages and disadvantages of self-management within hybrid networks. In fact, the material presented in this article summarizes four years of PhD research; readers interested in the technical details of our approach and the validation behind the conclusions, are encouraged to read the thesis and associated papers [3] [6] [7] [8]. The remainder of this article is structured as follows. Section 2 introduces the concept of self-management, and provides some definitions. Following that, Section 3 identifies the related work in this area. Next, Section 4 introduces our approach to employ self-management in hybrid networks. Sections 5 and 6 then respectively present the pros and cons of employing our self-management proposal in hybrid networks. Finally, in Section 7 we draw our conclusions.
2
What Is Self-management?
Self-management encompasses the act of computer systems managing their own operation without (or with very little) human intervention. It was first defined by IBM in 2001 in the IBM Autonomic Computing Initiative (ACI) manifesto [9]. In this manifesto, IBM proposed an approach in which self-managed computing systems could work with a minimum of human interference. The approach is inspired by the human body’s autonomic nervous system. Many actions are performed by our nervous system without any conscious recognition, such as adjusting our pupils to the amount of light or sweating in order to regulate our body temperature. The main objective of IBM’s autonomic initiative is: “to design and build computing systems capable of running themselves, adjusting to varying circumstances, and preparing their resources to handle most efficiently the workloads we put upon them. These autonomic systems must anticipate needs and allow users to concentrate on what they want to accomplish rather than figuring how to rig the computing systems to get them there.” A system can be seen as a collection of computing resources bound together to achieve certain objectives. For example, a network router can constitute a system responsible for forwarding network traffic. When combined with other network routers, it can form a larger system, i.e., a Local Area Network (LAN). In turn, a LAN combined with other LANs can form a Metropolitan Area Network (MAN), and so on. Based on the IBM autonomic principle, each system must be able to manage its own actions (e.g., traffic forwarding), while collaborating with a larger, higher-level system. The same analogy can be found in the human body. From single cells to organs and organ systems (e.g., the circulatory system), each level maintains a measure of independence while contributing to a higher level of organization, culminating in the organism, i.e., the human body.
For most of our daily life, we remain unaware of the activities of our vital organs (e.g., the heart), since these organs (systems) take care of themselves and only escalate to a higher level (e.g., the brain) when something is wrong and they need assistance. 2.1
Self-management Aspects
IBM divided self-management into four aspects (although other subdivisions exist), commonly referred to as self-*, as follows: – Self-configuration: consists of an automated configuration process of components and systems based on high-level policies. For example, when a new device is added to a network, this device is expected to automatically configure itself while, at the same time, the rest of the network seamlessly adjusts itself to incorporate this new device.
– Self-optimization: means that components and systems are supposed to continuously improve their own performance. One example of this aspect is the automatic update process most operating systems provide to their users. Instead of requiring users to manually seek updates, the operating system does that automatically. – Self-healing: consists of the capability of a system to automatically detect, diagnose, and repair problems found in certain components. As an example, a computer could self-heal every time a virus strikes the system, by automatically patching the damaged files. – Self-protection: is seen as a system automatically defending itself against malicious attacks or failures. A computer system could, for instance, prevent infection by a certain email virus through analysis of email attachments. 2.2
Different Definitions for Self-management
Although the term self-management has been widely used, there is no universal consensus among authors on what self-management actually means, which leads to different definitions of the term. Some of the best-known definitions are as follows: – Autonomic management: is the most common synonym used to refer to self-management. This comes from the fact that IBM considers self-management to be the essence of autonomic computing systems. As a result, the terms self-management and autonomic management are used interchangeably. By analyzing the keywords attached to papers submitted to important network management conferences (e.g., IM, NOMS, CNSM), we found that 80% of the papers were submitted with self-* keywords, whereas 20% were registered as autonomic. This leads us to the conclusion that, even though the terms are constantly used as synonyms, self-management seems to be the one most used by the network management community. – Automatic management: is commonly confused with autonomic management (and thus with self-management). Automatic management refers to the act of managed devices automatically following explicit policies defined by a network operator. In turn, autonomic management refers to a specialized automatic process, in the sense that the process is also instructed to perform actions based on certain policies, but with the capability of learning new actions by itself. – Autonomous management: is another definition referring to self-management. Autonomous means that a process can operate independently from any human intervention. However, this lack of external control is, according to some, a contradiction.
If an autonomous “management” system includes enough intelligence for the system to govern its own management, one can assume that there is no need whatsoever to manage such a system, which somehow invalidates the use of the term management to describe this kind of management approach.
It is worth saying that the foregoing differentiation among self-management definitions is not a common view in the community. On the contrary, this differentiation is solely intended as a reference to be used throughout this article. We see these definitions as following an evolution in network management approaches as well as having different degrees of autonomy (Figure 2).
Fig. 2. Evolution in the network management approaches vs. their degree of autonomy
The simplest management approach is the conventional one. In the conventional management approach, the network management system is manually operated by network operators. There is no intelligence whatsoever and no (or very little) automation in the execution of management tasks. A next step in the evolution of management approaches is the automation of management tasks. In this case, the management system automatically performs explicit tasks defined by network managers, but nothing beyond the scope of the defined rules. Following automatic management, autonomic management (or self-management) also performs these tasks, but is capable of learning new rules by itself. The last step in the evolution process, and the most complex one, is autonomous management. At this level, the management system is fully capable of deciding by itself which rules to follow. There is therefore no dependence on human intervention. The management system is intelligent enough to decide its own rules and to follow them according to its own judgement.
3
Related Work on Self-management
Since the release of the ACI manifesto, several research works on the use of self-management have been reported. To name a few, Lupu et al. [12] have been researching the use of self-management in healthcare practice, in which a ubiquitous self-managed computing environment is used to monitor and report the health of patients under medical treatment. In other work, self-management has been investigated for use in situations where there is a great risk
for human beings, such as in military or disaster scenarios. Within this line of research, we point out the work by Asmare et al. [1], who have been investigating the use of self-management in Unmanned Autonomous Vehicles (UAVs). Self-management has also been investigated in the area of communication networks [10]. Much of the focus of this investigation lies in developing highly distributed algorithms, with the objective of optimizing several aspects of network operability (e.g., performance). This optimization is pursued through the provision of self-management capabilities to communication networks. Studies on self-management are also the focus of several research projects, such as Autonomic Internet (AUTOI), Self-Optimisation and self-ConfiguRATion in wirelEss networkS (SOCRATES), and UniverSelf. A study that is closely related to ours is by Miyazawa et al. [13]. In their research, they propose a dynamic bandwidth control management mechanism based on the volume of IP flows. In their work, a centralized management system observes the bandwidth of IP flows and decides about offloading these flows based on pre-defined upper and lower threshold values. These threshold values are defined in advance by a human operator and statically stored in the configuration file of the management system. Once an IP flow has a bandwidth utilization that exceeds the pre-determined upper threshold, the management system triggers an action to create a lightpath. In contrast, when the flow's bandwidth utilization decreases below the lower threshold, the management system initiates the deletion of the established lightpath.
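The static-threshold mechanism described above can be sketched as follows; the threshold values and all names are illustrative, not taken from Miyazawa et al.:

```python
# Sketch of a static-threshold offload decision (hypothetical names and
# values; the actual mechanism of Miyazawa et al. differs in detail).

UPPER_MBPS = 800.0  # create a lightpath above this utilization
LOWER_MBPS = 200.0  # tear the lightpath down below this

def decide(flow_mbps, has_lightpath):
    """Return the action for one observed IP flow."""
    if not has_lightpath and flow_mbps > UPPER_MBPS:
        return "create_lightpath"
    if has_lightpath and flow_mbps < LOWER_MBPS:
        return "delete_lightpath"
    return "no_action"

print(decide(950.0, False))  # create_lightpath: upper threshold exceeded
print(decide(150.0, True))   # delete_lightpath: below the lower threshold
```

The shortcoming discussed later in this paper is visible here: both constants are fixed at configuration time and never adapt to the observed traffic mix.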
4
Self-management of Lightpaths
We now focus on the use of self-management in the context of hybrid optical and packet switching networks, more specifically on the self-management of lightpaths in these networks. Self-management is aimed here at autonomically: 1) detecting flows at the IP level that are eligible to be moved to the optical level, and 2) establishing/releasing lightpaths for those flows. In this paper we adopt the definition of flows as described in the information model for the IP Flow Information eXport (IPFIX) protocol (RFC 5102). In this RFC, an IP flow is defined as a unidirectional sequence of packets that share the same properties (e.g., the same source and destination IP addresses, source and destination port numbers, and higher level protocol). Network operators are only required to initially configure the self-management process with decision policies. After this initial setup, the self-management process runs autonomically by itself. Decision policies define a desired objective, which must be achieved by the self-management functions. In our research, the main objective is to offload as much traffic as possible from the IP level to the optical level. For that, our self-management approach aims at moving to the optical level those flows that are few in number, but represent most of the traffic, namely the elephant flows. Figure 3 depicts our approach for the self-management of lightpaths in hybrid networks.
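The IPFIX-style flow definition above amounts to grouping packets by a shared key; a minimal sketch (with illustrative field names) is:

```python
# Sketch of flow aggregation following the IPFIX-style definition quoted
# above: packets sharing the same five-tuple belong to the same
# unidirectional flow. Field names are illustrative assumptions.

from collections import defaultdict

def aggregate(packets):
    """Group packets into flows keyed by the five-tuple; sum their bytes."""
    flows = defaultdict(int)
    for p in packets:
        key = (p["src_ip"], p["dst_ip"],
               p["src_port"], p["dst_port"], p["proto"])
        flows[key] += p["length"]
    return dict(flows)

pkts = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5001,
     "dst_port": 80, "proto": "TCP", "length": 1500},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 5001,
     "dst_port": 80, "proto": "TCP", "length": 1500},
]
print(aggregate(pkts))  # one flow, 3000 bytes
```

Per-flow volumes accumulated this way are the raw material from which elephant flows are later identified.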
Fig. 3. Self-management of lightpaths in hybrid optical and packet networks
In Figure 3, IP routers located in IP domain B export network traffic information to a monitoring station (step 1). This information contains flow attributes, such as source and destination IP addresses, protocol, and flow volume, amongst others. The information is then forwarded to our self-management module (step 2). Based on the information received, the module decides whether an elephant flow is eligible, or no longer eligible, for a lightpath at the optical level. If the decision is in favor of creating a lightpath (i.e., the elephant flow is eligible to be moved to the optical level), the self-management module configures the IP routers in IP domain B and the optical switches in optical domain A (step 3). The routers are informed that the elephant flow is offloaded to the optical level. In turn, the optical switches are configured to establish a lightpath for the offloaded elephant flow. From that point on, the elephant flow is switched at the optical level, thus bypassing the IP level in IP domain B. For configuring routers and switches, existing management technologies can be used, such as the Command Line Interface (CLI), the Generalized Multi-Protocol Label Switching (GMPLS) protocol, the Simple Network Management Protocol (SNMP) or the emerging Network Configuration (NetConf) protocol. Note that this article can only summarize the operation of our self-management approach; details of this operation can be found in other papers [4] [5], as well as the thesis that resulted from this research [3].
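The three steps of Figure 3 can be sketched as a simple control loop. All names below are illustrative; a real deployment would receive NetFlow/IPFIX records in step 1 and use CLI, GMPLS, SNMP or NetConf in step 3 instead of the stub callbacks shown:

```python
# Illustrative sketch of the three-step loop of Fig. 3 (hypothetical API).

def is_elephant(flow, volume_threshold_bytes=10**9):
    # Step 2: a deliberately simple eligibility rule; the actual decision
    # logic of the self-management module is more elaborate.
    return flow["bytes"] > volume_threshold_bytes

def self_management_cycle(exported_flows, configure_router, configure_switch):
    for flow in exported_flows:          # step 1: flow records arrive
        if is_elephant(flow):            # step 2: eligibility decision
            configure_router(flow)       # step 3a: bypass flow at the IP level
            configure_switch(flow)       # step 3b: establish the lightpath

offloaded = []
self_management_cycle(
    [{"id": "f1", "bytes": 5 * 10**9}, {"id": "f2", "bytes": 10**6}],
    configure_router=lambda f: offloaded.append(f["id"]),
    configure_switch=lambda f: None,
)
print(offloaded)  # ['f1']: only the elephant flow is offloaded
```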
5
Advantages of Self-management
Advantages and disadvantages depend on the context in which self-management is employed. We highlight in this section the main pros of employing self-management principles in the specific context of hybrid optical and packet switching networks.
5.1
Network Performance
The use of self-management improves network performance by automatically reducing the load on the IP level. When IP flows are completely transported via lightpaths, they bypass the per-hop routing decisions of the IP level. As a result, the QoS offered by hybrid networks is considerably better when compared to traditional IP networks. Big IP flows that overload the regular IP level, for example, may be moved to the optical level, where they experience better QoS (e.g., negligible jitter and larger bandwidth). At the same time, the IP level is offloaded and can better serve smaller flows. Last but not least, it is also cheaper to send traffic at the optical level than at the IP level. For the same traffic rate, the cost of an optical switch is 1/10th that of an Ethernet switch or 1/100th that of a conventional router. 5.2
Network Management
We believe that human factors have an impact on the management of lightpaths. For example, network operators of SURFnet reported, when informally interviewed, that it may take hours (intra-domain) or even days (inter-domain) before a lightpath is established by network operators when using a traditional network management paradigm. In this paradigm, a network manager regularly monitors a hybrid network. Based on his analysis of the collected data, he may decide to establish or release a lightpath. It is worth highlighting that this paradigm keeps the human in the management loop. That is, most of the management decisions have to go through the network manager. As a result, the management system does not go beyond any predetermined state or perform any unexpected action, unless explicitly triggered by the network manager. We argue that during the long periods in which lightpaths are being established, several big IP flows could already have been transported via lightpaths, but due to the decision delay they remain routed at the IP level. Moreover, in such long periods, many large IP flows may be using resources at the IP level and, therefore, likely congesting it. By the time the lightpath is finally established, those large flows may no longer exist. The human intervention required to select IP flows and manage lightpaths may therefore be considered slow and inefficient. We see our self-management proposal as an alternative to overcome this dependency on human intervention and thereby improve network management.
Selection of Unknown Large Flows
Nowadays, IP traffic from several specialized applications, some of them requiring considerable amounts of bandwidth, already profits from the capabilities of lambda-switched networks. Examples are: Grid applications, High-Definition Television (HDTV) broadcasting, and large-scale scientific experiments. Knowledge of the heavy-hitter behavior of flows originating from these applications allows network managers to establish lambda-connections in advance for such flows. However, there may also be other big IP flows in current networks that
could also benefit from being moved to lambda-connections, but since the network manager is not aware of their existence, they may not be selected. Self-management comes in handy here, since flows are monitored and selected by the self-managing system rather than by the human manager. 5.4
Dynamic Flow Selection
The selection of flows to be moved to the optical level is traditionally made based on pre-defined upper and lower threshold values. These threshold values are defined in advance by a human operator and statically stored in the configuration file of the management system (Section 3). The main shortcoming of using thresholds is that they are statically defined and not adjusted to the current traffic. This can lead to an imbalance between the IP and optical levels. If the upper threshold values are too restrictive, IP flows may not be offloaded over lightpaths, which may result in congestion at the IP level and underutilization of the optical level. Moreover, with a misadjusted lower threshold, a flow can be inadequately removed from the optical level back to the IP level, where it can contribute to a congestion situation. Our alternative, on the other hand, aims at prioritizing flows by merit (behavior) rather than by characteristics (i.e., port numbers, ToS, and so on). Their merit is measured based on the amount of traffic they are expected to generate. Flows that are expected to generate more traffic are chosen over flows that are expected to generate less traffic.
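The contrast with static thresholds can be illustrated with a small sketch: instead of testing each flow against fixed values, flows are ranked by the traffic they are expected to generate and only the top ones are offloaded. The function name and the plain volume-based ranking are illustrative assumptions; the actual merit metric is described in the cited papers.

```python
# Sketch of merit-based selection: rank flows by expected traffic and
# offload the top-k, rather than comparing each flow against a static
# threshold. The simple "expected bytes" predictor is an assumption.

def select_by_merit(flows, k):
    """flows: list of (flow_id, expected_bytes); return the top-k flow ids."""
    ranked = sorted(flows, key=lambda f: f[1], reverse=True)
    return [fid for fid, _ in ranked[:k]]

flows = [("web", 10**6), ("grid", 8 * 10**12), ("hdtv", 3 * 10**11)]
print(select_by_merit(flows, 2))  # ['grid', 'hdtv']
```

Because the selection is relative rather than absolute, it adapts to the traffic mix: whatever the overall load, the heaviest flows are always the ones moved to the optical level.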
6
Disadvantages of Self-management
Self-management of hybrid networks also introduces a number of problems, as will be explained in this section. 6.1
Complexity in the Network Management System
As shown in Figure 2, management approaches follow an evolution that is proportional to their degree of autonomy. In the simplest management approach, network managers are responsible for all management tasks. However, as more experience is obtained with these management tasks, some tasks can gradually be automated, which means that the need for human intervention can be reduced. In order to avoid problems related to centralization (single point of failure, possible performance bottleneck), subsequent cycles in the design of the management system may focus on distributing such management tasks. Therefore there is a shift from centralized and explicit management towards distributed and implicit management approaches, such as the aforementioned automatic and autonomic approaches [14]. However, the price to be paid for such evolution is the increased complexity of the management system. The chance that errors get introduced in the implementation of the management system therefore increases, and debugging possible failures becomes harder.
6.2
Network Security
Network security consists of providing means to protect network resources from unauthorized access or malicious activities. Consistent and continuous monitoring of network activity is important to prevent or detect any misuse that may befall a managed network. Within the context of our self-management approach, network security is expected to prevent any inappropriate use of lightpaths. For instance, Denial of Service (DoS) attacks that may be transiting at the IP level should not be moved to lightpaths. Such a move could make the attack more severe, since the increased bandwidth at the optical level allows the transfer of more packets to the attacked system. Other security concepts, such as authorization and intrusion detection, should therefore be considered as well, in order to tighten security in hybrid networks. Flows could be verified prior to the offload to the optical level to detect malicious behavior. If flows behave suspiciously, their offload over lightpaths could be blocked and their behavior could be logged for audit purposes and later analysis.
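A pre-offload verification step of the kind suggested above could be sketched as follows; the suspicion heuristic and all names are hypothetical, chosen only to illustrate where such a check would sit in the decision process:

```python
# Sketch of a pre-offload security gate (hypothetical policy): suspicious
# flows stay at the IP level and are logged for audit; clean flows are
# handed to the lightpath establishment procedure.

import logging

def offload_if_clean(flow, looks_suspicious, establish_lightpath):
    if looks_suspicious(flow):
        logging.warning("offload blocked for audit: %s", flow["id"])
        return False
    establish_lightpath(flow)
    return True

# Toy heuristic: very many, very small packets toward one target can
# indicate a DoS flood; a real detector would be far more elaborate.
suspicious = lambda f: f["packets"] > 10**6 and f["bytes"] / f["packets"] < 100

ok = offload_if_clean({"id": "f1", "packets": 2 * 10**6, "bytes": 10**8},
                      suspicious, establish_lightpath=lambda f: None)
print(ok)  # False: blocked as suspicious
```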
Temporary Reduction in Throughput Performance
Another interesting question is whether there is any performance degradation when an active flow is moved from the IP level towards a lightpath. We may expect that, at the moment flows are moved, massive re-ordering takes place, since the first packets transferred over the lightpath can arrive earlier than the last packets over the IP path. To analyze this effect, we used ns-2 to simulate the behavior of TCP flows during such a movement, and identified which factors limited the throughput of such flows [15]. For this analysis we used the network topology shown in Figure 4.
Fig. 4. Topology used in the simulations and limiting factors (Greek letters).
We observed different kinds of impact on the throughput of TCP flows. In all scenarios some throughput oscillation occurred during the transient phase, but TCP throughput recovered relatively fast after the transient phase was over. However, when the network link at the receiver side is the factor that limits the TCP throughput (thus the bandwidth of ξ is smaller than that of any other link), we found a huge impact on the performance of TCP. In this case, during the transient phase, router r3 tries to send the last data received over the IP path, together with the first data received over the optical path, over the outgoing
link ξ. The outgoing router’s queue will be filled rapidly and packets must be dropped due to lack of queue space. It is interesting to note that the decrease in throughput was not caused by packet reordering, but by packet loss. This problem indicates that the transmission capacity of the link at the receiver side and the router’s buffer size should be considered before moving flows on the fly.
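The limiting condition can be made concrete with a back-of-the-envelope calculation: during the transient phase r3 receives data faster than link ξ can drain it, so its queue fills at the rate difference. The buffer size and link rates below are illustrative, not taken from the simulations:

```python
# Illustrative estimate of how quickly r3's queue overflows when the
# receiver-side link xi is the bottleneck during the transient phase.
# All numbers are assumptions, not values from the ns-2 experiments.

def time_to_overflow(buffer_bytes, in_rate_bps, out_rate_bps):
    """Seconds until the queue overflows; None if it never fills."""
    excess = in_rate_bps - out_rate_bps
    if excess <= 0:
        return None
    return buffer_bytes * 8 / excess

# 256 KB buffer, 1 Gbit/s arriving over the lightpath, 100 Mbit/s out on xi:
t = time_to_overflow(256 * 1024, 1e9, 100e6)
print(round(t * 1000, 2), "ms")  # ~2.33 ms until packets are dropped
```

Even a generous buffer thus absorbs only milliseconds of the transient, which is consistent with the observation that the throughput collapse is driven by loss rather than reordering.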
7
Conclusions
Based on the research presented in this article, our main conclusion is that it is technically feasible to deploy self-management within hybrid optical and packet switching networks. From an implementation point of view, the decision process in a self-management approach can be built upon existing technologies, such as NetFlow/IPFIX to collect traffic information, and CLI, GMPLS, SNMP or NetConf to configure optical switches and routers. Compared to traditional management approaches, self-management of hybrid networks provides several advantages. These advantages include better network performance, faster lightpath establishment and release, the ability to move large flows to the optical level, even in cases where such flows have not been made known to the human manager in advance, and finally the possibility to avoid congestion or underutilization by dynamically changing the decision thresholds. Self-management of hybrid networks also has a number of disadvantages, however. An obvious disadvantage is that self-management increases the complexity of the management system, and therefore makes it harder to debug possible failures. Another problem is that large flows generated by a DoS attack can be moved to the optical level, and in this manner strengthen the attack. Also, the move of an IP flow to the optical level may result in the temporary re-ordering and loss of packets, especially in cases where the bandwidth of the network link at the receiver side is the limiting factor that determines the TCP throughput. Acknowledgments: This work has been supported by the EC IST-Emanics NoE (26854), as well as the EC IST-UniverSelf (FP7-257513) and the GigaPort Next Generation projects.
References 1. Asmare, E., Gopalan, A., Sloman, M., Dulay, N., Lupu, E.: A Mission Management Framework for Unmanned Autonomous Vehicles. In: Mobile Wireless Middleware, Operating Systems, and Applications, Second International Conference (Mobilware). ICST Lecture Notes, vol. 7, pp. 222–235 (April 2009) 2. Feldmann, A.: The Internet Architecture - Is a Redesign Needed? pp. 147–164. Springer, Heidelberg (2009) 3. Fioreze, T.: Self-Management of Hybrid Optical and Packet Switching Networks. Ph.D. thesis. Universiteit Twente, Enschede (February 2010) 4. Fioreze, T., Granville, L., Pras, A., Sperotto, A., Sadre, R.: Self-management of hybrid networks: Can we trust netflow data? In: 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), Long Island, New York, USA, pp. 577–584 (June 2009)
5. Fioreze, T., Granville, L., Sadre, R., Pras, A.: A statistical analysis of network parameters for the self-management of lambda-connections. In: Sadre, R., Pras, A. (eds.) AIMS 2009 Enschede. LNCS, vol. 5637, pp. 15–27. Springer, Heidelberg (2009) 6. Fioreze, T., van de Meent, R., Pras, A.: An Architecture for the Self-management of Lambda-Connections in Hybrid Networks. In: Pras, A., van Sinderen, M. (eds.) EUNICE 2007. LNCS, vol. 4606, pp. 141–148. Springer, Heidelberg (2007) 7. Fioreze, T., Pras, A.: Using Self-management for Establishing Light Paths in Optical Networks: an Overview. In: 12th EUNICE Open European Summer School, Stuttgart, Germany, pp. 17–20. Institut für Kommunikationsnetze und Rechnersysteme, Universität Stuttgart, Stuttgart, Germany (2006) 8. Fioreze, T., Pras, A.: Self-management of Lambda-connections in Optical Networks. In: Bandara, A.K., Burgess, M. (eds.) AIMS 2007. LNCS, vol. 4543, pp. 212–215. Springer, Heidelberg (2007) 9. Horn, P.: Autonomic computing: IBM's Perspective on the State of Information Technology (2001), http://www.research.ibm.com/autonomic/manifesto/autonomic_computing.pdf 10. Jennings, B., van der Meer, S., Balasubramaniam, S., Botvich, D., Foghlu, M., Donnelly, W., Strassner, J.: Towards autonomic management of communications networks. IEEE Communications Magazine 45(10), 112–121 (2007) 11. Leon-Garcia, A., Widjaja, I.: Communication Networks: Fundamental Concepts and Key Architectures, 2nd edn. McGraw-Hill Companies, New York (2003) 12. Lupu, E., Dulay, N., Sloman, M., Sventek, J., Heeps, S., Strowes, S., Twidle, K., Keoh, S.L., Schaeffer-Filho, A.: AMUSE: autonomic management of ubiquitous e-Health systems. Concurrency and Computation: Practice and Experience 20(3), 277–295 (2008) 13. Miyazawa, M., Ogaki, K., Otani, T.: Multi-layer network management system with dynamic control of MPLS/GMPLS LSPs based on IP flows.
In: The 11th IEEE/IFIP Network Operations and Management Symposium (NOMS 2008), Salvador, Brazil, pp. 263–270 (April 2008) 14. Pras, A.: Network Management Architectures. Ph.D. thesis, Universiteit Twente, Enschede (February 1995) 15. Timmer, M., de Boer, P.T., Pras, A.: How to Identify the Speed Limiting Factor of a TCP Flow. In: 4th IEEE/IFIP Workshop on End-to-End Monitoring Techniques and Services (E2EMON 2006), Vancouver, Canada, pp. 17–24 (April 2006)
Evaluation of Different Decrease Schemes for LEDBAT Congestion Control Mirja Kühlewind1 and Stefan Fisches2 1
Institute of Communication Networks and Computer Engineering (IKR) University of Stuttgart, Germany
[email protected] 2
[email protected]
Abstract. Low Extra Delay Background Transport (LEDBAT) is a new, delay-based congestion control algorithm that is currently under development in the IETF. LEDBAT has been proposed by BitTorrent for time-insensitive background traffic that would otherwise disturb foreground traffic like VoIP or video streaming. During previous evaluations the so-called late-comer advantage has been discovered, which makes a newly starting LEDBAT flow predominant over already running LEDBAT flows. In this paper we evaluate different decrease schemes which have been proposed to solve this problem. We found that the proposed solutions come with a lower utilization, sometimes increased completion times, and are much more sensitive to noise, which is counter-productive for the considered traffic class. Furthermore, we propose extensions to both evaluated schemes. We show that our approach can help to yield more quickly to higher priority traffic. We argue that a fair and equal share is not required for the specific traffic class LEDBAT is designed for, but that it is important to address different application requirements in congestion control schemes such as LEDBAT, as an approach for less-than-best-effort background traffic.
1 Introduction
A substantial portion of bandwidth in today's Internet is used for background and time-insensitive traffic (e.g. P2P traffic [1]). This traffic should not impede foreground and time-sensitive traffic. A novel congestion control algorithm designed for less-than-best-effort traffic is Low Extra Delay Background Transport (LEDBAT) [2]. It was proposed by BitTorrent in December 2008 and is now under development within an Internet Engineering Task Force (IETF) working group. It is a delay-based approach that can react earlier to congestion than the loss-based schemes mostly used for TCP traffic in today's Internet. In this paper, we regard two sorts of traffic: 1. Less-than-best-effort, low priority traffic using LEDBAT congestion control, e.g. automatic software updates running in the background or peer-to-peer file sharing
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 112–123, 2011. © Springer-Verlag Berlin Heidelberg 2011
Evaluation of Different Decrease Schemes for LEDBAT Congestion Control
113
2. Higher priority best-effort traffic using loss-based congestion control (or even sending with a constant bit rate), e.g. web browsing or Voice over IP using UDP. According to [2], the LEDBAT congestion control seeks to: 1. utilize end-to-end available bandwidth, and maintain low queueing delay when no other traffic is present, 2. add little to the queuing delay induced by concurrent TCP flows, 3. quickly yield to flows using standard TCP congestion control that share the same bottleneck link. With the current specification of LEDBAT there is a so-called "late-comer's advantage", where a second, newly starting flow can starve the first, already running one. This only happens when two or more LEDBAT flows compete on the same link and no other higher priority traffic is present; thus it is an issue of intra-protocol fairness. Several mechanisms have been proposed to prevent this effect, e.g. a mandatory slow-start or a multiplicative decrease. We argue, however, that a high link utilization is actually more desirable for a lower priority traffic class than fairness within that class. As LEDBAT is designed for background traffic that yields to higher priority traffic, LEDBAT should be able to utilize the link as much as possible when no other traffic is present. Completing one flow after another instead of transmitting all flows in parallel will, whilst being unfair, minimize the mean completion time. When sending as much data as possible en bloc, computational power and hence energy consumption will be minimized as well. To support our hypothesis we evaluate different decrease schemes with regard to completion time and utilization. We evaluated the proposed linear decrease scheme, which is discussed in the IETF, and a counter-proposal by Carofiglio et al. [3] for using a multiplicative decrease. Moreover, we introduce new extensions to either of the schemes.
We implemented LEDBAT as a TCP congestion control in Linux, using the TCP Timestamp Option for one-way delay measurements, and subsequently used this code within a simulation environment. Moreover, we show that the decrease behavior is not only important to mitigate the effects of the late-comer's advantage but also when LEDBAT needs to yield to higher priority traffic like standard TCP. The remainder of this paper is structured as follows: Section 2 summarizes related work. Section 3 is a general introduction to the LEDBAT algorithm and different decrease schemes. In Section 4 we present our TCP implementation in Linux. Section 5 shows our results regarding fairness and utilization. Section 6 gives concluding remarks.
2 Related Work
A comparison of LEDBAT with standard TCP as well as with other less-than-best-effort congestion control mechanisms such as LP-TCP and TCP-NICE has already been performed by Rossi et al. [4] [5]. They also detected the late-comer's
advantage [6]. Initially, the authors proposed TCP Slow-Start as a solution to this problem. Slow-Start will usually overshoot and induce losses that cause all competing flows to basically restart their transmission. When two LEDBAT flows start at the same time, they will equally share the available capacity. But the Slow-Start overshoot will actually affect all competing flows on the link, LEDBAT flows as well as higher priority standard TCP flows. This breaks one of the design goals of LEDBAT, as listed in Section 1. In fact, the current LEDBAT specification leaves it to the implementor to use a specific start-up scheme, if necessary. [3] proposed a multiplicative decrease scheme to achieve fairness. We argue that equal sharing is a non-requirement for background traffic within its traffic class. [7] argues as well that sharing the available capacity equally between competing flows with different requirements is not the right fairness metric. Another study, about parametrization, is provided by [8]. In this paper we did not look at any parametrization issues; these are widely discussed on the IETF LEDBAT mailing list and mostly addressed in the working group document.
3 LEDBAT
LEDBAT is a novel delay-based congestion control approach for low priority transmissions. It is under development within an IETF working group. It is based on one-way delay measurements: when the measured one-way delay increases, LEDBAT can react earlier to congestion than loss-based approaches, which predominate in today's Internet. By slowing down its transmission rate earlier it gives the available bandwidth to presumed higher priority transmissions. Thus LEDBAT is friendly to most of today's higher priority TCP traffic. Using timestamps, LEDBAT measures the queuing delay on a link. Assuming all queues on the path are empty at some point during the transmission, the sender takes the smallest delay measurement as the base delay:

base delay = min(base delay, current delay)

The base delay is the constant fraction of the time a packet needs from sender to receiver and thus the minimum transmission time that can be observed. Any additional, variable delays are presumed to be waiting times in network queues. Therefore the actual queuing delay is calculated from the current delay measurement given by the receiver as follows:

queuing delay = current delay − base delay

When the base delay changes during a transmission, e.g. because of re-routing, LEDBAT will automatically adopt the new base delay if it gets smaller. To recognize a higher base delay, a base delay history is kept which holds the measured base delay of the last n minutes and discards old values after n + 1 minutes, thus adapting to the new base delay. LEDBAT aims to keep the queuing delay low but not zero, since optimal resource utilization requires that there is always data to send. Therefore LEDBAT
tries to achieve an extra delay of TARGET milliseconds. LEDBAT can determine how far off target it is and then increases or decreases its sending rate accordingly. LEDBAT is designed not to ramp up faster than TCP; thus it will at maximum increase its sending rate by one packet per round-trip time (RTT). LEDBAT can use a filter to smooth out or discard wrong delay measurements. But depending on the filter scheme and length used, e.g. the minimum or average of the last CURRENT FILTER samples, the reaction to congestion might get delayed.
3.1 Late-Comer's Advantage
Whenever a LEDBAT flow uses a previously unused path, it will immediately measure the real base delay. Thus it will saturate the bottleneck link after a short time, maintaining an extra delay of TARGET milliseconds. If a second flow now arrives, it can only measure the actual base delay plus the extra delay of the first flow. Wrongly, the second flow will take this value as its base delay. While the late-comer adds its own target delay on top, the first flow measures an increased delay and begins to lower its sending rate. In the worst case, the second flow will add an additional TARGET milliseconds of extra delay. As long as the second flow is not able to measure the actual base delay, it will push away the first flow completely. This effect is called the "late-comer's advantage". A proposed way to mitigate this effect is to change the decrease behavior to a multiplicative decrease that empties the queue completely, such that the second flow can measure the real base delay [3].
3.2 Managing the Congestion Window
Linear Controller. The current draft version of LEDBAT uses the same linear controller for the additive increase and the additive decrease:

off target = (TARGET − queuing delay) / TARGET

cwnd += GAIN × off target / cwnd

The congestion window (CWND), which gives the number of packets that can be transmitted in one RTT, is altered by the normalized off target parameter, which can be positive or negative and thus determines whether the CWND grows or shrinks. When GAIN is one, LEDBAT will at maximum speed up as quickly as standard TCP, because off target is always smaller than 1 and reaches its maximum value when the queuing delay is zero. Unfortunately, this approach lets LEDBAT decrease very slowly if, e.g., just one millisecond of extra delay above the TARGET is measured. To be friendly to standard TCP traffic, LEDBAT should decrease at least as quickly as standard TCP increases. The latest version of the LEDBAT draft in the IETF allows a different GAIN value for the decrease than for the increase. If off target is negative (decrease), we propose to use

GAIN = TARGET × N

where N would need to be the number of standard TCP flows starting in parallel. As this number is usually unknown, we assume in most cases only one flow starting at the same time and set N to 1. Thus LEDBAT would decrease the congestion window by at least one packet per RTT.

Fig. 1. Two competing LEDBAT flows with linear decrease

Multiplicative Decrease. [3] proposes to decrease the congestion window for negative off target by multiplying it with a factor β such that 0 < β < 1:

CWND = β × CWND

The idea is that the multiplicative decrease allows the queues to drain and thus enables a correct measurement of the base delay. If β is chosen too low, the link is underutilized; if β is too large, the queue does not drain. While [3] searches for a fixed value, β actually depends on the chosen value of TARGET in relation to the current RTT. This is because the CWND depends on the RTT, as it gives the number of packets that can be sent during one RTT, whereas the TARGET gives the extra delay and thus determines the number of packets stored in the network queue that need to be emptied at once (in one RTT). Thus β must be TARGET / current RTT. For a negative off target value the congestion window is then calculated as

CWND = (TARGET / current RTT) × CWND

current RTT ≈ 2 × base delay + TARGET

In our implementation we subtracted an additional 3 packets after this calculation to counter measurement and computation inaccuracies. This value of 3 packets was selected through simulative studies. With every multiplicative decrease scheme we decrease only once per RTT, as during the first RTT after the decrease all delay measurements still reflect the situation before the decrease.
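To make the two window-update rules concrete, here is a per-RTT sketch in Python (our own simplification: the draft applies the linear update per acknowledgment, we apply it once per RTT; delays are in milliseconds, windows in packets, and all function names are ours):

```python
TARGET = 25.0  # target extra queuing delay [ms]

def linear_update(cwnd, queuing_delay, n_tcp_flows=1):
    """Linear controller (per-RTT form) with the proposed decrease
    GAIN = TARGET * N, shrinking by at least ~N packets per RTT."""
    off_target = (TARGET - queuing_delay) / TARGET
    gain = 1.0 if off_target >= 0 else TARGET * n_tcp_flows
    return max(1.0, cwnd + gain * off_target)

def mdvar_update(cwnd, queuing_delay, base_delay):
    """MDvar: multiplicative decrease with beta = TARGET / current_RTT
    minus a 3-packet safety margin, applied only above TARGET."""
    if queuing_delay <= TARGET:
        return cwnd  # increases are handled by the linear controller
    current_rtt = 2 * base_delay + TARGET
    return max(1.0, (TARGET / current_rtt) * cwnd - 3)
```

For example, at 1 ms above TARGET (queuing delay 26 ms) the standard linear rule with GAIN = 1 would shrink a 40-packet window by only 0.04 packets per RTT, while the proposed GAIN = TARGET × 1 shrinks it by a full packet; MDvar with a 20 ms base delay would cut it to about 12 packets at once.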
4 Linux Implementation
To provide wide access to a less-than-best-effort congestion control scheme, we decided to implement LEDBAT as a TCP congestion control module. The Linux
kernel design provides an interface to extend the kernel functionality through additional modules. There is a specific module interface for congestion control, and our LEDBAT implementation follows this interface of the Linux kernel. The resulting C file can be included in any current Linux kernel version. No further modifications were needed, as TCP already provides an option to transfer and echo timestamps. Based on this given functionality, we do not calculate the delay at the receiver side, as specified in the LEDBAT draft, but at the sender. With the TCP Timestamp Option the receiver reflects the timestamp TSsnd sent by the sender and adds an additional timestamp TSrcv when sending out the ACK. By subtracting the echoed timestamp (TSsnd) from the new timestamp (TSrcv) we determine the one-way delay:

OWD = TSrcv − TSsnd

This delay includes the processing delay at the receiver, but as we assume this processing delay to be constant, it does not disturb our queuing delay estimation. The clocks of sender and receiver do not usually operate on the same time base. As we only use the queuing delay, i.e. the variable part of the one-way delay relative to a certain base delay, the absolute value of the base delay is not important for us. Even if both timestamps have different resolutions, this can be estimated by monitoring the first samples in relation to the local clock. As we know the timestamp resolution in our evaluation scenarios, we did not implement a specific logic. In order to achieve delay measurements precise enough to monitor the changes in one-way delay, we had to adjust the kernel's timer frequency from its default 250 Hz to 1000 Hz, giving us a 1 ms resolution in the timestamps. Furthermore, TCP will acknowledge at least every second data packet and, in the default Linux configuration, waits at most 100 ms for an additional packet. This is the delayed ACK mechanism of TCP.
But whenever two packets are acknowledged at once, only the timestamp of the first packet will be echoed. Thus we have just half the number of measurement samples, and some of them will have artificial delays. If packets arrive continuously, there is a constant waiting time until the second packet arrives; this offset is not a problem. But if the timer expires, there are high variations. With the linear decrease scheme the impact of one wrong sample is quite low, but with multiplicative decrease we necessarily need noise filtering to cope with this effect. For our simulations the open source character of Linux also allowed us to patch the kernel and disable delayed acknowledgments at the receiver side.
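The sender-side delay bookkeeping described above can be sketched as follows (illustrative only: timestamps are in milliseconds, the class and method names are ours, and the n-minute aging of the base delay history from Section 3 is omitted):

```python
from collections import deque

def one_way_delay(ts_rcv, ts_snd):
    # OWD = TSrcv - TSsnd; the unknown but constant clock offset between
    # sender and receiver cancels out once the base delay is subtracted.
    return ts_rcv - ts_snd

class DelayEstimator:
    """Minimum filter over the last k samples (k = CURRENT FILTER;
    k = 1 means no filtering) plus base-delay tracking."""

    def __init__(self, k=2):
        self.samples = deque(maxlen=k)
        self.base_delay = float("inf")

    def queuing_delay(self, ts_rcv, ts_snd):
        self.samples.append(one_way_delay(ts_rcv, ts_snd))
        current_delay = min(self.samples)  # filtered one-way delay
        self.base_delay = min(self.base_delay, current_delay)
        return current_delay - self.base_delay
```

With k = 2 a single delayed-ACK outlier is suppressed: after samples of 40 ms and 140 ms the filtered delay is still 40 ms, so the estimated queuing delay stays at zero.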
5 Evaluation Results
5.1 Scenario
To evaluate the presented decrease schemes in a real-world scenario, we first set up a small testbed with the LEDBAT kernel module just described. All presented results are extracted from simulations with the IKR SimLib [9], as
Fig. 2. Two competing LEDBAT flows with different multiplicative decrease, base RTT 40 ms
in the simulation delayed ACKs could easily be deactivated. We used the IKR SimLib together with the TCP implementation from real Linux kernel code provided by the Network Simulation Cradle [10], a framework that makes kernel code usable within a simulation environment. In our simulation scenarios we used 2-5 parallel flows with a bottleneck link capacity of 10 Mbit/s and one-way delays of 10, 19, 20 or 30 ms. In every scenario each flow starts with an offset of 15 s to the previous one. All flows in a scenario transmit the same data size, which is either 30, 50, 100, 300, 500 or 1000 MBytes. The bottleneck node maintains a queue that can hold 60 packets. For all simulations we used a TARGET value of 25 ms and a CURRENT FILTER length of 1 or 2, where 1 basically means no filtering at all.
5.2 Two Competing Flows
To illustrate the behavior of the different decrease schemes we present in detail the scenarios where two LEDBAT flows compete with the same decrease mechanism. Figure 1 shows the linear decrease behavior. The first flow starts and increases its rate slowly until the bottleneck link is filled and 25 ms of extra delay is introduced. The increase gets slower as it gets closer to the 25 ms target value. After 15 s a second flow starts. This flow assumes the current delay as base
Fig. 3. Two competing LEDBAT flows with different multiplicative decrease, base RTT 20 ms
delay and starts increasing its rate as well. The first flow starts decreasing as it senses more extra delay than the already introduced 25 ms. It decreases only slowly, as only little extra delay is introduced by the second flow so far. Thus the second flow, which never measures the correct base delay, adds an additional delay on top of the extra delay of the first flow. However, when two flows start at the same time, or restart after the currently dominating flow has finished, they will share the capacity equally as long as they have the same TARGET value. The upper and middle diagrams in Figure 2 show the multiplicative decrease behavior as proposed by [3] and explained in Section 3.2, with a fixed value for β of 0.6. We investigated two different variants. The first variant uses a simple noise filter of length 2: it always takes the minimum of the last two measurement samples of the current delay. In the second variant the length is one, so there is no noise filtering, but we deactivated delayed ACKs in our simulation. We labeled these variants MD-NF2 and MD-NF1. All multiplicative decrease schemes aim to empty the queue when another LEDBAT flow is starting, so that both flows can measure the right base delay. Whenever some extra delay is introduced on top of TARGET, these schemes decrease the window and the link becomes underutilized. This happens periodically even when no other flow is starting, as LEDBAT itself will exceed the TARGET for probing. When the second flow starts, it will quickly measure the correct base delay as the first one decreases.
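For contrast with these multiplicative variants, the starvation caused by the plain linear scheme (Figure 1) can be reproduced qualitatively in a toy fluid model (our own construction, not the IKR SimLib setup; all parameters are illustrative, delays in ms, windows in packets):

```python
TARGET = 25.0     # target extra delay [ms]
BDP = 40.0        # packets the path holds without queuing
BASE = 40.0       # real base one-way delay [ms]
MS_PER_PKT = 1.0  # queuing delay added per queued packet [ms]

def simulate(ticks=2000, start2=500):
    """Two LEDBAT flows, linear controller with GAIN = 1, one update
    per 'RTT tick'; the second flow starts at tick start2."""
    cwnd = [1.0, 0.0]
    base = [float("inf"), float("inf")]
    for t in range(ticks):
        if t == start2:
            cwnd[1] = 1.0
        queued = max(0.0, cwnd[0] + cwnd[1] - BDP)
        delay = BASE + queued * MS_PER_PKT
        for i in (0, 1):
            if cwnd[i] <= 0.0:
                continue
            base[i] = min(base[i], delay)  # possibly wrong base delay
            off = (TARGET - (delay - base[i])) / TARGET
            cwnd[i] = max(0.0, cwnd[i] + off)
    return cwnd
```

The late-comer wrongly measures a base delay that already contains the first flow's TARGET worth of queuing, so it keeps growing while the first flow is pushed to zero; running simulate() ends with the first window starved and the second at roughly the BDP plus its (inflated) target backlog.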
Fig. 4. Five competing LEDBAT flows with different multiplicative decrease, base RTT 40 ms
Then both will increase in parallel and the decrease times get synchronized, so that both get the same share of the bandwidth. MD-NF2 does not synchronize due to the noise filtering and thus does not divide the bandwidth perfectly, but approximately gives each flow an equal share. The lower diagram in Figure 2 shows our proposal with a dynamically adapted decrease factor (MDvar), deactivated delayed ACKs and no noise filter. As both flows can have slightly different β values, they do not synchronize perfectly, but the scheme works in all kinds of scenarios independent of RTT and number of competing flows, as shown in Figures 2 (40 ms RTT) and 3 (20 ms RTT). Figure 4 shows the MD-NF1 scheme with 5 successively starting flows with an offset of 15 seconds. The first two synchronize perfectly, but all subsequent flows disturb the others before an equilibrium can be reached. With MD-NF2 the flows are again not synchronized. In contrast, the MDvar scheme shows that the flows always share the link equally after a short time. Looking at transmission times, we note that in Figures 1 and 2, no matter which decrease scheme, the last transmission to finish did so after 171 s. With the linear decrease version shown in Figure 1 the second flow finishes its transmission ahead of the first flow after 106 s. In all scenarios with multiplicative decrease in Figure 2 the first flow finishes after 134–158 s, which is nearly the same completion time as for the second flow (171 s − 15 s = 156 s).
Fig. 5. Mean rate over flow size with 40 ms base RTT at 10 Mbit/s bottleneck link
Regarding bulk background traffic, only the completion time of the whole transfer is relevant, not the instantaneous rate. With linear decrease at least one user gets a much better completion time, whereas with the multiplicative decrease schemes all users have to wait quite long.
5.3 Utilization vs. Fairness
We argue that in the cases where LEDBAT should be used, like software updates in the background, completion time and thus utilization are more important than an equal share of the capacity. Figure 5 shows the mean sum rate over different transfer sizes for each of the 2 to 5 parallel LEDBAT flows with a 15 s offset in the start time. In each scenario all competing flows have a minimum RTT of 40 ms on a common 10 Mbit/s bottleneck link. We only compared the standard LEDBAT with linear decrease and the MDvar scheme, as the other schemes do not achieve a fair share in all scenarios, as seen in Figure 3. The standard LEDBAT with linear decrease utilizes the link better with large transfers: each time a flow starts or ends, the link is not fully utilized, and with long transmissions these periods become smaller in relation to the whole transmission. We argue that the linear decrease scheme is more appropriate for LEDBAT traffic, as completion time and utilization are most important. In fact, for background traffic like automatic software updates, even the completion time is not important; what is important is not to disturb other traffic. If we utilize the link as much as possible when it is empty, we prevent blocking any capacity for higher priority traffic that might arrive later on. Moreover, the linear decrease scheme is less sensitive to noise and easier to implement.
5.4 Decrease Behavior with Competing Standard TCP Flows
The upper diagram in Figure 6 shows one scenario that we found where LEDBAT with linear decrease is not friendly to standard TCP cross traffic. It is LEDBAT’s most important design goal to yield quickly to higher priority TCP traffic, as
Fig. 6. LEDBAT flow with standard linear decrease (upper) or with minimum linear decrease of 1 packet per RTT (lower) and standard TCP cross traffic at a 20 Mbit/s bottleneck link with a queue size of 60 packets.
described in Section 1. In this scenario there is a bottleneck link of 20 Mbit/s and still the same queue size of 60 packets. To reach the TARGET of 25 ms the queue needs to be filled with 42 packets. 20 s after the LEDBAT flow started, a standard TCP flow starts as well. After some RTTs LEDBAT starts to decrease its sending rate, but it decreases only very slowly, as it can at maximum sense another 11 ms of extra delay before the queue overflows. Unfortunately, in this scenario, whenever the standard TCP flow fills up the queue and loss occurs, this loss only hits the standard TCP flow itself, as it sends its data in bursts at the beginning of each RTT in Slow Start [11]. In this scenario the ratio between the TARGET and the maximum queue is very unfavorable. Moreover, the burst-wise transmission precludes LEDBAT from falling back into standard TCP behavior, as it would do when loss occurs. We evaluated the same scenario with our changed approach, where LEDBAT decreases its rate by at least one packet per RTT. In the lower diagram in Figure 6 we can see that LEDBAT yields again to the TCP flow. In both of these scenarios delayed ACKs are not deactivated. With our proposal the increase is also slower than in the upper diagram, as the flow decreases (more strongly) from time to time because of noise from delayed ACKs. An appropriate noise filter can mitigate this problem. Thus we conclude that a larger GAIN value for the decrease will help LEDBAT to reach its goals. An even larger decrease might be needed if multiple TCP flows are sending simultaneously, as their sending rates will sum up to a larger increase than one packet per RTT.
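The queue numbers quoted for this scenario can be checked with a quick back-of-the-envelope computation (assuming 1500-byte packets, a value the text does not state explicitly):

```python
LINK_RATE = 20e6     # bottleneck capacity [bit/s]
PKT_SIZE = 1500 * 8  # assumed packet size [bit]
TARGET = 0.025       # LEDBAT target queuing delay [s]
QUEUE_PKTS = 60      # bottleneck queue size [packets]

# Packets that must sit in the queue to create TARGET queuing delay.
pkts_for_target = TARGET * LINK_RATE / PKT_SIZE  # ~41.7, i.e. 42 packets

# Extra delay LEDBAT can still sense before the queue overflows.
max_delay = QUEUE_PKTS * PKT_SIZE / LINK_RATE    # 36 ms for a full queue
headroom_ms = (max_delay - TARGET) * 1000        # ~11 ms above TARGET
```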
6 Conclusion and Outlook
From our experiments we can see that while the standard linear decrease mechanism of LEDBAT always privileges the flow which started last, it is able to fully
utilize the link in stable state. Moreover, we recommend a larger GAIN for the decrease case to yield more quickly to competing higher priority standard TCP traffic. The latest version of the LEDBAT draft allows a higher GAIN for the decrease but does not specify a certain value. We made a proposal to achieve a minimum decrease of one packet per RTT. Multiplicative decrease schemes, in contrast, even our optimized proposal, underutilize the link due to the periodic events where the queue is emptied. Given that LEDBAT is designed for lower-than-best-effort traffic, there is no demand for fairness but for high link utilization, and block-wise transfers will help to minimize the mean completion time. LEDBAT will reset the base delay periodically when one or more LEDBAT flows maintain a constant extra delay on the link. Depending on the length of the base delay history filter, the capacity share may change. If only little data is left to finish the transmission, a flow could actively reset its base delay. Of course, this will only have an effect if only LEDBAT flows are on the link and thus will not disturb higher priority traffic.
References
[1] Schulze, H., Mochalski, K.: Internet study 2008/2009. IPOQUE Report (2009)
[2] Shalunov, S., Hazel, G., Iyengar, J., Kuehlewind, M.: Low extra delay background transport (LEDBAT). draft-ietf-ledbat-congestion-06 (2011)
[3] Carofiglio, G., Muscariello, L., Rossi, D., Valenti, S.: The quest for LEDBAT fairness. In: IEEE Globecom (December 2010)
[4] Rossi, D., Testa, C., Valenti, S.: Yes, we LEDBAT: Playing with the new BitTorrent congestion control algorithm. In: Krishnamurthy, A., Plattner, B. (eds.) PAM 2010. LNCS, vol. 6032, pp. 31–40. Springer, Heidelberg (2010)
[5] Carofiglio, G., Muscariello, L., Rossi, D., Testa, C.: A hands-on assessment of transport protocols with lower than best effort priority. In: 35th IEEE Conference on Local Computer Networks, LCN 2010 (October 2010)
[6] Rossi, D., Testa, C., Valenti, S., Muscariello, L.: LEDBAT: the new BitTorrent congestion control protocol. In: International Conference on Computer Communication Networks, ICCCN 2010 (August 2010)
[7] Briscoe, B.: A fairer, faster internet protocol. IEEE Spectrum, 38–43 (2008)
[8] Schneider, J., Wagner, J., Winter, R., Kolbe, H.: Out of my Way - Evaluating Low Extra Delay Background Transport in an ADSL Access Network. In: Proceedings of the 22nd International Teletraffic Congress (ITC 22), pp. 7–9 (2010)
[9] IKR Simulation and Emulation Library, http://www.ikr.uni-stuttgart.de/content/ikrsimlib/
[10] Jansen, S., McGregor, A.: Simulation with Real World Network Stacks. In: Proc. Winter Simulation Conference, pp. 2454–2463 (September 2005)
[11] Allman, M., Paxson, V., Blanton, E.: TCP Congestion Control. RFC 5681 (2009)
Comparative Traffic Analysis Study of Popular Applications
Zoltán Móczár and Sándor Molnár
High Speed Networks Laboratory
Dept. of Telecommunications and Media Informatics
Budapest Univ. of Technology and Economics
H-1117, Magyar tudósok krt. 2., Budapest, Hungary
[email protected],
[email protected]
Abstract. The popularity of applications changes fast in the current Internet, and as a result the characteristics of Internet traffic also undergo rapid change. In this paper we present the main characteristics of three popular applications based on actual measurements taken from a commercial network. BitTorrent as one of the leading P2P file sharing applications, YouTube as the head of video sharing applications and Facebook as the prominent online social networking application are investigated. Comparative results at both application- and flow-levels are presented and discussed. Keywords: traffic measurements, traffic analysis, BitTorrent, YouTube, Facebook.
1 Introduction
The fact that the Internet is a fast evolving world is manifested in the incredibly quick change of applications. Over the years we can observe a dramatic change in the popularity of applications used in the Internet. After the period when the Web was the leading application, we could identify the dominance of peer-to-peer file sharing applications for a while. A number of P2P applications have been developed, and a few of them, e.g. BitTorrent, reached extremely high popularity. However, it seems that after the P2P file sharing era, in recent years other applications are becoming very popular, such as online social networking. In this so-called Web 2.0 world the user-generated content sites provide platforms for information, video and photo sharing as well as blogging. As an example, YouTube is the leading video sharing application with continuously increasing popularity. It was launched in December 2005, and since July 2006 the site has served up to 100 million videos per day with a daily upload of more than 65,000 videos and nearly 20 million unique visitors per month [1]. The success of YouTube is interesting to investigate. The site exerts no control over its users' freedom for publishing, so users not only share their videos, but
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 124–133, 2011. © Springer-Verlag Berlin Heidelberg 2011
Comparative Traffic Analysis Study of Popular Applications
125
also participate in a huge decentralized community by creating and consuming terabytes of video content. Nowadays, online social networking sites are also popular. The primary purpose of these sites is to provide the means for users to maintain contacts, communicate and exchange information with each other. The most prominent example is Facebook, which is a web-based online social networking application. Facebook is quickly emerging and can be seen as a new Internet killer-application. In this paper we investigate the main traffic characteristics of three popular applications of different types: BitTorrent as one of the leading P2P file sharing applications, YouTube as the head of video sharing applications and Facebook as the prominent online social networking application. We carried out a comprehensive analysis study based on measurements taken from a commercial network. We analyzed and compared the traffic characteristics of these successful applications at both application- and flow-levels. The paper is organized as follows. Section 2 overviews the related work, and in Section 3 we discuss the details of measurements including the network architecture and analysis tools. Section 4 and Section 5 present our analysis results at application- and flow-levels, respectively. Finally, Section 6 concludes the paper with our main results.
2 Related Work
Popular Internet applications are the focus of several recent studies from different aspects. Since in our paper we address the BitTorrent, YouTube and Facebook applications, we overview some related work in this section. Choffnes and Bustamante pointed out that testbed-based views of Internet paths are surprisingly incomplete concerning BitTorrent and many other applications. This message gives us a warning and emphasizes the need for using actual measurements for analysis [2]. In recent years several articles have been published which analyze the behavior of BitTorrent. For example, Erman et al. presented a study on modeling and evaluation of the session characteristics of BitTorrent traffic [3]. They found that session inter-arrivals can be accurately modeled by the hyper-exponential distribution, while session durations and sizes can be reasonably well modeled by the log-normal distribution. Andrade et al. studied three BitTorrent content sharing communities regarding resource demand and supply. The study introduced an accurate model for the rate of peer arrivals over time. They also found that a small set of users contributes most of the resources in the communities, but the set of heavy contributors changes frequently and is typically responsible only for a few of the resources used in the distribution of an individual file [4]. YouTube patterns are intensively investigated in [5], and one of the main findings is that caching could improve the user experience, reduce bandwidth consumption and reduce the load on YouTube servers. Zink et al. analyzed YouTube traffic in a large university campus network [6]. They showed that there is no strong correlation between local and global popularity of YouTube
126 Z. Móczár and S. Molnár
videos. They also observed that many users watched the same video more than once. Cheng et al. presented a measurement study of YouTube videos and revealed that YouTube has significantly different statistics compared to other video streaming applications, especially in length distribution, access pattern and growth trend [7]. The “Facebook phenomenon” has been investigated in many recent papers from different points of view, including both social [8] and technical aspects [9], [10]. Gjoka et al. carried out a measurement-based characterization of the popularity and usage of Facebook applications [9]. They found that the popularity of these applications follows a rather skewed distribution. The authors of [10] also conducted a large-scale measurement study and analyzed the usage characteristics of online social network based applications. They pointed out that only a small fraction of users account for the majority of activity within the context of Facebook applications.
3 Traffic Measurements
Measurements were taken from one of the commercial networks in Stockholm, Sweden. This company maintains the network infrastructure for several service providers, which offer many different services, such as Internet access, IP telephony and IPTV, for both residential and business users. During the measurement period, more than 1800 customers used the network.
Fig. 1. The architecture of the network
Comparative Traffic Analysis Study of Popular Applications
127
The network infrastructure of the Swedish backbone network and the related residential network is shown in Fig. 1. The backbone network consists of three core routers linked to each other with 3 Gbps optical fibres. The subscribers are connected to the area switches, and their traffic is aggregated in a migration switch through 100 Mbps links. The migration switch is linked to one of the core routers with a 1 Gbps capacity link. The workstation responsible for data capturing was connected to one of the core routers with a 1 Gbps capacity fibre. The router mirrored its traffic to the workstation, allowing the capturing device to store the data on its hard drives. Only the packet headers were captured, providing information for the analysis such as protocol, size and direction. Traffic identification was done with a tool developed by Ericsson Hungary. This software uses various techniques to identify the traffic, such as port-based, signature-based and heuristic-based approaches; however, the algorithm is not public.

Table 1. Basic description of measurements

Trace  Measurement period October 2008 [(day) hour:min]  Duration [hour:min]  Flows [million]  Packets [million]
FL-1   (7) 11:18 – (8) 22:16                             34:59                59.1             3892
FL-2   (7) 15:00 – (7) 15:59                             01:00                1.53             69.96
Table 1 describes the main parameters of the investigated traces, where FL denotes the flow-level data sources. After the preprocessing phase, the cleaned data were loaded into database tables using Microsoft SQL Server 2005. Data retrieval was performed with SQL queries, and the results were processed by Matlab routines, which were also used for visualization and chart creation.
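The query step of this pipeline can be illustrated with a small stand-in. The sketch below uses Python's built-in sqlite3 module instead of Microsoft SQL Server 2005, and the table schema and records are invented for illustration:

```python
import sqlite3

# Stand-in for the paper's pipeline: flow records loaded into an SQL
# table, then aggregated with a query (schema and values are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (app TEXT, bytes INTEGER)")
conn.executemany("INSERT INTO flows VALUES (?, ?)",
                 [("BitTorrent", 1500), ("BitTorrent", 700), ("YouTube", 9000)])

# Per-application traffic volume, of the kind needed for Table 2.
rows = conn.execute(
    "SELECT app, SUM(bytes) FROM flows GROUP BY app ORDER BY app").fetchall()
print(rows)  # [('BitTorrent', 2200), ('YouTube', 9000)]
```

The results of such queries would then be handed to plotting routines, taking the role Matlab played in the original setup.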
4 Application-Level Analysis
First, this section presents some important properties of the measured traffic, including the daily profile and a user ranking based on the numbers of incoming and outgoing packets. After that, the main characteristics of the three chosen applications are investigated. The daily profile of the traffic is given in Fig. 2 for the time interval (7) 12:00 – (8) 11:59. As can be seen, the peak hours are between 5 PM and 8 PM; the traffic volume generated in this time frame is approximately six times higher than in the morning hours. Fig. 3 shows the measured incoming (downlink) packets as a function of the measured outgoing (uplink) packets. Every point in the figure corresponds to one user. The linear fit indicates that an average user generates approximately 25% more downlink packets than uplink packets, though the variance is quite large.
[Bar chart: total traffic [GB] per hour of day]
Fig. 2. Daily profile (FL-1)
[Scatter plot: number of incoming vs. outgoing packets per user (both ×10^5), with linear fit y = 1.25x]
Fig. 3. User ranking based on the number of incoming and outgoing packets (FL-1)
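The linear fit of Fig. 3 can be reproduced with a standard least-squares fit through the origin. The sketch below is illustrative only; the per-user packet counts are invented, not the measured data:

```python
# Least-squares fit of y = a*x through the origin:
# a = sum(x_i * y_i) / sum(x_i^2).

def fit_through_origin(uplink, downlink):
    """Return the slope a of the best-fit line downlink = a * uplink."""
    num = sum(x * y for x, y in zip(uplink, downlink))
    den = sum(x * x for x in uplink)
    return num / den

# Illustrative per-user packet counts (not the measured data).
uplink = [100, 200, 400, 800]
downlink = [130, 240, 510, 990]

slope = fit_through_origin(uplink, downlink)
print(round(slope, 2))  # a slope of ~1.25 would mean 25% more downlink packets
```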
Table 2 shows the incoming, outgoing and total traffic volumes generated by the BitTorrent, YouTube and Facebook applications. The main difference among the applications is that while the amount of data uploaded by BitTorrent users exceeds the downlink traffic volume (68% of the total BitTorrent traffic is uplink traffic, twice the amount of the BitTorrent downlink traffic), YouTube and Facebook users primarily generate downlink traffic. This is due to the way BitTorrent is used: finished downloads stay in the queue and are automatically shared with other users. Users can remove these shared files from the queue, but most of them do not, which is crucial for the efficient operation of the BitTorrent network. Consequently, their uplink traffic is increased by the finished and shared downloads. The volume of uplink traffic is not limited, since most of the users have symmetric high-bandwidth optical access. In the case of YouTube, downloading accounts for 98% of the total traffic, which can be explained by the specific characteristics of this application: video streaming produces a huge downlink traffic
while uploading is necessary only for communicating with the servers and exchanging basic information. Although users can upload videos, which increases the uplink traffic, most of them only consume the content uploaded by others. Our analysis of YouTube is based solely on the video content; other user activities such as browsing and searching were excluded. In the whole measurement period (FL-1), users downloaded from 1972 different YouTube servers, while in the selected busy hour (FL-2) this value is 222. In contrast to the previous results, Facebook generates only a small amount of network traffic.

Table 2. The traffic volume generated by the investigated applications (FL-2)

Application  Incoming       Outgoing       Total
BitTorrent   7.5 GB (32%)   15.8 GB (68%)  23.3 GB
YouTube      3.62 GB (98%)  63 MB (2%)     3.68 GB
Facebook     45 MB (79%)    12 MB (21%)    57 MB
Concerning user penetration, 13%, 3% and 6% of the active customers used the BitTorrent, YouTube and Facebook applications, respectively. These values, however, are not easy to compare across applications. For example, while BitTorrent clients often run all day long, an average user visits Facebook once a day.
5 Flow-Level Analysis
In the following analysis both incoming and outgoing traffic have been considered, and the results relate to the total traffic. In this roughly one-and-a-half-day period, 1217 GB was downloaded (incoming traffic) and 1568 GB was uploaded (outgoing traffic). This clearly shows that the dominance of BitTorrent uploading determines the general picture and makes the uplink traffic volume 30% higher than the downlink traffic volume. It also explains why this ratio differs from the ratio of incoming to outgoing packets (see Fig. 3). Fig. 4 illustrates the relationship between flow size and duration for the BitTorrent and YouTube applications in an enlarged view. Every point in Fig. 4 and Fig. 6 represents exactly one flow. There are two different clusters in Fig. 4a: a cluster concentrated around the horizontal and vertical axes, and another one bounded by the vertical lines. Our analysis showed that the number of bytes carried by a BitTorrent flow is almost independent of the flow duration. We found that in the first cluster 98.3% of BitTorrent flows have a duration of less than 200 s and a size smaller than 4 kB. Furthermore, 85.2% of these flows are transferred over UDP and only 14.8% of them are sent over TCP. Concerning duration, the flows of the second cluster fall in the interval between 300 s and 365 s. In this region about 97.2% of flows are related to TCP and only a negligible amount of
data is transmitted over UDP. For the whole measurement period we found that although only 16% of BitTorrent flows are transferred over TCP and 84% of them are sent over UDP, the TCP flows carry almost 99% of the total bytes. The reason is that BitTorrent basically uses TCP for file transfer, but some of the newer clients implement a UDP-based method to communicate with the tracker servers. This communication does not need to transmit large volumes of data, but rather generates numerous flows. Fig. 4b shows that most of the YouTube flows are located along two lines with different gradients. In other words, the flow rate primarily varies around 1.2 Mbps and 0.75 Mbps, respectively.
[Scatter plots of flow size [B] vs. duration [s]: (a) BitTorrent, (b) YouTube; the YouTube flows cluster along the 1.2 Mbps and 0.75 Mbps lines]
Fig. 4. Relationships between flow size and duration (FL-2)
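The lines of constant gradient in Fig. 4b correspond to constant flow rates, since the average rate of a flow is its size divided by its duration. A small sketch of this relation (the sample flows below are invented, not taken from the traces):

```python
# Average rate of a flow from its size and duration, as used to read
# Fig. 4: rate [Mbps] = 8 * size_bytes / duration_s / 1e6.

def flow_rate_mbps(size_bytes, duration_s):
    return 8.0 * size_bytes / duration_s / 1e6

# Illustrative flows lying on the two gradients of Fig. 4b.
flows = [
    {"size": 15_000_000, "duration": 100.0},  # 1.2 Mbps: server rate limit
    {"size": 9_375_000, "duration": 100.0},   # 0.75 Mbps: quality control
]

for f in flows:
    print(round(flow_rate_mbps(f["size"], f["duration"]), 2))
```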
[Histogram: number of occurrences vs. flow rate [kb/s], 0–2000 kb/s]
Fig. 5. Histogram of the flow rate for YouTube (FL-2)
This property can also be observed in the histogram of the flow rate depicted in Fig. 5. The first peak, around 1.2 Mbps, can be explained by the rate limitation of the YouTube servers, and the second peak, around 0.75 Mbps, is due to YouTube control that balances quality and bandwidth. Fig. 6 depicts the relationships between flow size and number of packets for the three popular applications. The size of the largest packets is close to the maximum Ethernet frame size of 1500 bytes. In Fig. 6a and 6b it can be observed that BitTorrent flows are more scattered than YouTube and Facebook flows. This indicates that there is a negligible difference among the packet
[Scatter plots of flow size [B] vs. number of packets (×10^5): (a) BitTorrent, (b) YouTube, (c) Facebook]
Fig. 6. Relationships between flow size and number of packets (FL-2)
sizes in the case of YouTube and Facebook, and that the Ethernet frame size is better utilized compared to BitTorrent. Fig. 7 shows the histograms of the number of packets for the BitTorrent, YouTube and Facebook applications. We can observe that the numbers of BitTorrent and Facebook flows show a heavy-tailed decrease for packet counts above 100, as shown in Fig. 7a and 7c, respectively. In contrast to BitTorrent, the histogram of YouTube flows has unique characteristics (see Fig. 7b): there are only a few occurrences of any particular number of packets. A deeper analysis revealed the interesting property that in the case of BitTorrent only 0.1% of flows contain a unique number of packets. In contrast, most of the YouTube flows are unique, since 57.4% of flows consist of different numbers of packets; for Facebook flows this ratio is about 1.2%. The previous values were calculated for the whole measurement period (FL-1), but for the one hour long trace in the busy period (FL-2) we got the even more surprising values of 0.4%, 93.6% and 8.9%, respectively. In this case the interesting property of YouTube is more pronounced: almost every YouTube flow contains a different number of packets.
[Log-log histograms of the number of flows vs. the number of packets: (a) BitTorrent, (b) YouTube, (c) Facebook]
Fig. 7. Histograms of the number of packets (FL-1)
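The "uniqueness" ratios quoted above can be computed by counting, for each application, the flows whose packet count is shared with no other flow. A minimal sketch with invented packet counts (not the measured traces):

```python
from collections import Counter

def unique_flow_ratio(packet_counts):
    """Fraction of flows whose packet count occurs exactly once."""
    occurrences = Counter(packet_counts)
    unique = sum(1 for n in packet_counts if occurrences[n] == 1)
    return unique / len(packet_counts)

# Illustrative packet counts per flow (not the measured data).
bittorrent = [2, 2, 2, 3, 3, 7]       # mostly repeated counts
youtube = [120, 415, 733, 1042, 120]  # mostly distinct counts

print(unique_flow_ratio(bittorrent))  # 1/6 of flows are unique
print(unique_flow_ratio(youtube))     # 3/5 of flows are unique
```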
6 Conclusion
In this paper we presented a comparative traffic characterization study of three popular applications based on actual measurements taken from a commercial network. BitTorrent as one of the leading P2P file sharing applications, YouTube as the leading video sharing application and Facebook as the most prominent online social networking application were investigated. We studied the main traffic characteristics of the measured data, including the daily profile and the incoming and outgoing traffic ratios. We found that the number of incoming packets is 25% higher than the number of outgoing packets, but the traffic volume shows a different picture: the outgoing traffic volume is about 30% higher than the incoming traffic volume. This observation can be explained by the dominance of BitTorrent usage, where the amount of data uploaded by an average user is in many cases much higher than the downlink traffic volume. The flow-level analysis revealed that the number of bytes carried by a BitTorrent flow is almost independent of the flow duration, and that almost all BitTorrent flows have a duration of less than 200 s and a size smaller than 4 kB. We also found that almost all flows with a duration between 300 s and 365 s are TCP flows. However, in the case of YouTube we got a completely different
result: flow sizes and durations depend on each other. The typical rate of a YouTube flow varies around 1.2 Mbps or 0.75 Mbps. This is due to the rate limitation of the YouTube servers and the control that balances quality and bandwidth. Although the numbers of BitTorrent and Facebook flows show a heavy-tailed decrease for packet counts above 100, almost all YouTube flows contain different numbers of packets. We clearly identified that YouTube has unique characteristics in contrast to other applications such as BitTorrent and Facebook. Acknowledgement. We thank Sollentuna Energi AB for the measurements, and Ericsson Sweden and Ericsson Hungary for the cooperation and help in accessing the data.
References

1. Reuters: YouTube serves up 100 million videos a day online. USA TODAY (2006), http://usatoday.com/tech/news/2006-07-16-youtube-views_x.htm
2. Choffnes, D., Bustamante, F.: Pitfalls for Testbed Evaluations of Internet Systems. ACM SIGCOMM Computer Communication Review 40, 43–50 (2010)
3. Erman, D., Ilie, D., Popescu, A.: BitTorrent Session Characteristics and Models. In: Proceedings of the 3rd International Conference on Performance Modelling and Evaluation of Heterogeneous Networks, Ilkley, West Yorkshire, U.K., pp. 1–10 (2005)
4. Andrade, N., Neto, E.S., Brasileiro, F., Ripeanu, M.: Resource Demand and Supply in BitTorrent Content-Sharing Communities. Computer Networks 53, 515–527 (2009)
5. Gill, P., Arlitt, M., Li, Z., Mahanti, A.: YouTube Traffic Characterization: A View From the Edge. In: Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, New York, NY, USA, pp. 15–28 (2007)
6. Zink, M., Suh, K., Gu, Y., Kurose, J.: Characteristics of YouTube Network Traffic at a Campus Network – Measurements, Models, and Implications. Computer Networks 53, 501–514 (2009)
7. Cheng, X., Dale, C., Liu, J.: Understanding the Characteristics of Internet Short Video Sharing: YouTube as a Case Study. In: Proceedings of the 16th IEEE International Workshop on Quality of Service, Enschede, The Netherlands (2008)
8. McClard, A., Anderson, K.: Focus on Facebook: Who Are We Anyway? Anthropology News 49, 10–12 (2008)
9. Gjoka, M., Sirivianos, M., Markopoulou, A., Yang, X.: Poking Facebook: Characterization of OSN Applications. In: Proceedings of the 1st Workshop on Online Social Networks, Seattle, WA, USA, pp. 31–36 (2008)
10. Nazir, A., Raza, S., Chuah, C.: Unveiling Facebook: A Measurement Study of Social Network Based Applications. In: Proceedings of the 8th ACM SIGCOMM Conference on Internet Measurement, Vouliagmeni, Greece, pp. 43–56 (2008)
Flow Monitoring Experiences at the Ethernet-Layer

Rick Hofstede, Idilio Drago, Anna Sperotto, and Aiko Pras

University of Twente
Centre for Telematics and Information Technology
Faculty of Electrical Engineering, Mathematics and Computer Science
Design and Analysis of Communications Systems (DACS)
Enschede, The Netherlands
[email protected], {i.drago,a.sperotto,a.pras}@utwente.nl
Abstract. Flow monitoring is a scalable technology for providing summaries of network activity. Being deployed at the IP-layer, it uses fixed flow definitions, based on fields of the IP-layer and higher layers. Since several backbone network operators are considering the deployment of (Carrier) Ethernet in their Next-Generation Network, flow monitoring should also evolve in that direction. In order to do flow monitoring at the Ethernet-layer, Ethernet header fields need to be considered in flow definitions. IPFIX provides the flexibility to change the definition of flows, incorporating information from several layers in the network (including non-IP fields). The deployment of IPFIX is still at an early stage, which means that use cases for Ethernet-layer monitoring are not well known yet. This paper provides an overview of the usability of IPFIX at the Ethernet-layer and presents several use cases in which Ethernet-layer flow monitoring provides new insights and different views on a network.

Keywords: Network management, flow monitoring, IPFIX, Carrier Ethernet.
1 Introduction
The huge amount of traffic in high-speed networks requires scalable approaches for network monitoring. Flow1 monitoring is a feasible solution in such networks. It provides aggregated network data, resulting in a summary of network activities at a certain network layer. This can increase the visibility of the network behaviour by, for example, showing hosts and applications that are generating specific traffic. The main advantage of flow-based approaches is that they overcome the scalability problems of packet-level captures, where all traffic must be exported. For high-speed network connections (10 Gbps and higher), packet-level monitoring is not feasible, or could lead to severe performance problems of probing equipment.

1 We consider a flow as “a set of packets passing by an observation point in a network during a certain time interval and having a set of common properties” [14].
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 134–145, 2011.
© Springer-Verlag Berlin Heidelberg 2011
Cisco’s NetFlow [3] is currently the major network flow export technology. It aggregates packets into flows if they share the same values in their key fields; non-key fields are not considered in the definition of a flow. NetFlow version 5 (v5), which is still the most widely used protocol for flow export, provides flow data at the IP-layer with a fixed flow definition. As such, neither flow key fields (such as source/destination IP address, source/destination port, protocol, etc.) nor non-key fields (such as packet and octet counters) can be changed. NetFlow version 9 (v9) was proposed by Cisco to overcome this limitation, allowing flow export records to be specified freely by means of templates.

IPFIX (IP Flow Information Export) [4] is an effort by the IETF (Internet Engineering Task Force) to create a standard protocol for collecting and exporting flows. Cisco’s NetFlow v9 was used as the basis for the IPFIX specification [11]. The most distinctive characteristics of IPFIX are the flexibility to change the key fields of a flow, and the possibility to include information from several layers, including the Ethernet-layer.

Since several backbone network operators are considering the deployment of Carrier Ethernet2 in their Next-Generation Network [17], monitoring at the IP-layer is no longer a suitable solution. IPFIX, however, could be used for that purpose. Because the deployment of IPFIX is still at an early stage, the applicability of the protocol for Ethernet monitoring is not well known yet. In this context, this paper investigates several use cases, answering the following research question: What are the advantages of flow monitoring at the Ethernet-layer, compared to IP-layer flow monitoring? In order to answer this question, the University of Twente (UT) acquired two specialised, early-deployment probes (i.e. dedicated flow export devices) from INVEA-TECH3. This equipment provides a means to define flows based on Ethernet-header fields.
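The key-field mechanism described above can be sketched as follows. The packet records and field names below are invented for illustration; real exporters implement this in the metering hardware or kernel:

```python
from collections import defaultdict

# Aggregate packets into flows on a configurable tuple of key fields,
# mimicking NetFlow/IPFIX metering: packets sharing the same key-field
# values are merged into one flow, while non-key fields (packet and
# octet counters) are accumulated.

def aggregate(packets, key_fields):
    flows = defaultdict(lambda: {"packets": 0, "octets": 0})
    for pkt in packets:
        key = tuple(pkt[f] for f in key_fields)
        flows[key]["packets"] += 1
        flows[key]["octets"] += pkt["length"]
    return dict(flows)

packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 1234,
     "dst_port": 80, "proto": 6, "length": 1500},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 1234,
     "dst_port": 80, "proto": 6, "length": 40},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.0.2", "src_port": 4321,
     "dst_port": 80, "proto": 6, "length": 60},
]

# NetFlow v5 style: fixed IP 5-tuple key.
five_tuple = ("src_ip", "dst_ip", "src_port", "dst_port", "proto")
print(len(aggregate(packets, five_tuple)))  # 2 flows

# IPFIX-style flexibility: key on the destination only.
print(len(aggregate(packets, ("dst_ip", "dst_port"))))  # 1 flow
```

The same mechanism works unchanged with Ethernet-layer key fields such as MAC addresses or VLAN identifiers, which is precisely what Ethernet-layer flow monitoring exploits.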
Before deploying the equipment in a Carrier Ethernet (i.e. service-provider) network, it was tested in the UT’s 802.1Q-based Ethernet network, which carries 110 Virtual LANs (VLANs). This paper presents an overview of the IPFIX prototype equipment specially adopted for this research and several use cases identified during the testing phase. This paper is organised as follows: the IPFIX architecture and its deployment at the Ethernet-layer are discussed in Section 2. The fact that no suitable IPFIX software is available on the market had a severe impact on the design of the early-deployment IPFIX equipment; details on that will be provided in Section 3. After that, Section 4 describes the exported Ethernet flow data, together with four identified use cases. Although some other monitoring technologies exist for monitoring a network at the Ethernet-layer, IPFIX offers several advantages. Section 5 will focus on related technologies, by comparing them to IPFIX. Finally, we close this paper in Section 6, where we draw our conclusions and discuss future work.

2 When Ethernet technology is used in large-scale (e.g. service-provider) networks, it is commonly referred to as ‘Carrier Ethernet’ or ‘Metropolitan Ethernet’.
3 INVEA-TECH is a university spin-off company from Brno, Czech Republic.
136 R. Hofstede et al.

2 IPFIX at the Ethernet-Layer
IPFIX is a flow export protocol, based on the principles of NetFlow v9. Its architecture is defined in [15]. According to the standard, an IPFIX Device hosts at least one Exporting Process and possibly Observation Points and Metering Processes. An Observation Point is a location where packets are collected from the network by a Metering Process. Each pair formed by an Observation Point and a Metering Process belongs to a unique Observation Domain. The Exporting Process is the entity responsible for exporting flow records to Collectors. The tasks of Collectors are 1) the interpretation of IPFIX messages from different Observation Domains and 2) the storage of control information (e.g. flow definitions) and flow records received from an IPFIX Device. The IPFIX Device architecture is depicted in Figure 1.
[Diagram (source: RFC 5470): an IPFIX Device hosts Observation Domains 1..O, each containing Observation Points and a Metering Process; the Metering Processes feed an Exporting Process, which sends flows out to a Collector]
Fig. 1. IPFIX Device architecture
The main tasks performed by the Metering Process are depicted in Figure 2. After packets are captured at an Observation Point and timestamped, they can be sampled (i.e. selected for processing within a stream of packets) or filtered. The IPFIX standard, however, does not specify any techniques for that. The packets that qualify for flow processing are passed to the next stage, where flows are either created or updated. When using IPFIX for Ethernet monitoring, these tasks remain the same, although the Metering Process will deal with complete Ethernet frames instead of IP packets.
[Pipeline (source: RFC 5470): packet header capturing → timestamping → sampling → filtering → flows]

Fig. 2. IPFIX packet selection criteria
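The sampling stage of Fig. 2 can be as simple as deterministic 1-in-N selection. The IPFIX standard deliberately leaves the technique open, so the scheme sketched below is just one common choice, not something mandated by the standard:

```python
def sample_one_in_n(packets, n):
    """Deterministic 1-in-N packet sampling: keep every n-th packet."""
    return [pkt for i, pkt in enumerate(packets) if i % n == 0]

packets = list(range(10))           # stand-ins for captured packets
print(sample_one_in_n(packets, 4))  # [0, 4, 8]
```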
As said in Section 1, IPFIX allows changing the key fields of a flow [21]. Moreover, it allows flow definitions to consist of fields other than those in the IP-based definition of 5 and up to 7 IP packet attributes4, for instance fields from the Ethernet-layer. The possible (key and non-key) fields are maintained by the IANA (Internet Assigned Numbers Authority) and are called IPFIX Information Elements [20]. Due to the fact that IPFIX can also export non-IP flows, the list of Information Elements (IEs) is much larger than the list of possible fields for NetFlow. An overview of the most elementary Information Elements for the Ethernet-layer is presented in Table 1. More information can be found at the IANA Web site [20] and in the IEEE standards [7], [8] and [9].

Table 1. IPFIX Information Elements for Ethernet

sourceMacAddress        Source MAC address
destinationMacAddress   Destination MAC address
ethernetPayloadLength   MAC client data size (including any padding)
ethernetType            Ethernet type field, which identifies the type of payload in the Ethernet frame
dot1qVlanId             IEEE 802.1Q VLAN identifier. In case of a QinQ or 802.1ad frame, it represents the VLAN tag in the service-provider domain
dot1qPriority           IEEE 802 user priority. In case of a QinQ or 802.1ad frame, it represents the user priority in the service-provider domain
dot1qCustomerVlanId     In case of a QinQ or 802.1ad frame, it represents the VLAN tag in the customer domain
dot1qCustomerPriority   In case of a QinQ or 802.1ad frame, it represents the user priority in the customer domain
ethernetHeaderLength    IEEE 802 frame header size. It is the difference between the total frame size and the MAC client data size
metroEvcID              Ethernet Virtual Connection (EVC) ID, which uniquely identifies an EVC in a Carrier Ethernet network
metroEvcType            Represents the type of service provided by an Ethernet Virtual Connection
Although several Information Elements in Table 1 are only relevant in Carrier Ethernet networks, some are also valid in regular (i.e. non-Carrier) Ethernet networks. The fields related to customer frames (dot1qCustomerVlanId, for example) and Ethernet Virtual Connections (metroEvcID, for example) are the most important exceptions: they provide more insights into the customer traffic and, therefore, are essential for monitoring Ethernet transport networks.

4 The standard 5-tuple consists of the following fields: source and destination IP addresses, source and destination ports, and transport protocol. The other two common key fields are the type of service (TOS) and the input interface.
3 IPFIX Device Prototype
In order to answer the research question listed in Section 1, the UT acquired two INVEA-TECH FlowMon Probes [10]. This equipment is specialised in flow export (by means of NetFlow v5/v9/IPFIX) in high-speed networks (up to 10 Gbps), and uses an easily extensible software platform. A special Ethernet-plugin was developed by INVEA-TECH for the UT, in order to provide an IPFIX Device prototype, able to collect Information Elements from the Ethernet-layer. Table 2. Ethernet-plugin key fields srcIPv6 dstIPv6 srcPort dstPort l3.proto l4.proto port.in
sourceMacAddress destinationMacAddress dot1qVlanId ethernetType 0 (unused) 0 (unused) probe port ID
Table 3. Ethernet-plugin non-key fields srcAS dstAS ToS TCP flags port.out flow start flow end packets bytes
ethernetHeaderLength ethernetPayloadLength dot1qPriority dot1qCustomerPriority dot1qCustomerVlanId first frame seen last frame seen frames bytes
Instead of reimplementing the Metering Process to follow the IPFIX architecture (as described in Section 2), this prototype stores Ethernet data in the IPv6 fields of NetFlow v9 records. In other words, the device exports NetFlow v9 packets, but uses the IPv6 fields to store the Ethernet data. The complete mapping from IPFIX Information Elements to NetFlow v9 fields is listed in Tables 2 and 3. In these tables, the left column refers to the original NetFlow v9 field, while the right column refers to the IPFIX Information Element exported by the Ethernet-plugin. The presented approach has the following advantages:

1. An early deployment of IPFIX for Ethernet-layer monitoring could be made, because existing (IP-layer) flow processing algorithms (e.g. hash tables in the flow cache) could be reused. Besides that, no suitable IPFIX Collectors are available yet; by using NetFlow v9 packets, the existing NetFlow Collectors can be used. As an example, nfcapd, part of the nfdump tools suite [12], can be used as a Collector to store NetFlow records on stable storage.

2. Several existing monitoring tools, which support NetFlow v9, can be used to analyse the exported flow data. In some cases, however, small corrections are necessary. For example, nfdump, which normally allows displaying flow data and performing aggregations, will not be able to interpret all fields exported by the IPFIX Device prototype correctly, because IPv6 fields are used to store non-IPv6 data. However, it is possible to overcome certain incompatibilities by extending nfdump. An example of this is shown in Table 4, where nfdump is combined with a utility from INVEA-TECH that adapts its standard output to Ethernet-layer data, in order to print flows to the terminal. Several columns, such as Destination MAC and Priority, have been left out of the table for reasons of space.
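The remapping of Tables 2 and 3 amounts to a field-for-field renaming of NetFlow v9 records. A hedged sketch of the idea (only part of the mapping is shown, and the record values are invented):

```python
# Rename NetFlow v9 fields to the Ethernet IEs they actually carry,
# following the Tables 2 and 3 remapping (subset of the mapping only).
V9_TO_ETHERNET = {
    "srcIPv6": "sourceMacAddress",
    "dstIPv6": "destinationMacAddress",
    "srcPort": "dot1qVlanId",
    "dstPort": "ethernetType",
    "srcAS": "ethernetHeaderLength",
    "dstAS": "ethernetPayloadLength",
    "ToS": "dot1qPriority",
    "packets": "frames",
    "bytes": "bytes",
}

def reinterpret(v9_record):
    """Rename NetFlow v9 fields to their Ethernet-plugin meaning."""
    return {V9_TO_ETHERNET.get(k, k): v for k, v in v9_record.items()}

record = {"srcPort": 161, "dstPort": 0x0800, "packets": 99, "bytes": 11238}
print(reinterpret(record))
# {'dot1qVlanId': 161, 'ethernetType': 2048, 'frames': 99, 'bytes': 11238}
```

A Collector-side utility of the kind mentioned above would perform essentially this renaming before displaying the data.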
Table 4. nfdump output showing Ethernet data

Start time               Src MAC address    Type    VLAN  EHL  EPL  Frames  Bytes
2011-04-03 23:57:29.275  00:25:B3:1F:F3:0A  0x0800  161   14   56   99      11238
2011-04-03 23:57:29.529  00:0B:60:AA:80:00  0x0800  103   14   62   47      7456
2011-04-03 23:57:31.792  00:23:5A:C3:C9:7D  0x86DD  103   14   443  39      16355
2011-04-03 23:57:32.659  C8:0A:A9:F0:E3:4A  0x0806  103   14   50   16      1024
2011-04-03 23:57:34.440  00:00:0C:07:AC:00  0x0806  103   14   50   5       320

EHL - Ethernet Header Length, EPL - Ethernet Payload Length
4 Results
The previous sections have made clear that monitoring at the Ethernet-layer by means of IPFIX is a completely new area in the network management community. This is especially true when it comes to hands-on experience. After the two INVEA-TECH FlowMon Probes had been installed in the campus network of the UT, several tests were performed. This section highlights several aspects of the obtained hands-on experience, in the fields of traffic profiling, misconfiguration detection and device misbehaviour detection.

4.1 Traffic Profiling
Traffic profiling is the process of exploring the active traffic types in a network. The IPFIX Device prototype allows doing that at the Ethernet-layer. This gives a completely different view on the network, since all active layer-2+ protocols5 can be monitored. Besides all protocols that we expected to see active in our campus network - such as Novell IPX, the Link-Layer Discovery Protocol (LLDP) and the Address Resolution Protocol (ARP) - we discovered other, less common protocols. Among them are the DECnet Phase IV protocols, the Cisco WLAN Context Control Protocol and Multi-Protocol Label Switching (MPLS) Unicast. Since these protocols do not operate on top of IP, NetFlow would not have been able to identify them. Having Ethernet flow data allows comparing the amounts of flows, packets and octets that were exchanged by the active layer-2+ protocols. One of the most striking results obtained was the difference in the traffic behaviour of IPv4 and IPv6 (shown in Figure 3). Note that traffic profiling for IPv4 and IPv6 could also have been done using NetFlow, although the higher data aggregation level of Ethernet flow data makes profiling much faster and easier. Over a period of 24 hours, the amount of IPv4 flows was almost equal to the amount of IPv6 flows on the campus network. However, the amount of octets generated within 24 hours by IPv4 was roughly 40 times as high as the amount

5 The set of protocols operating directly on top of Ethernet, such as IP and ARP.
[Time series over 24 hours: (a) IPv4 flow records per minute, (b) IPv6 flow records per minute, (c) IPv4 octets per minute, (d) IPv6 octets per minute]
Fig. 3. Traffic profiling for IPv4 and IPv6
of octets generated by IPv6 (25 TB and 600GB, respectively). Although most machines have a dual-stack setup nowadays (to support both IPv4 and IPv6), most of the traffic carrying user payload is sent over IPv4. This behaviour can be clearly identified in Figure 3. Even though the amount of IPv6 flows starts to increase significantly after noon on the capturing day, the amount of octets exchanged remains low. One of the reasons for that is the Neighbour Discovery Protocol (NDP), which is part of IPv6 (and ICMPv6). As such, flows caused by NDP will be counted as IPv6 flows. For IPv4, neighbour discovery is handled by ARP. ARP operates directly on top of Ethernet, which is therefore not counted as an IPv4 flow. 4.2
Misconfiguration Detection
Both main routers at the edge of the UT campus network support the DECnet Phase IV protocol suite for management purposes. Since these protocols are not used anymore, their interfaces should have been disabled for security reasons. One of the active layer-2+ protocols on the network, however, belongs to the DECnet Phase IV protocol suite. We discovered this traffic by identifying the corresponding ethertype. Besides that, the flow behaviour shows a clear periodicity, as shown in Figure 4. The network managers found out that the DECnet interface on one of the routers had not been properly disabled. Without Ethernet-layer monitoring, this misconfiguration could not have been detected.
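Both the per-protocol profiling of Section 4.1 and the misconfiguration detection above boil down to aggregating exported flow records per ethertype and flagging values outside an expected set. A minimal sketch, assuming flow records are available as dictionaries with `ethertype`, `packets` and `octets` fields (this record layout and the expected-ethertype set are illustrative assumptions, not part of the FlowMon export format):

```python
from collections import defaultdict

# Ethertypes we expect to see on the campus network (hex values per the IANA list).
EXPECTED_ETHERTYPES = {
    0x0800,  # IPv4
    0x0806,  # ARP
    0x86DD,  # IPv6
    0x8137,  # Novell IPX
    0x88CC,  # LLDP
    0x8847,  # MPLS unicast
}

def profile_by_ethertype(flow_records):
    """Aggregate flows/packets/octets per ethertype and flag unexpected ethertypes."""
    stats = defaultdict(lambda: {"flows": 0, "packets": 0, "octets": 0})
    for rec in flow_records:
        s = stats[rec["ethertype"]]
        s["flows"] += 1
        s["packets"] += rec["packets"]
        s["octets"] += rec["octets"]
    unexpected = sorted(et for et in stats if et not in EXPECTED_ETHERTYPES)
    return dict(stats), unexpected
```

An ethertype such as 0x6003 (DECnet Phase IV routing) would show up in the `unexpected` list, pointing the operator at exactly the kind of misconfiguration described above.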
Flow Monitoring Experiences at the Ethernet-Layer
141
[Fig. 4. DECnet Maintenance Operation Protocol (MOP) flow records per minute, over a 24-hour period]
4.3 Device Misbehaviour Detection (1)

While profiling the network on the UT campus, two unknown ethertypes were detected: 0x8259 and 0x0A59. The IANA maintains a list of registered ethertypes [19], but the discovered ethertypes were not present on that list. Interpreting these hexadecimal values as the first two octets of decimal IP addresses yields the IP subnet prefixes used by the UT (130.89/16 and 10.89/16). Packet-level capturing at various points in the network allowed us to identify the generator of these Ethernet frames: a data centre switch of a major network device vendor (running beta firmware) had a bug in its IGMP Snooping functionality, resulting in mangled packets. The switch was putting the first two octets of IP addresses (extracted from the Ethernet payload) into the ethertype field. As a consequence, Ethernet frames were partly overwritten, making them corrupt and useless.

4.4 Device Misbehaviour Detection (2)
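The ethertype-to-prefix interpretation used in Section 4.3 is a simple byte split; a small illustrative helper (the function name is ours, purely for illustration):

```python
def ethertype_to_prefix(ethertype):
    """Interpret a 16-bit ethertype as the first two octets of an IP address."""
    return f"{(ethertype >> 8) & 0xFF}.{ethertype & 0xFF}"
```

Applied to the observed values, 0x8259 yields 130.89 and 0x0A59 yields 10.89, matching the UT subnet prefixes.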
During our experiments, a campus host with a malfunctioning network driver (for hardware firewalling) caused severe problems to the UT’s campus network. The host generated a huge amount of ARP messages, resulting in a degraded network performance. This is depicted in Figure 5. While ARP normally generates 2 million octets per minute on average (as shown in Figure 5(a)), it generated around 35 million octets per minute at the moment the host started sending malicious data (Figure 5(b)). Since it is not possible to monitor ARP traffic with normal NetFlow technology, it would not have been possible to detect this issue without the use of IPFIX.
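An anomaly of this kind is visible as a large jump over the normal ARP octet rate. A toy detector over a per-minute time series (the 2-million-octet baseline comes from Figure 5(a); the threshold factor is an invented illustration, not a tuned value):

```python
def arp_octet_alerts(octets_per_minute, baseline=2_000_000, factor=5):
    """Return the indices of minutes whose ARP octet count exceeds factor * baseline."""
    threshold = factor * baseline
    return [i for i, octets in enumerate(octets_per_minute) if octets > threshold]
```

Against the April 4 data, the minutes around the 35-million-octet spike would be flagged, while the normal diurnal variation of Figure 5(a) stays below the threshold.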
[Fig. 5. Misbehaving host becomes security threat: (a) ARP octets per minute on March 28, 2011; (b) ARP octets per minute on April 4, 2011]
5 Related Work
As mentioned in Section 2, the predecessor of IPFIX is NetFlow v9. Some of the presented use cases (i.e., misconfiguration and device misbehaviour detection) would not have been possible with NetFlow: NetFlow uses fixed flow keys, which do not contain Ethernet fields. IPFIX, however, offers flexible flow keys (by means of Information Elements), which make it possible to monitor a network at the Ethernet-layer.
A complementary protocol to IPFIX is PSAMP (Packet Sampling) [5]. According to [5], "the main difference between IPFIX and PSAMP is that IPFIX addresses the export of Flow Records, whereas PSAMP addresses the export of packet records". The two protocols share a part of their architecture, which is depicted in Figure 6. The IPFIX architecture consists of two stages, namely 1) packet processing and 2) flow processing. The first stage is identical in the IPFIX and PSAMP architectures: when a packet header is captured and timestamped, it is passed to the packet selection process, in which packets can be sampled or filtered. After that, the next step depends on the protocol:
1. If IPFIX is used, packets reach the flow processing stage, in which they are mapped to flows. This means that either an existing flow record is updated, or a new flow record is created. The final step is to export the flows.
2. If PSAMP is used, packet reports are exported instead of flow records. These reports can be seen as special IPFIX records, each containing the information about a single packet.

[Fig. 6. IPFIX and PSAMP architectures (source: RFC 5476). Stage 1, packet processing (IPFIX + PSAMP): packet header capturing, timestamping, packet selection, packet classification; PSAMP then performs packet report export. Stage 2, flow processing (IPFIX only): flow generation and update, flow selection, flow record export]

With PSAMP, it would not have been possible to do traffic profiling (as discussed in Section 4.1) as precisely as with IPFIX, because PSAMP performs sampling by definition. Although it is possible to mathematically compensate for sampling [6], this process is not straightforward. The data presented in this paper, however, is always unsampled.
While IPFIX is an IETF standard, industry technologies for network monitoring also exist. One of them is sFlow [16], which uses packet sampling (but not by means of PSAMP) for exporting network data. Just as IPFIX, it offers a network monitoring solution at the Ethernet-layer. Some differences can be identified when comparing sFlow to IPFIX:
– sFlow is usually available on a dedicated hardware chip in a network device, while IPFIX is usually split between a hardware and a software component. The advantage of a completely hardware-based approach is that the CPU and memory of the device are preserved for other tasks (such as routing and switching).
– sFlow uses packet sampling by definition. Although this saves hardware resources of the network device, the resulting data set is only a subset of the actual network traffic. IPFIX, on the other hand, makes it possible to collect unsampled flow data, resulting in a more complete overview of the traffic. Moreover, even when IPFIX is used without sampling, it remains scalable.
Because of the packet sampling used by sFlow, traffic profiling cannot be done with the same precision as with an IPFIX Device, just as is the case for PSAMP.
tcpdump [18] is a packet-level traffic capturing and analysis tool. It uses PCAP (Packet Capture) for capturing packets on a medium and, eventually, for writing them to files. Due to the limited bandwidth available for writing data to stable storage, capturing network traffic in high-speed networks (e.g., 10 Gbps) causes severe performance problems. A consequence of these performance problems is that packets are dropped by the kernel of the operating system, resulting in incomplete traces. For these reasons, making packet-level captures in high-speed networks, and especially in transport (service-provider) networks, is not a suitable solution.
NeTraMet (Network Traffic Meter) [2] is another approach to flow monitoring and an open-source implementation of the IETF Meter MIB [13]. Within NeTraMet, rule sets are used to specify the information fields that should be gathered from the network traffic [1]. These rules can also be used to specify which flows are filtered. NeTraMet is a software-based solution which, just as tcpdump, uses PCAP. Therefore, NeTraMet is not suitable for monitoring high-speed network links (e.g., 10 Gbps), for the same reasons as tcpdump.
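The mathematical compensation for packet sampling mentioned above can, in its simplest form, be sketched as scaling sampled counts by the sampling rate N for 1-in-N sampling. This naive inversion is only a first-order estimate for packet and octet totals; as noted in the text, compensating properly for flow statistics is not straightforward [6]:

```python
def estimate_from_sampled(sampled_packets, sampled_octets, sampling_rate):
    """Invert 1-in-N packet sampling by scaling observed counts by N.

    Unbiased for total packet/octet counts under uniform sampling, but it does
    NOT recover flow-level statistics (e.g., the number of distinct flows).
    """
    n = sampling_rate
    return sampled_packets * n, sampled_octets * n
```

For example, 10 sampled packets at a 1-in-100 rate give an estimate of 1000 original packets; the per-flow distributions behind those packets remain unknown.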
R. Hofstede et al.

6 Conclusions
Flow monitoring is a scalable technology for monitoring traffic in high-speed networks. Until recently, it was mainly deployed at the IP-layer, providing a summary of network traffic based on IP and TCP/UDP fields. When it comes to flow monitoring at the Ethernet-layer, as needed for Carrier Ethernet networks, another technology is required. IPFIX is a suitable solution, since it makes it possible to define flow keys based on Ethernet fields. The protocol, however, is still in an early-deployment phase and little hands-on experience has been gathered. The IPFIX Device prototype acquired by the UT has been tested in a campus network, in order to answer the research question raised in Section 1: What are the advantages of flow monitoring at the Ethernet-layer, compared to IP-layer flow monitoring?
Several use cases were presented in which Ethernet-layer monitoring provides new insights into the traffic patterns inside the UT's campus network, ranging from detecting misconfigurations to detecting device misbehaviour. The related monitoring technologies discussed would not support these use cases with the same simplicity and precision as IPFIX. The major advantage of flow monitoring at the Ethernet-layer is the ability to monitor all active protocols that operate directly on top of Ethernet. Among them are protocols, such as ARP for IPv4, which are essential for IP-based communications. Besides helping to understand how much data these protocols generate and how this amount depends on the number of active hosts, Ethernet-layer flow monitoring can assist network managers in detecting anomalies and debugging problems.
Although Ethernet-layer monitoring provides new insights into the traffic transiting a network, we think that the implementation discussed in this paper will never be able to replace standard NetFlow. The main reason is that the IPFIX Device prototype solely provides Ethernet-layer data (i.e., it supports only Ethernet-based IPFIX Information Elements). Besides that, using NetFlow v9 for carrying Ethernet data is just a temporary solution. A fully implemented and compatible IPFIX Device, which will become available in the future, will make it possible to add IP-based Information Elements to flow definitions, resulting in a more complete overview of the traffic. As future work, we plan to investigate the detection of more anomaly types by means of Ethernet flow data. This can be done in two directions: 1) investigating anomalies which cannot be detected by NetFlow, and 2) investigating how IP-layer anomalies are reflected in Ethernet flow data.
Acknowledgements. This research work has been supported by SURFnet's GigaPort3 project for Next-Generation Networks, the IOP GenCom project Service Optimisation and Quality (SeQual), and the EU FP7-257513 UniverSelf Collaborative Project. Special thanks to Jeroen van Ingen Schenau from the University of Twente for his valuable contribution to the research.
References

1. Brownlee, N.: Traffic Flow Measurement: Meter MIB. RFC 2720 (Informational) (October 1999), http://www.ietf.org/rfc/rfc2720.txt
2. Brownlee, N.: NeTraMet & NeMaC Reference Manual, Version 4.3 (June 1999), http://www.caida.org/tools/measurement/netramet/download/ntm43.pdf
3. Claise, B.: Cisco Systems NetFlow Services Export Version 9. RFC 3954 (Informational) (October 2004), http://www.ietf.org/rfc/rfc3954.txt
4. Claise, B.: Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of IP Traffic Flow Information. RFC 5101 (Standards Track) (January 2008), http://www.ietf.org/rfc/rfc5101.txt
5. Claise, B., Johnson, A., Quittek, J.: Packet Sampling (PSAMP) Protocol Specifications. RFC 5476 (Standards Track) (March 2009), http://www.ietf.org/rfc/rfc5476.txt
6. Duffield, N., Lund, C., Thorup, M.: Properties and prediction of flow statistics from sampled packet streams. In: Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement, IMW 2002, pp. 159–171. ACM, New York (2002)
7. Institute of Electrical and Electronics Engineers: Part 3: Carrier sense multiple access with collision detection (CSMA/CD) access method and physical layer specifications. IEEE Standard 802.3 (December 2005)
8. Institute of Electrical and Electronics Engineers: Local and Metropolitan Area Networks: Virtual Bridged Local Area Networks. IEEE Standard 802.1Q (May 2006)
9. Institute of Electrical and Electronics Engineers: Virtual Bridged Local Area Networks - Amendment 4: Provider Bridges. IEEE Standard 802.1ad (May 2006)
10. INVEA-TECH: FlowMon Probe (July 2011), http://www.invea-tech.com/products-and-services/flowmon/flowmon-probes
11. Leinen, S.: Evaluation of Candidate Protocols for IP Flow Information Export (IPFIX). RFC 3955 (Informational) (October 2004), http://www.ietf.org/rfc/rfc3955.txt
12. NfDump (July 2011), http://nfdump.sourceforge.net
13. Poortinga, R., van de Meent, R., Pras, A.: Analysing campus traffic using the meter-MIB. In: Proceedings of the Passive and Active Measurement Workshop, pp. 192–201 (2002)
14. Quittek, J., Zseby, T., Claise, B., Zander, S.: Requirements for IP Flow Information Export (IPFIX). RFC 3917 (Informational) (October 2004), http://www.ietf.org/rfc/rfc3917.txt
15. Sadasivan, G., Brownlee, N., Claise, B., Quittek, J.: Architecture for IP Flow Information Export. RFC 5470 (Informational) (March 2009), http://www.ietf.org/rfc/rfc5470.txt
16. sFlow: Making the Network Visible (July 2011), http://www.sflow.org
17. SURFnet: GigaPort3 and SURFnet7 (July 2011), http://www.surfnet.nl/en/innovatieprograma's/gigaport3/pages/default.aspx
18. TCPDUMP/LIBPCAP (July 2011), http://www.tcpdump.org
19. The Internet Assigned Numbers Authority (IANA): Ether Types (April 2010), http://www.iana.org/assignments/ethernet-numbers
20. The Internet Assigned Numbers Authority (IANA): IP Flow Information Export (IPFIX) Information Elements (July 2011), http://www.iana.org/assignments/ipfix/ipfix.xml
21. Trammell, B., Boschi, E.: An Introduction to IP Flow Information Export (IPFIX). IEEE Communications Magazine 49(4), 89–95 (2011)
A Survey of Quality of Experience

Qin Dai

Chair for Telecommunications, Technische Universität Dresden, Dresden, Germany
[email protected]
Abstract. Offering the subscriber good satisfaction is the ultimate target of a service provider, since user satisfaction is necessary for a service to be successful. It is therefore of great importance to know how users perceive the quality of a service. QoE (Quality of Experience) is a "new" terminology introduced to describe this user perception. It reflects how satisfied or dissatisfied the end users are with a certain service and represents how well the service fulfils the user's expectations. Although nearly a decade has passed since the introduction of the QoE concept at the beginning of this century, a comprehensive introduction and detailed explanation of it and the related issues are still absent. Additionally, the usage of different terminologies, as well as the varying explanations of QoE, may obscure the concept to new researchers and readers. In this paper, we provide a systematic survey of QoE-related issues. We review the introduction of the QoE concept and the corresponding terms used by different communities. We compare and discuss its relationship with another concept, QoS, which is also used for service quality description. Moreover, the technical issues related to this subjective concept -- QoE measurement as well as QoE-based QoS engineering -- are introduced.

Keywords: QoS, QoE, NP, QoSE, QoSD, KQI, KPI, QoE-aware QoS engineering.
1 Introduction

The importance of understanding service quality from the end user's perspective has been widely recognized. To describe the end users' viewpoint on service quality, a new terminology, QoE (Quality of Experience), was introduced at the beginning of this century. The QoE concept arose against the background of the development of, and the increasing demand for, new services, e.g., IPTV, VoD, online gaming, etc. These services are considered to be the new revenue point of the ISPs; however, their success strongly depends on the perception of the end users. Research indicates that most customers will terminate a service rather than complain if they have experienced bad quality. According to [1], about 82% of customer defections are due to frustration over the product/service and the inability of the provider/operator to deal with this effectively. Moreover, a chain effect has been observed: on average, one frustrated customer will tell 13 other people about his/her bad experience. Thus, understanding the user's perception of the service is of

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 146–156, 2011. © Springer-Verlag Berlin Heidelberg 2011
primary importance for the ISPs, in order to obtain the customers' loyalty and to stand up to their competitors. More importantly, it can help the ISPs to optimize their revenue and their network resources [2]. In this work, we provide an extensive review of QoE issues with detailed explanations. The rest of this paper is organized as follows. In Section 2, we introduce the "new" concept of Quality of Experience; its definitions from diverse communities and a comparison with another concept, Quality of Service (QoS), are discussed. In Section 3, the main contributors to, and indicators of, service QoE are presented, and two QoE measurement approaches, i.e., the service-level and the network-level method, are introduced. Section 4 provides an introduction to QoE-based QoS engineering. Finally, the paper is concluded in Section 5.
2 QoE vs. QoS

2.1 QoE Definitions and Characteristics

The concept of QoE is given different definitions by different communities; a unified definition of QoE is still needed. The first introduction of the term QoE at the beginning of this century was predominantly promoted by industry. For example, in 2005 Nokia introduced the concept as the perception of the end users of a service quality and stated [3]: "QoE is how a user perceives the usability of a service when in use – how satisfied he or she is with a service". One year later, the Broadband Forum defined QoE as a measure and an indicator of how well a system fulfils the requirements of the customers. In its technical report TR-126 [4], it explains: "QoE is a measure of end-to-end performance at the service level from the user perspective and an indication of how well the system meets the user's needs."
Based on the descriptions above, we define QoE as follows: QoE is a user-oriented description of service quality. It refers to the "subjective" evaluation by the end users of the "overall" service quality and reflects the users' experiences throughout the "entire" service utilization. To dimension/evaluate the QoE of a service, the user's perceptions, such as "Good", "Fair", "Poor", "Bad", are often used.
QoE is a "subjective" concept. Beyond the objective factors related to service delivery (transport network), end-terminal installation, software configuration, etc., subjective factors, such as the user's expectations and experiences, can determine the user's final (overall) perception.
QoE is sometimes a "long-term" concept: a user's positive perception of a service often implies good QoE performance covering the whole period of the service utilization. However, QoE is also a "short-term" concept: it reflects the user's experiences, particularly negative perceptions, over a short period. For instance, a bad experience of the speech quality of a voice application, even over a short period, is probably serious enough to make the user terminate the service.

2.2 QoS Definition

In addition to QoE, QoS (Quality of Service) is another concept used to describe service quality, and it has been dominant for several decades. Although the term QoS was originally intended to be user-oriented [2,5,6], it is now considered more as a
measurement from the network perspective. [3] defined QoS as: "QoS is the ability of the network to provide a service with an assured service level". In [4] it was introduced as: "QoS is a measure of performance at the packet layer from the network perspective." QoS also refers to a set of technologies "that enable the network administrator to manage the effects of congestion on application performance as well as providing differentiated service to selected network traffic flows or to selected users" [4].

2.3 Comparisons and Discussions

The foremost difference between QoE and QoS resides in the following: the former focuses more on what the end user feels, whereas the latter is more a measure from the network perspective. Basically, the relationship between QoE and QoS can be summarized in two aspects: first, QoE extends the concept of QoS, and QoS covers only one part of the QoE scope; second, QoE needs the support of QoS, and, inversely, the QoS performance can impact the QoE satisfaction. We use the example illustrated in Fig. 1 to demonstrate these two concepts and their relationship in detail.
[Fig. 1. QoE vs. QoS [3]: QoE encompasses the end-to-end network QoS, factors such as network coverage, service offers and level of support, and other subjective factors such as user expectations, particular experience and user requirements]
Figure 1 depicts a common VoIP application scenario: two end users are calling each other and experiencing the service quality. The traffic streams, including voice and data, need to go through different networks and network elements, e.g., a mobile network and a backbone network, to arrive at the user terminals. In this example, the (end-to-end) QoS scope may cover all of the network elements along the traffic flow. Its performance can be evaluated by (end-to-end) network measures, such as BER (Bit Error Ratio), PLR (Packet Loss Ratio) and latency, which usually have little meaning to the end user. On the other hand, QoE directly reflects what the end users think about the service: it refers to their experiences during the service utilization and whether they were satisfied with the service quality. For instance, the user
may experience breaks during the call, or may have to wait before talking in order to be sure that the conversation partner has already finished. These speech breaks and long waiting periods may be caused by network errors, such as network congestion, or by a wrong software configuration. The user usually does not see the reason behind the bad experiences, but can certainly sense them, and all of these bad experiences can impact his/her overall perception of the service quality.
As has been shown, the user's QoE of a service is far more than a technical metric. It covers a wide scope involving different partners with different responsibilities, e.g., service/content offering by the service provider, service delivery by the network provider, service utilization by the end user, etc. Besides the quality of the transport network, i.e., the network QoS, other factors, including subjective ones such as the user's expectations of and experience with the service, can also impact the user's perception. For example, a user with experience of a particular service may have higher requirements on the service quality than an inexperienced one. On the other hand, since the service has to be delivered through the network, the network QoS can influence the user's experience. A poor network QoS will usually result in a poor QoE, and good QoE satisfaction often implies a good QoS of the transport network. Packet loss in the network, for instance, may result in bad speech quality and disappoint the user, while a good user experience of the speech quality usually indicates few or even zero packet losses during the conversation. Nevertheless, fulfilling all the QoS requirements cannot guarantee a good QoE satisfaction. For example, a G.711 voice connection with a small PLR may still maintain good speech quality, whereas the same PLR in a G.729 speech flow may yield an entirely different user experience.
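The codec-dependent relationship between packet loss and perceived quality can be illustrated with a toy mapping from loss to a MOS-like score. The sensitivity coefficients below are invented purely for illustration; this is not the ITU-T E-model or any standardized metric:

```python
# MOS drop per 1% packet loss -- hypothetical values, chosen only to show that
# the same PLR hurts a low-bitrate codec (G.729) more than G.711.
CODEC_SENSITIVITY = {"G.711": 0.1, "G.729": 0.5}

def toy_mos(codec, loss_percent):
    """Map packet loss to a 1..5 MOS-like score, clamped to the MOS range."""
    mos = 4.4 - CODEC_SENSITIVITY[codec] * loss_percent
    return max(1.0, min(5.0, mos))
```

With this sketch, 5% loss leaves a G.711 call near "good" while pushing a G.729 call towards "poor", mirroring the observation in the text that identical QoS figures can yield very different QoE.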
Table 1 summarizes the comparison between the two concepts, QoE and QoS.

Table 1. Comparisons between QoE and QoS

QoE:
– A description of service quality.
– User oriented, from the user's perspective.
– QoE is a subjective concept; it directly reflects the user's perception of the service quality.
– A QoE metric such as MOS (Mean Opinion Score) directly expresses the user's satisfaction.
– QoE satisfaction needs the support of a good network QoS.

QoS:
– A description of service quality.
– Network oriented, from the network/provider's perspective.
– QoS is limited to a technical concept, and usually a QoS measurement cannot directly reflect a service problem.
– QoS parameters can be packet loss, jitter, delay, throughput, etc., which usually have little meaning to an end user.
– QoS can impact the service QoE.
2.4 QoSE vs. QoSD – The Definition of QoS from the ITU

The view of QoS presented by the Broadband Forum is limited to a technical concept. The ITU, however, defines QoS in a much wider spectrum. A generic definition of QoS can be found in the ITU recommendation E.800 series [2,5,6]. In its newest version, issued in 2008, QoS is defined as the "totality of characteristics of a telecommunications service that bear on its ability to satisfy stated and implied needs of the user of the service".
As can be seen from this definition, the ITU's concept of QoS is user oriented; it refers to the comparison of one service with another in terms of certain universal, user-oriented performance concerns [7]. To address the provider-oriented presentation of service quality, another concept, NP (Network Performance), has been introduced. In [2], NP is defined as "the ability of a network or network portion to provide the functions related to communications between users". The usage of different concept pairs for service quality description, i.e., the pair "QoE" and "QoS" and the pair "QoS" and "NP", and particularly the diverse explanations of QoS, may confuse new researchers and readers. Accordingly, to clearly describe and distinguish the presentation of service quality from the user's perspective and from the provider's perspective, the ITU refined its definition of the QoS concept and introduced new terms in its recommendation E.800 in 2008 [2]. As illustrated in Figure 2, the ITU divided the concept of QoS into four viewpoints, i.e., QoSO (QoS offered), QoSD (QoS delivered), QoSR (QoS requirements of the user), and QoSE (QoS experienced).
[Fig. 2. Four viewpoints of QoS by the ITU [2]: on the customer side, the customer's QoS requirements and the QoS perceived by the customer; on the service provider side, the QoS offered by the provider and the QoS achieved by the provider]
As shown in Figure 2, QoSO and QoSD describe the service quality from the service provider's side. QoSO describes the level of quality that the service provider targets when offering a certain service to its customers, while QoSD is the quality level actually delivered to the user; Network Performance refers to the detailed technical part of QoSD. QoSR and QoSE are the two viewpoints from the customer side: the former refers to the user's QoS requirements for a service, and the latter expresses the level of quality that the user has experienced with the service. As can be seen from the above, although the Broadband Forum and the ITU use separate terminologies for describing service quality issues, a common consideration is that understanding the end users' view of service quality is of primary importance for the service provider in order to optimize its resources and increase its revenue. Both of them state that the traditional network measurements, i.e., measurements taken only at the network layer, cannot precisely, or even correctly, reflect service problems in
some cases. New metrics precisely depicting the user's experience of a service shall be introduced, and these subjective opinions shall be turned into objective parameters for the purpose of network design and engineering. Although the new pair of concepts, QoSE and QoSD, has recently been introduced by the ITU, the terminologies QoE and QoS are still widely used in academia and industry for service quality evaluation.
3 QoE Indicators and Measurements

3.1 QoE Indicators

Describing or measuring the quality of experience of a certain service first requires an understanding of its QoE indicators/contributors. A QoE indicator/contributor, often denoted KQI (Key Quality Indicator) or QoE KPI (Key Performance Indicator), is a user-based metric which can capture the user's perception of a service quality. As is well realized, the QoE of a certain service can be determined by various factors existing at different periods and in various steps of the service provisioning. Thus, a complete description of a user's QoE of a service can be challenging: it needs to consider different aspects covering a variety of scopes and is service dependent. For the (end-to-end) QoE description/measurement, the common view is that it shall comprise all of the factors that can influence the interaction between the end users and the service [8]. An early description of the QoE factors can be found in [3], in which the QoE factors that can impact the overall user perception of mobile services were specified; Figure 3 illustrates these definitions. As exemplified in the figure, aspects such as service accessibility, service availability, service usability (ease of use), service integrity (session quality), and service reliability/continuity should be taken into consideration when considering users' satisfaction.
Based on these, some efforts have been carried out to further develop the definition of QoE indicators. A simple and common approach is to distinguish the QoE factors according to the different steps of the service provision. For instance, the end-to-end QoE satisfaction of the end users may be influenced by their experiences in two stages, i.e., the stage before the session and the stage in the session. The user's satisfaction with the service before the session is set up may depend on the service availability, service accessibility, the ease of software usage, etc.
After the session is set up, factors such as the session quality and the connection continuity become important. Finally, the user's experiences during both stages determine the user's overall satisfaction with the service. Other identification methods attempt to classify QoE factors according to their contribution to the user's perception. For example, in [9], the user's perception of a service is considered to be reflected in two aspects, i.e., reliability aspects and comfort aspects. The reliability aspects consist of items such as service availability, service accessibility, service access time, and service continuity. The comfort aspects include session quality, ease of use, and level of support.
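The reliability/comfort classification of [9] can be sketched as a simple weighted aggregation of per-KQI scores. The KQI names follow [9], but the 0..1 scoring scale and the weights are invented here purely for illustration:

```python
# KQIs grouped per [9]; each KQI is assumed to be scored on a 0..1 scale.
RELIABILITY_KQIS = ("availability", "accessibility", "access_time", "continuity")
COMFORT_KQIS = ("session_quality", "ease_of_use", "level_of_support")

def overall_qoe(scores, reliability_weight=0.6):
    """Combine per-KQI scores into a single 0..1 QoE figure (toy aggregation)."""
    rel = sum(scores[k] for k in RELIABILITY_KQIS) / len(RELIABILITY_KQIS)
    com = sum(scores[k] for k in COMFORT_KQIS) / len(COMFORT_KQIS)
    return reliability_weight * rel + (1 - reliability_weight) * com
```

In practice the aggregation function and weights would be derived from subjective measurements, as discussed in Section 3.2; the sketch only shows the structure of such a KQI-based model.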
[Fig. 3. QoE factors/aspects: availability, usability, integrity, reliability, accessibility, ...]
In addition, the identification of QoE indicators/contributors can depend on the particular purpose of the QoE measurement. For example, in [10], based on Kano's model [11] of service quality, the authors considered only the must-be requirements of the services when defining QoE indicators for High-Speed Internet service, since the goal of that work was to find the network QoS criteria that prevent user dissatisfaction. Kano's model has become one of the most popular quality models and has been widely used in product/service development since it was first introduced in 1984. Its basic philosophy is that user satisfaction and dissatisfaction are two independent concepts and should be considered separately. Recently it has begun to be applied in the telecommunication industry for network/service design; related work can be found in [10][12][13].

3.2 QoE Measurement

Typically, QoE measurement can be approached on two levels: the service level and the network level.

• Service level measurement

Service level measurement is typically subjective. A number of user agents are employed at the terminals to perform subjective evaluations in order to ensure accurate results. The user agents are asked to experience a certain service or service mix and evaluate its quality by answering carefully predefined questions. Since the results of a service level approach are taken directly from the end users' point of view, this method can easily and directly reveal the QoE problems of a service. Another central purpose of a service level QoE measurement is to find the relationship between the user's overall satisfaction and the corresponding QoE indicators, i.e., KQIs/QoE KPIs, which can be presented as:
A Survey of Quality of Experience

$Q_i = f_i(I_E^1, \cdots, I_E^K)$    (1)
where $Q_i$ is the QoE measurement of an individual service $i$, which relates to the $K$ quality indicators $I_E^j$ ($1 \le j \le K$). A frequently used form of this relation is a linear function:

$Q_i = \sum_{j=1}^{K} w_j \cdot I_E^j$    (2)
where $w_j$ is the weighting factor of the $j$th QoE indicator. Similarly, a linear function is also frequently used to correlate the QoE measurement of each individual service with the overall QoE performance of a service combination:

$Q_{comb} = \sum_{i=1}^{N} w_i' \cdot Q_i$    (3)
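Equations (2) and (3) are plain weighted sums and can be sketched directly; the indicator values and weights below are illustrative placeholders, not taken from any cited study.

```python
# Sketch of the linear QoE model of Eqs. (2) and (3).
# All numbers are illustrative placeholders.

def service_qoe(indicators, weights):
    """Eq. (2): Q_i = sum_j w_j * I_E^j for a single service."""
    assert len(indicators) == len(weights)
    return sum(w * x for w, x in zip(weights, indicators))

def combined_qoe(service_scores, service_weights):
    """Eq. (3): Q_comb = sum_i w'_i * Q_i over N services."""
    return sum(w * q for w, q in zip(service_weights, service_scores))

# Two services rated through three and two indicators on a 1-5 scale.
q_video = service_qoe([4.0, 3.5, 5.0], [0.5, 0.3, 0.2])  # 4.05
q_voice = service_qoe([4.5, 4.0], [0.6, 0.4])            # 4.3
q_total = combined_qoe([q_video, q_voice], [0.7, 0.3])   # 4.125
```

With the weights normalized to sum to one, each resulting score stays on the same 1–5 MOS-like scale as its inputs.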
where $Q_{comb}$ refers to the overall QoE of a combination of $N$ services and $w_i'$ is the weighting factor for service $i$, with $Q_i$ as its individual QoE measurement.

• Network level measurement

A network level QoE measurement aims to estimate the user's experience of a given service from network measurements. This method requires a QoE function that maps the QoS measurements (QoS KPIs), e.g., BER, PLR, etc., to the user experience (QoE). In general, such a QoE function can be presented as:

$Q_i = f_i'(I_S^1, \cdots, I_S^L)$    (4)
where the individual QoE of a certain service, $Q_i$, is a function of $L$ QoS indicators/parameters $I_S^j$ ($1 \le j \le L$). Such a QoE function is usually obtained through subjective tests in which users evaluate the service while
recording the related network parameters. The obtained subjective samples (QoE evaluations) and the recorded objective measurements (QoS measurements) can then be correlated through statistical analysis and modelling methods. According to (4) and (3), the overall QoE of a service combination can be presented as:

$Q_{comb} = \sum_{i=1}^{N} w_i' \cdot f_i'(I_S^1, \cdots, I_S^L)$    (5)
Compared with the service level approach, the network level approach (given the obtained QoE function) does not require the participation of a large number of observers and is therefore time saving. Furthermore, a network level measurement can be deployed for online monitoring and quality prediction. However, defining a proper relation between QoE and QoS parameters is difficult: QoE is intrinsically a subjective concept while QoS is an objective description, which limits the efficiency and accuracy of a network level evaluation. Despite this weakness in assessment accuracy, network level QoE evaluation is of special importance to network providers, since such QoE estimations indicate the ability of their networks to offer a certain service in terms of user satisfaction. The results can also be used by the network provider to optimize its network management and traffic engineering.
4 QoE-Aware QoS Engineering

Recently, the importance of taking care of user satisfaction with service provisioning as a whole has been recognized [14]. The definition and measurement of the user's perception of a service tell service providers to what level of success their services have reached. From the service providers' perspective, another motivation for QoE investigation, particularly research based on network measurement, is to use the obtained results for network resource optimization and, in turn, to improve service quality. As a consequence, the concept of QoE-based QoS engineering has been introduced. Its target is to maximize the user experience while concurrently reducing network costs.

Figure 4 illustrates a top-down approach to QoE-based QoS engineering. A general approach may consist of six steps in two spaces. Steps 1 to 3 in Figure 4 are the procedures in the QoE space: QoE indicator specification/identification (step 1), specifying QoE targets (step 2), and QoE-QoS mapping (step 3). Based on the results of step 3, the network QoS mechanisms and configurations, e.g., service classification (best effort, real-time, etc.), traffic scheduling, policing, queue management, etc., are determined in step 4 and provided in step 5. The QoE measurement (typically at the network level) in step 6 then examines the efficiency of this QoS engineering against the specified QoE target. Its results feed back to step 4 to improve the related QoS mechanisms until the QoE requirements are fulfilled.
[Figure 4 flowchart — QoE space: (1) analyze and characterize the service; (2) define service QoE metrics and targets; (3) identify QoE dependencies and define the QoE function. QoS space: (4) determine QoS mechanisms and configurations; (5) QoS engineering; (6) QoE measurement. If the QoE target is not satisfied, return to step 4; otherwise accept the QoE-aware QoS.]
Fig. 4. Top-down approach of QoE engineering, adapted from [4]
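The feedback of step 6 into step 4 amounts to a simple control loop that tries QoS configurations until the measured QoE reaches the target. A minimal sketch follows; `measure_qoe` and the toy priority model are hypothetical stand-ins, not part of the approach specified in [4].

```python
# Illustrative sketch of the step 4-6 feedback loop in Figure 4.
# measure_qoe and the candidate configurations are hypothetical.

def qoe_engineering_loop(qoe_target, configs, measure_qoe, max_rounds=10):
    """Try QoS configurations (steps 4/5) until the measured QoE
    (step 6) reaches the target, mirroring the feedback arrow."""
    qoe = float("-inf")
    for config in configs[:max_rounds]:
        qoe = measure_qoe(config)          # step 6: QoE measurement
        if qoe >= qoe_target:              # decision: target satisfied?
            return config, qoe             # accept the QoE-aware QoS
    return None, qoe                       # target not reached

# Toy model: QoE improves with the scheduling priority of the class.
configs = [{"priority": p} for p in (1, 2, 3)]
best, qoe = qoe_engineering_loop(3.5, configs,
                                 lambda c: 2.0 + 0.6 * c["priority"])
```

In a real deployment the candidate configurations would be scheduling, policing, and queue-management settings, and `measure_qoe` a network-level QoE estimator as in Section 3.2.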
5 Conclusions

Customer satisfaction has long been a matter of concern to most companies when designing and improving their products and services. As a consequence, the concept of QoE has been introduced in the telecommunication industry to express service quality from the customer's point of view. In this paper we provide a comprehensive review of QoE and related issues, covering a wide scope from the concept definition to QoE engineering. Our intention is to stress the importance of QoE in network design and to provide basic knowledge essential for related investigations.
References

1. Nerger, P.: Managing Quality of Experience for Mobile data service (2003)
2. International Telecommunication Union – Telecommunication Standardization Sector: Definitions of terms related to quality of service. ITU-T Rec. E.800 (September 2008)
3. Nokia: Quality of Experience of mobile services: Can it be measured and improved? Nokia white paper (2005)
4. Broadband Forum: Triple-play Services Quality of Experience (QoE) requirements. Technical report, TR-126 (December 2006)
5. International Telecommunication Union – Telecommunication Standardization Sector: Definitions of terms related to quality of service. ITU-T Rec. E.800 (November 1988)
6. International Telecommunication Union – Telecommunication Standardization Sector: Definitions of terms related to quality of service. ITU-T Rec. E.800 (November 1994)
7. International Telecommunication Union – Telecommunication Standardization Sector: General Aspects of Quality of Service and Network Performance in Digital Networks, including ISDNs. ITU-T Rec. I.350 (March 1993)
8. Vuckovic, P.M., Stefanovic, N.S.: Quality of Experience of mobile services. In: The 14th Telecommunications Forum 2006, Belgrade, Serbia, November 21-23 (2006)
9. Zepernick, H.J., Fiedler, M., Lundberg, L., Pettersson, M.I., Arlos, P.: Quality of Experience Based Cross-Layer Design of Mobile Video Systems. In: The 18th ITC Specialist Seminar on Quality of Experience, Karlskrona, Sweden, May 29-30 (2008)
10. Kim, D.W., Lim, H.M., Yoo, J.H., Kim, S.H.: VOC Based Key Quality Indicator for High-Speed Internet Service. In: The Third International Conference on Internet Monitoring and Protection (ICIMP 2008), Bucharest, Romania, June 29-July 5 (2008)
11. Center for Quality of Management Journal 4(2) (Fall 1993)
12. Baek, S.I., Paik, S.K., Yoo, W.S.: Understanding Key Attributes in Mobile Service: Kano Model Approach. In: Proceedings of the Symposium on Human Interface and the Management of Information. Information and Interaction. Part II: Held as part of HCI International 2009, San Diego, Calif., USA (2009)
13. Chaudha, A., Jain, R., Singh, A.R., Mishra, P.K.: Integration of Kano's Model into quality function deployment (QFD). Int. J. Adv. Manuf. Technol. 53, 689–698 (2011); doi: 10.1007/s00170-010-2867-0
14. Soldani, D., Li, M., Cuny, R. (eds.): QoS and QoE Management in UMTS Cellular Systems. John Wiley & Sons, Chichester (2006)
Investigation of Quality of Experience for 3D Streams in Gigabit Passive Optical Network Ivett Kulik and Tuan Anh Trinh* Budapest University of Technology and Economics Department of Telecommunications and Media Informatics Budapest, Hungary {kulik,trinh}@tmit.bme.hu
Abstract. The Gigabit Passive Optical Network (GPON), which provides a capacity boost in both total bandwidth and bandwidth efficiency, is a promising technology for transport networks that can carry bandwidth-intensive 3D video stream-based applications. However, very few empirical results are known about the user-perceived quality of such services in GPON-based networks. In this paper, we tackle this challenge by carrying out real GPON-based network measurements focusing on the Quality of Experience (QoE) of 3D streams and by engaging real users in our tests of user-perceived service quality. Our results show that, in most of the investigated scenarios, GPON is suitable for efficient transport of 3D multimedia content, and that the QoE of 3D video streaming depends on the network-induced QoS. Keywords: GPON-based network, 3D multimedia content, stereoscopic visualization, quality of experience (QoE) investigation, subjective evaluation.
1 Introduction

The exponential increase in multimedia stream transmission, and especially the appearance of three-dimensional content services, has become a challenging issue for Internet Service Providers. A provider needs to be able to observe and react quickly to QoS problems such as packet loss, delay, and jitter, but QoE is becoming important as well. QoE metrics are customer-centric while QoS metrics are network-centric. Human perception of video is best characterized in terms of QoE, which looks at the streaming content from the standpoint of the end user. Higher-quality 3D production and stereo shooting can improve the overall quality as well [7], [12]. The Future 3D Media Internet is a significant part of current research and should be designed to overcome present limitations involving network architecture, content and service mobility, new forms of 3D content provisioning, etc. [1], [4]. The assessment of QoE in multimedia services can be performed by either subjective or
The research was supported by NKTH-OTKA grant CNK77802.
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 157–168, 2011. © Springer-Verlag Berlin Heidelberg 2011
158
I. Kulik and T.A. Trinh
objective methodologies. Subjective tests are carried out with real users, and a large number of users is needed for statistically relevant results. Objective tests are carried out by an algorithm acting on behalf of a real user, trying to predict user perception from key properties of the reference or the outcome [2]. Obviously, network level QoS parameters affect user level QoE parameters.

This paper presents a subjective QoE assessment of 3D stereoscopic video streams with three types of video codecs (XviD, MPEG4 AVC and WMV9) over a Gigabit Passive Optical Network (GPON) transport network in a laboratory setting. Both the reference and the outcome were available and could be compared. QoE estimation of stereoscopic video streams has been carried out before and results are available in previous publications [5], [8]; quality of presentation has become important on mobile devices as well [10].

We carried out experiments based on subjective testing of 3D video streams in which fifty participants could observe QoE changes due to degradation of QoS parameters such as jitter and bandwidth limitation. The GPON transport network proved suitable for efficient transport of 3D multimedia content. These tests were time consuming because only four personal computers were available for watching 3D videos simultaneously. The 50 users watched 3D content under various network conditions and scored its quality on the Mean Opinion Score (MOS) scale. Finally, the gathered information was evaluated and compared.

The paper is structured as follows. In Section 2 we explain the GPON-based network architecture. Section 3 describes the measurement scenarios and presents the evaluation and comparison of the results. Section 4 concludes the paper with a summary and points out directions for future work.
2 The GPON-Based Transport Network

Before testing the 3D multimedia streams, we planned and realized the appropriate transport network. We needed broadband, reliable access to the multimedia server, and the 3D video content had to be transferred in unicast mode to the clients, because at that time the Nvidia Vision Player could only play stereoscopic 3D streams delivered in unicast mode over TCP. The type of encoding and compression affects the bandwidth demand of multimedia content transport, which could be between 10 Mbit/s and 20 Mbit/s per stream or more. The GPON-based transport network in our laboratory was sufficient, with 2.5 Gbit/s download speed and 1.5 Gbit/s upload speed [6]. The whole GPON-based network architecture is shown in Figure 1. The transmission network consists of four components: an Optical Line Terminal (OLT) on the provider side, Optical Network Terminals (ONT) on the customer side, optical cables connecting them, and passive splitters that can split optical signals in split ratios of 1:2 and 1:4. The OLT and ONT devices are managed by the Siemens EM-PX manager client.
Investigation of Quality of Experience for 3D Streams in GPON
Fig. 1. The GPON-based network for 3D video streams investigation
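The capacity argument above (10–20 Mbit/s per stream against a shared 2.5 Gbit/s downstream) can be checked with back-of-the-envelope arithmetic. The 10% protocol-overhead figure below is an assumption for illustration, not a value from the measurements.

```python
# Rough, illustrative check: how many unicast 3D streams fit behind
# one splitter stage of the shared GPON downstream?

DOWNSTREAM_MBPS = 2500   # shared GPON downstream, as in the testbed
STREAM_MBPS = 20         # upper end of the 10-20 Mbit/s per-stream demand

def streams_per_ont(split_ratio, overhead=0.1):
    """Usable streams per ONT if the downstream is divided evenly
    across a 1:N split, minus a hypothetical 10% protocol overhead."""
    usable = DOWNSTREAM_MBPS * (1 - overhead) / split_ratio
    return int(usable // STREAM_MBPS)

print(streams_per_ont(4))   # 1:4 split -> 28 streams
```

Even under these pessimistic assumptions, a 1:4 split leaves ample headroom for the four simultaneous client PCs used in the experiments.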
The video server is responsible for storing and sharing the 3D videos, which is provided by the VLC program. The hardware configuration of the client for 3D presentation is shown in Table 1 and the hardware configuration of the multimedia server is shown in Table 2.

Table 1. The hardware configuration of the client for 3D video presentation

Component | Details | Notes
Processor | Intel Core 2 Quad Q8300, 2.5 GHz | Needs at least Intel Core 2 Duo or AMD Athlon X2
Video card | NVIDIA GeForce GT 240 | Needs an 8-series, 9-series or 200-series NVIDIA video card
Memory | 4 GB RAM |
Spectacles | Nvidia 3D Vision |

Table 2. The hardware configuration of the multimedia server

Component | Details
Motherboard | Asus P5B Deluxe
Processor | Intel Core 2 Duo, 2.13 GHz
Memory | 1 GB RAM
Operating system | Debian GNU/Linux, 2.6.26 kernel
The WANulator software simulates different Internet conditions such as delay, jitter or packet loss. It was used to simulate QoS degradation in the network.
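What a shaper like WANulator does for jitter can be illustrated with a toy model (all parameters here are illustrative, not WANulator settings): each packet's delivery time is perturbed by a random delay component, and the receiver observes the resulting spread of inter-arrival times.

```python
# Toy jitter model: departures every interval_ms; each packet gets
# mean_delay_ms plus a uniform random jitter component.
import random

def jittered_arrivals(n_packets, interval_ms, mean_delay_ms, jitter_ms, seed=1):
    rng = random.Random(seed)
    arrivals = []
    for k in range(n_packets):
        send_time = k * interval_ms
        delay = mean_delay_ms + rng.uniform(-jitter_ms, jitter_ms)
        arrivals.append(send_time + delay)
    return sorted(arrivals)   # receiver sees packets in arrival order

arr = jittered_arrivals(1000, 10, 50, 90)
gaps = [b - a for a, b in zip(arr, arr[1:])]
mean_gap = sum(gaps) / len(gaps)   # stays near the 10 ms send interval
```

Note that when the jitter amplitude (here 90 ms) exceeds the send interval, packets arrive heavily reordered even though the mean inter-arrival time is unchanged, which is exactly what stresses the player's buffering.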
3 Subjective Testing of 3D Video-Streams

In three-dimensional multimedia, QoE can be affected by many factors; both network features and multimedia features are important. Network features refers to all QoS metrics involved in multimedia transmission over IP, such as packet loss, delay, jitter, reordering, and bandwidth limitations. Multimedia features include all higher-level parameters such as coding, bit rate, frame rate, and the motion level of the video sequences [11]. The main criterion for selecting the 3D stereoscopic contents was to cover different video codecs. Features of the chosen videos are shown in Table 3.

Table 3. Features of the investigated 3D videos

Title | Coyote Falls | Another afternoon | Nürburgring | Avatar
Video codec | XviD ISO MPEG-4 | H.264/MPEG-4 AVC | WMPv9 (VC-1 Simple/Main) | WMPv9 (VC-1 Simple/Main)
Audio codec | MPEG-1 Layer 3 | MPEG-4 AAC LC | WMAv2 | WMAv2
Container format | avi | mp4 | wmv | wmv
Length (mm:ss) | 03:02 | 04:35 | 02:24 | 03:32
Resolution | 3840*1080 | 1920*1080 | 1920*1080 | 1280*720
Video bitrate (kb/s) | 9200 | 11606 | 28038 | 9646
Audio bitrate (kb/s) | 112 | 112 | 192 | 192
Stereoscopic imaging is a technique capable of recording 3D visual information or creating the illusion of depth. Most 3D compression schemes apply two-dimensional compression techniques and also take theories of binocular suppression into account [3],[9]. The common practice for estimating user perception from network-level performance criteria is to conduct large experiments in a controlled environment. The Mean Opinion Score (MOS) quality scale, commonly applied to voice and video traffic, was used (shown in Table 4). 50 users (43 men, 7 women, 19 of them bespectacled, average age 22.4) attended our experiment and watched 3 short videos with various content and various 3D presentation quality. A smaller group of 33 users (26 men, 7
Table 4. MOS quality scale

Score | Sequence quality
5 | Excellent
4 | Good
3 | Regular
2 | Bad
1 | Awful
women, 11 of them bespectacled, average age 23) watched the trailer of Avatar (see Table 3 for the video features). Participants answered 5 questions by means of the MOS. When we drew up the questions we took into consideration video characteristics such as continuity, blurriness, conformity between picture and sound, and the quality of the 3D presentation. The questions were:
1. What was the quality like on the whole?
2. Rate the continuity of the video content.
3. Rate the quality of the picture. Did you notice disintegration of the picture?
4. How did you feel about the conformity between the picture and the voice?
5. How did you assess the 3D experience on the whole?
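The MOS evaluation described above reduces to simple averaging of the 1–5 ratings, first per participant and then per test condition. The ratings below are invented for illustration.

```python
# Sketch: averaging per-question ratings into a MOS per test condition.
# The ratings are illustrative, not the collected data.

def mos(scores):
    """Mean Opinion Score: arithmetic mean of 1-5 ratings."""
    return sum(scores) / len(scores)

# Each inner list: one participant's answers to the five questions.
ratings = [[5, 4, 5, 4, 5], [4, 4, 4, 5, 4], [5, 5, 4, 4, 4]]
per_user = [mos(r) for r in ratings]   # MOS per participant
condition_mos = mos(per_user)          # MOS for the test condition
```

Averaging per participant first keeps each user's contribution equal even if some participants skip a question.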
At first the streams were tested without QoS degradation. The 3D videos were delivered to the clients in unicast mode over TCP, since at that time the Nvidia 3D Vision Video Player could only play 3D streams delivered this way. We therefore could not examine the impact of packet loss, but we investigated the effect of various jitter values and bandwidth limitations on the presentation of the 3D streams. We then carried out tests with different start-up settings (shown in Table 5).

Table 5. Jitter and bandwidth threshold settings for tests

Title | Coyote Falls | Another afternoon | Nürburgring | Avatar
Video codec | XviD ISO MPEG-4 | H.264/MPEG-4 AVC | WMPv9 (VC-1 Simple/Main) | WMPv9 (VC-1 Simple/Main)
Test 1 | without degradation | without degradation | without degradation | without degradation
Test 2 | Jitter 10000 pkt, 1000 burst, 90 ms | Jitter 10000 pkt, 750 burst, 90 ms | Jitter 10000 pkt, 50 burst, 90 ms | Jitter 10000 pkt, 300 burst, 90 ms
Test 3 | Test 2 + 1.2 MB/s down-link limit | Test 2 + 1.4 MB/s down-link limit | Test 2 + 3.4 MB/s down-link limit | Test 2 + 1.7 MB/s down-link limit
Our goal was to achieve perceptible QoE degradation for the video contents through different jitter settings and bandwidth limitations. The WANulator traffic shaper for jitter variation was located between the media server and the clients, and the bandwidth limits were set by Netlimiter on the clients.

Scenario 1 - Reference tests

First we carried out reference tests in which participants watched the videos without any QoS disturbance. The quality of the presentations was commonly excellent or good, but the answers also depended on the content. The favorite video was the animation „Coyote Falls”, independently of its presentation quality. The short video about car racing, „Nürburgring”, with the highest resolution and excellent 3D quality, would also have received an excellent rating, but with the highest video bitrate (28038 kb/s) it had
worse sound quality, and this degraded the perceived quality. Therefore another 3D video was also investigated: the short „Avatar” trailer with the same WMPv9 (VC-1 Simple/Main) video codec but a lower video bitrate (9646 kb/s) and a simpler, weaker 3D visualization.

Reference test results

„Coyote Falls” (XviD ISO MPEG-4) – the favorite content, with high QoE values and an average of 4.77.
„Another afternoon” (H.264/MPEG-4 AVC) – an amateur recording about skateboarding with less interesting content. The reference test results were between excellent and good, with an average of 4.38.
„Nürburgring” (WMPv9, VC-1 Simple/Main) – car racing with interesting content. The colour, contrast and depth perception were excellent, but the reference test values were lower, between good and regular, due to the jerky sound. The average QoE value was 3.98.
„Avatar” (WMPv9, VC-1 Simple/Main) – a film trailer with weaker 3D quality. The overall ratings were between excellent and good, but the 3D experience was rated lower (average 3.3), so the average overall QoE was only 3.85.

Scenario 2 – Tests with degradation of jitter

When users watch a high-resolution video stream via UDP, packet loss has a strong negative impact on the perceived quality and produces blurring, jerkiness or even freezing [2]. In our case the video streams were delivered via TCP, so we could not investigate packet loss properly. We carried out measurements with 90 ms delay variation and various burst characteristics for the different videos (shown in Table 5). Later we raised the jitter values to 150, 220, 300 and 500 ms. Stalling and jerkiness of the video content appeared at different jitter values for each stream. The average QoE values were usually between good and regular up to 90 ms jitter.
„Coyote Falls” (XviD ISO MPEG-4) – the test results were between good and regular up to 90 ms jitter, and the average QoE value for the whole test was 3.4.
Fig. 2. QoE based on jitter disturbance (XviD ISO MPEG-4)
The QoE was regular up to 150 ms jitter, but beyond 150 ms the quality of the presentation deteriorated significantly, the content became jerky, and beyond 300 ms freezing appeared and the picture was totally unenjoyable (shown in Figure 2). „Another afternoon” (H.264/MPEG-4 AVC) – the test results up to 90 ms jitter were better than for „Coyote Falls”, and the average QoE value for the whole test was 3.802.
Fig. 3. QoE based on jitter disturbance (MPEG4 AVC)
Although the video was shot with only two simple cameras and the content was not especially interesting, participants rated its QoE highly. When the jitter grew beyond 90 ms, the QoE degradation was barely perceptible and the values stagnated around good quality, because the picture was still enjoyable (shown in Figure 3). „Nürburgring” (WMPv9, VC-1 Simple/Main) – the test results were worse than for „Coyote Falls” or „Another afternoon” due to the high resolution, very good 3D quality and high motion level of the video sequences, but still around regular values up to 90 ms jitter. The average QoE value was only 2.758 for the whole test. When the jitter grew beyond 90 ms, the quality of the presentation decreased rapidly and became jerky; from 300 ms jitter, picture freezing started and the quality turned bad and then awful (shown in Figure 4).
Fig. 4. QoE based on jitter disturbance (WMVv9 VC-1) - Nürburgring
„Avatar” (WMPv9, VC-1 Simple/Main) – the test results were between good and regular up to 90 ms jitter, and the average QoE value for the whole test was 2.88.
Fig. 5. QoE based on jitter disturbance (WMVv9 VC-1) - Avatar
Because the 3D experience was weaker in this case, the overall QoE assessment was commonly regular up to 150 ms jitter. When the jitter grew beyond 220 ms, the quality of the presentation decreased, the picture became jerky, and the overall QoE was between regular and bad (shown in Figure 5).

Scenario 3 – Tests with degradation of jitter + bandwidth limitation

In scenario 3, jitter degradation was combined with bandwidth limitation using the Netlimiter software. The limits were calculated from the video bitrates of the 3D contents. These limitations had a worse effect on the QoE than jitter degradation alone. We also carried out some measurements with bandwidth limitation only, but its combination with delay or jitter is more common in real life, so we ran these tests with jitter plus bandwidth limitation. The bandwidth limits were calculated from the average bandwidth demand of the video streams and were set to approximately 95% of the average bandwidth values (shown in Table 6).

Table 6. Average bandwidth and bandwidth limitation for tests

Title | Coyote Falls | Another afternoon | Nürburgring | Avatar
Video codec | XviD ISO MPEG-4 | H.264/MPEG-4 AVC | WMPv9 (VC-1 Simple/Main) | WMPv9 (VC-1 Simple/Main)
Average bandwidth | 1.3 MB/s | 1.5 MB/s | 3.6 MB/s | 1.8 MB/s
Bandwidth limitation | 1.2 MB/s | 1.4 MB/s | 3.4 MB/s | 1.7 MB/s
„Coyote Falls” (XviD ISO MPEG-4) – the test results decreased rapidly, and the average QoE was only regular with 90 ms jitter and a 1.2 MB/s bandwidth limit. The average QoE value was only 2.12 for the whole test.
Fig. 6. QoE based on jitter and bandwidth limitation disturbance (XviD)
In this case the jerkiness of the picture was perceptible from 90 ms jitter combined with the 1.2 MB/s bandwidth limit, and freezing came up later. This type of QoS degradation shows an exponential relation between the QoE and QoS parameters (shown in Figure 6). „Another afternoon” (H.264/MPEG-4 AVC) – the test results decreased rapidly, and the average QoE was only between regular and bad with 90 ms jitter and a 1.4 MB/s bandwidth limit. The average QoE value was only 2.18 for the whole test. The QoE was not sufficient during this test: the jerkiness of the picture was strong under the 90 ms jitter and 1.4 MB/s limit, and freezing came up beyond 100 ms jitter. This type of QoS degradation also shows an exponential relation between the QoE and QoS parameters (shown in Figure 7).
Fig. 7. QoE based on jitter and bandwidth limit. disturbance (MPEG4)
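An exponential QoE-QoS relation of the kind noted above, MOS ≈ a·exp(−b·jitter), can be fitted to (jitter, MOS) samples by least squares on the logarithm. The sample points below are illustrative, not the measured data of these tests.

```python
# Sketch: log-linear least-squares fit of MOS = a * exp(-b * jitter).
# The (jitter, MOS) pairs are illustrative placeholders.
import math

def fit_exponential(xs, ys):
    """Return (a, b) minimizing squared error of ln(y) = ln(a) - b*x."""
    n = len(xs)
    lys = [math.log(y) for y in ys]
    mx, my = sum(xs) / n, sum(lys) / n
    slope = (sum((x - mx) * (ly - my) for x, ly in zip(xs, lys))
             / sum((x - mx) ** 2 for x in xs))
    b = -slope
    a = math.exp(my + b * mx)
    return a, b

jitter_ms = [0, 90, 150, 220, 300]
mos_vals = [4.5, 3.4, 2.9, 2.2, 1.6]
a, b = fit_exponential(jitter_ms, mos_vals)

def predict(x):
    return a * math.exp(-b * x)
```

Fitting on ln(MOS) turns the exponential into a straight line, so ordinary least squares suffices without an iterative solver; the fitted curve can then be inverted to find the jitter budget for a target MOS.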
„Nürburgring” (WMPv9, VC-1 Simple/Main) – the test results were very sensitive to the degradation with 90 ms jitter and a 3.4 MB/s bandwidth limit. The average QoE value was only 2.2 for the whole test.
Strong jerkiness of the picture appeared under the 90 ms jitter and 3.4 MB/s bandwidth limit, and freezing came up beyond 100 ms jitter (shown in Figure 8).
Fig. 8. QoE based on jitter and bandwidth limit. (WMVv9) - Nürburgring
„Avatar” (WMPv9, VC-1 Simple/Main) – the test results were almost bad under the 90 ms jitter and 1.7 MB/s bandwidth limit. The average QoE value was only 2.44 for the whole test.
Fig. 9. QoE based on jitter and bandwidth limitation (WMVv9) - Avatar
The picture was strongly jerky with 90 ms jitter and the 1.7 MB/s bandwidth limit, and afterwards the quality value decreased very slowly (shown in Figure 9).

3.1 Comparison of Test Results

The results of the reference 3D video tests were commonly excellent or good. Jitter degradation influenced the QoE of the video streams with different codecs differently. The H.264/MPEG-4 AVC codec was less sensitive, and the XviD ISO MPEG-4 codec also responded acceptably up to 220 ms jitter. The „Nürburgring” trailer with the WMPv9 (VC-1 Simple/Main) codec and the highest 3D quality was the most sensitive: beyond 90 ms jitter its QoE degraded rapidly. Bandwidth limitation had a more destructive influence on every stream; combined with jitter, the presentation became jerky and unenjoyable.
4 Conclusion

In this paper we presented and evaluated measurement results from a GPON-based transport network, focusing on the QoE of 3D stereoscopic video streams. The GPON network, with its capacity, proved suitable for efficient transport of these contents even in unicast mode. The test results demonstrate the QoE of 3D stereoscopic multimedia content as a function of QoS metrics such as jitter and bandwidth limitation. The quality of the 3D presentation, such as depth impression, is influenced by the multimedia features, and dynamic sections of content with a lot of movement are more sensitive to QoS degradation. Future work will address 3D stream delivery in multicast mode over UDP under QoS disturbances, and the comparison of the obtained information with QoE test results for 2D video streams, with the goal of collecting enough data for mathematical modeling of the functional relationship between QoE and QoS metrics for 3D video content.
References

1. Zahariadis, T., Daras, P., Laso-Ballesteros, I.: Future 3D Media Internet. In: Network & Electronic Media Summit, St. Malo, France (October 2008)
2. Casas, P., Belzarena, P., Vaton, S.: End-2-End Evaluation of IP Multimedia Services, a User Perceived Quality of Service Approach. In: 18th ITC Specialist Seminar on Quality of Experience, Karlskrona, Sweden, pp. 13–23 (May 2008)
3. You, J., Xing, L., Perkis, A.: Quality of Visual Experience for 3D Presentation – Stereoscopic Image. In: Mrak, M., Grgic, M., Kunt, M. (eds.) High-Quality Visual Experience. Signals and Communication Technology, ch. 3, vol. I, pp. 51–77 (2010)
4. Kroeker, K.L.: Looking Beyond Stereoscopic 3D's Revival. Communications of the ACM 53(8), 14–16 (2010)
5. Xing, L., You, J., Ebrahimi, T., Perkis, A.: Estimating Quality of Experience on Stereoscopic Images. In: ISPACS 2010 – International Symposium on Intelligent Signal Processing and Communication Systems, Chengdu (December 2010)
6. Cale, I., Salihovic, A., Ivekovic, M.: Gigabit Passive Optical Network – GPON. In: ITI 2007 – International Conference on Information Technology Interfaces, Cavtat, Croatia, pp. 679–684 (June 2007)
7. Zilly, F., Müller, M., Eisert, P., Kauff, P.: The Stereoscopic Analyzer – An Image-Based Assistance Tool for Stereo Shooting and 3D Production. In: ICIP 2010 – IEEE International Conference, Hong Kong (September 2010)
8. Häkkinen, J., Kawai, T., Takatalo, J., Leisiti, T., Radun, J., Hirsaho, A., Nyman, G.: Measuring Stereoscopic Image Quality Experience with Interpretation Based Quality Methodology. In: IS&T/SPIE's International Symposium on Electronic Imaging, San Jose, California, USA (January 2008)
9. Lambooij, M., Ijsselsteijn, W., Heynderickx, I.: Visual Discomfort in Stereoscopic Displays: A Review. Journal of Imaging Science and Technology 53(3), 030201:1-14 (2009)
10. Shibata, T., Kurihara, S., Kawai, T., Takahashi, T., Shimizu, T., Kawada, R., Ito, A., Häkkinen, J., Takatalo, J., Nyman, G.: Evaluation of stereoscopic image quality for mobile devices using Interpretation Based Quality methodology. In: Proc. SPIE, vol. 7237 (2009)
11. Fiedler, M., Hossfeld, T., Phuoc, T.-G.: A Generic Quantitative Relationship between Quality of Experience and Quality of Service. IEEE Network 24(2), 36–41 (2010)
12. Fort, S.: 2020 3D Media: New Directions in Immersive Entertainment. In: SIGGRAPH 2010 – International Conference and Exhibition on Computer Graphics and Interactive Techniques, Los Angeles, USA (July 2010)
A SystemC-Based Simulation Framework for Energy-Efficiency Evaluation of Embedded Networking Devices*

Daniel Horvath¹,² and Tuan A. Trinh²

¹ Budapest University of Technology and Economics, Inter-University Cooperative Research Centre for Telecommunications and Informatics
² Department of Telecommunications and Mediainformatics
Magyar Tudósok krt. 2, 1117 Budapest, Hungary
{horvathd,trinh}@tmit.bme.hu
Abstract. In this paper, we discuss the design and implementation details of a simulation framework which provides an easy-to-use environment for modeling network devices. The framework comes with a packet processing architecture that can be modified and extended for a particular network device. Moreover, it enables the energy-consumption evaluation of the modeled network hardware through energy management modules which can turn off unused modules. The framework is implemented in the SystemC programming language, which is suitable for describing both hardware and the software running on it. Finally, a use case of an 8-port gigabit switch based on the implemented packet processing framework is evaluated in different network scenarios. Keywords: simulation, SystemC, modeling, energy-efficiency.
1 Introduction

Increasing the processing power and throughput of network devices has been in the spotlight for many years. Nowadays, minimizing energy consumption has emerged as a new requirement, and finding an optimum between processing power and energy consumption is an active research area. Moreover, a design constraint of network devices is not to compromise network speed, while still maintaining scalability without additional complexity. Lowering the energy consumption of embedded devices is highly important to extend battery life or to save power. Embedded devices are becoming ubiquitous: there are embedded devices in TVs and cars, and most of them operate in an always-on manner. The vast majority of these devices do not need to be operational at all times, so introducing sleeping periods saves power and thus lowers expenditures. A
The research leading to these results has received funding from the ARTEMIS Joint Undertaking under grant agreement n° 100029 and from the Hungarian National Office for Research and Technology (NKTH).
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 169–180, 2011. © Springer-Verlag Berlin Heidelberg 2011
Fig. 1. The daily periodicity of the utilization of a 10 Gbit/s link over a one-week timespan at an internet exchange [1]
typical embedded device does not consume much power (e.g., a Linksys WRT54GC consumes less than 4 W with its radio interface operating), but the number of embedded devices is increasing, and the sum of their consumption is significant and growing. Similarly, greening network devices can cut operational expenditures. A backbone network device may consume 9 kW or more [2]. The traffic distribution over time is not uniform, even on backbone links. Fig. 1 displays the typical utilization of a 10 Gbit/s link at an internet exchange over a one-week timespan. As can be seen, the utilization during the night (6-10%) is much lower than the afternoon maximum of 70-80%. During low-traffic periods the power consumption could be lowered, which saves money and makes the operating company more competitive. Furthermore, greening a company has marketing value. To implement such a power-saving solution, a general model for networking devices should be designed first. Then a development environment is needed to automate repetitive tasks and enable faster development. This paper focuses on demonstrating the capabilities of this development framework.

1.1 Background

This paper presents results obtained during the FP7 project Scalopes, which aims, amongst others, at creating low-power embedded platforms. The focus of this paper is on the design and implementation issues of a packet processing framework with power-controlling capabilities. This work is partially based on the packet processing framework already described by our team in [3]. The packet processing is implemented in SystemC [4]. The basic idea of the packet processing pipeline is adapted to conform to the C-board, an FPGA-based hardware platform developed by AITIA International Inc. during the Scalopes project. There are also simulators that use SystemC to describe the model, such as ReSP [5], which was developed at the Politecnico di Milano.
ReSP focuses on reconfiguration issues on FPGAs but does not deal with energy-efficiency aspects. The simulation and modeling of networked embedded devices is also investigated in [6]; the solution described in that paper applies SystemC as well. Moreover, power consumption optimization in embedded devices is examined in, e.g., [7] and [8].
This paper is structured as follows: Section 2 deals with the design decisions concerning the NAD framework. Section 3 discusses the changes to the packet processing pipeline. Section 4 describes an evaluation scenario of an eight-port switch developed with the NAD framework. Finally, Section 5 summarizes the paper.
2 The Development of the NAD Framework

An environment to support creating models of network devices is necessary. We named this tool the NAD framework; the acronym stands for Network Application Development. The choice of the programming language for developing the NAD framework was dominated by the need for flawless portability across operating systems; therefore, the Java language was chosen. The NAD framework aims to provide a convenient environment to model, compose and evaluate network devices. The modules of a hardware model in the NAD framework are stored in the IP-XACT [9] format and can be exported to enhance portability.
Fig. 2. The graphical user interface of the NAD framework implements the usual layout of developer applications
The NAD framework has two composers: a SystemC composer and a VHDL composer. The user can add SystemC code and VHDL code to the same module. The SystemC code is used for the evaluation of the hardware-software complex, while the output of the VHDL composer can be uploaded to the C-board. The equivalence check of the SystemC and VHDL code is not covered by the NAD tool and is therefore the responsibility of the user. The NAD framework also provides a graphical user interface where the user can compose pre-existing and user-created modules into a network device. A sample screenshot of the NAD framework is pictured in Fig. 2. The graphical user interface
of the NAD framework is divided into four main parts. The left pane shows the available modules. The bottom pane shows status information which supports the development of the NAD framework itself. The central pane shows the working area, where the user can connect the modules. The tabs above the central pane are used to select the layer in which the user would like to work; further tabs allow the user to modify the settings of the module in focus. The right pane shows the configuration of the module in focus. The user of the NAD framework can add new modules and modify or configure existing ones. Adding new modules is convenient because the user can define the interfaces of a module and use the feature which creates stub files. The SystemC stub files are filled with the necessary include macros and the code of the declaration and definition of the module and its ports; port names are generated in a consistent way. The user can then add any kind of code to these files, and the VHDL stub files can be extended with proprietary code as well. The user needs to create only the simple modules; the NAD framework composes the complex modules and the top-level file, including the wiring. A highlighted feature of the NAD framework is the ability to create complex interfaces instead of connecting wires one by one, which improves the clarity of the view. These complex interfaces can be reused when creating a new module.
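To make the stub-creation step concrete, the following sketch shows how such a generator might emit a SystemC module skeleton from a module name and a port list. This is our own illustration, not the NAD framework's actual generator; all names in it are hypothetical.

```python
def systemc_stub(module, ports):
    """Return the text of a SystemC header stub for `module`.

    `ports` is a list of (direction, type, name) tuples, e.g.
    ("in", "bool", "clk"). The layout below is our own invention; the NAD
    framework's real generator may differ.
    """
    decls = "\n".join(f"    sc_{d}<{t}> {n};" for d, t, n in ports)
    return (
        f"#include <systemc.h>\n\n"
        f"SC_MODULE({module}) {{\n"
        f"{decls}\n\n"
        f"    SC_CTOR({module}) {{\n"
        f"        // user code goes here\n"
        f"    }}\n"
        f"}};\n"
    )

stub = systemc_stub("rx_processing",
                    [("in", "bool", "clk"),
                     ("in", "sc_uint<8>", "data_in"),
                     ("out", "sc_uint<8>", "data_out")])
assert "SC_MODULE(rx_processing)" in stub
assert "sc_in<bool> clk;" in stub
```

The generated skeleton compiles as-is once SystemC is available, which is what lets the user fill in only behavioral code.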
Fig. 3. The packet processing framework is implemented in all FPGAs which have dataplane interfaces
The NAD framework can pass parameters to a module. The SystemC stub file generator also emits the function which handles the parameters; during a simulation run, this function is called immediately after the instantiation of the object which represents the module. The user can work with the modules on different layers of the model by selecting the module inside which to work at the moment; this eliminates the unnecessary outer modules from the view. The design of a network device is accelerated by the coloring scheme applied to modules: for example, the color of a clock module differs from that of an input module. The NAD framework can compile the SystemC code using GCC, run the simulation, and finally create statistics and plots based on the simulation output. The NAD framework allows cycle-accurate simulation of the hardware. Such a simulation run is time-consuming; therefore this tool is not suitable for very large scale projects. An alternative would have been the TLM technology, but that
would have ruled out the possibility of an automated equivalence check between the SystemC and VHDL modules.
3 Enhancing the Packet Processing Framework

The packet processing framework is enhanced in four ways. (a) The interfaces of the modules are modified to fit the hardware interfaces of the C-board. (b) The packets are stored in an external DDR2 RAM instead of the internal Block RAMs of the FPGA. (c) The framework is extended with the functionality of working on four FPGAs and using distributed packet buffering. (d) The energy management modules are decoupled to form a uniform interface on each module.

3.1 The C-Board Hardware Platform

The C-board is an FPGA-based platform equipped with four high-performance FPGAs. The C-board is designed to be used flexibly, for example as a network monitoring device or as a switch. The FPGAs are connected in a bi-directional ring topology. Three of the FPGAs have the same functionality because they have the same interfaces. The fourth one has different interfaces, for example a PCI Express port, through which an onboard PC is attached. This PC performs the upper-level processing and provides a human interface to control the software and hardware of the device.
Fig. 4. The power management subsystem
Each FPGA has a DDR2 interface. Three of these DDR2 RAMs serve as the storage for the packets that are waiting for transmission; the DDR2 memories multiply the packet storage capacity compared to using only Block RAMs inside the FPGA. The fourth DDR2 RAM holds the forwarding table if the application requires one.

3.2 The Packet Processing Framework

The simplified architecture of one of the three FPGAs whose main function is packet transmission can be seen in Fig. 3. The receiver (RX) and transmitter (TX) blocks represent the same interface outside the FPGA; the receiver and transmitter functions are decoupled here only to enhance visualization, as they reside in the same die. The RX processing module does the ingress filtering and adds an internal header
to each packet. The TX processing module does queuing and egress filtering and removes the internal header. The memory manager handles the DDR2 memory and creates lookup requests. The memory manager sends the lookup requests to the lookup module, which can either answer them based on a distributed forwarding table or forward them to a central lookup module. Our recommendation is that the lookup should be performed on the fourth FPGA, and the lookup modules of the other FPGAs should only cache the lookup information: the lookup may need high bandwidth to the memory, which would conflict with the memory manager if placed on the same FPGA. The communication between the FPGAs is done over a RocketIO [10] interface. Since the FPGAs are organized in a ring topology, the managers of these interfaces are named ring managers. Modules which would like to communicate with modules in another FPGA must send their messages to the router module. The router module adds a ring header to each message and fills in its entries.
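The router's job can be sketched as follows: wrap a message in a ring header and, at each hop, either deliver it locally or forward it to the next FPGA. The header fields are our own illustration, not the C-board's actual layout.

```python
RING_SIZE = 4  # four FPGAs on the bi-directional ring

def add_ring_header(payload, src, dst):
    # the router fills in the ring header entries before injecting the message
    return {"src": src, "dst": dst, "ttl": RING_SIZE, "payload": payload}

def ring_step(msg, here):
    """At FPGA `here`, deliver the message locally or forward it on the ring."""
    if msg["dst"] == here:
        return ("deliver", msg["payload"])
    msg["ttl"] -= 1
    if msg["ttl"] == 0:
        return ("drop", None)            # guard against misrouted messages
    return ("forward", msg)

m = add_ring_header("lookup-request", src=1, dst=3)
assert ring_step(m, 2)[0] == "forward"   # FPGA 2 passes it along
assert ring_step(m, 3)[0] == "deliver"   # FPGA 3 is the destination
```

A TTL-style hop counter is not mentioned in the text; it is added here only as a conventional safeguard on ring topologies.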
Fig. 5. The inner structure of the switch FPGA resembles to the packet processing framework of Figure 3
The ingress packets are buffered in three different FPGAs. When the memory manager module receives a message commanding it to send out a packet identified by an ID, it checks the ID to decide whether the packet is stored locally or on another FPGA. If the packet is stored locally, the memory manager reads it out from the memory and passes it on to the TX processing module, from where it leaves the device. If the packet is stored on another FPGA, it sends a request to the memory manager of the FPGA where the packet is stored, which replies with the packet. This solution is developed to enable multi-destination forwarding without surplus copying of the packets, which could create a bottleneck at the memory.
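The local-or-remote decision above can be sketched compactly. We assume, purely for illustration, that the packet ID encodes the buffering FPGA in its upper bits; the paper does not specify the actual ID format.

```python
ID_BITS, FPGA_BITS = 32, 2

def owner_of(packet_id):
    return packet_id >> (ID_BITS - FPGA_BITS)    # upper bits: buffering FPGA

def fetch_packet(packet_id, local_fpga, local_memory, request_remote):
    """Serve a 'send out packet <ID>' command on FPGA `local_fpga`."""
    if owner_of(packet_id) == local_fpga:
        return local_memory[packet_id]           # local: read out directly
    # remote: ask the owner FPGA's memory manager for the packet
    return request_remote(owner_of(packet_id), packet_id)

local_id = (1 << 30) | 7                         # owner FPGA 1, sequence 7
assert fetch_packet(local_id, 1, {local_id: b"frame"}, None) == b"frame"
```

Encoding the owner in the ID keeps the decision stateless: no table is needed to locate a buffered packet, which suits a hardware pipeline.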
Fig. 6. The modeling of the switch in the NAD framework for performance evaluation
The configuration and power management subsystem is omitted from Fig. 3 to keep the figure easily understandable. The core module of the configuration subsystem is the configuration manager, to which any module implementing its interface can be connected. The configuration interface implements two-way communication to make it possible to query data from the modules. The configuration manager receives commands from upper-layer entities through the ring and, after acting on them, replies through the router module.

3.3 The Power Management Modules

The power management of the packet processing subsystem is layered: the central power manager controls the power island managers. A power island is a group of modules of the packet processing framework which are handled together in the scope of energy management. An interesting derivative of this project could be the investigation of matching the power islands to the FPGA tiles. The central power manager fetches the triggers from wires directly or through some basic processing by a trigger module. The central power manager is configured to know what to do with each trigger: the triggers inform the central power manager about an event and, as a consequence, the central power manager powers a power island off or on. The power island manager manages the modules in a power island by powering them off and back on. The power island manager can also be configured through the configuration interface, making it easy to fine-tune during development. For example, a pipeline can be powered off sequentially to avoid losing the packets still in the pipeline. In a scenario that applies microprocessor-based modules on the FPGA, the power manager can use the configuration interface to set the speed of the hardware, which scales the frequency and voltage.
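The two-level scheme above can be sketched as follows, under assumed semantics: the central power manager maps configured triggers to power islands, and a power island manager switches its whole module group off or on. Class and trigger names are invented for illustration.

```python
class PowerIslandManager:
    def __init__(self, modules):
        self.modules = modules           # modules handled as one group
        self.powered = True

    def set_power(self, on):
        self.powered = on
        for m in self.modules:
            m["on"] = on                 # power each module off/on together

class CentralPowerManager:
    def __init__(self):
        self.rules = {}                  # trigger -> (island, target state)

    def configure(self, trigger, island, on):
        self.rules[trigger] = (island, on)

    def fire(self, trigger):
        island, on = self.rules[trigger]
        island.set_power(on)

rx, tx = {"on": True}, {"on": True}
island = PowerIslandManager([rx, tx])
cpm = CentralPowerManager()
cpm.configure("link-idle", island, False)
cpm.configure("link-up", island, True)
cpm.fire("link-idle")
assert not rx["on"] and not tx["on"]     # whole island powered off
cpm.fire("link-up")
assert rx["on"] and tx["on"]
```

Keeping the trigger-to-island mapping in a configurable table is what lets the behavior be tuned at run time through the configuration interface, as described above.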
Fig. 7. The resulting plots after a simulation. The top plot shows the average delay in microseconds. The middle one shows the practical jitter in microseconds. The bottom one shows the throughput of the backplane in MB/s.
Algorithms can be assigned to the power managers that delay powering off a power island or module. Keeping a module awake can be advantageous when the module would only sleep for a very short time, because power consumption rises while waking up. The measurement of power consumption is distributed among the modules: a compound module can handle the power consumption modeling itself or delegate it to its submodules. The power consumption values can be fetched via the configuration manager module, which queries a value that is then reset to zero, so the managing software has to keep track of the data and the timing. The modules which have power modeling capability may use measured values coded into the SystemC, extrapolate from measured values, or use a purely theoretical model.
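The delay heuristic above amounts to a break-even test: powering a module off only pays if the expected idle period exceeds the time needed to amortize the wake-up energy. The following is a minimal sketch with invented numbers, not the framework's actual algorithm.

```python
def break_even_s(p_active_w, p_sleep_w, wake_energy_j):
    """Idle time above which sleeping saves energy overall."""
    return wake_energy_j / (p_active_w - p_sleep_w)

def should_power_off(expected_idle_s, p_active_w, p_sleep_w, wake_energy_j):
    return expected_idle_s > break_even_s(p_active_w, p_sleep_w, wake_energy_j)

# e.g. 2 W active, 0.2 W asleep, 0.9 J spent on waking -> break-even 0.5 s
assert should_power_off(1.0, 2.0, 0.2, 0.9)      # long idle: worth sleeping
assert not should_power_off(0.3, 2.0, 0.2, 0.9)  # short idle: stay awake
```

The power managers would feed such a test with a prediction of the next idle interval (e.g., from recent traffic), which is where the assignable algorithms come in.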
4 Evaluation Scenario

To evaluate the enhancements made to the packet processing framework, we implemented a Gigabit Ethernet switch with eight ports; the backplane can therefore handle slightly more than 8 Gb/s. For the sake of simplicity, this switch is implemented using only one FPGA, whose inner structure is shown in Fig. 5. The environment of the switch in the NAD framework is depicted in Fig. 6. On the left-hand side, eight traffic generator modules are connected to the C-board in the center. On the right, eight sink modules and an additional eight probe modules are connected to the C-board. These modules generate trace files to visualize the changes of the signals and fill the SQLite [11] database which is needed to create statistics. Statistics on packet sizes, delays and packet loss are presented in the NAD framework.
Fig. 8. The enlarged view of the average delay of the traffic evaluation scenario in Fig. 7. This plot features a finer time granularity, which can be tuned in the GUI.
In the following, a simulation scenario is presented. The traffic generators create packets as a Poisson point process with an average link utilization of 10%. The traffic generators are started one by one, with a 10 µs gap between them. The ingress packets are 64 bytes in size and all are flooded, which is the case if address learning is opted out in a switch. This is not realistic, but it is a good performance-analysis setup. The generated plots are shown in Fig. 7. The delay plot is the top one, also shown in Fig. 8. The practical jitter, which is the difference between the largest and the smallest delay, is shown in the middle, and the bottom plot presents the throughput. The plots show that the traffic and the delay reach a level and oscillate around it. This scenario has a stable throughput of 4 Gb/s at 10 µs latency. A trace file is generated in .vcd format; a fragment of the visualized trace file is shown in Fig. 9. The statistics of the simulation are also shown in the NAD framework. Fig. 10 shows a table with the statistics of the lost packets. Only six packets are lost, which stands for 0.1% of the whole traffic and is acceptable.
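The traffic model used by the generators can be sketched in plain Python (the actual generators are SystemC modules): packet departures are drawn as a Poisson process whose rate yields 10% utilization of a 1 Gb/s link with 64-byte packets. Preamble and inter-frame gap are ignored here, which is our simplifying assumption.

```python
import random

LINK_BPS = 1e9
UTILIZATION = 0.10
PKT_BITS = 64 * 8

RATE_PPS = UTILIZATION * LINK_BPS / PKT_BITS     # ~195k packets per second

def departure_times(n, seed=1):
    """Return n departure instants (seconds) of a Poisson point process."""
    random.seed(seed)
    t, out = 0.0, []
    for _ in range(n):
        t += random.expovariate(RATE_PPS)        # exponential inter-arrivals
        out.append(t)
    return out

times = departure_times(10_000)
mean_gap_us = times[-1] / len(times) * 1e6
# the mean gap should be close to 1/RATE_PPS = 5.12 µs
assert abs(mean_gap_us - 5.12) < 0.5
```

Exponential inter-arrival times are the standard way to realize a Poisson point process, which is why a single `expovariate` call per packet suffices.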
Fig. 9. This fragment of the generated trace file shows the start of the transmission of a packet
The measurement of the delay is based on two kinds of testbench modules: the traffic generators and the probes. The traffic generator module hides an ID in the payload of the generated packet; the receiving probe can therefore identify the packet and record it accurately in the SQLite database, from where the statistics can be generated by the NAD framework.
Fig. 10. The statistics of the lost packets with the available statistics on the left pane
Delay measurements were also carried out on the C-board. The shortest experienced delay was about 230 clock cycles, measured between the addition and the removal of the timestamp in the inner header. 230 clock cycles multiplied by 8 ns, the period of a clock cycle, gives 1.84 µs. The simulations in the NAD framework gave a minimum of 121 clock cycles (approx. 0.97 µs) between the same points. The difference of 0.87 µs is the result of the lack of simulation of the address lookup. The simulated end-to-end delay is at least 3.12 µs. Fig. 8 also shows the minimal delay at time 0 µs, since there is no load on the switch at that moment; this value is 3.6 µs, which is not less than the simulated value. To obtain a realistic end-to-end delay, the difference (0.87 µs) should be added to the simulated minimum delay (3.12 µs); therefore the realistic end-to-end minimal delay is approx. 4 µs. Using the NAD framework it is also possible to compose VHDL code, which can be uploaded to the C-board to evaluate real performance. Fig. 11 presents a setup featuring two computers connected through the C-board. This scenario presents a simple application, written in VHDL, running on the C-board, which can be configured using the onboard computer. The computers can communicate with each other until a third computer configures a MAC filter of the C-board. The third computer runs the management software and communicates with the onboard computer of the C-board. The onboard computer accesses the registers of the MAC filter module through the configuration modules. After putting the MAC address of
Fig. 11. The demo setup
either communicating computer on the ban list, their communication is stopped. After the removal of the MAC address from the ban list, they can communicate again.
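The cycle-to-time arithmetic behind the delay measurements above can be reproduced directly; all values are taken from the text, only the rounding is ours.

```python
CLK_NS = 8                                   # one clock cycle on the C-board

measured_us  = 230 * CLK_NS / 1000           # 1.84 µs measured on hardware
simulated_us = 121 * CLK_NS / 1000           # ~0.97 µs in the NAD framework
lookup_us    = measured_us - simulated_us    # ~0.87 µs of address lookup
realistic_min_us = 3.12 + lookup_us          # ~4 µs realistic end-to-end minimum

print(round(lookup_us, 2), round(realistic_min_us, 2))
```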
5 Conclusion

The packet processing framework was modified according to the new constraints raised by the new hardware platform. The NAD framework was developed to simplify SystemC-based development and was extended with a VHDL composer that simplifies the evaluation of the SystemC model. A simulation scenario of an 8-port gigabit switch was implemented in the NAD framework using the packet processing framework. The performance analysis of the scenario verified that the switch operates as expected. The NAD framework provides an appropriate foundation for implementing features that evaluate the power consumption patterns of network devices.
References
1. BIX, Romtelecom 10 Gbit link, time range from 2011.01.09 15:30 to 2011.01.16 15:30, http://bix.hu
2. http://www.cisco.com/en/US/prod/collateral/routers/ps5763/prod_brochure0900aecd800f8118.pdf
3. Horváth, D., Bertalan, I., Moldován, I., Trinh, T.A.: An energy-efficient FPGA-based packet processing framework. In: Aagesen, F.A., Knapskog, S.J. (eds.) EUNICE 2010. LNCS, vol. 6164, pp. 31–40. Springer, Heidelberg (2010)
4. OSCI SystemC 2.2.0 Documentation: User's Guide, Functional Specifications, Language Reference Manual, http://www.systemc.org/
5. ReSP project site, http://home.dei.polimi.it/fossati/resp.html
6. Fummi, F., Perbellini, G., Gallo, P., Poncino, M., Martini, S., Ricciato, F.: A timing-accurate modeling and simulation environment for networked embedded systems. In: Proceedings of the 40th Annual Design Automation Conference (2003)
7. Nathuji, R., Schwan, K.: Reducing system level power consumption for mobile and embedded platforms. In: Beigl, M., Lukowicz, P. (eds.) ARCS 2005. LNCS, vol. 3432, pp. 18–32. Springer, Heidelberg (2005)
8. Jejurikar, R., Gupta, R.: Dynamic Voltage Scaling for Systemwide Energy Minimization in Real-Time Embedded Systems. In: Low Power Electronics and Design (2004)
9. IEEE Std 1685-2009: IEEE standard for IP-XACT, standard structure for packaging, integrating, and reusing IP within tool-flows. IEEE (2009)
10. Virtex-5 FPGA RocketIO GTP Transceiver User Guide (December 3, 2009), http://www.xilinx.com/support/documentation/user_guides/ug196.pdf
11. SQLite documentation, http://www.sqlite.org/
Energy Considerations for a Wireless Multi-homed Environment

German Castignani, Nicolas Montavont, and Alejandro Lampropulos

Institut TELECOM / TELECOM Bretagne / Université Européenne de Bretagne, 2 rue de la Châtaigneraie, 35576 Cesson-Sévigné, France
{german.castignani,nicolas.montavont,alejandro.lampropulos}@telecom-bretagne.eu
Abstract. Internet wireless access technologies are increasingly heterogeneous. The main cause of this heterogeneity is that, up to now, no wireless technology has succeeded in widely conquering the market. Nowadays, different access technologies co-exist in the same environment, allowing multi-homed users to improve their experience by exploiting all available wireless interfaces. However, the energy cost of simultaneously using several wireless interfaces can dramatically drain mobile devices' batteries. In this paper, we review the major work on the energy efficiency of wireless interfaces (i.e., WLAN and 3G), highlighting the differences between the proposed energy models and the hardware power consumption. In addition, we present some energy consumption measurements for different application flows and discuss the integration of energy considerations into a general interface selection algorithm. Keywords: Wireless multi-homing, Energy-aware, Interface Selection.
1 Introduction
With the recent deployment of heterogeneous wireless technologies, users can exploit network diversity by using several wireless interfaces integrated in a single mobile station (MS) to obtain better service coverage, higher throughput and more reliability. Usually, an MS is equipped with a cellular-based interface (e.g., 3G) and an IEEE 802.11 interface (WLAN), and in most places users have access to different networks using both interfaces. In the case of WLAN, access points (AP) can be residential, public hot-spots or community network deployments [3]. In this context, which is called a multi-homed environment, a user is able to choose which network to use to exchange data. Moreover, the user could exploit several interfaces at the same time by simultaneously assigning different application flows to different interfaces, depending on some pre-defined criteria. The main goal when taking multi-homing decisions is to match each application flow with the best available wireless interface. Several techniques exist to solve this interface selection problem. In [10], the authors compare different multi-criteria decision algorithms for interface selection. The decision is taken considering the QoS requirements of the different flows, which are weighted using different
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 181–192, 2011. © Springer-Verlag Berlin Heidelberg 2011
techniques, such as SAW (Simple Additive Weighting) or TOPSIS (Technique for Order Preference by Similarity to Ideal Solution). These algorithms perform a single-objective optimization to determine the best interface for each application flow. In [7], the authors propose a simple energy-aware interface selection technique, which decides to perform a vertical handover (swapping all flows from 3G to WLAN) only if the cost of switching to WLAN (i.e., activating the interface and performing scanning) plus the cost of transmitting the data through WLAN is lower than the cost of continuing to transmit the data through 3G. This approach considers a fixed amount of energy per unit of transported data, which is too simplistic to actually represent the interfaces' energy consumption, as we will show in Section 2. In such a multi-homed scenario there is a performance trade-off, since an energy-aware multi-homing decision should maximize the level of QoS of each application flow while consuming as little energy as possible. To achieve this, the decision mechanism has to consider some inputs. First, to assure a certain level of QoS, prior knowledge about the MS environment is needed: the decision mechanism must gather information about the presence and performance of different access networks and, eventually, the MS position to optimize the network discovery. In addition, to estimate the amount of energy to be consumed by each application flow, the decision mechanism must know the energy consumption of each interface applied to the traffic pattern of the given application. In this paper, we study these very first inputs for the decision making. First, we survey in Section 2 the different aspects of the energy efficiency of wireless interfaces, analyzing the power consumption of different interfaces and presenting some energy models to predict energy consumption.
Then, in Section 3, we present some energy measurements on different MSs, including the impact of transmitting different application flows through different interfaces (WLAN and 3G). We also evaluate the energy impact of the environment discovery and propose some optimizations to achieve more energy-efficient WLAN scanning and GPS location. Finally, in Section 4, we conclude the paper.
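To make the interface selection step mentioned above concrete, the following sketch shows a SAW-style ranking: each interface is scored by a weighted sum of normalized criteria. The criteria, weights and values are invented for illustration and are not taken from the cited study.

```python
def saw_rank(alternatives, weights, benefit):
    """alternatives: {name: {criterion: value}}; returns the best name."""
    crits = list(weights)
    hi = {c: max(a[c] for a in alternatives.values()) for c in crits}
    lo = {c: min(a[c] for a in alternatives.values()) for c in crits}

    def score(a):
        s = 0.0
        for c in crits:
            # larger-is-better criteria normalize as v/max,
            # smaller-is-better (cost) criteria as min/v
            norm = a[c] / hi[c] if benefit[c] else lo[c] / a[c]
            s += weights[c] * norm
        return s

    return max(alternatives, key=lambda n: score(alternatives[n]))

ifaces = {
    "wlan": {"throughput_mbps": 20, "energy_mj_per_mb": 10},
    "3g":   {"throughput_mbps": 2,  "energy_mj_per_mb": 100},
}
best = saw_rank(ifaces,
                weights={"throughput_mbps": 0.5, "energy_mj_per_mb": 0.5},
                benefit={"throughput_mbps": True, "energy_mj_per_mb": False})
assert best == "wlan"   # wlan dominates on both criteria in this toy example
```

In a real decision engine the scoring would run per application flow, with flow-specific weights reflecting its QoS requirements.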
2 Energy Consumption of Wireless Interfaces

2.1 Empirical Studies
The energy efficiency of wireless mobile devices has become an important research issue in recent years. Smartphones, tablets and other battery-powered mobile devices have evolved to integrate more and more functionalities, services and applications that have a direct impact on energy consumption. Among the contributors to the increasing energy consumption are the communication modules, since most of these new functionalities use a wireless interface to access the Internet. Moreover, the growth in energy consumption has not been matched by an increase in battery capacity of the same order of magnitude, which degrades the mobile device's autonomy. One way to increase battery autonomy is to reduce the power consumption of the different hardware components of the mobile device. Another strategy is to optimize the different
mechanisms and protocols in order to reduce the time the MS spends in high-power states. In a multi-homing context, a user may want to use several interfaces simultaneously to benefit from being connected to different networks. Some authors have studied the energy impact of data communications using different wireless interfaces. In [7], the author presents an experimental benchmark of the energy consumption of WLAN and 3G interfaces. The experiment consists in downloading and uploading data through a WLAN AP and a 3G network for different load rates and radio link qualities. By measuring the percentage of remaining battery level over time, they found that energy consumption increases by 18.3% if both WLAN and 3G interfaces are turned on, compared to the case in which only 3G is on. Regarding data communications, both interfaces consume the same amount of energy over time, but a great difference is found when considering the energy consumed per unit of data (MByte). Using the 3G interface, the MS consumes between 0.176% and 1.81% of the battery per MByte, while for WLAN these values are two orders of magnitude lower. Another empirical study was proposed by Xiao et al. [11]. The authors evaluated the energy consumption of watching online YouTube videos using 3G and WLAN under different strategies: online viewing, local viewing (i.e., download-then-play) and upload. They performed tests using a Nokia N95 phone and the Nokia Energy Profiler application (NEP1) to gather the instantaneous power consumption of the phone during the experiment. They found for all strategies that the energy consumed by WLAN is lower than by the 3G interface, which differs from the results presented in [7]. Balasubramanian et al. [2] propose a measurement study for WLAN, 3G and GSM interfaces using a Nokia N95 (running NEP) and an HTC Fuze with a hardware power meter.
They found that using 3G, the MS spends on average 60% of the total energy by remaining in a high-power state for 12 seconds after finishing a transmission (the so-called tail energy). This result is also observed in [9]. GSM behaves in a similar way, but the tail time is smaller than in 3G (up to 6 seconds), giving a reduced tail energy. Moreover, the energy consumed during a GSM transmission is higher than in 3G: due to GSM's low data rates, the transmission time becomes longer. Finally, they find that WLAN consumes less energy during data transmission, but they highlight the high cost of performing WLAN scanning, as we will analyze in Section 3.2.
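The tail-energy effect reported above can be captured in a back-of-the-envelope model: the energy of a transfer is the transfer time at transfer power plus a fixed tail time at tail power. The power and rate figures below are invented; only the structure follows the cited measurements.

```python
def transfer_energy_j(nbytes, rate_bps, p_xfer_w, tail_s, p_tail_w):
    t_xfer = nbytes * 8 / rate_bps           # seconds spent transferring
    return t_xfer * p_xfer_w + tail_s * p_tail_w

# a small 10 kB transfer over a 1 Mb/s 3G link with a 12 s tail:
e = transfer_energy_j(10e3, 1e6, 1.2, 12, 0.8)
tail_fraction = (12 * 0.8) / e
# for short transfers, the tail dominates the total energy
assert tail_fraction > 0.9
```

This is why batching several small transfers into one burst, so that they share a single tail, is a common energy optimization on cellular interfaces.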
2.2 Power States of Wireless Interfaces
Depending on the communication technology, an MS wanting to send or receive data may switch among different power states. A WLAN interface can operate in Continuously Active Mode (CAM) or Power Saving Mode (PSM), as illustrated in Fig. 1.a. In CAM, the MS remains in the IDLE state if no data communication is taking place. In PSM, the MS can switch to a low-power SLEEP state at any moment. In both cases (CAM or PSM), the MS switches between the two
http://www.forum.nokia.com/Library/Tools_and_downloads/Other/ Nokia_Energy_Profiler/Quick_start.xhtml
184
G. Castignani, N. Montavont, and A. Lampropulos
Fig. 1. WLAN and 3G Power States and possible transitions
most power consuming states, RECEIVE and TRANSMIT, when receiving and sending data, respectively. Regarding 3G, transitions between power states are related not only to the amount of data to transmit or receive but also to the expiration of inactivity timers [4]. As illustrated in Fig. 1.b, we find three different states depending on the logical channel the MS is using: DCH (Dedicated Channel), FACH (Forward Access Channel) and PCH (Paging Channel). DCH is a high power consuming state that allows sending and receiving data at high data rates. For lower data rates, a common transport channel (FACH) is used, involving a lower power consumption. Finally, in the low-power PCH state, no RRC (Radio Resource Control) connection is established, and no data can be transferred (only control data). As can be seen from Fig. 1.b, transitions between these states can be performed in different manners, depending on the RRC implementation of each network operator. When an MS has some data to transmit or receive, it can directly enter the DCH state, or enter the FACH state and later switch to DCH if needed. Since dedicated channels (DCH) are a limited resource, the network operator (through its RRC entity) cannot assign them to an MS for an unlimited time. Moreover, to avoid the user performance degradation caused by repeated renegotiations of a DCH, the RRC cannot immediately move the MS down to the FACH. To balance this trade-off, a set of inactivity timers, T1 and T2, is implemented to manage the transition to lower power states. After being idle for T1 in a DCH, the MS can switch to FACH. Then the station can switch back to DCH if the MS demands a high throughput (i.e., higher than a pre-established threshold), or it can switch to PCH after being idle for T2. The values of the timers and the state transition policy deeply impact the MS energy consumption and are fixed by the network operator.
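The DCH/FACH/PCH transitions governed by the inactivity timers T1 and T2 can be sketched as a small state machine. This is a minimal illustration, not an operator implementation; the timer values and the throughput threshold below are assumptions.

```python
# Sketch of the 3G RRC transitions described above (DCH/FACH/PCH).
# Timer values and the FACH->DCH throughput threshold are illustrative
# assumptions, not operator-specific settings.

class RRCStateMachine:
    def __init__(self, t1=5.0, t2=12.0, fach_threshold_kbps=32.0):
        self.state = "PCH"
        self.t1 = t1                       # DCH -> FACH inactivity timer (s)
        self.t2 = t2                       # FACH -> PCH inactivity timer (s)
        self.threshold = fach_threshold_kbps
        self.idle_time = 0.0

    def step(self, dt, demand_kbps):
        """Advance the machine by dt seconds given the current traffic demand."""
        if demand_kbps > 0:
            self.idle_time = 0.0
            # High demand promotes the MS to the dedicated channel;
            # low demand keeps DCH if already there, otherwise uses FACH.
            if demand_kbps > self.threshold or self.state == "DCH":
                self.state = "DCH"
            else:
                self.state = "FACH"
        else:
            self.idle_time += dt
            if self.state == "DCH" and self.idle_time >= self.t1:
                self.state, self.idle_time = "FACH", 0.0
            elif self.state == "FACH" and self.idle_time >= self.t2:
                self.state = "PCH"
        return self.state
```

For example, a burst of traffic drives the machine to DCH; after T1 seconds of silence it drops to FACH, and after a further T2 seconds to PCH.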
Thus, an MS may consume different amounts of energy depending on the 3G access network to which it is connected. Each power state for WLAN or 3G has an associated power consumption. However, these values diverge depending on the specific hardware manufacturer and network operator. In the literature, we can find different power values, as described in Table 1 for WLAN and in Table 2 for 3G. For instance, a WLAN interface in the TRANSMIT state may consume between 1097 mW and 2000 mW.
Energy Considerations for a Wireless Multi-homed Environment
185
Table 1. WLAN states power consumption (mW)

           Nokia N810 [12]  HTC G1 [12]  Nokia N95 [12]  Nokia N90 [6]
SLEEP      42               68           88              40
IDLE       884              650          1038            800
TRANSMIT   1258             1097         1687            2000
RECEIVE    1181             900          1585            900

Table 2. 3G states power consumption (mW)

           Holma [5]   Lampropoulos [6]  Xiao [11]  Qian [8]
PCH        < 18.5      19                282        0
FACH       370-740     555               549        400-460
DCH        740-1480    1100              742        600-600
2.3 Energy Models
While in operating mode, a wireless interface transits among n different power states i, each having a pre-defined power consumption Pi (in Watts). In order to predict how much energy an application flow may consume on a particular interface, one may calculate the amount of time (in seconds) the MS spends in each particular state, ti. Then, the energy consumption C (in Joules) is simply given by Eq. 1.

C = ∑_{i=1}^{n} Pi · ti    (1)
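Eq. 1 can be evaluated directly once the per-state times are known. In the sketch below, the power values follow the Nokia N95 WLAN column of Table 1, while the per-state times are hypothetical.

```python
# Minimal sketch of Eq. (1): total energy as the sum of per-state power
# times time in state. Powers (mW) follow the Nokia N95 column of Table 1;
# the per-state times are hypothetical.

def energy_joules(power_mw, time_s):
    """C = sum_i P_i * t_i, with powers in mW and times in seconds."""
    assert power_mw.keys() == time_s.keys()
    return sum(power_mw[s] * time_s[s] for s in power_mw) / 1000.0

p = {"SLEEP": 88, "IDLE": 1038, "TRANSMIT": 1687, "RECEIVE": 1585}  # mW
t = {"SLEEP": 50.0, "IDLE": 8.0, "TRANSMIT": 1.5, "RECEIVE": 4.0}   # s (assumed)
print(energy_joules(p, t))  # total energy in joules for this assumed profile
```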
As stated in Section 2, Pi mainly depends on the hardware, while ti depends on the particular application flow that has to be transmitted through the interface. Different application flows may spend a different amount of time in each particular state. For instance, a web-browsing flow will certainly spend more time in a low-power state (IDLE or PCH) than a video-streaming flow. Thus an energy model should first estimate the period of time the MS spends in each state. In the following, we review different energy models found in the literature. WLAN. In [1], the authors propose an energy model for WLAN MSs connected to the same AP for two different kinds of TCP traffic: a long file transfer and several short file transfers. In both cases, the authors consider two different scenarios: all stations connected to the AP in CAM, or all of them in PSM. In the case of a long file transfer, the authors model the time spent in each operation state using discrete-time Markov chains. For short file transfers, they focus on the inactivity or think time between downloads and model the system as a Processor Sharing model, considering exponentially distributed think times and file sizes. Their analytical model corroborates ns-2 simulations. As expected, for long files, using PSM is less energy-efficient than using CAM because of the overhead of the PS-POLL frames sent to notify the AP that the MS is entering PSM. On the other
hand, using PSM for short files, one can triple the number of transferred files per unit of energy. Another analytical energy model is proposed by Xiao et al. [12]. In this case, the authors generalize the traffic model by considering packet bursts of size SB and duration TB, separated by an interval TI. Then, the data rate of the traffic flow is r = SB/(TB + TI). Their model considers download and upload traffic. In the case of a download, the MS spends some time in the IDLE state, some time in the RECEIVE state and, if PSM is activated, the MS can also spend some time in the SLEEP state. Then, the energy consumption can be expressed as E = PR · TB + PI · Ttimeout + PS · Tsleep, where Tsleep = TI − Ttimeout (if TI ≥ Ttimeout). The same rule can be applied to upload traffic, but using PT (transmission power) instead of PR (reception power). Based on the latter expression for E, the authors calculate, in Eq. 2, the average download power as a function of the data rate r. Then, one can use this expression to estimate the average power consumption of a flow on a particular interface by knowing the power level of each state (PR, PT, PI and PS), the timeout to enter PSM (Ttimeout) and the characteristics of the flow (TI, TB and SB).

Pd(r) = E/T = (PR · TB + PI · Ttimeout + PS · (TI − Ttimeout)) / (TB + TI)
            = PS + r · ((TB/SB) · (PR − PS) + (Ttimeout/SB) · (PI − PS))    (2)
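A direct transcription of Eq. 2 can be cross-checked against the underlying ratio E/T. The power levels and burst parameters in the sketch below are illustrative assumptions, not the values measured in [12].

```python
# Sketch of Eq. (2): average download power P_d(r). All parameter values
# below are illustrative assumptions.

def avg_download_power(r, s_b, t_b, t_timeout, p_r, p_i, p_s):
    """P_d(r) = P_S + r*((T_B/S_B)*(P_R - P_S) + (T_timeout/S_B)*(P_I - P_S))."""
    return p_s + r * ((t_b / s_b) * (p_r - p_s) +
                      (t_timeout / s_b) * (p_i - p_s))

# Cross-check against E/T for an assumed burst: S_B = 1 MB, T_B = 2 s, T_I = 8 s.
s_b, t_b, t_i, t_timeout = 1e6, 2.0, 8.0, 1.0
p_r, p_i, p_s = 1.2, 0.8, 0.1                      # W, assumed
r = s_b / (t_b + t_i)                              # flow data rate
e = p_r * t_b + p_i * t_timeout + p_s * (t_i - t_timeout)
assert abs(avg_download_power(r, s_b, t_b, t_timeout, p_r, p_i, p_s)
           - e / (t_b + t_i)) < 1e-9
```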
The authors instantiate PR, PT, PI and PS for different smartphones (Nokia N810, HTC G1 and Nokia N95) and provide different expressions for the average download (Pd(r)) and upload (Pu(r)) power consumption. They show that analytical results match real measurements on smartphones. 3G. An energy consumption model for 3G is proposed in [13]. In this model, two different types of traffic are considered: web-browsing (Fig. 2.a) and video-streaming (Fig. 2.b). Web-browsing traffic is modelled as browsing sessions, each of them composed of packet calls separated by reading intervals. Different browsing sessions are separated by an inter-session time. Video-streaming traffic is modelled as video sessions separated by idle periods. In both cases, the transition between the states (DCH, FACH and PCH) is modelled as a discrete-time Markov chain. For web-browsing traffic, they consider that the MS transmits and receives data using a FACH with probability 1 − p1, where p1 is the probability that a packet call finishes. If a packet call finishes, the MS remains in the FACH channel until the expiration of the inactivity timer T2. The MS can start a new web session (using a FACH) with probability p2, where p2 is the probability that the reading time ends. If T2 expires, the MS switches to the PCH state and remains there with probability 1 − p2. In the case of video-streaming traffic, the previous Markov chain is extended to include the DCH. Then, an MS remains in the DCH with probability 1 − p′1, where p′1 is the probability that the video session ends. If the video session finishes, T1 is triggered and the MS can start a new video session with probability p′2 at any moment. After T1 expiration
Fig. 2. Traffic Modelling
the MS switches to the FACH state and triggers T2; after its expiration, it goes to the PCH state, remaining there with probability 1 − p′2. Modelling the state transitions as a Markov chain allows obtaining its steady state (Π), i.e., the equilibrium distribution of the chain, as a function of p1 and p2 (or p′1 and p′2). Then, assuming a fixed traffic session duration, one can calculate the expected energy consumption by multiplying each component of Π by its associated power consumption (depending on the interface state: DCH, FACH or PCH). This allows predicting the amount of energy to be consumed by a particular flow (i.e., for different session durations, numbers of packet calls, reading times, idle times, etc.).
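For the web-browsing case, the FACH/PCH chain reduces to a two-state Markov chain whose steady state weights the per-state powers. The sketch below is a simplified reading of the model in [13], ignoring the timer dynamics; p1, p2 and the power values are illustrative assumptions.

```python
# Simplified two-state version of the web-browsing model: the MS leaves
# FACH with probability p1 and leaves PCH with probability p2 per step.
# The steady state then weights the per-state powers (illustrative values).

def expected_energy(p1, p2, p_fach, p_pch, duration_s):
    pi_fach = p2 / (p1 + p2)     # steady-state probability of FACH
    pi_pch = p1 / (p1 + p2)      # steady-state probability of PCH
    avg_power = pi_fach * p_fach + pi_pch * p_pch
    return avg_power * duration_s

# 100 s session, powers (W) from the Lampropoulos column of Table 2:
print(expected_energy(0.2, 0.3, 0.555, 0.019, 100.0))
```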
3 Energy Measurements

3.1 Energy Cost of Application Flows
In order to evaluate the energy consumption behavior of the WLAN and 3G interfaces, we have performed a set of experiments using a multi-interface HTC Dream G1 phone. We have used PowerTutor [14], an Android-based application that allows estimating the energy consumption of different smartphone components (wireless interfaces, GPS, audio, CPU and screen). PowerTutor implements an accurate power model, derived from an HTC Dream phone, that estimates energy with at most 2.5% error. We have calculated the energy consumption of three different applications: web-browsing, YouTube and Skype. Each measurement was performed using only one interface at a time. Regarding Skype, we have calculated the energy consumption for a test call (to a Skype server) and for the application running in the background (no active call). As illustrated in Fig. 3.a, we have performed the download of a 45 s YouTube video using both 3G and WLAN. We can appreciate that the instantaneous power consumption of WLAN while downloading the video (around 700 mW) is higher than the maximum 3G power consumption (around 570 mW), which could produce a high energy consumption for long videos, although in our 80 s example the total is lower for WLAN (22 J against 31 J for 3G). The same behavior is observed in a Skype test call (Fig. 3.d), but in this case, the 3G interface remains in a high power consumption state (i.e., the DCH) for around 20 s before switching to FACH (around 400 mW). This gives a higher energy consumption when considering the complete call. Regarding web-browsing (Fig. 3.b), we have accessed the same
Fig. 3. Energy Measurements using PowerTutor
web-page using both interfaces. Here, we can clearly appreciate the dramatic effect of the inactivity timers in 3G, producing a total energy consumption of 62 J compared to 25 J for WLAN. The same behavior is found when using Skype in the background, with no active calls (Fig. 3.c). Here we appreciate that WLAN never enters its high power state, but 3G goes into the DCH state several times. This issue may originate from the high amount of data transmitted in the background by the Skype ServiceController module. We have analyzed the evolution of the power states of both the WLAN (Fig. 4.a) and 3G (Fig. 4.b) interfaces during a Skype test call. We observe in these figures that the WLAN states follow the throughput (when the application flow throughput increases or decreases, the WLAN state changes), while in the case of 3G, the DCH is triggered as soon as a few bytes are sent over the interface, and the MS stays in this state until the 95th second, i.e., 20 s after the call was finished.

3.2 Energy Cost of Scanning and Location Services
Concerning the interface selection problem, in addition to being able to determine the power consumption of each interface, the MS should acquire an accurate view of the wireless environment. In order to discover the available WLAN
Fig. 4. Skype test call using WLAN and 3G interfaces
networks and their performance, the MS should perform scanning, which requires actively probing each channel to find operating APs. Since scanning is an energy-consuming process, we investigate the energy cost of different scanning strategies that use GPS location or sensor-based information. In this experiment, illustrated in Fig. 5, we used a Samsung Galaxy S GT-I9000 with Android OS 2.2 Froyo, and compared the battery drain of the MS in different scenarios (from 100% to 15% of the battery capacity). In order to have a baseline, we left the phone idle with the screen completely on (red curve) and off (black curve). We observe that having the screen on decreases the battery autonomy by a factor of 13. With the screen off and without any running application, the battery lasts 30 hours. For all the other tests, the screen was turned off. For the simple case where the MS continuously scans every 3 s (green curve), the battery lasts 11 hours, i.e., three times less than when the device is idle. If for any reason the MS also needs to be located, it has to turn on the GPS, and the battery lifetime falls to 9 hours (blue curve). We computed the cost of an individual scan and found that, on average, a single scan consumes 0.0063% of the battery, differing from the result presented in [7], which claims 0.122% of battery drain per scan for equivalent battery capacities. Since a typical user does not move continuously, we propose to use sensor-based information to stop scanning when the MS is stationary. Using the internal accelerometer, we detected MS movement and performed continuous scanning during the 30 s following the last detected movement. After this time, if no movement is detected, no scanning process is triggered. In this experiment, the MS also gathers the GPS location when moving. The different grey curves in Fig. 5 represent the battery drain of such a sensor-aided approach for two different users following different mobility patterns. The first user moved 49.16% of the time and achieved an autonomy of 11 hours. The second user moved 13.20% of the time and had an autonomy of 17 hours. We can appreciate that for both mobility patterns we have considered, the energy
[Figure: remaining battery level (%) vs. time (h) for the five scenarios: screen always-on, screen always-off, continuous scan, GPS scan, and sensor-aided scan (users moving 49.16% and 13.20% of the time)]
Fig. 5. Scanning and GPS location energy consumption in Samsung Galaxy S
consumed by the sensor-aided strategy is lower than the GPS scan (blue curve), which demonstrates that integrating sensor information while scanning increases the energy efficiency of the MS.
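The sensor-aided policy above (scan every 3 s, but only within 30 s of the last detected movement) can be sketched as a simple schedule generator; the event interface below is an assumption.

```python
# Sketch of the sensor-aided scanning policy: trigger a scan every
# period_s seconds, but only within window_s seconds of the last detected
# movement. The movement-event interface is an assumed abstraction.

def scan_schedule(movement_times, horizon_s, window_s=30.0, period_s=3.0):
    """Return the times at which a scan is triggered, given the instants
    at which the accelerometer detected movement."""
    scans = []
    t = 0.0
    while t < horizon_s:
        # Scan only if some movement happened within the last window_s.
        if any(m <= t <= m + window_s for m in movement_times):
            scans.append(t)
        t += period_s
    return scans
```

With a single movement event at t = 0 and a 60 s horizon, scans fire only during the first 30 s; a stationary user triggers no scans at all.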
4 Conclusion and Perspectives
In this paper, we have reviewed the main findings on energy consumption in a wireless multi-homed environment. In this context, a user wants to improve his or her experience while communicating through the Internet, but at the same time wants to maximize the duration of this experience, i.e., the battery autonomy. We showed through energy models and specific measurements that different wireless interfaces consume different amounts of energy for a given application flow. Moreover, for a given wireless technology, different applications (e.g., web-browsing, real-time, interactive) show different energy consumptions. Since multi-homing may involve a high energy consumption if interfaces are used simultaneously, we need to design schemes that intelligently assign application flows to the available wireless interfaces. This intelligent assignment can only be achieved with prior knowledge of the environment, obtained by scanning the surrounding networks. We studied the contribution of scanning and GPS location to energy consumption. We found that using the smartphone's internal sensors can significantly improve energy efficiency, depending on the user mobility pattern.
As future work, we envisage a multi-objective optimization approach to optimize the energy consumption on the one hand and the perceived QoS on the other hand.
References

1. Agrawal, P., Kumar, A., Kuri, J., Panda, M.K., Navda, V., Ramjee, R., Padmanabhan, V.N.: Analytical models for energy consumption in infrastructure WLAN STAs carrying TCP traffic. In: 2010 Second International Conference on Communication Systems and Networks (COMSNETS), pp. 1–10 (January 2010)
2. Balasubramanian, N., Balasubramanian, A., Venkataramani, A.: Energy consumption in mobile phones: a measurement study and implications for network applications. In: Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, IMC 2009, pp. 280–293. ACM, New York (2009)
3. Castignani, G., Loiseau, L., Montavont, N.: An evaluation of IEEE 802.11 community networks deployments. In: 2011 International Conference on Information Networking (ICOIN), pp. 498–503 (2011)
4. Haverinen, H., Siren, J., Eronen, P.: Energy Consumption of Always-On Applications in WCDMA Networks. In: IEEE 65th Vehicular Technology Conference, VTC 2007-Spring, pp. 964–968 (April 2007)
5. Holma, H., Toskala, A.: HSDPA/HSUPA for UMTS: High Speed Radio Access for Mobile Communications. John Wiley & Sons, Chichester (2006)
6. Lampropoulos, G., Kaloxylos, A., Passas, N., Merakos, L.: A Power Consumption Analysis of Tight-Coupled WLAN/UMTS Networks. In: IEEE 18th International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2007, pp. 1–5 (September 2007)
7. Petander, H.: Energy-aware network selection using traffic estimation. In: Proceedings of the 1st ACM Workshop on Mobile Internet through Cellular Networks, MICNET 2009, pp. 55–60. ACM, New York (2009)
8. Qian, F., Wang, Z., Gerber, A., Mao, Z.M., Sen, S., Spatscheck, O.: Characterizing radio resource allocation for 3G networks. In: Proceedings of the 10th Annual Conference on Internet Measurement, IMC 2010, pp. 137–150. ACM, New York (2010)
9. Sharma, A., Navda, V., Ramjee, R., Padmanabhan, V.N., Belding, E.M.: Cool-Tether: energy efficient on-the-fly WiFi hot-spots using mobile phones. In: Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, CoNEXT 2009, pp. 109–120. ACM, New York (2009)
10. Stevens-Navarro, E., Wong, V.W.S.: Comparison between Vertical Handoff Decision Algorithms for Heterogeneous Wireless Networks. In: IEEE 63rd Vehicular Technology Conference, VTC 2006-Spring, vol. 2, pp. 947–951 (May 2006)
11. Xiao, Y., Kalyanaraman, R.S., Ylä-Jääski, A.: Energy Consumption of Mobile YouTube: Quantitative Measurement and Analysis. In: The Second International Conference on Next Generation Mobile Applications, Services and Technologies, NGMAST 2008, pp. 61–69 (September 2008)
12. Xiao, Y., Savolainen, P., Karppanen, A., Siekkinen, M., Ylä-Jääski, A.: Practical power modeling of data transmission over 802.11g for wireless applications. In: Proceedings of the 1st International Conference on Energy-Efficient Computing and Networking, e-Energy 2010, pp. 75–84. ACM, New York (2010)
13. Yeh, J.-H., Chen, J.-C., Lee, C.-C.: Comparative Analysis of Energy-Saving Techniques in 3GPP and 3GPP2 Systems. IEEE Transactions on Vehicular Technology 58(1), 432–448 (2009)
14. Zhang, L., Tiwana, B., Qian, Z., Wang, Z., Dick, R.P., Mao, Z.M., Yang, L.: Accurate online power estimation and automatic battery behavior based power model generation for smartphones. In: Proceedings of the Eighth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, CODES/ISSS 2010, pp. 105–114. ACM, New York (2010)
Method for Linear Distortion Compensation in Metallic Cable Lines

Albert Sultanov, Anvar Tlyavlin, and Vladimir Lyubopytov

Ufa State Aviation Technical University, Chair for Telecommunication Systems, 12 K. Marx St., Ufa, Russian Federation
[email protected],
[email protected],
[email protected]
Abstract. In this paper, a method of signal linear distortion compensation is proposed for the purpose of enhancing transmission reliability and distance. The method is based on the concept of pre-equalizing the useful signal in accordance with the cable line transfer function. The filter coefficients are computed from the sampled signal at the line output while a testing signal propagates through the line. Experimental results demonstrate that the obtained solutions permit compensation not only of frequency-selective fading, but also of multipath propagation effects. Keywords: Cable line, linear distortion, signal precorrection, digital signal processing.
1 Introduction

For the last decade, considerable attention has been focused on improving the efficiency of metallic cable lines and on deploying broadband communication networks over this transmission medium. Thanks to recent achievements in high-performance hardware, progressive methods of signal processing have become realizable, providing greater transmission capacity and distance. In a number of cases, when deploying or developing networks, the use of existing copper cable infrastructure remains an economically more efficient alternative to laying new cable. Furthermore, the transmission medium may be realized not only with existing communication cables but also with power supply lines using PLC technology. In the quasi-stationary mode, signal propagation along the line is known to be defined by the telegraph equations through the primary parameters. Their solution shows that the characteristic impedance and the propagation constant are frequency dependent. Moreover, the skin effect and the proximity effect cause the resistance per unit length R(ω) and the inductance per unit length L(ω) to be functions of frequency, and dielectric loss causes frequency dependence of the conductance per unit length G(ω) [1]. Furthermore, the transmission medium may in general contain impedance discontinuities, which cause multipath propagation [2,3]. The distortions mentioned above result in intersymbol interference (ISI), and the channel capacity defined by the Shannon-Hartley theorem turns out to be unachievable. R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 195–198, 2011. © Springer-Verlag Berlin Heidelberg 2011
2 Method Concepts

There are a number of approaches to the ISI problem in metallic cable lines, which are applied when developing advanced transmission technologies, particularly Digital Subscriber Line (xDSL) techniques [4]. However, various transmission technologies already in wide use do not provide efficient methods of distortion compensation at the physical layer, although the present-day level of hardware development allows it to be accomplished. The proposed method is intended as a generalized approach to ISI compensation, irrespective of the physical layer protocol used, subject only to the necessary bandwidth and signal amplitude [5]. It may therefore be applied in cases where the implementation of more effective nonlinear equalization and precoding methods is too complicated. Of the two basic ISI compensation philosophies, pre-equalization and post-equalization, the former is chosen; it improves the signal-to-noise ratio (SNR) at the receiver input, though it enhances the signal magnitude [6]. To minimize retransmission of equalizer adjustment information, a zero-forcing approach to the computation of the filter coefficients is considered [7,8]; the risk of excessive noise enhancement is avoided by the pre-equalization concept, and the influence of zero-mean noise on the testing signal may be minimized by averaging its output values. Since the filter coefficients are obtained from the inverse transfer function of the line, not only the unevenness of the channel amplitude-frequency characteristic is compensated, but also the multipath propagation effects caused by impedance discontinuities. For signal equalization over the entire required bandwidth, the fractionally spaced equalization (FSE) concept is applied, i.e., the sampling rate and the filter tap spacing are selected to satisfy the sampling theorem.
Furthermore, a feature of an FSE is its insensitivity to the sampling phase, in contrast to T-spaced sampling, so synchronization with the transmitted signal is not necessary [8]. As in the general case the cable line parameters are unknown and time-varying, the precorrection filter coefficients are proposed to be computed using the sampled testing signal g(l,t) at the line output. The filter implementation therefore consists of two consecutive stages: 1) defining the precorrected signal u(0,t) for a given input action g(0,t); 2) synthesizing the discrete equalization filter with transfer function H(z), which is defined using the values of the u(0,t) and g(0,t) signals. Since the purpose of precorrection is to obtain a low-distortion signal at the line end, the following condition may be formulated:

u(l,t) = λ · g(0, t − Θ) = ∫_0^t u(0,τ) h(l, t − τ) dτ,    (1)
where λ is a coefficient (λ ≤ 1), chosen in order to minimize power consumption and electromagnetic emission on the one hand, and to ensure a sufficient SNR on the other; Θ is the time delay necessary for the physical realizability of the system, equal to the propagation time of the input signal from the point x=0 to the point x=l. If the testing signal g(0,t) is fed into the line, then the signal on the load may be defined using the Duhamel integral:
g(l,t) = ∫_0^t g(0,τ) h(l, t − τ) dτ.    (2)
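Numerically, the Duhamel integral (2) is a discrete convolution of the sampled input with the sampled impulse response of the line. In the sketch below, the impulse response is an assumed decaying exponential, not a measured line response.

```python
# Numerical sketch of the Duhamel integral (2): the line output as the
# convolution of the input with the line impulse response. The impulse
# response used here is an assumed decaying exponential, not a measured one.

import numpy as np

def line_output(g0, h, dt):
    """g(l, n*dt) ~ sum_k g(0, k*dt) * h(l, (n-k)*dt) * dt."""
    return np.convolve(g0, h)[: len(g0)] * dt

dt = 1e-9                                       # 1 ns sampling step
t = np.arange(0, 500e-9, dt)
g0 = ((t >= 0) & (t < 290e-9)).astype(float)    # 290 ns rectangular test pulse
h = np.exp(-t / 50e-9) / 50e-9                  # assumed impulse response (unit area)
gl = line_output(g0, h, dt)                     # distorted signal at the line end
```

Because the assumed impulse response integrates to roughly one, the output settles near the input amplitude during the pulse while its edges are smeared, which is the distortion the precorrection filter must undo.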
Therefore, the function u(0,t) may be expressed from the set of equations combining expressions (1) and (2). Applying the Laplace transformation over the variable t,

Ḡ(l,p) · e^(−pΘ) = G(0,p) · K(l,p),
λ · G(0,p) · e^(−pΘ) = U(0,p) · K(l,p),    (3)
where K(l,p) is the transfer function of the line of length l, and Ḡ(l,p) denotes the Laplace transform of the testing signal at the line output shifted left by the period Θ. The signal ḡ(l,t) may be approximated by a sequence of Heaviside step functions σ(t). Thus, if a trapezoidal pulse sequence is applied as the testing signal g(0,t), the precorrected signal can be expressed as:
U(0,p) = (λ U0² / τ1²) · (1 − e^(−p·τ1) − e^(−p(τ1+τ2)) + e^(−p(2τ1+τ2)))² / ( p³ · ( S0 + ∑_{m=1}^{M} (Sm − S_{m−1}) e^(−p·m·δ) ) )    (4)
where Sm is the value of the signal ḡ(l,t) at t = mδ, m = 1, 2, …; δ is the sampling interval of the signal ḡ(l,t), which should tend to zero for an accurate approximation; U0 is the pulse amplitude; τ1 is the length of the pulse edges; τ2 is the length of the horizontal pulse section. The inverse Laplace transform of expression (4) has been calculated both by the residue method and by the Newton-Cotes numerical integration formulae. The precorrection recursive filter may then be defined by the linear difference equation:

y(0,nτ) = (1 / G(0,0)) · ( − ∑_{m=1}^{M−1} G(0,mτ) · y(0, nτ − mτ) + ∑_{k=0}^{K−1} U(0,kτ) · f(0, nτ − kτ) ),    (5)

where f(0,nτ) is the sampled useful signal at the filter input; y(0,nτ) is the precorrected useful signal at the filter output; τ is the sampling period [9].
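Equation (5) describes a recursive (IIR-like) filter: feedback taps taken from the sampled signal G(0,mτ) and feedforward taps from the precorrected signal U(0,kτ). The sketch below is a direct transcription with illustrative coefficient arrays, not values computed from a real line.

```python
# Sketch of the recursive precorrection filter of Eq. (5). G and U hold the
# sampled coefficient sequences G(0, m*tau) and U(0, k*tau); the arrays used
# in the example are illustrative, not derived from a measured line.

def precorrect(f, G, U):
    """y(0,n) = (1/G[0]) * ( -sum_{m=1}^{M-1} G[m]*y[n-m]
                             + sum_{k=0}^{K-1} U[k]*f[n-k] )"""
    y = []
    for n in range(len(f)):
        feedback = sum(G[m] * y[n - m] for m in range(1, len(G)) if n - m >= 0)
        feedforward = sum(U[k] * f[n - k] for k in range(len(U)) if n - k >= 0)
        y.append((feedforward - feedback) / G[0])
    return y

# Unit impulse through a toy filter with one feedback tap:
print(precorrect([1.0, 0.0, 0.0], [1.0, 0.5], [1.0]))
```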
3 Experiment Results

For the purposes of computing the equalization filter coefficients and simulating the signal precorrection process, a program has been developed in Borland Delphi. To compare the form of the signal received at the line output with precorrection applied against the undistorted pulse at the source output, a series of experiments with real cables has been performed. The equalization filter of the experimental assembly is based on an Altera Cyclone II FPGA supporting a 32 MHz clock frequency; an input pulse with length (2τ1+τ2) equal to 290 ns is used. Fig. 1 presents the signal oscillograms obtained in an experiment with a twisted pair cable line (l1=305 m) with a cable section (l2=50 m) branched out from the first section load and unloaded at the other end. The oscillogram shown in Fig. 1(a) demonstrates the repeated pulse resulting from multipath propagation.
Fig. 1. a) Testing signal g(l,t) on the line load; b) precorrected signal y(0,t), λ=0.5; c) signal f(l,t) on the line load with precorrection
Quantitative evaluation of the experimental results shows that applying the precorrection method generally permits compensation not only of frequency-selective fading, but also of multipath effects. In the case of the twisted pair line, the spreading of the pulse width measured at 50% height decreased from 25.9% to 1.7% for the uniform line and from 242% to 2.9% for the line with the branching point mentioned above; the width of the falling pulse edge decreased by a factor of 6.5 in the first case and by a factor of 9.3 in the second. Research into various aspects of this approach is ongoing.
References

1. Grodnev, I.I., Vernik, S.M.: Telecommunication lines. Radio i svyaz, Moscow (1988) (in Russian)
2. Hrasnica, H., Haidine, A., Lehnert, R.: Broadband Powerline Communications Networks. John Wiley & Sons, Chichester (2004)
3. Dostert, K.: Powerline Communications. Prentice Hall, New Jersey (2001)
4. Gerstacker, W.H., Fischer, R.F.H., Huber, J.B.: Blind Equalization Techniques for xDSL using Channel Coding and Precoding. Int. J. Electr. Commun. 53, 1–11 (1999)
5. Lucky, R.W., Rudin, H.R.: An automatic equalizer for general-purpose communication channels. Bell Syst. Tech. J. 46, 2179–2208 (1967)
6. Fischer, R.F.H.: Precoding and Signal Shaping for Digital Transmission. John Wiley & Sons, New York (2002)
7. Haykin, S.: Adaptive Filter Theory. Prentice Hall, New Jersey (1996)
8. Qureshi, S.U.H.: Adaptive equalization. Proceedings of the IEEE 73(9), 1349–1387 (1985)
9. Sergienko, A.B.: Digital signal processing. Piter, Saint-Petersburg (2002) (in Russian)
Multimedia Services Differentiation in 4G Mobile Networks under Use of Situational Priorities

Alexander Dyadenko¹, Olga Dyadenko², Larisa Globa², and Andriy Luntovskyy³

¹ Huawei Kiev, Ukraine
² ITS at National University of Technology of Ukraine "KPI", Kiev, Ukraine
³ BA Dresden Univ. of Coop. Education, Germany
[email protected],
[email protected],
[email protected],
[email protected]
Abstract. The deployment of 4G networks poses new challenges regarding the serving of customer calls. Differentiation of multimedia services in 4G mobile networks under an advanced serving discipline with situational priorities is considered in this work. An advanced serving discipline for LTE systems is proposed. The method offers a significant increase in QoS parameters and resource awareness in LTE and future 4G systems. Some case studies have been examined. Keywords: 4G Mobile, Resource Awareness, Advanced Serving Discipline with Priorities, the "Most Valuable Calls".
1 Introduction

The principle of hierarchical differentiation is shown in Fig. 1 [1, 2].
Fig. 1. An approach to service differentiation in 4G networks
Fig. 2. Bandwidth consumption in 4G
A new approach to hierarchical differentiation in 4G networks, widely used in telco networks, is depicted [2, 5]. To solve such problems, the concept of "priorities" has to be extended by introducing "dynamic priorities". Analysis of policy and charging control (PCC) architectures [2] in mobile OFDMA networks has shown that, during the rule enforcement procedure, deciding on the next service call using classical disciplines with fixed priorities is not effective. This is proved when the R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 199–202, 2011. © Springer-Verlag Berlin Heidelberg 2011
importance of the call (the context of the requested service) and the radio resources (bandwidth) it requires are considered in the process of choosing incoming calls for further processing. Fixed priorities lead to a situation in which services may be provided to the user who receives the whole (or the greater part of the) resources, while other services remain unserved, even if the calls have the same priority and require fewer resources (Fig. 2). One type of service may ask for different resources depending on the location of the client (for example, on the SNR). The importance of a call means the information value of the service, e.g., the cost of the service, its priority and so on. Let Xi be the i-th call that arrived in the system during [0;t]. The call is then characterized by the pair of values {Xi, Δfi}, where Xi is the importance of the i-th call (the context of the service) and Δfi is the radio bandwidth required for the call to be served. Based on this analysis, a new discipline was proposed. The proposed call serving discipline is based on a situational priority, which is a kind of dynamic priority. Each time a subscriber asks for a service, the activation priority is found via the expression wi = Xi / Δfi, where wi is the value of the priority, Xi is the information (context) importance of the call, and Δfi is the radio bandwidth required to serve the call.
The explanation of the proposed discipline is represented in Fig. 3.
Fig. 3. A proposed discipline (incoming calls with intensity λ and priorities w_1, w_2, ..., w_k competing for the bandwidth Δf)

Fig. 4. Services maintenance in the LTE/PCC model with one BS (arrival intensity λ, service intensity μ)
The "RF" model service procedure for the case of n free channels is formulated as follows. 1. If k > 0 calls were received during the period [0; t], the call j for which w_j = max{w_1, ..., w_i, ..., w_k} holds, where j = 1..k and Δf_j ≤ F (F is the total system bandwidth, a multiple of c), is served first. 2. In the second place the call l is served for which w_l = max{w_1, ..., w_i, ..., w_k} holds over the calls not yet served, l ∈ (1, k), l ≠ j. The procedure continues while the channel has free capacity; the remaining calls are rejected. 3. If during the period [0; t] k > 0 calls were received and there is no free capacity in the channel, the calls are rejected. If no calls were received during [0; t], the system waits for time t and then returns to steps 1, 2, 3. The integral information importance of the handled calls is used to evaluate the effectiveness of the proposed discipline. Let the network consist of one base station (eNodeB), which acts as the serving channels (SC); a processing and call tariffing control unit (PCC), which acts as the solver; and a serving server (AS), which generates service requests (Fig. 4).
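Steps 1-3 above amount to a greedy selection in decreasing order of the situational priority w_i = X_i/Δf_i. A minimal sketch in Python; the `Call` record, the field names and the sample values are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass

@dataclass
class Call:
    importance: float   # X_i, information (context) value of the call
    bandwidth: float    # Δf_i, radio bandwidth the call needs

def rf_serve(calls, total_bandwidth):
    """Greedy 'RF' discipline: serve calls in decreasing order of the
    situational priority w_i = X_i / Δf_i while free capacity remains."""
    served, free = [], total_bandwidth
    # Highest priority w_i first (step 1), then the next largest (step 2), ...
    for call in sorted(calls, key=lambda c: c.importance / c.bandwidth, reverse=True):
        if call.bandwidth <= free:
            served.append(call)
            free -= call.bandwidth
    return served  # calls that do not fit are rejected

calls = [Call(10.0, 5.0), Call(4.0, 1.0), Call(6.0, 2.0)]
served = rf_serve(calls, total_bandwidth=6.0)
print(sum(c.importance for c in served))  # integral importance of served calls
```

With these sample values the calls with priorities 4.0 and 3.0 fit, while the heaviest call is rejected for lack of capacity.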
Multimedia Services Differentiation in 4G Mobile Networks
2 Generalized Problem Solution The serving server (AS) generates calls with intensity λ, which are received by the PCC. After checking whether the service can be provided, the PCC queues the incoming requests and, in accordance with the discipline, sends them to the base station (BS, eNodeB). The system operates under the following conditions. The AS is the source of calls and generates requests with intensity λ according to a Poisson stream. The PCC receives this stationary Poisson stream of calls, computes the coefficient w, which describes the call priority, and organizes the serving order on the BS according to w. The BS has n free channels with a fixed capacity c each; the handling intensity equals μ. From 1 to n free channels can be used for handling a call. The problem of computing the integral weight of served calls, according to the "RF" and FIFO disciplines, is solved in several steps: computing the mathematical expectation of the integral weight of served calls according to the FIFO and RF disciplines, assuming that n channels are free and that calls with different tariffs occupy equal numbers of channels; then computing the mathematical expectation of the integral weight of served calls using the RF discipline, considering that calls with different tariffs occupy different but fixed numbers of bands while being served. First, we formulate the problem. Let a Poisson stream of calls {ξ_k} enter the system, where k = 1..N and N is the total number of requests arriving in the period [0; t]. The call ξ_k is characterized by the following parameters: ω_i = X_i / c_i, where X_i is the information value of the call and c_i is a value known beforehand that defines the number of channels necessary to handle an i-type call, i = 1..m, with m the number of call types; and λ_i, the i-type call arrival intensity. Call ξ_k has a higher priority than call ξ_l if ω_k > ω_l, k, l = 1..m.
According to the RF service procedure, calls with coefficient ω_1 are served first, then those with ω_2, and so on, while the system has free channels (capacity). Task: find the mathematical expectation of the average weight of the calls ξ_k served during time t. The problem is solved in two steps. Step 1. Solution of an auxiliary problem: we consider that only one type of call, occupying a fixed number of channels, enters the system. Step 2. Solution of the basic problem: m types of calls enter the system; every type of incoming call occupies a fixed number of cells (channels) while being served. Thus, analytical expressions for computing the mathematical expectation of the integral weight of served calls according to the RF, FIFO and Priority disciplines were derived. Curves of the mathematical expectation of the call processing system while serving requests, depending on the rates and the number of free channels, according to the FIFO and RF service procedures, are shown in Fig. 5. There is a dependence between the integral weight of handled calls, the latency and the number of serving channels: the larger the number of SCs (bandwidth), the more calls can be served and the larger the overall information weight obtained. At the same time, the effectiveness of the proposed method is higher for smaller numbers of channels. A relative comparison of the mathematical expectation of the total information weight of the RF service procedure with FIFO and with Priority is given in Fig. 6. The proposed RF discipline gives a benefit over the FIFO and Priority disciplines of 13.5% and 5.5%,
Fig. 5. Mathematical expectation of the integral weight of served calls depending on time and the number of free channels: a) FIFO discipline; b) the RF discipline proposed in this paper
Fig. 6. Math. expectations of served calls information weight according to service procedures depending on latency: a) RF/FIFO; b) RF/Priority
respectively, when the latency is in the interval (0; 5) seconds; the rate of the benefit's increase slows down as latency grows. The benefit depends considerably on the intensity of incoming calls and on the average service time.
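The RF-versus-FIFO comparison can be reproduced qualitatively with a toy Monte-Carlo experiment; the batch model, the distributions and all numeric parameters below are simplifying assumptions, not the authors' simulation:

```python
import random

def batch_weight(calls, capacity, by_priority):
    """Serve one batch greedily; calls are (importance X_i, bandwidth Δf_i)."""
    order = (sorted(calls, key=lambda c: c[0] / c[1], reverse=True)
             if by_priority else calls)          # RF order vs arrival (FIFO) order
    total, free = 0.0, capacity
    for weight, bw in order:
        if bw <= free:
            total += weight
            free -= bw
    return total

def simulate(by_priority, batches=2000, capacity=10.0, seed=1):
    random.seed(seed)  # identical call batches for both disciplines
    total = 0.0
    for _ in range(batches):
        calls = [(random.uniform(1.0, 10.0), random.choice([1.0, 2.0, 4.0]))
                 for _ in range(random.randint(1, 8))]
        total += batch_weight(calls, capacity, by_priority)
    return total

gain = simulate(True) / simulate(False) - 1.0
print(f"RF gain over FIFO in integral weight: {gain:.1%}")
```

Averaged over many batches, serving by the situational priority collects a larger integral information weight than arrival-order serving whenever capacity binds.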
3 Conclusions In this paper different approaches to service differentiation in mobile networks were considered. The deployment of 4G networks brings new challenges for serving customer calls. To increase provider efficiency we propose to use situational priorities, which help to distinguish the most valuable calls during service activation. The proposed priorities take into account the weights of the information context and the bandwidth that will be consumed by customers. We introduce a new serving discipline, "RF", which serves only the "most valuable calls", and propose the corresponding method. The obtained results show a general growth of the integral weight of served calls for the proposed RF discipline of up to 13.5% and 5.5% compared with the classical FIFO and Priority disciplines, respectively.
References
1. Cuevas, A., Moreno, J.I., Einsiedler, H.: IMS Service Platform: A Solution for Next-Generation Network Providers to Be More than Bit Pipes. IEEE Communications Magazine, 75–81 (2006)
2. Digital cellular telecommunications system (Phase 2+): UMTS/LTE, Policy and charging control architecture, ETSI 123.203, p. 118 (2007)
3. Ilchenko, M., et al.: Up-to-date Telco Systems, p. 328. Naukova Dumka Publishing House of NAS Ukraine, Kiev (in Ukrainian)
4. Tomashevsky, V.: System Modeling and Simulation, p. 352. BHV Publishing House, Kiev (2007) (in Ukrainian) ISBN 966-552-120-9
Downlink Femtocell Interference in WCDMA Networks Zoltán Jakó and Gábor Jeney Department of Telecommunications Budapest University of Technology and Economics 1117, Budapest, Magyar tudósok krt. 2., Hungary {jakoz,jeneyg}@hit.bme.hu
Abstract. With the use of femtocells, users can achieve better coverage and QoS in home or office environments. But with this cheap solution, in two-tier WCDMA systems the interference in the adjacent network increases. Because of this interference, the number of femtocells per macrocell is limited. This article analyses the downlink interference types in a two-tier WCDMA network and sets up a simulation model for these femtocell-generated downlink interferences. The simulation takes into account the house/office wall penetration loss, which limits the interference. The aim of this article is to find out how many femtocells can be deployed in a macrocell, and which parameters vary this limit. Keywords: femtocell, interference, two-tier networks, WCDMA, 3G.
1 Introduction The competition for customers between mobile operators is increasing. They provide new services, and these services require large bandwidth and quality of service (QoS) parameters that previously only wired technology could ensure. Furthermore, energy-saving technologies nowadays gain greater attention in the telecommunication industry. To meet the QoS requirements with low power consumption, we can use femtocells integrated into the mobile architecture. A femtocell is a short-range (10-20 m), cheap, low-emission (approx. 30-50 mW) base station. A subscriber can use a femto BS to obtain coverage and better transmission speed at home or at the office. With this solution, much higher transmission speeds and better QoS parameters can be achieved than when using only the macro base station in the street. The femtocell base station transports the subscriber traffic over wired technologies (DSL, fiber), causing lower load on the macrocell base station. Users can turn off their femtocell nodes whenever they want. With femtocells, users can transmit at lower power because the base station is closer than the macro base station, which reduces the energy usage of the cell phone. Less energy usage means a longer interval between two charges. But if the frequency band is shared (as in WCDMA), users cause interference to each other with their uplink or downlink transmissions. This is also valid for the connection between macro and femtocell base stations. Nowadays the incidence of 3G femtocells is low, and the interference caused by them is low, but in the next few years the number of femtocells will increase worldwide
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 203–208, 2011. © Springer-Verlag Berlin Heidelberg 2011
according to the market research presented by the Femto Forum, which will generate remarkable interference to the macrocell network. That is the reason why we would like to give simulation models for the downlink interferences, based on the works of [2] and [3]. In this simulation environment the femtocells are placed in houses or offices. The goal of the article is to find the connection between the downlink interference level, the wall penetration loss and the number of femtocells. Because of the interference, the number of deployable femtocells per macrocell is limited, but with varying wall penetration loss this number is not a constant. The simulation looks for dependencies and relations between the penetration loss value and the number of femtocells in a macrocell.
2 System Model To find dependencies and relations between the number of femtocells and the interference level, we need to create a system model for the simulation. This model is shown in Fig. 1.
Fig. 1. Simulation model
The model is based on a circular macrocell. The radius of the circle is denoted by Rc. The macrocell base station is placed at the centre of this circle. We use an omnidirectional antenna at this base station to provide coverage for the full cell. The maximum transmit power (P_max) of the macrocell base station is 20 W, and the pilot signal power is 2 W. Inside the macrocell there are several macrocell users, distributed uniformly over the macrocell. We assume that each of these users communicates only with the central macrocell. The number of active macrocell users is denoted by Uc and follows a Poisson distribution. Inside the macrocell Nf femtocells are deployed. The femtocells provide coverage in a circle with radius Rf. The radius of the femtocell is much smaller than the
macrocell radius (Rf ≪ Rc).
2.1 Macrocell Path Loss Model The path loss model used in the simulation is based on the Okumura-Hata formula and the COST-231 model [14]. The effects of fast fading and noise (AWGN) are ignored in the simulation, since interference levels are much higher: interference dominates noise here. The COST-231 model requires several parameters and gives the path loss in dB. These parameters are the mobile node height (H_MS = 2 m), the base station height (H_BS = 23 m), the carrier frequency (f_c = 2140 MHz) and |X|, the distance between the mobile node and the base station in meters. Θ_c represents the log-normal fading effect: the logarithm of Θ_c is a Gaussian random variable, 10 log(Θ_c) ~ N(0, 6 dB) in an urban environment and 10 log(Θ_c) ~ N(0, 10 dB) in a suburban environment. g_w [dB] denotes the wall penetration loss in dB. The urban path loss model is

g_u [dB] = 46.3 + 33.9 log(f_c) − 13.82 log(H_BS) − 3.2 log²(11.75 H_MS) + 4.97 + (44.9 − 6.55 log(H_BS)) · log(|X| / 1000) + g_w [dB] + 3 + 10 log(Θ_c),   (1)

and in a suburban environment

g_s [dB] = 46.3 + 33.9 log(f_c) − 13.82 log(H_BS) − 0.8 − (1.1 log(f_c) − 0.7) H_MS + 1.56 log(f_c) + (44.9 − 6.55 log(H_BS)) · log(|X| / 1000) + g_w [dB] + 3 + 10 log(Θ_c).   (2)
2.2 Femtocell Path Loss Model In femtocells a different type of propagation model should be used:

g_f [dB] = 43.85 log(f_c) − 4.78 log²(f_c) + 20 log(|Y|) − 83.36 + 10 log(Θ_f),   (3)

where g_f gives the path loss in dB, |Y| denotes the distance between the femtocell base station and the femtocell UE in meters, and Θ_f represents the log-normal fading.
We assume that the femtocell base station is placed in a single room, which means no wall loss term is required in the path loss model, and Line of Sight (LoS) is guaranteed everywhere.
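Equations (1) and (3) can be sketched directly in code; the helper names and the default parameter values are assumptions (in particular, the femtocell shadowing deviation is not specified above):

```python
import math
import random

def urban_path_loss_db(x_m, f_c=2140.0, h_bs=23.0, h_ms=2.0,
                       g_wall_db=10.0, sigma_db=6.0):
    """Urban macrocell path loss, eq. (1): COST-231 terms, wall loss,
    a fixed 3 dB margin and log-normal shadowing 10*log(Theta_c)."""
    shadow_db = random.gauss(0.0, sigma_db)
    return (46.3 + 33.9 * math.log10(f_c) - 13.82 * math.log10(h_bs)
            - 3.2 * math.log10(11.75 * h_ms) ** 2 + 4.97
            + (44.9 - 6.55 * math.log10(h_bs)) * math.log10(x_m / 1000.0)
            + g_wall_db + 3.0 + shadow_db)

def femto_path_loss_db(y_m, f_c=2140.0, sigma_db=4.0):
    """Indoor femtocell path loss, eq. (3); no wall term (single room, LoS).
    The shadowing deviation sigma_db here is an assumed value."""
    shadow_db = random.gauss(0.0, sigma_db)
    return (43.85 * math.log10(f_c) - 4.78 * math.log10(f_c) ** 2
            + 20.0 * math.log10(y_m) - 83.36 + shadow_db)

random.seed(0)
print(round(urban_path_loss_db(500.0), 1), round(femto_path_loss_db(10.0), 1))
```

With shadowing suppressed, a macrocell user at 500 m sees a loss of roughly 140 dB, while a femtocell UE at 10 m sees only about 30 dB, which is why the femtocell link dominates indoors.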
3 Simulation The main parameters of the simulations are summarised here. This interference analysis was part of a T-Mobile research project, and these parameters were given by them. The radius of the macrocell is denoted by Rc; its value is 500 m in the urban and 1200 m in the suburban case. The radius of the femtocell (Rf) is 30 m. The applied downlink frequency (fc) is 2140 MHz. Nf denotes the number of femtocells, on a scale between 1 and 10^4. The required SIR level is denoted by SIRlimit, with a value of 12.5 dB. The parameter λint represents the fraction of time the user is active, which in the simulations is 35% and 70% of the time interval. The value of the wall penetration loss (gw [dB]) varies on a scale between 1 and 25 dB. αout represents the outdoor path loss exponent, with a value of 3.6. The parameter n denotes the number of repetitions of the simulation; choosing n high gives more reliable results. The value of n is 10^5, which means 10^5 repeated calculations of path losses and interferences, computed deterministically for a probabilistically varied number of active users in the femtocells. The first simulation results (Fig. 2) show the empirical cumulative distribution function (CDF) of the interference generated by Nf femtocells in urban and suburban environments.
Fig. 2. Femtocell caused interference to a macrocell and femtocell user empirical CDF,
Nf = 1000, gw [dB] = 10 dB

The next figure shows the dependencies between the number of femtocells, the wall penetration loss and the probability that the SIR level is below SIRlimit (12.5 dB). We simulate both urban and suburban environments with λint = 0.7. In the figures the vertical axis shows the probability that the SIR level is lower than SIRlimit; the two horizontal axes represent the number of femtocells and the wall penetration loss.
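A heavily simplified Monte-Carlo sketch of the outage probability Pr(SIR < SIRlimit) studied here; the flat geometry, the single-wall assumption and the parameter values are coarse assumptions that omit the shadowing and COST-231 terms of the paper's model:

```python
import math
import random

def sir_outage_prob(n_femto, wall_loss_db, n_trials=4000, rc=500.0,
                    sir_limit_db=12.5, p_macro=20.0, p_femto=0.03,
                    alpha=3.6, seed=0):
    """Crude Monte-Carlo estimate of Pr(SIR < SIRlimit) at a random macrocell
    user; each femtocell signal crosses one wall of loss wall_loss_db."""
    random.seed(seed)
    wall_lin = 10.0 ** (-wall_loss_db / 10.0)
    outages = 0
    for _ in range(n_trials):
        d_user = rc * math.sqrt(random.random())        # uniform in the disc
        signal = p_macro * max(d_user, 1.0) ** -alpha
        interference = sum(
            p_femto * wall_lin * max(rc * math.sqrt(random.random()), 1.0) ** -alpha
            for _ in range(n_femto))
        if interference > 0 and 10 * math.log10(signal / interference) < sir_limit_db:
            outages += 1
    return outages / n_trials

print(sir_outage_prob(n_femto=20, wall_loss_db=15.0))
```

Even in this toy version the two trends discussed above appear: outage grows with the number of femtocells and shrinks as the wall penetration loss increases.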
Fig. 3. Probability of SIR in urban and suburban environment, λint = 0.7
The next figure shows the connection between the wall penetration loss and the number of femtocells when Pr(SIR < 12.5 dB) = 10% and λint = 0.7:
Fig. 4. Urban and suburban environment
The results show that when Pr(SIR < 12.5 dB) = 0.1 the number of deployable femtocells depends severely on the wall penetration loss. With a usual flat wall penetration loss (15 dB), in the urban environment only 4 femtocells cause a 10% service outage probability, unlike the suburban environment, where 10 femtocells give a 10% service outage probability.
4 Conclusion In this article we presented a simulation of the downlink femtocell-generated interferences in a two-tier WCDMA network. The simulation results show that the number of femtocells in a macrocell is rather limited in the downlink case if a common frequency is assumed. Although the wall penetration loss decreases the probability of service outage at the macrocell UE, with these simulation parameters the number
of femtocells per macrocell is limited to approximately 20. The femtocell BS's interference does not affect the other femtocell users as much, due to the double wall penetration, which attenuates the interfering signal remarkably. So the main problem with the femtocell-generated interference appears at the macrocell tier. Summarizing the findings of the article, it is clear that although femtocells are energy-aware communication tools, it is not acceptable to use them on a common frequency with the macrocell BSs.
References
1. Kennedy, C.: Femto Forum: Interference Management in UMTS Femtocells (December 2008), http://www.femtoforum.org
2. Chandrasekhar, V., Andrews, J.G.: Uplink Capacity and Interference Avoidance for Two-Tier Femtocell Networks. IEEE Transactions on Wireless Communications 8(7) (July 2009)
3. Chandrasekhar, V., Andrews, J.G.: Uplink Capacity and Interference Avoidance for Two-Tier Cellular Networks. In: IEEE GLOBECOM, pp. 3322–3326 (2007)
4. Hall, S.R., Jeffries, A.W., Avis, S.E., Bevan, D.D.N.: Performance of Open Access Femtocells in 4G Macrocellular Networks. In: Wireless World Research Forum 20, Ottawa (April 22, 2008)
5. Jo, H.-S., Mun, C., Moon, J., Yook, J.-G.: Interference Mitigation Using Uplink Power Control for Two-Tier Femtocell Networks. IEEE Transactions on Wireless Communications (October 13, 2009)
6. Zhang, J., de la Roche, G.: Femtocells: Technologies and Deployment. Wiley, Chichester (2009)
7. BME, Department of Telecommunications, T-Mobile research (2009)
8. Kishore, S., Greenstein, L.J., Poor, H.V., Schwartz, S.C.: Downlink User Capacity in a CDMA Macrocell with a Hotspot Microcell. In: IEEE GLOBECOM, vol. 3, pp. 1573–1577 (2003)
9. UAP3801 Product Description, Huawei Technologies, Ver. (March 2007)
10. UMTS AG Product Description, Huawei Technologies, confidential
11. Rahnema, M.: UMTS Network Planning, Optimization, and Inter-Operation with GSM. Wiley, Chichester (2007), ISBN 0470823011
12. Sanjaasuren, I., Sato, T.: Adaptive Power Control Applying to Femtocell. In: IEICE General Conference, Sendai (2010)
13. Agrawal, A.: Heterogeneous Networks: A New Paradigm for Increasing Cellular Capacity (2009)
14. COST Action 231, Digital Mobile Radio Towards Future Generation Systems: Final Report, European Commission, Brussels (1999)
Techno-economic Analysis of Inhouse Cabling for FTTH Navneet Nayan, Rong Zhao, and Kai Grunert Detecon International GmbH, Oberkasseler Str. 2, 53227 Bonn, Germany {Navneet.Nayan,Rong.Zhao,Kai.Grunert}@detecon.com
Abstract. In this paper we propose an effective methodology for developing a comprehensive cost model for Inhouse Cabling in FTTH. The techno-economic modeling of Inhouse Cabling is thoroughly investigated. A cost model is developed, and case studies for multiple network scenarios are undertaken in order to assess the significant cost factors for network operators deploying FTTH networks. The aim is to identify the major roadblocks preventing the step towards FTTH while maintaining the profitability of networks. Keywords: Access Networks, FTTH, Techno-Economics, Cost Modeling.
1 Introduction An increasingly insatiable bandwidth requirement, declining fixed-line revenues, and an extremely competitive and regulation-intensive global broadband market propel network operators to plan and deploy FTTx services. With the choice of deploying fiber to the curb (FTTC), the building (FTTB) or the home (FTTH), the network deployment strategy depends mainly on the associated costs. As one of the most effective Next Generation Access (NGA) solutions, Fiber-to-the-Home (FTTH) is capable of providing high bandwidths to enable new broadband Internet services. It is widely accepted that a substantial component of FTTH investments comes from the inhouse/in-building optical fiber deployment. In particular, the implementation of FTTH Inhouse Cabling for Multiple Dwelling Units (MDU) faces numerous challenges, e.g. high installation and material costs, different structures of houses or buildings, optimal positioning of access points, etc. In addition, copper/coaxial/power line and other standardized home network technologies have tended to become more and more competitive over the last few years. All these factors influence the brown-field migration strategy from FTTB to FTTH and are subject to a business case impacted by CAPEX (Capital Expenditure), OPEX (Operational Expenditure) and the associated revenue.
2 FTTH and Inhouse Cabling Inhouse Cabling includes the segment of the optical cable/s and associated components starting from the MDU Distribution Point, where the OSP (Outside Plant) Cable terminates, until the drop cable at the end user location/apartment. R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 209–212, 2011. © Springer-Verlag Berlin Heidelberg 2011
2.1 The Challenge Germany has extensive xDSL (Digital Subscriber Line) coverage owing to its existing copper infrastructure. While fiber deployment in single-family houses might be technically simpler, Multi-Dwelling Units (MDU) present a challenging task for optical fiber installation within buildings, both technically and economically. Given that a sizeable part of the German population resides in MDUs of, in most cases, 4-12 apartments, developing a use case for such a scenario is practical. The primary element of consideration for Inhouse Cabling is the architecture of the building in focus. Different building types demand specific approaches for a cost-optimized deployment. However, a strategy-driven standardized approach can also be an economical solution. The primary challenges associated with Inhouse Cabling are:
- Inhouse Network Topology/Architecture
- Regulatory Implications
- Co-ordination with the Landlord
Restricting the domain of this study to the technical challenges, we shall try to present a comprehensive approach to model the essential elements required for Inhouse Cabling and their corresponding cost values.
Fig. 1. (a) Logical Architecture
Fig. 1. (b) Example Solutions
2.2 Solution Architecture: An Illustrative Example A plethora of choices available for Inhouse Cabling solutions makes its cost modeling a difficult exercise. However, with the simplified, stratified approach shown in Fig. 1(a), some transparency in costs can be obtained through cost estimation for each network component. Fig. 1(b) shows possible examples of solution architectures in which the inhouse network topologies can be internally or externally routed. At the same time, various combinations of network connections can be established. Precise
technical know-how regarding each solution and comprehensive techno-economic cost modeling can be helpful for the selection of the optimal solution.
3 Cost Modeling The primary elements constituting CAPEX for Inhouse Cabling are the material and installation costs. A step-wise process modeling of the selected network components and the associated installation time requirements for each can provide a good estimate of the CAPEX. OPEX for Inhouse Cabling accounts for service activation, fault management and electrical power consumption, if any.

3.1 Cost Function

Notation
- C_CAPEX: Capital Expenditure
- C_OPEX: Operational Expenditure
- C_RV: Revenue based on different states during the migration period
- DP: MDU Distribution Point
- FDT: MDU Floor Distribution Terminal
- CS: Connectors and Splices
- CPE: Customer Premises Equipment, including the ONT (Optical Network Terminal)
- M: Cost factor for Material (Cable, Duct, DP, FDT, CS) in CAPEX
- I: Cost factor for Installation (Cable, Duct, DP, FDT, CS) in CAPEX
- K, D, P, F, S, E: Maximum number of cost factors for the different material and installation categories

Definition of the Cost Function. In this work a general cost function is defined, derived from the difference between the total cost (C_CAPEX and C_OPEX) and the revenues (C_RV):

Cost = f(C_CAPEX, C_OPEX, C_RV) = C_CAPEX + C_OPEX − C_RV

However, we focus on the cost of investment, i.e. C_CAPEX. The cost function is thus expressed as:

C_CAPEX = C_Material + C_Installation
        = Σ_{k=1}^{K} (M_k^Cable + I_k^Cable) + Σ_{d=1}^{D} (M_d^Duct + I_d^Duct) + Σ_{p=1}^{P} (M_p^DP + I_p^DP)
        + Σ_{f=1}^{F} (M_f^FDT + I_f^FDT) + Σ_{s=1}^{S} (M_s^CS + I_s^CS) + Σ_{e=1}^{E} (M_e^CPE + I_e^CPE)
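The CAPEX cost function is a plain double sum over component categories; the sketch below evaluates it for hypothetical cost figures (all numbers are illustrative assumptions, not values from the paper):

```python
# Hypothetical (M = material, I = installation) cost pairs in EUR per item;
# the category names follow the paper's notation: Cable, Duct, DP, FDT, CS, CPE.
components = {
    "cable": [(120.0, 80.0), (95.0, 60.0)],
    "duct":  [(40.0, 150.0)],
    "dp":    [(300.0, 100.0)],
    "fdt":   [(180.0, 90.0), (180.0, 90.0)],
    "cs":    [(5.0, 15.0)] * 8,
    "cpe":   [(60.0, 30.0)] * 4,
}

def capex(components):
    """C_CAPEX = C_Material + C_Installation: sum M + I over every item
    of every category, i.e. the double sum in the cost function above."""
    return sum(m + i for items in components.values() for m, i in items)

print(f"CAPEX: {capex(components):.2f} EUR")  # -> CAPEX: 2005.00 EUR
```

The same structure extends naturally to per-scenario bills of material, which is how the case studies in Fig. 3 compare building architectures.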
3.2 Cost Model Based on the experience from previously developed technology cost models, a sufficient platform is developed, taking into account CAPEX, OPEX and revenue for different scenarios, as shown below:
Fig. 2. Cost Modeling for FTTH Inhouse Cabling
An innovative approach has been developed to comprehensively model various Inhouse Cabling architectures, which facilitates cost- and performance-optimized planning, as illustrated in Fig. 3 (OPEX and revenue are ignored in this paper).
Fig. 3. Cost Comparison (CAPEX) of Multiple Building Architectures
4 Conclusion and Outlook A systematic quantitative cost analysis of multiple Inhouse Cabling solutions has been proposed. The developed cost model facilitates cost- and performance-optimized planning for the migration towards FTTH. With the implemented framework, we aim to further study the cost influence of the direct migration from FTTB to FTTH.
Impact of Incomplete CSI on Energy Efficiency for Multi-cell OFDMA Wireless Uplink
Alessio Zappone 1, Giuseppa Alfano 2, Stefano Buzzi 1, and Michela Meo 2
1 CNIT and University of Cassino, Via G. Di Biasio, 43, I-03043 Cassino (FR), Italy
2 Dipartimento di Elettronica, Politecnico di Torino, Torino, Italy
{alessio.zappone,buzzi}@unicas.it, [email protected], [email protected]

1 Introduction
A wise resource allocation design for a wireless network allows the optimization of a number of relevant parameters such as the data-rate, the radiated power, the number of supported users, and so on. Among these parameters, we focus on the energy-efficiency [1,2,3, and references therein], defined here as the number of error-free delivered bits for each energy-unit used for transmission. While all of the above references deal with energy efficiency in CDMA systems, non-cooperative resource allocation for energy efficiency maximization in the uplink of orthogonal frequency division multiple access (OFDMA) wireless networks, the leading multiple access strategy for the forthcoming fourth generation of wireless networks, is a much less investigated subject, that has been tackled for the first time, to the best of the authors’ knowledge, in [4]. There, a game-theoretic [5] approach for power control and subcarrier allocation is devised. In [6] a non-cooperative game in which each user selfishly tries to minimize its transmitted power subject to a transmission rate constraint is proposed, while in [7] an auction approach to subcarrier, modulation, and coding scheme allocation in single-cell and multi-cell OFDMA networks is proposed. However, in all of the above cited works, single-antenna processing and perfect CSI are assumed. In this work instead, we relax these two assumptions, first extending the results of [4] to the multiple antenna scenario, and then providing a power control algorithm that does not require perfect CSI.
2 System Model
Consider the uplink of a synchronous, multi-cell OFDMA network with B base stations (BSs), each equipped with M receiving antennas. Let N be the number
This work has been supported partially by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n. 257740 (Network of Excellence "TREND"), and partially by the project Energy eFFIcient teChnologIEs for the Networks of Tomorrow (EFFICIENT), funded by the Italian Ministry of Research and University (MIUR).
R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 213–216, 2011. © Springer-Verlag Berlin Heidelberg 2011
of subcarriers associated to the whole system and K the total number of active users in the network. The BS assignment is denoted by the K-dimensional vector a = (a_1, ..., a_K), whose entries satisfy a_i ∈ {1, ..., B}; we assume that the BS assignment has been predetermined, thus focusing on the resource allocation problem only. Let us denote by h_{k,j,m}(ℓ) = β_{k,j}(ℓ) α_{k,j,m}(ℓ) the channel gain between the k-th user and the m-th antenna of the j-th BS on the ℓ-th subcarrier, wherein α_{k,j,m}(ℓ) is the fast fading term (modeled as a zero-mean Gaussian random variate with mean square value 1/M), and β_{k,j}(ℓ) is the slow-fading term, which we assume to be independent of the antenna index m; the square root of β_{k,j}(ℓ) is distributed according to a log-normal random variate, with mean square value d_{k,j}^{−η}, where d_{k,j} is the distance between the k-th mobile user and the j-th BS, and η is the path loss exponent. Let p_k(ℓ) be the transmit power of the k-th user on the ℓ-th subcarrier, and assume that each user can transmit on L subcarriers. Let F_k contain the indexes of the L subcarriers allocated to the k-th user. Also, let ρ_k(ℓ) be a binary variable that equals 1 if user k transmits on subcarrier ℓ, and 0 otherwise. Now, if linear minimum mean square error (LMMSE) reception is employed, it can be shown that the k-th user's SINR on the ℓ-th subcarrier can be written as

γ_{k,a_k}(ℓ) = p_k(ℓ) ρ_k(ℓ) h†_{k,a_k}(ℓ) [ H^{[k]}_{a_k}(ℓ) P^{[k]}(ℓ) H^{[k]†}_{a_k}(ℓ) + σ² I ]^{−1} h_{k,a_k}(ℓ),   (1)

with h_{k,a_k}(ℓ) = [h_{k,a_k,1}(ℓ), ..., h_{k,a_k,M}(ℓ)]^T the k-th column of the overall multiuser channel matrix H_{a_k} from the K̃ users transmitting over subcarrier ℓ to the receiver a_k, P(ℓ) the diagonal matrix containing the K̃ transmit powers of the users employing subcarrier ℓ, and (·)^{[k]} denoting the matrix (·) without the column associated to user k. In the coming section, the proposed games are devised.
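Equation (1) can be evaluated numerically; the sketch below uses randomly drawn channels with illustrative dimensions and power values (all numbers are assumptions), with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K_sub = 4, 3                  # receive antennas, users on this subcarrier
sigma2 = 1e-2                    # noise power sigma^2

# Random multiuser channel matrix H_{a_k}: one column per transmitting user,
# entries scaled so each has mean square value 1/M as in the model above.
H = (rng.standard_normal((M, K_sub)) + 1j * rng.standard_normal((M, K_sub))) / np.sqrt(2 * M)
p = np.array([0.5, 0.8, 1.0])    # transmit powers on this subcarrier

def lmmse_sinr(H, p, sigma2, k):
    """SINR of user k under LMMSE reception, eq. (1): the interference
    covariance uses H^[k], i.e. H with the column of user k removed."""
    hk = H[:, k]
    idx = [j for j in range(H.shape[1]) if j != k]
    Hk = H[:, idx]
    R = Hk @ np.diag(p[idx]) @ Hk.conj().T + sigma2 * np.eye(H.shape[0])
    return float(np.real(p[k] * hk.conj() @ np.linalg.solve(R, hk)))

print([round(lmmse_sinr(H, p, sigma2, k), 2) for k in range(K_sub)])
```

The quadratic form h† R⁻¹ h is real and positive for any positive-definite R, so each user's SINR is strictly positive and scales linearly with its own power.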
3 Energy-Efficient Power Control
The energy-efficiency problem is formulated as the normal-form game G = {K, {S_k}_{k=1}^{K}, {u_k}_{k=1}^{K}}, with K = {1, ..., K} the set of players, S_k = [0, P_max]^L the k-th player's strategy set¹, and u_k = Σ_{ℓ∈F_k} u_k(ℓ) the k-th player's utility function, where

u_k(ℓ) = R (D/P) f(γ_{k,a_k}(ℓ)) / p_k(ℓ)

is the energy efficiency of player k on subcarrier ℓ. R is the transmit data-rate, P is the packet length, D ≤ P is the number of information symbols contained in each packet, and f(·) approximates the probability of correct reception of a packet of length P; it is usually chosen as f(γ) = (1 − e^{−γ})^P. If perfect CSI is available, the best response of player k is given by

p_k(ℓ) = min{p*_k(ℓ), P_max},   (2)

¹ P_max is the maximum allowed transmit power on the generic subcarrier, which we assume to be the same on all the used subcarriers; the general case presents no additional difficulties.
with p∗k () the transmit power such that γk,ak () = γ ∗ , and γ ∗ the unique, positive solution of the equation γf (γ) = f (γ). The following proposition holds. Proposition 1: The considered energy-efficient game admits a unique NE. Moreover, the best-response dyncamics (BRD) in which each player k iteratively plays his best response according to (2), always converges to the unique NE. Proof: The proof is omitted due to space constraints. Instead, if only partial CSI is available, the situation is more involved. Assume a + He , and that only H a is available for any k. In the following that Hak = H k ak k we perform a large system analysis in which K and M grow to infinity but with K a fixed ratio M . In this case, assuming that for any k, Heak has independent, unit-variance normalized entries and bounded moments of order 4 + , leveraging [9] it can be shown that (1) converges almost surely to −1 [k] [k]† e e† 2 γk,ak () = pk ()ρk ()βk,ak ()tr H[k] ()P ()H ()+E[H () H () ]+σ I . ak ak ak ak (3)
Now, the following proposition holds.

Proposition 2: Assume that only partial CSI is available. In the considered large-system scenario, the energy-efficient non-cooperative game admits a unique NE, and the associated BRD always converges to the unique NE.

Proof: The proof is omitted due to space constraints.

Although this result holds in the limit of large K and M, employing it for finite K and M can still be useful to gain some insight into the performance of the power control game with partial CSI. This will be carried out in the coming section.

[Figure: achieved utility [bit/J] at the NE versus K (0–20), for N = 10, L = 3, B = 4, M = 4; curves: perfect CSI, partial CSI, initial utility; y-axis from 10^5 to 10^10.]
Fig. 1. Average achieved energy efficiency at the NE, versus the number of active users. a) Perfect CSI; b) Partial CSI; c) Initial energy efficiency.
216    A. Zappone et al.

4 Numerical Results
We considered a square area of 2800 × 2800 meters, with B = 4 BSs regularly placed and users placed randomly inside this area. Channel coefficients were generated as explained in Section 2, and each user is assigned to the BS towards which he has the best channel coefficient. The system parameters have been set to N = 10, L = 3, R = 100 kbit/s, P = 120, P_max = 10 dBm, σ² = 10^{-12} W/Hz, η = 3, and M = 4. The entries of the matrix H^e_{a_k} have been generated as i.i.d. Gaussian random variables with zero mean and variance σ_e² = 10^{-6}. In Fig. 1 the achieved utility at the NE, averaged over the users, is shown versus the number of active users, for the perfect and partial CSI cases. For the sake of comparison, the initial utility before the resource allocation schemes come into play is also reported. The results clearly show how the proposed games improve the system's energy efficiency, both with perfect and partial CSI.
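The BRD of Section 3 can be sketched in a simplified scalar form. The code below replaces the LMMSE SINR (1) with a basic scalar interference model; the gains, γ*, and power cap are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def brd(G, sigma2, gamma_star, Pmax, sweeps=200):
    """Best-response dynamics for a scalar-SINR power-control game.

    G[k, j] is the gain from transmitter j to the receiver of user k
    (G[k, k] is the useful link). Each user transmits just enough power
    to reach gamma_star, capped at Pmax, as in Eq. (2)."""
    K = G.shape[0]
    p = np.zeros(K)
    for _ in range(sweeps):
        for k in range(K):
            interf = sigma2 + sum(p[j] * G[k, j] for j in range(K) if j != k)
            p[k] = min(gamma_star * interf / G[k, k], Pmax)
    return p

rng = np.random.default_rng(1)
K, Pmax = 4, 0.01                          # 10 dBm = 10 mW
G = rng.uniform(0.1, 1.0, (K, K)) * 1e-6   # weak cross gains (assumption)
np.fill_diagonal(G, rng.uniform(1.0, 2.0, K) * 1e-4)
p = brd(G, sigma2=1e-12, gamma_star=6.69, Pmax=Pmax)
```

At the fixed point every user either hits the target SINR γ* exactly or transmits at P_max, which is the NE structure described by the propositions above.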
References

1. Goodman, D., Mandayam, N.: Power control for wireless data. IEEE Personal Commun. 7, 48–54 (2000)
2. Meshkati, F., Poor, H.V., Schwartz, S.C., Mandayam, N.B.: An energy-efficient approach to power control and receiver design in wireless data networks. IEEE Trans. Commun. 53(11), 1885–1894 (2005)
3. Buzzi, S., Poor, H.V.: Joint receiver and transmitter optimization for energy-efficient CDMA communications. IEEE J. Sel. Areas Commun., Special Issue on Multiuser Detection for Advanced Communications Systems and Networks 26, 459–472 (2008)
4. Buzzi, S., Colavolpe, G., Saturnino, D., Zappone, A.: Potential Games for Power Control and Subcarrier Allocation in Uplink Multicell OFDMA Systems. In: Proc. of 2nd ICST Conference on Game Theory for Networks (GAMENETS 2011, invited paper), Shanghai, China, April 16–18 (2011)
5. Fudenberg, D., Tirole, J.: Game Theory. MIT Press, Cambridge (1991)
6. Han, Z., Ji, Z., Liu, K.: Non-cooperative resource competition game by virtual referee in multi-cell OFDMA networks. IEEE J. Sel. Areas Commun. 25, 1079–1090 (2007)
7. Yang, K., Prasad, N., Wang, X.: An auction approach to resource allocation in uplink OFDMA systems. IEEE Trans. Sig. Proc. 57, 4482–4496 (2009)
8. Marzetta, T.L.: Noncooperative Cellular Wireless with Unlimited Numbers of Base Station Antennas. IEEE Trans. Wireless Commun. 9, 3590–3600 (2010)
9. Hoydis, J., Kobayashi, M., Debbah, M.: Asymptotic Performance of Linear Receivers in Network MIMO. In: Asilomar Conference on Signals, Systems, and Computers (Asilomar 2010, invited paper), Pacific Grove, CA, USA (November 2010)
An Efficient Centralized Localization Method in Wireless Sensor Networks

Mohamadreza Shahrokhzadeh, Abolfazl T. Haghighat, and Behrooz Shahrokhzadeh

Islamic Azad University, Qazvin Branch, Daneshgah Street, Qazvin, Iran
{m.shahrokhzadeh,haghighat,bshahrokhzadeh}@qaiu.ac.ir
Abstract. The purpose of this article is to propose some major modifications that improve the performance of the Simulated Annealing based localization algorithm. To this end, by replacing the random initial location estimate of each sensor with an appropriate estimate, the execution time is decreased considerably. In addition, in order to mitigate flip ambiguity and increase localization accuracy, the fitness function and the probability distribution function are modified. The simulation results indicate a significant increase in localization accuracy and execution speed, especially in highly noisy networks.

Keywords: Wireless Sensor Network, Localization, Simulated Annealing, Flip Ambiguity.
1 Introduction

The random topology of wireless sensor networks has made the localization problem one of their most critical challenges. Simulated Annealing (SA), a generic probabilistic meta-heuristic, has been applied to solve the wireless sensor network localization problem. The idea of applying SA to the localization problem was introduced in [1]. Despite its advantages over other solutions, it has some disadvantages; the authors of [2,3,4,5] try to improve its performance, but it is still far from desirable. One of the most important disadvantages of this algorithm is its time-consuming calculations, which grow with the size of the network, making the localization process long. Another disadvantage is that the accuracy of the algorithm decreases due to flip ambiguity, especially in low-density networks. To address this problem, Kannan et al. [1] also added a new phase to the algorithm, which is costly and often ineffective. The aim of this article is to present our solutions to these two problems. We make some modifications that upgrade the performance of this algorithm considerably. Our proposed algorithm has two major advantages over the SA algorithm:

• The average execution time has been reduced to less than 40% of the original.
• The average accuracy of the algorithm has roughly doubled.

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 217–220, 2011. © Springer-Verlag Berlin Heidelberg 2011
218
M. Shahrokhzadeh, A.T. Haghighat, and B. Shahrokhzadeh
The proposed algorithm is described in the next section; after that, the simulation results are presented in the third section, and finally the fourth section concludes the article.
2 The Proposed Algorithm and Its Specifications

2.1 Initial Estimation of the Sensor Location

The SA algorithm generates an initial hypothetical network using a random estimate of the sensors' locations, although the position of each sensor that neighbors at least three anchors can be calculated with simple mathematical computations, a procedure called trilateration. By repeating this procedure over the whole network, the positions of more sensors can be detected. This recurring procedure is termed Iterative Multilateration (iM). Using these two methods, we can generate an appropriate initial hypothetical network. However, two weaknesses limit their capabilities. First, when the network is deployed in a noisy environment (which is usually the case), distance measurement between neighboring sensors is subject to error, which not only reduces accuracy but also propagates the error through the whole network [6]. In order to mitigate the adverse effects of this weakness, we suggest limiting the number of iM iterations to one when the noise is high. On the other hand, due to the random distribution of sensors, the probability of three anchors being available in every partition of the network, even after applying the iM method, is not high. In this situation, if a sensor has two neighboring anchors rather than three, we propose to select the mid-point between the two as its initial location estimate. In the case of one neighboring anchor, our proposal is to use the anchor's location as the sensor's location estimate, and finally, for sensors with no neighboring anchor, we consider a random location.

2.2 The Fitness and the Probability Distribution Functions

Flip ambiguity is the major problem in wireless sensor network localization. By reducing or eliminating it, both the speed and the precision of localization increase.
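Returning to Section 2.1, the trilateration step for a sensor with three anchor neighbors can be sketched as follows: subtracting the first range equation from the other two linearizes the system, leaving a 2×2 linear solve. The anchor positions and point below are illustrative.

```python
import math

def trilaterate(anchors, dists):
    """Estimate a 2-D position from three anchors and measured ranges.
    Anchors must not be collinear."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = dists
    # A @ [x, y] = b, from expanding (x - xi)^2 + (y - yi)^2 = ri^2
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]   # example anchors
true_pos = (3.0, 4.0)
d = [math.dist(a, true_pos) for a in anchors]
est = trilaterate(anchors, d)   # recovers (3.0, 4.0) with noiseless ranges
```

With noisy ranges the same solve gives a least-squares-style approximate position, which is exactly what the proposed algorithm uses as its initial estimate.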
Our proposal is to apply formula (1) as the fitness function in all phases of the algorithm:

f = Σ_{i∉anchors} ( Σ_{j∈N_i} (d̂_ij − d_ij)² + Σ_{j∈N̂_i, d̂_ij < R} w (d̂_ij − R)² )   (1)
Here d_ij is the measured distance between sensors i and j, d̂_ij is the estimated distance between these two sensors in the hypothetical network, R is the transmission range of each sensor, N_i is the set of real neighbors of sensor i, and N̂_i is the set of wrong neighbors of sensor i. The weight w of the second term starts with a value greater than 1 at the beginning of the algorithm and gradually decreases to 1. By applying
formula (1), any sensor that suffers from flip ambiguity increases the value of the fitness function, which helps to avoid flip ambiguity during the execution of the algorithm.

The probability distribution function P(Δf) is used to accept or reject a random perturbation. We propose that, if a good initial estimate of the sensor is available, it be calculated as exp(−100 Δf / T), and otherwise as exp(−Δf / T).
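The two modifications of this section can be sketched together. The network representation (dicts of neighbor distances) and the acceptance helper below are illustrative assumptions, not the authors' exact data structures.

```python
import math, random

def fitness(est, measured, R, w):
    """Fitness (1): est maps node -> (x, y) estimated position; measured
    maps each non-anchor node -> {neighbor: measured distance}. Wrong
    neighbors are non-neighbors whose estimated distance falls below R."""
    f = 0.0
    for i, nbrs in measured.items():
        for j, d_ij in nbrs.items():
            f += (math.dist(est[i], est[j]) - d_ij) ** 2
        for j in est:                       # penalize flip-induced wrong neighbors
            if j != i and j not in nbrs:
                d_hat = math.dist(est[i], est[j])
                if d_hat < R:
                    f += w * (d_hat - R) ** 2
    return f

def accept(delta_f, T, good_initial_estimate):
    """Metropolis acceptance; a good initial estimate makes the test
    100x stricter, so large perturbations rarely destroy it."""
    if delta_f <= 0:
        return True
    scale = 100.0 if good_initial_estimate else 1.0
    return random.random() < math.exp(-scale * delta_f / T)
```

A consistent hypothetical network scores near zero; an estimated non-neighbor closer than R adds the w-weighted penalty term, steering the search away from flipped configurations.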
3 Simulation and Results

The studied network consists of 100 sensors that are randomly distributed in a two-dimensional space in each experiment; 15% of the sensors are randomly added to the anchor set. All experiments are done in an environment with 10% noise. In order to apply noise to the measured distance between neighboring nodes, formula (2) is used:

d_ij = d′_ij × (1.0 + η × 0.1)   (2)
where d′_ij and d_ij are the true and measured distances, respectively, between nodes i and j, and η is a Gaussian random variable with zero mean and unit variance. Fig. 1 shows the difference between the proposed and reference algorithms in the accuracy of the location estimate of each sensor. Fig. 2 shows the average processing times of the two algorithms in networks with different degrees of connectivity. As is clear from Fig. 1, the accuracy of the proposed algorithm is better than that of the reference algorithm in all cases, and the difference is even more significant in low-connectivity networks.
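The ranging-noise model of formula (2) amounts to scaling each true distance by a unit-variance Gaussian perturbation at 10% strength; a minimal sketch:

```python
import random

def noisy_distance(d_true, noise_factor=0.1, rng=random):
    """Apply the multiplicative noise of Eq. (2):
    d_ij = d'_ij * (1.0 + eta * 0.1), with eta ~ N(0, 1)."""
    eta = rng.gauss(0.0, 1.0)
    return d_true * (1.0 + eta * noise_factor)

random.seed(42)
samples = [noisy_distance(10.0) for _ in range(10000)]
mean = sum(samples) / len(samples)   # close to the true distance 10.0
```

The noise is unbiased (mean measured distance equals the true distance) with standard deviation equal to 10% of the true range.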
[Figure: average localization accuracy versus connectivity (7–20), for SAL and Modified-SAL.]
Fig. 1. The average accuracy of localization
[Figure: average processing time versus connectivity (7–20), for SAL and Modified-SAL.]
Fig. 2. The average processing time
4 Conclusions and Future Work

In order to improve the performance of the SA based localization algorithm, we made two major modifications to it. First, by using the positions of the anchors together with the trilateration method, the random initial location estimate of each sensor is replaced with an appropriate estimate, so the execution time is decreased
considerably. Evaluation results show a 2.5-fold increase in average speed. Second, changing the fitness and probability distribution functions mitigates the flip ambiguity problem. The evaluations also show that the proposed algorithm decreases the localization error by half on average. Applying these modifications to other similar methods can be a goal for future studies.
References

1. Kannan, A., Mao, G., Vucetic, B.: Simulated Annealing based Wireless Sensor Network Localization. Journal of Computers 1(2), 15–22 (2006)
2. Li, Y., Xing, J., Yang, Q., Shi, H.: Localization Research based on Improved Simulated Annealing Algorithm in WSN. In: 5th Int. Conference on WICOM, China, pp. 1–4 (2009)
3. Su, Z., Shang, F., Wang, R.: A Wireless Sensor Network Location Algorithm Based on Simulated Annealing. In: 2nd Int. Conference on BMEI, China, pp. 1–5 (2009)
4. Shahrokhzadeh, M.R., Haghighat, A.T., Shahrokhzadeh, B.: An Improved Localization Algorithm Based on Simulated Annealing for Wireless Sensor Networks. In: 2nd Int. Conference on ICCCA, pp. 53–58. IEEE Press, South Korea (2011)
5. Zhang, Q., Wang, J., Jin, C., Zeng, Q.: Localization Algorithm for Wireless Sensor Network Based on Genetic Simulated Annealing Algorithm. In: 4th Int. Conference on WICOM, pp. 1–5 (2008)
6. Tay, J.H.S., Chandrasekhar, V.R., Seah, W.: Selective Iterative Multilateration for Hop Count-based Localization in Wireless Sensor Networks. In: 7th Int. Conference on Mobile Data Management, pp. 52–55 (2006)
Mechanisms for Distributed Data Fusion and Reasoning in Wireless Sensor Networks

Ioannis Papaioannou¹, Periklis Stavrou², Anastasios Zafeiropoulos², Dimitrios-Emmanuel Spanos², Stamatios Arkoulis², and Nikolas Mitrou²

¹ Research and Academic Computer Technology Institute, N. Kazantzaki Str., Rio, Patras, Greece
[email protected]
² National Technical University of Athens, Heroon Polytexneiou, 15773, Zografou, Greece
{pstavrou,tzafeir,dspanos,stark}@cn.ntua.gr, [email protected]
Abstract. Decision making in decentralized and dynamic environments is challenging due to continuous changes in the network topology and the absence of specific nodes responsible for taking decisions. These challenges increase in the case of sensor network deployments. In this paper, a novel approach is presented for realizing distributed data fusion and reasoning in wireless sensor networks. The approach is based on the storage and retrieval of data in stable overlay networks that abstract the physical network topology, and on the design of proper mechanisms for the semantic annotation of the available information so that it can be used in the decision-making process.

Keywords: distributed reasoning, data fusion, p2p networks, overlay networks.
1 Introduction

Sensor Networks (SN) have attracted enormous research effort and triggered a great deal of technological development during the last decade. Despite the impressive progress, several shortcomings and bottlenecks still prevent SN from being fully deployed and exploited in everyday-life applications, such as resource limitations, heterogeneity of infrastructure, and the requirement to collect vast amounts of data. Most of all, however, what is really missing in the field is a concrete methodology and a well-defined business model for building an integrated information system on top of existing SN infrastructures, capable of coping with the entire chain of operations and orchestrating the various parts together in a flexible, efficient, and economic way, without centralized components in the network that act as single points of failure. This information system has to be capable of proper reasoning and decision making over the collected information in a distributed manner. Towards this direction, two basic prerequisites are posed: (i) the existence of a framework for reliable and decentralized storage and retrieval of data, and (ii) the use of a suitable ontology. The first prerequisite is fulfilled through the use of a framework proposed in our previous work for the creation and

R. Lehnert (Ed.): EUNICE 2011, LNCS 6955, pp. 221–224, 2011. © Springer-Verlag Berlin Heidelberg 2011
222
I. Papaioannou et al.
maintenance of stable overlay networks, over which p2p techniques may be applied reliably and efficiently for the storage and retrieval of data [1]. Regarding the second prerequisite, we are going to use an already proposed context model that describes entities and interactions in ad-hoc networks [2], in combination with complementary context models or ontologies for the description of sensor network parameters and services [3][4][5].
2 Proposed Approach for Distributed Reasoning

In distributed systems such as WSNs, the application of semantic web techniques cannot be realized in a scalable way if all reasoning is expected to take place in a central node that collects all the semantically annotated data from the participating SN nodes. Furthermore, the existence of a central Knowledge Base (KB) is opposed to the in-network processing that is usually required in order to reduce the overall power consumption of the network. The available approaches for distributed reasoning can be classified into two main categories based on the underlying peer-to-peer network and the ability to control its overlay structure: the artificial intelligence area [6] and the database systems area [7,8]. Our approach leverages mechanisms from both areas and follows a hybrid solution similar to the one proposed in [8]. According to the proposed approach, every peer in the overlay can either distribute an RDF triple in the overlay or store it in its Local Knowledge Base (LKB) and distribute links to the terms it uses, which we call semantic links. Every peer in the overlay is also required to maintain a Global Knowledge Base (GKB) where key-value pairs from the established p2p network (based on Distributed Hash Tables, DHT) are stored. These key-value pairs are, as sketched previously, either semantic links or actual RDF triples. For instance, a peer with the RDF statement <s,p,o>, where s is the subject, p is the predicate and o is the object, can either store in the overlay the pairs (hash(s),<s,p,o>), (hash(p),<s,p,o>) and (hash(o),<s,p,o>), or otherwise use its LKB and store in the overlay (the GKBs) the semantic links (hash(s),IP), (hash(p),IP) and (hash(o),IP), where IP is the IP address of the corresponding node. This is essential for large-scale or resource-constrained networks, where a peer is not necessarily willing to disseminate its entire KB.
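The two publication options can be sketched as follows. The flat dict standing in for the DHT, and the list/set value types, are illustrative assumptions, not the actual overlay implementation.

```python
import hashlib

def h(term):
    """Key for a term in the DHT (SHA-1 as an illustrative hash)."""
    return hashlib.sha1(term.encode()).hexdigest()

def publish_full(dht, triple):
    """Fully distribute <s,p,o>: 3 key-value pairs, one per term."""
    for term in triple:
        dht.setdefault(h(term), []).append(triple)

def publish_links(dht, lkb, triple, ip):
    """Keep <s,p,o> in the LKB and publish semantic links
    (hash(term), IP) so the triple can be fetched from its owner."""
    lkb.append(triple)
    for term in triple:
        dht.setdefault(h(term), set()).add(ip)

full, gkb, lkb = {}, {}, []
triples = [(f"s{i}", "p", "o") for i in range(5)]   # n = 5, shared p and o
for t in triples:
    publish_full(full, t)
    publish_links(gkb, lkb, t, "10.0.0.1")

n_full = sum(len(v) for v in full.values())    # 3n pairs
n_links = sum(len(v) for v in gkb.values())    # n + 2 pairs
```

For n statements sharing the same predicate and object this reproduces the 3n versus n+2 key-value-pair count discussed next.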
Imagine for example a WSN application that needs to include an external source such as DBpedia. A KB containing statements <s1,p,o>, ..., <sn,p,o> would require storing 3n key-value pairs if fully distributed, while the semantic-links approach would require only n+2 key-value pairs. Our approach differs from the one proposed in [8] in its use of both LKBs and GKBs, i.e., of both semantic links and distributed statements. Semantic links add an extra step (forwarding the query to the IP address of the corresponding node) to the reasoning process, so well-known ontologies can be fully distributed to avoid this extra step and also to avoid redundant semantic links (these statements are expected to appear in many nodes). Another significant difference is that of query rewriting. To enable inferred results, we propose the distribution of rules (either ontological axioms or policy/application rules). The distribution is the same as for statements, using again semantic links for the (rare) policy/application rules and fully
Fig. 1. Backward chaining over a DHT overlay for querying whether there is a fire risk event with 80% confidence. The presented LKB and GKB contents are indicative.

Table 1. Query path example and description

No | Path | Forwarding term | Remaining query
1 | P1-P2-P4 | :FireRiskEvent | ?x1 :type :Sensor & ?x1 :hasValue ?x2 & ?x2 < "20"^^xsd:Integer
2 | P4-Pi | :Sensor | ?x1 :type :Sensor & ?x1 :hasValue ?x2 & ?x2 < "20"^^xsd:Integer
3 | Pi-P2 | :type | ?x1 :type ?Y & ?Y :subclassOf :Sensor & ?x1 :hasValue ?x2 & ?x2 < "20"^^xsd:Integer
4 | P2-P3 | :subclassOf | ?x1 :type :HumiditySensor & :HumiditySensor :subclassOf :Sensor & ?x1 :hasValue ?x2 & ?x2 < "20"^^xsd:Integer
distributing the common ontological axioms (e.g. RDFS rules). An extra choice to make is the use of backward or forward chaining: backward chaining involves the distribution of the headers of the rules only, whereas forward chaining involves the distribution of their bodies. In Figure 1, a backward-chaining example is presented. Peer P1 queries whether there is a fire risk event with 80% confidence. The method for evaluating conjunctive triple pattern queries in the DHT overlay is based on the one proposed in [7], with the difference that now each node in the overlay maintains two KBs. Using the overlay to find statements about a FireRiskEvent, the query is forwarded to P4 through the semantic link in P2. The remaining query after the application of the rule in P4 is shown in the first row of Table 1. After that, P4 sends the remaining query to Pi, using as forwarding term the term :Sensor; the sequence for selecting a forwarding term from a clause is, as in [7,8], first the subject, then the object, and
finally the predicate, for non-variable terms. In Pi, the evaluation is not successful and the next forwarding term of the clause (:type) is selected, which routes the remaining query to P2. In P2, the RDFS subsumption rule is applied and the produced query is forwarded through the term :subclassOf to P3, where the query is successfully evaluated (against both the GKB and the LKB) and the result is returned to P1.
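The forwarding-term selection just described (subject, then object, then predicate, restricted to non-variable terms) can be sketched as follows; the tuple representation of a triple-pattern clause, with variables prefixed by '?', is an illustrative assumption.

```python
def forwarding_terms(clause):
    """Return candidate forwarding terms for one triple-pattern clause,
    in the preference order subject -> object -> predicate, keeping only
    non-variable terms (variables start with '?')."""
    s, p, o = clause
    return [t for t in (s, o, p) if not t.startswith("?")]

# first clause of the example query of Table 1
clause = ("?x1", ":type", ":Sensor")
terms = forwarding_terms(clause)
```

For the example clause this yields [":Sensor", ":type"]: the query is first routed via :Sensor (as P4 does), and when that evaluation fails the next term, :type, routes it onward (as Pi does).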
3 Conclusions and Future Work In this paper, existing mechanisms for the design of decentralized decision making techniques in wireless sensor networks are analyzed, taking into account the existing representation schemes and models for the sensor networking world. Challenges that arise due to the dynamic and volatile nature of wireless sensor networks are reported and taken into account in our design. Based on these challenges, an approach is proposed for distributed reasoning in sensor networks. Acknowledgments. This work has been partially supported by the National Strategic Reference Framework (NSRF) (Regional Operational Programme – Western Greece) under the title “Advanced Systems and Services over Wireless and Mobile Networks” (number 312179) and the “Alexander S. Onassis Public Benefit Foundation”, under its Scholarships Programme.
References

1. Gouvas, P., Zafeiropoulos, A., Liakopoulos, A., Mentzas, G., Mitrou, N.: Integrating Overlay Protocols for Providing Autonomic Services in Mobile Ad-Hoc Networks. IEICE Transactions on Communications E93.B(8), 2022–2034 (2010)
2. Zafeiropoulos, A., Gouvas, P., Liakopoulos, A.: A context model for autonomic management of ad-hoc networks. In: Int. Conf. on Pervasive and Embedded Computing and Communication Systems, Algarve, Portugal (2011)
3. Reed, C., Botts, M., Davidson, J., Percivall, G.: OGC® Sensor Web Enablement: Overview and High Level Architecture. IEEE Autotestcon, 372–380 (2007)
4. Henson, C.A., Pschorr, J.K., Sheth, A.P., Thirunarayan, K.: SemSOS: Semantic Sensor Observation Service. In: Proc. of the International Symposium on Collaborative Technologies and Systems, Baltimore, Maryland, USA (2009)
5. Eid, M., Liscano, R., El-Saddik, A.: A Universal Ontology for Sensor Networks Data. In: IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, Ostuni, Italy (2007)
6. Adjiman, P., Chatalic, P., Goasdoué, F., Rousset, M.C., Simon, L.: Distributed reasoning in a peer-to-peer setting: Application to the semantic web. Journal of Artificial Intelligence Research 25(1), 269–314 (2006)
7. Liarou, E., Idreos, S., Koubarakis, M.: Evaluating conjunctive triple pattern queries over large structured overlay networks. In: Cruz, I., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., Aroyo, L.M. (eds.) ISWC 2006. LNCS, vol. 4273, pp. 399–413. Springer, Heidelberg (2006)
8. Anadiotis, G., Kotoulas, S., Siebes, R.: An architecture for peer-to-peer reasoning. In: Proceedings of the International Semantic Web Conference, Busan, Korea (2007)
Author Index

Alfano, Giuseppa 213
Arkoulis, Stamatios 221
Bauschert, Thomas 21
Biagi, Mauro 5
Buzzi, Stefano 213
Castignani, German 181
Dai, Qin 146
de Boer, Pieter-Tjerk 9
de O. Schmidt, Ricardo 31
Drago, Idilio 134
Dyadenko, Alexander 199
Dyadenko, Olga 199
Fioreze, Tiago 100
Fisches, Stefan 112
Globa, Larisa 199
Gomes, Reinaldo 31
Grunert, Kai 209
Haghighat, Abolfazl T. 217
Hofstede, Rick 134
Horvath, Daniel 169
Irfan Rafique, Muhammad 21
Jakó, Zoltán 203
Jeney, Gábor 203
Jorswieck, Eduard 3
Kaiserswerth, Matthias 4
Killat, Ulrich 77
Kühlewind, Mirja 112
Kulik, Ivett 157
Lampropulos, Alejandro 181
Luntovskyy, Andriy 199
Lyubopytov, Vladimir 195
Meo, Michela 213
Mitrou, Nikolas 221
Móczár, Zoltán 124
Molnár, Sándor 124
Montavont, Nicolas 181
Mudriievskyi, Stanislav 54
Nayan, Navneet 209
Papaioannou, Ioannis 221
Porsch, Marco 21
Pras, Aiko 31, 100, 134
Radeke, Rico 89
Richter, Volker 65
Robles, Jorge Juan 43
Shahrokhzadeh, Behrooz 217
Shahrokhzadeh, Mohamadreza 217
Spanos, Dimitrios-Emmanuel 221
Sperotto, Anna 134
Stavrou, Periklis 221
Sultanov, Albert 195
Tlyavlin, Anvar 195
Trinh, Tuan Anh 157, 169
Tsokalo, Ievgenii 54
Türk, Stefan 65, 89
van Nee, Floris 9
Yamnenko, Yulia 54
Zafeiropoulos, Anastasios 221
Zappone, Alessio 213
Zhang, Shu 77
Zhao, Rong 209