Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany
6811
Hannes Frey Xu Li Stefan Ruehrup (Eds.)
Ad-hoc, Mobile, and Wireless Networks 10th International Conference, ADHOC-NOW 2011 Paderborn, Germany, July 18-20, 2011 Proceedings
Volume Editors

Hannes Frey
University of Paderborn, Department of Computer Science
Pohlweg 47-49, 33098 Paderborn, Germany
E-mail: [email protected]

Xu Li
University of Waterloo, Department of Electrical and Computer Engineering
200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
E-mail: [email protected]

Stefan Ruehrup
OFFIS - Institute for Information Technology
Escherweg 2, 26121 Oldenburg, Germany
E-mail: stefan.ruehrup@offis.de
ISSN 0302-9743 e-ISSN 1611-3349 ISBN 978-3-642-22449-2 e-ISBN 978-3-642-22450-8 DOI 10.1007/978-3-642-22450-8 Springer Heidelberg Dordrecht London New York Library of Congress Control Number: 2011931292 CR Subject Classification (1998): C.2, H.4, D.2, K.6.5, H.3 LNCS Sublibrary: SL 5 – Computer Communication Networks and Telecommunications
© Springer-Verlag Berlin Heidelberg 2011 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
In 2011, the International Conference on Ad-Hoc Networks and Wireless (ADHOC-NOW) took place for the 10th time. This successful series shows, on the one hand, that wireless ad hoc communication still offers new research challenges, and on the other hand, that ADHOC-NOW is already established as one of the premier venues for research on this exciting topic. After previous venues in Canada, France, Mexico, and Spain, it was the first time that ADHOC-NOW took place in Germany. The 2011 venue was the city of Paderborn, a lively city in Westphalia with a rich cultural heritage and home to various high-tech companies as well as the University of Paderborn, which hosted the conference.

The 10th ADHOC-NOW attracted 53 submissions, of which 23 were accepted for presentation. In addition, we invited four papers on selected topics. The accepted papers cover topics in routing, medium access control and topology control, security and mobility issues, as well as analytical considerations and applications for ad hoc and sensor networks. This enabled us to provide an interesting and versatile program that is representative of recent activities in this area of research.

We would like to thank the members of the Program Committee, the reviewers, and all the people who helped in organizing the event and putting together an excellent program.

May 2011
Hannes Frey Xu Li Stefan Ruehrup
Organization
Program Chairs
Hannes Frey (University of Paderborn, Germany)
Xu Li (University of Waterloo, Canada)

Proceedings Chair
Stefan Ruehrup (OFFIS, Germany)

Publicity Chairs
Jiming Chen (Zhejiang University, China)
Nathalie Mitton (INRIA, France)
Yu Wang (UNC at Charlotte, USA)

Submission Chairs
Cailian Chen (Shanghai Jiao Tong University, China)
Ivan Martinovic (University of Kaiserslautern, Germany)

Web Chair
Tahiry Razafindralambo (INRIA, France)

Local Arrangements
Hannes Frey (University of Paderborn, Germany)
Technical Program Committee
Nael Abu-Ghazaleh, Michel Barbeau, Zinaida Benenson, Matthias R. Brust, Marcello Caleffi, Juan Carlos Cano, Jean Carle, Chun Tung Chou, Costas Constantinou, Falko Dressler, Vasilis Friderikos, Jie Gao, François Ingelrest, Abdelmajid Khelil, Ralf Klasing, Jerzy Konorski, Evangelos Kranakis, Thomas Kunz, Tianji Li, Xiaoyan Li, Weifa Liang, Hai Liu, Rongxing Lu, Pietro Manzoni, Marc Mosko, Amiya Nayak, Ioanis Nikolaidis, Sotiris Nikoletseas, Jaroslav Opatrny, Marina Papatriantafilou, S. S. Ravi, Francisco J. Ros, Pedro M. Ruiz, Sushmita Ruj, Juan A. Sánchez, Nicola Santoro, David Simplot-Ryl, Ivan Stojmenovic, Limin Sun, Violet Syrotiuk, Jozef Wozniak, Kui Wu, Yulei Wu, Qin Xin
External Reviewers
Vincenzo Bonifaci, Julien Champ, Bilel Derbel, Dionysios Efstathiou, Juan Jose Galvez, Adrian Kosowski, Tomasz Radzik
Table of Contents
Routing and Activity Scheduling

Effective Geographic Routing in Wireless Sensor Networks with Inaccurate Location Information (Rafael Marin-Perez and Pedro Miguel Ruiz) ..... 1
Energy Efficient Mobile Routing in Actuator and Sensor Networks with Connectivity Preservation (Essia Hamouda, Nathalie Mitton, and David Simplot-Ryl) ..... 15
Joint Duty Cycle Scheduling, Resource Allocation and Multi-constrained QoS Routing Algorithm (Jamila Ben Slimane, Ye-Qiong Song, Anis Koubaa, and Mounir Frikha) ..... 29
Energy Efficient Monitoring for Intrusion Detection in Battery-Powered Wireless Mesh Networks (Amin Hassanzadeh, Radu Stoleru, and Basem Shihada) ..... 44

Topology Control

Using Battery Level as Metric for Graph Planarization (Jovan Radak, Nathalie Mitton, and David Simplot-Ryl) ..... 58
Empirical Approach to Network Sizing for Connectivity in Wireless Sensor Networks with Realistic Radio Propagation Models (Pedro Wightman, Miguel Jimeno, Daladier Jabba, Miguel Labrador, Mayra Zurbarán, César Córdoba, and Armando Guerrero) ..... 72
A Topology Control Algorithm for Interference and Energy Efficiency in Wireless Sensor Networks (Hugo Braga and Flávio Assis) ..... 86
Fault Tolerant Interference-Aware Topology Control for Ad hoc Wireless Networks (Md. Ehtesamul Haque and Ashikur Rahman) ..... 100

Medium Access Control

PaderMAC: A Low-Power, Low-Latency MAC Layer with Opportunistic Forwarding Support for Wireless Sensor Networks (Marcus Autenrieth and Hannes Frey) ..... 117
Overhearing for Congestion Avoidance in Wireless Sensor Networks (Damien Roth, Julien Montavont, and Thomas Noël) ..... 131
Multihop Performance of Cooperative Preamble Sampling MAC (CPS-MAC) in Wireless Sensor Networks (Short Paper) (Rana Azeem M. Khan and Holger Karl) ..... 145

Security

Secure Position Verification for Wireless Sensor Networks in Noisy Channels (Partha Sarathi Mandal and Anil K. Ghosh) ..... 150
Efficient CDH-Based Verifiably Encrypted Signatures with Optimal Bandwidth in the Standard Model (Yuan Zhou and Haifeng Qian) ..... 164
MobiID: A User-Centric and Social-Aware Reputation Based Incentive Scheme for Delay/Disruption Tolerant Networks (Invited Paper) (Lifei Wei, Haojin Zhu, Zhenfu Cao, and Xuemin (Sherman) Shen) ..... 177
Improved Access Control Mechanism in Vehicular Ad Hoc Networks (Invited Paper) (Sushmita Ruj, Amiya Nayak, and Ivan Stojmenovic) ..... 191

Mobility Management and Handling

A New Coverage Improvement Algorithm Based on Motility Capability of Directional Sensor Nodes (M. Amac Guvensan and A. Gokhan Yavuz) ..... 206
A Multi-objective Approach for Data Collection in Wireless Sensor Networks (Christelle Caillouet, Xu Li, and Tahiry Razafindralambo) ..... 220
Smart and Balanced Clustering for MANETs (Luís Conceição and Marilia Curado) ..... 234
Promoting Quality of Service in Substitution Networks with Controlled Mobility (Invited Paper) (Tahiry Razafindralambo, Thomas Begin, Marcelo Dias de Amorim, Isabelle Guérin Lassous, Nathalie Mitton, and David Simplot-Ryl) ..... 248

Applications and Evaluation

Improving CS-MNS through a Bias Factor: Analysis, Simulation and Implementation (Thomas Kunz and Ereth McKnight-MacNeil) ..... 262
A Methodology to Evaluate Video Streaming Performance in 802.11e Based MANETs (Tim Bohrloch, Carlos T. Calafate, Álvaro Torres, Juan-Carlos Cano, and Pietro Manzoni) ..... 276
Node Degree Improved Localization Algorithms for Ad-Hoc Networks (Short Paper) (Rico Radeke and Jorge Juan Robles) ..... 290
Using BPEL to Realize Business Processes for an Internet of Things (Nils Glombitza, Sebastian Ebers, Dennis Pfisterer, and Stefan Fischer) ..... 294

Analytical Considerations

On Complexity of Wireless Gathering Problems on Unit-Disk Graphs (Nikola Milosavljević) ..... 308
On Cardinality Estimation Protocols for Wireless Sensor Networks (Jacek Cichoń, Jakub Lemiesz, and Marcin Zawada) ..... 322
Maximizing Network Lifetime Online by Localized Probabilistic Load Balancing (Yongcai Wang, Yuexuan Wang, Haisheng Tan, and Francis C.M. Lau) ..... 332
Time-Varying Graphs and Dynamic Networks (Invited Paper) (Arnaud Casteigts, Paola Flocchini, Walter Quattrociocchi, and Nicola Santoro) ..... 346

Author Index ..... 361
Effective Geographic Routing in Wireless Sensor Networks with Inaccurate Location Information

Rafael Marin-Perez and Pedro Miguel Ruiz

Department of Information and Communications Engineering, University of Murcia, E-30100, Espinardo, Murcia, Spain
{rafael81,pedrom}@um.es
Abstract. Geographic routing is one of the most widely-accepted techniques to route information in wireless sensor networks. The main novelty is that the current node routes the packet to a neighbor which is located closer to the destination than itself. This process is called greedy routing. When the packet reaches a node that has no neighbors located closer to the destination than itself (a.k.a. a local minimum), a recovery strategy is used to get to nodes that can again resume greedy routing. However, recent studies have proven that geographic routing may be ineffective in real deployments where location estimation systems introduce location errors. In this paper, we analyze in detail the problems induced by location errors in greedy routing and propose an Effective Greedy routing protocol supporting Location Errors (EGLE). Our simulation results show that EGLE is able to outperform existing solutions. It achieves nearly a 100% packet delivery ratio with very little additional overhead. Keywords: Wireless Sensor Network, Geographic Routing, Inaccurate Location.
1 Introduction
A Wireless Sensor Network (WSN) consists of a set of autonomous and lightweight devices equipped with wireless interfaces and sensor hardware for monitoring the environment. When a node has data to send to a destination which is outside its radio range, it uses multihop communications. That is, neighboring sensor nodes are used as relays to forward data packets towards the destination. Data communications are the major source of energy consumption. In addition, the number of devices in a WSN may be potentially large. Thus, the design of efficient and scalable communication protocols for WSNs has been one of the most active research areas within the WSN community. Geographic Routing (GR) has emerged as one of the most efficient and scalable routing solutions for WSNs [2]. In GR, nodes only need local information to take data forwarding decisions. In particular, nodes only need to know their position, the position of their neighbors and the position of the destination. Based on that, each forwarder selects as next relay its best neighbor advancing toward the destination. This process is also called the Greedy Routing Scheme (GRS) [1].
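For illustration, here is a minimal sketch of this greedy forwarding rule, assuming known 2-D coordinates; the function and variable names are ours and not part of GRS itself:

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(current_pos, neighbors, dest_pos):
    """GRS-style selection: neighbors is a dict node_id -> (x, y).
    Returns the neighbor closest to the destination among those strictly
    closer than the current node, or None if the node is a local maximum
    and a recovery scheme has to take over."""
    closer = {n: p for n, p in neighbors.items()
              if dist(p, dest_pos) < dist(current_pos, dest_pos)}
    if not closer:
        return None
    return min(closer, key=lambda n: dist(closer[n], dest_pos))
```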
The data packet may eventually reach a node having no neighbors closer to the destination than itself. In that case, the node is called a local maximum and a recovery scheme (i.e., face routing) is used to exit the local maximum. In most geographic routing protocols nodes discover the position of their neighbors by sending their position in periodic 1-hop beacon messages. These beacons can produce a huge overhead in terms of unnecessary packet transmissions and wasted energy. To reduce the periodic beacon overhead, beacon-less GR solutions have been proposed. The idea is to discover neighbors reactively [7]. However, both beacon-based and beaconless geographic routing protocols assume perfect location information. Recent studies ([6,3,8]) have proven that existing geographic routing solutions may be ineffective in the very common case in which the position of the nodes is not fully accurate. They neglect the inaccuracy of localization systems in real deployments [6]. In this paper we analyze the problems that location errors produce in geographic routing. The first issue is that a forwarding node may end up selecting nodes which do not provide any progress based on their real position (backward progress). Secondly, a packet may enter a local maximum because the forwarding node thinks, based on the estimated positions, that there is no neighbor providing progress even if there is one considering the real positions (false local maximum). Finally, a node may think that it is in the radio range of the destination based on estimated positions, while it may still need some additional forwarder based on real positions (delivery failure at the destination). We design an effective greedy routing strategy to deal with location errors. Our proposed solution, called EGLE (Effective Greedy routing protocol supporting Location Errors), combines several modes of operation to deal with the main causes of packet drops stated above (backward progress, false local maximum and delivery failure at the destination). For the first two issues we propose a novel neighbor selection heuristic that uses the variance of the location estimation and the number of times that a node overhears the same data packet to penalize next hop relays which may produce backward progress or a false local maximum. Secondly, we use an alternative mode in a limited region around the local maximum to be able to exit the false void area. Thirdly, we design a lightweight broadcasting scheme to be able to deliver messages within the area that includes all possible positions of the destination. Our simulations show that EGLE exhibits delivery ratios higher than 90% even in scenarios with location errors as high as 100% of the radio range, outperforming existing solutions in the literature. The remainder of the paper is organized as follows. Section 2 discusses existing proposals to enhance greedy routing in the presence of location errors. Section 3 studies the main causes of packet losses in greedy routing due to location errors. Section 4 describes the operation of our proposed solution. In Section 5, we evaluate the performance of EGLE through simulations and analyze its improvements versus existing solutions. Finally, we provide some concluding remarks in Section 6.
2 Related Work
Recent studies [8] have demonstrated that the performance of greedy routing suffers with inaccurate location information. Concretely, Witt et al. [8] study the impact of location errors in beacon-based and beaconless geographic routing protocols (GPSR and BGR). Their study shows that both protocols experience a huge increase in packet losses as the location error increases. Kim et al. [3] model the effects of location errors on greedy routing. The authors prove that location errors may produce local maxima even when a real greedy route is present. In fact, their analysis shows that most delivery failures come from these false local maxima, and especially that 90% of the packet drops happen within the coverage area of the destination. To improve greedy routing performance, Kwon et al. [4] propose a probability function called MER (Maximum Expectation within transmission Range) that incorporates location errors to determine the goodness of candidates as next hops. This function penalizes nodes whose real positions might cause transmission failures and backward progress. Simulation results show an improvement of the delivery ratio for scenarios with moderate location errors. In situations in which the standard deviation of the error is higher than 31.5% of the radio range, MER is not able to deal with delivery failures. In particular, Oliveira et al. [6] showed that localization systems used in WSNs can have errors as high as about 100% of the radio range. The authors also analyze the influence of location errors on geographic algorithms. Their analysis shows the importance of understanding the error behavior and the need for geographic algorithms able to deal with those location errors. In this paper, we propose a novel scheme which is able to solve the main causes of packet drops in greedy routing due to location errors, even when location errors are very large. Before presenting our solution, we analyze in the next section the issues provoked by location errors in greedy routing.
3 Effects of Inaccurate Location in Greedy Routing
This section analyzes the effects of location errors in two greedy routing protocols: GRS [1] and MER [4]. We assume an underlying topology in which nodes are well-distributed without void areas and greedy routing suffices to deliver the message. The analysis shows that even in this ideal topology location errors can make greedy routing enter local maxima as well as provoke delivery failures. Let us assume a current relay i having a packet addressed to a destination node d outside of its radio range R. Then, i has a set Q of neighbors j ∈ Q located within R to route the packet. Assuming a topology without void areas, in both GRS and MER i selects one of its neighbors j located closer to d than itself. However, in realistic scenarios, every node a located at A estimates an inaccurate position A'. So, in practice i selects a neighbor j whose inaccurate position J' is closer to D' than I' is. If no neighbor satisfies this condition, i thinks that there is a void area and becomes a local maximum.
(a) False Void Area   (b) Reaching False Local Maximum   (c) False Destination Area
Fig. 1. In (a) a false void area appears due to the inaccurate positions (I', J') of i and j, I and J being their real positions, respectively. In (b) a greedy path exists between n0 at N0 and d at D through nodes n1, n3 and n4, really located at N1, N3 and N4, respectively; but the protocols (GRS and MER) reach a false local maximum n2 at N2 because of the wrong estimated position N2'. In (c) i fails the delivery to d because in reality the distance between them is larger than R (dist(I', D') < R < dist(I, D)).
A false void area may happen for two reasons, presented in Fig. 1(a), where for simplicity we assume that D' = D. First, if the relay i has an estimated position I' that is closer to D than I, the inaccurate greedy area is smaller. Second, if a greedy neighbor j has an estimated position J' that is farther from D than J, then J' is placed outside the greedy area of i. The combination of both conditions also produces false void areas, and i becomes a false local maximum. Now, we explain through an example how both GRS and MER reach a false local maximum according to their next hop selection functions. Fig. 1(b) shows an example where a node n0 has a packet addressed to the destination d located at D. For simplicity, every node ni knows its real position Ni except the node n2 located at N2, which estimates an inaccurate position N2'. As we see in the figure, n2 thinks, based on its estimated position, that it has no neighbor closer to d than itself even if that is not the case. In GRS, the selection function minimizes the distance to the destination. Then, n0 selects n2, whose estimated position N2' is the closest to D among n0's neighbors. So, GRS fails. In MER, the selection function incorporates location errors to prioritize nodes whose real positions are likely inside the greedy area, preventing backward progress and transmission failures. This means that a sender penalizes the goodness of greedy neighbors for their proximity to itself and for being nearly as far as the radio range. In this example, n0 selects n1 because n2 is near the radio range. In the next hop, n1 must select a next relay among its closer neighbors n2 and n3. Neighbors n2 and n3 are good candidates because their positions N2' and N3 are within the greedy area of the position N1. Then, n1 selects the neighbor n2, whose inaccurate position N2' is closer to D than N3. In this hop, the forwarding to n2 generates backward progress because its real position N2 is actually farther from D than the previous position N1. Both GRS and MER end up falling into a false local maximum for two different causes. GRS is bound to reach a local maximum because it selects nodes with excessive distance from the previous relay. MER avoids the local maximum in the first instance. But after several hops, the selection of a previous candidate
may generate a local maximum due to backward progress. Moreover, in both GRS and MER, n2 discards n3 as a candidate relay because n2 thinks that n3 provides no progress. However, using n3 the packet would be able to exit the false local maximum and advance through n4 toward d. Finally, we study a special case of false void area where the current relay i thinks that it is able to deliver the packet to d directly. Both GRS and MER assume perfect delivery within the destination radio range R. However, the delivery fails if the real distance between i and d is larger than R. Figure 1(c) shows an example of delivery failure where the inaccurate distance between i and d is lower than the radio range R. Summing up, our analysis shows that false void areas may appear and existing greedy solutions fail to deliver messages in scenarios with location errors.
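A small numeric illustration of such a false local maximum, with coordinates chosen purely for this example (R = 150, as in the simulations of Section 5):

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

D = (300.0, 0.0)                              # destination, assumed known exactly here
I_real, I_est = (100.0, 0.0), (160.0, 0.0)    # relay i: the error pushes I' towards D
J_real, J_est = (180.0, 0.0), (140.0, 0.0)    # neighbor j: the error pushes J' away from D

real_progress = dist(I_real, D) - dist(J_real, D)  #  80.0: a real greedy step exists
est_progress = dist(I_est, D) - dist(J_est, D)     # -20.0: i sees no progress and wrongly
print(real_progress, est_progress)                 #        declares a (false) void area
```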
4 Effective Geographic Routing with Location Errors (EGLE)
This section describes the operation of EGLE and shows how it solves all the issues mentioned above. We first give an overview of the overall operation and then describe the details of the routing protocol. EGLE is a beacon-less protocol based on the forwarding scheme of BOSS [7]. EGLE uses a 3-way handshake scheme based on query, response and selection to discover neighbors reactively and select the next hop. According to BOSS, the node currently holding the data packet sends a query message (Data) which also includes the data payload. In this way, only those neighbors that successfully receive the Data answer with a message (Response) informing about their identifiers and positions. Finally, the sender selects the next relay by means of a single message (Selection) with the identifier of the selected neighbor. The benefit of sending the Data message first is to ensure that the selected neighbor successfully receives the data packet. The main contributions of EGLE consist of three effective mechanisms to deal with location errors and an efficient delay function to reduce the number of responses.
1) Advance toward the destination avoiding local maxima. EGLE selects the next relay combining two objective functions to prevent local maxima coming from backward progress and excessive distance. The first penalizes neighbors that already took part in the forwarding of the same data packet a few hops before. The second penalizes neighbors whose positions are too far from the sender.
2) Continue through false void areas. EGLE proposes an alternative mode in local maxima to find a 2-hop neighbor closer to the destination. The alternative mode has low overhead because it just requires knowing the neighbors of the local maximum node, yet it improves the delivery rate.
3) Reduce the number of transmissions. EGLE presents a sophisticated timer assignment protocol to prioritize the answers from good candidates. This
alleviates bandwidth consumption as well as reduces collisions at the MAC layer, which increases EGLE's reliability.
4) Deliver the message inside the destination radio range. Finally, EGLE applies a limited broadcast scheme [5] to disseminate the data packet directly within a region around the destination position. The broadcast scheme requires low overhead because it is only performed in a limited region, yet it improves the delivery rate.
4.1 Details of Greedy Mode
Here we describe the detailed operation of EGLE in greedy mode. EGLE uses greedy mode as much as possible and incorporates an alternative mode which is only used when no further advance in greedy mode is possible. The alternative mode will be described later on. In location error scenarios, we assume that each node a located at A has an estimated position A', where A' = A + W, W being a Gaussian random vector with zero mean and standard deviation σa. According to the well-known three-sigma rule for Gaussian distributions, about 68% of the samples fall within one standard deviation. In general, σa is the maximum position difference between A and A'. Consider the current relay i holding the packet addressed to the destination d and a set of neighbors j ∈ Q receiving the transmissions of i within its radio range R. We define σij as the location error of neighbor j with respect to the current relay i, denoted as σij = √(σi² + σj²). Moreover, σij represents the maximum distance difference between their real distance (dist(I, J)) and their estimated one (dist(I', J')), where dist(A, B) represents the Euclidean distance between positions A and B. The maximum estimated distance between the current relay's position I' and each neighbor's position J' is denoted as MaxDistij = R + σij. We define the progress of neighbor j to destination d with respect to the current relay i to be Pij = dist(I', D') − dist(J', D') ∈ [−MaxDistij .. MaxDistij]. In greedy mode, the current relay i uses the neighbors j providing advance toward the destination d, denoted as Pij > 0. The selection function for the next relay combines two objective functions. In the following, we describe each of the two objective functions and then show the selection operation using an example.
1) Penalize neighbors taking part in the forwarding process several times. As our analysis showed above, a local maximum may come from backward progress. In that case nodes may take part as candidates to route the packet several times. In each hop, the packet advances toward the destination, and the selection of previous candidates may generate backward progress. To avoid backward progress, we exploit the forwarding scheme of BOSS to identify and penalize previous candidates. In BOSS, the current relay i first sends the Data message to discover every neighbor j. Each neighbor j replies as a forwarding candidate. Among all candidates, i selects the next relay. This process is repeated several times until the destination or a local maximum is reached. The main idea is to order the goodness of neighbors based on the number of times they have acted as candidates. To do that, every neighbor j temporarily saves a
counter NumRj of the same data packet received from different relays i. Each neighbor j includes its counter NumRj in its responses, where a lower value of NumRj indicates a better candidate. Among the neighbors j with the lowest value of NumRj, the current relay i selects the next hop considering their positions and their location errors, as explained below.
2) Penalize neighbors whose distance to the current relay is larger than the radio range. To select the next relay, we use a probability function that considers the location errors σij to penalize the goodness of neighbors j with excessive distance dist(J', I') to the current relay i. To do that, we define the margin of neighbor j from the current relay i to the maximum distance MaxDistij to be Mij = MaxDistij − dist(J', I'). Similar to MER [4], our probability function is represented as a cumulative distribution function of the Rayleigh distribution. We define the probability Fij that node j is located within the area centered at J' with radius uij = Mij with respect to I', denoted as:

Fij = 1 − exp(−uij² / (2σij²))    (1)

where a larger location error (σij) penalizes the distance dist(J', I') more; and the higher the distance dist(J', I') between neighbor j and the current relay i, the lower the values of Mij and Fij. The usefulness of a neighbor j with an estimated progress Pij is defined as:

MEPij = Pij · Fij    (2)

where MEPij ∈ [0..R] is called the Maximum Expectation Progress (MEP). For Pij > 0, the value of MEPij increases almost linearly for distances dist(J', I') between 0 and R, but decreases exponentially down to 0 when the distance dist(J', I') is between R and MaxDistij. Our function MEPij differs from the function MER [4], which uses uij = min(Mij, Pij). The reason is that MER penalizes neighbors for two conditions: backward progress and excessive distance. This means that MER tends to choose neighbors j in the middle between the current relay i and the radio range R. Therefore, MER has a threshold location error at σth = 31.5% of the radio range, as shown by the authors of [4]. For this reason, MER does not behave properly in networks with location errors (σij) higher than 31.5%. Unlike the authors of MER, we combine two different objective functions to prevent backward progress and excessive distance. The current relay i prevents backward progress by preferring neighbors with fewer receptions of the same data packet. Among the neighbors with the fewest data receptions, the current relay i selects the neighbor j that maximizes MEPij. The function MEP penalizes only neighbors with excessive distance, according to the location error (σij) and the radio range R. When the location error (σij) increases from 0 to R, MEP tends to choose neighbors j which are nearer to the current relay i. For this reason, our function MEP behaves properly in networks with location errors even as high as 100% of the radio range.
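The following sketch puts Eqs. (1) and (2) and the two-step selection rule into code; the helper names and the candidate tuple layout are ours:

```python
import math

def mep(progress, dist_to_sender, sigma_ij, radio_range):
    """Maximum Expectation Progress of one candidate (Eqs. (1) and (2))."""
    max_dist = radio_range + sigma_ij                 # MaxDistij = R + sigma_ij
    margin = max_dist - dist_to_sender                # Mij, used as uij
    f = 1.0 - math.exp(-(margin ** 2) / (2.0 * sigma_ij ** 2))   # Eq. (1)
    return progress * f                               # Eq. (2)

def select_next_hop(candidates):
    """candidates: list of (num_r, mep_value, node_id) tuples.
    Keep the candidates with the fewest receptions of the data packet,
    then take the one with the largest MEP among them."""
    best_num_r = min(c[0] for c in candidates)
    shortlist = [c for c in candidates if c[0] == best_num_r]
    return max(shortlist, key=lambda c: c[1])[2]
```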
Now, we show the greedy operation of EGLE using the previous example in Fig. 1(b), assuming no transmission errors. In this example, a node n0 has a data packet addressed to the destination d. The current relay n0 sends a Data message to discover the forwarding candidates n1 and n2 with positive progress. Neighbors n1 and n2 reply with the same number of previously received copies of the data packet, NumR1 = 0 and NumR2 = 0, respectively. Then, the current relay n0 uses the probability function MEP to select the next relay. So, the current relay n0 penalizes the neighbor n2 because its estimated position N2' is farther than the radio range R from the position N0. This means that the candidate n2 may become a local maximum. Then, the node n0 sends the Select message to the node n1, whose position N1 is within the radio range R of the position N0. In the next hop, the current relay n1 sends a Data message to discover the forwarding candidates n2 and n3 with positive progress. Neighbors n2 and n3 reply with different numbers of previously received copies, NumR2 = 1 and NumR3 = 0, respectively. The candidate n2 has more receptions than n3 and may generate backward progress. Then, the current relay n1 sends the Select message to the neighbor n3, whose goodness is higher than that of n2. Finally, the current relay n3 uses the next hop n4 to deliver the packet to the destination d. This example shows how the greedy operation of EGLE improves the performance of existing geographic routing in networks with location errors.
4.2 Dealing with False Void Areas
When the current relay i has no neighbors j with Pij > 0, the current relay i becomes a local maximum m located at I = M with estimated position I' = M'. To deal with this situation, EGLE builds upon our previous analysis of false void areas in Section 3. A false void area appears when the local maximum m has some neighbors j whose estimated progress is negative, Pmj < 0 (also denoted as dist(M', D') < dist(J', D')), but these neighbors j are really located at positions J closer to D than M (denoted as dist(M, D) > dist(J, D)). A false void area is produced by the location error σmj between the real and estimated positions of the local maximum m and neighbors j. As we explained before, σmj represents the maximum distance difference between their real distance dist(M, J) and their estimated one dist(M', J'). For this reason, we define an alternative forwarding region A that consists of all neighbors j with negative progress −σmj < Pmj < 0 that may really be closer to the destination d than the local maximum m. To exit the false void area, we use the beaconless forwarding scheme in the alternative forwarding region A. The main idea of the alternative mode is to forward the packet through neighbors j ∈ A until reaching a 2-hop neighbor h of the local maximum m with positive progress Pmh > 0. At that point, the packet is able to continue in greedy mode. All neighbors inside A may be closer to d than the local maximum m if we consider real positions. Therefore, in the coverage area of some neighbor j, there may be some 2-hop neighbors h providing advance to exit the local maximum m, making the data packet go through the false void area.
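A schematic sketch of this candidate filtering at a local maximum; for simplicity a single σ value is used for all neighbors, and all names are ours:

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def alternative_step(max_pos_est, dest_est, neighbors, sigma, already_used):
    """neighbors: dict node_id -> estimated position.
    max_pos_est is M' carried in the packet; already_used holds the ids that
    forwarded this packet once in alternative mode (each may act only once)."""
    def progress(pos):                                  # Pmj with respect to M'
        return dist(max_pos_est, dest_est) - dist(pos, dest_est)

    greedy = {n: p for n, p in neighbors.items() if progress(p) > 0}
    if greedy:                                          # 2-hop greedy neighbor found
        return max(greedy, key=lambda n: progress(greedy[n])), "greedy"
    region_a = {n: p for n, p in neighbors.items()
                if -sigma < progress(p) < 0 and n not in already_used}
    if not region_a:
        return None, "drop"                             # real void area
    # take the least negative Pmj, i.e. minimize the loss of progress
    return max(region_a, key=lambda n: progress(region_a[n])), "alternative"
```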
To implement the alternative mode, the current relay i includes its estimated position I' as the maximum position M' in the data packet. The packet is forwarded several times using neighbors j ∈ A. Each neighbor j ∈ A is allowed to forward the packet only once to avoid repeating the same selection. The selection of the next relay is based on minimizing the negative progress (Pmj < 0). This forwarding process repeats until a 2-hop neighbor h with positive progress (Pmh > 0) is found. Through h, the packet can advance in greedy mode. The operation of the alternative mode provides a high delivery rate with very small overhead. All neighbors inside A are candidates to forward the packet, so the maximum overhead is limited by the number of neighbors j ∈ A. If all neighbors j ∈ A forward the packet without finding a closer 2-hop neighbor h, the packet is dropped because we are in a real void area. Now, we show the alternative operation of EGLE using the previous example in Fig. 1(b). In this case, we assume that n2 has a data packet addressed to the destination d. The current relay n2 sends a Data message with its position N2' to discover the forwarding candidates. Only neighbors n1 and n3, inside the alternative region of n2, reply with their positions N1 and N3, respectively. The current relay n2 does not receive any response with positive progress because n2 sees a false void area due to its estimated position N2'. So, the packet switches to alternative mode and the current relay n2 becomes a local maximum n2 = m with maximum position N2' = M'. To minimize the negative progress, the local maximum n2 sends the Select message to the neighbor n3 as the next relay. In alternative mode, the current relay n3 sends the Data message with the local maximum position M' to discover the forwarding candidates. The neighbor n4 replies with a position N4 closer to D than M'. The current relay n3 sends the Select message to the neighbor n4. So, the packet is able to continue in greedy mode through the node n4 until reaching the destination d. This example shows how the alternative operation of EGLE is able to exit false void areas.
4.3 Delay Function for Beaconless Forwarding
We provide a delay function to reduce the overhead in terms of the number of responses in the beaconless forwarding for greedy and alternative modes. The number of responses may be high in dense networks. In the forwarding scheme, the current relay i broadcasts a Data message to its neighbors. The Data message includes the estimated positions of the current relay i and the destination d. Before answering, every neighbor j delays its response according to its goodness as next relay. The neighbor having the best goodness transmits the response first. The first response cancels responses from other neighbors overhearing that response. The current relay i waits until receiving a response and selects the neighbor that replied first. In the following, we design waiting timers to ensure that alternative neighbors wait more than all greedy neighbors. So, alternative mode is only used when greedy mode is not able to provide advance. As we mentioned above, greedy neighbors (Pij > 0) set their waiting time according to their goodness to advance toward the destination. This goodness is based on two parameters: the number
of the same data packet (i.e., forwarding selection phases) received, NumR ∈ [0..MaxR − 1], and the Maximum Expectation Progress MEP ∈ [0..R]. Thus, each greedy neighbor j with NumRj and MEPij determines its waiting time Tij according to the following equation:

Tij = (TG / MaxR) · (NumRj + (R − MEPij) / R)    (3)

TG is a constant representing the interval reserved for greedy neighbors. TG is divided by MaxR to ensure that neighbors with different NumRj will never wait the same amount of time. Neighbors with a lower NumRj obtain a smaller Tij. Among neighbors with the same NumRj, those with a higher MEPij obtain a lower Tij. Alternative neighbors (−σij < Pij < 0) set their waiting time according to their goodness, so that the negative progress is minimized. Each alternative neighbor j with Pij ∈ [−σij..0] determines its waiting time Tij according to the following equation:

Tij = TG + TA · (−Pij / σij)    (4)

where TA is a constant representing the interval reserved for alternative neighbors. Neighbors whose negative progress is smaller in absolute value (Pij closer to 0) obtain a smaller Tij. The minimum delay time of alternative neighbors is established by TG, which ensures that alternative neighbors wait more than greedy neighbors. Notice that when the current relay i receives no responses from any greedy neighbor, the packet enters alternative mode for the local maximum i = m with the maximum position I' = M'. In alternative mode, only 2-hop-greedy neighbors (Pmj > 0) and alternative neighbors (−σmj < Pmj < 0) of the local maximum m take part in the forwarding process. When the packet reaches a 2-hop-greedy neighbor (Pmj > 0), it is routed again in greedy mode.
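A compact sketch of the timer assignment of Eqs. (3) and (4); the constants follow the values given later in Section 5.1 and the function name is ours:

```python
def response_delay(progress, num_r, mep_value, sigma_ij, radio_range,
                   t_greedy=0.3, t_alt=0.3, max_r=2):
    """Waiting time before answering a Data message.

    progress  -- estimated progress Pij of the responding neighbor
    num_r     -- NumRj, copies of this data packet already overheard
    mep_value -- MEPij of the responding neighbor
    """
    if progress > 0:                                            # greedy, Eq. (3)
        return (t_greedy / max_r) * (num_r + (radio_range - mep_value) / radio_range)
    if -sigma_ij < progress < 0:                                # alternative, Eq. (4)
        return t_greedy + t_alt * (-progress / sigma_ij)
    return None                                                 # not a candidate
```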
4.4 Final Delivery to the Destination
Finally, we deal with the delivery failure when the current relay i is within the destination radio range. As we showed in Section 3, even if the estimated distance between the current relay i and d is lower than the radio range R (denoted as dist(I', D') < R), the delivery may fail. The delivery failure is produced by the location error σid between the positions I and D of the current relay i and the destination d, respectively. By our definitions in Section 4.1, σid represents the maximum distance difference between their real distance dist(I, D) and their estimated one dist(I', D'). The maximum estimated distance between the destination's position D' and every neighbor's position J' is denoted as MaxDistjd = R + σjd. For this reason, we define an estimated delivery area E centered at the destination's position D' with radius MaxDistjd, inside which d is really located. Here, we apply a counter-based broadcast scheme inside area E to deliver the data packet to the destination d. The main idea is to exploit the wireless medium to propagate the packet in the estimated destination area E. All nodes inside E are candidates to transmit the packet. So, the size of E and the density of
nodes determine the number of required transmissions. To reduce the number of transmissions, nodes inside E wait a random time before they decide whether they forward the packet or not. The decision is based on a maximum number of transmissions MaxT. To implement the broadcast scheme, the current relay i transmits the data packet in broadcast mode. When a node j receives the broadcast for the first time and dist(J', D') < MaxDistjd, it sets a random timer. While waiting, j counts the number of times the message has been received. When the timer fires, j transmits the packet if its counter is lower than MaxT; otherwise, j drops the packet. Additionally, we reduce the broadcast overhead by means of a cancelation sent from the destination d. When the destination d receives the packet, it immediately transmits the cancel packet, and each node j receiving the cancelation drops the packet. In this way, we solve the most important cause of packet losses. Moreover, our adaptation of the counter-based broadcast is efficient, avoiding unnecessary transmissions.
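A minimal sketch of the counter-based decision inside the delivery area E; the random-timer bound and the helper names are ours, while MaxT follows Section 5.1:

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def in_delivery_area(my_est_pos, dest_est_pos, radio_range, sigma_jd):
    # Estimated delivery area E: within MaxDistjd = R + sigma_jd of D'
    return dist(my_est_pos, dest_est_pos) < radio_range + sigma_jd

def forward_when_timer_fires(copies_heard, max_t=2):
    # Rebroadcast only if fewer than MaxT copies were overheard while waiting
    return copies_heard < max_t

# Example: a node inside E that overheard the packet once still forwards it
node, dest = (40.0, 10.0), (60.0, 0.0)
if in_delivery_area(node, dest, radio_range=150.0, sigma_jd=30.0):
    wait = random.uniform(0.0, 0.1)        # arm a random timer on first reception
    print(wait, forward_when_timer_fires(copies_heard=1))
```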
5 Experimental Results
In this section, we compare GRS [1], BOSS [7] and MER [4] against our proposal EGLE. To measure the effect of the different enhancements proposed in EGLE on the overall performance, we use three versions of the protocol with an increasing set of modes (Greedy, Alternative, Broadcast): EGLE-G, EGLE-GA and EGLE-GAB. For our simulations, we use the TOSSIM simulator, which scales to thousands of nodes and facilitates the development of network applications. We implemented these routing protocols in TinyOS code and assess their performance through extensive simulations.
5.1 Simulation Setup
The simulation scenario is a 2000×2000 m² area with 900 nodes and a mean density of 15 neighbors. We distribute nodes in the network by means of a hexagonal tessellation of subareas to avoid void areas where there is no advance. Nodes have a fixed radio range R = 150 with perfect communication links. In this way, we focus our study on scenarios where packet losses are only produced by location errors. Location errors are modeled as a Gaussian distribution with zero mean and a deviation from 5 to 100% of the radio range. We have considered 8 different deviation errors to represent a wide spectrum of real scenarios. In each scenario, 100 random sources transmit a data packet to a destination which is always located in the center of the network. The results are the average over a total number of 50 simulation runs per scenario, which is enough to achieve a sufficiently small 95% confidence interval. Regarding the configuration of the algorithms being tested, GRS and MER, like most beaconing protocols, use a beacon period of 4 seconds. On the other hand, BOSS and EGLE use a 3-way handshake and are configured with a greedy delay
time TG = 300 ms. Moreover, EGLE has two additional modes and needs the following parameters: an alternative delay time TA = 300 ms, the maximum number of receptions MaxR = 2 and the maximum number of transmitted broadcasts MaxT = 2.

Fig. 2. Results graphs of delivery ratio and lost packets: (a) Packet Delivery Ratio and (b) Percentage of Lost Packets (grouped by cause: local maximum vs. delivery failure), both versus the location error (% of radio range).
5.2 Analysis of Results
The main goal of EGLE is to mitigate the effects of location errors, achieving a high packet delivery ratio. Fig. 2(a) compares the packet delivery ratio of the previous protocols. We can see that the three versions of EGLE clearly outperform GRS, BOSS and MER regardless of the deviation error. As our studies showed, when the location error is high the probability of losing packets due to false void areas is high. In high error scenarios, GRS and BOSS provide a very low delivery ratio because they have been designed for scenarios with perfect positions. MER has a slightly better delivery ratio because it considers location errors in its routing decisions. However, the figure shows that EGLE's greedy, alternative and broadcast modes are able to progressively mitigate the effects of location errors. So, EGLE achieves over 90% delivery ratio even with location errors of 100% of the radius. To analyze the delivery ratio of these protocols in more detail, we study the percentage of lost packets. Fig. 2(b) shows the percentage of lost packets for each algorithm grouped by two causes: local maxima and delivery failures. Clearly, the three versions of EGLE exhibit a lower number of lost packets than GRS, BOSS and MER in all tested scenarios. The reason is that GRS, BOSS and MER assume perfect knowledge of the destination position and delivery inside its radio range. Moreover, GRS and BOSS are bound to reach local maxima because they select the farthest neighbors to maximize the progress. MER avoids some local maxima by penalizing the farthest neighbors. However, MER does not consider the cases where nodes are candidate relays several times. Unlike MER, EGLE's greedy mode provides better results by penalizing previous candidate nodes. Moreover, EGLE's alternative mode is able to exit local maxima using discarded neighbors. Finally, EGLE's broadcast mode avoids almost all delivery failures by distributing the packet in a limited area where the destination is located. The results show that EGLE is a very effective protocol even with high location errors.
Fig. 3. Number of Transmissions and Forwardings per Destination Reached: (a) Total Tx Packets per Delivery and (b) Total Forwardings per Delivery, both versus the location error (% of radio range).
Regarding the overall number of transmitted packets per delivery, Fig. 3(a) shows that, in addition to achieving a higher effectiveness, EGLE's modes also have a lower transmission overhead than GRS and MER in all scenarios. The main reason is that the beacon-less nature of EGLE makes it scale with the number of nodes in the network. EGLE avoids periodic beacon transmissions, especially for the nodes not taking part in the routing process. Also, EGLE's delay function avoids unnecessary transmissions from neighbors that are relay candidates. All versions of EGLE improve on the performance of BOSS due to a good balance between a high delivery rate and only a little more overhead. As expected, all protocols show their lowest performance in the very high location error scenarios. The reason is that geographic protocols rely on the positions of neighbors to select the next relay maximizing progress, and a higher location error means a higher probability of suboptimal progress. This means that the current relay does not really select the closest neighbor to the destination, and the selected neighbor may even be really farther from the destination than the current relay, generating backward progress. Fig. 3(b) shows the total number of forwardings per delivery for each protocol. MER obtains even worse results than GRS and BOSS, because MER does not work properly for high location errors due to its threshold at 31.5%. Unlike MER, EGLE-Greedy requires a much lower number of forwardings per delivery due to the combination of greedy selection penalizing too distant neighbors and the penalization of previous relay candidates. Moreover, we can see that the design of EGLE adapts well to large location error scenarios, obtaining results similar to those in scenarios without location errors.
6 Conclusion
We propose and evaluate EGLE, an effective beaconless geographic routing protocol for wireless sensor networks. EGLE's main goal is to mitigate the effects of location errors, achieving a high delivery ratio with little control overhead. This is achieved with three routing modes and a forwarding delay function. Firstly, its greedy routing penalizes too distant neighbors and previous relay candidates to prevent local maxima. Secondly, its alternative routing mode uses discarded
neighbors to exit a local maximum, advancing through the false void area. Thirdly, its broadcast routing propagates the data packet in a limited area near the estimated position of the destination to ensure delivery at its real one. Finally, EGLE combines the beacon-less nature of the protocol and a neighborhood discovery function to reduce the overhead. Our simulation results comparing EGLE against GRS, MER and BOSS confirm that the goal of a good balance between effectiveness and efficiency has been achieved. EGLE outperforms them not only in terms of delivery ratio, but also in terms of the number of transmissions and forwardings needed to reach the destination.
Acknowledgments This work has been performed within the framework of the MOTEGRID project (PII1C09-0101-9476-01).
References 1. Finn, G.G.: Routing and addressing problems in large metropolitan-scale internetworks. Technical Report ISI/RR-87-180, Information Sciences Institute (1987) 2. Giordano, S., Stojmenovic, I., Blazevie, L.: Position Based Routing Algorithms for Ad Hoc Networks: A Taxonomy. Ad Hoc Wireless Networking (2004) 3. Kim, Y., Lee, J.-J., Helmy, A.: Modeling and analyzing the impact of location inconsistencies on geographic routing in wireless networks. SIGMOBILE Mob. Comput. Commun. Rev. 8(1), 48–60 (2004) 4. Kwon, S., Shroff, N.B.: Geographic routing in the presence of location errors. Comput. Networks 50(15), 2902–2917 (2006) 5. Mohammed, A., Ould-Khaoua, M., Mackenzie, L.M., Abdulai, J.: An adjusted counter-based broadcast scheme for mobile ad hoc networks. In: UKSIM 2008: Proceedings of the Tenth International Conference on Computer Modeling and Simulation, pp. 441–446. IEEE Computer Society, Washington, DC, USA (2008) 6. Oliveira, H.A.B.F., Nakamura, E.F., Loureiro, A.A.F., Boukerche, A.: Error analysis of localization systems for sensor networks. In: GIS 2005: Proceedings of the 13th Annual ACM International Workshop on Geographic Information Systems, pp. 71– 78. ACM, New York (2005) 7. Sanchez, J.A., Marin-Perez, R., Ruiz, P.M.: BOSS: Beacon-less On Demand Strategy for Geographic Routing in Wireless Sensor Networks. In: Proc. of the 4th IEEE MASS 2007, pp. 1–10 (October 2007) 8. Witt, M., Turau, V.: The Impact of Location Errors on Geographic Routing in Sensor Networks. In: Proceedings of the Second International Conference on Wireless and Mobile Communications (ICWMC 2006), Bucharest, Romania (2006)
Energy Efficient Mobile Routing in Actuator and Sensor Networks with Connectivity Preservation

Essia Hamouda 1,2, Nathalie Mitton 2, and David Simplot-Ryl 2

1 University of California Riverside
2 INRIA Lille-Nord Europe, Univ. Lille 1, CNRS
[email protected], {nathalie.mitton,david.simplot-ryl}@inria.fr
Abstract. In mobile wireless sensor networks, flows sent from data collecting sensors to a sink could traverse inefficient, resource expensive paths. Such paths may have several negative effects, such as device battery depletion, which may cause the network to be disconnected and packets to experience arbitrary delays. This is particularly problematic in event-based sensor networks (deployed in disaster recovery missions) where flows are of great importance. In this paper, we use node mobility to improve the energy consumption of computed paths. Mobility is a double-edged sword, however: moving a node may render the network disconnected and useless. We propose CoMNet (Connectivity preservation Mobile routing protocol for actuator and sensor NETworks), a localized mechanism that modifies the network topology to support resource efficient transmissions. To the best of our knowledge, CoMNet is the first georouting algorithm which considers controlled mobility to improve routing energy consumption while ensuring network connectivity. CoMNet is based on (i) a cost to progress metric which optimizes both sending and moving costs, and (ii) the use of a connected dominating set to maintain network connectivity. CoMNet is general enough to be applied to various networks (actuator, sensor). Our simulations show that CoMNet guarantees network connectivity and is effective in achieving high delivery rates and substantial energy savings compared to traditional approaches. Keywords: wireless communication, performance optimization, node mobility, connected dominating set.
1 Introduction
Wireless sensor networks are intended to be deployed in hostile environments (battlefield, forest, etc.). Therefore, it is expected that a large number of cheap simple sensor devices will be randomly scattered over a region of interest. These devices are powered by batteries and have limited processing and memory capabilities. Among numerous challenges faced while designing WSN protocols,
This work was partially supported by CPER Nord-Pas-de-Calais/FEDER Campus Intelligence Ambiante and the ANR BinThatThinks project.
maintaining connectivity and maximizing the network lifetime stand out as critical considerations. The connectivity condition is generally met by deploying dense homogeneous networks to increase resources per unit area. However, dense networks can have several problems, such as device management and increased transmission interference and contention. Another approach is to use specialized nodes with long-range communication capabilities to maintain a connected network. The second consideration, network lifetime, is directly related to how long the power resources in sensor nodes will last. The network lifetime can be increased by designing and using energy-efficient protocols and algorithms. An example would be a scheduling scheme that makes sensors work in batches to extend the network life [9]. Another solution is to add actuators, i.e., mobile sensor nodes, that can be moved to areas where resources are most needed to efficiently route packets. Actually, it has been shown [17] that deploying resource-rich mobile devices in a network can provide the same performance as increasing the network density. The main motivation of this work is to take advantage of node mobility to extend the life of the network resources and consequently the network itself. Currently, even though the idea of using mobile actuator nodes to improve the performance of the network is well recognized, there is not much work that takes advantage of node mobility to improve routing in wireless networks while ensuring network connectivity. Available solutions adopt existing routing protocols to find an initial route, and iteratively move each node to an arbitrary location on the straight line connecting the source-destination pair. However, the adopted node relocation strategies may cause useless zig-zag movements of nodes [10] and may disconnect the network [12]. Moreover, we argue that the energy optimization model associated with [10,12] is incomplete as it does not incorporate the mobility cost in the routing decision.
traffic between the source and the destination is light, so it is not optimal to move and align nodes. All CoMNet variants have the following properties:
- Localized: A routing decision depends only on local information. A node has to know only its own geographical location, those of its neighbors and that of the final destination.
- Scalable: CoMNet is memoryless. No routing information needs to be stored at a node or in a message.
- Loop free: A message is always sent to a node in the forwarding direction of the destination.
- Energy efficient: In its routing decision, CoMNet considers both the cost of sending a message and the cost of moving a node to its new location. At each step, it chooses the least energy-consuming solution.
- Guaranteed connectivity: Though nodes may be mobile and may be relocated, CoMNet guarantees that network connectivity is maintained at all times. This is achieved, first, by relying on a connected dominating set (CDS) and, second, by enforcing that nodes in the CDS remain static and that mobile nodes lie within communication range of at least one dominating node.
This paper is organized as follows. Section 2 reviews literature related to CoMNet. Section 3 presents our model and assumptions. CoMNet principles along with its three variants are described in Section 4 and evaluated in Section 5. We conclude and present future work in Section 6.
2 Related Work
Before proceeding with the analysis, we present a brief overview of work in the literature related to position-based routing algorithms in static and mobile networks and to dominating set algorithms. This overview is by no means exhaustive and is only indicative of the interest and the applications. For a complete survey, the reader should refer to [8].
Position-based routing in static networks. Position-based routing algorithms for static sensor networks have been widely studied in the literature. The basic principle is as follows. Every node is aware of its own position, those of its neighbors and that of the destination. A routing decision is made based only on this local information. In greedy routing [6], for instance, the node currently holding a packet forwards it to the neighbor closest to the destination. This method was then extended to energy-efficient variants [11,5], to guaranteed-delivery solutions [1] and to combinations of both energy-efficient and guaranteed-delivery approaches [14,4]. In ORouting [5], the neighbor closest to the line (SD) connecting the source node S to the destination node D is considered the best candidate relay to minimize energy. Cost Over Progress (COP) based routing [11] is a localized metric-aware greedy routing scheme. A node u forwards a packet to a neighbor v in the forwarding direction of the destination D such that the ratio of the energy consumed for sending a message from u to v (any cost metric can be used) to the progress made (measured as the reduction in distance to D) is minimized.
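As an illustration of the cost-over-progress rule described above, the following sketch selects the next hop among the neighbors with positive progress toward the destination; the transmission cost model r^α + c and the parameter values are illustrative assumptions, not part of the surveyed protocols.

```python
import math

def cop_next_hop(u, neighbors, dest, alpha=4.0, c=38.0):
    """Greedy cost-over-progress selection (illustrative parameter values)."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    best, best_ratio = None, float("inf")
    for v in neighbors:
        progress = dist(u, dest) - dist(v, dest)
        if progress <= 0:                   # keep only forwarding-direction neighbors
            continue
        cost = dist(u, v) ** alpha + c      # energy to send from u to v
        if cost / progress < best_ratio:
            best, best_ratio = v, cost / progress
    return best

# Example: pick the next hop from S = (0, 0) toward D = (200, 0).
print(cop_next_hop((0, 0), [(30, 10), (60, -5), (20, 40)], (200, 0)))
```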
Routing in mobile networks. Little work has been done on routing in mobile sensor networks; we return to this issue in our analysis section. Current solutions adopt existing routing protocols to find an initial route, and iteratively move each node to the midpoint of its upstream and downstream nodes on the route. However, these routing protocols may not be efficient. The moving strategy in [10] may cause useless zig-zag movements. In MobileCOP [12], the next hop on the path is selected based on the COP [11] metric. Once a path is computed, its nodes are moved and placed equidistantly on the straight line connecting the source to the destination, maintaining the same number of hops as in the computed path. Such a move may induce a memory overhead on nodes, since they have to store the path, and a high transmission delay. More importantly, the network may become disconnected (a node may move out of range of its neighbors). This arbitrary move may not be optimal, since when a node moves farther from its upstream neighbor the transmission delay automatically increases. In addition, none of these approaches considers the cost of moving in the routing decision. A closer look at the basic ORouting (described previously) suggests that the protocol minimizes the moving cost of a candidate node to line (SD) and locally decreases the transmission cost.
Dominating sets. Dominating sets (DS) are defined as follows. Each node in a graph either belongs to the dominating set or has a neighbor in the DS. The DS is called a Connected Dominating Set (CDS) if the DS nodes are connected. The problem of computing the smallest CDS is known to be NP-complete even if knowledge of the global topology is available. Dai and Wu [3] introduced a generalized DS concept, where coverage can be provided by an arbitrary number of connected one-hop neighbors. The definition was modified in [15], to avoid message exchange between neighbors, as follows: node a is covered by its one-hop neighbors b, c, . . . if these neighbors are connected. It was then further simplified in [2] as follows. First, each node checks whether it is an intermediate node. Then each intermediate node a constructs a subgraph G of its neighbors with higher key values. If G is empty or disconnected, then a belongs to the DS. If G is connected but there exists a neighbor of a which is not a neighbor of any node in G, then a is in the CDS. If position information of 1-hop neighbors is available, nodes can decide whether or not they belong to the so-defined CDS without exchanging any message with their neighbors. Note that these algorithms are local and do not incur any additional message exchange overhead.
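A minimal sketch of the simplified CDS rule of [2] is given below, assuming a unit-disk graph in which every node knows the positions and keys of its 1-hop neighbors; the intermediate-node pre-test is omitted and the helper names are hypothetical.

```python
def in_cds(a, pos, key, radius):
    """Localized CDS membership test for node a (unit-disk model assumed)."""
    def linked(u, v):
        (x1, y1), (x2, y2) = pos[u], pos[v]
        return (x1 - x2) ** 2 + (y1 - y2) ** 2 < radius ** 2

    nbrs = [v for v in pos if v != a and linked(a, v)]
    higher = [v for v in nbrs if key[v] > key[a]]   # subgraph G of higher-key neighbors
    if not higher:
        return True                                  # G empty: a stays in the DS
    reached, frontier = {higher[0]}, [higher[0]]     # check whether G is connected
    while frontier:
        u = frontier.pop()
        for v in higher:
            if v not in reached and linked(u, v):
                reached.add(v)
                frontier.append(v)
    if len(reached) < len(higher):
        return True                                  # G disconnected: a stays in the DS
    # G connected: a stays only if some neighbor of a is covered by no node of G.
    return any(all(not linked(v, g) for g in higher) for v in nbrs)

pos = {1: (0, 0), 2: (50, 0), 3: (100, 0)}
key = {1: 1, 2: 2, 3: 3}
print({v: in_cds(v, pos, key, radius=80) for v in pos})
```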
3 Models
Before introducing CoMNet, we present in this section the assumptions of the protocols and define the cost models involved in its design and operation.
General model assumptions. We consider a sensor network where nodes are randomly scattered, are aware of their geographical location and are able to tune their transmission range between 0 and R (> 0). We also assume that sensor/actuator devices can be either mobile or stationary. The latter assumption can be further relaxed by making all nodes static or all nodes mobile.
We denote by N(u) the set of physical neighbors of node u, i.e. the set of nodes in communication range of node u (N(u) = {v | |uv| < R}, where |uv| is the Euclidean distance between u and v). Let δ(u) = |N(u)| be the cardinality of N(u), also called the degree of node u. We also define N_D(u) as the set of neighbors of node u with positive progress toward destination node D: N_D(u) = {v ∈ N(u) | |vD| < |uD|}.
Transmission cost. We denote by C_s(.) the cost, measured in units of energy, consumed to transmit a packet. We use the most common energy model [13]:

    C_s(r) = r^α + c  if r ≠ 0,   and   C_s(r) = 0  otherwise,        (1)

where r is the distance between a sender and a receiver, c is the overhead (in units of energy) due to signal processing, and α is a real constant (> 1) that represents the signal attenuation. The optimal transmission radius that minimizes the total power consumption for a routing task is r* = (c/(α−1))^(1/α), assuming that nodes can be placed on a straight line toward the destination [16].
Mobility cost. We denote by C_m(.) the cost, measured in units of energy, consumed to relocate a node. To the best of our knowledge, there is no accurate model to define such a cost. Therefore, in this work, we use the model adopted in the literature [12,10]:

    C_m(|vv′|) = a|vv′|        (2)

where v denotes a node at its original position (x_v, y_v) (before it is moved). To eliminate confusion, we refer to node v after it has moved as v′, and its new position is (x_v′, y_v′). |vv′| is the Euclidean distance between v and v′, and a is a constant to be defined. In this work, we adopt the above cost models because they are widely used and as a proof of concept. However, other cost metrics can be considered.
4 CoMNet
4.1 Motivation
The pitfall of node mobility is the risk that the network becomes disconnected once nodes have moved, in which case routing fails. To support our claim, we study the behavior and performance of MobileCOP [12] in terms of network connectivity. We consider packet transmission between a single source-destination pair. As this method dictates, after every transmission, nodes along the computed paths move. Figure 4(a) displays the percentage of times the network remains connected as a function of the number of routes computed (or number of source-destination pair transmissions). Results show that for low node densities (δ ≤ 5), after the first routing task, the network remains connected in only 75% of the cases. After computing 10 consecutive routes, the network remains connected
in only 25% of the cases. In fact, the node density has to reach δ = 20 to keep the network connected after 10 computed routes. As the number of computed paths increases, an increasingly large δ is required to keep the network connected.
4.2 CoMNet Principles
To prevent the network from disconnecting, CoMNet uses the notion of a CDS. Furthermore, and contrary to other routing protocols, to minimize the overall energy consumption CoMNet optimizes not only the transmission energy but also the mobility cost. In the following we describe both steps in more detail.
CoMNet and network connectivity. We claim that as long as every node is connected and, more specifically, is a neighbor of a node in a CDS, the network will never be disconnected. Thus, the first step in CoMNet is to locally compute a CDS, using any one of the methods described in Section 2. At this step, every node is covered by the CDS. In order to keep this property, a node is moved if and only if its final position is covered by the CDS. In CoMNet, nodes that belong to the dominating set never move during a next hop selection.
CoMNet and routing cost. To select a forwarding node, CoMNet uses a cost-to-progress criterion. The current node u chooses node v ∈ N_D(u) among its neighbors in the forwarding direction of the destination D. More specifically, the selected node v (v ∈ N_D(u)) minimizes the ratio of the global cost (packet transmission cost and node relocation cost) to the progress made towards the destination D. Indeed, v satisfies the following optimization problem:

    v = argmin_{v ∈ N_D(u)}  [C_s(|uv|) + C_m(|vv′|)] / (|uD| − |v′D|)

with

    C_s(|uv|) = C_s(|uv|)           if u sends before v moves,
    C_s(|uv|) = C_s(|uv′|) + ε      if u sends after v moves to v′,

where v′ represents the position to which v should move (v′ = v if v is a dominating node). C_m and C_s are respectively the moving and sending cost functions defined by Eqs. 2 and 1. In the send-after-move case, before moving v, a beacon is sent to v to request and advertise its move to v′ (the cost of sending the beacon is ε). Once v moves, u sends its message to v′. The transmission cost used depends on the CoMNet variant, as described in the following paragraphs. The selection of the next hop is summarized in Algorithm 1.
There are major differences between CoMNet and existing solutions. First, CoMNet incorporates the cost of moving in its routing decision and it is memoryless (it does not require nodes to store a computed path to be used at a later time). Second, to preserve network connectivity in spite of node movement, in CoMNet a node moves if and only if it does not belong to the DS and its targeted position is within the communication range of at least one dominating node.
Algorithm 1. SelectNextHop(u, D) - Run at node u toward destination node D
1: if N_D(u) = ∅ then
2:   Return NULL {Routing has failed.}
3: end if
4: A ← ∅
5: for all v ∈ N_D(u) do
6:   v′ ← NewLocation(u, v, D) {Compute position v′. v′ = v if v ∈ CDS}
7:   if ∃ a ∈ CDS | |av′| < R then
8:     A ← A ∪ {v}
9:   end if
10: end for
11: w ← argmin_{v ∈ A} [C_m(|vv′|) + C_s(|uv|)] / (|uD| − |v′D|)
12: Return w {Routing has succeeded.}
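A runnable Python rendering of Algorithm 1 is sketched below; the positions, the cost functions C_s and C_m, and the NewLocation helper are passed in as parameters and are assumptions of this sketch, not part of the original pseudocode.

```python
import math

def select_next_hop(u, D, neighbors, cds, new_location, R, Cs, Cm):
    """Sketch of Algorithm 1: pick (v, v') minimizing (Cm + Cs) over progress."""
    d = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    candidates = []
    for v in neighbors:
        if d(v, D) >= d(u, D):
            continue                                   # keep only N_D(u)
        vp = new_location(u, v, D)                     # v' = v when v is a dominating node
        progress = d(u, D) - d(vp, D)
        if progress <= 0 or not any(d(a, vp) < R for a in cds):
            continue                                   # v' must keep progress and CDS coverage
        ratio = (Cm(d(v, vp)) + Cs(d(u, v))) / progress
        candidates.append((ratio, v, vp))
    if not candidates:
        return None                                    # routing has failed
    _, v, vp = min(candidates)
    return v, vp                                       # selected neighbor and its target position
```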
Third, CoMNet is general: it can assume that all nodes are mobile or that all nodes are static. It can easily be adapted to heterogeneous networks composed of both actuators and sensors, by simply setting v′ = v for sensor nodes, which cannot move.
We now describe the CoMNet variants, ORouting on the Move, Move(DSr) and Move(r). The selection principle is the same for all variants. They differ mainly in the cost criterion and in the node relocation scheme. We will show that each variant has its own advantages depending on the application.
4.3 CoMNet: ORouting on the Move
ORouting on the move is based on the plain ORouting [5] algorithm. It assumes that the traffic sent from the source to the destination is high and that a computed path will be heavily utilized for a long time. Therefore, the optimal path computed is on a straight line connecting the source to the destination. ORouting on the move further aligns nodes on this straight line in order to reduce the overall path length and consequently the overall routing energy consumed. The source node S selects the next hop neighbor A based on the cost-to-progress criterion and then takes advantage of mobility and moves the selected node. In Figure 1, which illustrates such a routing, node S has to select the next hop among nodes A1, A2 and A3 (nodes in the forward direction to D). The cost-over-progress metric is computed for each node ((2+1)/3 = 1 for A1, (1+1)/1 = 2 for A2 and (5+2)/6 ≈ 1.17 for A3). Having the smallest cost, node A1 is selected and moved to location A′1, the intersection point of the line connecting the source to the destination and the perpendicular to it passing through A1. Note that the objective of ORouting on the move is to minimize the move distance; it is thus well adapted to situations where moving is very costly. In this case, S first needs to send a beacon to node A1 to request and advertise its move onto the line. Once A1 moves to position A′1, S forwards the message to A′1. The sending cost is C_s(|SA′1|) + ε and the move cost is C_m(|A1A′1|).
Fig. 1. ORouting on the move. Red arrows show possible displacements of nodes with associated moving costs. Solid links from S to Ai are associated with sending costs.
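The relocation used by ORouting on the move is simply the orthogonal projection of the selected neighbor onto the line (SD); a small geometric sketch (positions as (x, y) tuples) is given below.

```python
def project_on_sd(S, D, A):
    """Target position A' for ORouting on the move: project A onto the line (SD)."""
    sx, sy = S
    dx, dy = D
    ax, ay = A
    vx, vy = dx - sx, dy - sy
    t = ((ax - sx) * vx + (ay - sy) * vy) / (vx * vx + vy * vy)
    return (sx + t * vx, sy + t * vy)

# Example: a node above the S-D axis is moved straight down onto it.
print(project_on_sd((0, 0), (10, 0), (3, 4)))   # -> (3.0, 0.0)
```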
4.4 CoMNet: Move(DSr)
As in ORouting on the move, Move(DSr) aims at aligning nodes along a straight line in order to reduce the energy consumed by consecutive transmissions from S to D. The objective in this case is to move the selected neighbor to a new position at distance r′ from the source (or relay) node so as to optimize the routing energy consumption. The idea is to compute a routing path from S to D where all nodes are aligned on line (SD) and where all hop lengths are equal to the optimal transmission distance r*. Since this objective is not always achievable (|SD|/r* is not always an integer), we opt for a hop length r′ close to the optimal value such that all the nodes in the path are equidistant. The challenge here is to determine the proper value of r′. As mentioned in Section 2, since the optimal range r* has a closed-form expression, the optimal number of hops n* is expressed as n* = ⌈|SD|/r*⌉. We can thus infer that r′ is equal to

    r′ = |SD|/n*        if r* − |SD|/n* < |SD|/(n*−1) − r*,
    r′ = |SD|/(n*−1)    otherwise.        (3)
Note that in the rare case where |SD| = k·r*, with k a positive integer, r′ = r*. An illustration of this method is given in Figure 2, where nodes Ai, i ∈ {1, 2, 3}, are neighbors of S in the forward direction. Only one of the nodes Ai will be selected by S based on the energy cost. In the case presented in Figure 2, A2 will be selected (the cost is 2 + 2 = 4 for A1, 2.5 + 1 = 3.5 for A2 and 5 + 2 = 7 for node A3) and will move to location A′ such that |SA′| = r′. Clearly, the progress is a constant for a given neighborhood and does not affect the objective function. It is worth noting that if a node B already lies at position A′, there is no need to move the selected node; B will be selected as the neighbor instead. Once the neighbor is selected, the packet is transmitted before the node is moved to its new location. The sending cost here is C_s(|SA2|).
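The hop length r′ of Eq. (3) can be computed as below; the single-hop guard for n* ≤ 1 is an assumption added for completeness.

```python
import math

def hop_length(sd, r_star):
    """Equidistant hop length r' of Eq. (3), with n* = ceil(|SD| / r*)."""
    n_star = math.ceil(sd / r_star)
    if n_star <= 1:
        return sd                       # destination reachable within one optimal hop
    low, high = sd / n_star, sd / (n_star - 1)
    return low if (r_star - low) < (high - r_star) else high

print(hop_length(sd=350.0, r_star=100.0))   # 4 hops of 87.5 rather than 3 hops of ~116.7
```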
4.5 CoMNet: Move(r)
As in Move(DSr), the objective is to select a neighbor and move it to a new position at distance r* from the source (relay) in order to decrease the sending cost.
Fig. 2. Move(DSr). Red arrows show possible displacements of nodes with associated moving costs. Dashed links are associated with the sending cost.
However, contrary to the previous methods, the new location has to be on the circle C_r* of radius r* centered at S. The main objective is to reduce the moving distance while minimizing the sending cost. This variant of CoMNet is intended for light-traffic networks where routes are seldom used for long transmissions and thus do not need to be fully optimized by aligning nodes on line (SD). If node A is such that |SA| < r*, A should be relocated to the intersection of C_r* and line (AD). If node A is such that |SA| > r*, A should be relocated to the intersection of C_r* and line (SD). An illustration of this method is given in Figure 3, where nodes Ai, i = 1, 2, 3, are neighbors of S in the forward direction. Only the selected neighbor Ai will be moved to location A′i. Node S selects the node which provides the smallest positive COP ratio, i.e. node A2 (the COP is (2+0.5)/3 = 0.83 for A1, (1+1)/3 = 0.66 for A2 and (5+0.5)/2.5 = 2.2 for A3). As in Move(DSr), the packet is transmitted before the node is moved to its new location. The sending cost is equal to C_s(|SA2|).
Fig. 3. Move(r). Red arrows show possible displacements of nodes with associated moving costs. Dashed links are associated with the sending cost.
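The relocation rule of Move(r) is again purely geometric; the sketch below computes the target position on the circle of radius r* centered at S for the two cases described above (positions as (x, y) tuples; the tolerance constant is an assumption of the sketch).

```python
import math

def relocate_on_circle(S, D, A, r_star):
    """Target position A' for Move(r): on circle C_r* centered at S,
    on line (SD) if |SA| > r*, on line (AD) if |SA| < r*."""
    d = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    if abs(d(S, A) - r_star) < 1e-9:
        return A                                    # already on the circle
    if d(S, A) > r_star:
        t = r_star / d(S, D)                        # point at distance r* from S toward D
        return (S[0] + t * (D[0] - S[0]), S[1] + t * (D[1] - S[1]))
    # |SA| < r*: intersect the circle with the ray from A toward D.
    ux, uy = D[0] - A[0], D[1] - A[1]
    fx, fy = A[0] - S[0], A[1] - S[1]
    a = ux * ux + uy * uy
    b = 2 * (fx * ux + fy * uy)
    c = fx * fx + fy * fy - r_star ** 2
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)   # forward intersection (t >= 0)
    return (A[0] + t * ux, A[1] + t * uy)

print(relocate_on_circle(S=(0, 0), D=(300, 0), A=(40, 30), r_star=100.0))
```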
5 Analysis and Experimental Results
We compare the performance of all CoMNet variants, ORouting on the move, Move(DSr) and Move(r), to MobileCOP. We use the WSNet/Worldsens [7] event-driven simulator for large-scale wireless sensor networks, which assumes an 802.11
DCF MAC layer, a free-space propagation model and packet collisions. For simulation purposes we assume that nodes are uniformly distributed over a 1000×1000 square and can adapt their range between 0 and R = 150. The variants are compared in the same system environment: same samples of node distribution and same source-destination pairs for various network node densities. We consider only connected networks. We compare the variants based on resource consumption and routing success.
5.1 Routing Success Rate
Figure 4(b) shows the percentage of routing success with respect to the network node density. Results show that CoMNet outperforms MobileCOP for low densities. For higher densities CoMNet and MobileCOP have similar performance. As expected, ORouting on the move, whose routing behavior is the closest to MobileCOP, performs very close to MobileCOP. The best performing technique is the Move(r) variant.
Fig. 4. Disconnection and success rate. (a) MobileCOP: proportion of times the network gets disconnected with respect to the number of routes computed for different source-destination pairs. (b) Routing success rate: comparative analysis of CoMNet variants and MobileCOP routing.
5.2 Energy Consumption
We evaluate the energy consumption of each algorithm based on the energy models described in Eqs. 1 and 2. To compute C_s(.), the energy consumed to send a packet, we use parameter values from the literature [11], i.e. c = 38 and α = 4, which lead to an optimal transmission range of r* = 100. As stated previously, nodes are equipped with GPS devices for localization purposes, and we assume that the energy consumed by each node to identify its location is the same for all nodes. We also assume that the cost of exchanging Hello packets is the same for every node. As such, these consumptions are considered constant and are not included in our energy optimization model. Our simulator computes 10 consecutive routes between a given source-destination pair and averages the computed statistics over more than 1000 tries. Regarding the mobility model's (Eq. 2) parameter a, since not much
research has been done in this area, we run simulations for three different values of the constant a, computed as follows:
1. if sending is as costly as moving, C_s(.) = C_m(.), a is the solution of C_s(r*) = C_m(r*);
2. if sending is much more costly than moving, C_s(.) >> C_m(.), a is the solution of C_s(r*) = 10^2 C_m(r*);
3. if moving is much more costly than sending, C_s(.) << C_m(.), a is the solution of C_s(r*) = 10^-2 C_m(r*).
Energy consumed during consecutive routings. In CoMNet, as in MobileCOP, paths are computed for every packet transmission. As such, the first time a path is computed, nodes are more likely to move than the second time a path is computed between the same source-destination pair. For this reason, we choose to evaluate the cumulative energy consumed during a fixed number of packet transmissions (or consecutive path computations) to allow a fair comparison of the schemes; certain protocols do not need to move nodes again in the second route computation. Figures 5(a), 5(c) and 5(e) display the cumulative energy spent (to move and to send) by each CoMNet variant and MobileCOP, for each mobility model, for a node density of 300 nodes and as a function of the number of successive routings. Our extensive simulations show that the algorithms have the same behavior for low network node densities (due to page restrictions the supporting data cannot be presented). In Move(DSr), nodes are moved to the desired location every time a path is computed. In MobileCOP, however, nodes are moved only after the first path is computed; thereafter, nodes memorize the path, which will be used in future transmissions. Therefore, the energy added at subsequent route computations results only from the sending cost. Results show that ORouting on the move and Move(r) converge quickly, since they use the same path after computing the first 3 consecutive routes from S to D. Since MobileCOP does not take the moving energy consumption into account in its routing decision, the distance moved by each node is the same for each model, but has different costs. Our analysis shows that MobileCOP has a very high starting consumption cost due to node displacement. Since CoMNet incorporates the moving cost in its objective function, the initial cost remains low for the CoMNet variants. Note that when moving is much more costly than sending (Figure 5(e)), ORouting on the move consumes the least energy, followed by Move(r). This is due to the inherent goals of each variant: ORouting on the move has a bias towards the move cost, so it tries to minimize the move distance, while Move(DSr)'s priority is to minimize the sending cost. Move(r), which aims at minimizing both energy costs, has an average performance compared to the other variants. When the sending cost is higher than (Figure 5(c)) or close to (Figure 5(a)) the moving cost, the performance of all schemes is comparable. In this case, MobileCOP and Move(r) have the best performance.
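For the three regimes above, the constant a can be obtained by solving C_s(r*) = k·C_m(r*) with C_m(d) = a·d; the short sketch below does this, with illustrative α and c values.

```python
def mobility_constant(k, r_star, alpha=4.0, c=38.0):
    """Solve Cs(r*) = k * Cm(r*) for a, where Cs(r) = r**alpha + c and Cm(d) = a*d."""
    return (r_star ** alpha + c) / (k * r_star)

# k = 1: equal costs; k = 1e2: sending much costlier; k = 1e-2: moving much costlier.
for k in (1.0, 1e2, 1e-2):
    print(k, mobility_constant(k, r_star=100.0))
```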
Fig. 5. Cumulative energy consumption for each algorithm. (a) Cs = Cm, 300 nodes; (b) Cs = Cm, 10 routings; (c) Cs >> Cm, 300 nodes; (d) Cs >> Cm, 10 routings; (e) Cs << Cm, 300 nodes; (f) Cs << Cm, 10 routings.
For all models, the cost difference between CoMNet and MobileCOP is even bigger for low node densities, since nodes in MobileCOP move longer distances: candidate nodes are on average located farther away from the direct line between the source and the destination. This phenomenon is counter-balanced in the other schemes by the fact that the lower the network density, the higher the proportion of dominating nodes. Dominating nodes are thus more likely to be selected as next hop candidates, and as a result routing will not incur any move cost.
Effect of network density on protocol performance. We select a source-destination pair (S and D) and allow 10 consecutive route computations between S and D. Figures 5(b), 5(d) and 5(f) display the cumulative energy consumed after all consecutive routing tasks using each CoMNet variant and MobileCOP.
The analysis is conducted for various network densities and for the different mobility models. Note that MobileCOP runs independently of the mobility cost. Its energy consumption smoothly decreases as the number of nodes increases, because for high network densities MobileCOP can select a relay node already at the right location, saving the energy needed to move it. The CoMNet variants behave differently. When the moving cost is equivalent to the sending cost, the CoMNet variants behave similarly to MobileCOP. When moving is less costly than sending, the CoMNet variants favor reducing the sending cost rather than the moving cost. Nevertheless, for low densities, by construction of the CDS, almost every node belongs to the CDS and thus cannot move. The energy consumed is then mainly the energy consumed to send messages; since sending is more expensive than moving, CoMNet consumes more energy than MobileCOP. As the network density increases, the proportion of dominating nodes decreases, so more nodes can move to the proper positions and the CoMNet variants outperform MobileCOP. For these settings, the ORouting variant is the best performing technique (since it tries to minimize the moving distance), followed by Move(r) and Move(DSr); the latter gives priority to the sending cost over the mobility energy consumption. When sending is more expensive than moving, CoMNet adapts to the network density (unlike MobileCOP, where the moving cost is not part of the optimized cost) and outperforms MobileCOP for the various network densities. In addition, in all cases, CoMNet ensures network connectivity and is memoryless, which is not the case for MobileCOP. For these settings, the best CoMNet variant is Move(DSr), which tries to minimize the energy consumed to send messages.
6 Conclusion and Future Work
We have introduced a novel protocol, CoMNet, that takes advantage of node mobility for efficient routing while ensuring network connectivity. The robustness of CoMNet compared to existing methods is due to the fact that it incorporates all costs (transmission as well as moving costs) in its routing decision. Note that the cost model does not have to be restricted to the transmission and moving costs; it can be generalized to include receiving energy and other costs. Through extensive simulations, we have shown that CoMNet outperforms existing methods in terms of energy consumption and memory overhead. CoMNet has three variants with different objective functions and mobility models, and each variant has its specific applications. Our future work will focus on the behavior of CoMNet in the presence of node conflicts, i.e. when a node is solicited by more than one flow. Preliminary results show that Move(r) is the most appropriate variant since it does not try to align nodes. Another interesting extension of this work would be to consider non-connected networks and to explore node mobility as a means to achieve network connectivity. The present work is efficient for single-flow transmission; extending it to multiple flows is an interesting problem, since flows sharing the same nodes create bottlenecks, and node conflicts may complicate the routing schemes.
References
1. Bose, P., Morin, P., Stojmenovic, I., Urrutia, J.: Routing with guaranteed delivery in ad-hoc wireless networks. ACM/Kluwer Wireless Networks 7(6), 609–616 (2001)
2. Carle, J., Simplot-Ryl, D.: Energy efficient area monitoring by sensor networks. IEEE Computer Magazine 37, 40–46 (2004)
3. Dai, F., Wu, J.: An extended localized algorithm for connected dominating set formation in ad hoc wireless networks. IEEE Trans. Parallel and Distributed Systems (TPDS) (2004)
4. Elhafsi, E.H., Mitton, N., Simplot-Ryl, D.: End-to-end energy efficient geographic path discovery with guaranteed delivery in ad hoc and sensor networks. In: IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Cannes, France (September 2008)
5. Hamouda Elhafsi, E., Simplot-Ryl, D.: Flattening the gap between source-destination paths in energy efficient greedy georouting in wireless sensor networks. In: Zhang, H., Olariu, S., Cao, J., Johnson, D.B. (eds.) MSN 2007. LNCS, vol. 4864, pp. 56–65. Springer, Heidelberg (2007)
6. Finn, G.G.: Routing and addressing problems in large metropolitan-scale internetworks (March 1987)
7. Fraboulet, A., Chelius, G., Fleury, E.: Worldsens: Development and prototyping tools for application specific wireless sensor networks. In: SPOTS (April 2007)
8. Frey, H., Ruehrup, S., Stojmenovic, I.: Routing in wireless sensor networks. Guide to Wireless Ad Hoc Networks 4, 81–111 (2009)
9. Gallais, A., Carle, J., Simplot-Ryl, D., Stojmenovic, I.: Localized sensor area coverage with low communication overhead. In: Fourth Annual IEEE International Conference on Pervasive Computing and Communications (PerCom) (2006)
10. Goldenberg, D.K., Lin, J., Morse, A.S.: Towards mobility as a network control primitive. In: ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), pp. 163–174 (September 2004)
11. Kuruvila, J., Nayak, A., Stojmenovic, I.: Progress and location based localized power aware routing for ad hoc sensor wireless networks. International Journal on Distributed Sensor Networks (IJDSN) 2, 147–159 (2006)
12. Liu, H., Nayak, A., Stojmenović, I.: Localized mobility control routing in robotic sensor wireless networks. In: Zhang, H., Olariu, S., Cao, J., Johnson, D.B. (eds.) MSN 2007. LNCS, vol. 4864, pp. 19–31. Springer, Heidelberg (2007)
13. Rodoplu, V., Meng, T.: Minimum energy mobile wireless networks. IEEE Journal on Selected Areas in Communications (JSAC) 17(8), 1333–1347 (1999)
14. Sanchez, J.A., Ruiz, P.M.: Exploiting local knowledge to enhance energy-efficient geographic routing. In: Cao, J., Stojmenovic, I., Jia, X., Das, S.K. (eds.) MSN 2006. LNCS, vol. 4325, pp. 567–578. Springer, Heidelberg (2006)
15. Simplot-Ryl, D., Stojmenovic, I., Wu, J.: Energy efficient backbone construction, broadcasting, and area coverage in sensor networks. In: Handbook of Sensor Networks: Algorithms and Architectures, pp. 343–379. Wiley, Chichester (2005)
16. Stojmenovic, I., Lin, X.: Power-aware localized routing in wireless networks. IEEE Trans. Parallel and Distributed Systems (TPDS) 12(11), 1122–1133 (2001)
17. Wang, W., Srinivasan, V., Chua, K.-C.: Extending the lifetime of wireless sensor networks through mobile relays. IEEE/ACM Trans. Netw. 16(5), 1108–1120 (2008)
Joint Duty Cycle Scheduling, Resource Allocation and Multi-constrained QoS Routing Algorithm
Jamila Ben Slimane (1,2), Ye-Qiong Song (2), Anis Koubaa (3,4), and Mounir Frikha (1)
(1) MEDIATRON, Higher School of Communication of Tunis, Tunisia
(2) LORIA and INPL, Campus Scientifique, BP 239, 54506, France
(3) CISTER Research Unit, Polytechnic Institute of Porto (ISEP-IPP), Portugal
(4) COINS Research Group, Al-Imam Mohamed bin Saud University, Saudi Arabia
Abstract. Wireless mesh sensor networks (WMSNs) have recently gained a lot of interest due to their communication capability to support various applications with different Quality of Service (QoS) requirements. The most challenging issue is providing a tradeoff between resource efficiency and multi-constrained QoS support. For this purpose, we propose a cross-layer algorithm, JSAR (Joint duty cycle Scheduling, resource Allocation and multi-constrained QoS Routing algorithm), for WMSNs based on multi-channel, multi-time-slot Medium Access Control (MAC). To the best of our knowledge, JSAR is the first algorithm that simultaneously combines a duty cycle scheduling scheme for energy saving, a resource allocation scheme for the efficient use of frequency channels and time slots, and a heuristic for multi-constrained QoS routing. The performance of JSAR has been evaluated, showing that it is suitable for on-line implementation.
1 Introduction
Wireless mesh sensor networks (WMSNs) are expected to support various applications with different QoS requirements. According to novel application requirements, QoS constraints become more and more critical in terms of end-to-end delay, data throughput and packet error rate. Also, due to energy constraints at the node level, energy saving remains the most challenging issue. Various cross-layer designs are applicable to WMSNs [2, 3, 4, 5], but the majority of the proposed approaches focus on WMSNs based on single-channel MAC protocols, and only [5] focuses on multi-channel-based WMSNs. However, dealing with all the WMSN requirements requires tight collaboration among all network layers. Although several cross-layer designs have been proposed for WMSNs to ensure energy efficiency, to improve the network's performance or to support QoS guarantees, few approaches have taken multiple WMSN requirements into account simultaneously. The authors in [2] proposed a cross-layer strategy that explores the tradeoff between energy efficiency and packet timeliness in time division multiple access (TDMA) based WSNs, by means of transmission power allocation and routing path selection schemes. The idea in [3] is based on balancing traffic inside the network and judiciously allocating the retry limit to each link. [4] proposed a cross-layer design based on power transmission management, routing and duty-cycle scheduling to optimize WSN energy efficiency. We find that only [5] proposed a cross-layer design for WMSNs based on multi-channel access. The goal of the joint wakeup/sleep scheduling
and routing algorithm proposed in [5] is the minimization of the communication latency while providing energy efficiency for nodes in FDMA-based multi-channel WMSNs. The proposed routing algorithm takes only the delay constraint into account to minimize communication latency and save energy. We note that flow differentiation, link reliability, the residual energy per node and the energy consumption per path are not taken into account during the routing process. None of these approaches simultaneously combines multi-constrained QoS routing, duty cycle scheduling and efficient frequency channel and time slot allocation. The main problem we address is designing an efficient cross-layer algorithm that simultaneously takes into account various WMSN requirements. As a solution to such an NP-hard problem, we propose in this paper JSAR, an algorithm that simultaneously combines a network duty cycle scheduling policy for energy saving, and thus network lifetime maximization, a frequency channel and time slot allocation strategy for the efficient sharing of frequency channels and time slots during the routing process, and a heuristic for QoS support offering guaranteed services in compliance with the application requirements and taking into account the network configuration. In this work, we focus only on the performance evaluation of JSAR itself; the evaluation of network performance under JSAR is future work. The rest of the paper is organized as follows. Section 2 presents the system model. Section 3 describes the principle of PMCMTP [1] with the use of JSAR. In Section 4, we detail the proposed cross-layer design. In Section 5, we evaluate JSAR's performance by analyzing and commenting on some simulation results.
2 System Model and Notations
We consider a WMSN composed of one coordinator located at the center of the network and a set of distributed sensor nodes acting as routers and/or sources, organized in a fully meshed topology (see the second tier of the network architecture proposed in [11]). We model the network as a weighted directed graph G(V, E, w). We assume that the network supports five data flow priority levels, from P1 to P5 (P1 being the highest priority level). The control traffic refers to the network synchronization and resource allocation information. All network nodes are concerned by the beacon frames, so a simple way to quickly share such information among all nodes is to broadcast it from the coordinator, which is more powerful than the other nodes and has no energy constraint compared to them. Also, the amount of control traffic is negligible in comparison to the data traffic, so there is no need to balance such traffic over the network. A single frequency channel is therefore enough to carry the control traffic, and the rest of the available frequency channels can be used to maximize the number of parallel data communications. For these reasons, we propose one-hop control communication using one frequency channel. To balance both load and energy consumption and to minimize transmission power, we propose multi-hop routing with the use of J frequency channels of K time slots per channel for data communication between the network's nodes. Each node u ∈ V has a unique identifier (id), and is characterized by its initial battery capacity (BC_u), residual energy (Re_u) and data queue state (DataQ_u). A node
has three activity states, respectively busy (transmitting or receiving data), forced sleep (see Section 3) and free (in sleep).
Notations: V: set of vertices (nodes); E: set of edges (links); N: number of nodes; L: number of links; w: link weight; Pi: flow priority levels; J: number of frequency channels; K: number of time slots per channel; dc_u: node duty cycle or activity state, with dc_u = 1 for the sleep or free state and dc_u = 0 for the busy and forced sleep states.
Each link l_uv ∈ E is associated with multiple parameters which can be classified into additive metrics (such as delay, link cost, energy consumption, number of hops), multiplicative metrics (such as the packet reception rate PRR) and concave metrics (such as the available bandwidth [6]). For concave metrics, we propose to prune all links that do not satisfy the corresponding constraints, and multiplicative metrics are transformed into additive metrics by applying a logarithm operation. For the rest of the paper, we consider only additive metrics for path evaluation. Each link l_uv ∈ E is associated with a cost parameter cost(u, v) and m additive non-negative weights w_i(u, v) ≥ 0, i = 1, ..., m, which can be denoted by the vector w(u, v) = (w_1(u, v), ..., w_m(u, v)); each weight corresponds to a constraint, and the upper bound of the ith constraint is denoted by C_i (i.e. the vector C = (C_1, ..., C_m) corresponds to the vector of the m constraints). At a given time t, each link l_uv ∈ E is characterized by a set of parameters as follows:
- Link's logarithmic packet reception rate prr_uv: we define prr_uv as the absolute value of the logarithm of PRR_uv (packet reception rate):
    prr_uv = |log10(PRR_uv)|        (1)
- Link's delay δ_uv: for each data flow F, the wireless interface delay typically involves the data transfer delay (δ^F_tr) and the resource allocation and channel access delay (δ^F_ch). δ^F_uv is expressed in terms of time slots as follows:
    δ^F_uv = δ^F_tr + δ^F_ch        (2)
- Link's energy consumption E^c_uv: we define E^c_uv as the sum of the energy consumed by node u during radio transmission, data processing and queueing (E^c_u) and the energy consumed by node v during radio reception (E^c_v):
    E^c_uv = E^c_u + E^c_v        (3)
- Link's residual energy Re_uv: we define Re_uv as the minimum of the residual energies of nodes u and v:
    Re_uv = min(Re_u, Re_v)        (4)
- Link's energy availability Ae_uv: we define Ae_uv as a boolean value reflecting the energy availability of both nodes u and v:
    Ae_uv = 1 if Re_u > E^c_u and Re_v > E^c_v, and Ae_uv = 0 otherwise.
- Link's available time slots TS_uv: we assume the use of K equal time slots (τ) on each of J frequency channels (i.e. K×J slots in total). We let TS^j_u (resp. TS^j_v) represent the available time slots of node u (resp. v) on the jth channel and TS^j_uv represent the link's available time slots on the jth channel. We model TS^j_u (resp. TS^j_v) as a binary vector of dimension K:
    TS^j_u = (ts^j_{u,k})_{1≤k≤K},   ts^j_{u,k} = 1 if time slot τ_{j,k} is available, 0 otherwise,        (5)
  where τ_{j,k} represents the kth time slot of the jth channel, and
    TS^j_uv = (ts^j_{u,k} · ts^j_{v,k})_{1≤k≤K}        (6)
At a given time t, we characterize a path P_{s→d}, from a source s to a destination d, by a set of parameters as follows:
- Path's packet reception rate prr_{s→d}: we transform this multiplicative metric into a cumulative metric by applying a logarithm operation on it:
    PRR_{s→d} = ∏_{l_uv ∈ P_{s→d}} PRR_uv  ⇒  prr_{s→d} = Σ_{l_uv ∈ P_{s→d}} prr_uv        (7)
- Path's delay δ_{s→d}: the sum of the delays incurred on each link of the path:
    δ_{s→d} = Σ_{l_uv ∈ P_{s→d}} δ_uv        (8)
- Path's energy consumption E^c_{s→d}: the sum of the energy consumed on each link of the path:
    E^c_{s→d} = Σ_{l_uv ∈ P_{s→d}} E^c_uv        (9)
- Path's residual energy Re_{s→d}: the minimum of the residual energies over all links of the path:
    Re_{s→d} = min_{l_uv ∈ P_{s→d}} Re_uv        (10)
- Path's energy availability Ae_{s→d}: the minimum of the energy availabilities over all links of the path:
    Ae_{s→d} = min_{l_uv ∈ P_{s→d}} Ae_uv        (11)
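Given per-link values, the path-level metrics of Eqs. (7)-(11) are simple aggregations; the sketch below uses hypothetical dictionary field names to illustrate them.

```python
import math

def path_metrics(links):
    """Aggregate the link metrics of Eqs. (7)-(11) along a path.

    links: list of dicts with keys 'prr' (packet reception rate in (0, 1]),
    'delay' (time slots), 'energy', 'res_energy', 'avail' (0/1).
    """
    return {
        "prr":        sum(abs(math.log10(l["prr"])) for l in links),   # Eq. (7), log scale
        "delay":      sum(l["delay"] for l in links),                   # Eq. (8)
        "energy":     sum(l["energy"] for l in links),                  # Eq. (9)
        "res_energy": min(l["res_energy"] for l in links),              # Eq. (10)
        "avail":      min(l["avail"] for l in links),                   # Eq. (11)
    }

hops = [{"prr": 0.9, "delay": 2, "energy": 120.0, "res_energy": 800.0, "avail": 1},
        {"prr": 0.8, "delay": 3, "energy": 150.0, "res_energy": 500.0, "avail": 1}]
print(path_metrics(hops))
```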
3 PMCMTP with the Use of JSAR
In [1], we proposed PMCMTP for frequency channel and time slot allocation inside a personal area network. A key concept in PMCMTP is the elementary active cycle, shown in Figure 1, which is composed of two consecutive superframes.
Fig. 1. An elementary active cycle
The first superframe is used for synchronization and the collection of resource requests, and the second for the request scheduling algorithm (RSA), the reception of the second beacon and the data communications. In this paper, we propose an extension of the PMCMTP MAC protocol that includes JSAR instead of RSA to handle medium access, the duty cycle scheduling of the network's nodes, the efficient sharing of the available frequency channels and time slots, and the search for the best routing paths in compliance with multiple application QoS requirements. We assume that the coordinator maintains a global view of the network configuration; whenever there is a change in the network, the coordinator updates the related parameters (matrices, graphs, ...). The principle of PMCMTP using JSAR is based on the following four phases:
- Forced sleep test and synchronization: We assume that the WMSN's coordinator maintains an updated energetic graph Ge of the network's nodes. Before the start of each elementary active cycle, the coordinator executes a Forced Sleep Test (FST) to exclude nodes with low energy levels from participating in the routing process. It includes in the beacon payload the ids of the nodes that must go into the forced sleep state. By listening to the first beacon, the WMSN's nodes adjust their wake-up clocks and check the beacon payload to learn whether they must go into the forced sleep state. Only nodes that pass the FST will be active during the current elementary active cycle; the others go into the forced sleep state to save their batteries.
- Requesting resources: This phase consists of a set of equal short time slots during which the coordinator listens to the requests of its members. Based on the number of source nodes, the coordinator assigns to each source node a specific time slot according to its id. Just after the reception of the first beacon, each source node waits for its own time slot to send its resource allocation request. The request packet is composed of five fields: the request identifier, the request's priority index, the number of required time slots per hop, and the source and destination addresses.
- Routing and resource allocation process: According to JSAR, after the reception of all resource allocation requests, the coordinator schedules them according to their priorities. Once the list of requests is scheduled, the coordinator launches the routing process, a heuristic based on Dijkstra's algorithm, to find the best path; then it performs the effective allocation of time slots and frequency
channels. For each request, it tries to find adequate available time slots on the available channels to assign to the entire selected path from the source to the destination, taking into account the QoS constraints and the resource availability. After processing all resource allocation requests, the coordinator computes the satisfactory note SN of the current active cycle, and registers a trace of the requests that were not served in order to analyze them during the next cycle. It then inserts into the next beacon frame the necessary information about the served requests, such as the index of the allocated channel, the index of the first allocated time slot, the number of allocated time slots and the source and destination addresses.
- Data transmission: After listening to the second beacon, nodes obtain feedback on the different requests. Each concerned sensor switches to the suitable channel at the suitable time slot and begins sending or receiving data frames during the allocated duration.
4 WSN's Cross-Layer Design
In this section, we propose a new WMSN cross-layer framework with joint consideration of the network's duty cycle scheduling policy, the resource allocation management strategy and the network-layer QoS routing process, to simultaneously ensure energy saving, optimal resource sharing and QoS support. The MAC layer has to support the duty cycle scheduling and the spectral and temporal resource allocation, whereas the network layer has to ensure the multi-constrained QoS routing process.
4.1 Duty Cycle Scheduling
The objective of the network's duty cycle scheduling is to maximize the nodes' sleep duration in order to prolong the network's lifetime while ensuring the QoS requirements in a dynamic manner (i.e. by balancing load and enhancing energy saving without affecting the network performance). The dynamic sleep/wake scheduling problem in a WMSN can be formulated as follows: how should the activity of nodes be scheduled in a WMSN in order to enhance energy efficiency while ensuring the network's QoS requirements? Determining which subset of nodes to turn off for a time interval and how to schedule the sleep intervals are both in joint correlation with the routing process and the application requirements. In order to minimize and balance energy consumption and to avoid node failures due to battery draining, we suggest forcing certain routers to enter the sleep state. Nodes whose residual energy is lower than the threshold value TH^n and whose data queues (DataQ_i) are empty are put into the forced sleep state during the nth elementary active cycle (see Alg. 1). TH^n is defined as follows:
    TH^n = (Σ_{i=1}^{N} Re^n_i / N) · SN^{n−1},   SN^{n−1} = (number of satisfied routing requests) / (total number of routing requests)        (12)

N and SN^{n−1} refer respectively to the number of WMSN nodes and to the satisfactory note of the (n−1)th active cycle (SN^{n−1} ∈ ]0, 1]). The satisfactory note SN^{n−1} is computed at the end of the (n−1)th active cycle.
Let us assume that the duration of the nth data communication phase is equal to K time slots with the use of J frequency channels. We model the duty cycle during the data communication phase of node i by the vector DC^n_i of dimension K:

    DC^n_i = (dc^n_{i,k})_{1≤k≤K},   dc^n_{i,k} = 0 if in forced sleep or busy, 1 if free or in sleep.        (13)

The duty cycle matrix DC^n gives an idea of the activity states of the WMSN's nodes. This matrix can be updated during the routing process (i.e. by setting the adequate time slots to 0 for each node participating in a path):

    DC^n = (DC^n_1, ..., DC^n_N)^T = [dc^n_{i,k}]_{N×K}        (14)

Algorithm 1. Forced Sleep Test (FST)
for i = 1 to N {
  if (Re^n_i < TH^n and DataQ_i is empty) then
    DC^n_i = 0, update(G(V_a, E_a, w)^n) }
DC^n_i: state of the ith node for the nth elementary active cycle
V_a, E_a: the active vertices and edges of the nth active cycle
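A minimal sketch of the threshold of Eq. (12) and the FST of Alg. 1 follows; the data structures (dicts keyed by node id) are assumptions made for illustration.

```python
def forced_sleep_test(res_energy, queues, sn_prev):
    """Return the nodes forced to sleep: below TH^n (Eq. 12) with an empty data queue.

    res_energy: dict node -> residual energy; queues: dict node -> pending packets;
    sn_prev: satisfactory note SN^{n-1} of the previous cycle, in ]0, 1].
    """
    th = sum(res_energy.values()) / len(res_energy) * sn_prev
    return {i for i, e in res_energy.items() if e < th and not queues[i]}

nodes = {"a": 10.0, "b": 90.0, "c": 40.0}
print(forced_sleep_test(nodes, {"a": [], "b": [], "c": ["pkt"]}, sn_prev=0.8))
```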
4.2 Resource Allocation
The objective of the resource allocation process is to optimally allocate the available resources, in terms of time slots per available frequency channel, in order to ensure QoS support and to enhance both energy efficiency and channel utility. As with the sleep/wake scheduling problem, resource allocation is also in joint correlation with the routing process and the application requirements. We model the resources of the nth data communication phase by the binary matrix RS^n_JK. Initially, RS^n_JK = I_JK (i.e. initially there are K×J available time slots in total), and the matrix is updated during the routing process:

    RS^n_JK = [rs^n_{j,k}]_{J×K},   rs^n_{j,k} = 1 if the kth slot of the jth channel is free, 0 otherwise.        (15)

For a given link l_uv ∈ E, we compute the link's available time slots TS_uv, for the nth active cycle, as given in Alg. 2. In order to maximize the resource utility and to optimally share the resources among the WMSN's nodes, we suggest implementing a centralized resource allocation policy at the network coordinator. The resource allocation decision is taken by the coordinator in response to the set of collected resource allocation requests, according to the
network configuration, the duty cycle scheduling and the routing decision. As shown in Fig. 2, the coordinator first tries to find candidate paths according to the application requirements and the network organization. During this process, it performs a temporary resource allocation (allocation of the earliest available time slots of an available frequency channel per link) to compute the paths' parameters and costs, and then it selects the best path. At this moment, the coordinator makes the effective resource allocation and updates the resource matrix RS, the duty cycle matrix DC and the energetic graph Ge (i.e. by decreasing the residual energy, according to the energy consumption estimation, of each node participating in the path). We propose the TRR (Temporary Resource Reservation) algorithm (Alg. 3) to ensure the temporary resource reservation of the current path, which can be added as an extensible or candidate path. This algorithm consists in finding suitable available time slots on the available channels to temporarily assign to the new edge l_uv. The main objective of this algorithm is to minimize, as far as possible, the end-to-end delay.

Algorithm 2. Computation of the available time slots per link for the nth active cycle
∀ l_uv ∈ E do {
  for j = 1 to J {
    Compute TS^j_u (resp. TS^j_v) of node u (resp. v): TS^j_u = (dc^n_{u,k} · rs^n_{j,k})_{1≤k≤K}
    Compute TS^j_uv = (ts^j_{u,k} · ts^j_{v,k})_{1≤k≤K} }
  TS_uv = (TS^j_uv)_{1≤j≤J} }
Fig. 2. Resource allocation algorithm for WMSN
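Eq. (15) and Alg. 2 reduce to element-wise products of binary vectors; the numpy-based sketch below illustrates this for one link (the array shapes and example values are assumptions).

```python
import numpy as np

def link_available_slots(rs, dc_u, dc_v):
    """Per-channel available slots of a link (u, v), following Eq. (15) and Alg. 2.

    rs: J x K binary resource matrix RS (1 = slot free); dc_u, dc_v: length-K binary
    duty-cycle vectors of u and v (1 = node free during that slot).
    """
    ts_u = rs * dc_u          # TS_u^j: slots free both globally and for u
    ts_v = rs * dc_v
    return ts_u * ts_v        # TS_uv^j: element-wise product, per channel j

rs = np.ones((2, 4), dtype=int)              # 2 channels, 4 slots, all initially free
dc_u = np.array([1, 1, 0, 1])
dc_v = np.array([0, 1, 1, 1])
print(link_available_slots(rs, dc_u, dc_v))  # only slots where both nodes are free
```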
Algorithm 3. Temporary Resource Reservation (TRR)
Given p, RS_temp, DC_temp, Ge_temp, l_uv
δuv_temp = ∞, ch = 0, fts = 0
Compute TS_uv
if (Ae_uv = 1) then
  for j = 1 to J {
    Compute δ_uv
    if δ_uv < δuv_temp then
      δuv_temp = δ_uv, ch = j, fts = first allocated TS }
  compute w(u, v), update(RS_temp, DC_temp, Ge_temp)
else w(u, v) = ∞
4.3 Multi-constrained QoS Routing
The objective of QoS routing is to find a path from a source s to a destination d using the minimum amount of network resources, such as energy consumption per route, residual battery power per node and available time slots, while satisfying various QoS constraints, such as delay, reliability, etc. When multiple routing metrics are considered, the problem becomes a multi-constrained path problem, which has been mathematically proven to be NP-complete in [7]. However, various heuristic and approximation algorithms proposed in the literature can approximate or solve similar problems in polynomial or pseudo-polynomial time [8, 9]. For WMSNs, we propose a centralized multi-constrained QoS reactive routing scheme, which jointly interacts with the proposed duty cycle scheduling and resource allocation methods. The JSAR algorithm, given by Alg. 4, describes the proposed routing approach. Let P_G and P_{s→d} represent respectively the set of all paths in the network and the set of all paths from a source node s to a destination node d, so that P_G = ∪_{∀s,d∈V} P_{s→d}.
The basic routing problems can be defined as follows.

Definition 1. The MCP (Multi-Constrained Path) problem is to find paths P_{s→d} from s to d such that:

    w(P_{s→d}) = Σ_{l_uv ∈ P_{s→d}} (w_1(u, v), w_2(u, v), ..., w_m(u, v)) ≤ (C_1, C_2, ..., C_m)        (16)

The goal is to find the set of feasible paths Pf_{s→d} from s to d that satisfy multiple constraints simultaneously.

Definition 2. The MCOP (Multi-Constrained Optimal Path) problem is a variant of the MCP problem which is to find a path P^opt_{s→d} from s to d such that:

    Cost(P^opt_{s→d}) = min_{∀P_{s→d} ∈ Pf_{s→d}} Cost(P_{s→d})        (17)

The goal is to find the optimal feasible path from s to d in such a way that network resources are efficiently utilized.

Definition 3. Let P_{s→d} and P′_{s→d} be two different paths from s to d. Path P_{s→d} is dominated by path P′_{s→d} if and only if w(P′_{s→d}) ≤ w(P_{s→d}) (i.e. P′_{s→d} ≺ P_{s→d}) with at least one strict inequality.

Lemma 1. If ∃ P_{s→d} ∈ Pf_{s→d}, then there exists a non-dominated P′_{s→d} ∈ Pf_{s→d}.
According to Lemma 1, we can use the concept of path domination to reduce the computational complexity by considering only non-dominated feasible paths, given that the set of non-dominated feasible paths from s to d is a subset of Pf_{s→d}. We define Pf^{nd}_{s→d} as the set of all non-dominated feasible paths in Pf_{s→d}.
Lemma 2. Given P_{s→d} ∈ Pf^{nd}_{s→d}, let P_{u→v} ⊂ P_{s→d}; then P_{u→v} ∈ Pf^{nd}_{u→v} (optimality principle: every optimal path is formed from optimal sub-paths).
According to Lemma 2, a sub-path of a non-dominated feasible path is also a non-dominated feasible path. We consider only non-dominated feasible sub-paths to be extended toward the final destination. Sub-optimal paths (i.e. paths that are dominated by or are equal to others) are ignored during the path extension process. Let P and P′ be two paths to vertex u where P is dominated by P′; if path P can be extended to a path that satisfies Eq. 16, then so can P′, so there is no need to retain P for path extension. The determination of all non-dominated paths (i.e. Pf^{nd}_{s→d}) from s to d becomes a very hard task when the number of such paths is too high. In this context, we propose to limit the search for non-dominated feasible paths to a set of at most X paths (PCandidate_{s→d}). The main sub-problems that we must study are as follows:
- how to define the parameter X, taking into account its impact on the routing decision and consequently on the network performance;
- how to evaluate a path;
- how to evaluate and select the set of the X most efficient non-dominated feasible paths, if they exist.
Once the set PCandidate_{s→d} is obtained, one still faces the problem of selecting the best final solution.
Definition of the parameter X: To improve the efficiency of the path finding algorithm, X should be large, but this is to the detriment of the computational complexity. This parameter must be carefully determined in order to ensure a tradeoff between routing efficiency and both space and computation complexity. We propose two methods to compute X:
- Statically: the coordinator defines X as a fixed number computed at the beginning of the network operation; the definition used in [8] can be applied.
- Dynamically: for each active cycle, the coordinator dynamically computes X according to the network configuration, the supported load and the satisfactory note of the previous active cycle.
Evaluation of a path: First, we propose a link cost function. For resource allocation requests with priority level P1 or P2 (i.e. hard real-time constraints), we consider the delay as the performance metric. For resource allocation requests with priority level P3 or P4, we propose the energy consumption per link as the performance metric, given that the energy aspect becomes constraining. Finally, for the remaining requests we propose the residual energy per node as the performance metric, to avoid the premature death of some nodes:

    Cost_uv = δ_uv if P = P1, P2;   Cost_uv = E^c_uv if P = P3, P4;   Cost_uv = Re_uv if P = P5.        (18)

To jointly ensure QoS support, load balancing and energy saving, we propose the link cost function (Eq. 18) and the path cost function (Eq. 19) based on the resource allocation request's priority. For requests with priority P1 and P2, the path cost represents respectively the end-to-end delay and the average delay. For requests with priority P3
and P4, the path cost represents, respectively, the average energy consumption per path and the total energy consumption per path. Finally, for the remaining requests, in order to avoid network partition, the cost represents the residual energy per path.

Evaluation and selection of the X most efficient non-dominated paths: According to the relax operation (see Alg. 4), each newly generated path ((current path p) ∪ l_uv) must pass the feasibility test and then the non-dominance test. In this case, the newly found path is stored, according to its cost, at the suitable position in the sorted queues PATH(v) (i.e., Pf_{s→v}) and Q (non-dominated paths). The length of the queues cannot exceed X; otherwise, the last path is discarded.

Cost_{s→d} = Σ_{l_uv ∈ P_{s→d}} Cost_uv                if P = P1
Cost_{s→d} = (Σ_{l_uv ∈ P_{s→d}} Cost_uv) / N_hop      if P = P2
Cost_{s→d} = (Σ_{l_uv ∈ P_{s→d}} Cost_uv) / N_hop      if P = P3    (19)
Cost_{s→d} = Σ_{l_uv ∈ P_{s→d}} Cost_uv                if P = P4
Cost_{s→d} = min_{l_uv ∈ P_{s→d}} (Cost_uv)            if P = P5
N_hop: the number of hops of a path P_{s→d} from s to d.

Selection of the best path: The best path is selected among the set of candidate paths. The selection criterion is the path cost function, so the minimum-cost path is selected as the best path. Consequently, the effective resource allocation takes place.

4.4 JSAR

As already explained, for each active cycle, JSAR (see Alg. 4) first performs the Forced Sleep Test (line 2 in the main). After the reception of all resource allocation requests, it computes the parameter X (line 3 in the main). For each request it tries to find the set of candidate paths (line 4 in the main); then, from this set, it selects the best path and performs the effective resource allocation (line 5 in the main). Finally, it computes the satisfactory note of the current cycle (line 6 in the main).
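As an illustration, the following Python fragment (a sketch, not code from the paper; the priorities P1–P5 are mapped to plain strings and the argument names are our own) evaluates a link cost according to Eq. 18 and aggregates it into a path cost according to Eq. 19.

```python
def link_cost(priority, delay, energy, residual_energy):
    """Link cost per Eq. 18: delay for P1/P2, energy per link for P3/P4,
    residual energy for P5."""
    if priority in ("P1", "P2"):
        return delay
    if priority in ("P3", "P4"):
        return energy
    return residual_energy  # P5

def path_cost(priority, link_costs):
    """Path cost per Eq. 19: end-to-end sum (P1, P4), per-hop average (P2, P3),
    or the bottleneck residual energy (P5)."""
    n_hop = len(link_costs)
    if priority in ("P1", "P4"):
        return sum(link_costs)
    if priority in ("P2", "P3"):
        return sum(link_costs) / n_hop
    return min(link_costs)  # P5: avoid the premature death of weak nodes

# Example: a 3-hop path evaluated for a hard real-time request (P1)
costs = [link_cost("P1", delay=1, energy=40, residual_energy=0.8) for _ in range(3)]
print(path_cost("P1", costs))  # end-to-end delay of 3 time slots
```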
5 JSAR Evaluation

We have implemented the proposed algorithm in the MATLAB environment. The performance of the algorithm is evaluated by experiments on an ANSNET [10] network (Table 1). We consider a three-constrained path problem. Link weights are as follows:
– w1(u, v): represents the parameter prr_uv, randomly selected from uniform [0, 1],
– w2(u, v): represents the parameter δ_uv, assumed to be equal to 1 time slot,
– w3(u, v): represents the parameter E^c_uv, assumed to be equal to 40.
Algorithm 4. JSAR

Initialization(G(Va, Ea, w), Rq, RS^n, DC^n, Ge^n)
1  s = Rq.s, d = Rq.d, RStemp = RS^n, DCtemp = DC^n, Getemp = Ge^n
2  ∀ v ∈ Va \ {s}: PATH(v) = ∅
3  PATH(s) = {⟨0⃗, RStemp, DCtemp, Getemp⟩}
4  Q = P^{Candidate}_{s→d} = {⟨0⃗, RStemp, DCtemp, Getemp⟩}

X efficient PATHs(X, q, RStemp, DCtemp, Getemp, Queue)
1  ∀ p ∈ Queue
2    if Cost(q) < Cost(p) then
3      insert ⟨q, RStemp, DCtemp, Getemp⟩ at p's position in Queue
4      if |Queue| = X + 1 then
5        delete the last path in Queue

Relax(p, v, RStemp, DCtemp, Getemp, C)
1  TRR(p, RStemp, DCtemp, Getemp, luv)
2  if W(p ∪ luv) > C then
3    return
4  dominated = False
5  ∀ q ∈ PATH(v)
6    compute W(p ∪ luv)
7    if p ∪ luv is dominated by q then
8      dominated = True
9    if p ∪ luv ≺ q then
10     remove ⟨q, RStemp, DCtemp, Getemp⟩ from PATH(v)
11     remove ⟨q, RStemp, DCtemp, Getemp⟩ from Q
12 if (not dominated) then
13   ⟨q, RStemp, DCtemp, Getemp⟩ = ⟨p ∪ luv, RStemp, DCtemp, Getemp⟩
14   X efficient PATHs(X, q, RStemp, DCtemp, Getemp, PATH(v))
15   X efficient PATHs(X, q, RStemp, DCtemp, Getemp, Q)

Routing(G(Va, Ea), Rq, RS^n, DC^n, Ge^n)
1  Initialization(G(Va, Ea), Rq, RS^n, DC^n, Ge^n)
2  while (Q ≠ ∅ and |P^{Candidate}_{s→d}| < X) {
3    ⟨p, RStemp, DCtemp, Getemp⟩ = Dequeue(Q)
4    u = lastelement(p)
5    if (u = d) then
6      enqueue ⟨p, RStemp, DCtemp, Getemp⟩ in P^{Candidate}_{s→d}
7    else ∀ luv ∈ Ea
8      Relax(p, v, RStemp, DCtemp, Getemp, C) }
9  return P^{Candidate}_{s→d}

The Best PATH(P^{Candidate}_{s→d}, RS^n, DC^n, Ge^n)
1  ⟨pbest, RStemp, DCtemp, Getemp⟩ = min_{∀ p ∈ P^{Candidate}_{s→d}} (Cost(p))
2  RS^n = RStemp, DC^n = DCtemp, Ge^n = Getemp

Main
1  for the nth active cycle do {
2    Forced Sleep Test
3    X Computation(G(V, E), DC^n, RQ^n, SN^{n−1})
4    for i = 1 to |RQ^n| do {
5      P^{Candidate}_{s→d} = Routing(G(Va, Ea), RQ[i], RS^n, DC^n, Ge^n)
6      The Best PATH(P^{Candidate}_{s→d}, RS^n, DC^n, Ge^n) }
7    compute SN^n }
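The queues PATH(v) and Q above keep at most X entries sorted by cost. A minimal Python sketch of this bounded insertion is given below (illustrative only; names such as insert_bounded are not from the paper).

```python
import bisect

def insert_bounded(queue, cost, path, X):
    """Insert (cost, path) into a cost-sorted queue and keep at most X entries,
    discarding the worst (last) path when the bound is exceeded."""
    keys = [c for c, _ in queue]
    pos = bisect.bisect_left(keys, cost)
    queue.insert(pos, (cost, path))
    if len(queue) == X + 1:
        queue.pop()          # drop the highest-cost path
    return queue

# Example with X = 3
q = []
for cost, path in [(5, "s-a-d"), (2, "s-b-d"), (7, "s-c-d"), (3, "s-e-d")]:
    insert_bounded(q, cost, path, X=3)
print(q)   # [(2, 's-b-d'), (3, 's-e-d'), (5, 's-a-d')]
```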
Table 1. Network configuration

Parameter                                            Default value
Topology                                             ANSNET model
Number of Vertices - Edges                           32 - 54
Number of Time Slots per data communication phase    15
Number of Channels per data communication phase      3
Length of a path queue or X                          5
Number of routing and resource allocation requests   100

Table 2. Range of C1, C2 and C3 for each experiment

Case   Range of C (= <C1, C2, C3>)
1      C1 ~ uniform[1, 2],  C2 ~ uniform[1, 3],   C3 ~ uniform[40, 120]
2      C1 ~ uniform[2, 3],  C2 ~ uniform[3, 6],   C3 ~ uniform[120, 240]
3      C1 ~ uniform[3, 4],  C2 ~ uniform[6, 9],   C3 ~ uniform[240, 360]
4      C1 ~ uniform[4, 5],  C2 ~ uniform[9, 12],  C3 ~ uniform[360, 480]
5      C1 ~ uniform[5, 6],  C2 ~ uniform[12, 15], C3 ~ uniform[480, 600]
Table 3. SN vs X

Case   Exact algorithm   JSAR, X=2   X=3    X=4    X=5
1      0.15              0.15        0.15   0.15   0.15
2      0.43              0.42        0.42   0.42   0.43
3      0.62              0.58        0.59   0.61   0.62
4      0.78              0.68        0.73   0.76   0.78
5      0.85              0.72        0.80   0.84   0.85
Fig. 3. Execution Time vs X
Table 4. SN vs Nch

Case   Nch=1   Nch=2   Nch=3   Nch=4
1      0.12    0.14    0.15    0.15
2      0.19    0.34    0.43    0.51
3      0.23    0.47    0.62    0.67
4      0.37    0.68    0.78    0.80
5      0.49    0.78    0.85    0.84

Table 5. SN vs Nts

Case   Nts=5   Nts=10   Nts=15
1      0.15    0.15     0.15
2      0.43    0.43     0.43
3      0.53    0.62     0.62
4      0.53    0.78     0.78
5      0.53    0.78     0.85
Constraints are randomly generated, and five different ranges of the constraint vector C (< C1, C2, C3 >) are selected for each experiment (Table 2). The results are based on 100 randomly generated resource allocation requests for each range. The first experiment is carried out to test how the value of X affects the performance of JSAR. We compared
JSAR with the exact algorithm, in which the length of the path queue is infinite. According to Table 3, we note that the SN increases as the constraints get looser. The SN also increases with increasing X; when X is equal to 5, JSAR is almost as good as the exact algorithm. According to Fig. 3, the execution time of JSAR increases with X and also increases as the constraints get looser. Overall, the execution time of JSAR is acceptable for on-line implementation in typical applications. The second experiment tests how the number of available channels and the number of time slots affect the performance of JSAR. According to Table 4 and Table 5, in the majority of cases the SN increases with the number of available channels and with the number of time slots, respectively. Increasing the number of available channels or the number of time slots can significantly improve the performance of the algorithm, especially when the constraints get looser. Given the network configuration and the application requirements (e.g., the desired SN), these results can be used in practice to compute the necessary resources.
6 Conclusion

In this paper, we presented a new cross-layer design for WMSNs based on a multi-channel MAC protocol. We decomposed the optimization problem into three sub-problems: (1) duty cycle scheduling, to save energy and extend the network lifetime, (2) frequency channel and time slot allocation, to optimally allocate resources and enhance the network's performance, and (3) multi-constrained QoS routing, to ensure QoS support and energy efficiency. We jointly solved the three problems using JSAR, which simultaneously satisfies multiple QoS requirements and ensures optimal resource allocation. The evaluation results demonstrate the performance of JSAR. As future work, we will implement the proposed algorithm in a network simulator to evaluate the performance of the network in terms of end-to-end delay, throughput, lifetime and resource utility.
References
1. Ben Slimane, J., Song, Y.Q., Koubaa, A.: A Prioritized Multi-Channel Multi-Time Slot MAC Protocol For Large-Scale Wireless Sensor Networks. In: COMNET 2009, Tunisia (2009)
2. Luo, J., Jiang, L., He, C.: Cross-Layer Optimization for Energy-Timeliness Tradeoff in TDMA Based Sensor Networks. In: IEEE Global Telecommunications Conference, pp. 1–5 (2008)
3. Bouabdallah, F., Bouabdallah, N., Boutaba, R.: Cross-Layer Design for Energy Conservation in Wireless Sensor Networks. In: IEEE International Conference on Communications, pp. 1–6 (2009)
4. Yuebin, B., Shujuan, L., Mo, S., Yang, L., Cong, X.: An Energy Optimization Protocol Based on Cross-Layer for Wireless Sensor Networks. Journal of Communications, 27–34 (2008)
5. Gang, L., Bhaskar, K.: Minimum latency joint scheduling and routing in wireless sensor networks. Journal Ad Hoc Networks, 832–843 (2007)
6. Upadhyaya, S., Dhingra, G.: Exploring Issues for QoS Based Routing Algorithms. International Journal on Computer Science and Engineering 2(5), 1792–1795 (2010)
7. Wang, Z., Crowcroft, J.: QoS routing for supporting resource reservation. IEEE Journal on Selected Areas in Communications 14, 1228–1234 (1996)
8. Wendong, X., et al.: An efficient heuristic algorithm for multi-constrained path problems. In: The 56th IEEE Vehicular Technology Conference, pp. 1317–1321 (2002)
9. Wan, S., Hao, Y., Yang, Y.: Approach for Multiple Constraints Based QoS Routing Problem of Network. In: The 9th International Conference on Hybrid Intelligent Systems, pp. 66–69 (2009)
10. Comer, D.E.: Internetworking with TCP/IP, 3rd edn. Prentice Hall, New York (1995)
11. Ben Slimane, J., Song, Y.Q., Koubaa, A., Frikha, M.: A Three-Tiered Architecture for Large-Scale Wireless Hospital Sensor Networks. In: MobiHealthInf 2009, pp. 20–31 (2009)
Energy Efficient Monitoring for Intrusion Detection in Battery-Powered Wireless Mesh Networks

Amin Hassanzadeh¹, Radu Stoleru¹, and Basem Shihada²

¹ Department of Computer Science and Engineering, Texas A&M University, USA
{hassanzadeh,stoleru}@cse.tamu.edu
² Department of Computer Science, King Abdullah University of Science and Technology (KAUST), Saudi Arabia
[email protected]
Abstract. Wireless Mesh Networks (WMN) are easy-to-deploy, low cost solutions for providing networking and internet services in environments with no network infrastructure, e.g., disaster areas and battlefields. Since electric power is not readily available in such environments, battery-powered mesh routers, operating in an energy efficient manner, are required. To the best of our knowledge, the impact of energy efficient solutions, e.g., involving duty-cycling, on WMN intrusion detection systems, which require continuous monitoring, remains an open research problem. In this paper we propose that carefully chosen monitoring mesh nodes ensure continuous and complete detection coverage, while allowing non-monitoring mesh nodes to save energy through duty-cycling. We formulate the monitoring node selection problem as an optimization problem and propose distributed and centralized solutions for it, with different tradeoffs. Through extensive simulations and a proof-of-concept hardware/software implementation we demonstrate that our solutions extend the WMN lifetime by 8%, while ensuring, at the minimum, a 97% intrusion detection rate.
1 Introduction
Recently, wireless mesh networks (WMN) have emerged as a technology to provide network connectivity in large, remote physical areas where no networking infrastructure is available [1, 2]. WMN reduce networking costs required for offering, over a large physical area, Internet, intranet, and other services to mobile and fixed clients. WMN provide such services using a multi-hop multi-path wireless infrastructure based on a set of mesh routers [3, 4]. A WMN typically consists of Access Points (APs), connecting mobile and static clients to the mesh, relaying mesh nodes, and mesh gateways, connecting the WMN to the Internet. Our motivating application is DistressNet [5], a system, under development, for situation management in disaster response. In DistressNet, WMN are used for providing an infrastructure in triage areas for collecting physiological data from victims and in the disaster area for communication among emergency responders. Since in disaster areas electric power is almost always unavailable (see
the earthquake and tsunami disaster in Japan in 2011, with energy blackouts going as far as 200+ miles away from the affected area), DistressNet needs to operate predominantly on batteries. Battery powered WMN pose major challenges given the typical high power consumption of mesh nodes. Despite the attention energy efficient operation in WMN has received [6–8], there is no provision in the 802.11s standard for power saving mode operation. This led to the absence of mesh node hardware that operates in a power saving mode. Given the urgent need for deploying DistressNet, we propose, as a first step for energy efficient operation, to allow mesh nodes, when feasible, to duty-cycle by turning their wireless interfaces on and off. As we uncover experimentally, duty-cycling has an interesting effect, in that it allows the battery to recover some of its capacity, thus allowing for a longer total operation time.

Duty-cycling, however, has adverse effects on the operation of intrusion detection systems, which are required to be on/awake at all times to monitor network traffic. As proposed in the literature, in wireless networks some nodes can be selected as "monitoring" nodes; they cooperatively perform intrusion detection functions [9, 10]. It is obvious that duty-cycling mesh nodes are not suitable to be monitor nodes, since they are not awake all the time. Consequently, the research challenge we address in this paper is how to reconcile energy efficient operation, which requires nodes to be asleep as much as possible, with effective intrusion detection, which requires nodes to be awake to monitor traffic. We define this problem as an optimization problem and propose centralized and distributed algorithms for solving it, algorithms that trade off communication and computation overhead for optimality of the solution. Based on analysis of potential security attacks, in a novel approach, the nodes that our algorithms select as monitors are monitoring wireless links, and not individual neighbor nodes.

More precisely, this paper makes the following contributions:
– We formulate a novel optimal monitoring node selection problem, in which monitor nodes are responsible for monitoring wireless links, not individual neighbor nodes, and show that it is NP-hard.
– We propose centralized and distributed algorithms for solving the monitoring node selection problem. We provide analysis of our algorithms to illustrate the tradeoffs: time and message complexities for intrusion detection rate.
– We perform extensive simulation studies that demonstrate the performance gains of our proposed algorithms.
– We perform a real system implementation of a solution for saving energy in mesh nodes, using duty-cycling, and show, using real battery profiling data, that the intrusion detection functions are not impacted.

This paper is organized as follows. In Sections 2 and 3 we present evidence for the feasibility of our proposed duty-cycling approach and details of our system/attacker models, respectively. We formulate the problem of optimally selecting monitoring nodes and give a proof of its NP-hardness in Section 4. Solutions to our problem, and their performance evaluation, are presented in Sections 5 and 6, respectively. We present the state-of-the-art solutions in Section 7 and conclude in Section 8.
2 Validation of Duty-Cycled Operation in WMN
DistressNet, being deployed in an environment where electric power is very limited (if at all available), needs to aggressively pursue energy efficient operation, including in the WMN. Unfortunately, no native procedure is included in IEEE 802.11s to allow mesh routers to work in power saving mode. Moreover, a power saving mode is not supported by current wireless routers available on the market. Consequently, we propose to use application-layer controlled duty-cycling as a means for saving energy on mesh routers.

We ran experiments involving Linksys WRT54GL wireless routers (we tested different OpenWrt firmware versions as well) powered by 12V-7Ah Power Sonic rechargeable lead acid batteries (as illustrated in Figure 1(a)) to investigate whether duty-cycling affects connectivity between mesh routers and their clients and to estimate the expected increase in mesh router lifetime. A wireless client establishes an ssh session when the mesh router is initially turned on and starts a terminal application. Then, the duty-cycling operation is initiated by turning the wireless interface of a mesh router on and off using "iwconfig eth1 txpower on/off", at different time intervals. When using duty-cycling, the power consumption of a mesh router was reduced by 840mW (the current consumption drops from 250mA to 180mA when the wireless interface is turned off). We have validated experimentally that the proposed duty-cycling does not close the ssh session - our terminal application continues to work despite the duty-cycled operation of the router.
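As a rough illustration of such application-layer duty-cycling, the sketch below (Python, assuming a Linux-based router whose wireless interface is named eth1 as in the experiment; the script itself is not part of the paper) toggles the radio with the same iwconfig command at a fixed on/off interval.

```python
import subprocess
import time

def set_radio(interface, on):
    """Turn the wireless interface on or off via iwconfig (requires root)."""
    state = "on" if on else "off"
    subprocess.run(["iwconfig", interface, "txpower", state], check=True)

def duty_cycle(interface="eth1", period_s=60, cycles=10):
    """50% duty-cycle: keep the radio on for period_s seconds, then off
    for period_s seconds, repeated for the given number of cycles."""
    for _ in range(cycles):
        set_radio(interface, True)
        time.sleep(period_s)
        set_radio(interface, False)
        time.sleep(period_s)

if __name__ == "__main__":
    duty_cycle(interface="eth1", period_s=60, cycles=5)
```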
Fig. 1. (a) Experimental setup. (b) Battery consumption for different on/off intervals.
Figure 1(b) depicts the battery lifetime when the mesh router has the wireless interface constantly on, and when it operates at a 50% duty-cycle, with different on/off periods (e.g., 30s on/off and 60s on/off). As expected, we observe that when the router operates in duty-cycle mode, its lifetime is extended. Surprisingly, different on/off periods (30s vs 60s) extend the lifetime of the router differently, despite operating at the same 50% duty-cycle. As shown in Figure 1(b) the router lifetime is prolonged by 5h when using the 60s on/off duty-cycling,
and by 3h when using the 30s on/off interval. This experiment validated battery recovery effects [11], which have been mentioned briefly in the context of WMN [8]. We used the data collected during these experiments to enhance a simulator we have developed so that it accounts for the new source of energy efficiency, namely battery recovery.

The proposed energy efficient operation based on duty-cycling, however, has adverse effects on solutions for monitoring network security in wireless networks. If a mesh router is assigned an intrusion detection/monitoring task or if it helps in relaying high network traffic, then the router has to be awake all the time. This implies that routers with higher available energy and with higher network traffic load are better suited candidates for becoming monitoring nodes. Deciding which routers should be selected as monitoring nodes, in order to reduce total energy consumption while not affecting intrusion detection functions, is a challenging problem. In the sections that follow, we introduce our system and security models and mathematically formulate our problem.
3 System and Security Models

3.1 System Model
Our system consists of a WMN with wireless routers powered predominantly by batteries. We allow some of the mesh routers to be AC powered. We assume, as is typical in DistressNet, that the WMN is connected to the Internet through more powerful gateway routers that do not have energy constraints and can execute more sophisticated computations. In this paper we use "mesh router" and "node" interchangeably and refer to a "WMN client" as "client". In our WMN a mesh router serves as a relay node, as an AP for WMN clients, or both. Each router has information about the network load it handles and about its residual energy. The routers periodically exchange information through secure communication links among them. We assume that, if needed (e.g., for a centralized algorithm), there exists a middleware service that collects mesh node information on the WMN gateway(s). Nodes are assigned monitoring or non-monitoring roles. A monitoring node is awake at all times, while a non-monitoring node operates in a duty-cycled manner, to save energy; the gateway is considered to be a monitor node. In our WMN system there are two different configurations of intrusion detection, based on the role assigned to the router (i.e., monitoring vs non-monitoring). Our security model is detailed in the following section.
3.2 Intrusion Detection System and Attacker Model
In our proposed system, each router runs an intrusion detection engine (i.e., Snort). More complex actions performed by the detection engine (e.g., number of active rule sets) require more system resources. Therefore, the configuration of the detection engine provides opportunities to trade off intrusion detection
rate for resource availability. In our system we define two types of configurations for the detection engine: regular (RE-DS), employed by monitor nodes, and lightweight (LW-DS), employed by non-monitoring nodes. An RE-DS detection engine has rules that allow the monitoring of all traffic, while the LW-DS detection engine has rules for monitoring only the traffic from/to the mesh router's clients. The proposed intrusion detection configurations allow us to trade off intrusion detection accuracy for resource availability. In this paper, due to space constraints, we describe only intrusion detection of client attackers. As we will show in Section 6.1, compromised router attackers do not affect the intrusion detection.

In our scenario, a client attacker first connects to a mesh router and joins the WMN. Afterwards, the client runs attacks against other clients or mesh routers. The targeted routers and clients could be local or multi-hop. A local router is the router that the attacker client is connected to. Local clients are the clients connected to a local router. The attacker can run attacks at two different severity levels: one detectable by the LW-DS detection engine, and one by the RE-DS detection engine.

Our novel approach for monitoring node selection is to monitor "wireless links" rather than "nodes", as existing solutions propose [9, 10]. Our approach helps detect attacks that affect the functionality of a communication link, e.g., the black hole attack. Consider a linear topology of four nodes, in order, ABCD, where each node is connected to its physically adjacent nodes. State-of-the-art solutions that monitor nodes may select nodes A and D as monitors (which cover all the nodes). However, this monitoring solution cannot cover the communication link between B and C. A black hole attack between nodes B and C will never be detected by monitors at A and D, unless there is a cooperation mechanism between them through another path. Therefore, we propose link coverage as a better approach to achieve a higher intrusion detection rate. Analysis and simulation results confirming our intuition will be provided in Section 6.1.
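The linear-topology argument can be checked mechanically. The short Python fragment below (illustrative only, not from the paper) lists which links monitors at A and D can observe, assuming a monitor sees exactly the links incident to it or running between two of its neighbors, and shows that the link B–C remains unobserved.

```python
# Linear topology A - B - C - D: each node is connected to its physical neighbors.
edges = {("A", "B"), ("B", "C"), ("C", "D")}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

def covered_links(v):
    """Links a monitor at v can observe: links incident to v, plus links
    between two of v's neighbors (both endpoints in v's radio range)."""
    cov = set()
    for (a, b) in edges:
        if v in (a, b) or (a in neighbors[v] and b in neighbors[v]):
            cov.add((a, b))
    return cov

monitors = {"A", "D"}           # a node-coverage choice: A and D "cover" all nodes
covered = set().union(*(covered_links(m) for m in monitors))
print(edges - covered)          # {('B', 'C')} -- the link B-C is not monitored
```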
4 Problem Formulation and its NP-Hardness
We model a WMN as a graph G = (V, E), in which V is the set of mesh nodes {v1, v2, ..., vn} and E = {e1, e2, ..., em} is the set of links between them. We denote the residual energy and the network load of a mesh node vi by bi and li, respectively. Let w : V → [0, 1] be a cost function that assigns a weight wi to a node vi based on li and bi (wi = w(li, bi)), such that higher normalized li and bi values result in a lower weight being assigned to vi.

Definition 1. The Covering Set Ci = {eij : j = 1..c}, Ci ⊆ E, for a monitoring node vi contains any edge eij where either eij is incident to vi or vi is connected to the two end points of eij (Figure 2).

Considering our link coverage (as opposed to node coverage) and the desired effect of selecting mesh routers with higher residual energy and higher network
Fig. 2. Examples of monitor nodes M and corresponding covering sets C
load as monitoring nodes, we define the Weighted Monitor Coverage (WMC) Problem as follows:

Definition 2. Weighted Monitor Coverage (WMC) Problem. Given G = (V, E) with a set of vertices V and a set of edges E, let wi be the weight of vi. Find the set of monitors M = {m1, m2, ..., mk} with the minimum cost Σ_{i∈M} wi, such that ∪_{i∈M} Ci = E, i.e., the monitors cover all edges in G, and bi ≥ bth, ∀i ∈ M, i.e., the residual energy of each monitor node exceeds a threshold bth.

We set bth based on a real battery profile; however, if it is not possible to cover all the links by monitor nodes with residual energy higher than the threshold, the threshold value is reduced by Δb until there exists a feasible solution. It is important to observe that our problem is different from the Maximum Coverage and 1-hop Dominating Set problems proposed in earlier research. Similarly, it may seem that our problem is the same as the Weighted Vertex Cover problem, since both problems aim to cover all the network links while minimizing the total weight assigned to the selected mesh nodes. It is key to observe that in the Vertex Cover problem, when we pick a vertex, the edges incident to the vertex are considered covered. In our problem, however, all edges in the communication range of the node are considered to be covered. As an illustration of these key observations, consider Figure 2, which depicts the covering sets and monitoring sets of different networks. As shown, only one node is enough to monitor all the edges of a 3-node network.

Theorem 1. WMC is NP-hard even for wi = 1.

Proof 1. First we assume that each node has a unit weight, so that the problem is to find the minimum number of nodes to cover all the edges, i.e., the Monitor Coverage (MC) problem. We show that even with this assumption MC is NP-hard, thus the same proof is valid for WMC. To prove this, we reduce Set-Cover to MC in polynomial time. Given a universe U = {x1, x2, ..., xn}, subsets Si ⊆ U, and a positive integer k, Set-Cover is to determine if ∃ a collection C of at most k such subsets such that the union of the k subsets covers all of U, i.e., ∃ C ⊆ {1, 2, ..., m} s.t. |C| ≤ k and ∪_{i∈C} Si = U. Given the instance of Set-Cover, we attempt to construct the instance of MC. We let E = U, and for each
vi ∈ V, define the subset Ci ⊆ E such that Ci = {e | e is within communication range of vi, e ∈ E}. Next we show that our construction is correct, i.e., we prove the claim, "Set-Cover has a valid instance if and only if MC has a valid instance." Suppose Set-Cover has a valid instance. By our construction, each Si corresponds to Ci. Since the number of subsets Si is at most k, we have at most k monitors. Furthermore, since ∪_{i=1,...,k} Si = U, and we defined E = U, the k monitors cover all the edges in G. Therefore, MC has a valid instance. Next suppose that MC has a valid instance. This implies that there exist at most k monitors in G. By our construction, each subset Ci of links covered by monitor mi corresponds to the subset Si, so the number of subsets Si is at most k. And since the monitors cover all edges in G, and E = U, it is trivial to see that ∪_{i=1,...,k} Si = U, thus proving the claim. This proof is also valid for the case in which weights are more than one unit.

One other problem to consider is how to optimally choose duty-cycle values for non-monitoring nodes, to extend WMN lifetime but also ensure WMN availability to clients. Obviously, the longer a mesh router sleeps, the higher the lifetime extension will be. WMN availability, however, limits the maximum time interval a mesh router can sleep. Therefore, the actual duty-cycle a non-monitoring mesh router will use trades off network availability for WMN lifetime. In this paper, we assign the duty-cycle value of a non-monitoring node inversely proportional to its network load. We leave the computation of an optimal duty-cycle value for a mesh router for future work.
5 Proposed Solutions
In this section, we present centralized and distributed solutions for our WMC problem. As centralized solutions, we propose a greedy algorithm and an integer linear programming (ILP) algorithm. These algorithms are executed on the WMN gateway (i.e., base station). The base station collects information from WMN nodes (i.e., connectivity, communicating load, and residual energy), executes the monitoring node selection algorithm (either greedy or ILP) and distributes the decisions back into the network. The distributed algorithm, however, is executed by individual nodes using 1-hop neighbor information. It is notable that these algorithms have different time complexities, message complexities, and approximation ratios.
5.1 Greedy Algorithm
We propose a greedy algorithm, shown in Algorithm 1. The algorithm selects monitor nodes based on the number of links per unit weight a node covers and based on the remaining energy level bi, which needs to be above a threshold bth. When a node vi is selected, all the links in Ci are covered. Hence, they are removed from the uncovered set E'. This selection is repeated until all the links become covered. The proposed algorithm runs in time polynomial in |E| and |V|. Similar to the Set Cover problem, the approximation ratio of our greedy algorithm is H(max_{i∈V} |Ci|), where H(n) = Σ_{j=1}^{n} (1/j) ≤ ln n + 1.
Algorithm 1. Greedy Monitor Coverage
1:  M = {}
2:  E' = E, V' = V
3:  while E' ≠ ∅ do
4:    if ({m} = max_{i∈V'} {|Ci ∩ E'| / wi}) ≠ ∅ then
5:      M = M ∪ m
6:      V' = V' − m
7:      E' = E' − Ci
8:    else
9:      bth = bth − Δb
10:   end if
11: end while
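A compact Python rendering of this greedy selection is sketched below. It is an illustrative sketch only: the data structures, tie-breaking and the toy input are our own choices, not the authors' implementation.

```python
def covering_set(v, edges, neighbors):
    """C_v: edges incident to v or whose two endpoints are both neighbors of v."""
    return {e for e in edges
            if v in e or (e[0] in neighbors[v] and e[1] in neighbors[v])}

def greedy_wmc(nodes, edges, neighbors, w, b, b_th, delta_b=0.05):
    """Greedy Weighted Monitor Coverage: repeatedly pick the eligible node
    covering the most uncovered links per unit weight; relax b_th if no
    eligible node remains while links are still uncovered."""
    monitors, uncovered, candidates = set(), set(edges), set(nodes)
    while uncovered:
        eligible = [v for v in candidates if b[v] >= b_th]
        scored = [(len(covering_set(v, uncovered, neighbors)) / w[v], v)
                  for v in eligible
                  if covering_set(v, uncovered, neighbors)]
        if scored:
            _, m = max(scored)
            monitors.add(m)
            candidates.discard(m)
            uncovered -= covering_set(m, edges, neighbors)
        else:
            b_th -= delta_b      # no feasible monitor: lower the energy threshold
    return monitors

# Tiny example: path A-B-C with equal weights; one monitor (B) suffices.
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
print(greedy_wmc(nodes, edges, neighbors,
                 w={v: 1.0 for v in nodes}, b={v: 0.8 for v in nodes}, b_th=0.6))
```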
5.2 Integer Linear Programming
The second solution we propose is based on Integer Linear Programming (ILP). Let P_j be the set of selected monitor nodes out of all possible nodes that can monitor link j. The proposed WMC can be formulated as follows:

minimize  Σ_{i∈V} w_i m_i                      (1)

subject to:
|P_j| ≥ 1,     ∀ j ∈ E                         (2)
b_i ≥ b_th,    ∀ m_i ∈ M                       (3)
m_i ∈ {0, 1}                                   (4)
where constraint (2) indicates that every link has to be covered, and constraint (3) forces the algorithm to select nodes with residual energy greater than a threshold. We reduce bth by Δb and run the ILP again if there is no feasible solution for the given bth. For the LP relaxation, we replace constraint (4) with mi ≥ 0, since its upper bound is redundant. As a result, several ILP solvers, with different time complexities, can be employed to solve our problem.
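As a sketch of how this LP relaxation could be solved off the shelf (an illustration using SciPy, not code from the paper; the mapping from links to candidate monitors is a hypothetical input), the objective Σ w_i m_i is minimized subject to one covering constraint per link.

```python
import numpy as np
from scipy.optimize import linprog

def lp_relaxed_wmc(weights, link_candidates):
    """LP relaxation of the WMC ILP: minimize sum(w_i * m_i) subject to
    sum_{i can monitor link j} m_i >= 1 for every link j, 0 <= m_i <= 1.
    weights: node weights w_i (nodes assumed pre-filtered by b_i >= b_th).
    link_candidates: for each link j, the list of node indices that can monitor it."""
    n = len(weights)
    # linprog handles "A_ub x <= b_ub", so encode coverage as -sum(m_i) <= -1.
    A_ub = np.zeros((len(link_candidates), n))
    for j, cands in enumerate(link_candidates):
        A_ub[j, cands] = -1.0
    b_ub = -np.ones(len(link_candidates))
    res = linprog(c=np.array(weights), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, 1)] * n, method="highs")
    return res.x  # fractional monitor indicators; round/threshold to obtain monitors

# Example: 3 nodes, 2 links; link 0 can be monitored by nodes {0,1}, link 1 by {1,2}.
print(lp_relaxed_wmc(weights=[1.0, 0.5, 1.0], link_candidates=[[0, 1], [1, 2]]))
```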
5.3 Distributed Algorithm
We propose a distributed algorithm, shown in Algorithm 2. In our protocol each node periodically broadcasts a HELLO message containing its residual energy, the network traffic it handles and the number of links it covers, and sets a local timer TBC. When TBC fires, every node builds an adjacency table AdjTbl using the collected HELLO packets. Then each node computes the weight per link for each neighbor and for itself. Based on this computed value, a node vi will broadcast an IS-MONITOR message to announce itself as a monitor, or it will set another timer TMon, waiting to receive an IS-MONITOR message from a neighbor. If node vi receives an IS-MONITOR message before TMon expires, it
Algorithm 2. Distributed Monitor Coverage
1:  Broadcast(HELLO)
2:  delay(T_BC)
3:  create(AdjTbl_i)
4:  if b_i ≥ b_th and w_i/|C_i| < w_j/|C_j| for all j ≠ i then
5:    m_i = 1
6:    Broadcast(IS-MONITOR)
7:  else
8:    delay(T_Mon)   // should receive IS-MONITOR
9:    if ({e_l} = uncover-link(i)) ≠ ∅ and b_i ≥ b_j, ∀ v_j that can cover e_l then
10:     Broadcast(IS-MONITOR)
11:   else
12:     duty-cycle(l_i)
13:   end if
14: end if
checks all its links to see whether the elected monitor(s) can monitor all of them. If there are still uncovered links, then vi will also broadcast IS-MONITOR to its neighbors, indicating that it will be a monitor. To avoid redundancy, the higher the weight (wi) of a node, the longer its timer TMon will be.
5.4 Solution Analysis
The proposed algorithms have different time complexities, message complexities, and approximation ratios. The Set Cover problem has a relatively high approximation ratio (i.e., O(ln |Ci|_max)); improving this ratio has not been addressed by prior research. Our greedy algorithm has the same approximation ratio as Set Cover, while the ILP solution is considered near optimal. The distributed algorithm, however, has a worse approximation ratio because its solution is only locally optimal. On the other hand, the time complexity of the distributed algorithm is O(|V|), which is smaller than that of the centralized algorithms; the greedy algorithm has a time complexity of order O(|V||E| min(|V|, |E|)), and the time complexity of the ILP algorithm depends on the solver. The message complexity of the distributed algorithm is also less than that of the centralized algorithms, since the distributed algorithm requires |V| + |M| network-wide packet exchanges, while the message complexity of the centralized algorithms is O(|V| log |V|). Considering the above analysis, we expect the centralized algorithms to produce a smaller set of monitors than the distributed algorithm. On the other hand, the distributed algorithm, with lower time and message complexities, produces a larger set of monitors with higher average weight. Therefore, we expect that the centralized algorithms will save more energy than the distributed one. The distributed algorithm, however, will select more monitoring nodes, improving the intrusion detection rate.
Fig. 3. (a) Average number of monitor and uncovered nodes for different K values in Max Coverage of 50-node network. (b) Link coverage percent.
6 Performance Evaluation
We implemented all three proposed algorithms in MATLAB. We consider networks ranging in size from 10 to 90 nodes, while maintaining the network density constant at 3 neighbors per radio range. The radio range is fixed at 50m. To compare with a state-of-the-art solution, we implemented a greedy Maximum Coverage algorithm (called "MAX Coverage" for the remainder of the paper). To fairly compare the results, we ran MAX Coverage for several upper bound values, and found the minimum k (maximum number of monitors) that guarantees 100% node coverage in a 50-node network. As depicted in Figure 3(a), roughly 35% of the nodes have to be selected for guaranteeing 100% coverage. We use this upper bound value in all our simulations.

First, we show that a solution for node coverage problems that guarantees full node coverage does not necessarily guarantee link coverage. Figure 3(b) shows that the number of uncovered links increases as the network size grows. In contrast, our solutions always guarantee full link coverage.

Next, we show how the different solutions produce an optimal set of monitors with maximum residual energy and high network load. For each network size, we ran simulations on 100 random networks. Figures 4(a) and 4(b) depict the average energy and communicating load of the selected nodes, i.e., (1/|M|) Σ_{i∈M} bi and (1/|M|) Σ_{i∈M} li, respectively, evidence that the proposed algorithms select monitors with higher values of remaining energy and communicating load. The average cost per monitor, (1/|M|) Σ_{i∈M} wi, is also presented in Figure 4(c). The results show that the distributed approach has the worst results, since its solution is locally optimal. On the other hand, the Max Coverage algorithm benefits from selecting monitors with lower link coverage, and therefore achieves better performance. Our centralized greedy WMC and Distributed WMC select the nodes with minimum weight per link first. Therefore, the nodes selected in the last iterations add more weight to the total weight of the solution, since we may have to select nodes that cover a single wireless link, simply because all the
Fig. 4. Given 100 random topologies for each network size, (a) Average residual battery charge of monitors (%). (b) Average communicating load of monitors (%). (c) Average cost per monitor. (d) Threshold reduction.
links must be covered. This constraint adds more weight to the total. In contrast, Max Coverage usually has a total weight lower than that of our solutions, at the price of less intrusion detection coverage.

Finally, we show how the different solutions impose different Δb values when selecting monitor nodes. Using the battery profiling data, we set a threshold of bth = 0.6 on the energy capacity of a node in order for it to be a monitoring node candidate. The reduction in the residual energy threshold value is considered as a penalty by our algorithm, since monitors with low residual energy will most likely die in a short time. The reduction in the residual energy threshold (i.e., Δb) is shown in Figure 4(d). As shown, our greedy WMC is penalized less than the Max Coverage solution, ensuring better coverage.
6.1 Security Analysis
As mentioned in Section 3, the attacker is considered to be a client, while the target could be either a client or a router (for both the local and multi-hop cases). Let Path_ij be the path between attacker vi and target vj. Also let E_i = {e_ij | e_ij connects node i to its client j} be the set of all local edges between a router and its clients.
Path {epi } {epi } ∪ {eiq } {epi } ∪ P athij {epi } ∪ P athij ∪ {ejq }
connects node i to its client j} be a set of all local edges between a router and its clients. Depending on the attacker-client pairs, the Table 1 summarizes the paths P athij for different links, where p is the attacker, vi is local router, vj is multi-hop router, and vq is the target. Since in our solutions ei is covered by at least one monitor, we ensure full coverage for any P athij . Consequently, in our solutions security attacks of any severity we consider, can be easily detected. The Max Coverage solution may leave some links uncovered, causing false negatives, as shown in Figure 3(b). On the other hand, monitor nodes detect high severity local attacks, while non-monitoring nodes detect only low severity attacks. Therefore, the attack severity and target location will determine if the attack may be detected. As mentioned in Section 3, the compromised router does not affect the detection scenario since local attacks (i.e., attacks against mesh router’s client) cannot be detected in any solution. To evaluate the intrusion detection rate of our solutions and compare them with Max Coverage solution, we simulated four different attack scenarios presented in Table 1. We ran simulations for 100 random locations for the attacker and the target, and for different sizes of the network. The results, as depicted in Figure 5(a), show that the detection rate of our solutions is always higher than 97% while the detection rate of Max Coverage decreases as network size increases (e.g., 92% for 70-node network). We can also see that detection rate of Distributed WMC (which produces a less optimal solution) is higher that the other solutions since the number of monitoring nodes (that run RE-DS) is larger than for the other centralized solutions (more nodes can detect local attacks of higher level severity from their clients).
6.2 Impact of Duty-Cycling on WMN Lifetime
We investigate the impact of duty-cycling on WMN lifetime through a system implementation on five Linksys WRT54GL routers. One router acts as an AC-powered gateway; the other mesh routers are battery powered. We assigned a fixed random network load to each router of 62%, 49%, 33% and 67% of the maximum network load a mesh router can handle. As depicted in Figure 5(b), we created a linear network topology to ensure that the centralized and distributed algorithms produce different sets of monitoring nodes. The centralized algorithms (Greedy WMC and ILP WMC) selected nodes 3 and 5 as monitoring nodes, while the distributed algorithm selected nodes 3, 4 and 5. We used 12V-3.4Ah Power Sonic rechargeable batteries to power the mesh routers. We observed that the centralized solution prolonged the network lifetime (defined as the time when the first battery dies) by 8%, while the distributed solution did not increase it. The explanation is that router 4, whose battery was the first to die, was not a monitor in the centralized solution, whereas in the distributed solution it was selected as a monitor.
7 State of the Art
WMN, as a popular networking solution with a variety of applications, are still a new research area for the security community and for those working on energy-aware algorithms for power-constrained networks. Some power-aware algorithms have been proposed for solar-powered WMN [6–8]. Reducing the load on the battery in order to give it recovery time was proposed in [6] and [8]. An on/off controller for battery recovery was proposed theoretically in [11]. The selection of monitoring nodes as an optimization problem has received some attention [9, 10, 12], where the first two papers optimize the channel assignment of monitoring nodes equipped with multi-channel radios, while the latter only addresses the coverage problem of distributed monitoring selection algorithms. While these authors use existing mesh routers for monitoring purposes, another set of related work (e.g., [9, 13]) considers deploying additional monitoring nodes. In that work, the objective is to deploy the minimum number of monitors that guarantees close to full coverage. From a coverage perspective, this problem is somewhat similar to the optimal gateway placement problem in mesh networks, where the objective is to maximize network capacity while providing fairness. In addition to solving the coverage problem, we also reduce the power consumption. We follow a security model [14] with different configurations (RE-DS, LW-DS) for the intrusion detection system as a tradeoff between detection rate and resource availability. Detecting some attacks [15] may require consolidating intrusion detection information obtained by different monitoring nodes.
8 Conclusions
Wireless network monitoring, specifically intrusion detection, can be difficult in battery-powered wireless mesh networks. Solutions that employ the typical
802.11 power saving mode or duty-cycling have been proposed for improving energy efficiency. This energy efficient mode of operation, however, impacts network intrusion detection functions, especially when considering the problem of selecting monitoring nodes. In this paper, we defined the selection of monitoring nodes as an optimization problem and proposed centralized and distributed solutions for it. We investigated how the communication load and the residual energy of a mesh router affect the router's capability to operate as a monitoring node. Through extensive simulations we demonstrated that our solutions preserve intrusion detection capabilities while prolonging the network lifetime.

Acknowledgement. This work was funded in part by NSF grant CNS 0923203 and by King Abdullah University of Science and Technology (KAUST) award KUS-C1-016-04.
References
1. Wang, X., Lim, A.O.: IEEE 802.11s wireless mesh networks: Framework and challenges. Ad Hoc Networks, pp. 970–984 (2008)
2. Eriksson, J., Agarwal, S., Bahl, P., Padhye, J.: Feasibility study of mesh networks for all-wireless offices. In: MobiSys (2006)
3. Camp, J., Knightly, E.: The IEEE 802.11s extended service set mesh networking standard. IEEE Communications Magazine, 120–126 (2008)
4. Amir, Y., Danilov, C., Musăloiu-Elefteri, R., Rivera, N.: The SMesh wireless mesh network. ACM Trans. Comput. Syst. (September 2008)
5. George, S., Zhou, W., Chenji, H., Won, M., Lee, Y.O., Pazarloglou, A., Stoleru, R., Barooah, P.: DistressNet: a wireless ad hoc and sensor network architecture for situation management in disaster response. IEEE Communications Magazine 48(3), 128–136 (2010)
6. Farbod, A., Todd, T.D.: Resource allocation and outage control for solar-powered WLAN mesh networks. IEEE Transactions on Mobile Computing (2007)
7. Badawy, G., Sayegh, A., Todd, T.: Energy provisioning in solar-powered wireless mesh networks. IEEE Transactions on Vehicular Technology (2010)
8. Ma, C., Zhang, Z., Yang, Y.: Battery-aware scheduling in wireless mesh networks. Mob. Netw. Appl. 13, 228–241 (2008)
9. Shin, D.-H., Bagchi, S.: Optimal monitoring in multi-channel multi-radio wireless mesh networks. In: MobiHoc (2009)
10. Subhadrabandhu, D., Sarkar, S., Anjum, F.: A framework for misuse detection in ad hoc networks - part I. IEEE Journal on Selected Areas in Communications, 274–289 (2006)
11. Rakhmatov, D., Vrudhula, S.: Energy management for battery-powered embedded systems. ACM Trans. Embed. Comput. Syst., pp. 277–324 (August 2003)
12. Chhetri, A., Nguyen, H., Scalosub, G., Zheng, R.: On quality of monitoring for multi-channel wireless infrastructure networks. In: MobiHoc (2010)
13. Li, F., Wang, Y., Li, X.-Y., Nusairat, A., Wu, Y.: Gateway placement for throughput optimization in wireless mesh networks. Mob. Netw. Appl. (2008)
14. Hugelshofer, F., Smith, P., Hutchison, D., Race, N.J.: OpenLIDS: a lightweight intrusion detection system for wireless mesh networks. In: MobiCom (2009)
15. Cheng, Y.-C., Bellardo, J., Benkő, P., Snoeren, A.C., Voelker, G.M., Savage, S.: Jigsaw: solving the puzzle of enterprise 802.11 analysis. SIGCOMM Comput. Commun. Rev. 36, 39–50 (2006)
Using Battery Level as Metric for Graph Planarization

Jovan Radak, Nathalie Mitton, and David Simplot-Ryl

INRIA Lille - Nord Europe, Univ Lille Nord de France, USTL, CNRS UMR 8022, LIFL, France
{firstname.lastname}@inria.fr

Abstract. Topology control in wireless sensor networks is an important issue for scalability and energy efficiency. It is often based on graph reduction performed through the use of the Gabriel Graph or the Relative Neighborhood Graph. This graph reduction is usually based on geometric values. In this paper we tackle the problem of possible connectivity loss in the reduced graph by applying a battery-level-based graph reduction. Experiments are conducted to evaluate our proposition. Results are compared with the RNG reduction which takes into account only the strength of the received signal (RSSI). Results show that our algorithm maintains network connectivity longer than solutions from the literature and balances the energy consumption over nodes.

Keywords: RNG, topology control, wireless sensor networks, battery level.
1 Introduction
Wireless sensor networks have attracted considerable scientific interest over the past few years. They are intended to be deployed in hostile environments (battlefield, forest, etc.). Therefore, it is expected that a large number of cheap, simple sensor devices will be randomly scattered over the region of interest. These devices are powered by batteries and have limited processing and memory capabilities. Among the numerous challenges faced while designing WSN protocols, maintaining connectivity and maximizing the network lifetime stand out as critical considerations. Important results have been obtained aiming at generic solutions that can be easily implemented on the variety of platforms used for building wireless sensor networks. Topology control via per-node transmission power adjustment has been shown to be effective in extending network lifetime and increasing network capacity (due to better spatial reuse of spectrum). Generally, nodes compute a graph reduction on their local view of the network, such as the Gabriel Graph (GG) or the Relative Neighborhood Graph (RNG) [14]. In the RNG construction, in every triangle uvw, the link with the worst value is logically removed from the outcome graph. This graph reduction allows a node to consider only a subset of its neighborhood while keeping connectivity. Then, a node adjusts its range such that the furthest node in the graph reduction can still be reached. The most common way to apply graph reduction is to use the Euclidean distance as the weight function. The relationship between the received signal strength
(RSSI) of the sensor nodes and the distance between sensors allows the RSSI to be used as an indicator of this distance: the distance between the nodes and the RSSI are inversely proportional. In addition, the RSSI gives an indication of the quality of the links. The underlying idea is that longer edges are the most expensive ones, so removing them improves energy efficiency. Nevertheless, these methods do not consider the energy level of nodes, and if a node exhausts its battery and fails, it may disconnect the network.

In this paper, we introduce a novel algorithm for topology control which takes into account the battery level of each sensor. The battery level is considered normal, or critical if it is under a threshold. We give an appropriate weight to each link in the graph according to the battery levels of its end nodes. We introduce a metric called the power factor which divides links into three sets:
– reliable links – those with a normal battery level on both ends (power factor = 0),
– suspicious links – those with a critical battery level on one side (power factor = 1), and
– bad links – those with a critical battery level on both ends (power factor = 2).
By using this metric as the main weight in the graph reduction, reliable links are preferred to suspicious and bad links, and suspicious links are preferred to bad links. As a result, weak nodes have a limited role in the graph connectivity and have no impact on the network connectivity when they fail. The graph reduction is recomputed periodically since battery levels evolve over time. This allows the whole network lifetime to be extended. To validate our scheme, we run experiments on real nodes. Experimental results show that our algorithm maintains network connectivity longer than solutions from the literature, balancing connectivity over nodes when needed.

The remainder of this paper is organized as follows. Related work is presented in Section 3. Section 2 presents our model and assumptions. Our main contribution is introduced in Section 4 and evaluated in Section 5. Finally, we conclude and discuss future work in Section 6.
2 Preliminaries

2.1 Model
We consider a sensor network where nodes are randomly scattered, aware of their geographical location and able to tune their transmission range between 0 and R (R > 0). The network is modeled as a graph G = (V, E) where V is the set of sensors and E is the set of edges. uv ∈ E if and only if there exists a radio link between sensors u and v, i.e. they are in communication range of each other. We denote by N (u) the set of physical neighbors of node u, i.e. the set of nodes v such that uv ∈ E. Let δ(u) = |N (u)| be the cardinality of N (u), also called the degree of node u. We also define NRN G (u) ⊂ N (u) the set of RNG neighbors of node u. Every node is aware of its battery level, denoted as BL(u) for node u. We consider that every node u has a unique identifier. We denote the identifier of node u as ID(u).
2.2 Assumptions
We assume that a node u is aware of every edge within its neighborhood, i.e., node u knows every edge vw such that v ≠ w and such that v ∈ {u ∪ N(u)} ∧ w ∈ N(u). This can be achieved in two ways: either nodes are aware of their positions and of those of their neighbors (nodes broadcast their position in Hello messages), or nodes are aware of their 2-neighborhood (nodes broadcast their neighbor list in Hello messages). If nodes are aware of their positions, they can easily compute the edge lengths. If they are not, we assume that node u can estimate the distance between itself and its neighbor v by using the RSSI. The use of the RSSI has an additional feature: even if it is not always inversely proportional to the distance, it gives an indication of the link quality. Thus, if a link is short but has a low RSSI, this means that its quality is poor and equivalent to that of a very distant node.
3 Related Work
Graph reductions are often used for topology control in wireless sensor networks. Almost every topology control scheme is built around some graph reduction strategy, with the aim of keeping certain parts and properties of the graph while optimizing the communication between nodes and the energy used. Indeed, the idea is to perform a graph reduction and then adjust the transmission power of every node in such a way that it can reach its furthest neighbor in the resulting graph. Some graph reductions are based on the geometric properties of the graph and propose topology control using certain geometric structures like minimum spanning trees [9], Delaunay triangulation [3] or the Gabriel graph, and some propose the use of cluster-based algorithms, like connected dominating sets. A major drawback of techniques such as minimum spanning trees and Delaunay triangulation is the lack of localized properties of the algorithm.

The relative neighborhood graph was first presented in [14]. The principle is as follows: in every triangle, the link with the greatest weight is removed. RNG is widely used, and in most uses the weight considered is the distance [1, 12], so the longest edge in each triangle is removed. RNG has the great advantage that it can be computed locally (every node only needs to know its neighbors and the links between them), and it is not limited to the use of geometric properties of the graph. Other variants use the RSSI [4] or the expected transmission count (ETX) [8, 15] as a metric. While it can be used as a notion of distance, RSSI still cannot be considered precise and reliable enough to be used alone. RSSI and ETX are mainly used as quantitative values of link quality. RSSI as a possible metric for RNG graph reduction is much discussed: while some authors state that it is underappreciated [11], others advocate against it, citing its unreliability as the major problem. In [5], the authors apply a primary filter before performing an RSSI-based RNG. Indeed, a node first observes its neighborhood and removes every neighbor for which the RSSI is too low or not stable enough. Topology control is built using the RSSI as the weight function. The algorithm runs in three steps: (i) neighborhood discovery – in which each node sends Hello messages in order to find out its neighbors
and the quality of the connection with them (the value of the RSSI, which is implicitly sent as part of the Hello message), (ii) filtering possible neighbors and removing the ones with an RSSI under a given threshold level – in this way unreliable links are removed before applying topology control, and (iii) RNG reduction – the RNG algorithm is applied on all nodes; it is run in a distributed way, meaning that each node u runs the RNG algorithm on its neighborhood, removing node v ∈ N(u) from the list of its RNG neighbors if there exists a node w such that:

∃ w ∈ {N(u) ∩ N(v)} | ((RSSI(u, v) < RSSI(u, w)) ∧ (RSSI(u, v) < RSSI(v, w)))

Energy consumption and energy conservation are also among the main interests in the area of wireless sensor networks [2]. A great number of the proposed algorithms are built to be energy efficient. Using different types of batteries with different capacities has an impact on the lifetime of the wireless sensor network, as presented in [6], [7] and [10]. In these papers the authors analyze the problems of different battery levels from the point of view of extending the wireless sensor network lifetime. However, these papers do not consider dynamic reconfiguration of the network when some of the nodes reach certain battery levels, but rather only the problem that arises when one of the nodes disconnects due to the loss of power.

In this paper, we propose to apply a battery-level-based RNG in conjunction with the RSSI value in order to take advantage of works from the literature. We consider dynamic reconfiguration of the graph as the battery levels of nodes evolve in time. The use of specific hardware allows us to have information about the critical battery levels of the sensor nodes. These two pieces of information, along with the unique ID of each sensor node (also a hardware-specific property), form the metric which is then used for topology control using the relative neighborhood graph.
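A minimal Python sketch of the RSSI-based RNG reduction of step (iii) is given below; it is an illustration under the assumption of a symmetric RSSI table, not the authors' implementation.

```python
def rng_neighbors(u, neighbors, rssi):
    """RNG reduction at node u with RSSI as the weight: neighbor v is dropped
    if some common neighbor w offers a stronger link to both u and v
    (i.e. RSSI(u,v) < RSSI(u,w) and RSSI(u,v) < RSSI(v,w))."""
    kept = set()
    for v in neighbors[u]:
        dominated = any(
            rssi[(u, v)] < rssi[(u, w)] and rssi[(u, v)] < rssi[(v, w)]
            for w in neighbors[u] & neighbors[v]
        )
        if not dominated:
            kept.add(v)
    return kept

# Toy triangle u-v-w where the u-v link is the weakest (lowest RSSI).
neighbors = {"u": {"v", "w"}, "v": {"u", "w"}, "w": {"u", "v"}}
rssi = {}
for a, b, val in [("u", "v", -80), ("u", "w", -60), ("v", "w", -55)]:
    rssi[(a, b)] = rssi[(b, a)] = val   # symmetric link quality assumed
print(rng_neighbors("u", neighbors, rssi))   # {'w'} -- the weak u-v link is removed
```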
4   Battery-Level Based RNG
In this section, we introduce our topology control algorithm. It is a relative neighborhood graph in which the metric takes the battery level of the nodes into account. The main idea is that nodes that are running out of energy should be given less importance and should appear as leaves in the resulting graph. In this way, when their battery is exhausted, the network is not disconnected and we avoid costly graph re-computation. Nevertheless, the battery level cannot be used directly by the algorithm, since a battery level is attached to a node while the RNG uses metrics attached to links. The value assigned to a link uv between nodes u and v must be the same from the point of view of u and from the point of view of v, to avoid improper link removals and network disconnections. Therefore, we first assign a value to every edge based on the battery levels of its end nodes.

4.1   Making the Connection from the Voltage Level to the RNG Weight Function
In order to make a connection between the battery levels of the two sensor nodes on each link, we introduce a value called the power factor. The power factor of a link kl, noted
Fig. 1. Example of PF computation. Blue nodes have a high battery level; red nodes have a low battery level. Every link is labelled with the tuple (PF, distance(RSSI), |Id1 − Id2|, |Id1 + Id2|).
PF(k, l), is determined using the voltage levels of the two nodes on the link and is such that PF ∈ {0, 1, 2}. We consider a battery level threshold τ: if the battery level of a sensor node is lower than this threshold, the node is considered a critical node. The power factor is assigned to every edge by Algorithm 1, which takes as input the battery level threshold τ and the battery level of every node. According to these values, Algorithm 1 divides the sensor nodes into two sets: (i) normal battery state nodes, with a battery level higher than τ, and (ii) critical battery state nodes, with a battery level lower than τ. Then, depending on the battery state of the two sensor nodes on a link, the power factor is assigned as follows:
– power factor 0 – if both nodes are in the normal battery state,
– power factor 1 – if exactly one of the nodes is in the critical battery state,
– power factor 2 – if both nodes are in the critical battery state.
Algorithm 1 is run on node u. It assumes that node u is aware of the battery level of each of its neighbors and knows whether there exists a link between any two of them. This information can be obtained through Hello messages in which every node piggybacks its battery level and, if available, its position, otherwise its neighborhood table. Based on this, node u is able to identify every triangle within its neighborhood and to assign a power factor value to each of these links. To illustrate the power factor computation, consider Figure 1. Nodes that appear in blue have a battery level higher than τ, while red nodes have a battery level lower than τ. In this figure, link ad gets a power factor equal to 0 since both nodes a and d have a high battery level. Link gh gets a power factor of 2 since it connects two nodes with a low battery level. Link bc connects two nodes with different battery states and thus gets the power factor value 1.
Algorithm 1. Calculate Power Factor
1: for each k, l ∈ N(n) | k ≠ l do
2:   if (BL(k) > τ) ∧ (BL(l) > τ) then
3:     PF(k, l) ← 0
4:   else if ((BL(k) > τ) ∧ (BL(l) ≤ τ)) ∨ ((BL(k) ≤ τ) ∧ (BL(l) > τ)) then
5:     PF(k, l) ← 1
6:   else
7:     PF(k, l) ← 2
8:   end if
9: end for
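As an illustration only, Algorithm 1 can be written as the following sketch, where bl is the table of battery levels learned from Hello messages; running it over the node's closed neighborhood (so that its own links are weighted as well) is an assumption of the sketch, not stated in the algorithm.

    def calculate_power_factor(nodes, bl, tau):
        # Assign a PF in {0, 1, 2} to every link among `nodes`,
        # counting how many of its endpoints are in the critical state.
        pf = {}
        for k in nodes:
            for l in nodes:
                if k != l:
                    pf[(k, l)] = (bl[k] <= tau) + (bl[l] <= tau)
        return pf

    # e.g. pf = calculate_power_factor(neighbors[n] | {n}, bl, tau)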
4.2   Algorithm for Graph Planarization Using Power Factor
Once the power factor has been computed on every link, the RNG computation can be performed. Nevertheless, since the power factor only takes three different values, several edges of a triangle may hold the same power factor. We therefore need a secondary metric that lets the nodes choose between the edges in such a way that every node of the triangle takes the same removal decision, so as to avoid network disconnections. To do so, we apply the traditional RNG metric, i.e. the Euclidean distance between nodes, and then, to break any remaining ties, the node identifiers. To ensure that every node computes the same value for a link, we first use the difference between the identifiers of the two end nodes. However, if a node with ID 2 is connected to nodes with IDs 1 and 3, the two ID differences are equal; in such a case we additionally consider the sum of the IDs. Using both the difference and the sum of the node IDs, every node computes the same values for a link, and two links of a triangle cannot hold the same values, since node IDs are unique and two different links of the same triangle cannot yield the same result for both the sum and the difference. Figure 2 shows the values considered for every link. Every link holds a tuple (PF, d, ID−, ID+), where PF is the power factor, d is the Euclidean length of the edge, ID− is the difference between the IDs of the end nodes and ID+ is their sum. As ID we use the rank of the node's letter in the alphabet, e.g. ID(a) = 1, ID(b) = 2, and so on. We define ≺ as a binary total order such that uv ≺ uw if and only if
– PF(uv) < PF(uw), or
– PF(uv) = PF(uw) ∧ d(u, v) < d(u, w), or
– PF(uv) = PF(uw) ∧ d(u, v) = d(u, w) ∧ |ID(u) − ID(v)| < |ID(u) − ID(w)|, or
– PF(uv) = PF(uw) ∧ d(u, v) = d(u, w) ∧ |ID(u) − ID(v)| = |ID(u) − ID(w)| ∧ ID(u) + ID(v) < ID(u) + ID(w).
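For illustration, the order ≺ can be encoded as a lexicographic sort key; the helper tables pf, dist and node_id are assumptions of this sketch, not names from the paper. Both endpoints of a link compute the same tuple, so they necessarily take the same removal decision.

    def edge_key(u, v, pf, dist, node_id):
        # Smaller keys are 'better' links under the order; the largest key
        # of a triangle identifies the edge to remove.
        return (pf[(u, v)],
                dist(u, v),
                abs(node_id[u] - node_id[v]),
                node_id[u] + node_id[v])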
The edge removal then runs as described by Algorithm 2. We assume that node u is aware of the length of every edge within its neighborhood, either because nodes know their positions and broadcast them in Hello messages to their neighbors, or because they are able to estimate lengths, based on the RSSI for instance. Every node considers every triangle within its neighborhood (such as triangles fig,
Fig. 2. Example of value computation. Blue nodes have a high battery level; red nodes have a low battery level.
igh or efg on Fig. 2) and determines which edge to logically remove from the RNG. To do so, it first compares the PF values of the edges. If one edge has a strictly higher PF value than the other two, it is removed (e.g. link gh is removed in triangle igh on Figure 2). If two edges share the highest PF value, or all three edges have the same PF value, the longer one is removed. For instance, in triangle fig on Figure 2, the longest edge is fi and it would be removed by the traditional RNG algorithm; but fi has a PF value of 0 and is therefore kept. Instead, we remove edge ig, which has the same PF value as edge fg but is longer.

Algorithm 2. Calculate RNG
1: NRNG(u) ← N(u)
2: CalculatePowerFactor(u) {Node u computes the PF of every link within its neighborhood.}
3: for each v, w ∈ N(u) do
4:   if uv ≺ vw ≺ uw or vw ≺ uv ≺ uw then
5:     NRNG(u) ← NRNG(u) \ {w} {Link uw is removed from the RNG.}
6:   else
7:     if uw ≺ vw ≺ uv or vw ≺ uw ≺ uv then
8:       NRNG(u) ← NRNG(u) \ {v} {Link uv is removed from the RNG.}
9:     else
10:      {Link vw will be removed from the RNG.}
11:    end if
12:  end if
13: end for
14: Return NRNG(u)
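A compact sketch of Algorithm 2, reusing edge_key from the previous sketch, is shown below. The neighbors map (each node to the set of its 1-hop neighbors learned from Hello messages) is an assumption of the example.

    def battery_level_rng(u, neighbors, pf, dist, node_id):
        key = lambda a, b: edge_key(a, b, pf, dist, node_id)
        rng = set(neighbors[u])
        for v in neighbors[u]:
            # consider every triangle (u, v, w) within u's neighborhood
            for w in neighbors[u] & neighbors[v]:
                if w in (u, v):
                    continue
                # remove v from N_RNG(u) if uv is the largest edge of the triangle
                if key(u, v) > key(u, w) and key(u, v) > key(v, w):
                    rng.discard(v)
                    break
        return rng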
If edges hold the same PF value and have the same length, ties are broken by considering the difference between the identifiers of the two end nodes of each edge: the edge with the largest ID difference is removed. This is for instance
the case in triangle efg on Fig. 2, where edges eg and fg hold the same PF value and have the same length. Their IDs are used to differentiate them: since |ID(f) − ID(g)| < |ID(e) − ID(g)|, link eg is removed. The execution of this algorithm is distributed in the sense that each node computes its own set of RNG neighbors, according to the given condition and knowing only its neighborhood. As a result, we obtain a connected graph in which the weaker edges have been removed and in which the disappearance of weak nodes has minimal impact on the graph connectivity. Figure 3 compares the graphs obtained after topology control when applying the distance-based RNG (Fig. 3(a)) and our battery-level based RNG (Fig. 3(b)). As we can see, in our approach the critical nodes appear either as leaves of the graph (nodes c, g, h, j) or on a redundant path (node c); hence, if one of these nodes fails, the network is not impacted. On the contrary, the traditional RNG has no notion of battery level and critical nodes belong to principal paths: if node b, c or g fails, the network is disconnected.
Fig. 3. Topologies after topology control: (a) distance-based RNG; (b) battery-level based RNG. Blue nodes have a high battery level; red nodes have a low battery level.
5   Implementation and Results
In order to validate our proposal and highlight its valuable features, we ran experiments on the Lille site of the SensLAB platform (http://www.senslab.info/). It is worth noting that the performance of our solution strongly depends on the hardware architecture of the sensor nodes used for the experiments, more precisely on the microcontroller used in this architecture. The platform is composed of WSN430 sensor nodes (see Fig. 4(a)). On the Lille platform, these sensors are equipped with an MSP430F1611 16-bit RISC microcontroller, an ultra-low-power multichannel CC2420 RF transceiver, a DS2411 unique identifier chip and an M25P80 serial flash memory.
Fig. 4. (a) WSN430 node; (b) SensLAB platform. INRIA / Photo N. Fagot.
5.1   Experimental Setup
We ran our experiment on the Lille SensLAB platform, selecting via the SensLAB interface a 5 × 6-node grid spread as depicted in Figure 5. In the Lille SensLAB platform, nodes are placed on a grid at a distance of d = 60 cm from each other. We chose a subset of nodes on the grid, leaving some nodes out, so that we could observe what happens with physically longer links and how the algorithm handles them. The MSP430F1611 microcontroller, like the whole MSP430x series, is equipped with circuitry called the Supply Voltage Supervisor (SVS). This part of the microcontroller monitors the voltage level of its power supply and notifies the user when the supply drops below a fixed voltage level. The SVS can detect 16 discrete voltage levels ranging from 3.7 V down to 1.9 V [13]. We use the critical voltage level τ = 3.7 V, which corresponds to the value 1110 in the SVS register; i.e. we consider that a node has reached a low battery level as soon as the microcontroller signals that the critical voltage has been reached. Experiments are run for 12 hours, during which all sensors are loaded with the same program, which runs RNG Algorithm 2, computes the RNG neighbors in a distributed way, and records statistics: neighbor candidates, RNG neighbors and the parameters of each link (power factor, RSSI value, IDs). For the implementation of
Fig. 5. Placement of the nodes on which the experiment is run. In SensLAB, d = 60 cm.
our algorithm, we use the CSMA/CA MAC layer implementation provided by the SensTools project (http://senstools.gforge.inria.fr/) and the FreeRTOS port for MSP430 microcontrollers. Our solution is compared to the approach of [5] from the literature. As detailed in Section 3, in [5] an RNG is built from the RSSI of the links after a first filter on the neighborhood. Our algorithm also uses the RSSI as a metric that estimates both distance and link quality, as discussed in Section 2.2, but only as a secondary weight; the power factor is the primary metric. We do not apply the filter used in [5], since we assume that bad links are automatically removed by the RNG computation unless their removal would disconnect the network.

5.2   Experimental Results
Figure 6 shows the edges connecting node 46 to its neighbors after the initial exchange of HELLO messages; for the sake of visibility, only these links are represented. Figure 7 shows the topology after topology control performed with our solution (Fig. 7(a)) and with the RSSI-based RNG with filters of [5] (Fig. 7(b)). In these figures, black edges are RNG edges of node 46, and yellow edges are RNG edges of the other nodes, ensuring connectivity between node 46 and the rest of the nodes in its neighborhood. At this stage, all nodes have a high battery level and thus both graphs are equivalent. We then run a test program on
Fig. 6. Initial topology of the graph.
49
48
47
-69
-72 46
50
59
58
57
56
65
64
63
62
61
70
69
68
67
85
84
83
95
94
93
48
47
-69
-68
60
49
-72 46
-68
60
59
58
57
56
65
64
63
62
61
66
70
69
68
67
66
82
81
85
84
83
82
81
92
91
95
94
93
92
91
(a) Battery-level based RNG
-68
(b) RSSI-based RNG + filters
Fig. 7. Topology after topology control
-68
Fig. 8. Topology after topology control once some nodes have exhausted their batteries: (a) battery-level based RNG; (b) RSSI-based RNG + filters.
the nodes that makes them exhaust their batteries; this program simply makes the nodes exchange messages so as to discharge the batteries faster and speed up the experiment. Figure 8 shows the topology after some time, when some of the nodes have drained their batteries down to the critical state. We can see that with our solution (Fig. 8(a)) the network is still connected and that weak nodes appear as leaves of the reduced graph. We can also see that the network dynamically reorganized itself: the path 46–47–48, which allows node 46 to reach node 48, has been replaced by the path 46–59–49–48 when the battery of node 48 dropped below the critical level. Furthermore, if we consider a complete discharge of the battery of node 48, the RSSI-based RNG loses connectivity between nodes 46 and 49, while the battery-level based RNG preserves connectivity between these two nodes.
6   Conclusion
In this paper, we have introduced a new weight for applying the RNG and topology control in wireless sensor networks. This weight is primarily based on the battery level and is used in such a way that links between and towards weak nodes are used only if there is no other path. Our algorithm extends the network lifetime and has been validated through experiments. As future work, we intend to evaluate our solution more deeply, measuring the dynamics of the network more precisely and on different hardware in order to assess the impact of
the latter. We also wish to investigate the impact of the number of power factor values: nodes are currently classified into only two categories (high or low battery level); what if an additional level is introduced? Finally, we intend to evaluate our algorithm in mobile environments.
Acknowledgements This work was partially supported by CPER Nord-Pas-de-Calais/FEDER Campus Intelligence Ambiante and the ANR BinThatThinks project.
References 1. Cartigny, J., Ingelrest, F., Simplot-Ryl, D., Stojmenovic, I.: Localized lmst and rng based minimum-energy broadcast protocols in ad hoc networks. Ad Hoc Networks, 1–16 (2005) 2. Ephremides, A.: Energy concerns in wireless networks. IEEE Wireless Communications 9(4), 48–59 (2002) 3. Hu, L.: Topology control for multihop packet radio networks. IEEE Transactions on Communications 41(10), 1474–1481 (2002) 4. Khadar, F., Simplot-Ryl, D.: Connectivity and topology control in wireless ad hoc networks with realistic physical layer. In: Third International Conference on Wireless and Mobile Communications (ICWMC 2007), p. 49 (March 2007) 5. Khadar, F., Simplot-Ryl, D.: From theory to practice: topology control in wireless sensor networks. In: MobiHoc 2009, pp. 347–348 (2009) 6. Long, H., Liu, Y., Wang, Y., Dick, R.P., Yang, H.: Battery allocation for wireless sensor network lifetime maximization under cost constraints. In: Proceedings of the 2009 International Conference on Computer-Aided Design, ICCAD 2009, pp. 705–712. ACM, New York (2009) 7. Sichitiu, M.L., Dutta, R.: Benefits of multiple battery levels for the lifetime of large wireless sensor networks. In: Boutaba, R., Almeroth, K., Puigjaner, R., Shen, S., Black, J.P. (eds.) NETWORKING 2005. LNCS, vol. 3462, pp. 1440–1444. Springer, Heidelberg (2005) 8. Lukic, M., Pavkovic, B., Mitton, N., Stojmenovic, I.: Greedy geographic routing algorithms in real environment. In: MSN, pp. 86–93 (2009) 9. Ramanathan, R., Rosales-Hain, R.: Topology control of multihop wireless networks using transmit power adjustment. In: Proceedings of IEEE Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM 2000, vol. 2, pp. 404–413 (2000) 10. Yuan, D., Zhang, R., Jia, Z.: Analysis of lifetime of large wireless sensor networks based on multiple battery levels. Int’l J. of Communications, Network and System Sciences 1(2), 136–143 (2008) 11. Srinivasan, K., Levis, P.: Rssi is under appreciated. In: Proceedings of the Third Workshop on Embedded Networked Sensors (EmNets) (2006) 12. Supowit, K.J.: The relative neighborhood graph, with an application to minimum spanning trees. J. ACM 30, 428–448 (1983)
13. Texas Instruments. MSP430x1xx Family - User’s Guide, mixed signal products edition (2006) 14. Toussaint, G.T.: The relative neighbourhood graph of a finite planar set. Pattern Recognition 12(4), 261–268 (1980) 15. Wattenhofer, R., Zollinger, A.: XTC: a practical topology control algorithm for ad hoc networks. In: Proc. 4th International Workshop on Algorithms for Wireless, Mobile, Ad Hoc and Sensor Networks (WMAN), Santa Fe, MN, USA (2004)
Empirical Approach to Network Sizing for Connectivity in Wireless Sensor Networks with Realistic Radio Propagation Models

Pedro Wightman1, Miguel Jimeno1, Daladier Jabba1, Miguel Labrador3, Mayra Zurbarán1, César Córdoba2, and Armando Guerrero2

1 Department of Systems Engineering, 2 Department of Electrical and Electronics Engineering, Universidad del Norte, Km. 5 Vía a Puerto Colombia, Barranquilla, Colombia
{pwightman,djabba,majimeno}@uninorte.edu.co, [email protected], [email protected], [email protected]
3 Department of Computer Science, University of South Florida, 4202 E. Fowler Ave. ENB 118, Tampa, FL, USA 33620
[email protected]
Abstract. Choosing the appropriate network size to guarantee connectivity in a WSN deployment is a challenging and important question. Classic techniques to answer this question are not up to the challenge because they rarely consider realistic radio models. This work proposes a methodology to evaluate the performance of network size estimation techniques in terms of connectivity efficiency under realistic radio scenarios. This study is carried out using Atarraya, a simulation tool for wireless sensor networks, considering three classical estimation techniques and a radio model based on the specifications of the ZigBee radio from off-the-shelf WaspMote nodes from Libelium. The results show that the hexagon-based optimal grid technique provides the most efficient estimate, offering a high connectivity level with the lowest estimated number of nodes for a given proximity radius parameter, followed by the circle packing and the triangle-based grid distribution. In addition, the results show that packet error rates of 10% could still produce highly connected topologies. Keywords: Atarraya, ZigBee, Critical transmission range, Circle packing problem, Lattice-based deployments.
1 Introduction

Wireless sensor networks (WSNs) are a technology that allows fast, cheap and remote monitoring of environmental variables, or of the occurrence of events, in distant, inaccessible or hostile places where human presence could be hazardous. A WSN can be defined as a large set of small devices with communication, sensing and processing capabilities, characterized by the fact that, just after being deployed, they can organize themselves in order to create a communication infrastructure to forward
all data gathered by the network to a control center far away from the deployment area, where the data will be stored and analyzed. One of the most important questions to be addressed before the actual network deployment is the size of the network: how many nodes are necessary in order to build a connected network? Deploying too few nodes will certainly leave uncovered areas whose information will never be acquired by the network. On the other hand, deploying too many nodes may result in a high-density network which, if all nodes are active, may suffer from communication problems such as interference and excessive message collisions due to competition for the channel, with the corresponding impact on battery life and on the network lifetime as a whole. The problem of determining the best network size in terms of number of nodes has been addressed before. The most commonly used solutions are based on finding the critical transmission range (CTR), the critical node density (CND), or an optimal grid-based deployment. However, most of these solutions are purely theoretical formulations and rarely consider real signal propagation models. This paper presents an empirical methodology for evaluating network size estimation techniques, based on three estimators that use existing deployment sizing methods and on the connectivity curves obtained from the characterization of the radio propagation model of the XBee-ZB WaspMote® nodes from Libelium®, under the free-space model. The paper is organized as follows: Section 2 summarizes related work. Section 3 explains the radio model of the WaspMote and shows the corresponding probability of successful transmission. Section 4 describes the proposed methodology for the experimental evaluation of the network size estimation techniques. Section 5 shows the connectivity curves and the connectivity efficiency of the estimators. Finally, Section 6 presents the conclusions and future work.
2 Related Work

Finding the network configuration that ensures connectivity has been an important problem in the area of wireless sensor networks. Most works start from an existing topology and solve the problem by modifying the transmission range of the nodes until the desired level of connectivity is achieved. One approach is to determine the minimal transmission range, common to all nodes, that produces a connected topology; this problem is widely known as the Critical Transmission Range (CTR) of the network. Another approach is to find the minimal transmission range for each individual node, which is known as the Range Assignment (RA) problem. Both techniques aim to reduce energy consumption and therefore extend the network lifetime. One existing approach to calculate the CTR is presented in [1], where the author proposes a general formula for dense networks based on finding the expected longest edge of a minimal spanning tree, which provides at least a connected tree covering the network. Eq. 1 shows the equation used to find the CTR in this work [2].
CTR(n) = \sqrt{\frac{\ln n + \ln\ln n}{n\pi}}.    (1)
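As a quick illustration (not code from the paper), Eq. 1 can be evaluated and scaled by the side length L of the deployment area to express the CTR in meters; the function name and the use of natural logarithms are assumptions of this sketch.

    import math

    def ctr(n, area_side=1.0):
        # Eq. 1: critical transmission range for n nodes in a unit square,
        # scaled by the side length of the deployment area.
        return area_side * math.sqrt((math.log(n) + math.log(math.log(n))) / (n * math.pi))

    # e.g. ctr(40, 400) is about 80 m and ctr(56, 400) is about 70 m.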
However, in [2] and [3], it has been shown that the transmission range estimated by the CTR does not provide complete connectivity. In [4], the authors propose a formula to find a CTR for sparse networks in which, based on the area side, they calculate a network size and a correspondent transmission range. Even though this CTR formula guarantees connectivity in different network scenarios, the main drawbacks of this formula are that 1) the transmission range may go way over the maximum range that current radio technologies may offer, and 2) for a given area side, the network size is usually a small number that could not produce a connected topology with the existing radio technologies. A large number of techniques have been proposed to solve the RA problem, including centralized and distributed solutions. Some of the most important ones are described in [2], [3] and [5]. A significant disadvantage of most of the solutions to the RA is that they do not provide good guidance about the best number of nodes to use in the deployment. RA solutions determine the appropriate transmission range for a given number of nodes, but not the opposite, and this is definitely a very important decision in the design of a WSN. Two other approaches to reach connectivity in the network are based on finding the Critical Node Density (CND) and the Critical Average Node Degree (CAND). Several works have been done in this area. In [6], the authors prove that if the average node degree k is as defined in Eq. 2, then connectivity will be reached with high probability.
k = c \cdot \log_2 n, \qquad 0.074 < c < 5.1774.    (2)
In [7] the authors show that node density exhibits a particular behavior with respect to the connectivity probability: connectivity does not increase linearly with density, but rather in a "sharp threshold" manner around the critical node density value of the network. However, the main drawbacks of these techniques are that the connectivity levels for a given node density are not universal for all scenarios and depend on the network design, and that they do not consider realistic radio models. Some other techniques provide an estimation of the network size based on the communication and sensing radii. One of them is the circle packing problem, a mathematical problem that seeks the maximum radius of k non-overlapping circles inside a given area [8]. This problem can be interpreted in the opposite direction in order to find the maximum number of non-overlapping circles of radius r inside a given area. In order to adapt this problem to connected network sizing for a communication range Rcomm, it is necessary to solve the circle packing problem for radius Rcomm/2, which guarantees that contiguous circle centers are at most Rcomm away from each other and are thus connected. The main weakness of this solution is that there is no mathematical formula to compute the circle packing solution for most scenarios, for either the radius or the number of circles, so the solutions have to be approximated from known optimal solutions. If sensing areas can overlap, the problem can be solved by defining lattice-based grids which allow the creation of minimal topologies for given communication
and sensing ranges. In [9] and [10], the authors propose a set of formulas to calculate the optimal ratio of area per node; to determine the actual size of the network, the total deployment area is divided by the value obtained from the formula. Eqs. 3, 4, 5 and 6 show the node density formulas for the different lattice designs from [10], based on triangles, rhombi, hexagons and squares. The actual number of nodes is obtained as the inverse of the density (1/γ) for the given communication and sensing radii, and these values are used to generate Table 4.
γ TRI =
⎧ ⎫⎞ 3 2 ⎛⎜ R RSense ⎜ min ⎨ 3 , Comm ⎬ ⎟⎟ . 2 RSense ⎭ ⎠ ⎩ ⎝
2 γ RHO = RComm sin(θ ),
π 3
≤θ ≤
π 2
, 2≤
RComm ≤ 3. RSense
(3)
(4)
2
γ HEX
⎛ ⎧ ⎫⎞ 3 2 ⎜ min ⎨1, RComm ⎬ ⎟ . 3RSense = ⎟ ⎜ 4 ⎩ RSense ⎭ ⎠ ⎝
(5)
2
γ SQR = R
2 Sense
⎛ ⎧ ⎫⎞ ⎜ min ⎨ 2 , RComm ⎬ ⎟ . ⎜ RSense ⎭ ⎟⎠ ⎩ ⎝
(6)
It is important to mention that all these solutions use the unit disk communication model, in which given a certain transmission range r, every node inside the range will always receive the message and every node outside the range will never receive the message. Realistic radio models, on the other hand, have been rarely considered, limiting the applicability of these solutions. Realistic radio models take into account the signal degradation due to distance or obstacles in order to develop a statistical function to define the Bit Error Rate (BER) and consequentially, the probability of receiving a message correctly. This fact alone would affect the way connectivity was defined in the previous techniques because it is not constant during the operation of the network. The work presented in this paper defines a general methodology that can be applied to particular scenarios in order to find the best estimator for network size. Three techniques are considered in this paper in order to estimate the network size: Inverse CTR, circle packing, and optimal grid deployments. These techniques will be evaluated, based on the connectivity curves obtained for the experimental scenarios, in terms of connectivity efficiency, or the ratio of the connectivity provided by the estimated number, and the number of nodes produced by the estimator for given node proximity parameters. This process will be explained in detail in Section 4. Even though this work just considers the free space model, other more realistic models can be considered also under the same methodology, according to the needs of the network designer.
76
P. Wightman et al.
3 Communication Model

The most common communication model used in wireless sensor network protocols has been the perfect unit disk: each node has a perfect disk of radius r such that every node inside the disk receives all transmissions, and every node outside the disk receives no message transmitted by the sender. The unit disk is an ideal model and does not reflect the soft drop in successful reception probability that a more realistic radio communication model produces because of signal degradation due to the distance between nodes and to interference. The simulation tool Atarraya [11] is an open-source application to design, implement and test topology control protocols in wireless sensor networks. This tool now includes a new radio propagation model that allows the behavior of these solutions to be studied under more realistic conditions. In particular, the path loss model was implemented and customized for two radios: Libelium's WaspMote XBee-ZB and Crossbow's MICA2. In addition, the BER is calculated for each radio model so that the simulator can determine whether packets are lost in a transmission between nodes. This work only evaluates networks using the WaspMote nodes.

3.1 Path Loss and Bit Error Rate

The first step in creating the model is the definition of the path loss model, i.e. the loss in signal strength due to distance. The loss for an omnidirectional antenna, assuming free-space propagation, is defined by Eq. 7, where λ is the wavelength of the transmitted signal and d is the distance.
L_p = \left(\frac{4\pi d}{\lambda}\right)^2.    (7)
There are other versions of this formula in which the Fresnel zone is included in the estimation of the signal loss, but that will be included in future versions. The second factor of the radio model is the calculation of the BER. The bit error rate is calculated based on the type of modulation used in the communication system. The sensor node selected in this work uses OQPSK modulation. Eq. 8 shows the BER for the OQPSK modulation. The terms of the equation will be explained later.
BER = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b}{N_0}}\right).    (8)
3.2 Communication Model

In order to calculate the BER, Eq. 7 is transformed into decibels and the new version is shown in Eq. 9, where d is the distance between the sender and receiver nodes, and λ is the wavelength.
L_p(dB) = 20\log_{10}(4\pi) + 20\log_{10}\!\left(\frac{d}{\lambda}\right) \;\Rightarrow\; L_p(dB) = 22 + 20\log_{10}\!\left(\frac{d}{\lambda}\right).    (9)
The wavelength of the WaspMote radio, which works at a frequency of 2.4 GHz, is 0.125 meters. After obtaining the loss with Eq. 9, the BER can be calculated by finding the transmitted signal power ptx, the received signal power prx, the noise power N, the signal-to-noise ratio STN and the ratio of bit rate to bandwidth BBW. In this work we assume ptx to be 2 mW, the bandwidth to be 5 MHz and the bit rate of the WaspMote to be 38400 bps. After calculating STN and BBW, the ratio of the energy per transmitted bit to the spectral density of the noise (Eb/N0) can be obtained. Then, the bit error rate is calculated using the equation of the modulation used in the transmission (Eq. 8). Eq. 10 shows an approximation of the error function erf using Chebyshev's fitting estimate [12], given that there is no closed form for the integral in the original definition of erf. Once the error function is calculated, the complementary error function erfc can be obtained and used to calculate the BER as in Eq. 8.

erf(z) \approx 1 - t\,\exp(-z^2 - 1.26551223 + 1.00002368\,t + 0.37409196\,t^2 + 0.09678418\,t^3 - 0.18628806\,t^4 + 0.27886807\,t^5 - 1.13520398\,t^6 + 1.48851587\,t^7 - 0.82215223\,t^8 + 0.17087277\,t^9), \quad \text{with } t = \frac{1}{1 + z/2}.    (10)
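For illustration, the whole chain from Eq. 9 to the packet reception probability can be sketched as follows. The transmit power, bandwidth, bit rate and wavelength are the values stated above; the receiver noise power, however, is not given in the paper, so the value used here is only a placeholder that would have to be calibrated to reproduce the curves of Fig. 1.

    import math

    PTX_W      = 2e-3       # transmit power: 2 mW
    BANDWIDTH  = 5e6        # 5 MHz
    BITRATE    = 38400.0    # bps
    WAVELENGTH = 0.125      # m (2.4 GHz)
    NOISE_W    = 1e-12      # assumed noise power (placeholder, not from the paper)

    def path_loss_db(d):
        # Eq. 9: free-space path loss in dB
        return 22.0 + 20.0 * math.log10(d / WAVELENGTH)

    def ber(d):
        # Eq. 8: BER for OQPSK, with Eb/N0 = SNR * (bandwidth / bit rate)
        prx_w = PTX_W / (10.0 ** (path_loss_db(d) / 10.0))
        eb_n0 = (prx_w / NOISE_W) * (BANDWIDTH / BITRATE)
        return 0.5 * math.erfc(math.sqrt(eb_n0))

    def reception_probability(d, payload_bytes):
        # 1 - PER: a packet is discarded as soon as a single bit is in error
        return (1.0 - ber(d)) ** (8 * payload_bytes)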
Fig. 1. Curves of successfully received messages ratio (1-PER) for the WaspMote nodes
Using the Atarraya simulator, the communication model for the WaspMote was tested in a simple scenario with two nodes, one sender and one receiver. The sender transmitted packets to the receiver in a point-to-point manner, and the simulator counted all lost packets. This scenario was tested at different distances in order to obtain a curve of message reception probability, and with two different packet sizes: long packets (100 bytes) and short packets (40 bytes). The resulting curves are shown in Fig. 1. In general, the signal is expected to reach 125
meters, but due to the size of the packets, and the assumption that a packet is discarded as soon as a single bit is in error, the probability of successful reception decreases with distance and with packet size.
Fig. 2. Example of random distributions of 64 nodes in a deployment area of 300 meters x 300 meters: pure uniform (left) and grid-based (right) node deployments
4 Methodology Definition

This work defines a methodology to evaluate the performance of different network size estimators for a particular scenario. The proposed methodology consists of five steps: 1) definition of the scenario, 2) selection of the network size estimation techniques, 3) estimation of the network sizes, 4) calculation of the connectivity curve for the scenario, and 5) evaluation of the connectivity efficiency of the estimators.

4.1 Definition of the Scenario
Three factors are considered to define a particular scenario: the length of the side of a square area L, the node distribution on the area D, and the radio technology R. The L factor is important because it affects the node density of the network, which has been shown to have a determinant impact on connectivity. For experiments in this work, three levels have been chosen for this factor: 200, 300 and 400 meters long. The D factor determines the way nodes will be distributed in the area of deployment. This factor has an impact on average node degree, area coverage and other variables that depend on the location of the nodes. In this evaluation, two levels were chosen for this factor: uniform distribution and grid-based distribution. The first technique distributes nodes randomly in the area, such that each position is equally probable to be chosen for a node. In the second technique, the area is divided in a grid based on the desired Lengthwise Node Density (LND); in other words, the user selects how many nodes are needed along the side of the deployment area. From this number, the length of the side of the cells is obtained by dividing L by LND. Then, inside each cell a node will be located according to a uniform random distribution. If the network administrator has the power to manually and accurately deploy the nodes, then the optimal deployments may be an option to be evaluated in this factor. Fig. 2 shows two examples of network topologies with area side of 300 meters, and 64 nodes with the two node distributions, uniform and grid-based. It can be seen how the
uniform distribution allows nodes to form random dense clusters and areas with a low presence of nodes, while the grid-based distribution produces a more homogeneous placement of the nodes in the space; a sketch of the two placement strategies is given below. The R factor is important in order to understand the behavior of communication in the network. The radio technology has an impact on the minimum distance at which nodes can be placed from each other while still communicating successfully; in other words, the estimators should have a way to guarantee a certain average distance among the nodes of the network. In this work, this parameter of the estimators is called the Node Approximation Parameter (NAP), which defines the average or maximum distance between any pair of nodes in the network. The range of values considered in this work goes from the maximum distance that offers almost 100% successful transmissions to the minimum distance with almost 0% successful transmission probability, which, from Fig. 1, are found at approximately 20 and 80 meters, respectively. Consecutive NAP values are 10 meters apart; a more detailed study could use shorter steps in order to obtain a larger set of NAPs. This work evaluates six different scenarios, based on the combination of the levels of factors L and D and a single level of factor R.
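The following sketch (illustrative only; Atarraya's actual implementation may differ) generates the two placements: a pure uniform deployment, and a grid-based deployment in which one node is placed uniformly at random inside each cell of an LND × LND grid.

    import random

    def uniform_deployment(n, area_side):
        return [(random.uniform(0, area_side), random.uniform(0, area_side))
                for _ in range(n)]

    def grid_based_deployment(lnd, area_side):
        cell = area_side / lnd   # cell side derived from the lengthwise node density
        nodes = []
        for i in range(lnd):
            for j in range(lnd):
                nodes.append((i * cell + random.uniform(0, cell),
                              j * cell + random.uniform(0, cell)))
        return nodes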
4.2 Definition of the Estimation Technique

For this evaluation, three estimation techniques were selected because they take a NAP as input to produce their estimates: the inverse CTR, the circle packing number and the lattice-based optimal deployments. The first technique, the inverse CTR, uses the formula in Eq. 1 and implies a manual search over different network sizes until the CTR equals the required NAP (a sketch of this search is given at the end of this subsection). For example, for a square area with a side of 400 meters, 40 nodes have a CTR of 80 meters and 56 nodes have a CTR of 70 meters. This number of nodes guarantees that, with high probability, every node will have at least one neighbor at the specified distance. The second technique is the circle packing problem. In this work, and based on the results in [8], a list of packing numbers and corresponding radius/area-side ratios is built, as shown in Table 1. However, in order to select a given packing number, note that the packing radius must be at most half of RComm, otherwise neighboring nodes will not be able to communicate; that is, for a square area with a 400-meter side, an RComm of 80 meters gives a ratio of 0.2, so the packing number should be 25, which corresponds to a ratio of 0.1 (i.e. as if RComm were 40 m), in order to guarantee connectivity among the nodes. This table allows the selection of an approximate packing number for a given scenario, especially because optimal packing numbers have not been proven for most numbers of circles. The third technique uses the lattice-based optimal deployment node densities in order to calculate the number of nodes necessary to build the grid for given RComm and RSense. In this evaluation it is assumed that both radii are equal. This assumption could be changed in order to also evaluate the level of area coverage provided by the estimated network size.
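As an illustration of the inverse CTR technique, the smallest network size whose CTR does not exceed the required NAP can be found by a simple linear search. The function below is a sketch (not code from the paper) and reuses ctr() from the sketch given after Eq. 1.

    def inverse_ctr(nap, area_side, n_max=5000):
        for n in range(2, n_max + 1):
            if ctr(n, area_side) <= nap:
                return n
        return None

    # inverse_ctr(80, 400) returns 40; inverse_ctr(70, 400) returns 57,
    # close to the 56 nodes used in the paper (56 nodes give a CTR of about 70 m).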
Table 1. Network sizes per RComm/L ratio based on packing number

Packing number  Ratio RComm/L | Packing number  Ratio RComm/L | Packing number  Ratio RComm/L
 5    0.2071 | 100   0.0500 |  529   0.0217
 7    0.1745 | 121   0.0455 |  576   0.0208
10    0.1482 | 144   0.0417 |  625   0.0200
13    0.1340 | 169   0.0385 |  676   0.0192
18    0.1155 | 196   0.0357 |  729   0.0185
21    0.1069 | 225   0.0333 |  784   0.0179
24    0.1014 | 256   0.0313 |  841   0.0172
25    0.1000 | 289   0.0294 |  900   0.0167
36    0.0833 | 324   0.0278 |  961   0.0161
45    0.0747 | 361   0.0263 | 1024   0.0156
56    0.0675 | 400   0.0250 | 1089   0.0152
62    0.0636 | 441   0.0238 | 1156   0.0147
78    0.0573 | 484   0.0227 | 1225   0.0143
Table 2. Network sizes per area side based on the CTR value

                       L = 400 m         L = 300 m         L = 200 m
Transmission range   Uniform   Grid    Uniform   Grid    Uniform   Grid
80 m                    40       36       18       16      N/A      N/A
70 m                    56       49       26       25        6        4
60 m                    84       81       40       36       12        9
50 m                   132      121       64       64       21       25
40 m                   226      225      112      121       39       36
30 m                   448      441      226      225       84       81
20 m                  1145     1146      589      576      226      225
4.3 Estimation of the Network Sizes
In this step, the estimators are executed in order to obtain their projected network sizes. A list of packing numbers and corresponding radius/area-side ratios is built as shown in Table 1. Table 2 shows the number of nodes calculated for the corresponding critical transmission ranges and for area sides of 400, 300 and 200 meters. Table 3 shows the network sizes based on the packing number approach. Table 4 shows the network sizes based on the triangle-, hexagon-, rhombus- and square-based grids, calculated from Eqs. 3, 4, 5 and 6 in Section 2. For this experiment, RComm is assumed equal to RSense, which has an impact on the rhombus lattice.
Table 3. Network sizes per area side based on the packing number L / Ratio and 400 meters Packing number Packing Transmission Ratio range (RComm/L)/2 number 80 m 0.100 25 70 m 0.088 36 60 m 0.075 45 50 m 0.063 62
Ratio (RComm/L)/2 0.133 0.117 0.100 0.083
Packing number 13 18 25 36
Ratio (RComm/L)/2 0.200 0.175 0.150 0.125
Packing number 5 7 10 13
40 m 30 m 20 m
0.067 0.050 0.033
56 100 225
0.100 0.075 0.050
25 45 100
0.050 0.038 0.025
100 144 400
300 meters
200 meters
Table 4. Network sizes per polygon-lattice-based grid design for optimal deployments

                        Triangle               Rhombus                Hexagon
Transmission     400 m  300 m  200 m     400 m  300 m  200 m     400 m  300 m  200 m
range
80 m               29     16      7        25     14      6        19     11      5
70 m               38     21      9        33     18      8        25     14      6
60 m               51     29     13        44     25     11        34     19      9
50 m               74     42     18        64     36     16        49     28     12
40 m              115     65     29       100     56     25        77     43     19
30 m              205    115     51       178    100     44       137     77     34
20 m              462    260    115       400    225    100       308    173     77
4.4 Calculation of the Connectivity Curves
Connectivity in the network is not easy to calculate when a realistic radio model is used. In the perfect disk model, the fact that two nodes lie within each other's range is enough to say that they are connected; the path loss model does not allow that assertion, due to the variable probability of error depending on the distance between the nodes. In order to obtain a connectivity metric, the experiment compares the size of the final topology generated by the JustTree protocol with the size of the complete network. The JustTree protocol is a simple flooding-based routing protocol included in Atarraya that builds a tree-like message distribution structure over the network, rooted at the sink node. In order to transmit messages to the sink, nodes forward messages towards their parent nodes in the tree, so that eventually all messages should reach the sink. The methodology used to create the tree is very similar to a Breadth-First Search (BFS): an initiator (the sink node) sends a HELLO message to all its neighbors, including its current level in the tree (0 in the case of the sink), and
marks itself as Visited. Every node that receives the message successfully marks itself as Visited, updates its own tree level to the level carried by the message plus one, registers the sender as its parent node, and forwards the message, with its own level, to all its neighbors. A Visited node does not forward messages again, so only n messages are sent in total. This protocol guarantees connectivity in a unit-disk network, if the network is initially connected, due to its similarity to the BFS algorithm; with realistic radio models, however, this is no longer the case, because lost packets can interrupt the process, especially on bottleneck links. The execution of this protocol over a network topology therefore gives a good indication of the level of connectivity that can be reached in the network without the support of MAC layers and without retransmissions or any mechanism to recover lost packets; a minimal sketch of this tree construction is given below.
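The sketch below is only an illustration of the construction just described (not Atarraya's code): each forwarded HELLO is received over link (u, v) with probability p_success(u, v), for instance 1 − PER from Section 3, and the connectivity level is the fraction of nodes that end up in the tree.

    import random
    from collections import deque

    def just_tree(sink, neighbors, p_success):
        level = {sink: 0}
        parent = {sink: None}
        queue = deque([sink])
        while queue:
            u = queue.popleft()              # u forwards the HELLO message once
            for v in neighbors[u]:
                if v in level:
                    continue                 # already Visited
                if random.random() <= p_success(u, v):
                    level[v] = level[u] + 1
                    parent[v] = u
                    queue.append(v)
        return parent                        # connectivity = len(parent) / number of nodes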
Fig. 3. Connectivity curves for scenarios with 200, 300 and 400 meters of area side and different node distribution techniques
Fig. 3 shows the connectivity curves obtained after evaluating 50 instances of each of the proposed scenarios: three area sides (200, 300 and 400 meters) combined with the two node distribution techniques (uniform and grid-based). The network sizes used to generate the figure are those produced by the inverse CTR estimation. The curves show the average behavior over the 50 instances and two standard deviations around the average, one above and one below. Each point of a series corresponds to a NAP value, starting from 80 meters (leftmost) and finishing at 20 meters (rightmost), with a step of 10 meters. Some conclusions can be drawn from Figure 3. First, the grid-based node distribution starts with a worse connectivity probability than the uniform distribution for high NAPs (80–60 m) and then shows better connectivity for low NAPs (50–20 m). Both effects can be explained by the fact that the nodes are sparser in the grid distribution than in the uniform one: for high transmission ranges the uniform deployment may reach dense clusters and increase its connectivity faster, while for low communication ranges the grid allows the branches to reach farther and cover more nodes and, thanks to the homogeneous placement of the nodes, farther nodes can be reached through different paths. Second, the homogeneous distribution also reduces the variability of the results in the grid-based deployment, which tends to be smaller than in the uniform one; this means that the variability that remains in the connectivity level of this scenario can be explained by the probability of transmission errors. Third, most of the scenarios reach a high probability of connectivity when the communication range is about 40 meters (fifth point from the left in each series) for grid-based distributions, and 30 meters (sixth point) for uniform distributions. This result is perfectly consistent with the communication model shown in Figure 1, in which it can be seen that, up to 40 meters, the probability of successful message reception is about 100% and presents low variability; in other words, any estimation that guarantees an average distance among nodes close to the one that provides a high successful reception probability will result, with high probability, in a connected topology.

4.5 Connectivity Efficiency of the Estimators
We evaluate the performance of the estimators by considering both the number of nodes and the connectivity probability. For a specific NAP, the best estimator is the one that produces the smallest number of nodes achieving the desired connectivity probability. For example, for an area side of 400 meters and a NAP of 20 meters, the inverse CTR technique estimates 1145 nodes while the hexagon-based optimal deployment estimates just 308, and both provide almost 100% connectivity on the connectivity curve; the second estimator is therefore more efficient, because it provides a similar solution with fewer resources. Fig. 4 shows the curves of connectivity efficiency offered by the different network size estimation techniques with the uniform node distribution. Connectivity efficiency is defined as the connectivity offered by the estimate divided by its projected network size. For clarity, the figure only shows the best two estimators and the worst one for each area side. The results show that the hexagon-based optimal deployment technique provides the most efficient connectivity ratio for small NAP values, which are the ones that
Fig. 4. Connectivity efficiency of the estimation techniques for network sizes in scenarios with uniform node distribution
provide the highest level of connectivity. The inverse CTR technique shows good performance for scenarios with NAPs of more than 50 meters, but then its efficiency falls below the rest, which means that it estimates more nodes than needed. The packing, rhombus- and triangle-based estimates exhibit an average behavior and do not show better results than the hexagon-based technique. The results for the grid-based node distribution scenarios are not shown because they present a behavior similar to the ones in Fig. 4.
5 Conclusions and Future Work

Network size estimation is a very complicated problem to solve universally for every possible scenario. In this work, a methodology for evaluating network size estimators is presented. Three methods to estimate the size of the deployments are evaluated: finding a network size whose CTR is close to the desired communication range (the inverse CTR), calculating the circle packing number for the desired communication range, and using the lattice-based optimal deployments. These techniques were evaluated in six different scenarios using a realistic radio model based on the XBee-ZB WaspMote nodes, in which the size of the deployment area and the node distribution were varied. A new evaluation metric was proposed: the connectivity efficiency, which is the ratio of the connectivity probability provided by the estimate to the estimated network size. The hexagon-based optimal deployment estimator produced the most efficient estimates for small NAPs, for which higher connectivity was achieved, while the inverse CTR showed the best efficiency for scenarios with large NAPs, for which a lower connectivity level was achieved. An interesting result can also be drawn from the connectivity results in Fig. 3: on average, topologies were mostly connected with a NAP of 40 meters in all scenarios, which corresponds to a successful transmission probability of around 90% according to Fig. 1, and to a high node density needed to reach this inter-node distance. This means that, even with a considerable packet error rate (around 10%), the network tends to be connected when the network density is high. This can be explained by the fact that the JustTree protocol is based on flooding, so the probability of a node not receiving any HELLO message is low, thanks to the large set of neighbors each node has and to the numerous transmissions of the message, of which at least one is expected to arrive correctly. The next lower value, 50 meters, shows a connectivity level between 80% and 90%, which is very high compared to its expected packet error rate of 60% for short packets. Some of these results require a deeper analysis, which will be included in future work. This work can be extended in different ways: other radio models may be considered that include further physical aspects, such as obstacles or interference; another direction is the inclusion of a sensing coverage metric as a way to evaluate not only connectivity but also how well the distributions provide coverage; and finally, contrasting the estimated network sizes with real test beds would be critical for evaluating the performance under real conditions.
References 1. Penrose, M.: The Longest Edge of a Random Minimal Spanning Tree. The Annals of Applied Probability 7(2), 340–361 (1997) 2. Santi, P.: Topology Control in Wireless Ad Hoc and Sensor Networks. John Wiley and Sons, England (2005) 3. Labrador, M., Wightman, P.: Topology Control in Wireless Sensor Networks. Springer. Science + Business Media B.V., New York (2009) 4. Santi, P., Blough, D.: The Critical Transmitting Range for Connectivity in Sparse Wireless Ad Hoc Networks. IEEE Transactions on Mobile Computing 2(1), 25–39 (2003) 5. Karl, H., Willing, A.: Protocols and Architectures for Wireless Sensor Networks. John Wiley and Sons, England (2005) 6. Xue, F., Kumar, P.R.: The Number of Neighbors Needed for Connectivity of Wireless Networks. Wireless Networks 10(2), 169–181 (2004) 7. Cai, H., Jia, X., Sha, M.: Critical Sensor Density for Partial Connectivity in Large Area Wireless Sensor Networks. In: IEEE Infocom, pp. 1–5. IEEE Press, New York (2010) 8. Graham, R.L., Lubachevsky, B.D.: Repeated Patterns of Dense Packing of Equal Disks in a Square. Electronic Journal of Combinatorics 3(1), 1–16 (1996) 9. Bai, X., Kumar, S., Xuan, D., Yun, Z., Lai, T.: Deploying Wireless Sensors to Achieve both Coverage and Connectivity. In: ACM Symposium on Mobile Ad Hoc Networking and Computing, pp. 131–142. ACM Press, New York (2006) 10. Bai, X., Xuan, D., Yun, Z., Lai, T., Jia, W.: Complete Optimal Deployment Patterns for Full-Coverage and K-Connectivity (K<6) Wireless Sensor Networks. In: 9th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 401–410. ACM Press, New York (2008) 11. Wightman, P., Labrador, M.: Atarraya: A Simulation Tool to Teach and Research Topology Control Algorithms for Wireless Sensor Networks. In: 2nd International ICST Conference on Simulation Tools and Techniques, pp. 1–10. ICST, Brussels (2009) 12. Sedgewick, R., Wayne, K.: Introduction to Programming in Java: An Interdisciplinary Approach. Addison Wesley, USA (2007)
A Topology Control Algorithm for Interference and Energy Efficiency in Wireless Sensor Networks

Hugo Braga and Flávio Assis

LaSiD - Distributed Systems Laboratory, DCC - Department of Computer Science, PPGM - Graduate Program on Mechatronics, UFBA - Federal University of Bahia, Salvador, Bahia, Brazil
{hugobraga,fassis}@ufba.br
Abstract. Topology control is one of the main techniques that can be used to decrease energy expenditure and/or interference in wireless sensor networks. Less attention, however, has been devoted to algorithms that address energy and interference efficiency together. In this paper, we describe a localized topology control algorithm called TCO which is very efficient in terms of interference while also being energy efficient. In order to evaluate TCO in terms of interference and compare it with other algorithms, we defined a new metric called the PICS (Path Interference Cost based on Sender) Spanning Factor. According to this metric, in our experiments TCO outperformed all related localized topology control algorithms and its performance was extremely close to that of a centralized algorithm which is optimal according to the PICS spanning factor (a variation of ATASP based on PICS). Keywords: Wireless Sensor Networks, Topology Control, Overhearing, Interference, Energy Efficiency.
1 Introduction
A Wireless Sensor Network (WSN) is a special type of wireless network whose nodes have limited resources in terms of energy supply (they are usually battery-operated), computation power and memory. Conserving energy is thus a key issue in the design of WSNs. Topology control is one of the main techniques used to conserve energy. The goal of topology control is to determine a transmission power for each node of the network with the purpose of maintaining some property (e.g. connectivity) over the resulting communication graph while reducing the energy consumed by the nodes and/or the interference in the network [30]. Although optimizing energy efficiency and interference are at the core of topology control, most work does not consider these goals together. Algorithms for
The work of this author was partially supported by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), Brazil.
topology control initially focused on energy efficiency. Interference was only considered implicitly, assuming that reducing the number of neighbors of nodes resulted in low interference. In [10], however, it was shown that low node degree does not imply low interference. Therefore interference should be addressed explicitly. Since then, different approaches to topology control which attempt to optimize interference have been proposed. They are based on metrics that consider the number of nodes in specific areas affected by transmissions. These areas are dependent on the specific metric of interference used.

In this paper, we describe a distributed localized algorithm based on a specific edge weight function which is efficient in terms of interference and energy expenditure. This algorithm was presented in a previous work by one of the authors [2] as an algorithm for energy efficiency which takes the cost of overhearing into consideration, i.e. the cost implied when nodes hear messages even if the messages are not intended for them. We will refer to this algorithm as TCO (from Topology Control considering Overhearing). In this paper we show that, as the edge weight function used in TCO takes into consideration the transmission and reception cost, including overhearing, it implicitly takes into consideration the number of nodes affected by a transmission. Thus, TCO optimizes energy and interference at the same time.

An important aspect of this work is that we deviate in the design of our algorithm from most existing work in two important ways. First, we take the cost of overhearing into consideration (normally ignored in previous work on energy-efficient topology control). Second, we generate topologies that might be asymmetric, i.e. in the final topology there might be an edge from a node u to node v but not from v to u (the final topology is, however, strongly connected). We argue that the cost of overhearing is significant when considering currently used sensor nodes and MAC (Medium Access Control) protocols and that, by taking this cost into consideration, we can generate topologies that are efficient both in terms of energy and interference. We argue additionally that generating asymmetric links might be useful in some situations and might even lead to energy conservation.

More specifically, in this paper: (a) we argue that, unlike most previous work on topology control, it is important to take the cost of overhearing into consideration and that it might be useful to generate topologies with asymmetric links; (b) we show the impact of taking the cost of overhearing into consideration when optimizing energy and interference at the same time; (c) we propose a new metric for interference; (d) we present a specific edge weight function that incorporates energy and interference parameters; and (e) we show a (distributed) localized topology control algorithm based on this function that is efficient in terms of energy (as shown in [2,32]) and interference. According to the defined interference metric, our algorithm outperformed existing algorithms described in the literature and, more importantly, although localized, it is extremely close to the performance of an optimal global solution (i.e. a solution based on the whole information about the network).
This paper is organized as follows. Section 2 discusses related work. Section 3 presents our arguments in favor of considering the cost of overhearing and asymmetric links. Section 4 describes the assumed system model. Section 5 introduces a new interference metric. Section 6 describes a topology control algorithm for optimizing energy efficiency and interference. Section 7 describes the results of experiments performed to evaluate the algorithm. Finally, Section 8 concludes the paper.
2 Related Work
To the best of our knowledge, [10] was the first work on topology control to address interference explicitly (in a traffic-independent way), after pointing out that reducing node degree does not imply low interference. The authors measured interference as the sum of the nodes in the range of two communicating nodes, a sender and a receiver. The idea is that these nodes are the ones which could hear a transmission from the sender and a corresponding reply (such as an ACK message) from the receiver. This approach is classified as link-based (according to the classification introduced in [22]) because interference is based on the two endpoints of links, and as sender-centric (according to the classification introduced in [16]) because interference is considered from the perspective of the node sending a message. Later, [6] argued that interference must be addressed considering multihop paths, as messages actually traverse such paths. An algorithm developed considering only one-hop paths does not necessarily provide a good solution when multihop interference is considered. The authors thus proposed a multihop metric for interference, based on the link-based approach. In [16] (and after that [20], [26] and [29]) the authors argued that interference should be considered from the receiver perspective. According to these authors, a transmission can be affected by the overall effect of different transmissions as perceived by the receiver. These authors thus introduced interference metrics that are classified as receiver-centric. The main idea is that the interference as perceived by node v is the number of all nodes u such that v is in the range of u's transmissions. More recent work [23,17] addresses interference based on physical models (as opposed to graph-based models, as used in previous work). Their notion of interference is naturally receiver-centric. However, as pointed out in [13], adopting a sender-centric approach has the following benefits: (a) it captures interference in certain scenarios better and (b) it has been shown that a receiver-centric metric is limited by a constant factor of a sender-centric metric [9]¹, i.e. reducing the interference when a sender-centric metric is used implies that interference is reduced when a receiver-centric metric is used. Additionally, we argue that optimizing interference based on a receiver-centric approach is harder, because the level of interference to consider for a
¹ In [9] it was shown that the receiver-centric metric defined in [29], referred to as I_in, relates to the sender-centric metric defined in [10], referred to as I_out, by the expression I_in ≤ 5·I_out.
given node u depends on possible combinations of powers assigned to each of the nodes that might have u in their ranges. In fact, to the best of our knowledge, all algorithms that have been proposed to optimize interference using a receiver-centric approach are centralized, i.e. they are based on global information about the communication graph [16] [26] [29] [34] [23] and [17]. Some approaches to interference based on a sender-centric approach are also centralized [10] (LIFE and LISE) and [6] (ATASP). In particular, ATASP is an optimal algorithm considering multihop path interference (according to the PIC spanning factor - see Section 5). However, four localized sender-centric algorithms have been proposed: LLISE [10], API [20], I-LMST [22] and I-RNG [22] (the authors in [22] proposed an additional localized topology control algorithm that preserves spanning property - a more restrictive requirement which might lead to worse solutions -, but we do not consider it here as we are not addressing spanning property). A distributed algorithm, SLISE [13], has also been described, but it optimizes one-hop interference. The algorithm that we describe in this paper, TCO, is sender-centric, localized and is based on multihop interference. It is therefore closely related to LLISE, API, I-LMST and I-RNG. Additionally, according to the classification introduced in [22], it is based on an approach called node-based, because interference is based on only one of the endpoints of links (in contrast to the link-based approach). As presented in Section 7, our algorithm outperforms LIFE and all other localized algorithms related to ours. We compared our algorithm with LIFE instead of LLISE (both proposed in the same paper), because LIFE is a centralized algorithm which performs better than LLISE (according to the interference metric of their authors). TCO has been previously evaluated only in relation to energy efficiency [2,32]. The work described in this paper differs from [2,32] in that we address interference here.
3 Overhearing Cost and Asymmetric Links
As mentioned before, TCO takes into consideration the cost of overhearing and the generated topology might contain asymmetric links. Most previous work on topology control does not take the cost of overhearing into consideration. Additionally, the absence of asymmetric links in the generated topology is generally considered a good property. Thus, our assumptions deviate from the assumptions commonly adopted in the literature. In this section, we argue why we think our assumptions are appropriate.

The cost of overhearing is one of the major sources of energy expenditure due to communication in WSNs [24]. Despite this, many MAC protocols are subject to overhearing. All protocols that rely on low power listening (or preamble sampling) [27,15,8,36,1] suffer from overhearing, basically due to their asynchronous nature. A node broadcasts a signal to every node in its vicinity to indicate its intention to transmit a message. Nodes in the vicinity might hear the signal even if the message is not intended for them. In particular, B-MAC [27], a preamble sampling protocol, is the default MAC protocol for TinyOS [8], one of the
most important operating systems for WSNs. Additionally, the MAC protocol specified in IEEE 802.15.4 [19], a de facto standard for the MAC and physical layers of devices with low power consumption, also suffers from overhearing. In IEEE 802.15.4, RTS/CTS signaling messages (which could be used to identify the sender and receiver of a transmission) are not used and nodes do not sleep during the so-called Contention Access Period (each node needs to hear the transmissions to identify messages sent to it). Furthermore, another argument in favor of considering overhearing is that the reception cost in current transceivers has tended to increase [12,3] and, for some specific transceivers, it is higher than the transmission cost at maximum power (for example, for the Chipcon CC2420 transceiver [11]).

Regarding the presence of asymmetric links in the final topology, many topology control algorithms avoid them on the assumption that handling asymmetric links is difficult at the MAC sublayer (for example, due to ACKs). However, we think that asymmetric links might be interesting in many specific situations and might even lead to energy conservation. In IEEE 802.15.4, for example, ACKs are optional (the need for the receiver to send an ACK is specified on a per-message basis) [19]. The need to acknowledge every message might be a source of unnecessary energy expenditure. Message broadcasting without the need to guarantee reception at all nodes, such as in the case of beacons or specific messages (such as cryptographic keys, as in [14]) broadcast by the base station to the whole network, is also an example where asymmetric links can be used. Additionally, some authors have proposed eliminating ACKs in order to improve the performance of the network. For example, in [35] the authors propose an energy-efficient transport service (that includes both transport and MAC layer services), PMC, to transport events with a tradeoff between reliability and energy efficiency. PMC works over a Silent CSMA, which consists of a variant of CSMA/CA without ACKs. In [37], the authors propose a key establishment protocol for security services that exploits unidirectional links. The advantage of this approach is that, instead of removing unidirectional links as key establishment schemes generally do, the authors exploit this type of link in order to improve connectivity and increase the network lifetime.
4 System Model
In this paper, a wireless sensor network consists of a set of n static nodes. We model the behaviour of each node by a process associated with the node. Therefore we have n processes, p1 , p2 , ..., pn , one for each node. As there is a one-to-one correspondence between processes and nodes, we will use the terms process and node interchangeably. We assume that each node can adjust its transmission power to any value between 0 and a certain maximum, referred to as Mpower (in the experiments described in this paper, however, each node transmits with one out of a fixed number of power levels - see Section 7.1). The maximum power level is the same
for all nodes. Varying the transmission power of a node might change its set of neighbours. We assume that there is a path between any pair of nodes (processes) in the network if all nodes transmit at M_power.

Our energy model is based on the model presented in [18]. Energy is spent by nodes during transmission, reception and processing states. However, the energy spent during processing is neglected (we do not take the processing costs into consideration in this paper). The energy used to transmit is the energy spent to run the radio electronics and the power amplifier. Both are dependent on hardware characteristics, such as the digital coding and modulation used. The energy used to run the power amplifier is also dependent on the distance between the transmitter and receiver and is computed according to a specific path loss model. Thus, the energy spent to transmit an l-bit message from process p to process q, denoted E_Tx(l, p, q), is given by:

E_Tx(l, p, q) = l · E_TxElec + l · ε · d^α,    (1)

where: E_TxElec is the energy spent by the transmitter electronics; d is the distance (in meters) between p and q; α is the path loss exponent (typically 2 ≤ α < 6); and ε is a parameter that is characteristic of the transceiver and the channel [28]. The energy spent to receive an l-bit message, denoted E_Rx(l), is given by:

E_Rx(l) = l · E_RxElec,    (2)

where E_RxElec is the energy spent to run the receiver electronics.

We assume that the nodes are distributed over a plane (i.e. the location of each node is given by a pair of x, y-coordinates). Additionally, each process knows its current geographic location (the nodes obtain this information from a positioning system, such as GPS, or by other means, such as triangulation with some reference points in the network).
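To make the model concrete, the following small Java sketch evaluates (1) and (2) for a single message. The constants are the values reported later in Section 7.1 for the CC2420-based setting (48 nJ/bit, 236.4 nJ/bit, ε = 0.016 pJ/bit/m², α = 2); the specific message length and distance are assumptions made only for this illustration.

// Minimal sketch of the energy model in (1) and (2).
// Parameter values follow Section 7.1 and are illustrative only.
public class EnergyModel {
    static final double E_TX_ELEC = 48e-9;      // J/bit spent by the transmitter electronics
    static final double E_RX_ELEC = 236.4e-9;   // J/bit spent by the receiver electronics
    static final double EPSILON   = 0.016e-12;  // J/bit/m^2, power amplifier constant
    static final double ALPHA     = 2.0;        // path loss exponent

    // Energy to transmit an l-bit message over distance d (meters): l*E_TxElec + l*eps*d^alpha
    static double eTx(int lBits, double d) {
        return lBits * E_TX_ELEC + lBits * EPSILON * Math.pow(d, ALPHA);
    }

    // Energy to receive an l-bit message: l*E_RxElec
    static double eRx(int lBits) {
        return lBits * E_RX_ELEC;
    }

    public static void main(String[] args) {
        int l = 1024;      // message length in bits
        double d = 80.0;   // transmitter-receiver distance in meters
        System.out.printf("E_Tx = %.3e J, E_Rx = %.3e J%n", eTx(l, d), eRx(l));
    }
}

Note how, for this transceiver, the per-bit reception cost already exceeds the per-bit transmission cost for short distances, which is the quantitative basis of the overhearing argument in Section 3.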
5 A New Metric for Interference
Different metrics have been proposed to quantify interference [6] [22] [10] [20] [25]. In most previous work interference is defined on a per-link basis, instead of being based on paths. In [6], the authors showed that topologies constructed based only on links might not be efficient when considering a path-based interference metric. However, in WSNs data generally flows through multihop paths. Therefore, a metric based on multihop paths seems more reasonable. In this section we define a new interference metric based on paths that differs from previous work as we consider asymmetric links. In [10], interference is defined based on the notion of link coverage. The coverage of a link (u, v) represents the number of nodes that are inside the disks centered at u and v with radius |u, v|. The authors assume that links are bidirectional. The coverage thus represents the set of nodes that are affected by a
transmission over this link in both directions (to model that a transmission of a message from a node u to node v is commonly followed by an acknowledgment message from v to u). The authors in [10] additionally assume a simplistic model of circular coverage, i.e., they use the UDG (Unit Disk Graph) model. Due to the oversimplified nature of the UDG model, the authors in [6] introduced the Interference Number (IN) of an edge. The IN generalizes the concept of coverage. It is based on the fact that the area affected by a transmission might not be a perfect circle, rather an arbitrarily shaped area. More formally, the IN of an edge (u, v), denoted IN(u, v), is defined as IN(u, v) = |cov_IN(u, v)| where:

cov_IN(u, v) = {w | w ∈ I(u, p_u) ∨ w ∈ I(v, p_v)}.    (3)
In (3), for a node z, I(z, p_z) denotes the nodes affected (in some region) when node z transmits with power p_z. There is no restriction on the geometry of the area affected by a transmission. The authors in [22] present a notion of interference called Interference based on Sender (IS). IS_H(w) denotes the interference of a node w under a subgraph H of the original graph G(V, E). Let p_uv be the minimum power that node u needs to reach node v. Let p_u(H) be the minimum power that node u needs to reach all neighbours in H. The interference based on sender is thus defined as:

IS_H(w) = |{v | p_wv ≤ p_w(H)}|.    (4)
Both the IN and the IS reflect the notion of interference presented in sender-centric approaches (see Section 2). In addition, IS reflects the node-based interference approach (see Section 2). In this paper, we propose a new interference metric based on paths, called Path Interference Cost based on Sender (PICS). This metric is a variation of the Path Interference Cost (PIC), proposed in [6]. Let G(V, E) be a maximum power graph and H(V, E′) be a subgraph of G. Let p be a multihop path p = w_0, w_1, ..., w_{h−1}, w_h in H. We define the PICS for path p as:

PICS(p) = Σ_{i=0}^{h−1} IS_H(w_i).    (5)
The difference between our metric (PICS) and PIC [6] is that PIC is based on IN, instead of on IS, i.e. in the case of PICS the interference is based on nodes instead of being based on links. Based on our PICS definition, we denote by mips^G_uv the minimum interference path based on sender between a source node u and a destination v in a graph G = (V, E), i.e. a path p between u and v for which PICS(p) is minimum. In order to quantify the amount of interference that a specific algorithm reduces in comparison to a maximum power graph, the authors in [6] introduced the PIC Spanning Factor (ρ). Adapting this concept to our definition of path interference cost, we define the PICS Spanning Factor (σ) as follows. Let G = (V, E) be a maximum power communication graph, and let H = (V, E′) be
a subgraph of G. The PICS spanning factor of H, σ(H), is the average, over all possible source/destination pairs, of the ratio of the cost of a minimum interference path based on sender in H to the cost of a minimum interference path based on sender in G. Formally,

σ(H) = ( Σ_{u,v∈V, u≠v} PICS(mips^H_uv) / PICS(mips^G_uv) ) / |{(u, v) | u, v ∈ V, u ≠ v}|.    (6)
There are basically two differences between σ and ρ: (a) σ is based on the notion of PICS instead of PIC and (b) σ is based on an average value instead of a maximum value (of ratios). The former difference comes from the fact that we assume a node-based interference approach. The latter comes from the fact that we believe that the average value captures the notion of global interference in the network better. For example, if the maximum (interference value) is used and the interference level of the path between one specific pair of nodes u and v is high, the topology will be considered bad even if the interference levels of all other paths in the graph are good. The average value circumvents this problem. The interference of a topology based on an average value is also used elsewhere [20,25,4].
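To make the definitions in (4) and (5) concrete, the short Java sketch below evaluates IS_H(w) and the PICS of a given path on a toy topology. The node coordinates, the subgraph H and the example path are invented for illustration, and Euclidean distance is used as a stand-in for the minimum required power p_uv, which is justified only under a distance-monotone path loss model.

import java.util.*;

// Sketch of IS_H(w) from (4) and PICS(p) from (5) on a toy topology.
// Positions, the subgraph H and the example path are made up for illustration;
// Euclidean distance is used as a proxy for the minimum power p_uv.
public class PicsDemo {
    static double[][] pos = { {0,0}, {30,0}, {60,10}, {30,40}, {90,0} }; // nodes 0..4

    static double dist(int a, int b) {
        return Math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]);
    }

    // IS_H(w): number of nodes v (v != w) within the range w needs to reach its
    // farthest neighbour in H, i.e. with p_wv <= p_w(H).
    static int isBasedOnSender(int w, List<Set<Integer>> H) {
        double range = 0.0;                       // p_w(H) expressed as a distance
        for (int nbr : H.get(w)) range = Math.max(range, dist(w, nbr));
        int count = 0;
        for (int v = 0; v < pos.length; v++)
            if (v != w && dist(w, v) <= range) count++;
        return count;
    }

    // PICS(p): sum of IS_H(w_i) over all path nodes except the destination.
    static int pics(int[] path, List<Set<Integer>> H) {
        int sum = 0;
        for (int i = 0; i < path.length - 1; i++) sum += isBasedOnSender(path[i], H);
        return sum;
    }

    public static void main(String[] args) {
        int[][] edges = { {0,1}, {1,2}, {2,4}, {1,3} };   // adjacency of an example subgraph H
        List<Set<Integer>> H = new ArrayList<>();
        for (int i = 0; i < pos.length; i++) H.add(new HashSet<>());
        for (int[] e : edges) { H.get(e[0]).add(e[1]); H.get(e[1]).add(e[0]); }

        int[] p = {0, 1, 2, 4};                           // a multihop path w_0 ... w_h
        System.out.println("PICS(path) = " + pics(p, H));
    }
}

σ(H) itself can then be obtained by computing, for every source/destination pair, a minimum-PICS path in H and in G (for example with Dijkstra's algorithm, charging IS(w_i) whenever a path leaves node w_i) and averaging the ratios as in (6).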
6 A Topology Control Algorithm That Considers Overhearing
The algorithm TCO determines the transmission power for each node by finding the node's reduced set of neighbours. A node q belongs to a node p's reduced set of neighbours iff q is one of p's neighbours and the edge (p, q) is not k-redundant, considering p's local information. An edge (p, q) is k-redundant (k ≥ 2) iff there is a path with length k (i.e. with k edges) such that sending a message from p to q along this path has a lower cost (i.e. it results in less energy being spent) than sending the message directly from p to q.

All processes execute the same algorithm, shown in Algorithm 1. The algorithm has as input a directed graph G^2h_p = (V^2h_p, E^2h_p), which represents the two-hop neighbourhood of process p, and the set Pos_p. Let G_max = (V, E_max) be the graph generated when all nodes transmit at full power. V^2h_p contains p, its neighbours (when transmitting at full power), and the neighbours of its neighbours (when transmitting at full power). Formally:

V^2h_p = {p} ∪ {q : q ∈ V ∧ (((p, q) ∈ E_max) ∨ (∃ r : r ∈ V ∧ (p, r) ∈ E_max ∧ (r, q) ∈ E_max))}

and

E^2h_p = {(p, q) : q ∈ V^2h_p ∧ (p, q) ∈ E_max} ∪ {(r, s) : r ∈ V^2h_p ∧ s ∈ V^2h_p ∧ (p, r) ∈ E_max ∧ (r, s) ∈ E_max}

The set Pos_p is the set of 2-tuples ⟨q, (x_q, y_q)⟩, where x_q and y_q are the Euclidean x, y-coordinates of process q. We do not specify a particular way for p to obtain G^2h_p and Pos_p as the focus of this paper is on the topology generated
Algorithm 1. TCO (process p)
Input: G^2h_p = (V^2h_p, E^2h_p), Pos_p
Output: myPower_p
1: G^min_p(V^min_p, E^min_p) ← findMinCostPathsTree(p, G^2h_p, Pos_p)
2: RNbrs_p ← {q : q ∈ V^min_p ∧ (p, q) ∈ E^min_p}
3: highestSet_p ← {q : (q ∈ RNbrs_p) ∧ (∄ r : (r ∈ RNbrs_p) ∧ (power(p, r) > power(p, q)))}
4: highest_p ← any q such that q ∈ highestSet_p
5: myPower_p ← power(p, highest_p)
by the algorithm (they can be easily obtained by the exchange of messages with topology information). The algorithm is described for a generic process p. First, p calculates a tree of minimum cost paths from itself to each of its 2-hop neighbours (Algorithm 1, line 1). We represent an algorithm that finds this tree by the procedure findMinCostPathsTree. Any algorithm can be used to do this; however, we impose as a requirement that, if the edge (p, q) is a minimum cost path between nodes p and q, this edge is returned as the minimum cost path between these nodes, instead of any other longer path with the same cost that might exist, i.e. the algorithm prefers one-edge paths over longer paths with the same cost. Subsequently, the set RNbrs_p is determined (Algorithm 1, line 2). RNbrs_p contains the set of nodes that are direct neighbours of p in the computed minimum cost paths tree. The set highestSet_p contains the nodes q in RNbrs_p for which the edges (p, q) have the highest cost (Algorithm 1, line 3). We assume that there is a function power(u, v) that determines the minimum power needed by node u to reach node v. As there may be more than one such node, highest_p represents any of these nodes (Algorithm 1, line 4). Finally, the power assigned to p is the minimum power needed by p to reach highest_p (Algorithm 1, line 5).

Let G^R_p = (V^R_p, E^R_p) be the directed graph that represents the relationship between a process p and its reduced set of neighbours, i.e. V^R_p = {p} ∪ RNbrs_p and E^R_p = {(p, q) : q ∈ RNbrs_p}. The resulting topology determined by the algorithm is the graph G_TCO = (V, E_TCO), where E_TCO is defined as follows:

E_TCO = ⋃_{p∈V} E^R_p
TCO generates a topology that is strongly connected and that has the minimum-energy property (see [2] for details).
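A compact Java sketch of this per-node computation is given below; it is not the authors' implementation. The edge cost used here (transmission energy plus reception energy of every node that hears the transmission) is only a hypothetical stand-in for the TCO weight of [2], the numeric constants are the Section 7.1 values, and findMinCostPathsTree is realized with Dijkstra's algorithm. Relaxing distances only on a strictly smaller cost keeps a direct edge on cost ties, which satisfies the one-edge-path preference stated above.

import java.util.*;

// Sketch of TCO's per-node rule (Algorithm 1) over a 2-hop neighbourhood graph.
// The edge cost below is only an illustrative stand-in for the weight of [2].
public class TcoSketch {
    // Hypothetical edge weight: transmitter energy plus reception energy of every
    // node that hears the transmission (the intended receiver and all overhearers).
    static double cost(int u, int v, double[][] pos) {
        double eTxElec = 48e-9, eRxElec = 236.4e-9, eps = 0.016e-12, alpha = 2.0;
        double d = Math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]);
        int hearers = 0;
        for (int w = 0; w < pos.length; w++)
            if (w != u && Math.hypot(pos[u][0] - pos[w][0], pos[u][1] - pos[w][1]) <= d) hearers++;
        return eTxElec + eps * Math.pow(d, alpha) + eRxElec * hearers;
    }

    // Dijkstra over the 2-hop graph of p; RNbrs_p are the direct children of p in
    // the resulting min-cost paths tree. Strictly-smaller relaxation keeps one-edge
    // paths on cost ties, as required of findMinCostPathsTree.
    static Set<Integer> reducedNeighbours(int p, Map<Integer, List<Integer>> g2h, double[][] pos) {
        Map<Integer, Double> dist = new HashMap<>();
        Map<Integer, Integer> parent = new HashMap<>();
        dist.put(p, 0.0);
        PriorityQueue<double[]> pq = new PriorityQueue<>(Comparator.comparingDouble((double[] a) -> a[1]));
        pq.add(new double[]{p, 0.0});
        Set<Integer> settled = new HashSet<>();
        while (!pq.isEmpty()) {
            int u = (int) pq.poll()[0];
            if (!settled.add(u)) continue;
            for (int v : g2h.getOrDefault(u, Collections.emptyList())) {
                double nd = dist.get(u) + cost(u, v, pos);
                if (nd < dist.getOrDefault(v, Double.POSITIVE_INFINITY)) {
                    dist.put(v, nd);
                    parent.put(v, u);
                    pq.add(new double[]{v, nd});
                }
            }
        }
        Set<Integer> rnbrs = new HashSet<>();
        for (Map.Entry<Integer, Integer> e : parent.entrySet())
            if (e.getValue() == p) rnbrs.add(e.getKey());
        return rnbrs;   // myPower_p is then the power needed to reach the costliest node in this set
    }

    public static void main(String[] args) {
        double[][] pos = { {0, 0}, {40, 0}, {80, 0}, {40, 30} };   // made-up coordinates
        Map<Integer, List<Integer>> g2h = new HashMap<>();         // 2-hop neighbourhood of node 0
        g2h.put(0, Arrays.asList(1, 3));
        g2h.put(1, Arrays.asList(0, 2, 3));
        g2h.put(2, Arrays.asList(1));
        g2h.put(3, Arrays.asList(0, 1));
        System.out.println("RNbrs_0 = " + reducedNeighbours(0, g2h, pos));
    }
}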
7 Evaluation
TCO was analysed in terms of energy efficiency in [2,32]. In this section, we present the results of a series of experiments to show that TCO also presents
good performance in terms of interference. The main reason for this is that the edge weight function used in TCO takes into consideration energy efficiency and interference as it encompasses the number of nodes affected by transmissions (interference). We first describe the parameters used in the experiments (Section 7.1). Then, we compare TCO with other topology control algorithms considering the interference metric defined in Section 5 (Section 7.2).
7.1 Experiment Parameters
We built a specific Java program in order to perform the experiments. The communication and energy parameters used in the program were based on values extracted from the Chipcon CC2420 transceiver datasheet [11], since this transceiver is commonly used in wireless sensor networks. In particular, the program models the different transmission power levels of the transceiver (a node transmits with one out of five power levels). The maximum radio range of each node is approximately 100 distance units. To calculate the distance reached by a transmission we needed the path loss, which was calculated considering the assumptions made in [7], assuming a reference distance of 1 unit and a path loss at the reference distance of 54 dBm. The path loss exponent (α in (1)) was assumed to be 2. The energy dissipated by the electronics during transmission (E_TxElec in (1)) and reception (E_RxElec in (2)) was assumed to be 48 nJ/bit and 236.4 nJ/bit, respectively. The value of the constant associated with the power amplifier (ε in (1)) was assumed to be 0.016 pJ/bit/m². These values are based on a supply voltage of 3 V. The number of nodes in each specific experiment (n) ranged from 30 to 300. For each n we generated 30 scenarios. The nodes were randomly spread over a 600x600 region.
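For reference, the sketch below shows how a maximum communication range can be derived from a log-distance path loss model with the 54 dB reference loss at 1 unit and α = 2 quoted above; the discrete transmit power levels and the -95 dBm receiver sensitivity used here are assumptions made only for this illustration and are not taken from the paper.

// Log-distance path loss sketch: PL(d) = PL(d0) + 10*alpha*log10(d/d0), d0 = 1 unit.
// The power levels and the sensitivity are illustrative assumptions.
public class RangeSketch {
    static final double PL0 = 54.0, ALPHA = 2.0, SENSITIVITY_DBM = -95.0;

    static double maxRange(double txPowerDbm) {
        double budget = txPowerDbm - SENSITIVITY_DBM - PL0;   // dB left for distance-dependent loss
        return Math.pow(10.0, budget / (10.0 * ALPHA));       // distance units
    }

    public static void main(String[] args) {
        double[] levels = {0, -5, -10, -15, -25};             // assumed discrete power levels (dBm)
        for (double p : levels)
            System.out.printf("P_tx = %5.1f dBm -> range ~ %.1f units%n", p, maxRange(p));
    }
}

With these assumed values the highest power level yields a range on the order of the 100 units stated above, and lower levels map to correspondingly shorter ranges.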
7.2 Evaluation of Interference
Although the edge weight function used in TCO explicitly contains only energy parameters, it implicitly takes into consideration the number of nodes affected by a transmission because the reception cost is proportional to this number. As TCO minimizes energy, it implicitly minimizes the number of nodes (over)hearing transmissions. This is supported by the very good performance of TCO in terms of interference, presented below. In order to evaluate TCO in relation to interference, we compared it with the following algorithms: API [20], I-RNG [22], LIFE [10], ATASP (based on PICS) [6], XTC [33], Gabriel Graph (GG) [31] and Relative Neighborhood Graph (RNG) [31]. We compared the topologies using the PICS spanning factor, as explained in Section 5. The results of the experiments are summarized in Figures 1 and 2. We compared TCO against these algorithms for the following reasons. As described in Section 2, API, I-RNG, I-LMST and LLISE are the only localized algorithms that address interference explicitly. We did not compare TCO with I-LMST because I-RNG requires fewer message exchanges, implying a smaller
Fig. 1. PICS spanning factor
Fig. 2. Maximum PICS spanning factor
overhead (I-LMST and I-RNG were both proposed in [22]). We compared TCO with LIFE instead of LLISE because LIFE is a centralized algorithm that performs better than LLISE (according to [10]). The ATASP algorithm, as proposed in [6], provides topologies with optimal PIC spanning factor, i.e. ρ(ATASP) = 1. In our experiments, we modified ATASP to use PICS (instead of PIC). ATASP based on PICS also has optimal PICS spanning factor, i.e. σ(ATASP) = 1 (it follows from the way edges are chosen in the final topology found by ATASP). XTC is a very simple algorithm that provides good energy spanners (on average-case graphs) [33]. We compared TCO with GG because GG presented good results in [6] for an interference metric similar to ours. In particular, it outperformed the algorithms CBTC [21] and KNeigh [5]. Finally, we compared TCO with RNG because it is another classical result of computational geometry upon which many topology control results are based. For all algorithms, except GG and RNG, we modelled the transmissions considering the power levels of the transceiver. For GG and RNG we used the exact distance between nodes as these algorithms are based on geometrical properties of the network (the definition of the resulting topology is based on distances between nodes).

Figure 1 shows the mean PICS spanning factor (σ) for the topology control algorithms. Observe that σ presents a constant or decaying behaviour for all algorithms, except for LIFE and API. Observe that, in the case of TCO, σ follows the optimum, which is represented by ATASP (based on PICS) - the lines of TCO and ATASP appear superposed. Figure 2 shows a comparison of the algorithms based on the maximum σ, which corresponds to the worst case for each specific scenario. Observe that the maximum σ increases quickly when compared to the average σ. The relative behaviour of the algorithms is similar, except for XTC, which exhibits a worse performance when the maximum σ is considered. However, except for TCO, I-RNG (after a certain point) and ATASP, the maximum PICS spanning factor tends to increase with network size. Thus TCO and I-RNG (localized algorithms) support better network scalability.
A noteworthy fact is that TCO presented a behaviour which is extremely close to the optimum, even with the increase in the network size. All the other algorithms exhibited an increase in the maximum PICS spanning factor with an increase in the network size. After a certain point the maximum PICS spanning factor decreased for I-RNG (similar to what happened in Figure 1). TCO is, therefore, the most scalable localized algorithm (according to these experiments). Our results are consistent with the results presented in [6]. First, LIFE did not exhibit good performance compared to the other algorithms. In fact, as explained in [6], algorithms based on Minimum Spanning Trees (such as LIFE) are not efficient when considering multihop interference. Second, the PIC spanning factor increases for GG and RNG as the network density increases and GG outperforms RNG. Similar behaviour occurred for the maximum σ, as can be seen in Figure 2 - recall that the PIC spanning factor is based on maximum values. We also ran additional experiments, first with higher density, i.e., the same number of nodes but on 400x400 and 200x200 areas, and then with α = 3 (200x200 region with 300 nodes). In all these experiments, the behaviour of TCO was similar to the previous cases. TCO exhibited an extremely good performance, outperforming the other algorithms and again matching the performance of ATASP (based on PICS). We do not present these results here due to lack of space.

TCO differs from GG, RNG, API, XTC and I-RNG in that these protocols are based on the simple triangle inequality, i.e., they consider paths of up to two hops when searching for interference-efficient paths, while TCO considers (in the two-hop vicinity of nodes) general paths (with length k, k ≥ 2). This might be one reason for the better performance of TCO in relation to these algorithms. It is important to note that TCO is not the algorithm that removed the highest number of edges from the original graph. In fact, LIFE and GG removed many more edges than TCO. According to our experiments and the metric we used, TCO exhibited an extremely good performance. When considering multihop interference, TCO, which is a distributed localized topology control algorithm (and thus based on partial knowledge of the graph), exhibited a performance equivalent to an optimal global algorithm, i.e. one based on information about the whole network. Additionally, TCO minimizes multihop interference and energy expenditure at the same time (i.e. it does not provide a compromise solution between these two goals).
8 Conclusion
This paper presented TCO as a distributed and localized topology control algorithm for optimizing interference. TCO takes into consideration the overhearing cost and generates topologies that might have asymmetric links. As these characteristics of TCO deviate from the assumptions adopted in many previous works on topology control, we presented arguments to support them. Further, we demonstrated that, by considering overhearing, we obtained an algorithm that minimizes energy and is very efficient in terms of interference.
In order to compare TCO with other algorithms, we defined a new interference metric (σ). This metric is different from previous ones because it is based on the sender-centric perspective, on the notion of interference along multihop paths and on the node-based approach. The experiments were performed taking into consideration discrete transmission power levels. The values used in the experiments were based on values of a transceiver commonly used in real sensor networks (CC2420). We compared TCO against existing localized topology control algorithms for interference. According to our experiments and the metric used, TCO outperformed all the other algorithms and, more importantly, its performance was equivalent to the performance of a centralized algorithm (optimal for the related PICS spanning factor).
References

1. Ansari, J., Ang, T., Mahonen, P.: Spectrum agile medium access control protocol for wireless sensor networks. In: SECON, Boston, MA, USA, pp. 1–9 (2010)
2. Assis, F., Telemaco Neto, U.: A topology control algorithm for wireless sensor networks that considers overhearing. In: ISADS, Athens, Greece, pp. 1–6 (2009)
3. Atmel: Atmel RF230 datasheet (April 2009), http://www.ti.com/lit/gpn/cc2530
4. Banner, R., Orda, A.: Multi-objective topology control in wireless networks. In: INFOCOM, pp. 448–456 (2008)
5. Blough, D.M., Leoncini, M., Resta, G., Santi, P.: The k-neigh protocol for symmetric topology control in ad hoc networks. In: MobiHoc, New York, NY, USA, pp. 141–152 (2003)
6. Blough, D.M., Leoncini, M., Resta, G., Santi, P.: Topology control with better radio models: implications for energy and multi-hop interference. In: MSWiM 2005, pp. 260–268. ACM, New York (2005)
7. Braem, B., Latré, B., Moerman, I., Blondia, C., Reusens, E., Joseph, W., Martens, L., Demeester, P.: The need for cooperation and relaying in short-range high path loss sensor networks. In: SENSORCOMM, Washington, USA, pp. 566–571 (2007)
8. Buettner, M., Yee, G.V., Anderson, E., Han, R.: X-MAC: a short preamble MAC protocol for duty-cycled wireless sensor networks. In: SenSys 2006, pp. 307–320. ACM, New York (2006)
9. Burkhart, M.: Analysis of interference in ad hoc networks. Master's thesis, Swiss Federal Institute of Technology Zurich (2003)
10. Burkhart, M., von Rickenbach, P., Wattenhofer, R., Zollinger, A.: Does topology control reduce interference? In: MobiHoc, New York, NY, USA, pp. 9–19 (2004)
11. Chipcon: Chipcon CC2420 datasheet (2007), http://www.ti.com/lit/gpn/cc2420
12. Chipcon: Chipcon CC2530 datasheet (2009), http://www.ti.com/lit/gpn/cc2530
13. Damian, M., Javali, N.: Distributed construction of bounded-degree low-interference spanners of low weight. In: MobiHoc, New York, NY, USA, pp. 101–110 (2008)
14. Deng, J., Han, R., Mishra, S.: INSENS: Intrusion-tolerant routing for wireless sensor networks. Technical Report CU-CS-939-02, University of Colorado, Department of Computer Science (2002)
15. El-Hoiydi, A., Decotignie, J.D.: WiseMAC: an ultra low power MAC protocol for the downlink of infrastructure wireless sensor networks. In: ISCC 2004, vol. 1, pp. 244–251 (2004)
16. Fussen, M., Wattenhofer, R., Zollinger, A.: Interference arises at the receiver. In: WIRELESSCOM, vol. 1, pp. 427–432 (2005)
17. Gao, Y., Hou, J.C., Nguyen, H.: Topology control for maintaining network connectivity and maximizing network capacity under the physical model. In: INFOCOM, pp. 1013–1021 (2008)
18. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications 1(4), 660–670 (2002)
19. IEEE: IEEE Std 802.15.4 - Part 15.4: Wireless medium access control (MAC) and physical layer (PHY) specifications for low-rate wireless personal area networks (2006)
20. Johansson, T., Carr-Motyčková, L.: Reducing interference in ad hoc networks through topology control. In: DIALM-POMC, New York, USA, pp. 17–23 (2005)
21. Li, L., Halpern, J.Y., Bahl, P., Wang, Y.M., Wattenhofer, R.: A cone-based distributed topology-control algorithm for wireless multi-hop networks. IEEE/ACM Trans. Netw. 13, 147–159 (2005)
22. Li, X.Y., Moaveni-Nejad, K., Song, W.Z., Wang, W.Z.: Interference-aware topology control for wireless sensor networks. In: SECON, pp. 263–274 (December 2005)
23. Liu, Y., Zhang, X., Liu, Q., Dai, S.: A hybrid interference model-based topology control algorithm. In: NCM, vol. 1, pp. 42–46 (2008)
24. Misra, S., Mohanta, D.: Adaptive listen for energy-efficient medium access control in wireless sensor networks. Multimed. Tools Appl. 47(1), 121–145 (2010)
25. Moaveni-Nejad, K., Li, X.-Y.: Low-interference topology control for wireless ad hoc networks. Ad Hoc & Sensor Wireless Networks 1(1) (2005)
26. Moscibroda, T., Wattenhofer, R.: Minimizing interference in ad hoc and sensor networks. In: DIALM-POMC, pp. 24–33 (2005)
27. Polastre, J., Hill, J., Culler, D.: Versatile low power media access for wireless sensor networks. In: SenSys 2004, pp. 95–107. ACM, New York (2004)
28. Rappaport, T.S.: Wireless Communications: Principles & Practice. Prentice Hall, Englewood Cliffs (1996)
29. von Rickenbach, P., Schmid, S., Wattenhofer, R., Zollinger, A.: A robust interference model for wireless ad-hoc networks. In: WMAN, Denver, Colorado, USA (2005)
30. Santi, P.: Topology control in wireless ad hoc and sensor networks. ACM Comput. Surv. 37(2), 164–194 (2005)
31. Santi, P.: Topology Control in Wireless Ad Hoc and Sensor Networks. Wiley, Chichester (2005)
32. Telemaco, U.: Topology control algorithm for wireless sensor networks that considers overhearing. Master's thesis, Federal University of Bahia (2009) (in Portuguese)
33. Wattenhofer, R., Zollinger, A.: XTC: A practical topology control algorithm for ad-hoc networks. In: IPDPS (2004)
34. Wu, K.-D., Liao, W.: Revisiting topology control for multi-hop wireless ad hoc networks. IEEE Transactions on Wireless Communications 7(9), 3498–3506 (2008)
35. Wu, X., Ananda, A.L., Chan, M.C.: PMC: An energy efficient event transport service for wireless sensor networks. In: ICC, Istanbul, Turkey, vol. 1, pp. 355–360 (2006)
36. Zhang, X., Ansari, J., Mahonen, P.: Traffic aware medium access control protocol for wireless sensor networks. In: MobiWac, Tenerife, Spain, pp. 140–148 (2009)
37. Zhang, Y., Gu, D., Li, J.: Exploiting unidirectional links for key establishment protocols in heterogeneous sensor networks. Comput. Commun. 31, 2959–2971 (2008)
Fault Tolerant Interference-Aware Topology Control for Ad Hoc Wireless Networks

Md. Ehtesamul Haque¹ and Ashikur Rahman²

¹ Dept. of CS, Rutgers University, NJ 08854, USA
[email protected]
² Dept. of CS, Univ. of Calgary, T2N 1N4, Canada
[email protected]
Abstract. Interference imposes a major challenge for efficient data communication in wireless networks. An increased level of interference may increase the number of collisions, the energy consumption and the latency due to retransmissions of the interfered data. The interference of a link is the number of nodes interfered while a pair of nodes is communicating over a bi-directional link. Approaches have been proposed to reduce interference by dropping links that create high interference. However, dropping links makes a network more susceptible to node failure/departure, a frequent phenomenon in ad hoc networks. Thus dropping high-interference links while keeping the network sufficiently connected is an important goal to achieve. In this paper, we formulate the problem of constructing minimum interference path preserving and fault tolerant wireless ad hoc networks and then provide algorithms, both centralized and distributed with local information, to solve the problem. Moreover, for the first time in the literature, we introduce the concept of a fault tolerant interference spanner and provide a local algorithm to construct such a spanner of a communication graph.
1 Introduction
Wireless ad hoc networks are said to be networks without networking. Due to their potential benefits, applications and ease of deployment, wireless ad hoc networks have gained popularity among researchers during the last two decades. Research on generic wireless ad hoc networking has also branched into special types of networking such as wireless mesh networks, wireless sensor networks, vehicular networks, underwater acoustic networks, radio frequency identification (RFID) networks, etc. One of the prime challenges in all types of wireless networks is channel interference caused by parallel transmissions. Roughly, a node a may interfere node b if a transmission of a is unintentionally received by b. Interference may cause collisions and thus reduce throughput and increase delay. Since energy is scarce at the devices of ad hoc networks, designing energy-optimized protocols is a necessity in such networks. Reducing interference may reduce the number of collisions, which in turn reduces the number of retransmissions. Thus less interference may lead to lower energy consumption and a longer network lifetime.
Nodes in ad hoc networks may reduce interference by lowering their transmission powers. The idea is to reduce the covered area by reducing the transmission power and thereby interfere with fewer nodes. Moreover, nodes may identify links with high interference and avoid transmitting over those links to reduce interference. Besides reducing interference, nodes can increase the spatial reuse of frequencies and lessen MAC-level contention by using shorter and less interfering links. However, dropping links of the network makes the network more susceptible to node/link failure. Since nodes may join or leave ad hoc networks frequently and links in wireless networks are unreliable due to high bit error rates, node and link failures may be quite disastrous, especially in the case of ad hoc networks. The problem can be mitigated if an adequate level of path redundancy can be properly embedded into the topology control algorithm. Specifically, a (k + 1)-connected network topology may continue to operate with at most k node/link failures.

Besides providing fault tolerance, a topology control algorithm may also provide minimum interference paths. Informally, a minimum interference path is the path between a pair of nodes (x, y) that enables communication between x and y while producing minimum interference in the network. Preservation of one or more least-interference paths allows the routing layer to select a path with minimum interference for routing data packets in the first place and to keep selecting low-interference paths in the event of failure of nodes or links on the minimum interference path. The idea of this unification, i.e., providing a solution to both fault tolerance and minimum interference using a single algorithm, is the major contribution of this paper.

In this paper, we have formulated two variations of the combined fault tolerance and interference aware topology control problem for ad hoc wireless networks:

Minimum Interference Bi-connected Communication Networks (MIBCN): Given a communication graph, construct an induced subgraph, by reducing the total number of edges, which is bi-connected and in which the minimum interference path between every pair of nodes is preserved.

Bi-connected Interference-aware Spanners (BIS): Given a communication graph, construct an induced subgraph, by reducing the total number of edges, which is bi-connected and in which, between any pair of nodes u and v, there exist (at least) two node-disjoint paths with interference less than t (> 1) times the interference of the minimum-interference path between u and v.

For both the MIBCN and the BIS problem we provide a central and a distributed local algorithm. Central algorithms assume that global topology information is available. Local algorithms, on the other hand, assume only limited neighborhood information (up to 2 hops). Every node locally runs this algorithm to generate an induced subgraph of the neighborhood graph, and the superposition of every node's local computation constructs a graph which preserves the desired properties globally. We have also provided an explicit procedure for calculating
link interference locally, which was not available in the literature before. Through rigorous experimental results we show the effectiveness of our algorithms.

The rest of the paper is organized as follows. We summarize related work in Section 2. Section 3 discusses how to quantitatively measure interference and also provides formal definitions of the problems. Section 4 and Section 5 provide the algorithms and their correctness for the MIBCN and BIS problems, respectively. Simulation results are discussed in Section 6, and Section 7 concludes the paper.
2 Related Work
Most of the works on topology control address energy minimization. Rodoplu et al. [18] first conceived the idea of minimum energy mobile wireless networks. They provided a distributed algorithm to reduce the transmission ranges of the nodes while keeping the network connected and preserving minimum energy paths between each pair of nodes. Later on, their work was improved in [13]. However, none of these works addresses the issue of interference explicitly. The work that most closely matches ours is FLSS, provided in [10]. However, there are some major differences: (i) FLSS focuses on energy while we focus on interference; measuring interference is a nontrivial task, as we discuss while solving the MIBCN problem locally; (ii) FLSS does not preserve minimum energy paths while our local solution to MIBCN preserves minimum interference paths; and (iii) FLSS needs special treatment of links with tied weights while tied link weights do not create any problem in our solution.

There was an implicit assumption that minimizing energy also minimizes interference, due to the fact that energy minimization tends to drop longer links as energy consumption increases rapidly with the length of the links. However, Burkhart et al. [5] first contradicted this assumption. By giving a specific definition and quantification of interference, they showed that recently proposed energy-aware algorithms may even show the worst performance in terms of interference. Since then, there have been a myriad of works focusing on explicit interference minimization in topology control research. Li et al. [12] and Moaveninejad et al. [16] provided central and local algorithms to minimize the maximum and average interference of a network by constructing an interference-based single-hop local Minimum Spanning Tree (I-MST). In their algorithm each node builds a local MST over its own neighborhood graph, and the topology becomes a global spanning tree when the superposition of all the nodes' local computations is considered. Xu et al. [20] defined path interference formally and provided a local algorithm to construct topologies with minimal path interference. However, their work does not address the issue of fault tolerance at all. There are some recent works addressing fault tolerance [17], [15], [1]. Penrose [17] first studied k-connectivity in a geometric random graph. Although the result is similar to the result given in [2], Penrose showed that a graph becomes k-connected almost certainly when the minimum degree of the graph is k. The result is significant because it relates a global property, k-connectivity, to a local property, node degree. None of these works considers interference explicitly.
Next we move on to the works related to spanners. After the introduction of spanners in the form of geometric spanners by Chew [6], spanners have found applications in different areas. The relative neighborhood graph (RNG) and the Gabriel graph (GG) are spanning structures that were first used for routing in communication networks by Karp et al. [9] and Bose et al. [3]. The concept of the Yao graph [21] is used by [19] and [14] to generate spanner topologies. Localized Delaunay triangulations [11] and the Restricted Delaunay Graph (RDG) [8] are two other planar spanners for ad hoc wireless networks. These works do not consider interference explicitly. Burkhart et al. [5] proposed a central (LISE) and a local (LLISE) algorithm to construct interference-aware length spanners that do not provide fault tolerance. Li et al. [12] provided a local algorithm to construct interference-aware power spanners. Xu et al. [20] provided an algorithm to construct a minimal energy interference spanner. Czumaj et al. [7] provided a greedy algorithm that constructs a k-fault-tolerant distance spanner in which every vertex is of degree O(k). Note that, to the best of our knowledge, there is no work addressing fault tolerant interference spanners.
3 Problem Formulation

3.1 Preliminaries
Formally, an ad hoc wireless network can be modeled by a communication graph G = (V, E) where each node of the network corresponds to a vertex v ∈ V and each link corresponds to an edge e ∈ E. A (wireless) link between u and v exists if u and v are within the mutual transmission range of each other. A topology control algorithm computes a subgraph of G preserving some desirable properties and satisfying some given constraints. We assume that nodes can vary their transmission ranges depending on the position of the immediate receiver, up to a maximum which is fixed for every node. We also assume that each node knows its position via a GPS device or similar techniques.

Interference is quantitatively measured as follows. Suppose a node u is transmitting to node v; another node w interferes with the reception at v if v is unable to successfully receive the transmission from u due to the transmission from w. Generally, a node v may not correctly decode the transmission from node u if the signal to interference and noise ratio (SINR) perceived by v is below a certain threshold. While this threshold depends on various factors, like the antenna sensitivity of receivers, signal modulation techniques and other environmental factors, it is well understood that a third node w interferes with node v's signal reception from node u if w is located near v and transmits simultaneously with u. The distance (region) within which a node w interferes with another node is called the interference range (region) of w. For simplicity of analysis, we assume that the interference range and the transmission range are the same for any node w. However, the solution is also applicable where this assumption does not hold.
While sending some data to node v, a node u interferes the nodes that are nearer to u than v. The set of nodes interfered by u's transmission to v is called the unidirectional interference set from u to v and is denoted by UIS(u, v). Formally, let dis(u, v) be the Euclidean distance between u and v; then:

UIS(u, v) = {w ∈ V | dis(u, w) ≤ dis(u, v)}

Note that, due to directional dependency, UIS(u, v) and UIS(v, u) are generally not equal. The set of the nodes interfered while v is communicating to u is:

UIS(v, u) = {w ∈ V | dis(v, w) ≤ dis(v, u)}

The number of nodes interfered by u's transmission to v is called the unidirectional interference number from u to v and is denoted by UIN(u, v). It is easy to see that UIN(u, v) = |UIS(u, v)|. The interference set of an edge e = (u, v) is the set of nodes that are interfered while either u or v is transmitting to the other (one might be sending data and the other might be acknowledging), and hence it is always considered as bidirectional. Mathematically, the interference set IS of an edge e = (u, v) is

IS(u, v) = UIS(u, v) ∪ UIS(v, u)

The interference number of an edge e = (u, v), denoted by IN(u, v), is the cardinality of the interference set IS(u, v). Thus, IN(u, v) ≤ UIN(u, v) + UIN(v, u). Finally, the interference number of a path R, denoted by INP(R), is the sum of the interference numbers of the edges of R, i.e. if R = <v_1, v_2, ..., v_n> is a v_1 ∼ v_n path then

INP(R) = Σ_{i=1}^{n−1} IN(v_i, v_{i+1}) = Σ_{i=1}^{n−1} |IS(v_i, v_{i+1})|

Suppose R_p is the set of paths between v_1 and v_n. Then the minimum interference path (MIP) between v_1 and v_n is mathematically defined as

MIP(v_1, v_n) = min_{R∈R_p} INP(R)
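To make these definitions concrete, the following Java sketch computes UIS, IN and INP for a toy deployment; the node coordinates and the example path are invented for illustration only, and the endpoints of an edge are not counted as interfered, as in the example of Fig. 1 below.

import java.util.*;

// Sketch of UIS(u,v), IN(u,v) and INP(R) for a toy set of node positions.
// Coordinates and the example path are illustrative only.
public class InterferenceMetrics {
    static double[][] pos = { {0,0}, {25,0}, {50,5}, {25,30}, {75,0} };

    static double dis(int a, int b) {
        return Math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]);
    }

    // UIS(u,v): nodes w (other than u and v) with dis(u,w) <= dis(u,v)
    static Set<Integer> uis(int u, int v) {
        Set<Integer> s = new HashSet<>();
        for (int w = 0; w < pos.length; w++)
            if (w != u && w != v && dis(u, w) <= dis(u, v)) s.add(w);
        return s;
    }

    // IN(u,v) = |UIS(u,v) ∪ UIS(v,u)|
    static int in(int u, int v) {
        Set<Integer> s = uis(u, v);
        s.addAll(uis(v, u));
        return s.size();
    }

    // INP(R) = sum of IN over consecutive edges of the path R
    static int inp(int[] path) {
        int sum = 0;
        for (int i = 0; i + 1 < path.length; i++) sum += in(path[i], path[i + 1]);
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("IN(0,1)      = " + in(0, 1));
        System.out.println("INP(0-1-2-4) = " + inp(new int[]{0, 1, 2, 4}));
    }
}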
3.2 Problems Definition
We are given a bi-connected communication network G = (V, E) where each vertex denotes a wireless node and the weight of an edge (u, v) is IN(u, v). We focus on the following two problems:

Problem 1 (MIBCN). Construct a subgraph H = (V, E′) of G by reducing the total number of edges such that, for every pair of vertices u, v ∈ V, the minimum interference path between u and v is preserved and u, v remain connected even after the deletion of any vertex w ∈ V, w ≠ u, v.
Problem 2 (BIS). Construct a subgraph H = (V, E′) of G by reducing the total number of edges such that, for every pair of vertices u, v ∈ V, there exist (at least) two node-disjoint paths R_1 and R_2 between u and v in H such that both INP(R_1) and INP(R_2) are less than or equal to t × MIP(u, v) in G, for any positive constant t > 1.

The subgraph constructed in problem BIS is called an interference t-spanner and the parameter t is known as the dilation factor or stretch factor of the spanner. Moreover, this solution is guaranteed to be bi-connected due to the way the problem is formulated.
4 Minimum Interference Path Preserving Fault Tolerant Structures
4.1 Centralized Algorithm for the MIBCN Problem
At first, we propose a central algorithm for the MIBCN problem using global knowledge of every node's location. This algorithm is primarily of theoretical interest. The algorithm starts with the calculation of IS(u, v) and IN(u, v) for each edge e = (u, v). The weight of each edge is then assigned to be IN(u, v). After that, the algorithm satisfies the two constraints specified in the MIBCN problem for each pair of vertices. For the first constraint, the minimum interference path between each pair of vertices is determined and added to the output topology. This minimum interference path is then deleted (both vertices and edges) from the original graph and a second minimum interference path, node-disjoint with the first one, is computed. This path is also added to the output topology. This simple method ensures conformity to both constraints for each pair of vertices. The algorithm is listed as Algorithm 1. The subgraph generated by this algorithm is called a Central Minimum Interference Bi-connected Communication Network (CMIBCN).

Algorithm 1. Algorithm to construct CMIBCN
for each edge e with endpoints (u, v) do
    Calculate the interference of edge e by determining the cardinality of the set IS(e)
end for
E_o = ∅
for each node pair (x, y) do
    Select the minimum interference path P_1 between x ∼ y
    Delete all the nodes ∈ P_1 except x and y
    Select the minimum interference path P_2 between x ∼ y
    E_o = E_o ∪ E(P_1) ∪ E(P_2)
    Put back all the deleted nodes and edges ∈ P_1
end for
The edges in E_o are the final output
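A compact Java sketch of this procedure follows; it is not the authors' implementation. The node positions and the common maximum range that define the communication graph are assumptions made only for this illustration, and the minimum interference path is computed with a simple Dijkstra search over IN edge weights.

import java.util.*;

// Sketch of Algorithm 1 (CMIBCN): for every node pair, keep the edges of the
// minimum interference path and of a second, node-disjoint low-interference path.
// Node positions and the common maximum range are illustrative assumptions.
public class CmibcnSketch {
    static double[][] pos = { {0,0}, {30,0}, {60,0}, {30,25}, {60,25}, {90,10} };
    static double RANGE = 40.0;
    static int n = pos.length;

    static double dis(int a, int b) {
        return Math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]);
    }

    // IN(u,v): nodes (other than u, v) within dis(u,v) of either endpoint.
    static int in(int u, int v) {
        int c = 0;
        for (int w = 0; w < n; w++)
            if (w != u && w != v && (dis(u, w) <= dis(u, v) || dis(v, w) <= dis(u, v))) c++;
        return c;
    }

    // Dijkstra over non-banned nodes with IN as edge weight; returns the node path or null.
    static List<Integer> mip(int s, int t, boolean[] banned) {
        double[] d = new double[n];
        int[] prev = new int[n];
        Arrays.fill(d, Double.POSITIVE_INFINITY);
        Arrays.fill(prev, -1);
        d[s] = 0;
        boolean[] done = new boolean[n];
        for (int it = 0; it < n; it++) {
            int u = -1;
            for (int v = 0; v < n; v++)
                if (!done[v] && !banned[v] && (u == -1 || d[v] < d[u])) u = v;
            if (u == -1 || d[u] == Double.POSITIVE_INFINITY) break;
            done[u] = true;
            for (int v = 0; v < n; v++) {
                if (v == u || banned[v] || dis(u, v) > RANGE) continue;   // not a link of G
                double nd = d[u] + in(u, v);
                if (nd < d[v]) { d[v] = nd; prev[v] = u; }
            }
        }
        if (d[t] == Double.POSITIVE_INFINITY) return null;
        List<Integer> path = new ArrayList<>();
        for (int v = t; v != -1; v = prev[v]) path.add(0, v);
        return path;
    }

    static void addEdges(Set<String> eo, List<Integer> path) {
        for (int i = 0; i + 1 < path.size(); i++) {
            int a = Math.min(path.get(i), path.get(i + 1)), b = Math.max(path.get(i), path.get(i + 1));
            eo.add(a + "-" + b);
        }
    }

    public static void main(String[] args) {
        Set<String> eo = new TreeSet<>();                  // output edge set E_o
        for (int x = 0; x < n; x++)
            for (int y = x + 1; y < n; y++) {
                boolean[] banned = new boolean[n];
                List<Integer> p1 = mip(x, y, banned);      // minimum interference path P_1
                if (p1 == null) continue;
                addEdges(eo, p1);
                for (int v : p1) if (v != x && v != y) banned[v] = true;   // delete internal nodes
                List<Integer> p2 = mip(x, y, banned);      // second, node-disjoint path P_2
                if (p2 != null) addEdges(eo, p2);
            }
        System.out.println("E_o = " + eo);
    }
}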
Theorem 1. CMIBCN is a minimum interference path preserving bi-connected communication network which can be built in time complexity O(|V |2 |E|). Proof. The most important part of Algorithm 1 is the second for loop. It executes once for each pair of nodes. Path P1 is the minimum interference path between a pair of nodes. Since all the edges of P1 is retained in the final output, Eo must contain the edges that are on the minimum interference path between a pair of nodes. Again, P2 is obtained after deleting all the internal vertices of P1 . Thus presence of the edges of P2 ensures that there is another path which is node disjoint with the minimum interference path in the final output. So, Algorithm 1 constructs a topology which preserves minimum interference path and provides a path which is node disjoint with the minimum interference path for each pair of nodes. Now, the time complexity of first for loop can be at most O(|E| × |V |) since the maximum cardinality of IS(e) is equals to |V | − 2. Second for loop runs |V | times and each step can be completed in O(E) steps. Thus the total time 2 complexity of Algorithm 1 is O(|E||V | + |V |2 |E|) = O(|V |2 |E|). 4.2
4.2 Local Distributed Algorithm for the MIBCN Problem
Next we move on to locally constructing minimum interference path preserving bi-connected communication networks. Our algorithm works in the following three steps: (i) Neighbor Discovery: Each node gathers information about its neighbors and determines the interference numbers of the associated edges. (ii) Topology Generation: Each node independently executes an algorithm to decide on the neighborhood set that suffices to preserve the minimum interference path and the bi-connectivity property. (iii) Bi-directional Topology Generation: Each node sends a PROPOSE message to its neighbors in the output topology. If any of the two endpoints agrees to keep an edge e, then e is retained in the bi-directional topology. This is an optional step. Now we describe each of these steps in more detail.

Neighbor Discovery. Neighbor discovery is needed to get a measure of the interference in the locality of a node. Each node initiates this process by sending a HELLO message that contains the id and position of the sending node. Note that measuring the interference with one-hop neighbors requires knowledge of two-hop neighbors. The reason is explained in Figure 1. Here the interference of the edge (u, v) is being determined. The transmission range of node u is shown as a dashed circle. The two solid circles centered at u and v enclose the nodes that are interfered with while there is bi-directional communication over the link (u, v). So IS(u, v) = {a, b, c, d, e, f, g, h}. However, node h is out of the transmission range of u. Node h is a one-hop neighbor of v and thus a two-hop neighbor of u. In general, to measure interference up to h hops, knowledge of up to h + 1 hop neighbors is required. The knowledge of two-hop neighbors is gathered from the HELLO messages sent by the one-hop neighbors.
Fig. 1. Determining edge level interference (figure: nodes a–j in the neighborhood of u and v)

HELLO message of u:
  Neighbor             a  b  c  d  e  v  i  ...
  Interference number  0  1  2  3  4  5  6  ...

HELLO message of v:
  Neighbor             f  g  e  d  h  u  j  ...
  Interference number  0  1  2  3  4  5  6  ...

Fig. 2. HELLO messages
However, we propose not to add the position of each neighbor to the HELLO message. Rather, HELLO messages contain a series of tuples <node id, interference number>, where the interference number is the total number of nodes interfered with when the sender of this message transmits to the specific neighbor denoted by node id. More specifically, node u calculates the interference number for a neighbor v by determining the cardinality of the set {w ∈ V | dis(u, w) ≤ dis(u, v)}. We term this technique limited two-hop information propagation. This mechanism reduces the transmitted data by almost 75% compared to the traditional two-hop information propagation in <id, position> format. Let us describe the process of determining interference using an example. Consider Figure 1. To determine the interference of the edge (u, v), node u must know the elements of two sets: the set of nodes that are interfered with when u transmits to v, and the set of nodes that are interfered with when v transmits to u. Not all nodes of the second set need to be neighbors of u. Thus u must rely on v to learn about these nodes. Figure 2 shows the HELLO messages transmitted by u and v with respect to the network shown in Figure 1. The number below each neighbor is the interference number from u to that neighbor. For example, u has added a column with a and 0 to its HELLO message, which means that u does not interfere with any node while transmitting to a. Note that a may interfere with some nodes while transmitting to u, but that is not included in u's HELLO message. Similarly, u interferes with 3, 4 and 5 nodes while transmitting data to nodes d, e and v, respectively. Some implicit information about the distance to each neighbor is also conveyed in this kind of HELLO message. The HELLO message of u stating that it does not interfere with any node while transmitting to a means that a is the nearest neighbor of u. Moreover, the interference number from u to a, IN(u, a) = 0, and the interference number from u to b, IN(u, b) = 1, indicate that a is the only node that is interfered with when u transmits to b. Now we discuss how node u can determine the interference number of the edge (u, v) from the HELLO messages of u and v. The HELLO message of v tells that it can transmit to node u while interfering with 5 nodes, and that the set of nodes that are nearer to v than u is {f, g, e, d, h}. Again, node u itself determines that it interferes with 5
nodes while transmitting to v, and that the set of interfered nodes is {a, b, c, d, e}. Thus node u can easily determine the interference set of the edge (u, v): IS(u, v) = {f, g, e, d, h} ∪ {a, b, c, d, e} = {a, b, c, d, e, f, g, h}, and hence IN(u, v) = |IS(u, v)| = 8. In this way, node u can determine the interference of all the edges of its neighborhood graph, Gu.

Definition 1 (Neighborhood Graph). Let G = (V, E) be a communication graph and Nu be the neighborhood of node u, i.e., Nu is the set {w | w ∈ V and dis(u, w) ≤ rmax}, where rmax is the maximum transmission range of u. Then the Neighborhood Graph of u, Gu = (V(Gu), E(Gu)), is the subgraph of G induced by V(Gu) = Nu.

The neighbor discovery procedure finishes once a node u has gathered the full information about its neighborhood graph, Gu.

Algorithm 2. Algorithm to construct LMIBCN
  Run the Neighbor Discovery procedure and construct the neighborhood graph Gu = (Vu, Eu)
  Nf = φ
  for each node v ∈ V(Gu) do
    if there is more than one node disjoint path between u and v with interference less than the interference of the edge (u, v) then
      Nf = Nf ∪ {v}
    end if
  end for
  Nl = V(Gu) − Nf is the neighborhood set of u
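As an illustration only, the following Python sketch shows how a node could combine its own HELLO tuples with a neighbor's to obtain IS(u, v) and IN(u, v); the data structures are hypothetical and not taken from the paper, but the example reproduces the values of Figures 1 and 2.

```python
# Illustrative sketch: reconstructing IS(u, v) and IN(u, v) from HELLO tuples,
# which list <node_id, interference_number> in increasing order of distance.

def nodes_interfered_by(hello, target):
    """Neighbors listed in `hello` that are closer to the sender than `target`,
    i.e. the nodes interfered with when the sender transmits to `target`."""
    own_number = dict(hello)[target]
    return {node for node, num in hello if num < own_number}

def interference_of_edge(u, v, hello_u, hello_v):
    is_uv = nodes_interfered_by(hello_u, v) | nodes_interfered_by(hello_v, u)
    return is_uv, len(is_uv)

# Example from Figures 1 and 2:
hello_u = [("a", 0), ("b", 1), ("c", 2), ("d", 3), ("e", 4), ("v", 5), ("i", 6)]
hello_v = [("f", 0), ("g", 1), ("e", 2), ("d", 3), ("h", 4), ("u", 5), ("j", 6)]
IS_uv, IN_uv = interference_of_edge("u", "v", hello_u, hello_v)
# IS_uv == {"a", "b", "c", "d", "e", "f", "g", "h"}, IN_uv == 8
```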
Topology Generation. In this step each node executes the local algorithm to generate a globally minimum interference and bi-connected topology. A node u generates the subgraph Hu of its neighborhood graph Gu such that V(Hu) = V(Gu) and, for any vertex v ∈ V(Gu), the minimum interference path from u to v is preserved in Hu and more than one path from u to v is also preserved. The basic idea is to keep an edge e unless we know that the graph preserves the minimum interference and bi-connectivity properties without e. We preserve the minimum interference path, delete it, and then preserve the second minimum interference path for each pair of nodes (u, v), where u is the node executing the algorithm and v is a neighbor of u. The steps of the algorithm executed by u are shown in Algorithm 2. An instance of this algorithm is executed individually by all nodes, and the union of the subgraphs locally produced by all nodes is called the Local Minimum Interference Bi-connected Communication Network (LMIBCN).

Bi-Directional Topology. Bi-directional links facilitate desirable features such as link-level acknowledgment transmission, dynamic route discovery, etc. To generate a bi-directional topology, directional edges must be eliminated, either by making them bi-directional through the addition of edges in the opposite direction or by deleting the directional edges completely. Both
of these methods will work for constructing LMIBCN, so we discuss the approach of adding directional edges. Each node u sends a PROPOSE message to every node v ∈ Nl. A node v receiving a PROPOSE message from u checks its own Nl for the presence of u. If u is not present, then v adds u as a neighbor in the final neighbor list Nl+.

Proof of Correctness. To prove the correctness of LMIBCN we prove that Algorithm 2 constructs a topology that (i) preserves minimum interference paths and (ii) is bi-connected. For the first property, we have the following lemma.

Lemma 1. LMIBCN is a topology in which all minimum interference paths are preserved.

Proof. Assume, by contradiction, that there is a pair of nodes x and y for which the minimum interference path is not preserved. Let the minimum interference path between x and y in graph G be Pxy = <x = v1, v2, . . . , vn = y>. If Pxy is not preserved in LMIBCN then there must be at least one edge (vi, vi+1) which is not preserved. Thus, when Algorithm 2 was executed at node vi, it must have found at least two vi ∼ vi+1 paths with less interference than the edge (vi, vi+1). But Pxy is the minimum interference path between x and y and (vi, vi+1) is an edge of this path. So, the direct edge between vi and vi+1 must be the minimum interference path between them; otherwise, we could replace the edge (vi, vi+1) in Pxy by a lower interference path between them and obtain a path between x and y with lower interference than Pxy, which is a contradiction. Thus, every edge on every minimum interference path is preserved by Algorithm 2. Therefore, all minimum interference paths are preserved in LMIBCN.

Now we move on to the proof of the bi-connectivity property.

Lemma 2. Let u and v be two vertices in a bi-connected graph G. If u and v are bi-connected after removal of the edge (u, v), then G − (u, v) is also bi-connected.

Proof. Let x and y be two arbitrary vertices. We have to prove that x and y are bi-connected in G − (u, v). Since G is a bi-connected graph, there must be at least two node disjoint paths between x and y. Suppose P1 and P2 are two such paths. Nodes u and v are bi-connected both in G and in G − (u, v). Let P and P′ be two node disjoint paths between u and v in G − (u, v). Since P1 and P2 are node disjoint paths, they cannot both contain the edge (u, v). We have the following cases: (i) If neither P1 nor P2 contains the edge (u, v), then both paths remain intact in G − (u, v), so x and y are bi-connected in G − (u, v). (ii) Since at most one of them can contain the edge (u, v), we assume, without loss of generality, that P1 contains the edge (u, v). Path P1 can then be divided into three parts: a path from x to u, Pxu, then the edge (u, v), and finally a path from v to y, Pvy. Figure 3 gives a depiction of the paths.
Fig. 3. Two paths between node pair x, y
We note that P2, Pxu and Pvy are mutually node disjoint. So, if we now delete any single node w from the graph G − (u, v), it cannot disconnect all three paths P2, Pxu and Pvy simultaneously. If w ∈ V(Pxu) or w ∈ V(Pvy), then x and y remain connected through P2. If, on the other hand, w ∈ V(P2), then, since P and P′ are node disjoint, w cannot disconnect both of them simultaneously. Thus there will be a path between x and y using Pxu, then P or P′ (whichever is not disconnected by the deletion of w), and finally Pvy. Thus x and y are bi-connected in G − (u, v). So, G remains bi-connected after the deletion of (u, v).

From Lemma 2, we can say that any node u may decide not to keep the edge (u, v) if u knows that u and v are bi-connected without the edge (u, v). However, it is important that the other nodes keep the edges that node u considered for the bi-connectivity of u and v. This is ensured by considering only paths with less interference than the direct edge. The following lemma shows this.

Lemma 3. Let G and G′ be two undirected graphs such that V(G) = V(G′). If G is bi-connected and every edge (u, v) ∈ E(G) − E(G′) satisfies that u has at least two node disjoint paths to v with less interference than (u, v), then G′ is also bi-connected.

Proof. Let E(G) − E(G′) = {e1, e2, . . . , em}, where w(ei) ≥ w(ei+1) for 1 ≤ i ≤ (m − 1). We define a series of graphs Gi for 1 ≤ i ≤ m such that Gi = G − {e1, e2, . . . , ei}, i.e., V(Gi) = V(G) and E(Gi) = E(G) − {e1, e2, . . . , ei}. We note that Gi and Gi+1 are related by Gi+1 = Gi − ei+1 and that Gm = G′. We also let G0 denote the original graph G. Now we prove by induction that Gm is bi-connected. Basis: The given graph G0 is bi-connected. Inductive step: Assume graph Gi is bi-connected. According to the hypothesis, there are two node disjoint paths between the endpoints of edge ei+1 with interference less than the interference of edge ei+1. Thus, the endpoints of edge ei+1 are bi-connected even after deletion of the edge ei+1. So, by Lemma 2 we can say that the graph Gi − ei+1 is also bi-connected. But Gi − ei+1 is nothing but Gi+1. So, graph Gi+1 is also bi-connected. Hence, by induction, the graph Gm = G′ is bi-connected.

By combining Lemma 1 and Lemma 3 we deduce:

Theorem 2. LMIBCN is a minimum interference path preserving bi-connected topology.
5 Interference-Aware Fault Tolerant Spanners

5.1 Local Distributed Algorithm for the BIS Problem
We focus on developing a distributed algorithm that constructs a bi-connected interference spanner from a given graph based on local information only. As there may exist numerous paths between a pair of vertices that satisfy the spanner property, it is challenging to decide which paths should be kept during the construction of a spanner. Here we follow the approach of [7] to generate an interference spanner. In [7], Czumaj and Zhao provided a central algorithm to construct fault tolerant geometric spanners. We show that their algorithm can be used as a local algorithm with slight modifications. Most importantly, we have to make sure that no ties remain among the costs of the edges. To break ties between edges we use the ids of the endpoints of the edges. Let us define two functions MIN and MAX as follows: MIN(u, v) = min(id(u), id(v)) and MAX(u, v) = max(id(u), id(v)). Let e1 = (u1, v1) and e2 = (u2, v2) be two edges. We define cost(e1) > cost(e2) if

  |IS(e1)| > |IS(e2)|, or
  |IS(e1)| = |IS(e2)| ∧ MIN(u1, v1) > MIN(u2, v2), or
  |IS(e1)| = |IS(e2)| ∧ MIN(u1, v1) = MIN(u2, v2) ∧ MAX(u1, v1) > MAX(u2, v2).

In all other cases, we let cost(e1) < cost(e2).

Algorithm 3. Algorithm to construct LBIS
  Run the Neighbor Discovery procedure and construct the neighborhood graph Gu = (Vu, Eu)
  Sort the edges of Eu in increasing order of cost
  Es = φ
  for each edge (u, v) taken in ascending order do
    if two node disjoint paths between u and v, using only edges from Es, with interference less than t times the interference of (u, v) cannot be found then
      Add (u, v) to Es
    end if
  end for
  The nodes adjacent to u using only the edges from Es comprise the neighbor set of node u in the final topology
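The tie-free total order on edge costs can be realized with a simple sort key; the following Python sketch is our own illustration and assumes each edge is given as a tuple of node ids together with a mapping to |IS(e)|.

```python
# Illustrative sketch of the tie-breaking edge order (not the authors' code).
# cost(e) compares first by |IS(e)|, then by MIN(u, v), then by MAX(u, v).

def edge_cost_key(edge, interference_size):
    """Sort key implementing the strict total order on edge costs."""
    u, v = edge
    return (interference_size[edge], min(u, v), max(u, v))

# Example: sorting a hypothetical edge list in increasing order of cost.
interference_size = {(1, 4): 3, (2, 3): 3, (1, 2): 5}
edges = sorted(interference_size, key=lambda e: edge_cost_key(e, interference_size))
# -> [(1, 4), (2, 3), (1, 2)]   (the tie on |IS| is broken by the node ids)
```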
After discovering its neighbors, each node u first sorts the edges of its neighborhood graph Gu according to the costs of the edges (the cost of an edge is based on its interference number). Then the edges are taken in ascending order and added to a set Es unless their endpoints already have two node disjoint paths with the spanner property. The nodes that are adjacent to u using the edges of Es are the final neighbors of node u. The local subgraphs generated by all nodes together
produce a topology for the BIS problem, which is dubbed the Local Bi-connected Interference Spanner (LBIS). Algorithm 3 summarizes the whole procedure. Once the neighbor set is defined, each node may send a PROPOSE message using the same protocol as explained in Section 4.2 and generate a bi-directional topology.
5.2 Proof of Correctness of the Algorithm to Generate LBIS
To prove the correctness of Algorithm 3 we need to show that LBIS is bi-connected and that, for each pair of vertices u, v, there are at least two paths between u and v with the spanner property. To prove this, we use an indirect approach. First, we state the central counterpart of Algorithm 3, which generates the Central Bi-Connected Interference Spanner (CBIS). We prove the correctness of the central algorithm. Finally, we prove that the local algorithm generates a topology that is a superset of the output of the central algorithm. The central algorithm is given in Algorithm 4. This central algorithm and the proof of Lemma 4 are adapted with slight modification from [7].

Algorithm 4. Algorithm to construct CBIS
  Let the given bi-connected graph be G = (V, E)
  Sort the edges of E in increasing order of cost
  Es = φ
  for each edge (u, v) taken in ascending order do
    if two node disjoint paths between u and v, using only edges from Es, with interference less than t times the interference of (u, v) cannot be found then
      Add (u, v) to Es
    end if
  end for
  G′ = (V, Es) is the output graph.
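The following Python sketch outlines the greedy CBIS construction; it is our own illustration and assumes networkx plus an "interference" edge attribute. The two-disjoint-path test reuses the simple "shortest path, remove its interior, second shortest path" heuristic of Algorithm 1, so it is a simplification rather than the exact test.

```python
# Sketch of the greedy CBIS construction (Algorithm 4); illustrative only.
import networkx as nx

def path_interference(S, path):
    return sum(S[a][b]["interference"] for a, b in zip(path, path[1:]))

def has_two_cheap_disjoint_paths(S, u, v, budget):
    try:
        p1 = nx.shortest_path(S, u, v, weight="interference")
    except nx.NetworkXNoPath:
        return False
    if path_interference(S, p1) >= budget:
        return False
    R = S.copy()
    R.remove_nodes_from(p1[1:-1])              # force node-disjointness with p1
    try:
        p2 = nx.shortest_path(R, u, v, weight="interference")
    except nx.NetworkXNoPath:
        return False
    return path_interference(R, p2) < budget

def build_cbis(G, t, edge_order):
    """edge_order: edges of G sorted by the tie-free cost of Section 5.1."""
    S = nx.Graph()
    S.add_nodes_from(G.nodes)
    for u, v in edge_order:
        w = G[u][v]["interference"]
        if not has_two_cheap_disjoint_paths(S, u, v, t * w):
            S.add_edge(u, v, interference=w)
    return S
```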
Lemma 4. CBIS preserves two node disjoint paths with the interference spanner property between each pair of vertices.

Proof. The proof follows from the constructive nature of Algorithm 4. An edge (u, v) is omitted only when it is found that its endpoints u and v are connected by two node disjoint spanner paths using edges from the set Es. Since Es is the final output and no edge is ever removed once it has been added to Es, nodes u and v remain connected by two node disjoint spanner paths in the final topology. Thus, CBIS preserves two node disjoint paths with the interference spanner property between each pair of nodes.

Lemma 5. The edge set of LBIS is a superset of the edge set of CBIS.

Proof. Since the edge costs form a strict total order, the mutual ordering in which edges are considered is the same. That is, let e1 and e2 be two edges. If e1 is considered
before e2 by a node u while constructing LBIS, then e1 must have been considered before e2 by Algorithm 4 while constructing CBIS. Thus the set of edges considered for constructing LBIS before a specific edge (u, v) is a subset of the set of edges considered for constructing CBIS before (u, v). So, if Algorithm 4 cannot find u and v to be bi-connected, then any node running Algorithm 3 will also fail to find u and v to be bi-connected. Thus the edge set of LBIS must be a superset of the edge set of CBIS.

By combining Lemma 4 and Lemma 5 we deduce:

Theorem 3. LBIS is a bi-connected interference spanner.
6 Simulation Results

6.1 Simulation Environment
To evaluate the performance of the proposed algorithms, we simulate randomly deployed networks of 75 to 300 nodes uniformly distributed over a 1000 m × 1000 m square region. The maximum transmission range of each node is 250 m. Thus the width and height of the deployment area are four times the maximum transmission range. Only bi-connected networks are considered, since unless the initial network is bi-connected it is impossible to generate a bi-connected subnetwork. By initial network, we mean the network that consists of all the nodes, with a link between a pair of nodes if they can transmit to each other using maximum transmission power. We have generated 10 network instances for each node count, and performance metrics are measured as an average over these 10 random runs (unless stated otherwise explicitly). We analyze the performance of the different algorithms using the following evaluation metrics:

Interference. We measure the level of interference of a network by counting the total number of links present in the underlying communication graph. We assume that a "(wireless) link" virtually exists from u to v if v is within the transmission range of u.

Fault tolerance. For each of the communication networks we measure the degree of connectivity as a measure of the level of fault tolerance. We call a network n-connected if there exist at least n node disjoint paths between any pair of nodes within the network. Needless to say, an increased degree of connectivity generally means an increased level of fault tolerance.

Quality of paths. For a pair of nodes u and v, we determine two node disjoint least interference paths, i.e., the paths having the lowest and the second-lowest interference numbers. These two paths can easily be determined by finding a cycle with minimum interference number that contains the vertices u and v. We record the interference number of this cycle and take the average over all node pairs. We name this average the Average Path Interference (API). This metric captures the quality of any algorithm that simultaneously provides bi-connectivity and reduced interference.
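As an illustration of the fault-tolerance metric, the following sketch (our own, assuming networkx) counts the fraction of node pairs whose pairwise vertex connectivity is at least k.

```python
# Illustrative sketch: fraction of node pairs that are k-connected.
import itertools
import networkx as nx

def fraction_k_connected(G, k):
    pairs = list(itertools.combinations(G.nodes, 2))
    hits = sum(1 for u, v in pairs if nx.node_connectivity(G, u, v) >= k)
    return hits / len(pairs)

# Example: percentage of bi-connected pairs in a generated topology H.
# print(100 * fraction_k_connected(H, 2))
```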
6.2 Numerical Results
Figure 4 shows a comparison of the number of links with respect to the initial networks. It is clearly evident that the proposed algorithms achieve linear growth in the number of edges, in contrast to the much faster growth of the initial networks. LBIS is the sparsest among these topologies, having the smallest number of links. A closer look at Figure 4 reveals that LMIBCN contains fewer links than CMIBCN. The reason is that CMIBCN has one additional property that is not present in LMIBCN: for each pair of nodes, CMIBCN preserves the minimum interference path and another path which is node disjoint with this minimum interference path. LMIBCN, on the other hand, preserves the minimum interference path and is bi-connected. Thus, while there exist two node disjoint paths between each pair of nodes in LMIBCN, neither of these two paths needs to be the minimum interference path. Figure 5 shows the connectivity achieved by our algorithms. For comparison purposes, we have also plotted the result of I-MST [16]. The x-axis shows the connectivity number (k) and the y-axis shows the percentage of node pairs that are k-connected. Clearly, LMIBCN and CMIBCN show a high degree of connectivity compared to recent interference-aware results like I-MST. Moreover, a high percentage of node pairs are at least 3-connected in LMIBCN and CMIBCN. On average, only 20% of the node pairs are bi-connected in I-MST. Although LBIS ensures bi-connectivity, as the plot shows with 100% of the node pairs being bi-connected, the percentage of node pairs with connectivity greater than 2 drops quickly compared to CMIBCN and LMIBCN.
Fig. 4. Comparison on no. of links (x-axis: node number; y-axis: number of edges; curves: Initial Graph, CMIBCN, LMIBCN, LBIS)

Fig. 5. Comparison on fault tolerance (x-axis: connectivity number (k), 1 to >4; y-axis: percentage of node pairs that are k-connected; curves: CMIBCN, LMIBCN, LBIS, I-MST)

Table 1. API of different structures

  Number of Nodes   CMIBCN   LMIBCN   LBIS    Percentage increase in LBIS w.r.t. CMIBCN and LMIBCN
  75                46.01    46.01    47.89   4.08%
  150               62.75    62.75    65.98   5.14%
  225               74.90    74.90    77.84   3.93%
  300               88.15    88.15    90.37   3.09%
We have determined the API for the various structures generated by our algorithms. Table 1 shows the results. CMIBCN and LMIBCN have the same API because the same goal is achieved in both topologies by preserving almost the same number of links. This equality indicates that LMIBCN, although locally constructed, achieves results similar to CMIBCN in terms of path quality. LBIS, being a less restrictive topology, has a slightly higher API than CMIBCN and LMIBCN. However, the percentage increase in the API of LBIS with respect to CMIBCN and LMIBCN is very small (between 3% and 5%), making LBIS a very attractive interference-aware topology.
7 Conclusions and Future Work
Minimizing interference and providing fault tolerance are two important goals in ad hoc wireless networks. In this paper, we have provided several unified algorithms to achieve both goals simultaneously. CMIBCN is a centrally constructed minimum interference path preserving bi-connected topology which can be used for benchmarking purposes. LMIBCN can be constructed locally and has characteristics similar to CMIBCN. Structures like LBIS are also interesting and important, since topologies with spanner properties have useful applications. Some open problems remain. Minimizing interference is NP-hard [4]; however, it is still unknown whether finding bi-connected minimum interference path preserving subgraphs with a minimum number of edges is NP-hard or not. We also did not provide any bound on the stretch factor (t) for LBIS. It will be interesting to see how small a stretch factor can be achieved by distributed algorithms.
References
1. Bahramgiri, M., Hajiaghayi, M., Mirrokni, V.S.: Fault-tolerant and 3-dimensional distributed topology control algorithms in wireless multi-hop networks. Wireless Networks 12(2), 179–188 (2006)
2. Bollobás, B.: Random Graphs. Cambridge University Press, Cambridge (2001)
3. Bose, P., Morin, P., Stojmenovic, I., Urrutia, J.: Routing with guaranteed delivery in ad hoc wireless networks. Wireless Networks 7(6), 609–616 (2001)
4. Buchin, K.: Minimizing the maximum interference is hard. CoRR, abs/0802.2134 (2008)
5. Burkhart, M., von Rickenbach, P., Wattenhofer, R., Zollinger, A.: Does topology control reduce interference? In: Proc. of MobiHoc (2004)
6. Chew, L.P.: There is a planar graph almost as good as the complete graph. In: Proc. of the 2nd Annual Symp. on Comp. Geometry (1986)
7. Czumaj, A., Zhao, H.: Fault-tolerant geometric spanners. In: Proc. of the 19th Annual Symp. on Comp. Geometry (SCG), pp. 1–10 (2003)
8. Gao, J., Guibas, L.J., Hershberger, J., Zhang, L., Zhu, A.: Geometric spanner for routing in mobile networks. In: Proc. of MobiHoc (2001)
9. Karp, B., Kung, H.T.: GPSR: greedy perimeter stateless routing for wireless networks. In: Proc. of MobiCom, pp. 243–254 (2000)
10. Li, N., Hou, J.C.: Localized fault-tolerant topology control in wireless ad hoc networks. IEEE Trans. Parallel Dist. Syst. 17(4), 307–320 (2006)
11. Li, X., Calinescu, G., Wan, P., Wang, Y.: Localized delaunay triangulation with application in ad hoc wireless networks. IEEE Trans. Parallel Distrib. Syst. 14(10), 1035–1047 (2003)
12. Li, X., Moaveninejad, K., Song, W., Wang, W.: Interference-aware topology control for wireless sensor networks. In: Proc. SECON (2005)
13. Li, X., Wan, P.: Constructing minimum energy mobile wireless networks. In: Proc. of ACM MobiHoc, pp. 55–67 (2001)
14. Li, X., Wan, P., Wang, Y.: Power efficient and sparse spanner for wireless ad hoc networks. In: Proc. of ICCCN (October 2001)
15. Li, X., Wan, P., Wang, Y., Yi, C.: Fault tolerant deployment and topology control in wireless ad hoc networks. Wireless Communications and Mobile Computing 4(1), 109–125 (2004)
16. Moaveninejad, K., Li, X.: Low-interference topology control for wireless ad hoc networks. AHSN 1(1-2) (2005)
17. Penrose, M.D.: On k-connectivity for a geometric random graph. Random Struct. Algorithms 15(2), 145–164 (1999)
18. Rodoplu, V., Meng, T.H.: Minimum energy mobile wireless networks. IEEE J. Selected Areas of Communication 17(8), 1333–1344 (1999)
19. Wattenhofer, R., Li, E.L., Bahl, P., Wang, Y.: Distributed topology control for wireless multihop ad-hoc networks. In: INFOCOM (2001)
20. Xu, H., Huang, L., Liu, W., Xu, B., Xiao, M.: Topology control for minimal path interference in wireless sensor networks. In: Proc. of the IEEE Symposium on Computers and Communications, ISCC (2008)
21. Yao, A.C.: On constructing minimum spanning trees in k-dimensional spaces and related problems. SIAM J. Comput. 11(4), 721–736 (1982)
PaderMAC: A Low-Power, Low-Latency MAC Layer with Opportunistic Forwarding Support for Wireless Sensor Networks

Marcus Autenrieth and Hannes Frey
University of Paderborn
{marcus.autenrieth,hannes.frey}@upb.de
Abstract. Modern medium access control (MAC) protocols for wireless sensor networks (WSNs) focus on energy efficiency by switching a node's radio on only when necessary. The rendezvous problem this introduces is handled gracefully by modern asynchronous approaches to WSN MACs, e.g. X-MAC, using strobed preambles. Nevertheless, most MAC layers ignore the possible benefits in energy consumption and end-to-end latency that supporting opportunistic routing can provide. In this paper we present PaderMAC, a strobed preamble MAC layer which supports cross-layer integration with an arbitrary opportunistic routing layer. This work specifies the PaderMAC protocol, explains its implementation using TinyOS and the MAC layer architecture (MLA), and presents the results of a testbed performance study. The study compares PaderMAC in conjunction with opportunistic routing to X-MAC in conjunction with path-based routing and shows how PaderMAC reduces the preamble length, better balances the load, and further improves the end-to-end latency within the network.
1 Introduction
Replacing batteries is impossible or infeasible in many deployments of wireless sensor networks (WSNs). For this reason, reducing the power consumption of WSN nodes is a central research topic for prolonging the lifetime of such a network. The most common technique to reduce the power consumption of nodes is to let them sleep, i.e. power down their radio, most of the time and let them wake up, i.e. switch their radio on, only when needed to send or receive packets. This approach, called duty-cycling, introduces a problem, called the rendezvous problem, which the medium access control (MAC) layer must solve: the transmitting and the receiving node must both be awake in order to have a successful transmission. Solving this rendezvous problem in an energy-efficient manner is one of the key focuses of WSN MAC design. So far, most MAC and routing layers for wireless sensor networks have been developed without considering the possible performance gains and implementation issues of a practical cross-layer integration on real hardware. In this paper we introduce PaderMAC, a sender-initiated, asynchronous MAC layer which extends X-MAC [4] with opportunistic forwarding support. Section 2
will present the related work, giving a short summary of opportunistic routing mechanisms and sender-initiated asynchronous MAC layer approaches. Section 3 will first motivate, with a sink-based routing scenario, why opportunistic routing support is a valuable extension for X-MAC. It will then explain how PaderMAC integrates with an opportunistic routing layer to address a set of eligible recipients during a transmission and how collisions between contending recipients are handled. It will also explain how PaderMAC retains backwards compatibility with X-MAC, after motivating why this is desirable. Section 4 will then cover the reference implementation of PaderMAC in TinyOS [13], using the MAC layer architecture (MLA) [14]. Its focus lies on the cross-layer integration with an opportunistic routing layer, but it also includes the contribution of a feedback-enabled preamble-sender component to the MLA. Section 5 will then show how opportunistic routing using PaderMAC can reduce energy consumption and end-to-end latency in comparison to routing using X-MAC. Section 6 then summarizes these findings and points to future research directions.
2 Related Work
The MAC layer and routing layer integration described in this work is related to two research areas: routing mechanisms which support opportunistic packet forwarding, and preamble-sampling-based duty-cycling MACs. Both areas are summarized in the following two sections.
2.1 Routing Mechanisms Supporting Opportunistic Packet Forwarding
Unicast Communication. In opportunistic packet forwarding the next hop node is not decided at the time of message transmission but depends on which of the nodes in the vicinity could successfully receive that transmission. This technique was exploited in the seminal opportunistic forwarding schemes ExOR [2], to increase network throughput, and GeRaF [30], to reduce network latency and energy consumption. Both schemes employ a greedy strategy to decide which node is eligible to forward a received message. Greedy forwarding schemes, which were not considered as an opportunistic forwarding strategy when introduced, have already been described in earlier work [7,26,18,24]. In those schemes the next hop node is selected depending on the position of that node and the position of the destination node. In principle, however, any node geographically closer to the destination than the current node is a potential forwarder. This allows an opportunistic realization of these greedy routing schemes, which has been considered by the so-called beaconless greedy routing variants BLR [12], IGF [3], and CBF [11]. A further geographic forwarding mechanism which supports opportunistic packet forwarding is given by the geographical clustering idea [10,9,27,20]. Using a regular partitioning of the space, each node is assigned to the cluster it is located in. Packet forwarding is done at the geographic cluster level, i.e., a forwarder's
task is to reach any node in a neighboring cluster but not necessarily a specific one. In the same way as with the beaconless greedy routing variants, this allows an opportunistic forwarding realization. In the listed forwarding schemes the receiving nodes utilize information about geographic node locations to decide whether they are eligible for forwarding. However, even without geographic location information, opportunistic forwarding variants are possible. Examples for unicast communication are link-reversal based routing schemes like TORA [19].

Sink-oriented Communication. Opportunistic forwarding is of course not limited to unicast communication. In sink-oriented communication, sensor nodes use intermediate nodes to transmit their measurement data towards one or a set of dedicated sink nodes. The CTP protocol [8] is an example of such a sink-oriented communication protocol which could be modified to allow opportunistic packet forwarding. With initial flooding and a sophisticated repair mechanism, distance values from the nodes towards the sink nodes are maintained. Using these values, each node can compare its own distance with the distance of its neighbor and thus opportunistically decide whether it is eligible to forward the message.
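As a simple illustration of such a gradient-based eligibility test (our own sketch, not part of CTP or this paper), a node could decide to contend for forwarding whenever its own maintained distance value is strictly smaller than the sender's:

```python
# Illustrative sketch: opportunistic eligibility check against a sink gradient.
# `my_distance` is the node's own maintained distance (e.g. hop count or ETX)
# towards the sink; `sender_distance` is carried in the overheard frame.

def eligible_to_forward(my_distance, sender_distance):
    """Contend for the packet only if forwarding it makes progress towards the sink."""
    return my_distance < sender_distance
```

Using a strict inequality avoids packets bouncing between nodes that are equally far from the sink.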
2.2 MAC Layers for Wireless Sensor Networks
MAC layers for wireless sensor networks can be classified into three major directions of approach: synchronous, asynchronous, and hybrid approaches. In the synchronous approaches [29,28,17,16], nodes form groups whose members all sleep and wake up simultaneously. This incurs some overhead, as a master node has to be elected, which has to distribute a sleep-wakeup schedule for its group and which has to ensure that all members stay synchronized. In the asynchronous approaches, nodes follow their sleep cycles independently of the other nodes. When a message has to be transmitted, sender and receiver have to solve the rendezvous problem on demand. Two approaches can be followed here: a sender-initiated and a receiver-initiated one. In the sender-initiated approach [21,4], the sender announces a pending transmission by first sending a preamble. Once the sender has ensured that the receiver is awake, it transmits the data frame. In contrast, the receiver-initiated approach lets the receiver announce a short beacon when it wakes up. The sender has to monitor the channel and can start its transmission once it receives a beacon from the intended receiver node [25]. The hybrid approaches [1,6,22] combine synchronous and asynchronous techniques to mitigate weaknesses or to tune a MAC protocol to a specific scenario. In the domain of sensor networks an asynchronous sender-initiated MAC was initially introduced with the B-MAC protocol [21]. In B-MAC, a sender starts its transmission by sending a preamble after performing a clear channel assessment (CCA). A node which overhears the preamble stays awake and waits for the sender to transmit the data frame. Since the preamble must ensure that the designated recipient is awake, it has to be transmitted for a complete sleep period before the data frame is transmitted. Clearly, this wastes energy on both sender and receiver.
Fig. 1. An example for an X-MAC transmission
As illustrated in Figure 1, X-MAC [4] addresses this issue of B-MAC by replacing the bitstream preamble of B-MAC with request-to-send (RTS) frames, which are sent repeatedly for up to the duration of a sleep period. In between the RTS frames, the receiver can reply with a clear-to-send (CTS) frame, also called an "early acknowledgement" (early ACK), to cut the preamble short. Upon receiving a CTS, the sender immediately sends the data frame and both nodes can go back to sleep. Since the RTS frames contain the address of the intended receiver, every other node which overhears an RTS can go directly back to sleep. By shortening the duration a sender has to transmit its preamble, as well as the overall transmission, energy consumption is reduced.
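For readers unfamiliar with strobed preambles, the following simplified Python-style sketch (our own illustration, not the X-MAC or PaderMAC implementation) shows the sender side of such a rendezvous: RTS strobes are repeated until an early ACK arrives or a full sleep period has elapsed.

```python
# Simplified sketch of a strobed-preamble (X-MAC-style) sender loop.
# `radio` is a hypothetical driver object; the method names are illustrative only.
import time

def strobed_send(radio, dest, payload, sleep_period, strobe_gap):
    """Repeat RTS strobes until the intended receiver answers with an early ACK."""
    deadline = time.monotonic() + sleep_period
    while time.monotonic() < deadline:
        radio.send_rts(dest)                        # short preamble (RTS) frame
        ack = radio.wait_for_early_ack(strobe_gap)  # listen in the inter-strobe gap
        if ack is not None and ack.sender == dest:
            radio.send_data(dest, payload)          # receiver is awake: send data
            return True
    return False                                    # rendezvous failed this period
```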
3 PaderMAC: A WSN MAC Protocol with Opportunistic Forwarding Support
In contrast to previous simulative and analytical approaches [15,23], PaderMAC is a practical MAC protocol with opportunistic forwarding support, compatible with modern sensor node hardware such as the Tmote-Sky platform. By design, it is not tailored to a specific routing mechanism, but aims to be usable with several opportunistic and non-opportunistic routing protocols. The central idea behind PaderMAC is to improve a network's lifetime by further shortening energy-consuming preambles. This is achieved by exploiting the fact that, in a sufficiently dense multi-hop network, multiple relays for a given destination can be found at each hop. As an example, consider a sink-based routing scenario, commonly used in data-centric WSN deployments. The traditional approach is to create and maintain a data-gathering tree, using a distributed algorithm and an appropriate metric, e.g. hop count or ETX [5], which assigns to each node a relay node responsible for forwarding the data. Figure 2 shows an example with node B acting as the sink. Packets from node A to node B are routed along the predefined path [A → 5 → 2 → sink].
Fig. 2. An example for energy saving opportunities by using alternative paths in multihop end-to-end transmissions
In addition, three alternative paths exist: [A → 6 → 2 → sink], [A → 3 → 2 → sink] and [A → 3 → 1 → sink]. At time t0, node A wakes up and wants to transmit data towards node B. In this example node 5 is used as a fixed relay, though nodes 6 and 3 could also act as relays. After performing a CCA, node A starts its preamble, waiting for the rendezvous with node 5. At time t1 node 6 wakes up, and at time t2 node 3 wakes up, but each goes back to sleep immediately since neither is addressed in the preamble of node A. Finally node 5 wakes up at time t3 and receives the packet. By handing over the packet to node 6 instead of node 5, the preamble could be shortened by a duration of t3 − t1, thus saving additional energy on the sender.
3.1 Opportunistic Routing Integration
In principle, the opportunistic routing support of PaderMAC works as follows: A sender embeds the routing information into its preamble, and based on that information, a receiver decides whether it is a suitable recipient for the transmission, i.e. whether it can successfully forward the data to its final destination. If more than one recipient contends for the packet, a receiver contention mechanism (described in Section 3.2) is used to resolve the conflict. To keep PaderMAC as routing-agnostic as possible, the decision whether or not to contend for a transmission has to be encapsulated by the routing layer. When a preamble frame announcing an opportunistic transmission is received, PaderMAC hands that frame to the routing layer for inspection, and the routing layer responds with its decision. The inter-frame spacing of the opportunistic routing preambles has to be adapted to the complexity of the decision made by the routing layer.
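The receiver-side handover can be pictured as a small callback: the MAC passes the overheard preamble frame to a routing-layer decision function and sends an early ACK only on a positive answer. The following Python sketch is our own illustration (PaderMAC itself is implemented in nesC/TinyOS); the field and method names are hypothetical.

```python
# Illustrative sketch of PaderMAC's decision handover to the routing layer.
# `frame`, `routing_layer` and `radio` are hypothetical objects.

def on_preamble_frame(frame, routing_layer, radio, my_address):
    if frame.is_opportunistic:
        # The routing layer inspects the embedded routing information and decides
        # whether this node can usefully forward the packet.
        if routing_layer.do_forward(frame):
            radio.send_early_ack(sender=my_address)
        else:
            radio.sleep()
    else:
        # Legacy (X-MAC style) preamble: only the addressed node answers.
        if frame.destination == my_address:
            radio.send_early_ack(sender=my_address)
        else:
            radio.sleep()
```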
Not every packet transmission needs opportunistic forwarding. For example, it might be necessary for the routing layer to send additional management packets like beacons, or to send packets over a predefined path in case the opportunistic forwarding fails. To distinguish between opportunistic and normal transmissions, PaderMAC employs two different types of preamble frames.
3.2 The Receiver Contention Mechanism
During an opportunistic forwarding transmission, it can happen that two or more suitable forwarders wake up and contend for the transmission. Several circumstances prevent the receiver contention from being handled with a simple carrier sense multiple access (CSMA) scheme. The first one is the fact that PaderMAC sends its preamble frames in a tight loop with very small gaps in between, which are used for the early ACK. Increasing these gaps is infeasible, since it would result in increased listen periods for nodes which wake up, thus increasing their energy consumption. Another reason against a CSMA scheme is the possibility that two contending receivers could be subject to a hidden terminal problem, a situation where carrier sensing is ineffective for avoiding collisions since the contending nodes are unable to detect each other's transmissions. PaderMAC does not try to prevent the inevitable and instead employs a slotted backoff mechanism for collided acknowledgements. Its backoff mechanism exploits the fact that the sender does not recognize collided early ACKs, since both ACKs destroy each other at the sender. Because it cannot detect the collision, the sender then proceeds with sending its preamble. A receiver which overhears a preamble frame after sending an early ACK assumes that a collision took place and switches into backoff mode. It randomly picks a number of additional preamble frames it has to receive from the sender before it resends its early ACK. As soon as the sender receives the first early ACK from a contending receiver, it transmits the data frame addressed to that receiver. To identify the receiver which won the contention, receivers need to identify themselves to the sender. This is done by each receiver embedding its MAC address in its early ACK. The sender then uses this information to address the data frame to the receiver which won the contention. Figure 3 shows an example situation for the receiver contention mechanism of PaderMAC. In the example the two receivers, nodes A and B, wake up almost simultaneously, and since they are hidden from each other they cannot sense each other's start of transmission, so their acknowledgements collide. Both acknowledgements get destroyed at the sender node S, and it continues to send its preamble. Upon overhearing a preamble frame after sending an ACK, both nodes switch into backoff mode. In the example, A chooses a backoff window of two preamble frames and B a backoff window of three. Under the assumption that both nodes can receive every following preamble frame, node A will win the contention. But now let us assume in this example that node A only receives one of the subsequent preamble frames. Since a contending node can only count preamble frames it successfully receives and decodes, node A has to wait until the channel quality improves to a point where it can successfully receive the remaining preamble frames, or until another node wins the contention or the transmission times out.
Fig. 3. An example for collision handling in PaderMAC
Because of this effect, the contention mechanism is slightly biased towards contenders with low packet error rates on the downlink.
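A contending receiver can thus be modeled as a small counter-driven state machine: it ACKs, and if it still hears preamble frames afterwards it counts a randomly chosen number of successfully decoded strobes before ACKing again. The sketch below is our own simplification in Python; the constant and method names are hypothetical.

```python
# Illustrative sketch of the slotted backoff on the receiver side.
import random

MAX_BACKOFF_SLOTS = 7   # hypothetical upper bound on the backoff window

class ContendingReceiver:
    """Receiver that has decided (via the routing layer) to contend for a packet."""
    def __init__(self, radio, my_address):
        self.radio = radio
        self.my_address = my_address
        self.waiting_for_data = False   # True right after we sent an early ACK
        self.remaining_slots = 0        # strobes left to count in backoff mode

    def on_decoded_preamble(self, frame):
        if self.remaining_slots > 0:
            # Backoff mode: only successfully decoded strobes are counted.
            self.remaining_slots -= 1
            if self.remaining_slots == 0:
                self._send_ack()
        elif self.waiting_for_data:
            # Another strobe after our ACK: the ACK must have collided.
            self.remaining_slots = random.randint(1, MAX_BACKOFF_SLOTS)
            self.waiting_for_data = False
        else:
            self._send_ack()            # first strobe we contend for

    def _send_ack(self):
        self.radio.send_early_ack(sender=self.my_address)
        self.waiting_for_data = True
```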
4 Implementation
We implemented PaderMAC on TmoteSky hardware using TinyOS [13] and the MAC Layer Architecture (MLA) [14]. Since it uses the MLA's implementation of X-MAC as a starting point, PaderMAC superficially resembles X-MAC in its architecture, though important parts work quite differently. In contrast to the MLA implementation of X-MAC, PaderMAC uses a feedback-capable component for sending a preamble and needs a more sophisticated component for handling the receiver contention mechanism for opportunistic forwarding transmissions. As illustrated in Section 3, PaderMAC needs the early ACKs sent by the contending receivers to contain the MAC address of the contenders. The preamble-sender component provided by the MLA could not be used, since its implementation relies on the low-level acknowledgements of TinyOS. We therefore designed a new component for sending preambles, called SoftAckPreambleSenderC. The SoftAckPreambleSenderC component itself is designed for the development of preamble-based MAC layers which need to stop their preamble based on some arbitrary event. This is done by decoupling the standard PreambleSenderC component from the low-level radio interface via the SoftAckHandler interface. In contrast to the straightforward receiver mechanism of X-MAC, the receiver contention mechanism, encapsulated in the component PaderMACReceiverEngineP, plays a central role in PaderMAC during opportunistic forwarding transmissions. It has to communicate the reception of an ACK to the SoftAckPreambleSenderC component, it has to receive and process the two types of
preamble frames, and it has to communicate with the routing layer in order to decide whether or not to contend for an opportunistic forwarding transmission. To integrate PaderMAC with an opportunistic routing layer, we defined the RoutingDecider interface. The interface specifies the do forward function, which accepts a preamble frame as its argument and returns a boolean value indicating the routing layer's decision. An opportunistic routing layer must implement this interface to encapsulate the opportunistic routing decision, and wire it to PaderMAC's MacControlC component. That way, PaderMAC can hand over the opportunistic forwarding decision to the routing layer. This handover makes PaderMAC very flexible, but comes at the cost of having to manually tune the ACK timeouts of the SoftAckPreambleSenderC component to the computation time of the component providing the RoutingDecider interface.
5 Performance Study
This section describes the measurements conducted to evaluate the performance of PaderMAC versus X-MAC in terms of energy consumption and end-to-end latency. Our measurements have a clear focus on the MAC-layer performance, sacrificing some realism on the routing layer for a more transparent understanding of the MAC behaviour. The goal of our performance study is to get an estimate of the performance of PaderMAC vs. X-MAC in a low-traffic, multihop scenario without cross-traffic. For the spatial setup, we created an equidistant grid of TmoteSky sensor nodes under our laboratory's ceiling, with 6 rows and 5 columns.
5.1 Routing Scheme
Our measurements aim to study the performance of PaderMAC vs. X-MAC under almost routing-agnostic conditions. They use geo-routing on the grid coordinates to send a packet on a round trip from position (1,1) to (5,6) and back. This enabled us to measure the end-to-end latency without having to perform tedious clock synchronization to be resilient against clock drift. The performance of PaderMAC is measured using a greedy packet forwarding scheme. X-MAC (emulated by the legacy support of PaderMAC) is measured by routing the packets along the shortest path through the grid. Both routing methods were integrated into one routing layer to speed up the measurement.
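Such a greedy geographic eligibility check on grid coordinates can be as simple as comparing distances to the destination; the sketch below is our own illustration of the idea, not the code used in the testbed.

```python
# Illustrative sketch: greedy forwarding decision on (row, column) grid coordinates.

def grid_distance(a, b):
    """Euclidean distance between two grid positions (row, column)."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def greedy_eligible(my_pos, sender_pos, dest_pos):
    """A node contends only if it is strictly closer to the destination than the sender."""
    return grid_distance(my_pos, dest_pos) < grid_distance(sender_pos, dest_pos)
```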
5.2 Scenario Description
To model an asynchronous wakeup pattern during our measurements, the duty cycles of the motes were jittered randomly using a uniform distribution over the sleep period. This method was also used to model the random generation of packets on the source mote at position (1,1).
The only varied parameter in the measurement was the sleep period, which ranged from 1000 ms to 4500 ms, increasing in steps of 500 ms. For each sleep period, 30 samples for each MAC protocol were measured, resulting in 240 samples overall. The order in which those samples were measured was scrambled in order to decorrelate the results in time. In each sample, for each MAC layer, we sent 5 consecutive packets across the network.
5.3 Results
In our multihop measurement, we were interested in three metrics: power consumption, latency and fairness. All confidence intervals have a confidence level of 95%. Figure 4 presents the latency comparison, where PaderMAC clearly dominates in the domain of larger sleep periods. Interestingly, X-MAC compares quite well for smaller sleep periods, because it always uses the shortest path (in hops) across the network, while PaderMAC can opportunistically choose shorter hops, which lead to longer paths, impacting the end-to-end latency. Nevertheless, with increasing sleep periods, when X-MAC has to wait longer for the forwarders to wake up, PaderMAC's opportunistic forwarder selection pays off. Since sending the preamble consumes most of the energy, we measured the preamble durations for each transmission, but also had to account for the different path lengths, i.e. different numbers of single transmissions. We therefore took the sum of all preamble durations for each sample to have a fair comparison of energy consumption. Using the CC2420 datasheet we computed the energy consumption in Joule used for all preambles in an end-to-end transmission. As shown in Figure 5, PaderMAC can save up to a factor of two in energy consumption compared to X-MAC. In addition to the above-mentioned metrics, we measured how fairly the task of relaying packets towards the destination, and hence the energy consumption, was distributed over the available relays. The fairness shown in Figure 6 results from computing the Herfindahl index over the number of packets each relay node received. With N = 28 being the number of relay nodes and ri being the number of packets relay i received, the Herfindahl index H is computed as

H := \frac{\sum_{i=1}^{N} r_i^2}{(\sum_{i=1}^{N} r_i)^2}    (1)
Since all nodes, except those at positions (1,1) and (5,6), act as relays, the ideal fairness for our multihop scenario is 1/28. Figure 7 illustrates how the load of forwarding packets was distributed over the network during our measurement, with the hotspots showing the shortest path taken by X-MAC and the smooth load distribution of PaderMAC, albeit with the possibility of slightly longer paths.
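For reference, Equation (1) can be computed directly from the per-relay packet counts; the following short Python function is our own illustration.

```python
# Illustrative sketch: Herfindahl fairness index from per-relay packet counts.

def herfindahl(packet_counts):
    """H = sum(r_i^2) / (sum(r_i))^2; equals 1/N when the load is perfectly balanced."""
    total = sum(packet_counts)
    return sum(r * r for r in packet_counts) / (total * total)

# Example: the ideal fairness with 28 equally loaded relays is 1/28.
# herfindahl([5] * 28) == 1 / 28
```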
Fig. 4. The end-to-end latency for a roundtrip from position (1,1) to (5,6) and back (x-axis: sleep periods [ms]; y-axis: RTT latency [ms]; curves: X-MAC, PaderMAC)

Fig. 5. The power consumption, measured by adding up the preamble durations along the path (x-axis: sleep periods [ms]; y-axis: global power consumption for preambles [J]; curves: X-MAC, PaderMAC)

Fig. 6. Load balancing in terms of packet fairness (x-axis: sleep periods [ms]; y-axis: packet fairness; curves: X-MAC, PaderMAC; the ideal fairness of 1/28 is marked)

Fig. 7. Heat map of forwarded packets throughout the node grid (left panel: PaderMAC; right panel: X-MAC; axes: rows and columns of the grid, with src and dst marked)
The near-ideal fairness of PaderMAC, as shown in Figure 6, in conjunction with the reduced power consumption shown in Figure 5, can drastically extend the mean time to node failure in sensor networks.
6 Conclusions
Integrating a strobed preamble MAC layer with an opportunistic routing layer is an appealing cross-layer optimization approach. This work describes PaderMAC, a practical implementation of such a MAC layer integration using the TinyOS runtime environment and the MAC layer architecture MLA. Our implementation contributes an extension to the MLA: a feedback-capable component for sending preamble strobes. The MAC protocol design of PaderMAC is generic in the respect that it can also be combined with other opportunistic routing layer implementations. Furthermore, by combining it with a traditional non-opportunistic routing layer implementation, the functionality of X-MAC is contained in the PaderMAC implementation as well. Our real-hardware testbed evaluation of X-MAC and PaderMAC in conjunction with a simple geographic greedy routing mechanism complements already known theoretical and simulation studies on integrating a strobed preamble MAC layer with opportunistic routing. It illustrates in practice how opportunistic routing in conjunction with such a MAC layer can improve the performance of WSNs in terms of reduced end-to-end latency, reduced power consumption and increased fairness. Two future research tracks can be followed with respect to PaderMAC. On the one hand, PaderMAC can be combined and studied with routing schemes which are more sophisticated than plain opportunistic greedy packet forwarding. In particular, an integration of PaderMAC with CTP, allowing nodes that improve the progress toward the sink or a set of sink nodes to contend for packet forwarding, is an interesting WSN-relevant research direction. A further research track is a PaderMAC extension improving how the next relay node is selected. So far the relay for each transmission is implicitly selected via the receiver contention mechanism: the earliest relay or the relay with the smallest backoff window wins. A future PaderMAC implementation could, however, let the preamble sender first collect several potential receiver replies and then decide on the best one of these (for instance the receiver minimizing the hop distance or the ETX towards the end receiver). How long the sender should wait for possibly better next-hop node replies then becomes a classical optimal stopping problem.
References
1. Ahn, G., Hong, S., Miluzzo, E., Campbell, A., Cuomo, F.: Funneling-mac: a localized, sink-oriented mac for boosting fidelity in sensor networks. In: Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, p. 306. ACM, New York (2006)
2. Biswas, S., Morris, R.: Exor: Opportunistic multi-hop routing for wireless networks. In: Proceedings of the Annual Conference of the Special Interest Group on Data Communication (SIGCOMM), pp. 133–144 (2005)
3. Blum, B.M., He, T., Son, S., Stankovic, J.A.: IGF: A state-free robust communication protocol for wireless sensor networks. Tech. Rep. CS-2003-11, Department of Computer Science, University of Virginia (April 21, 2003)
4. Buettner, M., Yee, G., Anderson, E., Han, R.: X-MAC: a short preamble MAC protocol for duty-cycled wireless sensor networks. In: Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, p. 320. ACM, New York (2006)
5. De Couto, D.S.J., Aguayo, D., Bicket, J., Morris, R.: A high-throughput path metric for multi-hop wireless routing. In: Proceedings of the 9th Annual International Conference on Mobile Computing and Networking, p. 134. ACM Press, New York (2003)
6. El-Hoiydi, A., Decotignie, J.D.: WiseMAC: an ultra low power MAC protocol for the downlink of infrastructure wireless sensor networks. In: Proceedings of Ninth International Symposium on Computers and Communications, ISCC 2004 (IEEE Cat. No.04TH8769), vol. 1, pp. 244–251 (2007)
7. Finn, G.G.: Routing and addressing problems in large metropolitan-scale internetworks. Tech. Rep. ISI/RR-87-180, Information Sciences Institute (ISI) (March 1987)
8. Fonseca, R., Gnawali, O., Jamieson, K., Levis, P.: Collection tree protocol. In: SenSys 2009: Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems. ACM, New York (2009)
9. Frey, H.: Geographical cluster based routing with guaranteed delivery. In: 2nd IEEE International Conference on Mobile Ad-hoc and Sensor Systems (MASS 2005), Washington, DC, USA (November 7-10, 2005)
10. Frey, H., Görgen, D.: Geographical cluster based routing in sensing-covered networks. IEEE Transactions on Parallel and Distributed Systems: Special issue on Localized Communication and Topology Protocols for ad hoc Networks 17(4), 885–891 (2006)
11. Füßler, H., Widmer, J., Käsemann, M., Mauve, M., Hartenstein, H.: Contention-based forwarding for mobile ad-hoc networks. Ad Hoc Networks 1(4), 351–369 (2003)
12. Heissenbüttel, M., Braun, T.: BLR: Beacon-less routing algorithm for mobile ad-hoc networks. Elsevier's Computer Communications Journal 27, 1076–1086 (2003)
13. Hill, J., Szewczyk, R., Woo, A., Hollar, S., Culler, D., Pister, K.: System architecture directions for networked sensors. ACM Sigplan Notices 35(11), 93–104 (2000)
14. Klues, K., Hackmann, G., Chipara, O., Lu, C.: A component-based architecture for power-efficient media access control in wireless sensor networks. In: Proceedings of the 5th International Conference on Embedded Networked Sensor Systems - SenSys 2007, vol. 1, p. 59 (2007)
15. Lin, E.Y., Rabaey, J., Wolisz, A.: Power-efficient rendez-vous schemes for dense wireless sensor networks. In: Proceedings of the IEEE International Conference on Communications, vol. 7, pp. 3769–3776 (June 2004)
16. Lin, P., Qiao, C., Wang, X.: Medium access control with a dynamic duty cycle for sensor networks. In: IEEE Wireless Communications and Networking Conference, WCNC 2004, vol. 3, pp. 1534–3159 (2004)
17. Lu, G., Krishnamachari, B., Raghavendra, C.S.: An Adaptive Energy-Efficient and Low-Latency MAC for Data Gathering in Wireless Sensor Networks. In: Proceedings of the 18th International Parallel and Distributed Processing Symposium (2004)
18. Nelson, R., Kleinrock, L.: The spatial capacity of a slotted aloha multihop packet radio network with capture. IEEE Transactions on Communications 32(6), 684–694 (1984)
130
M. Autenrieth and H. Frey
19. Park, V.D., Corson, M.S.: A highly adaptive distributed routing algorithm for mobile wireless networks. In: Proceedings of the 16th IEEE Conference on Computer Communications (INFOCOM 1997) (1997) 20. Philip, S.J., Ghosh, J., Ngo, H.Q., Qiao, C.: Routing on overlay graphs in mobile ad hoc networks. In: Proceedings of the IEEE Global Communications Conference, Exhibition & Industry Forum (GLOBECOM 2006) (2006) 21. Polastre, J., Hill, J., Culler, D.: Versatile low power media access for wireless sensor networks. In: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, pp. 95–107. ACM, New York (2004) 22. Rhee, I., Warrier, A., Aia, M., Min, J., Sichitiu, M.L.: Z-MAC: A Hybrid MAC for Wireless Sensor Networks. IEEE/ACM Transactions on Networking 16(3), 511–524 (2008) 23. Shah, R., Wietholter, S., Wolisz, a., Rabaey, J.: When Does Opportunistic Routing Make Sense? In: Third IEEE International Conference on Pervasive Computing and Communications Workshops, vol. 1, pp. 350–356 (March 2005) 24. Stojmenovic, I., Lin, X.: Power-aware localized routing in wireless networks. IEEE Transactions on Parallel and Distributed Systems 12(11), 1122–1133 (2001) 25. Sun, Y., Gurewitz, O., Johnson, D.B.: RI-MAC: a receiver-initiated asynchronous duty cycle MAC protocol for dynamic traffic loads in wireless sensor networks. In: SenSys 2008: Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems, pp. 1–14. ACM, New York (2008) 26. Takagi, H., Kleinrock, L.: Optimal transmission ranges for randomly distributed packet radio terminals. IEEE Transactions on Communications 32(3), 246–257 (1984) 27. Tejeda, H., Ch´ avez, E., Sanchez, J.A., Ruiz, P.M.: A virtual spanner for efficient face routing in multihop wireless networks. In: Cuenca, P., Orozco-Barbosa, L. (eds.) PWC 2006. LNCS, vol. 4217, pp. 459–470. Springer, Heidelberg (2006) 28. Van Dam, T., Langendoen, K.: An adaptive energy-efficient MAC protocol for wireless sensor networks. In: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, p. 180. ACM, New York (2003) 29. Ye, W., Heidemann, J., Estrin, D.: An energy-efficient MAC protocol for wireless sensor networks. Tech. rep., USC/ISI (2001) 30. Zorzi, M., Rao, R.R.: Geographic random forwarding (geraf) for ad hoc and sensor networks: Energy and latency performance. IEEE Transactions on Mobile Computing 2(4), 349–365 (2003)
Overhearing for Congestion Avoidance in Wireless Sensor Networks
Damien Roth, Julien Montavont, and Thomas Noël
Image Sciences, Computer Sciences and Remote Sensing Laboratory (LSIIT UMR CNRS 7005), University of Strasbourg, France
{droth,montavont,noel}@unistra.fr
Abstract. The convergecast traffic pattern is predominant in current wireless sensor networks. Few packets are periodically sent toward the sink, but interesting events may generate a burst of packets for a limited period of time. Such a burst may create congestion in the surroundings of the event and thus may result in packet loss. Several congestion avoidance solutions exist in the literature, but they either involve a lot of control messages or complicate the deployment of sensor networks. We therefore propose a new approach, named CLOMAC, which can be integrated into existing preamble-based MAC protocols. CLOMAC reduces congestion by overhearing and passively creating alternative paths toward the destination. An evaluation by simulation demonstrates the benefits of our contribution integrated with the B-MAC protocol. Keywords: Wireless Sensor Networks, Medium Access Control, Congestion Avoidance, B-MAC, CLOMAC.
1 Introduction
A wireless sensor network is composed of small and autonomous sensor nodes which cooperate to observe their environment in a non-intrusive way. In most cases, collected data (e.g. temperature, luminosity, heartbeat, etc.) is routed toward a collecting station (i.e. the sink) through wireless communications [13]. Due to their small size, the sensor nodes are heavily constrained: low computing capabilities, limited memory and energy resources, etc. New communication protocols therefore have to be designed specifically for wireless sensor networks.

Current wireless sensor networks mainly involve a convergecast data traffic pattern. Most of the time few packets are periodically sent toward the sink. But whenever an interesting event occurs, sensor nodes located in the vicinity may generate a large number of packets to collect more data about the event. Such a burst of packets may create congestion and thus may increase packet collisions, delays and loss. Depending on the nature of the event, successful packet delivery and packet reception latency may be of crucial importance. Consider as an example a wireless sensor network deployed to monitor fire outbreaks in forests. Bursts of packets may occur whenever a fire starts and may therefore generate congestion and loss of information. In the worst case scenario, such a situation may lead to loss of lives and property.
In this article, we propose CLOMAC (Cross-Layer Opportunistic MAC), a new congestion avoidance scheme for wireless sensor networks. The idea lying behind CLOMAC is to benefit from the overhearing inherent in wireless communication to opportunistically use alternative paths toward the sink and therefore reduce congestion. CLOMAC is designed to be integrated with preamble-based MAC protocols [6,7], which are very popular and prone to overhearing. The remaining parts of the article are organized as follows. Section 2 presents different congestion avoidance solutions available in the literature. Then, the foundations of CLOMAC and its integration with the B-MAC protocol [7] are exposed in Section 3. The simulation parameters and results of the performance evaluation are detailed in Section 4. Finally, conclusions and future work are presented in Section 5.
2 State of the Art
There are various methods to control or avoid congestion in wireless sensor networks. In particular, the convergecast traffic pattern is conducive to congestion around the sink due to the funneling effect [10]. By sharing a time scale between all sensor nodes, synchronized MAC protocols are able to handle this problem. They use either slotted schemes [3] or common active/sleep periods [12] to send or receive data in a synchronized manner. However, establishing and maintaining synchronization over a whole sensor network is a difficult task which is hardly scalable and may require a significant overhead.

The scalability of synchronized MAC protocols may be improved by hybrid approaches such as the Funneling-MAC protocol [1]. This protocol uses a synchronized MAC protocol for communication with the sink and a CSMA-based protocol in the rest of the network. Each sensor node located at one hop from the sink splits its communication window in two parts: one for CSMA-based communications and the other for synchronized communications. However, such a hybrid approach has several drawbacks. First, packets sent between fully CSMA-based nodes in the vicinity of the synchronized zone may collide with packets sent within the synchronized zone. In addition, hybrid nodes are the last relays to reach the sink. Splitting the communication window of such nodes may significantly increase the competition to successfully transmit packets from CSMA-based nodes to these nodes.

Mobile sinks can also be used to tackle the funneling effect. Indeed, the sink may periodically move to different locations to balance the traffic in the network [5]. However, every movement may involve a significant signaling overhead to maintain paths toward the sink and thus may consume a large amount of energy.

Although most of the literature focuses on congestion occurring at the sink, a few works propose more generic solutions for congestion avoidance. The Tree-based Multi-Channel Protocol (TMCP) [11] uses multiple radio channels to increase the transmission rate of packets. TMCP divides the network into several trees with the same root (the sink) and allocates a different wireless channel to each.
These wireless channels are selected using an orthogonality property: a transmission on a specific channel should not interfere with the other channels. This method therefore increases the overall data rate by segmenting the network. However, allocating wireless channels while limiting radio interference is a difficult problem. TMCP advocates the use of a greedy algorithm, which may however consume a lot of energy whenever the trees are created. In addition, the design of the sink is not trivial, as it should be able to listen to every channel at the same time (e.g. by using multiple transceivers).

Wireless communications potentially involve multiple receptions of each transmitted packet. Opportunistic routing protocols such as [8] may take advantage of these overhearings to reduce congestion by allowing nodes to relay packets even if they are not the destination at the MAC level. By this means, packets can naturally take alternative paths to reach their final destination. However, one of the major issues of such a solution is to ensure that only a single copy of each packet is transmitted. In CA-PATH [8], an ordered list of relays is included in the routing header. Which relay will forward a packet is settled by overhearing: if none of the higher-rank relays has forwarded the packet within a certain amount of time, the node can relay the packet. In wireless sensor networks, although the creation of such a list may consume energy, the major drawback is the overhearing which is inherent in opportunistic solutions at two levels: to get the data packet and to settle its effective retransmission. The wireless transceiver being the most energy consuming element of a wireless sensor node [2], such overhearing may seriously reduce the battery life of sensor nodes. In the next section, we detail how our proposal deals with this issue.
3 CLOMAC

3.1 General Overview
CLOMAC (Cross-Layer Opportunistic MAC) is designed to avoid congestion in wireless sensor networks using the characteristics of wireless communication. When a sensor node sends a packet, all sensor nodes in its transmission range may receive the packet due to overhearing. Generally, if the node is not the destination of the packet, this packet is silently discarded. However, those multiple receptions are used by opportunistic protocols to improve the performance of the network. We propose to apply this mechanism to preamble-based MAC protocols in order to dynamically create alternative paths toward the sink when congestion occurs. We target preamble-based MAC protocols because they are prone by nature to overhearing due to their asynchronous operation. Among these protocols, a few include information such as the source or the destination of the pending data packet in the preamble. In the following, we consider a generic preamble-based MAC protocol in which the preamble does not contain such information. CLOMAC is designed to operate with a convergecast traffic pattern in which data packets include a sequence number. Its operation starts whenever the surrounding area of a sensor node is detected as congested. To detect congestion, CLOMAC relies on the retransmission mechanism included in most MAC
protocols. Generally, a sender keeps on retransmitting a packet as long as it has not received a confirmation of successful reception (ACK) of the transmitted packet or has not reached the maximum number of retransmissions. In CLOMAC, a sender is considered as congested when the number of retransmissions is greater than a certain threshold referred to as CT (Congestion Threshold). CT should be carefully selected so as not to react to very temporary problems such as signal degradation. To inform sensor nodes in the neighborhood, the number of retransmissions is included in the MAC header of the packets. A sensor node overhearing a packet can therefore determine whether the sender is congested by comparing CT to the current number of retransmissions.

To reduce congestion, CLOMAC relies on the same principle as opportunistic routing protocols. When a sensor node detects a congested sender, it can accept the packet the sender tries to retransmit on behalf of the real destination. Those nodes are referred to as opportunistic nodes in the rest of the article. Once an opportunistic node has overheard a packet in which the number of retransmissions has crossed CT, it goes back to sleep until the next transmission of the sender. Upon the reception of the next preamble, the opportunistic node sends back an acknowledgment which indicates that it requests the incoming packet. If the sender receives an acknowledgement after its preamble, it changes the destination of the data packet to the opportunistic node. Upon successful reception, the opportunistic node acknowledges the data packet and forwards it to its next hop. This procedure ensures that the opportunistic node does not forward a packet that has already been acknowledged by another node. For example, the opportunistic node may not be in the transmission range of the real destination and thus cannot overhear its acknowledgment. This procedure also synchronizes the set of potential opportunistic nodes located in the vicinity of the sender. Even if multiple opportunistic nodes try to acknowledge the preamble, only the destination of the data packet is allowed to forward it. By this means, a unique copy of the data will finally be received by the sink. The whole procedure is illustrated in Figure 1.

Fig. 1. Alternative path creation with CLOMAC

The main problem in opportunistic protocols is the synchronization between nodes to ensure that a unique copy of each packet is forwarded to the destination. Also, opportunistic nodes should not create loops in the network. In order to select opportunistic nodes, CLOMAC uses a heuristic based on information from the routing protocol. A simple scheme is to allow a node to be opportunistic for a
specific sender only if this node is closer to the destination than the sender with respect to the routing metric. Such a scheme ensures that CLOMAC is loop-free but may not select enough opportunistic nodes in the vicinity of a sender to efficiently avoid congestion. The chosen solution is to also allow nodes with a routing metric equivalent to that of the sender to be opportunistic. However, routing packets between such nodes may create loops in the network. CLOMAC prevents the creation of loops by recording information about transmitted data packets in a distributed table located in each node. All packets sent or retransmitted by a node are registered together with their originator and sequence number for a short period of time. A node accepts to be opportunistic for a packet only if its originator and sequence number are not already included in its table. By this means, a packet cannot be transmitted several times by the same node. In the following, we detail the integration of CLOMAC in the B-MAC protocol [7]. With B-MAC, a sensor node has to listen to the preamble plus the data packet to determine the destination of the data packet. Such overhearing is one of the main drawbacks of B-MAC, but it can become an asset in CLOMAC.
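The decision taken by an overhearing node in Section 3.1 can be summarized in a few lines. The sketch below is only an illustration under stated assumptions: the structure names, field widths and the lifetime of a table entry (SEEN_LIFETIME) are not taken from the authors' implementation; only the three checks (congestion threshold, duplicate table, gradient rank) mirror the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define CT            4      /* congestion threshold (retransmissions)     */
#define SEEN_SIZE     32     /* entries kept for loop/duplicate prevention */
#define SEEN_LIFETIME 2000   /* assumed lifetime of an entry, in ms        */

/* Fields CLOMAC adds to the MAC header (names are illustrative). */
struct clomac_hdr {
    uint16_t originator;     /* node that generated the packet             */
    uint16_t seq;            /* sequence number set by the originator      */
    uint8_t  retries;        /* current number of retransmissions          */
};

struct seen_entry { uint16_t originator, seq; uint32_t expires; bool used; };
static struct seen_entry seen[SEEN_SIZE];

/* Has this (originator, sequence number) pair already been sent or relayed? */
static bool already_seen(const struct clomac_hdr *h, uint32_t now_ms)
{
    for (int i = 0; i < SEEN_SIZE; i++)
        if (seen[i].used && seen[i].expires > now_ms &&
            seen[i].originator == h->originator && seen[i].seq == h->seq)
            return true;
    return false;
}

/* Remember a packet we sent or relayed, recycling expired slots. */
static void remember(const struct clomac_hdr *h, uint32_t now_ms)
{
    for (int i = 0; i < SEEN_SIZE; i++)
        if (!seen[i].used || seen[i].expires <= now_ms) {
            seen[i] = (struct seen_entry){ h->originator, h->seq,
                                           now_ms + SEEN_LIFETIME, true };
            return;
        }
}

/* A node overhearing a data packet offers to act as an opportunistic node
 * only if (1) the sender looks congested, (2) it has not already forwarded
 * this packet, and (3) its gradient rank is not larger than the sender's,
 * which keeps CLOMAC loop-free. */
static bool offer_help(const struct clomac_hdr *h,
                       uint8_t my_rank, uint8_t sender_rank, uint32_t now_ms)
{
    return h->retries > CT &&
           !already_seen(h, now_ms) &&
           my_rank <= sender_rank;
}
```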
3.2 Integration with B-MAC
The B-MAC protocol [7] is a seminal work on MAC protocols based on preamble sampling. A B-MAC transmission consists of a long preamble followed by a short synchronization (SYNC) word. The data packet is sent directly after the SYNC word. Since neither the preamble nor the SYNC word contains information about the destination, all the nodes in transmission range have to remain in RX mode until the complete reception of the data packet. Consequently, CLOMAC benefits from the overhearing inherent in B-MAC. We however made a few modifications to B-MAC to operate CLOMAC with B-MAC. First, to allow opportunistic nodes to acknowledge the preamble and thus become the new destination of data packets, we include a small inter-frame space (IFS) between the SYNC word and the data packet. Then, we include in the MAC header the packet sequence number, the initial sender and the current number of retransmissions for this packet, as required by CLOMAC. This new version of B-MAC combined with CLOMAC is called B-MAC+CLOMAC in the rest of this article.
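For illustration, the extra information carried on the air could look like the packed structures below; the exact field names and sizes are assumptions of this sketch, not the authors' frame format.

```c
#include <stdint.h>

/* Gap inserted between the SYNC word and the data frame, during which an
 * opportunistic node may acknowledge the preamble (1 ms in the evaluation). */
#define IFS_MS 1

/* MAC header fields needed by B-MAC+CLOMAC (illustrative layout). */
struct bmac_clomac_hdr {
    uint16_t dst;          /* current destination, may be re-targeted      */
    uint16_t src;          /* current sender                               */
    uint16_t originator;   /* node that generated the packet               */
    uint16_t seq;          /* sequence number set by the originator        */
    uint8_t  retries;      /* retransmission counter, compared against CT  */
} __attribute__((packed));

/* A preamble acknowledgement names the packet it applies to, so the sender
 * re-targets the pending frame only when both identifiers match. */
struct preamble_ack {
    uint16_t originator;
    uint16_t seq;
    uint16_t new_dst;      /* address of the opportunistic node            */
} __attribute__((packed));
```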
3.3 Error Cases
We identified a few error cases when operating B-MAC+CLOMAC. The following briefly describes the potential problems and how CLOMAC deals with them.

Collision between preamble acknowledgements. Before sending a preamble acknowledgement, an opportunistic node checks whether the medium is clear in order to avoid collisions with any other communication. In particular, another opportunistic node may also send an acknowledgement for the same preamble. However, a collision can still occur at the sender if two opportunistic nodes which are not in the same communication range send their acknowledgement at the same time. In CLOMAC, opportunistic nodes detect such a collision if the pending packet is
still destined to the standard destination. To avoid further collisions between preamble acknowledgements, opportunistic nodes send new acknowledgements for the next transmission only according to a probability p: after a collision on preamble acknowledgments, each opportunistic node draws a random number in [0, 1] and, if this number is less than or equal to p, it can send a new acknowledgment for the next preamble. This scenario is represented in Figure 2.

Fig. 2. Collision detection of preamble acknowledgements

Acknowledging the right data packet. Since the preamble and the SYNC word of B-MAC do not contain any information about the sender or the destination, an opportunistic node may acknowledge the preamble of a different sender than the previous one. Also, an opportunistic node may acknowledge a data packet which has already been successfully sent to the destination (the retransmission of such a packet being due to a collision on the reception of the acknowledgment). To ensure that the preamble acknowledgement is sent for the right sender and data packet, opportunistic nodes use the source and sequence number included in the MAC header of each packet (as required by CLOMAC). When acknowledging a preamble, an opportunistic node includes the source and sequence number of the packet for which it requests to be opportunistic. A sender accepts a preamble acknowledgement only if the source and sequence number included in the acknowledgment match those of the pending data packet. Otherwise, the sender discards the preamble acknowledgement and sends the pending data packet to the original destination. Whenever an opportunistic node receives a new packet after the transmission of a preamble acknowledgment, it goes back to sleep and waits for the next transmission. This scenario is illustrated in Figure 3.

Fig. 3. Acknowledging the preamble of the right data packet
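Both error-handling rules can be expressed compactly; the sketch below (function and constant names invented for the example) only illustrates the probabilistic re-acknowledgement and the identifier check described above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define ACK_PROBABILITY 0.5   /* p used in the simulations */

/* After a collision between preamble acknowledgements, an opportunistic node
 * acknowledges the next preamble only with probability p. */
static bool retry_preamble_ack(void)
{
    double u = (double)rand() / ((double)RAND_MAX + 1.0);  /* u in [0, 1) */
    return u <= ACK_PROBABILITY;
}

/* The sender accepts a preamble acknowledgement only if it names the packet
 * currently pending; otherwise the frame goes to its original destination. */
static bool ack_matches(uint16_t ack_originator, uint16_t ack_seq,
                        uint16_t pending_originator, uint16_t pending_seq)
{
    return ack_originator == pending_originator && ack_seq == pending_seq;
}
```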
4 Evaluation of Our Proposal
We integrated CLOMAC in the version of B-MAC available in the WSNet simulator (http://wsnet.gforge.inria.fr/). WSNet is a discrete event simulator dedicated to the study of wireless sensor networks. It includes various models of radio propagation and interference, which eases the study of the lower layers of the protocol stack. In addition to the development of B-MAC+CLOMAC, we also implemented a mechanism for the detection of packet duplicates. This mechanism uses the packet sequence numbers that we have included in the MAC header. If a sensor node tries to send an already successfully transmitted packet, the receiver silently discards this copy but still sends back an acknowledgment to stop the retransmission process. Such a situation occurs when the acknowledgment for a data packet has not been received by the sender. This mechanism is included in both B-MAC (referred to as B-MAC duplication free) and B-MAC+CLOMAC (referred to as B-MAC+CLOMAC duplication free). In the following, we compare the performance of all of these solutions (i.e., B-MAC, B-MAC duplication free, B-MAC+CLOMAC and B-MAC+CLOMAC duplication free) in terms of packet loss and energy consumption.

4.1 Simulation Environment
Our simulation scenario consists of 100 wireless sensor nodes deployed in a square grid of 100 m x 100 m. Each node has a transmission range of 15 meters and thus has an average of 8 neighbors. The network uses gradient-based routing [4] to calculate the next hop toward the sink. Each sensor node runs an event-based application: data is only transmitted upon event detection. At the beginning of the simulation we distributed all the events among the 600 seconds of the simulation (divided into 1 s slots) according to a Poisson process with rate

    λ_poisson = evt / 600,    evt ∈ {5, 10, 15, 20, 25, 30}.
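As an illustration of this event model, the sketch below draws, for every one-second slot, a Poisson-distributed number of events with rate λ = evt/600 (Knuth's algorithm). It is only meant to mirror the description above; it is not the generator used inside WSNet.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Sample one Poisson(lambda) variate (Knuth's algorithm, fine for small lambda). */
static int poisson(double lambda)
{
    double limit = exp(-lambda), p = 1.0;
    int k = 0;
    do {
        k++;
        p *= (double)rand() / ((double)RAND_MAX + 1.0);
    } while (p > limit);
    return k - 1;
}

int main(void)
{
    const int duration = 600;             /* simulated seconds (1 s slots)    */
    const int evt = 30;                   /* expected number of events, 5..30 */
    const double lambda = (double)evt / duration;

    srand(42);
    for (int slot = 0; slot < duration; slot++) {
        int n = poisson(lambda);
        for (int e = 0; e < n; e++)
            printf("event starts at t = %d s (lasts 10 s)\n", slot);
    }
    return 0;
}
```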
The location of the events is uniformly distributed over the area. Upon the detection of an event, a sensor node starts sending data packets of 4 bytes toward the sink every second for the duration of the event. An event lasts 10 seconds. The nodes alternatively use B-MAC or B-MAC+CLOMAC with a preamble duration fixed to 100 ms. The energy model emulates the energy consumption of a CC1100 transceiver from Texas Instruments (http://ti.com/lit/gpn/cc1100). The consumption values and every simulation parameter are reported in Table 1.

Table 1. Parameters used in our simulations
  Topology: square grid (100 m x 100 m) of 100 sensor nodes
  Data sending period: upon event detection
  Event detection range: 15 m
  Number of events: 5, 10, 15, 20, 25, 30
  Routing model: gradient-based routing protocol
  MAC model: B-MAC, B-MAC+CLOMAC; preamble duration 100 ms; max. number of retransmissions 8
  Radio model: frequency 868 MHz, BPSK; range 15 m
  Energy model (3 V battery): idle 1.6 mA, Rx 15 mA, Tx 16.9 mA, radio init. 8.2 mA
  Duration and number of simulations: 600 seconds, simulated 50 times
  CLOMAC parameters: Congestion Threshold (CT) 4; IFS duration 1 ms; ACK probability p = 1/2

The heuristic used to select opportunistic nodes is based on the rank of the sensor nodes. In gradient-based routing, the rank refers to the distance (in number of hops) between a sensor node and the sink. In CLOMAC, only the nodes with the same or a lower rank than a sender can act as opportunistic nodes for this sender. An opportunistic node considers a sender as congested if the latter retransmits the same data packet more than 4 times. If a collision occurs between preamble acknowledgements, opportunistic nodes have a probability p of 1/2 of acknowledging the next preamble, so that the chance of further collisions is reduced.
4.2 Results and Analysis
Each pair (MAC protocol, number of events) has been simulated 50 times. The simulations also include a warm-up period of 60 s. Each set of simulations uses the same seed for random number generation in order to evaluate the different solutions in a similar environment. The results presented in this section are an average of the overall data collected over the set of simulations. The 95% confidence intervals indicate the reliability of our measurements.

Figure 4 represents the percentage of packets lost at the application layer. A packet is considered as lost for the application when the number of retransmissions at the MAC layer has reached the maximum allowed value (fixed to 8 in our simulations). Whatever the MAC protocol used, the sensor nodes sent between 249 (±18) packets with 5 events and 1488 (±42) packets with 30 events. The results show that the integration of CLOMAC into B-MAC outperforms B-MAC and divides the number of lost packets by 3 to 5. CLOMAC makes this possible by limiting congestion, and therefore packet loss, by passively using alternative paths. We observed that packet loss is always due to transmission issues.

Fig. 4. Packet loss ratio at the application layer

The reasons for packet loss at the MAC layer are further analyzed in Table 2. A packet is considered as lost at the MAC layer whenever the sender schedules its retransmission. With B-MAC, we can observe that 5542 packets on average are lost because the packets are not captured by the radio transceiver. This means that the radio transceiver has captured another signal (e.g., a data packet or a preamble) whose reception sensitivity was better. B-MAC+CLOMAC is able to limit those losses by using opportunistic nodes to get around congested areas. However, most of the packet loss is due to data packet errors: the data packet is captured correctly but discarded because it is corrupted. This may happen because of environmental interference or when a node receives multiple packets at the same time. Since B-MAC does not use any mechanism like RTS/CTS, it is prone to the hidden node problem [9]. CLOMAC and B-MAC have similar results for this type of error; most of these losses occur during the first part of the retransmission process (before CLOMAC is activated). By examining the location of lost packets in detail, we can observe that most of them are lost outside the event area. With the CT value used in the simulations, CLOMAC is only activated in highly congested areas (the event areas) and not necessarily in other areas which may experience relatively little congestion (e.g., the routing backbone). To further reduce the packet loss in those areas, we could consider the use of a dynamic CT.
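The ± values reported in Table 2 (and throughout the evaluation) are 95% confidence half-widths over the 50 runs. A generic way to compute them is sketched below, using the usual normal approximation; this is not necessarily the exact post-processing used by the authors.

```c
#include <math.h>
#include <stdio.h>

/* Mean and 95% confidence half-width of a metric over independent runs. */
static void mean_ci95(const double *x, int n, double *mean, double *half_width)
{
    double sum = 0.0, sq = 0.0;
    for (int i = 0; i < n; i++) sum += x[i];
    *mean = sum / n;
    for (int i = 0; i < n; i++) sq += (x[i] - *mean) * (x[i] - *mean);
    *half_width = 1.96 * sqrt(sq / (n - 1)) / sqrt((double)n);
}

int main(void)
{
    double runs[5] = { 6100.0, 6420.0, 6275.0, 6350.0, 6230.0 }; /* toy data */
    double m, hw;
    mean_ci95(runs, 5, &m, &hw);
    printf("%.2f (+/- %.2f)\n", m, hw);
    return 0;
}
```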
Table 2. Average number of lost packets at the MAC layer and their location in the network with 30 events

MAC protocol                   | Lost packets: error | Lost packets: not captured | Location: near sink | event area | other
B-MAC                          | 6331.91 (±230.92)   | 5542.03 (±159.27)          | 143.58              | 2542.42    | 13879.48
B-MAC duplication free         | 4123.12 (±114.34)   | 3402.98 (±80.82)           | 74.26               | 2472.78    | 8556.70
B-MAC+CLOMAC                   | 6176.70 (±104.45)   | 1216.52 (±71.89)           | 80.04               | 1966.52    | 9294.62
B-MAC+CLOMAC duplication free  | 4572.49 (±62.55)    | 955.38 (±47.00)            | 55.58               | 1924.70    | 6752.48

Fig. 5. Duplication ratio at the sink level
When a collision occurs during the transmission of an acknowledgement, the sender assumes that the transmitted packet has been lost and schedules its retransmission. Consequently, a destination may receive multiple copies of the same packet. However, the transmission of multiple copies of the same packet to the sink consumes resources, such as energy, in vain. Figure 5 represents the percentage of duplicated packets received by the sink at the end of the simulation. The results show that, in addition to reducing packet loss, CLOMAC also helps to reduce the number of duplicated packets. By distributing packets over multiple paths, CLOMAC reduces the contention level on the medium and thus reduces the chance of collision. However, duplication of packets can still occur with B-MAC+CLOMAC. Before the activation of CLOMAC, collisions can occur as in B-MAC, which explains the number of duplicated packets observed for B-MAC+CLOMAC. Nevertheless, we noticed that CLOMAC itself does not generate packet duplication thanks to the preamble acknowledgment
process. Finally, we can observe, as expected, that both B-MAC duplication free and B-MAC+CLOMAC duplication free do not generate packet duplication. As a result, the introduction of sequence numbers in the MAC header can be a practical solution for limiting packet duplication in wireless sensor networks.

CLOMAC reduces congestion by using alternative paths to route packets toward the sink. These additional paths are opportunistically created and therefore do not require any control messages, as explained in Section 3.1. Figure 6 represents the number of paths used to reach the sink. A path is composed of a unique sequence of hops to reach the sink. With only 5 events in the network, CLOMAC already increases the number of used paths by a factor of 2 (34 paths are used by B-MAC+CLOMAC and only 17 by B-MAC). With 30 events in the network, the number of used paths reaches 160 for B-MAC+CLOMAC. Indeed, the larger the number of events, the more opportunistic nodes there are, and therefore the more paths are created and used. By contrast, B-MAC is only able to use fewer than 60 paths. Note that the number of paths increases along with the number of events because new sources appear with new events.

Fig. 6. Number of paths used to reach the sink

This observation is clearer in Figure 7, which represents the links used in the network. The links used by both B-MAC and B-MAC+CLOMAC are represented in red and the additional links used by B-MAC+CLOMAC are colored in blue. The green circles represent the event areas which have appeared during the simulations. We can notice that more links are used by B-MAC+CLOMAC (108 links) than by B-MAC (42 links) to forward the packets. A higher proportion of the network is therefore used. As a result, this reduces congestion and distributes more fairly the energy consumption required to deliver all data packets to the sink.

Fig. 7. Paths used by the packets to reach the sink with 5 events (links used by both B-MAC and B-MAC+CLOMAC, and additional links used only by B-MAC+CLOMAC)

This latter observation is further analyzed in Figure 8, which represents the average energy consumption of the wireless sensor nodes at the end of the simulation.

Fig. 8. Average energy consumption of sensor nodes in the network

Opportunistic mechanisms are generally based on overhearing and therefore
the energy cost is of crucial importance. Despite increasing the listening time of the radio transceiver by adding an IFS between the SYNC word and the data packet, we can observe that B-MAC+CLOMAC is able to significantly reduce the energy consumption of the sensor nodes. The overhearing necessary to operate CLOMAC is already present in B-MAC, and thus only the IFS and the preamble
acknowledgment are sources of additional energy consumption. However, this cost is largely compensated by the reduction of the number of retransmissions and duplicated packets which are likely to occur with B-MAC alone. We can also remark that the energy consumption of B-MAC+CLOMAC is almost the same as that of its duplication free version. As explained in Section 4.1, the receiver is required to acknowledge the packet (even if it has already been received) to stop the retransmission process. Therefore, only the energy used to forward duplicated packets is saved compared to B-MAC+CLOMAC alone.
5 Conclusion
In this paper we have focused on congestion avoidance in wireless sensor networks. We have proposed CLOMAC (Cross-Layer Opportunistic MAC), which is based on the natural overhearing of preamble-based MAC protocols to create alternative paths toward the sink. Once the neighbors of a sender detect congestion (based on the number of retransmissions of the same data packet), they can accept the packet on behalf of the real destination. CLOMAC also includes synchronization mechanisms between opportunistic nodes in order to forward a unique copy of each packet. CLOMAC can be adapted to existing preamble-based MAC protocols. For this first work, we have integrated CLOMAC in B-MAC [7], a popular preamble-based MAC protocol. The combination of those two protocols is named B-MAC+CLOMAC.

The results of the simulations presented in Section 4.2 show that B-MAC+CLOMAC outperforms the B-MAC protocol. In essence, B-MAC+CLOMAC significantly increases the number of paths used to reach the sink. This reduces the congestion level on the medium and therefore drastically reduces packet loss. As these alternative paths are created opportunistically, they do not require additional messages to be maintained. Finally, opportunistic solutions generally suffer from high energy consumption. However, the energy required by CLOMAC for overhearing is mitigated by B-MAC, which is already prone to this phenomenon. On the contrary, the overall energy consumption is drastically reduced in B-MAC+CLOMAC compared to B-MAC thanks to the reduction of the number of retransmissions. Nevertheless, we have observed that the congestion threshold (CT) used to activate CLOMAC should be more dynamic to react more efficiently to minor congestion. We are currently investigating an adaptive threshold which activates CLOMAC for a certain period of time. In light of these results, we plan to further evaluate CLOMAC using other preamble-based MAC protocols such as X-MAC [6]. Future investigation could also focus on alternative heuristics to increase the number of opportunistic nodes while ensuring that CLOMAC remains loop-free. Finally, we expect to benefit from the SensLAB platform (http://www.senslab.info/) to extend our performance studies to large scale experiments.
References 1. Ahn, G.-S., Hong, S.G., Miluzzo, E., Campbell, A.T., Cuomo, F.: Funneling-mac: a localized, sink-oriented mac for boosting fidelity in sensor networks. In: Sensys (2006) 2. Anastasi, G., Conti, M., Francesco, M.D., Passarella, A.: Energy conservation in wireless sensor networks: A survey. ADHOCNETS 7 (2009) 3. IEEE Computer Society. IEEE Standard 802.15.4-2009: Wireless medium access control and physical layer specifications for low-rate wireless personal area networks (April 2009) 4. Karl, H., Willig, A.: Protocols and Architectures for Wireless Sensor Networks. John Wiley & Sons, Chichester (2005) 5. Luo, J., Hubaux, J.-P.: Joint mobility and routing for lifetime elongation in wireless sensor networks. In: INFOCOM 2005, vol. 3 (2005) 6. Buettner, E.A.M., Yee, G.V., Han, R.: X-mac: A short preamble MAC protocol for duty-cycled wireless sensor networks. In: SenSys (2006) 7. Polastre, J., Hill, J., Culler, D.: Versatile low power media access for wireless sensor networks. In: Sensys (2004) 8. Schaefer, G., Ingelrest, F., Vetterli, M.: Potentials of opportunistic routing in energy-constrained wireless sensor networks. In: Roedig, U., Sreenan, C.J. (eds.) EWSN 2009. LNCS, vol. 5432, pp. 118–133. Springer, Heidelberg (2009) 9. Rahman, A., Gburzynski, P.: Hidden problems with the hidden node problem. In: Biennial Symposium on Communications (2006) 10. Wan, C.-Y., Eisenman, S.B., Campbell, A.T., Crowcroft, J.: Siphon: overload traffic management using multi-radio virtual sinks in sensor networks. In: SenSys 2005 (2005) 11. Wu, Y., Stankovic, J., He, T., Lin, S.: Realistic and efficient multi-channel communications in wireless sensor networks. In: INFOCOM (2008) 12. Ye, W., Heidemann, J., Estrin, D.: Medium access control with coordinated adaptive sleeping for wireless sensor networks. IEEE/ACM Transactions on Networking 12 (2004) 13. Yick, J., Mukherjee, B., Ghosal, D.: Wireless sensor network survey. Computer Networks 52 (2008)
Multihop Performance of Cooperative Preamble Sampling MAC (CPS-MAC) in Wireless Sensor Networks
Rana Azeem M. Khan and Holger Karl
University of Paderborn, Paderborn, Germany
[email protected],
[email protected]
Abstract. Cooperative communication (CC) is a promising technique to combat fading in a wireless environment. In our previous work, we proposed the Cooperative Preamble Sampling (CPS) Medium Access Control (MAC) protocol, which highlighted the benefits of using CC in Wireless Sensor Networks (WSN). Initially, CPS-MAC performance was evaluated in a 3-node network comprising a single source, partner, and destination node. In this paper, we evaluate the performance of CPS-MAC in a multihop WSN configuration, where a large number of sensor nodes are deployed around a sink to create a data gathering network. All nodes generate traffic, which leads to channel contention, collisions, and idle listening. Results show that, under light traffic load, CPS-MAC performs on par with a non-cooperative preamble sampling protocol and performs significantly better as the traffic load increases. Keywords: Cooperative Communication, Medium Access Control, Preamble Sampling, Wireless Sensor Networks, Reliability.
1 Introduction and Background
Applications such as target tracking and search-and-rescue require sensor nodes to be deployed on moving objects such as robots or vehicles (cars, trains, and airplanes). In scenarios with mobility, random scattering from reflectors causes multiple copies of a transmitted signal to arrive (and interfere) at the receiver with different gains, phase shifts, and delays. These multiple signal replicas can result in destructive interference at the receiver, causing fading and temporary failure of communication. As sensor nodes are battery powered and operate under strict energy constraints, data transmission at high transmission powers and over large distances is infeasible. Under such conditions, ensuring reliable communication is a challenging problem. This has motivated us to propose the Cooperative Preamble Sampling Medium Access Control (CPS-MAC) protocol [1]. CPS-MAC, based on preamble sampling [6] and cooperative communication (CC) [2], takes advantage of overhearing to improve reliability. Overhearing means that a node will receive all messages
in its reception range, including those that are intended for other nodes. Considered problematic, especially in dense WSNs, these packets are usually discarded, which wastes energy. In contrast, CPS-MAC intentionally wakes up 1-hop and 2-hop neighbors to improve their chances of overhearing a packet. This allows a 2-hop neighbor to receive two copies of the same packet, one directly from the source and one repeated from a 1-hop cooperating node. By combining multiple copies of the same packet, using a packet combining technique such as maximum ratio combining (MRC) [5], the 2-hop neighbor is more likely to recover the original packet. This spatial diversity gain allows CPS-MAC to combat channel fading. Design details of CPS-MAC have been presented in [1]. Details on CC can be found in [2–4].

In our previous work [1], we compared the performance of CPS-MAC with both a preamble sampling and a relaying scheme. Results showed that CPS-MAC was able to achieve reliability gains without expending additional energy. However, the simulation setup was limited to a 3-node Wireless Sensor Network (WSN) comprising a single source, partner and destination. This paper extends the performance evaluation of CPS-MAC to a multi-hop configuration, where a large number of sensor nodes are deployed around a sink to create a data gathering network. Such a multi-hop configuration allows us to examine CPS-MAC's scalability properties. All nodes generate traffic, which means CPS-MAC must efficiently handle channel contention, collisions, and idle listening. Results show that, under very light traffic load, CPS-MAC performs on par with a traditional preamble sampling protocol and performs significantly better as the traffic load increases. Details are presented in the next section.
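To make the combining step concrete, the following sketch applies maximum ratio combining to two noisy copies of one real-valued symbol. It is a textbook simplification (real channel gains assumed known, as is the noise variance of each copy), not the CPS-MAC implementation.

```c
#include <stdio.h>

/* Maximum ratio combining of two copies y1, y2 of the same symbol x,
 * received as y_i = h_i * x + noise_i: each copy is weighted by its channel
 * gain divided by its noise variance, so the cleaner copy dominates. */
static double mrc_combine(double y1, double h1, double n1,
                          double y2, double h2, double n2)
{
    double w1 = h1 / n1, w2 = h2 / n2;
    return (w1 * y1 + w2 * y2) / (w1 * h1 + w2 * h2);  /* estimate of x */
}

int main(void)
{
    /* copy 1: weak direct 2-hop link; copy 2: repetition via the partner */
    double x_hat = mrc_combine(0.30, 0.4, 0.10,
                               0.95, 0.9, 0.05);
    printf("combined symbol estimate: %f\n", x_hat);
    return 0;
}
```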
2 Performance Evaluation

2.1 Simulation Setup
We conducted simulations using the Mobility Framework extension for OMNET++ [7]. Table 1 lists the physical and MAC layer parameters used. The physical layer parameters are based on the Chipcon CC1020, a low-power RF transceiver.
Table 1. List of Parameters

Parameter                                 Value          Unit
Bitrate                                   153.6          Kbps
Path loss exponent                        3.5            -
Transmit power                            -21 to 9       dBm
Current consumption: transmit mode        12.3 to 27.1   mA
Current consumption: receive mode         19.9           mA
Current consumption: power-down mode      0.2            µA
Receiver sensitivity                      -104           dBm
Duty cycling: sleep duration              800            ms
Duty cycling: listen duration             200            ms
Fig. 1. Network topology
Figure 1 shows the network topology. Here, 17 sensor nodes are deployed around the sink and all nodes generate data.

2.2 Results
The performance of CPS-MAC is compared with a relaying MAC. In the relaying MAC, nodes simply relay each other's packets towards the sink and use preamble sampling [6] to wake up neighboring nodes. We identify two parameters that can affect a sensor node's performance during operation. One is the ability of a sensor node to change its transmit power depending upon battery state and lifetime requirements. Therefore, we evaluate the reliability and corresponding energy consumption of the network at various transmit powers, in the range of -21 dBm to 9 dBm. The second parameter is the variation in load depending upon the frequency of sensing events. For this, the network is subjected to different traffic loads by selecting intervals of 5 minutes, 30 seconds and 10 seconds to generate a new application layer packet per node.

In Figures 2, 3, and 4, reliability is represented as the packet delivery rate (PDR), which is the percentage of packets delivered at the sink, excluding any duplicates. MRC is used for packet combining at the destination, and PDR results for CPS-MAC are plotted both with and without packet combining. The difference represents the diversity gain achieved by CC and MRC. Energy consumption is represented as energy per useful bit (EPUB), i.e., the energy spent in transferring a useful bit from the source node, through the network, to the sink.

Figure 2(a) shows that, under low traffic, CPS-MAC and relaying MAC achieve a comparable packet delivery rate (PDR). Nodes benefit from CC at low transmit powers (-25 dBm to -10 dBm). However, when the transmit power is increased from -10 dBm to 10 dBm, the quality of the 2-hop link improves accordingly, and the relaying MAC slightly outperforms CPS-MAC because of the latter's cooperation overhead. The effect of increasing the traffic load can be seen in Figures 3(a) and 4(a). Here, CPS-MAC achieves better performance in almost all cases. This performance improvement over the relaying MAC is attributed to the CPS-MAC wake-up scheme and CC over 2 hops. Repeating the preamble from the partner node increases the chances of the destination node waking up prior to data transmission. Cooperation over 2 hops means data can travel longer distances in a single transmission, thus increasing network throughput, especially under heavy traffic load.

Figures 2(b), 3(b) and 4(b) show the energy consumed per useful bit (EPUB) for the three configurations. The EPUB metric takes into account the energy consumption of all the nodes in the topology. CPS-MAC energy consumption corresponds to the PDR results. At high traffic load, the improved PDR pays off and CPS-MAC achieves significantly lower EPUB, as shown in Figure 4(b).

Fig. 2. Traffic load of one packet generated every 5 minutes per node: (a) packet delivery rate, (b) energy per useful bit
Fig. 3. Traffic load of one packet generated every 30 seconds per node: (a) packet delivery rate, (b) energy per useful bit
Fig. 4. Traffic load of one packet generated every 10 seconds per node: (a) packet delivery rate, (b) energy per useful bit
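Written out explicitly, the two metrics reduce to the simple computation below; the numbers are placeholders, since in the study they come from the simulator's per-node logs.

```c
#include <stdio.h>

int main(void)
{
    double generated = 1000.0;   /* application packets generated by all nodes        */
    double delivered = 870.0;    /* packets received at the sink, duplicates excluded */
    double energy_j  = 3.1;      /* total energy spent by all nodes, in joules        */
    double payload_bits = 256.0; /* useful bits per packet (placeholder)              */

    double pdr  = 100.0 * delivered / generated;             /* percent  */
    double epub = energy_j / (delivered * payload_bits);     /* J / bit  */

    printf("PDR  = %.1f %%\n", pdr);
    printf("EPUB = %.2e J per useful bit\n", epub);
    return 0;
}
```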
3 Conclusion and Future Work
This work has shown possible benefits of using cooperative communication for performance improvement in a realistic multi-hop WSN. Results show that cooperative communication can help achieve better packet delivery rates, which further helps in reducing the overall energy consumption of the network. The maximum benefit is achieved when the network is operating under mild to very heavy traffic load. For future work, we are extending the protocol to include acknowledgments for data packets, improving the partner selection algorithm, and implementing the protocol on a sensor network testbed to corroborate our findings.
References 1. Khan, R.A.M., Karl, H.: Cooperative Communication to Improve Reliability and Efficient Neighborhood Wakeup in Wireless Sensor Networks. In: Proc. of the Fourth International Conference on Mobile Ubiquitous Computing (UBICOMM 2010) (2010) 2. Nosratinia, A., Hunter, T., Hedayat, A.: Cooperative communication in wireless networks. IEEE Communications Magazine 42, 74–80 (2004) 3. Liu, P., et al.: CoopMAC: A Cooperative MAC for Wireless LANs. IEEE Journal on Selected Areas in Communications 25(2), 340–354 (2007) 4. Yi, L., Hong, J.: A New Cooperative Communication MAC Strategy for Wireless Ad Hoc Networks. In: ACIS International Conference on Computer and Information Science, pp. 569–574. IEEE Computer Society, Los Alamitos (2007) 5. Meier, A., Thompson, J.S.: Cooperative Diversity in Wireless Networks. In: 6th IEE International Conference on 3G and Beyond, 2005, pp. 1–5 (2005) 6. El-Hoiydi, A.: Aloha with preamble sampling for sporadic traffic in ad hoc wireless sensor networks. In: Proceedings of IEEE International Conference on Communications, ICC 2002 (Cat. No.02CH37333), New York, NY, USA, pp. 3418–3423 (2002) 7. OMNET++ discrete event simulator (June 2010), http://www.omnetpp.org
Secure Position Verification for Wireless Sensor Networks in Noisy Channels
Partha Sarathi Mandal(1) and Anil K. Ghosh(2)
(1) Indian Institute of Technology, Guwahati - 781039, India
[email protected]
(2) Indian Statistical Institute, Kolkata - 700108, India
[email protected]
Abstract. Position verification in wireless sensor networks (WSNs) is quite tricky in the presence of attackers (malicious sensor nodes), who try to break the verification protocol by reporting their incorrect positions (locations) during the verification stage. In the literature of WSNs, most of the existing methods of position verification use trusted verifiers, which are in fact vulnerable to attacks by malicious nodes. They also depend on some distance estimation techniques, which are not accurate in noisy channels (mediums). In this article, we propose a secure position verification scheme for WSNs in noisy channels without relying on any trusted entities. Our verification scheme detects and filters out all malicious nodes from the network with a very high probability. Keywords: Distributed protocol, Quantiles, Location verification, Distance estimation, 3σ limit, Wireless networks.
1 Introduction
Secure position verification is important for wireless sensor networks (WSNs) because the position of a sensor node is a critical input for many WSN applications, including tracking, monitoring and geometry-based routing. Most of the existing position verification protocols rely on distance estimation techniques such as received signal strength (RSS) [1, 11], time of flight (ToF) [10] and time difference of arrival (TDoA) [19]. These techniques are relatively easy to implement, but they are a little expensive due to their requirement of special hardware to estimate end-to-end distances. The above techniques, especially RSS techniques [1, 11], are perfect in terms of precision in ideal situations. The Friis transmission equation (1) [17] used in RSS techniques leads to this precision. But, in practice, due to the presence of noise in the network channel, signal attenuation does not necessarily follow this equation. There are many nasty effects that influence both propagation time and signal strength. So, the distance calculated using the Friis equation usually differs from the actual distance. This difference, in reality, may also depend on the locations of the sender and the receiver. A good position verification protocol should take care of this noise and the limited precision in distance estimation. In this article, we use the RSS technique for position verification, where the receiving node estimates the distance of the sender on the basis of the sending and
receiving signal strengths. Here we use the term node for a wireless sensor device in a WSN, which has some processing capability and is equipped with a transceiver communicating over a wireless channel. We consider that there are two types of nodes in the system, genuine nodes and malicious nodes. While the genuine nodes follow the implemented system functionality correctly, the malicious nodes are under the control of an adversary. To make the verification problem most difficult, we assume that the malicious nodes know all genuine nodes and their positions (coordinates). Once the coordinates of all genuine nodes are known, the main objective of a malicious node is to report a suitable faking position to all these genuine nodes such that it can deceive as many genuine nodes as possible. On the other hand, the objective of a genuine node is to detect the inconsistency in the information provided by a malicious node. In order to do this, it compares two different estimates of the distance, one calculated from the coordinates provided by a node and the other computed using the RSS technique. If these estimates are close, the genuine node accepts the sender as genuine; otherwise the sender node is considered as a malicious node. Malicious nodes, however, do not go for such calculations. They always report all genuine nodes as malicious and all malicious nodes as genuine to break the verification protocol. In the present work, we deal with such situations and discuss how to detect and filter out all such malicious nodes from a WSN in a noisy channel.

Related Works: Most of the existing methods for secure position verification [4, 5, 15, 16] rely on a fixed set of trusted entities (or verifiers) and distance estimation techniques to filter out faking (malicious) nodes. We refer to this model as the trusted sensor (or TS) model. In this model, faking nodes may use some modes of attack that cannot be adopted by genuine nodes, such as radio signal jamming or the use of directional antennas, which permit the implementation of attacks such as the wormhole attack [13] and the Sybil attack [7]. Lazos and Poovendran [15] proposed a secure range-independent localization scheme, which is resilient to wormhole and Sybil attacks with high probability. Lazos et al. [16] further refined this scheme with multi-lateration to reduce the number of required locators, while maintaining probabilistic guarantees. Shokri et al. proposed a neighbor verification protocol, which is secure against the classic 2-end wormhole attack. The TS model was also considered by Capkun and Hubaux [4] and Capkun et al. [5]. In [4], the authors presented a protocol which relies on the distance bounding technique proposed by Brands and Chaum [2]. The protocol presented in [5] relies on a set of hidden verifiers. However, there are two major weaknesses of the TS model: first, it is not possible to self-organize a network in a completely distributed way, and second, periodic checking is required to ensure the reliability of trusted nodes.

The position verification problem becomes more challenging when there are no trusted sensor nodes. Hwang et al. [14] and Delaët et al. [6] investigated the verification problem in this no trusted sensor (NTS) model. In both of these articles, the authors considered the problem where the faking nodes operate synchronously with other nodes. The approach in [14] is randomized and consists of two phases: distance measurement and filtering. In the distance measurement phase, all nodes measure their distances from their
neighbours, while faking nodes can corrupt the distance measurement technique. In this phase, each node announces one distance at a time in a round-robin fashion. Thus the message complexity is O(n^2). In the filtering phase, each genuine node randomly picks two so-called pivot nodes and carries out its analysis based on those pivots. However, these chosen pivot sensors could be malicious. So, the protocol may only give a probabilistic guarantee. The approach in [6] is deterministic and consists of two phases that can correctly filter out malicious nodes which corrupt the distance measurement technique. In the case of RSS, the protocol can tolerate at most n/2 − 2 faking sensors (n being the total number of nodes in the WSN) provided no four sensors are located on the same circle and no four sensors are co-linear. In the case of ToF, it can handle up to n/2 − 3 faking sensors provided no six sensors are located on the same hyperbola and no six of them are co-linear.

Our results: The main contribution of this article is SecureNeighborDiscovery, a secure position verification protocol in the NTS model in a noisy channel. To the best of our knowledge, this is the first protocol in the NTS model in a noisy environment. This protocol guarantees that the genuine nodes reject all incorrect positions of malicious nodes with a very high probability (almost equal to 1) when there are sufficiently many genuine nodes in the WSN. If the noise in the network channel is negligible, this required number of genuine nodes matches the findings of [6], where the authors proposed a deterministic algorithm for detecting faking sensors. However, when the noise is not negligible, each node can only have a limited precision for distance estimation. In such cases, it is not possible to develop a deterministic algorithm. Our protocol, based on a probabilistic algorithm, takes care of this problem and filters out all malicious nodes from the WSN with a very high probability. When the number of nodes in the WSN is reasonably large, this probability turns out to be very close to 1. So, for all practical purposes, this proposed probabilistic method behaves almost like a deterministic algorithm. Our SecureNeighborDiscovery protocol can be used to prevent the Sybil attack [7] by verifying whether each message contains the real position (id) of its sender or not. The genuine nodes never accept any message with a malicious sender location.
2 Technical Preliminaries
Here, we assume that each node knows its geographic position (coordinates), and the nodes in the WSN form a complete graph of communication, i.e., each node can communicate with all other nodes in the WSN. We also assume that the WSN is partially synchronous: all nodes operate in phases. In the first phase, each node sends exactly one message to all other nodes without collision, and for each transmission, all nodes use the same transmission power S^s. We further assume that malicious nodes can transmit incorrect coordinates (an incorrect identifier) to all other nodes, and they cooperate among themselves in an omniscient manner (i.e. without exchanging messages) in order to deceive the genuine nodes in the
WSN. Each malicious node obeys synchrony and transmits at most one message at the beginning of the first phase and one message at the end of it. Let d_{ij} be the true distance of node i from a genuine node j. Since node j does not know the location of node i, it estimates d_{ij} using two different techniques, one using the RSS technique and the other using the coordinates provided by node i. These two estimates are denoted by d̂_{ij} and d̃_{ij}, respectively. In the RSS technique, under idealized conditions, node j can precisely measure the distance of node i using the Friis transmission equation [17] given by

  S^r_{ji} = S^s_i (λ / (4π d_{ij}))²    (1)
where S^s_i is the transmission power of the sender node i (here S^s_i = S^s for all i), S^r_{ji} is the corresponding RSS at the receiving node j, and λ is the wavelength. If the sender node i gives perfect information regarding its location (i.e., d̃_{ij} = d_{ij}), then the distance estimated using the RSS technique (d̂_{ij}) and that computed from the coordinates provided by node i (d̃_{ij}) will be equal in the ideal situation. However, in practice, when we have noise in the channel, they cannot match exactly, but they are expected to be close. But, if node i sends incorrect information about its location, |d̃_{ij} − d̂_{ij}| can be large.
3 RSS Technique in a Noisy Medium
The above Friis transmission equation (1) is used in telecommunications engineering, and it works well under idealized conditions. But, in the presence of noise in the network, this transmission equation may not hold, and it needs to be modified. Modifications to this equation based on the effects of impedance mismatch, misalignment of the antenna pointing and polarization, and absorption can be incorporated using an additional noise factor ε, which is supposed to follow a Normal (Gaussian) distribution with mean 0 and variance σ². The modified equation is given by

  S^r_{ji} = S^s_i (α / d_{ij})² + ε_{ij}    (2)

where ε_{ij} ∼ N(0, σ²) and α = λ/(4π). However, the ε_{ij}s are unobserved in practice. So, the receiving node j estimates the distance d_{ij} using the Friis transmission equation (1), and this estimate is given by d̂_{ij} = α (S^s_i / S^r_{ji})^{1/2}. Since ε_{ij} ∼ N(0, σ²), following the 3σ limit, S^r_{ji} is expected to lie between S^s_i (α/d_{ij})² − 3σ and S^s_i (α/d_{ij})² + 3σ, where d_{ij} is the unknown true distance. Accordingly, d̂_{ij} is expected to lie in the range [d_{ij}{1 + (3σ d²_{ij}/α² S^s_i)}^{-1/2}, d_{ij}{1 − (3σ d²_{ij}/α² S^s_i)}^{-1/2}]. So, if the sender sends its genuine coordinates (i.e., d̃_{ij} = d_{ij}), d̂_{ij} is expected to lie in the range [d̃_{ij}{1 + (3σ d̃²_{ij}/α² S^s_i)}^{-1/2}, d̃_{ij}{1 − (3σ d̃²_{ij}/α² S^s_i)}^{-1/2}] with probability almost equal to 1 (≈ 0.9973). The receiver node j accepts node i as
genuine when d̂_{ij} lies in that range. Throughout this article, we will assume σ² to be known. However, if it is unknown, one can estimate it by sending signals from known distances and measuring the deviations of the received signal strengths from those expected in ideal situations. Looking at the distribution of these deviations, one can also check whether the error distribution is really normal (see [20, 12] for tests of normality of error distributions). If it differs from normality, one can choose a suitable model for the error distribution and find the acceptance interval using the quantiles of that distribution. For the sake of simplicity, throughout this article, we will assume the error distribution to be normal, which is the most common and popular choice in the statistics literature. We assume that there are n sensor nodes deployed over a region D in a two-dimensional plane, n_0 of them are genuine, and the remaining n_1 (n_0 + n_1 = n) are malicious. Though our protocol does not need n_0 and n_1 to be specified, for the better understanding of the reader, we will use these two quantities in the description and the mathematical analysis of our protocol.
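The acceptance rule above is easy to make concrete. The following is a minimal sketch (not taken from the paper) of how a receiving node could compute the RSS-based estimate d̂_{ij} and the 3σ acceptance interval; the free-space Friis model of equation (1) with additive Gaussian noise of known variance σ² is assumed, and all function names are illustrative.

```python
import math

def rss_distance_estimate(S_s, S_r, wavelength):
    """Estimate the distance from the received signal strength via the Friis model (Eq. 1)."""
    alpha = wavelength / (4 * math.pi)
    return alpha * math.sqrt(S_s / S_r)

def acceptance_interval(d_claimed, S_s, wavelength, sigma):
    """3-sigma interval in which the RSS estimate should fall if the claimed distance is genuine."""
    alpha = wavelength / (4 * math.pi)
    delta = 3 * sigma * d_claimed**2 / (alpha**2 * S_s)
    lower = d_claimed * (1 + delta) ** -0.5
    # if the noise term can exceed the ideal RSS, the upper bound degenerates (assumption of this sketch)
    upper = d_claimed * (1 - delta) ** -0.5 if delta < 1 else float("inf")
    return lower, upper

def accepts(d_claimed, S_s, S_r, wavelength, sigma):
    """Node j accepts node i when the RSS-based estimate lies inside the interval."""
    lo, hi = acceptance_interval(d_claimed, S_s, wavelength, sigma)
    return lo <= rss_distance_estimate(S_s, S_r, wavelength) <= hi
```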
3.1 Optimal Strategy for Malicious Sensor Nodes
Here, we deal with the situation where all malicious nodes know all genuine nodes and their positions; in other words, they know which of the sensor nodes are genuine and which ones are malicious. Therefore, to break the verification protocol, each malicious node reports all genuine nodes as malicious and all malicious nodes as genuine. In addition to that, a malicious node sends a suitable faking position as its coordinates so that it can deceive as many genuine nodes as possible. Let x_j = (x_j, y_j), j = 1, 2, ..., n_0, be the coordinates of the genuine nodes and x_0 = (x_0, y_0) be the true location of a malicious node. Instead of reporting its original position, the malicious node looks for a suitable faking position x_f = (x_f, y_f) to deceive the genuine nodes. Note that if it sends x_f as its location, from those coordinates the j-th (j = 1, 2, ..., n_0) genuine node estimates its distance by d̃_{0j} = ||x_j − x_f||, where ||·|| denotes the usual Euclidean distance. Again, the distance estimated from the received signal is d̂_{0j} = α (S^s_0 / S^r_{j0})^{1/2}. So, the j-th node accepts the malicious node as genuine if d̂_{0j} lies between α_{1,0j} = d̃_{0j}{1 + (3σ d̃²_{0j}/α² S^s_0)}^{-1/2} and α_{2,0j} = d̃_{0j}{1 − (3σ d̃²_{0j}/α² S^s_0)}^{-1/2}. Now, from equation (2), it is easy to check that α_{1,0j} ≤ d̂_{0j} ≤ α_{2,0j} ⇔ α*_{1,0j} = α² S^s_0 [1/α²_{2,0j} − 1/d²_{0j}] ≤ ε_{0j} ≤ α*_{2,0j} = α² S^s_0 [1/α²_{1,0j} − 1/d²_{0j}]. Let p^f_{0j} be the probability that the malicious node, which is originally located at x_0, is accepted by the j-th genuine node when it reports x_f as its location. From the above discussion, it is quite clear that p^f_{0j} (j = 1, 2, ..., n_0) is given by

  p^f_{0j} = P(d̂_{0j} ∈ (α_{1,0j}, α_{2,0j})) = (1/(σ√(2π))) ∫_{α*_{1,0j}}^{α*_{2,0j}} exp(−x²/(2σ²)) dx    (3)
Naturally, the malicious node tries to cheat as many genuine nodes as possible. Let us define an indicator variable Z^f_{0j} that takes the value 1 (or 0) if the malicious node successfully cheats (or fails to cheat) the j-th genuine node
when it sends the faked location x_f. Clearly, here E(Z^f_{0j}) = P(Z^f_{0j} = 1) = p^f_{0j}. So, given the coordinates of the genuine nodes X_0 = {x_1, x_2, ..., x_{n_0}}, θ^{f,X_0}_{0,n_0} = E(Σ_{j=1}^{n_0} Z^f_{0j}) = Σ_{j=1}^{n_0} p^f_{0j} denotes the expected number of genuine nodes to be deceived by the malicious node if it pretends x_f is its location. Naturally, the malicious node tries to find the x_f that maximizes θ^{f,X_0}_{0,n_0}. Let us define θ^{X_0}_{0,n_0} = sup_{x_f ∈ F_0} θ^{f,X_0}_{0,n_0}, where F_0 is the set of all possible faking coordinates. A malicious node located at x_0 looks for x_f ∈ F_0 such that θ^{f,X_0}_{0,n_0} = θ^{X_0}_{0,n_0}. Here, one should note that the region F_0 depends on the true location x_0 of the malicious node, and it is not supposed to contain any point lying in a small neighborhood of x_0, because in that case x_0 and x_f would be almost the same, and the malicious node would behave almost like a genuine node. Naturally, the malicious node would not like to do that, and it keeps this neighborhood outside F_0. The size of this neighborhood of course depends on the specific application, and the value of θ^{X_0}_{0,n_0} may also depend on it.
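A simple way to approximate the malicious node's optimal strategy is a grid search over candidate faked positions, summing the deception probabilities over all genuine nodes. The following hypothetical sketch reuses cheat_probability from the previous block; the grid step and the radius of the excluded neighbourhood of x_0 are illustrative parameters, not values from the paper.

```python
import numpy as np

def best_fake_position(x_0, genuine_positions, S_s, wavelength, sigma,
                       region, grid_step=1.0, excluded_radius=5.0):
    """Grid-search sketch of the malicious strategy: pick the faked position x_f
    in the deployment region maximizing theta^{f,X_0}_{0,n_0} = sum_j p^f_{0j}."""
    (xmin, xmax), (ymin, ymax) = region
    best_xf, best_theta = None, -1.0
    for xf in np.arange(xmin, xmax, grid_step):
        for yf in np.arange(ymin, ymax, grid_step):
            if np.hypot(xf - x_0[0], yf - x_0[1]) < excluded_radius:
                continue    # keep a small neighbourhood of x_0 outside F_0
            theta = sum(cheat_probability((xf, yf), xj, x_0, S_s, wavelength, sigma)
                        for xj in genuine_positions)
            if theta > best_theta:
                best_xf, best_theta = (xf, yf), theta
    return best_xf, best_theta
```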
3.2 Optimal Strategy for Genuine Sensor Nodes
Let A_0 be the total number of nodes in the WSN that accept the malicious node located at x_0 (as discussed in Section 3.1) as genuine. Since a malicious node is always accepted by the other malicious nodes, if there are n_0 genuine nodes in the WSN and X_0 denotes their coordinates, then for the optimum choice of the faking coordinates x_f, the (conditional) expected value of A_0 is given by E(A_0 | n_0, X_0) = (n − n_0) + θ^{X_0}_{0,n_0}. Now, a genuine node does not know a priori how many genuine nodes there are in the WSN or where they are located. So, at first, for a given n_0, it computes the average of E(A_0 | n_0, X_0) over all possible X_0. If D denotes the deployment region (preferably a convex region) for the sensor nodes, and if the nodes are assumed to be uniformly distributed over D, this average is given by E(A_0 | n_0) = ∫_{X_0 ∈ D^{n_0}} E(A_0 | n_0, X_0) ψ(X_0) dX_0, where ψ is the uniform density function on D^{n_0}. Here we have chosen ψ to be uniform because it is the simplest one to deal with, and it is also the most common choice in the absence of any prior knowledge on the distribution of nodes in D. When we have some prior knowledge about this distribution, ψ can be chosen accordingly. Now, define θ_{0,n_0} = ∫_{X_0 ∈ D^{n_0}} θ^{X_0}_{0,n_0} ψ(X_0) dX_0. Clearly, E(A_0 | n_0) = (n − n_0) + θ_{0,n_0} depends on n_0, which is unknown to the genuine node. So, it finds an upper bound for E(A_0 | n_0) assuming that at least half of the sensor nodes in the WSN are genuine. Under this assumption, this upper bound is given by 0.5n + θ_{0,0.5n}.

Theorem 1. If there are n nodes in a WSN, and at least half of them are genuine, the expected number of acceptances for a malicious node located at x_0 = (x_0, y_0) cannot exceed 0.5n + θ_{0,0.5n}.

The proof of this theorem and all other theorems given in this article can be found in [18]. Note that θ_{0,0.5n} and the upper bound depend on the location x_0 of the malicious node, so for a genuine node it is an unknown random quantity. Therefore, a genuine node takes a conservative approach and computes θ*_{n/2} = sup_{x_0 ∈ D} θ_{0,n/2}, which gives an upper bound on the expected number
of genuine nodes to be deceived by a malicious node in D when there are n/2 genuine sensor nodes in the WSN. To filter out all malicious nodes from the WSN, a genuine node follows the idea of [6]. For any node, it calculates the total number of acceptances (approvals) A and rejections (accusations) R, and considers the node as malicious if R exceeds A − θ*_{n/2}. Since A + R = n, a node is considered to be genuine if A ≥ (n + θ*_{n/2})/2. Note that if there are n_0 genuine nodes and n_1 malicious nodes in the WSN, a malicious node, on average, can be accepted by at most θ*_{n_0} + n_1 nodes, and it will be rejected by at least n_0 − θ*_{n_0} nodes. So, for a malicious node, A − R − θ*_{n/2} is expected to be smaller than (n_1 + 2θ*_{n_0}) − n_0 − θ*_{n/2} ≤ n/2 + θ*_{n_0} − n_0 (from Theorem 1). Therefore, if we have n_0 ≥ n/2 + θ*_{n_0}, all malicious nodes are expected to be filtered out from the WSN. A more detailed mathematical analysis of our protocol will be given in Section 5. For computing θ*_{n/2}, a genuine node uses the statistical simulation technique [3], assuming that the sensors are distributed over D with density ψ (which is taken to be uniform in this article). First it generates coordinates x_0 for the malicious node and X for n/2 genuine nodes in D, and computes θ^X_{0,n/2} by maximizing θ^{f,X}_{0,n/2}. Repeating this over several X, one gets θ_{0,n/2} as the average of the θ^X_{0,n/2}s. This whole procedure is repeated for several random choices of x_0 to compute θ*_{n/2} = sup_{x_0 ∈ D} θ_{0,n/2}. Note that this is an offline calculation, and it has to be done only once.
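The offline computation of θ*_{n/2} just described can be sketched as a Monte Carlo routine: for several candidate malicious positions x_0, average the maximized deception count over random uniform layouts of n/2 genuine nodes, and take the maximum over x_0. The sketch below reuses best_fake_position from the previous block; the numbers of sampled positions and layouts are illustrative, not those used by the authors.

```python
import numpy as np

def estimate_theta_star(n, region, S_s, wavelength, sigma,
                        n_positions=100, n_layouts=50, rng=np.random.default_rng(0)):
    """Offline Monte Carlo sketch of theta*_{n/2} = sup_{x_0} theta_{0,n/2}."""
    (xmin, xmax), (ymin, ymax) = region

    def uniform_points(k):
        return np.column_stack([rng.uniform(xmin, xmax, k), rng.uniform(ymin, ymax, k)])

    theta_star = 0.0
    for x_0 in uniform_points(n_positions):          # candidate malicious locations in D
        thetas = []
        for _ in range(n_layouts):                   # random layouts X of n/2 genuine nodes
            X = uniform_points(n // 2)
            _, theta = best_fake_position(x_0, X, S_s, wavelength, sigma, region)
            thetas.append(theta)
        theta_star = max(theta_star, float(np.mean(thetas)))
    return theta_star
```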
4 The Protocol
Based on the above discussion, we develop the SecureNeighborDiscovery protocol. It is a two-phase approach to filter out malicious nodes. The first phase is named AccuseApprove, and the second phase is named Filtering. In the first phase, each node reports its coordinates to all other nodes by transmitting an initial message. Next, for each pair of nodes i and j, node j computes two estimates of the distance d_{ij}, one using the RSS technique (d̂_{ij}) and the other from the reported coordinates (d̃_{ij}), as mentioned earlier. If d̂_{ij} ∉ (α_{1,ij}, α_{2,ij}), then node j accuses node i of faking its position. Otherwise, node j approves the location of node i as genuine. Here α_{1,ij} = d̃_{ij}{1 + (3σ d̃²_{ij}/α² S^s_i)}^{-1/2} and α_{2,ij} = d̃_{ij}{1 − (3σ d̃²_{ij}/α² S^s_i)}^{-1/2} are the analogs of α_{1,0j} and α_{2,0j} defined in Section 3.1. To keep track of these accusations and approvals, each node j maintains an array accus_j and transmits it to all other nodes at the end of this phase. The AccuseApprove protocol is given below. In the second phase, each node j executes the Filtering protocol, where it counts the number of accusations and approvals toward node i, including its own message. Node j finds node i to be malicious if the number of accusations exceeds the number of approvals minus θ*_{n/2}. Conversely, node i is considered genuine if its number of approvals is greater than or equal to (n + θ*_{n/2})/2. In this process, nodes that are detected as malicious are filtered out from the WSN. Next, node j ignores the decisions given by these deleted nodes and repeats the same filtering method with the remaining ones.
Protocol: AccuseApprove (executed by node j)
1. j exchanges coordinates by transmitting init_j and receiving n − 1 messages init_i.
2. for each received message init_i:
3.    compute d̂_{ij} using the ranging (RSS) technique and d̃_{ij} using the reported coordinates of i.
4.    if d̂_{ij} ∉ (α_{1,ij}, α_{2,ij}) then accus_j[i] ← true else accus_j[i] ← false
5. j exchanges accusations by transmitting accus_j and receiving n − 1 arrays accus_i.

Protocol: Filtering (executed by node j)
1.  F = ∅, G = {1, 2, ..., n}, n' ← n
2.  repeat { k ← n'
3.    for each received accus_i (i ∈ G):
4.      for each r ∈ G:
5.        if accus_i[r] = true then NumAccus_r += 1 else NumApprove_r += 1
6.    newF = ∅
7.    for each sensor i ∈ G:
8.      if NumApprove_i ≥ (k + θ*_{n/2})/2 then j considers i as a genuine node
        else j considers i as a malicious node: filter out i, newF = newF ∪ {i}, n' ← n' − 1
9.    F = F ∪ newF, G = G \ newF
10.   for each sensor i ∈ newF:
11.     discard accus_i and the corresponding i-th entry of accus_r for all r ∈ G
12. } until (k = n')
If there are n' nodes remaining in the WSN, a node is considered to be malicious if its number of approvals is smaller than (n' + θ*_{n/2})/2. Instead of θ*_{n/2}, we can use θ*_{n'/2}, but in that case θ*_{n'/2} needs to be computed again, and it needs to be computed online. Therefore, to reduce the computing cost of our algorithm, here we stick to θ*_{n/2}. Note that the use of θ*_{n/2} also makes the filtering protocol more strict in the sense that it increases the probability of a node being filtered out. Node j repeats this method until there are no further deletions of nodes from the WSN. The Filtering protocol is given above. Here F and G denote the sets of malicious and genuine nodes, respectively. Initially, we set F = ∅ and G = {1, 2, ..., n}. At each stage, we detect some malicious nodes and filter them out. Those nodes are deleted from G and included in F. At the end of the algorithm, G gives the set of nodes remaining in the WSN, which are considered to be genuine nodes. It would be ideal if the set of coordinates of the nodes in G matched X. However, this may not always be the case. Note that the main objective of our protocol is to filter out all malicious nodes from the WSN. In the process, a few genuine nodes may also get removed. So, if not all, at the end of the algorithm one would like G to contain most of the genuine nodes and no malicious nodes.
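For concreteness, here is a compact, centralized sketch of the Filtering loop as a single node would run it on the full accusation matrix; the convention accus[i][r] = True meaning "node i accuses node r" and the function name are assumptions of this sketch.

```python
def filtering(accus, theta, G=None):
    """Filtering phase: starting from the node set G (all nodes by default),
    repeatedly drop every node whose approvals among the surviving voters are
    below (k + theta)/2, where k is the current number of survivors and theta
    plays the role of theta*_{n/2}."""
    G = set(range(len(accus))) if G is None else set(G)
    while True:
        k = len(G)
        approvals = {r: sum(1 for i in G if not accus[i][r]) for r in G}
        dropped = {r for r in G if approvals[r] < (k + theta) / 2}
        if not dropped:
            return G      # no further deletions: the survivors are taken as genuine
        G -= dropped      # decisions of filtered-out nodes are ignored in the next pass
```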
5 Correctness of the Protocol
To check the correctness of the above protocol, we consider the worst-case scenario mentioned before, where all genuine nodes get accused by all malicious nodes, and each malicious node gets approved by all other malicious nodes. Assume that there are n_0 genuine nodes and n_1 malicious nodes in the WSN. Now, for j, j' = 1, 2, ..., n_0, define the indicator variable Z*_{jj'} = 1 if the j'-th genuine node accepts the j-th genuine node, and 0 otherwise. So, for the j-th genuine node, the number of approvals A*_j can be expressed as A*_j = 1 + Σ_{j'=1, j'≠j}^{n_0} Z*_{jj'}, where the Z*_{jj'}s are independent and identically distributed (i.i.d.) Bernoulli random variables with success probability p = P(Z*_{jj'} = 1) = 0.9973 ≈ 1. If n_0 is reasonably large, using the Central Limit Theorem (CLT) [8] for the i.i.d. case, one can show (see Theorem 2) that P(A*_j ≥ (n + θ*_{n/2})/2) ≈ 1 − Φ(τ), where Φ is the cumulative distribution function of the standard normal distribution and

  τ = (n + θ*_{n/2} − 2n_0 p) / (2 √(p(1 − p)(n_0 − 1))).

Since this probability does not depend on j, the same expression holds for all genuine nodes.

Theorem 2. Assume that there are n nodes in the WSN, and n_0 of them are genuine. If n_0 is sufficiently large, then for the j-th genuine node (j = 1, 2, ..., n_0) we have the acceptance probability

  P(A*_j ≥ (n + θ*_{n/2})/2) ≈ 1 − Φ( (n + θ*_{n/2} − 2n_0 p) / (2 √(p(1 − p)(n_0 − 1))) ).
If n + θ*_{n/2} − 2n_0 p < 0 (equivalent to n_0 > (n + θ*_{n/2})/2 since p ≈ 1), then for any genuine node j (j = 1, 2, ..., n_0) the acceptance probability P(A*_j ≥ (n + θ*_{n/2})/2) is bigger than 1/2. Again, if p is close to 1 (which is the case here), the denominator of τ becomes close to zero. So, in that case, the acceptance probability P(A*_j ≥ (n + θ*_{n/2})/2) turns out to be very close to 1. Note that if we have n_0 ≥ n/2 + θ*_{n_0}, the condition n_0 > (n + θ*_{n/2})/2 is satisfied. Now, given the coordinates X_0 of the n_0 genuine sensor nodes, the malicious node, which is actually located at x_0 but sends x_f as its faked location, has the number of acceptances A_0 = n_1 + Σ_{j=1}^{n_0} Z^f_{0j}, where n_1 is the number of malicious nodes in the WSN, and Z^f_{0j} ∼ B(1, p^f_{0j}) for j = 1, 2, ..., n_0 (see Section 3.1). Again, from the discussion in Section 3.2, it follows that E(A_0) < n_1 + θ*_{n_0}. So, if n_0 ≥ n/2 + θ*_{n_0}, using Theorem 1 it is easy to check that (n + θ*_{n/2})/2 − E(A_0) > 0.5(n_0 − n/2 − θ*_{n_0}) ≥ 0, and it is expected to increase linearly with n. So, if the standard deviation of A_0 (the square root of the variance Var(A_0)) remains bounded as a function of n, or diverges at a slower rate (which is usually the case), then for a sufficiently large number of nodes in the WSN the final acceptance probability of the malicious node, P(A_0 ≥ (n + θ*_{n/2})/2), becomes very close to zero.
Theorem 3. If there is a sufficiently large number of nodes in the wireless sensor network and n_0 ≥ n/2 + θ*_{n_0}, then for any malicious node the final acceptance probability P(A_0 ≥ (n + θ*_{n/2})/2) ≈ 0.

Theorems 2 and 3 suggest that if n is sufficiently large and n_0 ≥ n/2 + θ*_{n_0}, all genuine nodes in the WSN have acceptance probabilities close to 1, and all
malicious nodes have acceptance probabilities close to 0. So, it is expected that after the first round of filtering, if not all, then a large number of genuine nodes will be accepted. On the contrary, if not all, then almost all malicious nodes will get filtered out from the network. However, for the proper functioning of the WSN, one needs to remove all malicious nodes. In order to do that, we repeat the Filtering procedure with the remaining nodes. Now, among these remaining nodes, all but a few are expected to be genuine, and because of this higher proportion of genuine nodes, the acceptance probabilities of the genuine nodes are expected to increase, and those of the malicious nodes are expected to decrease further. So, if this procedure is used repeatedly, after some stage the WSN is expected to contain genuine nodes only, and no nodes will be filtered out after that. When this is the case, our Filtering algorithm stops. Note that this algorithm does not need the values of n_0 and n_1 to be specified. We need to know n only for the computation of θ*_{n/2}. This is the only major computation involved in our method, but one should understand that this is an off-line calculation. If we know a priori the values of θ*_{n/2} for different n, one can use those tabulated values to avoid this computation. Note that the condition n_0 ≥ n/2 + θ*_{n_0} is only a sufficient condition under which the proposed protocol functions properly. Later, we will see that in the presence of negligible noise (or in the absence of noise) in the WSN, this condition matches with that of [6], and in that case it turns out to be a necessary and sufficient condition. However, in other cases it remains a sufficient condition only, and our protocol may work properly even when it is not satisfied. Our simulation studies in the next section will make this clearer.
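The normal approximation of Theorem 2 is straightforward to evaluate numerically; a small sketch (assuming SciPy) is given below, with p = 0.9973 as in the text. For example, with n = 100, n_0 = 52 and θ*_{n/2} = 2 (the setting of Section 6.1), this evaluates to a value close to 1.

```python
from math import sqrt
from scipy.stats import norm

def genuine_acceptance_probability(n, n0, theta_star_half, p=0.9973):
    """CLT approximation from Theorem 2 of the probability that a genuine node
    reaches the threshold of (n + theta*_{n/2})/2 approvals."""
    tau = (n + theta_star_half - 2 * n0 * p) / (2 * sqrt(p * (1 - p) * (n0 - 1)))
    return 1 - norm.cdf(tau)
```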
6 Simulation Results
We carried out simulation studies to evaluate the performance of our proposed algorithm. In the first part of the simulation, we calculated the value of θ*_{n/2} using the statistical simulation technique [3], and using that θ*_{n/2}, in the second part, we filtered out all suspected malicious nodes from the WSN. While maximizing θ^{f,X}_{0,n/2} w.r.t. x_f, in order to ensure that x_f and x_0 are not close, an open ball around x_0 is kept outside the search region F_0. Unless mentioned otherwise, we carried out our experiments with 100 sensor nodes, but for varying choices of n_0 and n_1 and also for different levels of noise (i.e., different values of σ²). For choosing the value of σ², first we considered two imaginary nodes (a sender and a receiver node) located at two extreme corners of D and calculated the received signal strength S^r_extreme for that setup under ideal conditions (see the Friis equation (1)). The error standard deviation σ was taken to be smaller than or equal to SS = S^r_extreme/3 to ensure that all received signal strengths remain positive (after error contamination) with probability almost equal to 1.
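For illustration, the bound SS on the noise standard deviation can be computed directly from the Friis equation and the diameter of D; this is only a sketch of the setup described above, with an axis-aligned rectangular region assumed.

```python
import math

def max_noise_sd(region, S_s, wavelength):
    """SS = S^r_extreme / 3, where S^r_extreme is the ideal Friis RSS between two
    imaginary nodes placed at opposite corners of the deployment region D."""
    (xmin, xmax), (ymin, ymax) = region
    d_extreme = math.hypot(xmax - xmin, ymax - ymin)
    alpha = wavelength / (4 * math.pi)
    return S_s * (alpha / d_extreme) ** 2 / 3
```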
6.1 WSN with Insignificant Noise (σ = 10⁻⁶ SS)
In this case, we observe that the value of θ^X_{0,n/2} remains almost constant and equal to 2p = 1.9946 ≈ 2 for varying choices of x_0 and X. So, we have θ*_{n/2} = 2.
In fact, in this case, θ*_k turns out to be 2 for all k ≥ 2. So, if we choose n_0 = 52 and n_1 = 48, the condition n_0 ≥ n/2 + θ*_{n_0} is satisfied, and one should expect the protocol to work well. When we carried out our experiment, each of the 48 malicious nodes could deceive exactly two genuine nodes, and as a result, their number of approvals turned out to be 50. So, all of them failed to reach the threshold (n + θ*_{n/2})/2 = 51, and they were filtered out from the WSN at the very first round. On the contrary, all 52 genuine nodes had numbers of approvals bigger than (47 out of 52 nodes) or equal to (5 out of 52 nodes) 51, and none of them were filtered out. So, at the beginning of the second round of filtering, we had 52 nodes in the WSN, and all of them were genuine. Since the number of approvals for each genuine node remained the same as in the first round, it was well above the updated threshold (52 + 2)/2 = 27. So, no other nodes were filtered out, and our algorithm stopped with all genuine nodes and no malicious nodes in the network. Needless to mention, the proposed protocol led to the same result for all higher values of n_0. But it did not work properly when we took n_0 = 51 and n_1 = 49. In that case, all malicious nodes had 51 approvals, and those of the genuine nodes were smaller than or equal to 51. So, no malicious nodes but some genuine nodes were deleted in the first round of filtering. As a result, the numbers of approvals for the genuine nodes became smaller in the second round, and that led to the removal of those nodes from the WSN. Note that in this case, the condition n_0 ≥ n/2 + θ*_{n_0} is not satisfied. So, here the condition is not only sufficient, but it turns out to be necessary as well. We also carried out our experiment with 101 nodes. When there were 51 genuine and 50 malicious nodes in the WSN, the protocol did not work properly. But in the case of n_0 = 52 and n_1 = 49, it could filter out all malicious nodes. In that case, each malicious node had 51 approvals, smaller than the threshold (n + θ*_{n/2})/2 = 51.5. But 48 out of the 52 genuine nodes were accepted by all 52 genuine nodes. So, at the end of the first round of filtering, we had only 48 nodes in the WSN, all genuine. Naturally, no other nodes were removed in the second round. Again, this shows that n_0 ≥ n/2 + θ*_{n_0} is a necessary and sufficient condition for the protocol to work when the noise is negligible. This is consistent with the findings of [6], where the authors allowed no noise in the network.
6.2 WSN with Significant Noise (σ = SS)
Unlike the previous case, here θ^X_{0,n/2} did not remain constant for different choices of x_0 and X. Considering n = 100, we computed θ^X_{0,n/2} over 500 simulations, and the values ranged between 5.9831 and 23.6964, leading to θ*_{n/2} = 24 and (n + θ*_{n/2})/2 = 62. Clearly, if we start with fewer than 62 genuine nodes, the protocol fails, as all genuine nodes get deleted in the first round of filtering. So, we started with 62 genuine and 38 malicious nodes. One can notice that here n_0 < n/2 + θ*_{n/2}, and the condition n_0 ≥ n/2 + θ*_{n_0} is not satisfied. But our protocol worked nicely and filtered out all malicious nodes from the WSN. This shows that the above condition is only sufficient in this case. In the first round of filtering, 54 out of the 62 genuine nodes, and 5 out of the 38 malicious nodes, could reach the threshold. So, at the beginning of the second round, we
had only 59 nodes in the network, leading to a threshold of (59 + 24)/2 = 41.5. Naturally, none of the malicious nodes and all of the genuine nodes could cross this threshold, and at the end of the second round of filtering, we had only 54 nodes in the WSN, all of which were genuine. As expected, no nodes were filtered out in the third round, and our algorithm terminated with 54 genuine nodes.
6.3 A Modified Filtering Algorithm Based on Quantiles
Note that in the previous setting, if we start with 60 genuine nodes and 40 malicious nodes, the protocol fails, as all genuine nodes get deleted in the first round of filtering. We now propose a slightly modified version of our protocol that works even when n_0 is smaller than (n + θ*_{n/2})/2. Instead of using (n + θ*_{n/2})/2, here we use a sequence of thresholds based on different quantiles of θ^X_{0,n/2}. At first, we begin with the threshold n/2 (i.e., we replace θ*_{n/2} by 0) and follow the protocol described in Section 4. In the process, some nodes may get filtered out. If there are n^(1) nodes remaining in the WSN, we use the threshold (n^(1) + θ^{0.1}_{n/2})/2 (i.e., we replace θ*_{n/2} by θ^{0.1}_{n/2}) and apply the filtering phase of the protocol Filtering to the remaining nodes. Here θ^q_{n/2} denotes the q-th (0 < q < 1) quantile of θ^X_{0,n/2}, and it can be estimated from the 500 values of θ^X_{0,n/2} observed during simulation. This procedure is repeated with thresholds (n^(i) + θ^{i/10}_{n/2})/2 for i = 2, 3, ..., 9, and finally we use the threshold (n^(10) + θ*_{n/2})/2. The nodes remaining in the WSN after these 11 steps of filtering are considered genuine. This algorithm worked well in our case, and it filtered out all malicious nodes from the WSN without losing a single genuine node. In fact, all malicious nodes were filtered out after the first two steps, and there were no deletions of nodes after that. The results for the first two steps are shown in Table 1 (in our case, θ^{0.1}_{n/2} was 8.6786). The total number of approvals for the deleted nodes is also reported in the table for a better understanding of the algorithm. This modified version could filter out up to 44 malicious nodes. In the case of n_0 = 56 and n_1 = 44, only one genuine node was deleted from the WSN before all malicious nodes were filtered out.
Table 1. First two steps of filtering (based on quantiles) with n0 = 60 and n1 = 40

Step (i)   Total nodes n^(i)        Threshold   Nodes deleted         No. of approvals
           Genuine    Malicious                 Genuine   Malicious   for deleted nodes
   0         60          40          50.00        0          1            < 50
             60          39          49.50        0          0             —
   1         60          39          53.84        0          3            51-54
             60          36          52.34        0          5            55-56
             60          31          49.84        0          5            57-58
             60          26          47.34        0         17            59-61
             60           9          38.84        0          9            62-69
             60           0          34.34        0          0             —
However, in the case of n_0 = 55 and n_1 = 45, our algorithm failed. In that case, all genuine nodes had 54 or 55 approvals, but almost all malicious nodes had more than 55 approvals. So, our protocol could remove only 9 malicious nodes before all genuine nodes were filtered out.
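The quantile-based variant of Section 6.3 only changes the sequence of thresholds fed to the filtering step. A minimal sketch, reusing filtering() from the earlier sketch, is given below; theta_quantiles is assumed to hold the estimated quantiles θ^{0.1}_{n/2}, ..., θ^{0.9}_{n/2}.

```python
def quantile_filtering(accus, theta_quantiles, theta_star):
    """Run the filtering step with thresholds based on 0, the quantiles
    theta^{i/10}_{n/2} for i = 1, ..., 9, and finally theta*_{n/2} (11 steps)."""
    G = None
    for theta in [0.0] + list(theta_quantiles) + [theta_star]:
        G = filtering(accus, theta, G)   # each step starts from the survivors of the previous one
    return G
```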
7 Possible Improvements
In this article, we have used the modified version (2) of the Friis transmission equation for developing our SecureNeighborDiscovery protocol. However, sometimes one needs to make empirical adjustments to the basic Friis equation (1) using larger exponents. These are used in terrestrial models, where reflected signals can lead to destructive interference, and foliage and atmospheric gases contribute to signal attenuation [9]. There, one can consider S^r_{ji}/S^s_i to be proportional to G_r G_s (λ/d_{ij})^m, where G_r and G_s are the mean effective gains of the antennas and m is a scalar that typically lies in the range [2, 4]. If m is known, one can develop a verification scheme following the method described in this article. Even if it is not known, it can be estimated by sending signals from known distances and measuring the received signal strengths. However, our proposed protocol is not free from limitations. In this article, we have assumed that the underlying network topology is a complete graph. But, in practice, this may not always be the case. In a multi-hop network topology, our voting-based SecureNeighborDiscovery protocol can be used in the neighborhood of each node, provided there are sufficiently many genuine nodes in that neighborhood. However, the performance of this verification protocol in the case of a multi-hop network topology needs to be thoroughly investigated.
8 Concluding Remarks
In this article, we have proposed a distributed secure position verification protocol for WSNs in noisy channels. In this approach, without relying on any trusted sensor nodes, all genuine nodes detect the existence of malicious nodes and filter them out with a very high probability. The proposed method is conceptually quite simple, and it is easy to implement if θ*_{n/2} is known. The calculation of θ*_{n/2} is the only major computation involved in our method, but one should note that this is an off-line calculation. In the case of negligible noise in the WSN, we have seen that the performance of our protocol matches that of the deterministic method of [6]. However, when the noise is not negligible, each of the sensor nodes can only have a limited precision for distance estimation. In such cases, it is not possible to develop a deterministic algorithm [6]. Our probabilistic protocol takes care of this problem, and it filters out all malicious nodes with a very high probability. When the number of nodes in the WSN is reasonably large, this probability turns out to be very close to 1. So, for all practical purposes, our proposed method behaves almost like a deterministic algorithm, as we have seen in Section 6. Since the influence of noise on signal propagation is very common in WSNs, this probabilistic approach is very practical from the implementation perspective in the real world. One should also notice that, compared to the randomized protocol of Hwang et al. [14], our protocol leads to substantial savings in the time and the power
used for transmissions. In [14], since each sensor announces one distance at a time in a round-robin fashion, the message complexity is O(n²). But in the case of our proposed protocol, O(n) messages are transmitted in the first phase, and each sensor announces all distances through a single message.
References

1. Bahl, P., Padmanabhan, V.N.: RADAR: An in-building RF-based user location and tracking system. In: INFOCOM, vol. 2, pp. 775–784. IEEE, Los Alamitos (2000)
2. Brands, S., Chaum, D.: Distance bounding protocols. In: Helleseth, T. (ed.) EUROCRYPT 1993. LNCS, vol. 765, pp. 344–359. Springer, Heidelberg (1994)
3. Bratley, P., Fox, B.L., Schrage, L.E.: A Guide to Simulation. Springer, Heidelberg (1987)
4. Capkun, S., Hubaux, J.: Secure positioning in wireless networks. IEEE Journal on Selected Areas in Communications 24(2), 221–232 (2006)
5. Čapkun, S., Rasmussen, K., Čagalj, M., Srivastava, M.: Secure location verification with hidden and mobile base stations. IEEE Transactions on Mobile Computing 7(4), 470–483 (2008)
6. Delaët, S., Mandal, P.S., Rokicki, M.A., Tixeuil, S.: Deterministic secure positioning in wireless sensor networks. Theoretical Computer Science (April 5, 2011) (in press, accepted manuscript)
7. Douceur, J.R.: The Sybil attack. In: Druschel, P., Kaashoek, M.F., Rowstron, A. (eds.) IPTPS 2002. LNCS, vol. 2429, pp. 251–260. Springer, Heidelberg (2002)
8. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. II. Wiley, Chichester (1966)
9. Fette, B.: Cognitive Radio Technology, 2nd edn. Academic Press, London (2009)
10. Fontana, R.J., Richley, E., Barney, J.: Commercialization of an ultra wideband precision asset location system. In: 2003 IEEE Conference on Ultra Wideband Systems and Technologies, pp. 369–373 (2003)
11. Hightower, J., Want, R., Borriello, G.: SpotON: An indoor 3D location sensing technology based on RF signal strength. Technical Report UW CSE 00-02-02, University of Washington, Dept. of CSE, Seattle, WA (February 2000)
12. Hollander, M., Wolfe, D.A.: Nonparametric Statistical Methods. Wiley, Chichester (1999)
13. Hu, Y., Perrig, A., Johnson, D.B.: Packet leashes: A defense against wormhole attacks in wireless networks. In: INFOCOM. IEEE, Los Alamitos (2003)
14. Hwang, J., He, T., Kim, Y.: Secure localization with phantom node detection. Ad Hoc Networks 6(7), 1031–1050 (2008)
15. Lazos, L., Poovendran, R.: SeRLoc: Robust localization for wireless sensor networks. ACM Transactions on Sensor Networks 1(1), 73–100 (2005)
16. Lazos, L., Poovendran, R., Capkun, S.: ROPE: Robust position estimation in wireless sensor networks. In: IPSN, pp. 324–331. IEEE, Los Alamitos (2005)
17. Liu, C.H., Fang, D.J.: Propagation. In: Antenna Handbook: Theory, Applications, and Design, ch. 29, pp. 1–56. Van Nostrand Reinhold, New York (1988)
18. Mandal, P.S., Ghosh, A.K.: Secure position verification for wireless sensor networks in noisy channels. CoRR, abs/1105.0668 (2011), http://www.arxiv.org/abs/1105.0668
19. Priyantha, N.B., Chakraborty, A., Balakrishnan, H.: The Cricket location-support system. In: 6th ACM MOBICOM, Boston, MA. ACM, New York (August 2000)
20. Shapiro, S.S., Wilk, M.B.: An analysis of variance test for normality (complete samples). Biometrika 52(3-4), 591–611 (1965)
Efficient CDH-Based Verifiably Encrypted Signatures with Optimal Bandwidth in the Standard Model

Yuan Zhou 1,2 and Haifeng Qian 1,*

1 Department of Computer Science and Technology, East China Normal University, China
2 Network Emergency Response Technical Team/Coordination Center, China
* Corresponding author. E-mail: [email protected]
Abstract. Exchanging items over a mobile ad hoc network has been considered a challenging issue in recent years. To tackle this challenge, Verifiably Encrypted Signatures (VES), which are employed as primitives when designing a large class of protocols such as certified email, fair exchange, and contract signing in wireless communication, provide a possible solution. However, the limited communication bandwidth, low computational ability and weak energy power prevent many existing verifiably encrypted signatures from being applied in ad hoc networks directly. In this paper, we propose a compact verifiably encrypted signature scheme without random oracles based on the Computational Diffie-Hellman (CDH) problem with pairings. Compared with prior work, our scheme achieves the following desired features: (1) our verifiably encrypted signature has compact size (only two group elements), which is optimal for both Elgamal encryption and the Waters signature; (2) the scheme is more efficient in terms of signature generation and verification; (3) our scheme also achieves provable security under a standard complexity assumption in the standard model. Apparently, our scheme is amongst the most efficient solutions in terms of both signature size and computation (optimal), because these features are important in wireless communication due to limited bandwidth and power. It can thus be applied flexibly to many secure exchange scenarios in mobile ad hoc networks that allow only a minimal cryptographic implementation.

Keywords: Verifiably Encrypted Signatures, Short Signatures, Minimum Cryptographic Implementation, Wireless Communication, Ad Hoc Network.
1 Introduction
Due to the fast advance of wireless and mobile computing technologies, mobile ad hoc networking has been subject to extensive research efforts in recent years. In a typical mobile ad hoc network, there is no fixed infrastructure, and mobile nodes that are within each other's radio transmission range can communicate directly. This nature of a mobile ad hoc network makes peer-to-peer
communication without the help of a third party possible, yet it also brings about the fair exchange issue. Moreover, the limited communication bandwidth, low computational ability and weak energy power of mobile nodes make fair exchange even more challenging in mobile ad hoc networks. In recent years, the challenge of fair exchange in mobile ad hoc networks has attracted many researchers, and the state of the art shows that verifiably encrypted signatures are one of the most important means to solve this challenge in mobile ad hoc networks. Verifiably encrypted signatures, first proposed independently by Asokan et al. [1] and Bao et al. [4], are a way of encrypting a signature under a designated public key and subsequently proving that the resulting ciphertext indeed contains such a signature. A verifiably encrypted signature scheme needs a trusted third party (TTP) called an adjudicator. To realize fair exchange in an optimistic way, we require that the adjudicator is only needed in case a participant attempts to cheat the other or the system simply crashes. Another key feature is that a participant can always force a fair and timely termination, without the cooperation of the other participants. Neither party can be left hanging or cheated so long as the adjudicator is available. Today verifiably encrypted signatures are widely used as primitives to build contract signing, efficient fair exchange and certified e-mail protocols in wireless communication.
1.1 Prior Work
Recently, Camenisch and Shoup [10] and Ateniese [2, 3] proposed verifiably encrypted signatures based on the discrete logarithm problem. In 2003, Boneh et al. [8] set up a security model for verifiably encrypted signatures and gave a construction satisfying the definitions based on the BLS short signature in the random oracle model [9]. By combining ID-based public key cryptography with verifiably encrypted signatures, Gu et al. [16] proposed an ID-based verifiably encrypted signature scheme based on the Hess signatures, and "proved" that their scheme was secure in the random oracle model. Unfortunately, the scheme was discovered to be insecure [28]. Many of the existing verifiably encrypted signature (VES) schemes are only proven secure in the random oracle model [8, 27]. Thus their security is not ensured when they are implemented in the real world (due to the fact that security in the random oracle model does not imply security in the real world [11, 12, 6, 13, 22]). In 2008, Shao proposed a new verifiably encrypted signature scheme in the random oracle model [24] where the public keys are certified by a non-traditional CA (belonging to certificate-based cryptography). To address the above problem, the authors of [15] proposed the first verifiably encrypted signature scheme without random oracles, based on the short signature scheme in [7]. However, there is no detailed security proof for their scheme. Even if their proof is right, the security of the scheme still relies on a strong complexity assumption as in [7]. Recently, Lu et al. proposed an efficient verifiably encrypted signature scheme without random oracles in [17] based on the Waters signature in [25], whose security was proved under the CDH assumption with bilinear pairings. However, the signature size is not optimal (it has three group elements),
since there is one more group element compared with Elgamal encryption (or the Waters signature). In 2007, Zhang and Mao proposed a novel scheme without random oracles, which also has a small signature size but is based on a strong assumption, namely that the Chosen-Target-Inverse-CDH with square problem is intractable [26]. Recently, [19] presented a new construction without random oracles, but the security of this scheme also relies on a strong assumption on a new computational problem (the Strong Diffie-Hellman problem). [18, 20] present new constructions with much larger sizes and complex procedures, achieving neither optimal size nor optimal computation, but considering attacks from the trusted third party, a property called abuse-freeness [19].
1.2 Motivation and Contribution
To the best of our knowledge, many verifiably encrypted signature schemes [8, 27] are proved secure in the random oracle model [5]. However, proofs in such a model do not guarantee practical security when the oracles are instantiated by any particular cryptographic primitive. To ensure security in the real world, we prefer a proof of security in the standard model. However, there are only very few VES schemes provably secure without random oracles. The most popular one [17] is proven under the CDH assumption in bilinear groups, and its verifiably encrypted signature contains three group elements. Security without random oracles thus incurs a larger signature size than that in [8], whose verifiably encrypted signature contains only two group elements (it can be viewed as an ElGamal encryption of the signature under the adjudicator's key). In this paper, based on the Waters signature [25], we propose a novel verifiably encrypted signature without random oracles, whose security relies on a standard complexity assumption (the Computational Diffie-Hellman assumption). Compared with other typical verifiably encrypted signature schemes, our scheme is more efficient in terms of signature size and computation (i.e., signature generation and verification). Actually, the generation of our verifiably encrypted signature is pretty simple and easy, and almost the same as that of the underlying signature scheme (the Waters signature scheme). However, other verifiably encrypted signatures, for instance those in [8, 17], are completed in two steps: one step to produce the ordinary signature, the other to produce a verifiably encrypted signature. Meanwhile, verification of our verifiably encrypted signature is almost the same as in the underlying signature scheme too. More explicitly, by re-using the randomness in the Waters signature scheme, we greatly reduce the signature size and improve the efficiency. The advantages of our scheme are shown in Table 1, in comparison with the schemes in [8, 17, 26] (denoted by BGLS, LOSSW and ZM, respectively). We let e be scalar multiplication on the curve, m point addition, p a pairing computation, RO the random oracle model and k the output length of a collision-resistant hash function. Our verifiably encrypted signature scheme is denoted by OVES in the table.
Table 1. Comparison with Some Other Schemes

Schemes   RO   Size       Assumption        Generation        Verification      Optimal
BGLS      √    320 bits   CDH               3e                3p                No
LOSSW     ×    480 bits   CDH               4e, (k/2)m        3p, (k/2)m        No
ZM        ×    320 bits   CDH & CTI-CDH     3e, (k/2)m        2p, (k/2)m        No
OVES      ×    320 bits   CDH               2e, ((k+2)/2)m    2p, ((k+2)/2)m    Yes
1.3 Organizations
This paper is organized as follows. In Section 2, we review the building block and the complexity assumption of our construction. In Section 3, we review the definitions of verifiably encrypted signatures. Then we present our construction without random oracles in Section 4. We present our security proof for the scheme under the Computational Diffie-Hellman assumption in Section 5. Finally, we draw a conclusion in Section 6.
2 Building Block and Complexity Assumption
We briefly review the necessary facts about bilinear maps and bilinear map groups. Consider the following setting: let G and GT be two (multiplicative) cyclic groups of prime order p; the group operations on G and GT can be computed efficiently; g is a generator of G; e : G × G → GT is an efficiently computable map with the following properties:

• Bilinear: for all u, v ∈ G and a, b ∈ Z_p, e(u^a, v^b) = e(u, v)^{ab};
• Efficiently computable: e(u, v) is efficiently computable for any input pair (u, v) ∈ G × G;
• Non-degenerate: e(g, g) ≠ 1.

We say that G is a bilinear group if it satisfies these requirements.
2.1 Computational Diffie-Hellman Problem
The security of our scheme relies on the hardness of the Computational Diffie-Hellman (CDH) problem in bilinear groups. We state the problem and our assumption as follows.

Definition 1 (CDH). In a bilinear group G, the Computational Diffie-Hellman problem is: given (g, g^a, g^b) ∈ G³ for some a, b ←_R Z_p, to find g^{ab} ∈ G.

Define the success probability of an algorithm A in solving the Computational Diffie-Hellman problem on G as

  Adv^cdh_A := Pr[ A(g, g^a, g^b) = g^{ab} : a, b ←_R Z_p ].

The probability is over the uniform random choice of g from G, of a, b from Z_p, and the coin tosses of A. We say that an algorithm A (t, ε)-breaks the
Computational Diffie-Hellman problem on G if A runs in time at most t and Adv^cdh_A is at least ε. If there is no adversary A which can (t, ε)-break the Computational Diffie-Hellman problem on G, we say the Computational Diffie-Hellman problem on G is (t, ε)-secure.
2.2 The Waters Signature Scheme
We review the underlying signature scheme, the Waters signature scheme in [25, 17]. Let the messages be bit strings of the form {0, 1}^k for some fixed k. However, in practice one could apply a collision-resistant hash function H : {0, 1}* → {0, 1}^k to sign messages of arbitrary length. The scheme requires, besides the random generators g, h ∈ G, a vector of k + 1 additional random elements U = (u', u_1, ..., u_k) ∈ G^{k+1}. The elements u', u_1, ..., u_k define a function F(·) that, given M = (m_1, ..., m_k) ∈ {0, 1}^k, maps M to

  F(M) = u' ∏_{i=1}^{k} u_i^{m_i}.

The Waters signature scheme is a three-tuple of algorithms WS = (KeyGen(1^λ), Sig_sk(M), Ver_pk(M, σ)); these algorithms behave as follows.
KeyGen(1^λ): Pick a random x ←_R Z_p and compute V ← e(h, g^x). The public key pk is (U, V, h, g). The private key sk is x.

Sig_sk(M): Given the user's private key sk = x and the message M = (m_1, ..., m_k) ∈ {0, 1}^k, pick a random r ←_R Z_p and compute

  σ_1 ← h^x · F(M)^r   and   σ_2 ← g^r.    (1)

The signature is σ = (σ_1, σ_2) ∈ G².

Ver_pk(M, σ): Take in the user's public key pk, the message M as a bit string (m_1, ..., m_k) ∈ {0, 1}^k, and the signature σ as (σ_1, σ_2) ∈ G². Verify that

  e(σ_1, g) = V · e(σ_2, F(M))    (2)

holds; if so, output 1; otherwise 0.

This signature is existentially unforgeable under a chosen-message attack, the standard notion of signature security due to Goldwasser, Micali, and Rivest [14], if the CDH problem is hard (refer to Corollary 1 of [17]).
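To make the algebra concrete, the following sketch expresses the three Waters algorithms over an assumed symmetric-pairing group object grp that exposes random_scalar() and pair(a, b), with * and ** denoting the group operation and exponentiation; no specific pairing library is implied, and the generators g, h and the vector U are taken as given inputs.

```python
def waters_F(U, M):
    """F(M) = u' * prod_i u_i^{m_i} for the bit string M = (m_1, ..., m_k)."""
    u_prime, us = U[0], U[1:]
    out = u_prime
    for u_i, m_i in zip(us, M):
        if m_i:                  # m_i is a bit, so u_i^{m_i} is u_i or the identity
            out = out * u_i
    return out

def waters_keygen(grp, g, h, U):
    x = grp.random_scalar()                      # private key
    return (U, grp.pair(h, g ** x), h, g), x     # pk = (U, V, h, g), sk = x

def waters_sign(grp, pk, x, M):
    U, V, h, g = pk
    r = grp.random_scalar()
    return (h ** x * waters_F(U, M) ** r, g ** r)                # Eq. (1)

def waters_verify(grp, pk, M, sigma):
    U, V, h, g = pk
    s1, s2 = sigma
    return grp.pair(s1, g) == V * grp.pair(s2, waters_F(U, M))   # Eq. (2)
```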
3 Definition and Security Model
A verifiably encrypted signature on some message attests to two facts: (1) that the signer has produced an ordinary signature on that message; and (2) that the ordinary signature can be recovered by the third party under whose key the signature is encrypted. Boneh et al. [8] introduced the definition of verifiably encrypted signatures and formalized a security model for them.

Definition 2 (Verifiably Encrypted Signature). A verifiably encrypted signature scheme comprises seven algorithms. Three, KeyGen, Sign, and Vrfy, are
analogous to those in ordinary signature schemes. The others, AKeyGen, VESign, VESVrfy, and Adjud, provide the verifiably encrypted signature capability.

– A standard signature scheme Σ: KeyGen(1^λ), Sign(sk, m), Vrfy(pk, m, σ) are the three algorithms of a standard signature scheme, with (pk, sk) ←$ KeyGen(1^λ), σ_sk(m) ← Sign(sk, m) and Vrfy(pk, m, σ) = b (b ∈ {0, 1}).
– AKeyGen(1^λ): On input 1^λ, output a public-private key pair (apk, ask) for the adjudicator.
– VESign(sk, apk, m): On input the signer's private key sk, the adjudicator's public key apk, and a message m, output a verifiably encrypted signature ω.
– VESVrfy(pk, apk, m, ω): On input the signer's public key pk, the adjudicator's public key apk, a message m and its signature ω, output 1 if ω is valid; otherwise 0.
– Adjud(ask, pk, m, ω): On input the adjudicator's private key ask, the signer's public key pk, a message m and the verifiably encrypted signature ω, extract and output σ, an ordinary signature on message m under pk.
For correctness, we require that verifiably encrypted signatures verify, and that adjudicated verifiably encrypted signatures verify as ordinary signatures, i.e., that

  VESVrfy(pk, apk, m, VESign(sk, apk, m)) = 1
  Vrfy(pk, m, Adjud(ask, pk, m, VESign(sk, apk, m))) = 1    (3)

hold for all m and for all properly generated keypairs and adjudicator keypairs.

3.1 Security Model
We require that a secure verifiably encrypted signature scheme have two properties (besides correctness): unforgeability and opacity. Unforgeability requires that it should be impossible to forge a valid verifiably encrypted signature. The advantage of an algorithm A in existentially forging a verifiably encrypted signature, given access to a verifiably-encrypted-signature generation oracle O_VESign(·) and an adjudication oracle O_Adjud(·), is

  Adv^f_A := Pr[ VESVrfy(pk, apk, m′, ω′) = 1 ∧ m′ ∉ Q_v :
                 (pk, sk) ←$ KeyGen(1^λ); (apk, ask) ←$ AKeyGen(1^λ);
                 (ω′, m′) ← A^{O_VESign(·), O_Adjud(·)}(pk, apk) ]    (4)

where Q_v is the set of messages queried to oracle O_VESign(·). The probability is taken over the coin tosses of the key-generation algorithms, of the oracles, and of the forger.

Definition 3 (Unforgeability). A verifiably encrypted signature forger A (t, q_v, q_a, ε)-forges a verifiably encrypted signature if: A runs in time at most
t, makes at most q_v queries to oracle O_VESign(·), makes at most q_a queries to the adjudication oracle O_Adjud(·), and Adv^f_A is at least ε. A verifiably encrypted signature scheme is (t, q_v, q_a, ε)-secure against existential forgery if no forger (t, q_v, q_a, ε)-breaks it.

Opacity requires that it should be impossible, given a verifiably encrypted signature, to extract an ordinary signature on the same message. The advantage of an algorithm A in extracting a verifiably encrypted signature, given access to a verifiably-encrypted-signature generation oracle O_VESign(·) and an adjudication oracle O_Adjud(·), is

  Adv^e_A := Pr[ Vrfy(pk, m′, σ′) = 1 ∧ m′ ∉ Q_a :
                 (pk, sk) ←$ KeyGen(1^λ); (apk, ask) ←$ AKeyGen(1^λ);
                 (σ′, m′) ← A^{O_VESign(·), O_Adjud(·)}(pk, apk) ]    (5)

where Q_a is the set of messages queried to oracle O_Adjud(·). (It is allowed, however, to query O_VESign(·) at m′; verifiably encrypted signature extraction is thus obviously no more difficult for the adversary than forgery in the underlying signature scheme.) The probability is taken over the coin tosses of the key-generation algorithms, of the oracles, and of the adversary.

Definition 4 (Opacity). A verifiably encrypted signature extractor A (t, q_v, q_a, ε)-extracts a verifiably encrypted signature if: A runs in time at most t, makes at most q_v queries to oracle O_VESign(·), makes at most q_a queries to oracle O_Adjud(·), and Adv^e_A is at least ε. A verifiably encrypted signature scheme is (t, q_v, q_a, ε)-secure against extraction if no extractor (t, q_v, q_a, ε)-extracts it.
4 Our Verifiably Encrypted Signature Scheme
In the following, we present a new verifiably encrypted signature scheme based on the Waters signature scheme. Different from those in [8, 17], our verifiably encrypted signature is not only provably secure under a standard assumption without random oracles but also achieves the optimal signature size. The scheme requires, besides the random generators g, h ∈ G, a vector of k + 1 additional random elements U = (u_0, u_1, ..., u_k) ∈ G^{k+1}. The elements u_0, u_1, ..., u_k define a function F(·) that, given M = (m_1, ..., m_k) ∈ {0, 1}^k, maps M to

  F(M) = u_0 ∏_{i=1}^{k} u_i^{m_i}.

Our signature scheme is a seven-tuple of algorithms: KeyGen, Sign, Vrfy, AKeyGen, VESign, VESVrfy, Adjud.
KeyGen(1^λ): Pick a random x ←_R Z_p and compute V ← e(h, g^x). The public key pk is (U, V, h, g). The private key sk is x.
Sigsk (M ) : Given the user’s private key sk = x, and the message M = (m1 , . . . , R
mk ) ∈ {0, 1}k , pick a random r←Zp and compute σ1 ← hx · F (M )r
and σ2 ← g r .
(6)
The signature is σ = (σ1 , σ2 ) ∈ G2 . Verpk (M, σ) : Parse the user’s public key pk and message M as a bit string (m1 , . . . , mk ) ∈ {0, 1}k , and the signature σ as (σ1 , σ2 ) ∈ G2 . Verify that ?
e(σ1 , g)=V · e (σ2 , F (M ))
(7)
holds; if so, output 1; otherwise 0. R AKeyGen(1λ ): On input 1λ , the adjudicator randomly chooses α ← Zp and computes A = g α , finally sets the public-private key pair (apk = A, ask = α). VESign(sk, apk, M ): On input the signer’s private key sk, the adjudicator’s public apk, and message m, output a verifiably encrypted signature ω. R
1. Pick a random r ←Zp and compute s1 ← hx · (A · F (M ))
r
and s2 = σ0 ← g r .
(8)
2. Output ω = (s1 , s2 ) as the signature for message M . VESVrfy(pk, apk, M, ω): On input the signer’s public key pk, the adjudicator’s public apk, message m and its signature ω, parse ω = (s1 , s2 ) ∈ G2 and verify that ?
e(s1 , g)= V · e (s2 , A · F (M )) .
(9)
holds; if so, output 1; otherwise 0. Adjud(ask, pk, M, ω): On input the adjudicator’s private key ask = α, the signer’s public key pk, message M and the verifiably encrypted signature ω = (s1 , s2 ), B first verifies the verifiably encrypted signature ω = (s1 , s2 ), if valid: 1. Compute η = s1 · s−α 2 . R
2. Randomly choose t ← Zp . 3. Compute: σ1 = η · F (M )t
and σ2 = s2 · g t .
(10)
4. Return σ = (σ1 , σ2 ); otherwise aborts. Remark 1. Note that our verifiably encrypted signature is as efficient as the Waters signature except for one multiplication in G (the cheapest computation in G). The size is identical to that of the Waters signature and optimal if we use Elgamal encryption as the underlying block of encryption. Furthermore, our security is as secure as that of the Waters signature whose security relies on a standard complexity assumption.
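Continuing the hypothetical pairing interface and the waters_F / public-key layout from the sketch in Section 2.2 (and additionally assuming a grp.inv operation for group inversion), the four VES-specific algorithms can be sketched as follows; note how VESign differs from Waters signing only by the factor A inside the exponentiation, and how Adjud re-randomizes the recovered signature.

```python
def ves_akeygen(grp, g):
    alpha = grp.random_scalar()
    return g ** alpha, alpha                         # (apk = A, ask = alpha)

def ves_sign(grp, pk, x, apk, M):
    U, V, h, g = pk
    r = grp.random_scalar()
    return (h ** x * (apk * waters_F(U, M)) ** r, g ** r)              # Eq. (8)

def ves_verify(grp, pk, apk, M, omega):
    U, V, h, g = pk
    s1, s2 = omega
    return grp.pair(s1, g) == V * grp.pair(s2, apk * waters_F(U, M))   # Eq. (9)

def adjudicate(grp, pk, apk, alpha, M, omega):
    """Recover an ordinary Waters signature from a valid VES under ask = alpha."""
    if not ves_verify(grp, pk, apk, M, omega):
        return None                                  # abort on an invalid VES
    U, V, h, g = pk
    s1, s2 = omega
    eta = s1 * grp.inv(s2 ** alpha)                  # strip the A^r mask: eta = h^x F(M)^r
    t = grp.random_scalar()
    return (eta * waters_F(U, M) ** t, s2 * g ** t)  # re-randomized signature, Eq. (10)
```

Correctness of verification follows from bilinearity: e(s_1, g) = e(h, g)^x · e((A · F(M))^r, g) = V · e(s_2, A · F(M)).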
4.1 Reduce the Size of the Public Key
One weakness of our scheme is that a user's public key might be large. If we use a 160-bit collision-resistant hash function, then keys will contain approximately 160 group elements and take around 10 KB to store. Actually, the size of public keys in our system reflects the size of keys in the underlying Waters signature scheme. However, Naccache [21] and Chatterjee and Sarkar [23] have proposed ways to shorten the public keys of the Waters IBE/signature scheme by trading off parameter size against the tightness of the security reduction. This approach is also applicable to our verifiably encrypted signature scheme.
5 Security of Our Verifiably Encrypted Signature
In the following, we prove that our scheme is unforgeable and opaque if the Waters signature is unforgeable, which implies that the proposed scheme is secure under the CDH assumption.

Theorem 1 (Unforgeability). Our verifiably encrypted signature is (t, q_v, q_a, ε)-unforgeable if the Waters signature is (t′, q, ε′)-unforgeable, where

  ε′ = ε,   t′ = t + (3q_a + q_v + 1) · t_e,   q_v = q    (11)

where q_v, q_a and q are the numbers of queries to the oracles O_VESign(·), O_Adjud(·) and O_s(·), and t_e is the time of an exponentiation in G.

Proof. We show how to construct an algorithm B that breaks the Waters signature scheme WS defined in Section 2.2 from an adversary algorithm A that breaks the unforgeability of our VES scheme. W.l.o.g., we assume that B is given pk = (G, GT, e, U, V, g, h) of WS and a signing oracle O_s(·) generating Waters signatures for any message. To use A to obtain a forgery of WS, B must give A the public keys pk = (G, GT, e, U, V, g, h) and apk = A = g^α, where α is chosen randomly from Z_p. Meanwhile, B should provide A with the oracles O_VESign(·) and O_Adjud(·). Finally, B uses the forgery output by A to create a forgery of WS. The simulation is constructed in the following way.

Setup. Given the adversary A, which issues at most q_v queries to oracle O_VESign(·) and q_a queries to oracle O_Adjud(·), algorithm B sets the public keys as pk = (U, V, h, g) and apk = A = g^α for the adversary to attack (and gives the other parameters (G, GT, e, g, h, p) to A as input).

O_VESign(m)-Queries. To respond to these oracle queries, B queries its own oracle O_s(·) at m and obtains σ = (t_1, t_2) ← O_s(m). Then B returns ω = (t_1 · t_2^α, t_2).

O_Adjud(m, pk, apk, ω)-Queries. To respond to these oracle queries, B first verifies ω on message m under pk and apk; if it is valid, B parses ω = (s_1, s_2) and returns σ = (s_1 · s_2^{−α} · F(m)^r, s_2 · g^r), where r ←_R Z_p; otherwise it aborts.
Output. Finally, when A outputs a forgery of a verifiably encrypted signature ω = (s_1, s_2), B returns σ = (s_1 · s_2^{-α}, s_2) as its forgery of the Waters signature scheme. Since the simulation of the oracles is perfect, the view of the adversary A is identical to that of the real world. With the same success probability as A, B outputs a forgery of the Waters signature scheme; namely, ε′ = ε. The running time t′ of B consists of the running time t of A plus a number of exponentiations linear in q_v + 3q_a + 1.
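The masking and unmasking steps used by the two simulated oracles above can be sanity-checked numerically. The following toy script works in the multiplicative group Z_p^* with a known generator instead of a pairing-friendly group, and only verifies that multiplying by t_2^α and later dividing by s_2^α cancel exactly, which is the algebraic fact the reduction relies on; it is not an implementation of the scheme, and the placeholder structure of t_1 is our own assumption.

```python
# Sanity check of the reduction's masking/unmasking algebra in a toy group.
import random

p = 2**61 - 1              # toy prime modulus; the group is Z_p^*
g = 3

alpha = random.randrange(1, p - 1)           # adjudicator secret, apk A = g^alpha
x, r = random.randrange(1, p - 1), random.randrange(1, p - 1)
F_m = pow(g, random.randrange(1, p - 1), p)  # stand-in for F(m)

# A Waters-style signature component pair (t1, t2) with t2 = g^r.
t2 = pow(g, r, p)
t1 = (pow(g, x, p) * pow(F_m, r, p)) % p     # placeholder structure for t1

# O_VESign answer: omega = (t1 * t2^alpha, t2).
w1, w2 = (t1 * pow(t2, alpha, p)) % p, t2

# O_Adjud answer: strip the mask and re-randomise with fresh r'.
r2 = random.randrange(1, p - 1)
s1 = (w1 * pow(w2, p - 1 - alpha, p) * pow(F_m, r2, p)) % p   # w1 * w2^{-alpha} * F(m)^{r'}
s2 = (w2 * pow(g, r2, p)) % p

# The stripped value equals the original t1 re-randomised by F(m)^{r'}.
assert s1 == (t1 * pow(F_m, r2, p)) % p
assert s2 == pow(g, r + r2, p)
print("masking/unmasking algebra is consistent")
```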
Theorem 2 (Opacity). Our verifiably encrypted signature is (t, q_v, q_a, ε)-opaque if the Waters signature is (t′, q, ε′)-unforgeable, where

ε′ = ε,   t′ = t + O(q_a + q_v + 1),   q_v + q_a = q,   (12)

where q_v, q_a and q are the numbers of queries to the oracles O_VESign(·), O_Adjud(·) and O_s(·), and t_e is the time of an exponentiation in G.
Proof. Again we show how to construct an algorithm B that breaks the Waters signature scheme WS defined in Section 2.2 from an adversary A that breaks the opacity of our VES scheme. W.l.o.g., we assume that B is given pk = (G, G_T, e, U, V, g, h) of WS, where U = (u_0, ..., u_k, u_{k+1}), and a signing oracle O_s(·) generating Waters signatures for any (k+1)-bit message (or hash value of a message)². B gives A the public keys pk′ = (U′, V, g, h), where U′ = (u_0, ..., u_k), and apk = A = u_{k+1}. Meanwhile, B provides A with the oracles O_VESign(·) and O_Adjud(·). Finally, B uses the forgery output by A to create a forgery of WS. The simulation can be constructed in the following way.
Setup. B sets the public keys as pk′ = (U′, V, g, h), where U′ = (u_0, u_1, ..., u_k), and apk = A = u_{k+1} for the adversary to attack (and gives the other parameters (G, G_T, e, g, h, p) to A as input).
O_VESign(m)-Queries. To respond to these oracle queries, B queries its own oracle O_s(·) at (m||1) ∈ {0,1}^{k+1} and obtains σ_{k+1} = (t_1, t_2) ← O_s(m||1). Then B returns ω = (t_1, t_2).
O_Adjud(m, pk, apk, ω)-Queries. To respond to these oracle queries, B first verifies ω on message m under pk and apk. If it is valid, B queries its own oracle O_s(·) at (m||0) ∈ {0,1}^{k+1}, obtains σ_{k+0} = (t_1, t_2) ← O_s(m||0), and returns σ_{k+0}; otherwise it aborts.
Output. Finally, when A outputs a forgery of the Waters signature ω = (s_1, s_2) on a message m ∈ {0,1}^k associated with public key (U′, V), B returns σ = ω = (s_1, s_2) as its forgery of the Waters signature scheme on message M = (m||0) ∈ {0,1}^{k+1} under public key (U, V).
² In our VES scheme, we assume the verifiably encrypted signatures are on k bits of message or hash value.
From Definition 4, we know that the forgery output by A is valid if it is a valid signature on some m ∉ Q_a, where Q_a is the set of messages queried at O_Adjud(·) for some verifiably encrypted signature ω. Then M is different from every message of the form (·||0) previously queried in answering O_Adjud(·)-queries. On the other hand, each m||1 differs from M in its last bit. Therefore, we can conclude that M has never been queried to the oracle O_s(·) during the whole simulation. Namely, σ output by B (i.e., ω output by A) is a valid forgery of the Waters signature scheme with public key (U, V). Since the simulation of the oracles is perfect, the view of the adversary A is identical to that of the real world. With the same probability that A outputs an extraction, B outputs a forgery of the Waters signature scheme; namely, ε′ = ε. The running time t′ of B consists of the running time t of A plus the time of oracle queries, which is linear in q_v + q_a + 1.
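The heart of this opacity argument is the one-bit domain separation between VESign queries (answered on m||1) and adjudication queries (answered on m||0). The following minimal sketch, with placeholder signature strings rather than real signatures, only illustrates why a forgery on M = m*||0 for a never-adjudicated m* is always fresh for the signing oracle; the message values are illustrative.

```python
# Minimal illustration of the one-bit domain separation in the opacity proof.
def embed(m: str, last_bit: str) -> str:
    assert set(m) <= {"0", "1"} and last_bit in ("0", "1")
    return m + last_bit

queried = set()
def sign_oracle(msg: str) -> str:
    queried.add(msg)
    return f"sig({msg})"          # placeholder signature, not a real one

adjudicated = set()
m1, m2 = "1010", "0111"
sign_oracle(embed(m1, "1"))       # answering an O_VESign query on m1
sign_oracle(embed(m2, "0"))       # answering an O_Adjud query on m2
adjudicated.add(m2)

m_star = m1                       # adversary extracts a signature on m1 (never adjudicated)
M = embed(m_star, "0")
assert m_star not in adjudicated and M not in queried
print("forged message", M, "was never asked of the signing oracle")
```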
6 Conclusion
We propose a compact verifiably encrypted signature scheme without random oracles based on the Computational Diffie-Hellman (CDH) problem in a bilinear group. Comparing our scheme with other schemes, we show that it has the following advantages: (1) optimal signature size, only 320 bits (two group elements); (2) more efficient computation in the phases of producing and verifying the verifiably encrypted signature; and (3) security under one general assumption, the CDH assumption, and without random oracles. We believe that our scheme, having the smallest signature size, is one of the most efficient schemes for mobile ad hoc networks and many other real-life applications in wireless communication.
Acknowledgements This work has been partially supported by the National Natural Science General Foundation of China Grant No. 60873217 and 61021004.
References 1. Asokan, N., Shoup, V., Waidner, M.: Optimistic Fair Exchange of Digital Signature (extended abstract). In: Nyberg, K. (ed.) EUROCRYPT 1998. LNCS, vol. 1403, pp. 591–606. Springer, Heidelberg (1998) 2. Ateniese, G.: Efficient Verifiable Encryption (and Fair Exchange) of Digital Signatures. In: Proc. of the 6th Conference on CCS, pp. 138–146. ACM Press, New York (1999) 3. Ateniese, G.: Verifiable Encryption of Digital Signature and Applications. ACM Transactions on Information and System Security 7(1), 1–20 (2004) 4. Bao, F., Deng, R.H., Mao, W.: Efficient and Practical fair exchange protocols with off-line TTP. In: IEEE Symposium on Security and Privacy, Oakland, CA, pp. 77–85 (1998)
5. Bellare, M., Rogaway, P.: Random Oracles Are Practical: A paradigm for designing efficient protocols. In: Proc. the 1st ACM Conference on Computer and Communications Security, pp. 62–73. ACM, New York (1993) 6. Bellare, M., Boldyreva, A., Palacio, A.: An uninstantiable random-oracle-model scheme for a hybrid-encryption problem. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 171–188. Springer, Heidelberg (2004) 7. Boneh, D., Boyen, X.: Short signatures without random oracles. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 56–73. Springer, Heidelberg (2004) 8. Boneh, D., Gentry, C., Lynn, B., Shacham, H.: Aggregate and verifiably encrypted signatures. In: Biham, E. (ed.) EUROCRYPT 2003. LNCS, vol. 2656, pp. 416–432. Springer, Heidelberg (2003) 9. Boneh, D., Lynn, B., Shacham, H.: Short signatures from the weil pairing. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 514–532. Springer, Heidelberg (2001) 10. Camenisch, J., Shoup, V.: Practical Verifiable Encryption and Decryption of Discrete Logarithms. In: Boneh, D. (ed.) CRYPTO 2003. LNCS, vol. 2729, pp. 126–144. Springer, Heidelberg (2003) 11. Canetti, R., Goldreich, O., Halevi, S.: The random oracle methodology, revisited. In: Proceedings of 30th Annual ACM Symposium on Theory of Computing (STOC), pp. 209–218. ACM press, New York (1998) 12. Dent, A.: Adapting the weaknesses of the random oracle model to the generic group model. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp. 100–109. Springer, Heidelberg (2002) 13. Dodis, Y., Oliveira, R., Pietrzak, K.: On the generic insecurity of the full domain hash. In: Shoup, V. (ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 449–466. Springer, Heidelberg (2005) 14. Goldwasser, S., Micali, S., Rivest, R.: A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal of Computing 17(2), 281–308 (1988) 15. Gorantla, M.C., Saxena, A.: Verifiably Encrypted Signature Scheme Without Random Oracles. In: Chakraborty, G. (ed.) ICDCIT 2005. LNCS, vol. 3816, pp. 357–363. Springer, Heidelberg (2005) 16. Gu, C.X., Zhu, Y.F.: An ID-based Verifiable Encrypted Signature Scheme Based on Hess’s Scheme. In: Feng, D., Lin, D., Yung, M. (eds.) CISC 2005. LNCS, vol. 3822, pp. 42–52. Springer, Heidelberg (2005) 17. Lu, S., Ostrovsky, R., Sahai, A., Shacham, H., Waters, B.: Sequential Aggregate Signatures and Multisignatures Without Random Oracles. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 465–485. Springer, Heidelberg (2006) 18. R¨ uckert, M.: Verifiably encrypted signatures from RSA without NIZKs. In: Roy, B., Sendrier, N. (eds.) INDOCRYPT 2009. LNCS, vol. 5922, pp. 363–377. Springer, Heidelberg (2009) 19. R¨ uckert, M., Schr¨ oder, D.: Security of verifiably encrypted signatures and a construction without random oracles. In: Shacham, H., Waters, B. (eds.) Pairing 2009. LNCS, vol. 5671, pp. 17–34. Springer, Heidelberg (2009) 20. R¨ uckert, M., Schneider, M., Schr¨ ooder, D.: Generic Constructions for Verifiably Encrypted Signatures without Random Oracles or NIZKs. In: Zhou, J., Yung, M. (eds.) ACNS 2010. LNCS, vol. 6123, pp. 69–86. Springer, Heidelberg (2010) 21. Naccache, D.: Secure and Practical Identity-based encryption. Cryptology ePrint Archive, Report 2005/369 (2005), http://www.eprint.iacr.org/
22. Paillier, P., Vergnaud, D.: Discrete-log-based signatures may not be equivalent to discrete log. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 1–20. Springer, Heidelberg (2005) 23. Chatterjee, S., Sarkar, P.: Trading time for space: Towards an efficient IBE scheme with short(er) public parameters in the standard model. In: Won, D.H., Kim, S. (eds.) ICISC 2005. LNCS, vol. 3935, pp. 424–440. Springer, Heidelberg (2006) 24. Shao, Z.: Certificate-based verifiably encrypted signatures from pairings. Information Sciences 178(10), 2360–2373 (2008) 25. Waters, B.: Efficient identity-based encryption without random oracles. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 114–127. Springer, Heidelberg (2005) 26. Zhang, J., Mao, J.: A Novel Verifiably Encrypted Signature Scheme Without Random Oracle. In: Dawson, E., Wong, D.S. (eds.) ISPEC 2007. LNCS, vol. 4464, pp. 65–78. Springer, Heidelberg (2007) 27. Zhang, F., Safavi-Naini, R., Susilo, W.: Efficient verifiably encrypted signature and partially blind signature from bilinear pairings. In: Johansson, T., Maitra, S. (eds.) INDOCRYPT 2003. LNCS, vol. 2904, pp. 191–204. Springer, Heidelberg (2003) 28. Zhang, J., Zou, W.: A robust verifiably encrypted signature scheme. In: Zhou, X., Sokolsky, O., Yan, L., Jung, E.-S., Shao, Z., Mu, Y., Lee, D.C., Kim, D.Y., Jeong, Y.-S., Xu, C.-Z. (eds.) EUC Workshops 2006. LNCS, vol. 4097, pp. 731–740. Springer, Heidelberg (2006)
MobiID: A User-Centric and Social-Aware Reputation Based Incentive Scheme for Delay/Disruption Tolerant Networks
Lifei Wei 1, Haojin Zhu 1, Zhenfu Cao 1,*, and Xuemin (Sherman) Shen 2
1 Shanghai Jiao Tong University, Shanghai, China
2 University of Waterloo, Waterloo, Ontario, Canada
Abstract. Delay/Disruption tolerant networks (DTNs) are wireless ad-hoc networks where end-to-end connectivity cannot be guaranteed and communications rely on the assumption that nodes are willing to store-carry-and-forward bundles in an opportunistic way. However, this assumption is easily violated by selfish nodes that are unwilling to consume precious wireless resources by serving as bundle relays. The incentive issue in DTNs is extraordinarily challenging due to the unique network characteristics. To tackle this issue, in this paper we propose MobiID, a novel user-centric and social-aware reputation based incentive scheme for DTNs. Different from conventional reputation schemes, which rely on neighboring nodes to monitor the traffic and keep track of each other's reputation, MobiID allows a node to manage its own reputation evidence and demonstrate its reputation whenever necessary. We also define the concepts of self-check and community-check to speed up reputation establishment and to allow nodes to form consensus views towards targets in the same community, based on our social metric of forwarding willingness. Performance simulations are given to demonstrate the security, effectiveness and efficiency of the proposed scheme.
Keywords: Selfishness, Reputation based incentive, Cooperation stimulation, Security, Delay/Disruption tolerant networks.
1 Introduction
Most popular Internet applications rely on the existence of a contemporaneous end-to-end link between source and destination, with moderate round trip time and small packet loss probability. This fundamental assumption does not hold in some challenged networks, which are often referred to as Delay/Disruption Tolerant Networks (DTNs) [1]. Typical applications of DTNs include vehicular DTNs for dissemination of location-dependent information, pocket switched networks, underwater networks, etc. Different from traditional wireless ad hoc networks, data in DTNs are opportunistically routed toward the destination by exploiting temporary connections in a store-carry-and-forward transmission fashion. Most DTN routing schemes rely on the hypothesis that each individual node is ready to forward bundles for others. However, in certain DTN applications such as
* Corresponding author. [email protected]
Fig. 1. A typical store-carry-and-forward transmission fashion (node trace and packet forwarding: node A forwards the bundle within its transmission range at time t1, the bundle is then carried by B and forwarded again at time t2)
vehicular DTNs or pocket-switched networks, which are decentralized and distributed over a multitude of devices controlled and operated by rational entities, DTN nodes can behave selfishly and try to maximize their own utility without considering the system-level welfare. Existing research has shown that a non-cooperative DTN may suffer from serious performance degradation [2–4]. Therefore, to deploy applicable DTNs in real-world scenarios, designing incentive schemes that take these characteristics into account is one of the most promising approaches. In general, incentive schemes can be classified into three categories: credit-based [2–9], tit-for-tat based [10], and reputation-based [11–20]. Even though incentive schemes have been well studied for traditional wireless networks, the unique network characteristics of DTNs, including the lack of a contemporaneous path, high variation in network conditions, difficulty in predicting mobility patterns, and long feedback delay, make the incentive issue quite different. Therefore, there is increasing interest in designing incentive schemes for DTNs. The incentive schemes reported for DTNs mainly focus on credit-based and tit-for-tat based solutions, whereas reputation based schemes have received less attention due to the special challenges brought by the unique characteristics of DTNs. Firstly, existing reputation based incentive schemes designed for conventional wireless networks assume that the sender can monitor the next hop's transmission and detect whether the next hop appropriately forwards the traffic. This assumption may not hold in DTNs because of the store-carry-and-forward transmission. For example, as shown in Fig. 1, a node A forwards bundles to a node B, which carries the bundles until it meets the next-hop node C. Meanwhile, the data transmission from B to C is beyond the sensing range of A. This unique characteristic makes existing reputation schemes based on neighbor monitoring unsuitable for DTNs. In addition, due to the long propagation delay and frequent disconnections, how to efficiently and effectively propagate reputation is another challenging issue. In this paper, we introduce a user-centric and social-aware reputation based incentive scheme, named MobiID, to stimulate cooperation among selfish nodes in DTNs. MobiID is a dynamic reputation system in which reputation can be maintained, updated, and shown for verification by each node whenever needed. Specifically, in a store-carry-and-forward transmission, each successful transmission can be demonstrated by either
the previous/next hop nodes or their community, which leads to two categories of checks: self-check and community-check. The former means that a node keeps its forwarding evidence so that it can later be checked directly by the bundle sender. The latter means that the forwarding evidence is collected and then checked through the social network to improve the efficiency of reputation propagation in DTNs. Different from existing reputation based incentive schemes, which rely on neighbors monitoring and scoring targets, in our scheme all the reputation related information of a specific node is stored in its own local buffer, which enables efficient reputation retrieval by other nodes. Thus, MobiID can be called a "user-centric reputation" scheme. Furthermore, MobiID provides a suitable way to measure the social relationships used for efficient reputation community-check. Recently, there has been increasing interest in studying social relationships by mapping the contact history to a directed graph [21]. However, we argue that a social relationship built only on physical locality and contact history cannot reflect the real willingness of nodes in bundle forwarding; for example, two people may be no more than nodding acquaintances even though they meet almost every day. Instead, MobiID abstracts the true willingness between nodes by honestly recording each data forwarding. Thus, MobiID is also a "social-aware" scheme. To the best of our knowledge, this paper is among the first to propose a reputation based incentive scheme for DTNs. Our contributions can be summarized as follows:
– Firstly, we define a new social metric for DTN nodes, which captures the forwarding willingness derived from the forwarding history, and identify social communities based on this new metric.
– Secondly, we propose MobiID to stimulate cooperation among selfish nodes in DTNs, which allows a node to maintain, update, and show its reputation tickets as an identity card whenever needed.
– Thirdly, we make use of the social property to speed up MobiID's reputation establishment and allow nodes in the same community to share reputation information and form consensus views towards targets.
– Lastly, MobiID, as a high-level scheme, is compatible with diverse data-forwarding algorithms. We also use security techniques such as identity based signatures and batch verification to reduce the computational overhead. Extensive simulations demonstrate the efficiency and efficacy of the proposed scheme.
The remainder of this paper is organized as follows. In Section 2, we present the system models and design goals. Necessary preliminaries are introduced in Section 3. Section 4 proposes MobiID, a reputation based incentive mechanism that stimulates cooperation in bundle forwarding through reputation self-check and community-check. In Section 5 and Section 6, performance analysis and simulation results are given, respectively. In Section 7, we review the related work. Finally, we draw the conclusion in Section 8.
2 Models and Assumptions In this section, we define our system model, attack model and design goals.
2.1 Social Based DTN Model
We consider a social based DTN system model, in which not only are end-to-end connections not always guaranteed and routing decisions made in an opportunistic way, but the nodes also have social relationships based on their willingness. Specifically, a source node Src wants to send bundles to a destination node Dst relying on relays by the intermediate nodes {N_1, N_2, ..., N_n}. Similar to credit-based schemes [2], we assume that there exists an Offline System Manager (OSM), which is responsible for key distribution. At the beginning of the system initialization, each node in the DTN registers with the OSM and obtains its secret keys and the public parameters. Different from credit-based schemes [2], our system model does not need a credit or reputation clearance process by a virtual bank or the OSM, and our reputation is evaluated in a self-organizing manner that caters to the DTN environment. Moreover, to introduce the concept of "community-check", we first define the social relationship model derived from the forwarding history. Our social based DTN can be modeled as a weighted directed graph (V, E), where the vertex set V consists of all the DTN nodes and the edge set E consists of the social links between these nodes. In this work, we use the forwarding history instead of the contact history of [21] to evaluate the weight of the social links between nodes, since the fact that two nodes are in contact does not mean that they are willing to forward bundles for each other under a rational assumption; they may just be running into each other. To extract the social relationship, we introduce the concept of average forwarding time (AFT), which reflects both the contact frequency and the forwarding capability for one bundle. The AFT from node i to node j is defined as

AFT_ij = T / (Σ_k N_{t_k, t_{k+1}}),   (1)

where T is a training time window and N_{t_k, t_{k+1}} represents the number of forwarded bundles between two consecutive meeting times t_k and t_{k+1}. The smaller AFT_ij is, the more willing i is to forward for j. Note that in general AFT_ij ≠ AFT_ji. Finally, we reduce a node's knowledge to a single willingness metric w_ij ∈ [0, 1] for node i towards node j. We use the Gaussian similarity function [21, 22] to normalize AFT_ij in equation (1) and denote the resulting metric as the willingness w_ij = exp(−AFT_ij² / (2σ²)). Here, σ is a scaling parameter [21, 22]. We set a threshold T_w and employ the technique of social group identification, where a non-overlapping community structure can be constructed in a distributed manner using a simplified clique formation algorithm from [21]. In our settings, we assume that if two nodes belong to the same community, the chance that they meet is higher than if they belong to different communities.
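A small sketch of how a node could compute AFT and the willingness metric from its forwarding log is given below. The numerical values (training window, σ, threshold T_w) are illustrative assumptions, not parameters prescribed by the scheme.

```python
# Average forwarding time (AFT) of eq. (1) and the willingness metric
# w_ij = exp(-AFT_ij^2 / (2*sigma^2)) computed from a forwarding log.
import math

def aft(T: float, forwarded_per_interval: list) -> float:
    """T is the training window; the list holds N_{t_k, t_{k+1}}, the number of
    bundles i forwarded for j between consecutive meetings inside the window."""
    total = sum(forwarded_per_interval)
    return math.inf if total == 0 else T / total

def willingness(aft_ij: float, sigma: float) -> float:
    if math.isinf(aft_ij):
        return 0.0
    return math.exp(-aft_ij**2 / (2 * sigma**2))

T = 12 * 3600.0                       # 12-hour training window (illustrative)
aft_ij = aft(T, [3, 5, 2, 4])         # i forwarded 14 bundles for j in the window
w_ij = willingness(aft_ij, sigma=4000.0)   # sigma is a scaling parameter
print(round(aft_ij, 1), round(w_ij, 3))

T_w = 0.5                             # community threshold on willingness
print("same-community candidate:", w_ij >= T_w)
```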
2.2 Attack Models
In this paper, we assume that every node is rational. Nodes with a high reputation have a better chance of being chosen than those with a low one, owing to their past successful forwarding history in DTNs, while nodes can improve their reputation by actively participating in bundle forwarding to avoid being put into a blacklist, a mechanism often used in reputation based incentive schemes.
In addition, among the intermediate nodes we consider two kinds of misbehaving DTN nodes besides the innocent ones: selfish nodes and malicious nodes. Due to their selfish nature and the energy consumption involved, selfish nodes are not willing to forward bundles for others without stimulation or rewards. Such selfish behavior can significantly degrade network performance. Moreover, malicious nodes may launch attacks such as modifying the forwarding history to overclaim a high reputation and then attracting bundles and dropping them in order to isolate a target user, which destroys network performance. This kind of attack is not easy to discover in distributed networks without monitors, especially in DTNs.
2.3 Design Goals
Our goal is to develop a user-centric and social-aware reputation-based incentive scheme for DTNs. Specifically, the following three desirable objectives will be achieved:
– Effectiveness. Our proposed scheme is effective in stimulating cooperation among selfish nodes in DTNs.
– Security. Our scheme resists various common attacks launched by malicious nodes.
– Efficiency. It is an efficient scheme that does not introduce too much extra communication and transmission overhead.
3 Preliminary
3.1 Bayesian Systems
Bayesian systems typically take binary inputs such as positive or negative ratings and compute scores by statistically updating Beta probability density functions Beta(α, β) [12, 17]. The updated score combines the previous score with a new rating, as follows. Initially, the prior is the function Beta(α, β) = Beta(1, 1), which is the uniform distribution on [0, 1]. When a new observation, consisting of f negative ratings and s positive ratings, is collected, the prior function is updated by α ← α + s and β ← β + f. The advantage of a Bayesian system is that it provides a theoretically sound basis for computing scores and only needs the two parameters α and β, which are continuously updated along with the reported observations [15]. According to the definition, the mathematical expectation of the evidence distribution is

EXP(Beta(α, β)) = α / (α + β).   (2)

In addition, EXP(Beta(α, β)) is only a ratio and cannot reflect the uncertainty of the distribution, since the mathematical expectations are equal in the cases (α, β) = (1, 1) and (α, β) = (10, 10). Thus, we use the normalized variance of the evidence distribution to describe the uncertainty:

VAR(Beta(α, β)) = 12 · α · β / ((α + β)² · (α + β + 1)).   (3)
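The following helper functions implement the expectation of eq. (2), the normalized variance of eq. (3) and the update rule. The factor 12 scales the plain Beta variance, whose maximum over the parameters reachable here (α, β ≥ 1) is 1/12 at (1, 1), into the range [0, 1]; the sample numbers are illustrative.

```python
# Beta-reputation helpers: expectation (2), normalized variance (3), update rule.
def beta_expectation(alpha: float, beta: float) -> float:
    return alpha / (alpha + beta)

def beta_normalized_variance(alpha: float, beta: float) -> float:
    # Plain Beta variance is a*b / ((a+b)^2 (a+b+1)); multiplying by 12
    # normalizes it so that the uniform prior Beta(1, 1) has uncertainty 1.
    return 12.0 * alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

def update(alpha: float, beta: float, s: int, f: int):
    return alpha + s, beta + f

a, b = 1.0, 1.0                       # uniform prior Beta(1, 1)
print(beta_expectation(a, b), beta_normalized_variance(a, b))   # 0.5, 1.0
a, b = update(a, b, s=9, f=1)
print(beta_expectation(a, b), round(beta_normalized_variance(a, b), 3))
```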
3.2 Cryptographic Technology
In MobiID, signatures, signed by the next-hop node, are introduced to authenticate forwarding evidence. Each node has a unique ID that never changes in the scheme and is used as its public key to verify signatures. To reduce the computational cost of MobiID, we adopt a cryptographic technique based on a signature scheme [23] and its well-known batch verification version [24], which consists of five algorithms:
Setup. The OSM runs the setup algorithm to generate the system parameters and master secret key. Specifically, the OSM selects a bilinear pairing on an elliptic curve. An efficient admissible bilinear pairing ê : G_1 × G_1 → G_2, where G_1 and G_2 are two cyclic multiplicative groups of the same prime order q (i.e., |G_1| = |G_2| = q), has the following properties (let P be a generator of G_1): (1) Bilinear: for all P ∈ G_1 and a, b ∈ Z_q^*, ê(aP, bP) = ê(P, P)^{ab}; (2) Non-degenerate: there exists P ∈ G_1 such that ê(P, P) ≠ 1; (3) Computable: there is an efficient algorithm to compute ê(P, Q) for any P, Q ∈ G_1. In addition, the OSM chooses two cryptographic hash functions H_1 : {0,1}* → G_1 and H_2 : {0,1}* → Z_q^*. After that, the OSM picks a random number s ∈ Z_q^* as its secret key and sets its public key as P_pub = sP. The system parameters are params = (G_1, G_2, q, ê, P, P_pub, H_1, H_2). The system secret s is known only to the OSM itself. When a DTN node wants to join the network, it registers with the OSM. The node is assigned its identity ID, the system parameters params, and a secret key sk_ID from the OSM in a secure way; specifically, the OSM computes sk_ID = s · H_1(ID). Note that the system initialization and registration steps can be accomplished in an off-line phase.
Sign. To generate a signature on a message m, the signer first encodes m into a non-zero element of Z_q^*. It then selects a random number r ∈ Z_q^* and computes the signature Sig_ID(m) = (U, V), where U = r · H_1(ID) and V = (r + H_2(U||m)) · sk_ID.
IndVer. To verify a message-signature pair Sig_ID(m), the verifier checks individually whether ê(V, P) = ê(U + H_2(U||m) · H_1(ID), P_pub).
Aggr. To improve efficiency, the verifier first combines the signatures {Sig_IDi(m_i) = (U_i, V_i) | 1 ≤ i ≤ n} by V_Bat = Σ_{i=1}^{n} V_i and U_Bat = Σ_{i=1}^{n} (U_i + H_2(U_i||m_i) · H_1(ID_i)).
BatVer. The verifier checks the combined signature in a batch: ê(V_Bat, P) = ê(U_Bat, P_pub).
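To make the verification equations concrete without pulling in a pairing library, the sketch below uses a symbolic model in which every G_1 element is represented by its discrete logarithm and the pairing becomes multiplication of exponents modulo q. It only demonstrates why IndVer and BatVer balance; it is in no way a secure or faithful implementation (that is what the PBC library used in Section 6 is for), and the hash constructions are our own stand-ins.

```python
# Symbolic sketch of the identity-based signature and its batch verification.
# G1 elements are represented by exponents mod q; e(aP, bP) is modelled as a*b.
import hashlib
import random

q = (1 << 127) - 1            # toy prime group order

def H1(identity: str) -> int:            # H1 : {0,1}* -> G1 (as an exponent)
    return int.from_bytes(hashlib.sha256(b"H1" + identity.encode()).digest(), "big") % q

def H2(U: int, m: str) -> int:           # H2 : {0,1}* -> Z_q^*
    data = b"H2" + U.to_bytes(32, "big") + m.encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

s = random.randrange(1, q)               # OSM master secret
P, Ppub = 1, s                           # P = generator, Ppub = sP (as exponents)

def extract(identity: str) -> int:       # sk_ID = s * H1(ID)
    return (s * H1(identity)) % q

def sign(identity: str, sk: int, m: str):
    r = random.randrange(1, q)
    U = (r * H1(identity)) % q
    V = ((r + H2(U, m)) * sk) % q
    return U, V

def e(a: int, b: int) -> int:            # toy "pairing": multiply exponents
    return (a * b) % q

def ind_verify(identity: str, m: str, sig) -> bool:
    U, V = sig
    return e(V, P) == e((U + H2(U, m) * H1(identity)) % q, Ppub)

def batch_verify(items) -> bool:         # items: list of (ID, m, (U, V))
    V_bat = sum(V for _, _, (U, V) in items) % q
    U_bat = sum((U + H2(U, m) * H1(idn)) for idn, m, (U, V) in items) % q
    return e(V_bat, P) == e(U_bat, Ppub)

tickets = []
for i in range(10):
    idn = f"node-{i}"
    tickets.append((idn, f"ticket-{i}", sign(idn, extract(idn), f"ticket-{i}")))
assert all(ind_verify(*t) for t in tickets) and batch_verify(tickets)
print("individual and batch verification both pass")
```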
4 The Proposed MobiID Scheme
In this section, we first introduce the primitive concept of a "reputation ticket". We then illustrate how MobiID works in bundle forwarding by making full use of reputation tickets to stimulate nodes' cooperation. The goal of our scheme is to detect and punish selfish nodes in order to encourage forwarding. By punishing these nodes, we show that behaving selfishly will not benefit them; instead, behaving cooperatively gives them a better chance to increase their benefit.
4.1 Bundle Forwarding
Fig. 2 shows the bundle forwarding process. When a node F comes within the transmission range of a sender S, they first check whether the other party is in their blacklist. If either of them is in the blacklist, the forwarding stops. Otherwise, two cases emerge, depending on whether they have met before.
Fig. 2. The Proposed MobiID Scheme (message flow for forwarding at the first encounter, reputation self-check, and reputation community-check: exchange of bundle B and reputation ticket RT, request and return of reputation evidence, and relaying of RT(S))
Case a: If they meet for the first time, S directly forwards bundles to F and sets an initial reputation value for F. After S sends a bundle to F, F replies to S with a reputation ticket as evidence of S's successful forwarding. Similarly, F receives a reputation ticket from the next-hop node R and keeps it for future checking at its next encounter with S. The format of the reputation ticket is shown in Fig. 3.
id | S | SIS | F | R | TS | H(B) | SigR
Fig. 3. A reputation ticket after forwarding a packet
Specifically, it is comprised of the ticket sequence number id, the sender node S, the forwarding node F, the receiving node R, the time stamp TS and the bundle hash value H(B). SIS is the social group identifier of S, which will be used later in the reputation social agreement. R's signature SigR(id||S||SIS||F||R||TS||H(B)) is attached. This reputation ticket provides evidence that F forwarded S's bundle to R at time TS. Different from existing reputation schemes, MobiID allows each node to maintain its own reputation tickets in its local buffer and is thus able to provide its own reputation on demand. A node can also actively update its reputation by self-checking. Therefore, MobiID is a user-centric scheme in which reputation computation does not depend on others. This feature makes it highly appealing in DTNs, which suffer from frequent disconnections and high propagation delay.
Case b: If the two nodes have met before, S starts the reputation self-check algorithm. In our settings, we assume the "good property": if S and F meet once, it is likely that they will meet again in the near future.
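A possible in-memory representation of such a ticket and of the byte string handed to the Sign algorithm is sketched below. The field layout follows Fig. 3; the concrete serialization (field separator, hash function) is an illustrative choice of ours, not mandated by the scheme.

```python
# Sketch of a reputation ticket and the string that R signs,
# SigR(id || S || SIS || F || R || TS || H(B)).
import hashlib
from dataclasses import dataclass

@dataclass
class ReputationTicket:
    ticket_id: int          # ticket sequence number id
    sender: str             # S
    sender_group: str       # SIS, social group identifier of S
    forwarder: str          # F
    receiver: str           # R
    timestamp: int          # TS
    bundle_hash: bytes      # H(B)
    signature: bytes = b""  # SigR over the fields below

    def to_be_signed(self) -> bytes:
        fields = [str(self.ticket_id), self.sender, self.sender_group,
                  self.forwarder, self.receiver, str(self.timestamp),
                  self.bundle_hash.hex()]
        return "|".join(fields).encode()

bundle = b"...bundle payload..."
ticket = ReputationTicket(42, "S", "G7", "F", "R", 1_300_000_000,
                          hashlib.sha256(bundle).digest())
print(ticket.to_be_signed())
```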
4.2 Reputation Self-Check
S starts the reputation self-check algorithm to evaluate F's forwarding quality. S first looks up the forwarding records from the last encounter to verify the expiration time of the reputation tickets and then asks F for forwarding evidence in order to evaluate its reputation. F needs to actively return the related reputation tickets to S; otherwise F is assigned a low reputation, which leads to its being put into the blacklist. After that, S makes an observation, where NtF(S, F) denotes the number of bundles that S asked F to forward in the last encounter and NaF(S, F) denotes the number of bundles that F has actually forwarded for S.
Definition 1 (Observation). The observation starts at time t_s and ends at time min(t_d, t_s + TTL). S verifies reputation tickets in an individual mode or a batch mode:
Individual-verification mode: S individually runs IndVer(U_i, V_i).
Batch-verification mode: S runs Aggr(U_i, V_i) and BatVer(U, V).
If node F has completed the bundle forwarding, that is, NtF(S, F) ≤ NaF(S, F), the observed forwarding result is considered a success; otherwise, a failure. To reason from observation results and further generate the reputation metric, we let α and β denote the total numbers of observed successful and failed forwardings, respectively, while s and f are the successful and failed forwardings in the current observation. Thus, we have

α ← α + s and β ← β + f.   (4)
According to the definition of Bayesian inference, the basic reputation value can be quantified by the expectation (2) of the evidence distribution, v_{S−F} = EXP(Beta(α, β)), together with u_{S−F} = VAR(Beta(α, β)). To account for the certainty part, we assign an advanced reputation value according to the proportion of supporting evidence in the observed results. We define the advanced reputation value (ARV) as

ARV_{S−F} = v_{S−F} · (1 − u_{S−F}) = α/(α + β) · (1 − u_{S−F}).   (5)

Additionally, to take into account that the reputation value fades over time, we introduce a discount weight ω that indicates the freshness of the reputation, used as an aging factor over a time slot ΔT, such that

α = ω^{(t_d − t_s)/ΔT} · α + s and β = ω^{(t_d − t_s)/ΔT} · β + f.   (6)
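Putting eqs. (2)-(6) together, a sender could compute the aged ARV as in the sketch below; the numeric values of ω, ΔT and the observation are illustrative assumptions.

```python
# Advanced reputation value (ARV) with evidence aging, following eqs. (2)-(6).
def aged_update(alpha, beta, s, f, omega, td, ts, delta_t):
    fade = omega ** ((td - ts) / delta_t)        # eq. (6): discount old evidence
    return fade * alpha + s, fade * beta + f

def arv(alpha, beta):
    v = alpha / (alpha + beta)                                            # eq. (2)
    u = 12.0 * alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))  # eq. (3)
    return v * (1.0 - u)                                                  # eq. (5)

alpha, beta = 1.0, 1.0                    # uniform prior
# Observation: 8 successful and 1 failed forwardings, one time slot after the
# previous check (omega, td, ts, delta_t are illustrative values).
alpha, beta = aged_update(alpha, beta, s=8, f=1, omega=0.9,
                          td=7200, ts=3600, delta_t=3600)
print(round(arv(alpha, beta), 3))
```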
4.3 Decision Making
After the reputation value has been generated, S needs to decide whether to reward or punish F. In our setting, the OSM sets two thresholds, T_High and T_Low, for the candidate forwarder. An ARV higher than T_High indicates that the candidate node is willing to forward bundles for S; S then prepares to send it new bundles as in Section 4.1 and, as a reward, agrees to forward reputation tickets for F and starts the reputation social establishment procedure of Section 4.4. If the reputation value is lower than T_Low, the candidate node has behaved selfishly in forwarding bundles for S; S puts it into the blacklist and informs its community members
when they meet later (see Section 4.4). In the case that the reputation value is between T_High and T_Low, S still sends bundles to F but gives F a warning and does not propagate reputation tickets this time, to encourage F to forward more actively. Note that the thresholds T_High and T_Low must be defined carefully; otherwise, the false positive and false negative rates could be high. Nodes in the blacklist are not asked to forward, because their low reputation implies unreliable bundle forwarding. At the same time, as a punishment, their forwarding evidence is not community-checked by other nodes in time, which leads to a reputation drop for the selfish nodes. Therefore, they need to find new nodes that have not put them in the blacklist, or they simply have to wait until they meet the sender again.
Algorithm 1. Reputation Self-Check Algorithm
1: procedure ReputationSelfCheck
2:   if node F is in the blacklist then
3:     procedure stops;
4:   end if
5:   S asks F to show its reputation tickets;
6:   for each reputation ticket F returns do
7:     S verifies it in an individual mode or in a batch mode
8:   end for
9:   S makes observations to obtain s and f and updates F's ARV_{S−F} by (5);
10:  S starts to make a decision based on the new ARV_{S−F}.
11:  if ARV_{S−F} < T_Low then
12:    F is considered a selfish node and put into the blacklist; procedure stops;
13:  else
14:    if ARV_{S−F} ≥ T_High then
15:      F is considered a good forwarder and S starts the community check procedure: F submits the reputation tickets in its buffer belonging to the same community as S.
16:    else
17:      F is considered an inactive node. S warns F by not doing the community check.
18:    end if
19:  end if
20: end procedure
Fig. 4. Reputation social establishment (node S carries F's reputation tickets along its node trace and propagates them to community members within its transmission range)
As shown in Fig. 4, if S agrees to help F community-check its reputation tickets, F sends S the related reputation tickets whose SGI ∈ SIS. This means that those senders and S belong to the same community and therefore have a higher chance of meeting S in the near future than of meeting F. S holds these reputation tickets in its buffer until it meets a community member S′ and then starts the reputation community check of Algorithm 2.
Algorithm 2. Reputation Community Check Algorithm
1: procedure ReputationCommunityCheck
2:   When S meets a community member S′, they exchange their reputation tickets and blacklists.
3:   After verifying the validity of these tickets, they can reach a consensus on F's reputation by updating:
     u_{SS′−F} = u_{S−F} + u_{S′−F} − 1 and
     v_{SS′−F} = ((1 − u_{S−F}) · v_{S−F} + (1 − u_{S′−F}) · v_{S′−F}) / ((1 − u_{S−F}) + (1 − u_{S′−F}))
4: end procedure
When this reputation community check continues, the nodes in the same community can form consensus views towards the target F by

v_F = (Σ_{i∈C(i)} (1 − u_{i−F}) · v_{i−F}) / (Σ_{i∈C(i)} (1 − u_{i−F})),   (7)

where C(i) represents the social group node i belongs to and (v_{i−F}, u_{i−F}) represents node i's view towards node F. The consensus view v_F is the weighted sum of the views from the nodes in the same community, where the weight is determined by node i's certainty about the reputation value v_{i−F}. Thus, the nodes in one community form their consensus view towards the target F by

ARV_{S−F} = v_F · (1 − u_F).   (8)
When a selfish node has been put into the blacklist, other nodes in the same community will refuse community check as a punishment.
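A sketch of this community-level aggregation is given below. Eq. (7) and eq. (8) are implemented directly; since the text does not spell out how the community-wide uncertainty u_F is formed from the individual u_{i−F}, the certainty-weighted average used here for u_F is our own assumption, and the numeric views are illustrative.

```python
# Community consensus of eq. (7): certainty-weighted average of the views
# (v_iF, u_iF) held about a target F, and the resulting ARV of eq. (8).
def consensus(views):                 # views: list of (v_iF, u_iF) pairs
    weights = [1.0 - u for _, u in views]
    total = sum(weights)
    v_f = sum(w * v for (v, _), w in zip(views, weights)) / total   # eq. (7)
    u_f = sum(w * u for (_, u), w in zip(views, weights)) / total   # assumed
    return v_f, u_f

views_on_F = [(0.85, 0.10), (0.70, 0.40), (0.90, 0.05)]
v_f, u_f = consensus(views_on_F)
arv_f = v_f * (1.0 - u_f)                                           # eq. (8)
print(round(v_f, 3), round(u_f, 3), round(arv_f, 3))
```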
5 Performance Analysis
We model our reputation community check as an epidemic process [25] in order to investigate the factors that influence the speed of reputation social establishment. If the establishment speed is higher than the mobility of a selfish node, the selfish node will be isolated. Due to space limitations, we omit the proof.
6 Simulation
We evaluate the performance of MobiID in two aspects: the cost of the cryptographic operations in MobiID, and the effectiveness and efficiency of MobiID in stimulating selfish nodes, through extensive simulations.
6.1 Cryptographic Overhead Evaluation
Our evaluation runs on an Intel Core 2 Duo P7450 (2.13 GHz) with 1 GB RAM, using the Pairing-Based Cryptography (PBC) library [26] under Ubuntu 9.10, to measure the delays of the cryptographic operations, which are summarized in Table 1. Since the signature-aggregation algorithm can be performed incrementally by the nodes, this computational cost can be reduced. Given n unauthenticated tickets, the computational cost of batch verification is bounded by 2 pairings plus several multiplications, which is a significant improvement over the 2n pairings of individual verification.

Table 1. Cryptographic Overhead
Operation                                  Execution Time
Ticket generation                          12.578 ms
Individual verification for one ticket     20.167 ms
Individual verification for 10 tickets     201.674 ms
Aggregation for 10 tickets                 62.891 ms
Batch verification for 10 tickets          13.878 ms
6.2 Performance Simulation
Simulation Setup. We implement MobiID in a public DTN simulator, namely the Opportunistic Networking Environment (ONE) simulator [27], and evaluate its performance under a practical application scenario, i.e., vehicular DTNs. Each vehicle first appears at a random position and moves towards another randomly selected position along the roads of a map. The detailed parameters are as follows: Duration: 12 hours; Number of nodes: 126; Speed of nodes: 0.5 m/s ∼ 13.92 m/s;
Fig. 5. Incentive effectiveness comparison of MobiID with diverse data-forwarding algorithms: delivery ratio versus the percentage of selfish nodes (0–35%), with and without MobiID, for (a) Spray and Wait routing, (b) Epidemic routing, and (c) Prophet routing
Transmission range: 10 m; Transmission speed: 2 Mbps; Buffer size: 5 MB ∼ 50 MB; Bundle generation interval: 15 s ∼ 35 s.
Incentive Effectiveness. We begin our simulation by observing the incentive effectiveness of MobiID, measured by the bundles' average successful delivery probability under different percentages of selfish nodes or selfish behaviors, from 0% to 35%, as shown in Fig. 5. Additionally, our scheme is compatible with diverse data-forwarding algorithms such as Spray and Wait (Fig. 5(a)), Epidemic (Fig. 5(b)) and Prophet (Fig. 5(c)). The results indicate that network delivery can degrade significantly when selfish nodes or selfish behaviors exist; moreover, the average successful delivery ratio becomes worse as the percentage of selfish nodes or selfish behaviors increases. With MobiID, however, nodes are naturally motivated to participate in bundle forwarding to avoid being put into the blacklist, and the delivery ratio changes little as selfishness increases. This demonstrates the incentive effectiveness of MobiID.
7 Related Work
The issues of studying selfish behavior and designing incentive schemes have received extensive attention in all kinds of networks. Most previously reported studies have focused on how to stimulate selfish nodes through different approaches in ad-hoc, sensor, and P2P networks. Credit-based incentive schemes [2–9, 28] are usually employed to provide incentives, such as virtual credits, to encourage selfish nodes to forward. A recent work [10] proposes pair-wise Tit-for-Tat as an incentive scheme in DTNs. Reputation-based incentive schemes often rely on individual nodes to monitor neighboring nodes' traffic and keep track of each other's reputation, so that uncooperative nodes can eventually be detected and excluded from the network. Meanwhile, reputation based incentive mechanisms are always accompanied by trust systems [17] to reward or punish selfish nodes. [11] proposes two techniques: watchdog and pathrater. CORE [13] uses the watchdog mechanism to observe neighbors and then detect and isolate selfish nodes. [14] proposes OCEAN, in which each node maintains ratings for its neighbors through direct interaction, but these ratings are not propagated to other nodes. SORI [15] proposes the concepts of first-hand and second-hand reputation and combines these values using a weighted sum. [16] proposes a reputation management system (RMS) for mobile ad hoc networks. [19] presents two forwarding protocols for mobile wireless networks and formally shows that both protocols are Nash equilibria. However, in DTNs, existing reputation-based incentive schemes may face these
challenges. Using willingness to measure the strength of the social tie, [20] proposes a social selfishness aware routing algorithm that provides better routing performance in an efficient way. [18] designs a reputation-assisted framework to accurately evaluate an encounter's competence in delivering data in opportunistic networks.
8 Conclusions
In this paper, we propose MobiID, a novel "user-centric" and "social-aware" reputation based incentive scheme for DTNs to stimulate cooperation in bundle forwarding. Different from conventional reputation based incentive schemes, MobiID allows each node to maintain and update its reputation tickets in its local buffer and thus to provide its reputation on demand. Besides, we define a new willingness metric between two nodes, which is extracted from the social wireless network. We further provide a social based solution, "community check", to accelerate reputation checking and establishment. Our future work includes investigating the privacy issues of reputation systems in DTNs.
Acknowledgments This paper is partially supported by the Key Program of the National Natural Science Foundation of China (Grant No. 61033014), the National Natural Science Foundation of China (Grant No. 60972034, 60970110, and 61003218) and the National Fundamental Research Development Program of China (973) (Grant No. 2007CB311201). This paper is also supported by the Science and Technology Fundamental Infrastructure Construction Project of Jiangsu Province–Engineering Research Center (Grant No. BM20101014). Prof. Zhenfu Cao is the corresponding author, whose email address is
[email protected].
References 1. Fall, K., Farrell, S.: DTN: an architectural retrospective. IEEE Journal on Selected Areas in Communications 26(5), 828–836 (2008) 2. Zhu, H., Lin, X., Lu, R., Fan, Y., Shen, X.: SMART: A Secure Multilayer Credit-Based Incentive Scheme for Delay-Tolerant Networks. IEEE Transactions on Vehicular Technology 58(8), 4628–4639 (2009) 3. Chen, B., Chan, M.: MobiCent: a Credit-Based Incentive System for Disruption Tolerant Network. In: INFOCOM 2010, San Diego, California, USA, March 14-19 (2010) 4. Lu, R., Lin, X., Zhu, H., Shen, X., Preiss, B.: Pi: a practical incentive protocol for delay tolerant networks. IEEE Trans. on Wireless Communications 9(4), 1483–1493 (2010) 5. Buttyan, L., Hubaux, J.P.: Stimulating cooperation in self-organizing mobile ad hoc networks. Mobile Networks and Applications 8(5), 579–592 (2003) 6. Zhong, S., Chen, J., Yang, Y.: Sprite: A simple, cheat-proof, credit-based system for mobile ad-hoc networks. In: INFOCOM 2003, San Franciso, USA, March 30-April 3 (2003) 7. Anderegg, L., Eidenbenz, S.: Ad hoc-VCG: a truthful and cost-efficient routing protocol for mobile ad hoc networks with selfish agents. In: MOBICOM 2003, San Diego, CA, USA, September 14-19 (2003)
8. Zhang, Y., Lou, W., Liu, W., Fang, Y.: A secure incentive protocol for mobile ad hoc networks. Wireless Networks 13(5), 569–582 (2007) 9. Mahmoud, M.E., Shen, X.: FESCIM: Fair, Efficient, and Secure Cooperation Incentive Mechanism for Multi-hop Cellular Networks. IEEE Trans. on Mobile Computing (to appear) 10. Shevade, U., Song, H., Qiu, L., Zhang, Y.: Incentive-Aware Routing in DTNs. In: ICNP 2008, Orlando, Florida, USA, October 19-22 (2008) 11. Marti, S., Giuli, T., Lai, K., Baker, M.: Mitigating routing misbehavior in mobile ad hoc networks. In: MOBICOM 2000, Boston, MA, USA, August 6-11 (2000) 12. Jøsang, A., Ismail, R.: The beta reputation system. In: The 15th Bled Electronic Commerce Conference, Bled, Slovenia, June 17-19 (2002) 13. Michiardi, P., Molva, R.: Core: a collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks. In: Sixth Joint Working Conference on Communications and Multimedia Security, Portoroˇz, Slovenia, September 26-27 (2002) 14. Bansal, S., Baker, M.: Observation-based cooperation enforcement in ad hoc networks. Arxiv preprint cs/0307012 (2003) 15. He, Q., Wu, D., Khosla, P.: SORI: A Secure and Objective Reputation-based Incentive Scheme for Ad hoc Networks. In: WCNC 2004, Atlanta, GA, USA, March 21-25 (2004) 16. Anantvalee, T., Wu, J.: Reputation-based system for encouraging the cooperation of nodes in mobile ad hoc networks. In: ICC 2007, SECC, Glasgow, Scotland, June 24-28 (2007) 17. Jøsang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decision Support Systems 43(2), 618–644 (2007) 18. Li, N., Das, S.: RADON: reputation-assisted data forwarding in opportunistic networks. In: MobiOpp 2010, Pisa, Italy, February 22-23 (2010) 19. Mei, A., Stefa, J.: Give2get: Forwarding in social mobile wireless networks of selfish individuals. In: ICDCS 2010, Genova, Italy, June 21-25 (2010) 20. Li, Q., Zhu, S., Cao, G.: Routing in socially selfish delay tolerant networks. In: INFOCOM 2010, San Diego, California, USA, March 14-19 (2010) 21. Li, F., Yang, Y., Wu, J.: CPMC: An Efficient Proximity Malware Coping Scheme in Smartphone-based Mobile Networks. In: INFOCOM 2010, San Diego, California, USA, March 14-19 (2010) 22. Von Luxburg, U.: A tutorial on spectral clustering. Statistics and Computing 17(4), 395–416 (2007) 23. Cha, J., Cheon, J.: An Identity-Based Signature From Gap Diffie-Hellman Groups. In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 18–30. Springer, Heidelberg (2002) 24. Ferrara, A.L., Green, M., Hohenberger, S., Pedersen, M.Ø.: Practical short signature batch verification. In: Fischlin, M. (ed.) CT-RSA 2009. LNCS, vol. 5473, pp. 309–324. Springer, Heidelberg (2009) 25. Capasso, V.: Mathematical structures of epidemic systems. Springer, Heidelberg (1993) 26. Lynn, B.: The Pairing-Based Cryptography Library (PBC), http://crypto.stanford.edu/pbc// 27. The Opportunistic Network Environment simulator (The ONE) Version 1.4.0 (March 18, 2010), http://www.netlab.tkk.fi/tutkimus/dtn/theone/ 28. Mahmoud, M.E., Shen, X.: ESIP: Secure Incentive Protocol with Limited Use of PublicKey Cryptography for Multi-hop Wireless Networks. IEEE Trans. on Mobile Computing (to appear)
Improved Access Control Mechanism in Vehicular Ad Hoc Networks
Sushmita Ruj, Amiya Nayak, and Ivan Stojmenovic
SITE, University of Ottawa, Ottawa K1N 6N5, Canada
{sruj,anayak,ivan}@site.uottawa.ca
Abstract. Access control of messages is required when certain selected vehicles are granted access to information, instead of all vehicles within communication range. In these situations an access policy (consisting of attributes such as road situation and vehicle type) is built into the vehicle and messages are encrypted using these access policies. Only valid vehicles possessing these attributes are able to decrypt the message. Huang and Verma [16] proposed such an access control framework. Their scheme assumed that the road-side units (RSUs) are not compromised and supported only a very restricted access structure. We propose a new access control structure which eliminates the drawbacks of their scheme by providing access control in the presence of compromised RSUs. Our technique permits a more general boolean access structure. Communication is possible between two vehicles which are monitored by two different RSUs, which was not permitted in [16]. The costs are comparable to those of [16].
Keywords: Access control, Decentralized attribute-based encryption, Bilinear maps, Vehicular networks, Access policies.
1 Introduction
Vehicular ad hoc networks consist of vehicles, road-side units (RSUs) and certification authorities (CAs), and help ensure road safety. Communication may be either vehicle to vehicle (V2V) or vehicle to infrastructure (V2I). V2V communication helps in faster transfer of messages about road situations and helps to prevent accidents by detecting road congestion, hazardous road conditions, crashes, etc. V2I communication enables intelligent services like access to the Internet, finding the nearest cheapest gas station, and automatic payment at toll gates, to name only a few. Transmission of messages is vulnerable to security breaches. For example, a vehicle (node) might send false alert messages about congestion or hazardous road conditions, either with malicious motives or for selfish reasons. Authentication in VANETs has been an active area of research. Another aspect that has received a lot of attention is location privacy. To preserve the location privacy of the vehicles, nodes are assigned several pseudonyms (aliases) in addition to a unique identity. The pseudonyms are assigned in such a way that two or more pseudonyms belonging to the same vehicle cannot be linked together. Each node has a set of public keys and private keys corresponding to the set of
pseudonyms, that it has. These are either assigned by a certification authority (CA) [22] or can be generated by the vehicle itself. The nodes also have certificates, authenticating their public keys. When a node sends a message, it signs it with the private key and sends a certificate as well. The receiving nodes verify the certificate and the signature using the sender’s public key and verifies the authenticity of the message. Extensive surveys on the security issues in VANETs can be found in [19,17,21,4]. When a patrol car wants to send a secret information to other patrol cars, no other vehicle should be able to decode it. A mechanism that allows certain users to access information and denies access to unauthorized users, is known as access control. We address the problem of access control, in which a message is not just intended for all members in the vicinity, but certain members, depending upon certain attributes that they possess. The problem of access control in vehicular networks has received very little attention. Existing solutions assume that messages sent can be accessed by all vehicles in the vicinity. Group based approach has also been taken in CARAVAN [27] and AMOEBA [28]. Nodes moving together belong to one group. Any member can sign a message using a group signature scheme [7] and the others can verify it even without knowing the identity of the signer. However, the election of leaders (to assign group keys), and the fast changing groups (due to ephemeral nature of the network) make it difficult in practical situations. Access control in VANETs has several applications. For example, on a highway, there might be several lanes. Some lanes might be reserved only for heavy trucks while certain lanes might be reserved only for small vehicles. At toll gates or border security control gates, different lanes might be reserved for different types of vehicles and also for vehicles of different types of ownership (certain pass-holders and no-pass holders). A message which is to be received by certain nodes with certain attributes, should not be accessed by others. For example, a lane change information for pass-holders of private cars, need not be received by others with no-pass. Huang and Verma [16] were the first to propose an access control mechanism in VANETs. In their scheme, each vehicle possesses certain static and dynamic attributes. The static attributes can be type of the vehicle (emergency, commercial goods carrier, patrol cars, private vehicles etc), province and city to which it belongs etc. The dynamic attributes can be the road segment, highway number, lane number etc. A list of attributes is given in Table 2. Vehicles moving together are assumed to be in one group. Messages are sent encrypted with certain access policies (rules over the set of attributes). Only those nodes which posses attributes that satisfy the policies (rules) will be able to decrypt the message. For example, if a message might be encrypted with the policy “patrol car AND in HW211 AND segment 5” then only those patrol cars in segment 5 of highway 211 will be able to decrypt the message. Each node receives its set of attributes from the RSU, when it comes within communication range. The RSU also gives secret keys corresponding to these attributes. The rules (access tree) are in the form of t-out-of-n threshold schemes: if there is an overlap between the t
attributes (amongst n) of the sender and receiver, then the receiver can decrypt the message. A public key is derived using the access tree. The encryption and decryption of messages using attribute policies rely on a cryptographic protocol known as Attribute Based Encryption (ABE) [26]. There are several limitations of [16]. First, the paper handles only one type of access policy (the t-out-of-n threshold structure), which is a very restrictive assumption. Second, there is no coordination amongst RSUs. Consider the following situation. Suppose road segments X and Y are in the communication ranges of RSU R1 and RSU R2, respectively. Suppose there is an accident in X and all patrol vehicles in segments X and Y should be warned. However, if RSU R2 is down, then there is no communication with the selected set of vehicles in segment Y. Third, the paper assumes that all key management in each group is done by the RSU in whose range the group lies, and that RSUs are honest and not subject to compromise. However, if an RSU is compromised, then the access control scheme does not work in that region. Huang et al. [15] improved upon the work of [16]. They mainly established trust in online systems in VANETs, and Chen et al. [8] discussed how to decide the lifetime of an attribute assigned to a vehicle. None of the above three limitations of [16] is, however, addressed in any of the above papers. We address each of these limitations in this paper. Our scheme supports any sort of boolean access structure, such as ((a1 ∧ a2 ∧ a3) ∨ (a4 ∧ a5)) ∧ (a6 ∨ a7). In our scheme, nodes possess attributes and keys from different RSUs. A node in the communication range of RSU R1 might also send information to selected nodes in the communication range of RSUs R1 and R2 (or to a much broader region). Thus, nodes are not divided into groups and can communicate with one another across the domains of different RSUs. This helps in the widespread dissemination of messages and early decision making. We note, however, that it might not be necessary to interact with nodes which are far apart. Our scheme is resilient to the compromise of RSUs. For example, in the above case, if RSU R2 is monitoring road segment Y, then only those messages which contain attributes only from R2 can be decrypted by unauthorized users as well. In general, however, messages contain static as well as dynamic attributes, so they cannot be decrypted by unauthorized users. We propose an improved access control scheme based on a very recent work on decentralized attribute based encryption by Lewko and Waters [18]. Attribute based encryption (ABE) was proposed by Sahai and Waters [26] and has its roots in identity-based encryption (IBE), proposed by Shamir [29]. In IBE each user has a unique identity, and the public key is the unique information about the user. In ABE, a user has a set of attributes in addition to its unique ID. The message is encrypted using the attributes and can be decrypted by another user who possesses a valid access structure (access policy). The access structure can be either a t-out-of-n scheme or a complex access tree with attributes at the leaves and logic gates at the remaining nodes. This is known as key policy ABE (KP-ABE). The other class of ABE is known as ciphertext policy ABE.
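Before the cryptographic machinery, it helps to see what "satisfying" a boolean policy like the one above means for a vehicle's attribute set. The sketch below is only the plain satisfaction test; the actual scheme enforces it cryptographically through attribute based encryption rather than by an explicit check, and the attribute names are illustrative.

```python
# Satisfaction test for a boolean access policy over vehicle attributes.
# In the actual scheme the policy is enforced via CP-ABE; this sketch only
# evaluates whether a vehicle's attribute set satisfies the policy.
def satisfies(policy, attributes: set) -> bool:
    """policy is an attribute string or a tuple ('AND'|'OR', [subpolicies])."""
    if isinstance(policy, str):
        return policy in attributes
    gate, children = policy
    results = (satisfies(c, attributes) for c in children)
    return all(results) if gate == "AND" else any(results)

# ((a1 AND a2 AND a3) OR (a4 AND a5)) AND (a6 OR a7), with illustrative attributes.
policy = ("AND", [
    ("OR", [("AND", ["patrol car", "HW211", "segment 5"]),
            ("AND", ["emergency", "HW211"])]),
    ("OR", ["lane 1", "lane 2"]),
])
vehicle_attrs = {"emergency", "HW211", "lane 2", "province ON"}
print(satisfies(policy, vehicle_attrs))    # True
```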
and designed a KP-ABE. The monotonic access structure supports AND, OR and threshold gates. In ciphertext-policy ABE (CP-ABE) the roles of ciphertexts and keys are reversed: the ciphertext is encrypted with an access policy chosen by the encryptor, each user has a set of attributes and secret keys, and a user whose attributes match the access policy can decrypt the message. CP-ABE was introduced by Bethencourt et al. [3]. Most of these schemes are centralized, meaning that there is only one trusted authority (TA) who generates the public/private key pairs and assigns them to the users. Chase [5] first used the concept of multiple authorities: instead of one authority who distributes keys and attributes, there are several authorities, and users receive attributes and keys from one or several of them. However, a TA coordinates the activities of all the authorities. Chase's scheme was extended by Chase and Chow [6], who proposed a multi-authority scheme without a trusted server; the scheme is fully secure. Lewko and Waters [18] recently proposed a new decentralized CP-ABE. The authorities need not coordinate with one another, unlike [6]. A user can have attributes from any number of authorities, unlike in [6], where all authorities had to contribute attributes to all users. We have adapted Lewko and Waters' scheme to suit our requirement of access control in VANETs. The role of the authorities which distribute attributes and keys is taken by the RSUs (which distribute dynamic attributes) and the CA (which distributes static attributes). The users are nodes, which decide the access structures and encrypt messages. Sometimes RSUs can also decide the intended receivers and encrypt messages (as at the border security gate in our previous example). The recipients are nodes, which decrypt the message if they have the access rights. The fundamental difference between [18] and our scheme is that Lewko and Waters use bilinear maps on composite-order groups, whereas our scheme uses groups of prime order. They consider groups of order N = p1 p2 p3, where p1, p2 and p3 are primes. This is required to support dual system encryption [33]. This proof technique uses two types of ciphertexts, normal and semi-functional, and similarly two types of private keys, normal and semi-functional. A semi-functional private key can decrypt all normally generated ciphertexts, while semi-functional ciphertexts can be decrypted only by normal private keys. The orthogonality of Gp1, Gp2 and Gp3 is used to implement semi-functionality. Dual system encryption was introduced to prove the security of IBE in a new way with constant-size ciphertexts. We do not need this restriction in vehicular systems, and therefore we use bilinear maps on prime-order groups. Yu et al. [34] proposed an access control scheme for wireless sensor networks (WSNs) based on the scheme of Goyal et al. [14]. A recent work by Ruj et al. [25] proposes a distributed access control scheme for WSNs based on Chase and Chow's scheme [6]. The first scheme, [34], cannot be applied to VANETs because the coordination of the CA is needed to decrypt all received messages. The second scheme, [25], is not applicable to VANETs because all the RSUs need to contribute towards the access structure, which is not required for VANETs; only closely placed RSUs might have shares in the access policy. We
show that our scheme has costs comparable to [16], but offers more flexibility and can withstand compromised RSUs. The paper is organized as follows. In Section 2, we present work related to access control and security in VANETs. In Section 3, we describe the mathematical tools that we use in our scheme and present the network and attack model. Our scheme is discussed in detail in Section 4, and we analyze its security and performance in Section 5. Implementation issues are discussed in Section 6, and we conclude the paper with open problems in Section 7.
2 Related Work
In this section, we discuss work related to VANET security. Most of the research on security issues in VANETs has concentrated on location privacy, authentication, trust management, and revocation. Location privacy is achieved using pseudonyms, as we discussed in Section 1. Most papers on location privacy deal with how to assign pseudonyms [22], when to change pseudonyms [22,28,11], and how to assign signatures using pseudonyms [31]. Authentication is achieved by means of signatures and has been studied in [32,31]. Revocation of malicious nodes is another issue that has received a lot of attention; the question in this context is whether to maintain a list of all revoked certificates and keys, of revoked vehicles, or of some seed of the revoked vehicles [31]. Trust management in VANETs involves two aspects, entity-centric trust and data-centric trust [23]. Entity-centric trust involves trusting the node which sends a piece of information, whereas data-centric trust involves trusting the information itself. In an ephemeral network like a VANET, nodes are in contact for too short a time to build trust, and nodes can also misbehave for selfish reasons. Thus, it becomes more important to ensure the trustworthiness of the data than that of the nodes (entities). Trust management in [23,24] is done using game-theoretic techniques. There are two types of games: one played between good nodes and adversarial nodes, and the other between the nodes of each type. Each node has an assigned benefit if it behaves well and a cost if it behaves badly. Each node, on noticing misbehavior, can either vote against it, abstain from voting, or convict it and then commit suicide. Each of these actions has associated costs and benefits. Though the authors assume that the costs and benefits are fixed, this is not a correct assumption in VANETs; for example, different types of vehicles will have different costs and benefits, and a police car behaving badly will have more severe consequences than a private car. Misbehavior detection in VANETs involves finding the nodes that misbehave and the data that is false. Golle et al. [13] designed a scheme to detect misbehavior based on observations and matching them to an already created model. A model is a data structure containing events and observations. If the observations are invalid according to the model, then a misbehavior is said to have been detected; it is, however, not clear how this model is stored. There have been several papers [35,20] on detecting Sybil attacks [9] in VANETs. Misbehavior detection in post-crash situations has been studied in [12].
Revocation of certificates and secret credentials has the following disadvantage: the certificate revocation list (CRL), containing all the certificates of revoked vehicles, has to be sent to all the nodes in the network. This approach requires a huge bandwidth if the number of revoked nodes is high.
Table 1. Notations

  n_i            Node i
  p_it           Pseudonym of node n_i at time t
  R_j            RSU j
  R              Set of RSUs
  T              Set of static attributes
  t = |T|        Number of static attributes
  L_j            Set of attributes that RSU R_j possesses
  l_j = |L_j|    Number of attributes that RSU R_j possesses
  I[0]_i         Set of static attributes that node n_i possesses
  I[j]_i         Set of attributes that RSU R_j gives to node n_i
  I_i            Set of attributes that node n_i possesses
  PK[0]_x        Public key of CA corresponding to attribute x
  SK[0]_x        Secret key of CA corresponding to attribute x
  PK[j]_x        Public key of RSU R_j corresponding to attribute x
  SK[j]_x        Secret key of RSU R_j corresponding to attribute x
  SK_{x,p_it}    Secret key corresponding to attribute x given to node n_i with pseudonym p_it
  S              Boolean access structure
  A              Access matrix
  |G|            Order of group G
  M              Message
  C              Ciphertext
  H              Hash function (e.g., SHA-1)

3 Preliminaries
In this section we discuss the mathematical tools that we use throughout the paper; Table 1 presents the notation. We also discuss the network and adversary model.
3.1 Mathematical Background
Our design is based on bilinear maps. Let G be a cyclic group of prime order q generated by g, and let G_T be a group of the same order q. We use a map e : G × G → G_T with the following properties:
1. Bilinearity: e(aP, bQ) = e(P, Q)^{ab} for all P, Q ∈ G and a, b ∈ Z_q, where Z_q = {0, 1, 2, ..., q − 1}.
2. Non-degeneracy: e(g, g) ≠ 1.
3. Computability: ψ : G_2 → G_1 is a computable isomorphism from G_2 to G_1 with ψ(g_2) = g_1; the isomorphism ψ and the bilinear map e are both efficiently computable. (In our symmetric setting, G_1 = G_2 = G and ψ is the identity.)
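For illustration only, the algebraic properties above can be exercised with a toy, completely insecure "pairing" in which G = G_T = (Z_q, +) with generator g = 1 and e(a, b) = ab mod q. The following Python sketch checks bilinearity and non-degeneracy numerically; it is not a cryptographic pairing (discrete logarithms are trivial here) and only illustrates the algebra.

import random

q = 2**31 - 1            # a prime; a real system uses a ~160-bit or larger prime
g = 1                    # generator of the additive toy group

def exp(base, k):        # "base^k" in multiplicative notation == k*base in the toy group
    return (k * base) % q

def e(a, b):             # toy bilinear map G x G -> GT
    return (a * b) % q

a, b = random.randrange(1, q), random.randrange(1, q)
# bilinearity: e(g^a, g^b) = e(g, g)^(ab)   (here: ab * e(g, g) mod q)
assert e(exp(g, a), exp(g, b)) == (a * b * e(g, g)) % q
# non-degeneracy: e(g, g) is not the identity of GT (0 in additive notation)
assert e(g, g) != 0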
Table 2. Static and dynamic attributes

Static attributes:
  Type: emergency vehicle, patrol car, truck, taxi, private vehicle, ...
  Location: city, province
  Group: emergency vehicle (fire/ambulance), taxi (company)
Dynamic attributes:
  Road: highway number, street number
  Road segment: segment of highway/street
  Lane number
  Nearest intersection
3.2 Network Model
Each node has a set of static attributes which are preloaded into its onboard unit (OBU). Each RSU possesses a list of dynamic attributes, which mainly depict the road conditions. A list of possible static and dynamic attributes is given in Table 2. The CA provides each node with pseudonyms, public/private key pairs corresponding to each pseudonym, and certificates [31]. When a vehicle enters the communication range of an RSU, the RSU secretly gives it certain dynamic attributes along with a set of secret keys. These attributes and secret keys are encrypted using the public key of the node; the node decrypts this information using its secret key. We assume that each node changes pseudonyms from time to time. Messages can also be sent by RSUs for a set of selected nodes to access.
3.3 Formats of Access Policies
Access policies can be expressed either as boolean functions of attributes or as Linear Secret Sharing Scheme (LSSS) matrices. An example of a boolean function of attributes, ((a1 ∧ a2 ∧ a3) ∨ (a4 ∧ a5)) ∧ (a6 ∨ a7), is given in Section 1. Boolean functions can also be represented by an access tree, with the attributes at the leaves and AND (∧) and OR (∨) gates as the intermediate nodes and the root. It is possible to convert an access tree into an LSSS matrix; an algorithm is given in the Appendix.
3.4 Adversary Model
We consider an insider adversary, in which nodes can collude and try to decrypt messages that they are unable to decrypt alone. We assume that there are means to detect misbehavior (e.g., [13]) and a revocation scheme [32] in case a sent message is not authentic.
4 Our Access Control Scheme
Initially the CA generates public key P K[0]x and secret key SK[0]x for the set of static attributes x ∈ T . All nodes ni ∈ N are given a set of static attributes I[0]i ∈ T and a set of secret keys SKx,pit , x ∈ I[0]i and for all pseudonyms pit that the node has. We note here that since the number of static attributes is small and constant, the number of secret keys SKx,pit is not large.
Each RSU R_j possesses certain dynamic attributes L_j and generates a public key PK[j]_m and a secret key SK[j]_m for each dynamic attribute m ∈ L_j. Each node n_i securely receives a set of attributes I[j]_i ⊆ L_j from the RSU R_j (when it comes within communication range) along with the public keys. When a node wants to send a message M, it decides the target receivers and builds the access structure S; we provide an example in the next section to demonstrate this. The access structure is a boolean function of the attributes that the node possesses. The node then constructs the corresponding access matrix A, encrypts the message with it, and sends the ciphertext C. Details of the construction are given in the Appendix. When a node n_i receives an encrypted message C along with the access matrix A, it first checks whether it has attributes which match the access matrix, and if so it decrypts the message. For example, if the access structure is ((a1 ∧ a2 ∧ a3) ∨ (a4 ∧ a5)) ∧ (a6 ∨ a7), and a node has neither attribute a1 nor a4, then the node cannot decrypt any message with this access structure. A node whose attributes do not match the access structure will not be able to decrypt the message. The whole scheme proceeds in four steps, given below.
1. System initialization: initializes the parameters and decides the public and private keys of the CA and the RSUs.
2. Key generation: generates public and secret keys for the nodes.
3. Encryption: the message is encrypted using the access matrix constructed from the boolean access structure.
4. Decryption: the received message is decrypted, if the node has a valid set of attributes.

System Initialization
1. The following system parameters are chosen: a prime q, bilinear groups G and G_T of order q, and a map e : G × G → G_T. g is a generator of G. A hash function H : {0,1}* → G maps node identities to G; the hash function used is SHA-1 [30].
2. The CA generates a set of static attributes T for all vehicles.
3. For each attribute i ∈ T, the CA chooses two random exponents α_i, y_i ∈ Z_q.
4. The public key of the CA is published as
   PK[0] = {e(g,g)^{α_i}, g^{y_i} : i ∈ T}.   (1)
5. The secret key of the CA is
   SK[0] = {α_i, y_i : i ∈ T}.   (2)
6. Each RSU R_j ∈ R also has a set of dynamic attributes L_j.
7. For each attribute i ∈ L_j, RSU R_j chooses two random exponents α_i, y_i ∈ Z_q.
8. The public key of RSU R_j is published as
   PK[j] = {e(g,g)^{α_i}, g^{y_i} : i ∈ L_j}.   (3)
9. The secret key of RSU R_j is
   SK[j] = {α_i, y_i : i ∈ L_j}.   (4)
10. The attributes chosen by the CA and the RSUs are pairwise disjoint: L_i ∩ L_j = ∅ for i ≠ j, and T ∩ L_i = ∅ for every i.
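The bookkeeping of this initialization step can be sketched as follows. The dictionaries only record the per-attribute exponents (α_i, y_i), standing in for the group elements e(g,g)^{α_i} and g^{y_i} that a real implementation would compute with a pairing library such as PBC [1]; the attribute names are illustrative only.

import random

q = 2**31 - 1   # same toy prime-order setting as the bilinear-map sketch above

def setup_authority(attributes):
    """Per-attribute key material for one authority (the CA or an RSU R_j).
    Only exponents are stored here; real PK entries would be group elements."""
    pk, sk = {}, {}
    for attr in attributes:
        alpha_i = random.randrange(1, q)
        y_i = random.randrange(1, q)
        sk[attr] = (alpha_i, y_i)            # SK = {alpha_i, y_i : i in T or L_j}
        pk[attr] = {"e_gg_alpha": alpha_i,   # stands in for e(g,g)^alpha_i
                    "g_y": y_i}              # stands in for g^y_i
    return pk, sk

# The CA handles static attributes; each RSU handles its own, disjoint dynamic ones.
PK0, SK0 = setup_authority(["emergency", "patrol", "truck", "private"])
PK1, SK1 = setup_authority(["HW211", "segment5", "lane2"])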
Key generation and distribution. This algorithm generates secret and public keys of the nodes for static and dynamic attributes. The inputs are the set of static attributes T together with the key sets SK[0] and PK[0], and, for each RSU R_j, the set of dynamic attributes L_j together with the key sets SK[j] and PK[j]. The CA gives each node n_i a set of static attributes I[0]_i. For each pseudonym p_it, the CA also gives a secret key for each attribute x ∈ I[0]_i:
   sk_{x,p_it} = g^{α_x} H(p_it)^{y_x},   (5)
where α_x, y_x ∈ SK[0]. When a node n_i (with pseudonym p_it) enters the communication range of an RSU R_j ∈ R, the RSU gives it a set of attributes I[j]_i together with a secret key sk_{x,p_it} for each x ∈ I[j]_i:
   sk_{x,p_it} = g^{α_x} H(p_it)^{y_x},   (6)
where α_x, y_x ∈ SK[j]. Note that all keys are sent to a node securely, encrypted under the node's public key, so that only that node can recover them with its secret key.
Data Encryption. Encryption proceeds in two steps. First, the boolean access tree is converted into an LSSS matrix. Second, the message is encrypted and transmitted along with the LSSS matrix. Suppose a node n_i wants to send a message M to a set of nodes. n_i defines the access structure S (with m attributes) to determine the authorized set of nodes that will be able to decrypt M. It then creates an m × l matrix A and a mapping π of its rows to the attributes (using the algorithm in the Appendix), where l is the depth of the access tree corresponding to S and π : {1, 2, ..., m} → W maps rows to the attributes appearing in S. Encryption takes as input the message M, the LSSS matrix A, the mapping π, and the group G, and outputs the ciphertext C. For each message M, n_i does the following:
1. Chooses a random seed s ∈ Z_q and a random vector v ∈ Z_q^l with s as its first entry.
2. Calculates λ_x = A_x · v, where A_x is the x-th row of A.
3. Chooses a random vector w ∈ Z_q^l with 0 as its first entry.
4. Calculates ω_x = A_x · w.
5. For each row x of A, chooses a random ρ_x ∈ Z_q.
6. Calculates
   C_0 = M e(g,g)^s,
   C_{1,x} = e(g,g)^{λ_x} e(g,g)^{α_{π(x)} ρ_x}  ∀x,
   C_{2,x} = g^{ρ_x}  ∀x,
   C_{3,x} = g^{y_{π(x)} ρ_x} g^{ω_x}  ∀x.   (7)
7. The node n_i sends the ciphertext
   C = ⟨A, π, C_0, {C_{1,x}, C_{2,x}, C_{3,x}, ∀x}⟩.   (8)
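The share vectors λ_x and ω_x of Step 6 are plain LSSS arithmetic over Z_q and can be sketched independently of the pairing operations. The matrix used below is the 6×3 matrix derived later in Section 4.1, and the final check mirrors the reconstruction used there (rows 2, 5 and 6 with coefficients 1); everything else is an illustrative sketch, not the authors' implementation.

import random

q = 2**31 - 1

def lsss_shares(A, s):
    """lambda_x = A_x . v with secret s as first entry of v, and
    omega_x = A_x . w with 0 as first entry of w (all arithmetic mod q)."""
    m, l = len(A), len(A[0])
    v = [s % q] + [random.randrange(q) for _ in range(l - 1)]
    w = [0]     + [random.randrange(q) for _ in range(l - 1)]
    lam   = [sum(A[x][j] * v[j] for j in range(l)) % q for x in range(m)]
    omega = [sum(A[x][j] * w[j] for j in range(l)) % q for x in range(m)]
    return lam, omega

A = [[1, 1, 0], [1, 1, 0], [0, -1, 1], [0, 0, -1], [0, -1, 1], [0, 0, -1]]
s = random.randrange(q)
lam, omega = lsss_shares(A, s)
# rows 2, 5, 6 (1-indexed) recombine with coefficients 1, 1, 1:
assert (lam[1] + lam[4] + lam[5]) % q == s and (omega[1] + omega[4] + omega[5]) % q == 0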
Decryption. This algorithm takes as input the ciphertext C, the secret keys of node n_i, and the group G, and outputs the message M. When a node n_i receives a ciphertext C, it obtains the access matrix A and the mapping π. It then executes the following steps:
1. It calculates the set of attributes {π(x) : x ∈ X} ∩ I_i that are common to itself and the access matrix, where X is the set of rows of A.
2. For these attributes, it checks whether there is a subset of rows X′ of A such that the vector (1, 0, ..., 0) is a linear combination of the rows A_{X′}, i.e., whether there exist constants c_x ∈ Z_q such that Σ_{x∈X′} c_x A_x = (1, 0, ..., 0).
3. If yes, decryption proceeds as follows:
   (a) For each x ∈ X′,
       C_{1,x} · e(H(p_it), C_{3,x}) / e(sk_{π(x),p_it}, C_{2,x}) = e(g,g)^{λ_x} e(H(p_it), g)^{ω_x}.   (9)
   (b) n_i chooses constants c_x ∈ Z_q such that Σ_{x∈X′} c_x A_x = (1, 0, ..., 0) and computes
       Π_{x∈X′} ( e(g,g)^{λ_x} e(H(p_it), g)^{ω_x} )^{c_x} = e(g,g)^s.   (10)
       Equation (10) holds because λ_x = A_x · v and ω_x = A_x · w, where v · (1, 0, ..., 0) = s and w · (1, 0, ..., 0) = 0.
   (c) It calculates M = C_0 / e(g,g)^s.
4.1 Example
Consider the following set of attributes. Type: emergency vehicle (T1), patrol vehicle (T2), private vehicle (T3), truck (T4); Road: D1, D2, ...; Segment: S1, S2, .... Suppose an emergency vehicle (travelling on road D1, segment S1) has to report an incident to the other emergency and patrol vehicles in road D1, segment S1 and in road D2, segment S2. Thus, its boolean access structure is (T1 ∨ T2) ∧ ((D1 ∧ S1) ∨ (D2 ∧ S2)). Using the algorithm in the Appendix, we can construct the corresponding access tree and the matrix

   A = ( 1   1   0
         1   1   0
         0  -1   1
         0   0  -1
         0  -1   1
         0   0  -1 ),

whose rows correspond, in order, to T1, T2, D1, S1, D2, S2. A patrol car on road D2 and segment S2 finds that for the rows 2, 5 and 6 there exist constants c2 = 1, c5 = 1, c6 = 1 such that c2 A2 + c5 A5 + c6 A6 = (1, 0, 0). The car can therefore compute Equation (10) and, subsequently, recover the message.
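One simple way to find the constants c_x of Step 2 is Gaussian elimination modulo the prime q over the transposed sub-matrix of the rows the node holds; the sketch below assumes a square, invertible sub-system (as in the example above) and is an illustration rather than the implementation an OBU would necessarily use. It uses the matrix A exactly as printed above and verifies c2 = c5 = c6 = 1.

def solve_mod(M, b, q):
    """Solve M x = b over Z_q (q prime) by Gauss-Jordan elimination; None if singular."""
    n = len(M)
    M = [[v % q for v in row] + [b[i] % q] for i, row in enumerate(M)]   # augmented
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col]), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], q - 2, q)                 # modular inverse of the pivot
        M[col] = [(v * inv) % q for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(M[r][j] - f * M[col][j]) % q for j in range(n + 1)]
    return [M[r][n] for r in range(n)]

q = 2**31 - 1
A = [[1, 1, 0], [1, 1, 0], [0, -1, 1], [0, 0, -1], [0, -1, 1], [0, 0, -1]]
rows = [1, 4, 5]                       # rows 2, 5, 6 of Section 4.1 (0-indexed here)
At = [[A[r][j] for r in rows] for j in range(3)]   # transpose of the selected rows
c = solve_mod(At, [1, 0, 0], q)        # coefficients with sum_x c_x A_x = (1, 0, 0)
assert c == [1, 1, 1]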
4.2 Messages Sent by RSUs
Messages can be sent by RSUs in a similar way. The scheme remains almost the same. In this case, data encryption is carried out by the RSU, instead of a node.
5 Analysis
In this section we discuss the security and performance of our access control scheme.
5.1 Security
We first prove the validity of our scheme: only a node with a valid set of attributes is able to decrypt a message, and an unauthorized node (one without matching attributes) is not. The construction of the matrix A from the access structure S is such that a set of attributes satisfies S if and only if there exists a set of rows X′ in A and constants c_x ∈ Z_q such that Σ_{x∈X′} c_x A_x = (1, 0, ..., 0). A proof of this can be found in [2, Chapter 4]. An invalid user therefore does not have attributes for which Σ_{x∈X′} c_x A_x = (1, 0, ..., 0), and hence cannot calculate e(g,g)^s.
We next show that our scheme is collusion secure, meaning that no two or more nodes can collude and decrypt a message that none of them can decrypt alone. Suppose colluding users together have attributes x ∈ X′ such that Σ_{x∈X′} c_x A_x = (1, 0, ..., 0). However, e(H(p_it), g)^{ω_x} needs to be calculated according to Eq. (10), and since different nodes have different values of e(H(p_it), g), even if they combine their attributes they cannot decrypt the message.
If an RSU R_j is compromised, then α_x and y_x are known for each attribute x ∈ I[j]_i; thus the secret keys sk_{x,p_it} are known and Eq. (9) can be calculated for all of these attributes. If all attributes of a valid access structure belong to RSU R_j, then an attacker can decrypt the corresponding message. However, if the access structure also contains attributes from other RSUs or static attributes, then no unauthorized user can decode the message. If a node is compromised, existing misbehavior detection schemes [13] and revocation schemes [32] are used so that its messages are ignored by the rest of the nodes.
5.2 Performance
In this section we calculate the computation and communication overheads involved and compare them with the scheme of [16]. The computation of the matrix A from the boolean function takes O(m). Checking whether there exists a suitable set of rows in A (such that Step 2 of decryption holds) amounts to solving the equation cA = (1, 0, ..., 0) for a non-zero row vector c, which takes O(ml). The main expense is therefore due to pairing operations. We calculate the number of pairing operations and scalar multiplications performed by the nodes for encryption and decryption. During encryption, each node performs only one pairing operation (to calculate e(g, g)). For each attribute it also performs two scalar multiplications to calculate C_{1,x}, one to calculate C_{2,x} and one to calculate C_{3,x}; thus there are a total of 4m scalar multiplications, where m is the number of attributes in the access structure. During decryption, there are two pairing operations per attribute x, one for e(H(p_it), C_{3,x}) and the
other for e(sk_{π(x),p_it}, C_{2,x}). Thus there are at most 2m pairings, where m is the number of attributes in the access tree, and at most m scalar multiplications to calculate (e(g,g)^{λ_x} e(H(p_it), g)^{ω_x})^{c_x}. Therefore, the computation load is (2m + 1)T_p + 5mT_m, where T_p and T_m are the times taken to perform a pairing and a scalar multiplication, respectively.

Table 3. Computation costs of [16] and our scheme
  Scheme    Computation cost
  [16]      (m + 1)T_p + 2mT_m
  Ours      (2m + 1)T_p + 5mT_m

Table 4. Communication overheads of [16] and our scheme
  Scheme    Communication cost
  [16]      2m log|G| + m^2 + |data|
  Ours      m log|G_T| + 2m log|G| + m^2 + |data|

The communication overhead results from sending the access matrix and the other fields of the ciphertext. The maximum size of the matrix is m^2. The other communication costs include transmitting C_0 and, for each attribute, C_{1,x}, C_{2,x}, C_{3,x}, which amounts to |data| + log|G_T| + 2 log|G| per attribute. Thus, the total communication cost is m log|G_T| + 2m log|G| + m^2 + |data|. We compare the computation and communication costs of our scheme and [16] in Tables 3 and 4, respectively, and see that the costs are comparable. The memory needed is dominated by storing the access tree, which takes O(ml), where m is the number of attributes and l is the depth of the tree. Each node also needs to store its static and dynamic attributes and keys.
6 Implementation Issues
The bilinear pairings defined on elliptic curve groups are mostly Weil and Tate pairings, computed using Miller's algorithm; we do not consider the algorithm in detail. The choice of curve is an important consideration, because it determines the complexity of the pairing operations. A survey on pairing-friendly curves can be found in [10]; the curves chosen are either MNT curves or supersingular curves. The PBC (Pairing-Based Cryptography) library [1] is a C library built on top of GNU GMP that contains functions to implement elliptic curves and pairing operations. Considering our requirements, we consider an elliptic curve group of size 159 bits with an embedding degree of 6 (type d curves of PBC [1]). A pairing takes 14 ms on an Intel Pentium D 3.0 GHz CPU [34]. Thus, the computation required is feasible for VANETs. A detailed implementation of this scheme is left as future work.
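For a rough sense of scale, the cost formula of Section 5.2 can be evaluated with the 14 ms pairing time quoted above. The scalar-multiplication time T_m below is an assumed placeholder, not a measured value.

def computation_load(m, T_p=14e-3, T_m=1e-3):
    # (2m+1) pairings + 5m scalar multiplications per message (Section 5.2);
    # T_p is taken from [34], T_m is assumed here for illustration only
    return (2 * m + 1) * T_p + 5 * m * T_m

for m in (3, 6, 10):
    print(m, "attributes:", round(computation_load(m) * 1000, 1), "ms")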
7 Conclusions
We propose an access control scheme for vehicular networks. The scheme eliminates the limitations of [16] and has comparable costs. We also address the setting where not only nodes but also RSUs can send messages to a target group. In future work we would like to reduce the communication overhead by reducing the amount of information needed to transmit access structures.
Acknowledgement The authors would like to thank the anonymous reviewers for their comments and suggestions. This work is partially supported by NSERC Grant CRDPJ386874-09.
References 1. http://crypto.stanford.edu/pbc/ 2. Beimel, A.: Secure Schemes for Secret Sharing and Key Distribution. Ph D Thesis. Technion, Haifa (1996) 3. Bethencourt, J., Sahai, A., Waters, B.: Ciphertext-policy attribute-based encryption. In: IEEE Symposium on Security and Privacy, pp. 321–334. IEEE Computer Society, Los Alamitos (2007) 4. Biswas, S., Mahbubul Haque, M., Misic, J.V.: Privacy and anonymity in vanets: A contemporary study. Ad Hoc & Sensor Wireless Networks 10(2-3), 177–192 (2010) 5. Chase, M.: Multi-authority attribute based encryption. In: Vadhan, S.P. (ed.) TCC 2007. LNCS, vol. 4392, pp. 515–534. Springer, Heidelberg (2007) 6. Chase, M., Chow, S.S.M.: Improving privacy and security in multi-authority attribute-based encryption. In: ACM Conference on Computer and Communications Security, pp. 121–130 (2009) 7. Chaum, D., van Heyst, E.: Group signatures. In: Davies, D.W. (ed.) EUROCRYPT 1991. LNCS, vol. 547, pp. 257–265. Springer, Heidelberg (1991) 8. Chen, N., Gerla, M., Hong, D.H.X.: Secure, selective group broadcast in vehicular networks using dynamic attribute based encryption. In: Ad Hoc Networking Workshop, Med-Hoc-Net, pp. 1–8 (2010) 9. Douceur, J.R.: The Sybil Attack. In: Druschel, P., Kaashoek, M.F., Rowstron, A. (eds.) IPTPS 2002. LNCS, vol. 2429, pp. 251–260. Springer, Heidelberg (2002) 10. Freeman, D., Scott, M., Teske, E.: A taxonomy of pairing-friendly elliptic curves. J. Cryptology 23(2), 224–280 (2010) 11. Freudiger, J., Manshaei, M.H., Boudec, J.-Y.L., Hubaux, J.-P.: On the age of pseudonyms in mobile ad hoc networks. In: INFOCOM, pp. 1577–1585. IEEE, Los Alamitos (2010) 12. Ghosh, M., Varghese, A., Gupta, A., Kherani, A.A., Muthaiah, S.N.: Detecting misbehaviors in vanet with integrated root-cause analysis. Ad Hoc Networks 8(7), 778–790 (2010) 13. Golle, P., Greene, D.H., Staddon, J.: Detecting and correcting malicious data in vanets. In: Vehicular Ad Hoc Networks, pp. 29–37 (2004) 14. Goyal, V., Pandey, O., Sahai, A., Waters, B.: Attribute-based encryption for finegrained access control of encrypted data. In: ACM Conference on Computer and Communications Security, pp. 89–98 (2006)
15. Huang, D., Hong, X., Gerla, M.: Situation-aware trust architecture for vehicular networks. Topics In Automotive Networking 48(11), 128–135 (2010) 16. Huang, D., Verma, M.: ASPE: attribute-based secure policy enforcement in vehicular ad hoc networks. Ad Hoc Networks 7(8), 1526–1535 (2009) 17. Kargl, F., Papadimitratos, P., Buttyan, L., Mter, M., Schoch, E., Wiedersheim, B., Thong, T.v., Cal, G., Held, A., Kung, A., Hubaux, J.p.: Secure vehicular communication systems: Implementation, performance, and research challenges. IEEE Wireless Communication Magazine, 110–118 (2008) 18. Lewko, A., Waters, B.: Decentralizing Attribute-Based Encryption. In: Paterson, K.G. (ed.) EUROCRYPT 2011. LNCS, vol. 6632, pp. 568–588. Springer, Heidelberg (2011), eprint.iacr.org/2010/351.pdf (last accessed February 22, 2011) 19. Papadimitratos, P., Buttyan, L., Holczer, T., Schoch, E., Freudiger, J., Raya, M., Ma, Z., Kargl, F., Kung, A., Hubaux, J.p.: Secure vehicular communication systems: design and architecture. IEEE Wireless Communication Magazine, 100–109 (2008) 20. Park, S., Aslam, B., Turgut, D., Zou, C.C.: Defense against sybil attack in vehicular ad hoc network based on roadside unit support. In: MILCOM, pp. 1–7 (2009) 21. Parno, B., Perrig, A.: Challenges in security vehicular networks. In: HotNets-IV (2005) 22. Raya, M.: Data-Centric Trust in Ephemeral Networks. Ph D Thesis. EPFL, Lausanne (2009) 23. Raya, M., Papadimitratos, P., Gligor, V.D., Hubaux, J.-P.: On data-centric trust establishment in ephemeral ad hoc networks. In: INFOCOM, pp. 1238–1246. IEEE, Los Alamitos (2008) 24. Raya, M., Shokri, R., Hubaux, J.-P.: On the tradeoff between trust and privacy in wireless ad hoc networks. In: Wetzel, S., Nita-Rotaru, C., Stajano, F. (eds.) WISEC, pp. 75–80. ACM, New York (2010) 25. Ruj, S., Nayak, A., Stojmenovic, I.: Distributed fine-grained access control in wireless sensor networks. IEEE International Parallel & Distributed Processing Symposium (to appear, 2011) 26. Sahai, A., Waters, B.: Fuzzy Identity-Based Encryption. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 457–473. Springer, Heidelberg (2005) 27. Sampigethaya, K., Huang, L., Li, M., Poovendran, R., Matsuura, K., Sezaki, K.: Caravan: Providing location privacy for vanet. In: Proc. of the Workshop on Embedded Security in Cars, ESCAR (2005) 28. Sampigethaya, K., Li, M., Huang, L., Poovendran, R.: Amoeba: Robust location privacy scheme for vanet. IEEE Journal on Selected Areas in Communications 25(8), 1569–1589 (2007) 29. Shamir, A.: Identity-based cryptosystems and signature schemes. In: Blakely, G.R., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 47–53. Springer, Heidelberg (1985) 30. Stinson, D.R.: Cryptography: Theory and Practice, 3rd edn. CRC Press Inc., Boca Raton (2006) 31. Sun, Y., Lu, R., Lin, X., Shen, X., Su, J.: An efficient pseudonymous authentication scheme with strong privacy preservation for vehicular communications. IEEE Trans. on Vehicular Technology 59(7), 3589–3603 (2010) 32. Wasef, A., Jiang, Y., Shen, X.: Ecmv: Efficient certificate management scheme for vehicular networks. In: GLOBECOM, pp. 639–643. IEEE, Los Alamitos (2008) 33. Waters, B.: Dual system encryption: Realizing fully secure IBE and HIBE under simple assumptions, http://www.eprint.iacr.org/2009/385.pdf
34. Yu, S., Ren, K., Lou, W.: FDAC: Toward fine-grained distributed data access control in wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems 22(4), 673–686 (2011) 35. Zhou, T., Choudhury, R.R., Ning, P., Chakrabarty, K.: Privacy-preserving detection of sybil attacks in vehicular ad hoc networks. In: MobiQuitous, pp. 1–8. IEEE, Los Alamitos (2007)
Appendix
Converting a boolean function to an LSSS matrix: This algorithm converts a boolean function (in the form of a binary tree) into an LSSS matrix. The algorithm is given in [18]; we present it here for completeness. v_x denotes the vector corresponding to node x. The following subroutine assigns a vector to each node of the tree.

Subroutine: vector(x)
Input: A boolean formula in the form of a binary tree; v_1 = (1), where 1 is the root.
Output: A vector for each node of the tree.

  if x is a leaf then
      return
  else if x = AND then
      v_left(x) = (v_x | 1)
      vector(left(x))
      v_right(x) = ((0, ..., 0) | −1), where (0, ..., 0) has the size of v_x
      vector(right(x))
  else
      v_left(x) = v_x
      vector(left(x))
      v_right(x) = v_x
      vector(right(x))
  end if

The shorter vectors are padded with zeros, such that all the vectors are of equal length. For each attribute at a leaf, its vector is a row of the LSSS matrix.
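The subroutine above can be made executable directly. The Python sketch below follows the pseudocode (depth-first, with the right child of an AND gate receiving a zero vector of the parent's length) and, for the policy of Section 4.1, reproduces the 6×3 matrix A shown there; the tuple encoding of the tree is an assumption made for this sketch.

def lsss_rows(node, v, out):
    """A leaf is an attribute string; an internal node is ('AND', l, r) or ('OR', l, r)."""
    if isinstance(node, str):                          # leaf: its vector becomes a row
        out.append((node, v))
    elif node[0] == 'AND':
        lsss_rows(node[1], v + [1], out)               # v_left  = (v | 1)
        lsss_rows(node[2], [0] * len(v) + [-1], out)   # v_right = (0,...,0 | -1)
    else:                                              # 'OR': both children inherit v
        lsss_rows(node[1], v, out)
        lsss_rows(node[2], v, out)

def boolean_to_lsss(tree):
    out = []
    lsss_rows(tree, [1], out)                          # v_root = (1)
    width = max(len(v) for _, v in out)                # pad shorter vectors with zeros
    return [(a, v + [0] * (width - len(v))) for a, v in out]

# Policy of Section 4.1: (T1 v T2) ^ ((D1 ^ S1) v (D2 ^ S2))
policy = ('AND', ('OR', 'T1', 'T2'),
                 ('OR', ('AND', 'D1', 'S1'), ('AND', 'D2', 'S2')))
for attr, row in boolean_to_lsss(policy):
    print(attr, row)       # rows T1..S2 match the matrix A of Section 4.1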
A New Coverage Improvement Algorithm Based on Motility Capability of Directional Sensor Nodes M. Amac Guvensan and A. Gokhan Yavuz Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey {amac,gokhan}@ce.yildiz.edu.tr
Abstract. In directional sensor networks (DSNs), motility capability of a directional sensor node has a considerable impact on the coverage enhancement after the initial deployment. Since random deployment may result in overlapped field of views (FoVs) and occluded regions, directional sensor nodes with rotatable mechanisms may reorganize their working directions to improve the coverage. Our proposed algorithm, Attraction Forces of Uncovered Points (AFUP), aims at both minimizing the overlapped areas and facing the working directions towards the area of interest. AFUP is a distributed iterative algorithm and exploits the repel forces exerted by the uncovered points around the sensor nodes. The proposed algorithm improves the coverage by 18%-25% after the initial deployment. Moreover, AFUP outperforms three well-known area coverage enhancement methods [15] [19] [16] in terms of coverage improvement and overlap minimization. Our simulation results show that AFUP converges in five iterations in most of the scenarios. Keywords: Directional Sensor Networks, Coverage, Motility, Field of View, Repulsive Force.
1 Introduction
The coverage issue is a fundamental problem in wireless sensor networks. There is an extensive number of studies on the coverage problem in omni-directional sensor networks [18] [9]. However, the coverage problem in directional sensor networks has attracted attention, especially with the increasing number of wireless multimedia sensor network (WMSN) applications [2]. Directional sensor nodes equipped with ultrasound, infrared, and video sensors differ from traditional omni-directional sensor nodes through their unique characteristics, such as angle of view (AoV), working direction, and line of sight (LoS). Therefore, DSN applications require specific solutions and techniques for coverage enhancement. Directional sensor nodes should be deployed appropriately to reach an adequate coverage level for the successful completion of the assigned tasks. However, working environments such as remote harsh fields, disaster areas, and battlefields prevent the manual placement of sensors. Deploying sensors remotely via aircraft or catapult may result in a situation where the actual working directions cannot be controlled. Thus, the expected coverage may not be reached even with a substantial number of sensors. In such cases, it is necessary to make use of the motility capability of directional sensors. Motility is a specialized form of mobility where a directional sensor node can only change its working direction along the x, y (and/or) z axes [11]. Thus, the node carries out a reasonable physical movement instead of relocating itself. Since directional sensor nodes may work in several directions, they may adjust their working direction based on the requirements of the application. In this paper, our objective is to enhance the area coverage of directional sensor networks by turning sensors to the correct orientation and decreasing the coverage overlap caused by neighboring sensors. Moreover, we aim at facing the sensor nodes towards the targeted area instead of the outside of the observed area. The proposed algorithm searches for possible uncovered areas to minimize the overlapped regions. The key contributions of our work are as follows:
– A new distributed solution to the self-orientation problem in DSNs has been formulated.
– The proposed algorithm substantially enhances the coverage after the initial deployment.
– The AFUP algorithm provides more coverage improvement than several well-known existing studies.
The rest of this paper is organized as follows. In Section 2, we review existing solutions for area coverage enhancement in directional sensor networks. Section 3 describes the directional sensing model and presents the related notations and assumptions. In Section 4, we state the problem formally and present our solution for area coverage enhancement in DSNs; this section also gives the details of the proposed algorithm. Section 5 describes the simulation environment and presents the performance results of the AFUP algorithm. Finally, we conclude the paper in Section 6.
2 Related Work
There are three main goals in directional sensor networks, as in wireless sensor networks: maximizing the coverage, prolonging the network lifetime, and assuring network connectivity. Regarding these goals, existing studies on coverage enhancement fall into four categories [8]: (i) area-based coverage enhancement, (ii) target-based coverage enhancement, (iii) coverage enhancement with guaranteed connectivity, and (iv) coverage enhancement with network lifetime prolonging. We will only discuss the studies aiming at maximizing the whole coverage, as we focus on the improvement of the area coverage. Enhancing area coverage is very important for DSNs to fulfill the specified sensing tasks. Since a small unmonitored sub-area defeats the whole purpose of the network, sensor nodes need to be spread as uniformly as possible over the entire sensing region with minimum gaps. However, random deployment may cause several problems, such as overlapped and occluded regions, uncovered areas, and broken sensor nodes.
Therefore, three solutions have been proposed by the research community to overcome these difficulties. First solution is to redeploy new sensors after the initial deployment. Second solution is to adjust the working directions of the directional sensor nodes to improve the field coverage [15] [19] [14] [12] [5] [10]. The last one is to relocate the sensor nodes with mobility capability [13]. The study [15] is one of the pioneering works on coverage enhancement. The authors present a new method based on a rotatable sensing model. To achieve less overlapping area, a directional node repositions itself on the reverse direction of the interior angle-bisector occuring between two neighboring directional nodes. On the other hand, Cheng et.al describe the area-coverage enhancement problem as Maximum Directional Area Coverage (MDAC) problem and prove the MDAC to be NP-complete [5]. In their study, the authors define two new concepts, virtual sensor and virtual field. A virtual sensor represents one working direction of a directional sensor, whereas a virtual field is a minimal region that is formed by the intersection of the sensing regions of a number of virtual sensors. The distributed solution for MDAC problem, DGreedy algorithm, chooses the least overlapped direction as the new working direction. The authors observe that scarce sensors are highly critical to achieve maximal coverage, thus they utilize the number of sensing neighbors to differentiate the priority which represents the decision slot of sensor nodes. Zhao and Zeng [19] have adapted the theory of the virtual potential field to wireless multimedia sensor networks for coverage improvement. They propose an electrostatic field-based coverage-enhancing algorithm (EFCEA) to enhance the area coverage of WMSNs by turning sensors to the correct orientation and decreasing the coverage overlap of active sensors. They also aim at maximizing the network lifetime by shutting off as much redundant sensors as possible based on the theory of grid approach, and waking them up according to a correlation degree. In [12], the authors name the above mentioned coverage problem as the optimal coverage problem in directional sensor networks (OCDSN). They propose a greedy approximation algorithm to the solution of the OCDSN problem, based on the boundary Voronoi diagram. By constructing the Voronoi diagram of a directional sensor network one could find the maximal breach path of this network. The authors introduce an assistant sensor that can obtain the global information by traveling the edges of the Voronoi diagram. While moving, the assistant sensor determines which sensor to wake up in order to ensure the uncovered boundaries to be covered. In [16], Tezcan and Wang have studied the problem of self-orientation in WMSNs, that is finding the most beneficial orientation for all multimedia sensors to maximize the multimedia coverage. Apart from previous works, they define obstacles in the observed area and propose a solution for occluded regions. They aim at both minimizing the overlapping areas and enabling occlusion-free viewpoints. Their simulation results show that the occlusion-free viewpoint approach increases the multimedia coverage significantly. In this study, we propose a new distributed solution for the self-orientation problem in DSNs. The proposed AFUP algorithm benefits from candidate
uncovered areas. We first compare the performance of the AFUP algorithm with random deployment. As expected, AFUP achieves greater coverage than random deployment by minimizing overlapping areas. To demonstrate the benefits of the algorithm, we also compare the AFUP algorithm against the three well-known area coverage enhancing algorithms [15] [19] [16] proposed for directional sensor networks. Our algorithm outperforms the three algorithms in terms of coverage enhancement and minimization of overlapping areas.
3 Directional Sensing Model
This section briefly explains the well-known directional sensing model used by the proposed algorithm. Contrary to omni-directional sensing, directional sensors sense only towards their working direction, with a given sensing radius and AoV. A sensor node theoretically covers each point in its FoV; this assumption relies on the binary detection model [6].
3.1 Notations
The common directional sensing capability for 2D spaces is illustrated in Figure 1. The sector covered by a directional sensor node S is denoted by a 4-tuple (P, R_s, W_d, α), where P is the location, R_s is the sensing radius, W_d is the working direction, and α is the AoV of the sensor node S. Under ideal conditions without occlusion, a sensor node S covers an area of size (α/2)R_s^2 units. The special case where α = 2π can be described as the omni-sensing model. For omni-directional sensors there is only one possible working direction, whereas directional sensors have several possible working directions; however, they can work in only one direction at any given time t. W_d and α are the two additional parameters for determining the covered points in directional sensor networks.
3.2 Target in Sector (TIS) Test
The TIS test [1] determines whether a given target lies in the FoV of a sensor node. The two conditions given in Equations 1 and 2 are tested in order to determine whether a target T is covered by a directional sensor S:

   d(P, T) ≤ R_s,   (1)
   PT · W_d ≥ d(P, T) cos(α/2),   (2)

where d(P, T) denotes the distance between the target T and the sensor node S, and PT and W_d are treated as vectors (W_d of unit length). Equation 1 ensures that the target is within the sensing range of S, whereas Equation 2 performs the FoV test. According to the binary model, the target is sensed if both conditions are satisfied. This approach is commonly used in target-based coverage problems [1] [3] [17].
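Equations 1 and 2 translate directly into a few lines of code. The sketch below assumes 2D coordinates, α in radians and a unit-length working-direction vector, and is only meant to illustrate the binary test.

import math

def tis_test(P, T, Wd, Rs, alpha):
    """True if target T lies in the FoV sector of a sensor at P.
    P, T: (x, y) points; Wd: unit working-direction vector; alpha in radians."""
    dx, dy = T[0] - P[0], T[1] - P[1]
    d = math.hypot(dx, dy)
    if d > Rs:                                   # condition (1): range check
        return False
    if d == 0:                                   # the sensor's own position
        return True
    # condition (2): PT . Wd >= d(P, T) * cos(alpha/2)
    return dx * Wd[0] + dy * Wd[1] >= d * math.cos(alpha / 2.0)

# e.g. a node at (0, 0) facing east with Rs = 30 m and alpha = 60 degrees
print(tis_test((0, 0), (20, 5), (1, 0), 30, math.radians(60)))   # True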
Fig. 1. A directional sensor node senses a unit of sector described with the position (P), the working direction (Wd ), the sensing radius (Rs ), and the AoV(α). A target (T) may be covered if it is located within the FoV of the node. It is found by the TIS test.
For area coverage problems, researchers opt for a grid-based approach [4] to adapt this test model to indicate the (un)covered points in the observed area. Each point around the sensor node S is tested with the TIS test, and the coverage map of the sensor node S is then created according to the test results, as shown in Figure 2(a).
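Under the same assumptions as the previous sketch, the grid-based adaptation can be written as follows; the cell size (step) and the centring of the 2R_s × 2R_s window on the node are illustrative choices.

import math

def coverage_map(P, Wd, Rs, alpha, step=1.0):
    """Boolean grid over the 2Rs x 2Rs window centred on the node, marking the
    points the node covers (grid-based TIS test; alpha in radians, Wd a unit vector)."""
    n = int(2 * Rs / step)
    grid = [[False] * n for _ in range(n)]
    cos_half = math.cos(alpha / 2.0)
    for j in range(n):
        for i in range(n):
            x = P[0] - Rs + (i + 0.5) * step     # centre of grid cell (i, j)
            y = P[1] - Rs + (j + 0.5) * step
            dx, dy = x - P[0], y - P[1]
            d = math.hypot(dx, dy)
            grid[j][i] = d <= Rs and (d == 0 or dx * Wd[0] + dy * Wd[1] >= d * cos_half)
    return grid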
4 Coverage Enhancement Algorithm Based on the Attraction Force of Uncovered Points
4.1 Problem
Overlapping is the main problem of randomly deployed directional sensor networks. It occurs when the FoVs of two or more sensors cross over. Another problem is that some sensor nodes may cover the outside of the observed area after the initial deployment. Therefore, it is very hard to obtain the optimum coverage in random deployment scenarios. To address this problem, we propose a distributed coverage enhancement algorithm for directional sensor networks. This algorithm aims at rotating the sensor nodes with a higher overlap ratio towards uncovered regions, to minimize the overlapped regions in the sensing area.
4.2 Attraction Force of Uncovered Points (AFUP) Algorithm
After the initial deployment, directional sensor nodes need to be positioned towards uncovered areas, both to minimize possible overlapping/occlusion and to cover those uncovered points. To adjust the W_d of a directional sensor node towards such an uncovered area, we may assume that each point in this area exerts a positive repulsive force on the given node. AFUP exploits these repel forces to determine a more appropriate W_d for a given node. It uses a grid-based approach where each cell in the grid represents a point of the observed area. The algorithm consists of two main phases, AFUP INIT and AFUP NODE.
PHASE 1. The AFUP INIT phase aims both at forming subgroups in the network and at prioritizing the nodes with fewer neighboring nodes. Given that N directional sensor nodes are deployed, each sensor node discovers its neighborhood within its communication radius (R_c ≥ 2R_s) by exchanging DISCOVER MSGs. A DISCOVER MSG carries the id, the location (P), and the current working direction (W_d) of the related node. After the discovery phase, N subgroups are formed within the network, where each sensor node belongs to at least one subgroup. Then, nodes exchange the number of their neighbors via NEIGHBORHOOD MSGs. According to the number of neighbors, each node is assigned a priority. A sensor node with fewer neighboring nodes has a higher priority and is the first node to decide its new working direction. Since the nodes only change their W_d, not their physical location, the number of their neighboring nodes remains the same. Thus, the prioritization is needed only once, at the beginning of the AFUP INIT phase.
PHASE 2. The AFUP NODE phase is the core of the proposed algorithm and takes place iteratively. At the beginning of this phase, each node is in the so-called "unbalanced" state. In each iteration, sensor nodes determine their new candidate working directions. However, this new direction is not considered final until the node reaches the "balanced" state. Some sensor nodes may not find an appropriate working direction after several iterations, especially when the node density is too high. In such scenarios, to prevent infinite oscillations, the corresponding sensor nodes update their status to "balanced". Once the status of a node is changed to "balanced", the node finalizes the AFUP NODE phase. The elaboration of the AFUP NODE phase is given in the following items.
1. Exploring Overlapped Regions. First, the node marks the covered points in its coverage map (2R_s × 2R_s) according to its R_s, α and W_d (Fig. 2(a)). Then, after exchanging DISCOVER MSGs, each sensor has an up-to-date FoV of its neighbors. The node calculates the FoVs of its neighboring nodes using their respective P, α, R_s, W_d values and updates the coverage map using the points located within its sensing radius. Finally, the node counts the number of overlapped points (N_OP) to determine the state of the node, as shown in Figure 2(b).
2. Threshold-Value Test. The Threshold-Value Test is performed to determine the status of the node. The values in the coverage map represent the number of sensors which cover a given grid cell; if the value of a grid cell is greater than 1, it is an overlapped cell (point). The node sums up the number of overlapped points in its coverage map, as shown in Figure 2(c). Next, the sensor node compares the total number of overlapped points to a predefined threshold value (ε). If the value is less than this threshold, the sensor node finishes the AFUP NODE phase and physically turns to its new working direction. If a node cannot find an appropriate working direction after several iterations, it is regarded as oscillating and its status is forcibly changed to "balanced". The threshold value has been defined
(a) The coverage map of a directional sensor node A. Ones and zeros represent covered and uncovered points, respectively.
(b) The updated coverage map of a directional sensor node A. A has two neighboring nodes, B and C. The number of each cell shows how many nodes cover the corresponding cell. Each cell represents a grid point.
(c) The repel force map of a directional sensor node A. Zeros represent the repel forces on the node A, whereas ones apply neither positive nor negative forces to A. Fig. 2. The coverage map and the neighborhood map of a directional sensor node A. The default width and height of the maps of a sensor node are denoted by 2Rs x2Rs .
as k percent of the possible maximum covered sector area ((α/2)R_s^2). After several tests, k has been chosen as 10%:

   S_state = balanced,    if N_OP ≤ k × (α/2)R_s^2;
   S_state = unbalanced,  otherwise.

3. Determination of the New Working Direction. A sensor node which cannot pass the Threshold-Value Test marks the points covered by its neighbors
in its repel force map and then calculates the center of the uncovered points. The uncovered points are represented with a zero value, and each such point is considered to exert an equal positive repulsive force on the working direction of the node. The center (x_m, y_m) of the repel forces is calculated using Equation 3, where F_ij is the force contributed by the grid cell at offset (i, j) and (x, y) are the coordinates of that cell. Given that the sensor node is located at the point (x_0, y_0), arctan((y_0 − y_m)/(x_0 − x_m)) gives the new working direction of the sensor node (a code sketch of Steps 1–3 is given after this list):

   x_m = ( Σ_{j=−R_s..R_s} Σ_{i=−R_s..R_s} F_ij · x ) / ( Σ_{j=−R_s..R_s} Σ_{i=−R_s..R_s} F_ij ),
   y_m = ( Σ_{j=−R_s..R_s} Σ_{i=−R_s..R_s} F_ij · y ) / ( Σ_{j=−R_s..R_s} Σ_{i=−R_s..R_s} F_ij ).   (3)
Algorithm 1. The pseudo-code of the AFUP algorithm

Each sensor node knows its location P(x_0, y_0), W_d, R_s and α.
/* Parallel computation */
/* AFUP INIT */
set the parameter S_status = unbalanced
set the parameter forceThreshold (ε)
exchange DISCOVER MSG (P, W_d, R_s, α)
create neighboring sensor list; count the number of neighbor nodes
exchange NEIGHBORHOOD MSG (number of neighbors)
set the priority
/* AFUP NODE */
oscillation = 0
while (S_status = unbalanced) AND (oscillation < 10) do
    sum covered points
    collect W_d from neighbor sensors
    calculate N_OP
    if N_OP < ε then
        S_status = balanced
    else
        find UNCOVERED points within the FoV
        calculate the CENTER (x_m, y_m) of the uncovered points
        set W_d towards the CENTER point
        inform neighboring sensors
    end if
    oscillation++
end while
4. Informing the Neighborhood. After calculating the new working direction, the sensor node informs its neighboring nodes about its candidate working direction via a WD MSG. The node then waits for the next round. A WD MSG involves only the exchange of the working directions of the neighboring nodes; thus, no exchange of coverage maps is required.
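The sketch below illustrates Steps 1–3 of the AFUP NODE phase for a single node, using boolean coverage grids such as those produced by the earlier coverage_map sketch. The conversion of the threshold k·(α/2)R_s^2 into a number of grid cells and the use of atan2 (which resolves the quadrant ambiguity of a plain arctan and faces the node towards the uncovered centre) are implementation choices made for this sketch.

import math

def afup_step(P, own_map, neighbour_maps, Rs, alpha, k=0.10, step=1.0):
    """One AFUP NODE iteration (sketch). own_map / neighbour_maps are boolean grids
    over the same 2Rs x 2Rs window centred on the node; alpha in radians."""
    n = len(own_map)

    def point(i, j):                         # grid cell (i, j) -> field coordinates
        return (P[0] - Rs + (i + 0.5) * step, P[1] - Rs + (j + 0.5) * step)

    # Step 1: count overlapped points (cells covered by more than one sensor)
    n_op = sum(
        1
        for j in range(n) for i in range(n)
        if own_map[j][i] + sum(m[j][i] for m in neighbour_maps) > 1
    )

    # Step 2: threshold-value test, epsilon = k * (alpha/2) * Rs^2,
    # converted to a number of grid cells by dividing by the cell area step^2
    epsilon = k * (alpha / 2.0) * Rs * Rs / (step * step)
    if n_op <= epsilon:
        return "balanced", None

    # Step 3: centre of the points not covered by any neighbour (each such point
    # exerts an equal, unit repulsive force; covered points exert none)
    sx = sy = cnt = 0
    for j in range(n):
        for i in range(n):
            if not any(m[j][i] for m in neighbour_maps):
                x, y = point(i, j)
                sx, sy, cnt = sx + x, sy + y, cnt + 1
    if cnt == 0:
        return "balanced", None
    xm, ym = sx / cnt, sy / cnt
    new_wd = math.atan2(ym - P[1], xm - P[0])   # face the node towards (xm, ym)
    return "unbalanced", new_wd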
5 Performance Evaluation
We have implemented a simulation environment using MATLAB 7.8 to evaluate the performance of the AFUP algorithm. We have run several tests for different scenarios to show the effect of the number of sensors (N), R_s, and α. We have compared the results against the results of the three recently proposed methods in [15], [19] and [16]. We consider total coverage and overlap ratio as the
Fig. 3. Random deployment vs. AFUP algorithm (Rs = 30m, α = 60°): (a) initial deployment of 50 directional sensor nodes in a 250x250 m² area; (b) new working directions of the 50 directional sensor nodes after the AFUP algorithm.
two key metrics to evaluate the performance. We have used the same node densities as in the respective studies. Thus, sensor nodes had a basic configuration of AoV = 60 ◦ and Rs = 30m. Simulations have been performed for randomly deployed sensor nodes in a rectangular two-dimensional terrain of 250x250m2. 15 different uniform random distributions have been generated for each individual scenario. Note that, each sensor node is aware of its exact location via some localization technique. In Figure 3(a), as an example, a sensing field with 50 directional sensor nodes with Rs = 30m is shown. Each directional sensor node is illustrated with a ”small circle” and its FoV is shown with a green area. The dark regions represent the overlapped areas. Figure 3(a) shows the working directions of the nodes after random deployment, whereas the working directions in Figure 3(b) are the result of the AFUP algorithm. Simulation results show that the motility capability of sensor nodes prevent serious waste of resources. Since the nodes have the capability to exchange information with their neighbors, they can reposition their working directions to minimize the overlapping. Moreover, they can elude from monitoring outside the area of interest. Fig. 3(b) shows the situation after motile sensor nodes have determined their FoV disks, scanned their coverage neighbors and communicated with their neighbors to decide on the optimal working directions. We have observed that by using the AFUP algorithm in a 50 node network, a coverage ratio of 34.95% could be achieved, whereas the maximum possible coverage on this field is 37.68%. The comparison of the maximum possible coverage (optimum deployment), the coverage ratios of the AFUP algorithm and random deployment for different number of nodes are given in Figure 4(a). Note that, optimum deployment refers to a controlled deployment where nodes are placed manually with minimum overlapping areas. Figure 4(a) shows that motility is inadequate above a certain threshold to heal the coverage holes. We believe that either mobility feature needs to be used above this threshold to improve the coverage [7] or redundant nodes should be shut off to prolong the network lifetime. Analysis of Table 1 shows that the AFUP algorithm outperforms the random deployment in terms of the total coverage. Especially, the coverage gain in dense
(a) The coverage ratios of random deployment, AFUP algorithm, and optimum deployment
(b) The relationship between the coverage gain ratio, the overlap minimization ratio and the number of sensor nodes. The negative coverage gain denotes where motility is inadequate. At this point, either the mobility feature needs to be used to improve the coverage or redundant nodes should be shut off to prolong the network lifetime.
Fig. 4. Simulation results, N = 50, Rs = 30m, α = 60°

Table 1. AFUP Algorithm vs. Random Algorithm

  Rs = 30m, α = 60°   Random Deployment (%)   AFUP (%)   Increment of Total Coverage   Coverage Gain (%)
  N = 25              15.65                   18.48      2.83                          18.08
  N = 50              28.28                   34.95      6.67                          23.59
  N = 75              39.69                   49.28      9.59                          24.16
  N = 100             48.38                   60.56      12.18                         25.18
  N = 125             56.28                   69.80      13.52                         24.02
  N = 150             63.26                   77.23      13.97                         22.08
(a) The relation between the angle of view (α) and the coverage, where Rs is constant
(b) The relation between the sensing radius (Rs ) and the coverage, where α is constant Fig. 5. Effect of Node Capabilities on Coverage Improvement
networks is greater than in sparse networks. The coverage gain varies from 18% to 25%. However, coverage gain starts to drop down when the network is saturated with sensor nodes. The node capabilities, such as Rs and α, also affect the coverage ratio of the network. The total coverage definitely increases with the increase of Rs and α. Fig. 5(a) and Fig. 5(b) show how the coverage ratio is changed with varying Rs and α for both random deployment and AFUP algorithm. Figure 4(b) shows the ratios, overlap minimization and coverage gain for different number of sensor nodes. We have observed that overlap minimization ratio decreases as the number of sensor nodes increases and contributes to the total coverage up to 7%. However, the overlap minimization ratio becomes negative when the node density exceeds a certain threshold. This threshold value could be chosen as to put the redundant sensors into sleep. AFUP algorithm gives better results than the two available solutions [15] [19] in terms of coverage improvement. Table 2 shows the performance difference between the AFUP algorithm and the two other methods. We have also compared our algorithm to the method in [16]. The authors take obstacles into consideration, so we have only compared the overlap minimization ratios. Initial results show that
Table 2. Comparison of AFUP Algorithm against other solutions (coverage improvement)

  Scenario                     Compared method   Improvement (method)   Improvement (AFUP)
  N = 75, Rs = 60, α = 44°     Method 1 [19]     10.90%                 12.28%
  N = 38, Rs = 50, α = 120°    Method 2 [15]     4.66%                  17.5%
Fig. 6. The convergence characteristic of the AFUP algorithm. AFUP achieves the greatest coverage improvement in five/six iterations.
Initial results show that AFUP performs slightly better than their method. However, we will obtain more accurate results once occluded areas are introduced into the AFUP algorithm. Since AFUP is an iterative algorithm, we have also evaluated its convergence time. The convergence time depends on the number of nodes within the subgroups; nevertheless, simulation results show that the AFUP algorithm mostly converges in 5 to 6 iterations, as shown in Figure 6.
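To make the coverage-ratio metric used throughout this evaluation concrete, a simple grid-sampling estimate of the area covered by directional FoV sectors can be written as below. This is not the authors' simulator: the grid step, the random working directions and the 50-node deployment are illustrative assumptions, with the field size, sensing radius and angle of view taken from the setup described above (250 × 250 m, Rs = 30 m, AoV = 60°).

```python
import math, random

FIELD, RS, AOV = 250.0, 30.0, math.radians(60)  # terrain side (m), sensing radius (m), angle of view

def covers(node, point):
    """True if a directional sensor (x, y, working_direction) senses the point."""
    x, y, theta = node
    dx, dy = point[0] - x, point[1] - y
    if math.hypot(dx, dy) > RS:
        return False
    # angular offset between the working direction and the point, folded to [-pi, pi]
    diff = (math.atan2(dy, dx) - theta + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= AOV / 2

def coverage_ratio(nodes, step=2.5):
    """Fraction of grid sample points covered by at least one sensor's FoV sector."""
    pts = [(i * step, j * step)
           for i in range(int(FIELD / step)) for j in range(int(FIELD / step))]
    covered = sum(1 for p in pts if any(covers(n, p) for n in nodes))
    return covered / len(pts)

# Illustrative random deployment of 50 nodes with random working directions:
random.seed(1)
nodes = [(random.uniform(0, FIELD), random.uniform(0, FIELD), random.uniform(0, 2 * math.pi))
         for _ in range(50)]
print(round(coverage_ratio(nodes), 4))
```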
6 Conclusion
In this paper, we have proposed a self-orientation algorithm for directional sensor networks in order to improve the coverage after the initial deployment. Simulation results show that the AFUP algorithm can increase the total coverage by up to 25% compared to random deployment. Moreover, AFUP may heal uncovered areas up to 13% better than some existing approaches. We have also observed that it is easier to decrease the overlapping areas in sparse networks than in dense networks. Nevertheless, the coverage gain increases with the increasing number of nodes, since facing the border nodes towards the area of interest contributes a significant amount to the total coverage. In this study, we also show that motility is inadequate in highly dense networks, and we argue that mobility or node scheduling is required in those networks to improve the coverage or the network lifetime. We are already working on an enhanced version of the AFUP algorithm, named W-AFUP, where W refers to the weights of the forces. First simulation
results show that W-AFUP achieves up to 10% more coverage than AFUP. The details and performance results of the W-AFUP algorithm will be discussed in future work, where occluded regions will also be included in the test scenarios and performance evaluations.
Acknowledgement This research has been supported by Yildiz Technical University Scientific Research Projects Coordination Department under the grant number 2011-04-01DOP03.
References 1. Ai, J., Abouzeid, A.A.: Coverage by directional sensors in randomly deployed wireless sensor networks. Journal of Combinatorial Optimization 11(1), 21–41 (2006) 2. Akyildiz, I., Melodia, T., Chowdhury, K.: Wireless multimedia sensor networks: Applications and testbeds. Proceedings of the IEEE 96(10), 1588–1605 (2008) 3. Cai, Y., Lou, W., Li, M.: Cover set problem in directional sensor networks. In: Proc. of IEEE Intl. Conf. on Future Generation Communication and Networking (FGCN 2007), Washington, DC, USA, pp. 274–278 (2007) 4. Chen, H., Wu, H., Tzeng, N.-F.: Grid-based approach for working node selection in wireless sensor networks. In: Proc. Intl. Conf. on Communications, Paris,France, vol. 6, pp. 3673–3678 (20-24, 2004) 5. Cheng, W., Li, S., Liao, X., Changxiang, S., Chen, H.: Maximal coverage scheduling in randomly deployed directional sensor networks. In: Proc. of Intl. Conf. on Parallel Processing Workshops (ICPPW 2007), pp. 68–68. Xi-An, China (2007) 6. Ghosh, A., Das, S.K.: Coverage and connectivity issues in wireless sensor networks: A survey. Pervasive and Mobile Computing 4(3), 303–334 (2008) 7. Guvensan, M.A., Yavuz, A.G.: A hybrid solution for coverage enhancement in directional sensor networks. In: Proc. of Intl. Conf. on Wireless and Mobile Communications (ICWMC 2011), Luxembourg (June 2011) 8. Guvensan, M.A., Yavuz, A.G.: On coverage issues in directional sensor networks: A survey. Elsevier Ad Hoc Networks (February 2011), http://dx.doi.org/10.1016/j.adhoc.2011.02.003 9. Huang, C.-F., Tseng, Y.-C.: The coverage problem in a wireless sensor network. Mobile Networks and Applications 10, 519–528 (2005) 10. Kandoth, C., Chellappan, S.: Angular mobility assisted coverage in directional sensor networks. In: Proc. of Intl. Conf. on Network-Based Information Systems (NBIS 2009), pp. 376–379 (August 2009) 11. Kansal, A., Kaiser, W., Pottie, G., Srivastava, M., Sukhatme, G.: Reconfiguration methods for mobile sensor networks. ACM Transactions on Sensor Networks 3(4) (2007) 12. Li, J., Wang, R.-C., Huang, H.-P., Sun, L.-J.: Voronoi based area coverage optimization for directional sensor networks. In: Proc. of Intl. Symposium on Electronic Commerce and Security (ISECS 2009), Nanchang, China, vol. 1, pp. 488–493 (May 2009) 13. Liang, C.-K., He, M.-C., Tsai, C.-H.: Movement assisted sensor deployment in directional sensor networks. In: Proc. of Intl. Conf. on Mobile Ad-Hoc and Sensor Networks (December 2010)
14. Ma, H., Zhang, X., Ming, A.: A coverage-enhancing method for 3d directional sensor networks. In: Proc. of IEEE Intl. Conf. on Computer Communications (INFOCOM 2009), Rio de Janerio, Brazil, pp. 2791–2795 (2009) 15. Tao, D., Ma, H., Liu, L.: Coverage-enhancing algorithm for directional sensor networks. In: Cao, J., Stojmenovic, I., Jia, X., Das, S.K. (eds.) MSN 2006. LNCS, vol. 4325, pp. 256–267. Springer, Heidelberg (2006) 16. Tezcan, N., Wang, W.: Self-orienting wireless multimedia sensor networks for occlusion-free viewpoints. Computer Networks: Int. Journal of Computer and Telecommunications Networking 52(13), 2558–2567 (2008) 17. Wen, J., Fang, L., Jiang, J., Dou, W.: Coverage optimizing and node scheduling in directional wireless sensor networks. In: Proc. of IEEE Intl. Conf. on Wireless Communications, Networking and Mobile Computing (WiCom 2008), Dalian, China, pp. 1–4 (October 2008) 18. Mohamed, Y., Akkaya, K.: Strategies and techniques for node placement in wireless sensor networks: A survey. Ad Hoc Networks 6(4), 621–655 (2008) 19. Zhao, J., Zeng, J.-C.: An electrostatic field-based coverage-enhancing algorithm for wireless multimedia sensor networks. In: Proc. of IEEE Intl. Conf. on Wireless Communications, Networking and Mobile Computing (WiCom 2009), Beijing, China, pp. 1–5 (September 2009)
A Multi-objective Approach for Data Collection in Wireless Sensor Networks

Christelle Caillouet¹, Xu Li², and Tahiry Razafindralambo²

¹ Lehrstuhl II für Mathematik - RWTH Aachen University, Germany
[email protected]
² INRIA Lille - Nord Europe, France
xu.li, [email protected]
Abstract. Wireless sensor networks (WSNs) are deployed to collect huge amounts of data from the environment. The produced data has to be delivered through the sensors' wireless interfaces using multi-hop communication toward a sink. The position of the sink impacts the performance of the wireless sensor network regarding delay and energy consumption, especially for relaying sensors. Optimizing the data gathering process in multi-hop wireless sensor networks is, therefore, a key issue. This article addresses the problem of data collection using mobile sinks in a WSN. We provide a framework that studies the trade-off between energy consumption and delay of data collection. This framework provides solutions that allow decision makers to optimally design the data collection plan in wireless sensor networks with mobile sinks.
Keywords: mobile sink, data collection, energy.
1 Introduction
Wireless sensor networks (WSNs) have received a lot of attention in recent years due to their potential applications in various areas such as environment monitoring or tracking [1,5,12]. In order to get useful and up-to-date information from the environment, the network is composed of a large number of low-capacity (processor, memory, battery) sensors. As the number of sensors increases, the amount of data in the network also increases. The data generated by the sensors then has to be sent to a central entity, called the sink, for storage and processing. Thanks to the wireless communication capabilities and the protocols developed in the literature, multi-hop transmissions can be used to route data from a sensor to the sink if no direct connection is available. However, this classical N-to-1 communication paradigm rapidly consumes the energy of intermediate sensors and provides an unfair delay distribution depending on the distance to the sink [8]. Therefore, data collection becomes a key issue in wireless sensor networks. Data collection in wireless sensor networks consumes energy and, depending on the application, requires low delay. Reducing the energy consumption while increasing the amount of generated data to obtain a correct view of the environment is a great challenge. Due to these conflicting goals, the trade-off between
Supported by an Alexander von Humboldt fellowship for postdoctoral researchers.
energy consumption and environment observation accuracy is still a hot topic in wireless sensor networks. Moreover, WSNs are increasingly used for delay-sensitive applications such as battlefield monitoring. In these applications, delay reduction between data generation and data processing becomes mandatory. The literature shows different ways to reduce the amount of transiting data in the network. On one hand, data aggregation techniques [9] limit the generated data by using forecasting. These techniques make some strong assumptions regarding the data. However, forecasting strongly reduces the amount of generated data and thus increases the network lifetime. On the other hand, the use of mobile sinks reduces the number of forwarding sensors [11] but requires motion capabilities for the sinks. In this paper, we focus on data collection using mobile sinks in wireless sensor networks with the objective of minimizing the energy consumption and the delay of data collection. Our purpose is to determine where to place a set of gateways (or collection points) that are defined to collect the produced data of a region in the WSN field, and to compute the route of a mobile sink moving along the gateways to gather the data from the sensors. To answer these questions, we propose a Multi-objective Linear Programming (MLP) framework that allows us to optimally place the gateways and to jointly minimize the energy spent in the WSN and the length of the route of the mobile sink. Multi-objective optimization does not compute a unique solution, but a set of "best" solutions, called the Pareto front, capturing the trade-offs between the different metrics. Solving a multi-objective problem consists in finding the Pareto front, from which the decision maker chooses the solution that best fits their needs. In this work, each point of the Pareto set is obtained by solving an optimization problem. The main contribution of this work is to give a multicriteria vision of the data collection problem in WSNs. As far as we know, there is no multi-objective analysis on this subject. The developed solutions reduce the overall energy consumption but also reduce the delay of data delivery from the sensors to the sink. Unlike the works proposed in the literature, the contributions of this paper are twofold. First, we tackle the problem of optimal placement of data collection points in an energy-efficient WSN. Second, we optimize the data collection tour of the sink to minimize the delay. In the first contribution, our aim is to reduce the energy spent by intermediate sensors, and in the second contribution, we focus on delay reduction. In Section 2, we review previously proposed solutions for optimizing the energy consumption and data collection in WSNs, and describe our assumptions for this work. Section 3 presents the formulation of our proposed MLP. Section 4 presents the method to optimally solve our MLP. Section 5 shows the experimental results and Section 6 concludes the paper.
2 Background and Assumptions
2.1 State of the Art
Various solutions have been proposed to extend the network lifetime and reduce delay for data collection. Some solutions have proposed to deploy static sinks
in order to reduce the traffic bottlenecks which affect the energy efficiency and the WSN lifetime. In [3], the authors propose a heterogeneous view of the network. They develop an ILP (Integer Linear Program) formulation for placing a minimum number of collection points (or gateways) while ensuring connectivity among them and the sink, so as to form a wireless mesh network that delivers the data. They minimize the number of collection points and the maximum distance between the sensors and a gateway so that energy consumption is minimized. The use of mobile sinks instead of static sinks to collect the data is more efficient and significantly increases the lifetime of the sensor network [4,11]. In these works, the location of the mobile sinks is periodically computed so that the network lifetime is maximized. Some research efforts have focused on approaches either minimizing the energy consumed by the sensors [4], or maximizing the global network lifetime [2,7]. Considering the route of the mobile sinks in WSNs, instead of their periodic relocation, has not been addressed in previous work to the best of our knowledge. We propose a different formulation of the problem, seeking to jointly compute an optimal placement of the collection points (or gateways) in an energy-efficient way, while minimizing the length of the route of a mobile sink. The extension of our model to deal with several mobile sinks is straightforward, as described in the next section. We do not claim to provide a unique best solution of the problem, but we study the trade-off between the energy spent by the sensor nodes in data collection and the delay induced by a mobile sink collecting data at collection points in the WSN. Thus, our work provides solutions that allow decision makers to optimally design the data collection plan in wireless sensor networks with mobile sinks.
2.2 Motivating Application
We consider a multi-tiered network structure in order to improve energy efficiency. The hierarchical architecture considered in this paper has been designed for WSNs and consists of the following tiers:
– Sensors are static devices that are capable of collecting information and are resource constrained. The sensors are spread out over the sensing field following a distribution that may be probabilistic or deterministic.
– Gateways (or collection points) are static devices deployed to collect the traffic of a region of the network.
– Sinks are devices with motion capabilities which gather data from the sensors.
In our application, the sinks gather data at different collection points of the sensing field. Since the collection points may not be inside the communication range of a given sensor, sensors use multi-hop communications to send data to a given collection point. During the bootstrapping phase of our application, a set of possible collection points is defined. An optimal subset of collection points is chosen in order to reduce the application delay and to enhance the energy efficiency of the whole network. Moreover, since we consider mobile sinks, it is also important to optimize the route of the mobile sink in order to reduce its energy consumption and the delay for data delivery.
2.3 Model and Assumptions
We assume that the routing in the WSN (from sensors to collection points) is given, so that our model is independent of the specific routing strategy. We select and place the gateways in order to minimize the energy consumption of the WSN. The energy model considered for the sensors is based on the first-order radio model described in [6]. A sensor consumes $\epsilon_{elec} = 50$ nJ/bit to run the transmitter or receiver circuitry, and $\epsilon_{amp} = 100$ pJ/bit/m² for the transmitter amplifier. Thus, to receive a k-bit message, sensor i consumes

$E_r = \epsilon_{elec}\, k$,

and it consumes

$E_t = \epsilon_{elec}\, k + \epsilon_{amp}\, dist^2(i,j)\, k$   (1)

to transmit this message to a neighbor j, where dist(i, j) is the Euclidean distance between i and j. In the following, we propose a Multi-objective Linear Programming (MLP) framework that allows us to study the trade-offs between the length of the route of the mobile sinks associated with a computed gateway placement, and the overall energy consumption in a wireless sensor network.
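For concreteness, the first-order radio model can be expressed as two small helper functions. The constants are those quoted above (ε_elec = 50 nJ/bit, ε_amp = 100 pJ/bit/m²); the node positions and message length in the example call are illustrative assumptions.

```python
import math

E_ELEC = 50e-9    # 50 nJ/bit, transmitter/receiver circuitry
E_AMP = 100e-12   # 100 pJ/bit/m^2, transmitter amplifier

def dist(a, b):
    """Euclidean distance between two (x, y) positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def energy_rx(k_bits):
    """Energy (J) spent to receive a k-bit message: E_r = e_elec * k."""
    return E_ELEC * k_bits

def energy_tx(k_bits, pos_i, pos_j):
    """Energy (J) spent by sensor i to send a k-bit message to neighbor j:
    E_t = e_elec * k + e_amp * dist^2(i, j) * k."""
    return E_ELEC * k_bits + E_AMP * dist(pos_i, pos_j) ** 2 * k_bits

# Illustrative example (positions in metres and a 1000-bit message are assumptions):
if __name__ == "__main__":
    print(energy_rx(1000), energy_tx(1000, (0.0, 0.0), (30.0, 40.0)))
```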
3 Problem Definition
In this work, we focus on data collection in WSNs minimizing the energy consumption and the delay. A set of gateways is chosen to collect the traffic of a region in the WSN field, and a mobile sink moves along the gateways to gather the data from each region. Given a wireless sensor network represented by a set of sensor nodes S, we define a set of candidate sites $C_S$, each of which can potentially host a collection point. We want to determine the gateways' locations such that each sensor is associated with its closest one. In order to associate each sensor with its closest gateway, we order, for each sensor node $i \in S$, the reachable gateways in the vector $O_i$: if $j < k$, then $dist(i, O_i(j)) \le dist(i, O_i(k))$ and $O_i(j)$ comes before $O_i(k)$ in the vector. The sets $J_i$ are the index sets of the vectors $O_i$. The routing table is an input of our optimization problem. We thus have the set P of paths between the sensors and the candidate sites where we can potentially deploy a gateway. $O(p)$ (resp. $D(p)$) denotes the source node (resp. the destination node) of path $p \in P$. From the set of paths, we introduce the matrix C to indicate the connectivity between the sensors and the gateways:

$C_{ic} = 1$ if there exists a route between sensor $i$ and candidate site $c$, and $C_{ic} = 0$ otherwise.

The decision variables used in our MLP are the following:

$E_{max}$ = the maximum amount of energy consumed by a sensor;

$x_{ij} = 1$ if sensor $i$ is assigned to gateway $j$, and $x_{ij} = 0$ otherwise;
$y_j = 1$ if a gateway is installed at candidate site $j$, and $y_j = 0$ otherwise;

$\chi_{ij} = 1$ if two gateways are installed at candidate sites $i$ and $j$ and the link $(i, j)$ is selected for the route of the mobile sink, and $\chi_{ij} = 0$ otherwise.

To evaluate the overall quality of our solutions, we use the following metrics:
– MinMaxE ($f^1$): balancing the energy spent by the sensor nodes, which can be viewed as WSN lifetime maximization. From the energy model described in Section 2.3, we seek to minimize the maximum energy spent by each sensor node: $\min \max_{s \in S}(E_{tot}(s))$, where $E_{tot}(s)$ is the total amount of energy spent by sensor $s$ for the traffic it sends and relays in the WSN. The formal definition of $E_{tot}(s)$ is given in the next subsection.
– MinRoute ($f^2$): minimizing the length of the route of the mobile sink between the different installed gateways: $\min \sum_{i \in C_S} \sum_{j \in C_S} dist(i, j)\, \chi_{ij}$.

3.1 Multi-objective Linear Program
The optimization problem of placing the gateways such that we jointly minimize the length of the mobile sink route and the energy spent by the sensor nodes is the following:

(i) $\min f^1 = E_{max}$,  (ii) $\min f^2 = \sum_{i \in C_S} \sum_{j \in C_S} dist(i, j)\, \chi_{ij}$   (2)

subject to

$\sum_{j \in C_S} x_{ij} = 1 \quad \forall i \in S$   (3)

$x_{ij} \le C_{ij}\, y_j \quad \forall i \in S,\ j \in C_S$   (4)

$y_{O_i(k)} + \sum_{h \in J_i,\, h > k} x_{i O_i(h)} \le 1 \quad \forall i \in S,\ k \in J_i$   (5)

$\sum_{p \in P,\, i \in p,\, i \neq O(p)} (E_r + E_t)\, x_{O(p)D(p)} + \sum_{p \in P,\, i = O(p)} E_t\, x_{i D(p)} \le E_{max} \quad \forall i \in S$   (6)

$\sum_{i \in C_S} \chi_{ij} = y_j, \quad \sum_{j \in C_S} \chi_{ij} = y_i \quad \forall i, j \in C_S$   (7)

$\sum_{i,j \in K} \chi_{ij} \le \sum_{i \in K \setminus \{k\}} y_i + 1 - y_c \quad \forall k \in K \subset C_S,\ c \in C_S \setminus K$   (8)
The objective (2) seeks to minimize (i) the maximum energy consumed by the sensor nodes, and (ii) the length of the route of a mobile sink along the placed gateways, subject to the euclidean distance between the gateways. Constraints (3) and (4) ensure that each sensor must be associated with an installed gateway that can be reached using a given existing path. Constraints (5) force each sensor to be assigned to its closest gateway.
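As a rough illustration of how the placement-and-assignment core of this formulation could be prototyped, the sketch below models objective (2)(i) together with constraints (3), (4) and (6) using the open-source PuLP package rather than CPLEX. The toy sensor set, candidate sites, paths and energy constants are assumptions made for illustration only, and constraint (5), the route variables χ and the subtour constraints (7)-(8) are omitted here.

```python
# Sketch of the MinMaxE side of the MLP with PuLP (not the authors' CPLEX code).
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

S = ["s1", "s2", "s3"]                       # assumed sensor ids
CS = ["g1", "g2"]                            # assumed candidate sites
# paths: (source sensor, destination site, relaying sensors on the path)
paths = [("s1", "g1", []), ("s2", "g1", ["s1"]), ("s3", "g2", [])]
C = {(o, d): 1 for (o, d, _) in paths}       # connectivity matrix derived from the paths
E_t, E_r = 2.5e-4, 5e-5                      # assumed per-message energies (J)

prob = LpProblem("gateway_placement", LpMinimize)
x = LpVariable.dicts("x", (S, CS), cat=LpBinary)   # sensor-gateway assignment
y = LpVariable.dicts("y", CS, cat=LpBinary)        # gateway installed at a candidate site
Emax = LpVariable("Emax", lowBound=0)
prob += Emax                                       # objective (2)(i)

for i in S:
    prob += lpSum(x[i][j] for j in CS) == 1        # constraint (3)
    for j in CS:
        prob += x[i][j] <= C.get((i, j), 0) * y[j]  # constraint (4)

for i in S:                                        # constraint (6)
    forwarded = lpSum((E_r + E_t) * x[o][d] for (o, d, relays) in paths if i in relays)
    own = lpSum(E_t * x[i][d] for (o, d, _) in paths if o == i)
    prob += forwarded + own <= Emax

prob.solve()
print("Emax =", Emax.value())
```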
The objective (2)(i) and constraints (6) try to minimize the maximum amount of energy spent by the sensors. Each sensor has to send its own traffic to its associated gateway, and it also has to forward traffic received from other sensors and destined to their associated collection points. According to the definitions in Section 2.3, the total energy spent by sensor i to relay the data of other sensors associated with an installed collection point equals

$E_{Ftot}(i) = \sum_{p \in P,\ i \in p,\ i \neq O(p),\ y_{D(p)} = 1} (E_r + E_t)\, x_{O(p)D(p)}$.
We have to add i's traffic to this formula:

$E_{Stot}(i) = \sum_{p \in P,\ i = O(p),\ y_{D(p)} = 1} E_t\, x_{i D(p)}$.
This leads to the total amount of energy spent by i: $E_{tot}(i) = E_{Ftot}(i) + E_{Stot}(i)$. By constraints (4), the sensor-gateway association cannot exist if the candidate site is not chosen, i.e. $y_j = 0 \Rightarrow x_{ij} = 0, \forall i \in S$. $E_{tot}(i)$ can thus be replaced by the left-hand side of constraints (6). We then bound this amount by $E_{max}$, which is to be minimized in the objective function. Equalities (7) force the mobile sink to visit all the chosen collection points to collect the data generated by the sensors. They refer to the NP-complete Travelling Salesman Problem (TSP). Since our problem combines gateway placement and the TSP, we do not know a priori the number of deployed gateways that have to be part of the TSP tour, but only the number of candidate sites. The classical linear constraint for subtour elimination is not enough to ensure subtour elimination within the deployed gateways. Constraints (8) are so-called generalized subtour elimination constraints and ensure that there must be a selected edge between a subset K and $\bar{K} = C_S \setminus K$ only if there is at least one deployed gateway inside and outside of K. Unfortunately, the number of subsets $K \subset C_S$ is exponential in the cardinality of $C_S$. To avoid the complete enumeration of the subsets, we proceed as follows (a sketch of this loop is given after the list):
1. Solve the MLP without subtour elimination.
2. Check the solution: if it has no subtour, we are done.
3. If there is a subtour, add the corresponding constraint (8) for the subtour and solve the program again.
4. Iterate until a solution without subtours is found.
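The loop referred to above can be sketched as follows. Here `solve_mlp(cuts)` is a placeholder for the call that solves the program with the generalized subtour elimination cuts accumulated so far and returns the selected tour edges and the installed gateways; only the subtour detection and the iteration skeleton are spelled out.

```python
def find_subtour(edges, installed):
    """Return the smallest connected component (a set of gateways) induced by the
    selected edges, or None if the installed gateways form a single tour."""
    adj = {g: set() for g in installed}
    for (i, j) in edges:          # edges are assumed to connect installed gateways only
        adj[i].add(j)
        adj[j].add(i)
    components, seen = [], set()
    for start in installed:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        components.append(comp)
    return None if len(components) <= 1 else min(components, key=len)

def solve_with_subtour_elimination(solve_mlp):
    """Steps 1-4 of the text: iterate until the solution contains no subtour.
    solve_mlp is assumed to accept a list of subtours (each a set of candidate
    sites) for which constraints (8) must be added."""
    cuts = []
    while True:
        edges, installed = solve_mlp(cuts)   # step 1 / step 3 (re-solve)
        subtour = find_subtour(edges, installed)
        if subtour is None:                  # step 2: no subtour, we are done
            return edges, installed
        cuts.append(subtour)                 # step 3: add the cut for this subtour
```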
3.2 Multiple Mobile Sinks
The extension with several mobile sinks is straightforward in our model and can be done in the following way:
– Introduce the set M of mobile sinks, and the decision variable $\chi^m_{ij}$ to be the selection of link $(i, j)$, $i, j \in C_S$, in the route of mobile sink $m \in M$.
– Add the set of constraints specifying that each link between two gateways must be selected by exactly one mobile sink:

$\sum_{m \in M} \chi^m_{ij} = \chi_{ij} \quad \forall i, j \in C_S$.   (9)

– Ensure that the route of each mobile sink is a tour:

$\sum_{j \in C_S} \chi^m_{ji} = \sum_{j \in C_S} \chi^m_{ij} \quad \forall i \in C_S,\ m \in M$.   (10)
Investigation regarding multiple mobile sinks is out of the scope of this paper and is left as future work. We also note that this simple extension could be enhanced by coordinating the tours of the different mobile sinks.
4 Energy-Delay Trade-Off
In order to determine the optimal solutions of our problem, we have solved the proposed MLP with the IBM CPLEX solver¹ (version 12) on an Intel Core 2 at 2.4 GHz with 2 GB of memory. Combining the two metrics of the objective function (2) into a single objective is not relevant for our problem. Indeed, there is a conflict between route length and energy consumption in our optimization framework: optimizing the route length of the mobile sink inevitably degrades the energy spent by the sensors. Saving energy requires deploying more collection points in the network, thus increasing the length of the route for the mobile sinks; as a result, the route length is degraded, and vice versa. Consequently, for such a multi-objective optimization problem in which the objectives cannot be optimized simultaneously, the concept of Pareto optimality is used for the evaluation. The main idea for studying the trade-offs between the two metrics MinMaxE and MinRoute is to find all the possible non-dominated solutions of the optimization problem. In a general multi-objective problem of the form

$\min f(x) = (f^1(x), f^2(x), \ldots, f^k(x))$ s.t. $g_i(x) \le 0,\ i = 1, \ldots, m$,

where $x \in \mathbb{R}^n$ is the decision vector belonging to the feasible region $F = \{x \in \mathbb{R}^n \mid g_i(x) \le 0,\ i = 1, 2, \ldots, m\}$, a solution $x^2 \in \mathbb{R}^n$ is dominated if there is another solution $x^1 \in \mathbb{R}^n$ such that:
– the decision vector $x^1$ is not worse than $x^2$ in all objectives: $f^i(x^1) \le f^i(x^2)$, $\forall i = 1, 2, \ldots, k$;
– the decision vector $x^1$ is strictly better than $x^2$ in at least one objective: $f^i(x^1) < f^i(x^2)$ for at least one $i = 1, 2, \ldots, k$.
A solution is non-dominated if there is no other solution dominating it. Informally, this means that if a solution is non-dominated within the whole solution space, it is not possible to improve one of the metrics without worsening at least one of the other metrics. Each multi-objective problem has a set of Pareto-optimal solutions, defined as the set of non-dominated solutions. The set of all non-dominated solutions is the Pareto front [10]. The Pareto front provides a set of solutions that can be chosen depending on the application requirements.
http://www.ibm.com/software/integration/optimization/cplex-optimizer/
Fig. 1. Non-dominated and dominated solutions for a 2-function minimization problem
More precisely, each non-dominated solution represents a different optimal trade-off between the objectives. In this paper, the objective functions of our data collection problem for WSNs are $f^1 = MinMaxE$ and $f^2 = MinRoute$, which are used in the Pareto dominance comparison as in Figure 1. In order to generate Pareto-optimal solutions on the Pareto front, we use the ε-constraint method, which transforms the multi-objective problem into a sequence of parameterized single-objective problems such that the optimum of each single-objective problem corresponds to a Pareto-optimal solution [10]. We thus generate and solve mono-objective optimization problems of the form

$\min f^i(x)$ s.t. $f^j(x) \le \epsilon_j,\ \forall j \neq i$.

The $\epsilon_i$ are chosen such that $f^{i*} \le \epsilon_i$, where $f^{i*}$ corresponds to the optimum value of the mono-objective problem minimizing only objective $f^i$.
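A minimal sketch of the dominance test and the ε-constraint sweep is given below. The function `solve_single_objective(i, eps)` is a placeholder (an assumption, not part of the paper) for solving one parameterized mono-objective problem and returning the resulting objective vector (f1, f2); only the surrounding logic is shown.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a dominates f_b (minimization): no worse in
    every objective and strictly better in at least one."""
    return all(a <= b for a, b in zip(f_a, f_b)) and any(a < b for a, b in zip(f_a, f_b))

def pareto_front(points):
    """Filter a list of objective vectors down to the non-dominated ones."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

def epsilon_constraint_sweep(solve_single_objective, eps_values, objective=1):
    """Generate candidate Pareto-optimal points by minimizing one objective under
    the constraint that the other objective stays below eps, for a range of eps."""
    points = []
    for eps in eps_values:
        # solve_single_objective is assumed to return the (f1, f2) values of the
        # optimum of: min f^objective  s.t.  the other objective <= eps
        points.append(solve_single_objective(objective, eps))
    return pareto_front(points)
```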
5 Performance Evaluation
In this section, we present results obtained by solving our MLP with the ε-constraint method in order to get optimal solutions on the Pareto front. We study networks of between 50 and 250 sensors whose positions are randomly chosen in a square area. The length is normalized so that the WSN is deployed in a unit square area. For each random network computed, we use two policies for the candidate site locations: a regular one and a random one. The first model divides the considered area into equal squares and places one candidate site in the center of each square. In this way, the candidate sites for placing the collection points form a regular grid. The second policy chooses the locations of the candidate sites randomly in the area. We have tested the two policies with {3², 4², 5²} candidate sites. The set of paths considered is comprised of shortest paths between each pair (sensor, candidate site).
5.1 Effect of Candidate Sites and Network Density
To demonstrate the utility of the approach, we generate sets of non-dominated solutions by iteratively solving ε-constrained mono-objective optimization problems. Results are depicted in Figures 2 and 3 for various network sizes and numbers of candidate sites.
Fig. 2. Pareto fronts (MinMaxE (mJ) vs. MinRoute) for random networks with 25 candidate sites and different numbers of sensors |S|: (a) |S| = 100, (b) |S| = 150, (c) |S| = 200, (d) |S| = 250.

Fig. 3. Pareto fronts (MinMaxE (mJ) vs. MinRoute) for random networks with 50 sensors and different numbers of candidate sites |CS|: (a) |CS| = 9, (b) |CS| = 16, (c) |CS| = 25, (d) |CS| = 36.
We can easily see that limiting the energy spent by each sensor node increases the length of the mobile sink route. Moreover, the number of installed gateways strongly depends on the limit on the energy spent (MinMaxE), since this limit strongly restricts the amount of traffic forwarded for other sensors. In particular, when we focus on energy (optimizing only MinMaxE, without any constraint on the number of deployed collection points), the optimal solution of our MLP minimizes the energy spent by each sensor essentially by limiting its forwarding traffic.
Fig. 4. Results for random networks with 25 candidate sites (curves for 50, 100, 150, 200 and 250 nodes): (a) number of installed gateways depending on Emax (MinMaxE, mJ); (b) number of paths through the most loaded sensor depending on the number of installed gateways.
We can thus see that the placement of the collection points ensures that each sensor is a neighbor of its associated gateway (when possible). In order to better analyze the forwarding traffic related to the energy consumed by the sensors, we have computed the number of paths going through the most loaded sensor node for a given placement of collection points. More formally, given the set of paths as an input of our problem, we compute for each sensor i the number of paths containing i whose destination node hosts a placed gateway and whose source node is associated with this gateway:

$Load(i) = \sum_{p \in P \mid i \in p} x_{O(p)D(p)}$.
The most loaded sensor node is therefore the one that has the maximum number of paths going through it: $Load(S) = \max_{i \in S} Load(i)$. Figure 4(b) represents another way of viewing the trade-off between the two objectives, by depicting the maximum number of paths going through the most loaded sensor node as a function of the number of deployed gateways in the WSN. We can see that deploying more gateways makes it possible to limit the amount of forwarding traffic at each sensor. When the number of gateways is large enough so that each sensor is a neighbor of one gateway (when possible), the forwarding traffic becomes null and the sensor spends energy only for sending its own traffic. When the size of the network increases, the energy consumption of the sensor nodes also increases (see Figure 2). Indeed, the total traffic in the WSN is larger, so the sensors have more forwarding traffic to relay, which increases their load. When the energy is limited to Emax = 2 mJ for each sensor, it is worth noting that the length of the mobile sink route also increases with the network size: the average route length of the mobile sink equals respectively 0.33, 0.47, 0.6, and 0.65 for a WSN of 100, 150, 200, and 250 nodes. However, the maximum amount of energy spent by the sensor nodes decreases when the number of candidate sites in the network increases, as depicted in Figure 3. On the one hand, placing a gateway at a candidate site reduces the relaying traffic and, therefore, the energy spent. On the other hand, the length of the mobile sink's route increases, especially when the energy consumed by each sensor is low. This assertion is confirmed by Figure 4(a), which depicts the number of deployed gateways depending on the maximum energy spent by the sensors.
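The Load definition above is straightforward to evaluate on a solved instance; the sketch below assumes paths are stored as (source, destination, relays) tuples and that x maps (source, destination) pairs to the 0/1 assignment values returned by the solver.

```python
def sensor_loads(paths, x):
    """Load(i): number of paths routed through sensor i whose source is assigned
    to the gateway placed at the path's destination (x[(source, dest)] in {0, 1})."""
    load = {}
    for source, dest, relays in paths:
        if x.get((source, dest), 0) == 1:
            for i in relays:
                load[i] = load.get(i, 0) + 1
    return load

def most_loaded(paths, x):
    """Load(S) = max_i Load(i), i.e. the path count of the most loaded sensor."""
    load = sensor_loads(paths, x)
    return max(load.values()) if load else 0
```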
5.2 Graphical Trade-Off Interpretation
The trade-offs among the different metrics are shown using value paths in Figure 5. Value paths have proven to be an effective way to present trade-offs in multi-objective problems [13]. In the figure, there is a vertical axis for each objective. The value assigned to each non-dominated solution on a particular axis is that solution's value divided by the best value possible for that objective, i.e. $f^i / f^{i*}$ for each $i = 1, 2$. The minimum value for each axis is 1.00, corresponding to the optimal mono-objective solutions $f^{1*}$ and $f^{2*}$: the minimum energy consumption for MinMaxE and the minimum possible route length of the mobile sink for MinRoute. For a network of 50 sensors and 9 candidate sites (Fig. 5(a)), the optimal solution for MinMaxE leads to a route 3 times longer than the optimal one. Conversely, having the optimal route length incurs 233 per cent more energy consumption than the best solution. For a fair trade between the two objectives, the best solution would be the one minimizing the maximum value over all the axes: $\min_v \max\{f^1(v)/f^{1*},\ f^2(v)/f^{2*}\}$. The non-dominated solution optimizing the fair trade is thus the one incurring 66 per cent more energy consumption and 100 per cent more route length. This solution is located at the bottom of the curve of the Pareto front, with values MinMaxE = 0.75 mJ and MinRoute = 1 (see Fig. 3(a)). When the number of candidate sites in the network increases, the maximum gap between the optimal mono-objective and the multi-objective values for MinMaxE becomes very large (Fig. 5(b) and 5(c)). The number of possibilities for deploying the gateways grows exponentially, which can lead to very poor results in terms of energy when the route length of the mobile sink is optimized. Value 1 for MinRoute corresponds to value 27.45 for MinMaxE in Figure 5(b). The fair trade non-dominated solution for a 50-sensor network with 25 candidate sites has values 4.3 and 4 for the MinMaxE and MinRoute percentages respectively, with exact values MinMaxE = 0.22 mJ and MinRoute = 1.33 (Fig. 3(c)).
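The fair-trade point used in this analysis (the solution minimizing the larger of the two normalized objectives) can be selected from a computed Pareto set as follows; `front` is assumed to be a list of (MinMaxE, MinRoute) pairs.

```python
def fair_trade(front):
    """Return the Pareto point v minimizing max(f1(v)/f1*, f2(v)/f2*),
    where f1* and f2* are the best values of each objective over the front."""
    f1_star = min(p[0] for p in front)
    f2_star = min(p[1] for p in front)
    return min(front, key=lambda p: max(p[0] / f1_star, p[1] / f2_star))
```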
Fig. 5. Percentage path analysis for delay-energy trade-off (value paths for MinMaxE and MinRoute): (a) |S| = 50, |CS| = 9; (b) |S| = 50, |CS| = 25; (c) |S| = 200, |CS| = 25.
However, when the number of sensors increases while the number of candidate sites stays the same, the scales in the percentage value paths keep the same order of magnitude for MinRoute and do not increase much for MinMaxE, because the routing stays roughly the same with the same number of candidate sites. The fair trade non-dominated solution is MinMaxE = 0.3 mJ and MinRoute = 1.8 for a network with 200 sensors and 25 candidate sites (Fig. 5(c) and 2(c)), which is quite effective. One can remark that the fair trade optimal solution is always closer to the optimal solution of MinMaxE, because when the route length is optimized the energy consumption in the WSN becomes very poor. After analyzing such trade-offs, a network manager can then decide which option is preferred. The network manager can also choose the required number of deployed collection points and can then find the best way to route the data with minimum energy consumption.
5.3 Impact of the Candidate Site Positioning
The location of the candidate sites within the WSN is also important regarding the network lifetime and the delay of data collection. In Table 1, we present results for various topologies in order to compare the two policies of candidate site placement described at the beginning of the section. For each policy (regular grid and random), we compute the optimal values of the two mono-objective optimizations, f1* and f2*. On the one hand, a regular placement of the collection points always saves energy in the WSN, thus increasing the network lifetime. These results are explained by the fact that the distance between a sensor and a gateway can be bounded thanks to the regular placement of candidate sites. On the other hand, a random placement of the collection points usually leads to a shorter route for the mobile sink, thus leading to a better data collection. This behavior is due to possibly useless placements of gateways, since these placements are randomly chosen. Moreover, for the same reason, the average number of deployed gateways among the Pareto optima of the bi-objective optimization (Dpl. Gtw) is smaller for the random policy. However, the reduction of Dpl. Gtw increases the sensors' average load Load(S) (the mean value of Load(S) over the Pareto optima) compared to a regular placement of candidate sites.

Table 1. Comparison between regular and random placement of the candidate sites

Topology        Grid placement                         Random placement
|S|   |CS|      f1* (mJ)  f2*   Dpl. Gtw  Load(S)      f1* (mJ)  f2*   Dpl. Gtw  Load(S)
50    9         0.45      0.5   3.5       8.93         0.45      0.21  2.96      13.24
50    16        0.15      0.4   4.25      8            0.35      0.22  4.63      9.83
50    25        0.05      0.33  4.29      7.59         0.15      0.02  4.93      12.57
100   9         0.45      0.5   3.33      10.2         0.65      0.28  3.71      21.26
100   16        0.35      0.4   4.81      9.87         0.65      0.21  4         21
100   25        0.05      0.33  4.7       9.05         0.25      0.03  4.24      23.74
150   9         0.75      0.5   2.85      17           0.95      0.08  3.05      27.88
150   16        0.45      0.4   4.24      12.76        0.75      0.01  4.41      20.53
150   25        0.15      0.33  4.77      14.24        0.25      0.04  3.76      32.78
200   9         0.85      0.5   3.9       16.3         0.95      0.22  3.17      20.79
200   16        0.25      0.4   4.77      13.5         1.05      0.11  3.53      27.4
200   25        0.05      0.33  4.4       15.84        0.35      0.07  4.43      34.3
When the candidate sites are regularly placed in the area (i.e. the regular policy described in Section 5.1), the sensors' load is significantly reduced in comparison to a completely random placement (see Table 1). This value is at least 20% greater when the gateway placement is performed among randomly chosen candidate sites, leading to more loaded sensors in the WSN.
6 Conclusion
In this paper, we have presented a framework for efficient data collection in wireless sensor networks. We have developed a Multi-objective Linear Program with two metrics to evaluate the trade-off between the maximum amount of energy spent by the sensor nodes and the length of the route of mobile sinks collecting data at collection points that we jointly deploy. We can see from these results that the load of the sensors rapidly decreases with the number of gateways and that, above a given number of deployed gateways, the load remains stable (or decreases slowly). This makes it possible to save energy in the WSN and maximize its lifetime. When the number of deployed gateways is large, the mobile sink collecting data at the gateways has a longer route to perform, therefore increasing the delay between data collection and processing. The proposed model therefore provides a means by which several objectives can be evaluated by a network manager. A fair trade optimal solution is drawn from the obtained Pareto fronts to fairly optimize the metrics. If energy is the major concern, then the network lifetime objective may be favored. If the decision maker wants to reduce the delay of collecting the data to ensure fast processing, he or she may give a higher priority to the route length objective compared to the other objectives. Hence, it is critical to incorporate all the different objectives when modeling the data collection plan in WSNs.
References 1. Barrenetxea, G., Ingelrest, F., Schaefer, G., Vetterli, M.: The hitchhiker’s guide to successful wireless sensor network deployments. In: ACM Conference on Embedded Networked Sensor Systems (ACM SenSys 2008), Raleigh, USA, pp. 43–56 (2008) 2. Ben Saad, L., Tourancheau, B.: Multiple mobile sinks positioning in wireless sensor networks for buildings. In: 3rd International Conference on Sensor Technologies and Applications (SENSORCOMM 2009), Athens, Greece, pp. 264–270 (2009) 3. Capone, A., Cesana, M., Donno, D., Filippini, I.: Deploying multiple interconnected gateways in heterogeneous wireless sensor networks: An optimization approach. Elsevier Computer Communications 33(10), 1151–1161 (2010) 4. Gandham, S., Dawande, M., Prakash, R., Venkatesan, S.: Energy efficient schemes for wireless sensor networks with multiple mobile base stations. In: IEEE GlobeCom, San Francisco, USA, pp. 377–381 (2003) 5. Hart, J.K., Martinez, K.: Environmental sensor networks: A revolution in the earth system science? Earth-Science Reviews (Elsevier) 78, 177–191 (2006) 6. Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Energy-efficient communication protocol for wireless micro sensor networks. In: Hawaii International Conference on System Science (HICSS 2000), Maui, Hawaii, pp. 3005–3010 (2000)
7. Kalpakis, K., Dasgupta, K., Namjoshi, P.: Efficient algorithms for maximum lifetime data gathering and aggregation in wireless sensor networks. Computer Networks (Elsevier) 42, 697–716 (2003) 8. Khadar, F., Razafindralambo, T.: Performance evaluation of gradient routing strategies for wireless sensor networks. In: Fratta, L., Schulzrinne, H., Takahashi, Y., Spaniol, O. (eds.) NETWORKING 2009. LNCS, vol. 5550, pp. 535–547. Springer, Heidelberg (2009) 9. Krishnamachari, B., Estrin, D., Wicker, S.B.: The impact of data aggregation in wireless sensor networks. In: International Conference on Distributed Computing Systems (ICDCS 2002), Washington, DC, USA, pp. 575–578 (2002) 10. Laumanns, M., Thiele, L., Zitzler, E.: An adaptive scheme to generate the pareto front based on the epsilon-constraint method. In: Branke, J., Deb, K., Miettinen, K., Steuer, R.E. (eds.) Practical Approaches to Multi-Objective Optimization, Dagstuhl, Germany. Dagstuhl Seminar Proceedings, vol. 04461 (2005) 11. Luo, J., Hubaux, J.P.: Joint mobility and routing for lifetime elongation in wireless sensor networks. In: IEEE International Conference on Computer Communications (IEEE INFOCOM 2005), Miami, USA, pp. 1735–1746 (2005) 12. Martinez, K., Ong, R., Hart, J.: Glacsweb: a sensor network for hostile environments. In: IEEE SECON, Santa Clara, USA, pp. 81–87 (2004) 13. Miettinen, K.: Graphical illustration of Pareto optimal solutions. In: Tanino, T., Tanaka, T., Inuiguchi, M. (eds.) Multi-Objective Programming and Goal Programming: Theory and Applications, pp. 197–202. Springer, Berlin (2003)
Smart and Balanced Clustering for MANETs

Luís Conceição and Marilia Curado

Dept. Informatics Engineering, Centre for Informatics and Systems, University of Coimbra
{lamc,marilia}@dei.uc.pt
Abstract. Clustering is the most widely used performance solution for Mobile Ad Hoc Networks (MANETs), enabling their scalability to a large number of mobile nodes. The design of clustering schemes is quite complex, due to the highly dynamic topology of such networks. A wide variety of clustering schemes have been proposed in the literature, focusing on different characteristics and objectives. In this work, a fully distributed and clusterhead-free clustering scheme is proposed, namely Smart and Balanced Clustering for MANETs (SALSA). The scheme introduces a new cluster balancing mechanism and a best clustering metric, aiming to provide a reduced maintenance overhead. SALSA was evaluated and compared with the Novel Stable and Low-maintenance Clustering Scheme (NSLOC), featuring topologies with up to 1000 nodes and velocities of 20 meters per second. Results confirmed the performance efficiency of the new scheme, providing stability and low maintenance overhead, even in the largest networks.
Keywords: MANET, distributed clustering, mobility, stability, large networks.
1 Introduction
With the evolution of wireless technologies, there has been an increasingly wide utilization of mobile devices. Mobile networks have become particularly attractive in recent years due to their flexibility at considerably low costs. Wireless is indeed one of the key communication technologies of the future, since it has the potential to allow the connection of all types of mobile devices. MANETs are autonomous systems, capable of self-deployment and maintenance, not requiring infrastructure support for their operation. As a result, the topology of such networks is very dynamic, especially due to the unpredictable behavior of the nodes involved. In this context, numerous clustering schemes have been developed, following different approaches and objectives, such as stability, low maintenance overhead or energy efficiency. Each one attempts to obtain the best efficiency by varying the characteristics of the system, like the usage of clusterheads and gateways, the maximum hop distance between nodes and the location awareness. However, to the best of our knowledge, there is only one clustering scheme aiming at providing a fully distributed cluster structure with no clusterheads, namely the Novel Stable and Low-maintenance Clustering Scheme (NSLOC) [1]. This work proposes an evolution of the NSLOC algorithm,
attempting to further improve its performance, by reducing the control overhead. To accomplish this goal, SALSA introduces a new cluster balancing mechanism and a best clustering metric, capable of choosing suitable joining clusters. The rest of this document is organized as follows. Section 2 discusses the related work. Section 3 describes the SALSA clustering scheme. Section 4 performs an evaluation of SALSA and, finally, Section 5 concludes the article.
2 Related Work
Clustering algorithms can be classified according to different characteristics and objectives [2]. One of the common features in clustering schemes is the utilization of clusterheads (CH) and most of the proposed schemes rely on centralized nodes to manage the clusters structure. The utilization of gateway (GW) nodes is also another important characteristic that is present in the majority of clustering schemes. Other properties of clustering schemes concern the single-hop or multi-hop environments, the multi-homing (MH) support, embedded routing capabilities and location awareness. Combining the possible characteristics, each proposed clustering scheme attempts to accomplish a specific objective. The Stable Clustering Algorithm (SCA) [3] aims at supporting large MANETs containing nodes moving at high speeds by reducing re-clustering operations and stabilizing the network. To meet these requirements, the algorithm is based on the quick adaptation to the changes of the network topology and reduction of clusterhead reelections. In order to avoid a high frequency of clusterheads reelection, the algorithm initially chooses the nodes that best meet some required metrics such as, energy, mobility, connectivity and communication range. The Stability-based Multi-hop Clustering Protocol (SMCP) [4] also builds the cluster structure according to the node connectivity quality. Moreover, this scheme introduces a new methodology (clustercast mechanism) with the purpose of limiting the broadcast of less significant control messages. The K-hop Clustering Protocol (KhCP) [5] protocol is specifically designed to cluster dense MANETs, as it delimits the cluster formation at a specified k-hop distance. In this protocol, clusters are formed on a circle basis, whereas the clusterhead, at the start point, is the centre of the circle. A weight-based clustering scheme, named Distributed Weighted Clustering Algorithm (DWCA), was proposed with the objective to extend the lifetime of the network, by creating a distributed clustering structure [6]. The election of clusterheads is based on the weight value of nodes, which is calculated according to their number of neighbors, speed and energy. The Enhanced Performance Clustering Algorithm (EPCA) [7] is also a weight based clustering solution. Once more, the weight parameters are only taken into account for the selection of the clusterhead. The Connectivity-based Clustering Scheme (CCS) [8] has the purpose of improving the effectiveness, reliability and stability of MANETs. In contrast with most schemes, this solution ignores mobility and energy parameters, focusing only in the cluster organization to achieve its objectives. In order to provide effectiveness and low maintenance, it utilizes a technique of maintaining
Table 1. Comparison of clustering schemes CH GW 1/nhop SCA (2007) Yes Yes 2-hops max. SMCP (2005) Yes Yes n-hop KhCP (2006) Yes No n-hop DWCA (2006) Yes Yes 1-hop EPCA (2010) Yes No n-hop CCS (2008) Yes No n-hop EEMC (2007) Yes No n-hop
MH Main Objective No Large MANETs with high-speed nodes No Yes No No No No
TEDMC (2008) Yes Yes 1-hop
No
OCRP (2007)
Yes Yes 1-hop
No
OCR (2007)
Yes Yes n-hop
Yes
ODGM GN Yes No n-hop (2008) EWDCA Yes No n-hop (2010) NSLOC (2010) No Yes n-hop
No No No
Stable cluster formation Limited overhead for dense networks Stability of the network Performance, with trusting node mechanism Effectiveness, reliability and stability Distributed power consumption, limited control message flooding Stability, relying on trust values and residual energy of nodes Merge clustering phase with routing discovery and data transmission Light control overhead, establishing the cluster structure and routing paths simultaneously Build clusters as foundation for variable types of routing protocols Maintain stable cluster structure with lowest number of clusters Provide stable cluster structure with low control overhead
clusterheads separated by a significant hop distance. Therefore, the probability that two clusterheads come into each other’s transmission range is reduced, decreasing the number of re-clustering operations. Concerning the reliability objective, an intra-connection degree is used to measure the connection quality between a node and the possible clusters that it can join. The Energy Efficient Mobility-sensitive Clustering (EEMC) [9] presents a solution for energy balancing. The main objective of this scheme is to extend the lifetime of the network, by distributing the load amongst nodes and also regarding their mobility. The Trust-related and Energy-concerned Distributed MANET Clustering (TEDMC) [10] is also a scheme driven by energy concerns. TEDMC considers that the most important nodes are the clusterheads, and therefore it elects them according to their trust level and residual energy. In order to keep information about the trust level of nodes, this algorithm maintains and periodically exchanges a reputation rank table, which contains a reputation value and the unique identification of the last node to assign the value in question. Furthermore, TEDMC is substantially different from KhCP, as it only allows 1-hop clusters, thus being less suitable for dense networks. There are also clustering schemes capable of performing route discovery, such as the On-Demand Clustering Routing Protocol (OCRP) and On-Demand Routing-based Clustering (ORC) [11,12]. These schemes are capable of building cluster structures and routing paths on-demand. In these schemes, only the nodes
Smart and Balanced Clustering for MANETs
237
that are necessary to satisfy a routing path are bounded to the cluster structure. The On-Demand Group Mobility-Based Clustering with Guest Node [13] provides a solution with the main purpose of building a cluster structure capable of supporting several types of routing protocols with identical efficiency. Furthermore, it relies in a guest node approach in order to introduce arriving nodes to the network. The Efficient Weighted Distributed Clustering Algorithm (EWDCA) [14] has the major concern of providing scalability for MANETs, by taking into consideration several weight parameters: connectivity, residual battery power, average mobility and distance between nodes. These parameters are used only to elect the most suitable clusterhead, in order to keep an optimal number of clusters, thus providing as much scalability as possible. A Novel Stable and Low-maintenance Clustering Scheme (NSLOC) [1] is a fully distributed clustering scheme, with the main goal of simultaneously providing a low maintenance overhead and network stability. The NSLOC scheme employs a completely distributed approach, not relying on clusterheads, in contrast to most well known clustering schemes. Table 1 shows the main characteristics of the analyzed clustering schemes. One of the main reasons clusterheads are so utilized is due to the simplicity that they provide to the clustering algorithm. Centralizing the power of management on only one node results in a less complex algorithm thus, becoming easier and faster to implement. Nonetheless, clusterheads carry big disadvantages, as they represent bottlenecks and uneven energy consumption in the network, due to the centralized management decisions.
3
Smart and Balanced Clustering for MANETs
SALSA is a fully distributed clustering scheme designed to operate in MANETs. The main purpose of this scheme is to build stable clusters aiming to significantly reduce the control overhead, thus providing a light hierarchical structure for routing. This proposal is designed to build a cluster topology in a distributed fashion, meaning that each node in the network will have the same role, not relying on centralized points, like clusterheads. This scheme introduces a new load-balancing algorithm, which acts progressively along time. During execution, SALSA analyzes the current size of clusters and distributes nodes across them, in order to maintain well balanced clusters. Before the maximum capacity of a cluster is reached, it starts to assign nodes to neighbor clusters or, in cases where this operation is not possible, builds a new cluster to receive excess nodes. With this new scheme, it was intended to reduce, even further, the clustering control overhead. This objective was mainly accomplished by utilizing small and purpose-drive specific messages. As a result, the proposed scheme utilizes five different types of messages, ensuring in most cases, a significant decrease in the amount of transmitted traffic, when compared to NSLOC. 3.1
Node States
In SALSA, nodes can be in one of three distinct states, namely Unclustered, Clustered and Clustered-GW, as shown in Figure 1.
238
L. Concei¸ca ˜o and M. Curado
Unclustered
Lost connection to cluster
One cluster in-range
Multiple clusters in-range
Lost connection to all clusters
Multiple clusters Became in-range
Clustered Only 1 cluster remains in-range
Clustered GW
Fig. 1. Node states
The Unclustered state typically represents a temporary role, as the node is waiting to be assigned to a cluster. In this state, when the node discovers neighbors, it waits a predefined period of time in order to calculate the best candidate cluster to join. Nodes in the Clustered state usually represent the majority of nodes on the network, whereas all in-range nodes must belong to its cluster. Thus, the communication with foreign nodes (i.e. nodes assigned to a different cluster) is performed through gateway nodes. Finally, the Clustered-GW state is assigned to nodes that have in-range foreign nodes, i.e. they must have direct connectivity with at least one different cluster. Thus, they are responsible of forwarding inter-cluster maintenance messages and typically are located on the edge of clusters. State Transitions. The Unclustered state occurs on two different situations: 1. Node isolation - in this case the node does not have any in-range neighbor nodes, therefore cannot create or be assigned to a cluster 2. Cluster transition - the management of clusters occasionally requires nodes to change clusters, due to cluster balancing. In this phase, nodes can be unassigned from a cluster. Unclustered to Clustered. This state occurs when a node becomes aware of an in-range cluster or an unclustered node. In the first situation, the node joins the cluster automatically. However, if the node only detects unclustered nodes, a new cluster is created to adopt the unclustered nodes. Unclustered to Clustered-GW. This transition is similar to the previous, but more than one cluster is discovered. Firstly, the node calculates which is the best, taking several parameters into account: number of in-range nodes for each cluster and the size of clusters. The greater the number of in-range nodes, the stronger connection to the cluster. However, if the size of the cluster is high,
Smart and Balanced Clustering for MANETs
239
possibly close to the maximum allowed, this cluster would be a bad choice. To measure this trade-off, a new metric is utilized (1), namely the best clustering metric (BC), where BCi is the metric value for cluster i. IRNi (1) 2 APi is defined as the number of the available positions in cluster i until it reaches the maximum allowed, i.e. the difference between the maximum allowed number of nodes per cluster and the current number of assigned nodes. IRNi is the number of in-range nodes belonging to the cluster. As a result, the cluster with the higher BC value is chosen by the node. BCi = APi +
Clustered to Clustered-GW. This transition occurs when a node becomes aware of one or more clusters other than its own.

Clustered-GW to Clustered. Whenever a clustered gateway node loses connection with all its foreign clusters, it automatically transitions to the normal clustered state.

Clustered/Clustered-GW to Unclustered. A node becomes unclustered when it willingly disconnects from the network or loses connection with all its neighbor nodes. When this situation occurs, it is necessary to verify the consistency of the cluster, i.e. to guarantee that all home nodes can communicate with each other.
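The transitions above can be summarized as a small table-driven state machine. The sketch below is illustrative only; the event names are our own shorthand and do not come from the SALSA implementation.

```python
# Minimal sketch of the SALSA node-state transitions described above.
UNCLUSTERED, CLUSTERED, CLUSTERED_GW = "Unclustered", "Clustered", "Clustered-GW"

TRANSITIONS = {
    (UNCLUSTERED, "one_cluster_in_range"): CLUSTERED,
    (UNCLUSTERED, "multiple_clusters_in_range"): CLUSTERED_GW,
    (CLUSTERED, "foreign_cluster_in_range"): CLUSTERED_GW,
    (CLUSTERED_GW, "only_own_cluster_remains"): CLUSTERED,
    (CLUSTERED, "lost_all_neighbors"): UNCLUSTERED,
    (CLUSTERED_GW, "lost_all_neighbors"): UNCLUSTERED,
}

def next_state(state, event):
    # Unknown (state, event) pairs leave the node in its current state.
    return TRANSITIONS.get((state, event), state)

assert next_state(UNCLUSTERED, "one_cluster_in_range") == CLUSTERED
assert next_state(CLUSTERED_GW, "lost_all_neighbors") == UNCLUSTERED
```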
3.2 Maintenance Information and Messages
This subsection describes the information that each node maintains and the messages utilized. There are two tables providing insight into the network topology, namely the NODE TABLE and the CLUSTER TABLE.

NODE TABLE. This table keeps all the information about neighbor and home nodes, as described in Table 2.

Table 2. Node maintenance information
Node ID: Unique identifier of the node
State: Current state of the node (Unclustered, Clustered or Clustered-GW)
C-Degree: Value used to determine the connection type towards this node. Values range from 0 to 5, where 0 represents a non-neighbor (therefore merely a home node), 1 denotes a lost connection towards this node, and values 2-5 represent the quality of the connection, 5 being the best possible connection
Alive: Boolean value determining whether the node is responding or not
SALSA relies on multiple small, purpose-driven messages to manage the cluster structure. All messages contain one common field, Type ID, which uniquely identifies the message type being transmitted. Apart from this field, the messages contain different sets of fields, suitable to their purpose, as follows:
– Ping - a periodic broadcast message, allowing nodes to discover their neighborhood
– Hello - provides the structure of the cluster to member nodes
– Lost Hello - broadcasted when a node loses connection with a neighbor home node, informing member nodes that do not have a direct connection about a possibly disconnected node. This event triggers a process to verify whether the node is still connected via other nodes, namely the alive check process. At the end of this process, if it is verified that the node is in fact disconnected, it is necessary to verify whether the cluster is still consistent, which implies the use of the message described next (Alive Hello)
– Alive Hello - broadcasted upon the trigger of an alive check process to verify the consistency of the cluster, i.e. to guarantee that all nodes inside the cluster are capable of communicating with each other. In most situations the cluster remains consistent; however, there are rare cases in which the cluster becomes partitioned into two clusters. In this particular situation, both clusters have the same identifier, and thus it becomes imperative to change it.
– Switch Hello - used when a cluster identifier becomes inconsistent and it is necessary to change the Cluster ID for its nodes.
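A possible in-memory representation of these five message types is sketched below; apart from the common Type ID, the field names are assumptions chosen for illustration rather than the actual SALSA packet layout.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Message:
    type_id: int       # common Type ID field identifying the message type
    sender_id: int

@dataclass
class Ping(Message):           # periodic neighborhood-discovery broadcast
    cluster_id: Optional[int] = None

@dataclass
class Hello(Message):          # carries the structure of the cluster
    cluster_id: int = 0
    members: Tuple[int, ...] = ()

@dataclass
class LostHello(Message):      # reports a possibly disconnected home node
    lost_node_id: int = 0

@dataclass
class AliveHello(Message):     # used during the alive check process
    checked_node_id: int = 0

@dataclass
class SwitchHello(Message):    # announces a new Cluster ID after a partition
    old_cluster_id: int = 0
    new_cluster_id: int = 0
```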
3.3 Complexity Analysis
In this section, an analysis of the overhead introduced by SALSA is performed. The scheme operations can be classified as follows:
– Overhead due to Ping messages (OH_Pg)
– Overhead due to Cluster Formation (OH_CF)
– Overhead due to Cluster Maintenance (OH_CM)
As previously described, SALSA utilizes a Ping message mechanism so that nodes are able to discover their neighborhood. Since the broadcast of these messages is constant during the execution of the algorithm, it must be analyzed apart from the remaining operations. The network model of SALSA for the analysis of the clustering overhead relies on the following parameters:
– N = the number of nodes in the entire network
– M = predefined constant defining the maximum allowed number of nodes per cluster
– t_ping = predefined period of time for Ping message broadcast
– t_formation = predefined period of time for initial cluster formation (join or cluster creation)
– t_join = predefined period of time for the node join operation
– t_change = predefined period of time for a node cluster change
– t_alive = predefined period of time to determine the status of nodes
Ping Overhead. In SALSA, Ping messages are broadcasted periodically. This process implies an overhead of t_ping N messages per time step. Since t_ping is a constant predefined by the algorithm, the overhead of the Ping messages is OH_Pg = O(N) per time step.
Cluster Formation Overhead. In the cold start of SALSA, where all nodes in the network are unclustered, each node waits a predefined period of time,
whether to create a new cluster or to join a recently created one. Thus, before being assigned to the cluster structure, a node must wait at least a t_formation period of time. Following this procedure, several Hello messages will be broadcasted to it, providing the necessary information about its cluster. The number of Hello messages broadcasted is equal to the number of 1-hop neighbors of the node inside its cluster. Thus, in a worst-case scenario, a recently clustered node will receive M Hello messages. The Hello messages are broadcasted simultaneously, and therefore this process takes only 1 time step, which adds up to t_formation + 1. Analyzing the complexity for the entire network, the overhead is (t_formation + 1)MN, resulting in O(MN). However, since the formation process only occurs once during the entire execution, and not constantly as for Ping messages, the total overhead is OH_CF = O(1).
Cluster Maintenance Overhead. The maintenance of clusters is divided into two main routines, namely the joining of a new node and the leaving of a node. These two events are responsible for triggering all the operations that manage the cluster structure.
Joining of a New Node. When a node joins a cluster, two operations may be triggered, namely the auto-balancing of clusters or the creation of a new cluster, due to the imposed maximum number of nodes per cluster. In most cases, the node simply joins a cluster without requiring these operations; however, for the complexity analysis, the worst-case scenario must be considered. When a node wishes to join the cluster structure, it waits a predefined period of time t_join in order to discover the neighborhood and to choose the most suitable cluster. Upon choosing its cluster, the node assigns itself to it and receives a Hello message from a member node, similarly to the initial phase of cluster formation. Thus, the join operation alone has a complexity of t_join N for the entire network, and an overhead of O(N) per time step. The auto-balancing mechanism may be triggered once a node joins a cluster, which requires a node to be assigned to a different cluster. In this process, the node waits a random amount of time, no longer than a predefined period t_change. When this time expires, the node emits a Hello message, informing its former members that it is no longer assigned to that cluster. This process implies a time complexity of (t_change + 1)N, which results in an overhead of O(N) per time step. The creation of a new cluster is also an operation that can be triggered by the join operation, when auto-balancing is not possible. This operation is executed before the new node joins the cluster. Since the operation does not affect the topology of existing clusters, it introduces no additional message complexity, as the existence of the new cluster is announced using the regular Ping messages. In short, it will only cost the period of time t_join, resulting in an overhead of O(1) per time step. Summarizing, the overhead of the joining of a new node is O(N) + O(N) + O(1), which results in O(N).
Leaving of a Node. When a clustered node detects that it no longer has a connection to one of its member neighbors, it broadcasts a Lost Hello message. Upon the reception of this message, each node waits a predefined period of time t_alive
and broadcasts an alive message. This process results in a message complexity of t_alive M N, which implies an overhead of O(N), since M is a constant predefined by the algorithm. After this process, as the cluster may have lost its consistency, a Switch Hello message is broadcasted to build two new clusters. In the worst-case scenario, M messages are broadcasted, resulting in a complexity of (t_alive M + M)N, with an overhead of O(N).
Total Maintenance Overhead. As analyzed above, the overhead of the joining of a new node is O(N) and that of the leaving of a node is also O(N), which results in a maintenance overhead of OH_CM = O(N).
Total Clustering Overhead. Summarizing this analysis, the total overhead is denoted by OH_C = OH_Pg + OH_CF + OH_CM, which results in OH_C = O(N) + O(1) + O(N). Consequently, SALSA has a total clustering overhead of O(N) per time step.
4 Simulation Evaluation
To properly examine the effectiveness of SALSA, a simulation evaluation, driven by the main objectives of the scheme, was performed using the OPNET Modeler [15]. The main purpose of this evaluation is to assess the stability and low-overhead capabilities of SALSA. To accomplish this objective, a set of different simulation environments, varying the network size and the speed of nodes, was defined.
4.1 Environment and Parameters
The performance of clustering schemes is strongly influenced by the scenarios under which they are evaluated. For instance, a better performance is expected for low-density networks (i.e. a low number of nodes per km²) or with nodes moving at low speeds. The scenarios used for the SALSA evaluation were selected in such a way that they represent, as much as possible, realistic scenarios. For this specification, the evaluation parameters can be divided into two groups, the
fixed-value and the variable-value parameters, according to whether their value changes for different simulation scenarios (Table 3).

Table 3. Simulation parameters
Fixed-value parameters:
  Simulator: OPNET Modeler 16.0
  Field size (m²): 5000 × 5000
  Node mobility algorithm: Random Waypoint Model
  Pause time (s): 50
  Transmission range (m): 150
  Bandwidth (Mbps): 11
  Simulation time (s): 900
Variable-value parameters:
  Network size (number of nodes): 200; 400; 600; 800; 1000
  Node maximum speed (m/s): 0; 5; 10; 15; 20

Given the enormous quantity of different possible scenarios that the combination of parameters provides, only the most significant were chosen. In particular, the parameters that most influence the scalability of the network are the network size (number of nodes) and the maximum speed that nodes can achieve. Considering the vast range of applications that clustering can have, and that this simulation study aims to evaluate a generic scenario, a specific node mobility pattern such as the Group Mobility, Freeway or Manhattan models would not be suitable [16]. Thus, a random model, the Random Waypoint, was preferred. Also, for a simulation of 900 seconds, a 50-second pause time was chosen. Each simulation execution was repeated 30 times, assigning a distinct seed value to each.
4.2 Results
This section presents the results obtained from the SALSA simulations. As previously mentioned, SALSA is a completely new algorithm based on the NSLOC scheme. Thus, the following results are discussed in comparison with the results obtained by NSLOC.
Number of Clustered Nodes. This metric provides the number of nodes that are associated with the cluster structure. Figure 2a shows the percentage of clustered nodes for the different network sizes and node speeds in SALSA. The percentage of clustered nodes for large networks is higher than for smaller networks. Naturally, this occurrence is strongly tied to the density of the network, i.e. the probability of a node being within communication range of another is greater for networks with more nodes. When compared to the NSLOC scheme (Figure 2b), the new proposal is capable of assigning more nodes to the cluster structure, especially in larger networks. This slight difference is due to the new node state transition specification, as it analyzes the most suitable clusters based on the best clustering metric.
Fig. 2. Amount of clustered nodes (in percentage): (a) SALSA, (b) NSLOC
Fig. 3. Average network load: (a) SALSA, (b) NSLOC
Network Load. The network load represents the received and transmitted traffic in the network. This metric reflects the overall load on the network, including the clustering control overhead. Figure 3a and Figure 3b show the average network load, for different velocities and network sizes, for SALSA and NSLOC, respectively. As shown in the charts, SALSA handles clustering with a significantly lower overhead. In fact, the biggest difference between the two schemes lies exactly in the amount of traffic necessary to handle clustering. As described in Section 3, SALSA only sends the exact required information, utilizing specific message types. As a result, the amount of traffic necessary to manage the cluster structure is considerably lower than in NSLOC. Another important aspect of these results is the consistency of the traffic volume. In NSLOC, the amount of traffic increases significantly as the maximum speed of nodes increases. SALSA, on the other hand, utilizes almost the same amount of traffic for different speeds, excluding the case when nodes are static. Furthermore, for a network size of 1000 nodes, the average network load for the different speeds seems to converge, utilizing almost the same amount of traffic, which indicates that SALSA is sustainable.

Number of Messages. The number of messages required by the clustering scheme to operate increases with the size of the network. Figure 4a shows the average number of control messages sent by SALSA for the different network sizes and node speeds. As expected, the shape of this chart is quite similar to that of the network load metric, since the number of messages sent is strongly tied to the overall network load. Figure 4b shows the evaluation results of NSLOC. The interesting fact, though, is that the average number of messages sent by SALSA is not much lower than that of NSLOC, in spite of the significant difference obtained in the average load results. Once more, this is due to the smaller, and more specific, messages utilized by the new scheme.

Other Evaluation Metrics. In addition to the previous evaluation parameters, there are important metrics that must not be ignored, such as the balancing of clusters and their stability. However, due to lack of space, the complete results are not presented here. Concerning the first, the results are quite acceptable,
Fig. 4. Average number of messages sent: (a) SALSA, (b) NSLOC
having, for instance, an average of 23 clusters for the 1000-node network with static nodes (0 m/s). In this scenario, an average total of 944 nodes were clustered, resulting in an average of around 41 nodes per cluster. Moreover, in the 1000-node network with a node maximum speed of 20 m/s, an average of 32 clusters were created, whereas around 858 nodes were clustered. This represents an average of 27 nodes per cluster, which is acceptable given the high velocity. Note that in all evaluations SALSA was configured with a maximum of 50 allowed nodes per cluster. The stability of clusters can be measured according to the amount of time that nodes belong to a cluster without suffering re-clustering operations. For this analysis, a cluster stability metric is utilized, which defines a stability time (ST) from which nodes are considered to be stable:

ST = k × (r × p) / (v × d)    (2)

where r is the transmission range of the nodes, p is the pause time, v the average node speed (mean value of the minimum and maximum speed), d the density of nodes (number of nodes per km²) and, finally, k represents an arbitrary constant, equal in all simulation executions. For the network size of 1000 nodes and no mobility (0 m/s), the results show an average of 894 stable nodes, meaning that these nodes spent at least ST time without changing cluster. Also, for a network size of 1000 nodes with a maximum speed of 20 m/s, a total of 804 nodes were considered stable.
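A minimal sketch of Equation (2) is given below; the parameter values are taken from the simulation setup (transmission range, pause time, field size) while k is arbitrary, as in the paper.

```python
# Sketch of the stability-time computation ST = k * (r * p) / (v * d).

def stability_time(r, p, v, d, k=1.0):
    """r: transmission range (m), p: pause time (s),
    v: average node speed (m/s), d: node density (nodes per km^2)."""
    return k * (r * p) / (v * d)

# 1000 nodes on a 5 km x 5 km field -> 40 nodes/km^2; a maximum speed of
# 20 m/s gives an average speed (mean of minimum and maximum) of 10 m/s.
print(stability_time(r=150, p=50, v=10.0, d=40.0))
```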
5 Conclusion
In this paper, the SALSA scheme was proposed, aiming to improve the performance of large MANETs. The proposed clustering scheme employs a fully distributed approach, in contrast to most well-known clustering schemes. A new cluster-balancing mechanism and a best clustering metric are employed by SALSA, providing a reduced control overhead.
Evaluation results show that SALSA is capable of outperforming NSLOC in the majority of the scenarios, particularly for network sizes greater than 400 nodes. The most noticeable difference is in the traffic overhead, which is significantly lower (reduced by around 45% for all evaluated network sizes) than with the NSLOC scheme.
Acknowledgments. This work was supported by FCT project MORFEU (PTDC / EEA-CRO / 108348 / 2008). The authors would like to thank the OPNET University Program for the licenses provided for the OPNET Modeler Wireless Suite.
References
1. Conceição, L., Palma, D., Curado, M.: A novel stable and low-maintenance clustering scheme. In: Proceedings of the 2010 ACM Symposium on Applied Computing, SAC 2010, pp. 699–705. ACM, New York (2010), http://doi.acm.org/10.1145/1774088.1774232
2. Yu, J.Y., Chong, P.H.J.: A survey of clustering schemes for mobile ad hoc networks. IEEE Communications Surveys & Tutorials 7(1), 32–48 (2005)
3. Tolba, F., Magoni, D., Lorenz, P.: A stable clustering algorithm for highly mobile ad hoc networks. In: Second International Conference on Systems and Networks Communications, ICSNC 2007, p. 11 (2007)
4. Tenhunen, J., Typpo, V., Jurvansuu, M.: Stability-based multi-hop clustering protocol. In: IEEE 16th International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2005, vol. 2, pp. 958–962 (September 2005)
5. Angione, G., Bellavista, P., Corradi, A., Magistretti, E.: A k-hop clustering protocol for dense mobile ad-hoc networks. In: International Conference on Distributed Computing Systems Workshops, p. 10 (2006)
6. Choi, W., Woo, M.: A distributed weighted clustering algorithm for mobile ad hoc networks. In: International Conference on Internet and Web Applications and Services/Advanced International Conference on Telecommunications, AICT-ICIW 2006, pp. 73–73 (February 2006)
7. Zoican, R.: An enhanced performance clustering algorithm for MANET. In: 2010 15th IEEE Mediterranean Electrotechnical Conference, MELECON 2010, pp. 1269–1272 (April 2010)
8. Mai, K.T., Choo, H.: Connectivity-based clustering scheme for mobile ad hoc networks. In: IEEE International Conference on Research, Innovation and Vision for the Future, RIVF 2008, pp. 191–197 (July 2008)
9. Wei, H.A.D., Chan, Chuwa, E.L., Majugo, B.L.: Mobility-sensitive clustering algorithm to balance power consumption for mobile ad hoc networks. In: International Conference on Wireless Communications, Networking and Mobile Computing, WiCom 2007, pp. 1645–1648 (September 2007)
10. Qiang, Z., Ying, Z., Zheng-hu, G.: A trust-related and energy-concerned distributed MANET clustering design. In: 3rd International Conference on Intelligent System and Knowledge Engineering, ISKE 2008, vol. 1, pp. 146–151 (2008)
11. Huang, C., Zhang, Y., Jia, X., Shi, W., Cheng, Y., Zhou, H.: An on-demand clustering mechanism for hierarchical routing protocol in ad hoc networks. In: International Conference on Wireless Communications, Networking and Mobile Computing, WiCOM 2006, pp. 1–6 (2006)
12. Hsu, C.-H., Feng, K.-T.: On-demand routing-based clustering protocol for mobile ad hoc networks. In: IEEE 18th International Symposium on Personal, Indoor and Mobile Radio Communications, PIMRC 2007, pp. 1–5 (September 2007)
13. Dana, A., Yadegari, A.M., Salahi, A., Faramehr, S., Khosravi, H.: A new scheme for on-demand group mobility clustering in mobile ad hoc networks. In: 10th International Conference on Advanced Communication Technology, ICACT 2008, vol. 2, pp. 1370–1375 (February 2008)
14. Hussein, A., Yousef, S., Al-Khayatt, S., Arabeyyat, O.: An efficient weighted distributed clustering algorithm for mobile ad hoc networks. In: 2010 International Conference on Computer Engineering and Systems, ICCES (2010)
15. OPNET, OPNET simulator (1986), http://www.opnet.com/
16. Divecha, B., Abraham, A., Grosan, C., Sanyal, S.: Impact of node mobility on MANET routing protocols models (2007)
Promoting Quality of Service in Substitution Networks with Controlled Mobility Tahiry Razafindralambo1, Thomas Begin3, Marcelo Dias de Amorim2, Isabelle Guérin Lassous3, Nathalie Mitton1, and David Simplot-Ryl1 1
INRIA Lille – Nord Europe {tahiry.razafindralambo,nathalie.mitton,david.simplot-ryl}@inria.fr 2 UPMC Sorbonne Universités
[email protected] 3 Université Lyon 1 - LIP (UMR ENS Lyon - INRIA - CNRS - UCBL) {thomas.begin,isabelle.guerin-lassous}@ens-lyon.fr
Abstract. A substitution network is a rapidly deployable backup wireless solution to quickly react to network topology changes due to failures or to flash crowd effects on the base network. Unlike other ad hoc and mesh solutions, a substitution network does not attempt to provide new services to customers but rather to restore and maintain at least some of the services available before the failure. Furthermore, a substitution network is not deployed directly for customers but to help the base network to provide services to customers. Therefore, a substitution network is not, by definition, a stand-alone network. In this paper, we describe a quality of service architecture for substitution networks and discuss provisioning, maintenance, as well as adaptation of QoS inside and between the base and the substitution networks. Keywords: QoS, substitution network, mobility.
1 Introduction
Access and metropolitan networks are much more limited in capacity than core networks. While the latter operate in over-provisioning mode, access and metropolitan networks (called hereafter the base network) may experience high overload due to the evolution of the traffic (e.g. flash crowd) or failures (e.g. network outage). Whenever possible, the base network is equipped with a backup network that restores the services to the subscribers in case of failure. In this paper, we focus on the case where no such backup network exists, and a temporary solution must then be quickly deployed. The base network may be any access network or metropolitan network including wired and wireless technologies (such as the telephone network, Internet cabling, and TV network). Troubles may come from a surge in the traffic inside a network that causes the network to be virtually unreachable, an equipment failure, or a power outage. A case in point was the disruption of telephone and Internet services experienced by counties of southern Santa Clara and Santa Cruz, California, in 2009, as vandals intentionally cut fiber optic cables
(Figure 1 shows employees splicing the damaged cables) [7]. The outage initially affected some cell phones, Internet access, and about 52,200 household land lines. The point to highlight in this case is that the operator spent about 12 hours to repair one single cable, thereby restoring a few priority services, and more than 18 hours to restore all services. Besides physical failures, the dramatic growth of Internet users, mobile devices and network services leads to a steady increase of the traffic workload, whose volume may, in some cases, hamper the overall quality of networking communications. An interesting study has been performed by Nemertes Research, which states that, though the capacity in the core will be enough to support Internet traffic in the near future, the workload level may rapidly exceed the access line capacities [8]. Clearly, increasing the capacity resources in the access part of base networks requires replacing some of the current technologies, and hence requires much time. Meanwhile, a substitution network can be viewed as a practical alternative to respond to punctual and temporary needs. The approach behind a substitution network is to deploy, for a given space and time domain, a wireless network made up of mobile routers (mobile substitution routers) so as to keep the base network operational. Once deployed, the mobile substitution routers establish new traffic routes that can be used by the base network to afford alternative communication channels to affected subscribers. Thanks to the use of controlled mobility, mobile substitution routers can move to adapt their topology to geographical obstacles, to avoid wireless contention zones, or to react to traffic evolution and QoS requirements.

Fig. 1. Crew splicing fiber-optic cables

Upon deployment of the substitution network, network services can be restored. However, the capacity of the substitution network is likely to be smaller than the capacity of the base network. It is therefore important to control the traffic going through the substitution network, and this implies setting up QoS policies for on-going and incoming flows, such as admission control, prioritizing mechanisms, etc. Providing QoS to subscribers is clearly an issue that must be handled in an end-to-end fashion. In our scenario, QoS requirements must encompass different networks with variable performance characteristics. To cope with this heterogeneous environment, the proposed QoS architecture includes, in addition to mobile substitution routers, a second type of component, namely the bridge router. Bridge routers lie in between the base network and the substitution network and are accountable for provisioning, maintenance, and adaptation of QoS inside and between the base network and the substitution network. On the other hand, mobile substitution routers are only dedicated to the substitution network. This proposed architecture is inspired by the QoS-Architecture proposed by Campbell et al., since we use the same layered architecture with vertical planes [3]. We also use a mix of both wired and wireless technologies as proposed in the
DAIDALOS project [5]. However, the fundamental difference between these two works and ours is the use of the mobility primitive inside the QoS architecture. The rest of this paper is organized as follows. In Section 2, we present the terminology and describe the general architecture of a substitution network. The system and architecture requirements are described in Section 3, while we detail in Section 4 the QoS architecture. In Section 5 we discuss the specification and usage of the system and in Section 6 we describe in detail the operations executed inside the QoS architecture. In Section 7, we provide a discussion on the mobility usage and on the monitoring system. We finally conclude the paper and provide some research directions in Section 8.
2 System Overview
Access networks or metropolitan networks are mainly used to connect users to the Internet. These base networks are based on wired and/or wireless technologies and may provide QoS to the users. As explained previously, a substitution network may be used to help the base network keep providing the services for which it was deployed, assuming there is no backup infrastructure. The interconnection between the base network and the substitution network mainly consists of two types of nodes:
1. Bridge Routers, which are connected in between the base network and the substitution network, and which are used to forward the traffic from the base network to the substitution network and vice versa;
2. Mobile Substitution Routers, which are mobile wireless routers of the substitution network, possibly connected to Bridge Routers, and whose union provides alternative path(s) to the base network.
The deployment of the substitution network for a base network involves the positioning of bridge routers. This placement can be done during the building of the base network or on demand (when extra resources are needed). In this paper, we assume that a set of bridge routers has already been installed in the base network. This is a realistic assumption since it only requires adding a wireless interface to some routers of the base network and some simple software modifications, as we will describe later. In order to reduce the human intervention needed for the deployment of the substitution network, we assume that Mobile Substitution Routers have motion capabilities and a positioning system. This assumption is feasible by using robots such as Wifibots (see Figure 2) as mobile substitution routers.

Fig. 2. Wifibot (www.wifibot.com)
Fig. 3. Typical use case for a base network and a substitution network. Bridge Routers are deployed together with the base network. In case of failure, the mobile substitution routers are deployed or self-deployed to form a substitution network that helps the base network in restoring basic services such as connectivity.
Figure 3 shows an example of a possible usage of a substitution network. In this figure, the bridge routers are deployed together with the base network (Fig. 3a). In this example, the base network operates without the help of the mobile substitution routers. In case of failure (Fig. 3b), the mobile substitution routers are deployed. In our architecture, the failure detection and the deployment are done autonomously by the base network itself. Mobile substitution routers try to find an optimal position to restore the connectivity service [11] and to ensure QoS to some flows (Fig. 3c). In some cases, redeployment may be necessary to improve QoS or to adapt to evolving network (base network and/or substitution network) conditions (e.g. change in the traffic, appearance of interferences) (Fig. 3d).
3 Entities
In this section, we describe the specification of the network and its QoS architecture as well as the components and functionalities for the substitution network. This specification is independent of any hardware implementation. Moreover, we give some clues on how and where to implement these functionalities. All
the described components and functionalities are rated from 1 to 3, where ❸ is mandatory, ❷ is strongly recommended and ❶ is optional. The logical system component of a bridge router is represented in Figure 4(a). The bridge router includes at least two network interfaces: (i) one wireless interface to connect to the substitution network and (ii) one wired (or wireless) interface for the connection to the base network. It also includes different functionalities such as:
– Monitoring ❸. The monitoring building block keeps track of flows that cross the bridge router. For example, the bridge router monitors the number of flows passing through its interfaces and tries to detect anomalies on each of its interfaces.
– Mobility Engine ❸. Based on the monitoring results, the mobility engine can request the self-deployment of mobile substitution routers in between bridge routers. The substitution engine can also send an end-of-deployment command to the mobile substitution routers if the substitution network is no longer seen as useful. The mobility engine is the core of our architecture, since it also gives the mobile substitution routers the information needed for self-configuration.
– Routing Conversion Process ❸. The routing processes of the base network and the substitution network can be different (e.g. IPv4 and IPv6). In this case, a routing conversion process must be set up. For example, classic IP routing can be used in the base network while a geographic routing protocol can be used in the substitution network. In this case, a specific process such as encapsulation must be set up to deal with this difference.
– QoS
  • Traffic classification ❷. Before entering into the substitution network, every single flow is assigned to a given class that will determine how its packets will be handled within the substitution network.
  • Admission Control ❷. We assume that the substitution network capacity is smaller than the base network capacity. Admission control aims at improving the overall quality of communications by preventing congestion in the network.
  • Traffic control ❶. The rates of the flows are limited using primarily traffic shaping, queue management and scheduling techniques.
The logical system component of a Mobile Substitution Router is given in Figure 4(b). The Mobile Substitution Router includes at least one wireless network interface to connect to the Substitution Network. It also includes a localization system, such as GPS, and different functionalities such as:
– Mobility Engine ❸. The mobility engine is used to control the mobility of the mobile substitution router. The decision taken by the mobility engine is linked to the Substitution Engine on the bridge router but can also be used autonomously by the mobile substitution router to self-position in the network.
Fig. 4. Substitution Network Components: (a) Bridge Router node component, (b) Mobile Substitution Router node component
– Monitoring ❸. Monitoring is used to provide the mobile substitution router with information about its local status as well as the surrounding network status. The monitoring results can be used as an input for the Motion Engine to take proper decisions in order to improve network performance or QoS.
– Routing Process ❸. A routing process must be set up inside the substitution network. This routing process can be specific to substitution networks or can be a standard routing protocol.
– QoS
  • Routing Layer ❷. The routing process inside the substitution network can be QoS-aware. Different routes or paths can be used depending on the preferential delivery service. Multi-path routing or routing with service differentiation can be implemented.
  • MAC Layer ❷. A MAC layer with QoS support can be used. Such a MAC layer may include scheduling and queue management mechanisms in order to provide traffic differentiation.
  • PHY Layer ❶. At the physical layer, the Motion Engine can be used to avoid geographical zones with high levels of interference. If available, multi-interface, multi-channel or smart antenna techniques can be used.
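As an illustration only, the building blocks listed above could be grouped into the two node types as in the following sketch; the class, attribute and method names are our own shorthand, not taken from an actual implementation of the architecture.

```python
class BridgeRouter:
    """Connects the base network to the substitution network."""
    def __init__(self, monitoring, mobility_engine, routing_conversion,
                 classifier, admission_control, traffic_control=None):
        self.monitoring = monitoring                   # mandatory
        self.mobility_engine = mobility_engine         # mandatory
        self.routing_conversion = routing_conversion   # mandatory
        self.classifier = classifier                   # strongly recommended
        self.admission_control = admission_control     # strongly recommended
        self.traffic_control = traffic_control         # optional

    def handle_new_flow(self, flow):
        flow_class = self.classifier.classify(flow)
        status = self.monitoring.report()
        return self.admission_control.decide(flow, flow_class, status)


class MobileSubstitutionRouter:
    """Mobile wireless router forming the substitution network."""
    def __init__(self, mobility_engine, monitoring, routing, qos_layers=()):
        self.mobility_engine = mobility_engine   # mandatory
        self.monitoring = monitoring             # mandatory
        self.routing = routing                   # mandatory
        self.qos_layers = qos_layers             # routing/MAC/PHY QoS, optional

    def periodic_step(self):
        status = self.monitoring.report()        # local and neighborhood state
        self.mobility_engine.maybe_relocate(status)
        return status
```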
4 QoS Architecture
The purpose of the QoS architecture [1] is to 1) configure, predict, and maintain the requested QoS between two bridge routers that are inter-connected by the substitution network; 2) provide a transparent QoS management. QoS mechanisms are integrated into the low levels of the protocol stack (e.g. physical, MAC and routing layers), and are not implemented at the user or application level; 3) maintain and adapt the QoS to the existing conditions of the network. As stated earlier, the substitution network is a wireless network with mobile entities, and
this mobility will be used in an autonomous way by the mobile substitution routers to improve the QoS when needed. Figure 5 describes the overall QoS architecture, the entity specifications and the relationships between the functionalities. This figure shows that, for bridge routers, the functionalities are fed by the network stack. These functionalities send instructions, recommendations or requests to the mobile substitution routers. We can also see the relationship between the functionalities and the protocol stack of each Mobile Substitution Router. QoS parameters are monitored and the monitoring results are then used to possibly take mobility decisions with the aim of improving the performance at each layer of the protocol stack, and thereby increasing QoS performance. Moreover, mobile substitution routers can also send their monitoring results to the bridge routers to improve the decisions taken at the bridge router level.
Fig. 5. Substitution Network QoS Architecture
Our architecture is based on the QoS-A architecture proposed by Campbell et al. in [3]. The QoS-A is a layered architecture of services and mechanisms for QoS management and control of multimedia flows. Like the QoS-A architecture, the substitution network QoS architecture proposed in this paper is layered and has different vertical planes linked to the different protocol layers. It also has QoS adaptation, admission control, etc. However, the fundamental differences between the two architectures are: – Our architecture does not include the flow concept inside the substitution network. In our architecture, the flow concept ends at the Bridge Router level, before entering into the substitution network. This is due to the computation time and memory constraints of the substitution network.
– We do not create a middleware plane to manage the multimedia traffic.
– The substitution network considered in our architecture can be highly dynamic; therefore, our QoS architecture includes a mobility management plane.
5 QoS Primitives
QoS primitives correspond to the different processes used inside the QoS architecture. We detail here the key primitives.
5.1 Traffic Classification
The traffic classification aims at gathering flows together according to some of their features. Thus, before entering into the substitution network, every single flow is assigned to a given class that will determine how its packets will be handled within the substitution network. We describe the two main features used to classify flows, namely:
1. The flow type, which basically determines the constraints associated with the flow. For instance, real-time flows need firm guarantees in terms of performance while non-real-time flows may only need a best-effort service. The flow type also includes information related to the flow pattern. For instance, flows coming from telnet applications and FTP applications differ widely in mean duration, packet size, burstiness, etc.
2. The flow priority, which indicates the degree of importance. For instance, a communication issued by some subscribers may obviously be seen as less urgent than others (e.g. operator, police).
Table 1 shows a possible way to classify incoming flows. Note that the description of the underlying classifier is out of the scope of this paper.

Table 1. A typical example of flow classification
(Flow Type | Traffic shapes and characteristics | Tolerance to Loss / Delay / Jitter | Traffic Priority | Application Example)
Network Control and Signaling | Variable size packets, mostly inelastic short messages, but traffic can also burst | Low / Low / Medium | High | Network Routing
IP Telephony | Fixed-size small packets, constant emission rate, inelastic and low-rate flows | Low / Low / Low | High | Phone
Multimedia Streaming | Variable size packets, elastic with variable rate | Medium / Low / Low | Medium | Video Streaming
High-Throughput Data | Variable rate, bursty long-lived elastic flows | Low / High / High | Low | FTP
Low-Latency Data | Variable rate, bursty short-lived elastic flows | Low / Medium / High | Low | WEB App.
Low-Priority Data | Non-real-time and elastic | High / High / High | Low | Other
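The classification of Table 1 could be encoded as a simple lookup, as in the hedged sketch below; the class keys, the handling of subscriber priority and the returned fields are illustrative assumptions, not part of the architecture's specification.

```python
# Hypothetical classifier mirroring Table 1.
CLASSES = {
    # class name: (traffic priority, tolerance to (loss, delay, jitter))
    "network-control":      ("high",   ("low", "low", "medium")),
    "ip-telephony":         ("high",   ("low", "low", "low")),
    "multimedia-streaming": ("medium", ("medium", "low", "low")),
    "high-throughput-data": ("low",    ("low", "high", "high")),
    "low-latency-data":     ("low",    ("low", "medium", "high")),
    "low-priority-data":    ("low",    ("high", "high", "high")),
}

def classify(flow_type: str, subscriber_priority: str = "normal") -> dict:
    """Assign an incoming flow to a class and an effective priority."""
    priority, tolerance = CLASSES.get(flow_type, CLASSES["low-priority-data"])
    if subscriber_priority == "urgent":     # e.g. operator or police traffic
        priority = "high"
    loss, delay, jitter = tolerance
    return {"class": flow_type, "priority": priority,
            "loss": loss, "delay": delay, "jitter": jitter}

print(classify("ip-telephony"))
```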
5.2 Admission Control
Admission control aims at improving the overall quality of communications by preventing congestion in the network. As stated earlier, the substitution network capacity is very likely to be smaller than the base network capacity. Therefore, saturation is more likely to occur in the substitution network at lower workload levels. In our architecture, admission control takes the decision, at the bridge router level, to allow flows to access the substitution network or not. The admission control designed in our architecture is based on the type and the priority of the incoming flow, as well as on measurements collected by the mobile substitution routers.
5.3 Mobility for QoS
A substitution network consists of mobile substitution routers with controlled mobility capabilities. As opposed to classical mobility, controlled mobility refers to the capability of the mobile substitution routers to move deliberately, according to their own decisions. This mobility can be viewed as a means to improve the QoS provided by the substitution network. As for any wireless network, the signal quality (e.g. the SINR - Signal to Interference plus Noise Ratio) of each mobile substitution router is highly linked to its geographical position. Therefore, QoS parameters, such as the overall delay or loss rate, could be improved by carefully selecting the position of each mobile substitution router. For example, IEEE 802.11b can use four different physical rates (i.e. 1, 2, 5.5, 11 Mb/s) that are dynamically adapted using the Auto Rate Fallback algorithm based on the radio collisions experienced. Since mobile substitution router motion can be seen as a means to reduce collisions (e.g. by avoiding hidden terminal configurations [4]), controlled mobility can ultimately increase the physical rates used by IEEE 802.11b. To perform an adequate positioning of the mobile substitution routers, two approaches are possible. Both require measurements from the mobile substitution routers. First, the positioning of the mobile substitution routers can be decided by a single node, namely the bridge router, which centralizes all the measurements sent by the mobile substitution routers. Second, a distributed approach where each mobile substitution router decides its positioning by itself is feasible.
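The centralized variant can be sketched as follows: the bridge router collects candidate positions and measurements reported for a mobile substitution router and picks the best feasible one. The measurement format and the connectivity test are assumptions made for the example, not prescribed by the architecture.

```python
# Sketch of a centralized placement decision at the bridge router.

def choose_position(candidates, connectivity_ok):
    """candidates: list of (position, measured_sinr_dB) tuples."""
    feasible = [(pos, sinr) for pos, sinr in candidates if connectivity_ok(pos)]
    if not feasible:
        return None                      # no feasible move: stay in place
    return max(feasible, key=lambda c: c[1])[0]

reports = [((10.0, 5.0), 12.5), ((12.0, 7.0), 18.0), ((30.0, 0.0), 25.0)]
in_range = lambda pos: abs(pos[0]) + abs(pos[1]) < 25   # toy connectivity test
print(choose_position(reports, in_range))               # -> (12.0, 7.0)
```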
5.4 Secondary Primitives
The secondary primitives are not mandatory for the substitution network to run, but they can contribute to enhancing its performance.
Traffic control: Some well-known traffic control techniques are potential candidates for secondary primitives. First, we consider traffic shaping, which regulates the packet arrival pattern so as to limit the rate of the flows. The token bucket mechanism is a simple and common way to achieve this goal. Second, scheduling strategies can also be considered. Scheduling provides a convenient means to prioritize flows within the queues of the mobile substitution routers and the bridge routers. In our architecture, the scheduling policy is tightly tied
to the traffic classification. Third, secondary primitives may also include queue management techniques. Queue management aims at notifying the state of congestion within the network so that flows reduce their sending data rate. Both implicit and explicit approaches are feasible. In any case, a queue management technique requires measurements to be performed at the mobile substitution router level.
QoS Routing: QoS routing can only be applied if several routes exist between the two bridge routers. Classical routing approaches always route packets intended for the same destination identically. On the other hand, QoS routing can route packets sent towards the same destination differently. In our architecture, this differentiation depends on the flow type as given by the traffic classification.
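For concreteness, a token bucket of the kind mentioned above can be sketched as follows; the rate and burst values are illustrative, and a real deployment would integrate such a shaper with the queueing layer of the bridge router rather than run it standalone.

```python
# Minimal token-bucket traffic shaper (sketch).
class TokenBucket:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, packet_bytes, now):
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True          # send immediately
        return False             # delay or drop according to policy

bucket = TokenBucket(rate_bytes_per_s=125_000, burst_bytes=10_000)  # ~1 Mbit/s
print(bucket.allow(1500, now=0.0), bucket.allow(15_000, now=0.001))
```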
6 QoS Operations
6.1 Bridge Router Source
The bridge router is responsible for establishing the connection between the source and destination bridge routers inside the substitution network. This section describes in detail the operations at the source Bridge Router. Figure 6 describes the operation chart of a bridge router when a new flow arrives:
1. The bridge router first classifies the flow.
2. The bridge router checks whether there are available QoS resources (bandwidth, delay, queue, etc.) by using the Admission Control functionality. There must be enough QoS resources so that, if the new flow is admitted, there will be no degradation, or only a minor one, for the other already admitted flows.
Fig. 6. Flow diagram at a Bridge Router source
If there are enough resources for this new flow, regardless of the new flow's priority, then it is admitted.
3. If the QoS resources cannot satisfy the requirements of the newly arriving flow, the bridge router checks the priority of this flow. If the priority is low, then the flow is rejected. If the flow priority is high, the bridge router tries to adapt/reduce/remove on-going flows of lower priority.
4. If the adaptation of lower-priority flows is possible, then the bridge router admits the new flow. If the adaptation turns out to be impossible, the bridge router 'asks' the mobile substitution routers to redeploy or tries to 'add' new mobile substitution routers with the aim of creating a new route.
5. If a new deployment or redeployment of mobile substitution routers can create enough available resources, the new flow is admitted. If this new deployment or redeployment does not create enough available resources, the new flow is rejected.
It follows that a new deployment/redeployment can be triggered by the arrival of a high-priority flow. Besides, it is worth noting that substitution networks can be rearranged even if no physical link failure occurs, since high-priority packets that transit over the bridge router may trigger the deployment of new routes (see Figure 6).
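The five steps above (summarized in Fig. 6) can be sketched as a simple decision function. The helper callables stand in for the architecture's admission control, flow adaptation and redeployment mechanisms; they are placeholders, not concrete APIs, and the classification step is assumed to have happened beforehand.

```python
# Illustrative decision logic at the source bridge router.

def handle_new_flow(flow_priority, resources_available,
                    adapt_lower_priority_flows, request_redeployment):
    if resources_available():                  # step 2: plain admission
        return "admit"
    if flow_priority != "high":                # step 3: low/medium priority
        return "reject"
    if adapt_lower_priority_flows():           # steps 3-4: adapt on-going flows
        return "admit"
    if request_redeployment() and resources_available():  # steps 4-5: mobility
        return "admit"
    return "reject"

# Toy usage: no free resources, high priority, adaptation fails, redeployment helps.
state = {"free": False}
print(handle_new_flow(
    "high",
    resources_available=lambda: state["free"],
    adapt_lower_priority_flows=lambda: False,
    request_redeployment=lambda: state.update(free=True) or True))
```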
6.2 Mobile Substitution Routers
Figure 7 describes the operation chart of a mobile substitution router. When a new packet is received by the mobile substitution router, the packet is treated at each layer according to the packet specifications described in Section 5. Specifications embedded in the packet are used to provide the required QoS. The following policies can be used at each layer:
– Routing Layer: choose the next hop depending on the available resources.
– Link Layer (LL): use priority queues and/or 802.11e-like scheduling at the MAC layer.
– Physical Layer (PHY): use multiple interfaces, channels, frequencies and/or coding schemes.
Fig. 7. Flow diagram at a Mobile Substitution Router
Fig. 8. Mobile substitution router interaction diagram
The interactions between the QoS plane, the mobility plane and the bridge routers are described in Figure 8. QoS resources are measured periodically and the measurement results are reported to the bridge routers. These measurements can be used locally by the mobile substitution router (self-relocation) or by the bridge routers to relocate the mobile substitution router. The mobile substitution router's position is sent to the QoS plane and to the bridge routers to evaluate the gain of the relocation.
6.3 Bridge Router Destination
At the destination bridge router, the QoS policy is translated back into its original form. This requires that enough information is embedded into the packets to allow the exact inverse of the translation performed at the source bridge router.
7 Discussions
7.1 Mobility
The controlled mobility of the mobile substitution routers represents a novel means to improve QoS in wireless mobile networks. The mobility of mobile substitution routers can be controlled by the mobile substitution router itself or by a bridge router. Based on the monitored state of the network, a mobile substitution router decides whether or not to move. However, this movement is constrained by at least the connectivity requirement. The autonomous movement of a mobile substitution router is used to improve some local parameters such as interferences, MAC layer contention or routing failures. The mobility of a mobile substitution router can be requested from a bridge router. The bridge router can ask a specific mobile substitution router to move to a given position to
increase the QoS for a given traffic/flow. Note that the two motion decisions can differ; in this case, the mobile substitution router follows the recommendations of the bridge router, since we give priority to the bridge router's QoS over the local mobile substitution router's QoS.
7.2 Monitoring
The monitoring architecture is an important feature of the QoS architecture proposed in this paper. A detailed discussion of the monitoring architecture is out of the scope of this paper. However, the following features are required: 1) Mobile substitution router state: mobile substitution routers must be able to gather their own state, such as the level of interference in their surroundings, the number of neighbors, queue states, and the data rates used for the transmissions with each of their neighbors. 2) Delay measurement: mobile substitution routers must also be able to measure the delays of one-hop transmissions.
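One possible shape for such a periodic report is sketched below; the exact field set and units are assumptions based on the requirements just listed, not a defined monitoring format.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class MonitoringReport:
    router_id: int
    interference_level: float          # e.g. noise floor estimate in dBm
    neighbor_count: int
    queue_lengths: Dict[str, int]      # per traffic class
    link_rates: Dict[int, float]       # neighbor id -> data rate (Mb/s)
    one_hop_delays: Dict[int, float]   # neighbor id -> measured delay (ms)
    position: tuple = (0.0, 0.0)

report = MonitoringReport(
    router_id=3, interference_level=-92.0, neighbor_count=2,
    queue_lengths={"high": 1, "low": 7},
    link_rates={1: 5.5, 2: 11.0},
    one_hop_delays={1: 4.2, 2: 1.8})
print(report.neighbor_count, max(report.link_rates.values()))
```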
8 Conclusion
In this paper, we propose a new QoS architecture for substitution networks. This architecture is based on a layered QoS architecture. In our architecture, two different devices are used. The bridge routers are entry points into the substitution network; they play a major role in the deployment of the substitution network and in avoiding congestion inside it. Mobile substitution routers are the devices that compose the substitution network; they carry out the communication tasks inside the network. The fundamental difference between our architecture and the architectures proposed in the literature is the use of controlled mobility at the mobile substitution router level to increase the network and flow performance, and thus the QoS. Acknowledgments. This work is partially funded by the French National Research Agency (ANR) under the project ANR VERSO RESCUE (ANR-10-VERS-003) and the French National Institute for Research in Computer Science and Automation (INRIA) under the ARC MISSION project.
References
1. Aurrecoechea, C., Campbell, A.T., Hauw, L.: A survey of QoS architectures. Multimedia Systems 6(3), 138–151 (1998)
2. Babiarz, J., Chan, K., Baker, F.: Configuration Guidelines for DiffServ Service Classes. IETF - Request for Comments (RFC 4594), no. 4594 (2006)
3. Campbell, A.T., Coulson, G., Garcia, F., Hutchinson, D., Leopold, H.: Integrated quality of service for multimedia communications. In: Proceedings of Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), vol. 2, pp. 732–739 (1993)
4. Chaudet, C., Dhoutaut, D., Guérin Lassous, I.: Performance issues with IEEE 802.11 in ad hoc networking. IEEE Communication Magazine 43(7), 110–116 (2005)
5. DAIDALOS IST Project, Designing Advanced Interfaces for the Delivery and Administration of Location Independent Optimized personal Services (2008), http://www.ist-daidalo.org
6. IEEE Computer Society, IEEE 802.11: Wireless LAN Medium Access Control and Physical Layer Specifications. Medium Access Control (MAC) Quality of Service (QoS) Enhancements, IEEE Press, D13.0 (2005)
7. MercuryNews.com, San Jose police: Sabotage caused phone outage in Santa Clara, Santa Cruz counties (2009), http://www.mercurynews.com/ci_12106300
8. Nemertes Research, Internet Interrupted: Why Architectural Limitations Will Fracture the Net (2009), http://www.nemertes.com/studies/internet_interrupted_why_architectural_limitations_will_fracture_net_0
9. Park, S., Kim, K., Kim, D., Choi, S., Hong, S.: Collaborative QoS architecture between DiffServ and 802.11e wireless LAN. In: Vehicular Technology Conference (VTC), vol. 2, pp. 945–949 (2003)
10. Radhakrishnan, S., Frost, V.S., Evans, J.B.: Quality of Service for Rapidly Deployable Radio Networks. Telecommunication Systems 18(1), 207–225 (2001)
11. Razafindralambo, T., Simplot-Ryl, D.: Connectivity Preservation and Coverage Schemes for Wireless Sensor Networks. IEEE Transaction on Automatic Control (2011)
12. Sarr, C., Chaudet, C., Chelius, G., Guérin Lassous, I.: Bandwidth Estimation for IEEE 802.11-based Ad Hoc Networks. IEEE Transactions on Mobile Computing 7(10), 1228–1241 (2008)
Improving CS-MNS through a Bias Factor: Analysis, Simulation and Implementation Thomas Kunz and Ereth McKnight-MacNeil Systems and Computer Engineering, Carleton University
[email protected],
[email protected]
Abstract. A WSN consists of numerous nodes gathering observations and combining these observations. Often, the timing of these observations is of importance when processing sensor data. Thus, a need for clock synchronization arises in WSNs. The CS-MNS algorithm has been proposed to fulfil this role. However, the core algorithm suffers from an initial divergence of clocks. This paper shows, through analysis, that introducing a bias factor in the CS-MNS update law significantly reduces this initial divergence. This is then further confirmed via simulation results using Matlab and via measurements in a testbed deploying motes running TinyOS 2.1. The results show that a designer, having some a-priori knowledge about clock characteristics, can choose a bias that allows the algorithm to speed up the convergence time and greatly improve the overall protocol performance. The work also demonstrates that rigorous analysis can be helpful in designing protocols and predicting protocol behaviour, which is then verified through simulation and testbed measurements.
1 Introduction
The possibility of combining microelectromechanical system (MEMS) sensors with VLSI control, signal processing and communications circuits to form an intelligent wireless sensor node capable of forming wireless networks with like sensor nodes was recognized as early as 1996 in [1]. In the considerable subsequent literature on wireless sensor networks (WSNs), various applications have been proposed including intrusion detection, habitat monitoring, and building automation. In many WSN applications, sensor readings from multiple nodes need to be combined either to synthesize a snapshot of the conditions at a single point in time or to track the progress of a phenomenon through the sensor field. The challenge of generating a mutually consistent time stamping service across the nodes of a WSN is referred to as time synchronization. In wireless sensor and actuator networks (WSANs), where the passive sensing role of some nodes is augmented with an active role, valid synchronization data is required in real-time to co-ordinate current and near-future actions. Thus, in WSANs post-hoc methods of synchronization are generally insufficient and a more active form of synchronization is required.
Various synchronization algorithms for use in WSNs have been proposed [8]. Among these algorithms is the clock sampling mutual network synchronization (CS-MNS) algorithm proposed in [6, 7]. The objective of the work presented here is to analyze and improve the startup behaviour of the algorithm. As proposed, clocks initially diverge, which results in poor initial clock synchronization, concerns about the stability of the algorithm, and lengthy convergence times. Based on a mathematical analysis of CS-MNS' convergence, we propose a modification to the original update law, introducing a bias factor. Choosing appropriate initial values for this bias factor ensures that we can significantly reduce the initially divergent behaviour. Determining suitable bias values requires a-priori knowledge of the maximum rate at which clocks can drift from one another (which is typically provided as part of the specification of the oscillators used as the timing source on many systems) and a bound on initial clock offsets (which can be achieved through appropriate initial steps in the clock synchronization protocol itself). We modeled the algorithm in Matlab and evaluated the impact of the proposed modification through simulation. We also implemented the algorithm in TinyOS 2.1 and performed experiments in a testbed consisting of a mix of TelosB and MICAz motes. The results confirm the analytical insights. The remainder of this paper is organized as follows. Section 2 gives a brief overview of time synchronization and wireless sensor networks and goes on to introduce CS-MNS. Section 3 presents our analysis of CS-MNS' convergence. As one outcome of this analysis, we propose a modification to the original update law to speed up clock convergence to a common time reference from the very beginning. Section 4 discusses the simulation setup and the Matlab results that demonstrate the effect of the bias factor. Section 5 discusses the testbed implementation and provides further testbed experiments that demonstrate the effect of the proposed modification. Finally, Section 6 highlights the conclusions drawn from the work and discusses future work.
2 Background

2.1 WSN Hardware and Clocks
Our testbed consists of a mix of TelosB and MICAz motes, running TinyOS 2.1. The TelosB and MICAz platforms each have a number of different oscillators. However, each platform limits the oscillators that can be used for software-level timing. On the TelosB platform, the 4 MHz central processing unit (CPU) clock oscillator is a low-accuracy digitally controlled oscillator implemented as a ring oscillator. This oscillator is trimmed by software at boot time using a 32.768 kHz crystal oscillator as a reference. Thus, although the 4 MHz oscillator offers higher resolution than the 32.768 kHz oscillator, its frequency accuracy cannot exceed that of the 32.768 kHz crystal, since it is effectively derived from this source. The MICAz implements the 7.3728 MHz CPU clock as a crystal oscillator. However, on both systems the CPU oscillators are stopped in low-power sleep states. This leaves only the 32.768 kHz oscillator capable of providing a continuous time reference.
Both mote platforms use the CC2420 intelligent radio IC. This radio provides IEEE 802.15.4 radio communications in the 2.4 GHz ISM band at a bit rate of 250 kbps. The CC2420 intelligent radio includes hardware AES-128 encryption and a separate hardware timing strobe that is triggered during each packet preamble whenever a packet is transmitted or received. This allows the host microcontroller to perform a hardware-level time stamp of outgoing and incoming packets. The natural frequency of a crystal is determined by its physical size and shape. Thus, the crystal period is subject to error related to the physical manufacturing tolerances for the crystal. Unfortunately, crystals are not perfectly stable over very long time periods or across temperature variation. Tolerance in the frequency of crystals introduces rate error to a clock. In lay terms, if two clocks exhibit rate error, one clock will gain time relative to the second clock. Another type of error is clock offset. Clock offset error is the familiar everyday error exhibited by a clock that is 'five minutes fast.' A clock with offset error is either ahead of or behind another clock by a constant time increment. The concept of a common timebase shared within a group but unconnected to any outside reference is termed relative synchronization or internal synchronization. Maintaining a common timebase synchronized with an outside reference, such as coordinated universal time (UTC), is termed absolute synchronization or external synchronization. By necessity, absolute synchronization implies relative synchronization.
2.2 Time Synchronization in Wireless Sensor Networks
Time synchronization is an important foundation of networked systems. However, the ad hoc and dynamic nature of WSN topologies prevents the straightforward application of traditional centralized, hierarchical clock synchronization strategies. Limited energy availability and cost-sensitivity pressures further limit the application of traditional algorithms [5]. In keeping with the properties generally found in WSNs, various self-organizing centralized and decentralized algorithms for WSN synchronization have been proposed [8]. The clock sampling mutual network synchronization (CS-MNS) algorithm is presented in [6, 7]. It has previously been shown through simulation to perform well and is fully decentralized in that all nodes execute the same algorithm at all times. Furthermore, the algorithm does not require knowledge of, and makes no assumptions about, the network topology. As originally presented, the CS-MNS algorithm uses periodic beacons equivalent to the IEEE 802.11 time synchronization function (TSF). The simple beacon format of CS-MNS is compatible with the IEEE 802.11 TSF beacon format. The CS-MNS algorithm can be implemented in software with standard IEEE 802.11 radio and clock hardware [7]. This makes CS-MNS applicable for use in networks of currently available hardware and in cost-sensitive consumer devices that must use commodity radio hardware.
The CS-MNS algorithm uses a single multiplicative correction factor to correct the local clock. Upon receiving any beacon, the CS-MNS algorithm updates the local correction factor. In CS-MNS, each node maintains a hardware clock which is allowed to run freely at its natural rate. At any point in time, t, we represent this uncorrected hardware clock at node j by Tj(t). However, in order to synchronize the nodes, the CS-MNS algorithm creates a new, corrected clock. CS-MNS uses a simple multiplicative transformation to generate this corrected clock from the uncorrected clock. As given in [6, 7], the corrected clock, $\tilde{T}_j$, is related to the uncorrected clock, $T_j$, by

$\tilde{T}_j(t) = s_j T_j(t)$  (1)

and the CS-MNS algorithm controls the correction factor sj. The CS-MNS algorithm modifies the correction factor by comparing the local corrected time with the corrected time, $\tilde{T}_i$, sampled from some other node i. The updated correction factor, $s_j^+$, is calculated as

$s_j^+ = s_j + k\,\frac{\tilde{T}_i(\tau) - \tilde{T}_j(\tau)}{T_j(\tau)}$  (2)

where k represents a control gain and τ is the time at which the clock sample is taken. Implicitly, it is assumed that the clocks at node j and node i are sampled at the same instant in time τ. While this cannot be achieved exactly in practise, the hardware support for radio message time stamping described above allows for the two node clocks to be sampled well within a single clock period. By substituting the updated correction factor into Equation 1, the new time estimate, $\tilde{T}_j^+$, is given as

$\tilde{T}_j^+(t) = \left(1 + k\,\frac{\tilde{T}_i(\tau) - \tilde{T}_j(\tau)}{s_j T_j(\tau)}\right) s_j T_j(t)$  (3)

which can be simplified by using Equation 1 again to yield

$\tilde{T}_j^+(t) = (1-k)\,\tilde{T}_j(t) + k\,\frac{\tilde{T}_i(\tau)}{\tilde{T}_j(\tau)}\,\tilde{T}_j(t)$  (4)
Clocks have two main sources of error: an offset error and a rate error. The above adjustment in essence compensates for the clock drifts/rates, which is motivated by two observations. First, CS-MNS coarsely synchronizes clocks to keep offset errors small: when a node joins the network and receives the first beacon, it uses it to set its local clock to the time advertised in the beacon. Second, given the initially small offset errors, the main source of clock differences in the long run is the difference in clock rates. Once clocks are synchronized to each other, and unlike other clock synchronization protocols, CS-MNS can reduce the beacon frequency while still ensuring tight clock synchronization.
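As a concrete illustration of Equations 1 and 2, the following Python sketch (our own, not the authors' TinyOS code; the names and the use of raw tick counts are illustrative) shows the state kept at a node and the update applied when a beacon is received:

```python
# Sketch of the per-beacon CS-MNS update at node j (Equations 1 and 2).
# Timestamps are assumed to be hardware tick counts sampled at the same
# instant tau on both sides, e.g. via the CC2420 timing strobe.
K = 0.5                                   # control gain k

class CsMnsNode:
    def __init__(self):
        self.s = 1.0                      # multiplicative correction factor s_j

    def corrected(self, local_ticks):
        # Equation 1: corrected clock = s_j * T_j
        return self.s * local_ticks

    def on_beacon(self, remote_corrected_ticks, local_ticks):
        # Equation 2: s_j <- s_j + k * (T~_i(tau) - T~_j(tau)) / T_j(tau)
        error = remote_corrected_ticks - self.corrected(local_ticks)
        self.s += K * error / local_ticks
```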
3 Analysis of CS-MNS' Convergence
The analytical analysis of the CS-MNS algorithm begins by outlining a simplified model of a real-world clock. This clock model is then used throughout the remainder of the analysis. The CS-MNS clock correction scheme and the CS-MNS adaptation law are explained and an expression for the updated clock in terms of previous clocks is derived. The theoretical convergence properties of the CS-MNS update law are examined both in the special case of zero offset error and in the general case. Unconditional asymptotic convergence results are obtained in the case of zero offset error. Limits are derived under which the general case exhibits similar convergence properties to the special case. For the purpose of analysis we adopt a simplified clock model exhibiting only offset and rate errors. Such a clock is termed an affine clock in [2] because the model presents the clock at each node as an affine transformation of a theoretical, perfect reference clock. Thus, the time process at node j is modelled by

$T_j(t) = \alpha_j t + \beta_j$  (5)

where t represents the reference time process. Implicitly, the clock rates and offsets, αj and βj, remain constant in time. The justification for adopting the affine clock model stems from the assumption that the other forms of clock error are small in relation to the clock offset and rate errors. Thus motivated, the affine clock model can be interpreted as the Maclaurin series expansion of the true clock truncated after the first-order term. The approximation that offset and rate error dominate overall clock error is part of the motivation behind the design of CS-MNS. This clock model also describes well the system clock on the motes in our testbed, as described in more detail in [3]. Under this model, the corrected time process given in Equation 1 becomes

$\tilde{T}_j(t) = s_j T_j(t) = s_j(\alpha_j t + \beta_j)$  (6)

at node j. The convergence properties of the CS-MNS update law are a natural area of investigation. Some insight into the behaviour of CS-MNS can be gained by considering the special case of synchronizing clocks without offset errors. This special case corresponds to the condition that $\beta_1 = \cdots = \beta_n = 0$ in Equation 5. In this case the corrected time process of Equation 6 becomes simply

$\tilde{T}_j(t) = s_j T_j(t) = s_j \alpha_j t$  (7)

and substituting this into the expression for the updated corrected time of Equation 4 yields

$\tilde{T}_j^+(t) = (1-k)\, s_j \alpha_j t + k\,\frac{s_i \alpha_i \tau}{s_j \alpha_j \tau}\, s_j \alpha_j t$  (8)
which simplifies to

$\tilde{T}_j^+(t) = (1-k)\,\tilde{T}_j(t) + k\,\tilde{T}_i(t)$  (9)

for this case. The updated corrected clock rate is also of interest and is given by

$\frac{d}{dt}\tilde{T}_j^+ = s_j^+ \alpha_j = (1-k)\, s_j \alpha_j + k\, s_i \alpha_i = (1-k)\,\frac{d}{dt}\tilde{T}_j + k\,\frac{d}{dt}\tilde{T}_i$  (10)

in the absence of offset errors. If the control gain k is kept in the range $k \in (0, 1)$, then Equation 9 expresses the updated time estimate at node j as a strict convex combination of the previous time estimates at nodes j and i. Furthermore, Equation 10 shows that the updated clock rate at node j is also a convex combination of the previous clock rates at nodes j and i. This is exactly the requirement given as Assumption 1, part 3 by Moreau in [4]. The intuitive motivation for the argument formalized in [4] stems from the observation that an update law with this form cannot increase the worst-case level of disagreement within the group. From Equation 10, the updated rate must satisfy

$\min_l (s_l \alpha_l) \le s_j^+ \alpha_j \le \max_l (s_l \alpha_l)$  (11)
which simply states that any updated rate must lie between the minimum and maximum rates in the group. Thus, we can immediately conclude that the minimum and maximum rates for the group with updated rates are either unchanged or have moved toward each other. Further, applying Theorem 2 of [4] results in the conclusion that the CS-MNS update applied to a collection of node clocks without offset error will, with sufficient communication, continuously drive the node clock rates towards a common rate, close to the initial rates, that does not change with time.

While the above theoretical results for the special case without offset errors are heartening, it is unlikely that a group of clocks would have zero offset error in practise. Thus, we examine the effect of introducing offset error into the above analysis. We proceed by adding an error term to the updated clock rate given by Equation 10 in the zero-offset case so that it is equal to the rate from the general case given in Equation 4,

$\frac{d}{dt}\tilde{T}_j^+(t) = (1-k)\, s_j \alpha_j + k\, s_i \alpha_i + k\, s_i \alpha_i\, \epsilon_{i,j}(\tau)$  (12)

where the error term,

$\epsilon_{i,j}(\tau) = \frac{\beta_j - \frac{\alpha_j}{\alpha_i}\beta_i}{\alpha_j \tau + \beta_j}$  (13)

arises from the non-zero values of βi and βj. If $s_j(\tau)\,\alpha_j < s_i(\tau)\,\alpha_i$ and

$-k\left|s_i(\tau)\,\alpha_i - s_j(\tau)\,\alpha_j\right| < k\, s_i \alpha_i\, \epsilon_{i,j}(\tau) < (1-k)\left|s_i(\tau)\,\alpha_i - s_j(\tau)\,\alpha_j\right|$  (14)

or $s_j(\tau)\,\alpha_j > s_i(\tau)\,\alpha_i$ and

$-(1-k)\left|s_i(\tau)\,\alpha_i - s_j(\tau)\,\alpha_j\right| < k\, s_i \alpha_i\, \epsilon_{i,j}(\tau) < k\left|s_i(\tau)\,\alpha_i - s_j(\tau)\,\alpha_j\right|$  (15)

then the updated clock rate in Equation 12 is strictly between si αi and sj αj. In other words, despite the error term, the updated clock rate will lie in the interior of the range between the previous corrected clock rates. Thus, under the above conditions the essential property that led to convergent behaviour in the case without offset error is preserved. By defining

$\lambda = \min\left\{1-k,\; k\right\}$  (16)

then the above conditions will always be met if

$\left|\epsilon_{i,j}(\tau)\right| < \frac{\lambda}{k}\cdot\frac{\left|s_j \alpha_j - s_i \alpha_i\right|}{s_i \alpha_i}$  (17)

regardless of whether node i or j has the slower clock. Further, by choosing

$\alpha_{min} = \min_i \alpha_i, \qquad \beta_{min} = \min_i \beta_i, \qquad \gamma = \max_{i,j}\left|\beta_j - \frac{\alpha_j}{\alpha_i}\beta_i\right|$

then $\left|\epsilon_{i,j}(\tau)\right|$ can be bounded over all i, j as

$\left|\epsilon_{i,j}(\tau)\right| \le \epsilon_{max}(\tau) = \frac{\gamma}{\alpha_{min}\tau + \beta_{min}}$  (18)

which becomes smaller with increasing time. Indeed, $\lim_{\tau\to\infty}\epsilon_{max}(\tau) = 0$. Finally, using $\epsilon_{max}(\tau)$ and Equation 17 we arrive at the condition

$\epsilon_{max}(\tau) < \frac{\lambda}{k}\,\min_{i,j}\frac{\left|s_j \alpha_j - s_i \alpha_i\right|}{s_i \alpha_i}$  (19)

under which all possible CS-MNS rate updates exhibit the required properties for convergent behaviour. Thus, regardless of network topology, any period in which all relative clock rate errors are greater than this threshold will be a period in which the clock rates show convergent behaviour. In addition, Equation 18 shows that this threshold grows smaller as time passes.

From the above analysis we can gain some insight into an optimal value for the control gain k. Considering Equations 14 and 15 we observe that the limits are symmetric in the case when k = 0.5. Thus it is not surprising that k = 0.5 is the value that maximizes the λ/k term in Equation 17. Thus, k = 0.5 serves to maximize the range of εi,j, and thus the magnitude of offset errors, that can be tolerated while the CS-MNS control law still makes updates placing the updated clock rates between the existing clock rates. However, identifying optimal values for k remains an open problem.
When synchronization begins, the value of τ in Equation 13 is small, which leads to a large value for εi,j(τ). Because of this large value, updates are initially less likely to satisfy the convergence condition of Equation 19. This causes divergent behaviour until εi,j(τ) shrinks sufficiently. However, an initial divergent period can serve to further desynchronize the node clocks, which puts the algorithm at a disadvantage once convergent behaviour emerges. In order to prevent this initial divergent behaviour, a bias factor can be introduced into the update law in Equation 2 to arrive at

$s_j^+ = s_j + k\,\frac{\tilde{T}_i(\tau) - \tilde{T}_j(\tau)}{T_j(\tau) + \beta_{BIAS}}$  (20)

where βBIAS is a non-negative constant parameter expressed in units of time. With this updated control law the expressions for the error term and error term bound become

$\epsilon_{i,j}(\tau) = \frac{\beta_j - \frac{\alpha_j}{\alpha_i}\beta_i}{\alpha_j\tau + \beta_j + \beta_{BIAS}}$  (21)

and

$\left|\epsilon_{i,j}(\tau)\right| \le \epsilon_{max}(\tau) = \frac{\gamma}{\alpha_{min}\tau + \beta_{min} + \beta_{BIAS}}$  (22)

which are consistent with the previous definitions when βBIAS = 0. The system designer can choose an appropriate value for βBIAS by imposing the requirement that initial updates should exhibit convergent behaviour. If all possible initial updates satisfy Equation 17, where εi,j(τ) includes a βBIAS term, then the initial updates will show convergent behaviour. If the designer somehow knew the s, α, and β parameters for each update, the designer could solve Equation 17 for values of βBIAS that would encourage convergence. Since the designer does not know which nodes will update, we can approximate by removing the node dependence, using εmax(τ) on the left-hand side and introducing expected values for the differences in offset and rate in Equation 17, to arrive at

$\frac{E\left[\left|\beta_j - \frac{\alpha_j}{\alpha_i}\beta_i\right|\right]}{\alpha_{min}\tau + \beta_{min} + \beta_{BIAS}} < \frac{\lambda}{k}\cdot\frac{E\left[\left|s_j\alpha_j - s_i\alpha_i\right|\right]}{\max_i (s_i\alpha_i)}$  (23)

which leaves only values that the designer would typically be able to estimate. The expected values and extrema of clocks can be estimated from the expected distributions of the clocks at start-up. For example, the expected and maximum rate error of crystal clocks are easily evaluated from the crystal specifications. By substituting these estimates, along with a value for τ representing the time of the initial update of interest, the system designer can solve for a value of βBIAS that is expected to result in initial convergence. A less rigorous estimate than Equation 23, which appears useful in practise, is to eliminate the expected values by replacing them with the maximum offset and rate errors and to assume that $\beta_{min} = 0$ while $\frac{\alpha_j}{\alpha_i} = s_i\alpha_i = 1$, leading to

$\beta_{BIAS} > \frac{k}{\lambda}\cdot\frac{\max\left|\beta_j - \beta_i\right|}{\max\left|s_j\alpha_j - s_i\alpha_i\right|} - \tau$  (24)
which allows a simple estimation of an appropriate βBIAS. When the rate errors are measured in parts per million and the initial offset errors are small, the above approximation provides the system designer with a rough guide to selecting a βBIAS value.
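As a worked example, the following sketch evaluates Equation 24 using the testbed figures reported later in Section 5 (a maximum initial offset of 20 ms and a maximum clock rate error of 50 ppm, on a 32.768 kHz clock); it yields a bias of roughly 13.1 million ticks, matching the value used there.

```python
# Equation 24: lower bound on beta_BIAS, in ticks of a 32.768 kHz clock.
TICK_HZ = 32768.0

def min_bias_ticks(max_offset_s, max_rate_err, k=0.5, tau_ticks=0.0):
    lam = min(1.0 - k, k)
    return (k / lam) * (max_offset_s * TICK_HZ) / max_rate_err - tau_ticks

print(min_bias_ticks(0.020, 50e-6))   # ~13.1e6 ticks for 20 ms / 50 ppm
```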
4 Simulation Results
We implemented a simulation model of CS-MNS in Matlab. A few simplifying assumptions are made in the implementation. The beacon transmission and processing is assumed to be instantaneous. The simulation is conducted under the assumption that all of the nodes broadcast beacons as a homogeneous Poisson process with rate λnode. For all experiments reported in this section, the beacon rate per node was set to 1/30 beacons per second. Consequently, considering a group of N nodes, the composite stream of beacons is also a Poisson process with rate λ = N λnode. Taken together with the assumption that the beacon transmission is instantaneous, the Poisson process assumption eliminates the possibility of beacon collision. The results in this section are derived by simulating each scenario 1000 times and determining the mean as well as the 95% confidence interval for the maximum error among the nodes.
Fig. 1. Simulation results without bias term (thin traces) and with a bias value of βBIAS = 20 000 ticks (heavy traces). The solid lines represent the mean values while the dashed lines represent the 95% confidence intervals.
Under the above assumptions, simulations were run using discrete 32.768 kHz beacon values. Clock quantization was applied to the timestamps carried in beacons only, with the errors being calculated exactly. The total number of nodes was 30 in a single-hop topology. The node clocks were initially uniformly distributed within 92 μs, which corresponds to three ticks of a 32.768 kHz clock. The node clock rates were uniformly distributed in the range of ±50 parts per million. The total beacon rate was one beacon per second, and the control gain was k = 0.5.
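For readers who wish to experiment with these settings, the self-contained Python/NumPy sketch below re-creates a simplified version of this scenario (it is not the authors' Matlab model; beacon transmission is instantaneous and collision-free, as assumed above, only the sender's timestamp is quantized, and a single run per configuration replaces the 1000 averaged trials):

```python
import numpy as np

TICK_HZ = 32768.0   # tick frequency of the nominal mote clock

def simulate(n=30, duration_s=30.0, beacon_rate=1.0, k=0.5, bias_ticks=0.0,
             seed=0):
    rng = np.random.default_rng(seed)
    alpha = 1.0 + rng.uniform(-50e-6, 50e-6, n)      # rate errors, +/-50 ppm
    beta = rng.uniform(0.0, 92e-6, n) * TICK_HZ      # offsets within 92 us
    s = np.ones(n)                                   # correction factors
    t, errors = 0.0, []
    while t < duration_s:
        t += rng.exponential(1.0 / beacon_rate)      # composite Poisson stream
        i = rng.integers(n)                          # beaconing node
        tau = t * TICK_HZ
        T = alpha * tau + beta                       # uncorrected clocks (ticks)
        beacon = np.floor(s[i] * T[i])               # quantized corrected time
        for j in range(n):                           # single hop: all others update
            if j != i:
                s[j] += k * (beacon - s[j] * T[j]) / (T[j] + bias_ticks)
        corrected = s * T
        errors.append((corrected.max() - corrected.min()) / TICK_HZ * 1e6)  # us
    return errors

if __name__ == "__main__":
    for bias in (0.0, 20000.0):
        peak = max(simulate(bias_ticks=bias))
        print(f"bias = {bias:7.0f} ticks, peak error = {peak:8.1f} us")
```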
Figure 1 shows the CS-MNS performance in this single-hop scenario. The mean error settles at about 21 μs after the first 25 seconds. As the value of 21 μs corresponds to just less than the time quantum of a 32.768 kHz clock, persistence of disagreement at this level is within expectation. The figure also shows that CS-MNS, as originally proposed, will see the clocks diverge initially. In the next simulation, we added a bias value of βBIAS = 20 000 ticks. This bias value is the value suggested by Equation 24 for the above initial offset error and range of clock rates. The results show that the bias term reduces the initial divergence of the CS-MNS algorithm. While the peak of the mean synchronization error across the trials in the case without bias is slightly more than twice that for the case with bias, the confidence interval for the case without bias is very wide, indicating a large variance among the trials during the initial convergence period.
5 Testbed Implementation and Results
In order to evaluate the performance of CS-MNS experimentally, we also implemented the CS-MNS algorithm in TinyOS 2.1. We then validated the experimental setup and characterized the node hardware by collecting clock data from nodes with free-running clocks. These tests revealed that the clocks on our two hardware platforms (TelosB and MICAz) ran at rates that differed by almost 14%. Once this was completed, the CS-MNS algorithm was tested in a number of different single-hop and multi-hop network configurations; here we report only the results that compare the performance of CS-MNS with and without the bias term. Other results, as well as a more in-depth discussion of the testbed and the experimental setup, can be found in [3]. For the purposes of testing CS-MNS performance, each node broadcasts beacons with a Poisson process schedule. Unless otherwise noted, the average inter-beacon time at each node is 30 seconds. This results in a composite beacon rate for the group of N/30 beacons per second. The control gain was set at k = 0.5. Shown in Figure 2 are the maximum clock differences observed among a group of 14 TelosB nodes in a single-hop group running the CS-MNS algorithm. The top graph shows the maximum clock difference during the whole 140-minute experiment, while the bottom graph focuses on the first 10 minutes. The value of βBIAS = 20 000 ticks, combined with the close initial synchronization of approximately 30 μs, results in some initial divergence with a peak error of 915 μs occurring after approximately 35 seconds. This peak error corresponds to the error that would accumulate in 35 seconds between two clocks with a 26 ppm difference in rate, which is just outside of the uncorrected hardware tolerance. Indeed, inspection of the raw data indicates that the large error observed is due to node '669A'. During the first second of the test, node '669A' received one beacon successfully, but this beacon resulted in no change of the sj correction factor. This node did not successfully receive any more beacons until after 35 seconds, and thus the large error is due to the accumulated clock error at node '669A'. The next largest synchronization error observed is 275 μs. For the remainder of
Fig. 2. CS-MNS synchronization errors for 14 TelosB nodes in a single-hop group
the test, clock granularity appears to be the primary factor limiting the CS-MNS algorithm, since the maximum observed clock differences are comparable to the quantization errors inherent in the beacons and the measurement. The experimental results shown for the 14-node group in Figure 2 compare well with the simulation results for a 30-node group shown in Figure 1. Since the experiment and simulation both use the same beacon rate at each node, the total beacon rate in simulation is approximately double that in experiment. The peak error is somewhat higher in experiment, but without the sample from node '669A' the next highest error of 275 μs is closer to the peak of the upper bound from simulation. However, slower convergence is expected in experiment as a result of the slower beacon rate and this, in turn, means that larger peak errors are expected. As expected, the convergence time is approximately 58 seconds in experiment while this time is only 25 seconds in simulation. Furthermore, the final synchronization error obtained in experiment is approximately 150% of the final error obtained in simulation. However, some allowance must be made for measurement error which is not present in the simulation results.
Fig. 3. CS-MNS results from 14 MICAz nodes with an artificial initial synchronization error of up to 20 ms: (a) βBIAS = 0; (b) βBIAS = 13 100 000 ticks
Shown in Figure 3 are the results from 14 TelosB nodes arranged in a single-hop topology with an artificial initial synchronization error of up to 20 ms, running CS-MNS both with and without a bias term. Strong divergence is clearly visible in the case without the bias term. The maximum disagreement between nodes of 3.8 s peaks around 33 seconds into the run, almost 190 times the initial clock disagreement. The rather large bias value used is the one suggested by Equation 24 for an initial offset error of 20 ms and a maximum clock rate error of 50 ppm. With the addition of this bias term, the maximum disagreement in the group tends downward during the initial minute of the test. Not visible in Figure 3 is the effect that the initial divergence has on the overall group clock rate. Without a bias term, the average value of the correction factor sj after 5 minutes is approximately 1.06, which is well outside the expected range of hardware clock rates. However, with the bias term the average value of the sj
adjustments is approximately 0.999998, which is 2 ppm less than unity and well within the expected range of hardware clock rates. The differences observed in group clock rates are consistent with the analytical presentation given earlier. In fact, the peak disagreement observed was 3.8 s, which occurred at τ = 33 s; at this point the CS-MNS update law, given by Equation 2, with k = 0.5 would apply a correction of approximately 6%. However, once a large correction, resulting from a large clock difference, is applied, this correction will persist, and at the end of the 5-minute test the average of the clock adjustments is still showing a 6% increase above nominal. Technically, relative clock synchronization is achieved even if the aggregate group clock rate moves outside of the range defined by the initial clock rates, as was observed in the case without the bias term. However, if the synchronized aggregate rate is hard to predict from the initial conditions, this can complicate a number of possible applications. For example, for applications using the group clock to drive periodic operations, battery life becomes sensitive to the unpredictable group clock rate, as periodic operations are repeated more often. Large differences in group clock rate can also complicate applications requiring previously-independent groups of nodes to join and agree on a combined relative synchronization. In this case, the application designer has two choices. First, the joined network can be allowed to synchronize the clocks with rate differences larger than the hardware differences. Or second, methods to detect the joining of the groups can be added and the node clocks can be reset to their underlying hardware clocks, thus reducing the potential range of clock rates.
6 Conclusions and Future Work
This paper presented the analysis of CS-MNS' convergence behaviour, derived a modification of the core update law to speed up convergence, and showed through both simulations and testbed results that the predicted reduction in initial clock divergence can indeed be achieved. As such, the paper not only proposes a specific improvement to a clock synchronization protocol, but demonstrates how rigorous analysis can be used to better understand and enhance network protocols. Not shown here, but discussed in the results presented in [3], the improvement in the initial convergence is also achieved for multi-hop scenarios. While CS-MNS does eventually converge, the maximum clock difference can be larger and the time required to converge is noticeably longer than in the single-hop case. Adding a bias factor to the control law speeds up convergence and reduces the initial divergence, as shown for the single-hop results in this paper as well. Along the way, and in particular while performing our testbed experiments, we noticed and documented a few additional items of general interest. For example, a simple characterization of the TinyOS 2.1 system and the MICAz and TelosB hardware led us to discover that the MICAz nominal 32.768 kHz clock is implemented as a 28.8 kHz clock. Analysis of the TinyOS code shows that this discrepancy is rooted in software. Also, the analysis of the clock rates in the testbed results, in particular those for Figure 3, demonstrates that the addition of the bias term not only improves the algorithm convergence, but also
results in an overall group clock rate closer to unity. One item of future work is therefore to explore this effect further, in particular as we are also interested in extending CS-MNS to provide external clock synchronization. Other items of future research address the testbed and additional evaluations of CS-MNS. The testbed would benefit from the addition of a reliable, separate channel that allowed reporting of test data without adding traffic to the wireless channel used by the nodes under test. Additionally, a method to distribute a shared clock or timing pulses to each node would allow more accurate measurement of the time process at each node. Previous work shows promising simulation results for CS-MNS in networks with mobility. Future experiments could evaluate the performance of CS-MNS in a testbed which includes node mobility. Additionally, there remains significant related theoretical, simulation, and experimental work that can be done on the behaviour of CS-MNS when networks are segregated, when networks are joined, and when nodes are added to and removed from networks. Similarly, there remains possible analytical and experimental work toward the identification of an optimal value for the control gain k, as well as investigation into the effects of the k value on group dynamics.
References

1. Bult, K., Burstein, A., Chang, D., Dong, M., Fielding, M., Kruglick, E., Ho, J., Lin, F., Lin, T.H., Kaiser, W.J.: Low power systems for wireless microsensors. In: International Symposium on Low Power Electronics and Design, pp. 17–21 (1996)
2. Freris, N.M., Kumar, P.R.: Fundamental limits on synchronization of affine clocks in networks. In: 46th IEEE Conference on Decision and Control, pp. 921–926 (December 2007)
3. McKnight-MacNeil, E.: CS-MNS: Analysis and Implementation. Master's thesis, Systems and Computer Engineering, Carleton University, Ottawa, Canada (2010)
4. Moreau, L.: Stability of multiagent systems with time-dependent communication links. IEEE Transactions on Automatic Control 50(2), 169–182 (2005)
5. Ren, F., Lin, C., Liu, F.: Self-correcting time synchronization using reference broadcast in wireless sensor network. IEEE Wireless Communications Magazine 15(4), 79–85 (2008)
6. Rentel, C.H.: Network Time Synchronization and Code-based Scheduling for Wireless Ad Hoc Networks. PhD thesis, Carleton University (January 2006)
7. Rentel, C.H., Kunz, T.: A mutual network synchronization method for wireless ad hoc and sensor networks. IEEE Transactions on Mobile Computing 7(5), 633–646 (2008)
8. Sundararaman, B., Buy, U., Kshemkalyani, A.D.: Clock synchronization for wireless sensor networks: a survey. Ad Hoc Networks 3(3), 281–323 (2005)
A Methodology to Evaluate Video Streaming Performance in 802.11e Based MANETs

Tim Bohrloch1, Carlos T. Calafate2, Alvaro Torres2, Juan-Carlos Cano2, and Pietro Manzoni2

1 HfT Leipzig, Germany
[email protected]
2 Universitat Politècnica de València, Spain
[email protected], [email protected], {jucano,pmanzoni}@disca.upv.es

Abstract. Video delivery in mobile ad-hoc networks (MANETs) is an exciting and challenging research field. In the past, most works addressing this issue have resorted to simulation due to the complexity of deploying QoS-enabled testbeds and retrieving video quality indexes in such environments. In this paper we introduce a methodology that allows testing the effectiveness of video codecs in ad-hoc networks. Our methodology relies on a well-defined video quality evaluation framework that is able to combine different video codecs and transmission environments. In particular, our evaluation procedures encompass a preliminary quality assessment, which relies on a point-to-point wireless channel, to establish the general behavior of a video codec under lossy channel conditions, along with tests in static and mobile ad-hoc network environments to determine the impact of factors such as congestion, hop count, and mobility on video quality. To validate our methodology we compare the H.264/AVC and the MPEG-4/ASP video codecs, showing that, in general, the former outperforms the latter in terms of video quality, although, for very high loss rates, the differences between both become minimal. Additionally, we show that the number of hops between video transmitter and receiver is a decisive factor affecting performance in the presence of background traffic. Moreover, in mobile scenarios, we find that the impact of congestion and routing delay affects video streaming quality in different manners, with congestion mainly responsible for random losses, while routing delay is usually associated with large loss burst patterns.
1 Introduction
Video streaming in mobile ad-hoc networks (MANETs) is considered one of the most challenging research goals due to the combined effects of wireless communications characteristics (multipath fading and shadowing, interferences, collisions, etc.) and topology maintenance in the presence of node mobility, all of which negatively affect on-going video sessions. In particular, topology changes provoke intermittent connectivity, causing large packet loss bursts. Thus, assessing the effectiveness of video transmission systems in ad-hoc networks is a relevant issue. In the past, most works addressing this issue have resorted to simulation due to
the complexity of deploying real testbeds. However, this approach does not allow for a thorough validation since results tend to be too optimistic. Additionally, the few works describing real testbed experiments do not deal with the IEEE 802.11e technology, which impedes offering QoS support to video streams. In this work we introduce a methodology based on emulation that allows evaluating video streaming performance in real, 802.11e-based MANETs. The proposed methodology relies on a video quality evaluation framework to assess the effectiveness of different video codecs when transmitted over IP networks. When focusing on MANET scenarios, we propose performing an initial evaluation using a controlled point-to-point wireless channel to have a clear overview of the video codec's error resilience performance. Afterward, static multi-hop communication scenarios are used to determine the impact of factors such as congestion and hop count on performance. Finally, we introduce mobile multi-hop communications, typical of MANET environments, to study the effects of topology updating delays on the video quality perceived by the user. To validate our methodology we assess the effectiveness of two well-known video coding standards - H.264/AVC [1] and MPEG-4 Part 2 [2] - following the proposed procedure. Experimental results show that the H.264/AVC codec offers higher video quality and, more importantly, they evidence how the different sources of video data losses in MANETs impact the video decoding process. This paper is organized as follows: in section 2 we present some related works, evidencing how our work differs from previous ones. In section 3 we introduce the proposed video evaluation methodology. Section 4 describes the video sequences used for testing, along with the experimental results obtained in an emulated point-to-point channel when varying the loss rate. The results obtained in our ad-hoc network testbed are then presented in section 5, comprising both static and mobile scenarios. Finally, section 6 presents the main conclusions of this paper.
2 Related Works
In the literature we can find works addressing video streaming performance in MANETs from different perspectives. Schierl et al. [3] presented a scheme based on Raptor FEC that uses different sources for reliable media streaming in MANET scenarios with high route loss probability. Their evaluation is based on ns-2 simulation. Martinez-Rach et al. [4] analyzed the performance of video delivery in wireless ad-hoc networks based on a Markovian model that mimics the behavior of real ad-hoc networks. Additionally, they tested the video resilience in a real 802.11 testbed by moving a robot (client device) away from an access point (single-hop communications). Overall, they found that H.264 offers a reasonable degree of resilience in the presence of losses. Sheltami [5] evaluates the performance of the H.264 codec using two routing protocols: the Neighbor-Aware Clusterhead (NAC) and the Dynamic Source Routing (DSR) protocols. The author shows that it is feasible to have video
over MANETs within an average distance of 6 hops, and requiring 5.5 Mbps on average. Oh and Chen [6] develop a physical-rate-based strategy for dual-band IEEE 802.11b/g that adopts the EDCA MAC layer architecture of IEEE 802.11e for improved QoS performance. They demonstrate through simulation that the EDCA-based scheme they propose is able to outperform the simple dual-band IEEE 802.11b/g in terms of received video quality. Calafate et al. [7] propose a QoS framework for MANETs combining IEEE 802.11e technology, a multipath routing algorithm, and a distributed admission control algorithm. Their solution was tested via simulation, and Peak Signal-to-Noise Ratio results were obtained under different network congestion conditions. More recently, Lee and Song [8] propose an effective cross-layer optimized video streaming algorithm over multi-hop mobile ad hoc networks. Their algorithm attempts to satisfy an end-to-end delay constraint, while maintaining the packet loss rate within a tolerable range at the receiver. Although the aforementioned works address problems similar to those we focus on in this work, our proposal differs from previous ones by providing: (i) an IEEE 802.11e-enabled MANET testbed, while most proposals are simulation-based, and the real testbeds deployed (e.g. [4]) lack any QoS support at the MAC layer; (ii) a methodology for assessing video streaming effectiveness in MANET environments based on emulation.
3 Our Proposed Video Evaluation Methodology
In this section we introduce our methodology to evaluate video streaming effectiveness in MANETs when relying on different video codecs, and under different network conditions. We first describe the video quality measurement framework adopted, including the different target video quality metrics. Based on this framework, we then propose different transmission environments over which performance is assessed. In particular, preliminary evaluations are made using an emulated point-to-point wireless channel; the goal is to have a clear characterization of the video codec's behavior under different loss conditions. These preliminary evaluations are followed by tests in a real wireless ad-hoc network testbed, both static and mobile, to evaluate the impact of congestion, hop count and mobility on performance.
3.1 Video Quality Measurement Framework
Measuring the quality of a transmitted video is a process that involves several steps. In particular, retrieving video quality indexes usually requires doing a frame-by-frame comparison of the original video against the received video, both in the raw format. Figure 1 presents the steps involved in obtaining video quality indexes for a specific combination of video encoder/decoder and transmission channel. The original video is a raw video sequence, typically in the YUV 4:2:0 format. The
Fig. 1. Steps involved in obtaining video quality indexes for video streaming environments
encoding process relies on one of the available video codecs for data compression prior to transmission. Notice that, to enable a fair performance comparison between different video codecs, it is very important to perform data rate control to get the same target bitrate for transmission. Otherwise, any qualitative comparison would be deemed unfair. For the transmission process, we propose connecting a VLC [9] client to a VLC server through an IP network, since this tool is compatible with a wide variety of formats and video codecs. For the decoding process, we can use a tool such as Mencoder [10] to obtain a raw video sequence in the YUV 4:2:0 format. In addition, and to account for lost frames in the transmission process, frame freezing is performed. The latter is the process through which the same frame is replicated to fill in for missing frames, thereby generating a raw video sequence with the same length as the original one. This is a strict requirement to obtain a meaningful output from the quality measurement process. To determine the impact of the different transmission impairments on the quality of streamed video sequences as experienced by the receiver, we have relied on different metrics. In particular, the metrics we considered were: (i) Peak Signal-to-Noise Ratio (PSNR), (ii) packet loss ratio, and (iii) frame losses. According to different authors [4,11], this set of metrics is quite adequate in the context of video transmission over lossy IP networks since they assess video quality from different perspectives. In particular, the last two metrics are more appropriate to discriminate between different video transmission impairments when losses grow above the 15-20% threshold. Notice that all other video-specific metrics, both objective and subjective, usually fail to differentiate between different scenarios when transmission conditions become very poor, experiencing a saturation effect at the lower edge of the metrics' range. In the following sections we describe the different networks introduced in the Transmission process step according to the proposed evaluation methodology.
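As an illustration of the frame-by-frame comparison step, the Python sketch below computes per-frame PSNR over the luma plane of two raw YUV 4:2:0 sequences; the file names are hypothetical, and the simple repeat-last-frame fallback only stands in for the frame-freezing step, which in the actual framework fills in for the specific frames that were lost.

```python
import numpy as np

def yuv_luma_frames(path, width, height):
    """Yield the Y plane of each frame of a raw YUV 4:2:0 file as float64."""
    ysize, frame = width * height, width * height * 3 // 2
    with open(path, "rb") as f:
        while True:
            raw = f.read(frame)
            if len(raw) < frame:
                return
            yield np.frombuffer(raw[:ysize], dtype=np.uint8).astype(np.float64)

def psnr(ref, rec):
    mse = np.mean((ref - rec) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def sequence_psnr(ref_path, rec_path, width, height):
    received = list(yuv_luma_frames(rec_path, width, height))
    values, last = [], None
    for n, ref in enumerate(yuv_luma_frames(ref_path, width, height)):
        if n < len(received):
            last = received[n]
        values.append(psnr(ref, last) if last is not None else 0.0)
    return values

# e.g. sequence_psnr("original.yuv", "received.yuv", 768, 576)
```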
3.2 Point-to-Point Wireless Channel Emulation
Our experiments were made with real working software for both GNU/Linux and Windows platforms. To make sure that the test sequences were repeatable and reproducible, we created a controlled test environment where we emulated different channel conditions. With this purpose, we interconnected two
Fig. 2. Snapshot of the Castadiva tool: setting up of a multi-hop ad-hoc network environment
computers using Fast Ethernet, and, using the GNU/Linux traffic control tool (tc), we set the channel delay and the packet loss ratio. This way, we were able to emulate different channel conditions to assess the impact on performance at the application layer. In this process, it is important to highlight that the packet loss events introduced were completely random, meaning that no loss burst effects were present. In contrast, the wireless ad-hoc network environments that we describe below will mostly introduce bursty losses.
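A minimal sketch of this channel configuration step is given below, wrapping the tc/netem commands from Python; the interface name eth0 and the delay/loss values are example settings.

```python
import subprocess

def set_channel(dev="eth0", delay_ms=10, loss_pct=1.0):
    """Emulate a lossy point-to-point channel with tc's netem discipline."""
    subprocess.run(["tc", "qdisc", "replace", "dev", dev, "root", "netem",
                    "delay", f"{delay_ms}ms", "loss", f"{loss_pct}%"],
                   check=True)

def clear_channel(dev="eth0"):
    subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)

# e.g. set_channel(delay_ms=10, loss_pct=0.1)   # 10 ms delay, 0.1% random loss
```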
3.3 Wireless Ad-Hoc Network Testbed
Testing video transmission in a real wireless ad-hoc network testbed is a complex task. In fact, the mere setup process of such a testbed requires configuring nodes to share the same IEEE 802.11 parameters, as well as being within the same IP network, and running the same routing protocol in case mobility is introduced. Additionally, relying on users to move mobile terminals around makes the process ambiguous by impeding repeatability and strict control of experiments. Instead, we propose using the Castadiva tool [12] to set up both static and mobile scenarios in a seamless and straightforward manner, while providing experiment repeatability. Through a graphical user interface, Castadiva allows setting up any static or mobile scenario, defining both the layer 2 and layer 3 setup for nodes, providing start/stop traffic control, and gathering experimental results at the end (see figure 2).
4 Performance Evaluation in Lossy Wireless Channels
Following the proposed methodology, in this section we proceed with the first set of experiments where we assess video streaming performance in a point-to-point
Table 1. Details about the different video sequences used (source formats)

Video sequence   Resolution   Degree of motion   Scene changes    Frame rate   Duration
Sequence #1      768×576      high               very frequent    25 fps       18 s
Sequence #2      1280×720     moderate           frequent         25 fps       20 s
wireless channel under different loss conditions. We first introduce the two video sequences used as input for our tests, and then present experimental results when transmitting in an emulated wireless channel.
4.1 Selected Video Codecs and Sequences
For our experiments we used two different video sequences with different characteristics (see table 1). The first video sequence, taken from the "Die hard 4" movie, is an action sequence with a very high degree of motion and many scene changes, being particularly demanding for video codecs (see figure 3 - top). The second video sequence, taken from the "Sunshine" movie trailer (see figure 3, bottom), has less motion and fewer scene changes, but it has a higher resolution compared to the first one (720p vs 576p). Instead of the standard VQEG [13] video sequences (available at ftp://ftp.crc.ca/crc/vqeg/), we opted for the aforementioned sequences since they introduce more scene changes than the latter ones. Notice that scene changes cause bandwidth peaks, which are desirable for our study in order to stress the network. The two video sequences are encoded using both the H.264/AVC and the MPEG-4 video codecs. In particular, we relied on the "XMedia Recode" tool [14] for the video encoding task.
4.2 Experimental Results
In this first set of experiments the goal was to compare the error resilience of both video codecs under analysis when facing different channel conditions. With this purpose we relied on our point-to-point channel emulator to introduce a variable loss rate between video sender and receiver, while setting the delay to 10 ms. In our experiments we used both the H.264/AVC and the MPEG-4/ASP (Advanced Simple Profile) video codecs, setting the target data rate of both video sequences to 900 kbit/s. We varied the packet loss rate between 0.1% and 10%, since this range is deemed adequate by most authors [11] when performing video quality assessment. For each combination of video sequence, codec and packet loss rate, we repeated the experiment 10 times. In the charts that follow, each point represents the mean value of these 10 experiments. Figure 4 shows the PSNR results obtained for both video sequences using the two different video codecs. Concerning video sequence #1 (see Figure 4, left), we found that the H.264/AVC codec achieves an increment of 2 dB, on average, compared to the MPEG-4/ASP video codec. With respect to video sequence #2, we find that the differences between both codecs, although initially high
Fig. 3. Screenshot of video sequence #1 - Die hard 4 (top) - and video sequence #2 - Sunshine (bottom)
Fig. 4. PSNR values for video sequence #1 (left) and video sequence #2 (right) when varying the packet loss rate on the channel
(5.9 dB), become negligible for packet loss rates above 2%. For video sequence #1, in contrast, this effect only occurred at a packet loss rate of 10%. Also, we find that the PSNR values for video sequence #2 experience a faster degradation compared to the first one, and that PSNR values at 10% loss are up to 5 dB lower, approaching noise levels (20 dB threshold). Concerning frame losses, figure 5 shows the impact of packet loss on the frame loss ratio. We can see that there is some relationship between packet losses and frame losses. We also find that both video codecs follow a similar trend, although, for the first video sequence, the MPEG-4/ASP codec performs quite poorly, with a frame loss ratio of 12% when the packet loss ratio is 10%. The H.264/AVC codec offers more consistent operation, achieving a maximum frame loss ratio of 8%, and always maintaining the frame loss ratio below the packet loss ratio. This is the expected behavior, and clearly evidences the superior error resilience offered by the H.264/AVC codec.
Fig. 5. Frame loss ratio values for video sequence #1 (left) and video sequence #2 (right) when varying the packet loss rate on the channel
Overall we find that, although the H.264/AVC codec is expected to offer a much better performance compared to its predecessor (MPEG-4/ASP), the PSNR difference becomes somewhat attenuated when the packet loss ratio becomes too high. We also find that the impact of packet loss on the frame loss ratio remains consistent for H.264/AVC, while MPEG-4/ASP improved its behavior with the higher-resolution video sequence (#2).
5 Performance Evaluation in a Real Wireless Ad-Hoc Network Testbed
Endowed with the knowledge acquired in the experiments of the previous section, and following the proposed evaluation methodology, we now proceed with our study by deploying an ad-hoc network testbed composed of low-cost netbooks (Asus Eee PCs), all of which are equipped with IEEE 802.11g wireless cards. The netbooks were configured to enable QoS support through IEEE 802.11e, meaning that video traffic can be transmitted at a higher priority category compared to best-effort traffic. Due to the complexity of the experiments and the time involved, the tests presented in this section use video sequence #1 alone. We now present experimental results considering both a static and a mobile topology.
5.1 Static Topology Tests
In this section we perform some experiments in a static ad-hoc network environment. In our tests we assess performance when varying the number of hops between video transmitter and receiver. For the network to operate under realistic assumptions, we also create best-effort traffic flows. These flows consist of constant bit rate UDP traffic regulated to produce a moderate/high degree of congestion in the network. In particular, each of these background traffic clients will generate about 8 Mbit/s using 1400-byte packets, although the number of hops associated with each flow differs (see Figure 6). With this purpose, the video streaming client connects to a different streaming server in each test, with each test repeated 10 times.
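A simple constant-bit-rate generator approximating this background traffic is sketched below; the destination address and port are placeholders, and the traffic tool actually used in the testbed may differ.

```python
import socket, time

def cbr_udp(dst=("192.168.1.20", 5001), rate_bps=8_000_000,
            payload_bytes=1400, duration_s=60):
    """Send fixed-size UDP packets at a constant bit rate (about 8 Mbit/s)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = payload_bytes * 8 / rate_bps          # ~1.4 ms between packets
    data = b"\x00" * payload_bytes
    deadline = time.time() + duration_s
    next_send = time.time()
    while time.time() < deadline:
        sock.sendto(data, dst)
        next_send += interval
        time.sleep(max(0.0, next_send - time.time()))
```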
Fig. 6. Static testbed environment used for multi-hop ad-hoc network performance tests
Fig. 7. Packet loss ratio values for both video codecs when increasing the number of hops between source and destination
Figure 7 shows the average packet loss ratio experienced at different hop counts for both video codecs. Notice that, since the target bit rate is the same for both video codecs, the packet loss rate is expected to be similar in both cases. In fact, the differences detected are mostly related to the variability of the wireless channel's capacity in the presence of other sources of interference. Overall, we find that losses grow proportionally to the hop count, with the most significant growth detected when increasing the hop count from one to two hops. For a greater number of hops, the packet loss ratio increase becomes proportionally less significant compared to this initial growth.
A Methodology to Evaluate Video Streaming Performance
285
Table 2. Experimental results obtained in the mobile testbed scenario Video codec H.264/AVC MPEG-4/ASP H.264/AVC MPEG-4/ASP H.264/AVC MPEG-4/ASP H.264/AVC MPEG-4/ASP
Node speed Background Packet loss PSNR Frame loss (m/s) traffic ratio (%) (dB) ratio (%) 10 10 15 15 10 10 15 15
NO NO NO NO YES YES YES YES
0.22 0.21 22.72 22.38 8.75 8.98 31.17 32.08
41.52 43.36 36.62 36.69 34.40 35.42 29.24 29.19
0.39 0.26 20.71 22.34 10.22 12.21 30.28 32.08
Overall, we find that the H.264/AVC codec offers a consistent improvement compared to the MPEG-4/ASP encoder, and that the number of hops is a factor which drastically affects quality when the channel is congested.
Fig. 8. PSNR values (left) and Frame loss ratio (right) for both video codecs when increasing the number of hops between source and destination
5.2
Mobile Topology Tests
The last target scenario defined in the proposed methodology is a mobile ad-hoc network environment. The chosen setting is similar to the previous section, being the main difference that now the video streaming client will be moving across the scenario. This client is initially located away from the streaming server (4hop distance) and, when the experiment begins, it will start moving towards the streaming server at a constant speed (10 or 15 m/s in our experiments), meaning that the hop count will be gradually reduced (see figure 9). We rely on the OLSR routing protocol [15] for topology maintenance. Since OLSR is a proactive routing protocol, it will attempt to make routes available even before they are required. To accomplish this task, it will continuously listen to the HELLO messages generated by the different nodes to determine whether a link has been lost, or if a better route is available. TC messages are propagated
286
T. Bohrloch et al.
Fig. 9. Mobile testbed environment used for multi-hop ad-hoc network performance tests. Emulated distance between static stations is of 200 meters
throughout the network whenever the topology is updated. Parameters such as the HELLO and the TC message intervals were set to standard values [15]. In our experiments, we perform tests both with and without background traffic. The former adopt the same flow characteristics used in the previous section for the static testbed. Each experiment is repeated 10 times and lasts for 40 seconds, which in our setting corresponds to moving the streaming client until it is within range of the streaming server (optimal conditions). Table 2 shows the results obtained for the different scenarios considered. These results show how mobility and congestion affect video performance. We find that, while mobility is the most important source of packet losses, congestion has a greater impact on the average PSNR since losses are more distributed throughout frames. Another important conclusion that can be directly drawn from these results is that packet losses experienced in the presence of both high mobility and congestion are basically the sum of both these causes of loss, meaning that mobility and channel congestion are basically additive factors for the packet loss metric. In terms of frame loss ratio, this metric is strictly related to packet losses, where again we confirm the higher performance offered by the H.264/AVC codec. For an in-depth analysis of the results presented in table 2, we now present the PSNR detail for some of the tests made with the H.264/AVC video codec in the mobile testbed. As shown in figure 10.a, when there is no congestion and the degree of mobility is moderate, the PSNR results are sustained at good levels, with only a brief drop in quality detected. When the degree of mobility increases (see figure 10.b) we find that the PSNR values experience a long period of poor quality; this effect is caused by loss of routes, which are prone to provoke long loss bursts. When increasing the degree of congestion instead, we find that PSNR values experience a much higher variability (see figure 10.c). This poor performance mainly occurs because interference provokes the loss of both video and routing packets. Thus, not only does the video stream suffer from random packet discarding associated with congestion, but it also experiences loss bursts associated with the routing protocol’s difficulty at maintaining routes.
Fig. 10. PSNR results for the H.264/AVC codec in the mobile testbed: a) no background traffic, node speed of 10 m/s; b) no background traffic, node speed of 15 m/s; c) with background traffic, node speed of 10 m/s; d) with background traffic, node speed of 15 m/s
Finally, under both congestion and high mobility conditions, we find that performance is overall quite poor. In particular, it is interesting to note that the periods of poor PSNR values are essentially the combination of the effects of mobility (figure 10.b) and congestion (figure 10.c). This result agrees with previous simulation-based works on this topic [16].
6
Conclusions
Despite the attention received in the past few years, video streaming in wireless ad-hoc networks still lacks an in-depth performance analysis. In particular, no clear methodology has been proposed that offers a detailed video quality assessment along with experiment repeatability and reproducibility; both objectives are of utmost importance to enable validation of results by the research community. Moreover, when it comes to real deployments, the unavailability of IEEE 802.11e support in ad-hoc mode has prevented this technology from being tested for multi-hop video communications, which has further hindered research progress in this area. In this paper we presented a methodology for assessing video streaming quality in MANET environments. Our proposal includes a well-defined video evaluation
framework that allows performance to be assessed when combining different video codecs and different transmission scenarios. In particular, we propose performing the evaluation in several phases: in the first phase we make an in-depth evaluation over a point-to-point lossy channel to determine the codec error resilience under different channel conditions, and in a second phase the performance assessment is carried out in a real ad-hoc network testbed to determine the impact of congestion, hop count and mobility on performance. Our testbed offers full QoS support through IEEE 802.11e, and relies on emulation to devise test scenarios under both static and dynamic network conditions. By following our methodology, we compared the performance of two state-of-the-art codecs: H.264/AVC and MPEG-4/ASP. The performance indexes gathered show that H.264/AVC outperforms MPEG-4/ASP. Nevertheless, we find that the differences between codecs become minimal as the packet loss rate and the video resolution increase. Experimental testbed results show that, for a static environment, the amount of losses experienced under moderate best-effort loads does not compromise video transmission, although video quality may degrade by up to 14 dB when the number of hops increases from one to four. In the presence of mobility, and using the OLSR protocol for routing tasks, we find that performance is quite good when the degree of mobility and congestion is low. When congestion increases, the video sequence experiences a significant quality degradation due mostly to random losses. In contrast, mobility is prone to cause long loss bursts due to route disruption. We also found that, in scenarios characterized by both high mobility and high congestion, the video stream experiences a loss pattern that is essentially an additive combination of the effects of these two factors. Overall, we find that the proposed methodology offers a comprehensive analysis of video streaming performance in MANETs, allowing losses to be discriminated according to their particular origin so that any improvement efforts point in the right direction.
Acknowledgments This work was partially supported by the Ministerio de Educación y Ciencia, Spain, under Grant TIN2008-06441-C02-01.
References 1. Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC). Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, JVTG050 (March 2003) 2. ISO/IEC IS: Coding of Audio-Visual Objects, Part 2: Visual (MPEG-4). Information Technology (November 2001) 3. Schierl, T., Hellge, C., Ganger, K., Stockhammer, T., Wiegand, T.: Multi Source Streaming for Robust Video Transmission in Mobile Ad-Hoc Networks. In: IEEE International Conference on Image Processing, Atlanta, GA, USA (October 2006)
4. Martinez-Rach, M., López, O., Piñol, P., Malumbres, M.P., Oliver, J., Calafate, C.T.: Quality Assessment Metrics vs. PSNR under Packet Loss Scenarios in MANET Wireless Networks. In: International Workshop on Mobile Video (MV 2007), Augsburg, Germany (2007) 5. Sheltami, T.: Performance Evaluation of H.264 Protocol in Ad hoc Networks. Journal of Mobile Multimedia 4(1), 59–70 (2008) 6. Oh, B.J., Chen, C.W.: Performance evaluation of H.264 video over ad hoc networks based on dual mode IEEE 802.11B/G and EDCA MAC architecture. In: IEEE International Symposium on Circuits and Systems (ISCAS 2008), Seattle, WA, USA (May 2008) 7. Calafate, C.T., Malumbres, M.P., Oliver, J., Cano, J.C., Manzoni, P.: QoS Support in MANETs: a Modular Architecture Based on the IEEE 802.11e Technology. IEEE Transactions on Circuits and Systems for Video Technology 19(5), 678–692 (2009) 8. Lee, G., Song, H.: Cross layer optimized video streaming based on IEEE 802.11 multi-rate over multi-hop mobile ad hoc networks. Mob. Netw. Appl. 15, 652–663 (2010) 9. The VideoLAN Project: Vlc: open-source multimedia framework, player and server, http://www.videolan.org 10. The MPlayer Project: Mplayer - the movie player, http://www.mplayerhq.hu/ 11. Monteiro, J.M., Calafate, C.T., Nunes, M.S.: Evaluation of the H.264 Scalable Video Coding in Error Prone IP Networks. IEEE Transactions on Broadcasting 54(3), 652–659 (2008) 12. Hortelano, J., Nacher, M., Cano, J.-C., Calafate, C.T., Manzoni, P.: Castadiva: A Test-Bed Architecture for Mobile Ad hoc Networks. In: 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC 2007), Athens, Greece (September 2007) 13. The Video Quality Experts Group (VQEG), http://www.its.bldrdoc.gov/vqeg/ 14. Dorfler, S.: Xmedia recode 2.2.9.7, http://www.xmedia-recode.de/ 15. Clausen, T., Jacquet, P.: Optimized link state routing protocol (OLSR). Request for Comments 3626, MANET Working Group (October 2003) (work in progress), http://www.ietf.org/rfc/rfc3626.txt 16. Calafate, C.T., Malumbres, M.P., Manzoni, P.: Performance of H.264 compressed video streams over 802.11b based MANETs. In: International Conference on Distributed Computing Systems Workshops (ICDCSW 2004), Hachioji - Tokyo, Japan (March 2004)
Node Degree Improved Localization Algorithms for Ad-Hoc Networks Rico Radeke and Jorge Juan Robles Technische Universität Dresden, Germany, Chair for Telecommunications
Abstract. In this paper we improve the well-known localization algorithms Lateration, Weighted Centroid Localization and Min-Max by using an improved distance estimation, which considers not only the hop count between two nodes but also the neighbor degree information. Simulation studies show the performance improvements. Keywords: distance estimation, range-free localization, Weighted Centroid Localization, Lateration, Min-Max.
1
Introduction
Interest in localization in ad-hoc networks has been increasing in recent years. This is principally due to the applications it enables, such as monitoring, security and navigation systems. In general, the number of hops between two nodes gives an idea of their separation within the network. This information can be exploited by a distance-based localization algorithm to calculate the position of a blind node in relation to anchor nodes, which know their positions. In such algorithms a poor distance estimation negatively impacts the achieved position accuracy. In this paper, in order to improve the distance estimation, the neighbor degree information is additionally taken into consideration. The benefit of the improved distance estimation is evaluated in three different localization algorithms: Min-Max, Weighted Centroid Localization (WCL), and Lateration. The remainder of the paper is structured as follows. Section 2 describes the improvements of the three localization algorithms. Section 3 shows the simulation setup and results.
2
Neighbor Degree Improvements
As shown in [1], the neighbor degree (ND) can be used to improve the localization performance of the weighted centroid approach. We extend this idea to Lateration and Min-Max as well. The following expression from [1] is used to approximate the average distance to n-hop neighbors, where n is the hop count and ND the average node degree on the path to the respective anchor.
d̄_n(ND) = (0.0391 · ND + 0.3338) · n + (−0.1108 · ND + 0.9917)    (1)
for n ≥ 2. For n = 1 we assume the node-degree-independent value d̄_1 = 2/3. The impact of the improved distance estimation is investigated in the following algorithms: The Weighted Centroid Algorithm [3] provides an approximated position of the blind node and is very robust against distance errors. For the position estimation the weighted mean of the reachable anchors is used, where the weights are the inverses of the hop counts to the respective anchors. The Min-Max Algorithm [2] is mainly based on subtractions, additions and logic operations. Therefore it can be easily calculated by nodes with little processing power, such as sensor nodes. Around each anchor a bounding zone is built: a square centered at the anchor with a side length of two times the distance estimated from the blind node. The intersection of the bounding zones defines a region where the blind node is located, and the center of the intersection area is used as the position estimate. Lateration [2] provides the exact position of the blind node when the distance information contains no error. According to [2], this algorithm is more sensitive to distance errors than Min-Max. The position can be calculated by minimizing the sum of the distance errors to the respective anchors, where the distance error is the difference between the estimated distance to the anchor and the distance between the estimated position and the anchor.
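A compact sketch of Equation (1) and of the weighted-centroid estimate just described; the data layout and function names are ours, and the weights follow the inverse-hop-count rule stated above (at least one reachable anchor is assumed).

```cpp
#include <vector>

struct Point { double x, y; };

struct AnchorInfo {
    Point  position;   // known anchor position
    int    hops;       // hop count from the blind node to this anchor
    double avgDegree;  // average node degree (ND) on the path to the anchor
};

// Equation (1): estimated distance (in communication-range units) to an
// n-hop neighbor, given the average node degree ND on the path.
double estimatedDistance(int n, double nd) {
    if (n <= 1) return 2.0 / 3.0;  // d_1 = 2/3, independent of the node degree
    return (0.0391 * nd + 0.3338) * n + (-0.1108 * nd + 0.9917);
}

// Weighted Centroid Localization: weighted mean of the reachable anchors,
// with weights equal to the inverse hop count.
Point wclEstimate(const std::vector<AnchorInfo>& anchors) {
    double wSum = 0.0, x = 0.0, y = 0.0;
    for (const AnchorInfo& a : anchors) {
        const double w = 1.0 / static_cast<double>(a.hops);
        x += w * a.position.x;
        y += w * a.position.y;
        wSum += w;
    }
    return { x / wSum, y / wSum };
}
```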
3
Simulative Investigation
To investigate the performance, we analyze scenario II from [1]. 100 nodes, each with a unidirectional communication range of 10 m, are placed randomly in a 50 × 50 m square area. To provide paths to anchors with different node degrees, three times more nodes are placed on the right side of the area than on the left side. Four anchors are placed in the corners of the square area. All localization algorithms use the three closest anchors in terms of hop count as the basis for the position estimation. Figure 1 shows an example of this setup. To compare the performance we investigate the positioning error described in [3]: the Euclidean distance between the exact real position and the estimated position of a node. The position of each node in a scenario is estimated by the above-mentioned algorithms. For each algorithm we compare three distance estimations. The first multiplies communication range and hop count. The second one additionally multiplies by a scaling factor of f = 2/3 motivated by [1]. The last one uses Equation 1, which is based on the node degree ND, multiplied by the communication range. An example of the position estimation for a single node is shown in Figure 1. 10,000 scenarios are randomly generated. In each scenario the positions of all nodes are estimated and the corresponding positioning errors are averaged. Figure 2 shows the mean positioning errors over all scenarios with 95% confidence intervals.
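The Min-Max estimate and the positioning error used in this comparison can be written down directly; the sketch below (names ours) takes the per-anchor distance estimates, however they were obtained, as input, and assumes at least one anchor with one distance value each.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Point { double x, y; };

// Min-Max: around each anchor, build a square of side 2*d (d = estimated
// distance), intersect all squares, and return the center of the intersection.
Point minMaxEstimate(const std::vector<Point>& anchors,
                     const std::vector<double>& d) {
    double xLo = -1e18, yLo = -1e18, xHi = 1e18, yHi = 1e18;
    for (std::size_t i = 0; i < anchors.size(); ++i) {
        xLo = std::max(xLo, anchors[i].x - d[i]);
        yLo = std::max(yLo, anchors[i].y - d[i]);
        xHi = std::min(xHi, anchors[i].x + d[i]);
        yHi = std::min(yHi, anchors[i].y + d[i]);
    }
    return { (xLo + xHi) / 2.0, (yLo + yHi) / 2.0 };
}

// Positioning error: Euclidean distance between real and estimated position.
double positioningError(Point real, Point est) {
    return std::hypot(real.x - est.x, real.y - est.y);
}
```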
Fig. 1. Localization Example
Fig. 2. Localization Results
In WCL the introduction of a scaling factor does not improve the results, due to the fact that a scaling factor modifies all weights linearly in the same way. The scaling factor improves the Min-Max localization by 11.9% and Lateration by 68.8%. This shows the robustness of Min-Max and the sensitivity of Lateration to distance estimation errors, as also stated in [2]. Additional improvements are made by the ND-based improved distance estimation for all algorithms: WCL is additionally improved by 4.5%, Min-Max by 9.5% and Lateration by 3.9%. Since our approach uses only the hop count and node degree, which are purely topological information, and requires no real distances or positions (except the anchor positions), it performs quite well compared to the original algorithms. If the node degree information is not available, at least a scaling factor of f = 2/3 should be used to decrease the positioning error.
4
Conclusions
In this paper we have shown that the positioning accuracy of the Lateration, Min-Max and Weighted Centroid Localization algorithms can be improved. At least a scaling factor of two thirds should be used for the distance estimation in the original algorithms. This can be further improved by additionally considering the node degree. Our results also confirm that Lateration is very sensitive to distance errors, whereas WCL and Min-Max are more robust.
References 1. Radeke, R., Türk, S.: Node Degree based Improved Hop Count Weighted Centroid Localization Algorithm. In: Proc. of 17th GI/ITG Conference on Communication in Distributed Systems (KiVS 2011), pp. 194–199 (2011) 2. Langendoen, K., Reijers, N.: Distributed localization in wireless sensor networks: a quantitative comparison. Computer Networks: The International Journal of Computer and Telecommunications Networking, Special Issue on Wireless Sensor Networks, 43(4) (November 15, 2003) 3. Blumenthal, J., Grossmann, R., Golatowski, F., Timmermann, D.: Weighted Centroid Localization in WLAN-based Sensor Networks. In: Proceedings of the 2007 IEEE International Symposium on Intelligent Signal Processing, WISP 2007 (2007)
Using BPEL to Realize Business Processes for an Internet of Things Nils Glombitza, Sebastian Ebers, Dennis Pfisterer, and Stefan Fischer Institute of Telematics, University of Lübeck Ratzeburger Allee 160, 23538 Lübeck, Germany {glombitza,ebers,pfisterer,fischer}@itm.uni-luebeck.de
Abstract. In the vision of an IoT, trillions of tiny devices extend the Internet to the physical world and enable novel applications that have not been possible before. Such applications emerge out of the interaction of these devices with each other and with more powerful server-class computers on the Internet. Programming such applications is challenging due to the massively distributed nature of these networks combined with the challenges of embedded programming. In addition, resource constraints, device heterogeneity, and the integration with the Internet further complicate this situation. In this paper, we present a programming-in-the-large approach for resource-constrained devices such as wireless sensor nodes. Our approach is to model such applications using the Business Process Execution Language (BPEL), which is successfully and widely used in the Internet to model complete applications and business processes. However, BPEL and its associated technologies are too resource-demanding to be directly applied in resource-constrained environments. We therefore use the BPEL model as input to a code generation process that generates custom-tailored, lean code for different target platforms. The resulting code is fully standard-compliant and allows a seamless integration of IoT devices in enterprise IT environments. We present an exhaustive evaluation on real hardware showing the first-rate performance of the approach. Keywords: BPEL, WSN, Sensor Networks, Business Process Modeling, Web Services.
1
Introduction
In the vision of the IoT, trillions of tiny devices extend the Internet to the physical world and enable novel applications that have not been possible before. Such applications emerge out of the interaction of these devices with each other and with more powerful server-class computers on the Internet. Programming such applications is challenging due to the massively distributed nature of these networks combined with the heterogeneity of the devices. In this paper, we present a programming-in-the-large approach for resource-constrained devices such as wireless sensor nodes. Our approach is based on the observation that programming
individual applications in an Internet of Things (IoT) is hardly possible anymore and that the fundamental challenge is the integration with the Internet on the application level. In the Internet, Service Oriented Architectures (SOA) are increasingly used to model, implement, and execute complex applications and business processes. The predominant language for modeling them is the Business Process Execution Language (BPEL), which orchestrates individual services into complete applications. BPEL uses standard Web Services as its communication technology. Web Services are based on standard Internet technologies such as SOAP and XML for message encoding and HTTP and TCP/IP for message transport. By contrast, wireless sensor nodes are often programmed using low-level embedded languages and accessed using proprietary protocols. The main reasons for this are the embedded nature and the severe resource constraints of wireless sensor nodes. The desire for long lifetimes paired with the constrained energy supply from batteries implies that computing power, memory capacity, and networking bandwidth are several orders of magnitude smaller than those found in backend IT systems and require optimized software and networking solutions. In this paper, we present a standard-compliant approach to integrate resource-constrained devices into BPEL-based processes. However, BPEL and its associated technologies are too resource-demanding to be directly applied in resource-constrained environments. We therefore use BPEL models as input to a code generation process that generates custom-tailored, lean code for different target platforms. The resulting code is fully standard-compliant and allows a seamless integration of IoT devices with the Internet. Using our approach, IoT devices become first-class citizens in business processes, and business processes can be executed directly on sensor nodes. We present an exhaustive evaluation on real hardware showing the first-rate performance of the approach. The remainder of this paper is structured as follows. In the following section, we introduce a motivating use case which is also used in our evaluation. Section 3 introduces fundamental technologies of Web Service-based business process modeling and execution and Section 4 introduces related work. Section 5 introduces our concept in detail and Section 6 presents implementation details. Finally, Section 7 evaluates our approach and Section 8 concludes this paper with a summary.
2
Use Case
As a use case, we consider a logistics process where companies are concerned with storing and shipping trailers with dangerous goods. For their storage inside port areas as well as on ferry ships, strict laws and rules apply. For instance, dangerous goods must be declared at port authorities and shipping lines. In addition, trailers carrying different types of dangerous goods must have a minimum distance between them. In this work, we present two business processes which are part of the process of shipping dangerous goods and both processes are actually executed on sensor nodes.
The first, quite simple, process is concerned with storing cargo data on a sensor node attached to a trailer. This process receives a cargo information message (carrying cargoID and amount encoded as two integer values) from the cargo dispatcher, checks if there is enough space on the trailer and stores this data. It replies with a message containing the amount of stored cargo (encoded as an integer).
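The paper does not show the generated code for this process, but its logic can be sketched as follows; all type and member names are hypothetical, the integer widths are assumptions, and the capacity bookkeeping is kept outside the handler to match the later remark that LoadCargo itself is stateless.

```cpp
#include <cstdint>

// Node-local cargo store; the LoadCargo process only reads and updates it.
struct CargoStore {
    uint8_t capacity;  // total space on the trailer (hypothetical unit)
    uint8_t used;      // space already occupied
};

struct CargoInformation { uint8_t cargoId; uint8_t amount; };
struct StorageReply     { uint8_t storedAmount; };

// Handler invoked when a cargo information message arrives: check free space,
// store the cargo data, and reply with the amount actually stored.
StorageReply onCargoInformation(CargoStore& store, const CargoInformation& msg) {
    uint8_t stored = 0;
    if (store.used + msg.amount <= store.capacity) {
        store.used += msg.amount;
        stored = msg.amount;
    }
    return StorageReply{ stored };
}
```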
Fig. 1. Monitoring Dangerous Cargo in the Parking Lot of a Port (CargoMonitoring)
The second process (executed on each trailer's sensor node) is more complex and ensures that two trailers with dangerous goods never park too close to each other (cf. Figure 1). Imagine that trailer a parks inside a port's area and a second trailer b arrives and parks close to a. Additionally, b joins the ad-hoc network built by all parking trailers. b sends its cargo information and position to a, which replies with the same information. Both check whether they are allowed to park that close together. If this is not the case, they invoke an alarm service at the port authority. Afterwards, they wait to be told to park at a new position; this instruction is answered with a confirmation.
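The concrete parking regulation is not specified in the paper; as a hedged illustration, the check exchanged between the two trailers could look like the following sketch, which assumes a single hypothetical rule (a minimum distance between trailers carrying incompatible dangerous-goods classes).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical trailer state exchanged between the two nodes.
struct TrailerState {
    double  x, y;        // current position
    uint8_t goodsClass;  // dangerous-goods class of the cargo
};

// Returns true if the (assumed) minimum-distance rule is violated and the
// alarm service at the port authority should be invoked.
bool ruleIsViolated(const TrailerState& a, const TrailerState& b,
                    double minDistance) {
    if (a.goodsClass == b.goodsClass) return false;  // assumed compatible
    const double dist = std::hypot(a.x - b.x, a.y - b.y);
    return dist < minDistance;                       // too close: raise alarm
}
```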
3
Fundamentals
In this section, we introduce conceptual and technological fundamentals required to realize Web Service-based business process modeling and execution in sensor networks.
3.1
Business Process Modeling
BPEL (the Business Process Execution Language [17]) is the de-facto standard in enterprises for modeling and executing Web Service-based business processes. BPEL processes are typically designed using graphical tools that provide a domain-specific and process-oriented definition of processes. These tools allow domain experts to specify the IT realization of business processes without the need for IT experts. BPEL processes are often generated automatically by model-to-model transformations from other, sometimes proprietary, business process modeling standards.
In BPEL, individual activities are provided by external Web Services which are used by a BPEL process. A Web Service used by a BPEL process is called a Partner Link. Each BPEL process itself exposes its functionality in the form of one or many Web Services. Hence, a BPEL process can again be used by another BPEL process. To model the flow of a business process, BPEL provides basic activities and functions for simple data manipulations as well as exception and transaction handling. With so-called structured elements such as loops or conditional commands, the control flow between activities can be modeled. Since BPEL is an XML-based language, direct storage and execution (which means interpretation of the BPEL document at run-time) on a sensor node is not possible due to the resource constraints imposed by typical sensor network hardware. We discuss how BPEL can be applied to sensor networks in Section 5.
3.2
Web Services for Sensor Networks
As mentioned above, Web Services are the basis for describing, executing, and using BPEL-based business processes. To make use of them in sensor networks, sensor nodes must be able to use and to expose Web Services. In this section, we discuss the challenges of using Web Services on sensor nodes and discuss viable solutions. As shown in Figure 2, the normal Web Service technology stack uses well-known Internet technologies such as XML and SOAP [21] for encoding Web Service messages as well as HTTP or SMTP over TCP/IP for conveying these messages from source to sink. However, these technologies easily overburden the scarce resources of WSN platforms. Using Web Services in such environments therefore requires two things: an efficient encoding of XML/SOAP messages and a lean transport protocol instead of the resource-demanding HTTP over TCP/IP.
[Figure 2 shows the Web Service technology stack: business process (BPEL, WWF, ...), description (WSDL), message (SOAP/XML, or SOAP message compression), and transport (HTTP, SMTP over TCP/IP, UDP/IP, or LTP over any routing/networking protocol).]
Fig. 2. Extended Web Service Technology Stack
In previous work, we presented a solution for each of these two challenges. In [19], we introduced a highly efficient compression scheme for XML messages called microFibre, and in [9] we proposed a lightweight Web Service
transport protocol (called Lean Transport Protocol, LTP) which can optionally use technologies such as microFibre to convey compressed XML messages (called SOAP message compression or SMC in the following). The integration of LTP and SMC into the standard Web Service architecture is depicted in Figure 2. Even though HTTP and XML are the predominant choice in the Internet, the Web Service technology stack does not prescribe their use. A property of our LTP and SMC is that they are fully standard-compliant Web Service technologies. In the following, we briefly introduce the main features and properties of LTP and SMC. The basic idea of microFibre (or SMC in general) is to exploit structural knowledge about the exchanged XML messages and provide custom-tailored (de-)compression code. This is possible since Web Services and the SOAP messages exchanged between server and client are well-defined in WSDL (Web Services Description Language, [4]) and XML Schema documents. This structural knowledge can be exploited to compress XML documents adhering to such a grammar more efficiently than with general or non-schema-based XML compressors. In principle, LTP/SMC supports any payload type (e.g., plain XML or microFibre encoding), but in the following we use microFibre due to its high compression ratios and its mature code generation framework that produces (de)serialization code for various WSN platforms and programming languages. For more information please refer to [9, 19]. As mentioned above, LTP is a light-weight Web Service transport protocol. It allows an exchange of Web Service messages between
1. sensor nodes within a single WSN,
2. hosts in the Internet (e.g., enterprise IT servers),
3. sensor nodes and hosts in the Internet, and
4. sensor nodes in different (remote) WSNs.
To support these communication patterns, LTP is realized as an overlay network. To address arbitrary Web Service endpoints, a standard-compliant URL-based addressing scheme is used (e.g., ltp://[email protected]/WS). The message exchange is transparent to the user or developer. In the Internet, LTP packets are forwarded using TCP or UDP, while inside a sensor network arbitrary transport and routing protocols are supported. For instance, an application-specific tree-routing protocol or even the recently standardized IPv6-based 6LoWPAN [13] could be used. Gateways then convert between the WSN and the Internet protocols. Thus, we can adapt LTP to the application-specific needs of the WSNs, since routing protocols in WSNs are very application-specific due to clustering, duty cycling, etc. In addition to the message exchange as its core feature, LTP optionally provides header compression as well as message fragmentation in order to address the limited bandwidth and radio packet sizes of WSNs. For more information and a thorough evaluation, please refer to [9, 10].
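The following sketch illustrates only the general idea behind schema-aware compression as described above, not microFibre's actual wire format: because the structure of a message is fixed by its schema, its fields can be written as fixed-width binary values instead of XML text. The message layout and field widths are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical message whose structure is fixed by an XML Schema: two small
// integer fields. A schema-aware encoder can therefore emit fixed-width
// binary values instead of tags and text.
struct CargoInformation { uint8_t cargoId; uint8_t amount; };

std::vector<uint8_t> encode(const CargoInformation& m) {
    return { m.cargoId, m.amount };              // two bytes on the wire
}

CargoInformation decode(const std::vector<uint8_t>& buf) {
    // Assumes buf holds exactly the two fields written by encode().
    return CargoInformation{ buf[0], buf[1] };
}
```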
4
Related Work
In the literature, some authors present approaches that use SOA programming paradigms as well as Model-driven Software Development (MDSD) to develop sensor network applications. But to the best of our knowledge, there is no contribution that combines MDSD (or business process modeling in particular) with a Web Service-based SOA in WSNs. In this section, we present related work and discuss similarities and differences to our contribution. In [1], Amundson et al. present a SOA-based approach for WSNs. While Web Services are used in the Internet, proprietary protocols are used inside the WSN. A gateway is needed to convert between both communication protocols. Pandey et al. [18] and Glombitza et al. [9] present approaches to integrate sensor networks into business processes utilizing the BPEL standard for the process definition and execution. However, both approaches run the business processes in the backend. With Flow [15, 16], Naumowicz et al. present a graphical editor and different domain-specific languages for implementing WSN applications. Fuchs et al. [7] propose an activity diagram based approach to execute tasks on different sensor nodes. In both approaches, neither Web Services nor the SOA concept is utilized. Sliver [12] is a light-weight BPEL engine running on PDA-class devices. It is based on standard-compliant Web Service technologies but demands too many resources to run on sensor nodes. A first approach to run BPEL processes on sensor nodes is proposed by Spieß et al. [20]. BPEL processes can be executed in a distributed fashion inside the WSN as well as in the backend. However, no Web Services are used as communication technology. In [11], Glombitza et al. present SM4RCD (State Machine for Resource Constrained Devices), a state machine-based domain specific language to orchestrate Web Services. SM4RCD is executed on sensor nodes and utilizes standard-compliant Web Services as communication technology.
5
Designing Business Processes for Sensor Networks
As mentioned in the introduction, business processes are characterized by permanent adaptation to constantly changing market requirements and enterprise strategies. This business process-driven necessity can only be fulfilled if the IT systems allow for fast and flexible adaptation as well. Current enterprise IT systems fulfill this requirement by using different business process modeling and execution methods. However, for WSNs, no such technology exists as of today. We propose to use graphical business process modeling methods to realize business processes which are executed on sensor nodes. To realize this, the following three requirements have to be fulfilled:
1. Reusability of functionalities encapsulated in self-contained services.
2. Communication via standardized protocols enabling the use of distributed services which can even be used in inter-enterprise relationships.
3. A business process-specific, graphical language to define business processes by composing services and defining arbitrary control flows.
In the following, we first introduce our approach on the conceptual layer (cf. Section 5.1) before describing the technological perspective (cf. Section 5.2).
5.1
Concept
Our approach combines the SOA (Service Oriented Architecture) concept with business process modeling methods in order to achieve a fast, flexible, and business process-driven definition of processes in WSNs. Service Oriented Architecture. According to the SOA concept, application logic is encapsulated in self-contained services which provide their functionalities via clearly defined interfaces and loosely coupled communication connections. Thus, existing services can be easily combined into new services providing higher-level functionalities. In such a SOA realization, services can range from basic services (e.g., the storing of a bill of lading) to high-level services (e.g., the monitoring of dangerous goods). In our approach, not only every functionality of enterprise IT components but also the functionalities of sensor nodes are encapsulated in services. As a result, an easy and flexible reuse of existing functionalities is provided, which reduces development times and costs of WSN applications. Business Process Modeling. Using the SOA concept improves application and business process development in WSNs since a reimplementation of already existing functionalities is avoided. However, highly specialized WSN experts are still needed to orchestrate existing services into new ones, since this task has to be done using classical or WSN-specific programming languages such as C, C++ or nesC. Thus, a technical process realization by a domain expert is impossible. Business processes always have to be designed by a domain expert from the business perspective in a first step before a WSN expert can realize the process as an executable application. As a result, error-proneness and time-to-market are still very high. In our approach, we propose to use standard business process modeling languages in order to allow for a fast, flexible and less error-prone definition of business processes from a business perspective. This allows domain experts to easily define these processes with graphical tool support. Furthermore, all process models are platform-independent and can be easily executed on different WSN and enterprise IT platforms.
5.2
Architecture
In enterprise IT environments, Web Services are the predominant technology used to realize SOAs. As described in Section 3.1 and depicted in Figure 2, there are many business process modeling languages which orchestrate services into business processes using Web Services as middleware. The most important Web Service-based business process modeling and execution language is BPEL, which is widespread in enterprise IT environments. To be compatible with these systems as well as to benefit from existing tools, we propose to use BPEL on the
business process layer as well as Web Services on the communication layer to realize business process modeling in WSNs. Web Services. As depicted in Figure 2, BPEL requires SOAP Web Services using WSDL (Version 1.1) as its interface definition language. In Section 3.2, we introduced our approach for realizing efficient Web Services which are applicable in WSNs (LTP for Web Service message transport and SMC for the efficient compression of SOAP messages). LTP and SMC are fully compliant with the SOAP (Version 1.2) and WSDL (Version 1.1) standards, which allows them to be used as a Web Service transport binding for BPEL (for details please refer to [9]). Using LTP/SMC, a transparent message exchange between sensor nodes and enterprise IT systems is achieved (cf. Section 3.2). As a result, BPEL processes running on sensor nodes can use services in different WSNs and enterprise IT environments and vice versa. Executing BPEL on Sensor Nodes. In our approach, we use standard-compliant BPEL (Version 2) to define process models. Such BPEL files can be created either manually or using visual BPEL designers. Since BPEL is an XML standard, an XML document representing a BPEL model is generated as output of such a design process. In enterprise IT environments, these models are executed by heavy-weight process execution engines which interpret BPEL XML documents. Since we are targeting sensor nodes as execution platform, interpretation at run-time is not feasible; the storage of the XML process models alone would overburden most WSN platforms. Instead, our approach is to generate custom-tailored code from the process models.
[Figure 3 shows the pipeline: parsing & validating; linking to a Web Service framework; linking to an operating system; generating WS messaging code; generating process logic code; generating process management code; compiling for the target platform.]
Fig. 3. Process of Transferring a BPEL Model into Executable Code for Sensor Nodes
Figure 3 depicts the transformation of BPEL models into code for a WSN platform. After parsing and validating the process model, it is bound to a specific Web Service framework. Architecturally, arbitrary Web Service frameworks for sensor networks are supported, but in our current implementation we only support LTP/SMC as presented in Section 3.2. Afterwards, the generated code is linked to a specific operating system. Again, arbitrary WSN operating systems are supported; in its current version, our generator supports TelosB [5], Pacemate [14], and iSense [3] sensor nodes. A challenge when transforming a BPEL model to code for sensor network platforms is that such platforms (including iSense) do not support multithreading due to resource constraints imposed by the hardware. But as BPEL processes are characterized by parallel control flows, the generated code must emulate the parallel execution of several control flows. We solve this by sequentially executing parallel control flows, which only increases execution time but otherwise yields
identical results. Some very special BPEL flows cannot be realized without real multithreading; these, and possible workarounds, are discussed in Section 6. In the next three steps, C++ code for the different process-specific components is generated from the BPEL model. First, for every Partner Link (i.e., an external Web Service used by the BPEL process) of the model, code for the Web Service communication is generated using the WSDL descriptions of the referenced Web Service endpoints. Second, the process flow of the BPEL process is transformed to C++ code representing the application's logic. And finally, the BPEL runtime is generated, which contains code to manage all process instances and incoming messages and to control the process flow. After generation, the code, which comprises the Web Service clients and servers, the BPEL runtime, and the application logic, is compiled for the target platform. The code generation approach has several advantages over interpretation of BPEL:
i) Memory and CPU consumption: an interpretation of BPEL would overburden the capabilities of sensor nodes in terms of CPU and memory consumption,
ii) Performance: the generation of optimized code for a specific platform, which is compiled to machine code, improves the performance,
iii) Code size: the required features of the process engine and the process logic are already known at compile time, so only these are included in the generated code,
iv) Compile-time validation: many errors can be detected during compilation which would only occur at runtime when interpreting, and
v) Logging: logging instructions are only generated for debugging.
6
Realization
In our approach, we use standard-compliant BPEL. Thus, there was no need to implement a graphical editor to design or adapt BPEL processes; any standard-compliant BPEL editor can be used. We developed a code generator which transforms the XML source code of the BPEL process to C++ code. The generated code can be directly run on any target platform for which a GCC C++ compiler [8] is available. The generated classes Process and Dispatcher represent the BPEL process flow and a lean BPEL runtime, respectively. In addition to creating and destroying process instances on demand, the runtime relays messages from and to the individual process instances. The code for the Web Service messaging layer is generated by microFibre, which is incorporated as a module in our code generator but could easily be replaced by a different serializer. As transport binding, compressed SOAP over LTP is used in order to realize the communication patterns presented in Section 3.2. In our approach, we use iSense as the operating system and LTP as the Web Service transport layer. These components are used as static libraries which are independent of the process logic. Note that the implementation of the business logic itself is completely independent of these static libraries. The only generated component which depends on the operating system is the runtime component. However, the runtime solely depends on the availability of an interface to register
tasks. Thus, our approach is easily adaptable to any operating system which provides an interface to register tasks. BPEL expects to run multiple processes, or parts of them, in parallel. The lack of multi-threading capabilities of typical WSN devices leads to two problems: (a) we cannot technically run multiple activities of a process in parallel, even though this can be modeled using BPEL; (b) many BPEL processes have to wait for events like the expiration of timers and the arrival of messages. Since we are limited to single-threaded execution, setting the process into sleep mode would do the same to the whole device. If the process waits busily, the runtime is blocked and cannot forward the awaited incoming message, which would block the whole device as well. Problem (a) is solved by executing activities modeled to run in parallel one after another, which might lead to longer overall execution times but does not influence the semantics. To address Problem (b), the originally continuous BPEL graph has to be split into multiple sub-graphs. A sub-graph ends whenever the process has to actually wait for an event, e.g., an incoming message. Each sub-graph's individual entry point is marked by the processing of an event, e.g., the reception of an awaited message. Hence, even on single-threaded devices, the process can wait for events passively.
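The paper names the generated classes Process and Dispatcher but does not show their interfaces, so the following is only a hand-written sketch of the event-driven pattern described above: the runtime relays each incoming message to its process instance, and each call executes one sub-graph and then returns instead of blocking. All member names and the state set are hypothetical.

```cpp
#include <cstdint>
#include <map>

struct Message { uint32_t processInstanceId; uint16_t type; /* payload ... */ };

class Process {
public:
    // Each call executes one sub-graph of the BPEL flow and returns when the
    // process has to wait for the next event; no thread is ever blocked.
    void onEvent(const Message& msg) {
        (void)msg;  // payload handling omitted in this sketch
        switch (state_) {
        case State::AwaitCargoInfo:
            // ... process cargo information, invoke partner links ...
            state_ = State::AwaitParkingInfo;
            break;
        case State::AwaitParkingInfo:
            // ... handle new parking information, send confirmation ...
            state_ = State::Done;
            break;
        case State::Done:
            break;
        }
    }
    bool finished() const { return state_ == State::Done; }
private:
    enum class State { AwaitCargoInfo, AwaitParkingInfo, Done };
    State state_ = State::AwaitCargoInfo;
};

// The runtime: creates and destroys process instances on demand and relays
// incoming messages to the instance they belong to.
class Dispatcher {
public:
    void onIncomingMessage(const Message& msg) {
        Process& p = instances_[msg.processInstanceId];  // create on demand
        p.onEvent(msg);                                   // run one sub-graph
        if (p.finished()) instances_.erase(msg.processInstanceId);
    }
private:
    std::map<uint32_t, Process> instances_;
};
```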
7
Evaluation
The main benefit of our approach is the enormous reduction of the development effort. We used the COCOMO (Constructive Cost Model, [2]) approach to estimate the development time of realizing the CargoMonitoring use case manually and compared the result to the actual time it took to realize the same process with our approach. With BPEL, it took less than one hour, while the estimated development time for a manual implementation is about 7 person-months. In the following we present measurements which show that our concept is feasible in the WSN domain and does not impose significant overhead. We evaluated both processes introduced in Section 2, which will be referred to as LoadCargo for the first and CargoMonitoring for the second one, respectively. While LoadCargo's message payloads just comprise simple integer values, the payload in CargoMonitoring consists of complex data structures. Additionally, LoadCargo is stateless and does not call other Web Services. It is not a typical business process example and, due to its simplicity, a process like that would rather be implemented manually. Nevertheless, it shows the maximum overhead introduced when using our code generation engine. In contrast, CargoMonitoring is stateful, sends and receives complex messages and calls other Web Services to fulfill its task. Since this is common in ordinary business processes, it is used to show the overhead which is typically to be expected. We compare the memory usage, code and message sizes as well as the round-trip times of the C++ applications generated by our code generation engine to functionally equivalent but manually implemented and optimized applications to evaluate the overhead introduced by the code generation process.
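The paper gives only the resulting COCOMO estimate, not the size figure behind it. Purely as an illustration of the basic COCOMO relation, assuming (hypothetically) organic mode and a hand-written implementation of about 3 KLOC, the effort estimate would be:

```latex
E = a \cdot \mathrm{KLOC}^{\,b} = 2.4 \cdot 3^{1.05} \approx 7.6 \ \text{person-months}
```

which is in the same range as the roughly 7 person-months reported above.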
We measure the memory usage of the processes utilizing the WSN simulator Shawn [6], which simulates a 32-bit platform using the iSense operating system. We use debugging messages to record the individual allocations and deallocations of stack and heap memory. Due to the resource constraints of the devices, this cannot be directly run on sensor node hardware. However, the code and message sizes as well as the round-trip times are evaluated on a Pacemate sensor node (32-bit ARM processor), which runs the iSense OS, too.
7.1
Memory Usage
Table 1 summarizes the results of the memory consumption over time. The amount of memory required to enable the application to receive and process messages is denoted as minimal amount.
Table 1. Statistical evaluation of the RAM memory usage (in byte)
                          LoadCargo               CargoMonitoring
                       min    max    avg        min    max     avg
Generated Process      211   1754   827.6       247   1909   1164.6
Manual Implementation   60   1482   660.5       200   1798   1100.6
As can be seen by comparing the generated application to its manually implemented equivalent, the overhead significantly depends on the complexity of the process. Since LoadCargo is stateless, the previously described runtime component can be completely omitted in the manual implementation. This results in a 2.52 times higher minimal and a 0.25 times higher average memory usage of the generated application. However, CargoMonitoring is stateful and calls other Web Services which renders the runtime component mandatory. The generated application requires a 0.24 times higher minimal and a 0.06 times higher average amount of memory compared to the manual implementation.
7.2
Code and Message Sizes
The results of the code size evaluation are depicted in Figure 4(a). Since LTP is used as a static library, its code size is constant throughout the evaluation. The Web Service messaging component is generated by microFibre; it is considered to be optimal in the manual implementations of the two processes. However, its code size significantly depends on the number of message types as well as the complexity of their payloads. See Table 2 for detailed information about the messages sent and received by the two processes. When comparing the code size of the process logic, which comprises the process flow and runtime component,
a significant overhead of about 74.5% can be noted in LoadCargo. Since this process is stateless and does not call other Web Services, the runtime part is completely omitted in the manual implementation. This is not stipulated by our code generator since it is not a typical business process scenario. In the more complex scenario CargoMonitoring, the runtime part is mandatory for both variants. Due to the optimization of the process flow implementation, we observe an overhead of about 4.7%.
&DUJR0RQLWRULQJ
(b) Round-Trip Time
(a) Codesize Fig. 4. Codesize and Round-Trip Time Table 2. Message Sizes of the Evaluation Processes Process
Message
SendCargoInformation LoadCargo ReplyStorageInformation AlarmRequest AlarmResponse SendCargoInformation CargoMonitoring ReplyCargoInformation SendNewParkingInformation ReplyNewParkingConfirmation
7.3
Size 2 2 12 3 9 9 5 1
B B B B B B B B
Round-Trip
As can be seen in Figure 4(b), the round-trip times differ considerably between the two processes, which is due to the higher complexity of the process logic as well as the larger number of messages sent and received by CargoMonitoring. For CargoMonitoring, the round-trip time of the generated application nearly equals that of the manually implemented application. Again, a significant difference can be observed when comparing the generated and manually implemented versions of LoadCargo, which is due to the overhead introduced by the BPEL runtime component of the generated application.
These results match those of the previously evaluated parameters: the higher the complexity of the process, the smaller the overhead of the generated code compared to a manual implementation. Since the advantages of model-driven software development show particularly for complex scenarios, a small overhead is to be expected when our code generation engine is applied to typical business processes.
8
Conclusion
In this paper, we present a programming-in-the-large approach for resource-constrained devices such as wireless sensor nodes. In this scenario, applications are modeled using the Business Process Execution Language (BPEL) instead of low-level embedded programming languages. Our approach is based on a code generation step which transforms BPEL models to source code that easily fits on resource-constrained devices, as we show in an exhaustive evaluation conducted on real hardware. Using our approach, sensor nodes become first-class citizens in enterprise IT environments, where they offer their services as Web Services and run application logic as defined in a BPEL model. In addition, as part of the BPEL definition, they consume Web Services offered by other sensor nodes or Internet-based enterprise servers. Using this approach, domain experts can use graphical tools to model business processes without requiring expert knowledge of embedded programming. Our approach is fully standard-compliant and uses the well-defined extension mechanisms of the Web Service technology stack. In addition to qualitative properties such as faster time-to-market or a less error-prone development process, our exhaustive evaluation shows the first-rate performance of our implementations.
References 1. Amundson, I., Kushwaha, M., Koutsoukos, X., Neema, S., Sztipanovits, J.: Efficient integration of web services in ambient-aware sensor network applications. In: 3rd IEEE/CreateNet International Workshop on Broadband Advanced Sensor Networks, BaseNets 2006 (2006) 2. Boehm, B.W.: Software Engineering Economics. Prentice Hall PTR, Englewood Cliffs (1981) 3. Buschmann, C., Pfisterer, D.: iSense: A Modular Hardware and Software Platform for Wireless Sensor Networks. Tech. rep., 6. Fachgespr¨ ach Drahtlose Sensornetze der GI/ITG-Fachgruppe Kommunikation und Verteilte Systeme (2007) 4. Christensen, E., Curbera, F., Meredith, G., Weerawarana, S.: Web Services Description Language (WSDL) 1.1. Tech. rep., World Wide Web Consortium W3C (2001), http://www.w3.org/TR/wsdl.html 5. Crossbow Technology Inc.: TELOSB - TelosB Mote Platform (2009), http://www.xbow.com/ 6. Fekete, S., Fischer, S., Kroeller, A., Pfisterer, D.: Shawn. Simulator for sensor networks by the SwarmNet project (2004), http://www.swarmnet.de/shawn
Using BPEL to Realize Business Processes for an Internet of Things
307
7. Fuchs, G., German, R.: UML2 Activity Diagram based Programming of Wireless Sensor Networks. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering and ICSE Workshops (2010) 8. GCC team: GCC, the GNU Compiler Collection (2010), http://gcc.gnu.org/ 9. Glombitza, N., Pfisterer, D., Fischer, S.: Integrating Wireless Sensor Networks into Web Service-Based Business Processes. In: MidSens 2009: Proceedings of the 4th International Workshop on Middleware Tools, Services and Run-Time Support for Sensor Networks, pp. 25–30. ACM, New York (2009) 10. Glombitza, N., Pfisterer, D., Fischer, S.: LTP: An Efficient Web Service Transport Protocol for Resource Constrained Devices. In: Seventh Annual IEEE Communications Society Conference on Sensor, Mesh, and Ad Hoc Communications and Networks, IEEE SECON 2010 (2010) 11. Glombitza, N., Pfisterer, D., Fischer, S.: Using State Machines for a Model Driven Development of Web Service-Based Sensor Network Applications. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering and ICSE Workshops, Cape Town, South Africa (2010) 12. Hackmann, G., Gill, C., Roman, G.C.: Extending BPEL for Interoperable Pervasive Computing. In: Proceedings of the 2007 IEEE International Conference on Pervasive Services, pp. 204–213 (2007) 13. Kushalnagar, N., Montenegro, G., Schumacher, C.: IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and Goals. RFC 4919 (Informational) (August 2007), http://www.ietf.org/rfc/rfc4919.txt 14. Lipphardt, M., Hellbrueck, H., Pfisterer, D., Ransom, S., Fischer, S.: Practical experiences on mobile inter-body-area-networking. In: Proceedings of the Second International Conference on Body Area Networks (2007) 15. Naumowicz, T., Schr¨ oter, B., Schiller, J.: Demo Abstract: Software Factory for Wireless Sensor Networks. In: 6th European Conference on Wireless Sensor Networks (EWSN 2009) (February 2009) 16. Naumowicz, T., Schr¨ oter, B., Schiller, J.: Poster Abstract: Prototyping a Software Factory for Wireless Sensor Networks. In: 7th ACM Conference on Embedded Networked Sensor Systems, SenSys 2009 (November 2009) 17. OASIS WS-BPEL Technical Committee: Webservices – Business Process Execution Language Version 2.0. Tech. rep., http://www.oasis-open.org/committees/wsbpel 18. Pandey, K., Patel, S.: A Novel Design of Service Oriented and Message Driven Middleware for Ambient Aware Wireless Sensor Network. International Journal of Recent Trends in Engineering, IJRTE (May 2009) 19. Pfisterer, D., Wegner, M., Hellbr¨ uck, H., Werner, C., Fischer, S.: Energy-optimized Data Serialization For Heterogeneous WSNs Using Middleware Synthesis. In: Proceedings of The Sixth Annual Mediterranean Ad Hoc Networking Workshop (Med-Hoc-Net 2007), pp. 180–187 (June 2007) 20. Spieß, P., Vogt, H., Jutting, H.: Integrating sensor networks with business processes. In: Real-World Sensor Networks Workshop at ACM MobiSys (June 2006) 21. World Wide Web Consortium (W3C): SOAP Specifications. Tech. rep. (2007), http://www.w3.org/TR/SOAP/
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs Nikola Milosavljević Max-Planck-Institut für Informatik, Saarbrücken, Germany
Abstract. We address the problem of efficient gathering of data from a wireless network to a single sink node. The network's communication and interference pattern is assumed to be captured by the unit-disk model. We consider the objective of minimizing the maximum latency (the time between release and delivery of a packet). We prove that the problem is NP-complete even when all packets are released from the same source node, or at the same time. To our knowledge, these are the first results about the wireless gathering problem in the plane. They can be seen as an extension of recent inapproximability results of Bonifaci et al., which hold in three dimensions. Keywords: wireless gathering problem, unit-disk graphs, wireless networks, computational complexity.
1
Introduction
Wireless networks have become ubiquitous in recent years. Some examples include 802.11 wireless Internet, mobile networks for voice and data communication, vehicular networks, and sensor/actuator networks, which measure physical data in a given environment (natural habitat, office building) and act to achieve a desired environmental effect (energy savings, event notification, etc.). One of the most important communication tasks in all these applications is gathering data from some or all nodes (data sources) into one distinguished node (sink). For example, in an 802.11 network, the sources are individual WiFi transceivers and the sink is a WiFi access point. In a sensor/actuator network, the sources are sensor nodes and the sink is a more powerful "hub" node, a base station with a human operator that examines the data, or a control unit that turns sensor data into actuators' control signals. In most wireless networks data is generated asynchronously in real time: new data packets arrive as others are being relayed. In this setting, a natural performance metric for a data gathering scheme is flow time or latency: the time that a packet spends in the network, between the time it is generated at a source and the time it is delivered to the sink. In this paper we consider the objectives
related to flow time: maximum flow time of any packet, and total flow time of all packets. (The author would like to thank anonymous reviewers for suggesting parts of the original submission that needed extra clarification.)
The feature that distinguishes scheduling problems in wireless networks from those in wired networks is the nature of interference. In wireless networks, transmitting a packet along a link affects not only the endpoints of that link, but also all nodes in radio range of the sender, regardless of whether they are intended recipients or not. Different models of radio propagation yield different interference patterns, resulting in optimization problems with different theoretical properties. In this paper we adopt the unit-disk model of radio propagation, under which a node's radio range is a disk of radius 1 centered at the node. In other words, the network is modeled by a unit-disk graph, in which two nodes are connected if and only if their Euclidean distance is at most 1. Although far from realistic, the unit-disk model is widely used in the literature and generally accepted as a good first approximation.
1.1
Wireless Gathering Problem
We now formally define the decision problems that are the topic of this paper, and introduce the associated terminology. The input is a set of points in the plane (given by their Cartesian coordinates), representing the nodes of the wireless network. One node is designated as the sink. We assume that time consists of discrete steps. At the beginning of each step (before any transmissions take place), a number of packets is generated (released) at some nodes in the network (sources). The release pattern of the packets is also given as input. Each node can transmit at most one packet (message) in one step, and that packet must be addressed to exactly one other node (intended recipient)1. A transmission from node u to node v is received successfully by v if and only if the only node within distance 1 from v which is transmitting in that step is u. The goal is to schedule transmissions so that all packets are eventually delivered to the sink. Given such a schedule, a packet's latency is the time elapsed between its release and its delivery. In this paper, the wireless gathering problem is the problem of computing a schedule that minimizes the maximum latency over all packets. We consider two special cases:
– when all packets are generated at the same source node, possibly at different times,
– when all packets are generated at the same time (without loss of generality at time zero), possibly at different nodes.
We denote these by T-WG and S-WG, respectively2. Note that S-WG is equivalent to minimizing the completion time, i.e., the total length of the schedule.
2
Note that even though “local broadcast” (simultaneous transmission from one sender to multiple receivers) is supported by the wireless medium, and often helpful in realworld scenarios, our model does not allow it. Prefixes T- and S- refer to the quantities that differ among packets, i.e., “time” and “source”, respectively. WG stands for ‘wireless gathering”.
310
N. Milosavljevi´c
1.2
Related Work
The literature on routing and scheduling in wireless networks is extremely large. Existing work can be classified according to interference model (geometric, graphbased etc.), objective function (completion time, maximum latency, average latency etc.)... We only mention publications most relevant to our results. Kumar et al. [5] study the problem of routing packets between different sourcedestination pairs in minimum number of steps under distance-2 interference model. They prove strong inapproximability for general graphs and give approximation algorithms for disk graphs. Bermond et al. [2] consider the problem of collecting data to a single sink node in minimum number of steps. Their interference model is defined by transmission radius rT and interference radius rI , which are the same for each node, and measured in network’s hop-distance, rather than geometric distance. They provide a 4-approximation algorithm when rI > rT , and prove that the problem is NP-hard for rI = rT . Balakrishnan et al. [1] considered the problem of maximizing the number of transmissions that can be scheduled simultaneously. They prove that the problem is NP-complete even in the case of planar unit-disk graphs. The proof is based on the fact that in a bipartite graph, a set of transmissions can be scheduled simultaneously if and only if the corresponding set of edges (sender-receiver pairs) forms an induced matching3 (IM for short). Hardness then follows by reduction from IM on a certain subclass of bipartite graphs which also happen to be planar unit-disk graphs. Bonifaci et al. [3] consider the case when the nodes lie in the 3D space, and communication/interference pattern is given by the unit-ball model, the obvious three-dimensional analogue of the unit-disk model. They prove that unless P=NP, T-WG cannot be O(p1−ε )-approximated in polynomial time for any ε > 0, where p is the number of packets. They also prove inapproximability of minimizing the average latency. They give a constant approximation algorithm for S-WG, as well as resource-augmented algorithms for T-WG (both maximum and average latency objective). Proofs of inapproximability in [3] also exploit hardness of IM on a subclass of bipartite graphs similar to that in [1]. The idea is to connect one half of the bipartition to the source, and the other half to the sink. That way, the most efficient way of gathering data is clearly to find the largest set of simultaneously schedulable links in the bipartite graph and use it repeatedly to transfer all packets. Unfortunately, it seems that adding these extra links really requires using the third dimension. The third dimension is used to separate the two halves of the bipartition in space. Once this separation is achieved, one can place 3
A matching in an undirected graph is a subgraph in which no two edges share an endpoint. An induced matching is a matching that is induced by a subset S of vertices, i.e., consists of S and all edges connecting vertices in S. The problem of deciding if a graph has an induced matching of given size (denoted by IM in the rest of the paper) is NP-hard even for graphs that are simultaneously planar, bipartite, unit-disk, and have maximum degree at most 4 [7,1].
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
311
source and sink so that they are connected only to one half of the bipartition. The main technical contribution of this paper is showing that if we settle for “weaker” connection between the bipartite graph and the source/sink (not by direct links, but longer paths), then two dimensions are enough to prove “weaker” hardness. 1.3
Our Results
We prove that both T-WG and S-WG are NP-complete. Our proofs are by reduction from IM on a subclass of planar bipartite unit-disk graphs similar to that in [1,3]. For the graphs in this class we propose a unit-disk embedding in which the two halves of the bipartitions are spatially separated well enough that they can be connected to the source/sink by paths, while still respecting the unit-disk constraint. The rest of the paper is organized as follows. Section 2 contains the proofs of our two claims. Section 3 summarizes the paper and suggests some problems for further research.
2
NP-Completeness of Wireless Gathering Problems
We prove NP-completeness of T-WG and S-WG by reduction from IM on a subclass of bipartite unit-disk planar graphs which we call grid graphs4 . In Section 2.1 we define grid graphs, and argue that IM is indeed NP-hard when restricted to this class. Then we proceed to describe the reductions to T-WG and S-WG. The network (set of points) in the output of both reductions is the same, only the packet release patterns are different. The network is obtained in three stages: (i) embedding the grid graph onto a narrow strip, so that the two sides of the bipartition are well separated along the strip (Section 2.2), (ii) embedding several copies of the result onto a ring (annulus), so that the two sides of the bipartition are well separated along the ring (Section 2.3), (iii) inserting additional nodes to induce paths that connect the ring embedding to the source and sink, placed inside and outside of the ring, respectively (Section 2.4). In Sections 2.5 and 2.6 we describe the packet release patterns for T-WG and S-WG, respectively, and prove correctness of respective reductions. 2.1
Grid Graphs
Let G be a cubic graph (planar graph with maximum degree 3) with n vertices and m edges. Let a, b be arbitrary positive integers. 4
We borrow this term from Bonifaci et al. [3], although our definition is slightly different.
312
N. Milosavljevi´c
Lemma 1. Given G, a, and b, one can compute in polynomial time a graph H such that the following holds. (i) H can be embedded as a unit-disk graph on some finite subset of the integer grid Z2 . We refer to this embedding as the grid embedding. (ii) H is a subdivision of G; edge e of G is replaced by a path of length 6ae − 2, where ae ≥ a. (iii) The length of any horizontal or vertical path in the grid embedding of H is even, and at least b. Proof. Compute a drawing of G which is planar (no two edges intersect), orthogonal (edges are polygonal chains with alternating horizontal and vertical segments) and grid (vertices of the graph and the polygonal chains have integer coordinates). Such a drawing can be computed in polynomial time [6,8]. Clearly, each edge e has integer length ze . Choose x ∈ Z+ large enough so that the j i following holds: each edge e in the drawing has a point ue = ( 6x , 6x ), i, j ∈ Z 1 in its interior such that the drawing in the disk of radius 2x centered at ue is just a horizontal or vertical line across the diameter of the disk. On each edge 1 replace a straight line segment of length 3x centered at ue by a polygonal chain 1 of length x without introducing edge crossings, as shown in Figure 1 below. This 2 is possible by choice of x. The length of edge e is now ze + 3x , and the length 1 of any horizontal or vertical segment is at least 6x . Scale up the drawing by a factor of 6x(3y + 1), where y is a fixed integer. The length of edge e is now 6[(3y + 1)xze + (2y + 1)] − 2, and the length of any horizontal or vertical segment is at least 3y + 1. Subdivide G by adding all vertices with integer coordinates ( 6si , j+1 ) ( i+1 , j+1 ) 6s 6s 6s
ue ( i−2 , j ) ( i−1 , j ) ( 6si , 6sj ) ( i+1 , j ) ( i+2 , j) 6s 6s 6s 6s 6s 6s 6s 6s
( i−2 , j ) ( i−1 , j) 6s 6s 6s 6s
ue
( 6si , 6sj ) ( i+1 , j ) ( i+2 , j) 6s 6s 6s 6s
( i−1 , j−1 ) ( 6si , j−1 ) 6s 6s 6s
Fig. 1. Proof of Lemma 1: one step in obtaining a grid graph from a cubic graph
that lie in the interiors of edges in the drawing. We claim that the resulting subdivision of G is in fact desired H. The drawing inherited from G is clearly a grid embedding. Edge e is subdivided 6ae −2 times, where ae = (3y +1)xze +(2y +1). The length of any horizontal or vertical path is 3y + 1. One can ensure ae ≥ a and 3y + 1 ≥ b by choosing y to be a large odd integer. Definition 1. A grid graph is any graph H2 obtained from a cubic graph G by first computing H1 as in Lemma 1, and then modifying H1 as follows: for each vertex v of H1 which corresponds to a vertex of G (i.e., is not a subdivision vertex), create a new vertex v and an edge (v, v ) in H2 .
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
313
Since H1 in Lemma 1 has maximum degree 3, a vertex v can always be added safely as the fourth neighbor of v in the grid. Furthermore, vertices v , thus placed, cannot “collide” with each other in the embedding, because they are added only for vertices v that are far away from each other, by property (iii) of Lemma 1. We conclude that H2 satisfies property (i) of Lemma 1. In particular, grid graphs are bipartite. We end this section by proving that IM is NP-complete on grid graphs, by reduction from the independent set problem on cubic graphs. The reduction is given by Definition 1, and its correctness follows from Lemma 2 below. The proof of Lemma 2 is very similar to those in [7] (Theorem 2.1) and [1] (Section 11.1). For completeness, we reproduce it here with necessary minor modifications. Lemma 2. H2 has an induced matching of size 2 e ae − m + α if and only if G has an independent set of size α. Proof. If G has an independent set I of size α or more, then one can obtain an induced matching S of size 2 e ae − m + α as follows. For each v ∈ I, put (v, v ) into S. For each edge e = (u, v) of G, do the following. Assume without loss of generality that u ∈ / I. Consider the path u = w1 , w2 , . . . , w6ae −1 = v in H2 which is the subdivision of e. Put into S edges (w3i−1 , w3i ), i = 1, 2, . . . , 2ae − 1. Now suppose that there is an induced matching S of size at least 2 e ae − m+ α. Let Se be a subset of S that belongs to the subdivision of edge e. Suppose that |Se | < 2ae − 1 for some e = (u, v). Let u = w1 , w2 , . . . , w6ae −1 = v be the nodes in the subdivision. Let (u, w) be the only edge of S adjacent to u (if any). Replace S by S \ (Se ∪ {(u, w)}) ∪ {(w3i−1 , w3i ) | i = 1, 2, . . . , 2se − 1}. Notice that the new S is still an induced matching of size no smaller than the old S. Repeating this for all edges, we obtain that |S| ≥ 2 e ae − m + α and |Se | ≥ 2ae − 1. Let I be the set of all nodesv such that (v, v ) ∈ S. Clearly, |I| = |S \ e Se | = |S| − e |Se | ≥ |S| − e (2ae − 1) ≥ α. Moreover, I is an independent set in G, since for any edge e = (u, v) in G, (u, u ), (v, v ) ∈ S implies |Se | ≤ (6ae −2)−4 = 2ae − 2. 3 The following is an immediate consequence of Lemma 2. Corollary 1. IM on grid graphs is NP-hard. 2.2
Strip Embedding
We proceed to describe the reduction from IM on grid graphs to our wireless gathering problems. In accordance with the previous section, we denote the input grid graph by H2 . The first step is embedding H2 onto a very narrow horizontal strip. Informally, this is done by scaling down the grid embedding of H2 in the vertical direction until its vertical extent becomes small enough, while keeping the lengths of all edges just below 1, as they are in the grid embedding. This forces “folding” of vertical paths into a stack of nearly horizontal subpaths with alternating directions (see Figure 2 below for illustration). Since vertical distances generally
314
N. Milosavljevi´c
become very small, some new edges will be introduced to preserve the unitdisk property. Hence the result of this stage, which we denote by H3 , will be a supergraph of H2 . The corresponding unit-disk embedding we call the strip embedding. Formally, the construction proceeds as follows. Start from the grid embedding of H2 , scale it by a factor of h = 1 − 2ε2 horizontally, and by a factor of v = 2ε vertically, where ε is arbitrarily small. Then, process vertical paths one by one. For each vertical path that comes from H1 (i.e., subdivided edge of G, rather than (v, v ) edge), move every other vertex in the interior of a vertical path by h = h−3ε2 to the right. More precisely, if the vertices on the path are v1 , v2 , . . . , v2t+1 (recall that any vertical has even length), then move v2 , v4 , . . . , v2t . For every vertical edge (v, v ), move the degree-1 vertex by h to the right. Define H3 to be the unit-disk graph on the resulting set of points. Figure 2 below (not drawn to scale) illustrates what the strip embedding looks like in the vicinity of a node of degree 3. Any two vertices in the same “column” are connected (those edges are omitted from the drawing for clarity). h
h
h
h
h
v v v v v v h
Fig. 2. Illustration of the strip embedding
Strip embedding fits into the bounding box [0, X] × [0, Y ], where X = O(n), Y = O(nε), and X is an integer. Notice that for a vertex v of H2 there is an unique integer xv ∈ [0, X] such that the x-coordinate of v is either xv h or xv h − 3ε2 . Parity of xv induces a cut in H3 . The key property is that the bipartite subgraph of H3 consisting of edges that cross this cut is exactly H2 . Note that this claim is the primary reason for introducing parameters s and t in the definition of grid graphs (Section 2.1). By choosing s and t to be large enough constants, we make sure that things are well enough separated in the grid embedding, so that in the strip embedding no other edges are introduced except than those connecting nodes in the same “column” (with the same xv ). 2.3
Ring Embedding
Now we map several copies of H3 into a narrow ring (annulus) to obtain a new graph H4 . Let k = 2 e se − m + α, where α is the independence number of G.
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
315
Let c = z(k + 1) + k − 2 for an large enough integer z and let R = ch π be the inner radius of the ring. Create 2c copies of each vertex from the strip embedding of H3 by mapping it via the following 2c functions (x, y) → fi (x, y) = (ri (y) cos φi (x), ri (y) sin φi (x))
i = 0, 1, . . . , 2c − 1 ,
where
2ih + x . R Graph H4 is defined to be the unit-disk graph on the resulting points. We prove that each copy has the same H3 . Non-edges in the connectivity as √ strip embedding have length at least (2h − h )2 + v 2 ≥ 1 + 6ε2 . We analyze X how much they are shrunk by fi . For R ≥ 3ε , we have ri (y) = R + 2(i mod (2X))Y + y
φi (x) =
||fi (x1 , y1 ) − fi (x2 , y2 )||2 = (y1 − y2 )2 + 4ri (y1 )ri (y2 ) sin2
x1 − x2 2R
x1 − x2 2R ≥ (y1 − y2 )2 + (1 − 3ε2 )(x1 − x2 )2 ≥ (1 − 3ε2 )(1 + 6ε2 ) > 1 , ≥ (y1 − y2 )2 + 4R2 sin2
so all non-edges remain non-edges. Edges in the strip embedding have length at most h = 1 − 2ε2 . We analyze how much they are stretched by fi . For R ≥ 2XY , ε2 we have x1 − x2 ||fi (x1 , y1 ) − fi (x2 , y2 )||2 = (y1 − y2 )2 + 4ri (y1 )ri (y2 ) sin2 2R 2 x1 −x2 ≤ (y1 −y2 )2 +4(R+2(i mod (2X))Y +Y )2 2R 4XY ≤ (y1 − y2 )2 + (1 + )2 (x1 − x2 )2 R 4XY 2 ≤ (1 + ) (1 − 2ε2 )2 ≤ (1 + 2ε2 )2 (1 − 2ε2 )2 < 1 , R so all edges remain edges. Previous calculations also show that it suffices to compute fi up to additive error of ε3 , which is important for polynomial running time. Notice that each node v has coordinates (rv cos φRv , rv sin φRv ), where φv is either xv h or xv h − 3ε2 for some integer xv between 0 and 2c − 1. As in the previous section, parity of xv induces a cut in H4 . It is not hard to check that vertices u and v are connected in H4 if and only if (i) xu = xv or (ii) they belong to the same copy of H3 and are connected in H3 . From this we conclude that the subgraph of H4 consisting of edges that cross the cut is isomorphic to 2c disjoint copies of H2 . Hence its maximum induced matching is of size ck. For any fixed i ∈ [0, 2c − 1], define Vi to be the set of vertices of H4 with xv = i. Due to rotational symmetry of the construction, there is a maximum induced matching in H4 such that each Vi is incident to equally many (i.e., k) matching edges. Such a matching is obtained by applying some fixed optimal solution for H3 to each copy.
316
2.4
N. Milosavljevi´c
Connecting Paths
We place node s at the center of the ring, and add nodes in order to connect it to each Vi with i odd. For each i, nodes are placed so that they induce a collection of paths that run parallel and very close to each other (see Figure 3 below for illustration). The number of paths in the collection for a given i should be thought of as very large. That can be achieved if the paths are close enough to each other. ∞ paths U1
U2
2ε
s 1 2
U ls
Vi
∞ paths
1 − ε2
−ε
Uls +3
ls hops
Uls +4
2ε
Ul+2 t
Vi−1 1 − ε2
1 2
−ε
lt hops Fig. 3. Illustration of the construction of parallel connecting paths
Formally, the following conditions should hold. – The paths are of the same length, and two nodes on different paths can be adjacent only if they have the same hop-distance from s. Nodes on the same path are not adjacent unless they are consecutive (i.e., each path is equal to the subgraph induced by its vertices). – The first node on each path (closest to s), is at most 12 away from s. Therefore, the first nodes on all paths form a clique. – The last node on each path (farthest from s) is connected to all nodes of Vi for one fixed i, and no other nodes. It is clear that these conditions can be satisfied for small enough ε (depending on n), and large enough R (depending on n, ε). We skip the formal details of the construction to keep the presentation simple; most of the geometry can be inferred from Figure 3. Next we place another node t somewhere outside the ring, and connect it with parallel paths to all Vi with i even. Figure 3 shows how to do this for one such Vi , but it is clear that the paths for all Vi cannot be “straight” like that one. Fortunately, the above construction allows for two connectivity-preserving modifications: (i) turning the path by a small (polynomial in ε) angle in each step, and (ii) keeping the same direction while modifying the length of the step
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
317
by a small constant fraction of ε2 . Armed with these, we construct curved paths as illustrated in Figure 4. In the curved (solid) parts one exploits modification (i), and in the straight (dashed) parts one exploits modification (ii) to make the lengths of all paths equal.
Fig. 4. Details of the construction of parallel paths to t. The small circle on the left represents the ring, and the nodes on it represent sets Vi . The “central” node on the right is t. Each geometric curve represents a set of parallel paths connecting one Vi with t, closely following the curve (e.g., the horizontal one in the middle is shown in Figure 3). Curved (solid) parts have curvature small enough for modification (i) to be applicable. Horizontal (dashed) parts are long enough.
Let ls and lt be the length of a parallel path to s and t, respectively. Define l = ls + lt . Since ε is independent of the size of the input (H2 , that is), l, ls , lt can be bounded by polynomial functions of the input size. We define H5 to be the resulting graph. For the next two sections, it is also convenient to define Ui to be the vertices of H5 that are exactly i hops away from s. In particular, U0 = {s}, and Ul+3 = {t}. 2.5
NP-Completeness of T-WG
Let s be the (only) source, and let t be the sink. Suppose that a batch of zck packets arrive at s every zck + 1 time steps. Let there be a total of q batches. Lemma 3. If H2 has an induced matching of size k, there is a schedule with maximum latency at most 2zck + zk + z + k + l − 2. Proof. The schedules for all batches are identical, only shifted in time by zck + 1 steps. We describe the schedule for one batch. All zck packets go from s to U1 in zck steps, one by one. They travel on parallel paths until they reach Uls , and then move to Uls +3 in smaller groups of size ck as follows. Each group is sent from Uls to Uls +1 in k steps (in sub-groups of c), then from Uls +1 to Uls +2 in
318
N. Milosavljevi´c
one step, and then from Uls +2 to Uls +3 in k steps. Note that consecutive groups’ schedules are overlaid, i.e., while one group is being sent from Vls +2 to Vls +3 , the next group is being sent from Vls to Vls +1 . That way, all packets move from UlS to Uls +3 in 1 1 zck + + k = zk + z + k c ck steps. Finally, they travel on parallel paths to t, and get delivered to t one by one. Clearly, maximum latency of one batch is 2zck + zk + z + k + l − 2. It is not hard to check that schedules for different batches do not interfere with each other. Hence the maximum latency for all packets is the same as the maximum latency for a single batch. Lemma 4. If H2 has no induced matching of size k, maximum latency of any z schedule is at least 2zck + zk + z + k + l + k−1 − 3. Proof. Suppose the maximum induced matching in H2 is of size k ≤ k − 1. Packets can be transferred from Uls to Uls +2 with “rate” at most ck packets per k + 1 steps. Therefore, a batch moves from Uls to Uls +3 in at least zck ·
k + 1 kz(k + 1) k2 z z = ≥ = zk + z + ck k k−1 k−1
steps. Suppose that there is a batch that departs s in ck steps and also gets delivered to t in ck steps. Then, we can assume without loss of generality that all packets from this batch leave U1 , reach Uls , leave Uls +3 and reach Ul+2 together (we can always modify the schedule without increasing the maximum latency to make this true). It follows that its maximum latency is at least 2zck + zk + z + k + l +
z − 3. k−1
Now consider the other case, i.e., suppose that every batch takes at least zck+1 steps to either depart from s or arrive at t. Let r be the number of batches that depart from s in at least zck + 1 steps. Recall from the beginning of this section that q is the total number of batches. Then it takes at least q(zck + 1) + r steps for all packets to depart from s, and at least q(zck + 1) + q − r steps for all packets to arrive at t. Since the hop-distance between U1 and t, as well as the distance between s and Ul+2 , is l + 2, we have that the total length of the schedule for all packets is lower-bounded by both q(zck + 1) + q − r + l + 2 and q(zck + 1) + r + l + 2. Therefore, it is at most q(zck + 1) + q/2 + l + 2. The last batch is released in step (q − 1)(zck + 1). It follows that the maximum latency is at least zck + q/2 + l + 3. If we set q/2 ≥ zck + zk + z + k +
z −6 k−1
z we get that the length of the schedule is at least 2zck + zk + z + k + l + k−1 − 3, as required.
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
319
Theorem 1. T-WG in NP-complete. Proof. T-WG is obviously in NP. Choosing z > k − 1, Lemma 3 and Lemma 4 imply that our T-WG instance has a schedule of length 2zck + zk + z + k + l − 2 if and only if H2 has an induced matching of size k. Together with Corollary 1, this implies that T-WG is NP-hard. 2.6
NP-Completeness of S-WG
Recall from Section 2.3 that c = z(k + 1) + k − 2 for some large enough integer z. At time zero, p = zck + c packets are released at Uls so that equally many are adjacent to each Vi , i = 1, 3, . . . , 2c − 1. The sink is t. Since the release time for all packets is zero, maximum latency is equal to the total length of the schedule. Lemma 5. If H2 has an induced matching of size k, there is a schedule of length p + lt + 3. Proof. Suppose that H2 has an induced matching of size k. We describe the schedule of required length. To avoid complicated notation, we keep the presentation informal. We divide the packets into groups P and Q, containing c and zck packets, respectively. The schedule for P is as follows. All packets of P depart from Uls at time 1, and travel from Uls to Ul+2 in lt + 2 steps using parallel paths. This is possible because P is small enough that it can move from Uls to Uls +3 in 3 steps, without waiting. Finally, P gets delivered to t in |P | steps. The schedule for Q is as follows. Q departs from Uls starting from time 3, but they don’t depart all at once. Instead, they move from Uls to Uls +3 in smaller groups of size ck, as in the proof of Lemma 3. We saw that they need zk + z + k steps for that. Notice that in this case, due to our choice of c, this is exactly c + 2 steps. After reaching Uls +3 , Q travels to Ul+2 on parallel paths, and gets delivered to t in |Q| steps. Now we verify that the schedules of P and Q never interfere with each other. This is obvious for all time steps, except perhaps when Q get near the sink, at which point it may interfere with packets of P being delivered to t. We argue that this does not happen. Q starts departing the sources two steps later than P . Also, P needs 3 steps to move from Uls to Uls +3 , while Q needs c + 2. Otherwise, P and Q proceed in the same fashion on parallel paths. It follows that Q arrives at Ul+2 exactly c + 1 steps later than P ; at that point P is already delivered to t. To finish the proof, we compute the length of the schedule. P departs Uls at time 1, arrives at Ul+2 at time lt + 2, and gets delivered at time c + lt + 2. Delivery of Q starts at time c + lt + 4 and completes by time zck + c + lt + 3. Lemma 6. If H2 has no induced matching of size k, the length of any schedule is at least p + ls + 4. Proof. Suppose the maximum induced matching in H2 is of size k ≤ k − 1. Let τ be the length of the schedule. Let Q be the last batch of packets delivered to t consecutively (i.e., in |Q| consecutive steps).
320
N. Milosavljevi´c
Suppose |Q| ≥ zck. Since packets of Q are delivered consecutively, they are all at Ul+2 no later than step τ − |Q|. We can assume without loss of generality that they are all at Uls +3 no later than step τ − |Q| − (lt − 1), i.e., that they travel together on parallel paths from Uls +3 to Ul+2 . Packets can be transferred from Uls to Uls +2 with “rate” at most ck packets per k + 1 steps. Therefore, Q moves from Uls to Uls +3 in at least zck ·
k + 1 kz(k + 1) k2 z z z = ≥ =c−k+2+ ≥ |P | − k + 2 + ck k k−1 k−1 k−1
z steps. It follows that τ ≥ p + lt − k + 1 + k−1 , which exceeds p + lt + 4 for large enough z. Now suppose |Q| < zck, and let P denote all packets outside of Q. Suppose that P is delivered in |P | consecutive steps. Clearly, |P | ≥ c + 1, so there is some odd i such that at least two packets from P cross the cut through Vi . The distance from Uls to Ul+2 is lt + 2, but one of these two packets takes at least lt + 3 steps to reach Ul+2 , because it has to wait one step when moving between Uls and Uls +3 . Since |P | is delivered consecutively, no packets can be delivered before these two reach Ul+2 , hence no packets are delivered in the first lt + 3 steps. Since packets are delivered in exactly two batches (P and Q), with at least one “empty step” separating them, we have τ ≥ (lt + 3) + (p + 1) = p + lt + 4. Finally, suppose that P is not delivered consecutively. Since Uls and t are lt +3 hops apart, no packets are delivered in the first lt + 2 steps. Since packets are delivered in three or more batches, with at least two “empty steps” separating them, we have τ ≥ (lt + 2) + (p + 2) = p + lt + 4.
Theorem 2. S-WG in NP-complete. Proof. Membership in NP is obvious, while NP-hardness follows directly from Lemma 5, Lemma 6, and Corollary 1.
3
Discussion and Future Work
This paper initiates the study of wireless gathering problems on unit-disk graphs, by proving that it is NP-complete to decide the existence of a gathering schedule with given maximum latency. This holds even if all packets are originate from the same node or if all packets are released at the same time. Our hardness results are much weaker than those for the 3D unit-ball model [3]. There is a wealth of open problems in this area. We believe that our current techniques suffice to establish NP-completeness even when all packets are released both at the same node and at the same time. We are currently working on this. An alternative objective function to consider is the sum of latencies (which is equivalent to average latency). There are strong inapproximability results for it in the 3D case [3]. One can consider less restricted (and more realistic) models of radio propagation: quasi-unit-disk graphs [4], general disk graphs (in which different nodes are allowed to have different radio ranges) etc.
On Complexity of Wireless Gathering Problems on Unit-Disk Graphs
321
Finally, there is the question of approximation algorithms, centralized and distributed. For S-WG, a 3-approximation is easy to obtain – simply find any shortest s–t path and send a packet along that path in every third step. This is exactly the priority greedy algorithm of [3] applied to our setting. The question is whether one can do better. An example from [3] shows that simply forwarding packets along shortest s–t paths does not work, even in the case of unit-disk graphs in the plane. As for T-WG, there are strong inapproximability results in the 3D case. Algorithms proposed in [3] perform well under a different metric (resource augmentation). Actually, they do not rely on geometry, so they also work for arbitrary communication/interference graphs. In the 2D case we may be able to do better.
References 1. Balakrishnan, H., Barrett, C.L., Kumar, V.S.A., Marathe, M.V., Thite, S.: The distance-2 matching problem and its relationship to the mac-layer capacity of ad hoc wireless networks. IEEE Journal on Selected Areas in Communications 22, 1069– 1079 (2004) 2. Bermond, J.-C., Morales, N., Perennes, S., Galtier, J., Klasing, R.: Hardness and approximation of gathering in static radio networks. In: IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 75–79 (2006) 3. V. Bonifaci, P. Korteweg, A. Marchetti-Spaccamela, L. Stougie. Minimizing flow time in the wireless gathering problem. ACM Transactions on Algorithms (accepted for publication) 4. Kuhn, F., Wattenhofer, R., Zollinger, A.: Ad Hoc Networks Beyond Unit Disk Graphs. Wireless Networks 14(5), 715–729 (2008) 5. Anil Kumar, V.S., Marathe, M.V., Parthasarathy, S., Srinivasan, A.: End-toend packet-scheduling in wireless ad-hoc networks. In: Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, pp. 1021–1030. Society for Industrial and Applied Mathematics, Philadelphia (2004) 6. Shiloach, Y.: Arrangements of Planar Graphs on the Planar Lattice. PhD thesis, Weizmann Institute of Science, Rehovot, Israel (1976) 7. Stockmeyer, L.J., Vazirani, V.V.: NP-Completeness of Some Generalizations of the Maximum Matching Problem. Inf. Process. Lett. 15(1), 14–19 (1982) 8. Valiant, L.G.: Universality Considerations in VLSI Circuits. IEEE Trans. Computers 30(2), 135–140 (1981)
On Cardinality Estimation Protocols for Wireless Sensor Networks Jacek Cicho´n, Jakub Lemiesz, and Marcin Zawada Institute of Mathematics and Computer Science, Wrocław University of Technology, Poland {Jacek.Cichon,Jakub.Lemiesz,Marcin.Zawada}@pwr.wroc.pl
Abstract. In this article we address the problem of estimating a size of wireless sensor networks (WSNs). We restrict our attention to sensors with very limited storage capabilities. The problem arises when sensors have to quickly obtain approximate size of the network to use algorithms which require such information. Another application area is the problem of counting the number of different objects (e.g. people in public bus transportation) and use of such information to optimize the routes and frequency of buses. In this paper we present two-phase probabilistic algorithm based on order statistics and balls-bins model which effectively solves the presented problem. Keywords: cardinalities estimation, sensor networks.
1 Introduction Most wireless sensor network (WSN) algorithms require knowing, at least approximately, the size of the sensor network to work efficiently. In other applications the sensors should cope with mobility and changing connectivity and our goal is to count these sensors in a given area. For example, we want to effectively count the number of distinct people in mass events. Then, when the counter exceeds some threshold, we can send an alert message [1], thus increasing security of those events. At first, the solution to this problem seems quite simple. We can just store and count the number of different sensors’ identifiers. However, what if the number of sensors is large, for example several hundred or thousand. Let us assume for the moment that we have a thousand sensors and each has a 32 bit identifier, then we need almost 4kB of memory to store those identifiers. Moreover, we need at least logarithmic time to insert each new encounter sensor or just check if the encounter sensor has already been inserted. We restrict our attention to a quarter size sensors e.g. M ICA 2D OT which is based on the Atmel ATmega128 with available 4kB memory for data storage. Then, our simple algorithm will take all available memory. However, if the number of sensors exceeds a thousand then this algorithm simply fails. In this paper, we present probabilistic algorithm that is capable to count sensors with a much smaller memory consumption. Our algorithm is able to count up to 100 000 or more with less than 100 bytes of memory available and it has a very low variance of estimation.
This research was supported by Polish Ministry of Science and Higher Education grant N N206 369739.
H. Frey, X. Li, and S. Ruehrup (Eds.): ADHOC-NOW 2011, LNCS 6811, pp. 322–331, 2011. c Springer-Verlag Berlin Heidelberg 2011
On Cardinality Estimation Protocols for Wireless Sensor Networks
323
Related Work. The problem of counting the number of sensors is similar to the problem of counting distinct elements in data stream [2,3,4,5,6,7]. Thus, the first idea is to apply algorithms for counting distinct elements to WSNs. All these algorithms behave well for large massive data but, unfortunately, for a small number of data their accuracy is highly unsatisfactory. The simulations show that using those algorithms unmodified and separately, we obtain a very large error of estimation for a small number of sensors. Therefore, we present an algorithm that combines the technology of order statistics [6,7] with balls and bins model [3,5] modified by preselection.
2 The Basic Algorithm In this section, we present our two-phase algorithm. To make the exposition clearer, we describe the intuitive way to look at our algorithm. In the first phase (see Subsection 2.1) the ordered statistic is used to quickly obtain a rough estimate n ˆ 1 of the network size. Next, in the second phase (see Subsection 2.2) we use the balls and bins model with preselection to obtain a more accurate approximation, for preselection the estimate n ˆ1 is used. Namely, each sensor decides to participate in the second phase with probability m p = min ,1 , (1) n ˆ1 where m denotes the number of bins. Then, we obtain the estimation of the number of sensors n ˆ 2 participating in the second phase. Dividing n ˆ 2 by p gives the estimation of the total number of sensors n ˆ. 2.1 First Phase: Order Statistics The first phase of our algorithm is based on order statistics. Initially each sensor allocates a table of l-bits numbers T of size k and fills it by special symbol . We assume that = 2l − 1. Next, it generates a random value x from the interval [0, 2l − 2]. Then, each sensor inserts x into its table T , broadcasts x to all neighbours and goes to sleep. Upon receiving the message x, a sensor wakes up and tries to insert x into its table T . At first, the algorithm checks whether the value x has already occurred in the table. If so, the message is ignored. Otherwise, two cases are considered. If the table T is not yet completely filled, the value x is inserted into the table. If the table T is already filled up, a sensor checks whether the value x is smaller than the largest number in the table. If so, the largest element is replaced by x. Finally, in the case the table T has been changed, it is sorted and sensor forwards the new element to all neighbors. At any time a sensor can estimate the network size by counting the number of inserted elements or, if the whole table T is filled up, by calculating (k − 1) · (2l − 2)/T [k]. It is worth to mention that the optimal value of parameter l could be obtained from the Birthday Paradox problem. We are interested in such a value of l that the generated random values are distinct. Analysis. Let X1 , . . . , Xn be independent random variables with the uniform density on the interval [0, 1). The order statistics X1:n , . . . Xn:n are the random variables
324
J. Cicho´n, J. Lemiesz, and M. Zawada
Algorithm 1. O RDER S TATISTICS(k, l) Initialization 1: set T [i] ← for i = 1, . . . , k 2: x ← generate uniformly at random a value from an interval [0, 2l − 2] 3: T [1] ← x 4: broadcast x to neighbours Upon receiving a message 1: receive x 2: if ∀1≤i≤k T [i] = x then 3: if ∃1≤i≤k T [i] = then 4: T [i0 ] ← x for i0 such that T [i0 ] = and sort T 5: broadcast x to neighbours 6: else 7: if x < T [k] then 8: T [k] ← x and sort T 9: broadcast x to neighbours 10: end if 11: end if 12: end if Return the estimated number of sensors 1: if ∃1≤i≤k T [i] = then 2: return |{i : T [i] = }| 3: else 4: return (k − 1) · (2l − 2)/X[k] 5: end if
obtained from X1 , . . . , Xn by sorting in the increasing order each of its realizations. It k is well known (see e.g. [8]) that E [Xk:n ] = n+1 for each k = 1, . . . , n. We set Zk,n =
k−1 . Xk:n
We will use the random variable Zk,n as an estimator of the number n. Namely, the following result holds: Theorem 1. Let 3 ≤ k < n. Then the random variable Zk,n is an unbiased estimator of the number n (i.e. E [Zk,n ] = n) and σ(Zk,n ) 1 k−1 1 1 = √ 1− =√ +O . (2) n n n k−2 k−2 Remark 1. For the estimator Z ∗ from [7] we have EZ ∗ = n(1+o(1)) and σ(Z ∗ )/n ∼ √ 1 . Hence the estimator Zk,n has a slightly better statistical properties (it is unbik−2 ased) than the estimator Z ∗ . Equation 2 together with Chebyshev’s inequality gives some information about the precision of the estimator Zk,n . However, this approach is not precise. We reduce
On Cardinality Estimation Protocols for Wireless Sensor Networks
325
the properties of order statistics to the Bernoulli distribution and then using the classical Chernoff inequalities we get a more precise estimate. Namely, the following result holds: Theorem 2. Suppose that η > 0, 0 < ε < 1 and 3 ≤ k ≤ n. Then kη 2 kε2 n k−1 k−1 n k−1 − 2(1+η) − 2(1−ε) Pr[ < < ]>1− e +e 1+η k Xk:n 1−ε k
(3)
After setting η = 1, ε = 0.53 and k = 20 into the last formula from the last theorem we get the following bound: Corollary 1. Suppose that n ≥ 20. Then Pr[0.526316n <
19 < 2.23964n] ≥ 0.99 X20:n
Remark 2. Numerical calculations with the incomplete regularized Beta functions show that for all n <= 107 we have Pr[0.5n < X19 < 2n] ≥ 0.99975. 20:n We can implement Algorithm 1 using 5 bytes registers for each real number from the table T . This solution, due to the Birthday Paradox, guarantees the proper functioning of this algorithm for all n ≤ 106 . Notice that this solution require only 100 bytes of memory for storing values of the table T at each node. 2.2 Second Phase: Balls-Bins Model with Preselection The second phase of our algorithm is based on the balls and bins model. At first each sensor allocates a bit map of size m, which represents m bins, and initialize all entries to "0"s. The sensor at this point decides whether it will participate (b = 1) or not. If so, it generates random value x, sets xth bit to "1" and broadcasts x to all neighbors. Next, a sensor goes to sleep and wakes up upon receiving the message x. Then, it checks if bin T [x] is empty; if so it sets xth bin to "1" and forwards a message to neighbors. Otherwise, sensor does nothing and goes to sleep. Notice that if a sensor receives the same message twice then the proper bit is already set so it does not forward the message further. Since we have usually a lot more sensors than bins it also can happen that two stations generate the same value, which further decreases the number of transmitted messages. At any time we can get the estimated number of sensors by calculating log(ˆ x/m)/ log(1 − p/m), where x ˆ is the number of empty bins. Analysis. Let us assume that we have n balls and m bins. Let us fix a probability p ∈ [0, 1]. Then, we consider the following two-round process: in the first round each ball decides with probability p whether it will participate in the next round or not. In the second round each ball chooses uniformly at random one of the bins. Let random variable Xm,n denote the number of empty bins after n balls have been thrown. In the theorem below we shall generalize the result from [3], which holds only for p = 1.
326
J. Cicho´n, J. Lemiesz, and M. Zawada
Algorithm 2. BALLS A ND B INS W ITH P RESELECTION(m) Initialization 1: set bitmap T [i] ← 0 for all i = 0, . . . , m − 1 1 with probability p, 2: b ← 0 otherwise. 3: if b = 1 then 4: x ← random(0, m − 1) 5: T [x] ← 1 6: broadcast x to neighbours 7: end if Upon receiving a message 1: receive x 2: if T [x] = 0 then 3: T [x] ← 1 4: broadcast x to neighbours 5: end if Return the estimated number of sensors 1: x ˆ = m − m−1 i=0 T [i] 2: if x = 0 then 3: return ∞ 4: else 5: return log(ˆ x/m)/ log(1 − p/m) 6: end if
Theorem 3. Let p ∈ [0, 1]. Then
p n E [Xm,n ] = m 1 − . m
(4)
We use the method of moments to derive an estimator of the number n. Namely, let X denote the number of non-occupied bins after an experiment. We solve the equation X = m(1 − p/m)n and we get the following formula for the estimator of n: n ˆ=
log( X ) m p . log(1 − m )
(5)
We shall use the above estimator for m = 800. Notice that we need for this purpose only 100 bytes. 2.3 Time Complexity Putting together all two phases we get Algorithm 3. Let d(x, y) denote the minimal number of hops between nodes x and y and let D = max{d(x, y)} denote the diameter of considered network. It is easy to see that phase (1) and phase (2) requires D rounds. Hence our algorithm require 2D rounds for proper estimation of number of nodes. It can be shown that our algorithm return value ∞ with probability less than 10−6 . If this happens, then the algorithm should be re-executed.
On Cardinality Estimation Protocols for Wireless Sensor Networks
327
Algorithm 3. 1: 2: 3: 4: 5: 6: 7:
calculate the number n ˆ 1 using the Algorithm 1 O RDER S TATISTICS(k) if ∀1≤i≤k T [i] = (which means that the number of nodes is greater than k) then set p = min{ nˆm1 , 1} calculate the number n ˆ using Algorithm 2 with parameter p calculated in previous step B ALLS A ND B INS W ITH P RESELECTION(m) end if
3 Experimental Results In this section, we present the illustrative simulation of Algorithm 3. We assume that the number of order statistics of the first phase is k = 20 and the number of bits for one statistic is l = 32. In the second phase we assume that the number of bins is m = 800. Therefore our algorithm will use at most 100 bytes of memory. As a matter of fact, that was our storage limitation in the practical problem we considered. We investigate numerically the algorithm proposed in previous section with above parameters. In Fig. 1 we show a typical result of simulations. For each n ∈ {1, . . . , 104 } we made a simulation of a network with n nodes. We placed at this figure values of n n n ˆ . We see that except for few points we have 0.85 < n ˆ < 1.15. Hence for most experiments we have 0.85 · n ˆ < n < 1.15 · n ˆ.
1.2
1.2
1.1
1.1
1
1
0.9
0.9
0.8
0.8
0.7 0
2000
4000
6000
8000
10000
Fig. 1. Plot of n/ˆ n for n = 1, . . . , 104 : results given by Algorithm 3
0.7 0
50
100
150
200
Fig. 2. Plot of n/ˆ n for n ∈ [800, 1200]: results of a sequential use of Algorithm 2
Hence we see that our Algorithm estimates the total number of nodes in the network with relative error of order 0.2 with high probability. Moreover, if the number of sensors is less than k (in our case 20) then we obtain an exact result. We compared the precision of our estimates with other algorithms based on HyperLogLog technology (see [5]) or ordered statistics (see [7]). Up to our knowledge our Algorithm is the most accurate of the known nowadays algorithms estimating the number of nodes using only 100 bytes of memory.
328
J. Cicho´n, J. Lemiesz, and M. Zawada
Let us also remark that after the first use of the proposed algorithm we may use the newly found estimate of the number of nodes for further estimation on number of nodes. This may be useful when the number of nodes in the network changes dynamically. We checked the precision of this way of using our algorithm for a network in which the number of nodes changes randomly near the number 1000 with changes of order 20%, i.e. when the number of nodes lays in the interval [800, 1200]. Fig. 2 shows the precision of the estimate. We see that the precision of our estimate is of order 0.1. Hence the first phase of our algorithm may be used only once and later we may use only the second phase of proposed algorithm for a very precise estimation of the network size. Note that Algorithm 1 and Algorithm 2 can be used separately. However, when using 100 bytes of memory Algorithm 1 can not accomplish our precision goal and the correctness of Algorithm 2 is limited. Their combination gives much better results (compare Figure 1,3 and 4).
2
2
1.8
1.8
1.6
1.6
1.4
1.4
1.2
1.2
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0 0
2000
4000
6000
8000
10000
Fig. 3. Plot of n/ˆ n1 for n = 1, . . . , 104 : results given by Algorithm 1
0 0
2000
4000
6000
8000
10000
Fig. 4. Plot of n/ˆ n2 for n = 1, . . . , 104 : results given by Algorithm 2 with p = 1
4 Conclusions In this paper we show that modified algorithms for counting distinct elements in data stream can be used to efficently solve the problem of estimating of the size of WSNs. We presented a two-phase algorithm for estimation the cardinality of a connected multihop network which uses for practical reasons only 100 bytes of storage memory at each device with relative precision of estimation about 20% with high probability. The precision of estimation can be further increased by adjusting the parameters k and m. Moreover, the algorithm requires 2D rounds, where D is the diameter of the network.
A Proof of Results from Section 2.1 Let X1 , . . . , Xn be independent random variables with the uniform density on the interval [0, 1). The order statistics X1:n , . . . Xn:n are the random variables obtained from
On Cardinality Estimation Protocols for Wireless Sensor Networks
329
X1 , . . . , Xn by sorting in the increasing order each of its realizations. The probabilistic density fk:n (x) of the variable Xk:n equals fk:n (x) =
1 xk−1 (1 − x)n−k . B(k, n − k + 1)
(6)
(see e.g.[8]) where B(a, b) = Γ (a)Γ (b)/Γ (a+b). Direct calculus shows that E [Xk:n ] =
1 k 0 xfk,n (x)dx = n+1 . Let k−1 Zk,n = . Xk:n Then for all k ≥ 2 we get 1 k−1 E [Zk,n ] = fk:n (x)dx = n , x 0 and for k ≥ 3 we have
n(n − k + 1) . k−2 From these observations Theorem 1 follows immediately. Let Bp,n denote a random variable with binomial distribution with parameters p and n, i.e. Pr[Bp,n = k] = nk pk (1 − p)n−k . Var [Zk,n ] =
Lemma 1. Suppose that k ≤ n and α ∈ (0, 1). 2
1. If αn < k then Pr[Xk:n ≤ α] ≤ exp(− 13 (k−αn) ) αn 2
2. If αn > k then Pr[Xk:n ≥ α] ≤ exp(− 12 (k−αn) ) αn Proof. We shall use in the proof the following well known form of Chernoff bounds for binomial distribution: 1 Pr[Bp,n ≥ (1 + δ)np] < exp − npδ 2 (7) 3 1 Pr[Bp,n ≤ (1 − δ)np] < exp − npδ 2 (8) 2 (see e.g. [9]). Let X1 , . . . , Xn be a sequence of independent uniformly distributed random variables in the interval (0, 1). Let Yi = 1 if Xi ≤ α and Yi = 0 otherwise. The random variable Bα,n = Y1 + . . . + Yn have binomial distribution with parameters n and α. Observe that the sentence Xk:n ≤ α means that |{i : Xi ≤ α}| ≥ k i.e. that Bα,n ≥ k. Notice that k = αn(1 + k−αn ). Hence if αn < k then from inequality 7 we αn get 2 1 k − αn 1 (k − αn)2 Pr[Xk:n ≤ α] < exp − nα = exp − . 3 αn 3 αn Suppose now that αn > k. Observe that Xk:n ≥ α is equivalent to Bα,n ≤ k. So we may use inequality 8 and in a similar way we get the result.
330
J. Cicho´n, J. Lemiesz, and M. Zawada
We are ready to prove Theorem 2. Let 0 < ε < 1. From the first part of the last Lemma we obtain k 1 kε2 Pr[Xk:n ≤ (1 − ε) ] ≤ exp − n 31−ε Observe next that X ≤ (1 − ε) nk if and only
n k−1 1−ε k
≤
k−1 X .
Hence
n k−1 k−1 1 kε2 Pr[ ≤ ] ≤ exp − 1−ε k Xk:n 31−ε In a similar way we show that Pr[
n k−1 k−1 1 kη 2 ≤ ] ≥ exp − 1−η k Xk:n 21−η
Therefore Theorem 2 is proved.
B Proof of Facts from Section 2.2 We shall prove Theorem 3. Let Ani denote the event that a i-th bin is empty for all 1 n 1 ≤ i ≤ m after n balls have been thrown. Then Pr[Ani ] = (1 − m ) . Let Yin =
1 if a i-th bin is empty, 0 otherwise.
m Thus, Xm,n = i=1 Yin . Moreover, let Z denote the that k balls have decided to event
participate in the second round. Then Pr[Z = k] = nk pk (1 − p)n−k and E [Xm,n ] =
n
E [Xm,n |Z = k] Pr[Z = k].
k=0
m m 1 k Since E [Xm,n |Z = k] = i=1 E Yik = i=1 Pr[Yik ] = m(1 − m ) , we obtain that n 1 k n k p E [Xm,n ] = m(1 − ) p (1 − p)n−k = m(1 − )n . m k m k=0
Therefore Theorem 3 is proved. In a similar way we can prove that n p n 2p p 2n Var [Xm,n] = m 1 − + m(m − 1) 1 − − m2 1 − . m m m This equation allows us to derive quite precise estimation on the precision of Algorithm considered in this paper.
On Cardinality Estimation Protocols for Wireless Sensor Networks
331
References 1. Cichon, J., Kapelko, R., Lemiesz, J., Zawada, M.: On alarm protocol in wireless sensor networks. In: ADHOC-NOW, pp. 43–52 (2010) 2. Flajolet, P., Martin, G.N.: Probabilistic counting algorithms for data base applications. J. Comput. Syst. Sci. 31(2), 182–209 (1985) 3. Whang, K.-Y., Zanden, B.T.V., Taylor, H.M.: A linear-time probabilistic counting algorithm for database applications. ACM Trans. Database Syst. 15(2), 208–229 (1990) 4. Bar-Yossef, Z., Jayram, T.S., Kumar, R., Sivakumar, D., Trevisan, L.: Counting distinct elements in a data stream. In: Rolim, J.D.P., Vadhan, S.P. (eds.) RANDOM 2002. LNCS, vol. 2483, pp. 1–10. Springer, Heidelberg (2002) 5. Flajolet, P., Fusy, E., Gandouet, O.: HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. In: Conference on Analysis of Algorithms, AofA 2007 (2007) 6. Giroire, F.: Order statistics and estimating cardinalities of massive data sets. Discrete Applied Mathematics 157(2), 406–427 (2009) 7. Lumbroso, J.: An optimal cardinality estimation algorithm based on order statistics and its full analysis. In: AofA 2010. Discrete Mathematics and Theoretical Computer Science, vol. 5333, pp. 491–506 (2010) 8. Arnold, B.C., Balakrishnan, N., Nagaraja, H.N.: A First Course in Order Statistics. John Wiley & Sons, New York (1992) 9. Mitzenmacher, M., Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, New York (2005)
Maximizing Network Lifetime Online by Localized Probabilistic Load Balancing Yongcai Wang1 , Yuexuan Wang1 , Haisheng Tan2 , and Francis C.M. Lau2 1
2
Institute for Interdisciplinary Information Science, Tsinghua University, China Department of Computer Science, the University of Hong Kong, Hong Kong, China {wangyc,wangyuexuan}@mail.tsinghua.edu.cn{hstan,fcmlau}@cs.hku.hk
Abstract. Network lifetime maximization is a critical problem for longterm data collection in wireless sensor networks. For large-scale networks, distributed and self-adaptive solutions are highly desired. In this paper, we investigate how to optimize the network lifetime by a localized method. Specifically, the network lifetime maximization problem is converted to a localized cost-balancing problem with an appropriately designed local cost function. A distributed algorithm, LocalWiser, which adopts the idea of adaptive probabilistic routing, is proposed to construct a localized and self-adaptive optimal solution to maximize the network lifetime. We analyze LocalWiser in both static and dynamic networks. In static networks, it is formally proved that 1) LocalWiser can reach a stable status; 2) the stable status is optimal for maximizing the network lifetime. In dynamic networks, our extensive simulations illustrate that LocalWiser can converge to the optimal status rapidly for the network topology and flow dynamics.
1
Introduction
For long-term autonomous data collection in large-scale wireless sensor networks, it is important for the networks to be not only optimal in lifetime, but also self-adaptive to the online network dynamics. However, even using centralized computation when the sensors are of identical initial energy and of the identical data collection rate, it is NP-Complete to solve maximize the network lifetime by constructing the energy optimal routing tree[8]. If probabilistic multi-path routing is allowed, the network lifetime maximization problem can be solved by a centralized Linear Programming model [2]. However, these centralized methods need global information and global coordination which are not scalable, and they can not be self-adaptive to the online network dynamics, such as the variations of the network topologies and the data flows. Some other previous works studied the network lifetime maximization by distributed approaches based on load-balancing or probabilistic routing methods. However, because of the restriction of local information and local computation, instability and local optimum are their common weakness. To our best knowledge, in the literature there are no distributed solutions that can simultaneously guarantee 1) stability, 2) network lifetime optimization at the stable status, and 3) self-adaptivity to online network dynamics. H. Frey, X. Li, and S. Ruehrup (Eds.): ADHOC-NOW 2011, LNCS 6811, pp. 332–345, 2011. Springer-Verlag Berlin Heidelberg 2011
Maximizing Network Lifetime Online
1.1
333
Our Contribution
Although the network lifetime maximization is a global optimization problem, we can map the problem to a localized cost-balancing problem with an appropriately designed local cost function. As a major contribution, we develop a localized cost-balancing algorithm, LocalWiser, to construct a scalable, stable and selfadaptive optimal solution for the network lifetime maximization problem. In the algorithm, we make use of adaptive probabilistic routing and propose a virtual guidance for the sensors to overcome instability and local optimum during their local computation. Further, we formally prove that in static networks 1) LocalWiser is guaranteed to reach a stable status, and 2) the stable status is optimal for maximizing the network lifetime. LocalWiser is also adaptive to changes in the network. When the network topology or data flow changes dynamically, our extensive simulations show that LocalWiser can adjust the transmission probabilities and makes the network converge quickly to the optimal status for network lifetime maximization. 1.2
Related Works
The related work includes research results on network lifetime maximization, load-balancing, energy-efficient routing and distributed consensus. We introduce the most important related contributions in the literature in two categories: the centralized approaches and the distributed approaches. For the centralized approaches, Buragohain, et al. [2] studied the optimal routing tree problem by modeling the data collection network as a sensor database. They proved that it is NP-Complete to find the energy-optimal routing tree for partially aggregated and unaggregated queries in random networks. Moreover, they claimed probabilistic routing can improve the network lifetime compared with the tree-based approaches, and proposed a centralized linear programming model to solve the network lifetime maximization problem in polynomial time through probabilistic routing . Liang and Liu [8,9] studied lifetime maximization through the construction of the energy optimized spanning tree. They also proved it is NP-complete and devised several heuristics to prolong the network lifetime. Dai, et al. [3] studied the load balancing problem in a grid topology. They made use of the Chebyshev sum inequality as a criteria for load balancing. All the above solutions require centralized computation and global coordination for the data routing. For the distributed algorithms, Jarry [5] etc. proved that the flow maximization is equivalent to network lifetime maximization in data collection networks. They investigated the network structures that support the max-flow and proposed two online distributed probabilistic routing algorithms for energy balancing in the optimal network structures. However, in their work, the sensors are allowed to transmit data to the nodes beyond their parent level, which may be hard to implement due to the sensors’ limited communication range. Parametric Probabilistic Routing (PPR) [1] addressed the energy efficient routing
by assigning higher transmission probabilities to the nodes that are closer to the destination. The Dynamic Load Balance Tree (DLBT) algorithm [17] calculated routing probabilities based on the current loads of the parent candidates. The stability of DLBT had been verified by simulations, but its stable status was not guaranteed to achieve the global optimum. Wang et al. [14] presented a level-based load-balancing algorithm to obtain level-by-level load balancing in the network, which provides a stable but sub-optimal localized solution. In some other works, Zhu and Girod [19] proposed an algorithm to minimize congestion by decomposing the total target flow rate into a sequence of rate increments, where the classical Bellman-Ford algorithm was adopted to find a corresponding minimum-cost route for each increment. Tsai et al. [12] proposed a routing protocol for load-balancing based on path energy and self-maintenance. Franceschelli et al. [4] proposed a gossip-based distributed balancing algorithm in which the nodes need extensive information about their neighborhood. Most recently, Lee et al. [7] applied DAGs (Directed Acyclic Graphs) to different data collection cycles for load balancing, so that energy efficiency was improved in both the time and spatial domains. The rest of the paper is structured as follows. In Section 2, the network lifetime maximization problem is mapped to a local cost-balancing problem. Section 3 presents the LocalWiser algorithm to solve the local cost-balancing problem. In Section 4, we formally prove the stability and optimality properties of LocalWiser in static networks. Section 5 illustrates the performance of LocalWiser in static and dynamic networks. Concluding remarks and a discussion of future work are given in Section 6.
2 Bridge Network Lifetime to Node's Local Cost
2.1 System Model
We study the online network lifetime maximization problem after a network has been deployed over a region for periodic data collection. The network contains N sensors and a single sink and is organized into a level structure: each sensor computes its level index as its minimum hop count to the sink [18]. The calculation can be carried out in a distributed way after the sink floods several messages. Each sensor can only find one-hop neighbors within its limited communication range. The one-hop neighbors in a node's parent level (one hop closer to the sink) are called parent candidates, and the node establishes probabilistic links to them for probabilistic minimum-hop routing. The one-hop neighbors in the child level (one hop further from the sink) are called child candidates, and the other nodes in the same level are called siblings. We consider both static and dynamic networks. In static networks, the sensors are assumed to be stationary, have identical initial energy, and collect identical amounts of data in each round. In dynamic networks, the joining, leaving and movement of sensors are considered as topology dynamics, and changes in the amount of data collected in different rounds are considered as flow dynamics. In both types of networks, the sensors communicate with omni-directional antennas and use identical, limited and fixed transmission power. The disc radio model and symmetric channels are assumed.
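To make the level structure concrete, the following minimal sketch (an editor's illustration, not from the paper) computes the minimum-hop level indices with a breadth-first search from the sink; in a real deployment the same result is obtained distributively through sink-initiated flooding.

```python
from collections import deque

def assign_levels(neighbors, sink):
    """Compute minimum-hop level indices from the sink.

    neighbors: dict mapping node -> iterable of one-hop neighbors
    sink: the sink node (level 0)
    Returns a dict node -> level (minimum hop count to the sink).
    """
    level = {sink: 0}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in level:            # first visit gives the shortest hop count
                level[v] = level[u] + 1
                queue.append(v)
    return level

# Toy topology: sink s, two first-level nodes a, b, one second-level node c.
topology = {"s": ["a", "b"], "a": ["s", "c"], "b": ["s", "c"], "c": ["a", "b"]}
print(assign_levels(topology, "s"))   # {'s': 0, 'a': 1, 'b': 1, 'c': 2}
```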
2.2 Network Lifetime Maximization Problem
We model the above network as a weighted dynamic graph G = {V, E, P(t), L(t)}, where V is the set of N sensor nodes and E is the set of directed links from each node to its parent candidates. The transmission probability of a link (i, j) at round t is denoted by P_{i,j}(t). The vector L(t) = {L_i(t)} denotes the load of each sensor at round t, where L_i(t) is the amount of data that sensor i needs to transmit in round t. The load of a sensor in round t is the sum of the data captured by itself and the loads transmitted from its children:

$$L_i(t) = u_i + \sum_{j \in C_i} P_{j,i}(t)\, L_j(t), \qquad (1)$$
where C_i is the child set of node i, and u_i is the locally captured data of node i. The network lifetime is defined as the network's working duration before its first node dies [2][8][9][14]. Since it is well known that data transmission dominates sensor network energy consumption, we only take the transmission energy consumption into account and assume that the other consumptions, such as reception, are negligible [8]. Sensors are assumed to consume one unit of energy to transmit one unit of data. If the network experiences a quick self-optimizing phase and converges to a stable status, where the sensors' loads are stable, node i can estimate its own lifetime by its energy-load ratio, T_i(s) = e_i / L_i(s), where e_i is its initial energy, s is a flag standing for the stable status, and L_i(s) is its stable load. Therefore, we have

Problem 1. The online network lifetime maximization problem is to minimize the maximum load-energy ratio among all the sensors at the stable status:

$$\max\{T(s)\} \Leftrightarrow \min\Big\{\max_{i}\{L_i(s)/e_i\}\Big\}. \qquad (2)$$

Since all data has to be forwarded by the first-level sensors, the sensors with the heaviest load must be in the first level. Therefore, the network lifetime maximization problem can be reduced to the problem of balancing the load among the first-level sensors, which matches the lifetime maximization problem formulated in [2][8].
2.3 Local Cost of Network Lifetime
The difficulty in addressing the network lifetime maximization problem locally is the lack of global information at the distributed sensors. Thus, we define a Local Cost function of Network Lifetime (LCNL) at each sensor as a criterion during its local computing, to bridge its local decision to the global optimization of the network lifetime.

Definition 1. For a sensor i at round t, its Local Cost function of Network Lifetime (LCNL) is defined as:

$$C_i(t) = \begin{cases} L_i(t)/e_i, & \text{if } i \text{ is in the first level} \\ \sum_{j \in P_i} P_{i,j}(t) \cdot C_j(t), & \text{otherwise} \end{cases} \qquad (3)$$
If sensor i is in the first level, its LCNL is defined as its load-energy ratio. Otherwise, the LCNL is defined as the sum of the products of its transmission probabilities with its parent candidates' LCNLs. P_i is the set of all parent candidates of node i, and P_{i,j}(t) is the transmission probability from i to a parent candidate j at round t. We can see that 1) the LCNL can be computed locally from the LCNLs of the parent candidates and the local transmission probabilities, and 2) the LCNL of each node is a recursive function, which can be expanded into a function of the LCNLs of the first-level sensors and the transmission probabilities of the multi-hop links. Moreover, we have

Theorem 1. When all the sensors in the network have equal LCNLs, the network's lifetime is maximized.

Proof. When all the sensors have equal LCNLs, the sensors in the first level must have the same LCNL, i.e., the same load-energy ratio. According to the equivalence in Eqn (2), the lifetime of the network is maximized.

Therefore, the network lifetime maximization problem is mapped to a local cost-balancing problem at each sensor. The work left for a sensor is to self-optimize its transmission probabilities locally so that the network converges to a stable status where all the nodes have equal LCNLs.
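As an illustration (an editor's sketch, not from the paper), the LCNL of Definition 1 can be evaluated with a few lines of code once a node knows its level, its parent candidates' LCNLs, and its own transmission probabilities; the function name and data layout below are hypothetical.

```python
def lcnl(level, load, energy, parent_lcnl, prob):
    """Local Cost function of Network Lifetime, Eqn (3).

    level:        level index of the node (1 = first level)
    load, energy: current load L_i(t) and initial energy e_i
    parent_lcnl:  dict parent -> C_j(t) of the parent candidates
    prob:         dict parent -> P_{i,j}(t), summing to 1
    """
    if level == 1:
        return load / energy                                  # load-energy ratio
    return sum(prob[j] * parent_lcnl[j] for j in prob)        # expected parent cost

# Example: a second-level node splitting its traffic over two parents.
print(lcnl(2, 3.0, 1.0, {"p1": 4.0, "p2": 2.0}, {"p1": 0.5, "p2": 0.5}))  # 3.0
```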
3 Local Algorithm to Maximize the Network's Lifetime
3.1 Challenges
The local cost-balancing problem is difficult due to the issues of instability and local optima. Instability is a general challenge for concurrent distributed decisions, which prevents the system from reaching a consensus [11]. In our problem, at each round a node is not aware of the concurrent changes to the probability assignments of its siblings. For example, when some adjacent siblings observe that the load of a parent candidate is too high, they will react to this local observation by decreasing their own transmission probabilities to that parent candidate. The changes may lead to an overreaction which makes the load of the parent swing from too high to too low. Such overreactions result in "instability", which prevents the network from converging to a stable status. The local optimum is also a general difficulty in distributed algorithms [14], and is a challenge for the local cost-balancing problem. Since each node only has local information, it cannot know the global status of the network. Even if a stable status can be reached, local algorithms may stop at a local optimum, i.e., converge to a sub-optimal status from which they cannot proceed towards the global optimum any more.
3.2 "Virtual Guidance" in LocalWiser
To overcome instability and local optima, we propose a "virtual guidance". We suppose that all the sensors can reach an ideal status where all of their LCNLs are equal to an expected average cost E(C). If an online sensor knew the value of E(C), it would have a guidance for its local computation to avoid overreaction
and local optima. Unfortunately, the value of E(C) cannot be computed by any sensor locally. Therefore, we can only suppose that the sensors in the network hold the value of E(C) virtually. In fact, it will be shown that the exact value of E(C) is not needed in our final algorithm. For a node i, let j ∈ P_i be one of its parent candidates. After round t − 1, the transmission probability from i to j is P_{i,j}(t − 1) and the LCNL of node j is C_j(t − 1). To update the transmission probability at round t, node i is virtually guided to balance C_j(t − 1) towards E(C). Therefore, it sets the transmission probability P_{i,j}(t) as:

$$P_{i,j}(t) = P_{i,j}(t-1)\,\frac{E(C)}{C_j(t-1)\,M_i(t)}, \qquad (4)$$

where $M_i(t) = \sum_{j \in P_i} P_{i,j}(t-1)\,\frac{E(C)}{C_j(t-1)}$ is a normalizer that keeps the sum of the transmission probabilities of node i equal to 1. By substituting M_i(t) in Eqn (4), we have

$$P_{i,j}(t) = \frac{P_{i,j}(t-1)/C_j(t-1)}{\sum_{j \in P_i} P_{i,j}(t-1)/C_j(t-1)}. \qquad (5)$$

The most interesting point in Eqn (5) is that E(C) cancels out. The reason is that every sensor only needs a qualitative guidance, namely that the LCNLs of the parent-level nodes should be equal, but not the exact value of E(C). In static networks, when the LCNLs of all sensors are balanced, the value of E(C) is reached automatically.
LocalWiser Algorithm
Based on the above idea, the LocalWiser algorithm is derived based on time synchronization and level-based scheduling. In each round, the sensors are scheduled according to their level indices. As shown in Fig.1, in each round, the sensors in the child level collect data, run LocalWiser and transmit data before the parent level nodes. This meets the general transmission scheduling schemes for energy efficient multi-hop data aggregation[6]. The issues of time synchronization, collision avoidance and scheduling are studied in [10][13]. The LocalWiser algorithm for a sensor i in round t is given in Algorithm 1. – In line 11, the node updates its local load information. Since the children of node i execute LocalWiser before it, node i can use the children’s loads at round t to update its load information. – In line 13, the node updates its LCNL information. When node i executes LocalWiser, their parents have not executed LocalWiser. Therefore, node i uses Cj (t − 1) which is the LCNL information in the t − 1 round to update Ci (t). In Algorithm 1, we can see during the local computation, no global information is used in LocalWiser. Moreover, in the following sections, we will analyze our algorithm in both static and dynamic networks to illustrate its properties of stability, optimality and good self-adaptivity to network dynamics.
Algorithm 1. LocalWiser algorithm for a sensor i, in round t
1:  while (t < ∞) do
2:    if (t == 0) then
3:      for (n = 1, j = P_i(n); n ≤ #P_i; n++) do
4:        P_{i,j}(t) = 1/(#P_i)    // initialize P_{i,j}
5:      end for
6:    else
7:      for (n = 1, j = P_i(n); n ≤ #P_i; n++) do
8:        P_{i,j}(t) = (P_{i,j}(t−1)/C_j(t−1)) / Σ_{j'∈P_i} (P_{i,j'}(t−1)/C_{j'}(t−1))
9:        L_i(t) = u_i + Σ_{m∈C_i} P_{m,i}(t) L_m(t)
10:       C_i(t) = L_i(t)/e_i if i ∈ level 1, otherwise Σ_{j∈P_i} P_{i,j}(t) · C_j(t−1)
11:     end for
12:   end if
13: end while
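For readers who prefer running code, the following compact simulation of one LocalWiser round is an editor's sketch under the paper's assumptions (unit data capture, unit energy, Eqns (1), (3) and (5)), processing levels child-first as in Fig. 1; the data layout and names are hypothetical and this is not the authors' implementation.

```python
def localwiser_round(nodes, t):
    """One synchronized round of LocalWiser on a static network.

    nodes: dict name -> {'level', 'energy', 'u', 'parents': {p: prob},
                         'children': [names], 'load', 'lcnl'}
    Levels are processed from the deepest level up, as in Fig. 1.
    """
    by_depth = sorted(nodes, key=lambda n: nodes[n]['level'], reverse=True)
    for name in by_depth:
        nd = nodes[name]
        if t > 0 and nd['parents']:                      # Eqn (5): reweight the parents
            w = {p: q / nodes[p]['lcnl'] for p, q in nd['parents'].items()}
            total = sum(w.values())
            nd['parents'] = {p: w[p] / total for p in w}
        # Eqn (1): own data plus the shares forwarded by the children
        nd['load'] = nd['u'] + sum(nodes[c]['load'] * nodes[c]['parents'][name]
                                   for c in nd['children'])
        # Eqn (3): LCNL, using the parents' LCNLs from the previous round
        if nd['level'] == 1:
            nd['lcnl'] = nd['load'] / nd['energy']
        else:
            nd['lcnl'] = sum(q * nodes[p]['lcnl'] for p, q in nd['parents'].items())

# Diamond topology: c (level 2) can route through a or b (level 1).
net = {
    'a': dict(level=1, energy=1.0, u=1.0, parents={}, children=['c'], load=0, lcnl=0),
    'b': dict(level=1, energy=1.0, u=1.0, parents={}, children=['c'], load=0, lcnl=0),
    'c': dict(level=2, energy=1.0, u=1.0, parents={'a': 0.9, 'b': 0.1},
              children=[], load=0, lcnl=0),
}
for t in range(20):
    localwiser_round(net, t)
print(round(net['a']['lcnl'], 3), round(net['b']['lcnl'], 3))   # both approach 1.5 (balanced)
```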
4 Stability and Optimality of LocalWiser
4.1 Stability
Stability is proved in two steps: 1) we prove that the maximum LCNL over all nodes (denoted by C_max(t)) converges to a stable value; 2) once C_max(t) has converged, we prove the stability of LocalWiser.

Lemma 1 (Convergence of C_max(t)). The maximum LCNL of all the nodes, C_max(t), must converge to a stable value.

Proof. For all i ∉ level 1,

$$
\begin{aligned}
C_i(t) &= \sum_{j \in P_i} P_{i,j}(t)\, C_j(t-1)
        = \sum_{j \in P_i} \frac{P_{i,j}(t-1)/C_j(t-1)}{\sum_{j \in P_i} P_{i,j}(t-1)/C_j(t-1)}\, C_j(t-1) \\
       &= \frac{\sum_{j \in P_i} P_{i,j}(t-1)}{\sum_{j \in P_i} P_{i,j}(t-1)/C_j(t-1)}
        \le \frac{\sum_{j \in P_i} P_{i,j}(t-1)}{\sum_{j \in P_i} P_{i,j}(t-1)/C_{max}(t-1)}
        = C_{max}(t-1). \qquad (6)
\end{aligned}
$$

For all i ∈ level 1, it can be proved similarly, by expanding Eqn (1), that C_i(t) ≤ C_max(t − 1). Because C_max(t) = max{C_1(t), ..., C_N(t)}, we get

$$C_{max}(t) \le C_{max}(t-1). \qquad (7)$$
[Fig. 1. The work-flow of sensors in level-based scheduling: within each round, the sensors in the child level, in level k, and in the parent level successively capture local data, receive children's data, aggregate data, run LocalWiser, and transmit data to their parents.]
Since C_max(t) is also lower-bounded (it cannot be smaller than the average load of the first-level nodes) and, by Eqn (7), non-increasing, as time passes C_max(t) becomes infinitely close to or equal to C_max(t − 1). Therefore, Lemma 1 is proved.

When C_max(t) is stable, there is at least one node whose LCNL equals C_max(t). We call such nodes maximum cost nodes (MCNs).

Lemma 2. When C_max(t) is stable, if a node is an MCN, then all the ancestors that it can reach by positive paths (paths whose links all have positive transmission probabilities) are MCNs.

Proof. Without loss of generality, suppose node i is an MCN. From Eqn (6), the condition for C_i(t) = C_max(t) = C_max(t − 1) is:

$$\sum_{j \in P_i} P_{i,j}(t-1)/C_j(t-1) = 1/C_{max}(t-1). \qquad (8)$$
Eqn (8) holds only if, for all j ∈ P_i with P_{i,j}(t − 1) > 0, C_j(t − 1) = C_max(t − 1) = C_max(t). Therefore, if the transmission probability from an MCN to a parent candidate is positive, this parent candidate must also be an MCN. The same condition then holds for its own parent candidates. Therefore, all the ancestors that it can reach by positive paths are MCNs.

Lemma 3. When C_max(t) is stable, if a node is not an MCN but has some MCNs among its parent candidates, the transmission probabilities from it to these MCNs converge to 0.

Proof. Suppose a node i is not an MCN, with local cost C_i(t) < C_max(t), and suppose it has a parent candidate j which is an MCN. From Eqn (5), we have

$$\frac{P_{i,j}(t)}{P_{i,j}(t-1)} = \frac{1/C_{max}(t-1)}{\sum_{m \in P_i} P_{i,m}(t-1)/C_m(t-1)} < 1. \qquad (9)$$

As time passes, the transmission probability from i to j keeps decreasing until:
$$P_{i,j}(t) = \frac{P_{i,j}(t-1)/C_{max}(t-1)}{\sum_{m \in P_i} P_{i,m}(t-1)/C_m(t-1)} = P_{i,j}(t-1). \qquad (10)$$
The only condition under which Eqn (10) holds is P_{i,j}(t) = P_{i,j}(t − 1) = 0. Therefore, the transmission probability from i to j decreases monotonically to 0.

Theorem 2 (Stability). The LocalWiser algorithm guarantees that the network converges to a stable status, where both the LCNLs and the transmission probabilities of all sensors are stable.

Proof. We prove Theorem 2 by an isolation-and-reconstruction method. Lemmas 2 and 3 have shown that after C_max(t) converges, if a node is an MCN, all its parent candidates with positive transmission probabilities are MCNs, and if a node is not an MCN, its transmission probabilities to the MCNs must converge to zero. Therefore, the MCNs can be isolated from the network, i.e., the network can be divided into an MCN set and a non-MCN set, and the transmission probabilities between the two sets are zero. If we subtract the nodes in the MCN set and the links among the MCNs from the network, the costs and the transmission probabilities of the remaining nodes are not affected. After the subtraction, Lemmas 1, 2 and 3 still hold for the remaining network, and a new set of MCNs can be determined in it. We can repeat the subtraction process until all the remaining nodes have equal LCNLs, which is a stable status. After we add the removed MCNs and links back into the network in the reverse order, the stable scenario of the network is obtained.

Corollary 1 (Properties of the stable status of the network). When the network reaches the stable status, we have:
1. The network is composed of K isolated sets, where K is a positive integer.
2. Nodes in each isolated set have equal LCNLs, and the transmission probabilities between nodes in different sets are zero.
3. Each isolated set contains at least one first-level node, and the first-level nodes are fully covered by the K isolated sets.

Proof. Properties 1) and 2) are directly proved by Theorem 2. For property 3), Lemma 3 gives that two nodes in different isolated sets cannot have positive paths to the same node, so the first-level nodes are covered by the K isolated sets without overlapping.

Fig. 2 illustrates the stable status of a network. Three isolated sets are formed: the first set contains the MCNs of the whole network; the second contains the MCNs after subtraction of the first set; the third contains the MCNs after subtraction of the first and second sets. In each set all nodes have equal LCNLs. The transmission probabilities between any two different sets have converged to zero.
[Fig. 2. The stable scenario of a network in which nodes form isolated sets: below the sink and its first level, a 1st set (MCNs of the network, LCNL = 4), a 2nd set (MCNs after subtracting the 1st set, LCNL = 3.5), and a 3rd set (MCNs after subtracting the 1st and 2nd sets, LCNL = 3); links between sets have zero transmission probability, links within sets have positive transmission probability.]
4.2 Optimality
We now prove that the stable status of LocalWiser is optimal for maximizing the network lifetime. A metric based on Chebyshev's sum inequality is used to measure the load-balancing performance of the first-level sensors [3]:

$$\theta = \frac{\big(\sum_{i=1}^{q} L_i(s)\big)^2}{q \sum_{i=1}^{q} \big(L_i(s)\big)^2} = \frac{\big(\sum_{k=1}^{K} n_k \ell_k\big)^2}{q \sum_{k=1}^{K} n_k \ell_k^2}, \qquad (11)$$

where L_i(s) is the stable load of a first-level node i and s is the stable-status flag. The right-hand side of Eqn (11) is a set representation of the metric: we suppose the network contains q first-level nodes, which are fully covered by the K isolated sets; the first-level nodes in the kth set are denoted by S_k, the number of first-level nodes in S_k is denoted by n_k, and the (common) load of the first-level sensors in S_k is denoted by ℓ_k. It is easy to verify that θ is never larger than 1: θ = 1 if and only if all the first-level nodes have equal loads, and θ < 1 otherwise. The smaller θ is, the worse the load-balancing performance.

Theorem 3 (Optimality). When the network has reached the stable status, any modification to the probability assignments at any node can only make the load-balancing performance of the first-level nodes worse.

Proof. We consider the different cases of probability assignment modifications.
1) If the transmission probability modifications are within one isolated set (say the kth set), only the loads of the first-level nodes in this set are affected. Suppose that after the modification the loads of the first-level sensors in this set are {ℓ_k + x_j}, where Σ_{j∈S_k} x_j = 0 because the total load in the set does not change. The load-balancing metric after the modification is:
$$\theta' = \frac{\big(\sum_{i=1}^{K} n_i \ell_i\big)^2}{q\Big(\sum_{i=1, i \neq k}^{K} n_i \ell_i^2 + \sum_{j \in S_k} (\ell_k + x_j)^2\Big)}. \qquad (12)$$

Since Σ_{j∈S_k} (ℓ_k + x_j)² > n_k ℓ_k², we have θ' < θ, which means the load-balancing performance of the first-level nodes becomes worse after the modification.
2) Consider the case where the transmission probabilities between different sets are modified from zero to some positive values (the dashed links shown in Fig. 2). Without loss of generality, suppose the first-level sensors with loads ℓ_k and ℓ_j are affected by such modifications, with ℓ_k > ℓ_j at the stable status. The modifications can only further increase the loads ℓ_k and further decrease the loads ℓ_j. After the network converges to a new stable status, if no other sets are affected, the new stable loads of the two sets are ℓ_k + x/n_k and ℓ_j − x/n_j respectively, where x is a positive value indicating the amount of data transferred from set j to set k. It is easy to verify by the Rearrangement Inequality [16] that θ' < θ, so the load-balancing performance of the first level becomes worse after the modification. If the inter-set modifications affect more than two sets, the modifications can still only increase the loads of the larger-load nodes and decrease the loads of the smaller-load nodes in the first level, so the load-balancing metric again becomes worse by the Rearrangement Inequality. Based on cases 1) and 2), the stable status of LocalWiser is optimal for maximizing the network lifetime.
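As a quick numeric sanity check of Eqn (11) (an editor's illustration, not from the paper), perturbing a balanced first level while keeping the total load fixed, as in case 1) of the proof, strictly decreases θ:

```python
def theta(loads):
    """Load-balancing metric of Eqn (11): (sum L)^2 / (q * sum L^2); equals 1 iff all loads are equal."""
    q = len(loads)
    return sum(loads) ** 2 / (q * sum(x * x for x in loads))

balanced = [3.0, 3.0, 3.0, 3.0]
perturbed = [3.5, 2.5, 3.0, 3.0]           # same total load, shifted within one set
print(theta(balanced))                      # 1.0
print(theta(perturbed) < theta(balanced))   # True: balancing performance got worse
```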
5 Numerical Results in Static and Dynamic Networks
5.1 Simulation Setups
For a static network, the sensors are deployed uniformly at random in a 2D region, with a single sink at the center of the network. Sensors are organized into levels according to their minimum hop counts to the sink, and each sensor establishes probabilistic links to its parent candidates. The communication radius of the sensors is identical and normalized to r = 1, and every sensor's initial energy is set to 1. Four other algorithms are used in the performance comparison: 1) a centralized algorithm using global information and Linear Programming (Centralized LP) [2], which gives the globally optimal result; 2) a greedy algorithm for local load-balancing (Greedy); 3) the Dynamic Load-Balanced routing Tree (DLBT) [17]; and 4) the level-based load-balancing algorithm (Level-based Balancing) [14].
5.2 Performance in Static Networks
In static networks, the sensors collect a constant amount of data in each round and the network topology does not change. In our simulation, a high-density network of 300 nodes is randomly deployed in a 7×7 area, and a low-density network of 100 nodes is randomly deployed in a 10×10 area; each node collects one unit of data per round. To evaluate the network lifetime, the minimum energy-load ratio over all sensors, i.e., 1/max_{i=1..N} L_i(t), is plotted as the evaluation metric for the different algorithms. Fig. 3 shows the performance of the different algorithms in the high-density network, and Fig. 4 shows the performance in the low-density network. For networks of different densities, LocalWiser always converges to a stable status where it reaches the same globally optimal performance as the centralized Linear Programming.
[Fig. 3. Performance in the high-density static network. Fig. 4. Performance in the low-density static network. Fig. 5. Self-adapting to topology dynamics (a node is removed from the 3rd level, then five nodes move to random new positions). Fig. 6. Self-adapting to flow dynamics (10% of the nodes randomly increase, then decrease, their data capturing rates). In all four plots the x-axis is the number of data collection rounds, the y-axis is 1/max_i L_i(t), and the curves compare Centralized LP, Greedy, DLBT, Level-based Balancing and LocalWiser.]
We can see the instability of the greedy algorithm from its zigzag performance curves, while the DLBT algorithm and Level-based Balancing only reach local optima. Evaluations on many other networks show similar results [15]. Therefore, in our simulations LocalWiser guarantees stability and the globally optimal network lifetime by localized methods. According to the simulations, the convergence speed of LocalWiser in static networks can be summarized as 1) quick convergence to a good-enough status and 2) slow improvement from good-enough to the optimum. LocalWiser improves 1/max_{i=1..N} L_i(t) quickly, within several rounds, to a good-enough performance which is already very close to the optimum. The reason for the slow convergence from good-enough to the optimum lies in Eqn (5): when P_{i,j}(t − 1) is very close to zero, P_{i,j}(t) gains very little in each step. Since the good-enough performance is already very close to the global optimum, the LocalWiser algorithm provides a fast network lifetime maximization method for practical applications.
5.3 Performance in Dynamic Networks
We evaluate LocalWiser in two kinds of dynamic networks: 1) networks whose topology is dynamic, and 2) networks in which the amounts of data collected by the sensors vary across rounds. If the dynamic events are highly frequent, it is hard to provide an adaptive and online optimal solution, specifically when the variation speed exceeds the adaptation speed. Therefore, we focus on analyzing the adaptation speed of LocalWiser under infrequent network dynamics.
1) Mechanisms to handle network topology dynamics. Topology dynamics include the joining, leaving and movement of nodes. When a new node joins the network, it discovers its parent candidates and child candidates by local communication according to the level indices of its neighbors. It then assigns transmission probabilities evenly to its parent candidates and notifies its children to reset their transmission probabilities. Node leaving is handled similarly: the children of the leaving node reset their transmission probabilities to exclude the leaving node, and the parent candidates of the leaving node also notify their children to reset their transmission probabilities. Node movements are treated as a combined leave-and-join event, as illustrated in the sketch below.
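A minimal sketch (editor's illustration of the mechanism just described; the data layout and function names are hypothetical) of how a join and a leave could reset the affected transmission probabilities evenly:

```python
def reset_even(node, parent_candidates):
    """Assign transmission probabilities evenly over the given parent candidates."""
    node['parents'] = ({p: 1.0 / len(parent_candidates) for p in parent_candidates}
                       if parent_candidates else {})

def handle_join(net, new_node, parent_candidates, child_candidates):
    """A joining node spreads its probabilities evenly; its children re-include it."""
    net[new_node] = {'parents': {}, 'children': list(child_candidates)}
    reset_even(net[new_node], parent_candidates)
    for c in child_candidates:
        reset_even(net[c], list(net[c]['parents']) + [new_node])

def handle_leave(net, gone):
    """Children of a leaving node reset their probabilities to exclude it."""
    for c in net[gone]['children']:
        reset_even(net[c], [p for p in net[c]['parents'] if p != gone])
    del net[gone]

net = {'a': {'parents': {}, 'children': []}}
handle_join(net, 'x', parent_candidates=['a'], child_candidates=[])
print(net['x']['parents'])   # {'a': 1.0}
```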
An example is shown in Fig. 5, where the network has 100 nodes and eight levels. LocalWiser converges to a stable status after about five rounds. In the 30th round, when we remove a node from the third level, LocalWiser quickly leads all the sensors to a new stable and optimal status. In the 60th round, when five nodes are moved to random new positions, the network self-adapts to the movement quickly and again converges to a new stable and optimal status after several rounds.
2) Adaptivity to flow dynamics. Nodes may capture different amounts of data in different rounds, which we call flow dynamics. LocalWiser self-adapts to such flow dynamics and keeps the network at a stable and optimal status online. An example is shown in Fig. 6 for a network of 100 nodes. It converges to a stable status after about five rounds. In the 30th round, 10% of the nodes randomly increase their data capturing rates; LocalWiser needs about five rounds to adapt to this flow dynamics and converge to a new stable and optimal status. In the 60th round, when 10% of the nodes randomly decrease their data capturing rates, LocalWiser again needs about five rounds to converge to another new stable and optimal status. We have also evaluated the self-adaptivity of LocalWiser in many other networks and found that the time for LocalWiser to adapt to network dynamics is generally only several rounds; more simulation results can be found in our online demo [15]. Therefore, when the network dynamics are not too frequent, LocalWiser is a distributed and adaptive online network lifetime maximization method.
6 Conclusion
In this paper, we study the network lifetime maximization problem by mapping it to a localized cost-balancing problem. An algorithm, LocalWiser, is proposed to maximize the network lifetime, with the following characteristics: 1) local and distributed computation; 2) stability; 3) optimality for maximizing the network lifetime at the stable status; 4) fast self-adaptation to network dynamics; and 5) easy implementation. In future work, the localized cost-balancing algorithm can be improved to jointly optimize the transmission power and the transmission probability. Exploiting the mobility of the sensors to improve the performance of LocalWiser, and speeding up the convergence of LocalWiser, are also part of our future work. In addition, LocalWiser can be extended to multi-hop heterogeneous networks and multi-hop networks with queues.
Acknowledgement. This work was supported in part by the National Basic Research Program of China Grants 2007CB807900 and 2007CB807901, the National Natural Science Foundation of China Grants 61073174, 61033001 and 61061130540, and the Hi-Tech Research and Development Program of China Grant 2006AA10Z216.
References
1. Barrett, C.L., Eidenbenz, S.J., Kroc, L., Marathe, M., Smith, J.P.: Parametric probabilistic sensor network routing. In: WSNA 2003, pp. 122–131. ACM, New York (2003)
2. Buragohain, C., Agrawal, D., Suri, S.: Power aware routing for sensor databases. In: INFOCOM 2005, pp. 1747–1757 (2005)
3. Dai, H., Han, R.: A node-centric load balancing algorithm for wireless sensor networks. In: Globecom 2003, pp. 548–552 (2003)
4. Franceschelli, M., Giua, A., Seatzu, C.: Load balancing over heterogeneous networks with gossip-based algorithms. In: ACC 2009, pp. 1987–1993 (2009)
5. Jarry, A., Leone, P., Nikoletseas, S., Rolim, J.: Optimal data gathering paths and energy balance mechanisms in wireless networks. In: Rajaraman, R., Moscibroda, T., Dunkels, A., Scaglione, A. (eds.) DCOSS 2010. LNCS, vol. 6131, pp. 288–305. Springer, Heidelberg (2010)
6. Krishnamachari, B., Estrin, D., Wicker, S.B.: The impact of data aggregation in wireless sensor networks. In: ICDCSW 2002, pp. 575–578. IEEE Computer Society, Washington, DC, USA (2002)
7. Lee, H., Keshavarzian, A., Aghajan, H.: Near-lifetime-optimal data collection in wireless sensor networks via spatio-temporal load balancing. ACM Trans. Sen. Netw. 6, 26:1–26:32 (2010)
8. Liang, W.F., Liu, Y.Z.: Online data gathering for maximizing network lifetime in sensor networks. IEEE Transactions on Mobile Computing 6(1), 2–11 (2007)
9. Luo, W.L.: Prolonging network lifetime for data gathering in wireless sensor networks. Submitted to IEEE Trans. Computers (2010)
10. Maróti, M., Kusy, B., Simon, G., Lédeczi, Á.: The flooding time synchronization protocol. In: SenSys 2004, pp. 39–49 (2004)
11. Olfati-Saber, R., Fax, J.A., Murray, R.M.: Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE 95(1), 215–233 (2007)
12. Tsai, Y.-P., Liu, R.-S., Luo, J.-T.: Load balance based on path energy and self-maintenance routing protocol in wireless sensor networks. In: Hong, C.S., Tonouchi, T., Ma, Y., Chao, C.-S. (eds.) APNOMS 2009. LNCS, vol. 5787, pp. 431–434. Springer, Heidelberg (2009)
13. Wang, L., Xiao, Y.: A survey of energy-efficient scheduling mechanisms in sensor networks. Mob. Netw. Appl. 11, 723–740 (2006)
14. Wang, Y.C., Wang, Y.X., Lin, H.: Level-based load-balancing by tuning transmission probabilities locally in wireless sensor networks. Submitted to Computer Communications
15. Wang, Y.C., Wang, Y.X., Qi, X.: Guided-evolution: convergence to global optimal by distributed computing using local information. In: Demo in MobiCom 2010 (2010), http://www.project.itcs.tsinghua.edu.cn/balance
16. Wu, K., Liu, A.: Rearrangement inequality. Mathematics Competitions 8, 53–60 (1995)
17. Yan, T., Bi, Y., Sun, L., Zhu, H.: Probability based dynamic load-balancing tree algorithm for wireless sensor networks. In: Lu, X., Zhao, W. (eds.) ICCNMC 2005. LNCS, vol. 3619, pp. 682–691. Springer, Heidelberg (2005)
18. Zheng, M., Zhang, D., Luo, J.: Minimum hop routing wireless sensor networks based on ensuring of data link reliability. In: MSN 2009, pp. 212–217 (2009)
19. Zhu, X.Q., Girod, B.: A distributed algorithm for congestion-minimized multi-path routing over ad hoc networks. In: ICME 2005, pp. 1485–1488 (2005)
Time-Varying Graphs and Dynamic Networks

Arnaud Casteigts¹, Paola Flocchini¹, Walter Quattrociocchi², and Nicola Santoro³

¹ University of Ottawa, {casteig,flocchin}@site.uottawa.ca
² University of Siena, [email protected]
³ Carleton University, Ottawa, [email protected]
Abstract. The past decade has seen intensive research efforts on highly dynamic wireless and mobile networks (variously called delay-tolerant, disruptive-tolerant, challenged, opportunistic, etc.) whose essential feature is a possible absence of end-to-end communication routes at any instant. As part of these efforts, a number of important concepts have been identified, based on new meanings of distance and connectivity. The main contribution of this paper is to review and integrate the collection of these concepts, formalisms, and related results found in the literature into a unified coherent framework, called TVG (for time-varying graphs). Besides this definitional work, we connect the various assumptions through a hierarchy of classes of TVGs defined with respect to properties with algorithmic significance in distributed computing. One of these classes coincides with the family of dynamic graphs over which population protocols are defined. We examine the (strict) inclusion hierarchy among the classes. The paper also provides a quick review of recent stochastic models for dynamic networks that aim to enable analytical investigation of the dynamics.

Keywords: Highly dynamic networks, delay-tolerant networks, challenged networks, time-varying graphs, evolving graphs, dynamic graphs.
1 Introduction

In the past few years, intensive research efforts have been devoted to the study of highly dynamic networks, whose topologies change as a function of time and whose rate of change is too high to be reasonably modeled in terms of network faults or failures; in these systems changes are not anomalies but rather an integral part of the nature of the system. They include, but are not limited to, dynamic mobile ad hoc networks where the network's topology changes dramatically over time due to the movement of the network's nodes; sensor networks where links only exist when two neighbouring sensors are awake and have power; and vehicular networks where the topology changes continuously as vehicles move. These highly dynamic infrastructure-less networks, variously called delay-tolerant, disruptive-tolerant, challenged, opportunistic, etc. (e.g., see [7,10,11,19,27,29,30]), have in common that the assumption of connectivity does not necessarily hold, at least with the usual meaning of contemporaneous end-to-end multi-hop
An extended version of this paper can be found in arXiv:1012.0009.
paths between any pair of nodes. The network may actually be disconnected at every time instant. Still, communication routes may be available over time and space, and make broadcast, routing and other computations feasible. An extensive amount of research has been devoted, mostly by the engineering community but also by computer scientists, to the problems of operating and computing in such highly dynamical environments. As part of these efforts, a number of important concepts have been identified, often named, sometimes formally defined. In particular, most of the basic graph concepts were extended to a new temporal version, e.g., path and reachability [4,20], distance [6], diameter [11], or connected components [5]. In several cases, differently named concepts identified by different researchers are actually one and the same concept. For example, the concept of temporal distance, formalized in [6], is the same as reachability time [18], information latency [21], and temporal proximity [22]; similarly, the concept of journey [6] has been coined schedule-conforming path [4], time-respecting path [18,20], and temporal path [11,28]. Hence, the concepts discovered in these investigations can be viewed as parts of the same conceptual universe, and the formalisms proposed so far to express them as fragments of a larger formal description of this universe. As the notion of graph is the natural means for representing a standard network, the notion of time-varying graph is the natural means to represent these highly dynamic infrastructure-less networks. All the concepts and definitions advanced so far are based on or imply such a notion, as expressed even by the choices of names; e.g., Kempe et al. [20] talk of a temporal network (G, λ) where λ is a time-labeling of the edges that associates to every edge a date corresponding to a punctual interaction; Leskovec et al. [24] talk of graphs over time; Ferreira [14] views the dynamics of the system in terms of a sequence of static graphs, called an evolving graph; Flocchini et al. [16] and Tang et al. [28] independently employ the term time-varying graphs; Kostakos uses the term temporal graph [22]; etc.¹ The main contribution of this paper is to integrate the existing models, concepts, and results found in the literature into a unified framework that we call TVG (for time-varying graphs). This formalism, presented in Section 2, essentially consists of a set of compact and elegant notations and the possibility to switch between graph-centric and edge-centric perspectives on the dynamics. It is extended in Section 3, where we present the most central concepts identified by the research (e.g., journeys, temporal distance, connectivity over time and further concepts built on top of them). We identify in Section 4 several classes of dynamic networks defined with respect to basic properties of TVGs. Some of these classes have been extensively studied in the literature; e.g., one of them coincides with the family of dynamic graphs over which population protocols [1] are defined. We examine the (strict) inclusion hierarchy among the classes. As a given class typically corresponds to necessary or sufficient conditions for basic computations, the inclusion relationship implies the transferability of feasibility results (e.g., protocols) to an included class, and of impossibility results (e.g., lower bounds) to an including class. Finally, Section 5 reviews recent efforts to study dynamic networks
¹ The more natural term dynamic graph is not often employed because it is commonly and extensively used in the context of faulty networks.
from a stochastic perspective, including modeling aspects (e.g., with edge-Markovian evolving graphs); we then conclude with some remarks and open questions.
2 Time-Varying Graphs

Consider a set of entities V (or nodes), a set of relations E between these entities (edges), and an alphabet L accounting for any property such a relation could have (label); that is, E ⊆ V × V × L. The definition of L is domain-specific, and therefore left open – a label could represent for instance the intensity of relation in a social network, a type of carrier in a transportation network, or a particular medium in a communication network. For generality, we assume L to possibly contain multi-valued elements (e.g. <satellite link; bandwidth of 4 MHz; encryption available; ...>). The set E enables multiple relations between a given pair of entities, as long as these relations have different properties, that is, for any e1 = (x1, y1, λ1) ∈ E, e2 = (x2, y2, λ2) ∈ E, (x1 = x2 ∧ y1 = y2 ∧ λ1 = λ2) =⇒ e1 = e2. Because we address dynamical systems, the relations between entities are assumed to take place over a time span T ⊆ T called the lifetime of the system. The temporal domain T is generally assumed to be N for discrete-time systems or R+ for continuous-time systems. The dynamics of the system can be subsequently described by a time-varying graph, or TVG, G = (V, E, T, ρ, ζ), where
– ρ : E × T → {0, 1}, called presence function, indicates whether a given edge is available at a given time;
– ζ : E × T → T, called latency function, indicates the time it takes to cross a given edge if starting at a given date (the latency of an edge could vary in time).
One may consider variants where the presence of nodes is also conditional upon time, by adding a node presence function ψ : V × T → {0, 1}. We do not do it in the general case in this paper, for conciseness of the notations, and mention instead when this could be relevant. The TVG formalism can arguably describe a multitude of different scenarios, from transportation networks to communication networks, complex systems, or social networks. Two intuitive examples are shown on Figure 1.
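To make the formalism concrete, here is a minimal Python sketch (an editor's illustration, not part of the paper) of a TVG whose presence and latency functions are supplied as callables; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable, str]   # (u, v, label), as in E ⊆ V × V × L

@dataclass
class TVG:
    nodes: Set[Hashable]
    edges: Set[Edge]
    lifetime: Tuple[float, float]                             # the time span T
    rho: Callable[[Edge, float], int]                         # presence function ρ
    zeta: Callable[[Edge, float], float] = lambda e, t: 1.0   # latency function ζ

    def available(self, e: Edge, t: float) -> bool:
        """True iff edge e is present at date t within the lifetime."""
        t0, t1 = self.lifetime
        return t0 <= t < t1 and self.rho(e, t) == 1

# A two-node TVG whose single edge is present only during [2, 5).
g = TVG(nodes={"u", "v"},
        edges={("u", "v", "wifi")},
        lifetime=(0.0, 10.0),
        rho=lambda e, t: 1 if 2 <= t < 5 else 0)
print(g.available(("u", "v", "wifi"), 3.0), g.available(("u", "v", "wifi"), 6.0))  # True False
```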
[Fig. 1. Two examples of time-varying graphs, employed in very different contexts: (a) a transportation network linking Ottawa, Montreal and Lisbon with multiple directed edges labeled λ1 to λ4; (b) a communication network among four nodes a, b, c, d with undirected edges labeled λ1 and λ2.]
The meaning of an edge in these two examples varies drastically. In Figure 1(a), an edge from a node u to another node v represents the possibility for some
agent to move from u to v. The edges in this example are assumed directed, and possibly multiple. The meaning of the labels λ1 to λ4 could be for instance "bus", "car", "plane", "boat", respectively. Except for the travel by car from Ottawa to Montreal – which could presumably be started anytime –, typical edges in this scenario are available on a punctual basis, i.e., the presence function ρ for these edges returns 1 only at particular date(s) when the trip can be started. The latency function ζ may also vary from one edge to another, as well as for different availability dates of the same edge (e.g., variable traffic on the road, depending on the departure time). The second example in Figure 1(b) represents a history of connectivity between a set of moving nodes, where the possibilities of communication appear e.g. as a function of their respective distance. The two labels λ1 and λ2 may account here for different types of communication media, such as WiFi and Satellite, having various properties in terms of range, bandwidth, latency, or energy consumption. In this scenario, the edges are assumed to be undirected and there is no more than one edge between any two nodes. The meaning of an edge is also different here: an edge between two nodes means that any one (or both) of them can (attempt to) send a message to the other. A typical presence function for this type of edge returns 1 for some intervals of time, because the nodes are generally in range for a non-punctual period of time. Note that the effective delivery of a message sent at time t on an edge e could be subject to further constraints regarding the latency function, such as the condition that ρ(e) returns 1 for the whole interval [t, t + ζ(e, t)). These two examples are deliberately different; they illustrate the spectrum of models over which the TVG formalism can stretch. As observed, some contexts are intrinsically simpler than others and call for restrictions (e.g., between any two nodes in the second example, there is at most one undirected edge). Further restrictions may be considered. For example, the latency function could be assumed constant over time (ζ : E → T); over the edges (ζ : T → T); over both (ζ ∈ T), or simply ignored. In the latter case, a TVG could have its relations fully described by a graphical representation like that of Figure 2.
[Fig. 2. A simple TVG over four nodes a, b, c, d. The interval(s) on each edge e represent the periods of time when it is available, that is, {t ∈ T : ρ(e, t) = 1}; the depicted edges carry the intervals [0, 4), [1, 3), [5, 6) ∪ [7, 8), and [2, 5).]
Note that a number of works on dynamic networks simply ignore ζ, or assume a discrete-time scenario where every time step implicitly corresponds to a constant ζ.
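An interval-based presence function like the one pictured in Fig. 2 could be encoded as follows (an editor's sketch; the edge-to-interval assignment is illustrative, since the figure residue only lists the intervals):

```python
# Each edge is mapped to the union of intervals during which it is present.
intervals = {
    ("a", "b"): [(0, 4)],
    ("a", "c"): [(1, 3)],
    ("b", "d"): [(5, 6), (7, 8)],
    ("c", "d"): [(2, 5)],
}

def rho(edge, t):
    """Presence function ρ(e, t): 1 iff t falls in one of the edge's intervals."""
    return int(any(lo <= t < hi for lo, hi in intervals.get(edge, [])))

print(rho(("b", "d"), 7.5), rho(("b", "d"), 6.5))   # 1 0
```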
3 Definitions of TVG Concepts

This section transposes and generalizes a number of dynamic network concepts into the framework of time-varying graphs. A majority of them emerged independently in
various areas of scientific literature; some appeared more specifically; some others are original propositions.

3.1 The Underlying Graph G

Given a TVG G = (V, E, T, ρ, ζ), the graph G = (V, E) is called the underlying graph of G. This static graph should be seen as a sort of footprint of G, which flattens the time dimension and indicates only the pairs of nodes that have relations at some time in T. It is a central concept that is used recurrently in the following. In most studies and applications, the underlying graph G is assumed to be connected; in general, this is not necessarily the case. Let us stress that the connectivity of the underlying graph G = (V, E) does not imply that the TVG G is connected at a given time instant; in fact, the TVG could be disconnected at all times. The lack of relationship, with regard to connectivity, between a TVG and its footprint G is even stronger: the fact that G = (V, E) is connected does not even imply that the TVG is "connected over time", as discussed in more detail later.

3.2 Points of View

Depending on the problem under consideration, it may be convenient to look at the evolution of the system from the point of view of a given relation (edge) or from that of the global system (entire graph). We respectively qualify these views as edge-centric and graph-centric.

Edge-centric evolution. From an edge standpoint, the notion of evolution comes down to a variation of availability and latency over time. We define the available dates of an edge e, noted I(e), as the union of all dates at which the edge is available, that is, I(e) = {t ∈ T : ρ(e, t) = 1}. When I(e) is expressed as a multi-interval of availability I(e) = [t1, t2) ∪ [t3, t4) ∪ ..., where ti < ti+1, the sequence of dates t1, t3, ... is called the appearance dates of e, noted App(e), and the sequence of dates t2, t4, ... is called the disappearance dates of e, noted Dis(e). Finally, the sequence t1, t2, t3, ... is called the characteristic dates of e, noted ST(e). In the following, we use the notation ρ_[t,t′)(e) = 1 to indicate that ∀t′′ ∈ [t, t′), ρ(e, t′′) = 1.

Graph-centric evolution. The sequence ST(G) = sort(∪{ST(e) : e ∈ E}), called the characteristic dates of G, corresponds to the sequence of dates when topological events (appearance/disappearance of an edge) occur in the system. Each topological event can be viewed as the transformation from one static graph to another. Hence, the evolution of the system can be described as a sequence of static graphs. More precisely, from a global viewpoint, the evolution of G is described as the sequence of graphs SG = G1, G2, ..., where Gi corresponds to the static snapshot of G at time ti ∈ ST(G); i.e., e ∈ E_Gi ⇐⇒ ρ_[ti,ti+1)(e) = 1. Note that, by definition, Gi ≠ Gi+1. In the case where time is discrete, another possible global representation of the evolution of G is the sequence SG = G1, G2, ..., where Gi corresponds to the static snapshot of G at time t = i. Note that, in this case, it is possible that Gi = Gi+1. Observe that in both the continuous and the discrete case, the underlying graph G (defined in Section 3.1) corresponds to the union of all Gi in SG.
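As a small illustration of the graph-centric view (an editor's sketch, not from the paper), the characteristic dates ST(G) and the corresponding snapshot sequence can be derived directly from interval-based edge presences:

```python
def characteristic_dates(presence):
    """ST(G): sorted union of all appearance/disappearance dates.

    presence: dict edge -> list of (appearance, disappearance) intervals.
    """
    dates = {d for ivs in presence.values() for iv in ivs for d in iv}
    return sorted(dates)

def snapshots(presence):
    """Sequence S_G of static graphs G_i, one per characteristic date t_i."""
    st = characteristic_dates(presence)
    return [(t, {e for e, ivs in presence.items()
                 if any(lo <= t < hi for lo, hi in ivs)}) for t in st]

presence = {("a", "b"): [(0, 4)], ("b", "c"): [(2, 5)]}
for t, edges in snapshots(presence):
    print(t, sorted(edges))
# 0 [('a', 'b')]
# 2 [('a', 'b'), ('b', 'c')]
# 4 [('b', 'c')]
# 5 []
```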
The idea of representing a dynamic graph as a sequence of static graphs, mentioned in the conclusion of [17], was brought to life in [14] as a combinatorial model called evolving graphs. An evolving graph usually refers to either one of the two structures (G, SG, ST) or (G, SG, N), the latter used only when discrete time is considered. Their initial version also included a latency function, which makes them a valid – graph-centric – representation of TVGs.

3.3 Subgraphs of a Time-Varying Graph

Subgraphs of a TVG G can be defined in a classical manner, by restricting the set of vertices or edges of G. More interesting is the possibility to define a temporal subgraph by restricting the lifetime T of G, leading to the graph G′ = (V, E′, T′, ρ′, ζ′) such that
– T′ ⊆ T
– E′ = {e ∈ E : ∃t ∈ T′ : ρ(e, t) = 1 ∧ t + ζ(e, t) ∈ T′}
– ρ′ : E′ × T′ → {0, 1} where ρ′(e, t) = ρ(e, t)
– ζ′ : E′ × T′ → T where ζ′(e, t) = ζ(e, t)
In practice, we allow the notation G′ = G_[ta,tb) to denote the temporal subgraph of G restricted to T′ = T ∩ [ta, tb), which includes the possible notations G_[ta,+∞) or G_(−∞,tb) regardless of whether T is open, semi-closed, or closed.

3.4 Journeys

A sequence of couples J = (e1, t1), (e2, t2), . . . , (ek, tk), such that e1, e2, ..., ek is a walk in G, is a journey in G if and only if ρ(ei, ti) = 1 and ti+1 ≥ ti + ζ(ei, ti) for all i < k. Additional constraints may be required in specific domains of application, such as the condition ρ_[ti, ti+ζ(ei,ti))(ei) = 1 in communication networks (the edge must remain available until the message is delivered). We denote by departure(J) and arrival(J) the starting date t1 and the last date tk + ζ(ek, tk) of a journey J, respectively. Journeys can be thought of as paths over time from a source to a destination, and therefore have both a topological length and a temporal length. The topological length of J is the number |J| = k of couples in J (i.e., the number of hops); its temporal length is its end-to-end duration, arrival(J) − departure(J). Let us denote by J*_G the set of all possible journeys in a time-varying graph G, and by J*_(u,v) ⊆ J*_G those journeys starting at node u and ending at node v. If a journey exists from a node u to a node v, that is, if J*_(u,v) ≠ ∅, then we say that u can reach v, and allow the simplified notation u ⇝ v. Clearly, the existence of a journey is not symmetrical: u ⇝ v does not imply v ⇝ u; this holds regardless of whether the edges are directed or not, because the time dimension creates its own level of direction. Given a node u, the set {v ∈ V : u ⇝ v} is called the horizon of u.

3.5 Distance

As observed, the length of a journey can be measured both in terms of hops and in terms of time. This gives rise to two distinct definitions of distance in a time-varying graph G:
– The topological distance from a node u to a node v at time t, noted d_{u,t}(v), is defined as Min{|J| : J ∈ J*_(u,v) ∧ departure(J) ≥ t}. For a given date t, a journey whose departure is t′ ≥ t and whose topological length equals d_{u,t}(v) is qualified as shortest;
– The temporal distance from u to v at time t, noted d̂_{u,t}(v), is defined as Min{arrival(J) : J ∈ J*_(u,v) ∧ departure(J) ≥ t} − t. Given a date t, a journey whose departure is t′ ≥ t and whose arrival is t + d̂_{u,t}(v) is qualified as foremost. Finally, for any given date t, a journey whose departure is t′ ≥ t and whose temporal length is Min{d̂_{u,t′}(v) : t′ ∈ T ∩ [t, +∞)} is qualified as fastest.

The problem of computing shortest, fastest, and foremost journeys in delay-tolerant networks was introduced in [6], and an algorithm for each of the three metrics was provided for the centralized version of the problem (with complete knowledge of G). A concept closely related to that of temporal distance is that of temporal view, introduced in [21] in the context of social network analysis. The temporal view (simply called view in [21]; we add the "temporal" adjective to avoid confusion with the concept of view in distributed computing) that a node v has of another node u at time t, denoted φ_{v,t}(u), is defined as the latest (i.e., largest) t′ ≤ t at which a message received by time t at v could have been emitted at u; that is, in our formalism, φ_{v,t}(u) = Max{departure(J) : J ∈ J*_(u,v) ∧ arrival(J) ≤ t}.
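For intuition (an editor's sketch, not the algorithms of [6]), the temporal distance d̂_{u,t}(v) can be computed in a discrete-time TVG with unit latency by a simple earliest-arrival sweep that only crosses edges present at the current step and lets nodes wait:

```python
def temporal_distance(presence, nodes, source, target, t0, horizon):
    """Earliest-arrival (foremost) computation in a discrete-time TVG with unit latency.

    presence: function (u, v, t) -> bool, True iff edge {u, v} is present at time t.
    Nodes may wait (store-carry-forward). Returns the temporal distance from source
    to target at time t0, or None if target is unreachable before `horizon`.
    """
    arrival = {source: t0}                       # earliest known arrival date per node
    for t in range(t0, horizon):
        reached = [n for n in nodes if arrival.get(n, horizon + 1) <= t]
        for u in reached:
            for v in nodes:
                if v != u and presence(u, v, t) and t + 1 < arrival.get(v, horizon + 1):
                    arrival[v] = t + 1           # message crosses the edge during [t, t+1)
    return arrival[target] - t0 if target in arrival else None

nodes = ["a", "b", "c"]
# Edge a-b is present at times 0..2 and edge b-c only at time 4,
# so the foremost arrival at c (starting from a at time 0) is 5.
def presence(u, v, t):
    e = tuple(sorted((u, v)))
    return (e == ("a", "b") and 0 <= t <= 2) or (e == ("b", "c") and t == 4)

print(temporal_distance(presence, nodes, "a", "c", t0=0, horizon=10))   # 5
```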
The question of knowing whether all the nodes of a network could know their temporal views in real time was recently answered (affirmatively) in [10].

3.6 Other Temporal Concepts

The number of definitions built on top of temporal concepts could grow endlessly, and our aim is certainly not to enumerate all of them. Yet, here is a short list of additional concepts that we believe are general enough to be worth mentioning. The concept of eccentricity can be separated into a topological eccentricity and a temporal eccentricity, following the same mechanism as for the concept of distance. The temporal eccentricity of a node u at time t, ε̂_t(u), is defined as max{d̂_{u,t}(v) : v ∈ V}, that is, the duration of the "longest" foremost journey from u to any other node. The concept of diameter can similarly be separated into those of topological diameter and temporal diameter, the latter being defined at time t as max{ε̂_t(u) : u ∈ V}. These temporal versions of eccentricity and diameter were proposed in [6]. The temporal diameter was further studied from a stochastic point of view by Chaintreau et al. in [11]. Clementi et al. introduced in [13] a concept of dynamic expansion – the dynamic counterpart of the concept of node expansion in static graphs – which accounts for the maximal speed of information propagation. Given a subset of nodes V′ ⊆ V and two dates t1, t2 ∈ T, the dynamic expansion of V′ from time t1 to time t2 is the size of the set {v ∈ V \ V′ : ∃J ∈ J*_(u,v) in G_[t1,t2), u ∈ V′}, that is, roughly speaking, the "collective" horizon of V′ in G_[t1,t2). The concept of journey was dissociated in [10] into direct and indirect journeys. A journey J = {(e1, t1), (e2, t2), . . . , (ek, tk)} is said to be direct iff ∀i, 1 ≤ i < k,
ρ(e_{i+1}, t_i + ζ(e_i, t_i)) = 1, that is, every next edge in J is directly available; it is said to be indirect otherwise. The knowledge of whether a journey is direct or indirect was directly exploited by the distributed algorithm in [10] to compute temporal views between nodes. Such a parameter can also play a role in the context of delay-tolerant routing, indicating whether a store-carry-forward mechanism is required (for indirect journeys).
4 TVG Classes

This section discusses the impact of temporal properties on the feasibility and complexity of distributed problems, unifying existing works from the literature. In particular, we identify a hierarchy of classes of TVGs based on properties that are formulated using the concepts presented in the previous section. These class-defining properties, organized in an ascending order of assumptions, are important in that they imply necessary conditions and impossibility results for distributed computations. Let us start with the simplest class.

Class 1: ∃u ∈ V : ∀v ∈ V : u ⇝ v. That is, at least one node can reach all the others. This condition is necessary, for example, for broadcast to be feasible from at least one node.

Class 2: ∃u ∈ V : ∀v ∈ V : v ⇝ u. That is, at least one node can be reached by all the others. This condition is necessary to be able to compute a function whose input is spread over all the nodes, with at least one node capable of generating the output. Any algorithm for which a terminal state must be causally related to all the nodes' initial states also falls in this category, such as the election of a leader in an anonymous network or the counting of the number of nodes by at least one node.

Class 3 (Connectivity over time): ∀u, v ∈ V, u ⇝ v. That is, every node can reach all the others; in other words, the TVG is connected over time. By the same discussion as for Class 1 and Class 2, this condition is necessary to be able to broadcast from any node, to compute a function whose output is known by all the nodes, or to ensure that every node has a chance to be elected. These three basic classes were used, e.g., in [8] to investigate how relations between TVG properties and the feasibility of algorithms could be canonically proven.

Class 4 (Round connectivity): ∀u, v ∈ V, ∃J1 ∈ J*_(u,v), ∃J2 ∈ J*_(v,u) : arrival(J1) ≤ departure(J2). That is, every node can reach all the others and be reached back afterwards. Such a condition may be required, e.g., for adding explicit termination to broadcast, election, or counting algorithms.

The classes defined so far are in general relevant to the case when the lifetime is finite and a limited number of topological events is considered. When the lifetime is infinite, connectivity over time is generally assumed on a regular basis, and more elaborate assumptions can be considered.
Class 5 (Recurrent connectivity): ∀u, v ∈ V, ∀t ∈ T, ∃J ∈ J*_(u,v) : departure(J) > t. That is, at any point t in time, the temporal subgraph G_[t,+∞) remains connected over time. This class is implicitly considered in most works on delay-tolerant networks: it represents those DTNs where routing can always be achieved over time. It has been explicitly referred to as eventually transportable dynamic networks in [27].

As discussed in Section 3.1, the fact that the underlying graph G = (V, E) is connected does not imply that the TVG is connected over time – the ordering of topological events matters. Such a condition is however necessary to allow connectivity over time, and thus to perform any type of global computation. Therefore, the following three classes assume that the underlying graph G is connected.

Class 6 (Recurrence of edges): ∀e ∈ E, ∀t ∈ T, ∃t′ > t : ρ(e, t′) = 1, and G is connected. That is, if an edge appears once, it appears infinitely often. Since the underlying graph G is connected, we have Class 6 ⊆ Class 5: if all the edges of a connected graph appear infinitely often, then there must exist, by transitivity, a journey between any pair of nodes infinitely often. In a context where connectivity is recurrently achieved, it becomes interesting to look at problems where more specific properties of the journeys are involved, e.g., the possibility to broadcast a piece of information in a shortest, foremost, or fastest manner (see Section 3.5 for definitions). Interestingly, these three declinations of the same problem have different requirements in terms of TVG properties. It is for example possible to broadcast in a foremost fashion in Class 6, whereas shortest and fastest broadcasts are not possible [9]. Shortest broadcast becomes possible, however, if the recurrence of edges is bounded in time and the bound is known to the nodes, a property characterizing the next class:

Class 7 (Time-bounded recurrence of edges): ∀e ∈ E, ∀t ∈ T, ∃t′ ∈ [t, t + Δ) : ρ(e, t′) = 1, for some Δ ∈ T, and G is connected. Some implications of this class include a temporal diameter that is bounded by Δ·Diam(G), as well as the possibility for the nodes to wait a period of Δ to discover all their neighbors (if Δ is known). The feasibility of shortest broadcast follows naturally by using a Δ-rounded breadth-first strategy that minimizes the topological length of journeys. A particularly important type of bounded recurrence is the periodic case:

Class 8 (Periodicity of edges): ∀e ∈ E, ∀t ∈ T, ∀k ∈ N, ρ(e, t) = ρ(e, t + kp), for some p ∈ T, and G is connected. The periodicity assumption holds in practice in many cases, including networks whose entities are mobile with periodic movements (satellites, guard tours, subways, or buses). The periodicity assumption within a delay-tolerant network has been considered, among others, in the contexts of network exploration [15,16] and routing [25]. Periodicity also enables the construction of foremost broadcast trees that can be re-used (modulo p
More generally, the point of exploiting TVG properties is to rely on invariants that are generated by the dynamics (e.g., the recurrent existence of journeys, or the periodic optimality of a broadcast tree). In some works, particular assumptions on the network dynamics are made to obtain invariants of a more classic nature. Below are some examples of classes formulated using the graph-centric point of view of (discrete-time) evolving graphs, i.e., where G = (G, SG, N).

Class 9 (Constant connectivity) ∀Gi ∈ SG, Gi is connected. Here, the dynamics of the network is not constrained as long as the network remains connected in every time step. Such a class was used, for example, in [26] to enable progression hypotheses for the broadcast problem. Indeed, if the network is always connected, then at every time step there must exist an edge between an informed node and a non-informed node, which allows the broadcast time to be upper-bounded by n = |V| time steps (worst-case scenario).

Class 10 (T-interval connectivity) ∀i ∈ N, ∃G' ⊆ G : VG' = VG, G' is connected, and ∀j ∈ [i, i + T − 1], G' ⊆ Gj, for some T ∈ N. This class is a particular case of constant connectivity in which a common spanning connected subgraph of the underlying graph G is available in any period of T consecutive time steps. It was introduced in [23] to study problems such as counting, token dissemination, and the computation of functions whose input is spread over all the nodes (considering an adversary-based edge schedule). The authors showed that the computation of these problems can be sped up by a factor of T compared to 1-interval connected graphs, that is, graphs of Class 9.

Other classes of TVGs can be found in [27], based on intermediate properties between constant connectivity and connectivity over time. They include Class 11 and Class 12 below.

Class 11 (Eventual connectivity) ∀i ∈ N, ∃j ∈ N : j ≥ i, Gj is connected. In other words, there is always a future time step in which the network is instantly connected.

Class 12 (Eventual routability) ∀u, v ∈ V, ∀i ∈ N, ∃j ∈ N : j ≥ i and a path from u to v exists in Gj. That is, for any two nodes, there is always a future time step in which an instant path exists between them. The difference with Class 11 is that the paths can appear at different times for different pairs of nodes. Classes 11 and 12 were used in [27] to represent networks in which routing protocols for (connected) mobile ad hoc networks eventually work if they tolerate transient topological faults.

For all the classes discussed so far, the referenced investigations studied the impact that various TVG properties have on problems or algorithms.
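As a concrete companion to Classes 9 and 10 above, the following sketch tests, over a finite observation window given as a list of snapshots, whether every snapshot is connected (Class 9) and whether every window of T consecutive snapshots shares a connected spanning subgraph (Class 10); the representation and names are illustrative.

```python
def is_connected(nodes, edges):
    """Standard connectivity test on a single static snapshot."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = {u: set() for u in nodes}
    for (a, b) in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == set(nodes)

def constant_connectivity(nodes, snapshots):                     # Class 9
    return all(is_connected(nodes, edges) for edges in snapshots)

def t_interval_connectivity(nodes, snapshots, T):                # Class 10
    """A connected spanning subgraph common to T consecutive snapshots
    exists iff the intersection of their edge sets is connected."""
    norm = [{tuple(sorted(e)) for e in edges} for edges in snapshots]
    for i in range(len(norm) - T + 1):
        if not is_connected(nodes, set.intersection(*norm[i:i + T])):
            return False
    return True
```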
A reverse approach was considered by Angluin et al. in the field of population protocols [1]. Instead of studying the impact of various assumptions on given problems, they made a single assumption – that any pair of nodes interacts infinitely often – and characterized all problems that can be solved in that context. This class is generally referred to as that of complete graphs of interaction.

Class 13 (Complete graph of interaction) The underlying graph G = (V, E) is complete, and ∀e ∈ E, ∀t ∈ T, ∃t' > t : ρ(e, t') = 1. From a time-varying graph perspective, this class is the specific subset of Class 6 in which the underlying graph G is complete. Various types of schedulers considered in the area of population protocols add further fairness constraints on Class 13 (e.g., weak fairness, strong fairness, bounded, or k-bounded schedulers); each of these could be seen as a distinct subclass of Class 13.

An interesting aspect of unifying these properties within the same formalism is the possibility to see how they relate to one another and to compare the associated solutions or algorithms. Insight can be gained, for example, by looking at the classification shown in Figure 3, where basic relations of inclusion between the above classes are reported. These inclusions are strict: for each relation, the parent class contains some time-varying graphs that are not in the child class.
Fig. 3. Relations of inclusion between classes (from specific to general)
Clearly, one should try to solve a problem in the most general context possible. The right-most classes are so general that they offer few properties to be exploited by an algorithm, but some intermediate classes, such as Class 5, appear quite central in the hierarchy. Indeed, this class contains all the classes in which significant work has been done; a problem solved in this class would therefore apply to virtually all the contexts considered heretofore in the literature. Such a classification may also be used to categorize the problems themselves. As mentioned above, shortest broadcast is not generally achievable in Class 6, whereas foremost broadcast is. Similarly, it was shown in [9] that fastest broadcast is not feasible in Class 7, whereas shortest broadcast can be achieved with some knowledge. Since Class 7 ⊂ Class 6, we have foremostBcast ≺ shortestBcast ≺ fastestBcast, where ≺ is the partial order on these problems' topological requirements.
5 Non-deterministic TVGs

Non-determinism in time-varying graphs can be introduced at several different levels. The most direct one is clearly that provided by probabilistic time-varying graphs, where the presence function ρ : E × T → [0, 1] indicates the probability that a given edge
is available at a given time. In a context of mobility, the probability distribution of ρ is intrinsically related to the random mobility pattern defining the network. Popular examples of random mobility models are the Random Waypoint and Random Direction models, where the waypoints of consecutive movements are chosen uniformly at random. Definitions of random TVGs differ depending on whether time is discrete or continuous.

A (discrete-time) random TVG is one whose lifetime is an interval of N and whose sequence of characteristic graphs SG = G1, G2, ... is such that every Gi is an Erdős–Rényi random graph, that is, ∀e ∈ V², P[e ∈ EGi] = p for some p; this definition was introduced by Chaintreau et al. [11]. One particularity of discrete-time random TVGs is that the Gi's are independent of each other. While this definition allows purely random graphs, it does not capture some properties of real-world networks, such as the fact that an edge may be more likely to be present in Gi+1 if it is already present in Gi. This question is addressed by Clementi et al. [12] by introducing edge-Markovian evolving graphs. These are discrete-time evolving graphs in which the presence of every edge follows an individual Markovian process. More precisely, the sequence of characteristic graphs SG = G1, G2, ... is such that P[e ∈ EGi+1 | e ∉ EGi] = p and P[e ∉ EGi+1 | e ∈ EGi] = q, for some p and q called the birth rate and death rate, respectively. The probability that a given edge remains absent or present from Gi to Gi+1 is obtained by taking the complements of p and q. The very idea of considering a Markovian evolving graph seems to have appeared in [2], in which the authors consider a particular case that is substantially equivalent to the discrete-time random TVG from [11]. Edge-Markovian EGs were used in [12], along with the concept of dynamic expansion (see Section 3.6), to address analytically some fundamental questions, such as: does dynamics necessarily slow down a broadcast? Can random node mobility be exploited to speed up information spreading? Baumann et al. extended this work in [3] by establishing tight bounds on the propagation time for any birth and death rates.

A continuous-time random TVG is one in which the appearance of every edge obeys a Poisson process, that is, ∀e ∈ V², ∀ti ∈ App(e), P[ti+1 − ti < d] = 1 − e^(−λd) for some λ; this definition was also introduced by Chaintreau et al. in [11]. (It is interesting to note that the authors rely on a graph-centric point of view in discrete time and on an edge-centric point of view in continuous time; this trend seems to be general.) Random time-varying graphs, both discrete- and continuous-time, were used in [11] to characterize phase transitions between no connectivity and connectivity over time as a function of the number of nodes, a given time-window duration, and constraints on both the topological and temporal lengths of journeys.
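For concreteness, the following sketch generates a discrete-time edge-Markovian evolving graph with birth rate p and death rate q, in the spirit of the model of [12]; the starting graph (empty), parameter names, and snapshot representation are illustrative assumptions.

```python
import itertools
import random

def edge_markovian_evolving_graph(nodes, p, q, steps, seed=None):
    """Return `steps` edge sets in which each potential edge independently
    appears with probability p when absent (birth) and disappears with
    probability q when present (death)."""
    rng = random.Random(seed)
    potential = list(itertools.combinations(nodes, 2))   # undirected simple graph
    present = set()                                      # arbitrary initial graph G_0 (here: empty)
    snapshots = []
    for _ in range(steps):
        nxt = set()
        for e in potential:
            if e in present:
                if rng.random() >= q:        # edge survives with probability 1 - q
                    nxt.add(e)
            else:
                if rng.random() < p:         # edge is born with probability p
                    nxt.add(e)
        present = nxt
        snapshots.append(present)
    return snapshots

# Example: 20 snapshots on 5 nodes; print the number of edges in each snapshot.
print([len(g) for g in edge_markovian_evolving_graph(range(5), p=0.3, q=0.1, steps=20, seed=1)])
```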
6 Research Problems and Directions

The first and most obvious research task is that of exploring the universe of dynamic networks using the formal tools provided by the TVG formalism. The long-term goal is that of providing a comprehensive map of this universe, identifying both the commonalities and the natural differences between the various types of dynamical systems modeled by
TVGs. Additionally, several more specific research areas can be identified, including the ones described below.

Distributed TVG algorithm design and analysis. The design and analysis of distributed algorithms and protocols for time-varying graphs is an open research area. In fact, very few problems have been attacked so far: routing and broadcasting in delay-tolerant networks; broadcasting and exploration in opportunistic-mobility networks; new self-stabilization techniques; detection of the emergence and resilience of communities; and viral marketing in social networks.

Design and optimization of TVGs. If the interactions in a network can be planned – decided by a designer – then a number of new and interesting optimization problems arise in the design of time-varying graphs. They may concern, for example, the minimization of the temporal diameter or the balancing of node eccentricities. Is a given setting optimal? How can this be proven? What if the underlying graph can also be modified? A whole field is opening that promises exciting research avenues.

Complexity analysis. Analyzing the complexity of a distributed algorithm in a TVG – e.g., in number of messages – is not trivial, partly because, contrary to the static case, the complexity of an algorithm in a dynamic network depends strongly not only on the usual network parameters (number of nodes, edges, etc.) but also on the number of topological events taking place during its execution. In many of the algorithms we have encountered, the majority of messages are in fact directly triggered by topological events, e.g., sent in reaction to the local appearance or disappearance of an edge. The number of topological events therefore represents a new complexity parameter, whose impact on various problems remains to be studied.
References

1. Angluin, D., Aspnes, J., Eisenstat, D., Ruppert, E.: The computational power of population protocols. Distributed Computing 20(4), 279–304 (2007)
2. Avin, C., Koucký, M., Lotker, Z.: How to explore a fast-changing world. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) ICALP 2008, Part I. LNCS, vol. 5125, pp. 121–132. Springer, Heidelberg (2008)
3. Baumann, H., Crescenzi, P., Fraigniaud, P.: Parsimonious flooding in dynamic graphs. In: Proc. 28th ACM Symp. on Principles of Distributed Computing (PODC), pp. 260–269 (2009)
4. Berman, K.A.: Vulnerability of scheduled networks and a generalization of Menger's Theorem. Networks 28(3), 125–134 (1996)
5. Bhadra, S., Ferreira, A.: Complexity of connected components in evolving graphs and the computation of multicast trees in dynamic networks. In: Proc. 2nd Intl. Conf. on Ad Hoc Networks and Wireless (AdHoc-Now), pp. 259–270 (2003)
6. Bui-Xuan, B., Ferreira, A., Jarry, A.: Computing shortest, fastest, and foremost journeys in dynamic networks. Intl. J. of Foundations of Comp. Science 14(2), 267–285 (2003)
7. Burgess, J., Gallagher, B., Jensen, D., Levine, B.N.: MaxProp: Routing for vehicle-based disruption-tolerant networks. In: Proc. 25th IEEE Conference on Computer Communications (INFOCOM), pp. 1–11 (2006)
8. Casteigts, A., Chaumette, S., Ferreira, A.: On the assumptions about network dynamics in distributed computing. arXiv preprint arXiv:1102.5529 (2011); a preliminary version appeared in SIROCCO (2009)
9. Casteigts, A., Flocchini, P., Mans, B., Santoro, N.: Deterministic computations in time-varying graphs: Broadcasting under unstructured mobility. In: Calude, C.S., Sassone, V. (eds.) TCS 2010. IFIP AICT, vol. 323, pp. 111–124. Springer, Heidelberg (2010)
10. Casteigts, A., Flocchini, P., Mans, B., Santoro, N.: Measuring temporal lags in delay-tolerant networks. In: Proc. 25th IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2011)
11. Chaintreau, A., Mtibaa, A., Massoulie, L., Diot, C.: The diameter of opportunistic mobile networks. Communications Surveys & Tutorials 10(3), 74–88 (2008)
12. Clementi, A., Macci, C., Monti, A., Pasquale, F., Silvestri, R.: Flooding time in edge-Markovian dynamic graphs. In: Proc. 27th ACM Symp. on Principles of Distributed Computing (PODC), pp. 213–222 (2008)
13. Clementi, A., Pasquale, F.: Information spreading in dynamic networks: An analytical approach. In: Nikoletseas, S., Rolim, J. (eds.) Theoretical Aspects of Distributed Computing in Sensor Networks. Springer, Heidelberg (2010)
14. Ferreira, A.: Building a reference combinatorial model for MANETs. IEEE Network 18(5), 24–29 (2004)
15. Flocchini, P., Kellett, M., Mason, P., Santoro, N.: Mapping an unfriendly subway system. In: Proc. 5th Intl. Conf. on Fun with Algorithms, pp. 190–201 (2010)
16. Flocchini, P., Mans, B., Santoro, N.: Exploration of periodically varying graphs. In: Dong, Y., Du, D.-Z., Ibarra, O. (eds.) ISAAC 2009. LNCS, vol. 5878, pp. 534–543. Springer, Heidelberg (2009)
17. Harary, F., Gupta, G.: Dynamic graph models. Mathematical and Computer Modelling 25(7), 79–88 (1997)
18. Holme, P.: Network reachability of real-world contact sequences. Physical Review E 71(4), 46119 (2005)
19. Jain, S., Fall, K., Patra, R.: Routing in a delay tolerant network. In: Proc. Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM), pp. 145–158 (2004)
20. Kempe, D., Kleinberg, J., Kumar, A.: Connectivity and inference problems for temporal networks. In: Proc. 32nd ACM Symp. on Theory of Computing (STOC), p. 513 (2000)
21. Kossinets, G., Kleinberg, J., Watts, D.: The structure of information pathways in a social communication network. In: Proc. 14th Intl. Conf. on Knowledge Discovery and Data Mining (KDD), pp. 435–443 (2008)
22. Kostakos, V.: Temporal graphs. Physica A 388(6), 1007–1023 (2009)
23. Kuhn, F., Lynch, N., Oshman, R.: Distributed computation in dynamic networks. In: Proc. 42nd ACM Symp. on Theory of Computing, pp. 513–522 (2010)
24. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: Densification and shrinking diameters. ACM Trans. on Knowledge Discovery from Data 1(1) (2007)
25. Liu, C., Wu, J.: Scalable routing in cyclic mobile networks. IEEE Trans. Parallel Distrib. Syst. 20(9), 1325–1338 (2009)
26. O'Dell, R., Wattenhofer, R.: Information dissemination in highly dynamic graphs. In: Proc. Workshop on Foundations of Mobile Computing (DIALM-POMC), pp. 104–110 (2005)
27. Ramanathan, R., Basu, P., Krishnan, R.: Towards a formalism for routing in challenged networks. In: Proc. 2nd ACM Workshop on Challenged Networks (CHANTS), pp. 3–10 (2007)
28. Tang, J., Musolesi, M., Mascolo, C., Latora, V.: Characterising temporal distance and reachability in mobile and online social networks. ACM Computer Communication Review 40(1), 118–124 (2010)
29. Zhang, X., Kurose, J., Levine, B.N., Towsley, D., Zhang, H.: Study of a bus-based disruption-tolerant network: mobility modeling and impact on routing. In: Proc. 13th ACM Int. Conf. on Mobile Computing and Networking, pp. 195–206 (2007)
30. Zhang, Z.: Routing in intermittently connected mobile ad hoc networks and delay tolerant networks: Overview and challenges. IEEE Comm. Surveys & Tutorials 8(1), 24–37 (2006)
Author Index
Assis, Flávio 86
Autenrieth, Marcus 117
Begin, Thomas 248
Ben Slimane, Jamila 29
Bohrloch, Tim 276
Braga, Hugo 86
Caillouet, Christelle 220
Calafate, Carlos T. 276
Cano, Juan-Carlos 276
Cao, Zhenfu 177
Casteigts, Arnaud 346
Cichoń, Jacek 322
Conceição, Luís 234
Córdoba, César 72
Curado, Marilia 234
Dias de Amorim, Marcelo 248
Ebers, Sebastian 294
Fischer, Stefan 294
Flocchini, Paola 346
Frey, Hannes 117
Frikha, Mounir 29
Ghosh, Anil K. 150
Glombitza, Nils 294
Guérin Lassous, Isabelle 248
Guerrero, Armando 72
Guvensan, M. Amac 206
Hamouda, Essia 15
Haque, Md. Ehtesamul 100
Hassanzadeh, Amin 44
Jabba, Daladier 72
Jimeno, Miguel 72
Karl, Holger 145
Khan, Rana Azeem M. 145
Koubaa, Anis 29
Kunz, Thomas 262
Labrador, Miguel 72
Lau, Francis C.M. 332
Lemiesz, Jakub 322
Li, Xu 220
Mandal, Partha Sarathi 150
Manzoni, Pietro 276
Marin-Perez, Rafael 1
McKnight-MacNeil, Ereth 262
Milosavljević, Nikola 308
Mitton, Nathalie 15, 58, 248
Montavont, Julien 131
Nayak, Amiya 191
Noël, Thomas 131
Pfisterer, Dennis 294
Qian, Haifeng 164
Quattrociocchi, Walter 346
Radak, Jovan 58
Radeke, Rico 290
Rahman, Ashikur 100
Razafindralambo, Tahiry 220, 248
Robles, Jorge Juan 290
Roth, Damien 131
Ruiz, Pedro Miguel 1
Ruj, Sushmita 191
Santoro, Nicola 346
Shen, Xuemin (Sherman) 177
Shihada, Basem 44
Simplot-Ryl, David 15, 58, 248
Song, Ye-Qiong 29
Stojmenovic, Ivan 191
Stoleru, Radu 44
Tan, Haisheng 332
Torres, Álvaro 276
Wang, Yongcai 332
Wang, Yuexuan 332
Wei, Lifei 177
Wightman, Pedro 72
Yavuz, A. Gokhan 206
Zawada, Marcin 322
Zhou, Yuan 164
Zhu, Haojin 177
Zurbarán, Mayra 72