BEHAVIOUR MONITORING AND INTERPRETATION – BMI
Ambient Intelligence and Smart Environments

The Ambient Intelligence and Smart Environments (AISE) book series presents the latest research results in the theory and practice, analysis and design, implementation, application and experience of Ambient Intelligence (AmI) and Smart Environments (SmE).

Coordinating Series Editor: Juan Carlos Augusto
Series Editors: Emile Aarts, Hamid Aghajan, Michael Berger, Vic Callaghan, Diane Cook, Sajal Das, Anind Dey, Sylvain Giroux, Pertti Huuskonen, Jadwiga Indulska, Achilles Kameas, Peter Mikulecký, Daniel Shapiro, Toshiyo Tamura, Michael Weber
Volume 9

Recently published in this series:

Vol. 8. R. López-Cózar et al. (Eds.), Workshop Proceedings of the 6th International Conference on Intelligent Environments
Vol. 7. E. Mordini and P. de Hert (Eds.), Ageing and Invisibility
Vol. 6. G. van den Broek et al. (Eds.), AALIANCE Ambient Assisted Living Roadmap
Vol. 5. P. Čech et al. (Eds.), Ambient Intelligence Perspectives II – Selected Papers from the Second International Ambient Intelligence Forum 2009
Vol. 4. M. Schneider et al. (Eds.), Workshops Proceedings of the 5th International Conference on Intelligent Environments
Vol. 3. B. Gottfried and H. Aghajan (Eds.), Behaviour Monitoring and Interpretation – BMI – Smart Environments
Vol. 2. V. Callaghan et al. (Eds.), Intelligent Environments 2009 – Proceedings of the 5th International Conference on Intelligent Environments: Barcelona 2009
Vol. 1. P. Mikulecký et al. (Eds.), Ambient Intelligence Perspectives – Selected Papers from the First International Ambient Intelligence Forum 2008
ISSN 1875-4163 (print) ISSN 1875-4171 (online)
Behaviour Monitoring and Interpretation – BMI Well-Being
Edited by
Björn Gottfried Centre for Computing Technologies, Universität Bremen, Germany
and
Hamid Aghajan Department of Electrical Engineering, Stanford University, USA
Amsterdam • Berlin • Tokyo • Washington, DC
© 2011 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-730-7 (print)
ISBN 978-1-60750-731-4 (online)
Library of Congress Control Number: 2011925298

Publisher: IOS Press BV, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands; fax: +31 20 687 0019; e-mail: [email protected]

Distributor in the USA and Canada: IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA; fax: +1 703 323 3668; e-mail: [email protected]
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved.
Preface

The annual workshop on Behaviour Monitoring and Interpretation (BMI) was launched in 2007. The workshop is co-located with the German conference on Artificial Intelligence and hence receives much attention in the research community investigating intelligent means for behaviour monitoring and interpretation. The 2009 edition of the workshop focused on the topic of well-being, reflecting the significant interest in this research direction. The current volume consists of extended versions of selected contributions from this workshop as well as other invited articles covering major research themes in this field.

This volume aims to offer state-of-the-art contributions in the application area of well-being. The notion of well-being has been treated in diverse disciplines such as engineering, sociology, psychology and philosophy. With this book, a perspective is offered on the latest trends in this field from several different viewpoints. Well-being is indeed an omnipresent concept that reaches out to a myriad of aspects of our daily lives. In addition to supporting a healthy lifestyle, the concept of well-being extends to the choices we make about the environment we live in, the interactions we have with other humans, and the practices we engage in to achieve our plans for the future. Well-being concerns us in our daily life, and hence plays a fundamental role at all times and places. This fact in turn needs to be taken into account when designing ubiquitous computing technologies that pervade our life. With the presented articles, the book provides a survey of different research projects that aim to address the many influential aspects of well-being that are considered in today's designs or play an essential role in the designs of the future.

The present volume is the second BMI edition published by IOS Press. We look forward to receiving feedback from our readership in order to better plan and prepare future BMI workshop editions.
We are thankful to all authors who have contributed to this volume and to the programme committee of the BMI workshop, who provided the authors with valuable reviews.

March 2011
Björn Gottfried and Hamid Aghajan
Contents

Preface (Björn Gottfried and Hamid Aghajan) v

Part I. Foundations of Well-Being

Behaviour Monitoring and Interpretation (Björn Gottfried and Hamid K. Aghajan) 3
Information Communication Technology as a Means of Enhancing the Well-Being of Older People (Daniel Johnson and Felicia Huppert) 11
Well-Being in Physical Information Spacetime: Philosophical Observations on the Use of Pervasive Computing for Supporting Good Life (Stefan Artmann) 26

Part II. Supporting the Well-Being Through Care Taking in Smart Environments

Tracking Systems for Multiple Smart Home Residents (Aaron S. Crandall and Diane J. Cook) 65
KopAL – An Orientation System for Patients with Dementia (Sebastian Fudickar, Bettina Schnor, Juliane Felber, Franz J. Neyer, Mathias Lenz and Manfred Stede) 83
Cost/Benefit Analysis of an Adherence Support Framework for Chronic Disease Management (Kumari Wickramasinghe, Michael Georgeff, Christian Guttmann, Ian Thomas and Heinz Schmidt) 105

Part III. Improving the Well-Being Through Life-Style and Entertainment

Predicting Daily Physical Activity in a Lifestyle Intervention Program (Xi Long, Steffen Pauws, Marten Pijl, Joyca Lacroix, Annelies H. Goris and Ronald M. Aarts) 131
Hey Robot, Get Out of My Way – A Survey on a Spatial and Situational Movement Concept in HRI (Annika Peters, Thorsten P. Spexard, Marc Hanheide and Petra Weiss) 147
Towards Adaptive and User-Centric Smart Home Applications (Amir Hossein Khalili, Chen Wu and Hamid Aghajan) 166

Subject Index 183
Author Index 185
Part I Foundations of Well-Being
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-3
Behaviour Monitoring and Interpretation
An Overview of Technologies Supporting the Well-Being of Humans
Björn GOTTFRIED a,1 and Hamid K. AGHAJAN b
a Centre for Computing Technologies, University of Bremen, Germany
b Department of Electrical Engineering, Stanford University, USA

Abstract. The notion of well-being has been treated in disciplines as diverse as engineering, sociology, and philosophy. This volume offers a perspective on the latest trends in research and development in the well-being of humans across a subset of the involved disciplines. This introductory chapter provides an overview of the subsequent chapters and discusses the challenges shaping the future research agenda for technologies that support well-being.

Keywords. Behaviour Monitoring; Behaviour Interpretation; Well-Being; Smart Environments; Ambient Assisted Living; Pervasive Computing; Ambient Intelligence.
Introduction

This article provides an overview of the technologies that support the well-being of humans. Through discussing the various current efforts on studying this topic, the article argues that, contrary to the common perception of equating well-being with health-related issues, the notion of well-being is an omnipresent concept that reaches out to a multitude of aspects of our daily lives. In addition to supporting health and a healthy lifestyle, the notion of well-being extends to the choices we make about the environment we live in, the interactions we have with other humans, and the practices we engage in to achieve our plans for the future. Well-being concerns us in our daily life, and hence plays a fundamental role at all times and places. This in turn needs to be taken into account when designing ubiquitous computing technologies that pervade our life. With this overview paper we survey different research projects that aim to address the many influential aspects of well-being in the future.

First, we outline our own understanding of the notion of well-being in Section 1. For this purpose, a first definition of this notion is given. Moreover, related areas which are concerned with this topic and the users of such technologies are discussed to further clarify this notion. In Section 2 the state-of-the-art is described through examining a few projects, each focusing on a different aspect of well-being. Open questions centred around the notion of well-being are discussed in Section 3 in order to set forth a future research agenda.

1 Corresponding Author: Björn Gottfried, Centre for Computing Technologies, University of Bremen, 28359 Bremen, Germany; E-mail: [email protected].
B. Gottfried and H.K. Aghajan / Behaviour Monitoring and Interpretation
1. What is meant by Well-Being

Trying to define what the research community at large means by the notion of well-being, one might come to the conclusion that this is a matter of perception, and thus a subjective issue. This diversity is rooted in the fact that the viewpoint on well-being is indeed a personal one, shaped by what an individual considers an important element affecting his or her well-being.

If we proceed to compare what people wish to experience in their daily life with the form and behaviour of the systems that can support this, we can infer a more common view on at least the technological perspective for the support of well-being: people employ devices which help them perform daily activities and solve problems more easily. Such systems and devices, and our interactions with them, define our understanding of the impact of technology on our well-being. For example, a system that intuitively and perhaps even proactively connects an elderly person with a health specialist according to the type of event measured by a home sensor network can be an example of a well-being system, while for young users a system that enables them to share an exercise session with a remote friend can be an instance of technology that supports well-being.

This is a straightforward view shaped by considering the aspects of well-being related to and impacted by technology, and it is certainly not a comprehensive definition of the scope of well-being. Through focusing on the definition of technologies that support well-being, this view can help us not only in developing new tools and technologies for supporting well-being, but also in defining metrics and evaluation criteria which can examine acceptance by the users they intend to serve. Obviously, well-being is a general notion that can be applied to every aspect of an individual's life.
In fact, through the concepts of ubiquitous and pervasive computing technologies that enable ambient environments, the underlying notion of well-being has already found its way to supporting humans almost everywhere. Typical case studies found in the literature are concerned with nursing homes and hospitals. In these well-being systems, on the one hand challenged people are provided with devices that can help them manage their life despite disabilities, and on the other hand the care staff is supported in accessing the patients and their information. This dual-pronged support scope also extends to many other areas of well-being, including sports, comfort and leisure, social life, work efficiency, and most other dimensions supported by today's technology.

In most instances, multiple entities benefit from the same supporting technology, and hence the technology needs to cater support in different ways to different users. In a larger scope, multiple technologies overlap in their use, and their services to the same user may be intertwined. Hence the capability to coexist and work together is also an essential aspect of well-being systems and technologies.

To offer a more precise understanding of what is currently happening within the research community around the notion of well-being, several research projects are presented in the following section. Each project emphasises a different aspect, but they have in common the goal of supporting their users in a well-being application.
2. Projects on Well-Being

2.1. Philosophical Roots

A philosophical perspective on what is meant by well-being, and what this implies for the technology to be employed, is given by [1]. Artmann discusses roots that can already be found in Aristotle's approach more than 2300 years ago. Aristotle's practical philosophy, Artmann shows, provides a useful starting point for analysing human actions in a world now pervaded by computer networks. The functional view of Aristotle's approach can indeed be reconciled quite well with the engineer's view. A more modern alternative is Dewey's pragmatism, which re-interprets Aristotle's philosophy of action. One of Dewey's fundamental insights is that any technical device is nothing but an objectified habit. On this basis, an argument is derived about what supports the well-being of humans, namely the ways people themselves act upon their environment: prudent actions and virtues, as already argued by Aristotle, determine what leads to a good life. Dewey continues by stating that proactive habits allow us to adapt to our environments; the better we carry out prudent actions and realise our desires, the happier we will be.

Artmann then discusses the notion of Physical Information Spacetime as the medium in which the interaction between humans and pervasive computing systems takes place. A weak and a strong notion of pervasive computing are distinguished: while weak pervasive computing is basically about a universal network that makes information sources available everywhere, within the strong notion humans do not explicitly interact with computer media but make natural use of smart spaces which are invisibly interwoven with technologies embedded everywhere in order to support the user adequately. To improve the well-being of the human user, it is realised that intention recognition is a fundamental prerequisite of pervasive computing technologies. For this reason, Artmann elaborates extensively on what intention recognition is about and arrives at a number of criteria that technology must fulfil in order to recognise intentions, and hence to support the user in improving her well-being.

2.2. Enhancing the Well-Being of Elderly People

Johnson and Huppert [4] give an overview of the opportunities which Information Communication Technology (ICT) provides to maintain and improve the well-being of elderly people. Social networking (broadening and deepening of connections), cognitive function (adaptation to changes of cognitive functions and minimising the impact of their loss or deterioration), and health (assisting with the prevention and management of disorders) are the three main domains addressed. Background knowledge about these domains and the ways ICT might improve well-being in these areas are outlined. Moreover, barriers and opportunities to deal with them are discussed regarding the four categories of psychology, technology, cost, and issues of privacy and confidentiality. Concrete recommendations are identified which should promote the employment of ICT to improve well-being among the elderly population.
2.3. Human-Robot Interactions

Robots serve us today in many different situations, such as factories and hospitals, and they will extend their reach to more areas in the future, for example, the living spaces of our homes. A fundamental aspect of the related technologies is therefore that robots should not only execute their tasks correctly, but also adopt the social conventions of human beings, so that we feel comfortable when employing them in our households. This issue is addressed by [7]. Peters, Spexard, Hanheide, and Weiss recognise that robots should stick to social conventions while they act in our environments, so that humans get the feeling that their well-being is indeed supported by robots without any doubts and restrictions. A specific scenario is thoroughly analysed in order to investigate how social conventions should be implemented in the context of human-robot interactions: a human meeting an approaching robot in a narrow hallway. For a robot this means detecting the specific situation it is part of, and acting appropriately. Different ways in which the robot can behave are analysed, and a user study explores what humans prefer regarding the interaction with an approaching robot. The outcomes concern, on the one hand, the social signals expected by the users and, on the other hand, the social signals which can be generated as well as perceived by the robot.

2.4. Improving Mental and Physical Health

A quite different area concerned with well-being is the development of physical activity promotion systems that help people to improve their mental and physical health. Technology can be applied to monitor activity, provide feedback, and offer coaching to increase or improve activity. A system that measures the activity of people within a lifestyle intervention programme is described by [6].
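Although the study's own pipeline is not reproduced here, the general technique of forecasting an activity level from a participant's recent history can be sketched with a simple first-order autoregressive model fitted by least squares. The function names and activity values below are invented for illustration; the actual work uses full ARIMA models selected per participant.

```python
# Hypothetical sketch of short-horizon activity-level forecasting with an
# AR(1) model: x[t] ~ a * x[t-1] + b. Not the pipeline of [6].

def fit_ar1(series):
    """Estimate a and b in x[t] ~ a * x[t-1] + b by ordinary least squares."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

def forecast(series, steps, a, b):
    """Roll the fitted model forward to predict future activity levels."""
    out, x = [], series[-1]
    for _ in range(steps):
        x = a * x + b
        out.append(x)
    return out

# Invented daily activity counts (arbitrary units) for one participant.
history = [310, 295, 330, 340, 325, 350, 360]
a, b = fit_ar1(history)
next_days = forecast(history, 3, a, b)  # predicted levels for the next 3 days
```

A coach could use such predicted levels to flag participants whose near-future activity is expected to fall below a target, without waiting for the actual measurements.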
Long, Pauws, Pijl, Lacroix, Goris, and Aarts present a study on predicting daily physical activity-level data from the past data of participants. The data are obtained by a body-worn activity monitor with a built-in triaxial accelerometer that measures the acceleration of body movements. The motivation behind this approach is to provide a human coach with insightful information about a participant's performance by predicting his near-future activity levels. In this way, the coach is able to reach a large number of patients within an interactive yet personalised programme.

The focus of Long's paper is the approach for predicting physical activity levels given the past activities of a participant. For this purpose, autoregressive integrated moving average (ARIMA) models are fitted to a user's historical data. Physical activity data are categorised as either stationary, trend, or seasonal by assessing their autocorrelation functions. The model that best fits the data is chosen for prediction.

2.5. Nursing Homes

In the work of Fudickar, Schnor, Felber, Neyer, Lenz, and Stede [3], technologies are employed to improve the well-being of nursing home residents as well as their caretakers. A network infrastructure is deployed which enables the localisation of residents and the interchange of information. Three use cases are supported:

• Reminding residents of upcoming events, such as meals, gymnastics, medication, social meetings, and other appointments,
• Dealing with the wandering behaviour of inhabitants, and in particular, keeping track of people suffering from dementia,
• An emergency call system to help patients who require assistance, and to inform caretakers in critical situations.

Within this project the authors have conducted extensive surveys in order to determine the requirements of the system to be developed. Nursing home residents and caretakers have both been interviewed. The results show that new technologies are desired by both groups. The surveys establish that the elderly users are interested in employing new technologies, although they may worry about their own limitations or failures in using them. In contrast, caretakers are used to computers and mobile phones, and hence will be comfortable with applying other new devices. An issue for caretakers, however, is that they face considerably stressful working conditions, including the already frequent use of various technologies; the introduction of additional devices should therefore be made carefully. The system is under development and will be installed in a nursing home in Stahnsdorf, Germany.

2.6. Smart Homes

Smart homes provide an example of a system-level approach to implementing pervasive computing technologies. Their general aim is to improve the well-being of the inhabitants. For this purpose, these systems need to determine where the inhabitants are and what activities they are engaged in. Tracking of individuals is therefore a fundamental prerequisite for which many researchers have developed methods. Crandall and Cook [2] provide a set of tracking algorithms based on employing sensors such as infrared sensors installed at the ceiling. While useful tracking systems exist for keeping track of single inhabitants, the situation becomes more difficult when there are more residents in an area. The article investigates tracking systems for multiple smart home residents.
Two different approaches are presented in their work: one uses a graph of the sensor network in a smart home environment and a set of rules to determine the current location and history of individuals; the second uses a history of the residents' occupancy data to build a set of probabilities for a Bayesian updating tool for tracking residents. Both methods are applied to two data sets, namely from the testbeds of Washington State University and the University of Tokyo.

2.7. User Preferences for Ambient Environments

While the issue of tracking inhabitants is an important prerequisite for many smart home functions, another issue concerns the question of how the home's ambience can be regulated to provide a pleasant environment. The idea is that the environment can automatically adapt to the preferences of the user and provide a desirable setting according to the activity of the user. This is to influence the well-being of the user through setting suitable environmental conditions.

Khalili, Wu, and Aghajan [5] investigate unsupervised learning algorithms to let a smart environment adapt itself to the preferences of the inhabitants. This is another example of smart homes that serve the general population. Environmental factors such as music or lighting conditions are considered as smart home elements that enhance the comfort or overall experience of the inhabitants in different states. Instead of a system
with fixed rules, user behaviour, including activity patterns and preferences for the environmental settings, is taken into account. As a consequence, individual regulations for specific inhabitants enable an ambient atmosphere which adapts to the specific user it serves.

2.8. Chronic Disease Management

Another perspective on supporting the well-being of humans in need of care is presented in the work of Wickramasinghe, Georgeff, Guttmann, Thomas, and Schmidt [8]. They recognise that an appropriate care methodology for patients suffering from chronic diseases would have to include: (1) continuous monitoring and recognition of possible deviations from the care plan; (2) determination of the cause of a deviation; and (3) generation of an appropriate plan to assist patients and their healthcare providers. A cost/benefit analysis is carried out that tests through simulations the benefit of implementing those mechanisms. A result of this work is that adherence to care plans by both patients and healthcare providers is a critical factor in exploiting the technology efficiently as well as in reducing costs. While this is an issue which may only be indirectly perceived by the patient, intelligent collaborative care management methods have a direct influence on the overall management of the well-being and care of these patients. The complexity of the underlying information infrastructure, including the monitoring functions and the recommendation of appropriate reactions to situations, needs to be seamlessly managed and integrated into easy-to-understand user-level and caretaker-level interfaces.

2.9. Summary

The different research projects represented in this volume are illustrated in Figure 1. Each project deals with an aspect of well-being. By considering the interconnection of various technological and user-level aspects, we can understand the potential links between these projects as well as envision new directions for developing other applications.
One such aspect concerns the differences resulting from implementing systems with mobile devices as opposed to embedding the technology in the environment. While the smart home approach supports the latter [2], the nursing home example [3] and the physical activity measuring example [6] are based on mobile devices that the users carry with them. This distinction offers a set of tradeoffs in the way the technology is deployed and used. One is the choice between a system which is bound within the confines of a smart home and one which can be carried with the user everywhere. Another tradeoff is the unobtrusive nature of systems that are part of the natural environment versus the dependence on the user to be alert and cooperative, as well as adequately skilled to use devices that need the user's input to operate. The adaptive environment project [5] combines the embedded technology of sensing the user's state with the mobile technology which registers any feedback the user might have to adjust the ambience setting.

The human-robot interaction scenario [7], by contrast, is of a different category. Here, the interaction between the user and a piece of technology which is not carried by the user is investigated in a situation which is also essentially independent of the location. A similar separation of local events and information-level reactions is considered in [8], where the care management methodology needs to consider the patient at an abstract level, independent of specific local events, while still considering the behaviours concerned with how the patient deals with the chronic disease.

[Figure 1 (not reproduced here) places Well-Being at the centre of the research directions considered in this volume: Nursing Homes, Smart Homes, Ambient Environments, Human-Robot Interactions, Mental and Physical Health, Well-Being of Elderly People, Care Management, and Philosophical Roots.]

Figure 1. Different research directions around well-being considered in current research projects.

Another distinction exists between the consideration of the well-being of specific target groups, such as the elderly in the nursing home example [3] or patients suffering from a chronic disease [8], and the well-being of a larger segment of society, as in the physical activity, ambient environment, and human-robot interaction examples [6,5,7]. In their overview article [4], the authors discuss specific user groups based on the technologies which can support them, i.e. the broad groups served by social networking, the more specific group which needs support regarding their cognitive functions, and similarly, other groups which need support regarding specific physical impairments and other disorders.
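The Bayesian-updating idea behind the multi-resident tracking work summarised in Section 2.6 can be illustrated with a small, hypothetical sketch. The resident names, the sensor, and the probabilities below are invented; the actual system of Crandall and Cook [2] derives its probabilities from recorded occupancy histories.

```python
# Hypothetical sketch of Bayesian updating for multi-resident tracking.
# All names and numbers are illustrative, not taken from [2].

def bayes_update(prior, likelihood):
    """Combine P(resident) with P(sensor event | resident) and renormalise."""
    unnorm = {r: prior[r] * likelihood[r] for r in prior}
    total = sum(unnorm.values())
    return {r: p / total for r, p in unnorm.items()}

# Prior belief about who triggered a motion sensor in the kitchen.
prior = {"resident_A": 0.5, "resident_B": 0.5}

# Likelihoods estimated from each resident's occupancy history:
# resident A uses the kitchen at this hour far more often than B.
likelihood = {"resident_A": 0.8, "resident_B": 0.2}

posterior = bayes_update(prior, likelihood)
```

Repeating the update for each successive sensor firing, with the previous posterior taken as the new prior, yields a running estimate of which resident is at which location.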
3. Open Questions Centred around Well-Being

In order to conceive of new technologies to improve our well-being, an awareness of the various aspects that influence the overall well-being experience of the intended users is a necessity. Artmann [1] has shown how the conception of well-being derives from Aristotle's philosophy and, by extension, from Dewey. One result is that habits allow us to adapt to our environments, and the better we carry out actions to realise our desires, the happier we will be. This in turn means that in order for smart environments to improve our well-being, they should be able to derive our desires from what they can observe, which is mainly our activities and behaviour. The field of intention recognition will therefore be a focus of future research aiming to provide further services or to improve the service automation process.

Moreover, it will be necessary to develop interdisciplinary approaches to address the facets concerning intentions, desires, actions, user skills, acceptance issues, and the social and behavioural aspects of new applications. In particular, the issues of user privacy and data confidentiality as well as ethical and socio-economic issues are all relevant to deploying successful applications. For example, future discussions can focus on whether the well-being of only those who can afford the technology should be maintained, or what
mechanisms can be adopted to extend the application scope to all users in need. Or these discussions can reveal the tradeoff between the extent of support offered to future generations by ambient technologies and the ability these individuals may need to retain for solving problems of daily life independently, without technological support.
References

[1] S. Artmann. Well-Being in Physical Information Spacetime: Philosophical Observations on the Use of Pervasive Computing for Supporting Good Life. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[2] A. S. Crandall and D. J. Cook. Tracking Systems for Multiple Smart Home Residents. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[3] S. Fudickar, B. Schnor, J. Felber, F. J. Neyer, M. Lenz, and M. Stede. KopAL – An Orientation System for Patients with Dementia. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[4] D. Johnson and F. Huppert. Information Communication Technology as a Means of Enhancing the Well-Being of Older People. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[5] A. H. Khalili, C. Wu, and H. Aghajan. Autonomous Learning of Users' Preference of Music and Light Services in Smart Home Applications. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[6] X. Long, S. Pauws, M. Pijl, J. Lacroix, A. H. Goris, and R. M. Aarts. Predicting Daily Physical Activity in a Lifestyle Intervention Program. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[7] A. Peters, T. P. Spexard, M. Hanheide, and P. Weiss. Hey Robot, Get Out of My Way. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
[8] K. Wickramasinghe, M. Georgeff, C. Guttmann, I. Thomas, and H. Schmidt. Cost/Benefit Analysis of an Adherence Support Framework for Chronic Disease Management. In B. Gottfried and H. Aghajan, editors, Behaviour Monitoring and Interpretation – Well-Being. IOS Press, 2011.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-11
Information Communication Technology as a Means of Enhancing the Well-being of Older People

Daniel JOHNSON a,1 and Felicia HUPPERT b
a Queensland University of Technology, Australia
b University of Cambridge, UK

Abstract. This chapter provides an informal review of the opportunities provided by Information Communication Technology (ICT) for improving and maintaining well-being among older people. The manner in which ICT might improve well-being is considered across three domains: social networks (broadening and deepening connections, facilitating social contribution, providing indirect monitoring), cognitive function (maintaining and enhancing function, supporting cognitive impairment), and health (preventing and managing disorders, rehabilitation, enhancement of health, improvements in health care). The barriers and opportunities that exist are identified as falling into four categories: psychological, technological, costs (financial and human), and issues of privacy and confidentiality. Based on an exploration of these barriers and opportunities, recommendations regarding possible solutions are made.

Keywords. Well-being, Information Communication Technology, social networking, cognitive function, health
Introduction

Mental well-being was defined by the UK Government Foresight project (http://www.foresight.gov.uk) as a dynamic state in which the individual is able to develop their potential, work productively and creatively, build strong and positive relationships with others, and contribute to their community. It is enhanced when an individual is able to fulfil their personal and social goals and achieve a sense of purpose in society [1]. It is increasingly recognised that the well-being of many older people can and should be improved. Well-being can be enhanced by the remediation or prevention of problems, or by developing opportunities for physical, mental and social flourishing and a sense of purpose.

Looking to the future, a currently underutilised means of improving well-being among older people is Information Communication Technology (ICT). At present, in England only 51% of those over 60 and 20% of those over 75 own a personal computer [2]. Based on a workshop involving international experts in gerontology, ICT and well-being, this chapter provides an informal review of the opportunities provided by ICT for maintaining and enhancing well-being among older people.

It is very difficult to accurately predict how technology will change over an extended timeframe. It is easier, and arguably more useful, to accurately predict people’s needs. At a broad level, the well-being related needs of older people are not going to change, and our goal should be to meet these needs flexibly through the use of ICT. Moreover, through a focus on needs we can identify how technology can best support and enhance people’s well-being, whereas a focus on technology could obscure a proper understanding of needs, making the satisfaction of needs subservient to technological solutions. Older people should not be required to adapt to technology; technology should be adapted to their needs.

We have identified three broad areas in which ICT can be expected to make a positive impact on well-being: social networking, cognitive function and health. These domains are not intended to be exhaustive. Rather, they are proposed as key domains in which ICT is particularly well poised to facilitate an improvement in well-being. While this chapter is focused on older people, the improvements and recommendations made can be considered applicable at any age. It is not proposed that ICT can provide benefits equivalent to those that result from face-to-face interaction; rather, it is recommended as an alternative when other factors (e.g., decreased mobility) prevent face-to-face interaction.

The importance of these three domains is underscored within the Opportunity Age initiative (UK Government Department for Work and Pensions), in which health, maintaining social contacts, challenging stereotypes, and the opportunity to live a full and active life are highlighted as key contributors to mental capital.

1 Corresponding Author: Dr Daniel Johnson, Games and Interactive Entertainment, Faculty of Science and Technology, Queensland University of Technology, GPO Box 2434, Brisbane, Qld, 4001.
The interventions described in this chapter address the strategic challenge identified by the Government regarding the support of an ageing population [3], and contribute to the achievement of the Public Service Agreement (PSA) on tackling poverty and promoting greater independence and well-being in later life.
1. Social Networking

1.1. Background

With advancing age, it is common for people’s social networks to decrease in size. This reduction results from a variety of factors, which include losing contact with work colleagues post-retirement, loss of peers and friends, reduced ability to visit or be visited as a result of decreased mobility (a reduction in one’s own mobility, the mobility of peers, or both) and other lifestyle changes that may reduce the likelihood of making new contacts. It is known that social networks enhance well-being [4] (by providing companionship, a sense of belonging, support, etc.), and that social isolation is associated with physical and mental health problems and reduced survival [5,6]. Hence, a reduction in one’s social network is very likely to lead to reduced well-being. In addition, group memberships (e.g., profession, sports club) contribute to a person’s sense of identity [7]. As a person’s social network decreases in size, they are increasingly likely to lose their group memberships, and the resulting reduction in self-identity may also result in decreased well-being.

Concurrent with these life-course changes is a general trend in the larger population towards increased mobility and a greater chance of dispersion. For many older people this
means that family members are less likely to live nearby. ICT directed towards social networking is uniquely positioned to counter the reduction in well-being associated with these three factors (the reduction in social network size associated with ageing, the loss of group memberships, and the trend towards population mobility and dispersion). Moreover, interventions in this area will contribute directly to meeting the recently identified ageing-population strategic challenge by encouraging more socially active ageing [3].

1.2. Benefits

Broadly, ICT aimed at enhancing social networks can be used to facilitate a feeling of connectedness. While there is some evidence that older people are beginning to engage with ICT as a means of social networking [8], further support and encouragement will allow a higher proportion of older people to gain further benefit in this domain. More specifically, ICT can be used by older people to connect with their own cohort, intergenerationally (e.g., with children and grandchildren who might be living some distance away), with others who have similar interests (thereby facilitating group memberships), or with others in a similar situation. For the last category, an improvement in well-being is likely to result simply from knowing that others share one’s experiences and from connecting with them. Accessing interpersonal support networks will have a direct positive impact on well-being. Moreover, being able to make these various connections will, in many cases, provide the opportunity to experience a sense of purpose and usefulness (which, in turn, can be expected to improve well-being) [9].

1.3. Broadening and Deepening Social Contacts

ICT can potentially be used by older people to maintain, enhance or build social networks. Maintenance will be facilitated through ICT’s ability to allow older people to overcome their own possibly reduced mobility, or to connect with others who are less mobile or dispersed.
An older person’s social networks will be enhanced in situations where ICT allows for more regular contact and a greater transfer of information (e.g., storytelling, images, video) than would otherwise be possible. Finally, ICT can help older people build new social networks (in situations where this would otherwise not be possible) through the identification of others with similar interests.

1.4. Social Contribution

In addition to the benefits to well-being associated with increased feelings of connectedness, social networking via ICT can improve well-being for older people by facilitating the ability to continue to contribute to society. Specifically, ICT can help people to identify and pursue opportunities for providing expertise, volunteering, paid work, work-related networking and community tasks. Thus, interventions that assist older people to engage with social networking ICT can be expected to support the aforementioned PSA by providing a means by which older people can continue to contribute to society. Moreover, as identified as part of the strategic challenge surrounding an ageing population, many older people may need to work past retirement age to ensure they do not experience poverty in later life [3]. Better access to social networking ICT among older people can help meet this strategic challenge by facilitating work-related behaviour
as described. It should be noted that, in addition to these specific opportunities, the general connections facilitated by social networking via ICT (discussed above) will in many cases present opportunities for older people to provide social support (to their peers, their grandchildren, other young people, etc.). Accessing these opportunities can also be expected to improve well-being, as research has shown that providing support to others is associated with improved well-being and longevity [10,11].

1.5. Indirect Social Monitoring

Social networking through ICT may also provide a non-intrusive form of monitoring (monitoring is considered in more detail below). An older person’s presence and persistence on social networking platforms, such as recent online activity, can provide an indication of that person’s real-world status without the risks to privacy or dignity associated with some other forms of health monitoring. To be clear, it is not suggested that social networking can perform the same functions as technology specifically designed for health monitoring, but rather that in situations where a balance with privacy is sought, indications of presence or persistence could be useful.
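As a purely illustrative sketch of the kind of indirect monitoring described above (the activity log, timestamps and 36-hour threshold are invented for illustration, not drawn from any system discussed in this chapter), a carer-facing service might simply flag an unusually long gap in a person’s observed online activity:

```python
from datetime import datetime, timedelta

# Hypothetical example: infer "presence" from the most recent online
# activity (posts, calls, logins) rather than from intrusive sensors.
def is_inactivity_concern(activity_log, now, threshold=timedelta(hours=36)):
    """Return True if the last recorded activity is older than `threshold`.

    activity_log: list of datetime objects, one per observed online action.
    """
    if not activity_log:
        return True  # no data at all is itself worth a gentle check-in
    last_seen = max(activity_log)
    return (now - last_seen) > threshold

# Usage: two recent actions raise no concern; a long silence is flagged.
now = datetime(2011, 1, 10, 12, 0)
recent = [datetime(2011, 1, 9, 20, 15), datetime(2011, 1, 10, 9, 30)]
quiet = [datetime(2011, 1, 7, 8, 0)]
print(is_inactivity_concern(recent, now))  # False
print(is_inactivity_concern(quiet, now))   # True
```

The point of such a design is that the signal is coarse (present or quiet), which is exactly what makes it less privacy-invasive than dedicated health monitoring.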
2. Cognitive Function

2.1. Background

Older people typically experience a decrease in cognitive function as they age [12], and this can result in an associated decrease in well-being. For some older people, awareness of cognitive problems, for example memory impairment, can lead to loss of confidence and social withdrawal. Similarly, finding daily tasks more stressful can lead to poorer health, which in turn has a negative impact on well-being. Moreover, decreases in cognitive function can have a direct effect on health and well-being, e.g., through forgetting to take medication.

When considering the use of ICT as a means of improving well-being, two types of decrease in cognitive function can be identified as relevant. First, older people may experience decreases in cognitive function that are an inevitable part of the ageing process. Such decreases point to the need for ICT that is designed to allow for these changes in cognition (ICT Compatible with Changes in Cognitive Function) and also for ICT that can minimise the impact of any loss in cognitive function (ICT to Support Cognitive Impairment). Second, older people may experience decreases in cognitive function due to factors such as lack of confidence or reduced cognitive stimulation post-retirement [13]. Such decreases could potentially be prevented, reduced or slowed by ICT designed to encourage the continued use of cognitive resources among older people (ICT to Maintain and Enhance Cognitive Function).

2.2. Benefits

2.2.1. ICT Compatible with Changes in Cognitive Function

Much research and development has been devoted to exploring the adaptation of technology to the age-related changes in cognition that older people often experience. This
includes the consideration of physical characteristics of the interface (e.g., font and icon sizes), navigational issues (e.g., minimal screen scrolling), information organisation (e.g., appropriate breadth and depth in menu structures) and more general conceptual issues (e.g., the importance of consistency in the interface) [14]. This work is inherently challenging, as it requires identifying people’s changing needs as they age and creating technology that is compatible with such changes. Such work, including its application in commercial settings, should be further encouraged (as discussed in section 4).

2.2.2. ICT to Support Cognitive Impairment

The research discussed above is focused on the adaptation of ICT to make it more accessible to older users who have experienced changes in cognitive function. In contrast, there is less research available regarding ICT that could potentially minimise the impact of losses of cognitive function on well-being (the existing research is reviewed in [23]). In the first instance, such work should be informed by a user needs analysis to identify exactly what older people might want or need from technology deployed for this purpose. It is possible that changing needs could be met through the use of an intelligent adaptive agent. An intelligent agent can be defined as any software or technology that is able to respond effectively to its context and to the user. This might help minimise the impact of losses in cognitive function in an older person’s day-to-day life. Such an intelligent agent could support an older person by, for example, keeping track of appointments and providing reminders of people’s names and their connection to the older person. However, further research is required to determine whether such an agent would be desirable and useful, and what form it might most effectively take.

2.2.3. ICT to Maintain and Enhance Cognitive Function

There is evidence that cognitive function is impaired by lack of use and that training in specific areas (memory, reasoning, processing speed) can have long-term benefits for older adults [16,17]. ICT can be used as a means of improving well-being for older people in the area of cognitive function by providing opportunities for cognitive stimulation and lifelong learning. More research is needed on the means by which ICT could maintain or facilitate cognitive function. Such research will require an interdisciplinary approach, combining expertise in cognitive function, ageing and ICT. As with ICT designed to minimise the impact of losses in cognitive function, it will be important that a needs analysis is conducted, in this case to identify which cognitive activities older people might find rewarding and what form ICT should take in order to be effective and engaging.

More generally, ICT can be used as a means of improving the cognitive function, and hence the well-being, of older people by providing opportunities for lifelong learning (such as those provided through the University of the Third Age) [18]. Aside from identifying the exact role and form that ICT can most effectively take in this area, attention should simultaneously be given to improving older people’s perceived self-efficacy and motivation, both in terms of ongoing learning and the use of ICT (discussed further in section 4). Finally, it is known that mood and mental health can negatively affect cognitive functioning. Consideration should be given to how ICT can help prevent this process through detecting any signs of deterioration – a topic explored in the following section on health.
3. Health

3.1. Background

There is an increased prevalence of health problems in older age, including chronic illness, which can have a direct negative effect on older people’s well-being. ICT can potentially have benefits in terms of improving health and health-related behaviours. This is particularly true given the current trend away from institutional care towards care in the home. Interventions in this area will contribute to the relevant Public Service Agreement (PSA) in terms of improving the level of health experienced by people in later life and helping people maintain independent living. Furthermore, the recommended interventions align with issues identified around the strategic challenges of an ageing population: specifically, the potential advantages of telecare and monitoring, the increasing pressure on the health system and on carers, and the likelihood of increasing demand for higher quality services [3]. Health problems also often reduce the opportunities for social engagement, networking and cognitive stimulation that are available to older people in better health, resulting in an even more important role for ICT with respect to health.

3.2. Benefits

ICT can be used both by older people themselves and by others involved in their care (such as informal care providers and health professionals) in a manner that will ultimately improve the health, and thus the well-being, of older people. ICT can be used by older people to improve their own health and well-being in four main ways: prevention of disorders, management of disorders, rehabilitation, and enhancement of health (regardless of the presence of disorder).

3.2.1. Prevention of Disorders

ICT can be used to provide health education aimed at reducing behaviours that are detrimental to health and increasing behaviours that are beneficial.
For instance, ICT can be used to provide feedback or cues to older people (particularly those at risk of specific disorders) that identify situations, times or behaviours that put them at risk. An example would be combating circulation or posture problems by highlighting the need to get up from a desk or favourite armchair, move around and stretch. Prevention can also be facilitated by monitoring early signs of illness or deterioration and feeding this information back to the individual and/or others involved in their care.

3.2.2. Management of Disorders

When a disorder is present, ICT can be used to improve well-being by helping to manage the disorder in a variety of ways. At a simple level, ICT might be useful for providing a distraction from pain or other problems associated with the disorder. ICT can also be used at a practical level to monitor behaviour and provide reminders (regarding medication, exercise, etc.). At a more advanced level, ICT might be used to provide therapy (for example, cognitive-behavioural therapy) or to train older people in techniques such as mindfulness meditation to reduce stress or relieve pain. Furthermore, ICT, possibly in the form of an intelligent agent, may be able to serve as a ’coach’ for older people in managing a disorder.
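To make the reminder role concrete, the following is a minimal, hypothetical sketch (the schedule format, times and messages are invented for illustration) of how a simple agent might decide which medication or exercise reminders are currently outstanding:

```python
from datetime import datetime, time

# Hypothetical daily schedule: (reminder time, message).
SCHEDULE = [
    (time(8, 0), "Take blood pressure medication with breakfast"),
    (time(12, 30), "Ten minutes of gentle stretching"),
    (time(20, 0), "Take evening medication"),
]

def due_reminders(schedule, now, already_done):
    """Return messages whose time has passed today and that are not yet done."""
    due = []
    for at, message in schedule:
        if now.time() >= at and message not in already_done:
            due.append(message)
    return due

# Usage: at 13:00, with the morning medication already confirmed,
# only the midday stretching reminder remains outstanding.
now = datetime(2011, 1, 10, 13, 0)
done = {"Take blood pressure medication with breakfast"}
print(due_reminders(SCHEDULE, now, done))
```

A real agent would of course need confirmation from the user (the `already_done` set here), escalation to a carer when reminders go unanswered, and careful attention to the consent and transparency issues discussed in section 4.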
3.2.3. Rehabilitation

ICT can facilitate rehabilitation by providing information explaining exactly how to perform post-operative exercises, and reminders regarding when to perform them. Moreover, rehabilitation provided via ICT can potentially be more fun than traditional methods. Existing technology such as the Nintendo Wii [21] and games such as ’Dance Dance Revolution’ [22] can be adapted to form an enjoyable means of physical rehabilitation. More advanced forms of ICT, such as robotics and virtual reality, can also be used to support rehabilitation. Examples include the use of robotics to help with physical rehabilitation exercises (whereby a robotic device might guide a person’s limbs into the correct orientation), or the use of virtual reality to simulate the first-time use of a wheelchair to access a medical clinic, as a means of improving older people’s sense of self-efficacy with the task. Although ICT is already being used for these purposes, there are complex issues regarding efficacy and the validation of existing practice and new techniques that require further work and exploration.

3.2.4. Enhancement of Health and Well-Being

In the absence of disorder, ICT can be used to improve well-being by enhancing health. For example, ICT can improve health literacy by facilitating access to information and improved education about positive health, vitality and thriving. ICT can be used to train mindful awareness and appreciation of good health and positive functioning. The aforementioned technologies (the Nintendo Wii, ’Dance Dance Revolution’) can also be used as a fun way to augment health and well-being by encouraging physical exercise. Moreover, ICT could be deployed in the form of a personal assistant or ’coach’ that monitors daily activities and helps identify personal strengths and positive behaviours, as well as areas where further improvement is desirable.
The role of ICT in reducing social exclusion will also have direct benefits for the enhancement of health (see section 1).

3.3. Improvements to Health Care Provision

ICT can be used by care-givers and health professionals to improve the well-being of older people in their care. It should be noted that the UK National Health Service (NHS) initiative ’Connecting for Health’ has begun the process of instituting many of the ideas and initiatives discussed in this section. What follows is designed to highlight the areas in which further focus and support will be worthwhile.

In general, ICT is beginning to be used to facilitate the provision of up-to-date information regarding an older person’s health status to those involved in their care. This information can be provided most effectively if the level of complexity is varied according to the needs and understanding of those accessing the information. That is, the older person, their primary carer or a medical practitioner could each access descriptions of the issues adapted to their needs and understanding. Similarly, ICT can facilitate better healthcare integration, such as coordination between specialists treating an older person, or coordination between medical practitioners when an older person travels. In line with current thinking in this domain, it is essential that any such healthcare integration system be opt-in and highly trustworthy, as privacy concerns may exist. Furthermore, ICT could be directed towards greater standardisation of health records in terms of how they are written, stored and accessed. Such standardisation will assist in both
the provision of up-to-date information and the improvement of healthcare integration. Finally, ICT might be used to provide information to the medical practitioner on a real-time basis, and this could extend to relevant up-to-date research, allowing them to draw on the available evidence base in a ’bedside’ situation.
4. Barriers and Solutions

The barriers and opportunities that exist around improving well-being for older people through the use of ICT fall into four major categories: psychological, technological, costs (financial and human), and issues of privacy/confidentiality. In most cases the barriers and opportunities apply across domains (social networking, cognitive function and health). The recommended interventions are summarised in Table 1 for ease of reference.

4.1. Psychological

While ICT has the potential to improve well-being in a variety of ways, older people may be less accepting of new technologies because of negative attitudes towards technology generally, a lack of motivation to engage with technology, and/or poor self-efficacy regarding technology use. Further, some older people (and others in close contact with them) may not be aware of the existence of particular kinds of ICT and, more importantly, may not be aware of the potential usefulness of specific ICT solutions.

A key issue may be older people’s lack of knowledge regarding how to use and maintain technology. Knowledge of how to use technology is necessary but may not be sufficient in the absence of knowledge regarding how to maintain it. For example, an older person who knows how to use a social networking technology such as Skype may cease to use that technology the first time it fails if they are unable to restore it to a working state. It should be noted that these psychological barriers are not unique to older people and can apply to younger people as well.

4.1.1. Solutions

For obvious reasons, it is important that assumptions are not made about what older people might need without consulting them. Thus, in the early stages of intervention in this area a user needs analysis will be of great value.
However, as discussed, it is likely that in some cases people are not aware of the full range and possible benefits of ICT solutions. For this reason, any needs analysis should be combined with the provision of information in which the nature and benefits of available technologies are explained, so that older people can understand the full range of solutions and thereby give the most informed response possible regarding their needs.

In addition to awareness-raising as part of a needs analysis, education should be undertaken more broadly, with a view to informing older people and their carers about the potential benefits of technologies and how to use and maintain technology. Such education could be provided both formally (e.g., through classes) and informally (e.g., through modelling via advertising). This is particularly relevant in the areas of social networking and cognitive function, where the benefits may not be as readily apparent as they are in the health domain. For instance, the advantages of an instant messaging
Table 1. Summary of recommendations to promote the use of ICT to improve well-being among older people

General
• A comprehensive assessment of needs and desires conducted with older people, leading to a formal user needs analysis
• Creating opportunities for positive experiences with technology
• Challenging existing stereotypes through examples (i.e., older people effectively using and enjoying ICT)

Learning & Education
• A programme of formal and informal education designed, developed and delivered by older people (with expert consultation) covering:
  ∗ the potential benefits of ICT (both direct, e.g., improved social connectivity, and indirect, e.g., improved well-being through providing social support to others)
  ∗ how to effectively use and maintain ICT
  ∗ a full understanding of technology designed for monitoring (including differing levels and kinds of monitoring and the associated benefits)

Financial
• Direct financial support to offset ICT-related costs (e.g., a broadband subsidy)
• Provision of basic ICT infrastructure in the home by social services
• Support programmes for the use and maintenance of technology, staffed (wherever possible) by older people (e.g., government-funded support technicians)

Legislative
• Statutory requirement for inclusive, accessible and usable design of key products related to instrumental activities of daily living
• Requirement for transparency in technology designed for monitoring (i.e., clear indications of what is being monitored and who has access to monitoring data)
• Standardisation of methods for providing consent to monitoring technology

Research & Development
• Funding and support for research and development of ICT to enhance well-being in key areas, for example:
  ∗ accessible and inclusive design
  ∗ affective design (i.e., design of products to elicit specific emotional responses)
  ∗ ubiquitous and aware technology
• Programmes focused on encouraging the translation of existing research in fields such as inclusive design, cognitive function and well-being into commercial and industrial practice, and collaboration between commercial and research entities
system, in terms of the combination of synchronous and asynchronous communication, may not be obvious to someone who has never used such a system. Awareness of benefits will have a positive impact on the motivation to learn how to use and maintain such technology.

The indirect benefits of ICT may also be unclear to potential older users, and there is likely to be great value in highlighting them. For example, the potential for social networking ICT to allow an older person to help younger relatives through the provision of advice and support, or to contribute to community groups, may not be readily apparent but could be a key motivating factor in the uptake of and engagement with technology.

In tandem with education, it is essential to provide support for using and maintaining technology. Technology will often fail, and support may be required to restore it to a functioning state. Where possible, such support could be provided by an older person. This has two key benefits: firstly, it allows the support provider to take an active and useful role, which will enhance their own well-being; secondly, it helps to dispel stereotypes regarding older people’s inability to use and maintain technology. Ultimately, this should facilitate the building of self-efficacy among older people regarding their abilities with technology.

4.2. Technological

Much ICT is of questionable standard in terms of usability and interface design. Designers do not always consider user requirements, and when consideration is given to user-related issues the focus is often on particular user groups (for example, younger users) or specific contexts (for example, office environments). Users are often insufficiently involved in the design process, and technologies are far too often deployed untested and, arguably, prematurely. The end result is that technology can be unnecessarily difficult to use. Moreover, the robustness and reliability of technology is sometimes inadequate.
When a technology performs unreliably, there is a follow-on impact on users’ trust in the technology and on their own sense of self-efficacy [19]. Difficult interfaces and unreliability may prevent older users from experiencing the benefits to well-being available through the use of ICT.

4.2.1. Solutions

The field of inclusive design focuses on the design of products that are accessible to all users. Older users are a key demographic in this field as a result of the changing age structure in many developed and developing countries [20]. It is important that the UK government provides further support to this field, with a view to accelerating the research and development that will identify how ICT can be improved to allow greater accessibility for older people. Likewise, government support should be provided for work towards the goal of ensuring that key ICT products and interfaces are reliable and robust.

Further research and development exploring devices and interfaces that are more aware of the user, and more effectively able to respond to the user, should be encouraged. This is essentially another way of dealing with older people’s difficulties with ICT interfaces: if a device or product has a degree of awareness of the state and desires of the user, then the user has less need to engage with the interface (in the traditional sense of the term), as the product will anticipate the user’s needs and desires.
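As a toy sketch of such an aware interface (the sensor readings, thresholds and point sizes below are invented for illustration, not drawn from any product discussed in this chapter), a reading device might choose a text size from an ambient-light reading and an estimated reading distance, without the user touching any controls:

```python
# Illustrative only: map hypothetical sensor readings to a text size,
# so the device adapts to the reader rather than the other way round.
def adaptive_text_size(ambient_lux, reading_distance_cm,
                       base_pt=12, max_pt=24):
    """Pick a font size (points) from ambient light and reading distance."""
    size = base_pt
    if ambient_lux < 50:          # dim room: enlarge text
        size += 4
    elif ambient_lux < 200:       # ordinary indoor light: enlarge slightly
        size += 2
    if reading_distance_cm > 45:  # held far away: enlarge further
        size += 4
    return min(size, max_pt)      # never exceed the layout's maximum

# Usage: a dimly lit room with the device held at arm's length yields
# a larger text size than bright light at a normal reading distance.
print(adaptive_text_size(30, 50))   # dim and far
print(adaptive_text_size(400, 35))  # bright and near
```

The design choice worth noting is that the user never sees this logic; from their perspective, the text is simply always readable.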
For example, a device that provides written information to the user, such as an electronic book, could be designed to respond automatically to the visual needs of the person holding it, e.g., by taking account of the reading distance or ambient lighting. In this way, the user would not be required to find and adjust the text-size or backlighting controls on the device, as their needs would have been anticipated and accommodated by the device.

In addition to supporting work on inclusive design and aware technologies, consideration should be given to the means by which design can make ICT more appealing to older users. Fields such as affective design focus on designing products and interfaces so that they are pleasant and rewarding to use. If ICT is designed to provide older users with a positive affective experience, then the benefits to well-being of using ICT will be more readily realised; indeed, engagement with the technology could, in itself, have a positive impact on well-being.

In some cases the provision of support will be insufficient motivation for change. In tandem with such support, consideration should be given to legislation requiring greater usability testing and the meeting of minimum design and accessibility standards in certain key domains associated with instrumental activities of daily living (for example, interfaces on ICT related to public transport, such as ticket machines). This legislation could take a form similar to that used to manage minimum World Wide Web accessibility requirements (as detailed in the UK Disability Discrimination Act (1995) Part III).

4.3. Costs

There are many costs associated with the suggested uses of ICT as a means of improving well-being among older people. There are financial costs involved in acquiring ICT devices and in maintaining them. Additionally, there are human costs in terms of the time and effort required to learn how to use and maintain technology.
It is important to consider that the investment horizon in this area is often reasonably long: many of the most effective interventions will be costly in the short term but provide significant returns in the long term. For example, the initial costs of setting up a person’s home with ICT infrastructure can ultimately be expected to result in a net financial benefit simply by allowing older people to live in their own homes for longer, thereby avoiding the costs associated with institutional care.

4.3.1. Solutions

Broadly, the solutions in this section revolve around offsetting the costs associated with using ICT to improve well-being in older people. In many cases this could take the form of direct financial support. For example, consideration should be given to the provision of a broadband subsidy for older people. Access to the internet is a key component of social networking and health interventions, and the ongoing cost of internet access could be made more palatable to older users if it is subsidised. This is particularly the case as the full benefit of internet access may only become clear to users after they have had an opportunity to engage with it for a period of time. Offsetting the cost of internet access is likely to lead to more older people discovering such benefits than if they had to meet this cost on their own. Alongside such subsidies, social services could provide a minimum essential level of connectivity infrastructure as part of the process of adapting homes for older people. Furthermore, the service might extend to adapting ICT to meet the changing needs of
individual older people, in the same way that adaptations are made to physical spaces such as the bathroom or stairs.

To minimise the cost in terms of the time and effort required to understand how to use and maintain ICT, older people should be provided with training and support. Training and support will ensure that the benefits of any subsidies and services are fully realised. Where possible, training should be built into the system itself. Training that is built into systems should be supplemented by the provision of more traditional training in classroom-type situations (provided by older people wherever possible). This has the dual benefits of giving a greater sense of purpose to the trainer and having a positive impact on stereotypes and self-efficacy among those being trained.

4.4. Privacy / Confidentiality

The issues in this area are most directly relevant to health-related ICT (although the principles apply across the board). Some of the key concerns in this domain relate to the notion of monitoring. Given people’s general preference for privacy, it is important that wherever monitoring is considered, the associated benefits are made obvious. Similarly, it should be made clear who is performing any monitoring and, wherever possible, decisions regarding who has access to the monitoring data should be made by the person being monitored.

4.4.1. Solutions

The most important solution in this domain is the provision of education. Older people should be informed about the possible benefits of monitoring, who would perform any monitoring and exactly what would be monitored. Additionally, it is essential that any monitoring ICT be designed to be transparent; in other words, it should be clear what is being monitored and by whom. Such transparency will allow appropriate levels of trust to be placed in the technology. Consideration should be given to legislation requiring that basic minimum standards be met in this area.
Attention should also be given to the development of a standardised means of providing consent. The deployment of appropriate ICT will be facilitated if older people can choose the degree of monitoring and privacy they are comfortable with in a simple, standardised and understandable manner.

4.5. Broad Considerations and Solutions

There is a great deal of relevant research being conducted that can be applied to the use of ICT as a means of improving well-being among older people. However, there is a continuing need to translate existing research into practice and to find ways to encourage researchers and those working in commercial settings to collaborate (as exemplified in knowledge-transfer partnerships). As discussed above, many of the barriers to the successful use of ICT by older people stem from the existence of negative stereotypes (around older people’s inability to use technology) and from low self-efficacy among older people regarding their ability to interact successfully with ICT. Two key interventions will be invaluable in this area.
1. Opportunities for older people to experience successful interactions with ICT need to be provided (for example, through hands-on training). The more successful interactions older people enjoy with ICT, the better the self-efficacy they will experience and the more stereotypes will change.

2. At an informal level, providing counter-stereotypical examples in advertising and in the media generally can reduce negative stereotypes regarding older people and ICT. This reduction in negative stereotypes is likely, in turn, to have a positive impact on older people’s self-efficacy.

Across all the solutions discussed it is essential that caution be exercised in order to avoid unintended consequences. It is possible that interventions in one sphere could inadvertently cause problems in another. For example, a robotic technology designed to improve well-being by cleaning an older person’s house might lead to a dramatic decrease in the level of exercise undertaken by that person, with an associated negative impact on well-being. A holistic approach must be taken in which all facets of well-being are considered. Moreover, any interventions that are deployed must be designed to be ‘opt-in’ systems (that is, systems that are only deployed with the consent of those involved). Wherever possible, the person potentially benefiting from the intervention should have the final word regarding whether and how the intervention is used.
5. Summary

In summary, the evidence reviewed at the workshop leads to several conclusions regarding the means by which ICT can be used to improve well-being among older people. Social networking through ICT can lead to the broadening and deepening of connections, the facilitation of social contribution and the provision of an indirect form of monitoring. ICT can be used to maintain and enhance cognitive function and to support people with cognitive impairment. With respect to health, ICT can assist with the prevention and management of disorders, rehabilitation, enhancement of health and improvements to health care provision. In order to best realise these opportunities the psychological and technological barriers need to be reduced and cost and privacy concerns need to be resolved.
Acknowledgements

This review is based on an International Workshop held under the auspices of the UK Foresight Project on Mental Capital and Well-being, March 7, 2008. The content of this report is based on the contributions of the attendees at this workshop. The full list of participants is as follows:

• Emma Berry - Microsoft Research Cambridge, UK
• Herman Bouma - Technische Universiteit Eindhoven, Netherlands
• Don Bouwhuis - Technische Universiteit Eindhoven, Netherlands
• Neil Charness - Florida State University, USA
• Anna Dickson - University of Dundee, Scotland
• Arthur (Dan) Fisk - Georgia Institute of Technology, USA
• Derek Flynn - Government Office for Science, UK
• Felicia Huppert - University of Cambridge, UK
• Daniel Johnson - University of Cambridge, UK
• Hannah Jones - Government Office for Science, UK
• Jane Mardell - Government Office for Science, UK
• Tuvi Orbach - Health-Smart, UK
• Wendy Rogers - Georgia Institute of Technology, USA
• Will Stewart - Southampton University, UK
• Sandy Thomas - Government Office for Science, UK
• John Waterworth - Umea University, Sweden
• Mary Zajicek - Oxford Brookes University, UK.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-26
Well-Being in Physical Information Spacetime: Philosophical Observations on the Use of Pervasive Computing for Supporting Good Life

Stefan ARTMANN 1
Frege Centre for Structural Sciences & Institute of Philosophy, Friedrich Schiller University, Jena, Germany
Abstract. The paper discusses, from a philosophical perspective, the use of Pervasive Computing for supporting the well-being of humans. Philosophy can help computer scientists working in this area to resolve conceptual problems, to take up a considered position concerning ethical issues, and to define clear technological principles. Drawing on the philosophical tradition of pragmatism, the questions addressed in the paper are: How is the technological research programme of Pervasive Computing to be defined? How are humans and systems of Pervasive Computing to be modelled as intelligent agents? How do we understand the behaviour of agents as intentional? How is the interaction of humans and systems of Pervasive Computing to be analyzed? How can we define the well-being of intelligent agents? How should systems of Pervasive Computing be designed if they ought to support the well-being of humans?

Keywords. Well-being, weak and strong Pervasive Computing, physical information spacetime, intelligent agents, intentionality, intention recognition
Introduction

After the triumph of personal computers and the internet restructured life and work around clearly identifiable technical devices in the contemporary human habitat, computing technology is now becoming indistinguishable from the public sphere of developed countries and from the private sphere of their inhabitants. At the dawn of a world of ambient intelligence and smart environments it does not come as a surprise that even philosophers increasingly feel the need to reflect on the computers they knowingly or unknowingly interact with.

Philosophy is the effort of human reason to enquire into the basic structures of what there is, into the general principles of how humans act, and into the elementary rules of how they produce artefacts. Fundamental problems in the design of technologies inextricably intertwine all three questions. Whether and, if so, how Pervasive Computing can support human beings in leading a good life is a current
1 Corresponding Author: Stefan Artmann, Frege Centre for Structural Sciences, Friedrich Schiller University, Zwätzengasse 4, 07743 Jena, Germany; E-mail:
[email protected].
and most important example. It is here that philosophers can show what they are able to contribute to shaping the technological world of tomorrow.

The first step towards a productive involvement of philosophers in technology design is to prove, to engineers and scientists who develop and implement technical systems, the pragmatic value of what philosophers are trained in. This endeavour may be especially fruitful in computer science. Both philosophers and computer scientists are accustomed to analyzing and synthesizing concrete abstractions: virtual entities that are considered empirical objects effective in physical spacetime, or real entities that are considered general structures constructible in formal imagination [1; 2, ch. 12]. Computer scientists deal with pieces of software, on the one hand, as algorithms that abstractly prescribe instructions for the solution of a problem and, on the other hand, as activity patterns of electric charges in a concrete machine. When an algorithm is expressed in a programming language and typed into a computer, an object arises that is fully described only if its concrete and abstract aspects are considered to be correlated with each other. Philosophers have discussed such entities ever since they began to think about symbols, texts, and languages, which are concrete abstractions, too. The black-and-white pattern you are seeing at this moment functions as a text if and only if you correlate the concrete objects you perceive with an abstract meaning.

The field of computer science where collaboration with philosophers can already look back on a certain tradition is Artificial Intelligence (AI). The reason for this is that AI, even if it follows the embodiment paradigm, must come to grips with the role of symbol processing in the generation of intelligent behaviour.
Conversely, decisive engineering advances in AI should inspire, in philosophy, new ideas on the necessary conditions of intelligent behaviour and the general mechanisms of symbol processing. Pervasive Computing (PerComp) is a good candidate for a strongly AI-related technology [3, ch. 33] that potentially has such an effect. The interaction, e.g., that smart environments shall allow humans and artefacts to engage in requires that some computing devices are able to recognize the intentions of humans by monitoring and interpreting their behaviour, and to support or obstruct the realization of these intentions by controlling actuators in a proactive manner. To achieve this feat, it may be helpful to combine AI expertise with a philosophical perspective on the interaction of intelligent agents for the benefit of PerComp’s inventiveness, which might in turn stimulate not only AI’s formal modelling of intelligent behaviour but also the conceptual imagination and thought experiments of philosophers.

In this paper I shall show how strongly connected PerComp, AI, and philosophy are by addressing the question of how PerComp can support human beings in leading a good life. Philosophy will make two main contributions to the answer sketched here. A model of cooperation between intelligent agents is sketched that owes its fundamental ideas to philosophical reasoning about intentions, their recognition, and the construction of shared intentions. Whereas these ideas have at least partly been applied in AI, their embedding into a theory of well-being that originated in Greek philosophy may be new to computer scientists. The more than 2300 years since Aristotle have seen many ethical conceptions of what humans should do, and why they should do so. Concurrently, the cultural conditions of everyday life have been changing drastically, not least through technological progress.
Aristotle’s practical philosophy, however, still is a good starting point for the ethical discussion of human action in today’s computerized world. Aristotle described human action and well-being in a functional way that can be developed into an engineering view on how humans interact with artefacts in order to lead a good life. Such a modern re-interpretation of Aristotle’s
philosophy of action has been proposed by one of the classical authors of American pragmatism, John Dewey. The essentials of these philosophies of well-being are presented in Section 1.

Of course, ethical theories of human-machine interaction would be futile if they were not informed by knowledge of the technologies with which humans are interacting. Current progress in ambitious varieties of PerComp (which I call ‘strong PerComp’) is of utmost interest to the philosopher. These varieties do away with the traditional differences between information-processing hardware and other material objects, and between information processing and other processes in the physical world. They increasingly bring contextual variation into the interaction between humans and information-processing objects. Strong PerComp systems are not restricted to adequately reflecting the given physical world: they are enabled to autonomously control their environments on the basis of information about actual and desired physical world-states, and to control themselves on the basis of self-generated rules of behaviour.

The first task of a practical philosophy for building strong PerComp systems is to describe the medium in which humans and such systems interact. For this purpose I introduce the concept of physical information spacetime in Section 2. Applied to the interaction of humans and strong PerComp systems, Aristotle’s and Dewey’s concepts of well-being contradict the idea that a system ought to prescribe to a human a pattern of action that it holds to be good for the human. This means that the system has to recognize the intentions of the human’s actions, and to help the human to realize them if the system considers them realizable in the particular situation the system is able to influence. The second task of a practical philosophy of strong PerComp engineering is, thus, to develop a general understanding of the role of intentions in the interaction of humans and strong PerComp systems.
A philosophical discussion of the concept of intention, followed by its information-theoretic interpretation, is presented in Section 3. Intention recognition is a central problem of strong PerComp, so the general prerequisites for its solution must be described precisely. In order to find such conditions, it is sensible to start with the analysis of a seemingly simple small-scale process involving intention recognition. An agent, Alter, is doing something, and another agent, Ego, wants to help Alter achieve its aim. Ego and Alter are not allowed to communicate with each other by means of a common symbol system. Nevertheless, Ego shall recognize Alter’s intention, i.e., Ego shall ascribe an intention to Alter in a controlled way. This scenario, which is discussed in Section 4, requires Ego to recognize, on the spot, present-directed intentions of Alter.

At least three postulates must be assumed if intention recognition in this simple scenario is to be successful. The first postulate concerns the semantics of abstract data types common to Ego and Alter. These data types are, as explained in Section 5, what philosophers call ‘intentional predicates.’ The designer of strong PerComp systems should describe and analyze, by means of intentional predicates, what humans and machines are doing, and are planning to do, in the very same terms, namely in terms of intentions to be realized by their actions. The underlying philosophy of action must thus be able to capture not only the action of humans but also the behaviour of intelligent artefacts.

The second postulate of intention recognition states the pragmatic prerequisite of continuity between intentional and intended actions. It says, in short, that Ego is able to infer Alter’s intention to do something at the moment tn from Ego’s observation of Alter’s behaviour in the preceding time interval t1...tn-1, since Alter has intentionally been
doing what it did at t1...tn-1. For the engineer of strong PerComp systems, this means that the process character of actions must be taken seriously. More specifically, the design of strong PerComp systems should allow the system to extrapolate intended actions from intentional ones by building an environment in which the human can concentrate on how to accomplish intentional actions without being disturbed by the presence of the system. As Section 6 shows, this is a way to describe more precisely what is usually meant by ‘calm computing.’

The third postulate of intention recognition requires the presence of social conventions, or distributed algorithms, for acting with shared intentions. These conventions are meshed sequences of behavioural rules, or habits, of Ego and Alter that pragmatically structure the interaction of both agents without the need for coordinating intervention by a third agent. Without common knowledge of their habits and the construction of shared intentions on the basis of this knowledge, the coordination problem of Ego and Alter will be solved only by luck, even if Ego’s interpretation of Alter’s behaviour in terms of intentions has been correct. For the development of strong PerComp systems this means, e.g., that it is necessary to install such a system long before it is expected to run smoothly. Successfully supporting humans in leading a good life is a matter of long-term social relations – this is also true for computerized supporters. This is shown in Section 7.

The three postulates of the basic mechanism of intention recognition are to be understood as parts of a minimal foundation for the ethically informed design of strong PerComp systems. Section 8 summarizes the general consequences that can be drawn, from the philosophical observations presented here, for the design of intelligent systems that shall support the well-being of humans – a task that promises many philosophical and technological challenges for future interdisciplinary work.
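The second and third postulates lend themselves to a toy computational illustration. The following sketch is purely hypothetical and not taken from the chapter: it assumes a small library of habit sequences (standing for the shared conventions of the third postulate) and shows how Ego might extrapolate Alter’s intended action at tn from the intentional actions observed at t1...tn-1 (the continuity required by the second postulate). All names (`HABITS`, `infer_intention`, the action labels) are invented for illustration.

```python
# Hypothetical sketch of intention extrapolation from observed behaviour.
# HABITS stands for the "meshed sequences of behavioural rules" assumed
# to be common knowledge of Ego and Alter (third postulate).
HABITS = {
    "make_tea": ["fill_kettle", "boil_water", "pour_water", "add_teabag"],
    "wash_up":  ["fill_sink", "add_soap", "scrub_dishes", "rinse_dishes"],
}

def infer_intention(observed):
    """Return (habit, next_action) pairs consistent with the observed prefix.

    Second postulate: the intentional actions observed at t1..t(n-1) must
    form a prefix of a shared habit sequence, from which the intended
    action at tn is extrapolated.
    """
    candidates = []
    for habit, sequence in HABITS.items():
        n = len(observed)
        if sequence[:n] == observed and n < len(sequence):
            candidates.append((habit, sequence[n]))
    return candidates

# Ego observes Alter's behaviour up to t(n-1)...
print(infer_intention(["fill_kettle", "boil_water"]))
# → [('make_tea', 'pour_water')]
```

When the observed prefix matches no shared habit, the candidate list is empty and, as the third postulate suggests, coordination could succeed only by luck.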
1. What Is Well-Being? Aristotle’s Classical Definition and Dewey’s Modern Re-Interpretation

The well-being of humans has been a topic of Western philosophy since antiquity. What might come as a surprise is that classical Greek philosophers can inspire thought about well-being in our times as well. The main reason for this is that they discuss, for the first time in Western history, conceptions of what it means to be a human that are not authoritatively transmitted in mythical traditions but have stood the test of critical reasoning in which, in principle, any rational being can participate.

To address questions of well-being in the modern context of human-machine interaction, the philosophy of Aristotle (384-322 BC) is the best classical starting point. There are at least two reasons for this. First, he thought systematically about human well-being as resulting from a particular relation of knowledge, action, and production. Aristotle sub-divided philosophy, as general science, into theoretical philosophy, which is about what there is, practical philosophy, which is about human action, and ‘poetical’2, or technological, philosophy, which is about producing. What, e.g., ‘computation’ means is a theoretical question. What we do with computers is a practical question. How we are able to build computers to meet a particular demand is a technological question.
2 The expression ‘poetical’ has its etymological root in the Greek verb poiein, meaning to create, to produce. The Greek verb prattein means to act without producing something, and is the etymological root of the expression ‘practical.’
The method by which Aristotle would bind together all these questions is based on his functionalist manner of analyzing any topic he thought about. Aristotle’s functionalism is the second reason why his philosophy is the best classical starting point for discussing well-being in the modern context of human-machine interaction. How Aristotle defines well-being is explained in Section 1.1.

To make Aristotle’s concept of well-being fruitful for current research on human-computer interaction, it is necessary to develop its technological implications further. This is done in Section 1.2 by re-interpreting Aristotle’s way of relating human action and production from the perspective of one of the most important 20th-century philosophers, John Dewey (1859-1952). His philosophy, called ‘instrumentalism’ or ‘productive pragmatism,’ arises from an even more functionalist way of thinking than Aristotle’s. Since the role of technology in human action stands at the centre of Dewey’s functionalism, it holds a combined philosophical and engineering view on how humans interact with technical artefacts in order to lead a good life.

1.1. Aristotle’s Functionalist Definition of Well-Being

For Aristotle, to lead a good life is the supreme and all-encompassing purpose that human beings try to accomplish. They do so by making choices rationally. Desiring something that they consider to be good, humans deliberatively select a means, a particular action, which makes it probable that the desired aim will be realized. “Hence choice is either desiderative thought or intellectual desire, and such an origin of action is a man.” [4, 1139b4]3

If every action pursues a goal, is there a hierarchy of goals leading to a supreme aim that encompasses all other ones? This superior purpose must be something that is not in turn a means for realizing another purpose, which would then be considered even higher. The supreme purpose of human action must be an end in itself.
Since an artefact produced by humans can be used as a means by other actions, neither an artefact nor its production can be an end in itself. Aristotle concludes that the supreme purpose of a human being inheres in the very act of performing particular actions which are not carried out in order to achieve something else.

Different sub-disciplines of practical philosophy address, in Aristotle’s classification of sciences, different types of action: ethics is about individual action, economics about actions in the context of households, and politics about actions in the context of the Greek city-state. Yet the supreme purpose of human action is one and the same in ethics, economics, and politics. It consists in acting well, i.e. performing actions as ends in themselves. Acting well, in turn, is nothing else but living well. In the words of a foremost expert in Aristotle’s philosophy, the late J.L. Ackrill, “The good man acts bravely and honourably not to win a prize, for an ulterior motive, or in order to enable himself to live well later; but because he sees that to act so is to live well, and that is what he wants to do.” [5, p. 142] This must be so, since otherwise well-being could not be the supreme aim of action.

What are examples of acting well and being well? This question cannot be answered easily from an Aristotelian point of view. Practical philosophy, though human action is its subject matter, its origin, and its aim, cannot prescribe to a human what she has to do individually in order to live well. The philosopher can just give
3 The works of Aristotle are usually cited by making references to a standard edition, Immanuel Bekker’s edition of 1831. ‘1139b4’ means page 1139, column 2, line 4 in this edition. All good modern editions of Aristotle reprint the Bekker references in the outer margins of the text.
orientation about the general structure of human choice and action, and about constant conditions of well-being. Each individual human has to think about her own way of leading a good life in the social community where she is living [6, p. 11 and p. 180]. What individual well-being means can be specified only after the capability to act of the person concerned is taken into account. What can be said in general is that well-being is a quality of the whole life of a human. It thus cannot be realized fully by a single act; it arises from the continuum of actions of an individual.

Aristotle also calls well-being eudaimonia, which literally means being inspired by a good spirit. This Greek expression is usually translated as ‘happiness,’ yet this means neither a short-term private state nor a purely subjective feeling. Eudaimonia is the constant realization of a competency to do something at which an individual human excels, so that she experiences a continuous fulfilment of her best abilities. Aristotelian happiness, though related to an individual and the circumstances of her life, is objective because it is inherent in action: first, actions are not private events but observable changes in the empirical world and, second, they are dependent on the available abilities of an individual to act well in particular contexts. An American expert in Aristotle, J.H. Randall, Jr., concisely characterized Aristotle as being “in ethics a complete and thoroughgoing relativist – an objective relativist, in our present-day classifications.” [7, p. 252] From the perspective of such an objective relativism, happiness is an individual and context-dependent but nevertheless non-arbitrary way of living, which is accessible to ethical deliberation through observing the situated actions of the individual concerned.

The human being who is eudaimon, i.e. happy, is also free: her actions are self-sufficient.
What particular actions lead to her happiness depends on the habits that she has been developing. Habits are individually acquired patterns of action by which the character of an individual is shaped. Being able, due to one’s character, to set the right aims for one’s action is to have moral virtue, which Aristotle describes as the bearing that makes one search for the middle between two extremes (e.g., courage as the intermediate between sheepishness and foolhardiness). For Aristotle, morality is thus an affair of habituation. But moral virtue is not enough to lead a good life. A human must also be capable of judging in which situation she now has to act, so that she can find the appropriate virtue, and her reason must find the action that realizes that virtue. If such judging and reasoning have become a behavioural attitude, Aristotle refers to it as phronêsis, practical wisdom or prudence: “practical wisdom, then, must be a reasoned and true state of capacity to act with regard to human goods.” [4, 1140b20] Prudence is the practical virtue that concerns contingent matters relevant to an individual. In a sense, the prudent human embodies the rules to follow in context-dependent action. If she then acts according to what prudence advises her, which she will do when she has moral virtue, she is acting well and thus living well. Aristotle’s account of what well-being means, how it is related to action, and in what way desire, reason, choice, and action must play together to develop good habits and prudence, is thoroughly functionalist [7, ch. 12]. Aristotle did not give a theory of what well-being substantively is (e.g., a particular type of emotion). He thought well-being to consist in a mutual functional dependence of moral virtue, prudence, choice, and action in relation to circumstances. This dependence might, in terms of modern cybernetics, be considered a control loop underlying the well-being of a human (see Figure 1).
Moral virtue determines what shall appropriately be desired in a particular situation, and prudence, as intellectual virtue, proposes actions that may realize what is desired. Virtue and prudence inform the choice of the action the individual then actually performs. The condition that shall be controlled by action is the well-being of the
acting human, her happiness that is nothing but the excellent performance of the prudently chosen action. The more the human is able to overcome the perturbations her environment exerts on her action, the better she carries out her prudent action and realizes the desired aim, and the happier she is. This, in turn, more and more strongly habituates the human to acting virtuously and using her prudence: “it is only through the practice of performing good acts that we can become good ourselves, only through practical moral experience.” [7, p. 270] Consequently, Aristotle’s ethical control loop is a closed one: there is no need for an external controlling variable besides the internal condition to be controlled, happiness, considered as the excellent functioning of the whole control loop.
Figure 1. Aristotle’s functionalist account of well-being depicted as closed control loop.
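The closed control loop just described can be made vivid with a toy numerical sketch. The following Python fragment is purely illustrative and entirely my own construction: the variables (habituation, prudence, perturbation) and the update rules are assumptions chosen only to exhibit the self-reinforcing, externally uncontrolled dynamics that the cybernetic reading of Aristotle suggests, not a claim about how such a loop should be formalized.

```python
# Toy model (illustrative assumptions throughout): Aristotle's closed control
# loop of well-being as a discrete feedback simulation. Virtue sets the desired
# aim, prudence proposes the action, the environment perturbs the performance,
# and the resulting happiness feeds back into habituation - with no external
# controlling variable.

def aristotle_loop(steps=50, perturbation=0.3):
    habituation = 0.5   # strength of good habits (moral virtue), in [0, 1]
    prudence = 0.5      # skill at choosing the fitting action, in [0, 1]
    happiness_trace = []
    for _ in range(steps):
        desired_aim = habituation                 # virtue determines what is desired
        chosen_action = desired_aim * prudence    # prudence proposes how to realize it
        performance = chosen_action * (1 - perturbation)  # environment perturbs the act
        happiness = performance                   # happiness = excellent performance
        # Closed loop: success further habituates the agent to virtue and prudence.
        habituation += 0.1 * happiness * (1 - habituation)
        prudence += 0.1 * happiness * (1 - prudence)
        happiness_trace.append(happiness)
    return happiness_trace
```

Run with the default parameters, the loop is self-reinforcing: happiness rises step by step as virtue and prudence grow, bounded by the environmental perturbation, which mirrors the text's point that the loop needs no controlling variable beyond the condition it controls.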
1.2. Dewey’s Technological Re-Interpretation of Well-Being

Aristotle argued that the good life cannot consist in producing something, since productions are not self-sufficient: they are carried out in order to make an artefact that can then be used in other actions or productions. The distinction between self-sufficient actions and non-self-sufficient productions is, however, problematic. Are not at least some actions self-sufficient and productive at the same time? The engineering of a device that helps physically disabled people is, of course, the production of an artefact good for somebody other than the engineer, but it may also be, for the engineer, an action good in itself when it shows the professional at the height of her ability. It is thus necessary to ask, with Ackrill, “Is the difference between an action and a production a difference in what is done, or is it just a matter of how a given performance is described and appraised?” [5, p. 154] Does the engineering of a useful artefact consist of two interwoven processes: an action that, for the engineer, is good in itself, and a production that is good for somebody else? Or is this just one process under two different descriptions, one in terms of self-sufficient actions and the other in terms of productions? If so, can every self-sufficient action, broadly understood, be considered as
producing an effect that goes beyond self-sufficiency? A courageous deed of resistance against a tyrant may, e.g., make another citizen join a partisan movement. Aristotle does not give a clear answer to those questions. Philosophers who belong to the tradition of pragmatism base their thought on an understanding of human action, since the latter is, for them, the fundamental process from which even the most abstract problems of science and philosophy arise. We can thus expect pragmatists to answer the questions posed above to Aristotle. One of the most important pragmatists, John Dewey, described the relation of action and knowledge, which is most relevant to Aristotle’s discussion of the role of prudence in good action, as follows:

Action, when directed by knowledge, is method and means, not an end. The aim and end is the securer, freer and more widely shared embodiment of values in experience by means of that active control of objects which knowledge alone makes possible. From this point of view, the problem of philosophy concerns the interaction of our judgments about ends to be sought with knowledge of the means for achieving them. [8, p. 107]
Consequently, Dewey advocates the anti-Aristotelian thesis that prudent, and thus potentially good, action is the production of value-increasing change in the experienced world.

1.2.1. Technology as Paradigmatic Form of Prudent Action

Dewey himself called his variant of pragmatism ‘instrumentalism’ [8, pp. 3-13]; today’s foremost Deweyan philosopher of technology, L.A. Hickman, calls it ‘productive pragmatism’ [9]. According to Dewey, “Instrumentalism is an attempt to establish a precise logical theory of concepts, of judgments and inferences in their various forms, by considering primarily how thought functions in the experimental determinations of future consequences.” [8, p. 9] Of all instrumental uses of thinking, Dewey considers technology paradigmatic. Its analysis is thus the best way to investigate how humans try to define and attain their aims through the interaction of behaviour and knowledge. Dewey’s concept of technology is very broad: “‘Technology’ signifies all the intelligent techniques by which the energies of nature and man are directed and used in satisfaction of human needs; it cannot be limited to a few outer and comparatively mechanical forms.” [8, p. 24] Dewey’s anti-Aristotelian thesis that prudent action is production can now be made more precise. A prudent action is a production because it is a controlled application of techniques in order to achieve some aim that is of value for the satisfaction of needs. A technique is a practical skill (in the form of a human ability or a function of an artefact) that can be put to use in a particular class of environments. Technology is, most generally understood, the system of rational uses of techniques to transform the world in order to achieve a desired effect. Yet technology also transforms the aims of applying techniques.
This is experienced every day by any human being living in modern society: “Experience now owns as a part of itself scientific methods of discovery and test; it is marked by its ability to create techniques and technologies – that is, arts which arrange and utilize all sorts of conditions and energies, physical and human. These new possessions give experience and its potentialities a radically new meaning.” [8, p. 23] Reason has the function of letting knowledge of how to achieve a particular aim inform not only the choice of actions as means to realize given aims, but also the definition of these aims. Reason is thus itself
the supreme technique: it transforms the totality of human experience, which is structured by the categories of means and ends, in a way that conforms to the best knowledge available. Dewey’s instrumentalism leads to “a sophisticated engineering philosophy of technology” [10, p. 73], since it invites us to think about technology in terms of the methodical design, controlled use, and rational evaluation of techniques. Instrumentalism thus is an excellent conceptual framework for philosophers who want to get productively involved in the design of technical devices. The instrumentalist commitment includes ethical considerations on such artefacts from the very start. Since, for Dewey, technological reasoning deeply affects the purposes for which techniques are used, they cannot be just neutral instruments, if ‘neutral’ means that an instrument is to be judged only by its quality of service. Instead, techniques are, as a contemporary commentator puts it, “morally neutralizing” [11, p. 22]. They can be decoupled from the particular purpose of each single context of application and used for very diverse purposes. This process also affects the ethical evaluation of a technical device. The assessment of a particular use of a particular artefact may be quite uncontroversial against the backdrop of the moral requirements a particular group of its users expects to be met. But since a successful device can also be used by other users in other contexts, its general assessment may be very controversial, so that the pseudo-solution of letting divergent moral standards cancel each other out becomes tempting.

1.2.2. Habits as Instrumental Components of Technology

Against the moral neutralization of technology in general, and of technical devices in particular, Dewey’s instrumentalism proposes to start a deeper analysis of how techniques are developed, used, and transformed, and of how the ethics of well-being is inextricably involved in these processes.
To describe Dewey’s approach to those problems, I shall start by showing how he re-interpreted Aristotle’s concept of habit in order to reconstruct our understanding of technology. ‘Reconstruction’ is a technical term in Dewey’s philosophy. It means “to carry over into any inquiry into human and moral subjects the kind of method (the method of observation, theory as hypothesis, and experimental test) by which understanding of physical nature has been brought to its present pitch.” [12, p. 10] Dewey held that the most important moral subject of modern life, for the reconstruction of which philosophy has to produce new conceptual instruments, is the pervasive introduction of the results of science and technology into contexts of everyday life that are not prepared for such a modernization. What actions lead to well-being depends, Aristotle thought, on the habits of the human concerned; habits are acquired patterns of action by which the individual character is formed (see Section 1.1). Into this Aristotelian conception of well-being Dewey introduces the idea that habits are themselves results of what Aristotle would call ‘productions.’ The pragmatist thus breaks down Aristotle’s principal distinction between actions and productions: any action is, first, at least one habit carried out and, second, an instrument to produce new habits. Habits are means, or instruments, as well as ends, or products, of the control of action. More specifically, Dewey defined habit as

that kind of human activity which is influenced by prior activity and in that sense acquired; which contains within itself a certain ordering or systematization of minor elements of action; which is projective, dynamic in quality, ready for overt manifestation; and which is operative in some subdued subordinate form even when not obviously dominating activity. [13, p. 33]
Each habit is a technique for coping with a particular type of environment, to which it responds, as a method of acting, in order to attain a particular type of aim. To understand a habit, both the acting individual and the context in which she acts must be considered. Her habits filter what she perceives in her environment and constrain how she acts on the environment. What habits an individual acquires is determined by her genetic endowment and by the environment in which she has been living. Habits result from, and shape, the interaction of a human and her environment: they are, in two ways, relative yet objective. Dewey thus subscribes to the objective relativism of Aristotle’s practical philosophy (see Section 1.1). Concepts such as virtue and prudence, Dewey thinks, as Aristotle did before him, refer to aspects of habit. But in contradistinction to Aristotle, who claimed that human reason cannot be completely understood as a function of acting, for Dewey habits “constitute the self” [13, p. 28] up into the highest forms of abstract reasoning. There is no human faculty that is not grounded in habit-formed and habit-forming actions. From the instrumentalist perspective on human action, habits are the essential instruments that shape, and are shaped by, productions. Hickman summarizes Dewey’s general concept of instrument as follows:

Dewey’s instruments are conditional, general, and final. They are conditional in the sense that they are available for use if the proper situation presents itself. Like tools in a toolbox, it is not that they must be used, but they are available for use. They are general because they are applicable to whole classes of situations, and those classes are defined by and further refine their associated tools. [...] They are final because even though they may operate as signs of something further, there is no requirement that they do so: they terminate in action that is satisfactory, and that is all that can be asked. [9, p. 253f]
These three properties of instruments can be used to further detail Dewey’s concept of habits as techniques that constitute the instrumental components of technology.
• The conditionality of habits as instruments means that habits, though they are context-dependent in two ways, must not be understood in terms of simple stimulus-response reactions. Except for very inflexible habits, which can be regarded, in the extreme case, as symptoms of obsessive-compulsive disorder, the realization of a habit is not automatically triggered by an environment. Although the habits of a human may pre- or unconsciously influence any of her actions, the rational application of a habit in order to attain a preferred end in a particular situation can be reconstructed as the result of a choice from a set of individually disposable habits. Being habituated to choosing successful habits is to be skilful in using one’s own reason as a technique of action selection. The more a human has organized her habits rationally, the more prudent she has become: she has been developing an individual technology of autonomously defining and successfully attaining aims on the basis of her knowledge about which habits are objectively available to her.

• The generality of habits as instruments, too, refers to their context-dependence. Generality means that, on the one hand, habits are types of action sequences which are applicable in a particular class of environments. On the other hand, no habit is universal in the sense of being a type of action sequence which achieves a particular type of aim in every conceivable environment.
That a habit is neither a particular action sequence for a concrete environment nor a universal type of action sequence for all types of environments, but a general type of action sequence for a particular class of environments, leads to a new view on the problem of the moral neutralization of technical artefacts (see Section 1.2.1). It can now be seen that this problem arises already in the ethical consideration of habits, since they, too, can be decoupled experimentally from a particular use in a single context of action and, by imitation, from a particular individual who has used a habit. The more successfully a habit spreads through a population, the more controversial its general assessment may become. Dewey was thus very sceptical about the possibility of universal, or even general, rules of habit assessment, if such rules are to prescribe particular substantive, and not functional, moral standards. Instead, he searched for “the logic of individualized situations, each having its own irreplaceable good and principle,” and tried “to transfer the attention of theory from preoccupation with general conceptions to the problem of developing effective methods of inquiry.” [12, p. 137] This is also valid for the assessment of technical devices, and it points to the Deweyan insight that a technical device is nothing but an objectified habit.

• The finality of habits as instruments means that actions as actualized habits can be self-sufficient, and thus good from an Aristotelian perspective, even if they are used to produce something, so that, as Hickman remarks above, they may operate as signs of what they produce. It was also already noted above that habits are means as well as ends of action. To this dual nature of habits the finality of habits adds that one and the same application of a habit can be considered both a means to produce something and an end in itself.
Here, Dewey’s anti-Aristotelian stance, which was described in Section 1.2.1, makes itself felt again: actions are productions, and vice versa. Any action can be considered both a means and an end. To distinguish between them is to draw a methodological, not an ontological, distinction:

Means and ends are two names for the same reality. The terms denote not a division in reality but a distinction in judgment. [...] ‘End’ is a name for a series of acts taken collectively – like the term army. ‘Means’ is a name for the same series taken distributively – like this soldier, that officer. To think of the end signifies to extend and enlarge our view of the act to be performed. It means to look at the next act in perspective, not permitting it to occupy the entire field of vision. To bear the end in mind signifies that we should not stop thinking about our next act until we form some reasonably clear idea of the course of action to which it commits us. To attain a remote end means on the other hand to treat the end as a series of means. [...] Only as the end is converted into means is it definitely conceived, or intellectually defined, to say nothing of being executable. [13, p. 31]
The development of a habit is a complex process of experimenting with the dual character of actions. An action sequence can be tested, as an instrument and possible future habit, as to whether it usually attains some particular aim. And an action sequence can be explored, as an end and at least hypothetically acquired habit, as to what conditions must be met if its use is to be sensible. Substitute ‘technical device’ for ‘action sequence’ in the last two sentences, and it should become clear that they are valid also for the development and
testing of artefacts. This is again pointing to the Deweyan insight that technical devices are objectified habits.

1.2.3. Well-Being as Controlled Growth of Habits

Compared to Aristotle’s account of what well-being means, how it is related to action, and in what way desire, reason, choice, and action must play together to develop good habits and prudence (see Figure 1), Dewey’s account of well-being is also thoroughly functionalist and can be described in cybernetic terms (see Figure 2). Differences between Aristotle and Dewey are discernible in the concrete interpretation of the closed control loop that underlies well-being.
Figure 2. Dewey’s instrumentalist account of well-being depicted as closed control loop.
In Aristotle’s version, virtue and prudence shall regulate, via the morally and rationally informed choice of means and aims, the action of a human being so that she becomes happy, i.e. performs well those of her possible actions she is able to carry out best. Dewey takes seriously Aristotle’s insight that both virtue and prudence are habits, and considers the general concept of habit the most comprehensive characterization of what regulates human action. For Dewey, habits regulate action by adapting the individual to its environment. The pragmatist radicalizes Aristotle’s idea of the relativity of habit. If the use of habits can be understood only with respect to an individual and an environment, it must be analyzed as to how it changes the relation of that individual to that environment.

To get a rational basis for moral discussion we must begin with recognizing that functions and habits are ways of using and incorporating the environment in which the latter has its say as surely as the former. [...] All virtues and vices are habits which incorporate objective forces. They are interactions of elements contributed by the make-up of an individual with elements supplied by the out-door world. [13, p. 24f]
However, this description just makes clear that habits are types of interaction between individuals and environments. But why does Dewey substitute adaptation for Aristotle’s choice? To explain why habits control the concrete actions of a human by regulating her adaptation to the environment, it is necessary to specify a general criterion for successful interaction, or a general purpose of the control of future action. Since Dewey was very sceptical about the possibility of general rules of habit assessment, if such rules are to prescribe substantive moral standards (see Section 1.2.2), the necessary criterion must be of a functional nature. For Aristotle, the criterion of good choice was whether the selected action and purpose contribute to the eudaimonia, i.e. happiness, of the human concerned. Dewey’s version of eudaimonia is the growth of habits. It consists in making one’s own habits more and more differentiated and flexible, so that they can better fit themselves to the singular characteristics of actual and future environments. Acquiring new habits that, e.g., are better suited to cope with a new type of environment than the old ones is also a form of growth, but can be seen as a variety of making one’s habits more flexible: it requires a habit of learning, i.e. being able to self-differentiate one’s habits. A more traditional way of describing the growth of habit is to speak of an increase of individual liberty: “Freedom for an individual means growth, ready change when modification is required.” [12, p. 161] Freedom by adapting oneself to the environment is, of course, not a pure acceptance of the environment as it is at a given time. On the contrary, the growth of habit most often requires actively changing the conditions of one’s own actions by transforming one’s environment: “Happiness is found only in success; but success means succeeding, getting forward, moving in advance. It is an active process, not a passive outcome.
Accordingly it includes the overcoming of obstacles, the elimination of sources of defect and ill.” [12, p. 143] The more the habits of a human allow her to adapt proactively to her environment, the better she carries out her prudent action and realizes the desired aim, and the happier she is. This, in turn, more strongly habituates the human to acting morally, using her reason, and adapting to her environment. As was the case for Aristotle’s ethical control loop, Dewey’s version of it has no need for an external controlling variable besides the internal condition to be controlled: growth, considered as the excellent functioning of the whole closed control loop.

The end is no longer a terminus or limit to be reached. It is the active process of transforming the existent situation. Not perfection as a final goal, but the ever-enduring process of perfecting, maturing, refining is the aim in living. Honesty, industry, temperance, justice, like health, wealth and learning, are not goods to be possessed as they would be if they expressed fixed ends to be attained. They are directions of change in the quality of experience. Growth itself is the only moral ‘end.’ [12, p. 141]
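Dewey’s growth loop, too, can be caricatured computationally. The sketch below is my own illustrative construction, not anything Dewey proposed: habits are modelled as points in a one-dimensional space of situations, a habit adapts toward the situations it copes with (flexibility), a poor fit differentiates a new habit (self-differentiation), and ‘growth’ is read off as the widening coverage of possible environments by the habit repertoire. Every numeric choice is an assumption made for illustration only.

```python
# Toy model (illustrative assumptions throughout): the growth of habits as a
# closed control loop. Habits are points in [0, 1]; each step a situation
# arrives, the best-fitting habit is used and adapted, and a poor fit triggers
# the differentiation of a new habit. "Growth" = widening environmental coverage.
import random

def dewey_growth(steps=200, seed=0):
    rng = random.Random(seed)
    habits = [0.5]                      # an initially undifferentiated repertoire
    coverage_trace = []
    for _ in range(steps):
        env = rng.random()                               # a new situation
        habit = min(habits, key=lambda h: abs(h - env))  # choose the best-fitting habit
        fit = 1.0 - abs(habit - env)
        # Flexibility: success pulls the used habit toward the situation.
        idx = habits.index(habit)
        habits[idx] += 0.2 * fit * (env - habit)
        # Differentiation: a poorly fitting repertoire grows a new habit.
        if fit < 0.7:
            habits.append(env)
        # Coverage: mean fit over a grid of possible environments.
        coverage = sum(
            1.0 - min(abs(h - e / 10) for h in habits) for e in range(11)
        ) / 11
        coverage_trace.append(coverage)
    return habits, coverage_trace
```

As in the Aristotelian sketch, there is no external controlling variable: the repertoire’s own growing coverage is both what is controlled and what drives further differentiation, a crude rendering of “growth itself is the only moral ‘end.’”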
The self-sufficiency of growth completes Dewey’s instrumentalist conception of well-being as the continuous development of habits by flexibly incorporating, and proactively changing, dynamic environments. Is Deweyan pragmatism a good instrument for a philosopher to discuss the use of state-of-the-art techniques in the everyday environment of modern societies? It was the principal aim of Dewey’s programme of reconstruction to develop a general but specifiable framework for asking questions about the use of technology that supports the well-being of humans (see Section 1.2.2). To test whether we can use Dewey’s
ideas to analyze the socio-technical world of today, we must now turn to the discussion of a particular class of techniques that may immensely influence our habits in tomorrow’s world: Pervasive Computing.
2. Physical Information Spacetime and the Distinction of Weak versus Strong Pervasive Computing

Sometimes, seemingly innocuous names of technical systems, or technologies, betray important conceptual issues. Although the expressions ‘Ubiquitous Computing’ and ‘Pervasive Computing’ are usually regarded as synonyms (e.g., [14]-[16]), it may be more appropriate to let them denote technological research programmes of different scale. Taking the general meaning of ‘ubiquitous’ and ‘pervasive’ [17] as a basis, ‘Ubiquitous Computing’ ought to refer to the project of making computing services available everywhere. ‘Pervasive Computing’ ought to mean that information processing devices diffuse into the world and permeate ever more interactions of humans and their environments, so that a new type of social medium comes into being: Physical Information Spacetime (PhISt), which will finally saturate human experience. ‘Pervasive Computing’ (PerComp) is thus better suited to express the ambitious intention of making a new phase in the history of technological culture possible [18], albeit this expression was originally used in a less radical sense than ‘ubiquitous computing’ [19]. It is still the case that the latter expression is often used, confusingly, to denote more user-centred academic research, while ‘Pervasive Computing’ is, misleadingly, reserved for more network-oriented industrial research. PhISt can, for our purposes in this paper,4 be described as the medium in which the interaction between humans and systems of PerComp happens. As to physics, PhISt is an ordering of events according to their succession, i.e. a temporal structure, and an ordering of events according to their juxtaposition, i.e. a spatial structure. As to information, the processes happening in PhISt are bearers of meaning, like the processes in human communication through language. But how can we describe such meaningful, or informational, physical processes?
Dewey made the following proposal:

Because meanings and essences are not states of mind, because they are as independent of immediate sensations and imagery as are physical things, and because nevertheless they are not physical things, it is assumed that they are a peculiar kind of thing, termed metaphysical, or ‘logical’ in a style which separates logic from nature. But there are many other things which are neither physical nor psychical existences, and which are demonstrably dependent upon human association and interaction. Such things function moreover in liberating and regulating subsequent human intercourse; their essence is their contribution to making that intercourse more significant and more immediately rewarding. [13, p. 61f]
The ‘things’ Dewey points at are rules, or conventions, that are established in order to help develop, coordinate, and transform habits: “Meanings are rules for using and interpreting things; interpretation being always an imputation of potentiality for some
4. I cannot here discuss fundamental information-theoretic problems of how to formally define, in PhISt, ‘dimension,’ ‘point,’ ‘metrics,’ ‘interval,’ ‘displacement,’ ‘causality,’ ‘identity,’ etc. Systematically, the theory of PhISt, which must contain such definitions, lies between traditional socio-psychological descriptions of human-technology interaction and (meta-)physical discussions of the nature of spacetime.
consequence.” [13, p. 58] Applied to PhISt, this means that the social medium of interaction between humans and PerComp systems is the structure consisting of rules that control the interaction of humans and PerComp systems by making physical processes – observable spatiotemporal changes of humans and artefacts – meaningful. This is achieved by letting these physical processes represent particular consequences that result from them, for the future behaviour of humans as well as of PerComp systems. Both sides must take these consequences into account when they are planning their further interaction. Now a weak form of PerComp can be distinguished from a strong one by specifying their technological aims. If a PerComp project just tries to enhance the universal accessibility of information geared to the needs of individual users, it belongs to the research programme of weak PerComp. Its ultimate aim is to deploy a global-yet-personal information system (Section 2.1). If a PerComp project tries to make the whole instrumental habitat of humans a communicative environment, it belongs to the research programme of strong PerComp. Its ultimate aim is to engineer PhISt (Section 2.2).

2.1. Weak Pervasive Computing: Deploying the Global-yet-Personal Information System

Among the contemporary technological trends that greatly improve the efficiency of communication and access to information are [20-22]:

• the construction of smart devices with embedded computers,
• the convergence of digital information technologies,
• the integration of computer and communication networks on all scales,
• the standardization of data exchange formats and protocols for information transmission,
• the optimization of human-machine interfaces for ease of use,
• the personalization of digital services,
• the increase in flexibility of collaboration between users, devices, and services,
• the development of proactive software agents, and
• the functional diversification of information appliances.
Altogether, these trends are bringing forth a new quality of human-machine interaction by embedding mobile individuals in a universal information network. The underlying technological paradigm has appropriately been called, by J.R. Gurd and C.B. Jones, “the Global-yet-Personal Information System (GyP∫IS)” [23].5 It is global because “information stored in the totality of all physically distributed repositories should be potentially accessible, subject to provision of the physical means to access the raw data” [23, p. 134], and it is personal because “information should be structured in a
5. The intellectual origin of weak PerComp can be traced back at least to J.C.R. Licklider’s concept of human-computer symbiosis, which he saw as “an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking.” [24, p. 4]
S. Artmann / Well-Being in Physical Information Spacetime
way which facilitates access and individuals should be able to obtain access in a way which fits their stated preferences” [23, p. 134]. A user-centred view on the interaction of humans and technical devices of GyP∫IS suggests itself; it focuses on enabling creativity in individual users by collecting information, communicating with others, developing something new, and disseminating innovations to help others [25, ch. 5]. A conceptual framework in which questions of GyP∫IS design may be discussed productively is information foraging theory, an application of a biological optimization theory to human information search in GyP∫IS [26]. Any technological research programme that contributes to the deployment of GyP∫IS belongs to PerComp. If, however, the programme accepts the traditional divide between physical spacetime, in which a human makes use of, e.g., a mobile internet appliance, and information spacetime, in which the appliance works for the human, then the research programme is an example of weak PerComp. This is the case, e.g., whenever making computational services ubiquitous is considered the ultimate aim. In the context of weak PerComp, any concept which resembles that of PhISt refers essentially to networks of computing devices that are available for users in particular environments (see, e.g., the Human Information Space project and its understanding of PhISt as “our view into the electronic information space” [27, p. 74]). The first phase of M. Weiser’s project of ubiquitous computing, “to construct, deploy, and learn from a computing environment consisting of tabs, pads, and boards” [28, p. 76f], takes such a point of view – but only to leave it behind in later phases, advancing towards an ever increasing permeability of physical spacetime and information spacetime.

2.2. Strong Pervasive Computing: Engineering Physical Information Spacetime

The success of weak PerComp is a necessary but not sufficient condition for the mutual saturation of physical spacetime and information spacetime. It is the ultimate aim of strong PerComp to push ahead with this process through which PhISt is engineered. Trends that start where weak PerComp ends, and that lead closer and closer towards strong PerComp’s aim, are:

• the disappearance of the difference between information processing hardware and other material objects, resulting in “a physical world richly and invisibly interwoven with sensors, actuators, displays, and computational elements, embedded seamlessly in the everyday objects of our lives and connected through a continuous network” [29, p. 694],
• the vanishing of the distinction between information processing and other processes in the physical world by establishing “embodied virtuality,” where “the ‘virtuality’ of computer-readable data – all the different ways in which they can be altered, processed and analyzed – is brought into the physical world” [30, p. 3],
• the increase of contextual variation in the interaction between humans and information processing objects by automatically taking physical and informational attributes of explicitly or implicitly interacting systems and their environments into account (“context-aware computing” [31]),
• the transition from computer systems constructed in order to adequately reflect the given physical world, to information processing objects enabled to autonomously control their environments on the basis of information about actual and desired physical world-states in order to achieve “dynamic reoptimization” [32, p. 41] of relevant parameters of these world-states, and
• the change of engineering paradigm from information-generating self-control of technical systems (in order to context-dependently select a behaviour against the backdrop of a set of possible behaviour types) to rule-generating self-control of technical systems (in order to agent-dependently create new behaviour types) [33].
In systems of strong PerComp, humans do not interact, as deliberate users, physically and communicatively over well-defined stretches of time mainly with information devices they can easily identify. In such systems, humans interact, as inhabitants, physically and communicatively with the space surrounding them, potentially over the whole time in which they are living in that space. The latter is not just a physical surrounding anymore but also, in its near-totality, a sender and receiver of information, so that it can react physically and communicatively to humans. Talking sensibly about systems of strong PerComp thus requires a comprehensive concept of system that refers not only to technical artefacts, such as computers, sensors, and actuators, but also to explicit and implicit rules of behaviour, or conventions, which guide engineers in developing and implementing these artefacts, as well as users in becoming acquainted with and personalising them (see [34] for a description and application of such a comprehensive concept of system by a historian of technology). The interpenetration of physical spacetime and information spacetime in strong PerComp systems leads naturally to a widespread use of geometrical, topological, and architectural concepts for their description. One well-known instance is to call strong PerComp ‘ambient intelligence’ [35]. Another example is the engineering of so-called ‘smart spaces’ by “embedding computing infrastructure in building infrastructure” for, e.g., the “automatic adjustment of heating, cooling, and lighting levels in a room based on an occupant’s electronic profile” [36, p. 11]. Adam Greenfield invented the name ‘everyware’ for strong PerComp, and writes that “everyware can be understood as information processing dissolving in behaviour” [37, p. 26]. He adds that “everyware isn’t so much a particular kind of hardware or software as it is a situation” [37, p. 31], and that it “is always situated in a particular context.” [37, p. 72] Malcolm McCullough, an expert in architecture, proposes to call the design orientation of strong PerComp ‘context-centred’:

To begin, let ‘setting’ describe objective, a priori, space. ‘Context’ is not the setting itself, but the engagement with it, as well as the bias that setting gives to the interactions that occur within it. ‘Environment’ is the sum of all present contexts. According to the cognitive principles laid out thus far, environment is not an other, or an empty container, but a perception of persistent possibilities for action. [38, p. 48]
Against the backdrop of Aristotle’s ethics and Dewey’s instrumentalism it does not come as a surprise that McCullough considers, on the one hand, the perception of, and active transformation of, contexts as being shaped by habits [38, p. 53]. On the other hand, the design of a context continuously influences the probability that particular actions will happen in that context, so that the designer should be interested in developing a “typology of situated interactions” [38, p. 118]. Architecture is, to a great extent, nothing else but “interaction design” [38, p. 172]. It tries to build up persistent structures in which the abilities of humans to act well shall be supported effectively by
allowing them to grow successful habits that may even be transmitted, as traditions, from generation to generation. How can we describe and analyze, from a context-centred perspective, the origin and development of habits and traditions in systems of strong PerComp that shall support the well-being of humans? A first step consists in taking seriously Dewey’s insight that techniques, including technical artefacts, are not only instruments but also ends, since they are (objectified) habits (see Section 1.2.2). For the relation of humans and strong PerComp systems this means that the latter can influence the aims their human users are trying to attain by using them. To analyse such a complex process of bi-directional influence, Dewey proposed a form of description that is based on the concept of transaction. He defined transaction as a form of analysis of processes “where systems of description and naming are employed to deal with aspects and phases of action, without final attribution to ‘elements’ or other presumably detachable or independent ‘entities,’ ‘essences,’ or ‘realities,’ and without isolation of presumably detachable ‘relations’ from such detachable ‘elements.’” [39, p. 101f] To analyse a strong PerComp system from a transactional perspective means to conceive it as part of a process that is determined by the common evolution of two interdependent aspects: first, the rules of behaviour that shall help attain particular ends by entering into a relation to the system, and second, the ends that motivate setting up and following those rules. In short, it means to conceive the structure of rule-based and rule-changing transactions between humans and strong PerComp systems as a PhISt, a physical information spacetime.
Taking up the cybernetic description of Dewey’s theory of well-being (see Figure 2), the transaction of humans and strong PerComp systems can be depicted as a relation between two control loops: one that underlies human action and one that underlies the behaviour of a strong PerComp system (see Figure 3). The elements of the latter control loop are functional equivalents of those of the former. The programs carried out in strong PerComp systems correspond to habits, the regulators of human behaviour. Whereas habits regulate, from Dewey’s point of view, the adaptation of humans to their environments, programs regulate how systems of strong PerComp adapt to the humans that are living with such systems. The systems’ adaptive behaviour has a clearly defined aim: to support the humans transacting with them. The controlled systems of both control loops are actions: in one case, of humans, in the other, of technical artefacts. The mutual relation between the actions of humans and those of artefacts is to be considered a transaction in the Deweyan sense introduced above. The condition controlled via transactions is, as to the human, the growth of her habitual adaptation. As to the strong PerComp system, it is the continuity of its programmed support of the human’s adaptation. This means that the artefact shall, as steadily, smoothly, and flexibly as technically possible, support the human in letting her habits grow. If the transaction of humans and strong PerComp systems leads to the establishment of conventions of behaviour that are followed by both and support the well-being of the human, her growth of habitually regulated adaptation, it is ethically advisable to use the strong PerComp system in question. In a sense, such a system then helps the human realize her personal morality that strives towards a better control of her behaviour.
For Dewey, “morality becomes legitimately subjective or personal when activities which once included objective factors in their operation temporarily lose support from objects, and yet strive to change existing conditions until they regain a support which has been lost.” [13, p. 37] To support this regain is the principal task of strong PerComp systems: they are objective factors of the environments in which humans try to regain
lost support for their habits, and help the humans to change the actual conditions of their habitual actions.
[Figure 3 depicts two coupled control loops, linked by a transaction between their controlled systems. Human loop: HABITS (regulator), ADAPTATION (regulating variable), ACTION (controlled system), GROWTH (controlled condition). Strong PerComp loop: PROGRAMS (regulator), SUPPORT (regulating variable), ACTION (controlled system), CONTINUITY (controlled condition).]
Figure 3. Transaction between human (upper control loop) and strong PerComp system (lower control loop).
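The correspondence between the two loops can be sketched as a toy simulation. The sketch is purely illustrative: the class names, the numeric gains, and the scalar quantities are our own inventions, not part of the chapter's cybernetic model.

```python
# Illustrative toy model of the Figure 3 schema (all names and numbers
# are our own): two control loops, coupled by a transaction between
# their actions.

class ControlLoop:
    """A regulator sets a regulating variable so that the controlled
    condition approaches a target value."""
    def __init__(self, regulator, target=1.0):
        self.regulator = regulator  # 'habits' or 'programs'
        self.target = target        # desired controlled condition
        self.condition = 0.0        # GROWTH resp. CONTINUITY
        self.variable = 0.0         # ADAPTATION resp. SUPPORT

    def step(self, transaction_effect):
        # Regulator: set the regulating variable from the current error.
        self.variable = 0.5 * (self.target - self.condition)
        # Controlled system (ACTION): the condition responds both to the
        # loop's own regulating variable and to the other loop's action.
        self.condition += 0.1 * self.variable + 0.1 * transaction_effect

human = ControlLoop("habits")     # condition: growth of adaptation
system = ControlLoop("programs")  # condition: continuity of support

for _ in range(50):
    # Transaction: each loop acts on the other via its current variable.
    h, s = human.variable, system.variable
    human.step(transaction_effect=s)
    system.step(transaction_effect=h)

# Both conditions settle near their targets: the system's programmed
# support and the human's habitual growth stabilize each other.
print(round(human.condition, 2), round(system.condition, 2))
```

In this toy dynamics each loop converges faster with the other loop's support than it would alone, which is all the schema of Figure 3 asserts: the two regulators succeed jointly, through the transaction, rather than separately.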
3. The Role of Intentions in Communication

Aristotle’s and Dewey’s theories of well-being speak against the idea that a strong PerComp system ought to prescribe to a human a pattern of action that it holds to be good for her. On the contrary, the system has to recognize the intentions of the human’s actions, and to help her execute her intentions if it considers them realizable in the context it is able to influence. The more systems of strong PerComp pervade modern technological culture, the more urgently the need will make itself felt to describe, in terms of a general model of intelligent agents, technical artefacts that continuously support the growth of human habits [40]. If this is shown to be possible, the transaction between humans and strong PerComp systems can be represented from a unified theoretical perspective that considers both transacting parties as species of the genus intelligent agent. It is impossible, in this paper, even to sketch the ontology of intelligent agents. Instead, we accept the quite unspecific definition of an agent as “a kind of physical
object that can act in the environment, perceive events, and reason” [41, p. 339; italics suppressed], and of an intelligent agent as “a kind of agent that is required to be reactive, proactive, and social.” [41, p. 342; italics suppressed] Both definitions allow us to further follow Dewey’s pragmatic direction of thought, which emphasizes the adaptive (reactive and proactive) function of habitually controlled action, and makes it clear that “our objective test of the presence or absence of intelligence is influence upon behaviour.” [8, p. 126] Behaviour can be influenced in two different ways. First, the aims of actions must be defined. Second, actions that are expected to attain defined aims must be chosen. For an intelligent agent, its actions then have the purpose of achieving its goals. It intends to realize its ends by means of its actions. It thus seems uncontroversial that intelligent agents must be regarded as acting according to intentions. For an external observer, the behaviour of an agent, if it is to be judged intelligent, must be interpretable as following intentions to reach some goals. Since an agent very often does not communicate its intentions explicitly to other agents, a second agent that is transacting with the first one must be able to recognize the intentions of the first agent by observing and interpreting its behaviour. This is also the case in the transaction of a human and a strong PerComp system: the technical artefact must be able, if it shall support the human, to recognize her intentions by observing and interpreting her behaviour. Intention recognition is thus one of the key problems to be solved by strong PerComp. So the question of what intentions are must be addressed. In order to define what an intention is, the more general concept of intentionality has to be discussed first (Section 3.1).
An intention realizes a particular form of intentionality that combines internally accessible and externally observable information transmission in a certain manner (Section 3.2).

3.1. Intentionality as a Form of Evaluating Communication

The concept of intentionality is of utmost importance in the philosophy of mind and AI. In the very first proceedings volume on AI, J. McCarthy [42, p. 181] proposed the following criterion for considering a machine to be thinking (and behaving rationally):

Perhaps, a machine should be regarded as thinking if and only if its behavior can most concisely be described by statements, including those of the form “It now believes Pythagoras’ theorem.” This would not be the case for a phonograph record of a proof of Pythagoras’ theorem. Such statements will probably be especially applicable to predictions of the future behavior under a variety of stimuli.
Believe, but also doubt, know, see, etc. are intentional predicates. An intentional predicate IP modifies, as its object, a proposition P (e.g., Pythagoras’ theorem). It does so by specifying the epistemic relation ER of an agent A, which is IP’s subject, to P. Epistemic relations are often less formally called ‘mental states’ or ‘mental attitudes’ [43]. ‘Attitude’ is more appropriate than ‘state’ because all epistemic relations have in common that they show a kind of directedness: an epistemic relation is, like an attitude, about, or directed towards, something (namely, a proposition). A Latin word for the state of being directed towards something is intentio, so philosophers of mind usually call this quality of epistemic relations ‘intentionality’ [44]. Formally, an intentional predicate IP is part of a meta-proposition MP about a proposition P. An MP represents a state of an agent A. This state is characterized by an epistemic relation ER of A to P. The sentence “I know that one plus one equals two,”
says about the agent denoting itself as ‘I’ that it has entered into the epistemic relation knowing to the arithmetical proposition ‘1+1=2.’ Thus, the sentence “I know that one plus one equals two” expresses a meta-proposition about an agent that has at least the proposition ‘1+1=2’ at its disposal. From an information-theoretic point of view, which abstracts from particular types of agents, intentionality may be described even more generally (see Figure 4). Following an idea of J. Doyle [45, p. 249], intentionality can be considered a way to evaluate processes of information transmission, or communication, from the world outside an agent to this agent (which is, of course, a part of that world). Any intentional predicate IP generally determines, first, a direction of communication. The agent A takes a particular mental attitude ER towards a proposition P, which contains information I about the world. So I must have been received by A logically before A enters into ER. A is thus considered an information destination at the receiving end of a channel through which I has been transmitted to A. Second, a particular IP (such as believe) specifies the quality of P by evaluating the process of transmitting I to the agent A. A then is in the state of having a particular ER (such as believing) to P. Doubting P indicates a rather low quality of the transmission of I, believing P a medium quality, and knowing P quite a high quality. In short, intentionality is a way of evaluating communication from the perspective of its destination.

[Figure 4 shows information being transmitted from the WORLD to the AGENT: a fact, referred to by a proposition carrying information, is transmitted to the agent, whose epistemic relation to the proposition – named by an intentional predicate – evaluates the transmission.]
Figure 4. The information-theoretic structure of intentionality.
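The formal apparatus just introduced (IP, ER, A, P, MP) can be encoded in a short sketch. The encoding below, including the numeric quality scale, is entirely our own illustration and is not part of the chapter's formalism:

```python
# Illustrative sketch (our own encoding): an intentional predicate IP
# relates an agent A to a proposition P, yielding a meta-proposition MP
# that evaluates the transmission of the information I contained in P.
from dataclasses import dataclass
from enum import Enum

class IP(Enum):
    """Intentional predicates, ordered by the quality they assign to
    the transmission of I (doubt = low, believe = medium, know = high)."""
    DOUBT = 1
    BELIEVE = 2
    KNOW = 3

@dataclass(frozen=True)
class MetaProposition:
    agent: str          # A, the information destination
    predicate: IP       # names the epistemic relation ER of A towards P
    proposition: str    # P, carrying information I about the world

    def transmission_quality(self) -> int:
        # Intentionality as evaluation of communication, seen from
        # the perspective of the destination A.
        return self.predicate.value

# "I know that one plus one equals two" as a meta-proposition:
mp = MetaProposition("I", IP.KNOW, "1 + 1 = 2")
print(mp.transmission_quality())  # know assigns the highest quality, 3
```

The point of the encoding is only that the same proposition P can appear under different predicates, and that the predicate, not P itself, carries the evaluation.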
3.2. Intentions as Evaluators of Future Acts of Communication

Intentions constitute a particular type of mental attitudes [46]. The intentional predicate intend can, as any other intentional predicate, be part of a meta-proposition MP about a proposition P. The MP represents a state of an agent A that is characterized by the epistemic relation intending of A to P. What distinguishes intentions from the intentional predicates discussed in Section 3.1 is that if a proposition P is to be evaluated as an object of the IP intend, P must be about possible future facts. The sentence “I intend to go to the refrigerator” contains meta-information about the relation of the agent A symbolized by ‘I’ to the proposition ‘I go to the refrigerator,’ which is about a process in the future.6 Due to their directedness towards future facts,
6 This process may already have begun, or may begin just in the moment the sentence is uttered. Otherwise, a so-called ‘present-directed intention,’ which shall be recognized in our small-scale scenario (see Section 6), would be a contradictio in adjecto.
mental attitudes, such as intention, want, and desire, have been called ‘motivational attitudes’ in contrast to ‘informational attitudes,’ such as knowledge and belief [43]. Yet from the abstract information-theoretic perspective inspired by Doyle [45] (see Section 3.1), motivational attitudes, too, play an evaluative role with regard to processes of communication between an agent and the world. Compared to informational attitudes, motivational attitudes are, however, more restrictive with respect to information sources and more complex with respect to the information channels they involve. First, motivational attitudes are more restricted concerning the source of the transmitted information I that is contained in the proposition P. This source is the agent A itself. P is about future states of A (e.g., to be sated) or future states of the world (e.g., A standing before the refrigerator), and these future states are accessible to A as internal representations. In this respect, the intentional predicate intend generally determines the same direction of communication as, e.g., the intentional predicate believe. Since the agent A takes the motivational attitude intention towards a proposition P, which contains information I about the future, I must have been received by the evaluating component of A from the future-representing component of A before the evaluating component of A intends that P. Thus, the future-representing component of A is considered an information source at the sending end, and the evaluating component of A an information destination at the receiving end, of a channel through which I has been transmitted. Second, a particular motivational attitude (such as desiring and intending) characterizes P by evaluating the process of realizing the state of the agent A or of the world that would make P true.
The intended future behaviour of A can then be considered a transmission, to other agents, of the information I that is contained in the propositional object P of A’s intention. A evaluates future possible states of itself or the world according to their agreement with A’s preferences of what should come about. Correspondingly, the quality of A’s behaviour as communication to other agents depends on the degree of agreement of P with A’s preferences. An intention indicates a higher grade than a desire, since – according to M.E. Bratman’s planning theory [46], which is widely used in AI (e.g., [47] and [48]) – intentions are decisions about what to do that commit the agent much more strongly than desires do. From an information-theoretic perspective, the intentionality of motivational attitudes, too, is a way to evaluate communication, but now by an agent A behaving also as an information source (see Figure 5). An intention realizes a particular form of intentionality that combines, first, internally accessible information transmission (from the future-representing component of A to the evaluating component of A) and, second, externally observable information transmission (from A to other agents). By restricting the information source of the first information transmission to agent A and by regarding the behaviour of A as a second process of information transmission, motivational attitudes bring the self-forcing [45, p. 250] quality of mental attitudes more clearly to light than informational attitudes do. Doyle [49, p. 65f] explains his idea that agents with intentions force themselves to bring themselves into a particular desired state when he describes intelligent agents as self-programmers:

As its own programmer, the agent needs some way of guiding its adaptations, some way of stating its intentions about the relations of parts of its mental structure at single instants and over time so that it can modify itself to satisfy these intentions.
[…] The agent modifies itself so that its new state still satisfies (is a model of) the set of self-specifications. But since the machine is doing this revision itself with only its current state to work with, the machine’s
interpretation of these self-specifications cannot refer to external affairs, but must be narrowly interpreted specifications referring only to parts of the agent itself. In classical terminology, artificial-intelligence programs are not merely rule-obeying machines, but are rule-following machines, interpreters of the rules, and as such must be able to grasp the import of their rules.
Such a view on intelligent agents completely matches Dewey’s description of human action both as producing habits and as being controlled by habits (see Section 1.2.2). We also already considered programs as being the functional equivalents of habits in strong PerComp systems, which ought to support human actions (see Figure 3). Trying now to represent humans and technical artefacts as two kinds of intelligent agents, it is permissible to interchangeably call them ‘self-habituating’ or ‘self-programming.’ In the research programme of strong PerComp, rule-generating machines should, however, be added to Doyle’s distinction of rule-obeying and rule-following machines, since the engineering paradigm changes from information-generating self-control of technical systems that context-dependently selects a behaviour against the backdrop of a set of possible behaviour types, to rule-generating self-control of technical systems that agent-dependently creates new behaviour types (see Section 2.2).

[Figure 5 shows the agent’s representation of a future state acting as an information source: a proposition, carrying the information and referring to that future state, is transmitted internally to the agent’s evaluating component, whose motivational attitude – named by an intentional predicate – evaluates it; the agent’s BEHAVIOUR then transmits the information onwards to OTHER AGENTS in the world.]
Figure 5. Motivational attitudes from an information-theoretic perspective.
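The two-stage structure just described (internal transmission from a future-representing component to an evaluating component, followed by externally observable behaviour towards other agents) can be sketched in code. All names, the numeric commitment scale, and the example proposition are our own inventions for illustration:

```python
# Illustrative sketch (our own names): a motivational attitude as
# two-stage transmission. Stage 1 is internal (future-representing
# component -> evaluating component); stage 2 is the externally
# observable behaviour towards other agents. Following Bratman's
# planning theory, an intention commits more strongly than a desire.
from dataclasses import dataclass

COMMITMENT = {"desire": 1, "intention": 2}  # intention > desire

@dataclass
class Agent:
    name: str
    future_state: str = ""   # future-representing component's output
    attitude: str = ""       # evaluating component's verdict

    def represent_future(self, p: str):
        # Stage 1: internal transmission of the proposition P.
        self.future_state = p

    def evaluate(self, attitude: str):
        # The evaluating component takes a motivational attitude to P.
        assert self.future_state, "nothing to evaluate yet"
        self.attitude = attitude

    def behave(self) -> str:
        # Stage 2: observable transmission to other agents. In this toy
        # model only a full intention is committed enough to be acted on.
        if COMMITMENT.get(self.attitude, 0) >= COMMITMENT["intention"]:
            return f"{self.name} acts towards: {self.future_state}"
        return f"{self.name} does nothing observable"

a = Agent("A")
a.represent_future("A stands before the refrigerator")
a.evaluate("intention")
print(a.behave())  # the intention becomes observable behaviour
```

A mere desire, by contrast, leaves the agent's behaviour unchanged in this sketch, which is the self-forcing asymmetry between the two motivational attitudes that the text attributes to intentions.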
4. A Minimum Scenario of Small-Scale Intention Recognition

The philosophic and information-theoretic concept of intention must now be integrated into the research programme of strong PerComp as sketched in Section 2.2. This can be done by embedding, in PhISt, intelligent agents to whom other intelligent agents ascribe intentions. Representations of intelligent agents as showing intentionally organized behaviour thus become part and parcel of information processing in intelligent agents, so that they are capable of trying to recognize each other’s intentions. PhISt can thus become a medium of transaction in which agents interact with each other in a way that
involves meta-representations of each other as intentionally behaving objects which, in turn, use representations of each other as intentionally behaving objects. It is a very difficult task to elaborate a theory of intention recognition that captures all possible PhISt scenarios in which intentions are relevant. It is not even a priori clear to what degree mechanisms necessary to perform intention recognition are scenario-independent. Thus the focus will, from now on, be on a minimum scenario of intention recognition in small-scale regions of PhISt. By ‘small scale’ is meant that only processes involving intention recognition in short time intervals and local areas of PhISt are considered. A paradigmatic application is the recognition of activities of daily living (ADLs) and instrumental activities of daily living (IADLs).

ADLs focus on assessing the person’s ability to perform basic self-care activities such as eating, dressing, bathing, going to the toilet and transferring in and out of bed/chair and walking. IADLs measure activities related to independent living and include preparing meals, shopping for personal items, medication management, managing money, using the telephone, laundry doing and transportation ability. [50, p. 143]
Recognition of ADLs and IADLs may be done, e.g., in order to support elderly people at home [51, 52]. The focus on a small-scale scenario excludes, e.g., modelling of movement patterns of humans on an urban scale and learning the weekly sequence of their normal movement intentions in order to instantaneously signal possibly wrong movements to them at their respective position [53]. Yet a small-scale scenario lets us concentrate on essential prerequisites of intention recognition that must be met if the distinctions necessary for ADL and IADL recognition, between (intended and non-intended) normal behaviour and (intended and non-intended) novel behaviour, shall be made reliably by a strong PerComp system. The simplest example of a small-scale process involving intention recognition is that an agent Alter is doing something and another agent Ego wants to help Alter achieve its aim (see Figure 6). Ego and Alter are not allowed to explicitly communicate about their intentions with each other by means of a common symbol system. Nevertheless, Ego shall recognize Alter’s intention, i.e., Ego shall ascribe an intention to Alter in a controlled way. This simple scenario cannot capture problems of intention recognition that arise in situations where more than two agents interact, or where different agents have competing intentions. Both are the case when, e.g., intentions in games, such as soccer, shall be recognized [54].
[Figure 6 shows agent Alter pursuing a goal while agent Ego observes Alter; via intention recognition, Ego infers Alter’s goal and provides assistance.]
Figure 6. A small-scale scenario of intention recognition.
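A minimal sketch of how Ego might perform such a controlled ascription is plan recognition against a library of known behaviour patterns. The plan library, the action names, and the prefix-matching rule below are all invented for illustration; they are not taken from the text and stand in for whatever recognition mechanism a concrete strong PerComp system would use:

```python
# Illustrative sketch: Ego ascribes an intention to Alter by matching
# Alter's observed actions against known plans, without any explicit
# symbolic communication between the two agents.

# Hypothetical plan library: goal -> ordered actions that realize it.
PLANS = {
    "prepare a meal": ["open fridge", "take food", "use stove"],
    "do the laundry": ["collect clothes", "load washer", "start washer"],
}

def recognize_intention(observed):
    """Return the goals whose plans are consistent with the observation,
    i.e. whose action sequence begins with the observed actions."""
    return [goal for goal, plan in PLANS.items()
            if plan[:len(observed)] == observed]

def assist(observed):
    candidates = recognize_intention(observed)
    if len(candidates) == 1:  # intention recognized unambiguously
        goal = candidates[0]
        # Assumes the plan is not yet finished; Ego offers the next step.
        next_step = PLANS[goal][len(observed)]
        return f"Ego assists with '{next_step}' towards '{goal}'"
    return "Ego keeps observing"  # ambiguous or novel behaviour

print(assist(["open fridge"]))
print(assist(["collect clothes", "load washer"]))
print(assist(["walk in"]))  # novel behaviour: no plan matches
```

The last case illustrates the distinction the text demands between normal and novel behaviour: an observation that matches no known plan must not be forced into an intention ascription.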
The idea to discuss a fundamental problem of strong PerComp on the basis of an idealized scenario is in accordance with the boundary principle that has been stated by T. Kindberg and A. Fox: “Ubicomp system designers should divide the ubicomp world into environments with boundaries that demarcate their content. A clear system boundary criterion – often, but not necessarily, related to a boundary in the physical world – should exist. A boundary should specify an environment’s scope but doesn’t necessarily constrain interoperation.” [55, p. 70f] An important reason why it is sensible to follow the boundary principle also in theoretical research is that it then becomes easier to compare questions of strong PerComp with problems well known from AI. Our simple scenario of Ego trying to recognize Alter’s intention is, as to some of its most characteristic aspects, the dual of the famous Turing test [56]. Both scenarios discuss the interaction of intelligent agents. Yet in the minimum scenario of intention recognition the agents are not allowed to communicate by linguistic means; it is accepted by Ego and Alter from the beginning that they both are intelligent agents; and it is considered irrelevant which kind of intelligent agents they are – organisms, machines, or hybrids. Instead, the question for Ego is what the directly observable Alter wants to do. This is no question in the Turing test: Alter, whatever class of agent it belongs to, wants to signal its human nature to Ego. To handle the problem of recognizing Alter’s intention, Ego must leave the purely behaviouristic perspective of the Turing test behind: it has to make assumptions about the internal information processing of Alter, e.g., about Alter’s intentions. In this way, the research programme of strong PerComp can carry out thought and real experiments that might be decisive for the development of human-level AI beyond the Turing test [57].
5. Semantic Postulate of Abstract Data Types Common to Intelligent Agents

In order to understand how our simple scenario of Ego recognizing Alter’s intention might work, a postulate must be assumed by any working mechanism of small-scale intention recognition: the ability of an agent to recognize another one’s intentions has to rely on data types that both agents are considered to have in common. This postulate concerns the semantics of information that the agents involved in our minimum scenario can process. It can be stated clearly, in its most general form, by following a reasoning of the British philosopher P.F. Strawson [58]. An agent Ego is able to ascribe intentions to itself if and only if Ego is able to ascribe intentions also to at least one other (actual or possible) agent Alter. Ego must be able to differentiate between intentions of Ego and intentions of Alter; otherwise, Ego could not ascribe intentions to any particular agent at all (including Ego itself). Being able to make ascriptions of intentions to a particular agent presupposes a representation of this agent as different from at least one other agent. In this respect, the minimum scenario also captures the case of a single agent that ascribes intentions to itself. Strawson [58, p. 99f] expresses this insight for the case of humans as follows:

it is a necessary condition of one’s ascribing states of consciousness, experiences, to oneself, in the way one does, that one should also ascribe them, or be prepared to ascribe them, to others who are not oneself. This means not less than it says. It means, for example, that the ascribing phrases are used in just the same sense when the subject is another as when the subject is oneself. Of course the thought that this is so gives no trouble to the non-philosopher: the thought, for example, that ‘in pain’ means the same
S. Artmann / Well-Being in Physical Information Spacetime
whether one says ‘I am in pain’ or ‘He is in pain’. The dictionaries do not give two sets of meanings for every expression which describes a state of consciousness: a first-person meaning and a second-and-third-person meaning. But to the philosopher this thought has given trouble. How could the sense be the same when the method of verification was so different in the two cases – or, rather, when there was a method of verification in the one case (the case of others) and not, properly speaking, in the other case (the case of oneself)?
Whatever the problems of verifying such ascriptions might be, if intelligent agents want to coordinate their actions they must ascribe, to each other, internal states that control their actions, and mean the same by so ascribing a particular internal state, whether it is ascribed to oneself or to the other. Yet the semantic identity of ascribed internal states does not imply that every instance of a particular type of internal state is equally accessible to Ego and Alter. Pain is a striking example of this. As to the transaction between a human being and a system of strong PerComp, it need not be presupposed that, e.g., the human’s belief can be experienced, in any way that might be considered human-like, by the strong PerComp system. It is entirely possible to accept the incomparability of internal states considered as individual experiences, and nevertheless to affirm that the concepts denoting such internal states have the same meaning whether they refer to me or to another agent. From the instrumentalist perspective of Dewey, all that is needed here are empirical criteria that tell us when the transaction between a human and a strong PerComp system supports the growth of human habits and, therefore, the well-being of the human. Those criteria must be based on concepts with the same meanings with respect to all agents involved. Using the terminology introduced in Section 3, we can express Strawson’s result as follows: an intentional predicate IP denotes the same epistemic relation ER whether an agent Ego specifies by IP its own ER to some proposition P, or whether Ego specifies by the same IP the ER that another agent Alter may have to some P. More generally, “the idea of a predicate is correlative with that of a range of distinguishable individuals of which the predicate can be significantly, though not necessarily truly, affirmed.” [58, p. 99, n.
1] The condition that an intelligent agent Ego is able to ascribe intentions to itself must be satisfied if Ego is self-habituating in the sense of Dewey or self-programming in the sense of Doyle (see Section 3.2). Then Ego uses a second-order IP to express, e.g., that Ego knows that Ego satisfies the meta-proposition ‘A intends P’, with ‘intend’ as first-order IP. The very same second-order IP can also express epistemic evaluations of IPs that are supposed to be used by another agent [59]. These considerations lead to the following semantic postulate for the problem of small-scale intention recognition in the minimum scenario consisting of just two intelligent agents, Ego and Alter.

Semantic Postulate of Intention Recognition. In order to set Ego the task of recognizing Alter’s intention we must postulate:
1. that Ego has at least second-order intentional predicates at its disposal,
2. that Ego considers Alter to have at least first-order intentional predicates at its disposal, and
3. that Ego assumes the very same semantics for Ego’s and Alter’s intentional predicates.
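The layering of first- and second-order intentional predicates that the postulate demands can be made concrete in a short sketch. The representation below is an illustrative assumption of this chapter's editor, not part of the postulate itself: a meta-proposition is modelled as an agent applying a predicate to a proposition, and a second-order predicate simply takes another meta-proposition as its object.

```python
from dataclasses import dataclass

# A proposition P is represented here simply by its informational content.
@dataclass(frozen=True)
class Proposition:
    content: str

# An intentional predicate IP applied by an agent A to a proposition P
# (or to another meta-proposition) yields a meta-proposition such as
# "A intends P" or "A knows that (B intends P)".
@dataclass(frozen=True)
class MetaProposition:
    agent: str
    predicate: str   # e.g. 'intend', 'believe', 'know'
    about: object    # a Proposition or another MetaProposition

# First-order (condition 2 of the postulate): Alter intends P.
p = Proposition("the door is open")
first_order = MetaProposition("Alter", "intend", p)

# Second-order (condition 1): Ego knows that Alter intends P --
# the form Ego needs for intention recognition.
second_order = MetaProposition("Ego", "know", first_order)

# Condition 3: the same predicate symbol has the same semantics
# whether its subject is Ego or Alter.
assert MetaProposition("Ego", "intend", p).predicate == first_order.predicate
```

The nesting makes visible why Ego needs at least second-order predicates: the object of Ego's knowledge is itself a meta-proposition about Alter.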
These preconditions are independent of which specific types of intelligent agents Ego and Alter instantiate, and express general constraints on Ego trying to recognize the intentions of Alter. They concern intentional predicates used, technically speaking, as abstract data types considered common to Ego and Alter. “An abstract data type defines a class of abstract objects which is completely characterized by the operations available on those objects. This means that an abstract data type can be defined by defining the characterizing operations on that type.” [60, p. 51] Intentional predicates IP are data types since they are informational objects definable by two kinds of operations. First, IPs are combined with propositions P that contain information I, to form meta-propositions MP. Which IP is selected by an agent A depends on the quality of transmission of I to A (see Section 3.1). Second, motivational IPs, such as ‘intend,’ are additionally connected to actions. Which IP is selected by A now depends on the degree of preference that A has for the future state denoted by the information I contained in proposition P (see Section 3.2). And intentional predicates are abstract since they must be defined so as to be usable in the same way, whatever type of intelligent agent they are applied to. The three presuppositions thus constitute the content of a semantic postulate concerning abstract data types that determine which types of evaluations of communications both Ego and Alter are able to process. Consequently, the postulate indicates a lower boundary of data complexity that intention-recognition algorithms must be able to cope with if they are to be useful in the minimum scenario. The intentional predicate ‘intend’ plainly belongs to the set of abstract data types that must be presupposed in two intelligent agents trying to coordinate their actions by recognizing each other’s intentions.
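In the sense of Liskov and Zilles [60], an abstract data type is characterized entirely by its operations. The sketch below renders the two kinds of operations named above in that idiom; the class and predicate names are illustrative assumptions, not part of the chapter's formal apparatus.

```python
from abc import ABC, abstractmethod

# An intentional predicate as an abstract data type: it is defined by
# the operations available on it, not by any internal representation.
class IntentionalPredicate(ABC):
    def __init__(self, name: str):
        self.name = name

    @abstractmethod
    def combine(self, agent: str, proposition: str) -> str:
        """First kind of operation: combine with a proposition P
        to form a meta-proposition MP."""

class Epistemic(IntentionalPredicate):
    def combine(self, agent, proposition):
        return f"{agent} {self.name}s that {proposition}"

class Motivational(IntentionalPredicate):
    """Second kind of operation: motivational predicates such as
    'intend' are additionally connected to actions directed at a
    preferred future state."""
    def combine(self, agent, proposition):
        return f"{agent} {self.name}s that {proposition}"
    def action_for(self, proposition):
        return f"bring about: {proposition}"

intend = Motivational("intend")
# The same abstract type applies unchanged whether the agent is a
# human or a machine -- the abstraction the postulate requires.
print(intend.combine("Alter", "the kettle boils"))
print(intend.combine("Ego", "the kettle boils"))
print(intend.action_for("the kettle boils"))
```

The point of the abstraction is visible in the last two `combine` calls: nothing in the type's definition distinguishes the class of agent to which it is applied.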
6. Pragmatic Postulate of Continuity between Intentional and Intended Actions

Besides the semantic postulate of abstract data types common to both agents in the minimum scenario of small-scale intention recognition, two pragmatic postulates can be stated that must be accepted as true if a solution to the problem of small-scale intention recognition is to be found. These are the postulate of continuity between intentional and intended actions (this section) and the postulate of distributed algorithms for acting with shared intentions (Section 7). Both postulates are called ‘pragmatic’ because they directly concern the actions of Ego or Alter, not just the information they process internally. As argued in Section 5, ‘intend’ is an abstract data type that agent Ego must suppose to have in common with agent Alter when Ego wants to help Alter achieve its aims in a simple small-scale PhISt scenario, which is characterized by a short time interval and a local area. In this context it is possible to determine more precisely the set of propositions P that can be modified by Alter’s intentions. As to their temporal reference, the intentions of Alter that Ego tries to recognize are present-directed. They are, in contrast to future-directed intentions, about actions that have already started or are beginning just now [46].7 As to their spatial reference, the intentions of Alter that Ego tries to recognize are directed towards Alter’s actions in the area surrounding Ego and Alter.8

7 In the AI field of plan recognition, present-directed intentions are considered building blocks of plans by an approach that centres on models of plan execution: “plans are executed dynamically, and as a result the action an agent takes at each time step critically depends on the actions they have previously taken. Intuitively, at any moment, the executing agent might like to execute any of the actions that will contribute to one of his current goals, but it can only successfully execute those that have been enabled by prior execution of predecessor actions in the plan” [61, p. 1105].
8 And not towards states of the world, since this could involve information spatially or temporally transcending the small-scale region of PhISt in question.

Figure 7. Small-scale intention recognition, 1st step: Ego is assuming that Alter is intentionally doing what Alter is doing.

Two steps can be distinguished in the recognition of present-directed intentions. Ego observes Alter in the time interval t1...tn-1 (see Figure 7). At time tn Ego shall infer Alter’s intention from the resultant observational data and Ego’s beliefs about the small-scale region of PhISt in which Ego and Alter are acting (see Figure 8). The presupposition of Ego’s reasoning is that Ego is able to infer Alter’s intention to do something at the moment tn from Ego’s observation of Alter’s behaviour in the time interval t1...tn-1 before. Ego considers itself justified in this presupposition because Ego ascribes to Alter that Alter has intentionally been doing what Alter was observed to be doing in t1...tn-1. Alter’s intention to do something at tn can thus be inferred by assuming behavioural continuity between the intentional acts during t1...tn-1 and the intention to do something at tn.

Figure 8. Small-scale intention recognition, 2nd step: Ego is assuming behavioural continuity between Alter’s intentional acts and Alter’s intention to do something.

Ego’s reasoning follows a principle of continuity in the intentionally organized behaviour of Alter. Otherwise, intention recognition would be impossible, however it might be conceptualized. To propose, e.g., that intention (or plan) recognition “is largely a problem of inference under conditions of uncertainty” [62, p. 54] presupposes that, as uncertain as the mutual knowledge of both agents might be, there is a continuity between the actions of an agent. The observation of Alter’s intentional actions during t1...tn-1 allows Ego to infer Alter’s intended action at tn. This inference relies on a common reservoir of typical intentions (as to actions in this spatial region) that are consistent with the behaviour observed in the time interval before. The rules that express the continuity of intentional and intended action in a particular small-scale region of PhISt, and that guide the inference necessary for intention recognition, constitute a grammar of actions [63] for that region. By ‘grammar’ is meant a multi-level combinatorial system of actions, sequences of actions, sequences of sequences of actions, and so on. We can thus state the following pragmatic postulate of continuity:

First Pragmatic Postulate of Intention Recognition. In order to set Ego the task of recognizing Alter’s intention at time tn in a particular region of PhISt we must postulate:
1. that Ego observes Alter in a time interval t1...tn-1,
2. that Ego ascribes to Alter that Alter has intentionally been doing what Alter was observed to be doing in t1...tn-1, and
3. that Ego infers Alter’s intention to do something at tn by assuming behavioural continuity, as to a grammar of actions for the particular region of PhISt, between Alter’s intentional acts during t1...tn-1 and Alter’s intention to do something at tn.

If this postulate were not accepted, it could not be explained how Ego would, in a controllable manner, be able to take the step from the general assumption of common abstract data types (see Section 5) to a concrete hypothesis about what intention Alter’s behaviour follows.
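The continuity-based inference can be sketched minimally as follows. A full grammar of actions is a multi-level combinatorial system; the sketch flattens it to a small reservoir of typical habit sequences for one region, and all sequence and action names are illustrative assumptions.

```python
# A toy stand-in for a grammar of actions in one small-scale region:
# a reservoir of typical habit sequences (a real grammar would be a
# multi-level combinatorial system of actions, sequences of actions,
# and so on [63]).
HABITS = [
    ["enter_kitchen", "fill_kettle", "switch_on_kettle", "fetch_cup"],
    ["enter_kitchen", "open_fridge", "take_milk", "close_fridge"],
]

def infer_intended_action(observed):
    """Infer Alter's intended action at t_n from the intentional
    actions observed during t_1..t_{n-1}, assuming behavioural
    continuity: the intended action is the continuation of a habit
    consistent with the observed prefix."""
    candidates = [
        habit[len(observed)]
        for habit in HABITS
        if habit[:len(observed)] == observed and len(habit) > len(observed)
    ]
    # Continuity licenses an inference only if the observations pick
    # out a unique continuation in this region's grammar.
    return candidates[0] if len(candidates) == 1 else None

print(infer_intended_action(["enter_kitchen", "fill_kettle"]))
# With an ambiguous prefix, no unique intention can yet be inferred
# and Ego must keep observing.
print(infer_intended_action(["enter_kitchen"]))
```

The second call illustrates why the observation interval t1...tn-1 matters: before the observed prefix discriminates between habits, the postulate supplies no determinate hypothesis.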
First, Ego starts by necessarily presupposing the semantics of intentional predicates common to Ego and Alter; second, Ego observes Alter’s real behaviour; and, third, by relying on the postulate of continuity, Ego infers what Alter might intend at this moment. If Ego is a system of strong PerComp, and Alter a human being who shall be supported by Ego, Ego’s postulate that Alter’s behaviour is continuous is absolutely necessary. Otherwise, Ego would not be able to consider observed actions of Alter as instances of habits, i.e., sequences of intentional actions that Alter shows regularly in the particular region of PhISt where Ego shall support the growth of Alter’s habits. The engineering of strong PerComp systems must enable the system to extrapolate intended actions from intentional ones. This can be supported by designing the spatial context that can be influenced by the system as an environment in which the human can concentrate on how to accomplish intentional actions without being disturbed by the presence of the system. This describes, in terms of intention recognition, what M. Weiser and J.S. Brown have called ‘calm technology’: “If computers are everywhere, they had better stay out of the way, and that means designing them so that people being shared by the computers remain serene and in control.” [64, p. 79]
7. Pragmatic Postulate of Distributed Algorithms for Acting with Shared Intentions

It seems that, if a grammar of actions could be elaborated for a particular region of PhISt, the most fundamental prerequisite for a solution of the problem of intention recognition would have been worked out, at least with respect to that region. Yet a further problem, which will persist even after Ego has fulfilled the task of interpreting the behaviour of Alter correctly, should not be underestimated. Ego must then use the inferred intention of Alter to construct a shared intention of Ego and Alter (see Figure 9). This construction proceeds in three steps. First, Ego meta-intends that Alter succeeds in realizing its intention. Second, Ego assumes that Alter accepts Ego’s help. Third, Ego selects, or creates, a behaviour type whose realization will most probably lead to the success of Alter’s inferred intention, and behaves accordingly [65].
Figure 9. Small-scale intention recognition, 3rd step: Ego must construct a shared intention of Ego and Alter.
What Ego presupposes when it acts according to a shared intention is that there exists a set of social conventions that help regulate the interaction of two agents sharing an intention. This set must have been correctly identified and adopted by Ego. Alter, too, must have recognized and adopted the same conventions. From the instrumentalist perspective of Dewey this means that there exist, in the transactions of Ego and Alter, habits of co-operation. Speaking in terms of computer science, Ego must accept a second pragmatic postulate: there exist distributed algorithms, meshed sequences of behavioural rules for Ego and Alter, that structure the transaction of both agents without the need for co-ordinating intervention by a third agent. These algorithms use abstract data types common to Ego and Alter (see Section 5) in order to realize intentions shared by Ego and Alter, given that Ego has correctly recognized Alter’s intention (see Section 6). The second pragmatic postulate of intention recognition must, however, demand more than just knowledge of social conventions by Ego and Alter [66]. First, Ego has to expect that Alter’s behaviour conforms to the particular convention that regulates how to jointly realize a certain shared intention. Second, Alter must expect that Ego, too, conforms to the very same convention. Third, Ego and Alter have to have a shared belief in their mutual expectations of convention-conforming behaviour. Otherwise, the coordination problem of Ego and Alter will be solved only by luck, even if Ego’s interpretation of Alter’s behaviour in terms of intentions has been correct.
The second pragmatic postulate of intention recognition can thus be summarized as follows:

Second Pragmatic Postulate of Intention Recognition. In order to set Ego the task of supporting Alter to realize Alter’s intention in the time interval tn+1...tend in a particular region of PhISt we must postulate:
1. that there exist meshed sequences of behavioural rules (conventions, distributed algorithms) for Ego and Alter that structure the transaction of both agents in order to realize intentions shared by Ego and Alter, given that Ego has correctly recognized Alter’s intention,
2. that Ego and Alter expect each other’s behaviour to conform to the particular convention that regulates how to jointly realize a certain shared intention, and
3. that Ego and Alter have a shared belief in their mutual expectations of convention-conforming behaviour.

The existence of conventions for acting with shared intentions is another fundamental prerequisite for the design of calm technology (see Section 6). A human user considers a technology of strong PerComp to be calm only if there exist established conventions of how the technology and the human interact, so that the former can help the latter achieve its correctly recognized intention and thus support the human’s well-being by transacting with her habits. If such conventions did not exist, or if they were defied, the technology would become the centre of the human’s attention and stop being a calm partner in co-operation.
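The meshed sequences of behavioural rules named in condition 1 can be sketched as follows. The concrete convention, and the idea of recording it as a table of paired steps, are illustrative assumptions; the point is only that the rules fix both agents' contributions at each step, so no third agent has to co-ordinate them.

```python
# A convention in Lewis's sense [66], modelled as a meshed pair of
# rule sequences: at each step the convention fixes what Alter does
# and what assisting behaviour Ego owes in return.
CONVENTION = {
    "make_tea": [
        ("fill_kettle",      "fetch_cup"),        # (Alter's step, Ego's step)
        ("switch_on_kettle", "fetch_tea_leaves"),
        ("pour_water",       "stand_clear"),
    ],
}

def transact(shared_intention):
    """Run the meshed sequences for a correctly recognized shared
    intention. Each agent acts on the expectation that the other
    conforms to the same convention (conditions 2 and 3 of the
    postulate); no third agent intervenes to co-ordinate."""
    trace = []
    for alter_step, ego_step in CONVENTION[shared_intention]:
        trace.append((alter_step, ego_step))
    return trace

for alter_act, ego_act in transact("make_tea"):
    print(f"Alter: {alter_act:18} Ego: {ego_act}")
```

If either agent departed from the table, the meshing would break and the transaction would succeed, if at all, only by luck, which is exactly what the postulate rules out.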
8. Consequences for Engineering Strong Pervasive Computing Systems

To the philosopher working in the tradition of Dewey’s instrumentalism, strong PerComp is a technological programme of the greatest importance, since it is beginning to extensively and lastingly transform the habits of human beings in modern societies. Strong PerComp not only integrates earlier technical developments but also requires a new understanding of human-machine interaction in the context of an increasingly high degree of mutual saturation of physical spacetime and information spacetime. The main theses this paper has advocated are:
1. The instrumentalist conception of well-being as growth of habits is a philosophical framework that helps design technological systems which shall support the well-being of humans (see Section 1).
2. Strong PerComp is a research programme that strives to engineer PhISt, as a social medium of transaction between intelligent agents, in order to technologically support the growth of human habits (see Section 2).
3. Intention recognition is a fundamental problem in the engineering of strong PerComp systems, since intentions are evaluators of future behaviour that are used by intelligent agents as interfaces between internally accessible and externally observable information transmission (see Section 3).
4. In theoretical research on strong PerComp systems a recommendable method is to analyze idealized scenarios in which fundamental engineering problems, such as intention recognition, can be studied intensely (see Section 4).
5. The local recognition of present-directed intentions semantically postulates abstract data types common to intention-recognizing agents and agents whose intentions shall be recognized (see Section 5).
6. The local recognition of present-directed intentions pragmatically postulates, first, continuity of intentional and intended actions of agents whose intentions shall be recognized and, second, conventions for acting with intentions shared by intention-recognizing agents and agents whose intentions shall be recognized (see Sections 6 and 7).
Altogether, those six theses make up a first rough sketch of a conceptual framework, centred upon the problem of intention recognition, for strong PerComp research in philosophy, AI, and other fields of computer science. In order to indicate how this conceptual framework may help the engineer of strong PerComp systems, I end this paper by listing some of its consequences, one for each of the theses above, for the design of strong PerComp systems.
1. “We may give Advice, but we cannot give Conduct.” (Benjamin Franklin [67, p. 523]) If the instrumentalist conception of well-being as growth of habits is a philosophical framework that helps design technological systems supporting the well-being of humans, then instrumentalism must also be usable for the specification of limits of technical support. Individual habits, as rules of conduct of a human, have a complex history. They cannot be influenced or changed instantly, and even to integrate a technical co-operating agent into them is a difficult task. The role paradigm according to which such an agent should be designed is the benevolent advisor, who uses not only explicit communication by symbol systems but also implicit communication by controlled changes of the physical environment to give advice. This is particularly true for persuasive technologies that try to change the habits of individuals (e.g., towards a more healthy life-style [68] or in occupational therapy [69]).
2. We ought to engineer not technical devices but human habitats. If strong PerComp is a research programme that engineers PhISt so that the growth of human habits is technologically supported, then the system that is designed should not be considered an easily isolable technical device but, in the end, the habitat of human beings. The idea of context-centred design must, therefore, be taken very seriously [38].
3. We should understand the concepts necessary to the development of strong PerComp systems from the information-theoretic perspective of fundamental AI research. If intention recognition is a fundamental problem in strong PerComp – because intentions evaluate future behaviour and are used by intelligent agents as interfaces between internally accessible and externally observable information transmission – then ongoing AI research [45] on what concepts of intelligence, intention, etc. mean in information-theoretic terms should be systematically absorbed into strong PerComp.
4. We must develop a design pattern approach towards the physical and informational transformation of human habitats by strong PerComp. If in strong PerComp a useful method is to explore idealized scenarios in which basic engineering problems, such as intention recognition, can be analyzed intensely, this methodology should be developed towards a combinatorics of design patterns. This approach is well known from architecture [70] and already very successful in software engineering [71] (for software design patterns to recognize intentions see [72, ch. 6]). By using architectural as well as software design patterns in the development of strong PerComp systems, C. Alexander’s idea that patterns are results of human habits would be made fruitful for modern technological society. Architects and computer scientists both engineer physical and informational spacetime; their co-operation would thus be essential for the design of physical information spacetime, the medium of transaction between humans and strong PerComp systems.
5. We need a structural science of agency. If the local recognition of present-directed intentions semantically postulates abstract data types common to intelligent agents, then the elaboration of the semantics of those types needs a systematic theory of agency. This theory must abstract, on the one hand, from the differences between particular classes of agents. Otherwise, the theory would not be able to capture the complex transaction of humans and technical artefacts in strong PerComp systems from a unified perspective. On the other hand, the development of the theory must constantly be motivated by concrete applications. Otherwise, it easily becomes a purely formal theory that might not be of great help to the engineer. The sort of theory that the technological research programme of strong PerComp thus needs has been called ‘structural-scientific’ (see, e.g., [73]). The structural science of agency would be built on the basis of an information-theoretic view of agents and a design pattern methodology.
6. We have to follow a fairly conservative approach to the transformation of human habitats by strong PerComp. If, first, the continuity of intentional and intended action and, second, conventions for acting with shared intentions are of utmost importance for recognizing intentions and supporting the realization of recognized intentions, then a strong PerComp system must be given enough time to learn the grammar of actions of those individuals who ought to be supported. Humans as well as strong PerComp systems must also be given sufficient time, after intention recognition starts to function, to develop a system of co-operative habits that control the transaction of human being and technical artefact. To enable a strong PerComp system to extrapolate intended actions from intentional ones, the spatial context of the system should be designed as an environment in which the human can focus on how to accomplish intentional actions without being disturbed by the system. Such design principles may be characterized as recommending quite a conservative approach to the transformation of human habitats by strong PerComp. The engineer must take the habitual nature of human behaviour and its transmission by social traditions seriously. The continuity of strong PerComp’s support of human habits, whose growth shall be assisted as steadily, smoothly, and flexibly as possible, has to correspond to this long-term process.
Acknowledgements

I would like to thank Björn Gottfried for his kind invitation to submit my philosophical ideas about Pervasive Computing to the 3rd Workshop on Behaviour Monitoring and Interpretation (BMI) at KI 2009 in Paderborn, and the reviewers and participants of that workshop for their insightful comments on previous versions of this paper.
References

[1] M. Hailperin, B. Kaiser, K. Knight, Concrete Abstractions. An Introduction to Computer Science Using Scheme, Brooks/Cole Publishing, Pacific Grove/CA, 1999.
[2] T.R. Colburn, Philosophy and Computer Science, M.E. Sharpe, Armonk/NY, 2000.
[3] N.J. Nilsson, The Quest for Artificial Intelligence. A History of Ideas and Achievements, Cambridge University Press, Cambridge, 2010.
[4] Aristotle, Nicomachean Ethics, in: J. Barnes (ed.), The Complete Works of Aristotle. The Revised Oxford Translation, vol. 2, Princeton University Press, Princeton/NJ, 1995, 1729-1867.
[5] J.L. Ackrill, Aristotle the Philosopher, Oxford University Press, Oxford, 1980.
[6] O. Höffe, Praktische Philosophie. Das Modell des Aristoteles, second edition, Akademie Verlag, Berlin, 1996.
[7] J.H. Randall, Jr., Aristotle, Columbia University Press, New York/NY, 1960.
[8] L.A. Hickman, Th.M. Alexander (eds.), The Essential Dewey. Volume 1. Pragmatism, Education, Democracy, Indiana University Press, Bloomington/IN, 1998.
[9] L.A. Hickman, Productive pragmatism. Habits as artifacts in Peirce and Dewey, in: Hickman, Pragmatism as Post-Postmodernism. Lessons from John Dewey, Fordham University Press, New York/NY, 2007, 241-254.
[10] C. Mitcham, Thinking through Technology. The Path between Engineering and Philosophy, University of Chicago Press, Chicago/IL, 1994.
[11] Y. Levin, Imagining the Future. Science and American Democracy, Encounter Books, New York/NY, 2008.
[12] J. Dewey, Reconstruction in Philosophy, paperback edition, New American Library, New York/NY, 1950.
[13] L.A. Hickman, Th.M. Alexander (eds.), The Essential Dewey. Volume 2. Ethics, Logic, Psychology, Indiana University Press, Bloomington/IN, 1998.
[14] W.S. Ark, T. Selker, A look at human interaction with pervasive computers, IBM Systems Journal 38 (1999), 504-507.
[15] M. Satyanarayanan, A catalyst for mobile and ubiquitous computing, IEEE Pervasive Computing 1 (2002) 1, 2-5.
[16] Bo Hu, Bin Hu, V. Callaghan, Z. Lin, Combining theory and systems building in pervasive computing, The Computer Journal 53 (2010), 129-130.
[17] Oxford English Dictionary Online, http://www.oed.com, entries ‘pervade,’ ‘pervasive,’ and ‘ubiquitous,’ accessed 07/14/09.
[18] A. Ferscha, Pervasive Computing: connected > aware > smart, in: F. Mattern (ed.), Die Informatisierung des Alltags. Leben in smarten Umgebungen, Springer, Berlin, 2007, 3-10.
[19] M. Friedewald, Ubiquitous Computing. Ein neues Konzept der Mensch-Computer-Interaktion und seine Folgen, in: H.D. Helliger (ed.), Mensch-Computer-Interface. Zur Geschichte und Zukunft der Computerbedienung, Transcript, Bielefeld, 2008, 259-280.
[20] J. Burkhardt, H. Henn, S. Hepper, K. Rindtorff, Th. Schäck, Pervasive Computing. Technology and Architecture of Mobile Internet Applications, Addison-Wesley, Boston, 2002.
[21] D. Amor, The E-business (R)evolution. Living and Working in an Interconnected World, second edition, Prentice Hall, Upper Saddle River/NJ, 2002.
[22] U. Hansmann, L. Merk, M.S. Nicklous, Th. Stober, Pervasive Computing. The Mobile World, second edition, Springer, Berlin, 2003.
[23] J.R. Gurd, C.B. Jones, The global-yet-personal information system, in: I. Wand, R. Milner (eds.), Computing Tomorrow. Future Research Directions in Computer Science, Cambridge University Press, Cambridge, 1996, 127-157.
[24] J.C.R. Licklider, Man-machine symbiosis, IRE Transactions on Human Factors in Electronics HFE-1 (1960), 4-11.
[25] B. Shneiderman, Leonardo’s Laptop. Human Needs and the New Computing Technologies, MIT Press, Cambridge/MA, 2002.
[26] P. Pirolli, Information Foraging Theory. Adaptive Interaction with Information, Oxford University Press, Oxford, 2007.
[27] A.J. Cowell, S. Havre, R. May, A. Sanfilippo, Scientific discovery within data streams, in: Y. Cai (ed.), Ambient Intelligence for Scientific Discovery (Lecture Notes in Artificial Intelligence, vol. 3345), Springer, Berlin, 2005, 66-80.
[28] M. Weiser, Some computer science issues in ubiquitous computing, Communications of the ACM 36 (1993) 7, 75-84.
[29] M. Weiser, R. Gold, J.S. Brown, The origins of ubiquitous computing research at PARC in the late 1980s, IBM Systems Journal 38 (1999), 693-696.
[30] M. Weiser, The computer for the 21st century, Scientific American 265 (1991) 3, 94-104. Cited after reprint in Mobile Computing and Communications Review 3 (1999) 3, 3-11.
[31] G.D. Abowd, M. Ebling, G. Hunt, H. Lei, H.-W. Gellersen, Context-aware computing, IEEE Pervasive Computing 1 (2002) 3, 22-23.
[32] A.Z. Spector, Technology megatrends driving the future of e-society, in: J. Eberspächer, U. Hertz (eds.), Leben in der e-Society. Computerintelligenz für den Alltag, Springer, Berlin, 2002, 35-50.
[33] I. Schulz-Schaeffer, Formen und Dimensionen der Verselbständigung, in: A. Kündig, D. Bütschi (eds.), Die Verselbständigung des Computers, vdf Hochschulverlag, Zürich, 2008, 29-53.
[34] Th.P. Hughes, American Genesis. A Century of Invention and Technological Enthusiasm 1870-1970, new edition, University of Chicago Press, Chicago/IL, 2004.
[35] E. Aarts, R. Harwig, M. Schuurmans, Ambient intelligence, in: P.J. Denning (ed.), The Invisible Future. The Seamless Integration of Technology into Everyday Life, McGraw-Hill, New York/NY, 2002, 235-250.
[36] M. Satyanarayanan, Pervasive computing: vision and challenges, IEEE Personal Communications 8 (2001) 4, 10-17.
[37] A. Greenfield, Everyware. The Dawning Age of Ubiquitous Computing, New Riders, Berkeley/CA, 2006.
[38] M. McCullough, Digital Ground. Architecture, Pervasive Computing, and Environmental Knowing, MIT Press, Cambridge/MA, 2004.
[39] J. Dewey, A.F. Bentley, Knowing and the known, in: Dewey, The Later Works 1925-1953. Volume 16. 1949-1952, ed. J.A. Boydston, Southern Illinois University Press, Carbondale/IL and Edwardsville/IL, 1989, 1-294.
[40] S.G. Thompson, B. Azvine, No pervasive computing without intelligent systems, in: A. Steventon, S. Wright (eds.), Intelligent Spaces. The Application of Pervasive ICT, Springer, London, 2006, 37-54.
[41] L. Sterling, K. Taveter, The Art of Agent-Oriented Modeling, MIT Press, Cambridge/MA, 2009.
[42] J. McCarthy, The inversion of functions defined by Turing machines, in: C.E. Shannon, J. McCarthy (eds.), Automata Studies (Annals of Mathematics Studies, vol. 34), Princeton University Press, Princeton/NJ, 1956, 177-181.
[43] Y. Shoham, S.B. Cousins, Logics of mental attitudes in AI. A very preliminary survey, in: G. Lakemeyer, B. Nebel (eds.), Foundations of Knowledge Representation and Reasoning (Lecture Notes in Artificial Intelligence, vol. 810), Springer, Berlin, 1994, 296-309.
[44] T. Crane, The Mechanical Mind. A Philosophical Introduction to Minds, Machines, and Mental Representation, second edition, Routledge, London, 2003.
[45] J. Doyle, Extending Mechanics to Minds. The Mechanical Foundations of Psychology and Economics, Cambridge University Press, Cambridge, 2006.
[46] M.E. Bratman, Intention, Plans, and Practical Reason, Harvard University Press, Cambridge/MA, 1987.
[47] P.R. Cohen, H.J. Levesque, Intention is choice with commitment, Artificial Intelligence 42 (1990), 213-261.
[48] M. Wooldridge, Reasoning about Rational Agents, MIT Press, Cambridge/MA, 2000.
[49] J. Doyle, The foundations of psychology. A logico-computational inquiry into the concept of mind, in: R. Cummins, J. Pollock (eds.), Philosophy and AI. Essays at the Interface, MIT Press, Cambridge/MA, 1991, 39-77.
[50] F. Rivera-Illingworth, V. Callaghan, H. Hagras, Detection of normal and novel behaviours in ubiquitous domestic environments, The Computer Journal 53 (2010), 142-151.
[51] M. Philipose, K.P. Fishkin, M. Perkowitz, D.J. Patterson, D. Fox, H. Kautz, D. Hähnel, Inferring activities from interactions with objects, IEEE Pervasive Computing 3 (2004) 4, 50-57.
[52] D.J.T. Heatley, R.S. Kalawsky, I. Neild, P.A. Bowman, Integrated sensor networks for monitoring the health and well-being of vulnerable individuals, in: A. Steventon, S. Wright (eds.), Intelligent Spaces. The Application of Pervasive ICT, Springer, London, 2006, 219-237.
[53] L. Liao, D.J. Patterson, D. Fox, H. Kautz, Learning and inferring transportation routines, Artificial Intelligence 171 (2007), 311-331.
[54] M. Beetz, B. Kirchlechner, M. Lames, Computerized real-time analysis of football games, IEEE Pervasive Computing 4 (2005) 3, 33-39.
[55] T. Kindberg, A. Fox, System software for ubiquitous computing, IEEE Pervasive Computing 1 (2002) 1, 70-81.
[56] A.M. Turing, Computing machinery and intelligence, Mind 59 (1950), 433-460.
[57] I. Arel, S. Livingston, Beyond the Turing test, Computer 42 (2009) 3, 90-91.
[58] P.F. Strawson, Individuals. An Essay in Descriptive Metaphysics, Methuen, London, 1959.
[59] S. Artmann, Behavioural congruence in Turing test-like human-computer interaction, in: B. Mertsching, M. Hund, Z. Aziz (eds.), KI 2009. Advances in Artificial Intelligence. 32nd Annual German Conference on AI (Lecture Notes in Artificial Intelligence, vol. 5803), Springer, Berlin, 2009, 387-394.
[60] B. Liskov, S. Zilles, Programming with abstract data types, ACM SIGPLAN Notices 9 (1974) 4, 50-59.
[61] C.W. Geib, R.P. Goldman, A probabilistic plan recognition algorithm based on plan tree grammar, Artificial Intelligence 173 (2009), 1101-1132.
[62] E. Charniak, R.P. Goldman, A Bayesian model of plan recognition, Artificial Intelligence 66 (1993), 53-79.
[63] R. Hamid, S. Maddi, A. Johnson, A. Bobick, I. Essa, C. Isbell, A novel sequence representation for unsupervised analysis of human activities, Artificial Intelligence 173 (2009), 1221-1244.
[64] M. Weiser, J.S. Brown, The coming age of calm technology, in: P.J. Denning, R.M. Metcalfe (eds.), Beyond Calculation. The Next Fifty Years of Computing, Copernicus, New York/NY, 1997, 75-85.
[65] M.E. Bratman, Shared intention, in: M.E. Bratman, Faces of Intention. Selected Essays on Intention and Agency, Cambridge University Press, Cambridge, 1999, 109-129.
[66] D. Lewis, Convention, Harvard University Press, Cambridge/MA, 1969.
[67] B.
Franklin, Autobiography, Poor Richard, and Later Writings, Library of America, New York/NY, 2007. [68] S. Consolvo, J.A. Landay, D.W. McDonald, Designing for behavior change in everyday life, Computer 42 (2009) 6, 86-89. [69] J.-L. Lo, P.-Y. Chi, H.-H. Chu, H.-Y. Wang, S.-C.T. Chu, Pervasive computing in play-based occupational therapy for children, IEEE Pervasive Computing 8 (2009) 3, 66-73. [70] C. Alexander, S. Ishikawa, M. Silverstein, A Pattern Language. Towns, Buildings, Construction, Oxford University Press, New York/NY, 1977. [71] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns. Elements of Reusable Object-Oriented Software, Addison-Wesley, Boston/MA, 1995. [72] C. Heinze, Modelling Intention Recognition for Intelligent Agent Systems, Defence Science and Technology Organization Technical Report DSTO-RR-0286, Edinburgh/Australia, 2004. [73] S. Artmann, Artificial Life as a structural science, Philosophia naturalis 40 (2003), 183-205.
Part II Supporting the Well-Being Through Care Taking in Smart Environments
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-65
Tracking Systems for Multiple Smart Home Residents Aaron S. CRANDALL and Diane J. COOK
[email protected],
[email protected] Abstract. Once a smart home system moves to a multi-resident situation, it becomes significantly more important that individuals are tracked in some manner. By tracking individuals the events received from the sensor platform can then be separated into different streams and acted on independently by other tools within the smart home system. This process improves activity detection, history building and personalized interaction with the intelligent space. Historically, tracking has been primarily approached through a carried wireless device or an imaging system, such as video cameras. These are complicated approaches and still do not always effectively address the problem. Additionally, both of these solutions pose social problems to implement in private homes over long periods of time. This paper introduces and explores a Bayesian Updating method of tracking individuals through the space that leverages the CASAS platform of pervasive and passive sensors. This approach does not require the residents to maintain a wireless device, nor does it incorporate rich sensors with the social privacy issues.
1. Introduction

Smart homes are providing an ever more important suite of tools for home care. From simple assistive tools to complex, contextually aware interactive personas, smart home technology has penetrated many parts of the medical community. These tools rely on various sensors, artificial intelligence algorithms and human intelligence to operate. Most often the tools are geared towards recognizing the activities of daily living (ADLs), with the purpose of providing historical and instantaneous feedback about the residents' behavior. Any improvement to these tools in recognizing ADLs is welcomed by care practitioners and the residents themselves, because it gives them a more accurate day-to-day picture of the residents' situation [1]. Currently, the latest smart home technology has trouble operating with low-profile, privacy-aware sensor platforms. These sensor platforms are designed to minimize effort on the part of the resident while maximizing privacy, at the cost of sensor granularity. The goal of research in this area is to make use of this reduced sensor information and still build systems capable of providing quality assistive technologies. As an added hurdle, smart homes are often deployed where more than one resident dwells in the space. Even visitors and care providers make it difficult for the smart home system to determine which person currently in the space caused a given event and to attribute it appropriately. Without that ability, ADLs become much more difficult to detect through the noise in the data, and individual histories are impossible to obtain.
The tools used to follow an individual through the space are commonly called tracking systems. A tracking system's goal is to determine the current number and location of individuals, as well as their identity if possible. This information is invaluable in multi-resident situations, giving the computer a method of attributing events to individuals. There are three main strategies for tracking people in smart homes:

1. Carried Devices or Tags
2. Video biometrics
3. Entity detection via dividing events spatially and temporally

Carried devices or tags are commonly implemented via RFID [2–4] or a wireless device carried on one's person [5–7]. The device, or a base station of some sort, reports the current location of the device to the central system. This has been accomplished using PDAs, cell phones, actigraphs, custom-built RF devices and other hardware. While these kinds of systems work, they require that every individual in the space keep and maintain their personal device at all times. It is easy for residents to forget their device, let the batteries run down, or simply not want to carry it. Additionally, guests need to be issued a device whenever they are in the space to ensure they are accounted for. In many environments this is a feasible solution, given the manpower to maintain it. For example, hospitals and full-time care facilities are often able to make use of such systems. In private homes or understaffed situations, it becomes less feasible. For video biometrics, one or more cameras are placed around the monitored space. These cameras capture the current residents for tracking and processing [8,9]. The goal is to interpret the video data to identify individuals, detect ADLs and give more context to item interaction. While these tools are often very good at meeting these goals, they bear the overhead of expensive cameras and the privacy concerns of the residents. Asking individuals to accept 24-hour video monitoring in their homes can be difficult.
While some may be willing to accept such an intrusion, many others will not [10]. The last solution, entity detection by interpreting the sensor network data directly, strives to remove the effort of carried devices and the privacy concerns of cameras in exchange for more complexity in the tracking algorithm [11–13]. Many smart homes are very sensor rich. By exploiting the physical locality of the sensors together with the timing and general behavior of the residents, tools can be developed to determine how many residents there are and attribute events accordingly. This is a much more classical artificial intelligence approach, and one that will likely yield a probabilistic result. Whether it is good enough to support the other tools, such as ADL detection, is the open question. The researchers at Washington State University's Center for Advanced Studies in Adaptive Systems (CASAS) have built a set of smart home testbeds to support research into assistive care technologies. After working with the systems and the residents, it was hypothesized that an algorithm could be devised to take the full event stream from anonymous residents and exploit its spatial and temporal features to make a tracking system without a carried device or rich sensors such as cameras. This work introduces two algorithms that would be considered entity detection and tracking tools. They are used to divide the events generated by the sensor system into different sets. Each set represents a person currently in the space and the events they caused on the sensor network. These sets can then be used to identify individuals, detect ADLs and give a much better sense of the behaviors occurring within the smart home. The result is a tracking system that uses passive and unobtrusive sensors to track people as they go about their day within the smart home space.
2. Sensor Platform and Data

The Center for Advanced Studies in Adaptive Systems has constructed a number of smart home testbeds. These testbeds are used to record sensor information from the activities of the residents, in both scripted and unscripted situations. The testbeds were designed to support the detection of resident activities and to track individuals in a passive manner. To these ends, a number of sensor types have been introduced:

1. Passive Infra-Red (PIR) motion detectors
2. Temperature sensors
3. Power meters
4. Door open/close sensors
5. Item presence platforms
6. Light switch monitors
7. Water flow meters
8. Phone use
The primary role of these sensors is to aid in ADL detection. The PIR motion detectors are used not only for ADL detection, but also for tracking and localization of residents. The PIR sensors are commodity off-the-shelf home security sensors that have been modified with a custom-built communications daughter board dubbed the Lentil Board. This daughter board senses a change in the sensor relay state, i.e. ON vs. OFF, and transmits the change over a Dallas 1-Wire bus to be logged by a host computer. These PIR sensors come in two configurations. The first is an area sensor that is placed in a fairly common home security position. These sensors have a field of view that covers most of an entire room and are used to measure occupancy of the space. The second, and more common, type is a downward-facing unit that has had its view occluded so that it can only see about 4' × 4' of the floor below it. These second PIR sensors give a much better sense of the location of motion within the space. These kinds of PIR sensors are used quite often in smart home implementations: they are inexpensive, robust and accepted by most residents. During the initial design phases of the CASAS project, it was determined that high-fidelity sensors, such as cameras, raise significant privacy concerns. During several of the in-home deployments of CASAS testbeds the residents expressed significant concerns over having such a system watching them full time; by showing them the very simple form of the data gathered and the simple visualizations that can be created from it, these initial concerns were assuaged. With these low-fidelity platforms and more intelligent algorithms, the residents have been very accepting of being monitored. For the purposes of the algorithms used in this work, only the downward-facing sensors are used for tracking individuals. The area sensors give a much too general sense of a resident's location for any kind of precise locality.
Normally a PIR security sensor is only used to say that someone is in the room, which is enough for intrusion detection. These smart home systems need a much finer-grained tool, where an event means someone is within this small area; the more information that can be derived about locality, the better. The events created by the CASAS testbed are very simple. They come in the form of a four-tuple:

1. Date
2. Time
3. Location
4. Message

The date and time are the time the event occurred, local to the testbed. The location is a named physical spot within the testbed. This is not the serial number of the device, but an actual place. By abstracting the serial number of a device from the physical location, devices can be changed out in the face of hardware failure without impacting the running algorithms. Lastly, the message field is an arbitrary string. For most devices, such as PIR sensors, it is a binary state. Other, more complicated sensors, such as temperature, power meters and water flow sensors, put a numeric value or a description of their state in this field. By leveraging a platform that generates simple and discrete events, the total size of the data sets created is smaller than with approaches that use more complex sensor sources such as video or wearable sensors. There is also significantly less pre-processing to do to interpret the data for later classification. For example, to determine resident location using a camera, a number of image processing techniques have to be applied first. These techniques can take a noticeable amount of computational resources and induce additional noise that later algorithms need to account for. The CASAS sensor platform and event model make for very clean data to be used by artificial intelligence algorithms.
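To illustrate how compact such an event is, the four-tuple can be parsed into a small record. The timestamp format, field separators and example values below are assumptions for the sketch, not the actual CASAS file format:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SensorEvent:
    """One CASAS-style event: date, time, named location, message."""
    timestamp: datetime
    location: str   # a physical place name, not a device serial number
    message: str    # e.g. "ON"/"OFF" for a PIR, or a numeric reading

def parse_event(line: str) -> SensorEvent:
    """Parse a whitespace-separated '<date> <time> <location> <message>' record."""
    date_s, time_s, location, message = line.split(maxsplit=3)
    ts = datetime.strptime(f"{date_s} {time_s}", "%Y-%m-%d %H:%M:%S.%f")
    return SensorEvent(ts, location, message)

ev = parse_event("2010-06-15 09:14:02.37 KitchenSink ON")
```

Because the message field is an arbitrary string, downstream tools can treat binary PIR states and numeric sensor readings uniformly until they need to interpret them.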
3. CASAS Testbeds

In this work two different CASAS testbeds were used. In both locations people live, work, and perform daily activities without external guidance or scripting. The Tokyo testbed is installed in a WSU Electrical Engineering and Computer Science (EECS) department lab. The room is 40' × 36' and is used by the CASAS graduate students as their main work area. Within the room there are a number of separate spaces containing desks, cubicle walls, a conference area, a sitting area and an inner room with engineering work tables. Anywhere from zero to nine or more people will be in the space at any given time, at nearly any hour of the day. The testbed is outfitted with PIR sensors on the ceiling, a front door open/close sensor and power switch sensors, and has now been operational for nearly three years. Tokyo has 44 motion detectors (there is no motion detector number 35 due to a numbering error at installation time). They are all placed on the ceiling, pointing down, at roughly 4' intervals. The ceiling is a dropped t-bar ceiling with a nearly uniform height of 10', and the sensors are attached to the t-bar surface. An example of this implementation at the Tokyo testbed can be seen in Fig. 2. The Kyoto testbed is a three-bedroom campus apartment provided by WSU Housing. The facility has 52 motion detectors along with many other types of sensors. This testbed has been in operation for just over two years, normally with two full-time undergraduate residents. In addition to day-to-day living, the space is also used for a number of smart home studies. These studies will have one or more people moving about the space in scripted or unscripted behaviors during the day. A map with the Kyoto sensor placement is shown in Fig. 3. The sensors whose names begin with 'M' are the motion detectors used for determining the rough location of motion within the space. The rest of the sensors monitor doors, water flow, lights and items.
Figure 1. Floor plan for WSU CASAS office testbed, named Tokyo.
Figure 2. Sample of PIR motion detectors, as installed in the Tokyo testbed.
Figure 3. Floor plan for WSU CASAS student apartment testbed, named Kyoto.
4. Approaches

This work introduces two algorithms to track individuals in the smart home space. Both attempt to exploit the physical and temporal features of the events generated by the residents on the CASAS sensor network. The goal is to incrementally build a model of what is occurring in the space and attribute the events accordingly. Because the tools will eventually need to operate in real time, they take events in order and should be able to classify them quickly. This classification can then be used by other tools, such as ADL detectors, to more accurately describe the current activities. For both tools some terminology had to be defined. The researchers use the term 'entity' within the models to represent an individual, because not every entity in the model represents a person. Most often they are people, but the studies have included smart home installations with cats, dogs and even robots that cause events. Using the term entity allows for a wider understanding of how complex living spaces can be. The two algorithms are similar in many ways, as the evaluation of one led to the creation of the other. The first algorithm is a rule-based tool. It uses a set of simple rules combined with a graph of all possible routes between sensor locations to track individuals. This tool is dubbed 'GR/ED', which stands for Graph and Rule based Entity Detector. The initial results for the GR/ED were promising, but the tool fell short in more complex social situations as well as in poor sensor network environments. The GR/ED is introduced and explored in more depth in Section 4.2.
As a means to exploit the available data and create a better tool, a second tool based on Bayesian Updating was created. Using a corpus of training data annotated with the number of residents, a probabilistic transition matrix is built and used to update the world model. This tool is dubbed 'BUG/ED', which stands for the Bayesian Updating Graph based Entity Detector. By leveraging a probabilistic model, the system is able to handle significantly more issues with the sensor network and performs marginally better in the face of more complex resident behaviors. With these additional successes, the BUG/ED was also tested for its efficacy at improving the performance of a Bayesian classifier for ADL detection. The BUG/ED is discussed in more detail in Section 4.3.

4.1. Annotated Data

To use both algorithms, two corpora of data were created. A subset of the stored events for both the Tokyo and Kyoto testbeds was taken and annotated by humans. The annotators were taught to watch the events as they were replayed in a visualization tool and to log the current number of residents in the testbed. This value representing the current occupancy could then be used to determine how accurate the tracking tools were; in the case of the BUG/ED it was also used to train the transition probability matrix. The Tokyo data set represents sensor events that were generated while faculty, students, and staff performed daily working routines in the lab over the course of 59 days. To train the algorithm, the data was manually inspected by a person and every event annotated with the current number of residents in the space. In total this made for 209,966 motion sensor events, with a mean of 86.84 events and a standard deviation of 201.11 events per resident instance. The resident count ranged from zero to more than nine during this data-gathering window.
Once the testbed had more than six or seven people in it, the annotators noted that there was little available information to identify what was happening in the space. This was anecdotal evidence of the limited resolution of the testbed; adding more sensors should increase this maximum detectable occupancy. The Kyoto data was taken from 21 days of the Kyoto testbed. This made for 231,044 motion sensor events, with a mean of 603.67 events and a standard deviation of 843.17 events per resident instance. Again, the sample data was inspected by a person and annotated with the number of people currently in the space. In this set, the number of residents ranged from zero to five, and the annotators noted a marked decrease in their ability to interpret individuals' movements as the occupancy reached about four residents.

4.2. GR/ED – Graph and Rule Based Algorithm

The GR/ED algorithm was designed to use the order of events to incrementally track individuals in the CASAS testbed. The core idea is that entities will most likely trip sensors as they cross from one location to another, and multiple entities will often be separated by one or more sensors as they go about their day. The graph part of the tool is based on the physical locations of the sensors within the testbed. The graphs for the two CASAS testbeds used in this work are shown in Figs. 4 and 5. (The edge cutting across the Kyoto graph from M026 to M027 connects the sensor at the bottom of the stairs with the one at the top leading to the second story.)
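Such a sensor-location graph can be represented as a plain undirected adjacency map. The sensor names and edges below are illustrative only, not the real testbed layout:

```python
from collections import defaultdict

def build_graph(edges):
    """Undirected adjacency map: location -> set of neighbouring locations."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

# Illustrative edges; a real testbed wires its 44-52 sensors this way,
# including non-obvious links such as the bottom and top of a staircase.
g = build_graph([("M001", "M002"), ("M002", "M003"), ("M026", "M027")])
```

An adjacency set per vertex makes the "is this entity next to the new event?" test a constant-time membership check.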
Figure 4. Graph of sensor locations for the Tokyo testbed.
These graphs are made up of only the downward-facing PIR motion detectors, which are laid out to cover most of the floor space. Since the sensors are placed to cover the space fairly well, people walking around leave an obvious and complete chain of events from one place to another. The graph that represents a given space has vertexes representing the sensors themselves and edges representing the possible connections between those vertexes. The rule-based part of GR/ED is a simple set of logical rules for creating, destroying and moving entities within the model based on the evidence given by the event series. The first rule is for creating a new entity. With this rule, if an ON event occurs at a location with no adjacent entities, a new entity is created. This means that the event was, in theory, caused by a heretofore unseen entity. They could either have entered the space, or have been shadowing another one of the residents and only just then been separated enough to be noticed as a separate entity. The second rule is for destroying entities. An entity is destroyed from the model under two circumstances. The first is when it has been determined to leave the sensor network. In the case of the CASAS system, this is when an entity moves to the sensor most adjacent to an exit. Since there is no hardware available to easily determine whether someone has moved through the doorways, it is assumed that moving next to the door is an exit. The second way an entity can be destroyed is when it fails to generate new events for a period of time. If the model has an entity that does not actually exist, then it must have a means to recover. Since the PIR sensors do not provide data if an entity does not move, it becomes difficult to determine whether they are still in a given location, or if they have moved away without triggering events. This kind of movement can occur either due to a flaw in the sensor network, or if two entities move to the same location and then move together across the space. Since the sensors do not provide any magnitude for the size of an entity, it is easy for multiple people to move as a group and leave old entities in the model that no longer exist. To remedy this, a timeout on entities has been imposed. After trying a wide range of values with the Tokyo data set, it was determined that a timeout of 300 to 600 seconds is the best range; 300 seconds was used for this work. The final rules for the GR/ED tool have to do with movement. The first rule for movement is that when an ON event occurs and an entity is at a neighbor in the graph, that entity moves to the location that generated the event. Only one entity can have the event attributed to it, so if more than one entity is adjacent to the new event, the one that moved most recently takes it. This 'most recent mover continues' rule allows the system to deal with entities moving together in tighter areas.

Figure 5. Graph of sensor locations for the Kyoto testbed.

As the system operated, it was noticed that people could easily fool the GR/ED by walking back and forth. The PIR sensors chosen are from a commodity home security product line. Because the home security hardware is slow, the sensors stay in the ON state for anywhere from one to five seconds after movement stops. Due to this very long time frame, people could walk in the pattern shown in Fig. 6, which would move their virtual entity to the node on the left, but the sensor in the middle would stay on long enough that they could then move to the sensor on the right without causing an ON event on the middle sensor. This would leave their old virtual entity on the left and create a new one from the ON event from the rightmost sensor.
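The create/move/destroy rules of the GR/ED can be sketched roughly as follows. The class and method names are invented for illustration, and the 300-second timeout is the value chosen in the text; this is a sketch of the rules, not the actual CASAS implementation:

```python
TIMEOUT = 300.0  # seconds of silence before a stale entity is removed

class GRED:
    def __init__(self, graph, exits):
        self.graph = graph        # location -> set of adjacent locations
        self.exits = exits        # sensors nearest the doors
        self.entities = {}        # entity id -> (location, last_event_time)
        self._next_id = 0

    def on_event(self, location, time):
        """Attribute one ON event; return the current occupancy estimate."""
        # Movement rule: an adjacent entity takes the event; ties go to
        # the most recent mover.
        adjacent = [eid for eid, (loc, _) in self.entities.items()
                    if loc == location or loc in self.graph[location]]
        if adjacent:
            eid = max(adjacent, key=lambda e: self.entities[e][1])
        else:
            # Creation rule: no adjacent entity, so a new one appears.
            eid, self._next_id = self._next_id, self._next_id + 1
        self.entities[eid] = (location, time)
        # Destruction rules: reaching an exit sensor counts as leaving,
        # and entities silent past the timeout are dropped.
        if location in self.exits:
            del self.entities[eid]
        for e, (_, t) in list(self.entities.items()):
            if time - t > TIMEOUT:
                del self.entities[e]
        return len(self.entities)
```

On a toy three-sensor chain with the third sensor marked as an exit, a single walker is created at the first event, moved by the second, and destroyed when they reach the door.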
Figure 6. Example of movement that breaks the simple GR/ED algorithm.
At this point the system was out of sync with the space, and the false entity left behind would have to time out before the GR/ED would be correct again. To remedy this failing, the Open List of sensors was proposed. With the Open List, an entity has a set of locations that it currently occupies. The PIR sensors are robust: for every ON event, there is always a matching OFF. When an entity is attributed an ON event, that location is placed in its Open List. Once that sensor sends its final OFF event, the location is removed from the Open List. With this list available, an entity's location is not merely its current vertex in the graph, but the whole of the Open List. If an ON event occurs that is adjacent to any location in this list, it may be attributed to the entity. This technique solved most of the issues with people walking back and forth. In the previous example, the entity's Open List would contain both the center and left sensors, so when it next trips the right sensor it is still considered 'adjacent', due to the middle sensor being in its Open List, and is properly attributed the new event. The resulting system was efficient and operated in near real time, making it feasible for real-world smart home implementations. As an added advantage, it takes no training data to operate, only the graph of possible routes between sensor locations. This allows the GR/ED to be deployed and started as soon as the layout of the sensors is known, without having to wait for any kind of annotated training data to be made available.

4.2.1. Testing the GR/ED

The GR/ED tool was tested for accuracy at counting the current number of residents using both the Tokyo and Kyoto data sets. The tool was evaluated using 10-fold cross-validation, divided by days. Once the data sets were run through the tool, the resultant guesses were compared to the human-annotated ground truth. The results could then be inspected for the total number of events correct, as well as the total length of time correct.
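The Open List bookkeeping described in Section 4.2 can be sketched as follows; the class and its API are illustrative, not the actual CASAS code:

```python
class Entity:
    """An entity's position is the set of sensors still ON for it."""
    def __init__(self):
        self.open_list = set()

    def is_adjacent(self, location, graph):
        # The new event borders (or equals) any sensor still open for us.
        return any(location == loc or location in graph[loc]
                   for loc in self.open_list)

    def attribute(self, location):
        self.open_list.add(location)      # ON event attributed to this entity

    def release(self, location):
        self.open_list.discard(location)  # the matching OFF event arrived
```

In the Fig. 6 scenario, a walker whose Open List holds both the left and center sensors is still adjacent to the right sensor through the center entry, so the new event is attributed correctly instead of spawning a false entity.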
Figure 7. Accuracy by occupancy count for the GR/ED tool on the Tokyo data set.
4.2.2. Results for GR/ED

The Tokyo results were somewhat promising. GR/ED was very accurate with zero and one residents, as expected, but its accuracy rapidly fell as the number of residents increased. Figure 7 shows the accuracy by number of events on the Tokyo data set; note that as the resident count increased, the accuracy declined. Since the GR/ED tool cannot tell the difference between a single resident and multiple residents at a given location, while the annotators can, it is often too low in its estimations. Additionally, it can be too high when a false-positive entity persists until it times out. Overall, the GR/ED algorithm achieved an accuracy of 72.2% with a standard deviation of 25.21% by counting events, and an accuracy of 88.9% with a standard deviation of 12.8% for the total time represented by the data set. The Kyoto data set truly showed the flaws in the GR/ED algorithm. This testbed has significantly more sensor error: people are able to move past sensors in many more places without tripping intermediate sensors. This quickly leads to many false entities being created in the model and a marked reduction in accuracy. Overall, the GR/ED had an accuracy of 16% measured by number of correctly-labeled events and 45% for total correctly-labeled time on the Kyoto data set. These low accuracies placed it well within range of a weighted random guess, so further evaluation on the Kyoto data set was abandoned. The GR/ED tool has the advantage of not requiring any training data, only the graph itself. If the sensor locations can be determined at installation time, or automatically through some means, then this tool can be used with a new smart home installation quickly. Depending upon the needs of the other tools within the system, it may be sufficient for the smart home application.
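The two reported measures, accuracy by events and accuracy by total time, can be computed from a labelled stream roughly as follows. The event format is invented for the sketch: each entry pairs a timestamp with the guessed and true occupancy counts, and a guess is assumed to hold until the next event:

```python
def event_accuracy(events):
    """events: list of (timestamp_seconds, guessed_count, true_count)."""
    return sum(g == t for _, g, t in events) / len(events)

def time_accuracy(events):
    """Weight each guess by how long it holds, i.e. until the next event."""
    correct = total = 0.0
    for (ts, g, t), (next_ts, _, _) in zip(events, events[1:]):
        span = next_ts - ts
        total += span
        correct += span if g == t else 0.0
    return correct / total

stream = [(0, 1, 1), (10, 2, 1), (40, 1, 1), (60, 0, 0)]
```

On this toy stream, three of four events are correct (75%), but the wrong guess holds for half of the elapsed time, so the time-weighted accuracy is only 50%. This is why the two measures can diverge as strongly as the 16% versus 45% Kyoto figures.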
Because the graph used by the GR/ED is so rigid, it was determined that a more probabilistic model might be a better solution. Instead of relying on a human-created set of equal and fixed connections between locations, a graph of likely connections derived from the annotated data might serve better. This led to the application of Bayesian updating and the creation of the BUG/ED tool.

4.3. BUG/ED – Bayesian Updating Graph

After reviewing various algorithms, a Bayesian updating algorithm was chosen as the successor to the GR/ED tool. Bayesian updating is a probabilistic strategy in which new evidence is used to revise the current estimate of the state of the world. The Bayesian Updating Graph Entity Detector (BUG/ED) proposed here takes the current model of the smart home space, with respect to the current resident locations, and combines it with new evidence in the form of a sensor event to build the most likely world model for the latest state. Its behavior is similar in many ways to the GR/ED's, but instead of a simple, uniform graph it uses a transition matrix of probabilities. The matrix can also be augmented with other sources of evidence, though the algorithm here was provided only sensor-location-to-sensor-location transition likelihoods.

The biggest advantage of the BUG/ED over the GR/ED is its handling of sensor network failures. Often a person will bypass a sensor in the graph, which caused an immediate problem for the GR/ED tool: it would create a new entity in the model and improperly abandon the old one. With the BUG/ED, the transition matrix will normally contain some likelihood of transition between the two more distant sensors, so the tool will often move its entity correctly even when the person it represents occasionally skips sensors. This ability alone increased the robustness of the system in day-to-day operation.

4.3.1. Training the BUG/ED Matrix

With Bayesian updating, there must be some corpus of information from which the algorithm can estimate the conditional and joint probabilities. Obtaining or generating that corpus is up to the implementation and domain. Here, the human-annotated data from the Tokyo and Kyoto testbeds, which specifies the number of current residents, was used to train the BUG/ED transition matrix. This training must be completed before operation of the BUG/ED can begin.

The training process itself resembles the GR/ED algorithm, with one very important addition: since the annotated data contains the true count of residents, the training algorithm can use that key value to determine when residents entered and left the space. The training algorithm takes the events in the training data one at a time and incrementally builds a model of the residents' locations and of the transitions between sensor locations, much as the GR/ED does. The difference is that it uses the resident count from the training data to decide when to create, destroy, or move entities. It also makes use of the same graph as the GR/ED tool, but only for counting hops across the graph. This graph has one addition in the BUG/ED environment: a virtual sensor location called "OUTSIDE", which represents all of the world not within the smart home space. It is directly connected via an edge in the graph to any sensor at an exit of the smart home, such as the front and back doors. Entities are moved from OUTSIDE when they are created and to OUTSIDE when they are removed.
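The training pass described above can be sketched in a few lines. This is an illustrative simplification, not the CASAS implementation: the event format (sensor_id, annotated_resident_count) is assumed, and the "closest entity on the graph" rule is approximated with a simple FIFO queue, since the hop-counting floor-plan graph is not reproduced here.

```python
from collections import defaultdict

OUTSIDE = "OUTSIDE"

def train_transition_counts(events):
    """Build location-to-location transition counts from annotated
    events given as (sensor_id, true_resident_count) pairs.
    Hypothetical simplification: the 'closest entity' choice is
    approximated with a FIFO queue instead of graph hop counts."""
    counts = defaultdict(lambda: defaultdict(int))
    entities = []          # current sensor location of each tracked entity
    prev_count = 0
    for sensor, count in events:
        if count > prev_count:                 # a resident entered
            counts[OUTSIDE][sensor] += 1
            entities.append(sensor)
        elif count < prev_count and entities:  # a resident left
            left = entities.pop(0)             # stand-in for "closest to exit"
            counts[left][OUTSIDE] += 1
        elif entities:                         # a resident moved
            src = entities.pop(0)              # stand-in for "closest on graph"
            counts[src][sensor] += 1
            entities.append(sensor)
        prev_count = count
    return counts
```

During operation, row-normalizing these counts yields the transition likelihoods the BUG/ED combines with incoming sensor evidence.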
This graph is used to determine which entity is closest to the OUTSIDE location, or which entity is closest to an event that just occurred. By checking whether the resident count went up, went down, or stayed the same between events, the training algorithm will either create, destroy, or move entities. If the count goes up, a new entity is created at that location by moving it from the virtual location OUTSIDE to the location of the event. If the count goes down, the entity closest to the exit is immediately moved to OUTSIDE. If the count stays the same, the entity closest to the event on the graph is moved there.

Every time an entity is created, destroyed, or moved, that transition from one location to another is added to a matrix. The matrix records the number of times entities transitioned between locations and is the source of probabilities during the operation of the BUG/ED algorithm on new data. The length of wall-clock time an entity resides at a given sensor location is also kept. This set of dwell times is used to determine dynamic timeouts for entities, which will be discussed in more depth later.

4.3.2. Noise Reduction in the BUG/ED Matrix

The training algorithm for the BUG/ED matrix is not perfect. Inspection of the results shows several instances where the model of the testbed reached a state in which taking the closest entity was inappropriate. This would inflate the likelihood of transition between two unrelated locations, but with a large enough training set this does not normally impact the overall performance of the BUG/ED too greatly. In some of the training data the human annotators were also incorrect in their resident count; since that value is central to the training phase, these bad training files likewise reduced the overall accuracy of the system. To overcome these aberrant transitions between sensor locations, a flooring filter was applied to the transition probability matrix.
Any transition likelihood below the filter threshold was changed to the lowest probability. Setting this flooring value proved to have a profound effect on the behavior of the system. If it was set too low, too many bad connections were left in, and the BUG/ED would have too little evidence to create new entities as people entered the space. Alternatively, if it was set too high, too many entities would end up being created. For each data set, the flooring value was derived experimentally. In future work, a proper outlier detection algorithm for each sensor location will replace this flat number.

An additional noise reduction step removed training data that was too complex to use reliably: a maximum occupancy limit on the training data. As the number of residents within a space increases, it becomes more and more difficult to determine how many are truly there; this limit is a function of the sensor density and of how mobile the residents are. The annotators noted that once more than five or six people were in the Tokyo testbed, it was nearly impossible to keep their locations perfectly tracked. At that point, the annotators watched the entrance for people entering and leaving rather than individual events anywhere in the space. Since the algorithm that builds the BUG/ED transition matrix is a simple one, a ceiling on the number of occupants in the space was imposed, and any training data exceeding that number was thrown out. Between removing very unlikely connections and discarding training data with too many residents, the BUG/ED tool began to perform much better in day-to-day use, and the overall accuracy of the system improved.
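A minimal version of the flooring filter might look as follows. Both the floor value and the substitute "lowest probability" (floor / 10) are illustrative choices here, since the chapter derives the flooring value experimentally for each data set.

```python
def floor_transitions(counts, floor=0.05):
    """Turn raw transition counts into row-normalized probabilities,
    clamping any likelihood below `floor` down to a small epsilon.
    Both the floor and the epsilon (floor / 10) are assumptions of
    this sketch, not values from the chapter."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {}
        for dst, n in row.items():
            p = n / total
            # suppress aberrant, rarely seen transitions
            probs[src][dst] = p if p >= floor else floor / 10
    return probs
```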
4.3.3. Timeouts in the BUG/ED

In the GR/ED tool, a flat timeout for entities was enforced. This was set at 300 seconds, a value chosen by running the GR/ED tool on the data repeatedly with different timeout values and comparing the overall accuracy at guessing the number of residents; the best value of 300 was kept for future work with the tool. This flat timeout of 300 seconds is also the default used by the BUG/ED, though it is supplanted by the dynamic timeout system described below.

It was noted that the GR/ED would time out most often when residents sat and worked in one location for some time without moving enough to cause sensor events. Because the training algorithm for the BUG/ED is stateful and remembers an entity's location indefinitely until it moves, it could be used to find a more appropriate timeout for every sensor location. The hypothesis was that a dynamic timeout system derived from the training data would let the BUG/ED better handle situations where entities remain still for long periods. As the BUG/ED transition probability matrix is trained, the total length of time an entity spends at a given sensor is kept. Once all the data has been used for training, these lists of dwell times are used to build a single timeout value for each sensor location: the mean plus three standard deviations. Manual inspection of the results mostly conformed to the expected timeouts. Areas such as hallways and kitchens had shorter timeout values, while desks, beds, and couches ended up with longer ones. This was not always true, but the flaws in the timeout calculations were the result of flaws in the simple training rules used to build the transition probability matrix.

4.3.4. Testing the BUG/ED

The BUG/ED was tested using the same two data sets as the GR/ED tool.
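The per-sensor dynamic timeout rule described in Section 4.3.3 above (mean plus three standard deviations of the observed dwell times) can be sketched as follows; the fallback to the flat 300-second default for sparsely observed sensors is an assumption of this sketch.

```python
import statistics

def dynamic_timeouts(dwell_times, default=300.0):
    """Per-sensor entity timeout = mean + 3 * standard deviation of the
    dwell times recorded while training the transition matrix.
    Sensors with fewer than two observations fall back to the flat
    300-second default (the fallback rule is assumed, not from the
    chapter)."""
    timeouts = {}
    for sensor, times in dwell_times.items():
        if len(times) < 2:
            timeouts[sensor] = default
        else:
            timeouts[sensor] = statistics.mean(times) + 3 * statistics.stdev(times)
    return timeouts
```

A hallway with consistently short dwell times thus gets a short timeout, while a desk or bed with long, varied dwell times gets a much longer one.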
Because the BUG/ED requires training data, a 3-fold cross-validation scheme was implemented: 2/3 of the available days were used to train the transition matrix and the remaining 1/3 for testing. The days were selected randomly, and during testing the model was reset at the start of each day. The overall accuracy was calculated by counting the number of events at which the BUG/ED was correct about the current number of residents. The difference between the true value and the tool's current guess was also calculated, to give a sense of how far the model was from the ground truth. Since this is a probabilistic model, some error is to be expected; depending on the final use of the tools, a roughly accurate guess might be sufficient for the smart home system's needs.

4.3.5. Results of the BUG/ED

As hoped, the BUG/ED performed better than the GR/ED tool on these data sets. Researchers watching the BUG/ED operate in real time noted that it felt more 'stable'. Indeed, the BUG/ED failed less often in the face of skipped sensors and timed out less often when people stayed in one place for a period of time. These impressions were quantified by higher accuracy rates and measurable benefits to the ADL detection tools. The BUG/ED tool's overall accuracy improved over that of the GR/ED on both data sets. It was a significant improvement on the Kyoto data, mostly due to its ability to
Figure 8. Accuracy by occupancy count for the BUG/ED tool on the Tokyo data set.
handle missed sensors as people moved about. Overall, the BUG/ED classified 44% of the events correctly on the Tokyo data set, which accounts for 85% of the total time. Where it improves over the GR/ED tool is when more people occupy the space: in Fig. 8, the accuracies for 2, 3, and 4 residents are noticeably higher than in Fig. 7, which showed the GR/ED results for the same data. This new robustness in the face of more residents attests to the efficacy of the BUG/ED approach.

Where the BUG/ED truly performed much better was on the Kyoto data set. While the GR/ED tool routinely failed as people traversed the space, the BUG/ED would much more often track them correctly. Fig. 9 shows that in the most common state of two residents, the tool is perfectly accurate just over 60% of the time. Overall, the BUG/ED classified 59% of the events and 67% of the total time correctly for the Kyoto data set, significantly better than the GR/ED tool on this data. These improvements in behavior and accuracy argue for using a probabilistic model for decision making in this kind of tracking system: there are too many uncertainties in sensor placement, resident behavior, and system configuration for a purely rule-based system to operate well.

4.3.6. Application of BUG/ED to Activity Recognition

Many of the smart environment applications that have been explored, such as health monitoring, health assistance, context-aware services, and automation, rely upon identifying the activities that residents are performing. Activity recognition is not an untapped area of research, and the number of algorithms that have been used to learn activity models varies almost as greatly as the types of sensor data that have been employed for this task. Some of the most commonly used approaches are naive Bayes classifiers, decision trees, Markov models, and conditional random fields [2,14–16].
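As an illustration of the first of these approaches, a minimal naive Bayes classifier over "bags" of sensor events might look as follows. This is a generic textbook sketch, not the detector used in the experiments; the window/label data layout is assumed.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesADL:
    """Bag-of-sensor-events naive Bayes activity classifier with
    Laplace smoothing.  A generic sketch of one of the commonly used
    approaches cited in the chapter, not the CASAS implementation."""

    def fit(self, windows, labels):
        self.priors = Counter(labels)           # class counts
        self.counts = defaultdict(Counter)      # per-class sensor counts
        for window, label in zip(windows, labels):
            self.counts[label].update(window)
        self.vocab = len({s for c in self.counts.values() for s in c})
        return self

    def predict(self, window):
        def log_score(label):
            total = sum(self.counts[label].values())
            s = math.log(self.priors[label])    # unnormalized log prior
            for sensor in window:               # Laplace-smoothed likelihoods
                s += math.log((self.counts[label][sensor] + 1)
                              / (total + self.vocab))
            return s
        return max(self.priors, key=log_score)
```

Trained on windows of sensor IDs labeled with the activity performed, the classifier assigns a new window to the activity with the highest posterior score.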
Figure 9. Accuracy by occupancy count for the BUG/ED tool on the Kyoto data set.

Table 1. Attributes of the three tested Kyoto data sets

Set Name   #Months   #Residents   #Activities
Set 1      2         2            12
Set 2      2         2            13
Set 3      5         2            25
While activity recognition accuracy has become more reliable in recent years, most existing approaches are applied to situations in which a single resident is in the space performing activities. Recognition accuracy degrades significantly when multiple residents share the same space. We hypothesize that this accuracy can be improved if the data is separated into multiple streams, one for each resident, or if each event is labeled with the corresponding resident identifier. To validate this hypothesis, we apply the BUG/ED algorithm to data collected in the Kyoto apartment while two residents lived there and performed their normal daily routines. The data used for this experiment represents different time frames, different residents, and different activities than were used to train the BUG/ED graph. Attributes of these three data sets are shown in Table 1. Table 2 summarizes the performance of the activity recognition algorithm for each data set with and without entity labeling using BUG/ED. As the table shows, the accuracy of activity recognition generally improves when entity detection and tracking are employed. To demonstrate that the BUG/ED strategy is useful in further smart home tools, it was used to annotate three new sets of Kyoto data. That data was then used to train
Table 2. Before and after ADL detection accuracies when adding BUG/ED tracking information to Kyoto data

Set Name   Without BUG/ED   With BUG/ED
Set 1      42%              63%
Set 2      40%              88%
Set 3      54%              63%
Overall    56%              67%
a Naive Bayesian ADL detector. The results with and without the BUG/ED tracking information were compared and are summarized in Table 2. These three data sets are annotated for 11 different ADLs in an unscripted environment. There are two residents, though at any given time only one person, or more than two, might be present. Together the data sets cover nearly a full calendar year and run all day, every day. The overall improvement in complex ADL detection was just over 10%.
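One way the tracker output described above can feed an ADL detector is to separate the interleaved sensor stream into per-resident streams using the entity labels BUG/ED attaches to each event, so the classifier sees one resident at a time. A sketch, with an assumed (resident_id, event) data layout:

```python
from collections import defaultdict

def split_by_resident(labeled_events):
    """Separate one interleaved sensor stream into per-resident streams
    using the entity labels a tracker such as BUG/ED attaches to each
    event.  The (resident_id, event) pairing is an assumed layout for
    illustration, not the chapter's file format."""
    streams = defaultdict(list)
    for resident_id, event in labeled_events:
        streams[resident_id].append(event)
    return dict(streams)
```

Alternatively, the resident identifier can simply be appended to each event as an extra feature for the classifier, which is the other labeling strategy the hypothesis mentions.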
5. Conclusion

In this work two different, though similar, tracking tools were introduced and evaluated. The first uses a graph of the sensor network in a smart home environment and a set of rules to determine the current location and history of individuals. The second uses a history of resident occupancy information to build a set of probabilities used by a Bayesian updating tool for tracking residents. Both tools have benefits and drawbacks, though overall the probabilistic model provided by the BUG/ED performed better, especially in an environment with a poor sensor layout. There will be places for many kinds of tracking systems in smart home technologies; choosing the right one for the residents' needs will be important for the continued success of smart homes in multi-resident situations. Continued research into passive tracking systems should improve upon these kinds of tools, allowing smart homes to handle ever more complex behaviors and greater numbers of residents.
6. Future Work

Both of the tools presented here offer room for improvement, the BUG/ED especially. First would be a better method of deriving the transition matrix: changing the training algorithm, or finding ways to reduce the amount of data needed to produce a useful set of probabilities, would be very beneficial. Second would be better methods of detecting the entrance or exit of individuals. This could be accomplished by taking door sensor information into account, or by adding a more specific kind of sensor at the doorway to report someone entering or leaving. Finally, the impact of the sensor layout itself should be evaluated. The current CASAS sensors are focused on detecting ADLs, but some sensors could perhaps be placed in key locations to improve the tracking ability of the system.
Acknowledgments This work is supported by NSF grant IIS-0121297.
References

[1] M. Skubic, G. Alexander, M. Popescu, M. Rantz and J. Keller, "A smart home application to eldercare: Current status and lessons learned," Technology and Health Care, vol. 17, no. 3, IOS Press, 2009, pp. 183–201.
[2] E.M. Tapia, S.S. Intille and K. Larson, "Activity recognition in the home setting using simple and ubiquitous sensors," in Proc. PERVASIVE, LNCS 3001, Springer-Verlag, Berlin/Heidelberg, 2004, pp. 158–175.
[3] D.J. Cook and S.K. Das, "How smart are our environments? An updated look at the state of the art," Journal of Pervasive and Mobile Computing, vol. 3, no. 2, pp. 53–73, 2007.
[4] J.R. Smith, K.P. Fishkin, B. Jiang, A. Mamishev, M. Philipose, A.D. Rea, S. Roy and K. Sundara-Rajan, "RFID-based techniques for human-activity detection," Commun. ACM, vol. 48, no. 9, pp. 39–44, 2005.
[5] A. Harter and A. Hopper, "A distributed location system for the active office," IEEE Network, vol. 8, no. 1, pp. 62–70, Jan. 1994.
[6] N. Bulusu, J. Heidermann and D. Estrin, "GPS-less low-cost outdoor localization for very small devices," IEEE Personal Communications, Special Issue on Smart Spaces and Environments, pp. 28–34, October 2000.
[7] J. Hightower and G. Borriello, "Location systems for ubiquitous computing," Computer, vol. 34, no. 8, pp. 57–66, August 2001.
[8] O. Brdiczka, P. Reignier and J.L. Crowley, "Detecting individual activities from video in a smart home," in Knowledge-Based Intelligent Information and Engineering Systems, LNCS 4692, Springer, Berlin/Heidelberg, 2010, pp. 363–370.
[9] J. Krumm, S. Harris, B. Meyers, B. Brumitt, M. Hale and S. Shafer, "Multi-camera multi-person tracking for EasyLiving," in Proc. Third IEEE International Workshop on Visual Surveillance, IEEE Press, Piscataway, NJ, 2000, pp. 3–10.
[10] G. Demiris, B.K. Hensel, M. Skubic and M. Rantz, "Senior residents' perceived need of and preferences for smart home sensor technologies," International Journal of Technology Assessment in Health Care, vol. 24, Cambridge University Press, 2008, pp. 120–124.
[11] C. Reynolds and C.R. Wren, "Worse is better for ambient sensing," in Pervasive: Workshop on Privacy, Trust and Identity Issues for Ambient Intelligence, 2006.
[12] S. Helal, B. Winkler, C. Lee, Y. Kaddoura, L. Ran, C. Giraldo, S. Kuchibhotla and W. Mann, "Enabling location-aware pervasive computing applications for the elderly," in Proc. First IEEE International Conference on Pervasive Computing and Communications (PerCom 2003), 2003, pp. 531–536.
[13] R.J. Orr and G.D. Abowd, "The smart floor: A mechanism for natural user identification and tracking," in CHI '00 Extended Abstracts on Human Factors in Computing Systems, ACM, New York, NY, 2000, pp. 275–276.
[14] U. Maurer, A. Smailagic, D. Siewiorek and M. Deisher, "Activity recognition and monitoring using multiple sensors on different body positions," in Proc. International Workshop on Wearable and Implantable Body Sensor Networks, 2004, pp. 99–102.
[15] D. Cook and M. Schmitter-Edgecombe, "Assessing the quality of activities in a smart environment," Methods of Information in Medicine, vol. 48, no. 5, 2009, pp. 480–485.
[16] L. Liao, D. Fox and H. Kautz, "Location-based activity recognition using relational Markov networks," in Proc. International Joint Conference on Artificial Intelligence, 2005, pp. 773–778.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-83
KopAL - An Orientation System For Patients With Dementia

Sebastian FUDICKAR a,1, Bettina SCHNOR a, Juliane FELBER b, Franz J. NEYER b, Mathias LENZ c, and Manfred STEDE c
a Institute of Computer Science, Potsdam University
b Institute of Psychology, Personality Psychology and Psychological Diagnostics, Friedrich Schiller University of Jena
c Department of Linguistics, Potsdam University

Abstract. As a result of the aging societies of the western world, the impact of dementia, with characteristics such as disorientation and forgetfulness, is becoming a significant problem for an increasing number of persons and for the health system. To enable dementia patients to regain a self-determined life, we have developed a mobile orientation system called KopAL that assists them with everyday problems, such as remembering appointments and keeping track of their position within familiar surroundings, and that informs caretakers in critical situations, with a focus on minimal operational costs and a speech-based human-computer interface. While ease of use was one of KopAL's requirements, the system itself uses and experiments with new technologies in the fields of Mobile Ad Hoc Networks (MANETs), VoIP, and embedded systems. Further, KopAL is an interdisciplinary project. It is developed at Potsdam University within the Assisted Living Initiative of the Institute of Computer Science [3]. The working group Applied Computational Linguistics of Potsdam University cooperates to integrate speech generation and recognition on embedded systems. Psychologists from the University of Jena have evaluated the project from the beginning, helping to determine the real user demands and a suitable user interface.

Keywords. Appointment reminder, assisted living, dementia, emergency call, empirical study, KopAL, localization, mobile system, nursing home, speech synthesis
1. Introduction

The aging society brings an increasing number of elderly people, along with the associated disease patterns, and challenges today's societies with increasing costs for the health systems. For example, 20.3% of the German and 17.34% of the European population were aged 65 or over in 2009. Dementia, one of the diseases of old age, is clearly becoming a severe problem in aging societies. Dementia manifests itself primarily at an average age of 72.3 and older, as stated in the multi-study of Clarfield et al. [5]. Next to the social challenges, the resulting personal consequences for people with dementia are also notable. In the early stages of dementia, patients try to overcome the resulting symptoms, like

1 Corresponding author: S. Fudickar, B. Schnor, [fudickar||schnor]@cs.uni-potsdam.de
disorientation in space and time, to be able to continue life as usual. As the disease intensifies it becomes harder for the patients to keep track of upcoming appointments and regular tasks (such as taking medication or cooking), since the abilities of the short-term memory decrease. Even orientation in (previously) familiar environments becomes continuously harder. Dementia patients still demand (at least in the early stages) a self-determined life for as long as possible, but with the progressing disease they increasingly depend on external help.

Imagine a small electronic assistant that resides in the patient's pocket, reminds the patient about upcoming events with a natural-sounding voice, and takes care of them in an autonomic manner. Such a device may increase the well-being of these patients significantly, since they may stay self-determined longer. In addition, when used in nursing homes, such a device reduces the workload of the caretakers. In this contribution, we introduce KopAL [11], an orientation system that focuses on supporting dementia patients, their relatives, and caretakers in nursing homes in the following situations:

• Reminding patients of upcoming social or regular events (e.g. meals, choir meetings or remedial gymnastics) as well as medications.
• Patients getting lost in their surroundings on unaccompanied walks.
• Patients panicking even in familiar situations, or requiring assistance (e.g. after a collapse).

The remainder of this chapter is structured as follows. Section 2 describes possible environments, the empirical study, and the resulting requirement list. An overview of the KopAL system is given in Section 3. The essential services, such as the appointment reminder, the calling functionality, and the localization engine, are described in more detail in the following sections. Essential lessons learned are summarized in Section 7, followed by a summary in Section 8.
2. Application Field

2.1. Use cases

The KopAL system is developed for and deployed within the following testbeds:

2.1.1. Nursing home testbed

We are cooperating with a nursing home in Stahnsdorf, Germany, which is part of the LAFIM2 institution. As shown in Figure 1, this nursing home lies next to a forest on one side of the building and a highway 200 m away on the other side. Patients therefore have the opportunity to take long walks through the woods on their own, as long as they have no orientation disabilities. The caretakers regularly organize group-based events, such as reading clubs. Most patients take their meals in the dining rooms.

2 http://www.lafim.de (12.10.2010)
2.1.2. Home stay testbed

Another testbed is the home of a dementia patient who lives at home. His moderate dementia was diagnosed 10 years ago, and he is supported by mobile caretakers once a week. His major requirement is the appointment reminder functionality, since he is still able to orient himself within his surroundings, where he has lived since his youth.

2.2. Empirical study

Introducing assistive technology to elderly people with age-related deficits or diseases such as dementia requires an understanding of their specific needs in order to establish the usefulness of such devices. Valid data must be obtained from the elderly and their caretakers and needs to be accounted for in the process of developing helpful technologies such as KopAL. To meet these requirements, we interviewed ten residents without dementia and ten caretakers of the nursing home in Stahnsdorf prior to a test phase of KopAL in the institution.

2.2.1. Interviewing the residents

Ten residents of the nursing home in Stahnsdorf were interviewed by two researchers during the development process and prior to the test phase of KopAL. The sample consisted of seven women and three men ranging in age from 65 to 90 years (Mean = 77.5 years). The interviewees were nominated by the administration of the nursing home based on their good mental and physical health status. As KopAL aims to assist more severely impaired residents, our sample might not exactly represent the target group. Nevertheless, we chose them as interview partners and test subjects because we expected them to be able to state first impressions and make suggestions for KopAL's improvement prior to and within the test phase. The interviews lasted about 45 minutes each and addressed (1) experiences with devices that share some features with KopAL, (2) overall
Figure 1. The Stahnsdorf region map
technology commitment, and (3) first impressions of KopAL and recommendations for the development and improvement of the device based on individual needs.

1. Technology experience: First, we were interested in the residents' overall experience with new technology that shares some similarity with KopAL. We asked the elderly whether they own a computer, a mobile phone and/or a wireless phone. Not surprisingly, the respondents neither owned a computer or a mobile phone, nor had they access to these devices on a regular basis. Consequently, no resident felt familiar with computers and mobile phones. However, most of the residents owned a wireless phone and used it at least 2-3 times per week, which made most of them feel quite comfortable with this technology. These results matched our expectations, as most of the elderly will have retired before computers and mobile phones became common equipment in today's working environments. Furthermore, it cannot be assumed that computer and mobile phone use is standard in nursing homes, where residents often deal with age-related physical and/or psychological impairments. Nonetheless, we appreciated finding that the residents felt comfortable with a wireless phone, since the use of this device and of KopAL are related to a certain extent. Although not explicitly addressed in the interviews with the elderly, we were informed that most of the residents were familiar with emergency buttons carried around the neck. Overall, we concluded that it is important to build on the residents' experience with wireless phones and emergency buttons; doing so within the implementation process of KopAL will certainly help to overcome users' doubts and insecurities.

2. Technology commitment: Although we expected the residents to be a technologically rather inexperienced group, we were surprised to find them describing themselves as committed to technology.
We consider technology commitment a complex personality trait with quite consistent inter-subject variability. Technology commitment has a three-dimensional structure: (1) technology acceptance, (2) technology competence, and (3) technology control. While technology acceptance refers to a positive attitude towards new technology, technology competence addresses the individual belief in mastering the use of technology. Technology control concerns internal control beliefs in dealing with new technology, that is, a sense of one's own control versus the idea of chance, (mis-)fortune and/or external influence when using technology. Technology commitment is measured by a 15-item scale (5 items per dimension) with a five-point Likert-type response format (1 = "not true at all" to 5 = "very true"). Contrary to the stereotype, in previous studies we have found no age-related differences in technology acceptance, which means that the elderly are not per se less interested in new technology. However, they feel less competent in using modern technology. This may partly be due to poor implementation strategies and former dysfunctional experiences with new devices. No age-related differences have been found in technology control and overall technology commitment [14].

The majority of interviewed residents from the nursing home in Stahnsdorf reported a general interest in new technology (Mean technology acceptance = 3.49). Items such as "I am very curious about new technology" and "I like new technology", which address the individual attitude towards new technology, were positively rated by nearly all of the respondents. Interestingly, they differed more strongly in their ratings of items with a reference to one's own use of new technology, such as "I enjoy the use of new technology" and "If I had the opportunity to use more technology in my daily life, I would do so". This ambiguity has to be kept in mind during the implementation process of a new device such as KopAL. Concerning technology competence (Mean technology competence = 2.91), the uncertainty in using new technology became further obvious, as about half of the residents agreed with items such as "Often I am in fear of failing when using new technology" or "I am afraid of breaking new technology rather than using it appropriately". Nonetheless, the elderly feel in control when dealing with new technology (Mean technology control = 3.69). Items such as "It depends on me whether or not I solve problems that I might face when dealing with new technology" or "Being successful in using new technology is a result of my personal effort" were positively rated by two thirds of the respondents. In conclusion, the interviewed residents reported a sufficient technology commitment (Mean technology commitment = 3.36), which we interpret as a solid basis for the implementation of KopAL.

3. Introducing KopAL: After assessing the residents' experience with technology and their technology commitment, we introduced a KopAL demonstrator to the elderly. The functions were described and the residents had the opportunity to test the device. We then presented a variety of statements about KopAL and asked the residents to indicate their (dis-)agreement according to their first impression of KopAL. The majority of the elderly revealed an interest in the device and stated that they would like to use it. Over half of the residents thought of KopAL as a device that would improve their daily life. The residents in doubt were those who were still in good health; they stated that they would not need KopAL or its functions yet.
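Scoring the 15-item commitment scale described above reduces to averaging each five-item block and the full set of items. The sketch below assumes the items are ordered acceptance, competence, control; the chapter does not list the items or their order.

```python
def commitment_scores(responses):
    """Score the 15-item technology commitment scale: the mean of each
    five-item block gives one dimension, the mean of all items the
    overall score.  The block ordering (acceptance, competence,
    control) is an assumption of this sketch."""
    assert len(responses) == 15
    dims = {}
    for name, start in (("acceptance", 0), ("competence", 5), ("control", 10)):
        dims[name] = sum(responses[start:start + 5]) / 5
    dims["overall"] = sum(responses) / 15
    return dims
```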
When asked to imagine the use of KopAL, one third of the residents expressed concerns regarding the appropriate use of the device. This may partly be due to the fact that these respondents had special disabilities such as impaired hearing and vision, spasticity after a stroke, missing fingers and symptoms of Parkinson's disease. This diversity of impairments underlines the need for a careful implementation of the device based on the user's capacity, meeting individual fears and insecurities. We then focused on the specific functions of the device: (1) emergency call, (2) reminder, and (3) localization. All residents stated that they would make use of the emergency call function since they considered it a helpful and important feature. The participants insisted on a guaranteed connection to the caretaker, sufficient audibility in both directions, and reliable help. Regarding the reminder function of KopAL, all of the interviewed residents looked upon it favorably. Two thirds were also positive about its use. Participants who rejected it stated that they would not need this function yet as they still note and remember appointments, events and tasks themselves. When asked about the necessities of this feature, the residents emphasized clear enunciation, sufficient audibility, unambiguousness of the information, punctuality of the reminder and the opportunity to replay the information as desired. While the emergency call and the reminder function of KopAL were appreciated by the majority of the residents, over half of them stated that they would not use the localization function as they are not in need of it yet. However, all residents - being aware of critical incidents with residents who suffer from dementia - considered this an essential feature for the nursing home environment. Again, they demanded sufficient audibility and unambiguousness of the information given by the caretaker. Additionally, the residents recommended that the criterion 'leaving the familiar environment' should be defined individually (e.g., not too extensive). In a final step we presented different options for the configuration of KopAL to the residents and asked them which of these options they would personally prefer. Designing KopAL according to the users' requests is essential to assure the future acceptance of the device. For example, asked about the favored gender of KopAL's speaking voice prior to any speech synthesis trials, the majority favored a male voice. Furthermore, the residents preferred a lighter device even if it would require frequent battery charging. Additionally, the majority of residents demanded both speech-synthesized and displayed information. And finally, the elderly asked for the maximum possible support from KopAL, taking into account that this requires considerably more data recording.
2.2.2. Surveying the caretakers
We considered it not only essential to assess the residents' attitudes towards KopAL but also strove to obtain the caretakers' feedback and recommendations. After all, they will be using the device as well and should view it as beneficial working equipment. Ten caretakers of the nursing home in Stahnsdorf were therefore surveyed in the development process of KopAL. The sample consisted of seven women and three men ranging in age from 28 to 59 years (Mean = 43.2 years). Completion of a questionnaire took about 15 minutes. Corresponding to the residents' interview topics, we addressed the caretakers' (1) experiences with devices that share some features with KopAL, (2) overall technology commitment, and (3) first impressions of KopAL and recommendations for the development of the device and its improvement. 1.
Technology experience: As expected, the caretakers are well experienced with computers and mobile phones. All of them stated that they own and use these devices daily in their professional and private lives. Consequently, nearly all caretakers felt confident in handling computers and mobile phones. This is an important finding as the caretakers will be in control of selected functions of KopAL (e.g., set-up of the reminders; interaction with the resident in need). The use and continuous set-up of KopAL's features share some commonality with known applications on computers and mobile phones. We interpret these results as a sound basis for the implementation of KopAL. 2. Technology commitment: The caretakers described themselves not only as experienced with technology; they also displayed a sufficient level of technology commitment (Mean technology commitment = 3.39). Interestingly, the technology commitment of the interviewed caretakers was based less on a high acceptance of modern technology (Mean technology acceptance = 3.06) than on a comparatively strong sense of competence (Mean technology competence = 3.46) and control (Mean technology control = 3.65) concerning the use of technology. In fact, over half of the interviewees showed an indifferent response to or even rejected acceptance items such as "I enjoy the use of new technology", "I am very curious
about new technology" and "I like new technology". Nevertheless, the majority of caretakers agreed with the statement "I would use technology more often if I had the chance to do so". The caretakers' sense of competence and control in dealing with new technology was concluded from the majority's explicit rejection of statements such as "Often I am in fear of failing using new technology" or "I am afraid of breaking new technology rather than using it appropriately", and, respectively, from their strong agreement with items such as "I have trust in my capability to use new technology", "It depends on me whether or not I solve problems that I might face when dealing with new technology" or "Being successful in using new technology is a result of my personal effort". Regarding the implementation of KopAL, the caretakers' sense of technology competence and control will certainly be beneficial. In contrast, the observed acceptance ambiguity has to be kept in mind. Caretakers face considerably stressful working conditions including the frequent use of technology (e.g., answering phone calls, responding to residents' emergency calls, computer-assisted documentation). The introduction of additional technology such as KopAL and its implementation into the working environment could be anticipated as another stress factor. This, in turn, might contribute to the rather low level of acceptance of new technology, which needs to be addressed in order to establish the device on a long-term basis. 3. Introducing KopAL: Following the assessment of the caretakers' technology experience and commitment, we described KopAL and its features to the interviewees. Additionally, a demonstrator of the device had been presented to them within a team meeting beforehand. In line with the procedure for the residents, the caretakers then responded to a variety of statements concerning the perceived usefulness of KopAL.
Finally, we asked for recommendations for the improvement of the device according to the caretakers' needs. The majority of caretakers reported an interest in the device and were confident about its reliability. Furthermore, they indicated willingness to test KopAL. Overall, they evaluated it as a helpful tool that would facilitate their everyday working routines. However, some caretakers questioned the safety of the device and reported uncertainty regarding its operating mode. Hence, we aim at resolving this skepticism within the ongoing test phase of KopAL. Having assessed the first overall impression of KopAL, we brought the specific functions of the device into focus: (1) emergency call, (2) reminder, and (3) localization. All caretakers strongly appreciated these three functions and anticipated their contribution to the facilitation of daily working routines. Regarding the emergency call function, a precise and distinguishable design of the emergency button was recommended. Furthermore, the necessity of locating the emergency call, and the need to communicate defined cases of its application to the residents in order to avoid misuse, were emphasized. With respect to the reminder function, it was pointed out that - depending on their physiological and/or psychological health status - not all residents would be able to respond to a reminder in the intended way. In accordance with the residents, the caretakers demanded clear enunciation, sufficient audibility, unambiguousness of the information, individualized punctuality of the reminder (early enough for a response but not too early) and the opportunity to replay the information as desired. Considering the localization feature, the caretakers stressed
the importance of a calibration of KopAL that would set off a distinguishable, reliable alarm as soon as the resident has left a defined area. Furthermore, they demanded sufficient audibility of the resident-caretaker interaction. Additional questions addressed options for the configuration of KopAL. The caretakers suggested a lighter version of the device with the maximum possible support, even if it requires frequent battery charging and considerably more data recording. Furthermore, the caretakers demanded a flexible, practicable set-up of the system (e.g., reminder coding) at different workstations, executable by all caretakers on duty. They suggested a specific sound given by the device to inform the resident or caretaker that battery charging is required. Several caretakers recommended extra equipment that allows the resident to wear KopAL on the body (e.g., a carry case and a belt). Finally, the interviewed caretakers proposed future features of the device that would be appealing to the residents such as music, pictures, games (e.g., Sudoku) and/or a phone call option.
2.2.3. Conclusion
The interviews with residents and caretakers provided us with important and helpful information about their special needs concerning KopAL and its implementation to facilitate everyday routines in a nursing home. We were pleased to find great general interest and confidence in the system. It has to be kept in mind that the interviews were carried out prior to the test phase of KopAL, and the residents' and caretakers' first impressions presented here are based merely on descriptions and a premature test version of the device. However, interviewing some of the residents and caretakers that early in the development process of KopAL has only been the starting point of our continuing cooperation with the nursing home in Stahnsdorf. Focus groups on selected features of the device (e.g., speech synthesis) and the evaluation of the test phase are currently being established.
We consider the evaluation of KopAL throughout the test phase essential, since more detailed feedback and solid suggestions for the improvement of the device will certainly arise through trial and error. We aim at considering all responses and critiques. By doing so, we are confident about the successful implementation of KopAL.
2.3. Requirements
Our empirical study suggests that a partial usage of our system (e.g. as a plain appointment/medication reminder) is adequate even for elderly people without dementia. To assure adequate usability in the hands of dementia patients, the following aspects are essential:
• KopAL must not depend on direct interaction with the users.
• The mobile devices of KopAL should announce information in a speech-based manner with clear pronunciation.
• Appointments should be read out in time via a speech-based push service. Medication reminders should not be read out redundantly.
• All read-out messages must be replayable if the user demands it (e.g. he could not understand it the first time).
• User interface: The frequency of repeating read-out messages can be adjusted to the user's memorizing skills.
Figure 2. The Components of KopAL
• Low operational costs and easy installation (i.e. no extra construction costs are required).
• Usability: The caretakers may use KopAL via the in-house phone system and thereby should not be required to carry an extra device.
3. The KopAL System
In this section we introduce the hardware components of KopAL and its main functionalities. An overview of KopAL's components is given in Figures 2 and 3. The central server component contains a web server that enables the caretakers to manage patient data and appointments. Each user/patient is equipped with a mobile device, similar to a smart phone. Nowadays, the caretakers already carry an office mobile phone; this is integrated into the scenario. The KopAL system consists of several components that require a wireless communication channel. Since low operational costs and ease of setting up the system are demanded, KopAL utilizes the license-free IEEE 802.11b/g [12] Wi-Fi protocol in ad-hoc mode. Furthermore, pre-configured Wi-Fi routers are easy to install even in existing buildings. The KopAL system consists of the following devices:
• Several mobile devices (nodes), which are handed out to the patients and communicate with other devices via the Wi-Fi network. The nodes act as an appointment reminder and contain an emergency button to inform caretakers of critical situations. In addition, they are used for localization purposes and inform the caretakers if patients seem to get lost.
• A central server hosts the software components such as the web front end of the appointment management system (which is accessible from other computers within the environment) and the central appointment database, next to the call management. The central server has to be equipped with a Wi-Fi connection and connectivity to the local telephone system (e.g. by a translator box).
• Several Wi-Fi routers guarantee the minimally required Wi-Fi network coverage within important regions of the environment. In addition, they act as Wi-Fi buoys, which are required for location determination. These devices have to be connected to the power line since they are always active.
• The telephone system is used to play out messages to the caretakers' mobile phones and to establish voice calls on the caretakers' side. Thereby, caretakers are not required to carry additional devices. In a home-stay scenario the emergency call system can be redirected to external caretakers or a relative.
If a device is not in the area of one of the Wi-Fi routers, it may communicate in a hop-by-hop manner via other mobile devices. Therefore, we utilize the mobile ad hoc network (MANET) [7] routing protocol OLSR [6] for the route calculation between nodes that do not reside in direct neighborhood. Since the caretakers may use KopAL via the in-house phone system, a connection to this system must be available. Therefore, an Asterisk server [2] is attached to the central server component as an interconnection point to assist the direct communication between patients and caretakers in emergency cases via Voice over IP (VoIP) calls, as introduced in [9]. Since the whole system should not depend on the availability of the central server, most communication processes (such as appointment synchronization and time synchronization) between the end-user devices are established in a peer-to-peer (P2P) manner. An adequate solution to the demand that KopAL must not depend on direct interaction with the users is speech-based interaction, since the user is not required to actively check the device for novel messages such as appointments. Since the mobile devices of KopAL have to announce information in a speech-based manner, KopAL integrates speech synthesis functionality within these devices, which enables the readout of appointments and warning messages even in the absence of the central server component.
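OLSR itself computes routes proactively from link-state information flooded through the network; the following sketch merely illustrates the underlying multi-hop idea with a breadth-first search over a snapshot of the neighbourhood graph. The node names and the `find_route` helper are illustrative, not part of KopAL or OLSR:

```python
from collections import deque

def find_route(neighbors, source, target):
    """Breadth-first search for a multi-hop route between two nodes.

    `neighbors` maps each node to the set of nodes currently within
    its direct Wi-Fi range. Returns the list of hops from source to
    target, or None if the target is unreachable.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        for nxt in neighbors.get(path[-1], ()):  # direct neighbours of the last hop
            if nxt in visited:
                continue
            if nxt == target:
                return path + [nxt]
            visited.add(nxt)
            queue.append(path + [nxt])
    return None

# Node A is out of router range but reaches the server via nodes B and C.
topology = {
    "node_A": {"node_B"},
    "node_B": {"node_A", "node_C"},
    "node_C": {"node_B", "server"},
    "server": {"node_C"},
}
print(find_route(topology, "node_A", "server"))
# ['node_A', 'node_B', 'node_C', 'server']
```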
Figure 3. The Hardware system components
4. Reminder
The reminder functionality assists patients with moderate dementia by reminding them of upcoming events (such as meals, meetings or medication) and tasks. When an appointment is upcoming, the patient's mobile device simply reads out the appointment with a natural, human-sounding, but in fact computer-synthesized voice. Depending on the patient's degree of dementia, he or she may even be reminded several times (up to five). Noisy environments may also require patients to replay past reminders manually. Therefore, by activating the replay button the last appointment can be repeated. Appointments can be recurring (such as daily meals) or one-time. Medication reminders are a special type of event, since an unwanted replay of them may lead to over-medication, while a missed reminder may induce under-medication. Therefore, medication events must be played exactly once (the only exception being user-requested replays). The acoustic representation of upcoming events is played out on the patients' devices. The acoustic representation of an appointment message is similar to a telephone call and is structured as follows:
• Two rings signal an upcoming event to the user.
• During a delay time of 5 seconds the user has the opportunity to lift the device to a better listening position (if not adequately placed).
• The message is read out via the synthesized speech file, which contains a salutation next to the message content.
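The announcement sequence above can be sketched as follows. `ring` and `play_speech` are hypothetical stand-ins for the device's audio output, not KopAL's actual API; the repeat logic reflects the up-to-five reminders and the play-exactly-once rule for medication events:

```python
import time

class ReminderPlayer:
    """Sketch of the acoustic announcement sequence described above."""

    def __init__(self, ring, play_speech, delay_s=5):
        self.ring = ring                # callback that rings the device
        self.play_speech = play_speech  # callback that reads out text
        self.delay_s = delay_s
        self.last_message = None

    def announce(self, text, is_medication=False, repeats=1):
        # Medication events must be played exactly once; only
        # user-requested replays are exempt from this rule.
        times = 1 if is_medication else min(repeats, 5)
        for _ in range(times):
            self.ring(); self.ring()    # two rings signal the upcoming event
            time.sleep(self.delay_s)    # time to lift the device
            self.play_speech(text)      # salutation plus message content
        self.last_message = text        # kept for the replay button

    def replay(self):
        # The replay button repeats the last announcement on demand.
        if self.last_message is not None:
            self.play_speech(self.last_message)
```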
Figure 4. The Appointment Web interface
The following subsections contain a detailed description of the subcomponents of the appointment reminder system.
4.1. P2P Appointment transmission
A caretaker, or a relative, creates appointments for a patient or a group of patients via a web interface (see Fig. 4). Since KopAL utilizes text-to-speech synthesis for appointment
readout, the editor does not have to record voice. Next to appointments, user accounts are managed via the web front end as well. The generated appointments and other data reside on the central server component, from where they are distributed within the mobile node network. Appointment entries are classified by their occurrence frequency into single, weekly and yearly appointments. All appointments contain the following variables:
• Receiver: The user name of the dedicated receiver, used for authentication purposes. The appointment is transmitted between devices even if a device is not registered to the required user.
• Owners: A list of users and/or groups that are allowed to edit the appointment.
• ID: A unique ID to enable the identification of each appointment within the network.
• Release number: The release number is incremented each time the message is modified. As a result, each node can identify the latest appointment version.
• Text: The textual representation of the appointment content.
• Start date: First occurring date of the appointment. This parameter is required for deciding when the central server submits the appointment into the network.
• End date: Last occurring date of the appointment. This parameter is required for deciding when to delete the appointment from the devices.
• Execution clock time: Defines the time at which an appointment is played out.
• Timestamp: The timestamp prevents replay attacks.
• Signature: The signature assures the originality of the appointment.
Depending on the appointment type the following extensions are used:
• Execution days (in case of weekly appointments)
• Execution date (in case of yearly appointments)
Once a day, each node checks the upcoming appointments for the next day and generates the synthesized speech message representations, in case they are missing.
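For illustration, the variables listed above could be modelled as a record type, together with the kind of database fingerprint a node might exchange to detect a diverging appointment set (the field types and the canonical-JSON hashing are our assumptions; only the field names and the MD5 comparison are taken from the text):

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class Appointment:
    """The appointment fields listed above, sketched as a record type."""
    receiver: str          # user name of the dedicated receiver
    owners: List[str]      # users/groups allowed to edit the appointment
    id: str                # unique ID within the network
    release_number: int    # incremented on every modification
    text: str              # textual representation of the content
    start_date: str        # first occurring date
    end_date: str          # last occurring date
    execution_clock_time: str  # time of day the appointment is played out
    timestamp: float       # prevents replay attacks
    signature: str         # assures originality (set by the central server)
    execution_days: Optional[List[str]] = None  # weekly appointments only
    execution_date: Optional[str] = None        # yearly appointments only

def database_fingerprint(appointments: List[Appointment]) -> str:
    """MD5 hash over the appointment set, as exchanged in the update
    request message to detect diverging databases between two nodes."""
    canonical = json.dumps(
        sorted((asdict(a) for a in appointments), key=lambda d: d["id"]),
        sort_keys=True)
    return hashlib.md5(canonical.encode()).hexdigest()
```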
Since speech generation is a resource-intensive task, this step is executed while the nodes are connected to a power outlet. In addition, each node periodically (e.g. every 3 minutes) checks for upcoming events and plays their acoustic representation.
Distributed Update of the Appointment Database
When the appointment map is updated on the central server, the information has to be transferred to the mobile devices. Devices not only have the ability to synchronize their appointment map representation with the central server, but can also update it with other mobile devices. As a result, even nodes that never reside in direct Wi-Fi communication range of the central server may have an up-to-date representation of the appointment database. The appointment databases are updated in a P2P manner between the mobile nodes and the central server when they are in direct communication range of each other, as follows: When a possibly out-of-date mobile node (node A; last update > 5 hours) reaches the communication range of another mobile node (node B) or the central server, node A may initialize an appointment database update once the (frequently triggered) synchronization task fires. During the resulting appointment update phase the following steps occur:
• Node A -> node B: Node A initializes the request by sending an appointment update request message which includes the MD5 hash value of its known appointments. Beforehand, node A activates its local semaphore lock (to prevent concurrent appointment database manipulations).
• When node B receives an appointment update request message, it tries to activate its local semaphore lock. If the semaphore is already locked, node B responds with an appointment update delay message, which signals to node A that node B is currently busy.
• If node B successfully acquires the semaphore lock, it compares the received MD5 hash value with that of its local appointment database. If the hash values differ (and thereby some entries differ), node B sends a list of all appointments (containing each appointment's ID and release number) of its local database to node A.
• Node A compares the received IDs and release numbers with its current local database representation. This comparison may result in one of the following cases (per appointment):
∗ A received appointment is locally missing -> node A requests an update from node B.
∗ Both appointments match -> do nothing.
∗ The local appointment has a lower release number -> node A requests an update from node B.
∗ The local appointment is not yet known by node B, or has a higher release number locally -> node A sends an update to node B.
4.2. Speech synthesis
In the absence of any well-established, open-source German speech synthesizer running on mobile devices, we conducted a survey of existing (open) Linux-based synthesizers to determine the best candidate for our application scenario. The main criteria that we established for such a comparison were:
• Open license
• Portability (no idiosyncratic software requirements)
• Support for several languages, including German
• Possibility to mark up the input string with prosodic parameters (see below)
• Runtime performance
• Quality of the voice output
Here, the “mark up” criterion refers to the question whether the synthesizer can be instructed to pronounce words in non-default ways, as a result of specific context-dependent considerations. In the normal text-to-speech (TTS) setting, the system accepts a plain string, retrieves pronunciation rules for the words, and arranges for an overall fitting intonation curve for the complete sentence. This often leads to acceptable output, but can sometimes be problematic. A case in point is corrective utterances, such as: Your appointment is not on Monday; it is on Tuesday. For this to sound natural, Tuesday must be stressed, so that the corrective function becomes clear to the listener. A standard TTS algorithm, however, has no way of detecting this. It is a matter of contextual reasoning and world knowledge to decide that Monday and Tuesday are in a contrastive relationship to one another. In addition to this reasoning requirement, the technical requirement for the synthesizer is that it is able to interpret a ‘speech markup language’, which encodes information on prosodic stress, rising tunes, or pauses. Such markup languages have been standardized in the context of voice dialogue systems; a prominent example is the W3C Speech Synthesis Markup Language3, which offers tags such as <emphasis>. Hence, the input for our example could be: Your appointment is not on <emphasis>Monday</emphasis>; it is on <emphasis>Tuesday</emphasis>.
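A small helper can assemble such corrective markup strings; the function name and template mechanism are illustrative and not part of MARY or the W3C standard:

```python
from xml.sax.saxutils import escape

def ssml_correction(wrong: str, right: str, template: str) -> str:
    """Build a markup string that stresses the contrasted words.

    A minimal sketch: `template` contains {wrong}/{right} placeholders,
    and both words are wrapped in <emphasis> tags so the synthesizer
    can realize the corrective contrast prosodically.
    """
    return template.format(
        wrong=f"<emphasis>{escape(wrong)}</emphasis>",
        right=f"<emphasis>{escape(right)}</emphasis>",
    )

print(ssml_correction(
    "Monday", "Tuesday",
    "Your appointment is not on {wrong}; it is on {right}."))
# Your appointment is not on <emphasis>Monday</emphasis>; it is on <emphasis>Tuesday</emphasis>.
```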
The following synthesizers were included in our study: eSpeak4, IMS-Festival5, MARY6, and BOSS7. The first two of these are based on the MBROLA synthesis front-end8, which provides the actual voices for the output; the third can use MBROLA and other voices; the fourth comes with its own voice. MBROLA, which has been developed for many languages, takes not a text string but a list of phonemes as input, which have to be computed beforehand - hence it performs only the “second half” of the overall TTS task. Determining the phoneme sequences from the textual input is a matter of looking up the words and mapping them to a phonemic representation that also respects the neighbouring words, since the right pronunciation of many words is not context-independent. In general, not surprisingly, there is a trade-off between the memory resources needed for synthesis and the quality of speech output, as wide-vocabulary speech synthesis by definition requires large acoustic models to be held in main memory. It turned out that only two of the four systems could be executed on the handheld hardware with acceptable runtime behaviour: eSpeak and IMS-Festival. Nonetheless, none of them has been optimised for this purpose, which clearly would be necessary (note that commercial synthesizers also come with specific ‘embedded’ versions, for example IBM’s Embedded ViaVoice). We thus decided to perform the actual synthesis on a server and to transmit wave files between the handheld and the server. As a result of our experiments, we ruled out the MBROLA-based voices due to their rather unnatural sound; given our target users, we reckon they might not be willing to accept such artificial voices. The BOSS output sounds better, but the system has extensive memory and runtime requirements (for instance, it needs its own mySQL server). Therefore, we settled on MARY, which offers comfortable development tools.
It is still actively supported, and has a sizable user community. The voices based on Hidden Markov Models (HMM) seemed to us most acceptable - and a first pilot study with potential system users confirmed this choice. The drawback, however, is that MARY can interpret the markup tags mentioned above only in conjunction with the MBROLA voices. Hence there is currently another trade-off: either insist on computing the prosodic markup (on the basis of context and speaker intentions) and accept the suboptimal voices; or work with a good voice but only standard prosody, running the risk of irritating pronunciation in certain problematic utterances. For the time being, we
3 http://www.w3.org/TR/speech-synthesis/ (12.10.2010)
4 http://espeak.sourceforge.net
5 http://www.ims.uni-stuttgart.de/phonetik/synthesis/festival_opensource.html (12.10.2010)
6 http://mary.opendfki.de (12.10.2010)
7 http://www.sk.uni-bonn.de/forschung/phonetik/sprachsynthese/boss (12.10.2010)
8 http://tcts.fpms.ac.be/synthesis/ (12.10.2010)
chose the latter path, as for our application the generally good acceptance of the MARY HMM voices is a major factor for system usability. With respect to the hearing problems of the elderly, we selected the hmmbits3 low-pitch voice, which was also the favourite of the interviewed elderly.
4.3. Security
The KopAL appointment component relies on encrypted and authenticated communication channels. The required security functionalities - encryption, authentication and key management - are encapsulated within the security component. We use a public-key infrastructure. We assume that new users are registered once on the central server via the web front end by either a caretaker or an administrator. During this registration the security component is triggered to create the user's key pair and a certificate. The user's certificate is additionally signed with the root certificate. Afterwards, a personalized key store instance is generated. This key store contains, next to the certificates (of all generated users and the central server), the user's private key. The personalized key store has to be downloaded and stored on the user device. When another node requires the certificate of the new user, the certificate is requested from the dedicated user's device. We can assume connectivity between both devices, since foreign users' certificates are only required for encryption or signature validation as part of message communication. Since certificates are not hidden secrets, this communication can be unsecured. When a new certificate is received, it is validated with the central server certificate and, if valid, attached to the local key store. To achieve secure communication, the KopAL system partially relies on the encryption and verification of message/communication streams where necessary. Such packages are protected via signature and encryption of selected message fields.
Appointment security
All appointments are stored on, and transmitted between, the devices partly encrypted.
Since the encryption should not impede the appointment update process, even for devices that are not the final receivers, only critical fields are encrypted. To prevent message manipulation, all fields are signed by the originator (which in our case is always the central server). The encrypted fields are the following:
• Description text
• Execution clock time
• Execution days (in case of weekly appointments)
• Execution date (in case of yearly appointments)
• The owners
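The partial protection could be sketched as follows. This is only an illustration: `encrypt` is a placeholder for the real public-key encryption, and an HMAC stands in for the certificate-based signature that KopAL's PKI actually provides; the field partition follows the lists above.

```python
import hashlib
import hmac
import json

# Fields that remain readable so intermediate devices can run the
# update protocol (compare IDs and release numbers) without decrypting.
PLAIN_FIELDS = {"receiver", "id", "release_number",
                "start_date", "end_date", "timestamp"}
# The critical fields listed above, which are encrypted.
ENCRYPTED_FIELDS = {"text", "execution_clock_time",
                    "execution_days", "execution_date", "owners"}

def protect(appointment: dict, encrypt, signing_key: bytes) -> dict:
    """Sketch of the partial protection of an appointment message."""
    protected = {k: v for k, v in appointment.items() if k in PLAIN_FIELDS}
    protected.update({k: encrypt(json.dumps(appointment[k]))
                      for k in ENCRYPTED_FIELDS if k in appointment})
    # All fields are signed by the originator (the central server)
    # to prevent manipulation in transit.
    canonical = json.dumps(protected, sort_keys=True).encode()
    protected["signature"] = hmac.new(signing_key, canonical,
                                      hashlib.sha256).hexdigest()
    return protected
```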
4.4. Time synchronization
To assure that appointments are read out in time, the clocks of the system must be synchronized to world time with a precision of seconds. Since the hardware of the applied mobile devices is not highly dependable, the built-in hardware clocks experience a significant drift (up to several minutes per day). The time synchronization protocol NTP [13], which is popular in distributed systems, is not adequate for our use
case, since it demands connectivity to a central server (cf. the stratum concept), whereas the KopAL network is a MANET in which devices may be disconnected for some time. Therefore, we integrated a time synchronization approach which enables time synchronization between several nodes even without the availability of a central time service (which may be updated by GPS [15] or external NTP servers). We thereby focus on traffic-optimized and secured (in the sense of authenticated) communication. In case a node has not updated its system time recently, it requests the current timestamp from other neighboring nodes via a broadcast request.
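A toy simulation of this broadcast-based resynchronization might look as follows. Both the staleness threshold (5 hours, chosen by analogy with the appointment update interval) and the policy of adopting the most recently synchronized neighbour's clock are illustrative assumptions, not KopAL's documented algorithm:

```python
class Node:
    """Toy simulation of the broadcast-based time synchronization.

    Real KopAL nodes exchange authenticated messages over the MANET;
    here, the reachable neighbours are passed in as plain objects.
    """

    def __init__(self, name, clock, last_sync_age_h=0.0):
        self.name = name
        self.clock = clock                    # local time in seconds
        self.last_sync_age_h = last_sync_age_h  # hours since last sync

    def broadcast_time_request(self, neighbours):
        # Only resynchronize if the local clock has not been updated
        # recently, and only if any neighbour answers the broadcast.
        if self.last_sync_age_h <= 5.0 or not neighbours:
            return self.clock
        # Adopt the timestamp of the most recently synchronized
        # neighbour (illustrative selection policy).
        best = min(neighbours, key=lambda n: n.last_sync_age_h)
        self.clock = best.clock
        self.last_sync_age_h = 0.0
        return self.clock
```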
5. Calling functionality
Similar to existing elderly-care support systems, the KopAL system enables the patients to trigger an emergency call by pressing a button (see Figure 5).
Figure 5. KopAL's mobile device and its user interface
KopAL can treat this emergency call in two ways, as shown in Figure 6, depending on the device's configuration:
1. A VoIP call is directly established between the patient's mobile device and the caretakers' mobile phone.
2. A textual alarm message is sent via the central server (where a speech-synthesized audio file is generated from the textual message) to the caretakers' mobile phone. Such a message may contain the following content: "Mr. Meyer pressed the emergency button in the dining room. To call him back please dial the 1234."
In addition, the caretaker can directly call the device of the patient. For this purpose, the adequate number (e.g. 1234 for Mr. Meyer) must be dialed via a normal mobile phone. Depending on the state of the user device, such a call may be accepted automatically (e.g. in an emergency case).
The KopAL mobile application therefore includes a full Session Initiation Protocol (SIP) [16] client stack and thereby acts as a SIP-based VoIP phone. SIP defines call session initiation and handling as well as user registration. VoIP sessions are initialized through the ManetSIP overlay network, via the local MANETSip proxy, to achieve full functionality even when the central server is unavailable. In case of a patient-triggered emergency call, the SIP messages are transmitted to the central server's Asterisk service. The Asterisk service identifies the currently adequate phone number of the caretaker and initializes a phone call with the caretaker's mobile phone. After the session is established, all communication participants agree on a commonly available media codec. For audio en-/decoding the G.729 [1] voice codec is preferred, since it offers good speech quality in combination with slim media streams. The media codec is utilized to transmit the talk within a Real-Time Transfer Protocol (RTP) [4] media stream. As a result, the patient and his caretaker can talk.
Figure 6. KopAL call message flows
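The two delivery paths can be sketched as a small dispatcher. All field names and the plain-string alarm text below are invented for illustration; the real system uses a SIP client stack and the server's Asterisk service rather than strings:

```python
# Hypothetical sketch of the two emergency-call dispatch paths:
# a direct VoIP call, or a textual alarm routed via the central server
# (which later renders it with speech synthesis).

def handle_emergency(patient, mode):
    """Return either a direct VoIP call target or a textual alarm."""
    if mode == "direct_call":
        return ("voip_call", patient["caretaker_number"])
    text = ("{name} pressed the emergency button in {location}. "
            "To call him back please dial {extension}.").format(**patient)
    return ("server_message", text)

patient = {"name": "Mr. Meyer", "location": "the dining room",
           "extension": "1234", "caretaker_number": "5678"}
print(handle_emergency(patient, "server_message")[1])
```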
MANETSip

To support VoIP communication between several mobile devices even if the central server is not available, we use the MANETSip framework for SIP-based user registration and session establishment. The MANETSip framework enables session establishment and user management via SIP over MANETs by ensuring the availability of the essential registrar and location services (RLS) on several nodes within the network. Each node of the MANETSip network acts as a local SIP proxy service. All proxy services (and thereby all MANETSip nodes) listen on a specific multicast address to receive frequent update messages from the nodes acting as RLS. Thereby all participating nodes within multi-hop communication range maintain a recent overview of all available registrar/location services. All SIP messages of a KopAL instance are therefore addressed to the local MANETSip proxy via local channels, where they are processed and forwarded if necessary. In case of a call establishment, the IP address of the addressed callee is resolved through the distributed MANETSip RLS backbone and the message is forwarded to the dedicated node. Through the additional use of the MANETSip security extension, all internal messages are transmitted partially encrypted and fully signed and are thereby resistant against the following attacks:
• man-in-the-middle attacks,
• replay attacks,
• message manipulation,
• message sniffing,
• identity spoofing,
• ...
Further information can be found in [10] and [8].
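The multicast announcement scheme might be modelled as follows. Class and method names are invented, and the announcement timeout is an assumption; real MANETSip proxies of course use actual multicast sockets rather than method calls:

```python
# Simplified model of the multicast-based RLS discovery: nodes acting as
# registrar/location services (RLS) periodically announce themselves, and
# every proxy keeps a table of recently seen RLS nodes, expiring stale ones.

class RlsDirectory:
    def __init__(self, timeout=30.0):
        self.timeout = timeout
        self.seen = {}              # node address -> last announcement time

    def on_announcement(self, addr, now):
        """Called whenever an RLS announcement arrives on the multicast group."""
        self.seen[addr] = now

    def available(self, now):
        """RLS nodes whose last announcement has not yet expired."""
        return sorted(a for a, t in self.seen.items()
                      if now - t < self.timeout)

d = RlsDirectory(timeout=30.0)
d.on_announcement("192.168.2.22", now=0.0)
d.on_announcement("192.168.2.33", now=10.0)
print(d.available(now=35.0))  # the first announcement has expired by now
```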
6. Localization functionality

The localization of patient devices within a defined environment (e.g. a nursing home) is required for the following actions:
• The mobile devices trigger an alarm chain when a dementia patient leaves the familiar surroundings (e.g. the building).
• The information about the current location is used by the appointment reminder to react adaptively to the current location. If an upcoming appointment is tagged with a regional location, the region of the appointment is compared with the current device location. If the device already resides in the appointment's region, the readout of the appointment may be waived; otherwise, the appointment is read out.
• If the location monitoring functionality is activated on a device, new location results are stored in a local database. Each entry consists of the current region as well as the measuring time. Once per hour, or in critical situations, the device tries to transmit all collected entries to the central server database, where they are used for diagnostic purposes, such as estimating the precision of the algorithm. Upon successful transmission, the transmitted entries are removed from the device's database.
• If a patient loses his device, the caretakers can relocate it by requesting the last known location of the device via the central server. The caretaker is shown the last known region of the requested device and can trigger a finder state on the device, provided it is still reachable via Wi-Fi.
While most of the described functionalities require a precision of up to several meters, the latter functionality prefers as precise a localization as possible. The localization component therefore has the functionality to detect the current region of the device. This may be triggered either by a timer (in case of regular critical region detection), by a distributed request (e.g.
by the localization request for a lost device) or by a local request before reading out an upcoming appointment.
The localization is implemented via a zone-based approach in which the given environment is divided into zones (referred to as regions in the following). Depending on the density of the Wi-Fi routers placed within the environment, the size of a region ranges from a few meters up to 50 meters. The number of placed routers matters, since the localization is based on the plain availability of other devices within the communication range of the localizing device as well as on their received radio signal strength (RSSI). The received signal strengths of other devices are mapped onto a region map. A region map (like the one shown in Table 1) is a discrete representation of the given environment, which must be set up manually once. The region map defines regions, each of which consists of several devices. Each device entry consists of the following parameters:
• IP address of the given device
• a flag indicating whether the device must or must not be available
• minimal signal strength (in case it must be available)

Table 1. Area table of LAFIM (see Figure 1).

Name                         IP Addresses                                                        Action
Critical forest region       192.168.2.22                                                        Alarm call
Critical road region         192.168.2.66                                                        Alarm call
Center region                192.168.2.22, 192.168.2.33, 192.168.2.44, 192.168.2.55
North east region (garden)   192.168.2.44, 192.168.2.55 (either 192.168.2.33 or 192.168.2.66)
All others

During the mapping process, the received IP addresses of the statically placed Wi-Fi routers are matched against the devices listed in the region map. The matching may result in several possible candidate regions; these are then ranked by the received signal strengths, which yields a single region. To reduce the processor load, the last determined region is cached and the current address and signal strength measurements are first compared against this region's entry in the region map; in most cases a device stays within one region for a long time. The initial creation of the region map requires a measurement-based survey of the given environment. The definition of a region requires uniqueness regarding the availability of the included routers. In our test bed at the nursing home in Stahnsdorf we currently achieve sufficient precision with 5 Wi-Fi access points, shown (via the endings of their ids) in Figure 1. The region map is represented by a generated XML file, which forms the basis of the internal representation and is interpreted by the mobile devices.
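A minimal sketch of the region matching is given below, under the simplifying assumption of a first-match policy (the real system additionally ranks candidate regions by signal strength). The in-memory region map stands in for the generated XML file, and the RSSI thresholds are invented:

```python
# Hypothetical sketch of zone-based region matching. Each region lists
# routers that must (with a minimal RSSI) or must not be visible.

REGION_MAP = {
    "critical_forest": {"must": {"192.168.2.22": -70},
                        "must_not": ["192.168.2.44"]},
    "center":          {"must": {"192.168.2.22": -80, "192.168.2.33": -80},
                        "must_not": []},
}

def match_region(scan, region_map, last_region=None):
    """scan maps router IP -> received signal strength (dBm)."""
    # Cheap path: check the cached region first; devices rarely move.
    order = ([last_region] if last_region in region_map else []) + \
            [r for r in region_map if r != last_region]
    for region in order:
        spec = region_map[region]
        ok_must = all(scan.get(ip, -999) >= rssi
                      for ip, rssi in spec["must"].items())
        ok_not = all(ip not in scan for ip in spec["must_not"])
        if ok_must and ok_not:
            return region
    return "unknown"

print(match_region({"192.168.2.22": -60}, REGION_MAP))
```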
Therefore the generated XML file must be transmitted to all participating devices (which may be achieved either manually on system initialization, or via the appointment management system in a P2P-like manner).
Let us take a detailed look at the critical region scanner functionality. For every new scan result the device's region is determined. If a device enters a critical region, it triggers the following alarm chain:
• The device plays a message asking the patient to return and to take a caretaker along during the walk. If the patient returns on his own, the alarm chain is stopped.
• If the patient continues the walk, the device sends an alarm message to the central server. On the central server the alarm message triggers a voice call to the caretaker's mobile phone. The voice call informs the caretaker about the recognized alarm, including the following information:
∗ name of the patient
∗ the cause of the alarm (e.g. leaving the building unattended)
∗ the direction in which the patient left the building
∗ an internal POTS number to establish a direct VoIP call with the patient's device
In our LAFIM test bed the resulting message may be as follows: "Mr. Meyer left the building alone in the direction of the wood. To call him back dial the number 1234."
• The caretaker can then call the patient's device from his mobile phone and talk with the patient. While the device is within a critical region, all calls are accepted directly by the patient's device (though announced by an acoustic signal); otherwise calls must be accepted by the patient, to prevent eavesdropping.
On entering a critical region, the critical region scanner requires connectivity with the central server (to inform the caretaker). Therefore the critical regions must be well covered by Wi-Fi access points connected to the backbone network. The necessary frequency of region scanning varies with the conditions. When a patient is on the first floor, it is unlikely that he will leave the building within the next minute. In contrast, when a patient is next to the building entrance, he may enter a critical region within the next few seconds. When the critical region scanning functionality is not activated, on-demand scanning is sufficient for all remaining services. Consequently, the frequency of regular scans should be adjusted dynamically depending on the last known location of the patient's device and the remaining services. Currently the scanning interval ranges from 20 seconds (e.g. when a client resides in a critical region) up to 5 minutes (e.g. when the device is connected to a power line). The dynamic adjustment of the scanning frequency is essential for an extended battery lifetime.
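The dynamic adjustment might look like the following policy function. Only the 20-second and 5-minute bounds come from the text; the intermediate rules and all names are invented:

```python
# Hypothetical sketch of the dynamic scan-interval policy: scan often in
# critical regions, rarely while charging, and at a moderate default rate
# otherwise.

def scan_interval(region, on_power_line):
    """Return the time in seconds until the next Wi-Fi region scan."""
    if region.startswith("critical"):
        return 20            # patient may leave the safe area any moment
    if on_power_line:
        return 300           # device is charging, location is static
    if region == "entrance":
        return 60            # close to the exit: scan more often
    return 120               # invented default for ordinary indoor regions

print(scan_interval("critical_forest", False))  # -> 20
```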
7. Lessons learned

During the design, development, and prototypical testing phases of the KopAL project, we have so far learned the following unanticipated essentials of system development for elderly people:
• It's ringing, it's a mobile: Most elderly people automatically handle any device that rings like a mobile phone: when it rings, the device is lifted to the ear. In our case the voice reading out the appointments and messages is taken for a caller, and users intuitively respond to it, e.g.: "Ah yes, thank you, bye."
• Advertise a message: Audible voice-based messages must be announced (e.g. by a tone signal). A short gap (e.g. 3 seconds) between the initial announcement signal and the message playback enables users to focus on the upcoming message and to prepare their environment (e.g. lift the device).
• It is not "the voice": In a qualitative evaluation we asked elderly users to choose their preferred voice from several voices. While the majority chose one of two voices (one female and one male), it was not possible to single out "the" voice: some users prefer a male voice while others prefer a female one. The individual differences continue when asking for the optimal voice speed. Consequently, to achieve acceptable results the voice must be highly configurable per user, or at least selectable from a set of predefined voices.
• Some read, some listen: Regarding the preferred way of being informed, at least some users will not accept an acoustic presentation of appointment reminders. Imagine sitting in a bus when your device loudly reminds you of an appointment with your prostate doctor. In such conditions an acoustic signal combined with a textual appointment reminder is more appropriate.
• What's heavy and comfortable: The maximum acceptable weight of a mobile device differs between users. In addition, the comfortable carrying position for our mobile device varies between users.
8. Summary

We introduced a mobile orientation system called KopAL that assists patients with dementia in handling everyday problems, such as remembering appointments and keeping track within their familiar surroundings, as well as informing caretakers in critical situations, with a focus on minimal operational costs and a speech-based human-computer interface. In addition to residents of nursing homes, KopAL is well suited to assist patients staying at home. The evaluation with patients and caretakers of a nursing home in Stahnsdorf (Germany) gives an overview of their technological experience and commitment, next to their requirements for the KopAL system. We are looking forward to an additional evaluation phase at the end of the prototypical deployment. While ease of use was one of the resulting requirements, the KopAL system itself is based on several new technologies in the fields of Mobile Ad Hoc Networks (MANETs), peer-to-peer (P2P) networking, VoIP, and speech synthesis on embedded systems.
9. Acknowledgements

The authors would like to thank Martin Fischer and his team at the LAFIM Stahnsdorf for their support within our project. Without the discussions we had, the authors would not have been able to design a useful system. In addition, we thank our students for their work on getting the KopAL prototype running.
References
[1] ITU-T Recommendation G.729: Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear prediction (CS-ACELP), March 1996.
[2] Asterisk Service, July 2010. http://www.asterisk.org/.
[3] PALI, July 2010. http://www.cs.uni-potsdam.de/pali/.
[4] Audio-Video Transport Working Group, H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RFC 1889: RTP: A transport protocol for real-time applications, January 1996. Status: PROPOSED STANDARD.
[5] A.M. Clarfield. The reversible dementias: do they reverse? Annals of Internal Medicine, 109(6):476–486, 1988.
[6] T. Clausen and P. Jacquet. Optimized link state routing protocol (OLSR). RFC 3626, Internet Engineering Task Force, October 2003.
[7] S. Corson and J. Macker. RFC 2501: Mobile Ad Hoc Networking (MANET): Routing protocol performance issues and evaluation considerations, January 1999. Status: INFORMATIONAL. http://www.ietf.org/rfc/rfc2501.txt.
[8] S. Fudickar, S. Gasterstaedt, and B. Schnor. Security extension for MANETSip. Technical Report TR-2010-3, Potsdam University, Institute of Computer Science, June 2010.
[9] S. Fudickar, K. Rebensburg, and B. Schnor. KopAN – speech based communication via MANETs. Pages 341–349, Boizenburg, Germany, October 2009. VWH.
[10] S. Fudickar, K. Rebensburg, and B. Schnor. MANETSip – a dependable SIP overlay network for MANET including presentity service. In 5th Int. Conf. on Networking and Services, pages 314–319, Los Alamitos, CA, USA, April 2009. IEEE Computer Society.
[11] S. Fudickar and B. Schnor. KopAL – a mobile orientation system for dementia patients. In Communications in Computer and Information Science, Int. Conf. on Intelligent Interactive Assistance and Mobile Multimedia Computing, volume 53, pages 109–118, Berlin Heidelberg, Germany, November 2009. Springer.
[12] IEEE. Amendment to IEEE 802.11-2007 Standard: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 1: Radio Resource Measurement of Wireless LANs. IEEE Computer Society, June 2008.
[13] D. Mills, J. Martin, J. Burbank, and W. Kasch. Network Time Protocol Version 4: Protocol and Algorithms Specification. RFC 5905 (Proposed Standard), June 2010.
[14] F.J. Neyer and J. Felber. Personality and adaptive technology use in old age. Presentation at the International Conference on Ageing and Technology (available from the authors), March 2010.
[15] J. White and R. Beard. GPS common time reference architecture. In Proceedings of the 13th International Technical Meeting of the Satellite Division of the Institute of Navigation, pages 895–904, September 2000.
[16] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. SIP: Session Initiation Protocol. RFC 3261 (Proposed Standard), June 2002. Updated by RFCs 3265, 3853, 4320, 4916, 5393, 5621.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-105
Cost/Benefit Analysis of an Adherence Support Framework for Chronic Disease Management
Kumari WICKRAMASINGHE a, Michael GEORGEFF a, Christian GUTTMANN a, Ian THOMAS b, Heinz SCHMIDT b
a Department of General Practice, Monash University, Melbourne, Australia
e-mail: {kumari.wickramasinghe|michael.georgeff|christian.guttmann}@monash.edu
b School of Computer Science and Information Technology, RMIT University, Australia
e-mail: {ian.edward.thomas|hws}@rmit.edu.au
Abstract. Chronic disease is identified as one of the main causes of death worldwide and adversely affects the economy through huge healthcare costs and human capital losses. These outcomes can be ameliorated if patients and their healthcare providers adhere to the care plans developed for the patients' chronic conditions. This paper proposes a framework for adherence support based on (1) continuous monitoring and recognition of possible deviations from plan; (2) determination of the cause of the possible deviation; and (3) generation of an appropriate intervention to assist patients and their healthcare providers in avoiding plan failure. The framework and findings are generalised to include agents in any domain that adopts similar behaviour monitoring and intervention mechanisms. A cost/benefit analysis is performed under different settings using a number of theoretical and simulation studies. The theoretical analysis serves to: (1) establish the general principles of intervention; and (2) provide a basis for validating the simulation results. The agents are modelled as Belief Desire Intention (BDI) agents, and different modes of failure ("deficits") and interventions are characterised in these terms. The simulation code was shown to produce results compatible with the theoretical analysis and has the potential to be used for other settings and domains for which a theoretical analysis is difficult.
Introduction

According to the World Health Organisation (WHO), chronic disease is the cause of 60% of all deaths worldwide and, by the year 2020, chronic disease will account for almost three quarters of all deaths [1]. Seven million Australians [4] and 133 million Americans [6] have a chronic medical condition. In Australia, chronic disease accounts for more than 80% of the burden of disease and injury [3] and over 60% of healthcare costs ($60 billion per annum) [2], and significantly impacts workforce productivity ($8 billion per annum) [5]. Once an individual is identified with a chronic condition, it persists through life and requires adequate management [4]. In Australia, the management of a patient with
a chronic disease aims to follow best-practice clinical guidelines. A General Practitioner (GP) creates a General Practice Management Plan (GPMP) (a care plan in the context of this paper) by assessing the patient's health condition. The care plan includes the services (e.g., diabetic education, podiatry services), tests (e.g., a lipid test) and medication required for managing the chronic condition. The GP then identifies possible healthcare providers who can provide the services listed in the care plan. The patient, the GP and the identified healthcare providers (the care team) agree among themselves to carry out tasks and actions to improve the health of the patient. However, the use of care plans in practice has serious deficiencies [11,12]: (1) over 30% of the Australian population has a major chronic disease, but less than 25% of these people are provided with a care plan; (2) less than one in five patients who are on care plans are tracked for adherence to these plans; (3) 15-30% do not take prescribed medications; and (4) 30-50% are hospitalized due to inadequate care management. It is estimated that improved adherence to care plans could dramatically reduce healthcare costs and improve patient outcomes; in Australia, for example, the saving to the healthcare system is estimated at $1.5 billion per annum [11]. Part of the reason for these failures is the lack of computational and communications infrastructure to support the GP and the care team in care planning, management, and collaboration. The Intelligent Collaborative Care Management (ICCM) Project aimed to develop a comprehensive architecture for customer life cycle management (CLCM), in particular for contracts in which humans or other "intelligent" agents function as contract parties. ICCM considers that a customer is provided with a number of services by different service providers in a manner agreed in some agreement or contract.
The customer and the service providers are expected to carry out their respective tasks and deliver services throughout the entire lifetime of the customer. ICCM investigates the manner in which any change, inconsistency, or "deficit" in human mental attitudes (such as beliefs, desires, intentions, plans) prevents agents from carrying out their planned tasks and delivering the planned services. ICCM proposes a framework for adherence support involving continuous behaviour monitoring, interpretation and interventions to reduce contract failures resulting from deficits in mental attitudes [13,20,14]. Chronic disease management is a key application of ICCM [13]. In this application, we consider: (1) the patient as the customer; (2) the GP and the other healthcare providers (the care team) as the service providers; and (3) the care plan as the contract. The patient, the GP and the other healthcare providers who are part of the care plan are the contract parties. The items or actions listed in the care plan are the obligations of the contract. The aim of this paper is to establish an understanding of the potential benefit that can be gained by applying the adherence support framework proposed in ICCM to Chronic Disease Management (CDM). The framework for the cost/benefit analysis consists of the following elements: 1. Certain care plan failures occur because of deficits in the mental attitudes of contract parties; 2. An adherence support mechanism is used to (1) detect possible or potential digressions from the plan; (2) identify the cause of these digressions; and (3) generate one or more interventions with the aim of avoiding a failure of the care plan; 3. The party or parties subject to the interventions may or may not respond to the intervention;
4. Each of the adherence support components is associated with a given cost; and 5. A successfully executed plan is associated with a given benefit (value). While the framework is based on behaviour monitoring, interpretation and intervention, we first consider the case in which there is no attempt to identify the cause of the possible digression. We will call this case “blind intervention”. The analysis is performed under different settings using a number of theoretical and simulation studies. The cost/benefit analysis for CDM is carried out in a general way such that the results are valid for any other domain that adopts similar behaviour monitoring and intervention strategies. The aim of the analysis is twofold: (1) to use the theoretical findings as a basis for validating the simulation results; and (2) to use the validated simulation code in other domains or scenarios for which a theoretical analysis may otherwise be difficult or impossible. The paper is organized as follows. The ICCM project and its application to chronic disease management is briefly described in Section 1. Section 2 performs the theoretical cost/benefit analysis. Section 3 relates our work to existing research. Concluding remarks are discussed in Section 4.
1. Intelligent Collaborative Care Management (ICCM) Preliminaries

Intelligent Collaborative Care Management (ICCM) proposes an architecture for CLCM for contracts whose contract parties can be characterised as having the mental attitudes of Belief, Desire, and Intention. In this paper, we will use the word "agent" to refer to such contract parties. An agent's mental attitudes determine the agent's actions. This section briefly describes: (1) the manner in which deficits in mental attitudes may cause care plan violations; (2) an adherence support framework to reduce care plan failures; and (3) the application of the framework to chronic disease management.

1.1. Care Plan Violations Resulting from Deficits in Mental Attitudes

ICCM models the contract parties as Belief-Desire-Intention (BDI) agents [7]. The BDI architecture provides formalisms and representations for an agent's mental attitudes and their translation into action. The architecture is based on three mental attitudes:
• Beliefs – an agent's factual model of the world;
• Desires – states that an agent desires to be in, but does not necessarily try to achieve; and
• Intentions – states that an agent actively tries to achieve, either now or in the future.
In CLCM, the obligations in contractual agreements correspond to intentions of the participating BDI agents. That is, contract parties commit to actions to be performed in the future. However, changes to or missing elements of these mental attitudes can result in the contract parties deviating from their intended behaviour. We will call such variations mental "deficits". Mental deficits affect the execution of obligations by the corresponding parties and may lead to contract failures. ICCM models contract violations that arise from deficits in mental attitudes using: (1) Bratman's notion of "future-directed intentions" [7] and (2) Castelfranchi's belief-based goal dynamics [8]. Future-directed intentions describe how agents decide in advance on a plan for the future and then execute the plan in the future. Because of new information or changes in the agent's expectations, agents may reconsider the plan and may not execute the plan that was decided in advance. Belief-based goal dynamics proposes beliefs as the deciding factor for selecting and executing goals: agents commit to obligations based on beliefs that exist at commitment time, and the subsequent execution of the obligations depends on execution-time beliefs. Consolidating these two concepts, the ICCM project proposes three categories of deficits (refer to [13,20] for a detailed description of the deficits and supporting theory):
• Belief deficit: occurs if a contract party forgets or does not know when to execute an action, or with whom to communicate in order to execute an action, that is necessary for meeting a contractual obligation;
• Intention deficit: arises when a contract party drops a committed intention or changes the priority of goals, resulting in a modification or reordering of intentions so that his/her behaviour no longer conforms with the contractual obligations; and
• Plan deficit: arises when a contract party does not know the means for carrying out an intention or achieving a goal that is necessary for fulfilling the contractual obligations.

1.2. Adherence Support Framework

ICCM's adherence support framework consists of three processes to assist agents in carrying out contractual obligations:
1. Monitoring processes, which aim to observe the care plan execution behaviour of contract parties;
2. Recognition processes, which identify possible deficit(s) in mental attitudes if the execution behaviour identified in Item 1 does not match the intended behaviour and/or is indicative of a potential plan failure; and
3. Intervention processes, which apply blind or tailored strategies to intervene with contract parties when a deficit in mental attitudes is identified.
These ongoing monitoring and management processes are based on four elements:
• Precursors: a priori actions, steps or states that may indicate whether the actions of the contract parties are on track with respect to the care plan (e.g., the patient setting an appointment on time to visit a care provider) or likely to go off track (e.g., very low or very high blood pressure);
• Detection strategies for precursors: mechanisms to detect the occurrence or missing occurrence of such precursors at run-time;
• Mental state recognition processes: processes to identify the possible deficit(s) in mental attitudes that may have led to the occurrence (or non-occurrence) of precursors; and
• Intervention processes: strategies to intervene with one or more of the contract parties to reduce the likelihood of the parties violating their obligations.
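The four elements combine into a simple monitor-recognise-intervene cycle, sketched below with invented names; the stubbed recognition step stands in for the mental-state recognition processes, and the intervention messages are purely illustrative:

```python
# Hypothetical sketch of one adherence-support cycle for a single
# obligation: a missing precursor triggers deficit recognition, and the
# recognised deficit selects a (tailored) intervention.

DEFICITS = ("belief", "intention", "plan")

def adherence_step(precursor_observed, recognise, intervene):
    """One monitoring cycle; returns None when the party is on track."""
    if precursor_observed:
        return None                      # on track, nothing to do
    deficit = recognise()                # may return None (the blind case)
    return intervene(deficit)

def recognise():
    return "belief"                      # stub: assume a belief deficit

def intervene(deficit):
    messages = {
        "belief": "inform the party of the missing fact",
        "intention": "remind the party why the obligation matters",
        "plan": "explain how to carry out the obligation",
        None: "generic (blind) reminder",
    }
    return messages[deficit]

print(adherence_step(False, recognise, intervene))
```

A blind intervention is the same loop with `recognise` returning `None`, so the generic reminder is issued without diagnosing the cause.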
Table 1. A sample care plan.

Party's name   Party's type   Obligation       Execution due
Harry Brown    Podiatrist     Examine feet     July, Oct, Jan
Mary John      Optometrist    Check eyes       December
Bob Smith      Patient        Walk 1 km        daily
Bob Smith      Patient        Take Diamicron   April, July, Oct
1.3. ICCM Adherence Support Framework: Application in Chronic Disease Management

As an example, we consider the care plan given in Table 1 for a diabetes patient who has foot and eye complication risks. The care plan expects the patient to visit the podiatrist and the optometrist in certain months; walk regularly; take a repeat medication; and renew the medication in certain months. During the execution of the care plan, the patient (and sometimes the care team) may not perform all or some of these actions because of reasons that develop after agreeing to the plan. For example, the patient may: (1) be unaware that the podiatrist appointment is at 10.00 am on 1st July 2009 (having forgotten it, or believing it to be on another date); (2) believe his current risk of heart attack is less than it was at the time of commitment; (3) discover that an important football match is to be held on an appointment date; (4) realise he cannot walk more than 0.5 km a day; (5) discover that the medication (Diamicron) causes severe side effects; (6) decide not to take medications that have side effects; or (7) not know that he needs to initiate the setting of an appointment with the podiatrist or optometrist. The ICCM framework models these changes as "deficits" in mental attitudes. We illustrate the application of the adherence support framework in such situations using the patient's obligation of visiting the podiatrist in the given months. The same obligation is used as a working example throughout the paper. Usually, to visit a provider, the patient has to set an appointment with the provider a certain number of days prior to the planned visit. This number of days is called the waiting time, and it may vary with each provider. The waiting time for each provider is specified as part of the domain-based precursor detection processes.
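The sample care plan of Table 1 can be written down as a plain data structure; the field names below are invented for illustration, not taken from the ICCM implementation:

```python
# The care plan of Table 1 as a list of obligation records.

CARE_PLAN = [
    {"party": "Harry Brown", "type": "Podiatrist",  "obligation": "Examine feet",
     "due": ["July", "Oct", "Jan"]},
    {"party": "Mary John",   "type": "Optometrist", "obligation": "Check eyes",
     "due": ["December"]},
    {"party": "Bob Smith",   "type": "Patient",     "obligation": "Walk 1 km",
     "due": ["daily"]},
    {"party": "Bob Smith",   "type": "Patient",     "obligation": "Take Diamicron",
     "due": ["April", "July", "Oct"]},
]

def obligations_of(plan, party):
    """All obligations a given contract party has committed to."""
    return [o["obligation"] for o in plan if o["party"] == party]

print(obligations_of(CARE_PLAN, "Bob Smith"))
```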
From the adherence support perspective, the setting of an appointment with the podiatrist before the waiting period has elapsed serves as a precursor that indicates the patient's intention of visiting the podiatrist. Precursor detection strategies check the existence of due appointments before the corresponding waiting times have elapsed. If an appointment has not been made, there is the possibility of violating the related obligation. There can be a number of reasons that may have prevented the patient from setting an appointment. The patient: 1. may not know the waiting time of the podiatrist. Even though he still intends to visit the podiatrist, he will not be able to do so because he has not set up the appointment in time. 2. may intend to visit the podiatrist only if he has a high risk of heart attack. A low heart attack risk calculated by a risk calculator might have discouraged the patient from setting an appointment with the podiatrist.
K. Wickramasinghe et al. / Cost/Benefit Analysis of an Adherence Support Framework
3. may not know how to proceed with setting an appointment. That is, the patient does not have a plan for setting an appointment; for example, he may not know how to initiate it.

In the absence of the precursors, mental-state recognition processes try to identify the possible deficit associated with the (non-)occurrence of the precursors. It is important to identify the cause of the (non-)occurrence, as the intervention is aimed at reversing the mental deficit(s) behind it. For the three possible reasons identified above, the intervention strategy should vary with the cause. The deficit in Item 1 may be addressed by informing the patient of the waiting time of the podiatrist. For Item 2, the patient has to be informed of his current actual risk of heart attack, which may be higher than the value calculated by the risk calculator. For Item 3, the patient has to be informed of the method for making appointments with providers.
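The deficit-to-intervention mapping described above can be sketched in a few lines. This is an illustrative sketch only, not code from the ICCM framework; the deficit names and message texts are hypothetical.

```python
# Hypothetical sketch of tailored-intervention selection for the three
# example deficits in Section 1.3. Names and messages are illustrative.
from enum import Enum

class Deficit(Enum):
    BELIEF_WAITING_TIME = 1   # patient does not know the provider's waiting time
    INTENTION_LOW_RISK = 2    # low perceived heart-attack risk weakened intention
    PLAN_MISSING = 3          # patient has no plan for making an appointment

def select_intervention(deficit: Deficit) -> str:
    """Return the intervention aimed at reversing the recognised deficit."""
    interventions = {
        Deficit.BELIEF_WAITING_TIME: "Inform the patient of the podiatrist's waiting time.",
        Deficit.INTENTION_LOW_RISK: "Inform the patient of his actual heart-attack risk.",
        Deficit.PLAN_MISSING: "Explain how to initiate an appointment with the provider.",
    }
    return interventions[deficit]
```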
2. Adherence Support: A Theoretical Analysis

The application of the adherence support framework to CDM is determined by the cost effectiveness of the framework, which depends on: (1) the costs associated with implementing the adherence support processes; and (2) the benefits that can be obtained by successfully applying those processes. The practical application of adherence support is feasible if the benefits outweigh the costs. This section performs a cost/benefit analysis to determine the circumstances under which it is cost effective to use the proposed adherence support processes. The analysis is based on a number of parameters, described in Section 2.1, which are used to analyse adherence support for two possible intervention approaches:
1. Random (or blind) intervention: an intervention strategy is selected at random, without trying to recognise the type of deficit that prevented the contract party from executing a previously intended action. The cost/benefit analysis for this approach is performed in Section 2.2; and
2. Tailored intervention: an intervention strategy is selected after attempting to recognise the type of deficit that prevented the contract party from carrying out an obligation. The cost/benefit analysis for this approach is performed in Section 2.3.

2.1. Parameters for the Analysis

The analysis is based on three main parameters that capture:
1. the contract parties’ behaviours;
2. the cost associated with the adherence support processes; and
3. the benefit associated with a successfully executed care plan.

2.1.1. Behaviour Related Parameters:

We consider two types of parameters to represent the behaviour of the contract parties:
1. Level of deficit, D; and
2. Level of response, R.

Level of Deficit, D represents a statistical distribution or a numerical value that denotes the deficit associated with the mental attitudes of the contract party. Theoretically, D is a complex parameter that may vary with:
• each obligation in the care plan; and/or
• the type of deficit: belief, intention or plan.

For the preliminary analysis carried out in this paper, once a contract party is assigned a certain D value, the same value is used throughout the analysis; that is, D is considered to be independent of the obligations and the type of deficit. We assign D a series of values, 0 ≤ D ≤ 1. D = 0 means that a contract party never forgets to execute any obligation, and D = 1 means that the party always forgets to execute all the obligations committed to in the care plan. For example, in the situation described in Section 1.3, if the patient’s deficit level is 0 (D = 0), he will carry out all obligations in Table 1 without any failures. If his deficit level is 1 (D = 1), he will not carry out any of the obligations in Table 1. When 0 < D < 1, the patient will fail to carry out an obligation a fraction D of the time.

Level of Response, R represents the responsiveness of a contract party to an intervention. In the analysis, R is assigned a series of values, 0 ≤ R ≤ 1. R = 0 means that a contract party never responds to an intervention, and R = 1 means that the party always responds. For a given deficit, when an intervention is applied to a party a number of times, the party’s responsiveness may vary with the number of interventions; that is, each successive intervention may not yield the same outcome as the previous one. This phenomenon is captured using two variants of R:
• Constant responsiveness: the contract party’s responsiveness is independent of the number of interventions.
The effect of each successive intervention is equivalent to the effect of the first intervention; the responsiveness of the contract party to any successive intervention is R; and
• Negative exponential responsiveness: the contract party’s responsiveness decreases as the number of interventions increases. For example, if the responsiveness to the first intervention is R, the responsiveness to the second intervention is qR, and to the third intervention q^2 R, where 0 < q < 1. Such decaying responsiveness is typical of multiple interventions in practice.

For the above parameter values, adherence support does not improve outcomes when R = 0, and it is not required when D = 0. We therefore only consider the cases R > 0, D > 0.

2.1.2. Cost Related Parameters:

The total cost associated with a single behaviour monitoring, interpretation and intervention cycle is denoted C. In the analysis of the random intervention approach, C represents the cost per intervention. In the tailored approach, C has two parts: (1) the cost per recognition; and (2) the cost per intervention, as used in Section 2.3.
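The two responsiveness variants described in Section 2.1.1 can be expressed compactly: the responsiveness to the i-th intervention is q^(i−1)R, with q = 1 giving the constant variant. A minimal sketch (the function name is chosen here for illustration):

```python
def responsiveness(i: int, R: float, q: float = 1.0) -> float:
    """Responsiveness of the contract party to the i-th intervention (i >= 1).

    q = 1.0 gives the constant variant (every intervention has effect R);
    0 < q < 1 gives the negative exponential variant (R, qR, q^2 R, ...).
    """
    return (q ** (i - 1)) * R
```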
2.1.3. Benefit Related Parameters:

The benefit, or gain, denoted S, represents the value of a successfully executed care plan over an unsuccessful one.

2.2. Theoretical Analysis: Random Intervention

In this analysis, no investigation is performed to recognise the type of deficit associated with the (non-)occurring precursor. For the situation discussed in Section 1.3, when an appointment has not been made for a required visit, no attempt is made to identify the associated deficit. Instead, the patient is either informed of the waiting time of the provider, or of the risk of heart failure, or of a plan for setting an appointment; the intervention is selected entirely at random. Even though ICCM’s proposed adherence support framework is based on mental-state recognition processes, an analysis of random (blind) intervention is useful to determine:
• the circumstances (limits of D, R, S and C) under which it is possible to gain any net added value with blind interventions. This finding can be beneficial because blind intervention, for example reminders, is a common strategy currently applied in many domains including healthcare. In addition, this analysis is useful when the application of mental-state recognition techniques is not possible or practical, e.g., in domains where recognition is comparatively expensive or impossible; and
• the usefulness and/or cost effectiveness of the deficit interpretation phase.

Constant Responsiveness: The responsiveness of the contract party to each intervention is a constant, R. To analyse the effect, a care plan with a single obligation (Case 1) and with multiple obligations (Case 2) is used. The corresponding equations are derived in Appendix A; this section interprets them.

Case 1: The care plan has a single obligation (N = 1). For example, the care plan contains a single obligation: visit the podiatrist in July 2010.
From the analysis carried out in Appendix A.1, for a care plan with a single obligation and I interventions:

Net Value (NV)/D = (1 − D)S/D + (RS − C)(1 + r + r^2 + r^3 + ... + r^{I−1}), where r = 1 − R

The first term, (1 − D)S/D, denotes the net value from zero interventions; that is, the net value possible in the absence of any adherence support is (1 − D)S. Once the zeroth intervention is excluded, the “net added value of intervention” is

NV-added = D(RS − C)(1 + r + r^2 + r^3 + ... + r^{I−1})

Analysis:
1. In the absence of intervention strategies, as D increases, the net value NV decreases in proportion to (1 − D);
2. For given values of (RS − C) and R, (NV-added/D) depends only on the number of interventions. Therefore, the key parameters of cost effectiveness are the value of R, the value of RS/C, and the number of interventions;
3. Interventions are cost effective if the patient’s level of responsiveness is greater than the C/S ratio; that is, when N = 1, interventions are effective iff R > C/S;
4. If a single intervention is cost effective, then multiple interventions are also cost effective, and the more interventions the greater the net benefit, but with diminishing returns; and
5. Cost effectiveness increases as R increases. This justifies the use of mental-state recognition for selecting interventions, as it potentially increases the value of R (see Section 2.3).

Case 2: The care plan has multiple obligations (N = n, n > 1). For example, the care plan contains multiple obligations as given in Table 1. From the analysis carried out in Appendix A.2, for a care plan with n obligations and I interventions:

net added value by interventions, NV-added = S((G_{D,R,I})^n − (1 − D)^n) − K_{D,R,I}(1 + G_{D,R,I} + (G_{D,R,I})^2 + ... + (G_{D,R,I})^{n−1})

where K_{D,R,I} = DC + D(1 − R)C + D(1 − R)^2 C + ... + D(1 − R)^{I−1} C
and G_{D,R,I} = 1 − D + DR(1 + (1 − R) + ... + (1 − R)^{I−1})

Analysis: When there is more than one obligation in a care plan (N > 1):
1. Net added value increases as R increases;
2. The change in net added value decreases with each successive intervention;
3. If the net added value of the first intervention is negative for a given C/S, it is not cost effective to intervene any further for the same C/S;
4. As the number of obligations N increases, the minimum level of R required for a positive net added value from an intervention for a given C/S increases. The increase in the minimum R is non-linearly proportional to the increase in C/S and N.
The requirement of an increased R to obtain a positive net added value as N increases can be explained using a care plan with two obligations (N = 2). Let f be the amount added to the net added value by the second obligation, with f ≤ S. From Item 3 of Case 1 above, an intervention on the first obligation is cost effective if R > C/f. That is, for N = 1 interventions are cost effective when R > C/S, whereas for N = 2 an intervention on the first obligation is cost effective when R > C/f. As f ≤ S, the R required for a positive net added value increases as N increases. The minimum R required for a positive net added value for N = 2 is calculated in Appendix A.3.
5. For a care plan with N obligations, if an ith intervention is not cost effective for certain values of R, D and C/S, then for a care plan with a higher number of obligations, the ith intervention is not cost effective for the same R, D and C/S.
6. For a care plan with N obligations, if an ith intervention is cost effective for certain values of R, D and C/S, then for a care plan with a higher number of obligations, the ith intervention is cost effective for the same R, D and C/S.

Negative Exponential Responsiveness: The level of responsiveness of the contract party decays with each successive intervention: responsiveness to the first intervention is R, to the second intervention qR, and to the third intervention q^2 R, where 0 < q < 1. To analyse the effect, a care plan with a single obligation (Case 1) and with multiple obligations (Case 2) is considered. The corresponding equations are derived in Appendix B; this section interprets them.

Case 1: The care plan has a single obligation (N = 1). For example, the care plan contains a single obligation: visit the podiatrist in July 2010. From the analysis carried out in Appendix B.1, for a care plan with a single obligation and I interventions:

Net Value (NV)/D = (1 − D)S/D
+ RS(1 + q(1 − R) + q^2(1 − R)(1 − qR) + ... + q^{I−1}(1 − R)(1 − qR)···(1 − q^{I−2}R))
− C(1 + (1 − R) + (1 − R)(1 − qR) + ... + (1 − R)(1 − qR)···(1 − q^{I−1}R))
The first term, (1 − D)S/D, denotes the net value for zero interventions; that is, the net value possible in the absence of an adherence support framework is (1 − D)S. Once the zeroth intervention is excluded, the “net added value of intervention” is

(NV-added)/D = RS(1 + q(1 − R) + q^2(1 − R)(1 − qR) + ... + q^{I−1}(1 − R)(1 − qR)···(1 − q^{I−2}R))
− C(1 + (1 − R) + (1 − R)(1 − qR) + ... + (1 − R)(1 − qR)···(1 − q^{I−1}R))

The effect of q on (NV-added/D), for a given C/S and for several values of R, is presented graphically in Figures 1 and 2. For certain R and q values, the net added value decreases after a certain number of interventions and continues to decrease with each subsequent intervention. For example, when C/S = 0.2, the net added value decreases after the
• fourth intervention for R = 0.4 and q = 0.8 (Figure 2(b)); and
• second intervention for R = 0.4 and q = 0.6 (Figure 2(b)).
That is, when R = 0.4 and q = 0.8, it is cost effective to intervene four times, after which the costs of intervention exceed the benefits.

Analysis:
1. For given values of q, C/S and R, (NV-added/D) depends only on the number of interventions;
2. Similar to Case 1 of the constant responsiveness analysis, interventions are cost effective when R > C/(S q^{I−1}); and
3. From Item 2 above, interventions are not cost effective when the number of interventions I ≥ 1 + (log(C/(SR))/log(q)).
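Item 2 above gives a simple procedure for finding the largest cost-effective number of interventions under decaying responsiveness: count interventions while q^(i−1)R > C/S. A minimal sketch, checked against the Figure 2 example quoted in the text:

```python
def max_cost_effective_interventions(R: float, q: float, C_over_S: float,
                                     limit: int = 100) -> int:
    """Largest I such that every intervention i <= I has a positive marginal
    net added value, i.e. q^(i-1) * R > C/S (Item 2 of the analysis)."""
    I = 0
    for i in range(1, limit + 1):
        if q ** (i - 1) * R > C_over_S:
            I = i          # intervention i is still cost effective
        else:
            break          # decay has pushed responsiveness below C/S
    return I
```

For C/S = 0.2 and R = 0.4 this reproduces the values stated above: four cost-effective interventions when q = 0.8, and two when q = 0.6.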
Figure 1. Effect of q on (NV-added/D) for C/S = 0.2, D = 0.2 for several R values.
Case 2: The care plan has multiple obligations (N > 1).
From the analysis carried out in Appendix B.2, for a care plan with n obligations and I interventions:

net added value by interventions, NV-added = S((H_{D,R,q,I})^n − (1 − D)^n) − K_{D,R,q,I}(1 + H_{D,R,q,I} + (H_{D,R,q,I})^2 + ... + (H_{D,R,q,I})^{n−1})

where K_{D,R,q,I} = DC + D(1 − R)C + D(1 − R)(1 − qR)C + ... + D(1 − R)(1 − qR)···(1 − q^{I−1}R)C
and H_{D,R,q,I} = 1 − D + DR(1 + q(1 − R) + ... + q^{I−1}(1 − R)(1 − qR)···(1 − q^{I−2}R))
The above formula is used to investigate the relationship among: (1) the maximum number of interventions that can be applied before a negative net added value is reached; (2) the level of deficit; and (3) the level of responsiveness, as the number of obligations in a care plan increases. The results are summarised in Tables 2 and 3:
• Table 2 gives the maximum number of interventions for a care plan with two obligations (N = 2) for different combinations of D and R when q = 0.2; and
• Table 3 gives the maximum number of interventions for a care plan with five obligations (N = 5) for different combinations of D and R when q = 0.2.
In these tables, zero indicates that not even a single intervention provides a non-negative net added value within the first ten interventions (the maximum number of interventions considered).
Figure 2. Effect of q on (NV-added/D) for C/S = 0.8, D = 0.2 for several R values.
Analysis:
1. For a care plan with a given number of obligations, for all D, the number of interventions that yield a non-negative net added value increases as R increases. This increase is non-linearly proportional to the increase in D.
2. As the number of obligations N increases, the minimum level of R required for a positive net added value from an intervention for a given C/S increases. As in Item 4 of Case 2 of the constant responsiveness analysis, this can be explained using a care plan with two obligations (N = 2). Let f be the amount added to the net added value by the second obligation, with f ≤ S. From Item 2 of Case 1 above, an intervention on the first obligation is cost effective if R > C/(f q^{I−1}). That is, for N = 1 interventions are cost effective when R > C/(S q^{I−1}), whereas for N = 2 an intervention on the first obligation is cost effective when R > C/(f q^{I−1}); as f ≤ S, the R required for a positive net added value increases as N increases.
3. For a given combination of R and D, the number of interventions that yields a non-negative net added value decreases as the number of obligations increases. For example, the number of such interventions for
• R = 0.6 and D = 0.2 is 6 when N = 2 (Table 2) and 4 when N = 5 (Table 3); and
• R = 0.4 and D = 0.8 is 1 when N = 2 (Table 2) and 0 when N = 5 (Table 3).
As the number of obligations increases, fewer interventions are cost effective.
Table 2. The maximum number of interventions before a negative net added value is reached, for a care plan with two obligations, for different combinations of D and R with q = 0.2.

Level of Response (R) \ Level of Deficit (D)   D=0.2   D=0.4   D=0.6   D=0.8   D=1.0
R=0.2                                            0       0       0       0       0
R=0.4                                            2       2       2       1       0
R=0.6                                            6       5       4       3       2
R=0.8                                          ≥ 10    ≥ 10    ≥ 10    ≥ 10      9
Table 3. The maximum number of interventions before a negative net added value is reached, for a care plan with five obligations, for different combinations of D and R with q = 0.2.

Level of Response (R) \ Level of Deficit (D)   D=0.2   D=0.4   D=0.6   D=0.8   D=1.0
R=0.2                                            0       0       0       0       0
R=0.4                                            2       0       0       0       0
R=0.6                                            4       2       1       1       0
R=0.8                                          ≥ 10      8       5       3       2
2.3. Theoretical Analysis: Tailored Intervention

This section analyses adherence support when all three components are included: (1) behaviour monitoring to detect possible diversions from the plan; (2) mental-state recognition to determine the cause of these diversions; and (3) tailored interventions to avoid plan failure. In this analysis, the domain-specific mental-state recognition processes aim to recognise the prevailing deficit associated with the (non-)occurred precursor (details of the mental-state recognition processes are described elsewhere [20]). For the example discussed in Section 1.3, when an appointment does not exist for a required visit, attempts are made to identify the associated deficit: whether the patient (1) knows the waiting time of the provider; (2) has a correct assessment of the risk of heart failure; or (3) has a plan for setting an appointment. The intervention that best addresses the identified deficit is selected for application. The analysis in this section aims to determine the parameter values that yield a positive net added value for tailored interventions. It uses two additional parameters:
1. level of recognition, L (see footnote 1); and
2. level of responsiveness after mental-state recognition, R_reg.

Level of Recognition, L is a measure of the effectiveness of the mental-state recognition algorithm(s) used by the mental-state recognition processes. We assign L a series of values such that 0 ≤ L ≤ 1. L = 1 means that the mental-state recognition algorithms have 100% success in recognising the deficit, and L = 0 means that the algorithms completely fail to recognise the deficit.

Footnote 1: We note that this analysis is sufficiently abstracted that it does not depend on whether recognition of the cause of the (non-)occurrence of the precursor event attempts to identify a mental-state “deficit” or not. However, within the context of our framework for CCLM, we assume that mental-state recognition is the process used in the interpretation phase.
Level of Responsiveness after Mental-State Recognition, R_reg indicates the level of responsiveness of the contract party after the mental-state recognition process. We assume L = 1 raises the original level of responsiveness of the contract party to 1 (R_reg = 1) and L = 0 leaves the original level unchanged (R_reg = R). That is,

R_reg = L + R(1 − L)

In addition, mental-state recognition incurs a cost. The cost parameter defined in Section 2.1 is now divided into two components:
• cost per recognition (C_r), the system cost of a single recognition; and
• cost per intervention (C_i), the system cost of a single intervention.

In this initial study, all recognition and intervention strategies are assumed to incur the same cost. The study could be extended by assigning different costs to different recognition and intervention strategies, depending on their implementation cost and effectiveness. As in the analysis of random (blind) interventions, this analysis considers a care plan with a single obligation (Case 1) and with multiple obligations (Case 2). The corresponding equations are derived in Appendix C; this section interprets them.

Case 1: The care plan has a single obligation (N = 1). For example, the care plan contains a single obligation: visit the podiatrist in July 2010. From the analysis carried out in Appendix C.1, for a care plan with a single obligation and I interventions:

Net Value (NV)/D = (1 − D)S/D + (R_reg S − C)(1 + r + r^2 + r^3 + ... + r^{I−1})
where r = 1 − R_reg, R_reg = L + R(1 − L) and C = C_i + C_r

As before, the first term, (1 − D)S/D, denotes the net value from zero interventions. Once the zeroth intervention is excluded, the “net added value of intervention” is

NV-added = D(R_reg S − C)(1 + r + r^2 + r^3 + ... + r^{I−1})

Analysis:
1. Mental-state recognition is cost effective when L > (C/S − R)/(1 − R) (the corresponding equations are obtained in Appendix C.2);
2. Tailored intervention is cost effective over random intervention when L > (((C_i + C_r)R/C_i) − R)/(1 − R) (the corresponding equations are obtained in Appendix C.3). Figure 3 illustrates the behaviour of (“NV-added with tailored intervention” − “NV-added with random intervention”) vs C_r/C_i when L = 0.5 and R = 0.2; and
3. For a given R, as L increases, the upper limit of the cost increases; that is, when L increases, a positive return is possible even at a higher cost. A higher L indicates a higher success rate in identifying the deficit associated with a (non-)occurred precursor and results in a higher R_reg. When the level of responsiveness increases, there is a high possibility that the intervention will have the desired effect and the care plan will be executed successfully.
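The relation R_reg = L + R(1 − L) used throughout this section is a linear interpolation between the original responsiveness (at L = 0) and full responsiveness (at L = 1). A minimal sketch, with the function name chosen here for illustration:

```python
def responsiveness_after_recognition(R: float, L: float) -> float:
    """R_reg = L + R(1 - L): recognition level L interpolates between the
    original responsiveness R (L = 0) and full responsiveness 1 (L = 1)."""
    return L + R * (1.0 - L)
```

For example, with R = 0.2 and L = 0.5 (the values used in Figure 3), R_reg = 0.5 + 0.2 × 0.5 = 0.6.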
Figure 3. Net value added with tailored intervention − net value added with random intervention vs C_r/C_i when L = 0.5 and R = 0.2.
Case 2: The care plan has multiple obligations (N > 1).
From the analysis carried out in Appendix C.4, for a care plan with n obligations and I interventions:

net added value by interventions, NV-added = S((G_{D,R_reg,L,I})^n − (1 − D)^n) − K_{D,R_reg,L,I}(1 + G_{D,R_reg,L,I} + (G_{D,R_reg,L,I})^2 + ... + (G_{D,R_reg,L,I})^{n−1})

where K_{D,R_reg,L,I} = DC + D(1 − R_reg)C + D(1 − R_reg)^2 C + ... + D(1 − R_reg)^{I−1} C
and G_{D,R_reg,L,I} = 1 − D + DR_reg(1 + (1 − R_reg) + ... + (1 − R_reg)^{I−1})
The effect of successive interventions on net added value was observed for the random intervention strategy in Section 2.2. As successive interventions have a similar effect under tailored intervention, no further analysis of this effect is carried out here. Instead, the upper limit of the cost corresponding to a non-negative net value (NV) is investigated: the mental-state recognition and intervention based approach is cost effective only if the recognition and intervention cost associated with given R, D and L is less than the corresponding upper limit. The upper limit of the costs corresponding to NV = 0 from a single intervention, for care plans with 2 and with 10 obligations, is shown in Figure 4 for all values of R and L when D = 0.2.
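Under a single-intervention reading of the Case 2 closed form (I = 1, so G = 1 − D + D·R_reg and K = DC), the upper-limit cost for NV = 0 can be solved directly from NV = S·G^n − DC(1 + G + ... + G^(n−1)). The sketch below illustrates that assumption; it is not necessarily the computation used to produce Figure 4.

```python
def cost_upper_limit(D: float, R: float, L: float, S: float, n: int) -> float:
    """Cost per cycle C at which NV = 0 after a single tailored intervention
    (I = 1), assuming NV = S*G^n - D*C*(1 + G + ... + G^(n-1)) with
    G = 1 - D + D*R_reg and R_reg = L + R*(1 - L). Illustrative sketch."""
    R_reg = L + R * (1.0 - L)
    G = 1.0 - D + D * R_reg
    geometric = sum(G ** k for k in range(n))   # 1 + G + ... + G^(n-1)
    return S * G ** n / (D * geometric)
```

Consistent with Item 2 of the analysis below, the limit shrinks as the number of obligations grows (e.g. for D = 0.2, R = 0.4, L = 0.5, the limit for n = 10 is below that for n = 2).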
Figure 4. Costs corresponding to Net Value (NV) = 0 from a single intervention for all R and L values when D = 0.2, for a care plan with (a) 2 obligations; and (b) 10 obligations.
Analysis: When N > 1:
1. For the same reasons discussed in Item 3 of the Case 1 analysis above, for a given R, as L increases, the upper limit of the cost increases; and
2. The number of obligations has a non-linear effect on the upper limit of the cost. As the number of obligations increases, the decrease in the upper limit of the cost is largest for lower values of R.

In summary, this section analysed the cost effectiveness of two approaches to adherence support: (1) a behaviour monitoring and intervention based approach (random intervention); and (2) a behaviour monitoring, mental-state recognition and intervention based approach (tailored intervention). Each approach was analysed for care plans with a single obligation and with multiple obligations, using the parameters summarised in Table 4. The analysis aimed to identify whether subsequent interventions are cost effective, and the boundary values of the parameters that yield positive net outcomes.

3. Related Research

This paper aims to determine the cost effectiveness of an adherence support framework based on behaviour monitoring, mental-state recognition and intervention that is applicable to CLCM. The adherence support framework identifies means of reducing service delivery failures resulting from “deficits” in the mental attitudes of contract parties. As such, our interest is in predictive monitoring approaches rather than research on replanning, plan repair and plan failure recovery, in which remedial actions are considered after a plan failure occurs [15,19]. Research on norms attempts to address self-interested behaviour by encoding obliged, permitted and prohibited behaviour in care plans [17,9]. Care plan execution-time monitoring is a key functionality used by normative frameworks to determine possible norm violations [10]. Two types of monitoring techniques are described in multi-agent research: (1) corrective monitoring [16]; and (2) predictive monitoring [21].
When violations occur, corrective monitoring detects them and imposes punishments on the contract parties that violate norms. The punishments work against such parties, limiting their chances of being part of the care plan in a re-negotiation phase. Conversely, predictive monitoring approaches aim to predict possible future norm violations and execute remedial measures to prevent them. Existing predictive
Table 4. Summary of parameters.

                        Behaviour monitoring and            Behaviour monitoring, mental-state
                        intervention based approach         recognition and intervention based
                        (random intervention)               approach (tailored intervention)

deficit                 level of deficit, D                 level of deficit, D

responsiveness types    (1) constant responsiveness         constant responsiveness
                        (2) negative exponential
                            responsiveness

responsiveness          (1) for all interventions, R        for all interventions, R
                        (2) for the 1st intervention: R
                            for the 2nd intervention: qR
                            for the 3rd intervention: q^2 R
                            ...

cost                    cost per intervention, C            cost per recognition, C_r
                                                            cost per intervention, C_i
                                                            cost per recognition and
                                                            intervention, C = C_r + C_i

benefit                 value of a successfully executed    value of a successfully executed
                        care plan over an unsuccessful      care plan over an unsuccessful
                        care plan, S                        care plan, S

cost effectiveness      (1) net value, NV                   (1) net value, NV
                        = value of successfully executed    = value of successfully executed
                          care plans                          care plans
                          − cost of interventions            − (cost of recognition
                                                              + cost of intervention)
                        (2) net added value, NV-added       (2) net added value, NV-added
                        = value of successfully executed    = value of successfully executed
                          care plans with interventions       care plans with recognitions
                          − cost of interventions             and interventions
                                                              − (cost of recognition
                                                                + cost of interventions)

Note: NV takes into account the value without interventions, but NV-added does not.
monitoring frameworks [16] use detection followed by an intervention strategy to avoid care plan violations, but they do not attempt to recognise any associated mental deficits. From the perspective of electronic care plan formation and management, an approach similar to ICCM has been proposed [18]. This approach associates a care plan with two types of states: critical states (CS) and danger of violation (DOV) states. CS are the states that are compulsory for the successful execution of the care plan. DOV states indicate a possible violation of the care plan, but they are not explicit care plan states [18]. Even though the aforementioned frameworks can be compared with the adherence support mechanism proposed in ICCM, to our knowledge these frameworks do not have associated cost/benefit analyses. The generic cost/benefit analysis performed in this paper for adherence support based on behaviour monitoring, recognition and intervention can be used as a framework to determine the cost effectiveness of other related approaches.

4. Concluding Remarks

Adherence to care plans by both patients and healthcare providers is a critical factor in reducing the costs and improving the outcomes associated with chronic disease. This paper provides insights into the cost effectiveness of adherence support based on behaviour monitoring, interpretation and intervention within an ICCM model. A number of theoretical studies and simulations were carried out using parameters that capture: (1) the contract parties’ behaviour; (2) the costs associated with behaviour monitoring, interpretation and intervention; and (3) the benefit associated with a successfully executed care plan. The theoretical analysis: (1) established the general principles of intervention; and (2) provided a basis for validating the simulation results. In the current studies, all parameters are assigned values from zero to one. Our future work aims at: (1) applying the framework to different domains; and (2) assigning the parameters various statistical distributions that mathematically describe their behaviour. These distributions may vary with the application domain. Since the simulation results are consistent with the theoretical outcomes, the current simulation code can be used in situations involving complex distributions, or when a theoretical analysis is difficult.
Acknowledgments

The work reported here was supported in part by British Telecom (CT1080050530), the Australian Research Council (LP0774944), the Australian Government’s Clever Networks program, and the Victorian Department of Innovation, Industry and Regional Development, Department of Health, and Multi Media Victoria. We also gratefully acknowledge the contributions and advice of Dr Simon Thompson and Dr Hamid Gharib of British Telecom, and of Professor Leon Piterman, Dr Kay Jones, Associate Professor Peter Schattner and Dr Akuh Adaji of the Department of General Practice, Monash University.
A. Random Intervention and Constant Responsiveness

A.1. Case 1: Analysis for a Care Plan with a Single Obligation

value of the successfully executed care plans
= Σ_{i=0}^{I} n_i S, where n_i denotes the number of successfully executed care plans at the ith intervention
= n_0 S + n_1 S + n_2 S + ... + n_I S
= (1 − D)S + DRS + D(1 − R)RS + D(1 − R)^2 RS + ... + D(1 − R)^{I−1} RS

cost of intervention
= Σ_{i=0}^{I−1} x_i C, where x_i denotes the number of failed care plans at the ith intervention
= x_0 C + x_1 C + x_2 C + ... + x_{I−1} C
= DC + D(1 − R)C + D(1 − R)^2 C + ... + D(1 − R)^{I−1} C

Net value, NV = value of the successfully executed care plans − cost of intervention
= (1 − D)S + DRS(1 + (1 − R) + (1 − R)^2 + ... + (1 − R)^{I−1}) − DC(1 + (1 − R) + (1 − R)^2 + ... + (1 − R)^{I−1})
K. Wickramasinghe et al. / Cost/Benefit Analysis of an Adherence Support Framework
$= (1-D)S + D(RS - C)\big(1 + (1-R) + (1-R)^2 + \ldots + (1-R)^{I-1}\big)$
Let $r = 1-R$. Then
$NV/D = (1-D)S/D + (RS - C)(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1})$
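The closed form above can be cross-checked against a simulation, mirroring the chapter's observation that simulation and theory agree. The function names and parameter values below are our own illustrative choices, not the authors' actual simulation code; $D$ is the probability that a care plan fails without support, $R$ the responsiveness to an intervention, $S$ the value of a successfully executed plan, $C$ the cost of one intervention, and $I$ the maximum number of interventions.

```python
import random

def analytic_net_value(D, R, S, C, I):
    """Closed form from Appendix A.1:
    NV = (1-D)*S + D*(R*S - C) * sum_{k=0}^{I-1} (1-R)**k."""
    geometric = sum((1 - R) ** k for k in range(I))
    return (1 - D) * S + D * (R * S - C) * geometric

def simulate_net_value(D, R, S, C, I, patients=200000, seed=7):
    """Monte Carlo estimate of the net value per patient."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(patients):
        if rng.random() >= D:      # plan executed without any intervention
            total += S
            continue
        for _ in range(I):         # plan failed: intervene up to I times
            total -= C             # every intervention costs C
            if rng.random() < R:   # patient responds with probability R
                total += S
                break
    return total / patients

print(round(analytic_net_value(0.4, 0.5, 1.0, 0.1, 5), 4))
print(round(simulate_net_value(0.4, 0.5, 1.0, 0.1, 5), 4))
```

The two printed values agree to within Monte Carlo error, the kind of compatibility between simulation and theory reported in the conclusion.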
A.2. Case 2: Analysis for a Care Plan with Multiple Obligations

When $N = 1$, for given values of $D$ and $R$, let $G_{D,R,I}$ denote the total number of (final) success states up to $I$ interventions. Then, when $N = n$, $(G_{D,R,I})^n$ denotes the number of final success states, $F$, for an $I$-th intervention. That is, when $N = 1$:
for $I = 1$, $G_{D,R,1} = 1 - D + DR$
for $I = 2$, $G_{D,R,2} = 1 - D + DR(1 + (1-R))$
for $I = 3$, $G_{D,R,3} = 1 - D + DR(1 + (1-R) + (1-R)^2)$
Similarly, when $N = 2$:
for $I = 1$, $F = (G_{D,R,1})^2 = (1 - D + DR)^2$
for $I = 2$, $F = (G_{D,R,2})^2 = (1 - D + DR(1 + (1-R)))^2$
$\vdots$
Then the "success states added up to an $I$-th intervention" is
F-added $= (G_{D,R,I})^n - (1-D)^n$
The net added value by $I$ interventions can be calculated as follows. When $N = 1$:
cost of intervention $= DC + D(1-R)C + D(1-R)^2 C + \ldots + D(1-R)^{I-1} C$
Let $K_{D,R,I} = DC + D(1-R)C + D(1-R)^2 C + \ldots + D(1-R)^{I-1} C$.
When $N = n$:
cost of intervention $= K_{D,R,I}\big(1 + G_{D,R,I} + (G_{D,R,I})^2 + \ldots + (G_{D,R,I})^{n-1}\big)$
net added value by interventions,
NV-added $= S\big((G_{D,R,I})^n - (1-D)^n\big) - K_{D,R,I}\big(1 + G_{D,R,I} + (G_{D,R,I})^2 + \ldots + (G_{D,R,I})^{n-1}\big)$
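The multi-obligation expressions can be evaluated directly; a minimal sketch in which the function names and parameter values are ours, not the chapter's:

```python
def G(D, R, I):
    """G_{D,R,I}: fraction of obligations in a success state after I
    interventions with constant responsiveness R (Appendix A.2)."""
    return 1 - D + D * R * sum((1 - R) ** k for k in range(I))

def K(D, R, C, I):
    """K_{D,R,I}: cumulative intervention cost for a single obligation."""
    return D * C * sum((1 - R) ** k for k in range(I))

def nv_added(D, R, S, C, I, n):
    """Net added value for a care plan with n obligations (Appendix A.2)."""
    g = G(D, R, I)
    return S * (g ** n - (1 - D) ** n) - K(D, R, C, I) * sum(g ** j for j in range(n))

print(round(nv_added(0.4, 0.5, 1.0, 0.1, 3, 2), 4))
```

For $n = 1$ this reduces to the single-obligation added value $D(RS - C)(1 + (1-R) + \ldots + (1-R)^{I-1})$ of Appendix A.1.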
A.3. Minimum R for a Positive Net Added Value for a Care Plan with N = 2

Any intervention on the first obligation is cost effective if $R > C/f$, where $f$ is the amount added by the second obligation to the net added value and $f \le S$. Therefore, the first obligation will be intervened on if $R > C/f$. The question is whether it is worth intervening on the second obligation if it is not carried out by the patient. From the single-obligation case, it is cost effective to intervene if $R > C/S$, and an infinite number of interventions is possible, as at each intervention the value gained will be higher than the cost of intervention. So when applied to a population (more than one patient), everyone will eventually execute the obligation. The net value from the first intervention is $SR - C$. Each subsequent intervention takes the number of failed states and yields a net value of $SR - C$ for each failed state. Net value for the population $= g(SR - C)$, where $g$ is some number based on how many patients fail at each intervention. Therefore, when $N = 2$, it is cost effective to intervene if $Rg(SR - C) > C$, that is, if $R^2 > C(1/g + R)/S$. As long as $R^2 > C(1/g + R)/S$, any number of interventions is possible. Even though an infinite number of interventions is not practically feasible, this provides a basic norm as to whether a subsequent intervention is cost effective or not.
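The threshold $R^2 > C(1/g + R)/S$ can be wrapped in a small predicate; a hedged sketch with illustrative numbers (the helper name is ours):

```python
def second_obligation_cost_effective(R, C, S, g):
    """Appendix A.3 condition for intervening in the second obligation:
    R**2 > C * (1/g + R) / S, where g reflects how many patients fail
    at each intervention round."""
    return R ** 2 > C * (1.0 / g + R) / S

# high responsiveness passes the test, low responsiveness does not
print(second_obligation_cost_effective(0.5, 0.1, 1.0, 2.0))
print(second_obligation_cost_effective(0.2, 0.1, 1.0, 2.0))
```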
B. Random Intervention and Negative Exponential Responsiveness

B.1. Case 1: Analysis for a Care Plan with a Single Obligation

When $N = 1$, for given values of $D$, $R$, and $q$, let $H_{D,R,q,I}$ denote the number of success states achieved by the $I$-th intervention:
for $I = 1$, $H_{D,R,q,1} = DR$
for $I = 2$, $H_{D,R,q,2} = D(1-R)qR = H_{D,R,q,1} \cdot q(1 - q^{I-2}R)$
for $I = 3$, $H_{D,R,q,3} = D(1-R)(1-qR)q^2 R = H_{D,R,q,2} \cdot q(1 - q^{I-2}R)$

value of the successfully executed care plan
$= \sum_{I=0}^{n} H_{D,R,q,I} S$
$= H_{D,R,q,0} S + H_{D,R,q,1} S + H_{D,R,q,2} S + \ldots + H_{D,R,q,I} S$
$= (1-D)S + DRS + DRq(1-R)S + DRq^2(1-R)(1-qR)S + \ldots + DRq^{I-1}(1-R)(1-qR)\cdots(1-q^{I-2}R)S$

cost of intervention
$= \sum_{i=0}^{I-1} x_i C$, where $x_i$ denotes the number of failed care plans at the $i$-th intervention
$= x_0 C + x_1 C + x_2 C + \ldots + x_{I-1} C$
$= 0 + DC + D(1-R)C + D(1-R)(1-qR)C + \ldots + D(1-R)(1-qR)\cdots(1-q^{I-2}R)C$

Net value $= NV =$ value of the successfully executed care plan $-$ cost of intervention
$= (1-D)S + DRS + DRq(1-R)S + DRq^2(1-R)(1-qR)S + \ldots + DRq^{I-1}(1-R)(1-qR)\cdots(1-q^{I-2}R)S$
$\quad - \big(0 + DC + D(1-R)C + D(1-R)(1-qR)C + \ldots + D(1-R)(1-qR)\cdots(1-q^{I-2}R)C\big)$

$NV/D = (1-D)S/D + RS\big(1 + q(1-R) + q^2(1-R)(1-qR) + \ldots + q^{I-1}(1-R)(1-qR)\cdots(1-q^{I-2}R)\big)$
$\quad - C\big(1 + (1-R) + (1-R)(1-qR) + \ldots + (1-R)(1-qR)\cdots(1-q^{I-2}R)\big)$
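The series above is easiest to evaluate by recursion over the fraction of plans still failing; with $q = 1$ it must reduce to the constant-responsiveness result of Appendix A.1. This is our own sketch with illustrative parameter values, not the chapter's simulation code:

```python
def net_value_exp(D, R, q, S, C, I):
    """Net value when responsiveness at the k-th intervention decays as
    q**(k-1) * R (Appendix B.1), computed by direct recursion."""
    value = (1 - D) * S        # plans executed without intervention
    cost = 0.0
    failed = D                 # fraction still failing before intervention k
    for k in range(1, I + 1):
        cost += failed * C                 # pay C for every failed plan
        respond = q ** (k - 1) * R         # decayed responsiveness
        value += failed * respond * S      # newly recovered plans
        failed *= 1 - respond
    return value - cost

print(round(net_value_exp(0.4, 0.5, 0.8, 1.0, 0.1, 5), 4))
```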
B.2. Case 2: Analysis for a Care Plan with Multiple Obligations

The generalised calculation used in Case 2 of the constant-responsiveness scenario (Appendix A.2) is adopted to calculate the success states when responsiveness follows a negative exponential behaviour. That is, when $N = n$, let $(H_{D,R,q,I})^n$ denote the number of final success states, $F$, for an $I$-th intervention. From Case 1 of the negative exponential responsiveness above (Appendix B.1), when $N = 1$:
for $I = 1$, $H_{D,R,q,1} = 1 - D + DR$
for $I = 2$, $H_{D,R,q,2} = 1 - D + DR(1 + q(1-R))$
for $I = 3$, $H_{D,R,q,3} = 1 - D + DR(1 + q(1-R) + q^2(1-R)(1-qR))$

Similarly, when $N = 2$:
for $I = 1$, $F = (H_{D,R,q,1})^2 = (1 - D + DR)^2$
for $I = 2$, $F = (H_{D,R,q,2})^2 = (1 - D + DR(1 + q(1-R)))^2$
$\vdots$
When $N = n$:
for $I = 1$, $F = (H_{D,R,q,1})^n = (1 - D + DR)^n$
for $I = 2$, $F = (H_{D,R,q,2})^n = (1 - D + DR(1 + q(1-R)))^n$
$\vdots$
Then the "success states added up to an $I$-th intervention" is
F-added $= (H_{D,R,q,I})^n - (1-D)^n$
The net added value by $I$ interventions can be calculated as follows. When $N = 1$:
cost of intervention $= DC + D(1-R)C + D(1-R)(1-qR)C + \ldots + D(1-R)(1-qR)\cdots(1-q^{I-2}R)C$
Let $K_{D,R,q,I} = DC + D(1-R)C + D(1-R)(1-qR)C + \ldots + D(1-R)(1-qR)\cdots(1-q^{I-2}R)C$.
When $N = n$:
cost of intervention $= K_{D,R,q,I}\big(1 + H_{D,R,q,I} + (H_{D,R,q,I})^2 + \ldots + (H_{D,R,q,I})^{n-1}\big)$
net added value by interventions,
NV-added $= S\big((H_{D,R,q,I})^n - (1-D)^n\big) - K_{D,R,q,I}\big(1 + H_{D,R,q,I} + (H_{D,R,q,I})^2 + \ldots + (H_{D,R,q,I})^{n-1}\big)$
C. Tailored Intervention

C.1. Case 1: Analysis for a Care Plan with a Single Obligation

This calculation is similar to those carried out in Appendices A and B for random intervention; the only difference is that $R$ is replaced by $R_{reg}$. For example, for a single obligation and constant responsiveness (Appendix A.1):

value of the successfully executed care plan
$= \sum_{i=0}^{I} n_i S$, where $n_i$ denotes the number of successfully executed care plans at the $i$-th intervention
$= n_0 S + n_1 S + n_2 S + \ldots + n_I S$
$= (1-D)S + DR_{reg}S + D(1-R_{reg})R_{reg}S + D(1-R_{reg})^2 R_{reg}S + \ldots + D(1-R_{reg})^{I-1} R_{reg}S$

cost of intervention
$= \sum_{i=0}^{I-1} x_i C$, where $x_i$ denotes the number of failed care plans at the $i$-th intervention
$= x_0 C + x_1 C + x_2 C + \ldots + x_{I-1} C$
$= 0 + DC + D(1-R_{reg})C + D(1-R_{reg})^2 C + \ldots + D(1-R_{reg})^{I-1} C$

Net value $= NV =$ value of the successfully executed care plan $-$ cost of intervention
$= (1-D)S + DR_{reg}S\big(1 + (1-R_{reg}) + (1-R_{reg})^2 + \ldots + (1-R_{reg})^{I-1}\big) - DC\big(1 + (1-R_{reg}) + (1-R_{reg})^2 + \ldots + (1-R_{reg})^{I-1}\big)$
$= (1-D)S + D(R_{reg}S - C)\big(1 + (1-R_{reg}) + (1-R_{reg})^2 + \ldots + (1-R_{reg})^{I-1}\big)$
Let $r = 1-R_{reg}$. Then
$NV/D = (1-D)S/D + (R_{reg}S - C)(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1})$
C.2. Condition under which Mental-State Recognition is Cost Effective

We consider the case with a single obligation. From Appendix C.1:
$NV/D = (1-D)S/D + (R_{reg}S - C)(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1})$
Mental-state recognition is cost effective if the value added by intervention is positive, that is,
$(R_{reg}S - C)(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1}) > 0$
As $(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1}) > 0$, to be cost effective
$R_{reg}S - C > 0$
$L + R(1-L) > C/S$ (since $R_{reg} = L + R(1-L)$)
$L > (C/S - R)/(1-R)$
C.3. Condition under which Tailored Intervention is Cost Effective Compared to Random Intervention

We consider the case with a single obligation. For random intervention (from Appendix A.1):
$NV/D = (1-D)S/D + (RS - C)(1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1})$, where $r = 1-R$.
Since $0 \le r < 1$, as $I \to \infty$,
$1 + r + r^2 + r^3 + r^4 + \ldots + r^{I-1} \to 1/(1-r) = 1/R$
$NV/D = (1-D)S/D + (RS - C)/R$
$NV/D = (1-D)S/D \; + $ NV-added from random intervention
Similarly, for tailored intervention (from Appendix C.1):
$NV/D = (1-D)S/D + (R_{reg}S - C)/R_{reg}$
$NV/D = (1-D)S/D \; + $ NV-added from tailored intervention
For tailored intervention to be cost effective,
NV-added from tailored intervention $>$ NV-added from random intervention
$(R_{reg}S - (C_i + C_r))/R_{reg} > (RS - C_i)/R$
where the intervention cost is split into the cost of intervention $C_i$ and the cost of mental-state recognition $C_r$. This yields
$(C_i + C_r)R/C_i < L + R(1-L)$
$L > \big(((C_i + C_r)R/C_i) - R\big)/(1-R)$
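The final inequality gives a break-even recognition quality $L$ directly. A hedged numerical sketch (function names and values are ours; it assumes, as in Appendix C, $R_{reg} = L + R(1-L)$ and the infinite-horizon added values):

```python
def nv_added_random(R, S, Ci):
    """Infinite-horizon added value of random intervention (Appendix C.3)."""
    return (R * S - Ci) / R

def nv_added_tailored(L, R, S, Ci, Cr):
    """Added value of tailored intervention with recognition quality L."""
    r_reg = L + R * (1 - L)
    return (r_reg * S - (Ci + Cr)) / r_reg

def min_recognition_quality(R, Ci, Cr):
    """Break-even L from Appendix C.3: L > (((Ci+Cr)*R/Ci) - R) / (1 - R)."""
    return (((Ci + Cr) * R / Ci) - R) / (1 - R)

print(round(min_recognition_quality(0.4, 0.10, 0.02), 4))
```

Just above this threshold the tailored intervention's added value exceeds the random one's, and just below it falls short.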
C.4. Case 2: Analysis for a Care Plan with Multiple Obligations

The generalised calculation used in Case 2 of the constant-$R$ scenario (Appendix A.2) is used in this calculation. When $N = n$, let $(G_{D,R_{reg},L,I})^n$ denote the number of final success states, $F$, for an $I$-th intervention. Then the "success states added up to an $I$-th intervention" is
F-added $= (G_{D,R_{reg},L,I})^n - (1-D)^n$
The net added value by $I$ interventions can be calculated as follows. When $N = 1$:
cost of intervention $= DC + D(1-R_{reg})C + D(1-R_{reg})^2 C + \ldots + D(1-R_{reg})^{I-1} C$
Let $K_{D,R_{reg},L,I} = DC + D(1-R_{reg})C + D(1-R_{reg})^2 C + \ldots + D(1-R_{reg})^{I-1} C$.
When $N = n$:
cost of intervention $= K_{D,R_{reg},L,I}\big(1 + G_{D,R_{reg},L,I} + (G_{D,R_{reg},L,I})^2 + \ldots + (G_{D,R_{reg},L,I})^{n-1}\big)$
net added value by interventions,
NV-added $= S\big((G_{D,R_{reg},L,I})^n - (1-D)^n\big) - K_{D,R_{reg},L,I}\big(1 + G_{D,R_{reg},L,I} + (G_{D,R_{reg},L,I})^2 + \ldots + (G_{D,R_{reg},L,I})^{n-1}\big)$
Part III Improving the Well-Being Through Life-Style and Entertainment
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-131
Predicting Daily Physical Activity in a Lifestyle Intervention Program Xi LONGa,c, Steffen PAUWSa,1 , Marten PIJLa, Joyca LACROIXa, Annelies H. GORISb, and Ronald M. AARTSa,c a Philips Research Laboratories, Eindhoven, The Netherlands b Philips DirectLife, Amsterdam, The Netherlands c Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Abstract. The growing number of people adopting a sedentary lifestyle creates a serious need for effective physical activity promotion programs. Often, these programs monitor activity, provide feedback about activity and offer coaching to increase activity. Some programs rely on a human coach who creates an activity goal that is tailored to the characteristics of a participant. Throughout the program, the coach motivates the participant to reach his personal goal or adapts the goal, if needed. Both the timing and the content of the coaching are important for its effectiveness. Insights into the near-future state of, for instance, the behaviour and motivation of a participant can help realize an effective proactive coaching style that is personalized in terms of timing and content. As a first step towards providing these insights to a coach, this chapter discusses results of a study on predicting daily physical activity level (PAL) data from past data of participants in a lifestyle intervention program. A mobile body-worn activity monitor with a built-in triaxial accelerometer was used to record PAL data of a participant for a period of 13 weeks. Future PAL data for all days in a given period were predicted by employing autoregressive integrated moving average (ARIMA) models on the PAL data from days in the preceding period. By using a newly proposed categorized-ARIMA (CARIMA) prediction method, we achieved a large reduction in computation time without a significant loss in prediction accuracy in comparison with traditional ARIMA models. In CARIMA, PAL data are categorized as stationary, trend or seasonal data by assessing their autocorrelation functions. Then, an ARIMA model that is most appropriate for each of these three categories is automatically selected based on an objective penalty function criterion.
The results show that our CARIMA method performs well in terms of PAL prediction accuracy (~9% mean absolute percentage error), model parsimony and robustness. Keywords. tailored activity intervention, persuasive technology, activity monitoring, ARIMA, Kalman recursion, order selection, PAL, prediction, wellbeing, health, behaviour change
Introduction

An active lifestyle has large beneficial effects on people's mental and physical health [1]. Both the engagement in intense physical activity (e.g., endurance sports or strength exercises) and the engagement in various types of brief, low to moderately
Corresponding author: Steffen Pauws, Philips Research Laboratories, [email protected]
X. Long et al. / Predicting Daily Physical Activity in a Lifestyle Intervention Program
intense, everyday physical activities (e.g., walking) contribute to better mental and physical health [2]. Nevertheless, a growing number of people worldwide fail to meet the recommended levels of physical activity needed to maintain good health. An inactive lifestyle has been associated with health problems such as obesity, diabetes and cardiovascular disease, and hence with a reduction in quality of life. These health problems place a significant burden on the healthcare system [3]. As a consequence of this worrisome situation, the development of effective physical activity intervention programs has become a major focus area [4][5]. For participants of these programs, the provision of feedback on activity level against personal targets has proven to be highly important for staying motivated and attaining goals [5][6]. To track activities of the participants, a mobile body-worn triaxial accelerometer is often used. It is an inexpensive, effective and feasible device that causes minimal discomfort to the participant and tracks activity by measuring the acceleration of body movements [7][8][9][10][11]. The relationship between energy expenditure due to physical activity and body acceleration measured by such a triaxial accelerometer has been evaluated under various controlled and free-living conditions [7][8][12][13]. These studies demonstrated the validity and usefulness of a triaxial accelerometer for measuring daily physical activity. Many physical activity intervention programs have failed to realize significant and sustainable changes in activity behaviour [14][15]. They rely on mass intervention by providing a generic solution for the entire participant base that does not accommodate individual differences in the variables underlying behaviour change. In contrast to mass interventions, individual one-to-one behaviour change interventions by a human coach are often more effective.
The main reason is that a human coach has insight into the participant's performance and characteristics and can provide feedback and support regarding specific motivational dips or barriers that the participant encounters. In this way, the coach can employ a personalized approach to optimize behaviour change by, for instance, setting and adapting individual targets for the participant, providing participant-relevant feedback and adjusting the interaction mode to the participant's preferences. However, human coaching support can be labour intensive, which limits the number of participants that a human coach can reach. To increase the coach's reach, we propose to provide the coach with insightful information about a participant's performance by predicting his near-future activity levels. In particular, we introduce a modelling technique that predicts the most likely next-period daily physical activity level (PAL) of a participant from his past physical activity patterns. Such information can help a coach provide effective coaching in terms of timing and content. For example, when it is likely that a participant will fail to reach a personal target for the next week, a proactive human coaching intervention can reach the participant before the target is actually missed, for instance, by sending an additional motivational trigger or by lowering the target. For predicting future PAL data, we use autoregressive integrated moving average (ARIMA) models, also known as the Box-Jenkins methodology for time series data analysis [16]. Unfortunately, this method requires an unacceptably long computation time due to the need to repeatedly choose the most appropriate model from a large number of candidate models. Therefore, we applied a categorized ARIMA (CARIMA) method in which the time series under study are classified into one of three categories (i.e., stationary data, trend-wise data and seasonal data).
Then, the most appropriate model for each category is chosen beforehand instead of computed from a pool of
candidate models. The parameter estimation of the models is based on the maximum likelihood (ML) method via a Kalman recursion [17]. The remainder of the chapter is organized as follows: Section 1 presents some related work and possible extensions of the work. Section 2 presents the data collection. The ARIMA method is explained in Section 3. Section 4 discusses the results of the ARIMA modeling and prediction. Conclusions are provided in the final section.
1. Related work

The Box-Jenkins method has already been widely used in a number of other areas such as economic time series forecasting [18][19], ecological and weather prediction [20][21], medical monitoring [22][23], traffic flow prediction [24], and also physical activity recognition [25]. Generally, the application of ARIMA models is mostly focused on predicting a single univariate time series. For multiple univariate series, an automatic method has been developed [27][28]. Besides providing insightful information to a coach in a physical activity intervention program, the proposed work can also be applied in telecare or home telemonitoring services. In these healthcare services, elderly patients or those with chronic conditions are helped to maintain their independence and continue living in their own homes by means of communication technologies. For instance, event prediction is a crucial element for providing high-quality telecare. In the case of telemonitoring chronic heart failure patients, a timely prediction of a worsening patient condition based on daily measurements of body weight, heart rate and blood pressure is essential to prevent re-hospitalizations or any critical situation. In the case of a senior wandering service, analysis of the urban dwelling behaviour of cognitively impaired elderly using GPS data is essential to track their outdoor mobility. Lastly, data from a single triaxial accelerometer measuring human physical activity also allow for automatic classification of typical daily physical activities such as walking, running, cycling or driving [26]. This helps participant and coach to effortlessly link activity patterns with a type of activity.
2. Data Collection

In a lifestyle physical activity intervention program, participants were provided with a Philips DirectLife activity monitor containing a built-in triaxial accelerometer to measure the acceleration data of their activities performed throughout the day. This device is a small (3.2 x 3.2 x 0.5 cm), light-weight (12.5 g) instrument (see Figure 1). It is waterproof up to 30 meters depth, and has a battery life of 3 weeks and an internal memory that can store data for up to 22 weeks. The device can be worn on the chest with a key cord, on the waist, or in the trouser pocket in an arbitrary orientation. The features of the device have been designed to enhance unobtrusiveness of wearing and to reduce the interference of the monitoring system with spontaneous activity behaviour. During the monitoring period, the activity monitor was connected several times to a personal computer, using Universal Serial Bus (USB) communication, and the recorded data were uploaded, processed and stored using dedicated software.
URL: http://www.directlife.philips.com/
Figure 1. The Philips DirectLife activity monitor.
The output of the activity monitor is expressed as activity counts per minute, which are the running time summations of absolute output values from the three uniaxial accelerometers in the device. Consecutive counts were summed to arrive at counts per day. By using the correspondence between activity counts and total energy expenditure, daily Physical Activity Level (PAL) values are calculated using a linear regression model on activity counts and a measure for basal metabolic rate corrected for age, height and body mass [7][8]. The physical activity intervention program ran at different locations in the Netherlands throughout the year, with high participation in the months November, December and January. Each participant in the program, which lasted 13 weeks, took part in one assessment week and 12 intervention weeks. During the assessment week, participants learned to use the activity monitor and completed personal data (e.g., age, gender, weight). The PAL data monitored during the assessment week serve as a baseline on the basis of which a personal activity goal was set to work towards during the 12 weeks of the program. In total, there were 91 (13×7) days on which participants wore the device. Each day yielded a data point containing the PAL value accumulated over an entire 24-hour day of a participant in the program. All data were stored in a database. In total, 950 participants were recruited for participation in the physical activity intervention program. For the majority of the adult population, daily PAL values lie between 1.2 and 2.5 [30]. A daily PAL around 1.2 corresponds to a sedentary activity level. A PAL below 1.2 indicates that the activity monitor had not been worn that day. A PAL higher than 1.7 corresponds to a healthy activity level. A PAL higher than 2.5 corresponds to a vigorous activity level.
An extremely high level of physical activity leads to a PAL value as high as 4.5, but this level of activity is only achieved in extreme situations (e.g., professional sports) [29]. In this study, we treated the extremely low and extremely high PAL values (outliers: PAL<1.2 or PAL>5) as missing data. Outliers and missing data are generated by non-modelled mechanisms such as not wearing the activity monitor, a flat battery, monitor noise or other disturbances. They can lead to ARIMA model misspecification and bad prediction performance. Therefore, the database was cleaned up by removing all time series contaminated with missing data (and outliers). Of a total of 950 time series, 227 were kept for further study. The impact of noise and missing data on ARIMA prediction accuracy is treated separately (in Section 2.3) in a systematic fashion.
3. ARIMA Method

In this study, ARIMA models were used to fit the observed daily PAL data and to make predictions of future PAL data. To this end, the ARIMA method consists of a procedure for modeling and one for prediction. In addition, the robustness of the fitted ARIMA models against noise and missing data is also important for prediction purposes and will be evaluated accordingly.

3.1. ARIMA Modeling

ARIMA modeling aims at constructing the most appropriate model to fit the observed data. A general ARIMA model can be structurally classified as an ARIMA$(p,d,q)(P,D,Q)_S$ model [16], written as

$\phi(B)\,\Phi(B^S)\,(1-B)^d\,(1-B^S)^D\, y_t = \theta(B)\,\Theta(B^S)\,\varepsilon_t$  (1)

$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \ldots - \phi_p B^p$  (2)

$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \ldots - \theta_q B^q$  (3)

$\Phi(B^S) = 1 - \Phi_1 B^S - \Phi_2 B^{2S} - \ldots - \Phi_P B^{PS}$  (4)

$\Theta(B^S) = 1 - \Theta_1 B^S - \Theta_2 B^{2S} - \ldots - \Theta_Q B^{QS}$  (5)

where the symbols used are defined as follows:
$y_t$: data point at time $t$
$\varepsilon_t$: the independent, identically and normally distributed residual at time $t$
$B$: backward shift operator, where $B^n y_t = y_{t-n}$
$p$: order of non-seasonal autoregressive (AR) terms
$d$: order of non-seasonal differencing
$q$: order of non-seasonal moving average (MA) terms
$P$: order of seasonal autoregressive (SAR) terms
$D$: order of seasonal differencing
$Q$: order of seasonal moving average (SMA) terms
$S$: seasonal order.
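The differencing factors $(1-B)^d$ and $(1-B^S)^D$ in the model can be illustrated on a synthetic PAL-like series: one regular difference removes a linear trend, and one seasonal difference with $S = 7$ removes a fixed weekly pattern. The code is our own illustration, not part of the chapter's software:

```python
def difference(series, lag=1):
    """Apply (1 - B**lag) to a series: returns y_t - y_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# synthetic 4-week PAL-like series: upward trend plus a weekend bump
pal = [1.5 + 0.01 * t + (0.2 if t % 7 in (5, 6) else 0.0) for t in range(28)]

# d = 1, D = 1, S = 7: regular difference first, then one seasonal difference
z = difference(difference(pal, lag=1), lag=7)
print(max(abs(v) for v in z))   # trend and weekly pattern are both removed
```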
The ARIMA modeling process consists of identification, model estimation and order selection, diagnostic checks and residual analysis.

• Identification
The orders of the ARIMA models need to be determined to ensure that the selected model fits the observed PAL time series best. In the ARIMA models, the non-seasonal differencing and the seasonal differencing are crucial to remove the trend and seasonality of the time series (i.e., to achieve stationarity). The selection of the orders d
and D can be made by tentatively identifying the stationarity and seasonality of the observed data. A stationary time series has a mean and a variance that are constant over time. A non-stationary series either has a trend upwards or downwards or fluctuates over different levels. In practice, an approximately stationary assumption is sufficient rather than a strictly stationary one. Seasonality is yet another general pattern that can be observed in the data. Intuitively, a seasonal series means that the data behave periodically. The periodicity of our daily PAL time series is assumed to be seven days, as we expected weekly behaviour in physical activity. First-order non-seasonal differencing (d=1) and seasonal differencing (D=1) can effectively remove the trend and the seasonality of the data [31]. The use of higher differencing orders may result in over-differencing the data. In this study, the analysis of the autocorrelation function (ACF) was used to identify a time series as being a stationary, a trend-wise or a seasonal time series. Considering the observations $y_1, y_2, \ldots, y_n$, the autocorrelation at lag $k$ is

$\rho_k = \dfrac{\sum_{t=1}^{n-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$  (6)

where $\bar{y} = \frac{1}{n}\sum_{t=1}^{n} y_t$ and $n$ is the sample size. If the ACF magnitude of a time series cuts off or dies down fairly quickly as $k$ increases, then the series is considered to be stationary. Otherwise, if the ACF of a series dies down slowly, it is considered to be non-stationary. Besides, if the ACF of a series has spikes at specific lags, it is considered to be seasonal. Figures 2, 3 and 4 show typical examples of a stationary, a trend-wise and a seasonal time series and their ACFs. One would expect trend and seasonal components to coexist in physical activity time series over twelve weeks; for instance, a person can steadily improve his activity levels but still follow a consistent weekly routine. We did not observe such patterns in the current data set.
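Equation (6) and the visual classification rule can be turned into a few lines of code; the series below are synthetic stand-ins for the PAL data, and the cut-off behaviour shown is only illustrative:

```python
import random
from statistics import mean

def acf(series, max_lag):
    """Sample autocorrelations rho_1 .. rho_max_lag as in Eq. (6)."""
    m = mean(series)
    denom = sum((y - m) ** 2 for y in series)
    rho = []
    for k in range(1, max_lag + 1):
        num = sum((series[t] - m) * (series[t + k] - m)
                  for t in range(len(series) - k))
        rho.append(num / denom)
    return rho

rng = random.Random(0)
stationary = [1.7 + 0.05 * rng.gauss(0, 1) for _ in range(91)]    # flat PAL
trending = [1.4 + 0.01 * t + 0.05 * rng.gauss(0, 1) for t in range(91)]

# the ACF of the trending series dies down slowly; the stationary one cuts off
print(round(acf(stationary, 10)[9], 2), round(acf(trending, 10)[9], 2))
```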
Figure 2. The plots of a daily PAL time series that demonstrates stationary behaviour over 91 days (left-hand side) and its corresponding ACF (right-hand side).
Figure 3. The plots of a daily PAL time series that contains a downward and an upward trend over 91 days (left-hand side) and its corresponding ACF (right-hand side).
Figure 4. The plots of daily PAL time series containing a seasonal (weekly) trend (left-hand side) and its corresponding ACF (right-hand side).
In order to examine the stationarity and the seasonality of a time series quantitatively and automatically rather than to inspect them visually, the $t_{\rho_k}$ statistic can be calculated as

$t_{\rho_k} = \rho_k / s_{\rho_k}$  (7)

where the standard error of $\rho_k$ is

$s_{\rho_k} = \Big[\big(1 + 2\sum_{i=1}^{k-1} \rho_i^2\big) / n\Big]^{1/2}$  (8)

Table 1. Critical absolute t-values for identified spikes in the ACF [31]

  Lags                                Critical absolute t-value in ACF
  Non-seasonal lags
    Low (1, 2, 3)                     1.6
    Other non-seasonal                2.0
  Seasonal lags
    Exact seasonal (S, 2S, 3S)        1.25
    Near & half seasonal              1.6
  Other lags                          2.0
For identifying significant spikes at lags of an ACF automatically, various critical absolute t-values have been suggested [31]. Table 1 lists the critical values used to examine stationarity and seasonality in an ACF. Alternatively, the Augmented Dickey-Fuller (ADF) unit root test [32] can be utilized to identify the stationarity and seasonality of a given series.
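The spike test of Eqs. (7)-(8) and Table 1 is mechanical; a sketch with a hand-made ACF, where the function names and example values are ours:

```python
def t_stat(rho, k, n):
    """t-value of rho_k using the standard error of Eq. (8);
    rho is the list [rho_1, rho_2, ...] and n is the sample size."""
    se = ((1 + 2 * sum(r ** 2 for r in rho[:k - 1])) / n) ** 0.5
    return rho[k - 1] / se

def seasonal_spike(rho, n, S=7, critical=1.25):
    """Flag seasonality when the spike at the exact seasonal lag S exceeds
    the Table 1 critical absolute t-value of 1.25."""
    return abs(t_stat(rho, S, n)) > critical

weekly = [0.10, 0.05, -0.02, 0.00, 0.03, 0.10, 0.60]   # strong lag-7 spike
flat = [0.10, 0.05, -0.02, 0.00, 0.03, 0.10, 0.10]     # no seasonal spike
print(seasonal_spike(weekly, 91), seasonal_spike(flat, 91))
```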
• Model Estimation and Order Selection
In this study, the well-known recursive Kalman filter [17] was applied to the observed PAL time series for model estimation. In this so-called state space model, the filtered state serves as the predicted state. Given the observations {y1, y2, …, yt-1}, the Kalman recursion searches for the optimal estimate of yt at time t. The fitting residuals εt and their variances Ft are of particular interest. The Kalman recursion consists of two groups of equations: the observation equations and the state updating equations. The observation equations incorporate each new incoming observation into the estimate, while the (unobserved) state updating equations update the current state and residual variance estimates to obtain the fitted values at the next time step. Details of the Kalman recursion for ARIMA models are provided elsewhere [33][34]. The unknown initial elements of the unobserved state, including the initial disturbances of the different components and the initial state mean and variance at time t = 1, were defined using the diffuse initialization method [35]. In addition, ARIMA models normally contain two or more parameters (hyperparameters), such as the autoregressive and moving average coefficients. Maximum likelihood estimation (MLE) was used to optimize these parameters iteratively: a log-likelihood function was constructed and its value maximized by simultaneously minimizing the residuals and their variances [35]. To select the most appropriate model for a specific series, the orders of the AR, MA, SAR and SMA processes must be determined. Models with too high an order (i.e., p, q, P and Q) over-fit the data; in practice, p, q, P and Q are equal to or less than 2 [36]. The seasonal order S was assumed to be 7 (i.e., reflecting weekly periodicity).
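For illustration, a minimal scalar Kalman recursion of the kind described above can be sketched as follows. The AR(1) state-space form and all parameter values are our own simplified choices, not the chapter's Matlab implementation; real ARIMA filtering uses a higher-dimensional state:

```python
import math

def kalman_ar1(y, phi=0.5, sigma_e2=1.0, sigma_n2=0.5):
    """One-step-ahead Kalman recursion for the scalar state-space model
         y_t = a_t + e_t,          e_t ~ N(0, sigma_e2)   (observation)
         a_{t+1} = phi*a_t + n_t,  n_t ~ N(0, sigma_n2)   (state update)
    Returns the fitting residuals eps_t, their variances F_t, and the
    Gaussian log-likelihood built from them (the quantity maximized by MLE)."""
    a, P = 0.0, 1e7              # huge initial variance mimics diffuse initialization
    eps, F, loglik = [], [], 0.0
    for yt in y:
        Ft = P + sigma_e2        # variance of the one-step prediction error
        vt = yt - a              # prediction error (fitting residual)
        eps.append(vt)
        F.append(Ft)
        loglik += -0.5 * (math.log(2 * math.pi * Ft) + vt * vt / Ft)
        K = phi * P / Ft         # Kalman gain
        a = phi * a + K * vt     # predicted state for the next step
        P = phi * phi * P - K * K * Ft + sigma_n2
    return eps, F, loglik
```

Note how the residual variance collapses after the first observation: the diffuse start makes F1 enormous, and each incoming observation then tightens the state estimate, exactly the behaviour exploited when the log-likelihood is built from εt and Ft.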
Objective penalty function criteria, such as the Akaike Information Criterion (AIC) [37] and Schwarz's Information Criterion (SIC) [38], are the most widely used for order selection in ARIMA models. These criteria seek a trade-off between the goodness-of-fit and the parsimony of the models. The SIC is defined as

$\mathrm{SIC} = -2\ln(L)/n + k\ln(n)/n$    (9)
where L is the likelihood function of the ARIMA model, k is the number of parameters to be estimated (i.e., k = p+q+P+Q+1), and n is the number of observed data points. For a small or moderate sample size, SIC performs better than AIC in the order selection of ARIMA models [39][40]. Compared to AIC, SIC penalizes the number of parameters more heavily and thus favours model parsimony, which effectively decreases the risk of over-fitting. Initially, all 144 candidate models up to ARIMA(3,d,3)(2,D,2) were considered in model selection. However, the results showed that under SIC, 99% of the time series were best fit by ARIMA models with orders up to (2,d,2)(1,D,1). Therefore, only the 36 candidate ARIMA models with maximum order (2,d,2)(1,D,1) were tested under SIC for model selection.
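Equation 9 and the resulting order selection can be computed directly. In the sketch below the helper names and the log-likelihood values are hypothetical:

```python
import math

def sic(loglik, k, n):
    """Schwarz's Information Criterion of Eq. 9: -2 ln(L)/n + k ln(n)/n."""
    return -2.0 * loglik / n + k * math.log(n) / n

def select_order(candidates, n):
    """Pick the (p, q, P, Q) order whose maximized log-likelihood gives
    the lowest SIC, with k = p + q + P + Q + 1 parameters."""
    return min(candidates, key=lambda o: sic(candidates[o], sum(o) + 1, n))

# Hypothetical maximized log-likelihoods of three candidate models over n = 91 days:
cands = {(1, 1, 0, 0): -40.0, (2, 2, 1, 1): -38.5, (0, 1, 0, 0): -45.0}
best = select_order(cands, 91)  # the extra parameters of (2,2,1,1) are penalized
```

Even though the (2,2,1,1) candidate fits slightly better in likelihood, its four extra parameters push its SIC above that of the leaner (1,1,0,0) model, which is how SIC discourages over-fitting.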
Diagnostic Checking and Residual Analysis
The residuals r = {r1, r2, …, rn} of the fit need to be uncorrelated. To check this, the rescaled (i.e., standardized) residuals are defined as

$r_t = \varepsilon_t / \sqrt{F_t}$    (10)
where εt is the fitting residual and Ft is the residual variance at time t. The standardized residuals from an ARIMA fitting process should be uncorrelated and normally distributed; serial correlation in the residuals means that the model is inadequate. There are several statistical tests for such correlation. The Ljung-Box Q test [41] is a widely used test based on the ACF of the residuals. The Q statistic is

$Q = n(n+2) \sum_{l=1}^{h} (n-l)^{-1} \rho_l^2$    (11)
where n is the data sample size, ρl is the autocorrelation of the residuals at lag l, and h is the number of lags being tested. The choice of h is arbitrary and is normally set to 10 [35]. For a significance level of 5% (i.e., the probability of a Type I error), the assumption of uncorrelated residuals is rejected if Q is larger than χ²[5%](df), the critical value of the chi-square distribution [42] with df degrees of freedom. The fitting performance of the models can also be compared using residual measures. We used the commonly used mean absolute relative residual (MARR) to measure and quantify the quality of fit,

$\mathrm{MARR} = \frac{1}{n} \sum_{t=1}^{n} \left| r_t / y_t \right|$    (12)
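The Ljung-Box Q statistic of Equation 11 and the MARR of Equation 12 are straightforward to compute; a sketch in Python (the helper names are our own):

```python
def residual_acf(res, k):
    """Sample autocorrelation of the residuals at lag k."""
    n = len(res)
    m = sum(res) / n
    c0 = sum((v - m) ** 2 for v in res)
    return sum((res[t] - m) * (res[t + k] - m) for t in range(n - k)) / c0

def ljung_box_q(res, h=10):
    """Q statistic of Eq. 11: n(n+2) * sum_{l=1..h} rho_l^2 / (n-l)."""
    n = len(res)
    return n * (n + 2) * sum(residual_acf(res, l) ** 2 / (n - l)
                             for l in range(1, h + 1))

def marr(res, y):
    """Mean absolute relative residual of Eq. 12."""
    return sum(abs(r / v) for r, v in zip(res, y)) / len(res)
```

The resulting Q would then be compared against the chi-square critical value χ²[5%](df), taken e.g. from statistical tables [42].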
3.2. Prediction

The ARIMA modeling aims at choosing the most appropriate model for a specific PAL time series based on in-sample fitting. During the modeling process, the parameters and orders were automatically updated with every new incoming observation. In practice, once a model has been chosen, it should be used to predict future data, preferably using out-of-sample data to assess the prediction performance. Take the seasonal PAL time series in Figure 4 as an example, and assume that 70 data points over 10 weeks (including the assessment week) have been observed. The 70-day modeling period and the prediction period are indicated in Figure 5. Automatic order selection based on a penalty function criterion has frequently been used for ARIMA models [27][28]. This automatic-order-selection ARIMA (AOS-ARIMA) method searches for the best model for each individual time series by evaluating 36 candidate models, which is computationally intensive. An alternative is to apply one single model with fixed AR, MA, SAR and SMA orders to all time series. However, this single-model ARIMA (Single-ARIMA) method can only choose, a priori, the model that best fits the majority category among stationarity, trend and seasonality. We propose a categorized ARIMA (CARIMA) method, which applies to each series the model that best fits its category. The advantages of CARIMA include low
computational intensity and good quality of fit for the different PAL time series categories. The out-of-sample one-step-ahead prediction (1-SAP) provides a prediction of the PAL value for the next day. To assess the prediction accuracy, we use the mean absolute percentage error (MAPE),

$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| e_t / y_t \right|$    (13)

where n is the sample size, et is the prediction error and yt is the actual observation at time t.
Figure 5. Predicting a (seasonal) PAL time series from 70 observed data points.
3.3. Model Robustness

The robustness of a prediction model is important for real applications because data in practice contain noise, missing values and outliers. In economics, ARIMA models have proven to be robust [43]. For the PAL data, however, the robustness of the models against noise, missing values and outliers needs to be assessed. The noise in the PAL data mainly comes from the background environment and from the device itself during usage; hence, it can be considered additive white Gaussian noise (AWGN). Since we transformed the outlier problem into a missing-value problem (see Section 1), we do not address outliers separately. In the database, approximately 11.6% of the PAL data points were missing per time series on average. The missing data made model identification and parameter estimation difficult during the ARIMA modeling. In this study, a mean substitution approach was applied to deal with incomplete data; this approach simply substitutes the mean of all non-missing values when calculating the ACF of the data [44]. During the prediction procedure, missing data were treated as predictions in the same way [35]. When estimating the parameters by computing the log-likelihood function in the Kalman recursion, the residual was simply set to zero whenever the corresponding observation was missing.
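A sketch of how such a robustness test might be set up, adding AWGN at a target SNR and applying mean substitution for missing values (the helper names and conventions are our own assumptions):

```python
import math
import random

def add_awgn(signal, snr_db, seed=0):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise) = snr_db."""
    rng = random.Random(seed)
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    sd = math.sqrt(p_noise)
    return [v + rng.gauss(0.0, sd) for v in signal]

def mean_substitute(series):
    """Replace missing points (None) by the mean of the observed points,
    i.e., the mean substitution approach described above."""
    observed = [v for v in series if v is not None]
    m = sum(observed) / len(observed)
    return [m if v is None else v for v in series]
```

With helpers like these, one can corrupt a clean PAL series at a chosen SNR (e.g., 15 dB) or a chosen percentage of missing values and re-run the 1-SAP evaluation to reproduce the kind of robustness curves shown in Figure 8.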
4. Results and Discussion

The ACFs of the PAL data were identified, and the results show that stationary series amount to ~83% (188 out of 227) of the clean dataset, whereas trend and seasonal time series amount to ~4% (10 out of 227) and ~13% (29 out of 227), respectively. The majority of the series are thus considered stationary. The 36 ARIMA models with orders up to (2,d,2)(1,D,1) were applied to fit the data. The ARIMA(1,d,1)(0,D,0) model, chosen because of its best overall fit for the stationary series, was used when comparing average SIC statistics over all series. In Table 2, the best models are ranked for the stationary, trend and seasonal series according to their average SIC statistics. As shown in Equation 9, SIC establishes a trade-off between low residuals and model parsimony. The CARIMA method applies the highest-ranked model to the series of each category: the (1,d,1)(0,D,0), (0,d,1)(0,D,0) and (1,d,0)(0,D,1) models for the stationary, trend and seasonal series, respectively. Table 3 compares the fitting performances of the three methods in terms of MARR. The AOS-ARIMA method performed best in fitting accuracy but was by far the most computationally intensive (implemented in Matlab®); the other two methods take significantly less computation time. In addition, CARIMA achieved a lower average MARR than Single-ARIMA for the 227 PAL time series in this study. To test for significant differences in the MARRs, we employed Student's t-test for each pair of methods, with the null hypothesis that there is no difference between the two methods. The t-test statistic for the 227 MARRs obtained with the AOS-ARIMA and CARIMA methods was 2.84, which is larger than t(0.05; 452) = 1.96, the 5% critical value with 452 degrees of freedom (df).
The null hypothesis should therefore be rejected in favour of the alternative: there is a significant difference between the means of the two methods. Similarly, the t-test statistic of 2.90 calculated from the MARR values of the CARIMA and Single-ARIMA methods also indicates a significant difference. In conclusion, compared with the AOS-ARIMA method, the CARIMA method greatly reduces the computational cost of fitting the data at only a small loss in fitting accuracy (~0.1%).

Table 2. Top 3 ARIMA models ranked by the average SIC over the series of the stationary, trend and seasonal categories.

  Stationary series              Trend series                   Seasonal series
  Model           Average SIC    Model           Average SIC    Model           Average SIC
  (1,d,1)(0,D,0)  -0.3210        (0,d,1)(0,D,0)  -0.5339        (1,d,0)(0,D,1)  -0.0551
  (1,d,1)(1,D,0)  -0.2656        (0,d,1)(1,D,0)  -0.4896        (0,d,0)(0,D,1)  -0.0550
  (1,d,2)(0,D,0)  -0.2410        (0,d,2)(0,D,0)  -0.4785        (0,d,1)(0,D,1)  -0.0498
Table 3. Overall performance comparison of the in-sample fitting of the three methods used.

  Method         Average MARR (±SD)    Average Computation Time
  Single-ARIMA   8.9% (±2.8%)          0.5 s
  AOS-ARIMA      8.5% (±2.7%)          39.3 s
  CARIMA         8.6% (±2.8%)          0.6 s
Figure 6 shows that for the stationary and trend series, the CARIMA and Single-ARIMA methods performed similarly, and almost as well as the AOS-ARIMA method. For the seasonal series, however, the fitting performance of the Single-ARIMA method falls behind the other two methods. In addition, the fitting performances for the trend and seasonal series were better than those for the stationary series.
Figure 6. Average MARRs (%) of the data fitting for the stationary, trend and seasonal PAL series using the Single-ARIMA (black bars), AOS-ARIMA (dark gray bars) and CARIMA (light gray bars) methods.
The diagnostic checks, which examine whether the applied model accurately represents the underlying process in the observed time series, revealed that the residuals were serially uncorrelated in 211, 199 and 176 of the 227 cases when using the AOS-ARIMA, CARIMA and Single-ARIMA methods, respectively. The AOS-ARIMA method thus selected the best-fitting model for each series and achieved the highest number of serially uncorrelated residuals; the CARIMA method achieved a slightly lower number, and Single-ARIMA performed worst in this respect.
Figure 7. Average MAPEs (%) of multiple-step-ahead predictions using the CARIMA method.
During the prediction process, the data of the assessment week were assumed to be known. The out-of-sample 1-SAP errors show results similar to the in-sample fitting residuals (see Table 4): the CARIMA method is preferred because of its high prediction performance and computational simplicity. Table 5 shows that the predictions for the trend and seasonal series were more accurate than those for the stationary ones. To evaluate the performance of multiple-step-ahead prediction (M-SAP) with the CARIMA method, we plotted the errors (MAPEs) up to the 14-SAP (i.e., two weeks ahead) in Figure 7. The models performed well in M-SAP, with the accuracy of the 14-SAP declining by less than 0.7% compared with that of the 1-SAP.

Table 4. Overall performance comparison of the out-of-sample predictions using the three methods.

  Method         Average MAPE (±SD)    Average Computation Time
  Single-ARIMA   9.2% (±3.2%)          0.4 s
  AOS-ARIMA      9.0% (±2.9%)          36.5 s
  CARIMA         9.1% (±3.1%)          0.5 s
Table 5. Out-of-sample prediction performances for the three categories using the CARIMA method.

  Category            Average MAPE (±SD)
  Stationary series   9.3% (±3.3%)
  Trend series        7.9% (±1.5%)
  Seasonal series     8.6% (±2.5%)
To test the robustness of the models, the data were corrupted by introducing AWGN. As the database had been cleaned, we also manually replaced data points with missing values. Figure 8 plots the prediction performance (average 1-SAP MAPE) against various signal-to-noise ratios (SNR) and against various percentages of missing values. The average MAPE on the noisy data remains under 10% when the SNR is 15 dB. In addition, when we introduce 12% missing values in the time series, the mean MAPE is still only about 10%. In this study, the CARIMA models are therefore found to be robust to PAL data contaminated with noise and missing values.
Figure 8. Prediction performances (average 1-SAP MAPE) of PAL time series data under various signal-to-noise ratios (left-hand side) and with varying percentages of missing values (right-hand side).
5. Conclusion

The results presented in this chapter demonstrate that human physical activity behaviour can be predicted rather precisely. The prediction of physical activity data was studied in the context of a lifestyle intervention program. Daily physical activity data of a large number of program participants were collected with a body-worn activity monitor containing a single triaxial accelerometer. ARIMA models were employed to predict physical activity in future weeks based on data from previous weeks. The model estimation was done via a Kalman recursion, and the CARIMA method was employed to select the orders of the model. This led to a large reduction in the computation time needed, with only a minor decline in prediction accuracy in comparison to the traditional automatic-order-selection ARIMA method. Using CARIMA, MAPEs of 9.3%, 7.9% and 8.6% were achieved for the stationary, trend-wise and seasonal series in the data set, respectively. The CARIMA prediction models were shown to be robust against noisy data (with AWGN at SNR = 15 dB) and missing values (at a level of 12%). The precise prediction of the near-future physical activity of participants in a lifestyle intervention program opens opportunities for timely personalized coaching: it provides insight into future goal achievement and may be indicative of changes in motivation levels. Such predictions can help coaches optimize the content and timing of their coaching, enabling more personalized and effective coaching support.
Acknowledgments The authors would like to thank the Philips DirectLife team for the great and fruitful co-operation and Bin Yin, Wim Stut and John Lamb (Philips Research Laboratories) for their insightful comments.
References

[1] F.J. Penedo and J.R. Dahn, Exercise and well-being: a review of mental and physical health benefits associated with physical activity, Current Opinion in Psychiatry 18 (2005), 189–193.
[2] W.L. Haskell, I.M. Lee, R.R. Pate, K.E. Powell, S.N. Blair, B.A. Franklin, C.A. Macera, G.W. Heath, P.D. Thompson and A. Bauman, Physical activity and public health: updated recommendation for adults from the American College of Sports Medicine and the American Heart Association, Circulation 116(9) (2007), 1081–1093.
[3] P. Campagna, Physical Activity Levels and Dietary Intake of Child and Youth in the Province of Nova Scotia, Nova Scotia Dept. of Education, Health Promotion and Protection, 2005.
[4] L.J. Ware, R. Hurling, O. Bataveljic, W.B. Fairley, L.T. Hurst, P. Murray, L.K. Rennie, E.C. Tomkins, A. Finn, R.M. Cobain, A.D. Pearson and P.J. Foreyt, Rates and determinants of uptake and use of an internet physical activity and weight management program in office and manufacturing work sites in England: cohort study, Journal of Medical Internet Research 10(4) (2008), e56.
[5] A.H. Goris and R. Holmes, The effect of a lifestyle activity intervention program on improving physical activity behavior of employees, Proceedings of the 3rd International Conference on Persuasive Technology (PERSUASIVE), Oulu, Finland (2008).
[6] J. Lacroix, P. Saini and R. Holmes, The relationship between goal difficulty and performance in the context of a physical activity intervention program, Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI), Amsterdam, the Netherlands (2008), 415–418.
[7] A.G. Bonomi, G. Plasqui, A.H. Goris and K.R. Westerterp, Estimation of free-living energy expenditure using a novel activity monitor designed to minimize obtrusiveness, Obesity, advance online publication, February 25, 2010; doi:10.1038/oby.2010.34.
[8] G. Plasqui, A.M. Joosen, A.D. Kester, A.H. Goris and K.R. Westerterp, Measuring free-living energy expenditure and physical activity with triaxial accelerometry, Obesity Research 13 (2005), 1363–1369.
[9] R. Hurling, M. Catt, M. De Boni, B.W. Fairley, T. Hurst, P. Murray, A. Richardson and J.S. Sodhi, Using internet and mobile phone technology to deliver an automated physical activity program: randomized controlled trial, Journal of Medical Internet Research 9(2) (2008), e7.
[10] B.P. Clarkson, Life Patterns: Structure from Wearable Sensors, PhD Thesis, MIT Media Lab, 2002.
[11] X. Long, B. Yin and R.M. Aarts, Single-accelerometer-based daily physical activity classification, 31st Annual International IEEE EMBS Conference, Minneapolis, MN (2009).
[12] C.V.C. Bouten, K.R. Westerterp, M. Verduin and J.D. Janssen, Assessment of energy expenditure for physical activity using a triaxial accelerometer, Journal of Medicine & Science in Sports & Exercise 26(12) (1994), 1516–1523.
[13] M.J. Mathie, A.C. Coster, N.H. Lovell and B.G. Celler, Detection of daily physical activities using a triaxial accelerometer, Journal of Medical & Biological Engineering & Computing 41 (2003), 296–301.
[14] M.H. Van den Berg, J.W. Schoones and T.P.M. Vliet-Vlieland, Internet-based physical activity interventions: a systematic review of the literature, Journal of Medical Internet Research 9(3) (2007), e26.
[15] W. Zhu, Promoting physical activity through internet: a persuasive technology view, Proceedings of the Second International Conference on Persuasive Technology (Palo Alto, CA, USA, April 26-27, 2007), PERSUASIVE 2007, LNCS 4744, 12-1.
[16] G.E.P. Box and G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, CA, 1976.
[17] R.E. Kalman, A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82 (1960), 35–45.
[18] M.P. Clements and D.F. Hendry, Forecasting Economic Time Series, Cambridge University Press, 1998.
[19] P. Pai and C. Lin, A hybrid ARIMA and support vector machines model in stock price forecasting, Omega 33(6) (2005), 497–505.
[20] K. Yurekli, A. Kurunc and F. Ozturk, Application of linear stochastic models to monthly flow data of Kelkit stream, Journal of Ecological Modelling 183(1) (2005), 67–75.
[21] S.D. Campbell and F.X. Diebold, Weather forecasting for weather derivatives, Journal of the American Statistical Association 100(469) (2005), 6–16.
[22] U. Helfenstein, Box-Jenkins modelling in medical research, Journal of Statistical Methods in Medical Research 5(1) (1996), 3–22.
[23] A. Earnest, M.I. Chen, D. Ng and L.Y. Sin, Using autoregressive integrated moving average (ARIMA) models to predict and monitor the number of beds occupied during a SARS outbreak in a tertiary hospital in Singapore, BMC Health Services Research 5:36 (2005).
[24] B.M. Williams, P.K. Durvasula and D.E. Brown, Urban freeway traffic flow prediction: application of seasonal autoregressive integrated moving average and exponential smoothing models, Journal of the Transportation Research Board 1644 (1998), 132–141.
[25] A.M. Khan, Y.K. Lee and T.-S. Kim, Accelerometer signal-based human activity recognition using augmented autoregressive model coefficients and artificial neural nets, 30th Annual International IEEE EMBS Conference, Vancouver, BC (2008), 5172–5175.
[26] X. Long, B. Yin and R.M. Aarts, Single-accelerometer-based daily physical activity classification, Conference Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), Minneapolis, MN, USA, Sept 2-6, 2009, 6107–6110.
[27] S. Halim, I.N. Bisono, Melissa and C. Thia, Automatic seasonal autoregressive moving average models and unit root test detection, IEEE International Conference on Industrial Engineering and Engineering Management, Singapore (2007), 1129–1133.
[28] V. Gómez and A. Maravall, Automatic modelling methods for univariate series, Banco de España Working Paper no. 9808 (1998).
[29] World Health Organization, Energy and nutrient requirements, Report of a joint FAO/WHO/UNU Expert Consultation, Rome (2001).
[30] P.S. Shetty, C.J. Henry, A.E. Black and A.M. Prentice, Energy requirements of adults: an update on basal metabolic rates (BMRs) and physical activity levels (PALs), European Journal of Clinical Nutrition 50(1) (1996), S1–S23.
[31] B.L. Bowerman and R.T. O'Connell, Forecasting and Time Series: An Applied Approach, Duxbury Press, Belmont, CA, 1993.
[32] D.A. Dickey and W.A. Fuller, Distribution of the estimators for autoregressive time series with a unit root, Journal of the American Statistical Association 74 (1979), 427–431.
[33] A.C. Harvey, Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press, Cambridge, 1989.
[34] J.Y. Peng and J.A.D. Aston, A MATLAB software implementation for time series analysis by state space methods, Proceedings of the American Statistical Association Business and Economics Section, American Statistical Association (2006).
[35] J. Durbin and S.J. Koopman, Time Series Analysis by State Space Methods, Oxford University Press, New York, 2001.
[36] A. Meyler, G. Kenny and T. Quinn, Forecasting Irish inflation using ARIMA models, Technical Paper, Central Bank and Financial Services Authority of Ireland 3/RT/98 (1998), 1–48.
[37] H. Akaike, A new look at statistical model identification, IEEE Transactions on Automatic Control AC-19 (1974), 716–723.
[38] G. Schwarz, Estimating the dimension of a model, Annals of Statistics 6 (1978), 461–464.
[39] C.M. Hurvich and C.L. Tsai, Regression and time series model selection in small samples, Biometrika 76(2) (1989), 297–307.
[40] Z. Chik, Performance of order selection criteria for short time series, Pakistan Journal of Applied Sciences 2(7) (2002), 783–788.
[41] G. Ljung and G. Box, On a measure of lack of fit in time series models, Biometrika 66 (1978), 67–72.
[42] M. Merrington, Table of percentage points of the t-distribution, Biometrika 32 (1941), 300.
[43] D. Stockton and J. Glassman, An evaluation of the forecast performance of alternative models of inflation, Review of Economics and Statistics 69(1) (1987), 108–117.
[44] A.C. Acock, Working with missing values, Journal of Marriage and Family 67 (2005), 1012–1028.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-731-4-147
147
Hey robot, get out of my way
A survey on a spatial and situational movement concept in HRI

Annika PETERS a,1, Thorsten P. SPEXARD a, Marc HANHEIDE b and Petra WEISS c
a Applied Informatics Group, Bielefeld University, Germany
b School of Computer Science, University of Birmingham, UK
c Department of Linguistics and Literature, Bielefeld University, Germany

Abstract. Mobile robots are already applied in factories and hospitals, mostly to perform one distinct task each. It is envisioned that robots will soon assist in households as well. Such service robots will have to cope with several situations and tasks, and of course with sophisticated human-robot interactions (HRI). A robot therefore not only has to consider social rules with respect to proxemics; it must also detect which (interaction) situation it is in and act accordingly. With respect to spatial HRI, we concentrate on the use of non-verbal communication. This chapter stresses the meaning both of machine movements as signals towards a human and of human body language; considering these aspects will make interaction simpler and smoother. An observational study is presented to acquire a concept of spatial prompting by a robot and by a human. When a person and a robot meet in a narrow hallway in order to pass by, they have to make room for each other. But how can a robot make sure that both really want to pass by rather than start an interaction? This especially concerns narrow, non-artificial surroundings. Which social signals are expected by the user and which, on the other side, can be generated or processed by a robot? The results show what an appropriate passing behaviour is and how to distinguish passage situations from others, shedding light on the readability of signals in spatial HRI.

Keywords. HRI, movement concepts, situational interaction, spatial interaction, non-verbal communication, body movements
Introduction

Mobile robots are already applied in various places and for many purposes. There are mobile robots in human-free spaces, where hardly any interaction between humans and robots occurs, such as in the exploration of areas inaccessible to humans (e.g., in nature, in space, in disaster areas, in industrial set-ups). Other mobile robots are deployed in human environments, where they have to accomplish tasks that do not usually involve any human interaction, e.g., supply tasks in hospitals [23]. These robots have to execute one task and cope with only one situation. Furthermore, there are mobile service robots, which are especially designed for human-robot interaction (HRI). Such mobile robots can be found, for example, in museums [5], and it is envisioned that soon robots
assist in all kinds of households and offices as well. These robots will have to cope not only with different interiors but also with various tasks. Most importantly, humans should be comfortable with the robot. Service robots are applied for our well-being; to achieve this, the interpretation of our behaviour is essential. Imagine that, in the near future, you have bought a service robot for your flat or office. From time to time you will meet your robot while walking through or towards a narrow passage, e.g., one caused by furniture, a hallway, a door frame or a small kitchen. Your service robot might block the way, or might drive towards you, pursuing its own goal, which it might have received from you or from other people in your flat or office. You just want to pass by – maybe you are in a hurry. In order to make room for you, the service robot needs to decide whether you just want to pass by or whether you would like to start a conversation with it. This means the robot has to know which situation it is in and act fast, according to social conventions. Wouldn't it be good if it started moving to one side as you did, when you first saw each other? Neither of you would need to do much more than you had intended to do – only passing by. Now consider that you had met a person instead in your office or flat. Most of the time you would just have moved around each other without the need to speak, even in a narrow passage. People use non-verbal means of communication, especially in the coordination of joint actions [3], and communication contains many such non-verbal features. Following this example, signals or cues must have been sent back and forth between both interaction partners, carrying information as to whether they are in a passing-by situation or not. In this passing-by situation, body language, namely body movements to one side, plays a role in deciding what to do [4].
1. Foundations

The general aim is to enhance situation awareness in service robots. Hence, the idea is to identify characteristics of communication that carry information about the situation. Situation awareness in social robotics is a complex problem, not only because such robots have to cope with several tasks, but also because the interaction partner's behaviour has to be monitored and interpreted. Behaviour interpretation means that the intentions of humans have to be identified from people's actions. In this regard, interaction includes not only receiving signals but also sending them. Furthermore, situation awareness requires coordination between action and reaction, or between encoding and decoding information [2]. The robot has to signal that it recognizes the situation, or at least that it is currently coping with it and reacting accordingly; otherwise the current situation might simply pass over. Considering the example above, the robot should detect that it is in a passing-by situation and, for instance, move sideways before the person comes too close and squeezes herself past, or simply has to wait in front of the robot for it to move away. Moreover, such mixed-up situations make the distinction of situations very challenging and unnatural. As already revealed in the introduction, this chapter centres on one distinct HRI situation, the passing-by setting, and particularly on one feature: movements of the entire body of a human or robot towards the other interaction partner. This is a start towards a better understanding of spatial and situational constraints, especially the joint spatial management and situational behaviours of a robot towards a human and of humans towards
a robot. The long-term aim is the distinction between several situations, preferably by non-verbal features. These aspects are explained more thoroughly in the following subsections.

1.1. Aspects of Body and Machine Movements

Body movements are one non-verbal way for interaction partners to signal each other what they are going to do, sometimes in an implicit or unconscious way [2,4]. Accordingly, spatial interaction with a social robot should not consist of speech alone. Considering proxemics, especially in dynamic situations in which both partners are moving, speech is not the means to express intentions. Proxemics was introduced by E.T. Hall [14]. One part of proxemics discusses how humans manage the distances around themselves, within the focus of non-verbal communication. The communicational aspects of proxemics state that at about 3.6 m (public distance) humans do not tend to speak to each other unless they are addressing an audience [14]. Similar distances have been used in HRI as well [15,18]. Regarding the introductory example, if a robot could interpret non-verbal behaviours and express them, it would give both interaction partners more time to cope with the current situation. Transferred to the passing-by situation, the robot moves to the side and keeps moving before the human has to speak. Hüttenrauch et al. observed that people, even if they know the verbal commands to steer the robot, tend to squeeze around the robot to pass by; they called those behaviours security breaches, where people purposely came very close to the robot during HRI experiments [16]. To maintain a high level of comfort, interaction with a robot should manage spatial situations in a socially appropriate manner. For this, another aspect of interaction is important: the predictability of the partner's action [11]. Humans are good at predicting how and where an observed movement ends, using a range of non-verbal cues such as eye movements and entire body motions [8,24].
Body and machine movements as communicational features are worth focusing on [20]. Frith et al. found that humans can infer intentions from another person's movement [8]. Furthermore, Pacchierotti et al. suggest that robot designs should initially follow human-human interaction patterns, to have a foundation from which to start [26]. Even non-anthropomorphic robots are perceived as human-like and are expected to behave so [Siino cited in [19]]. To achieve human likeness, robots need to consider spatial and situational concepts of humans, which can be achieved by taking non-verbal signals into account. The idea is to consider interactions from the human point of view towards the robot and vice versa, with a focus on body and machine movements with regard to the predictability of the situation. Consequently, both participants of the interaction should be taken into account as sender and receiver of signals when HRI is modelled. With this in mind, this chapter focuses on identifying and interpreting signals which are sent and received in spatial human-robot interaction. To this end, an observational survey was planned and conducted: a typical "passing by" situation was created to observe communicative actions and reactions, focusing on the body and machine movements which are executed in order to pass by successfully. Below, the scenario of the introductory example will be made concrete.

1.2. The Scenario Passing By

Making room for someone is the general concept, which encompasses many different situations, for example, moving aside for someone joining a group, walking in a shop
corridor, in a plane aisle, or in a crowded room, and also the typical passing-by setting in which two people walk facing towards each other. The latter is a basic concept in HRI, which is still challenging in narrow passages like hallways in homes and hospitals. Aside from this, furniture in homes can often cause narrow passages [16]. Although robot movement in domestic environments has already been studied, it is not sufficient for a robot to avoid obstacles and navigate to metric positions in order to really share space with humans in a social and interactive fashion. An example is depicted in Figure 1. Basic obstacle avoidance approaches do not avoid oncoming people in a human-like manner [19]. A robot should not only consider social rules with respect to proxemics (e.g., defined by Hall [14]), orientations in communication (e.g., Kendon's L-formation [17]), or in approaching people. It should also be able to signal and understand certain spatial constraints such as "passing by". Kirby et al. introduced COMPANION, a first approach to consider social conventions for person-acceptable navigation [19]. They incorporated spatial constraints, like passing someone on the right side. Furthermore, Kirby et al. implemented the personal space around a person as a cost function, which changes dynamically due to factors identified in human-human research, like walking speed [9,10]. This promising approach has been tested in simulation and directly in passing-by situations; however, no decision making about the situation takes place. The complexity for a social robot comprises not only handling one particular situation but also distinguishing between several situations. These could be, for example, a "passing by" situation, an "approaching" situation, or a "giving information" situation. When modelling interaction behaviours, both sides of the interaction have to be taken into account. On the one hand, there is the side of the robot and how the movement of the robot is interpreted. On the
Figure 1. This figure depicts the difference between a pure obstacle avoiding approach (left) and a situation aware approach (right). The robot (square R) on the left drives towards a moving human (triangle H). The robot might intimidate the human by coming closer. This approach is slower compared to the approach on the right side, because at some point the robot and the human have to stop in order to figure out how to pass by. The drawing on the right depicts a robot and a human spatially prompting each other. The lines represent the paths of both. These fluid and non-stop movements to the side are considered to be more human-like.
other hand, there is the human side and how the human tries to communicate her intention to the robot. Human-human communication is taken as an inspiring example in social robotics; therefore one should not forget that the human interaction partner might display features of human-human communication as well, which can be used for situation distinction. In contrast, our focus lies one step below the work stated above, namely on how to signal and how to detect signals or social-spatial conventions, which in the next step make it possible to identify the situation and then to act appropriately.

1.3. Spatial prompting in interaction

In the spatial HRI context, Green introduced the term spatial prompting for spatial and communicative signalling [13]. The term spatial prompting summarizes verbal, gestured, auditory, visual or tactile, and full-body prompts. He defined a spatial prompt as a "communicative action that incites someone to spatial action, or a spatial action that incites someone to a communicative action" [12]. With this definition in mind, we refine that a spatial prompt can also be a spatial action, for example a movement, as long as it incites someone to any action. Consequently, a spatial action is also a communicative action, although non-verbal. Accordingly, a spatial action can incite someone to a spatial action. As this is still a broad definition, we focus on one aspect: a spatial action or prompt which is performed non-verbally, intuitively, and through movements of the entire human or robot body. Such spatial prompts occur in various situations, for example, crossing streets together, moving through a crowd of people, making room for each other, and passing by each other [27,28]. In HRI, the concept of spatial prompting is mostly used on the part of the robot to direct or notify the user, verbally and non-verbally. On the part of the robot, Hüttenrauch et al.
suggest that robots should actively and spatially influence humans to position themselves in a better way to make human-robot communication easier [13]. As soon as the position of the user is detected, the robot could plan to position itself with respect to proxemics. So far the interaction is only fine-tuned by the robot's explicit communicative prompting. This can be realised via speech, e.g., "Please move on, you are too close to me" [11,16], which is called communicative spatial prompting. The robot should prompt to improve the interaction and thereby facilitate joint spatial management. There is little research on how exactly humans are spatially influenced (or prompted) by each other in the scenarios described above. There is even less research on how a human and a robot could interact, and are interacting, on a non-verbal level [3]. Hence, the concept of spatial prompting is taken further and considered from another point of view. Spatial prompting from the human side towards a robot is examined more closely to analyse the implicit spatial signals which are sent by the human towards a robot within interaction. Human-human communication or interaction is very complex; various implicit and explicit signals are sent back and forth between interaction partners [3]. Therefore, the social robot should not only prompt a human. Additionally, the goal is to recognise social signals, especially the orientation of the entire human body or the direction of movements. Those social signals help the robot to be aware of situations in a spatial context and to react in a socially appropriate manner, without the human needing to explicitly explain her intentions. Naturally, if there are usable social-spatial signals from the human towards the robot, the interplay between spatial prompting from either side and the reaction towards those signals will be crucial to examine.
"Making room for each other" in a "passing by" scenario is the situational and spatial concept which is examined to find features that help to detect whether the robot is in a passing-by situation or not.

1.4. Overview of Passing By Scenarios in HRI

Most research is primarily concerned with how the human perceives and is comfortable with the actions of the robot. Butler et al. identified and evaluated two parameters which influence the participant's comfort level with the robot: (a) the speed of the robot and (b) the distances of the robot towards a human. The participants encountered a robot which drove towards them and was supposed to avoid the participant. The results of the user study reveal that participants prefer a faster (0.38 m/s) and flowing passing movement of the robot [4]. Participants of the study considered this behaviour more 'human-like'. The authors suggest that the speed of the robot should be between 0.3 m/s and 1 m/s to reach maximal comfort, depending on the user's familiarity with the robot. Pacchierotti et al. evaluated a person passing model on a social robot in a hallway setting [25]. Three parameters were tested in a pilot study with four participants: the lateral distance towards a walking person (a), the distance at which the avoiding movement is initiated (b), and the speed of the robot (c) were varied and tested in different combinations. Results display a tendency that participants were more comfortable with a higher robot speed (0.6 m/s), earlier signalling of avoidance (at 6 m), and a lateral distance of 0.4 m. In a follow-up study, Pacchierotti et al. varied only the lateral distances and found that people do not perceive lateral distances below 0.4 m as a reaction of the robot towards themselves. Furthermore, preliminary results suggest that social conventions of direction of travel play a role when passing a robot [26] (keeping to the left or right side).
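The comfort-related parameters reported above (Butler et al. [4]; Pacchierotti et al. [25,26]) can be bundled into one structure for a passing behaviour. This is our own illustrative sketch, not an implementation from the cited studies; the class name and the plausibility check are assumptions:

```python
# Illustrative only: the passing-comfort parameters reported in the
# cited studies, collected into one structure. The comfort check below
# is our own rough reading of those findings, not a validated model.
from dataclasses import dataclass

@dataclass
class PassingParameters:
    speed: float            # robot speed while passing, m/s
    signal_distance: float  # distance at which avoidance is initiated, m
    lateral_distance: float # lateral clearance to the person, m

def likely_comfortable(p: PassingParameters) -> bool:
    """Plausibility check against the reported findings: speed within
    0.3-1.0 m/s, early signalling (around 6 m), and a lateral distance
    of at least 0.4 m (smaller offsets were not even perceived as a
    reaction of the robot)."""
    return (0.3 <= p.speed <= 1.0
            and p.signal_distance >= 6.0
            and p.lateral_distance >= 0.4)

print(likely_comfortable(PassingParameters(0.6, 6.0, 0.4)))  # True
print(likely_comfortable(PassingParameters(0.6, 6.0, 0.3)))  # False
```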
The above-mentioned parameters of passing strategies are considered to make the robot's overall behaviour more human-like. In this context, the human-robot survey in a hallway setting will be outlined after the robot platform BIRON is presented and a software component necessary for this research is briefly introduced.
2. Person Representation

As a first step towards situation-aware spatial prompting, a robust tracking system for humans has to be established. Besides pure position estimation, which is often a tough task on its own, the actions of an interaction partner have to be provided to a situation-aware system to enable more natural robot behaviour. By which means should a robot decide whether to avoid an approaching person or to make contact, when only position information is given? For this reason a person tracking approach based on the idea of Anchoring, first suggested by Coradeschi and Saffiotti [6], is used, see Figure 2. The idea of Anchoring consists of connecting real-world sensor readings (percepts) to a software model, the so-called symbol. The symbol contains the measurable values of the percepts according to the software representation, e.g., variables containing real-world object coordinates. For continuous tracking, each percept is rated as belonging either to an existing symbol or to a newly created one. This rating depends on both the temporal and the spatial distance between two successive sensor readings. Originally, this approach considered only one kind of sensory input to track one single object.

Figure 2. Person Anchoring: For each modality, e.g., a face, a separate anchoring process associates position information to the symbol. Tracking is achieved by applying a movement model, which evaluates position changes of two subsequent percepts according to the time elapsed between them. Besides feasible movements, a composition model limits the association of the different kinds of symbols to one particular person by maximum distances between, e.g., a face position and a pair of legs. By this Person Model the separate modalities are fused to a Multi-modal Anchor for persons. Thus, a person is tracked as long as one Component Anchor exists.

Lang et al. [21] extended the uni-modal single-object anchoring to multi-modal multi-person tracking, considering several types of percepts and symbols as well as several persons in the robot's vicinity. The whole multi-modal anchoring process is depicted in Figure 2. Comparable to the uni-modal anchoring, each modality is anchored separately. To successfully combine the separate person components, a so-called person model is applied. The person model respects the number of associable components and their maximum real-world distances, summarized as composition. According to the person model, a person possesses, e.g., only one pair of legs, and the distance between legs and head is limited by the human physique. In addition to the composition, the movement of each component is considered with regard to the time elapsed between two percepts. The movement model is applied to each component to estimate whether a position change belongs to the same real-world object or to the appearance of a new one. Finally, the components are fused to the person representation, completing the multi-modal anchoring process for persons. For successful tracking of persons, results of face recognition, pair-of-leg detection, and sound direction estimation are applied. The face recognition is based on the approach of Viola & Jones [31]. Due to its efficiency, multiple classifiers trained with rotated faces can be applied in real-time. This allows the classification of the horizontal gazing direction in steps of 20 degrees. The pair-of-leg detection uses distance measurements, e.g., from a laser range finder, to identify connected segments with a real-world size from 5 cm up to 25 cm as single "legs". Subsequently, pairs of legs are built if the distance
between the legs is less than or equal to 50 cm and the legs of one pair have a similar distance to the sensor. For sound detection, a pair of microphones is used to record the audio channels separately from two distinct positions. Using a Cross-Power-Spectrum Phase analysis of the two audio signals, the temporal delay of peaks in the signals is calculated, and thus the direction of the peak-generating sound source is estimated. Combining these three modalities considerably increases the robustness of the person tracking approach. In uni-modal anchoring, a symbol is removed if for a given amount of time no percept can be assigned to it. The multi-modal person tracking keeps the person symbol as long as at least one subsymbol (face, legs, sound) is anchored. In parallel to the uni-modal anchoring, each modality is subjected to an individual anchoring process. To fuse the separate subsymbols into one person symbol, a so-called person model is applied, which considers the composition of a person by the number of subsymbols assignable to a person and the maximal distances of the subsymbols to each other; e.g., a person possesses one pair of legs, and the distance between legs and head is limited by the human physique. Additionally, the movement is rated based on the time span between two subsequent percepts. The movement model compares the time span with the position changes and estimates whether the same person just moved around or a new person has appeared. From the different types of information given by the several modalities, assumptions about the person's current action are drawn to enable the robot to interact in a situation-aware way. Currently, information about the movement behaviour, the speech, and the gaze direction is evaluated. This information is used to identify the most promising interaction partner if multiple persons are in the vicinity of the demonstrator.
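The pair-of-leg detection described above can be sketched in a few lines. This is our own minimal illustration, not the BIRON implementation: laser segments of 5-25 cm width count as leg candidates, and two legs form a pair if they are at most 50 cm apart and lie at a similar range from the sensor. Segment extraction from raw scans and the exact "similar range" tolerance are assumptions:

```python
# Sketch (assumptions noted above) of pair-of-leg detection from
# pre-segmented laser data in the sensor frame.
from dataclasses import dataclass
from itertools import combinations
import math

@dataclass
class Segment:
    x: float      # segment centre, metres, sensor frame
    y: float
    width: float  # real-world extent of the segment, metres

def is_leg(seg: Segment) -> bool:
    """Segments between 5 cm and 25 cm wide count as single legs."""
    return 0.05 <= seg.width <= 0.25

def leg_pairs(segments, max_gap=0.50, range_tol=0.30):
    """Return pairs of leg candidates that could belong to one person:
    at most 50 cm apart and at a similar distance from the sensor
    (range_tol is an assumed tolerance)."""
    legs = [s for s in segments if is_leg(s)]
    pairs = []
    for a, b in combinations(legs, 2):
        gap = math.hypot(a.x - b.x, a.y - b.y)
        range_a = math.hypot(a.x, a.y)
        range_b = math.hypot(b.x, b.y)
        if gap <= max_gap and abs(range_a - range_b) <= range_tol:
            pairs.append((a, b))
    return pairs

scan = [Segment(1.0, 0.1, 0.10),   # leg candidate
        Segment(1.0, -0.2, 0.12),  # leg candidate, 30 cm from the first
        Segment(2.5, 0.0, 0.60)]   # too wide: furniture, not a leg
print(len(leg_pairs(scan)))  # 1
```

In the full system, each resulting pair would become a leg percept feeding the anchoring process described above.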
A person is rated most interesting to the robot if she stands still, looks at the robot, and talks. Quiet persons or persons not looking at the robot are of less interest, and persons wandering around without paying attention to the robot by looking or talking are of no interest to the system. The observed person behaviour is not only used for person attention control, however. It is also provided to a central memory, enabling both flexible access by other software modules and the monitoring of developments over time [30]. The flexibility and monitoring are, in a first step, achieved by an infrastructure based upon active-database information exchange. Data from the participating software modules is collected in a central storage, also called the memory. Since this memory is realized upon an active database, database events may be executed, depending on the data input, to send data from the central storage to a software module. Apart from the unified data representation (XML is used for this demonstrator), the exchanged data content ranges from robot motor commands, over descriptions of the surroundings, up to interpretations of the interaction partner's behaviour. Still, the question remains open how and when the centrally gathered data is evaluated and forwarded to its destination. Therefore the Active Control Memory Interface (ACMI) was developed and set on top of the memory. The general operation method of ACMI is to change the memory content in such a way that modules observing the memory react properly to generate the desired system behaviour, without a direct connection to ACMI (see Fig. 3). ACMI operates by evaluating the current memory content, and thus the current system state, by a set of rules. It can be compared to rule interpreters of production systems similar to ACT-R [1], but without the need for general control connections to other system components. Although this kind of direct control is still supported for security reasons, e.g.
to prevent collisions with humans if a higher-level behaviour fails, the idea of coordination instead of control, as well as a continuously evolving system [29], is followed.

Figure 3. Coordinating information exchange by ACMI: While the integrated HRI modules provide and register for information collected in the memory, ACMI observes the information and manipulates the memory content according to its configuration to achieve the desired system behaviour. Since the configuration is a common memory entry, it can easily be modified during runtime by simple database access, changing the operation of ACMI immediately. For security reasons, ACMI may be directly connected to HRI modules to reconfigure them, receiving module feedback directly without relying on the database transaction schedule.

For the rule interpretation and application, ACMI reuses the same database framework as the remaining modules. Each rule is associated with a database trigger and a specified triggered action, which is performed if the rule applies. A rule is represented in XML, like the information the rules apply to. This allows a kind of reuse of the framework: instead of separating the rules which implement the system behaviour from the factual knowledge, they are stored in the memory as an XML document, forming an important part of the system information. As previously mentioned, this makes manipulating rules as easy as manipulating any other kind of database content, even during runtime. Taking into account that ACMI modifies database content, the described approach enables modification of the system during runtime not only by a human user, but also by the system itself. The approach provides a powerful basis for highly adaptive systems that modify a given basic set of rules into an improved one according to the current operation conditions. This information coordination, in combination with the flexible rule base, can be used to decide situationally whether the robot should circumnavigate a person who wants to pass by, or approach a person because she/he wants to start an interaction. The exact rule heuristics will be modelled based on the outcomes of the study with the demonstrator.

3. Platform BIRON
Figure 4. The robot BIRON as it is used for the conducted interaction studies. Based on the GuiaBot, the following input and output devices, shown on the right-hand side, have been added: pan-/tilt camera, interfacial microphones, arm with gripper, and laser range finder (from top to bottom).
The demonstrator on which the person representation and monitoring is already implemented is BIRON (the BIelefeld Robot companiON), see Figure 4. It serves as a
platform for the user study described in the following section. The demonstrator is based on the research platform GuiaBotTM by MobileRobots2. It is customised and equipped with sensors that allow analysis of the current situation in a human-robot interaction. The current BIRON is a consistent advancement of a platform which has been under continuous development for seven years. It comprises two piggyback laptops (Intel Core2Duo, 2 GB main memory, Linux) that provide the computational power. Thus, a system for HRI is achieved which runs autonomously and in real-time. The robot base is a two-wheeled PatrolBotTM with two passive rear casters for balance. Its maximum motion speed is 1.7 m/s translation and 300 degrees per second rotation. Inside the base front there is a 180-degree laser range finder (SICK LMS200). With a scanning height of 30 cm above the floor, the laser ranges are used both for navigation tasks and to detect pairs of legs for the previously mentioned person representation. The camera, for face detection and gaze estimation, is a Sony 12x zoom pan-/tilt camera, able to scan the area in front of the robot. For localisation of the sound direction in the person tracking approach, two interfacial microphones are mounted on top of the robot's body. With its upper part, BIRON has an overall size of approximately 0.6 m × 0.5 m × 1.3 m. The body houses a touch screen as well as the system speaker used for speech output in user interaction.
4. The Observational Study

To investigate both the human side and the robot side of prompting during interaction, an observational study was conducted to provide data under the focus of spatial management (proxemics and prompting) and to analyse it with a focus on spatial interaction. The study took place in a corridor of Bielefeld University. Mainly students and employees of the university use this hallway to reach other floors, lecture halls, and offices. Signs on the floor and walls of the hallway informed them that a robot was working and moving in this area, and that the area was temporarily under video surveillance. Participants took part in the study unaware of the details. Their task was to enter the building and to walk along the hallway to reach another floor. Shortly after they took part in the human-robot passing situation in the hallway, the participants were informed that the actual experiment was already over. Of course, they were asked whether the data could be used for the study and whether they would fill in a questionnaire. One external camera recorded the interaction in the hallway. Moreover, an experimenter was always present, observing but hidden, and able to stop the experiment whenever she thought it necessary. This setting was chosen to achieve a realistic situation, in which people pursue their own goals and are not in a lab environment. The line of thought approached within this study is that humans prompt spatially; therefore the robot has to react adequately. This raises the questions of what reactions are appropriate and how people communicate their wish to pass by. Consequently, eight behaviour patterns or strategies were activated on BIRON in two slightly different experiment settings, a static and a dynamic setting, see Figure 5. The robot either stands (static) already in a narrow doorframe in the corridor or drives

2 www.mobilerobots.com
Figure 5. Each experiment setting depicts four possible passing strategies to make room for each other. The dashed lines with arrows represent possible directions of travel of the robot (square R). The numbered circles stand for the future positions according to the different strategies BIRON exhibits. An approaching human (triangle H) is walking towards the narrow doorframe and the robot, and wants to pass by. Passing strategies against the direction of travel will not be considered in this study (traffic rules: right side in Germany, see Section 1.2 and [26]). The difference between the static setting (left) and the dynamic setting (right) is that BIRON is either standing in or moving towards the doorframe in the corridor. The boxes on top help to identify the passing strategies beneath, in which the robot prompts either less or more, i.e., it is driving either to the side or not. The boxes on the left of each hallway distinguish the defensive and offensive strategies which BIRON exhibits.
towards it (dynamic) to create a blocking situation for the approaching human. As soon as an approaching person is detected (at about 8 m), a behaviour is initiated. The eight strategies or patterns can be divided into defensive and offensive behaviours: the robot either does not drive through the doorframe (defensive) or it does drive through the narrow passage (offensive), see Figure 5 and Table 1. Defensive behaviours are backward motions in the static setting, see circles 1, 2 in Figure 5 and the corresponding numbers in Table 1. In the dynamic setting, defensive strategies are short forward movements, see circles 5, 6 in Figure 5 and the corresponding numbers in Table 1. Offensive passing strategies are movements towards the person in both settings. The eight strategies can also be divided into the categories less and more prompting behaviours. In the less prompting situations, BIRON drives straight towards the person or straight away from the person without displaying any signal that it might avoid, see circles 2, 4, 6, 8 in Figure 5 and the corresponding numbers in Table 1. More prompting situations are situations in which BIRON initiates a smooth curve to the right side, either turning around and driving away or driving forward, see circles 1, 3, 5, 7 in Figure 5 and the corresponding numbers in Table 1. For an overview of the general setting, see the filmstrip in Figure 6, which displays the experiment setting for pattern 7 (circle 7). With the variety of tested behaviours, we approach the questions whether the participants perceived the spatial prompting of BIRON as its intention and what they would like BIRON to do in an avoiding situation.

static social signals     more prompting             less prompting
defensive                 1 = backwards, sideways    2 = backward, straight
offensive                 3 = forward, sideways      4 = forward, straight

dynamic social signals    more prompting             less prompting
defensive                 5 = forward, sideways      6 = forward, straight
offensive                 7 = forward, sideways      8 = forward, straight

Table 1. These tables display the possible behaviours; BIRON exhibits one per experiment. The numbers correspond to the circles in Figure 5. The entries forward, backward, straight, and sideways describe the movement of the robot. More prompting means that the robot moves to the right in order to avoid an approaching person; these movements can be forward or backward. Less prompting stands for straight motions with no hint of how the passing situation might be resolved. Offensive motions are forward motions past the narrow space, see Figure 5 (frame of the door). Defensive motions are mostly backward motions and motions without passing the narrowness.

Questionnaire

After the encounter with BIRON, participants were asked to fill in a questionnaire. The questionnaire is designed to find out whether people related the displayed behaviour of the robot to their presence. We would like to know whether the participants can guess the intention of BIRON, and whether the behaviours are good enough to work without the help of other modalities. Furthermore, the questionnaire aims to find out how the participants liked, or would have expected, BIRON to react. In addition, it is interesting to know what the participants are willing to do themselves.
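The eight experimental conditions of Table 1 can be encoded as data, which makes the 2x2x2 design (setting x stance x prompting) explicit. This is an illustrative representation of the study design, not the actual BIRON control code:

```python
# The eight behaviour patterns of Table 1, as (setting, stance,
# prompting, movement) tuples keyed by pattern number. Illustration only.
STRATEGIES = {
    1: ("static",  "defensive", "more", "backwards, sideways"),
    2: ("static",  "defensive", "less", "backward, straight"),
    3: ("static",  "offensive", "more", "forward, sideways"),
    4: ("static",  "offensive", "less", "forward, straight"),
    5: ("dynamic", "defensive", "more", "forward, sideways"),
    6: ("dynamic", "defensive", "less", "forward, straight"),
    7: ("dynamic", "offensive", "more", "forward, sideways"),
    8: ("dynamic", "offensive", "less", "forward, straight"),
}

def select_strategy(setting: str, stance: str, prompting: str):
    """Look up the pattern number for a given experimental condition."""
    for pattern, (st, sta, pr, _move) in STRATEGIES.items():
        if (st, sta, pr) == (setting, stance, prompting):
            return pattern
    return None

print(select_strategy("dynamic", "offensive", "more"))  # 7
```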
5. Results

59 people took part in the study; 56 participants (19 female, 37 male) filled in the questionnaire. The median age was 27.5 years and the arithmetic mean 30.4 years. Most participants have a computer science background (70.4%), 17.9% have another scientific background, and 10.7% have no scientific background. 82.2% of the participants had seen BIRON before, but only 18% had already interacted with it. Results of the questionnaire and further observations made while conducting the experiment are presented in the following subsections.
Figure 6. The filmstrip displays a model experiment setting for pattern 7 in six pictures, and also provides an overview of the general setup of the study. The white arrows point in the direction in which BIRON prompts spatially, signalling the passer-by which way it is going to drive. The filmstrip shows one run of the experiment: the person approaches, passes by the robot, and proceeds.
5.1. Questionnaire

Participants were asked in an open question what they thought BIRON's intention was when they saw it in the hallway. Interestingly, there is a significant (χ2 = 35.28, p < 0.01, df = 2) difference between the two situations: the static patterns with more prompting and the dynamic patterns with more prompting. 64.7% of the participants in patterns 1 and 3 thought BIRON was avoiding them, whereas only 25% thought so in the dynamic situation (patterns 5, 7). In patterns 5 and 7 BIRON was already moving when the participant saw it, see the dynamic setting in Figure 5. Therefore, we assume that the participants did not interpret the prompting behaviour of BIRON (patterns 5, 7) as a reaction to their own presence (prompting behaviour = the robot is moving to the side). There is no difference between static and dynamic situations in the less prompting behaviours of BIRON (less prompting = no explicit avoidance of persons). In patterns 2, 4, 6, 8, 26.3% think that BIRON is avoiding them explicitly and 52.6% think that BIRON was not disturbed by their presence. The rest thought that BIRON blocked their way on purpose. Moreover, the participants had to choose what they would have liked BIRON to do when the robot was already driving towards a narrow passage and when BIRON was already blocking the narrow passage. In the dynamic situation, participants would like the robot to drive forward and sideways as early as possible in 47.3% of the cases, whereas participants chose this option in only 9.1% of the cases in the static setting. Here, driving backwards
seems to be the preferred way for BIRON to react (54.5%). A robot which turns and drives away might not be as intimidating as a robot which starts its engine to drive forward, even if it drives sideways. In terms of predictability, we suggest that a turning-and-driving-away behaviour is easier to interpret as an avoiding behaviour. Additionally, the robot has more time to move further to the side until the person reaches it. In contrast, when the robot is already driving, backward motions only take second place in the ratings (40%). In both situations, straight movements of BIRON were less preferred. Interestingly, 29.1% preferred BIRON not to move at all when they had to imagine the robot standing in a narrow place. An overview of the results can be found in Table 2. Naturally, this raises the question whether the participants would have preferred to squeeze around the robot in order to pass by. A driving robot might be more intimidating than a standing robot, especially if its intention is not yet clear [4]. This question will be considered in a further study.

            backwards,   forward,    straight
            sideways     sideways    movements   stop
static      54.5%        9.1%        7.3%        29.1%
dynamic     40%          47.3%       1.8%        10.9%
Table 2. Results of the questions in which the participants had to choose how they liked BIRON to behave. Two initial situations were considered when they answered the questions: they had to imagine that the robot is already standing in a narrow space (static) and that BIRON is driving towards a narrow passage (dynamic). The possibilities how BIRON should move are given in the top row. Remarkably low rated are the forward movements to the side in the static setting.
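The χ2 comparison reported above is a standard contingency-table test of independence. The following minimal sketch computes the Pearson statistic; the counts are hypothetical, for illustration only, and are not the study's data:

```python
def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for an
    r x c contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # expected count under the independence hypothesis
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical counts: rows = static/dynamic, columns = three answer categories.
stat, df = chi_square([[22, 8, 4], [8, 17, 7]])
print(round(stat, 2), df)  # → 10.54 2
```

With df = 2, as in the comparison above, the statistic would be judged against the χ2 distribution with two degrees of freedom.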
Furthermore, we asked the participants how BIRON should signal in general that it is making room. Table 3 gives an overview of the chosen answers. Movements to the side, either backwards or forward, are each above 20%. Interestingly, screen signs indicating the direction of travel were chosen in 21.74% of the cases.

Answer               Percentage
forward, sideways    27.17%
backward, sideways   21.74%
screen signs         21.74%
stop motion          9.78%
camera motion        9.78%
straight forward     5.43%
straight backward    4.34%

Table 3. Participants had to choose how BIRON should signal in general that it wants to avoid. The first column contains the actions which were possible to choose.
In addition to this question, the questionnaire allowed participants to write down their own ideas about how they would like BIRON to signal that it is going to pass by. Verbal output and other kinds of audio signals were each suggested in 20% of the cases. Remarkably, in 60% of the cases participants suggested putting indicators on the robot to signal the direction of travel. An indicator and the screen-sign solution (Table 3) are simple but very effective methods to tell a person what the robot is going to do - even if the robot is not moving. Those signs
can be seen from further away and enhance the clarity of the interaction. The results conform to the idea of Matsumaru et al., who applied an omni-directional display to indicate the robot's speed and direction of travel in a robot-following scenario [22]. In contrast to the questions above, the next question treats the human way of prompting. The participants were asked to suggest how they would signal to the robot that they want to pass. All answers can be found in Table 4. Walking sideways was suggested most often (31.8%). Moreover, participants suggested that they could use hand gestures (24.2%). Unfortunately, we do not know whether they meant to show where they want to go or to point where the robot should drive. Verbal communication was recommended in 18.2% of the cases. However, participants also rated their own behaviour against what they had suggested (mean = 3.13 on a Likert scale from 1 (did not do what they suggested) to 5 (did what they suggested)). Approximately half of the participants did what they suggested and the other half did not. In a first analysis of the video data, it is striking that nobody used speech or any kind of hand sign to tell the robot what they were going to do. Hand signs and speech might be a possibility to interact in those situations, but they are clearly not the intuitive way for a human to prompt a robot. This conforms to Hall's personal spaces, mentioned earlier in this chapter [14]: people do not tend to address someone via speech from further away than 3.6 m.

Answer                    Percentage
walking sideways          31.8%
hand gesture              24.2%
speech                    18.2%
progressing               6.1%
orienting the body away   6.1%
waiting                   4.6%
normal behaviour          3.0%
touching                  1.5%
going away                1.5%
walking slowly            1.5%
eye contact               1.5%

Table 4. In this question the participants were asked to think of ways to show the robot their intention in this particular situation. The first column contains the actions the participants would prefer.
To sum up, participants would like BIRON to have features that signal what it is going to do. They would also like BIRON to drive sideways, either backwards or forward, depending on the situation. Differences were found between an already moving robot and a standing robot, even with the same narrowness and avoidance route. The results suggest that a standing robot should not move at all or should move backwards, but not forward, even if it moves sideways. This conforms to the safety breaches reported by Hüttenrauch et al. [16]. Generally, people had no anxiety about coming close to the robot for a certain amount of time, and especially when the robot was not moving, people squeezed around it. Furthermore, verbal interaction was suggested by the participants, but movements and other non-verbal signals were preferred in the passing-by scenario, even though they are less human-like approaches. On the human side of prompting, participants can imagine several ways to communicate their intention. Not all participants acted as they had suggested during the experiment.
5.2. Observations

The observational study with its eight different behaviours displayed by BIRON provided some insight into how important the timing between the robot's manoeuvres and the human's walking velocity (on average 1 m/s) is. Assuming that a person is detected at an average distance of 6 m, the robot has 6 seconds to decide which situation it is in and which strategy to apply. This conforms to the wish for the robot to turn around and drive away instead of starting to drive towards the person. Moreover, Ducourant et al. explain that humans not only need to plan their own actions, they also have to consider the actions of others when doing so [7]. Timing is important in planning one's own action, especially regarding the amount of space the interaction partners have to manage [7]. A challenge we encountered during the study is that systems that are going to operate in narrow spaces and open spaces need a suitable obstacle-avoidance strategy for each. On the one hand, a robot should stay as far away as possible from any kind of obstacle. On the other hand, it should drive as close as possible to a wall, and along it, when there is a narrow passage and a human is about to pass by.
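The 6-second budget above is simple arithmetic, but it is worth making the dependence on closing speed explicit. A small sketch (the closing-speed parameter is our illustrative extension, not something measured in the study):

```python
def decision_budget(detection_distance_m, person_speed_mps,
                    robot_closing_speed_mps=0.0):
    """Seconds the robot has to classify the situation and commit to a
    strategy before the person reaches it. A robot that drives toward the
    person increases the closing speed and shrinks its own budget."""
    return detection_distance_m / (person_speed_mps + robot_closing_speed_mps)

print(decision_budget(6.0, 1.0))        # stationary robot: 6.0 s
print(decision_budget(6.0, 1.0, 0.5))   # robot driving toward the person: 4.0 s
```

This is consistent with the participants' preference: turning and driving away keeps the closing speed low and preserves decision time, while driving toward the person reduces it.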
6. Conclusion and future developments

Communication is much more than just speech [2]. In interaction there are situations in which speech might be the first choice of signal to send, but there are basic situations in which people use non-verbal cues intuitively. Especially in HRI, it is essential to know about behaviours which differ depending on the situation. This chapter centres on the passing-by situation, in which a robot and a human meet in a narrow passage. To get an overview of this situation, of how a robot should signal, and of how a human might prompt a robot in a spatial way, an observational and broad study was conducted. The results of the questionnaire hint that movements to the side, either backward or forward, are perceived as the robot's intention to pass by rather than to interact with the human. Furthermore, other signals to prompt the human are suggested: indicators and signs on the monitor can be imagined to be useful. Verbal communication seems to play a minor role in this scenario. Timing of the joint action is especially crucial when only a small amount of space and time is available. A robot which is not in motion should not start to move towards an approaching person, even if it drives to one side. Backward motions to the side and no motion are preferred over forward motions. We assume that this makes the timing of the passing-by interaction easier to plan and more predictable. Participants are willing to walk more to the side (= prompt) to signal that they want to pass the robot, and did so during the passing-by interaction. A subsequent video analysis will focus especially on the human side of signalling (prompting) and the timing of prompts towards each other, as timing was identified as an important factor in interaction. Further studies will focus on the movements towards the robot in a fully furnished rented apartment environment.
7. Acknowledgements

This work was supported by the Cognitive Interaction Technology Center of Excellence (www.cit-ec.de), the Collaborative Research Centre SFB 673 "Alignment in Communication" (www.sfb673.org), and the Deutsche Servicerobotik Initiative (www.service-robotik-initiative.de).
References

[1] John R. Anderson. Rules of the Mind. Lawrence Erlbaum Associates, Inc., 1993.
[2] M. Argyle. Bodily Communication. Methuen, London, 1988.
[3] Cynthia Breazeal, Cory D. Kidd, Andrea L. Thomaz, Guy Hoffman, and Matt Berlin. Effects of nonverbal communication on efficiency and robustness in human-robot teamwork. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 383–388, 2005.
[4] J. T. Butler and A. Agah. Psychological effects of behavior patterns of a mobile personal robot. Autonomous Robots, 10(2):185–202, March 2001.
[5] A. Clodic, S. Fleury, R. Alami, R. Chatila, G. Bailly, L. Brethes, M. Cottret, P. Danes, X. Dollat, F. Elisei, I. Ferrane, M. Herrb, G. Infantes, C. Lemaire, F. Lerasle, J. Manhes, P. Marcoul, P. Menezes, and V. Montreuil. Rackham: An interactive robot-guide. In Proc. of the 15th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pages 502–509, Hatfield, September 2006.
[6] S. Coradeschi and A. Saffiotti. Perceptual anchoring of symbols for action. In Proc. of the 17th IJCAI Conference, pages 407–412, Seattle, Washington, 2001.
[7] T. Ducourant, S. Vieilledent, Y. Kerlirzin, and A. Berthoz. Timing and distance characteristics of interpersonal coordination during locomotion. Neuroscience Letters, 389(1):6–11, November 2005.
[8] C. Frith and U. Frith. How we predict what other people are going to do. Brain Research, 1079(1):36–46, March 2006.
[9] M. Gerin-Lajoie, C. Richards, and B. McFadyen. The negotiation of stationary and moving obstructions during walking: anticipatory locomotor adaptations and preservation of personal space. Motor Control, 9(3):242–269, July 2005.
[10] M. Gerin-Lajoie, C. Richards, and B. McFadyen. The circumvention of obstacles during walking in different environmental contexts: A comparison between older and younger adults. Gait & Posture, 24(3):364–369, November 2006.
[11] Rachel Gockley, Jodi Forlizzi, and Reid Simmons. Natural person-following behavior for social robots. In HRI '07: Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, pages 17–24, New York, NY, USA, 2007. ACM.
[12] A. Green. Designing and Evaluating Human-Robot Communication: Informing Design through Analysis of User Interaction. PhD thesis, School of Computer Science and Communication (CSC), Royal Institute of Technology (KTH), Stockholm, Sweden, 2009. ISBN 978-91-7415-224-1, http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-9917.
[13] A. Green and H. Hüttenrauch. Making a case for spatial prompting in human-robot communication. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006), HCI-24, May 2006.
[14] E. T. Hall. Proxemics. Current Anthropology, 9(2–3):83+, January 1968.
[15] H. Hüttenrauch, K. S. Eklundh, A. Green, and E. Topp. Investigating spatial relationships in human-robot interaction. In International Conference on Intelligent Robots and Systems, pages 5052–5059, October 2006.
[16] H. Hüttenrauch, E. A. Topp, and K. S. Eklundh. The art of gate-crashing: Bringing HRI into users' homes. Interaction Studies, 10(3):274–297, December 2009.
[17] A. Kendon. Conducting Interaction: Patterns of Behavior in Focused Encounters (Studies in Interactional Sociolinguistics). Cambridge University Press, November 1990.
[18] R. Kirby. Social Robot Navigation. PhD thesis, The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, 2010. CMU-RI-TR-10-13.
[19] R. Kirby, R. Simmons, and J. Forlizzi. Companion: A constraint-optimizing method for person-acceptable navigation. In The 18th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2009), September 2009.
[20] S. Koo and D. Kwon. Recognizing human intentional actions from the relative movements between human and robot. In The 18th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2009), Toyama, Japan, September–October 2009.
[21] S. Lang, M. Kleinehagenbrock, S. Hohenner, J. Fritsch, G. A. Fink, and G. Sagerer. Providing the basis for human-robot-interaction: A multi-modal attention system for a mobile robot. In Proc. Int. Conf. on Multimodal Interfaces, pages 28–35, Vancouver, Canada, November 2003. ACM.
[22] Takafumi Matsumaru, Kazuya Iwase, Kyouhei Akiyama, Takashi Kusada, and Tomotaka Ito. Mobile robot with eyeball expression as the preliminary-announcement and display of the robot's following motion. Autonomous Robots, 18(2):231–246, March 2005.
[23] Philippe Michelon, Marcelo D. Cruz, and Viviane Gascon. Using the tabu search method for the distribution of supplies in a hospital. Annals of Operations Research, 50(1):427–435, December 1994.
[24] Lauri Nummenmaa, Jukka Hyönä, and Jari K. Hietanen. I'll walk this way: Eyes reveal the direction of locomotion and make passersby look and go the other way. Psychological Science, 20(12):1454–1458, December 2009.
[25] E. Pacchierotti, H. I. Christensen, and P. Jensfelt. Human-robot embodied interaction in hallway settings: A pilot user study. In IEEE International Workshop on Robot and Human Interactive Communication (RO-MAN 2005), pages 164–171, 2005.
[26] E. Pacchierotti, H. I. Christensen, and P. Jensfelt. Evaluation of passing distance for social robots. In The 15th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2006), pages 315–320, 2006.
[27] M. L. Patterson, A. Webb, and W. Schwartz. Passing encounters: Patterns of recognition and avoidance in pedestrians. Basic and Applied Social Psychology, 24(1):57–66, 2002.
[28] Annika Peters, Petra Weiss, and Marc Hanheide. Avoid me: A spatial movement concept in human-robot interaction. Cognitive Processing, 10:S178, September 2009. Abstract in doctoral colloquium.
[29] T. P. Spexard and M. Hanheide. System integration supporting evolutionary development and design. In Human Centered Robotic Systems. Springer, 2009.
[30] T. P. Spexard, F. H. K. Siepmann, and G. Sagerer. A memory-based software integration for development in autonomous robotics. Pages 49–53, Baden-Baden, Germany, July 2008.
[31] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. Conference on Computer Vision and Pattern Recognition, pages 511–518, Kauai, Hawaii, 2001.
Behaviour Monitoring and Interpretation – BMI
B. Gottfried and H. Aghajan (Eds.)
IOS Press, 2011
© 2011 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-731-4-166
Towards Adaptive and User-Centric Smart Home Applications

Amir Hossein Khalili, Chen Wu and Hamid Aghajan
AIR (Ambient Intelligence Research) Lab, Stanford University, USA

Abstract. In this chapter we discuss the system requirements and components of an adaptive smart-home service system. To achieve adaptivity in providing services to the user, the system needs to 1) sense the activity and state of the user and 2) customize service to the user's profile. To achieve this, three functional parts are developed and described. In the first part, we present behavior analysis of the user in a home environment based on multi-camera vision processing. In the second part, the concept of user profile is introduced and hierarchical reinforcement learning is employed as a technique to learn the user profile dynamically. The third part of the chapter discusses how to employ the user profile to control services to maximize user comfort and utility. Future work is discussed in conclusion.

Keywords. Reinforcement learning, smart homes, adaptive services, user profile, behavior analysis
1. Introduction

There has been extensive research on smart homes in a variety of applications, including monitoring systems for accident detection and independent living of the elderly [13], intervention for assistive living [39], smart appliances, and automated services such as energy and light control [24,1]. In many applications, the goal is to provide services that maximize the user's comfort while minimizing the user's explicit interaction with the environment as well as the cost of the service. In some services, such as light and air conditioning control, the user's comfort and the cost of energy are jointly considered to find a balanced operating point [17]. This type of tradeoff is relevant as considerations of energy efficiency are increasingly influencing the design of smart buildings. Another type of service may be designed mainly to provide a comfortable ambience for the user. For example, in the Philips Experience Lab [10], LED lights of different colors are installed at several locations in the room, and different themes of ambient lighting can be applied depending on the user's preference. Recently, TVs equipped with an embedded camera have been introduced that switch their display based on the presence or absence of viewers. In most applications, the system provides its services either based on a set of embedded fixed rules which determine the service parameters based on the situation, or based on a query to the user every time a change is to be applied to the service. For example, a simple light control system may be programmed to turn the lights off automatically when the user sits down to watch TV. Without an intelligent method to adapt
the settings to the user's preferences, such fixed decisions may result in a rigid operation which may not be desirable for many users. Alternatively, the system may ask the user to confirm the action every time the same event is detected. When different services are to be offered throughout the day, such queries disrupt the user's normal routines and may render the system obtrusive. In this work, our goal is to create a user-centric methodology for adapting system services based on both real-time context, namely the location and activity of the user, and accumulated knowledge about the user's preferences under different activity contexts. These preferences are learned through the user's explicit or implicit feedback to the system when the user opts to react to the provided service. As a result, the system adapts to provide the most satisfactory service to the user according to the location and activity type. There are therefore two important components in a smart-home system: user behavior analysis and the user profile. Sensing the user's behavior provides the starting point for learning the user's preferences and providing adaptive services. In many smart home systems, readings from different sensors are employed to infer the user's activities [14]. Motion sensors, light sensors and sensors that indicate the user's interactions with objects have been used for this purpose. In a variety of cases such sensory data may not provide sufficient detail for inferring the activity. For example, the user may watch TV, read newspapers or talk with the family in the living room, and the type of lighting or background music he prefers in each situation will likely differ. In our system, a network of cameras is used with the objective of obtaining a rich description of the user's location, pose and activity. This enables the system to learn and correlate the user's preferences with more specific contextual information about the user.
One challenge for a learning algorithm adapting to the user's preferences is that the user may change his behavior over time [8]. An adaptively learned user profile is therefore important, since it is not practical to learn the user's preferences off-line. In this chapter, we apply the framework of reinforcement learning (RL) to learn the user profile and to maximize the satisfaction of the user. RL, as an unsupervised algorithm, does not need predefined models of the environment and learns the model from the user's feedback to the service the system provides. The state of the user (time of day, location, pose/activity) is observed by the system, and the service is chosen automatically by the algorithm based on the user's preferences learned so far. If the user is not satisfied and applies a change to the service using the system's user interface, or simply leaves the area, the event is recorded as explicit or implicit feedback respectively, and is used to update the decision policy. The rest of the chapter is organized as follows. In Sec. 2, user behavior analysis in a multi-camera setting is discussed. Sec. 3 introduces the concept of the user profile and discusses reinforcement learning as a technique to learn the user profile online. Sec. 4 discusses how the user profile can be used to provide adaptive smart home services. Finally, Sec. 6 offers some concluding remarks.
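As a rough illustration of the observe-act-feedback loop just described, the following sketch applies tabular Q-learning to a user profile. The state encoding, action names and parameter values are assumptions made for the example, not the system's actual interface:

```python
import random
from collections import defaultdict

class ProfileLearner:
    """Minimal Q-learning sketch of the user-profile idea described above.

    State = (time_of_day, location, activity); actions = candidate service
    settings. Rewards come from the user's explicit or implicit feedback.
    All names and parameter values are illustrative assumptions."""

    def __init__(self, actions, alpha=0.1, gamma=0.8, epsilon=0.1):
        self.q = defaultdict(float)          # Q[(state, action)] -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        # epsilon-greedy: mostly exploit the learned preference, sometimes explore
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def feedback(self, state, action, reward, next_state):
        # e.g. reward = +1 if the user accepts the service, -1 if the user
        # overrides it (explicit feedback) or leaves the area (implicit feedback)
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td

learner = ProfileLearner(actions=["lights_dim", "lights_bright", "lights_off"])
state = ("evening", "living_room", "watching")
action = learner.choose(state)
learner.feedback(state, action, reward=1.0, next_state=state)
```

In this sketch, acceptance maps to a positive reward while overriding the service or leaving the area would map to negative rewards, so repeated feedback gradually shifts the chosen service toward the user's observed preference.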
2. Behavior Analysis

Analyzing the behavior of the user is the aim of the sensing part of the system. This section discusses the two parts of the behavior analysis function implemented in our system. The first part handles activity recognition, and the second part employs sequential
[Figure 1 shows the hierarchical pipeline: the camera network feeds foreground extraction, face detection and location estimation; spatio-temporal features and a conditional random field (CRF) yield the coarse poses standing, sitting and lying; fine activities include walking, vacuuming, cutting, scrambling, eating, working on computer, watching and reading; location context (kitchen, living room, study area, etc.) constrains the activity classes.]

Figure 1. Hierarchical activity analysis with coarse and fine-level activities.
correlations to create a model for transitions between user activities. These two parts correspond to the two main levels of interpreting low-level sensor data into high-level action models. In [36], Yang provides a survey of approaches to the three levels of inferring logic representations for AI from radio-frequency signals, namely the inference of locations from sensors, activities from locations, and action models from activities. Different types of sensors have been used to infer user activities in a smart environment. Examples include state-change sensors [31] attached to appliances, and RFID tags and readers used with household items [27] to collect object-usage data as an indirect way to infer user activity. Logan et al. [19] study activity recognition with a variety of sensors including RFID sensors, switch sensors, and motion sensors. They argue that in real-world conditions it is difficult to detect fine-grained activities such as "reading" and "eating" with these sensors. They also state that visual sensing provides more information, which is oftentimes complementary to other sensors. Vision-based human activity analysis has seen significant progress in recent years [21], and examples of classifying fine-grained human activity based on video can be found, for example, in [26,2,32]. Temporal duration modeling is an important aspect of activity recognition [33,20]. Hidden semi-Markov models (HSMM) and semi-Markov conditional random fields (SMCRF) are used to explicitly model duration in [33]. In [20], McKeever et al. use evidence theory to incorporate temporal domain knowledge, and show that recognition accuracy improves when time patterns and activity durations are included.

2.1. Hierarchical activity recognition with a camera network

In our work, camera sensors are used for behavior analysis. We use a hierarchical approach to classify user activities with visual analysis in a two-level process.
Different types of activities are often represented by different image features, hence attempting to classify all activities with a single method would be ineffective. In Fig. 1, activities are represented in coarse and fine levels. The coarse activity level includes the classes of standing, sitting and lying, which relate to the pose of the user. The fine activity level consists of activities involving motion such as cutting, eating, reading, etc. We apply such a hierarchical approach because the first-level activities are discriminated based on
Figure 2. Structure of the CRF model. Observation context is included in inferencing the pose in a frame.
Figure 3. Pose-related activity classification. The top row shows example images of standing, sitting and lying. The bottom row shows probability of pose activities across frames.
pose and in a short temporal scope, while the second-level activities are classified based on motion features and are inferred with a larger observation scope. In the first level, activity is coarsely classified into standing, sitting and lying with temporal conditional random fields (CRF). This is achieved by employing a set of features consisting of the height of the user and the aspect ratio of the user's bounding box. The foreground bounding box of the user is extracted with adaptive background subtraction. Position and height of the user are calculated from the bounding box using the calibration parameters of the cameras. The advantage of using a CRF is that neighboring frames can be used as context for activity inference in a frame (Fig. 2). By considering this context, activity inference is more robust to observation noise. The top row of Fig. 3 shows images in which the user is standing, sitting and lying. The detected bounding box is shown as a red square. The output of activity inference on a short sequence is shown in the bottom plot of Fig. 3. The probability of each activity changes through the frames. More details and experimental results can be found in [34]. Based on the result of the coarse-level classification, the activity is further classified at the fine level based on spatiotemporal features [26]. Fig. 4 illustrates this process. In this method, the points of interest are 3D corners of volumes generated by stacked frames. An extended version of the Harris corner detector is used to detect corners in the volume, which correspond to moments when an object changes its movement speed or direction,
[Figure 4 shows the motion-related pipeline: extract space-time interest points, compute descriptors, build a codebook of size N with K-means clustering, form a histogram (feature vector) for each episode of t seconds, and classify into activity classes with an SVM.]

Figure 4. Motion-related activity classification with spatiotemporal features.
Figure 5. Example images of motion-related activities: (a) cutting, (b) scrambling, (c) eating, (d) reading, (e) typing, (f) vacuuming.
or when two objects merge or split. A histogram of gradients describes the features around the points of interest. A codebook of size N is constructed with K-means clustering on a random subset of all the extracted spatiotemporal features of the training dataset. Each feature is assigned to the closest cluster based on the Euclidean distance. The video sequences are segmented into episodes with a duration of t seconds. Bag-of-features (BoF) histograms are collected for every episode; each episode therefore has the histogram of spatiotemporal features as its feature vector. We use discriminative learning with an SVM. The semantic location context constrains the activity types for classification at the fine level (Table 1).

location      activity
kitchen       cutting, scrambling, vacuuming, others
dining area   eating, vacuuming, others
living room   watching, reading, vacuuming, others
study room    typing, reading, vacuuming, others

Table 1. Semantic location context for activity classes.

Note that we have others as an activity category. This is because our sequences are not specifically designed to exclusively include the defined activity types. There are many observations where the activities are in a transition phase or the observed person is simply doing some activity at random which is not within our defined categories. The others category introduces a challenge for our activity recognition algorithm, since samples in this category contain many different motions and hence the feature space for others can be very complex. However, the applications built on top of the activity analysis discussed in this chapter are less sensitive to false positives on others, because the system is designed to perform no operation when the user's activity is not specific. Fig. 5 shows some images with different activities. The described process can be applied to either a single-camera setting or a multi-camera observation system. With multiple cameras, a method of data fusion between the cameras needs to be considered as well. The possible methods for camera fusion can mainly be divided into decision fusion and feature fusion. Two approaches for decision fusion, based on the pose priority and camera prior methods, are discussed in [34]. Feature fusion approaches, including a combined-view and a mixed-view method, are introduced in [35]. The paper also compares the advantages and disadvantages of the feature fusion approaches. Besides performance, another important aspect of a multi-camera algorithm is how sensitive the algorithm is to the camera views and the environment, so that the method can be transferred from one smart environment to another.

2.2. Behavior Analysis

Behavior refers to a sequence of actions or events repeated frequently by the user.
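Before turning to behavior sequences, the per-episode representation from Sec. 2.1 can be made concrete. The toy sketch below shows only the codeword-assignment and bag-of-features histogram step; K-means codebook learning and SVM training are omitted, and the codebook and descriptors are invented for illustration:

```python
import math
from collections import Counter

def nearest(codebook, feature):
    """Index of the closest codeword by Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(codebook[k], feature))

def bof_histogram(codebook, episode_features):
    """Bag-of-features vector for one episode: normalized counts of
    codeword assignments over the episode's spatiotemporal descriptors."""
    counts = Counter(nearest(codebook, f) for f in episode_features)
    n = len(episode_features)
    return [counts[k] / n for k in range(len(codebook))]

codebook = [(0.0, 0.0), (1.0, 1.0)]             # toy 2-word codebook
episode = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9)]  # toy descriptors for one episode
print(bof_histogram(codebook, episode))         # histogram over 2 codewords: [1/3, 2/3]
```

The resulting histogram is the fixed-length feature vector that an SVM would classify into one of the location-constrained activity classes of Table 1.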
Through learning the periodic behavioral patterns of the user, the smart environment can remove the burden of repetitive queries from the user and enhance the service automation process. Furthermore, activities that are not consistent with the user's habits can be considered abnormal events and may be registered as such. There are different methods to model repetitive sequences in order to learn and predict common events. Hidden Markov models (HMM) as a generative statistical model [18,6], dynamic time warping [5], syntactic grammatical techniques [15], and agent-based techniques [7] are some of the major existing methods used for behavior understanding [16]. Since in practice behavior patterns differ from home to home and from user to user, unsupervised behavior learning and recognition methods that require no prior definition have captured the attention of researchers. A problem with modeling behavior is that the length of the actions of interest may be unknown. A behavior pattern may consist of many smaller frequent runs. Rashidi et al. [28] use an iterative method to extract the patterns. They consider the frequency
of events in a sliding time window. Then they increase the length of the window to consider longer candidate patterns. A modified information-theory-based compression measure is used to evaluate the frequency. In another approach, Nater et al. [25] use a hierarchical structure to represent different granularities of actions. In each level, the statistically frequent actions that are temporally adjacent are combined to form a new, more informative action level, called a micro action. In this formulation, micro actions can overlap to support cases in which there are no clear boundaries between actions. Abnormal sequences are recognized at a given level of granularity when the sequence does not match the nodes on that level. Behavior can be represented by a nondeterministic finite-state automaton (NFA) [16], in which for each state of the user and the environment (e.g., an individual watching TV in the living room) there may be several possibilities for the next state. Using learning mechanisms such as reinforcement learning, the probability of going from one state to another is learned through observations of user activities over time.
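A simple frequentist stand-in for the NFA-with-learned-probabilities idea is to estimate transition probabilities by counting observed state successions. In this sketch, the states and the activity log are illustrative; in the system they would come from the vision-based activity analysis:

```python
from collections import Counter, defaultdict

def learn_transitions(activity_log):
    """Estimate P(next_state | state) by counting observed transitions.
    States could be (location, activity) pairs reported by the cameras."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(activity_log, activity_log[1:]):
        counts[cur][nxt] += 1
    # normalize each row of counts into a probability distribution
    return {s: {n: c / sum(nxts.values()) for n, c in nxts.items()}
            for s, nxts in counts.items()}

log = ["watching", "eating", "watching", "reading", "watching", "eating"]
model = learn_transitions(log)
# model["watching"] -> {"eating": 2/3, "reading": 1/3}
```

Low-probability transitions under such a model are exactly the candidates for the abnormal events mentioned above, while high-probability ones can drive proactive service automation.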
3. User Profile

Discovering patterns in the sequences observed by sensors facilitates home automation and makes it possible to customize the automation for each user. This customization is known as creating a user profile. In other words, a user profile is a mapping from the state of the user's lifestyle to his preferences. From a real-time system's point of view, there are requirements on the user profile and its learning process. First, the structure of the user profile needs to be data-driven rather than model-driven, since the user profile is learned gradually and it is difficult to impose a model beforehand. Second, learning of the user profile needs to be unsupervised and adaptive, because 1) it is not practical to have a separate training phase for each user before they use the system, and 2) users may change their habits and preferences over time. Therefore, in our method the user profile is learned through the user's feedback of either accepting or rejecting the service, or adjusting the service parameters.

3.1. Learning the Profile

Different methods have been examined for automatic learning of the user profile. Mozer in 1998 [23] demonstrated an adaptive neural network method to control the light and temperature systems of a home. Doctor et al. [12] use an unsupervised, data-driven fuzzy technique to extract fuzzy membership functions and rules that model the user's profile in the environment. Amigoni et al. [3] examine the use of an adaptive hierarchical plan to control home appliances based on goal-oriented behaviors. A challenge in learning the user's preferences is that the user may change his behavior over time [8]. Therefore it may not be practical to learn the user's preferences off-line; instead, a continuous learning algorithm is needed. A case study on learning and predicting entertainment preferences of children playing physical games is presented in [37], where four preference learning algorithms are compared.
In [4] the authors present the state-of-the-art in machine learning techniques for learning user patterns in intelligent environments, and propose ideas on combining different methods based on the complexity of the problem and the strengths of the different techniques. In [29], user preference is learned by combining context information represented with ontologies and the KNN machine learning algorithm.
We apply the framework of reinforcement learning (RL) to adapt to the user's preferences under different contextual conditions.

3.2. Reinforcement Learning

Reinforcement learning, as an unsupervised algorithm, does not need predefined models of the service parameters; it learns the model from the user's feedback to the service the system provides. In our system the state of the user (time of day, location, and activity) is observed by the system, and the service is chosen automatically by the algorithm based on the user's preference learned up to that time. If the user is not satisfied and applies a change to the service parameters using the system's user interface, or simply leaves the area, the event is recorded as explicit or implicit feedback, respectively, and is used to update the decision policy. Reinforcement learning was used by Mozer and Miller [22] to control a home's lighting system according to the user's movements and preferences. In their work, the penalty is defined according to the actual energy cost and the degree of dissatisfaction of the user. Given the penalty, the decision learner explores different lighting patterns and estimates their utility under the prevalent user movement behavior. While Mozer's system does not assume a prior definition of the environment, Zaidenberg et al. [38] set a starting point for the system according to prior assumptions by the designer. Starting from a predefined setup, the system begins to explore the best control policy to maximize the user's satisfaction. In [9], Crites and Barto designed an intelligent elevator with a near-optimal movement strategy for an office building. They demonstrated that reasonable performance could be achieved even without a complex algorithm. In [28] the authors simulate a smart home environment and use a reinforcement learning method to discover the user's daily behavioral patterns in a hierarchical structure to generate automatic control policies.
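The feedback loop described above can be sketched as a small tabular update: a positive reward for acceptance, a negative one for a manual override (explicit) or for the user leaving the area (implicit). This is a simplified single-step sketch with hypothetical state and action names, not the paper's implementation:

```python
import random

def select_action(q, state, actions, epsilon=0.1):
    """Epsilon-greedy: usually pick the best-scored service setting,
    occasionally explore another one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

def q_update(q, state, action, reward, alpha=0.2):
    """Move the score for (state, action) toward the observed feedback."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward - old)

q = {}
state = ("evening", "living room", "watching TV")
q_update(q, state, "dim", +1.0)      # user accepted the dimmed light (explicit acceptance)
q_update(q, state, "bright", -1.0)   # user overrode the bright setting (explicit rejection)
chosen = select_action(q, state, ["dim", "bright"], epsilon=0.0)
# with exploration disabled, the better-scored "dim" setting is chosen
```

Over many such interactions the table converges toward the user's preferred setting per state.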
Our work differs from the above-mentioned methods in the following aspects. First, hierarchical reinforcement learning (HRL) is used to overcome the slow convergence of regular RL methods. In our formulation, the user's preferences are first learned for general tasks. As more observations become available over time, preferences are gradually learned for more detailed activities. The faster convergence rate offered this way by HRL makes the system practical for real applications. Second, based on the idea developed by Lee et al. [17], a utility function is employed which relates the user's satisfaction to the light intensity. In our work the utility functions for different activities are adapted from their initial design-stage setting as the user provides feedback. The utility function is employed to estimate the user's preference under similar but not yet observed situations, and maps the light intensity to a value. Third, while adapting to the user's preferences, the system also minimizes the consumed energy. The optimal light setting is determined through an optimization formulation that balances the user's satisfaction against energy consumption.

3.2.1. HRL Model for Adaptive Light Control

Most daily activities can be described in a hierarchical structure according to the location context. Cutting, scrambling, and serving food may be grouped as kitchen work, and activities in a study area may include reading and working with a computer. Such a grouping allows for generalizing service decisions based on similarities encapsulated in a group,
and removes the burden of having to test individual circumstances. Using state abstraction and generalization, machine learning methods can explore, examine, and learn the proper decisions in a reduced state space, achieving a shorter convergence time for complex problems. For example, the feedback received from the user about the desired light level in the kitchen may be used to adjust all light level utility functions reserved for different activities in the kitchen. This would be an example of location-based grouping. Alternatively, when the user provides feedback on the light level while he is observed studying, the feedback can be used to update all utility functions that relate to the studying activity, even in other location contexts. This would be an example of activity-based grouping. In a suitable abstraction, states in the original state space which are close to each other are assigned to neighboring high-level abstract states. To shrink the search space and expedite the training time for optimal or suboptimal decisions, the number of abstract states needs to be considerably smaller than the number of states in the real space. This enables more frequent updates for a broader range of similar states. States should be encapsulated in such a way that transitions between abstract states represent a meaningful change in the real world. Actual occurrences of high-level transitions should hence be non-uniformly distributed in the abstract space, resulting in clusters of transitions in that space. Generalization methods (such as tile coding [30]) can be used to expand trained values over similar neighboring states. Using hierarchies is another strategy to propagate common belief across different levels of abstraction. An extended version of RL exploiting the hierarchical nature of the problem was explored in MAXQ hierarchical reinforcement learning (HRL) [11]. MAXQ-HRL uses prior knowledge of hierarchical behaviors for which optimal policies can be learned simultaneously.
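The location- and activity-based grouping described above can be sketched as propagating one piece of feedback, at reduced weight, to all states that share a group with the observed state. The grouping, states, and weighting below are hypothetical illustrations of the idea, not the authors' update rule:

```python
def propagate_feedback(utilities, groups, observed, delta, weight=0.5):
    """Apply feedback `delta` to the observed state at full strength and,
    at reduced strength, to every other state that shares a group
    (location-based or activity-based) with it."""
    utilities[observed] = utilities.get(observed, 0.0) + delta
    related = set()
    for members in groups.values():
        if observed in members:
            related |= set(members)
    related.discard(observed)
    for s in related:
        utilities[s] = utilities.get(s, 0.0) + weight * delta

# Hypothetical grouping: states are (location, activity) pairs
groups = {
    "kitchen":  [("kitchen", "cutting"), ("kitchen", "scrambling")],
    "studying": [("kitchen", "studying"), ("study area", "studying")],
}
u = {}
propagate_feedback(u, groups, ("kitchen", "cutting"), delta=1.0)
# Full update for the observed state, a generalized half-weight update
# for its kitchen neighbour, and no update for unrelated states
```

This is what lets a single kitchen observation inform the utility functions of every kitchen activity.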
In the higher level of MAXQ, the original states are grouped into abstract states, and MAXQ-HRL looks for optimal sub-policies in transitions between abstract states. At the bottom level, using flat RL, the sub-policies are trained for optimum primitive actions at the observed states. In our model, the environment's lighting is optimally controlled based on the activity and location of the user. The user often stays in a location for a task consisting of different activities which requires a specific configuration of lights. Therefore, the location context is used to abstract the state space for adaptation. Valid activities in each location are planned and their transitions in that location are learned over time. Fig. 6 illustrates a simple example of possible routines in different location contexts considered in our experiments. Transitions within each abstract state are modeled through a finite Markov process. We call the finite Markov process of each location a "routine". The lengths of different observations of a routine may not be uniform. In Fig. 7, the black nodes in the lower level are the activities, and the white nodes in the higher level indicate abstract states. The transitions between abstract states can also be described as a Markov process, as shown in Fig. 8. The user moves from one abstract state to another with a transition probability of δ and an observed probability of μ(location{st}, Mi). For nonadjacent states, δ = 0. The probability distribution μ shows how probable an activity is in a given location. For instance, the probability of "studying" in the study room can be higher than its probability in the living room. Transitions between abstract states occur whenever the termination condition Ti is met. In general the termination condition depends on the sequence of observed state-action pairs. In our method, termination of a routine is conditioned on the displacement of the user.
Figure 6. Finite Markov process models for different location contexts (routines). [Figure: routines for the kitchen, dining table, living room, and study room, with activities such as cutting, scrambling, eating, vacuuming, reading, working with the computer, walking, and watching TV.]

Figure 7. Activation of routines and primitive states. [Figure: a routine level above a primitive state level across time and locations; a primitive state at a location causes its routine to be activated.]

Figure 8. High-level finite Markov process. [Figure: abstract states (kitchen, dining table, living room, study room) connected by transitions labeled with δ and μ.]
MAXQ-HRL has a high-level policy that selects a routine Mi ∈ M based on μ(location{st}, Mi) and δ. In other words, routines (sub-policies) are actions for the high-level policy. Each routine Mi has its own policy πi to control the lighting system using actions based on the user's preference. Such policies are called "flat" or "low-level" policies. The policy is trained using MAXQ, which is a hierarchical adaptation method of
reinforcement learning. For this purpose we assign a finite-state machine to each routine. The optimized actions are selected according to the current state of the system st and the activated routine Mi. The actors only receive primitive actions derived by the low-level policies.

Figure 9. The utility function is estimated from the Q-table.
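The two-level control structure can be sketched as follows: the location context selects a routine, and the routine's own (flat) policy picks a primitive light action. This is a hypothetical skeleton of the idea, with illustrative names and an untrained Q-table, not the authors' MAXQ implementation:

```python
class Routine:
    """A sub-policy (flat RL) for one location context."""
    def __init__(self, location, activities):
        self.location = location
        self.activities = activities
        self.q = {}  # (activity, light_level) -> learned score

    def act(self, activity, levels):
        # Greedy primitive action for the current activity; an untrained
        # table scores every level 0.0, so the first level is returned.
        return max(levels, key=lambda l: self.q.get((activity, l), 0.0))

def select_routine(routines, location):
    """High-level policy: the routine is chosen by the location context
    and terminates when the user is observed to move elsewhere."""
    return routines[location]

routines = {
    "kitchen": Routine("kitchen", ["cutting", "scrambling"]),
    "study room": Routine("study room", ["reading", "computer"]),
}
r = select_routine(routines, "kitchen")
level = r.act("cutting", levels=[0, 1, 2, 3, 4])
```

Only the low-level policies emit primitive actions; the high-level policy merely activates and terminates routines, matching the division of labor in the text.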
4. Service Control of the Environment

The goal of providing automatic services to the user is two-fold. First, the service should satisfy the user's needs and maximize his comfort level. Second, the service needs to satisfy certain utility requirements. In this section we first discuss encapsulation of the scores for state-action pairs based on the user's feedback, and then introduce an optimization method which balances user preferences against utility requirements. In reinforcement learning, the Q-table encapsulates scores for state-action pairs. Let us denote by Q(s, a) the score for state s and action a. The larger Q(s, a) is, the more preferred the action a is for state s. Therefore, for a state s, the optimal action to select is a* = argmax{ai} Q(s, ai), if only the user's preference and no other factors are considered for decision making. In some applications the user profile is an important source of information for decision making, but there can be other considerations such as the energy consumption level and smooth changes of lights. The system learns the user's light setting preference under different situations in order to create a user profile. The Q-table is a tabular profile in reinforcement learning which records the inclination to select particular actions in different states. In Fig. 9, each row sj corresponds to a state, and each column li denotes a light setting. The entry at (sj, li) is the score of light setting li at state sj as expressed by the user. Each user in the smart home may have his own Q-table; uploading a table customizes the system services for that particular user. The Q-learning operation adapts the profile to user preferences smoothly by changing the weights and probability of selections.
Although the user’s preferences are learned online and encapsulated in the Q-table during the reinforcement learning process, the Q-table may remain sparse before the system has collected enough information from the user about his preferences for all the possible states. In order to enable the system to estimate the user’s preference for any light setting li at any state sj while user scores are still not learned, we estimate a utility function U (s, l) given the current form of the Q-table. The idea is based on the assumption of continuity in the scores for nearby light settings, so the scores can be estimated along the light setting dimension. In our experiments li denotes quantized light intensity
A.H. Khalili et al. / Towards Adaptive and User-Centric Smart Home Applications
177
GLQLQJWDEOH
FDELQHW
GHVN IULGJH FRPSXWHU GHVN
/LYLQJURRP
6WXG\ DUHD
ZRUNVSDFH PLFURZDYH
FRIIHH WDEOH
ERRNFDVH
.LWFKHQ DUHD
Figure 10. The schematic and two views of AIR lab.
levels. As shown in Fig. 9, for each state sj , U (s, l) is estimated as a Gaussian that fits optimally to the existing Q-table values in the row corresponding to sj . With the utility function U (s, l), we can estimate the user’s preference level of any pair (sj , li ) even if the corresponding Q-table grid is still not explored and contains no entry. Since our goal is to maximally satisfy the user and save energy, the optimization objective function is constructed as follows: minimize
fs (l) = U (s, l) + λC(l)
(1)
for the current state s. C(l) is the cost of energy, which is linear in the intensity level: C(l) = c0 l where c0 is the unit-level cost of energy. Even though U (s, l) is not convex, the optimization is straightforward since l is quantized. A greedy search efficiently yields the optimal light intensity l∗ . At the current stage, we do not consider settings based on combination of lights. For each state, the optimal setting for each light is solved separately. In future work, we will explore more complex ambient settings which include combination of lights.
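This estimate-then-search step can be sketched compactly. The Gaussian fit below is a simple moment-based stand-in for the optimal fit described above, and the greedy search trades estimated satisfaction against the linear energy term (i.e., the minimization of Eq. (1) with U taken as negative satisfaction). All names, scores, and weights are illustrative:

```python
import math

def fit_gaussian_utility(q_row):
    """Moment-based Gaussian fit to the observed (light level -> score)
    entries of one Q-table row; scores are assumed non-negative here."""
    levels = sorted(q_row)
    total = sum(q_row[l] for l in levels)
    mean = sum(l * q_row[l] for l in levels) / total
    var = sum(q_row[l] * (l - mean) ** 2 for l in levels) / total or 1.0
    amp = max(q_row.values())
    return lambda l: amp * math.exp(-((l - mean) ** 2) / (2 * var))

def best_light_level(q_row, levels, lam=0.05, c0=1.0):
    """Greedy search over the quantized levels, balancing the estimated
    satisfaction against the linear energy cost C(l) = c0 * l."""
    u = fit_gaussian_utility(q_row)
    return min(levels, key=lambda l: -u(l) + lam * c0 * l)

# Sparse feedback: the user scored level 2 highly and level 4 moderately
q_row = {2: 1.0, 4: 0.4}
level = best_light_level(q_row, levels=[0, 1, 2, 3, 4])
```

Because l is quantized to a handful of levels, this exhaustive greedy pass is cheap, which is why the non-convexity of U poses no difficulty.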
5. Experiments Our testbed, called AIR (Ambient Intelligent Research) Lab, is a smart studio located at Stanford University (Fig. 10). It consists of a living room, kitchen, dining area, and study area. The testbed is equipped with a network of 5 cameras, a large-screen TV, a digital window (projected wall), handheld PDA devices, appliances, and wireless controllers for lights and ambient colors.
              cam 1    cam 2    cam 3    cam 4    cam 5
  cutting         0     9032        0        0        0
  scrambling      0     9042        0        0        0
  eating          0    11397    11414        0        0
  reading      3082        0     5176    10484        0
  computer     4640        0        0        0        0
  vacuuming       0     9052        0        0    17942
  others       5130     9022    18160     8909    17902

Table 2. Number of frames for each activity present in each camera.
              cam 1    cam 2    cam 3    cam 4    cam 5
  cutting        –      0.92       –        –        –
  scrambling     –      0.83       –        –        –
  eating         –      0.96     0.98       –        –
  reading      0.75       –      0.85     0.93       –
  computer     0.95       –        –        –        –
  vacuuming      –      0.95       –        –      0.91
  others       0.94     0.78     0.98     0.76     0.95

Table 3. Precision of the fine-level activity analysis.
5.1. Activity Analysis

We acquired approximately 80 minutes of video sequences as the training data for the fine (second)-level activity classification with spatiotemporal features (Fig. 1). Each sequence is captured from all 5 installed cameras, and two persons participated in capturing the sequences. Here we only present the performance of the second-level activity analysis; results and performance analysis of the coarse (first)-level activity classification can be found in [34]. The number of frames for each activity in each camera is shown in Table 2. We experimented with the number of K-means clusters (N) and the episode duration (t) for the bag-of-features. A three-fold cross-validation process is used, in which one fold is chosen as test data and the other two as training data. Considering the average precision of the results from all cameras, we observed that when N > 60 and t > 12 seconds, the performance stays roughly stable. Therefore, we chose N = 100 and t = 15 seconds. Precision data for each activity in each camera can be found in Table 3. The test data for HRL contains about 45 minutes of video sequences from 4 persons. Each person continuously performs different activities in the environment. At each frame, the best-view camera is chosen for the fine-level activity analysis, based on the person's location. Each episode is segmented out with a sliding window with a duration of 15 seconds that slides every 1 second.

5.2. Reinforcement Learning

In order to evaluate the ability of the system to follow the user's preferences in the light settings, 5 levels of intensity are defined for each of the 4 lights located at different areas of the testbed lab. The lights are adjustable using a set of wireless dimmers. In our
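The sliding-window episode segmentation above (15-second windows, 1-second stride) can be sketched directly; the frame rate is an assumption for illustration, not stated in the text:

```python
def sliding_episodes(frames, fps=30, duration_s=15, stride_s=1):
    """Segment a frame sequence into overlapping episodes: a
    fixed-duration window that slides forward by a fixed stride,
    as used for the fine-level bag-of-features classification."""
    win, step = duration_s * fps, stride_s * fps
    return [frames[i:i + win] for i in range(0, len(frames) - win + 1, step)]

frames = list(range(30 * 60))  # one minute of video at an assumed 30 fps
episodes = sliding_episodes(frames)
# (60 - 15) / 1 + 1 = 46 overlapping 15-second episodes
```

Each resulting episode would then be described by its bag of spatiotemporal features and classified independently.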
Figure 11. Storyline of the two examined scenarios with durations of each activity shown in minutes. Green: User1, Orange: User2.
Figure 12. Transition and activity probabilities for the two user behavior scenarios.
simulation, to study the convergence behavior of the learning system, the mean-square error (MSE) between the ideal and current compositions of light intensities is given as the feedback to the system. Four volunteer participants performed activities in different places over different time intervals. Fig. 11 shows the storylines of the examined scenarios. Following the prescribed storylines, the user moves from each location to only one unique location; hence, the transition probabilities between consecutive observed movements are δ = 1. The extracted activity probabilities and transitions for the two scenarios are shown in Fig. 12(a) and (b). Assuming the participants would not change their behavior, the scenarios were used as the daily habits of the users and repeated to construct 10,000 observations for each experiment. At the beginning of each scenario, the system is initialized with no prior knowledge of the user's preferences.
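The simulated feedback signal is a standard mean-square error over the light intensities; a minimal sketch (the four-light composition here is illustrative):

```python
def mse_penalty(current, ideal):
    """Mean-square error between the current and the user's ideal
    composition of light intensities, used as the simulated penalty
    fed back to the learner."""
    return sum((c - i) ** 2 for c, i in zip(current, ideal)) / len(ideal)

# Four lights: only the second differs from the ideal, by 2 levels
penalty = mse_penalty([2, 2, 0, 1], [2, 4, 0, 1])  # -> 1.0
```

A penalty of zero therefore means the learner has reproduced the user's ideal composition exactly.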
Figure 13. Convergence diagrams for different learning methods for two users: a) User 1, b) User 2.
Fig. 13 shows the average penalty over past observations versus time. Due to the state space abstraction of hierarchical reinforcement learning, the HRL methods converge to the ideal composition faster than the regular RL methods. This means that less time is required by the system to adapt to the user's ideal light settings. A fast adaptation rate is important when the user occasionally changes his preferred light settings and the system needs a quick adjustment. The activities in User 1's video are shorter than those in User 2's. As a result, in the case of User 1 the system does not stay in the same state long enough in each daily episode, and the four examined methods behave similarly before they converge.
6. Conclusion

We discussed key components and techniques to enable a smart home system that provides adaptive services customized to the user. Reinforcement learning is particularly well suited to this application, as it dynamically learns the user's preference profile. Example home services that can be handled with the learning method include light control and background music control. For future work, we are looking into speeding up the convergence rate of reinforcement learning, and into predicting the user's activity so that services can be provided predictively based on the user's behavior model.
References

[1] G. Abowd, E. Mynatt, and T. Rodden. The human experience [of ubiquitous computing]. IEEE Pervasive Computing, 1(1):48–57, 2002.
[2] S. Ali and M. Shah. Human action recognition in videos using kinematic features and multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32:288–303, 2010.
[3] F. Amigoni, N. Gatti, C. Pinciroli, and M. Roveri. What planner for ambient intelligence applications? IEEE Transactions on Systems, Man and Cybernetics, Part A, 35(1):7–21, 2005.
[4] A. Aztiria, A. Izaguirre, R. Basagoiti, and J. C. Augusto. Learning preferences and common behaviours of the user in intelligent environments. In Behaviour Monitoring and Interpretation: Smart Environments, pages 289–315, 2009.
[5] A. Bobick and A. Wilson. A state-based approach to the representation and recognition of gesture. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1325–1337, 1997.
[6] M. Brand and V. Kettnaker. Discovery and segmentation of activities in video. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 844–851, 2000.
[7] R. Bryll, R. Rose, and F. Quek. Agent-based gesture tracking. IEEE Transactions on Systems, Man and Cybernetics, 2005.
[8] D. Cook, M. Schmitter-Edgecombe, A. Crandall, C. Sanders, and B. Thomas. Collecting and disseminating smart home sensor data in the CASAS project. In CHI Workshop on Developing Shared Home Behavior Datasets to Advance HCI and Ubiquitous Computing Research, 2009.
[9] R. Crites and A. Barto. Improving elevator performance using reinforcement learning. In Advances in Neural Information Processing Systems, 1995.
[10] B. de Ruyter, E. van Loenen, and V. Teeven. User centered research in ExperienceLab. In Ambient Intelligence, volume 4794/2007, pages 305–313, 2007.
[11] T. G. Dietterich. State abstraction in MAXQ hierarchical reinforcement learning. In Advances in Neural Information Processing Systems 12, pages 994–1000. MIT Press, 2000.
[12] F. Doctor, H. Hagras, and V. Callaghan. A fuzzy embedded agent-based approach for realizing ambient intelligence in intelligent inhabited environments. IEEE Transactions on Systems, Man and Cybernetics, Part A, 35(1):55–65, 2005.
[13] A. Helal, M. Mokhtari, and B. Abdulrazak. Engineering Handbook on Smart Technology for Aging, Disability and Independence. John Wiley & Sons, Computer Engineering Series, 2008.
[14] S. Intille, K. Larson, E. Tapia, J. Beaudin, P. Kaushik, J. Nawyn, and R. Rockinson. Using a live-in laboratory for ubiquitous computing research. In Pervasive Computing, pages 349–365, 2006.
[15] Y. Ivanov and A. Bobick. Recognition of visual activities and interactions by stochastic parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 852–872, 2000.
[16] T. Ko. A survey on behavior analysis in video surveillance for homeland security applications. In 13th IEEE Workshop on Applied Imagery Pattern Recognition, pages 1–8, 2008.
[17] H. Lee, C. Wu, and H. Aghajan. Vision-based user-centric light control for smart environments. Pervasive and Mobile Computing, 2010 (accepted for publication).
[18] L. Liao, D. Fox, and H. Kautz. Location-based activity recognition using relational Markov networks. In 19th International Joint Conference on Artificial Intelligence, pages 773–778, 2005.
[19] B. Logan, J. Healey, M. Philipose, E. M. Tapia, and S. Intille. A long-term evaluation of sensing modalities for activity recognition. In Proc. of UbiComp, 2007.
[20] S. McKeever, J. Ye, L. Coyle, C. Bleakley, and S. Dobson. Activity recognition using temporal evidence theory. J. Ambient Intell. Smart Environ., 2:253–269, August 2010.
[21] T. Moeslund, A. Hilton, and V. Krüger. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 103(2-3):90–126, November 2006.
[22] M. Mozer and D. Miller. Parsing the stream of time: The value of event-based segmentation in a complex real-world control problem. Springer Verlag, 1998.
[23] M. C. Mozer. The neural network house: An environment that adapts to its inhabitants. In AAAI Spring Symposium on Intelligent Environments, 1998.
[24] M. C. Mozer. Lessons from an adaptive home. In Smart Environments: Technologies, Protocols, and Applications, pages 271–294. John Wiley & Sons, 2004.
[25] F. Nater, H. Grabner, and L. Van Gool. Exploiting simple hierarchies for unsupervised human behavior analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
[26] J. C. Niebles, H. Wang, and L. Fei-Fei. Unsupervised learning of human action categories using spatial-temporal words. International Journal of Computer Vision, 2008.
[27] M. Philipose, K. P. Fishkin, M. Perkowitz, D. J. Patterson, D. Fox, H. Kautz, and D. Hähnel. Inferring activities from interactions with objects. IEEE Pervasive Computing, 3(4):50–57, 2004.
[28] P. Rashidi and D. J. Cook. Keeping the resident in the loop: Adapting the smart home to the user. IEEE Transactions on Systems, Man, and Cybernetics, 39(5):949–959, 2009.
[29] L. San Martín, V. Peláez, R. González, A. Campos, and V. Lobato. Environmental user-preference learning for smart homes: an autonomous approach. J. Ambient Intell. Smart Environ., 2:327–342, August 2010.
[30] R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, USA, 1998.
[31] E. M. Tapia, S. S. Intille, and K. Larson. Activity recognition in the home using simple and ubiquitous sensors. In Pervasive Computing, pages 158–175, 2004.
[32] D. Tran and A. Sorokin. Human activity recognition with metric learning. In European Conference on Computer Vision, pages 548–561, 2008.
[33] T. van Kasteren, G. Englebienne, and B. Kröse. Activity recognition using semi-Markov models on real world smart home datasets. J. Ambient Intell. Smart Environ., 2:311–325, August 2010.
[34] C. Wu and H. Aghajan. User-centric environment discovery in smart home with camera networks. IEEE Transactions on Systems, Man and Cybernetics, Part A (to appear).
[35] C. Wu, A. H. Khalili, and H. Aghajan. Multiview activity recognition in smart homes with spatiotemporal features. In International Conference on Distributed Smart Cameras (ICDSC), 2010.
[36] Q. Yang. Activity recognition: linking low-level sensors to high-level intelligence. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, pages 20–25, 2009.
[37] G. N. Yannakakis, M. Maragoudakis, and J. Hallam. Preference learning for cognitive modeling: a case study on entertainment preferences. IEEE Transactions on Systems, Man and Cybernetics, Part A, pages 1165–1175, 2009.
[38] S. Zaidenberg, P. Reignier, and J. Crowley. Reinforcement learning of context models for a ubiquitous personal assistant. In 3rd Symposium of Ubiquitous Computing and Ambient Intelligence, volume 51, pages 254–264, 2008.
[39] S. Zhang, S. McClean, B. Scotney, X. Hong, C. Nugent, and M. Mulvenna. An intervention mechanism for assistive living in smart homes. J. Ambient Intell. Smart Environ., 2:233–252, August 2010.
Behaviour Monitoring and Interpretation – BMI B. Gottfried and H. Aghajan (Eds.) IOS Press, 2011 © 2011 The authors and IOS Press. All rights reserved.
Subject Index

activity monitoring 131
adaptive services 166
ambient assisted living 3
ambient intelligence 3
appointment reminder 83
ARIMA 131
assisted living 83
behavior analysis 166
behaviour change 131
behaviour interpretation 3
behaviour monitoring 3
body movements 147
cognitive function 11
dementia 83
emergency call 83
empirical study 83
health 11, 131
HRI 147
Information Communication Technology 11
intelligent agents 26
intention recognition 26
intentionality 26
Kalman recursion 131
KopAL 83
localization 83
mobile system 83
movement concepts 147
non-verbal communication 147
nursery home 83
order selection 131
PAL 131
persuasive technology 131
pervasive computing 3
physical information spacetime 26
prediction 131
reinforcement learning 166
situational interaction 147
smart environments 3
smart homes 166
social networking 11
spatial interaction 147
speech synthesis 83
tailored activity intervention 131
user profile 166
weak and strong pervasive computing 26
well-being 3, 11, 26, 131
Author Index

Aarts, R.M. 131
Aghajan, H. v, 3, 166
Artmann, S. 26
Cook, D.J. 65
Crandall, A.S. 65
Felber, J. 83
Fudickar, S. 83
Georgeff, M. 105
Goris, A.H. 131
Gottfried, B. v, 3
Guttmann, C. 105
Hanheide, M. 147
Huppert, F. 11
Johnson, D. 11
Khalili, A.H. 166
Lacroix, J. 131
Lenz, M. 83
Long, X. 131
Neyer, F.J. 83
Pauws, S. 131
Peters, A. 147
Pijl, M. 131
Schmidt, H. 105
Schnor, B. 83
Spexard, T.P. 147
Stede, M. 83
Thomas, I. 105
Weiss, P. 147
Wickramasinghe, K. 105
Wu, C. 166