Lecture Notes in Computer Science Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen University of Dortmund, Germany Madhu Sudan Massachusetts Institute of Technology, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Moshe Y. Vardi Rice University, Houston, TX, USA Gerhard Weikum Max-Planck Institute of Computer Science, Saarbruecken, Germany
4550
Julie A. Jacko (Ed.)
Human-Computer Interaction
Interaction Design and Usability
12th International Conference, HCI International 2007
Beijing, China, July 22-27, 2007
Proceedings, Part I
Volume Editor Julie A. Jacko Georgia Institute of Technology and Emory University School of Medicine 901 Atlantic Drive, Suite 4100, Atlanta, GA 30332-0477, USA E-mail: [email protected]
Library of Congress Control Number: 2007929779
CR Subject Classification (1998): H.5.2, H.5.3, H.3-5, C.2, I.3, D.2, F.3, K.4.2
LNCS Sublibrary: SL 2 – Programming and Software Engineering
ISSN 0302-9743
ISBN-10 3-540-73104-0 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-73104-7 Springer Berlin Heidelberg New York
Foreword

The 12th International Conference on Human-Computer Interaction, HCI International 2007, was held in Beijing, P.R. China, 22-27 July 2007, jointly with the Symposium on Human Interface (Japan) 2007, the 7th International Conference on Engineering Psychology and Cognitive Ergonomics, the 4th International Conference on Universal Access in Human-Computer Interaction, the 2nd International Conference on Virtual Reality, the 2nd International Conference on Usability and Internationalization, the 2nd International Conference on Online Communities and Social Computing, the 3rd International Conference on Augmented Cognition, and the 1st International Conference on Digital Human Modeling.

A total of 3403 individuals from academia, research institutes, industry and governmental agencies from 76 countries submitted contributions, and 1681 papers, judged to be of high scientific quality, were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of Human-Computer Interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas.

This volume, edited by Julie A. Jacko, contains papers in the thematic area of Human-Computer Interaction, addressing the following major topics:

• Interaction Design: Theoretical Issues, Methods, Techniques and Practice
• Usability and Evaluation Methods and Tools
• Understanding Users and Contexts of Use
• Models and Patterns in HCI

The remaining volumes of the HCI International 2007 proceedings are:

• Volume 2, LNCS 4551, Interaction Platforms and Techniques, edited by Julie A. Jacko
• Volume 3, LNCS 4552, HCI Intelligent Multimodal Interaction Environments, edited by Julie A. Jacko
• Volume 4, LNCS 4553, HCI Applications and Services, edited by Julie A. Jacko
• Volume 5, LNCS 4554, Coping with Diversity in Universal Access, edited by Constantine Stephanidis
• Volume 6, LNCS 4555, Universal Access to Ambient Interaction, edited by Constantine Stephanidis
• Volume 7, LNCS 4556, Universal Access to Applications and Services, edited by Constantine Stephanidis
• Volume 8, LNCS 4557, Methods, Techniques and Tools in Information Design, edited by Michael J. Smith and Gavriel Salvendy
• Volume 9, LNCS 4558, Interacting in Information Environments, edited by Michael J. Smith and Gavriel Salvendy
• Volume 10, LNCS 4559, HCI and Culture, edited by Nuray Aykin
• Volume 11, LNCS 4560, Global and Local User Interfaces, edited by Nuray Aykin
• Volume 12, LNCS 4561, Digital Human Modeling, edited by Vincent G. Duffy
• Volume 13, LNAI 4562, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
• Volume 14, LNCS 4563, Virtual Reality, edited by Randall Shumaker
• Volume 15, LNCS 4564, Online Communities and Social Computing, edited by Douglas Schuler
• Volume 16, LNAI 4565, Foundations of Augmented Cognition 3rd Edition, edited by Dylan D. Schmorrow and Leah M. Reeves
• Volume 17, LNCS 4566, Ergonomics and Health Aspects of Work with Computers, edited by Marvin J. Dainoff

I would like to thank the Program Chairs and the members of the Program Boards of all Thematic Areas, listed below, for their contribution to the highest scientific quality and the overall success of the HCI International 2007 Conference.
Ergonomics and Health Aspects of Work with Computers Program Chair: Marvin J. Dainoff Arne Aaras, Norway Pascale Carayon, USA Barbara G.F. Cohen, USA Wolfgang Friesdorf, Germany Martin Helander, Singapore Ben-Tzion Karsh, USA Waldemar Karwowski, USA Peter Kern, Germany Danuta Koradecka, Poland Kari Lindstrom, Finland
Holger Luczak, Germany Aura C. Matias, Philippines Kyung (Ken) Park, Korea Michelle Robertson, USA Steven L. Sauter, USA Dominique L. Scapin, France Michael J. Smith, USA Naomi Swanson, USA Peter Vink, The Netherlands John Wilson, UK
Human Interface and the Management of Information Program Chair: Michael J. Smith Lajos Balint, Hungary Gunilla Bradley, Sweden Hans-Jörg Bullinger, Germany Alan H.S. Chan, Hong Kong Klaus-Peter Fähnrich, Germany Michitaka Hirose, Japan Yoshinori Horie, Japan Richard Koubek, USA Yasufumi Kume, Japan Mark Lehto, USA Jiye Mao, P.R. China
Robert Proctor, USA Youngho Rhee, Korea Anxo Cereijo Roibás, UK Francois Sainfort, USA Katsunori Shimohara, Japan Tsutomu Tabe, Japan Alvaro Taveira, USA Kim-Phuong L. Vu, USA Tomio Watanabe, Japan Sakae Yamamoto, Japan Hidekazu Yoshikawa, Japan
Fiona Nah, USA Shogo Nishida, Japan Leszek Pacholski, Poland
Li Zheng, P.R. China Bernhard Zimolong, Germany
Human-Computer Interaction Program Chair: Julie A. Jacko Sebastiano Bagnara, Italy Jianming Dong, USA John Eklund, Australia Xiaowen Fang, USA Sheue-Ling Hwang, Taiwan Yong Gu Ji, Korea Steven J. Landry, USA Jonathan Lazar, USA
V. Kathlene Leonard, USA Chang S. Nam, USA Anthony F. Norcio, USA Celestine A. Ntuen, USA P.L. Patrick Rau, P.R. China Andrew Sears, USA Holly Vitense, USA Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics Program Chair: Don Harris Kenneth R. Boff, USA Guy Boy, France Pietro Carlo Cacciabue, Italy Judy Edworthy, UK Erik Hollnagel, Sweden Kenji Itoh, Japan Peter G.A.M. Jorna, The Netherlands Kenneth R. Laughery, USA
Nicolas Marmaras, Greece David Morrison, Australia Sundaram Narayanan, USA Eduardo Salas, USA Dirk Schaefer, France Axel Schulte, Germany Neville A. Stanton, UK Andrew Thatcher, South Africa
Universal Access in Human-Computer Interaction Program Chair: Constantine Stephanidis Julio Abascal, Spain Ray Adams, UK Elizabeth Andre, Germany Margherita Antona, Greece Chieko Asakawa, Japan Christian Bühler, Germany Noelle Carbonell, France Jerzy Charytonowicz, Poland Pier Luigi Emiliani, Italy Michael Fairhurst, UK Gerhard Fischer, USA Jon Gunderson, USA Andreas Holzinger, Austria
Zhengjie Liu, P.R. China Klaus Miesenberger, Austria John Mylopoulos, Canada Michael Pieper, Germany Angel Puerta, USA Anthony Savidis, Greece Andrew Sears, USA Ben Shneiderman, USA Christian Stary, Austria Hirotada Ueda, Japan Jean Vanderdonckt, Belgium Gregg Vanderheiden, USA Gerhard Weber, Germany
Arthur Karshmer, USA Simeon Keates, USA George Kouroupetroglou, Greece Jonathan Lazar, USA Seongil Lee, Korea
Harald Weber, Germany Toshiki Yamaoka, Japan Mary Zajicek, UK Panayiotis Zaphiris, UK
Virtual Reality Program Chair: Randall Shumaker Terry Allard, USA Pat Banerjee, USA Robert S. Kennedy, USA Heidi Kroemker, Germany Ben Lawson, USA Ming Lin, USA Bowen Loftin, USA Holger Luczak, Germany Annie Luciani, France Gordon Mair, UK
Ulrich Neumann, USA Albert "Skip" Rizzo, USA Lawrence Rosenblum, USA Dylan Schmorrow, USA Kay Stanney, USA Susumu Tachi, Japan John Wilson, UK Wei Zhang, P.R. China Michael Zyda, USA
Usability and Internationalization Program Chair: Nuray Aykin Genevieve Bell, USA Alan Chan, Hong Kong Apala Lahiri Chavan, India Jori Clarke, USA Pierre-Henri Dejean, France Susan Dray, USA Paul Fu, USA Emilie Gould, Canada Sung H. Han, South Korea Veikko Ikonen, Finland Richard Ishida, UK Esin Kiris, USA Tobias Komischke, Germany Masaaki Kurosu, Japan James R. Lewis, USA
Rungtai Lin, Taiwan Aaron Marcus, USA Allen E. Milewski, USA Patrick O'Sullivan, Ireland Girish V. Prabhu, India Kerstin Röse, Germany Eunice Ratna Sari, Indonesia Supriya Singh, Australia Serengul Smith, UK Denise Spacinsky, USA Christian Sturm, Mexico Adi B. Tedjasaputra, Singapore Myung Hwan Yun, South Korea Chen Zhao, P.R. China
Online Communities and Social Computing Program Chair: Douglas Schuler Chadia Abras, USA Lecia Barker, USA Amy Bruckman, USA
Stefanie Lindstaedt, Austria Diane Maloney-Krichmar, USA Isaac Mao, P.R. China
Peter van den Besselaar, The Netherlands Peter Day, UK Fiorella De Cindio, Italy John Fung, P.R. China Michael Gurstein, USA Tom Horan, USA Piet Kommers, The Netherlands Jonathan Lazar, USA
Hideyuki Nakanishi, Japan A. Ant Ozok, USA Jennifer Preece, USA Partha Pratim Sarker, Bangladesh Gilson Schwartz, Brazil Sergei Stafeev, Russia F.F. Tusubira, Uganda Cheng-Yen Wang, Taiwan
Augmented Cognition Program Chair: Dylan D. Schmorrow Kenneth Boff, USA Joseph Cohn, USA Blair Dickson, UK Henry Girolamo, USA Gerald Edelman, USA Eric Horvitz, USA Wilhelm Kincses, Germany Amy Kruse, USA Lee Kollmorgen, USA Dennis McBride, USA
Jeffrey Morrison, USA Denise Nicholson, USA Dennis Proffitt, USA Harry Shum, P.R. China Kay Stanney, USA Roy Stripling, USA Michael Swetnam, USA Robert Taylor, UK John Wagner, USA
Digital Human Modeling Program Chair: Vincent G. Duffy Norm Badler, USA Heiner Bubb, Germany Don Chaffin, USA Kathryn Cormican, Ireland Andris Freivalds, USA Ravindra Goonetilleke, Hong Kong Anand Gramopadhye, USA Sung H. Han, South Korea Pheng Ann Heng, Hong Kong Dewen Jin, P.R. China Kang Li, USA
Zhizhong Li, P.R. China Lizhuang Ma, P.R. China Timo Maatta, Finland J. Mark Porter, UK Jim Potvin, Canada Jean-Pierre Verriest, France Zhaoqi Wang, P.R. China Xiugan Yuan, P.R. China Shao-Xiang Zhang, P.R. China Xudong Zhang, USA
In addition to the members of the Program Boards above, I also wish to thank the following volunteer external reviewers: Kelly Hale, David Kobus, Amy Kruse, Cali Fidopiastis and Karl Van Orden from the USA, Mark Neerincx and Marc Grootjen from the Netherlands, Wilhelm Kincses from Germany, Ganesh Bhutkar and Mathura Prasad from India, Frederick Li from the UK, and Dimitris Grammenos, Angeliki Kastrinaki, Iosif Klironomos, Alexandros Mourouzis, and Stavroula Ntoa from Greece.

This conference would not have been possible without the continuous support and advice of the Conference Scientific Advisor, Prof. Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications Chair and Editor of HCI International News, Abbas Moallem, and of the members of the Organizational Board from P.R. China, Patrick Rau (Chair), Bo Chen, Xiaolan Fu, Zhibin Jiang, Congdong Li, Zhenjie Liu, Mowei Shen, Yuanchun Shi, Hui Su, Linyang Sun, Ming Po Tham, Ben Tsiang, Jian Wang, Guangyou Xu, Winnie Wanli Yang, Shuping Yi, Kan Zhang, and Wei Zho.

I would also like to thank the members of the Human Computer Interaction Laboratory of ICS-FORTH for their contribution towards the organization of the HCI International 2007 Conference, in particular Margherita Antona, Maria Pitsoulaki, George Paparoulis, Maria Bouhli, Stavroula Ntoa and George Margetis.
Constantine Stephanidis General Chair, HCI International 2007
HCI International 2009
The 13th International Conference on Human-Computer Interaction, HCI International 2009, will be held jointly with the affiliated Conferences in San Diego, California, USA, in the Town and Country Resort & Convention Center, 19-24 July 2009. It will cover a broad spectrum of themes related to Human Computer Interaction, including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. For more information, please visit the Conference website: http://www.hcii2009.org/
General Chair Professor Constantine Stephanidis ICS-FORTH and University of Crete Heraklion, Crete, Greece Email: [email protected]
Table of Contents
Part 1: Interaction Design: Theoretical Issues, Methods, Techniques and Practice

Design Principles Based on Cognitive Aging
   Hiroko Akatsu, Hiroyuki Miki, and Naotsune Hosono

Redesigning the Rationale for Design Rationale
   Michael E. Atwood and John Horner

HCI and the Face: Towards an Art of the Soluble
   Christoph Bartneck and Michael J. Lyons

Towards Generic Interaction Styles for Product Design
   Jacob Buur and Marcelle Stienstra

Context-Centered Design: Bridging the Gap Between Understanding and Designing
   Yunan Chen and Michael E. Atwood

Application of Micro-Scenario Method (MSM) to User Research for the Motorcycle’s Informatization - A Case Study for the Information Support System for Safety
   Hiroshi Daimoto, Sachiyo Araki, Masamitsu Mizuno, and Masaaki Kurosu

A New User-Centered Design Process for Creating New Value and Future
   Yasuhisa Itoh, Yoko Hirose, Hideaki Takahashi, and Masaaki Kurosu

The Evasive Interface – The Changing Concept of Interface and the Varying Role of Symbols in Human–Computer Interaction
   Lars-Erik Janlert

An Ignored Factor of User Experience: FEEDBACK-QUALITY
   Ji Hong and Jiang Xubo

10 Heuristics for Designing Administrative User Interfaces – A Collaboration Between Ethnography, Design, and Engineering
   Luke Kowalski and Kristyn Greenwood

Micro-Scenario Database for Substantializing the Collaboration Between Human Science and Engineering
   Masaaki Kurosu, Kentaro Go, Naoki Hirasawa, and Hideaki Kasai

A Meta-cognition Modeling of Engineering Product Designer in the Process of Product Design
   Jun Liang, Zu-Hua Jiang, Yun-Song Zhao, and Jin-Lian Wang

User Oriented Design to the Chinese Industries Scenario and Experience Innovation Design Approach for the Industrializing Countries in the Digital Technology Era
   You Zhao Liang, Ding Hau Huang, and Wen Ko Chiou

Axiomatic Design Approach for E-Commercial Web Sites
   Mehmet Mutlu Yenisey

Development of Quantitative Metrics to Support UI Designer Decision-Making in the Design Process
   Young Sik Yoon and Wan Chul Yoon

Scenario-Based Product Design, a Real Case
   Der-Jang Yu and Huey-Jiuan Yeh

Long Term Usability; Its Concept and Research Approach - The Origin of the Positive Feeling Toward the Product
   Masaya Ando and Masaaki Kurosu

General Interaction Expertise: An Approach for Sampling in Usability Testing of Consumer Products
   Ali Emre Berkman

Are Guidelines and Standards for Web Usability Comprehensive?
   Nigel Bevan and Lonneke Spinhof

A Game to Promote Understanding About UCD Methods and Process
   Muriel Garreta-Domingo, Magí Almirall-Hill, and Enric Mor
DEPTH TOOLKIT: A Web-Based Tool for Designing and Executing Usability Evaluations of E-Sites Based on Design Patterns
   Petros Georgiakakis, Symeon Retalis, Yannis Psaromiligkos, and George Papadimitriou

Evaluator of User’s Actions (Eua) Using the Model of Abstract Representation Dgaui
   Susana Gómez-Carnero and Javier Rodeiro Iglesias

Adaptive Evaluation Strategy Based on Surrogate Model
   Yi-nan Guo, Dun-wei Gong, and Hui Wang

A Study on the Improving Product Usability Applying the Kano’s Model of Customer Satisfaction
   Jeongyun Heo, Sanhyun Park, and Chiwon Song

The Practices of Usability Analysis to Wireless Facility Controller for Conference Room
   Ding Hau Huang, You Zhao Liang, and Wen Ko Chiou

What Makes Evaluators to Find More Usability Problems?: A Meta-analysis for Individual Detection Rates
   Wonil Hwang and Gavriel Salvendy

Evaluating in a Healthcare Setting: A Comparison Between Concurrent and Retrospective Verbalisation
   Janne Jul Jensen

Development of AHP Model for Telematics Haptic Interface Evaluation
   Yong Gu Ji, Beom Suk Jin, Jae Seung Mun, and Sang Min Ko

Why It Is Difficult to Use a Simple Device: An Analysis of a Room Thermostat
   Sami Karjalainen

Usability Improvements for WLAN Access
   Kristiina Karvonen and Janne Lindqvist

A New Framework of Measuring the Business Values of Software
   In Ki Kim, Beom Suk Jin, Seungyup Baek, Andrew Kim, Yong Gu Ji, and Myung Hwan Yun
Evaluating Usability Evaluation Methods: Criteria, Method and a Case Study
   P. Koutsabasis, T. Spyrou, and J. Darzentas

How to Use Emotional Usability to Make the Product Serves a Need Beyond the Traditional Functional Objective to Satisfy the Emotion Needs of the User in Order to Improve the Product Differentiator - Focus on Home Appliance Product
   Liu Ning and Shang Ting

Towards Remote Empirical Evaluation of Web Pages’ Usability
   Juan Miguel López, Inmaculada Fajardo, and Julio Abascal

Mixing Evaluation Methods for Assessing the Utility of an Interactive InfoVis Technique
   Markus Rester, Margit Pohl, Sylvia Wiltner, Klaus Hinum, Silvia Miksch, Christian Popow, and Susanne Ohmann

Effectiveness of Content Preparation in Information Technology Operations: Synopsis of a Working Paper
   A. Savoy and G. Salvendy

Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture for Early Usability Evaluation: Basic Principles and Comparison
   Jean-Claude Tarby, Houcine Ezzedine, José Rouillard, Chi Dung Tran, Philippe Laporte, and Christophe Kolski

Usability and Software Development: Roles of the Stakeholders
   Tobias Uldall-Espersen and Erik Frøkjær

Human Performance Model and Evaluation of PBUI
   Naoki Urano and Kazunari Morimoto
The Balancing Act Between Computer Security and Convenience
   Mayuresh Ektare and Yanxia Yang

What Makes Them So Special?: Identifying Attributes of Highly Competent Information System Users
   Brenda Eschenbrenner and Fiona Fui-Hoon Nah

User Acceptance of Digital Tourist Guides Lessons Learnt from Two Field Studies
   Bente Evjemo, Sigmund Akselsen, and Anders Schürmann

Why Does IT Support Enjoyment of Elderly Life? - Case Studies Performed in Japan
   Kaori Fujimura, Hitomi Sato, Takayoshi Mochizuki, Kubo Koichiro, Kenichiro Shimokura, Yoshihiro Itoh, Setsuko Murata, Kenji Ogura, Takumi Watanabe, Yuichi Fujino, and Toshiaki Tsuboi

Design Effective Navigation Tools for Older Web Users
   Qin Gao, Hitomi Sato, Pei-Luen Patrick Rau, and Yoko Asano

Out of Box Experience Issues of Free and Open Source Software
   Mehmet Göktürk and Görkem Çetin
Factor Structure of Content Preparation for E-Business Web Sites: A Survey Results of Industrial Employees in P.R. China
   Yinni Guo and Gavriel Salvendy

Streamlining Checkout Experience – A Case Study of Iterative Design of a China e-Commerce Site
   Alice Han, Jianming Dong, Winnie Tseng, and Bernd Ewert

Presence, Creativity and Collaborative Work in Virtual Environments
   Ilona Heldal, David Roberts, Lars Bråthe, and Robin Wolff

Mental Models of Chinese and German Users and Their Implications for MMI: Experiences from the Case Study Navigation System
   Barbara Knapp

User Response to Free Trial Restrictions: A Coping Perspective
   Xue Yang, Chuan-Hoo Tan, and Hock-Hai Teo
A Study on the Form of Representation of the User’s Mental Model-Oriented Ancient Map of China
   Rui Yang, Dan Li, and Wei Zhou

Towards Automatic Cognitive Load Measurement from Speech Analysis
   Bo Yin and Fang Chen

Attitudes in ICT Acceptance and Use
   Ping Zhang and Shelley Aikman

Part 4: Models and Patterns in HCI

Using Patterns to Support the Design of Flexible User Interaction
   M. Cecília C. Baranauskas and Vania Paula de Almeida Neris

Model-Based Usability Evaluation - Evaluation of Tool Support
   Gregor Buchholz, Jürgen Engel, Christian Märtin, and Stefan Propp

User-Oriented Design (UOD) Patterns for Innovation Design at Digital Products
   Chiou Wen-Ko, Chen Bi-Hui, Wang Ming-Hsu, and Liang You-Zhao

Formal Validation of Java/Swing User Interfaces with the Event B Method
   Alexandre Cortier, Bruno d’Ausbourg, and Yamine Aït-Ameur

Task Analysis, Usability and Engagement
   David Cox

ORCHESTRA: Formalism to Express Static and Dynamic Model of Mobile Collaborative Activities and Associated Patterns
   Bertrand David, René Chalon, Olivier Delotte, and Guillaume Masserey

Effective Integration of Task-Based Modeling and Object-Oriented Specifications
   Anke Dittmar and Ashraf Gaffar

A Pattern Decomposition and Interaction Design Approach
   Cunhao Fang, Pengwei Tian, and Ming Zhong

Towards an Integrated Approach for Task Modeling and Human Behavior Recognition
   Martin Giersich, Peter Forbrig, Georg Fuchs, Thomas Kirste, Daniel Reichart, and Heidrun Schumann

A Pattern-Based Framework for the Exploration of Design Alternatives
   Tibor Kunert and Heidi Krömker

Tasks Models Merging for High-Level Component Composition
   Arnaud Lewandowski, Sophie Lepreux, and Grégory Bourguin

Application of Visual Programming to Web Mash Up Development
   Seung Chan Lim, Sandi Lowe, and Jeremy Koempel

Comprehensive Task and Dialog Modelling
   Víctor López-Jaquero and Francisco Montero

Structurally Supported Design of HCI Pattern Languages
   Christian Märtin and Alexander Roski

Integrating Authoring Tools into Model-Driven Development of Interactive Multimedia Applications
   Andreas Pleuß and Heinrich Hußmann

A Survey on Transformation Tools for Model Based User Interface Development
   Robbie Schaefer

A Task Model Proposal for Web Sites Usability Evaluation for the ErgoMonitor Environment
   André Luis Schwerz, Marcelo Morandini, and Sérgio Roberto da Silva

Model-Driven Architecture for Web Applications
   Mohamed Taleb, Ahmed Seffah, and Alain Abran

HCI Design Patterns for PDA Running Space Structured Applications
   Ricardo Tesoriero, Francisco Montero, María D. Lozano, and José A. Gallud

Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments
   Kristof Verpoorten, Kris Luyten, and Karin Coninx

Patterns for Task- and Dialog-Modeling
   Maik Wurdel, Peter Forbrig, T. Radhakrishnan, and Daniel Sinnig

Author Index
Part I
Interaction Design: Theoretical Issues, Methods, Techniques and Practice
Design Principles Based on Cognitive Aging

Hiroko Akatsu1, Hiroyuki Miki1, and Naotsune Hosono2

1 Oki Electric Industry Co., Ltd. 1-16-8 Chuou Warabi-shi, Saitama, 335-8510 Japan
[email protected], [email protected]
2 Oki Consulting Solutions Co., Ltd.
[email protected]
Abstract. This study proposes design principles that balance ‘simplicity’ and ‘helpfulness’ based on cognitive aging. As the aging population grows, various kinds of equipment are required to better assist elderly users. ATMs (Automatic Teller Machines) have always been considered difficult for elderly users to operate. This paper therefore discusses a new ATM interface design that follows the principles. The effectiveness of the new design was examined by comparing it with a conventional ATM. The usability test results favored the new ATM design, and it has consequently been accepted by many elderly users.

Keywords: cognitive aging, design principles, elderly users, ATM.
2 Influences of Cognitive Aging on Interaction with Equipment

2.1 Issues

It is important to consider not only perceptive and physical characteristics but also, comprehensively, the cognitive behavioral characteristics that clearly influence operation (Figure 1).
[Figure 1: The elderly users’ characteristics when they operate various equipment. Labels recoverable from the figure: aged changes (decreased vision, cataracta senilis, decreased sensibility, longer response time, diminished attention, decline in memory); operating behavior (slow operations through confirmations, hard to understand all the information at once, hard to notice the screen changes, repeat similar errors, hesitate to take initiatives); groupings: perceptive/physical characteristics and cognitive behavioral characteristics.]

Fig. 1. Cognitive aging
The characteristics of elderly users presented below were found in usability tests of various equipment [3].

1) Longer Response Time than Younger Users
The time required for entries with the 50-character keys was quite long, as were the time taken to insert a passbook or cash and the overall time spent responding to individual items. This often resulted in a time-out, so many of the elderly users had to repeat the procedure from the beginning. A comparison of the average times needed for each task revealed that the group of elderly users took twice as long as the group of university students for withdrawal operations and three times as long for fund transfers. However, by repeating the same operations, such as entering one's name using the 50-character keys, the elderly users also learned the operation, which shortened the time needed for such tasks.
2) Difficulties Collecting all the Information in a Short Time
Under certain conditions, elderly users had difficulty collecting all the necessary information at once; for example, they were able to read only a portion of the messages displayed on the screen.

3) Excessive Response to Voice Messages
In general, the voice message prompts prevented the elderly users from forgetting to press a key (example: a voice message such as “Please verify the amount and press the 'Confirm' key if the amount is correct”). However, when a voice message prompting them to “enter your name” was given after the name had already been entered, they proceeded to enter the name again.

4) Recurrence of the Same Errors
Once an operational error had been made, there was a tendency to repeat the same error. It appears that elderly users find it difficult to determine what state they are currently in or how the operation was done previously, which makes it difficult for them to avoid the same errors.

5) They Tend to Respond to Items that are Easily Seen or can be Touched Directly by Hand (example: hardware keys)

6) They Hardly Notice the Changes to Information Displayed on the Screen

7) They cannot always extract the necessary information (or they try to read all the information but tire partway through and are unable to finish reading).

8) They will not take any initiative on their own (or they simply follow instructions, for example when they are asked to push keys).

2.2 Ease of Use and Cognitive Aging: A Three-Layered Factor Model

Sorting out the problems of the elderly users observed in various experiments suggests that the three factors shown in Figure 2 overlap each other in a complex manner, causing the phenomenon that the elderly “cannot use equipment”.
The three factors are:

(a) Factors Associated with the Deterioration of the Cognitive Capacity of Elderly Users
The basic factor behind the inability to use equipment is the deterioration of cognitive function that occurs with aging. As reported by research in the field of experimental cognitive psychology, this age-related deterioration of capabilities is considered to have a clear influence on the matter.
H. Akatsu, H. Miki, and N. Hosono
(b) Factors Relevant to the Lack of Knowledge and Mental Models (for Equipment and Systems)
A mental model is the image a user forms of how equipment should be used. It is believed that the lack of such knowledge accelerates the effects of cognitive aging outlined in (a) of Figure 2, delaying the understanding of how equipment operates. Such problems arise from the rapid advancement of IT equipment and will continue to bring difficulties for the elderly in the future. As long as new technologies keep being developed, it is believed that new problems, different from those of today, will continue to appear.

(c) Factors Relevant to Attitude (cultural and social values)
Elderly users seem reluctant even to try the equipment, preferring methods and means that are familiar to them (example: using a teller rather than an ATM), as they do not want to be seen as incapable. This factor is a problem for manufacturers. Still, as mentioned before, with the branches of many banks being consolidated and reduced in number, it is believed that there will be an increasing number of situations in which the elderly are forced to use ATMs that they ultimately find difficult to use. As our agenda for the future, it is essential to broaden the scope of usability research and to conduct studies from other perspectives, such as what needs to be done to enable the elderly to use the equipment.

It is necessary to recognize that a single issue may be caused not by one factor alone but by all three. The design principles are therefore based on cognitive aging with the three factors taken into account. A new ATM design for elderly users that follows these principles is proposed, and its effectiveness is then compared with that of a conventional ATM.
(c) Factors associated with attitudes
• Negative attitude toward using the equipment.
• Values, knowledge and framework for each generation.
• Select methods and means to effectively sustain their own capabilities.
(b) Factors associated with a lack of knowledge and mental models
• Knowledge and mental models concerning particular modes of operation of equipment.
• Knowledge relative to the concept of the information itself.
(a) Factors associated with the deterioration of cognitive capabilities of the elderly
• Deterioration of inhibition functions.
• Decrease in short-term memory capacity.
• Delays in comprehension.
Fig. 2. Ease of use and cognitive aging: A three-layered factor model (adapted, with corrections, from Harada and Akatsu [3])
3 Design Principles and ATM Design
Based on the elderly users’ characteristics described above, the following design principles were derived. A new ATM design that balances ‘simplicity’ and ‘helpfulness’ based on cognitive aging is proposed.

1) Only One Operation is Required per Screen
ATM design example: elderly users can perform the banking transaction in a step-by-step manner.

2) The Screen Switch Must be Noticeable
ATM design example: blinking buttons, and a screen switch by side slide at the time of page renewal (Figure 3).

3) The Operation Flow Must also be Comprehensible
ATM design example: the conventional ATM demands the two operations of input and confirmation together. The new ATM divides them into an input screen and a confirmation screen. As a result, elderly users could perform input and confirmation with confidence (see Figure 4).

4) The Screen Information Must be Easy to Read (sufficient font size and contrast)

5) Screen Information Must be as Simple as Possible

Announcements generally support the operation. However, announcements sometimes hinder the operation because of inappropriate timing or content. Hence the following points were considered.

6) The Same Content as the Announcement Must be Displayed on the Screen

7) The Announcement Must be Made Just Before Changing to the Next Screen, and it Must Not Repeat

8) Announcements of Feedback Messages can be Delivered through the Telephone Handset
Fig. 3. Screen switch by side slide
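[Figure 4. Conventional ATM: a single screen with the prompt “Please enter the amount to remit. Next then please confirm.”, an amount field, and cancel, confirm, and clear keys. New ATM for elderly users: an Input Screen with the prompt “Please enter the amount to remit.” and an amount field, and a separate Confirmation Screen with the message “The amount is “65,000 yen” Is it OK?”; the keys shown are Clear, Next, and Yes.]
Fig. 4. Input screen and Confirmation screen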
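The split into an input screen and a confirmation screen (principles 1 and 3, Figure 4) can be sketched as a small state machine. This is only an illustrative sketch, not the authors' implementation: the names `Screen` and `RemitAmountFlow` are hypothetical, and the assignment of the Clear and Next keys to the input screen and the Clear and Yes keys to the confirmation screen is an assumption read from Figure 4.

```python
from dataclasses import dataclass
from enum import Enum, auto

DIGITS = set("0123456789")


class Screen(Enum):
    INPUT = auto()          # one operation: type the amount
    CONFIRMATION = auto()   # one operation: confirm the amount
    DONE = auto()


@dataclass
class RemitAmountFlow:
    """Hypothetical two-screen flow: each screen asks for exactly one operation."""
    screen: Screen = Screen.INPUT
    digits: str = ""

    def prompt(self) -> str:
        if self.screen is Screen.INPUT:
            return "Please enter the amount to remit."
        if self.screen is Screen.CONFIRMATION:
            return f'The amount is "{int(self.digits):,} yen". Is it OK?'
        return "Amount accepted."

    def press(self, key: str) -> None:
        if self.screen is Screen.INPUT:
            if key in DIGITS:
                self.digits += key
            elif key == "Clear":
                self.digits = ""
            elif key == "Next" and self.digits:
                # Move on only when there is something to confirm.
                self.screen = Screen.CONFIRMATION
        elif self.screen is Screen.CONFIRMATION:
            if key == "Yes":
                self.screen = Screen.DONE
            elif key == "Clear":
                # Assumed behavior: Clear returns to the input screen.
                self.digits = ""
                self.screen = Screen.INPUT
```

Pressing the keys 6, 5, 0, 0, 0 and then Next moves the flow to the confirmation screen, whose prompt repeats the entered amount before the user commits to it. Keys that do not belong to the current screen are simply ignored, which keeps each screen to a single decision.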
4 ATM Usability Testing
The effectiveness of the new ATM design for elderly users was compared with that of a conventional one.

4.1 Methods
The test participants were first instructed to say aloud what they were thinking while operating an ATM simulator (the “think-aloud method”). The collected data (all behavior and speech of the test participants) were then subjected to protocol analysis.

4.2 Test Participants
The test participants were six elderly users (three males and three females, aged between 68 and 75). None of them had used an ATM before.

4.3 Experimental Equipment
An ATM simulator was prepared as the system under test (a personal computer and a touch display were installed in a paper model housing), and ordinary transaction
operations were then to be performed. A video camera, a small tiepin-type microphone, and other recording equipment were prepared as recording media.

4.4 Experimental Procedures
Each participant was tested individually. Before the tasks were performed, the objectives of the usability test and the use of the equipment were explained, the think-aloud method was practiced, and a preliminary questionnaire survey concerning the use of ATMs was conducted. After the tasks had been completed, a follow-up questionnaire survey and additional interviews were conducted. The two prepared tasks were (1) withdrawal using a cash card, and (2) money transfer.

4.5 Results and Considerations
1) Decreased Number of Time-outs from Operational Errors
It was found that most time-outs of an ATM operation occur when elderly users become confused and are uncertain of what to do next. When a time-out occurs, the display usually returns to the top screen, wiping out any previous efforts by the user. The number of time-outs each user experienced during the money transfer task was recorded. As a whole, the new ATM design was found to decrease the number of time-out occurrences to less than half of that with a conventional ATM. On the conventional ATM, the time-outs mainly occurred during the money transfer operation, while entering the first letter of the bank branch name and selecting a bank branch from a list. On the new ATM, by contrast, time-outs were found to occur during name input using the Japanese character list. Consequently, it can be said that the new ATM resolved these usability issues, although some problems remain with name input.

2) Less Cognitive Load
The six users were interviewed after the experimental evaluation. They reported that the new ATM was easier to use and, for the most part, were satisfied.
From the comments made by the users, it is surmised that the accumulation of useful tips on each screen page and the overall effort to reduce cognitive load were effective.

3) Number of Operational Steps and Operational Confidence
There is a trade-off between simplifying the information on one screen page and increasing the number of page operations. In the elderly user mode, screen pages are added so that the operations can be performed more easily and with confidence. Operational rhythm is enhanced with subsidiary announcements to make the additional steps less noticeable. Interviews with the test participants showed that they preferred simple usability even if several steps were added. Judging by the results of the usability test, the effectiveness of the proposed principles was confirmed.
5 Conclusion
This paper proposed a new ATM interface design that specifically reflects the requirements of cognitive aging. Experimental evaluation showed less operational confusion and fewer errors than with the conventional ATM. The elderly users appreciated the step-by-step operations, which were more in line with their input pace. The effectiveness of the proposed principles was therefore confirmed. The principles are applicable not only to ATMs but also to other equipment.
References
1. Fisk, A.D., Rogers, W.A., et al.: Designing for Older Adults: Principles and Creative Human Factors Approaches. CRC Press (2004)
2. Kyoyou-Hin Foundation: Inconvenience list such as the elderly people (1999)
3. Harada, T.E., Akatsu, H.: What is “Usability” - A Perspective of Universal Design in An Aging Society. In: Cognitive Science of Usability, Kyoritsu Publisher (2003)
Redesigning the Rationale for Design Rationale
Michael E. Atwood and John Horner
College of Information Science and Technology, Drexel University, Philadelphia, PA 19104 USA
{atwood, jh38} @drexel.edu
Abstract. One goal of design rationale systems is to support designers by providing a means to record and communicate the argumentation and reasoning behind the design process. However, there are several inherent limitations to developing systems that effectively capture and utilize design rationale. The dynamic and contextual nature of design and our inability to exhaustively analyze all possible design issues result in cognitive, capture, retrieval, and usage limitations. In addition, there are the organizational limitations that ensue when systems are deployed. In this paper we analyze the essential problems that prevent the successful development and use of design rationale systems. We argue that useful and effective design rationale systems cannot be built unless we carefully redefine the goal of design rationale systems.
Keywords: Design rationale, theories of design, interactive systems design.
1 Introduction Over the past two decades, much has been written about design rationale. That design rationale has remained an active research area within the human-computer interaction (HCI) community for an extended time indicates that researchers see it as an attractive and productive area for research. We share this enthusiasm for research on design rationale. But, at the same time, we have little confidence that useful and usable design rationale systems will ever be built. And, should they ever be built, we have little confidence that they will be used. The only solution we see to successful research on design rationale is to carefully define the rationale underlying design rationale. Our motivation in writing this paper is derived from two questions. First, since we don’t have a common understanding of what design is, how can we have a common understanding of what design rationale is? Second, why is the collection of papers that describe design rationale systems so much larger than the collection that describe design rationale successes?
Wania et al. [1] reported a bibliometric cocitation analysis of the HCI literature over much of the past two decades. From this analysis, shown in Figure 1, seven major approaches to design were identified. It is important to note that the Design Rationale cluster spans across much of the map, almost connecting one side to the other. Two points are worth noting here. First, design rationale is not a tool that other design communities use as much as it is a research area of its own; that is why it appears here as a separate cluster. Second, the design rationale community does not have a great deal of commonality in interest. The authors in the Design Rationale cluster all seem to be boundary spanners. Each author in this cluster is located very close to another cluster. This suggests that design rationale may mean different things to the different researchers and practitioners within this community.

2.1 Why Do the Papers Describing Systems Outnumber Those Describing Successes?
In analyzing the papers that describe design rationale systems, we will look at two end-points. In 1991, a special issue of the journal Human-Computer Interaction presented six papers on design rationale. Of these six, only one reported any data on system use, and this data indicated only that one design rationale system was usable; there was no data supporting a claim that it was useful. In 2006, an edited text [2] presented twenty papers on design rationale. Of these twenty, only one reported data on system usability; no data on usefulness was presented. Clearly, the number of papers describing design rationale systems is much larger than the number reporting design rationale successes. In order to understand why design rationale is not seen as a tool for designers and why successes are so rare, we will begin with a common view of design rationale. In Figure 2, we show the flow of information in most design rationale systems.
Initially, designers consider alternatives to design issues they are facing [3]. Then, they store the rationale for their decisions in a design rationale system. At a later time, another designer can browse the design rationale system to review earlier decisions and potentially to apply these earlier decisions to the current design. All of this, of course, sits in some organizational context.
[Figure 2: within an Organizational Setting, Artifact A and Artifact B are connected through a DR System; the activities are numbered 1 to 4.]
Fig. 2. Barriers to Effective Design Rationale Systems
Overall, design rationale systems are intended to support communication, reflection, and analysis in design. Design rationale systems are intended to support the communication of design decisions to others, to support reflecting on design options, and to support analyzing which option to select. But, referring back to Figure 2, the goal of transmitting information to future designers detracts from the goal of doing good designs today! Simply put, a designer’s cognitive energy can be focused on solving today’s problems or on recording information to be used in the future. But, doing one detracts from the other. We argue that the main use of design rationale systems is to support today’s design. In essence, this brings design rationale back to its starting point (e.g., [4]).
3 The Essential Barriers
For each of the activities shown in Figure 2, we list the essential problems that inhibit the success of design rationale systems. We use the term essential in the same way that Brooks [5] did; essential problems are inherent in the nature of the activity, in contrast to accidental problems, which are problems for today but which are not inherent and may well be solved by future technological advances. After analyzing these essential problems, we return to two additional questions. In order to better understand what the rationale for design rationale should be, we must ask: what do designers do? And then, what should the goal of design rationale be?

3.1 Cognitive Barriers
Designers must focus their cognitive energy on the problem at hand. Imposing inappropriate constraints or introducing irrelevant information into design activities can have detrimental effects.

Satisficing, Not Optimization. People have a limited capacity to process information. This limitation can hinder the effectiveness of design rationale. Simon [6] states that we are bounded by our rationality and cannot consider all possible
alternatives. Therefore, people choose satisfactory rather than optimal solutions. Since we are bounded by the amount of information we can process, design rationale is necessarily incomplete. Unintended Consequences. It is important to recognize the potential for unintended consequences, especially in systems where the risks are high [6]. In these situations, designers may want to ensure that they have exhaustively covered the design space so as to minimize the risk for unanticipated effects. The key question in this type of query is “what are we missing?” Design rationale is a potential solution to help designers identify issues that they may have otherwise left unconsidered. Systems could allow designers to search for similar projects or issues to identify issues that were considered in those projects. Collaboration Hampers Conceptual Integrity. One mechanism to more exhaustively analyze the design space is to use collaboration in the design process [7]. However, in any collaborative design context, maintaining conceptual integrity is important to keep the design project focused [5]. More people are capable of considering more ideas, but this adds complexity and effort in keeping persons on the design team up to speed. It also increases the effort of integrating diverse perspectives. 3.2 Capture Barriers There are many different situations in which design rationale may not be captured. In some cases, the omission is unintentional. In others, it is quite intentional. We consider both below. Work-benefit Disparity. Complex design is normally a group activity, and tools to support designers can therefore be considered a type of groupware. Grudin [8] describes several problems involved in developing groupware. Specifically, one of the obstacles he discusses is of particular interest to design rationale systems. He contends that there should not be a disparity between who incurs the cost and who receives the benefit. 
If the focus of design rationale is placed only on minimizing the cost to later users, it can add significant costs to the original designers. A major shortcoming in design rationale is the failure to minimize the cost to the original designers. Gruber and Russell [9] contend that design rationale must go beyond the record and replay paradigm and collect data that can benefit later users, while also not being a burden on designers. Context Is Hard to Capture. Design rationale may be considered, but unintentionally not recorded by the capture process. There are several reasons why considerations could be unintentionally omitted from design rationale. If the design rationale capture takes place outside of the design process, it is possible that contextual cues may not be present, and designers may not recall what they deliberated upon, or designers may not be available at the time the rationale is captured. For these reasons, it would appear that rationale should be captured in the context of design. However, it is not always possible or advantageous to capture rationale in
the design context. Grudin [10] notes that in certain development environments, exploring design space can be detrimental because it diverts critical resources. Additionally, many design decisions are considered in informal situations, where capturing the rationale is infeasible [11]. Tracking the location where the rationale was recorded, the persons present at the time of design rationale capture, their roles and expertise, and the environmental context of the capture can help reviewers infer why specific information was considered.

Designers Should Design, Not Record Rationale. Tacit knowledge [12] is a term used to describe things that we know, but are not able to bring to consciousness. It is possible that design rationale may unintentionally be omitted because a designer may not be able to explicate their tacit knowledge. Designers may not be able or willing to spend the energy to articulate their thoughts into the design rationale system, especially when they reach breakdowns and are focusing on understanding and resolving the problem at hand. Conklin and Burgess-Yakemovic [7] state that designers' focus should be on solving problems and not on capturing their decisions. During routine situations, designers react to problems as they arise without consciously thinking about them.

Recording Rationale Can Be Dangerous! Sharing knowledge can be detrimental to designers, especially if the information they share could potentially be used against them. Designers may be hesitant to simply give away knowledge without knowing who will use it or how it will be used. Rewarding knowledge sharing is a challenging task that involves creating tangible rewards for intangible ideas. This is especially difficult considering that there is often no way to evaluate which ideas resulted in the success or failure of an artifact. In certain contexts, there are privacy and security concerns with the design rationale.
For instance, organizations may want to keep their rationale secure so that competing organizations cannot gain a competitive advantage. Similarly, there may be political repercussions or security breaches if policy makers make their rationale available to the public. For example, designers may not want to document all of their considerations because politically motivated information could be held against them. There are also situations where people working outside the specified work procedures may not want to document their work-arounds for fear that doing so will be detrimental to them. Designers may not want to capture rationale that could be viewed as detrimental to themselves or certain other people, and therefore will intentionally omit certain rationale. Additionally, individual designers may not want their design considerations to be available for post-hoc scrutiny.

3.3 Retrieval Barriers
Karsenty [13] evaluated design documents and found that design rationale questions were by far the most frequent questions during design evaluation meetings. However, only 41% of the design rationale questions were answered by the design rationale documentation. The discrepancy between the needed and the captured design rationale is attributed to several high-level causes, including analysts not capturing questions, options, or criteria; the inadequacy of the design rationale method; and the lack of understanding. Other literature has focused on several issues
that contribute to this failure, including inappropriate representations [14,15], the added workload required of designers [7,10], exigent organizational constraints [11], and contextual differences between the design environment at the time when the rationale is captured and the time when it is needed [9].

Relevance Is Situational. Initial designers and subsequent users of rationale may have different notions of what is relevant in a given design context. Wilson [16] describes relevance as a relationship between a user and a piece of information, and as independent of truth. Relevance is based on a user’s situational understanding of a concern. Moreover, he argues that situational relevance is an inherently indeterminate notion because of the changing, unsettled, and undecided character of our concerns. This suggests that the rationale constructed at design time may not be relevant to those reviewing the rationale at a later time in a different context. When rationale is exhaustively captured, there is an additional effort required to capture the information. And, when too little information is captured, the reviewers’ questions remain unanswered. Belkin [17] describes information retrieval as a type of communication whereby a user is investigating their state of knowledge with respect to a problem. Belkin contends that the success of the communication is dependent upon the extent to which the anomaly can be resolved based on the information provided, and thus is controlled by the recipient. This suggests that designers cannot recognize the relevance of rationale until a person queries it. And, later users may not be able to specify what information will be most useful, but rather will only recognize that they do not have the necessary knowledge to resolve a problem.

Indexing. A more structured representation can make it more difficult to capture design ideas, but can facilitate indexing and retrieval.
One problem is that there is an inherent tradeoff between representational flexibility and ease of retrieval. Unstructured text is easier to record, but more difficult to structure in a database. One solution is to push the burden onto those who receive the benefit [8], which would be the retrievers in this case. However, if the potential users of the rationale find that the system demands too much effort, it will go unused. Then, designers will not be inclined to spend time entering design rationale into a system that will not be used.

3.4 Usage Barriers
People reviewing design rationale have a goal and a task at hand that they hope the design rationale will support. Often, these people are also involved in designing. If this is the case, the reviewers may not know whether retrieved rationale is applicable to their current problem.

The Same Problem in a Different Context Is a Different Problem. Because design problems are unique, even rationale that successfully resolved one design problem may not be applicable to a different problem. In addition to the problem of accurately and exhaustively capturing rationale, recognizing the impact of rationale can be a difficult task. Understanding rationale tied to one problem could help resolve similar problems in the future. However, design is contextual, and external factors often interact with the
design activity in a complex and unexpected manner. Reviewers of rationale are interested in understanding information to help them with their task at hand, and without understanding the context of those problems, utilization of the information becomes difficult. The inherent problem of identifying the impact of rationale across different design problems adds a net cost to utilizing rationale, decreasing the overall utility in the design process.

Initiative Falls on the User. Design rationale systems are passive rather than active. The initiative to find relevant rationale falls on the user. The system does not suggest it; it is the user’s responsibility to retrieve it.

3.5 Organizational Barriers
As Davenport and Prusak warn in their book [18], “if you build it, they may not come.” Being able to build a system is only an initial step; the “gold standard” against which success is measured, however, is whether people will accept and use it.

Designers Don’t Control the Reward Structure of Users. As system builders, we have little control over the personal reward systems of individual users or over the management mandates that many [18,19] recommend to enhance usage of the technology, and therefore we cannot motivate our users in these ways. We must rely on other factors.

Informal Knowledge Is Difficult to Capture. Design rationale tools must support both formal and informal knowledge, making the system flexible enough that broad content types are supported [20]. They must support multiple levels of organization of content and design systems so that knowledge can be structured at any time after it is entered [21].
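The capture and retrieval points above (recording the capture context, accepting informal text, and allowing structure to be added at any time after entry) can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not a description of any existing design rationale system: the names `CaptureContext`, `RationaleEntry`, `formalize`, and `search` are hypothetical, and the optional structure borrows the question, options, and criteria vocabulary mentioned in Section 3.3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CaptureContext:
    """Where and with whom the rationale was recorded (hypothetical fields)."""
    location: str
    participants: Dict[str, str]   # name -> role or expertise
    setting: str = ""              # free-text note on the environment


@dataclass
class RationaleEntry:
    text: str                       # informal rationale, stored as entered
    context: CaptureContext
    question: Optional[str] = None  # structure is optional and may come later
    options: List[str] = field(default_factory=list)
    criteria: List[str] = field(default_factory=list)

    @property
    def is_structured(self) -> bool:
        return self.question is not None

    def formalize(self, question: str, options: List[str],
                  criteria: List[str]) -> None:
        """Add structure after the fact; the original text is left untouched."""
        self.question = question
        self.options = list(options)
        self.criteria = list(criteria)


def search(entries: List[RationaleEntry], keyword: str) -> List[RationaleEntry]:
    """Case-insensitive keyword match over the text and any added structure."""
    needle = keyword.lower()
    hits = []
    for entry in entries:
        parts = [entry.text, entry.question or "", *entry.options, *entry.criteria]
        if needle in " ".join(parts).lower():
            hits.append(entry)
    return hits
```

In this sketch the cost of entry stays low for the original designer, since only free text and a context note are required, while a later reader can call `formalize` on the entries that turn out to matter. The initiative still falls on whoever runs `search`, which is the usage barrier described in Section 3.4.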
4 Conclusions In this paper, we have explored the role of design rationale research within the broader design community. And, we have looked into a number of barriers that impede design rationale as an effective tool for reflection, communication, and analysis. The barriers were discussed in terms of cognitive, capture, retrieval, usage, and organizational limitations. At one level, the intent of design rationale is to transmit information from a designer working at one time and in one context to another designer working in another time and context. This is the most frequently-cited goal in design rationale research. But, is this the ultimate goal of design rationale? We argue that it is not. The goal of research on design rationale is to improve the quality of designs. There are fundamental barriers to developing information systems that support asynchronous communication among designers working on different design problems. Therefore, design research should focus on supporting designers who better understand the context of their unique problems.
The goal of research on design rationale is to improve the quality of designs. There are fundamental barriers to developing computer systems that support communication among designers working on design problems. Therefore, the focus of design rationale should be on identifying what tools are most appropriate for the task. Using less persistent modes of communication, putting a greater emphasis on supporting design processes rather than design tools, and creating systems that are optimized for a single purpose are necessary steps for improving design.
References 1. Wania, C., McCain, K., Atwood, M.E.: How do design and evaluation interrelate in HCI research? In: Proceedings of the 6th ACM conference on Designing Interactive systems, June 26-28, 2006, University Park, PA, USA (2006) 2. Dutoit, McCall, Mistrik, Paech. (eds.) Rationale Management in Software Engineering. Springer Heidelberg 3. Horner, J., Atwood, M.E.: Design rationale: the rationale and the barriers. In: Proceedings of the 4th ACM Nordic conference on Human-computer interaction: changing roles (2006) 4. Rittel, H., Weber, M.: Planning Problems are Wicked Problems. In: Cross, N. (ed.) Developments in design methodology, pp. 135–144. Wiley, Chichester; New York (1984) 5. Brooks, F.P.: The mythical man-month: essays on software engineering. Addison-Wesley Pub. Co, Reading, Mass (1995) 6. Simon, H.A.: The sciences of the artificial. Cambridge, MA, MIT Press. 1996. Tenner, E. Why things bite back: technology and the revenge of unintended consequences. New York, Knopf (1996) 7. Conklin, E., Bergess-Yakemovic, K.: A process oriented approach to design rationale. In: Moran, T.P., Carroll, J.M. (eds.) Design rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 8. Grudin, J.: Groupware and social dynamics: eight challenges for developers. Communications of the ACM 37(1), 92–105 (1994) 9. Gruber, T., Russell, D.: Generative Design Rationale. Beyond the Record and Replay Paradigm. In: Moran, T.P., Carroll, J.M. (eds.) esign rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 10. Grudin, J.: Evaluating opportunities for design capture. In: Moran, T.P., Carroll, J.M. (eds.) Design rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 11. Sharrock, W., Anderson, R.: Synthesis and Analysis: Five modes of reasoning that guide design. In: Moran, T.P., Carroll, J.M. (eds.) Design rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 12. 
Polanyi, M.: The tacit dimension. Doubleday, Garden City, NY (1966) 13. Karsenty, L.: An empirical evaluation of design rationale documents. In: Proceedings of the SIGCHI conference on Human factors in computing systems, pp. 150–156. ACM Press, New York (1996) 14. Lee, J., Lai, K.: What’s in design rationale? In: Moran, T.P., Carroll, J.M. (eds.) Design rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 15. MacLean, A., Young, R., Bellotti, V., Moran, T.: Questions, Options, Criteria: Elements of design space analysis. In: Moran, T.P., Carroll, J.M. (eds.) Design rationale: concepts, techniques, and use, L. Erlbaum Associates, Mahwah, N.J (1996) 16. Wilson, P.: Situational Relevance. Information Stor. Retrieval 9, 457–471 (1973)
17. Belkin, N.: Anomalous States of Knowledge as a Basis for Information Retrieval. Canadian Journal of Information Science 5, 133–143 (1980)
18. Davenport, T.H., Prusak, L.: Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, Massachusetts (1998)
19. Orlikowski, W.J., Hofman, J.D.: An Improvisational Model for Change Management: The Case of Groupware Technologies, Sloan Management Review/Winter, pp. 11–21 (1997)
20. Davenport, T.H.: Saving IT’s Soul: Human-Centered Information Management, Harvard Business Review: Creating a System to Manage Knowledge, 1994, product #39103, pp. 39–53 (1994)
21. Shipman, F., McCall, R.: Incremental Formalization with the Hyper-Object Substrate. ACM Transactions on Information Systems (1999)

HCI and the Face: Towards an Art of the Soluble

Christoph Bartneck¹ and Michael J. Lyons²

¹ Department of Industrial Design, Eindhoven University of Technology, Den Dolech 2, 5600 MB Eindhoven, The Netherlands
[email protected]
² ATR Intelligent Robotics and Communication Labs, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan
[email protected]
Abstract. The human face plays a central role in most forms of natural human interaction, so we may expect that computational methods for the analysis of facial information, and graphical and robotic methods for the synthesis of faces and facial expressions, will play a growing role in human-computer and human-robot interaction. However, certain areas of face-based HCI, such as facial expression recognition and robotic facial display, have lagged behind others, such as eye-gaze tracking, facial recognition, and conversational characters. Our goal in this paper is to review the situation in HCI with regard to the human face, and to discuss strategies which could bring the more slowly developing areas up to speed.

Keywords: face, HCI, soluble, recognition, synthesis.
recognition community and the findings are highly relevant to HCI [1, 2]. Work on animated avatars may be considered to be mature [3], while the younger field of social robotics is expanding rapidly [4-6]. FP is a central concern in both of these fields, and HCI researchers can contribute to and benefit from the results.

2 HCI and the Face

Computer scientists and engineers have worked increasingly on FP, from the widely varying viewpoints of graphics, animation, computer vision, and pattern recognition. However, an examination of the HCI research literature indicates that activity is restricted to a relatively narrow selection of these areas. Eye gaze has occupied the greatest share of HCI research on the human face (e.g. [7]). Eye gaze tracking technology is now sufficiently advanced that several commercial solutions are available (e.g. Tobii Technology [8]). Gaze tracking is a widely used technique in interface usability, machine-mediated human communication, and alternative input devices. This area can be viewed as a successful sub-field of face-based HCI.

Numerous studies have emphasized the neglect of human affect in interface design and argued that this could have a major impact on the human aspects of computing [9]. Accordingly, there has been much effort in the pattern recognition, AI, and robotics communities towards the analysis, understanding, and synthesis of emotion and expression. In the following sections we briefly introduce the areas related to the analysis and synthesis, especially by robots, of facial expressions. In addition, we share insights on these areas gained during a workshop we organized on the topic.

2.1 Analysis: Facial Expression Classification

Gaining insight into a user’s affective state is an attractive prospect, and may be considered one of the key unsolved problems in HCI. It is known that the “valence” component of affective state is difficult to measure, as compared to “arousal”, which may be gauged using biosensors. However, a smile or a frown provides a clue that goes beyond physiological measurements. It is also attractive that expressions can be gauged non-invasively with inexpensive video cameras.
Automatic analysis of video data displaying facial expressions has become an active area of computer vision and pattern recognition research (for reviews see [10, 11]). The scope of the problem statement has, however, been relatively narrow. Typically one measures the performance of a novel classification algorithm on recognition of the basic expression classes proposed by Ekman and Friesen [12]. Expression data often consists of a segmented headshot taken under relatively controlled conditions, and classification accuracy is based on comparison with emotion labels provided by human experts.

This bird’s eye caricature of the methodology used by the pattern recognition community is necessarily simplistic, but it underlines two general reflections. First, pattern recognition has successfully framed the essentials of the facial expression problem to allow for effective comparison of algorithms. This narrowing of focus has led to impressive developments in the techniques for facial expression analysis and to substantial understanding. Second, the narrow framing of the
FP problem typical in computer vision and pattern recognition may not be appropriate for HCI problems. This observation is a main theme of this paper, and we suggest that progress on the use of FP in HCI may require re-framing the problem.

Perhaps the most salient aspect of our second general observation on the problem of automatic facial expression recognition is that HCI technology can often get by with partial solutions. A system that can discriminate between a smile and a frown, but not between an angry and a disgusted face, can still be a valuable tool for HCI researchers, even if it is not regarded as a particularly successful algorithm from the pattern recognition standpoint. Putting this more generally, components of algorithms developed in the pattern recognition community may already have sufficient power to be useful in HCI, even if they do not yet constitute general facial expression analysis systems. Elsewhere in this paper we give several examples to back up this statement.

2.2 Synthesis: Robotic Facial Expressions

There is a long tradition within the HCI community of investigating and building screen-based characters that communicate with users [3]. Recently, robots have also been introduced to communicate with users, and this area has progressed sufficiently that some review articles are available [4, 6]. The main advantage that robots have over screen-based agents is that they are able to directly manipulate the world. They not only converse with users, but also perform embodied physical actions. Nevertheless, screen-based characters and robots share an overlap in motivations for, and problems with, communicating with users. Bartneck et al. [13] have shown, for example, that there is no significant difference in the users’ perception of emotions as expressed by a robot or a screen-based character.

The main motivation for using facial expressions to communicate with a user is that it is, in fact, impossible not to communicate.
If the face of a character or robot remains inert, it communicates indifference. To put it another way, since humans are trained to recognize and interpret facial expressions, it would be wasteful to ignore this rich communication channel. Compared to the state of the art in screen-based characters, such as Embodied Conversational Agents [3], however, the field of robotic facial expression is underdeveloped. Much attention has been paid to robot motor skills, such as locomotion and gesturing, but relatively little work has been done on their facial expression.

Two main approaches can be observed in the field of robotics and screen-based characters. In one camp are researchers and engineers who work on the generation of highly realistic faces. A recent example of a highly realistic robot is the Geminoid H1, which has 13 degrees of freedom (DOF) in its face alone. The annual Miss Digital award [14] may be thought of as a benchmark for the development of this kind of realistic computer-generated face. While significant progress has been made in these areas, we have not yet reached human-like detail and realism, and this is acutely true for the animation of facial expressions. Hence, many highly realistic robots and characters currently struggle with the phenomenon of the “Uncanny Valley” [15], with users experiencing these artificial beings as spooky or unnerving. Even
the Repliee Q1Expo is only able to convince humans of the naturalness of its expressions for at best a few seconds [16]. In summary, natural robotic expressions remain in their infancy [6].

Major obstacles to the development of realistic robots lie with the actuators and the skin. At least 25 muscles are involved in expression in the human face. These muscles are flexible and small, and can be activated very quickly. Electric motors emit noise, while pneumatic actuators are difficult to control. These problems often result in robotic heads that have either a small number of actuators or a somewhat larger-than-normal head. The Geminoid H1 robot, for example, is approximately five percent larger than its human counterpart. It also remains difficult to attach skin, which is often made of latex, to the head. This results in unnatural and non-human looking wrinkles and folds in the face.

At the other end of the spectrum, there are many researchers who are developing more iconic faces. Bartneck [17] showed that a robot with only two DOF in the face can produce a considerable repertoire of emotional expressions that make the interaction with the robot more enjoyable. Many popular robots, such as Asimo [18], Aibo [19] and PaPeRo [20], have only a schematic face with few or no actuators. Some of these feature only LEDs for creating facial expressions. The recently developed iCat robot is a good example of an iconic robot that has a simple, physically animated face [21]. The eyebrows and lips of this robot move, and this allows synthesis of a wide range of expressions.

More general and fundamental unsolved theoretical aspects of facial information are also relevant to the synthesis of facial expressions. The representation of the space of emotional expressions is a prime example [22]. The space of expressions is often modeled either with continuous dimensions, such as valence and arousal [23], or with a categorical approach [12].
This controversial issue has broad implications for all HCI applications involving facial expression [22]. The same can be said for other fundamental aspects of facial information processing, such as the believability of synthetic facial expressions by characters and robots [5, 24].
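
The two representations contrasted above can be related in a small sketch: a continuous model places an expression at a point in a valence-arousal plane, and a categorical label can be read off as the nearest prototype. The coordinates below are illustrative placeholders chosen for this sketch, not values taken from [12], [22] or [23], and `nearest_category` and `blend` are hypothetical helper names.

```python
import math

# Hypothetical prototype positions in a (valence, arousal) plane, each in [-1, 1].
# These numbers are illustrative assumptions, not measurements from the literature.
PROTOTYPES = {
    "happiness": (0.8, 0.5),
    "surprise": (0.2, 0.9),
    "anger": (-0.7, 0.7),
    "sadness": (-0.7, -0.5),
    "neutral": (0.0, 0.0),
}


def nearest_category(valence, arousal):
    """Map a continuous (valence, arousal) point to the closest categorical label."""
    return min(
        PROTOTYPES,
        key=lambda name: math.dist(PROTOTYPES[name], (valence, arousal)),
    )


def blend(a, b, t):
    """Interpolate between two prototypes (t = 0 gives a, t = 1 gives b)."""
    (va, aa), (vb, ab) = PROTOTYPES[a], PROTOTYPES[b]
    return (va + t * (vb - va), aa + t * (ab - aa))


if __name__ == "__main__":
    # A faint smile: a quarter of the way from neutral to happiness.
    point = blend("neutral", "happiness", 0.25)
    print(point, "->", nearest_category(*point))
```

Under these assumed coordinates a faint smile is still read out as neutral, which hints at why the choice between a continuous and a categorical model matters for applications that rely on subtle expressions.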

3 Workshop on “HCI and the Face”

As part of our effort to examine the state of the field of FP in HCI, we organized a day-long workshop at the ACM CHI’2006 conference (see: http://www.bartneck.de/workshop/chi2006/ for details). The workshop included research reports, focus groups, and general discussions. This has informed our perspective on the role of FP in HCI, as presented in the current paper. One focus group summarized the state of the art in facial expression analysis and synthesis, while another brainstormed HCI applications. The idea was to examine whether current technology is sufficiently advanced to support HCI applications. The proposed applications were organized with regard to the factors “Application domain” and “Intention” (Table 1). Group discussion seemed to focus naturally on applications that involve some type of agent, avatar or robot. It is nearly impossible to provide an exhaustive list of applications for each field in the matrix. The ones listed in the table should therefore be considered only as representative examples.

Table 1. Examples of face processing applications in HCI and HRI

Application domain | Intention: Persuade | Intention: Being a companion | Intention: Educate
Entertainment | Advertisement: REA [3], Greta [25] | Aibo [19], Tamagotchi [26] | My Real Baby [27]
Communication | Persuasive Technology [28], Cat [29] | Avatar [30] | Language tutor [31]
Health | Health advisor, Fitness tutor [32] | Aibo for elderly [33], Paro [29], Attention Capture for Dementia Patients [34] | Autistic children [35]

These examples illustrate well a fundamental problem of this research field. The workshop participants can be considered experts in the field, and all the proposed example applications were related to artificial characters, such as robots, conversational agents and avatars. Yet not one of these applications has become a lasting commercial success. Even Aibo, the previously somewhat successful entertainment robot, was discontinued by Sony in 2006.

A problem that all these artificial entities have to deal with is that, while their expression processing has reached an almost sufficient maturity, their intelligence has not. This is especially problematic, since the mere presence of an animated face raises the expectation levels of its user. An entity that is able to express emotions is also expected to recognize and understand them. The same holds true for speech. If an artificial entity talks, then we also expect it to listen and understand. As we all know, no artificial entity has yet passed the Turing test or claimed the Loebner Prize. All of the examples given in Table 1 presuppose the existence of a strong AI as described by John Searle [36].

The reasons why strong AI has not yet been achieved are manifold and the topic of lengthy discussion. Briefly then, there are, from the outset, conceptual problems. John Searle [36] pointed out that digital computers alone can never truly understand reality because they only manipulate syntactical symbols that do not contain semantics. The famous ‘Chinese room’ example points out some conceptual constraints in the development of strong AIs. According to his line of argument, IBM’s chess-playing computer “Deep Blue” does not actually understand chess. It may have beaten Kasparov, but it does so only by manipulating meaningless symbols.
The AI researcher Drew McDermott [37] replied to this criticism: "Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings." This debate reflects different philosophical viewpoints on what it means to think and understand. For centuries philosophers have thought about such questions, and perhaps the most important conclusion is that there is no conclusion at this point in time. Similarly, the possibility of developing a strong AI remains an open question. All the same, it must be admitted that some kind of progress has been made.
In the past, a chess-playing machine would have been regarded as intelligent. But now it is regarded as the feat of a calculating machine: our criteria for what constitutes an intelligent machine have shifted. In any case, suffice it to say that no sufficiently intelligent machine has yet emerged that would provide a foundation for our example applications given in Table 1. The point we hope to have made with this digression into AI is that the application dreams of researchers sometimes conceal rather unrealistic assumptions about what it is possible to achieve with current technology.

4 Towards an “Art of the Soluble”

The outcome of the workshop we organized was unexpected in a number of ways. Most striking was the vast mismatch between the concrete and fairly realistic description of the available FP technology and its limitations arrived at by one of the focus groups, and the blue-sky applications discussed by the second group. Another sharp contrast was evident at the workshop. The actual presentations given by participants were pragmatic and showed effective solutions to real problems in HCI that do not rely on AI.

This led us to the reflection that scientific progress often relies on what the Nobel prize winning biologist Peter Medawar called “The Art of the Soluble” [38]. That is, skill in doing science requires the ability to select a research problem which is soluble, but which has not yet been solved. Very difficult problems such as strong AI may not yield to solution over the course of decades, so in most fields it is preferable to work on problems of intermediate difficulty, which can yield results over a more reasonable time span, while still being of sufficient interest to constitute progress. Some researchers, of course, are lucky or insightful enough to re-frame a difficult problem in such a way as to reduce its difficulty, or to recognize a new problem which is not difficult, but nevertheless of wide interest. In the next two subsections we illustrate the general concept with examples from robotic facial expression synthesis as well as facial expression analysis.

4.1 Facial Expression Synthesis in Social Robotics

As we argued in section 3, the problems inherited by HRI researchers from the field of AI can be severe. Even if we neglect philosophical aspects of the AI problem and are satisfied with a computer that passes the Turing test, independently of how it achieves this, we will still encounter many practical problems.
This leads us to the so-called “weak AI” position, in which claims of achieving human cognitive abilities are abandoned. Instead, this approach focuses on specific problem-solving or reasoning tasks. There has certainly been progress in weak AI, but this has not yet matured sufficiently to support artificial entities. Indeed, at present, developers of artificial entities must resort to scripting behaviors. Clearly, the scripting approach has its limits, and even the most advanced common sense database, Cyc [39], is largely incomplete. FP should therefore not bet on the arrival of strong AI solutions, but focus on what weak AI solutions can offer today. Of course there is still hope that strong AI applications will eventually become possible, but this may take a long time.
Fig. 1. Robots with animated faces
When we look at what types of HRI solutions are currently being built, we see that a large number of them have barely any facial features at all. Qrio, Asimo and Hoap-2, for example, are only able to turn their heads with 2 degrees of freedom (DOF). Other robots, such as Aibo, are able to move their head, but have only LEDs to express their inner states in an abstract way. While these robots are intended to interact with humans, they certainly avoid facial expression synthesis.

When we look at robots that have truly animated faces, we can distinguish between two dimensions: DOF and iconic/realistic appearance (see Figure 1). Robots in the High DOF/Realistic quadrant not only have to fight with the uncanny valley [40], they may also raise user expectations of a strong AI which they are not able to fulfill. By contrast, the Low DOF/Iconic quadrant includes robots that are extremely simple and perform well in their limited application domain. These robots lie well within the domain of the soluble in FP. The most interesting quadrant is the High DOF/Iconic quadrant. These robots have rich facial expressions but avoid evoking associations with a strong AI through their iconic appearance. We propose that research on such robots has the greatest potential for significant advances in the use of FP in HRI.

4.2 Facial Analysis for Direct Gesture-Based Interaction

The second example we use to illustrate the “Art of the Soluble” strategy comes from the analysis of facial expressions. While there is a large body of work on automatic
facial expression recognition and lip reading within the computer vision and pattern recognition research communities, relatively few studies have examined the possible use of the face in direct, intentional interaction with computers. However, the complex musculature of the face and extensive cortical circuitry devoted to facial control suggest that motor actions of the face could play a complementary or supplementary role to that played by the hands in HCI [1].

One of us has explored this idea through a series of projects using vision-based methods to capture movement of the head and facial features and use these for intentional, direct interaction with computers. For example, we have used head and mouth motions for the purposes of hands-free text entry and single-stroke text character entry on small keyboards such as those found on mobile phones. Related projects used action of the mouth and face for digital sketching and musical expression.

One of the systems we developed tracked the head and position of the nose and mapped the projected position of the nose tip in the image plane to the coordinates of the cursor. Another algorithm segmented the area of the mouth and measured the visible area of the cavity of the user’s mouth in the image plane. The state of opening/closing of the mouth could be determined robustly and used in place of mouse-button clicks. This simple interface allowed for text entry using the cursor to select streaming text. Text entry was started and paused by opening and closing the mouth, while selection of letters was accomplished by small movements of the head. The system was tested extensively and found to permit comfortable text entry at a reasonable speed. Details are reported in [41].

Another project used the shape of the mouth to disambiguate the multiple letters mapped to the keys of a cell phone key pad [42].
Such an approach works very well for Japanese, which has a nearly strict CV (consonant-vowel) phoneme structure, and only five vowels. The strength of this system was that it took advantage of existing user expertise in shaping the mouth to select vowels. With some practice, users found they could enter text faster than with the standard multi-tap approach.

The unusual idea of using facial actions for direct input may find least resistance in the realm of artistic expression. Indeed, our first explorations of the concept were with musical controllers using mouth shape to control timbre and other auditory features [43]. Of course, since many musical instruments rely on action of the face and mouth, this work has precedent, and was greeted with enthusiasm by some musicians. Similarly, we used a mouth action-sensitive device to control line properties while drawing and sketching with a digital tablet [44]. Here again our exploration elicited a positive response from artists who tried the system.

The direct action facial gesture interface serves to illustrate the concept that feasible FP technology is ready to be used as the basis for working HCI applications. The techniques used in all the examples discussed are not awaiting the solution of some grand problem in pattern recognition: they work robustly in real-time under a variety of lighting conditions.
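
To make the simplicity of these interfaces concrete, the following is a minimal sketch of the three mappings described in this section, written for illustration only and not taken from [41] or [42]. All function names, thresholds and the keypad table are assumptions. In particular, the two-threshold (hysteresis) rule for the mouth state is our choice for the sketch, since the text does not say how the open/closed decision was made, and the key-to-consonant table follows the common Japanese phone keypad layout in a simplified romanisation.

```python
from dataclasses import dataclass


def nose_to_cursor(nose_x, nose_y, image_size, screen_size, gain=1.0):
    """Map a nose-tip position in the camera image to cursor coordinates.

    The offset from the image centre is scaled by `gain` (gain > 1 lets small
    head movements cover the whole screen) and the result is clamped to the screen.
    """
    img_w, img_h = image_size
    scr_w, scr_h = screen_size
    dx = nose_x / img_w - 0.5  # normalised offset from the image centre
    dy = nose_y / img_h - 0.5
    x = int((0.5 + gain * dx) * scr_w)
    y = int((0.5 + gain * dy) * scr_h)
    return max(0, min(scr_w - 1, x)), max(0, min(scr_h - 1, y))


@dataclass
class MouthSwitch:
    """Turn a noisy mouth-cavity area (in pixels) into a stable open/closed state.

    Two thresholds stop the state from flickering when the area hovers near a
    single cut-off. Both default values are arbitrary assumptions.
    """
    open_above: float = 400.0
    close_below: float = 200.0
    is_open: bool = False

    def update(self, cavity_area):
        if not self.is_open and cavity_area > self.open_above:
            self.is_open = True
        elif self.is_open and cavity_area < self.close_below:
            self.is_open = False
        return self.is_open


# Assumed key-to-consonant table (rows with a full set of five vowels only).
KEY_CONSONANT = {"1": "", "2": "k", "3": "s", "4": "t",
                 "5": "n", "6": "h", "7": "m", "9": "r"}
VOWELS = {"a", "i", "u", "e", "o"}


def select_syllable(key, mouth_vowel):
    """The key press chooses the consonant row; the mouth shape chooses the vowel."""
    if mouth_vowel not in VOWELS:
        raise ValueError(f"unknown vowel shape: {mouth_vowel!r}")
    return KEY_CONSONANT[key] + mouth_vowel


if __name__ == "__main__":
    print(nose_to_cursor(320, 240, (640, 480), (1920, 1080)))
    switch = MouthSwitch()
    print([switch.update(area) for area in (100, 450, 300, 150)])
    print(select_syllable("2", "i"))
```

None of these mappings needs general expression recognition: each relies on a single, easily measured facial feature, and the user's own control of that feature does the rest.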

5 Conclusion

In this paper we have argued in favour of an “Art of the Soluble” approach in HCI. Progress can often be made by sidestepping long-standing difficult issues in artificial
intelligence and pattern recognition. This is partly intrinsic to HCI: the presence of a human user for the system being developed implies leverage for existing computational algorithms. Our experience and the discussions that led to this article have also convinced us that HCI researchers tend towards an inherently pragmatic approach even if they are not always conscious of the fact. In summary, we would like to suggest that skill in identifying soluble problems is already a relative strength of HCI, and one that would be worth developing further.

References
[1] Lyons, M.J.: Facial Gesture Interfaces for Expression and Communication, IEEE International Conference on Systems, Man and Cybernetics, The Hague (2004)
[2] Lyons, M.J., Budynek, J., Akamatsu, S.: Automatic Classification of Single Facial Images. IEEE PAMI 21, 1357–1362 (1999)
[3] Cassell, J., Sullivan, J., Prevost, S., Churchill, E.: Embodied Conversational Agents. MIT Press, Cambridge (2000)
[4] Bartneck, C., Okada, M.: Robotic User Interfaces, HC2001, Aizu (2001)
[5] Bartneck, C., Suzuki, N.: Subtle Expressivity for Characters and Robots. In International Journal of Human Computer Studies, vol. 62, Elsevier, pp. 306 (2004)
[6] Fong, T., Nourbakhsh, I., Dautenhahn, K.: A survey of socially interactive robots. Robotics and Autonomous Systems 42, 143–166 (2003)
[7] Zhai, S., Morimoto, C., Ihde, S.: Manual and gaze input cascaded (MAGIC) pointing, presented at ACM CHI’99
[8] Tobii Technology, Tobii Technology (2007) Retrieved February 2007, from http://www.tobii.com/
[9] Picard, R.W.: Affective computing. MIT Press, Cambridge (1997)
[10] Pantic, M., Rothkrantz, L.J.M.: Automatic analysis of facial expressions: the state of the art. IEEE PAMI 22, 1424–1445 (2000)
[11] Fasel, B., Luettin, J.: Automatic facial expression analysis: a survey. Pattern Recognition 36, 259–275 (2003)
[12] Ekman, P., Friesen, W.V.: Unmasking the Face. Prentice-Hall, Englewood Cliffs (1975)
[13] Bartneck, C., Reichenbach, J., Breemen, A.: In your face, robot! The influence of a character’s embodiment on how users perceive its expressions, Design and Emotion (2004)
[14] Cerami, F.: Miss Digital World (2006) Retrieved August 4th, from http://www.missdigitalworld.com/
[15] Mori, M.: The Uncanny Valley. Energy 7, 33–35 (1970)
[16] Ishiguro, H.: Towards a new cross-interdisciplinary framework, presented at CogSci Workshop Towards social Mechanisms of android science, Stresa (2005)
[17] Bartneck, C.: Interacting with an Embodied Emotional Character, presented at Design for Pleasurable Products Conference (DPPI2004), Pittsburgh (2003)
[18] Honda, Asimo (2002) Retrieved from http://www.honda.co.jp/ASIMO/
[19] Sony, Aibo (1999) Retrieved January, 1999, from http://www.aibo.com
[20] NEC, PaPeRo (2001) Retrieved from http://www.incx.nec.co.jp/robot
[21] Breemen, A., Yan, X., Meerbeek, B.: iCat: an animated user-interface robot with personality, 4th Intl. Conference on Autonomous Agents & Multi Agent Systems (2005)
[22] Schiano, D.J.: Categorical Imperative NOT: Facial Affect is Perceived Continuously, presented at ACM CHI’2004 (2004)
[23] Russell, J.A.: Affective space is bipolar. Journal of personality and social psychology 37, 345–356 (1979)
[24] Bartneck, C.: How convincing is Mr. Data’s smile: Affective expressions of machines. User Modeling and User-Adapted Interaction 11, 279–295 (2001)
[25] Pelachaud, C.: Multimodal expressive embodied conversational agents, In: Proceedings of the 13th annual ACM international conference on Multimedia (2005)
[26] Bandai, Tamagotchi (2000) Retrieved from http://www.bandai.com/
[27] Lund, H.H., Nielsen, J.: An Edutainment Robotics Survey, 3rd Intl. Symposium on Human and Artificial Intelligence Systems (2002)
[28] Fogg, B.J.: Persuasive technology: using computers to change what we think and do. Morgan Kaufmann Publishers, Amsterdam, Boston (2003)
[29] Catherine, Z., Paula, G., Larry, H.: Can a virtual cat persuade you?: The role of gender and realism in speaker persuasiveness, presented at ACM CHI’2006 (2006)
[30] Biocca, F.: The cyborg’s dilemma: embodiment in virtual environments, 2nd Intl. Conference on Cognitive Technology - Humanizing the Information Age (1997)
[31] Schwienhorst, K.: The State of VR: A Meta-Analysis of Virtual Reality Tools in Second Language Acquisition. Computer Assisted Language Learning 15, 221–239 (2002)
[32] Mahmood, A.K., Ferneley, E.: Can Avatars Replace The Trainer? A case study evaluation, International Conference on Enterprise Information Systems (ICEIS), Porto (2004)
[33] Tamura, T., Yonemitsu, S., Itoh, A., Oikawa, D., Kawakami, A., Higashi, Y., Fujimooto, T., Nakajima, K.: Is an entertainment robot useful in the care of elderly people with severe dementia? The Journals of Gerontology Series A 59, M83–M85 (2004)
[34] Wiratanaya, A., Lyons, M.J., Abe, S.: An interactive character animation system for dementia care, Research poster, ACM SIGGRAPH (2006)
[35] Robins, B., Dautenhahn, K., te Boekhorst, R., Billard, A.: Robotic Assistants in Therapy and Education of Children with Autism: Can a Small Humanoid Robot Help Encourage Social Interaction Skills? In: UAIS, 4(2), 1–20. Springer-Verlag, Heidelberg (2005)
[36] Searle, J.R.: Minds, brains and programs. Behavioral and Brain Sciences 3, 417–457 (1980)
[37] McDermott, D.: Yes, Computers Can Think, in New York Times (1997)
[38] Medawar, P.B.: The art of the soluble. Methuen, London (1967)
[39] Cycorp, Cyc. (2007) Retrieved February 2007, from http://www.cyc.com/
[40] MacDorman, K.F.: Subjective ratings of robot video clips for human likeness, familiarity, and eeriness: An exploration of the uncanny valley, ICCS/CogSci-2006 (2006)
[41] de Silva, G.C., Lyons, M.J., Kawato, S., Tetsutani, N.: Human Factors Evaluation of a Vision-Based Facial Gesture Interface, IEEE CVPR (2003)
[42] Lyons, M.J., Chan, C., Tetsutani, N.: MouthType: Text Entry by Hand and Mouth, presented at ACM CHI’2004 (2004)
[43] Lyons, M.J., Tetsutani, N.: Facing the Music: A Facial Action Controlled Musical Interface, presented at ACM CHI’2001 (2001)
[44] Chan, C., Lyons, M.J., Tetsutani, N.: Mouthbrush: Drawing and Painting by Hand and Mouth, ACM ICMI-PUI’2003 (2003)

Towards Generic Interaction Styles for Product Design

Jacob Buur and Marcelle Stienstra

Mads Clausen Institute for Product Innovation, University of Southern Denmark, Grundtvigs Allé 150, 6400 Sønderborg, Denmark
{buur, marcelle}@mci.sdu.dk
Abstract. A growing uneasiness among users with the experience of current product user interfaces puts pressure on interaction designers to innovate user interface conventions. In previous research we have shown that a study of the history of product interaction triggers a broader discussion of interaction qualities among designers in a team, and that the naming of interaction styles helps establish an aesthetics of interaction design. However, that research focused on one particular product field, namely industrial controllers, and it remained to be shown whether interaction styles have generic traits across a wider range of interactive products. In this paper we report on five years of continued research into interaction styles for telephones, kitchen equipment, HiFi products and medical devices, and we show that it is indeed possible and beneficial to formulate a set of generic interaction styles.

Keywords: Interaction styles, interaction history, product design, user interface design, tangible interaction, quality of interaction.
in education as a way of explaining the historical inheritance and debating the difference between alternative design solutions. Since user interaction design shares many characteristics with industrial design, we claim that interaction design can benefit greatly from an understanding of the concept of style. It can provide designers with strong visions and a sense of direction in designing new user interfaces.

In particular we focus on user interface design for physical IT-products with small displays and dedicated keys, because of the tight coupling of interaction design and industrial design. The design of such user interfaces seems largely governed by technological progress, and to a large extent they seem to inherit user interface principles from the computer world, just one generation delayed. Human-Computer Interaction (HCI) interface principles were designed for full keyboard and mouse operation; they therefore become much more cumbersome with a tiny display and a limited number of keys. And in particular, when moving away from buttons and screen to forms of tangible interaction, HCI principles fall short of providing much help.

We are concerned that interaction designers, in their enthusiasm for new technologies, fail to transfer the qualities of use that were achieved with previous technologies. It is, however, pointless to copy products of the past exactly, as society’s needs and values have changed and technology has moved on. But we argue that it is possible to use the interaction style of a particular period as inspiration for an innovative blend of interaction style, functionality and technology within a contemporary interaction design. In this way we may be able to preserve qualities of interaction otherwise lost in history.
In previous research we have shown that a study of the history of product interaction triggers a deep discussion of interaction qualities among designers in a team, and that the naming of interaction styles helps establish an aesthetics of interaction design [1, 2]. Since then we have expanded our research from industrial controllers to a broader range of interactive products including telephones, kitchen equipment, HiFi products and medical devices, and in this paper we will show that it is possible and beneficial to formulate a set of generic interaction styles for interactive products. Our research is based on two types of investigations:
1. Historical analysis of interactive products. We identify and characterise style eras for each of five product fields, then compare the style eras across the fields.
2. Design experiments (research-through-design). We exaggerate the qualities found in historic eras, but implement them with contemporary technology, e.g., a mobile phone with the interaction experience of a 1930 rotary dial telephone. Then we analyse all design experiments across the five product fields to identify core dilemmas in current interaction design.
Based on these investigations we propose a set of four generic interaction styles.
2 Interaction Styles in History
The concept of style has been the focus of much debate within all genres of art, from literature and visual arts to architecture and design. In recent decades, emphasis has shifted from understanding style as a method of categorisation based on particular
J. Buur and M. Stienstra
conventions of content and norms to an understanding that styles are defined within social groups and are essentially dynamic in both form and function [3, 4]. In a relatively new field such as interaction design, discussions about style have only started recently, e.g., [2, 5, 6]. Style has been used for different purposes: to classify products and systems [6], but also to serve as an inspiration to create a specific look and feel [7]. In this paper we focus mainly on the latter approach. In our understanding of style, the following concepts are important: ‘network of norms’, ‘style marker’, and ‘interpretation community’. Essential to style is, as Merleau-Ponty explains in [5], the fact that perception – which lies at the basis of stylization – ‘cannot help but to constitute and express a point of view’. Stylization thus starts the moment we perceive an object and is an individual activity: it depends very much upon the person (his/her competences, references and experiences) and the context in which the stylization takes place. We compare the object with similar objects based on, for example, function and usage. Essential to this systematic activity is the existence of a given system, which Enkvist calls the ‘network of norms’ [8]: ‘a compilation of prior experiences with objects into a style taxonomy that makes it possible to find correspondences, both differences and similarities, between new objects and previous norms’. Enkvist observes that all style experiences arise from comparison. The comparison of artefacts that we see as similar lets us identify ‘style markers’, i.e. elements in the products that significantly correlate with or deviate from prevailing norms of design [8]. Our investigation covers five product genres: industrial controllers (in collaboration with Danfoss), telephones (Nokia), kitchen equipment, HiFi equipment (Bang & Olufsen), and medical devices (Novo Nordisk).
We organised the style study as a yearly 2-week seminar for graduate design students, with a new product genre each year. Each seminar included literature search, museum studies, interviews with curators, and videotaping of interactions with historic products. Product collections in museums provide a good opportunity to engage in the comparison activity. With groups of 16-20 students (our ‘interpretation community’) we were able to cover 2-3 museums for each product genre; typically a combination of a science museum and a private company collection. To ensure a broad view of styles, we split into 3-5 teams, each with a particular focus of study:
• Society context: What is the dominating view on humans and technology?
• Hands and skills: What movements and skills are required to interact?
• Technology: What main technologies are employed for functions and manufacturing?
• Company Spirit: What is the dominating self-image of the manufacturer?
Based on the collected data we sequenced a timeline into appropriate eras of a dominating ‘style’, characterised each style era, created an appropriate set of style names, and produced collage posters to communicate findings (see fig. 1). The posters then served as input for the ensuing design experiments (described in section 3). The naming of style eras posed a particular challenge. Where style labels in the history of architecture and design typically spring out of the style discourse of the period (e.g., De Stijl, Dada, Art Deco) and the origins of dominant pieces of art (e.g., Bauhaus, Pop Art, Swiss Style), the discussion of user interaction experience is rather
Fig. 1. Four posters describing eras of telephone interaction styles, each covering the dominant aspects of community, hand movements & skills, knowledge allocation, technology & design
recent. So for each product genre we were in the unique situation of discussing and naming all style eras through the 20th century at the same time.¹ It became clear to us that interaction style names need to point to people, interaction purpose and experience, rather than to the visual identity of user interfaces (buttons, knobs, sliders). Thus we chose labels like Routine Caller (30s-70s telephones), Food Processor Queen (50s-70s kitchen equipment) and Analog Professional (60s-70s industry controllers) – rather flamboyant names to trigger the imagination of interaction designers. The naming discussions were long and intensive, because they seemed to condense the many observations and interpretations. In the first seminars we left the naming to a small sub-group in order to reach consensus more easily. But later we realised that this discussion may well be the core of forming a shared style understanding, as it contributes exactly to the development of the ‘network of norms’. The naming discussion seemed precisely to foster the building of shared norms in the investigation team. According to Engholm [5]: ‘the stylization will always depend on the discursive context that one is part of and on one’s historical, cultural or technical competence’. Therefore it was even more important that all students were involved, not just a small group. In addition to the naming activity, the format of style posters worked exceedingly well as a way to synthesize what we had seen at the museums. The graduate students took pride in their work and the style period labels quickly became part of the language repertoire in discussions.
¹ One can argue that the invention of electricity also gave birth to the field of user interface design as we know it today. Therefore we focused in our study on products invented at the end of the 19th century and the start of the 20th.
Fig. 2. Four interaction style eras presented in the form of a ‘style book’. The style eras have been generated after analyzing the timelines for all five product genres. The ‘operation’ of the pages symbolizes the main mode of interaction of the respective eras: turning, sliding, clicking, and brushing.
When we align the style timelines for all five product genres, it is obvious that they share similar developments, although for some genres a new era may arrive earlier than for others (see fig. 3). To stress this we have indicated rather sharp transitions between one style and the next (see fig. 2). In reality the eras should be seen as waves with large overlaps. One would expect this similarity, as all products in an overall sense are embedded in the same society discourses and draw on the same technology inventions. We have however refrained from coining composite style labels, as we feel that the collapsing of the specific product style names would result in too abstract names without sufficient imaginative power.
3 Designing with Interaction Styles
To test the power of interaction style thinking, we challenged our graduate students to design contemporary digital products that would incorporate the interaction experience of each of the style eras studied: a mobile phone, a microwave oven, a motor controller. By keeping the specifications constant across styles we were able to compare experiences (see fig. 4). This led to a large collection of design samples and many challenging discussions, like: how would you send an SMS with the feeling of an old-time crank? Or how would you use the rotary dialing motion in a portable telephone?
Fig. 3. Comparison of interaction style studies for five different product genres
Fig. 4. Four mobile phones inspired by respectively the Magic Connector period, the Routine Caller period, the Life Chatter period and the Information Navigator period (see also fig. 2)
Based on the design experiments we claim that interaction style thinking indeed helps designers to increase their sensitivity to experience issues and break with user interface conventions. We support this with three observations. Firstly, we observed a fine spread of interaction qualities in the designs that the graduate students produced following the history style studies. Along the way some teams found it very difficult to let go of their preoccupation with button and display technology, and some simply copied user interface components of the past. In the end, however, all teams created designs that support rich actions and established convincing links between actions and functions. Secondly, the graduate students were able to compare their designs to exemplars in history and, most importantly, they were explicit about the expression of interaction they wanted to support. They demonstrated in their presentations that they had established a shared understanding of different interaction styles based in history and the respective qualities of each style. Thirdly, the students themselves were positive about the interaction style thinking compared to their prior experiences. One student, for instance, expressed his surprise about the richness of interaction history: »Inspiration from the past is like going to the beach - there is so much more to find.« Another one adds: »We did suffer from preconceptions. We think we know all about telephones already.« Our next step was to see whether, by looking at the designs themselves, we were able to abstract generic characteristics. We analyzed four motor controller designs, four mobile phones, and five microwave ovens. In order to reduce the risk of a circular argument (that what we learn from the designs only confirms what we knew already from the historical study), we added a set of 10 MP3 player designs.
This assignment did not explicitly refer to historical interaction styles, but required students to design a new interface for an iShuffle-inspired MP3 player (no screen, very simple functionality) that would support rich interaction, bodily engagement and the expressiveness of product movements [9]. For reasons of clarity, only 17 of the 23 designs appear on the clustering diagram in fig. 5. The analysis helped us explicate two dilemmas in current (tangible) interaction design. One, the designs seem to support either an explanatory or an exploratory mode of interacting. The ‘explanatory’ designs provide a direct link between the goal you want to achieve and how to get there. Every step is explainable: there is a feeling of being in control. The ‘exploratory’ designs, on the other hand, are less ‘serious’ in that they support a playful building of interaction skills, where the goal may be less important than the action itself.
Fig. 5. Visual comparison of 17 student designs; 4 motor controllers, 4 telephones, 5 microwaves and 4 MP3 players. Only the MP3 players were not designed using the history styles explicitly as inspiration.
Two, there seems to be an important distinction between discrete and composite interface designs, or – put very bluntly – simple and complex products. The ‘discrete’ interaction designs favor one control for each function they offer (think of old radios with different buttons to choose wave lengths, sliders to select radio channels, knobs to adjust volume, treble, bass). Products with ‘composite’ interaction have general controls to access different, layered functions (think of the keypad on mobile phones).
4 Towards Generic Interaction Styles
Looking back at the interaction styles in history and comparing them with the designs made by students, we argue that it is possible to extract four generic and contemporary interaction styles based on the presented material. We take inspiration from the work of Maaß & Oberquelle [10], who proposed four perspectives to explain differences in how designers conceive of the computer in relation to its users: the system perspective, the information processor perspective, the workshop perspective and the media perspective. We propose four interaction styles for interactive products characterized as follows:
• Tangible Control (discrete, explanatory): the product exhibits its function through its design; the interface consists of several discrete controls; the spatial arrangement of the controls supports the understanding of the product; the interaction takes place where the product is placed. This style supports the view that interactive technology is a tool that people employ to achieve a certain, explicit purpose.
• Elastic Play (discrete, exploratory): there are specific controls for specific functions; the interface consists of a wide variety of general control types (buttons, sliders, handles etc.); the interaction supports physical input and feedback; learning to interact with the product requires both a cognitive and a bodily understanding. Elastic Play banks on virtuosity: technology is an expressive instrument that people can learn to master, and it aims to grow with their skills.
• Rhythmic Logics (composite, explanatory): the product is a complex system which consists of different layers; the interaction requires a cognitive understanding of the product; input is a rhythmic sequence of simple actions, like button tapping; the interaction focuses on efficiency; feedback is digitally mediated. Technology is an ‘intelligent’ partner that people negotiate sense with.
• Touch-free Magic (composite, exploratory): the product reacts in surprising ways; it may not have one clear identity (e.g., phone, camera and music player in one); personal style (in appearance and/or interaction) is important - in a way, the user also becomes the designer of the product; the product supports an exploratory type of interaction with no or very light touch; the product may move and respond physically, but there is no tactile feedback; interaction with the product takes place where the user is. This style supports the view of technology as a wonder, as something unexplainable, a magic that people can learn to engage in/with.
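The four styles are positioned on the two dilemmas identified in section 3 (discrete versus composite controls, explanatory versus exploratory mode). As a rough illustration only, that two-by-two structure can be written down as a small lookup table. The names `Controls`, `Mode`, `Style`, `GENERIC_STYLES` and `classify` are hypothetical and introduced here for the sketch; the one-phrase "view of technology" is condensed from the style descriptions above.

```python
from enum import Enum
from typing import NamedTuple


class Controls(Enum):
    DISCRETE = "discrete"    # one control for each function
    COMPOSITE = "composite"  # general controls that access layered functions


class Mode(Enum):
    EXPLANATORY = "explanatory"  # every step is explainable, feeling of control
    EXPLORATORY = "exploratory"  # playful building of interaction skills


class Style(NamedTuple):
    name: str
    view_of_technology: str  # condensed from the style descriptions


# Hypothetical lookup table: one generic style per combination of the two axes.
GENERIC_STYLES = {
    (Controls.DISCRETE, Mode.EXPLANATORY): Style("Tangible Control", "tool"),
    (Controls.DISCRETE, Mode.EXPLORATORY): Style("Elastic Play", "expressive instrument"),
    (Controls.COMPOSITE, Mode.EXPLANATORY): Style("Rhythmic Logics", "'intelligent' partner"),
    (Controls.COMPOSITE, Mode.EXPLORATORY): Style("Touch-free Magic", "wonder"),
}


def classify(controls: Controls, mode: Mode) -> Style:
    """Return the generic style for a position on the two axes."""
    return GENERIC_STYLES[(controls, mode)]


print(classify(Controls.COMPOSITE, Mode.EXPLORATORY).name)
```

Such a table only captures the positioning of the styles; the qualities listed under each style are what the paper intends as inspiration for design discussions.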
5 Conclusions
The generic interaction styles presented here are based on five studies of interaction history combined with a number of conceptual design experiments. They have come into being after long discussions amongst interaction designers and researchers. They clearly refer to qualities of interaction from the past, but have a contemporary character, being based upon current technology and the needs and values of today’s society. A next step would be to investigate how the generic interaction styles work for interaction designers who haven’t been involved in the preceding discussions. Rather than use our generic style proposals as an analytic tool or as design guidelines, we aim at provoking interaction designers to discuss how they relate to their own product genres. Such discussions are vital in order for common understanding and agreement to arise, and to create a shared ‘network of norms’ in Enkvist’s sense [8]. Are the descriptions, examples and illustrations provided for each style enough to serve designers as the inspiration we intended them to be? Or are specific activities – like museum studies – required to get a deeper understanding of the styles, and what could such activities be? Employed as a trigger for discussion, we believe that the generic interaction styles can help interaction designers to innovate the dominant user interface conventions.
Acknowledgments. We would like to thank the IT Product Design students at the University of Southern Denmark for their enthusiastic participation in the interaction style experiments, in particular Mads Vedel Jensen, Peng Cheng, Mette Mark Larsen, Ken Zupan, Kyle Kilbourn, Anda Grarup, René Petersen and Yingying Wang who created poster style guides and helped analyze the material.
References
1. Øritsland, T.A., Buur, J.: Taking the best from a company history - designing with interaction styles. In: Symposium on Designing Interactive Systems 2000, ACM Press, New York (2000)
2. Øritsland, T.A., Buur, J.: Interaction Styles: An Aesthetic Sense of Direction in Interface Design. International Journal of Human-Computer Interaction 15(1), 67–85 (2003)
3. Chandler, D.: An Introduction to Genre Theory, [WWW document], [15.02.2007] (1997), URL: http://www.aber.ac.uk/media/Documents/intgenre/intgenre.html
4. Ylimaula, A.M.: Origins of style - Phenomenological approach to the essence of style in the architecture of Antoni Gaudi. In: Mackintosh, C.R., Wagner, O. (eds.) University of Oulu, Oulu, Finland (1992)
5. Engholm, I.: Digital style history: the development of graphic design on the Internet. Digital Creativity 13(4), 193–211 (2002)
6. Ehn, P., et al.: What kind of car is this sales support system? In: On styles, artifacts, and quality-in-use. In Computers and design in context, MIT Press, Cambridge (1997)
7. Engholm, I., Salamon, K.L.: Webgenres and -styles as socio-cultural indicators - an experimental, interdisciplinary dialogue. In: The Making, Copenhagen, Denmark (2005)
8. Enkvist, N.E.: Någat om begrepp och metoder i språkvetenskaplig stilforskning. In Om stilforskning/Research on Style, Stockholm, Sweden: Kunglig Vitterhets Historie och Antikvitetsakademien (1983)
9. Djajadiningrat, J.P., Matthews, B., Stienstra, M.: Easy Doesn’t Do It: Skill and Expression in Tangible Aesthetics. Special Issue on Movement of the Journal for Personal and Ubiquitous Computing, forthcoming
10. Maaß, S., Oberquelle, H.: Perspectives and Metaphors for Human-Computer Interaction. In: Floyd, C., et al. (eds.) Software Development and Reality Construction, pp. 233–251. Springer, Heidelberg (1992)
Context-Centered Design: Bridging the Gap Between Understanding and Designing
Yunan Chen and Michael E. Atwood
College of Information Science & Technology, Drexel University, Philadelphia, PA, 19104, USA
{yunan.chen, michael.atwood}@ischool.drexel.edu
Abstract. HCI is about how people use systems to conduct tasks in context. Most current HCI research focuses on a single user’s or multiple users’ interaction with system(s). Compared with the user, system and task components, context is a less studied area. The emergence of ubiquitous computing, context-aware computing, and mobile computing requires system design to be adaptive and to respond to aspects of the setting in which the tasks are performed, including other users, devices and environments. Given the importance of context in information system design, we note that even the notion of context in HCI is not well-defined. In this paper, we review several theories of context as it relates to interaction design. We also present our Context-centered Framework, which aims to bring end users’ understanding and designers’ designing together. The research design and expected outcomes are also presented.
context of the interaction itself. Dourish [4] identified two perspectives on context: representational and interactional. He argues that the correct focus for research is on the interaction between objects and activities and not solely on the representation of the objects. We concur with this observation and also with Greenberg’s point [5] that context is not a fixed, descriptive element, but a dynamic and interactive element. Designing context-aware systems for complex environments is very challenging because the knowledge needed to solve this complex problem is possessed by people who typically work in different domains. This is known as the Symmetry of Ignorance, and a communication breakthrough is needed in these cases [6]. Since end-users live in their context, they understand the context much better than system designers do. But end-users must rely on others to design the systems they need. Doing so effectively requires a shared understanding of context to ensure a good design in context-rich environments. To solve this problem, in this paper we present a Context-centered Framework for interactive system design which is intended to answer the following three research questions.
− What is context when it is applied in interactive design?
− What are the components of the context?
− How can we use context to bridge the gap between understanding and designing?
2 Literature Review: Theories and Metaphors
Although many current theories within HCI do not explicitly address the issue of context, some consideration of context is embedded in these theories. We review these theories in this section (Table 1).

Table 1. Theories and Metaphors Applicable to the Use of Context in HCI

Theory | Basic Unit of Analysis | Components of Context
Activity Theory | An activity: a form of doing directed to an object that transforms an object into an outcome | Subject, Tools, Object, Rules, Community, Division of labor
Distributed Cognition | A cognition system composed of individuals and the artifacts they use | Goals, Internal Representation, External Representation
Situated Action | The activity of persons acting in a setting | Person, Activity, Setting, Relationship between person and setting
GOMS | GOMS: the user’s cognitive structure | Goals, Operators, Methods for achieving the goals, Rules for choosing methods
Awareness | Awareness: knowledge about the state of some environment | People, Artifacts, Time, Actions happened and happening
Locales Framework | Locales: the relationship between a social world and its interactional needs, and the “site and means” its members use to meet those needs | Locales foundation, Civic structures, Individual views, Interaction trajectory, Mutuality
2.1 Activity Theory
Activity theory (AT) is a research framework originating in Soviet psychology in the 1920s [7]. Its application has more recently been introduced to the information systems area [8, 9]. The object of AT is to understand the unity of consciousness and activity [9].
Emphasis on Context: Nardi [9] argued that AT is a descriptive tool which provides a different perspective on human activity. Activity theory begins with the notion of activity. Unlike many other theories which take human actions as the unit of analysis, AT takes actions and the situated context as a whole and calls this an activity. Context is the activity and the environment in which it occurs.
2.2 Distributed Cognition
Distributed cognition theory [10, 11] holds that humans augment their knowledge by placing memories, facts, or knowledge on the objects, individuals, and tools in their environment. Distributed cognition breaks the traditional boundary between internal and external and combines the two into a distributed system.
Emphasis on Context: The distributed cognition system as a whole is the context for the activities that are carried out within it. Since distributed cognition theory focuses on the distributed nature of the activity solving process, it takes into account people and artifacts situated in various locations. It is widely adopted in Computer Supported Cooperative Work (CSCW) studies, which emphasize collaboration across multiple participants [12].
2.3 Situated Action
Situated action was first introduced in 1987 by Lucy Suchman [13]. Rather than decomposing the circumstances and the actions taken according to a preset plan, situated action theory holds that actions are highly contextualized; the context of the specific situation determines what the next action is.
Suchman believes that people construct their plan as they go along in the situation, creating and altering their next move based on what has just happened, rather than planning all actions in advance and simply carrying out that plan.
Emphasis on Context: Situated action theory holds that context is dynamic and associated with actions. From the situated action point of view, an action plan is not pre-defined, but consists of many unpredicted actions which are determined by the specific context in which they are situated. In this way we can define and analyze context as an interactional entity from the action point of view.
2.4 Locales Framework
The locales framework [14, 15] is a theory that creates a shared abstraction among stakeholders and bridges understanding and design in the CSCW field. Basically, a locale is a space, together with the resources available there, that stands in a particular relationship to a social world and is used to meet that world’s interaction needs. A locale could be either a physical space or a virtually shared environment.
Emphasis on Context: Though Fitzpatrick only studied locales in the CSCW field, the notion of ‘locales’ could be applied to any interaction situation. A locale is an individual context in this sense, and the framework could help identify a locale’s five properties. This would potentially
2.5 GOMS
GOMS [16] is a method for modeling and describing human task performance. GOMS is an acronym that stands for Goals, Operators, Methods, and Selection Rules, the components of which are used as the building blocks for a GOMS model.
Emphasis on Context: GOMS provides an alternative view of context. Instead of being a shared environment with the people and artifacts inside it, context is a means to select and conduct activities. Context does not necessarily have to consist of tangible artifacts. As distributed cognition theory claims, human cognition is part of context too. Though the rules are not physical artifacts, they restrict the way in which actions are carried out.
2.6 Awareness
Awareness is generally defined in terms of two concepts: knowledge and consciousness. In the HCI scope, awareness is studied as it relates to the interaction between an agent and its environment.
Emphasis on Context: Dourish and Bellotti [17] defined awareness as “an understanding of the activities of others, which provides a context for your own activities.” In this sense, awareness could be simply defined as “knowing what’s going on in the context [18].” This definition indicates that awareness is associated with the context in which the intended task is being processed. Knowing what is going on also provides users with feedback on, and consciousness of, the context.
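To make the GOMS view in section 2.5 more concrete, where context acts through rules that select among methods, the following is a minimal sketch and not the notation of any particular GOMS variant. The classes `Operator`, `Method` and `Goal`, the `choose_method` rule, the example task and all durations are hypothetical values chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Operator:
    name: str
    seconds: float  # hypothetical duration, for illustration only


@dataclass
class Method:
    name: str
    operators: List[Operator]

    def duration(self) -> float:
        """Total time of the method: the sum of its operator durations."""
        return sum(op.seconds for op in self.operators)


@dataclass
class Goal:
    description: str
    methods: Dict[str, Method]
    # Selection rule: maps the current situation to the name of a method.
    select: Callable[[dict], str]

    def plan(self, situation: dict) -> Method:
        return self.methods[self.select(situation)]


def choose_method(situation: dict) -> str:
    # Assumed rule: stay on the keyboard if the hands are already there.
    return "keyboard" if situation.get("hands_on_keyboard") else "mouse"


delete_word = Goal(
    description="delete a word",
    methods={
        "keyboard": Method("keyboard", [
            Operator("select word with shortcut keys", 0.5),
            Operator("press delete", 0.25),
        ]),
        "mouse": Method("mouse", [
            Operator("move hand to mouse", 0.5),
            Operator("point at word", 1.0),
            Operator("double-click", 0.5),
            Operator("press delete", 0.25),
        ]),
    },
    select=choose_method,
)

chosen = delete_word.plan({"hands_on_keyboard": True})
print(chosen.name, chosen.duration())
```

In this reading, the `situation` dictionary stands in for the context: it is not an artifact in the environment, but it determines which method is carried out.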
2.7 Contextual Factors Identified
From the above review, we conclude that context, although defined and used differently in these theories, does share some common elements. The contextual factors associated with each theory are outlined in table 2. Our review and analysis suggest that context is not a fixed, descriptive element. Instead, it is a dynamic and interactive element which arises from the activity and is particular to each occasion of activity.
Table 2. Contextual Factors Extracted from HCI Theories

Factors | Explanations
Motivation | The reasons for an action
Goal | The intended outcome of an action
Activity | Action
Rules | Principles or regulations of an action
Constraint | Limitation or restriction of an action
Awareness | Knowing what’s going on
Methods | Different ways of conducting an action
People | People involved in an action and their roles
Objects | Relevant artifacts
Settings | Either physical or virtual space for an action
3 Context Revisited
Given the importance of context in system design and the contextual factors extracted from the previous theories, we are interested in what exactly context is, and what a context-aware system is, from this activity-bounded view.
3.1 Context Definition
Both Dourish’s [4] point and the literature review above indicate that context is a property of an interaction between objects and activities, not of the objects in the environment alone. From this interactional point of view, context is “a relational property held between objects or activities. We can not simply say that something is or is not context; rather, it may or may not be contextually relevant to some particular activity.” [4] This viewpoint shows that context is a dynamic property which is particular to each occasion of activity or action. Therefore, context in our definition is: a dynamic property arising from activities, which interacts with and constrains the activities that happen within it.
3.2 Context-Aware System
A context-aware application is adaptive, reactive, responsive, situated, context-sensitive and environment-directed [19]. Since the definition of context varies depending on its usage, the notion and usage of context-aware applications also differ greatly. In the early stage, a context-aware system was depicted as one that “adapts according to its location of use, the collection of nearby people and objects, as well as changes to those objects over time.” [20] Context as depicted in this definition is only a representational problem. What does context-aware mean when context is an interactional property? Dey [2] defines context-aware as: “a system is context-aware if it uses context to provide relevant information and/or services to the user, where relevancy depends on the user’s task”. There is no doubt that being adaptive and responsive to the surrounding environment is the key characteristic of context-aware computing. From the activity point of view, contextual information is determined by the activities that happen within the context; a task is a more general notion, and a task may contain many goal-oriented activities. Therefore we define a context-aware application as: a system which can incorporate relevant contextual information and adapt to the situation in which it is situated, where the contextual information is determined by the goal-oriented activities users carry out to complete tasks.
4 Context-Centered Framework
Our Context-Centered Framework is intended both to incorporate context into design and to facilitate communication between end users and designers. In contrast to the locales framework, which considers context as a static environment, we adopt a dynamic view that combines context with the task solving process. End users could use this framework to identify the contextual information associated with their working activities. It also assists designers in analyzing the system features and validating them in the context. We take activity as a unit of analysis in this framework.
4.1 Action as a Unit of Analysis
The review shows that context is inseparable from activities; whether something is considered to be context or not is determined by its relevance to a particular activity. Therefore, we set the unit of analysis in our study at the activity level. From the interaction point of view, contextual information is initiated from and bounded by the activities that happen within it. According to Nardi’s [9] hierarchical levels of activities, activities are long-term formations and their objects cannot be translated into outcomes at once, but only through a process that often consists of several steps or phases of actions. Actions under the same activity are related to each other by the same overall object and motive.
4.2 Context-Centered Framework Aspects
From the hierarchy of activity point of view [9], the activity is similar to the task which users are trying to accomplish, actions are the steps for achieving it, and operations are the procedures under each step. Context differs for each step and also for the overall task. For each action, there are four aspects along which to analyze it. These four aspects are highly interdependent and overlapping. They are connected by the same action being undertaken. Combined, the aspects have the potential to capture many contextual characteristics of the working settings.
Y. Chen and M.E. Atwood
Goal: The first step in understanding the context is to identify the object of the activity, because the object determines which contextual information is relevant. The goal includes the users’ motivation and the intended outcome of performing the activity.

Setting: The setting is the place where participants perform the activity; it can be either a virtual or a physical environment. The relevant setting information includes:
− The people who conduct the activity and their roles
− The characteristics of the setting in which the activity is performed
− The available tools, such as other available methods and approaches
− The artifacts involved in the setting, such as other devices and objects

Rules: The rules for using the resources in the current setting and the constraints on using any tool shape how users perform the activity, e.g. a time constraint for an action.
− Constraints on using the resources in the working setting
− Rules for allocating resources

Awareness: An understanding of others (either objects or people), which provides feedback on and consciousness of the context and the activities.
− The shared context: awareness of the other people involved in the activity and their roles; awareness of the tools and artifacts in the current setting; awareness of the rules and constraints for performing the activity
− Actions: awareness of the actions that have been taken; awareness of the actions being carried out now

Table 3. Contextual factors identified

Goal        The object determines what the contextual information for the activity is.
Setting     The place where participants perform activities. It includes the resources involved in the task-solving process.
Rules       Rules and constraints on using the resources.
Awareness   An understanding and consciousness of the setting and the activity.
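The four aspects can be recorded per action in a small data structure. The following is a minimal sketch and not part of the framework as published: the class name `ActionContext`, its field names and the nursing example values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ActionContext:
    """Context of one action, described by the four framework aspects."""
    action: str
    goal: str                                       # motivation and intended outcome
    setting: list = field(default_factory=list)     # people, place, tools, artifacts
    rules: list = field(default_factory=list)       # constraints and allocation rules
    awareness: list = field(default_factory=list)   # shared context and ongoing actions

    def missing_aspects(self):
        """Return the names of the aspects that have not been filled in yet."""
        aspects = {
            "goal": self.goal,
            "setting": self.setting,
            "rules": self.rules,
            "awareness": self.awareness,
        }
        return [name for name, value in aspects.items() if not value]


# Hypothetical example values, for illustration only.
ctx = ActionContext(
    action="record a patient's vital signs",
    goal="keep the patient chart up to date",
    setting=["nurse", "ward", "bedside monitor"],
    rules=["complete within the ward round"],
)
print(ctx.missing_aspects())  # ['awareness']
```

A check such as `missing_aspects` could help an end user or designer notice which aspect of an action has not been described yet.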
5 Research Design

To understand how context can be used to bridge the gap between understanding and designing, we designed a 2x2 experiment to test 1) whether the Context-Centered Framework can bridge understanding and designing and 2) whether the Context-Centered Framework can generate a better design than an approach without contextual consideration. A scenario-based design (SBD) [21, 22] approach is applied in our experiment. SBD is an ideal way to measure the implications of context in design [23]. Two groups of students will be recruited to generate scenarios based on given tasks. Students who have taken HCI courses are assumed to have some design expertise, whereas students who have nursing training are considered end users. We will apply two conditions to these students: with and without Context-Centered Framework training (Table 4).
Context-Centered Design: Bridging the Gap Between Understanding and Designing
Table 4. Research Design

             Without training   With training
Designers    Group D1           Group D2
End Users    Group E1           Group E2
The hypotheses are:

Ha: In both the designer and the end-user groups, using the Context-Centered Framework could produce better design scenarios.

Hb: Without context consideration, designers will generate better design scenarios than end users, whereas with Context-Centered Framework training, end users could generate better design scenarios than designers.

To assess the quality of the scenarios, two HCI experts will review them and score them according to their quality as a basis for design.
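The comparison implied by Ha and Hb can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say how the two experts' scores are combined, so averaging them is an assumption, as are the function names `group_means` and `check_hypotheses`. The sketch checks only the direction of the differences between group means; a real analysis would also need a significance test.

```python
from statistics import mean


def group_means(ratings):
    """Average the expert scores per group.

    ratings maps a group label (D1, D2, E1, E2) to a list of
    (expert_1_score, expert_2_score) pairs, one pair per scenario.
    """
    return {
        group: mean(mean(pair) for pair in pairs)
        for group, pairs in ratings.items()
    }


def check_hypotheses(m):
    """Compare group means in the directions stated by Ha and Hb."""
    return {
        # Ha: training helps both designers and end users.
        "Ha": m["D2"] > m["D1"] and m["E2"] > m["E1"],
        # Hb: designers lead without training, end users lead with it.
        "Hb": m["D1"] > m["E1"] and m["E2"] > m["D2"],
    }
```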
6 Expected Results

We believe that a focus on context can improve communication between end users and designers. This focus will also produce high-quality scenarios, which will lead to better design products. We expect that without Context-Centered Framework instruction, designers (Group D1) will produce better interaction scenarios than end users (Group E1), whereas when context is taken into consideration, end users (Group E2) will generate higher-quality scenarios than designers (Group D2).
7 Conclusion and Future Work

We intend to use the Context-Centered Framework to connect end users’ understanding of the working setting with designers’ design activities. We believe that the results of this study will be relevant to both researchers and practitioners and will help in designing useful and usable systems, for two reasons. First, the Context-Centered Framework can be a starting point to help analysts and designers understand the working environment. The task-dependent framework can be used to generate initial questions and direct observations. It can also capture working settings from the end users’ point of view. Second, the framework can be used by system designers to identify where features can be added to enhance an existing design, to identify task-related context issues, and to determine how to incorporate them into the system design. Our future work includes conducting the experiment for this study; we also intend to adapt the Context-Centered Framework into a contextual walkthrough for system evaluation.
References

1. Greenbaum, J., Kyng, M. (eds.): Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum Associates, Hillsdale, New Jersey (1991)
2. Dey, A.K., Abowd, G.D., Salber, D.: A Conceptual Framework and a Toolkit for Supporting the Rapid Prototyping of Context-Aware Applications. Human-Computer Interaction 16, 97–166 (2001)
3. Schilit, B., Theimer, M.: Disseminating active map information to mobile hosts. IEEE Netwk 8, 22–32 (1994)
4. Dourish, P.: What we talk about when we talk about context. Personal Ubiquitous Comput. 8, 19–30 (2004)
5. Greenberg, S.: Context as a Dynamic Construct. Human-Computer Interaction 16, 257–268 (2001)
6. Rittel, H.: Second-Generation Design Methods. In: Cross, N. (ed.) Developments in Design Methodology, pp. 317–327. John Wiley & Sons, New York (1984)
7. Wertsch, J.V.: Vygotsky and the Social Formation of Mind. Harvard University Press, Cambridge, MA, London (1985)
8. Bødker, S.: A human activity approach to user interfaces. Human-Computer Interaction 4, 171–195 (1989)
9. Nardi, B.: Context and Consciousness: Activity Theory and Human-Computer Interaction. MIT Press, Cambridge (1996)
10. Hutchins, E.: Cognition in the Wild. The MIT Press, Cambridge, MA (1996)
11. Zhang, J., Norman, D.A.: Representations in Distributed Cognitive Tasks. Cognitive Science 18, 87–122 (1994)
12. Rogers, Y., Ellis, J.: Distributed cognition: An alternative framework for analysing and explaining collaborative working. Journal of Information Technology 9, 119–128 (1994)
13. Suchman, L.A.: Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge University Press, New York (1987)
14. Fitzpatrick, G., Mansfield, T., Kaplan, S.M.: Locales framework: exploring foundations for collaboration support, pp. 34–41 (1996)
15. Fitzpatrick, G., Kaplan, S., Mansfield, T.: Applying the Locales Framework to Understanding and Designing. In: Proceedings of the Australasian Conference on Computer Human Interaction, p. 122. IEEE Computer Society Press, Los Alamitos (1998)
16. Card, S.K., Moran, T.P., Newell, A.: The psychology of human computer interaction. Lawrence Erlbaum Associates, Inc, Hillsdale, NJ (1983)
17. Dourish, P., Bly, S.: Portholes: Supporting Awareness in a Distributed Work Group. In: Proceedings of the Conference on Human Factors in Computing Systems, Monterey, CA, pp. 541–547 (1992)
18. Gutwin, C., Greenberg, S.: A Descriptive Framework of Workspace Awareness for Real-Time Groupware. Computer Supported Cooperative Work (CSCW) 11, 411–446 (2002)
19. Abowd, G.D., Dey, A.K., Brown, P.J., Davies, N., Smith, M., Steggles, P.: Towards a Better Understanding of Context and Context-Awareness. In: Proceedings of the 1st International Symposium on Handheld and Ubiquitous Computing, pp. 304–307. Springer-Verlag, Karlsruhe, Germany (1999)
20. Schilit, B., Theimer, M.: Disseminating Active Map Information to Mobile Hosts. IEEE Network 8, 22–32 (1994)
21. Carroll, J.: Scenario-Based Design: Envisioning Work and Technology in System Development. John Wiley & Sons, Chichester (1995)
22. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, Seattle, Washington (2001)
23. Pinelle, D., Gutwin, C.: Groupware walkthrough: adding context to groupware usability evaluation. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Changing our world, changing ourselves, pp. 455–462. ACM Press, Minneapolis (2002)
Application of Micro-Scenario Method (MSM) to User Research for the Motorcycle’s Informatization - A Case Study for the Information Support System for Safety

Hiroshi Daimoto1,3, Sachiyo Araki1, Masamitsu Mizuno1, and Masaaki Kurosu2,3

1 YAMAHA MOTOR CO., LTD., Japan
2 National Institute of Multimedia Education, Japan
3 Department of Cyber Society and Culture, The Graduate University for Advanced Studies, Japan
Abstract. The Micro-Scenario Method (MSM) is an approach for uncovering consumer needs and establishing development concepts [2]. In this study, the MSM is applied to the Information Support System for Safety related to motorcycles and modified to make its application more efficient. The modifications are to build a prescriptive model before the interview research and to set up syntax rules for the problem-scenario (a sentence describing a problem situation). As a result, the modified MSM improves development efficiency. Communication among the relevant parties can be sped up, because the prescriptive model, in which keywords are structurally organized, helps development actors share wide-ranging information about problem situations. Moreover, the time needed to create problem-scenarios can be cut, because the syntax rule simplifies how they are described. The modified MSM is an effort to put MSM to practical use at YAMAHA Motor Company Ltd. (YMC), and it was considered a useful approach for reducing the workload of HCD (Human-Centred Design).
user needs and to improve their usability. This approach corresponds to the activity “understand and specify the context of use” in the early development stage of ISO 13407. The purpose of the present paper is to propose a modified MSM with an improved method for analyzing the problem-scenario (p-scenario). It introduces two distinctive improvements to the analysis of the p-scenario: “the prescriptive model” and “the syntax rule”.
2 The Prescriptive Model and the Syntax Rule of the Modified MSM

2.1 The Prescriptive Model

The prescriptive model consists of structured keywords derived from literature research. It is used to cover the broad aspects of the target field and to facilitate a shared understanding of the research content among development actors (users, engineers, designers, usability engineers). Before the interview research, we built the prescriptive model (see Fig. 1), which is organized in terms of i) rider factors (physical factor, emotional factor, personality factor, information processing factor), ii) vehicle body factors (breakdown, poor maintenance, etc.), and iii) environmental factors (surrounding vehicles, traffic situation, road surface condition, etc.). Fig. 1 shows the structured accident causes for a motorcycle. The keywords about accident causes are grouped and organized structurally, as in the KJ method [1].
Fig. 1. Prescriptive model about the accident factors of a motorcycle
Table 1. Detailed descriptions of contextual factors in motorcycle accidents
Application of MSM to User Research for the Motorcycle’s Informatization
H. Daimoto et al.
After the interview research, the prescriptive model is revised by adding keywords derived from the interviews. Table 1 shows detailed descriptions of the contextual factors in motorcycle accidents. The prescriptive model is based on this structured classification. It gives participants an overall picture of the accident causes. At the p-scenario analysis stage, usability engineers use the prescriptive model to analyze the accident causes by connecting its keywords with the p-scenarios.

2.2 The Syntax Rule

The p-scenarios are derived by organizing the interview data and the literature research. The person responsible for usability spends a great deal of time writing p-scenarios, because the text derived from the interviews and the literature amounts to a huge volume of data. Scenario writers therefore struggle with how to describe the p-scenarios, and their ways of writing vary considerably from person to person. This problem is resolved by setting up syntax rules for the p-scenarios. The syntax rule specifies the elements that must be described. Fig. 2 shows an example of a p-scenario for the Information Support System for Safety related to motorcycles. Fig. 3 shows the traffic situation of the p-scenario (in Japan, vehicles keep to the left). This example p-scenario describes the “subject-object”, “provided information”, “to whom”, “when”, “condition of rider”, “situation of environment and other vehicles”, “a kind of hazard”, and “means”.

When one's own motorcycle goes straight through an intersection on a green light while there is a preceding vehicle (truck, etc.) and an oncoming right-turn car, and both the motorcycle’s rider and the oncoming car’s driver fail to see each other, there is a risk that the oncoming right-turn car might come into the intersection by mistake. Therefore, it is desirable to indicate the presence of one's own motorcycle to the oncoming car's driver. However, such a means does not exist.
Fig. 2. A case example of p-scenario for the Information Support System for Safety related to a motorcycle
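The syntax rule can be read as a fixed set of slots that every p-scenario must fill. The sketch below illustrates that reading; it is not the authors' tooling. The class name `PScenario`, the field names and the way the Fig. 2 text is split across the eight slots are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class PScenario:
    """One problem-scenario with the eight elements named by the syntax rule."""
    subject_object: str
    provided_information: str
    to_whom: str
    when: str
    rider_condition: str
    environment_situation: str
    hazard: str
    means: str

    def missing_slots(self):
        """Return the names of the slots that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# One possible reading of the Fig. 2 example (the slot assignment is an assumption).
fig2 = PScenario(
    subject_object="one's own motorcycle",
    provided_information="presence of one's own motorcycle",
    to_whom="the oncoming car's driver",
    when="going straight through an intersection on a green light",
    rider_condition="fails to see the oncoming car",
    environment_situation="a preceding vehicle and an oncoming right-turn car",
    hazard="the oncoming right-turn car enters the intersection by mistake",
    means="no such means exists",
)
print(fig2.missing_slots())  # []
```

Because every scenario has the same slots, writers no longer have to decide what to describe, only how to fill each slot.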
The p-scenarios were written to cover all keywords of the prescriptive model in Fig. 1. When the p-scenario in Fig. 2 is connected with the prescriptive model in Fig. 1, the accident factors (keywords of the prescriptive model) assumed by the p-scenario are “invisible (= an invisible oncoming car)”, “surrounding vehicle (= a preceding vehicle)”, and “road geometry (= an intersection)”.
Fig. 3. The traffic situation of Fig. 2

3 An Application Study of Modified Micro-Scenario Method

This section describes an application study of the modified MSM for the Information Support System for Safety related to motorcycles. The modified MSM is characterized by “the prescriptive model” and “the syntax rule”.
3.1 Participants

The participants in the interview research were 20 working people in their 20s to 50s. Table 2 shows their detailed attributes and the number of participants. The general riders were recruited through a research company and were paid. The instructors were motorcycle driving instructors. The other selection criteria were (1) riding a motorcycle more than twice a month and (2) having experience of riding on a highway. The attributes were spread as widely as possible in order to hear the voices of various riders. The participants in the questionnaire research were the same 20 working people. However, 4 of them could not take part in the questionnaire research, so the questionnaire data were analyzed for 16 participants.

Table 2. The attributes of participants
3.2 Procedure

In the interview research, each participant answered our questions (“What kinds of information do you want, and under what circumstances?” etc.) and explained the context of the problem situation. Additionally, each participant was shown 15 typical traffic scenes that can lead to an accident (e.g. “crossing collision”, “right turn”, and “bump from behind”), taken from safety education teaching materials for motorcycles, and reported the requirements for motorcycle informatization for safety in each scene; together the scenes covered the major traffic situations. At the end of each report, each participant was also shown the prescriptive model of the scene, which showed the envisioned accident factors. Then, after a general accident cause of each scene had been explained, each participant was asked to report the
more detailed requirements. Each interview took about two hours. The voice data of the interviews were recorded. In the questionnaire research, each participant rated each p-scenario derived from the interview research on a five-point scale for “Level of importance (How important is the problem to be solved for motorcycle safety?)”, “Degree of risk (How dangerous is it due to the absence of a means to solve it?)”, and “Frequency (How often does one encounter a case requiring the means?)”. The p-scenarios were written according to the syntax rule.

3.3 Result

The prescriptive model is used for two purposes. The first is to cover the broad aspects of the accident factors and to facilitate a shared understanding of the accident causes among development actors. With it, participants could easily share the overall picture of the accident causes with the interviewers and report their requirements without omission. As a result of the interviews, 66 p-scenarios were written for the various traffic situations, covering the 39 keywords of the prescriptive model. Table 3 shows the number of p-scenarios for each traffic situation, derived from the interviews and the literature.

Table 3. The number of p-scenarios for each traffic situation
The second is to analyze the accident causes by connecting the prescriptive model with the p-scenarios (39 keywords x 66 p-scenarios). The important accident factors can be identified by analysing the ten highest-scored p-scenarios (see Fig. 3-1 and Fig. 3-2). Table 4 shows the result of the accident factor analysis for the checked factors (checked = 1). The result indicates that “surrounding vehicles”, “invisible”, and “road geometry” were particularly important factors.
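Connecting keywords with p-scenarios amounts to a binary keyword-by-scenario matrix, which can be stored as one keyword set per scenario. The following sketch shows the coverage check and the factor count under that assumption. The function names and the second scenario are hypothetical; only the three keywords attached to the Fig. 2 scenario come from the text.

```python
from collections import Counter


def uncovered_keywords(model_keywords, scenario_keywords):
    """Keywords of the prescriptive model that no p-scenario refers to."""
    used = set().union(*scenario_keywords.values())
    return set(model_keywords) - used


def factor_counts(scenario_keywords, selected_ids):
    """Count how many of the selected p-scenarios check each keyword."""
    counts = Counter()
    for scenario_id in selected_ids:
        counts.update(scenario_keywords[scenario_id])
    return counts


scenario_keywords = {
    # Keywords assigned to the Fig. 2 scenario in Section 2.2.
    "fig2": {"invisible", "surrounding vehicle", "road geometry"},
    # Hypothetical second scenario, for illustration only.
    "s2": {"surrounding vehicle", "road surface condition"},
}
model = {"invisible", "surrounding vehicle", "road geometry",
         "road surface condition", "breakdown"}

print(uncovered_keywords(model, scenario_keywords))  # {'breakdown'}
print(factor_counts(scenario_keywords, ["fig2", "s2"])["surrounding vehicle"])  # 2
```

Passing the identifiers of the ten highest-scored p-scenarios as `selected_ids` would give the kind of per-factor tally that the analysis describes.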
Table 4. The result of the accident factor analysis
Fig. 3-1. High-scored p-scenarios
*1 The score of the supporting data is the average of the questionnaire ratings for “Level of importance (How important is the problem for motorcycle safety?)”, “Degree of risk (How dangerous is it due to the absence of a means to solve it?)”, and “Frequency (How often does one encounter a case requiring the means?)” for the ten best p-scenarios.
*2 The order of the ten high-scored p-scenarios is defined by the overall score (overall score = “Level of importance” x “Degree of risk” x “Frequency”).

Fig. 3-2. High-scored p-scenarios
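The ranking rule in note *2 can be written down directly. The sketch below assumes that each p-scenario has three averaged ratings and that the five-point scale runs from 1 to 5, which the text does not state; under that assumption the overall score lies between 1 and 125. The function names are illustrative.

```python
def overall_score(importance, risk, frequency):
    """Overall score = level of importance x degree of risk x frequency."""
    return importance * risk * frequency


def top_scenarios(scores, n=10):
    """Return the ids of the n p-scenarios with the highest overall score.

    scores maps a scenario id to an (importance, risk, frequency) tuple
    of averaged questionnaire ratings.
    """
    ranked = sorted(
        scores.items(),
        key=lambda item: overall_score(*item[1]),
        reverse=True,
    )
    return [scenario_id for scenario_id, _ in ranked[:n]]
```

Multiplying the three ratings, rather than adding them, means that a scenario rated low on any one dimension drops sharply in the ranking.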
The syntax rule is used to describe the p-scenarios systematically. Using the syntax rule, the p-scenario writer produced 66 p-scenarios based on the text data from the interviews and the literature. Without the syntax rule, the writer would not have been able to produce the p-scenarios efficiently and would have spent much time in vain. In fact, the writer reported that the syntax rule made the p-scenarios easy to write.
4 Summary

The purpose of the present study is to propose a modified MSM with an improved method for analyzing the p-scenario. Specifically, the modified
MSM is applied to the Information Support System for Safety related to motorcycles, and the application example is shown. The modified MSM offers two distinctive improvements: “the prescriptive model” and “the syntax rule”. The application indicates that (1) the prescriptive model helps development actors share wide-ranging information about the accident causes in a structured way, (2) the prescriptive model helps usability engineers analyze the accident causes in detail, and (3) the syntax rule helps scenario writers write p-scenarios easily. The scenario method has been considered effective in the early development stage of HCD [3]. MSM is a method of analysis that applies the scenario technique to qualitative data such as interview data, and its methodological framework is becoming clear. However, few examples of its application to real development have been reported. The present study is a case study of MSM, and the modified MSM is an effort to apply MSM to practical development at YMC. The coverage and quantitative evaluation of the modified MSM remain future work, and the method will be improved by taking in more voices from real development practice.
References

1. Kawakita, J.: Hassouhou. Chuko Shinsho, Tokyo [in Japanese] (1967)
2. Kurosu, M.: Micro Scenario Method. Research Reports on National Institute of Multimedia Education, 17 [in Japanese] (2006)
3. Carroll, J.M.: Five reasons for scenario-based design. In: Proceedings of the 32nd Hawaii International Conference on System Sciences (Maui, HI, January 4-8) [published as CD-ROM], pp. 4–8. IEEE Computer Society Press, Los Alamitos, CA (1999)
4. Carroll, J.M.: Scenario-Based Design of Human-Computer Interactions. MIT Press, Boston, MA (2000)
Incorporating User Centered Requirement Engineering into Agile Software Development

Markus Düchting1, Dirk Zimmermann2, and Karsten Nebe1

1 University of Paderborn C-LAB, Cooperative Computing & Communication Laboratory, Fürstenallee 11, 33102 Paderborn, Germany
2 T-Mobile Germany, Landgrabenweg 151, 53227 Bonn, Germany
{markus.duechting, karsten.nebe}@c-lab.de, [email protected]
Abstract. Agile Software Engineering approaches are gaining more and more popularity in today’s development organizations. The need for usable products is also a growing factor for organizations. Thus, their development processes have to react to this demand and offer ways to integrate usability. The approach presented in this paper evaluates how agile software engineering models consider Usability Engineering activities to ensure the creation of usable software products. The user-centeredness of the two agile SE models Scrum and XP has been analyzed, and the question of how potential gaps can be filled without losing the process’ agility is discussed. Requirements play a decisive role during software development, in Software Engineering as well as in Usability Engineering. Therefore, different User Centered Requirements that ensure the development of usable systems served as the basis for the gap analysis.

Keywords: Agile Software Engineering, Usability Engineering, User-Centered Requirements.
Incorporating User Centered Requirement Engineering
the spiral consists of four major activities and ends with a progress assessment, followed by a planning phase for the next process iteration. Additionally, a risk assessment is performed after each iteration. The iterative approach makes it possible to react adequately to changing requirements. This makes the process of developing software more manageable and, in contrast to the sequential SE model, minimizes the risk of failure.
2 Agile Software Engineering

A recently emerging trend in SE focuses on lightweight, so-called agile models, which follow a different approach to software development. Agile models follow the idea of Iterative and Incremental Development (IID), similar to the Spiral Model mentioned above. In contrast to Boehm’s model, however, iterations in agile models are shorter. Iterations in the Scrum model, for instance, take 30 calendar days. Agile software development does not rely on comprehensive documentation and monolithic analysis activities; instead, it is oriented towards delivery and code quality. Through co-location of the development team, the tacit knowledge shared among team members compensates for extensive documentation efforts. Agile models emphasize communication and aspire to early and frequent feedback through testing, on-site customers and continuous reviews. The basic motivation behind agile and iterative development is the recognition that software development is similar to creating new and inventive products [8]. New product development requires room for research and creativity. It is rarely possible to gather all requirements of a complex software system upfront and to identify, define and schedule all detailed activities. Many details emerge later during the development process. This is a known problem within the domain of SE and the reason for many failed projects [8]. For this reason, agile models implement mechanisms for planning, monitoring and managing SE activities that can deal with changing requirements and other unforeseen incidents.

2.1 Scrum

Scrum is an agile, iterative-incremental SE model. Its development tasks are organized in short iterations called Sprints. Each Sprint starts with a Sprint Planning meeting in which stakeholders decide on the functionality to be developed in the following Sprint. All requirements for a software system are collected in the Product Backlog.
The Product Backlog is a prioritized list and serves as a repository for all requirements related to the product. However, the Product Backlog is never a finalized document; it evolves along with the product. At the beginning of a project the Product Backlog contains only high-level requirements, and it becomes more and more precise during the Sprints. Each Backlog item is assigned a priority that represents its business value and an effort estimate used to plan the resources required to implement it. During Sprint Planning, the Scrum Team picks the high-priority Backlog items that it considers realistic for the next Sprint. Scrum Teams are small interdisciplinary groups of 7 to 9 people [12], which are self-organized and have full authority to determine the best way for reaching the
M. Düchting, D. Zimmermann, and K. Nebe
Sprint Goals. There are no explicit roles defined within the Scrum Team. Scrum emphasizes emergent team behavior, meaning that teams develop their mode of cooperation autonomously. This self-organizing aspect supports creativity and high productivity [12]. The Scrum Team and its manager, the Scrum Master, meet in a short daily meeting, called the Daily Scrum, to report progress, impediments and next steps. Every Sprint ends with a Sprint Review meeting, where the current product increment is demonstrated to project stakeholders.

2.2 Extreme Programming

Extreme Programming (XP) [1] is one of the established agile SE methodologies. Like Scrum, XP is an iterative-incremental development model. However, XP’s iterations are even shorter than Scrum’s. According to Beck, the optimal iteration length is somewhere between 1 and 3 weeks. XP adopts reliable SE techniques to a very high degree. Continuous reviewing is ensured by pair programming, in which two developers sit together at one workstation. XP also applies the principle of common code ownership: all team members are allowed to change code written by someone else when necessary. In addition, XP requires a user stakeholder to be on-site as a means of gathering early user feedback. The requirements in XP are defined by the customer in so-called User Stories. Each story is a brief, informal specification of requirements. Similar to the items in Scrum’s Product and Sprint Backlogs, each User Story has a priority and an effort estimate assigned to it. Before a new iteration starts, the User Stories are decomposed into more granular technical work packages. The literature on XP does not mention an explicit design phase, but it strongly emphasizes continuous refactoring and modeling. The functionality described in the User Stories is converted into test cases. The simplest concept that passes the tests is implemented. Development is finished when all tests pass.
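Both Scrum’s Backlog items and XP’s User Stories carry a priority and an effort estimate, and both are selected per iteration. The sketch below illustrates one way such a selection could work. Neither Scrum nor XP prescribes an algorithm, since the team chooses by judgment, so the greedy rule, the names `BacklogItem` and `plan_sprint`, and the units are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    priority: int   # higher means more business value (assumed convention)
    effort: int     # estimated effort, e.g. in person-days (assumed unit)


def plan_sprint(backlog, capacity):
    """Pick items in descending priority as long as they fit the capacity."""
    selected, remaining = [], capacity
    for item in sorted(backlog, key=lambda i: i.priority, reverse=True):
        if item.effort <= remaining:
            selected.append(item)
            remaining -= item.effort
    return selected
```

A team would treat the result as a starting point for discussion in the planning meeting rather than as a binding plan.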
3 User Centered Design

Usability criteria are increasingly becoming a sales argument for products, and awareness of the need for usable systems is growing. However, many software development projects are driven mainly by the SE model that is used. Usability Engineering (UE) provides a wide range of methods and systematic approaches to support user-centered development. These approaches are called Usability Engineering Models (UE Models), e.g. the Usability Engineering Lifecycle [9] or Goal-Directed Design [4]. Mayhew’s UE process consists of three phases, which are processed sequentially. The first phase is the Requirement Analysis, followed by the Design/Testing/Development phase and the Installation of the product. The process is iterative: concepts and preliminary and detailed designs are evaluated until all problems are identified and resolved. In Cooper’s Goal-Directed Design process, several phases are passed through as well. During the Research Phase, qualitative research leads to a picture of how users work in their daily work environment. During the Modeling Phase, Domain Models and User Models (so-called Personas) are developed that are then translated
into a Framework for the design solutions, which is detailed in the Refinement Phase. These two models have much in common, since they describe an idealized approach to ensuring the usability of a software system, but they differ in the details. UE Models usually define an order of activities and their resulting deliverables. UE activities often run concurrently with the other software development activities, so there is an obvious need to integrate the two approaches in order to make the budgets, resources and timelines of the UE activities within software development predictable.
4 Motivation

According to Ferre [7], the basic conditions for integrating SE and UE are an iterative approach and active user involvement. The two agile SE models outlined above are iterative-incremental approaches that rely on solid customer involvement. They even speak of user representatives as a special kind of customer stakeholder. The involved customer should at least have solid knowledge of the users’ domain and needs. This raises the question of whether and how usability is ensured in an agile software development process, so that UE activities can be performed in a satisfactory way. This paper discusses the user-centeredness of two agile SE models and the question of how potential gaps can be filled without losing the process’ agility. An exploration of the UCD models described above reveals a commonality with traditional SE models: both are strongly driven by phases and the resulting deliverables. Documentation, however, plays a minor part in agile models. Because of their incremental approach and overlapping development phases, agile SE models have no distinct phases such as Analysis, Design, Development and Validation. Without specific deliverables or activities, other criteria are needed to assess the user-centeredness of agile SE models. Requirements play a decisive role during the software development lifecycle, in both the SE and the UE domain. SE is mainly concerned with system requirements, while UE takes the users’ needs into account. Requirements are measurable criteria, and their elicitation, implementation and validation take place in most approaches to software development. Defining granular requirements makes it possible to look at activities independently of the larger modules, which lends itself well to the agile approach of developing smaller increments that ultimately add up to the final system instead of preparing a big design up front.
To develop recommendations for the integration, the authors analyze Scrum and XP to see how they can adopt UCD activities, and specifically how they can make use of UCD requirements. The Requirement Framework introduced in the following section offers a way to approach this.
5 User Centered Requirements

Based on a generalized UCD model defined in DIN EN ISO 13407 [5], Zimmermann & Grötzbach [13] describe a Requirement Engineering framework in which three types of requirements are generated, each of which constitutes the analysis and design outcome of one of the three UCD activity types. Usability Requirements are
M. Düchting, D. Zimmermann, and K. Nebe
developed during the Context of Use analysis, which revolves mainly around the anticipated users, their jobs and tasks, their mental models, their conceptions of the usage of the system, the physical environment, organizational constraints and determinants, and the like. It is important to elicit these findings from actual users in their context of use, in order to get a reliable baseline for requirements pertaining to users’ effectiveness, efficiency and satisfaction. These requirements can be used as evaluation criteria for the system and for intermediate prototypes in usability tests, questionnaires, or expert-based evaluations.
The Workflow Requirements focus on individual workflows and tasks to be performed by a user. Task performance models are elicited from users, the workflow is optimized, and an improved task performance model is generated. The outcome of this module is a set of requirements pertaining to a specific user’s interaction with the system in the context of a specific workflow or task, e.g. as described in use case scenarios. The requirements describe the discrete sub-steps of a user’s interaction flow and the expected behavior of the system for each of these steps in an optimized workflow. It is important to validate these requirements against the usability requirements with users, e.g. by comparing an optimized workflow to the current state of workflow performance with regard to effectiveness, efficiency and user satisfaction. Workflow Requirements are ideal input for test cases, against which prototypes or the final system can be tested, either through usability tests or expert evaluations.
The User Interface (UI) Requirements, generated in the Produce Design Solution activities, define properties of the intended system that are derived from Usability or User Requirements, e.g. interaction flow or screen layout.
During the development phase, the UI Requirements provide guidance for technical designers regarding the information and navigation model, which can then be aligned with other technical models. They also help programmers implement the required functions using the correct visual and interaction model. UI Requirements serve as criteria for the actual system that has been developed, i.e. to determine whether it follows the defined model for layout and interaction. These evaluations can be user-based or expert-based, and can be conducted during system design and testing. Translating UI Requirements into test cases facilitates this evaluation step.
6 Proceedings
The authors used the User Centered Requirements summarized above as the basis for a gap-analysis in order to determine whether the two agile SE Models (Scrum and XP) consider the three types of requirements adequately. As the different requirements have distinct stages, they have to be elicited, implemented and evaluated appropriately. The fulfillment of the requirements will guarantee user centeredness in the development process. In order to prepare the gap-analysis, the authors used the descriptions of the different requirements to derive several criteria for the assessment. The goal was to specify criteria which apply to both models. There is no 1:1 relation between the stages (elicitation, implementation, evaluation) and the criteria derived for the different types of requirements. Thus, there might be no criteria at a specific stage for a specific type of requirement, as the framework suggests. As an example, selected criteria for the UI Requirements are shown in Table 1.
Incorporating User Centered Requirement Engineering
Table 1. Selection of criteria defined for the UI Requirements, based on the definition in Section 5
Elicitation: develop appropriate representation of workflow by UI designer; specify interaction and behavioral detail
Implementation: verify feasibility; transform architecture into design solutions
Evaluation: evaluate if UI meets UI requirements; methods to measure improvements in effectiveness and efficiency; verify requirements and refine existing requirements
Based on these criteria, the two agile models Scrum and XP were analyzed to determine whether the criteria are met. This allows comprehensive statements about the consideration of UE activities and outcomes in agile SE. The analysis results are presented for each type of requirement, in the order of the three stages, and are based on the model descriptions from the sources cited above. Subsequent to the analysis, the authors give recommendations for the two agile SE Models that enhance the consideration of the three requirement types in Scrum and XP.
6.1 Implementation of User Centered Requirements
The results of the analysis for the Usability Requirements (Table 2) show that neither Scrum nor XP considers this type of requirement appropriately. During the elicitation of Usability Requirements only one criterion, the consideration of stakeholder input, is partly fulfilled by both models. The insufficient consideration of overarching Usability Requirements is also evident in the evaluation activities. Just one criterion is met to some extent, and only by the Scrum Model.
Table 2. Selection of criteria defined for the Usability Requirements. + fulfilled; - not fulfilled; o partly fulfilled.
Usability Requirements                                        Scrum  XP
Elicitation
  observe user in context of use                              -      -
  consider workflow-oriented quality criteria                 -      -
  measurable, verifiable and precise usability requirements   -      -
  gather and consider stakeholders input                      o      o
Evaluation
  verify if requirements are met                              -      -
  measure end user's satisfaction                             -      -
  check requirements and refine existing requirements         o      -
During the elicitation of Workflow Requirements, hardly any of the criteria could be found in Scrum or in XP. The one exception is that the XP model partly fulfills the criterion of verifying whether the new workflow is an improvement from the user’s perspective. However, the agile models possess solid strengths in the evaluation of user requirements. The only criterion met by neither model is the verification of workflow mockups against the improved workflow. The impact of these unconsidered criteria on usability is negligible.
Table 3. Selection of criteria defined for the Workflow Requirements
Workflow Requirements                                                                       Scrum  XP
Elicitation
  specify system behavior for given task, related to concrete goal                          -      -
  check if new workflow is an improvement from user's perspective                           -      o
Evaluation
  check correctness and completeness of workflow description                                +      +
  check workflow mockups for correctness, completeness and possibly find new requirements   -      -
  verify requirements and refine existing requirements                                      +      +
  verify that final system meets requirements                                               o      +
Table 4. Selection of criteria defined for the User Interface Requirements
User Interface Requirements                                         Scrum  XP
Elicitation
  develop appropriate representation of workflow by UI designer     -      -
  specify interaction and behavioral detail                         -      -
Implementation
  verify feasibility                                                +      +
  transform architecture into design solutions                      +      +
Evaluation
  evaluate if UI meets UI requirements                              o      -
  concluding evaluation to see if system meets requirements         -      +
  methods to measure improvements in effectiveness and efficiency   -      -
  verify requirements and refine existing requirements              +      -
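The ratings in Tables 2 to 4 can be tallied per model, requirement type and stage. The following is a minimal sketch and not part of the original analysis: the names RATINGS, MODELS and tally are hypothetical, and the assignment of symbols to criteria is one reading of the printed tables, so the counts should be treated as illustrative.

```python
from collections import Counter

# Ratings as read from Tables 2-4 (an assumption about the column alignment):
# "+" fulfilled, "o" partly fulfilled, "-" not fulfilled.
# Each tuple is (requirement type, stage, Scrum rating, XP rating), in table order.
RATINGS = [
    ("Usability", "Elicitation", "-", "-"),
    ("Usability", "Elicitation", "-", "-"),
    ("Usability", "Elicitation", "-", "-"),
    ("Usability", "Elicitation", "o", "o"),
    ("Usability", "Evaluation", "-", "-"),
    ("Usability", "Evaluation", "-", "-"),
    ("Usability", "Evaluation", "o", "-"),
    ("Workflow", "Elicitation", "-", "-"),
    ("Workflow", "Elicitation", "-", "o"),
    ("Workflow", "Evaluation", "+", "+"),
    ("Workflow", "Evaluation", "-", "-"),
    ("Workflow", "Evaluation", "+", "+"),
    ("Workflow", "Evaluation", "o", "+"),
    ("UI", "Elicitation", "-", "-"),
    ("UI", "Elicitation", "-", "-"),
    ("UI", "Implementation", "+", "+"),
    ("UI", "Implementation", "+", "+"),
    ("UI", "Evaluation", "o", "-"),
    ("UI", "Evaluation", "-", "+"),
    ("UI", "Evaluation", "-", "-"),
    ("UI", "Evaluation", "+", "-"),
]

MODELS = ("Scrum", "XP")


def tally(ratings):
    """Count the ratings per (model, requirement type, stage)."""
    counts = {}
    for req_type, stage, *marks in ratings:
        for model, mark in zip(MODELS, marks):
            counts.setdefault((model, req_type, stage), Counter())[mark] += 1
    return counts


counts = tally(RATINGS)
for (model, req_type, stage), c in sorted(counts.items()):
    print(f"{model:5} {req_type:9} {stage:14} +:{c['+']} o:{c['o']} -:{c['-']}")
```

Grouping by stage makes the pattern discussed in the text visible at a glance: elicitation is almost entirely unrated or negative for both models, while the positive ratings cluster in implementation and evaluation.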
Neither of the two models provides for the elicitation of User Interface Requirements. However, for the criteria of implementation activities, both models provide an opportunity to verify the feasibility of certain interaction concepts and to consider technical constraints for design decisions before the UI concepts are implemented. In terms of the evaluation of UI Requirements, the two models differ in several respects. The Scrum Model provides a way to verify UI Requirements with users and experts, whereas the literature gives no information about a comparable activity for the XP Model. XP, by contrast, does perform concluding evaluations within the scope of automated tests to see if the system meets the UI Requirements. Neither model considers measuring the improvements in the user’s effectiveness and efficiency.
6.2 Conclusion and Recommendations
Looking at the summarized results, it becomes apparent that both agile models have significant deficiencies in handling User-Centered Requirements. Usability Requirements are treated insufficiently in the important stages of development. With regard to more detailed requirements, the agile models possess certain strengths and the potential for integration with UE activities.
Workflow Requirements, for instance, are dealt with appropriately in evaluative activities. But it needs to be assured that, from a UE standpoint, they are elicited and processed adequately during the previous stages. The development can be substantially influenced on the granular level of UI Requirements. However, the UI Requirements have to be derived from correct workflow descriptions and qualitative Usability Requirements. The recommendations listed below provide suggestions for enhancing the two models so that they include the criteria of User-Centered Requirements and ensure the usability of a software product. The recommendations are derived from the results of the analysis described above.
The descriptions of both models mention an explicit exploration phase prior to the actual development. The development teams work out system architecture and technology topics to evaluate technical feasibility, while customer stakeholders generate Product Backlog Items (in Scrum) or User Stories (in XP). Compared to common UE analysis activities, the exploration phases in Scrum and XP are rather short and are not supposed to exceed one usual process iteration. Nevertheless, these exploration phases can be used by UE experts to support the particular development teams in a rough exploration of the real users in their natural work environment. In order to stay agile, it is important not to insist on comprehensive documentation of the results, but rather to emphasize lightweight artifacts and share the knowledge with the rest of the team. Having a UE domain expert in the development team also assures that generic Usability Requirements are taken into account during requirement gathering activities.
Due to the vague definition of the customer role in Scrum and XP, it is not guaranteed that real users are among the group of customer stakeholders.
From a UE point of view, it is essential to gather information regarding the context of use and the users’ workflows and to validate early design mockups and prototypes with real users. Therefore, it is necessary to explicitly involve users on-site for certain UE activities instead of other customer stakeholders, even when the latter claim to have a solid knowledge of the end users’ needs.
The Product Backlog (in Scrum) and the User Stories (in XP) would be the right place to capture Workflow Requirements. However, there is the risk of losing the “big picture” of how single system features are related to each other, because both artifacts focus on the documentation of high-level requirements instead of full workflows. Modeling the workflow with Essential Use Cases and scenario-based descriptions [10] would be sufficient, but is not intended by either of the two models.
Scrum and XP do not intend to perform usability tests to verify whether the requirements are met, nor do they measure the users’ satisfaction, e.g. using questionnaires. However, the Sprint Review in Scrum offers an opportunity for expert evaluations involving people with UE expertise and/or real users among the Scrum Team and as attendees of the Sprint Review. This cannot substitute for comprehensive usability evaluations, but helps to avoid user problems at an early stage. System testing in terms of usability is a problem in agile models because the solutions are specified, conventionalized and developed in small incremental steps. However, to perform a usability test with real users, the system has to be in a certain state of complexity to evaluate the implementation of their workflows. In traditional SE models, also using incremental development, these workflows and the related requirements are documented beforehand, and a prototype could be developed regarding
such a set of requirements for one workflow to be tested with the users. It certainly does not make sense to demand usability testing after each process iteration, but the tests could be tied to a release plan.
Agile models provide good opportunities for close collaboration between developers and designers during development activities. Due to the overlapping development phases and the multidisciplinarity of the development teams, the feasibility of certain interaction models can be checked with developers frequently and without fundamentally slowing down design and implementation activities. As a result, design decisions can take Usability Requirements and technical constraints into account easily and at an early stage.
In terms of the evaluation of UI Requirements, the two models differ in their proceedings. The Sprint Review in Scrum can be used to review the user interface in order to verify whether the design meets the previously defined specifications, presuming that those specifications have been created and defined as Sprint Goals beforehand. XP does not stipulate a review meeting like the Scrum Model. Unlike Scrum, the XP Model explicitly demands constant testing on a frequent basis. Certain subsets of UI Requirements are suited to automated tests, e.g. interaction- or behavior-related requirements. But it is barely possible to test by such means whether the implementation conforms accurately to a style guide.
7 Summary and Outlook
The underlying criteria for the assessment do not claim to be exhaustive. Nevertheless, they indicate the right tendencies and allow statements about how the requirements are realized in the particular models. The approach presented in this paper is used to evaluate how agile software engineering (SE) models consider activities of usability engineering (UE) in order to ensure the creation of usable software products. The user-centeredness of the two agile SE models Scrum and XP has been analyzed, and the question of how potential gaps can be filled without losing the process agility was discussed. As requirements play a decisive role during software development, in both software engineering and usability engineering, the authors assumed that requirements can serve as the common basis on which agile SE models can work together with the results of usability engineering activities. The User Centered Requirements, defined by Zimmermann and Grötzbach, describe three types of requirements derived from the results of UCD activities outlined in DIN EN ISO 13407 [5]. By using these three types of requirements, the authors derived more specific criteria in order to perform a gap-analysis of the two agile models. As a result, the fulfillment of the criteria allowed comprehensive statements about the consideration of UE activities and outcomes in agile SE. It turned out that both agile models have significant deficiencies in handling User-Centered Requirements. Usability Requirements are treated insufficiently in all the important stages of development. The presented approach has been used to acquire first insights into the ability of agile SE models to create usable software. However, the authors are well aware of the need for further, more extensive and more specific criteria. Applying them to other agile models will make it possible to derive more generic statements about the integration of UE in agile SE models in general.
References
1. Beck, K.: Extreme Programming explained. Addison-Wesley, Boston (2000)
2. Boehm, B.: A Spiral Model of Software Development and Enhancement. IEEE Computer 21, 61–72 (1988)
3. Cohn, M.: User Stories Applied – For Agile Software Development. Addison-Wesley, Boston (2004)
4. Cooper, A.: About Face 2.0. Wiley Publishing Inc, Indianapolis, Chichester (2003)
5. DIN EN ISO 13407: Human-centered design processes for interactive systems. Brussels, CEN - European Committee for Standardization (1999)
6. DIN EN ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs) – Part 11: Guidance on usability. International Organization for Standardization (1998)
7. Ferre, X.: Integration of Usability Techniques into the Software Development Process. In: Proceedings of the 2003 International Conference on Software Engineering, pp. 28–35, Portland (2003)
8. Larman, C.: Agile & Iterative Development – A Manager’s Guide. Addison-Wesley, Boston (2004)
9. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
10. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Academic Press, London (2002)
11. Royce, W.: Managing the Development of Large Software Systems. In: Proceedings of IEEE WESCON, vol. 26, pp. 1–9 (August 1970)
12. Schwaber, K., Beedle, M.: Agile Software Development with Scrum. Prentice Hall, Upper Saddle River (2002)
13. Zimmermann, D., Groetzbach, L.: A Requirement Engineering Approach to User Centered Design. In: HCII 2007, Beijing (2007)
How a Human-Centered Approach Impacts Software Development
Xavier Ferre and Nelson Medinilla
Universidad Politecnica de Madrid, Campus de Montegancedo, 28660 - Boadilla del Monte (Madrid), Spain
{xavier, nelson}@fi.upm.es
Abstract. Usability has become a critical quality factor in software systems, and it requires the adoption of a human-centered approach to software development. Including humans and their social context among the issues to consider throughout development deeply influences software development at large. Waterfall approaches are not feasible, since they are based on eliminating uncertainty from software development. On the contrary, the uncertainty of dealing with human beings, and their social or work context, makes it necessary to introduce uncertainty-based approaches into software development. HCI (Human-Computer Interaction) has a long tradition of dealing with such uncertainty during development, but most current software development practices in industry are not rooted in a human-centered approach. This paper reviews the current roots of software development practices, illustrating how their limitations in dealing with uncertainty may be tackled with the adoption of well-known HCI practices.
Keywords: uncertainty, software engineering, waterfall, iterative, Human-Computer Interaction-Software Engineering integration.
leading to a need for integration of usability methods into SE practices, providing them with the necessary human-centered flavor. The term "Human-Centered Software Engineering" has been coined [25] to convey this idea. In contrast, HCI practitioners need to show upper management how their practices provide value to the company in the software development endeavor, in order to get a stronger position in the decision-taking process. HCI and SE need to understand each other so that they can complement each other effectively. While SE may offer HCI practitioners participation in decision-making, HCI may offer its proven practices that help in dealing with the uncertainty present in most software development projects. In the next section the diverging approaches of HCI and SE are analyzed. Next, in section 3 the role of uncertainty in software development is outlined, elaborating on problem-solving strategies and how they apply to software development. Section 4 presents how joint HCI-SE strategies may be adopted for projects where uncertainty is present. Finally, section 5 presents the conclusions gathered.
2 HCI and SE Development Approaches
SE is defined as the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software [13]. In the pursuit of these objectives, SE has highlighted software process issues, and it has also traditionally focused on dealing with descriptive complexity. On the other hand, HCI is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use in a social context, and with the study of major phenomena surrounding them [22]. Usability is the main concern for HCI, and it is multidisciplinary in essence. The HCI view on software development is, in a certain sense, broader than the SE one, which mostly focuses on the running system in isolation. In contrast, HCI does not handle specific issues, like software process or software architecture, in comparable depth. Fig. 1 shows how SE and HCI differ in their main subject of interest in software development. While HCI cares about the impact created by the software on the user and his social context, SE focuses mainly on the correctness of the running software system itself. Software engineers mostly consider usability as a user interface issue, usually dealt with at the end of development, when the "important" part of the system has already been built. Alternatively, HCI experts carefully study the users and their tasks, in order to better fit the system to the intended users, and they consider that once the system interaction has been defined software engineers may begin "building" the system. There is a high degree of misunderstanding between both fields, along with some lack of appreciation for the work performed by the other discipline. Practitioners of both fields think it is they who do the "important job" in software development. Compared to SE, HCI may look as if it lacks maturity.
In this vein, Mayhew states that the integration of usability engineering with the existing software development lifecycle has not yet been solved, mostly due to the state of maturity of the Usability Engineering discipline [20]. Conversely, SE methods may look too system-centered for an effective user-system interaction, as understood in HCI.
Fig. 1. Comparison between HCI and SE main focus
Despite this lack of mutual understanding, both disciplines need to collaborate, since there is a non-trivial overlap between their respective objects of study and practice. In particular, requirements-related activities are considered a cornerstone of the work of both HCI and SE. The decision of which system is going to be built is quite important for usability purposes, so HCI has a lot to say about it, while requirements engineering is an SE subdiscipline with a recognized importance in the field, so software engineers will not be handing requirements-related activities over completely to usability experts. The traditional overall approach to development in SE has been the waterfall lifecycle. In relation to requirements, it is based on requirements which are fixed (frozen) at early stages of development. Nevertheless, the waterfall lifecycle is considered nowadays in SE as only valid for developing software systems with low-medium complexity in domains where the development team has extensive experience. As an alternative to the waterfall, iterative development is currently identified as the development approach of choice, even if its practical application finds some opposition. HCI, by contrast, has traditionally adopted an iterative approach to development. Therefore, some promising opportunities for SE-HCI collaboration emerge. Conflicts may arise between both kinds of practitioners, but they must be solved if usability is to be considered a relevant quality attribute in mainstream software development. Fortunately, recent trends in SE show a higher acceptance of uncertainty in software development, and this can provide a higher appreciation for HCI practices, as explained in the next sections.
3 Uncertainty in Software Development
Uncertainty is currently accepted as a necessary companion of software development [3],[19]. However, SE has traditionally considered uncertainty as harmful and eradicable. The aim was to try to define a "safe" space where no uncertainty could affect the work of software developers. The development of software systems of higher complexity levels has led to the need to change this approach. In order to deal with complexity, the traditional SE view considers only descriptive complexity (the quantity of information required to describe the system, according to Klir & Folger [17]). It is a useful dimension to work in the software universe but, on most occasions, it is not enough on its own to explain the software universe. Descriptive complexity needs to be combined with the complexity due to uncertainty, which is defined by Klir & Folger as the quantity of information required to solve any uncertainty related to the system [17]. Ignoring uncertainty in software development obstructs the objective of better coping with highly complex problems to be addressed by software systems, since it narrows the interpretation of both the problem and the possible strategies for building a successful solution. Complexity due to uncertainty adds a new dimension to the software space, as shown in Fig. 2. When extending the software universe dimensions to two, some hidden issues that hinder software development projects are uncovered, and new solutions emerge.
[Figure 2: the software universe plotted on two axes, descriptive complexity and complexity due to uncertainty]
Fig. 2. Extension of the software universe when considering the uncertainty dimension
Dealing with uncertainty is unavoidable in software development. But it is not just an undesired companion in the software development journey; it can be used as a tool that offers a powerful means of attacking recurring problems in software development. Having uncertainty-based means in the toolbox of software development teams offers them a richer background and vision to better tackle their work in the complex software universe. The usage of uncertainty as a tool in software development takes several forms: the introduction of ambiguity in the solution and the adoption of problem-solving strategies that manage uncertainty.
3.1 Ambiguity as a Way of Introducing Uncertainty in the Solution
Abstraction is a simplification tool that expresses just the essential information about something, leaving out the unnecessary details. This omission deliberately introduces uncertainty, which manifests in the form of ambiguity. An abstraction is precise with respect to the essence of the topic conveyed, but it is necessarily ambiguous with respect to the particulars, which are intentionally taken out of the picture. When making design decisions, uncertainty also plays a major role in providing solutions which are easier to maintain, modify or extend. For example, the information hiding principle [21] promotes the introduction of uncertainty in the design by not providing details on how a particular module is implemented. Modularization on its own does not provide benefits for this purpose, since a careful design of the modules and their headers is necessary for attaining the necessary relation of indifference between modules. Any design decision that attempts to introduce some degree of ambiguity in the solution being developed uses uncertainty as a tool for allowing easier future modifications. As a collateral effect, development usually gets more complex and more difficult to manage when employing uncertainty-based strategies, in the same way that object-oriented design is more complex than the structured development paradigm but provides a more powerful and less constrained instrument for the development of complex software systems.
3.2 Problem-Solving Strategies and Uncertainty
Human beings use different strategies according to the extent of the uncertainty they must confront. A linear or industrial strategy may be employed with zero or negligible uncertainty; a cyclical or experimental strategy when having medium uncertainty (something is known); and an exploratory or trial and error strategy when high uncertainty needs to be dealt with.
The higher the uncertainty level provided by the strategy, the higher its power for dealing with uncertainty (in the problem). The linear strategy (step after step) follows a path from a starting point to an end point, given that both points and the path between them are known in advance. That is, it is necessary to know the problem, the solution, and the way to reach that solution. If all these requirements are met, the linear strategy is the cheapest one. For it to be applicable, any uncertainty needs to be eradicated before beginning the resolution process. The paradigm that represents the linear strategy in software development is the waterfall life cycle. It follows the sequence requirements, analysis, design, implementation, and testing, which is a direct translation of the Cartesian precepts enunciated in the Discourse on Method [8]: evidence, analysis, synthesis and evaluation. The idea behind these principles is to undertake the what first and the how afterwards. This separation between requirements and design is an abstract goal and not a human reality [1]. The so-called incremental strategy is a variant of the linear one where the problem is divided into pieces, which are then undertaken one by one. The cyclical or experimental strategy (successive approximations), when converging, comes progressively closer to an unknown final destination through the periodical refinement of an initial proposition (hypothesis). A cyclical strategy is adopted when
the solution is unknown, but there is enough information on the issue to be able to formulate a hypothesis. The paradigm for the cyclical strategy in the software world is the spiral model [2]. A common statement in software development is to describe each cycle in the spiral model as a small waterfall. This is inappropriate, since the spiral recognizes the presence of uncertainty throughout the (risk-driven) process, and the waterfall, whatever its size, requires eradicating the uncertainty at the beginning. The arboreal or exploratory strategy (trial and error) is the way to reach an unknown destination without a best first guess, provided that the universe is closed. In the case of an open universe, the exploratory strategy does not ensure finding the solution, but none of the other strategies may ensure it, given the same conditions of uncertainty. An exploratory strategy is in place every time a solution is discarded and development goes back to the previous situation. The Chaos life cycle [23] is very close to an exploratory strategy, but it is limited by Raccoon's waterfall mindset.
3.3 Uncertainty and HCI
HCI has developed interactive software for decades without the obsession with eradicating uncertainty that is present in SE. In fact, HCI literature has some examples of insight regarding real software development. Hix & Hartson's [10] observations about the work of software developers show that they usually operate in alternating waves of two complementary types of activities: bottom-up, creative activities (a synthesis mode) and top-down, abstracting ones (an analysis mode). Hix & Hartson also unveil the closeness that exists between analysis and design activity types, especially in the requirements-related activities. It is not sensible, then, to try to draw a clear separation between both activity types.
With regard to methodologies in place in software development companies (based on a waterfall approach), they report that in some of their empirical studies they noticed that "iterative and alternating development activities occurred, but because corporate standards required it, they reported their work as having been done strictly top-down" [10]. The reality of development was hidden behind the mask of order of the waterfall. According to Hakiel "There is no reason why a design idea might not survive from its original appearance in requirements elicitation, through high- and low-level design and into the final product, but its right to do so must be questioned at every step" [9]. This approach is a radical departure from the waterfall mindset mostly present in SE, which was traditionally presented as the way to develop software in an orderly manner. The multidisciplinary essence of HCI has helped in providing a less rigid approach to development in the field. As Gould and Lewis [12] say, when a human user is considered (as in the upper part of Fig. 1) a coprocessor of largely unpredictable behavior has been added. Uncertainty is a companion of any attempt to develop interactive systems of non-trivial complexity, since human beings are part of the supra-system we are addressing: the combination of the user and the software system, trying to perform tasks which directly relate to the user goals. User-centered or human-centered development is the HCI approach to the development process, and it has traditionally introduced uncertainty by labelling itself as iterative. In this sense, [5], [10], [16], [22], and [26] agree on considering iterative development as a must for a user-centred development process. Therefore,
X. Ferre and N. Medinilla
iterativeness is at the core of HCI practices. This is real iterativeness, in the sense that evaluation is often considered formative: not just an exam for identifying mistakes, but a tool for giving form to the interaction design, and possibly for identifying new requirements.
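The difference between formative evaluation and a pass/fail exam can be illustrated with a small sketch. It is not taken from the paper: the `Design` type, the `formative_iteration` function, the `usability_test` stub and its findings are all hypothetical names and data chosen for illustration. The sketch only assumes that each evaluation session may surface needs that were not in the requirements, and that a cycle budget guards against iterating indefinitely.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Design:
    """A design iteration: the requirements it addresses and its version."""
    requirements: FrozenSet[str]
    version: int = 1


def formative_iteration(
    design: Design,
    evaluate: Callable[[Design], FrozenSet[str]],
    max_cycles: int = 5,
) -> Design:
    """Refine a design until an evaluation uncovers nothing new.

    `evaluate` returns the needs observed in one evaluation session.
    `max_cycles` is a budget that keeps the loop from iterating forever.
    """
    for _ in range(max_cycles):
        new_needs = evaluate(design) - design.requirements
        if not new_needs:
            break  # nothing new was found, so stop iterating
        # Findings reshape the design and extend the requirements.
        design = Design(design.requirements | new_needs, design.version + 1)
    return design


def usability_test(design: Design) -> FrozenSet[str]:
    """Hypothetical sessions: versions 1 and 2 each expose one further need."""
    findings = {1: {"undo"}, 2: {"keyboard shortcuts"}}
    return frozenset(findings.get(design.version, set()))


final = formative_iteration(Design(frozenset({"search"})), usability_test)
print(final.version, sorted(final.requirements))
```

In this sketch the requirements set grows as a result of evaluation, which a frozen-requirements process would not allow; the `max_cycles` budget is one simple way to bound the effort.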
4 Common HCI-SE Problem-Solving Strategies
As presented in the previous section, uncertainty is a tool for problem resolution; in particular, it is a tool for interactive software development. Uncertainty-based approaches have been adopted in the resolution strategies of both HCI and SE, without being labeled as such. When trying to integrate usability and HCI methods into mainstream development, the extensive HCI experience in dealing with uncertainty may be incorporated into SE practices, making them better prepared to cope with the development of complex systems with a high usability level. Non-linear problem-solving strategies present important challenges with respect to estimation and planning, along with the danger of continuously iterating without advancing towards the solution. Dealing with these issues requires a certain degree of flexibility, of the kind HCI usually employs. Accordingly, some degree of uncertainty will have to be introduced in the formal procedures advocated by SE methodologies.
4.1 Iterative Development
Iterative-cyclical strategies are currently at the center of debate in SE, with agile and iterative practices (see, for example, [18]). When adopting cyclical strategies of this kind, the introduction of HCI practices may be undertaken with greater success than former proposals for integration into waterfall lifecycles, like [7]. The aim of integrating usability engineering and HCI practices into mainstream software development, which mostly refuses to deal with uncertainty, has led to more formal solutions, in an SE sense, that leave out the uncertainty present, for example, in iterative approaches. One such solution is Mayhew's Usability Engineering Lifecycle [20], which is based on a two-step process where analysis activities are performed in a first phase, and then design and evaluation activities are performed iteratively in a second phase; but there is no place for resuming analysis activities.
Therefore, it is based on a frozen-requirements paradigm, reminiscent of a waterfall mindset. Nevertheless, iterativeness has been at the heart of usability engineering practices because usability testing has been the central point around which the whole development effort turns. It is necessary to test any best-first-guess design. Observational techniques and sound analysis are performed with the aim of getting a high-quality first design, but usability testing with representative users is then performed to check the logical constructs the design is made of against reality. The expected functionality and quality levels of the final system can be specified, but there is a certain degree of uncertainty in building the solution, the software system, in the sense that when undertaking the construction of some part of the system we do not know exactly how far we are from the specified solution. This is especially true when dealing with usability. Any design decision directed to usability
How a Human-Centered Approach Impacts Software Development
improvement needs to be tested with representative users, in order to check the actual improvement in usability attributes like efficiency in use. When the system under scrutiny includes the final user on top of the computer system, as is necessary for managing the usability of the final product, flexibility is required for adapting the partial prototypes according to evaluation outcomes.
4.2 Exploratory Strategies and the Definition of the Product Concept
Exploratory strategies are not yet dealt with in SE literature and practice. Traditional information systems, like payroll systems, are well defined, and most SE methodologies are directed to building them. Input-process-output models fit this kind of problem very well: automation of procedures previously performed manually, with well-defined rules and algorithms. The product concept is clearly delimited in this kind of system, so requirements can be written down with less risk of misunderstandings between the customer and the development team. Actually, the IEEE body of standards includes a standard for establishing the user requirements, the Concept of Operations [15] or ConOps, but it is seldom used in software development, unlike the more system-oriented (or developer-oriented) IEEE recommended practice for software requirements specification [14], which receives much more attention from the SE field.
On the other hand, the HCI field has a long tradition of dealing with ill-defined problems, developing new products with a high degree of innovation. Even if the creation of these systems has not been their main focus of activity, dealing with problems that have neither an obvious solution nor indications of how development should proceed has been part of HCI practitioners' work. Accordingly, several HCI techniques are especially well suited for defining the product concept.
These techniques favor participative and creative activities, which are well matched to the purpose of creating a model of how the system works from the user's point of view and studying whether it fits user needs and expectations. Examples of these techniques are Personas [6], Scenarios [4], Storyboards and Visual Brainstorming [22]. As long as interactive systems development keeps shifting to new paradigms of interaction, with an ever-increasing degree of novelty required, these HCI techniques will have to be either adopted by software engineers or applied by HCI experts belonging to the development team.
5 Conclusions
In this paper we have shown how uncertainty plays a major role in the construction of non-trivial interactive software systems. While uncertainty in the problem may be harmful, uncertainty in the solution may be useful when used as a tool for dealing with the former kind of uncertainty (the one in the problem). HCI has traditionally applied flexible processes that allow participatory design, and it has promoted the use of prototypes aiming at greater flexibility for making changes to the (partial) software solution. Additionally, some HCI techniques are especially well suited for the development of innovative software systems, which
are ill-defined by definition, and they may be adopted for exploratory problem-solving strategies. Even if this is part of standard HCI practices, the convenience of this approach has not been formalized in a way that helps the integration of HCI methods into mainstream software development practices. Recent awareness about the obstacles that traditional approaches, like the waterfall life cycle, impose on the endeavor of successful systems development leads to a more favorable attitude to the introduction of HCI methods, which ultimately lead to better quality products. In particular, HCI may play an important role in introducing practices that improve the usability of the final product, while additionally preparing businesses to better deal with uncertainty in software development. Understanding the roots of current software development practices and knowing their deficiencies in dealing with uncertainty is essential for any software development business. A model for software development that considers uncertainty is needed, in order to change from a field that is based only on the expertise of gurus to a software development field with sound foundations for the selection of development practices.
References
1. Blum, B.I.: Software Engineering: A Holistic View. Oxford University Press, New York, USA (1992)
2. Boehm, B.W.: A Spiral Model of Software Development and Enhancement. ACM SIGSOFT Software Engineering Notes 11-4, 14–24 (1986)
3. Bourque, P., Dupuis, R., Abran, A., Moore, J.W., Tripp, L., Wolf, S.: Fundamental principles of software engineering - a journey. Journal of Systems and Software 62, 59–70 (2002)
4. Carroll, J.M.: Scenario-Based Design. In: Helander, M., Landauer, T., Prabhu, P. (eds.) Handbook of Human-Computer Interaction, 2nd edn., pp. 383–406. Elsevier, North-Holland (1997)
5. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. Addison-Wesley, New York, USA (1999)
6. Cooper, A., Reimann, R.: About Face 2.0: The Essentials of Interaction Design. Wiley Publishing, Indianapolis (IN), USA (2003)
7. Costabile, M.F.: Usability in the Software Life Cycle. In: Chang, S.K. (ed.) Handbook of Software Engineering and Knowledge Engineering, pp. 179–192. World Scientific, New Jersey, USA (2001)
8. Descartes, R.: Discourse on the Method of Rightly Conducting One’s Reason and of Seeking Truth (1993), http://www.gutenberg.org/etext/59
9. Hakiel, S.: Delivering Ease of Use. Computing and Control Engineering Journal 8-2, 81–87 (1997)
10. Hix, D., Hartson, H.R.: Developing User Interfaces: Ensuring Usability Through Product and Process. John Wiley and Sons, New York (NY), USA (1993)
11. Glass, R.L.: Facts and Fallacies of Software Engineering. Addison-Wesley, Boston (MA), USA (2003)
12. Gould, J.D., Lewis, C.: Designing for Usability: Key Principles and What Designers Think. Communications of the ACM, 300–311 (March 1985)
13. IEEE: IEEE Std 610.12-1990. IEEE Standard Glossary of Software Engineering Terminology. IEEE, New York (NY), USA (1990)
14. IEEE: IEEE Std 830-1998. IEEE Recommended Practice for Software Requirements Specifications. IEEE, New York (NY), USA (1998)
15. IEEE: IEEE Std 1362-1998. IEEE Guide for Information Technology - System Definition - Concept of Operations (ConOps) Document. IEEE, New York (NY), USA (1998)
16. ISO: International Standard: Human-Centered Design Processes for Interactive Systems, ISO Standard 13407:1999. ISO, Geneva, Switzerland (1999)
17. Klir, G.J., Folger, T.A.: Fuzzy Sets, Uncertainty and Information. Prentice Hall, N.J. (1988)
18. Larman, C.: Agile and Iterative Development: A Manager’s Guide. Addison-Wesley, Boston (MA), USA (2004)
19. Matsubara, T., Ebert, C.: Benefits and Applications of Cross-Pollination. IEEE Software, 24–26 (2000)
20. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (CA), USA (1999)
21. Parnas, D.L.: On the Criteria To Be Used in Decomposing Systems into Modules. Communications of the ACM 15-12, 1053–1058 (1972)
22. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., Carey, T.: Human-Computer Interaction. Addison Wesley, Harlow, England (1994)
23. Raccoon, L.B.S.: The Chaos Strategy. ACM SIGSOFT Software Engineering Notes 20-5, 40–46 (1995)
24. Seffah, A., Andreevskaia, A.: Empowering Software Engineers in Human-Centered Design. In: Proc. of the ICSE’03 Conference, Portland (OR), USA, pp. 653–658 (2003)
25. Seffah, A., Gulliksen, J., Desmarais, M.D. (eds.): Human-Centered Software Engineering: Integrating Usability in the Development Process. Springer, Heidelberg (2005)
26. Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction, 3rd edn. Addison-Wesley, Reading (MA), USA (1998)
27. Vredenburg, K., Mao, J.Y., Smith, P.W., Carey, T.: A Survey of User-Centered Design Practice. In: Proc. of CHI-2002, Minneapolis (MN), USA, pp. 471–478 (2002)
After Hurricane Katrina: Post Disaster Experience Research Using HCI Tools and Techniques
Catherine Forsman
USA
[email protected]
1 Introduction
Disaster is a complex human and environmental event, often perplexing to even the most brilliant social scientist, humanitarian worker, governmental official, or human being. FEMA and other governmental agencies work with a defined categorization of disaster, called a cycle of disaster [1]. This high-level cycle is an overall view of stages within a continual loop of disaster. Yet defining a fixed type of disaster and the resulting solutions for appropriate information dissemination, acquisition and organization is difficult because no two disasters are the same. Each disaster brings with it specific characteristics. To envision the complexity, one could consider the difference between the impact of extreme environmental elements such as wind, water, air and fire, and then compound that with the varying contexts where such a disaster could take place. These contexts could be urban or rural, with different cultural and language requirements. For example, imagine the difference between 9/11 and Hurricane Katrina. These disasters took place in two different cities with different local governments, cultural histories, demographics, and urban plans. If one were to envision the types of activities and information involved in both disasters, there may be a few overlapping qualities at a high level, but in reality they differ at the level of informational need, activity and urban context. In other words, disaster has a site-specific element to it that involves understanding context, activities, information and the flexibility of real-time, ad hoc information adaptation to contextual activities. If this hypothesis holds, then research in disaster management with an HCI perspective may yield useful findings.
Because HCI deals with the study of information systems and appropriate technologies for people within situated activities, it is a unique field, well suited for understanding human needs in adaptive and changing environments. That is what this paper is about: the process of using HCI tools and techniques in a post-disaster situation to learn how people, within changing contexts and activities, work out what to do, and what information and technology they may need in order to do it. In the future, conducting HCI research in disaster areas may lead to important findings regarding innovation for disaster situations, technology devices, information structures and the creation of ontological frameworks of experience used as infrastructures for adaptive learning tools when the cycle repeats itself.
2 The Cycle of Disaster
The disaster cycle is outlined next in this paper in order to illustrate a framework and define what is meant by “disaster management.”
Mitigation. This phase encompasses lessening the effects of possible disasters. It differs from other phases because it involves trying to learn from past disasters through information and data to lessen the severity of any future disaster. This phase also deals with evaluating risks and risk management [2].
Preparedness. Common preparedness measures include the proper maintenance and training of emergency services, the development and exercise of emergency population warning methods combined with emergency shelters and evacuation plans, the stockpiling, inventory, and maintenance of supplies and equipment, and the development and practice of multi-agency coordination. An efficient preparedness measure is an emergency operations center (EOC) combined with a practiced region-wide doctrine for managing emergencies [2]. A development of interest to HCI professionals in this area is one where ethnographic observations regarding self-organizing behavior were used after 9/11. In 2002, the US Federal government created a new procedure for evacuating federal employees in Washington. The protocol is based upon observed social dynamics exhibited in 9/11 and attempts to “improve the ad hoc process” based upon ethnographic findings [3].
But, even if there are some insights into how field research can contribute to understanding self-organizing systems for future disaster scenarios, is the concept of preparedness flawed? Certainly some risks can be avoided, but disaster by definition is about chaos and the unexpected, taking place in specific contexts that cannot be predetermined. Is it possible to be prepared for dynamic and complex situations that may not now exist, even in a risk model? In two surveys conducted by NYU’s Center for Catastrophe Preparedness and Response (CCPR), one after 9/11 and one after Hurricane Katrina, the researchers noticed a steep rise, after the widespread destruction of Hurricane Katrina, in participants’ belief that one could not prepare for a disaster. The survey data is as follows: “62% of Americans said that it was nearly impossible to be very prepared for terrorist bombings, 60% said the same about hurricanes and floods, and 55% said the same of a flu epidemic” [4].
This shows, perhaps, a lack of confidence in the idea that anyone can “prepare” for such events. But it also illustrates a perception that preparedness may be an area of inquiry. The question is: What tools does one use to understand this issue? In reality, managing disaster means understanding ad hoc organization before, during, and after the disaster occurs, a very difficult proposition.
Response. The response phase includes the mobilization of the necessary emergency services and first responders in the disaster area, such as firefighters, police, volunteers, and non-governmental organizations (NGOs). Additionally, this phase includes organizing volunteers [2]. One could imagine that there are dynamic events that occur in the real world during a disaster. Additionally, there are static preparedness protocols that could be described in a taxonomic way, such as a “type” of response (e.g. rescue in water) and scenarios of rescue (e.g. evacuation to hospital). However, what actually occurs is rescuing someone from a nest of poisonous water snakes as the person struggles to stay afloat in oil-enriched water, with no clear directive on where to take the victim due to limited radio frequency and lack of organizational directives. The actual narrative of events is very different from any simulation of the event prior to the disaster. This underscores an aspect inherent in response: the need for real-time collaboration in interactions with people, information and technology, in a social networking and ad hoc organizational manner, as needs arise and have outcomes that can rarely be predicted. It also underscores the need for HCI research that deals both
with narratives of actual events and the creation of technical infrastructures, information structures and organizational models for real-time response and access and organization (reading of patterns).
Recovery. The aim of the recovery phase is to restore the affected area to its previous state. It differs from the response phase in its focus; recovery efforts are concerned with issues and decisions that must be made after immediate needs are addressed [2]. The idea of “restructuring” brings with it a wealth of opportunity to explain and explore contextual and population needs through narrative. In other words, new socialized orders can be explored and remapped in accordance with what may not have existed before but ideally could have. The narrative could be shown in personas and scenarios, yet grounded in field research that can be validated by communities and individuals in order to ensure a community feedback loop.
3 Technology and Disaster
If one looks more closely at the technology used during the Rescue and Recovery phase after Katrina, it leads to some interesting findings. For example, first responders often communicate via two-way radios. Two-way radios have limited range, about a kilometer, but repeater stations can be positioned to increase the range. They are most often used to coordinate supplies, rescue missions and communications between team members and a Coordinator [5]. Additionally, to accommodate a real-time dynamic, cell phones can be used, but oftentimes the network cannot respond, or infrastructure or device failure occurs due to environmental issues [6]. However, even where a cell phone could work, when power failure occurred all 911 center capabilities were disabled [7]. There was nowhere to call but friends or family.
A useful technology after Hurricane Katrina was Ham radio. Commercial radio antennas are placed on top of hills in order to cover broader areas of reception, making them highly vulnerable to wind and earthquake tremors, whereas Ham radio operators build smaller antennas knowing their broadcast range is within the local area, such as within a city. If a Ham radio tower (which can be as small as 100 feet) falls over, it is easy to pick up and reinstall. ARRL President Jim Haynie testified before Congress in 2005 that 1000 Amateur Radio volunteers had been serving the stricken area to provide communication for such agencies as The Red Cross and the Salvation Army and to promote interoperability between agencies [8]. FEMA passed out radios to the citizenry in Bay St. Louis, MS so that they could listen to the broadcasts from a local radio station and obtain information regarding food, shelter and medicine [9]. The idea of the local and smaller technology prevailed simply because it was quickly repairable and could be supported easily by governmental agencies (just pass out the radio from a truck).
There is no guarantee that any type of technology would not suffer the same fate as cell phones did during Hurricane Katrina. That is, devices may work, but there are important factors to consider that involve information and organization (e.g. 911 operators, system overload, complete loss of communication tools). However, as in the case of Ham radio, the key is the separation of information structures from the device and the local, smaller aspect of the technology. This, then, has some precedent in HCI with pervasive computing [10, 11].
For the volunteers outside the disaster zone, a proliferation of internet use and social networking took place (bulletin boards, websites, etc…). The internet became a platform for grassroots initiatives and for the needs of individual and small-group citizen rescue missions (e.g. “Please send money for a gas card for the plane we are flying in to the Gentilly area to distribute water.”) [12]. While those outside the disaster zone can use the internet for organizational and informational purposes, those within the disaster area are most likely without immediate access to the internet. The important takeaway from this is that technology, understood in a real-world framework, worked best when it was easily restructured, could relay distribution and organizing information, and allowed for social networking via either voice or text.
4 Considering Disaster, Context and the User
If we think about the “user’s” capacity for information organization and processing in a complex environment such as a disaster, the idea of the user as an isolated element, understood and normalized for specific psycho-cognitive interactions with an information system in a laboratory, does not hold. This type of user definition arose from cybernetics, where Herbert Simon proposed his ideas of bounded rationality and learning through information feedback and adaptation [13]. The objective of these studies was to determine task models, ergonomic needs, information models and cognitive responses to a system [14]. From the results of these studies, usually conducted in a laboratory with task-based questions or via survey, a baseline could be created of a user with varying degrees of expertise and satisfaction in relation to a technology system and the tasks performed.
Another line of thought, described by Lamb and Kling, is the idea of the user as a social actor, where context plays an important role in understanding the requirements for interactions [15]. The added relation of context and user accounts for the complexities of situated actions, such as space, interactions with objects and people, and power dynamics for use of systems or information. Situated action was first introduced in 1987 in Lucy Suchman’s book, Plans and Situated Actions: The Problem of Human-Machine Communication. And, as Lamb and Kling point out, “years later a particularly formative and influential study in this area appeared in Mumford’s socio-technical ETHICS PD approach [16]. PD practitioners became keenly aware that structural constraints may prevent the exchange of information, but they believed that users were social actors and capable of mobilizing change.
This is not to say the early PD research sided only with the social actor as worker, but included within their perspective organizational changes through technology throughout the full power hierarchy of an organization.” Basically, it was a creative way to consider context and people as social actors within their interactions with context, information and technology [17, 18, 19].
Disaster by its very definition is chaos: rapidly changing and possibly disintegrating contexts. Context here can be the urban landscape, such as a city or home, or mental models of operation, such as knowing how to reach a medical facility if one is hurt. Because disaster is so disruptive to context, context is an important object of study for HCI field research in disaster situations. That is, how do people, whether organizers or
survivors, deal with varying cognitive loads of information and organizational complexity in order to readjust themselves within survival contexts (shelters, hospitals, etc…)? Extending context beyond the workplace does have some precedent in HCI in the area of pervasive and urban computing, where context is extended so that “cities can also be viewed as products of historically and culturally situated practices and flows. When we view urban areas in this context, rather than as collections of people and buildings, infrastructure and practice are closely entwined” [15, 20].
Over the course of time, the research methodology organically moved closer to the sensibilities of PD as envisioned in earlier Scandinavian HCI projects [23]. NGOs requested observational notes so that they could better understand the conditions of specific locations. These notes were sent via email with participants’ permission and editing. Participants in the research began organizing within the shelter and asking for advice on photography or journal recording in order to post information to the internet.
Traditionally, PD dealt with context and human activities in work environments and was deployed with the understanding that a community would be studied and impacted by decisions made about information systems or machines [24]. The core premise of PD was that better and safer working conditions will result from some sharing of power and an appreciation of the tacit knowledge and adaptive capabilities that workers contribute to organizational processes. In other words, researchers immersed themselves in a culture in order to contextualize culture within their research, create a feedback loop with people within the context, and participate with the community in developing prototypes and articulations of requirements. In the historical context of PD, the research itself became a conduit for requirements that arbitrated the needs of the workers to management and vice versa.
In the case of HCI disaster research, the need for research within context becomes even more strongly coupled than in industry as understanding the needs of a population cannot be divorced from survival actions in context. Understanding what organizational and information needs confront people while coping with the myriad facets of disaster very likely can inform information structures for disaster response in the future, as well as immediate feedback loops in the present.
5 The Research Experience
The full study took place over a two-year period, but for the purposes of this paper, only the first experiment is briefly described, as it set the direction for the following research experiments. To understand the organizational complexity of people and how they rapidly relearn a new context in order to survive, ethnographic observation and contextual inquiry were used. Interviews were conducted regarding people’s memories of their experience in the changed flow of the city during escape compared to preexisting conditions. Interviews took place in order to understand the memory of how the participant once contextualized their day-to-day experiences within their
homes and neighborhoods, and how this had changed. Additionally, design probes¹ (cameras, diaries, sketching, asking participants to post photos to an internet site and video to YouTube) were used [21].
The goal of the research was to understand what was critical about the effects that changes in context-as-situated-action had on participants. The inquiry highlighted the needs of survivors for both information and organization around flow in order for them to co-create ongoing survival strategies. The output from this area of inquiry was a set of narratives, field notes, scenarios and personas with clear representations of the participants in context before and after disaster. These personas and scenarios were taken back to communities and validated and iterated, via workshops in some cases and with individuals in others, thereby involving a “community” aspect.
Another interesting output from this research was political in nature. Given that context became a central area of inquiry, findings emerged regarding the appropriate distribution of goods and the cultural misinterpretation of needs between governmental and NGO workers and hurricane survivors. That is, survivors felt “You just don’t get me” when it came to privacy concerns, organization of cots, shower schedules and food. The NGOs and Red Cross organizers followed a protocol of organization that had little to do with context and had been prepared prior to the hurricane. This, then, may be categorized as a missing layer of information interfacing between a specific, situated population and the relief organization’s protocols. In order to bridge this gap, organizational meetings took place within the shelters and field notes were used as tools of information dissemination, pointing to the need for a more efficient, yet malleable, form for creating an information interface between the two populations. Similar research was conducted in the city of Bay St. Louis, MS.
In this context, gas stations and Red Cross distribution centers had gluts of certain supplies (baby diapers and sweaters in 90 °F weather, in places from which babies had been evacuated) and very little of the supplies needed for navigating the new context (accessible medical centers for tetanus shots, appropriate makeshift shelter for 200+ people sleeping in the church’s parking lot, or forklifts to clear major thoroughfares and street signs to order the flow of traffic). This is not an uncommon pattern, as witnessed in an ethnographic work in Peru in 1990, when a powerful earthquake struck. Noticing that indigenous populations were not receiving food and goods that arrived through NGOs, Theo Schilderman, an anthropologist, studied the problem of how official relief agencies, survivors, and grassroots volunteers misinterpreted each other’s needs. The result was a deprivation of goods to survivors because both governmental agencies and NGOs
¹ The word “probe” is used here as a label for a set of tools used in field research and design practices to gather information and iterate ideas with people. Tools within the “probe” category include diaries, remote cameras, drawing exercises mailed to the researcher, etc… Probes have a lineage within the design field for open inquiry. However, the author has intentionally not classified Participatory Design (PD) within the area of probes, in order to distinguish a different historical lineage: PD resulted from a need to incorporate an understanding of politics into the research process, whereas probes in the classic definition (diaries, cameras, etc…) were developed for design feedback loops that may or may not have had understanding and mediating politics as an objective.
were unfamiliar with the community conditions. Additionally, when the goods were finally distributed, there was a mismatch between what emergency management authorities were trying to give to the victims and what the victims actually needed [22]. Creating information structures that can easily be accessed around population needs within specific contexts could alleviate some of this tension.

5.1 Aftermath Experiment

The first experiment was performed from September 25 to 30, 2005 in three separate locations. Location 1: the Houston Astrodome. Location 2: an NGO center outside the Baton Rouge shelter. Location 3: various places (homes or makeshift shelters) within the city limits of Bay St. Louis and Waveland, Mississippi. Two days were spent in each location, interviewing and observing people with a video camera and recording field notes. Participants were recruited by word of mouth and over the internet.

5.2 Social Networking Via the Internet

Before arriving onsite, connections with NGOs and emergency medical personnel within specific locations had to be established quickly; this was done through email and the internet. Craigslist and Katrina bulletin boards were also used during this phase. Because organizational complexities thwarted NGOs and governmental agencies in the accurate distribution of volunteers and supplies, certain people formed their own organizations and drove into the disaster area distributing goods. They organized via the internet and appeared on the doorsteps of shelters, at churches or at roadsides with supplies. Connecting with these people was invaluable for covering wider geographic areas of interest because they distributed life-saving goods in an ad hoc fashion.

5.2.1 Research Issues and the Failure of the Probe

Ethnography was performed in the following way: observational, interactive (conversational), and sometimes participatory, in a shelter or on the street with survivors.
Cognitive issues arose, such as difficulty with memory recall when telling a story. Additionally, the importance of temporal information would shift with sudden interruptions in the conversation. Participants would interrupt themselves with more pressing concerns, such as fears that “my house may blow up,” or “I don’t know where my child is; could you help me get information?” Given this, cameras were passed out to each participant so that they could record the details of their lives when they had time and mail them to the researcher at a later date. Envelopes and stamps were included so that participants could mail the envelope from wherever they relocated to. These diaries are still arriving in the mail from Sept. 2005, illustrating the importance of understanding post-traumatic stress disorder and its lasting effects, as well as how long it may take to restructure key infrastructures, such as the post office. Table 1 shows the response times.
Table 1. Cameras passed out in September 2005 with return dates

Participant Location     Number of Cameras   Date/Amount Returned
Baton Rouge Shelter      25                  1, Jan. 2005; 10, Dec. 2006
Houston Shelter          10                  5, June 2005
Bay St. Louis            27                  3, Oct. 2005; 5, Nov. 2005; 6, collected
Waveland                 8                   2, collected Mar. 2006
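The return counts in Table 1 can be turned into per-location return rates with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the "6, collected" entry belongs to Bay St. Louis and the "2, collected" entry to Waveland, it takes the counts exactly as printed, and the names `cameras`, `return_rate` and `summarize` are hypothetical.

```python
# Cameras handed out and returned per location, as read from Table 1.
# Assumption: "6, collected" belongs to Bay St. Louis and
# "2, collected" belongs to Waveland.
cameras = {
    "Baton Rouge Shelter": {"handed_out": 25, "returned": [1, 10]},
    "Houston Shelter": {"handed_out": 10, "returned": [5]},
    "Bay St. Louis": {"handed_out": 27, "returned": [3, 5, 6]},
    "Waveland": {"handed_out": 8, "returned": [2]},
}


def return_rate(entry):
    """Fraction of cameras handed out at one location that came back."""
    return sum(entry["returned"]) / entry["handed_out"]


def summarize(table):
    """Return per-location rates plus overall returned and handed-out totals."""
    rates = {location: return_rate(entry) for location, entry in table.items()}
    total_out = sum(entry["handed_out"] for entry in table.values())
    total_back = sum(sum(entry["returned"]) for entry in table.values())
    return rates, total_back, total_out


if __name__ == "__main__":
    rates, total_back, total_out = summarize(cameras)
    for location, rate in rates.items():
        print(f"{location}: {rate:.0%} returned")
    print(f"Overall: {total_back} of {total_out} cameras returned")
```

Under these assumptions fewer than half of the cameras had come back, which is consistent with the text's point that the probe needed to be adapted to the post-disaster context.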
The context (destroyed post offices, multiple relocations of participants, and stress in recounting events) demonstrates the need for adaptable research. After the initial field experiment, participants began to email images taken with their cell phones or cameras. Additionally, participants wished to post information on YouTube and Flickr in order to reach a wider audience. They said they believed that these structures worked better for them as they struggled for assistance and wished to be noticed by a wider population. As this pattern developed, new approaches for organizing information in the research emerged. A good example of a website that is specifically for ethnographic research and disaster is the Indigenous Knowledge in Disaster Management website.
6 Conclusion

The Disaster Cycle was highlighted in this paper in order to set the stage for an HCI field study. The research was described in order to present anecdotal evidence of how HCI research needs to be both participatory and adaptive in a post-disaster environment.

Acknowledgments. The author thanks the participants in Bay St. Louis, Slidell, and Waveland, Mississippi, and in New Orleans and Baton Rouge, who graciously gave of their time.
References
1. Alexander, D.: Principles of Emergency Planning and Management. Terra Publishing, Harpenden (1991)
2. Haddow, G.D., Bullock, J.A.: Introduction to Emergency Management. Butterworth-Heinemann, Amsterdam (2004)
3. Jason, P.: (August 14, 2002), http://www.govexec.com
4. Berne, R.: CCPR: Organizational & Community Preparedness Project Executive Summary (2005)
5. SAFAM Summary of Events for a Medical Mission to Mozambique (2007)
6. Banipal, K.: Strategic Approach to Disaster Management: Lessons Learned from Hurricane Katrina. Disaster Prevention and Management, pp. 299–421 (2006)
7. Hurricane Katrina Timelines. The Brookings Institute (2004)
8. ARRL President Congressional Testimony on Hams’ Katrina Response, Submitted to the House Government Reform Committee (September 15, 2005)
9. Moyers, B., Kleinenberg, E.: Fighting for Air Transcripts (2006)
10. Dourish, P.: Seeking a Foundation for Context-Aware Computing. Human-Computer Interaction 16(2,3 & 4), 229–241 (2001)
11. Dourish, P.: Speech-gesture driven multimodal interfaces for Crisis Management. Proceedings of the IEEE 91, 1327–1354 (2003)
12. Anonymous, Craig’s List posting retrieved (September 10, 2005)
13. Simon, H.A.: A Behavioral Model of Rational Choice. Quarterly Journal of Economics 69, 99–118 (1955)
14. Norman, D.A.: Cognitive Engineering. In: Norman, D.A., Draper, S.W. (eds.) User Centered System Design. Lawrence Erlbaum Associates, Hillsdale, NJ (1986)
15. Dourish, P.: What We Talk About When We Talk About Context. Personal and Ubiquitous Computing 8(1), 19–30 (2004)
16. Mumford, E.: Effective Systems Design & Requirements Analysis: The ETHICS Approach. MacMillan, New York (1995)
17. Greenbaum, J., Kyng, M.: Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum, Hillsdale, NJ (1992)
18. Gutwin, C., Greenberg, S.: Design for Individuals, Design for Groups: Tradeoffs between Power and Workspace Awareness. In: Proceedings of the ACM 2000 Conference on Computer Supported Cooperative Work 2003, Philadelphia, PA (2000)
19. Nardi, B., Miller, J.: Twinkling Lights and Nested Loops: Distributed Problem Solving and Spreadsheet Development. International Journal of Man-Machine Studies 34, 161–184 (1991)
20. Curry, M., Phillips, D., Regan, P.: Emergency Response Systems and the Creeping Legibility of People and Places. The Information Society 20, 357–369 (2004)
21. Boehner, K., Vertesi, J., Sengers, P., Dourish, P.: How HCI Interprets the Probes. In: Proceedings of CHI (2007)
22. Schilderman, T.: Strengthening the Knowledge and Information System for the Urban Poor. Cambridge University Press, Cambridge (2003)
23. Nygaard, K.: Program Development as Social Activity. In: Kugler, H.-J. (ed.) Information Processing, pp. 189–198. Elsevier Science Publishers, Amsterdam (1986)
24. Schuler, D., Namioka, A.: Participatory Design: Principles and Practices. Lawrence Erlbaum Associates, Hillsdale, NJ (1993)
A Scenario-Based Design Method with Photo Diaries and Photo Essays

Kentaro Go
Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 4-3-11 Takeda, Kofu 400-8511 Japan
[email protected]
Abstract. In this paper, we propose a requirements elicitation method called Scenarios, Photographic Essays and Diaries as User Probes (SPED-UP). In SPED-UP participants create photographic diaries and photographic essays themselves. Each participant creates photographic diaries to capture a day in their own life. They reflect upon their personal experiences and create photographic essays based upon this reflection. This approach enables designers to collect user data conveniently. Designers, who might be participants themselves in a participatory approach, can then analyze these experiences by forming design concepts, envision scenarios by imagining contexts of use, and create artifacts by sketching these scenarios. We also describe an exemplary workshop using the SPED-UP approach. Keywords: user research, photographic diary, photographic essay, probe, requirements inquiry, scenario.
To address this issue, we propose a design approach in which designers conduct lightweight user research and create design ideas from the resulting user data. Our approach, Scenarios, Photo Essays and Diaries as User Probes (SPED-UP), is a scenario-based design approach that uses participants’ self-produced photographic essays and photographic diaries. In this paper, we give an overview of the SPED-UP approach and specifically examine photographic diaries and photographic essays as representations of user research.
2 User Research to Elicit Requirements

Four goals of the early stage of design for human-computer interaction are the following.
• Elicit potential desires and requirements.
• Envision novel scenarios of use.
• Create designs reflecting the material of user research.
• Bring actual users into design activities.
Several efforts have been made to study user research for design. Researchers and practitioners have transferred field work research methods to the design of human-computer interaction. For example, in the contextual inquiry technique [1], researchers visit users’ work settings and ask questions during the actual activities. This technique is useful for recording and understanding actual users’ tasks and activities in order to elicit their potential wants and requirements. Gaver, Dunne and Pacenti [3] created cultural probes: packages of devices such as postcards, disposable cameras, notebooks, and so forth. Each device is designed to encourage potential users to keep a diary themselves, as instructions and messages from the designers are printed on it. The packages are distributed to potential users; they in turn keep a diary using the devices and send the package back to the designers. The designers browse the materials, which provide them with clues for design. As in the cultural probe technique, photographs taken by actual users often play a central role in user research. Frost and Smith [2] used photographs taken by patients with diabetes themselves for self-management training. In the marketing research field, Holbrook and Kuwahara [8] proposed a data collection method using collective stereographic photo essays to probe consumption experiences. Holbrook and Kuwahara’s approach inspired us to develop the Participatory Requirements Elicitation using Scenarios and Photo Essays (PRESPE) approach [6, 7]. Based on experiences using the PRESPE approach, we created the SPED-UP approach. With devices such as photographs and writings created by potential users, we intend to address the four goals above in the early stage of the design process.
3 SPED-UP: Scenarios, Photo Essays and Diaries as User Probes

Our approach to user research for design employs three key devices: scenarios, photographic essays and photographic diaries. The approach is called Scenarios, Photo Essays and Diaries as User Probes (SPED-UP). Fig. 1 depicts an overview of the SPED-UP approach.
[Figure: diagram with the labels Coordinator, Theme, Participants, Personal Experience, Photo Diaries, Photo Essays, Design Concept (requirements and needs), Scenarios, and Artifacts, connected by the activities (1) Collect, (2) Reflect, (3) Analyze, (4) Envision, and (5) Translate.]

Fig. 1. Overview of the Scenarios, Photo Essays and Diaries as User Probes (SPED-UP) approach
3.1 SPED-UP Overview

As a participatory design approach, SPED-UP sets a group of major stakeholders (including designers and real users) working together to produce and evaluate product designs [11]. The SPED-UP approach encompasses two roles: coordinators and participants. The coordinators assign a project theme and provide ongoing support for the participants’ activities. The five main activities are (1) collection, (2) reflection, (3) analysis, (4) envisioning, and (5) translation. Participants collect their own personal photographic diaries. For the assigned theme, participants create photo essays to reflect on their personal experiences using existing artifacts. The participants are divided into several groups; the remaining SPED-UP activities are conducted as group work. By comparing the individual photographic essays, the participants can analyze shared ideas, identify the concepts behind them, and then develop design concepts. The participants can then use these design concepts as inspiration for future uses of the relevant technology when they envision use scenarios and contexts. This activity, called scenario exploration, is a structured brainstorming session with role-playing using scenarios and questions. The participants then translate scenes described in the scenarios into artifacts by making sketches of the scenes [4]. Three devices are used in SPED-UP: photographic diaries, photographic essays, and scenarios.

3.2 Photo Diaries

A photographic diary comprises a series of photographs and their descriptions. Fig. 2 shows an example of a photographic diary. A participant takes a photograph at specified time intervals and describes the activity at the time the photograph is taken.
In Fig. 2 the participant took a photograph and wrote a diary entry at one-hour intervals. Each photograph and description represents a scene from a day in the participant’s life. The purpose of collecting photographic diaries is to capture actual scenes from the users’ lives. The final outcome of the design process is design ideas or designed products relating to information and communications technologies. Therefore, we are interested in finding opportunities for information processing and communication in the users’ daily lives.
8:45 All I have in my wallet is a thousand-yen note. I stop at an ATM to withdraw money on the way to work.
9:45 While I was preparing for a business meeting, a business partner phoned me.
10:45 The business meeting started at 10:30. The meeting material got in under the wire. I will be giving a talk on the material soon.
Fig. 2. An example of a photographic diary
A timer or prompter is useful for reminding users to take photographs for their photographic diaries. However, using a self-timer to take the photographs might not be appropriate for our approach because it might capture unintended scenes and cause privacy and security concerns. For this reason, instead of having photographs taken automatically, we ask users to take the photographs themselves so that they can choose what to capture as a scene of daily life. We ask them to capture a scene that represents their actions, tasks, or activities as well as the environment surrounding them. In fact, we ask them to appear in the photographs themselves to show clearly what they are doing and in what situation. Current technologies such as small portable digital cameras, mobile telephones with digital cameras, and personal digital assistants (PDAs) with digital cameras make it possible to create photographic diaries without too much trouble. In addition, self-production of photographic diaries by the participants enables designers to collect user data in a short period of time.
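A prompter of the kind mentioned above could be sketched as follows. This is a minimal illustration rather than part of the SPED-UP method itself: the names `DiaryEntry`, `prompt_times` and `missing_slots`, the file names, and the calendar date are all hypothetical. The prompter only reminds the participant; the photograph is still taken manually, as the approach requires.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DiaryEntry:
    """One scene of a photographic diary: a photo plus its description."""
    taken_at: datetime
    photo_path: str   # hypothetical file name of the participant's photo
    description: str


def prompt_times(start, end, interval_minutes=60):
    """Times at which the participant is reminded to take a photograph."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    times = []
    current = start
    while current <= end:
        times.append(current)
        current += timedelta(minutes=interval_minutes)
    return times


def missing_slots(entries, times, interval_minutes=60):
    """Prompt times for which no entry was recorded before the next prompt."""
    window = timedelta(minutes=interval_minutes)
    return [
        t for t in times
        if not any(t <= e.taken_at < t + window for e in entries)
    ]


if __name__ == "__main__":
    day = datetime(2006, 2, 1)  # arbitrary example date
    times = prompt_times(day.replace(hour=8, minute=45),
                         day.replace(hour=10, minute=45))
    entries = [
        DiaryEntry(day.replace(hour=8, minute=45), "atm.jpg",
                   "I stop at an ATM on the way to work."),
        DiaryEntry(day.replace(hour=10, minute=50), "meeting.jpg",
                   "The business meeting has started."),
    ]
    for slot in missing_slots(entries, times):
        print("No entry for the slot starting at", slot.strftime("%H:%M"))
```

Checking for missing slots afterwards, instead of capturing images automatically, keeps the choice of scene with the participant.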
3.3 Photo Essays

A photographic essay contains representative photographs on an assigned theme and an essay explaining why the participant thinks the photographs fit the theme. Photos might be stereograms to increase the viewer’s sense of reality [8]. Fig. 3 shows an example of a photographic essay.
I live alone. The first thing I do is turn on the TV when I get back home. I guess I might be feeling lonely. I try to find an entertaining program. I watch many kinds of programs, such as variety shows, dramas, and comedies. Because I live alone, I have a habit of channel surfing. Because I do not subscribe to a newspaper, I do not know what TV programs are currently on the air. So after turning on the TV, I start channel surfing and stop when I find an entertaining program. During commercial breaks, I start channel surfing again because I do not want to miss any entertaining programs that might be airing simultaneously on a different channel. Another reason for this habit is that I am not disturbing anyone because I live by myself. I think that this habit might change depending on my environment.

Fig. 3. An example of a photographic essay: Channel surfing [7]. The theme assigned to the participant is “something I usually do with an IT product.” In the essay, the author assumed that the television is an IT product.
The purpose of collecting photographic essays from users is to elicit potential hidden needs. This is achieved through the users’ deep introspection on the assigned theme. The photographic diaries and photographic essays are the key user data in the SPED-UP approach. We expect that the needs or requirements that emerge from the photographic essays can be matched to the opportunities for information processing or communication found in the photographic diaries. Toward this end, designers analyze the collected photographic diaries and essays. The ideas obtained from the data are summarized and listed as the Design Concepts shown in Fig. 1. The next step in the SPED-UP approach is to create scenarios.

3.4 Scenarios

Scenarios in the SPED-UP approach have two aspects: as a tool to support idea generation and as a representation of design ideas based on user data. At the idea
generation stage, starting from the design concepts produced by the analysis of the photographic diaries and photographic essays, designers conduct brainstorming sessions using an affinity diagram. In this activity, scenarios may take a textual narrative form. During the SPED-UP brainstorming session, participants create short scenarios that include usage situations. The participants ask 5W1H (What, Why, Who, When, Where and How) and what-if questions to identify concrete details of various use situations. The answers to the questions are represented as scenarios with detailed information. As a representation of design ideas, designers create scenarios that represent scenes of a task or activity. Scenarios at this stage are much longer descriptions than those in the brainstorming session.
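The 5W1H and what-if questioning described above can be illustrated with a small question generator. This is only a sketch of how a facilitator might prepare prompts for a brainstorming session: the question wording, the names `TEMPLATES` and `scenario_questions`, and the example what-if condition are assumptions, not part of the published method.

```python
# Hypothetical question templates for the six 5W1H words.
TEMPLATES = {
    "What": "What is happening in '{s}'?",
    "Why": "Why does '{s}' happen?",
    "Who": "Who is involved in '{s}'?",
    "When": "When does '{s}' take place?",
    "Where": "Where does '{s}' take place?",
    "How": "How is '{s}' carried out?",
}


def scenario_questions(situation, what_ifs=()):
    """Build the 5W1H questions for a usage situation, then the what-if ones."""
    questions = [template.format(s=situation) for template in TEMPLATES.values()]
    questions += [f"What if {condition}?" for condition in what_ifs]
    return questions


if __name__ == "__main__":
    # The situation is taken from the channel-surfing essay in Fig. 3;
    # the what-if condition is invented for illustration.
    for question in scenario_questions(
        "channel surfing alone at home",
        what_ifs=["the viewer had a program guide"],
    ):
        print(question)
```

The answers that participants give to each generated question would then be written up as short scenarios.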
4 Example

We conducted a two-day workshop on the SPED-UP approach at the Ergo-Design Division, Japan Ergonomics Society. This section gives an overview of the workshop as an example. Other reports on the workshop can be found in [9, 10, 16]. The workshop was intended to create design specifications of a ubiquitous computing system for a university campus. Specifically, we designed the system not only for traditional usability aspects but also for emotional aspects; in this sense, we intended to incorporate the aspect of happiness into the system. The workshop participants were from several companies and universities in Japan. They had various backgrounds and experience in industrial and product design but no experience using the SPED-UP approach. Box 1 and Box 2 show the assignments given to the participants. Following our SPED-UP approach, we asked them to address a theme, “Something I feel happy about”, by taking a representative photograph and writing a brief vignette indicating the significance of the photo. We gave the assignments to the participants beforehand, and they created the photographic diaries and photographic essays prior to the workshop. Fig. 4 shows the first two hours of a photographic diary created by a participant. She is a supporting staff member of a university field hockey team, and she describes her day during spring break. The photographic diaries provided by the participants enable the workshop members to share and understand each individual’s daily life.

Photo Diary Project Description
Do the following.
• Take a photograph every thirty minutes from morning to night (a one-hour interval is acceptable if you think a thirty-minute interval is too busy).
• Write a short diary that explains the scene captured in the photograph.
• Construct a summary document (a PowerPoint presentation or a poster) that contains the photographs and diary.
Notes
• Consider what the theme means to you.
• Describe the scene in the photographs; explain why you selected that particular scene.
Box 1. Photo diary assignment given to the participants
Photo Essay Project Description
For the theme below, do the following.
• Take a pair of photographs (overview and close-up) that describes the theme.
• Write a short essay that explains the meaning of the scene captured in the photographs.
• Construct a summary document (a PowerPoint slide) that contains the photographs and essay.
Theme
• Something I feel happy about
Notes
• Consider what the theme means to you.
• Describe the scene in the photographs, and explain why you selected that particular scene.
Box 2. Photo essay assignment given to the participants
8:30-9:00 (1), (2) I wake up in the morning and check e-mail first. I use a microwave for making a drink in the cold winter.
9:00-9:30 (3) At a convenience store, I use the photo-printing service. The cash insertion slot is out of reach of the printing terminal.
9:30-10:00 (4) I time the warm-up exercise with a stopwatch behind the backstop on the field hockey field.
10:00-10:30 (5) I hand out drinks to players every thirty minutes.

Fig. 4. A photographic diary created by a participant. She is a supporting staff member of a university field hockey team (excerpt from her poster and translated by the author).
Fig. 5 shows a photographic essay created by a participant. He explains in the photographic essay why self-made coffee in the morning is important for him. At the workshop we started by explaining the photographic diaries and photographic essays that had been brought. Then we divided the workshop members into three groups. Each group reviewed all the photographic diaries and photographic essays and found the common ideas and opportunities behind them. They created design keywords through this activity. All materials were posted on the wall of the workshop room so that the participants were able to review them at any time.
Fig. 5. A photographic essay created by a participant. He explains why self-made coffee in the morning is important for him to have a happy day.
During the analysis phase of photographic diaries and photographic essays, the participants created keyword descriptions. Box 3 shows an example of the keyword descriptions created by a participant group. Based on those keywords, the participants conducted scenario-based brainstorming sessions. Finally, they created design ideas about restructuring the concept of a lecture on campus. They proposed the “learning like a pot-luck party” concept, a student-led learning environment where anyone comes and leaves anytime and shares knowledge and experience.

Keyword descriptions:
Relativity: The degree of happiness is perceived in a relative manner. The same life event can be experienced differently from person to person.
Rhythm: Series of events in daily life create a harmony of happiness.
Box 3. Keyword description by the participant group
5 Conclusions

In this paper, we introduced a user research and design method using a scenario-based approach with photographic diaries and photographic essays. The Scenarios, Photo Essays and Diaries as User Probes (SPED-UP) approach enables designers to collect user data at the beginning of the design process in a lightweight manner. In this paper, we specifically addressed the representation of photographic diaries and photographic essays.
We introduced the SPED-UP approach at a workshop held by the Ergo-Design Division, Japan Ergonomics Society in February 2006. The participants at the workshop quickly acquired the approach and then started using it in the design departments of several companies and universities in Japan, including Fujitsu Co. Ltd., Canon Inc., Ricoh Company, Ltd., Chiba University, Musashi Institute of Technology, Kurashiki University of Science and the Arts, and University of Yamanashi. The Ergo-Design Division is now considering using it as a basic design approach for ubiquitous services, applications, and products. Ueda and Watanabe [15] reported that the SPED-UP approach enables design students to center their creative efforts specifically on their design target, which suggests the potential value of SPED-UP for use in design education.

Acknowledgments. The author thanks the Ergo-Design Division, Japan Ergonomics Society. The photographic diary and photographic essay in Section 4 were provided by Saori Oku, Wakayama University, and Hiromasa Yoshikawa, Design Center, Fujitsu Co. Ltd.
References
1. Beyer, H., Holtzblatt, K.: Contextual design: Defining customer-centered systems. Morgan Kaufmann, San Francisco (1998)
2. Frost, J., Smith, B.K.: Visualizing Health: Imagery in Diabetes Education. In: Proceedings DUX 2003 Case Study, Designing for User Experience ACM/AIGA (2003)
3. Gaver, B., Dunne, T., Pacenti, E.: Cultural probes. Interactions 6(1), 21–29 (1999)
4. Go, K., Carroll, J.M.: Scenario-based task analysis. In: Diaper, D., Stanton, N. (eds.) The Handbook of Task Analysis for Human-Computer Interaction, pp. 117–134 (2003)
5. Go, K., Carroll, J.M.: The Blind Men and the Elephant: Views of Scenario-Based System Design. Interactions 11(6), 44–53 (2004)
6. Go, K., Takamoto, Y., Carroll, J.M., Imamiya, A., Masuda, H.: PRESPE: Participatory Requirements Elicitation using Scenarios and Photo Essays. In: Extended Abstracts of CHI 2003, Conference on Human Factors in Computing Systems, pp. 780–781 (2003)
7. Go, K., Takamoto, Y., Carroll, J.M., Imamiya, A., Masuda, H.: Envisioning systems using a photo-essay technique and a scenario-based inquiry. In: Proceedings of HCI International 2003, pp. 375–379 (2003)
8. Holbrook, M.B., Kuwahara, T.: Collective Stereographic Photo Essays: An Integrated Approach to Probing Consumption Experiences in Depth. International Journal of Research in Marketing 15, 201–221 (1998)
9. Inoue, A.: A Proposal for New Campus Life for the Ubiquitous Generation: An approach using Photo Scenario Method. The Japanese Journal of Ergonomics 42 Supplement, 58–59 (in Japanese) (2006)
10. Ito, J.: How to Make Campus Life Unforgettable with Ubiquitous Service. The Japanese Journal of Ergonomics 42 Supplement, 54–55 (in Japanese) (2006)
11. Muller, M.J., Haslwanter, J.H., Dayton, T.: Participatory Practices in the Software Lifecycle. In: Helander, M., Landauer, T.K., Prabhu, P.V. (eds.) Handbook of Human-Computer Interaction, 2nd edn., pp. 255–297. Elsevier, Amsterdam (1997)
12. Poltrock, S.E., Grudin, J.: Organizational obstacles to interface design and development: two participant-observer studies. ACM Transactions on Computer-Human Interaction 1(1), 52–80 (1994)
13. Pruitt, J., Adlin, T.: The Persona Lifecycle: Keeping People in Mind throughout Product Design. Morgan Kaufmann, San Francisco (2006)
14. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2001)
15. Ueda, Y., Watanabe, M.: A study of vision-development methods for the ubiquitous generation. In: Proceedings of the 36th Annual Meeting of Kanto-Branch, Japan Ergonomics Society, pp. 29–30 (in Japanese) (2006)
16. Yoshikawa, H.: Campus Life Support by Ubiquitous Technology. The Japanese Journal of Ergonomics 42 Supplement, 56–57 (in Japanese) (2006)
Alignment of Product Portfolio Definition and User Centered Design Activities

Ron Hofer1, Dirk Zimmermann2, and Melanie Jekal3
1 Siemens IT Solutions and Services C-LAB, Fürstenallee 11, 33102 Paderborn, Germany, [email protected]
2 T-Mobile Deutschland GmbH, Landgrabenweg 151, 53227 Bonn, Germany, [email protected]
3 Universität Paderborn C-LAB, Fürstenallee 11, 33102 Paderborn, Germany, [email protected]
Abstract. To reach a product’s business objectives, the requirements of all relevant stakeholders have to be analyzed and considered in the product definition. This paper focuses on the processes applied to analyze and consider the needs and expectations of two of these stakeholder groups, namely the customers and the users of a product. The processes to produce customer centered product definitions and user centered product definitions are compared, making visible opportunities to increase their efficiency and effectiveness by means of collaboration.

Keywords: Business Requirements, Customer Requirements, Marketing, Product Definition, Product Portfolio Management, Usability Engineering, User Centered Design, User Requirements.
2 The Playground

The roles that one or more persons might perform in a buying decision can be classified into six buying roles: the initiator, the influencer, the decider, the buyer, the user and the gatekeeper [1]. This framework helps to understand the different viewpoints, expectations and needs that customers and users have regarding the same products. Business plans consider all six of these roles in order to define products that intentionally influence all the factors leading to a purchase decision. One of these buying roles is the user. User Centered Design (UCD) offers established processes, methods and tools to understand and consider this one of the six buying roles, which leads to the authors’ belief that an early start of UCD activities supports business decisions already in the initial phase of the PL. Another buying role is the decider (the one who decides on the purchase of a product). In the context of this paper, the motivation to make a purchase decision differs between organizational customers, who purchase IT systems to be used by members of the organization (e.g. a call center or an intranet solution), and private customers, who are the actual end users (e.g. the purchaser of tax software or a mobile phone). These differences will be addressed at relevant points within the paper. The PPD is conducted at the very beginning of a product’s lifecycle. Product portfolios (PPs) consist of a unified basic product platform and product modules, which are tailored to fit the needs of specific market segments. Objectives and requirements of PPs are defined in “product vision” documents [22]. The modules of a PP can be developed and launched as independent projects at different times. There is a wide range of drivers influencing the definition of product visions for PPs.
Company-external drivers, such as society and politics, science and technology, and the target market, as well as more internal drivers, such as the business strategy, the product strategy, and the company’s own and competitors’ existing and planned products, have to be considered. This paper focuses solely on one aspect of these drivers, the so-called “voice of the customer” ([16], [22]), which has to be heard and considered in the definition of product visions and project scopes in order to tailor the modules of a product line to customer segments and to align each module with specific customer needs and expectations. Literature on the process of product definition (PD) emphasizes that the analysis of the context of use provides valuable insights about customers’ needs and expectations and should be considered in the definition of product visions and project scopes ([16], [22]). On the other hand, usability experts (e.g. in the QIU Reference model [8]) and related ISO standards (DIN EN ISO 13407 [7], ISO/TR 18529 [13] and ISO/PAS 18152 [12]) point out that the interests and needs of the user groups that will work with the products should be considered throughout the entire product lifecycle, “from the cradle to the grave”, to thoroughly ensure and enhance the ease of use and usability of interactive products.
3 Comparison of Focus and Methods

The following comparison identifies activities within both processes which need to be aligned to assure and increase both the customer and the user acceptance of
R. Hofer, D. Zimmermann, and M. Jekal
Fig. 1. The four activities within the PPD and subsequent UCD phase
products. To ease the comparison, both processes are divided into four steps, namely “Analyze Context”, “Specify Requirements”, “Produce Concepts” and “Evaluate Concepts”. This sequence is in line with the iterative human centered design steps [7] and customer centered approaches to define products [16]. For each step, product definition and UCD activities are juxtaposed to identify opportunities to increase the efficiency within both processes by joint activities and to explore the usage related effects of decisions within the product definition phase. The steps are mapped on a schematic diagram visualizing the commonly acknowledged sequence from the PPD phase to the UCD phase.

3.1 Analyze Context

Analyzing the Business and Customer Context

Within the business context, product visions and project scopes are defined, based on a thorough analysis. This paper focuses on a significant part of the overall analysis activities, namely the identification of “the voice of the customer” [22]. Within this part, significant differences between groups of customers are identified in order to segment markets, and detailed insights about each customer group’s specific current and future needs and expectations are gathered. In the case of product offerings for private customers, information about customers’ geographics, demographics (addressing the social levels and the family lifecycle), psychographics (addressing patterns by which people live and spend time and money) and behavioristics (addressing the customers’ extent of use and loyalty, unrealized consumer needs and the usage situation) ([9], [10]) supports “the process of dividing a market into groups of similar consumers and selecting the most appropriate group(s) […] for the firm to serve” [19] and provides valuable information about the private customers’ motivation to make purchase decisions.
Common sources to analyze customers’ needs and expectations are problem reports and enhancement requests for a current system, marketing surveys, system
Alignment of Product Portfolio Definition and User Centered Design Activities
requirements specifications and descriptions of current or competitive products. These sources are supplemented with interviews and discussions with potential users, user questionnaires, the observation of users at work and the analysis of user tasks [22] to “perform foresight research into potential user groups in order to identify forthcoming needs for systems and new users or user organizations” and to “Identify expected context of use of future systems” [13]. These methods have significant overlap with analysis methods used in the UCD process.

Analyzing the User Context

UCD processes begin with a thorough analysis of the context of use. The context of use includes “the characteristics of the intended users”, “the tasks the users are to perform” and “the environment in which the users are to use the system” [7]. Additionally, a “competitive analysis” [17] of competitive systems can add valuable information. The characteristics of the intended users include information about their “knowledge, skill, experience, education, training, physical attributes, habits, preferences and capabilities” [7]. This information is summarized in user profiles [14], often represented as Personas ([20], [5]). User profiles help to keep each user group’s specific constraints, abilities and mental models in mind throughout product development. The relevant user goals are captured and analyzed to identify the as-is sequences of tasks that users perform to reach these goals. The usage environment analysis adds information about “the hardware, software and materials to be used [and] the organizational and physical environment” [7]. Information about the environment helps to consider restrictions and to identify potential opportunities to enhance the product-to-be. Common methods to analyze the context of use are structured on-site visits, structured interviews or interviews using the master/apprentice model [3] with users and customers ([6], [11]).
3.2 Specify Requirements

Business and Customer Requirements

Business requirements set the overall “product vision” and determine the product portfolio modules to be developed. Furthermore, business requirements contain the identified business opportunity, business objectives and criteria, customer and market needs, business risks, scopes and limitations, and the business context containing information about the stakeholder profiles. Customers are a subset of the overall stakeholders considered in the definition of business requirements. Business requirements are the basis to elicit customer requirements for each project. This is done in tight collaboration with customers. Customer requirements can be grouped into nine classes, namely “Business Requirements, Business Rules, Use Cases or Scenarios, Functional Requirements, Quality Attributes, External Interface Requirements, Constraints, Data Definitions and Solution Ideas” [22]. Business as well as customer requirements address issues related to the context of use. High-level business requirements “set business tasks (Use Cases) that the product enables” and “influence the implementation priorities for use cases” [22], and project related customer requirements include those Use Cases.
User Requirements

User or Workflow requirements specify how the system should support users in completing their tasks and thus have an impact on the early definition of products and market segments [7]. They are captured in Use Cases that “describe the system behavior under various conditions as the system responds to a request from one of the stakeholders, called the primary actor” [4]. The core element of a Use Case is the main scenario, which lists the flow of interaction to reach a specific goal. This interaction flow is improved into a reengineered shall-be status to “realize the power and efficiency that automation makes possible” and to “more effectively support business goals” [14] and customer requirements. Use Cases are an ideal container to gather all functional requirements necessary to enable a specific user group (primary actor) to reach a specific goal. As products usually enable several distinctly different user groups to reach several goals, Use Cases can be organized into a matrix showing user groups and their respective user goals. This matrix supports decisions concerning the product portfolio elements and project scopes.

3.3 Produce Concepts

Business and Project Concepts

On the business level, a consistent concept is developed under consideration of the business requirements. This process is of a complex nature, as there is more than one alternative solution for each component of the concept ([2], [18]). On the product level, customer requirements are consolidated into product definition concepts describing the “Place” variable (referring to a geographic location, an industry and/or a group of people - a segment - to whom a company wants to sell its products or services) and the “Product” variable (addressing a product’s functionality, product differentiation, product shape and the Product Portfolio management) of the “4Ps” of a so-called “marketing mix”.
From a marketing perspective, the “Pricing” and “Promotion” variables supplement the concepts [15]. Methods to systematically derive an optimum configuration of business and product concepts address the visualization of complex requirement interrelations, the production and usage of prototypes and the prioritization of requirements. To deal with the given uncertainties, usually several concepts are derived and evaluated to reduce the risk of misconceptions [22].

User Interface Concept

The conceptual phase within the UCD process deals with two major objectives. The first objective is to organize the identified and reengineered tasks into models that describe the overall hierarchy and interrelations of tasks, considering the user and business perspective. The second objective is to translate these models into a consistent specification of the UI through several iterations. The first iteration focuses on the creation of the “Conceptual Model Design”, which defines “a coherent rule-based framework that will provide a unifying foundation for all the detailed interface design decisions to come” [14]. This framework, visualized in mock-ups, represents the reengineered task models in a more tangible way and can thus support customer-focused evaluation activities.
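The matrix of user groups and user goals mentioned under User Requirements can be illustrated with a small data structure. The following Python fragment is only a sketch under stated assumptions: the user groups, goals and Use Case names are hypothetical examples loosely based on the call center and tax software cases named earlier, and the helper functions `build_matrix` and `uncovered` are not part of any method described in this paper.

```python
from collections import defaultdict

# Hypothetical entries: (user group, user goal, Use Case name).
USE_CASES = [
    ("call center agent", "resolve customer request", "Look up customer record"),
    ("call center agent", "resolve customer request", "Log call outcome"),
    ("team supervisor", "monitor service quality", "Review call statistics"),
    ("private customer", "file tax return", "Import income data"),
]


def build_matrix(use_cases):
    """Group Use Case names by user group (row) and user goal (column)."""
    matrix = defaultdict(lambda: defaultdict(list))
    for group, goal, name in use_cases:
        matrix[group][goal].append(name)
    return matrix


def uncovered(matrix, groups, goals):
    """Return the (group, goal) cells that have no Use Case yet."""
    return [
        (group, goal)
        for group in groups
        for goal in goals
        if not matrix.get(group, {}).get(goal)
    ]


if __name__ == "__main__":
    matrix = build_matrix(USE_CASES)
    for group, goals in matrix.items():
        for goal, names in goals.items():
            print(f"{group} / {goal}: {', '.join(names)}")
```

Empty cells reported by `uncovered` could, for instance, prompt a discussion of whether a goal is out of scope for a module or simply not yet analyzed.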
3.4 Evaluate Concepts

Evaluation of Business and Project Concepts

From a business perspective, evaluation activities address business concepts and product concepts defining the segmentation of markets and the corresponding products. These concepts are reviewed with customers (usually specific registered customers of the company), relevant stakeholders and domain experts [22]. Customer requirements are evaluated with customers to get feedback on how to adjust concepts and which concept to choose. Feedback on product concepts is gained by surveys, focus groups, reviews and structured interviews with potential and known customers. If several concepts are to be compared, benchmarking methods such as the Kano method or the Conjoint Analysis method [16] are used to identify promising project concepts and marketing mixes. These methods are based on the assumption that customers are able to explain and predict their thinking and behavior [20]. They can be supplemented by methods to gain insights about the 95% of thinking that takes place in customers’ unconscious minds and strongly affects their purchasing behavior [23]. Additionally, launching products with a limited area of circulation or limited functionality (single modules, beta versions) provides early feedback from the marketplace.

User Centered Evaluation

One of the basic principles of UCD is to develop human system interfaces in iterations to decrease the chance of costly changes and revisions at late stages of product development [22]. With this approach, the risk of unforeseen obstacles which might result from reengineered task sequences, task models and UI concepts can be reduced, and initially undetected issues concerning the users’ needs and expectations can be considered at an early stage of UI development. There are two types of UCD evaluations. Summative evaluations (e.g.
usability tests, benchmarks and reviews) aim at the final assessment of products, whereas formative evaluations are conducted continuously to support decisions concerning UCD concepts within the process. As this paper discusses mutual benefits in joint customer centered and user centered activities in the early “cradle” step of product development, the formative UCD evaluation is of foremost interest. The methods used for formative evaluations at this point of product development are collaborative reviews, expert reviews, validations with users and customers and focus groups. Formative evaluations confirm intermediate results within the process and identify potential areas for optimization or correction.
4 Mutual Benefits

As shown, the methods used within product development overlap with methods used in UCD activities. This overlap can be a promising starting point to reduce time and effort (the two basic metrics for efficiency) within product development.
Fig. 2. Promising areas for collaboration within the PPD and subsequent UCD phase
The second advantage of conducting Product Portfolio Definition and UCD activities simultaneously is the opportunity to explore the effects of PPD activities on the context of use within the PPD phase. This feedback is a valuable basis to make adjustments within each of the PPD steps, enhancing the reliability of all subsequent steps and reducing cost-intensive change requests in subsequent PL phases. This enables the product definition team to adjust analysis plans, requirement specifications, concepts and evaluation focus accordingly. In the following, we summarize all potential areas of collaboration. The areas are mapped on the schematic diagram (Figure 2) visualizing the PPD and subsequent UCD phase, introduced in Section 3.

a) Joint Analysis and Customer Selection

The identification of relevant customer and user segments for analysis activities can be simplified by collaboration of business and user analysts. Business analysts can utilize user groups described in Personas to segment markets ([20], [22]), which leads to a significant reduction of the set of customers to be investigated [16]. On the other hand, “ethnographic interviewers should use market research to help them select interview partners” [5] and derive user groups [20]. Some of the main methods used to analyze the characteristics of target customers are equally used within the UCD process to gain insights about the characteristics of
the intended users, their user goals and the environment in which the users are to use the system. A simultaneous analysis approach could therefore reduce time and effort. The relevant interview partners can be interviewed jointly, adding valuable mutual insights. As stated by Cooper, “data gathered via market research and that gathered via qualitative user research complement each other quite well” [5].

b) Exploring User Requirements for Product Definition

First (jointly analyzed) insights about customers and users can be utilized by UCD activities to “perform foresight research into potential user groups in order to identify forthcoming needs for systems and new users or user organizations” [13]. These insights can be used as a basis for a product modularization oriented toward user groups and goals and for the identification of “technology capabilities that would be needed” [21]. The UCD methods to translate user goals into meaningful Use Case requirements can be utilized in PD to “Identify expected context of use of future systems” [13]. Use Cases fill the “Use case or scenario” class of the customer requirements derived within PD [22]. Furthermore, early insights about the expected context of use can indicate missing analysis data about customers within the customer analysis step.

c) Joint Requirement Specification

Business requirements “determine both the set of business tasks (use cases) that the application enables” and “influence the implementation priorities for use cases and their associated functional requirements” [22]. Within the requirement elicitation phase of PD, analysts elaborate customer and user statements into general customer requirements. Some of these requirements address statements concerning user goals or business tasks that users need to perform.
UCD methods can be utilized to condense these requirements in the form of Use Cases, which cluster all product requirements necessary to fulfill a certain user goal in one single requirement and can thus reduce the complexity of the requirements to be considered [17]. In the requirement phase of the UCD process, task sequences are reengineered to optimally achieve the identified business goals. These UCD reengineering activities allow the consideration of improved workflows and changes in users and tasks within the PD phase.

d) Explorative Concepts

Usage related product requirements can be translated into first conceptual models and mock-ups. Especially in the context of private customers, these mock-ups can be used in the requirement elicitation phase to get early customer feedback and adjust the requirements accordingly.

e) Joint Conception

In the concept phase of PD, several marketing mix concepts are derived to identify the best mixture of all variables of the product offering. Joint conceptualization activities make the effect of trade-off decisions in the marketing mix visible immediately, so that the marketing mix concepts can be adjusted accordingly. Furthermore, a simultaneous creation of first conceptual UI models increases the real-world character of the marketing mixes to be evaluated with customers and users.
f) Explorative Evaluation of Usage Related Components of the Marketing Mix

Explorative evaluation efforts to “assess the significance and relevance of the system to each stakeholder group which will be end users of the system and/or will be affected by input to or output from the system” [13] provide early feedback in the context of use. Marketing mix concepts can be evaluated up front by UCD activities based on the first set of user requirements to allow usage related concept adjustment within the PD phase.

g) Joint Evaluation

UCD processes offer appropriate methods to evaluate the (high-level) usability of product concepts. Furthermore, UI mock-ups derived within the UCD processes help to communicate the product part of marketing mixes to customers and users within review and evaluation sessions.

h) Positive Influence on Schedule, Budget, Resources and Quality

The alignment of PD and UCD activities reduces time and effort, enables both teams to draw on each other’s expertise and increases the product quality and thereby the predictability of the acceptance of the product by customers and users.
5 Summary

This paper identified opportunities to improve the alignment of PPD and UCD activities. It offers a basis for the discussion of how these joint activities can be embedded into established product development processes. Considering the specific requirements of users within the Product Portfolio Definition increases the user acceptance of future products and helps to smoothly implement the UCD process into the overall Product Development:

• The users’ acceptance of future products is considered from the beginning and leads to strategic product portfolios aiming at high-level user goals.
• As UCD activities can start earlier in the product development process, the time necessary to analyze the context of use in subsequent process steps is reduced.
• The simultaneous customer and user focus enhances the shared understanding and awareness of business and user goals across development teams early in the project development process.
• Feedback about the user acceptance of portfolio definitions is provided early in the process, which enables the adjustment of product portfolios within the first process steps and thus reduces extra costs of change requests in subsequent steps.
References

1. American Marketing Association: Dictionary of Marketing Terms. Retrieved (February 16 2007), from http://www.marketingpower.com/mg-dictionary-view435.php
2. Becker, J.: Marketing-Konzeption. Grundlagen des ziel-strategischen und operativen Marketing-Managements. 8th edn. München, Vahlen (2006)
3. Beyer, H., Holtzblatt, K.: Contextual Design. Defining Customer-Centered Systems. Morgan Kaufmann Publishers, San Francisco, CA (1998)
4. Cockburn, A.: Writing Effective Use Cases, vol. 1. Addison-Wesley, Boston, MA (2001)
5. Cooper, A.: About Face 2.0., vol. 53. Wiley Publishing Inc, Indianapolis, US (2003)
6. Courage, C., Baxter, K.: Understanding Your Users. A Practical Guide to User Requirements [...]. Morgan Kaufmann Publisher (Elsevier), San Francisco, CA (2005)
7. DIN EN ISO 13407: Human-centered design processes for interactive systems. Brussels, CEN - European Committee for Standardization vol. 9(10) (1999)
8. Earthy, J., Sherwood-Jones, B.: Quality in use processes and their integration - Part 1 Reference Model. Lloyd’s Register of Shipping, London (2000)
9. Engel, J.F., Blackwell, R.D., Miniard, P.W.: Consumer Behavior. The Dryden Press, Chicago (1990)
10. Evans, M., Jamal, A., Foxall, G.: Consumer Behaviour. John Wiley & Sons Ltd, West Sussex, England (2006)
11. Hackos, J.T., Redish, J.C.: User and Task Analysis for Interface Design. John Wiley & Sons, Inc, USA (1998)
12. ISO/PAS 18152: Ergonomics of human-system interaction - Specification for the process assessment of human-system issues. ISO, Geneva. 8, 9, 11 (2003)
13. ISO/TR 18529: Ergonomics - Ergonomics of human-system interaction - human-centred lifecycle process descriptions. ISO, Geneva (2000)
14. Mayhew, D.J.: The Usability Engineering Lifecycle, Morgan Kaufmann, San Francisco, pp. 172, 174, 188 (1999)
15. McCarthy, J.: Basic Marketing - A managerial approach. Irwin, Homewood, IL (1960)
16. Mello, S.: Customer-centric product definition. Amacom, New York (2002)
17. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
18. Nieschlag, R., Dichtl, E., Hörschgen, H.: Marketing. 18th edn. Duncker & Humblot, Berlin (1997)
19. Peter, J.P., Olson, J.C.: Consumer Behavior and Marketing Strategy, p. 378. McGraw-Hill Higher Education, Boston (2002)
20. Pruitt, J., Adlin, T.: The Persona Lifecycle. Keeping People in Mind Throughout Product Design. Morgan Kaufmann Publishers (Elsevier), San Francisco, CA (2006)
21. Sengupta, U., Sherry, J.: Future vision 2015: Building a User-Focused Vision for Future Technology. Technology@intel Magazine (9/2004) (2004)
22. Wiegers, K.E.: Software Requirements. In: Practical Techniques for gathering and managing Requirements [...], 2nd edn. Microsoft Press, Redmond, Washington 120, 95, 81 (2003)
23. Zaltman, G.: How Customers Think. Essential Insights into the Mind of the Market. Harvard Business School Press, Boston, MA (2003)
A New User-Centered Design Process for Creating New Value and Future

Yasuhisa Itoh 1,2, Yoko Hirose 3, Hideaki Takahashi 3, and Masaaki Kurosu 3

1 U'eyes Design Inc., Housquare Yokohama 4th Floor 1-4-1 Nakagawa, Tsuzuki-ku, Yokohama, Kanagawa-ken 224-0001 Japan
2 Department of Cyber Society and Culture, The Graduate University for Advanced Studies, 2-12, Wakaba, Mihama-ku, Chiba-shi, 261-0014 Japan
3 National Institute of Multimedia Education, 2-12, Wakaba, Mihama-ku, Chiba-shi, 261-0014 Japan
Abstract. This paper presents a new process model of user-centered design that can be applied to the development of new value and future. Realizing that the widely known conventional human-centered design process, defined by ISO13407, is not always effective, we propose a new process model and give an overview of the activities based on this process. The model aims not only at developing new value and future, but also at generating new ideas in concept planning.

Keywords: User-centered design; ISO13407; Developing new value and future; Concept planning.
ISO13407 processes. Here we introduce some specific characteristics of the projects we are focusing on, which are outlined below:

• The system’s realization date (the launch date) is in the near future
• The system will make use of technology that is not currently available
• New value is to be added, but there are currently no specific ideas

Development that meets these kinds of requirements is not aimed at products that have just become available for sale, but at products or systems that will be released over a period ranging from 2 to 3 years up to 5 or 10 years in the future. These products will also include items that contain entirely new functions or added value, are equipped with a completely new user interface, or fall under the category of a completely new product or service. Realizing these new functions and added value often requires new technology, and a suitable amount of time is needed to develop this technology. This often means that, rather than the most recent technology, what is actually required is technology that is not currently available but will be developed in the near future. In the initial stages of such development the product or service itself is often still in the middle of the planning process, so that it becomes necessary to create new ideas regarding new value and to investigate the feasibility of realizing them as part of the product planning process.

In this study we introduce a conceptual model for a user-centered design process for systems that are to be realized in the near future and that are to create new value. The near future is defined here as the period from 2 to 10 years from the current date.
2 Scope of a New User-Centered Design Process

2.1 Scope

Table 1 shows the scope of the proposed process model. The scope is divided along two axes: whether the system being examined (either a product or a service) involves new value, and the proposed realization date of the relevant system. Neither axis can be quantified precisely, so the cells cannot be separated by strict thresholds, but the table does present a general concept of how the cases can be distinguished.

For the area in Table 1 with no new value and a recent realization date, the ISO13407 process model is thought to be a suitable model for bringing these products to development. Items that have already undergone the product planning stage can be treated under ISO13407 using the “Specific requirements for user-centered planning” as defined in the upper-left panel in Figure 1. After the decision has been made as to whether the product or service is in need of user-centered design, we think that they can undergo the same actual process. In contrast, in the development process that is the subject of this study, the user-centered design process starts during the initial product planning process (Figure 2). As a result, the planning process is incorporated as the first of a series of processes.
Table 1. Scope of a new user-centered design process

New value?   Realization time   Scope                                                                                  Suitable process
Yes          Recent             Development of new value that has been recently realized                               Proposed process
Yes          Near future        Development of new value that will be realized in the near future                      Proposed process
No           Recent             Relatively little development of new value that has been recently realized             ISO13407
No           Near future        Relatively little development of new value that will be realized in the near future    Proposed process
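The selection rule expressed by Table 1 can be written down as a small decision function. The following Python fragment is a minimal sketch, not part of the proposed process itself. It assumes that the near future is the period of 2 to 10 years stated in the text and treats anything closer as recent; the function names and the handling of dates outside that range are illustrative assumptions, since the paper notes that the axes cannot be divided strictly.

```python
# Assumption: the paper defines the near future as 2 to 10 years ahead.
NEAR_FUTURE_MIN_YEARS = 2
NEAR_FUTURE_MAX_YEARS = 10


def is_near_future(years_until_realization: float) -> bool:
    """True if the realization date falls in the 2 to 10 year window."""
    return NEAR_FUTURE_MIN_YEARS <= years_until_realization <= NEAR_FUTURE_MAX_YEARS


def suitable_process(new_value: bool, years_until_realization: float) -> str:
    """Return the process model that Table 1 marks as suitable."""
    if years_until_realization < 0 or years_until_realization > NEAR_FUTURE_MAX_YEARS:
        # Hypothetical choice: dates outside the table are rejected.
        raise ValueError("realization date is outside the scope of Table 1")
    if not new_value and not is_near_future(years_until_realization):
        return "ISO13407"
    return "proposed process"


if __name__ == "__main__":
    for new_value in (True, False):
        for years in (1, 5):
            print(new_value, years, suitable_process(new_value, years))
```

Only the combination of no new value and a recent realization date maps to ISO13407; the other three cells of the table map to the proposed process.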
Fig. 1. Process of human-centered design activities
2.2 The Process Model

Figure 2 shows a general outline of the conceptual model for the user-centered design process that we propose here. In contrast to the conventional ISO13407 process model outlined in Figure 1, three processes have been added: 1) user-centered planning, 2) study and prediction of future circumstances, and 3) selection and creation of new value.

1) The proposed user-centered design process differs from development in which the planning has already been decided: it begins at the initial product planning stage. As the product planning that takes place here includes a user-oriented philosophy, we decided to name this process user-centered planning.
Fig. 2. A new user-centered design process for creating new value and future
2) The study and prediction of future circumstances is a necessary process for envisaging an actual realization time for the relevant product or service in the near future. If the realization date is within several months or 1 to 2 years of the current date, it can be assumed that future circumstances and users will be virtually unchanged from the present, which suggests that development can proceed directly. If, however, the realization date is in the near future (anticipated as being between 2 and 10 years ahead), it is likely that a wide variety of factors will change in this period, including the available technology, and it is difficult to envisage future users having the same needs and requirements as current users. In this event it is necessary to study future circumstances and global changes and to attempt to predict the characteristics of potential future users, together with the anticipated conditions for the relevant product or service. The study and prediction of future circumstances is therefore an integral part of the proposed process.

3) The process of creating and selecting new value is also connected to user-centered planning. If new value is one of the requirements of this planning process, then coming up with new ideas is an essential element of the process. If such ideas are subsequently found to be of high value and feasible to implement, they can be used as the basis for the refinement of the product planning process. In order to carry this out, however, it is first necessary to develop a number of creative ideas. This involves generating a number of ordinary ideas and subsequently choosing the best of them for use as the basis for refinement of the product planning process.
This element of generating and selecting new ideas is an important factor in the user-centered design process.
2.3 The ISO13407 Process Model and Its Application

In addition to the 3 processes outlined in section 2.2, the proposed model contains a number of processes that have much in common with ISO13407. The content of each of these processes, however, has undergone some change and expansion, and is touched on in section 3.

2.4 The Life Cycle of ISO/IEC15288 and Its Application

Table 2 shows the system life cycle stages of ISO/IEC15288 [2]. We consider our proposed process (Figure 2) to correspond to the concept and development stages of ISO/IEC15288. In our proposed process we anticipate that each activity in the concept stage and the development stage will be carried out repeatedly. The process may switch between the concept stage and the development stage; if the result does not fully satisfy the user, the organization or the planning point of view, the process returns to the previous stage and the overall process is repeated.

Table 2. System life cycle stages and purposes [2]

LIFE CYCLE STAGE   PURPOSE
CONCEPT            Identify stakeholders’ needs; explore concepts; propose viable solutions
DEVELOPMENT        Refine system requirements; determine system components; build system; verify and validate system
PRODUCTION         Mass produce system; inspect and test
UTILIZATION        Operate system to satisfy users’ needs
SUPPORT            Provide sustained system capability
RETIREMENT         Retire; archive or dispose the system

DECISIONS (shared by all stages): execute next stage; continue this stage; go to previous stage; hold project activity; terminate project
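The stages and decision options of Table 2 can be read as a simple state machine, which makes the repeated switching between the concept and development stages explicit. The Python fragment below is a sketch under assumptions: the enum and function names are hypothetical, holding a project is modeled as staying in the same stage, and the behavior at the first and last stage is an illustrative choice rather than something taken from ISO/IEC15288.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    CONCEPT = 0
    DEVELOPMENT = 1
    PRODUCTION = 2
    UTILIZATION = 3
    SUPPORT = 4
    RETIREMENT = 5


class Decision(Enum):
    EXECUTE_NEXT = "execute next stage"
    CONTINUE = "continue this stage"
    GO_TO_PREVIOUS = "go to previous stage"
    HOLD = "hold project activity"
    TERMINATE = "terminate project"


def apply_decision(stage: Stage, decision: Decision) -> Optional[Stage]:
    """Return the stage after a decision, or None if the project is terminated."""
    order = list(Stage)
    index = order.index(stage)
    if decision is Decision.EXECUTE_NEXT:
        # Assumption: the last stage has no successor, so it is kept.
        return order[min(index + 1, len(order) - 1)]
    if decision is Decision.GO_TO_PREVIOUS:
        # Assumption: the first stage has no predecessor, so it is kept.
        return order[max(index - 1, 0)]
    if decision is Decision.TERMINATE:
        return None
    # CONTINUE and HOLD both leave the stage unchanged in this sketch.
    return stage


if __name__ == "__main__":
    stage = Stage.CONCEPT
    for decision in (Decision.EXECUTE_NEXT, Decision.GO_TO_PREVIOUS, Decision.EXECUTE_NEXT):
        stage = apply_decision(stage, decision)
        print(decision.value, "->", stage.name)
```

The loop in the usage example mirrors the iteration described in section 2.4: moving from concept to development, returning to concept when the result is unsatisfactory, and moving forward again.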
3 Proposed Process Activity

3.1 User-Centered Planning

Product planning is an essential element of the process to develop the relevant service or product. Although product planning does give rise to technology-driven planning in a number of cases, this process adopts a system that doesn’t rely solely on
technology but also involves planning that takes into account the perspective of the users who will actually use the system. Based on the resulting plan, the realization date for the service and the planning-side requirements for creating new value can then be determined. The process model that we describe here is mainly intended for systems whose realization date is in the near future and which require the creation of new value. It is also possible, however, to use this process in cases where the creation of new value is required but the realization date is closer to the present, or where the realization date is in the near future but there is no demand for the creation of new value. In such cases some parts of the process will not be required (in the former case there is no need for the study and prediction of future circumstances; in the latter case the process for creating and selecting new value becomes redundant).
3.2 Understand and Specify the Context of Use
The proposed process also involves carrying out a survey and analysis of users. The subjects of such a survey should be the actual anticipated users of the relevant system. It should be noted, however, that when the realization date lies in the future, the potential future users cannot themselves be surveyed. If the proposed realization date is 1 to 2 years after the planning process, a survey can be carried out on the assumption that future users will not be noticeably different from current users. If the current use and users of the system or product are unclear, it will be difficult to develop a system for the future, which means that it is essential to carry out a survey of current users. The results of this survey can then be used to define requirements for the system and as important source data for creating new value.
3.3 Study and Prediction of Future Circumstances
When the aim is to implement the system at some point in the future, it is first necessary to carry out a study and prediction of future circumstances. While it is impossible to completely predict the future, it is possible to survey and predict, as far as possible, those aspects of the future that relate to the development of the system and its targeted users. If the proposed realization date is only 1 to 2 years after the initial planning process, future users can be expected not to be noticeably different from current users and there should be no significant changes [7]. Events that are expected to change can be predicted quantitatively by extrapolating from previous data [7]. It is still important to remember, however, that considerable change can occur in new technology, products or services, and that the rate of usage or adoption of the relevant service or product is also subject to significant change. If the realization date is expected to be in the near future (roughly 2 to 10 years after the initial planning process), significant change is to be expected between present and future conditions, which makes the ability to predict the future valuable. Although it is impossible to
completely predict the future, applying the principles of scenario planning makes it possible to portray a number of different scenarios for the future [4], [5]. To make these predictions it is first necessary to clarify the items related to future change and to identify the principal factors that contribute to change on a global scale [4], [5]. With these factors as the core, a number of different possible futures are then considered. Figure 3 shows a conceptual diagram of futures that have a high potential of actually coming about. The number of futures with a high potential of being realized is not limited to one, which means that several scenarios should be drawn up with different prospects for the future. The future scenarios drawn up in this way are carried into the stage in which the requirements are specified, and they are also used as an important data source for creating new value.
Fig. 3. Model of future scenario planning
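One way to picture how several plausible futures arise from a few principal factors is to enumerate the combinations of their possible outcomes. The sketch below only illustrates that idea: the factor names and outcomes are invented placeholders, `enumerate_scenarios` is a hypothetical helper, and actual scenario planning [4], [5] would select and elaborate scenarios by judgment rather than list them mechanically.

```python
from itertools import product


def enumerate_scenarios(factors):
    """Return one scenario (a dict) per combination of factor outcomes."""
    names = list(factors)
    outcome_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*outcome_lists)]


# Placeholder factors; real ones would come from the study of future change.
factors = {
    "network_speed": ["modest growth", "rapid growth"],
    "population_ageing": ["slower", "faster"],
}

scenarios = enumerate_scenarios(factors)
for s in scenarios:
    print(s)
```

Two factors with two outcomes each give four candidate futures, which matches the point that more than one future has to be considered.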
3.4 Specify the User and Organizational Requirements, and Future Circumstances
This process uses the results of user-centered planning, of understanding and specifying the context of use, and of the study and prediction of future circumstances to extract the necessary requirements for the relevant system, and then describes them in text form. Beyond the requirements specified in the ISO13407 process, it is also necessary to specify requirements arising from the initial planning stage and from future circumstances. The usability requirements for the system and the required conditions for the relevant functions are the same as conventional requirements. Of particular note is that the results of the study and prediction of future circumstances can define which technologies can be used, and which will be unsuitable, at the future realization date. These also act as constraints on creating new value. In order to be able to choose from a wide variety of different ideas in creating new value
it is necessary to define a rating scale for ideas, and this rating scale can be developed from the requirements for new value.
3.5 Creation and Selection of New Value
This process is the distinctive element of the process model, and it is essential if the creation of new value is required from the product planning stage. New value here does not mean a few minor changes to the product or a routine model change, but rather the introduction of completely new functions, a new user interface, high added value that did not previously exist, or a product or system based on new findings. A range of creative ideas is therefore needed to realize such new value, which usually involves brainstorming sessions or individual thinking by product planning and design staff. The most appropriate of the resulting ideas can then be selected and a concept generated from the best ideas. This is ultimately compiled as part of the product planning process. This process takes the ideas of user-centered design, the results of user surveys and analysis, and the predictions of the future world, future markets, and future users as a basis for creative thinking and idea development. We plan to explore specific methods for creative thinking in a separate study and publication. When a broad range of ideas has been generated, the ideas are evaluated quantitatively against the rating scale developed in section 3.4, and the results are used as a basis for selecting the most appropriate ideas.
3.6 Produce Design Solutions
The requirements that include the selected ideas can then be used to design and develop a range of solutions.
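Returning briefly to the idea selection of section 3.5, the quantitative evaluation against a rating scale can be sketched as a weighted score. This is an illustrative assumption only, since the text does not specify how the rating scale is built or combined: the criteria, weights, ratings, and the helpers `score_idea` and `select_ideas` are all hypothetical.

```python
def score_idea(ratings, weights):
    """Weighted mean of an idea's ratings over the criteria in `weights`."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def select_ideas(ideas, weights, top_n=1):
    """Return the names of the `top_n` highest-scoring ideas, best first."""
    ranked = sorted(ideas, key=lambda name: score_idea(ideas[name], weights),
                    reverse=True)
    return ranked[:top_n]


# Hypothetical criteria derived from the requirements for new value.
weights = {"novelty": 3, "user_benefit": 4, "feasibility": 3}

# Hypothetical ratings on a 1 to 5 scale.
ideas = {
    "idea_a": {"novelty": 5, "user_benefit": 2, "feasibility": 4},
    "idea_b": {"novelty": 3, "user_benefit": 5, "feasibility": 4},
    "idea_c": {"novelty": 2, "user_benefit": 3, "feasibility": 5},
}

best = select_ideas(ideas, weights, top_n=2)
```

A weighted mean is only one possible choice; any scoring rule that makes ideas comparable on the agreed criteria would serve the same purpose.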
At this point we recommend that a number of different prototypes of the relevant product be created. Because this process contains an element of the concept stage, the prototypes are built from planned ideas that have been selected for their high feasibility. For the processes of user-centered planning, specifying the relevant requirements, creating and selecting new value, and producing design solutions, we think that in some cases they may be carried out repeatedly and simultaneously in a progressive manner, and not necessarily in the order indicated by the arrows in Figure 2. The same is true of ISO13407 [1], [3].
3.7 Evaluate Designs Against Requirements
Product or system prototypes created in the previous process are evaluated in this process. The evaluation is essentially carried out with anticipated users of the relevant product or system, using usability tests and user tests. These differ from regular tests, however, in that the anticipated users are users at some point in the future. Although it is impossible to carry out an evaluation with future users, it is possible to test subjects who are expected to be relatively close to such future users.
In order to carry out an evaluation of systems that will be used in the future, rather than performing an evaluation on regular users it is preferable to take measures to carry out such a test on progressive users of the product or system [6].
4 Conclusion
Here we introduced a conceptual model for a user-centered design process for use on systems that involve the creation of new value and will be realized and implemented in the future. Although efforts are already underway into the development of systems that use such a conceptual model, these systems are currently in use and we have yet to see any clear results from these efforts. In the future we plan on further investigating the effectiveness of this process, as well as continuing to use this process model as part of the development process.
Acknowledgement
We presented the first draft of this paper at HIS2004.
References
1. ISO13407: Human-centered design processes for interactive systems (1999). JIS Z 8530: Human-centered design process for interactive systems (2000)
2. ISO/IEC 15288: Systems engineering - System life cycle processes (2002). JIS X 0170: Systems engineering - System life cycle processes (2002)
3. Kurosu, Hirasawa, Horibe, Miki: Understanding human-centered design processes for interactive systems, Ohmsha (2001)
4. Schwartz, P.: The Art of the Long View, John Wiley & Sons (1997) – Translated as Shinario puraningu no giho (Scenario Planning Techniques) (trans. Taomoto and Ikeda), Toyokeizai (2000)
5. Teramoto, Yamamoto, Yamamoto: Advanced Evaluation of Technology, Nikkei BP (2003)
6. Holmquist, L.E.: User-Driven Innovation in the Future Applications Lab, In: Proc. CHI2004, pp. 1091–1092 (2004)
7. Sherden, W.: The Fortune Sellers: The Big Business of Buying and Selling Predictions, Diamond (1999)
The Evasive Interface – The Changing Concept of Interface and the Varying Role of Symbols in Human–Computer Interaction
Lars-Erik Janlert
Department of Computing Science, Umeå University, Sweden
Abstract. This is an analysis of the changes the concept of interface is going through in the shift from the currently dominating virtuality paradigm of use to two new use paradigms, namely ubiquity and mobility; an analysis of the concomitantly shifting role of symbols in relation to the user and to the world; ending with an attempt to identify and analyze important research issues in the new situation that arises, two of which are to better understand the various ways different kinds of interface symbols can link to their real-world referents, and how to combine tracking reality with supporting the user’s own thinking.
1 Changing Paradigms of Use, Changing Notions of Interface
There is enormous diversity in the ways modern information technology (that is, computer, telecommunication and interface technology¹) has been put to use. Narrowing down to uses that would normally count as involving a “user” and falling within the field of study of human–computer interaction (HCI) still leaves a very great variety. On a high level of abstraction it is possible to discern general, broadly characterized forms of use, which may be helpful in identifying and understanding long-term trends and important challenges ahead. Often, specific technological advancements (e.g. in display or telecommunication technology) play a major role in determining new forms of usage, but there is also considerable inertia in a well-established form of use, striving to assimilate technological changes while retaining basically the same form. In this paper three of the most important paradigms of use in the last decades will be identified and examined: one older and well established, the virtuality paradigm; and two new, which are rapidly gaining ground theoretically as well as in practical applications, the ubiquity and the mobility paradigms. The purpose of this analysis is to draw some conclusions from the changing notion of interface and to identify some central research issues that arise as a consequence of the ongoing paradigm shifts. The choice of the term “paradigm” in this context is inspired by Thomas Kuhn’s famous notion of scientific paradigms [9]. A use paradigm comprises important design examples, use scenarios, specific techniques and technologies, specific views on key concepts, such as what a “user” is, and what goals to pursue in HCI; and, not least important, groups or communities of people (researchers and interaction designers) developing and defending the paradigm. Unlike the scientific paradigms in Kuhn’s understanding of scientific development, however, new paradigms in HCI seldom completely replace old ones; even if a new paradigm becomes predominant over time, older paradigms can find niches where they survive. In this manner several use paradigms can coexist and come to be seen as complementing each other. Shifting paradigms of use imply shifting notions of interface. The interface concept in HCI has emerged from a variegated background: the precise physical specifications of components necessary for industrial mass production and assembly, the control panels and steering devices of complicated engines, and, of course, software interfaces between different parts of complicated programs. Within HCI, the interface concept has developed into a complex and multifaceted notion, and the development with regard to the three chosen use paradigms will specifically be studied here. The changing role of symbols is of particular interest.
¹ Usually just “information technology” (IT) or “information and communication technology” (ICT), as if deliberately ignoring the fact that such technologies (per definition) have been around since the beginning of history.
1.1 The Role of Symbols in Human–Computer Interaction
Human–computer interaction, as it is usually understood (which includes the three paradigms examined here), invariably involves the use of symbols, in the general, technical sense of the term [1]. Symbols are being used to represent input, control settings, system status, events, ongoing processes, available resources, possible user actions, results, outputs, etc., all for the benefit of the user. The earliest use of computers was as advanced calculating machines.
Since then, not only have there been important changes in the kinds of symbols used, but also in what they are used to refer to, how the user envisages the relations between self, symbol and referent, and how these relations are upheld, in the abstract and concretely. Broadly speaking, in this context symbols may serve three general purposes: as a means for the user to access and acquire information; as a means for the user to supply information, including for the special purpose of achieving certain ends; and as an instrument simplifying, supporting or extending the user’s own thinking. The first two are easy enough to understand: symbols are used for output; they are also used for input, data as well as control. The third purpose, supporting cognition, is less obvious but of special interest here.
1.2 Cognitive Artifacts
Donald Norman [12] introduced the concept of a cognitive artifact, defining cognitive artifacts as “artificial devices that maintain, display, or operate upon information in order to serve a representational function and that affect human cognitive performance.” Computer applications that normally concern HCI are generally cognitive artifacts. But there are two different senses in which a cognitive artifact can assist thinking, related to the distinction Norman makes between “personal view” and “system view.” One sense is that it can substitute for (parts of the) thinking in performing a certain task. An example would be the pocket calculator. The user doesn’t have to do all the thinking
involved in the usual routine for multiplying numbers using pen and paper. The pocket calculator requires the user to press the right buttons to input the right numbers (watching out for errors) and the desired operation, and to read off the result, too, but that is very much the same as in the pen-and-paper version. Norman would say that the task for the user has changed, the “personal view” has changed significantly, whereas from the “system view,” the result should be the same but delivered faster and possibly with fewer errors. The other manner in which a cognitive artifact can assist in the cognitive work of the user is well illustrated by the manual method of multiplying numbers using cognitive artifacts such as pen, paper, and mathematical notation such as numerals and arithmetical operator symbols. In many cases, these two senses can be two different aspects of the same artifact. For this to happen, it is important that the symbols employed in the interface are chosen with care: they should raise the level of abstraction in such a manner that they really support fruitful higher-level thinking on the user’s part. We should not underestimate the extent to which computer applications can support users in their own thinking, serving as cognitive artifacts in the second sense. Spreadsheet applications and word processors are typical examples of a range of common applications where the support for the user’s own thinking is about equally important as supporting the user in producing results.
1.3 Thinking Versus Doing and Perceiving
To be able to claim that thinking is taking place, it is of some importance that a distinction between thinking and doing can be maintained, even if it is relative and at a symbolic level itself: e.g.
to entertain the possibility of X should not be tantamount to causing X.² Thinking by “doing” is certainly sometimes a possibility, for example when we think about how to best lay the table by trying out different placements of plates, cutlery, glasses, etc. It is reported that in playing the game of Tetris, contrary to what one would expect, as users become more skilled they increase rather than decrease the number of “epistemic” actions, i.e. actions performed to uncover or speed up access to information, compared to “pragmatic” actions, i.e. actions performed with a purpose to put the current piece in its chosen place and orientation [8, 10]. In some circumstances it seems difficult to say whether an action is part of the thinking preceding the “real,” effective action or the effective action itself, until after the fact. Computer applications supporting undo encourage such tentative actions, but if the ultimate purpose of the application is some real-world implement or effect, we can still see it as a (productive) play with symbols, at least as long as the symbols are easier to change than their (ultimate) referents, and failures are less devastating at the symbolic level than at the referent level. In considering various hypothetical stages of thinking in evolution, Daniel Dennett arrives first at what he calls the “Popperian creature,” which, as Karl Popper succinctly put it, “permits our hypotheses to die in our stead,” and then at the “Gregorian creature,” named after Richard Gregory, which is also able to take cognitive shortcuts by importing “mind tools” from the environment [1].
² Compare Hegel’s remark in Lectures on the philosophy of history, that whereas animals cannot interpose anything between an impulse and its satisfaction, human beings have the ability to arrest an impulse and reflect on it before letting it pass into action [5].
Mainstream cognitive science has been attacked from different quarters for attaching too much importance to thinking with the help of symbols. Within HCI there have been several attempts to rectify the predominance of interaction through explicit symbols, by investigating alternatives in the direction of rich perceptual experiences and complex physical actions, which presumably make better use of natural human capabilities to interact. An influential case in point is the concept of affordance brought in from ecological psychology and adapted to HCI by Donald Norman [14]. Affordances, in Gibson’s original version, are not symbols (possibly they might count as indices in Peirce’s taxonomy of signs); they are rather perceptual cues that trigger responses and behaviors [3]. Still, it is one thing to perceive that a button invites being pressed, another to know what the effect will be, and when and for what purpose it is appropriate to perform the action. In the pedagogical examples Norman likes to use, such as operating doors and water taps, the function of the artifact is severely limited and well known: just about the only thing you expect to be able to do with a door is to open and close it, so if you perceive a button-looking feature that invites pushing, you can reasonably infer that pushing the button will either open or close the door. Going further in the direction of tangible user interfaces, consider computerized artifacts that lack a dedicated symbolic interface, e.g. a computerized chair that adapts to your body, interprets your spontaneous, small movements, learns and remembers your favorite positions, wakes you up when you fall asleep, makes you change your posture when you have been sitting too long in the same position, etc. It may be an academic question whether this is really HCI, but researchers and designers will have to deal with such cases. At this time, however, none of the paradigms studied here seems to include artifacts of this kind.
2 The Virtuality Paradigm
In what may be called the virtuality paradigm, the interface is a means for the user to access a different and symbolic world. This is the use paradigm that has become so common and dominating that we are hardly aware of it. The user ultimately wants to get through the interface, partly (as in the typical graphical user interface, GUI) or completely (virtual reality), into that other world. Transparency is commonly seen as an ideal. In engaging with the virtual world, the user more or less shuts out the real world and the specific situation of use; these rather disturb the interaction and task performance. Maintaining links and relations between the symbols and the real world is the responsibility of the user and the service provider: mapping real-world regularities and states of affairs into symbolic models, and interpreting and mapping symbolic results back for application in the real world. This arrangement puts the user in the position of a middleman: streams of information pass through the user in both directions; the user easily becomes a bottleneck, exhausted and confused by the traffic, afflicted by information and communication overload. Although taking its name from virtual-reality technology (VR), which may be said to have as its ideal the complete immersion of the user in an alternative, virtual world appearing as real to the user as the real world, the virtuality paradigm not only
antedates VR, but also GUIs.³ The “other” world accessed through early textual interfaces, before the advent of graphical user interfaces, was also a symbolic world, typically consisting of mathematical models and data about the real world. It was a rather abstract “world,” usually lacking spatiality and shape, in some sense comparable to the world evoked by a book. GUIs transformed these abstract and spatially weak symbolic models into what could more properly be called worlds, directly accessible to the user’s perception, in the process also replacing the previous conversation model of interaction with the acting-in-a-world model. In some sense this parallels the step from book to motion picture.
³ Similar to how computer graphics has had as its longstanding ideal the ability to produce pictures qualitatively indistinguishable from photographs of any actual or imaginable real-world scene.
Interface concept. The interface provides access to a different and symbolic world, whether the means are textual or graphical (or involve other modalities). The interface is something the user wants to reach or get through, to engage in the virtual world behind. Graphical interfaces open for a more vivid interpretation of “world,” and the interface can be viewed literally as an opening.
Use scenario. The user accesses or enters the virtual, symbolic world via the interface in order to perform some operations in the world, to retrieve information, to update and develop. In many cases this is done in order to support some real-world activity: tasks arise in the real world; the user enters the virtual world beyond the interface, for help and assistance, mentally more or less leaving the real world (since it is difficult to engage in more than one world at a time); and eventually returns to the real world with an answer. In preparation for future uses, the user may also learn facts about the real world, and enter the virtual world to record or modify the facts or change the model. Whereas it is hard to find examples of virtual worlds that bear absolutely no relation to the real world, some uses are undeniably more escape from, than support for, the real world.
Symbols. In most cases the symbolic world thus represents aspects of the real world even if large parts can be hypothetical, counterfactual, even fantastic. The task of keeping track of which real-world referents the symbols have, and what status they have, falls on the user and the service provider (maintaining the basic model, updating variable data). The situation of use is not linked to the model world. When the user is engaged in using the application, the application is basically the only means of accessing the real world, which usually means a rather abstract, alienated view of the world, with little chance to verify that the virtual world gives a correct picture of the real state of affairs, especially since the user is in principle cut off from the real world by the very way the interface concept works.
2.1 Mixed Reality
Leaving the purely virtual approach where symbols are unaffected by the real world, there are now many applications in which virtual world elements are causally coupled to real world counterparts. Some actions in the virtual world have real-world effects,
they are not just symbolic actions; some real-world changes are reflected in virtual-world updates. This is part of the idea of cyberspace as interpreted by, among others, Benedikt [1]. By this move users are somewhat relieved in their role as mediators. Information can bypass the user. Some tasks can be completely automated, taking the user out of the loop completely. Typically though, real-world feedback to the user through the interface is weak and abstract, giving the user a feeling of unreality (as e.g. in computerized warfare). In lifting part of the responsibility of connecting symbols to reality off the user, the overview of consequences and quality of control may suffer. In the case of more radical forms of mixed reality, like augmented reality, the user may face a single world that is a fusion of real world and virtual world, where it potentially may become difficult to distinguish what is real and what is just a symbol, or perhaps even to insist that the distinction still exists. There are two types of augmentation. The first is to superimpose extra information (normally inaccessible to the user’s senses) about the real world on top of the real-world elements it is about, producing a kind of “annotated reality.” The second is to introduce elements, components, aspects that are simply non-existent, fictional, in relation to the real, actual world. The first type of augmentation is less problematic as long as the extra symbols are easy to distinguish as such (e.g. textual annotations); the second kind is more problematic: it is what may turn this into a kind of “magic reality,” where you might become uncertain whether you can walk through that wall or not. Of course, it is not easy to freely mix fantasy with hard reality if the basic requirement is that reality is perceived as such and as it is.
This branch of the virtuality paradigm is not so well developed yet—it clearly needs the addition of mobility to become more than very locally realizable—so it will have to await further analysis, but potentially there is a whole new use paradigm hidden here, just waiting for the right technology: efficient, comfortable, and cheap. One interesting technical possibility is hand-held VR [6].
3 The Ubiquity Paradigm
If the old idea was to put a world into the computer, the new idea is to put the computer into the world of real objects and environments. In what may be called the ubiquity paradigm, ubiquitous computing and computer artifacts divide the traditional interface into a multitude of individual thing and environment interfaces. The computer artifact is reality, and the interface is a way to use and control the real thing. This is a notion of interface more in line with traditional industrial design: an envelope of the object, negotiating between inner and outer environment, as elaborated by Herbert Simon [16]. Whereas the virtual approach is arbitrarily free relative to the real world, the ubiquitous approach tends to be earthbound, welding symbol and object together, as in the notion of the object symbol introduced by Donald Norman and Edwin Hutchins [15, 14]. In the more environmentally oriented areas of ubiquitous computing, such as calm technology, introduced by Mark Weiser [17], the unobtrusiveness and even invisibility of the interface is emphasized. The interface can signal real-world states of affairs, but it should not be in the form of proper symbols, rather like indexical signs in nature (e.g. smoke or smell of burning indicates fire).
Interface concept. The interface is the surface of a real, clearly distinguishable physical object, which it covers and is the means of controlling. The “invisibility” ideal, the interface as something the user should not have to think about, is an ideal of superficiality (everything of importance is on the surface), which is complementary to the transparency ideal of the virtuality paradigm.
Use scenario. Users use facilities on site, wherever they happen to be, use objects and devices where they are present, for purposes that are pertinent to the situation of use. Computer artifacts typically have specialized functions (compared to the traditional general-purpose computer), dedicated uses.
Symbols. Symbols are strongly real-world related, more precisely to the real-world situation of use, to the point where symbol and referent threaten to fuse into one entity. There is no reference to a different world. Accessing the symbols is accessing the real world, here and now.
3.1 The Problem with Object Symbols
Three of the most basic expectations we have of symbols are: 1) that they are lightweight and easy to manipulate compared to their referents; 2) that they can be at a distance from their referents; and 3) that they can symbolize states of affairs other than the real and actual. In dropping one or two of these conditions, the third in particular, we also lose some or all of their ability to serve as tools for thinking. They may still work as tools for observing and acting. The notion of the object symbol was put forward to encourage very tight couplings between symbols and referents in HCI, as well as in artifact interaction in general; in many older, mechanical artifacts this tight coupling is already present, and it is seen as an ideal by Norman: “when the object in the artifact is both the means of control (for execution of actions) and also the representation of the object state (for evaluation), then we have the case of an object symbol” [12].
It seems that object symbols violate all three of the above conditions for symbols. Per definition they violate the third condition, and thus give poor support for the user’s own thinking: if you cannot represent counterfactual state of affairs, if you do not have the ability to fantasize, you are not, properly speaking, thinking at all. Tracking reality is not thinking. Per definition, object symbols also violate the second condition: when objects represent themselves or a larger artifact of which they are a proper part, they cannot be at a distance from their referent. Of course, in the ubiquity paradigm, this is a feature, not a bug. Again, much depends on how cognitively sophisticated applications and artifacts we consider. For example, since we do not use stoves to help us think, perhaps the idea of object symbols might work out fine. Imagine that the knobs of the stove are object symbols: not only can the user control the heat by turning the knob, the current temperature is simultaneously indicated by the current angle of rotation of the knob. Here we see the effect of violating the first condition: if the stove has ordinary electric heaters, the logic of object symbols will require the user to apply torque to the knob for as long as it takes the stove to reach the desired temperature. Not very convenient. And there is another problem: if the symbol really works both ways, how does the user express desired artifact states except by constantly working the controls? What stops the stove from gradually
getting cooler, slowly turning the knob to indicate lower and lower temperature? In many ways it is easier to make interfaces to virtual worlds than to the real world where you cannot adjust the physics to suit the desired logic of the interface.
4 The Mobility Paradigm

Another important new use paradigm is mobility, very much a consequence of mobile computing, using mobile, “untethered,” and (usually) small units, connected through wireless technology. Mobility brings two new scenarios of use: remote operation, which is the main focus of practical applications at present; and, more important, in situ application, which is just beginning to be explored. The latter creates a new kind of situation with regard to the interface. Bringing computer applications to bear directly and dynamically on their very point of use in the real world, precisely where the user is in space and time, the user will need to relate symbols with their also present real-world referents, contingent on real-world location and real-world changes. Contrary to the virtuality paradigm, the real world and the actual situation of use in particular, is not a distraction but a resource as well as an, obviously present, target for the use of the application.

Interface concept. The interface concept is not one and fixed. One possible concept ties in with the remote access use scenario, basically inheriting the interface concept of the virtuality paradigm. With regard to the in situ use scenario, the issue of interface concept is interesting but so far unresolved: it is clear that like in the ubiquity paradigm, the interface must relate closely to the objects and environment at hand in the situation of use; on the other hand the interface must allow access to informational and computational resources not tied to a particular real-world location or time, like in the virtuality paradigm.

Use scenarios. There are thus two use scenarios. One scenario is remote access and control, that is, use independent of situation, which can be seen as extending the virtuality paradigm to allow remote operation from wherever the user happens to be; as if bringing along your desktop computer, connections and all.
The second use scenario is the exact opposite, in situ application: use is determined by and dependent upon the situation. The computational resources are brought to bear on the very situation of use and user.

Symbols. For the remote operation scenario and interface notion, symbols work much as they do in the virtuality paradigm. For the in situ application scenario and interface notion, we have a more complex situation. Some of the symbols need to relate to referents that are copresent with the user: the user needs to mentally and dynamically link present real-world referents to symbols in the interface. This is different from both the virtuality paradigm where the user disappears into the interface, and the ubiquity paradigm where the referents are within the artifact itself, so it puts the interaction designer in a new kind of situation. We have not really had to deal with how the user is supposed to match symbols to particular, present real-world referents, dynamically and efficiently, before. In [7] there is an attempt to begin a systematic investigation of the possibilities to make this kind of linking in the particular case of visual symbols.
4.1 Context Awareness and Use Situation

The mobility paradigm brings with it the opportunity and challenge of context-aware computing (CAC) [11]. Many suggested applications of CAC build on the assumption that the physical setting and situation largely define social roles and agenda. Ironically, just when we have the means to automatically silence mobile phones as we enter the meeting room (remote operation scenario), it is becoming less obvious that we should do so, and less axiomatic that “meeting room” is a physically fixed location with this one purpose. If before, the physical environment very much determined the social environment (e.g. a classroom is for teaching, which involves teacher and pupils playing their particular roles) and, vice versa, the informational environment, i.e., the available informational and computational resources, very much determined the physical environment (e.g. to access the reference literature you would need to go to the library), with the mobility paradigm of use we now are both freer to mix environments and more exposed to inconvenient environment combinations (e.g. driving and using the mobile phone at the same time). Before, the user would typically do one thing at a time; handling the physical stuff, negotiating with people, and doing the thinking and information work, in turns. The mobility paradigm creates a condition where the total situation of use (i.e. the information situation, the social situation, and the physical setting) has to be taken into account in parallel, and where the course of events in each environment no longer can be assumed to be well correlated with the course of events in the others.
5 Conclusion

Earlier and more recent developments in HCI have worked to modify, extend and elaborate the concept of user interface, making it the complex and multifaceted notion it is today. The meeting of the established virtuality paradigm with the new ubiquity and mobility paradigms (and there are no signs at this point that any of these three paradigms will recede into the background) seems to result in a confusion of options and requirements that need to be satisfied regarding the status of symbols and their relation to the real world. The mobility paradigm, in particular, produces some new research challenges by bringing to the fore the issue of linking interface symbols to the real world at the very point of use. Research challenges identifiable from the above analysis include: examining and developing the various ways different kinds of symbols can link to their real-world referents, as seen from the user’s point of view; investigating how conceptual links can be turned into effective perceptual links; studying how to make different statuses of relation between symbols and reality perspicuous to the user, as well as the distinction between symbol and reality itself; and finding out how in doing all this we can strike a balance between tracking reality and allowing symbolic “freedom of thought” supporting the user’s thinking, without confusing the user too much. When it comes to practical answers they will certainly depend on the application, on the particular circumstances and functions.
L.-E. Janlert
References

1. Benedikt, M.: Cyberspace: Some Proposals. In: Benedikt, M. (ed.) Cyberspace: First steps, The MIT Press, Cambridge MA (1991)
2. Dennett, D.C.: Darwin’s Dangerous Idea. Simon & Schuster, New York (1995)
3. Gibson, J.J.: The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, Hillsdale NJ (1986)
4. Goodman, N.: Languages of art: an approach to a theory of symbols, 2nd edn. Hackett, Indianapolis IN (1976)
5. Hegel, G.F.: Vorlesungen über die Philosophie der Geschichte (1837)
6. Hwang, J., Jung, J., Kim, G.J.: Hand-held Virtual Reality: A Feasibility Study. In: Proceedings of the ACM symposium on Virtual reality software and technology, pp. 356–363. ACM Press, New York (2006)
7. Janlert, L.E.: Putting Pictures in Context. In: Janlert, L.E. (ed.) Proceedings of the working conference on Advanced Visual Interfaces, pp. 463–466. ACM Press, New York (2006)
8. Kirsh, D., Maglio, P.: On Distinguishing Epistemic from Pragmatic Action. Cognitive Science 18, 513–549 (1994)
9. Kuhn, T.S.: The Structure of Scientific Revolutions, 2nd edn. The University of Chicago Press, Chicago (1970)
10. Maglio, P.P., Kirsh, D.: Epistemic Action Increases With Skill. In: Proceedings of Twenty-first annual conference on the cognitive science society, Lawrence Erlbaum Associates, Hillsdale NJ (1999)
11. Moran, T.P., Dourish, P. (eds.): Context-Aware Computing. Special Issue of Human–Computer Interaction 16(2–4) (2001)
12. Norman, D.: Cognitive Artifacts. In: Carroll, J.M. (ed.) Designing interaction, Cambridge University Press, Cambridge (1991)
13. Norman, D.: Emotional Design. Basic Books, New York (2004)
14. Norman, D.: The Psychology of Everyday Things. Basic Books, New York (1988)
15. Norman, D.A., Hutchins, E.L.: Computation via direct manipulation (Final Report: ONR Contract N00014-85-C-0133). Institute for Cognitive Science, La Jolla CA. University of California, San Diego (1988)
16. Simon, H.A.: The Sciences of the Artificial, 3rd edn. The MIT Press, Cambridge MA (1996)
17. Weiser, M., Brown, J.S.: The Coming Age of Calm Technology. In: Denning, P.J., Metcalfe, R.M. (eds.) Beyond Calculation: The Next Fifty Years of Computing, Springer, Heidelberg (1997)
An Ignored Factor of User Experience: FEEDBACK-QUALITY

Ji Hong and Jiang Xubo
Abstract. User experience plays an increasingly important role in the design and development of information products. For network-based applications (Internet and mobile network), many research and development teams concentrate on information architecture (IA) and user interface (UI) design, which sit at the middle and front levels of a product. At the same time, a very important factor of user experience is ignored: FEEDBACK-QUALITY, which is determined by the quality of the telecommunication provided by Telecom Service Support. Through long-term observation and research we find that this factor can fundamentally influence most network-based products.

Keywords: feedback quality, feedback periods, integrity of feedback periods, feedback time.
1 Brief Introduction

At present, the study of user experience concentrates on user interface design, the part with which the user is in direct contact, while another important factor, which we call feedback quality, is ignored by most people. By studying three remote monitoring software products, we found that this ignored factor plays an important part in the user experience of information systems that are mostly mediated by the Internet.
2 Definition

To make the discussion clear, we introduce several definitions.

2.1 Feedback Periods

A feedback period is the process that runs from the user sending an instruction aimed at the information store to the user receiving the corresponding feedback. The definition can be illustrated with a figure.

[Fig. 1: model of the feedback period; only the label “USER” is recoverable]
The model shows two principal parts: the user and the information store. In addition, there is an intermediary between them. This intermediary contains the user interface that we are familiar with, and also an important part of the whole information system: the Internet connection.

User interface ----------- Internet physical layer ----------- information store interface (machine interface)

Fig. 2. Composition of the intermediary
2.2 Feedback Quality

Feedback quality is a measure of the efficiency of feedback, and an important factor of user experience that has been ignored for a long time. We think there are two criteria for measuring feedback quality:

1. Integrity of feedback periods, which directly determines the form of the feedback periods and affects whether users’ needs can be satisfied at all.
2. Feedback time, that is, the time users need to finish a feedback period, which addresses users’ need for efficiency.

Generally speaking, the user experience field pays attention only to the effect of the user interface, but in our study we found that Internet speed also affects user experience.
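The two criteria above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `FeedbackPeriod` and `feedback_quality`, the use of timestamps in seconds, and the sample log are all hypothetical and do not come from the study. Integrity is approximated here as the share of instructions for which any feedback arrived, and feedback time as the mean duration of the completed periods.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class FeedbackPeriod:
    """One instruction sent by the user and the feedback (if any) that came back."""
    sent_at: float                # seconds, when the user issued the instruction
    received_at: Optional[float]  # seconds, when feedback arrived; None if it never did

    @property
    def is_complete(self) -> bool:
        # The period keeps its integrity only if feedback actually reached the user.
        return self.received_at is not None

    @property
    def feedback_time(self) -> Optional[float]:
        # Feedback time is defined only for complete periods.
        if self.received_at is None:
            return None
        return self.received_at - self.sent_at


def feedback_quality(periods):
    """Return (share of complete periods, mean feedback time of complete periods)."""
    if not periods:
        raise ValueError("at least one feedback period is required")
    times = [p.feedback_time for p in periods if p.is_complete]
    integrity = len(times) / len(periods)
    mean_time = mean(times) if times else None
    return integrity, mean_time


# Hypothetical log: two answered instructions and one that never received feedback.
log = [FeedbackPeriod(0.0, 0.5), FeedbackPeriod(10.0, 12.5), FeedbackPeriod(20.0, None)]
integrity, mean_time = feedback_quality(log)
print(f"integrity={integrity:.2f}, mean feedback time={mean_time:.2f} s")
```

Keeping the two values separate mirrors the definition: a product can respond quickly when it responds at all and still fail users whose feedback period is never closed.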
3 Methodology

Our findings come from a usability test (UT) of three software products from China Telecom. The method is usability testing: real users perform selected tasks in the same designed scenarios, and from statistics on task time and number of errors, together with interviews with the participants, we identify the usability problems of the tested products [1][2].
4 The Design of Experiments

This test is a side-by-side comparison of three different versions of the remote-controlled security software. We initially recruited 7 users to test the main functions.

4.1 The Choice of Participants

First we set the criteria for participants: staff in a security room, or ordinary personnel, with no prior experience of the tested software. Applying these criteria, we found that 2 of the 7 participants worked in sales of this software, so we excluded their data from the test.

4.2 The Test Plan

To avoid learning effects, we assigned each participant an order of versions according to a matrix.
Table 1.

Order   Participant 1   Participant 2   Participant 3   Participant 4   Participant 5
1       Version A       Version C       Version B       Version A       Version B
2       Version B       Version A       Version C       Version C       Version A
3       Version C       Version B       Version A       Version B       Version C
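The matrix order of Table 1 can be checked with a short sketch. The orders in `table1` are read off Table 1 (one string per participant, first to last version); the helper names `is_valid_order` and `position_counts` are hypothetical and only illustrate how one might verify that every participant sees each version once and how evenly the versions are spread over the three positions.

```python
def is_valid_order(order, versions):
    """True if the order contains every version exactly once."""
    return sorted(order) == sorted(versions)


def position_counts(assignment, versions):
    """Count how often each version appears at each position across participants."""
    counts = {v: [0] * len(versions) for v in versions}
    for order in assignment:
        for position, version in enumerate(order):
            counts[version][position] += 1
    return counts


versions = ["A", "B", "C"]
# Orders taken from Table 1, participants 1 to 5.
table1 = ["ABC", "CAB", "BCA", "ACB", "BAC"]

for participant, order in enumerate(table1, start=1):
    status = "ok" if is_valid_order(order, versions) else "invalid"
    print(f"Participant {participant}: {' -> '.join(order)} ({status})")

for version, counts in position_counts(table1, versions).items():
    print(f"Version {version}: tested 1st/2nd/3rd {counts} times")
```

With five participants and three versions a perfectly balanced design is not possible, so the counts per position can differ by one; a sixth participant would be needed to use all six possible orders.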
4.3 Task Arrangement

We designed tasks for users according to the main functions of the software:

1. Show the XXXX watch menu in the top-left window.
2. Show the YYYY watch menu in the bottom-right window.
3. Select the picture in the top-left window, turn the camera left and up, then take an establishing shot.
4. Take a picture from the bottom-left window.
5. Check the picture taken in step 4.

4.4 Data Collection Criteria

1. Time criterion of task completion. Timing starts when the user has finished reading the task and ends when the user announces completion. If the time a user takes exceeds the average time, that user’s task completion is considered a failure.
2. Criterion of successful task completion. The user announces completion himself or herself, and the completion is confirmed by the test moderator.
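The two data collection criteria can be sketched as a small classification routine. This is an illustration only: the function name `classify_completions` and the sample values are hypothetical, and the sketch assumes that “the average time” means the arithmetic mean over all participants for the same task, which the text does not state explicitly.

```python
from statistics import mean


def classify_completions(times, confirmed):
    """Mark each attempt as success (True) or failure (False).

    times: seconds from the end of reading the task to the user's announcement.
    confirmed: whether the moderator confirmed the announced completion.
    """
    if len(times) != len(confirmed):
        raise ValueError("times and confirmed must have the same length")
    average = mean(times)  # assumption: mean over all participants on this task
    # An attempt succeeds only if it was confirmed and did not exceed the average time.
    return [ok and t <= average for t, ok in zip(times, confirmed)]


# Hypothetical measurements for five participants on one task.
times = [40.0, 55.0, 38.0, 90.0, 47.0]
confirmed = [True, True, True, True, False]
print(classify_completions(times, confirmed))
```

One consequence of using the average as the cut-off is worth noting: unless all participants take exactly the same time, the slowest attempt always exceeds the mean, so at least one attempt per task is classified as a failure.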
5 Analysis of Data and Experiment Results

5.1 Statistics of Testing Time

[Table of testing times; only the labels “Software A” and “User1” are recoverable]

No.   Description
2     User selected “Kinescope Research” to display the watch picture.
5     User clicked the control under the “facility list” catalog, clicking the images of the subdirectory instead of the image of the camera.
3     User dragged and clicked the images of the control.
4     User clicked other, irrelevant widgets to search for watch pictures.
-     -

Task 3 Mistake Description (Software A, Software C, Software B)

No.   Description
3     User went into “Image Effect” and “Advanced Control” to search for the establishing shot control function.
4     User could not find the direction control function.
4     User frequently misused the direction control.
3     User clicked the wrong widget to control the establishing shot.
5     User confused the images of establishing shot and close shot.
Tasks 4 and 5

Since all three tests included task failures, comparison and statistics are hard to carry out. However, the task failures themselves reveal the error-tolerance problem of the software design. When mistakes happened, none of the three software products showed a clear hint or help, nor did they offer any built-in support function, so the user’s operation could follow only one single path. If any problem occurs along this path, the user has no way to finish the whole task. This is the biggest error-tolerance problem of the three software products so far.

5.3 Our Findings

After analysing the statistics on task time and errors, together with the interviews with the participants, we found that version C received the worst UE rating. We did not immediately conclude, however, that all problems could be ascribed to the UID of this version, because we found two strange phenomena:

1. Task 2 of version C is not difficult; it is just a simple select-and-perform operation. Yet we recorded many errors, and 4 of 5 participants got lost. This confused us: why did the participants perform worse after having learned from task 1, when the difficulty of the task should have been reduced? To answer this question we reviewed the participants’ feedback after the test and watched the video tapes again. The analysis showed that the objective reason for getting lost was the discontinuity of the video cable: participants could not find the video they wanted, and there was no clear clue for them. We define this as discontinuity of the feedback cycle: its integrity is destroyed.

The interface for users (UID) --- destroyed --- the medium --- the interface in the machine

Fig. 3. The integrity of the feedback cycle is destroyed
2. We recorded poor task times and error counts in task 3 of version C. Reviewing the test, we discovered the reason: there was a delay of 2-3 seconds when participants tried to control the direction of the camera. This means the feedback time was far longer than the participants’ limit of patience. As a result, they could not operate as they were accustomed to; they had to cope with the difficulty of learning, and efficiency suffered.

Finally, we conclude that a broken feedback cycle and an overly long feedback time directly affect feedback quality and, further, reduce UE; the causes are not confined to the field of UID.
6 Conclusions

Customer experience is often regarded as nothing more than user interface design. Once it relates to a technical issue, it is easily neglected and simply treated as a technical bug. However, this usability test shows, in our opinion, that user experience is not just user interface design. A company should focus not only on user research and user interface design at the two ends of the feedback period, but also on the related technical aspects. Particularly for China Telecom, it is significant to improve feedback quality, that is, to satisfy customers’ basic need for complete feedback periods and to shorten feedback periods in order to meet the potential need for feedback efficiency. User experience can be thought of as a piece of furniture made from several pieces of wood. User research, feedback quality and user interface are among them. If one of them is neglected, user experience will be affected. Therefore, researching and improving feedback quality, which is always neglected, is the key to improving user experience for China Telecom.
References

1. Rubin, J.: Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests, pp. 25–26. John Wiley & Sons Inc.
2. Dumas, J.S., Redish, J.C.: A Practical Guide to Usability Testing, p. 4. Intellect
10 Heuristics for Designing Administrative User Interfaces – A Collaboration Between Ethnography, Design, and Engineering

Luke Kowalski and Kristyn Greenwood

Oracle Corporation, 500 Oracle Parkway, Redwood Shores, CA 94065
Abstract. The lack of focus on administrative interfaces often comes from management's mandate to prioritize end user screens ahead of others. This often shortchanges a more technical class of users with unique needs and requirements. At Oracle, design heuristics for administrative GUIs were drawn from a multitude of sources in the corporate ecosystem. Ethnographers, software architects, designers, and the administrators themselves all contributed to bring a better understanding of this often forgotten class of user. Administrators were found to inhabit anywhere from two to five particular classifications, depending on the size of the company. Recently, an ethnographer studied one classification in greater detail, the Database Administrator, while a designer, in the course of an E-Business Suite Installer project, analyzed another, the application administrator. What emerged based on the gathered data was a remarkably consistent and universal set of rules and tools that can be used to lower the total cost of ownership and increase usability, attractiveness, and satisfaction for administrative interfaces.

Keywords: Design, Administrative interfaces, design techniques, heuristics, ethnographic research, design methods.
deal with administration, configuring, tuning, and maintenance of databases. They possess highly specialized skills. This last user type was the subject of a 2-year-long, 8-site ethnographic study. Data collection in this instance involved 23 Database Administrators and included a self-report survey, observational sessions during which task, object, and tool use was recorded at set intervals, and a follow-up interview to elicit more qualitative data. The study was designed to find out what the administrators spend time on, what tools they use, and how this information could influence the next generation of Oracle’s server products.

The fourth administrator type is the Application Administrator or Functional Administrator. These professionals usually work with a given application like Human Resources, Manufacturing, or Financials. Their duties encompass Lifecycle Change Management (LCM), which spans installation, setup, configuration, maintenance (patching), and upgrade. It is often said that the funds spent on LCM are anywhere from 2 to 4 times as much as the initial license cost of the software. Total Cost of Ownership (TCO) issues are much more relevant in software supporting complex enterprises than in consumer software. Most of the administrator’s time is spent tailoring the applications to meet the business needs and practices of a given company. The fact that an enterprise suite is installed does not mean that it is ready to use. Administrators need to configure things like security, populate or provision the system with users, and define defaults for invoicing, printers, and tax structure, among other tasks. These individuals were studied in detail in the context of a design project to improve the task completion of a suite installer. The installer was a Java-based wizard that installed the database, application server, and the application tiers of the Oracle E-Business Suite.
The last type of administrator belongs in the Business Analyst or Implementation Consultant class. They often customize the seeded business flows to meet specific business needs, or work on legacy system integration projects. When the project gets too technical they are often joined by a team of developers who extend and customize the application programmatically, often using a developer tool like Oracle JDeveloper. In studying the administrators through the ethnographic research and through a series of design projects we were able to abstract out heuristics and tools that are generalizable for most administrators and could help a designer better target their deliverables to the needs of this unique community.
2 Heuristics

Heuristic 1: Do Not Force a Graphical User Interface (GUI). Innovate Only Where Appropriate. In the ethnographic study, we found that 32 percent (see Fig. 1) of the administrators relied on the command line as their primary tool on the job. They often found it more efficient and faster than a GUI, and found that it offered more feedback. It can also be accessed remotely. Furthermore, the UNIX command line does not involve any setup or configuration in order to be immediately usable. Designers often assume that command line tools and utilities only exist because engineers did not have the time to develop a GUI. Instead of forcing a GUI, it is advised to support the habits, comfort zone, and core competencies of the administrators by developing tools to accommodate the command line. These could include repositories of custom scripts for batching jobs, or logging tools and information visualization for monitoring.
Fig. 1. Percentage of Time Using Tool Categories. Data from Oracle Study of Database Administrators.
Heuristic 2: Design Based on Observation. Do Not Rely on Self-Reported Data When It Comes to Design for Administrators. Participate in user groups, advisory councils, and include observational data. This is often a universal truth when it comes to data gathering methods, but we found it to be more pronounced for this type of user. Surveys and interviews provided inconsistent data compared to observational sessions. Administrators told us that they spent little time doing troubleshooting, whereas the observational data showed otherwise (Fig. 2). In the study design we did make sure that our sample came from a representative day, and did not include a singular task. It is recommended to focus on 2 or 3 methods when gathering information for the design of administrative applications. One of them should include some form of direct observation, in context, or with a prototype.
Fig. 2. Comparison of Self-Report and Observed Database Administrators for Top 5 Self-Report Tasks
Heuristic 3: Design Lightweight and Flexible Applications to Accommodate Remote Administration. Administrators often work from home, or administer hardware located in a data center far away. We have observed that if a tool needs to be installed, or if it has slow performance, or long download times, it will not be used
at all. With the current technology, this means thin client web applications, as opposed to native operating system applications, or Java on the client. Mobile applications are critical for administrators, as well. More intelligent devices that can provide more information about a given escalation are slowly replacing pagers that notify the administrator of a given alert. Personal Digital Assistants (PDAs) like Treos and BlackBerries were very popular in the Database Administrator and Application Administrator environments. One data point came from a user in a supervisory capacity. His role was to serve as a traffic master for alerts and data center escalations. He would send specific tasks to administrators based on severity and acquired competencies.

Heuristic 4: Design for Collaboration. Administrators spend a large portion of their time communicating with others. Database Administrators spent 19 percent of their time talking to others and 9 percent using e-mail. A good set of collaboration tools can help them become more efficient, automate certain tasks, or just become better organized. Accountability and record keeping also come into question. If collaboration tools are not integrated with the other tools to monitor or tune the hardware and software, they are not considered as useful. Rob Barrett of IBM Almaden Research presented a similar finding, where collaboration was found to be a critical element in the Database Administrator’s work [1]. In our study we found that administrators underreported all of the communication tasks. Once we were able to identify collaboration as a key feature we were able to design it into the knowledge repository and other tools used by our users, and these features were extremely well received in future lab tests.

Heuristic 5: Integrate the Major Administrative Tool Silos: Collaboration, Monitoring, Information Knowledgebase.
All administrators studied expressed a desire for a better-integrated portal that would provide an overview of their systems and tools. The application to monitor and tune was only useful if it had an “in context” connection to the application that was used to troubleshoot (Information Knowledgebase, or the repository of solutions to known problems). Collaboration tools were also deemed more useful if they were integrated with their monitoring tools and were designed specifically for administrators to collaborate on lifecycle management of the software environments they were supporting. A good example of this is the ability for the administrator to append notes to an alert in the application that monitors database performance. Administrators are often presented with multiple interrupts of different priorities. We found that they could be more efficient if provided additional context. If they receive two critical notifications (running out of tablespaces) they will triage the one that involves a sales deal database before the end of the quarter and then try to troubleshoot one that belongs to a test system for a future version implementation.

Heuristic 6: Documentation for Administrators Is More Frequently Referenced, Needs To Be Fresher, Task vs. Product-Based, and Include the Web. If an application administrator needs to apply patches to their system, they need to have the most recent source of truth, since patches can affect security and stability of the applications they are administering. A 3-month-old printed manual will not be as
useful as online documentation (Fig. 3). Administrators, in contrast to end users, study the documentation and form detailed project plans around installation and production deployments. Administrators also work with software and tools authored by sometimes disconnected product groups within one company. Their tasks do not correspond to the product or organizational boundaries. They often span them. In working with the application administrators, in the context of administering a Common Industry Format (CIF) test from NIST, we found that when administrators were stuck after reading the documentation, they went to search Google. They would often find a web-based discussion group where this exact error message was analyzed and the problem solved. These were not always official or company-sponsored sites.
Fig. 3. Documentation in the form of a Post-Installation Page with Links to Tools, Guides, and Information Knowledgebases
Heuristic 7: Manage Complexity by Providing Defaults, and Automating Tasks. A constant point of feedback from the application administrators was a request to provide tool defaults that work. This tends to entail a reduction in the number of
screens and fewer decision points. If an administrator is using a wizard to perform an installation they do not always want to see all the choices and all the paths (Fig.4). Creating a Quick Install path and an Expert path resonated very well with administrators in the next iteration of the design. Sometimes intelligent assumptions are better, and the optimization, or “tweaking” can happen after the system is working in its basic configuration. Other feedback included complaints about the number of manual steps necessary to prepare for the installation. Automation of some steps proved the answer. In one usability test, one issue involved the absence of system checks that, when not performed beforehand, would cause failure of the installation. One of the checks, for free disk space, took place at the end of the installation when it was too late to do anything about it.
Fig. 4. Managing Complexity by Providing Alternate Paths and Decreasing the Number of Decisions on a given screen
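The free disk space problem described under Heuristic 7 can be illustrated with a small sketch in which the check runs before any installation step rather than at the end. This is only an illustration under stated assumptions: the names CheckResult, check_free_disk_space, run_preflight, and install are hypothetical and do not come from the product studied; the only library call used is Python's standard shutil.disk_usage.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    """Outcome of one pre-installation check (hypothetical structure)."""
    name: str
    passed: bool
    detail: str


def check_free_disk_space(path: str, required_bytes: int) -> CheckResult:
    """Compare the free space on the volume holding `path` with the requirement."""
    free = shutil.disk_usage(path).free
    return CheckResult(
        name="free_disk_space",
        passed=free >= required_bytes,
        detail=f"{free} bytes free at {path!r}, {required_bytes} required",
    )


def run_preflight(checks: List[Callable[[], CheckResult]]) -> List[CheckResult]:
    """Run every check up front so all problems are reported together."""
    return [check() for check in checks]


def install(target_dir: str, required_bytes: int) -> bool:
    """Refuse to start the installation if any preflight check fails."""
    results = run_preflight(
        [lambda: check_free_disk_space(target_dir, required_bytes)]
    )
    failures = [r for r in results if not r.passed]
    if failures:
        for failure in failures:
            print(f"Preflight check failed: {failure.name} ({failure.detail})")
        return False  # nothing has been modified yet
    # The actual installation steps would start here (omitted in this sketch).
    return True
```

Running all checks first and reporting every failure together matches the administrators' request for automation: the installer stops before changing anything, and the administrator sees the full list of problems in one pass.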
Heuristic 8: Perform Competitive Analysis, Including Open Source Tools. As much as a company may try, it is impossible to force administrators to use only your tools. They will find utilities developed by their user group, an open source product for monitoring and health checks, or even deploy your competitor’s product. The more a designer studies these tools, the more effective the integration exercise becomes; the information can also be used to enhance existing applications. Subjects of our DBA study all had their favorite collections of tools, and while there were some patterns, there seemed to be a race to discover the coolest and latest utility to make administration tasks more efficient and operations transparent.

Heuristic 9: International Focus and Hosted Applications. Administration is being outsourced. In some cases, the physical infrastructure and the software are remote to both the end users and the administrators. This is the case when a company hosts an application suite for a customer who accesses it over the Web. In other cases, only the administrators are in remote locations. Designers need to include sensitivity to other cultures, and design with internationalization support in mind, including support for
languages, bi-directional support, and accessibility standards relevant to the local government bodies.

Heuristic 10: Use the Right Communication Vehicle during the Design Process. When designing for administrators, it is very common to create designs that are not implemented. Results of studies are often communicated in 100-page reports that the stakeholders do not have time to read. In contrast, posters representing flows, or “before and after” designs, are more successful. What also helps is “speaking the administrator’s or the developer’s language” and using the bug defect database to record design and usability issues. Communication among team members can also prove to be a failure point for a designer. Using a new tool such as the collaborative Twiki can accelerate communication and foster the feeling of an extended virtual team, with everyone working toward the same goal. A designer is also more successful if they extend their role and try to understand why technology, legal, or business issues stand in the way of implementing their vision. Standardized testing, while not always useful in the creative phases of a project, can still be instrumental when comparing unassisted task completion rates between one release and the next, or when comparing yourself to the competition. Lastly, direct involvement with the end users and project stakeholders tends to work better than management mandates and lengthy, abstract guidelines.
3 Conclusion

Administrators are not yet a fully understood user type. More work is needed to develop complete user profiles. Enterprise software also represents just one dimension. Consumer companies like eBay and Yahoo are also cultivating their own administrative ecosystems. The domain is not an easy one, since it involves constantly evolving technology and industry standards. Furthermore, few enterprise installations include only the software sold. There are always legacy systems and integration exercises that present unique logistical, financial, and human factors challenges. The heuristics identified provide a focus for a designer who is new to this domain and to this user type. If they are taken into consideration, the most basic administration UI design bloopers will be avoided.
Reference

1. Barrett, R.: System Administrators are Users, Too, Stanford Human Computer Interaction Seminar (May 30, 2003), http://hci.stanford.edu/seminar/abstracts/02-03/030530-barrett.html
Micro-Scenario Database for Substantializing the Collaboration Between Human Science and Engineering

Masaaki Kurosu1, Kentaro Go2, Naoki Hirasawa3, and Hideaki Kasai4
Abstract. For the purpose of achieving effective and efficient human-centered design, a database of problem micro scenarios (p-MS) is proposed. In the concept of this system, human scientists work first to obtain information about the user and the context of use by applying field work methods. The information about problems discovered in the field data will be stored in the p-MS database together with the tag and the ground information. Engineers who plan to manufacture something can retrieve relevant problem information from this database, and thus they can shorten the time required for the early stage of development. This idea of a p-MS database is believed to facilitate human-centered design, and the feasibility study will be conducted within a year from this presentation.

Keywords: usability, scenario based design, micro scenario method, database.
Fig. 1. Relationship between the human science and the engineering (above: previous situation where human science served just as the information source separately, below: situation where HCD is implemented and both approaches are integrated into one)
2 Collaboration Between Human Science and Engineering

As shown in Figure 2, there are two types of collaboration between the human science and the engineering. In this figure, the upper part shows an idealistic type in which the human scientist takes the first role by investigating the user characteristics and the context of use, and then summarizes the requirement.
Fig. 2. Two types of collaboration between the human science and the engineering
But most real development takes the lower type, where both parties start at the same time. Although this type of development is better than no collaboration, engineers will not wait for the requirement information to be presented to them. Because waiting is a waste of time, they start “something” in the meantime. As a result, when the requirement information is given, engineers may already have stepped into some design process without adequate information about the user and the context of use. If engineers are flexible and receptive, they will redo the design. But in most cases, to our regret, engineers do not lend their ears to the requirement, and thus design something that does not fit the user requirement. On the other hand, the serial approach shown in the upper part of Figure 2 is difficult because it is unbearable for engineers just to wait for the completion of the requirement and do nothing until then.
3 Micro Scenario Database

One answer to the problem above is to construct a database of problem micro scenarios, as shown in Figure 3.
Fig. 3. Concept of micro-scenario database
The problem micro scenario (p-MS) is a scenario that represents the micro information structure constructed from the field work data. It is an output of the first half of the micro scenario method (Kurosu et al. 2003, Kurosu 2004, 2005, 2006) described in Figure 4. The micro scenario method is a successor to the scenario-based design originally proposed by Carroll (1995). As shown in Figure 5, each p-MS represents a problem in terms of the relationship between the user and the artefact. Fundamental information about the user and the context of use is described as the ground information (GI) and linked to each p-MS; hence, if one wants the background information of a p-MS, s/he can get it by tracing the link to the GI. Each p-MS has tag information attached that represents the content or the domain of the problem. The tag is similar to a keyword. It is used to retrieve the relevant p-MSs from the p-MS database, so the user of the system can get p-MSs with similar problems and summarize the information. In this way, the p-MS database can be used to create the requirement for developing products or systems.
Fig. 4. Basic flow of micro scenario method
Fig. 5. Problem micro scenario
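The relationship between p-MS, tag, and GI described above can be sketched as a small in-memory data structure. This is a minimal illustration, not the authors' implementation (the database is described in the paper as a concept only): the class names GroundInfo, ProblemMicroScenario, and PMSDatabase, their fields, and the ticket machine records are all hypothetical and invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class GroundInfo:
    """Ground information (GI): the user and the context of use."""
    gi_id: str
    user: str
    context_of_use: str


@dataclass(frozen=True)
class ProblemMicroScenario:
    """One p-MS: a problem between user and artefact, linked to its GI."""
    text: str
    tags: FrozenSet[str]
    gi_id: str


class PMSDatabase:
    """Hypothetical in-memory store of p-MS records with tag retrieval."""

    def __init__(self) -> None:
        self._ground_info: Dict[str, GroundInfo] = {}
        self._scenarios: List[ProblemMicroScenario] = []

    def add_ground_info(self, gi: GroundInfo) -> None:
        self._ground_info[gi.gi_id] = gi

    def add_scenario(self, text: str, tags: List[str], gi_id: str) -> ProblemMicroScenario:
        # Every p-MS must link to existing ground information.
        if gi_id not in self._ground_info:
            raise KeyError(f"unknown ground information: {gi_id}")
        pms = ProblemMicroScenario(text, frozenset(t.lower() for t in tags), gi_id)
        self._scenarios.append(pms)
        return pms

    def find_by_tag(self, tag: str) -> List[ProblemMicroScenario]:
        """Retrieve all p-MSs carrying the given tag (case-insensitive)."""
        wanted = tag.lower()
        return [p for p in self._scenarios if wanted in p.tags]

    def ground_info_for(self, pms: ProblemMicroScenario) -> GroundInfo:
        """Trace the link from a p-MS back to its GI."""
        return self._ground_info[pms.gi_id]


# Hypothetical field data, invented for this example.
db = PMSDatabase()
db.add_ground_info(
    GroundInfo("gi-1", "office worker, 60s", "ticket machine in a crowded station")
)
db.add_scenario("Cannot find the button for a return ticket.",
                ["ticket machine", "navigation"], "gi-1")
db.add_scenario("Small labels are hard to read under station lighting.",
                ["ticket machine", "legibility"], "gi-1")

hits = db.find_by_tag("Ticket Machine")
```

Here an engineer starting a ticket machine project would retrieve both stored problems by tag and could follow each one back to its GI to read about the user and the context of use.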
As shown in Figure 3, human scientists investigate the user and the context of use with field work methods, independently of the engineering development process. They summarize the information as a set of p-MS, tag, and GI and store them in the database. Engineers can use that database whenever they would like to start a project for manufacturing something. Relevant information can be retrieved from the database by entering a keyword. Figure 6 represents the situation where the micro scenario database is used by many engineers. In this figure, an interpreter is added to the top of each engineering project. The interpreter must have a background in usability engineering and be able to interpret the retrieved p-MSs adequately in order to create the requirement.
Fig. 6. Use of micro scenario database
4 Conclusion

The p-MS database is just a concept at the time of this presentation, but it is planned to be implemented in a year or two. The feasibility study will then be started. The authors believe that this kind of database will be useful for spreading human-centered design. In addition, the micro-scenario authoring tool (Kurosu et al. 2006), which has just been completed, will facilitate the use of the database.
References

1. Carroll, J.M. (ed.): Scenario-Based Design: Envisioning Work and Technology in System Development. Wiley, Chichester, UK (1995)
2. Kurosu, M., Nishida, S., Osugi, T., Mitsui, M.: Analysis of Field Data by Micro-scenario Method (in Japanese). In: Proceedings of Human Interface Symposium (2003)
3. Kurosu, M.: Micro-scenario method for designing and re-designing the e-Learning system, E-Learn 2004 (2004)
4. Kurosu, M.: Micro-scenario method: a new approach to the requirement analysis, WWCS 2004 (2004)
5. Kurosu, M.: Micro-scenario method (MSM) - a new approach to the requirement analysis, Human Interface Society SIGUSE (2004)
6. Kurosu, M.: Micro-scenario method – interface design based on the context of use information, Design IT (in Japanese) (2005)
7. Kurosu, M.: Scenario creation by using the micro-scenario analysis system, JPA 2006 (in Japanese) (2006)
8. Kurosu, M., Kasai, H., Hirasawa, N., Go, K.: Analysis tool for micro-scenario, Human Interface Society SIGUSE, 2006 (in Japanese) (2006)
9. Kurosu, M.: Micro-scenario method, NIME report, 2006 (in Japanese) (2006)
10. Ohnishi, J., Go, K.: Requirement Engineering, Kyoritsu-shuppan, 2002 (in Japanese) (2002)
A Meta-cognition Modeling of Engineering Product Designer in the Process of Product Design

Jun Liang, Zu-Hua Jiang, Yun-Song Zhao, and Jin-Lian Wang

Department of Industrial Engineering & Management, School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dong Chuan Road, Shanghai, 200240, P.R. China
{jliang, zhjiang, lemon_zhao, buada}@sjtu.edu.cn
Abstract. To reuse tacit knowledge more effectively in the process of product design, individual cognitive processes, cognitive factors, and cognitive strategies need to be understood in order to find the essential factors that affect the generation of tacit knowledge and control designer activities in the whole design process. These key factors depend on individual cognitive capability and meta-cognitive level. Therefore, based on the physical symbol system hypothesis (PSSH) and connectionism, this paper provides a meta-cognition model of the engineering product designer to elucidate active monitoring and consequent regulation. Designers’ cognitive activities in the process of product design are analyzed from the viewpoint of cognition science. Finally, the cognitive differences between experienced designers and novices in the process of fuel injection pump design are compared and elaborated in detail.

Keywords: Meta-cognition, Cognitive activity, Individual Difference, Product design.
product design, meta-cognition can monitor and control the designers' cognitive process about product design. For example, in case-based design, the designer starts cognitive and meta-cognitive activities from the design tasks and design requirements, continues with the confirmation of features, case retrieval, case revision, and case use, and ends with the accomplishment of the design. Effective design support systems must complement human cognitive activities, and must be based on a sound understanding of human cognitive abilities [4]. This paper focuses on the analysis of designers’ cognitive and meta-cognitive activities and builds a bridge that connects cognitive psychology and engineering product design. The paper is organized as follows. Section 2 introduces the cognitive foundation of the designer meta-cognition model and related work on meta-cognition. Section 3 provides a meta-cognition model of the engineering product designer and presents the components of the model. Cognitive and meta-cognitive activities in the process of product design are explored and analyzed in Sect. 4. Section 5 discusses cognitive and meta-cognitive activities in fuel injection pump design and compares the cognitive differences between experienced designers and novices. The conclusions are presented in Sect. 6.
2 Cognition Science Foundation for Individual Meta-cognition in Product Design

Meta-cognition emphasizes personal mind activity, thought, perception, memory, and the interaction of cognitive activities, and pays particular attention to self-awareness and self-regulation. Meta-cognition is defined by Flavell as “knowledge and cognition about cognitive phenomena” [5]. It is often described as the executive process governing our cognitive efforts [1], and it consists of meta-cognitive knowledge and self-regulation [6]. Susan V. Baxt [7] defined six meta-cognitive processes, i.e. problem definition, planning, strategy selection, flexibility (of strategy use), evaluating, and checking and monitoring, which are based on the above three meta-cognition models. Meta-cognitive activity was significantly related to knowledge acquisition, skilled performance at the end of training, and self-efficacy [8]. Monitoring and control are the two important information flow processes: one information flow, from the cognitive to the meta-cognitive level, allows monitoring of the cognitive level by the meta-cognitive level; the other, from the meta-cognitive to the cognitive level, allows control of cognition by meta-cognition [9]. Monitoring one's thinking and the effects of controlling it are the model's mechanisms for increasing meta-cognitive understanding [10]. Furthermore, the importance of context was emphasized by Erik Hollnagel [11], who pointed out that cognition and context are inseparable. Valkenburg R. [12] and Smith R.P. [13] studied the cognitive activities of design teams. Mao-Lin Chiu [14] considered that design status is a kind of manner of design operation, which can be implemented through sense input, the perception process, the conception process, the status-structure process, and the memory construction process.
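The two information flows described above, monitoring (cognitive level to meta-cognitive level) and control (meta-cognitive level to cognitive level), can be made concrete with a toy sketch. It is an illustration only and not a model taken from the cited works: the strategy names, the progress rates, and the min_gain threshold are hypothetical values chosen to make the loop visible.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CognitiveLevel:
    """Object level: works on a task using its current strategy."""
    strategy: str = "trial_and_error"
    progress: float = 0.0

    def step(self) -> None:
        # Hypothetical progress per step for each strategy.
        rates = {"trial_and_error": 0.05, "reuse_past_case": 0.25}
        self.progress = min(1.0, self.progress + rates[self.strategy])


class MetaLevel:
    """Meta level: observes the object level and adjusts its strategy."""

    def __init__(self, min_gain: float = 0.1) -> None:
        self.min_gain = min_gain      # hypothetical threshold
        self.last_progress = 0.0

    def monitor(self, cognitive: CognitiveLevel) -> float:
        """Monitoring flow: information moves from cognition to meta-cognition."""
        gain = cognitive.progress - self.last_progress
        self.last_progress = cognitive.progress
        return gain

    def control(self, cognitive: CognitiveLevel, gain: float) -> None:
        """Control flow: meta-cognition changes how cognition proceeds."""
        if gain < self.min_gain:
            cognitive.strategy = "reuse_past_case"


def run(max_steps: int = 20) -> List[str]:
    """Alternate cognitive steps with monitoring and control until done."""
    cognitive = CognitiveLevel()
    meta = MetaLevel()
    trace = []
    for _ in range(max_steps):
        cognitive.step()
        gain = meta.monitor(cognitive)
        meta.control(cognitive, gain)
        trace.append(cognitive.strategy)  # strategy chosen for the next step
        if cognitive.progress >= 1.0:
            break
    return trace


trace = run()
```

In this sketch the meta level never solves the task itself; it only reads progress and switches the strategy when the observed gain is too small, which is the division of labor the monitoring and control flows describe.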
3 Meta-cognition Model of Engineering Product Designer

3.1 The Framework of the Model

Designer meta-cognition in the product design domain refers to designers consciously monitoring and controlling a series of cognitive activities, drawing on individual knowledge to solve design problems, as stimulus information from the design environment interacts with their cognitive behaviors. The meta-cognitive process in product design is a continuous process of driving the design task forward until its accomplishment. Designers can cognize not only design objectives and the design process but also their own cognitive process and cognitive results, and these cognitive activities happen in a positive and self-conscious state.
Fig. 1. Meta-cognition model of Engineering Product Designer
As shown in Fig. 1, the meta-cognition model of the engineering product designer involves five sub-modules: meta-cognitive knowledge, meta-cognitive experience, meta-cognitive operation, the product design cognition sub-module, and the long-term memory module of product design knowledge, each introduced below. Among them, meta-cognitive knowledge, meta-cognitive experience, and meta-cognitive operation are the core of the model. The long-term memory module of product design knowledge provides various kinds of knowledge for solving design problems, and the product design cognition sub-module supports detailed cognitive activities, such as sense perception.
3.2 The Components of the Model

3.2.1 Meta-cognitive Knowledge

Meta-cognitive knowledge refers to the beneficial knowledge, experiences, and lessons that affect cognitive processes, cognitive strategies, cognitive structures, and cognitive results during the cognitive activities of the product design process. It supports and affects meta-cognitive operation and meta-cognitive activities, and transfers cognitive tasks. In product design, meta-cognitive knowledge includes three main aspects, people, tasks, and strategies, which are described as follows.

People: the designer himself/herself or others are the objects of cognition. This aspect concerns cognitive capability, intelligence level, design experience, knowledge, cognitive structure, etc. It involves cognizing one's own cognitive capability and perceiving cognitive states in design requirements, cognitive differences and similarities, and the special cognitive rules and experiences formed in the product design process.

Tasks: the cognitive knowledge used when the designer analyzes and judges detailed cognitive goals and cognitive requirements. It includes cognizing the requirements, goals, and features of cognitive tasks; the properties, characteristics, and mode of appearance of cognitive materials; and the familiarity, degree of difficulty, and schedule of the cognitive object in product design.

Strategies: the cognitive knowledge and methods used by designers when they plan, employ, monitor, control, and adjust cognitive activities. They include methods for cognizing the designer’s cognitive activities, analysis of the merits and demerits of cognitive strategies, guidelines for handling exceptional problems in the process, and directions for cognitive activities such as attention, memory, and thought.
3.2.2 Meta-cognitive Experience

Meta-cognitive experience refers to designers’ comprehension and consciousness of their cognitive activities and cognitive process. It reflects awareness and unawareness of cognitive activities, and shows itself in the form of affective experience. The execution of designer cognitive activities in product design emerges from meta-cognitive knowledge activated by meta-cognitive experience, which moves meta-cognitive knowledge from an activated level into a working state to serve meta-cognitive monitoring and meta-cognitive regulation. The positivity or negativity of meta-cognitive experience affects designer cognitive activities and shapes the designer’s decision-making behaviors, such as the degree of attention paid to the design process and the choice of cognitive strategy and method, and finally determines the success or failure of the product design. Meta-cognitive experience is a mediator and a trigger of monitoring and regulating cognitive activities. From the viewpoint of engineering product design, at the initial stage of product design the designer experiences the degree of difficulty, the familiarity, and the ongoing situation of cognitive tasks. In the middle stage, the designer experiences the progress of cognitive tasks, all kinds of difficulties and obstacles in them, the gap between planning and practice, and the rescheduling of cognitive strategies. At the final stage, the designer experiences the effect of cognitive activities, the evaluation of planning and practice, meta-cognitive activities such as the improvement of cognitive strategy, and emotional experiences such as gladness and sadness. So, it is very important to
arouse designer meta-cognitive experience in product design, because meta-cognitive experience can activate enthusiasm for cognitive activity and improve the validity of the cognitive process about design problems.

3.2.3 Meta-cognitive Operation

Meta-cognitive operation refers to a series of meta-cognitive activities that monitor and regulate designer cognitive activities, activated by meta-cognitive experience, when the objects under study are the cognitive activities of the designer himself/herself. It is a continuous work process of different operative behaviors; it regulates and acts on cognitive activities directly, and interacts with meta-cognitive experience and meta-cognitive knowledge. Here, the operative behaviors of meta-cognitive operation include choosing, controlling, feedback, monitoring, evaluating, comparing, and analyzing, etc., which are self-consciously governed by the phenomenal consciousness called the “meta-cognitive center” in this model. All these meta-cognitive operative behaviors may execute in concurrent mode or in serial mode. For example, the operative behaviors of “monitoring-feedback-controlling” form a serial handling structure in the individual cognitive process of product design, but the “choosing” of meta-cognitive knowledge or design domain knowledge in the long-term memory module and the operative behaviors of “monitoring-feedback-controlling” are dealt with in a concurrent mode. The meta-cognitive center is the core of meta-cognitive operation; it guides the operative behaviors and connects with meta-cognitive knowledge. It is affected by the designer himself/herself and calls cognitive tasks with the corresponding cognitive strategy. Meta-cognitive operation revises meta-cognitive knowledge and responds to the activation of meta-cognitive experience. In the individual cognitive process of product design, meta-cognitive operation carries out meta-cognitive activity to monitor, control, and regulate the cognitive process of product design, and interacts with the other sub-modules in the model.
3.2.4 Cognition of Product Design

Cognition of product design refers to a set of cognitive activities happening in the designer's consciousness, which starts from receiving the stimulation of design requirements and design tasks and ends with the completion of a concrete design. This process is a special application of general cognitive activities to product design, a point of access between meta-cognition and cognition, and the cognitive access to design problems. It includes cognitive activities about product design, the cognitive process of product design, attention, the characteristics of the cognitive tasks of product design, cognitive effects, and mental feeling, etc.

3.2.5 Product Design Expertise Knowledge in Long-Term Memory

From the moment a product design task is received to the completion of the product, all individual memory contents about product design, such as the expertise knowledge, experiences, and lessons that accompany all product design activities, are stored in the long-term memory module of product design knowledge, and this module serves meta-cognitive operation. Tulving [15] divides memory into episodic memory and semantic memory. Here, semantic memory refers to the memory of general knowledge and rules of product design, and relates to the connotation of concepts that emerge from the
whole product design process. The information in episodic memory, however, comes from external information resources and concerns design experiences and their concrete scenes and specific details. This module provides the needed expertise knowledge, domain knowledge, and other knowledge for the designer carrying out cognitive activities, and supports meta-cognitive knowledge.
4 Relationship Between Meta-cognitive and Cognitive Activities and the Product Design Process

Individual cognitive activities in the product design process mainly focus on the mental image and cognition of the components, concepts, execution, and completion of cognitive tasks about design, and involve cognitive processes and mental activities such as sensation, perception, imagery, thinking, memory, and attention. Individual meta-cognition is cognition about product design cognition and a continuous process of realizing design tasks. Designers can cognize objective tasks as well as their own cognitive process and cognitive results. Cognitive activities and meta-cognitive activities for product design are governed and regulated in a positive and self-conscious state, through self-regulation, self-awareness, and self-control. Designers start their cognitive activities, such as sensation, perception, and attention, on receiving the stimulation of design tasks. At the same time, meta-cognitive activities, such as meta-cognitive monitoring and meta-cognitive controlling, work in a concurrent mode. As product design activities develop, cognitive activities and meta-cognitive activities continue to advance and improve. Finally, individual cognitive and meta-cognitive activities end with the completion of the design tasks. Observed from a particular time or space viewpoint, the cognitive and meta-cognitive activities of designers are dispersed, fragmented, and concurrent, but over the whole design process they proceed in a sequential, ordered, and serial mode.
5 Cognitive and Meta-cognitive Activities and Cognitive Differences in Fuel Injection Pump Design

The retrospective verbal protocols of two experienced designers and four novices have been analyzed and compared to study the cognitive process and meta-cognitive activities that happened in the PM fuel injection pump design process.

5.1 Cognition and Meta-cognition Analysis in Fuel Injection Pump Design

As soon as designers get the design task of the PM fuel injection pump, their cognition and thinking start to deal with the related design tasks and cognitive tasks in terms of task assignment, technology resources, strategy, role, potential problems, etc. The design information is sensed and perceived through the designers' vision and hearing, and the design requirements of the PM fuel injection pump, such as the type of matching engine and the key parameters, receive attention first. With the stimulation of design information, the meta-cognitive center handles the related cognitive information from
bottom to top. Meta-cognition analyzes cognitive tasks, considers the designer's own role, cognitive goals, and intention, and monitors cognitive activities through the meta-cognitive center. Meta-cognitive operation takes effect in serial or concurrent mode, for example in planning the individual cognitive process, selecting a cognitive strategy, and comparing, in mental feeling, this design task with former design tasks. At the same time, meta-cognitive operation inspires meta-cognitive experience, which activates meta-cognitive knowledge to call up the related knowledge and design scenario segments, such as PL and/or PM fuel injection pump design scenes. Designer meta-cognitive knowledge guides and affects meta-cognitive operation and interprets meta-cognitive experience; conversely, meta-cognitive experience supports all kinds of operative behaviors. They interact with, restrict, collaborate with, and depend on one another to monitor, control, and regulate cognitive activities in product design. The design tasks and design intention of the PM fuel injection pump, its function-behavior-structure, and its sub-goals and sub-tasks need to be arranged, discussed, and determined in meetings and sub-meetings over several working days, which leads to designer cognition existing in a dispersed and fragmented mode when observed from time and space viewpoints. Designer cognitive and meta-cognitive activities govern and dominate individual behaviors, such as the verbal expression of the design scheme, the drawing practice, and the concrete design steps. As for detailed design calculations and basic parameters, designers complete them under the guidance of design templates and design manuals or with professional software. Little creative activity is involved, so designers only need to notice, monitor, and control their cognitive activities. When designers encounter difficulties, they need to extract related experiences, knowledge, and shortcuts from the long-term memory module of product design.
Sometimes, designers need to activate individual imagination, creativity, and inspiration to complete the design task and design activities of the PM fuel injection pump. In general, designer cognitive and meta-cognitive activities in PM fuel injection pump design conform to the principle of economy.

5.2 Cognitive Differences of Different Designers

In a nutshell, the whole design process of the PM fuel injection pump contains two stages: the preparation of the design scheme, and the concrete design and calculation of the PM fuel injection pump. At the first stage, the experienced designer and the novice differ in cognitive plan, cognitive strategy, and the perception and prediction of the PM fuel injection pump design process and its detailed steps. The cognitive differences between them mainly concern cognitive effects, mental feeling, cognitive goal and intention, result prediction of cognitive tasks, cognitive process, and meta-cognitive activities, etc., which are shown in Fig. 2. For example, in cognitive effects and cognitive tasks, the experienced designers like to perceive the design tasks as a whole in order to plan their cognitive tasks and to transfer and use their design experience, whereas the novices focus their attention on design details and design difficulties, and their cognitive strategies are different. Furthermore, the experienced designers emphasize the use of the techniques, setting forms, materials, and tolerances of existing series products, like the IW fuel injection pump and the P fuel injection pump, and of mature products, like the PW2000 fuel injection pump, but no such use of experience was found in the novices.
Fig. 2. Meta-cognitive and cognitive activities and individual differences in the preparation of fuel injection pump design
Because the two groups differ in knowledge quantity, problem analysis, and their possession of experience and shortcuts for similar design tasks, the design effects and design schemes they generate are clearly distinct at this stage. At the second stage, the experienced designers and the novices solve the detailed design problems and parameter calculations. The cognitive differences concern the perception of key problems in the design process, design experience, knowledge quantity, and knowledge structure, and they appear in the selection of methods for concrete parts, like the plunger and camshaft, and in the determination of parameters, like the pressure of fuel supply. For example, the novices design the plunger and plunger barrel according to fuel delivery per cycle and duration of feeding, whereas the experienced designers analyze the parameters and history data of the dimension chain and the maximum pressure at the pump end of the PL fuel injection pump and the PW2000 fuel injection pump, and consider the influence of fuel supply rate, spray quality, and the pressure of the combustion system at the end of injection in order to calculate the coefficient of plunger diameter/effective stroke and the chute inclination of the plunger. Table 1 shows a partial comparison of the cognitive differences between the experienced designers and the novices in the design process of the fuel injection pump. Because of the differences in design role, cognitive tasks, cognitive strategies, and knowledge structure, etc., the designers have different mental feelings, perceptions, cognitive activities, and meta-cognitive activities, and their activated meta-cognitive experience and meta-cognitive operative behaviors are also different.
Table 1. Partial cognitive differences between the experienced designers and the novices in fuel injection pump design

| Differences | Experienced Designers | Novices |
| --- | --- | --- |
| Cognitive People | Understand them and solve difficulty by easy stages, but lack creativity. | Deficiency of self-cognition, excessive self-confidence or negative in problem-solving, and sometimes creativity. |
| Cognitive Style | Trend to field dependence, reflective, divergence, holist | Trend to field independence, impulsive, convergence, serialist |
| Knowledge Quantity | Abundant expertise knowledge, domain knowledge, and practice experience | Only part expertise knowledge learned in university or by enterprise training |
| Organized Manner | Ordered, connected, and hierarchical organizing | Out of order, untrimmed, and random organizing |
| Cognitive Level of Problem | Simple and effective product design cognitive process | Form product design cognitive process gradually. |
| Manner of Knowledge Extraction | Extracting according to the rules of schema and hierarchy | Extracting in stochastic and disorder manner |
6 Conclusions

This paper explores the designer’s cognitive activities in the process of product design and provides a meta-cognition model of the engineering product designer, which affords a foundation in cognitive psychology for research on the cognitive process and meta-cognitive activities in the engineering design process. The core factors of the model are described and discussed in detail; they interact with, restrict, collaborate with, and depend on one another in the product design process. Meta-cognitive and cognitive activities in the process of product design are analyzed, and the cognitive differences between the experienced designers and the novices in the PM pump design process are compared. The model can support and serve cognition research in engineering design. Furthermore, meta-cognitive activity can guide the reuse of important tacit knowledge and provide the designer with effective knowledge, experience, and the right design orientation. At the same time, this study provides a useful reference for research on cognitive and meta-cognitive activity in other domains.

Acknowledgments. This work is supported by the Shuguang Program of the Shanghai Educational Committee under grant No. 05SG15 and the National Basic Research Program of China (973 Program) under grant No. 2003CB317005.
References

1. Sternberg, R.J.: Human intelligence: the model is the message. Science 230(4730), 1111–1118
2. Flavell, J.H.: Cognitive monitoring. In: Dickson, W.P. (ed.) Children’s oral communication skills, pp. 35–60. Academic Press, New York (1981)
3. Walczyk, J.J.: The Development of Verbal Efficiency, Metacognitive Strategies, and Their Interplay. Educ. Psychol. Rev. 2, 173–189 (1994)
4. Sherman, Y.T., Lang, J.D., Ralph, O.B.: Cognitive factors in distributed design. Comput. Ind. 48, 89–98 (2002)
5. Flavell, J.H.: Metacognition and cognitive monitoring: A new area of cognitive developmental inquiry. Am. Psychol. 34, 906–911 (1979)
6. Brown, A.L.: Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In: Weinert, R.E., Kluwe, R.H. (eds.) Metacognition, Motivation and Understanding, pp. 65–116. Lawrence Erlbaum Associates, Hillsdale, New Jersey (1987)
7. Baxt, S.V.: Metacognition gets personality: a developmental study of the personality correlates of metacognitive functioning. Carleton University, Ottawa (1995)
8. Ford, J.K., Smith, E.M., Weissbein, D.A., Gully, S.M., Salas, E.: Relationships of goal orientation, metacognitive activity, and practice strategies with learning outcomes and transfer. J. Appl. Psychol. 83, 218–233 (1998)
9. Butterfield, E.C., Albertson, L.R., Johnston, J.: On making cognitive theory more general and developmentally pertinent. In: Weinert, E., Schneider, W. (eds.) Memory Performance and Competence: Issues in Growth and Development, pp. 181–205. Lawrence Erlbaum, Hillsdale, New Jersey (1995)
10. Butterfield, E.C., Hacker, D.J., Albertson, L.R.: Environmental, Cognitive, and Metacognitive Influences on Text Revision: Assessing the Evidence. Educ. Psychol. Rev. 8(3), 239–297 (1996)
11. Hollnagel, E.: Cognition As Control: A Pragmatic Approach To The Modelling Of Joint Cognitive Systems. IEEE Trans. Syst. Man Cybern. (in press). http://www.ida.liu.se/ eriho/Publications_O.htm
12. Valkenburg, R., Dorst, K.: The reflective practice of design teams. Des. Stud. 19, 249–271 (1998)
13. Smith, R.P., Leong, A.: Observational study of design team process: a comparison of student and professional engineers. J. Mech. Des., Trans. ASME 120(4), 636–642 (1998)
14. Chiu, M.L.: Design moves in situated design with case-based reasoning. Des. Stud. 24, 1–25 (2003)
15. Tulving, E., Donaldson, W.: Episodic and semantic memory, Organization of Memory, pp. 381–403. Academic Press, New York (1972)
User Oriented Design to the Chinese Industries Scenario and Experience Innovation Design Approach for the Industrializing Countries in the Digital Technology Era

You Zhao Liang, Ding Hau Huang, and Wen Ko Chiou

Chang Gung University, 259 Wen-Hwa 1st Road, Kwei-Shan Tao-Yuan, Taiwan, 333, R.O.C
[email protected]
Abstract. Designing for Chinese industries and the new China market has become a ‘hot’ issue within the global and Chinese industrial design community. The characteristics of low labor costs and hard-working Chinese have had an effect on the rapid economic development within the region as a whole. The purpose of this paper is to analyze state-of-the-art industrial development within Taiwan and Mainland China, and to evaluate the critical problems of industrial design development in both regions. An additional aim is to discover how Taiwan Chinese digital technology industries confront this situation with user-oriented design (UOD). This paper synthesizes six approaches into an innovative product development framework for new product development procedures, with user-oriented scenario predictions and an experience innovation approach. These approaches not only generate original design data from a user’s point of view, but also make it much easier to reach consensus within product development teams and to create genuinely innovative designs through interdisciplinary collaboration, creating innovative cultural enterprises.

Keywords: User oriented design, Scenario approach, Innovation design, Industrializing countries, Digital technology.
own, preferring instead to copy or imitate those products that are already available in highly industrialized countries. Most manufacturers in the region involve themselves more with technical and production problems and with upgrading their production and technical quality. It is thus obvious that most makers are primarily concerned with ‘how to produce’ rather than with ‘what to produce’. In the past Taiwan developed the export of low-priced items based on the island’s competitive edge, which stems from relatively low labor costs. Taiwan has been competing in terms of ‘price’ rather than ‘quality’. The product itself has not been considered ‘important’ and manufacturers have spent comparatively little on it. This situation has been changing as other nations with even lower labor costs produce lower priced products. Looking particularly at the recent history of Taiwan, the slow but steady implementation of industrial design reflects this dilemma. This history can be grouped into three periods. The first, the economic industrial development period from 1966 to 1973, focused on ‘design as a tool’ in developing products which must satisfy local users’ needs as well as environmental requirements. The second, the export industries development period from 1973 to 1989, emphasized ‘design as a bridge’ between foreign buyers and local manufacturers. The third period, the industrial period from 1981 to the present, has implemented ‘design as a tool’ in developing unique Taiwanese products for the global market. The purpose of this paper is therefore to analyze state-of-the-art industrial development within Taiwan and Mainland China, to evaluate the critical problems of industrial design development in both regions, and to discover how Taiwan Chinese digital technology industries confront this situation with user-oriented design.
2 The Value of Design

We first describe how product design and development actually work.

2.1 Definition and Scope of Industrial Design

A number of managers in Taiwan’s local industries have understood that industrial design is a very important element in industry. However, it is still necessary to clarify the role of industrial design as something more than cosmetic ‘face-lifting’ or the creation of a ‘nice outer shell’ surrounding technology in general. In this respect, we would like to quote the definition of industrial design as formulated by the International Council of Societies of Industrial Design (ICSID): “industrial design is a creative activity. Its objective is to improve human life and its environment through product design which satisfies user’s needs and habits, and is concerned with their functional and emotional requirements” [3]. Today, most top managers in global business enterprises have recognized the importance of industrial design, not only as an important specialized field during the product development process, but also as a quality ‘tool’.
2.2 Product Design and Value Planning

To shed further light on the issue, we would like to quote the industrial design policy of the Concern Industrial Design Center (CIDC) of Philips Netherlands [2]: “It is the task of the CIDC to transform technology into products which are simple to produce, ergonomically correct, safe and easy to use and to service, and which are also aesthetically appealing thereby improving man’s comfort and environment”. Based on this policy we can list the main design factors as ‘function’, ‘use’, ‘appearance’ and ‘production’. Each factor significantly influences a product’s quality and value. The relationship between the factors can be formulated as:

\[ V = \frac{Q}{C} = \frac{F + U + A}{C} \]

where V = value, C = cost of materials and production, Q = quality, F = function, U = use, and A = appearance.

2.3 Function of Product Design

Many managers in the Taiwan region imagine the function of product design to be simply a product’s engineering and manufacturing. Others may think of it purely in terms of electronics. In fact, design can be defined as a conscious plan. Its main contribution to product development lies in the synthesis of a concept using carefully assembled facts. Design skills may be defined in relation to the type of product and may also be related to the various functions of the designer. The three main groups directly involved in the product design and development process are ‘the marketing group’, ‘the technical development and production group’, and ‘the industrial design group’. Teamwork is the keyword during the product development process. All specialists involved cooperate according to a systematic product development pattern, and they must be competent enough to coordinate their specialized ‘optimal solution’ with the expected holistic solution. This coordination creates an optimal product or product system and, at the same time, prevents the dominance of one function over another. The product development procedure is a systematic process which integrates all product design and development activities from the idea stage to mass production, to ensure that the product meets the timing and price needs of the market and consumers. Product development also works as a coordinator and integrator to ensure that every functional division works as an integrated team and maintains good communication, with full commitment to the project goal.
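The value relation of Section 2.2, V = Q/C = (F + U + A)/C, can be illustrated with a small calculation. The following Python sketch is only an illustration under stated assumptions: the class name `DesignScores`, the functions `quality` and `value`, the common rating scale, and all numbers are hypothetical and do not come from the paper, which gives no measurement procedure for F, U, A, or C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignScores:
    """Hypothetical factor scores for one product concept.

    The paper does not say how F, U, A, or C are measured; here we
    simply assume they are expressed on consistent numeric scales.
    """
    function: float    # F
    use: float         # U
    appearance: float  # A
    cost: float        # C: cost of materials and production, must be > 0


def quality(s: DesignScores) -> float:
    """Q = F + U + A, as in the value relation of Section 2.2."""
    return s.function + s.use + s.appearance


def value(s: DesignScores) -> float:
    """V = Q / C = (F + U + A) / C."""
    if s.cost <= 0:
        raise ValueError("cost must be positive")
    return quality(s) / s.cost


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    follower = DesignScores(function=6, use=5, appearance=4, cost=10)
    user_oriented = DesignScores(function=6, use=8, appearance=7, cost=12)
    print(value(follower))       # (6 + 5 + 4) / 10
    print(value(user_oriented))  # (6 + 8 + 7) / 12
```

Under this reading, raising ‘use’ and ‘appearance’ can increase value even when cost rises, provided quality grows faster than cost.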
3 Experience in Taiwan

At the Pacific rim, Mainland China seems to be following in the same footsteps as Taiwan, that is, developing its industries on the basis of original equipment manufacturer (OEM) orders, and then trying to upgrade to an original brand manufacturing (OBM) level through original design manufacturing (ODM) business. We therefore describe the experience of Taiwanese industry, especially in product design and development, as follows.

3.1 The Gap of Smile Curve and Its Shifting

‘ACER’ has a famous brand image the world over. It symbolizes that Taiwan does not just have a manufacturing industry, but can create a brand which embodies value. The founder of ACER described the current characteristics of the electronic equipment manufacturing industry with a theory called the ‘smile curve’. Within the smile curve, the two sides are marketing and development, and the manufacturing function is in the middle [2]. Mr. Shih encouraged Taiwanese industry to move toward these two ends of the smile curve within the global value system. Marketing and development have higher added value throughout the industry, and Taiwan should not stay in the middle of the curve, which has lower value within the industry. Therefore Taiwan should develop industries on the basis of OEM orders and upgrade to an OBM level through ODM business (as shown in Fig. 1).
Fig. 1. Smile curve
However, a product developed without market strategy, positioning, and user knowledge will find it hard to gain customer acceptance and to become a market-recognized brand. Therefore, in the knowledge economy era we should shift the strategy from the ‘smile curve theory’ to the ‘close cycle concept’. We should integrate manufacturing knowledge at the bottom of the smile curve, technology knowledge on the left side, and marketing knowledge on the right side, but more importantly we should add the content knowledge of user-oriented needs at the top of the cycle (as shown in Fig. 2).

3.2 The Missing Link from OEM to OBM

Regarding the industrial development gap in product development practice in Taiwan, the term ODM in practice means ‘own development manufacturing’: we are qualified in technical and engineering development, but usually offer only an ongoing solution with ‘me too’ (follower) design, rather than applying user-oriented design (UOD) principles.
[Figure 2 appears here. Labels recoverable from the diagram: Close cycle; Gap I; Original Design Manufacturing; Own Development Manufacturing (‘me too’); Strategic Innovation Design; Engineering knowledge (research and development, engineering); Domain knowledge (OBM: marketing, branding, sales); MFG knowledge (OEM); MKT, ENG'S, MFG, MNGT, Design.]

Fig. 2. Smile curve shifting
[Figure 3 appears here. Labels recoverable from the diagram: Gap II; Market Strategy; R&D; Hi-Management, Hi-Tech, Hi-Design; Design; Engineering; Me too; Manufacturing; DCOR.]

Fig. 3. Reasonable product innovation development process
However, a reasonable innovation design process should first define the direction of the innovation strategies, then conduct R&D and design according to the goals of those strategies, and only then put the result into manufacture. Moreover, we should be concerned with technological innovation and product strategy at the same time, and build brand image efficiently through interdisciplinary collaborative design and marketing-based value. In order to bridge the gap in industrial development in Taiwan, UOD and ‘interdisciplinary collaboration’ integration should be emphasized. It is proposed to upgrade own development manufacturing (follower design) to original design manufacturing, and to build design strategies that include hi-design on a user-oriented base, hi-tech on a technology base, and hi-management on an interdisciplinary collaboration base (as shown in Fig. 3).
4 Scenario and Experience Innovation Design Approach

This paper recommends six approaches for carrying out a user-oriented innovative product development framework within new product development procedures, which can be applied to a series of practical cases. The approaches are as follows:

4.1 I-Ching and Darwin’s Natural Law

I-Ching (the theory of change) and Darwin’s natural law are applied to describe the principle by which form, shape, and function are created and develop from the ‘natural environment and scenario’ in which things live.

4.2 Competitive Product Appraisal and Monitoring Competition

As products are developed following their ‘field of use’ and ‘use scenario’, monitoring competition from the users’ point of view and from market positioning helps to evaluate the products’ advantages and disadvantages in order to position and define the competitive advantage.

4.3 Macro Vision Scenario

From an economic, social, and technological point of view, this approach defines the product opportunity from a macro vision in order to develop the key issues for new product development.

4.4 Micro Scenario

This approach defines the target user group and the detailed scenario situations and activities, derived from the above product opportunities, that interact with the products (the user target groups are generated from character mapping, which is defined from a set of attributes and relates to the product and users). From the micro scenario, key issues and design requirements for new products can be identified. The above approaches not only generate original design data from a user’s point of view, but also make it much easier to reach consensus within product development teams and to create genuinely innovative designs through interdisciplinary collaboration, creating innovative cultural enterprises.

4.5 Scenario Observation

Observation of actual situations and interaction with actual sampled characters verifies the critical issues and design requirements generated from the micro scenarios, so that final design definitions become evident.
4.6 Design Development and Scenario Verification

Scenario simulation and scenario verification are facilitated by means of rough ‘mock ups’, prototyping and ‘field test sampling’ to experience and verify users’ scenarios, in order to refine designs and to reduce risks from both the users’ and the business’s points of view.
5 User Oriented Innovation Design Concept

With this approach, we collaborated with ADVANTECH Co. Ltd [1], Taiwan, a leader in the industrial computing and automation market. The above methods were applied, with UOD scenario prediction and the experience innovation approach, to a series of interactive interface products for e-automation systems at ADVANTECH, including an industrial automation e-platform, a service automation e-platform (medical, vehicle), and a home automation e-platform. The innovative UOD concept for e-automation industries is shown in Fig. 4.
Fig. 4. Innovative UOD concept [1]
6 Conclusion

The most important consideration for managers in this region is the development of marketing and design and not just technology and production. Products have to be designed to closely fit the market and complement users’ life-styles, needs and habits.
It is also essential for our region’s producers to think more along the lines of long-term advantages instead of immediate profit. Manufacturers have to put more effort into creating new products as well as improving existing products. They must simultaneously establish their own corporate identity and product image to further their global development. These goals are best met by adhering to a set procedure of product development. This will both give the customer what he desires and generate an inbred ‘quality consciousness’ toward innovative design manufacturers. As noted earlier, product development is the coordinator and integrator of the entire product development cycle. It ensures that the overall program stays on schedule and that the product introduction date is met. Most important, the whole concept is based on the premise that the customer is the boss.
7 Implications

Taiwan is an island with a population of 23 million; the market is too small for new innovative products to survive unless enterprises scale up to international markets. With a population of 1.3 billion, Mainland China’s market and industries will have many more opportunities to develop innovative UOD in today’s knowledge economy era.
References

1. ADVANTECH: http://www.advantech.com/
2. CIDC: Concern Industrial Design Centre, Philips, Nederland: http://www.design.philips.com
3. ICSID: http://www.icsid.org/
4. Shi, Z.R.: Acer reconstruction: Initiating, growing up and challenge. Commonwealth Publishing (2004)
Emotional Experiences and Quality Perceptions of Interactive Products

Sascha Mahlke¹ and Gitte Lindgaard²

¹ Centre of Human-Machine Systems, Berlin University of Technology, Franklinstr. 28/29 – FR2-7/2, 10587 Berlin, Germany
[email protected]
² Human-Oriented Technology Lab, Carleton University, 1125 Colonel By Drive, Ottawa, K1S 5B6, Canada
[email protected]
Abstract. Over the past few years, various novel approaches have been applied to the evaluation of interactive systems. Particularly, the importance of two categories of concepts has been emphasized: non-instrumental qualities and emotions. In this paper we present an application of an integrative approach to the experimental study of instrumental and non-instrumental quality perceptions as well as emotional user reactions as three central components of the user experience. A study is presented that investigates the influence of system properties and context parameters on these three components. The results show that specific system properties independently influence the perception of instrumental (i.e. usability) and non-instrumental qualities (i.e. visual aesthetics). Especially the perception of instrumental qualities was shown to have an impact on the users’ emotional reactions (subjective feelings as well as cognitive appraisals). There was also evidence suggesting that context parameters influenced emotional user reactions.
Mahlke [2] reviewed various approaches to the study of non-instrumental quality aspects. Briefly, he argued that two distinct categories of non-instrumental qualities have been differentiated in most approaches. On the one hand, aesthetic aspects have been discussed. These contain first and foremost visual aspects of product appearance, but can also imply other sensory experiences like haptic or auditory aspects of product use, as for example discussed by Jordan [3] and captured in his definition of physio-pleasure. The other category refers to a symbolic dimension of product appearance. The concept of hedonic quality discussed by Hassenzahl [4] belongs to this category, which is similar to what Jordan [3] calls socio- and ideo-pleasure. Although much is being said about non-instrumental quality aspects and their application to design, only a few empirical studies actually measuring these have been reported. In a study of the interplay of non-instrumental quality perceptions with other concepts, Tractinsky, Katz and Ikar [5] highlighted the connection between aesthetics and usability. They argue that users’ aesthetic judgments made before using an interactive system affect their perceived usability even after using it. Lindgaard and Dudek [6] found a more complex relationship between these two concepts. Hassenzahl [4] studied the interplay between usability and hedonic quality in forming overall judgments concerning beauty and goodness. He found that judgments of beauty are more influenced by the user’s perception of the hedonic qualities, while judgments of goodness, as a more general evaluative construct, are affected by both hedonic quality and usability. Although a few empirical studies do exist that contribute to a better understanding of the role of non-instrumental qualities and their interplay with other relevant aspects of technology use, many questions remain to be addressed.
In particular, the relationships between quality perceptions and emotional experiences have barely been explored.

1.2 Emotions as Part of the User Experience

Rafaeli and Vilnai-Yavetz [7] attempted to link quality perceptions and emotional experience. They suggested that artifacts should be analyzed in terms of three conceptually distinct quality dimensions: instrumentality, aesthetics, and symbolism. They conducted a qualitative study in a non-interactive product domain to better understand the influence of these three quality dimensions on emotional responses. All three categories contributed significantly to the emergence of emotion. Tractinsky and Zmiri [8] applied this idea to an interactive domain by studying various existing websites, which yielded similar results, and Mahlke’s [9] study on actual audio players showed that various instrumental and non-instrumental quality perceptions influenced users’ emotional responses. While Rafaeli and Vilnai-Yavetz [7] used interviews, Tractinsky and Zmiri [8] and Mahlke [9] applied questionnaires to assess users’ emotional responses. All these studies focused on the subjective feelings that arise when perceiving or using the relevant products. Much research has been conducted on measurements of emotion during interaction with technical devices, and different methods have been proposed to measure emotions in interactive contexts. Mahlke, Minge and Thüring [10] used Scherer’s [11] multi-component model of emotion to structure a range of relevant emotion-measurement methods and relate them to the five components of emotion:
subjective feelings, facial expressions, physiological reactions, cognitive appraisals and behavioral tendencies. Taken together, there are two major problems with the interpretation of results emerging from the studies reported above that relate emotional experiences during the interaction to users’ quality perceptions [7, 8, 9]:

1. They took a quasi-experimental approach by using existing products. As it was not discussed which properties of the stimuli or other variables influenced quality perceptions and the emotional experience, this question remains unanswered.
2. Rather than measuring all five components of Scherer’s [11] model, only subjective feelings were measured as indicators of emotions.

1.3 Research Approach

Mahlke and Thüring [12] describe an integrated research approach to the experimental study of emotional user reactions considering both instrumental and non-instrumental quality perceptions of interactive systems. Their model defines instrumental and non-instrumental quality perceptions as well as emotional reactions as three central components of the user experience, claiming that characteristics of the interaction affect all three of these. These characteristics primarily depend on system properties, but both user characteristics and context parameters like aspects of the tasks and the situation can play an important role. The outcomes of the users’ interactive experience, as expressed in overall judgments of a product, usage behavior or choices of alternatives, are shown to involve all three components, namely emotional user reaction as well as instrumental and non-instrumental quality perceptions. This model has been applied to study the influence of system properties on the three user experience components and users’ overall appraisal of the system [12].
In an effort to affect the perception of instrumental qualities as well as user performance, the level of usability was systematically varied, as were other system properties expected to affect the perception of visual aesthetics. Emotions were measured in terms of subjective feelings, motor expressions and physiological responses. The results confirmed that the manipulations had the predicted impact on the perception of both instrumental and non-instrumental qualities. Prototypes high in usability and attractiveness were rated significantly more highly than those that were low in both aspects. The results of the questionnaire assessing subjective feelings showed an effect of both factors. They revealed that the effect of variations in usability was greater than that of variations in visual aesthetics on both valence and arousal measures. Consequently, the high-usability/high-aesthetics prototype was experienced as most satisfying, while the low-usability/low-aesthetics prototype was found to be most annoying. Since no statistical interaction of usability and aesthetics was found, both factors contributed additively to these emotions. EMG data of facial muscle sites and other physiological measures (dermal activity and heart rate) supported this interpretation. The following study is based on the same research approach, but differs in two aspects. First, the measurement of emotions focuses on subjective feelings and cognitive appraisals to learn more about another component of emotions defined by Scherer [11], and second, task demands were varied as an example of contextual parameters. Hassenzahl, Kekez and Burmester [13] found that the influence of instrumental and non-instrumental quality perceptions on overall judgments differs
depending on whether users are in a goal- or action-mode. In the goal-mode participants were required to accomplish given tasks, while they had the same amount of time to explore the system on their own in the action-mode. This variation was applied to investigate the effect of context parameters on emotional responses. The following predictions were made:

1. The versions with higher levels of usability and/or visual aesthetics would lead to higher instrumental and/or non-instrumental quality ratings.
2. Quality ratings would not be influenced by the usage mode [13].
3. The versions with higher levels of usability and/or visual aesthetics would lead to differences in the cognitive appraisal of the usage situation and more positive subjective feelings.
4. In goal-mode, the correlation between instrumental quality perceptions and subjective feelings would be higher than between non-instrumental quality perceptions and subjective feelings. In action-mode the opposite would be found.
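Prediction 4 compares two correlations within each usage mode. As a minimal sketch of how such a pattern could be checked, the Python code below computes a Pearson correlation and tests whether a pair of correlations has the predicted ordering. The function names `pearson` and `matches_prediction_4` are hypothetical, the choice of Pearson's r is an assumption (the text above does not name the coefficient), and a real analysis would also need a significance test for the difference between correlations.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def matches_prediction_4(mode: str, r_instrumental: float,
                         r_non_instrumental: float) -> bool:
    """Check only the ordering of the two correlations for one mode.

    r_instrumental:     r(instrumental quality rating, subjective feeling)
    r_non_instrumental: r(non-instrumental quality rating, subjective feeling)
    """
    if mode == "goal":
        return r_instrumental > r_non_instrumental
    if mode == "action":
        return r_non_instrumental > r_instrumental
    raise ValueError("mode must be 'goal' or 'action'")
```

In use, the ratings of all participants in one mode would be passed to `pearson` twice (once per quality dimension, each paired with the valence ratings), and the two results handed to `matches_prediction_4`.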
2 Method

The variables investigated concerned the influence of system properties associated with usability and aesthetics of the system and task demands, that is, goal- versus action-mode, on the perception of instrumental and non-instrumental qualities and emotional user reactions. These included subjective feelings and cognitive appraisals.

2.1 Participants

Eighty undergraduate students (48 women, 32 men) participated in the study. They were between 18 and 54 years old (average 21.3 years) and received course credit for participation in the study. Most of the participants (n = 72) owned a portable audio player and used it regularly. Almost all (n = 78) used computers daily.

2.2 Material

Portable audio players were chosen as the domain of study and different versions were simulated on a computer. The aim of the variation of system attributes was to influence perceived usability and aesthetics of the system independently. To produce two versions with different levels of usability, three system features were varied: the number of menu lines shown (five versus two), a scrollbar indicating available but hidden menu items (given or not), and a cue about the present position in the menu hierarchy (given or not). These variations had been used in a previous experiment [12] in which the effect of these on usability varied in the direction one would predict, that is, the most usable version resulted in the highest usability ratings. With respect to system features designed to influence the perception of visual aesthetics, two different body designs were used in the earlier experiment [12], varying in symmetry (high or low), color combination (high or low color differences) and shape (round or square). Because these manipulations resulted only in small differences in perceived aesthetics between the two versions, an attempt was made here to improve the high-aesthetic version by consulting a professional designer.
The prototypes were presented on a 7” TFT-display with touch screen functionality that participants could hold in their hands for providing input. The display was connected to a computer which ran the simulation of the audio player.

2.3 Design

Three independent variables were manipulated: ‘usability’, ‘visual aesthetics’, and ‘mode’ (goal- vs. action-mode). Since each of the variations of ‘usability’ and ‘visual aesthetics’ had two levels (‘high’ and ‘low’), four prototypes were created: (a) ‘high-usability’ and ‘high-aesthetics’, (b) ‘high-usability’ and ‘low-aesthetics’, (c) ‘low-usability’ and ‘high-aesthetics’, (d) ‘low-usability’ and ‘low-aesthetics’. In the goal-mode participants were required to accomplish a set of tasks, and in the action-mode they freely browsed the system for the same amount of time. All three variables were between-subjects factors.

2.4 Measures

Two types of behavioral data were recorded in the goal-mode condition to ensure that versions of assumed high or low usability differed as planned: task completion rates and time on task. Questionnaires were employed to assess the user’s perception of instrumental and non-instrumental qualities. Selected sub-dimensions (controllability, effectiveness, helpfulness, learnability) of the Subjective Usability Measurement Inventory (SUMI) [14] served to rate usability. The dimension ‘classical visual aesthetics’ of a questionnaire developed by Lavie and Tractinsky [15] was used to measure visual aesthetics. Subjective emotional data were obtained via the Self-Assessment Manikin (SAM) [16], which captures the quality, or valence (positive/negative), and intensity (arousal) of emotions. Cognitive appraisals were obtained via a questionnaire based on the Geneva Appraisal Questionnaire [17]. It measures five appraisal dimensions: intrinsic pleasantness, novelty, goal/need conduciveness, coping potential, and norm/self compatibility.
Novelty is a measure of familiarity and predictability of the occurrence of a stimulus, while intrinsic pleasantness describes whether a stimulus event is likely to result in a positive or negative emotion. A goal conduciveness check establishes the importance of a stimulus for the current goals or needs. Coping potential refers to the extent to which an event can be controlled or influenced. Norm/self compatibility describes the extent to which a stimulus satisfies external and internal standards.

2.5 Procedure

The experiment took roughly 30 minutes on average. Participants were given instructions describing the experimental procedure and the use of SAM. They were then asked to rate their subjective feelings as a baseline measure. Then, depending on the experimental condition to which they were assigned at random, the relevant player was presented and participants rated its visual aesthetics. Next, they read a short text describing how to use the system.
Participants were then asked either to complete the set of five tasks or to explore the system for a certain amount of time. In the goal-mode condition a limit of two minutes was set for each task. Typical tasks were ‘Please have a look which songs you find on the player in the Genre POP’ or ‘Please change the sound setting of the player to CLASSIC’. However, participants actually completed the five tasks in five minutes on average. Therefore, a five-minute time limit was also set for the browsing participants. In the task condition participants filled in SAM scales after the first, third and fifth task. In the browsing condition, they were asked to rate their current subjective feeling after one, three and five minutes of exploration. At the end of this, the cognitive appraisal questionnaire was completed and usability ratings were obtained.
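The 2x2x2 between-subjects design of Section 2.3 and the random assignment mentioned in Section 2.5 can be illustrated with a short sketch. The paper only states that assignment was random; the equal cell sizes, the fixed seed and all identifiers below are assumptions made for illustration.

```python
import itertools
import random

# Factor levels as named in Section 2.3; all three are between-subjects factors.
FACTORS = {
    "usability": ("high", "low"),
    "visual_aesthetics": ("high", "low"),
    "mode": ("goal", "action"),
}


def build_conditions(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in itertools.product(*factors.values())]


def assign_participants(n_participants, conditions, seed=0):
    """Randomly assign participants to conditions with equal cell sizes.

    Equal cell sizes are an assumption of this sketch; the paper only
    states that participants were assigned at random.
    """
    if n_participants % len(conditions) != 0:
        raise ValueError("participants cannot be split evenly across conditions")
    per_cell = n_participants // len(conditions)
    slots = [index for index in range(len(conditions)) for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return {participant: conditions[slot] for participant, slot in enumerate(slots, start=1)}


conditions = build_conditions(FACTORS)
assignment = assign_participants(80, conditions)
```

Under these assumptions, eighty participants spread over eight cells give ten participants per cell.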
3 Results

A 2x2 ANOVA for ‘usability’ and ‘visual aesthetics’ was performed on the goal-mode data only, assessing task-completion rates and task-completion time. There was a significant main effect for ‘usability’ only, for both task-completion rates, F(1,38)=9.20, p < .01, and task-completion time, F(1,38)=13.10, p < .01. Thus, high usability led to better performance on both measures.

3.1 Instrumental and Non-instrumental Quality Perception

Table 1 summarizes the average usability and visual aesthetics ratings for each condition. The ratings were transformed to values between 0 and 1 because the range of ratings differed between the variables. The table shows that the average ratings were comparatively high even in the low-usability and the low-aesthetics conditions.

Table 1. The first number in each cell represents the average usability rating and the second number the average visual aesthetics rating for each condition (ratings are transformed to values between 0 and 1)
A 2x2x2 ANOVA for ‘usability’, ‘visual aesthetics’ and ‘mode’ performed on the usability ratings revealed a significant main effect for ‘usability’ only, F(1,72)=9.0, p < .01. A similar 2x2x2 ANOVA carried out on the visual aesthetics ratings showed a significant main effect for ‘visual aesthetics’ only, F(1,72)=34.3, p < .001. Consistent with hypotheses 1 and 2, this suggests that the system properties affected the perception of both instrumental (i.e. usability) and non-instrumental qualities (i.e. visual aesthetics), and that quality perceptions were not influenced by usage mode.
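The transformation of ratings to values between 0 and 1 mentioned for Table 1 is not specified further in the text. A minimal sketch, assuming a linear (min-max) rescaling, is given below; the scale ranges in the example are hypothetical and are not the actual ranges of the SUMI or aesthetics questionnaires.

```python
def rescale(rating, scale_min, scale_max):
    """Map a rating from [scale_min, scale_max] linearly onto [0, 1]."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    if not scale_min <= rating <= scale_max:
        raise ValueError("rating lies outside the scale")
    return (rating - scale_min) / (scale_max - scale_min)


# Hypothetical example: a usability score on a 1-5 scale and an
# aesthetics score on a 1-7 scale become directly comparable.
usability_01 = rescale(4, 1, 5)      # 0.75
aesthetics_01 = rescale(5.5, 1, 7)   # 0.75
```

After rescaling, both ratings share the same range, which is what allows them to be reported side by side in one table.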
3.2 Emotional User Reactions

A series of 2x2x2 ANOVAs for ‘usability’, ‘visual aesthetics’ and ‘mode’ on each of the five cognitive appraisal dimensions showed that participants rated the intrinsic pleasantness of the interaction higher for the high-usability than for the low-usability version, F(1,72)=3.9, p < .05. Furthermore, the experience with the low-usability system was rated as more novel, F(1,72)=5.6, p < .05, and self/norm compatibility was higher for the high-usability version, F(1,72)=5.2, p < .05. Neither ‘visual aesthetics’ nor ‘mode’ influenced intrinsic pleasantness, novelty or self/norm compatibility, and goal conduciveness as well as coping potential showed no significant effect for any of the independent variables. In summary, we found partial support for hypothesis 3: cognitive appraisals differed on three of the five appraisal dimensions, and only the factor ‘usability’ had a significant influence.

For the analysis of subjective feelings we calculated the changes from the baseline value obtained at the beginning of the experiment to the three values assessed during the interaction for each participant. For the changes from the baseline to the first two assessments of subjective feelings, the 2x2x2 ANOVAs with ‘usability’, ‘visual aesthetics’ and ‘mode’ as independent variables revealed no significant effects for either dimension (valence or arousal). Figure 1 shows the average subjective feeling changes to the third data point at the end of the interaction for the four prototypes. A 2x2x2 ANOVA for ‘usability’, ‘visual aesthetics’ and ‘mode’ with the changes in valence as dependent variable revealed a significant effect for ‘usability’ only, F(1,72)=25.5, p < .05. The ANOVA for arousal as dependent variable showed no significant effects. Thus, only ‘usability’ affected the valence of subjective feelings, which again only partially supported hypothesis 3.
Fig. 1. Changes of subjective feeling ratings from the beginning of the experiment to the third assessment during the interaction with the system for the four systems (horizontal axis: valence; vertical axis: arousal; square markers: high usability, round markers: low usability; filled markers: high aesthetics, unfilled markers: low aesthetics; SAM ratings were between 0 and 8)
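The change scores underlying Figure 1 are simple differences between each assessment and the baseline rating. The following sketch shows the computation for one condition; the ratings are invented for illustration and are not data from the study, and the function name is hypothetical.

```python
from statistics import mean


def change_scores(baseline, assessments):
    """Return each assessment minus the participant's baseline rating."""
    return [value - baseline for value in assessments]


# Hypothetical SAM valence ratings (0-8 scale) for three participants:
# (baseline, [first, second, third assessment during the interaction])
ratings = [
    (5, [5, 4, 6]),
    (4, [4, 5, 6]),
    (6, [5, 6, 7]),
]

changes = [change_scores(baseline, assessments) for baseline, assessments in ratings]

# Mean change per assessment point across the participants of one condition.
mean_changes = [mean(column) for column in zip(*changes)]
```

Working with differences from the baseline removes individual differences in initial mood, so that the ANOVAs compare how feelings shifted during the interaction rather than where they started.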
To test prediction 4, we computed partial correlations between the usability and visual aesthetics ratings and subjective feelings in the two usage situations. As shown in Table 2, we found a high correlation between perceived usability and valence in the goal-mode, but none between perceived aesthetics and valence. For arousal, none of the correlations was significant. In the action-mode, valence correlated moderately and significantly with both perceived usability and perceived aesthetics. Again, none of the correlations with arousal was significant.
Table 2. Correlation coefficients between quality ratings (usability and visual aesthetics) and subjective feelings (valence and arousal)

Correlated measures | Goal-mode (tasks) | Action-mode (exploration)
perceived usability – valence | .66 a) ** | .35 a) *
perceived aesthetics – valence | -.01 b) | .35 b) *
perceived usability – arousal | -.16 a) | -.19 a)
perceived aesthetics – arousal | .22 b) | .04 b)

Partial correlation coefficients with a) visual aesthetics controlled and b) usability controlled; * p < .05; ** p < .01
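The text does not state how the partial correlations in Table 2 were computed. A minimal sketch, assuming the standard first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), is shown below. The ratings are invented for illustration, and the function and variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y with the linear influence of z removed."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical ratings of five participants (not data from the study).
usability = [0.8, 0.6, 0.7, 0.4, 0.9]
aesthetics = [0.7, 0.5, 0.8, 0.6, 0.6]
valence_change = [1.0, 0.0, 1.0, -1.0, 2.0]

# Usability-valence correlation with visual aesthetics controlled (footnote a).
r_usability = partial_correlation(usability, valence_change, aesthetics)
# Aesthetics-valence correlation with usability controlled (footnote b).
r_aesthetics = partial_correlation(aesthetics, valence_change, usability)
```

Controlling for the other quality rating matters here because perceived usability and perceived aesthetics may themselves be correlated; the partial coefficient isolates the share of the relationship that is specific to one of them.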
4 Discussion

As stated in hypothesis 1, system properties did independently influence instrumental as well as non-instrumental quality perceptions. Both usability and aesthetics manipulations affected subjective perceptions in the predicted directions. In contrast to other studies [5, 18], we did not find any influence of the visual aesthetics variation on perceived usability. One reason may be that in other studies an overall usability rating was used, while we applied a detailed measure for usability. As one would have expected based on Hassenzahl et al.’s [13] findings, no effect of the factor ‘mode’ on quality perceptions was found (prediction 2).

The integration of cognitive appraisals as another component of emotions followed the recommendations by Mahlke et al. [10] to consider different components of emotions. We found an influence of the factor ‘usability’ on cognitive appraisals. The interaction with the low-usability system was experienced as less intrinsically pleasant, which corresponds to the findings regarding the subjective feelings. Furthermore, participants rated it as more novel or unusual, which may have led to more negative subjective feelings. The low-usability system was also rated as less self/norm compatible. Although this experiment is another step towards the study of cognitive appraisals in interactive contexts, further research is clearly needed on this topic.

The users’ subjective feelings were only affected by variations in usability. Furthermore, only the valence dimension was influenced. Participants’ subjective feelings were more positive in the high-usability condition towards the end of the experiment compared to the beginning. Surprisingly, we did not find an effect of ‘visual aesthetics’, although we tried to improve the differences in visual aesthetics in comparison to a previous experiment [12].
The variation of usage mode revealed differences in the connections between quality perceptions and participants’ subjective feelings. These differences were most pronounced for the subjective feeling dimension of valence. While there was a high correlation between the valence of users’ subjective feelings and the perceived usability of a system and no correlation with the perceived visual aesthetics when participants focused on the given tasks in the goal-mode, we found moderate correlations between valence and both perceived usability and aesthetics when participants were merely exploring the system. These results indicate that context
parameters like usage mode influence both the importance of specific quality dimensions for overall judgments [13] and the quality of the emotional experience. However, more research is needed on these relationships, especially with respect to the subjective feeling dimension of arousal. In future studies the influence of user characteristics should also be studied in addition to system properties and context parameters. Furthermore, the variation of system properties that influence non-instrumental qualities other than visual aesthetics (e.g. haptic and acoustic quality) may reveal important insights, especially for the domain of consumer electronic products.

Acknowledgements. This research was supported by the German Research Foundation (DFG) as part of the Research Training Group ‘Prospective Engineering of Human-Technology Interaction’ (no. 1013) and by the German Academic Exchange Service (DAAD) with a travel grant. We would like to thank Lucienne Blessing, Manfred Thüring and various colleagues at the Center on Human-Machine-Systems in Berlin and the Human-Oriented Technology Lab in Ottawa for the discussions on the study.
References

1. ISO: ISO 9241: Ergonomic requirements for office work with visual display terminals. Part 11: Guidance on usability. ISO, Geneva (1998)
2. Mahlke, S.: Aesthetic and Symbolic Qualities as Antecedents of Overall Judgements of Interactive Products. In: Bryan-Kinns, N., Blandford, A., Curzon, P., Nigay, L. (eds.) People and Computers XX - Engage, pp. 57–64. Springer, Heidelberg (2006)
3. Jordan, P.W.: Designing pleasurable products. Taylor & Francis, London (2000)
4. Hassenzahl, M.: The Interplay of Beauty, Goodness, and Usability in Interactive Products. Human-Computer Interaction 19, 319–349 (2004)
5. Tractinsky, N., Katz, A.S., Ikar, D.: What is beautiful is usable. Interacting with Computers 13, 127–145 (2000)
6. Lindgaard, G., Dudek, C.: What is the evasive beast we call user satisfaction? Interacting with Computers 15(3), 429–452 (2003)
7. Rafaeli, A., Vilnai-Yavetz, I.: Instrumentality, aesthetics and symbolism of physical artifacts as triggers of emotion. Theoretical Issues in Ergonomics Science 5, 91–112 (2004)
8. Tractinsky, N., Zmiri, D.: Exploring Attributes of Skins as Potential Antecedents of Emotion in HCI. In: Fishwick, P. (ed.) Aesthetic Computing. MIT Press, Cambridge (2006)
9. Mahlke, S.: Studying user experience with digital audio players. In: Harper, R., Rauterberg, M., Combetto, M. (eds.) ICEC 2006. LNCS, vol. 4161, pp. 358–361. Springer, Heidelberg (2006)
10. Mahlke, S., Minge, M., Thüring, M.: Measuring multiple components of emotions in interactive contexts. In: CHI ’06 extended abstracts on human factors in computing systems, pp. 1061–1066. ACM Press, New York (2006)
11. Scherer, K.R.: What are emotions? And how can they be measured? Social Science Information 44, 693–727 (2005)
12. Mahlke, S., Thüring, M.: Antecedents of Emotional Experiences in Interactive Contexts. In: CHI ’06 proceedings on human factors in computing. ACM Press, New York (2007)
13. Hassenzahl, M., Kekez, R., Burmester, M.: The importance of a software’s pragmatic quality depends on usage modes. In: Luczak, H., Cakir, A.E., Cakir, G. (eds.) Proceedings of the 6th international conference on Work With Display Units (WWDU 2002), pp. 275–276. ERGONOMIC Institut für Arbeits- und Sozialforschung, Berlin (2002)
14. Kirakowski, J.: The software usability measurement inventory: Background and usage. In: Jordan, P.W., et al. (eds.) Usability Evaluation in Industry, pp. 169–178. Taylor & Francis, London (1996)
15. Lavie, T., Tractinsky, N.: Assessing dimensions of perceived visual aesthetics of web sites. International Journal of Human-Computer Studies 60, 269–298 (2004)
16. Lang, P.J.: Behavioral treatment and bio-behavioral assessment: Computer applications. In: Sidowski, J., Johnson, H., Williams, T. (eds.) Technology in Mental Health Care Delivery Systems, pp. 119–137. Ablex Publishing, Greenwich (1980)
17. Scherer, K.R.: Appraisal considered as a process of multi-level sequential checking. In: Scherer, K.R., Schorr, A., Johnstone, T. (eds.) Appraisal processes in emotion: Theory, methods, research, pp. 92–120. Oxford University Press, New York, Oxford (2001)
18. Ben-Bassat, T., Meyer, J., Tractinsky, N.: Economic and Subjective Measures of the Perceived Value of Aesthetics and Usability. ACM Transactions on Computer-Human Interaction 2, 210–234 (2006)
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle

Thomas Memmel, Fredrik Gundelsweiler, and Harald Reiterer

Human-Computer Interaction Lab, University of Konstanz, D-78457 Konstanz, Germany
{memmel,gundelsw,reiterer}@inf.uni-konstanz.de
Abstract. This article seeks to close the gap between software engineering and human-computer interaction by indicating interdisciplinary interfaces of SE and HCI lifecycles. We present a cross-discipline user interface design lifecycle that integrates SE and HCI under the umbrella of agile development.

Keywords: Human-Computer Interaction, Usability Engineering, Extreme Programming, Agile Modeling, User-centered Design & Development (UCD).
1 Human-Computer Interaction and Software Engineering

From its birth in the 1980s, the field of human-computer interaction (HCI) has been defined as a multidisciplinary subject. To design usable systems, experts in the HCI arena are required to have distinct skills, ranging from an understanding of human psychology, to requirements modeling and user interface design (UID) [1]. In this article we will use the term user interface (UI) designer as a synonym for a professional who combines knowledge of usability, graphics and interaction design.

Table 1. Methods for integrating SE and UE, based on [2] (excerpt)

Integration issue | Method of application
Mediating and improving the communication lines between users, usability experts and developers | Use medium-weight artifacts, work with toolkits appropriate for collaborative design, talk the same language, work in pairs
Extending software engineering artifacts for UI specification & conceptualization | Use artifacts known by both professions and adjust their expressiveness
Extending RE methods for collecting information about users and usability | Include principles, practice and light- to medium-weight methods from HCI into RE
Representing design artifacts including prototypes using different formalisms | Apply prototyping as a method of participatory design; all stakeholders gather requirements
functional requirements are translated into a running system. HCI and SE are recognized as professions made up of very distinct populations. Each skill set is essential for the production of quality software, but no one set is sufficient on its own. The interaction layer is the area where HCI and SE are required to work together, in order to ensure that the resulting software product behaves as specified in the initial requirements engineering (RE). To provide a high level of UI usability, SE has to work with people with a background in HCI, but the course of collaboration is mostly unclear. Classic and agile SE methods therefore still lack integration of HCI methods and processes (see Table 1). Bearing these two different engineering disciplines in mind, each software design process can be characterized in terms of its dependency on its engineering orientation, ranging from a formal and model-based methodology to an informal explanatory design. SE tends to be more formal and “consequently, the business user and IT analyst may think that they both agree on a design, only to discover down the line that they had very different detailed implementations and behaviors in mind” [3]. Very formal or complex models are an inappropriate base for communication, especially so for collaborative design processes with high user- and business-stakeholder participation. Scenarios [4], known as user stories in Extreme Programming (XP) [5], and prototypes are recognized as an interdisciplinary modeling language for RE and as bridging techniques for HCI and SE [6]. In SE, scenarios – as a sequence of events triggered by the user – are generally used for requirements gathering and for model checking. HCI applies scenarios to describe software context, users, user roles, tasks and interaction [4]. Prototypes in SE are used to verify functional specifications and models.
Agile Modeling (AM) and XP recognize prototypes as a type of small release [5,7], whereas HCI mainly employs them for iterative UID [8]. The bottom line is that some informal methods of XP and AM are close to HCI practice and can therefore pave the way for a common course of action. While heavy-weight methods such as style guides (HCI) are far too expensive, light-weight methods such as essential use cases (SE) are in contrast too abstract for system specification. Cross-discipline agile methods are the best workable compromise. Agile approaches of both SE [5] and HCI [9,10] are therefore the interface for our common and balanced software lifecycle known as CRUISER.
2 From XP to Agile Cross-Discipline Software Engineering

In contrast to classic, heavy-weight SE processes like the V-Model, agile methods begin coding at a very early stage while having a shorter up-front RE phase. Following the paradigm of XP, implementation of code takes place in small increments and iterations, and the customer is supplied with small releases after each development cycle. During the exploration phase, teams write user stories in an attempt to describe user needs and roles. But the people interviewed need not necessarily be the end-users of the eventual software. XP therefore often starts coding based only on assumptions about end-user needs [10]. AM is less rigid than XP and takes more care over initial RE as it provides room for low-fi prototyping, activity diagrams or use-case diagrams [11]. Nevertheless, the analysis phase is finished as soon as requirements have been declared on a horizontal
level, because the iterative process assumes that missing information will be filled in at later stages. Development in small increments may work properly as long as the software is not focused on the UI. Changes to software architecture usually have no impact on what the user sees and interacts with. With the UI, however, it is a different story. Continual changes to the UI may give rise to conflicts with user expectations and learnability, cause inconsistency and finally lead to user dissatisfaction. Thus, agile development does not really qualify as user-centered design (UCD), but can function as one pillar for an integrated approach [10]. Both SE and UID have to cope with a shorter time-to-market, in which the quality of the delivered software must not suffer. This is therefore a great challenge both for management and for the methods and tools applied. Our idea is a balanced hybrid process, which is both agile SE and agile UCD, and which is consistent with the principles and practices of both disciplines. In order to identify interfaces between agile SE and agile HCI, we have to highlight different approaches to UID, analyze their agile potential and their different contributions to a cross-discipline process. Like XP, original UCD is a highly iterative process. It differs from agile methods, however, since real users are taken into account and the development team tries to understand user needs and tasks before any line of code is written. The lifecycles of usability engineering processes [4,12] provide numerous methods and tools that should support the designer in gathering all of the required information. Most of these methods are rated as heavy-weight, because they aim to analyze and document as much as possible about users, work flows, context, etc. right from the beginning.
Constantine [9] argues that UCD produces design ideas in a rather magical process in which the transformation from claims to design is neither comprehensible nor traceable. Such a “Black Box Designer” produces creative solutions without being able to explain or illustrate what goes on in the process. Furthermore, UCD tries to converge the resulting, often diverse, design alternatives into a single solution, which is then continuously evaluated and refined. UCD may therefore take a long time, or even fail, if too many users are involved and narrowing the design space is difficult. Iteration may create the illusion of progress, although the design actually goes round in circles and solutions remain elusive. Altogether, a one-to-one integration of UCD processes and methods is in general inappropriate for an agile course of action. Constantine’s usage-centered design approach takes up the basic philosophy of AM and concentrates on essential and easy-to-understand models. Through their application, HCI becomes more formal, but the simplicity of their syntax still enables collaborative design by engineering rather than by trial and error [9] (see Table 2). Although the list of usage-centered design success stories is creditable, the products praised tend to support user performance rather than user experience. This cannot be the only aspiration of a modern design approach, however. This is where Donald Norman’s recently proposed activity-centered design approach (ACD) [13] comes in. Products causing a high joy of use can reach great user acceptance even when they lack usability. Norman therefore argues for the integration of emotional design issues and a stronger consideration of user satisfaction. In Löwgren and Stolterman’s book about thoughtful interaction design (TID) [14], the designer, in order to design such highly usable and aesthetic systems, switches between three levels of abstraction: vision, operative image and specification.
If the designer is confronted with a design situation, at first an often sketchy and diffuse vision emerges. Frequently, several visions are promising and are therefore competing to be implemented, eventually resulting in a chaos of conflicting visions. The initial
version of the operative image is the first externalization of the vision, e.g. captured in mock-ups or elaborated interactive (hi-fi) prototypes. It enables manipulation, stimulation, visualization and decision making for the most promising design. The designer wants to learn as much about the design space as possible, narrowing the design towards the best solution as late as possible. The operative image is transformed into a (visual) specification of the final design once it is sufficiently detailed. Table 2 shows a comparison of the design approaches under discussion. Our development lifecycle builds on the core methods of all the approaches presented, such as selective user involvement (UCD, ACD), prototyping for visual thinking (TID), as well as modeling with scenarios or task maps (usage-centered design).

Table 2. Comparison of user interface design approaches, adapted from [9]

User-Centered Design | Usage-Centered Design | Activity-Centered Design | Thoughtful Interaction Design
Focus is on users | Focus is on usage | Focus is on activities | Focus is on design
Substantial user involvement | Selective user involvement | Authoritative user involvement | Thoughtful user involvement
User studies; Particip. Design; User testing | Explorative modeling; Model validation; Usability inspections | |
All designers in a project need to have a similar understanding of the vision and the wholeness of the system (TID). Thus continuous and lively discussion is necessary (XP). Informal communication across organizational borders should be easy, and teams should have common spaces (XP). Since reaching agreement on abstract notions (text) is difficult, ideas have to be made visible, allowing participants to look at, feel, analyze and evaluate them as early as possible (XP, AM). The process should be controlled by an authoritative person who must have a deep understanding of both SE and HCI. With our demand for such highly capable personnel, we concur with what XP and AM announced as one of their most important criteria for project success [5]. The leader navigates through the development process, proposes solutions to critical design issues and applies the appropriate design, engineering and development methods. Since the gap between SE and HCI becomes less significant “when the (HCI) specialist is also a strong programmer and analyst” [2], we chose XP as fundamental to our thoughts on bonding SE and HCI. Its principle of pair programming allows people with different fields of expertise, but common capabilities, to design a system together.
The basis of our cross-discipline lifecycle is therefore the identification of similarities between XP and HCI (see Table 3), AM and HCI (see Table 4), as well as ACD and TID when compared to HCI, AM and XP (see Table 5). We outline some major similarities, although our comparison highlighted many more interfaces between these disciplines. Although different in their wording, agile principles and practices are comparable and show a significant overlap, such as in iterative design, small releases and prototyping, story cards of active stakeholder participation and scenarios, or testing and evaluation. Modern UID approaches do not oppose collaboration with SE; on the contrary, they underline the commonalities.

Table 3. Similarities between XP and HCI (excerpt)

XP Practice | HCI Practice
Iteration, Small Increments, Adaptivity | Prototyping
Planning Game | Focus Groups
Story Cards, Task Cards, User Stories | Scenarios, User Profiles, Task Model
Table 4. Similarities between AM and HCI (excerpt)

Agile Modeling Practice | Usability Engineering Practice
Prove It With Code | Prototyping
Create Several Models in Parallel | Concurrent Modeling
Active Stakeholder Participation | Usage-Centered Design, User Participation
Consider Testability | Evaluation, Usability Inspections
Table 5. Overall comparison of agile SE, usual HCI and other practice (excerpt)

AM & XP Practice | HCI Practice | TID & ACD Practice
Minimalist documentation | Comprehensible models | Interactive representations
Show results early | Lo-/Hi-Fi prototyping | Make ideas visible asap
Small teams, design rooms | Design rooms, style guides | Informal communication
Active stakeholder part. | Collaborative design | Externalization of visions
User performance | User performance, user experience | User performance, user experience, hedonic quality
3 Agile Cross-Discipline User Interface Design and Software Engineering Lifecycle

Our agile cross-discipline user interface and software engineering lifecycle, called CRUISER, originates in our experience of developing various kinds of interactive
software systems in teams with up to 20 members [16]. Although CRUISER is based on XP, we firmly believe that our lifecycle scales to larger teams, bearing in mind success stories of agile development with several hundred team members [17] and within large organizations [18]. For the following explanation of CRUISER, we concentrate on those issues that need to be worked out collaboratively by HCI and SE experts. SE practices that are independent of UID are not discussed in detail. CRUISER starts with the initial requirements up-front (IRUP, see Table 6), which must not take longer than the claims analysis in XP. The agile timeframe can be preserved if the methods employed can be rated as agile (see Tables 3, 4 and 5) and interdisciplinary. Concerning the design of the UI, XP and AM practice is not sufficient and has to be endorsed by UID practice and authoritative design (TID, ACD).

Table 6. CRUISER initial requirements up-front; contributions of disciplines

Initial Requirements Up-Front (IRUP)
Agile SE | Human-Computer Interaction | Authoritative Design
Use Cases, Usage Scenarios; Technical Requirements; User Performance Goals | Role & Task Model; User-, Task-, Interaction Scenarios; Essential Use Cases; UI Patterns; User Experience Goals |
As discussed in Section 2, the real users have to be taken into account rather than just stakeholders of any kind. Appropriate cross-discipline methods for analyzing user needs are role models and task models. The model-based RE proposed by Constantine [9] focuses on surveying essential information and satisfies an agile course of action due to the use of index cards. The user roles are prioritized (Focal User Roles) and sorted in relation to their impact on product success. Finally, essential use cases describe user tasks and enable the building of a task model and a task map. Like user roles, task cases are sorted in accordance with Kent Beck’s proposal, which is “required - do first, desired - do if time, deferred - do next time”, whenever the necessary scenarios are established for understanding and communication. For a shared understanding among developers and for communication with stakeholders, all models are translated into scenarios, which can focus on different aspects of UID (users, tasks, interactions). Since agile methods do not consider the UI in detail, they do not recognize extensive style guides as used in HCI practice. We therefore suggest light-weight style guides that are shorter, more relevant and contain UI patterns [19]. They ease the design process by providing design knowledge and experience (AM: Apply Design Standards, Use Existing Resources). During all IRUP assignments, users, HCI, SE and business personnel support and finalize RE with initial discussions about scenarios and design alternatives. This alone will result in various outline visions such as mockups or prototypes that make up the initial project design space. In contrast to other HCI lifecycles (e.g. [12]), CRUISER envisions the externalization of design visions even before the requirements analysis is finished. In our opinion, this break with common HCI practice enables the UI designer to decide
very early about the degree of user involvement and the necessity of more innovative solutions. He can have a considerable influence on balancing user performance, user experience and hedonic quality demands and can guide the IRUP accordingly. The second phase of the development process is the initial conceptual phase (ICP, see Figure 1). In the ICP we envisage a separation of ongoing UI prototyping from architectural prototyping whenever possible to speed up the process. The conscientious application of software patterns [19] facilitates this procedure. The development of UI and system architecture can take place in parallel as soon as a minimalist, common UI specification [13] is generated and the necessary interfaces are identified. Dependencies between UI and system architecture can be found with the help of task cases and scenarios established during IRUP. It is very likely that highly interactive UIs will have greater impact on the system architecture.
Fig. 1. CRUISER initial conceptual phase
As discussed, prototypes are common practice in HCI and SE. The overall purpose of the ICP is therefore the generation of more detailed and interactive prototypes for narrowing the design space towards a single solution through discussion with stakeholders and through scenario refinement [3]. For this assignment, the designer must leap between abstract and detailed levels of prototyping, always considering a timeframe and expressivity suitable for an agile project environment (see Table 7). Bearing in mind the claims of agile methods, prototypes should be easy to work with and, above all, quick to produce and easy to maintain. With more interactive and complex external representations, the designer conducts a dialogue about design solutions and ideas. Prototypes that are visually more detailed help us to overcome the limitations of our cognitive abilities to process, develop, and maintain complex ideas and to produce a detailed operative image (TID). As long as the prototype can be modified using simple direct manipulation techniques, the users can be proactively involved in the participatory process. In addition to low-fi prototyping for e.g. conceptual design, a modern UID approach must also provide methods and tools for hi-fi prototyping that overcomes most of the disadvantages mentioned in Table 7. We
CRUISER: A Cross-Discipline User Interface and Software Engineering Lifecycle
recommend prototyping tools such as Macromedia Flash and iRise Studio. They are easy to use for all stakeholders due to the absence of coding, they allow reuse of components through the application of patterns or templates, and they produce running interactive simulations that can be enhanced to small releases.

Table 7. Low- and High-Fidelity Prototyping, based on [8] (excerpt)

Type: Low-Fidelity
  Advantages: less time & lower cost; evaluate multiple concepts; communication device; address screen layout issues
  Disadvantages: limited usefulness for usability tests; navigational and flow limitations; facilitator-driven; poor specification

Type: High-Fidelity
  Advantages: partial/complete functionality; interactive; use for exploration and test; marketing & sales tool
  Disadvantages: time-consuming to create; inefficient for proof-of-concept designs; blinds users to major representational flaws; management may think it is real
Interactive prototypes can also run as “Spike Solutions”, which are used to evaluate and prove the functionality and interoperability of UI concepts and system architecture. More importantly, they can be applied as visual, interactive UI specifications in the ensuing construction phase. Visual specifications are unambiguous and can guarantee the final system matches stakeholder expectations about UID and behavior. The prototyping-based process minimizes the risk of making wrong design decisions and leads the way towards a winning design solution. Through the well-balanced and thoughtful application of selected methods of RE, such as abstract modeling or detailed prototyping, CRUISER avoids design by trial-and-error and makes the design process move forward in a traceable manner. The process of identifying the most promising design solution is guided by UI evaluations, which can be kept at low complexity if the UE methods applied are agile [20]. In order to give due regard to the UI's hedonic qualities, such as the ability to stimulate or to express identity, we envision a design review with AttrakDiff [15].

On entering the construction and test phase (CTP), coding starts (see Figure 2). In this phase, the CRUISER lifecycle closely resembles the incremental and iterative manner of XP. CTP therefore begins with iteration planning and the creation of unit and acceptance tests, which are later used to evaluate parts of the system architecture (e.g. automatically) and the UI (e.g. with extreme evaluations [20]). The latter guarantee that the previously defined usability or hedonic quality goals are properly taken into account. They are only to be executed if a usability expert on the team identifies a need for them. We therefore recommend the integration of HCI personnel in the pair programming development.
As with the construction of prototypes, the actual coding of UI and system architecture again takes place in parallel, and components of the UI that have great impact may be developed faster initially and then later refined during the following iterations. As in XP, the CTP ends with the deployment of a small release. Before the next iteration starts, each small release can again be evaluated using cheap and fast methods [20]. If usability or hedonic quality issues are identified, they can also be
documented on index cards (“defect cards”). Each defect is assigned to its corresponding task case. The usability defects may be sorted and prioritized and thus reviewed during earlier or later iterations. If usability or design catastrophes occur, HCI and SE experts and stakeholders can decide on the necessary measures. The last step in the CRUISER lifecycle is the deployment phase. While users are working with the system, new functionality may be requested, or usability and design issues that were underrated during the iterations may be raised. The lifecycle therefore allows for a return to earlier phases to cater for such new requirements.
Fig. 2. CRUISER construction and test phase
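The defect-card bookkeeping described for the CTP can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of CRUISER itself: the names `Priority`, `DefectCard` and `plan_review` are hypothetical, and reusing Kent Beck's three-level ordering for defects (the text applies it to task cases) is our assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Kent Beck's three-level ordering, reused here for defects (assumption)."""
    REQUIRED = 1   # do first
    DESIRED = 2    # do if time
    DEFERRED = 3   # do next time


@dataclass(frozen=True)
class DefectCard:
    """One usability or hedonic-quality defect noted on an index card."""
    summary: str
    task_case: str       # the task case this defect is assigned to
    priority: Priority


def plan_review(cards):
    """Group defect cards by task case, most urgent first within each group."""
    by_task = {}
    for card in sorted(cards, key=lambda c: (c.priority, c.summary)):
        by_task.setdefault(card.task_case, []).append(card)
    return by_task


if __name__ == "__main__":
    # Hypothetical cards, invented purely for illustration.
    cards = [
        DefectCard("Unclear label", "Search catalogue", Priority.DESIRED),
        DefectCard("Cannot undo delete", "Edit order", Priority.REQUIRED),
        DefectCard("No results message", "Search catalogue", Priority.REQUIRED),
    ]
    for task_case, defects in plan_review(cards).items():
        print(task_case, [d.summary for d in defects])
```

Keeping each card tied to its task case means a defect can be scheduled into whichever iteration next touches that task, which is the traceability the lifecycle aims for.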
4 Summary
Our motivation was to take a step towards a cross-discipline procedure for software design with respect to agile movements. With the CRUISER lifecycle, we bridge HCI and SE based on the commonalities of both fields. Similarities can be found in basic principles and practices as well as among the methods and tools that are typically applied. CRUISER has important links to XP [5], but differs from it in many important aspects related to AM, HCI and beyond. For integrating all critical disciplines under the umbrella of one common lifecycle, we concur with the findings of interdisciplinary researchers and use scenarios and prototypes as fundamental artifacts propelling a design process with high involvement of users and stakeholders.
References
1. Pyla, P.S., Pérez-Quiñones, M.A., Arthur, J.D., Hartson, H.R.: Towards a Model-Based Framework for Integrating Usability and Software Engineering Life Cycles. In: Proceedings of Interact 2003, Zurich, Switzerland, September 1-3, IOS Press, Amsterdam (2003)
2. Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.): Human-centered software engineering – integrating usability in the development process, pp. 3–14. Springer, Heidelberg (2005)
3. Zetie, C.: Show, Don’t Tell - How High-Fidelity Prototyping Tools Improve Requirements Gathering, Forrester Research Inc. (2005)
4. Rosson, M.B., Carroll, J.M.: Usability engineering: scenario-based development of human computer interaction. Morgan Kaufmann, San Francisco (2002)
5. Beck, K.: Extreme Programming Explained. Addison-Wesley, London, UK (1999)
6. Sutcliffe, A.G.: Convergence or competition between software engineering and human computer interaction. In: Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.) Human-centered software engineering – integrating usability in the development process, pp. 71–84. Springer, Heidelberg (2005)
7. Blomkvist, S.: Towards a model for bridging agile development and user-centered design. In: Seffah, A., Gulliksen, J., Desmarais, M.C. (eds.) Human-centered software engineering – integrating usability in the development process, pp. 219–244. Springer, Heidelberg (2005)
8. Rudd, J., Stern, K., Isensee, S.: Low vs. high fidelity prototyping debate, Interactions, vol. 3(1), pp. 76–85. ACM Press, New York (1996)
9. Constantine, L.L.: Process agility and software usability: Toward lightweight usage-centered design, Information Age, vol. 8(8) (August 2002)
10. Gundelsweiler, F., Memmel, T., Reiterer, H.: Agile Usability Engineering. In: Keil-Slawik, R., Selke, H., Szwillus, G. (Hrsg.) Mensch & Computer 2004: Allgegenwärtige Interaktion, pp. 33–42. Oldenbourg Verlag, München (2004)
11. Ambler, S.W.: Agile Modeling. John Wiley & Sons, New York (2002)
12. Mayhew, D.J.: The Usability Engineering Lifecycle - A Practitioner's Handbook for User Interface Design. Morgan Kaufmann, San Francisco (1999)
13. Norman, D.: Human-Centered Design Considered Harmful. Interactions 12(4), 14–19 (2005)
14. Lowgren, J., Stolterman, E.: Thoughtful Interaction Design: A Design Perspective on Information Technology. MIT Press, Cambridge, MA (2004)
15. Hassenzahl, M., Platz, A., Burmester, M., Lehner, K.: Hedonic and Ergonomic Quality Aspects Determine a Software’s Appeal. In: Proceedings of the CHI 2000, Conference on Human Factors in Computing, The Hague, NL, pp. 201–208 (2000)
16. Limbach, T., Reiterer, H., Klein, P., Müller, F.: VisMeB: A visual Metadata Browser. In: Rauterberg, M. pp. 993–996. IOS Press, Amsterdam (2003)
17. Eckstein, J.: Agile Software Development in the Large: Diving Into the Deep. Dorset House Publishing Co., Inc. New York (2004)
18. Lindvall, M., Muthig, D., Dagnino, A.: Agile Software Development in Large Organizations. Computer 37(12), 26–34 (2004)
19. Borchers, J.: A Pattern Approach to Interaction Design. John Wiley & Sons, New York (2001)
20. Gellner, M., Forbrig, P.: Extreme Evaluations – Lightweight Evaluations for Software Developers. In: IFIP Working Group 2.7/13.4, editor, INTERACT 2003 Workshop on Bridging the Gap Between Software Engineering and Human-Computer Interaction (2003)
Interface Between Two Disciplines - The Development of Theatre as a Research Tool
Maggie Morgan and Alan Newell
School of Computing, University of Dundee, Scotland, DD1 4HN
[email protected]
Abstract. Dundee University’s School of Computing is researching technology for older users, whose difficulties with technology often exclude them from its benefits. This paper discusses the problems raised in consulting potential users who feel they do not understand technology and are anxious about using it. How should technologists and designers convey to this clientele the somewhat abstract concepts of ‘what might be developed’ and how it might affect the users’ quality of life? How could they keep the focus of discussion while giving the older people the confidence to be truthful? Experiments with video and live theatre in consulting with older users, in requirements gathering and in the evaluation of designs are described. This paper addresses the process of transforming scientific data into appropriate and useful ‘stories’ to the satisfaction of both writer and researchers; the role of actors and facilitator; the impact on the ‘extreme users’ in the audience; and the data thus gained by the researchers.
to look after people as they become increasingly frail. Technology should have an important role in improving an old frail person’s quality of life, giving him/her more control over his/her environment, and in giving support to the carers. In order for such technology to be successful, however, older people should be consulted as part of any design process [3].
2 Problems of Consultation
Consulting older people about the design of potential technology raises a number of questions:
− How do you translate rather abstract scientific concepts into a ‘reality’ that older people can relate to and apply to their own lives?
− How can you make older people really understand a piece of technology that has not yet been developed?
− How can you make it easier for older people to be critical? They often do not want to ‘upset’ designers and their responses aim to please. How can you create a ‘safe’ method of lively discussion between older people and designers, without the older people feeling intimidated and ashamed of their ‘ignorance’, or the designers being either frustrated or unwittingly patronising?
3 The Introduction of Drama
The School of Computing is experimenting with using drama, both video and live theatre, to address these problems [7]. This is based on the following premises:
− Theatre, whether live or on video, has the ability to ‘pretend’, so undeveloped technology can be presented as real and working.
− Scientific concepts and novel technology, with their esoteric language and jargon, can be translated into everyday life. This enables the audience to apply them to their own situation, thus facilitating significant information transfer between researchers and older users.
− Stories, with ‘real’ characters, with whom the audience can identify, help the audience engage with problems and questions encountered [4,5,11].
− All discussion, debate and criticism are focussed on the story and the characters; no-one is going to be offended. This enables both older people and designers to discuss, argue, inform and share needs and experience in a very safe way.
− This very safety helps older people and designers to draw on and share their experiences. This can be particularly useful in an area where individual needs and disabilities are subject to very wide variation.

The roles of researchers, writers, actors and facilitators within this process are all very important, and will be discussed later in this paper.

3.1 Maggie Morgan
The Scotland-based Foxtrot Theatre Company, which specialises in interactive forum theatre, provided Maggie Morgan, a theatre writer, director and interactive theatre
facilitator, to work with researchers, write scripts and produce video for two research projects within the School of Computing. The success of these resulted in her being awarded a Leverhulme Art-in-Residence Fellowship for the academic year 2005-6, with the remit to further develop the role of theatre as a research tool within computing.

3.2 The Fall Mentoring Project – Requirements Gathering Using Video
A group of researchers were developing a mentoring system that used video cameras within an older person’s home to detect falls. The pictures would be transmitted to a computer, which would alert a carer if it detected the person suffering a fall [6]. People’s initial reaction to the idea of having cameras in the home can be completely negative, but this is perhaps an uninformed judgement. To address this issue in more depth, Morgan and the researchers devised four different situations which would inform the viewers, open up wider discussion, and provide valuable data for the researchers. Videos of these scenarios were then made using professional actors and video engineers. The four brief video scenes were:
− An older man rushes to answer the door bell, trips and falls; there is no monitor in his house to detect the fall.
− An older woman who has a monitor in her room reaches up to dust, loses her balance and falls. She is shocked and cannot get up. The monitor registers the fall, and soon someone arrives, having been alerted.
− False alarm: an older woman with a monitor drops a jigsaw and gets down on the floor to pick up the pieces. The monitor registers this and alerts her daughter, who rings her immediately. Music is playing, so the mother does not hear the phone for a long time; the daughter rushes out from an important meeting and arrives to find her mother enjoying her jigsaw. She is both relieved and frustrated!
− A daughter, talking to her father, describes the monitor her mother-in-law has, and says that it has somewhat eased the burden of checking up on the old lady. The conversation is interrupted by a phone message from the computer connected to the mother-in-law’s monitor. The computer is letting her know that, although the old lady has not fallen, she is not moving around as usual. The daughter-in-law rings to check whether she might be ill. It’s OK! It is Wimbledon fortnight, the old lady is a tennis enthusiast and is hardly moving from the television! There is relief but some irritation, but the father comments to his daughter that she might be very glad of this function some day.

Pauses were built into these video scenes so that the audience could comment on and discuss each scenario. The researcher facilitating the discussions, who had been trained in facilitation by Morgan, was able to answer questions about what happened to the ‘pictures’ the cameras took and how carers might be alerted. Audiences varied from relatively fit older people living independently in sheltered housing or their own house to very frail old people who needed a lot of care in order to stay at home and who came together at Day Centres. One audience consisted of a group of professional carers. Each audience brought its own experiences and perspectives; among the
topics covered were anxieties about privacy; what support systems were already in use and how effective these were; anxieties about falling or becoming ill and this being detected; where falls were most likely; how their individual activities differed and false alarms. The narrative form of the video clips engaged the audience and kept the focus of the discussion. Using drama was found to be an extremely useful method of provoking discussion at the pre-prototyping stage and provided many insights that we believe would not have been obtained without such techniques being used. This confirms the comments made by Sato & Salvador [13] that human centred stories lead to a more detailed discussion and that the drama provides a point of contact, which makes the evaluative task much easier. Although Strom [14] reported that he found it difficult to combine large or dramatic consequences with the exploration of an interface, this was not an issue in this piece of research.

3.3 The UTOPIA Trilogy Video – An Attitude Changing Exercise Using Video
A similar technique to the above was used to produce narratives for discussion aimed at communicating the essential findings of the UTOPIA group (Usable Technology for Older People: Inclusive and appropriate) [2] to designers of technology for older people. During the research phase of the project, which included discussions with individuals and groups of older people, important data emerged concerning older people’s problems with language; anxiety; assumptions of knowledge that they in fact lacked; confusing software and the increase of disabilities with aging. Designers, however – usually young – found it difficult to conceive of people who were totally unfamiliar with basic modern technology. Three videos were produced, which focussed on: installing a web camera, a completely novice user attempting to use email, and a first time user of a mobile telephone [15].
The video stories were viewed and discussed by several audiences: some consisting of designers and engineers, some of older people, and others mixed. Changes in audience attitudes were measured by identical questionnaires about perceptions of older people, filled in before the viewing and at the end of the event. Each performance provoked lively discussion and proved very enjoyable. Significant changes in attitude were noted in all audiences who viewed these videos [1].

3.4 The Rice Digital Television Project – Live Theatre for Requirements Gathering
Rice, a researcher in the Dundee University School of Computing, used focus groups in his initial requirements gathering for the design of a home telecommunication system for older adults, and subsequently used live interactive theatre as a method of holding in-depth discussions with large groups of older people [12]. Although digital television and its possible applications are very topical, many people, particularly older people, understand neither how digital TV works nor what its potential uses are, especially those which could enhance the quality of life of older people. The potential uses of digital TV examined were a ‘chatting’ service (communication between homes via a camera), a ‘scrap book’, and a reminder service. The problems of describing technology that had not yet been developed, and therefore of ascertaining
how desirable or useful it might be seen to be, were solved by the ability of theatre to ‘pretend’. A ‘multi-media’ production was scripted, developed and produced, using professional actors, on-stage props, and the projection of DVD onto a back screen. The situations chosen were those frequently found in real-life experience: children and grandchildren living at a distance; having to move from the family home to a smaller place; becoming more forgetful. The creation of characters in life-like situations resulted in a ‘reality’ with which older audiences could identify and empathise, directly relating the action to their own experiences and expectations. The discussion was enhanced even further when the characters, i.e. the actors who remained in role, took part in the discussion with the audience. The characters bore the brunt of being unsure of the role of the technology and finding the possible disadvantages, but also discovered how it might help their human situation.

The performances, and all the audience interaction, were conducted in a purpose-designed studio theatre within the School of Computing [9] and were recorded using four cameras and a high quality sound system. This ensured that all the interaction within the audience was faithfully recorded; the recordings were subsequently transcribed. This provided extensive data, which was extremely useful both in the decision-making process for, and in the detailed development of, digital television applications.
4 Experiments with Combining Video and Live Theatre
Live theatre has a big impact, but the full rehearsed performance is not always feasible both practically and financially. We therefore also experimented with a mixture of video clips and live theatre. Showing a video clip was followed by the actors in that clip being present ‘in role’ to dialogue with the audience. The aim of the viewings was to measure change in attitude towards older people and technology with three audiences – undergraduate students, post-graduate students and professionals at an HCI conference. The undergraduate and post-graduate students reported that, although the video was interesting and informative, being able to question and discuss with the ‘live’ characters had more impact. The response of the professional audience at the HCI conference [10], who were not specialists in designing for older people, was very mixed, but again the session with the actors stimulated a huge amount of discussion and argument and made the session highly memorable for the audience. With all three very different audiences, the fact that the characters were actually actors liberated everyone to say what they really thought. The ‘characters’ were highly believable and convincing, but the audience could attack the characters, knowing that the actors were not going to take their comments personally.
5 Continuing Use of Theatre in Technological Research
Plans are already being implemented by a group of researchers from four Scottish Universities, and involving “telecare”, health and social work stakeholders, to use live theatre for requirements gathering, evaluation and inter-communication among
audiences of older people, formal and informal carers, designers and engineers, health and social work professionals. Two different formats for discussion following the performances will be tested, and the results of this methodological experiment will be reported at the conference at HCI 2007.

5.1 How Does It Actually Work?
The essential constituents of Interactive Forum Theatre are the quality of:
• the script,
• the performance,
• the facilitation of interaction with the audience, and
• the use of appropriate interaction techniques.
5.1.1 The Script – The “Story of an Interface”
The script must be the result of thorough collaboration between researchers and writer. The task of the researchers is to convey their aims accurately and clearly to the writer: the questions they want answers to and/or the information they wish conveyed. The writer’s task is to understand clearly the aims of the researchers, and to translate their research issues into the form of a story.

This interaction between researchers and writer may sound simple, but is in fact complex. The researchers may well be anxious about their measured scientific data being rendered inaccurately, and they may find the whole process very alien. Researchers with no experience of this method may feel a lack of trust both in the process and in the writer. The writer, on the other hand, may find their technical jargon impenetrable, and have to ask many ‘idiot’ questions in order to understand what is really required. The writer has to produce a good story that will work dramatically in performance: how can this be reconciled with scientific data and analyses? The writer too can feel frustrated if the researchers seem not to understand what (s)he is trying to do and are even suspicious of the process.

The process, however, gradually builds up a rapport between researchers and writer. The writer goes through several stages of composition: one or more outline ideas, then a first draft of the script, then a second draft, then a ‘working draft’ that the director and actors can begin to rehearse with. At each stage, the writer’s outlines and scripts are referred back to a working group of the researchers for checking. The writer needs to be clear about her limitations and continually ask the researchers to amend or suggest. For example, when audience responses to technological help in the home are needed, what pieces of technology would the researchers like to see in the story? What are the questions they would like asked around this piece of technology? How would the character make this work? An older person might have disabilities to take into account when operating it. Or even, how might you persuade an older person that this facility would really benefit them?

Alien as this process may seem to traditional theatre, the structure of a dramatised story is actually very appropriate. Tension and conflict are needed to achieve drama: characters resisting or struggling with pieces of technology introduces tension and raises questions, and, as with all HCI technology, the interface is with human beings, with their own psychologies, knowledge and context. Theatre can create the “story of an interface”, where an audience can look at a piece of technology, its possible
usefulness, design and usability, and how a human being interacts with it, the human being having attitudes, emotions, physical difficulties and needs.

5.1.2 The Actors
Only professional actors have been used in the experiments reported. Minimal costume and only essential props were used, and the actors were physically very close to the audience. This form of theatre requires experienced professional actors who can take direction and immediately, or almost immediately, produce three-dimensional, believable characters. The actors used in this interactive work are also experienced in interactive theatre, are able to ‘suspend disbelief’ and can engage an audience without the normal technical aids of a full theatre production.

The actors were very well briefed on the aims of the theatre; the way the pieces of technology were supposed to ‘work’; how they might relate to the lifestyle and needs of the character; and what questions might arise in the audience that they might have to react to. It was extremely useful for one or more researchers to be present for some of the rehearsals. Questions inevitably arise about the technology during rehearsal, and a researcher can supply the information and explanation the actors need. This also assures the researchers that they still have control over the project and that their research is being respected in detail. For example, if a character is being ‘hot-seated’ (questioned ‘in role’ in a dialogue with the audience), (s)he needs to be well versed in the character’s own story and circumstances and also in the issues around the piece of technology. Other dramatic possibilities with this format include the audience being able to redirect a character in the story. For example, one of the characters in the story may have explained the technology in a way that is either incomprehensible or patronising to the older person. The audience can then be given the opportunity to replay that part of the story to see the effect of a different approach to the challenge of communicating technology to older people.

5.1.3 The Director and Facilitator
The director needs to thoroughly understand the research aims and brief the actors as they rehearse. The director and facilitator have to be as well briefed as the writer. In the case of the work reported here, the writer was also the director and facilitator. If this is not the case, the writer, director and facilitator must work very collaboratively. The facilitator’s role is crucial. (S)he must:
• Thoroughly understand the issues which the researchers need investigated.
• Explain clearly and simply to the audience how the process will work and how the facilitator will enable them to interact.
• Particularly with older people, but in fact with any audience, hold a brief, relaxed ‘warm up’ session to begin the process of audience members responding and beginning to focus, and to establish the rapport between facilitator and audience.
• At the ‘Pauses’ for interaction, guide the audience through the techniques appropriate at that point.
• Ask questions that are as open as possible, and accept contributions from the audience unconditionally. No one should be made to feel belittled by a facilitator’s response.
• Frequently repeat or paraphrase what an audience member has just said, both to reinforce the point and to make sure everyone in the audience has heard.
• Where conflicting attitudes and perspectives come from the audience, briefly sum up the divergence, with respect, which often moves the discussion on. The different perspectives are aired and heard by everyone, but there is the safety of the differences being projected onto the characters and the situation in the story.
• If the focus of the discussion is being lost, regain the focus by referring back to the story.

5.1.4 Co-facilitation
In some projects it is appropriate to have a co-facilitator who is a member of the research team. Whenever scientific issues or queries arise, the main facilitator can call on the co-facilitator to supply the information. If the researcher/co-facilitator thinks an important issue or question is being missed in the discussion, (s)he can raise this with the audience. This method of co-facilitation worked well [12].

5.2 Focus
The performance of the story maintains the focus of the discussion, the characters bear the brunt of any negative comments, the audience increasingly engages and feels comfortable joining in, and a great deal of data emerges from the discussion. The whole process can be recorded unobtrusively (though with permission) for subsequent transcription and analysis.

5.3 Cost
Video and live theatre are both extremely useful for engaging and informing an audience and stimulating lively discussion. They can be used for requirements gathering and evaluation with large groups of people at a time. The impact of live theatre and the ability of the audience to respond, and often directly interact with the characters, should not be underestimated. If a video is used, the discussions following the viewing need to be as well facilitated as those in live performances, though obviously there is no direct interaction with the performers.
The balance of costs between producing a DVD and live performances depends on the number of performances planned. To be economical, live performances need to be put on close together, so that the actors are employed for a single period that includes only one rehearsal period. If the presentations are spread out in time, re-briefing and re-rehearsal of the actors will be needed. The cost of producing a good quality video can be up to five times the cost of producing a series of between two and five live performances within a single run of productions, but if researchers wish to use the performance many times, at intervals and in different places, the initial cost of a video may be more economical. A useful compromise, where performances have to be at intervals, is to make a video and have at least one of the actors present in character for dialogue with the audience. This means that the actor(s) do not need a rehearsal period prior to the performance.
M. Morgan and A. Newell
6 Conclusions: the Appropriateness of Theatre for HCI

The work reported has shown that theatre can be very effective in many stages of the development of technology. There is a logic to the use of theatre in HCI research. Human needs and wants should be the starting point, with researchers frequently needing to consult potential users at the earliest stage, and theatre provides a very effective communication method. Once technological ideas begin to be developed, further consultation is needed with potential users. At the pre-prototype stage, theatre is particularly useful in helping the researchers create a ‘reality’ in which we imagine these devices being used, while raising questions about the appropriateness of the design for older people’s life situations and about its usability by people who are unsure about technology and slower to learn than when they were younger. An interactive performance essentially provides a very flexible ‘virtual’ world in which an audience can play with novel technology and concepts.

Acknowledgements. The work reported has been supported by the Scottish Higher Education Funding Council, the Engineering and Physical Sciences Research Council, and the Leverhulme Trust.
References

1. Carmichael, A., Newell, A.F., Dickinson, A., Morgan, M.: Using theatre and film to represent user requirements. Include, Royal College of Art, London (April 5-8, 2005)
2. Dickinson, A., Eisma, R., Syme, A., Gregor, P.: UTOPIA: Usable Technology for Older People: Inclusive and Appropriate. In: Brewster, S., Zajicek, M. (eds.) A New Research Agenda for Older Adults, Proc. BCS HCI, London, pp. 38–39 (2002)
3. Eisma, R., Dickinson, A., Goodman, Mival, O.J., Syme, A., Tiwari, L.: Mutual inspiration in the development of new technology for older people. In: Proc. Include 2003, London, pp. 7:252–7:259 (March 2003)
4. Grudin, J.: Why Personas Work – the psychological evidence. In: Pruitt, J., Adlin, T. (eds.) The Persona Lifecycle, keeping people in mind throughout product design, Elsevier (In press)
5. Head, A.: Personas: Setting the stage for building usable information sites. Online 27(4), 14–21 (2003)
6. Marquis-Faulkes, F., McKenna, S.J., Gregor, P., Newell, A.F.: Gathering the requirements for a fall monitor using drama and video with older people. Technology and Disability 17(4), 227–236 (2005)
7. Newell, A.F., Carmichael, A., Morgan, M., Dickinson, A.: The use of theatre in requirements gathering and usability studies. Interacting with Computers 18, 996–1011 (2006)
8. Newell, A.F., Gregor, P.: User sensitive inclusive design in search of a new paradigm. In: Scholtz, J., Thomas, J. (eds.) CUU 2000, Proc. First ACM Conference on Universal Usability, USA, pp. 39–44 (2000)
9. Newell, A.F., Gregor, P., Alm, N.: HCI for older and disabled people in the Queen Mother Research Centre at Dundee University, Scotland, CHI 2006 Montreal, Quebec, Canada, 22-27 April 2006, pp. 299–303 (2006)
10. Newell, A.F., Morgan, M.: The use of theatre in HCI research. In: “Engage” 20th Annual BCS HCI Conference, University of London (September 11-15, 2006)
11. Pruitt, J., Grudin, J.: Personas: Practice and Theory. In: Proceedings DUX 2003, CD ROM, 15 (2003)
12. Rice, M., Newell, A.F., Morgan, M.: Forum Theatre as a requirement gathering methodology in the design of a home telecommunication system for older adults, Behaviour and Information Technology (In press)
13. Sato, S., Salvador, T.: Playacting and Focus Troupes: Theatre Techniques for creating quick, intensive, immersive and engaging focus group sessions, Interactions, pp. 35–41 (September-October, 1999)
14. Strom, G.: Perception of Human-centered Stories and Technical Descriptions when Analyzing and Negotiating Requirements. In: Proceedings of the IFIP TC13 Interact 2003, Conference (2003)
15. Utopia Trilogy can be downloaded from: http://www.computing.dundee.ac.uk/projects/UTOPIA/utopiavideo.asp
Aspects of Integrating User Centered Design into Software Engineering Processes

Karsten Nebe¹ and Dirk Zimmermann²

¹ University of Paderborn, C-LAB, 33098 Paderborn, Germany, [email protected]
² T-Mobile Deutschland GmbH, Landgrabenweg 151, 53227 Bonn, Germany, [email protected]
Abstract. Software Engineering (SE) and Usability Engineering (UE) both provide a wide range of elaborated process models to create software solutions. Today, many companies have realized the need for usable products and understood that a systematic and structured approach to usability is as important as the process of software development itself. However, in both theory and practice it remains difficult to incorporate UE methods efficiently and smoothly into established development processes. One challenge is to identify integration points between the two disciplines that allow close collaboration with acceptable additional organizational and operational effort. The approach presented in this paper identifies integration points between software engineering and usability engineering on the level of process models. The authors analyzed four different software engineering process models to determine their ability to create usable products. To this end, the authors synthesized demands of usability engineering and performed an assessment of the models.

Keywords: Software Engineering, Usability Engineering, Standards, Models, Processes, Integration, Assessment.
1.1 Software Engineering

Software engineering is a discipline that adopts various engineering approaches to address all phases of software production, from the early stages of system specification up to the maintenance phase after the release of the system ([15], [18]). Software engineering tries to provide a systematic and plannable approach for software development. To achieve this, it provides comprehensive, systematic and manageable procedures: so-called software engineering process models (SE Models). SE Models usually define detailed activities, the sequence in which these activities have to be performed, and the resulting deliverables. The goal of SE Models is to define a process where the project achievement does not depend on individual efforts of particular people or fortunate circumstances [5]. Hence, SE Models partially map to process properties and process elements and add concrete procedures. Existing SE Models vary with regard to specific properties (such as type and number of iterations, level of detail in the description or definition of procedures or activities, etc.), and each model has specific advantages and disadvantages concerning predictability, risk management, coverage of complexity, generation of fast deliverables and outcomes, etc. Examples of such SE Models are the Linear Sequential Model (also called Classic Life Cycle Model or Waterfall Model) [16], Evolutionary Software Development [12], the Spiral Model by Boehm [1], or the V-Model [9]. Software engineering standards define a framework for SE Models on a higher abstraction level. They define rules and guidelines as well as properties of process elements as recommendations for the development of software. Thereby, standards support consistency, compatibility and exchangeability, and cover the improvement of quality and communication. The ISO/IEC 12207 provides such a general process framework for the development and management of software [7].
It defines processes, activities and tasks and provides descriptions of how to perform these items on an abstract level. Thus, there is a hierarchy of different levels of abstraction for software engineering: standards define the overarching framework, and process models describe systematic and traceable approaches for the implementation. All these levels put the focus on system requirements and system design.

1.2 Usability Engineering

Usability Engineering is a discipline that is concerned with the question of how to design software that is easy to use. Usability engineering is “an approach to the development of software and systems which involves user participation from the outset and guarantees the efficacy of the product through the use of a usability specification and metrics.” [4]. Usability engineering therefore provides a wide range of methods and systematic approaches to support the development process. These approaches are called Usability Engineering Models (UE Models). Examples are Goal-Directed-Design [2], the Usability Engineering Lifecycle [11] or the User-Centered Design-Process Model of
IBM [6]. They describe an idealized approach to ensure the development of usable software, but they usually differ in their details, in the applied methods (the “how?”) and in the general description of the procedure (the “what?”, e.g. phases, dependencies, goals, responsibilities, etc.) [19]. Usability engineering provides standards which are similar to the idea of software engineering standards. They also serve as a framework to ensure consistency, compatibility, exchangeability, and quality. However, usability engineering standards focus on the users and the construction of usable solutions during the development of software solutions. Examples of such standards are the DIN EN ISO 13407 [3] and the ISO/PAS 18152 [8]. The DIN EN ISO 13407 introduces a process framework for the human-centered design of interactive systems. Its overarching aim is to support the definition and the management of human-centered design activities. The ISO/PAS 18152 is based on the DIN EN ISO 13407 and describes a reference model to measure the maturity of an organization in performing processes that make usable, healthy and safe systems. Thus, there exists in usability engineering a hierarchy of abstraction levels similar to that in software engineering: standards define the overarching framework, and process models describe systematic and traceable approaches for the implementation. However, usability engineering puts the focus on creating usable and user-friendly systems instead of on system requirements and system design.

1.3 Relationship of Standards, Models and Operational Processes

In general, standards and models are seldom applied directly, neither in software engineering nor in usability engineering. Standards merely define a framework to ensure compatibility and consistency and to set quality standards.
Models are adapted and/or tailored according to the corresponding organizational conditions, such as existing processes, organizational or project goals and constraints, legal policies, etc. Accordingly, the models are detailed by the selection and definition of activities, tasks, methods, roles, deliverables, etc. as well as of responsibilities and the relationships between them. The derived instantiation of the model, fitted to the organizational aspects, is called the software development process (for SE Models) or the usability lifecycle (for UE Models). Thus, the resulting Operational Process is an instance of the underlying model and the implementation of activities and information processing within the organization. This applies to both software engineering and usability engineering. Thus, there is not just a single hierarchy of standards and models but an additional level of operational processes for software engineering as well as for usability engineering. Standards define the overarching framework, models describe systematic and traceable approaches, and on the operational level these models are adjusted and put into practice (Figure 1). In order to achieve sufficient alignment between the two disciplines, all three levels have to be considered, to ensure that the integration points and suggestions for optimized collaboration meet the objectives of both sides and do not lose the intentions behind a standard, model or operational implementation.
[Figure 1: for each discipline, a three-level hierarchy of Standards, Process Model and Operational Process (procedure). Software Engineering standard: ISO/IEC 12207. Usability Engineering standards: DIN EN ISO 13407, ISO/PAS 18152.]
Fig. 1. Similar hierarchies in the two disciplines software engineering and usability engineering: standards, process models and operational processes
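The three-level hierarchy can also be written down as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class names, the organization name "ExampleOrg", the tailoring entries, and the pairing of a particular model with a particular standard are hypothetical and are not taken from the paper. Only the names of the standards and models come from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Standard:
    """Top level: the overarching framework."""
    name: str


@dataclass(frozen=True)
class ProcessModel:
    """Middle level: a systematic approach placed under a standard's framework."""
    name: str
    framework: Standard


@dataclass
class OperationalProcess:
    """Bottom level: a model tailored to one organization."""
    organization: str                                    # hypothetical name
    model: ProcessModel
    tailoring: list[str] = field(default_factory=list)   # hypothetical adaptations

    def lineage(self) -> list[str]:
        """Return the chain from standard down to the operational process."""
        return [self.model.framework.name, self.model.name, self.organization]


# Standard and model names are from the text; which model sits under which
# standard is an assumption made for illustration only.
iso_12207 = Standard("ISO/IEC 12207")
iso_13407 = Standard("DIN EN ISO 13407")

v_model = ProcessModel("V-Model", iso_12207)
ue_lifecycle = ProcessModel("Usability Engineering Lifecycle", iso_13407)

se_process = OperationalProcess("ExampleOrg", v_model, ["project goals", "legal policies"])
ue_process = OperationalProcess("ExampleOrg", ue_lifecycle, ["existing processes"])

for process in (se_process, ue_process):
    print(" -> ".join(process.lineage()))
```

Keeping the two lineages as separate objects mirrors the observation that the disciplines have parallel but distinct hierarchies; integration points would then be relations between objects on the same level.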
2 Motivation

For development organizations, SE Models are an instrument to plan and systematically structure the activities and tasks to be performed during software creation. However, software development organizations aim to fulfill specific goals when they plan a software solution. Such goals could be the rapid development of a new software solution (to become the leader in this area), the development of a very stable and reliable solution (e.g. because of the organization’s prestige) and, of course, the generation of revenue. Depending on its goals, an organization will choose the SE Model (or combination of SE Models) that in its estimate fits best. As an example, the Linear Sequential Model, with its predefined results at the end of each phase and its sequential flow of work, certainly provides a good basis for plannability. On the other hand, Evolutionary Development might not be a good choice if the main focus of the solution is on error-robustness, because the continuous assembling of the solution is known to cause problems in the structure and maintenance of software code. As usability engineering puts the focus on the user and the usability of products, which is an important aspect of quality, usability becomes important for the development process and thus also an important criterion for organizations choosing a well-suited SE Model. However, usability engineering activities are not just a subset of software engineering activities. Although different models exist for software and usability engineering, there is a lack of systematic and structured integration [17]. They often coexist as two separate processes in an organization and therefore need to be managed separately and, in addition, need to be synchronized by adding usability engineering activities to the software engineering process models.
In order to identify integration points between the two disciplines the authors believe examinations on each level of the hierarchy have to be performed: On the level of standards it has to be shown that aspects of software engineering and usability
engineering can coexist and can be integrated, even on this abstract level. On the level of process models it has to be analyzed how usability engineering aspects can be incorporated into SE Models. And on the level of operational activities, a close collaboration should be achieved at the price of only reasonable additional organizational and operational effort.

2.1 Common Framework on the Level of Standards

In previous work the authors already performed an initial analysis on the first two hierarchy levels, standards and process models [13]. First integration points on the level of standards could be found by comparing the software engineering standard ISO/IEC 12207 with the usability engineering standard DIN EN ISO 13407. To this end, the standards’ detailed descriptions of processes, activities and tasks, output artifacts, etc. were analyzed and similarities were found. Based on common goals and definitions, the single activities of the standards could be consolidated into five common activities: Requirement Analysis, Software Specification, Software Design and Implementation, Software Validation, and Evaluation. These common activities represent and divide the process of development from both a software engineering and a usability engineering point of view. The five common activities can be seen as the basis for integrating the two disciplines on the overarching level of standards: a common framework for software engineering and usability engineering activities. The authors used the framework to set the boundaries for the next level of analysis in the hierarchy: the level of process models.

2.2 Ability of SE Models to Create Usable Products

Based on the common framework, different SE Models were analyzed to see how they already support the implementation of the usability activities. Thus, an assessment of SE Models with the goal of identifying their ability to create usable software solutions was performed.
In order to create valuable results, the authors defined several tasks to be performed. First, adequate criteria for the assessment of the SE Models needed to be defined, by which unbiased and reliable statements about process models and their ability to create usable software can be made. The assumption was that, based on the results of the assessment, specific recommendations can be derived to enrich the SE Models by adding or adapting usability engineering activities, phases, artifacts, etc. By doing this, the development of usable software on the level of process models can be guaranteed. Furthermore, hypotheses about the process improvements can be made for each recommendation, which can then be evaluated on the Operational Process level. To this end, case studies will be identified, on the basis of which the recommendations can be translated into concrete measures. These measures can then be evaluated by field-testing to verify how effectively they make software engineering activities user-centered. In summary, four types of analyses need to be performed: two on the level of process models and two on the operational process level. The four respective analysis topics differ in their procedures as well as in their expected results:
- Operationalization of the base practices and identification of criteria for the assessment of usability engineering activities and the corresponding deliverables.
- Assessment of SE Models, based on the identified criteria, and derivation of adequate recommendations.
- Inspection of case studies with regard to the recommendations and derivation of specific measures for the implementation of UE activities in SE processes.
- Evaluation of the measures in practice.
For each of the analyses several methods can be used, some of which involve domain experts as interview partners, whereas others are more document oriented. This paper focuses on the analyses performed for the first topic listed above, i.e. the operationalization of base practices and derivation of UE criteria for the assessment, and on first results for the second topic as a forecast based on the results of the first.

2.3 Criteria for the Assessment of SE

As the authors identified the need for assessment criteria to define the degree of usability engineering coverage in SE Models, the following section shows how these criteria were gathered, what results were derived, and what is to be expected from further research activity. To obtain detailed knowledge about usability engineering activities, methods, deliverables and their respective quality aspects, the authors analyzed the DIN EN ISO 13407 and the ISO/PAS 18152. In addition to the identified common activities of the framework within the human-centered design activities, ISO/PAS 18152 defines detailed Base Practices that specify the tasks for creating usable products. These base practices have been used as a foundation to derive requirements that represent the common activities’ usability engineering perspective. The number of fulfilled requirements for each activity of the framework indicates the level of compliance of the SE Model with the base practices and therewith with the usability view of activities. For each base practice the authors determined whether the model complied with it or not. In a second iteration of the gap-analysis, expert interviews will lead to more detailed criteria in order to assess the corresponding SE Models more specifically. Additionally, the completeness and correctness of the base practices and human-centered design activities as defined in the ISO/PAS 18152 itself need to be verified.
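The compliance measure described here, the share of fulfilled base practices per activity, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the base practice identifiers and the fulfilled/not-fulfilled values below are invented, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical dichotomous ratings for one SE Model:
# (activity, base practice id, fulfilled?). Ids and values are invented for
# illustration; they are not taken from ISO/PAS 18152 or from the study.
ratings = [
    ("Context of Use", "BP-1", True),
    ("Context of Use", "BP-2", False),
    ("Context of Use", "BP-3", False),
    ("Context of Use", "BP-4", False),
    ("User Requirements", "BP-5", True),
    ("User Requirements", "BP-6", False),
    ("Evaluation of Use", "BP-7", True),
    ("Evaluation of Use", "BP-8", True),
]


def compliance_by_activity(ratings):
    """Percentage of fulfilled base practices for each activity."""
    fulfilled = defaultdict(int)
    total = defaultdict(int)
    for activity, _practice, ok in ratings:
        total[activity] += 1
        fulfilled[activity] += int(ok)
    return {activity: 100.0 * fulfilled[activity] / total[activity] for activity in total}


def overall_compliance(ratings):
    """Percentage of fulfilled base practices over all activities together."""
    return 100.0 * sum(ok for _, _, ok in ratings) / len(ratings)


per_activity = compliance_by_activity(ratings)
for activity, pct in per_activity.items():
    print(f"{activity:20s} {pct:5.1f} %")
print(f"{'Overall':20s} {overall_compliance(ratings):5.1f} %")
```

Note that the overall figure counts every base practice once, so it is weighted by how many base practices each activity has; it generally differs from the plain average of the per-activity percentages when the activities differ in size.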
The detailed descriptions of the base practices have been used to pre-structure the collection of criteria and the expected results. Since the base practices are structured by activities, methods, and deliverables, the authors used this structure for the expected results as well. Additional expected results are criteria about the quality aspects of the overall process. The results will be separated into those relating to specific human-centered design activities and those that are more generic and overarching. This results in a matrix of activities & methods, content & deliverables, and roles & quality aspects in relation to the human-centered design activities and overall activities, as shown in Table 1.
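Table 1. Structure and orientation of criteria for the assessment of software engineering models
Columns: Context of use | User Requirements | Produce Design Solutions | Evaluation of use | Overarching Aspects
Rows: Activities & methods | Content & deliverables | Roles & quality aspects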
Based on this, several evaluation questions have been gathered, focusing on the abstract level of process models. The goal is to define overarching criteria and not to evaluate the concrete accomplishment within one specific model or particular procedure; examples are questions about overlaps of activities, phases, and deliverables, or questions about the relevance of specific activities or roles within a process model. According to the questions and based on the initial structure, as shown in Table 1, the authors performed the first analysis, an analysis of the documentation of existing SE Models (Linear Sequential Model, Evolutionary Software Development, the Spiral Model by Boehm and the V-Model), and for the second analysis created an interview guideline that is currently used as the basis for the expert interviews. Initial results of these analyses are described in the following section.
Table 2. Summary results of the gap-analysis, showing the sufficiency of SE Models in covering the requirements of usability engineering (based on the ISO/PAS 18152; HS 3)

                          Context   User           Produce Design   Evaluation   Across
                          of Use    Requirements   Solutions        of Use       Activities
Linear Sequential Model   0 %       0 %            0 %              60 %         13 %
Evolutionary Development  13 %      40 %           40 %             80 %         39 %
Spiral Model              13 %      80 %           40 %             100 %        52 %
V-Model                   88 %      80 %           40 %             100 %        78 %
Across Models             28 %      50 %           30 %             85 %
3 Results

As a result of the first analysis of selected SE Models, first general statements can be made. The overall level of compliance of the SE Models in satisfying the base practices, and therewith the usability view of activities, is rather low (Table 2). None of the SE Models fulfills all base practices of the ISO/PAS 18152. However, there is also a large variability in the coverage rate between the SE Models. For example, the V-Model shows very good coverage for all modules except for lower compliance with the HS 3.3 Produce Design Solution criteria, whereas the Linear Sequential Model only fulfills a few of the HS 3.4 Evaluation of Use criteria and none of the other modules. Evolutionary Development and the Spiral Model share a similar pattern of findings, in that they show little coverage for Context of Use, medium to good coverage of User Requirements, limited coverage for Produce Design Solution and good support for Evaluation of Use activities. Comparing the percentage of fulfilled requirements for each SE Model in the summary of results shows that the V-Model has a better compliance than the other models and can basically be regarded as able to produce usable products. In the comparison, the Linear Sequential Model performs worst, followed by Evolutionary Development and the Spiral Model. Both the overview and the detailed findings show that the emphasis for all SE Models is on evaluation (Evaluation of Use), especially in comparison to the remaining activities. The lowest overall coverage was found in Context of Use and Produce Design Solution. Based on the relatively small compliance values for the Context of Use (28%), User Requirements (50%) and Produce Design Solutions (30%) activities across all SE Models, the authors see this as an indicator that there is only a loose integration between usability engineering and software engineering. In summary, the results confirmed the expectations of the authors, showing the low level of integration between both disciplines on the level of the overarching process models. As expected, it becomes apparent that there is a dire need to compile more specific and detailed criteria for the assessment of the SE Models.
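The across-model figures quoted in this section can be cross-checked against the per-model percentages reported in Table 2. The Python sketch below assumes that the "Across Models" row is the unweighted mean of the four per-model values in each column; the paper does not state how that row was computed, so this is an assumption, and the function name is hypothetical.

```python
# Per-model compliance (%) per activity, transcribed from Table 2.
ACTIVITIES = ["Context of Use", "User Requirements",
              "Produce Design Solutions", "Evaluation of Use"]

table2 = {
    "Linear Sequential Model":  [0, 0, 0, 60],
    "Evolutionary Development": [13, 40, 40, 80],
    "Spiral Model":             [13, 80, 40, 100],
    "V-Model":                  [88, 80, 40, 100],
}

# "Across Models" row as reported in Table 2.
reported = {"Context of Use": 28, "User Requirements": 50,
            "Produce Design Solutions": 30, "Evaluation of Use": 85}


def mean_across_models(table, column):
    """Unweighted mean of one activity column over all models (assumed method)."""
    values = [row[column] for row in table.values()]
    return sum(values) / len(values)


for i, activity in enumerate(ACTIVITIES):
    mean = mean_across_models(table2, i)
    # The published per-model values are already rounded, so a tolerance of
    # one percentage point is allowed when comparing.
    status = "consistent" if abs(mean - reported[activity]) <= 1 else "check"
    print(f"{activity:26s} mean={mean:5.1f}  reported={reported[activity]:3d}  {status}")
```

Under this assumption all four columns agree with the reported row to within rounding, which also supports reading Evaluation of Use (85 %) as the best covered activity and Context of Use (28 %) as the least covered.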
As the analysis showed, the base practices currently give too much leeway for interpretation. In addition, it turned out that the dichotomous assessment scale (in terms of “not fulfilled” or “fulfilled”) is not sufficient. A more finely graded rating is necessary to evaluate the process models adequately. Performing the documentation analysis of the SE Models produced first insights, but it turned out that the documentation is not comprehensive enough to ensure the validity of the resulting statements. In the second analysis that the authors plan to conduct, more specific criteria will be determined, according to the previously described structure. These will be compiled in semi-structured interviews with experts from the domain of usability engineering. The criteria focus on the activities defined in the module Human-centered design (ISO/PAS 18152) and their respective base practices and specifics in: fundamental activities, basic conditions and constraints, relevance of activities, resulting outcomes, type of documentation, and respective roles and responsibilities. Beyond this, a substantial focus is put on the quality aspects based on the activities, deliverables, roles and the superordinate model. The criteria will be evaluated concerning questions like:

- How to identify good activities?
- How to identify good results or deliverables?
- How to identify appropriate roles?
- What are properties/characteristics for the relevance and frequency?
- How could the progress of an activity or deliverable be measured and controlled?
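One way to go beyond a two-level "fulfilled / not fulfilled" rating is an ordinal scale that gives partial credit. The following Python sketch is purely illustrative: the four levels, their names and their weights are assumptions introduced here, not a scale proposed in the paper or defined in ISO/PAS 18152.

```python
from enum import Enum


class Rating(Enum):
    """Hypothetical four-level scale; the weights are illustrative assumptions."""
    NOT_FULFILLED = 0.0
    PARTLY_FULFILLED = 1 / 3
    LARGELY_FULFILLED = 2 / 3
    FULFILLED = 1.0


def graded_compliance(ratings):
    """Percentage score where each base practice contributes its rating weight."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return 100.0 * sum(r.value for r in ratings) / len(ratings)


def dichotomous_compliance(ratings):
    """Percentage score when only FULFILLED counts, as on a two-level scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return 100.0 * sum(r is Rating.FULFILLED for r in ratings) / len(ratings)


# Invented ratings for one activity of one model.
example = [Rating.FULFILLED, Rating.LARGELY_FULFILLED,
           Rating.PARTLY_FULFILLED, Rating.NOT_FULFILLED]

print(f"dichotomous: {dichotomous_compliance(example):.0f} %")
print(f"graded:      {graded_compliance(example):.0f} %")
```

With these invented ratings the two-level view credits only one of four base practices, while the graded view also credits the partially covered ones, which is the kind of distinction a finer scale is meant to make visible.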
Based on these criteria, the authors expect to obtain evidence as to which activities, deliverables and roles are necessary to ensure the development of usable products from the experts’ point of view. Relevant factors of influence could be, for instance: “When will an activity A not be performed, and why?” or “Under which circumstances will an activity A be performed completely, and when just partly?” Additionally, criteria are to be collected by which the progress of the process could be measured. However, the central point will be the collection of criteria that focus on quality aspects of the activities, deliverables and roles as well as their relevance. It is expected that the results can be used not just as more detailed criteria for the assessment but will also provide evidence on the level of completeness of the ISO/PAS 18152 and surface potential areas of improvement.
4 Summary and Outlook

The approach presented in this paper was used to identify integration points between software engineering and usability engineering on the level of process models. The authors analyzed four different software engineering process models to identify their ability to create usable products. The authors synthesized demands of usability engineering and performed an assessment of the models. The results provide an overview of the degree of compliance of the models with usability engineering demands. It turned out that there is relatively low compliance with the usability engineering activities across all software engineering models. This is an indicator that only little integration between usability engineering and software engineering exists. There are few overlaps between the disciplines regarding these activities, and therefore it is necessary to provide suitable interfaces to create a foundation for the integration. The authors identified the need to compile more specific and detailed criteria for the assessment, as well as an assessment scale more differentiated than the dichotomous one, to evaluate the process models appropriately. Therefore the authors introduced a structured approach for the follow-up analysis. The more detailed criteria will be compiled in semi-structured interviews with experts from the domain of usability engineering. Thereby, a substantial focus is put on the quality aspects based on the activities, deliverables, roles and the superordinate model. Based on these criteria the authors expect to be able to make statements about their necessity and relevance for ensuring the development of usable products from the experts’ point of view. It is expected that the results could be used not just as criteria for the assessment of software engineering models but could also define the demands of usability more precisely and give evidence about the completeness and potential extension areas of the ISO/PAS 18152.
References

1. Boehm, B.: A Spiral Model of Software Development and Enhancement. IEEE Computer 21, 61–72 (1988)
2. Cooper, A., Reimann, R.: About Face 2.0. Wiley, Indianapolis, IN (2003)
3. DIN EN ISO 13407. Human-centered design processes for interactive systems. CEN European Committee for Standardization, Brussels (1999)
4. Faulkner, X.: Usability Engineering, pp. 10–12. Palgrave, New York (2000)
5. Glinz, M.: Eine geführte Tour durch die Landschaft der Software-Prozesse und -Prozessverbesserung. Informatik – Informatique, pp. 7–15 (6/1999)
6. IBM: Ease of Use Model. (11/2004) Retrieved from http://www-3.ibm.com/ibm/easy/eou_ext.nsf/publish/1996
7. ISO/IEC 12207. Information technology - Software life cycle processes. Amendment 1, 2002-05-01. ISO copyright office, Switzerland (2002)
8. ISO/PAS 18152. Ergonomics of human-system interaction — Specification for the process assessment of human-system issues. First Edition 2003-10-01. ISO copyright office, Switzerland (2003)
9. KBST: V-Modell 97. (05/2006), Retrieved from http://www.kbst.bund.de
10. Larman, C., Basili, V.R.: Iterative and Incremental Development: A Brief History. Computer 36(6), 47–56 (6/2003)
11. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
12. McCracken, D.D., Jackson, M.A.: Life-Cycle Concept Considered Harmful. ACM Software Engineering Notes, pp. 29–32 (4/1982)
13. Nebe, K., Zimmermann, D.: Suitability of Software Engineering Models for the Production of Usable Software. In: Proceedings of the Engineering Interactive Systems 2007, HCSE (IFIP Working Group 13.2, Methodologies for User Centered Systems Design). Lecture Notes In Computer Science (LNCS), Springer, Heidelberg (in prep. 2007)
14. Pagel, B., Six, H.: Software Engineering: Die Phasen der Softwareentwicklung, 1st edn. vol. 1. Addison-Wesley Publishing Company, Bonn, D (1994)
15. Patel, D., Wang, Y. (eds.): Annals of Software Engineering. Editors’ introduction: Comparative software engineering: Review and perspectives, vol. 10, pp. 1–10. Springer, Heidelberg (2000)
16. Royce, W.W.: Managing the Development of Large Software Systems. In: Proceedings IEEE, pp. 328–338. IEEE, Wescon (1970)
17. Seffah, A. (ed.): Human-Centered Software Engineering – Integrating Usability in the Development Process, pp. 3–14. Springer, Heidelberg (2005)
18. Sommerville, I.: Software Engineering. 7th ed. Pearson Education Limited, Essex, GB (2004)
19. Woletz, N.: Evaluation eines User-Centred Design-Prozessassessments - Empirische Untersuchung der Qualität und Gebrauchstauglichkeit im praktischen Einsatz. Doctoral Thesis. University of Paderborn, Paderborn, Germany (4/2006)
Activity Theoretical Analysis and Design Model for Web-Based Experimentation∗ Anh Vu Nguyen-Ngoc Department of Computer Science University of Leicester United Kingdom [email protected]
Abstract. This paper presents an Activity Theoretical analysis and design model for Web-based experimentation, one of the online activities that play a key role in the development and deployment of the flexible learning paradigm. Such a learning context is very complex: it requires both synchronous and asynchronous solutions to support different types of interaction, which can take place among users, between a user and the provided experimentation environment, and between the software components that constitute the environment. The proposed analysis and design model helps clarify many concepts needed for the analysis of a Web-based experimentation environment. It also represents an interpretation of Activity Theory in the context of Web-based experimentation.
Keywords: Analysis and Design model, Activity Theory, Web-based experimentation.
1 Introduction
For about a decade, several engineering departments in colleges and universities have faced the logistical challenge of educating more students with the same resources while maintaining the quality of education. There is also an increasing need to expand the diversity of laboratory resources provided to students. Within this challenging context, the flexible learning paradigm [1, 2] can be seen as an appropriate solution. It refers to a hybrid learning scheme in which traditional courses are combined with online activities. In engineering education, Web-based experimentation is one of the online activities that play a key role in the development and deployment of this flexible paradigm. Over the last decade, several institutions have exploited the Web infrastructure and developed the experimentation courses in their engineering curricula using the Web as the main medium. However, Web-based experimentation is a very complex socio-technical setting [2-4]. As a consequence, understanding the main factors that constitute this particular learning context is an essential step in finding solutions to support and sustain interaction, collaboration and learning processes. Although several Web-based experimentation environments have been developed, such as [5-9], there is still no analysis and design model that really captures the main characteristics of such a learning context and provides useful guidelines for analysts, designers, and developers of Web-based experimentation environments. This paper proposes such a model. Section 2 discusses the major characteristics of Web-based experimentation. Section 3 presents a typical scenario of interaction and collaboration processes in such a learning context. The Activity Theoretical analysis and design is discussed in Section 4. Finally, Section 5 concludes the paper.
∗ Most of this work was carried out while the author was with the Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland.
2 Characteristics of Web-Based Experimentation
Although there has been considerable work on the design, development, and deployment of Web-based experimentation environments, there is still no clear standard for determining the main characteristics of the collaborative hands-on activities in such learning environments. This section discusses these essential characteristics.
2.1 Hands-On Activities Support
First of all, the content delivered in engineering courses that rely on Web-based experimentation includes not only static documents, textual presentations, or video presentations but also computation, graphics generated on the fly, measurements from real devices, and the like. Web-based experimentation can include virtual and/or remote laboratory resources. Real experimentation is still irreplaceable in engineering curricula, since students need to have contact with the apparatus and materials, and labs should include the possibility of unexpected data occurring as a result of material problems, noise, or other uncontrolled real-world variables. Virtual and remote laboratory resources provide a complementary means to carry out real experimentation online and/or at a distance. A typical virtual laboratory resource is an interactive experiment that relies on a simulation engine. A typical remote laboratory resource is a physical experimental system that is equipped with the necessary facilities to enable Web-based measuring, monitoring, and manipulation [2].
2.2 Components Integration
Due to the complexity of hands-on work [2-4], several components may need to be integrated into the same experimentation environment. These components should support the whole experimentation process, from the preparation stage, through the design stage and the experiment stage, to the experimental analysis stage. Each component provides a working space or working console where students carry out dedicated tasks to solve a particular problem within a complete experiment. Since the output of one stage may serve as the input for the next stages, there should be linkages between these components. A comparative study has been carried out
in various engineering courses at the EPFL to determine the most common service spaces that may require supporting components for completing typical experimentation assignments [2, 10]. Each service space can be supported by one or several components developed using different technologies. These spaces are as follows:
• The first space that needs to be supported relates to the experimentation itself. This can be regarded as the interaction part of the environment. It enables the actual realization of experiments by interacting with virtual or remote laboratory resources.
• The second space that needs to be supported concerns the tools used to carry out interactive design and analysis activities related to the experiment.
• The third space relates to collaboration support. This is where the professors and the teaching assistants can interact with the students to monitor their progress and to guide their learning activities, and where students interact with each other to get the tasks done.
• Furthermore, a Web-based experimentation environment may also need to integrate supplementary components that give access to various pieces of information, including relevant reminders or links presenting the underlying theory, the experimental protocol, and a description of the environment, including the laboratory resources and the environment features that are used in the experiment.
Depending on the experimental protocol, a Web-based experimentation environment may not need to integrate all of these components.
2.3 Multi-session Experiment
Typical Web-based experimentation sessions are mediated by teaching assistants and by the professors responsible for the course. There may be some face-to-face sessions, in which the students work in the laboratory in the presence of the professor and/or teaching assistants, but most of the learning activities take place in flexible sessions. Multi-session experiments are an important factor that helps students perform experimentation in a flexible way. In a Web-based experimentation environment, students should be able to carry out several trial-and-error experiments that reinforce their understanding of theoretical lectures and physical phenomena in a framework where errors are neither penalized nor hazardous. Ideally, a Web-based experimentation environment should allow students to reconstruct the whole or some parts of an experiment and perform it as many times as they want. Hence, the experimental parameters need to be stored for later reconstruction or reuse of that experiment. To support multi-session experiments carried out by a single student or by groups of students, many issues need to be addressed, such as the continuity of interaction [11], which allows students to interact smoothly and without interruption with the experimentation environment, with the laboratory resources, and with other students. Several asynchronous and synchronous collaboration facilities need to be considered as well.
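The requirement that experimental parameters be stored for later reconstruction can be illustrated with a minimal sketch. It is not taken from the eMersion environment; the record type, its field names, and the example parameters are hypothetical and only show one possible way to save a run in one session and reuse it in another.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ExperimentSnapshot:
    """Hypothetical record of one experiment run (all names are illustrative)."""
    experiment_id: str
    student_group: str
    parameters: dict = field(default_factory=dict)


def save_snapshot(snapshot: ExperimentSnapshot) -> str:
    """Serialize a snapshot so that it can be stored between sessions."""
    return json.dumps(asdict(snapshot), sort_keys=True)


def load_snapshot(serialized: str) -> ExperimentSnapshot:
    """Rebuild a snapshot in a later session from its stored form."""
    return ExperimentSnapshot(**json.loads(serialized))


# Session 1: the group runs an experiment and stores its parameters.
first_session = ExperimentSnapshot(
    "drive-control", "group-07", {"gain": 2.5, "sampling_period_s": 0.01}
)
stored = save_snapshot(first_session)

# Session 2: the stored run is reconstructed and one parameter is changed
# for a new trial; the original record is left untouched.
restored = load_snapshot(stored)
second_trial = ExperimentSnapshot(
    restored.experiment_id,
    restored.student_group,
    {**restored.parameters, "gain": 3.0},
)
```

Keeping each trial as a separate record, rather than overwriting the previous one, matches the trial-and-error use described above, where students may want to return to any earlier run.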
2.4 Types of Collaboration
The importance of collaboration among students has long been recognized in education, especially in distance and online education. According to social constructivists, learning is a social construct mediated by language via social interactions [12], where collaboration among learners is a critical concept [13]. In addition, hands-on activities are usually conducted in small groups [2]. Consequently, Web-based experimentation environments should integrate components that help students actively create their own contextual meaning, rather than passively acquire knowledge structures created by others [3]. These components should help students interact with their peers, discuss their positions, form arguments, reevaluate their initial positions, and negotiate meaning. Students become responsible for learning as they collaborate with one another, with their environment, and with their teaching assistants and professors. Both synchronous and asynchronous collaboration should be supported in a Web-based experimentation environment.
2.5 Discretionary of Collaboration
The autonomy of individual students working in flexible modalities means that collaboration with other students is, in many cases, not strictly required. In other words, students collaborate with other students only when they believe that it is worth doing so. Students who participate in a course using the provided Web-based experimentation environment may be enrolled in various other courses. This means that they may have different study schedules and may carry out different tasks at different times. These variations can make it difficult to find common times when students can collaborate. As a consequence, even when working in groups, students usually work together, either face-to-face or at a distance, only when a due date is approaching, e.g. before the laboratory sessions or before the laboratory test. Other modes of group work also exist. Our experience in observing the students' work shows that there are some “well-organized” groups, in which the members clearly divide the tasks among themselves. There are also many cases in which only one member of the group does the “whole job”. However, depending on the experimental protocol, and more precisely on how the laboratory test is carried out, it is sometimes difficult for the teaching assistants and professors to recognize such problems. The Web-based experimentation environment should allow students to switch between single working mode and collaborative working mode. This switching should be as smooth and transparent as possible from the student's point of view.
3 Typical Scenario of Interaction and Collaboration Process
Fig. 1 illustrates the interaction and collaboration process that takes place in Web-based experimentation, in which collaborative actors perform a chain of activities to obtain an outcome, i.e. to acquire knowledge from the course (see 1 in the figure). Collaborative actors are, for instance, student groups that are enrolled in the course and use the environment to carry out their experimentation. In hands-on sessions, the group size is usually small (2 or 3 students) [2, 3]. These actors share their common
background, divide tasks, coordinate their work, and collaborate with each other based on some social rules to get the work done. To support the coordination and communication between these actors, several collaboration and communication facilities may be needed and integrated into the experimentation environment.
Fig. 1. The interaction and collaboration process of Web-based experimentation
These actors interact with various (software) objects displayed in the GUI of the Web-based environment (2). For example, a student uses the computer mouse to modify the parameters of an electrical drive, which are displayed in the GUI as scrollbars. These objects are representations of software components (3), which may be located on different servers. The interaction between the actors and the objects may change the status and the behaviour of the components, and may also invoke interaction between these components and/or their internal calculation processes (4). In turn, the interaction between the components at the system level facilitates the interaction process at the user level, which may serve the students' next activities (5). To summarize, this scenario depicts the complexity of a context in which:
• Students can collaboratively carry out their hands-on activities in a flexible way.
• The online learning community is heterogeneous and its members may have different roles. The coordination and collaboration among the members of the community may be defined by different social protocols and rules.
• The Web-based experimentation environment itself may integrate a large variety of software components, which constitute what we call the system level. These components are represented by several objects displayed in the interface of the provided experimentation environment.
• The interaction process conducted by the actors, which happens externally and internally at both the user and system levels, allows the actors to acquire the outcome of the course.
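Steps (2) to (5) of the scenario can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: the class names, the notification mechanism, and the "drive" and "plotter" components are hypothetical and do not describe how any particular experimentation environment is implemented.

```python
class Component:
    """Hypothetical software component (step 3); it may live on another server."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.listeners = []  # components notified when this one changes (step 4)

    def connect(self, other):
        """Link this component to another one that depends on its output."""
        self.listeners.append(other)

    def update(self, key, value):
        """Change the component state and propagate the change at system level."""
        self.state[key] = value
        for other in self.listeners:
            other.on_change(self.name, key, value)

    def on_change(self, source, key, value):
        # Record the upstream value; a real component would recompute here.
        self.state[f"{source}.{key}"] = value


class GuiObject:
    """Hypothetical GUI object (step 2) representing one component parameter."""

    def __init__(self, component, key):
        self.component = component
        self.key = key

    def set_value(self, value):
        # A user action in the GUI is forwarded to the underlying component.
        self.component.update(self.key, value)

    def displayed_value(self):
        # What the user sees after the system-level interaction (step 5).
        return self.component.state.get(self.key)


drive = Component("drive")
plotter = Component("plotter")
drive.connect(plotter)

scrollbar = GuiObject(drive, "speed_setpoint")
scrollbar.set_value(120)  # the student moves the scrollbar
```

After the call to `set_value`, both the drive component and the connected plotter component hold the new setpoint, which is the user-level effect of the system-level interaction described in the scenario.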
4 Activity Theoretical Analysis and Design
The complexity of Web-based experimentation is caused by several social and technical factors. As a consequence, when studying the collaborative hands-on work in Web-based experimentation, the interaction and collaboration process should be analyzed as a whole, not as any of its constituent entities in isolation, since there are close, active, reciprocal, and bidirectional interdependences among these entities. The importance of Activity Theory as a framework for conceptualizing human activities has long been studied by the CSCW and CSCL communities [14, 15]. In an influential paper published in 1999, Jonassen and Rohrer-Murphy also argued that Activity Theory provides a powerful instrument to analyze the needs, tasks, and outcomes for designing constructivist learning environments [16]. They proposed a framework that helps analyze and design a constructivist learning environment. However, one of the most difficult problems for analysts and designers is how to apply these abstract concepts to a real-world problem, e.g. to design a real Web-based experimentation environment that supports online collaborative hands-on activities. In this section, Jonassen and Rohrer-Murphy's framework is adapted to introduce a mapping and interpretation of the abstract concepts of Activity Theory in the real context of Web-based experimentation. The constructed framework helps understand and clarify the context of Web-based experimentation from an Activity Theoretical perspective.
4.1 Activity Theory Concepts
1. Subject: There can be several types of subjects in the context of Web-based experimentation. The most important ones are the following:
a. Professor: someone who is in charge of the course. His/her role is to design and construct the pedagogical scenario of the course, to guide students in their learning process during the whole course, and to evaluate the students' progress and their acquired knowledge.
b. Teaching assistant: someone who may play a very important role in distributing knowledge in the class. The teaching assistant helps students during hands-on sessions. His/her role can also be to support the course management and administration.
c. Student: the main subject using the environment, who enrols in the course to carry out experimentation using the environment provided.
d. Technician: is responsible for the configuration of the physical equipment in the laboratory.
e. Evaluator, research assistant: is responsible for assessing the effectiveness and efficiency of the environment, and/or proposing further improvement, development, and the like.
2. Object: Different objects can be defined. These objects are transformed during the course to obtain different outcomes.
a. Long-term object: can be composed of both physical and mental products. The physical object could be the deliverables obtained after finishing the course, e.g. a course report, or a set of adequate parameters to obtain a stable state of the system. The mental product refers rather to the knowledge, the concepts, or the perceptions of students on a particular engineering domain.
b. Short-term object: the objects of each experimental session or module. Deliverables representing short-term objects could be a report, a mathematical problem to be resolved, a hands-on module to be realized, and the like. Short-term objects can also be the knowledge obtained after finishing these modules.
3. Community: All professors, assistants, students, and technicians using the environment for the course form an online learning community, in which the student is the central character and the professors and teaching assistants are usually the central source of knowledge distribution.
4. Rule: Several rules can be defined for a course depending on the course requirements, the laboratory policies, and the pedagogical scenarios. The task organization among the members of the same group normally relies on a social protocol or a compromise established within the group or between groups in the community. In hands-on sessions, the experimental protocol is what the professors define to guide the students' hands-on steps.
5. Tool, artefact: Tools that need to be integrated should support and reflect the major characteristics of Web-based experimentation as presented in the contextual model. Various tools may be required. The analysts and designers should also consider whether to develop the tools themselves or to integrate tools that have been developed by other institutions.
6. Division of labour: This means the division of tasks between the members of the learning community. The division of labour depends upon the learning community and the rules defined for that community.
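The six concepts above can be written down as a simple data structure, which may help analysts check that each concept has been given a concrete interpretation for a particular course. This is a minimal sketch only: the class, its field names, the helper method, and the example values are illustrative assumptions rather than part of the proposed model.

```python
from dataclasses import dataclass, field


@dataclass
class ActivitySystem:
    """Hypothetical container for the six Activity Theory concepts of one course."""
    subjects: list
    objects: dict
    community: list
    rules: list
    tools: list
    division_of_labour: dict = field(default_factory=dict)

    def unassigned_subjects(self):
        """Return the subjects that have no entry in the division of labour."""
        return [s for s in self.subjects if s not in self.division_of_labour]


course = ActivitySystem(
    subjects=["professor", "teaching assistant", "student", "technician", "evaluator"],
    objects={
        "long_term": "course report and acquired knowledge",
        "short_term": "report for one hands-on module",
    },
    community=["professor", "teaching assistant", "student", "technician"],
    rules=["experimental protocol", "laboratory policy"],
    tools=["experimentation console", "analysis tool", "collaboration space"],
    division_of_labour={
        "professor": "design the pedagogical scenario",
        "teaching assistant": "help students during hands-on sessions",
        "student": "carry out the experimentation",
        "technician": "configure the laboratory equipment",
    },
)
```

Calling `course.unassigned_subjects()` on this example returns the evaluator, a subject whose tasks have not yet been described, which is the kind of gap such a checklist is meant to expose.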
4.2 Activity Structure
This part involves defining the activities that engage the subject. Each activity can be decomposed into its component actions and operations. However, the definition of the activity structure and its granularity depends solely on the pedagogical scenarios and on the objectives of the environment evaluators. In a practical course, an activity is usually equated with the task students need to complete [11]. For each activity (or task), actions are the meaningful experimental steps that need to be realized. Operations are what students do unconsciously when interacting with the environment to complete each step. In an automatic control laboratory course, for example, a task could be “Modelling and control of an electrical drive”. For each task, several actions need to be realized. These actions have an immediate, pre-defined goal, such as “preparing the pre-lab”, “manipulating the physical drive”, or “analyzing the experimental result”. Actions consist of a chain of operations, such as “moving the parameter scrollbar to increase or decrease the value of a parameter of the electrical drive under study”.
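The three-level decomposition described above (activity, actions, operations) can be sketched as nested records. The example reuses the task and action names from the text; the class names, the helper method, and the wording of the individual operations are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """A meaningful experimental step with an immediate, pre-defined goal."""
    goal: str
    operations: list = field(default_factory=list)  # low-level interactions


@dataclass
class Activity:
    """A task that students need to complete, decomposed into actions."""
    task: str
    actions: list = field(default_factory=list)

    def count_operations(self):
        """Total number of operations recorded over all actions."""
        return sum(len(action.operations) for action in self.actions)


activity = Activity(
    task="Modelling and control of an electrical drive",
    actions=[
        Action("preparing the pre-lab"),
        Action(
            "manipulating the physical drive",
            operations=[
                "move the parameter scrollbar to increase a value",
                "move the parameter scrollbar to decrease a value",
            ],
        ),
        Action("analyzing the experimental result"),
    ],
)
```

The granularity is a design choice: as noted above, how finely actions and operations are recorded depends on the pedagogical scenario and on what the evaluators want to observe.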
4.3 System Dynamism
This part investigates the interrelationships between the components that are integrated into the environment. These interrelationships depend upon the pedagogical scenarios defined by the professors. The dynamics of the relationship between the members of the community, who use the environment for their learning activities, depends on the social protocol, the division of labour established, and the rules set for the course. Usually, in hands-on sessions, the experimental protocol is pre-defined by the professors and always available for students to follow; hence, for students, the task complexity mostly depends upon how they carry out the tasks following the steps defined in the experimental protocol. In addition, the “objectives of work” are also pre-defined, so collaborative activities usually do not need to reach the co-construction level of activity [17]. Fig. 2 summarizes the Activity Theoretical analysis and design model, in which all major elements of Activity Theory are mapped into the context of Web-based experimentation. In other words, the proposed model illustrates our Activity Theoretical vision of the analysis and design of Web-based experimentation environments. It can also be used as independent guidance for analysts and designers of Web-based experimentation environments. This model has facilitated the design and development of the eJournal, an electronic laboratory journal integrated into the eMersion experimentation environment. In turn, the iterative design and development of the eMersion environment and the eJournal have validated the reliability and usefulness of the proposed model. The eMersion environment has been used for several academic semesters in several automatic control courses offered by the EPFL. It has also been deployed and tested in other European institutions such as the University of Hanover in Germany, the UNED University in Spain and the Ecole Nationale de Mines St. Etienne in France. More information about the design and evaluation of eMersion and the eJournal can be found in [2, 3, 10, 18, 19].
Fig. 2. Activity Theoretical analysis and design model
5 Conclusion
This paper presents what we call an Activity Theoretical analysis and design model. It discusses the characteristics of Web-based experimentation and introduces a typical scenario of interaction and collaboration processes in such a learning context. The model sheds light on many concepts needed for the design of Web-based experimentation environments. It also represents a mapping from Activity Theory to the context of Web-based experimentation. The goal of the proposed model is to capture the important aspects of the collaborative hands-on activities in a Web-based experimentation environment. The model can be used by a variety of users. Researchers and professors can base their studies of students' behaviours and activities in this particular learning context on the model. Environment developers can use the model to facilitate their development tasks, since it already focuses on the most relevant issues of the domain, and to structure the environment in a coherent way.
Acknowledgments. This work would not have been finished without the invaluable support from the eMersion team, EPFL.
References
1. Holmberg, B.: Theory and practice of distance education. Routledge, London (1995)
2. Gillet, D., et al.: The Cockpit, An effective metaphor for Web-based Experimentation in engineering education. Int. Journal of Engineering Education, 389–397 (2003)
3. Gillet, D., Nguyen-Ngoc, A.V., Rekik, Y.: Collaborative Web-based Experimentation in Flexible engineering education. IEEE Trans on Education, 696–704 (2005)
4. Feisel, L.D., Rosa, A.J.: The role of the laboratory in undergraduate engineering education. ASEE Journal of Engineering Education (2005)
5. Böhne, A., Faltin, N., Wagner, B.: Synchronous tele-tutorial support in a Remote laboratory for process control. In: Aung, W., et al. (eds.) Innovations 2004: World Innovations in Engineering education and research. iNEER in cooperation, pp. 317–329. Begell House Publishers, New York (2004)
6. Schmid, C.: Using the World Wide Web for control engineering education. Journal of Electrical Engineering, 205–214 (1998)
7. Tzafestas, C.S., et al.: Development and evaluation of a virtual and remote laboratory in Robotics. In: Innovations 2005: World innovations in Engineering education and Research. iNEER in cooperation, pp. 255–270. Begell House Publishers, New York (2005)
8. Ko, C.C., et al.: A Web-based virtual laboratory on a frequency modulation experiment. IEEE Trans on Systems, Man, and Cybernetics, pp. 295–303 (2001)
9. Sepe, R.B., Short, N.: Web-based virtual engineering laboratory (VE-LAB) for collaborative experimentation on a hybrid electric vehicle starter/alternator. IEEE Trans on Industrial Applications (2001)
10. Nguyen-Ngoc, A.V., Rekik, Y., Gillet, D.: Iterative design and evaluation of a Web-based experimentation environment. In: Lambropoulos, N., Zaphiris, P.P. (eds.) User-centered design of online learning communities, pp. 286–313. Idea Group Inc, Pennsylvania (2006)
11. Nguyen-Ngoc, A.V., Rekik, Y., Gillet, D.: A framework for sustaining the continuity of interaction in Web-based learning environment for engineering education. ED-MEDIA conference, Montreal, Canada (2005)
12. Vygotsky, L.S.: Mind in Society. In: The development of higher psychological processes. Harvard University Press, London (1978)
13. Jonassen, D.H., et al.: Constructivism and computer-mediated communication in distance education. The American Journal of Distance Education, pp. 7–26 (1995)
14. Kuutti, K.: Activity Theory as a potential framework for Human-Computer Interaction research. In: Nardi, B.A. (ed.) Context and Consciousness: Activity theory and Human-computer interaction. The MIT Press, MA (1995)
15. Nardi, B.A.: Context and consciousness: Activity theory and Human-computer interaction. MIT Press, MA (1996)
16. Jonassen, D.H., Rohrer-Murphy, L.: Activity Theory as a framework for designing constructivist learning environments. Educational Research and Development, pp. 61–79 (1999)
17. Bardram, J.E.: Collaboration, Coordination, and Computer Support: An Activity Theoretical Approach to the Design of CSCW. University of Aarhus (1998)
18. Nguyen-Ngoc, A.V., Gillet, D.S., Sire, S.: Evaluation of a Web-based learning environment for Hands-on experimentation. In: Aung, W., et al. (eds.) Innovations 2004: World Innovations in Engineering education and research. iNEER in cooperation, pp. 303–315. Begell House Publishing, New York (2004)
19. Nguyen-Ngoc, A.V., Gillet, D., Sire, S.: Sustaining collaboration within a learning community in flexible engineering education. In: ED-MEDIA conference. Lugano, Switzerland (2004)
Collaborative Design for Strategic UXD Impact and Global Product Value James Nieters1 and David Williams2 1
255 W Tasman Ave, San Jose, CA 95134- PhD, 2 934 Nanjing West Road, Suite 505, Shanghai, 20041 China [email protected], [email protected]
Abstract. Experts in the field of HCI have spoken at length about how to increase the strategic influence of User Experience Design (UXD) teams in industry [2] [5]. Others have talked about how to build a usability or user experience team in industry [3], and others have offered courses in managing HCI organizations [1] [7]. At the same time, other experts have spoken about the importance of making products usable and desirable for international audiences [9] and the value of “offshoring” their usability efforts [8]. Few, though, have discussed the value of, and the process for, an embedded UXD Group functioning as an internal consultancy to different product teams within its organization. This paper presents both how the consultancy model can increase the strategic effectiveness of UXD inside a company, and how, by leveraging partners internationally, such groups can broaden the usefulness, usability, and desirability of their products to a more global audience.
Keywords: User Experience Design, Organizational development, User Experience Teams, Management, Internationalization.
leaders in each division may request that UXD resources working on their project report directly to them.
• Client-funded model, where individual business units fund a central team that provides UXD resources to their teams, and one central UXD organization manages these people. The benefits of this model are similar to the central model. In addition, the central organization does not become a cost center because other divisions pay for UXD resources. However, managers in each division may feel that UXD practitioners who are not part of their organization are not core or central to their business, and they can decline to pay for the individuals at any point. This challenge becomes more likely when managers need to reduce headcount and do not want to eliminate the individuals whom they “own” (who report to them).
• Distributed model, where there is no central UXD group, but UXD practitioners (and smaller groups) report directly to the divisions for the products on which they work. One benefit of this model is that such people are viewed more as “insiders,” as part of the team. While an increasing number of companies are using this model, it poses many challenges for the UXD groups and their influence. There is often no explicit sharing of resources or processes across UXD groups, and destructive competition can arise. Unless each UXD group is large enough, practitioners can end up reporting to a manager who does not understand the value of the UXD function. In addition, without a central UXD group, there is no team responsible for UXD process, standards, or infrastructure.
At Cisco, these more traditional organizational structures met with some success. One group within the centralized model was able to show an ROI of more than 10x, or $50 Million USD annually. However, $50 Million in a company that grew from $4 Billion to >$30 Billion from 1999 through 2006 was barely noticed.
Senior leaders at Cisco and other companies, in both mature and emerging markets, are held responsible for steep revenue growth. As such, they are in search of the next “advanced technology” (AT). ATs are disruptive innovations [6] that differentiate one company from its competition, resulting in large revenue increases. These executives therefore want to invest in groups that can drive radical differentiation. They may also invest in groups that incrementally increase revenue or decrease costs (such as prior Cisco usability teams), but they are likely to invest the most in groups that prove that they can stimulate disruptive innovation [6]. To become strategically relevant, the Cisco UXD team needed to deliver disruptive innovation that changed the way people thought about and interacted in a domain. Attempting to improve the usability, usefulness, and desirability of too many products at one time diminished the Cisco UXD group's ability to gain the sustained support of senior executives. The Cisco UXD group needed a different model, so that it could increase revenue geometrically instead of incrementally. To influence a complex-systems company [6] such as Cisco, the UXD Group needed an ROI of 100x to 1000x.
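The scale of the gap between a 10x ROI and the 100x to 1000x target can be made concrete with a short back-of-the-envelope calculation. It uses only the figures quoted above and rests on two assumptions that the text does not state: ROI is read as a simple multiple (annual return divided by cost), and "more than 10x" is taken as exactly 10.

```python
# Figures quoted in the text (USD).
annual_return = 50e6     # return shown by one group in the centralized model
company_revenue = 30e9   # lower bound on company revenue in 2006

# Assumption: ROI is a simple multiple, return / cost, and is exactly 10.
roi_multiple = 10

implied_cost = annual_return / roi_multiple        # cost implied by a 10x ROI
share_of_revenue = annual_return / company_revenue  # fraction of 2006 revenue


def required_return(target_multiple, cost=implied_cost):
    """Annual return needed to reach a target ROI multiple at the same cost."""
    return target_multiple * cost


low_target = required_return(100)    # 100x at the same cost
high_target = required_return(1000)  # 1000x at the same cost
```

Under these assumptions the implied cost is $5 Million, the $50 Million return is less than 0.2% of 2006 revenue, and the 100x to 1000x target corresponds to a return of $500 Million to $5 Billion at the same cost, that is, a percent-level or larger share of revenue.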
J. Nieters and D. Williams
2 Enter the ‘External Consultancy Model’
Within the areas of product and interaction research, design and testing, independent design studios have flourished in mature markets such as the US, Europe and South Korea (IDEO, Fitch, Razorfish). Now, a new breed of international and cost-effective design studios such as Asentio Design is developing business from bases in emerging markets such as China or India. Asentio Design flourishes due to its ability to allocate multi-function design teams to chosen client projects without being constrained by the processes and corporate politics experienced by design teams within companies. By capitalizing on its geographic and linguistic context, it is also able to act as a design bridge between clients in mature markets and ODM/OEM design teams in emerging markets. This model has been referred to as “Collaborative Design” [9].
With such companies in mind, the Cisco UXD Group is able to act like an external design firm. Instead of assigning one UI designer to one or even multiple projects, the Cisco UXD Group now assembles highly focused teams comprising multiple cross-functional experts to support speedy innovation on carefully selected products. These experts include user researchers, interaction designers, visual designers, developers, and industrial designers as necessary to deliver a superior user experience in a very short time. The consultancy model has the additional advantage of placing the UXD group outside of the organization, allowing freedom of decision-making and objectivity when selecting projects to pursue. Following this model, the group can focus intensively on the five or six most strategic products, and work with teams truly interested in their expertise. Since the group converted to the Focus Team model, senior leaders have recognized that the UXD Group’s contribution to revenue has increased to more than $2.5 Billion.
Such impact has been difficult to ignore; one result is that Cisco’s new motto is “Lead the Experience.” Cisco executives now recognize that the experience itself is the next “advanced technology.”
2.1 Engagement Model for Successful Focus Teams
The Internal Consultancy Model is not ideal in every environment. For it to succeed, UXD management must:
1. Choose only worthwhile projects where measurable opportunity exists for demonstrable impact, and where management is willing to give credit to the UXD Focus Team.
2. Merge each UXD Focus Team into the Product Development Team with clearly delineated roles.
3. Adhere to best practices by following a clearly defined process, with well-defined entry and exit criteria.
4. Choose Focus Team members carefully.
5. Follow through to demonstrate impact.
Collaborative Design for Strategic UXD Impact and Global Product Value
2.1.1 Choosing Worthwhile Projects
While it is a shame to forego UXD on smaller projects, the point is to dedicate resources where they will have the most effect; we must pick our battles wisely. To take this metaphor a bit further, a classic military strategy is to focus overwhelming resources on a single target and then, when success has been achieved, move to the next target. This model can apply to UXD efforts: Shouldn’t any UXD manager make sure that critical projects are fully resourced, even if it means neglecting other projects? The alternative is to be spread desperately thin, resulting in average improvements on most projects, rather than disruptive innovation [6] on a few projects.
Choosing the right projects also includes:
1. Conducting an Opportunity Review before agreeing to commit resources, to ensure that the product team is receptive and executives recognize the problem. The product team must agree that their success requires a UXD Focus Team.
2. Generating a Project Brief, a statement of work that describes:
• Statement of value (summary)
• Challenges (such as competition)
• Solution (typically broken into multiple phases)
• Deliverables to be provided
• Resources (people) required on UXD team
• Detailed schedule
• Costs
• Assumptions and risks
3. Obtaining Concept and Execution Commitments, in which managers from the different organizations agree to supply people and money.
4. Embedding and integrating the UXD Focus Team with the product development team.
5. Giving the project clear start and stop points, with clear exit criteria, so that it is not open-ended.
6. Selecting Focus Team members who love to collaborate and excel at working in teams.
When UXD Group leaders decide which projects to accept, they consider the following factors:
• Product team receptivity. The product development team itself has requested support from UXD, rather than had it “pushed” upon them by management. If a product team is ambivalent, the UXD group disengages.
• Potential revenue or cost savings. The UXD group seeks projects on which they anticipate a minimum revenue increase of $25 Million in the first year.
• Advanced technology: a new technology that has not yet been introduced to the market, so the UXD Group can make a larger impact than on legacy products (preferable, but not required).
• Leveraging the Cisco UE Standards (UI guidelines and tools). If a product team does not intend to adopt the UE Standards, the UXD Group will not assign
resources. These standards include component libraries to help engineers quickly create code that is accessible, usable, internationalized, and branded.
• High visibility. If a project is a “pet project” of a cross-functional or highly visible organization within the company, the UXD Group is more willing to accept it.
• Point in the product lifecycle. If design has already begun, it is often too late to impact a product’s overall experience at a fundamental level. There are times when the UXD group agrees to work on a project through multiple iterations, starting late in one cycle to impact a subsequent release.
• Realistic time-to-market demands. The Cisco UXD Group delivers value rapidly. However, if project schedules make delivering a high-quality user experience impossible, the UXD group is less likely to accept the project.
While there are other factors, this list represents the most salient ingredients used in deciding to work on a project.
2.1.2 Merging the UXD Focus Teams into Product Teams with Clearly Delineated Roles
UXD Focus Teams must integrate completely with the product development team during a project. They cannot function as the “icing on the product team’s cake.” In the centralized and client-funded models, product teams can more easily treat UXD team members like outsiders. In the focus team model, management and product team members have all committed to a stellar user experience. UXD Focus Teams need to be viewed as true partners with product teams, and they must treat each product team like the paying customer it actually is. The roles of the UXD Focus Team must be specifically defined, just as the roles of the product team members are. Cisco’s UXD management created a role grid that explicitly defines UXD roles and skills. The UXD Focus Team functions as the architect who provides the blueprint for the elements of the product that define the user experience, and the developers function as the carpenters who deliver to the specifications.
If the product team does not agree in advance to these roles, the UXD Group does not accept the project.
2.1.3 Choosing Focus Team Members Carefully
To win the trust and respect of product teams, members of the UXD Group must demonstrate world-class user experience design skills. Of equal importance, UXD practitioners must have the business, teamwork, technical, communication, and advocacy skills to ensure that product teams will choose to work with the UXD Focus Team. We must understand the larger business context of our work rather than drive single-mindedly toward an ideal design goal. By approaching the design role as though the product team is a customer with a revenue target that we need to help meet, we become more strategically relevant in our organizations. Despite their underlying focus on business goals, corporate executives need to trust you to understand their requirements and to trust that you can help them succeed. Personal trust and accountability can be more important than ROI. UXD Focus Team members must be able to build this credibility.
2.1.4 Following Through to Demonstrate Impact
As with any consultancy, it is essential to make all successes visible. Future business requires such demonstrable impact. No one would engage a consultancy without a fine reputation and portfolio, and the same rules apply to internal consultancies. To achieve this visibility, the Cisco UXD Group tracks impact and records case studies on its website, as you would find on the websites of design firms in industry. The stories in this portfolio describe:
• The Problem
• Our Solution
• The Impact
If the UXD Group cannot calculate the financial impact and managers do not provide a quote attesting to the value of the UXD Group activities, that project does not appear on the portfolio website. Other managers can refer to these examples of impact and trust that the group can deliver the same value for them.
2.2 Extending UXD with a Partner Ecosystem
Since Cisco’s UXD Group now behaves as an internal consultancy, it has been able to increase its influence by subcontracting to external consultants. To the customers of the UXD Group (Cisco’s product teams), there is little difference. Collaboration with external design firms such as Asentio Design in China not only increases the internal UXD team’s capacity but also injects emerging and global perspectives on research, design, technology, partnerships, and the connection between these domains. Such fresh perspectives are critical to stimulate the innovation required in such a company. The UXD Group soon realized it needed an ecosystem of partners who could augment staff, drive entire projects, and introduce ideas that stimulate disruptive innovation. Using external consultants has become a natural extension of the group’s engagement model. The UXD partner ecosystem includes different types of design firms for different types of design projects.
Asentio Design, through its international team, can provide dedicated support in all areas of the design lifecycle as well as specific market knowledge and partner relationships from its base in China. As product experiences are increasingly designed to support emerging and mature markets, models such as Asentio Design’s are crucial in allowing Cisco to collaborate with manufacturers in, and develop new products for, emerging markets. A partner ecosystem therefore provides opportunities for innovation between internal and external consultancies, as well as reducing cost and providing design “bridges” between markets. Many Western companies now leverage a global network of partners [4]. Should we, as designers, not also leverage this business model to deliver rapid, low-cost, and globally relevant products?
2.3 Leveraging Intact Design Firms Is Not Offshoring
It is important to distinguish “offshoring” from leveraging intact global partners. In the consultancy model, as companies work with external design firms, they are seeking rapid, high-quality and globally relevant engagements. This process differs from “offshoring,” which in this paper we define as a company hiring its own resources in another country in order to decrease costs. One of the key value propositions for hiring an intact design team (international design firm) is that they have already performed the hard work of seeking and hiring trusted researchers and designers. These teams have also already gone through the hard work of teambuilding. Developing an ecosystem of partners spares leaders of UXD organizations from having to attract, hire, and retain talent, which can be even more difficult across international boundaries.
3 Examples of Impact
Asentio Design and Cisco are currently working on some joint projects that we hope will change market dynamics, but because these products have not yet reached market, we look forward to reporting on these in subsequent years. From a Cisco perspective, though, the company is attempting to enter emerging markets in which it has less experience with cultural expectations, norms, and challenges from a user perspective. As such, it is critical that it partner with design firms in the following areas:
• Design of personal experiences, which encompasses physical products, application user interfaces, out-of-box experiences and retail environments.
• Consumer Research in markets where Cisco does not have a UXD research or design presence. The costs of leveraging a company such as Asentio Design are significantly lower than those of setting up a presence in each such emerging market.
• Globalization. As Cisco has focused more on Internationalization and Localization while entering new markets, it needs partners in-country to help test its products for these international audiences.
Asentio Design has many examples of working with US and European companies and delivering world-class and culturally appropriate designs at a much lower cost than if US or European-based companies had designed them. The following case studies show examples of such international collaboration.
3.1 Case-Study 1 (US/China): Commercialization of a Military Product
The client had a long history in developing products for military customers. However, they now wished to take their advanced image processing technology into the commercial marketplace. While building sourcing relationships in China, the client was introduced to Asentio Design as a possible design partner. In order to develop their first consumer product, the scope of the client’s requirement was broad, covering consumer research, feature planning, retail & packaging, user interface design and industrial design.
Asentio Design, through its international team and position in Shanghai (allowing rapid travel to the client’s west coast US headquarters), conducted
consumer research on the US East and West coasts and personal experience strategy planning through two design workshops at the client’s US-based office. Research and strategy work was followed by a user interface and product design phase in which teams in Shanghai and the US worked in close collaboration, with frequent face-to-face meetings.
3.2 Case-Study 2 (Europe/China): Research into Digital Imaging Lifestyles in China and Europe
A European mobile phone OEM wished to research and compare the usage of high-end camera phones in Europe and China. The company approached Asentio Design because of the latter’s partners’ long experience in researching and designing mobile personal experience across global markets, their location in China and their lower cost base compared to European design consultancies. Asentio Design, through its multilingual team, was able to conduct diary studies, one-on-one interviews and online surveys in 4 languages (Mandarin, Cantonese, English, German) in Shanghai, Hong Kong, London and Germany. The ongoing results of the research were presented to client teams in Europe and China, allowing wide dissemination and providing the stimulus for subsequent, more focused research.
4 Choosing a UXD Organizational Model
The Focus Team model is not right for every company. Perhaps the most important factor in deciding what UXD structure to adopt for your group is management that understands what business model is appropriate for your company’s unique environment. The Focus Team or Internal Consultancy model is best when:
• The organization does not have enough UXD practitioners to support every project.
• Cost is an issue. Working with a reputable design firm, such as Asentio, that knows how to deliver excellent results provides highly qualified resources at a much lower cost.
• You need to design products for international markets and need a partner who can design a culturally appropriate product.
• Your team’s survival or reputation depends on delivering excellence on every project (you cannot afford to assign one designer to multiple projects, thus diluting their impact).
• Product teams can “opt out” from working with you. If your company does not require every product team to follow UCD practices and work with UXD staff, then working only with motivated teams can optimize your resources.
• You can “opt out” of minor projects and focus on the highest-priority projects in the company. Trying to make small improvements on all (or most) products can dilute a UXD group’s impact.
5 Summary Personal experience design is now a truly global activity. In order for companies such as Cisco to effectively support product teams and innovate in global markets, their
UXD groups must look increasingly to the new breed of international design studios located in these markets. Companies such as Asentio Design can offer local knowledge allied with western design processes and experience.
References
1. Anderson, R.I.: Managing User Experience Groups. Second Offering, UCSC Extension, Cupertino, CA (2006), http://www.well.com/user/riander/mguxgrps.html
2. Bias, R.G., Mayhew, D.J.: Cost-Justifying Usability. Academic Press, Inc., San Diego, CA, USA (1994)
3. Huh, B.-L., Henry.: PhD. Developing usability team in a company: Multiple perspectives from industries. In: Conference Proceedings, Asia-Pacific CHI (2006)
4. Engardio, P., Einhorn, B.: Outsourcing Innovation. BusinessWeek (March 21, 2005)
5. Innes, J., Friedland, L.: Re-positioning User Experience as a Strategic Process. CHI 2004 tutorial (2004)
6. Moore, G.A.: Dealing with Darwin: How Great Companies Innovate at Every Phase of Their Evolution. Portfolio, New York, New York, USA (2005)
7. Rohn, Janice.: Managing a User Experience Team. In: Proceedings of the Nielsen Norman Group Conference, Seattle, WA (2006)
8. Schaffer, E.: Offshore Usability: Helping Meet the Global Demand? Interactions, p. 12 (March – April, 2006)
9. Williams, D.M.L.: Co-Design, China and the Commercialisation of the Mobile User Interface. ACM “Interactions” Special Gadget Issue, October, Vol. XIII(5) (2006)
Participatory Design Using Scenarios in Different Cultures
Makoto Okamoto1, Hidehiro Komatsu1, Ikuko Gyobu2, and Kei Ito1
1 Media Architecture, Future University-Hakodate, Kamedanakano 116-2 Hakodate, 041-8655, Japan
2 Faculty of Human Life and Environmental Sciences, Ochanomizu University, 2-1-1 Otuka, Tokyo, 112-8610, Japan
{maq, g2105009, k-ito}@fun.ac.jp, [email protected]
Abstract. In this paper we have examined the effects of scenarios from a participatory design and cross-cultural perspective. The Scenario Exchange Project was an international workshop using scenarios. The participants were university students from Japan and Taiwan. The impetus behind this project was the practical demand for designers to correctly understand different cultures and design products and services. We confirmed that scenarios are effective techniques for bolstering participatory design. Furthermore, we have recognized that we must create new methods for describing the lifestyle and cultural background of personas. Keywords: Scenario, Information Design, Cross Culture, Situated Design, Participatory Design.
2 Design Process Using Scenarios
Many researchers have worked on the scenario-based design approach, starting with John M. Carroll [1]. Because scenarios describe human activities and goals using unformatted symbols (words or pictures), they are special in that anybody can understand and use them easily. Furthermore, they facilitate smooth, mutual understanding between stakeholders, such as requirement analysts (designers), customers, users and developers [2]. While scenarios are common tools for expression, they are also effective tools for interaction designers and media architects. However, it cannot be said that most designers commonly use scenarios in their work. A number of innovations are needed before designers can use them at work, covering the situations in which scenarios are effective, simple and effective methods for describing situations, how to elicit requirements, ways of expressing new scenarios and methods for evaluating them. We believe that the significance of using scenarios lies in the process by which designers and users cooperate in order to understand unknown living conditions and create new products. As advances continue to be made in information technology, we will be confronted with situations for which we have never designed before. The scenario-related tasks which we have worked on include systems that support the lifestyles of the visually impaired [3, 4], systems that assist exchanges between people from different cultural and linguistic backgrounds [5], and mobile communications services [6]. In an increasingly complex and globalizing modern society, we believe that there is a limit to the world view which individuals are capable of understanding, and new techniques for sharing situations, such as scenarios, will become more and more necessary in the future. In this paper we report on cases in which we carried out design activities using scenarios in situations where language and culture differed.
3 Scenario Exchange Project
Okamoto, Der-Jang Yu and associates held a workshop for students from different cultures to design systems using scenarios (from May 2005 to May 2006). The Scenario Exchange Project (hereafter, SEP) proposed by Yu was one in which Japanese and Taiwanese students designed new information systems through the medium of scenarios. They used the Scenario Exchange Server to share scenarios: Problem Scenarios that describe situations, and Solution Scenarios that address those situations. Furthermore, in order to verify factors which could not be expressed via scenarios or online communication, we held workshops in the respective countries (Table 1). Two kinds of scenarios were used with this technique. The first was a Problem Scenario, which describes how users applied the device and the kinds of problems they may have confronted. It was written from field surveys and interviews. Based on the requirements extrapolated from an analysis of this scenario, a Solution Scenario, which describes how the proposed service should be used, was introduced. These scenarios are special in that they are specific and easy for anybody to understand, making them useful for communication between stakeholders, such as designers, engineers and those involved in the process [1].
Table 1. Summary of Project
              1st Workshop                          2nd Workshop
Title         Scenario Design & Episode Exchange    Mobile Taiwan & Ubiquitous City
Term          Dec 16-18, 2005                       May 7-9, 2006
Place         Hakodate, Japan                       Taipei, Taiwan
Participants  FUN 11, NCTU 18                       FUN 14, NCTU 18, TAU 22, NYUST 4

Table 2. U-team’s and D-team’s Roles
Group      Team               Role
One Group  U-team (User)      - User’s perspective (or assuming the role of the user)
                              - Write a Problem Scenario
           D-team (Designer)  - Idea development
                              - Establishing a hypothesis
SEP constructed the Scenario Exchange Web to enable stakeholders to share scenarios and exchange opinions with each other. This Web enables the whole process, from Problem Scenario to Solution Scenario, to be recorded and shared. It is possible to post not only text, but also camera images and hand-drawn sketches. Furthermore, users and designers can exchange opinions by means of a function for commenting on scenarios. This environment makes it possible for information to be shared via unformatted symbols. Our aim was to consider how much participants who were interacting in this environment were able to understand their counterpart’s situation. The participants were students studying user interface, graphic design and product design. Every group had five to six members, a combination of Japanese and Taiwanese. Each group was further subdivided into a U-team and a D-team (Table 2). The U-team acted as observers. They had to carefully observe the condition of users (or assume the role of users) and create Problem Scenarios. The D-team had to propose ideas based on U-team’s Problem Scenario, that is, they acted as designers. All of the workshops were on the theme of proposing information-processing devices which facilitate travel. By adopting the theme of travel, the students had to take into account the local characteristics of the travel destination.
4 The SEP Process
SEP comprised two phases (Fig.1). Phase 1 was the Remote Research Phase. During this phase, U-team and D-team carried out activities from separate locations (Japan and Taiwan). U-team was the first to travel. The actions of one subject being observed were recorded using a camera or by taking notes. The observer interviewed the subject and wrote a Problem Scenario. This scenario was then uploaded onto the Scenario Exchange Web. This scenario was divided into multiple scenes and each scene was provided with Positive, Negative and Wish categories. Observers wrote down brief notes on the users’ satisfied, positive attitudes and behavior under
M. Okamoto et al.
Positive, mistakes and passive attitudes and behavior under Negative, and desires under Wish. Furthermore, personas were set up based on the subject observed, and brief profiles were written down at the beginning of each scenario. The personas described in this paper are virtual user profiles used in the scenario method. Usually, personas are set up based on multiple persons and the most appropriate one is determined from among them, but this step was omitted in the SEP. D-team gained an understanding of what U-team had experienced on their travels from the scenarios and proposed ideas (establishing a hypothesis) via the Scenario Exchange Web. They proactively used online communication such as the Internet, email and chat, asking questions about unclear points in the Problem Scenarios. In Phase 2, D-team actually visited U-team’s country and held a joint workshop. D-team re-experienced the situations which they had previously only been able to understand from the Problem Scenarios. U-team answered questions, in particular on differences in social and cultural background, and facilitated D-team’s understanding. By re-experiencing, D-team became aware of things which they had been unable to understand from the scenario, and reconsidered their ideas based on the new insights and views which they had attained. Solution Scenarios and Product Images were then created collectively.
Fig. 1. Process of Scenario Exchange Project
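The Problem Scenario format described in this section (a brief persona profile followed by scenes annotated with Positive, Negative and Wish notes) can be sketched as a small data structure. This is a minimal illustration only: the paper does not describe how the Scenario Exchange Web stored scenarios, so the class and field names are hypothetical, and the scene descriptions and notes are invented placeholders rather than workshop data. The persona profile reuses the example the authors give in Section 6.1.

```python
from dataclasses import dataclass, field
from typing import List

# The three note categories attached to every scene, as named in the text.
CATEGORIES = ("positive", "negative", "wish")


@dataclass
class Scene:
    """One scene of a Problem Scenario (hypothetical representation)."""
    description: str
    positive: List[str] = field(default_factory=list)  # satisfied, positive attitudes and behavior
    negative: List[str] = field(default_factory=list)  # mistakes, passive attitudes and behavior
    wish: List[str] = field(default_factory=list)      # desires


@dataclass
class ProblemScenario:
    """A persona profile followed by an ordered list of scenes."""
    persona_profile: str
    scenes: List[Scene] = field(default_factory=list)

    def notes(self, category: str) -> List[str]:
        """Collect the notes of one category across all scenes, in scene order."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        return [note for scene in self.scenes for note in getattr(scene, category)]


# Placeholder content for illustration; not taken from the workshop records.
scenario = ProblemScenario(
    persona_profile="Male, 22 years old, hobby: skiing",
    scenes=[
        Scene(
            "Preparing for the trip the day before",
            positive=["Looks forward to the trip"],
            wish=["Wants to know what to bring"],
        ),
        Scene(
            "Arriving at the destination",
            negative=["Unsure where to go first"],
            wish=["Wants local recommendations"],
        ),
    ],
)

print(scenario.notes("wish"))
```

In a structure like this, a D-team could pull all Wish notes as a starting point for requirements. The persona is a single short string, which mirrors the limitation the authors report later: the profile carries little of the context and culture behind each scene.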
5 Project Results 5.1 Workshop 1: Scenario Design and Episode Exchange (December 2005, Japan) We will discuss the significance of this technique by using an example from a group that worked on the “Service Proposal for Fishermen” at the first workshop, which was held in Hakodate in December, 2005. Phase 1: Three students from Future University (U-team) went to a fishing port in the vicinity of Hakodate City for a fishing trip. The actions of one subject being observed were recorded using a camera and by taking notes. Additionally, details of the participant’s experiences were gathered at an interview, and were written up as a Problem Scenario before being uploaded onto the Scenario Exchange Web (Fig.2, Left). The Problem Scenario descriptions began with the fishing preparations on the day prior to the trip, right up until the moment when the fish that had been caught were eaten.
Fig. 2. Problem Scenario (Left) and Idea Sketch based on User Requirements (Right)
Fig. 3. Solution Scenario (Left) and Product Images (Right)
Four students from Chao Tung University (D-team) extracted user requirements from the Problem Scenario and proposed three Idea Sketches (Fig.2, Right). Phase 2: On the first day of the workshop, the students from Future University re-experienced fishing with the students of Chao Tung University. By actually experiencing the situation which they had previously only known via the scenario, D-team was able to understand the enjoyment of fishing and the persona’s feelings. During Phase 1, the members of D-team assumed that hardly anybody went fishing in the winter, and thought that young people did not fish. These assumptions were at odds with the facts. Re-experiencing enabled the students to become aware of such assumptions and led to an understanding of the persona’s intentions. On the second day, the students discussed in groups whether ideas were valid or not. U-team and D-team cooperated with one another to create a final design summary. Students then proposed an information system that enhances the enjoyment of fishing by allowing people to compete against other fishermen with regard to the size
of the fish they have caught using Solution Scenarios and 3D models (Fig.3). The proposed solutions provided extensive support, from a persona making preparations at a fishing tackle store to taking a fish print of the fish which they had caught as a memento. We feel that this was the result of a design that grasped the broad spectrum of the persona’s experiences.
5.2 Workshop 2: Mobile Taiwan and Ubiquitous City (May 2006, Taiwan)
The second workshop was held in Taiwan in May 2006. In this workshop the roles were reversed, with the Japanese students becoming D-team and the Taiwanese students becoming U-team. We use the proposal for a “device which supports communication between people who do not understand each other’s language during a trip” as an example for discussion. Phase 1: Two students from Chao Tung University (U-team) went on a trip to Tamsui in northern Taipei. Tamsui is a historical town blessed with water and greenery. In accordance with the SEP process, they created a Problem Scenario and uploaded it onto the Scenario Exchange Web (Fig.4, Left). The Problem Scenario described subjects who were unfamiliar with Tamsui freely traveling in the town. Four students from Future University (D-team) then proposed multiple ideas (Fig.4, Right) from the Problem Scenario, from the perspective of new tourism experiences in Tamsui.
Fig. 4. Problem Scenario (Left), Ideas based on User Requirements (Right)
Phase 2: On the first day of the workshop the Japanese and Taiwanese students joined together and went on a trip to Tamsui. In Taipei, Tamsui is a leading sightseeing area with many market stalls. The group walked around sampling the local food and taking in the natural scenery, historic buildings and landmarks. As a result of re-experiencing Tamsui, D-team realized that the streets were a maze, complicated and easy to get lost in. The group then became aware that many problems arose when the Japanese and Taiwanese students communicated with each other, such
Fig. 5. Solution Scenario (Left) and Product Image (Right)
as when trying to fill in direct information on the map for deciding the next destination, or when the Taiwanese students communicated about food which they recommended through pictures. Furthermore, they focused their attention on the importance of finger pointing in these activities. The group then proposed that an IC chip be attached to the finger and a device be worn over one of the eyes (Fig.5). A canvas could then hypothetically be spread in the air and words and pictures drawn onto it, operated by metaphorical movements of the fingers. Additionally, it would also be possible to search for information using a network. This was an innovative design proposal grounded in the actual exchange experience of U-team and D-team.
6 Discussion
6.1 Scenario and Hypothesis Exchange
In Phase 1, the Problem Scenario which U-team had created was exchanged for a hypothesis by D-team relating to it. It is thought that the Problem Scenario was useful in communicating the situation to D-team, the members of which were from a different cultural background. U-team had to condense the actual experience of the trip into the form of a scenario. Although the actions of the persona and the situations could be written into the scenario, the intentions of individual actions and the cultural meaning behind them were obscure. The persona profiles described, for example, that someone was male, 22 years old and that their hobby was skiing, but nothing more detailed. As a result, it became clear from the follow-up interviews that D-team had trouble understanding the intentions or culture, even though they were able to learn about the individual situations. Although scenarios were definitely able to relay the situation, it was difficult for them to get across factors related to those situations, such as context and culture.
M. Okamoto et al.
6.2 Re-experiencing in Workshops
Re-experiencing at Phase 2 was useful for developing a more refined solution, because D-team had an experience equivalent to that of U-team. The interactive efforts during Phase 1 also deepened rapport at the time of the workshops. Furthermore, it is thought that the level of cross-cultural understanding increased according to the depth of that rapport. We concluded from the students’ reports that proactive communication for gaining background knowledge of their counterparts, together with re-experiencing, made them realize that they had made assumptions about facts, which in real-world situations may lead to poor interaction between stakeholders and design processes.
7 Conclusion
The advantages and limitations of our discussion so far are divided and summarized in Table 3. In the SEP, students from completely different cultural backgrounds cooperated to design products. As a result of their activities, they created proposals which offered rich experiences and they were able to apply situational designs in practice. For these efforts, an understanding of the cultural background was important when designing. Cultural background is not limited to the national and ethnic cultures of Japan and Taiwan. Culture exists in different structures, such as age, generation, occupation, family or area. Scenarios are extremely helpful for grasping situations, but they are no more than a doorway to understanding. The repetition of questions about problematic points or obscure areas contained in scenarios leads to a deep understanding of the user (context or cultural background). Although re-experiencing grants a deeper understanding of the counterpart’s situation, it seems that the formation of a rapport between users and designers, such as that gained in our workshops, is also significant. The use of a representational scenario as a mediator has the effect of stimulating active participation, even when the participants’ counterparts are from countries where different languages are spoken. Scenarios are effective media in participatory design efforts. However, scenarios also have the following limitations:
• The information takes a lot of processing effort (resizing of photographs, composition of text, etc.) before it can be sent to the server.
• Text-intensive descriptions take time to read and write.
• There is no easy way of writing up background information (context and culture).
In order to solve these problems, we would like to create a design system which easily stores observational records on a server and allows viewers to understand situations easily.
Table 3. Advantages and Limitations of the Scenario Exchange Project

Phase 1: Scenario Exchange
Advantages:
1. D-team can understand situation in which U-team is placed.
2. Easy for D-team to discover problems from scenarios.
3. Scenarios give opportunity to try to understand intentions and culture. (Questions and interests arise easily)
4. Since scenarios and hypotheses are disclosed on web, they are always available for viewing.
Limitations:
1. U-team occasionally takes time to write up scenarios. Innovations for explicitly relaying situations are required.
2. Risk of subjective and objective perspectives becoming mixed in scenarios.
3. Skills for expressing self in foreign language (English) are required in order to communicate ideas to counterpart.
4. Possibility that D-team will carry their assumptions on reality.

Online Communication
Advantages:
1. Can hold discussions in real time by using chat software.
2. Can observe counterpart’s face and voice by using video chat.
3. Can exchange information which cannot be completely supplemented with scenarios.
4. Leads to rapport building.
Limitations:
1. Risk of exchanges taking up a lot of time.
2. U-team is required to be well acquainted with their own country’s culture and have skills to relay that knowledge adequately.

Phase 2: Re-experience
Advantages:
1. D-team can notice environments or information which was not described in scenarios. (Discovery of new problem areas)
2. D-team can notice assumptions about reality.
3. D-team can increase level of understanding of intentions and culture.
4. Can verify whether ideas are appropriate through re-experience.
Limitations:
1. Possibility that information gathering will be insufficient when time is short.
Wizard of Oz for Multimodal Interfaces Design: Deployment Considerations
Ronnie Taib and Natalie Ruiz
ATP Research Laboratory, National ICT Australia, Locked Bag 9013, NSW 1435, Sydney, Australia
School of Computer Science and Engineering, The University of New South Wales, NSW 2052, Sydney, Australia
{ronnie.taib, natalie.ruiz}@nicta.com.au
Abstract. The use of Wizard of Oz (WOz) techniques for the acquisition of multimodal interaction patterns is common, but often relies on highly or fully simulated functionality. This paper suggests that a more operational WOz can benefit multimodal interaction research. The use of a hybrid system containing both fully-functional components and WOz-enabled components is an effective approach, especially for highly multimodal systems, and collaterally, for cognitively loaded applications. We describe the requirements and the resulting WOz set-up created for a user study in the design of a traffic incident management application. We also discuss the impact of the ratio of simulated and operational parts of the system dictated by these requirements, in particular those related to multimodal interaction analysis.
Keywords: Wizard of Oz, Multimodal user interface, Speech and gesture, User-centred design.
particular task, and has been associated with the limited capacity of working memory [4, 5]. TIM operators are bombarded with live information that needs to be integrated, synthesised and entered into the system. They need to monitor heterogeneous information sources and respond to several complex incidents at a time, activities which induce very high levels of cognitive load. Thus, the research involves correlating two factors: high levels of cognitive load and use of multimodality, i.e. speech and gesture interaction.
Fig. 1. TIM User Interface in the Experiment
We hypothesised that the operators’ patterns of multimodality would significantly change as their cognitive load increased. In more detail, we expected:
• Increase in level of multimodality, i.e. using more than one modality when given the choice, as a strategy for cognitive load management;
• Increase in the frequency of complementary information carried across modalities, for example, where the same message is carried partly by speech and partly by gesture, with no semantic overlap;
• Decrease in the level of redundant information in interactions as cognitive load increased, i.e. fewer occurrences of input where each modality would carry the same message, with semantic overlap.
In this paper, we present the design of a Wizard of Oz (WOz)-based user experiment intended to verify these hypotheses. We review the constraints imposed by the study of multimodal interaction, given our research field and requirements, and discuss the trade-off existing between simulated and operational parts of the WOz.
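The distinction between complementary and redundant input drawn in these hypotheses can be sketched in code. The following is a minimal illustration only, assuming that each input turn has already been annotated with the semantic slots carried by speech and by gesture. The names `Turn`, `Pattern` and `classify`, the slot labels, and the extra `MIXED` category for partial overlap are assumptions introduced here, not part of the study.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    UNIMODAL = "unimodal"            # only one modality carries content
    COMPLEMENTARY = "complementary"  # both modalities, no semantic overlap
    REDUNDANT = "redundant"          # both modalities carry the same message
    MIXED = "mixed"                  # partial overlap (assumed extra category)


@dataclass(frozen=True)
class Turn:
    """One annotated input turn: the semantic slots carried by each modality."""
    speech: frozenset
    gesture: frozenset


def classify(turn: Turn) -> Pattern:
    """Classify a turn by the overlap between its speech and gesture slots."""
    if not turn.speech or not turn.gesture:
        return Pattern.UNIMODAL
    overlap = turn.speech & turn.gesture
    if not overlap:
        return Pattern.COMPLEMENTARY
    if turn.speech == turn.gesture:
        return Pattern.REDUNDANT
    return Pattern.MIXED
```

For example, a turn in which the subject points at an element while saying “Accident” would be complementary under this scheme, whereas saying “Accident” while making the closed fist shape would be redundant.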
2 Background
Multimodal interaction, though characterised as more intuitive or natural, is not yet robust enough to fulfil its promises. Product-oriented multimodal systems become
limited in their functionality to alleviate robustness problems, while research-oriented multimodal systems can suffer from over-customisation and application dependency, not allowing broader reuse of components. The Wizard of Oz (WOz) technique was recognised early as an essential tool for the design of multimodal interfaces, where novel interaction patterns were expected to appear [6]. WOz set-ups allow intuitive interaction while removing limitations such as input recognition error rates and misinterpreted semantic fusion of multimodal signals. The ethics of the method have been criticised, considering that the subjects are deceived into believing that they are interacting with a working system; however, Dahlbäck et al. [7] have noted a positive acceptance by the subjects when informed during post-hoc debriefing. The same authors mitigate another limitation, that of evaluating simulated rather than real working systems, on the grounds that human users can adapt to the capabilities of a system. While this remark is interesting and correct, we have found that an unconstrained WOz implementation is an efficient UCD tool since it may still require a large user adaptation to the system functionality. This highlights a crucial aspect of the development of a WOz set-up: the relationship between the boundaries of the real system and the simulated functionality of the WOz system.
3 Design Methods for MMUI Systems
3.1 Task Design for Eliciting Multimodal Interaction
A user study of multimodal user interaction requires well planned experiment tasks in order to elicit interaction that is as natural as possible while still providing targeted data. The traffic incident management scenario we designed comprised the tasks of marking entities or locations on a map, then deploying resources in relation to those incidents. Four sets of tasks with varying difficulty corresponded to four distinct cognitive load levels. Each set comprised three modality-based conditions, namely using speech only, gesture only, and multimodal (speech and gesture) interaction, the latter being the focus of this paper. Each condition had three repeat tasks in order to obtain statistical power. Hence, subjects had to perform 48 tasks in total. Each set of tasks was completed with the same interface, and the subjects were trained in all conditions during a preliminary session. Task difficulty can be induced in two ways. Firstly, the content and inherent complexity of the problem can be increased: this is known as intrinsic load [4]. Secondly, task difficulty can be induced by increasing the complexity of the representation of the data, known as extraneous load [4]. A good example of this is performing a simple ‘drag and drop’ operation with a mouse-driven UI versus a speech-driven UI. The operation is the same, so the difference in complexity originates from the affordances of the input modality. It is much simpler to increase the task difficulty (and cognitive load) by increasing the inherent complexity of the concepts in the task than by providing more complex representations, where the effects are much more subjective and unpredictable. For these reasons, we chose to
manipulate intrinsic load to increase task difficulty. The four distinct levels of cognitive load varied in five ways (Table 1):
• Visual complexity: The number of streets in each task increased from ~40 to ~60;
• Entities: The number of distinct entities in the task description was increased;
• Extras: The number of distractor entities (not needed for the task) increased;
• Actions to completion: The minimum number of actions required for task completion increased;
• Time Limit: The most difficult level was induced by a time limit for completion.

Table 1. Cognitive Load Levels

Level   Entities   Actions   Distractors   Time
1       6          3         2             ∞
2       10         8         2             ∞
3       12         13        4             ∞
4       12         13        4             90 sec.
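The task design of Section 3.1 and the load levels of Table 1 can be written down as a small factorial schedule. The sketch below is illustrative only: the dictionary keys, the `build_schedule` function and the representation of an unlimited time as `None` are assumptions made here. Multiplying the figures quoted in Section 3.1 (four levels, three conditions, three repeats) gives 36 tasks, while the text reports 48; 48 would correspond to four repeats per condition. The sketch therefore leaves the number of repeats as a parameter.

```python
from itertools import product

# Values copied from Table 1; None stands for "no time limit" (assumed encoding).
LOAD_LEVELS = {
    1: {"entities": 6, "actions": 3, "distractors": 2, "time_limit_s": None},
    2: {"entities": 10, "actions": 8, "distractors": 2, "time_limit_s": None},
    3: {"entities": 12, "actions": 13, "distractors": 4, "time_limit_s": None},
    4: {"entities": 12, "actions": 13, "distractors": 4, "time_limit_s": 90},
}

# The three modality-based conditions named in Section 3.1.
CONDITIONS = ("speech", "gesture", "multimodal")


def build_schedule(repeats=3):
    """Enumerate every (level, condition, repeat) combination of the design."""
    return [
        (level, condition, repeat)
        for level, condition, repeat in product(
            LOAD_LEVELS, CONDITIONS, range(1, repeats + 1)
        )
    ]
```

Calling `len(build_schedule(3))` gives the size of the design with three repeats, and `len(build_schedule(4))` the size with four.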
3.2 Modality Selection
Intelligent transportation systems manipulate large amounts of spatial data and traffic control rooms invariably offer wall displays providing an overview of complex situations. In this context, we introduce speech and gesture as two novel communication media, allowing the operator to interact from a distance with the large displays. We further discriminate modalities over these carrier media by the type of interaction that they allow. The resulting modalities are:
• Hand pointing: Simple deictic gestures can be used to point to items on the large display, e.g. a specific location on the map;
• Hand pausing: Pausing for a short lapse during the deictic movement results in the selection or the activation of the item being pointed at;
• Hand shapes: A few specific hand shapes have been allocated specific arbitrary meanings, e.g. a closed fist to tag a location as an accident;
• Speech: Natural language can be used for the selection or tagging of items;
• Menu bar buttons: Some graphical buttons on the display can be selected by hand pointing and pausing, in order to tag items.
The hypotheses of this study necessitate that all tasks be achievable in either a complementary or redundant multimodal way, hence all modalities should be made as semantically equivalent as possible. For this experiment, most tasks could be achieved using the three main modalities: speech, hand pointing and hand shapes. This required a careful crafting of the user interface so that the modalities provide similar functionality in spite of their various affordances. Table 2 provides some examples of equivalent speech and gesture interaction.
An important aspect to note is that the design allowed users the freedom to choose combined multimodal interaction. They could opt to interact with a single input, in either modality; or with more than one input, in the same or different modality. This applied to the task as a whole, e.g. performing the whole task using speech or using gesture; but also to each subtask, e.g. performing the item selection using pointing and tagging it using hand shape or speech.

Table 2. Examples of multimodal inputs

Functionality: Zooming in
Speech: “Zoom in to the top left quadrant”
Gesture: Point to the corners of the top left quadrant

Functionality: Selecting an element
Speech: “Select the Church on Street X”; or “Church on Street X”; or “St Mary’s Church”
Gesture: Point to the element and pause

Functionality: Requesting information on an element
Speech: <Select an element> then, “Information on selected element please”; or “Information”
Gesture: <Select an element> then, point to “Info” button

Functionality: Tagging an element as an accident
Speech: <Select an element> then, “Mark as accident”; or “Accident”
Gesture: <Select an element> then, point to “Accident” button; or make closed fist shape

Functionality: Tagging an element as an event
Speech: <Select an element> then, “Mark as event”; or “Event”
Gesture: <Select an element> then, point to “Event” button; or make scissors shape
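The semantic equivalence illustrated in Table 2 can be modelled as a lookup from (modality, input) pairs to a shared command. The following sketch is an illustration under stated assumptions, not the study's software: in the experiment the wizard interpreted speech and hand shapes manually, and the command names, the gesture tokens such as `shape:closed_fist`, and the `interpret` function are hypothetical.

```python
from typing import Optional

# Hypothetical command names; the speech phrases and gestures follow Table 2.
EQUIVALENTS = {
    ("speech", "information"): "REQUEST_INFO",
    ("gesture", "point:info_button"): "REQUEST_INFO",
    ("speech", "mark as accident"): "TAG_ACCIDENT",
    ("speech", "accident"): "TAG_ACCIDENT",
    ("gesture", "point:accident_button"): "TAG_ACCIDENT",
    ("gesture", "shape:closed_fist"): "TAG_ACCIDENT",
    ("speech", "mark as event"): "TAG_EVENT",
    ("speech", "event"): "TAG_EVENT",
    ("gesture", "point:event_button"): "TAG_EVENT",
    ("gesture", "shape:scissors"): "TAG_EVENT",
}


def interpret(modality: str, raw_input: str) -> Optional[str]:
    """Return the command for an input, or None if it is not in the table."""
    return EQUIVALENTS.get((modality, raw_input.strip().lower()))
```

Because both modalities resolve to the same command, a logged session can be analysed in terms of what was requested, independently of how it was expressed.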
Using automatic speech and video-based gesture recognition would dramatically decrease the usability of the system because of the average recognition rates exhibited by such technologies [8]. Reduced usability, in turn, forces subjects to adapt to the system limitations, which works against our primary objective of collecting natural inputs. Hence a WOz approach was selected for this set of experiments, where the wizard manually performed speech and hand shape recognition. An automated hand tracking module developed in-house was found to be sufficiently robust to use during the experiment.
3.3 Data Collection
Given our hypotheses and selected modalities, a number of interaction features have to be captured and analysed. Each stream reflects different aspects of the interaction and involves specific requirements.
Application-Generated Events. The application monitors the progress towards task completion by recording relevant actions such as the selection or tagging of items on the map. The time of occurrence of such actions may also be used to estimate the subject’s performance on the task.
Speech Input. Speech is a major input in many multimodal user interfaces, and we decided to use unconstrained, natural language input during this experiment. The wizard is in charge of interpreting inputs and a the task. However, a complete recording of the speech inputs is very desirable as it contains rich features for the post-analysis of the interaction. Since this experiment involved a single user at a time, we opted for a directional microphone connected to a camcorder in order to capture speech. The major benefit is the inherent synchronisation with the video signal.
Gesture Input. An in-house gesture tracking and recognition software module was used to capture hand movements and shapes. This provides untethered gesture interaction with fair reliability in a lab setting, by using a dedicated high quality Firewire camera focusing on the subject’s hand. The subjects were also videotaped with a conventional camcorder in order to capture the overall gestures (see Figure 2).
Biosensor Data. Physiological data was captured in order to evaluate the level of stress and arousal of the subject during the interaction. In particular, galvanic skin response (GSR) and blood volume pulse (BVP) were recorded using an external device with finger sensors.
Fig. 2. Gesture and speech TIM prototype
3.4 Data Type Limitations
Each data stream provides a rich source of information for the analysis of multimodal interaction; however, there are inherent limitations that have to be balanced in view of the experiment’s purpose.
Volume. Audio-visual information is very rich but high quality recordings imply large storage capacity requirements and potential playback and trans-coding issues. Recording on tapes (e.g. MiniDV) requires transfer to a computer at a later stage, often with trans-coding. In addition to the cost of consumables, this process is extremely time consuming. Hence we opted for connecting the camcorder directly to a computer and recording the stream directly on the hard drive. A flat format codec was used for the video streams in order to ensure correct synchronisation between audio and video channels. The resulting files are very large though, so we decided to record them directly on external hard drives in order to provide the maximum flexibility during the post-analysis, while avoiding file copies and transfers that have the potential to corrupt data. Biosensor data also generates large amounts of information due to the high sampling rate at which it should be acquired. Since the samples are short text records, the overall file sizes remain easily manageable.
Reliability. Multimodal interaction analysis relies on the combination of distinct modality streams in order to improve recognition of other parameters, such as cognitive load. This mutual disambiguation process [8] is most effective when the individual streams are unreliable because of inaccurate recognisers or user context, e.g. automatic speech recogniser or noisy rooms. The biosensor and acquisition chain is fairly complex, hence often inaccurate. The position and stability of the sensor are paramount for reading GSR, for example. In our experiment, subjects used their main hand for gesture interaction, while their other hand was connected to the biosensors and rested on the backrest of a chair. Any unnecessary movement with the ‘sensor’ hand could cause a disruption in the reading. While it may be difficult to compare results across subjects, within-subject evaluation is reasonably stable with this set-up. Another key reliability issue is manual annotation. Uniformity among annotators is difficult to achieve and requires precise annotation rules and cross-validation between annotators. The precision of manual annotations usually comes at a cost; for example, we annotated start and end of speech with a precision of around 10 ms, which required specialised software tools and more annotation time. Finally, data precision is important as it can restrain the span of numerical analysis. Biosensor technologies vary in cost and precision, so a trade-off between these parameters dictates the final choice. In this experiment, we used professional grade biosensors, with a real-time link to the computer for acquisition.
Synchronisation. Accurate synchronisation of all the data streams is crucial to the effective annotation and analysis of multimodal interaction. Logging data on separate computers and devices requires means to ensure synchronisation during recording, for example using the Network Time Protocol (NTP) to synchronise the computers’ time.
But it also requires means to synchronise streams post-hoc, which may be unreliable for video or biosensor data, for example. To alleviate this issue, we directed all data streams, except the audio-visual stream, to a single software logging application. The latter provides a uniform time scale as well as a preformatted output that eases annotation. Output is buffered to memory during the tasks, in order to avoid loss of information, and is stored to disk files between tasks. The post-hoc synchronisation of the audio-visual stream is possible thanks to auditory beeps played by the system at the beginning and end of each task. The time of occurrence of the beeps is logged by the unified software logger, and can be manually reconciled with the audio channel during annotation.
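The logging and reconciliation scheme described above can be sketched as follows. This is a minimal illustration, not the logger used in the experiment: the class `TaskLogger`, the JSON-lines output format and the functions `estimate_offset` and `to_logger_time` are assumptions introduced here. It also assumes that the annotator has noted the times at which the beeps occur in the audio-visual recording, so that a constant offset between the two time scales can be estimated.

```python
import json
import time
from statistics import mean


class TaskLogger:
    """Buffers events in memory during a task; written to disk between tasks."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # single clock shared by all logged streams
        self._buffer = []

    def __len__(self):
        return len(self._buffer)

    def log(self, stream, payload):
        """Timestamp an event from any stream on the logger's time scale."""
        self._buffer.append(
            {"t": self._clock(), "stream": stream, "payload": payload}
        )

    def flush(self, path):
        """Append buffered events to a file as JSON lines, then clear them."""
        with open(path, "a", encoding="utf-8") as handle:
            for event in self._buffer:
                handle.write(json.dumps(event) + "\n")
        written = len(self._buffer)
        self._buffer.clear()
        return written


def estimate_offset(logged_beeps, audio_beeps):
    """Mean of (logger time - recording time) over paired beeps, in seconds."""
    if not logged_beeps or len(logged_beeps) != len(audio_beeps):
        raise ValueError("need the same non-zero number of beeps in both lists")
    return mean(lt - at for lt, at in zip(logged_beeps, audio_beeps))


def to_logger_time(audio_time, offset):
    """Map a time read off the recording onto the logger's time scale."""
    return audio_time + offset
```

With one beep at the start and one at the end of each task, two pairs per task are available; a large difference between the two per-beep offsets would suggest drift between the recording and the logger clock.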
4 Discussion: Level of Actual Functionality
In the field of multimodal research, system robustness is critical to elicit intuitive and reliable interactions from the subjects, e.g. users will compensate unnaturally for errors in recognition of a certain input and may stop using it or increase use of other modalities that are perceived to be less error prone. Hence, WOz systems are usually highly or fully simulated, sometimes based on multi-wizard support (e.g. one for speech recognition and one for output preparation). However, there are no general guidelines available in terms of design factors for WOz systems, and our experiment allowed us to determine some characteristics of the data that greatly impact the design, such as volume, reliability and synchronisation. In addition to those characteristics, we discovered that the balance between functional components and ‘wizardry’ is highly dependent on the user study design and the goals of the research. When the goals are largely evaluative, more functional modules are necessary, such that feedback on actual functionality can be assessed and incorporated into final versions of software. In addition, having a fairly functional system makes product development far more achievable. In our case, the focus was on identifying multimodal behavioural patterns in highly multimodal systems: the goals were exploratory and we aimed to capture naturalistic interaction. Though input could only be conveyed through three asynchronous modalities (speech, hand movements and hand shapes), the temporal, syntactical and semantic characteristics of the interaction were highly complex. To illustrate: the least expressive modality, free-hand gesture, could be used to issue 11 distinct commands in a single movement, each of which could then be combined with other commands, in groups of two or three, along various temporal arrangements to alter the semantics of the command. Further, any command could also be conveyed through the speech modality, and again, combined with others in various temporal arrangements. The choice of modality and the temporal arrangements are very delicate characteristics of interaction and subject to both unreliable input recognition and individual subject differences [9]. The state of the art in fully functional speech and gesture recognition would not be sufficiently error-free to produce unbiased interaction, and for this reason, the decision was made to use wizard-based simulation in place of the recognition and multimodal fusion engines. Giving the wizard this responsibility meant that very few other tasks could be allocated to him, so as to prevent overloading. The limitations of the wizard’s attention span, and the lack of resources
to provide a second wizard, drove the rest of the functionality to be automated as much as possible. The WOz technique relies on the user believing the system is fully functional. This gives rise to two aspects of system design that impact the implementation of the system and hence the percentage of actual vs. simulated functionality. The complex form of input in multimodal interaction requires equally complex forms of output. Though primarily graphical, the task scenario was also required to provide textual output at different stages of input, forcing the lag time for system feedback to be as short as possible. The feedback for each different kind of command may require more than one element to appear on the screen, or some text at various stages of the command being issued. The back-end logic of the application, e.g. responses and immediate output, was fully functional and largely operated by the wizard once user input was interpreted, but the wizard did not need to concern themselves with selecting the content or form of output on the fly. The wizard’s interface was tailored to suit, providing large buttons which would facilitate this process. Another factor that may also drive the decision of how to distribute the ratio of functional vs. simulated parts in a WOz system is the post-analysis required. The more system events are fully automated, the more markers can be placed on the data and the more features can be recorded on the fly, such as time stamps and command sequences and types. The centralisation of system models on a single machine allows a better synchronisation of input signals, facilitating data analysis post hoc.
In conclusion, our WOz design allowed us to collect the target data and to confirm our hypotheses. However, there are still many aspects of multimodal user interaction that need addressing, especially in view of the evaluation of the cognitive load experienced by a user. Reflecting on the design choices brought some important insights for the design of future WOz-based user experiments. In particular, we identified data characteristics that have a deep impact on the design choices, and we clarified the necessary trade-off between implemented and simulated functionality.
References
1. Oviatt, S.: Ten Myths of Multimodal Interaction. ACM, Communications of the ACM 42(11), 74–81 (1999)
2. Bolt, R.A.: “Put-That-There”: Voice and Gesture at the Graphics Interface. In: Bolt, R.A. (ed.) Proc. 7th annual conference on Computer Graphics and Interactive Techniques, Seattle, WA, USA, pp. 262–270. ACM Press, New York, USA (1980)
3. Schapira, E., Sharma, R.: Experimental Evaluation of Vision and Speech based Multimodal Interfaces. In: PUI’01, Workshop on Perceptive User Interfaces, Orlando, FL, pp. 1–9. ACM Press, New York, USA (2001)
4. Paas, F., et al.: Cognitive load measurement as a means to advance cognitive load theory. Educational Psychologist 38, 63–71 (2003)
5. Baddeley, A.D.: Working Memory. Science 255(5044), 556–559 (1992)
6. Salber, D., Coutaz, J.: A Wizard of Oz platform for the study of multimodal systems. In: Ashlund, S., Mullet, K., Henderson, A., Hollnagel, E., White, T. (eds.) INTERACT’93 and CHI’93 Conference Companion on Human Factors in Computing Systems, Amsterdam, The Netherlands, pp. 95–96. ACM Press, NY (1993)
7. Dahlbäck, N., Jönsson, A., Ahrenberg, L.: Wizard of Oz studies: why and how. In: Gray, W.D., Hefley, W.E., Murray, D. (eds.) Proc. 1st international Conference on intelligent User interfaces, Orlando FL, USA, pp. 193–200. ACM Press, NY (1993)
8. Oviatt, S., Cohen, P.: Perceptual user interfaces: multimodal interfaces that process what comes naturally. Communications of the ACM 43(3), 45–53 (2000)
9. Oviatt, S., DeAngeli, A., Kuhn, K.: Integration and Synchronization of Input Modes During Multimodal Human-Computer Interaction. In: SIGCHI conference on Human factors in computing systems, Atlanta, GA, USA, pp. 415–422 (1997)
Extreme Programming in Action: A Longitudinal Case Study
Peter Tingling1 and Akbar Saeed2
1 Faculty of Business Administration, Simon Fraser University, 8888 University Drive, Burnaby, Canada V5A 1S6 [email protected]
2 Ivey School of Business, University of Western Ontario, 1151 Richmond St. N., London, Canada N6A 3K7 [email protected]
Abstract. Rapid Application Development (RAD) has captured interest as a solution to problems associated with traditional systems development. Describing the adoption of agile methods and Extreme Programming by a software start-up, we find that not all XP principles were adopted equally and that adoption was subject to temporal conditions. Small releases, on site customer, continuous integration and refactoring were most vigorously advanced by management and adopted by developers. Paired programming, on the other hand, was culturally avoided.
Keywords: Extreme Programming, Agile Methods, Rapid Application Development.
1 Introduction
The speed and quality with which systems are delivered continue to concern both practitioners and academics. Traditional methodologies, while praised for their rigor, are often criticized as unresponsive, bloated, bureaucratic, or contributing to late and over budget systems that, when delivered, solve problems that are no longer relevant. Various solutions have been proposed. Frequently combined under the rubric of Rapid Application Development (RAD), these include extensive user involvement, Joint Application Design, prototyping, integrated CASE tools, and more recently, agile methods such as eXtreme Programming (XP). Following a qualitative study of agile methods and concepts, we conclude that adoption and the extent of agile principle appropriation are affected temporally and by culture. Coding standards, for example, may initially be excluded in a search for creativity and flexibility. Similarly, in addition to the continuous improvement of refactoring, bursts of intense focus also occur.
Life Cycle is a well adopted ‘systematic, disciplined, quantifiable approach to the development, operation and maintenance of software’ [2, 6]. However, with increasing backlogs, some high profile development failures, and the need to adapt to emerging business conditions, the SDLC has been subject to criticism that it is constraining, heavyweight and results in projects that are outdated before they are finished [7]. Consequently, many organizations have adopted alternatives: incremental development with constant customer feedback (Rapid Application Development); structured processes where constituents collectively and intensely review requirements (Joint Application Development); the construction of partial systems to demonstrate operation, gain acceptance or establish technical feasibility (Prototyping); and tools that assist in software development and business analysis (Computer Aided Systems Engineering).

Table 1. Agile Principles of Extreme Programming

XP Principle: Rationale and Description
40-Hour Work Week: Alert programmers are less likely to make mistakes. XP teams do not work excessive hours.
Coding Standards: Co-operation requires clear communication. Code conforms to standards.
Collective Ownership: Decisions about the code are made by those actively working on the modules. All code is owned by all developers.
Continuous Integration: Frequent integration reduces the probability of problems. Software is built and integrated several times per day.
Continuous Testing: Test scripts are written before the code and used for validation. Ongoing customer acceptance ensures features are provided.
On-Site Customer: Rapid decisions on requirements, priorities and questions reduce expensive communication. A dedicated and empowered individual steers the project.
Pair Programming: Two programmers using a single computer write higher quality code than individual programmers.
Planning Game: Business feature value is determined by programming cost. The customer decides what needs to be done or is deferred.
Refactoring: The software is continually improved.
Simple Design: Programs are simple and meet current rather than future evolving requirements.
Small Releases: Systems are updated frequently and migrated on a short cycle.
System Metaphor: Communication is simplified and development guided by a common system of names and description.
Source: Adapted from [8]
In 2001, a group of programmers created a manifesto that embodied the core principles of a new methodology [9]. An extreme application of RAD, agile methods capitalize on member skill and favor individuals and interactions over processes and tools; working software over comprehensive documentation; customer collaboration over contract negotiation; and responding to change over following plans and requirements. Dynamic, context specific, aggressive and growth oriented [10, 11], agile methods favor time boxing
P. Tingling and A. Saeed
and iterative development over long or formal development cycles. The most widely adopted agile development methodology, eXtreme Programming, is a generative set of twelve inter-related principles. These are described in Table 1.
3 Methodology and Data Collection

For this study, we used a case-oriented approach, which is an “empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between phenomenon and context are not clearly evident” [12]. Site selection was opportunistic, the result of an ongoing relationship with Semper Corporation, a year-old start-up developing an interactive software product. Data was collected between August 2005 and December 2006 and consisted of interviews with employees, observation of the environment and work practices, and retrospective examination of documents and email [13]. These are described in Table 2.

Table 2. Data Collection Activities

Data Type | Description
Interviews | The development staff and company principals were regularly interviewed throughout the year-long data gathering.
Observation | Programming and development staff were observed at least weekly. This was both active (at the development offices) and passive (by remote viewing of video cameras).
Artifact examination | Employment and programming records, progress and bug reports, and copies of each build and version of the product were reviewed, as was email correspondence.
The main steps in analysis involved the identification and definition of concepts, followed by theorizing and the write-up of ideas and relationships. Content relating to agile methods and extreme programming was separated and categorized according to discrete principles. Illustrative yet concise examples were then selected. Direct quotations have been italicized and placed within quotation marks.
4 Extreme Programming at Semper Corporation

This section reviews the principles described in Table 1. Although these principles were meant to be generative rather than all-inclusive, typical recommendations recognize their inter-relatedness and suggest that they be implemented in their entirety, with adaptation encouraged only once use and familiarity are established [6]. Findings are summarized in Table 3.

40-Hour Work Week. Company policy was one of flexible work hours. Other than core hours between 10:00 and 15:00, developers were free to set their own schedule. While there were work weeks longer than 40 hours (during customer testing or
Extreme Programming in Action: A Longitudinal Case Study
resolving production problems) this was the exception rather than the norm. There was no overtime compensation. Another factor affecting the schedule was the young age (average 21) of the developers, who adopted nocturnal habits because of their social schedules. For example, advising when he might be in the office, one of the developers noted “I will be in early tomorrow - around 10:00-10:30”. Email conversations (where a message was sent and a response received) between the developers and managers declined during the core hours from 45% to 31% and increased from 6% to 37% between 22:00 and 04:00.

Table 3. Adoption Faithfulness of XP Principles

eXtreme Programming Principle | Adoption Level* | Temporal Effects | Summary
40-Hour Work Week | Full | N | Developers worked flexible but regular workdays.
Coding Standards | Low to Partial | Y | Standards were initially avoided but later implemented.
Collective Code Ownership | Partial | Y | Code was officially shared but developers exhibited possessiveness.
Continuous Integration | Full | N | Code was rarely broken and was continually linked and compiled.
Continuous Testing | Partial to Full | Y | Testing was continuous but advance scripts were not created. Black box testing was phased.
On-Site Customer | Full | N | The CEO and Analytic Director acted as customers.
Pair Programming | Low | N | Programmers were independent except when difficulties or interdependencies existed.
Planning Game | Full | N | Value engineering balanced features against time and budget.
Refactoring | Full | Y | Modules were constantly improved. Periodic bursts of dramatic improvement occurred.
Simple Design | Full | N | Working software was favored.
Small Releases | Full | N | Frequent (weekly) build cycles.
System Metaphor | Full | N | Communication was simple and informal but unambiguous.

*Adoption is considered Complete, Partial or Full.
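The counts implied by Table 3 can be checked with a short tally. The following is only an illustrative sketch: the row values are transcribed from the table, the "Full"/"N" entry for the 40-Hour Work Week is read from the table's trailing cells and should be treated as an assumption, and the function names are hypothetical.

```python
# Rows of Table 3 as (principle, adoption level, temporal effect observed).
# Assumption: the first row ("40-Hour Work Week") is Full / N, inferred
# from the values left over at the end of the extracted table.
TABLE_3 = [
    ("40-Hour Work Week", "Full", False),
    ("Coding Standards", "Low to Partial", True),
    ("Collective Code Ownership", "Partial", True),
    ("Continuous Integration", "Full", False),
    ("Continuous Testing", "Partial to Full", True),
    ("On-Site Customer", "Full", False),
    ("Pair Programming", "Low", False),
    ("Planning Game", "Full", False),
    ("Refactoring", "Full", True),
    ("Simple Design", "Full", False),
    ("Small Releases", "Full", False),
    ("System Metaphor", "Full", False),
]


def fully_adopted(rows):
    """Return the principles whose adoption level is exactly 'Full'."""
    return [name for name, level, _ in rows if level == "Full"]


def changed_over_time(rows):
    """Return the principles for which a temporal effect was recorded."""
    return [name for name, _, temporal in rows if temporal]


if __name__ == "__main__":
    print(len(fully_adopted(TABLE_3)), "of", len(TABLE_3), "fully adopted")
    print("Temporal effects:", ", ".join(changed_over_time(TABLE_3)))
```

Under these assumptions, the tally gives eight fully adopted principles out of twelve, which is the figure the conclusions rely on.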
Coding Standards. Coding standards were initially avoided. For example, rather than conventionally declaring all variables at the beginning of a module, one programmer simply added them wherever they were needed. Requests to impose standards were generally ignored by management until the program became sufficiently complex as to require tighter control and the CEO realized development teams continually rewrote the variables when they refactored or changed modules. Staff attrition later resulted in de facto standards.
Collective Code Ownership. With the exception of a few core modules, module decisions were made by the active developers. As a consequence, modules were continually rewritten according to individual preferences and differing ideas. Although modules had multiple authors, one team tended to write the analytic modules while another wrote the graphically intense reporting component. While code was officially shared and located on a common storage medium, developers were reluctant to adapt code written by others and continued to speak of “their code”.

Continuous Integration. The programming environment was Visual Basic in a Microsoft .NET framework. Modules were tested in isolation and embedded into the program several times a day, in addition to the formal build schedule with weekly integration. During the sixteen months of observation, more than 35 complete formal versions of the product and 225 integrations were compiled. In addition, internal and external users were given replacement Dynamic Link Libraries (DLLs) that encouraged up-to-date testing. Despite a preference for working code, there were several occasions when changes to the data model required extensive rewrites and the code was broken for up to two weeks as modules were re-written and tested.

Continuous Testing. Test scripts were not written in advance of coding (as recommended by XP) and were frequently developed in parallel. Ongoing functional and compatibility testing used standardized and ad hoc test scripts. Because the design was modular and addressed a specific rather than generic problem, the majority of the code could be tested in isolation. Integration testing was completed after each weekly build and was conducted by management and external users. Black box testing was conducted using a combination of end user and test samples. HCI and usability aspects were the most dynamic, with the majority of the changes immediately accepted or rejected by the onsite customer.
The few exceptions to this occurred when the developers were given free rein to creatively design new ideas or when previously adopted choices were abandoned. The CEO often challenged the developers to present complex information simply and intuitively rather than providing them with a design to be implemented. After reviewing the work, he frequently commented that they seemed to anticipate what he wanted or were able to implement what he had been unable to imagine. In addition to a comprehensive series of test scripts that were developed and executed, the program was also provided to industry professionals. Two beta tests involving early customer experience programs were used by the company for acceptance testing, and both of these surfaced unanticipated areas for attention. Semper used formal bug and feature tracking software for major or outstanding problems, but generally the developers tended simply to fix problems as soon as they were identified. Often the first indication that management had of a problem was when a fix was provided or noted in the change log. Discussing the need to document bugs, the programmers opined that judgment was used to determine whether a bug report should be completed after the fact and that this was only done for difficult or particularly complex solutions.

Onsite Customer. Because Semper was an early-stage pre-market company, it did not have customers in the traditional sense. Instead, the product vision was provided by the CEO and the Director of Analytics. Originally trained as a mainframe
programmer, the CEO was empathetic to technical problems but was not familiar with modern systems development and did not get involved in construction details. He would often jokingly describe programming and analytic modules as “it is just a sort and a print right - what is the big deal – three to four hours programming tops!” and would often laugh and offer to write some code himself if he thought some simple aspects were taking too long. He would challenge developers by reminding them that they learned little by programming simple tasks. A developer's response to his question about a particularly complex change provides an example: “This is possible but will be hard to do. This is because [text redacted]. Anyway, I’m not going to start talking about the how-to parts. I know your response will be ‘if it were easy, why would you want to do it?’ ”. The Director of Analytics, on the other hand, had current technical skills and would often interact directly with the developers and offer suggestions. Generally developers worked interactively with the management team and demonstrated prototypes for immediate feedback. Where planned requirements or changes necessitated extensive coding and development work, Unified Modeling Language use cases, conceptual sketches and data models were used as scaffolding to be discarded in favor of a prototype. A great deal of the management and developer communication was oral, but the fact that offices were physically separated meant that email and instant messenger were also used a great deal. The main design artefacts were the data model and build reports that identified progress and what was planned for or deferred to the next iteration.

Pair Programming. Pair programming was not adopted. Developers worked in dyads but each at their own workstation. Modules were coded by one person, although complex or difficult problems were shared.
Although management discussed paired programming as an option with developers when they were hired (new applicants were interviewed by the programming staff and in addition to technical competency had to “fit in”), it was not pursued. Developers, hired directly from university where assignments and evaluations were competitive and individual, did not embrace collective approaches. While the environment was co-operative, developers would occasionally compete to see who could write the most efficient and effective code. Further exacerbating the difficulties with paired programming were work schedules, staff turnover, and personalities. Two of the development staff, for example, preferred to listen to iPods and to be isolated. Although programmers would often compete to see who could develop the better module, they were reluctant to comment on code written by co-workers except in a joking manner. However, once a programmer left the company or was assigned to a different capacity they immediately became part of the out group, and their code would often be referred to as “strange”, “poorly written” or “in need of a re-write”. Although developers would blame problems on former co-workers, they would laugh when reminded that they might ultimately be subject to the same criticism. After one developer had been gone for six months another noted it was “too late to blame [redacted] now”.

Planning Game. Management realized that development had aspects of both art and science. Nevertheless the planning game was used extensively and trade-offs between time and features were routine. Estimates were almost exclusively provided by the developers and once established were treated as firm deadlines against which they
were evaluated. Development was categorized into Structural Requirements, Differentiating Features, Human Computer Interaction, and Cosmetic Changes.

Structural Requirements. These were features and capabilities outlined in the business plan, considered core and treated as priority and foundational items.

Differentiating Features. These provided differentiating or competitive capabilities and were further grouped into “must haves”, “nice to have” and “defer”. The majority of the “must haves” differentiated the product. Additions to this list resulted from competitive reviews or extensions to existing capabilities suggested by users. Typically a few “must haves” were included each week and developers knew that these could delay the build (there were two or three occasions where a deadline was missed). “Nice to have” items were optional. There were between eight and twenty of these each week, although they were added to a cumulative list. Approximately three-quarters of these were included in each time box. “Defer” items were a combination of large and small features or changes that could be moved over time into the “must have” or “nice to have” group. Examples ranged from the tutorial to complex encryption requirements that were included in subsequent builds.

Human Computer Interaction. Although management realized that HCI was important, it was considered secondary to programming, and design staff were not hired until the first version of the product had been completed. The main proponent of a more expanded view of usability was the Director of Analytics. Rather than criticize the existing product, he would usually make his point by identifying other products that he believed exemplified good design. The result of these comparisons was a complete re-write from the existing traditional Windows-based interface (icons, menus and pointers) to one that was much more intuitive and conversational.
Despite the fact that Human Computer Interface issues were later seen as critical to the system and a great deal of time was spent in design, HCI was considered technically minor by the CEO.

Cosmetic Changes. Semper viewed all non-programming changes as important to customers and use but mainly “cosmetic”. There were numerous evolutions and changes to text, font, color, position and alignment. These were continuous and, in the words of a developer, were “tedious but not hard”. The frequency and approach used to manage these changes are described in Table 4.

Refactoring. Code focused on functionality and was continually refined and improved. The first product build, created after just two weeks, was essentially a shell program but was designated version 1.0.0. Substantive changes incremented the second-order digit and minor changes usually incremented the low-order identifier. In addition there were several major changes. For example, a complete change in system interface required that all of the modules be re-written simultaneously, and the main analytic engine (over 6,000 lines of code) was completely re-written over a two-month period. As such, in addition to continuous improvement through refactoring, there were periods of intense improvement in function, usability, reliability, speed and stability.
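The version-numbering convention described for the builds can be illustrated with a minimal sketch. It assumes a three-part identifier starting at 1.0.0; resetting the low-order identifier after a substantive change is an assumption, as the case data does not say what happened to it, and the behaviour of the first digit is not described at all, so it is left untouched here. The names `bump` and `format_version` are hypothetical.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def bump(version: Version, change: str) -> Version:
    """Return the next version for a 'substantive' or 'minor' change.

    Assumption: a substantive change resets the low-order identifier to 0.
    The first digit is never changed, because the source does not describe
    when (or whether) it was incremented.
    """
    first, second, low = version
    if change == "substantive":
        return (first, second + 1, 0)
    if change == "minor":
        return (first, second, low + 1)
    raise ValueError(f"unknown change type: {change!r}")


def format_version(version: Version) -> str:
    """Render a version tuple in dotted form, e.g. (1, 0, 0) -> '1.0.0'."""
    return ".".join(str(part) for part in version)


if __name__ == "__main__":
    v = (1, 0, 0)              # the initial shell build
    v = bump(v, "minor")       # e.g. a text or alignment fix
    v = bump(v, "substantive")
    print(format_version(v))
```

A tuple keeps comparisons simple: Python orders tuples element by element, so later builds compare as greater than earlier ones.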
Table 4. Development Taxonomy

Type | Description | Number of Changes | Approach
Structural | Fundamental aspects or product core. | <12 | Simple Design & System Metaphor
Feature | Market and competitive requirements. Grouped into “must have”, “nice to have” and “defer”. | >100 | On-Site Customer, Planning Game, Simple Design, & Refactoring
HCI | Usability issues such as placement of glyphs, screen dialogue and presentation. | >250 | On-Site Customer, Small Releases, Continuous Testing, Refactoring
Cosmetic | Icons, glyph, color, dialogue and position changes (not all simple). | >1,000 | On-Site Customer, Refactoring, & Small Releases
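The triage of features into “must have”, “nice to have” and “defer” within a time box can be sketched as follows. This is a hedged illustration rather than Semper's actual procedure: the hour-based estimates, the fixed capacity, the greedy fill order and the example feature names are all assumptions introduced for the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    name: str
    priority: str          # "must have", "nice to have" or "defer"
    estimate_hours: float  # developer-provided estimate (assumed unit)


def plan_time_box(
    backlog: List[Feature], capacity_hours: float
) -> Tuple[List[Feature], List[Feature]]:
    """Split a backlog into (selected, carried_over) for one time box.

    Must-haves are always selected, even if they exceed capacity (the
    text notes they could delay the build). Nice-to-haves are added in
    backlog order while capacity remains. Deferred items are carried over.
    """
    selected = [f for f in backlog if f.priority == "must have"]
    remaining = capacity_hours - sum(f.estimate_hours for f in selected)
    carried: List[Feature] = []
    for feature in backlog:
        if feature.priority == "must have":
            continue
        if feature.priority == "nice to have" and feature.estimate_hours <= remaining:
            selected.append(feature)
            remaining -= feature.estimate_hours
        else:
            carried.append(feature)
    return selected, carried


if __name__ == "__main__":
    backlog = [  # hypothetical items
        Feature("encryption", "defer", 30),
        Feature("report export", "must have", 20),
        Feature("tooltip text", "nice to have", 5),
        Feature("new chart type", "nice to have", 25),
    ]
    chosen, later = plan_time_box(backlog, capacity_hours=40)
    print([f.name for f in chosen], [f.name for f in later])
```

Carried-over items stay on the cumulative list, which matches the description of “nice to have” items accumulating from week to week.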
Simple Design. Development was guided by simple principles while trying to avoid architectural constraints or what the CEO called “painting themselves into a corner”. Problems were designated BR or AR. BR problems were those that impacted customers and had to be fixed before revenue. AR problems were those that could be solved with the increased resources provided after revenue. The planning game arbitrated between the cost of desired features, and refactoring delivered functionality that was later improved. Conceptually, developers were told to consider the metaphor of a ‘modern digital camera’, where a high level of complexity and functionality was behind a simple interface that users could employ in a myriad of sophisticated ways.

Small Releases. Time boxing was part of the discipline. Consequently, developers released a new version almost every second week. This was relaxed during major revisions and accelerated to almost daily versions when approaching a major deadline. In addition, management and users were also given replacement modules (DLLs) that delivered specific functionality, fixed problems or generally improved the code. Despite periods where developers complained that the ongoing short-term focus impeded delivery of a more systematic and quality-oriented product, management remained committed to the concept of small releases. In a twelve-month period developers delivered approximately 35 complete versions, with almost two dozen non-developer compiles and more than 150 replacement DLLs over and above the build cycle. Working through the planning game, management and the developers laid out a build schedule that was tracked using basic project management tools and rarely modified.

System Metaphor. Communication was simple and direct, facilitated most often by the data model, the program itself, and the fact that, with the exception of the Director of Finance and two junior business analysts, all employees had been formally trained in systems analysis or computer programming.
Design of the products was handled through a combination of strategic and tactical adjustments. Joint Application Design
(JAD) sessions were used to begin product development and after each of the beta programs and before each of the three program redesigns. Tactically, designers and management met twice a week to receive the weekly build and to review progress, bug status and planned revisions to the upcoming version schedule. We next draw conclusions about the degree and extent of appropriation, discuss limitations and suggest future research and implications.
5 Conclusions and Summary

Semper’s partial adoption of agile principles reinforces other findings that indicate up to two thirds of large companies have adopted ‘some form’ of agile methods [8], which are then blended with more traditional practices. Practitioners have not adopted XP in an all-or-none action, and faithful appropriation of all principles seems to be a rarity. Initially Semper implemented only eight principles. Interestingly, three of the remaining four (continuous testing, shared code and coding standards) did later become more fully and faithfully appropriated. At first, it would appear that Semper should have applied more diligence in following agile principles from the outset. Alternatively, we suggest that these principles may have required a certain level of maturity not present in the organization’s employees and processes. Coding standards were initially eschewed by management in favor of creativity, until a basic level of code had been developed. While the programming staff themselves favored standards, they were unable to agree on the specifics until staff turnover and management support of a standard pressured them to do so. Similarly, developers still sought code ownership despite a concerted effort by management to curb such behavior. Paired programming, the only principle that did not manage to gain any momentum, continues to be supported by management but has yet to be embraced by the developers. Therefore, we find that temporal conditions and maturity affect the extent to which extreme programming principles are adopted and that both management and developer cultures are salient considerations. Consequently, future research should consider both cultural conditions and managerial preferences.

Acknowledgments. We are grateful to Semper Corporation. This research was supported by a grant from Simon Fraser University.
References

1. Georgiadou, E.: Software Process and Product Improvement: A Historical Perspective. Cybernetics and Systems Analysis 39(1), 125–142 (2003)
2. Gibbs, W.W.: Software’s Chronic Crisis. Scientific American 271(3), 89–96 (1994)
3. Berry, D., Wirsing, M., Knapp, A., Simonetta, B.: The Inevitable Pain of Software Development: Why there is no silver bullet. Radical Innovations of Software and Systems Engineering in the Future. Venice (2002)
4. Brooks, F.P.: The Mythical Man Month. Addison-Wesley, London, UK (1975)
5. Duggan, E.W.: Silver Pellets for Improving Software Quality. Information Resources Management Journal 17(2), 1–21 (2004)
6. Beck, K.: Extreme Programming Explained: Embrace Change. Addison-Wesley, Reading, Mass (2000)
7. Highsmith, J.: Agile Software Development Ecosystems. In: Cockburn, A., Highsmith, J. (eds.) Agile Software Development Series, Addison-Wesley, Boston (2002)
8. Barnett, L., Narsu, U.: Best Practices for Agile Development. on accessed, (January 15, 2003, 2005), http://www.gigaweb.com
9. AgileManifesto: The Agile Manifesto (2001)
10. Goldman, S.L., Nagel, R.N., Preiss, K.: Agile Competitors and Virtual Organizations. Van Nostrand Reinhold, NY (1995)
11. Williams, L., Cockburn, A.: Agile Software Development: It’s About Feedback and Change. Computer 36(6), 39–43 (2003)
12. Yin, R.K.: Case Study Research: Design and Methods. Sage Publications, Thousand Oaks, CA (1994)
13. Spradley, J.P.: The Ethnographic Interview. Holt, Rinehart and Winston, New York (1979)
Holistic Interaction Between the Computer and the Active Human Being

Hannu Vanharanta and Tapio Salminen

Tampere University of Technology, Industrial Management and Engineering, Pohjoisranta 11, 28101 Pori, Finland

Abstract. In the design, development and use of computer-based decision support systems, the ultimate challenge and goal is to arrange and organize successful interaction between the computer and the active human being. This paper therefore examines the extent to which, by applying the hyperknowledge framework developed by Ai-Mei Chang, the holistic concept of man developed by Lauri Rauhala, and the Circles of Mind metaphor developed by Hannu Vanharanta for decision support systems, these systems can be made to emulate human cognitive processes. The approach is a new one, and it represents an emerging paradigm for achieving emulation and synergy between human decision-making processes and computer configurations.

Keywords: Holistic, Interaction, Human Beings, Computer Systems, Concepts, Constructs, Architecture, Co-Evolution, Decision Support Systems.
architecture can easily be applied to many computer systems as well as to new areas of computer usage where holism plays an important role.

1.1 A Philosophic Model of the User

The Holistic Concept of Man (HCM) is a philosophic model that has been described in a number of books and articles by Rauhala, a Finnish phenomenological philosopher and psychologist [1] [2] [3]. Rauhala’s source material consists, in particular, of the works of two well-known German philosophers: Husserl [4] and Heidegger [5]. The advantage of the holistic concept of man, compared to the theories presented by Husserl and Heidegger, is that it has a rather simple construction and is therefore more understandable for non-experts.

1.2 The User’s Mind

The Circles of Mind metaphor [6] opens up the mind’s most important sectors: the memory system, interpretation system, motivation system and automatic system. These systems and their content must be reinforced by the computer system so that the user feels supported when using the computer.

1.3 The User as a Decision Maker

The hyperknowledge framework [7], in turn, views a decision maker as cognitively possessing many diverse and interrelated pieces of knowledge (i.e. concepts). Some are descriptive, some procedural, some are concerned with reasoning, and so forth. The mind is able to deal with these in a fluid and inclusive manner via the controlled focusing of attention. That is, the decision maker actively acquires (recalls, focuses on) desired pieces of knowledge by cognitively navigating through the universe of available concepts. The result is then a hyperknowledge view of the underlying concepts and content involved in the decision making.

1.4 Computer Architecture

By combining the three views described above, we end up with a new architecture for computer applications and constructs.
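The navigation described in Section 1.3 can be pictured as movement over a map of typed, interrelated concepts. The sketch below is only an illustration under stated assumptions: the class names `Concept` and `ConceptMap`, the symmetric links and the example concepts are hypothetical and are not taken from the hyperknowledge framework [7] itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Concept:
    name: str
    kind: str  # e.g. "descriptive", "procedural" or "reasoning"
    related: Set[str] = field(default_factory=set)


class ConceptMap:
    """A universe of concepts with one concept in focus at a time."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}
        self.focus: Optional[str] = None

    def add(self, name: str, kind: str) -> None:
        self._concepts[name] = Concept(name, kind)

    def relate(self, a: str, b: str) -> None:
        # Assumption: relationships are symmetric.
        self._concepts[a].related.add(b)
        self._concepts[b].related.add(a)

    def focus_on(self, name: str) -> Concept:
        """Move attention straight to a named concept."""
        if name not in self._concepts:
            raise KeyError(name)
        self.focus = name
        return self._concepts[name]

    def neighbours(self) -> List[str]:
        """List the concepts reachable by association from the focus."""
        if self.focus is None:
            return []
        return sorted(self._concepts[self.focus].related)


if __name__ == "__main__":
    cmap = ConceptMap()
    cmap.add("sales figures", "descriptive")      # hypothetical concepts
    cmap.add("forecast procedure", "procedural")
    cmap.add("pricing rule", "reasoning")
    cmap.relate("sales figures", "forecast procedure")
    cmap.relate("sales figures", "pricing rule")
    cmap.focus_on("sales figures")
    print(cmap.neighbours())
```

Keeping a single `focus` attribute mirrors the idea of controlled focusing of attention: the user either jumps to a concept by name or follows a link from the concept currently in focus.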
2 The Holistic Concept of Man Metaphor

2.1 Modes of Existence

The Holistic Concept of Man (HCM) is a human metaphor. The basic dimensions of the metaphor consist of a body, a mind and a situation [2] [8]. A human being is an organism [9] which uses thinking processes and exists in particular and individually formed situations. The human being consists simultaneously of three modes of existence based on the above basic dimensions of the HCM, which cannot be separated from each other. According to the HCM, all three modes are needed in
order to make human existence possible and to understand the holistic nature of the human being. These modes of existence of the human being are called 1) corporeality, or existence as an organism with organic processes (the body), 2) consciousness, or existence as a psychic-mental phenomenon, as perceiving and experiencing (the mind), and 3) situationality, or existence in relation to reality (the situation). Human beings have relationships and interrelationships that characterize their qualities as individuals in specific situations [10].

2.2 Corporeality

The first mode of existence, corporeality, maintains the basic processes of existence and implements the physical activities of the human being. The human brain and sense organs (internal and external) are needed when observing the objects and concepts in a specific situation which send meanings to the observer [10].

2.3 Consciousness

In consciousness, the active human being experiences, perceives and understands the phenomena encountered. This is more than a mere thinking process (cf. Res cogitans) because qualities such as experiencing, perceiving and understanding are also involved. When human beings use their inner and outer senses to receive physical signals from the environment, the situation provides the consciousness with a meaningful content, and the human being understands this content, i.e. perceives the corresponding construct(s) or object(s) or concept(s) to be “something.” As a result of an act of understanding, there emerges a relationship, or a meaning or meanings. The HCM metaphor separates the terms consciousness and mind. Consciousness is the totality of the psychic-mental existence of the human being. Mind is used in a more functional sense to refer to the psychical and mental processes which, when taken as a totality, form the mode of existence called consciousness. Mind is a continuous process in which meanings emerge, change and relate to each other.
Meanings are linked together in the mind and collectively form networks of meanings. The totality of these networks is called the world view of a human being. In relation to the world view, a human being understands both old and new phenomena. “Cause maps” or “mental models and maps” used in the cognitive psychology approach correspond to some degree to the notion of the world view. The psychological term “memory” also corresponds to some degree to the world view in the HCM metaphor [10].

2.4 Situationality

Situationality is the third dimension of human existence. Situationality emphasizes that a human being exists not only “as such,” in a vacuum or isolation, but always in relation and interrelation to reality with its multitude of aspects. The world, or reality, is all that exists concretely or ideally, i.e. the world with which people in general can
relate to. Situation (or the situation of life) is that part of the world with which a particular human being forms relationships and interrelationships [10]. Situationality is always unique to each individual. Human beings understand the same object(s) in their situation in an individual way.

[Figure: diagram of the active human being. Labels recoverable from the extraction: Will; Scientific information; Everyday knowledge; Senses; Brains; Mind; Intuition; Feeling; Belief; Activities; Limbs; Object(s); World view; Corporeality; Consciousness; Situationality.]

Fig. 1. An active human and different types of meaning [10]
3 The Circles of Mind Metaphor

3.1 The Theatre Metaphor

The HCM metaphor, or the idea of the human being in a specific situation as a totality, is not sufficient to be used for the development of a brain-based system. The metaphor lacks the new, current research findings on the unconscious part of the human brain. Baars [11] has combined psychology with brain science and the old conception of the human mind to create a metaphor based on the workspace of the mind. The totality can be explained through the theatre metaphor, where the self as an agent and observer behaves as if on the theatre stage. Close to the stage is the unconscious part of the brain (the audience), which is divided into four main areas: the motivational system, automatic systems, interpreting system and memory system. The spotlight controller, context and theatre director are also present.

3.2 The Circles of Mind Construct

A combination of the HCM and the theatre metaphor of Baars led to our new and very practical metaphor. This was named the Circles of Mind metaphor [6]. The Circles of Mind metaphor was also designed as a physical entity so that the metaphor could be used for design purposes. This has led to the idea of a brain-based system which contains the physical body following the Cartesian mind-body relationships, i.e. as a thinking thing and an extended thing [9]. One version of the Circles of Mind metaphor is presented in Figure 2 below.
[Figure: circular diagram. Labels recoverable from the extraction include the Players, the Director, the conscious experience on the stage, the unconscious audience, the context operators behind the scenes, and the memory, interpreting, motivational and automatic systems.]

Fig. 2. The Circles of Mind metaphor [6]
Res cogitans/A Thinking Thing was evident here, giving us the four main parts for the architecture of a new computer system. Res extensa/An Extended Thing (body) represents the other dimension of man, which physically uses the computer keyboard and gives the power of functionality to the computer application to be used on the stage.
4 The Hyperknowledge Framework

4.1 Hyperknowledge

The hyperknowledge framework views the decision maker, i.e. here an active computer user, as cognitively possessing many diverse and interrelated pieces of knowledge (i.e. concepts). Some concepts are descriptive, some procedural, some are concerned with reasoning, and so forth. The mind is able to deal with these in a fluid and inclusive manner via the controlled focusing of attention. That is, the decision maker actively acquires (recalls, focuses on) desired pieces of knowledge by cognitively navigating through the universe of available concepts. To the extent that a DSS emulates such activity, interacting with it should be relatively “friendly,” natural and comfortable for the user. That is, the DSS can be regarded as an extension of the decision maker’s innate knowledge management capabilities. The decision maker is able to contact and manipulate knowledge embodied in the DSS as a wide range of interrelated concepts. The decision maker is able to navigate through the concepts of the DSS in either a direct or an associative fashion, pausing to interact with it. Thus, the hyperknowledge framework regards a decision support environment ideally as an extension of the user’s mind or cognitive faculties. Its map of concepts and relationships extends the user’s cognitive map, pushing back the cognitive limits on knowledge representation. Its knowledge processing capabilities augment the
user’s skills, overcoming cognitive limits on the speed and capacity of human knowledge processing. In the following passages we summarize, on a technical level, the major contents and functionality of a DSS specified as per the hyperknowledge framework. For further details, readers can refer to Chang, Holsapple, and Whinston [12], [7] and also study the prototype applications based on Vanharanta’s framework [13].

4.2 Decision Support Content and Functionality

According to the hyperknowledge framework, a decision support system is defined, architecturally, in terms of a language system [LS], a presentation system [PS], a problem processing system [PPS], and a knowledge system [KS]. The LS is the universe of all requests the DSS can accept from the user, and the PS is the universe of all responses the DSS can yield. The KS is the extensive universe of all knowledge stored in the DSS. The PPS has a wide range of knowledge management capabilities corresponding to a wide range of knowledge representations permitted in the KS. The KS holds concepts that can be related to each other by definition and association. These concepts and their relationships could be formally expressed and processed in terms of database, formal logic and model constructs. Associative and definitional relationships among concepts in the KS are the key to creating a hyperknowledge environment and navigating within it. The KS also contains more than just models and data. It contains reasoning, assimilation, linguistic and presentation knowledge (see Figure 3, the human system metaphor developed by Dos Santos and Holsapple [14]).
Fig. 3. Structure of the decision support system [14]
The dynamics of the DSS involve transformations of messages from the user's language system to the decision support system’s LS. These transformations are carried out by the PPS (subject to the content of the KS) using four basic functions: translation (t), assistance (a), functionality (f), and presentation (p). The user interface and functionality of a DSS specified as per the hyperknowledge framework are depicted in Figure 4.
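[Figure 4: diagram. Recoverable labels: “User Interface” and “Functionality” panels; “User”; “inner loop” and “outer loop”; messages m; SI, SO, DI, DO; functions a, f, p; “Language System”, “Problem Processing System” and “Knowledge System”, the last holding Kling, Kreas, Kdesc, Kproc and Kpres.]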
Fig. 4. User interface and functionality of the hyperknowledge framework [13]
The knowledge symbols in Figure 4 signify the following:
Kling = linguistic knowledge available in the KS
Kreas = reasoning knowledge available in the KS
Kdesc = descriptive knowledge available in the KS
Kproc = procedural knowledge available in the KS
Kpres = presentation knowledge available in the KS
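The architecture of Section 4.2 can be illustrated with a small sketch. The Python fragment below is an illustration under stated assumptions, not the implementation behind [7] or [13]: the class names, the dictionary-based knowledge store and the example request are hypothetical, and the internal behaviour given to the four PPS functions (t, a, f, p) is a deliberate simplification.

```python
from dataclasses import dataclass, field
from enum import Enum


class KnowledgeType(Enum):
    """The five knowledge types named in the Figure 4 legend."""
    LINGUISTIC = "K_ling"
    REASONING = "K_reas"
    DESCRIPTIVE = "K_desc"
    PROCEDURAL = "K_proc"
    PRESENTATION = "K_pres"


@dataclass
class KnowledgeSystem:
    """KS: all knowledge stored in the DSS, grouped by type (assumed layout)."""
    store: dict = field(default_factory=lambda: {k: {} for k in KnowledgeType})


class ProblemProcessingSystem:
    """PPS: maps a user request to a response, subject to the KS content."""

    def __init__(self, ks):
        self.ks = ks

    def translate(self, request):
        # t: map the user's wording onto an internal command (or None).
        vocabulary = self.ks.store[KnowledgeType.LINGUISTIC]
        return vocabulary.get(request.strip().lower())

    def assist(self, request):
        # a: offer help when the request is not part of the language system.
        known = sorted(self.ks.store[KnowledgeType.LINGUISTIC])
        return f"Unknown request {request!r}. Try: {', '.join(known)}"

    def function(self, command):
        # f: look the command up in the descriptive knowledge (simplified).
        return self.ks.store[KnowledgeType.DESCRIPTIVE].get(command)

    def present(self, result):
        # p: format the result with presentation knowledge.
        template = self.ks.store[KnowledgeType.PRESENTATION].get("default", "{}")
        return template.format(result)

    def handle(self, request):
        command = self.translate(request)
        if command is None:
            return self.assist(request)
        return self.present(self.function(command))


# Hypothetical content, for illustration only.
ks = KnowledgeSystem()
ks.store[KnowledgeType.LINGUISTIC]["show sales"] = "sales"
ks.store[KnowledgeType.DESCRIPTIVE]["sales"] = 1200
ks.store[KnowledgeType.PRESENTATION]["default"] = "Result: {}"
pps = ProblemProcessingSystem(ks)
print(pps.handle("Show sales"))  # prints "Result: 1200"
```

In this sketch the LS corresponds to the keys of the linguistic store (the requests the system accepts) and the PS to the strings that `present` and `assist` can return.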
4.3 Working Space

When a decision maker is working in the hyperknowledge environment, a concept must be “contacted” before it can be “impacted” (affected) by or have an “impact” on the decision maker. Contact is the recognition of a concept in the environment and entails sensing the existence of the concept and bringing it into focus. Either implicitly or explicitly, the user is provided with a “concept map” as the basis for establishing contacts [13]. The concept map indicates what concepts are in the environment and what their interrelationships are. An implicit map is external to the DSS (e.g. in the user’s cognitive environment, which may be burdensome as the KS becomes complex). An explicit map is provided by the DSS itself and can be regarded as a piece of descriptive knowledge held in the KS, describing the present state of its contents. With a concept map as the original contact point within the environment, the user can make controlled, purposeful contacts with any desired concept in the hyperknowledge realm. Users can focus their attention on any part of an image, multiple windows can provide different views of parts of the same image, and different images of the same underlying concept can be seen in various windows. The result is extensive user interface flexibility, which is important for facile and adaptive interface design.
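An explicit concept map of this kind can be sketched as a labelled graph. The fragment below is a minimal illustration, assuming that concepts are plain strings and that every link is either definitional or associative; the class, its method names and the example concepts are hypothetical and are not taken from [13].

```python
from collections import deque


class ConceptMap:
    """Explicit concept map: concepts plus labelled links between them."""

    def __init__(self):
        self.links = {}  # concept -> list of (relation, neighbour)

    def relate(self, a, b, relation):
        """Link two concepts; relation is 'definitional' or 'associative'."""
        self.links.setdefault(a, []).append((relation, b))
        self.links.setdefault(b, []).append((relation, a))

    def contact(self, concept):
        """Direct navigation: bring a concept into focus if it exists."""
        if concept not in self.links:
            raise KeyError(f"{concept!r} is not in the environment")
        return concept

    def neighbours(self, concept, relation=None):
        """Associative navigation: concepts one link away from the focus."""
        return [b for r, b in self.links[self.contact(concept)]
                if relation is None or r == relation]

    def path(self, start, goal):
        """Shortest chain of contacts from start to goal (breadth-first)."""
        queue, seen = deque([[self.contact(start)]]), {start}
        self.contact(goal)
        while queue:
            route = queue.popleft()
            if route[-1] == goal:
                return route
            for _, nxt in self.links[route[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(route + [nxt])
        return None


# Hypothetical business concepts, for illustration only.
cmap = ConceptMap()
cmap.relate("profit", "revenue", "definitional")
cmap.relate("profit", "cost", "definitional")
cmap.relate("revenue", "market share", "associative")
print(cmap.neighbours("profit"))
print(cmap.path("cost", "market share"))
```

The `contact` check mirrors the rule that a concept must be contacted before it can be impacted: every navigation method first confirms that the concept exists in the environment.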
5 Emerging Paradigm

5.1 Fusion Framework

In the developed computer architecture, we have based our thinking on co-evolution by combining the HCM metaphor, hyperknowledge framework and the Circles of
Mind metaphor in one design framework, i.e. a fusion framework. The basic idea has been to map computer constructs and computer applications according to our theories based on modern brain science and the basics of the HCM and hyperknowledge functionality. With the fusion framework we can design various computer applications and alter the design of existing knowledge bases and databases. First, the applications we create contain the same systems that are integral to the human brain, or they emulate business processes in the way the brain emulates reality through its own processes. The knowledge structure therefore contains the same important areas as the unconscious part of the brain.

5.2 Functionality of the Fusion Framework

Figure 5 shows the user’s brain processes interacting with the user interface via the computer screen. The functionality is described as the hyperknowledge functionality and the database construction as the unconscious part of the human brain. In contemporary Internet applications, it is possible to navigate through the data and then combine the information according to the user’s needs, just as the hyperknowledge functionality describes the active computer user [10]. Again, these new applications share the same construct: to support the user through the user interface and, furthermore, to support the basic human processes of the mind, i.e. interpretation, memory, motivation and automatic activities. On the other hand, the combination possibilities are huge and, therefore, we have to focus on creating efficient and effective computer content for the computer user in a context-specific situation. In our computer applications, we first describe the content and the objectives of the application itself. The creation of the context-specific ontology then becomes crucial.
Fig. 5. A human compatible computer system (Salminen & Vanharanta 2007)
The construction is new and can be applied in many ways, from application design to database and computer design purposes. Our goal is to demand more from the computer and its application design. We require the design to be more holistic for the user.
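One way to read the fusion framework as a design aid is to record which of the four systems of the mind an application supports. The sketch below is a hypothetical checklist, not part of the framework as published: the enum, the helper function and the example application are assumptions made for illustration.

```python
from enum import Flag, auto


class MindSystem(Flag):
    """The four systems of the unconscious part of the brain (Section 3.1)."""
    MOTIVATIONAL = auto()
    AUTOMATIC = auto()
    INTERPRETING = auto()
    MEMORY = auto()


def missing_support(supported):
    """Return the systems that an application design does not yet support."""
    return [system for system in MindSystem if system not in supported]


# Hypothetical application that extends memory and delivers information
# automatically, but neither interprets nor motivates.
reminder_app = MindSystem.MEMORY | MindSystem.AUTOMATIC
print(missing_support(reminder_app))
```

A design that supports all four systems would yield an empty list, which is one possible reading of the more holistic design the text calls for.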
6 Conclusion

In the developed computer architecture, we have based our thinking on co-evolution. In this kind of overall system design, the computer has been illustrated as having the same sub-systems as we have in our brains. This framework can be applied to many different applications which use hardware and software. We can increase our knowledge through computer interaction. Hyperknowledge is then created on the computer screen. The construction as such contains the basic ideas of co-evolution: self-development through the use of and interaction with the computer. Some applications bring the user information automatically and others extend the user’s memory capacity. Some applications also help the user to interpret the current reality, while others may help motivate the user. There are even some applications which support all system areas. Therefore, all applications, to one extent or another, increase and support our brain processes. In the same way, we can work with concepts other than computers within the conscious experience of humans. If we put an object into the conscious experience, for example different business processes, it is possible to create the extroversion of the business processes through an application. The actor can then explore the concept and gain a holistic view and understanding of the matter. These kinds of applications need supporting ontologies and concepts, as well as technology, to uncover the underlying models behind the motivation, interpretation, memory and automatic systems and to show how these different sub-systems can be used in real-life applications. These models also need other living system concepts to evolve with the processes and make the applications all the more humanistic.

Acknowledgments. The work behind this paper has been financed by the 4M-project (cf. National Agency of Technology in Finland DNro 770/31/06 and Nro 40189/06) in the Industrial Management and Engineering department at Pori, Finland.
References

1. Rauhala, L.: The Hermeneutic Metascience of Psychoanalysis, Man and World, vol. 5, pp. 273–297 (1972)
2. Rauhala, L.: Ihmiskäsitys ihmistyössä, The Conception of Human Being in Helping People. Helsinki: Gaudeamus (1986)
3. Pihlanto, P.: The Holistic Concept of Man as a Framework for Management Accounting Research, Publications of the Turku School of Economics and Business Administration, Discussion and Working Papers, vol. 5 (1990)
4. Husserl, E.: Husserliana I-XVI, Gesammelte Werke, Martinus Nijhoff, Haag (1963-1973)
5. Heidegger, M.: Being and Time. Blackwell, Oxford (1962)
6. Vanharanta, H.: Circles of mind. Identity and diversity in organizations – building bridges in Europe Programme XIth European congress on work and organizational psychology 14-17 May 2003, Lisboa, Portugal (2003)
7. Chang, A., Holsapple, C.W., Whinston, A.B.: A Hyperknowledge Framework of Decision Support Systems. Information Processing and Management, 30(4), 473–498 (1994)
8. Rauhala, L.: Tajunnan itsepuolustus, Self-Defense of the Consciousness. Yliopistopaino, Helsinki (1995)
9. Maslin, K.T.: An Introduction to the Philosophy of Mind, p. 313. Blackwell Publishers, Malden (2001)
10. Vanharanta, H., Pihlanto, P., Chang, A.: Decision Support for Strategic Management in a Hyperknowledge Environment and The Holistic Concept of Man. In: Proceedings of the 30th Annual Hawaii International Conference on Systems Sciences, pp. 243–258. IEEE Computer Society Press, California (1997)
11. Baars, B.J.: In the Theatre of Consciousness. Oxford University Press, Oxford (1997)
12. Chang, A., Holsapple, C.W., Whinston, A.B.: Model Management: Issues and Directions. Decision Support Systems 9(1), 19–37 (1993)
13. Vanharanta, H.: Hyperknowledge and Continuous Strategy in Executive Support System. In: Acta Academiae Aboensis, vol. Ser. B, 55(1), Åbo Akademi Printing Press, Åbo (1995)
14. Dos Santos, B., Holsapple, C.W.: A Framework for Designing Adaptive DSS Interface. Decision Support Systems 5(1), 1–11 (1989)
The Use of Improvisational Role-Play in User Centered Design Processes

Yanna Vogiazou, Jonathan Freeman, and Jane Lessiter

Psychology Department, Goldsmiths College London, University of London, New Cross SE14 6NW
{y.vogiazou, j.freeman, j.lessiter}@gold.ac.uk
Abstract. This paper describes the development and piloting of a user-centered design method which enables participants to actively engage in a creative process to produce intuitive representations and inspire early design concepts for innovative mobile and ubiquitous applications. The research has been produced as part of the EC funded project PASION, which aims to enhance mediated communication in games and collaborative environments through the introduction of socio-emotional information cues, represented in ways that are meaningful yet abstract enough to accommodate variable thresholds of privacy. We describe our design research methodology, which combines analytical approaches, aiming to uncover participants’ needs, desires and perceptions, with creative, generative methods, with which participants inform and inspire the design process.
physiological data over time and the association of the data with communication exchanges with other people (former) or places the users were visiting (latter) at the time the changes in their physiological states were recorded. In a similar line of thought, we are particularly excited by the possibility of introducing such feedback in real-time in social and collaborative contexts and observing the kind of spontaneous individual and group behaviours that could emerge. For this purpose we have adopted a bottom-up, user centered design research approach in order to initially identify how people express various communication, personal and contextual cues in spontaneous ways that make sense to them. The benefits and future opportunities deriving from these research directions span across a range of application areas, in particular applications in which the communication and collaboration of individuals and groups through new technologies takes place. For instance, Reimann and Kay (2005) discuss the role of visualizations for groups of learners in improving upon their knowledge and performance. The authors consider groups as complex systems, where global dynamics can result from local interactions and propose visualizations as a means of providing team awareness. Research in social computing applications (Vogiazou, 2006) has shown that even minimal indicators of other people’s presence facilitate group awareness, which is beneficial for strengthening social bonds among groups and communities. Our interest in patterns of group behaviour and social dynamics in collaborative interactions, in work, learning and leisure oriented activities has motivated the initial phase of the research described in this paper. The goal is to identify through design and user research the kind of socio-emotional cues that can provide useful feedback in communication and to explore emergent group and individual user behaviours from the introduction of such cues. 
The studies we discuss in this paper, which are part of the EC funded PASION (Psychologically Augmented Social Interaction Over Networks) project, aim to identify:
• social, emotional and contextual information elements (situational cues, environmental context, and individual and group behaviours) that are relevant in mediated communication in collaborative work and social settings,
• potential real time and historical representations of these elements in the form of multimodal, non verbal/textual representations, and
• the relevance and importance of such cues in collaborative work and social gaming situations at different levels of privacy disclosure.

Next we describe the design research method we deployed to address these issues.
2 Design Research

The main premises of user centered design are to bring users closer to the design process and to help designers gain empathy with users and their everyday activities through the use of different methodologies. Role playing has been used in user-centred design workshops for the concept generation of innovative products in everyday life (Kuutti et al, 2002) as well as testing out design ideas with potential consumers (Salvador and Sato, 1999). In interaction design research, role playing has
been extensively performed with the use of low-fidelity prototypes to develop further design ideas, which Buchenau and Fulton Suri describe as ‘Experience Prototyping’. Role playing is usually based on improvising user scenarios that create opportunities for some kind of technological intervention or design solution (Laurel, 2003). These scenarios of use are often acted out either by users or designers with some kind of props or imaginary objects, aiming to identify potential breakdowns as well as design opportunities. This method of user involvement in the design process tends to generate potential or ‘futuristic’ functionalities for products that the design team is working on. The functionalities are then eliminated or developed further in the continuing process by the design team. Role play has the main advantage of facilitating empathy with the context of use while trying out early design ideas. When the everyday problems that refrigeration technicians are confronted with were acted out, with designers as actors and target users as the audience, Brandt and Grunnet (2000) found that the users recognized the situations shown in the dramatized scenarios as ones they often experienced. The designers who performed the scenarios, on the other hand, found it harder to use drama in an unfamiliar context like this. In another study, role playing was used to elicit a first brainstorm among users about the potential functionalities of an interactive book, the Dynabook, in the home environment. Both studies showed that drama can help designers to achieve a greater empathy for the users and the contexts of use. In our research, role play was not related to a particular prototype or imaginary object aiming to elicit ideas for functionality, but was used as an expressive medium for users to communicate emotional states and contextual situations. The provided props were open to interpretation and aimed to facilitate the acting itself, without binding the design process to a particular artefact.
Previous research in role playing as a design methodology has outlined the difficulties in involving drama professionals as facilitators (Svanaes and Seland, 2004) because: a) introducing users to acting techniques can be very time consuming and is a separate activity from design – with drama exercises lasting for 4,5 hours the creative sessions need to be arranged at other time slots and b) drama professionals tend to focus on their subject of expertise – teaching and facilitating the acting – rather than the generation of design ideas and therefore need to be able to understand the purpose and scope of a generative workshop. In our studies it was important to ensure that the participants were initially immersed in the themes and ideas of the workshop. Following a group discussion, the role playing itself was presented as a game, so there was no need to provide any training in performance; it was sufficient to describe the activity and have one of the facilitators act out an example of what was asked.

An innovative research method, combining analytical and generative approaches, was developed and deployed in two user group workshops, which focused on collaborative work (at the Center for Knowledge and Innovation Research, in the Helsinki School of Economics, in Finland) and social gaming (at the Department of Computing and Informatics in the University of Lincoln, in the UK) respectively. The workshops were designed to identify relevant and potentially useful elements of personal, social and contextual information, represented in meaningful ways to be readily interpretable. User attitudes in relation to privacy and comfort with sharing these information cues were also explored.
Both the collaborative work and social gaming workshops followed a similar structure, which encouraged participants to get immersed in the subject and discuss their views, before engaging in creative activities that required them to generate ideas and concepts for representations. The phases can be summarized as follows:
• General group discussion and brainstorm. The discussion was focused on everyday collaborative work practices and different forms of play in either workshop, aiming to identify relevant information elements about individuals, teams and context.
• Feedback on early sketches. Participants were shown the same set of rough sketches (see figure 1 for example), representing individual, collective and contextual states and cues and were asked to guess what they were meant to suggest. This initiated further discussion and suggestions on non-verbal representations. At the same time this activity acted as a warm-up, to prepare the generative session that followed and inspire participants to think about representations in a more abstract, broader sense.
• Improvisational role playing. The role-playing was performed individually by each participant to come up with creative ideas about representing information using different modalities (e.g. visual, sounds, actions). Here we focus on this method in particular.
• Card sorting activities. In this last task participants prioritized and grouped the main information elements that emerged in the initial group discussion. They were also asked to comment on when these elements need to be private and when they can be public.
Fig. 1. Left: sketch of a group state indicating collective activity (movement, excitement). Right: sketch of an individual in a calm environment.
For the workshop on collaborative work six male participants were recruited from the age group of 24-40, with professional experience of collaborative work, either as researchers or PhD students. For the social gaming workshop nine participants were recruited from the age group of 17-40, five of whom were female and four were male. Four participants were pursuing a postgraduate degree and the other five were A Level Psychology students. Participants had variable gaming experiences, ranging from online massively multiplayer games to traditional board and card games and physical street games. The role playing was not used as a re-enactment of a user scenario or for the evaluation of a design concept, but in an entirely generative way: participants were asked to act out in a non verbal way different situations that were relevant to the workshop theme. For example, some situations in collaborative work were: “You are confused by what your manager is saying to you in a conversation”, “You are very
stressed about a forthcoming deadline”, “Being on the bus or train to work, very crowded during peak time”. Situations related to social gaming were along the lines of: “You and your team are exploring a new area – a danger is approaching”, “You have developed bonds with a team of people”, “You are playing a mobile game in a really crowded café”. Participants were asked to pick from a box one situation and one modality to use for their representations, both written on strips of paper. Examples of modalities were: “Draw on paper”, “Act out a situation, improvise, mimic an activity” and “Make a sound orally”. The activity was introduced as a game of ‘Charades’, which appears in variations across cultures. Part of the challenge for participants was to represent individual, group and contextual information cues in a way that the rest of the group could guess what was being represented. Various props (e.g. a tambourine, plasticine, paper, coloured pens and cups) were brought and used to express different modalities (e.g. auditory, visual, tactile). The workshop was recorded on audio and video. The video recording of the role playing workshop was used for the further generation of concepts and design ideas. Video recordings, still photographs and sketches from these role-playing activities were then used as generators (Holmquist, 2005): they formed part of a process that generated inspiration, insights and ideas – the beginning rather than the end in concept development.

Following the two workshops we organised a third one (at Goldsmiths College, University of London), which was primarily generative, aiming to explore in more depth the key emergent themes from the previous workshops. We used a method similar to the one used in the previous workshops. Two teams of graduate designers aged 23-30 were recruited (5 male, 1 female) to generate a breadth of concepts and multidimensional representations of individual, collective and contextual states.
The workshop was structured as follows:
• Brainstorm and concept mapping. Participants were asked to discuss the key concepts of ‘group power’ and ‘connecting’ in the context of different situations, taking into account location, user attributes and collaboration, either work or leisure related. They documented the generated ideas by collaboratively drawing a ‘concept map’ (Novak, 1998) on large sheets of paper. This acted as a point of reference for further discussion and debate around the ideas.
• Individual role play. Role-playing activity performed individually to come up with creative ideas about representing various situations using different modalities (e.g. visual, sounds, drawing, modeling, actions). Similarly to the earlier workshops, participants had to choose randomly a ‘situation’ to represent and a modality to use.
• Collaborative role play. Role-playing in pairs: participants acted out together an idea they generated in the earlier discussion using various props.

A range of props was provided to facilitate improvisation and idea generation on the fly, including a mixer with many different sounds in order to experiment with representations (figure 2). The mixer had two CDs: one with ambient sounds (e.g. park, street noises) and another CD with short sound effects (e.g. clapping, stampede). These could also be used in combination with a touch microphone attached under the table. The touch microphone allowed participants to produce sounds spontaneously by tapping on the table or moving objects on its surface, enhancing the role playing experience and the richness of the representations created. The design graduates
Fig. 2. Participants exploring and then using the audio equipment for sound based representations
produced detailed multimodal representations using, for example, samples of background sounds to represent emotional states and environmental situations, and combining traditional design processes like sketching and modelling with acting.
3 Insights from Improvisational Role-Playing Activities

The two user workshops in Finland and the UK produced spontaneous representations with noticeable cross-cultural similarities. For example, we observed an open posture and shaking hands (or a tambourine) as a representation of positive affect relating to celebration or excitement. A more closed body posture indicated negative affect, namely confusion or sadness in collaborative work and gaming scenarios respectively. Participants in all three workshops engaged with the process and gave positive feedback; their interactions became more spontaneous during role play. Their nonverbal representations were very compelling in presentation and encouraged the continuous involvement of the rest of the group, as they tried to guess what was being represented as closely as possible. We found it easy to change the activity on the fly in the workshop, because of its flexible and non-prescriptive nature; participants could act out representations on their own or improvise collaboratively in pairs. The collaborative acts were more detailed and made extensive use of the available props. In the first two workshops, participants used visual representations and in particular actions (as opposed to static poses or drawing) more than other modalities, in spite of encouragement to explore all modalities. Often participants would try to combine modalities (e.g. drawing and then performing some gestures in relation to the drawing) in order to communicate their situation more accurately. The design graduates who participated in the third workshop discovered the use of sound as a powerful creative tool through mixing the different sounds provided. Modifying the available props in future workshops could therefore reveal more emergent representations and encourage more diverse improvisational play.
Below we present a selection of the generated representations that illustrate a variety of individual, group or contextual cues in collaborative work practice and social gaming, using various media (e.g. sound, hand gestures, poses, drawing and actions).

1. Using particular postures and movement to indicate personal states (confusion, stress, sadness)

Posture turned out to be a powerful means of communicating individual emotional states and social cues in role play. The postures and expressions of confusion (figure 3) were sketched out after the workshop to illustrate possible visual representations of confusion in technology mediated communication.
Fig. 3. Two different postures showing lack of understanding, confusion in communication. Sketches outlining the posture used to represent confusion and lack of understanding.
Fig. 4. One participant kept moving in a loop to indicate a high level of stress
Even when not using the whole body for acting, posture could be suggested by other means. A participant in the social gaming workshop communicated body posture in a rather abstract way using his fingers. An imaginary sad figure, represented by a bent finger and one finger moving away from it, showed the growing distance between two team players. This inspired the sketch in figure 5:
Fig. 5. Growing distance between two players
2. Using a continuous sound for ‘context’ and short sounds for an individual state or alert

Sound was used to communicate a sense of atmosphere. One participant in Finland made a continuous noise orally (i.e. “bla bla bla bla bla”) occasionally interrupted by sounds of yawning to suggest boredom during a presentation, which the rest of the group understood. A social gaming participant produced an intense and continuous sound alert to indicate approaching danger. The sound (generated by beating a spoon inside a glass) became more intense and loud towards an imaginary player (represented by an object) to signify some kind of danger getting closer. This was also easily perceived by the rest of the group. In the third workshop, graduate designers experimented by combining techniques they were familiar with, such as sketching, with acting or sound creation on the fly, using a combination of sounds from the mixer. Sound was a good tool for communicating environmental cues. For example, the noise of traffic and a stampede of horses were played to represent crowd flow, while the designer drew the sketch in figure 6 to illustrate the flow of people towards different directions in rush hour.
The Use of Improvisational Role-Play in User Centered Design Processes
Fig. 6. Crowd flow during rush hour, accompanied by the combination of city sounds with the noise of a stampede
3. Using lively sounds and an open posture to indicate excitement
Open postures were used in all workshops to communicate positive affect, with cross-cultural similarities. One participant in the collaborative work workshop (Finland) held the tambourine up and shook it to show joy. Similarly, in the social gaming workshop (Lincoln) another participant used an open gesture and moving wrists to celebrate victory in a game.
Fig. 7. An expression of celebration of success (left) and victory in a game. The middle sketch illustrates the same posture.
Similar representations of excitement emerged in the third generative workshop, illustrated by shaking a pair of maracas. Different ones were the making of an exclamation mark from plasticine and the drawing of ‘emoticons’ (smileys), a rather common representation of joy. Excitement was also communicated through a juxtaposition of natural sounds – the sound of animals (monkeys) against a background of calm, environmental sound.
4. Size representing status
In the social gaming workshop, the size of a figure was used in a drawing to suggest that a player holds a higher status in a game.
Fig. 8. Size of figure represents hierarchical status; circle indicates one’s own team
5. Representations of private space
Participants’ attitudes to privacy issues were explored through discussion and card sorting in the first two workshops to identify different levels of privacy. The concept of personal or ‘communication-free’ space also emerged in the third, creative workshop, in which ideas on privacy and personal space were visualised in different ways, for example by marking the space with a line made out of objects or by creating an ‘isolation tank’ which completely disconnects all communication and external stimuli. The ‘tank’ was also sketched as an ‘isolation island’, a kind of mobile ‘cloud’ that protects the person from the intrusion of wireless communication when this is not desired. In some of the performances in the third workshop, a participant would try to engage a ‘stranger’ in conversation, for example by playing lively natural sounds (e.g. monkeys), making eye contact, pointing out objects, getting closer to the other person or drawing links between individuals to show connection. The other person would respond by trying to maintain his or her privacy, for example by hiding behind sunglasses or a book and moving further away. This performance also illustrated the idea of a state of ‘disconnection’ and of maintaining one’s privacy and personal space.
Fig. 9. Different representations of private space
6. Varying degrees of disagreement represented with ‘emoticons’ and gestures
In the third workshop, the design graduates created representations to show disagreement within a group discussion or disapproval of a person, to varying degrees. For example, in the sketch in figure 10, gradual disagreement is represented by a ‘smiley’ that eventually stops smiling and responds with ‘abuse’. Another participant drew different icons for disagreement and then smashed a plasticine model of himself to show complete rejection and exclusion. An interesting representation through role-play, which was fun to observe, was performed by a designer who pretended he was having a discussion with another participant (who had no idea of what he was trying to communicate). He made gestures of anger, shaking his finger at the other person, and then hit his fist on the table with the contact microphone, making a very loud sound, and ripped up a sheet of paper.
Fig. 10. A sketch of a meeting in which gradual disagreement results in abuse! Different indicators of disagreement
4 Conclusion
The combination of analytical and generative methods worked well by initially immersing users in the ideas of the PASION project, helping identify their needs and desires and then engaging them in communicating those ideas in interesting ways that can be further explored and developed through a design process. The initial discussions introduced participants to the themes of collaboration, connection to other people and non-verbal communication. Asking participants to guess what the sketchy drawings meant was also a way of encouraging them to consider more abstract non-verbal representations of personal, environmental and collective states, and it set the scene for the role play. By introducing the role play activity as a fun ‘Charades’ game and demonstrating an example, we shifted the focus from trying to be a good actor to trying to come up with interesting ideas. Role playing and experimentation with different media also opened up a range of creative possibilities for the participating graduate designers, enabling them to enrich initial ideas and to bring them to life, from a one-line sentence written on a piece of paper to an engaging performance. Because the activity was not bound to a particular artefact or technology, as is common in other uses of role play where a user scenario is acted out to identify product functionalities or solutions to design problems, the generated representations were open to interpretation and diverse in the use of expressive media (actions, props, sound, drawing). In the future, we would like to see how this kind of improvisational role play can be applied in the exploratory design research phase for other innovative products and applications, which are not necessarily focused on non-verbal representations.
The concepts generated through the activities discussed in this paper demonstrate that improvisational role playing can be a powerful tool for both participants and designers: a) enabling participants to engage creatively in user centred design workgroups, and b) generating useful initial user input for the design process that can then be developed further for the design of easily interpretable and intuitive visualizations and interfaces. This method proved cost- and time-effective compared to other role playing methodologies that involve drama professionals as facilitators and in which some training in acting needs to be provided. Most importantly, the method generated valuable concepts and ideas for novel representations of socio-emotional and situational states, which became part of the core design process for the PASION project. These representations are currently being developed further through sketching, mock-ups for application concepts and as user interface design elements that can be trialed with users.
Acknowledgements. The research is supported by the European Community under the Information Society Technologies (IST) programme of the 6th Framework Programme (FP6-2004-IST-4) – Project PASION (Psychologically Augmented Social Interaction Over Networks). The authors would like to thank all our participants and Nela Brown (sound artist) who planned and arranged the set up for the sound experimentation in the third workshop.
Quantifying the Narration Board for Visualising Final Design Concepts by Interface Designers Chui Yin Wong and Chee Weng Khong Interface Design Department, Faculty of Creative Multimedia, Multimedia University, 63100 Cyberjaya, Malaysia {cywong, cwkhong}@mmu.edu.my
Abstract. The narration board is a powerful design tool that helps translate user observation studies into a storytelling format. It helps to communicate design values and ideas among the design team by visualising user scenarios in their proper context during the early design stages. This paper first discusses the narration board as a design tool that helps the design team conceptualise and visualise user scenarios interacting with future design concepts within their context of use. The second part of the paper discusses how narration boards assist interface designers in generating ideations and visualising final design concepts. Twenty design projects (N=20) were examined to study and quantify two important factors, i.e. the components of the narration board in relation to the attributes of the final design concepts. A non-parametric correlation test was used to study the correlation coefficient between the scores of the two factors. The results show a statistically significant positive correlation between components of the narration board and attributes of the final design concept. Projects with higher narration board component scores tend to produce better final design concepts, and vice versa.
Keywords: Narration, Interface Design, Storyboard, and design concepts.
multi-disciplinary team to share the same vision and theme on a design project, there needs to be a means for communicating high-level concept designs across the team. Hence, narration, or storytelling, has become an important channel for depicting scenarios and sharing visions of design ideas and concepts in the design process. The objectives of this paper are twofold: firstly, to discuss the narration board as a design tool that helps the design team conceptualise and visualise user scenarios interacting with future design concepts within their context of use; secondly, to quantify the components of the narration (story) board in relation to the attributes of the final design concepts.
2 Storytelling and Scenarios
2.1 Rationale of Storytelling
User researchers or ethnographers conduct user studies to elicit user requirements during the early stages of the design process. The aim is to gain a closer understanding of how users behave and interact with artefacts within the real environment. Such studies highlight social activities, trends and values, which are then analysed and incorporated in the scenario-building process to depict user personas in the context of use. Storytelling is perceived as an acceptable channel for sharing similar beliefs and thoughts among a community. In general, people remember stories more easily than design principles, facts and figures. There are several reasons why stories are good communication catalysts for a design team [5], [11]:
• Stories communicate ideas holistically, conveying rich yet clear messages. They are thus an excellent way of communicating intricate ideas and concepts in an easy-to-understand format, and they allow people to convey tacit knowledge that might otherwise be difficult to articulate.
• Stories are easily remembered by people because they are circulated with human passion and emotions.
• Stories aid in team-building, as they become a communication tool for sharing similar user-activity events and information that help in constructing a vision. They ease the communication flow by nurturing a sense of community and help to build relationships, especially in a multi-disciplinary design team.
• Storytelling provides the context in which knowledge arises as well as the knowledge itself, and hence it can increase the likelihood of accurate and meaningful transfer of knowledge.
2.2 Adoption of Storytelling into Scenarios
Storytelling has been widely adopted in different disciplines, particularly in film, animation, education, design and business.
For instance, Walt Disney uses storyboards to create motion pictures and animated characters in its film production process. In the business world, multi-national companies such as IBM, through its Knowledge Socialisation Project [6], use storytelling to share business visions within
the organizations. Instructional designers may use storyboards to create learning objects for courseware design whilst developing educational systems. In design practice, storytelling has been used by designers to share conceptual design prototypes and design solutions across the design team. Stories and event scenarios are collected from observational fieldwork studies to share user behaviour, cultural beliefs, and insights with the whole design team for design strategy. Stories are concrete accounts of particular people and events, in particular situations; scenarios are often more abstract, and they are scripts of events that may leave out details of history, motivation, and personality [5]. Despite the differences, storytelling and scenarios are intertwined, and it is difficult to distinguish a design story from a user-interaction scenario. In the user requirement stage, user researchers collect user stories and observational information from fieldwork studies. Observational data is then analysed and translated into various themes and consumer insights. This helps to create realistic examples and build scenarios as shared stories in the design team. User profiles, characters and goals form personas in the scenario-building process. Cooper [4] first proposed the concept of the persona, which has since been widely applied in academic and industrial practice and integrated into various design projects. In essence, a persona is an archetypal person representing a user profile, whereas scenarios describe how a person interacts with the product in the context of use. As mentioned earlier, stories are easily remembered by people, so the medium used to present a story is crucial in making the story memorable and in ensuring that the shared visions are comprehended within the design team. Rosson and Carroll [10] described the characteristic elements of user-interaction scenarios as setting, actors, task goals, plans, evaluation, actions and events. However, such design scenario activities are conventionally illustrated as text-based descriptions that embed these characteristic elements. Thus, the next section describes how narrative scenarios are illustrated in pictorial form to conceptualise high-level user-interaction scenarios.
3 Narration in Context
3.1 Narration in the Design Process
Narration has been used and applied in different phases of the design process. Lelie [8] described the value of storyboards in the product design process, using the term “storyboard” instead of narration board. In each phase of the design process, the storyboard has its own style of exploring, developing, discussing or presenting the product-user interaction. The design process ranges from the analysis, synthesis, simulation and evaluation phases to the decision phase. The visualisation style varies with the design activities, purpose/goals, and representation form in each phase of the design process [8]. In our context, we discuss how narration boards are used in the interface design process during early conceptual design stages for ideation purposes. Figure 1 shows the detailed requirements in the conceptual design phase for interface designers. Two types of narration boards are adopted, namely the Narration Board (pre-ideation) and the Narration Board (post-ideation). For the Narration Board (pre-ideation),
interface designers are required to translate the results of observation studies and market research into problem scenarios highlighting the problems or any issues that users face in the real environment. Different design aids such as mood boards, product design specification, and product positioning are also developed to assist designers in achieving a holistic grasp of the concept designs being developed. The interface designers will then be required to produce another Narration Board (post-ideation) to project how their concept designs will be used in future scenarios.
Research – User Studies – Ideation/Conceptual Design – Prototype – (Re)evaluate
Fig. 1. Brief Conceptual Design Phase
In the realm of interface design, communication between designers and other team members is important for a successful design project. The narration board is a valuable design tool for the design team as it provides a common visual-based medium for sharing a common understanding of future design developments. Conventionally, scenarios are illustrated in textual descriptions to portray user-interaction scenarios [10]. For designers, visual-based mediums are important to assist them in ‘visualising’ and developing ideations for future design solutions. In such circumstances, scenarios described in visual forms accompanied by text explanations serve the communication purpose within the design team. Moreover, visual-based narrative is a valuable aid for interface designers in provoking the thinking process, evoking ideations and spurring creativity to higher levels. Several types of medium have been used to illustrate narration or storytelling in either analogue or digital format, such as hand drawing, sketching, photography and video [2], [8]. Some software tools have been developed for storytelling, such as DEMAIS [1] and Comic Book Creator™ [3]. In developing narration boards, the interface designers are required to consider the characteristics of user personas, scenarios and context of use. They are able to select any medium of communication to illustrate the narrative scenarios. Due to time and cost considerations, hand sketching, marker rendering and drawing on layout pads are the most cost-effective methods. The designers then scan their narrative scenarios into digital formats, which can then be posted online for sharing purposes. Alternatively, the interface designers can transfer the photographs they have captured during their observation studies using graphical software such as Adobe Flash™, Adobe Photoshop™ or Comic Book Creator™.
3.2 Types of Narration Boards
Narration boards also play an important role in bridging the communication gap between the design team and other parties such as top management, the manufacturing department and the clients themselves. Top management and clients usually do not have ample time to go through the detailed design levels. Hence, the narration board assists in projecting the problem scenarios of the user experience. This is illustrated in the Narration Board (pre-ideation) (figure 2). On the
Fig. 2. An example of a Narration Board (pre-ideation) depicting a scenario of a primary school pupil who is robbed on the way home from school
Fig. 3. An example of a Narration Board (post-ideation) illustrating a scenario of how E-Hovx plays a role in protecting the primary school pupil from a potential robbery
other hand, top management and clients will be able to grasp the design solutions from the illustration of how the intended users interact with the new product design concepts or design solutions in future scenarios, as demonstrated in the Narration Board (post-ideation) (figure 3). As an example, the E-Hovx project depicts a scenario in which a primary school pupil encounters danger as he is robbed on his way home from school (figure 2). Figure 3 shows how the E-Hovx device concept assists in the scenario by producing an alarm to alert the pupil and to ward off any potential harm.
4 Evaluating Narration Board for Visualising Final Design Concept
4.1 Methodology
In order to evaluate how effective the narration board (pre-ideation) is as a design tool in assisting interface designers in generating ideations and visualizing final design concepts, an empirical study was conducted by a usability specialist to examine the relation between two variables: narration board and final design concept. The study examined twenty (20) different design projects developed by interface designers as test subjects (sample size N=20) at the Interface Design Department. The null hypothesis (H0) is that “there is no relation between narration board and final design concept”. The alternative hypothesis (H1) is that “there is a positive association between the narration board (pre-ideation) and the final design concept for a design project”. To produce a successful narration board, designers need to highlight certain elements. Truong et al. [12] identified five significant elements for a narration board to convey its narrative to the design team: level of detail, inclusion of text, inclusion of people and emotions, number of frames, and portrayal of time. Likewise, there are five attributes that determine how usable and functional the final design concepts derived from the input of the narration board are. These five attributes, used in generating the final design concept in the later conceptual design stage, are form and functionality, usability (ease of use), user-artefact illustration, product semantics, and design appeal (emotional and mood). This study looks at 20 design projects (DPs) developed by interface designers addressing a common theme of “i-Companion”. The DPs were selected based on the inclusion of a narration board (pre-ideation) and a final design concept in the design process.
To quantify the effectiveness of the narration board, the usability specialist scored each of its elements on a 5-point Likert scale (1 = element least applied, 5 = element most applied). The elements are level of detail, inclusion of text, inclusion of people and emotions, number of frames, and portrayal of time. A final narration board score was then given to each of the 20 DPs as the sum of the five element scores. Similarly, to evaluate the output of final design concepts, the final design concept score was calculated as the sum of the 5 attributes, i.e. form and
functionality, usability, user-artefact illustration, product semantics, and design appeal (emotional and mood), for each of the 20 DPs.
4.2 Results, Data Analysis and Discussion
Result. Table 1 summarises the final scores of narration board and final design concepts for the 20 DPs.
Table 1. A summary of the final scores on Narration Board and Final Design Concepts for 20 Design Projects
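The scoring scheme described in Section 4.1 can be sketched as follows. This is only an illustrative sketch: the ratings are hypothetical and are not the study's data, the function name `total_score` is our own, and we assume that the five final design concept attributes were rated on the same 1-5 scale as the narration board elements, which the text does not state explicitly.

```python
# Illustrative sketch of the summed-score scheme; all ratings are hypothetical.
NARRATION_ELEMENTS = (
    "level of detail",
    "inclusion of text",
    "inclusion of people and emotions",
    "number of frames",
    "portrayal of time",
)
CONCEPT_ATTRIBUTES = (
    "form and functionality",
    "usability",
    "user-artefact illustration",
    "product semantics",
    "design appeal",
)


def total_score(ratings, criteria):
    """Sum one 1-5 rating per criterion into a final score (range 5-25)."""
    missing = [name for name in criteria if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for name in criteria:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"rating for {name!r} must be between 1 and 5")
    return sum(ratings[name] for name in criteria)


# Hypothetical ratings for a single design project (assumption, not real data).
example_board = dict(zip(NARRATION_ELEMENTS, (4, 3, 5, 4, 3)))
example_concept = dict(zip(CONCEPT_ATTRIBUTES, (4, 4, 3, 3, 4)))

narration_total = total_score(example_board, NARRATION_ELEMENTS)
concept_total = total_score(example_concept, CONCEPT_ATTRIBUTES)
print(narration_total, concept_total)
```

With five criteria per variable, every final score lies between 5 and 25, which bounds both axes of any plot of the two scores.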
Data Analysis. To examine the relation between the two variables (narration board and final design concept), a non-parametric Spearman’s rho test was conducted to obtain the correlation coefficient for the sample size (N) of 20. Table 2 shows the correlation matrix of the two variables (scores of narration board and final design concept). It shows a statistically significant positive correlation between narration board and final design concept scores (rho=0.78, df = 18, p<0.001). Thus, projects with higher narration board component scores tend to produce better final design concepts, and vice versa. In order to examine whether there is a curvilinear relationship or any outliers affecting the correlation, the final scores of the two variables (narration board scores and final design concept scores) were plotted in a scatterplot (figure 4). The scatterplot showed no evidence of a curvilinear relationship or of undue influence of outliers.
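For readers unfamiliar with the test, Spearman's rho is the Pearson correlation computed on the ranks of the two variables, with tied values sharing the mean of their ranks. The sketch below shows this computation in plain Python. The six score pairs are hypothetical placeholders chosen for illustration, not the 20 scores from Table 1, and the helper names are our own.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(xs) != len(ys):
        raise ValueError("both score lists must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical final scores for six projects (assumption, not the study's data).
narration_scores = [19, 22, 15, 17, 24, 12]
concept_scores = [18, 21, 16, 14, 23, 13]
rho = spearman_rho(narration_scores, concept_scores)
print(round(rho, 3))
```

Working on ranks rather than raw scores is what makes the test non-parametric: it measures how consistently one score increases with the other without assuming a linear relation or normally distributed scores.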
Table 2. Spearman’s correlation produced by Correlate for the two variables (narration board and final design concept)
Correlations (Spearman's rho): rows for Narration Scores and Concept Scores, each reporting Correlation Coefficient, Sig. (2-tailed) and N.
**. Correlation is significant at the 0.01 level (2-tailed).
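As a rough plausibility check on the reported significance, the coefficient can be converted to a t statistic with N - 2 degrees of freedom, t = rho * sqrt((N - 2) / (1 - rho^2)). This is a common approximation for Spearman's rho and is an assumption here; the statistics package used in the study may compute the p-value differently. The sketch uses only the values reported in the text (rho = 0.78, N = 20) and standard two-tailed critical values of the t distribution for 18 degrees of freedom.

```python
from math import sqrt


def t_statistic(rho, n):
    """t approximation for a correlation coefficient with n - 2 degrees of freedom."""
    if n < 3 or not -1 < rho < 1:
        raise ValueError("need n >= 3 and -1 < rho < 1")
    return rho * sqrt((n - 2) / (1 - rho ** 2))


REPORTED_RHO = 0.78   # value reported in the Data Analysis paragraph
SAMPLE_SIZE = 20      # number of design projects

# Two-tailed critical values of Student's t for 18 degrees of freedom.
T_CRITICAL_01 = 2.878   # alpha = 0.01
T_CRITICAL_001 = 3.922  # alpha = 0.001

degrees_of_freedom = SAMPLE_SIZE - 2
t_value = t_statistic(REPORTED_RHO, SAMPLE_SIZE)
significant_at_01 = abs(t_value) > T_CRITICAL_01
significant_at_001 = abs(t_value) > T_CRITICAL_001
print(degrees_of_freedom, round(t_value, 2), significant_at_01, significant_at_001)
```

Under this approximation the t value is about 5.29, which exceeds both critical values and is therefore consistent with the reported df = 18 and p < 0.001.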
Fig. 4. A scatterplot diagram showing the relation of narration board scores and final design concept scores for 20 design projects
Discussion. The result showed a statistically significant positive relation between the two variables of narration board and final design concept. In other words, higher narration board scores tend to be accompanied by higher final design concept scores. This is consistent with the expectation that an interface designer who is proficient in design skills such as visualization, drawing and sketching, and who is creative, would be able to generate better design solutions and more usable final design concepts. However, there are some exceptional cases, such as DP 5 and DP 8, in which higher narration board scores were accompanied by lower final design concept scores. A likely reason is time constraints: the interface designers may have spent so much time and effort on producing the narration board (pre-ideation) and other design tools that less time remained for developing ideations at the final conceptual design stage. In addition, the more elements of the narration board are applied, the higher the impact and the better the design solutions produced by interface designers. For
instance, the narration board (pre-ideation) in figure 2 applied a certain level of detail, such as richness in colour, 8 narrated frames in numbered sequence, appropriate text inclusion in dialogues, portrayal of time and location, facial expressions of users in the scenarios, and issues highlighted in the user-scenario context that reveal user requirements for problem-solving solutions. As a result, a device called E-Hovx was produced as the final design concept to help resolve the scenarios, as shown in figure 3. The E-Hovx device was also produced as a high-resolution 3D model and as a final prototype using a Rapid Prototyping machine for detailed visualization and testing purposes.
5 Conclusion
In general, narration boards provide a visual reference for interface designers, both for illustrating user problems via scenarios and as a medium for promoting future user-interaction scenarios. In many ways, the approach is visually tacit. When applied in a holistic manner, narration boards are powerful visual communication aids that can provide input towards design strategies. Their use not only benefits the design team but also helps to convey ideations to top management and to clients. Designers also tend to show more panache and creativity when employing mixed media to produce narration boards. In a nutshell, a narration board with the required components of level of detail, inclusion of text, inclusion of people and emotions, appropriate number of frames and portrayal of time will greatly help interface designers in visualising and generating final design concepts. Future studies will examine the relation of each component of the narration board with the attributes of the final design concepts, and will also investigate which components contribute most to a usable and effective final design concept. Apart from this, expert opinions from multi-disciplinary teams will be gathered to examine the uptake of the narration board in visualising final design solutions in practice.
Acknowledgments. The authors would like to express gratitude to the staff and students involved in the design projects at the Interface Design Department, Faculty of Creative Multimedia at Multimedia University, Malaysia. We also wish to thank Selina Ooi, whose works are mentioned in the paper.
Disclaimer. The authors wish to emphasize that the names and images shown in this paper are only for training and educational purposes, and do not intentionally infringe the rights of individuals or organisations.
Any company names, registered trademarks or commercial names mentioned or shown in the text solely belong to their respective owners, organisations and institutions.
References
1. Bailey, B., Konstan, J., Carlis, J.: DEMAIS: Designing Multimedia Applications with Interactive Storyboards. In: MM01, Ottawa, Canada (Sept. 30 – Oct. 5, 2001), pp. 241–250. ACM Press, New York (2001)
2. Cheng, K.: Storytelling Techniques Workshop. Available online at http://www.okcancel.com/archives/article/2006/02/storytelling-techniques-workshop.html
3. Comic Book Creator. Available online at http://www.planetwidegames.com
4. Cooper, A.: The Inmates are running the asylum. SAMS, Indiana (1999)
5. Erickson, T.: Notes on Design Practice: Stories and Prototypes as Catalysts for Communication. Available online at http://www.pliant.org/personal/Tom_Erickson/Stories.html
6. IBM Knowledge Socialization Project. Available online at http://www.research.ibm.com/knowsoc/
7. Khong, C.W.: A Review of Applied Ergonomics Techniques Adopted by Product Designers. In: Lim, K.Y., et al. (eds.) Proceedings of 4th APCHI and 6th ASEAN Ergonomics 2000 (APCHI/SEAES 2000), pp. 317–322. Elsevier, Singapore (2000)
8. Lelie, C.: The value of storyboards in the product design process. Personal and Ubiquitous Computing 10(2), 159–162 (2006)
9. Pedell, S., Vetere, F.: Visualizing use context with picture scenarios in the design process. In: Mobile HCI 2005, Salzburg, Austria, pp. 271–274 (2005)
10. Rosson, M., Carroll, J.: Usability Engineering: Scenario-based Development of Human-computer Interaction. Morgan Kaufmann Publishers, San Francisco (2002)
11. Specialist Library Knowledge Management. Storytelling. Available at http://www.nelh.nhs.uk/knowledge_management/km2/storytelling_toolkit.asp
12. Truong, K.N., Hayes, G.R., Abowd, G.D.: Storyboarding: An Empirical Determination of Best Practices and Effective Guidelines. In: Carroll, J. (ed.) Proceedings of the 6th ACM conference on Designing Interactive Systems, pp. 12–21. ACM Press, University Park, PA, USA (2006)
Scenario-Based Installability Design
Xiao Shanghong
UCD Research Department, Huawei Technologies Co. Ltd., China
Abstract. We introduce a user scenario-based approach to installability design. The basic idea is to observe, through onsite surveys, how users complete an installation, and thereby to understand their experience, skills, and operating habits. Special attention is paid to the installation time, which affects efficiency, to the problems encountered during installation, and to how users solve these problems. Four main issues are considered: how to select typical users, how to conduct installability task analysis, how to define the scenario, and how to conduct the installability user test.
Keywords: Installability, Typical Users, Task Analysis, Scenario Definition, User Test.
1 Introduction
Product installability design has attracted wide attention among R&D staff. They welcome the idea of designing for installability from the user’s point of view, but they do not have a clear idea of how to take user requirements into consideration and incorporate them into their design. Huawei promotes user-centered design (UCD) and lists the product installability project as a UCD project. The purpose is to enable the R&D staff to consider installability from the user’s point of view. The open questions are how to carry out the installability design and how to let users participate in it.
We introduce the user scenario-based installability design approach. The basic idea is to observe, through onsite surveys, how users complete the installation, and thereby to understand their experience, skills, and operating habits. Special attention is paid to the installation time, which affects efficiency, to the problems encountered during installation, and to how users solve these problems. Through careful observation and analysis we can find the problems that affect installation efficiency and quality, and then work out a task model that complies with users’ operating habits and makes the operation easier. This task model is also called the user scenario. Based on the user scenario, we develop a prototype, invite users to evaluate it, and iterate on the design until it meets the installability requirements of users.
2 How to Select Typical Users
Typical installation users represent the majority of installation users. Selecting typical users is in fact a process of sampling from a large population of users. During sampling, we need to consider the skill of users, which is generally classified into three levels: novice, mid-level, and skilled. Another factor is the number of users interviewed. If the number is too small, the interview result cannot reflect the general situation of the majority of users. If the number is too large, the cost of user interviews increases. Thus, we need to find an economical and reasonable number.
2.1 Sample Size and Probability of Detecting Usability Problems
Formula (1) is used to calculate the probability p of detecting usability problems with a sample of n users. λ is the probability that a single user detects the usability problems. Experience shows that this probability is about 31%. Figure 1 shows that if we select 6 to 7 typical users for interview, more than 80% of usability problems can be found.
$$p = 1 - (1 - \lambda)^n \qquad (1)$$
Fig. 1. Relationship between sample size and probability of detecting usability problems
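Formula (1) can be evaluated directly to reproduce the kind of curve shown in Fig. 1. The sketch below assumes the stated single-user detection probability of 0.31; the function names are illustrative and not part of the original method.

```python
def detection_probability(n, lam=0.31):
    """Formula (1): chance that a problem is found by at least one of n users."""
    return 1 - (1 - lam) ** n


def smallest_sample(target, lam=0.31):
    """Smallest n whose detection probability reaches the target (hypothetical helper)."""
    n = 1
    while detection_probability(n, lam) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Tabulate p for small sample sizes, as in Fig. 1.
    for n in range(1, 9):
        print(n, round(detection_probability(n), 3))
    print("smallest n for 80%:", smallest_sample(0.80))
```

Under this assumption, 6 and 7 users give roughly 0.89 and 0.93, which is consistent with the statement that 6 to 7 users find more than 80% of the problems.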
2.2 Set Up a Typical Installation User Database
The typical installation user database can be set up as follows.
For the domestic market:
• Select 2 to 3 qualified telecom equipment installation companies in each province.
• Request the representative office to distribute the questionnaire to investigate the basic information of installers, including age, education, experience, and responsibility.
• Set up the installation user database according to the survey result.
• Export the user profile.
• Select typical users according to the user profile.
For the overseas market, the same procedure can be followed.
Example of Task Analysis

Current task model | Time (man*minute) | Problem
1 Cabinet Transportation | 10 |
2 Cabinet Installation | 72 | It is inconvenient and also takes a long time to use pads to level the cabinet.
2.1 Mark lines based on template | 5 |
2.2 Drill holes and position the cabinet | 23 |
2.3 Level the cabinet by using pads | 34 |
2.4 Secure the cabinet | 10 |
3 Power Cable Installation | 24 |
4 User Cable Installation | 125 125 | Users need to route the cables from the cabinet top. At first, users can accept this method. But as more cables are installed, the cable routing space on either side of the cabinet becomes smaller and smaller. When it comes to the cables of the last two user boards, users can hardly install them.
4.1 Install the user subrack | |
4.2 Install the user boards | |
4.3 Route user cables from the cabinet top. (It is impossible to operate on either side of the cabinet.) | |
5 Installation upon Expansion (Board Expansion) | | User cables installed in advance are scattered in the cable trough and do not look neat. Cable connectors are covered with dust and not well protected. As too many cables pile up near the door, sometimes the front door cannot be closed properly.
5.1 Replace the dummy panel with user board for expansion purpose. | |
5.2 Locate the correct user cable from the cable trough. | |
5.3 Install the user cable on the user board. | |
2.3 About User Proxy
User proxies are often used in product development. For example, in the installability pre-test, Huawei employees are invited to take the test in place of real users, which saves the expense of inviting users. Requirements are also set for this type of user proxy. The first requirement is that proxies must work in the installation field. For example, it is not appropriate to invite software engineers to the cabinet installation test, because installation conclusions drawn by software engineers cannot reflect the installability of the equipment. The correct way is to select mechanical test engineers, new
employees of the mechanical department, and document development engineers or technical support engineers engaged in cabinet installation.
3 How to Conduct Installability Task Analysis
After determining the typical users, we need to study how users complete the installation task. This process is called task analysis. During the task analysis, we observe how users execute the installation task, record the time needed for each installation task and the problems found in each task, and try to understand the user experience. A detailed task analysis helps the UCD team to understand the current product better and thus locate the points for improvement.
Task analysis can be conducted with various user research methods, such as interview, survey, group discussion, and observation. It should be completed in the early stage of the concept phase. Installability task analysis can be conducted by observing the installability test in the laboratory. The best choice, however, is onsite observation. After the observation, it is recommended to interview experienced users and collect their opinions or suggestions, which helps us understand the problems users face on site.
After the task analysis is completed, we need to export the task model of the current product, the completion time of each task, and a detailed record of installation problems (if any). We also need to export the task importance and satisfaction survey table. The importance of a task is generally judged according to factors such as the installation time and the severity of the problem. The satisfaction survey requires at least 6 users, and the satisfaction score in the survey table is the average of the scores given by these users.
Task | Importance | Satisfaction
Cabinet Transportation | 3 | 5
Cabinet Installation | 4 | 3
Power Cable Installation | 4 | 5
User Cable Installation | 5 | 2
Installation upon Expansion | 4 | 2
Use the four-quadrant analysis method to find the key points for improvement based on the importance and satisfaction of tasks. Tasks that fall into the opportunity window are those to which users react strongly, and they should be given high priority. Task analysis is the foundation of user scenario-based design. Only through task analysis can we understand the installation scenario at the user site and define the user scenario oriented to the future design. Task analysis is indispensable in installability UCD.
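The four-quadrant analysis can be sketched in a few lines using the scores from the importance and satisfaction table. The text does not give the quadrant boundaries, so this sketch assumes a 1 to 5 scale split at the midpoint 3, treats a satisfaction of 3 or less as low, and uses illustrative names for every quadrant other than the opportunity window.

```python
# (importance, satisfaction) per task, taken from the survey table.
TASKS = {
    "Cabinet Transportation": (3, 5),
    "Cabinet Installation": (4, 3),
    "Power Cable Installation": (4, 5),
    "User Cable Installation": (5, 2),
    "Installation upon Expansion": (4, 2),
}


def classify(importance, satisfaction, midpoint=3):
    """Place a task in one of four quadrants (the midpoint is an assumption)."""
    important = importance > midpoint
    satisfied = satisfaction > midpoint
    if important and not satisfied:
        return "opportunity window"
    if important and satisfied:
        return "maintain"              # illustrative label
    if satisfied:
        return "low importance, high satisfaction"  # illustrative label
    return "low priority"              # illustrative label


# Tasks in the opportunity window, most important and least satisfying first.
opportunities = sorted(
    (name for name, (imp, sat) in TASKS.items()
     if classify(imp, sat) == "opportunity window"),
    key=lambda name: (-TASKS[name][0], TASKS[name][1]),
)

if __name__ == "__main__":
    for name in opportunities:
        print(name, TASKS[name])
```

With these assumed thresholds, user cable installation, installation upon expansion, and cabinet installation fall into the opportunity window. A different midpoint would move Cabinet Installation out of it, so the cut-off should be agreed with the team before it is used.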
4 How to Define the Scenario Based on Task Analysis
A scenario defines the human-product interaction. Scenario definition is based on the analysis of the current user installation tasks. The purpose is to improve the current task model and to provide a task model oriented to the future product design. The new task model should allow users to complete the installation task more conveniently and quickly.
For example, in the cabinet installation scenario, the cabinet was originally leveled by using pads. Because installers need to insert the pads at the bottom of the cabinet, it is difficult and time-consuming to level the cabinet. Cabinet leveling is therefore the key concern of users in the cabinet installation scenario, and special attention must be paid to fixing this problem. For example:

Before | After
1 Cabinet Transportation | 1 Cabinet Transportation
2 Cabinet Installation | 2 Cabinet Installation
2.1 Mark lines based on template | 2.1 Mark lines based on template
2.2 Drill holes and position the cabinet | 2.2 Drill holes and position the cabinet
2.3 Level the cabinet by using pads | 2.3 Level the cabinet by using anchors
2.4 Secure the cabinet | 2.4 Secure the cabinet
3 Power Cable Installation | 3 Power Cable Installation
4 User Cable Installation | 4 User Cable Installation
4.1 Install the user subrack | 4.1 Remove the side door
4.2 Install the user boards | 4.2 Install the user subrack
4.3 Route user cables from the cabinet top. (It is impossible to operate on either side of the cabinet.) | 4.3 Install the user boards
 | 4.4 Install user cables (Users can operate at one side of the cabinet.)
5 Installation upon Expansion (Board Expansion) | 5 Installation upon Expansion (Board Expansion)
5.1 Replace the dummy panel with user board for expansion purpose. | 5.1 Loosen user cables that are installed in advance on the dummy panel.
5.2 Locate the right user cable from the cable trough. | 5.2 Replace the dummy panel with user board for expansion purpose.
5.3 Install the user cables on the user board. | 5.3 Install the user cable on the user board.
5 How to Conduct Installability User Test
The installability user test is based on the user scenario. Generally, 2 to 3 groups of typical users are invited to the test. The test team consists of one instructor and two test recorders. The instructor steers the whole test process but does not persuade users to do anything. The recorders record data such as the installation time, the number of errors, the number of times users seek help, and the installation problems. The test process must be videotaped, so that we can analyze it in detail after the test to locate user problems.
Before the test, users are asked to complete the user acknowledgement letter and the user basic information table. We must inform users that the test is used for product design only and will not cause them any harm. Users, in turn, are obliged to keep any related information confidential. Upon completion of each scenario test, users complete the satisfaction questionnaire. After the whole test is completed, users complete the overall satisfaction questionnaire.
6 Conclusion
Besides the key activities above, scenario-based installability design also includes scenario-based prototype design, which keeps user requirements consistent throughout the product development process. Prototype design is not detailed here due to limited space.
This article presents a scenario-based installability design method to resolve the difficulties encountered in current installability design. It explains how to select typical users, conduct installability task analysis, define user scenarios based on task analysis, and conduct the installability test based on the scenario. The purpose is to give the R&D staff clear guidance in product installability design. The article also illustrates the effectiveness of scenario-based installability design through examples.
A Case Study of New Way to Apply Card Sort in Panel Design
Yifei Xu, Xiangang Qin, and Shan Shan Cao
Corporate Technology, Siemens Ltd. China, 7 Wangjing Zhonghuan Nanlu, Beijing 100102 P.R. China
{yifei.xu, xiangang.qin, shanshan.cao}@siemens.com
Abstract. The aim of this paper is to describe a case of washing machine panel design. In this case, card sorting and cluster analysis were applied to obtain target users’ mental models of the information architectures of washing machine panels. The differences among the information architectures of existing panels were also quantitatively evaluated. In addition, the differences between users’ mental models and existing washing machines regarding the information architectures were identified. The methodology and results in this paper contribute to the design of washing machine panels.
Keywords: Panel Design, Card Sorting, Quantitative Measure.
it shows fewer “selling points” to customers in marketing. The goal of this project is to find a compromise solution for the washing machine panel with better usability and visual affordance of operation and functionality.
Fig. 1. The first trend of HMI design
Fig. 2. The second trend of HMI design
Although washing machines are common in most families and the number of washing programs and functions is limited, current products still draw complaints about usability. One finding from the user study is that the information architectures of the panels are disorganized and inconsistent with users’ mental models. Both the semantic categories of the operations and customers’ usage habits are considered in panel design, and in most cases the two are inconsistent. For example, to most users, Cotton, Synthetic, Wool and Silk are four programs for different fabrics, while Quick, Normal and Power Wash are three programs that differ in washing strength and time. However, according to users’ habits and requirements, Cotton, Synthetic and Quick Wash are used more frequently. So in many products these three programs are grouped together and operated with the knob, while Normal and Power Wash are available only in the menu. Another example is that some new and innovative programs, such as Fresh Drying and Steam Wash, which are not frequently used, occupy high-priority places on the panel. Also, for reasons of visual appearance and manufacturing cost, the knob, buttons and lights on the panel cannot vary much in design and must be arranged in a visually pleasing order. As a result, much of the information indicating operation and functionality is lost in panel design. Because users’ mental model of the washing machine panel differs from the real product, it is difficult for them to understand and remember the panel information architecture, and they may be unable to locate a program or function when necessary. To provide a better design of the washing machine HMI, we study target users’ mental model of the panel and identify how it differs from current products.
1.2 Card Sort Method
Card sorting is a usability method used in software and product design to discover users’ mental models about information architectures [1]. It has proved to be a useful and effective technique for organizing pieces of information or concepts. The standard way of card sorting is to provide a group of target users with a set of cards. On each card is a concept or piece of information from the set that needs to be organized. Users then categorize the cards into groups based on their understanding [2]. The technique is based on the assumption that if users group cards together, the concepts probably should be grouped together in the system [3]. The result suggests users’ mental model about the information architecture of a system or website.
The standard way of analyzing card sort results is cluster analysis. Results from all or various groups of users can be entered into a statistical analysis program, which generates tree diagrams (dendrograms) as a graphical representation of the relationships between the concepts under study. Dendrograms may provide useful insight into people’s mental model of a system, but they are usually very complex and difficult to interpret and measure quantitatively [4]. In this project, qualitative differences are not sufficient to support the redesign. Therefore, we applied a quantitative measure to evaluate the differences among the card sorting results from users. As a new trial, the information architectures of current washing machine panels were also input as card sort results and compared with those of users. We evaluated both the differences among the information architectures of existing washing machine panels and the differences between users’ mental models and machine information architectures. The assumption of this test is that the information architectures of current panel designs are themselves based on previous studies of their target users.
2 Method and Procedure
This section outlines the methodology employed in gathering the card sort data, conducting the card sort, and analyzing the results.
2.1 Card Sorting with Users
Participants
Twelve (6 male, 6 female) end users of high-end, front-loading washing machines were recruited from the 19 participants in a previous interview. Information was collected regarding their gender, age, education level and the type of washing machine they currently used.
Procedure
Thirty-nine physical cards were generated from the programs and functions of the washing machine panel to be evaluated. Each was printed with a menu entry or a visible option around the controls. The functions and programs of the target washing machine are very similar to those of existing products on the market. Participants conducted the card sorting separately, after all items on the cards had been explained to them in detail to make sure they understood the meanings. Each participant was accompanied by a usability engineer throughout the session and explained the card-sorting result to the engineer. Participants were instructed to sort the cards into piles based on their understanding and requirements, without any restriction on how many categories to create or what criteria to use to group the cards. Participants were also allowed to throw out any item that they believed did not fit their requirements. After card sorting, participants were asked to name each group of cards and then to merge the sub-groups of cards into higher-level categories.
2.2 Card Sorting Derived from Washing Machines
Washing Machines
Ten high-end, front-loading washing machines were selected for the study. In this paper, we mark them as A1, A2, B1, B2, C, D1, D2, E, F, G. The letter indicates the brand, while the number distinguishes the model. A1, A2, B1, B2 and C are European brands, and the rest are Asian products.
Procedure
As mentioned above, we set out to analyze the information architecture trends of washing machines available on the market. To this end, the information architectures of the 10 high-end, front-loading washing machines were summarized into dendrograms based on the criterion that “items operated by the same control should be grouped in one category”. Additionally, a distinct visual design of the controls or of the labels around a control naturally partitions the group into sub-categories. For example, the items around the knob, such as Cotton, Super Quick and Dark Wash, usually belong to the group of wash programs. But according to visual design elements such as location, font and color, they can be divided into several sub-groups. Items that are close in location and similar in visual appearance are usually set under one sub-group. Items located in the software menu belong to another sub-group. All function sub-groups form the group of Functions and Features.
Fig. 3 shows the item-sort result of one product. We conjecture that the items were meant to be presented in two groups: a program group and a function group. The items of the program group were split into two sub-categories by a red semi-ring around the 4 programs on the left and by red text labeled on top of them. The function group consists of 4 sub-categories. Buttons that are close in location and similar in visual appearance are set under one sub-category.
Fig. 3. The information architecture of washing machine panel
2.3 Data Analysis
To make the information architectures from users and washing machines comparable, the card sort results must share the same “card pool”. We summarized the programs and functions from the 39 cards (Pool User) and the 10 washing machines (Pool Machine). Identical or very similar programs and functions were labeled with the same card name. When the users’ data were input into cluster analysis, the cards available only in Pool Machine were regarded as “useless cards”, and vice versa. The union of Pool User and Pool Machine was named Pool General and served as the card pool for the comparison.
Cluster analysis was performed to process the card sort data. The quantitative measure of card sort data was based on the assumption that each dendrogram can be represented by a card sort cluster and each cluster can be represented as a matrix. The triangular matrix can then be concatenated into a single vector [4]. Accordingly, a matrix was obtained from each participant’s card sort result and from each existing panel by using USort and EZCalc. We then concatenated the matrixes into vectors and performed statistical analysis on the vectors representing the card sort results [4].
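The matrix-to-vector step can be illustrated with a small sketch. It does not reproduce USort or EZCalc: it assumes a single-level sort in which two cards have distance 0 when they share a pile and 1 otherwise, and the two participants' sorts are hypothetical.

```python
from itertools import combinations


def pair_distances(cards, piles):
    """Return one participant's upper-triangle distances as a flat vector.

    A pair of cards gets 0.0 if both cards sit in the same pile and 1.0
    otherwise. Cards that appear in no pile (discarded or "useless" cards)
    are treated as distance 1.0 from everything, which is an assumption.
    """
    pile_of = {card: index for index, pile in enumerate(piles) for card in pile}
    vector = []
    for a, b in combinations(cards, 2):  # upper triangle, row by row
        same_pile = a in pile_of and b in pile_of and pile_of[a] == pile_of[b]
        vector.append(0.0 if same_pile else 1.0)
    return vector


def mean_vector(vectors):
    """Average several participants' vectors element by element."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


CARDS = ["Memory1", "Memory2", "Quick", "Normal", "Intensive"]
# Hypothetical sorts by two participants, for illustration only.
user_1 = [{"Memory1", "Memory2"}, {"Quick", "Normal", "Intensive"}]
user_2 = [{"Memory1", "Memory2"}, {"Quick", "Normal"}, {"Intensive"}]

v1 = pair_distances(CARDS, user_1)
v2 = pair_distances(CARDS, user_2)
merged = mean_vector([v1, v2])
```

Because every participant's vector lists the card pairs in the same order, the vectors can be compared element by element in later statistical tests. A pool of 39 cards yields 741 pairs, which is consistent with the total N reported in Table 2.1.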
3 Results and Discussions
3.1 Information Architectures from Users
The results of the 12 users were manually input into the card-sorting tool USort and analyzed by EZCalc with the complete linkage algorithm. Twelve dendrograms and distance matrixes were generated separately, representing the information architectures and the distances between pairs of cards in users’ minds.
Fig. 4. Analysis of the Distance Matrixes for 12 Users
Fig. 5. Dendrogram of card sorting results for 11 Users
From our observation, one participant did not understand the card-sorting process well, and that participant’s result was highly inconsistent with the others. The data from this participant were therefore excluded from further analysis. This decision was validated by Hierarchical Cluster Analysis of the users’ distance matrixes. The card-sorting results of the remaining 11 users were then merged to generate one dendrogram (see Fig. 5) and one distance matrix (see Table 1 for an example) with the complete linkage algorithm. A distance of 0 means
that everyone placed two cards together, and a distance of 1 means that no one combined the two cards. In the dendrogram and distance matrix below, everyone put Memory1 and Memory2 in the same pile, and no one put Quick and Memory1 together. The smaller the distance between two cards, the more users put them together.

Table 1. Example of the Distance Matrix (11 users)

 | Memory1 | Memory2 | Standard | Synthetic | Quick | Normal | Intensive
Memory1 | 1.00 | 0.00 | 0.78 | 0.90 | 1.00 | 1.00 | 1.00
Memory2 | 0.00 | 1.00 | 0.78 | 0.90 | 1.00 | 1.00 | 1.00
Standard | 0.78 | 0.78 | 1.00 | 0.61 | 0.64 | 0.67 | 0.75
Synthetic | 0.90 | 0.90 | 0.61 | 1.00 | 0.83 | 1.00 | 0.90
Quick | 1.00 | 1.00 | 0.64 | 0.83 | 1.00 | 0.06 | 0.28
Normal | 1.00 | 1.00 | 0.67 | 0.94 | 0.06 | 1.00 | 0.13
Intensive | 1.00 | 1.00 | 0.75 | 0.90 | 0.28 | 0.13 | 1.00
Gender Difference
As washing machines are household appliances, their target users may be either male or female. One main purpose of the current project is to explore whether male and female users differ significantly in their mental structures of washing machines. The card sorting results of the 11 participants were divided into a male structure and a female structure, producing two distance matrixes. The Wilcoxon Signed Ranks Test was used to compare the distance matrixes of male and female users. The result (see Table 2.1 and Table 2.2) shows no significant difference, which indicates that it is not strictly necessary to distinguish male from female users when designing the menu structures of washing machines. This result was supported by the correlation test (r=0.613, p<0.001).

Table 2.1 Wilcoxon Signed Ranks Test for the information architectures of male and female
Female - Male | N | Mean Rank | Sum of Ranks
Negative Ranks | 285(a) | 307.69 | 87691.00
Positive Ranks | 312(b) | 291.06 | 90812.00
Ties | 144(c) | |
Total | 741 | |
a Female < Male; b Female > Male; c Female = Male

Table 2.2 Signed Ranks Test for the information architectures of male and female

 | Female - Male
Z | -.371(a)
Asymp. Sig. (2-tailed) | .711
a Based on negative ranks. b Wilcoxon Signed Ranks Test
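As a cross-check, the Z statistic in Table 2.2 can be approximated from the rank sums in Table 2.1 with the usual large-sample normal approximation. The sketch below is an illustration rather than the authors' procedure: it drops ties between the paired values but applies no correction for tied ranks, so it only approximately matches the reported value.

```python
import math


def wilcoxon_z(n_negative, n_positive, sum_negative, sum_positive):
    """Normal approximation to the Wilcoxon signed-rank test from rank sums."""
    n = n_negative + n_positive              # pairs with zero difference are dropped
    if not math.isclose(sum_negative + sum_positive, n * (n + 1) / 2):
        raise ValueError("rank sums do not add up to n(n+1)/2")
    t = min(sum_negative, sum_positive)      # smaller rank sum
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tied-rank correction
    z = (t - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Values from Table 2.1.
z, p = wilcoxon_z(285, 312, 87691.0, 90812.0)

if __name__ == "__main__":
    print(round(z, 3), round(p, 3))
```

The result is close to the reported Z of -.371 and significance of .711. The small remaining gap in Z is what one would expect from the omitted tied-rank correction.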
3.2 Information Architectures from Washing Machines
The information architectures of the 10 washing machines were expected to differ among brands and to be consistent within one brand, consistency being one of the usability principles recognized by ergonomics [5]. Brands from Europe and Asia were also assumed to differ. To test this, the distance matrixes and dendrograms of the 10 individual washing machines were analyzed by Hierarchical Cluster Analysis to see which information architectures clustered together and could thus be classified as one group. When the number of clusters was set to 2, A1 and B2 formed one cluster and the other washing machines formed the other. This result was unexpected, because the four washing machines of brands A and B should keep consistent design styles in panel structure. Yet the distance between B1 and B2 was the largest among the distances between B1 and the other washing machines, although they are of the same brand. When the number of clusters was set to 3, F was separated as an independent cluster. The reason is that all functions and the corresponding parameters were displayed on the panel of F with only one level of menu structure, while the other washing machines had more than one level of menu structure. E and G were separated as independent clusters when the number of clusters was set to 4 and 5, respectively. When the number of clusters was set to 6, A2&B1, D1&D2&C, and A1&B2 were each clustered together, and E, F, and G were distinguished from each other. The classifications were also validated by the proximity of the washing machines (see Fig. 6). The results showed that the main dissimilarities exist between washing machines from European and Asian brands, except that the distance between C, a European brand, and D1 was the smallest among the distances between C and the other washing machines.
Fig. 6. Hierarchical Cluster Analysis of 10 Washing Machines

Table 3. The Proximity of Washing Machines

Machine | Max: machine | Max: distance | Min: machine | Min: distance
A1 | F | 148 | B2 | 45
A2 | F | 104 | B1 | 44
B1 | B2 | 114 | A2 | 44
B2 | E&F | 163 | A1 | 45
C | B2 | 140 | D1 | 57
D1 | B2 | 144 | D2&C | 57
D2 | B2 | 162 | D1 | 56
E | B2 | 163 | D1 | 71
F | B2 | 163 | C | 78
G | B2 | 160 | B1 | 78

Note: Max is the Maximum Distance between Pairs of Washing Machines; Min is the Minimum Distance between Pairs of Washing Machines
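The idea of cutting a hierarchical clustering at a fixed number of clusters can be sketched as follows. The text does not state which linkage rule was used for the machines, so complete linkage is assumed here. Table 3 reports only the largest and smallest distance per machine, so the sketch uses four hypothetical machines W, X, Y, Z with made-up distances.

```python
def complete_linkage(items, dist, k):
    """Agglomerative clustering: merge the closest clusters until k remain."""
    clusters = [frozenset([item]) for item in items]

    def d(a, b):
        # The distance table stores each unordered pair once.
        return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(d(a, b) for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical pairwise distances, for illustration only.
DIST = {
    ("W", "X"): 40, ("W", "Y"): 150, ("W", "Z"): 160,
    ("X", "Y"): 140, ("X", "Z"): 155, ("Y", "Z"): 60,
}

two_clusters = complete_linkage(["W", "X", "Y", "Z"], DIST, k=2)
```

Calling the function with k = 2, 3, ... mirrors the way the cluster membership was varied from 2 to 6 in the analysis above. With real data, the full pairwise distance table of the 10 machines would replace DIST.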
3.3 Comparison of the Structures for Users and Washing Machines
Six information architectures of washing machines were generated (those of A2&B1, D1&D2&C, A1&B2, E, F, and G) based on the results of Hierarchical Cluster Analysis. The six groups generated by cluster analysis could be validated against the real machines in terms of their panel layouts and brands. Therefore, the information architectures of the six groups were used for further analysis. The similarity between the information architectures of washing machines and users was compared using the Wilcoxon Signed Ranks Test. The results showed that the dendrogram of A1&B2 was the most similar to that of users (see Fig. 7).

Table 4. Wilcoxon Signed Ranks Test for the Menu structures of users and competitors
Comparison | Z | Asymp. Sig. (2-tailed)
Users Vs A1&B2 | -1.82(a) | .069
Users Vs D1&D2&C | -5.48(a) | .000
Users Vs A2&B1 | -8.26(a) | .000
Users Vs F | -10.72(a) | .000
Users Vs E | -12.09(a) | .000
Users Vs G | -13.57(a) | .000
Users Vs 7Others | -3.07(b) | .002
a Based on positive ranks. b Based on negative ranks. c Wilcoxon Signed Ranks Test
Fig. 7. Dendrogram for A1&B2
4 Strengths and Limitations
In this study, card sorting was used not only to gather users’ mental models about panel design, but also as a tool to quantitatively measure the differences among the information architectures of existing product panels and their correlations with users’ mental models. However, the study also has limitations, because it was carried out to meet the requirements of an applied project rather than as a well-designed research project. First, the original purpose of card sorting by users was to obtain users’ mental models of a panel design for a specified new product as defined in the project. Measurement
for existing products was conducted to further analyze the results. Therefore, the card pool for users was slightly different from the card pool of products, which could influence their comparison. Second, nonparametric tests were used for the quantitative analysis because of the small sample sizes of users (12) and washing machines (10). Although statistical principles were strictly observed, we should be cautious about generalizing the current results to a wider field.
5 Conclusions
Card sorting is usually an efficient and effective way to explore users’ requirements for the information architecture design of products. The results from card sorting by users can guide product information architecture design. But how the results are interpreted to support design depends heavily on the individual application and on researchers’ personal experience. Quantitative analysis of card sorting results can be a good supplement to this method.
In the current case study, the information architectures of existing products and of users were identified by card sorting, and the results were further analyzed by cluster analysis and nonparametric tests to classify and compare them. The information architectures of existing washing machines were classified into six groups by cluster analysis, and one of them was found to fit that of users best. No significant gender difference was found in the information architectures of users’ mental models.
In summary, the current case study shows that card sorting and cluster analysis are effective ways to build up the information architectures of products. It also provides researchers and practitioners with qualitative and quantitative methods and results for information architecture design.
References
1. Nielsen, J., Sano, D.: SunWeb: User Interface Design for Sun Microsystems Internal Web. Computer Networks and ISDN Systems 28, 179–188 (1995)
2. Dong, J.M., Fu, L.M., Salvendy, G.: Human-Computer Interaction: User Centered Design and Evaluation. Tsinghua University Press, pp. 74–83 (2003)
3. Berndtsson, J.: Designing an Intranet from Scratch to Sketch: Experiences from Techniques Used in the IDEnet Project. In: Proceedings of the Thirty-Second Annual Hawaii International Conference on System Sciences, vol. 2, p. 2019 (1999)
4. Ewing, G., Logie, R., Hunter, J., McIntosh, N., Rudkin, S., et al.: A New Measure Summarising ‘Information’ Conveyed in Cluster Analysis of Card-Sort Data: Application to a Neonatal Intensive Care Environment. In: Proceedings of the 7th Workshop on Intelligent Data Analysis in Medicine and Pharmacology, pp. 25–29 (2002)
5. Vredenburg, K., Isensee, S., Righi, C.: User-Centered Design: An Integrated Approach (Reprint Edition). Pearson Education North Asia Limited and Higher Education Press, p. 152 (2003)
Design Tools for User Experience Design
Kazuhiko Yamazaki1,* and Kazuo Furuta2
1 IBM Japan Ltd., User Experience Design Center
2 The University of Tokyo
* [email protected]
Abstract. The purpose of this study is to develop an approach to artifact design based on information technology. To make interactive systems easy to use, many projects adopt a user centered design approach. For user centered design, it is important to consider the total user experience, but this is not easy because user experience includes many aspects. To approach the total user experience, the authors propose a method of designing for user experience that consists of the “User viewpoint”, the “Environment viewpoint”, and the “Lifecycle viewpoint”. The “User viewpoint” includes several user groups from the universal design viewpoint, several user characters, and several user emotions. The “Environment viewpoint” includes hardware products, software, applications, the space, and the people who are communicating. The “Lifecycle viewpoint” includes pre-sales, after-sales, support, upgrades, product setup, and use of the application. To support this design approach, a user experience design tool named “UED (User Experience Design) Studio” is proposed. Based on the three proposed viewpoints, three design tools were developed: “The definition tool”, “The evaluation tool”, and “The visualization tool”. “The definition tool” helps designers define a user experience situation easily by selecting a user group, selecting an environment, and entering user tasks based on the lifecycle stage. “The evaluation tool” makes it easy to evaluate the defined user experience. “The visualization tool” shows the results of the evaluation in 3D graphics so that complicated information is easy to understand. To evaluate the proposed tools, a prototype was built, and the results indicate that the proposed approach has the potential to help designers and multidisciplinary teams consider user experience in user centered design.
The web site of the American Institute of Graphic Arts describes experience design as follows:
• A different approach to design that has wider boundaries than traditional design and that strives for creating experiences beyond just products or services.
• The view of a product or service from the entire lifecycle with a customer, from before they perceive the need to when they discard it.
• Creating a relationship with individuals, not targeting a mass market.
• Concerned with invoking and creating an environment that connects on an emotional or value level to the customer.
• Built upon both traditional design disciplines in the creation of products, services, as well as environments in a variety of disciplines.
For example, what is the user’s experience in making a presentation at a conference? Before the conference, the user needs to consider the title and content of the presentation, prepare the presentation slides, and travel to the conference with a notebook PC. After the presentation, the user receives questions and feedback and may change the presentation for the next conference. During the actual presentation, the user may need to set up a desk for the projector, connect cables to the projector, and decide where to put the notebook PC. Also, depending on the user’s character, the presentation style will differ from that of other presenters. For example, a younger presenter may use small text on the slides without a second thought, a designer may try to use a lot of graphics on the slides, and some presenters may not use any slides at all. Many factors are related to the user’s experience. In reality, it is not easy to apply a user experience design approach to products and services, because user experience design covers a wide range of aspects, and without collaboration by a multidisciplinary team it will not be successful.
In this paper, the authors organize the user experience design approach and propose methods for the design approach, for the processes, and for the teams in order to help designers and multidisciplinary teams. To help this design approach succeed, the authors also propose the creation of a user experience design tool.
2 Design Approach for User Experience Design

2.1 About User Experience Design
User experience design is to design products, systems, and environments by considering the total user experience that includes various aspects such as usability, accessibility, appearance, personality, branding, etc. The design should span the entire lifecycle, and not be limited to just one scene of the user’s time. Also, it should cover the total environment that is related to all of the materials needed to achieve the user’s goals.

2.2 Design Approach for User Experience Design
To approach the total user experience, the authors propose a method of designing for user experience that consists of the Lifecycle viewpoint, the Environment viewpoint,
and Various User viewpoints. The Environment viewpoint covers all of the materials that users look at, touch, or feel. For example, it includes hardware products, software, applications, the space containing the systems, and the people who are communicating. The Lifecycle viewpoint covers all of the time during which the user is related to the product and its systems. For example, it includes the pre-sales period, the after-sales time, support, upgrades, the product setup, and actual use of the application. The various User viewpoints cover the differences between people. For example, they include several user groups from the universal design viewpoint, users with various characters, and the various emotions felt by one person.

1) Consider User Experience from the Lifecycle Viewpoint
From the Lifecycle viewpoint, the system covers everything from a user’s initial awareness, through additional discoveries, on to ordering, delivery, installation, initial use, day-to-day use, service, support, upgrades, and end-of-life. For example, considering a presentation at a conference, before the conference the user needs to consider the title and content of the presentation, prepare the slides, and travel to the conference with a notebook PC, and after the presentation, the user receives feedback and may change the presentation for the next conference.

2) Consider User Experience from the Environment Viewpoint
The Environment viewpoint covers all of the materials that the user looks at, touches, or feels. It includes the hardware product, software, applications, the space where things happen, and the people who are communicating. For example, for a presentation, the environmental requirements include a desk, a projector, suitable cables, the projection screen, etc.

3) Consider User Experience from Various User Viewpoints
The various User viewpoints cover various human differences. These include several user groups from the universal design viewpoint, users with various characters and personalities, and the various emotions that users feel. For example, for a presentation, depending on the user’s character, the presentation style will be different. Younger presenters may use small text on the slides without a second thought, designers may try to use lots of graphics on the slides, and some presenters may not use any slides. Many of these items are related to the user’s unique experience.

2.3 Design Method for User Experience Design
The design method for user experience design has to be based on the user centered design approach, because user centered design is a useful way to solve problems from the user’s viewpoint and is already widely used in many companies. The design method for user experience design needs to extend user centered design to cover the Lifecycle viewpoint, the Environment viewpoint, and various User viewpoints. Also, from the corporate viewpoint, branding is very important for user experience design to succeed as a business. The design process and the design team for user experience design should be based on user centered design with extensions from the user experience viewpoint.
Here is the design process for user experience design:
• Make a plan for the user experience design: It is important to have the right design process, the right methods, and the right team. For this purpose, before starting the project, the project leader has to make a plan for the user experience design. The plan has to include an outline of the process, the schedule, the team members, and the budget.
• Understand the background of user experience: The background includes the market, the business, the users, the stakeholders, and the branding.
• Understand the user experience of the targeted users from the lifecycle, environment, and various User viewpoints: This includes the people, the user roles, the user goals, the user tasks, and the user scenarios, as seen from the user experience viewpoints.
• Concept design for user experience: This includes a low fidelity user experience prototype, and a document for the concept design and its evaluation.
• Detailed design for user experience: This includes a high fidelity user experience prototype, detailed design specifications, and their evaluation.
• Evaluation from the user experience viewpoint: This includes the final prototype and evaluation.
• Validation of user experience in the marketplace: It is important to validate the results of user feedback from the user experience viewpoint.
The design team for user experience design should be formed based on the user centered design approach. The members are almost the same as for user centered design, but all of the members need to have knowledge of user experience design. Here is the list of team members for user experience design:
• Project leader
• User researcher
• User experience designer
• Visual designer (industrial designer or graphic designer)
• User testing specialist
• Marketing planner
3 Design Tool for User Experience Design

3.1 Purpose
User experience design is not easy to understand because it is a new approach and covers many different fields. It is also new to currently practicing usability specialists and designers. It needs to cover various fields from the Lifecycle viewpoint, the Environment viewpoint, and various User viewpoints. It is also important to share information within a multidisciplinary design team. To help designers approach design from the user experience viewpoint, an effective tool is desirable. In this section, the authors describe the current tools used to help designers apply the user experience viewpoint, and the requirements for a newer tool.
3.2 Current Tool
To support user experience design, there are some current design tools such as Persona, User Scenarios, User Segment Tables, and Lifecycle for User Experience, as follows:

1) Persona for User Experience
A persona is an example of a person who characterizes a role and represents a user group from the user experience viewpoint. A persona describes a fictitious user, including roles, skills, goals, emotions, and other personal characteristics. It feels real to designers because it is an example of a person, not just a conceptual description. A persona helps designers understand and focus on the characteristics of users from the user experience viewpoint.

2) User Scenario for User Experience
Modeling user scenarios is a useful way to understand users and share information among designers and related people. A user scenario can serve many roles, such as system vision, design rationale, usability specifications, functional specifications, user interface metaphors, prototypes, object models, formative evaluation, documentation, and overall evaluation. For an innovative design, user scenarios need to be developed for each end user segment to share the goals and aspirations of these users with a variety of professionals, who use the scenarios to create and evaluate new ideas. User scenarios are very important tools for collaborating with professionals around the world and for creating a common language for collaboration. As shown in Figure 1, one example of a user scenario is a visual user scenario for a notebook PC. It describes the typical user scenario, gives an image of a persona and the goods, and works like a poster shared among the people working on the project.

3) User Segment Table
As shown in Figure 2, the user segment table is prepared to help designers identify the various types of users of the product being designed and to put these user types into target user groups. The table is a matrix of rows and columns. The columns consist of items related to human physical and mental functions and to demographic, cultural, and environmental factors, all of which designers have to take into account. The rows list the basic types of users, such as disabled people, temporarily disabled people, children, and so on. Some cells in the matrix contain typical or general user examples.

4) Lifecycle for User Experience
A lifecycle for user experience is an approach that considers the user experience at each step of the relationship between the users and the system or product. For example, for a product, a lifecycle could be divided into interest building, serious consideration, shopping, setup, support, and upgrade steps.
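As an illustration of the user segment table described above, the matrix can be sketched as a small data structure. This is a minimal sketch, not the authors' implementation: the class `UserSegmentTable`, its method names, and the example cell text are hypothetical; only the row types and the column categories are taken from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UserSegmentTable:
    """Rows are basic user types; columns are factors designers must consider."""

    factors: list[str]
    # cells[user_type][factor] holds a typical user example for that cell.
    cells: dict[str, dict[str, str]] = field(default_factory=dict)

    def add_user_type(self, user_type: str) -> None:
        """Add an empty row for a basic user type."""
        self.cells.setdefault(user_type, {})

    def add_example(self, user_type: str, factor: str, example: str) -> None:
        """Fill one cell; unknown columns are rejected."""
        if factor not in self.factors:
            raise ValueError(f"unknown factor: {factor!r}")
        self.cells.setdefault(user_type, {})[factor] = example

    def empty_cells(self) -> list[tuple[str, str]]:
        """Return (user type, factor) pairs that have no example yet."""
        return [
            (user_type, factor)
            for user_type, row in self.cells.items()
            for factor in self.factors
            if factor not in row
        ]


# Column categories named in the text; the cell content below is invented.
table = UserSegmentTable(
    factors=["physical functions", "mental functions", "demographic factors",
             "cultural factors", "environmental factors"]
)
for user_type in ("disabled people", "temporarily disabled people", "children"):
    table.add_user_type(user_type)
table.add_example("children", "physical functions", "small hands (hypothetical)")
```

Listing the empty cells is one possible way for a team to see which user types still lack an example; the paper itself notes only that some cells are filled.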
3.3 Requirements for a Design Tool for User Experience Design
To help this design approach, the authors describe here the requirements for a design tool for user experience design:
• Understand the user experience approach including the lifecycle, environment, and various User viewpoints.
• Share the information among the project members with different backgrounds.
• Easy to update by changing the information.
• This tool can be utilized in several different steps of the design process.
• Easy to see the information for all members of a multidisciplinary team.
• The tool has to be networked application software and share the data among the team members.

Fig. 1. Example of Visual Scenario
4 Experiment for UED Studio 4.1 Introduction for UED Studio To help support this design approach, the authors propose user experience design tools named UED (User Experience Design) Studio. Based on the requirements for a design tool for user experience design, UED Studio would be an integrated application to support design. UED Studio consists of three applications, the Definition Tool, the Evaluation Tool, and the Visualization Tool. To easily define the user experience situation, the Definition Tool helps designers in ways such as selecting a user group, selecting an environment, and entering user tasks based on the Lifecycle viewpoint. The Evaluation Tool is for easily evaluating the defined user experiences. The Visualization Tool is to show the results of the evaluations by using 3D graphics to make it easy to understand the complicated information. The Definition Tool and the Evaluation Tool are based on the three user experience design approaches, the Lifecycle viewpoint, the Environment viewpoint, and the various User viewpoints. For this purpose, both of these tools have three corresponding views to span these viewpoints.
Fig. 2. Example of User Segment
The purpose of the user view in both of these tools is to define a target user group, and a designer can define several user groups by using text and images. The purpose of the environment view in these tools is to define the environment of a target user, and a designer can also define several such environments by using text and images. The purpose of the lifecycle view in both of these tools is to select a stage of the lifecycle, such as product recognition, shopping, use, or update. The purpose of the user tasks in these tools is to define each user task, and a designer can input a description of each user task. Here are the steps of user experience design and their relationships with each application of UED Studio:
1. Make the plan for user experience design: Consider which methods and tools will be good for each step of user experience design.
2. Understand the market and business, including branding, from the user experience viewpoint: The Evaluation and Visualization Tools help to evaluate the current products or systems and those of competitors.
3. Understand the user experience of targeted users from the lifecycle, environment, and various User viewpoints: The Definition Tool helps designers make the definitions.
4. Concept design for user experience: The Definition Tool helps designers remember the basic user definitions.
5. Detailed design for user experience: The Definition Tool helps designers remember the use of user experience design. The Evaluation and Visualization Tools evaluate low-level user experience prototypes.
6. Evaluation from the user experience viewpoint: The Evaluation and Visualization Tools evaluate high-level user experience prototypes.
7. Validation of user experience in the market: The Evaluation and Visualization Tools evaluate the final user experiences.
The UED Studio application is composed of three applications, using text and visual data. The three applications are controlled with XML files, using common text data and common visual data for user information, environment information, user tasks, and the results of evaluations. It is easy to exchange text and images because the text data and image data are separate from the application. Designers can update the text data directly in each application. 4.2 UED Studio: The Definition Tool The purpose of the Definition Tool is to help a designer define target users from their user experience viewpoints. When a designer develops products or systems from the user experience design viewpoint during the concept design stage, this tool helps the designer to define target users including user roles, user characteristics, user tasks, and user environments. As shown in Figure 3, this tool consists of three views, one for the user, one for the user’s environment, and one showing the user’s tasks within the lifecycle.
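The separation of shared text and image data from the applications can be sketched as follows. The paper states only that the three applications are controlled with XML files; the element and attribute names below (`definition`, `user`, `environment`, `lifecycle`, `task`, `image`, `stage`) and the sample content are assumptions made for illustration, not the actual UED Studio file format.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition file: one user group, one environment, and the
# user tasks entered for each lifecycle stage. Images are referenced by path,
# so text and pictures can be swapped without touching the application.
SAMPLE_XML = """\
<definition>
  <user image="images/presenter.png">Conference presenter</user>
  <environment image="images/hall.png">Conference hall with a projector</environment>
  <lifecycle stage="setup">
    <task>Connect cables to the projector</task>
    <task>Decide where to put the notebook PC</task>
  </lifecycle>
  <lifecycle stage="use">
    <task>Give the presentation</task>
  </lifecycle>
</definition>
"""


def load_definition(xml_text: str) -> dict:
    """Parse one definition into plain dictionaries that any tool could read."""
    root = ET.fromstring(xml_text)
    user = root.find("user")
    environment = root.find("environment")
    if user is None or environment is None:
        raise ValueError("definition needs a <user> and an <environment>")
    # Map each lifecycle stage name to the list of task descriptions.
    tasks = {
        stage.get("stage"): [task.text.strip() for task in stage.findall("task")]
        for stage in root.findall("lifecycle")
    }
    return {
        "user": {"text": user.text.strip(), "image": user.get("image")},
        "environment": {
            "text": environment.text.strip(),
            "image": environment.get("image"),
        },
        "tasks": tasks,
    }


definition = load_definition(SAMPLE_XML)
```

Keeping the parsed result as plain dictionaries is one simple way for a definition view, an evaluation view, and a visualization to share the same data; the real tools may organize it differently.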
Fig. 3. Definition Tool
Fig. 4. User View of Definition Tool
Fig. 5. Evaluation Tool
Fig. 6. Visualization Tool
To define a target user, the process with UED Studio is as follows:
• Start from the main menu of the Definition Tool.
• Define a user group: The designer needs to select the user definition view from the main menu in order to define each user group. As shown in Figure 4, the user view has many examples of user groups with pictures and detailed definitions, and the designer can select various user groups by clicking the pictures.
• Define the user environment: The designer needs to select the environment view to define each user’s environment. Like the user view, the environment view has many examples of user environments with pictures and detailed definitions, and the designer can select various environments by clicking the pictures.
• Define the lifecycle and user tasks: The designer needs to select one of the lifecycle phases and input each user task.
• Return to the main menu to look at overviews of the definitions.

4.3 UED Studio: The Evaluation Tool
The purpose of the Evaluation Tool is to help a designer evaluate products or systems from the user experience viewpoint. As shown in Figure 5, the Evaluation Tool consists of a lifecycle and task view, an environment view, and various user views. A designer rates the user experience by selecting one of five steps. The tool can convert the data and save it as a CSV file for use by other software.

4.4 UED Studio: The Visualization Tool
The purpose of the Visualization Tool is to show the results of the evaluations using 3D graphics to make the complicated information easy to understand. The 3D graphics are created automatically from the data of the Evaluation Tool. As shown in Figure 6, the Visualization Tool has three axes. The authors designed the X-axis to represent time and the Y-axis to represent the variety of users; the Z-axis and each surface have pictures that are related to the functions. These pictures are also relevant for the Evaluation Tool. The results of an evaluation are visualized as columns, and the color and diameter of each column are related to the results. The authors intend this to make it easy for the designer to recognize the results of an evaluation quickly.
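The hand-off from the Evaluation Tool to the Visualization Tool can be sketched as below. This is a minimal sketch under stated assumptions: the paper says only that a designer selects one of five steps, that results can be saved as CSV, and that the color and diameter of each column relate to the results. The CSV column names, the color palette, and the linear diameter rule are hypothetical.

```python
import csv
import io

# Hypothetical palette for a five-step scale, 1 (worst) to 5 (best).
STEP_COLORS = {1: "red", 2: "orange", 3: "yellow", 4: "yellowgreen", 5: "green"}


def evaluations_to_csv(evaluations):
    """Serialize (lifecycle_stage, user_group, score) tuples as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["lifecycle_stage", "user_group", "score"])
    for stage, group, score in evaluations:
        if score not in STEP_COLORS:
            raise ValueError(f"score must be one of the five steps, got {score!r}")
        writer.writerow([stage, group, score])
    return buffer.getvalue()


def column_style(score, max_diameter=1.0):
    """Map a score to an assumed (color, diameter) pair for one 3D column."""
    if score not in STEP_COLORS:
        raise ValueError(f"score must be one of the five steps, got {score!r}")
    # Assumption: diameter grows linearly with the score.
    return STEP_COLORS[score], max_diameter * score / 5


# Invented sample data: X-axis position is the lifecycle stage (time),
# Y-axis position is the user group.
sample = [("setup", "younger presenters", 2), ("use", "designers", 5)]
csv_text = evaluations_to_csv(sample)
styles = [column_style(score) for _, _, score in sample]
```

Writing the CSV through an in-memory buffer keeps the sketch free of file paths; saving to disk would only require passing an open file to `csv.writer` instead.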
5 Conclusions
To help designers, the authors have proposed a design process that uses UED Studio. After the proposal was created, this process was introduced to several companies as a design process from the user experience design viewpoint. Further study in practical situations is needed to evaluate this design process, and the next steps should include experiments in real product design processes. As a next step, the authors are planning to enhance UED Studio to make it easier to input user tasks. In addition, based on the current UED tool, the authors are planning to perform experiments to get designers’ feedback.

Acknowledgements. For help with this paper, we would like to give special thanks to Manabu Sasajima, Kousuke Akai and Akira Okada.
Axiomatic Design Approach for E-Commercial Web Sites
Mehmet Mutlu Yenisey
Istanbul Technical University, Industrial Engineering Department, 34367 Macka-Istanbul, Turkey
[email protected]
Abstract. The success of e-commerce depends on a strong infrastructure, powerful business processes, error-free code in the e-commerce site, and highly usable interfaces. However, the most important factor in achieving these goals is the quality of the design process. The main objective of axiomatic design is to provide a scientific basis for the design process. Axioms are propositions that are accepted as true. The design axioms are determined by identifying the common elements of good designs. Four definition sets systemize the interaction between what is to be achieved and how to achieve it: the customer, functional, physical, and process definition sets. The Customer Definition Set shows the expectations of the customer in terms of product, process, system, and/or material. The customer needs are expressed as functional requirements and constraints in the Functional Definition Set. The physical design parameters that satisfy the functional requirements are defined in the Physical Definition Set. Finally, the process, characterized in terms of process variables, is in the Process Definition Set.
Keywords: Axiomatic Design, Web page, Usability.
The basic principles that govern a good design are generated by the axiomatic approach. It is based on the creation of decisions and processes for a good design. Moreover, the axioms produce new concepts. New results and theories are generated by generalizing the axioms, and these are accepted as true and valid because they are based on axioms. The main objective of axiomatic design is to provide a scientific basis for the design process. Logical and rational processes and tools are presented to the designer, so that design activities are developed on a theoretical basis. Moreover, by providing a scientific basis for the design process, axiomatic design aims to improve creativity, shorten the research process, minimize the trial-and-error period, find the best among all solutions, and combine the power of the computer with creativity. Axiomatic design increases creativity when the functions and constraints necessary to satisfy the customer needs are clearly defined. It helps the designer focus on the ideas that promise to satisfy the customer needs by eliminating the bad ones immediately. It creates a systematic flow from emerging ideas to detailed design. It used to be thought that design could be learnt only by experience. However, it is now believed that both the creative and the experiential sides of design can be improved by a systematic and scientific approach. This development can be seen as a renaissance in the design field. Design is an interaction process between “What do we want to achieve?” and “How do we achieve it?”. There are four different definition sets to systemize this interaction. These sets also define the borders among the four design activities. They are the customer, functional, physical, and process definition sets. The Customer Definition Set shows the expectations of the customer in terms of product, process, system, and/or material. The customer needs are expressed as functional requirements and constraints in the Functional Definition Set. The physical design parameters that satisfy the functional requirements are defined in the Physical Definition Set. Finally, the process, characterized in terms of process variables, is in the Process Definition Set [1].
2 Commerce vs. E-Commerce
Human beings have been trading for thousands of years. At the beginning, we collected the goods we needed from nature. Later, we realized that our requirements went beyond what nature provided. Hence, we started to exchange goods for those we did not own. Probably, this was the beginning of trading. As time went by, several means of shopping emerged. Today, we shop in e-stores in order to speed up the shopping procedure, to obtain an error-free shopping environment and, most important of all, to shop easily. Ng [2] classifies today’s shopping means as i) public markets, ii) stores, iii) supermarkets, iv) shopping malls or centers, and v) electronic and cyber malls. E-commerce is simply performing commercial transactions via digital processes over computers and the networks connecting them. However, e-commerce means more than this definition. The main objective is to increase the accuracy and
efficiency in business processes. Hence, e-commerce enables the parties to benefit from more accurate and efficient transactions [3]. Rayport and Jaworski [4] define e-commerce as technology-mediated exchanges between parties (individuals or organizations) as well as the electronically based intra- or interorganizational activities that facilitate such exchanges. As a matter of fact, e-commerce is not only a transfer of business transactions from the brick-and-mortar world to the virtual world. For companies that considered e-commerce to be simply such a transfer, the e-commerce adventure ended in disappointment. Schniederjans and Cao [5] claim that, to be successful in the e-commerce era, companies need an integral business model with an emphasis on supply-chain management rather than a model based solely on sales and marketing. Critical success factors for e-commerce concern both usability and the business processes behind the web site. Usability, in fact, consists of interfaces and processes. Hence, this approach leads us to the concept of ease of use in terms of both the web content and the business procedures running behind the web server. Neither of these two sides of usability should be overlooked.

2.1 Basic E-Shopping Process
Basically, the e-shopping process is very similar to the physical one. The shopper first searches for the goods she/he needs. Then she/he checks the properties and compares the alternatives, makes a decision, and finally checks out. A similar process can be defined for e-shopping. Of course, some additional external steps exist for e-commerce (Fig. 1).
Fig. 1. Purchase process in e-commerce (steps shown: entering the web site, navigation for product, check product properties, compare alternatives, shopping cart, payment and checkout, order tracking, delivery)
A successful e-commercial web site should reflect the customer’s mental model. Kwan et al. [6] divide e-customer behavior into three phases when requesting a URL:
Phase 1: Awareness (request entry, home page, browse page)
Phase 2: Exploration (login page, registration page, search page)
Phase 3: Commitment (select page, add to shopping cart page, payment page)
Helander and Khalid [7] define a systems model for human factors research in e-commerce. There are three sub-systems: web environment, customer, and web technology. They classify the design parameters according to these sub-systems.
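The three-phase model of Kwan et al. can be expressed as a simple lookup from page type to phase. The sketch below is illustrative only: the function names and the idea of reporting the furthest phase reached in a session are assumptions, not part of the cited model.

```python
# Page types per phase, in order of increasing commitment, as listed above.
PHASES = [
    ("awareness", {"request entry", "home page", "browse page"}),
    ("exploration", {"login page", "registration page", "search page"}),
    ("commitment", {"select page", "add to shopping cart page", "payment page"}),
]


def phase_of(page_type):
    """Return the phase name for a page type, or None if it is not listed."""
    for name, pages in PHASES:
        if page_type in pages:
            return name
    return None


def furthest_phase(session):
    """Return the most advanced phase reached in a sequence of page types."""
    order = [name for name, _ in PHASES]
    reached = [phase_of(page) for page in session]
    # Unlisted page types are ignored rather than treated as errors.
    ranks = [order.index(name) for name in reached if name is not None]
    return order[max(ranks)] if ranks else None
```

A site owner could, for instance, apply `furthest_phase` to logged sessions to see how many visitors stop at exploration without reaching commitment; that use is a suggestion, not something the cited study prescribes.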
In the web environment, the design parameters are physical environment, description, images, arrangement, and browse for merchandise; map, index, hierarchies, landmarks, and search for navigation; minimum number of clicks, shopping basket, and images for ease of purchase; games and free gifts for promotions; and seller and buyer for feedback. For the customer, the modulating variables are needs, attitudes, purchasing power, competence, addiction, motivation, age, and trust. In web technology, the parameters are search agents for features; tools and artifacts for controls; and visual and auditory for displays.

2.2 Critical Success Factors in E-Commerce
Sung [8] summarizes the critical success factors for e-commerce and compares them between East and West. According to this study, there are sixteen factors for success: customer relationship, privacy of information, low-cost operation, ease of use, electronic commerce strategy, technical electronic commerce expertise, stability of systems, security of systems, plenty of information, variety of goods/services, speed of system, payment process, services, delivery of goods/services, low price of goods/services, and evaluation of electronic commerce operations.
3 Axiomatic Design Concepts
Suh [1] defines design as an interplay between what we want to achieve and how we want to achieve it. Common to all design activities is what designers must do: i) know or understand their customers’ needs, ii) define the problem they must solve to satisfy the needs, iii) conceptualize the solution through synthesis, iv) perform analysis to optimize the proposed solution, and v) check the resulting design solution to see if it meets the original customer needs. Design begins with “What do we want to achieve?” and ends with a clear description of “How will we achieve it?”. There are iterations between “What” and “How”, and each loop in the iteration must clarify the “What”. Ultimately, the final understanding of the customers’ needs must be transformed into a minimum set of specifications.
The classical design approach is iterative, empirical, and intuitive. It is based on experience, cleverness, or creativity, and it involves much trial and error. Axiomatic design aims to establish a scientific basis for design. It improves design activities by providing a theoretical foundation based on logical and rational thought processes and tools. Axiomatic design requires a clear definition of design objectives. For this purpose, functional requirements and constraints are established, and criteria for good and bad design are obtained. These criteria help the designer to eliminate bad designs as early as possible and to concentrate on promising ideas. Additionally, axiomatic design provides a systematic decomposition process that flows from the creation of concepts to detailed design.
Engelhardt [9] says that engineering design schools often rely on subjective engineering judgments when modeling product structure and behavior. However, Axiomatic Design specifically addresses the internal relationships between a product’s components. It is a principle-based design method.
According to Thielman and Ge [10], Suh’s Axiomatic Design provides a consistent framework based on a logical thinking process, and techniques to carry out design activities in a well-organized manner. Axiomatic Design by Suh [1] requires four domains for the design lifecycle: i) the Customer Domain includes the needs (or attributes) (CAs) that the customer is looking
for in a product, ii) the Functional Domain contains the customer needs in terms of functional requirements (FRs) and constraints (Cs), iii) the Physical Domain covers the design parameters (DPs) that satisfy the specified FRs, and iv) the Process Domain encompasses the process variables (PVs) that characterize the process developed to produce the product specified in terms of DPs.
Axiomatic Design has two fundamental axioms to establish a scientific foundation for design activity [1] [11]:
Axiom 1: The Independence Axiom: Maintain the independence of functional requirements (FRs).
The mapping between FRs and DPs is represented by a design equation:
$$\{FR\} = [A]\,\{DP\} \qquad (1)$$
where {FR} is a column vector that contains all the FRs of the design, {DP} is a column vector that contains all the DPs, and [A] is the design matrix defining the relationships between DPs and FRs. If the number of FRs (n) is equal to the number of DPs, then [A] is a square matrix of size n x n. An element Aij of matrix [A] is given by
$$A_{ij} = \frac{\partial FR_i}{\partial DP_j} \qquad (2)$$
If DPj influences FRi, this element is non-zero; otherwise, it is zero. A strictly uncoupled design has a matrix in which only the diagonal elements are non-zero, which guarantees that the FRs are completely independent. However, it is very difficult to obtain such a design in the real world. Designs in which FRs are satisfied by more than one DP are also acceptable when the matrix is triangular.
Axiom 2: The Information Axiom: Minimize the information content of the design.
The information axiom defines the information content of a design as an entropy, expressed as the logarithm of the inverse of the probability of success p:
$$I = \log_2 \frac{1}{p} \qquad (3)$$
These two axioms lead the designer to the best design. The first axiom requires that each requirement can be fulfilled by the design parameters without affecting other requirements, while the second axiom indicates that the best design is the one with the least information content. The first axiom facilitates concurrent design without interactions. The second axiom is a variation of the old adage “keep it simple”. Hence, they represent two quality characteristics of the design [12][13].
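The two axioms can be illustrated with a minimal Python sketch. It assumes the design matrix is square, has non-zero diagonal elements, and is already ordered so that any triangular structure is visible without reordering rows or columns. The function names are hypothetical, and the labels "decoupled" and "coupled" follow the usual axiomatic design terminology for triangular and fully populated matrices rather than wording used in this section.

```python
import math


def classify_design(matrix):
    """Classify a square design matrix [A] by its non-zero pattern.

    Assumes non-zero diagonal elements and no row/column reordering.
    """
    n = len(matrix)
    # Any non-zero element below / above the main diagonal?
    below = any(matrix[i][j] != 0 for i in range(n) for j in range(i))
    above = any(matrix[i][j] != 0 for i in range(n) for j in range(i + 1, n))
    if not below and not above:
        return "uncoupled"   # diagonal: FRs are fully independent
    if not below or not above:
        return "decoupled"   # triangular: acceptable if DPs are set in order
    return "coupled"         # violates the Independence Axiom


def information_content(p_success):
    """Information content I = log2(1/p) for a probability of success p."""
    return math.log2(1.0 / p_success)
```

Under these assumptions, a design that always succeeds (p = 1) carries zero information content, and halving the probability of success adds one bit.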
4 Axiomatic Design Approach for E-Commercial Web Sites
An e-commercial web site has two dimensions. The first dimension is the appearance of the web site and the second is the processes running behind it. Both of them define
[Figure: decomposition hierarchy of functional requirements (FRs) and design parameters (DPs)]
Customer attribute (CA): Shopping in an easy and usable way
FR: A web site with high … / DP: System design
FR1 Web site appearance / DP1 Web site design
  FR11 Easy navigation / DP11 Navigation bar
  FR12 Locate the items / DP12 Efficient search mechanism
  FR13 Web site consistency / DP13 All pages look similar
  FR14 Color / DP14 Color contrast
  FR15 Text / DP15 Font size
FR2 Process management / DP2 Process design
  FR21 Customization / DP21 User preferences definition
  FR22 Shopping cart / DP22 Dynamically changing shopping cart in another window
  FR23 Ease of use processes / DP23 Simple, short processes
  FR24 Integration with suppliers / DP24 Strong process interaction
  FR25 Accurate information / DP25 Frequent update policy
  FR26 Efficient checkout / DP26 Checkout mechanism highly suitable with human mental model
FR3 Technical issues / DP3 Usage of …
  FR31 Security / DP31 Strong encryption mechanism
  FR32 Privacy / DP32 Prevent database against unauthorized access
  FR33 Transaction speed / DP33 High quality servers
  FR34 Data transfer speed / DP34 Faster networking technology
Fig. 2. An example decomposition diagram for e-commercial web sites
the degree of ease of use of the web site. Success is measured by the metrics related to activities in these dimensions.
Of course, ease of use reflects the customer requirements in terms of functional requirements (FRs). The web site’s physical design and the processes developed are the design parameters (DPs). With this classification, it becomes possible to apply the axiomatic design approach to the design of a commercial web site in terms of both usability and processes. Naturally, both are based on technological issues. It is clear that there may be conflicting FRs. For example, a customer may require a web page with many animations while also desiring shorter download times. It is therefore better to separate the procedural part of a web site from its physical layout to overcome such conflicts. Such a distinction becomes very important for e-commercial web sites in particular, since the commercial side consists of many processes. FRs can easily be expressed as numerical values since they are related to measurable metrics. DPs can also be quantified, as they are directly in measurable form. Hence, it will be possible to apply the two axioms of Axiomatic Design.
There are several works in the literature discussing customer attitudes or preferences in e-commerce [2], [6], [7], [14], [15]. These attitudes reflect the functional requirements of customers. However, almost all studies in the literature have so far focused on the appearance and/or content of the web site. The author of this study believes that every aspect discussed in the literature has a process running in the background. A decomposition diagram expressing FRs and DPs based on the well-known usability factors in the literature is given in Fig. 2. If this decomposition diagram is examined according to the first axiom, it can be seen that an uncoupled design is achieved. Therefore, this is an acceptable design.
5 Conclusions
Axiomatic design is a recent approach to design activities. It provides a scientific basis for design. Moreover, it is very useful for obtaining a simple design with minimal information content. In this paper, a decomposition diagram was proposed to show the applicability of Axiomatic Design to e-commercial web sites. It was also argued that e-commercial web sites consist not only of usability aspects but also of procedural features. Apart from this,
References
1. Suh, N.P.: Axiomatic Design: Advances and Applications. Oxford University Press, New York (2001)
2. Ng, C.F.: Satisfying Shoppers’ Psychological Needs: From Public Market to Cyber-mall. Journal of Environmental Psychology 23, 439–455 (2003)
3. Trepper, C.: E-commerce Strategies: Mapping Your Organization’s Success in Today’s Competitive Marketplace. Microsoft Press, Washington (2000)
4. Rayport, J.F., Jaworski, B.J.: Introduction to E-commerce, 2nd edn. McGraw Hill, New York (2004)
5. Schniederjans, M.J., Cao, Q.: E-commerce: Operations Management. World Scientific Publication Co, Singapore (2002)
6. Kwan, I.S.Y., Fong, J., Wong, H.K.: An E-customer Behavior Model with Online Analytical Mining for Internet Marketing Planning. Decision Support Systems 41, 189–204 (2005)
7. Helander, M.G., Khalid, H.M.: Modeling the Customer in Electronic Commerce. Applied Ergonomics 31, 609–619 (2000)
8. Sung, T.K.: E-commerce Critical Success Factors: East vs. West. Technological Forecasting & Social Change 73, 1161–1177 (2006)
9. Engelhardt, F.: Improving Systems by Combining Axiomatic Design, Quality Control Tools and Designed Experiments. Research in Engineering Design 12, 204–219 (2000)
10. Thielman, J., Ge, P.: Applying Axiomatic Design Theory to the Evaluation and Optimization of Large-scale Engineering Systems. Journal of Engineering Design 17(1), 1–16 (2006)
11. Helander, M.G., Lin, L.: Axiomatic Design in Ergonomics and an Extension of the Information Axiom. Journal of Engineering Design 13(4), 321–339 (2002)
12. Bras, B., Mistree, F.: A Compromise Decision Support Problem for Axiomatic and Robust Design. Advances in Design Automation 65(1), 359–369 (1993)
13. Su, J.C., Chen, S., Lin, L.: A Structured Approach to Measuring Functional Dependency and Sequencing of Coupled Tasks in Engineering Design. Computers and Industrial Engineering 45, 195–214 (2003)
14. Lightner, N., Yenisey, M.M., Ozok, A.A., Salvendy, G.: Shopping Behaviour and Preferences in E-commerce of Turkish and American University Students: Implication from Cross-cultural Design. Behaviour and Information Technology 21(6), 373–385 (2002)
15. Konradt, U., Wandke, H., Balazs, B., Christophersen, T.: Usability in Online Shops: Scale Construction, Validation and the Influence on the Buyers’ Intention and Decision. Behaviour and Information Technology 22(3), 165–174 (2003)
Development of Quantitative Metrics to Support UI Designer Decision-Making in the Design Process Young Sik Yoon and Wan Chul Yoon Department of Industrial Engineering, KAIST, 373-1, Guseong-dong, Yuseong-gu Taejeon, Korea {nanhari, wcyoon}@kaist.ac.kr
Abstract. The UI designer must be able to anticipate cognitive difficulties of users in the UI design process. However, the designer is likely to make erroneous judgments in the context of increasing functionality. Furthermore, time constraints in the development process exacerbate the design problem. There are various techniques to support the UI designer in the design process, including abstract design principles, specific design guidelines, design cases, design inspections, and design metrics. Metrics can summarize the status of a UI design solution more objectively and more accurately than human designers. This paper aims to develop quantitative metrics based on a unified framework for interaction design, which decomposes the UI design problem into four components: information architecture, task procedure, system dynamics, and physical interface. Three metrics are proposed to assist the designer’s decision-making: incongruity, complexity, and inefficiency. A case study shows that the proposed metrics can support the designer’s decision-making in an efficient manner.
Keywords: Model-based UI Design, Metrics, Design Aids, Usability.
another. Therefore, it is necessary to provide the designer with holistic information that describes multiple dimensions of the UI design space. In the present research, metrics are defined as numerical values that reflect the status of a UI design solution. A set of metrics can evaluate multiple design aspects and thus can serve as a powerful tool in a design-evaluation process. There are two kinds of metrics: internal metrics and external metrics [4]. The former measure the internal attributes of the designed interaction, such as complexity, ambiguity, and efficiency. The latter represent the external perspective of the designed interaction when the system is in use, such as user performance time, number of errors, and subjective satisfaction. External metrics can be used as usability indicators [2]; however, they are ineffective for an economically feasible design-evaluation process. Although it is not easy to identify useful internal metrics, they can serve as effective and early indicators of the usability of a designed interaction. (Hereafter, ‘internal metrics’ will be referred to as ‘metrics’ for convenience.)
This paper suggests three quantitative metrics that can support UI designer decision-making in the design-evaluation process. We apply a systematic model-based approach to capture usability information of a designed interaction. These metrics can help the UI designer make decisions in an efficient and effective manner. Furthermore, the metrics can be built into automatic design and evaluation tools, which can reduce the cognitive work demands imposed on the UI designer.
This paper is organized as follows. Section 2 provides an overview of related work on metrics for supporting user interaction design. Section 3 describes a unified framework for interaction design and the three metrics: incongruity, complexity, and inefficiency.
Section 4 presents a case study where the proposed metrics were applied to support the UI designer. Finally, section 5 describes our conclusions and further research directions.
2 Background and Related Works
UI designers must anticipate users’ behaviors during their interaction with a system. The Task-Interface Matching, or TIM, framework provides a unified view of the use/design space to deal with the usability problem [10]. Under this unified framework, both the use and the interaction design are employed to relate the task-level knowledge with the interface-level knowledge. Matching between abstraction levels does not entail random coupling of tasks and interface means. The designers should comply with existing social standards and users’ prior knowledge at these levels. Therefore, the matching relations must be congruous with the top-down and the bottom-up expectations of users. That is, the designers’ TIM relations must be congruous with the users’ TIM relations.
The UI design problem can be decomposed into two domains, the behavioral domain and the constructional domain [7]. The former concentrates on user-centered aspects of interaction design, while the latter focuses on system-centered aspects, such as every operation and state, and the transitions among them. Many researchers advocate a coupling method that utilizes functional and structural models for representing the behavioral and constructional domains. Navarre et al. proposed a tool
that couples ConcurTaskTrees for task modeling and Petri nets for system modeling [11]. Lee and Yoon suggested a coupling method that integrates OCD for the functional model and statecharts for the structural model [9].
Model-based UI design is an effective approach to managing design complexity [12]. However, model-based design approaches have limitations when supporting tools are not available. Designing and evaluating with a manual approach is a time-consuming endeavor [8], imposes a high workload on the designers [3], and may end the design search with premature, suboptimal solutions [13]. One approach to address these limitations is to develop useful metrics and to guide the design search with the metrics.
Many kinds of metrics have been proposed to address different types of usability attributes in interaction design. First, complexity is one of the most popular and widely applied metrics in evaluating human-machine interaction [14-16]. Complexity can deteriorate human performance and cause human errors [18]. Lee et al. proposed the use of system entropy to assess the cognitive complexity of an interface. This metric predicts the difficulty of learning how to use an interface by taking into account the user’s state schemas [16]. Second, efficiency is another important metric frequently referred to in the literature on usability design [2, 17]. UI designers must provide an efficient procedure for users’ frequent and important tasks. Several members of the GOMS family can assess the efficiency of an interaction sequence by predicting the task execution time [17]. Finally, the prior knowledge of the user is an important issue for usability design. Many studies have suggested that learning and comprehending new knowledge can be facilitated when that knowledge is compatible with prior knowledge. There are many other possible metrics for usable interaction design.
However, we focus here on applying a few important metrics based on a unified UI design framework.
3 A Framework for Interaction Design and I2C Metrics
3.1 A Framework for Interaction Design
The unified framework for interaction design, shown in Fig. 1, consists of four components: information architecture, task procedure, system dynamics, and physical interface. First, information architecture (IA) is the structure within which the information, functionalities, and services are grouped. The IA component affects the user’s performance of navigation tasks. Second, task procedure (TP) is the functional model of user interaction. It represents user actions when interacting with a system. The OCD model is an effective tool to describe the task procedure in an operation-centered manner. Third, system dynamics (SD) is the structural model of user interaction. It describes the designed interaction in a state-centered manner. Finally, physical interface (PI) represents the physical aspects of the user interface, such as interface layout, UI controls, and information elements. The PI component provides a common ground for interaction between the user and the system.
Fig. 1. A unified framework for user interaction design
3.2 Models for Task Procedure and System Dynamics
We can model each component of the framework described in the previous section with suitable modeling techniques. We use the OCD model for the task procedure, and the state-operation matrix for some aspects of the system dynamics. The basic entities of the OCD are operation, abstract operation, state, state closure, and state header. The OCD can represent three procedural structures: sequence, branch, and loop. The elements of the state-operation matrix, aij, have a binary value of 0 or 1, which reflects the availability of the jth operation at the ith state. The notation is given in Fig. 2, and further details are presented in [5, 9, 16].
(a) Entities of OCD: operation, abstract operation, state, state header, state closure
(b) State-Operation Matrix:
      O1  O2  O3  O4  O5  O6
S1     1   1   0   0   0   0
S2     1   1   0   0   0   0
S3     0   0   1   1   1   1
Fig. 2. OCD notations and an example of an S-O matrix
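As a minimal sketch, the example state-operation matrix of Fig. 2(b) can be held as a nested list, with rows for states and columns for operations as described in Section 3.2. The helper names `is_available` and `available_operations` are hypothetical and only illustrate how the binary entries are read.

```python
# State-operation matrix from Fig. 2(b): rows are states, columns are operations.
STATES = ["S1", "S2", "S3"]
OPERATIONS = ["O1", "O2", "O3", "O4", "O5", "O6"]
SO_MATRIX = [
    [1, 1, 0, 0, 0, 0],  # S1
    [1, 1, 0, 0, 0, 0],  # S2
    [0, 0, 1, 1, 1, 1],  # S3
]


def is_available(state, operation):
    """Return True if a_ij = 1, i.e. the operation is available at the state."""
    i = STATES.index(state)
    j = OPERATIONS.index(operation)
    return SO_MATRIX[i][j] == 1


def available_operations(state):
    """List the operations whose entry is 1 in the row of the given state."""
    i = STATES.index(state)
    return [op for op, a in zip(OPERATIONS, SO_MATRIX[i]) if a == 1]
```

In this example S1 and S2 offer the same operations, which is the kind of similarity between states that the complexity metric later exploits.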
3.3 Proposed Metrics as UI Design Aids
We considered the following axioms to develop the metrics for usable interaction. First, the designed task procedure should be compatible with the prior knowledge of the user. That is, the designer must minimize the semantic gap between the user’s
procedural knowledge and the designer’s conceptualization. Second, the user should be able to perform a task in an efficient manner. For example, the designer can provide a shortcut for important and frequently performed tasks. Third, the designer must reduce the relational complexity within a state-transitional structure. In this work, we suggest three metrics: incongruity, inefficiency, and complexity. Detailed descriptions of the metrics are given in the following sections.
Table 1. The proposed metrics for interaction design
Metric Name | Description | Related Model
Incongruity | The transformational distance between the user’s procedural knowledge and the designed task procedure | OCD
Inefficiency | The length of the task procedure | OCD
Complexity | The entropy within the designed state transitional relations | S-O Matrix
Incongruity. The task procedure can be represented by a diagrammatic model, OCD. While description of the designed interaction with the OCD is a straightforward process, some additional steps, shown in Table 2, are needed to represent the user’s prior knowledge. The incongruity is estimated by measuring the transformational distance between OCD pairs. There are many kinds of operators to transform one structure into another [19-20]. However, the psychological relevance of the operators is thus far unknown. Here, we chose three operators: insert, delete, and substitute. The operators are frequently cited in the literature, and thus we assume the operators have psychological relevance. The incongruity for two OCDs of the ith task is defined as given in Eq. (1). The weight of each variable, wx, can be estimated by a regression analysis of the data from the usability test. We assume that the state of the OCD has negligible effects on the incongruity, because users learn the procedural knowledge in an operation-centered manner.
Table 2. The process of representing user's procedural knowledge
Step descriptions, in order:
• Provide the user with a physical interface and a task list
• Then, record the user’s expectation during the interaction
• Refine the expected interaction procedures
• Represent the informal protocol data with the OCD
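Because Eq. (1) is not reproduced in this text, the following Python sketch is only one plausible reading of the incongruity, not the authors' definition. It assumes each OCD has been flattened to a linear sequence of operation names (ignoring branches and loops) and computes the minimum weighted cost of insert, delete, and substitute steps that turn the user's expected sequence into the designed one. The function name, the default weights, and the operation names in the usage note are hypothetical.

```python
def incongruity(user_ops, designed_ops, w_insert=1.0, w_delete=1.0,
                w_substitute=1.0):
    """Weighted edit distance between two operation sequences.

    Assumption: OCDs are flattened to linear sequences of operation names.
    """
    n, m = len(user_ops), len(designed_ops)
    # dist[i][j]: cost of transforming user_ops[:i] into designed_ops[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * w_delete
    for j in range(1, m + 1):
        dist[0][j] = j * w_insert
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = user_ops[i - 1] == designed_ops[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + w_delete,       # drop an expected operation
                dist[i][j - 1] + w_insert,       # add a designed operation
                dist[i - 1][j - 1] + (0.0 if same else w_substitute),
            )
    return dist[n][m]
```

For instance, if a user expects `["Menu", "Down", "R"]` and the design requires `["Menu", "Down", "Down", "R"]`, one insertion separates the two sequences. With unit weights the result is the plain edit distance; the regression-estimated weights mentioned above would replace the defaults.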
Inefficiency. The inefficiency is defined as the weighted sum of the length of the designed task procedure. The weight is a value between 0 and 1. The designer should allocate more weight for frequent and important tasks when calculating the
inefficiency. The inefficiency is calculated using Eq. (2). We estimate the length of the task procedure by counting the number of operations in OCDs.
$$IE(D) = \sum_i w_i \cdot L_i \qquad (2)$$
where $L_i$ is the length of the task procedure for task $i$.
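Eq. (2) can be sketched directly in Python. The function name and the example numbers are hypothetical; the check that each weight lies between 0 and 1 follows the definition given for the inefficiency.

```python
def inefficiency(lengths, weights):
    """IE(D) = sum_i w_i * L_i, with L_i the number of operations in task i."""
    if len(lengths) != len(weights):
        raise ValueError("one weight is needed per task")
    if any(not 0.0 <= w <= 1.0 for w in weights):
        raise ValueError("weights must lie between 0 and 1")
    return sum(w * length for w, length in zip(weights, lengths))
```

As an illustration, two tasks of 4 and 6 operations with weights 1.0 and 0.5 give 4 * 1.0 + 6 * 0.5 = 7.0, so a long procedure matters less when the task is infrequent or unimportant.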
Complexity. The complexity is defined based on entropy, an information theory concept, while reflecting the user’s schemas. Users attempt to identify whether an operation is available at a state while planning procedures for their tasks. Thus, the schemas originate from the similarity between states in which an operation is available. The process to calculate the complexity is described in Table 3 and Eq. (3).
Table 3. The process of calculating the complexity
Step 1. Represent the designed interaction with an S-O matrix (A):
$$A = \{a_{ij} \mid i = 1, \dots, n;\ j = 1, \dots, r\}, \quad a_{ij} = \begin{cases} 1, & \text{if } O_j \text{ is available at } S_i \\ 0, & \text{otherwise} \end{cases}$$
Step 2. Calculate a similarity matrix (C) based on matrix A:
$$I_{ik}^{j} = -\{\log P(a_{\cdot j} = a_{ij}) + \log P(a_{\cdot j} = a_{kj})\}$$
$$C = \{c_{ik}\}, \quad c_{ik} = \frac{\sum_{j \in S_{ik}} I_{ik}^{j}}{\sum_{j} I_{ik}^{j}}, \quad S_{ik} = \{j \mid a_{ij} = a_{kj}\}$$
Step 3. Calculate a weight matrix (W) based on matrix C:
$$C'_{ij} = \begin{cases} 1, & \text{for } i = j \\ C_{ij} \big/ \sum_{k \neq i} C_{kj}, & \text{otherwise} \end{cases}$$
$$W = \{w_{ij}\}, \quad w_{ij} = \frac{C'_{ij}}{\sum_{j} C'_{ij}}$$
Step 4. Calculate an entropy matrix (I):
$$E[A] = W^{T} A$$
$$P(a_{ij}) = \begin{cases} E(a_{ij}), & a_{ij} = 1 \\ 1 - E(a_{ij}), & \text{otherwise} \end{cases}$$
$$I(a_{ij}) = -\log P(a_{ij})$$
The complexity is the mean of the entropy matrix:
$$CM(D) = \frac{\sum_i \sum_j I(a_{ij})}{nr}, \quad \text{where } A = \{a_{ij} \mid i = 1, \dots, n;\ j = 1, \dots, r\} \qquad (3)$$
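The last step of Table 3 and Eq. (3) can be sketched in Python as follows. This is a partial illustration under stated assumptions, not the full procedure: the weight matrix W is supplied by the caller instead of being derived from the similarity matrix, rows of A are taken to be states and columns operations, and the logarithm is taken to base 2. The function name is hypothetical.

```python
import math


def complexity(A, W):
    """CM(D): mean of I(a_ij) = -log2 P(a_ij), with E[A] = W^T A.

    A is an n x r binary state-operation matrix and W an n x n weight
    matrix. Assumption: W is given rather than computed from similarity.
    """
    n, r = len(A), len(A[0])
    total = 0.0
    for k in range(n):
        for j in range(r):
            # (W^T A)[k][j] = sum over i of W[i][k] * A[i][j]
            expected = sum(W[i][k] * A[i][j] for i in range(n))
            p = expected if A[k][j] == 1 else 1.0 - expected
            if p <= 0.0:
                raise ValueError("entry has zero probability under W")
            total += -math.log2(p)
    return total / (n * r)
```

With uniform weights of 1/3 on a three-state matrix whose rows are `[1, 1, 0, 0, 0, 0]`, `[1, 1, 0, 0, 0, 0]`, and `[0, 0, 1, 1, 1, 1]`, the value is about 0.918 bits per entry. With an identity W every entry is fully expected and the complexity is zero, which shows how strongly the result depends on the weights.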
4 A Case Study: Designing Interaction of an MP3 Player
We introduce a simple example to demonstrate how the metrics are used as design aids in a UI design process. A multi-functional MP3 player is rather complex and can illustrate the utility of the proposed method. We made three design solutions by surveying and modifying a variety of UI cases. A detailed description follows.
4.1 Information Architecture and Physical Interfaces
We assume that the design solutions have a common function list and common information architecture. The designed MP3 player has the following four functions:
[Figure: information architecture of the MP3 player (top-level menu: Music, FM Radio, Voice, Setup, with the operations available in each mode) and the physical interfaces of design solutions (1), (2), and (3)]
Fig. 3. Information architecture and physical interfaces of the designed MP3 player
(1) play MP3 files or radio channels, (2) record a voice or radio channels, (3) manage files or radio channels, and (4) set up the MP3 player. The information architecture and three design solutions are given in Fig. 3.
4.2 Modeling User Interaction
We conducted a task analysis and created an initial task list, which includes playing MP3 files, playing radio, recording voice, setting up the lighting time of the display, storing a radio channel in memory, and setting up the radio volume. Fig. 4 shows the OCD pairs and the state-operation matrix of solution (1).
[Figure: OCDs of the user’s procedural knowledge and of the designed task procedure for tasks T1 (Playing MP3) and T4 (Setting up the Lighting Time of Display), and the state-operation matrix of the designed system dynamics over the states Menu, Music_list, Music_play, Radio_play, Voice, Voice_record, Setup, and Lighting_time]
Fig. 4. Modeling task procedures and system dynamics
4.3 Metrics-Based Analysis of the Design Solutions
The design solutions were analyzed with the proposed metrics, as presented in Table 4. As an initial attempt, we calculated the incongruity without considering the relative importance of each transformational operator. That is, the incongruity value is an average of the number of transformational operators in a design solution. In
calculating the inefficiency, we assigned more weight to tasks 1, 2, 3, 5, and 6 than to task 4. As a result, the design solutions rank in usability as follows: design solution (1), design solution (3), and design solution (2).
Table 4. Comparison of the design solutions based on the proposed metrics
5 Conclusion and Further Research
In this work, a unified framework for interaction design is proposed and three quantitative metrics (incongruity, inefficiency, and complexity) are suggested. The metrics reported here are expected to complement usability testing by quantifying, to some extent, the usability attributes of an interaction design. This should reduce the development costs of user interaction. As future work, we plan to validate the proposed metrics with a series of empirical tests. A metric-based design support system will also be developed to facilitate the UI design process.
References
1. Yoon, W.C.: Identifying, Organizing and Exploring Problem Space for Interaction Design. In: Proceedings of the 8th IFAC/IFIP/IFORS/IEA Symposium on Analysis, Design, and Evaluation of Human-Machine Systems, Kassel, Germany, pp. 81–86 (2001)
2. Nielsen, J., Mack, R.L.: Usability Inspection Methods. John Wiley & Sons, Inc, West Sussex, England (1994)
3. Sears, A.: AIDE: A step toward metric-based interface development tools. In: Proceedings of the 8th Annual ACM Symposium on User Interface and Software Technology, Pittsburgh, Pennsylvania, United States, pp. 101–110 (1995)
4. ISO: ISO/IEC DIS 14598-1 Information Technology – Evaluation of Software Products – Part 1, General Guide (1996)
5. Yoon, W.C., Park, J.S.: A diagrammatic model for representing user’s interface knowledge of task procedures. In: Proceedings of Cognitive Systems Engineering in Process Control, Kyoto, Japan, pp. 276–285 (1996)
6. Kang, H.G., Seong, P.H.: An Information Theory-Based Approach for Quantitative Evaluation of User Interface Complexity. IEEE Trans. on Nuclear Science 45(6), 3165–3174 (1998)
7. Hix, D., Hartson, H.R.: Developing User Interfaces – Ensuring Usability Through Product and Process. John Wiley & Sons, Inc, New York (1993)
8. Paterno, F.: Tools for Task Modeling: Where we are, Where we are headed. In: Proceedings of the 1st International Workshop on Task Models and Diagrams for User Interface Design, Bucharest, Romania, pp. 10–17 (2002)
9. Lee, D.S., Yoon, W.C.: Coupling structural and functional models for interaction design. Interacting with Computers 16, 133–161 (2004)
10. Yoon, W.C.: Task-Interface Matching: How we may design user interfaces. In: Proceedings of the 15th Triennial Congress of the International Ergonomics Association, Seoul, Korea (2003)
11. Navarre, D., Palanque, P., Paterno, F., Santoro, C., Bastide, R.: A tool suite for the coevolutionary design of user interfaces. In: Proceedings of the 8th International Workshop on Design, Specification of Interactive Systems, Glasgow, Scotland, pp. 88–113 (2001)
12. Paterno, F.: Model-based Design and Evaluation of Interactive Applications. Springer, Heidelberg (1999)
13. Visser, W.: Use of episodic knowledge and information in design problem solving. In: Cross, N., Christiaans, H., Dorst, K. (eds.) Analysing Design Activity, pp. 271–289. Wiley, New York (1996)
14. Rouse, W.B., Rouse, S.H.: Measure of Complexity of Fault Diagnosis Tasks. IEEE Trans. on Systems, Man, and Cybernetics 9(11), 720–727 (1979)
15. Payne, J.S., Green, T.R.G.: Task-Action Grammars: A model of the mental representation of task languages. Human-Computer Interaction 2, 93–133 (1986)
16. Lee, D.S., Yoon, W.C., Choi, S.S.: An Entropy-Based Measure for Evaluating the Cognitive Complexity of User Interface. Korean Journal of The Science of Emotion and Sensibility 1(1), 213–221 (1998)
17. John, B.E., Kieras, D.E.: Using GOMS for user interface design and evaluation: Which technique? ACM Trans. on Computer-Human Interaction 3(4), 287–319 (1996)
18. Wickens, C.D.: Engineering Psychology and Human Performance. HarperCollins Publishers Inc. (1992)
19. Bunke, H., Shearer, K.: A graph distance metric based on the maximal common subgraph. Pattern Recognition Letters 19, 255–259 (1998)
20. Hahn, U., Chater, N., Richardson, L.B.: Similarity as transformation. Cognition 87, 1–32 (2003)
Scenario-Based Product Design, a Real Case Der-Jang Yu and Huey-Jiuan Yeh Scenario Lab, No 1, Lane 49, Sianjheng 1st St., Jhubei City, Hsinchu County 302, Taiwan Creativity Lab, Industrial Technology Research Institute, 195 Sec. 4, Chung Hsing Rd. Chutung, Hsinchu County 310, Taiwan
Abstract. This paper proposes a simple framework for implementing SBD. This framework consists of four elements: a basic story structure, an innovation acceleration field, a tool for expressing idea/describing scenario, and an activity theory-based tension detector/idea stimulator, and a process based on the Chinese traditional literature four-stage creation process. A case study is presented at the end of the paper to demonstrate the feasibility of the proposed framework. Keywords: Scenario-based design, Activity Theory.
1 Introduction
A product design team faces the challenge of working with people from multiple disciplines. Factors such as user needs, user experiences, and user emotion must also be taken into account. The team needs a simple, familiar, and intuitive method to bridge not only the gap between designers and users, but also the gap among designers. Such a bridge can facilitate the exchange of experience and the sharing of knowledge easily and effectively. Scenario-based design (SBD) is a good approach for solving communication problems between design team members. The issue now becomes how to build a successful scenario-based product design process that resolves the aforementioned issues.
This paper proposes a simple framework for implementing SBD. This framework consists of four elements: a basic story structure, an innovation acceleration field, a tool for expressing idea/describing scenario, and an activity theory-based tension detector/idea stimulator, and a process based on the Chinese traditional literature four-stage creation process. The target users of this framework are people who have received no training in usability engineering or user-centered design.
means the existing stories that lifestyle or ethnographic researchers collect in the field. A scenario, or “new scenario”, is the new stage that people want to move to from the old story. A scenario can be generated from changes to the tool, people, or situation together with the resulting activity narration. Conversely, a scenario can also be generated from a new narration that causes changes to the tool, people, or situation. In our approach, we focus on making changes to the tool.
2.2 Innovation Acceleration Field A field that accelerates the stimulation of ideas is created by identifying negative issues and positive expectations in the old story. These negative and positive issues give SBD users inspiration and stimulation for creating a new tool that completes the “new scenario” originating from the “old story”.
2.3 Idea-Scenario Sketch Sheet The idea-scenario sketch sheet is used for expressing ideas and describing scenarios. The sketch sheet has a drawing area on the right and space for text on the left. For an old story, the right side contains a snapshot or drawing related to the story, and the story is narrated on the left. Similarly, for a new scenario the related sketch is presented on the right and the new scenario is narrated on the left. The positive and negative issues must also be addressed in the old story narration.
2.4 Activity Theory-Based Tension Detector/Idea Stimulator The activity theory-based tension detector/idea stimulator helps SBD users interpret the activities in the old story. Following activity theory’s actor-goal-tool concept, an SBD user pretends to be the subject in the story and tries to understand the intended goal and the obstacle between that goal and the storyteller. The objective is to understand how the tool is used and where it falls short, and to create a solution based on the SBD user’s own knowledge and experience.
3 The Chinese Traditional Literature Four-Stage Creation Process “Chi”, “chen”, “tsuang”, and “ho” mean “start”, “adapt”, “evolve”, and “conclude”, respectively. Together they form the Chinese traditional literature four-stage creation process. Chinese writers use this process to create literary works, and it also lets readers enjoy and appreciate the aesthetics of Chinese literature. This traditional process is adopted to guide the SBD team through the design stages, so that team members can enjoy and experience a journey leading to successful innovations.
• Start: the stage where old stories are collected and positive/negative issues are identified.
• Adapt: the stage where the SBD team studies the stories and uses the activity theory-based tension detector/idea stimulator to come up with new ideas.
• Evolve: the stage where each SBD team member explores ideas or creates new scenarios, which in turn stimulate new ideas/scenarios. Once this stage ends, all members share their results with each other.
• Conclude: the stage where the final ideas/scenarios are created by evaluating, selecting, combining, and refining the ideas/scenarios created in the previous stage.
[Figure: the old story passes through the four stages 起 (start), 承 (adapt), 轉 (evolve), and 合 (conclude) of the Chinese traditional literature four-stage creation procedure to become new scenarios.]
Fig. 1. The four-stage process transforms old story to new scenario
4 A Case Study A major scooter company in Taiwan wanted to design a new model utilizing SBD methodology. 4.1 The Start Stage The project started with user research. Thirty-two pioneer users were selected from 300 target market users. After the focus group meeting, 10 were invited to participate
[Figure: Start stage (起). An old story, with its positive and negative issues, follows the story/scenario structure of people (人), tool (物), situation (境), and activity (活動).]
Fig. 2. The start stage: the old story is composed of people, tool, situation, and activity
in further ethnographic studies, which included scooter use history, a scooter usage diary, an introduction of each participant’s vehicle, and a one-on-one interview. Participants were asked to recount memorable and unique experiences with their scooters. Based on the ethnographic studies, a new market segment was identified and selected as the target market for the new product. From the ethnographic data collected, interesting statements carrying negative or positive issues were used to create 50 short stories.
4.2 The Adapt Stage A multidisciplinary team was formed that included product designers, market planners, and product planners. At this stage, two workshops were held in which team members read the stories collected in the start stage and created idea-scenarios on the idea-scenario sketch sheets by using the activity theory-based tension detector/idea stimulator. The stories allowed the team members to empathize with the storytellers. Each member was encouraged to create ideas for the stories that affected them most. As a result, ideas for new scooter functions that would satisfy the new target market were created.
Fig. 3. The adapt stage: the innovation acceleration field and the activity theory-based tension detector/idea stimulator help SBD users create ideas
4.3 The Evolve Stage About 50 idea-scenario sketches were created. During the presentation, each creator described not only the new concepts and scenarios in detail but also the process of creating them, so each member was able to understand the key points of the other members’ creations. This common understanding improved communication between team members and helped generate better ideas. Each idea-scenario sketch was evaluated by a voting process. All idea-scenarios were grouped by similarity, and the number of votes for each concept group indicated its value to the team.
[Figure: Evolve stage (轉), showing the storyboard sketch sheet and the idea-scenario sketch sheet within the four-stage procedure (start, adapt, evolve, conclude).]
Fig. 4. The evolve stage: the idea-scenario sketch sheet and the activity theory-based tension detector/idea stimulator help SBD users implement idea-scenario sketches and share them with team members
4.4 The Conclude Stage Idea-scenarios were selected from each group based on the votes to create 12 advertisement posters for the new scooter show to be held two years later. One more vote was cast to select the final advertisement poster to be used for the new generation scooter.
[Figure: Conclude stage (合), the last of the four stages (start, adapt, evolve, conclude), which yields the new scenarios.]
Fig. 5. The conclude stage: the final ideas/scenarios are created by evaluating, selecting, combining, and refining the ideas/scenarios created in the previous stage
The results were used by the market planning division, which started the new product development. Because each team had sufficient ethnographic information, new ideas and scenarios, and experience of cooperating with the other teams, the new product development started smoothly.
5 Conclusions SBD combined with the proposed start-adapt-evolve-conclude process is a powerful tool for product design. This approach is specific to Chinese culture and has been successfully applied in various design projects in Taiwan. Further study is needed to determine whether it can be successfully adopted in a non-Chinese setting.
References
1. Yu, D.J.: Scenario-Oriented Design. Garden City, Taipei (2000)
2. Carroll, J.M.: Scenario-Based Design. John Wiley & Sons, New York (1995)
3. Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2002)
4. Schank, R.: Tell Me a Story: Narrative and Intelligence. Northwestern University Press, Evanston, IL (1995)
5. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interaction. MIT Press, Cambridge, MA (2000)
Designing Transparent Interaction for Ubiquitous Computing: Theory and Application Weining Yue, Heng Wang, and Guoping Wang Department of Computer Science and Technology, Peking University [email protected]
Abstract. Designing transparent interaction is important for ubiquitous computing (ubicomp). A psychology framework that characterizes user’s cognitive behavior in ubicomp environments would be invaluable for guiding the interaction design to be optimally compatible with human capabilities and limitations. By analyzing the cognitive skill and attention selectivity, such a framework is proposed in this paper. Correspondingly, a context-sensitive multimodal architecture is presented on the level of technology. A case study, where the theory was implemented in a handheld hypermedia guide and deployed into the context of authentic use, is then discussed.
As Weiser indicated, the disappearance of computation is a fundamental consequence not of computing technology, but of human psychology. This implies that a psychology framework characterizing users’ cognitive behavior in ubicomp environments is important when designing transparent interaction. Such a framework would be invaluable for guiding the interaction design to be optimally compatible with human capabilities and limitations. In this article, we propose a cognitive framework that describes users’ common features in cognitive skill and attention selectivity. A technology architecture is then presented to support general ubicomp interaction design. Finally, we introduce an application in which the theory was implemented and deployed in real-world use.
2 Cognitive Psychology Framework Regarding psychological frameworks, distributed cognition [3] is regarded as a new foundation for human-computer interaction. Its central hypothesis is that the cognitive and computational properties of systems can be accounted for in terms of the organization and propagation of constraints. Interacting Cognitive Subsystems (ICS) [4] represents the human information processing mechanism as a highly parallel organization with a modular structure. The assumption is that we are dealing with a system of distributed cognitive resources, in which behavior arises out of the coordinated operation of the constituent parts. As a fundamentally systemic approach to mental processing, ICS encompasses all aspects of perception, cognition, and emotion, as well as the control of action and internal bodily reactions. Norman also studies various psychological issues in The Invisible Computer [5]. These works made substantial contributions. However, none of them is a methodology that we can readily pick off the shelf and apply to a design problem. Considerable effort is needed to understand the concepts and to learn to interpret and re-represent the data captured in interaction design. In this paper we aim to give a cognitive framework that one can directly adopt when designing ubicomp interaction.
2.1 Cognitive Skill Interaction load is most directly affected by cognitive skill. According to the widely accepted ACT (Adaptive Control of Thought) model [6], cognitive skill develops in three stages. In stage one, the declarative stage, the user produces a crude approximation of the skill by using general-purpose problem-solving strategies to interpret facts about the skill. Performance is slow and error-prone, and working memory load is high because facts about the skill (e.g., the correct sequence of operations) must be actively rehearsed.
The second stage, named knowledge compilation stage, is characterized by speedup, more seamless performance, and dropout of verbal mediation. During this phase, declarative facts about the skill are converted into procedural knowledge through knowledge compilation. Procedural knowledge is a collection of productions, or if-then statements that specify a cognitive condition and an action that will be performed if that condition is met. Two mechanisms underlie knowledge compilation: composition and proceduralization. Composition collapses successive productions into single productions, and produces speedup and more
seamless performance. The extent to which composition can occur is determined by the capacity of working memory, because the conditions specified in a production must be represented in working memory. Through proceduralization, declarative facts are instantiated in productions, thereby eliminating the need to represent declarative information in working memory. Proceduralization is responsible for the dropout of verbal mediation. In the final stage, tuning, the search of alternative solution paths becomes more selective. Generalization, discrimination, and strengthening are the three learning mechanisms responsible for tuning. As cognitive skill develops, humans can perform tasks with fewer attentional resources and a lower cognitive load. In desktop environments, users’ cognitive skill and knowledge of interacting with computers always begin at the first stage. Users have to spend a varying length of time learning how to get their tasks done with particular applications. During this time, interactions are inefficient and unnatural and divert much of the users’ attention from their own tasks. Moreover, many people cannot reach the second or third stage even after long use. This violates the original purpose of ubiquitous computing. To minimize cognitive load and technological distraction, ubicomp applications should allow users to apply the procedural or tuned knowledge and skills they have acquired in daily life when interacting with computers.
2.2 Attention Selectivity In a ubicomp environment, the user’s cognitive and motor modalities, especially the eyes and hands, are often preoccupied by other tasks, so we also need to discuss attention selectivity. Attention refers to a human’s ability to concentrate on certain objects and allocate processing resources to them. We can think of it as a spotlight that we shine on things around us to make them “stand out”.
Though people try to devote attention to several things at the same time, our ability to do so is clearly limited. When we give our attention to some things, we inevitably ignore others. This cognitive limit puts computers in competition with other tasks and objects for the user’s attention. Related topics have been discussed in depth in desktop computing. However, attention competition is more complex and more important in ubicomp, since users are often preoccupied with other physical or mental tasks while interacting with pervasive devices. Compared with computer applications, such physical and mental tasks often attract more of the user’s attention. To explain this phenomenon, we need to address the allocation policy of human attention. As shown in Fig. 1, according to the physiological mechanism of attention [7], attention allocation generally starts with the stimulation of the sense organs. As to the psychological mechanism of attention, Kahneman explains in his classic capacity model of attention [8] that people do have some control over how they allocate the mental capacity of attention, and that the allocation policy is principally affected by two factors. a) Intention and experience: the objects that users are more interested in and more familiar with are more attractive. b) Evaluation of demands on capacity: when several things compete for attention, people evaluate the demands each places on their capacity and usually attend to the ones that demand less.
Fig. 1. Attention allocation
The disappearance of computers will reduce their probability of obtaining user attention. Meanwhile, users are usually more familiar with and more interested in their daily tasks than in computers. Besides, long experience with WIMP interaction leads many users to consider interacting with computers more difficult than performing their daily tasks. Thus the user’s attention to computing systems will decrease markedly in ubicomp, along with the amount of explicit input. If interactions remain purely user-driven, the functionality of applications will inevitably be weakened even if users can interact multimodally. Given these features of attention selectivity, we should improve the adaptability and activity of interaction so that tasks can be accomplished without depending entirely on explicit user input.
3 Interaction Architecture To deal with the challenges posed by cognitive skill and attention selectivity, two techniques play fundamental roles: multimodal interaction and context awareness.
3.1 Multimodal Interaction Humans speak, gesture, and write to communicate with other humans and to alter physical artifacts every day. For the majority of people, the knowledge and skills needed to perform and interpret multimodal interaction are already at the proceduralization or tuning stage. If ubicomp applications support these natural human forms of communication, they will offer more natural and expressively powerful means of interaction and will significantly reduce cognitive load. Flexibility will also improve, since users can alternate modes and switch modalities as conditions change. For example, a person may use speech input for voice dialing a car cell phone, but switch to pen input to avoid disclosing private information during a public transaction. In ubiquitous computing, speech is considered the modality with the greatest potential. It is convenient because it is entirely eyes-free and hands-free: users can interact with computers by speech when their eyes and hands are busy. However, speech has obvious limitations. First, speech interaction is error-prone because recognition technology is still not reliable at present. Furthermore, speech is slow for presenting information, is transient and therefore
difficult to review or edit, and interferes significantly with other cognitive tasks, especially in noisy transactions. Consequently, we adopt the pen as an important complement. The pen supports pointing, handwriting, and gesture, which millions of people perform every day. Speech and pen are complementary along many dimensions [9]. By combining them, parallel recognition and interpretation can yield a higher likelihood of correct recognition, and the strengths of one can be used to offset the weaknesses of the other.
3.2 Context Awareness Observing communication between humans, we can see that a person’s actions are always performed in a certain situation and that much information is exploited implicitly in the exchange of messages. What happens in the surrounding environment often supplies information that is vital for the communication. If computer applications can also use such information to characterize the situation, the activity and adaptability of interaction will be enhanced. In other words, computers should have a certain understanding of the user's behavior and the surrounding state in a given situation, and use this contextual knowledge as additional input. The hints carried in context can help ubicomp appliances select the most appropriate mode and automate tasks, so that the demand on the user’s attention is minimized. Context is an abstract concept and therefore difficult to capture directly. Salber and Dey's context toolkit [10] and Schmidt's architecture [11] proposed layered abstraction to turn ambiguous environmental situations into executable contextual information. In this architecture, physical or logical sensors capture raw data from the environment. The raw data is then reduced to several basic elements named cues, which provide an abstraction of the sensor data. Generally, each cue depends on a single sensor, but multiple cues can be calculated from the data of one sensor.
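The sensor-to-cue step can be illustrated with a short sketch. It is not the implementation of the context toolkit [10] or of Schmidt's architecture [11]: the accelerometer window, the cue names (`mean_magnitude`, `variability`, `is_moving`), and the 0.5 threshold are all assumptions made for this example. It only shows how one sensor can yield several cues.

```python
from statistics import mean, pstdev


def accelerometer_cues(samples, moving_threshold=0.5):
    """Derive several cues from one window of accelerometer magnitudes.

    `samples` is a hypothetical list of magnitude readings and
    `moving_threshold` is an illustrative value, not one from the paper.
    """
    avg = mean(samples)
    spread = pstdev(samples)  # population standard deviation of the window
    return {
        "mean_magnitude": avg,
        "variability": spread,
        "is_moving": spread > moving_threshold,
    }


# Two hypothetical windows from the same sensor.
still = accelerometer_cues([9.8, 9.8, 9.8, 9.8])
walking = accelerometer_cues([8.0, 11.5, 9.0, 12.0])
```

Each returned dictionary is one set of cues; a later layer would group such cue sets into contexts.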
Finally, a clustering algorithm is used to cluster cues into contexts.
3.3 Integration Multimodal interaction and context awareness are closely related. When integrating them at the semantic level, two problems in particular require attention: a) where and how context plays its role in the fusion, and b) how to decide when some pieces of information conflict with others during integration. The way in which humans integrate information from multiple sources can give us useful hints. Though the detailed mechanisms of information integration in the human brain are still uncertain, the recently accepted Fuzzy Logical Model of Perception (FLMP) [12] gives an overview. This model makes two central assumptions: a) the sources of information are evaluated independently of one another and are integrated multiplicatively to provide an overall degree of support for each alternative, and perceptual identification and interpretation follow the relative degree of support among the alternatives; b) the result of multimodal integration is not always all-or-none, but allows the fuzzy nature of information to be reflected in the subject’s evaluation and response. If the result is fuzzy, it is integrated with contextual information to reach an unambiguous decision.
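Assumption a) can be made concrete with a minimal sketch: per-source support values are multiplied for each alternative and then divided by the sum over all alternatives to obtain the relative degree of support. The function name `flmp_integrate`, the task names, and the numeric support values are hypothetical and serve only to illustrate the rule.

```python
from math import prod


def flmp_integrate(support):
    """Combine independent per-source support values multiplicatively.

    `support` maps each alternative to a list of truth values in [0, 1],
    one per information source. The result maps each alternative to its
    relative degree of support; the values sum to 1.
    """
    combined = {alt: prod(values) for alt, values in support.items()}
    total = sum(combined.values())
    if total == 0:
        raise ValueError("no alternative has non-zero support")
    return {alt: value / total for alt, value in combined.items()}


# Hypothetical support from a speech recognizer and a pen recognizer.
result = flmp_integrate({
    "show_route": [0.8, 0.5],
    "show_location": [0.2, 0.5],
})
```

Here the pen input is uninformative (0.5 for both alternatives), so the relative support simply follows the speech input. A result close to an even split would be the "fuzzy" case of assumption b).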
In accordance with the FLMP model, the fusion of multiple sources of information in ubicomp interaction is also designed as a two-stage procedure. After being recognized by parallel recognizers, user inputs are assigned weight factors and integrated. If the result is clear enough to indicate a single task and all of its parameters are available, the task is issued. Otherwise, the result is integrated with contexts in the second stage. If the result is still ambiguous, the application can ask the user to make the final decision. Context can also trigger tasks independently. To control how tasks are passed to the high-level application and to eliminate conflicts among them, a buffer algorithm (e.g., a statistical model or a neural network) between the task space and the application is necessary. It checks whether transitions among tasks are probable. If a transition is not very likely, the application will not change to the new behavior. Generally, the longer the system is trained, the better its performance becomes. The fusion of multimodal interaction and context awareness also plays a part in output: applications select the most appropriate modes for presenting the feedback of the high-level application based on context. Fig. 2 illustrates the major components and data flow of the transparent interaction.
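The two-stage decision described above might be sketched as follows. This is an illustration under stated assumptions, not the system's actual fusion code: the function name `fuse`, the 0.7 "clear enough" threshold, and the neutral default of 0.5 for tasks without contextual evidence are hypothetical, and the check that all task parameters are available is omitted.

```python
def fuse(input_scores, context_scores, threshold=0.7, neutral=0.5):
    """Two-stage decision: explicit input first, context only if needed.

    Both arguments map task names to support values in [0, 1].
    Returns (task, stage), where stage is "input", "context", or
    "ask_user"; task is None when the user must decide.
    """
    def normalise(scores):
        total = sum(scores.values())
        return {t: s / total for t, s in scores.items()} if total else {}

    def best(scores):
        if not scores:
            return None, 0.0
        task = max(scores, key=scores.get)
        return task, scores[task]

    # Stage 1: weighted results of the parallel recognizers only.
    stage1 = normalise(input_scores)
    task, support = best(stage1)
    if support >= threshold:
        return task, "input"

    # Stage 2: multiply in contextual support for each candidate task.
    stage2 = normalise(
        {t: s * context_scores.get(t, neutral) for t, s in stage1.items()}
    )
    task, support = best(stage2)
    if support >= threshold:
        return task, "context"

    # Still ambiguous: defer to the user.
    return None, "ask_user"
```

For example, `fuse({"route": 0.5, "location": 0.5}, {"route": 0.9, "location": 0.1})` cannot be resolved from the input alone, so the contextual support decides in favour of `"route"`.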
Fig. 2. Interaction architecture for ubiquitous computing
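One simple statistical model that could play the role of the buffer between task space and application is a first-order Markov filter over task transitions. The sketch below assumes this reading; the class name `TransitionBuffer`, the task names, and the 0.2 probability cut-off are hypothetical.

```python
from collections import defaultdict


class TransitionBuffer:
    """First-order Markov filter over task transitions."""

    def __init__(self, min_probability=0.2):
        # min_probability is an assumed cut-off for "probable enough".
        self.min_probability = min_probability
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, previous_task, next_task):
        """Record one accepted transition (training)."""
        self.counts[previous_task][next_task] += 1

    def probability(self, previous_task, next_task):
        """Estimate P(next_task | previous_task) from the counts so far."""
        row = self.counts.get(previous_task)
        if not row:
            return 0.0
        return row.get(next_task, 0) / sum(row.values())

    def allows(self, previous_task, next_task):
        """Return True if the proposed transition is likely enough."""
        return self.probability(previous_task, next_task) >= self.min_probability


# Hypothetical training data: nine route requests and one audio request
# observed after the map was shown.
buffer = TransitionBuffer()
for _ in range(9):
    buffer.observe("show_map", "show_route")
buffer.observe("show_map", "play_audio")
```

With these counts the filter passes `show_map` to `show_route` and blocks `show_map` to `play_audio`. Because transitions never observed are rejected, a practical system would need some fallback while little training data is available.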
4 Application Handheld guides have been demonstrated in several research projects and commercial applications as a way to ease the plight of tourists. We developed a hypermedia handheld guide system, named TGH (Tour Guide in Hand), in accordance with the principles and architecture described above. It illustrates the basic idea of transparent ubicomp interaction. Besides, several novel facilities are implemented in the system. We deployed it into practice to test the theory in a real-world setting. 4.1 Interaction Facilities Users can perform tasks by speech, pointing, handwriting, sketching or a combination according to their own habits and task types. Table 1 shows some representative cases.
Table 1. Representative cases of multimodal interaction
Task | Multimodal interaction modes
Look for the location of A | Speak the name of A, or write it by pen
Look for the route from the current location to A | Input the entire command by speech, or speak “the way here” and point to A by pen synchronously
Look for all the restaurants in a certain area | Speak “all the restaurants” while sketching an area by pen on the screen
Contexts in TGH include location (current and past), time, moving direction, device orientation, task type, the user’s operation habits, and so on. A Markov chain model is used as the buffer algorithm to control the tasks. Users’ responses to context-based adaptations are recorded to personalize successive interactions. The following example illustrates how contexts work. Example one: Detecting and displaying the user’s location and trace on a digital map is a basic function of handheld guides. However, we found that users often have to turn the device or themselves to look for certain locations or routes, because the map is always oriented with north at the top regardless of the user’s direction. This results in a mismatch between the physical representation and the user’s psychological representation of orientation. Assume a user is walking south: his or her track will run “downwards” on the map, and physical objects on the right will be shown on the left. To solve this problem and allow users to locate objects more easily, the digital map in TGH rotates automatically to match the user’s direction. Users can also rotate the map back by sketching with the pen. If the system detects that a user does so several times, this adaptation is stopped for that user.
4.2 Performance Experiments To demonstrate the advantage of context-sensitive multimodal interaction over the context-insensitive unimodal style, we conducted user tests in two stages. In the first stage, which we ran on the campus of Peking University, we invited 36 undergraduates who were novices with handheld devices to study performance through formative and qualitative evaluation in laboratories. After a short training session, the participants were divided into three groups (A, B, and C) and given the same 12 typical guide tasks, shown in Table 2, to perform. To reduce the effect of accidental errors, each task type comprised two similar tasks.
In Group A, TGH worked in the traditional mode, in which users could interact only by pen and all the context-based facilities were disabled except location tracing. In Group B, users were permitted to interact multimodally. In Group C, interactions were context-sensitive and multimodal. The results (see Fig. 3) showed that users in Group C had the shortest average completion time. In the experiment, we found that although the multimodal interface could improve efficiency and users did like being able to interact multimodally, they did not always do so when given a free choice. Generally, they preferred to interact unimodally for simple tasks (e.g., tasks 1-4) and multimodally for difficult or complex tasks (e.g., tasks 9-12).
Table 2. Experimental tasks
No. | Task description
1 | Zoom in, zoom out and move the map
2 | Display all the restaurants on the map
3 | Look for the Stone Fish and point out its orientation
4 | Look for the original owner of the Stone Fish from its textual introduction
5 | Look for Boya Tower and point out its orientation
6 | Look for the original purpose of the Boya Tower from its textual introduction
7 | Look for the nearest ATM and point out its orientation
8 | Look for the nearest classroom and point out its orientation
9 | Look for the optimized route from West Gate to the library
10 | Look for the optimized route from the library to Boya Tower
11 | Look for the optimized route from current location to the Main Building
12 | Look for the optimized route from current location to the post office
Fig. 3. Average task completion time. x-axis: task number; y-axis: average completion time (in seconds).
Besides the formative experiments, we ran a second-stage series of tests to evaluate general user acceptance in the context of authentic use. Eighty-three visitors to Peking University, between 15 and 54 years of age, were invited to use TGH on the campus, which they had never visited before. During the tests, we observed users’ interactions and recorded their reactions and comments. At the end of the test, each participant was asked to describe their experience with TGH and to evaluate it through a 22-question survey covering ease of use, general helpfulness, multimodal interaction, awareness, and other aspects. The tests showed that context-aware multimodal interaction does improve user acceptance and reduce effort. Meanwhile, we also noticed that the theoretical advantages of the speech modality are currently compromised by reliability and other problems that stem essentially from recognition technology. This indicates that a speech-only interface is still inappropriate at present. We also carried out chi-square tests to find correlations between some of the survey’s variables. We found that ease of use is strongly related to age (90 percent confidence), with users aged 19 to 44 enjoying the system more. The use of speech is closely related to the level of familiarity with computers (95 percent confidence), with novices being more likely to use speech. Enjoyability is correlated
with the user interface design, such as the quality of the images and buttons (95 percent confidence). These findings are valuable for designing hypermedia guide tools.
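For readers unfamiliar with the chi-square test of independence, the sketch below computes Pearson's statistic for a 2x2 contingency table and compares it with the standard critical values for one degree of freedom. The counts are invented for illustration and are not the study's data; the function and variable names are likewise hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected
    return statistic


# Critical values of the chi-square distribution with 1 degree of freedom.
CRITICAL_DF1 = {0.90: 2.706, 0.95: 3.841}

# Hypothetical counts (NOT the study's data): rows are novice and
# experienced users, columns are "used speech" and "did not use speech".
observed = [[30, 10],
            [15, 25]]
statistic = chi_square_statistic(observed)
significant_at_95 = statistic > CRITICAL_DF1[0.95]
```

A statistic above the critical value means the two variables are unlikely to be independent at that confidence level, which is the sense in which the survey variables above are reported as related.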
5 Conclusion Improving human-computer interaction is a great challenge for ubiquitous computing. To support the design of transparent interaction, this paper proposes a psychology framework that accounts for users’ cognitive behavior in ubicomp environments. By analyzing cognitive skill and attention selectivity, two principles are derived: a) allowing users to apply their procedural or tuned knowledge and skills when interacting with computers, and b) improving the adaptability and activity of applications. A context-sensitive multimodal architecture is then proposed to support universal interaction design. The implicit knowledge from contexts and explicit user inputs are integrated at the semantic level in accordance with the FLMP model of multimodal integration in the human brain. Finally, in a case study we give an overview of the handheld guide system TGH. User studies showed that context-sensitive multimodal interaction does improve user acceptance and reduce effort. Acknowledgements. This work was supported by NSFC (No. 60473100 and 60573151), and China 973 Program (No. 2004CB719403).
References
1. Weiser, M.: Computers for the 21st Century. Scientific American 265(3), 94–104 (1991)
2. Garlan, D., Siewiorek, D.P., Smailagic, A., Steenkiste, P.: Project Aura: Toward Distraction-free Pervasive Computing. IEEE Pervasive Computing 1(2), 22–31 (2002)
3. Abowd, G.D., Mynatt, E.D.: Distributed Cognition: Toward a New Foundation for Human-Computer Interaction Research. ACM Trans. on Computer-Human Interaction 7(2), 174–196 (2000)
4. Barnard, P.J., Teasdale, J.D.: Interacting Cognitive Subsystems: A Systemic Approach To Cognitive-Affective Interaction And Change. Cognition and Emotion 5, 1–39 (1991)
5. Norman, D.: The Invisible Computer. MIT Press, Cambridge, Mass (1999)
6. Anderson, J.R.: Automaticity and the ACT Theory. American Journal of Psychology 105, 165–180 (1992)
7. Eysenck, M.W.: Principles of Cognitive Psychology. Psychology Press, UK (1997)
8. Kahneman, D.: Attention and Effort. Prentice-Hall, New Jersey (1973)
9. Oviatt, S.L., Cohen, P., et al.: Designing the User Interface for Multimodal Speech and Pen-based Gesture Applications: State-of-the-Art Systems and Future Research Directions. Journal of Human-Computer Interaction 15(4), 263–322 (2000)
10. Salber, D., Dey, A.K., Abowd, G.D.: Aiding the Development of Context-Enabled Applications. In: Proc. Conf. Human Factors in Computing Systems, pp. 434–441 (1999)
11. Schmidt, A., Karlsruhe, U.: How to Build Smart Appliances. IEEE Personal Communications 8(4), 66–71 (2001)
12. Massaro, D.W., Stork, D.G.: Speech Recognition and Sensory Integration: A 240-year-old Theorem Helps Explain how People and Machines Can Integrate Auditory and Visual Information to Understand Speech. American Scientist 86(3), 236–245 (1998)
Understanding, Measuring, and Designing User Experience: The Causal Relationship Between the Aesthetic Quality of Products and User Affect
Haotian Zhou1,2 and Xiaolan Fu1,∗
1 State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Science, Beijing 100101, China {zhouht, fuxl}@psych.ac.cn
2 Graduate School, Chinese Academy of Science, Beijing 100101, China
Abstract. This study sought to test the often-taken-for-granted assumption about the causal relationship between the aesthetic quality of products and user affect, using the affective priming paradigm. The results showed that when beautiful web pages were used as primes, the discrepancy between the response latencies to positive targets and negative targets was larger than when the primes were ugly web pages. A parallel pattern was obtained when pleasant and unpleasant pictures were used as primes. These findings support the hypothesis that the visual Gestalt of products can lead to affective change independent of reflective beauty judgment. The possibility of employing the affective priming procedure to measure product beauty is also discussed in the light of the experimental results.
Keywords: user experience, aesthetics, beauty, affect, affective priming.
1 Introduction
Since the first documented attempt to define user experience (UX) in 1996 [1], a major shift of focus from functionality and usability to the non-pragmatic or hedonic aspects of products has been observed in the field of human-computer interaction and interaction design. The evidence available to date points to the same conclusion: the hedonic aspects of a given interface can significantly influence the user experience of that product [8, 19]. Of all the non-instrumental qualities of products, beauty has been gaining prominence among HCI researchers, and striving for beauty has become one of the ultimate goals of the product design process [4, 16]. Though a limited number of studies have yielded important findings attesting to the leverage beauty has on UX [17], the underlying mechanism remains unclear [9]. Norman [15] argues that beautiful products induce positive affect in users, which in turn facilitates the user-product interaction process. Though Norman’s claim has enormous appeal among researchers [9], it is of both theoretical and practical importance to subject this
causal chain to careful examination. Until recently, much of the emphasis has been put on testing the link between positive affect and UX [e.g. 13], while the first part of Norman’s claim has been largely assumed to be true. For example, Hassenzahl [9] asserts that beauty judgment is driven by the affect evoked by the visual Gestalt of products (Fig. 1a). Yet, given the lack of empirical evidence, one can propose an alternative model in which the direction of causality between beauty judgment and elicited affect is reversed (Fig. 1b). Note that the principal difference between the two models is whether or not conscious aesthetic evaluation is a precondition for the visual Gestalt of products to exert an impact on users’ affective state.
Fig. 1. Outlines of two competing models of how beauty leads to affect: (a) Norman’s model; (b) alternative model. Despite their differences, we acknowledge that cognitive appraisal per se is capable of influencing UX (user experience).
Since the rationale behind many UX professionals’ advocacy of assigning more weight to aesthetics during the product design process rests heavily on Norman’s model [15], this growing interest in beauty might be rendered groundless if the alternative model proves to be right (see the Discussion section for a detailed explication). In fact, Hassenzahl [9] has indicated the necessity of carefully scrutinizing Norman’s claim. The lack of research discriminating between the two competing models may be due to the inability of traditional UX methodologies to dissociate affective processes from reflective processes. In the present research, we endeavored to tackle this issue by adopting an approach often used to investigate the interplay between cognition and emotion. In addition, we intended to demonstrate the possibility of adapting this paradigm for use as a measuring instrument of product beauty.
2 Methods
2.1 Overview
The affective priming paradigm developed by Fazio et al. [6] was adopted to assess the automatic affective response evoked by the visual Gestalt of products. The
underpinning of the affective priming paradigm is the so-called congruency effect: when the affect induced by the prime has the same valence as the target (e.g. both are positive), the evaluation of target valence (i.e. whether it is negative or positive) is facilitated compared with the response to an incongruent target (in this case, a negative one). Thus, rather than explicitly asking participants about the affect elicited by certain stimuli (the primes), that affect is inferred from participants’ responses to other, distinct yet affectively related stimuli (the targets) [10]. Consequently, this paradigm enables us to examine the affective effect of beauty free from possible distortion by conscious reflection (e.g. social desirability).
2.2 Participants
Twenty-five undergraduates participated in the experiment (12 men and 13 women), with a mean age of 21.5 years and an age range of 19-23 years. All of them were right-handed and had normal or corrected-to-normal vision. All participants were reimbursed upon completion of the experiment.
2.3 Materials
In this study, we concentrated specifically on the beauty of websites. Screenshots of 100 English-language web pages constituted the primary stimulus pool. Among them, 50 were badly designed web pages (Fig. 2a) taken from www.webpagesthatsuck.com and
Fig. 2. Examples of web-pages used in this study: (a) well-designed webpage; (b) bad-looking webpage
other sources; the remainder were beautifully designed web pages (Fig. 2b) from a few design-award winner lists (e.g. www.worldwidewebawards.net). Thirty Chinese undergraduate students (15 males and 15 females) participated in a rating procedure designed by Lindgaard et al. [14], in which they were asked to give a beauty judgment on each candidate web page, presented one at a time in random order. Fig. 3 depicts the time course of a single trial in the rating session: each webpage was on screen for 500 ms after an 800 ms fixation symbol, and participants then assigned a visual appeal score to that webpage via a sliding bar. Participants could not proceed to the next webpage until they had completed the rating. An average visual attractiveness score
was computed for each webpage following the algorithm suggested by Lindgaard et al. [14], and the candidates were ranked accordingly. The 20 most appealing and the 20 least appealing websites were retained for use in the subsequent experiment. Sixty affective pictures taken from the Native Chinese Affective Picture System [2] served as controls for the web pages in the subsequent study. The valence of each picture was assessed on a 9-point scale, with 1 designating extremely unpleasant and 9 extremely pleasant. One third of the pictures are unpleasant, such as a bloody scene (mean rating = 2.18, SD = 0.20); 20 are of positive valence, such as a smiling baby (mean rating = 7.47, SD = 0.19); and the remainder are neutral, such as a common tool (mean rating = 4.99, SD = 0.17). The targets consist of 20 positive Chinese adjectives (e.g. outstanding) and 20 negative ones (e.g. selfish) selected from the list standardized by Luo and Wang. These words were rated using the same valence scale. The mean valence rating of the 20 negative adjectives is 2.81 (SD = 0.10), while that of the 20 positive ones is 6.81 (SD = 0.16). The average familiarity of the positive words is 5.22 (SD = 0.22), slightly higher than that of the negative ones, 4.73 (SD = 0.42).
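The stimulus selection step described above (ranking the 100 candidate pages and retaining the 20 highest and 20 lowest) can be sketched roughly as follows. This is only an illustration: the scoring algorithm of Lindgaard et al. [14] is not reproduced here, so a plain arithmetic mean is assumed, and the page names, the 0-100 slider range, and the random ratings are all hypothetical.

```python
import random
from statistics import mean

def select_extremes(ratings, k=20):
    """Rank pages by mean rating; return (k most appealing, k least appealing).

    Assumption: a simple mean per page stands in for the scoring algorithm
    of the original rating procedure, which is not specified in the text.
    """
    avg = {page: mean(scores) for page, scores in ratings.items()}
    ranked = sorted(avg, key=avg.get, reverse=True)
    return ranked[:k], ranked[-k:]

# Hypothetical data: 100 pages, 30 raters each, scores on an assumed 0-100 slider.
rng = random.Random(0)
ratings = {
    f"page_{i:03d}": [rng.uniform(0, 100) for _ in range(30)]
    for i in range(100)
}
most_appealing, least_appealing = select_extremes(ratings)
```

Keeping only the two extremes of the ranking maximizes the contrast between the beautiful and ugly prime categories, at the cost of discarding the 60 mid-ranked pages.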
Fig. 3. Time sequence of the webpage rating procedure
2.4 Experiment
The time sequence of the affective priming procedure is shown in Figure 4. Each trial started with a fixation symbol (600 ms) followed by the prime (100 ms). The prime was followed by a blank screen lasting 50 ms, after which the target appeared. Participants were required to judge whether the target was a positive or a negative word as quickly and accurately as possible by pressing specified keys on the keyboard. After the judgment, the target disappeared and there was a 3 s pause before the next trial began. Reaction time and accuracy were recorded for each target. The 40 retained web pages plus the 60 affective pictures constituted the prime pool of this experiment. The primes were classified into five categories: beautiful webpages (BW), ugly webpages (UW), pleasant pictures (PP), unpleasant pictures (NP), and neutral/control pictures (CT). The experiment employed a fully crossed 5 (prime category) by 2 (target valence: negative word vs. positive word) within-subject design. Each of the 40 targets appeared five times, once with each of the five prime categories, with the stipulation that the same instance from a given prime category could be paired with only one instance from a given target category. Such a pairing scheme
guaranteed that the same set of targets served as its own control across prime categories. The whole experimental session consisted of a training block of 16 trials and 200 experimental trials, which were divided evenly into two blocks with a break in between.
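One way to read the pairing scheme described above is sketched below: within each prime category, each of the 20 primes is paired with exactly one positive and one negative target, which yields 5 x 20 x 2 = 200 trials and has every target appear once per category. The stimulus identifiers, the random assignment, and the even split into two blocks of 100 are assumptions made for illustration; the authors' actual randomization is not described. The timing constants come from the text, and the prime-to-target onset interval is simply their sum.

```python
import random

PRIME_CATEGORIES = ["BW", "UW", "PP", "NP", "CT"]  # 20 primes in each category
FIXATION_MS, PRIME_MS, BLANK_MS = 600, 100, 50
SOA_MS = PRIME_MS + BLANK_MS  # interval from prime onset to target onset

def build_trials(seed=0):
    """Return two blocks of 100 trials following the 5 x 2 within-subject design.

    Hypothetical identifiers are used for primes and target words.
    """
    rng = random.Random(seed)
    targets = {
        "positive": [f"pos_word_{i:02d}" for i in range(20)],
        "negative": [f"neg_word_{i:02d}" for i in range(20)],
    }
    trials = []
    for category in PRIME_CATEGORIES:
        primes = [f"{category}_{i:02d}" for i in range(20)]
        for valence, words in targets.items():
            shuffled = words[:]
            rng.shuffle(shuffled)
            # Each prime meets exactly one target of this valence.
            for prime, word in zip(primes, shuffled):
                trials.append({
                    "prime_category": category,
                    "prime": prime,
                    "target": word,
                    "target_valence": valence,
                })
    rng.shuffle(trials)
    return trials[:100], trials[100:]

block1, block2 = build_trials()
```

Because every target word appears under all five prime categories, differences between categories cannot be attributed to differences between the words themselves.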
Fig. 4. Time sequence of a single trial in the affective priming procedure
3 Results
3.1 Data Screening and Preliminary Data Analysis
The priming data were screened for outliers by excluding trials with reaction times below 250 ms or above 1000 ms (8.6% of all trials). After this correction, trials with an incorrect response (1.7% of the remaining trials) were also eliminated from subsequent analysis. Preliminary analysis showed that response latencies to positive targets (M = 608.9) were shorter than those to negative targets (M = 632.18), F(1, 24) = 17.15, p < .001. Such a positive-target-premium (PTP) is a typical finding in previous affective priming studies [3, 20]. Past affective priming studies [3] have demonstrated that the direction of PTP score variation can be used to infer the valence of the affect elicited by primes. Specifically, a decrease in PTP score is associated with negative affect, and an increase with positive affect. Therefore, we investigated the priming effect of different types of primes by observing the variation of PTP score as a function of prime category. The PTP score for each prime category was computed for each subject by subtracting the mean response latency to positive targets from the mean latency to negative ones, resulting in five PTP scores (one for each of the five prime categories) per subject.
3.2 Priming Effects of Prime Categories
If beauty judgment is indeed an affect-driven response (i.e. affect change precedes beauty judgment, Fig. 1a), it can be hypothesized that the priming effect (indexed by PTP scores) of the visual attractiveness of web pages (ugly vs. beautiful) should resemble that of the pleasantness of pictures (unpleasant vs. pleasant).
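The screening and scoring steps of Section 3.1 can be sketched as below for a single participant. The trial record layout (the keys `prime_category`, `target_valence`, `rt_ms`, `correct`) and the toy reaction times are hypothetical; only the 250 ms and 1000 ms cut-offs and the definition of the PTP score (mean latency to negative targets minus mean latency to positive targets) come from the text.

```python
from statistics import mean

RT_MIN_MS, RT_MAX_MS = 250, 1000  # trials outside this range count as outliers

def ptp_scores(trials):
    """Return {prime_category: PTP score in ms} for one participant."""
    cells = {}
    for t in trials:
        if not (RT_MIN_MS <= t["rt_ms"] <= RT_MAX_MS):
            continue  # outlier screening comes first
        if not t["correct"]:
            continue  # then incorrect responses are dropped
        key = (t["prime_category"], t["target_valence"])
        cells.setdefault(key, []).append(t["rt_ms"])
    categories = {category for category, _ in cells}
    return {
        c: mean(cells[(c, "negative")]) - mean(cells[(c, "positive")])
        for c in categories
        if (c, "negative") in cells and (c, "positive") in cells
    }

# Hypothetical trials for one participant (not the study's data).
toy_trials = [
    {"prime_category": "BW", "target_valence": "positive", "rt_ms": 560, "correct": True},
    {"prime_category": "BW", "target_valence": "positive", "rt_ms": 580, "correct": True},
    {"prime_category": "BW", "target_valence": "positive", "rt_ms": 500, "correct": False},
    {"prime_category": "BW", "target_valence": "negative", "rt_ms": 600, "correct": True},
    {"prime_category": "BW", "target_valence": "negative", "rt_ms": 620, "correct": True},
    {"prime_category": "BW", "target_valence": "negative", "rt_ms": 1400, "correct": True},
    {"prime_category": "UW", "target_valence": "positive", "rt_ms": 610, "correct": True},
    {"prime_category": "UW", "target_valence": "negative", "rt_ms": 620, "correct": True},
]
toy_ptp = ptp_scores(toy_trials)
```

A larger PTP score means positive targets were evaluated faster relative to negative ones, which the paper reads as a sign of more positive prime-induced affect.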
Fig. 5. Mean PTP scores of targets as a function of prime type and prime polarity (Table 1)
To test this hypothesis, data from neutral primes were dropped, and the remaining four prime categories were recoded using two two-level variables: type (webpage and picture) and polarity (negative and positive). Table 1 shows how the different prime categories fall into the cells determined by type and polarity. The PTP score was then analyzed via a 2 by 2 (type by polarity) ANOVA with both variables as repeated measures. The outcome clearly supports the hypothesis (Fig. 5). Of the two main effects and the one two-way interaction, only the main effect of polarity reaches significance, F(1, 24) = 32.37, p < .001, with the PTP score for positive primes (M = 37.39, SD = 8.00) much higher than that for negative primes (M = 6.92, SD = 8.71).
Table 1. Correspondence between prime categories and their values on polarity and type
Polarity    Type: picture    Type: webpage
negative    NP               UW
positive    PP               BW
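For readers who want to reproduce a 2 by 2 repeated-measures analysis of this kind, the sketch below relies on a standard identity: with two levels per factor, each main effect and the interaction has one degree of freedom, and its F(1, n-1) equals the squared paired t statistic of a per-subject contrast score. The function names and the four-participant data set are hypothetical and are not the study's data; with 25 participants the same computation would give the F(1, 24) form reported above.

```python
from math import sqrt
from statistics import mean, stdev

def f_from_differences(diffs):
    """F and its degrees of freedom for a one-df within-subject contrast.

    The F value is the squared one-sample t statistic of the contrast scores.
    """
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, (1, n - 1)

def anova_2x2_within(subjects):
    """subjects: list of dicts holding one PTP score (ms) per key BW, UW, PP, NP."""
    contrasts = {
        "polarity": [(s["BW"] + s["PP"]) / 2 - (s["UW"] + s["NP"]) / 2 for s in subjects],
        "type": [(s["BW"] + s["UW"]) / 2 - (s["PP"] + s["NP"]) / 2 for s in subjects],
        "type x polarity": [(s["BW"] - s["UW"]) - (s["PP"] - s["NP"]) for s in subjects],
    }
    return {name: f_from_differences(d) for name, d in contrasts.items()}

# Hypothetical PTP scores for four participants (illustration only).
toy_subjects = [
    {"BW": 40, "UW": 10, "PP": 45, "NP": 0},
    {"BW": 30, "UW": 20, "PP": 35, "NP": 5},
    {"BW": 50, "UW": 5, "PP": 30, "NP": -10},
    {"BW": 20, "UW": 15, "PP": 40, "NP": 10},
]
toy_result = anova_2x2_within(toy_subjects)
```

The contrast formulation avoids building full sums-of-squares tables and makes explicit which cell differences each effect tests.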
Post hoc comparisons show that priming with ugly web pages and with unpleasant pictures each led to a significantly lower PTP score than priming with neutral primes: both ts(24) > 2.30, ps < .03. On the other hand, when the priming effects of
beautiful web pages and pleasant pictures were each compared with that of neutral primes, PTP scores increased as predicted. However, neither increase reached significance, ps > .45 (a detailed account of this unexpected finding and its implications is provided in the Discussion section). The means and standard deviations of the PTP scores associated with all five prime categories are displayed in Table 2.
Table 2. Means and standard deviations for PTP score as a function of prime categories
Prime type    Mean (ms)    Std Deviation (ms)
BW            35.43        41.50
UW            14.06        49.37
PP            39.35        48.90
NP            -0.23        47.82
CT            32.79        47.24
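As a quick consistency check, the polarity means reported with the ANOVA (M = 37.39 for positive primes, M = 6.92 for negative primes) can be recovered by averaging the corresponding cell means in Table 2. The sketch assumes that every participant contributed a PTP score to every prime category, so that the mean of per-subject averages equals the average of the category means; the variable names are illustrative.

```python
# Mean PTP scores (ms) copied from Table 2.
table2_mean_ptp_ms = {"BW": 35.43, "UW": 14.06, "PP": 39.35, "NP": -0.23, "CT": 32.79}

def marginal_mean(categories):
    """Average the Table 2 means over the given prime categories."""
    return sum(table2_mean_ptp_ms[c] for c in categories) / len(categories)

positive_primes = marginal_mean(["BW", "PP"])  # compare with the reported M = 37.39
negative_primes = marginal_mean(["UW", "NP"])  # compare with the reported M = 6.92
polarity_gap = positive_primes - negative_primes
```

The gap of roughly 30 ms between the two polarity means is the quantity tested by the main effect of polarity.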
4 Discussion
4.1 Beauty and Affect
In the present study, we found that the influence of ugly web pages on the subsequent adjective-evaluation task is similar to that of unpleasant pictures, while the influence of beautiful web pages is similar to that of pleasant pictures. It can be inferred by analogy that ugly web pages induced negative affect in participants, whereas beautiful web pages pushed their affective state in the positive direction. In the present study, participants were required to concentrate on the target evaluation task rather than on the primes. With little attention directed to the prime, any affective reaction it evoked was unlikely to be contaminated by conscious reflective processing, and can thus be seen as the direct outcome of the visual Gestalt’s impact on the affective system (Fig. 1a). Such findings bear out the often-taken-for-granted assumption about the link between beauty and affect: the visual Gestalt of products is capable of changing the affective state of users independent of explicit judgment about product aesthetics. What would be the consequences if the alternative model outlined in Fig. 1b were true? According to this competing model, the visual Gestalt of a product cannot affect users’ affective state until users make a conscious aesthetic judgment about its appearance. Yet in most goal-mode (i.e. driven by a predetermined goal) [7] user-product interaction contexts, users seldom explicitly evaluate the visual appeal of a product before proceeding to interact with it. Accordingly, at least in the case of goal-mode products, much of the discussion about making beauty a design goal would become irrelevant, because in this model the involvement of conscious reflection is required for beauty to influence UX (Fig. 1b).
4.2 Direct Measure vs. Indirect Measure
The affective priming paradigm has recently been developed in social psychology as an indirect measure of stimulus valence (e.g. of food or friends) to replace more conventional direct measures (e.g. questionnaires) [5]. The advantage of indirect over direct measures is that the former are less susceptible to distortion and bias on the part of respondents (e.g. impression management). However, the UX field has been slow to take up this advance in measurement methodology, and traditional measuring instruments (e.g. Likert scales) still predominate [12]. In fact, the unexpected result of our study testifies to the possible pitfalls of direct measurement. Recall that, compared with neutral pictures, neither beautiful web pages nor pleasant pictures increased PTP scores significantly as predicted. There are two possible explanations for this finding. The first is that both pleasant pictures and beautiful web pages failed to alter the affective states of participants, assuming that neutral pictures did not induce affective changes. However, this account can hardly be reconciled with two facts: (1) past affective priming studies have consistently demonstrated that positive pictures can induce an affective reaction [e.g. 11, 19], and (2) in this study, negative primes resulted in affect change, as indexed by a significant decrease in PTP score. Hence, we believe the alternative account makes more sense: the neutral pictures led to positive affect in participants. After reexamining the neutral pictures, we noticed that most of them are beautifully designed artifacts (e.g. antiques) or pleasing geometric patterns (Fig. 6). Their neutrality on the rating scale is likely due to distortion by the raters’ cognitive processes. Despite the initial pleasant affect brought about by the visual Gestalt of, say, a streamlined hairdryer, the raters might have engaged in a reflective process (e.g. how can such a mundane object make me feel good?) that dismissed this affective change, and therefore rated it as emotionally bland. Recently, Zhou et al. [21] discovered that Chinese participants took a neutral schematic face (explicitly rated as such) as a positive one in an implicit task (i.e. categorization). Admittedly, adding an extra condition to our experiment, in which no primes precede the targets, would help clarify this ambiguity. Nonetheless, the present study points to the caution one should exercise when interpreting the results of beauty studies that employ direct measuring instruments.
Fig. 6. Examples of emotionally-neutral pictures used in the experiment
Notwithstanding the uncertainty concerning neutral pictures, the present research unequivocally demonstrated that the PTP score was capable of distinguishing good designs from bad ones. We have therefore provided some empirical evidence that the affective priming paradigm could serve as a promising alternative measuring instrument in future UX research.
5 Conclusion
By differentiating between two opposing accounts of how beauty creates affect (Fig. 1), the present study demonstrated that product beauty is one of the ‘many circumstances in which affective reaction precedes the very cognitive appraisal on which the affective reaction is presumed to be based’ [18]. Note that in our experiment the affective changes of participants were inferred from PTP scores; such evidence is therefore inconclusive at best. Clearly, more research employing alternative procedures, such as physiological instruments, is needed to validate our conclusion. By showing that beautiful and ugly products can be distinguished by their effect on PTP scores, this study has interesting implications for the important question of how to measure beauty [9]. Though we speculated about the possibility of adopting the affective priming paradigm to measure the visual attractiveness of products, the adjustments necessary to achieve this end have yet to be specified.
Acknowledgments. This research was supported by grants from 973 Program of Chinese Ministry of Science and Technology (#2006CB303101), and the National Natural Science Foundation of China (#60433030).
References
1. Alben, L.: Quality Of Experience: Defining the Criteria for Effective Interaction Design. Interactions 3, 11–15 (1996)
2. Bai, L., Ma, H., Huang, Y.X., Luo, Y.J.: The Development of Native Chinese Affective Picture System-A Pretest in 46 College Students. Chinese Mental Health Journal 19, 719–722 (2005)
3. Banse, R.: Affective Priming with Liked and Disliked Persons: Prime Visibility Determines Congruency and Incongruency Effects. Cognition & Emotion 15, 501–520 (2001)
4. Desmet, P.M.A., Hekkert, P.: The Basis of Product Emotions. In: Green, W., Jordan, P. (eds.) Pleasure with Products, beyond Usability, pp. 60–68. Taylor & Francis, London (2002)
5. Fazio, R.H., Olson, M.A.: Implicit Measures in Social Cognition Research: Their Meaning and Use. Annual Review of Psychology, pp. 297–328 (2003)
6. Fazio, R.H., Sanbonmatsu, D.M., Powell, M.C., Kardes, F.R.: On the Automatic Activation of Attitudes. Journal of Personality and Social Psychology 50, 229–238 (1986)
7. Hassenzahl, M.: The Thing and I: Understanding the Relationship between User and Product. In: Blythe, M., Overbeeke, K., Monk, A., Wright, P. (eds.) Funology: From Usability to Enjoyment, pp. 31–42. Kluwer Academic Publishers, Dordrecht Boston London (2003)
8. Hassenzahl, M.: The Interplay of Beauty, Goodness, and Usability in Interactive Products. Human-Computer Interaction 19, 319–349 (2004)
9. Hassenzahl, M.: Aesthetics in Interactive Products: Correlates and Consequences of Beauty. In: Schifferstein, H.N.J., Hekkert, P. (eds.) Product Experience. Elsevier, Amsterdam (2006)
10. Hermans, D., Baeyens, F., Lamote, S., Spruyt, A., Eelen, P.: Affective Priming as an Indirect Measure of Food Preferences Acquired through Odor Conditioning. Exp. Psychol. 52, 180–186 (2005)
11. Hermans, D., Spruyt, A., De Houwer, J., Eelen, P.: Affective Priming with Subliminally Presented Pictures. Can. J. Exp. Psychol. 57, 97–114 (2003)
12. Kuniavsky, M.: Observing the User Experience: A Practitioner’s Guide to User Research. Morgan Kaufmann, San Francisco (2003)
13. Lyubomirsky, S., King, L., Diener, E.: The Benefits of Frequent Positive Affect: Does Happiness Lead to Success. Psychological Bulletin 131, 803–851 (2005)
14. Lindgaard, G., Fernandes, G., Dudek, C., Brown, J.: Attention Web Designers: You Have 50 Milliseconds to Make a Good First Impression! Behaviour & Information Technology 25, 115–126 (2006)
15. Norman, D.A.: Emotional Design: Why We Love (Or Hate) Everyday Things. Basic Books, New York (2004)
16. Norman, D.A.: Introduction to This Special Section on Beauty, Goodness, and Usability. Human-Computer Interaction 19, 311–318 (2004)
17. Tractinsky, N., Katz, A.S., Ikar, D.: What is Beautiful is Usable. Interacting with Computers 13, 127–145 (2000)
18. Zajonc, R.B., Markus, H.: Affective and Cognitive Factors in Preferences. The Journal of Consumer Research 9, 123–131 (1982)
19. Zhang, P., Li, N.: The Importance of Affective Quality. Communications of the ACM 48, 105–108 (2005)
20. Zhang, Q., Li, X.: Affective Priming Effects under Two SOA Conditions. Chinese Journal of Applied Psychology 11, 154–159 (2005)
21. Zhou, G., Fu, X., Hayward, W.G., Locke, V., Pellicano, E.: Cultural Difference in the Application of the Diagnosticity Principle to Schematic Faces. Journal of Cognition and Culture 5(1), 240–247 (2005)
Enhancing User-Centered Design by Adopting the Taguchi Philosophy
Wei Zhou1,2, David Heesom2, and Panagiotis Georgakis1
1 West Midlands Centre for Constructing Excellence (WMCCE), The Development Centre, Wolverhampton Science Park, Wolverhampton, WV10 9RU, U.K
2 School of Engineering and the Built Environment, University of Wolverhampton, Wulfruna Street, Wolverhampton, WV1 1SB, U.K
{wei.zhou,d.heesom,p.georgakis}@wlv.ac.uk
Abstract. Since the 1980s, User-Centered Design (UCD) has become popular in the ICT industry. It helps practitioners arrive at usable designs through a set of workflows, evaluation methods, and design approaches, which together form a comprehensive UCD framework. With its extensive use, however, its pitfalls have also been exposed with respect to cost-benefit, robustness, and optimization. Applying the Taguchi Method can remedy these pitfalls and yield robust, optimal designs. This approach is feasible but has received little emphasis in the Human-Computer Interaction field. From a theoretical perspective, this paper describes a practical approach to enhancing the UCD framework by adopting the Taguchi philosophy. Based on an analysis of the UCD framework and the Taguchi Method, it discusses the key adaptation points for adopting the Taguchi philosophy in the UCD framework. As a result, the Taguchi-Compliant User-Centered Design (TC-UCD) framework is proposed in this paper.
Keywords: Taguchi-Compliant User-Centered Design, the Taguchi Method, usability, User-Centered Design.
this repetitiveness feature is a weakness of the UCD method. It costs a substantial amount of time and money to ensure usability. Secondly, the current UCD is not highly effective at achieving robust usability. Design and evaluation in UCD are two separate phases, with no relationship or mechanism to integrate them. Nielsen (1993, p. 107) pointed out that in an iterative design it is likely that "additional usability problems appear in repeated tests after the most blatant problems have been corrected". This shows that unstructured iterative design is incapable of delivering a robust design, and inevitably introduces uncertainty into achieving robust usability. Last but not least, it is questionable whether today’s UCD can produce an optimized design. In many circumstances, a design solution may have different options for its design components. To pick out the best one, the normal approach is to compare all the design options. Nonetheless, the current parallel design approach in UCD is essentially a collection of several independent designs, which are unsystematic and lack analytical comparison. Moreover, the evaluation analysis methods, such as within-subjects, between-subjects, etc., are ill-suited to multi-variable situations, which normally involve a large number of design choices to evaluate and analyze. Undertaking large-scale optimization work with those evaluation methods in UCD is a formidable task. In these respects, ironically, current UCD approaches are unusable for designers seeking robust, optimal usability in their designs. Total Quality Management (TQM) theory provides inspiration for enhancing the robustness and optimization of the UCD method. Within TQM, the underlying philosophy of the Taguchi Method advocates designing quality into the product during the design process rather than after it.
Comparably, it is possible to design usability into the user interface during the design process, and hence significantly shorten design-testing cycles, save cost in usability testing, and deliver an optimized, robust design. The applicability of the method in HCI was demonstrated in the early 1990s (Reed, 1992). Applying the Taguchi Method, Smith (1996) attempted to create another design approach, Logical User Centered Interface Design (LUCID, tagged as LUCID-Smith in this paper). Unfortunately, it lacks explicit specifications for adopting UCD elements, while UCD has been increasingly emphasized in the HCI realm. However, a pioneering design project (Zhou, 2005) has shown that combining the essence of UCD with the Taguchi philosophy can overcome those weaknesses of UCD and achieve optimized, robust usability.
The aim of this paper is to describe an enhanced UCD framework that adopts the Taguchi philosophy. Firstly, the UCD methodology is described to show its fundamental elements. Secondly, the Taguchi Method is reviewed to highlight its key concepts and features. Thirdly, the integration of UCD with the Taguchi philosophy is discussed to show their correlation and the adaptations required. Based on the analysis of these theories and methodologies, a new Taguchi-Compliant User-Centered Design (TC-UCD) framework is proposed.
2 UCD Framework
UCD was advocated by Donald Norman in the 1980s (Norman, 1988). It recommends placing the user at the center of the design. Since its inception, it has developed into a substantial framework with various methods for requirement analysis, design,
and evaluation in usability engineering. A number of UCD methodologies have been devised by HCI specialists, institutions, and organizations to meet needs within the framework. This section outlines the UCD framework in terms of UCD workflow, evaluation methods, and design approaches.
2.1 UCD Workflow
Mayhew (1999) introduced a detailed roadmap of the usability engineering lifecycle, which can guide practitioners step by step toward a usable design. In this roadmap, a complete usability engineering lifecycle consists of requirement analysis, design/testing/development, and installation, in each of which a series of specific activities needs to be carried out to achieve particular goals. Roughly speaking, requirement analysis deals with the user profile, contextual task analysis, platform capabilities and constraints, and general design principles. These analyses help determine usability goals. The design/testing/development phase plays a central role in the roadmap. On the one hand, it applies design strategies derived from the preceding requirement analysis phase to prototyping and development. On the other hand, through testing it provides feedback to the preceding phase for adjustment and improvement. The installation phase gathers deployment feedback for further design improvement. In addition to this roadmap, similar UCD models have been proposed by Nielsen (1993), Hix (1999), and some commercial institutions such as Cognetics Corp., which created another LUCID (Logical User Centered Interaction Design) framework. Essentially, the backbone of these models can be generalized into several key parts: user study for requirement analysis, conceptual design/development for prototyping and implementation, usability testing for finding usability problems, and deployment for identifying points of design improvement.
2.2 Evaluation Method
There are three types of evaluation method for usability testing: heuristic evaluation, formative evaluation, and summative evaluation.
Depending on the number of users needed in usability testing, their cost varies from low to high (Hix, 1999). Heuristic evaluation applies existing design guidelines or checklists without involving real users. Formative evaluation is often applied in design-testing lifecycles to identify usability problems. It can produce both qualitative (narrative) and quantitative (numeric) results. Summative evaluation is used when finalizing a design in order to obtain statistical information. Besides these general types of usability evaluation, Nielsen (1993, p. 224) summarized popular usability methods and their suitability. Among these methods, thinking aloud is a typical formative evaluation, which has evolved into several further methods such as constructive interaction (codiscovery), retrospective testing, and the coaching method.
2.3 Design Approach
Iterative design and parallel design are two approaches in UCD. The mainstream approach in the HCI field is iterative design. Usually, it is realized through several design-testing cycles that incrementally achieve usability. Mayhew (1999) applied this approach
extensively in the roadmap. Such an iterative approach has been considered the best choice in user interface/interaction design. However, Dix (2003) noted that the iterative approach might fail to reach the best design because of an inappropriate starting point. To overcome this pitfall, he argued that it is crucial to have a good initial design based on experience and judgment. Another approach is to start with several initial design ideas and drop them one by one as they are developed further. Dix’s (2003) suggestion is consistent with Nielsen’s parallel design model (Nielsen, 1993, p. 86), in which several independent designs are produced simultaneously by different designers and their merits are then merged into a new design for further iterative design. A case study of this model (Nielsen, 1996) reported that parallel design is more expensive than iterative design because it consumes more resources. Nevertheless, it can shorten time-to-market and explore the design space in less time. It was noted that the merged design depended on the senior designer’s subjective judgment and individual experience. The study concluded that this method is not recommended for all projects because of its costly nature, unless time-to-market is of the essence.
3 Taguchi Method

The Taguchi Method was devised by Dr. Genichi Taguchi in the late 1940s. It has a strong theoretical relationship with Design of Experiments (DoE), which was founded by Sir Ronald Fisher in the 1920s. Initially, the Taguchi Method was created for quality control, to deliver robust products. Nowadays it is applied extensively in all kinds of fields. Thousands of successful cases from diverse companies have been reported in the past 40 years. Its detailed theory and applications are available in the references (Ross 1988, Taguchi 2000).

3.1 Robust Design

Dr. Taguchi established his philosophy of robust design. He defines “robustness” as: the state where the technology, product, or process performance is minimally sensitive to factors causing variability (either in the manufacturing or user’s environment) and aging at the lowest unit manufacturing cost (Taguchi, 2000). Following this philosophy, the designer’s goal should be to reduce output variability in the presence of noise. Traditionally, design proceeds as follows: design → test → find problem → solve problem → test → find problem → … until the problem is eliminated. Such a “plug-the-leak” design approach is just what UCD follows. It is a time-consuming and costly way to improve design quality. The Taguchi Method breaks with this conventional approach. It advocates designing quality into the product instead of inspecting for it after production. To realize this aim, a three-stage design process is suggested for product quality control.

3.2 Three Stage Design

Stage 1 - System Design. The focus of system design is to determine suitable working levels of design factors. The Taguchi Method treats design in an analytical way. It breaks design issues down into design factors, design levels, and noise factors.
W. Zhou, D. Heesom, and P. Georgakis
Design factors are the main controllable design issues in product creation. They directly influence product performance. Design levels are the options available for a design factor. Noise factors are uncontrollable external issues, which usually interfere with product performance. Basically, noise factors come from three sources: outer noise (environment), inner noise (the product itself) and between-product noise (piece-to-piece variation). The choice of design factors, levels and noise factors is decided by the researcher’s judgment, based on the selected materials, parts and technology.

Stage 2 - Parameter Design. Parameter design seeks the design factor levels that produce the best performance of the product/process under study. These optimal conditions are selected so that the influence of uncontrollable factors (noise factors) causes minimum variation in system performance. To search for optimal conditions that are insensitive to noise, a partial factorial experiment is used rather than a tedious full-factorial experiment. The Orthogonal Array (OA) and its optimization analysis play a key role in this stage. An OA consists of an inner array and an outer array. The inner array arranges design factors and their levels into a group of parallel trials for an experiment, so that a subset of trials can test the whole set of design combinations. Its balance and orthogonality make the experimental results comparable, and thus dramatically decrease the number of experiments. The outer array, likewise, creates different noise situations for testing those design combinations. Optimization analysis of the experimental results finds the most robust design levels, from which optimized design combinations are created. Besides the often-used ANOVA method, Dr. Taguchi suggested using the Signal-to-Noise (S/N) ratio to discern the design levels that provide the best performance under study.
A confirmation experiment should follow to verify the validity of the optimized designs.

Stage 3 - Tolerance Design. Tolerance design refines the results of the parameter design by tightening the tolerance of factors. It is possible that design levels are improperly chosen by designers. Even after optimization in the experiment, the optimized designs might not show the desired performance in the confirmation experiment. In this situation, the design levels need to be adjusted, which initiates another design-testing cycle.

In accordance with the three-stage design and its OA rationale, the Taguchi Method constructs a design space, or problem space, from design factors and levels, in which optimal design solutions can be sought by optimization analysis. In contrast to the aimless searching of the “plug-the-leak” way, this target-oriented approach ensures that a robust design is achieved, and can also be applied to the quality control of user interface/interaction design, that is, usability.
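The inner-array idea and the S/N analysis described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes the standard L4 array for three two-level factors, the smaller-the-better form of the S/N ratio (suitable when the response is something like task time or error count), and invented measurements. The function names are hypothetical.

```python
import math
from itertools import combinations, product

# Standard L4 orthogonal array: 4 trials cover 3 two-level factors
# (a full factorial would need 2**3 = 8 trials). Levels are coded 1 and 2.
L4 = [
    (1, 1, 1),
    (1, 2, 2),
    (2, 1, 2),
    (2, 2, 1),
]


def is_orthogonal(array, levels=(1, 2)):
    """True if every pair of columns contains each level pair equally often."""
    n_factors = len(array[0])
    for a, b in combinations(range(n_factors), 2):
        pairs = [(row[a], row[b]) for row in array]
        counts = [pairs.count(combo) for combo in product(levels, repeat=2)]
        if len(set(counts)) != 1:
            return False
    return True


def sn_smaller_is_better(values):
    """Smaller-the-better S/N ratio in dB: -10 * log10(mean of y squared).

    Assumes at least one measurement is non-zero.
    """
    mean_square = sum(v * v for v in values) / len(values)
    return -10.0 * math.log10(mean_square)


def best_levels(array, sn_per_trial):
    """For each factor, pick the level with the highest mean S/N ratio."""
    chosen = []
    for f in range(len(array[0])):
        means = {}
        for level in sorted({row[f] for row in array}):
            sns = [sn for row, sn in zip(array, sn_per_trial) if row[f] == level]
            means[level] = sum(sns) / len(sns)
        chosen.append(max(means, key=means.get))
    return tuple(chosen)


# Invented data: repeated task times (seconds) measured in each of the 4 trials.
times = [[30, 34], [22, 25], [41, 39], [24, 26]]
sn = [sn_smaller_is_better(t) for t in times]

print("orthogonal:", is_orthogonal(L4))
print("S/N per trial (dB):", [round(x, 2) for x in sn])
print("most robust levels:", best_levels(L4, sn))
```

With these invented numbers the selected levels form a combination that is not among the four trials actually run, which is the situation a confirmation experiment is meant to check.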
4 Adoption and Adaptation

The Taguchi philosophy demonstrates a structured approach that combines design with evaluation. It allows a group of correlated parallel designs to be created, evaluated, and optimized through comparison. Within the UCD framework, adopting the Taguchi philosophy can compensate for UCD’s weaknesses in robustness and optimization, and shorten tedious design-testing cycles to improve usability control. Some adaptations in both
UCD and the Taguchi Method are needed so that each complies with the principles of the other. The following sections discuss the primary adaptations.

4.1 Taguchi Design

Taguchi design is introduced into the UCD framework. It connects the conceptual design and the usability evaluation to achieve robustness and optimization in application. In accordance with the Taguchi philosophy, the Taguchi design consists of system design and parameter design. Their functions are clarified as follows.

System design. The objective of this stage is to identify the main design factors and levels that influence usability. It has been acknowledged that a task-centered process can foster design (Davis, 2001). Task analysis can therefore surface design issues and help to decide design factors and design levels. To achieve this objective, task analysis needs to be differentiated into two levels: abstract-level analysis for generating design factors, and concrete-level analysis for choosing specific design levels. ConcurTaskTrees (CTT) (Paternò, 2000), a popular task analysis method, provides an ideal interface to meet this need. Besides user tasks, specific design elements in the user interface, such as layout or GUI components, can also be design factors if they cause variations in usability. Task analysis in the UCD framework leads to the definition of usability goals and thereafter to the creation of prototypes. It plays the same role in system design, except that the prototyping is performed in the parameter design.

Parameter design. Parameter design seeks the most usable design levels through usability testing on prototypes. It needs a proper inner array to arrange the design factors and levels identified in the system design. The inner array thereby constructs a set of parallel combinations for prototyping.
Compared with independent parallel design in the UCD framework, these parallel prototype designs are governed by the inner array, and every prototype corresponds to one trial of usability testing. The usability testing in every trial examines the design factor levels in order to identify robust design levels, which are insensitive to noise factors. As these prototypes share the same design factors but differ in design levels, they are comparable, and robust design levels can be picked out after the testing. The balance and orthogonality of the inner array allow a partial factorial experiment to stand in for testing all possible prototypes, which is particularly valuable for saving cost in usability testing. Formative evaluation is the main approach in this evaluation. Its execution and result analysis can lead to optimized designs.

4.2 Formative Evaluation and Analysis

Formative evaluation is performed in the trials of the parameter design. It serves two aims: usability testing and design level optimization. The former checks for usability problems in each design level; the latter identifies robust design levels. Among usability methods, thinking aloud and performance measurement are applicable for the testing, partly because both can check for usability problems in prototypes, and partly because both can provide objective quantitative information to identify
robust design levels. According to ISO 9241-11, usability is defined in terms of effectiveness, efficiency and satisfaction. Its evaluation accordingly comprises objective evaluation for effectiveness and efficiency, and subjective evaluation for satisfaction. In objective evaluation, effectiveness and efficiency can be associated with the participant’s error rate and time cost respectively. During the testing, this quantified objective information can be recorded to assess the usability of design levels. To identify the most robust design levels among the test results, the S/N ratio can be analyzed to judge effectiveness and efficiency. The measure of satisfaction, however, is unsuitable for a partial factorial experiment because of its subjective nature. Nevertheless, participants can pick out their favorite design combinations after finishing all the trials. Based on these objective and subjective evaluations, three types of most usable design levels can be found, in terms of effectiveness, efficiency, and satisfaction. As this analysis focuses on an individual’s behavioral information, its optimized design solutions apply only to that individual. Such an analysis can accordingly be named Within Individual Analysis.

4.3 Human Factors and Outer Array

In terms of the Taguchi philosophy, human factors are the main noises in the parameter design. Smith (1998) applied the outer array to arrange objective human factors such as age, gender and ethnic background. On the other hand, he suggested using cognitive approaches to handle subjective human factors such as cognitive style and attitude. However, using the outer array to build noise conditions in usability evaluation is questionable, because human factors are chaotic in testing. It cannot be asserted with certainty that a user’s performance is influenced by a single isolated human factor such as gender or nationality.
Moreover, every participant brings a mix of objective and subjective factors to usability testing. For these reasons, the optimization results obtained could be inconsistent. A more persuasive and economical approach to dealing with these variations is to check the statistical significance of the optimized designs obtained. Nonparametric statistics is an applicable approach for solving this problem.

4.4 Significance Analysis

By applying the Within Individual Analysis, optimal robust designs can be obtained in the Taguchi design. These optimized designs are not the same for all people because of individual differences. The group of optimal designs obtained in the evaluations falls into different categories of design level combinations. This raises the question of whether there are significant differences among these combinations. In essence, this is a one-sample goodness-of-fit hypothesis test for categorical measurement. The Chi-square test (Siegel, 1988), a nonparametric statistical method, can answer this question. As this test concentrates on the significance analysis of optimization results from all individuals, it can be named Among Individual Analysis. Such an analysis can identify whether there are most effective, most efficient, or most satisfactory solutions for all end users.
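The Among Individual Analysis can be illustrated with a small one-sample chi-square goodness-of-fit computation. The sketch below is an assumption-laden example rather than the authors' analysis: it supposes two two-level design factors (four possible combinations), 24 hypothetical participants whose individually optimal combination has already been determined, and a null hypothesis that all combinations are equally likely. The critical values are the usual tabulated 5% points of the chi-square distribution.

```python
from collections import Counter

# Upper 5% critical values of the chi-square distribution by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_gof(observed_counts):
    """Goodness-of-fit statistic against equal expected frequencies.

    Returns (statistic, degrees_of_freedom).
    """
    total = sum(observed_counts)
    expected = total / len(observed_counts)
    statistic = sum((o - expected) ** 2 / expected for o in observed_counts)
    return statistic, len(observed_counts) - 1


# Hypothetical result of the Within Individual Analysis: the optimal design
# combination found for each of 24 participants.
categories = ["A1B1", "A1B2", "A2B1", "A2B2"]
optimal_per_participant = ["A1B2"] * 13 + ["A1B1"] * 5 + ["A2B1"] * 4 + ["A2B2"] * 2

tally = Counter(optimal_per_participant)
# Include every possible combination, even one that nobody preferred.
observed = [tally[c] for c in categories]

statistic, df = chi_square_gof(observed)
significant = statistic > CHI2_CRITICAL_05[df]
print(f"chi-square = {statistic:.3f}, df = {df}, significant at 5%: {significant}")
```

A significant result only says that the optimal combinations are not spread evenly across participants; which combination to adopt is then read from the observed counts. The equal-frequency null hypothesis and the sample size are choices made for this illustration.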
5 TC-UCD Framework

On the basis of the foregoing discussion, an enhanced UCD framework is proposed: Taguchi-Compliant User-Centered Design (TC-UCD). The backbone of TC-UCD consists of user study, conceptual design, Taguchi design, usability evaluation, confirmation test, and deployment. TC-UCD not only keeps the essence of UCD in design techniques and usability evaluations, but also has some unique features for exploring optimized robust usability in designs. Fig. 1 illustrates the present UCD framework and the TC-UCD framework for comparison.

Compared with the current UCD framework, TC-UCD has four major characteristics. Firstly, it preserves the user study as the design starting point for user profile definition and for functional and non-functional requirement analysis. Secondly, the Taguchi design is integrated into the UCD framework. Its system design and parameter design belong to the conceptual design and the usability evaluation respectively. The former determines design factors and levels; the latter seeks robust optimal design factor levels through parallel prototyping and formative evaluation. Meanwhile, the Within Individual Analysis identifies optimal design combinations, whilst the Among Individual Analysis verifies the significance of the optimized designs. Thirdly, a confirmation test is suggested in TC-UCD for checking the usability of the optimized designs. Summative evaluation is no longer necessary in the new framework. Lastly, TC-UCD itself is compatible with iterative design. Although the emphasis of TC-UCD is placed on parallel design and evaluation, it remains flexible enough to adjust design strategies starting from the user study, or to tighten design levels starting from system design. The starting point of an iteration depends on the real design situation.
6 Conclusions and Future Work

This paper presents a new vision for enhancing the present UCD framework. From a theoretical perspective, it presents TC-UCD to compensate for the shortcomings of UCD in robustness and optimization. HCI is a multidisciplinary field whose theories and methods mainly derive from the behavioral sciences. It is a valuable practice to combine feasible engineering theories with the behavioral sciences to enrich the HCI framework. TC-UCD is an advanced and comprehensive design approach. It is highly useful for exploring in-depth design solutions in multi-variable situations. In particular, the identification of design factors provides a common starting point from which to seek appropriate design levels within a parallel design space. Such a design mechanism can fulfill the needs of Nielsen’s parallel design model at the price of only additional consideration and analysis. Its analysis methods, theories, and comparable quantitative design approach make the model more practical and applicable. Hence the cost of this parallel design will not be increased by the change of design mechanism. However, it demands that designers have good skills in quick prototyping to meet the needs of evaluations and time-to-market. Designers also need to acquire knowledge of the Taguchi philosophy in order to use it in practice. Adopting engineering theories into the UCD framework is a new attempt. Especially, UCD has
Fig. 1. UCD and TC-UCD framework
strong characteristics in empirical design and behavioral sciences. Merging these two aspects is a challenge. TC-UCD is expected to be fully applied in real design practice for further verification, validation, and improvement.
References

1. Davis, L., Dawe, M.: Collaborative Design with Use Case Scenarios. International Conference on Digital Libraries. In: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, Roanoke, Virginia, United States, pp. 146–147, ISBN: 1-58113-345-6 (2001)
2. Dix, A., Finlay, J., Abowd, G., Beale, R.: Human-Computer Interaction, 3rd edn. Prentice Hall, Englewood Cliffs (2003)
3. Hix, D., Ii, S., Gabbard, J.L., McGee, M., Durbin, J., King, T.: User-Centered Design and Evaluation of a Real-Time Battlefield Visualization Virtual Environment. In: IEEE Virtual Reality Conference (VR’99) (1999)
4. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
5. Mayhew, D.J.: The Usability Engineering Lifecycle. Academic Press, San Diego (CA) (1999)
6. Nielsen, J.: Usability Engineering. Morgan Kaufmann, San Francisco, California (1993)
7. Nielsen, J., Faber, J.M.: Improving System Usability Through Parallel Design. IEEE Computer 29(2), 29–35 (1996)
8. Norman, D.: The Psychology of Everyday Things. Basic Books, New York (1988)
9. Reed, B.M.: A robust approach to human-computer interface design using the Taguchi method. Old Dominion University, Norfolk, VA, USA (1992)
10. Ross, P.J.: Taguchi Techniques for Quality Engineering. McGraw-Hill, New York (1988)
11. Siegel, S.: Nonparametric Statistics, 2nd edn. International Editions. McGraw-Hill, New York (1988)
12. Smith, A.: Towards the total quality interface - applying Taguchi TQM techniques within the LUCID method. In: People and Computers XI, Proceedings of HCI-96, Springer, Heidelberg (1996)
13. Smith, A., Dunckley, L.: Using the LUCID method to optimize the acceptability of shared interfaces. Proceedings of Interacting with Computers 9, 335–345 (1998)
14. Taguchi, G.: Robust Engineering. McGraw-Hill, New York (2000)
15. Weiss, S.: An Alternative Business Model for Addressing Usability: Subscription Research for the Telecom Industry. Interactions (July + August 2005)
16. Zhou, W.: Confidential Report: The interface development of Explicit Recommender for an Open Media Centre. Stan Ackermans Institute, Eindhoven University of Technology, the Netherlands (2005) ISBN 90-444-0543-8
A Requirement Engineering Approach to User Centered Design

Dirk Zimmermann1 and Lennart Grötzbach2

1 T-Mobile Deutschland GmbH, Landgrabenweg 151, 53227 Bonn, Germany
2 Siemens IT Solutions and Services C-LAB, Fürstenallee 11, 33098 Paderborn, Germany
[email protected], [email protected]
Abstract. This paper describes an approach for integrating UCD activities into existing Software Engineering practices and processes. The aim is to use the outcomes of UCD activities throughout the development process and to ensure that they can be utilized, traced and tested by subsequent development groups. In this way, UCD activities become plannable and manageable just like any other development activity. The authors introduce a framework of three different usability-related requirement types that reflect the results of the UCD activities performed during software development. Each requirement type is extracted from the UCD results generated in the first three phases of the DIN EN ISO 13407 model.

Keywords: Usability Engineering, Requirements Engineering, Processes, Integration.
The approach taken in this paper is to create a Requirement Engineering (RE) framework that distinguishes three different types of Usability-related requirements that correlate to the three facets: Usability, Workflow and Design. Knowing that most UE processes are embedded into a more complex Software Engineering (SE) process, one of the goals was to align the framework both with current best practices in Usability Engineering and with existing RE approaches in the Software Engineering discipline.

1.1 Requirement Engineering

Current software development practices use software requirements to specify the functional and non-functional aspects of software systems. While functional requirements specify the services and functions the system should provide and how it should react and behave, non-functional requirements define constraints on the offered services and functions. Thus non-functional requirements define the product quality. ISO/IEC 9126 differentiates six non-functional requirement dimensions, of which Usability is one [13]. The process of requirement elicitation, as well as the results of the requirements engineering activities, is well described in today’s literature. IEEE STD 830-1998 [12] describes the resulting artifact, the Software Requirement Specification (SRS), as a document that “correctly define[s] all of the software requirements” of the system to be developed. Each of these specifies a “software capability needed by a user to solve a problem to achieve an objective” [1]. For the SRS, IEEE STD 830-1998 also defines quality attributes such as correctness, unambiguousness, completeness, consistency and verifiability. These attributes therefore apply to each individual requirement contained in the SRS. This is a huge benefit for the development process, since an SRS provides verifiable and testable demands on the system.
But specifying such testable and precise requirements is difficult; sometimes requirements represent architecture/design constraints or lofty goals that can neither be met by the system nor sufficiently tested. Recent efforts in the Usability Community aim to produce similar guidelines tailored to Usability-related requirements. One result of these efforts is the Common Industry Specification for Usability-Requirements (CISU-R) [5], which provides guidelines for “defining usability requirements in sufficient detail to make an effective contribution to design and development and [create] usability criteria that can be empirically validated subsequently if needed.” It is closely related to the Common Industry Format for Usability Test Reports (CIF) [14], which offers guidance and a specification format for performing summative Usability Tests based on Usability requirements and for describing their results. Tests that follow the CIF can be used to validate requirements written in the CISU-R notation. Both standards aim to specify the level of Usability based on its three dimensions, Effectiveness, Efficiency and Satisfaction: “The CIF and CISU-R take a broad approach to Usability based on DIN EN ISO 9241 Part 11 […] The value of specifying these high level requirements is that they relate closely to business requirements for successful use of a product and increased productivity” [5]. From an RE point of view, it is therefore important that the quality attributes defined in IEEE 830-1998 are applicable to any Usability-related requirement, in order to guarantee that they can be verified and tested throughout the development process.
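To make the idea of a verifiable Usability requirement concrete, the following Python sketch expresses one requirement as three measurable targets (one per dimension of DIN EN ISO 9241 Part 11) and checks it against test-session data. It is only an illustration: the class and field names, the chosen measures (completion rate, mean time on task, mean questionnaire score) and all numbers are assumptions, and the structure is not the CISU-R or CIF format.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    completed: bool       # did the participant finish the task?
    seconds: float        # time on task
    satisfaction: float   # questionnaire score, here on a 1-5 scale


@dataclass
class UsabilityRequirement:
    task: str
    min_completion_rate: float    # effectiveness target
    max_mean_seconds: float       # efficiency target (completed tasks only)
    min_mean_satisfaction: float  # satisfaction target

    def verify(self, sessions):
        """Return a pass/fail verdict per usability dimension."""
        completed = [s for s in sessions if s.completed]
        return {
            "effectiveness": len(completed) / len(sessions)
            >= self.min_completion_rate,
            "efficiency": bool(completed)
            and mean(s.seconds for s in completed) <= self.max_mean_seconds,
            "satisfaction": mean(s.satisfaction for s in sessions)
            >= self.min_mean_satisfaction,
        }


# Hypothetical requirement and hypothetical test data.
requirement = UsabilityRequirement(
    task="Book a flight",
    min_completion_rate=0.75,
    max_mean_seconds=60.0,
    min_mean_satisfaction=4.0,
)
sessions = [
    Session(True, 40, 4),
    Session(True, 55, 5),
    Session(False, 90, 2),
    Session(True, 48, 4),
    Session(True, 61, 3),
]
verdict = requirement.verify(sessions)
print(verdict)
```

Because each target is a number with a stated measure, the requirement can be tested in the sense the quality attributes above demand; a goal such as "the system shall be easy to use" could not be checked this way.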
D. Zimmermann and L. Grötzbach
1.2 User Centered Design Process

In current literature several User Centered Design (UCD) processes are described (e.g. [6], [16]). They focus on different aspects and address different needs of software development – but all of them share a set of common properties. These properties are described in the DIN EN ISO 13407, where a generalized human-centered design process for interactive systems is described [8]. The process is not a replacement to established software development processes but an addition to them: “This International Standard does not assume any one standard design process, nor does it cover all the different activities necessary to ensure effective system design. It is complementary to existing design methods and provides a human-centered perspective that can be integrated on different forms of design process in a way that is appropriate to the particular context.” The process consists of four base activities common to UCD process models:

• To understand and specify the context of use,
• To specify the user and organizational requirements,
• To produce design solutions and
• To evaluate the designs against requirements.
These four activities are performed iteratively during the development process. The process is complete when the resulting system meets its specified requirements. Iterating to close in on the requirements and testing the results throughout the process are also common traits of this process model. Even though DIN EN ISO 13407 requests that evaluation of mock-ups and prototypes take place throughout the process, few details are given on how this relates to the basic evaluation phase of the process. Because the results are iteratively improved, it seems reasonable to suggest that the intermediate results as well as the requirements need to increase in detail. Not only will new requirements be discovered during the process; more specific requirements are also needed to evaluate the intermediate process results as the system design becomes more concrete and precise. This is in line with the activities described in DIN IEC TR 18529, a standard that supplements DIN EN ISO 13407 by providing process descriptions for the human-centered lifecycle [9]. In the Evaluate designs against requirements sub-phase, evaluation activities are proposed to improve the design, to define and validate requirements, and to check whether the defined practices are being followed. Thus, results of the UCD phases serve as input for subsequent phases as well as validation criteria for intermediate and final UCD results.

1.3 Evaluation

Since DIN EN ISO 13407 describes an iterative process model, all of its phases are performed several times during one development cycle. As Woletz [17] pointed out, evaluation activities focus on different aspects during this cycle: in early phases intermediate results are evaluated to identify weak points, gaps or errors for further improvement, while at the end a final assessment of the system is performed.
Therefore evaluation activities should be performed not just once at the end of the cycle but several times during it. “In the general software industry it is increasingly recognized that continued evaluation is needed throughout the system development lifecycle, from early design to summative testing, in order to ensure final products meet expectations of designers, users, and organizations” [15]. Two types of goals and corresponding evaluation methods can be differentiated for validation activities. First, it can be evaluated whether the resulting documents of the development phases are correct in terms of content and style and capture the information needed for the later phases of the cycle. This can be done via interviews in which the captured information is discussed with user representatives, stakeholders or other domain experts. In addition, “upward validations” are needed to check whether the results of a phase correspond to the results of previous phases. For example, all requirements generated in the Specify Requirement phase are checked against the results of the Context of Use phase to find open loops, conflicts or mismatches between them. Second, it can be evaluated whether the system (in development) matches the requirements that were specified in advance. For UCD evaluations this can be done through Usability Tests or Expert Reviews. To do this effectively, the specified requirements need to be measurable and precise, as stated above. For both evaluation types, the activities and the granularity of the resulting artifacts of the phases vary greatly. Therefore different evaluation methods need to be applied to the results. For example, analysis results from the Context of Use phase require different evaluation methods than software prototypes from the Design Solution phase.
Results of the Context of Use phase, such as persona descriptions or other models describing the work, the context and the users in focus, can be validated with Usability Tests or through Expert Reviews. To check how well the system is accepted by the target audience, questionnaires or surveys can be conducted. The workflow descriptions resulting from the Specify Requirement phase can best be evaluated with real or potential users, by comparing their real workflows with the proposed ones and gathering their feedback on the modifications. The same applies to early sketches showing the envisioned interaction steps the user will have to take. Design solutions, such as screens showing the user interface or detailed system interactions showing the flow of information, can a) be evaluated with users to see whether the solutions support their workflows and b) be tested against styleguides or measurable criteria to see whether the solution meets the previously specified expectations and requirements. As mentioned before, not just final results but also intermediate results from the different phases of the development cycle need to be evaluated. To evaluate the system effectively against the results of the previous phases, those results need to contain measurable criteria. Thus, in order to be able to evaluate UCD results, requirements should be generated from each of the first three phases described in DIN EN ISO 13407. This is what we do in the following sections.
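The “upward validation” mentioned above, in which results of one phase are checked against those of earlier phases to find open loops or mismatches, can be pictured as a simple traceability check. The Python sketch below is one possible reading under explicit assumptions: requirements carry an identifier, one of three kinds, and a list of identifiers they trace to; the rule table stating which kinds may trace to which is an assumption made for this example, as are all names and sample requirements.

```python
from dataclasses import dataclass, field

# Assumed rule: a requirement may only trace upward to kinds produced by
# earlier phases of the process.
ALLOWED_PARENTS = {
    "usability": set(),               # Context of Use phase: top of the chain
    "workflow": {"usability"},        # requirements specification phase
    "ui": {"usability", "workflow"},  # design solution phase
}


@dataclass
class Requirement:
    req_id: str
    kind: str            # "usability", "workflow" or "ui"
    text: str
    traces_to: list = field(default_factory=list)


def upward_validation(requirements):
    """Return (open_loops, mismatches).

    open_loops: ids of non-usability requirements that trace to nothing.
    mismatches: (id, parent_id) pairs whose parent is missing or of a kind
    the requirement is not allowed to trace to.
    """
    by_id = {r.req_id: r for r in requirements}
    open_loops, mismatches = [], []
    for r in requirements:
        if r.kind != "usability" and not r.traces_to:
            open_loops.append(r.req_id)
        for parent_id in r.traces_to:
            parent = by_id.get(parent_id)
            if parent is None or parent.kind not in ALLOWED_PARENTS[r.kind]:
                mismatches.append((r.req_id, parent_id))
    return open_loops, mismatches


# Hypothetical requirement set.
reqs = [
    Requirement("U1", "usability", "90% of users complete a booking unaided"),
    Requirement("W1", "workflow", "Booking takes at most four steps", ["U1"]),
    Requirement("W2", "workflow", "User can save a draft booking"),
    Requirement("D1", "ui", "Step indicator shown on every screen", ["W1"]),
    Requirement("D2", "ui", "Draft list uses a table layout", ["W9"]),
]
open_loops, mismatches = upward_validation(reqs)
print("open loops:", open_loops)
print("mismatches:", mismatches)
```

Such a check only finds structural gaps; whether a traced requirement actually conflicts with its parent still has to be judged by reviewers or validated with users.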
2 A User Centered Requirement Framework

Given that, on the one hand, each of the first three phases described in DIN EN ISO 13407 generates some type of result and that, on the other hand, DIN EN ISO
9241 Parts 10 to 12 also describes three facets (Usability, Workflow and Design), the RE framework presented in this paper focuses on these three types of results as starting points.

Within the Context of Use phase, the analysis revolves mainly around the anticipated users, their jobs and tasks, their mental models, their conceptions of the usage of the system, the physical environment, organizational constraints and determinants, and the like. While much of this information is descriptive (i.e. it helps to create concepts and models rather than being a testable requirement), the Context of Use phase also yields users’ expectations regarding the fundamental Usability dimensions, namely effectiveness, efficiency and satisfaction. CISU-R reflects these overarching metrics, which pertain to the user’s use and perception of the system, in the form of requirements, whereas CIF provides a standard format for presenting the results of tests based on Usability requirements. These requirements can be used as evaluation criteria for the system and any intermediate prototypes, as well as for intermediate UCD outcomes of subsequent phases.

The User Requirements phase focuses on individual workflows and tasks to be performed by one of the target users. Taking some of the Context of Use data into consideration, specific task performance models are elicited from users, the workflows are optimized, and an improved task performance model is generated. The outcome of this phase can be described as a set of requirements pertaining to a user’s interaction with the system in the context of a specific workflow or task, e.g. as use case scenarios. The requirements describe the discrete sub-steps of a user’s interaction flow and the expected system behavior for each of these steps. These requirements themselves can already be evaluated against the Usability requirements elicited from the context of use, e.g.
by comparing an optimized workflow to the current state of workflow performance with regard to effectiveness, efficiency and user satisfaction. However, they also serve as input and evaluation criteria for the subsequent process step: Produce Design Solution.

During the Produce Design Solution phase, properties of the intended system are defined, e.g. information architecture, interaction flow, screen layout, component design, etc. Some of these properties are more conceptual, i.e. they serve as underlying models, but others can be experienced by the user during the use of the system. In order to translate these designs into solutions, different facets have to be considered.

• Conceptual/Structural/Framework type requirements that describe a model that is more underlying and less visible.
• Visual requirements (or other perceptive modality) that describe the physical properties of the solution (e.g. size, color, spacing, contrast, alignment, etc.)
• Interaction requirements that describe the behavior of the system (e.g. interaction flow, messaging, etc.)

These User Interface (UI) requirements can be evaluated against the Usability requirements generated in the Context of Use phase, i.e. to evaluate whether the layout and interaction model fit the mental model that the users have of the task and associated information, and whether the general usage would be effective, efficient and satisfactory. They can also be evaluated against the Workflow Requirements from the Specify Requirements phase, in order to assess whether all specific workflows can be performed
A Requirement Engineering Approach to User Centered Design
easily with the given concept and design. They serve as criteria for the actual system that has been developed, i.e. does it follow the defined model for layout and interaction? In summary, the user-related requirements that can be generated in the first three phases of the DIN EN ISO 13407 model are depicted in Figure 1.
[Figure 1 content: Usability Requirements from the Context of Use phase, with Usability Evaluation based on Usability Principles; Workflow Requirements from the User Requirements phase, with Workflow Evaluation based on Dialog Principles; User Interface Requirements from the Design Solution phase, with UI Evaluation based on Information Design Principles.]
Fig. 1. Usability-related requirements, their origin during the DIN EN ISO 13407 phases and their evaluation activities based on DIN EN ISO 9241 principles
For the overall system design, it should be noted, however, that these three types of Usability-related requirements are not exhaustive in the description of a system. There is a variety of requirements stemming from different stakeholder groups that need to be viewed in conjunction with the Usability-related requirements described in this paper. From a UCD perspective, however, it is important to include user needs in the form of requirements in the overall process for scoping, implementation or testing purposes.
2.1 Usability Requirements
This type of requirement contains general Usability criteria that the system must meet. It is based on the three Usability dimensions effectiveness, efficiency and satisfaction described in DIN EN ISO 9241 Part 11 and specifies the requirements of the user population towards the holistic usage of the system in the specified context of use. Usability requirements are thus the result of the Context of Use phase in DIN EN ISO 13407 and provide measures that the system should comply with. They are derived from the results of the context analysis, from competitive analyses, from previously identified areas of improvement, and from general expertise about the domain, human-computer interaction practices, etc. For example, UCD artifacts to be used as input can be the Contextual Design Models as specified by Beyer and Holtzblatt [2] (Flow of Work, Sequence of Work, Work Artifact, Work Culture and Physical model) or Persona Descriptions as introduced by Cooper [6], since they describe user archetypes capturing user characteristics and describing overarching contextual and task information. As usability requirements are one of many non-functional requirement types (as mentioned above), they have a strong impact on functional requirements, since the usability goals address how the system shall be used and perceived by the user. They
D. Zimmermann and L. Grötzbach
can be documented using established SRS formats or using the best practices defined in the CISU-R (see [12], [5]). Usability Requirements can be verified through interviews and reviews during their elicitation, and tested via Usability Tests, Expert Reviews or surveys and questionnaires towards the end of the development cycle. They can be used to generate unique selling points or other arguments used by marketing or sales divisions. The requirements can be tested by the testing department in conjunction with the other non-functional requirements.
2.2 Workflow Requirements
During the Specify Requirements phase of DIN EN ISO 13407, the current state captured during the Context of Use phase is considered and iteratively modified towards an optimized set of workflows and requirements that reflect the planned changes and improvements. The new workflow descriptions capture user goals and the associated workflows. They specify how the system should support users in completing their tasks and goals with the system. They encapsulate essential interaction steps, needed information and options for the user, and thus specify system behavior without being too specific about the concrete details (UI). An entire Workflow Scenario is considered as one requirement, meaning that the improved workflow to accomplish a task needs to be implemented in the system to allow the users to work successfully. The information and models describing the future system’s context of use serve as input, as do innovations, anticipated changes, fixes to known bugs and new features, all of which are molded into a description of a “better” system than the current one. The possible interaction with the system and the flow of information are described in use cases or similar artifacts (e.g. scenario descriptions). “A scenario is a concrete description of a specific flow of interaction, but one that is chosen to be typical or representative. [...] 
A use case is a generic scenario, describing one kind of interaction with a particular user interface.” [4] Cockburn defines, that each use case is the implementation of a stakeholder's goal. Multiple stakeholders participate in the use case and their interests should also be protected by the interactions defined in it. For Cockburn, the collection of use cases is the essence of the system's requirements, even though they don't represent all the requirements [3]. Workflow Requirements can be reviewed and verified with users to assess whether the current workflow description is accurate and if the optimized workflows are regarded as an actual improvement of their tasks. Later in the development process the requirements can be used to evaluate a prototype or system with regard to the workflow support, i.e. whether the interaction model meets the workflow requirements. A single scenario, e.g. the main scenario from a use case, could for example serve as an evaluation scenario during a usability test. The concrete descriptions of the system and the needed functionality within workflows can be used to determine what to build into the system. The workflow descriptions are used as test cases to verify if the workflows can be realized with the prototypes and the final system created during the Design Solutions phase.
2.3 UI Requirements
During the Produce Design Solution phase, both the Usability Requirements and the Workflow Requirements are synthesized into a set of conceptual models and solution elements. These usually consist of Information Architecture, Navigation Models, Wireframes, Sketches, high fidelity screen designs, or prototypes. They describe the logical model and physical properties of the system, i.e. they specify how the widgets, UI patterns, and interaction elements shall look, behave, and interact. Ideally, they also provide all states, presets, and system reactions for concrete system screens. UI requirements are generated from prototypes, from the actual system in its current stage, from UI specification documents, and also from general styleguides, information architecture or navigation models. They focus on different layers of the user interaction with the system and thus have sub-types:
• Information architecture and information flow requirements, which define the overarching logical structure of the user interface.
• Presentation requirements, where the layout of an entire component or a single element is defined, e.g. a screen or a widget like a dropdown combo box.
• User-system-interaction requirements, where the behavior of the UI elements is defined, e.g. status changes.
• Compound requirements, where the interaction between more than one element is specified, e.g. in the case of drag and drop functionality.
• Message requirements, which define when and how system generated notifications, e.g. errors or alerts, are presented to the user.
To ensure that the solution conforms to the previous analysis, the UI requirements are evaluated against the usability and workflow requirements. This checks that they match the mental model that the user has of the “look & feel” of the system, as well as the users’ expectations regarding effectiveness and efficiency on a granular level, e.g. with regard to screen flips or mouse clicks. 
Additionally, they should comply with general design guidelines as defined e.g. in DIN EN ISO 9241 Part 12, or styleguides. UI requirements are also used to evaluate the fulfillment of the solution design in the coded system. Within the development process, UI requirements can be used by Development as input for their system design, as well as by testing divisions for ensuring appropriate realization of the UI. They enhance/extend workflow requirements in a way that development can select the most appropriate implementation of a given workflow. UI Requirements are also used by internal experts (e.g. UI design, testing) to determine if the (coded) system complies with its specification. They are mostly of internal value, i.e. supporting system specification and verification. From an end user’s perspective they should almost be invisible, i.e. the users’ perception of the system should not recognize individual UI features, but a holistic experience that is compliant with their expectations.
3 Summary and Outlook The authors introduced a framework of Usability-related requirements to align Usability Engineering with Software Engineering and Requirement Engineering
practices. Correlated to the phases of the DIN EN ISO 13407 process model and to the three levels of usability principles laid out in DIN EN ISO 9241 Parts 10 to 12, three requirement types were differentiated: Usability-, Workflow- and UI Requirements. Through the introduction of Workflow- and UI Requirements the authors extended the established concept of usability requirements, e.g. as described in CISU-R. These two additional types offer a more precise representation of UCD demands generated in the later phases of a development cycle. The three requirement types were explained, their interconnections described and details to their application and reuse were given. By differentiating the three types of requirements, development organizations can ensure a seamless, traceable hierarchy of user-focused requirements, starting with the high level needs pertaining to the general use of a system, going through the specific needs during the task performance towards requirements regarding the specific implementation of the system. In addition, the approach allows selecting the most appropriate evaluation method for different types of requirements, e.g. summative usability tests for Usability Requirements, or expert based reviews for UI Requirements. Guidance in selecting the appropriate evaluation method is provided by a poster by Freymann et al. [11] that differentiates evaluation methods by their result types and by their application within the development life cycle. Another benefit is scalability. Individual requirements are less monolithic than e.g. complete specification documents. Especially with regard to the emerging lightweight agile approaches to SE, e.g. Extreme Programming or Scrum, a less document driven approach to capture UCD outcomes could be more suitable. A detailed analysis of potential application areas for the three UCD requirement types in agile development is described by Düchting et al. [10]. 
The approach and the requirement framework presented in this paper need to be applied in the field and tried out. The requirement types need to be integrated within established requirement engineering approaches to evaluate whether the proposed granularity and differentiation of the requirement types is sufficient and where areas of improvement and refinement can be identified.
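As an illustration of the traceable hierarchy described above, the following minimal Python sketch models the three requirement types and the links between them. The names `ReqType`, `Requirement`, `ALLOWED_PARENTS` and `trace`, as well as the sample requirement texts, are assumptions made for this example and are not part of the framework itself; the permitted derivations follow the evaluation relationships described in Sections 2.1 to 2.3.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ReqType(Enum):
    USABILITY = 1  # result of the Context of Use phase
    WORKFLOW = 2   # result of the Specify Requirements phase
    UI = 3         # result of the Produce Design Solution phase


# Assumption: a requirement may only be derived from the types it is
# evaluated against (workflow against usability; UI against both).
ALLOWED_PARENTS = {
    ReqType.USABILITY: set(),
    ReqType.WORKFLOW: {ReqType.USABILITY},
    ReqType.UI: {ReqType.USABILITY, ReqType.WORKFLOW},
}


@dataclass
class Requirement:
    ident: str
    rtype: ReqType
    text: str
    derived_from: List["Requirement"] = field(default_factory=list)

    def __post_init__(self):
        # Reject links that would break the hierarchy.
        for parent in self.derived_from:
            if parent.rtype not in ALLOWED_PARENTS[self.rtype]:
                raise ValueError(f"{self.ident} cannot derive from {parent.ident}")


def trace(req):
    """Return the ids of req and of every requirement it is derived from."""
    chain, queue = [], [req]
    while queue:
        current = queue.pop(0)
        if current.ident not in chain:
            chain.append(current.ident)
            queue.extend(current.derived_from)
    return chain


# Hypothetical sample requirements, one per type.
u1 = Requirement("U1", ReqType.USABILITY, "Users complete the task without help")
w1 = Requirement("W1", ReqType.WORKFLOW, "Optimized scenario for the task", [u1])
ui1 = Requirement("UI1", ReqType.UI, "Confirmation message after the last step", [w1])
print(trace(ui1))
```

Keeping the links on each requirement, rather than in a separate specification document, is one way to obtain the lightweight, less document-driven representation mentioned in the summary.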
References
1. ANSI/IEEE Std. 729-1983. Standard Glossary of Software Engineering terminology (1983)
2. Beyer, H., Holtzblatt, K.: Contextual Design – Defining Customer-Centered Systems. Morgan Kaufmann, San Francisco (1998)
3. Cockburn, A.: Writing Effective Use Cases. Addison-Wesley, London, UK (2001)
4. Constantine, L.: What do users want? Engineering Usability into Software. Windows Tech. Journal 4(12), 30–39 (1995)
5. CISU-R. Common Industry Specification for Usability-Requirements, Draft Version 0.86, National Institute of Standards and Technology, 18.03.2006 (2006)
6. Cooper, A., Reimann, R.: About Face 2.0. Wiley, Indianapolis, Chichester, UK (2003)
7. DIN EN ISO 9241. Ergonomic requirements for office work with visual display terminals (VDTs). International Organization for Standardization (1998)
8. DIN EN ISO 13407. Human-centered design processes for interactive systems, International Organization for Standardization (1999)
9. DIN IEC TR 18529. Software engineering - Product quality. International Organization for Standardization/International Electrotechnical Commission (2001)
10. Düchting, M., Zimmermann, D., Nebe, K.: Incorporating User Centered Requirement Engineering in Agile Software Development. In: preparation, HCII 2007, Beijing (2007)
11. Freymann, M., Grötzbach, L., Nebe, K.: Selecting Appropriate Evaluation Methods for Different UCD Outcomes. In: preparation, HCII 2007, Beijing (2007)
12. IEEE STD 830-1998. IEEE recommended practice for software requirements specifications, International Organization for Standardization (1998)
13. ISO/IEC 9126. Software Engineering - Product Quality, International Organization for Standardization (2001)
14. ISO/IEC 25062:2006. Software product Quality Requirements and Evaluation (SQuaRE) Common Industry Format (CIF) for usability test reports (2006)
15. Kushniruk, A.W., Patel, V.L.: Cognitive and Usability Engineering Methods for the Evaluation of Clinical Information Systems, Journal of Biomedical Informatics 37 (2004)
16. Mayhew, D.J.: The Usability Engineering Lifecycle – A practitioner’s Handbook to User Interface Design. Morgan Kaufmann, San Francisco (1999)
17. Woletz, N.: Evaluation eines User-Centred Design-Prozessassessments - Empirische Untersuchung der Qualität und Gebrauchstauglichkeit im praktischen Einsatz. Doctoral Thesis. University of Paderborn, Germany (2006)
Part II
Usability and Evaluation Methods and Tools
Design Science-Oriented Usability Modelling for Software Requirements
Sisira Adikari, Craig McDonald, and Neil Lynch
School of Information Sciences and Engineering, University of Canberra, ACT 2601, Australia
{Sisira.Adikari,Craig.McDonald,Neil.Lynch}@canberra.edu.au
Abstract. An identified key reason for degraded usability in software systems is the deficiencies of current RE practice to incorporate usability perspectives effectively into SRS. The explicit expression of user and usability aspects in SRS benefits designers, developers, and testers in ensuring optimal usability in software products. This paper presents the results of a design-science oriented user interface design study to validate the proposition that incorporating user modelling and usability modelling in SRS improves design. Keywords: User modelling, usability modelling.
1 Introduction
Despite the various User-Centred Design (UCD) methods developed to produce usable information systems, usability-related issues are still detected late in software development, during testing and deployment [1]. One of the identified reasons for poor usability in products is that usability requirements are often weakly specified [2]. Traditionally, Requirements Engineering (RE) concentrates on functional requirements and on ensuring that the developed products meet these requirements, rather than on non-functional requirements (NFR), which are considered less important [3]. Usability has been classified as one of the NFR in RE [4]. Designating usability as a less important NFR may lead to less attention being paid to it during the requirements definition stage, and this weaker focus on usability in requirements may propagate usability issues into end products. In this paper, we propose design science-oriented requirements modelling based on user modelling and usability modelling as an effective means of transforming usability aspects into Software Requirements Specifications (SRS). We explain the UCD process and a possible way to integrate it into a typical SDLC. We present a design science-oriented research design to test our proposal, and some results to validate the proposition that incorporating user modelling and usability modelling in SRS improves design.
desired properties [6]. In a much cited paper, March and Smith [7] describe ‘build’ and ‘evaluate’ as two fundamental design science processes, and identify four types of products in design science: constructs, models, methods, and instantiations. According to their definitions, constructs or concepts form the vocabulary of a domain, a model is a set of propositions or statements expressing relationships among constructs, a method is a set of steps used to perform a task, and an instantiation is the realisation of an artefact in its environment. The design science concepts reported by March and Smith have recently been developed further by a number of authors [8-10], who claim that design activities are central to the information systems (IS) discipline and present a conceptual framework for understanding, executing, and evaluating IS research that combines the behavioural science and design science paradigms. Figure 1 shows research as addressing both the rigour required of research and the practical environment of use.
Fig. 1. Information Systems Research Framework [10]
3 ISO 13407: Usability Engineering Process
The ISO 13407 standard intends to provide guidance on best practice in usability engineering [11]. Jokela et al. carried out an in-depth interpretive analysis of ISO 13407 and found that the standard provides limited guidance for designing usability, describing user goals, usability measures and producing the various outcomes [12]. Although ISO 13407 provides general guidance on UCD activities, our analysis of the standard identifies two important aspects that are not clearly visible:
• How to use evaluation feedback to improve the design and requirements.
• How ISO 13407 can be used to support a software development process.
4 ISO 13407: Evaluation Feedback to Improve the Design
In the ISO 13407 process model (see Figure 2), the output of process 4 feeds back to process 1. In practice, the outcome of the evaluation needs to be applied to the design immediately, so that the design is improved before the next iteration. To produce effective design solutions, we argue that it is important to feed the output of process 4 (Evaluate design against requirements) back into process 3 (Produce design solutions) as well, and also to update the requirements. Our suggested variation with additional feedback loops is also shown in Figure 2.
[Figure 2 content: Identify need for human-centred design process; 1 Understand and specify the context of use; 2 Specify the user and organisational requirements; 3 Produce design solutions; 4 Evaluate design against requirements; exit when the system satisfies specified user and organisational requirements (Final Design).]
Fig. 2. A variation to the ISO 13407 to improve the design
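The suggested variation can be sketched as a small directed graph. In this minimal Python sketch, the forward sequence 1 → 2 → 3 → 4 is assumed from the process numbering, the loop 4 → 1 is the feedback stated for the ISO 13407 model, and the loops 4 → 3 and 4 → 2 are the additional feedback paths argued for above. All identifiers are chosen for this example only.

```python
# Process steps as numbered in the ISO 13407 process model.
STEPS = {
    1: "Understand and specify the context of use",
    2: "Specify the user and organisational requirements",
    3: "Produce design solutions",
    4: "Evaluate design against requirements",
}

FORWARD = [(1, 2), (2, 3), (3, 4)]       # assumed from the step numbering
ISO_FEEDBACK = [(4, 1)]                  # feedback in the ISO 13407 model
SUGGESTED_FEEDBACK = [(4, 3), (4, 2)]    # additional loops proposed here


def successors(step, edges):
    """Return the steps that receive the output of `step`."""
    return sorted(dst for src, dst in edges if src == step)


original = FORWARD + ISO_FEEDBACK
variation = original + SUGGESTED_FEEDBACK

for name, edges in (("ISO 13407", original), ("variation", variation)):
    targets = [STEPS[s] for s in successors(4, edges)]
    print(f"{name}: evaluation output feeds {targets}")
```

In the variation, the evaluation outcome reaches the design and the requirements directly instead of only through another pass over the context of use.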
4.1 Integration of ISO 13407 into Software Development Process
In Figure 2, when the system satisfies the user and organisational requirements (Final Design), the requirements at stage 2 can be considered the requirements of the best design solution for the product. As illustrated in Figure 3, we suggest that feeding these requirements into the requirements definition stage of a typical SDLC will make the requirements definition process more user-centred from the beginning.
[Figure 3 content: the ISO 13407 cycle (Identify need for human-centred design process; 1 Understand and specify the context of use; 2 Specify the user and organisational requirements; 3 Produce design solutions; 4 Evaluate design against requirements; system satisfies specified user and organisational requirements, Final Design) shown alongside the Software Development Life Cycle stages Requirements Definition, Design, Build, Testing and Release.]
Fig. 3. The integration of ISO 13407 process model into a typical SDLC
There are many significant advantages of integrating the ISO 13407 process model through requirements into a typical SDLC:
• Requirements definitions are more user-centred and task oriented.
• The requirements definition phase can be completed fairly quickly.
• The system and software design phase can be driven by concrete design solutions, leading to a shorter turnaround time.
• The testing phase can also be user-centred, because user-centred requirements specifications and design solutions are available to support usability-focused test specifications.
5 User and Usability Requirements
The key challenge of the UCD process is to communicate user-centredness effectively to the designer and developer. We propose integrating user modelling and usability modelling into the software development process as an important step in closing this communication gap.
5.1 Conceptual User Model
A user model is a representation of information and assumptions about users [13] and can be viewed from three perspectives: modelling user knowledge, modelling user plans, and modelling user preferences [14]. Modelling user knowledge involves the accurate estimation of users’ background knowledge, skills, and experience. Modelling user plans aims to investigate the sequence of user tasks required to achieve user goals. Modelling user preferences primarily focuses on users’ information needs and preferences. Our proposed conceptual user model consists of seven user attributes, as illustrated in Figure 4. In this research, a user model for the existing system was created through user interviews, observations and persona development, resulting in a specification of the model.
[Figure 4 content: User Model with seven attributes: User Needs and Expectations; Existing Knowledge and Skills; Existing Experience; User Goals and Tasks; Physical Attributes; Cultural Factors; Attitude Information.]
Fig. 4. Conceptual User Model
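One way to hand such a user model to designers is as a simple record with one field per attribute. The following Python sketch is only an illustration: the class `UserModel`, the helper `missing_attributes` and the sample persona entries are assumptions made for this example, and only the seven attribute names come from Figure 4.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class UserModel:
    """One field per attribute of the conceptual user model (Figure 4)."""
    needs_and_expectations: List[str]
    existing_knowledge_and_skills: List[str]
    existing_experience: List[str]
    goals_and_tasks: List[str]
    physical_attributes: List[str]
    cultural_factors: List[str]
    attitude_information: List[str]


def missing_attributes(model):
    """List the attributes that have not been filled in yet."""
    return [f.name for f in fields(model) if not getattr(model, f.name)]


# Hypothetical persona for a library system user; all entries are invented.
student = UserModel(
    needs_and_expectations=["find a book quickly"],
    existing_knowledge_and_skills=["basic web browsing"],
    existing_experience=["has used an online catalogue before"],
    goals_and_tasks=["search for a book", "reserve a book"],
    physical_attributes=[],
    cultural_factors=[],
    attitude_information=["impatient with long forms"],
)
print(missing_attributes(student))
```

A completeness check of this kind shows at a glance which parts of the model still need input from interviews or observations before the specification is passed on.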
5.2 Conceptual Usability Attribute Model
Some of the important usability attributes published in the literature are: Learnability [15], Memorability [16 page 31], Functional Correctness [17], Efficiency [16 pages 30-31], Error Tolerance [16 pages 32-33], Flexibility [15 pages 260-270], and Satisfaction [16 pages 33-37, 11 page 6]. Figure 5 uses an Ishikawa diagram (fishbone diagram) to illustrate the conceptual usability attribute model and its measurable criteria. It shows that usability is a combination of seven usability attributes (Efficiency, Functional Correctness, Error Tolerance, Learnability, Memorability, Flexibility, and Satisfaction) and that each usability attribute is governed by several measurable, usability-related aspects of the system or product. For example, the usability attribute “Efficiency” can be measured by evaluating three components: E1 – Task completion in minimum time, E2 – User tasks are not misleading, and E3 – No workarounds are needed.
[Figure 5 content: fishbone diagram with Usability as the effect and the following attributes and criteria as causes.]
Efficiency
E1 – Task completion in minimum time
E2 – User tasks are not misleading
E3 – No workarounds are needed
Functional Correctness
FC1 – Task completion in minimum time
FC2 – User tasks are appropriate, effective and match the user requirement
FC3 – User spends minimal time on “Help”
Error Tolerance
ET1 – Appropriate error messaging for invalid conditions
ET2 – Ability to exit error conditions or unwanted states
ET3 – No workarounds are needed
Learnability
L1 – Clear visibility of current system status and a feel about what to do next
L2 – User tasks are not misleading
L3 – Task completion in minimum time
L4 – User spends minimal time on “Help”
Memorability
M1 – No memory recall to carry out tasks
M2 – User spends minimal time on “Help”
Flexibility
F1 – Multiplicity of ways to carry out user tasks
F2 – User control of task performance
Satisfaction
S1 – User desirability of the system and user tasks
S2 – User opinion about user experience
S3 – User opinion about frustration or confusion
Fig. 5. The integration of ISO 13407 process model into software development cycle
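The attribute model lends itself to a simple lookup structure. The sketch below encodes the seven attributes and their criteria as listed in Figure 5 and finds criteria that serve more than one attribute. The names `USABILITY_MODEL` and `shared_criteria` are assumptions made for this example; the criteria texts are taken from the figure.

```python
from collections import defaultdict

# Attributes and measurable criteria as listed in Figure 5.
USABILITY_MODEL = {
    "Efficiency": {
        "E1": "Task completion in minimum time",
        "E2": "User tasks are not misleading",
        "E3": "No workarounds are needed",
    },
    "Functional Correctness": {
        "FC1": "Task completion in minimum time",
        "FC2": "User tasks are appropriate, effective and match the user requirement",
        "FC3": 'User spends minimal time on "Help"',
    },
    "Error Tolerance": {
        "ET1": "Appropriate error messaging for invalid conditions",
        "ET2": "Ability to exit error conditions or unwanted states",
        "ET3": "No workarounds are needed",
    },
    "Learnability": {
        "L1": "Clear visibility of current system status and a feel about what to do next",
        "L2": "User tasks are not misleading",
        "L3": "Task completion in minimum time",
        "L4": 'User spends minimal time on "Help"',
    },
    "Memorability": {
        "M1": "No memory recall to carry out tasks",
        "M2": 'User spends minimal time on "Help"',
    },
    "Flexibility": {
        "F1": "Multiplicity of ways to carry out user tasks",
        "F2": "User control of task performance",
    },
    "Satisfaction": {
        "S1": "User desirability of the system and user tasks",
        "S2": "User opinion about user experience",
        "S3": "User opinion about frustration or confusion",
    },
}


def shared_criteria(model):
    """Map each criterion text used by several attributes to its codes."""
    index = defaultdict(list)
    for criteria in model.values():
        for code, text in criteria.items():
            index[text].append(code)
    return {text: codes for text, codes in index.items() if len(codes) > 1}


for text, codes in shared_criteria(USABILITY_MODEL).items():
    print(f"{text}: {', '.join(codes)}")
```

Grouping by criterion text makes visible that one measurement, such as time spent on “Help”, can inform several attributes at once, which may reduce the number of separate measures an evaluator has to collect.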
6 Research Design and Research Process
The aim of the research design was to test whether systems design quality improved when functional specifications were explicitly enhanced with user modelling and usability modelling. Seven designers participated in the design process. We selected a web-based library information system (“existing system”) as our target system of study. We presented the functional specification for the existing system to each designer and asked them to produce a user interface based on it. We then gave each designer the User and Usability Specifications and invited them to refine the design on the basis of the added information. The two designs were evaluated against the usability criteria and compared to detect the differences. This process was repeated with each designer. Results were aggregated to see where, and in what ways, the designers’ work differed and how the differences might have affected the quality of their design. Our research design is illustrated in Figure 6.
[Figure 6 content: inputs Existing System and System Functions Specifications; research tasks T1 User Modelling, T2 Usability Modelling, T3 Design Interface, T4 Redesign Interface, T5 Questionnaire, T6 Evaluation Process (Interface 1), T7 Evaluation Process (Interface 2), T8 Comparison; research products P1 User Model, P2 Usability Model, P3 Interface 1, P4 Interface 2, P5 Evaluation Outcome (Interface 1), P6 Evaluation Outcome (Interface 2), P7 Research Findings. Key: T = research task; P = research product; arrows = flow between product and task (input or output).]
Fig. 6. The Research Design
Summary of activities in the research design:
• T1 & T2: User-centred designers carried out user modelling and usability modelling, producing two artefacts: the user model and the usability model.
• T3: Systems designers designed and produced Interface 1 based on the system functional specification only.
• T4: Systems designers redesigned the interface and produced Interface 2 using Interface 1, the user model and the usability model.
• T5: Systems designers filled out questionnaires expressing their views, design experience and opinions on redesigning the interface.
• T6 & T7: Interface testers carried out an evaluation process against user requirements and usability requirements on both interfaces, involving end users.
• T8: The outcomes of the evaluations were compared to arrive at the research findings.
6.1 Functional Specifications
As functional specifications, a set of documents was provided to the designers, namely a scenario of use for a simple library system, a use case diagram, a database structure diagram, and a diagram detailing library tables and sample data. The scenario of use contained a narrative description of a typical library system user completing a number of user tasks as per the use case diagram shown in Figure 7.
6.2 Enhanced Requirements Specifications
The addition of the two models based on user modelling and usability modelling produced an improved version of the SRS: the Enhanced Requirements Specification (ERS).
[Figure 7 content: Use Case Diagram: Library System. Actor: Library System User. Use cases: Search for Book(s); Reserve Book(s); Review Personal Account; Cancel Reserved Books.]
Fig. 7. Use case diagram for library system
The research described compares the designs resulting from the enhanced specification with those produced from the functional specification alone, in order to test the proposition that the enhanced specification produces more testable designs that are better suited to their environment.
7 Interface Evaluation and Results
For the design of interface 1, designers were provided with only the functional specifications (FS). For redesigning the interface, designers were provided with enhanced requirements specifications (ERS) consisting of user modelling and usability modelling descriptions. We asked the interface designers to fill out a questionnaire expressing their views, design experience and opinions on designing and redesigning the interfaces. The questionnaire used a five-point scale. A summary of the questionnaire findings, with mean (µ) and standard deviation (SD) values, is as follows.
For the design of interface 1 using only FS:
• How easy was it to design the interface with the specifications provided? (µ=3.14, SD=0.69)
• To what extent did you want additional information to create a proper design? (µ=2.0, SD=0.58)
• To what extent were the specifications helpful to create fields, buttons, tabs, menus, information content etc. in the design? (µ=3.43, SD=0.98)
• To what extent did you use your previous experience to create fields, buttons, tabs, menus, information content etc. in the design? (µ=1.43, SD=0.53)
For the design of interface 2 using ERS:
• How easy was it to design the interface with the additional specifications provided? (µ=4.43, SD=0.53)
• To what extent did you want FURTHER additional information to create a proper design? (µ=2.42, SD=0.98)
• To what extent were the additional specifications helpful to create fields, buttons, tabs, menus, information content etc. in the redesign? (µ=4.14, SD=0.69)
• To what extent did you use your previous experience to create fields, buttons, tabs, menus, information content etc. in the redesign? (µ=2.14, SD=1.35)
A user tester facilitated interface evaluations on both interfaces, involving one user for each design. A summary of the results of the usability evaluations on five focus areas is outlined below.
The evaluation of Interface 1:
• Overall, the system was easy to use (µ=3.21, SD=0.63).
• Ability to complete tasks in a reasonable amount of time (µ=3.14, SD=0.62).
• Individual pages were well designed (µ=2.86, SD=0.63).
• The content of the system met user expectations (µ=3.07, SD=0.61).
• The organisation and terminology used in the system were clear (µ=3.29, SD=0.57).
The evaluation of Interface 2:
• Overall, the system was easy to use (µ=4.36, SD=0.75).
• Ability to complete tasks in a reasonable amount of time (µ=4.29, SD=0.76).
• Individual pages were well designed (µ=3.43, SD=0.53).
• The content of the system met user expectations (µ=3.5, SD=0.66).
• The organisation and terminology used in the system were clear (µ=3.57, SD=0.53).
8 Discussion
Table 1 shows a summary of the results of the designers’ questionnaires. Criteria 1 and 3 show a distinct improvement in the designers' perception of the usefulness and helpfulness of the enhanced requirements specification. Criteria 2 and 4 asked the designers about “requirements of additional information” and “use of their previous experience” in the design and redesign processes. These criteria also indicated that the enhanced specification was more complete and that the designers relied less on previous experience (note that, as these questions are of a 'negative' nature, the five-point scale was reversed for the responses so that the correct µ and SD could be calculated). The perception that designers relied less on previous experience in their second design could have been influenced by the experience they gained through the first interface design. The intent of the question was made clear during the data collection.

Table 1. Analysis of design questionnaire

Questionnaire Criteria                       FS µ   ERS µ   Diff µ
1. The extent of the usefulness of specs     3.1    4.4     1.3
2. The need of additional information        2.0    2.4     0.4
3. The extent of helpfulness of specs        3.4    4.1     0.7
4. The use of designer’s experience          1.4    2.1     0.7
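The values in Table 1 can be reproduced from the questionnaire means reported in Section 7. The Python sketch below assumes that each mean is first rounded to one decimal place and that the difference is then taken between the rounded values; this reading is consistent with the figures in the table, but it is an assumption and not a procedure stated by the authors. The dictionary and function names are chosen for this example.

```python
# Questionnaire means from Section 7 (five-point scale).
FS_MEANS = {
    "1. Usefulness of specs": 3.14,
    "2. Need of additional information": 2.0,
    "3. Helpfulness of specs": 3.43,
    "4. Use of designer's experience": 1.43,
}
ERS_MEANS = {
    "1. Usefulness of specs": 4.43,
    "2. Need of additional information": 2.42,
    "3. Helpfulness of specs": 4.14,
    "4. Use of designer's experience": 2.14,
}


def difference_rows(fs, ers):
    """Round each mean to one decimal, then subtract the rounded values."""
    rows = {}
    for criterion in fs:
        fs_mu = round(fs[criterion], 1)
        ers_mu = round(ers[criterion], 1)
        rows[criterion] = (fs_mu, ers_mu, round(ers_mu - fs_mu, 1))
    return rows


for criterion, (fs_mu, ers_mu, diff) in difference_rows(FS_MEANS, ERS_MEANS).items():
    print(f"{criterion:<36} {fs_mu:>4} {ers_mu:>4} {diff:>4}")
```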
Table 2 shows a summary of results relating to the usability evaluation of interfaces. The ERS was evaluated as better on all criteria.
Table 2. Analysis of interface evaluations

Evaluation criteria                FS µ   ERS µ   Diff µ
1. Ease of use                     3.2    4.4     1.2
2. Task completion                 3.1    4.3     1.2
3. Individual page design          2.9    3.4     0.5
4. Information content             3.1    3.5     0.4
5. Organisation and terminology    3.3    3.6     0.3
Table 3 summarises the two tables above. It shows that from the perspective of the designer and the perspective of product quality the design based on ERS was superior to that based on FS alone.

Table 3. Analysis of interface design and evaluation

Description             FS µ   ERS µ   Diff µ
Design Questionnaire    2.5    3.3     0.8
Interface Evaluation    3.1    3.8     0.7
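As a quick arithmetic check, the Diff columns and the summary means in Table 3 can be recomputed from the per-criterion means in Tables 1 and 2. The sketch below is illustrative only. It assumes that each Table 3 entry is the average of the rounded per-criterion means, rounded half up to one decimal; the paper does not state how Table 3 was derived. The names DESIGN, EVALUATION, one_decimal, diffs and summary are introduced here for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-criterion means as (FS, ERS) pairs, copied from Tables 1 and 2.
DESIGN = [("3.1", "4.4"), ("2.0", "2.4"), ("3.4", "4.1"), ("1.4", "2.1")]
EVALUATION = [("3.2", "4.4"), ("3.1", "4.3"), ("2.9", "3.4"),
              ("3.1", "3.5"), ("3.3", "3.6")]


def one_decimal(value):
    """Round half up to one decimal place (assumed rounding rule)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def diffs(rows):
    """Per-criterion difference ERS - FS, i.e. the Diff column."""
    return [Decimal(ers) - Decimal(fs) for fs, ers in rows]


def summary(rows):
    """Average FS and ERS over all criteria and return (FS, ERS, Diff)."""
    n = Decimal(len(rows))
    fs = one_decimal(sum(Decimal(f) for f, _ in rows) / n)
    ers = one_decimal(sum(Decimal(e) for _, e in rows) / n)
    return fs, ers, ers - fs


if __name__ == "__main__":
    for name, rows in (("Design Questionnaire", DESIGN),
                       ("Interface Evaluation", EVALUATION)):
        print(name, [str(d) for d in diffs(rows)],
              [str(v) for v in summary(rows)])
```

Decimal arithmetic is used instead of floats so that values such as 3.25 round half up rather than half to even. Under the stated assumption, the recomputed rows agree with Table 3.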
9 Conclusion

In this paper, we have reported research into the use of UCD approaches to interface development through enhanced requirements specifications. This research shows that deploying user modelling and usability modelling specifications made a positive difference both to designer perception and to the design quality of an interface. Improved design can be expected to lead to the development of more usable, higher-quality systems. Developers will be able to incorporate usability aspects effectively into systems, and testers will be able to test systems effectively and efficiently to uncover functionality issues as well as usability issues. Such approaches will help ensure that any system that goes "live" has few or no usability issues, thereby enhancing the user experience of the end product.

There are three kinds of contributions made by this research. First, evidence has been collected as to the impact of user and usability specification on design. Second, techniques have been developed for the practical presentation of these specifications. Third, the research provides a reflection on the 'design science' research approach.
References

1. Folmer, E., Bosch, J.: Architecting for Usability: A Survey. Journal of Systems and Software, vol. 70, pp. 61–78. Elsevier, Amsterdam (2004)
2. Folmer, E., Gurp, J., Bosch, J.: Software Architecture Analysis of Usability. The 9th IFIP Working Conference on Engineering for Human-Computer Interaction, Hamburg (2004)
3. Bevan, N.: Design for Usability. In: Proceedings of HCI International, pp. 762–767 (1999)
4. Sommerville, I.: Software Engineering. Pearson Addison-Wesley, England (2004)
5. Venable, J.R.: The Role of Theory and Theorising in Design Science Research. First International Conference on Design Science Research in Information Systems and Technology, Claremont, pp. 1–18 (2006)
6. Carlsson, S.A.: Developing Information Systems Design Knowledge: A Critical Realist Perspective. The Electronic Journal of Business Research Methodology 3(2), 93–102 (2005)
7. March, S.T., Smith, G.F.: Design and Natural Science Research on Information Technology. Decision Support Systems 15(4), 251–266 (1995)
8. Au, Y.: Design Science I: The Role of Design Science in E-Commerce Research. Communications of the AIS, vol. 7 (2001)
9. Ball, N.: Design Science II: The Impact of Design Science on E-Commerce Research and Practice. Communications of the AIS, vol. 7 (2001)
10. Hevner, A.R., March, S.T.: The Information Systems Research Cycle. IEEE Computer, 111–113 (November 2003)
11. ISO 13407:1999(E): Human-Centred Design Processes for Interactive Systems (1999)
12. Jokela, T., Iivari, N., Matero, J., Karukka, M.: The Standard of User-Centred Design and the Standard Definition of Usability: Analyzing ISO 13407 against ISO 9241-11. Retrieved July 30, 2006, from http://delivery.acm.org/10.1145/950000/944525/p53jokela.pdf?key1=944525&key2=4303774511&coll=portal&dl=ACM&CFID=74456598&CFTOKEN=24876680
13. Kobsa, A.: Supporting User Interfaces for All Through User Modeling. In: Proceedings HCI International '95, pp. 155–157. Elsevier, Yokohama (1995)
14. Kobsa, A.: User Modeling: Recent Works, Prospects and Hazards. Retrieved July 28, 2006, from http://www.ics.uci.edu/~kobsa/papers/1993-aui-kobsa.pdf
15. Dix, A., Finlay, J., Abowd, G.D., Beale, R.: Human-Computer Interaction. Pearson, Upper Saddle River NJ, pp. 260–270 (2004)
16. Nielsen, J.: Usability Engineering. Academic Press, San Diego (1993)
17. Brinck, T., Gergle, D., Wood, S.D.: Usability for the Web: Designing Web Sites that Work. Morgan Kaufmann, San Francisco (2002)
Prototype Evaluation and User-Needs Analysis in the Early Design of Emerging Technologies

Margarita Anastassova¹, Christine Mégard², and Jean-Marie Burkhardt³

¹ CREATE-NET, Via Solteri, 38 - 38100 Trento, Italy
[email protected]
² CEA/LIST, 18, route du Panorama, BP 6, 92265 Fontenay-aux-Roses Cedex, France
[email protected]
³ René Descartes University, Unité Ergonomie – Comportement & Interactions, 45, rue des Saints-Pères, 75270 Paris Cedex 06, France
[email protected]
Abstract. This paper presents two case studies of prototype evaluation as a tool for user needs elicitation for emerging technologies. In the first user evaluation, a high-fidelity virtual reality prototype is used. In the second one, a low-fidelity mixed reality prototype is used. Our results show that prototypes may be a powerful tool for eliciting user needs, but their potential depends on their fidelity. In our studies, users elicit more needs when working with the high-fidelity prototype. Furthermore, the information collected in this case is richer and more useful for design. We discuss these results as well as some factors that could help stakeholders elicit a greater number of needs for emerging technologies.

Keywords: Mixed Reality, Early Design, Emerging Technologies, Prototype Evaluation, User Needs Analysis, Virtual Reality.
approach to user needs seems quite interesting in the field of emerging technologies, which are also in constant evolution. Of course, the limited interest in the utility of emerging technologies may be because there is no need for specific research on user needs analysis for emerging technologies. Such a statement implies that the results of empirical analyses of user needs for traditional technologies could be directly transposed to innovative applications. Moreover, it means that traditional needs analysis methods in HCI are also easy to use and suitable for innovation. However, there are many arguments against this assumption. These arguments, briefly discussed below, show that the design of useful emerging technologies challenges existing HCI knowledge and methodology.

1.1 Difficulties in Analysing End-User Needs in the Early Design of Emerging Technologies

A first argument supporting the assumption that the elicitation of user needs for emerging technologies is a difficult matter stems from the fact that, by definition, such technologies express designers' striving for technical achievement. As a result, their development is essentially technology-driven and user needs often remain a minor concern for designers. Therefore, HCI specialists, if involved at all, are mostly brought in at the later stages of development projects. Furthermore, literature reviews in the field of VR and MR reveal that current research focuses on building systems ad hoc and on evaluating them in artificial or informal settings [6]. In the rare cases where it is explicitly mentioned, user-needs analysis is generally done by interviewing very few "task experts" (e.g. [7]), by quick field studies of future users' activity (e.g. [8]) or by questionnaires (e.g. [9]). Such practices and research focuses hinder the accumulation of HCI knowledge on user-needs analysis and partly explain the lack of a structured methodology for such analyses.
A third argument for the difficulty of user needs analysis for emerging technologies is that innovation is still emerging and in search of potential applications. Consequently, it is barely known to its future users. Thus, users are not likely to express their needs for innovation because they can hardly imagine and describe what might be possible to do with a possible future technology. In addition, "the more radical an innovation…the harder it is to understand how it should look, function, and be used" [10]. In fact, people are generally most prone to communicate needs of which they are particularly aware. Therefore, most of the HCI methods traditionally used for user needs analysis help the elicitation of such conscious user needs, thus informing design and key industrial stakeholders. However, during the early design of emerging technologies, users are required to express their "undreamed of requirements" [3], and unless people are encouraged explicitly to think about such requirements, they are unlikely to appear until later in the development of a technology, when its potential applications become clear and evident [11].

1.2 Prototypes as a Tool for User-Needs Analysis in the Early Design of Emerging Technologies

Prototypes may be one of the powerful tools for encouraging people to express their "undreamed of requirements" for emerging technologies. Their principal advantages and disadvantages in this context are discussed below.
The main advantage of prototypes for user-needs analysis is the fact that they are concrete physical representations of the future emerging technology as well as of some of its functional, aesthetic and interactional characteristics. Their concreteness facilitates users’, managers’ and, in some cases, designers’ understanding of abstract, unfamiliar or fuzzy technological concepts. Furthermore, prototypes could be considered as “executable representations of (designer’s) knowledge” [12]. As in the process of innovation designers are usually more knowledgeable than users and managers about multiple technical aspects, prototypes may help to transmit some of designers’ knowledge to other stakeholders. Thus, prototypes may support discussions about the functions of the future emerging technology and facilitate the refinement and elicitation of “latent” [11] and unconscious user-needs. For these reasons, physical prototyping of emerging technologies may be an efficient way (1) to demonstrate, communicate and explore a number of possible design ideas and solutions; (2) to search for design alternatives and, hopefully, (3) to provoke further innovation [13]. However, some authors have suggested that prototyping may not always be profitable to elicit “emerging” needs. Though there are few empirical results available, five major limits of prototypes in this context have been advanced. First, there might be an important difference between the functional characteristics of a prototype and the characteristics of a final product, because the potential applications of emerging technologies are, by definition, ill-defined [14]. Second, prototypes might primarily express their designer’s point of view on functions and interaction [15]. Third, especially if prototypes are high-fidelity ones, their concreteness might inhibit stakeholders’ imagination as well as their search of innovation [16]. 
Fourth, as for low-fidelity prototypes, they might be "stigmatized" as less efficient, and even rejected, if stakeholders extended to them their expectations and mental images of traditional and, plausibly, more elaborate technologies. Finally, some researchers claim that low-fidelity prototypes have limitations, since they cannot precisely represent a number of interactional and sensory aspects of future products (e.g. [17]).

Thus, an important issue is to clarify the advantages and disadvantages of evaluating both low- and high-fidelity prototypes as a method for user-needs elicitation in the early design of emerging technologies. In this paper, we compare the results of two case studies having this objective. The first one uses a high-fidelity prototype, while the second one uses a low-fidelity prototype. These were prototypes of two different systems. We chose this solution instead of analysing two versions of the same prototype mainly because of practical and financial constraints. As the design of emerging technologies demands a lot of technical resources, it was impossible to have more than one prototype per system.
2 Case Study 1

The first case study concerns the design of a VR prototype with force feedback for upper-limb rehabilitation. The initial idea of the prototype was to provide various force-feedback based exercises adapted to the motor and the cognitive abilities of patients, who had motor difficulties shortly after a neurological lesion. Starting from this promising idea, the objective of the study was to verify how the prototype
matched actual user needs. We also wanted to refine these needs and to extend them to other potential applications that might be supported by such a prototype.

2.1 Prototype Design Approach

End-User Participation. Eighteen users in two teams belonging to two rehabilitation centres participated in the design process. The first team was composed of one rehabilitation doctor and two occupational therapists. Four patients took part in the evaluation. The second team was composed of two physiotherapists, one occupational therapist and one rehabilitation doctor. Six patients participated in the evaluation. The other members of the project were one rehabilitation doctor who served as an expert, and 4 designers.

Method and Resulting Prototype. At the beginning of the project, meetings were organized to gather user needs. Users felt that VR technologies with haptic feedback could be useful for rehabilitation, but it seemed difficult for them to define and express their needs precisely. Therefore, it was decided to follow a prototype-based methodology to support the elicitation of user needs both for the hardware and for the software aspects of the future rehabilitation system. Thus, the next step was to define a few concrete rehabilitation exercises together with the hardware and software applications to support them. Four exercises, mainly dedicated to rehabilitation of the shoulder, were defined by the therapeutic teams with the help of the designers, who intervened in order to ensure the technical feasibility. These exercises were: (1) Crank wheel rotation, which implied movements of the shoulder; (2) Pong exercise, which consisted in catching a ball with a virtual racquet moving horizontally or vertically; (3) Trajectory exercise, in which the path of movement was defined by the therapist and the patient had to follow it; (4) Library exercise, in which the patient had to pick virtual books from a shelf and put them onto another shelf.
Different parameters could be controlled by the therapist, namely the level of difficulty of the exercises, the friction level of the robot, the weight of the virtual objects manipulated and the amplitude of the movements of the patient's arm. The hardware of the prototype was based on a force feedback device called Virtuose from Haption Inc, built on technologies developed by CEA/LIST in France. It is a six degrees-of-freedom arm with a working space 40 cm on a side, ending with a shovel-shaped handle to allow manipulation by spastic patients, i.e. patients with excessive muscle tone. This device can be considered as a high-fidelity hardware prototype for VR applications. The general GUI developed for the purpose of the evaluation was rather simple. It was specified and implemented by a software partner.

2.2 Prototype Evaluation Approach and Needs Collection

Evaluation Approach. Each team could use the system for one week. No common methodology had been defined for the evaluation, because each team worked rather differently. Instead, it was decided that the therapists should freely choose their patients and the parameters of the exercises. Thus, the first team proceeded with a systematic evaluation of the 4 exercises for each parameter and with each patient. The therapists reported the problems they
encountered during the evaluation as well as the positive aspects of the system in a document. The second team decided to use the prototype as regular rehabilitation equipment. The parameters were not systematically tested, but each patient could practise the exercises selected by the therapists according to her pathology and her capacities. Four patients used the system during a one-hour session each. Two patients used the prototype 3 times during the week in order to get an idea of the interest of the device for a longer rehabilitation period. All evaluation sessions were filmed and post-analysed.

Data Analysis for Needs Collection. There were three steps in this analysis. First, the observations collected from the video data and from the written document produced by the first team were classified according to 3 exhaustive categories, namely (1) the problems encountered either by the therapist or by the patient; (2) the positive judgments expressed by the therapists during the evaluation and (3) the new ideas for the enhancement of the system, hereafter referred to as "Needs". The comments of the patients were not analysed because they made few criticisms and suggestions. Second, the observations were further coded in 7 classes according to the central aspect they concerned. The first class is hardware, covering aspects related to the robot, its handle and support. The second class concerns software aspects. The third class concerns all general settings (i.e. the overall accessibility for patients and therapists). The other classes concern the content of the four rehabilitation exercises defined. Last, a plenary meeting was organised to gather stakeholders' feedback on the evaluation and its results. The evaluation results were presented to the stakeholders and were used to elicit more suggestions on the possible enhancement of the system. The suggestions gathered during the meeting came mainly from the designers.
They were coded as "Needs", since they corresponded to probable requirements constructed on the basis of users' feedback on the utility of the prototype and on the problems encountered during the evaluation.

2.3 Major Results

Eighteen observations are positive comments distributed across the 7 classes. These observations show the great enthusiasm of the therapists. There are also 76 problems reported, which are not uniformly distributed across the categories. Most problems found (n=46, 61%) concern the exercises. Many of the problems expressed also concern the hardware (n=18, 24%). They are mainly related to the handle of the robot and some security aspects. The general setting is also the object of criticism (n=9, 12%). The therapists criticize mainly its accessibility and the installation of the patients. Software was rarely criticized (3 times, i.e. 3%).

Seventy-four needs were collected during the meetings, from the observations and from the written report. Among these needs, 84% (n=62) are directly related to the concrete problems encountered during the evaluation and 16% (n=12) are needs expressed independently. Thus, the distribution of the needs closely follows the distribution of the problems encountered. User needs are not uniformly distributed across the classes. Most users' needs are expressed on exercises (45, i.e. 61%). This result shows
good creativity on rehabilitation exercises in order to improve their efficiency and to maintain the motivation of the patient. Twenty percent of the needs (n=15) concern hardware. Most of them are ideas for the improvement of the handle. However, no major modifications of the general design and setting have been suggested. The most important problem concerning the general setting is the lack of compensation for the weight of the patient's arm. In fact, many patients could not move their arm upward because they lacked arm strength, and the prototype could not compensate for this weight in order to provide gravity-free movements. The suggestions in this direction were provided by designers only. They proposed a new mechanical architecture in order to enhance the motor capacities of the system, hoping this solution would solve the problem of compensating for the weight of the patient's arm.
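The two-step coding described in Section 2.2 (three categories, then seven classes) and the percentages reported in Section 2.3 can be expressed as a simple tally. The following is a minimal sketch, not the authors' analysis tool: the names Observation, tally and shares are hypothetical, and the class labels are paraphrased from the text. Only the counts in REPORTED_PROBLEMS and NEED_ORIGIN are taken from the paper, where the four exercise classes are pooled.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple

# Coding scheme paraphrased from Section 2.2 (labels are illustrative).
CATEGORIES = ("problem", "positive", "need")
CLASSES = ("hardware", "software", "general setting",
           "crank wheel", "pong", "trajectory", "library")


class Observation(NamedTuple):
    category: str  # one of CATEGORIES
    klass: str     # one of CLASSES


def tally(observations: Iterable[Observation], category: str) -> Counter:
    """Count the observations of one category per class."""
    return Counter(o.klass for o in observations if o.category == category)


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each label's share of the total, in percent."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Counts reported in Section 2.3 (exercise classes pooled as in the text).
REPORTED_PROBLEMS = {"exercises": 46, "hardware": 18,
                     "general setting": 9, "software": 3}
NEED_ORIGIN = {"related to a problem": 62, "expressed independently": 12}

if __name__ == "__main__":
    for label, pct in shares(REPORTED_PROBLEMS).items():
        print(f"{label}: {pct:.1f}%")
```

For example, the exercises account for 46 of the 76 reported problems, which is the 61% quoted above.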
3 Case Study 2

This second study concerned an evaluation carried out with a low-fidelity MR prototype for guiding train maintenance tasks. The MR prototype had to provide contextual repair information to inexperienced train maintenance technicians by overlaying computer-generated graphics and textual repair instructions on a real piece of equipment. This was a formative evaluation done during the early design of the prototype. Its initial objective was to rapidly assess the usability and the usefulness of the prototype for this type of task. The results of the evaluation had to serve the redesign of the MR prototype. Moreover, the evaluation had a second, indirect goal, which concerned the clarification of the role of low-fidelity prototypes for user needs elicitation. It is this aspect that is mainly developed below. The usability aspects have been reported elsewhere [14].

3.1 Prototype Design Approach

End-User Participation. The participation of real future users was impossible for two main reasons. First, the industrial partner did not want to provide access to real end-users to participate in the design and test phase. Second, the prototype we evaluated possessed very few of the functions of the future finished product and we were afraid that the technology might be rejected.

Resulting Prototype. The MR prototype was a video see-through system where a handheld tablet-PC was used as an augmented window. Thus, repair instructions in textual and graphical form as well as 3D models and 2D graphics and animations were overlaid on a real piece of equipment. The user interacted with the resulting multimedia content using a simple WIMP interface and a pointing device. The tracking and registration were done by a camera attached to the tablet-PC. The total weight of the prototype was 2 kg and it hung from the user's neck.

3.2 Prototype Evaluation Approach and Needs Collection

Subjects.
Ten subjects (6M, 4F), all of them computer and electronics engineers in one of the laboratories working on the project, participated in the study. A control group of 5 subjects used standard paper repair instructions. The experimental group
used MR-based guidance. The participants were aged from 26 to 46 (M = 30, SD = 6). Because of their professional background, they were all familiar with VR and MR technologies.

Experimental Task. The task was rather simple and consisted in removing nuts and washers from a real piece of equipment. The task comprised 9 steps demanding one or more user actions. The task and the real piece of equipment were identical for both the paper-guided and the MR-guided group.

Procedure and Data Analysis. All the user evaluation sessions were filmed. The evaluation required two evaluators: one filmed the user's actions, while the other helped the user when technical breakdowns occurred. Participants were allowed to explore the prototype freely. Then, they were asked to perform the maintenance task. The participants could ask any question about the prototype and its usage. After the test sessions, the subjects using MR-based guidance were also interviewed on their difficulties, needs and ideas on the potential applications of the prototype in train maintenance tasks. The data obtained (videotapes and verbal protocols transcribed verbatim) were analyzed. The video data analysis focused on total time for task completion (comprising the time for instruction comprehension and the time for manipulation of the real piece of equipment); number of steps successfully completed; number of deviations; eye movements for the search of information, etc. The analysis of the verbal protocols focused on users' difficulties, needs and suggestions.

3.3 Major Results

The user evaluation helped the identification of numerous usability problems (for more detailed results, cf. [14]). Primarily, we found that the MR-guided group completed the repair task more slowly than the paper-guided group (the MR-guided group needed 20% more time to complete the task). At the same time, the participants made the same number of errors whatever the type of guidance.
As for postures and gesture activity, MR users stayed in the same position longer than the users of paper instructions. In general, MR users stood up, while the participants in the paper-based condition mostly squatted down at the real piece of equipment. Bearing these empirical results in mind, it seems logical that, in their post-experimental interviews, 4 subjects did not find the MR prototype useful for the maintenance task chosen as a possible application, since the latter was too simple and could be done without any technological aid. The users' impression of the limited utility of the prototype was reinforced by the numerous registration problems encountered during the evaluation. In fact, the registration problems constitute the main class of all the problems mentioned in the interviews (33% of all the problems). Four of the 5 participants we interviewed did not even wait for a good registration because the process was too long. Thus, they did not use one of the main advantages of MR systems, namely rapid contextual guidance. Furthermore, they did not express any needs for such guidance and for better registration as the task was too simple. In fact, these four participants declared having used mainly the information from the textual
instructions provided by the tablet-PC as well as some visual indications, which the real piece of equipment "afforded". The design of animations was the second class of problem raised by our participants. This problem was mentioned in 25% of the cases, and by all participants. The users found these animations too fast and semantically unclear. The other problems put forward by our test participants concern the usability of the MR prototype (its weight and bulkiness). The MR users offered very few suggestions for the future development of the MR prototype. The number of suggestions is three times smaller than the number of problems mentioned. These suggestions concern primarily the semantics and the colors of animations and the non-usage of registration.
4 Discussion and Conclusions

The two case studies reported above provide qualitatively and quantitatively different pictures of prototype evaluation as a tool for needs elicitation for emerging technologies. These differences may be explained by several factors related to the design and the evaluation approaches chosen in our two case studies.

The first explanatory factor is the degree of fidelity of the prototype evaluated in each case. Our results show that the number of needs expressed by users is greater when working with the high-fidelity VR prototype than when working with the low-fidelity MR prototype. Furthermore, the information collected is richer. In this sense, our results do not support the hypothesis that the concreteness of high-fidelity prototypes would inhibit users' imagination and needs elicitation [16]. A possible interpretation of this result is that a low-fidelity prototype would provoke a focus on minor usability problems, and would thus hinder the emergence of ideas about redesign, possible applications or anticipated benefits of the technology. Therefore, an accurate representation of a future emerging technology and of what is technically feasible could be beneficial for user needs elicitation.

The second explanatory factor is the representativeness of users. The first case study was done with real future users, whereas their participation was not possible in the second one. This fact could explain the richness of data in the first case, where the actual future users were really motivated to contribute to the design of a potentially useful future emerging technology. Furthermore, the therapists used their knowledge of devices currently used in their daily work in order to provide suggestions for the improvement of the VR prototype. On the contrary, in the second case, the MR technology was almost perceived as a funny new toy, and the evaluation sessions as a new game.
Therefore, less effort was made to elicit user needs and suggestions for improvement. In the same vein, in the first case, the experimental tasks were real user tasks, whereas, in the second case, the task was too simple to be perceived as real. Moreover, as participants were not representative of future users, they had only a vague idea of the real tasks. We may reasonably expect the elicitation of a greater number of needs if the prototype evaluation is done with real future users and on real tasks.

The discussion about the representativeness of users and tasks naturally introduces some remarks on users' roles in the design of emerging technologies. In the first case study,
the design process was longer and richer (e.g. it comprised several meetings and evaluation sessions). Therefore, users were regularly involved in design in a participatory manner. This fact allowed a co-construction and a gradual evolution of their needs. In this sense, users acted as co-designers. In the second study, users intervened locally as test participants. Therefore, they elicited a small number of needs, which could not be further discussed and developed in direct interactions with designers.

Last but not least, the evaluation setting and approach were different in the two cases. In the first case, the prototype evaluations were done in a real work setting and the therapists could freely choose their patients and the parameters of the exercises. In the second case, the evaluation was done in a laboratory setting in a quite formal manner. Our hypothesis is that the artificial setting and the formal comparison between the traditional guidance and the MR-guidance could partially explain (1) the poorer performance of the MR-group and (2) the small number of needs elicited by test participants. We could reasonably expect better results if the evaluation were done in a real work setting, using less formal approaches (e.g. focus groups).

In conclusion, the evaluation of high-fidelity prototypes as well as iterative prototyping seem efficient ways to point out some problematic aspects of design, which could further serve to elicit new needs. We think that this process would be most beneficial if done in close cooperation with designers, whose ideas on what is technically feasible are very important in the field of emerging technologies. In the future, we plan to evaluate more prototypes of different emerging technologies, and in different evaluation settings, in order to obtain more formal and convincing results on prototype evaluation as a tool for user needs elicitation for emerging technologies.
References

1. Benko, H., Ishak, E.W., Feiner, S.: Collaborative Mixed Reality Visualization of an Archaeological Excavation. In: Paper presented at the 3rd IEEE/ACM ISMAR 2004, Arlington, VA (November 2004)
2. Nielsen, J.: Usability Engineering. AP Professional, Cambridge (1994)
3. Robertson, S.: Requirements Trawling: Techniques for Discovering Requirements. Int. J. H.-C. St. 55, 405–421 (2001)
4. Brangier, E.: Besoin et Interface. In: Akoka, J., Comyn-Wattiau, I. (eds.): Encyclopédie des Nouvelles Technologies. Vuibert, Paris, pp. 1070–1084 (2006)
5. Kjeldskov, J.: Human-Computer Interaction Design for Emerging Technologies: Virtual Reality, Augmented Reality and Mobile Computer Systems. PhD Thesis, Aalborg University, Aalborg (2003)
6. Anastassova, M., Burkhardt, J.-M., Mégard, C., Ehanno, P.: L'ergonomie de la Réalité Augmentée pour L'apprentissage: une Revue. Le Tr. H (in press)
7. Gabbard, J.L., Swan II, J.E., Hix, D., Lanzagorta, M., Livingston, M., Brown, D., Julier, S.: Usability Engineering: Domain Analysis Activities for Augmented Reality Systems. In: Woods, A., Merritt, J., Benton, S., Bolas, M. (eds.): The Engineering Reality of Virtual Reality 2002. SPIE Vol. 4660, pp. 445–457 (2002)
8. Träskbäck, M., Haller, M.: Mixed Reality Training Application for an Oil Refinery: User Requirements. In: Paper presented at VRCAI 04, Singapore (June 2004)
9. Anastassova, M., Burkhardt, J.-M., Mégard, C., Ehanno, P.: L'ergonomie de la Réalité Augmentée pour L'apprentissage: une revue. Le Tr. H, vol. 70, pp. 97–125 (2007)
10. Leonard, D., Rayport, J.F.: Spark Innovation Through Empathic Design. Harv. Bus. Rev. 6, 102–113 (1997)
11. Sperandio, J.-C.: Critères Ergonomiques de L'assistance Technologique aux Opérateurs. In: Paper presented at JIM'2001: Interaction Homme – Machine & Assistance, Metz, France (July 2001)
12. Schneider, K.: Prototypes as Assets, not Toys. Why and How to Extract Knowledge from Prototypes. In: Proceedings of IEEE ICSE-18 (1996)
13. Holmquist, L.E.: Prototyping: Generating Ideas or Cargo Cult Designs? Interactions, pp. 48–54 (March-April 2005)
14. Anastassova, M., Burkhardt, J.-M., Breda, J., Mégard, C.: Evaluation Ergonomique d'un Prototype de Réalité Augmentée par des Tests Utilisateurs: Apports et Difficultés. In: Paper presented at ErgoIA 2006, Biarritz, France (October 2006)
15. Sutcliffe, A.: User-Centred Requirements Engineering. Springer, Heidelberg (2002)
16. Lindgaard, G., Dillon, R., Trbovich, P., White, R., Fernandes, G., Lundahl, S., Pinnamaneni, A.: User Needs Analysis and Requirements Engineering: Theory and Practice. Int. Comp. 18, 47–70 (2006)
17. Liu, L., Khooshabeh, P.: Paper or Interactive? A Study of Prototyping Techniques for Ubiquitous Computing Environments. In: Paper presented at CHI 2003, Ft. Lauderdale, FL, USA (April 2003)
Long Term Usability; Its Concept and Research Approach – The Origin of the Positive Feeling Toward the Product
Masaya Ando1 and Masaaki Kurosu2
1,2 The Graduate University for Advanced Studies (SOKENDAI), [email protected]
2 National Institute of Multimedia Education, [email protected]
1 Introduction

Many people believe that a washing machine, for example, needs only minimal functions, and they feel no affection for such a machine. Today, however, some users do feel affection for, or a positive attachment to, a washing machine equipped with features such as a slanted drum and an antibacterial function.

− To date, users' subjective evaluation of a product or artifact has been understood as customer satisfaction (CS). It is more important, however, to let users develop a positive feeling that goes beyond simple satisfaction; this is the goal of the concept of long-term usability.
− The fundamental question here is: "How do users come to feel affection for, or a positive attachment to, a product?"

Table 1. Positive feelings towards artifacts
2 Feelings Toward Artifacts

Table 1 summarizes positive feelings towards artifacts. Usability is usually regarded as a quality of the product, but the feelings listed in Table 1 differ from quality and represent a positive involvement with the artifact. In other words, using the artifact triggers an intrinsic motivation in the user. Among the various theories of customer satisfaction, the adaptation level theory of Helson (1959) is prominent in expressing the positive involvement of users. According to this theory, satisfaction is determined by the balance between the assumed level of quality and the actual performance of the artifact. However, the positive feelings listed in Table 1 include feelings that are unrelated to the degree of conformity to the assumed level. In other words, these feelings cannot be analyzed with the model of customer satisfaction.
3 A Survey on the Structure of Positive Feelings

The author conducted a survey on products that users had been using for a long time, applying the "cooperative drawing of usage history chart" method, in which a history chart is drawn on the basis of an interview. The informants were 9 people, including 2 females, and 27 product items were covered. After the history chart from initial use to the present had been drawn, informants were asked to draw a line graph of their satisfaction and affective feeling toward the product. They were then asked about the reasons for the changes in the graph. The major findings of the survey were as follows:

− Users do not have a positive feeling toward every product item.
− Positive feeling does not exist at initial use but grows during long-term use.
− Users experienced changes in the way they used the product and in their feeling toward it for various reasons, including changes in the context of use, accidental operations, and reference information.
− Such a change is related to the user's sympathy with the characteristics of the product.
− In other words, if the user has a sympathetic experience with the characteristics or the concept of the product, s/he may develop a positive feeling toward a product that s/he had not evaluated positively because of its poor usability.
− This kind of positive feeling can trigger a positive cycle in which the user begins to use the product more intensively or to customize it.
4 Relationship Between the Theory of Flow Experience by Csikszentmihalyi and the Use of Product

There are many theories of intrinsic motivation, among which the theory of Csikszentmihalyi is interesting because he studied the structure of the feeling of pleasure in autotelic (self-purposed) activities such as chess and rock climbing. He called the feeling of involvement in an activity "flow". Flow can be felt when there is a balance between the opportunity (challenge) for action and the skill of the person, as shown in Figure 1. It should be noted that challenge and skill here are not objective concepts but are based on the person's subjective perception. Wiedenbeck and Davis (2001) found that differences in interaction style and learning experience influence whether flow is experienced in operating a computer. They also found that flow experience affects the perceived ease of use and usability of application software. It could be said that a positive feeling toward a product can be generated by a sympathetic experience with the product, i.e. a flow experience.
Fig. 1. Model of Flow State (Csikszentmihalyi 1975)
Csikszentmihalyi listed the elements that compose the model of flow experience as follows. He also presented ways to attain the flow experience.

− to have the insight to achieve the task
− to be able to concentrate on what one is doing
− to have a clear goal for what one is doing
− to have direct feedback on the task
− to be in deep but natural absorption
− to have the feeling of controlling one's own behavior
− to lose self-consciousness
− to experience a changed sensation of time
5 Discussion

Wiedenbeck and Davis reported that skill improved for those who were given a challenge level higher than their skill. Based on this evidence, we can consider strategies that intentionally let the user have a quasi-sympathetic experience. It might be possible to devise a new procedure through which many people attain a higher level of satisfaction and positive feeling.
References

1. Csikszentmihalyi, M.: Beyond Boredom and Anxiety: Experiencing Flow in Work and Play. Jossey-Bass (1975); tr. by Imamura, H., Shisakusha (2000)
2. Shimizu, S.: New Consumer Behavior. Chikura-shobo (1999) (in Japanese)
3. Wiedenbeck, S., Davis, S.: Intrinsic Motivation, Ease of Use and Usefulness Perceptions as Mediators in Computer Learning. Proc. HCI International 2001 1, 1553–1557 (2001)
General Interaction Expertise: An Approach for Sampling in Usability Testing of Consumer Products
Ali Emre Berkman
METU - CADCAM / ŪTEST Product Usability Unit, Fac. of Architecture no. 21 Ankara – Türkiye
[email protected]
Abstract. As digital technology flourished, modes of interaction pertaining to computer systems started to be utilized in consumer products. As a consequence, problems peculiar to software began to be observed in once simple-to-operate products. In order to overcome these problems, usability testing, one of the most versatile tools utilized during the design and evaluation stages of software development, was introduced to the domain of consumer products. However, both literature findings and the author's personal experience show that there are problems with sampling, since participants' prior experience with digital interfaces seems to affect test results more in the case of consumer products. In this study, after a theoretical discussion, the measurement tool being developed to control general interaction expertise (GIE) is presented. In the preliminary studies of predictive validity, correlation coefficients of up to 0.76 were detected between test scores and usability performance.

Keywords: user expertise, usability testing, consumer products, sampling.
For professional products, it is usually possible to determine the characteristics of users and 'choose' the ones that best represent the actual population with the help of observable attributes.¹ In the case of consumer products, working on homogeneous 'subsets' is not plausible most of the time, given that such products are usually intended for a larger portion of the population.² Therefore, many user characteristics that vary both quantitatively and qualitatively should be considered.

1.1 Problems with Heterogeneity

The causes and consequences of this methodological phenomenon may best be illustrated with a speculative example. Suppose that during the development of an innovative cellular phone, the manufacturer wants to see whether the new interface is easy to use. Furthermore, the manufacturer wants to verify that basic functions can be easily used by all users. Although usability testing would be the right choice to fulfill those quite specific needs, the test would not be able to yield unambiguous results. First of all, the manufacturer would never know whether the sample was representative enough to infer that 'basic functions can be easily used by all users', regardless of the level of success observed in the tests. Even if the types and frequencies of the usability problems observed are concentrated on instead of, or together with, effectiveness and efficiency data, the problem is not even slightly alleviated. The fact that the variance observed in performance may be explained by individual differences causes methodological problems, and is hard to neglect, especially in the case of consumer products. For example, in the speculative case provided, some participants may not be able to complete even a single task successfully; interpreting this result would be far from trivial. Was it the interface that caused too many problems for the participants? Was it the participants' lack of expertise with such interfaces? Were the participants representative enough of the intended users of the product?

Although the need for representative sampling finds support in the literature, suggestions about the factors to be considered are divergent. The primary aim of any usability test should be to observe the effect of interface design on user performance and to eliminate all other interfering factors. Egan [4] states that variability in performance of up to 20:1³ can be explained by differences among users, regardless of design or other factors. Although how an interface is designed should have a strong determining effect on user performance in a usability test,⁴ and although usability practitioners keep informing participants that what they test is the interface and not the participants' abilities, it is usually the participant's familiarity with digital interfaces that is reflected in the results. Experiential factors, among other individual differences, are known to have a significant effect on performance (e.g. [5], [6]).

¹ Attributes such as age, occupation, level of education, instead of hard-to-measure latent traits.
² Literally, apart from those for whom the product is intended, everybody that has access to the market is a potential user in the case of consumer products. For example, everybody in the world is theoretically a potential user of an mp3 player produced by a global company.
³ This difference in performance was observed with professional software, so ratios of more than 20:1 may be expected in the case of consumer products.
⁴ Actually, this is the very motive behind testing. In conditions where this assumption is violated there is no possibility of turning test results into design recommendations.
1.2 Experience, Expertise, and Attitudes

In this study, the model suggested in Fig. 1 will be partially utilized for comprehending the relationship between what is experienced (experience) and what is retained, i.e. permanent cognitive changes (expertise and attitudes). The term suggested for expertise as it is formulated in this study is General Interaction Expertise (GIE) [7], which may be defined as:

A general expertise acquired by experiencing several interfaces, which helps users to cope with novel interaction situations.
Fig. 1. Triad of acquisitions
This triadic model is in line with Bandura's social learning theory [8]. According to the model, as users experience a diversity of interfaces⁵ over time, they start to gain expertise and form self-images. Bandura [8] suggests that individuals possess a self system called self-efficacy, which enables them to influence their cognitive processes and actions. It may be stated that during the acquisition of GIE (expertise) through experience, a General Interaction Self-efficacy belief (attitude) is synchronously built.

⁵ See [7] for an elaborate discussion on how specific experiences with individual interfaces lead to a general expertise.

2 Assessment of GIE

Like the assessment of constructs such as computer literacy, the assessment of GIE may be done in many ways. Bunz, Curry and Voon [9] argue that experience has actual and perceived facets. The latent construct of GIE can be observed either as competency in interaction performance (actual expertise) or in the form of self-perceptions (perceived expertise). Although both approaches are plausible, the rest of this study explores the assessment of GIE in accordance with so-called 'actual' expertise, in other words, how to assess GIE through the observation of performance.

2.1 Recognition of Expert Behavior

According to Norman [10], human action consists of two main components. For our goals to be fulfilled, we must be able to perceive and evaluate the current state of the world. This is followed by a set of actions for changing the world so that our goals are accomplished.
Therefore, the steps of the cycle presented in Fig. 2 continuously follow each other until "the world" is transformed so that our goals are satisfied. However, whether the flow is smooth or constantly interrupted, and whether a single iteration is enough or the cycle is run many times, depends on the characteristics of the components of interaction. The cycle may be so internalized by the user that both the concretization of goals and the interpretation of the world become minimally crucial. Taken to the extreme, executions may dominate the cycle; that is, automatic processing may take place, minimizing even the need for perception in the form of feedback. At the other extreme, there may be cases where a sequence of actions is not readily available, or "interpreting the perception" is not possible. This usually occurs when people confront serious problems with a known system, or when they come across a totally novel interface. In such cases, translating the intention to act into a meaningful sequence of actions and transforming perceptions into evaluations may be problematic. In their seminal work, Human Problem Solving, Newell and Simon [11] argue that "[a] person is confronted with a problem when he [sic] wants something and does not know immediately what series of actions he can perform to get it" (p.72). According to them, together with the apparent qualities pertaining to experts, such as extensity and intensity of interface experience, efficacy in building internal representations when the problem is ill-defined and flexibility in exploring a diversity of methods to obtain the desired outcomes seem to be distinguishing qualities of expert problem solving.

2.2 Development of Apparatus Tests

Following the theories discussed here and in previous studies [7], it is possible to formulate GIE as consisting of two fundamental behavioral components: automatic loops of execution and evaluation, and controlled problem-solving behavior. These two distinct behavior categories constituted the framework for the development of the apparatus tests.

2.2.1 GIE_XEC

The task consisted of three simple sub-tasks, assumed to fall into the domain of automatic loops of execution and evaluation defined previously. Task content was deliberately reduced so as to eliminate the direct effects of specific experiences. Task difficulty and novelty were adjusted to a level at which indications of automatic processing would provide a partial estimate of participants' GIE for the specific case. Before the administration of the test, step-by-step instructions were provided so that task goals and the methods of achieving them were clear. Therefore, no problem-solving behavior was expected to be involved in completing the sub-tasks. The steps to complete one trial were as follows:

Sub-task 1: Navigate and choose modify ('değiştir'),
Sub-task 2: Navigate and choose 'P',
Sub-task 3: Complete the required modifications and choose confirm ('onay').

Keystroke data⁶ were recorded for 6 successive trials, and the mean elapsed times to complete trials (except trial 1) were assigned as GIE_XEC scores.
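The scoring rule for GIE_XEC can be illustrated with a short sketch. It assumes, hypothetically, that the elapsed time of a trial is the sum of its recorded keystroke latencies; the function names and the example figures are illustrative and are not taken from the study.

```python
from statistics import mean


def trial_elapsed(latencies):
    """Elapsed time of one trial, assumed here to be the sum of its keystroke latencies."""
    return sum(latencies)


def gie_xec_score(trial_times):
    """Mean elapsed time over trials 2..6; trial 1 is excluded, as in the text."""
    if len(trial_times) != 6:
        raise ValueError("expected elapsed times for exactly 6 trials")
    return mean(trial_times[1:])


# Hypothetical elapsed times (seconds) for one participant, trials 1..6.
times = [12.0, 8.0, 7.0, 6.0, 5.0, 4.0]
score = gie_xec_score(times)  # lower score = faster execution
```

Because the score is a time, a lower value indicates more fluent execution, which is why the correlations with effectiveness reported later are negative.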
Fig. 3. Screenshots for each sub-task in GIE_XEC

⁶ Data consisted of keys pressed and keystroke latencies.

When the 6 trials were treated as separate items, the odd-even⁷ reliability coefficient for the instrument was 0.96 (N = 71). However, there was a statistically significant learning effect, manifested in the difference between mean scores for the odd and even groups (p < 0.01).

2.2.2 GIE_PS

Considering a pre-defined set of heuristics, one problem situation was chosen from among many alternatives to be developed as an apparatus test. The task consisted of reproducing a pattern of shapes shown to participants, so that the pattern displayed on the interface screen exactly matched the goal pattern. The interface elements were a display and five push buttons. Three of the buttons were located under the screen, each coupled with a one-digit numerical display. A button labeled with an arrow pointing towards the screen was positioned on the right (redraw button). An auxiliary button labeled "tamam" (done) was positioned between the pattern card and the screen. By pushing that button, participants could indicate that the task was successfully completed (see Fig. 4). The parameters that could be manipulated were not described to participants. At the beginning of the test, the aim of the test was briefly described, together with some limited instructions about the task.
Fig. 4. Screenshot of the user interface for the GIE_PS
A typical sequence of actions taken by an expert user for accomplishing the task would be as follows: (1) select the slot to be filled with the leftmost button, (2) modify the type parameter with the middle button, (3) select the appropriate value for the color parameter with the rightmost button, (4) press the redraw button to see the results, (5) after the goal state is reached, press the button labeled "done".

⁷ Data were grouped as trials 1, 3, 5 and trials 2, 4, 6. Mean values for each group were computed and Pearson's product-moment correlation coefficients were obtained.
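The odd-even reliability reported for GIE_XEC in Section 2.2.1 follows the grouping described in the footnote above, and can be sketched as below. The sketch assumes each participant contributes six elapsed times; the helper names are hypothetical, and no correction for test length is applied because the text mentions only the Pearson coefficient.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def odd_even_reliability(times_by_participant):
    """Correlate each participant's mean over trials 1,3,5 with the mean over trials 2,4,6."""
    odd = [mean(t[0::2]) for t in times_by_participant]   # trials 1, 3, 5
    even = [mean(t[1::2]) for t in times_by_participant]  # trials 2, 4, 6
    return pearson_r(odd, even)


# Hypothetical data: three participants, six trial times each.
sample = [
    [1, 2, 1, 2, 1, 2],
    [2, 4, 2, 4, 2, 4],
    [3, 6, 3, 6, 3, 6],
]
r = odd_even_reliability(sample)
```

A learning effect of the kind reported would appear here as a systematic difference between the odd and even means, which a correlation coefficient alone does not reveal.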
2.3 Preliminary Validity Study I

After pilot tests to fix bugs and operational problems, GIE_XEC was administered in a real usability test to explore whether there is a considerable correlation between usability performance and the independent variables gathered during observations. Usability performance data were gathered during a user test of a dishwasher with a menu-driven interface, which consisted of an LCD, one rotary knob, six shortcut pushbuttons, an on/off button and a flow control button. The total effectiveness score across the 7 task scenarios applied to a sample of 15 participants was assigned as the dependent variable representing user performance (effectiveness). Partial effectiveness scoring was avoided, since an objective way of determining partial scores seemed to be impossible. Therefore, in cases where participants could not totally complete a task as defined, effectiveness was scored as 0. GIE_XEC scores were assigned as the main independent variable. In addition, the number of visual feedbacks that users got in order to re-orientate their fingers or before pressing keys (feedbacks) and the number of errors made (errors) were included in the analyses to explore any other significant correlations.

The results indicate a strong correlation between effectiveness and feedbacks (-0.60), and between effectiveness and GIE_XEC (-0.68). The correlation between errors and effectiveness was not found to be strong (-0.17). Another finding concerned the type of relationship between GIE_XEC and effectiveness. During the analyses it was suggested that there may be a non-linear relationship between the variables. After GIE_XEC scores were log-transformed and re-analyzed, the correlation between GIE_XEC and effectiveness increased in magnitude to -0.74. However, it is too early to draw a conclusion from this result.
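The all-or-nothing effectiveness scoring and the comparison of raw versus log-transformed GIE_XEC scores can be sketched as follows. The base of the logarithm is not stated in the text, so base 10 is an assumption here, and the data are invented purely to show the mechanics.

```python
from math import log10, sqrt
from statistics import mean


def effectiveness(task_completed):
    """Total effectiveness: 1 per fully completed task, 0 otherwise (no partial credit)."""
    return sum(1 for done in task_completed if done)


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def raw_and_log_correlation(gie_xec, effectiveness_scores):
    """Correlation with effectiveness before and after log-transforming GIE_XEC."""
    raw = pearson_r(gie_xec, effectiveness_scores)
    logged = pearson_r([log10(x) for x in gie_xec], effectiveness_scores)
    return raw, logged


# Hypothetical participants: GIE_XEC times and number of tasks completed out of 7.
xec = [1.0, 10.0, 100.0, 1000.0]
eff = [7, 5, 3, 1]
raw, logged = raw_and_log_correlation(xec, eff)
```

In this invented data set the relationship is exactly linear in the logarithm of the time, so the log-transformed correlation is stronger than the raw one, which mirrors the direction of the change reported above.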
For the performance data collected, it can be stated that:

• Participants who were able to complete the tasks embodied in GIE_XEC more efficiently (quicker) were more successful in using the digital interface tested;
• Participants who were able to complete the tasks embodied in GIE_XEC with fewer feedbacks were relatively more successful in using the digital interface tested.
• Although it seems that GIE_XEC has predictive power, this does not necessarily mean that there is a causal relationship between these variables.

2.4 Preliminary Validity Study II

To gain further insight into the predictive validity of GIE_XEC and to obtain preliminary data with GIE_PS, tests were conducted alongside a comparative usability test, whose aim was to comparatively evaluate 4 washing machines with digital interfaces. For this purpose, 24 participants were allocated to three test groups and each individual interacted with two different interfaces. The two apparatus tests were administered to participants just before or right after the usability test sessions. Whether participants took the test before or after the sessions was not a controlled factor and was determined mainly by the restrictions imposed by test conditions.

The dependent variable representing user performance was the effectiveness attained by each participant across seven tasks and two interfaces. In order to eliminate the effects of differences in the distribution of effectiveness scores for each interface, the scores were standardized. For each apparatus test, elapsed time data were used to represent performance. The correlation coefficients between effectiveness and GIE_XEC scores, and between effectiveness and GIE_PS scores, were -0.69 and -0.46, respectively. After the initial analyses, it was seen that treating elapsed time as the GIE_PS score was quite problematic, because 8 out of 24 participants quit the task without attaining success. Whether or not a participant succeeded in completing the task was thought to better grasp the essence of problem-solving behavior. Therefore, GIE_PS scores were converted into dichotomous pass-fail scores.⁸ The non-linear relationship hypothesized in Study I was also observed in the data set of Study II, and the correlation between effectiveness and log-transformed GIE_XEC scores increased in magnitude to -0.73.

In order to explore predictive validity further, some 'ex post facto' analyses were done. For this purpose, participants were seeded within a 2x2 matrix in accordance with their GIE_XEC and GIE_PS scores (see Table 1).

Table 1. 2x2 score matrix
                High GIE_PS    Low GIE_PS
High GIE_XEC    A              B
Low GIE_XEC     C              D
With this test design, the following hypotheses were tested:

• H1: Mean effectiveness values for participants seeded in the High GIE_XEC row (cells A and B) are higher than those for participants seeded in the Low GIE_XEC row (cells C and D), where the difference between means is D1;
• H2: Mean effectiveness values for participants seeded in the High GIE_PS column (cells A and C) are higher than those for participants seeded in the Low GIE_PS column (cells B and D), where the difference between means is D2;
• H3: Mean effectiveness values for participants seeded in cell A (High GIE_PS ∩ High GIE_XEC) are higher than those for participants seeded in cell D (Low GIE_PS ∩ Low GIE_XEC), where the difference between means is D3;
• H4: The relationship between the differences between means is: D3 > D1 and D3 > D2.

All the related null hypotheses were rejected; that is, there were significant differences between the high and low GIE_XEC groups (D1 = 1.20, p<0.02),⁹ between the high and low GIE_PS groups (D2 = 1.87, p<0.01), and between cell A and cell D (D3 = 2.48, p<0.01).¹⁰ Lastly, the relationship hypothesized in H4 was also observed.

⁸ After this modification, GIE_PS was converted to a test with only a single item, which is actually not acceptable in terms of reliability. However, since the study was explorative in nature, analyses were done even with a single item.
⁹ Note that D1 is the difference between effectiveness scores in the standardized form.
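The seeding of participants into the 2x2 matrix and the differences D1, D2 and D3 can be sketched as below. The text does not state how the high and low GIE_XEC groups were separated, so a median split on elapsed time is assumed here; the field names and the data are hypothetical.

```python
from statistics import mean, median


def seed_cells(participants):
    """Assign standardized effectiveness scores to cells A-D of the 2x2 matrix.

    Each participant is a dict with 'xec' (GIE_XEC time), 'ps_pass' (GIE_PS
    pass/fail) and 'z_eff' (standardized effectiveness). A median split on
    'xec' is an assumption, not the documented procedure.
    """
    cut = median(p["xec"] for p in participants)
    cell_of = {(True, True): "A", (True, False): "B",
               (False, True): "C", (False, False): "D"}
    cells = {"A": [], "B": [], "C": [], "D": []}
    for p in participants:
        high_xec = p["xec"] <= cut  # a shorter time means higher GIE_XEC
        cells[cell_of[(high_xec, p["ps_pass"])]].append(p["z_eff"])
    return cells


def mean_differences(cells):
    """Return (D1, D2, D3) as defined in hypotheses H1-H3; every cell must be non-empty."""
    def pooled(*keys):
        return mean([v for k in keys for v in cells[k]])
    d1 = pooled("A", "B") - pooled("C", "D")  # high vs low GIE_XEC rows
    d2 = pooled("A", "C") - pooled("B", "D")  # high vs low GIE_PS columns
    d3 = mean(cells["A"]) - mean(cells["D"])  # cell A vs cell D
    return d1, d2, d3


# Hypothetical participants, one per cell.
people = [
    {"xec": 5.0, "ps_pass": True, "z_eff": 1.0},
    {"xec": 6.0, "ps_pass": False, "z_eff": 0.0},
    {"xec": 9.0, "ps_pass": True, "z_eff": 0.0},
    {"xec": 10.0, "ps_pass": False, "z_eff": -1.0},
]
d1, d2, d3 = mean_differences(seed_cells(people))
h4_holds = d3 > d1 and d3 > d2
```

The sketch only computes the differences between means; testing them for significance, as done in the study, would require an additional statistical test that is not specified in the text.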
The added value of GIE_PS was shown in the linear model derived:

$$Z = -0.5 \log x + 0.3\,p \tag{1}$$
Z: standardized predicted value; x: GIE_XEC score; p: GIE_PS score.

Utilizing this model, the correlation between effectiveness and the value predicted with formula (1) was 0.76. Given this, it may be stated that GIE scores account for 58% of the variance observed in effectiveness (r² = 0.58). As a result, the partial conclusions drawn in Study I were supported in this study as well. In addition, it may be stated that GIE_PS augmented the predictive power obtained with GIE_XEC alone. However, GIE_PS should be revised so that performance assessment based on many pass-fail type items is possible.
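Formula (1) and the step from r = 0.76 to r² = 0.58 can be checked with a few lines. The base of the logarithm and the coding of the pass-fail score are not given in the text; base 10 and a 1/0 coding are assumed here, and the function names are illustrative.

```python
from math import log10


def predicted_effectiveness(gie_xec, gie_ps_pass):
    """Standardized predicted value Z from formula (1).

    Assumes a base-10 logarithm and GIE_PS coded as 1 (pass) or 0 (fail).
    """
    p = 1 if gie_ps_pass else 0
    return -0.5 * log10(gie_xec) + 0.3 * p


def variance_explained(r):
    """Share of variance accounted for by a correlation coefficient r."""
    return r ** 2


z = predicted_effectiveness(10.0, True)  # -0.5 * 1 + 0.3 = -0.2
share = variance_explained(0.76)         # 0.5776, i.e. about 58%
```

Under these assumptions, passing GIE_PS raises the predicted value by 0.3, and each tenfold increase in GIE_XEC time lowers it by 0.5.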
3 Implications for Research and Future Studies

Preliminary evidence provided in this study indicates that in its fully-fledged form GIE would be a valuable tool for sampling. Measurement of GIE may be used as a means of justifying certain assumptions regarding participant profile, as a way of manipulating GIE as an independent variable, or for ascertaining that the effects of GIE on test results were kept to a minimum. Furthermore, after the determination of normative standards, the tool may also be used to evaluate the usability of interfaces in absolute terms. In other words, it would be possible to identify interfaces that require high levels of GIE and those that do not. A final merit of pre-evaluating participants would be to detect individuals who exhibit intolerable levels of test or performance anxiety before the actual usability test. In addition to the 'performance observation' approach presented in this study, attitudes should also be studied in order to provide an opportunity for triangulation, and to embrace the social aspects of the phenomenon as well.

Acknowledgments. A part of the research was conducted in the testing facilities of METU – CADCAM ŪTEST. The author wishes to thank the researchers in ŪTEST, Ç. Erbuğ, B. Şener, E. Akar, Z. Karapars, P. Gültekin and A. Öztoprak, and all the participants involved in the corresponding usability tests.
References

1. Rosenbaum, S., Chisnell, D.: Choosing usability research methods. In: Proceedings of the IEA 2000/HFES 2000 Congress, pp. 569–572 (2000)
2. Gray, W.D., Salzman, M.C.: Damaged merchandise? A review of experiments that compare usability evaluation methods. Human-Computer Interaction 13(3), 203–261 (1998)
3. Potosnak, K.: Recipe for a usability test. IEEE Software, pp. 83–84 (November 1988)
4. Egan, D.E.: Individual differences in human-computer interaction. In: Helander, M. (ed.) Handbook of human-computer interaction, pp. 543–565. Elsevier, New York (1998)
5. Nielsen, J.: Usability engineering. Academic Press, Boston (1993)
6. Dumas, J.S., Redish, J.C.: A practical guide to usability testing. Ablex, Norwood, NJ (1993)
7. Berkman, A.E., Erbuğ, Ç.: Accommodating individual differences in usability studies on consumer products. In: Proceedings of the 11th conference on human computer interaction, vol. 3 [CD-ROM] (2005)
8. Bandura, A.: Social foundations of thought and action. Prentice Hall, New Jersey (1986)
9. Bunz, U., Curry, C., Voon, W.: Perceived versus actual computer-email-web fluency. Computers in Human Behavior [article in press]
10. Norman, D.A.: The design of everyday things. Currency, New York (1990)
11. Newell, A., Simon, H.A.: Human problem solving. Prentice Hall, Englewood Cliffs (1972)
Are Guidelines and Standards for Web Usability Comprehensive?
Nigel Bevan1 and Lonneke Spinhof2
1 Professional Usability Services, 12 King Edwards Gardens, London W3 9RG, UK
2 Centre for Usability Research-K.U. Leuven, Parkstraat 45 bus 3605, 3000 Leuven, Belgium
[email protected], [email protected]
Abstract. A previous paper compared the 110 guidelines in ISO CD 9241-151 with the 187 guidelines produced by the U.S. Department of Health and Human Services (HHS) and found that 76% of the HHS guidelines and 54% of the ISO guidelines were unique. New versions of both the original 2004 documents were issued in 2006, but 71% of the HHS guidelines and 46% of the ISO guidelines are still unique. Neither set of guidelines is easy to use for an expert review of whether a web site complies with the guidelines. A more comprehensive checklist has been developed, based on the HHS and ISO guidelines, but extended to include additional research-based guidelines on privacy and security and e-commerce. It is complemented by a handbook describing each guideline in more detail, illustrated with an example, and with an explanation of how it should be tested and when compliance can be stated.
• A brief statement of the overarching principle that is the foundation of the guideline.
• Comments that further explain the research/supporting information.
• Citations to relevant web sites, technical and/or research reports supporting the guideline.
• A score indicating the "Strength of Evidence" that supports the guideline. These range from "Strong Research Support," indicating that there is at least one formal, rigorous study with contextual validity and agreement among experts, to "Weak Research Support," indicating limited evidence and disagreement among experts.
• A score indicating the "Relative Importance" of the guideline to the overall success of a web site. These scores range from 1 to 5 and are intended to help usability experts and web designers prioritize the implementation of these guidelines.
• One or more graphic examples of the guideline in practice.

ISO is developing an International Standard to provide recommendations for the user-centered design of web user interfaces. The recommendations cover much the same scope as HHS, but are documented in a more concise format appropriate for an international standard. The ISO document distinguishes between design, process and evaluation aspects of web development. However, since the development process and evaluation are already covered by other ISO standards, it focuses on the design aspects, and provides design guidance and recommendations in four major areas:

• High-level design decisions and design strategy.
• Content design.
• Navigation and search.
• Content presentation.
ISO 9241-151 primarily contains material that is unique to the web, so some topics covered by HHS are omitted from ISO 9241-151 as they are covered by more general ISO standards, in particular:
• Design Process and Evaluation: ISO 13407, ISO TR 16982, and ISO 9241-11.
• Accessibility: ISO TS 16071 and WAI Guidelines.
• Lists: partly covered by ISO 9241-12.
• Screen-based Controls (Widgets): ISO 14915-2.
• Graphics, Images, and Multimedia: ISO 14915-3.
This means that for complete guidance on the web, readers have to acquire additional standards and identify the parts that are relevant. This is not easy to do, particularly as some interpretation is needed to apply the material in other standards to the web. The previous paper showed that 76% of the HHS guidelines and 54% of the ISO guidelines were unique. In 2006, new versions were published of both the HHS and ISO guidelines. The new HHS document has added 22 new guidelines and updated 30 more, and the ISO document has been extensively revised with 31 additional recommendations. The 196 HHS guidelines (excluding usability testing) and 141 ISO recommendations are listed in Table 1, showing the ISO topics that appear to be most closely equivalent to
Are Guidelines and Standards for Web Usability Comprehensive?
each HHS guideline (or category of guideline). Partially corresponding ISO guidelines are shown in italics. Apparently conflicting guidelines are shown in bold.
Table 1. Comparison of HHS and ISO guidelines (* = HHS importance rating)
* HHS Guideline ISO 9241-151 Recommendation
Design Process and Evaluation
5 1:1
Provide Useful Content
5 1:2
Establish User Requirements
5 1:3
Understand and Meet User’s Expectations
5 1:4
Involve Users in Establishing User Requirements
4 1:5
Set and State Goals
4 4 4 3 2 1
Focus on Performance Before Preference Consider Many User Interface Issues Be Easily Found in the Top 30 Set Usability Goals Use Parallel Design Use Personas
1:6 1:7 1:8 1:9 1:10 1:11
7.1.3 Appropriateness of content for the target group and tasks 7.1.4 Completeness of content 6.3 Analysing the target user groups 7.1.2 Designing the conceptual model 7.1.5 Structuring content appropriately 8.3.2 Choosing suitable navigation structures 6.2 Determining the purpose of a Web application 6.5 Matching application purpose and user goals
6.4 Analysing the users’ tasks 6.6 Prioritising different design goals 6.7 Prioritising different design goals 6.11 Coherent multi-site strategy 7.2.2 Independence of content, structure & presentation 9.3.5 Visualising temporal status 9.3.12 Consistency across related sites 10.6 Using generally accepted technologies & standards 10.7 Making Web user interfaces robust 10.8 Designing for input device independence
Optimizing the User Experience
5 4 4 4 4 4 4
2:1 2:2 2:3 2:4 2:5 2:6 2:7
4 2:8 4 4 4 4 3 3 3 2
Do Not Display Unsolicited Windows or Graphics Increase Web Site Credibility Standardize Task Sequences Reduce the User’s Workload Design For Working Memory Limitations Minimize Page Download Time Warn of ’Time Outs’ Display Information in a Directly Usable Format
2:9 Format Information for Reading and Printing 2:10 Provide Feedback when Users Must Wait 2:11 Inform Users of Long Download Times 2:12 Develop Pages that Will Print Properly 2:13 Do Not Require Users to Multitask While Reading 2:14 Use Users’ Terminology in Help Documentation 2:15 Provide Printing Options 2:16 Provide Assistance to Users
8.3.11 Avoiding opening unnecessary windows 9.6.4 Text quality
10.5 Acceptable download times 10.1.4 Using appropriate formats, units of measurement or currency.
10.2 Providing help. 9.3.15 Providing printable document versions 10.3 Error pages
Accessibility
5 5 5 4 4
3:1 3:2 3:3 3:4 3:5
Comply with Section 508 Design Forms for Users Using Assistive Technology Do Not Use Color Alone to Convey Information Enable Users to Skip Repetitive Navigation Links Provide Text Equivalents for Non-Text Elements
4 3:6
Test Plug-Ins and Applets for Accessibility
3 3:7 3 3:8 3 3:9
Ensure that Scripts Allow Accessibility Provide Equivalent Pages Provide Client-Side Image Maps
6.8 Conforming to content accessibility standards 9.3.9 Using Colour 7.2.3.2 Providing text equivalents for non-text objects 10.9 Making the user interface of embedded objects usable and accessible
N. Bevan and L. Spinhof Table 1. (continued)
3 3 2 2
3:10 3:11 3:12 3:13
Synchronize Multimedia Elements Do Not Require Style Sheets Provide Frame Titles Avoid Screen Flicker
9.3.10 Using frames with care 6.9 Conforming to software accessibility standards 9.6.7 Making text resizable by the user
Hardware and Software
4 4 4 4 3
4:1 4:2 4:3 4:4 4:5
Design for Common Browsers Account for Browser Differences Design for Popular Operating Systems Design for User’s Typical Connection Speed Design for Commonly Used Screen Resolutions
The Homepage
5 5:1
Enable Access to the Homepage
5 5:2
Show All Major Options on the Homepage
5 4 4 4 3 2 2
Create a Positive First Impression of Your Site Communicate the Web Site’s Value and Purpose Limit Prose Text on the Homepage Ensure the Homepage Looks like a Homepage Limit Homepage Length Announce Changes to a Web Site Attend to Homepage Panel Width
5:3 5:4 5:5 5:6 5:7 5:8 5:9
8.4.11 Linking back to the home page 8.3.9 Directly accessing relevant information from the home page 8.3.8 Informative home page
Place Important Items at Top Center Structure for Easy Comparison Establish Level of Importance Optimize Display Density Align Items on a Page Use Fluid Layouts Avoid Scroll Stoppers Set Appropriate Page Lengths Use Moderate White Space Choose Appropriate Line Lengths Use Frames When Functions Must Remain Accessible
6:3 6:4 6:5 6:6 6:7 6:8 6:9 6:10 6:11 6:12
1 6:13
4 7:1
Navigation
Provide Navigational Options
4 7:2
Differentiate and Group Navigation Elements
4 7:3
Use a Clickable ’List of Contents’ on Long Pages
4 7:4
Provide Feedback on Users’ Location
4 3 3 2 2 2 1
Place Primary Navigation Menus in the Left Panel Use Descriptive Tab Labels Present Tabs Effectively Keep Navigation-Only Pages Short Use Appropriate Menu Types Use Site Maps Use ’Glosses’ to Assist Navigation
7:5 7:6 7:7 7:8 7:9 7:10 7:11
1 7:12
Breadcrumb Navigation
9.3.2 Consistent page layout 9.3.3 Placing title information consistently 9.3.7 Avoiding scrolling for important information
9.6.5 Quantity of text per information unit/page 9.3.16 Use of “white space” 9.3.11 Providing alternatives to frame-based presentation 9.3.6 Making content fit the expected size of the display area 9.3.13 Using appropriate techniques for defining the layout of a page
8.4.3 Maintaining visibility of navigation links 8.4.5 Placing navigation components consistently 8.4.7 Splitting up navigation overviews 8.4.14 Subdividing long pages 8.2.2 Showing users where they are 8.4.4 Consistency between navigation components and content 10.4 Naming of URLs
9.4.5 Inferring the link target from link cues 8.4.8 Providing a site map 8.2.2 Showing users where they are 8.4.12 Going back to higher levels
Table 1. (continued)
8.2.1 Making navigation self-descriptive. 8.2.3 Supporting different navigation behaviours. 8.3.3 Breadth versus depth of the navigation structure 8.3.4 Organising the navigation in a meaningful manner 8.3.7 Superimposing different navigation structures 8.4.6 Making several levels visible 8.4.10 Making dynamic navigation components obvious 8.4.13 Providing a 'step back' function
Scrolling and Paging
5 2 2 2 2
8:1 8:2 8:3 8:4 8:5
Eliminate Horizontal Scrolling Facilitate Rapid Scrolling While Reading Use Scrolling Pages for Reading Comprehension Use Paging Rather Than Scrolling Scroll Fewer Screenfuls
5 4 4 4 4 4 3 2
9:1 9:2 9:3 9:4 9:5 9:6 9:7 9:8
Use Clear Category Labels Provide Descriptive Page Titles Use Descriptive Headings Liberally Use Unique and Descriptive Headings Highlight Critical Data Use Descriptive Row and Column Headings Use Headings in the Appropriate HTML Order Provide Users with Good Ways to Reduce Options
5 4 4 4
10:1 10:2 10:3 10:4
9.3.8 Avoiding horizontal scrolling
Headings, Titles and Labels
9.4.17 Page titles as bookmarks 9.3.1 General page information 8.2.2 Showing users where they are
Links
Use Meaningful Link Labels Link to Related Content Match Link Names with Their Destination Pages Avoid Misleading Cues to Click
4 10:5
Repeat Important Links
4 10:6 4 10:7
Use Text for Links Designate Used Links
3 10:8
Provide Consistent Clickability Cues
3 3 3 3 3 3
Ensure that Embedded Links are Descriptive Use ’Pointing-and-Clicking’ Use Appropriate Text Link Lengths Indicate Internal vs. External Links Clarify Clickable Regions on Images Link to Supportive Information
10:9 10:10 10:11 10:12 10:13 10:14
9.4.7 Using descriptive link labels
8.2.4 Offering alternative navigation paths 8.4.9 Providing cross linking to potentially relevant content 9.4.15 Redundant links 9.4.8 Highlighting previously visited links 9.4.2 Identification of links 9.4.3 Distinguishing adjacent links from each other
9.4.14 Link length 9.4.13 Distinguishable within-page links
8.4.16 Dead links 9.4.4 Distinguishing navigation links from transactions 9.4.9 Marking links to special targets 9.4.11 Marking links opening new windows 9.4.12 Distinguishing navigation links from action links 9.4.16 Avoiding link overload
Text Appearance
Use Black Text on Plain, High-Contrast Backgrounds Format Common Items Consistently Use Mixed-Case for Prose Text Ensure Visual Consistency Use Bold Text Sparingly Use Attention-Attracting Features when Appropriate Use Familiar Fonts Use at Least a 12-Point Font Color-Coding and Instructions Emphasize Importance Highlighting Information
9.3.9 Using Colour
Lists
4 12:1 4 12:2 4 12:3
Order Elements to Maximize User Performance Place Important Items at Top of the List Format Lists to Ease Scanning
[ISO 9241-12 5.7.1]
Table 1. (continued)
4 3 3 2 2 1
12:4 12:5 12:6 12:7 12:8 12:9
Display Related Items in Lists Introduce Each List Use Static Menus Start Numbered Items at One Use Appropriate List Style Capitalize First Letter of First Word in Lists
Distinguish Required and Optional Data Entry Fields Label Pushbuttons Clearly Label Data Entry Fields Consistently Do Not Make User-Entered Codes Case Sensitive Label Data Entry Fields Clearly Minimize User Data Entry Put Labels Close to Data Entry Fields Allow Users to See Their Entered Data 8.5.2.8 Search field size 9.5 Choosing interaction objects Use Radio Buttons for Mutually Exclusive Selections Use Familiar Widgets Anticipate Typical User Errors Partition Long Data Items Use a Single Data Entry Method Prioritize Pushbuttons Use Check Boxes to Enable Multiple Selections Label Units of Measurement Do Not Limit Viewable List Box Options Display Default Values Place Cursor in First Data Entry Field Ensure that Double-Clicking Will Not Cause Problems Use Open Lists to Select One from Many Use Data Entry Fields to Speed Performance Use a Minimum of Two Radio Buttons Provide Auto-Tabbing Functionality Minimize Use of the Shift Key 8.4.15 Explicit activation
[ISO 9241-12 5.7.6]
Screen-Based Controls (Widgets)
Graphics, Images, and Multimedia
4 14:1 4 14:2 4 14:3
Use Simple Background Images Label Clickable Images Ensure that Images Do Not Slow Downloads
4 14:4
Use Video, Animation, and Audio Meaningfully
4 14:5
Include Logos
4 4 4 3 3 3
Graphics Should Not Look like Banner Ads Limit Large Images Above the Fold Ensure Web Site Images Convey Intended Messages Limit the Use of Images Include Actual Data with Data Graphics Display Monitoring Information Graphically
14:6 14:7 14:8 14:9 14:10 14:11
2 14:12 Introduce Animation 2 2 1 1
14:13 14:14 14:15 14:16
7.2.3 Selecting suitable media 7.2.3.1 Selecting appropriate media objects 6.10 Identifying the site and its owner 9.3.14 Identifying all pages of a site
7.2.3.3 Enabling users to control time-dependent content changes
Emulate Real-World Objects Use Thumbnail Images to Preview Larger Images Use Images to Facilitate Learning Using Photographs of People
Writing Web Content
5 15:1
Make Action Sequences Clear
4 4 4 4 4 4
Avoid Jargon Use Familiar Words Define Acronyms and Abbreviations Use Abbreviations Sparingly Use Mixed Case with Prose Limit the Number of Words and Sentences
15:2 15:3 15:4 15:5 15:6 15:7
8.3.5 Offering task-based navigation 8.3.6 Offering clear navigation within multi-step tasks 8.3.7 Superimposing different navigation structures 8.4.2 Providing navigation overviews 9.4.6 Using familiar terminology for navigation links
Table 1. (continued) 3 3 3 3
15:8 15:9 15:10 15:11
Limit Prose Text on Navigation pages Use Active Voice Write Instructions in the Affirmative Make First Sentences Descriptive
9.6.1 Readability of text 9.6.2 Supporting text skimming 9.6.3 Writing style
Content Organization
5 5 5 4 4 3 3 3 2
16:1 16:2 16:3 16:4 16:5 16:6 16:7 16:8 16:9
Organize Information Clearly Facilitate Scanning Ensure that Necessary Information is Displayed Group Related Elements Minimize the Number of Clicks or Pages Design Quantitative Content for Quick Understanding Display Only Necessary Information Format Information for Multiple Audiences Use Color for Grouping
9.6.2 Supporting text skimming
8.2.5 Minimising navigation effort
7.1.6 Level of granularity 7.2.4 Keeping the content up to date 7.2.5 Making the date and time of the last update available 7.2.7 Accepting online user feedback
Search
5 17:1
Ensure Usable Search Results
5 17:2
Design Search Engines to Search the Entire Site
4 4 4 3 3 3 2
Make Upper- and Lowercase Search Terms Equivalent Provide a Search Option on Each Page Design Search Around Users’ Terms Allow Simple Searches Notify Users when Multiple Search Options Exist Include Hints to Improve Search Performance Provide Search Templates
17:3 17:4 17:5 17:6 17:7 17:8 17:9
8.5.3.1 Ordering of search results 8.5.3.2 Relevance-based ranking of search results 8.5.3.3 Descriptiveness of results 8.5.3.4 Sorting search results 8.5.4.1 Scope of a search 8.5.4.2 Selecting the scope of a search 8.5.2.7 Availability of search 8.5.2.10 Error-tolerant search 8.5.2.3 Providing a simple search facility 8.5.2.6 Describing the search technique used 8.5.2.1 Providing a search function 8.5.2.2 Providing appropriate search functions 8.5.2.4 Advanced search 8.5.2.5 Full-text search 8.5.2.9 Shortcut to search function 8.5.4.3 Providing feedback on the volume of the search result 8.5.4.5 Showing the query with the results 8.5.5.1 Giving advice for unsuccessful searches 8.5.5.2 Repeating searches 8.5.5.3 Refining searches
Use an Iterative Design Approach Solicit Test Participants’ Comments Evaluate Web Sites Before and After Making Changes Prioritize Tasks Distinguish Between Frequency and Severity Select the Right Number of Participants Use the Appropriate Prototyping Technology Use Inspection Evaluation Results Cautiously Recognize the ’Evaluator Effect’ Apply Automatic Evaluation Methods Use Cognitive Walkthroughs Cautiously Choosing Laboratory vs. Remote Testing Use Severity Ratings Cautiously
Privacy and business policies
7.2.8.1 Providing a privacy policy statement 7.2.8.2 Providing a business policy statement
Table 1. (continued)
7.2.8.3 User control of personal information 7.2.8.4 Storing information on the user’s machine
Internationalization
9.6.6 Identifying the language used 10.1.1 General 10.1.2 Showing relevant location information 10.1.3 Identifying supported languages 10.1.5 Presenting text in different languages
Personalisation and user adaptation
7.2.9.2 Taking account of the users’ information needs 7.2.9.3 Making personalisation evident 7.2.9.4 Making user roles evident 7.2.9.5 Allowing users to see and change profiles 7.2.9.6 Informing about automatically generated profiles 7.2.9.7 Switching off automatic adaptation 7.2.9.8 Providing access to complete content
Only 56 of the HHS guidelines are in common (71% of the HHS guidelines and 46% of the ISO guidelines are unique). For the 101 HHS guidelines rated highest for importance, the proportion of unique guidelines drops to 62%. If the topics not covered by ISO 9241-151 are excluded (Design Process; Evaluation; Hardware and Software; Lists; Screen-based Controls; Graphics, Images, and Multimedia; Privacy & Business Policies; Internationalisation; and Personalisation), the percentage of unique guidelines drops to 64% of the HHS guidelines (55% of those of highest importance) and 38% of the ISO guidelines. While the percentage of unique ISO guidelines in the new documents has fallen from 49% to 38%, the percentage of high-priority HHS guidelines that are unique remains about 55%. (As some judgments had to be made about what constitutes equivalence, these figures are only approximate.)
Some HHS guidelines are not in the ISO draft because they are beyond the scope of software ergonomics, e.g.:
• Hardware and Software: browser and operating system (e.g. 4:1 Design for common browsers).
• 5:3 Create a Positive First Impression of Your Site.
Other types of HHS guidelines that are not included by ISO include:
• Home page design, e.g.: 5:5 Limit prose text on the homepage.
• Scrolling & paging, e.g.: 8:3 Use scrolling pages for reading comprehension.
• Headings, Titles and Labels: window titles and descriptive headings, e.g.: 9:1 Use clear labels for categories of information that summarise the items within the category.
• Appearance, e.g.: 11:4 Ensure visual consistency of website elements within and between web pages.
• Lists: headings, ordering and formatting, e.g.: 12:2 Display a series of related items in a vertical list.
• Writing Web Content: jargon, abbreviations, and case, e.g.: 15:5 Use abbreviations sparingly.
• Content Organisation: support scanning and display necessary information, e.g.: 16:1 Organize information clearly: Structure the site to be meaningful to the user.
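The headline percentages can be checked directly from the counts reported in this paper. The following is a minimal sketch in Python; the function and variable names are our own, and the count of 65 ISO-only recommendations is the figure stated in the discussion of guidelines unique to ISO. Because one HHS guideline may correspond to several ISO recommendations, the two percentages are computed from separate counts.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# Counts reported in the paper.
HHS_TOTAL = 196      # HHS guidelines, excluding usability testing
HHS_IN_COMMON = 56   # HHS guidelines with an equivalent ISO recommendation
ISO_TOTAL = 141      # ISO 9241-151 recommendations
ISO_UNIQUE = 65      # ISO recommendations with no HHS equivalent

hhs_unique_pct = percent(HHS_TOTAL - HHS_IN_COMMON, HHS_TOTAL)
iso_unique_pct = percent(ISO_UNIQUE, ISO_TOTAL)

print(f"Unique HHS guidelines: {hhs_unique_pct}%")
print(f"Unique ISO recommendations: {iso_unique_pct}%")
```

Both results agree with the 71% and 46% quoted above. The figures for the reduced scope (64%, 55% and 38%) cannot be recomputed in the same way, since the underlying counts are not given in the text.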
ISO provides more detail in areas specific to the web such as Navigation and Searching, and includes Privacy and Internationalization that are outside the scope of HHS. In total 65 guidelines are unique to ISO. Examples of apparently important guidelines within the scope of HHS, but unique to ISO include:
• 8.3.10 Avoiding unnecessary start (splash) screens.
• 9.4.2 Navigation links should be clearly distinguishable from links activating some action.
• 9.4.9 Links to other file formats should be clearly marked.
• 9.4.11 Links that open new browser windows should be clearly marked.
• 10.3 Error messages should clearly state the reason why the error occurred.
• 8.4.13 Provide a separate ‘back’ function if the standard function does not lead to a meaningful previous state.
These items may not have been included by HHS either because they were not included in the original set of guidelines that were reviewed (for example because there was no supporting evidence), or because they were subsequently judged “less important” and therefore eliminated from the published set.
Some differences were noticed in the content of some HHS and ISO guidelines:
• ISO 8.2.2 recommends use of breadcrumbs, while HHS 7:12 says that they are ineffective.
• ISO 9.6.5 recommends limiting the quantity of text per information unit/page, while HHS 6:10 recommends using an appropriate page length, and using longer scrolling pages when reading for comprehension (8:3).
• ISO 9.3.11 warns against using frames, while the HHS guidelines recommend frames in some circumstances (6:13 When functions must remain accessible) and suggest how they should be used (3:12 Use frame titles).
• ISO 9.4.15 warns against using redundant links, while HHS 10:5 recommends repeating important links.
• ISO 9.4.14 recommends that link names should not exceed one line of text, while HHS 10:11 recommends that link names should be long enough to be understood, but short enough to minimize wrapping.
2 Guideline Based Inspections
Since the use of design and usability standards in software development is rising [5], interest in usability inspections is also rising. Previous research pointed out that the standards existing at the time were not easy for researchers and professionals to use [2,8]. Most current standards are still not readily usable in guideline reviews. Performing a thorough guideline review of a website can involve using a combination of different sets of guidelines, but when sets are combined ad hoc they are difficult to use and to interpret. This part of the paper describes: 1) the problems that occurred when the second author tried to use the HHS and ISO documents as a tool in guideline reviews, and 2) how the second author combined the different sets of guidelines into one checklist.
Guideline-based inspections are commonly used by usability professionals in the user-centered design process. ‘Guideline-based inspections’ or ‘guideline reviews’ are regarded as usability inspection methods [11]. In a usability inspection, the usability-related aspects of the interface are examined. In contrast with formal usability tests, no end users are involved: the examination is performed by some kind of professional, such as a usability expert, a developer, or an experienced user [11]. The best-known inspection method is ‘heuristic evaluation’, in which an application is checked against a list of quite generally formulated heuristics (e.g. the ten heuristics of Nielsen). The outcome of a heuristic evaluation depends heavily on the interpretation of the expert who performs the inspection. A guideline review checks an application against a more concrete set of guidelines (e.g. the HHS guidelines), so its outcome depends less on the expertise of the expert (Jordan et al. [7]). Guidelines used in a guideline review can have very different abstraction levels: they can vary from quite system-specific guidelines to more general ones that can be applied to different kinds of systems. The higher the abstraction level, the more insight the expert needs to have into the system [3].
Design and usability guidelines are becoming increasingly popular in software development processes [5]. The use of, and compliance with, usability standards is considered a good way to create a high degree of consistency across and within applications [5,13]. Consistency is considered one of the most important usability principles of human-computer interaction: consistency within and between applications improves overall usability. Interfaces that are consistent are easier to learn and lead to fewer errors and therefore to higher user satisfaction [10]. More recent sets of usability guidelines, such as the HHS guidelines and ISO 9241-151, also consider other aspects of usability, such as interaction design and information architecture.
To test an application for compliance with a certain set of guidelines it is useful to develop a checklist listing all requirements the application should adhere to. A checklist helps the expert who checks for compliance to keep track of the requirements that need to be met. A good checklist consists of an exhaustive list of well-written requirements. A good requirement is necessary, verifiable, attainable and formulated clearly enough to avoid ambiguity [6]. Although usability guidelines tend to be more subjective in nature, these rules for good requirements should also be kept in mind when creating a usability checklist.
A good set of guidelines combines specific guidelines for the application at hand with generic guidelines that refer to more general aspects of the interface. The set of guidelines needs to be well documented, including concrete examples illustrating the different guidelines [5]. The document itself should comply with all guidelines for good document design, such as a thorough table of contents and index, word lists and glossaries [13]. Although the focus of usability guidelines may differ, they are all intended to be used by developers when developing new applications and by usability experts when inspecting the usability of a system. Existing usability standards tend to be quite unusable for developers and even for usability experts who check for compliance with these standards [13]. Users of the different available standards therefore still need to supplement the standards to make them usable in a guideline review.
The two standards discussed in this paper are not directly usable in a guideline review. Ambiguous guideline formulations occur in both sets of guidelines, not all guidelines are verifiable, and neither document provides a ready-to-use checklist (see Table 2). The ISO standard contains no illustrations to serve as examples for the requirements, and no index is provided. Both standards therefore need to be supplemented to be usable as a checklist.
Table 2. Examples of guidelines transformed to make them verifiable
Original guideline from HHS: Let users know if a page is programmed to 'time out', and warn users before time expires so they can request additional time.
Unambiguous requirements: Are users warned of a page time out? Can users "ask" for more time to complete a task?
Original guideline from HHS: Provide users with appropriate feedback while they are waiting.
Verifiable requirements: Is progress feedback provided for processes that take longer than one second? If a process takes at most 10 seconds, is a visual indication used to indicate the progress? If a process will take more than 10 seconds, is a progress indicator used that shows progress toward completion?
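The second example in Table 2 shows how a vague guideline becomes verifiable once thresholds are attached to it. As an illustration only, the thresholds can be written as a small decision rule. The sketch below is in Python; the function name and the returned labels are hypothetical and are not part of the checklist or the handbook.

```python
def required_feedback(expected_seconds: float) -> str:
    """Return the kind of feedback the Table 2 requirements call for.

    The one-second and ten-second thresholds come from Table 2;
    the returned labels are illustrative names chosen for this sketch.
    """
    if expected_seconds <= 1:
        # No requirement applies to processes of one second or less.
        return "none"
    if expected_seconds <= 10:
        # Longer than one second, at most ten: a visual indication of progress.
        return "visual indication"
    # More than ten seconds: an indicator showing progress toward completion.
    return "progress indicator"


for seconds in (0.5, 4, 30):
    print(seconds, "->", required_feedback(seconds))
```

An expert reviewing a site could compare the feedback actually shown for a process of known duration with the value returned here. Whether a process of exactly one second needs feedback is our reading of "longer than one second".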
2.1 Creating a Web Usability Checklist
To perform a usability guideline review of websites we needed a complete checklist covering all topics in web usability. Neither of the existing sets of guidelines was complete and detailed enough to serve the purpose: a complete conformance test of websites against the existing and commonly used web design and usability guidelines. The HHS guideline set was the most complete and the most nearly ready-to-use set of guidelines we could find. We therefore decided to take the HHS guidelines as the basis for the usability checklist we were looking for. We extended the scope of the HHS guidelines with two topics, privacy and security, and e-commerce, taken from other guidelines found in the literature (e.g. [12]). The requirements from the other topics were transformed into necessary, verifiable, attainable and unambiguous guidelines. All requirements are stated as yes/no questions. Some of the topics were combined into one topic, and other topics were complemented with extra requirements that in our experience are important in usability evaluations. All requirements in the checklist are research based. The whole set of guidelines was checked against ISO/DIS 9241-151 to be sure we included everything from this upcoming international standard. All requirements in ISO/DIS 9241-151 were covered in our set of guidelines, but often formulated differently. Due to the contextual nature of usability research it was not always easy to formulate requirements that are true for all websites and web applications. We therefore developed a structure in which the expert who does the inspection can decide which requirements are applicable in which situation. This was the only way to create a generally applicable web usability checklist that can be adjusted to each specific situation.
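The structure described above, yes/no questions that the inspecting expert can mark as applicable or not, can be pictured with a small data model. The following Python sketch is a hypothetical illustration and not the format of the actual checklist: the class name, field names and the summary function are assumptions, and only the question texts are taken from Table 2.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Requirement:
    """One checklist entry, phrased as a yes/no question (hypothetical model)."""
    question: str
    applicable: bool = True          # decided by the expert for the site under review
    answer: Optional[bool] = None    # True = yes, False = no, None = not yet checked


def summarize(requirements: List[Requirement]) -> Dict[str, int]:
    """Count outcomes over the requirements the expert marked as applicable."""
    applicable = [r for r in requirements if r.applicable]
    return {
        "applicable": len(applicable),
        "passed": sum(1 for r in applicable if r.answer is True),
        "failed": sum(1 for r in applicable if r.answer is False),
        "unchecked": sum(1 for r in applicable if r.answer is None),
    }


checklist = [
    Requirement("Are users warned of a page time out?", answer=True),
    Requirement("Can users ask for more time to complete a task?", answer=False),
    # Marked not applicable, e.g. for a site with no long-running processes.
    Requirement(
        "Is progress feedback provided for processes that take longer than one second?",
        applicable=False,
    ),
]

summary = summarize(checklist)
print(summary)
```

Keeping applicability separate from the answer means that a requirement excluded for one site is not counted as a failure, which matches the intent of a checklist that can be adjusted to each situation.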
Fig. 1. Example from the handbook
2.2 Use of the Checklist
The checklist developed here is intended to be used by usability experts during guideline reviews. As stated above, a checklist alone is not enough to create a usable review tool. All requirements in the checklist are documented in the “guideline handbook”. Each requirement in the checklist is described in more detail and, where possible, illustrated with an example (see Fig. 1). For each requirement there is a description of how it should be tested and when compliance can be stated. A glossary, a table of contents and an index are included to make the document usable for the user of the checklist. Not all requirements have the same impact on the usability of an interface [1], so all requirements have been prioritized on a scale from 5 (highest priority) to 1 (lowest priority). This prioritization is based on the “relative importance” scales used in the HHS guidelines. Adjustments to this scale have been made based on experience from earlier research and on feedback from a group of usability experts who used the different drafts of the checklist in usability research.
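One way the 5-to-1 priority scale might be used when reporting a review is to list the requirements that were not met in descending order of priority. The sketch below, in Python, is an assumption about how such a report could be produced and is not a description of the handbook. The class and function names are hypothetical, and the priorities attached to the example findings are illustrative values rather than entries from the checklist.

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    """Result of checking one requirement (hypothetical structure)."""
    requirement: str
    priority: int      # 5 = highest priority, 1 = lowest
    compliant: bool


def failures_by_priority(findings: List[Finding]) -> List[Finding]:
    """Return non-compliant findings, highest priority first.

    sorted() is stable, so findings with equal priority keep their input order.
    """
    failed = [f for f in findings if not f.compliant]
    return sorted(failed, key=lambda f: f.priority, reverse=True)


# Illustrative findings; the priority values are examples only.
findings = [
    Finding("Use site maps", 2, False),
    Finding("Eliminate horizontal scrolling", 5, False),
    Finding("Provide descriptive page titles", 4, True),
    Finding("Use breadcrumb navigation", 1, False),
]

report = failures_by_priority(findings)
for f in report:
    print(f.priority, f.requirement)
```

Sorting only the failed requirements keeps the report short and puts the issues with the largest expected impact on usability at the top.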
3 Further Research
The complete review tool (handbook and checklist) needs to be user-tested before it can be used as a standalone test tool. The usability of the tool itself, as well as the outcome of the review, needs to be evaluated. In the next phase the complete review tool will be tested in parallel with a formal usability test on the same application, so that the results of the guideline review can be compared with the results of the formal usability test. By asking several usability experts to participate in this test as evaluators, the usability of the tool itself will be evaluated at the same time.
References
1. Agarwal, R., Venkatesh, V.: Assessing a Firm’s Web Presence: A Heuristic Evaluation Procedure for the Measurement of Usability. Information Systems Research 13, 168–186 (2002)
2. Bevan, N.: Guidelines and standards for web usability. In: Human Computer International 2005, Proceedings HCI International 2005, Lawrence Erlbaum, Mahwah (2005)
3. Cockton, G., Woolrych, A., Hall, L., Hidemarch, M.: Changing Analysts’ Tunes: The Surprising Impact of a New Instrument for Usability Inspection Method Assessment. In: Proc. HCI 2003, pp. 145–162. Springer, Heidelberg (2003)
4. de Souza, F., Bevan, N.: The use of guidelines in menu interface design: Evaluation of a draft standard. In: Proceedings of the IFIP TC13 Third International Conference on Human-Computer Interaction, pp. 435–440. North-Holland Publishing Co (1990)
5. Henninger, S., Lu, C., Faith, C.: Using organizational learning techniques to develop context-specific usability guidelines. pp. 129–136. Amsterdam (1997)
6. Hooks, I.: Writing good requirements. In: Proceedings of the 3rd International Symposium of the National Council on Systems Engineering (NCOSE) (1993)
7. Jordan, W.P., Thomas, B., Weerdmeester, A.B., McClelland, L.I.: Usability evaluation in industry. Taylor & Francis, London (1996)
8. Mosier, J.N., Smith, S.L.: Application of guidelines for designing user interface software. Behavior and Information Technology 5(1), 39–46 (1986)
9. Nielsen, J.: Usability Engineering. Morgan Kaufmann Publishers, San Francisco, CA (1993)
10. Nielsen, J.: Coordinating user interfaces for consistency. Morgan Kaufmann Publishers Inc., San Francisco, CA (1989)
11. Nielsen, J., Mack, L.R.: Usability inspection methods. John Wiley & Sons, Inc, US (1994)
12. Nielsen, J., Molich, R., Snyder, C., Farrell, S.: E-Commerce User Experience. Nielsen Norman Group, Fremont, CA, USA (2001)
13. Thovtrup, H., Nielsen, J.: Assessing the usability of a user interface standard, pp. 335–341. ACM Press, New Orleans, Louisiana, United States (1991)
14. US Department of Health and Human Services: Research-Based Web Design & Usability Guidelines (2006). Available at http://www.usability.gov/guidelines/
The Experimental Approaches of Assessing the Consistency of User Interface Yan Chen, Lixian Huang, Lulu Li, Qi Luo, Ying Wang, and Jing Xu User Research & Experience Design Center of Tencent Technology (Shenzhen) Ltd. Shenzhen 518057
Abstract. Consistency, one of the most important features of usability, has been used as an important indicator for assessing usability. A number of recent studies have focused on how to create consistency within a single application, but few have considered how to create and evaluate consistency across the products of one company. In this paper, we address the problem with two methods, the in-complete matching task and the method of paired comparison, to analyze the distinction from competing products and to evaluate the consistency of the current products. The study finds that these two methods can relatively rapidly identify how consistent different products are and can reveal some design elements that affect consistency. However, as the experiment covered only the login interface, the applicability of the methods needs further study. Keywords: consistency; user experience; usability testing.
1 Introduction

One of the most important aspects of usability is consistency. Consistency should apply both within the individual application and across complete computer systems and even across product families [1]. Consistency within a single application can reduce the user’s memory load and the risk of errors, while consistency among the different products of one company can strengthen the overall identification of the products and so distinguish them from competitors’ products. However, because products differ in positioning, application area, and target users, it is very difficult to maintain consistency in the interface design of all products. For a company with a substantial number of products, it is even more difficult to analyze the factors affecting the consistency among them. This study explores and develops an effective method, applicable to a larger number of products, for analyzing the consistency performance of existing products.
consistency interface is the very one following the rules, such as using the same operation to select targets [2]. Another issue of consistency is to determine where consistency should be reflected in the interface and what should be kept consistent: the way objects are operated in daily life (called external consistency) or the way they are operated in the existing operating system (known as internal consistency) [2]. According to the definition in "Designing the User Interface" by Ben Shneiderman, consistency is mainly the unification of the general operating sequences, terminology, components, layout, color, and style sheets within an application [3]. The former distinguishes external and internal consistency, while the latter emphasizes, from a procedural point of view, what should be kept consistent. Although these definitions differ, they all refer to the complete consistency of the product experience within the same company. Here, consistency includes two things: the identification of a company's products by users (their difference from competing products), which results from factors such as the number, size, color, and layout of interface design elements, and the consistency among different products within the same company.
3 Methods of Assessing Consistency

3.1 Premises of the Study

The above definitions of consistency contain two important factors: the identification of the product and the consistency among different products within the same company. If a product has high identification and the consistency among different products is also high, we can say that the complete consistency of the product experience is very good. Based on this premise, we use the matrix in Table 1 to describe complete consistency. Using experimental methods, we determine where the assessed products fall in the matrix and give the corresponding description.

Table 1. Matrix of Complete Consistency Performance Description

No. | Identification | Consistency among products | Complete consistency | Description
1 | High | High | Good | The complete consistency of the product experience is very good: the consistency among different products is high and the products have high identification.
2 | Low | High | Common | The complete consistency of the product experience is common: the consistency among different products is high while the products have low identification. Possible reason: the products do not differ obviously from competing products, and the design of identification elements needs to be enhanced.
3 | High | Low | Common | The complete consistency of the product experience is common: the products have high identification while the consistency among different products is low. Possible reason: some of the products have high identification, and there may be sub-products.
4 | Low | Low | Bad | The complete consistency of the product experience is bad: the products have low identification and are highly similar to other products; the consistency among different products is low and the product design lacks unity.

3.2 Methods of Assessing

Because the company has a large number and variety of products, and because the login interface strongly influences users and is the most frequently used interface, the consistency performance of the login interface was selected as the object of this experiment. To analyze the two factors that constitute complete consistency, we apply the in-complete matching method and the method of paired comparison separately.

3.2.1 Method of Paired Comparison

The method of paired comparison is an indirect method used to construct an ordinal scale in experimental psychology. It was first introduced by the psychologist J. Cohn in his study of color preference. All objects to be compared are matched into pairs, and the pairs are presented to the participants one after another. For each pair, participants compare a given property of the two objects and judge which object has more of it. The method forces participants to choose between two objects, and the pairing ensures that every object is compared with every other object, so that we can determine how all the objects perform on the selected property. In the experiment, we used the login interface of QQ2006 as the standard, and for each pair participants selected the interface that was more similar to it. Seven interfaces were selected for comparison and matched into 21 pairs, and participants made a forced choice between interfaces A and B in each pair.
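The pairing step can be sketched in a few lines of Python. This is only an illustration of how seven interfaces yield 21 pairs; the function name `build_trials`, the shuffling of presentation order, and the random assignment to the A and B positions are assumptions made here, since the text does not say how the pairs were ordered or laid out.

```python
import itertools
import random

def build_trials(interfaces, seed=None):
    """Return every unordered pair once, in shuffled order with random A/B sides."""
    rng = random.Random(seed)
    trials = []
    for first, second in itertools.combinations(interfaces, 2):
        # Assumption: randomise which interface is shown as option A and which as B.
        pair = [first, second]
        rng.shuffle(pair)
        trials.append(tuple(pair))
    rng.shuffle(trials)  # Assumption: randomise the order in which pairs are shown.
    return trials

interfaces = ["A", "B", "C", "D", "E", "F", "G"]
trials = build_trials(interfaces, seed=0)
print(len(trials))  # 7 * 6 / 2 = 21 pairs
```

Each interface appears in exactly six of the 21 trials, which is what allows every interface to be compared with every other one.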
Fig. 1. Depiction of Task of the Method of Paired Comparison (panels: login interface of QQ2006 as the standard; candidate interfaces A and B)
If all the compared interfaces are highly consistent with the login interface of QQ2006, the consistency evaluations should show no clear trend; that is, users’ choices have no obvious distribution trend. The reasoning rests on the following hypothesis: if N identical balls are compared by different users, the comparison results should be random, with no trend in users’ choices. If the N balls are marked with different numbers, some users may compare them by the size of the numbers and others by whether the numbers are odd or even, so the comparison results show certain distribution trends. We can assess whether there are distribution trends by using Kendall's consistency coefficient, also called "Kendall’s U coefficient". If there are distribution trends, the U value should be close to 1; if there are none, it should be close to 0. Kendall's formula is as follows:
$$U = \frac{8\left(\sum \gamma_{ij}^{2} - K \sum \gamma_{ij}\right)}{N(N-1)\,K(K-1)} + 1 \qquad (1)$$

N is the number of objects being assessed (the number of classes); here N = 7, the seven compared interfaces. K is the number of assessors; here K = 16, the 16 users.
$\gamma_{ij}$ is the selection score for i > j (or i < j) in the recording table of paired comparison. The selected interface is scored 1, while the non-selected interface is scored 0.

Table 2. Table of Kendall Coefficient Calculation

Interface No.   A    B    C    D    E    F    G
A
B               4
C               1    2
D               1    1    9
E               3    5    1    1
F               9    9    1    0    9
G               5    5    0    1    8    3
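As a quick check on the two reference values quoted for U, the limiting cases of equation (1) can be worked out directly. The sketch below assumes that each $\gamma_{ij}$ is the number of the K assessors who chose one fixed member of pair (i, j), and that the sums run over the N(N-1)/2 pairs; that reading is an interpretation of the definition above rather than something the text states outright.

```latex
% Unanimous choices: every pair has \gamma_{ij} \in \{0, K\}
\gamma_{ij}^{2} - K\gamma_{ij} = 0
\;\Rightarrow\; U = 0 + 1 = 1 .

% Evenly split choices (K even): \gamma_{ij} = K/2 for every pair
\sum_{i<j}\left(\gamma_{ij}^{2} - K\gamma_{ij}\right)
  = \frac{N(N-1)}{2}\cdot\left(-\frac{K^{2}}{4}\right)
\;\Rightarrow\;
U = \frac{8\cdot\frac{N(N-1)}{2}\cdot\left(-\frac{K^{2}}{4}\right)}{N(N-1)\,K(K-1)} + 1
  = -\frac{K}{K-1} + 1 = -\frac{1}{K-1} .
```

Under these assumptions, with K = 16 the no-trend end of the scale is -1/15, about -0.07, which is slightly below the value of 0 used as the reference point in the text.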
3.2.2 In-Complete Matching Task: Name Matching Task

The in-complete matching method assesses the identification of products. It requires users to select a product name for the login interface of each product, and the accuracy of the interface-product match (AM) is then calculated. AM is the number of interfaces matched correctly by the user divided by the total number of interfaces to be matched, that is, AM = AI (accurate interfaces) / TI (total interfaces). If the matching task yields a high score, we consider the products to have high identification.
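The AM ratio is simple enough to state as code. The following is a minimal sketch for a single participant; the function name `matching_accuracy`, the interface labels and the product names are all hypothetical and stand in for the real stimuli, which are not listed in the text.

```python
def matching_accuracy(answers, answer_key):
    """AM = AI / TI: correctly named interfaces over all interfaces to be named."""
    total = len(answer_key)
    correct = sum(1 for interface, name in answer_key.items()
                  if answers.get(interface) == name)
    return correct / total

# Hypothetical answer key and one participant's answers (illustrative names only).
answer_key = {"I1": "Product 1", "I2": "Product 2", "I3": "Product 3", "I4": "Product 4"}
answers = {"I1": "Product 1", "I2": "Product 3", "I3": "Product 3", "I4": "Product 4"}

print(matching_accuracy(answers, answer_key))  # 3 of 4 correct -> 0.75
```

Averaging this value over all participants would give an overall accuracy of the kind reported for the task.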
Fig. 3. Depiction of In-complete Matching Task
4 Experiment Preparation

Users were screened by telephone. Using the registration information on the server, we conducted a quick telephone interview and invited users to participate. We asked them to recall the relevant QQ products they had used and to describe each briefly, to ensure that they met our requirements. Sixteen users in total participated in the experiment: 10 men and 6 women, of whom 13 were aged 21-30 and 3 were aged 20 or younger. To ensure familiarity with the QQ businesses, we selected users who had used at least five QQ businesses; only a few had used fewer than five. Seven login interfaces of the typical businesses and three login interfaces of other instant messaging software (MSN, Yahoo Messenger and Popo) were selected for the in-complete matching method. All interfaces were processed in advance to remove the names and icons of the relevant products.
Fig. 4. Depiction of Interface Modifying of In-complete Matching Task (panels: login interface before modifying; login interface after modifying)
5 Results

The average accuracy of the 16 users in the in-complete matching task is 61%. This result shows that the identification of the seven interfaces to be assessed is high. However, the Kendall’s U coefficient calculated for the paired comparison task is approximately 0.28. There is a certain trend in the distribution of users’ choices, but the trend is not significant. Therefore, we can conclude that the overall consistency of the interfaces to be assessed is common. In the calculation process, however, we also find that the choices for some pairs are concentrated, as shown in Table 3.

Table 3. Table of Paired Comparison Data

Interface No.   A    B    C    D    E    F    G
A
B               4
C               1    2
D               1    1    9
E               3    5    12   4
F               9    9    14   14   9
G               5    5    14   12   8    3
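The reported coefficient can be recomputed from Table 3 with equation (1). The sketch below assumes that each table entry is the count for one pair and that the table is a lower triangle read column by column (A against B to G, then B against C to G, and so on); the function name `kendall_u` is hypothetical.

```python
def kendall_u(counts, n_objects, n_assessors):
    """Kendall's U from equation (1); counts holds one gamma_ij per pair."""
    sum_sq = sum(g * g for g in counts)
    sum_lin = sum(counts)
    denom = n_objects * (n_objects - 1) * n_assessors * (n_assessors - 1)
    return 8 * (sum_sq - n_assessors * sum_lin) / denom + 1

# Lower triangle of Table 3, read column by column (assumed layout).
table3 = {
    "A": [4, 1, 1, 3, 9, 5],
    "B": [2, 1, 5, 9, 5],
    "C": [9, 12, 14, 14],
    "D": [4, 14, 12],
    "E": [9, 8],
    "F": [3],
}
counts = [g for column in table3.values() for g in column]

u = kendall_u(counts, n_objects=7, n_assessors=16)
print(round(u, 2))  # 0.28
```

With these 21 counts the sum of squares is 1396 and the plain sum is 144, so U = 8(1396 - 16 x 144)/10080 + 1, which agrees with the value of approximately 0.28 given in the text.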
A-G are the codes of the seven interfaces to be assessed. In the pairs CE, CF and CG, option C was hardly ever selected: it was chosen 4 times (16 in total minus the 12 times E was selected), 2 times (16 minus the 14 times F was selected) and 2 times (16 minus the 14 times G was selected), respectively. From the interviews, we learned that the colors of the interfaces have a great impact on users’ choices. The color of interface C differs from the colors of all the other interfaces, which is why the choices involving C are concentrated. In the DF and DG pairs, D was also selected very rarely: 2 times (16 minus the 14 times F was selected) and 4 times (16 minus the 12 times G was selected), respectively. The interviews also showed that the sizes of the interfaces have a great influence on users’ choices. The size of interface D is very different from the sizes of interfaces F and G. Therefore, the size of the interface is the main cause of the concentration of choices in these pairs.
6 Conclusion

From the in-complete matching task we conclude that the identification of the products is high, and from the method of paired comparison we conclude that the consistency of the products is low. Based on the matrix given in the study premises above, we arrive at the following consistency description for the eight interfaces assessed.

No. | Identification | Consistency among products | Complete consistency | Description
3 | high | low | common | The complete consistency of the product experience is common: the products have high identification while the consistency among different products is low. Possible reason: some of the products have high identification, and there may be sub-products.
The eight interfaces assessed, including the QQ2006 interface used as the standard in the paired comparison task, all have high overall identification, while the consistency among the products is low. Design elements that differ greatly between some products, such as the size, color and icons of the interface, may have raised the accuracy in the matching task. The lower consistency among the products supports the possibility that some products have a unique design. This uniqueness may have many causes, and sub-products may be one of them.
7 Discussion

This study finds that the in-complete matching method and the method of paired comparison can rapidly assess the consistency performance of products, and that an increase in the number of products does not affect the experimental progress. These two methods are therefore well suited to situations with a large number and variety of products. However, the study involved only the login interface, which has simple design elements, so the applicability of the methods needs further study. In future studies we plan to add further user interviews to explore the specific factors that affect consistency. Increasing the number of users and of experimental groups for comparison could also further verify the experimental conclusions.
References
[1] Nielsen, J.: Coordinating User Interfaces for Consistency, pp. 35–55. Academic Press, Boston, MA
[2] Preece, J., Rogers, Y., Sharp, H.: Interaction Design: Beyond Human-Computer Interaction. John Wiley & Sons, Inc., New York (2002)
[3] Shneiderman, B.: Designing the User Interface: Strategies for Effective Human-Computer Interaction. Pearson Education (1998)
Evaluating Usability Improvements by Combining Visual and Audio Modalities in the Interface Carlos Duarte, Luís Carriço, and Nuno Guimarães LaSIGE – Faculty of Sciences of the University of Lisbon Edifício C6, Piso 3, Campo Grande 1749-016 Lisboa, Portugal {cad,lmc,nmg}@di.fc.ul.pt
Abstract. This paper reports the findings of an evaluation of an adaptive multimodal application for reading of rich digital talking books. Results are in accordance with previous studies, indicating no user perceived difference between applications with and without adaptivity. The NASA Task Load Index was also used and showed that users of the adaptive application reported less workload. Results also include a comparison between tasks executed with electronic support and tasks executed with print support, and also what specific features in the interface benefited the most from the use of visual and audio modalities. Keywords: Evaluation, Adaptive Interfaces, Multimodal Interfaces, Electronic and Print Reading, Digital Talking Books.
whose performance has increased over the years, are beginning to be deployed in general public applications, meaning more and more users have had contact with some kind of speech technology. However, most of these applications, like call centers, rely solely on audio. The combined use of two modalities remains outside the reach of the general public. In this paper we explore usability issues in an application using video and audio as input and output modalities. The following section briefly introduces the application used in the evaluation sessions. The next section describes the experimental setting and procedures. This is followed by the presentation of the evaluation results. Section 5 discusses the results, and the final section concludes the paper and presents future work.
2 Rich Book Player – An Adaptive Multimodal Digital Talking Book Player

The application used in the usability evaluation was the Rich Book Player, an adaptive multimodal Digital Talking Book player [4]. This player can present book content visually and audibly, in an independent or synchronized fashion. The audio presentation can be based on previously recorded narrations or on synthesized speech. The player also supports user annotations and the presentation of accompanying media, such as other sounds and images. In addition to keyboard and mouse inputs, speech recognition is also supported. Due to the adaptive nature of the player, the use of each modality can be enabled or disabled during the reading experience.

Figure 1 shows the visual interface of the Rich Book Player. All the main presentation components are visible in the figure: the book’s main content, the table of contents, the figures panel and the annotations panel. Their arrangement (size and position) can be changed by the reader, or as a result of the player’s adaptation. The other visual component, not present in Figure 1, is the search panel. Highlights are used in the main content to indicate the presence of annotated text and of text referencing images. The table of contents, the figures panel and the annotations panel can be shown or hidden. This decision can be taken by the user or by the system, with the system behavior adapting to the user behavior through its adaptation mechanisms. Whenever there is a figure or an annotation to present and the corresponding panel is hidden, the system may choose to present it immediately or to warn the user of its presence. The warnings are given in both visual and audio modalities.

All the visual interaction components have a corresponding audio interaction element, with one exception. Since the speech recognizer currently used in the player¹ does not support free speech recognition, annotations have to be entered by means of a keyboard. All the other commands can be given using either the visual elements or vocal commands.

¹ This applies to the Portuguese version of the player, which was the one used in the usability study.
Fig. 1. The Rich Book Player’s interface. The center window presents the book’s main content. On the top left is the table of contents. On the bottom left is the annotations panel. On the right is the figures panel.
3 Experimental Setting

The usability evaluation was carried out in the context of an article reviewing assignment for a Hypermedia Systems course. The students had several such assignments over the semester, each consisting of preparing a summary and an oral presentation of a given article. The summary and the oral presentation were group tasks, typically done over a two-week period. With the students’ agreement, it was decided that one of those assignments would be done with support from the Rich Book Player, over a one-day period. The assignment consisted of reading the article “The Dexter Hypermedia” individually during the morning period, and preparing a group summary and answering a short test during the afternoon.

Over a period of four days, thirty-three students participated in the evaluation: six on the first day, and nine on each of the other days. Given the number of simultaneous participants and the length of each session, the experiment was not conducted in our regular usability evaluation laboratory; instead, a special setting was prepared in another room. The room was set up with nine test stations. Each station consisted of a laptop computer with a larger screen, mouse, headphones, microphone and webcam attached to it. The Rich Book Player application was available on all stations. The application was endowed with logging capabilities, thus recording all of the participants' interaction with it. The stations also recorded the screen, voice inputs and webcam, thus allowing for a full backup of the experiment. In addition to the stations, two digital video cameras recorded other aspects of the interaction.
The experiment was divided into two periods. The morning period started with 30 minutes of application familiarization, followed by 120 minutes for article reading, and ended with a usability questionnaire. The afternoon period was composed of a 75-minute session for summary preparation, 30 minutes for answering a short test without access to the article, 30 minutes for the same test with access to the article, and finally, another questionnaire. For the summary preparation task, the annotations of all the group’s members were merged, and the group worked on only one station.

In order to evaluate the effects of using a multimodal application on the task of reading an article, the students were divided into two major groups. The control group read the article printed on paper, and the test group read the article using the Rich Book Player. In order to investigate the effects of adaptation, the test group was further divided into two groups: one with some of the adaptation features turned off, and another with all the adaptation features on. In total, the control group had nine members, and each of the other two groups had twelve.

To reduce the effect of extraneous variables, the following controls were applied:
• The tasks were the same for each participant.
• The tasks had the same time constraints for all participants. The questionnaires were answered immediately after task completion.
• All test stations were equipped with laptop PCs of the same model (Sony VAIO TX3) and external monitors with the same dimensions. All stations were configured to use the same screen resolution, operating system version, applications and desktop configuration.
4 Evaluation Results

The experiment results consist of qualitative data, gathered from the different questionnaires answered by the participants, and quantitative data, gathered from the logs and screen and video capture. In this paper we present and analyze the results from the qualitative data. Three sets of questionnaires were answered during the experiment by the participants from both test groups, and one set only by participants from the control group.

4.1 NASA Task Load Index

The first questionnaire administered to the participants was the NASA Task Load Index (NASA-TLX) [5]. All the participants answered this questionnaire since it focused on the task, not the application. The questionnaire was presented to the participants immediately after the completion of the article reading task. The NASA-TLX is a subjective workload assessment measure. NASA-TLX is a multi-dimensional rating procedure that derives an overall workload score based on a weighted average of ratings on six subscales: Mental Demands, Physical Demands, Temporal Demands, Own Performance, Effort and Frustration. The NASA-TLX was used in this experiment with the main goal of finding a difference between the scores of participants in the adaptive and non-adaptive groups,
and between these groups and the control group. Previous findings [6,7] show that users do not perceive advantages in using adaptive interfaces over non-adaptive interfaces. Using a subjective workload assessment measure might reveal a difference not directly perceived by the participants, leading to the following hypothesis:

H1 Performing the article reading task with the adaptive application, the non-adaptive application, or with a paper article, will result in different perceived workload measures.

Measures were collected for all participants (12 in the adaptive group, 12 in the non-adaptive group and 9 in the control group). A one-way ANOVA test was performed, and revealed that the perceived workload of users of the adaptive application (M = 53.30, SD = 14.27), users of the non-adaptive application (M = 57.11, SD = 13.45), and users with only a paper article (M = 57.56, SD = 14.79) did not differ significantly, F(2, 30) = 0.31, p > 0.05. The statistical analysis does not support hypothesis H1, meaning that the perceived workloads do not differ significantly based on the support used for reading the paper.

4.2 Usability Questionnaire

Following the NASA-TLX, participants in the adaptive and non-adaptive groups were asked to answer a second questionnaire. This 26-question questionnaire focused on feature usefulness and application usability, and was organized in the following groups: Navigation, Annotations, Images, Search, Adaptation (only for the adaptive application group), Presentation, Interaction and General Opinion. All the questions were answered on a 10-point scale. The General Opinion was measured on three questions, evaluating the participants’ opinion of and reaction to the application (figure 2). The correlation between the answers to the three questions was calculated, and all three were shown to be significantly correlated (p < 0.001).
Taking this significant correlation into account, it was possible to reach a single measure of opinion by adding the answers to the three questions for each participant. In accordance with what has been presented before, no significant difference was expected between the two groups, which led to the following hypothesis:

H2 The general opinion of users of the adaptive application is similar to the general opinion of users of the non-adaptive application.

To evaluate this hypothesis, a t-test was performed on the data, showing that the opinion of participants in the adaptive group (M = 18.92, SD = 6.05) was not significantly different from the opinion of the non-adaptive group (M = 17.33, SD = 5.02), t(22) = 0.70, p > 0.05.

For each of the other question groups in the questionnaire, t-tests were applied to the usability related questions, in order to understand how the use of multimodal output (visual and audio combined) contributed to the overall usability of the application. In the following paragraphs all reported t-tests take into consideration the necessary Bonferroni adjustment.
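The F and t statistics reported for H1 and H2 can be recovered from the published group sizes, means and standard deviations alone. The sketch below does this in plain Python; the helper names `one_way_anova` and `pooled_t` are hypothetical, and the t-test is assumed to be the equal-variance (Student) form, which is consistent with the reported 22 degrees of freedom but is not stated in the text.

```python
import math

def one_way_anova(groups):
    """F statistic from per-group (n, mean, sd) summary statistics."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def pooled_t(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed)."""
    (n1, m1, s1), (n2, m2, s2) = group_a, group_b
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t_stat = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, df

# (n, mean, sd) as reported: workload for H1, summed opinion score for H2.
workload = [(12, 53.30, 14.27), (12, 57.11, 13.45), (9, 57.56, 14.79)]
opinion_adaptive, opinion_non_adaptive = (12, 18.92, 6.05), (12, 17.33, 5.02)

f_stat, df1, df2 = one_way_anova(workload)
t_stat, df = pooled_t(opinion_adaptive, opinion_non_adaptive)
print(f"F({df1}, {df2}) = {f_stat:.2f}")  # F(2, 30) = 0.31
print(f"t({df}) = {t_stat:.2f}")          # t(22) = 0.70
```

Both values match the figures given in the text, which suggests the summary statistics and the test statistics are mutually consistent.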
Fig. 2. Average of the answers per participant group to the three criteria on the General Opinion group of the usability questionnaire
Regarding the navigation in the Rich Book Player, several features were offered, including navigation using the table of contents, going forward or backwards a word, sentence, paragraph or chapter, and direct selection in the main window content. The results indicate that the available mechanisms were considered usable, t(23) = 10.79, p < 0.001.

Annotation creation is one of the most difficult mechanisms to implement. Previous evaluations showed this [8], and prompted an alteration of the steps necessary to create an annotation. The procedure was redesigned to make more explicit the need to first select the part of the text being annotated, and only then to input the annotation. Better support for text selection was developed, including an initial suggestion of the current sentence, and simple commands to expand this selection. However, both the sequence of commands to create an annotation, t(23) = 1.79, p > 0.05, and the commands for helping with the text selection, t(23) = 2.00, p > 0.05, did not reach statistical significance, meaning test participants did not consider them particularly usable.

Search results appear highlighted in the text. To improve context acquisition, the whole sentence where the search term exists is also highlighted with a different color (one lighter than the one used for highlighting the searched terms). This feature was considered to improve the usability, t(21) = 4.21, p < 0.05.

The application also tried to minimize the movement of the main text window whenever another window appeared or disappeared from the screen, by controlling the appearance point, the width of the windows, and the position of the remaining windows whenever a window was hidden. This feature was considered useful by the test participants, t(23) = 4.20, p < 0.05. On an overall interaction rating, the Rich Book Player was considered usable by the participants, t(23) = 7.05, p < 0.001.
The awareness raising mechanisms made special use of the two modalities available, displaying text which had been annotated, or had an image associated with it, in different background colors, and also using verbal cues to signal the presence of such text. The current chapter was also highlighted in the table of contents, and after arriving at a new chapter, verbal cues indicated its number and name (whenever applicable). A series of questions concerned these features, and tried to evaluate whether they helped the users become aware of their place in the book and of what content existed around their current reading point. All the answers showed these features to be usable and effective awareness raising mechanisms, p < 0.05.

4.3 Comparing Electronic and Paper Reading

The final questionnaire, presented after the group summary writing task, asked the participants from the adaptive and non-adaptive groups to compare their experience of reading an article with the Rich Book Player application to that of reading printed articles. A questionnaire with eight questions comparing different aspects of the reading experience was prepared. Answers were given on a 5-point Likert scale. Once again, all the t-test results presented in the following paragraphs have taken into account the necessary Bonferroni adjustment.

The first question compared navigation in the electronic format to the printed format. The average of the answers was 3.79, and a t-test revealed that participants felt navigation in the electronic format was significantly easier than in the printed format, t(23) = 4.98, p < 0.05. The next question compared searching in both formats. The average answer was 3.96, and a t-test confirmed that participants felt that finding text in the electronic format is significantly easier than in the paper format, t(23) = 4.7, p < 0.05.

The two following questions dealt with annotation creation and annotation reading. Neither of these showed statistically significant results. The average answer was 3.00 for ease of annotation creation and 3.46 for annotation reading. The next question dealt with how easy it was to acquire the context of an image in both formats. Once again the answer is not statistically significant, even though the average answer, 3.21, is above the scale’s mid-point. Questions six and seven asked in which format the users felt it was quicker to read and easier to understand the article’s contents. The average for the first one was 3.04, and for the second one 3.13, with both failing to reach statistical significance. The last question asked which format is less tiring for reading the article. The average answer was 3.08, not reaching statistical significance.
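The Bonferroni adjustment mentioned above can be illustrated with a small sketch. The text does not say how many tests were counted in each family, nor does it give raw p-values, so the eight p-values below are invented purely for illustration and the function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p, significant?) per test under a Bonferroni correction."""
    m = len(p_values)
    results = {}
    for name, p in p_values.items():
        adjusted = min(1.0, p * m)  # same decision as testing raw p against alpha / m
        results[name] = (adjusted, adjusted < alpha)
    return results

# Hypothetical raw p-values for a family of eight questions (not the study's data).
raw = {
    "navigation": 0.0001,
    "search": 0.0002,
    "annotation creation": 0.90,
    "annotation reading": 0.08,
    "image context": 0.30,
    "reading speed": 0.04,
    "understanding": 0.50,
    "tiredness": 0.70,
}

for name, (adjusted, significant) in bonferroni(raw).items():
    print(f"{name}: adjusted p = {adjusted:.4f}, significant = {significant}")
```

In this made-up family, a raw p-value of 0.04 would pass an uncorrected 0.05 threshold but not the corrected one, which is the kind of case the adjustment is meant to guard against.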
5 Discussion
The analysis of the experimental results obtained so far allows some conclusions to be drawn regarding the usage of a digital book player endowed with multimedia and adaptive features: the comparison of an application with adaptive features turned on and off, the comparison of performing a task with electronic or printed support, and the improvements in usability gained from combining two modalities (in this case, visual and audio).
Evaluating Usability Improvements by Combining Visual and Audio Modalities
5.1 Adaptive Versus Non-adaptive Applications
When evaluating adaptive systems, additional problems have to be dealt with, in comparison to the evaluation of non-adaptive systems:
• The definition of a control group is difficult for those systems that cannot switch off the adaptivity to make a non-adaptive version, because it is an inherent feature of the system [9].
• Criteria for defining adaptivity success are not well defined. On the one hand, objective standard criteria have regularly failed to find a difference between adaptive and non-adaptive versions of a system. On the other hand, subjective criteria, standard in HCI research, have rarely been applied to the evaluation of adaptive systems [10].
• The effects of adaptivity in most systems are expected to be rather subtle in comparison to what may be expected from individual differences, and thus require precise measurements, potentially taking into account behavior and cognitive aspects of the users [11].
This study tried to deal with some of these aspects. By having some of the participants work with a print version of the article, it was possible to define a control group applicable to both adaptive and non-adaptive versions of the application. Furthermore, it was possible to turn off some of the application’s adaptive features without rendering it unusable, enabling a comparison between two versions of the application. The study also tried to establish a comparison between adaptive and non-adaptive versions of the same application using different subjective measures. The results, however, are in accordance with previous results in the literature, indicating no significant perceived differences between the adaptive and non-adaptive versions of the application, even though the opinion of the participants who worked with the adaptive version of the application was, on average, higher than that of the participants who worked with the non-adaptive version.
The same can be said about the perceived workload measured by the NASA TLX, where, once again, no statistical significance was found in the results. In this case, the comparison extended to the participants working with the print version, who achieved scores very similar to those of the participants working with the non-adaptive version of the application. The participants of the adaptive application group achieved lower scores on the NASA TLX, indicating a lower perceived workload. Even though the difference was not large enough to be statistically significant, it justifies further studies to investigate whether this indicator can identify a difference between adaptive and non-adaptive applications.
5.2 Reading in Electronic and Print Supports
Another aspect evaluated in this study was the participants’ opinion regarding the task of reading an article using an electronic medium offering multimodal output, compared to reading printed works. A somewhat surprising result was that the average answer to every question was at or above the mid-point of the 5-point Likert scale, meaning that no task was perceived as more difficult to perform in the electronic medium than in the printed medium. This
C. Duarte, L. Carriço, and N. Guimarães
was the expected result for some tasks, like searching, but not for other tasks like annotation creation. However, only two tasks were significantly easier to perform with the Rich Book Player than with printed articles: navigating and searching. While this was an expected result for searching tasks, given the digital support’s advantage, it is worth mentioning that navigation tasks also achieved the same level in the participants’ opinion. This is probably explained by the vast possibilities offered for navigation inside the application, allowing users to navigate to any point with ease.
5.3 Improvements from Multimodality
Multimodal output is used throughout the application: content is presented visually and aurally, awareness raising mechanisms combine both modalities, and the reading position is also presented in both modalities. Usability questionnaires assessed how the use of multimodality impacted the participants’ opinion of the application. The results show that combining visual and audio modalities led to improvements not felt in other areas of the interaction, where the modalities were not used in combination. This was particularly felt in the participants’ opinion of the usability of the awareness raising mechanisms.
6 Conclusions and Future Work
This paper presented the results of an evaluation of an adaptive multimodal Rich Digital Talking Book Player. This player combines visual and audio modalities, both for input and output, and is also endowed with adaptive capabilities, so that the interface’s behavior adapts in response to changes in the user’s behavior. The evaluation experiment involved 33 participants, arranged in three groups: an adaptive application group, a non-adaptive application group, and a control group which worked with printed texts. Evaluation results confirmed no perceived differences between adaptive and non-adaptive applications. However, when considering the NASA Task Load Index, the workload felt was smaller for the adaptive application group. This result did not reach statistical significance, but nevertheless prompts the need for further experiments. When comparing tasks performed with the Rich Book Player and tasks performed with printed texts, the participants’ general feeling was that it was easier to perform tasks with electronic support. While for some tasks (e.g. searching) this was expected, for others it was somewhat surprising. The use of multimodality has also proven beneficial from the usability viewpoint, particularly for implementing awareness raising mechanisms. To gather further results that may shed some light on the effects felt with long term usage of an adaptive application, another experiment is currently underway, where the participants have the Rich Book Player at their disposal in their home environment for a period of two months.
References
1. Kalyuga, S., Chandler, P., Sweller, J.: Managing split-attention and redundancy in multimedia instruction. Applied Cognitive Psychology 13(4), 351–371 (1999)
2. Sawhney, N., Schmandt, C.: Nomadic radio: speech and audio interaction for contextual messaging in nomadic environments. In: ACM Transactions on Computer-Human Interaction, vol. 7(3), pp. 353–383. ACM Press, New York (2000)
3. Oviatt, S., Coulston, R., Lunsford, R.: When Do We Interact Multimodally?: Cognitive Load and Multimodal Communication Patterns. In: Proceedings of the 6th International Conference on Multimodal Interfaces, State College, PA, USA, pp. 129–136. ACM Press, New York (2004)
4. Duarte, C., Carriço, L.: A conceptual framework for developing adaptive multimodal applications. In: Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, pp. 132–139. ACM Press, New York (2006)
5. Hart, S.G., Staveland, L.E.: Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In: Hancock, P.A., Meshkati, N. (eds.) Human mental workload, pp. 139–183. North-Holland, Amsterdam (1988)
6. Höök, K.: Evaluating the utility and usability of an adaptive hypermedia system. In: Moore, J., Edmonds, E., Puerta, A. (eds.) Proceedings of the 2nd International Conference on Intelligent User Interfaces, Orlando, Florida, United States, pp. 179–186. ACM Press, New York (1997)
7. Weibelzahl, S.: Evaluation of Adaptive Systems. PhD Dissertation. University of Trier, Germany (2003)
8. Duarte, C., Chambel, T., Simões, H., Carriço, L., Santos, E., Francisco, G., Neves, S., Rua, A.C., Robalo, J., Fernandes, T.: Avaliação de Interfaces Multimodais para Livros Falados Digitais com foco Não Visual. In: Proceedings of the 2nd Conferência Nacional em Interacção Pessoa-Máquina, Braga, Portugal (2006)
9. Höök, K.: Steps to take before intelligent user interfaces become real. In: Interacting with Computers, vol. 12(4), pp. 409–426. Elsevier, Amsterdam (2000)
10. Weibelzahl, S., Lippitsch, S., Weber, G.: Advantages, opportunities, and limits of empirical evaluations: Evaluating adaptive systems. Künstliche Intelligenz 16(3), 17–20 (2002)
11. Karagiannidis, C., Sampson, D.G.: Layered evaluation of adaptive applications and services. In: Brusilovsky, P., Stock, O., Strapparava, C. (eds.) AH 2000. LNCS, vol. 1892, pp. 343–346. Springer, Heidelberg (2000)
Tool for Detecting Webpage Usability Problems from Mouse Click Coordinate Logs
Ryosuke Fujioka1, Ryo Tanimoto2, Yuki Kawai2, and Hidehiko Okada2
1 Kobe Sogo Sokki Co., Ltd, 4-3-8, Kitanagasadori, Chuo-ku, Kobe 650-0012, Japan
2 Kyoto Sangyo University, Kamigamo Motoyama, Kita-ku, Kyoto 603-8555, Japan
[email protected], [email protected]
Abstract. In this paper, we propose a method that detects inconsistencies between user interaction logs of a task and desired sequences for the task based on mouse click coordinate logs. The proposed method models two successive clicks as a vector and thus a sequence of operations in a user/desired log as a sequence of vectors. A vector is from the ith clicked point to the (i+1)th clicked point in the screen. To detect inconsistencies between user interactions and desired sequences, each vector from user logs is compared with each vector from desired logs. As cues of usability problems, the method detects two types of inconsistencies: unnecessary and missed operations. We have developed a computer tool for logging and analyzing user interactions and desired sequences by the proposed method. The tool is applied to an experimental usability evaluation of ten business/public organization websites. Effectiveness of the method is evaluated based on the application result. The proposed method contributes to finding 61% of the usability problems found by a manual method in a much smaller amount of time: the number of clicks analyzed by an evaluator with the proposed method is only 1/5-1/10 of that with the manual method. This result indicates the proposed method is efficient in finding problems.
Keywords: Automated usability evaluation, web, user interaction logs, mouse clicks, usability problem cues.
The existing methods require widget-level logs for the comparisons: the logs are required to include data of widget properties such as widget label, widget type, title of parent window, etc. This requirement degrades independence and completeness in logging user interactions with systems under evaluation. In this paper, we propose a method that detects inconsistencies between the user logs and the desired sequences based on mouse click coordinate logs. Coordinate values of clicked points can be easily and fully logged independently of what widgets are clicked. We have developed a computer tool for logging and analyzing user interactions and desired sequences by the proposed method. The tool is applied to experimental usability evaluations of websites. Effectiveness of the method in usability testing of webpages is evaluated based on the application result.
2 Method for Analyzing Mouse Click Coordinate Logs
2.1 User Logs and Desired Logs
A user log can be collected by logging mouse clicks while a user (who does not know the desired sequence of a test task) performs the test task on a computer in user testing. In our research, one log file is collected per test user and test task: if the number of users is N and the number of tasks is M, then the number of user log files is N ∗ M (where all N users complete all M tasks). A “desired” log is collected by logging mouse clicks while a user (who knows the desired sequence of a test task well) performs the test task. For a test task, one desired log file is usually collected. If two or more different interaction sequences are acceptable as desired ones for a test task, two or more desired log files can be collected (and used in the comparisons described later).
2.2 Method for Detecting Inconsistencies in User/Desired Logs
The proposed method models two successive clicks as a vector and thus a sequence of operations in a user/desired log as a sequence of vectors. A vector is from the ith clicked point to the (i+1)th clicked point in the screen. To detect inconsistencies between a user log and a desired log, each vector from the user log is compared with each vector from the desired log. If the distance between the two vectors (vu from the user log and vd from the desired log) is smaller than a threshold, vu and vd are judged as being matched: the user operation modeled by vu is assumed to be the same operation as that modeled by vd. The method defines the distance of two vectors as a weighted sum of distance between start points and size of difference (Fig. 1).
\[ \text{Distance between start points} = \sqrt{w_x (x_1 - x_2)^2 + w_y (y_1 - y_2)^2} \tag{1} \]
\[ \text{Size of difference} = \sqrt{w_x (x_3 - x_4)^2 + w_y (y_3 - y_4)^2} \tag{2} \]
\[ \text{Vector distance} = w_p (\text{Distance between start points}) + w_v (\text{Size of difference}) \tag{3} \]
Fig. 1. Two Vectors and Their Distance (vectors vu and vd, with the points (x1,y1), (x2,y2), (x3,y3) and (x4,y4) used in Eqs. (1) and (2))
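The distance of Eqs. (1)-(3) can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation, and it rests on assumptions that the text does not confirm: each operation is stored as a start point plus a displacement, (x3,y3) and (x4,y4) in Eq. (2) are taken to be the displacement components of the two vectors, and both Eq. (1) and Eq. (2) are taken to be square roots of the weighted sums. The names `Op`, `weighted_norm`, `vector_distance` and `is_match` are hypothetical. The default weights and threshold are the "Original" values of Table 1.

```python
import math
from typing import NamedTuple


class Op(NamedTuple):
    """One operation: the vector between two successive clicks (hypothetical type)."""
    x: float   # start point x (first click)
    y: float   # start point y (first click)
    dx: float  # displacement to the next click along x
    dy: float  # displacement to the next click along y


def weighted_norm(dx, dy, wx=0.4, wy=1.0):
    """sqrt(wx*dx^2 + wy*dy^2); the square root is an assumption of this sketch."""
    return math.sqrt(wx * dx * dx + wy * dy * dy)


def vector_distance(vu, vd, wx=0.4, wy=1.0, wp=0.5, wv=1.0):
    """Weighted sum of start-point distance and size of difference."""
    start = weighted_norm(vu.x - vd.x, vu.y - vd.y, wx, wy)     # Eq. (1)
    diff = weighted_norm(vu.dx - vd.dx, vu.dy - vd.dy, wx, wy)  # Eq. (2), assumed reading
    return wp * start + wv * diff                               # Eq. (3)


def is_match(vu, vd, threshold=100.0, **weights):
    """Two operations are judged the same if their distance is below the threshold."""
    return vector_distance(vu, vd, **weights) < threshold
```

Because wx is smaller than wy, a horizontal offset between two clicks contributes less to the distance than a vertical offset of the same size.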
The role of weight factors wx and wy used in the calculations of distance between start points and size of difference is as follows. Users may click on links shown in a web browser window. The width of a link is usually larger than its height, especially for a text link. Therefore, the differences between clicked points for clicks on the same link are likely to be larger along the horizontal axis (the x coordinate values) than along the vertical axis (the y coordinate values). To deal with this, weights wx and wy are used so that horizontal differences count less than vertical differences. User operations that scroll webpages with the mouse wheel should also be taken into account: scrolling with the mouse wheel changes widget (e.g., link) positions in the screen, so the clicked positions may not be the same even for the same widget. Our method records the amount of wheel scrolling while logging interactions. By using the log of wheel scrolls, the coordinate values of clicked points are adjusted. Fig. 2 shows this adjustment. Suppose a user clicked the point (xi,yi) in the screen (Fig. 2(a)) and then clicked the point (xi+1,yi+1) (Fig. 2(b)). In this case, the vector derived from the two clicks is the one shown in Fig. 2(c). As another case, suppose a user scrolled down a webpage along the y axis with the mouse wheel between the two clicks and the amount of the scroll was S pixels. In this case, the vector derived from the two clicks is the one shown in Fig. 2(d).
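Fig. 2. Adjustment of Clicked Point for Mouse Wheel Scroll: (a) first click at (xi,yi); (b) second click at (xi+1,yi+1); (c) vector without scrolling, vu=(xi+1-xi,yi+1-yi); (d) vector after a scroll of S pixels, vu=(xi+1-xi,yi+1+S-yi), with the second click treated as (xi+1,yi+1+S)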
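The adjustment of Fig. 2 can be sketched as follows. The log format is a hypothetical one chosen for illustration (the text does not describe the tool's file format): a list of `("click", x, y)` and `("scroll", pixels_down)` events. How the start point of the following vector is treated after a scroll is not stated in the text; this sketch simply uses the logged screen coordinates.

```python
def click_vectors(events):
    """Turn a log of clicks and wheel scrolls into scroll-adjusted vectors.

    events: list of ("click", x, y) or ("scroll", pixels_down) tuples
            (hypothetical log format).
    Returns a list of (start_x, start_y, dx, dy) tuples, one per pair of
    successive clicks.
    """
    vectors = []
    prev = None    # previously clicked point, if any
    scrolled = 0   # pixels scrolled down since the previous click
    for event in events:
        if event[0] == "scroll":
            scrolled += event[1]
        elif event[0] == "click":
            _, x, y = event
            if prev is not None:
                px, py = prev
                # Fig. 2(d): the second click is treated as (x, y + S).
                vectors.append((px, py, x - px, y + scrolled - py))
            prev = (x, y)
            scrolled = 0
    return vectors
```

For example, a click at (100, 200), a scroll of 300 pixels and a click at (150, 100) give the single vector with displacement (50, 200), as if the second click had been made at (150, 400) on the unscrolled page.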
2.3 Two Types of Inconsistencies as Cues of Usability Problems
As cues of usability problems, the proposed method detects two types of inconsistencies between user interactions and desired sequences. We call them “unnecessary” operations and “missed” operations. Fig. 3 illustrates unnecessary and missed operations.
Fig. 3. Unnecessary Operations and Missed Operations (a desired log compared with a user log; legend: m = missed operation, u = unnecessary operation, plus a marker for two operations judged as the same)
Unnecessary operations are user operations judged as not included in the desired sequences, i.e., operations in a user log for which no operation in the desired logs is judged to be the same one in the comparison of the user/desired logs. The method regards such user operations as unnecessary because they may not be necessary for completing the test task. Unnecessary operations can be cues for evaluators to find usability problems in which users clicked on a confusing link when another link was desired (expected) to be clicked on for the task. Missed operations are desired operations judged as not included in the user interaction sequences, i.e., operations in the desired logs for which no operation in a user log is judged to be the same one. The method regards such operations as missed because they may be necessary for completing the test task but the user finished the task without performing them. Missed operations can be cues for evaluators to find usability problems in which a link is not clear enough or not easy for users to find. Our method models an operation in a user/desired log by a vector derived from clicked point coordinate logs, so the method detects unnecessary/missed operations as unnecessary/missed vectors. Suppose two or more successive operations are unnecessary ones in a user log. In this case, the first operation is likely to be the best cue among the successive unnecessary operations. This is because the user might have deviated from the desired sequence with the first operation (i.e., the operation expected in place of the user operation was not clear enough for the user) and then performed additional operations irrelevant to the test task until returning to the desired sequence. The method can extract the first operations in the successive unnecessary operations and show them to human evaluators so that the evaluators can analyze usability problem cues (unnecessary operations in this case) efficiently.
2.4 Unnecessary/Missed Operations Common to Users
Unnecessary/missed operations common to many of the test users are useful cues for finding problems that depend less on individual differences among the users. The method analyzes how many users performed the same unnecessary/missed operation. The analysis of the user ratio for the same missed operation is simple. For each missed operation, the number of user logs that do not include the desired operation is counted. To analyze the user ratio for the same unnecessary operation, the method
compares unnecessary operations extracted from all user logs of the test task. This comparison is done in the same way as operations (vectors) in user/desired logs are compared. By this comparison, unnecessary operations common among multiple users can be extracted (Fig. 4).
Fig. 4. Unnecessary Operations Common to Users
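The detection steps of subsections 2.3 and 2.4 can be sketched as below. This is an illustrative sketch, not the authors' tool: operations are assumed to be `(x, y, dx, dy)` tuples (start point and displacement), the distance follows Eqs. (1)-(3) with an assumed square root and the "Original" weights of Table 1, and all function names are hypothetical. The extraction of unnecessary operations common to several users (Fig. 4) is not shown.

```python
import math


def distance(vu, vd, wx=0.4, wy=1.0, wp=0.5, wv=1.0):
    """Vector distance following Eqs. (1)-(3); vectors are (x, y, dx, dy)."""
    start = math.sqrt(wx * (vu[0] - vd[0]) ** 2 + wy * (vu[1] - vd[1]) ** 2)
    diff = math.sqrt(wx * (vu[2] - vd[2]) ** 2 + wy * (vu[3] - vd[3]) ** 2)
    return wp * start + wv * diff


def matches(vu, vd, threshold=100.0):
    """Two operations are judged the same if their distance is below the threshold."""
    return distance(vu, vd) < threshold


def unnecessary_ops(user_log, desired_logs, threshold=100.0):
    """Indices of user operations that match no operation in any desired log."""
    desired = [vd for log in desired_logs for vd in log]
    return [i for i, vu in enumerate(user_log)
            if not any(matches(vu, vd, threshold) for vd in desired)]


def missed_ops(user_log, desired_log, threshold=100.0):
    """Indices of desired operations that match no operation in the user log."""
    return [j for j, vd in enumerate(desired_log)
            if not any(matches(vu, vd, threshold) for vu in user_log)]


def first_of_runs(indices):
    """Keep only the first index of each run of successive indices."""
    return [i for k, i in enumerate(indices)
            if k == 0 or i != indices[k - 1] + 1]


def missed_user_ratio(user_logs, desired_log, threshold=100.0):
    """For each desired operation, the fraction of user logs that miss it."""
    counts = [0] * len(desired_log)
    for log in user_logs:
        for j in missed_ops(log, desired_log, threshold):
            counts[j] += 1
    return [c / len(user_logs) for c in counts]
```

Passing the result of `unnecessary_ops` through `first_of_runs` gives the reduced set of cues that subsection 2.3 suggests showing to the evaluator first.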
3 Evaluating Effectiveness Based on Case Study
3.1 Design of Experiment
Ten websites of business/public organizations are selected. For each site, a test task is designed. The average number of clicks in the designed sequences for the ten test tasks is 3.9. Five university students participate in this experiment as test users. Each test user is asked to perform the task on each site. They have enough knowledge and experience in using web pages with a PC web browser, but they use the websites for the first time. The desired sequences of the test tasks are not told to the test users. Thus, if the desired sequences are not clear enough for the test users, the users are likely to deviate from the desired sequences, and unnecessary and/or missed operations will be observed. The interaction of each user for each test task is logged into a user log file. To avoid fatigue affecting the results, the time of the experiment for each user is limited to 60+ minutes: each test user is asked to perform a test task within five or ten minutes depending on the task size. Fifty user logs (five users ∗ ten tasks) and ten desired logs (one log per test task) are collected. For each task, a computer tool implementing the proposed method analyzes the logs and extracts possible cues of usability problems (i.e., unnecessary/missed operations). An evaluator tries to find usability problems from the extracted cues.
3.2 Weight Factors and Thresholds for Vector Distance
Our method requires us to determine the values of weight factors wx, wy, wp and wv and the threshold value of vector distance (see subsection 2.2). To determine these values, we conducted a pre-experiment with another test user. Based on the analysis of the log files collected in the pre-experiment, we investigated appropriate values that lead to an accurate result in detecting unnecessary/missed operations. The values in the row labeled “Original” in Table 1 show the obtained values.
In our method, distance of two operations (vectors) is defined by Eqs. (1)-(3). In the case where wp = 0, the vector distance = the distance between start points so that two operations are compared by the clicked points only (i.e., a click in a user log and a click in a desired log are judged as the same operation if the clicked points are near). Similarly, in the case where wv = 0, the vector distance = the size of difference so that two operations are compared by the size of vector difference only (i.e., the position on which the click is performed is not considered). We evaluate these two variations of the method. Variations A/B in Table 1 denote those in which wp/wv = 0, respectively. Table 1. Values of Weight Factors and Threshold for Vector Distance
Method        wx    wy    wp    wv    Threshold (pixel)
Original      0.4   1.0   0.5   1.0   100
Variation A   0.4   1.0   0.0   1.0   67
Variation B   0.4   1.0   1.0   0.0   34
3.3 Number of Problems Found
To evaluate the effectiveness of our method in finding usability problems, we compare the number of problems found by the method with the number found by a method based on manual observation of user interactions. In addition to recording click logs in user interaction sessions, the PC screen image was also captured to movie files (a screen recorder program was used on the PC). A human evaluator observes user interactions by replaying the captured screen movies and tries to find usability problems. This manual method requires much time for the interaction observation, but it helps to find problems thoroughly. In this experiment, the evaluator who tries to find problems by the proposed method and the evaluator who tries to find problems by the manual method are different, so that the result with one method does not bias the result with the other. Table 2 shows the number of problems found by each of the methods. The values in the table are the sums for the ten test tasks (sites). Eleven problems are shared in the four sets of the problems, i.e., the proposed (original) method contributes to finding 61% (=11/18) of the problems found by the manual method. Although the number of problems found by the proposed method is smaller than that found by the manual method, the time for a human evaluator to find the problems by the proposed method is much less than the time by the manual method. In the case of the manual method, an evaluator has to investigate all clicks by the users because user clicks that are possible problem cues are not automatically extracted. In the case of the proposed method, a human evaluator only has to investigate the smaller number of clicks extracted as possible problem cues by the method. In this experiment, the number of clicks to be investigated in the case of the proposed method is 1/5-1/10 of the number in the case of the manual method.
Table 2. Number of Problems Found by Each Method
Method        #Problems
Manual        18
Original      15
Variation A   14
Variation B   13
This result of the case study indicates that the proposed method
• contributes to finding usability problems to a certain extent in terms of the number of problems, and
• is much more efficient in terms of the time required.
3.4 Unnecessary/Missed Operations Contributing to Finding Problems
Not all unnecessary/missed operations extracted by the proposed method may contribute to finding usability problems. The more of the extracted unnecessary/missed operations contribute to finding problems, the more efficiently the problems can be found. The contribution ratio is investigated for the proposed method and its variations (Table 3). Values in the “Counts (first)” column are the counts of unnecessary operations that are the first in two or more successive unnecessary operations (see subsection 2.3). For example, the original method extracted four missed operations in total from the log files of the ten test tasks, and 25.0% (one) of the four operations contributed to finding a problem. Similarly, the original method extracted 375 unnecessary operations in total, and 7.5% (28) of the 375 operations contributed to finding problems.
Table 3. Number of Unnecessary and Missed Operations Found by Each Method and the Ratio of Contribution in Finding Problems
              Missed Operations    Unnecessary Operations
Method        Counts   Ratio       Counts (all)   Ratio   Counts (first)   Ratio
Original      4        25.0%       375            7.5%    51               49.0%
Variation A   8        37.5%       422            5.7%    58               39.7%
Variation B   1        0.0%        299            9.4%    72               37.5%
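The percentages quoted in subsections 3.3 and 3.4 can be re-derived from the counts stated in the text. The short check below uses only those stated counts (11 of 18 problems, 1 of 4 missed operations, 28 of 375 unnecessary operations); the helper name `ratio_percent` is hypothetical.

```python
def ratio_percent(contributing, total):
    """Share of items that contributed, expressed as a percentage."""
    return 100.0 * contributing / total


# Counts taken from the text of subsections 3.3 and 3.4.
problems_shared = ratio_percent(11, 18)        # problems also found manually
missed_original = ratio_percent(1, 4)          # missed operations, original method
unnecessary_original = ratio_percent(28, 375)  # unnecessary operations (all), original method
```

Rounded, these values agree with the 61%, 25.0% and 7.5% reported for the original method.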
Findings from the result in Table 3 are as follows.
• In the three methods, the ratios are larger for unnecessary operations (first) than those for unnecessary operations (all). This supports our idea that an evaluator can find usability problems more efficiently by analyzing only the first operations in the successive unnecessary operations.
• In the result of unnecessary operations (first), the ratio for the original method is larger than that of either of the two variations. In the result of missed operations, the ratio for the original method is larger than that for variation B but smaller than that for variation A. This indicates that both the original method and its variation A are promising.
4 Conclusion
In this paper, we proposed a method that extracts cues for finding usability problems from user/desired logs of clicked points. To detect inconsistencies between user and desired logs, the method compares operations in the logs. The method compares user/desired operations by modeling each operation as a vector derived from coordinate values of the clicked points and checking the distance between two vectors. The distance is defined as a weighted sum of distance between start points and size of difference for the two vectors. The method extracts two types of inconsistencies: unnecessary and missed operations. Effectiveness of the proposed method was evaluated based on a case study. We tried to find usability problems for ten websites by the proposed method and the manual method. The proposed method contributes to finding 61% of the usability problems found by the manual method in a much smaller amount of time: the number of clicks analyzed by an evaluator with the proposed method was only 1/5-1/10 of that with the manual method. This result indicates the proposed method is efficient in finding problems. In our future work, we will extend our method by utilizing log data of click time intervals. Timestamps are another kind of data that can be easily, independently and fully logged. By utilizing both logs of clicked points and time intervals, usability problem cues that are more likely to contribute to finding problems will be obtained. Additional case studies are also necessary for further evaluations of our method.
References
1. Ivory, M.Y., Hearst, M.A.: The State of the Art in Automated Usability Evaluation of User Interfaces. ACM Computing Surveys 33(4), 1–47 (2001)
2. Okada, H., Asahi, T.: GUITESTER: a Log-based Usability Testing Tool for Graphical User Interfaces. IEICE Transaction on Information and Systems E82-D(6), 1030–1041 (1996)
3. Okada, H., Ashio, T., Kunieda, K., Shimazu, H.: Interaction Logging and Analysis Tool Estimating Expected Operations for Unexpected User Operations. In: Proc. of the 11th Int. Conf. on Human-Computer Interaction (HCI International 2005), CD-ROM (2005)
A Game to Promote Understanding About UCD Methods and Process
Muriel Garreta-Domingo1, Magí Almirall-Hill1, and Enric Mor2
1 Learning Technologies Dept - Universitat Oberta de Catalunya, Av. Tibidabo 47, 08035 Barcelona, Spain {murielgd,malmirall}@uoc.edu
2 Computer Science, Multimedia and Telecommunication Dept - Universitat Oberta de Catalunya - Rambla del Poblenou 156, 08018 Barcelona, Spain [email protected]
Abstract. The User-centered design (UCD) game is a tool for human-computer interaction practitioners to demonstrate the key user-centered design methods and how they interrelate in the design process in an interactive and participatory manner. The target audiences are departments and institutions unfamiliar with UCD but whose work is related to the definition, creation, and update of a product or service. Keywords: Games with a purpose, game pieces, HCI education, HCI evangelization, user-centered design, role-playing, design games, experience.
The project is organized in 12 work packages and besides the coordination and methodological packages, all except two are technically-oriented and led by programmers. These two exceptions are the first work package, which consists of gathering user requirements, and the second work package, which is responsible for the prototyping and user testing of all developed modules. Therefore, these two work packages with the help of the methodology package are responsible for ensuring that the entire project and, consequently, all development teams follow a UCD approach. As Twidale and Marty state, usability professionals often have to combine the roles of usability advocates, educators and practitioners [10]. Bias and Mayhew [2] addressed this issue by putting together a collection of articles on cost-justification of usability. However, the argument of cost-justification by itself is not enough to introduce UCD in an organization. As Siegel [7] explains, “success will hinge not on a single convincing argument, but on the many interrelated ideas we introduce to our organizations, on the kinds of relationships we build with various stakeholders, and on how we demonstrate our value to them first hand.” At the Open University of Catalonia (UOC), one of the Campus project participating universities, we created a game as a tool to increase the understanding of UCD methods. Through a participatory and interactive manner, its purpose is to promote a better understanding of a good design process; showing the importance of knowing the end user and keeping the focus on the user as well as choosing the right methods for analyzing the users and evaluating the design. Several activities have been programmed in the context of the Campus project in order to proselytize and teach the project teams the importance of a UCD process and the best way to apply it. As a part of these activities, we decided to deploy the UCD game for the Campus project teams.
2 The UCD Game: Origin, Audience and Goals
The UCD game idea was created after the celebration of World Usability Day (WUD) 2005. As part of the UCD diffusion goal at UOC, different activities were organized for the occasion. There were formal presentations about in-house projects that followed the UCD process. Outside the conference room, a set of independent stations was placed for visitors to receive an overview of the UCD process and methods, experience a usability test in a lab setup, and use a computer with a screen reader installed (JAWS) in order to understand the importance of accessibility. This accessibility station, where participants had to browse the Internet with the monitor off and only with the help of the screen reader, was the most successful of all the activities organized. The aim when designing the game was to obtain a set of engaging stations where participants can experience the different steps of a UCD process. It is structured as a team and participatory activity with a set of interrelated tasks because the goal is not only to show how each project phase is accomplished individually but also how the project is completed and how these different phases relate to one another.
2.1 The Game Goals
The setup of the UCD game is similar to the Interactionary, a design exercise envisioned and organized by Berkun [1]; however, the main difference is that, unlike
M. Garreta-Domingo, M. Almirall-Hill, and E. Mor
Interactionary, our game is not an on-stage competition and does not address designers or an HCI audience. In this sense, the goals of the game show created by Twidale and Marty [10] are more closely related to our objectives. Yet, while their game illustrated usability evaluation methods, our game strives to illustrate the UCD process and techniques. Buchenau and Fulton [3], in their paper about experience prototyping, quote the Chinese philosopher Lao Tse: “What I hear, I forget. What I see, I remember. What I do, I understand!” Our game is a way of promoting understanding by doing. Like Buchenau and Fulton’s paper, several others describe how to include “doing” in the design process through role-playing, informance design, interactive scenarios, participatory design, etc. [4,5,6,8,9]. However, these papers address a different problem from that of the UCD game and are therefore aimed at a different audience and pursue different goals. In discussing exploratory design games, Brandt [4] describes various kinds of games, one of which is similar in concept to the UCD game: “The primary aim with the negotiation and workflow oriented games is for the designers to understand existing work practice. Game boards and game pieces are produced in paper. The outcome of the game playing is often flow diagrams showing relations between people and various work task or tools.” In our case, we want the Campus project participants to understand UCD work practices, using game pieces for each of the UCD phases and a game board to show the relations between the different phases and the end design. In summary, the purpose of the game is:
1) To show the key steps of a UCD process in an enjoyable and informal setting.
2) To help participants understand how these steps relate to each other.
3) To provide an overview of the main HCI techniques and methods.
4) To illustrate that the user target and the methods used affect the end design.
2.2 The Target Audience
We initially created the UCD game in the context of the UOC, a completely online university with more than 40,000 students that offers 19 official undergraduate degrees as well as several graduate programs. As a result, the virtual campus plays a key role at the UOC as it is the work tool for UOC employees, the teaching tool for faculty members, and the learning tool for students. In such a context, UCD should play a central role in all UOC departments that design products for the virtual campus users. Nevertheless, this is still not the case today. Although the introduction of usability and HCI concepts in the organization started in 2002, they are still not well understood and therefore not always properly applied. Hence, we created the game as a tool to promote a better understanding of a good design process in hopes of demonstrating the importance of understanding and focusing on the end user as well as choosing the right methods for analyzing the users and evaluating the design. The target audiences are organizational departments that participate in the creation, definition, and update of the virtual campus applications. Even though this target audience is formed by people familiar with the concept of usability and UCD, the goal is to ensure that the game is comprehensible even for people unaware of the existence of UCD.
A Game to Promote Understanding About UCD Methods and Process
Within the context of the Campus project, nine universities are actively working on its development. Additionally, several other universities and public institutions act as observers of the project and may, in the near future, use the virtual campus as their learning management system. The audience is therefore much more diverse than at UOC, since it includes active and passive management, project leaders and developers from both public and private institutions. For our purposes, the target audience of the game was management and project leaders, since they are responsible for ensuring that their teams follow a UCD process. However, our aim was to include developers as well, because the more participants understand the value of UCD, the greater the opportunities for UCD to be applied throughout the development processes.
3 The Game Structure
The game consists of four stations, each representing a phase in the UCD process: defining the users, analyzing the users’ needs, designing the artifact and evaluating the resulting artifacts. As in the exploratory design games [4], players do not compete. Each team goes through the stations and, at the end, all game boards are shown together in a separate room so that participants and observers can evaluate the design solutions. The WUD, promoted by the Usability Professionals’ Association, was the first setting for the UCD game. The Campus project was the context of our second application of the game. Groups of 3 to 4 people are formed with participants from different institutions and departments. To begin, they read an overall description of the game and are given a one-page description of the design problem.
3.1 The Design Problem
Like Berkun [1], we decided that a non-web problem would work best with a large audience and that the physical design of a public, non-work-related object would be better, as these concepts are familiar to everyone and the details are broad enough for everyone to follow along. These considerations played a key role in choosing a design problem; in the end, we opted for the design of an airport self-check-in machine. The initial design problem was to create a ticket vending machine. To narrow the scope, the machine was supposed to sell only tickets to the airport and was to be placed in a central railroad station of Barcelona, Spain. For the self-check-in machine, we also narrowed the problem to the flights of a specific airline between Barcelona and Melbourne or Sydney.
3.2 The First Game Station: Defining the Users
The aim of this first station is to introduce the idea that good design is accomplished by thinking of the end user and that this end user is neither the designer nor anyone else.
The team is presented with four groups of people, each containing four users with pictures and a short demographic description. Participants are asked to choose the group of users for whom they will design and write down their main characteristics.
Initially, all possible users were presented individually, but we realized that participants needed a substantial amount of time to choose and group a set of users in order to build their primary user type. We have found that having the groups already formed is clearer for our audiences.
3.3 The Second Game Station: Analyzing the Users’ Needs
The aim of this second station is to show that designers use several quantitative and qualitative methods to gather data about the chosen target. Defining the users is only the first step; in this phase, participants analyze the users’ needs, wants, contexts, and limitations by choosing a maximum of three methods from the UCD toolbox. After opening the envelopes of the selected methods, the team has to summarize the findings and write down a list of characteristics that should be considered when designing the artifact. For example, for the contextual inquiry method, the team watches a video of the Barcelona airport. For benchmarking, they have pictures of other self-check-in machines already in use at the airport. Other methods available in the toolbox are in-depth interviews, focus groups, surveys and log analysis. On the outside of each envelope there is a short description of the technique to help participants choose the ones that they consider most useful. Inside there is more information about the technique as applied to the design problem and the results of conducting it. For instance, for an in-depth interview, the inside page contains a list of possible interview questions and a list of possible answers given by users.
3.4 The Third Game Station: Designing the Artifact
The goal of the third station is to show that a successful design is focused on the end user. Consequently, designers should not jump directly to the end design but should consider the output of the previous stations and follow an iterative design process. The team is also asked to use one of the evaluation methods available in another UCD toolbox.
Assuming the team has understood the UCD philosophy, the result of this station will be a simple prototype and a list of changes that should be made to it after applying an evaluation technique. The game organizers play the role of the users for the evaluation techniques. For user testing, the team has to think of one or two tasks they would like the user to accomplish. The organizer will then perform these tasks using the prototype.
3.5 The Fourth Game Station: Evaluating the Designed Artifacts
At the end of the game, each team pastes the one-page output of each station on a horizontal game board. The board is divided into four quadrants: 1) photos of the target users and their key characteristics, 2) required characteristics of the artifact according to the user analysis and the methods used, 3) the first low-fidelity prototype and a list of changes resulting from its evaluation, and 4) the evaluation of the development process. Game boards are displayed in a room where participants and other observers can see the different designs and UCD processes. In order to evaluate the designs, participants and observers have a questionnaire that contains questions such as “Does the
design take into account the context of use?” or “Did the team evaluate their first design solution?”
4 Deploying the UCD Game
In order to test the game structure and its different stations, we initially ran a pilot of the game with a small group, half of whom were HCI experts while the others were familiar with UCD but had never applied a full UCD process. The mixed groups were required to go through each of the four stations of the game: defining the target user, analyzing the user’s needs, designing, and evaluating. It was very rewarding to see the groups make different decisions at each of the stations. Since the groups defined different user characteristics and target goals and selected different evaluation methods, the final designs varied greatly. In this sense, the pilot study showed that the game is useful in demonstrating how the phases relate to each other and that designs depend on the characteristics of the end user and the methods used. From the post-game questionnaire, we concluded that all participants considered the game useful for showing the value of UCD methods and process and that it was an enjoyable, refreshing and enriching experience. We also obtained feedback on areas to improve, such as tighter control of the time for each station and a less technical and less ambiguous description of the phases and methods. Our second application of the game was on November 14th during World Usability Day 2006. Around thirty people (8 groups) participated in the game. From observing the teams and from the post-game questionnaire, we gathered that most participants enjoyed the experience and found it a successful tool for showing the UCD process and methods. Again, the time spent on each station was perceived as too long, despite the timers at each station and the organizers, who tried to encourage groups to move to the next station. Participants who had an interest in UCD wanted to do good work in each phase and therefore took longer than the allowed time. The biggest problem caused by the lack of time was that the teams were not able to view the other teams’ solutions.
Thus, visualizing how different target users and processes led to different results was one of the goals not accomplished in this application of the game. Running the UCD game for the Campus project participants was more challenging, since most of them are not interested in UCD. This lack of interest reduced the time problem, yet the game still proved useful as another tool to promote understanding of the UCD process and approach.
5 Conclusions
We created the game in order to show the UCD process and methods to an audience of non-experts whose tasks are nevertheless related to the definition, creation, and update of a product or service. We have deployed the game on three different occasions with diverse contexts and audiences. The feedback given by the different types of participants told us that the game is perceived as enjoyable and useful for our purpose. Recalling our goals when creating the game (1) To show the key steps of a UCD process in an enjoyable and informal setting. 2) To help participants understand how these steps relate to each other. 3) To provide an overview of the main HCI
techniques and methods. 4) To illustrate that the user target and the methods used affect the end design.), we are confident that it manages to accomplish the four objectives in a short period of time. However, new questions arise: does it make a difference in the participants’ everyday work? Will they consider applying a UCD approach in their next project? Will they be more willing to include the results of UCD methods in their work? We plan to answer these questions by deploying the UCD game again in the Campus project context, but with the real design problem. As mentioned, the first work package of the project is to gather user requirements. The output of the package will be personas, scenarios and needs of the future campus users. We will use the main goal of the project as the design problem and these outputs to prepare the game materials. With this new focus of the game, we expect to increase the developers’ involvement in the UCD process as well as to obtain interesting feedback from both developers and observers for the project development. The UCD game is a powerful and flexible tool that can be applied for different goals, in diverse contexts and for different audiences. Although each setting will require an adapted design problem, the overall structure of the game is a useful guide for all cases.
Acknowledgments. This work has been partially supported by a Spanish government grant under the project PERSONAL (TIN2006-15107-C02-01) and by the Campus project promoted by the Generalitat de Catalunya.
References
1. Berkun, S.: Interactionary: Sports for design training and team building. http://www.scottberkun.com/dsports
2. Bias, R.G., Mayhew, D.J. (eds.): Cost-Justifying Usability: An Update for the Internet Age. Morgan Kaufmann, San Francisco, CA, USA (2005)
3. Buchenau, M., Fulton Suri, J.: Experience Prototyping. In: Proceedings on Designing Interactive Systems, pp. 424–433. ACM Press, New York (2000)
4. Brandt, E.: Designing Exploratory Design Games: A Framework for Participation in Participatory Design? In: Proceedings Participatory Design Conference, pp. 57–66. ACM Press, New York (2006)
5. Burns, C., Dishman, E., Verplank, W., Lassiter, B.: Actors, Hairdos & Videotape – Informance Design. In: Proceedings of CHI 1994, pp. 119–120. ACM Press, New York (1994)
6. Klemmer, S.R., Hartmann, B., Takayama, L.: How Bodies Matter: Five Themes for Interaction Design. In: Proceedings on Designing Interactive Systems, pp. 140–149. ACM Press, New York (2006)
7. Siegel, D.: The Business Case for User-Centered Design: Increasing Your Power of Persuasion. Interactions 10(3), 30–36 (2003)
8. Simsarian, K.T.: Take it to the Next Stage: The Roles of Role Playing in the Design Process. In: Proceedings of CHI 2003, pp. 1012–1013. ACM Press, New York (2003)
9. Svanaes, D., Seland, G.: Putting the Users Center Stage: Role Playing and Low-fi Prototyping Enable End Users to Design Mobile Systems. In: Proceedings of CHI 2004, pp. 479–486. ACM Press, New York (2004)
10. Twidale, M.B., Marty, P.F.: Come On Down! A Game Show Approach to Illustrating Usability Evaluation Methods. Interactions 12(6), 24–27 (2005)
DEPTH TOOLKIT: A Web-Based Tool for Designing and Executing Usability Evaluations of E-Sites Based on Design Patterns
Petros Georgiakakis1, Symeon Retalis1, Yannis Psaromiligkos2, and George Papadimitriou1
1 University of Piraeus, Department of Technology Education and Digital Systems, 80 Karaoli & Dimitriou, 185 34, Piraeus. Tel.: 0030 210 414 2746
2 Technological Education Institute of Piraeus, General Department of Mathematics, Computer Science Laboratory, 250, Thivon & P. Ralli, 122 44 Athens, Greece. Tel.: 0030 210 5381193, Fax: 0030 210 5381351
{geopet,retal,papajim}@unipi.gr,[email protected]
Abstract. This paper presents a tool that supports a scenario-based expert evaluation method called DEPTH (usability evaluation method based on DEsign PaTterns & Heuristics criteria). DEPTH is a method for performing scenario-based heuristic usability evaluation of e-systems. It focuses on the functionality of e-systems and emphasizes usability characteristics within their context. This is done by examining not only the availability of a functionality within an e-system but also the usability performance of the supported functionality according to a specific context of use. The main underlying ideas of DEPTH are: i) to minimize the preparatory phase of a usability evaluation process and ii) to assist a novice usability expert (one who is not necessarily familiar with the genre of the e-system). Thus, we (re)use expert knowledge captured in design patterns and structured as design pattern languages for the various genres of e-systems. This paper briefly describes the DEPTH method and presents the way a specially designed tool supports it, along with the findings from an evaluation study.
Keywords: Heuristic evaluation, design patterns, reuse of design expertise.
account the user needs. The benefits anticipated are as follows: increased sales, customer satisfaction, customer retention, reduced support, and stronger brand equity [13]. Usability evaluation of e-sites is not an easy task and requires a lot of effort [9]. One approach is the use of usability experts, which raises the cost for the organization undertaking the task [2]. It is often difficult to find a usability expert who can both pinpoint the usability problems that stem from the general usability heuristics and determine the usability problems that relate to the specific context of use of the e-site. Not only is it difficult to find usability experts [10], but it is even harder to find usability experts for a specific genre of e-sites. Digital genres are described as a classification system for kinds and types of digital products [12]. In recent years, several digital genres of e-sites have been studied, such as online newspapers, e-shops, e-travel sites, etc. Thus, a practical approach to the problem of finding usability experts for the specific genre of an e-site under evaluation could be to help a typical novice usability engineer perform usability evaluation for that genre of e-sites accurately and efficiently. This can be achieved by transferring expert knowledge to novice usability engineers and guiding them to perform an e-site evaluation with the aid of systematic approaches and supporting toolkits. One such approach is the DEPTH method (usability evaluation based on DEsign PaTterns and Heuristics criteria). DEPTH is a scenario-based expert evaluation method. It eliminates the difficulties of expert-based evaluation described above and provides an integrated framework where the novice usability evaluator can find and (re)use expert knowledge to better perform the evaluation tasks for genres of e-sites.
The innovative ideas behind the DEPTH approach are: i) the reuse of expert knowledge in the form of design patterns during the evaluation process, where a design pattern describes a problem, a solution to it in a particular context, and the benefits or drawbacks of using that solution [1, 3]; and ii) the use of scenarios specific to genres of e-sites. In this paper we describe the DEPTH toolkit, a prototype Web-based tool for designing and implementing usability evaluations of e-sites based on the DEPTH usability method [4, 11]. The rest of the paper is organized as follows: Section 2 describes DEPTH in detail. Section 3 describes the application of DEPTH to two systems classified as Learning Brokerage Platforms (LBPs) in order to clarify the main points of the method. Finally, in Section 4 we discuss the current status of the method, as well as our future plans.
2 The DEPTH Approach
2.1 Principles of DEPTH
According to DEPTH, the evaluation process of an e-site should focus on three dimensions: functionality according to genre, usability performance of the functionality according to context of use, and general usability performance according to heuristic criteria. We present the whole process in Figure 1 using an activity diagram which depicts the general activities and responsibilities of the elements that make up our method.
The basic aim of our method is to provide a framework where an evaluator can find and (re)use expert knowledge in order to perform an evaluation that supports the above dimensions.
Fig. 1. The whole process, depicting the general activities and responsibilities of the elements that make up the DEPTH method
The first swim lane presents the general steps/actions of the evaluator according to DEPTH. These steps are guided and supported by the “DEPTH-Repository”, which is the element that is constructed during the preparatory phase. The last element shows the deliverables of the execution phase of the evaluation process. Each evaluation study should start by selecting the specific genre of the e-sites under evaluation. There are various checklists of the functionality of various genres of e-sites which one can easily use and reuse. If no such checklist can be found, the most well-known systems of the specific genre should be analyzed in order to determine their functionality and to provide a superset of all features, categorized in groups, as well as an analytical table. Such genres of systems, along with the analytical tables of their supported functionality, become part of the “DEPTH-Repository”. With the analytical table of the functionality of the system under evaluation as input, the evaluator can easily perform the next step, which is a simple check of whether the system supports the underlying functionality. This step provides the first deliverable of our method, which is a functionality report. This report describes the functions supported by the selected system. At the next step, the evaluator has to decide which of the supported functions will be further analyzed for usability performance. As we have already mentioned, the production of the functionality table alone is not enough for someone to select the right e-site. We may have systems of a similar genre, like e-commerce systems, which contain the same set of features but vary in usability. In other words, “It is not only the features of the applied technology but especially the way of implementation of the technology”, as Lehtinen et al. [6] say of a different genre of systems.
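The functionality check described above amounts to comparing a genre's feature checklist with the features found in the e-site. The following is a minimal Python sketch of that comparison, not the toolkit's actual implementation: the names `FunctionalityReport` and `functionality_report`, and the feature labels in the example, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalityReport:
    """First deliverable: which checklist features the e-site supports."""
    supported: tuple
    missing: tuple


def functionality_report(genre_checklist, observed_features):
    """Compare a genre's feature checklist with what the evaluator observed.

    Both arguments are illustrative assumptions: `genre_checklist` is the
    ordered list of features for the genre, and `observed_features` is the
    set of features the evaluator ticked off while inspecting the e-site.
    """
    observed = set(observed_features)
    supported = tuple(f for f in genre_checklist if f in observed)
    missing = tuple(f for f in genre_checklist if f not in observed)
    return FunctionalityReport(supported, missing)


# Hypothetical checklist for one genre and one inspected e-site.
checklist = ["select language", "breadcrumbs", "site navigation", "shopping basket"]
report = functionality_report(checklist, {"site navigation", "shopping basket"})
print(report.supported)
print(report.missing)
```

Keeping the checklist order in the report makes it straightforward to pick, from the supported features, those to be analyzed further for usability performance.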
Evaluating the usability performance of the system involves two primary tasks: (a) evaluation in the small, i.e. in the specific context, and (b) evaluation in the large, i.e. evaluating the general usability conformance to well-defined heuristic criteria. The first task is the most difficult, since it implies the use of domain experts and is therefore very expensive. Moreover, the availability of domain experts is very limited. At this point our method suggests the (re)use of domain knowledge through design patterns and the underlying design pattern languages. Such a language can adopt issues from HCI design patterns, since usability is of prime importance, while at the same time taking into account the particularities of the genre under evaluation, and so forth [14]. So, at the next step, the evaluator can see a related scenario for each specific function (or set of functions) selected for usability performance analysis. As described above, one or more related scenarios are bound to specific design patterns during the preparatory phase and are part of the DEPTH repository. The evaluator may also decide to modify a related scenario to better suit his/her case. The next step is the execution of the tasks of the specified scenario. We have to stress here the essential role of the underlying usage scenario, which acts as an expert wizard guiding the evaluator. After the execution, the evaluator is prompted by DEPTH to see the ideal solution as it has been recorded in the related pattern(s). This is necessary because, up to this point, the evaluator has seen only the related usage scenario and not the solution. By seeing the actual solution, the evaluator can complement his/her findings about the e-site under evaluation and becomes better prepared to compose the evaluation report. The final evaluation report has two parts: a context-specific part and a general part.
The first reveals/measures the usability performance of the system under evaluation according to its specific context of use, while the second presents the general usability performance according to the expert/heuristic criteria.
2.2 The DEPTH TOOLKIT
The DEPTH TOOLKIT supports the tasks of two categories of users: i) the creators of the evaluation studies to be performed and ii) the novice usability engineers. From the creator’s point of view, the toolkit supports four main tasks for each genre of e-sites: i) specification of the features of genres of e-sites, ii) assignment of scenarios and appropriate tasks to features of genres, iii) editing of design patterns, as well as of the links between those patterns and specific features of the genre, and iv) management of evaluation sessions and recording of evaluation reports. From the novice usability engineer’s perspective, the toolkit supports the evaluation study in two phases: first a preparatory phase and then an execution phase. During the preparatory phase, the user (novice usability engineer) chooses the genre of the e-site and selects from a list the features of the system that she is interested in evaluating. This list is generated by the toolkit and includes features related to systems of the same genre as the one specified. During the execution phase, the selected set of features is evaluated through the context-oriented scenarios that have been proposed and written by the creators of the evaluation studies for that specific genre of e-sites (Fig. 2). At the end of the evaluation process, a detailed report is automatically
produced, describing the usability performance of the examined system on the chosen function(s) in its specific context of use, along with the general usability performance of the examined e-site according to Nielsen’s heuristic criteria [8].
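The relations the toolkit manages (genre, features, scenarios, design patterns) and the order of an evaluation session can be pictured with a small data model. The sketch below is only one possible reading of the description above: the class names, the `run_session` function and the `answer` callback are hypothetical and do not describe the toolkit's real schema or code.

```python
from dataclasses import dataclass, field


@dataclass
class DesignPattern:
    name: str
    solution: str  # the recorded solution, consulted after the tasks are done


@dataclass
class Scenario:
    title: str
    tasks: list
    questions: list
    patterns: list = field(default_factory=list)


@dataclass
class Genre:
    name: str
    scenarios_by_feature: dict = field(default_factory=dict)  # feature id -> Scenario


def run_session(genre, selected_features, answer):
    """Collect findings for each feature the evaluator selected.

    `answer` is a callback (question -> str) standing in for the evaluator.
    Pattern solutions are attached only after the questions are answered,
    mirroring the order of steps described in the text.
    """
    findings = []
    for feature in selected_features:
        scenario = genre.scenarios_by_feature[feature]
        answers = {q: answer(q) for q in scenario.questions}
        solutions = [p.solution for p in scenario.patterns]
        findings.append({"feature": feature, "answers": answers,
                         "pattern_solutions": solutions})
    return findings


# Hypothetical repository content for a single feature of one genre.
cart = Scenario(
    title="Shopping Cart",
    tasks=["Locate the shopping cart", "Delete one item from the cart"],
    questions=["Was the shopping cart easily located?"],
    patterns=[DesignPattern("Shopping cart", "placeholder solution text")],
)
lbp = Genre("Learning Brokerage Platform", {"F11": cart})
findings = run_session(lbp, ["F11"], answer=lambda q: "yes")
print(findings[0]["answers"])
```

Binding scenarios to features, rather than directly to e-sites, reflects the idea that the preparatory work is done once per genre and reused for every e-site of that genre.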
Fig. 2. DEPTH TOOLKIT interface
3 Evaluating the DEPTH Method
3.1 Scope of Evaluation Study
In order to evaluate our method we conducted an experiment with non-expert usability evaluators. Twenty-three (23) graduate students of our Department, who had completed an introductory MSc course on Human-Computer Interaction (we call them novice usability engineers), were asked to evaluate two e-sites of a specific genre. In order to design an experiment related to their interests (they attend an MSc programme on e-learning technologies), we proposed the evaluation of two Learning Brokerage Platforms (LBPs), namely the Premier Training Online (https://www.premiertrainingonline.com/default.aspx) and the Adobe Store of North America (https://store1.adobe.com/cfusion/store/index.cfm). All students had average knowledge of this genre of systems, and none of them claimed to be an expert in using (or designing) such systems. In fact, none of them had ever used either of the systems under evaluation. These e-sites allow the user to search, view, and purchase selected online learning objects for training in specific areas of interest. The Premier Training Online site offers distance learning programs that provide comprehensive home study courses for those pursuing careers in the health and fitness industry. The Adobe Store provides training about Adobe products via an online training center. This center gives access to many libraries full of engaging, interactive course content, assessment features, and additional resources to maximize the development of design and development skills. These two e-sites had been carefully chosen because various usability problems had been identified in them during expert-based evaluations previously organized by our group. We used DEPTH only from the evaluator’s point of view, since we wanted to focus on this specific perspective. The main research questions of this evaluation
study were: Can the DEPTH method help novice usability engineers identify usability problems (especially complex ones)? Can the DEPTH method improve novice usability engineers’ ability to propose solutions to the identified usability problems? Is the DEPTH method easy to apply? Does the DEPTH method make the novice usability engineers’ evaluation process easier, more flexible and more enjoyable? Does the DEPTH method make novice usability engineers feel confident that they performed a good evaluation study? Do the novice engineers appreciate the added value of design patterns for usability evaluation?
3.2 Evaluation Process
Several systems of the LBP genre were thoroughly examined with respect to the features they provide; a superset of those features is shown in Fig. 3.
Fig. 3. Features supporting online purchases
Selected design patterns (DP) from Martijn van Welie’s web design patterns repository (http://www.welie.com/) were related to a number of features (F) that an LBP may support, as shown in Table 1 below (each feature was related to one or more design patterns from Welie’s repository).
Table 1. Examples of relations between Features and Design Patterns
FEATURES (F)
F1. Select preferred language
F2. Directions to the right section of the website
F3. Know where you are in a hierarchical structure
F4. Navigate a hierarchical structure
For each of these features we created a related usage scenario. For example, for the functionality “F11: Buy/use shopping basket” we assigned the scenario “S11: Shopping Cart”, as shown in Table 2.
Table 2. Task Scenario and Questions for a specific functionality
(S11): Shopping Cart
Description: Collect and purchase several items in one transaction.
Task: Locate the Shopping Cart / Shopping Basket. The basket is initially empty. Search for a product (either manually or with assistance from a search mechanism) and, if found, add it to the contents of the basket. Add to the basket a new product that is advertised in the home page. Search for a new product, other than those already included in the basket. Browse through the store. Delete one item from the shopping cart. Select another instance of a product included in the cart and add that instance to the contents of the cart. Select one of the products that are already in the cart and view its price. While viewing the shopping basket contents, try to locate a link related to shipping and handling costs and the calculation of their cost. While viewing the shopping basket contents, try to locate a link related to the return policy.
Questions:
• Was the name of the shopping cart used appropriately?
• Was the shopping cart easily located?
• Were you able to add in the basket a product advertised in the home page? How easy did you find this operation?
• While viewing search results were you able to see the contents of the shopping cart?
• Could the operation of searching for a new product other than those already included in the basket, be executed with zero or one click / move? Were you still able to view the contents of the shopping cart?
• Was it easy to delete items from the shopping cart? Was it easy to modify their quantity? Was the total price automatically recalculated?
• Was it easy to view the price of any product item included in the cart once you selected that item?
• Was it easy to locate the link related to shipping and handling and the calculation of their cost? Was the information provided satisfactory?
• Was it easy to locate the link related to the return policy? Was the information provided satisfactory?
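Answers to a scenario such as S11 can be combined with comments on general heuristics to form a two-part report, one part specific to the context of use and one general. The Python sketch below illustrates this under stated assumptions: answers are simplified to yes/no values, the heuristic names are Nielsen's ten heuristics [8], and the function `build_report` and its output layout are hypothetical rather than the toolkit's actual report format.

```python
NIELSEN_HEURISTICS = [
    "Visibility of system status",
    "Match between system and the real world",
    "User control and freedom",
    "Consistency and standards",
    "Error prevention",
    "Recognition rather than recall",
    "Flexibility and efficiency of use",
    "Aesthetic and minimalist design",
    "Help users recognize, diagnose, and recover from errors",
    "Help and documentation",
]


def build_report(scenario_answers, heuristic_notes):
    """Assemble a two-part report: context-specific and general.

    scenario_answers: {scenario id: {question: bool}}, True = no problem found
                      (a simplifying assumption; real answers may be free text)
    heuristic_notes:  {heuristic: comment}; heuristics without a note stay empty
    """
    context_part = {}
    for scenario_id, answers in scenario_answers.items():
        problems = [q for q, ok in answers.items() if not ok]
        context_part[scenario_id] = {"asked": len(answers), "problems": problems}
    general_part = {h: heuristic_notes.get(h, "") for h in NIELSEN_HEURISTICS}
    return {"context_specific": context_part, "general": general_part}


# Hypothetical answers to two of the S11 questions and one heuristic comment.
answers = {"S11": {
    "Was the shopping cart easily located?": True,
    "Was it easy to delete items from the shopping cart?": False,
}}
report = build_report(answers, {"Error prevention": "No confirmation before deleting."})
print(report["context_specific"]["S11"])
```

Listing the questions answered negatively gives the evaluator a direct pointer to the scenario steps where usability problems were encountered.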
The novice usability evaluators (i.e. the DPEs) had to carry out all the tasks of the proposed scenarios, with the option of consulting the related design pattern that the DEPTH method proposes. After completing the inspection of the LBP, they had to give their overall opinion of the e-site according to Nielsen's heuristic criteria [8], namely: visibility of system status, match between system and the real world, user control and freedom, consistency and standards, error prevention,
recognition rather than recall, flexibility and efficiency of use, aesthetic and minimalist design, help users recognize, diagnose, and recover from errors, and help and documentation. Finally, a report containing all the answers to the scenario questions and to Nielsen's heuristic criteria is generated automatically. We not only analyzed the reports written by the DPEs but also conducted focus group interviews (in teams of three students) to gain better insight into their opinion of DEPTH and the DEPTH toolkit. The major advantage of conducting a focus group interview [5] was the ability to obtain detailed information through group cooperation. The findings led to important and promising conclusions, as shown below.
3.3 Evaluation Findings
Using novice usability engineers throughout this experiment helped us verify what we intended to prove: the DEPTH method can indeed enable novice usability evaluators to perform evaluations of expert quality. After following the related task scenarios, they were able to identify simple usability problems, and they were also assisted in identifying complex problems that could not easily have been spotted had the scenarios and design patterns not been provided. The novice usability evaluators clearly stated that the design patterns helped them understand good design practice for the various features of an LBP e-site. As an unexpected outcome, many of the evaluation reports we received showed that the DPEs were also suggesting solutions for each of the problems identified. As reviewers of the experiment, we wanted to know where this knowledge came from. When we asked our students how they arrived at these suggestions, all of them mentioned the added value of the design pattern that accompanied each feature. By considering the solution given in the pattern and adapting it to the context of the specific e-site, they were able to offer clear solutions to the usability problems.
This assistance made them more confident, not only in pointing out the usability problems but also in proposing solutions for them. All students stated that the interface of the toolkit made the evaluation process flexible and enjoyable. The use of specific task scenarios, together with the categorization of all features, provides a source of tasks and requirements that can be easily evaluated. Among other remarks, it was also mentioned that DEPTH can be used to evaluate isolated areas of interest by simply choosing only a few features. However, the method has some disadvantages. Few design patterns are available, so it is difficult to find mature pattern languages that support the variety of e-site genres. This became obvious from the collection of design patterns we proposed, as we deliberately chose some that are not yet mature. Even if we assume that a mature pattern language exists, will there always be a design pattern to validate every area of interest in a digital genre? A problem that occurred during the evaluation was that the toolkit did not allow users to revise their report after they had submitted it. This problem is not difficult to solve, and the next version of the toolkit will include such functionality. Another major issue, related to the evaluation of the DEPTH method and depending mainly on the user of the DEPTH toolkit, is the creation of genre-dependent scenarios. Who should create them? Will the scenarios be highly scripted or
loosely defined? What will the granularity of each scenario be? Experts are clearly needed to create these task scenarios. We may want to define scenarios that are very descriptive, or we may want to use scenarios that are more general. We need several scenarios of different granularities for each feature, so that the user can weigh cost against efficiency and choose the one most appropriate to the case under study.
4 Conclusions
In this paper we provided an overview of DEPTH, an innovative method for performing scenario-based expert heuristic usability evaluation of e-sites. It is innovative because it uses the added value of design patterns in a very systematic way in the usability evaluation process. The method can easily be used by a novice usability engineer. When DEPTH was used by non-expert engineers in the evaluation of LBPs with a supporting toolkit, called the DEPTH toolkit, the results were satisfactory. The expert knowledge embedded in the form of design patterns and usage scenarios was readily available to the novice engineers, thus enhancing their testing methods and improving their perspective on the usability of each functionality being tested. As the field of design patterns grows and matures, this method will become very promising and highly applicable.
Acknowledgments. This work has been partially funded through the EU IST FP7 project Grid4All (http://grid4all.elibel.tm.fr/).
References
1. Alexander, C.: The Origins of Pattern Theory: the Future of the Theory, And The Generation of a Living World. In: Keynote speech at the 1996 ACM Conference on Object-Oriented Programs, Systems, Languages and Applications (OOPSLA) (1996), retrieved from http://www.patternlanguage.com/archive/ieee/ieeetext.htm
2. Dix, A., Finlay, J.E., Abowd, G.D., Beale, R.: Human-Computer Interaction, 3rd edn. Prentice Hall, Englewood Cliffs (2003)
3. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns – Elements of reusable object oriented software. Addison-Wesley, London, UK (1994)
4. Georgiakakis, P., Tzanavari, A., Retalis, S., Psaromiligkos, Y.: Evaluation of Web applications Using Design Patterns. In: Costabile, M.F., Paternó, F. (eds.) INTERACT 2005. LNCS, vol. 3585, Springer, Heidelberg (2005)
5. Krueger, R.A., Casey, M.A.: Focus Groups: A Practical Guide for Applied Research, 3rd edn. Sage Publications, Thousand Oaks, CA (2000)
6. Lehtinen, E., Hakkarainen, K., Lipponen, L., Rahikainen, M., Muukkonen, H.: Computer supported collaborative learning: A review of research and development (The J.H.G.I. Giesbers Reports on Education No. 10). Department of Educational Sciences, University of Nijmegen, Nijmegen, the Netherlands (1999)
7. Martín, G.Y.: Wiki tools for collaborative learning environments. Final project thesis, Telecommunications Engineering, Universidad de Valladolid (August 2005)
8. Nielsen, J.: Usability Engineering. Academic Press, London (1993)
9. Nielsen, C.: Testing in the field. In: Werner, B. (ed.) Proceedings of the third Asia Pacific Computer Human Interaction Conference, IEEE Computer Society, Los Alamitos, CA (1998)
10. Nielsen, J.: Designing Web Usability: The Practice of Simplicity. New Riders Publishing, Indianapolis (2000)
11. Sartzetaki, M., Psaromiligkos, Y., Retalis, S., Avgeriou, P.: Usability evaluation of e-commerce sites based on design patterns and heuristic criteria. In: 10th International Conference on Human-Computer Interaction, Heraklion, Crete (June 22-27, 2003)
12. Schmid-Isler, S.: The Language of Digital Genres: a Semiotic Investigation of Style and Iconology on the World Wide Web. In: Proceedings of the 33rd Hawaii International Conference on System Science, IEEE Press, CD-ROM, Hawaii (2000)
13. Stefani, A., Xenos, M.: A model for assessing the quality of e-commerce systems: PC-HCI 2001 Conference (2001)
14. Van Welie, M., Klaassen, B.: Evaluating museum websites using design patterns, Technical Report: IR-IMSE-001 (2004), available at: http://www.welie.com/articles/IR-IMSE001-museum-sites.pdf
Evaluator of User's Actions (EUA) Using the Model of Abstract Representation DGAUI
Susana Gómez-Carnero and Javier Rodeiro Iglesias
Escuela Superior de Ingeniería Informática de la Universidad de Vigo, Campus As Lagoas S/N, Ourense
{jrodeiro,susanagomez}@uvigo.es
Abstract. User interfaces play an important role in the success of an application. Because their development carries a significant cost in time and money, it is necessary to obtain a highly acceptable and effective design. To be considered acceptable, a user interface must be friendly to the user, fulfil its objectives and be easy to use. In this paper an abstract specification model is presented that allows the acceptability of user interfaces to be evaluated. This is done semi-automatically by validating the three criteria defined above. We also present a notation for user interface testing and a tool that lets the user execute user tasks on the graphical user interface prototype generated by the tool.
Keywords: user interface design, usability, user interface modelling, prototyping, user interface test.
Evaluating compliance with these criteria is very complex. The complexity stems from the subjective nature of most evaluations, which rely mainly on expert opinion or user questionnaires [7] [11] [9] [14]. Given this subjectivity, and the weight that personal human perception carries in qualitative user interface evaluation techniques, it would be useful to have a more direct and discrete method that avoids these personal perceptions. One engineering approach is to let the user interact with a prototype generated from a specification model obtained in an earlier requirements analysis phase. With this approach the user gets a more tangible view of the user interface and can identify possible problems early, before time and money are spent on development. Because software is costly to develop and most evaluation methods are applied after development, it seems sensible to carry out the evaluation before implementation. A large body of related research follows the criteria set out above, but almost all of it remains theoretical and has no practical application. We examined existing representation and notation techniques to check whether they can be applied to complex interfaces for an objective evaluation based on these criteria, and the conclusion is negative: they are not suited to the needs defined above. For this reason we have proposed an abstract notation that represents the user interface in terms of components, visual presentation (in graphical terms) and user interaction defined at component level. In this paper we present the part of the notation that represents the functionality of the user interface, from which user tasks are extracted; these allow a semi-automatic evaluation that minimizes current development and evaluation costs. Section 2 presents the notation for representing user interface behaviour.
Section 3 presents the notation for defining user task tests. Section 4 presents the EUA tool, which allows the usability of the user interface to be evaluated dynamically through an interactive simulation of the interface. Section 5 presents the conclusions and future work of this research.
2 DGAUI Representation
We reviewed the user interface representation models in the literature. The review focused on models that propose a representation of the visual appearance and behaviour of the user interface [3]; given the problems we found in these representations, we need to present an alternative solution. The proposed representation, DGAUI, considers that the visual user interface is not a continuous structure. Instead, it is composed of discrete, finite elements, so the interface is defined as a composition of individual elements called user interface components. These user interface components have a topological hierarchy: one component can be contained in another [12]. For the definition of the visual user interface, the notation makes it possible to:
- Define the visual user interface components, using standard graphical primitives if the component has a visual representation on the user interface, or specific properties if the component is used for information input or is only a container of other user interface components.
- Determine the topological composition of visual user interface components, to construct the visual user interface with which the user interacts at a given moment.
- Represent the dialog between components, identifying the events that the user can trigger on components and how the other user interface components respond when the interaction occurs.
XML is the best choice for structuring the notation, so we have created a DTD to make the notation structure easy to parse. As regards the semantics of the notation, one part holds the initial representation of the visual user interface, and a second part represents every state of the user interface reachable through the interaction defined on the user interface components, together with the transitions between states. Because the two parts differ in nature, we divide the notation into two DTDs. The first (called DGAUI-DEF) consists of a detailed definition of each of the user interface components that make up the whole interface. This separation allows component definitions to be reused in other visual user interface representations. The second (called DGAUI-INT) depends on the first, because it is calculated from it. DGAUI-INT contains all the states that the visual user interface can reach. This set of states can be calculated from the initial representation of the user interface. The initial state is formed from the properties in the user interface component definitions. Then the possible individual events on the user interface components of this state are simulated; the resulting changes in the components determine a new state (one that already exists or one identified as new) and a transition between the current state and the new one. This applies the concept of state diagrams to a user interface, but the diagram is generated from the interaction on individual user interface components.
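To make the idea of a component definition with a topological hierarchy concrete, the sketch below parses a small XML fragment into a flat table of components. The element and attribute names (`interface`, `component`, `visible`, `activo`, `infi`, `info`) are assumptions made for illustration only; the real DGAUI-DEF DTD is not reproduced in this paper and may look quite different.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: NOT the real DGAUI-DEF structure, only an illustration
# of nested components carrying four boolean properties.
DGAUI_DEF_SAMPLE = """
<interface name="editor">
  <component id="Window" visible="T" activo="F" infi="F" info="F">
    <component id="SaveButton" visible="T" activo="T" infi="F" info="F"/>
    <component id="TextArea" visible="T" activo="T" infi="T" info="T"/>
  </component>
</interface>
"""

# Maps the property names used in the text to the assumed XML attribute names.
PROPERTY_ATTRS = {"Visible": "visible", "Activo": "activo",
                  "InfI": "infi", "InfO": "info"}


def load_components(xml_text):
    """Return {component id: {'parent': id or None, property: bool, ...}}."""
    root = ET.fromstring(xml_text.strip())
    components = {}

    def visit(node, parent_id):
        # findall("component") only matches direct children, so the recursion
        # follows the containment hierarchy level by level.
        for child in node.findall("component"):
            cid = child.get("id")
            entry = {"parent": parent_id}
            for prop, attr in PROPERTY_ATTRS.items():
                entry[prop] = child.get(attr) == "T"
            components[cid] = entry
            visit(child, cid)

    visit(root, None)
    return components


components = load_components(DGAUI_DEF_SAMPLE)
print(components["SaveButton"])
```

Recording the parent of each component keeps the topological composition available even though the result is a flat dictionary.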
This notation is oriented to the state of each visual user interface component rather than the state of the whole visual user interface. A state of the visual user interface is obtained by combining the states of its components. Thus the notation separates the presentation of the user interface from its behaviour. The presentation is located in the representation definition, while the functionality is located in the states and the transitions between them. These transitions are calculated from the possible user actions (events) on visual user interface components and also from system events. We define an interface state as the set of all visual user interface components that, given the values of their properties, the user can reach and interact with at a given moment. For the definition of the notation we consider that:
- User actions are not arbitrary.
- The set of visual user interface states is finite and can be described and evaluated.
- A visual user interface state depends on the components it contains and on the properties of each of them.
- A state is a moment at which the visual user interface is waiting for a user action, and it does not change while the user does not interact with it.
Each state is characterized by the values its components take for these four properties:
- Visible: visibility of the component, which is either visible (T) or not visible (F) on screen.
- Activo: indicates whether the component responds to user actions (T) or not (F). If a component has Activo(F) in a state, that component causes no transition to another state.
- InfI: enables data input from the user for the component. If this property has the value True, the component accepts data supplied by the user.
- InfO: enables the data output function of the component. If this property has the value True, the component displays the data sent from the "core" of the application to the user.
Events are user actions on the input hardware devices of the system. These events are detected by the system, which responds as defined for each event. An event is a single user action; for example, drag and drop is the combination of three single actions or events: click, move and release. Pre-conditions and post-conditions can be defined. If an event is defined on a visual user interface component without a pre-condition, the changes to other components are performed whenever the event occurs on that component. If a pre-condition exists, for example if the pre-condition for the event RightClick on ComponentTwo is "ComponentOne:Activo(T)" (the property Activo of ComponentOne has the value True), this user action on ComponentTwo will not be performed while the value of the property Activo of ComponentOne is F. Post-conditions define the property values that must be satisfied to reach the next state. The notation does not limit the events that can be defined for interaction. The HCI engineer can define whatever events are considered necessary and communicate their meaning to the workgroup.
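The pre-condition example above can be sketched in a few lines. The condition syntax `Component:Property(T|F)` follows the example in the text; everything else (the class `ComponentState`, the helper names, the rule that the target component must itself be Activo) is our own assumption for illustration, not part of the DGAUI notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentState:
    """The four boolean properties that characterize a component in a state."""
    visible: bool
    activo: bool
    inf_i: bool = False
    inf_o: bool = False


# Property names as written in conditions, mapped to the field names above.
FIELD = {"Visible": "visible", "Activo": "activo", "InfI": "inf_i", "InfO": "inf_o"}


def parse_condition(text):
    """Parse 'ComponentOne:Activo(T)' into ('ComponentOne', 'activo', True)."""
    component, rest = text.split(":")
    prop, flag = rest.rstrip(")").split("(")
    return component, FIELD[prop], flag == "T"


def condition_holds(interface, text):
    component, field_name, expected = parse_condition(text)
    return getattr(interface[component], field_name) == expected


def event_allowed(interface, component, preconditions=()):
    """An event fires only on an Activo component whose pre-conditions all hold."""
    return interface[component].activo and all(
        condition_holds(interface, c) for c in preconditions)


interface = {
    "ComponentOne": ComponentState(visible=True, activo=False),
    "ComponentTwo": ComponentState(visible=True, activo=True),
}
# RightClick on ComponentTwo is blocked while ComponentOne has Activo(F).
print(event_allowed(interface, "ComponentTwo", ["ComponentOne:Activo(T)"]))
```

Post-conditions could be checked with the same `condition_holds` helper against the state reached after the event.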
Some examples of the basic events that we use are:
- LeftClick: click of the left mouse button
- RightClick: click of the right mouse button
- ReLeClick: release of the left mouse button
- ReRiClick: release of the right mouse button
- MouseOn: mouse pointer over a component
- Key(keyboard key): a key or key combination on the keyboard
With DGAUI, the events on states can be calculated from the events on the visual user interface components. Thus we can build a directed, labelled state graph of the user interface and establish the next state of the user interface if we know which component is affected by an event. The vertices are the visual user interface states and the labelled arcs are the transitions between states. Two special states determine the functionality of the interface:
- Initial state: vertex whose associated arcs are all outgoing and which has no incoming arc; no other state can be reached without passing through the initial state.
- Final state: vertex whose associated arcs are all incoming and which has no outgoing arc. An anomalous situation exists if two or more vertices have only incoming arcs, because the visual user interface would then have more than one final state.
The set of possible next states after a user action can be obtained from the initial state, by applying the events to the visual user interface components with property Activo(T) and making the associated interaction changes in the other components. Starting from this first set of next states and applying the same process to each one, the remaining states are obtained until a final state is reached, in which no visual user interface component has the value True for the properties Activo and Visible. While the states are being built, transitions or arcs (labelled with the component and the event applied) to previously identified states may be found. Two states are equal (and therefore the same state) if all their visual user interface components have the same values for the properties Activo, Visible, InfI and InfO. A visual user interface component belongs to a state if it has one of the following kinds of functionality; that is, the components that form part of a state are those that have some functionality within that state. One kind of functionality is that the component has a visual appearance in the visual interface that provides relevant information to the user (in this case the component property Activo has the value True). The other is that the component causes changes in the properties of other components when an event occurs on it. One advantage of this notation is that it allows the visual properties of a component to be modified without changing the behaviour of that component (the user might not see the component, but its behaviour is maintained throughout the intermediate states).
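The state-building process just described amounts to a breadth-first exploration from the initial state. The sketch below is one possible reading of it, not the authors' implementation: the toy interface, the rule format `(component, event) -> list of (target, property, value)` and all function names are assumptions made for illustration.

```python
from collections import deque

PROPS = ("Visible", "Activo", "InfI", "InfO")


def comp(visible, activo, inf_i=False, inf_o=False):
    return {"Visible": visible, "Activo": activo, "InfI": inf_i, "InfO": inf_o}


def freeze(state):
    """Hashable key: two states are equal if all four properties match everywhere."""
    return tuple(sorted((name, tuple(p[k] for k in PROPS)) for name, p in state.items()))


def apply_event(state, changes):
    new_state = {name: dict(props) for name, props in state.items()}
    for target, prop, value in changes:
        new_state[target][prop] = value
    return new_state


def explore(initial, rules):
    """Breadth-first generation of all reachable states and labelled transitions."""
    start = freeze(initial)
    states = {start: initial}
    transitions = []  # (source key, component, event, target key)
    queue = deque([start])
    while queue:
        key = queue.popleft()
        state = states[key]
        for (component, event), changes in rules.items():
            if not state[component]["Activo"]:
                continue  # an Activo(F) component causes no transition
            nxt = apply_event(state, changes)
            nxt_key = freeze(nxt)
            if nxt_key not in states:  # new state; otherwise only a new arc
                states[nxt_key] = nxt
                queue.append(nxt_key)
            transitions.append((key, component, event, nxt_key))
    return states, transitions


# Toy interface (hypothetical): a button opens a dialog, the dialog closes everything.
initial = {"MenuButton": comp(True, True), "Dialog": comp(False, False)}
rules = {
    ("MenuButton", "LeftClick"): [("Dialog", "Visible", True), ("Dialog", "Activo", True),
                                  ("MenuButton", "Activo", False)],
    ("Dialog", "LeftClick"): [("Dialog", "Visible", False), ("Dialog", "Activo", False),
                              ("MenuButton", "Visible", False)],
}
states, transitions = explore(initial, rules)
final_states = [s for s in states.values()
                if not any(c["Activo"] or c["Visible"] for c in s.values())]
print(len(states), len(transitions), len(final_states))
```

Because states are compared through `freeze`, an event that leads back to an already known state adds only an arc, which matches the equality rule given in the text.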
If a change in the appearance of a visual user interface component alters its functionality, the result is a different visual user interface component. The user's interpretation of a component's appearance must be unique for each visual user interface component and must also convey its functionality to the user. Otherwise the component is ambiguous and the visual user interface design is wrong. This situation arises in interaction models that use interactors. The specification of an interactor is defined to support the different interactor states reached when user actions occur. According to the traditional specification, an interactor state has one concrete functionality and one unique appearance. No model considers multiple rendering functions for a single interactor state, because the specification is based on the interactor dialog rather than on its visual appearance. In the DGAUI proposal, a visual user interface component may take on different appearances as a result of user actions (visual operations) while its behaviour stays the same. Visual operations are, for example, resizing or other changes to the size of visual user interface components. With DGAUI the consistency of the visual interface is maintained, because two visual user interface components with the same appearance must have the same behaviour. However, if the appearance of a component is modified as a personal choice of the user, this modification does not affect the behaviour of the component. Because this work is oriented to the early phases of prototyping, the DGAUI proposal does not consider the abstract representation of application data. The participation in the user interface of visual user interface components as elements that let the user choose input and output information is defined by including, in version 3.04, a domain definition
of data or a mask for text input. If a user action changes the appearance of a visual user interface component drastically, the new appearance must be a new visual user interface component and therefore a different visual user interface state. If the visual user interface is correct, there is exactly one state in which all visual user interface components have the properties Activo and Visible set to False (the final state). The initial state can likewise be identified from the representation of the visual user interface components in DGAUI-DEF by examining their properties. The definition of the visual user interface components, the topological composition, and the dialog between components are constant for a visual user interface. The information for each state of the visual interface is determined by the property values of its visual user interface components. Once the states of the visual user interface have been obtained, we use a state graph (multidigraph) to represent the whole set of transitions between states. In it the vertices are the visual user interface states and the arcs are the transitions between states. The arcs are labelled with the name of the visual user interface component and the event that causes the transition. The XML document (DGAUI-INT) contains the following information:
- Topological composition of the visual user interface components contained in other visual user interface components.
- Information about the visual user interface states. All visual user interface states are defined by the description and properties of their components. The initial state is obtained from the description of the visual user interface components, and the other states are obtained through an automatic process.
- Set of transitions between states. This is obtained during the automatic process of state identification.
Information about the XML structure and examples of the DGAUI notation may be seen at http://www.ei.uvigo.es/~susanagomez/hci.html
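Given such a labelled multidigraph, the initial and final vertices, and the anomalous case of several final states, can be found by inspecting incoming and outgoing arcs. The following is a minimal sketch under our own assumptions; the vertex names, the arc labels and the function `classify` are hypothetical.

```python
def classify(vertices, arcs):
    """Return (initial, final) vertex lists for arcs given as (source, label, target).

    Parallel arcs with different labels are allowed, as in a multidigraph.
    """
    has_in = {v: False for v in vertices}
    has_out = {v: False for v in vertices}
    for source, _label, target in arcs:
        has_out[source] = True
        has_in[target] = True
    initial = [v for v in vertices if has_out[v] and not has_in[v]]
    final = [v for v in vertices if has_in[v] and not has_out[v]]
    return initial, final


vertices = ["S0", "S1", "S2"]
arcs = [
    ("S0", "MenuButton:LeftClick", "S1"),
    ("S1", "Dialog:RightClick", "S1"),   # arc back to an already known state
    ("S1", "Dialog:LeftClick", "S2"),
]
initial, final = classify(vertices, arcs)
print(initial, final)

# Adding a second vertex with only incoming arcs reproduces the anomaly
# described in the text: more than one final state.
bad_initial, bad_final = classify(vertices + ["S3"],
                                  arcs + [("S1", "Dialog:Key(Esc)", "S3")])
is_anomalous = len(bad_final) > 1
```

A check of this kind could run right after state generation, before any prototype is shown to a user.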
3 User Test Definition
Once the visual user interface has been defined, evaluating it requires defining the tests to be run on it. The objective is for the user to interact with the prototype while we record whatever parameters or actions we wish. DGAUI provides the description of the components and of the appearance of each state, which can be rendered on a standard rendering device. It also provides the user tasks and the states that the user can encounter while using the visual user interface. The first step towards automating the evaluation of the visual user interface is to define a notation that describes the atomic parts of the evaluation. As with DGAUI-DEF, we use XML to structure the notation, so we have created a DTD to make the notation structure easy to parse. As many evaluations as desired can be defined. Each evaluation is formed by a set of user tasks to perform. Each user task is described by the following information:
- A description (a textual description of the user task, for documentation).
- The parameters to evaluate and record during the evaluation process. These may be time parameters, for example the total time taken by the user to perform the task, the time until the user starts interacting, the mean time between events,
the time to the first user mistake, etc.; these are the parameters to be evaluated while the task is performed. Other parameters may be counters, for example the number of user events, the number of user mistakes during the evaluation task, the number of times an error occurs, etc. Finally, there may be error parameters, which identify and track types of user mistakes, for example which user mistake is the most frequent, or which of the previously defined user mistakes occur in a given state.
- The visual user interface states that will be evaluated. This defines which states and transitions (visual user interface components and user actions) are presented to the user in the prototype.
Information about the XML structure and examples of the EUA notation may be seen at http://www.ei.uvigo.es/~susanagomez/hci.html
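The time, counter and error parameters listed above could be derived from a simple log of timestamped events. The sketch below assumes a log format of our own, `(timestamp in seconds, component, event, mistake type or None)`; neither this format nor the metric names come from the EUA notation, and the sample values are invented.

```python
from collections import Counter
from statistics import mean


def task_metrics(start, events):
    """Compute illustrative parameters for one user task.

    start  -- time at which the task was presented to the user
    events -- list of (timestamp, component, event, mistake or None), sorted by time
    """
    times = [t for t, _c, _e, _m in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mistakes = [(t, kind) for t, _c, _e, kind in events if kind is not None]
    kinds = Counter(kind for _t, kind in mistakes)
    return {
        # time parameters
        "total_time": times[-1] - start if times else 0.0,
        "time_to_first_event": times[0] - start if times else None,
        "mean_time_between_events": mean(gaps) if gaps else None,
        "time_to_first_mistake": mistakes[0][0] - start if mistakes else None,
        # counter parameters
        "event_count": len(events),
        "mistake_count": len(mistakes),
        # error parameter
        "most_frequent_mistake": kinds.most_common(1)[0][0] if kinds else None,
    }


log = [
    (2.0, "FileMenu", "LeftClick", None),
    (3.5, "EditMenu", "LeftClick", "wrong_component"),
    (5.0, "FileMenu", "LeftClick", None),
    (6.0, "SaveItem", "LeftClick", None),
]
metrics = task_metrics(0.0, log)
print(metrics)
```

Keeping the raw event log and deriving the parameters afterwards means new parameters can be added without repeating the user sessions.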
4 Evaluator of User's Actions (EUA)
The EUA tool allows the usability of the visual user interface to be evaluated dynamically. The evaluation is carried out through an interactive simulation of the interface. From the abstract DGAUI notation (specifically DGAUI-INT) we can build the visual appearance of the interface states and simulate the user actions on the components. With the EUA notation it is possible to define the user tasks that the user can try out. The simulation reproduces the visual appearance of the interface following the user tasks described in section 3. The user's interaction with the simulation is recorded according to the parameters defined in section 3. This information is stored in a database for later study and analysis. The tool allows the HCI engineer to define as many evaluations as necessary and to obtain quantitative information about a user's real use of the visual user interface. Normally, the HCI engineer explains to the user the objectives to be reached while evaluating the visual user interface on the prototype. With the information obtained, the HCI engineer can then determine whether the interface has any problem before coding starts. The EUA tool has a simple interface with two basic functionalities:
- Load Interface: selects an XML file that contains the visual user interface description (DGAUI-DEF and DGAUI-INT) and generates the visual appearance of the user interface states.
- Test Interface: to evaluate the visual user interface, the individual user task definitions that make up the evaluation must be selected (EUA XML file). With this definition and the configuration parameters for accessing the database, the simulation is executed. For each user interaction with the prototype, information about the interaction is stored in the database for later study.
Fig. 1 shows a visual user interface state generated by the EUA tool from a DGAUI description. The example interface corresponds to a basic text processor.
Fig. 1. Text processor prototype
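Storing each user interaction in a database, as the Test Interface functionality does, might look like the following sketch. The paper does not say which database or schema the EUA tool uses; SQLite, the table `interaction` and its columns are assumptions chosen only to keep the example self-contained.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) a hypothetical interaction log."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS interaction (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               task TEXT NOT NULL,
               state TEXT NOT NULL,
               component TEXT NOT NULL,
               event TEXT NOT NULL,
               elapsed_ms INTEGER NOT NULL)"""
    )
    return conn


def record(conn, task, state, component, event, elapsed_ms):
    # One row per user interaction with the prototype.
    conn.execute(
        "INSERT INTO interaction (task, state, component, event, elapsed_ms) "
        "VALUES (?, ?, ?, ?, ?)",
        (task, state, component, event, elapsed_ms),
    )
    conn.commit()


def events_per_task(conn):
    rows = conn.execute(
        "SELECT task, COUNT(*) FROM interaction GROUP BY task ORDER BY task")
    return dict(rows.fetchall())


log = open_log()
record(log, "open_file", "S0", "FileMenu", "LeftClick", 2000)
record(log, "open_file", "S1", "OpenItem", "LeftClick", 3500)
record(log, "save_file", "S0", "FileMenu", "LeftClick", 1800)
print(events_per_task(log))
```

Recording the state alongside the component and event makes it possible to replay a session on the state graph afterwards.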
5 Conclusions and Future Work
In this work we present an abstract representation of user interfaces specially designed for visual interactive systems. The representation focuses on the visual aspect of the user interface, because this is the most important part of the interface for the user. The most significant information that the user obtains from the user interface comes through its appearance, and the user's interaction with the interface is based on that meaning. Another contribution of this work is the concept of a visual user interface component with different appearances (in most cases, size and position) for the same behaviour, which allows different rendering functions for the same component state. We showed that it is possible to describe a set of user interface tasks in a notation and to create a prototype automatically, in order to evaluate with users its behaviour and acceptability in a quantitative way. As future work we are working on exception definitions for user interface behaviour, including system events and user interface responses resulting from database queries. Another line of work is to define metrics to be used with the information obtained from the evaluation, and to create a system that can automatically find errors in the user interface prototype using agents.
Acknowledgements. This work has been funded by projects TIN2005-08863-C0302 and 05VI-C02.
References
1. Eason, K.: Information Technology and Organizational Change. Taylor and Francis, London (1988)
2. Gartner Group Annual Symposium on the Future of Information Technology, Cannes (November 7-10, 1994)
3. Gómez Carnero, S., Rodeiro Iglesias, J.: Aplicación de los sistemas de representación para la sistematización de la validación de interfaces de usuario. Technical Report TR-LSIGIG-05-2. Computer Science Department, University of Vigo (2005), http://www.ei.uvigo.es/~susanagomez/hci.html
4. IBM: IBM Dictionary of Computing. McGraw-Hill (1993)
5. ISO: Software product evaluation quality characteristics and guidelines for their use (1992)
6. ISO: Ergonomics Requirements for Office Work with Visual Displays Terminals: Guidance and Usability (1993)
7. Molich, R., Nielsen, J.: Heuristic evaluation of user interfaces. In: Proceedings of ACM CHI 1990, Seattle, WA, April 1990, pp. 249–256 (1990)
8. Myers, B.A., Nielsen, J.: Survey on user interface programming. In: Bauersfeld, P., Bennett, J., Lynch, G. (eds.) CHI'92 Conference Proceedings on Human Factors in Computing Systems, pp. 195–202. ACM Press, New York, NY (1992)
9. Nielsen, J.: Usability Engineering. Academic Press, London (1993)
10. Nielsen, J., Mack, R.L.: Usability Inspection Methods. John Wiley and Sons, New York (1994)
11. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S.: Human-Computer Interaction. Addison-Wesley Publishing, Reading, MA (1994)
12. Rodeiro Iglesias, J.: Representación y análisis de la componente visual de la interfaz de usuarios. PhD Thesis. Universidad de Vigo (September 2001)
13. Shackel, B.: Ergonomics in designing for usability. In: Harrison, M.D., Monk, A. (eds.) People and Computers: Designing for Usability, Cambridge University Press, Cambridge (1986)
14. Wharton, C., et al.: The cognitive walkthrough method: a practitioner's guide. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection Methods, pp. 105–140. John Wiley & Sons, New York (1994)
Adaptive Evaluation Strategy Based on Surrogate Model
Yi-nan Guo, Dun-wei Gong, and Hui Wang
School of Information and Electronic Engineering, China University of Mining and Technology, 221008 Xuzhou, China
[email protected]
Abstract. Human fatigue is a key problem in interactive genetic algorithms because it limits the population size and the number of generations. To address this problem, evaluation strategies based on surrogate models have been presented, in which some individuals are evaluated by models instead of by the human. Most of these strategies adopt a fixed substitution proportion, which cannot alleviate human fatigue to the greatest extent. A novel evaluation strategy with a variable substitution proportion is proposed. The substitution proportion depends on the model's precision and on human fatigue. Different proportions give rise to three evaluation phases: evaluation by the human only, mixed evaluation by the human and the model, and evaluation by the model only. In the third phase, the population size is enlarged. Taking a fashion evolutionary design system as an example, the validity of the strategy is demonstrated. Simulation results indicate that the strategy can effectively alleviate human fatigue and improve the speed of convergence.
were adopted in order to lower the complexity of evaluation and reduce the human burden. In all of the above research, surrogate models replace the human in evaluating all or part of the individuals in each generation, so as to reduce the number of individuals evaluated by the human. However, these approaches do not exploit surrogate models fully. First, the proportion of the population evaluated by surrogate models in each generation is fixed, which alleviates human fatigue only to a limited extent. Second, the population size is small and fixed throughout, which limits the performance of IGAs. Surrogate models compute the fitness of individuals on a computer and need no human participation, so the population size can be enlarged when only surrogate models are used for evaluation. To solve the above problems, a novel adaptive evaluation strategy based on a surrogate model is proposed. The number of individuals evaluated by the surrogate model is adaptively tuned according to the degree of human fatigue and the evaluation precision of the model, so as to alleviate human fatigue effectively. When the population is evaluated only by the surrogate model, the population size is enlarged so as to improve the speed of convergence. In the rest of the paper, the adaptive evaluation strategy is explained in Section 2. To validate the strategy, experiments based on a fashion evolutionary design system are described and the testing results are analyzed in Section 3. Finally, future work, which plans to introduce distributed neural networks into the surrogate model, is outlined.
2 Adaptive Evaluation Strategy Based on Surrogate Model

When human preference regarding the optimization problem is stable, a surrogate model can be adopted to evaluate individuals instead of the human. Two problems must be taken into account here. First, the surrogate model must stay consistent with human cognition and preference in order to ensure the convergence of the algorithm, so obtaining a model with high prediction precision and good generalization is the basis of the strategy. Second, how the model is used in place of the human during evaluation influences the performance of the algorithm. This paper is concerned with the latter. In the adaptive evaluation strategy, the two key problems are when the model is started up in the evaluation process and how many individuals are evaluated by the model in each generation. So far, few studies have addressed them.

2.1 Startup Mechanism About Surrogate Model

The startup mechanism provides the conditions that decide when to start up the surrogate model in the evaluation process. That is, from the generation in which these conditions are satisfied, the population can be evaluated by the surrogate model in a proper proportion. In general, the model is adopted to calculate the fitness of individuals when the human feels tired and the surrogate model has learned the human preference accurately. The startup mechanism therefore includes two conditions: the condition of human fatigue and the condition of the model's precision. When either condition is satisfied, the surrogate model is started up for evaluation. This evaluation strategy is expressed as follows.
\[ F(P(t)) = \{F_m(I,t),\ F_u(I',t) \mid I \neq I',\ I, I' \in P(t)\},\quad F_m \neq \emptyset,\quad \forall\big((Fa(t) \ge \varepsilon) \lor (Tr_f(t) \ge \Psi)\big) \tag{1} \]
where Fm denotes the fitness value calculated by the surrogate model and Fu denotes the fitness value given by the human. I and I' denote individuals evaluated by the surrogate model and by the human, respectively. P(t) denotes the population in the t-th generation. Fa(t) ≥ ε describes the condition of human fatigue, where Fa(t) expresses the degree of human fatigue and ε is the threshold for human fatigue. The degree of human fatigue reflects how tired the human is. Let v(t) denote the time that the human spends evaluating and β(t) denote the proportion of the population evaluated by the human. The degree of human fatigue is defined as follows [7].
\[ Fa(t) = 1 - e^{-t\,v(t)\,\beta(t)\,S(t)} \tag{2} \]
where t is the generation number and S(t) is the similarity of the population, which describes the average similarity of the individuals in the population, as follows.

\[ S(t) = \frac{2\sum_{i=1}^{|P|-1}\sum_{j=i+1}^{|P|}\sum_{l=1}^{n}\sigma_l\big(x_i(t), x_j(t)\big)}{|P|\,|P-1|} \tag{3} \]
where |P| is the population size and n is the length of an individual. σl(xi(t), xj(t)) expresses the similarity of the l-th bit of two individuals: σl(xi(t), xj(t)) = 1 if the l-th bit of xi(t) is the same as that of xj(t), and σl(xi(t), xj(t)) = 0 otherwise. The human spends more time evaluating the individuals when the population contains more similar individuals. It is obvious that the human feels more tired when the total number of individuals evaluated by the human is larger and the time spent on evaluation in each generation is longer. Trf(t) ≥ Ψ describes the condition of the model's precision, where Trf(t) expresses the reliability of the surrogate model and Ψ is the threshold for the reliability of the model. The reliability of the model reflects the consistency between the surrogate model and human preference. It is measured by the average Euclidean distance between the fitness value calculated by the model and the fitness value given by the human for the individuals in a sampling population, as follows.
\[ Tr_f(t) = \frac{|P_s|}{\sum_{i=1}^{|P_s|}\big(F_u(I_i,t) - F_m(I_i,t)\big)^2} \tag{4} \]

where |Ps| is the sampling population size.
In short, whether the surrogate model is started depends on two conditions: whether the degree of human fatigue exceeds the threshold for human fatigue, and whether the reliability of the surrogate model exceeds the threshold for the reliability of the model.
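The quantities of Section 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: individuals are bit strings of equal length, the exponent of equation (2) is read as the product t·v(t)·β(t)·S(t), and the function names, the handling of a zero squared gap, and the default thresholds (taken from the values of ε and ψ in Table 2) are choices made here rather than details given in the paper.

```python
import math
from itertools import combinations


def population_similarity(population):
    """Equation (3): matching bits summed over all pairs, scaled by 2 / (|P| * |P - 1|)."""
    size = len(population)
    matches = sum(
        sum(a == b for a, b in zip(x, y))
        for x, y in combinations(population, 2)
    )
    return 2 * matches / (size * (size - 1))


def fatigue(t, v, beta, similarity):
    """Equation (2), with the exponent read as the product t * v(t) * beta(t) * S(t)."""
    return 1.0 - math.exp(-t * v * beta * similarity)


def reliability(human_fitness, model_fitness):
    """Equation (4): |Ps| divided by the summed squared gap between human and model fitness."""
    squared_gap = sum((u - m) ** 2 for u, m in zip(human_fitness, model_fitness))
    if squared_gap == 0:
        return math.inf  # assumption: perfect agreement on the sample counts as unbounded reliability
    return len(human_fitness) / squared_gap


def model_started(fa, trf, eps=0.7, psi=0.7):
    """Startup condition of equation (1): either threshold has been reached."""
    return fa >= eps or trf >= psi
```

For instance, `population_similarity(["0000", "0011", "1111"])` counts four matching bits over three pairs and scales them by 2/(3·2).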
2.2 The Proportion of Population Evaluated by Surrogate Model
The proportion of the population evaluated by the surrogate model decides how many individuals are evaluated by the model in each generation. Up to now, in most evaluation strategies based on surrogate models, this proportion has been fixed, which limits the effect of surrogate models on performance. To address this problem, the proportion of the population evaluated by the model is varied adaptively. Two factors are taken into account in deciding the proportion. First, the more tired the human feels, the fewer individuals the human wishes to evaluate personally. Second, the higher the reliability of the surrogate model, the more individuals the model should evaluate in place of the human. So the proportion of the population evaluated by the model in the t-th generation is defined as

\[ \rho(t) = Fa(t)\left(1 - e^{-Tr_f(t)}\right) \tag{5} \]
So the number of individuals evaluated by the model in the t-th generation is

\[ N_f(t) = \lfloor |P|\,\rho(t) \rfloor \tag{6} \]
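A short sketch may make equations (5) and (6) concrete. The function names are illustrative, and the inputs are assumed to be a fatigue degree Fa(t) in [0, 1) and a non-negative reliability Trf(t).

```python
import math


def substitution_proportion(fa, trf):
    """Equation (5): rho(t) = Fa(t) * (1 - exp(-Trf(t)))."""
    return fa * (1.0 - math.exp(-trf))


def model_evaluated_count(pop_size, fa, trf):
    """Equation (6): N_f(t) = floor(|P| * rho(t))."""
    return math.floor(pop_size * substitution_proportion(fa, trf))
```

Because both factors of ρ(t) lie between 0 and 1, the proportion grows only when the human is tired and the model is reliable at the same time; a fresh human or an unreliable model keeps the count near zero.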
2.3 Substitution Mechanism About Surrogate Model
In general, the evaluation process of IGAs that adopt evaluation strategies with a fixed proportion of the population evaluated by the surrogate model can be divided into two phases, in one of two ways. If ρ(t) = 1 once the conditions of the startup mechanism are satisfied, the two phases are evaluation by the human only and evaluation by surrogate models only. If ρ(t) < 1 once the conditions of the startup mechanism are satisfied, the two phases are evaluation by the human only and mixed evaluation by the human and surrogate models. In this paper, however, an adaptive proportion of the population evaluated by the model is adopted, so the evaluation process differs from the above cases. According to the number of individuals evaluated by the human in each generation, there are three phases in the evaluation process of IGAs: evaluation by the human only, mixed evaluation by the human and the model, and evaluation by the model only. The third phase is of particular interest. In this phase, the population size is enlarged because there is no human fatigue when the surrogate model is adopted as an implicit fitness function.

Phase I: Population is evaluated by human only

In this phase, all individuals are evaluated by the human and the surrogate model is not started up. So the evaluation mode and the number of individuals evaluated by the model are defined as follows.
\[ F(P(t)) = \{F_u(I',t) \mid I' \in P(t)\},\quad \forall\, Fa(t) < \varepsilon,\ Tr_f(t) < \Psi \tag{7} \]

\[ N_f(t) = 0 \tag{8} \]
It is obvious that in this phase the human does not feel tired and the surrogate model cannot yet reflect the human preference accurately. This situation is likely to appear at the beginning of evolution, so the evaluation mode of this phase is usually adopted in the early stage of evolution.

Phase II: Population is mixed evaluated by human and surrogate model

In this phase, the surrogate model is started up. Some individuals are evaluated by the human and the fitness values of the others are calculated by the model. So the evaluation mode and the number of individuals evaluated by the model are as follows.
\[ F(P(t)) = \{F_m(I,t),\ F_u(I',t) \mid I \neq I',\ I, I' \in P(t)\},\quad \forall\, Fa(t) < \varepsilon,\ Tr_f(t) \ge \Psi \tag{9} \]

\[ N_f(t) = \left\lfloor |P|\,Fa(t)\left(1 - e^{-Tr_f(t)}\right) \right\rfloor \tag{10} \]
In this phase, the degree of human fatigue does not exceed the threshold and the surrogate model has learned the human preference accurately, so the number of individuals evaluated by the model increases. In the above two phases, the population size is fixed and small because the human participates in the evaluation process. In general, the population size in IGAs is less than ten in order to alleviate human visual fatigue.

Phase III: Population is evaluated by surrogate model only

In this phase, all individuals are evaluated by the surrogate model. So the evaluation mode is as follows.
\[ F(P(t)) = \{F_m(I,t) \mid I \in P(t)\},\quad \forall\, Fa(t) \ge \varepsilon \tag{11} \]
The human often feels very tired in the later stage of evolution, when this evaluation mode is adopted. Because evaluation based on the surrogate model is done by computer, the evaluation process in this phase is the same as in traditional genetic algorithms, so the population size can be enlarged. How to enlarge the population size is a key problem. The higher the precision of the surrogate model, the better its generalization, and the larger the population size can be. So the enlarged population size is defined as follows.

\[ N_p(t) = |P|^{\left\lfloor \frac{1}{Tr_f(t_0)\,F_{max}} + 0.5 \right\rfloor} \tag{12} \]

where Fmax denotes the upper limit of fitness and Trf(t0) expresses the reliability of the surrogate model at the point when the evaluation strategy of Phase III is adopted. The value of the exponent in formula (12) may be 1, 2 or 3, so the population size may be enlarged to the corresponding power of |P|.
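The three phases follow directly from the conditions attached to equations (7), (9) and (11), and can be sketched as a small selector. The function name and phase labels are illustrative, and the default thresholds are the values of ε and ψ listed in Table 2.

```python
def evaluation_phase(fa, trf, eps=0.7, psi=0.7):
    """Return the phase implied by the conditions of equations (7), (9) and (11)."""
    if fa >= eps:
        return "III"  # fatigue threshold reached: surrogate model only
    if trf >= psi:
        return "II"   # human not yet tired, model reliable: mixed evaluation
    return "I"        # human not tired, model not yet reliable: human only
```

Checking fatigue first reflects that equation (11) places no condition on Trf(t): once the fatigue threshold is reached, the model takes over regardless of its reliability.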
3 Simulations and Analysis

3.1 Background for Simulations
In this paper, a fashion evolutionary design system is adopted as a typical background to validate the rationality of the adaptive evaluation strategy. The goal of the system is to find a dress that the human favors [8]. Visual Basic 6.0 is used as the programming tool for the human-machine interface and Microsoft Access as the database. Matlab 6.5 is adopted to train the surrogate model, which is based on artificial neural networks. In the fashion evolutionary design system, each dress is composed of a collar, a skirt and sleeves. Each part has two factors, pattern and color, each of which is described by two bits. So each dress is expressed by 12 bits, which act as 6 gene-meaning-units (GM-units) [9]. Each gene-meaning-unit has four alleles. The meanings of each allele in the gene-meaning-units are shown in Table 1.

Table 1. The meanings of each allele in gene-meaning-unit

GM-unit            code 00         code 01         code 10        code 11
collar's pattern   medium collar   high collar     wide collar    gallus
sleeve's pattern   long sleeve     medium sleeve   short sleeve   sleeveless
skirt's pattern    long skirt      formal skirt    medium skirt   short skirt
color              pink            blue            black          white
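The encoding of Table 1 can be illustrated with a small decoder. The allele meanings come from the table, but the order of the six GM-units within the 12-bit string is not stated in the text; the layout used below (collar, skirt, sleeve, each as a pattern code followed by a color code) is purely an assumption for illustration, as are the function and variable names.

```python
PATTERNS = {
    "collar": {"00": "medium collar", "01": "high collar", "10": "wide collar", "11": "gallus"},
    "sleeve": {"00": "long sleeve", "01": "medium sleeve", "10": "short sleeve", "11": "sleeveless"},
    "skirt": {"00": "long skirt", "01": "formal skirt", "10": "medium skirt", "11": "short skirt"},
}
COLORS = {"00": "pink", "01": "blue", "10": "black", "11": "white"}

# Assumed gene order; the paper does not specify how the GM-units are arranged.
GENE_ORDER = ("collar", "skirt", "sleeve")


def decode_dress(bits):
    """Decode a 12-bit string into (pattern, color) pairs for each part of the dress."""
    if len(bits) != 12 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of 12 binary digits")
    dress = {}
    for k, part in enumerate(GENE_ORDER):
        pattern_code = bits[4 * k: 4 * k + 2]
        color_code = bits[4 * k + 2: 4 * k + 4]
        dress[part] = (PATTERNS[part][pattern_code], COLORS[color_code])
    return dress
```

Under this assumed layout, a blue summer dress such as the one sought in the experiments could be written as `"110111011001"`.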
3.2 Desired Objectives and Parameters in Experiments
In order to validate the rationality of the adaptive evaluation strategy and its influence on the performance of IGAs, two groups of experiments are designed. They have different desired objectives, which reflect different psychological requirements of the human. The desired objectives of the experiments are as follows.

Experiment I: To find a favorite dress suitable for summer, without any restriction on color.
Experiment II: To find a favorite dress suitable for summer whose color is blue.

In both experiments, an artificial neural network is adopted as the surrogate model. The values of the parameters of the model and of the evolution are shown in Table 2.

3.3 Analysis of Performance About Adaptive Evaluation Strategy
In order to validate the rationality of IGAs with the adaptive evaluation strategy (AES-IGAs), 30 persons were gathered to carry out two groups of experiments aimed at the desired objective of experiment II.
Table 2. The values of parameters

Parameters about the evolution
crossover probability   mutation probability   population size   generation   ε     ψ
0.5                     0.01                   8                 40           0.7   0.7

Parameters about the model
input neurons   hidden neurons   output neurons   learning rate   epochs   error
6               15               1                0.09            15000    10^-2
Group I: Comparison of different proportion of population evaluated by surrogate model

A fixed proportion and an adaptive proportion of the population evaluated by the surrogate model are adopted in the experiments, respectively. The testing results of all persons are aggregated, as shown in Table 3.

Table 3. Comparison of the performance by different proportion of population

Proportion of population          Average generation   Average number of individuals evaluated by human
ρ(t) = 0.5                        16                   100
ρ(t) = 1                          14                   72
ρ(t) = Fa(t)(1 - e^(-Trf(t)))     12                   52
It is obvious that the difference in average generation among the different proportions is small, but the difference in the average number of individuals evaluated by the human is large. First, if ρ(t) = 0.5, the number of individuals evaluated by the human equals half of the population size once the condition of the startup mechanism is satisfied, so the human has to keep evaluating individuals throughout. The human therefore feels more tired with this evaluation strategy than with the others. Second, if ρ(t) = 1, all individuals are evaluated by the surrogate model once the human feels tired and the model reflects the human preference accurately. Although the human evaluates fewer individuals than with the first strategy, the model is started later than in the adaptive evaluation strategy, so the number of individuals evaluated by the human is larger than with the adaptive strategy.

Group II: Comparison of different population size in Phase III

A fixed population size and an enlarged population size are adopted in the experiments, respectively. Testing results are shown in Table 4.
Table 4. Comparison of performance by different population size in Phase III

Population size                      Average generation   Average number of individuals evaluated by human
|P| (fixed)                          13                   52
N_p(t) (enlarged, formula (12))      12                   52
It is obvious that different population sizes in Phase III do not influence the degree of human fatigue in evaluation, because the average number of individuals evaluated by the human is the same. However, convergence with the enlarged population size is faster than with the fixed population size. The reason is that the exploration ability of the algorithm is better when the population size is larger.

3.4 Comparison of Performance About IGAs
In order to validate the improvement in the performance of IGAs with the adaptive evaluation strategy, 30 persons were gathered. Each person carried out four experiments: experiment I with IGA and with AES-IGA, and experiment II with IGA and with AES-IGA. For each experiment, the testing results of all persons are aggregated, as shown in Table 5.

Table 5. Comparison of performance with IGAs and AES-IGAs

Experiment                                                           I                   II
Evaluation strategy                                                  IGA     AES-IGA     IGA     AES-IGA
Average generation                                                   28      9           40      12
Average number of individuals evaluated by human                     224     41          240     52
Average number of individuals evaluated by human in each generation  8       4           8       4
Generations when Fa(t) ≥ ε                                           -       7           -       9
Comparing the testing results of IGA and AES-IGA in Table 5, the number of generations with AES-IGA is 68.9% lower on average (67.9% in experiment I and 70% in experiment II). The total number of individuals evaluated by the human with AES-IGA is 80% lower on average (81.7% and 78.3%, respectively). These results indicate that the adaptive evaluation strategy can effectively alleviate human fatigue and speed up convergence, which reduces the human burden of evaluation and lets the human concentrate on more creative design work.
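The reported percentages can be checked against Table 5 with a few lines of arithmetic. The sketch below assumes the figures are the mean of the per-experiment relative reductions, which reproduces 68.9% and 80%; the dictionary layout and function names are illustrative.

```python
# Values transcribed from Table 5.
TABLE_5 = {
    "I": {
        "IGA": {"generations": 28, "human_evals": 224},
        "AES-IGA": {"generations": 9, "human_evals": 41},
    },
    "II": {
        "IGA": {"generations": 40, "human_evals": 240},
        "AES-IGA": {"generations": 12, "human_evals": 52},
    },
}


def reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


def mean_reduction(metric):
    """Mean over both experiments of the relative reduction achieved by AES-IGA."""
    ratios = [
        reduction(row["IGA"][metric], row["AES-IGA"][metric])
        for row in TABLE_5.values()
    ]
    return sum(ratios) / len(ratios)
```

Here `mean_reduction("generations")` averages 19/28 and 28/40, and `mean_reduction("human_evals")` averages 183/224 and 188/240.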
Comparing the two groups of experiments, the generation at which Fa(t) ≥ ε is reached is lower in experiment I than in experiment II. This means that the human tires more easily when more gene-meaning-units have to be considered, which matches human physiology.
4 Conclusion

In order to further alleviate human fatigue in interactive genetic algorithms, a novel adaptive evaluation strategy with a variable substitution proportion is proposed. A startup mechanism for the surrogate model, which considers the degree of human fatigue and the evaluation precision of the model, is given. A variable proportion of the population evaluated by the surrogate model is proposed. Three phases are distinguished according to the number of individuals evaluated by the human in each generation: evaluation by the human only, mixed evaluation by the human and the model, and evaluation by the model only. In the third phase, the population size is enlarged. Taking a fashion evolutionary design system as a testing platform, the adaptive evaluation strategy is validated for different psychological requirements of the human. Comparing IGAs that use a fixed proportion of the population evaluated by the surrogate model, or a fixed population size, with AES-IGAs that use the adaptive evaluation strategy proposed in this paper, the testing results indicate that the adaptive evaluation strategy converges faster than the others and that the human feels less tired. Compared with canonical IGAs, AES-IGAs can effectively alleviate human fatigue and improve the speed of convergence. A surrogate model based on distributed neural networks is the subject of future research.

Acknowledgements. This work was supported by the National Postdoctoral Science Foundation of China under grant 2005037225, the Postdoctoral Science Foundation of Jiangsu under grant 2004300, and the Youth Science Foundation of CUMT under grant OC 4465.
References

1. Biles, J.A., Anderson, P.G., Loggi, L.W.: Neural Network Fitness Functions for A Musical IGA. In: Proc. of the Symposium on Intelligent Industrial Automation & Soft Computing, pp. 39–44 (1996)
2. Takagi, H.: Interactive Evolutionary Computation: System Optimization Based on Human Subjective Evolution. In: Proc. of IEEE Conference on Intelligent Engineering System, pp. 1–6 (1998)
3. Zhou, Y., Gong, D.W., Hao, G.S., et al.: Neural Network Based Phase Estimation of Individual Fitness in Interactive Genetic Algorithm. Control and Decision 20, 234–236 (2005)
4. Wang, S.F., Wang, S.H., Wang, X.F.: Improved Interactive Genetic Algorithm Incorporating with SVM and Its Application. Journal of Data Acquisition & Processing 18, 429–433 (2003)
5. Lee, J.Y., Cho, S.B.: Sparse Fitness Evaluation for Reducing User Burden in Interactive Genetic Algorithm. In: Proc. of IEEE International Fuzzy Systems, pp. 998–1003 (1999)
6. Sugimoto, F., Yoneyama, M.: An Evaluation of Hybrid Fitness Assignment Strategy in Interactive Genetic Algorithm. In: 5th Workshop on Intelligent & Evolutionary Systems, pp. 62–69 (2001)
7. Guo, Y.N., Cheng, J., Dun, W.G.: Knowledge-inducing Interactive Genetic Algorithms Based on Multi-agent. In: Jiao, L., Wang, L., Gao, X., Liu, J., Wu, F. (eds.) ICNC 2006. LNCS, vol. 4221, pp. 769–779. Springer, Heidelberg (2006)
8. Kim, H., Cho, S.B.: Application of Interactive Genetic Algorithm to Fashion Design. Engineering Applications of Artificial Intelligence 13, 635–644 (2000)
9. Hao, G.S., Gong, D.W., Shi, Y.Q.: Interactive Genetic Algorithm Based on Landscape of Satisfaction and Taboos. Journal of China University of Mining & Technology 34, 204–208 (2005)
A Study on the Improving Product Usability Applying the Kano’s Model of Customer Satisfaction
Jeongyun Heo, Sanhyun Park, and Chiwon Song
MC R&D Center, LG Electronics Inc, Kasan-Dong, KumChon-Ku, Seoul, Korea
{jy_heo,sanghyun,chiwon}@lge.com
Abstract. User-centeredness is a popular approach for achieving users’ satisfaction. Nevertheless, when profit optimization, economic efficiency and the limits of the development period are considered, it is almost impossible to apply solutions to all the usability problems reported during testing. Therefore, a strategic approach is required to maximize the perceived usability under limited circumstances. Physical User Interaction (PUI) is defined as the physical side of usability and as a broader concept of usability. In this research, we constructed UI guidelines for the PUI of mobile phones that reflect the user’s values. This research applied Kano’s model of customer satisfaction to classify the PUI guidelines into two groups. One is the design standards, which must be satisfied to guarantee minimum satisfaction. The other is the value-adding criteria, which help a product hold a dominant position over competing products. With this categorization, the PUI design guidelines can be used not only for evaluating current product quality, but also for finding the direction of strategic value improvement.

Keywords: PUI (Physical User Interaction), Customer satisfaction, classification of usability problem, Perceived usability, Kano’s model of customer satisfaction.
Physical User Interaction (PUI) is defined as the physical side of usability and as a broader concept of usability. PUI issues come not only from the ergonomic issues that one can easily think of, but also from the usage experience of similar devices, emotional preferences, and the usage context of each function. Imagine a camera phone that needs many clicks to take a photo because it lacks a quick-access key for the camera mode; you would probably not be satisfied with it. Likewise, if the buttons are too hard to press, the perceived usability will be poor no matter how usable the other features are. A study based on real industrial cases reported that usability evaluation and improvement activities that do not consider users’ values do not lead to product value enhancement in the real market. The main reason seems to be that evaluation during the product development phase usually focuses on detecting dissatisfaction factors and improving the detected issues. Moreover, the internal development environment does not allow all the detected issues to be fixed, because of economic considerations such as maximizing ROI (Return on Investment) and constraints on development time. In other words, it is almost impossible to apply solutions to all the issues found. That is why a strategic approach, such as addressing issues according to priority, is needed. Priorities for usability issues are usually decided by considering the severity of the issue itself, and most of the gap between the evaluated usability and the user-perceived usability comes from this difference in priority. To enhance the perceived usability, we should find a way to reflect users’ values in the priority of usability issues.

Kano's method (1984) offers a reasonable approach to reflecting users’ values and understanding customer-defined quality. It reveals the relation between customer satisfaction and product requirements. Furthermore, it characterizes the product requirements that influence customer satisfaction as three different groups: must-be requirements, attractive requirements, and one-dimensional requirements. A must-be requirement is mandatory; without it, users cannot be satisfied at all. An attractive requirement is optional. If this type of requirement is fulfilled, users may be attracted by the product, but without it users do not feel any inconvenience. A one-dimensional requirement is functionally related to users’ satisfaction: if it is not fulfilled, users are dissatisfied, and if it is fulfilled, users are satisfied.

In this research, we applied Kano’s model to the construction of UI guidelines for the PUI of mobile phones. Classifying usability issues according to their potential effect is the starting point of this research. We then applied Kano’s model of customer satisfaction to prioritize the issues in a way that reflects users’ values. This priority is the basis of strategic improvement. This research categorizes the UI guidelines into two groups. One is the design standards, which must be satisfied. The other is the comparison criteria, which help a product hold a dominant position over competing products. In this way, the UI design guidelines may be applied not only to evaluating product quality, but also to providing a direction for value improvement. Furthermore, the use of UI design guidelines may expand from quality control to user satisfaction.
2 Kano Model: The Theory of Attractive Quality

Kano’s model (1984) is based on the two-factor theory of job satisfaction by Herzberg (1974), which suggests that the factors causing job satisfaction are different from the factors causing job dissatisfaction. According to Kano, quality cannot be explained by a one-dimensional view. For instance, people are very dissatisfied if they cannot make a call with a phone, but they are not particularly satisfied if they can. The one-dimensional view of quality cannot explain this case. Kano et al. (1984) introduced a model that categorizes quality attributes based on customers’ satisfaction with the level of quality. This view is useful for understanding how customers evaluate a product. Kano defined the customer expectations for product quality as five levels: 1) must-be, 2) one-dimensional, 3) attractive, 4) indifferent, and 5) reverse. Must-be quality is the minimum requirement for avoiding customer dissatisfaction; it is also referred to as the must-have level. One-dimensional quality has a one-dimensional relation between quality and satisfaction: users’ level of satisfaction is proportional to the quality provided. Attractive quality is the opposite of the basic quality. Although the absence of attractive quality does not cause users’ dissatisfaction, these features can excite and delight users if they are provided. Indifferent quality is quality that results in neither customer satisfaction nor customer dissatisfaction. Although the Kano model explains the relationship between product quality and users’ satisfaction, it is also applicable to the relationship between usability and users’ satisfaction (Jolka, 2005). Fig. 1 shows the applied Kano model.
Fig. 1. Kano Model applied on the usability domain
We defined the characteristics of utility for users’ satisfaction as three levels: 1) basic, 2) opportunity, and 3) attractive. The terms are changed for clarity, and their meanings are almost the same as explained above. While developing a product, not all of the guidelines can be applied because of environmental constraints such as time and resources. Even worse, there may be conflicts among guidelines, so we have to decide which ones to apply. By adopting Kano’s model of customer satisfaction, we can classify the PUI guidelines by priority, which helps in finding the direction of strategic value improvement.
3 Constructing PUI Guideline Using Kano’s Model

A guideline is a useful tool for systematically improving and monitoring the usability of a product within an organization. A guideline can be structured by grouping cases of PUI issues and then revising them in light of the applicable design principles. This section introduces the details of the suggested PUI guideline.
3.1 Constructing PUI Guidelines

A total of 106 design evaluation items were drawn up on the basis of design principles obtained from usability problems reported for commercialized products. Some of these evaluation items are presented in the Appendix.

3.2 Adopting Kano’s Model to Categorize the Collected PUI Guidelines

Kano’s survey model asks for the user’s preference both in the case where the effect is provided and in the case where it is not. The organized evaluation items were surveyed with sixty users. Gender was not taken into account in selecting these users, and their average age is between twenty-four and thirty-one. The Kano survey is designed as follows.
Fig. 2. A pair of requirement questions in a Kano questionnaire
The results can be classified into three categories according to the Kano evaluation table shown in Fig. 3 below.
Fig. 3. Kano evaluation table adapted from Berger et al.(1993)
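Since the table of Fig. 3 is not reproduced in the text, the sketch below assumes the commonly used five-answer Kano evaluation table, in which each pair of answers to the functional and dysfunctional questions maps to one category. The answer labels, category names and function names are illustrative choices rather than the authors' own.

```python
from collections import Counter

# Possible answers to both the functional and the dysfunctional question.
ANSWERS = ("like", "must-be", "neutral", "live-with", "dislike")

_MIDDLE_ROW = ("reverse", "indifferent", "indifferent", "indifferent", "must-be")

# Rows: answer to the functional question; columns: answer to the dysfunctional one.
EVALUATION_TABLE = {
    "like": ("questionable", "attractive", "attractive", "attractive", "one-dimensional"),
    "must-be": _MIDDLE_ROW,
    "neutral": _MIDDLE_ROW,
    "live-with": _MIDDLE_ROW,
    "dislike": ("reverse", "reverse", "reverse", "reverse", "questionable"),
}


def classify(functional, dysfunctional):
    """Look up the Kano category for one respondent's pair of answers."""
    return EVALUATION_TABLE[functional][ANSWERS.index(dysfunctional)]


def tally(responses):
    """Count the categories over a list of (functional, dysfunctional) answer pairs."""
    return Counter(classify(f, d) for f, d in responses)
```

A respondent who likes having a feature and dislikes its absence is counted as one-dimensional, while one who is neutral about its presence but dislikes its absence is counted as must-be.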
Kano’s classification takes the category with the most votes as the representative property of an item. However, this method hardly reflects the differences between the preferences of individual users. To compensate for this, Berger (1993) proposed users’ satisfaction coefficients. However, if most of the responses are irrelevant ones such as indifferent, reverse or questionable, another index is needed for selecting meaningful properties. We propose an effective coefficient in order to separate effective responses from those that are not. The three coefficients are defined as follows, where A, O and E are the numbers of Attractive, One-dimensional and Must-be responses.

Users’ satisfaction coefficient = (A + O) / Total responses   (1)

Users’ dissatisfaction coefficient = (O + E) / Total responses   (2)

Effective coefficient = (A + O + E) / Total responses   (3)
The user satisfaction coefficient is the ratio of positive responses. “One-dimensional” and “Attractive” are the positive responses because they are directly proportional to the increase in usability. The user dissatisfaction coefficient is defined as the ratio of negative responses. “One-dimensional” and “Must-be” are the negative responses because they tend to decrease users’ satisfaction when usability is inferior. Here, the total is defined as the sum of the four kinds of response that remain after excluding the irrelevant responses, which seem unreliable, and the questionable responses. The effective coefficient is defined as the ratio of meaningful properties. “Must-be,” “One-dimensional,” and “Attractive” are the elements that directly affect the user. In this research, we regard a property with an effective coefficient of 0.65 or higher as valid. Fig. 4 shows a graph of the valid properties, those with an effective coefficient of 0.65 or higher, representing the relationship between the user satisfaction and dissatisfaction coefficients.
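The three coefficients and the 0.65 validity threshold can be sketched as follows. The text does not list exactly which four kinds of response make up the total; the sketch assumes they are the Attractive, One-dimensional, Must-be and Indifferent counts, and the function names are illustrative.

```python
def kano_coefficients(attractive, one_dimensional, must_be, indifferent):
    """Equations (1)-(3), with the total assumed to be A + O + E + I."""
    total = attractive + one_dimensional + must_be + indifferent
    if total == 0:
        raise ValueError("no usable responses")
    return {
        "satisfaction": (attractive + one_dimensional) / total,
        "dissatisfaction": (one_dimensional + must_be) / total,
        "effective": (attractive + one_dimensional + must_be) / total,
    }


def is_valid(coefficients, threshold=0.65):
    """A property is kept when its effective coefficient is 0.65 or higher."""
    return coefficients["effective"] >= threshold
```

For example, an item with 10 Attractive, 20 One-dimensional, 15 Must-be and 15 Indifferent responses has an effective coefficient of 45/60 = 0.75 and would be kept.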
Fig. 4. Three Utilities Classified Using Satisfaction Coefficients
According to Fig. 4 ”One-Dimensional” property is positioned in the first quadrant, “Must-be” is in the second quadrant, and “Attractive” is in the fourth quadrant. The properties in the third quadrant are known as indifference property according to the Kano’s model. 3.3 Strategic Use of Proposed PUI Guidelines Properties that are related with fundamental quality are defined as “Basic Utility.” The Basic Utility includes screen resolution and size when watching DMB (Digital
Multimedia Broadcasting) or setting keys to prevent user errors such as outside interruptions or unintended input. Moreover, emotional satisfaction factors relating to cell-phone design and images, such as a well-taken picture, are defined as “Opportunity Utility.” This categorization explains well that users’ expectations become higher as a cell phone’s functions and quality improve. Unexpectedly, however, “Attractive Utility” includes hardly any items, because the items used in the survey were extracted from preexisting usability problems and design principles. Likewise, improvement based on preexisting usability evaluation can rarely bring about an epochal reform of the product or of customer satisfaction. The improvement strategy based on the Kano model’s three types of property classification can be described as follows. Basic utility is defined as a product usability standard that must be applied to the product; if it is not, the product would fail in the market because of users’ complaints. Opportunity utility is used as a comparative evaluation standard, because products need marketability in order to compete with those of other companies. The diagram below shows the suggested strategy for effectively improving the usability of a product.
Fig. 5. Strategy for Improving Usability considering the characteristic of Utility
In particular, since these properties are independent of one another, basic utility must be applied to the product, because users become dissatisfied when it is not satisfied (Jokela, 2004). Moreover, this cannot be compensated for by adding attractive utility. The property classification also depends on time, so an attractive utility may become a basic utility after some years. This trend tends to appear particularly in products with a short development period, such as cell phones; therefore, continual usability testing is necessary in order to identify new kinds of attractive utility and to apply them to the product.
J. Heo, S. Park, and C. Song
4 Conclusion

In this research, we applied Kano’s model to the construction of UI guidelines for the PUI (Physical User Interaction) of mobile phones. PUI can be seen as the physical side of usability and as a broader view of usability. Moreover, PUI seems to be the aspect that most influences users’ satisfaction. Usability issues of PUI should be taken into account from the concept phase of the product, considering the characteristics of each issue, because the possible scope of physical design changes shrinks rapidly as the development process goes on. A guideline is a useful tool for an organization to systematically improve and monitor the usability of a product. Classifying usability issues by their potential effect was the starting point of this research. We then applied Kano’s model of customer satisfaction to prioritize the issues in a way that reflects users’ values. This priority is the basis of strategic improvement. This research categorizes the UI guidelines into two groups: design standards, which must be satisfied, and comparison criteria, which help a product hold a dominant position over competing products. Accordingly, UI design guidelines may be applied not only to evaluating product quality but also to indicating the direction of value improvement. Furthermore, the use of UI design guidelines may expand from quality control to user satisfaction. The benefits of applying the Kano model to the design process are summarized as follows:

1) The characteristics and criteria of a product that affect user satisfaction can be revealed. In addition, potential elements that users usually do not describe explicitly can be understood.
2) It helps to determine the effect of the designer’s intention on users’ satisfaction.
3) The categorization of characteristics may be used as a criterion for decision making. Especially when development resources are limited, the proposed categorization may be used to decide which characteristics to focus on.
4) It is easy to apply to large numbers of users, whereas most common methods of collecting users’ needs, such as the Focus Group Interview, are applicable only to small numbers of users.
References

1. Berger, C., Blauth, R., Borger, D., Bolster, C., Burchill, G., DuMouchel, W., Pouliot, F., Richer, R., Rubinoff, A., Shen, D., Timko, M., Walden, D.: Kano’s methods for understanding customer-defined quality. The Center for Quality Management Journal 2(4) (1993)
2. Bevan, N.: Quality in use: meeting user needs for quality. Journal of Systems and Software 49(1), 89–96 (1999)
3. ChiWon, S., JeongYun, H., SangHyun, P.: Evaluating elements of the physical user experience (usability) of mobile device. In: Proceedings of HCI2006, Korea (2006)
4. ChiWon, S., JeongYun, H., SangHyun, P.: Classifying emotional elements of mobile devices to evaluate physical interface usability. In: Proceedings of Korean Society for Emotion and Sensibility 2006, Korea (2006)
5. Herzberg, F.: Work and the Nature of Man (1974)
6. JeongYun, H., SangHyun, P., ChiWon, S.: A Study of Improving Product Usability Based on the Classification of Usability Problems Considering Users’ Satisfaction
7. Jokela, T.: When Good Things Happen to Bad Products: Where are the Benefits of Usability in the Consumer Appliance Market? Interactions, pp. 29–35 (2004)
8. Lofgren, M., Witell, L.: Kano’s Theory of Attractive Quality and Packaging. Quality Management Journal 12(3), 7–20 (2005)
9. Kano, N., Seraku, N., Takahashi, F., Tsuji, S.: Attractive quality and must-be quality. The Journal of Japanese Society for Quality Control 14(2), 39–48 (1984)
10. Matzler, K., Hinterhuber, H.H.: How to make product development projects more successful by integrating Kano’s model of customer satisfaction into quality. Technovation 18(1), 25–38 (1998)
11. Zhang, P., von Dran, G.M.: Satisfiers and Dissatisfiers: A Two-Factor Model for Website Design and Evaluation. Journal of The American Society For Information Science 51(14), 1253–1268 (2000)
Appendix: Part of PUI Guideline with Kano Classification

Part of the PUI guideline, consisting of the items that are valid in terms of the effective coefficient, is provided. The Kano classification and the users’ satisfaction and dissatisfaction coefficients are also included.

Table 1. Classification example of PUI guideline
The Practices of Usability Analysis to Wireless Facility Controller for Conference Room

Ding Hau Huang, You Zhao Liang, and Wen Ko Chiou

259 Wen-Hwa 1st Road, Kwei-Shan Tao-Yuan, Taiwan, 333, R.O.C
Chang Gung University
[email protected]
Abstract. Advanced technical facilities and automated systems are increasingly visible in business conference rooms. One of the most useful interfaces between the central system and its users is the wireless facility controller, which is expected to bring individuals more convenience and efficiency by helping them control many kinds of media. This paper discusses ‘usability analysis’ with a ‘scenario-based’ approach to ‘user-oriented’ design concepts early in the product design process through a practical case study concerning the controller. This study suggests a practical approach to scenario and usability analysis through a simple, structured framework. The framework is outlined by three major components: the design strategy derived from analyzing competitors’ products with a scenario-based approach, with user, product, applications, and field of use as context variables; usability analysis of product interaction; and user observation of existing problems.

Keywords: Wireless facility controller, User-oriented design, Usability, Interaction, Innovation.
interface’ to control these controllable appliances, and when there are many different kinds of appliances to control, the user interface of the controlling device might be very challenging to use. Changes in the infrastructure may require software or even hardware changes to the user interface as well; therefore, controlling them all efficiently requires much thought about the user interface and usability issues [4]. Moreover, an important function of a smart business conference room is to help document meetings, i.e., to capture and index the various activities that occur during meetings, presentations, and teleconferences. Other functions include controlling room equipment and ambience, managing media streams, and providing networked electronic whiteboards and note-taking devices [1]. All of the above are highly interactive, user-oriented activities. If a universal wireless facility controller is to integrate all of these functions, then understanding the context in which people interact with them is a very important issue.

A ‘scenario’-based approach is useful for stimulating user-centered ideation, illustrating issues, evaluating design ideas from a user's point of view, and showing the role of a product in a larger context of use. All these are important activities in establishing a common user-centered focus, particularly in the earliest phases of product design. Scenario building provides the human factors professional with an elective additional method for exploring, prototyping, and communicating human factors issues within a product design context [5]. Lim & Sato [3] created a method for generating scenarios through the use of aspect models based on the design information framework (DIF) structure. With this scenario generation technique, designers can effectively analyze complex use situations through multiple aspects and identify problems and requirements that lead to further design problem solving. Scenarios that clearly embed rationales for solutions become valuable reference sources of use-context information throughout the design process. The DIF was developed to enable designers to organize and manipulate information throughout a design process. In a design process, templates for archiving information into a DIF-structured database can be generated, and all types of design information, such as user study data, design concepts, models, scenarios, and prototype models, are then structured by those templates (shown in Figure 1) [2].
Fig. 1. Design information framework [2]
During this research, we collaborated with ADVANTECH Co. Ltd, Taiwan, which wishes to develop the wireless facility controller. ADVANTECH is a leader in the industrial computing and automation market, with more than 20 years of experience [6]. It covers the full range of integrated solutions, from industrial automation to medical computing to home automation. It is the engineering and marketing departments, however, that are concerned with user-oriented design (UOD) approaches that fit the new interactive generation. Thus, this paper discusses usability analysis with a scenario-based approach to UOD concepts early in the product design process through a practical case study. It describes some advantages and potential pitfalls of using scenario and usability analysis, and provides examples of how innovative concepts are developed and can be applied usefully.
2 Methods

This research took the form of a case study of an ongoing concept development process for a new interactive product, a wireless facility controller for conference rooms. The wireless facility controller is a kind of interactive device with a wireless touch panel and fast operation keys that can control other automated devices. We applied usability analysis and a scenario-based approach to UOD concepts to create an innovative design concept, and synthesized the processes into an interactive product development framework. The detailed process is as follows. First, competitors’ products were analyzed to elicit raw information such as product functions, applications, user groups, and fields of use. Second, a deeper analysis was conducted using the scenario-based approach, which covered product functions to applications, fields of use to scenarios, and scenario synthesis. The third stage was product usability analysis to define the product position, and the fourth was user observation. After appraising all the information, we reviewed all the processes to establish an interactive product development framework.

2.1 Monitored Competitor Offering

Competitor’s product analysis: Two of the best-known companies and worldwide leaders in advanced control and automation systems were chosen to elicit raw information. The first, CRESTRON, is the world's leading manufacturer of advanced control and automation systems, innovating technology and reinventing the way people live and work. Offering integrated solutions to control audio, video, computer, IP, and environmental systems, CRESTRON streamlines technology, improving the quality of life for people in corporate boardrooms, conference rooms, classrooms, auditoriums, and in their homes [8]. The second, AMX, is a worldwide leader in advanced control and automation technology for commercial and residential markets. The company’s hardware and software products simplify the way people interact with technology. This includes making it easier for system integrators to sell, program, and install AMX products (ranging from touch panels, keypads, and handheld remotes to customizable resource management tools) as well as making the overall AMX end-user experience intuitive and simple [7].
During this stage, a product ‘tree map’ was first used to identify all the related devices, to give an overview of the whole product line and to determine which product types to focus on. Then all the raw product information, such as company, product type, price, specifications, fields of use, characteristics, product semantics, and related accessories, was recorded on product analysis cards, from which a product information database can be built. This database can serve as a strong information bank.

2.2 Scenario Analysis and Product Classification

This stage was separated into three parts. The first was ‘product functions to applications’, in order to find which functions were supplied in the main market. For example, after analyzing competitors’ products we could elicit all the applications, including lighting automation, AV control, light control, presentation tools, window treatments, voting systems, drapes, shades & screens, climate control, internet access, remote manager, network campuses, media manager, security & intercoms, central control, security systems, automated bell towers, record manager, and electronic menus. The second was ‘field of use to scenarios’, to understand which applications fit each field of use, such as business, whole home, home theater, government, education, house of worship, MDU, private transportation, entertainment, healthcare, broadcasting, network operation, and retail hotels, as well as to choose the main target among these fields of use. For this research we took the business conference as an example. The third part was ‘scenario synthesis’, to factor in ADVANTECH’s strategy and to identify the main scenarios, fields of use, and their applications.

2.3 Usability Analysis and Product Positioning

In this stage, seven related products from seven different companies were subjected to usability analysis to understand their basic specifications, hardware interfaces, and accessories. Finally, a radar chart was used to synthesize and compare the product characteristics. From this we could define the product position.

2.4 User Observation with Existing Problems

During this stage, researchers observed the ‘field situation’ in an assembly hall and a business conference room where the central control system and wireless facility controller were installed. We conducted direct ‘user observation’ to discover how users interacted with the system, and asked questions that focused on existing problems.

2.5 Constructive Demo Design

After each stage, ‘rough’ design issues were synthesized, leading to several ‘demo’ designs. However, only after all processes were completed could the key design issues be gleaned and a ‘constructive demo’ design be developed accordingly.
3 Results

The main findings after completing all processes are discussed as follows: first, the findings gleaned from ‘scenario analysis’; second, the findings from ‘usability analysis’; and third, the findings gleaned from ‘user observation’. After all the results were synthesized, the new design concept could be accepted by the development team members, and the innovative results were better than those of the traditional method.

3.1 The Results of Competitor’s Product and Scenario Analysis

First, the findings gleaned from scenario analysis were modularization, standardization, and diversification (shown in Figure 2).
Fig. 2. The application of different field of use
By analyzing ‘product functions to applications’, ‘field of use to scenarios’, and ‘scenarios to interaction’, we found that many applications share the same key functions, so we needed to design a main type that has these key functions but to which functions can still be added to fit other fields of use. The main type must have one base cover which is different from the front cover and has different fast keys, styles, and textures. According to these results, the ‘demo design’ shown in Figure 3 was refined from ADVANTECH’s UbiQ350 (a kind of wireless facility controller). In this stage we also applied the ‘scenario approach’ to simulate the main user target and the different styles of conference room (shown in Figure 4). All users’ characteristics and space ‘styles’ can serve as references early in the design process.
Fig. 3. The demo concept from competitor’s product and scenario analysis
Fig. 4. Scenario analysis of main targets and styles
3.2 The Results of Usability Analysis

The product hardware interaction interface was analyzed as shown in Figure 5. The product radar chart has eight dimensions: product functionality, number of hotkeys, ease of handling, shape, price, battery life, overall product size, and LCD capability. From the radar-based usability analysis, we found that the controller should retain its multi-function aspect while remaining convenient, so we separated the touch panel and the fast keys into different parts. The independent fast-key part can be used conveniently, while advanced settings are supplied through the touch panel.

3.3 The Results of User Observation

User observation revealed an existing problem: handling should be improved. This should include advanced settings and accident prevention (shown in Figure 6).
Fig. 5. The product position analysis
Fig. 6. User observation situation
3.4 The Results of Final Design

After all the results were synthesized, the new design concept could be accepted by the development team members, and the innovative results were better than those of the traditional method. Seven design issues were synthesized: (1) modularizing the hardware and software to easily fit different fields of use; (2) making the number and function of fast keys changeable for different applications; (3) providing more functions while retaining the same convenience; (4) personalization and safety settings; (5) easy handling; (6) preventing unexpected starts; and (7) one-touch scenario presetting.
Based on all of the above, the final design is shown in Figure 7.
Fig. 7. Final design
The touch panel and fast key form different parts. The tube of the fast key part can be used conveniently and also supply advanced settings through the touch panel.

3.5 Design Framework

The user-oriented innovation design framework (UOIDF) was developed by means of a practical design case. The UOIDF was developed to enable designers to organize and manipulate user data or user-oriented information throughout a design process. In a design process, templates for archiving information into a UOIDF-structured database can be generated, and all types of design information, such as product functions, applications, field of use, observation and interactions, are then structured by those templates.

[Figure: UOIDF diagram. Labels: Product strategy; Product positioning; Competitor offering; Usability analysis; Scenario based approach (user, product, applications, field of use); User interaction; Product improvement; User observation; Scenario based approach]
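To make the idea of a template-structured database concrete, the following minimal Python sketch shows one way such a template could look. It is an assumption for illustration only, not the authors' implementation: the class names `ProductRecord` and `DesignInfoStore`, their fields, and the sample entries are all hypothetical, with field names chosen to mirror the information types listed in the text.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ProductRecord:
    """One template instance (hypothetical): design information for a product."""
    company: str
    product_type: str
    functions: List[str] = field(default_factory=list)
    applications: List[str] = field(default_factory=list)
    fields_of_use: List[str] = field(default_factory=list)
    observations: List[str] = field(default_factory=list)
    interactions: List[str] = field(default_factory=list)


class DesignInfoStore:
    """In-memory stand-in for a template-structured design database."""

    def __init__(self) -> None:
        self._records: List[ProductRecord] = []

    def add(self, record: ProductRecord) -> None:
        self._records.append(record)

    def by_field_of_use(self, name: str) -> List[ProductRecord]:
        """Return all records tagged with the given field of use."""
        return [r for r in self._records if name in r.fields_of_use]

    def shared_functions(self) -> Set[str]:
        """Functions common to every stored record (candidate key functions)."""
        if not self._records:
            return set()
        common = set(self._records[0].functions)
        for record in self._records[1:]:
            common &= set(record.functions)
        return common
```

A query such as `shared_functions()` corresponds loosely to the observation in Section 3.1 that many applications share the same key functions; a real database would of course need persistence and richer templates.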
4 Conclusion

This study suggests a practical approach to scenario and usability analysis through a simple, structured framework. The framework was outlined by three major components: the design strategy derived from analyzing competitors’ products with a scenario-based approach, with user, product, applications, and field of use as context variables; usability analysis of product interaction; and user observation of existing problems. Based on this framework, this study established methods to specify interactive product features, to define the development context, and to measure usability. The effectiveness of this framework was demonstrated through case studies in which interactive products were developed using the UOD concepts of this study.
5 Implications

This study is expected to help product design practitioners in the consumer electronics industry in various ways. Most directly, it supports product development teams in planning and carrying out the development of new concepts in a systematic and structured manner. In addition, it can be applied to other categories of interactive consumer products (such as appliances, automobiles, and communication devices) with minor modifications as necessary.
References

1. Chui, P., Wilcox, L.: Kumo Interactive: A Smart Conference Room. DARPA/NIST/NSF Workshop on Research Issues in Smart Computing Environments
2. Lim, Y., Sato, K.: Development of design information framework for interactive systems design. In: Proceedings of the 5th Asian International Symposium on Design Research, Seoul, Korea (2001)
3. Lim, Y., Sato, K.: Describing multiple aspects of use situation: applications of Design Information Framework (DIF) to scenario development. Design Studies 27(1) (2006)
4. Ritala, M., Tieranta, T., Vanhala, J.: Context Aware User Interface System for Smart Home Control. In: HOIT 2003 Conference, Irvine, California (2003)
5. Suri, J.F., Marsh, M.: Scenario building as an ergonomics method in consumer product design. Applied Ergonomics 31, 151–157 (2000)
6. ADVANTECH: http://www.advantech.com/
7. AMX: http://www.amx.com
8. CRESTRON: http://www.crestron.com/
What Makes Evaluators to Find More Usability Problems?: A Meta-analysis for Individual Detection Rates

Wonil Hwang¹ and Gavriel Salvendy²

¹ School of Industrial & Information System Engineering, Soongsil University, 511 Sangdo-Dong, Dongjak-Gu, Seoul, 156-743, South Korea
² School of Industrial Engineering, Purdue University, 315 N. Grant St., West Lafayette, IN 47906, USA, and Department of Industrial Engineering, Tsinghua University, Beijing 100084, P.R. China
[email protected], [email protected]
Abstract. Since many empirical results have been accumulated in usability evaluation research, it would be very useful to provide usability practitioners with generalized guidelines by analyzing the combined results. This study aims at estimating the individual detection rate for user-based testing and heuristic evaluation through meta-analysis, and at identifying significant factors that affect individual detection rates. Based on the results of 18 user-based testing and heuristic evaluation experiments, individual detection rates in user-based testing and heuristic evaluation were estimated as 0.36 and 0.14, respectively. Expertise and task type were found to be significant factors that improve the individual detection rate in heuristic evaluation.
In this situation, that is, a situation in which many empirical results have been accumulated from previous usability evaluation research, providing usability practitioners with generalized guidelines by analyzing the combined results of this body of work will be more useful than conducting research that involves all the factors. The objectives of this study are (a) to estimate individual detection rate for user-based testing and heuristic evaluation through meta-analysis, and (b) to find hidden factors, which affect individual detection rates.
2 Related Literature

Individual detection rate, which is the ratio of the number of usability problems found by an individual evaluator or test user to the number of real usability problems that exist, is an important measure in usability evaluation research, because it reflects the individual ability to detect usability problems in a certain situation. If the individual detection rate can be estimated more reliably, the optimal sample size issue, one of the most disputed issues in usability evaluation research, can be resolved. When Nielsen [14] suggested the so-called ‘4±1’ or ‘magic number five’ rule, which indicates that only 3 to 5 evaluators are needed to detect 80% of usability problems with the heuristic evaluation method, the underlying assumption would be that the mean probability of an evaluator detecting a problem (i.e., the mean individual detection rate) lay between 0.32 and 0.42. However, much empirical research has reported that the mean individual detection rate lies not in this range but in a much lower one. For example, Law and Hvannberg [10] reported that the mean of individual detection rates was 0.14 when user-based testing with the think-aloud method was employed, and Andre, Hartson and Williges [1] indicated that the mean of individual detection rates was 0.179 when heuristic evaluation was used. Since the results of usability evaluation experiments could not support the underlying assumption of Nielsen’s conclusion, there have been many arguments about the optimal sample size and the conditions under which Nielsen’s conclusion can be supported. Thus, in order to reach a more generalized conclusion about the optimal sample size issue, we need a valid and reliable estimate of the individual detection rate from the accumulated results of usability evaluation experiments.
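The link between the individual detection rate and the number of evaluators can be illustrated with a short Python sketch. It assumes the independence model commonly used in this literature, in which n evaluators who each find a problem with probability p together find a proportion 1 − (1 − p)^n of the problems; the text above does not state this formula, and the function names are hypothetical.

```python
import math


def proportion_found(p: float, n: int) -> float:
    """Expected proportion of problems found by n independent evaluators,
    each detecting a given problem with probability p (assumed model)."""
    return 1.0 - (1.0 - p) ** n


def evaluators_needed(p: float, target: float = 0.80) -> int:
    """Smallest n such that proportion_found(p, n) reaches the target."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

Under this assumed model, detection rates of 0.42 and 0.32 require 3 and 5 evaluators, respectively, to reach 80%, which matches the 3 to 5 range quoted above, whereas a detection rate of 0.14 would require 11.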
In addition, when the means of individual detection rates are significantly heterogeneous, we need to find hidden factors that affect the variability of individual detection rates, in order to suggest usability evaluation conditions under which evaluators’ ability to detect usability problems can be improved. There are many scientific ways to summarize, integrate, and interpret independent studies. One of them is meta-analysis, a statistical methodology for combining findings from a selected set of studies. Lipsey and Wilson [11] indicated three conditions under which meta-analysis can be applied. First, meta-analysis applies only to empirical studies that produce quantitative results. Second, meta-analysis is conducted on statistics that summarize the research results rather than on the original data sets. Third, meta-analysis aggregates and compares the results of independent studies that deal with the same constructs and relationships and report their results in similar statistical forms. Because of these limits on its application, meta-analysis has historically been applied to combine the results from studies that have been
repeated, such as astronomical and physical experiments, and social science research [4, 6]. During the 1990s, researchers tried to combine the results from software engineering studies using meta-analysis methods [3, 12]. When statistical information for estimating effect sizes, including means and standard deviations, can be obtained, parametric estimation models, such as the fixed effects model, are used [9]. The underlying idea of these models is to combine the effect size estimates computed from each study, weighting each study by the inverse of its variance. When a statistical test for the homogeneity of effect sizes shows a non-significant result, the estimates of effect sizes can be combined without problems. Otherwise, researchers need parametric models that explain the variance of effect sizes (i.e., the variability among individual studies). In general, the fixed effects model is employed when a researcher believes that there are systematic sources, acting as potential moderators, that can explain the variability among studies. Hedges [8] described the fixed effects model as an ANOVA analog model, which is used when the study characteristic variables (independent variables) are categorical.
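As a compact restatement of the weighting and homogeneity test just described, the standard fixed effects formulas can be written as follows. The notation is assumed here for illustration and does not appear in the original text: $T_i$ is the effect size from study $i$ of $k$, $v_i$ is its variance, $w_i$ is its weight, and in the ANOVA analog the studies are split into $p$ groups with group means $\bar{T}_j$.

```latex
\begin{align*}
w_i &= \frac{1}{v_i}, \qquad
\bar{T} = \frac{\sum_{i=1}^{k} w_i T_i}{\sum_{i=1}^{k} w_i}, \qquad
Q = \sum_{i=1}^{k} w_i \left(T_i - \bar{T}\right)^2, \\
Q &= Q_B + Q_W, \qquad
Q_B = \sum_{j=1}^{p} \Bigl(\sum_{i \in j} w_i\Bigr) \left(\bar{T}_j - \bar{T}\right)^2, \qquad
Q_W = \sum_{j=1}^{p} \sum_{i \in j} w_i \left(T_i - \bar{T}_j\right)^2 .
\end{align*}
```

Under homogeneity, $Q$ is compared with a chi-square distribution with $k-1$ degrees of freedom; in the ANOVA analog, $Q_B$ (with $p-1$ degrees of freedom) tests whether a categorical moderator separates the groups, and $Q_W$ (with $k-p$ degrees of freedom) tests whether the resulting subgroups are homogeneous.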
3 Methods

3.1 Data Collection

We collected 103 usability evaluation experiments published since 1990 in HCI-related journals, the proceedings of HCI-related conferences, technical reports, and books. Online academic databases and offline sources were used to search for relevant studies. Most of the references of the papers identified as relevant were also checked to make sure that no relevant studies were missed. As a result of this extended search without pre-selected sources, most of the major HCI-related journals, such as International Journal of Human–Computer Interaction, Behaviour & Information Technology, International Journal of Human–Computer Studies, and Human Factors, and the proceedings of major HCI-related conferences, such as the CHI Conference on Human Factors in Computing Systems and the Human Factors Society Annual Meeting, were included as sources of relevant studies. However, only 18 experimental results were used for the meta-analysis, because we selected usability evaluation experiments under two criteria: (a) user-based testing or the heuristic evaluation method was employed in the experiment, and (b) the experiment reported the mean and standard deviation of individual detection rates. Because a paper may report multiple experiments, each yielding an independent experimental result, the 18 experimental results used for the meta-analysis in this study come from 10 usability evaluation papers published between 1990 and 2004. Although some of the experiments were conducted under the same experimental conditions and reported in the same publications, each is considered an independent experiment that shares some experimental conditions, such as evaluated systems, task type, and report type, because each experiment was administered independently. All 18 experiments reported the mean and standard deviation of individual detection rates after the interfaces of software products or information systems were evaluated for usability problems using user-based testing (9 experiments) or heuristic evaluation (9 experiments) (see Table 1).
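Table 1. Data collected for meta-analysis

Usability evaluation method | Number of test users or evaluators | Mean of individual detection rates | Standard deviation of individual detection rates | Reference
----------------------------|------------------------------------|------------------------------------|--------------------------------------------------|----------
User-based testing | 20 | 0.36 | 0.0006 | [17]
User-based testing | 12 | 0.32 | 0.0014 | [18]
User-based testing | 20 | 0.42 | 0.0015 | [18]
User-based testing | 17 | 0.14 | 0.07 | [10]
User-based testing | 36 | 0.4625 | 0.2032 | [13]
User-based testing | 18 | 0.0799 | 0.0269 | [16]
User-based testing | 18 | 0.0926 | 0.0595 | [16]
User-based testing | 7 | 0.1984 | 0.0707 | [16]
User-based testing | 6 | 0.2432 | 0.1172 | [16]
Heuristic evaluation | 12 | 0.084 | 0.038 | [19]
Heuristic evaluation | 12 | 0.094 | 0.064 | [19]
Heuristic evaluation | 14 | 0.19 | 0.0199 | [5]
Heuristic evaluation | 10 | 0.179 | 0.032 | [1]
Heuristic evaluation | 16 | 0.203 | 0.075 | [2]
Heuristic evaluation | 9 | 0.14 | 0.123 | [2]
Heuristic evaluation | 16 | 0.279 | 0.11 | [2]
Heuristic evaluation | 11 | 0.222 | 0.095 | [2]
Heuristic evaluation | 18 | 0.046 | 0.025 | [10]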
3.2 Methods of Analysis
First, bubble charts were used to inspect the overall shape of the data before the meta-analysis was conducted. In the bubble charts, each bubble represents an experiment, the center of the bubble marks the mean of individual detection rates plotted against the number of test users or evaluators, and the radius of the bubble represents the standard deviation of individual detection rates. In keeping with the philosophy of meta-analysis, relatively small bubbles, which indicate smaller standard deviations, are given more weight than large bubbles, which indicate larger standard deviations, when the mean of individual detection rates is estimated. Second, the means of individual detection rates from the individual usability evaluation experiments were combined, using the inverse of the variance as a weight, to estimate the individual detection rate for user-based testing and for heuristic evaluation. The Q statistic [15], which is known to follow a chi-square distribution, was calculated to test the homogeneity of the effect sizes (i.e., the means of individual detection rates in this study) that were used to estimate the parameter (i.e., the individual detection rate in this study). Third, if the means of individual detection rates were significantly heterogeneous, the fixed effects model was applied to find hidden factors that explain the variability of the effect sizes. In practice, candidate hidden factors were selected, and homogeneity tests were then conducted repeatedly to check whether each candidate contributed to forming homogeneous subgroups, until hidden factors were identified as moderators in the fixed effects model.
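The inverse-variance combination and the homogeneity test described above can be written out as follows. This is a sketch in a notation introduced here for illustration, not taken from the paper: $k$ is the number of experiments, $\bar{x}_i$ and $s_i$ are the mean and standard deviation of individual detection rates reported by experiment $i$, and the choice $w_i = 1/s_i^2$ is an assumption about which variance is inverted.

```latex
\[
  w_i = \frac{1}{s_i^{2}}, \qquad
  \hat{p} = \frac{\sum_{i=1}^{k} w_i\,\bar{x}_i}{\sum_{i=1}^{k} w_i}, \qquad
  \operatorname{Var}(\hat{p}) = \frac{1}{\sum_{i=1}^{k} w_i}
\]
\[
  \text{95\% confidence interval: } \hat{p} \pm 1.96\,\sqrt{\operatorname{Var}(\hat{p})}
\]
\[
  Q = \sum_{i=1}^{k} w_i\,(\bar{x}_i - \hat{p})^{2},
  \qquad Q \sim \chi^{2}_{k-1} \text{ if the effect sizes are homogeneous}
\]
```

Homogeneity is rejected when $Q$ exceeds the critical value of the chi-square distribution with $k-1$ degrees of freedom at the chosen significance level.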
4 Results
4.1 Bubble Charts Analysis
To see the overall shape of the data used for the meta-analysis, two bubble charts were drawn, one for user-based testing and one for heuristic evaluation. In the bubble charts, the x-axis represents the number of test users or evaluators, and the y-axis represents the mean of individual detection rates. As shown in Figure 1 and Figure 2, the bubbles for user-based testing are more widely scattered along both axes than those for heuristic evaluation, and they also vary more in size. This means that the user-based testing data cover a wider range of means and standard deviations of individual detection rates, and of numbers of test users, than the data collected from heuristic evaluation experiments. Thus, we can conclude from the bubble chart analysis that the means of individual detection rates reported from user-based testing experiments are fairly heterogeneous, whereas those from heuristic evaluation experiments are somewhat heterogeneous but could possibly be divided into subgroups.
Fig. 1. Bubble chart of user-based testing data (x-axis: number of test users, 0 to 40; y-axis: mean of individual detection rate, 0 to 0.6)
Fig. 2. Bubble chart of heuristic evaluation data (x-axis: number of evaluators, 0 to 20; y-axis: mean of individual detection rate, 0 to 0.6)

4.2 Estimation of Individual Detection Rate
Meta-analyses were conducted to combine the results from the 9 user-based testing experiments and from the 9 heuristic evaluation experiments, respectively. As shown in Table 2, when user-based testing is used, the estimated individual detection rate is 0.36 and its 95% confidence interval is 0.3605 ~ 0.3625. When heuristic evaluation is used, the estimated individual detection rate is 0.14 and its 95% confidence interval is 0.115 ~ 0.164, which is not consistent with the assumption underlying Nielsen's [14] conclusion. However, the homogeneity tests for the effect sizes show that the means of individual detection rates from both user-based testing and heuristic evaluation are significantly heterogeneous. Thus, we need to employ the fixed effects model to find hidden factors that account for the heterogeneity of the means of individual detection rates.

Table 2. Estimated individual detection rates and homogeneity tests

Usability evaluation method   Estimated individual detection rate   95% CI of mean, lower bound   95% CI of mean, upper bound   Q statistic   Chi-square (d.f. = 8, α = 0.05)
User-based testing            0.3615                                0.3605                        0.3625                        2552.558      15.507
Heuristic evaluation          0.1393                                0.1150                        0.1637                        27.669        15.507
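The pooled estimates and Q statistics in Table 2 can be recomputed from the means and standard deviations in Table 1. The following is a minimal sketch, not the authors' own procedure: it assumes that the first nine rows of Table 1 are the user-based testing experiments and the last nine the heuristic evaluation experiments, and that each weight is the reciprocal of the squared reported standard deviation. The function name `pool` and the constant names are hypothetical.

```python
import math

# (mean, standard deviation) of individual detection rates, copied from Table 1.
USER_BASED = [
    (0.36, 0.0006), (0.32, 0.0014), (0.42, 0.0015),
    (0.14, 0.07), (0.4625, 0.2032), (0.0799, 0.0269),
    (0.0926, 0.0595), (0.1984, 0.0707), (0.2432, 0.1172),
]
HEURISTIC = [
    (0.084, 0.038), (0.094, 0.064), (0.19, 0.0199),
    (0.179, 0.032), (0.203, 0.075), (0.14, 0.123),
    (0.279, 0.11), (0.222, 0.095), (0.046, 0.025),
]

# Critical chi-square value for d.f. = 8 and alpha = 0.05, as quoted in Table 2.
CHI2_CRITICAL_DF8 = 15.507


def pool(experiments):
    """Return (pooled mean, CI lower, CI upper, Q) for a list of (mean, sd) pairs."""
    # Assumption: the weight of an experiment is 1 / sd**2.
    weights = [1.0 / sd ** 2 for _, sd in experiments]
    total = sum(weights)
    pooled = sum(w * m for w, (m, _) in zip(weights, experiments)) / total
    half_width = 1.96 * math.sqrt(1.0 / total)
    # Q is the weighted sum of squared deviations from the pooled mean.
    q = sum(w * (m - pooled) ** 2 for w, (m, _) in zip(weights, experiments))
    return pooled, pooled - half_width, pooled + half_width, q


for name, data in (("User-based testing", USER_BASED),
                   ("Heuristic evaluation", HEURISTIC)):
    pooled, low, high, q = pool(data)
    verdict = "heterogeneous" if q > CHI2_CRITICAL_DF8 else "homogeneous"
    print(f"{name}: {pooled:.4f} [{low:.4f}, {high:.4f}], Q = {q:.3f} ({verdict})")
```

Under these assumptions the sketch gives pooled rates of about 0.3615 and 0.1393, in line with Table 2. It also makes visible why the user-based estimate sits so close to 0.36: the three experiments with standard deviations below 0.002 carry almost all of the weight.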
4.3 Hidden Factors for Individual Detection Rate in Heuristic Evaluation
We considered experimental conditions, such as the expertise of test users or evaluators, task type, type of evaluated system, experimental duration, and report type, as candidate hidden factors that might explain the heterogeneity of the means of individual detection rates. For user-based testing, the means of individual detection rates were too heterogeneous to form homogeneous subgroups based on these candidates. Thus, this study could not identify hidden factors that affect the individual detection rate in user-based testing.
For heuristic evaluation, evaluator expertise (expert vs. novice) and task type (scenario-based task vs. free exploration) were found to be significant factors that explain the variability of individual detection rates. Using evaluator expertise and task type, the heterogeneous data could be divided into three homogeneous subgroups: heuristic evaluation done by experts, by novices with a free exploration task, and by novices with a scenario-based task (see Table 3). Because only one experimental result came from an evaluation done by novices with a scenario-based task, no homogeneity test was conducted for this case. When novice evaluators conduct heuristic evaluation with a free exploration task, the estimated individual detection rate is the highest (0.19). This implies that, to improve evaluators' problem detection ability, the conditions of heuristic evaluation should be set up to resemble those of user-based testing (novice + free exploration).

Table 3. Hidden factors for individual detection rate in heuristic evaluation

Expertise   Task type          Estimated individual detection rate   95% confidence interval of mean (lower bound, upper bound)
Expert      Mixed
Novice      Free exploration
Novice      Scenario-based
5 Conclusion and Discussion
We conducted meta-analyses to combine results from user-based testing and heuristic evaluation experiments and thereby estimate individual detection rates for the two methods. The estimated individual detection rates for user-based testing and heuristic evaluation were 0.36 and 0.14, respectively, but they need to be interpreted carefully because they were derived from heterogeneous data. For heuristic evaluation, however, based on the fixed effects model with expertise and task type as moderators, we estimated two individual detection rates from homogeneous subsets of the data: 0.14 (when experts conduct heuristic evaluation) and 0.19 (when novice evaluators conduct heuristic evaluation with a free exploration task). These individual detection rates are not consistent with the assumption underlying Nielsen's [14] conclusion.
This study makes two contributions to usability evaluation research. First, it combined results from user-based testing and heuristic evaluation experiments and estimated individual detection rates for those usability evaluation methods. Usability practitioners can use these generalized individual detection rates when deciding optimal sample sizes for usability evaluation. Second, it identified significant factors, namely expertise and task type, that improve the individual detection rate in heuristic evaluation. Usability practitioners can consider these factors to improve the performance of usability evaluation.
However, this study is limited in that the number of collected data points is small. Not enough experiments have reported the statistical information needed for meta-analysis, which is one reason why meta-analysis has often been abandoned in usability evaluation research [7]. In addition, the data collected from user-based testing were significantly heterogeneous, but we could not find significant factors that explain the variability of the means of individual detection rates. This issue is left for future study.
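To illustrate how an individual detection rate might feed into a sample-size decision, the sketch below assumes the cumulative detection model commonly used in this literature, in which n independent evaluators with a common detection rate p are expected to find a share 1 − (1 − p)^n of the problems. The model, the 80% target and the function name `evaluators_needed` are assumptions introduced here for illustration; they are not results of this study, and the rates themselves were derived from partly heterogeneous data.

```python
import math


def evaluators_needed(detection_rate, target_share):
    """Smallest n such that 1 - (1 - detection_rate)**n >= target_share."""
    if not 0 < detection_rate < 1 or not 0 < target_share < 1:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # Solve (1 - p)**n <= 1 - target for n and round up to a whole evaluator.
    return math.ceil(math.log(1 - target_share) / math.log(1 - detection_rate))


TARGET = 0.80  # hypothetical goal: find 80% of the usability problems

# Individual detection rates reported in this study.
rates = [
    ("User-based testing", 0.36),
    ("Heuristic evaluation, experts", 0.14),
    ("Heuristic evaluation, novices with free exploration", 0.19),
]

for label, rate in rates:
    print(f"{label}: p = {rate}, evaluators needed = {evaluators_needed(rate, TARGET)}")
```

The required number of evaluators grows quickly as p falls, which is why a reliable estimate of the individual detection rate matters for planning an evaluation.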
References
1. Andre, T.S., Hartson, H.R., Williges, R.C.: Determining the effectiveness of the usability problem inspector: a theory-based model and tool for finding usability problems. Human Factors 45, 455–482 (2003)
2. Baker, K., Greenberg, S., Gutwin, C.: Empirical development of a heuristic evaluation methodology for shared workspace groupware. In: Proceedings of the 2002 ACM Conference on Computer supported cooperative work, pp. 96–105. ACM, New York (2002)
3. Chen, C., Rada, R.: Interacting with hypertext: a meta-analysis of experimental studies. Human-Computer Interaction 11, 125–156 (1996)
4. Cook, T.D., Leviton, L.C.: Reviewing the literature: a comparison of traditional methods with meta-analysis. Journal of Personality 48, 449–472 (1980)
5. De Angeli, A., Matera, M., Costabile, M.F., Garzotto, F., Paolini, P.: Validating the SUE inspection technique. In: Di Gesù, V., Levialdi, S., Tarantino, L. (eds.) Proceedings of Advanced Visual Interfaces (AVI’2000), pp. 143–150. ACM, New York (2000)
6. Glass, G.V., McGaw, B., Smith, M.L.: Meta-analysis in social research. Sage Publications, Beverly Hills CA (1981)
7. Hartson, H.R., Andre, T.S., Williges, R.C.: Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction 15, 145–181 (2003)
8. Hedges, L.V.: Fixed effects models. In: Cooper, H., Hedges, L.V. (eds.) The handbook of research synthesis, pp. 285–299. Russell Sage Foundation, New York (1994)
9. Hedges, L.V., Olkin, I.: Statistical methods for meta-analysis. Academic Press, Orlando FL (1985)
10. Law, L.-C., Hvannberg, E.T.: Analysis of combinatorial user effect in international usability tests. In: CHI Conference on Human Factors in Computing Systems, pp. 9–16. ACM, New York (2004)
11. Lipsey, M.W., Wilson, D.B.: Practical meta-analysis. SAGE Publications, Thousand Oaks CA (2001)
12. McLeod, P.L.: An assessment of the experimental literature on electronic support of group work: results of a meta-analysis. Human-Computer Interaction 7, 257–280 (1992)
13. Nielsen, J.: Finding usability problems through heuristic evaluation. In: CHI Conference on Human Factors in Computing Systems, pp. 373–380. ACM, New York (1992)
14. Nielsen, J.: Estimating the number of subjects needed for a thinking aloud test. International Journal of Human–Computer Studies 41, 385–397 (1994)
15. Shadish, W.R., Haddock, C.K.: Combining estimates of effect size. In: Cooper, H., Hedges, L.V. (eds.) The handbook of research synthesis, pp. 261–281. Russell Sage Foundation, New York (1994)
16. Spool, J., Schroeder, W.: Testing web sites: Five users is nowhere near enough. In: CHI ’01 extended abstracts on Human factors in computing systems, pp. 285–286. ACM, New York (2001)
17. Virzi, R.A.: Streamlining the design process: Running fewer subjects. In: Human Factors Society 34th Annual Meeting. Human Factors and Ergonomics Society, pp. 291–294. Human Factors and Ergonomics Society, Santa Monica CA (1990)
18. Virzi, R.A.: Refining the test phase of usability evaluation: how many subjects is enough? Human Factors 34, 457–468 (1992)
19. Zhang, Z., Basili, V., Shneiderman, B.: Perspective-based usability inspection: An empirical validation of efficacy. Empirical Software Engineering 4, 43–69 (1999)
Evaluating in a Healthcare Setting: A Comparison Between Concurrent and Retrospective Verbalisation
Janne Jul Jensen
Department of Computer Science, Aalborg University
Fredrik Bajers Vej 7, E2-220, DK-9210 Aalborg East, Denmark
[email protected]
Abstract. The think-aloud protocol, also known as the concurrent verbalisation protocol, is widely used in the field of HCI today, but as technology and applications have evolved, the protocol has had to adapt. New variations of the protocol have therefore emerged, one example being retrospective verbalisation. To compare concurrent and retrospective verbalisation, an experiment was conducted in which a home healthcare application was evaluated with 15 participants using the two protocols. The results show that each protocol has its own strengths and weaknesses, and that the two are about equally good although very different.
the decrease of mental workload, as the participant is now free to focus on the task at hand. However, a drawback could be that participants quickly forget specific details of the task-solving process and are then unable to recall them afterwards [3]. To shed some light on the pros and cons of the two protocols, an experiment was conducted as a field evaluation in the home healthcare system. This setting and type of evaluation were chosen to make the setting as realistic as possible, in order to investigate any effects the surroundings might have with regard to sensitivity: is it possible to observe any awkwardness in using the concurrent think-aloud protocol, compared to the retrospective think-aloud protocol, in a sensitive setting?
2 The Experiment
To compare concurrent and retrospective verbalisation in a healthcare setting and to test the appropriateness of each protocol, an experiment was conducted. It was set up as a field evaluation to create as realistic a setting as possible. The system chosen for evaluation was an application developed to aid home healthcare workers in their daily work. It is an electronic replacement for the existing paper-based system currently in use in many municipalities in Denmark. It supports the current work procedure and offers new functionality, such as wireless access to additional information about the elderly citizens and the progress of coworkers, information that was previously available only at the main office building.
2.1 Participants
15 participants were chosen with the help of the head of the group of home healthcare workers, with due consideration for work plans etc. All 15 were trained home healthcare workers, and their demographic data are shown in Table 1.
Table 1. The demographic data of the 15 participants in the two protocols

Protocol        Statistic   Age    Experience local   Experience total   Experience computer (1-6)
Concurrent      Average     42.4   7                  10.3               3.9
Concurrent      High        57     18                 23                 6
Concurrent      Low         31     1                  1½                 1
Retrospective   Average     42.0   5½                 8¼                 3
Retrospective   High        54     12                 13                 6
Retrospective   Low         33     2½                 3¾                 1
The table shows the age, the experience as a home healthcare worker in the municipality where the experiment took place, the total experience as a home healthcare worker, and the level of experience with computers on a scale from one to six, where 1 is most experienced and 6 is least experienced. For each of these variables, the high, low and average values have been calculated for each protocol.
2.2 Equipment
To support the field evaluation, a mobile laboratory was used. It consists of small clip-on wireless mobile cameras (see Figure 1), wireless microphones and a mobile digital video recorder. Running it also requires various types of batteries and receivers for the wireless technology. Only the camera and microphone are carried by the participant; the rest is carried by the test monitor, packed in a small bag (see Figures 2 and 3).
Fig. 1. The small clip-on wireless mobile camera from the mobile laboratory
Fig. 2. The equipment in the mobile laboratory used for concurrent verbalisation
Fig. 3. The mobile laboratory packed up for use
Fig. 4. The setup for retrospective verbalisation
For retrospective verbalisation, the digital recordings from the mobile video recorder were played back to the participant and the retrospective verbalisation was caught using a camcorder (see figure 4).
2.3 Procedure
To gain the necessary insight into the field of home healthcare, a small ethnographic field study was conducted. Based on a thorough examination of the system and the insight gained from the field study, 8 tasks covering a wide range of the commonly used functionality in the application were designed, and the experiment was then designed in detail. With the design in place, a pilot was conducted for both protocols and the setup was adapted according to the minor issues discovered.
15 participants were recruited from a local municipality. 14 were female and one was male, which was representative of the employment situation, where women far outnumbered men. The experiment took six days and all evaluations were recorded on video. The evaluations took place in six different homes of actual elderly citizens, with the citizen present during the evaluation to further heighten the realism of the experiment. 7 of the 15 participants were assigned to evaluate using retrospective verbalisation, while the remaining 8 evaluated the application using concurrent verbalisation.
Each participant was given a thorough introduction to the experiment, explaining the equipment and its function, what their contribution was, what was expected of them, what would happen etc. They were also instructed thoroughly in how to apply the protocol assigned to them. They were then given 10 minutes to familiarise themselves freely with the system before trying to solve the tasks. After the introduction, the experiment itself took place in the home of an elderly citizen, where the participants attempted to solve the tasks handed out. 8 participants solved them while thinking aloud, whereas the other 7 had their test session played back to them on a screen afterwards and thought aloud during the replay. Upon completion of the evaluation, each participant was debriefed.
All the raw video data were analysed afterwards and a list of problems was constructed. The severity of each problem was categorised according to the definition by Rolf Molich [5], under which a problem experienced by a participant falls into one of three categories:
• Cosmetic: The user is delayed for less than one minute, is mildly irritated or is confronted with information which to a lesser degree deviates from the expected.
• Serious: The user is delayed for several minutes, is somewhat irritated or is confronted with information which to some degree deviates from the expected.
• Critical: The user's attempt to solve the task comes to a halt; the user is very irritated or is confronted with information which to a critical degree deviates from the expected.
The categorisation was done by observing the video recording of each participant and then evaluating each situation according to the guidelines described above. A given problem is often not experienced as equally serious by every participant; in those cases the problem is placed in the most severe category experienced.
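The rule that a problem takes the most severe category experienced by any participant can be sketched as follows. This is an illustration only: the class `Severity`, the function `categorise` and the per-participant ratings are hypothetical and do not come from the experiment's data.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity categories, ordered so that a higher value is more severe."""
    COSMETIC = 1
    SERIOUS = 2
    CRITICAL = 3


def categorise(ratings):
    """Return the most severe rating any participant gave to one problem."""
    return max(ratings)


# Hypothetical ratings: one entry per participant who experienced the problem.
observations = {
    "Pressed login before entering username and password": [
        Severity.COSMETIC, Severity.SERIOUS,
    ],
    "Did not understand the error message": [
        Severity.SERIOUS, Severity.CRITICAL, Severity.COSMETIC,
    ],
}

final = {problem: categorise(ratings) for problem, ratings in observations.items()}
for problem, severity in final.items():
    print(f"{problem}: {severity.name}")
```

Using an ordered enumeration keeps the rule to a single `max` call and avoids comparing category names as strings.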
3 Results
This section sums up the observations made from the list of problems extracted from the analysis of the raw video data.
3.1 Problems Revealed
In total, 105 problems were identified through the evaluation. Interestingly, the participants using concurrent verbalisation revealed a total of 87 problems, whereas the participants using retrospective verbalisation experienced only 61. The origin of this rather large difference is not clear. One explanation could be that the participants evaluating with retrospective verbalisation had an average computer experience level almost a point better (3.0) than that of the participants using concurrent verbalisation (3.9) on a scale from 1 to 6, where 1 is most experienced (see Table 2).

Table 2. Total number of problems, unique problems and the average computer skill of the participants

                              All   Concurrent verbalisation   Retrospective verbalisation
Problems revealed             105   87                         61
Unique problems*              44    30 (47)                    14 (33)
Average computer experience   3.4   3.9                        3.0

* Note that the number in parentheses refers to problems that are unique to that protocol and not necessarily unique in total.
3.2 Unique Problems
In total the experiment reveals 44 unique problems. 30 of these are revealed by the concurrent verbalisation protocol, whereas the retrospective verbalisation protocol reveals only 14 of the 44. Even when we look at problems that are unique within each protocol, concurrent verbalisation discovers 47 such problems, whereas retrospective verbalisation encounters only 33 (see Table 2). It has long been debated in the literature whether unique problems are real or “false” problems, since they are encountered by only one participant during the evaluation, and whether this becomes more likely as the number of participants increases. If unique problems are indeed “false” problems, then this experiment could indicate that retrospective verbalisation is better at eliminating them. This could be because the protocol relies on recall: the participant simply recalls fewer of these “false” problems afterwards than would be verbalised in the situation, because they were not really problems after all.
3.3 “False” Problems – Do They Exist?
However, retrospective verbalisation finds only slightly more than half of the total number of problems, and the question is whether nearly half of the problems found can be considered “false” problems. When looking at severity, concurrent verbalisation finds more problems in all three categories. If the extra problems found by concurrent verbalisation were “false” problems, it would be fair to assume that they would appear mostly as cosmetic problems. However, it is difficult to dismiss problems categorised as critical as being false, so eliminating “false” problems can only partly explain why retrospective verbalisation finds only slightly more than half the problems. Another explanation might be that the participant forgets some of the problems in the short time between the evaluation and the retrospective verbalisation. Perhaps problems seem less frustrating when looking back than when in the middle of them. It is also possible that it is easier for the participant to keep an overview when sitting outside the situation looking in.
3.4 Problems Detected by Both Protocols
43 problems are registered by both protocols. In one example, the participant did not enter a username and password before pressing the “login” button. In another, the participants did not understand the error message displayed to them. In a third, the participants thought that “Unplanned task” adds an extra task to the visit in progress. These three problems are typical of the 43 problems the two protocols have in common, and an initial inspection does not reveal any connection between them that explains why exactly those problems were revealed by both protocols. The same is true of the unique problems, which do not seem to have anything in common either. Examples are: a participant thinks TAB will move the cursor to the next text field; a participant is unsure how to end a visit in progress; and a participant is unsure what data the “search” button searches in.
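The overlap of 43 problems follows from the totals in Table 2 by inclusion-exclusion. As a short worked check, write $C$ and $R$ for the sets of problems revealed under concurrent and retrospective verbalisation (set names introduced here for illustration):

```latex
\[
  |C \cap R| = |C| + |R| - |C \cup R| = 87 + 61 - 105 = 43
\]
\[
  |C \setminus R| = 87 - 43 = 44, \qquad |R \setminus C| = 61 - 43 = 18
\]
```

So 44 problems were revealed only under concurrent verbalisation and 18 only under retrospective verbalisation. These counts are not the same as the figures in parentheses in Table 2, which appear to count problems experienced by a single participant within a protocol.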
3.5 Few or Many – Nothing in Between
It is notable that the participants in concurrent verbalisation seem to fall into one of two groups. They experience either few or many problems, not an average number in between, whereas the number of problems experienced by the participants in retrospective verbalisation is more evenly spread. Three of the participants using concurrent verbalisation experience only a few problems (6-11) while the other five experience many (21-36), but none of them experience a number in between (12-20). This could be due to difficulties in verbalising concurrently with the task-solving, which has been reported as a drawback of the concurrent think-aloud protocol [7]. It can show itself either as very little verbalisation, because verbalising while solving the task is difficult (few problems experienced), or as extra problems caused by the loss of concentration that the simultaneous verbalisation brings (many problems experienced). In retrospective verbalisation the numbers are much more evenly spread, because the mental workload is lowered by letting the participants concentrate on one thing at a time; the differing numbers of problems experienced might simply be caused by varying computer skills and by differing abilities to recall the thought process in detail.
3.6 The Diverse Participants
Each participant in concurrent verbalisation revealed an average of 20.8 problems, whereas each participant in retrospective verbalisation discovered an average of only 16.0 problems (see Table 3). This difference is not particularly large, though, considering the large spread in the number of problems experienced by individual participants, and such a spread is probably to be expected in a group as diverse as the present one. The group varied widely in both job experience and computer experience, so it would have been surprising if the participants had experienced similar numbers of problems.

Table 3. Average number of problems experienced in total and for each of the two protocols

                   Total   Concurrent verbalisation   Retrospective verbalisation
Average problems   18.5    20.8                       16.0
4 Discussion
Many attempts have been made to determine which of the two verbalisation protocols is better, but so far the results differ between studies. Nielsen et al. [7] identify quite a few weaknesses in concurrent verbalisation and propose that Mind Tape (a version of retrospective verbalisation) is a more viable option, whereas van den Haak et al. [9] rate the two protocols as equally good although clearly different. This study indicates that concurrent verbalisation finds more problems than retrospective verbalisation, but the smaller number found by retrospective verbalisation can be both a good and a bad thing: good if it means that the number of “false” (unique) problems is minimised; bad since it is not only “false” problems that go undiscovered. Concurrent verbalisation, on the other hand, seems to place a higher mental workload on the participants, causing them either to focus on the task-solving process and forget to verbalise, or to focus on the verbalisation and lose concentration on the task-solving. However, the reason that retrospective verbalisation finds fewer problems might be that, even in the short time between the actual evaluation and the retrospective verbalisation, things have already started to fade from the participant's memory and problems are forgotten. Thus, the conclusion leans towards that of van den Haak et al. [9]: the protocols are equally good, but very different.
As the observant reader might have noticed, the two protocols did not have the same number of participants: 8 participants used concurrent verbalisation, while only 7 used retrospective verbalisation.
This of course influences the results in the subsection Problems Revealed of the Results section. The numbers can be corrected to compensate for this by taking all possible combinations of 7 participants out of the 8 in concurrent verbalisation and averaging the number of problems found by these combinations. Even then, concurrent verbalisation still reveals 81.125 problems to retrospective verbalisation's 61. This is still a notable difference and does not change the conclusions drawn. The same is the case in the subsection Unique Problems, where concurrent verbalisation still finds 27.3 of the globally unique problems (compared to 30) and 41.1 problems that are unique to that protocol (compared to 47) when the numbers are corrected for the extra participant as described above. Here, too, the differences remain noteworthy after the compensation and therefore do not change any of the above. It may look odd to talk about a fraction of a problem, but the figures simply illustrate the average number of problems that would have been experienced had we used only 7 participants instead of 8, regardless of which 7 of the 8 we chose. With the corrected numbers, Table 2 would look as shown in Table 4.

Table 4. Table 2 as it would look with the corrected numbers for concurrent verbalisation

                              All   Concurrent verbalisation   Retrospective verbalisation
Problems revealed             105   87                         61
Unique problems*              44    27.3 (41.1)                14 (33)
Average computer experience   3.4   3.9                        3.0

* Note that the number in parentheses refers to problems that are unique to that protocol and not necessarily unique in total.
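The correction for the extra participant, averaging over every way of choosing 7 of the 8 concurrent participants, can be sketched as follows. The per-participant problem sets below are hypothetical, since the paper does not list which participant experienced which problem, and the function name `average_problems_for_subsets` is likewise an assumption.

```python
from itertools import combinations
from statistics import mean


def average_problems_for_subsets(problem_sets, subset_size):
    """Average number of distinct problems over all participant subsets of a given size."""
    counts = [
        len(set().union(*subset))  # distinct problems found by this subset
        for subset in combinations(problem_sets, subset_size)
    ]
    return mean(counts)


# Hypothetical data: the set of problem IDs experienced by each of 8 participants.
concurrent = [
    {1, 2, 3}, {2, 3, 4}, {1, 5}, {6},
    {2, 7, 8}, {3, 9}, {1, 2, 10}, {4, 11},
]

all_eight = len(set().union(*concurrent))
corrected = average_problems_for_subsets(concurrent, 7)
print(f"Problems found by all 8 participants: {all_eight}")
print(f"Average over all groups of 7: {corrected}")
```

Because the average runs over every group of 7, the corrected figure does not depend on which participant is left out, which is also why it can be fractional.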
One purpose of the experiment was to look at the suitability of the protocols for sensitive settings, in this case healthcare in a field evaluation. Surprisingly, and contrary to expectations, there was no evidence that the participants using concurrent verbalisation were influenced by the awkwardness or private nature of the information they were verbalising. This indicates that this is not an issue that affects the test situation or the participant. It is, however, unclear whether this holds for other settings, and it would be interesting to explore whether settings that can be described as sensitive influence the suitability of verbalisation. This requires defining what makes a setting sensitive, such as surroundings, participants etc., and then identifying application areas where this could pose a problem.
Acknowledgements. The research behind this paper was partly financed by the Danish Research Councils (grant number 2106-04-0022, the USE-project), without which it would not have been possible. I would also like to thank my supervisor for his continuously constructive comments on the paper. Finally, a thank you to the home healthcare workers of Aars kommune in Denmark, who agreed to participate in this experiment, and to the elderly citizens, who so willingly opened their homes to us.
References
1. Dix, A., Finlay, J., Abowd, G., Beale, R.: Human-Computer Interaction. Prentice Hall, Englewood Cliffs (1997)
2. Duncker, K.: On Problem-solving, in Dashiell, John F.: Psychological Monographs. The American Psychological Association, Inc., vol. 58, pp. 1–114 (1945)
3. Ericsson, K.A., Simon, H.A.: Protocol Analysis: Verbal Reports as Data. MIT Press, Cambridge, MA (1993)
4. Hackos, J.T., Redish, J.C.: User and Task Analysis for Interface Design. Wiley, Chichester, UK (1998)
5. Molich, R.: User Friendly Systems (in Danish), Teknisk Forlag (1994)
6. Nielsen, J.: Estimating the Number of Subjects Needed for a Thinking Aloud Test. International Journal of Human-Computer Studies 41(3), 385–397 (1994)
7. Nielsen, J., Clemmensen, T., Yssing, C.: Getting access to what goes on in people’s heads? – Reflections on the think-aloud technique. In: Proceedings of NordiCHI, ACM Press, New York (2002)
8. Preece, J.: Human-Computer Interaction. Addison-Wesley, London, UK (1994)
9. van den Haak, M., de Jong, M.D.T., Schellens, P.J.: Retrospective vs. concurrent think-aloud protocols: testing the usability of an online library catalogue. In: Behaviour and Information Technology, vol. 22, pp. 339–351. Taylor & Francis Ltd, London (2003)
Development of AHP Model for Telematics Haptic Interface Evaluation Yong Gu Ji, Beom Suk Jin, Jae Seung Mun, and Sang Min Ko Yonsei University, 134 Sinchon-Dong, Seodaemun-gu, Seoul, Korea {yongguji, kbf2514jin, mjs, sangminko}@yonsei.ac.kr
Abstract. These days, the main focus in developing telematics systems is to promote safety by decreasing the workload of the driver. To achieve this goal, simplification of the interface as well as the resolution of GUI interaction problems must be worked on. For this research, objective and quantitative assessments are provided in the early steps of building the haptic interface model. The purpose of this research is to create an evaluation model that uses the Analytic Hierarchy Process (AHP) method to fulfill user requirements. This research developed an AHP evaluation model that can present recommendations, as well as the degree of importance, for haptic interface design with quantitative assessments of the prototype by finding out the absolute and relative importance for evaluation groups and factors in early design levels using AHP. Keywords: Analytic Hierarchy Process, Haptic Device, Haptic Interface, Telematics.
Therefore, simplifying complex GUI interactions to reduce driver workload and secure driver safety has become an important issue for next-generation telematics systems. New interaction methods are needed to overcome the shortcomings of some aspects of speech recognition and touch screen technology. A multifunction display and device control based on a mental model would be one solution. The haptic interface will be an important part of intelligent vehicles, and drivers will be able to manipulate devices easily using a haptic interface [1]. Moreover, using the haptic interface will help us obtain core technology that can affect the future market of intelligent vehicles. If tactile feedback is supported in telematics devices, it would greatly reduce the chance of malfunctions in telematics systems caused by overload during driving. Furthermore, tactile feedback provides drivers with useful information on the functions of the telematics system. It will contribute to the reduction of driver distraction and thereby further support driver safety. While the previous haptic interface model, with its difficult and complex manipulations, has been inefficient, the new telematics device will support efficient and intuitive manipulation using combinations of tactile feedback [11]. Due to the lack of quantitative evaluation in early design steps, customer needs have not been reflected sufficiently in the haptic device. In this study, to reflect customer needs properly and to offer an effective evaluation method, we developed an Analytic Hierarchy Process (AHP) evaluation model for Multi Criteria Decision Making. In the AHP evaluation model for an early prototype, quantitative values of each evaluation factor were computed using hierarchical analysis [14]. As a result, an objective and quantitative evaluation model for user-centered haptic device design was developed. It will be useful to both drivers and developers.
2 Literature Review
Previous studies on haptic interfaces and device evaluation were reviewed to collect the evaluation factors of the AHP model. As human-machine interaction within vehicles is becoming more and more complex, an advanced interface is needed. Interaction types such as haptic interface, touch screen, and voice control have been developed to manipulate multi-functional systems, and these interaction types were compared in usability studies [2]. To study the relationship between practical applications and the haptic interface’s hardware/software, user perception and motor control in haptic mode were evaluated. E. Kirkpatrick (2002) researched the hardware/software requirements for haptic interfaces [6]. Steven Wall (2006) studied how to design haptic user interfaces and applications [15]. In that study, a heuristic guideline for improving usability was presented, and the guideline was acceptable to users. The relations among the haptic device, human perception, and the computer application were also analyzed. As a result, a haptic interface that improves interaction between the user and the computer was proposed. A study of interface attributes of in-vehicle navigation shows that controllability (such as speed and accuracy) and ease of use are highly related to user satisfaction in car navigation systems. So, Robert E. Llaneras (2003)
researched device design that considers users who have no experience manipulating complex devices [12]. E. Kirkpatrick (2001) researched how to predict usability performance in a haptic environment by measuring the time needed to perceive the shape of a physical object [7]. This study offered ways to improve user performance by using tactile feedback from the haptic interface. Mark Evans (2005) conducted a usability evaluation of tactile feedback devices aimed at improving interaction in a virtual environment [10]. In this study, the limitations of commercial haptic feedback were described, and a guideline on haptic feedback for developing new products was provided. P. Richard (1999) evaluated the controllability and accuracy of object control using haptic feedback in order to assess user performance of haptic interaction in a virtual environment [13]. The result was that task performance decreased when visual, auditory, and tactile feedback were all offered in a simple task. According to the study, multiple kinds of feedback are not necessary, and a single kind of feedback is more effective in a simple task. Also, Camilla Grane (2005) compared haptic information, graphic information, and integrated information (haptic and graphic) in a simple menu selection task, and Dario D. Salvucci (2005) measured the effectiveness of different input methods while driving [3][4].
3 Methodology
To develop the AHP evaluation model for evaluating the haptic interface, preceding studies on haptic interfaces were reviewed and evaluation indexes were extracted. After the review, 25 evaluation indexes were finally selected and classified into 7 evaluation groups by factor analysis. Then, using AHP, the weight of each index was generated and the hierarchical structure of the evaluation indexes was organized.

3.1 Selection of Evaluation Indexes
The literature review was used to generate the evaluation indexes used in the AHP evaluation model. Based on this review, and on studies about general device evaluation indexes or haptic interfaces, 50 usability evaluation indexes were extracted, including the haptic interface evaluation indexes of previous studies, as criteria for the usability and functionality of the haptic interface model. These 50 indexes were reviewed twice to make the criteria more objective and clear. First, an expert group interview was performed for the initial index selection. The criteria for selection, unification, and exclusion were decided in this step. Evaluation indexes with similar or duplicate meanings were resolved considering the research purpose and the characteristics of the object. If an evaluation factor was ambiguous, it was not used. Second, 25 indexes were finally chosen to evaluate the haptic interface model. The criteria were: scope of definition (inclusion of the index’s concept or scope of generality), hierarchical relation among concepts (one index’s concept is a subset of another’s), and correlation among concepts (causation or correlation between the concepts of indexes). Table 1 shows the 25 indexes and their descriptions.
Table 1. Evaluation indexes

Index: Evaluation index definition
Learnability: The way of manipulation should be easy for novices to learn.
Memorability: The manipulation way of device controller should be easy for users to remember once learned.
Ease: The device controller should be easy for the selection/execution/level control of the function.
Flexibility: The manipulation way of device controller should be designed to be connected flexibly for each menu/mode.
Efficiency: Device controller should be worked efficiently. (Functions have to be executed by the minimum number of the key operations.)
Effectiveness: Device controller should be worked effectively. (Functions have to be executed to minimize the workload of users’ hands and brain.)
Performance: The user’s capacity for tasks should be excellent.
Fast: The search menu and list and the performance of the task should be manipulated fast.
Accuracy: The input information like the selection/execution/control of the level should be delivered to the system exactly through the device controller.
Controllability: Device should provide users with controllability.
Feedback: The feedback of errors, including tactile, visual, and auditory, should be indicated and provided so clearly that users can recognize errors easily.
Prevention: The user’s input error or incorrect operations are prevented in advance.
Recoverability: The exit or cancel should be provided in order that users can escape from wrong input or unwanted menus.
Visibility: The current state of the device/system should be given visually.
Durability: Device controller should be designed robustly so that malfunction or damage is prevented.
Safety: Device controller should exclude unnatural manipulation so that the overload of the users is minimized.
Size: Device controller should be designed considering the common user’s hand size.
Familiarity: Device controller’s features, like the shape/form/surface/elasticity/weight/tactile, should be designed to increase the sense of a grip.
Arrangement: Device controller should be set up within hand’s radius for action so that users have no difficulty in operation. (The control of the device location should be provided.)
Attractiveness: The way to manipulate or form type of device controller should attract user’s interest.
Complexity: The complex manipulation for a selection/execution/navigation of the functions should be excluded.
Simplicity: Device controller should be simply manipulated for selection/execution/navigation of the functions (to minimize user’s workload).
Cognition: Device controller should be designed to anticipate how to manipulate and which function to perform.
Consistency: The one that has a similar way to manipulate the function should be designed to be manipulated similarly.
Discriminability: The function that has a different manipulation should be designed to use the different manipulation or controller.
3.2 Hierarchical Classification of Evaluation Indexes
For the AHP analysis, a hierarchical classification was conducted on the basis of the 25 indexes in Table 1. Classifying the indexes hierarchically and grouping related indexes enables a more effective and efficient evaluation of the haptic device model. The degree of relation among evaluation factors was assigned by 9 evaluators: 2 points for high relation, 1 point for middle relation and 0 points for low relation. Using factor analysis,
the evaluated points were used to organize and group the evaluation indexes. Table 2 shows the results of the factor analysis along with the degree of relation among evaluation factors. The 25 evaluation factors were classified into 7 groups in accordance with the result of the factor analysis, and terminology was defined to represent each group. The 7 groups were “interaction support,” “function support,” “user support,” “information support,” “device capacity,” “device appearance” and “device control.” These were reclassified into “time,” “manipulation,” and “device” for the haptic model. These grouped evaluation indexes formed the basis of the AHP evaluation model for evaluating the haptic interface. Table 3 describes each group.

Table 3. Grouping of evaluation indexes

Manipulation
Interaction Support. Definition: Evaluation index group related to interaction between user and device for controlling the device. Indexes: Learnability, Memorability, Ease.
Function Support. Definition: Evaluation index group related to device function. Indexes: Flexibility, Efficiency, Effectiveness.
User Support. Definition: Evaluation index group related to user’s usability to perform a given task. Indexes: Performance, Fast, Accuracy, Controllability.
Information Support. Definition: Evaluation index group related to feedback or information about the state of the device. Indexes: Feedback, Prevention, Recoverability, Visibility.

Device
Device Capacity. Definition: Evaluation index group related to duration and capacity in designing the hardware. Indexes: Durability, Safety.
Device Appearance. Definition: Evaluation index group related to the physical features like shape, size, and arrangement of the device controller. Indexes: Size, Familiarity, Arrangement, Attractiveness.
Device Control. Definition: Evaluation index group related to manipulation way for selection and operation of function using the device controller. Indexes: Complexity, Simplicity, Cognition, Consistency, Discriminability.
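The relation scoring described in Section 3.2 (2, 1 or 0 points per pair of indexes, assigned by several evaluators) can be illustrated with a small sketch. It is not the authors' procedure: the paper does not state how the evaluators' points were combined before the factor analysis, so averaging is an assumption here, and the function name `aggregate_relations`, the input layout and all scores below are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def aggregate_relations(indexes: List[str],
                        ratings: List[Dict[Pair, int]]) -> List[List[float]]:
    """Average each evaluator's 0/1/2 relation points into a symmetric matrix."""
    position = {name: i for i, name in enumerate(indexes)}
    n = len(indexes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 2.0  # assumption: an index is fully related to itself
    for sheet in ratings:
        for (first, second), points in sheet.items():
            if points not in (0, 1, 2) or first == second:
                raise ValueError("expected 0, 1 or 2 points for two distinct indexes")
            i, j = position[first], position[second]
            matrix[i][j] += points / len(ratings)
            matrix[j][i] += points / len(ratings)
    return matrix


# Hypothetical scores from three evaluators (the study used nine).
names = ["Learnability", "Memorability", "Durability"]
sheets = [
    {("Learnability", "Memorability"): 2, ("Learnability", "Durability"): 0,
     ("Memorability", "Durability"): 0},
    {("Learnability", "Memorability"): 2, ("Learnability", "Durability"): 0,
     ("Memorability", "Durability"): 1},
    {("Learnability", "Memorability"): 1, ("Learnability", "Durability"): 0,
     ("Memorability", "Durability"): 0},
]
relation = aggregate_relations(names, sheets)
```

A matrix of this kind is the sort of input a factor analysis could then use to group strongly related indexes, as the text describes.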
3.3 Analytic Hierarchy Process Evaluation Model
The hierarchical structure generated from the factor analysis of the evaluation indexes defines the structural levels of the AHP evaluation model. To calculate the degree of significance, a pairwise (cross) comparison of the indexes at the same level was conducted, and the relative comparison values were collected from the checklists of 10 evaluators. From these values, we computed the eigenvalue of the evaluation indexes, which was used to derive the relative significance of the criterion indexes. Expert Choice, an AHP tool, was used to generate the absolute significance of each index from the significance of its upper criterion.
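The weighting step in Section 3.3 can be sketched as follows. The study used the Expert Choice tool, so this is only an illustration of the general AHP eigenvector technique, not the authors' computation. The sketch approximates the principal eigenvector of a pairwise comparison matrix by power iteration and adds the consistency ratio commonly used in AHP, which the paper does not mention. The function names and the judgment values are hypothetical; the random index values are the standard ones from Saaty.

```python
from typing import List

Matrix = List[List[float]]

# Saaty's random consistency index for small matrices (standard AHP values).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def priority_vector(matrix: Matrix, iterations: int = 50) -> List[float]:
    """Approximate the principal eigenvector by power iteration, normalised to sum 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    return weights


def consistency_ratio(matrix: Matrix, weights: List[float]) -> float:
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments for three indexes A, B, C of one group:
# A is 2 times as important as B and 4 times as important as C; B is 2 times C.
judgments = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
local_weights = priority_vector(judgments)
ratio = consistency_ratio(judgments, local_weights)
```

For these perfectly consistent hypothetical judgments the weights come out as 4/7, 2/7 and 1/7 and the consistency ratio is zero; a ratio below about 0.1 is conventionally treated as acceptable in AHP practice.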
Fig. 1. Hierarchical structure of evaluation indexes (Analytic Hierarchy Process evaluation model)
4 Results
The weights of the evaluation factors of the haptic interface were generated in the AHP model. The evaluation factors’ weights were divided into local and global results. Local result refers to the importance of the evaluation factors in each group, and global result refers to the importance of the evaluation factors for the whole model.

Table 4. Local results
4.1 Local Results
Through analysis using the AHP model, the values of the local and global results were generated. The local results refer to the relative importance within the evaluation groups and among their evaluation factors. Table 4 shows the comparative importance of the evaluation groups and evaluation factors. In the haptic interface evaluation model, the “Manipulation” group is more important than the “Device” group. Within the Manipulation group, “Interaction Support” was most important, and within the Device group, “Device Control” was most important. This shows that the manipulation method and the information offered to the driver are more important than the device appearance.

4.2 Global Results
The global results refer to the importance of the evaluation factors with respect to the whole haptic interface model. Table 5 shows the values of each evaluation group and factor with respect to the haptic interface model. In the haptic interface evaluation model, “Interaction Support,” “Information Support” and “Function Support” were most important among the 7 evaluation groups, and “Memorability,” “Ease,” “Durability,” and “Learnability” were most important among the 25 evaluation factors.

Table 5. Global results
As a result, developing a simple interaction method for the haptic device that is easy to learn and use, and supporting feedback and information about the system status, will make the haptic device effective and efficient. Consequently, considering the core factors of the haptic device will improve its usability.
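The relation between the local and global results of Section 4 can be sketched as follows: a factor's global weight is its local weight multiplied by the weight of the criterion above it. This is a minimal sketch of that standard AHP composition, not the Expert Choice computation. The function name `global_weights`, the group and factor names, and all numbers are hypothetical and are not the values of Tables 4 and 5.

```python
from typing import Dict


def global_weights(group_weights: Dict[str, float],
                   local_weights: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Multiply each factor's local weight by the weight of its parent group."""
    result: Dict[str, float] = {}
    for group, factors in local_weights.items():
        for factor, weight in factors.items():
            result[factor] = group_weights[group] * weight
    return result


# Hypothetical two-level hierarchy; weights on each level sum to 1.
groups = {"Group A": 0.6, "Group B": 0.4}
locals_by_group = {
    "Group A": {"a1": 0.5, "a2": 0.5},
    "Group B": {"b1": 0.25, "b2": 0.75},
}
overall = global_weights(groups, locals_by_group)
```

For a hierarchy with more levels, such as category, group and index in this model, the same multiplication would be applied once per level, and the resulting global weights again sum to 1.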
5 Conclusion In this study, we offered a priority-based, quantitative evaluation generated from statistical analysis using qualitative evaluation from experts in early design steps of the haptic interface model. In conclusion, a developer can design a user-centered haptic device with important considerations, and a heuristic evaluation for a haptic interface’s prototype is possible using the AHP model. This will have a great impact on the advancement of haptic interface design and improvement.
References
1. Marcus, A.: The next revolution: vehicle user interfaces. Interactions 11(1) (2004)
2. Rydström, A., Bengtsson, P., Grane, C., Broström, R., Agardh, J., Nilsson, J.: Multifunctional Systems in Vehicles: A Usability Evaluation. In: Proceedings of CybErg 2005, The Fourth International Cyberspace Conference on Ergonomics, Johannesburg, International Ergonomics Association Press (2005)
3. Grane, C., Bengtsson, P.: Menu Selection with a Rotary Device Founded on Haptic and/or Graphic Information. In: Proceedings of the First Joint Eurohaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, IEEE (2005)
4. Salvucci, D.D., Zuber, M., Beregovaia, E., Markley, D.: Rapid Prototyping and Evaluation of In-Vehicle Interfaces. In: CHI 2005, Portland, Oregon, USA (April 2-7, 2005)
5. Electronics Information Center: Market trend of Telematic. Knowledge Research Group report (November 2003)
6. Kirkpatrick, E., Douglas, S.A.: Application-based Evaluation of Haptic Interfaces. In: Proceedings of the 10th Symp. On Haptic Interfaces For Virtual Envir. & Teleoperator Systs. IEEE, New York (2002)
7. Kirkpatrick, E., Douglas, S.A.: A Shape Recognition Benchmark for Evaluating Usability of a Haptic Environment. In: Brewster, S., Murray-Smith, R. (eds.) Haptic HCI 2000. LNCS, vol. 2058, pp. 151–156. Springer, Heidelberg (2001)
8. Gartner Inc.: Automotive Telematics Overview and Forecast (2002)
9. Isaksson, J., Nordquist, J.: Evaluation of haptic interfaces for in-vehicle systems. IEA (2003)
10. Evans, M., Wallace, D., Cheshire, D., Sener, B.: An Evaluation of Haptic Feedback Modelling during Industrial Design Practice. Design Studies 26(5), 487–508 (2005)
11. Payette, J., Hayward, V., Ramstein, V., Bergeron, D.: Evaluation of a Force Feedback (Haptic) Computer Pointing Device in Zero Gravity. In: Proceedings ASME Dynamics System and Control Division, DSC-vol. 58 (1996)
12. Llaneras, R.E., Singer, J.P.: In-Vehicle Navigation Systems. In: 2nd International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design (2003)
13. Richard, P.: Dextrous Haptic Interaction in Virtual Environments: Human Performance Evaluations. In: Proceeding of the IEEE, International workshop on robot and human interaction, Pisa, Italy (September 1999)
14. Saaty: The analytic hierarchy process: planning, priority setting, resources allocation. McGraw-Hill (1980)
15. Wall, S., Brewster, S.: Design of haptic user-interfaces and applications. Virtual Reality 9, 95–96 (2006)
How to Make Tailored User Interface Guideline for Software Designers Ilari Jounila University of Oulu P.O. Box 3000, 90014 University of Oulu, Finland [email protected]
Abstract. A large number of user interface guidelines and patterns have been developed by different researchers. These patterns and guidelines are, however, either too generic or too specific to use. In addition, the multitude of guides makes it difficult to find and use them effectively. Because of these problems, the existing guides are not useful enough, e.g. for software designers. This paper describes the experiences and findings of a case study project. As a result of an iterative development process, a tailored user interface guideline is presented. A further result was that the guideline was well received by the software designers. Keywords: User interface guidelines, software designers.
There have been several tools to make usability guides more accessible through different systems, for example a tool that supports the management of multiple guidelines via the Internet or locally [8]. The approach of this paper is a case study in a project whose purpose was to create a tailored and useful user interface guideline for software designers.
2 Related Work
The case project, Kätsy, consisted of our usability research group and a company whose business is software development. The company contacted our usability researchers because of its growing interest in usability issues. In the first meeting, the company listed its interests in usability, such as usability knowledge in general and existing methods. It also had expectations regarding two issues: a usability evaluation of its existing web-based Content Management System (CMS), and the short-term and long-term advantages of usability for the company. The company had developed the system without knowledge of usability; e.g. its product development process had not included any participation of end users. The CMS consists of several modules such as content production, content editing, and content updating. A new version of the CMS was under development, and the company was therefore interested in how to improve the usability of the CMS before releasing it. Improving the usability of the CMS consisted of several parts. First of all, the company trained our usability research group for a few hours in using the system. We also organised a workshop together with the company, in which the user groups of the system and their typical tasks were identified using a persona method. Based on a selected persona, usability test tasks were designed by our usability research group and the company’s software designers. After the workshop, the company constructed a dedicated test environment for the usability tests. Defining the current state of the system’s usability was started with an expert evaluation, which was carried out while the usability research group familiarised itself with the test environment. In addition, four usability test sessions, each with one participant, were conducted to gather user feedback on the system.
Three participants were end users of the tested product and one had earlier used a similar product. The test facilitator and the observers were all from our research group. The test participants were briefed on the test session and filled out a user profile questionnaire. After that, they had about five minutes to familiarise themselves with the system before beginning the session. The think-aloud method was used during all the sessions. Each of the four test sessions took about one hour and was recorded. Finally, a post-test questionnaire was filled out by each participant. The findings from all the test sessions were analyzed and combined into a test report. The test report consisted mostly of the usability problems of the system, but good design solutions were also included. The test report and a run-through of the test video were presented to the company in a test report meeting. The company was very satisfied with the results and the solutions provided. However, one thing came up in the meeting: the company needed some
concrete guides and rules for the aid of user interface development. At the end of the project, our usability research group produced two deliverables: a content analysis document based on the expert evaluation and the findings of the usability test sessions as well as the user interface guideline document based on a literature research, the expert evaluation, and the findings of the usability test sessions.
3 Development Process of the Guideline
The findings from the usability tests and the expert evaluation, together with the needs of the software development company, were the starting point for developing the user interface guideline. Generic design guidelines and design patterns were used as the basis for illustrating bad and good design solutions of the system. In this study, generic guidelines mean guidelines provided e.g. in textbooks such as GUI Bloopers [5], but also the ISO 9241 standard, parts 10 to 17 [4], and the Research-based Web Design and Usability Guidelines [6]. In addition, design patterns such as Patterns for Effective Interaction Design by Tidwell [11] and Interaction Design Patterns by van Welie [12] were included in this study. The guideline was written in Finnish because the company is a small national organisation. The guideline was developed iteratively based on empirical findings. The first version of the guideline was implemented as text and picture examples taken from the generic design guidelines and design patterns. However, the use of this kind of general example involved some problems. Overly general text descriptions and example pictures were not easy to understand and use. The company pointed out that it preferred fewer general pictures from the literature and more picture examples of its own system. After three iteration rounds and three meeting discussions (which served as evaluation sessions), the guideline was accepted by the company. Depending on the occasion, 4 to 5 researchers and two persons from the company, a software designer and a development manager, participated in all the meeting sessions. In these sessions, all of the participants followed the content of the guideline on a wide screen. User feedback was gathered through observations made by the researchers and through discussions with the company. The final version of the guideline provided 28 different individual guides.
Each of them included a text-based description of the problem or good design solution, sample picture(s) taken only from the company’s own system, and either a recommendation for a better design solution or a mention of an already good design solution in the existing system.

3.1 Iteration 1: Preliminary Version
The first iteration started with a literature survey of existing guidelines, principles and patterns. Because a large number of different guideline collections were found, some exclusions were made. It was decided to include only well-known guideline and pattern collections in this guideline (e.g. GUI Bloopers [5], ISO 9241 standard [4], Research-based Web Design and Usability Guidelines [7], Patterns for Effective Interaction Design by Tidwell [11], and Interaction Design Patterns by van Welie [12]). A further purpose of this iteration was to find only those guidelines that were not followed by the CMS.
The first version of the guideline was produced as text and picture examples from the original generic design guidelines and design patterns. This preliminary version consisted of only four guides, because user (company) feedback was needed before investing too much time-consuming work. Figure 1 shows an example of an individual guideline. The guideline was structured as a title, a description of how the design should be done, and finally an example picture. The title (number 1 in Fig. 1) was translated into Finnish, but the English equivalent was also included. The textual description (number 2 in Fig. 1) was translated from the original source into Finnish only. The example picture was included in its original form. This example was taken from the Research-based Web Design and Usability Guidelines [6].
Fig. 1. An example of the first version of the guideline. (1) Title of the individual guideline, (2) longer description of the followed guide (translated from English into Finnish from the original source), (3) an example picture from the original source. Numbers (1-3) were added to this picture to clarify the structure of the example.
The first version of the guideline was presented in a meeting with the company. The general examples used in this version caused some problems. Overly general guidelines with general example pictures were not easy to understand and use. The company pointed out that it preferred fewer general pictures from the literature and more picture examples of its own system. Further feedback was that the guideline should be concrete and form a logical whole. More individual guides should also be included in the next version.
3.2 Iteration 2: Restructured Version
After the feedback on the preliminary version, we considered how to organise the content of the guideline logically. The second version of the guideline was implemented in the same way as the first version, but using picture examples of the company’s own system. The structure of an individual guideline was also made clearer and more logical, with numbered guides. Each individual guideline followed the same structure: (1) a title of the guideline in numbered order, (2) a description of the followed guideline, (3) a description of the problem found, (4) a description of the proposed solution, (5) an example picture. In this version, the number of individual guidelines increased. The restructured version consisted of eight guides and four proposed guides with a title only. The guideline was still at proposal level in this iteration, because the company’s feedback was needed before continuing its development. Figure 2 shows an example of an individual guideline from the second iteration, based on the ISO 9241-12 standard [4].
Fig. 2. An example of the second version of the individual guideline: Initial position for entry fields. (1) The numbered title, (2) description of the followed guide, (3) problem found in the company’s system, (4) proposed solution, (5) an example picture from the company’s system. Numbers (1-5) were added to this picture to clarify the structure of the example.
The second version of the guideline was presented in a meeting with the company. The company found this style of implementation quite clear and illustrative. However, the company proposed increasing the number of individual guidelines, e.g. for pop-up menus, wizards, and error messages. After this iteration, it was also concluded that the structure of the individual guideline should be specified more precisely.
3.3 Iteration 3: Superfine Version
The primary focus of the third iteration was to add examples of different user interface elements. Instead of only bad solutions, examples of well-implemented solutions of the system were also included in the guideline. The structure of the individual guideline was revised as well. This version of the guideline consists of an abstract, an introduction, a table of contents and several chapters of guidelines. The guideline included 28 individual guides on 38 pages (in MS Word). The structure of an individual guideline was again made clearer and simpler. Each individual guideline followed the same structure: the title, an example from the system with a problem description and picture, and a guideline/solution for the problem. Figure 3 shows an example of an individual guideline from the third iteration, based on the ISO 9241-13 standard [4].
Fig. 3. An example of the final version of the individual guideline: “Error prevention and error messages”. (1) The numbered title, (2) an example of the company’s system including the problem and picture, (3) guideline/solution for the problem. Numbers (1-3) were added to this picture to clarify the structure of the example.
This version of the guideline was very well received and appreciated by the company. The representatives of the company commented in the meeting (translated into English): “[the document is] concrete guideline”, “[this guideline is] superfine because of using examples of our own system”, “we will also go through [the guidelines] with our business partner” and “the guidelines will ensure of basic quality of usability to our company”.
The third-iteration version of the guideline was accepted by the company. Only a few misspellings had to be corrected for the final deliverable version.
4 The Tailored Guideline
The final version of the guideline is a concrete presentation for a specific user interface design in a specific company. In this project, the guideline was delivered as a Word document by email and as a paper printout of the same document. The form of the deliverable naturally depends on the users’ needs. The tailored guideline supports the development of user interface design in a small software development company.

4.1 Structuring the Model of Guideline
The proposed structure of a guideline consists of the title with an identifying number; an example from the user’s own system, with a description of the problem or well-designed solution and a picture; and a short description of the general guideline together with a solution to the specific problem (the solution is not needed if the example is well designed). Figure 4 shows the simple model of the structure.
Fig. 4. A proposed simple model of an individual guideline
4.2 Criteria for Making the Tailored Guideline
Proposed criteria for making the tailored guideline:
1. The mode of the generic guidelines has to be changed to bring it close to the user (examples have to come from the users’ own system, including a description of the problem with a picture and a guideline/solution for correcting it)
2. Guidelines have to be concrete
3. Guidelines have to be extensive enough but not too long
I. Jounila
4. Guidelines have to include bad solutions from the system as well as well-designed solutions
5. The deliverable form has to be decided case by case
6. Iteration is needed when developing tailored user interface guidelines
5 Discussion
One of the ideas of the case project was to provide long-term usability knowledge for the case company. Because of this, the tailored user interface design guideline was developed for a case company, although Mosier and Smith suggested that it is not appropriate to make specific guidelines [10]. However, this study suggests that specific guidelines are needed, at least for a small-sized company. Because of the large number of general guidelines, finding the right guidelines for specific needs causes problems for software designers. Also, the general guidelines are often too general to be used in a specific context.
The tailored individual guideline was built with a simplified structure and is quite short. The simplified structure was also supported by the expectations of the company. It is important to use examples from the company’s own system to describe the problems. Another important finding is that the picture examples in generic guidelines confused developers. This is why sample pictures should be taken only from the developers’ own system. It was found that the form of the deliverable is case-specific. The deliverable could be a paper document, a Word document, a web page, etc. Thus, specific tools are not needed.
This research found that developing a tailored user interface guideline is time-consuming, because the development is based on the findings of the product education given by the company, a requirement specification workshop, an expert evaluation, and four usability tests. Also, the large number of existing general guidelines made it challenging to find appropriate general guidelines for this work. However, the discussions with the company between iteration rounds were useful for deciding what to include in the guideline.
The proposed criteria are useful when developing a guideline for a company without knowledge of usability issues, but they could perhaps be too restrictive for a company with usability knowledge (e.g. the guideline should be substantially more extensive than the one developed in this work). The proposed simple model of an individual guideline seemed to work in this project. However, it needs more study in further research. Also, the initial criteria should be defined more specifically. In the future, it would be interesting to see the usefulness of the developed guideline in the case project after six to twelve months.
This study was one approach to teaching guidelines to software developers by making a tailored guideline. Thus, other future work will include studies on how to teach existing guidelines and patterns to software designers and also to other groups such as students.
6 Conclusions This research concluded that adoption of tailored user interface guideline is more appropriate for software developers than generic guideline collections due to
understandability and expression. The most important thing is to include examples from the developers’ own system in the guideline.
Acknowledgments. I would like to thank the Kätsy project for providing a research environment. I also thank Dr. Timo Jokela, Kari-Pekka Aikio, Niina Kantola, and Mauri Myllyaho for feedback and comments on the developed guideline. In addition, this work would not have been possible without the software designers at the software development organization.
References
1. Apple Computer Inc.: Apple Human Interface Guidelines (2006) Last accessed 2007-02-15, http://developer.apple.com/documentation/UserExperience/Conceptual/OSXHIGuidelines/index.html
2. Deng, J., Kemp, E., Todd, E.G.: Managing UI pattern collections. In: Proceedings of the 6th ACM SIGCHI New Zealand chapter’s international conference on Computer-human interaction: making CHI natural, July 07-08, 2005, Auckland, New Zealand, pp. 31–38 (2005)
3. Henninger, S., Haynes, K., Reith, M.W.: A Framework for Developing Experience-Based Usability Guidelines. In: Proceedings of DIS ’95, pp. 43–53. ACM Press, New York (1995)
4. International Standards Organization: ISO 9241: Ergonomic requirements for office work with visual display terminals. Geneva, Switzerland (1999)
5. Johnson, J.: GUI Bloopers: Don’ts and Do’s for Software Developers and Web Designers. Morgan Kaufmann, San Francisco (2000)
6. Koyani, S.J., Bailey, R.W., Nall, J.R.: Research-based Web Design and Usability Guidelines, Dept. of Health & Human Services, National Institutes of Health Publication 03-5424, National Cancer Institute, Washington, DC (2006) Last accessed 2007-02-13, http://www.usability.gov/pdfs/guidelines.html
7. Mariage, C., Vanderdonckt, J.: Creating Contextualised Usability Guides for Web Sites Design and Evaluation. In: Proceedings of 5th Int. Conf. on Computer-Aided Design of User Interfaces CADUI’2004 (Funchal, 12-16 January 2004), Kluwer Academics, Dordrecht (2004)
8. Mariage, C., Vanderdonckt, J., Pribeanu, C.: State of the Art of Web Usability Guidelines. In: Proctor, R.W., Vu, K.-P.L. (eds.) The Handbook of Human Factors in Web Design (Chapter 41), Lawrence Erlbaum Associates, Mahwah (2004)
9. Microsoft Corporation: The Windows interface guidelines for software designers. Microsoft Press, Redmond, WA (1995)
10. Mosier, J.N., Smith, S.L.: Application of Guidelines for Designing User Interface Software. Behaviour and Information Technology 5(1), 39–46 (January-February 1986)
11. Tidwell, J.: Designing Interfaces – Patterns for Effective Interaction Design (2006) Last accessed 2007-02-13, http://designinginterfaces.com/
12. Welie, M.v.: Patterns in Interaction Design (2001) Last accessed 2007-02-13, http://www.welie.com/index.html
Determining High Level Quantitative Usability Requirements: A Case Study Niina Kantola and Timo Jokela P.O. Box 3000 90014 Oulu University, Finland {niina.kantola, timo.jokela}@oulu.fi
Abstract. High-level quantitative usability requirements were determined for a public health care system. The requirements determination process was iterative, and the requirements were refined step-by-step. The usability requirements are categorized first through the main user groups, then by the services, and finally by specific usability factors.
Keywords: Usability requirements, health care systems, quantitative requirements.
1 Introduction
It is generally agreed to be good project management practice to define quantitative requirements for system quality characteristics. Quantitative, measurable quality requirements provide a clear direction of work and acceptance criteria for a development project. In practice, usability requirements are quite seldom among the quantitative requirements in development projects. One of the consequences of not defining usability requirements is that other objectives dominate and usability is considered only as a secondary objective of a project. The obvious consequence is a product with usability problems.
Our case study is a system development project where the city of Oulu is the purchaser of a system, and a consortium of two software development companies will develop the system. The system to be developed is a healthcare system. The goal is that it will be extensively used by the citizens. The healthcare professionals (doctors, nurses, etc.) would naturally also be users of the system. To make usability a true issue in the project, it was decided at the beginning of the project that measurable high-level usability requirements should be determined. In this paper, we present how we approached usability requirements determination in order to define the requirements at a high level of abstraction but still in a measurable way, and what the results were.
are we measuring? How many do we measure? How do we present the measures? What measures do we take? These kinds of topics were discussed, for example, in the special issue on measuring usability of Interactions magazine (Interactions Nov + Dec 2006). According to [5], discussions have recently recurred on which measures of usability are suitable and on how to understand the relation between different measures of usability. The literature recognizes several usability attributes that can be measured. ISO 9241-11 [6] defines usability as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”. In the early phases of development, measures of effectiveness, efficiency and satisfaction should be selected, and acceptance criteria based on these measures established. These attributes are generally measured on different scales such as task completion rates, average time to task completion and average task satisfaction scores. Acceptance criteria may include separate definitions of the target level and the minimum acceptable level [12]. It is also possible to use different scales, for example worst, planned, best and current levels [15]. Other measurable usability attributes may include learnability, memorability, error, affect, helpfulness and control [11, 9]. There are also attempts to standardise traditional usability metrics on a uniform scale (e.g. [14]). Several measurable usability attributes exist, but there are no clear guidelines on how the determination of measurable, quantitative requirements should be organized and managed. Jokela [7] points out that the existing literature mainly focuses on describing and exploring the concepts and formats related to the definition of usability and the contents of a usability requirements document. Some guidelines are presented, for example, by Wixon and Wilson [16], Nielsen [11] and Mayhew [10].
Further, there exist only very few empirical research reports on quantitative usability requirements methods in practice. One of the few reports is by Bevan et al. [3], who conducted case studies on quantitative usability evaluations following the Common Industry Format for usability testing, CIF [1]. However, the methodological aspects are not discussed in detail in their report. Jokela et al. [8] describe a case study where quantitative requirements played a key role in the development of a new user interface of a mobile phone. Because of the limitations of existing methods, they developed tailored methods for determining and evaluating quantitative usability requirements. To support the process of defining usability requirements and usability criteria, the Working Group sponsored by the National Institute of Standards and Technology (NIST) has recently developed a Common Industry Specification for Usability–Requirements (CISU-R). It is still in draft form, but it aims to define the content of usability requirements rather than requiring a specific process by which they are gathered [13]. CISU-R has three parts: the context of use, usability measures and the test method. Scenarios of use play an important role in the process. In the first part, scenarios are used to specify how users carry out their tasks in a specified context. In the second part, usability measures are provided for defined scenarios of use. The tasks (scenarios) selected should be those that are the most frequent or the most critical to the business or user [13]. Determining whether the quantitative requirements have been achieved can be done through a usability test. Also user preference questionnaires provide a subjective
metric for the related usability attribute. Usability can also be quantitatively evaluated with theory-based approaches such as GOMS and the keystroke level model, KLM [4]. Hornbæk [5] has reviewed current practice in how usability is measured. He has also analysed problems with the measures of usability employed. According to his analysis, the problems include the following: measures of the quality of interaction are used in only a few studies; measures of learning and retention of how to use an interface are rarely employed; measures of users’ satisfaction with interfaces are in disarray, and readily available validated questionnaires are ignored. Based on his review, he proposes several challenges with respect to measuring usability [5].
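The measures discussed in this section (task completion rate, average time to task completion, average satisfaction score) and the idea of a target level and a minimum acceptable level can be illustrated with a small sketch. The names `Session`, `summarize` and `judge`, and all numbers in the usage note, are hypothetical and chosen only for illustration; they are not taken from the case study or from any of the cited formats.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One participant's attempt at one task (hypothetical record layout)."""
    completed: bool      # effectiveness: was the task completed?
    seconds: float       # efficiency: time spent on the task
    satisfaction: int    # satisfaction: post-task rating, e.g. on a 1..7 scale


def summarize(sessions):
    """Return (completion rate, mean time of completed tasks, mean satisfaction)."""
    done = [s for s in sessions if s.completed]
    rate = len(done) / len(sessions)
    # Mean time is taken over completed tasks only; infinite if none completed.
    mean_time = mean(s.seconds for s in done) if done else float("inf")
    mean_sat = mean(s.satisfaction for s in sessions)
    return rate, mean_time, mean_sat


def judge(value, minimum, target, higher_is_better=True):
    """Classify a measured value against a minimum acceptable and a target level."""
    if not higher_is_better:
        # For measures such as time, smaller is better: flip the signs.
        value, minimum, target = -value, -minimum, -target
    if value >= target:
        return "target met"
    if value >= minimum:
        return "acceptable"
    return "below minimum"
```

For example, with four hypothetical sessions `[Session(True, 60, 6), Session(True, 90, 5), Session(False, 120, 2), Session(True, 75, 6)]`, `summarize` gives a completion rate of 0.75, and `judge(0.75, minimum=0.7, target=0.9)` classifies it as acceptable but short of the target.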
3 Flow of the Case Study
The case study is a health care system, the aim of which is to provide web-based health services to the citizens of the city. Because use of the services will be voluntary for the citizens, usability is a key success criterion for the system. Therefore, it is essential to explicitly define the usability requirements for the system. A typical way of defining usability requirements is by user task performance. Because there are typically many kinds of users and many user tasks, the number of requirements easily becomes quite large. The requirements should not be too numerous, but they should still depict the required usability well. In practice, the requirements were determined in a qualitative and iterative way. The steps were:
1. The available documentation of the project was examined.
2. The key persons behind the project idea were interviewed; these persons were in various managerial positions in the city and at the development companies. The goal of the interviews was to get an understanding of the planned use of the system, and the critical success factors.
3. Thereafter, an interpretation of the interviews was made, and the first proposal of usability requirements was produced. The requirements included (1) three main user categories and (2) two to three usability requirements for each category.
4. This first set of requirements was presented to a small working group of the project steering group. The discussion revealed that some updates were needed. For example, the appropriate number of user categories should be four (not three).
5. Based on the feedback from the working group, the requirements were revised.
6. The revised set of requirements was then presented to a larger steering group of the project, with a larger number of participants. Another problem in the requirements was noticed in the meeting. An additional requirement was added to the requirements “on-line”, and thereafter the requirements were approved.
In summary, the research method was an iterative and constructive process. An artifact (the requirements) was constructed, based on usability experience and on the data that had been gathered through documentation and interviews. The artifact was evaluated and refined two times before being accepted.
4 Result: The Usability Requirements
As a result, we have defined a set of usability requirements. The requirements identify the main user categories, and a set of quantitative usability requirements (1 to 4 requirements) is defined for each category. The main user categories are:
• “Customers”, i.e. the citizens of the city
• “Professionals”, i.e. the healthcare personnel of hospitals and health centers
In the following, we discuss the usability requirements for each of the two main user categories separately.
The “Customer” Category
In the “Customer” category, the following main services of the health care system were identified:
• Proactive healthcare: information about proactive healthcare issues, such as weight control, nutrition, and physical training
• Occasional healthcare problems: information for self-assisted care (how to act in occasional healthcare problems such as occasional fever, flu, small accidents, etc.)
• Chronic diseases: support for self-assisted care for diseases such as diabetes, asthma and arterial hypertension
Four main user groups within the “Customer” category were identified:
• Parents of small children
• Young (teenagers, students)
• Adults
• Seniors
The relationship between the services and the user groups is illustrated in Table 1. One can see that the single most important service/user group segment is the “chronic diseases” service used by “seniors”. On the other hand, chronic diseases, for example, are quite rare in the younger user groups.
Table 1. Services and the users of the “Customer” category of the healthcare system. The higher the number, the more important the user group.
| User category | Proactive healthcare | Occasional healthcare problems | Chronic diseases |
|---|---|---|---|
| Parents of small children | 1 | 2 | 1 |
| Teenagers, students | 1 | 2 | 1 |
| Adults | 2 | 2 | 2 |
| Seniors | 2 | 1 | 3 |
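As a minimal sketch, the importance scores of Table 1 can be held in a small mapping and queried for the most important service/user group segment. The scores below are as read from Table 1 (row by row); the names `SERVICES`, `IMPORTANCE` and `top_segment` are hypothetical and used only for illustration.

```python
# Importance scores as read from Table 1 (higher = more important user group).
SERVICES = (
    "Proactive healthcare",
    "Occasional healthcare problems",
    "Chronic diseases",
)
IMPORTANCE = {
    "Parents of small children": (1, 2, 1),
    "Teenagers, students": (1, 2, 1),
    "Adults": (2, 2, 2),
    "Seniors": (2, 1, 3),
}


def top_segment(importance, services):
    """Return (score, user group, service) for the highest-scoring segment."""
    segments = (
        (score, group, service)
        for group, scores in importance.items()
        for service, score in zip(services, scores)
    )
    # Compare on the score only; the first maximum encountered is returned.
    return max(segments, key=lambda segment: segment[0])
```

Calling `top_segment(IMPORTANCE, SERVICES)` returns the “Chronic diseases” service for “Seniors”, which matches the observation made in the text about Table 1.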
The representative user groups for each service were chosen to be the ‘demanding’ ones (i.e. if these user groups can use the service, then one can assume that the other user groups can use it, too)¹:
• Proactive healthcare: adults (including parents of small children)
• Occasional healthcare problems: adults (including parents of small children)
• Chronic diseases: seniors
The usability requirements for the services “Occasional healthcare problems” and “Chronic diseases” are shown in Table 2 and Table 3. One can see, for example, the importance of positive first-time usage in the “Chronic diseases” service: 9 users out of 10 should be able to use the system and have a positive first experience.
Table 2 (Occasional healthcare problems)
| Description | Goal | Measuring means |
|---|---|---|
| The user finds instructions for those typical sicknesses and accidents that one can care for by him- or herself (a separate specification of those sicknesses and accidents exists) | 50% of users find instructions without contacting the health care personnel | Usability tests; post-release follow-up |

Table 3 (Chronic diseases)
| Description | Goal | Measuring means |
|---|---|---|
| The user needs to experience the system as useful and easy to use | 9 users out of 10 can perform the routine tasks related to his/her sickness correctly, and find the experience positive | Usability test |
| The users should continue to use the system on a daily basis | 9 users out of 10 regularly use the system | Follow-up studies |
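Goals of the form “9 users out of 10” or “50% of users” can be checked mechanically once test results are available. The following is a minimal sketch, assuming that the goal is read as a simple proportion of test participants; the function name `meets_goal` and the example counts are hypothetical, and the sketch says nothing about statistical confidence with small samples.

```python
def meets_goal(successes, participants, required=9, out_of=10):
    """Return True if successes/participants reaches the goal required/out_of.

    Integer cross-multiplication is used to avoid floating-point rounding.
    """
    if participants <= 0:
        raise ValueError("participants must be positive")
    if not 0 <= successes <= participants:
        raise ValueError("successes must be between 0 and participants")
    return successes * out_of >= required * participants
```

For example, 18 successful participants out of 20 satisfy a “9 out of 10” goal, whereas 17 out of 20 do not; a “50% of users” goal corresponds to `required=1, out_of=2`.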
The service “Proactive health care” was identified as a separate service only later, and the goals and measuring means have not been determined yet (Table 4).

¹ It is assumed, however, that the users have used the internet (web).
Table 4. Usability requirements: Proactive health care
| Criterion | Description | Goal | Measuring means |
|---|---|---|---|
| Easy-to-find programs | The user finds instructions to proactive health care programs, as appropriate to him/her | | |
| First time usage | Taking the system must be very easy | | |
| Every day use | The users should continue to use the system on daily basis | | |
The “Professionals” Category
Several user groups were identified within this category:
• Doctors
• Nurses
• Public health nurses
• Other personnel
The usability requirements for these different user groups, however, were consolidated into one table, Table 5. At this stage, it was not found necessary to define the requirements separately for different services either.
Table 5. Usability requirements: Professionals
| Criterion | Description | Goal | Measuring means |
|---|---|---|---|
| Learnability | Can be learnt without training | 9 experienced professionals out of 10 learn how to correctly carry out the routine tasks without training | Usability test |
| Efficiency | The users need to be able to quickly carry out time-critical tasks | Users can carry out time-critical tasks (which need to be identified) within the pre-defined time limits | Usability test |
| Subjective satisfaction | Pleasant to use regularly | 9 users out of 10 rate the system 1 point (scale 1…7) more pleasant to use than a reference system (= a system widely used in hospitals and health care centers) | Satisfaction measurement questionnaire |
5 Conclusions
In this study, a natural way of determining the requirements was first through user categories. This is probably not very surprising: “who are your users” is the key question when designing usability. In all, the quantitative usability requirements determined in this study fall into the following hierarchical categorization:
• First by the main user groups (“Customers”, “Professionals”)
• Then by the services (“Occasional health problems”, “Chronic diseases”, etc.)
• Finally by specific usability factors (“Learnability”, “Subjective satisfaction”, etc.)
We find that the usability requirements determined in this study have some new features:
• The overall idea of determining ‘high level’ usability requirements. The requirements outlined in Section 4 are defined at a quantitative but abstract level, at the level of services. User-task-based usability requirements could be determined without detailed user task analysis. The ‘routine tasks’ are not determined at this stage, and need to be determined later.
• This kind of hierarchical categorization is, to our knowledge, quite new. Typically, usability requirements are “just” a set of individual requirements [2].
• The types of appropriate usability requirements are quite different between the different services and user groups.
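The hierarchical categorization described above (main user group, then service, then usability factor) maps naturally onto a nested data structure. The sketch below is one possible representation and is not the format used in the project; the class names, the `flatten` helper and the “All services” label for the Professionals category are assumptions introduced for illustration. The factor names and goals are abbreviated from Tables 4 and 5.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Requirement:
    factor: str                        # usability factor, e.g. "Learnability"
    goal: Optional[str] = None         # quantitative goal; None if not yet determined
    measured_by: Optional[str] = None  # measuring means; None if not yet determined


@dataclass
class Service:
    name: str
    requirements: List[Requirement] = field(default_factory=list)


@dataclass
class UserGroup:
    name: str
    services: List[Service] = field(default_factory=list)


def flatten(groups):
    """Yield (user group, service, factor) paths, walking the hierarchy top-down."""
    for group in groups:
        for service in group.services:
            for req in service.requirements:
                yield (group.name, service.name, req.factor)


requirements = [
    UserGroup("Customers", [
        Service("Proactive health care", [
            Requirement("Easy-to-find programs"),
            Requirement("First time usage"),
            Requirement("Every day use"),
        ]),
    ]),
    UserGroup("Professionals", [
        # Table 5 is not split by service; "All services" is a placeholder label.
        Service("All services", [
            Requirement("Learnability",
                        "9 of 10 learn routine tasks without training",
                        "Usability test"),
            Requirement("Efficiency",
                        "time-critical tasks within pre-defined time limits",
                        "Usability test"),
            Requirement("Subjective satisfaction",
                        "9 of 10 rate the system 1 point above a reference system",
                        "Satisfaction measurement questionnaire"),
        ]),
    ]),
]
```

Iterating over `flatten(requirements)` lists each requirement with its full path, which makes it easy to see, for instance, which factors still lack a goal.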
6 Discussion of Results
In this study, preliminary quantitative usability requirements for a public health care system were determined. Overall, this study is one of the few case studies on quantitative usability requirements. A meaningful set of quantitative, high-level usability requirements could be determined, which was not at all obvious at the beginning of the research. The requirements determination process was iterative, and the requirements were refined step-by-step. The usability requirements are categorized first through the main user groups, then by the services, and finally by specific usability factors. As research contributions, we find (1) the idea of having high-level usability requirements determined at the level of services; (2) hierarchical organization of the requirements; and (3) the finding that the types of usability requirements may be quite different for different categories of users and services. One should understand that the approach for defining usability requirements described in this paper is not proposed to be applicable as such to other development contexts. For example, the authors ended up with quite a different set of usability requirements in the context of a development project for a user interface of a mobile phone [8]. Another limitation of this study is that the health care system is still very much under development, and we do not yet have data on the appropriateness and usefulness of the requirements. These issues are the topic of other papers in the future.
For practitioners, the results indicate that the appropriate set of usability requirements is dependent on the specific application and development context. One should try to define requirements that truly depict the usability of the system or product under development. Research on quantitative usability requirements is quite limited. There is room and need for different kinds of research efforts: from better theoretical understanding to the development of effective practical methods.
References
1. ANSI. Common Industry Format for Usability Test Reports. NCITS 354-2001 (2001)
2. Bevan, N.: Practical Issues in Usability Measurement. ACM interactions 13(6), 42–43 (2006)
3. Bevan, N., Claridge, N., Athousaki, M., Maguire, M., Catarci, T., Matarazzo, G., Raiss, G.: Guide to specifying and evaluating usability as part of a contract, version 1.0. PRUE project. London, Serco Usability Services: 47 (2002)
4. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
5. Hornbæk, K.: Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies 64(2), 79–102 (2005)
6. ISO/IEC. 9241-11 Ergonomic requirements for office work with visual display terminals (VDT)s - Part 11 Guidance on usability. ISO/IEC 9241-11: 1998 (E) (1998)
7. Jokela, T.: Guiding designers to the world of usability: Determining usability requirements through teamwork. In: Seffah, A., Gulliksen, J., Desmarais, M. (eds.) Human-Centered Software Engineering. Kluwer HCI series (2005)
8. Jokela, T., Koivumaa, J., Pirkola, J., Salminen, P., Kantola, N.: Methods for quantitative usability requirements: a case study on the development of the user interface of a mobile phone. Personal and Ubiquitous Computing 10(6), 357–367 (2006)
9. Kirakowski, J.: The Software usability measurement inventory: background and usage. In: Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L. (eds.) Usability Evaluation in Industry, pp. 169–177. Taylor & Francis, London (1996)
10. Mayhew, D.J.: The Usability Engineering Lifecycle. Morgan Kaufmann, San Francisco (1999)
11. Nielsen, J.: Usability Engineering. Academic Press, Inc. San Diego (1993)
12. NIST. Proposed Industry Format for Usability Requirements. Draft version 0.62. 8-Aug-04 (2004)
13. NIST. Common Industry Specification for Usability - Requirements (2006) (Retrieved 16.2.2007) http://zing.ncsl.nist.gov/iusr/
14. Sauro, J., Kindlund, E.: A Method to Standardize Usability Metrics Into a Single Score. In: Conference on Human Factors in Computing Systems, Portland, Oregon, USA, pp. 401–409. ACM Press, New York (2005)
15. Whiteside, J., Bennett, J., Holtzblatt, K.: Usability Engineering: Our Experience and Evolution. In: Helander, M. (ed.) Handbook of human-computer interaction. Amsterdam, North-Holland, pp. 791–817 (1988)
16. Wixon, D., Wilson, C.: The Usability Engineering Framework for Product Design and Evaluation. In: Helander, M., Landauer, T., Prabhu, P. (eds.) Handbook of Human-Computer Interaction, pp. 653–688. Elsevier Science B.V, Amsterdam (1997)
Why It Is Difficult to Use a Simple Device: An Analysis of a Room Thermostat Sami Karjalainen VTT, P.O. Box 1000, 02044 VTT, Finland [email protected]
Abstract. A diversity of usability problems with office thermostats were found in a preceding study. In this paper, the reasons behind the problems are studied by analysing a room thermostat. The analysis shows that a substantial amount of information is needed to use a simple thermostat, and the system image of the thermostat does not deliver the information. From the viewpoint of the analysis, it is not surprising that office occupants have serious problems with thermostats. Keywords: thermostat, knowledge, information needs, user interface design.
This paper concentrates on the reasons behind the problems: why is it difficult to use a room thermostat? The paper presents an analysis of a room thermostat and the knowledge a user must have to be able to use the room thermostat with effectiveness and efficiency. The analysis is based on the experiences gained in interviewing 27 office occupants in 13 Finnish offices. Twenty-three of the occupants had a room thermostat in their office. All the room thermostats were non-programmable and simple. The interviewees had been working in their present rooms from one-and-a-half months to more than ten years, but they still had serious problems with the thermostats in the offices.
2 A Typical Example of a Room Thermostat Many kinds of room thermostats have been designed. Many of them are very complex and it is clear that they can not be used without a manual. This study, however, concentrates on simple room thermostats. I have chosen a typical example of a room thermostat for a closer examination (see Fig. 1). The model presented in Fig. 1 is common in Finnish offices. Several companies manufacture practically similar versions of the room thermostat.
Fig. 1. An example of a room thermostat
The room thermostat has a dial for adjusting room temperature set point. The scale presents no temperature values, but only the symbols "+" and "–". To increase the room temperature, the user should turn the dial to the "+" direction, and to decrease the room temperature the dial should be turned to the opposite direction ("–"). The room thermostat presents a light symbol in the upper right corner of the interface. If the light is red, the system is increasing the room temperature. Correspondingly, a green light means that the system is decreasing the room temperature at the moment. A blank light denotes a stable situation.
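The behaviour of the light symbol described above can be summarised as a small mapping from what the system is currently doing to the colour shown. The sketch below is only an illustration: it assumes that the system compares the room temperature with the set point, and the `tolerance` parameter and both function names are hypothetical rather than taken from the device.

```python
def system_action(room_temp, set_point, tolerance=0.5):
    """Decide what the system is doing (assumed comparison with the set point)."""
    if room_temp < set_point - tolerance:
        return "increasing"   # the system is raising the room temperature
    if room_temp > set_point + tolerance:
        return "decreasing"   # the system is lowering the room temperature
    return "stable"


def indicator_light(action):
    """Map the system's current action to the light described in the text."""
    lights = {"increasing": "red", "decreasing": "green", "stable": "blank"}
    return lights[action]
```

For instance, a room at 20 °C with a set point of 22 °C would show a red light under these assumptions. Note that the real device shows no temperature values on its dial, so the user never sees the set point that this sketch treats as a number.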
The room thermostat can be connected to a cooling or heating system, or it can be shared with both systems. Most typically in offices it is connected to a cooling system, for example, to a cooled beam system or a fan convector system. A separate heating system typically exists in Finnish offices. The heating systems typically include thermostatic valves for user adjustment.
3 Information Needs for Using the Room Thermostat
The use of the room thermostat (Fig. 1) is analysed in Table 1. The table presents the information needs and possible misunderstandings with the thermostat. It also presents the consequences of the misunderstandings.
Table 1. Information needs for use of the room thermostat
| No. | Information needed | Correct knowledge for use of the room thermostat | Possible misunderstanding | Consequences of misunderstanding |
|---|---|---|---|---|
| 1 | What is the purpose of the device? | It is a user-adjustable thermostat. | The purpose of the device remains unclear. It is not recognised as being for temperature control. | The room thermostat is not used even in thermal discomfort. |
| 2 | Are office occupants allowed to touch the device? | Yes. The room thermostat is for the use of occupants. | The room thermostat is for service personnel only. | As above. |
| 3 | Is the room thermostat active or passive at the moment? | Depends on the cooling/heating system and the current conditions (e.g. season). | Passive thermostat is considered to be active, or the other way around. | Use of a passive system leads to dissatisfaction with the system (or to a placebo effect). |
| 4 | What do "+" and "–" mean in the interface? | "+" means increase and "–" decrease in room temperature set point. | "+" means increase and "–" decrease in cooling power. | The dial is turned (i.e. the room temperature set point is adjusted) to the wrong direction. This may lead to dissatisfaction with the system. |
| 5 | How much should the dial be turned to get the desired effect on room temperature? | There is no clear answer, because that depends on the characteristics of cooling/heating system and the current conditions. | The adjustable range of room temperature may be understood completely wrong. For example, it may be thought that the room temperature is adjustable with a very large range. | The dial is turned (i.e. the room temperature set point is adjusted) too little or too much. This may lead to unnecessary adjustments and dissatisfaction with the system. |
| 6 | (After adjusting the thermostat) Has the room temperature changed to the desired level or is it still changing? | A red light means that the system is increasing the room temperature at the moment. The green light means that the room temperature is decreasing. A blank light means a stable situation. | The light is not recognised at all, or the meaning of the light symbol is not understood. | The user may think, for example, that the room temperature has reached the new level, although it is still changing. This may lead to unnecessary adjustments and dissatisfaction with the system. |
All the misunderstandings presented in Table 1 are real and are taken from the contextual interviews with the occupants. Most of the misunderstandings are common in practice. The analysis explains how bad user interface design may easily lead to dissatisfaction with the system or disuse of the system. The analysis also shows that a lot of information is needed to use a simple thermostat. From the viewpoint of the analysis, it is not surprising that office occupants have serious problems with the thermostats and that the significance of the thermostats for thermal comfort is low, as was found in [1]. Some of the information (Table 1) can be gathered by trial and error. For example, the meaning of "+" and "–" should be easy to learn by experience. However, it is clear that office occupants need instructions for use of the thermostats. Occupants need to understand, for example, whether the thermostat is connected to a cooling or heating system. If the thermostat is connected to a cooling system, the user of the thermostat needs to know when the cooling system is active. It is, however, unrealistic to suppose that office occupants would spend their valuable time on learning the way in which the building works.
4 Incoherent Mental Models

Norman [3] distinguishes three aspects of mental models: the design model, the user’s model and the system image. The designer creates the image of the system, the visible part of the system (the user interface including the labels and the documentation), according to the design model. The user is confronted with the system image. The user acquires all knowledge of the system from the system image, and the user’s model is the way the user perceives the system operates. We can drive a car without an understanding of how it actually works. Similarly, we should be able to use the thermostats with only limited knowledge of the cooling and heating systems.

Unfortunately, the system image of the thermostat in Fig. 1 does not deliver the information that is needed to operate the thermostat. The design model is not similar to the user’s model as it should be. The designer has not had a realistic view of the users but has supposed that office occupants have knowledge they do not have in reality.

Misunderstandings with thermostats have earlier been reported by Kempton [2]. He analyzed folk theories for home heating control and found two common theories of
S. Karjalainen
how a thermostatic valve works: a feedback theory and a valve theory. In the feedback theory, the thermostat senses the room temperature. In the valve theory, this is not understood, and the thermostat dial is thought to work like a gas pedal that directly controls the amount of heat.
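The practical difference between the two folk theories can be illustrated with a small simulation. The following is a minimal sketch, assuming a simple on/off heating controller and made-up thermal constants (`heater_gain`, `loss_rate` and `outside_temp` are illustrative assumptions, not measurements from the study). It shows why a user who holds the valve theory and turns the dial to its maximum does not warm the room any faster, but only overshoots the desired temperature.

```python
def simulate(set_point, start_temp=18.0, minutes=600,
             heater_gain=0.6, loss_rate=0.02, outside_temp=5.0):
    """Simulate an on/off thermostat; return the room temperature per minute.

    All constants are illustrative assumptions, not measured values.
    """
    temp = start_temp
    history = []
    for _ in range(minutes):
        # Feedback: the thermostat senses the room and switches the heater.
        heater_on = temp < set_point
        heat_in = heater_gain if heater_on else 0.0
        heat_out = loss_rate * (temp - outside_temp)
        temp += heat_in - heat_out
        history.append(temp)
    return history


def minutes_to_reach(target, set_point):
    """Number of simulated minutes until the room first reaches `target`."""
    for minute, temp in enumerate(simulate(set_point), start=1):
        if temp >= target:
            return minute
    return None


if __name__ == "__main__":
    # A user holding the valve theory turns the dial to maximum to "heat faster".
    print(minutes_to_reach(22.0, set_point=22.0))
    print(minutes_to_reach(22.0, set_point=30.0))  # same warm-up time
    print(round(simulate(30.0)[-1], 1))            # but the room ends up too warm
```

In this model the heater delivers the same power whenever it is on, so the set point only decides when it switches off. A real cooling or heating system may behave differently (for example with proportional control), which is one more reason why the dial alone cannot tell the user what to expect.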
5 Improving the Design of the Thermostat

The analysis shows that for the successful use of the thermostat the user must have a lot of knowledge. Although room thermostats are common in offices, the office occupants do not have that knowledge. This has led to a situation where the thermostats are used very little. It is clear that the user interface of the thermostat in Fig. 1 could be improved considerably.

First, the thermostat should clearly present its purpose. Identifiability can be enhanced by symbols that refer to temperature, e.g. a degree sign, a thermometer, or red and blue colours (denoting warm and cool). Many of the problems users have could be avoided by just two more modifications to the user interface. If the thermostat had an understandable temperature scale, there would be fewer problems in adjusting the thermostat. The other main improvement concerns the feedback the thermostat gives after a user adjustment. Users need to know whether the system is working to fulfil the request. The feedback is especially important since the rate of room temperature change is slow, because of the thermal inertia of the building materials and the cooling/heating system itself. Many thermostats do not give any feedback, but the thermostat in question shows a light symbol when the room temperature is changing. However, the light symbols are not intuitively understandable; learning is needed to understand their meaning. The feedback should be presented more clearly (in one way or another), for example, by arrow symbols that show the direction of the temperature change.
6 Conclusion and Future Work

Even a simple device can be very difficult to use if the user does not have the information needed for the use of the device. The designers often overestimate the knowledge the users have, and that overestimation leads to usability problems and dissatisfaction with the system or even disuse of the system, which is the case with the room thermostat analysed in this paper. No specific usability guidelines are available for room temperature controls in the literature. In future work I will concentrate on developing such a guideline.

Acknowledgments. I thank Raino Vastamäki for the picture of the thermostat in Fig. 1.
References

1. Karjalainen, S., Koistinen, O.: User Problems with Individual Temperature Control in Offices. Building and Environment (In Press)
2. Kempton, W.: Two Theories of Home Heat Control. In: Quinn, N., Holland, D.C. (eds.) Cultural Models in Language and Thought. pp. 222–242. Cambridge University Press, Cambridge (1987)
3. Norman, D.A.: The Design of Everyday Things. Basic Books, New York (1988)
Usability Improvements for WLAN Access

Kristiina Karvonen and Janne Lindqvist

Department of Computer Science and Engineering, Helsinki University of Technology, P.O. Box 5400, 02015 TKK, Finland
{Kristiina.Karvonen, Janne.Lindqvist}@tml.hut.fi
Abstract. Wireless Local Area Networks (WLANs) have become a commonplace addition to the normal environments surrounding us. Based on IEEE 802.11 technology, WLANs can now be found in the workplace, at homes, and in many cities’ central district areas as open or commercial services. These access points in public areas are called “hotspots”. They provide Internet access in various types of public places such as shopping districts, cafés, airports, and shops. As the hotspots are being used by a growing user base that is also quite heterogeneous, their usability is becoming ever more important. As hotspots can be accessed by a number of devices differing in their capabilities, size, and user interfaces, achieving good usability in accessing the services is not straightforward. This paper reports a user study and usability analysis of WLAN access carried out to discover users’ needs and to suggest enhancements that address the usability problems in WLAN access.

Keywords: WLAN, Usability, user interface design, security, accessibility, authentication.
expect their security to be in place and their privacy to be protected, and to be in control of what information is disclosed about them. A further difficulty in providing easy-to-use WLAN access is that hotspots can be accessed by a number of devices differing in their capabilities, size, and user interfaces. As a result, achieving good usability in accessing the WLAN and the services it can provide is by no means straightforward.

In this paper, we look into the current work done in this area. We cover relevant user studies on hotspots as well as on other types of WLAN access, such as public WLAN services or private home WLAN access, to discover users’ needs and current usage of the access points, and the usability work done to enhance the current solutions. We also present and evaluate the methodologies used to study the usability of hotspots and other types of WLAN access, and the usability issues in the controllability and visualisation embedded in current approaches.

The main body of this work consists of a report on a user study and usability analysis conducted to determine the current level of usability of WLAN hotspot access. The work covers a representative selection of earlier usability work done in this or a related area: the relevant user studies, and the usability studies of existing solutions and UIs. It also looks into how various usability methods have been applied to the study of the usability of WLAN hotspot access and discusses their feasibility. The novelty of the work lies in that it covers not only the usability of the access points themselves and the usability issues in some of the most probable end devices used to access the WLANs, but also the usability of the security issues involved in this access.

The paper is organized as follows. First, we give a short presentation of usability and user-centred design in general with regard to mobile usage situations with small devices. We proceed by presenting the relevant work done in this area and discuss the state of the existing usability work. We then present our user field studies, where we searched for and located publicly visible WLAN access points in several locations in two cities in Finland. Since providing access also means dealing with users’ privacy and security issues, we complement the analysis with a short discussion on the privacy and usability of security in the area of WLAN access.
2 Background

2.1 Security Issues in WLAN Access

The standardized way to secure WLAN access is based on the link layer: the radio traffic between the access point and the user’s device is encrypted. The first version of the WLAN link-layer security architecture, Wired Equivalent Privacy (WEP) [16], was proven to be insecure [12], and a working attack tool was quickly implemented and published [33]. Today, free, easy-to-use attack tools that can passively break WEP protection in a few seconds can be downloaded from the Internet for e.g. Windows and Linux [2]. The WLAN vendor community decided to solve the problem without the IEEE standardization body and formed an alliance to correct it. The result was Wi-Fi Protected Access (WPA), which corrects the deficiencies of WEP. The standard security scheme is IEEE 802.11i [17] which is
also known as WPA2. Although WEP is very insecure, it is still widely used for backwards compatibility reasons.

Since the radio network is a shared medium, anyone in the proximity of the WLAN network can receive the broadcast traffic (in clear and unobstructed space, the WLAN signal can reach more than 500 meters). This allows for a technique called wardriving [8]: the attacker merely drives around the city looking for open or vulnerable networks. The equipment needed is again software downloadable from the Internet [4] and a laptop or even a PDA. A community of security professionals and hobbyists gathered worldwide data, which revealed that out of 228 537 access points found, 61.6 % were not using link-layer encryption at all [3].

In addition to link-layer encryption, WLAN networks are commonly secured in two ways: MAC address based authentication and application-layer, usually Web based, authentication. MAC address based authentication is used to allow only known computers onto the network. However, in practice it provides protection only against benevolent network visitors. Attackers can simply eavesdrop to learn which MAC addresses can access the network and reconfigure their WLAN devices accordingly.

Web based authentication is used to authenticate users to the network. The network might not even use link-layer encryption; instead, the authentication process is secured with TLS, as secured Web sites are. Before the connection is authenticated, all traffic originating from the unauthenticated WLAN device is forwarded to the authentication web page. On the page, the user is required to give a correct user name and password, and is then given access to the Internet. It is also common to bind the authentication to the particular MAC address of the device that performed the authentication. This results in problems that we describe in the discussion section.
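The Web based authentication flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not the implementation of any particular hotspot product: the class `HotspotGateway`, its method names and the portal address are hypothetical, and TLS and real packet handling are left out. It shows the redirect of unauthenticated traffic, the login step, and the binding of an account to the MAC address that logged in.

```python
PORTAL_URL = "https://portal.example/login"  # hypothetical address


class HotspotGateway:
    """Toy model of a hotspot gateway with Web-based login (illustrative only)."""

    def __init__(self, accounts):
        self.accounts = dict(accounts)   # user name -> password
        self.bound_mac = {}              # user name -> MAC that logged in
        self.allowed_macs = set()        # MACs whose traffic is forwarded

    def handle_request(self, mac, url):
        """Forward traffic from authenticated MACs; redirect everyone else."""
        if mac in self.allowed_macs:
            return ("forward", url)
        return ("redirect", PORTAL_URL)

    def login(self, mac, user, password):
        """Check credentials and bind the session to the device's MAC address."""
        if self.accounts.get(user) != password:
            return False
        bound = self.bound_mac.get(user)
        if bound is not None and bound != mac:
            # The account is tied to another device, so a second device fails.
            return False
        self.bound_mac[user] = mac
        self.allowed_macs.add(mac)
        return True
```

Two properties of this design are worth noting. The gateway identifies a device only by the MAC address it presents, so a device that copies an allowed address is treated exactly like the original. And because the account stays bound to the first MAC address, a user who logs in with one device cannot continue with another unless the binding is released, for example by a logout function.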
2.2 User Studies and Usability Testing of WLAN Hotspot Access

The usability work related to WLAN access falls under several subcategories of the field of Human-Computer Interaction (HCI). These include personal computing, pervasive computing, wireless computing, mobile HCI, and, of course, the usability of WLAN access itself. However, most user studies that we know of focus on hotspot/WLAN access in a specific area, most notably on university campuses [14], [15], [24], [30], [31], in a building-wide local-area wireless network [34], or at a scientific conference [6]. This work does not, however, really give insight into the usability issues embedded in accessing public APs on the fly, since the users were not at all mobile in their usage behaviours. [9] have tracked how public APs were utilized in Manhattan, N.Y., finding some similarities with the campus studies: the users were repeatedly using the same APs, which is further evidence of the relative “immobility” of the subjects. However, finding the APs was again not a part of this study.

[5], [7] have identified the current challenges and future directions for wireless hotspots. They mention, among other things, “device-independence” as one key goal in enhancing the usability of WLAN access; the need for this became obvious in this study also. A further challenge is defining the identity of the user when accessing hotspots: which attributes to use in order to preserve privacy while at the same time preserving accountability. Existing solutions for easy-to-use WLAN hotspot access include the FriendZone usage study [10], which was however not tested with real users. Another interesting
piece of work is presented by [32], who point out that in mobile usability in general, work on understanding the interaction between a device and its physical environment is still scarce, and that better utilization of geographical information could also benefit the locating and naming of WLAN APs. [31] deal directly with how hotspots are currently found, the key outcome being that “word-of-mouth” was serving as the primary information source about the location of available hotspots. They identify several usability problems in finding WLAN APs. These include locating distant networks, notification of new hotspots, finding and accessing the strongest signal (an approach that could easily be used for malicious purposes also, as reported in [29]), and getting information about hotspots. [11] write interestingly about free WLAN access as part of the wireless commons, and about the means to prevent its misuse.

The usability methods applied to studying WLAN hotspot access include questionnaires, e.g. [20], [21], usage tracking and analysis, e.g. [5], [10], [24], and observations and interviews of users [31].

On the basis of the related work, several usability issues can be detected in WLAN usability. These are:

• The multitude of devices, differing in form, input mode, processing power, battery life, and screen size/resolution/colour depth [7], [32].
• The relative immobility of users in how the hotspots are currently used and why this is so [9], [26], [31], and how hotspots can be found [7], [31].
• Location privacy, tracking, transactions [1], [13], [15], [21], [23].
• Tradeoffs between usability and security. E.g. according to [23], users require that their transactions over public WLANs be safe, yet they want seamless, automated roaming without the need for manual sign-on. What form of authentication is best from the usability point of view [7]?
[25] concentrate on privacy enhancements in AP usage. Further work on home users’ network access behaviours can be found in e.g. [19], who evaluated the usage behaviours and UI expectations in a smart home environment with several users over an extended period of time (6 months), where part of the network access was via WLAN. Three devices (a PC, a mobile phone, and a media terminal) were tested, and UI prototypes for these devices were designed and adjusted according to user feedback. It became evident that the user expectations for each device were different, with the mobile phone becoming the most used device for controlling the smart home functionalities despite initial reluctance and suspicion about its suitability for operating the home. These results may have repercussions for the work at hand, since the initial resentment towards the small terminal of the mobile phone was later overcome and this device was preferred. In practice this means that users were willing to trade usability for mobility and personal possession in actual usage.
3 The Usability Study

3.1 Test Setting

We searched for and located publicly visible WLAN access points in several locations in two cities in Finland. The discovered access points were of several different kinds:
part of a publicly available WLAN provided by the city; WLAN access offered as a free service by a private vendor, such as a café; a company WLAN offered for outside visitors; a private home WLAN; and a WLAN offered by a public institute, such as a local university. The user studies were done by two researchers, one security expert and one usability expert, with three types of end devices: an Apple iBook G4 laptop with the Mac OS X operating system, a Nokia 770 series PDA device, and a Nokia 9500.

3.2 Test Procedure

A cognitive walkthrough method was used to simulate the steps and mindset of an actual mobile user who would be moving within a district, trying to find and utilize existing WLAN APs. The cognitive walkthrough is a well-established usability method that has been used effectively e.g. in the classical study by Whitten and Tygar [36]. As a further method, expert analysis, including a heuristic analysis as reported e.g. by [27], was used to analyse the usability problems detected during the usage. The two test persons tested different locations dispersed over two cities in the capital region of Finland, Helsinki and Espoo. The searching and accessing procedures were repeated in each location with at least two different devices, in most cases with all three.
4 Discussion

On the basis of the study, we were able to detect several generic usability problems in how the current access points are provided and visualised to the end users, regardless of the device used. These usability problems include the naming of the available WLANs, their actual availability, the visualisation of the security and access possibilities of the WLANs, and control problems in managing the connections that arise from the dynamism of the WLAN search functionalities in the tested devices. Next, we discuss each of these generic issues in detail.

Naming. In the study, it became obvious that there were no standard and intuitive ways to name the various WLANs available. For the most part, the WLANs were named according to the service provider (the company; the university; the city), the manufacturer of the WLAN device used (Linksys; Motorola), the WLAN owner (“pete”), or a generic location name (“home network”; “home base”). Better naming policies could possibly be derived from [22], where users were asked to name locations in a mobile use situation. The name classifications that came up in that study included 1) generic locations, 2) points of interest and 3) geographical areas. Although the focus of that work is on presence information, these may be the natural and intuitive location names for users, so utilizing these categorizations in WLAN AP naming might prove beneficial to the overall usability and understandability of WLAN access.

Actual availability. The found WLAN access points were often not really available. Also, the lists were not updated according to what APs were currently available.
Visualisation of the security and access possibilities. The various types of locks associated with different types of WLAN security features (WEP, WPA) were not intuitive to the users. Further, the WLAN APs were often visually suggestive of being openly accessible, when in reality they were not. In many cases, such as when trying to find open access in a public place, it is futile to show users the secured APs. An improvement would be to let the user choose to list only non-secured WLAN APs.

Managing the connections list. The tested devices had a built-in feature of searching for available access points constantly, regardless of the user actions conducted at the same time. It was not possible for the user to stop the search when a desired AP had been found. It was also not possible to organize the list of access points in any way, except through naming. The only listing that was easily available (or available at all?) to the users was an alphabetical listing on the basis of the default or user-specified name of the AP. In many cases, this was the worst order, since the list contained all WLAN APs added to the list at any point during usage, including APs found in another city, for example. A usability improvement would include more advanced ways to arrange the listed APs, according to preference and most recent use, for example. Further, the search should be stoppable, and restartable, by user command.

In addition to the generic usability problems discovered, each end device tested had several usability issues of its own, adding to the usability issues embedded in the WLAN access itself. These include how the connections are established, shown, maintained, managed and accessed via the devices.

Nokia 770. The Nokia 770 was clearly the easiest to use for WLAN access, which is natural since WLAN is its major connectivity type, alongside other short-range connections such as Bluetooth via a mobile phone. Access to WLAN was rather straightforward, with one-step access from the main screen to the connections. However, the connectivity settings could be changed only via the control panel, not from the connection manager directly. A clear improvement would be to allow connectivity editing from the connection manager as well. Further, the dynamic search process with no ‘pause’ possibility made managing the found APs very difficult, since the list of connections was changing constantly. Adding a ‘pause’ button to the search would benefit the experienced usability of the search and list handling. Further, the once established and saved APs were listed as a single list in alphabetical order. Because of this, the first items in the list could be APs that were not accessible at the time. A clear improvement would be to include multiple ways to organize the list of APs, according to e.g. recent use, location, current availability, etc. Further, even though the device showed via an icon whether the AP was in fact reachable, the user was able to connect to any AP, go all the way through the process, get an acknowledgement of a successful connection, and only when opening e.g. a web browser get a notice of a failed network connection. A usability improvement would be not to allow connections to APs that the system has detected as unreachable.

Nokia 9500 (Communicator). The Nokia 9500 Communicator offers two ways to start using a WLAN access network. The main screen shows a white W sign on the left if WLAN access is available. The first is "EasyWLAN", where the user is shown a list
of available access points. However, the list does not provide any information about the access points, only the names of the currently available ones. To get more information, the user must explicitly configure the access point with the process shown in Figure 1. After the access point has been configured, the user can start e.g. a Web browser. The user is then prompted to "Select an access point". By default, the list of available network connections also includes e.g. GPRS and other configured network access options. The user thus must remember where to connect.
Fig. 1. The multiple steps required for handling WLAN access in Nokia 9500
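The list-handling improvements suggested in this discussion (hiding unreachable or secured APs, ordering by most recent use instead of alphabetically, and a search that the user can pause and resume) can be made concrete with a short sketch. The names `AccessPoint`, `visible_list` and `ScanSession` are hypothetical and are used only for illustration; they do not describe the software of any of the tested devices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessPoint:
    name: str
    security: str                      # "open", "WEP" or "WPA"
    reachable: bool                    # result of the latest scan
    last_used: Optional[float] = None  # timestamp of last use, None if never used


def visible_list(aps, open_only=False, hide_unreachable=True):
    """Filter the AP list and order it by most recent use, then by name."""
    shown = [
        ap for ap in aps
        if (ap.reachable or not hide_unreachable)
        and (ap.security == "open" or not open_only)
    ]
    # Recently used APs first, never-used APs last, ties in alphabetical order.
    return sorted(
        shown,
        key=lambda ap: (ap.last_used is None, -(ap.last_used or 0.0), ap.name.lower()),
    )


class ScanSession:
    """Collects scan results; the list stays frozen while the search is paused."""

    def __init__(self):
        self.paused = False
        self.found = {}  # AP name -> AccessPoint

    def on_scan_result(self, ap):
        if not self.paused:
            self.found[ap.name] = ap

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
```

The ordering key is only one possible choice. Location or user-defined preference could be added as further sort criteria in the same way.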
With this device, the steps required for managing and starting WLAN access were quite cumbersome, consisting of multiple non-straightforward steps. Further, the language and terminology used in the UI were quite technical, effectively diminishing an average user’s capability for any WLAN access management. A clear usability improvement for WLAN access with the Nokia 9500 would, then, include at least making the UI language more user-friendly and cutting down the steps required to form a WLAN connection in the first place. Further, major visualization enhancements would be desirable. The current icons used for actual access and for the strength of the signal will probably be incomprehensible to most users, especially since no learning effect can be expected if WLAN usage is infrequent.

Apple iBook G4 laptop with Mac OS X. Perhaps one of the biggest problems with the laptop, besides the obvious fact that a laptop-sized connection device is not truly feasible in the use situation described in this study, was that it would find only a small percentage of the available APs at each location, as compared with the other two devices. This undermined its reliability in providing WLAN access in the first place and would leave the user frustrated and without any connections, since they would not
be found. In usability there is a saying, “if functionality is not found by the user, it doesn’t really exist”. This truly holds for trying to get WLAN access with a laptop.

A further difficulty was presented by changing the connecting device. In the case of restricted access with a temporary username and password to a publicly available WLAN, such as the Helsinki city public WLAN, once the user had logged into the system with one device, it was not possible to change the end device. If the batteries ran out or the user decided to change to a device better suited for browsing the available services, the initial wrong choice would effectively stop the user from accessing the service at all, since it was not possible to log out of the service.

4.1 On the Privacy and Usability of Security in the Area of WLAN Access

The Privacy Enhancement Technologies (PET) address four essential requirements for privacy: anonymity, pseudonymity, unlinkability, and unobservability [25]. [25] describe the current state of privacy protection in existing solutions for WLAN access as “complex, non-adaptive, intrusive for the user and not context-aware”. Further, these solutions use only a very limited set of possible user identification parameters for accountability, such as ID, address, or location. In their approach, medical information about the user, collected from a body sensor network, is protected by automatic filtering when the user is using a hotspot service. However, the system described also allows for user control by enabling the user to assign preferred levels of privacy to the data. The different aspects of the data whose privacy needs to be protected include location, context, identity of the user, and the private information available. The user of the AP should, then, be able to choose to reveal these different types of information about himself in different amounts and different combinations in different situations, and in an easy fashion.
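One way to read this requirement is as a per-situation disclosure policy over the four kinds of data named above. The sketch below is an assumption-laden illustration, not the system described in [25]: the three levels, the `POLICIES` presets and the `coarse`/`full` representation of each attribute are all hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    HIDDEN = 0   # reveal nothing
    COARSE = 1   # reveal a generalised value
    FULL = 2     # reveal the exact value


ATTRIBUTES = ("location", "context", "identity", "private_info")


def disclose(profile, policy):
    """Return only what `policy` allows; unlisted attributes stay hidden."""
    released = {}
    for attribute in ATTRIBUTES:
        level = policy.get(attribute, Level.HIDDEN)
        if level == Level.HIDDEN or attribute not in profile:
            continue
        key = "full" if level == Level.FULL else "coarse"
        released[attribute] = profile[attribute][key]
    return released


# Hypothetical per-situation presets the user could pick with a single choice.
POLICIES = {
    "public hotspot": {"location": Level.COARSE},
    "home network": {"location": Level.FULL, "identity": Level.FULL,
                     "context": Level.COARSE},
}
```

Defaulting every unlisted attribute to `HIDDEN` keeps the easy path private: the user only has to state what may be revealed in a given situation, not what must be withheld.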
5 Conclusions

On the basis of the usability analysis presented, it is clear that there are several serious usability issues in the current UIs for handling WLAN access management. We are in the process of implementing the suggested usability improvements on the Nokia 770 and then intend to do extensive usability testing of the new design with real users.
References

1. Ackerman, M.S.: Privacy in pervasive environments: next generation labeling protocols. Pers. Ubiq. Comput. 8(6), 234–240 (2004)
2. Anon: [Aircrack-ng]: Referenced 15.2.2007 (2007) Web page http://www.aircrackng.org/doku.php
3. Anon: The Official WorldWide Wardrive (2007) Referenced 16.2.2007 Web page available at http://www.worldwidewardrive.org/wwwdstats.html
4. Anon: Wardriving Tools, Wardriving Software, Wardriving Utilities (2007) Referenced 16.2.2007 Web page available at http://www.wardrive.net/wardriving/tools
5. Balachandran, A., Voelker, G.M., Bahl, P.: Wireless hotspots: current challenges and future directions. In: WMASH ’03: Proc. of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots, pp. 1–9. ACM Press, New York (2003)
6. Balachandran, A., Voelker, G.M., Bahl, P., Rangan, P.V.: Characterizing user behavior and network performance in a public wireless lan. In: SIGMETRICS ’02: Proc. of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, pp. 195–205. ACM Press, New York (2002)
7. Balachandran, A., Voelker, G.M., Bahl, P.: Wireless hotspots: current challenges and future directions. Mob. Netw. Appl. 10(3), 265–274 (2005)
8. Berghel, H.: Wireless infidelity I: war driving. Comm. ACM 47(9), 21–26 (2004)
9. Blinn, D.P., Henderson, T., Kotz, D.: Analysis of a wi-fi hotspot network. In: Proc. of the 2005 workshop on Wireless traffic measurements and modeling, pp. 1–6. USENIX Association, Berkeley, CA, USA (2005)
10. Burak, A., Sharon, T.: Analyzing usage of location based services. In: Proc. of Human factors in computing systems, pp. 970–971. ACM Press, New York (2003)
11. Damsgaard, J., Parikh, M.A., Rao, B.: Wireless commons perils in the common good. Commun. ACM 49(2), 104–109 (2006)
12. Fluhrer, S., Mantin, I., Shamir, A.: Weaknesses in the Key Scheduling Algorithm in RC4. In: Vaudenay, S., Youssef, A.M. (eds.) SAC 2001. LNCS, vol. 2259, Springer, Heidelberg (2001)
13. Gruteser, M., Grunwald, D.: Enhancing location privacy in wireless lan through disposable interface identifiers: a quantitative analysis. In: Proc. of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots, pp. 46–55. ACM Press, New York, USA (2003)
14. Henderson, T., Kotz, D., Abyzov, L.: The changing usage of a mature campus-wide wireless network. In: Proc. of the 10th annual international conference on Mobile computing and networking, pp. 187–201. ACM Press, New York, USA (2004)
15. Hong, J.I., Ng, J.D., Lederer, S., Landay, J.A.: Proc. of the 2004 conference on Designing interactive systems: processes, practices, methods, and techniques, pp. 91–100. ACM Press, New York, USA (2004)
16. IEEE: 802.11-1999 Information technology. Telecommunications and information exchange between systems. Local and metropolitan area networks. Spec. req. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE, New York (1999)
17. IEEE: 802.11i-2004 IEEE Standard for Information technology. Telecommunications and information exchange between systems. Local and metropolitan area networks. Spec. req. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications. Amendment 6: Medium Access Control (MAC) Security Enhancements. IEEE, New York (2004)
18. Kanter, T.G.: Going wireless, enabling an adaptive and extensible environment. Mob. Netw. Appl. 8(1), 37–50 (2003)
19. Koskela, T., Väänänen-Vainio-Mattila, K.: Evolution towards smart home environments: empirical evaluation of three user interfaces. Personal Ubiquitous Comput. 8 (2004)
20. Lederer, S., Mankoff, J., Dey, A.K.: Who wants to know what when? privacy preference determinants in ubiquitous computing. In: CHI ’03: extended abstracts on Human factors in computing systems, pp. 724–725. ACM Press, New York, USA (2003)
21. Lederer, S., Hong, I., Dey, K., Landay, A.: Personal privacy through understanding and action: five pitfalls for designers. Pers. and Ubiq. Comp. 8(6), 440–454 (2004)
22. Lehikoinen, J.T., Kaikkonen, A.: PePe field study: constructing meanings for locations in the context of mobile presence. In: Proceedings of the 8th Conference on Human-Computer interaction with Mobile Devices and Services MobileHCI ’06, vol. 159, pp. 53–60. ACM Press, New York (2006)
23. Matsunaga, Y., Merino, A.S., Suzuki, T., Katz, R.H.: Services: Secure authentication system for public WLAN roaming. In: Proceedings of the 1st ACM international workshop on Wireless mobile applications and services on WLAN hotspots WMASH ’03, ACM Press, New York, USA (2003)
24. McNett, M., Voelker, G.M.: Access and mobility of wireless pda users. SIGMOBILE Mob. Comput. Commun. Rev. 7(4), 55–57 (2003)
25. Mitseva, A., Imine, M., Prasad, N.R.: Context-aware privacy protection with profile management. In: Proc. of the 4th international Workshop on Wireless Mobile Applications and Services on WLAN Hotspots, pp. 53–61. ACM Press, NY (2006)
26. Nicholson, A.J., Chawathe, Y., Chen, M.Y., Noble, B.D., Wetherall, D.: Improved access point selection. In: Proc. of the 4th international conference on Mobile systems, applications and services, pp. 233–245. ACM Press, New York (2006)
27. Nielsen, J.: Usability engineering. Academic Press, Inc., Boston, USA (1993)
28. Palmieri, A., Sigona, F.: A QoS management system for multimedia applications in IEEE 802.11 wireless LAN. In: Proc. of the 5th international Conference on Mobile and Ubiquitous Multimedia MUM ’06, vol. 193, ACM Press, New York, USA (2006)
29. Potter, B.: Wireless hotspots: petri dish of wireless security. Comm. ACM 49(6), 5 (2006)
30. Päykkänen, K., Räisänen, H., Isomäki, H.: Mobile studying and social usability on a wireless campus. In: Proc. of the 8th Conference on Human-Computer interaction with Mobile Devices and Services, vol. 159, pp. 269–270. ACM Press, New York (2006)
31. Roto, V., Laakso, K.: Mobile guides for locating network hotspots. In: Workshop on HCI in Mobile Guides (2005)
32. Ryan, C., Gonsalves, A.: The effect of context and application type on mobile usability: an empirical study. In: Proc. of the Twenty-eighth Australasian conference on Computer Science, Australian Computer Society, Inc., pp. 115–124 (2005)
33. Stubblefield, A., Ioannidis, J., Rubin, A.D.: Using the Fluhrer, Mantin, and Shamir Attack to Break WEP. In: Proc. of the Network and Distributed System Security Symposium, Internet Society (2002)
34. Tang, D., Baker, M.: Analysis of a local-area wireless network. In: Proc. of the 6th annual international conference on Mobile computing and networking, pp. 1–10. ACM Press, NY, USA (2000)
35. Venkatesh, V., Ramesh, V., Massey, A.P.: Understanding usability in mobile commerce. Commun. ACM 46(12), 53–56 (2003)
36. Whitten, A., Tygar, J.D.: Why Johnny Can’t Encrypt: A Usability Evaluation of PGP 5.0. In: Proc. of the 8th USENIX Security Symposium, USENIX (1999)
A New Framework of Measuring the Business Values of Software

In Ki Kim1, Beom Suk Jin2, Seungyup Baek3, Andrew Kim4, Yong Gu Ji3,∗, and Myung Hwan Yun1

1 Department of Industrial Engineering, Seoul National University
2 Department of Industrial and Information Engineering, Yonsei University
3 Department of Industrial Engineering, Pennsylvania State University
4 Ubiquitous Computing Laboratory, IBM Korea
{lookat2, mhy}@snu.ac.kr, {kbf2514jin, yongguji}@yonsei.ac.kr, [email protected], [email protected]
Abstract. A new framework for measuring the business values of software is presented. The business values of software are categorized into two groups: tangible and intangible benefits. An implicit approach is used to quantitatively measure the intangible benefit of software by introducing two concepts, product attribute and quality attribute. The approach relates the quantitative values obtained from a usability test to the qualitative, intangible benefits of software. As an example, the proposed framework is applied to a software system in the development stage. Using the usability test, we demonstrate that the framework can quantitatively measure both the intangible and the tangible benefits of software.

Keywords: Software, Business value, Product attribute, Quality attribute, Usability test.
1 Introduction

In the development stage of business application software, many project managers need to minimize the risk that the investment fails. To do so, they usually conduct project reviews that assess the potential benefit of software usability in terms of business value. Many explanations of the potential benefit, however, have been qualitative and somewhat vague. One reason is that the potential benefits of software usability are harder to quantify than the costs. In addition, as cautioned by Fried [1], it is difficult to find hard evidence to support the common-sense expectation that ease of use leads to improved productivity, or to identify the specific ease-of-use characteristics that truly make software easier to use for a majority of users. Furthermore, problems recognized after implementation begins cost considerably more to find and fix than problems found at the requirements and design stage [2].
There are, however, few systematic approaches for evaluating such problems early in software engineering [3]. Therefore, a framework is proposed for quantitatively measuring the business value (BV) of software in conjunction with a usability test. To this end, an implicit approach is applied that introduces several concepts relating the business value to the usability test, which can provide tangible data and related financial value.

The remainder of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 describes the proposed framework in detail. The framework quantifies the intangible benefits among the business values by introducing two concepts, quality attribute and product attribute, and using them to relate the tangible data from the usability test to the intangible value. Section 4 presents and discusses the results of quantitatively measuring the business value of an example software system, named D-software, which is in the development stage. The software system has been developed to support developers of RFID applications by providing a visual representation, namely the Graphical Composition Language (GCL). Finally, concluding remarks are given in Section 5.
2 Literature Reviews

According to Karat [4], usability engineering can help developers produce marketable products that are useful to the organization, the users, and the customers of their products. In this paper, we assume that usability itself has significant advantages from the business perspective. This assumption motivates the development of metrics for quantifying software usability.

The standard economic model, one of the traditional methods for estimating the value of software, assumes that software consists of various design attributes [3]. Under this assumption, the economic model compares the costs and benefits of the implementation properties to estimate the values of the design attributes. This model, however, requires thorough, long-term observation of the trajectory of sales and cost structure to estimate the value of software. In today's competitive industry environment, this is often neither practical nor possible. Thus, several alternative methods have been suggested for analyzing the costs and benefits of software attributes in information systems [5].

Mantei and Teorey [6] tried to tie human factors into the software development lifecycle. They calculated the tangible benefit and cost of applying a human factors approach in the software lifecycle by using the results of task analysis and usability tests. They found that, even on a small scale, task analysis and usability testing can greatly reduce the time and effort required to estimate the total tangible benefit and cost. In addition, a usability test performed prior to the release of the product to market also enables the stakeholders to understand the potential intangible benefit and cost (see [6] for details of the intangible benefit and cost).

Krishnan [7] utilized two concepts, QA (Quality Attribute) and PA (Product Attribute), to quantify the benefits from the results of task analysis and usability tests.
QA can be regarded as abstract values, such as capability and reliability, that customers perceive from a number of PAs. QA can be defined conceptually, and the relative importance of each QA can be estimated from the perspective of the business and the organization. There is, however, neither a predefined set of QAs nor a unique set of importance values for them. The reason is that a new QA can always be introduced as the market environment changes, which in turn can modify the original relative importance of the QAs. According to Bachmann and Bass [8], a QA arises from multiple interactions between multiple PAs and is reflected through specific user interfaces. In contrast to QA, PA is concrete and tangible; it can be a component and/or a function of the software. Each individual PA is closely related to the result of the usability test.
3 Methods

In this section, a framework is proposed for the quantifiable evaluation of the business value of software in conjunction with task analysis, a questionnaire and a usability test.

3.1 Business Values of the D-Software

Survey and interview methods are used to select the key business values of the D-software. First, a pool of business values for RFID-related software is surveyed. Next, five factors for the D-software are identified in the list of business values through an interview with a sales expert (Table 1).

Table 1. Key business values

Business value | Proposition | Operational definition
Cost savings (BV1) | Tangible benefit | The amount of quantifiable savings in the system development and maintenance phases derived from usability-related factors
Ease (BV2) | Intangible benefit | Potential benefits due to ease of development, ease of use and ease of maintenance derived from usability-related factors
Quality (BV3) | Intangible benefit | Potential benefits in the quality of the system derived from usability-related factors
Flexibility (BV4) | Intangible benefit | Potential benefits of responding with agility to external environments and various user requirements, derived from usability-related factors
Extendibility (BV5) | Intangible benefit | Potential benefits that support extending the system on the basis of components, in the light of usability-related factors
To develop a method for the quantifiable assessment of the key business values (BVi, i = 1, ..., 5), the values are categorized into two kinds of benefits according to whether they are financially measurable: 1) tangible benefit, which is financially measurable, and 2) intangible benefit, which is not.

3.2 Usability Evaluation Framework

The D-software is a component-based solution that facilitates RFID applications composed of various physical and logical devices such as RFID tags, readers, motion sensors and servers. Since the D-software is still being developed and revised, it is difficult to estimate and quantify its potential benefit because information such as sales data and market feedback is lacking. Thus, a usability test and task analysis are conducted to quantify the benefit of the D-software. A hierarchical structure model by Seffah et al. [9] is utilized to quantify a user's performance in modeling an RFID application using the D-software (Fig. 1).
Fig. 1. Usability evaluation framework
From a usability test that includes small tasks performed with the D-software, we can measure the time, errors and subjective responses that come directly from the D-software. It is important to divide the whole process of use into subtasks, at a sufficiently fine level, so that each subtask matches a PA. For the usability test, a use case is selected that is utilized in many commercial RFID-based solutions for supply chain management.

3.3 Measuring the Tangible Benefits of Software

For the tangible benefit, two categories in the standard cost structure of software development are considered as the main cost factors in developing an RFID application: 1) development cost and 2) maintenance cost. Development cost can be estimated either from the development scale or from the number of people and the period. Using the estimated number of people and period, the total development cost ($C_{\text{development}}$) is computed as follows:

\[
\begin{aligned}
C_{\text{direct labor}} &= C_{\text{engineer}} \times N_{\text{people}} \times N_{\text{day}} \times N_{\text{month}} \\
C_{\text{overhead}} &= C_{\text{direct labor}} \times w_{\text{overhead}} \\
C_{\text{technology}} &= (C_{\text{direct labor}} + C_{\text{overhead}}) \times w_{\text{technology}} \\
C_{\text{development}} &= C_{\text{direct labor}} + C_{\text{overhead}} + C_{\text{technology}}
\end{aligned}
\tag{1}
\]
where $C_{\text{direct labor}}$ is the direct labor cost, $C_{\text{engineer}}$ the unit cost of engineer payment, $C_{\text{overhead}}$ the overhead expenses, $C_{\text{technology}}$ the technology cost, $N_{\text{people}}$ the number of people, $N_{\text{day}}$ the average number of working days, $N_{\text{month}}$ the number of months, $w_{\text{overhead}}$ the ratio of $C_{\text{overhead}}$ to $C_{\text{direct labor}}$, ranging from 110 to 120%, and $w_{\text{technology}}$ the ratio of $C_{\text{technology}}$ to the sum of $C_{\text{direct labor}}$ and $C_{\text{overhead}}$, ranging from 24 to 40%.

The maintenance cost per year ($C_{\text{maintenance}}$) is estimated using MD (Maintenance Difficulty) and $C_{\text{development}}$. MD is computed from TMC (Total Maintenance Complexity), which is calculated by scoring the frequency of maintenance, the frequency of data manipulation, the interconnectivity to other systems, the required knowledge, and divided transactions. Then, the $C_{\text{maintenance}}$ induced by an RFID application is computed as follows:

\[
\begin{aligned}
\mathrm{MD}(\%) &= 10 + \left( 5 \times \frac{[\mathrm{TMC}]}{100} \right) \\
C_{\text{maintenance}} &= C_{\text{development}} \times \mathrm{MD}
\end{aligned}
\tag{2}
\]
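Eqs. 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function and field names and all input values are hypothetical, and MD is treated as a percentage, so it is divided by 100 before being multiplied by the development cost.

```python
from dataclasses import dataclass


@dataclass
class DevelopmentCost:
    """Cost components of Eq. 1 (hypothetical container for this sketch)."""
    direct_labor: float
    overhead: float
    technology: float

    @property
    def total(self) -> float:
        # C_development = C_direct_labor + C_overhead + C_technology
        return self.direct_labor + self.overhead + self.technology


def development_cost(c_engineer, n_people, n_day, n_month,
                     w_overhead, w_technology) -> DevelopmentCost:
    """Eq. 1: direct labor, overhead and technology cost."""
    direct_labor = c_engineer * n_people * n_day * n_month
    overhead = direct_labor * w_overhead
    technology = (direct_labor + overhead) * w_technology
    return DevelopmentCost(direct_labor, overhead, technology)


def maintenance_difficulty(tmc: float) -> float:
    """Eq. 2: maintenance difficulty MD in percent."""
    return 10.0 + 5.0 * tmc / 100.0


def maintenance_cost(c_development: float, tmc: float) -> float:
    """Eq. 2: yearly maintenance cost; MD (%) is converted to a fraction."""
    return c_development * maintenance_difficulty(tmc) / 100.0


# Illustrative inputs only; none of these values come from the paper.
cost = development_cost(c_engineer=200.0, n_people=3, n_day=20, n_month=6,
                        w_overhead=1.10, w_technology=0.30)
print(f"development cost: {cost.total:.2f}")
print(f"maintenance cost per year: {maintenance_cost(cost.total, tmc=40):.2f}")
```

Keeping the three cost components separate makes it easy to see how the two weights compound: the technology weight is applied to direct labor plus overhead, not to direct labor alone.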
3.4 Measuring the Intangible Benefits of Software

The benefit of a PA comes from two different aspects: the product and the task (T). Assuming a causal relationship between PA and the usability test, the benefit of PAj can be expressed as follows:

\[
B^{j}_{PA} = \frac{\sum_{k=1}^{K} \left( U^{k}_{T} \times X_{jk} \right)}{\sum_{j=1}^{J} \sum_{k=1}^{K} X_{jk}}
\tag{3}
\]

where $B^{j}_{PA}$ represents the benefit of PAj, J and K are the numbers of PAs and tasks, respectively, $U^{k}_{T}$ is the increased usability in task Tk (increased effectiveness, saved time, or increased subjective satisfaction while conducting Tk), and $X_{jk}$ is the random variable representing the relevance between PAj and Tk. The random variable is defined as follows:

\[
X_{jk} =
\begin{cases}
1 & \text{if } PA_j \text{ is fully used during task } T_k \\
0 & \text{otherwise}
\end{cases}
\tag{4}
\]
Since the reference point of comparison is the state before the implementation of PAj, $U^{k}_{T}$ is expressed as a percentage. When PAj is only partially used during a task Tk, it is highly recommended that Tk be divided further into subtasks Tk1 and Tk2, so that one of the subtasks can be attributed to PAj. In short, the benefit of PAj is the averaged sum of the error reduction, time savings or marginal satisfaction for each partitioned task involving PAj. The task analysis serves here to divide the whole task into subtasks involving specific product attributes. From the estimated benefit of PAj (j = 1, ..., J) in Eq. 3, the intangible benefit of each QA is calculated as a linear combination:

\[
B^{l}_{QA} = \sum_{j=1}^{J} \left( B^{j}_{PA} \times C^{l}_{j} \right), \quad l = 1, \ldots, L
\tag{5}
\]

where $B^{l}_{QA}$ represents the intangible benefit of QAl and $C^{l}_{j}$ is the contribution of PAj to QAl, ranging from -1 to 1. The contribution is assessed subjectively by the stakeholders, who should have a thorough understanding of both the PAs and the concept of QA. The range [-1, 1] is intuitive (a negative value means that the PA lessens the QA, and vice versa), as used by Kazman et al. [10]. The assessed contribution values are then normalized to between 0 and 1. The benefit of a QA can sometimes exceed 100 because the assessed contributions of the PAs are independent of each other.
The benefit of QAl is the total reduced time, reduced error or increased satisfaction from all the related product attributes. To integrate the benefits of all QAs into the overall intangible benefit, the relative importance of each QA should be assessed from the business perspective. It is advisable that a cross-functional group composed of project managers, sales persons, and decision makers assess the relative importance of each QA collectively. The total intangible benefit (TIB) of the software is estimated as follows:

\[
\mathrm{TIB} = \sum_{l=1}^{L} \left( B^{l}_{QA} \times I^{l}_{QA} \right)
\tag{6}
\]

where $I^{l}_{QA}$ is the relative importance of QAl, so that $\sum_{l=1}^{L} I^{l}_{QA} = 1$.
4 Results

4.1 Usability Test and Task Analysis of the D-Software

The usability test evaluates user satisfaction and performance by means of a questionnaire and measurements in a small-group test. In particular, the test focuses on the modeling of an RFID application using the D-software. The subject group is composed of 32 undergraduate and graduate students in software engineering (28 male, 4 female). Subjective responses from the questionnaire are also collected. To set the reference point of usability, two experts, each with more than three years of RFID programming experience and an MS degree in software engineering, performed the same use case without the D-software (using manual coding), separately from the usability test. We analyzed user satisfaction (qualitative) and performance (quantitative) to understand the degree of usability of the D-software through the usability evaluation model. From the observation of modeling with the D-software and with manual coding, the task analysis yields the following modeling stages: 1) component searching, 2) component modeling, 3) component manipulation, 4) model searching, 5) error detection and correction, and 6) attaching related library supports. Table 2 below shows the metrics of usability testing.

Table 2. Metrics of usability testing

Metric | Description
Component searching time (CST) | Time spent searching for and choosing an appropriate component
Component manipulation time (CMT) | Time spent arranging components
Component connection time (CCT) | Time spent connecting components
Model searching time (MST) | Time the subject spends searching for a conceptual model with which to construct the system
Number of deleted components (NDC) | Number of components deleted during task completion
Frequency of code line deletion (FOCLD) |
Frequency of copy & paste of previously made codes (FOCP) |
Frequency of adding description (FOAD) |
4.2 Tangible Benefit of the D-Software
Table 3 below shows the time consumed during each modeling stage, calculated from the measured times and frequencies, where τdel is the unit time to delete a component or a line, τcp the unit time to copy and paste a block of code, and τdes the unit time to add one line of description.

Table 3. Consumed time during each modeling stage

Modeling stages
Component searching (CST)
Component modeling (CCT)
Component manipulation (CMT)
Model searching (MST)
Error correction (¹NDC × τdel; ²FOCLD × τdel)
Copy & Paste / Descriptions (²FOCP × τcp + FOAD × τdes)

Total modeling time is computed as the sum of the times consumed in the six modeling stages of the usability test. The total modeling time was 779.79 sec (= 0.217 hr) using the D-software and 6645 sec (= 1.846 hr) using manual coding. The total modeling time can be used to calculate the savings in the development cost $C_{\text{development}}$ in Eq. 1. The daily working time is set to 8 hrs. It is assumed that the decrease in productivity caused by increasing model size is relatively low in D-software modeling, so it is not applied to the cost of the D-software. For a conservative estimate of the cost savings, we assume that a general use case amounts to 30,000 lines of code. To eliminate the cost effects of the programming language, we assume that GCL in the D-software is at a similar level of difficulty to Visual Basic, HTML, Delphi, etc. We also assume that relatively little Java code is used for D-software modeling. Then, the development cost $C_{\text{development}}$ is calculated using Eq. 1. The overhead expenses $C_{\text{overhead}}$ usually amount to 110~120% of the direct labor cost. The technology cost $C_{\text{technology}}$ usually amounts to 20~40% of the direct labor cost plus overhead expenses. As a result, the saving in the development cost $C_{\text{development}}$ achieved by the D-software compared to manual coding in C# is estimated to be greater than 93% in the use case.

To estimate the maintenance cost $C_{\text{maintenance}}$ of the D-software system relative to manual coding, the total maintenance complexity (TMC) is set to 35 for the D-software and 40 for manual coding in C#. The major difference between the D-software and manual coding in terms of operation and maintenance is that the D-software requires much less knowledge of hardware devices and of the code itself. As a result, the TMC of the D-software is 5 points lower than that of manual coding.
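The figures quoted above can be checked with a short Python sketch. It uses only the modeling times and TMC values stated in the text; the helper names are hypothetical, and the development cost ratio of 0.07 is an assumption that takes the reported "greater than 93%" saving at its lower bound.

```python
SECONDS_PER_HOUR = 3600.0

# Total modeling times reported in the text (seconds).
T_DSOFTWARE_S = 779.79
T_MANUAL_S = 6645.0


def maintenance_difficulty(tmc: float) -> float:
    """Eq. 2: maintenance difficulty MD in percent."""
    return 10.0 + 5.0 * tmc / 100.0


def maintenance_cost_ratio(dev_cost_ratio: float, tmc_a: float, tmc_b: float) -> float:
    """Maintenance cost of system A relative to system B, given the ratio of
    their development costs (Eq. 2 applied to both systems)."""
    return dev_cost_ratio * maintenance_difficulty(tmc_a) / maintenance_difficulty(tmc_b)


hours_dsoftware = T_DSOFTWARE_S / SECONDS_PER_HOUR
hours_manual = T_MANUAL_S / SECONDS_PER_HOUR
raw_time_saving = 1.0 - T_DSOFTWARE_S / T_MANUAL_S

# Assumption: development cost ratio of 0.07, i.e. a saving of exactly 93%.
ratio = maintenance_cost_ratio(dev_cost_ratio=0.07, tmc_a=35, tmc_b=40)

print(f"{hours_dsoftware:.3f} hr vs {hours_manual:.3f} hr")
print(f"raw modeling-time saving: {raw_time_saving:.1%}")
print(f"MD: {maintenance_difficulty(35):.2f}% vs {maintenance_difficulty(40):.2f}%")
print(f"maintenance cost saving at the 93% bound: {1.0 - ratio:.1%}")
```

The measured times alone give a saving of about 88%; the larger development cost saving reported in the text also depends on the scaling assumptions listed there (model size, lines of code, language difficulty), which this sketch does not model. Because MD differs only slightly between TMC = 35 and TMC = 40 (11.75% versus 12%), the maintenance cost saving stays close to the development cost saving.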
Here, we assume that the number of maintenance events is below 4 per year (score = 0), that the number of data transactions is below 500,000 per year (score = 10), that there are more than 3 interconnections to other systems (score = 10), and that divided transactions are in the integrated state (score = 10). We assume that the life cycle of the product is 5 years from the completion of development. The product is purchased upon completion of development, which is set as year 0. Maintenance is performed every year except years 0 and 5. The discount rate is assumed to be 6% per year. The saving in the maintenance cost $C_{\text{maintenance}}$ achieved by the D-software compared to manual coding in C# is estimated to be greater than 93% in the use case.

4.3 Intangible Benefit of the D-Software
Before estimating the intangible benefit of the D-software, the key PAs and QAs should be identified. Five PAs, which are the specific and representative characteristics embodied in the D-software, are identified by the developers: 1) GCL (Graphical Composition Language) (PA1), 2) State machine (PA2), 3) Component library (PA3), 4) Deployment (PA4), and 5) Code generator (PA5). Survey and interview methods are used to select the major QAs. First, candidate quality attributes used and emphasized in similar RFID-related software are collected. Next, four key QAs are identified through interviews with a sales expert and a project manager. The resulting list of QAs is the same as the four business values that constitute the intangible benefit in Table 1.

The first step is to identify and quantify the relative importance of QAl, $I^{l}_{QA}$, for all l in Eq. 6. $I^{l}_{QA}$ can be assigned from the managerial perspective; that is, the value of each $I^{l}_{QA}$ is obtained from a focus group interview (FGI) with a sales manager. According to the sales manager, Ease in development, use and maintenance (QA1) is the most important of the four QAs. Quality (QA2) and Flexibility (QA3) are equally important, and less so than QA1. Finally, Extendibility (QA4) has the least importance. From these comments, the values of $I^{l}_{QA}$ for l = 1, ..., 4 are obtained as follows:

Table 4. The relative importance of four qualitative attributes

QA | Ease (QA1) | Quality (QA2) | Flexibility (QA3) | Extendibility (QA4)
Relative importance | $I^{1}_{QA} = 0.5$ | $I^{2}_{QA} = 0.2$ | $I^{3}_{QA} = 0.2$ | $I^{4}_{QA} = 0.1$
Next, the contribution of each PAj to each QAl is derived on a scale of 0 to +1 from the subjective evaluation of a programmer. The resulting J × L matrix, where J = 5 and L = 4, is the contribution matrix $C^{l}_{j}$ (j = 1, ..., 5; l = 1, ..., 4) in Eq. 5.

Table 5. The contribution of PA to QA

$C^{l}_{j}$ | QA1 | QA2 | QA3 | QA4
PA1 | 0.70 | 0.40 | 0.70 | 0.50
PA2 | 0.70 | 0.20 | 0.70 | 0.20
PA3 | 0.80 | 0.10 | 0.70 | 0.50
PA4 | 0.60 | 0.20 | 0.60 | 0.70
PA5 | 0.90 | 0.30 | 0.20 | 0.20
We analyzed user satisfaction (qualitative) and performance (quantitative) to understand the degree of usability of the D-software through the usability evaluation framework. The results of the experiments are as follows. To simplify the analysis, the set of tasks is assumed to be identical to the set of modeling stages shown in Table 3. The increased usability in task Tk, $U^{k}_{T}$, is then as follows:

Table 6. The increased usability in a task Tk

Task | Increased usability
Component searching (T1) | $U^{1}_{T} = 0.85$
Component modeling (T2) | $U^{2}_{T} = 0.95$
Component manipulation (T3) | $U^{3}_{T} = 0.82$
Model searching (T4) | $U^{4}_{T} = 0.89$
Error correction (T5) | $U^{5}_{T} = -1.82$
Copy & Paste / Descriptions (T6) | $U^{6}_{T} = 1.00$
The random variable $X_{jk}$ defined by Eq. 4 is shown in Table 7. The benefit of each PA, defined by Eq. 3, is computed using $U^{k}_{T}$ in Table 6 and $X_{jk}$.

Table 7. The relevance between PA and T

$X_{jk}$ | T1 | T2 | T3 | T4 | T5 | T6 | $B^{j}_{PA}$
PA1 | 1 | 1 | 1 | 0 | 1 | 0 | 0.05
PA2 | 0 | 1 | 0 | 1 | 1 | 1 | 0.06
PA3 | 1 | 1 | 0 | 1 | 0 | 0 | 0.16
PA4 | 0 | 1 | 0 | 0 | 0 | 1 | 0.11
PA5 | 0 | 1 | 1 | 0 | 1 | 1 | 0.06
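As a check on Eq. 3, the following Python sketch recomputes the PA benefits from the values in Tables 6 and 7. The function and variable names are hypothetical; the data are copied from the two tables.

```python
# Increased usability per task T1..T6 (Table 6).
U_T = [0.85, 0.95, 0.82, 0.89, -1.82, 1.00]

# Relevance X_jk between each PA and tasks T1..T6 (Table 7).
X = {
    "PA1": [1, 1, 1, 0, 1, 0],
    "PA2": [0, 1, 0, 1, 1, 1],
    "PA3": [1, 1, 0, 1, 0, 0],
    "PA4": [0, 1, 0, 0, 0, 1],
    "PA5": [0, 1, 1, 0, 1, 1],
}


def pa_benefits(u_t, x):
    """Eq. 3: benefit of each PA, normalized by the sum of all X_jk."""
    total_relevance = sum(sum(row) for row in x.values())
    return {
        pa: sum(u * x_jk for u, x_jk in zip(u_t, row)) / total_relevance
        for pa, row in x.items()
    }


benefits = pa_benefits(U_T, X)
for pa, value in benefits.items():
    print(pa, round(value, 2))
```

Rounded to two decimals, the results match the last column of Table 7 (0.05, 0.06, 0.16, 0.11, 0.06). Note that the denominator is the total number of PA-task pairings (17 here), so every PA is normalized by the same constant.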
The intangible benefit of each QA, defined by Eq. 5, is computed using $B^{j}_{PA}$ in Table 7 and $C^{l}_{j}$ in Table 5. We can then quantitatively compute the total intangible benefit (TIB) of the software, defined by Eq. 6, using $I^{l}_{QA}$ in Table 4 and $B^{l}_{QA}$ in Table 8. The resulting total intangible benefit (TIB) of the D-software is 0.25.

Table 8. The benefit of QA

QA | QA1 | QA2 | QA3 | QA4 | TIB
$B^{l}_{QA}$ | 0.33 | 0.09 | 0.27 | 0.21 | 0.25
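Eqs. 5 and 6 can be traced with the following Python sketch, which uses the rounded PA benefits from Table 7, the contribution matrix from Table 5 and the importance weights from Table 4. The function names are hypothetical, and the QA benefit is taken as a sum over the PAs for each QA.

```python
# Benefit of each PA (Table 7, rounded values).
B_PA = [0.05, 0.06, 0.16, 0.11, 0.06]

# Contribution of PA_j (rows) to QA_l (columns), from Table 5.
C = [
    [0.70, 0.40, 0.70, 0.50],  # PA1
    [0.70, 0.20, 0.70, 0.20],  # PA2
    [0.80, 0.10, 0.70, 0.50],  # PA3
    [0.60, 0.20, 0.60, 0.70],  # PA4
    [0.90, 0.30, 0.20, 0.20],  # PA5
]

# Relative importance of QA1..QA4 (Table 4); the weights sum to 1.
I_QA = [0.5, 0.2, 0.2, 0.1]


def qa_benefits(b_pa, c):
    """Eq. 5: benefit of each QA as a weighted sum over the PAs."""
    n_qa = len(c[0])
    return [sum(b * row[l] for b, row in zip(b_pa, c)) for l in range(n_qa)]


def total_intangible_benefit(b_qa, i_qa):
    """Eq. 6: importance-weighted sum of the QA benefits."""
    return sum(b * i for b, i in zip(b_qa, i_qa))


b_qa = qa_benefits(B_PA, C)
tib = total_intangible_benefit(b_qa, I_QA)
print([f"{b:.3f}" for b in b_qa])
print(f"TIB = {tib:.2f}")
```

With these rounded inputs, QA2 to QA4 come out at about 0.088, 0.267 and 0.206, and the TIB at about 0.25, in line with Table 8. QA1 comes out at about 0.325, which agrees with the reported 0.33 only up to rounding of the inputs.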
5 Concluding Remarks

This paper has presented a framework for measuring the business value of software. An implicit approach that introduces five product attributes and four quality attributes makes it possible to quantitatively measure the intangible benefits of software, which have usually been assessed qualitatively. Therefore, all kinds of business values, classified into tangible and intangible benefits, can be quantitatively measured. This enables project managers to evaluate a software development project in a quantitative way. The result of an example, which tested the D-software in the development stage, showed that the proposed framework can be used to quantitatively evaluate the business value of software. This will help project managers reduce the risk of investment failure.
References

1. Fried, L.: Nine principles for ergonomic software. Datamation 28(12), 163–166 (1982)
2. Boehm, B., Basili, V.R.: Software Defect Reduction Top 10 List. Computer 34(1), 135–137 (2001)
3. Scaffidi, C., Arora, A., Butler, S., Shaw, M.: A Value-Based Approach to Predicting System Properties from Design. In: The seventh international workshop on Economics-driven software engineering research, Missouri, St. Louis, ACM Press, New York (2005)
4. Karat, C.M.: Usability Engineering in dollars and cents. IEEE Software 10(3), 88–89 (1993)
5. Sasson, P.G.: Cost benefit analysis of information systems: a survey of methodologies. ACM SIGOIS Bulletin 9(2-3), 126–133 (1988)
6. Mantei, M.M., Teorey, T.J.: Cost/benefit analysis for incorporating human factors in the software lifecycle. Communications of the ACM 31(4), 428–439 (1988)
7. Krishnan, M.S.: Cost, Quality and User Satisfaction of Software Products: An Empirical Analysis. In: CASCON '93. Toronto, Ont. Canada Nat. Res. Council of Canada (1993)
8. Bachmann, F., Bass, L.: Introduction to the Attribute Driven Design Method. In: The 23rd International Conference on Software Engineering. ICSE 2001. Toronto, Ont. Canada IEEE Comput. Soc. (2001)
9. Seffah, A., Donyaee, M., Kline, R.B., Padda, H.K.: Usability measurement and metrics: a consolidated model. Software Quality Journal 14(2), 159–178 (2006)
10. Kazman, R., Asundi, J., Klein, M.: Quantifying the costs and benefits of architectural decisions. In: The 23rd International Conference on Software Engineering. ICSE 2001. Toronto, Ont. Canada IEEE Comput. Soc. (2001)
Evaluating Usability Evaluation Methods: Criteria, Method and a Case Study

P. Koutsabasis, T. Spyrou, and J. Darzentas

University of the Aegean, Department of Product and Systems Design Engineering, Hermoupolis, Syros, Greece, GR-84100
Tel.: +30 22810 97100, Fax: +30 22810 97109
{kgp, tsp, idarz}@aegean.gr
Abstract. The paper proposes an approach to comparative usability evaluation that incorporates important relevant criteria identified in previous work. It applies the proposed approach to a case study of a comparative evaluation of an academic website employing four widely-used usability evaluation methods (UEMs): heuristic evaluation, cognitive walkthroughs, think-aloud protocol and co-discovery learning.

Keywords: Usability evaluation methods, comparative usability evaluation, case study.
2 Related Work

A comparative usability evaluation involves multiple evaluators or evaluation teams that employ one or more UEMs to carry out parallel evaluations of the same target system. There are few comparative evaluations in the HCI literature. Hertzum and Jacobsen [5] present a comparative study of eleven UEM evaluations carried out with three of the four methods studied in this paper, namely CW, HE, and T-AP. Their results show that the average agreement between any two evaluators who have evaluated the same system using the same UEM ranges from 5% to 65%, and that none of the three UEMs is in general more consistent than the others. Unfortunately, Hertzum and Jacobsen could not find studies in which heuristic evaluation was performed by evaluators who aggregated the results of their individual inspections into a group output (which is the case in our study). Heuristic evaluations are usually applied by a group of inspectors or users, and the individual results are then aggregated [12].

Van den Haak et al (2004) compare T-AP and C-D in testing the usability of online library catalogues. The UEMs were compared on four criteria related to digital libraries: number and type of usability problems detected; relevance of the problems detected; overall task performance; and participant experiences. The study involved 80 students. The main result was that the UEMs revealed similar numbers and types of problems that were equally relevant.

Molich et al [9] report on the results of a comparative evaluation of a single web site (Hotmail) by nine professional teams. The goal of this study was to investigate the consistency of the results obtained. Each team was left free to select its own UEM and carry out the evaluation according to its work practices.
The results of this evaluation are quite surprising: a large proportion (75%, 232 of 310) of the usability problems identified were unique to a single team, while only two usability problems of the target system were reported by six or more teams. Other comparative evaluations with different foci are presented in [3] [4] [6] and [10].

These comparative studies differ in terms of goals and the criteria used to compare evaluator performance and/or UEMs. Of particular interest for comparative evaluation work is the work of Hertzum and Jacobsen [5], who investigate the evaluator effect in usability evaluations. The term denotes the fact that multiple evaluators evaluating the same interface with the same usability evaluation method detect markedly different sets of problems [6]. They [5] propose three generic guidelines to minimize the evaluator effect:

▪ Be explicit on goal analysis and task selection.
▪ If it is important to the success of the evaluation to find most of the problems in a system, then we strongly recommend using more than one evaluator.
▪ Reflect on your evaluation procedures and problem criteria.

The work presented in this paper contributes to related work by synthesising a general set of criteria from previous work into a structured approach for comparative usability evaluations. Furthermore, it presents a case study of a comparative usability evaluation that provides various insights about the UEMs employed.
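The agreement and uniqueness figures discussed in this section can be computed from the sets of problems reported by each evaluator or team. The Python sketch below is one common formulation, assumed here for illustration: agreement between two evaluators is the number of problems both found divided by the number either found, averaged over all pairs. The function names and the example problem sets are hypothetical.

```python
from itertools import combinations


def any_two_agreement(problem_sets):
    """Average overlap |A & B| / |A | B| over all pairs of evaluators.

    Assumes at least two evaluators and that each pair reports at least
    one problem between them.
    """
    pairs = list(combinations(problem_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def unique_problem_ratio(problem_sets):
    """Share of all reported problems that were found by exactly one evaluator."""
    all_problems = set().union(*problem_sets)
    unique = [p for p in all_problems
              if sum(p in s for s in problem_sets) == 1]
    return len(unique) / len(all_problems)


# Hypothetical problem identifiers reported by three teams.
teams = [
    {"p1", "p2", "p3"},
    {"p2", "p3", "p4"},
    {"p3", "p5"},
]

print(f"any-two agreement: {any_two_agreement(teams):.2f}")
print(f"unique problems:   {unique_problem_ratio(teams):.0%}")
```

In this toy example three of the five distinct problems are reported by a single team, a uniqueness ratio of 60%; the ratio quoted above for the Hotmail study, 232 of 310, corresponds to about 75%.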
3 A Structured Approach for Comparative Usability Evaluations: Criteria and Process

3.1 Criteria for Comparative Usability Evaluations

The criteria that can be taken into account for comparative usability evaluations can be distinguished by whether they refer to the evaluation target or to the UEMs themselves. An example of the first category is [17], which evaluates a web-based digital library focusing on layout, terminology, data entry and comprehensiveness. However, criteria related to the target system differ considerably between systems that follow different user interface paradigms. On the other hand, there are also generic criteria that refer to the UEMs rather than the target system. Among these (a useful review is provided by [4]), the paper identifies the following as most important:

Realness (or relevance) refers to whether a usability finding is a real usability problem or not (or to what degree, i.e. a severe or an unimportant problem). According to [4], the realness of usability findings can be determined by: a) comparison with a standard usability problem list; b) expert review and judgement; and c) end-user review and judgement. Each approach has advantages and drawbacks regarding applicability, cost-effectiveness and trustworthiness. In this respect, further research includes severity ratings [11] and combinations of severity and probability of occurrence [15].

Validity (or accuracy) can be defined as the ratio of the number of real usability problems to the total number of findings (i.e. real problems and 'false alarms') for each application of a UEM [4] [16].

Thoroughness (or completeness) is identified in [4] and [16] as the ratio of the number of (real) usability problems found by the application of a UEM to the total number of usability problems that exist in the target system.
Obviously thoroughness requires that the total number of real problems has been identified through a detailed cross-examination of the results produced by all UEMs.
Effectiveness. The criterion of effectiveness for UEMs has been treated as synonymous with thoroughness and validity of usability findings by most related work [2] [4] [8]; this is also in line with the definition of effectiveness in the ISO 9241-11 standard for usability as the ‘accuracy and completeness with which users achieve specified goals’. Thus, the effectiveness of UEMs can be identified as the product of thoroughness and validity [4]. Some related work extends the definition of effectiveness further by adding the predictive power of UEMs in relation to the uptake of usability findings by developer teams [7] [8]. The latter perspective has to cope with additional methodological considerations, not only about the persuasiveness of usability findings reporting, but also about the nature of the usability findings themselves: e.g. ‘objective’ usability problems (such as broken links in a web site) are far more likely to be addressed by development teams than ‘subjective’ findings (such as findings related to terminology), which are the most difficult to explain in usability reporting anyway.
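The three ratio-based criteria defined above can be written compactly. The notation is introduced here for illustration only and is not taken from [4] or [16]: F denotes the set of findings reported by one application of a UEM, and R the set of real usability problems that exist in the target system.

```latex
\[
\text{validity} = \frac{|F \cap R|}{|F|}, \qquad
\text{thoroughness} = \frac{|F \cap R|}{|R|}, \qquad
\text{effectiveness} = \text{thoroughness} \times \text{validity}
\]
```

All three measures lie between 0 and 1. A UEM that reports every real problem and no false alarms has an effectiveness of 1.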
P. Koutsabasis, T. Spyrou, and J. Darzentas
Consistency has been related to reliability [4] and repeatability [13]. In our work, we use a working definition of consistency of UEMs in terms of repeatability, as the extent to which multiple applications of different usability inspection methods produce ‘reasonably similar’ results. This working definition is similar to the approach of [9]. Again, means for trustworthy interpretation of the similarity of usability findings are required, and this need may be addressed in the same ways as the realness problem.
3.2 Essential Process Steps for Carrying Out Comparative Usability Evaluations
The set-up and carrying out of any comparative usability evaluation first of all needs to control, as far as possible, the aspects of the experiment that are related to the evaluator effect, and thus to conform to the guidelines proposed by [5]. Furthermore, the processing of results needs to ensure effective decision making about the realness or relevance of the results and about the similarity of the results obtained by the parallel usability evaluations. In order to address the issues above, we propose the following guidelines for comparative usability evaluations:
Ensure common conditions when carrying out the usability evaluations: A number of issues related to the preparation and carrying out of the parallel usability evaluations need to be addressed uniformly for each evaluation. In particular:
▪ Select evaluators that have a similar level of experience in usability evaluations. This can be achieved by selecting professional evaluators for carrying out the experiments. When this is not possible and novice evaluators must participate, then ensure that they work in teams and that they are closely supervised. Having more than one evaluator carry out a usability evaluation is also proposed by [5] to maximise the number of results that can be obtained; when novice evaluators are employed, working in teams can also help them resolve issues about the carrying out of the usability evaluation, provided they are supervised by an experienced evaluator.
▪ Assign UEMs to evaluators according to their experience. It is generally better to allow evaluators to select a method in which they have experience or which they feel most comfortable using.
▪ Provide a common set of tasks to carry out. Unless a common set of tasks is provided, there is no way to ensure that evaluators have examined the same or at least similar areas of the target system.
▪ Provide a common format for documenting and reporting usability findings. As Hartson et al [4] remark “many UEMs are designed to detect usability problems but problem reporting is left to the developers/evaluators using the UEM... problem report quality will vary greatly according to the skills of the individual reporter at communicating complete and unambiguous problem reports.” Uniform reporting of usability problems can significantly aid the processing of results, especially in the case of parallel usability evaluations.
Ensure effective decision making when processing the results of multiple usability evaluations: In particular,
▪ Select criteria for comparative usability evaluation: As discussed in related work, there are various criteria that can be considered for comparative evaluations. We make use of the criteria list presented in section 3.1, in order to draw more general conclusions about UEMs. However, aspects related to the target system affect the performance of UEMs, such as the user interface paradigm (e.g. hypertext, WIMP, 3D, etc.) and the level of maturity of the target system (e.g. application or prototype). For example, it has been argued that usability inspection methods may be more appropriate for finding problems in the early design stage of an interactive system [3]. Therefore, any conclusions drawn need to be interpreted carefully in the context of the particular class of target systems.
▪ Select a decision criterion for relevance of usability findings: making decisions about the realness or relevance of usability findings has been addressed in various ways, as discussed above. We have addressed the relevance problem in a two-stage approach: first, the evaluation teams provided, as part of their documentation, their argumentation for each usability finding; secondly, all usability findings were rated by an experienced usability evaluator (the first author of the paper) on a three-point severity scale: 0: not a problem; 1: minor problem; 2: serious problem.
▪ Select a decision criterion for similarity of usability findings: when all evaluations are available, there is a need to go through the reports in order to identify the similarity of usability findings. Again, we followed an expert-based approach for this task, which is the most usual condition in parallel usability evaluations. In this case it is however generally advisable that more than one expert performs this task. Van den Haak et al [17] have used five experts to interpret the results of their comparative study. However, there are various practical problems with involving more than a single expert. Going through all evaluation reports, processing the large pool of data in terms of relevance and similarity, and resolving ambiguities and disagreements requires a lot of synchronous work. Therefore, we have used a single expert to go through the data, as have others (e.g. [9]).
4 A Case Study of Comparative Usability Evaluation
4.1 Evaluation Object
The web site evaluated is that of the Department of Product and Systems Design Engineering (www.syros.aegean.gr), University of the Aegean, and has been operating since September 2000. The web site was designed to address the emerging needs of the new department and has since been extended by the addition of web-based subsystems (both open source and in-house developments) for the support of administrative and teaching tasks.
4.2 Participants
The usability evaluations were carried out by the MSc students of the department in partial fulfilment of their obligations for the course on interaction design.
The students have a wide range of design backgrounds, having graduated from departments such as arts, graphic design, industrial engineering and information systems. Only two (out of a total of 27) students had limited experience of usability from their bachelor studies and had carried out a usability evaluation before. However, all students had considerable knowledge of the web site since they had been using it repeatedly. Thus the selected subjects had a similar level of usability experience (novice) but a good knowledge of the target system. According to Nielsen [10], who reports in the context of heuristic evaluation: “usability specialists are better than non-specialists at performing heuristic evaluation, and “double experts” with specific expertise in the kind of interface being evaluated perform even better”. Thus the lack of previous experience of the selected subjects in usability evaluations was partly compensated by their good knowledge of the target system. Furthermore, the progress of the exercise was reviewed in weekly sessions with all teams in order to allow for resolution of queries and to guide the smooth progress of the usability evaluations. Finally, the fact that each evaluation was carried out by a team instead of a single novice designer encouraged critical discussion and group decision making about the findings of the usability evaluation.
4.3 Tasks and Methods Selected
Each evaluation team was assigned one of four usability evaluation methods, namely heuristic evaluation (HE - 3 teams), cognitive walkthroughs (CW - 3 teams), think-aloud protocol (T-AP - 3 teams) and co-discovery learning (C-D - 1 team), according to its degree of confidence in carrying out a usability evaluation with each one of these methods. All four methods are widely used in industry and academia for usability evaluation.
The evaluation teams were provided with an analytic template for documenting the results, which included a table of contents for the usability report and a predefined categorization of types of usability problems. The evaluation teams were asked to test the system by following two given user tasks:
▪ For a student, to locate information about a specific course: course description, instructor and online notes.
▪ For a visitor of the department, to locate necessary information about visiting the department at Hermoupolis, Syros, Greece: the map of the town, accommodation information and travel information.
The evaluation teams were given a two-month period to organise, carry out and document the usability evaluation. Their main deliverables were the usability report and the presentation of their results in a discussion session open to all.
4.4 Results
Realness or relevance: The realness of usability findings (Table 1) is generally high for most methods, even reaching 100% in one case of HE. However, three applications of UEMs produced a rather large number of false (not real) usability findings: HE2, CW2 and C-D1. The fact that this variability appeared in three different UEMs leads to the conclusion that it cannot be safely related to intrinsic characteristics of the methods themselves but rather to the inexperience of the evaluation teams.
Table 1. Realness of usability findings and severity ratings
UEM      Usability findings   0: not a problem (count)   0: not a problem (%)
HE1      18                   1                          5.6%
HE2      28                   8                          28.6%
HE3      14                   0                          0.0%
CW1      21                   3                          14.3%
CW2      24                   6                          25.0%
T-AP1    21                   1                          4.8%
T-AP2    18                   1                          5.6%
T-AP3    17                   3                          17.6%
C-D1     39                   10                         25.6%
Total    200
Validity: The validity of UEMs (Table 2) can be directly measured from the process of identifying the realness (or relevance) of usability findings. Bearing in mind that the evaluator teams had little experience in usability evaluations, the validity of UEMs was quite satisfactory apart from the three applications of methods that were discussed above.
Table 2. Validity of Usability Evaluation Methods
UEMs: HE1 HE2 HE3 CW1 CW2 T-AP1 T-AP2 T-AP3 C-D1
Thoroughness: Thoroughness can be specified as the total number of real usability problems identified by each UEM divided by the total number of real problems that exist in the system, which is the sum of unique real problems identified by all methods. Eight of the nine UEMs demonstrated similar performance regarding the thoroughness measure (Table 3): they identified about 1/4 to 1/5 of the total number of usability problems found throughout the system. The last UEM (co-discovery learning) resulted in an impressive (in comparison to the other UEMs) 41.4% of usability problems identified.
Effectiveness: The effectiveness of UEMs can be identified as the product of thoroughness and validity (Table 4). The effectiveness of UEMs has demonstrated wide-ranging results:
▪ Five out of nine UEMs identified about 1/4-1/5 of the total number of usability problems effectively (HE1: 22.9%; HE3: 20%; CW1: 22%; T-AP1: 24.6%; T-AP2: 22.9%)
▪ Another three out of nine methods identified about 1/6 of the total number of usability problems effectively (HE2: 14.7%, CW2: 17.2% and T-AP3: 16.5%).
▪ Only one UEM identified almost 1/3 of the total number of usability problems effectively (C-D1: 30.8%)
The overall results about the effectiveness of UEMs are unsatisfactory with regard to one of the central questions in usability evaluation: whether the application of a single UEM can identify a considerable number of usability problems. This was also shown by the comparative usability evaluation work of [9], which uses professional design teams. A second interesting result, regarding the comparison of the effectiveness of the UEMs themselves, is that the co-discovery learning method was considerably more effective than all other methods. Thus, it seems that this method helps young teams to perform better than the other three methods do. On the other hand, the fact that only one team selected this method constrains the safety of the conclusion, which can be further pursued in other comparative usability evaluations.
Table 3. Thoroughness of Usability Evaluation Methods
UEM      Total number of real usability problems
HE1      17
HE2      17
HE3      14
CW1      18
CW2      17
T-AP1    19
T-AP2    17
T-AP3    14
C-D1     29
Total number of usability problems that exist in the system: 70
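The effectiveness percentages quoted above can be reproduced from the counts in Tables 1 and 3. The following is a minimal sketch rather than the authors' own tooling. It assumes that validity is the Table 3 count of real problems divided by the Table 1 count of findings, and that the total number of problems in the system is 70 (the total given in Table 5). All names in the code are illustrative.

```python
# Counts of reported findings per UEM application (Table 1).
FINDINGS = {"HE1": 18, "HE2": 28, "HE3": 14, "CW1": 21, "CW2": 24,
            "T-AP1": 21, "T-AP2": 18, "T-AP3": 17, "C-D1": 39}

# Counts of real usability problems per UEM application (Table 3).
REAL = {"HE1": 17, "HE2": 17, "HE3": 14, "CW1": 18, "CW2": 17,
        "T-AP1": 19, "T-AP2": 17, "T-AP3": 14, "C-D1": 29}

# Assumption: total number of unique real problems, taken from Table 5.
TOTAL_REAL_IN_SYSTEM = 70


def validity(uem):
    """Real problems found divided by all findings reported."""
    return REAL[uem] / FINDINGS[uem]


def thoroughness(uem):
    """Real problems found divided by all real problems in the system."""
    return REAL[uem] / TOTAL_REAL_IN_SYSTEM


def effectiveness(uem):
    """Product of thoroughness and validity."""
    return thoroughness(uem) * validity(uem)


def effectiveness_table():
    """Effectiveness per UEM application, as a percentage with one decimal."""
    return {uem: round(100 * effectiveness(uem), 1) for uem in FINDINGS}


if __name__ == "__main__":
    for uem, pct in effectiveness_table().items():
        print(f"{uem}: {pct:.1f}%")
```

Under these assumptions the sketch yields the values reported in the text, for example 22.9% for HE1 and 30.8% for C-D1.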
Consistency: The consistency of UEMs was not satisfactory (Table 5). Half of the usability problems found (50.0%) were uniquely reported by the application of just one UEM. Furthermore, another 22.9% of the usability problems (between 1/4 and 1/5 of the total) were found by only 2 of the 9 teams. Moreover, there was not a single usability problem that was identified by all UEMs.
Table 5. Consistency across UEMs
                                      Number   %
Total number of usability problems    70
... found by 9 teams / UEM            0        0.0%
... found by 8 teams / UEM            1        1.4%
... found by 7 teams / UEM            3        4.3%
... found by 6 teams / UEM            5        7.1%
... found by 5 teams / UEM            0        0.0%
... found by 4 teams / UEM            5        7.1%
... found by 3 teams / UEM            5        7.1%
... found by 2 teams / UEM            16       22.9%
... found by 1 team / UEM             35       50.0%
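The percentages in Table 5 follow directly from the counts. The short sketch below recomputes them, together with the share of problems that were reported by at most two teams. Function and variable names are illustrative and not part of the original study.

```python
# Number of usability problems reported by exactly k teams (Table 5).
PROBLEMS_BY_TEAM_COUNT = {9: 0, 8: 1, 7: 3, 6: 5, 5: 0, 4: 5, 3: 5, 2: 16, 1: 35}


def share_by_team_count(counts):
    """Percentage of all problems reported by exactly k teams, for each k."""
    total = sum(counts.values())
    return {k: round(100 * n / total, 1) for k, n in counts.items()}


def share_found_by_at_most(counts, max_teams):
    """Percentage of all problems reported by no more than max_teams teams."""
    total = sum(counts.values())
    found = sum(n for k, n in counts.items() if k <= max_teams)
    return round(100 * found / total, 1)


if __name__ == "__main__":
    for k, pct in share_by_team_count(PROBLEMS_BY_TEAM_COUNT).items():
        print(f"found by {k} team(s): {pct:.1f}%")
    print("found by at most 2 teams:",
          share_found_by_at_most(PROBLEMS_BY_TEAM_COUNT, 2), "%")
```

Read this way, the low consistency is easy to quantify: problems found by only one or two teams make up 51 of the 70 problems.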
4.5 Discussion
The main conclusions that stem from the case study are that:
▪ The employment of a single method is not enough for a comprehensive usability evaluation. If it is important to find most problems, parallel evaluations can be carried out.
▪ No method was found to be significantly more effective or consistent than others.
▪ The realness and validity of evaluation results were considerably high for most teams, which speaks in favour of the supervised participation of young designers in usability evaluations.
In the case study presented, we have followed the proposed approach to inform current practice regarding the use of UEMs. The educational setting in which the case study was carried out imposed restrictions regarding the selection of evaluators (i.e. supervised teams of novice evaluators), the assignment of UEMs (i.e. only one team felt confident to carry out the usability evaluation following co-discovery learning) and the processing of results (i.e. an expert-based approach was followed to make final decisions about the relevance and similarity of the usability findings). On the other hand, the educational setting was convenient for a number of other reasons: UEMs were applied according to a common set of lecture notes; evaluators followed a common format for reporting; and they followed the same tasks to evaluate the system. These conditions are hard to achieve in an industrial setting. For example, Molich et al [9] perform a comparative usability evaluation where the evaluator teams use different UEMs (actually combinations of UEMs that have evolved through practice) and different templates for reporting.
5 Summary and Conclusions
Comparative usability evaluations are important for the thorough identification of usability problems and the comparison of UEMs in particular contexts. The paper contributes to the understanding of criteria for comparative usability evaluation both in terms of providing a method for this task and by presenting a relevant case study
for a web-based system. It is envisaged that the approach taken can be applied to other comparative studies as well. Also the results of the case study can inform the selection of UEMs particularly when young designers need to be employed in comparative usability evaluations.
References
1. Andre, T.S., Hartson, H.R., Belz, S.M., McCreary, F.A.: The user action framework: a reliable foundation for usability engineering support tools. Int. J. Human-Computer Studies 54, 107–136 (2001)
2. Cockton, G., Woolrych, A.: Understanding inspection methods. In: Blandford, A., Vanderdonckt, J., Gray, P.D. (eds.) People and Computers XV, pp. 171–192. Springer, Heidelberg (2001)
3. Doubleday, A., Ryan, M., Springett, M., Sutcliffe, A.: A comparison of usability techniques for evaluating design. In: Proceedings of Designing Interactive Systems (1997)
4. Hartson, H.R., Andre, T.S., Williges, R.C.: Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction 15, 145–181 (2003)
5. Hertzum, M., Jacobsen, N.E.: The Evaluator Effect: A Chilling Fact about Usability Evaluation Methods. International Journal of Human-Computer Interaction 13(4), 421–443 (2001)
6. Jacobsen, N.E., Hertzum, M., John, B.E.: The evaluator effect in usability tests. In: Summary Proceedings of the ACM CHI 98 Conference, pp. 255–256. ACM Press, New York (1998)
7. John, B.E., Marks, S.J.: Tracking the effectiveness of usability evaluation methods. Behaviour and Information Technology 16(4/5), 188–202 (1997)
8. Law, E.L-C., Hvannberg, E.T.: Analysis of strategies for estimating and improving the effectiveness of heuristic evaluation. In: Proceedings of NordiCHI 2004, Tampere, Finland (October 23-27, 2004)
9. Molich, R., Ede, M.R., Kaasgaard, K., Karyukin, B.: Comparative usability evaluation. Behaviour and Information Technology 23(1), 65–74 (2004)
10. Nielsen, J.: Finding Usability Problems Through Heuristic Evaluation. In: Proceedings of CHI Conference on Human Factors in Computing Systems, pp. 373–380. ACM, New York (1992)
11. Nielsen, J.: Usability Engineering. Academic Press, San Diego (1993)
12. Nielsen, J.: Usability Inspection Methods. In: CHI’94, Boston, Massachusetts (1994)
13. Öörni, K.: What do we know about usability evaluation? - A critical view. In: Conference on Users in the Electronic Information Environments, September 8-9, 2003, Espoo, Finland (2003)
14. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan-Kaufmann, San Francisco (2002)
15. Rubin, J.: Handbook of Usability Testing. John Wiley & Sons, Inc., New York (1994)
16. Sears, A.: Heuristic Walkthroughs: Finding the Problems Without the Noise. International Journal of Human-Computer Interaction 9(3), 213–234 (1997)
17. Van den Haak, M.J., De Jong, M.D.T., Schellens, P.J.: Employing think-aloud protocols and constructive interaction to test the usability of online library catalogues: a methodological comparison. Interacting with Computers 16, 1153–1170 (2004)
Concept of Usability Revisited
Masaaki Kurosu
National Institute of Multimedia Education
Abstract. Based on a historical review, a new model of the concept structure of usability and satisfaction is proposed. As a proponent of user engineering, the author redefines the concept of usability, for which usability engineering is responsible, and links the concept of satisfaction to user engineering. The model is based on the differentiation between the objective characteristics of the artefact and the subjective impression of the user.
Keywords: usability, satisfaction, usability engineering, user engineering.
1 Introduction
Since ISO13407 was standardized in 1999, usability engineering has entered a new era and an increasing amount of attention has been paid to usability, or quality in use. At least in Japan, the concept of usability is based on ISO13407, which cites the definition of ISO9241-11. As the alias “big usability” suggests, it covers a wider range of quality than the “small usability” originally proposed by Nielsen. But the author has been questioning the conceptual dependency among the sub-concepts of the definition of ISO9241-11. Here the author presents a revised version of the notion of usability and puts more emphasis on satisfaction as the ultimate goal of user engineering.
“usability”. As will be discussed later, the goal of our activity should not be limited to the traditional connotation of the term “usability” but should be more broadened.
3 Concept of Usability
In this section, some major definitions of usability will be reviewed and finally a new concept of usability and satisfaction will be proposed.
Nielsen
The formal and structural definition of the usability concept was first given by Nielsen, J. as follows. As is shown in this figure, usability is composed of such sub-concepts as the learnability, the efficiency, the user retention over time, the error rate and the satisfaction. It should also be noted that the utility is set apart as a concept mutually exclusive with the usability. This concept structure may be related to the activity of Nielsen himself. As is well known, he proposed the heuristic evaluation method for evaluating the usability, i.e., for detecting the problems. Thus, for him, the usability is an activity to improve the negative aspects of the artefact that will be found by the evaluation method. In other words, it is a “non-negative” concept of the usability and it aims to improve the artefact from a minus level to the zero level, or the normal level. Looking back at the history of usability engineering, it started from evaluation activity using usability testing, inspection methods, etc. So it is quite natural that Nielsen focused on the evaluation and proposed the concept structure of usability as such. But the usability activity based on the evaluation had some limitations. For one, engineers and designers who designed the artefact would not easily accept the result of the evaluation, and would claim that the users can use the artefact if they follow the procedure that was designed by them. For another, managers would not put emphasis on the evaluation-based usability activity because just fixing the defects will not contribute to the sales. Of course, there were some engineers, designers, and
managers who could understand the significance of the usability activity even though it is a “non-negative” approach. But most of them put their energy into the development of utility, i.e. the functionality and the performance. So it could be said that the “non-negative” concept of usability, sometimes called the “small usability”, is not sufficient, and something more should be considered.
ISO 9126
ISO9126 was standardized for defining the quality of software. As can be seen in the figure, there are many quality characteristics, of which the usability is just one part. In this standard, the usability is considered to consist of the understandability, the learnability and the operability.
M. Kurosu
It is reasonable that this standard included the usability as one aspect of the software quality, but its definition is narrow and insufficient.
ISO9241-11
An influential definition of usability was proposed by ISO9241-11. The definition clearly specifies that the usability is related to goal achievement, and it puts emphasis on the context of use. The sub-concepts of usability consist of the effectiveness, the efficiency and the satisfaction. It is important that the effectiveness and the efficiency are related not only to the “non-negative” aspects but also to the “positive” aspects of the artefact. Regarding the effectiveness, the artefact can become usable by minimizing the difficulty of use. But at the same time, the artefact can become usable by providing a function that will solve the user’s problem and make it easier to achieve the goal. Regarding the efficiency, the usability will be improved by changing the interaction procedure in order to shorten the time of operation. But it could also be improved by providing a faster CPU. In this sense, the definition of usability of ISO9241-11 is not just a “non-negative” one but also a “positive” one. In other words, it could be said that this definition includes both the usability and the utility in the definition of Nielsen, and is almost the same as his definition of usefulness. This definition is sometimes called the “big usability”.
This definition also includes the satisfaction as a factor promoting the use of the artefact. But this point is a bit controversial. The effectiveness and the efficiency are properties of the artefact, and so is the usability. But the satisfaction is achieved as the result of these objective properties, i.e. the effectiveness and the efficiency, and is a subjective impression on the side of the user. Another point that should be carefully looked at is the use of the term “specified users”. It is quite natural that the manufacturer presupposes the user as the targeted
user. But it was frequently observed that the profile of the targeted user was based on the engineers and designers themselves. Thus, it is sometimes criticized that the profile of the user is a male in his 30s, having a certain level of knowledge of IT. As a result, the artefact they designed frequently becomes difficult for everyday people to use. This was sometimes pointed out by those who are working in the field of universal design and accessibility. Thus the term should be redefined so as to include every possible type of user. Anyway, this definition of usability of ISO9241-11 was so influential that ISO13407, the core standard of usability, and other standards such as CIF, ISO18529, ISO16982, ISO20282, etc. adopt this definition.
Jordan
Patrick Jordan put emphasis on the pleasure and proposed a three-level concept structure. The functionality is placed at the first level and the usability at the second level. He put emphasis on the pleasure as the third level, because it is essential for the artefact not just to fulfil the ease of use but also to enhance the emotional aspects. This corresponds to the current trend that focuses on the emotional aspect of the artefact, as was proposed by Norman. In his definition of usability, he cited that of ISO9241-11. But this point is a bit confusing. As was mentioned above, the definition of usability of ISO9241-11 includes the satisfaction as one of its parts. So it is difficult to clearly differentiate the satisfaction and the pleasure. Furthermore, his definition is a bit too simple and does not refer to other aspects of the artefact.
Kurosu
Considering the insufficiencies of past definitions, Kurosu proposed a new hierarchical model of usability. He differentiated the objective properties of artefacts (on the left hand side) and the subjective characteristics of users (on the right hand side).
In the left half, the effectiveness and the efficiency are included as being influenced by the utility and the (small) usability. The former consists of the functionality and the performance, and the latter consists of the ease of operation and the ease of cognition. The ease of operation was once a main target of usability activity, addressed by applying the methods and knowledge of human factors engineering and ergonomics. The ease of cognition later became the central concern of usability professionals, triggered by the advent of the computer and its applications. It was advanced by applying the methods and knowledge of cognitive psychology. The effectiveness, the efficiency and the satisfaction were regarded as the sub-concepts of the usability in ISO9241-11, but Kurosu limits the range of the usability concept only to the effectiveness and the efficiency on the left hand side of the figure, admitting the influence of these concepts on the satisfaction. This is based on the notion that the usability is a property of the artefact.
Besides the effectiveness and the efficiency, such quality properties as the cost, the safety and the reliability are located in the property of artefacts that may influence the satisfaction. Some other properties such as the re-usability could be added to the list of quality properties if necessary. On the right hand side of the figure, the satisfaction is located as the top concept and some other subjective characteristics on the side of user such as the pleasure, the aesthetic impression, the attachment, the motivation, the drive and the value system are described as influencing the satisfaction. It is also suggested that the satisfaction, the supreme goal of the artefact, may be related to the user experience (UX), the customer satisfaction (CS) and the quality of life (QOL).
In addition to this structure, Kurosu points out that it is important to include the spatial dimension and the temporal dimension. He put emphasis on considering the diversity of user characteristics and the diversity of context of use as a spatial expansion of the concept. This could be the basis of the concept of universal usability as originally proposed by Shneiderman. He also introduced the temporal dimension and put emphasis on long-term or prolonged use. This contrasts with the former usability engineering, which focuses on just the short-term use that can be evaluated in a usability testing situation. These points will be explained in detail in later sections.
Characteristics: Age, Generation, Gender, Physical traits, Mental traits, Educational background, Social status, Knowledge and skill, Language, Culture, Communication style, Cognitive style, Learning style, Functional insufficiency
Situation: Life style, Economical situation, Political situation, Emotional status, Geographical environment, Historical background, Urgency
Value: Preference, Political attitude, Religion, Tradition
4 Concluding Remarks
Based on the notion of goal achievement, a few notable definitions of usability were reviewed, and the concepts of usability and satisfaction were each redefined, thus putting more emphasis on user engineering. Although this idea is not quite new to the author, he is now confident that the satisfaction is the ultimate goal of people (users) living in this world. Artefacts include invisible systems such as the educational system, the local government system, the banking system, and the transportation system. The concern of the author has now been enlarged to consider how the educational system can satisfy people who are considering their life path and their career path. In this sense, the usability of the educational system, which includes the hardware, the software, the humanware and their total system, should be inspected from the viewpoint of the satisfaction. The system should support people in finally selecting their life path and empower them with the knowledge and skill to realize that goal. The effectiveness and the efficiency of the educational system are just a matter of usability, and that alone will not fulfil their goal of life. In this sense, the author is now interested in pursuing the difference of artefacts in time and in place. He is now conducting ethnographic research to find out how people invented and decided to use some specific form or pattern of artefact for supporting their life. This is called the “Artefact Development Theory” and will be presented at the next opportunity.
References

1. ISO 13407: Human-centred design processes for interactive systems (1999)
2. ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs). Guidance on usability (1998)
3. Jordan, P.W.: An Introduction to Usability. Taylor and Francis, London (1998)
4. Jordan, P.W., Thomas, B., Weerdmeester, B.A., McClelland, I.L.: Usability Evaluation in Industry. Taylor and Francis, London (1996)
5. Kurosu, M.: What is usability? HCD-Net News (2006)
6. Nielsen, J.: Usability Engineering. AP Professional (1993)
How to Use Emotional Usability to Make the Product Serves a Need Beyond the Traditional Functional Objective to Satisfy the Emotion Needs of the User in Order to Improve the Product Differentiator – Focus on Home Appliance Product

Liu Ning1 and Shang Ting2
Abstract. A traditional definition of usability cites the successful attainment of some task-related goal within a specified period of time and with a minimum number of errors. Most usability work has therefore focused on the function of the product. At present, user-centered design is strongly emphasized; in addition, more entertainment-oriented products have received considerable attention from consumers. Whether or not a product can meet the emotional needs of the consumer is therefore significant for the brand. This paper defines emotional usability on the basis of traditional usability research, shows through a case study how Haier, one of the most famous home appliance companies, uses it during product development, and describes a process for applying emotional usability so that a product serves a need beyond the traditional functional objective, satisfying the emotional needs of the user in order to improve product differentiation.
key when people make decisions. For example, people may choose to die for a certain belief, or choose to stay loyal to their lover because they are deeply in love. No doubt, people's decisions to pay for a product are also partly influenced by emotion. This emotion can come from various origins, such as requirements, impressions, trust and so on. Sometimes people even buy a product not because they need it, but because of an experience or because they are deeply moved. In general, people's requirements for product performance include function, aesthetics and quality. Functional usability means that the people who use a product can do so quickly and easily to accomplish their own tasks. People will not pay for a product that makes them spend a great deal of time figuring out how it works or that keeps leading them into errors; customers readily abandon such products. For aesthetics, a dictionary provides the definition “the beautiful, in taste and art.” This component includes the first and last impressions of the product. Aesthetics has already become a significant component of the product. Sometimes, however, a product with good aesthetics cannot meet customers' needs or is hard to use. This paper explains the definition of emotional usability in depth and shows how one of the most famous home appliance companies builds emotional usability into new product development.
2 Emotion Usability

So what is emotion usability? Let us talk about usability first. Usability contains two parts. One is functional usability, which focuses on how easy and efficient the product is to use. The other is emotion usability, which refers to the degree to which a product is desirable or serves needs beyond traditional functional usability. Take the refrigerator as an example. Around 1985, when refrigerators first appeared in China, Chinese people regarded the refrigerator as a product that could make food last longer. Later, people felt this function was not enough: customers wanted the fridge to keep food fresher, save energy, maintain humidity and so on. At present, however, the refrigerator is not just a home appliance for them; Chinese customers expect the product to be a nice piece of furniture in their apartment. Traditional appearance elements, such as white and silver colours, are therefore clearly unattractive to customers, even when the products offer satisfactory quality and function. The same holds for many types of consumer electronic products, such as washing machines, TVs and IT products. Does this mean quality is no longer important or tempting for Chinese customers? Absolutely not. Let us go back and see how people interact with the product. Figure 1 describes the cognitive judgment of products. Normally, customers interact with a product through the five senses: smell, touch, taste, hearing and sight. Touch, sight and hearing are the main senses through which customers interact with home appliance products. Hearing is mainly connected with performance and quality, such as the noise level. Part of touch, and particularly sight, are related to the customer's emotional reaction. Sight relates more to the aesthetic part and to whether or not
the product can serve their needs, for example through its appearance, user interface design, and function. Touch relates more to the process of interacting with the product, that is, its usability. Certainly, aesthetics is not the only thing that can change people's emotions. Functional requirements mean that the product should meet people's potential needs and that the interaction should be enjoyable, not just easy. These factors all have a great influence on people's emotions. Sometimes a product is unique enough but cannot meet customers' needs, and it will still meet its Waterloo. In the history of industrial design, innumerable well-designed products have failed in the marketplace. As mentioned, serving potential needs beyond functional usability is the key point. The most important thing is to address the needs that customers do not realize they have but that they really need the product to meet. For this, understanding the customer's way of thinking is significant.
[Figure 1 diagram. Recoverable labels: Interaction; Objective; smell, touch, taste, hearing, sight; Cognitive judgment]
Fig. 1. Cognitive judgment for the product
Nowadays, many western companies are trying to win the huge Chinese market. However, there is a deep gap between Chinese and western people. Primarily, there are distinct differences between western culture and Chinese culture, and these differences are reflected in emotion. The core of Chinese culture is being implicit and moderate, and being afraid of losing face or of embarrassing other people. These are some of the important reasons why many Chinese pay attention to products that can show off their status, taste and
personality. In contrast, simplicity and straightforwardness are the key points of western culture; western people pay more attention to whether a product is simple and practical. As in the refrigerator case mentioned above, Chinese people feel proud when guests come to their apartment and see a refrigerator with attractive features. Chinese people like to show friends the new mobile phone they have just bought, even if the old one was not bad at all. People feel bored or out of date using a refrigerator with a traditional design or an old mobile phone model, even if the product is of good quality. A product that successfully addresses the emotional usability needs of western people might therefore not work in China.
3 The Process to Apply Both Functional and Emotion Usability in Haier

Haier is one of the most famous home appliance companies in China and has achieved compelling success there. In the beginning, the secret of Haier's success was meticulous after-sales support. At present, a further secret of its success is that it always addresses Chinese customers' potential needs. Haier has a practical process for addressing emotional usability in new products. First, let us introduce the principle of Haier design.

3.1 The Principle of Haier Design

In 2002, Haier developed a compact washing machine for summer clothes and other small items that a normal washing machine could not wash quickly because the load was too light. This product broke the rule that washing machines are relatively hard to sell in summer, and it also won the G-Mark award in the Japanese design competition. In 2006, Haier again won an award, this time for an air conditioner, at the International Industrial Design competition (IF) in Germany.
In fact, the principle of Haier's design is that the product must satisfy users' needs. This covers not only needs that have already been discovered; the most important part is to keep uncovering users' potential needs. Two related product attributes are central to Haier's concept of emotional usability. First, the
product must have appropriate aptness and make good first and lasting impressions. In other words, the product must catch the user's attention quickly and hold it for longer. Second, it must eliminate the “machine feeling”, which includes fear and a lack of humanity toward the user.

3.2 Process

At Haier, the usability team is involved from the beginning of new product development. Its responsibilities include discovering users' needs, building the user interface design, usability testing and other types of testing. In summary, the usability team plays a key role in the early new product development process. Figure 2 shows a snapshot of these activities.
Fig. 2. The process of combining functional and emotional usability
By conducting user research on previous products, Haier can obtain direct customer feedback. To some extent, this also gathers user requirements for new products. Haier also conducts other types of research to discover potential customer needs. These needs are then passed as input to the new product development team to develop the new concept. In fact, the most important steps for discovering users' emotional usability needs are conducted in the pre-development stage or even earlier, because customers usually cannot provide very useful suggestions directly; mostly, customers do not really know what they want. Through the various studies in the pre-development stage, Haier gathers more answers about why than about how from customers, and the research team keeps working to figure out the story behind the reasons in order to discover people's potential needs. For example, in 2003 China suffered from SARS, especially in the south. The disease spreads easily, even through the air, and Chinese people were seriously afraid of it. Through user research, the product team found that people used to sterilize their clothes after the washing machine had finished washing them. However, the disinfectant left the clothes with a terrible smell. Because no washing machine was equipped with a disinfection function, some people first washed the clothes by hand with disinfectant and then
put them into the washing machine. People obviously washed clothes in this unusual way because of SARS. This potential need was quickly identified by the new product development team, and Haier very soon launched the first washing machine with a disinfection function on the Chinese market. This product brought Haier a huge profit. In this case, it is easy to see that the product development team properly translated customer needs into a new product that achieved the first aim of Haier's emotional usability: the product must be engaging. Customers can be deeply moved by a unique design, and also by a product that provides something they have never seen but really need. Without doubt, aesthetic engagement is also important. Once development reaches the concept stage, the product design and user interface design teams provide design concepts based on those needs. After the new concept is finished, Haier sends the prototype to the laboratory for one-to-one acceptance testing, or uses other research methods to evaluate the features of the product. The opinions collected from customers are sent directly to the product development team so that the product can be improved quickly. At the same time, acceptance testing or other types of research are also conducted. Based on users' reactions to the product, the company can simply evaluate whether or not the product catches people's eyes. This covers product design, usability, user interface and even the internal frame.
Take this red refrigerator as an example. After the product had finished the design stage, Haier ran competitive testing in China between Haier, Samsung, Panasonic and LG models. The result was that this model received a great deal of praise from customers. Once Haier launched the model, it was very successful in the market. Meanwhile, Haier also sent this model to Germany to take part in the IF competition in 2006, where it won the award. To better achieve the aims of emotional usability, especially eliminating fear of the machine, Haier emphasizes user interface design from the beginning of new product development. Achieving functional usability is actually just a baseline goal for Haier. The interface design team combines functional usability and visual appeal so that people engage with the product through an easy and enjoyable interaction process. The user interface design
concept is finally put through usability testing with customers to evaluate whether or not it achieves the usability aims.

• Weak point of the process

There is no all-powerful weapon in the world. This process has already had a great impact on new product development at Haier. However, it has also shown weaknesses. First, human cognition keeps changing, and so do the opinions people give the company; sometimes what they say has not really been carefully considered. Second, usability has been adopted from the West and includes many methods. This leads back to the earlier issue, namely that the West differs greatly from China. Chinese participants tend to be quiet and implicit, and to be careful about the company's face while research is being conducted, none of which is good for the research. Beyond that, many methods that work well in the West do not work ideally in China. Poor research results might seriously mislead the company's decisions. Finding proper methods and localizing them is therefore another key issue now.
4 Conclusions

To summarize, the usability program at Haier is new and still growing. Haier has only just started the fundamental usability process, which includes an early and continual focus on the consumer and perfecting an interactive design process that relies heavily on research and prototypes. Based on this process, Haier brings the different departments into the new product development process in order to ultimately improve product differentiation in China. As the principle of the Haier design group holds, emotional usability is the key to improving the product differentiator and eventually achieving success in the marketplace.
Towards Remote Empirical Evaluation of Web Pages' Usability

Juan Miguel López1, Inmaculada Fajardo2, and Julio Abascal1

1 Laboratory of Human-Computer Interaction for Special Needs (LHCISN), Computer Science Faculty, University of the Basque Country, Manuel Lardizabal 1; Donostia - San Sebastian
[email protected], [email protected]
2 Cognitive Ergonomics Group, Department of Experimental Psychology, University of Granada, Cartuja Campus; Granada
[email protected]
Abstract. The functional description of EWEB, a tool for automatic empirical evaluation of web navigation, is presented in this document. EWEB supports naïve evaluators in designing experiments, which specify the experiment type (within-subject, factorial, etc.), web logs to be captured (time, visited pages, etc.), task models (search task, free navigation) and surveys (questionnaires, card sorting) to be performed by experimental participants. EWEB stores navigational data preserving the experiment structure and supports data analysis and interpretation, with the possibility of generating usability metrics. Requiring minimal installation on the client computer, EWEB can be used for both lab evaluation and remote evaluation in multiple browsers. One empirical web study, designed and performed by means of EWEB, is described in order to illustrate its validity as a research tool.

Keywords: web usability experiments, log capturing and analyzing, web navigation metrics.
that reason, tools for facilitating user recruitment and for automating the process of designing, recording and interpreting web usability experiments are essential. With the aim of assisting researchers, there exist tools that automate these processes separately. Many useful tools for capturing user behavior during web interaction can be found in the literature. On the one hand, tools such as [3], [4] or [5] store data generated by the HTTP-level communication between a web server and a browser on a client machine. On the other hand, there are tools for capturing data from the user interface, in this case a local web browser, such as [6], [7], [8], [9] and [10]. A subgroup in this category comprises tools that use a new or modified client browser specifically prepared for storing user navigation information, such as [11], [12], and [13]. There are also a number of tools for automating the administration of knowledge elicitation tasks (mainly card sorting) in the Web context, such as [14], [15], [16] and [17]. However, these tasks are not usually integrated into automated capture tools. [15] is an exception that combines automation of event capture and card sorting tasks, so that researchers can design the task by introducing the concepts to be classified by users and gathering the result of this categorization. Once user actions are recorded, the next step consists of analyzing and interpreting them. Tools that allow analysis of recorded web navigation information include [10], which provides comparison between task models and user behaviour, [13] and [18]. Other analysis tools such as [19] and [20] do not record user logs but provide interesting navigation metrics, for instance the degree of disorientation (L index of Lostness [21]) and the St index of linearity and/or Cp index of complexity of the Web navigation route followed by the user [22].
In summary, there are numerous tools that automate some of the processes involved in an empirical evaluation of website usability, mainly capturing and analyzing. However, tools aiding the process of designing a web experiment are fewer and incomplete. Furthermore, to our knowledge there are no tools that facilitate jointly conducting the processes of designing complex web experiments, capturing user interaction, and analyzing and interpreting captured data. This may lead evaluators to acquire several tools and invest a large amount of resources in learning how to use a different one for each process, which paradoxically would hinder rather than lighten the empirical evaluation of web navigation. Given these deficits, and to meet the need for one tool covering all the previously mentioned aspects, the EWEB tool was developed. EWEB (an acronym of Experimentation in the WEB) is a tool for automatic empirical evaluation of web navigation. EWEB supports naïve evaluators in creating experiments, which specify the experiment type (within-subject, factorial, etc.), web logs to be captured (time, visited pages, etc.), task models (search task, free navigation) and surveys (questionnaires, card sorting, etc.) to be performed by experimental participants. Additionally, EWEB stores web navigation data preserving the experiment structure and supports data analysis and interpretation, with the possibility of generating usability metrics such as Lostness or similarity to the optimum path [21]. Finally, EWEB can be used for both lab evaluation and remote evaluation in multiple browsers, requiring minimal installation on the client computer.
2 EWEB Tool: Technical and Functional Description

The EWEB tool (Experimentation in the WEB) consists of three modules (see Figure 1): the Experimental Session Design module, the User Guidance and Monitor module and the Analysis module.
Fig. 1. Architecture of EWEB tool
An experimenter defines a session using the design module, which produces an XML file as output. This file defines the experiment session. The User Guidance and Monitor module uses the XML file to conduct the session on the user's computer while monitoring navigational data. These data are stored in a remote repository, from which they can be analyzed according to the experiment session. Each part of the architecture is described next.

2.1 Experimental Session Design Module

Any experiment can be described as a study that investigates the effect of X on Y. Therefore, when a Web experiment is carried out, it must be decided which variables are going to be manipulated (X or independent variables), controlled (extraneous
variables) and observed (Y or dependent variables), and in which way. For instance, consider a company that wants to evaluate the impact of the background colour (blue, white or green) of its website on the time that users require to complete a search task. In the design module of EWEB, the evaluator must add one independent variable (background colour) and specify its levels (blue, white or green). Furthermore, the evaluator must decide whether all users will perform the search task for the three levels of the independent variable (within-subject design) or each user will perform it for just one level (between-group design). As a result of these initial steps, the design module automatically calculates the number of experimental groups and conditions. In our example, the between-group design would have three experimental conditions and three experimental groups: the evaluator must include three different groups of users, each of which performs the search task with a single background colour level. If the evaluator selected the within-subject design, the number of experimental conditions would also be three, but there would be only one experimental group, since all users perform all three experimental conditions. The next step is to define the task models and the procedure for each experimental condition. Currently, EWEB is designed to implement two types of web navigation tasks (search and free navigation) and two types of surveys (card sorting and questionnaires).

2.1.1 Search Task and Free Navigation Task

The search task consists of users searching for a series of targets in the web site within a time limit. The design module allows configuring the instructions, number of search tasks, time limit for the task, target URL, initial URL for the task, data to be logged (accuracy, time to find the target, pages accessed, total time per page, order of pages accessed) and order of search trials (random or fixed order).
This last point is very important in experimentation in order to prevent practice or fatigue effects, which may mask the effects of the independent variables. The free navigation task consists of asking users to navigate freely through the web starting from a specific web page for a given time. The design module allows configuring the initial URL, time limit (if any) and navigational data to be logged.

2.1.2 Surveys

The card sorting task is a knowledge elicitation task that has been used extensively in Cognitive Psychology and Artificial Intelligence to study user learning or the so-called Mental Model [23]. It consists of asking users to sort cards that contain task-relevant concepts. The output is a vector or matrix of user data that can be compared graphically or statistically with a theoretical or expert matrix. The design module allows evaluators to introduce task-relevant concepts and specify the theoretical vector. The questionnaire option allows evaluators to design a set of questions to be answered by users. The number and type of questions (true-false, forced choice, scale, etc.) and the presentation format can be designed in this module. Finally, although the general instructions for the experimental conditions are not user tasks, they can be designed within each condition in order to facilitate the description of the procedure.
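Returning to the background colour example of Section 2.1, the way the number of experimental conditions and groups follows from the design choice can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical and do not come from EWEB, and the generalisation to several independent variables (conditions as the product of all levels, groups as the product of the levels of between-group variables) is an assumption based on standard factorial design rather than something stated in the text.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class IndependentVariable:
    """One manipulated variable (hypothetical structure, not EWEB's)."""
    name: str
    levels: tuple
    within_subject: bool


def count_conditions_and_groups(variables):
    """Return (experimental conditions, experimental groups) for a design."""
    # Every combination of levels is one experimental condition.
    conditions = prod(len(v.levels) for v in variables)
    # Only between-group variables split participants into separate groups.
    groups = prod(len(v.levels) for v in variables if not v.within_subject)
    return conditions, groups


colours = ("blue", "white", "green")
between = [IndependentVariable("background colour", colours, within_subject=False)]
within = [IndependentVariable("background colour", colours, within_subject=True)]

print(count_conditions_and_groups(between))  # (3, 3)
print(count_conditions_and_groups(within))   # (3, 1)
```

Both results match the counts given for the example: three conditions in either design, with three groups for the between-group design and a single group for the within-subject design.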
Continuing with the previous example, if the company in question sells books, the speed at which users are able to find their targets is a relevant usability index. Therefore, if the evaluator started with the “blue background” experimental condition, he/she could select the search task. In the design module interface, the evaluator would select the number of search tasks (e.g. two search trials), the time limit (e.g. 20 seconds) and the initial and target URLs. As some experimental conditions can be identical to previously defined ones, EWEB allows the evaluator to copy tasks and procedures and later change some of their parameters. Finally, the procedure, that is, the presentation order of the sets of tasks for a specific experimental condition, can also be defined by the evaluator as random or fixed. For instance, if an experimental condition includes two tasks, search and card sorting, and the evaluator selects a fixed order, he/she should indicate which one must be performed first. This module outputs an XML file with a specific format for the experiment design created. All the variables used and their conditions are coded in this file. The specification file is later used by the User Guidance and Monitor Module, which assigns the tasks users have to perform based on its information. The file is also used by the Analysis Module to facilitate the analysis of all user evaluation data.

2.2 User Guidance and Monitor Module

In order to perform an experiment, this module must be downloaded and run locally on the user's computer. As this module is developed using Java technology, the only requirement for the user machine is that a Java Virtual Machine is installed. If so, the module runs by means of Java Web Start technology. This module is composed of two parts: the User Guidance Module and the Monitor Module.
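To make the role of the design file more concrete, the following Python sketch reads a small XML description of an experiment and derives the order in which tasks would be presented for each condition, keeping the file order for a fixed procedure and shuffling for a random one. The element and attribute names are invented for this illustration; the actual format generated by EWEB is not specified in this text, and EWEB itself is implemented in Java.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical design file: element and attribute names are invented for
# illustration and are NOT the format EWEB actually generates.
DESIGN_XML = """
<experiment name="background-colour" design="within-subject">
  <condition id="blue" procedure="fixed">
    <task type="search" trials="2" time_limit_s="20"/>
    <task type="card_sorting"/>
  </condition>
  <condition id="white" procedure="random">
    <task type="search" trials="2" time_limit_s="20"/>
    <task type="questionnaire"/>
  </condition>
</experiment>
"""


def task_order(condition, rng):
    """Return task types in presentation order for one condition element."""
    tasks = [task.get("type") for task in condition.findall("task")]
    if condition.get("procedure") == "random":
        rng.shuffle(tasks)  # random procedure: reshuffle for each participant
    return tasks  # fixed procedure: keep the order given in the file


def session_plan(xml_text, seed=None):
    """Map each condition id to the ordered list of tasks a participant gets."""
    root = ET.fromstring(xml_text)
    rng = random.Random(seed)
    return {c.get("id"): task_order(c, rng) for c in root.findall("condition")}


plan = session_plan(DESIGN_XML, seed=1)
print(plan["blue"])           # ['search', 'card_sorting']
print(sorted(plan["white"]))  # ['questionnaire', 'search']
```

Keeping the ordering rule inside the file, rather than in the program that runs the session, is what would let the same specification drive both participant guidance and later analysis.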
2.2.1 User Guidance Module

The User Guidance Module is based on the experimental design created by the experimenter in the previous stage. It receives as input an XML file describing the design of the experiment. According to the design, the tasks to be performed by the user and their order are established. For instance, if the procedure of an experimental condition is defined as random, all tasks related to it are randomized when the user goes through this condition. Tasks are prepared and executed based on the given experimental design. The experiment file also provides the instructions for the different groups of tasks, as well as the texts of error or OK messages that may appear depending on user actions.

2.2.2 Monitor Module

The Monitor Module is executed locally on the client machine, and its goal is to monitor all information related to user interaction while the given tasks are performed. To ensure that the evaluation takes place in a realistic scenario, data capture must be performed in such a way that the user is not aware of any difference from his/her usual web navigation in the browser. Therefore, this module defines no user interface. As almost all current browsers offer the option of connecting to the web through a proxy server, this module uses a proxy, or intermediate software, to route all the
incoming and outgoing web traffic of the client browser, so that relevant user navigation data can be captured. This approach allows the proxy to be used with almost all existing browsers and operating systems. In fact, EWEB has been successfully tested with different browsers such as Internet Explorer, Mozilla, Mozilla Firefox and Konqueror. The modularity of this approach allows the module to be adapted rapidly to new browsers or new versions of supported browsers. The module activates and deactivates proxy navigation automatically, so that the user does not notice changes in browser configuration. The mechanism is different for each browser, so a separate piece of code has been developed for each one. If the user's browser already has a proxy configured, a hierarchy is created in the Monitor Module proxy so that incoming and outgoing web traffic is rerouted to the previously defined proxy. When the user session ends, browser settings are restored so that the user can navigate with the previously defined proxy. Use of cached information is disabled to ensure that all users perform the evaluation under the same conditions, because cached web pages can affect the results of the experiments. As user navigation data are captured locally by the proxy, network latency is not a problem and the data are more accurate than those obtained from a remote proxy server (millisecond accuracy can be achieved). Data from the tasks performed by the user can be stored either in a remote repository or in a local file, depending on the information specified in the experiment. All captured data are also stored and labeled according to the design.

2.3 Analysis Module

Once users' data are stored in the remote data repository after an experimental session, either directly or by adding local files manually, they can be analyzed using this module.
In addition, data stored in the repository can be exported in plain text format so that they can be imported directly into tools such as Excel or Statistica for statistical analysis. The type of analysis that can be performed is different for each task type. In the case of the search task, this module allows analysis of parameters such as the number of correct trials per user (target found), average correct trials per experimental condition, total and average elapsed time for finding a target per user or per experimental condition, and the similarity of the user's path to the optimal path of the task or task model. The last parameter can be evaluated by means of the Lostness metric [21] per trial and subject, or the average Lostness per trial. The Lostness index ranges between 0 and 1; the greater the value, the more lost the user. For the free navigation task, this module allows calculating the total time required by the user to navigate through the website (if there was no time limit) and a matrix of the transitions between actions for the analysis of the user's navigation strategies [22], or for the coherence between accessed nodes [24].
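As an illustration of the path-based measures mentioned above, the sketch below computes a Lostness value and a count of page-to-page transitions from one recorded navigation path. It uses the formulation usually attributed to Smith, L = sqrt((N/S - 1)^2 + (R/N - 1)^2), where S is the total number of pages visited (counting revisits), N the number of different pages visited and R the minimum number of pages needed to reach the target. Whether [21] and EWEB use exactly this form and normalisation is an assumption here, and the function names and sample page names are hypothetical.

```python
from collections import Counter
from math import sqrt


def lostness(path, minimum_pages):
    """Smith-style Lostness for one trial (assumed formulation).

    path: sequence of page identifiers in visit order (revisits included).
    minimum_pages: R, the fewest pages needed to reach the target.
    """
    total_visits = len(path)         # S: pages visited, counting revisits
    distinct_pages = len(set(path))  # N: different pages visited
    if total_visits == 0:
        raise ValueError("path must contain at least one page")
    return sqrt((distinct_pages / total_visits - 1) ** 2
                + (minimum_pages / distinct_pages - 1) ** 2)


def transition_counts(path):
    """Count page-to-page transitions, a sparse form of a transition matrix."""
    return Counter(zip(path, path[1:]))


# Hypothetical navigation paths for a book-search task.
optimal = ["home", "catalog", "book"]
wandering = ["home", "news", "home", "catalog", "news", "catalog", "book"]

print(lostness(optimal, minimum_pages=3))                  # 0.0
print(round(lostness(wandering, minimum_pages=3), 3))      # 0.496
print(transition_counts(wandering)[("home", "catalog")])   # 1
```

A path that visits only the required pages, each once, scores 0; revisits and detours push the value up.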
3 Case Study

The material for the illustration comes from an experimental study carried out by [25] with the aim of comparing ten different websites in terms of accessibility (measured
with the metrics proposed by [26]) and usability (measured by the accuracy, effectiveness and satisfaction of searching for information on the Web). EWEB was used to design the experiment and to capture and analyze the experimental data. Twenty volunteers participated in the experiment (fourteen women and six men), with an average age of 25 years. They were asked to search for 54 targets in ten different websites, each indicated at the time (six searches per website). The order of website presentation and of searches per website was randomized across participants. By means of the design module, researchers introduced the number and types of variables, and the program automatically calculated the number of experimental conditions and showed them to the researchers. In this case, the “Website” independent variable was manipulated within-subject, with ten levels, one for each website to be studied. Therefore, there were ten experimental conditions. Then, for each experimental condition, the experiment designers defined the tasks users should perform, in this case a search task and a satisfaction questionnaire. Finally, researchers defined the number and characteristics of each one of the ten search trials: instructions, time limit, target, randomization, etc. Figure 2 shows a piece of the XML form that EWEB generated as a result of this process.
Fig. 2. Piece of the XML form which contains the experiment characteristics
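The design step described above (deriving the experimental conditions from the declared variables and randomizing the presentation order per participant) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the site labels and the seeding scheme are hypothetical and are not taken from EWEB.

```python
import itertools
import random


def experimental_conditions(variables):
    """Return one condition per combination of variable levels.

    `variables` maps a variable name to its list of levels (hypothetical format).
    """
    names = list(variables)
    level_lists = [variables[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*level_lists)]


def presentation_order(conditions, participant_id, seed=0):
    """Shuffle the conditions reproducibly for one participant (assumed scheme)."""
    rng = random.Random(f"{seed}-{participant_id}")
    order = list(conditions)
    rng.shuffle(order)
    return order


# One within-subject variable with ten levels gives ten conditions.
design = {"Website": [f"site_{i}" for i in range(1, 11)]}
conditions = experimental_conditions(design)
order_for_p1 = presentation_order(conditions, participant_id="P1")
```

With a single ten-level variable the product has ten elements, which matches the ten experimental conditions of the case study; adding a second variable would multiply the number of conditions by its number of levels.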
Once researchers had defined the tasks and procedures of each of the ten experimental conditions, they started the User Guidance and Monitor Module on the local computer of the experimental participant. Then, researchers selected the experiment and specified the participant identification and experimental group (the list of groups was automatically generated by EWEB). Since the Website variable was a within-subjects variable, there was just one experimental group, and all participants had to perform the searches and the satisfaction questionnaire for each of the ten websites. The experiment then started under the control of the user guidance module, which asked participants to perform the tasks according to the specified procedure (in this case, the presentation of the websites and search targets was randomized for each participant). Meanwhile, the monitor module recorded participants' actions while navigating and saved them.
The data were saved into a remote repository according to the structure of the experiment. The data report was grouped by task, experimental condition, measures, etc. Table 1 shows the accuracy and efficiency data of one participant in the 3 trials of the search task for one of the ten websites. Results were merged and exported to a statistics program in order to perform the required analyses.

Table 1. Data calculated by EWEB for Participant 1 in the condition “W3C website”

User Code: Participant 1
Search Task
Experimental Condition: W3C website

Trial   Response Time (ms)   Target Found   Lostness
0       3656                 1              0.25
1       60000                1              0.780625
2       60000                0              0
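The per-condition summaries mentioned earlier (number of correct trials, accuracy, average response time) can be derived from trial records such as those in Table 1. The sketch below is hypothetical: the record layout and field names are assumptions, and the values are the three trials as read from Table 1.

```python
from statistics import mean

# Trials of Participant 1 in the "W3C website" condition, as read from Table 1.
trials = [
    {"trial": 0, "response_time_ms": 3656, "target_found": 1},
    {"trial": 1, "response_time_ms": 60000, "target_found": 1},
    {"trial": 2, "response_time_ms": 60000, "target_found": 0},
]


def summarize(trial_records):
    """Aggregate one participant's trials within one experimental condition."""
    return {
        "correct_trials": sum(t["target_found"] for t in trial_records),
        "accuracy": mean(t["target_found"] for t in trial_records),
        "mean_response_time_ms": mean(t["response_time_ms"] for t in trial_records),
    }


summary = summarize(trials)
```

The same function could be applied to the trials of all participants in a condition to obtain the per-condition averages.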
3.1 Results
The results of one website were removed from the analysis because data collection failed for some users. Consequently, the accessibility of the remaining nine websites was calculated by means of [26] and compared to the usability metrics (search time, percentage of targets found, lostness and satisfaction) calculated automatically by EWEB. The results showed that the nine websites differed significantly in usability and accessibility and, more interestingly, that accessibility and usability metrics were not correlated and yielded different website rankings (see [25] for a full description of these experimental results). Therefore, based on the results, it was concluded that technical web accessibility is not a good predictor of usability.
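One way to compare two rankings of the same websites is a rank correlation. The text does not state which statistic was used in [25], so the Spearman coefficient below is an assumption chosen for illustration, and the scores in the usage lines are invented placeholders, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented placeholder scores for three sites (NOT data from the study).
accessibility_scores = [0.9, 0.4, 0.7]
mean_search_times = [30.0, 42.0, 35.0]
rho = spearman(accessibility_scores, mean_search_times)
```

A coefficient near 0 would indicate that the two metrics order the websites differently; whether it differs significantly from 0 would still require a test appropriate for nine websites.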
4 Conclusions and Future Work
The experimental study carried out by [25] illustrates that EWEB jointly automates the processes of experimental design, data recording and data analysis. The design module automatically generates and interprets a complex XML file containing the characteristics of the experiment, without requiring researchers to be experts in XML. The design module safeguards the requirements for manipulation and control of an experimental study. In addition, the use of the experimental design facilitates the identification and analysis of users' data by storing them according to the patterns defined in the experiment. Since the experimental design and number of experimental conditions are defined in the design module, results are displayed as a function of the experimental structure. Identifying users by experimental condition has an additional benefit: it prevents different users from being mistaken for a single user, or vice versa, as can happen when IP addresses are used for user identification. Another advantage is that EWEB automatically calculates metrics such as accuracy and users' lostness in the search task by comparing the task models provided by the researcher in the design module with user behaviour. This means that EWEB can analyze high-level user behaviour and not only isolated events. In short, EWEB provides great versatility while reducing the evaluators' investment of
time and resources. From a technical point of view, the use of Java technology allows the implementation of an easily portable, multiplatform tool that eliminates network latency when measuring users' response times. Future work must consider the inclusion of new tasks and metrics such as Efficiency rating (E), Confidence rating (C), St (index of linearity) and Cp (index of strategy-related complexity [22]). In addition, it would be worthwhile to improve the data display by introducing graphical representations of the information to facilitate visual analysis by evaluators.
References
1. Shneiderman, B.: Designing the user interface: Strategies for effective human-computer interaction, 2nd edn. Addison-Wesley, Reading, MA (1992)
2. Salmerón, L., Salmerón, L., Cañas, J.J., Kintsch, W., Fajardo, I.: Are expert users always better searchers? Interaction of expertise and semantic grouping in hypertext search tasks. Behaviour and Information Technology 24(6), 471–475 (2005)
3. AccessWatch (n.d.). Retrieved February 2007 from http://www.accesswatch.com/
4. Analog (n.d.). Retrieved February 2007 from http://www.analog.cx/
5. WebTrends (n.d.). Retrieved February 2007 from http://www.webtrends.com/
6. Ellis, R.D., Jankowski, T.B., Jasper, J.E., Tharuvai, B.S.: Listener: a tool for client-side investigation of hypermedia navigation behavior. Behavior Research Methods, Instruments & Computers 30(6), 573–582 (1998)
7. Etgen, M., Cantor, J.: What does getting WET (Web Event-logging Tool) mean for web usability? In: Proceedings of the 5th International Conference on Human Factors and the Web, Gaithersburg (1999), http://zing.ncsl.nist.gov/hfweb/proceedings/etgen-cantor/index.html
8. Scholtz, J., Laskowski, S.: Developing usability tools and techniques for designing and testing web sites. In: Proceedings of the Fourth Conference on Human Factors and the Web, Basking Ridge, NJ (1998). Available at http://zing.ncsl.nist.gov/WebTools/
9. Gonzalez, M.: ANTS: An Automatic Navigability Testing Tool for hypermedia. In: Proceedings of the Eurographics Multimedia '99 Workshop, Milan, Italy. Springer, Wien, Austria (2000)
10. Paganelli, L., Paternò, F.: Intelligent Analysis of User Interactions with Web Applications. In: Proceedings of ACM IUI 2002, San Francisco, CA, pp. 439–445 (2002)
11. HotJava (n.d.). Retrieved 10 February 2007 from http://java.sun.com/products/archive/hotjava/index.html
12. WebWindow (n.d.). Retrieved 10 February 2007 from http://www.javio.com/webwindow/webwindow.html
13. Edmonds, A.: Uzilla: A new tool for web usability testing. Behavior Research Methods, Instruments and Computers 35(2), 194–201 (2003)
14. WebSort (n.d.). Retrieved February 2007 from http://www.websort.net/
15. WebCAT (n.d.). Retrieved 8 February 2007 from http://zing.ncsl.nist.gov/WebTools/WebCAT/overview.html
16. UzCardsort (n.d.). Retrieved 8 February 2007 from http://uzilla.mozdev.org/cardsort.html
17. Harper, M.E., Jentsch, F.G., Berry, D., Lau, H.C., Bowers, C., Salas, E.: TPL–KATS-card sort: A tool for assessing structural knowledge. Behavior Research Methods, Instruments and Computers 35(4), 577–584 (2003)
18. Carmel, E., Crawford, S., Chen, H.: Browsing in hypertext: a cognitive study. IEEE Transactions on Systems, Man, and Cybernetics 22(5), 865–884 (1992)
19. Richter, T., Naumann, J., Noller, S.: LOGPAT: A semi-automatic way to analyze hypertext navigation behavior. Swiss Journal of Psychology 62(2), 113–120 (2003)
20. Brunstein, A., Naumann, A., Krems, J.F.: The Chemnitz LogAnalyzer: A Tool for Analyzing Data From Hypertext Navigation Research. Behavior Research Methods 37(2), 232–239 (2005)
21. Smith, P.A.: Towards a practical measure of hypertext usability. Interacting with Computers 8, 365–381 (1996)
22. McEneaney, J.E.: Graphical and numerical methods to assess navigation in hypertext. International Journal of Human Computer Studies 6(5), 761–786 (2001)
23. Cañas, J.J., Antolí, A., Barquier, P., Castillo, A., Fajardo, I., Gámez, P., Salmerón, L.: Representación mental de los conceptos, objetos y personas implicados en una tarea realizada en una interfaz. Inteligencia Artificial 16, 107–113 (2002)
24. Foltz, P.W., Kintsch, W., Landauer, T.K.: The measurement of textual coherence with Latent Semantic Analysis. Discourse Processes 25, 285–307 (1998)
25. Arrue, M., Fajardo, I., López, J.M., Vigo, M.: Interdependence between technical web accessibility and usability: its influence on web quality models. International Journal of Web Engineering and Technology 3(3), 307–328 (2007)
26. Arrue, M., Vigo, M., Abascal, J.: Quantitative metrics for web accessibility evaluation. In: Lowe, D.G., Gaedke, M. (eds.) ICWE 2005. LNCS, vol. 3579, Springer, Heidelberg (2005)
Mixing Evaluation Methods for Assessing the Utility of an Interactive InfoVis Technique

Markus Rester¹, Margit Pohl¹, Sylvia Wiltner¹, Klaus Hinum², Silvia Miksch³, Christian Popow⁴, and Susanne Ohmann⁴

¹ Institute of Design and Assessment of Technology, Vienna University of Technology, Austria
[email protected]
² Institute of Software Technology and Interactive Systems, Vienna Univ. of Technology, Austria
³ Department of Information and Knowledge Engineering, Danube University of Krems, Austria
⁴ Department of Child and Adolescent Psychiatry, Medical University of Vienna, Austria
Abstract. We describe the results of an empirical study comparing an interactive Information Visualization (InfoVis) technique called Gravi++ (GRAVI), Exploratory Data Analysis (EDA) and Machine Learning (ML). The application domain is the psychotherapeutic treatment of anorectic young women. The three techniques are supposed to support the therapists in finding the variables which influence success or failure in therapy. To evaluate the utility of the three techniques, we developed, on the one hand, a report system which helped subjects to formulate and document in a self-directed manner the insights they gained when using the three techniques. On the other hand, focus groups were held with the subjects. The combination of these very different evaluation methods prevents jumping to false conclusions and enables a comprehensive assessment of the tested techniques. The combined results indicate that the three techniques (EDA, ML, and GRAVI) are complementary and therefore should be used in conjunction.
Keywords: Information Visualization, Evaluation, Utility, Focus Groups, Insight Reports, Methodology.
1 Introduction
Several authors have pointed out the importance of evaluation studies of Information Visualization (InfoVis) techniques (see e.g. [1], [2], [3]). In the past few years usability studies concerning visualization techniques have become more frequent, and valuable information about the design of such systems has been gathered. Nevertheless, as [4] mentions, there is still too little systematic information about the specific strengths and weaknesses of the features of InfoVis techniques. Studies presenting data from practical experiences with InfoVis techniques can help to develop a more systematic framework to support the decision about which InfoVis technique to use in a given context. Medical data is a very interesting application area for Information Visualization. One of the reasons for this is the complex and time-dependent character of these data. For such data, interesting InfoVis techniques have been developed in the past few years.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 604–613, 2007. © Springer-Verlag Berlin Heidelberg 2007
In the following, we describe a study analyzing several different methods used to assess the therapeutic treatment of anorectic young women. During the therapy process a large amount of highly complex data is collected. Statistical methods are not suitable for analyzing these data because of the small sample size, the high number of variables and the time-dependent character of the data. The data result from extensive questionnaires which the young women and their parents have to fill in several times before, during and after the therapy. These questionnaires address topics such as the young women's propensity for depression, their social behavior or their attitude towards eating. The therapists want to find patterns in the young women's behavior and try to isolate the specific factors influencing success or failure in the therapy (predictors). InfoVis techniques might be a valuable way to represent these data, but in agreement with the therapists we also chose two other potential techniques (Machine Learning and Exploratory Data Analysis).
Until now, evaluation in Information Visualization has centered on two variables: time and error. This approach has been criticized recently [5]. For many applications, the measurement of time and errors is too narrow. Many visualization methods support extensive exploration processes and the formulation of hypotheses. For an exploration process, the measurement of time does not make sense, and in the context of the development of hypotheses, errors in a narrow sense do not occur. In an ill-structured domain with no clear-cut results, such as psychotherapy, other approaches are necessary. Therefore, the concept of insights was introduced to make the results of exploration processes based on InfoVis techniques more tangible [6]. Unfortunately, there is no agreed-upon definition of insight, although cognitive psychology has dealt with this topic quite extensively (see e.g. [7]).
Most authors define 'insight' in a quite pragmatic manner. In addition, there are no general frameworks for categorizing insights. [8] points out that a starting point might be to use user tasks such as finding clusters or extreme values. Some general cognitive activities often appear as insight categories, for example, finding detailed, factual information, identifying clusters, generalizations, identifying changes over time, etc. [6,9]. We developed our own classification system, partly based on the generic categories described above and partly adapted to the specific task for which our visualization method was developed. Finding predictors plays an important part in the therapists' work; therefore, it is a central category of our analysis. Developing a theoretical framework for the concept of insights and defining relevant categories of analysis will be an important area of future research.
2 Compared Techniques
An interactive InfoVis technique named Gravi++ (GRAVI) was developed to support the therapists and clinicians in exploring the multidimensional, abstract, and time-dependent data [10]. GRAVI is based on a spring metaphor. The questions from the questionnaires are positioned on a circle. The icons representing the anorectic young women are arranged within this circle depending on the strength of attraction of the questions. The questions function, to a certain extent, like magnets. The final position of a patient's icon is a combination of the forces of all given answers to the questions
Fig. 1. GRAVI: Interactive InfoVis-Tool for Exploration of Multi-Dimensional Time Dependent Data (Typical Screenshot). Concept of Spring-Based Positioning Leads to Formation of Clusters.
(see Fig. 1). GRAVI uses animation to deal with the time-dependent data. The positions of the patients' icons change over time. This allows analyzing and comparing the changing values. Various visualization options are available, such as Star Glyphs and attraction rings to communicate the exact values of each answer, or traces to show the paths of the patients' icons over all time steps.
We decided to compare GRAVI with the following techniques used so far for analyzing the data: Exploratory Data Analysis (EDA) and algorithms of Machine Learning (ML). In the case of EDA, boxplots, histograms, scatterplots, and statistical measures were used (e.g., Fig. 2). The ML algorithms were a C4.5 decision tree (e.g., Fig. 3) and a Support Vector Machine (SVM) trained by Sequential Minimal Optimization (SMO).
Exploratory Data Analysis (EDA) was developed by Tukey [11] and is based on statistics. It helps users to review and analyze data on a descriptive level. Tukey thought that the emphasis on statistical testing might be too narrow an approach. He therefore suggested EDA as a way to formulate hypotheses and assess assumptions. Subjects were given printouts of these techniques.
Machine Learning is an area of AI concerned with the development of algorithms that enable computers to 'learn'. A Machine Learning technique learns from observed examples or data. In general, there are two types of machine learning algorithms: supervised
Fig. 2. Exploratory Data Analysis (EDA) Sample: Boxplots
and unsupervised. In supervised learning, a priori knowledge about the data is used, whereas in unsupervised learning no prior information is given regarding the data or the output. We utilized two supervised schemes using WEKA [12]: a Support Vector Machine with the Sequential Minimal Optimization algorithm [13,14] and a pruned C4.5 decision tree [15]. The outputs of these two techniques were again available to the subjects as printouts.
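The spring metaphor described for GRAVI at the beginning of this section can be illustrated with a deliberately simplified model. The sketch assumes ideal springs of zero rest length whose stiffness is proportional to the answer value, in which case the equilibrium position of an icon is the answer-weighted centroid of the question positions. The actual Gravi++ layout algorithm is described in [10] and may differ; all names below are hypothetical.

```python
import math


def question_anchors(n_questions, radius=1.0):
    """Place the questions evenly on a circle around the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n_questions),
         radius * math.sin(2 * math.pi * k / n_questions))
        for k in range(n_questions)
    ]


def icon_position(answers, anchors):
    """Equilibrium of zero-length springs: the answer-weighted centroid.

    `answers` are non-negative attraction strengths, one per question.
    An icon with no attraction at all is placed in the centre.
    """
    total = sum(answers)
    if total == 0:
        return (0.0, 0.0)
    x = sum(a * ax for a, (ax, _) in zip(answers, anchors)) / total
    y = sum(a * ay for a, (_, ay) in zip(answers, anchors)) / total
    return (x, y)


anchors = question_anchors(4)
balanced = icon_position([2, 2, 2, 2], anchors)   # pulled equally: stays in the centre
one_sided = icon_position([3, 0, 0, 0], anchors)  # pulled only by the first question
```

Under this model, patients with similar answer profiles end up close to each other, which is consistent with the cluster formation mentioned in the caption of Fig. 1.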
3 Evaluation Methods
An extensive evaluation of InfoVis has to take place at different stages. Important areas of interest can be: usability evaluation, insight study, case study, and transferability assessment (see [16] for details). For results of a usability evaluation of GRAVI see [17]. The methods used in the insight study were insight reports [16] and focus groups (cf. [18]). A sample of 32 subjects participated in the study. They were computer science students and can therefore be described as domain novices. They therefore received a comprehensive introduction to the domain (data, real users' tasks, etc.) and introductions to the three different techniques to be used. The evaluation with insight reports took place in a laboratory setting and lasted 155 minutes in total. Equal time was allotted to the three techniques (GRAVI, EDA, ML). Subjects were divided into three groups which used the three techniques in different orders. Every technique was used once in first, second, and third place (MEG, EGM, GME).
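The three orders MEG, EGM and GME are the cyclic rotations of one sequence, which form a Latin square: each technique appears exactly once in each position. A minimal sketch of how such orders can be generated is given below; the function name is hypothetical and the code is not taken from the study.

```python
def cyclic_orders(techniques):
    """Return all cyclic rotations of the list (a Latin square of orders)."""
    n = len(techniques)
    return [techniques[i:] + techniques[:i] for i in range(n)]


orders = cyclic_orders(["ML", "EDA", "GRAVI"])
initials = ["".join(name[0] for name in order) for order in orders]

# Each position (first, second, third) contains every technique exactly once.
positions = [sorted(order[p] for order in orders) for p in range(3)]
```

Cyclic rotation balances the position of each technique but not every possible sequence; a design covering all six permutations would be needed to balance carry-over effects fully.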
Fig. 3. Machine Learning (ML) Sample: C4.5 Decision Tree
The subjects used a report system to formulate and document their findings during the exploration process in a self-directed manner. Whenever an insight occurred they had to generate a report with this system. The following data was collected: used material, description of finding, and confidence rating. The insight reports were later classified in the following categories: complexity of each insight, plausibility of an insight, and whether an assigned insight has been elaborated in more detail and if so, whether this elaboration was sound or not valid (see Fig. 4). Focus groups can give interesting insights into the users’ attitudes and experiences although they do not provide representative results [18]. [19] reports that focus groups are especially valuable for evaluating InfoVis techniques as they are able to uncover unexpected problems that cannot be perceived by other research methods. In this sense, they can be an interesting complementary approach to other more systematic methods. So focus groups with the same subjects were held a week after the laboratory setting. They lasted about 100 minutes each. Eight questions were discussed (e.g., ease of use and utility, major strength and weakness, similarity and difference of insights gained with the different techniques, appropriateness of combined use). The value of
Fig. 4. Insight Report Documented by Subject with Classification and Categorization Options for Investigators
this method is that it reveals subjective impressions on questions not asked before and gives a different perspective on as well as arguments for interpretation of the data collected in the experiment. The discussion guideline consisted of eight questions. A set of the first four questions had to be discussed by subjects for each of the three used techniques (GRAVI, EDA, ML) separately. Afterward four more questions were addressed to them concerning all three techniques:
1. Appropriateness of the allowed time.
2. Ease of use and usefulness of the technique for gaining insights.
3. Overall confidence in insights gained with the technique.
4. Major strength and weakness of the technique.
5. Similarity and difference of gained insights using different techniques.
6. Assumed comprehension rates of the complex matter with each technique.
7. Appropriateness of combined use of the three techniques.
8. Order for best possible comprehension of the data.
4 Results
4.1 Insight Reports
The 32 subjects documented a total of 876 reports. In the classification process we defined 805 different insights, which were assigned 2166 times to the reports. A statistical analysis of the data collected in this experiment was carried out. In-depth details of these results are currently under review and will be published in the near future.
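To give a sense of scale, the three counts reported above can be turned into simple averages. The ratios below are descriptive arithmetic derived only from those counts; they are not results reported by the study, and the variable names are hypothetical.

```python
subjects = 32
reports = 876
distinct_insights = 805
assignments = 2166

reports_per_subject = reports / subjects                 # 27.375
assignments_per_report = assignments / reports           # about 2.47
assignments_per_insight = assignments / distinct_insights  # about 2.69
```

On average, each subject therefore wrote about 27 reports, and each report was assigned between two and three classified insights.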
To sum up, the results could lead to the conclusion that ML is not a recommendable technique. The subjects' confidence ratings were low, the complexity of the gained insights was low and few predictors were found. On the other hand, GRAVI performed very well concerning insights with high domain value (finding predictors). Confidence ratings were also generally high. EDA lies somewhere in between. Histograms and scatterplots are well known, but the interpretation of boxplots and statistical measures requires some familiarity with these techniques. EDA seems especially suited (or, more precisely, it was utilized in particular) to analyze single values of individual patients in specific time steps. This may also be the reason why there were fewer wrong arguments with EDA. In contrast, there were many wrong arguments with ML.
4.2 Focus Groups
Appropriateness of the Allowed Time. Concerning ML, the subjects' statements clearly show a connection to the position of ML in the order of used techniques: when ML was the first technique, all of the subjects stated that there was too little time for the tasks. If ML followed GRAVI, the allowed time was rated appropriate. Using ML last led to the assessment that there was too much time left. Many explanatory statements were as follows: subjects are not familiar with ML; ML is not a suitable technique to start with if one is not already a domain expert; ML is complex and confusing; and there were no new insights that had not already been gained with the other two techniques. In general, the time allowed while using EDA was predominantly rated appropriate. Once more, only when it was used as the first technique would the subjects have needed more time for the tasks. The familiarity with EDA was pointed out by the subjects. Only the statistical measures from EDA were criticized as difficult to interpret.
The ratings for GRAVI are similar to those for ML, though not as pronounced, and follow the position of GRAVI: if GRAVI was used as the first technique, subjects would have needed more time to become familiar with both the technique and the domain. For GRAVI in second position there is a trend towards too much time being available for the tasks. When it was used as the last technique, subjects rated the allowed time appropriate.
Ease of Use and Usefulness of the Technique for Gaining Insights. ML had the lowest scores regarding usefulness for gaining insights. 55% of all statements made by the subjects belong to the lowest category on this scale. Once again, unfamiliarity with ML and its complexity led to a high level of uncertainty. The assessment of EDA was twofold: scatterplots and histograms scored very well, whereas boxplots and statistical measures were not rated as useful. The former were favored for their simplicity and for being visualizations. The latter were criticized for being complicated and, in the case of statistical measures, also for not being a visualization. For GRAVI, too, the subjects appreciated some elements and disapproved of others. The interactivity of this technique in general and its powerful capability to handle the time-dependent data in particular were rated as very useful. Various visual details, such as the poor visibility of missing data, were mentioned as hindering usefulness.
Overall Confidence in Insights Gained with the Technique. ML received an even worse assessment in the focus groups than in the ratings given in the lab setting: 65.6% of the statements rated ML in the category “low confidence”. This high ratio is most likely due to peer pressure in one of the three groups, where all 12 participants unanimously rated ML as “low confidence”. Interestingly, EDA scored better than GRAVI in the focus groups. One possible explanation is that EDA received many high ratings in the focus group of those who used EDA last. So, on the one hand, there is probably a learning effect leading to more domain expertise, which also affects the confidence in observations. On the other hand, EDA was the only technique the subjects were rather familiar with. It is therefore all the more noteworthy that GRAVI received only a few more ratings in the “low confidence” category than EDA.
Major Strength and Weakness of the Technique. Although the subjects could not make much use of ML, they believed that for ML experts this technique allows for very concise and valid insights. There was strong appreciation of the automaticity of calculations and a high level of faith in the correctness of the results. The latter was also reinforced by the confusion matrix, which was often mentioned positively and is a self-evaluation of correctness provided by the ML algorithms. The visualization of the decision tree was rated a plus, whereas the formula of the SMO was said to be confusing. The mentioned strengths of EDA were: visual elements (scatterplots, histograms), simplicity, familiarity, and clarity of displayed data. The lack of interactivity, the impossibility of comparing patients and/or groups of patients, and problems with the exploration of time dependency are the downsides of EDA. GRAVI impressed with its interactivity, its many options to visualize data in different ways, its handling of time-dependent data, its simplicity, and its intuitive interface.
Subjects saw the major weakness of GRAVI in the fact that visualizing much data rapidly leads to cluttered displays. It is also important to check and re-check possible insights with different constellations; otherwise, false conclusions could easily be drawn.
Similarity and Difference of Gained Insights Using Different Techniques. The majority of subjects reported that they found the same insights with the three different techniques. Almost two thirds of the statements fell into this category. Nevertheless, the level of detail of the insights varied.
Assumed Comprehension Rates of the Complex Matter with Each Technique. ML proved to contribute very little to the comprehension of the provided data. This is in clear accordance with the subjects' earlier statements. EDA and GRAVI, on the other hand, could be utilized well by subjects.
Appropriateness of Combined Use of the Three Techniques. 45% of the statements indicated that the combined use of ML, EDA, and GRAVI makes perfect sense because all three techniques offer different views on the data and therefore facilitate a deeper understanding and extensive exploration. Another 45% of the statements argued for omitting ML because of its marginal contribution to the comprehension of the data for subjects who were not familiar with this complex technique.
Order for Best Possible Comprehension of the Data. There were almost as many preferred orders in using the three techniques as there were subjects. But there are also some major similarities in the statements: ML is not suitable as the first technique but more useful to recheck insights gained with other techniques. GRAVI and also parts of EDA (simple visual parts: histograms and scatterplots) are viable techniques for first exploration of data. Another interesting outcome in the discussion was that the different techniques should not be used sequentially like in the laboratory setting but simultaneously. The already mentioned different views they provide on the data could add much more value in this way.
5 Conclusion
The use of diverse evaluation methods enables different views on the technology under investigation. Whereas insight reports can reveal strengths and weaknesses in the form of summative tests followed by statistical analysis, focus groups often give reasons and additional subjective opinions of subjects and therefore also help ensure correct interpretation of the former. The outcome of the insight reports could lead to the conclusion that ML is not a recommendable technique because of low confidence ratings, low complexity of the gained insights, and the small number of predictors found. On the other hand, GRAVI performed very well. There were many insights with high domain value (predictors) and with high confidence ratings. EDA seems especially suited to analyzing single values of individual patients in specific time steps. The outcome of the focus groups shows that GRAVI is useful for gaining insights with a high confidence rating because of its flexibility through interactivity, the ability to explore more dimensions simultaneously, and the straightforward navigation within the time-dependent data. Moreover, subjects rated GRAVI an appropriate visualization tool. ML should be omitted unless there is enough expertise with this technique. If there is, it still can and probably will be a powerful technique to gain insight. EDA rapidly leads to insights (although rather basic ones) due to the general familiarity with this technique. Combining these results, we see that all three techniques offer different views on the data and that a combined use will therefore likely lead to more insight and comprehension.
Acknowledgments. The project “Interactive Information Visualization: Exploring and Supporting Human Reasoning Processes” is financed by the Vienna Science and Technology Fund [Grant WWTF CI038]. Thanks to Bernhard Meyer for the collaboration in the classification process.
References
1. Chen, C.: Empirical evaluation of information visualizations: an introduction. Int. J. Human-Computer Studies 53(5), 631–635 (2000)
2. Plaisant, C.: The challenge of information visualization evaluation. In: Costabile, M.F. (ed.) Proceedings of the working conference on Advanced visual interfaces, pp. 109–116. ACM Press, New York (2004)
3. Tory, M., Möller, T.: Human factors in visualization research. Visualization and Computer Graphics, IEEE Transactions on 10(1), 72–84 (2004)
4. Spence, R.: Information Visualization. ACM Press, New York (2001)
5. Stasko, J.: Evaluating information visualizations: Issues and opportunities (position statement). In: Bertini, E., Plaisant, C., Santucci, G. (eds.): Beyond time and errors: novel evaLuation methods for Information Visualization – Proceedings of BELIV’06, Venice, Italy, pp. 5–7 (2006)
6. Saraiya, P., North, C., Duca, K.: An insight-based methodology for evaluating bioinformatics visualizations. Visualization and Computer Graphics, IEEE Transactions on 11(4), 443–456 (2005)
7. Eysenck, M.W., Keane, M.T.: Cognitive Psychology. A Student’s Handbook. Psychology Press, Taylor and Francis Group, London, New York (2005)
8. North, C.: Toward measuring visualization insight. Computer Graphics and Applications, IEEE 26(3), 6–9 (2006)
9. Lanzenberger, M.: The Interactive Stardinates – An Information Visualization Technique Applied in a Multiple View System. PhD thesis, Vienna University of Technology, Vienna, Austria (September 2003)
10. Hinum, K., Miksch, S., Aigner, W., Ohmann, S., Popow, C., Pohl, M., Rester, M.: Gravi++: Interactive information visualization to explore highly structured temporal data. Journal of Universal Comp. Science 11(11), 1792–1805 (2005)
11. Tukey, J.W.: Exploratory Data Analysis. Addison-Wesley, Reading, Mass (1998)
12. Witten, I.H., Frank, E.: Data Mining: Practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco, CA (2005)
13. Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods - Support Vector Learning, pp. 185–210. MIT Press, Cambridge (1998)
14. Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K.: Improvements to Platt’s SMO Algorithm for SVM Classifier Design. Neural Computing 13(3), 637–649 (2001)
15. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco, CA (1993)
16. Rester, M., Pohl, M., Hinum, K., Miksch, S., Popow, C., Ohmann, S., Banovic, S.: Methods for the evaluation of an interactive infovis tool supporting exploratory reasoning processes. In: BELIV ’06: Proceedings of the 2006 AVI workshop on Beyond time and errors, New York, NY, pp. 32–37. ACM Press, New York (2006)
17. Rester, M., Pohl, M., Hinum, K., Miksch, S., Ohmann, S., Popow, C., Banovic, S.: Assessing the usability of an interactive information visualization method as the first step of a sustainable evaluation. In: Proc. Empowering Software Quality: How can Usability Engineering reach these goals?, Austrian Computer Society, pp. 31–44 (2005)
18. Kuniavsky, M.: User Experience: A Practitioner’s Guide for User Research. Morgan Kaufmann, San Francisco (2003)
19. Mazza, R.: Evaluating information visualization applications with focus groups: the coursevis experience. In: BELIV ’06: Proceedings of the 2006 AVI workshop on BEyond time and errors, New York, NY, USA, pp. 1–6. ACM Press, New York (2006)
Serial Hanging Out: Rapid Ethnographic Needs Assessment in Rural Settings
Jaspal S. Sandhu1,∗, P. Altankhuyag2, and D. Amarsaikhan3
1 College of Engineering, University of California, Berkeley, USA [email protected]
2 Asian Development Bank, Ministry of Health, Mongolia
3 Postgraduate Institute, Health Sciences University of Mongolia
Abstract. This paper presents an ethnographic method for assessing user needs in designing for rural settings. “Serial Hanging Out” consists of short-term participant observation with multiple, independent informants. The method is characterized by: (1) its short-term nature, (2) the use of participant observation supported by specific field techniques, and (3) the emphasis on user needs for design. It is discussed in relation to similar methodological work in associated fields. To ground the discussion, the method is presented in the context of ongoing work to develop improved information systems to support rural health workers in Mongolia.
Keywords: participant observation, ethnography, design, qualitative methods, user needs, rural, Mongolia.
1 Introduction
Rapid ethnographic methods play a critical role in human-centered design. They have been applied extensively not only in human-computer interaction (HCI) and computer-supported cooperative work (CSCW) [1][2][3], but also in product design [4][5], and marketing and consumer research [6][7]. The common motivation for the use of these methods across various disciplines is that they provide a much richer understanding of people, in context and from their own perspective. Blomberg et al. succinctly describe the key dimensions of ethnographic research: (1) it takes place in natural settings, (2) it is holistic, i.e. understanding is framed in systems larger than the immediate context, (3) it is descriptive, and (4) it strives to consider the member’s own perspective [8]. Rapid methods are necessary because projects operate under severe time and budgetary constraints; however, holism and the member’s perspective are often sacrificed in order to operate within these constraints [9]. While ethnographic methods have been used for requirements generation [10], the emphasis here is on design innovation in unfamiliar environments, specifically rural communities in the context of international development. The focus is not only on how technology can
be designed, but also on whether technology makes sense in the first place. In either case, the objective is to design holistic and systemic interventions that target real development problems. Such design innovation requires a deep understanding of target users, which is challenging to obtain using rapid methods. Dourish criticizes ethnographic practice in HCI on the basis that it tends to neglect the interpretive nature of ethnography [9]. Achieving a deep understanding and preserving the interpretive nature of ethnographic research are typically associated with long-term fieldwork, but it is unrealistic to propose long-term fieldwork in most applied design settings. The presented work addresses this issue through longer engagements with individuals than are typical in applied design settings, and by operating from an interpretive perspective, namely one that sits “between two worlds or systems of meaning – the world of the ethnographer … and the world of cultural members” [11].
In HCI, Rapid Ethnography [12] and Quick and Dirty Ethnography [3] have been presented, but these methods have not been sufficiently evaluated in practice [13]. While these forms of ethnographic research focus on a place – offices and air traffic control rooms, respectively – the method proposed here is focused on the individual-as-anchor; this makes more sense in many rural applications and is a major reason for framing this method as one for rural applications. Rapid Ethnography and Quick and Dirty Ethnography share an emphasis on focused studies. Such a premature focus may blind researchers to critical information in relatively unfamiliar, rural settings.
Although most ethnographic work is cross-cultural to some degree, work in rural communities for international development represents an extreme because those involved in conducting the research often come from very different cultural backgrounds, even if they are from in-country. The settings of international development are necessarily more diverse than business settings, where “through years of experience, trained ethnographers build up a great deal of knowledge … about those segments of the population who are reliably of interest to business” [14]. Given the relative unfamiliarity of the context to those conducting the research, informal methods [15] are required, and researchers need to “approach social life with a wide-angle lens” [16]. Prior methodological work in HCI has not focused on rural settings, so there is an opportunity for HCI to contribute to related work in the international development community [17][18]; however, the proposed methodology – although related to HCI and innovation – is more concerned with meeting specific development objectives than it is with achieving novel technology gains.
2 The Method
The proposed method for understanding user needs in the context of designing information systems for rural communities is Serial Hanging Out (SHO): sequential, short-term (2-4 days) participant observation with multiple, independent informants. The participant observation techniques are more sophisticated than the phrase
“hanging out” suggests.1 Still, the metaphor of “hanging out” captures the essence of the work: participant observation is “a way to collect data in naturalistic settings … [to] observe and/or take part in the common and uncommon activities of the people being studied” [19]. SHO is related to Sanjek’s [20] network method in urban ethnography. His “network-serials” consisted of intensive interviewing of informants in order “to describe the behavior and purposes of the members … and to chart the range of interaction settings.” A key similarity between Sanjek’s network serials and SHO is that they focus on informants as “anchors”, providing access to different activities, interactions, relationships, and actors. Despite the similarities, there are several key differences. First, the core field method in SHO is short-term participant observation rather than interviewing. Sanjek’s suggestion of “direct behavioral recording” as a plausible alternative to “intensive interviewing” highlights another important difference related to the respective goals of the two methods. Network-serials are intended to chronicle behavior, but SHO uses the concept of serial interactions and social networks to explore the interactions themselves. It is not simply interactions occurring in a particular time and place with particular actors that are of interest in SHO, but it is the quality, content, and meaning of those human interactions. Moreover, SHO is not limited to interpreting interactions and movements – in particular, other elements of understanding arise from contextual, ethnographic interviews. In practical terms, SHO uses multiple teams of 1-2 researchers each in order to conduct a study. The need for parallel teams is driven by the significant time investment in each participant, not only the time with the informant, but also the transport time (often significant in rural areas, especially with disparate informants) and the time spent synthesizing field data. 
As Millen states, international research may require multiple researchers on a single team in order to “help with language and local cultural issues” [12]. With respect to parallel teams, he notes that multiple researchers can observe different groups. Researchers will always have some influence on the situations in which they are involved. This is understood and is in fact an integral part of interpretive research; however, the influence of more than two researchers is so disruptive that two is the maximum number of researchers to be involved with a single informant. SHO avoids Geertz’s criticism of “hit-and-run” ethnography [21] by concentrating on specific informants, methods, and themes of inquiry. Further, the focus on informants-as-anchors, the longer engagement, and the emphases on holism and the member’s point of view distinguish SHO from contextual inquiry [22].
1 The phrase is adopted from Geertz [21], who cites Clifford as originating the phrase “deep hanging out”. Clifford’s tone is intended as an affront to traditional ethnography, but Geertz picks up the phrase, dusts it off, and wears it proudly.
3 Context
This paper presents a theoretical argument for SHO that informs the design of an applied research study in rural Mongolia. While the method draws from the primary author’s past experience with design research in rural settings [23][24], it is being formally implemented and evaluated in the context of this Mongolia study.
Mongolia is unique among developing countries in that one-quarter of the overall population is nomadic or semi-nomadic, and rural areas are extremely sparsely populated [25]. By many standards, the Mongolian health infrastructure is highly developed; however, facilities and human resources are increasingly limited in providing effective healthcare for rural populations that lie beyond aimag (provincial) capitals. Bagiin baga emch, rural health workers, provide services at the bag (smallest administrative unit) level by traveling from household to household by motorcycle or horse; however, there are significant unmet needs in continuing training of these paraprofessionals [26] and in providing support for their work practice. This research focuses on understanding the lives of bagiin baga emch in order to design improved information systems to support their work.
The primary field research is being undertaken in partnership with an Asian Development Bank (ADB) project: “Information and Communication Technology for Improving Rural Health Services in Mongolia” (JFICT 9053-MON). The objective of this project is to improve the health of vulnerable, rural populations – especially mothers and young children – by using ICT (information and communication technology) tools to support health services delivery. Part of this project involves providing PDAs (personal digital assistants, or handheld computers) to bagiin baga emch, primarily in order to support data collection. PDA deployment and associated training began in spring 2007. In addition, this work will contribute to an International Development Research Centre (IDRC) pilot project that will provide PDAs to rural health workers in Nepal (via HealthNet Nepal) and Mongolia (Health Sciences University of Mongolia). The intention of providing PDAs is to support continuing education and decision-making at the point of care.
4 Sampling
Sampling issues are fundamental to ethnographic research, particularly in rural settings where potential users of systems may be geographically and culturally dispersed. In SHO, maximum variation sampling [27] is preferred because it provides maximal coverage of perspectives, behaviors, practices, interactions, and activities. Maximum variation sampling is purposeful selection of informants representing a broad range on key dimensions. In the Mongolia case, such dimensions include geographic zone (e.g. arid steppe, forest steppe, mountain, desert), transport (motorcycle, horse, camel), and work experience. Bagiin baga emch are a formal part of the health care system, so the researchers have access both to the entire sampling frame and to key intermediaries at the aimag (provincial) and soum (county) levels. In other settings, access to informants, or information about them, may be more difficult to obtain. In such cases, other sampling strategies [27] may be more suitable. The goal of this method is as much to define intracultural variation [28] in apparently homogeneous groups as it is to define cultural patterns. By understanding variation in
meanings and practices within our sample of bagiin baga emch, we can develop strategies that have more universal appeal and can even take advantage of knowledge owned by a subset of the population. It can be argued in this regard that von Hippel’s work with lead users is primarily motivated by intracultural variation [29].
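The idea of maximum variation sampling described in this section can be sketched in code. The following is only an illustrative greedy heuristic, not the authors' selection procedure: it assumes the sampling frame is available as a list of records, and the informant identifiers, the `experience` labels, and the function name `max_variation_sample` are all hypothetical. At each step it picks the candidate who adds the most dimension values not yet represented in the sample.

```python
def max_variation_sample(frame, dimensions, k):
    """Greedily pick up to k informants that together cover as many
    distinct (dimension, value) pairs as possible.

    This is an assumed heuristic for illustration; purposeful sampling in
    the field also relies on judgment and on local intermediaries.
    """
    chosen = []
    covered = set()          # (dimension, value) pairs already represented
    remaining = list(frame)

    while remaining and len(chosen) < k:
        def gain(person):
            # Number of dimension values this person would newly add.
            return sum((d, person[d]) not in covered for d in dimensions)

        best = max(remaining, key=gain)   # ties go to the earliest record
        chosen.append(best)
        covered.update((d, best[d]) for d in dimensions)
        remaining.remove(best)
    return chosen


# Hypothetical sampling frame; the values are placeholders, not study data.
SAMPLING_FRAME = [
    {"id": "A", "zone": "desert", "transport": "camel", "experience": "senior"},
    {"id": "B", "zone": "desert", "transport": "motorcycle", "experience": "senior"},
    {"id": "C", "zone": "forest steppe", "transport": "horse", "experience": "junior"},
    {"id": "D", "zone": "mountain", "transport": "motorcycle", "experience": "junior"},
    {"id": "E", "zone": "arid steppe", "transport": "horse", "experience": "senior"},
]
DIMENSIONS = ("zone", "transport", "experience")

selected = max_variation_sample(SAMPLING_FRAME, DIMENSIONS, k=3)
selected_ids = [person["id"] for person in selected]
```

A greedy choice is used here because it is easy to inspect by hand; it does not guarantee the best possible coverage for a given sample size.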
5 In the Field
Regarding specific field techniques, pens and paper notebooks will be used for field notes since, from prior experience, researchers will be highly mobile. The recommendations of Emerson et al. [30] will be followed closely for developing field jottings. In addition to their recommendations, sketching and diagramming will be used as mnemonic devices. Opportunistic digital audio recording will be used to capture unstructured interviews. Digital photographs and short digital video clips will be recorded on a limited basis to supplement observations and interviews. The primary output of an encounter with an informant will be a narrative – blending realist and impressionist styles [11] – supported by annotated photographs. The audio and video will be used primarily to support the construction of these narratives, but will also remain available for secondary analysis in later phases of the design process. The narratives will be written immediately after leaving the field in order to support maximal recall. In the Mongolia work, this will typically mean writing the narratives at soum health centers, within 24-48 hours of leaving the informant.
To this point, SHO has been presented as a cleanly operationalized process; however, as with other ethnographic enterprises, this is simply not the case. Significant field preparation is required and has been undertaken with the Mongolia project. Review of previous research with bagiin baga emch and other rural health workers was an initial step in beginning to understand the culture and work conditions of bagiin baga emch. This is an essential stage in the process, although the literature may sometimes be non-existent [3] or misleading [15]. For the lone foreign researcher on this team, language preparation was critical (as it was for Sanjek’s work in Accra, Ghana [20]).
The essential nature of language – and some basic cultural understanding – is why it is important for this work to be done by in-country researchers, or at the very least, in close collaboration with them. Other field preparation has included key informant interviews, observation of continuing training in aimag capitals, and pilot testing of the field protocol.
6 Time
There are many factors to consider in the design of rapid ethnographic research, but time-in-the-field is often a primary concern (Table 1). Although Table 1 seems to indicate significant differences in time of engagement, the differences are in fact less dramatic, as the cumulative sum of fieldwork days is greater for those methods that involve multiple sites or informants (Rapid Ethnography, Serial Hanging Out).
Table 1. Sample time recommendations for different rapid ethnographic methods
Author             Method                       Time of Engagement                                  Unit of Analysis
Millen [12]        Rapid Ethnography            1 day or less                                       Per site
Hughes et al. [3]  Quick and Dirty Ethnography  4 weeks each study, multiple studies over 3 years   Single site
Beebe [18]         Rapid Assessment Process     4-40 days                                           Single site
Handwerker [31]    Quick Ethnography            3-90 days                                           Single site
Sandhu et al.      Serial Hanging Out           2-4 days                                            Per informant
In the business world, “ethnographies (read: participant observation) can last a half a day or even less. How is this possible? Ethnographers working in business are generally PhDs and typically manage this seemingly impossible feat by applying their methodological skill and accrued knowledge of theories of human behavior and social interaction” [14]. There are two problems in applying such logic to international development: (1) such familiarity does not exist for rural international development, even for trained people who are from in-country, and (2) such an attitude shifts the power from informant to ethnographer – the informant becomes a subject, rather than participant, in the research. In Mongolia, the selection of 2-4 days per informant is motivated by the nature of bagiin baga emch activities – some activities take much longer than a half day, such as monthly visits to households (2-3 days), visits to soum health centers (1-3 days), and summons (duudlaga) to patients’ homes (half to full day). In addition, the strategy is to get the most benefit from the relatively high time and monetary investment in rural travel.
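The point about cumulative fieldwork days can be made concrete with a small calculation. The sketch below takes the 2-4 days per informant from Table 1; the number of informants and the number of parallel teams are hypothetical values chosen only for illustration, and travel and synthesis time are deliberately left out.

```python
import math


def cumulative_field_days(days_per_unit, units):
    """Total fieldwork days for a (low, high) per-unit range over `units`
    sites or informants."""
    low, high = days_per_unit
    return low * units, high * units


def calendar_days(total_days, teams):
    """Elapsed days if the work is split evenly across parallel teams."""
    return math.ceil(total_days / teams)


SHO_DAYS_PER_INFORMANT = (2, 4)  # from Table 1
informants = 10                  # hypothetical, not a figure from the study
teams = 3                        # hypothetical number of parallel teams

total_low, total_high = cumulative_field_days(SHO_DAYS_PER_INFORMANT, informants)
elapsed_low = calendar_days(total_low, teams)
elapsed_high = calendar_days(total_high, teams)
```

Under these assumed numbers the per-informant range of 2-4 days becomes 20-40 cumulative fieldwork days, which is comparable to the single-site ranges in Table 1.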
7 Multiple Researchers and External Reliability
Issues of external reliability2 are a primary concern in SHO given the use of multiple, parallel researchers. LeCompte and Goetz indicate that ethnographers “enhance the external reliability of their data by recognizing and handling five major problems” [32]. These five problems and the mechanisms for managing them in SHO are presented below:
1. Researcher status: Mobility along the participant-observer spectrum is limited by selecting similar informants (all bagiin baga emch), by selecting a research team with comparable abilities to one another, and by making explicit the desired status among the research team.
2. Informant choices: The informants are similar along several dimensions since they represent a single class of users (bagiin baga emch).
3. Social context: An engagement of 2-4 days will ensure access to multiple social settings and actors.
4. Analytic constructs: A single field protocol will be used by all field researchers.
5. Data collection/analysis: The field protocol and periodic team meetings will be used to manage data collection and preliminary analysis in the field, while a structured, team-based process will be used during the data analysis phase.
2 “External reliability addresses the issue of whether independent researchers would discover the same phenomena or generate the same constructs in the same or similar settings” [32].
8 Data Analysis
Following the model of Griffin and Hauser [33], the narratives resulting from the SHO will be analyzed by teams of researchers and students, in this case students from the Health Sciences University of Mongolia. Part of the motivation for doing so is to evaluate the effectiveness of this method in uncovering user needs. User needs are one, but by no means the only, way to bridge ethnography and design. Urban and Hauser define needs as “statements in the words of the customer that describe the benefits they need, want, or expect to get from a product” [34]. SHO extends this definition in three ways. First, while statements are important, other aspects of ethnographic work are also included. Second, products include services, not just artifacts or technological systems. Third, in international development more than business development, such products or services as are being developed may not exist or may exist in radically different forms, so wants and expectations may be difficult to obtain.
The needs analysis is a team-based process of identifying both explicit (stated) and implicit (latent or unarticulated) needs from the narratives. Upon completion of the needs analysis, all needs will be merged and redundant needs will be removed. The team will then use affinity diagramming3 to create a hierarchy of needs. Such a hierarchy makes the process of translating user needs into novel design concepts more tractable. Finally, the user needs will be tied to the context from which they came to preserve the richness of the design research. The resulting needs and associated data will be used to generate novel design concepts. The concepts will then be prototyped and tested with users, whether or not the prototyped systems are “technological”.
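The bookkeeping side of the needs analysis (merging needs from several analysts, removing redundant ones, and keeping each need tied to its source narrative) can be sketched as a small data structure. This is a minimal sketch under stated assumptions, not the authors' tooling: the `Need` record, the narrative identifiers, and the example statements are invented for illustration and are not findings of the study. Redundancy is detected here only by normalized wording, whereas in practice the team judges redundancy, and the affinity groups are supplied by the team rather than computed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    statement: str     # the need, ideally in the informant's own words
    kind: str          # "explicit" (stated) or "implicit" (latent)
    narrative_id: str  # hypothetical identifier of the source narrative


def normalize(statement):
    """Collapse case and whitespace so trivially redundant needs match."""
    return " ".join(statement.lower().split())


def merge_needs(*analyst_lists):
    """Merge needs from several analysts; redundant statements share one
    entry that keeps every source record, preserving the link to context."""
    merged = defaultdict(list)
    for needs in analyst_lists:
        for need in needs:
            merged[normalize(need.statement)].append(need)
    return dict(merged)


def build_hierarchy(merged, group_of):
    """Arrange merged needs under the affinity groups chosen by the team.
    `group_of` maps a normalized statement to a team-assigned group label."""
    hierarchy = defaultdict(list)
    for key in merged:
        hierarchy[group_of.get(key, "ungrouped")].append(key)
    return dict(hierarchy)


# Invented placeholder statements, for illustration only.
analyst_a = [
    Need("Know which households are due for a visit", "explicit", "N01"),
    Need("Reach the soum health center in winter", "implicit", "N02"),
]
analyst_b = [
    Need("know which households are due  for a visit", "explicit", "N07"),
]

merged = merge_needs(analyst_a, analyst_b)
hierarchy = build_hierarchy(
    merged, {"know which households are due for a visit": "planning visits"}
)
```

Keeping the full list of source records under each merged need is one simple way to tie needs back to the narratives they came from.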
3 Affinity diagramming is a team-based process of grouping ideas, in this case user needs, based on the ideas themselves, rather than external categories. It is also known as the KJ method after Japanese ethnologist Kawakita Jiro, the inventor of the method.
9 Summary and Assessment
Serial Hanging Out (SHO) is a rapid, ethnographic method for uncovering user needs in rural settings. This method is particularly well-suited to the study of a geographically dispersed class of users [20], as may be found in rural institutions (in the present case, Mongolia’s rural health care system). Moreover, by selecting informants as anchors, and by experiencing interactions over a multi-day period, a rich sample of interactions and activities can be included in the development of ethnographic narratives. Although the method draws from Sanjek’s network-serial method, SHO is less concerned with behavioral mapping than it is with using spatial movements and interactions as scaffolding for ethnographic inquiry. Also, as opposed to network-serials, SHO emphasizes emic rather than etic4 perspectives, and operates in an interpretive frame of reference. Although there are some similarities to other rapid methods in HCI [3][12], this work is different in that it emphasizes the use of multi-day participant observation and that it has a less specific focus at the onset. In any case, none of these methods has been substantially evaluated in practice [13].
By formally implementing this method in the context of applied research in the rural Mongolian health sector, this method can be evaluated in situ, providing evidence as to its usefulness and to the key elements of study design. The evaluative component of this research will address the efficacy, efficiency, and quality of the methods. Details of the evaluation and procedures for evaluation will be presented in future publications.
Although this research will have relevance to the design of information systems, it is also expected to have utility beyond design [9], in both applied and theoretical senses. The ethnographic results should provide a deeper understanding of bagiin baga emch for those developing health strategy for the bag/soum level in Mongolia and should also provide a unique view into the work culture of rural health professionals in a particular place and time. This plurality is a main driver of this research.
Acknowledgements. Thanks to those who have partnered in, or supported, the prior field research which serves as the basis for the concepts in this paper: Jonathan Hey, Catherine Newman, Alice M. Agogino, Teresa DeAnda, Jessica Granderson, Expedita Ramirez, and Kirk R. Smith. Countless discussions with colleagues, mentors, and friends (and a few non-academic strangers) have been instrumental in the development of these ideas. Special thanks on this front to Judd Antin, Michael Barry, Sara Beckman, Griff Coleman, Peter Lyman, and AnnaLee Saxenian. Preliminary fieldwork in Mongolia was supported by a Foreign Language and Area Studies grant. The current research is funded by a Fulbright Fellowship and an NSEP Boren Fellowship. Mahad Ibrahim and Andrei Marin provided invaluable feedback on early drafts of this article. Finally, none of this would be possible without the cooperation of past and current research participants, who have invited us into their homes and daily lives. To them we are most indebted.
4 Emic refers to terms or concepts meaningful to the cultural member, while etic refers to terms or concepts meaningful to the external researcher.
References
1. Gilmore, D.: Business: Understanding and Overcoming Resistance to Ethnographic Design Research. Interactions 9(3) (May 2002)
2. Wasson, C.: Ethnography in the Field of Design. Human Organization 59(4), 377–388 (2000)
3. Hughes, J., Rodden, T., King, V., Anderson, H.: The Role of Ethnography in Interactive Systems Design. ACM Interactions 2(2), 56–65 (1995)
4. Rosenthal, S.R., Capper, M.: Ethnographies in the Front End: Designing for Enhanced Customer Experiences. Journal of Product Innovation Management 23, 215–237 (2006)
5. Squires, S., Byrne, B. (eds.): Creating Breakthrough Ideas: The Collaboration of Anthropologists and Designers in the Product Development Industry. Bergin and Garvey, Westport, Connecticut (2002)
6. Mariampolski, H.: Ethnography for Marketers: A Guide to Consumer Immersion. Sage, Thousand Oaks, California (2006)
7. Arnould, E.J., Wallendorf, M.: Market-Oriented Ethnography: Interpretation Building and Marketing Strategy Formulation. Journal of Marketing Research 31(4), 484–504 (1994)
8. Blomberg, J., Burrell, M., Guest, G.: An Ethnographic Approach To Design. In: Jacko, J.A., Sears, A. (eds.) The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, pp. 964–986. Lawrence Erlbaum Associates, Mahwah, New Jersey (2003)
9. Dourish, P.: Implications for Design. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montréal, Québec, Canada, pp. 541–550 (2006)
10. Sommerville, I., Rodden, T., Sawyer, P., Bentley, R., Twidale, M.: Integrating Ethnography into the Requirements Engineering Process. In: Proceedings of IEEE International Symposium on Requirements Engineering, San Diego, California, pp. 165–173 (1993)
11. Van Maanen, J.: Tales of the Field. University of Chicago Press, Chicago (1988)
12. Millen, D.R.: Rapid Ethnography: Time Deepening Strategies for HCI Field Research. In: Proceedings of the Conference on Designing Interactive Systems, New York City, New York, pp. 280–286 (2000)
13. Kujala, S.: User Involvement: A Review of the Benefits and Challenges. Behaviour and Information Technology 22(1), 1–16 (2003)
14. Plowman, T.: Ethnography and Critical Design Practice. In: Laurel, B. (ed.) Design Research: Methods and Perspectives, pp. 30–38. MIT Press, Cambridge, Massachusetts (2003)
15. Agar, M.: The Professional Stranger: An Informal Introduction to Ethnography. Academic Press, New York (1980)
16. Spradley, J.P.: Participant Observation. Holt, Rinehart and Winston, New York (1980)
17. Chambers, R.: The Origins and Practice of Participatory Rural Appraisal. World Development 4(7), 953–969 (1994)
18. Beebe, J.: Rapid Assessment Process. Altamira Press, Walnut Creek, California (2001)
19. Dewalt, K.M., Dewalt, B.R.: Participant Observation. Altamira Press, Walnut Creek, California (2002)
20. Sanjek, R.: A Network Method and Its Uses in Urban Ethnography. Human Organization 37(3), 257–268 (1978)
21. Geertz, C.: Deep Hanging Out. The New York Review of Books 45(16) (October 1998)
22. Beyer, H., Holtzblatt, K.: Contextual Design: Defining Customer-Centered Systems. Morgan Kaufmann, San Francisco (1998)
23. Sandhu, J.S., Hey, J., Newman, C., Agogino, A.M.: Informal Health and Legal Rights Education in Rural, Agricultural Communities using Mobile Devices. Proceedings of IEEE International Conference on Advanced Learning Technologies, Kaohsiung, Taiwan, pp. 988–992 (2005)
24. Granderson, J., Sandhu, J.S.: Efficiency and Design of Improved Woodburning Cookstoves in the Guatemalan Highlands. Technical Report Max-05-1, School of Public Health, University of California, Berkeley (2005)
25. Ebright, J.R., Altantsetseg, T., Oyungerel, R.: Emerging Infectious Diseases in Mongolia. Emerging Infectious Diseases 9(12), 1509–1515 (2003)
26. Directorate of Medical Service, Department of Human Resource Development: Survey Report on Training Needs of Bag Feldshers. Ulaanbaatar, Mongolia (2004)
27. Patton, M.Q.: Qualitative Evaluation and Research Methods. Sage, Newbury Park, California (1990)
28. Pelto, P.J., Pelto, G.H.: Anthropological Research: The Structure of Inquiry. Cambridge University Press, Cambridge, United Kingdom (1978)
29. von Hippel, E.: The Sources of Innovation. Oxford University Press, New York (1988)
30. Emerson, R.M., Fretz, R.I., Shaw, L.L.: Writing Ethnographic Fieldnotes. University of Chicago Press, Chicago, Illinois (1995)
31. Handwerker, W.P.: Quick Ethnography. Altamira Press, Walnut Creek, California (2001)
32. LeCompte, M.D., Goetz, J.P.: Problems of Reliability and Validity in Ethnographic Research. Review of Educational Research 52(1), 31–60 (1982)
33. Griffin, A., Hauser, J.R.: The Voice of the Customer. Marketing Science 12(1), 1–27 (1993)
34. Urban, G.L., Hauser, J.R.: Design and Marketing of New Products. Prentice Hall, Englewood Cliffs, New Jersey (1993)
Effectiveness of Content Preparation in Information Technology Operations: Synopsis of a Working Paper
A. Savoy1 and G. Salvendy2
1 Purdue University, West Lafayette, IN, USA [email protected]
2 Purdue University, West Lafayette, USA and Tsinghua University, Beijing, China [email protected]
Abstract. Content preparation is essential for web design [25]. The objective of this paper is to establish a theoretical foundation for the development of methods to evaluate the effectiveness of content preparation in information technology operations. Past studies identify information as the dominant concern of users, and delivery mechanism as a secondary concern [20]. The best presentation of the wrong information results in a design with major usability problems and does not aid the user in accomplishing his task. This paper shifts the focus of existing usability evaluation methods. It attempts to fill the void in usability literature by addressing the information aspect of usability evaluation. Combining the strengths of content preparation and usability evaluation yields major implications for a broad range of IT uses.
Keywords: Content preparation, World Wide Web, Usability.
This paper unveils a theoretical foundation for the development of methods to evaluate the effectiveness of content in IT operations. Combining this with existing methods of usability evaluation will provide an overall effectiveness evaluation of IT systems with which humans interact. It attempts to fill the void in usability literature by addressing the information aspect of usability evaluation.
2 Content Preparation
The concept of content preparation emerged from a conference panel discussion in 2002. This panel assessed the preparation of content and its management for four elements of web design: knowledge elicitation, information organization, information retrieval, and information presentation [21]. Identifying specific information needed by users and/or customers is a main goal. Content preparation is a fairly new concept in web design with relevance to e-business and cross-cultural design issues [16]. Because the concept is so new, no evaluation tool for it or its principles has yet been studied or developed. Further, there has not been documentation of an information structure that aids the development of an evaluation tool. This study aims to equip designers with end-user information requirements.
The focus of Content Preparation evaluation and traditional Usability evaluation differ considerably. The former concentrates on “what” information is needed and provided. The latter evaluates presentation and functionality [16]. Both evaluations are important to the design of human-centered interfaces. Currently, only usability testing is practiced. If Content Preparation evaluations were used as a supplement to traditional usability testing, interface evaluations would be more comprehensive.
A similar concept of content preparation was established in earlier years. Its principles were captured in the production of catalogs, newspapers, product manuals, and paper-based bank statements. Studies have investigated differences between traditional and non-traditional media: printed versus online catalogs, printed versus online magazines [17], printed versus online newspapers, printed versus online references, and printed versus online presentation of information. As a result, the varying characteristics of printed and online materials influence modifications/additions for web-based content preparation.
3 Literature Review

Many research studies have addressed the issues of data and information quality [24], [14], [19]. In the Information Systems (IS), Human-Computer Interaction (HCI), and Web-design areas, there have been different approaches for classifying important factors and evaluating quality.

3.1 Information Quality

The majority of IS studies are founded upon the effectiveness of databases, search and retrieval algorithms, and content management systems [12]. Research in these areas tends to include an overall approach. In hopes of discovering a construct defining useful content for general use, studies in IS were investigated.
Research pertaining to information/data quality has been conducted covering traditional and non-traditional contexts of use and presentation. Common classifications of the dimensions include Quality Category, Assessment Class, Criteria, and Dimensions. The Quality Category classification stems from an information quality framework developed to allow IS managers to better understand and meet the needs of their information consumers [24], [12]. This framework consists of the following four categories: Intrinsic information quality, Representational information quality, Accessibility information quality, and Contextual information quality.

The first three categories refer to general credibility issues. Intrinsic information quality suggests that information should have an independent quality [11]. This category is defined further with related dimensions, among which accuracy is labeled the most important [22], [11]. It could be viewed as General Content information quality. Representational information quality is concerned with the format and presentation of information. This category deals with the visual design (i.e., the "how" aspect) of web design, which is not the focus of this paper. Accessibility information quality focuses on security and privacy [12]. This category could be viewed as Trust information quality.

The final category is in line with this paper's objective. Contextual information quality recognizes that consumers' information needs may differ according to their tasks. Nevertheless, general categories (Relevancy, Value-added, Timeliness, and Amount of data) can be used for classification. Content requirements for these categories were not specified, but the framework demonstrates that general categories can be used for classification. Findings from IS provided details on the aspects of credible information rather than useful information. While useful information should be credible, credible information is not necessarily useful.
A clear definition of useful content cannot be deduced from the studies in this area alone.

3.2 Internet Domains

Content preparation emphasizes information that will aid users in their decisions and tasks. Although studies show that user preferences and information needs change according to users' purposes for using the web or according to website domain [3], [26], the desire for useful information is common to all domains. A number of studies are restricted to specific domains, which include E-Commerce, Entertainment, Education, Advertisement, and Medical. The results of each study provide in-depth analysis of which aspects/features were important to usability and customer satisfaction, but information content is rarely addressed.

E-commerce is ranked among the top two domains for visitation and use. Online shopping surfaced as a by-product of the internet and produces billions of dollars in revenue [8]. Customers need specific information that will aid in their decision-making. However, similar research often does not consider specific elements of information for e-commerce websites [8]. Because of the vast number of competitors, it takes more than simple convenience to persuade a shopper to buy from a particular website. Detlor et al. (2003) conducted one of the first studies to address the challenge of identifying specific information elements of e-commerce websites. Their research
concerning browsing versus searching in pre-purchase online information seeking is frequently cited. The elements were categorized into three groups: Product, Retailer, and Interface related dimensions of information content. The construct was developed after conducting research on consumers' information preferences across browsing and searching activities. Some of the elements identified by Detlor et al. (2003) are listed in Table 1.

Table 1. Essential Domain Specific Information Components

Domain: E-commerce (reference: Detlor, 2003)
Information components: Aesthetics, Product Specification, Price, Reliability, Delivery, Purchase Advice, Retailer Reputation, Retailer Service, Brand, Retailer Policy, Product Alternative, Availability, Manufacturer

Domain: Advertisement
Information components: Price-Value, Quality, Availability, Special Offer, Packaging, Guarantees, Company-Sponsored Research, Performance, Components, Taste, Nutrition, New Ideas, Safety, Independent Research

Domain: Education
Information components: Admissions, Alumni, Facilities, FAQs, Placement, Programs, Board Members
Online advertisements have boosted revenues "from $1.9 billion in 1988 to $4.6 billion in 1999 alone" [8]. In addition, the role of information has established itself as a central factor in many discussions of how advertising works [9]. Advertising has many models and theories dedicated to explaining how a consumer searches for functional information to assist decision-making during the purchasing process.

Advertisement research has roots that date back to the 1970s. Research conducted in the marketing area addressed needs for certain types of information in ads. Its focus on information cues has the closest relation to the proposed conceptual model. Information cues are defined as categories of information that are potentially useful to consumers [9]. The majority of studies cited refer to cross-cultural television advertisements. The most frequently cited set of cues is attributed to Resnik and Stern (1977). These cues emerged from a content-driven investigation of 378 television commercials. The set originally included 14 cues (refer to Table 1). The longstanding validity of Resnik and Stern's (1977) information cues provides a strong foundation for the development of a definitional framework of useful content for IT operations.

According to studies conducted by Zhang et al. (2001), the education domain ranks among the top two (paired with E-commerce) based on user familiarity. After 9/11, universities noticed a dramatic drop in campus visits and needed a new method of recruitment. This and other factors have influenced the onset of e-recruitment techniques and tools. Among those tools, websites are the primary one. University websites now contain vast amounts of information to aid visitors in performing different tasks: selecting a university or course, retrieving personal records, and paying bills.
Most research in this area addresses the website as a recruitment tool. Prospective students view university websites for information to assist in their school selection process. Therefore, the website content should be prepared to deliver information that markets the school appropriately [8]. Griffin (1999) conducted a qualitative analysis of content provided on 16 web sites. Evaluation of these websites over a two-week period generated the list of informational cues cited in Table 1.
4 Conceptual Model

All domains should consider content preparation for the design of their websites. E-commerce, Advertising, and Education have a relatively strong internet presence. However, there are other domains (i.e. Financial, Government, Medical, and Entertainment) attempting a transition to online environments. For example, e-government is the attempt to make government more citizen-friendly with well designed websites [15]. The new domains need to identify and assess the content and functionality necessary to motivate their audience to use these websites [15].

Chan & Swatman (2002) attempted to improve university recruitment websites with lessons learned from e-commerce. They conducted a review comparing universities and their use (or potential use) of methods from the e-commerce domain. This comparison was founded on the application of Ho's framework (1997) to the websites of universities in Australia and Hong Kong SAR. The results identified 15 different information components for education websites. This research ties the information needs of the education domain to those of the e-commerce domain. Again, the demonstration of transferability encourages the development of a definitional framework of useful information for IT operations that is not limited by domain.

The elements listed by domain as important information elements are not mutually exclusive. They can be integrated to form an information structure that provides a baseline definition of useful content. The categories have to be selected appropriately, both to capture all the elements and to be relevant to web-based interfaces. After inspection of the elements discovered in the literature review, a subjective analysis suggested classification into eight categories:

1. Site Information: Information concerning the overall perceived quality of information provided by the website. Information content should be frequently updated and the users should be aware of when these updates are occurring [2].
2. Transaction Information: Information explaining different aspects of the purchasing process. This is supported by the E-commerce and Education domains. Users want to make informed purchase decisions [5], [22], [6].
3. Company Information: Information providing details on the many characteristics of a company. All domains support this predicted factor. The internet allows anyone to conduct E-business. Therefore, prospective consumers require information about company characteristics [4], [7], [6].
4. Security Information: Information describing measures implemented in the website to ensure that the transfer and storage of personal data are secure. The majority of websites request personal information. Websites should describe their efforts to secure users' information [2].
[Figure: information needed by users and information provided, organized by domain from general categories (generalization) to specific components (specialization). Site: Revision Date, Creation Date, Date of Next Update. Product: Aesthetics, Price, Availability. Shipping: Delivery Date, Tracking Number, Shipping Cost. Company: Name, Mission, Sponsors. Security: Payment Information Security. Customer Service: Help, Contact Information, Refund Policy. Transaction: Taxes, Payment Methods, Quantity. Membership: Account Status, History, Personal Information.]

Fig. 1. Conceptual Model
5. Product Information: Information providing details about the product and/or services. This predicted factor has the most content requirements. It is important because obtaining products/services is the main purpose of interaction between the user and interface [6], [4].
6. Customer Service Information: Information describing the purchase assistance and/or after-sales support. Some aspects of traditional shopping (e.g. customer service) must be retained by E-commerce. Users are concerned with services during and after the sale [7], [2].
7. Shipping Information: Information explaining the shipping process, payments, and tracking options. The content components in this predicted factor increase product awareness beyond initial interaction with the website [22], [6].
8. Membership Information: Information pertaining to customer account status, fees, purchase history, and preferences. Most sites allow users to register accounts. This affords the customized web experience that users desire [9], [7], [23].
The challenge is to develop a definitional framework depicting the characteristics of useful information. Figure 1 illustrates the classification of the specific content elements noted as essential in the literature review. Useful content is defined as information that is needed to aid a user in accomplishing his/her task. The conceptual model portrays the information needed by users, which is the same information that developers should provide. This illustration will equip web designers with a guide for basic content preparation for any domain. The model offers both a general and a specific view of the information elements. Please note that only a portion of the specific components is captured in Figure 1. Moreover, this model serves as a framework for the development of an evaluation tool for the effectiveness of content preparation. The tool will evaluate the developer's interpretation and implementation of the content guide.
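The two-level structure of the model (general categories, each holding specific components) can be pictured as a simple mapping. The following is only a minimal sketch under stated assumptions: the class and method names are hypothetical, only two of the eight categories are filled in (with components taken from Figure 1), and the comparison of information needed against information provided as a set difference is an illustrative assumption, not the evaluation tool that the paper proposes to develop.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: category -> specific components (two of eight shown).
class ContentModel {
    static Map<String, Set<String>> needed() {
        Map<String, Set<String>> model = new LinkedHashMap<>();
        model.put("Site", new TreeSet<>(Arrays.asList(
                "Revision Date", "Creation Date", "Date of Next Update")));
        model.put("Shipping", new TreeSet<>(Arrays.asList(
                "Delivery Date", "Tracking Number", "Shipping Cost")));
        return model;
    }

    // Assumption: a gap is any needed component that the site does not provide.
    static Map<String, Set<String>> missing(Map<String, Set<String>> needed,
                                            Map<String, Set<String>> provided) {
        Map<String, Set<String>> gaps = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : needed.entrySet()) {
            Set<String> gap = new TreeSet<>(entry.getValue());
            gap.removeAll(provided.getOrDefault(entry.getKey(),
                    Collections.<String>emptySet()));
            if (!gap.isEmpty()) {
                gaps.put(entry.getKey(), gap);
            }
        }
        return gaps;
    }
}

class ContentModelDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> provided = new LinkedHashMap<>();
        provided.put("Site", new TreeSet<>(Arrays.asList("Revision Date")));
        // Prints the components a hypothetical site fails to provide.
        System.out.println(ContentModel.missing(ContentModel.needed(), provided));
    }
}
```

A real evaluation instrument would presumably weight components by task and domain; the sketch only shows how the general and specific views relate.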
5 Conclusion

Usability evaluation has established its ability to improve a wide range of interactive systems over the years. However, fewer than five percent of these methods have addressed the information aspect of interface design for IT operations. Content preparation has been documented as an essential phase of website design [25]. The information provided by the website is a dominant concern of users [20]. Therefore, a tool for evaluating information content is greatly needed to assess the developer's implementation and interpretation of Content Preparation guidelines. The lack of literature in this area prevents the immediate construction of such a tool. A clear structure of content-specific elements was necessary for its development. This paper delivers such a structure as the theoretical foundation for the development of methods to evaluate the effectiveness of content preparation in IT operations.
References

1. Akoglu, C., Ozcan, O.: Usability evaluation of architecture based web sites. In: Proceedings of the Tenth International Conference on Human-Computer Interaction, 22-27 June 2003, Heraklion, Crete, Greece, pp. 743-747 (2003)
2. Alexander, J.E., Tate, M.A.: Web wisdom: how to evaluate and create information quality on the Web. Lawrence Erlbaum, Mahwah, NJ (1999)
3. Baierova, P., Tate, M., Hope, B.: The impact of purpose for web use on user preferences for web design features. In: Proceedings of the 7th Pacific Asia Conference on Information Systems, 10-13 July 2003, Adelaide, South Australia, pp. 1853-1872 (2003)
4. Barnes, S., Vidgen, R.: Assessing the quality of auction web sites. In: Proceedings of the 34th Hawaii International Conference on System Sciences, 3-6 January 2001, Maui, HI, p. 7055 (2001)
5. Chan, E.S.K., Swatman, P.M.C.: Web content and design: a review of e-Commerce/eBusiness program sites. In: Proceedings of the 13th Australasian Conference of Information Systems, 4-6 December 2002, Melbourne, Australia, pp. 49-60 (2002)
6. Detlor, B., Sproule, S., Gupta, C.: Pre-purchase online information seeking: Search versus browse. Journal of Electronic Commerce Research 4(2), 72–84 (2003)
7. Gehrke, D., Turban, E.: Determinants of successful website design: relative importance and recommendations for effectiveness. In: Proceedings of the 32nd Annual Hawaii International Conference on System Sciences, 5-8 January 1999, Maui, HI (1999)
8. Greer, J.: Evaluating the credibility of online information: a test of source and advertising influence. Mass Communication and Society 6(1), 11–28 (2003)
9. Griffin, G.: A typology of online positioning strategies among creative programs (1999). Available online at: http://www.ciadvertising.org/studies/student/99_fall/phd/griffin/ online paper/abstract.html (accessed 9 January 2006)
10. Hornbaek, K.: Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies 64(2), 79–102 (2006)
11. Huang, H., Lee, Y., Wang, R.: Quality Information and Knowledge. Prentice-Hall, Upper Saddle River (1999)
12. Ives, B., Olson, M.H., Baroudi, J.J.: The measurement of user information satisfaction. Communications of the ACM 26(10), 785–793 (1983)
13. Jones, M.Y., Pentecost, R., Requena, G.: Memory for advertising and information content: Comparing the printed page to the computer screen. Psychology and Marketing 22, 623–648 (2005)
14. Katerattanakul, P., Siau, K.: Measuring information quality of web sites: development of an instrument. In: Proceedings of the 20th International Conference on Information Systems, 12-15 December 1999, Charlotte, NC, pp. 279-285 (1999)
15. Krauss, K.: Testing an e-government website quality questionnaire: a pilot study. In: Proceedings of the 5th Annual Conference on World Wide Web Applications, 10-12 September 2003, Durban, South Africa (2003)
16. Liao, H., Proctor, R., Salvendy, G.: Content preparation for cross-cultural e-commerce: a review. Behaviour and Information Technology (2006)
17. Lu, M.Y.: Evaluating and selecting online magazines for children [Electronic Version]. Eric Digest. Available online at http://www.indiana.edu/ reading/ieo/digests/d180.html (accessed 25 March 2006) (2003)
18. Mueller.: An analysis of information content in standardized vs. specialized multinational advertisements. Journal of International Business Studies 22(1) (1st Quarter), 23–39 (1990)
19. Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Communications of the ACM 45(4), 211–218 (2002)
20. Pitt, L.F., Watson, R.T., Kavan, C.B.: Service quality: a measure of information systems effectiveness. MIS Quarterly 19(2), 173–187 (1995)
21. Proctor, R.W., Vu, K.P.L., Salvendy, G.: Content preparation and management for web design: eliciting, structuring, searching, and displaying information. International Journal of Human-Computer Interaction 14(1), 25–92 (2002)
22. Resnik, A., Stern, B.L.S.: An analysis of information content in television advertising. Journal of Marketing, vol (January), pp. 50-53 (1977)
23. Salvendy, G., Fang, X.: Siemens report: guidelines and rules for design of e-business: Purdue University (2001)
24. Strong, D.M., Lee, Y.W., Wang, R.Y.: 10 potholes in the road to information quality. Computer 30(8), 38–46 (1997)
25. Vu, K., Proctor, R.W.: Web site design and evaluation. In: Salvendy, G. (ed.) Human Factors and Ergonomics, 3rd edn., John Wiley and Sons, Inc., New York, NY (2006)
26. Zhang, P., von Dran, G., Blake, P., Pipithsuksunt, P.: Important design features in different website domains: an empirical study of user perceptions. e-Service Journal 1(1), 77–91 (2001)
Traces Using Aspect Oriented Programming and Interactive Agent-Based Architecture for Early Usability Evaluation: Basic Principles and Comparison

Jean-Claude Tarby (1), Houcine Ezzedine (2), José Rouillard (1), Chi Dung Tran (2), Philippe Laporte (1), and Christophe Kolski (2)

(1) Laboratoire LIFL-Trigone, University of Lille 1, F-59655 Villeneuve d’Ascq Cedex, France
{jean-claude.tarby, jose.rouillard, philippe.laporte}@univ-lille1.fr
(2) LAMIH – UMR8530, University of Valenciennes and Hainaut-Cambrésis, Le Mont Houy, F-59313 Valenciennes Cedex 9, France
{houcine.ezzedine, chidung.tran, christophe.kolski}@univ-valenciennes.fr
Abstract. Early evaluation of interactive systems is currently the subject of much research. Some of this work aims at explicitly coupling design and evaluation through various software mechanisms. In this paper we describe two approaches to early evaluation that exploit new technologies and paradigms. The first approach is based on aspect oriented programming; the second proposes an explicit coupling between an agent-oriented architecture and evaluation agents. The two approaches are compared globally in this paper.

Keywords: Human-computer interaction, Early evaluation, Usability, Traces, Agent-based architecture, Aspect oriented programming.
effective tasks. The first approach exploits the paradigm of aspect oriented programming to integrate trace mechanisms into interactive applications. The concept of trace has been the subject of various studies in HCI [16]. The second approach proposes an explicit coupling between the agents that make up an agent-based architecture and several evaluation agents. The two approaches are first described and then compared.
2 First Approach for Early Usability Evaluation: Injection of Mechanism of Traces by Aspects

2.1 Aspect-Oriented Programming

Aspect-Oriented Programming (AOP) is a programming paradigm that appeared in the mid-1990s and originated at Xerox PARC. AOP should be seen as an extension of Object-Oriented Programming: complementary generic mechanisms significantly improve the separation of concerns within applications [14]. In a traditional approach, the business objects locally manage their technical constraints (identification/authentication, security, transactions, data integrity...). Duplicating these crosscutting elements in the methods of classes scatters and tangles the system-level concerns and increases the complexity of the code. AOP allows these elements to be modularized by adding a new dimension of modularity, the aspect. The scope of the crosscutting concerns supported by AOP exceeds that of current solutions such as EJB. Join point, advice, aspect, and pointcut are the principal concepts introduced by AOP:

• A join point represents a particular location in the flow of the program instructions (beginning or end of a method execution, read or write access to a field...).
• Advices are methods that are activated when specific join points are reached: the weaving mechanism inserts the advice calls into the initial code either statically (at compile time) or dynamically (during execution). An advice can execute before, after, or around the join point.
• An aspect is a module that associates advices with join points by means of pointcuts.
• Pointcuts are used to define the set of join points at which an advice will be activated. Furthermore, a pointcut allows the execution context of join points to be captured. For a method call, this context includes the target object, the arguments of the method, and the reference to the returned object, all of which is highly useful for injecting trace mechanisms.

Based on the principle of inversion of control (IoC), AOP thus extracts from the business code its dependencies on the technical concerns, locating them in the aspects and managing them from outside through the weaving mechanism. It consequently becomes possible to focus on business logic.
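The concepts above can be illustrated in plain Java. The sketch below is not AspectJ and is not the authors' implementation: it swaps in a JDK dynamic proxy (`java.lang.reflect.Proxy`) as a rough runtime stand-in for weaving, so that the example runs without an AOP toolchain. The names `Account`, `SimpleAccount` and `TraceAspect` are hypothetical. The invocation handler plays the role of an around advice, the method-name test plays the role of a pointcut, and the captured method and arguments correspond to the execution context of the join point.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical business interface: it contains no tracing code at all.
interface Account {
    int deposit(int amount);
}

class SimpleAccount implements Account {
    private int balance = 0;

    @Override
    public int deposit(int amount) {
        balance += amount;
        return balance;
    }
}

// Plays the role of an aspect. A dynamic proxy stands in for weaving;
// real AOP tools such as AspectJ modify the code itself instead.
class TraceAspect {
    static final List<String> TRACES = new ArrayList<>();

    static Account weave(Account target) {
        InvocationHandler advice = (proxy, method, args) -> {
            // Crude "pointcut": select calls to deposit only.
            boolean selected = method.getName().equals("deposit");
            if (selected) {
                TRACES.add("before " + method.getName() + Arrays.toString(args));
            }
            Object result = method.invoke(target, args); // proceed with the call
            if (selected) {
                TRACES.add("after " + method.getName() + " -> " + result);
            }
            return result;
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] {Account.class},
                advice);
    }
}

class AdviceDemo {
    public static void main(String[] args) {
        Account traced = TraceAspect.weave(new SimpleAccount());
        traced.deposit(50);
        TraceAspect.TRACES.forEach(System.out::println);
    }
}
```

Because `SimpleAccount` is never modified, the same object can be used with or without the wrapper, which mirrors the idea that the business code stays free of the technical concern.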
Moreover, AOP offers the mechanism of introduction. This mechanism allows classes, interfaces, or even existing aspects to be modified: it is possible to inject a method or an attribute into a class, to add an inheritance relationship, or to specify that a class implements a new interface. For example, in order to sort a collection of instances of a Java class automatically, an aspect can declare that the class implements the interface Comparable and inject the required method compareTo into it.

2.2 Traces by Aspects

Thanks to the principle of separation of concerns, AOP can inject trace mechanisms into existing applications (cf. Figure 1, step c) by writing aspects (step d) which, on the one hand, listen for user actions, method calls, changes in data values, etc., and, on the other hand, produce the traces. These aspects are then woven with the initial code (step e), which remains intact. The code produced by weaving then contains the initial code and the code of the aspects (step f). The initial application can be used normally without the aspects or be traced with them (step g). The trace mechanism can thus be switched off without any effect on the initial code.
[Figure: the initial application (c), the aspects of trace and the formats of trace (d), aspect weaving with AspectJ (e), the application to be traced (f), interactions (g), traces (h), data analysis (i), and the results of the analysis (j).]

Fig. 1. Injection of mechanism of traces by aspects
To produce a trace we need three types of information: the data to be traced, when to produce the trace, and where to store it. Traced data mainly relate to the functional core (and consequently the associated tasks) and to the user interface (actions of the user, but also displayed data...). For example, it is possible to trace the beginning, the end, or the interruption of a task, the opening of a window, a selection in a dropdown list, etc. Because our work is use-oriented, it is easier to trace the actions of the user when the functional core and the user interface are built with a task-oriented design method. Thus, if the application is designed with an evaluation-oriented approach as presented in [23], it is easy to recover other data such as the context of execution of the tasks, the role of the user (in CSCW for example), etc.

Most of the time, the traces are produced when a method is called or at the end of the execution of the method, and these methods may be associated with tasks. AOP provides all the services required for the production of traces (cf. the before and after keywords present in AOP). Moreover, it is very easy to parameterize the production of traces, for example to produce them in a dedicated thread, or only if a condition is true.

Today the traces are generated in XML files (step h) whose contents are parameterized by a set of formats also written in XML (step d). This allows us to generate traces in different formats while emitting the same information from the traced application. Although we favor traces in XML format, the external definition of formats will make it possible to generate very compact textual (non-XML) files. With our approach the exploitation of traces is facilitated because we choose the data that we want to trace, as well as the format of the result, in contrast to approaches based on log files.

The analysis of traces (step i) produces statistics, task models (step j), filtered information, etc. This side of our work is not presented in this paper. At the moment this analysis is done after the production of traces, but we plan to carry out real-time analysis in the future (to adapt the application, to advise the user, etc.). Our work is similar to works such as [2,5,6,9,10,11,24]. It uses AspectJ [4], but it could be done with other languages supporting AOP such as [3,21,25].
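The separation between the information emitted and the format in which it is written can be sketched as follows. This is an illustration under assumptions, not the authors' trace format: the class names, the fields of `TraceEvent`, the XML element layout and the semicolon-separated compact layout are all hypothetical, and an in-memory list stands in for the trace file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// What is traced and when. Field names are illustrative assumptions.
class TraceEvent {
    final String task;   // traced task, e.g. "SelectItem"
    final String kind;   // "begin", "end" or "interrupt"
    final long timeMs;   // timestamp supplied by the caller

    TraceEvent(String task, String kind, long timeMs) {
        this.task = task;
        this.kind = kind;
        this.timeMs = timeMs;
    }
}

// Two interchangeable formats fed with the same information.
class TraceFormats {
    // No XML escaping is done: adequate only for simple identifiers.
    static final Function<TraceEvent, String> XML = e ->
            "<trace task=\"" + e.task + "\" kind=\"" + e.kind
            + "\" time=\"" + e.timeMs + "\"/>";

    static final Function<TraceEvent, String> COMPACT = e ->
            e.timeMs + ";" + e.task + ";" + e.kind;
}

// Where the trace is stored: an in-memory list standing in for a file.
class TraceSink {
    final List<String> lines = new ArrayList<>();
    private final Function<TraceEvent, String> format;

    TraceSink(Function<TraceEvent, String> format) {
        this.format = format;
    }

    void record(TraceEvent event) {
        lines.add(format.apply(event));
    }
}

class TraceFormatDemo {
    public static void main(String[] args) {
        TraceEvent event = new TraceEvent("SelectItem", "begin", 1000L);
        TraceSink xml = new TraceSink(TraceFormats.XML);
        TraceSink compact = new TraceSink(TraceFormats.COMPACT);
        xml.record(event);
        compact.record(event);
        System.out.println(xml.lines.get(0));
        System.out.println(compact.lines.get(0));
    }
}
```

Keeping the format outside the event means the traced application emits the same information regardless of how it is later written, which is the property the text attributes to externally defined formats.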
3 Second Approach for Early Usability Evaluation: Interactive Agent-Based Architecture and Evaluation Module

3.1 Agent-Oriented Architecture for Interactive Systems

Several architecture models have been put forward by researchers over the past twenty years. Two main types of architecture can be distinguished: architectures with functional components (Language, Seeheim, Arch and their derived models) and architectures with structural components (PAC and its derived models [7], the MVC model (Model-View-Controller, from Smalltalk) and its recent evolutions, AMF and its variants [20], H4 [8]...). The classic models of interactive systems distinguish three essential functions (presentation, control and application). Some models (such as the Seeheim and ARCH models) consider these three functions as being three distinct functional units. Other approaches using structural components, and in particular those said to be distributed or agent approaches, suggest grouping the three functions together into one unit, the agent.

These architecture models share the same principle, based on the separation between the system (application) and the interface. Thus, an architecture must separate the application and the interface, define a distribution of the services of the interface, and define an exchange protocol. The benefit of separating the interface from the application is that modifications can be made to the interface without affecting the application.

[Figure: the application, the application agents, the dialogue control agents, the interface agents, and the user.]

Fig. 2. An agent oriented architecture for interactive systems

Figure 2 proposes a comprehensive framework for architecture [12,15], showing a separation into three functional components, called respectively: interface with the application (connected to the application), controller of dialogue, and presentation (this component being in direct relation with the user). These three components group together agents:

− the application agents, which handle the domain concepts and cannot be directly accessed by the user. One of their roles is to ensure the correct functioning of the application and the real-time dispatch of the information that the other agents need to perform their tasks,
− the dialogue control agents, which are also called mixed agents; these provide services for both the application and the user. They are intended to guarantee coherence in the exchanges from the application towards the user, and vice versa,
− the interactive agents (or interface agents), which, unlike the application agents, are in direct contact with the user (they can be seen by the user). These agents coordinate among themselves in order to intercept the user commands and to form a presentation that allows the user to gain an overall understanding of the current state of the application. In this way, a window may be considered as being an interactive agent in its own right; its specification describes its presentation and the services it is to perform.

3.2 Principle of Coupling Between Architecture Based on Agents and Evaluation Agents

Our starting objective was to propose a tool for collecting objective data, adapted to agent-based interactive systems. This tool corresponds to an electronic informer; it consists of a program, invisible to the user of the system being evaluated, which transmits and records all the interactions (actions of the operator and reactions of the system) in a database. This database is then exploited to provide the evaluator with data and statistics that enable him/her to draw conclusions about various aspects of utility and usability.
Fig. 3. Principle of coupling between agent-based architecture of the interactive system and its evaluation [26]
Because this informer is dedicated to the evaluation of agent-based interactive systems, it must be closely related to the architecture of the system to be evaluated [13,26]. We are particularly interested in the interactive agents. The electronic informer (Figure 3) consists of several informer agents derived from the architecture of the system to be evaluated, and more particularly from the multi-agent system that handles presentation. It is based primarily on the acquisition of information and data specific to the system to be evaluated (actions of the user and reactions of the system). These data make it possible to reconstruct the tasks actually carried out by the user (a posteriori mode) and to confront them with the model of the tasks to be carried out (a priori mode), according to the confrontation principles described in [1].

Suppose that a presentation module is made up of 6 interactive agents (each one being able to interact with the user); 6 evaluation agents are then instantiated and connected to the interactive agents. During the interactions with the user, the 6 evaluation agents record in real time the data concerning the interaction between the user and the 6 interactive agents. After the tasks have been carried out, these data are analyzed automatically and presented to the evaluator in deferred time, through a specific user interface dedicated to the evaluator. The data can range from a low level, corresponding to simple user or system events, to higher levels (for example the task level). Examples are available in [26].
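The one-to-one coupling described above can be sketched in a few lines. This is a hypothetical illustration only: the class names, the shared list standing in for the database, and in particular the comparison rule in `Confrontation` are assumptions made for the example; the actual confrontation principles are those of [1] and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One evaluation agent per interface agent; all of them append to a
// shared list that stands in for the database of recorded interactions.
class EvaluationAgent {
    private final List<String> database;

    EvaluationAgent(List<String> database) {
        this.database = database;
    }

    void observe(String agentName, String event) {
        database.add(agentName + "." + event);
    }
}

class InterfaceAgent {
    private final String name;
    private final EvaluationAgent evaluator;

    InterfaceAgent(String name, EvaluationAgent evaluator) {
        this.name = name;
        this.evaluator = evaluator;
    }

    void userAction(String action) {
        // The agent's own service would run here; the coupled evaluation
        // agent is then notified, invisibly to the user.
        evaluator.observe(name, action);
    }
}

class Confrontation {
    // Assumed rule: index of the first step where the observed sequence
    // departs from the planned one, or -1 if the sequences are identical.
    static int firstDivergence(List<String> planned, List<String> observed) {
        int common = Math.min(planned.size(), observed.size());
        for (int i = 0; i < common; i++) {
            if (!planned.get(i).equals(observed.get(i))) {
                return i;
            }
        }
        return planned.size() == observed.size() ? -1 : common;
    }
}

class CouplingDemo {
    public static void main(String[] args) {
        List<String> database = new ArrayList<>();
        InterfaceAgent window = new InterfaceAgent("Window", new EvaluationAgent(database));
        InterfaceAgent list = new InterfaceAgent("List", new EvaluationAgent(database));
        window.userAction("open");
        list.userAction("select");
        list.userAction("select");
        List<String> planned = Arrays.asList("Window.open", "List.select", "Window.close");
        System.out.println(Confrontation.firstDivergence(planned, database));
    }
}
```

The interface agents need a connection to their evaluation agents at construction time, which reflects the point made in the text that the informer must be closely tied to the architecture of the system being evaluated.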
4 Comparison Between the Two Approaches

A comparison of the two approaches is given in Table 1. The two approaches have common objectives: to gather data in order to compare predicted tasks with actual activity, and to highlight utility and usability problems. The ways in which these data are obtained differ between the approaches.
J.-C. Tarby et al.
From the point of view of integration into software engineering, the two approaches require particular specifications. Approach 1 (the AOP approach) needs to know the methods and data that can be traced, as well as the trace formats; this information can be collected during specification or after implementation. Approach 2 (the agent approach) requires the specification of the elements of the interactive system and of the evaluation agents. The AOP approach requires no particular architectural design, whereas the agent approach requires that the architecture of the interactive system be based on interface agents, and that connections be established between the interactive agents and the evaluation agents. Regarding implementation, the AOP approach automatically generates the code of the aspects and weaves it with the initial code of the application to be traced; the agent approach requires programming the services of the interactive system agents and of the evaluation agents. From the point of view of user-centred evaluation, both approaches can be coupled with other techniques such as interviews, eye tracking, etc., but they gather data in different ways: with the AOP approach, data are collected automatically when the code resulting from weaving the aspects into the initial application code is executed; with the agent approach, data are collected by the evaluation agents, which observe the interactions between the interface agents and the user. To be collected with the AOP approach, data must be accessible through a method (in the object-oriented programming sense); this method can be public, inherited, etc. Time is accessible in the same way. The data collected with the agent approach are potentially of many kinds (cf. Table 1). In their current versions the approaches use different languages: the AOP approach uses Java and AspectJ; the agent approach is based on C++.
In the future, it is expected that the AOP approach will be extended to other languages supporting AOP, such as PHP, C++, etc., and that the agent approach will use Java. Concerning the types of application, the AOP approach can currently trace any application written in Java and supporting AspectJ. However, the applications traced today are mainly interactive applications (WIMP¹ applications). In the future, it is planned to apply the AOP approach to information systems, distance learning applications, and mobile applications. The agent approach is currently applied to information systems used for the supervision of a bus and tramway network. In the future, it should target any type of information system. The advantage of both approaches is that they provide principles and mechanisms that facilitate and encourage early evaluation. In addition, the AOP approach leaves the initial code intact, so that the application and the trace mechanisms can be developed in parallel and/or in sequence. The disadvantages are as follows. With the agent approach, it is difficult for the moment to define the optimal number of evaluation agents (the first version contained one evaluation agent per interaction agent, and the new version will contain only one); the approach also calls for new user interface design methods that provide for a coupling between interface agents and evaluation agents. To be more effective, the AOP approach needs
¹ WIMP: Window, Icon, Mouse, Pull-down menu.
Table 1. Comparison between the two approaches

| Category | Item | AOP approach | Agent approach |
|---|---|---|---|
| Principle | | Injection of mechanism of traces by aspects | Coupling of interface agents and module of automatic acquisition |
| Traditional stages of software engineering | Preliminary or feasibility study | Explicit consideration of early evaluation in the project | Explicit consideration of early evaluation in the project |
| | Specification | Specification of: interactive system, parameters to be traced, formats of traces | Specification of: interactive system agents, evaluation agents |
| | Architectural design | (empty) | Design of the interactive system architecture based on interface agents; connections between interactive system agents and evaluation agents |
| | Coding | Generation of the code of the aspects and weaving between the code to be traced and the aspects | Coding of the services of the interactive system agents and evaluation agents |
| User-centred evaluation (simultaneously with other possible methods: interviews, eye tracking, questionnaire, etc.) | Interaction data gathering | Execution of the weaved code | Espionage by the evaluation agents of the interactions between interface agents and the user |
| | Collected data | Any data accessible by a method (in the meaning of object-oriented programming) + Time | User and system events, errors, time of tasks execution, unused objects, number of help requests… |
| Goals | | Depends on how traces are exploited: gathering data to compare predicted tasks and real activities, highlighting problems of utility and usability… | Same as for the AOP approach |
| Languages | Current | Java with AspectJ | C++ |
| | Intended | Any language supporting AOP | Java |
| Types of application | Current | WIMP applications | Information systems used in a context of supervision of network of bus and tramway |
| | Intended | Information systems, distance learning applications, mobile applications | Information systems |
design methods integrating aspects for the evaluation. That means for example that any potentially traceable data must be accessible by object methods.
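The requirement that traceable data be reachable through object methods can be illustrated with a small sketch. The authors' tooling uses Java and AspectJ; the Python code below is only a stand-in that mimics weaving by wrapping methods at run time, and the names `weave_trace`, `trace_log` and `OrderForm` are hypothetical. It shows the two properties stressed in the comparison: the traced class is left untouched in its source, and only data that pass through a method (arguments, return values, time) end up in the trace.

```python
import functools
import time

trace_log = []  # collected trace entries (format is an assumption)


def weave_trace(cls, method_names):
    """Wrap the named methods of cls so that every call is recorded."""

    def make_traced(original, name):
        @functools.wraps(original)
        def traced(self, *args, **kwargs):
            result = original(self, *args, **kwargs)
            trace_log.append(
                {
                    "time": time.time(),
                    "class": cls.__name__,
                    "method": name,
                    "args": args,
                    "result": result,
                }
            )
            return result

        return traced

    for name in method_names:
        setattr(cls, name, make_traced(getattr(cls, name), name))


class OrderForm:
    """Hypothetical application class; its source code is not edited."""

    def __init__(self):
        self._quantity = 0

    def set_quantity(self, value):
        self._quantity = value

    def get_quantity(self):
        return self._quantity


# "Weaving" step: performed separately from the application code.
weave_trace(OrderForm, ["set_quantity", "get_quantity"])

form = OrderForm()
form.set_quantity(3)
form.get_quantity()
```

Note that `_quantity` itself never appears in the trace unless a method exposes it, which is why a design method that plans accessor methods for every potentially traceable datum makes this kind of tracing more effective.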
5 Conclusion
Early evaluation is the subject of active research in the HCI community. For our part, we work on two complementary approaches. The first is based on aspect
oriented programming; it allows trace mechanisms to be injected into existing applications. The second is based on the new possibilities offered by agent-based approaches; it aims to couple agent-based architectures with evaluation agents. Although they pursue the same evaluation objectives, the two approaches have different characteristics, advantages and disadvantages, which were compared in this paper. For both approaches, the research perspectives are numerous: it is important to study suitable design methods, to improve the current mechanisms, and to test them in various application domains.
Acknowledgments. The present research work has been supported by the “Ministère de l'Education Nationale, de la Recherche et de la Technologie”, the “Région Nord Pas-de-Calais” and the FEDER (Fonds Européen de Développement Régional) during the projects SART, MIAOU and EUCUE. The authors gratefully acknowledge the support of these institutions.
References
1. Abed, M., Ezzedine, H.: Vers une démarche intégrée de conception-évaluation des systèmes Homme-Machine. Journal of Decision Systems 7, 147–175 (1998)
2. Aksit, M., Bergmans, L., Vural, S.: An object-oriented language-database integration model: the Composition-Filters Approach. In: Madsen, O.L. (ed.) ECOOP 1992. LNCS, vol. 615, pp. 372–395. Springer, Heidelberg (1992)
3. aoPHP, Aspect Oriented PHP http://www.aophp.net
4. AspectJ project http://www.eclipse.org/aspectj/
5. Balbo, S., et al.: Project WAUTER (Website Automatic Usability Testing EnviRonment) http://wauter.weeweb.com.au
6. Champin, P-A., Prié, Y., Mille, A.: MUSETTE: Modeling USEs and Tasks for Tracing Experience. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS, vol. 2689, pp. 279–286. Springer, Heidelberg (2003)
7. Coutaz, J.: PAC, an Object-Oriented Model for Dialog Design. In: Bullinger, H.-J., Shackel, B. (eds.) Interact’87, 2nd IFIP International Conference on Human-Computer Interaction, September 1-4, Stuttgart, Germany, pp. 431–436 (1987)
8. Depaulis, F., Jambon, F., Girard, P., Guittet, L.: Le modèle d’architecture logicielle H4: Principes, usages, outils et retours d’expérience dans les applications de conception technique. Revue d’Interaction Homme-Machine (RIHM) 7, 93–129 (2006)
9. Ducasse, S., Gîrba, T., Wuyts, R.: Object-Oriented Legacy System Trace-based Logic Testing. In: Proceedings 10th European Conference on Software Maintenance and Reengineering (CSMR 2006), IEEE Computer Society Press, Washington (2006)
10. Egyed-zsigmond, E., Mille, A., Prié, Y.: Club (Trèfle): a use trace model. In: Ashley, K.D., Bridge, D.G. (eds.) ICCBR 2003. LNCS, vol. 2689, pp. 146–160. Springer, Heidelberg (2003)
11. El-Ramly, M., Stroulia, E., Sorenson, P.: Mining system-user interaction traces for use case models. In: Proceedings of the 10th International Workshop on Program Comprehension (IWPC’02), Paris, France (27-29 June 2002)
12. Ezzedine, H., Kolski, C., Péninou, A.: Agent oriented design of human-computer interface. Application to supervision of an urban transport network. Engineering Applications of Artificial Intelligence 18, 255–270 (2005)
13. Ezzedine, H., Trabelsi, A., Kolski, C.: Modelling of an interactive system with an agent-based architecture using Petri nets, application of the method to the supervision of a transport system. Mathematics and Computers in Simulation 70, 358–376 (2006)
14. Filman, R., Elrad, T., Clarke, S., Aksit, M.: Aspect-oriented software development. Addison-Wesley Professional, London (2004)
15. Grislin-Le Strugeon, E., Adam, E., Kolski, C.: Agents intelligents en interaction homme-machine dans les systèmes d’information. In: Kolski, C. (ed.) Environnements évolués et évaluation de l’IHM, IHM pour les SI 2, pp. 207–248. Éditions Hermes, Paris (2001)
16. Hilbert, D.M., Redmiles, D.F.: Extracting usability information from user interface events. ACM Computing Surveys 32, 384–421 (2001)
17. Ivory, M., Hearst, M.: The State of the Art in Automated Usability Evaluation of User Interfaces. ACM Computing Surveys 33, 173–197 (2001)
18. Jacko, J.A., Sears, A.: The human-computer interaction handbook: fundamentals, evolving technologies and emerging applications (human factors and ergonomics). Lawrence Erlbaum Associates, London (2002)
19. Nielsen, J.: Usability Engineering. Academic Press, Boston (1993)
20. Ouadou, K.: AMF: Un modèle d’architecture multi-agents multi-facettes pour Interfaces Homme-Machine et les outils associés. Ph.D. Thesis, Ecole Centrale de Lyon (1994)
21. PHPAspect http://phpaspect.org/
22. Sweeney, X.E., Sibertin-Blanc, M., Maguire, M., Shackel, B.: Evaluation user-computer interaction: a framework. International Journal of Man-Machine Studies 38, 689–711 (1993)
23. Tarby, J.C.: Evaluation précoce et conception orientée évaluation. In: Proceedings ErgoIA’ (Biarritz, France, 11-13 Octobre 2006), ESTIA and ESTIA.Innovation, Biarritz, pp. 343–346 (2006)
24. The Compose* project http://janus.cs.utwente.nl:8000/twiki/bin/view/Composer/
25. The Java Aspect Components (JAC) project http://jac.objectweb.org/
26. Trabelsi, A.: Contribution à l’évaluation des systèmes interactifs orientés agents: application à un poste de supervision du transport urbain (in French). PhD Thesis, University of Valenciennes and Hainaut-Cambrésis, Valenciennes, France (2006)
Usability and Software Development: Roles of the Stakeholders
Tobias Uldall-Espersen and Erik Frøkjær
Department of Computing, University of Copenhagen, Universitetsparken 1, DK-2100 Copenhagen
{tobiasue, erikf}@diku.dk
Abstract. Usability is a key issue when developing software, but how to integrate usability work and software development continues to be a problem, which the stakeholders must face. This study aims at developing a more coherent and realistic understanding of the problem based on 14 interviews in three case studies. The results indicate that usability during software development has to be considered with both a user interface focus and an organizational focus. Especially techniques to support the uncovering of organizational usability are lacking in both human computer interaction and software engineering. Further, the continued engagement of stakeholders, who carry the vision about the purpose of change, stands out as a critical factor for the realization of project goals.
1 Introduction
Integrating usability work into software development is not easy [3]. Proper integration requires a thorough understanding of usability work methods and software development practices, but this understanding seems insufficient when the aim is to improve end product usability. Despite heavy investments in information technology, we observe deficiencies in practical usability work and a significant lack of impact [4]. Even current research fails to explain why [7]. This paper reports on a study that combines an organizational and an individual approach to understanding and exploring the problem. With this approach we seek an understanding of how organizational issues and stakeholders in the organization influence end product usability.
development process by two stakeholders: a graphical user interface designer and a business representative responsible for requirement specification, test planning and user education. These two persons were interviewed as well. The main research question was how practitioners in software development projects work with usability and what we can learn from their practices. All interviews had the same interview guide as their starting point, but there were significant differences in how they progressed. The interview guide covered four themes: (1) the software development process; (2) software quality; (3) developing usable software; (4) general experiences with development of usable and useful software products. During the interviews, themes 1 and 3 were given most attention, and themes 1, 2 and 3 were all discussed on the basis of one specific software development project significant to the interviewees and their organization. Each interview took 60-90 minutes. The interviews were transcribed and analyzed using elements from grounded theory [5]. During the analyses we looked for information that directly or indirectly related to usability, for instance statements about stakeholders’ perception of usability, descriptions of usability-related activities, and non-usability-related issues that influenced end product usability.
2.1 Usability as a Concept
Our data suggests that usability is treated with different goals in mind in the various development projects and their organizational contexts. This leads us to look further into the relevance and practical conditions of conducting usability work in software development projects, in order to examine the various stakeholders’ roles and the possible risks to realizing the full potential of the solution.
The ISO 9241-11 standard defines usability as: “The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.” Under this definition, usability depends on four variables: a product, specified users, specified goals, and a specified context of use. Following our organizational approach, we observed that the specified goals had significant influence on how usability was handled. We found this important because these goals existed more or less independently of the product, the users and the context of use, the three variables on which traditional usability work often focuses. Various stakeholders formulated goals, and their direct or indirect roles informed each case significantly. We found it useful to distinguish between two groups of stakeholders: the users, i.e. persons who interact with the system, and the other stakeholders, i.e. persons who are directly or indirectly affected by the system or have important interests regarding it. Our data suggests that usability work is oriented towards two different dimensions, which are related to the various goals in the development project, among the stakeholders, and in the organization. The two dimensions were: (1) usability work oriented towards the user interface or user interests, which we refer to as user interface usability; (2) usability work oriented towards the organization or other stakeholders, which we refer to as organizational usability. We observed incidents in which the two usability orientations had identical interests and incidents in which their interests differed, which supports our assumption about the importance of analyzing these two dimensions.
3 Results
The cases had both strong similarities and differences. All projects were based on web technology and were all considered quite successful by the interviewees. In relation to their organizations the developed applications were innovative, and they both influenced and were influenced by their organizations. All the systems had various user groups and groups of people that were influenced by the systems. The systems were all initiated centrally, and anchoring the systems locally in the organizations was a challenge. By nature the systems were very different. Two systems were custom-developed by external contractors, and one was developed by an in-house development team.
Case 1: Development of a new insurance sales tool. This case concerns the development of a new sales tool for two groups of users, insurance agents and customer service persons. The tool was developed in-house over a period of 18 months. At most 25 employees worked on the project. About 400-500 employees would be using the tool. The two user groups had significantly different requirements, as the insurance agents were selling at the customers’ locations, typically in their homes, while the customer service persons served customers over the phone. The project management team did not consider it possible to make two different interfaces, and considerable effort was made to make one suitable interface. The sales tool was built as a front-end to two large insurance administration systems, and it was a challenge to avoid letting administrative procedures inform the design. A customer centred approach was taken and all possible stakeholders were involved. The aim was to give the users the best possible tool, and the main improvements were better data quality and an improved overview of the customers and their households. The company had a strongly centralized organization rooted at the head office, but employees at five regional offices generated the majority of the sales.
A main challenge was to prevent the tool from becoming “another head office’s idea”, and considerable effort was made to ensure that the tool was firmly anchored locally. The project was innovative and utilized new technology, such as wireless access to the back-end systems and other relevant systems, e.g. the national civil registration number register. The new technology also caused severe technological and usability problems. The company did not use a formal software development method, and usability was not prioritized initially in the project. Two stakeholders strongly insisted on taking usability seriously, and they gradually succeeded in making usability a significant and comprehensive part of the project. The project management team took a risk by relinquishing control of the process and allowing anyone involved in the project to have an opinion and express it. The software developer described the space for communication this way: “We had our arguments and we have been bloody angry at each other, close to physical fights, but it is like that to integrate systems if you ask me, and I find it great that we could ... we really could go directly to each other and say that this is really annoying. Can’t we ... I think this is foolish ... but I think this is foolish ... why aren’t you done now ... why shall I be done now, and so on. There we really had a very close collaboration.” So, space was made for rewarding discussions and iterations, but the downside was that much of the decision making became very time consuming.
Case 2: Developing a new IT platform for a political organization. This case concerns the development of a new IT-platform for a political organization. The
IT-platform was custom-developed by an IT-contractor in close co-operation with the central office of the organization. The co-operation continued over several years, during which components were continuously delivered and put into use. The project team consisted of six or seven persons from the contractor and the customer’s organization. The organizational leaders had strong visions about modernizing the organization, and the new IT-platform was a key tool to fulfil this vision. There were strong economic incentives in the project as well. The IT-platform was to serve two purposes. First, it should replace an existing but outdated communication platform used by 2,000–3,000 members. Otherwise a costly renewal of the license for the old communication platform would have been needed, which was not a realistic option. Introducing a new platform should help open up the organization and make it more attractive to new members. The new platform included an advanced CMS available to all members (about 50,000) and specific tools for running effective and professional election campaigns. Second, the IT-platform should serve as a new tool for membership administration, which would be decentralized and handed over to the local chapters of the organization. Membership administration includes issues like collection of dues, signing members up for courses and the national congress, and internal polling functionality. The contractor applied a highly agile and strongly business process oriented approach to the development. This was a key success factor, since external events periodically removed the customer’s focus from the project completely and changed the short-term goals. A very special contract was made between the contractor and the customer’s organization. No formal requirement specification was agreed upon, but a vision was developed, thoroughly discussed in the management group, and written down.
The customer’s project manager describes it this way: “We ended up writing up a two-page contract and some enclosures, which essentially stated that we could put the deliveries into use when we were satisfied, and when we did so we paid. The whole issue of accepting that they had delivered what we needed was handed over to us, by stating ‘our experience is, that you only pay if you are satisfied, so let us put that into the contract.’ Thus, it was completely up to us to decide when things were approved, but it could not be put into use before it was accepted. This model does not work in all projects, but it was extremely operational in relation to what we were going through.” The agreement was governed by a fairness principle ensuring that the customer’s organization and the contractor treated each other respectfully, and this converted potential conflicts into win-win situations. According to one of the key persons, “enlightened despotism” dominated within the customer’s organization, and only three stakeholders were thoroughly involved in the project.
Case 3: Developing a coherent physical and electronic department store. This case concerns the development of a new website for a department store with a number of local stores. The website was developed by an external contractor that specialized in user centred web development. The customer’s organization did little to involve itself in the project. The customer considered the solution to be a high-class web solution and it was technically efficient, but it was poorly anchored in the customer’s organization. The contractor’s information architect experienced the lack of anchoring this way: “ ... they might not all have much notion about what this website should be used for, and they also had different positions. The commercial
manager had another position than the marketing manager, who had another position than the loyalty manager. And then ... they need to clarify it internally, and then they can come to us, because we are going to make something they can use for what they have agreed the system to be used for.” The project was completed within five months, and five persons from the contractor were core project members. What was unique in this case was the idea of creating a coherent solution in which the physical and electronic worlds supplemented each other, in order to maintain a leading role in the physical department store market in Denmark and, if possible, also establish a position within the web-shop market. Two different goals were formulated. The first goal was to enable department store customers to buy articles in a traditional web-shop, and this was given most attention by the development team. It was a limited success, since only about 1 out of 1000 articles from the physical stores was available in the web-shop when it opened. It proved to be a non-trivial task to add articles to the web-shop and to ensure that the organization was able to handle the logistics. The second goal was to present information and to inspire potential customers to buy articles in the physical shops, which was the primary goal according to the business representative. A large effort was put into unifying these two goals. A combined physical and web-based fashion magazine was created, and when customers searched for products on the web site, the search function returned information including the physical placement of the articles in the department stores. The development process was split into three phases, each sold individually to the customer. This was an efficient way to keep the project on track, but some economic surprises did occur. Most significant was the surprise when the cost of following the strict HTML 1.0 standard was summed up.
This standard had not previously been followed, and the budget for the HTML development was blown without adding significant quality to the usability of the solution. Furthermore, the customer wanted to pay neither for a thorough analysis of the target group, i.e. department store customers, nor for a final user test. These cost savings watered down the user centred process.
3.1 Cross-Analyzing the Cases
Our data suggests three different approaches across the cases, which we use as starting points for analyzing and comparing the cases. Each approach seems to have, or could have, a significant impact on the usability of the end product. The approaches are: (1) the existence or development of living visions or organizational goals in the organizations; (2) the technology used to implement the system and the technical context in which it was implemented; (3) the shaping of the software development process.
Approach 1: The existence or development of living visions or organizational goals in the organizations. All three cases were influenced by visions or organizational goals, but the effect of these was very different. In case 1, two main goals were important. (1) Tying customers closer to the company by selling products from more than one branch of the company. This goal was pursued by making the tool customer centred and by making it easy to refer customers to other branches. (2) Following best practices when selling insurance products. This was done by never leaving customers with obvious needs that were not treated in the sales process. The treatment was documented in the printed policy
and signed by the customer. This was done to harmonize expectations between the customers and the company and thereby avoid disappointed and complaining customers when an insurance event occurred. The redesign of the printed policies introduced a problem with the clarity of the policy, since a normal policy that was handed over and signed by the customer was about 18-20 pages long. Since the old tool produced a three-page policy, this change directly influenced the sales process. In case 2 there was a clear vision about modernizing and opening the organization to make it more attractive to new or younger members. Modernizing included revising the administrative processes in order to save money and strengthen the campaign machinery. For example, the new platform included a web-based publication module where members could create folders and posters from a set of templates and send them directly to the printing house without dealing with colour formats and other technical issues. One key to opening the organization was the design of an individual entry page called ‘my page’. My page should give the members easy access to discussion boards, mailing lists, and relevant homepages, but the page suffered from a lack of user interface usability. It provided too much information and was difficult to use. This problem could be explained by a significant disagreement among the stakeholders about its purpose, functionality and design. In case 3 the buyer had a set of visions that was not clearly absorbed by the project team, and some of the project members expressed doubts about how realistic it was to fulfil the visions. The website should inspire customers and attract them to the physical department stores, and should help build and maintain customer loyalty. Two means supported this.
First, the company developed an electronic and physical fashion magazine, which included various articles about fashion, showed various shopping articles, and linked to other text articles on the website. Second, the buyer introduced a special search concept. When customers searched for an article or a brand, the search result displayed the various available articles of that brand and where to find them physically in the department stores. Based on the three cases, we observe that fulfilling visions and goals in a project is strongly influenced by organizational usability. In all three cases the systems were important tools for creating loyalty or solidarity, but different approaches were chosen. In case 1 the utilization of the visions grew out of the comprehensive involvement of the various stakeholders, through workshops and formal or informal evaluations. In case 2 the design of the contract was an important factor in letting the understanding of the organizational usability develop, while the design and redesign of business processes was an important tool for its realization. The small project team with tightly cooperating members was well qualified for the job. In case 3 only one or a few key persons from the customer’s organization understood the concept that was implemented, and they did not succeed in making the solution an integral part of the organization. Furthermore, our data suggests that successful realization of visions and goals depends on a thorough and coherent understanding of the users and the situation of use. Thus inadequate user interface usability constitutes a significant risk of not fulfilling the visions and goals.
Approach 2: The technology used to implement the system and the technical context in which it is implemented. All three cases relied on web technology and were dependent on the technical context, but the technical impact on usability was very
different. One important commonality across the cases was the centralized architecture, which made it easy and relatively inexpensive to fix errors and ‘roll out’ new corrected versions of the software. Compared to traditional software development, the test efforts were reduced because problems were easy to fix. In cases 1 and 2 less attention was directed at the deliveries when they were first put into use, and the organizations thereby failed to profit fully from the centralized architecture. The tool in case 1 was a Java application running on a number of Citrix servers, accessed through a traditional wired network or a high-speed mobile phone connection. At an early workshop the users were asked “What can we do to make your everyday better?” This provided important information about possible improvements of the tool, such as how online access to the national civil registration register could help the users form the household quickly and correctly while visiting the customers. The online capabilities also made data validation possible through integration with the back-end systems. This drastically reduced the number of errors that required intervention from other employees after the sales were finalized. The wireless setup had a major performance problem: it took up to 17 minutes to print the policy, which preferably should be signed by the customer during the visit. Case 2 relied on a component-based, service-oriented architecture. This architecture made the solution extremely flexible to expand and modify and supported fast adaptation to changes in the short-term goals of the organization. For example, components of the existing infrastructure were easily integrated into the new solution, which made the solution usable from an early stage in the overall development process, and the ability to adapt quickly to changing goals proved very useful when internal and general elections were announced. Case 3 took the most conservative approach to technology.
The customer’s main focus was on getting a stable solution, which they got. The contractor put a lot of effort into delivering a strict html 1.0 compliant solution. This did not have a clear influence on the usability of the end product, but increased the cost of the solution significantly. Integration of the web-shop with the existing enterprise resource planner-system was a major issue, which was postponed since the customer’s IT department lacked resources to assist this work. This left the administrative and logistic processes to be carried out more or less manually and thereby exposed to human failures. This caused concerns among the stakeholders and would have been a major problem in the organization had the web-shop been a large success. The technological comparison suggests a number of things. First, the ability to integrate with other systems can have a huge effect on both user interface usability and organizational usability, and failing to integrate can have severe consequences for the organization. Our data suggest that successful integration depends on continuously bringing experts together. Second, discovering and utilizing the technological abilities can be a learning process that needs space and time. Relying on well-known technology and solution patterns reduces the risk of technical issues, but might also reduce innovation in the solution and in the organization, which can reduce both the user interface usability and the organizational usability. New technology can be used to evolve usability and increase the usefulness of the end product, but with a greater risk. Third, relying on specific technology and standards can introduce limitations, formal and informal. This can be a reasonable overall decision, but the consequences for usability are hard to anticipate.
Usability and Software Development: Roles of the Stakeholders
Approach 3: The shaping of the software development process. In our three cases we see three very different software development processes. Case 1 relied on a human centred development process. The team aimed at putting the customer at the centre of the tool. All possible stakeholders within the company were involved, and anyone on the team was entitled to have an opinion and share it. Occasionally this made the process very time-consuming and demanding to handle. The result of the development process was a solid all-round sales tool, where different orientations of usability were considered. Neither the user interfaces nor the processes were optimized, but both were designed well. Through a number of iterations involving various users, most parts of the user interface were tested before the final user tests. Case 2 was a business process centred development process. The main focus was on identifying important business processes, describing the processes in detail, identifying stakeholders in the processes, and then implementing the processes. All main design activities started with drawing up and analyzing the involved processes, and the project organization saw it as its main task to “add electric current to the business processes”. The positive outcome of the process-oriented development was a system that supported a variety of processes in the organization and was well integrated with existing and new processes and components. However, it also resulted in a non-optimized user interface with serious flaws. Case 3 had a user centred development process as its starting point. The user centred process was reduced due to economic limitations, since the customer did not want to pay for a target group analysis or a user test. This decision was inconsistent with the contractor’s advice. In the development process, focus was on the front-end of the system, and the back-end was only minimally adjusted to the customer’s organization.
The customer took only a minimal part in the development project, and although the contractor paid some attention to the organizational issues, the integration with the existing business did not work well and introduced a serious risk to the project. The comparison of the three different development processes suggests two main issues regarding usability. First, a process-oriented approach favours organizational usability, while a user centred approach, mainly considering direct users, favours user interface usability. The human centred approach of case 1, aiming at considering all possible stakeholders, places itself in between by promoting both organizational usability and user interface usability. Second, the human centred approach required a lot of resources because of the broad discussions, which were deliberately avoided in cases 2 and 3. In both cases 2 and 3 the project managers were clearly aware of the risk of overloading the project and refrained from involving users in specific situations, while the project manager in case 1 aimed at ensuring that ‘the user involvement did not get out of hand’.
4 Discussion
We discuss possible means to improve integration of usability work and software development based on the three approaches. Approach 1: The existence or development of living visions or organizational goals in the organizations. We find that the main issues regarding this approach are: (1) How is a living vision established, evolved, and maintained throughout the
development process? (2) How are visions and goals transformed into concrete and usable systems design? (3) How is usability of the systems design evaluated together with the visions? Participatory IT Design [2] and Contextual Design [1] suggest how to develop and utilize visions in systems design, but how to evolve, maintain and evaluate the vision and goals is not discussed. In our cases the visions and goals are initially anchored among the non-technical stakeholders, and it becomes their task as vision carriers to maintain and propagate the visions to the entire set of stakeholders, and particularly to anchor the visions and keep them alive together with the key technical stakeholders. This is for example carried out through workshops, and workshops are also used as a place where visions and goals can inform the concrete systems design. Cases 1 and 2 include a number of critical decision points where intervention by the vision carrying stakeholders was necessary to retain focus on the overall project goals, also in situations where fast and comprehensive reordering of priorities was urgent. Also, we do not see this issue discussed in either the usability literature or the software engineering literature. Since goals and visions seem to have great influence on organizational usability, an iterative process with evaluations and redesigns taking shape in accordance with visions might be a way to better support organizational usability and thereby to better realize the full potential of the solution. Approach 2: The technology used to implement the system and the technical context in which it was implemented. We find that the main issues regarding this approach are: (1) How do we best realize the technological possibilities regarding usability? (2) How do we visualize and evaluate the consequences of the technological choices regarding usability? (3) How do we evaluate the technical implementation regarding usability before it is too late?
Both Participatory IT Design [2] and Contextual Design [1] suggest that technology and the technical context are important when planning and designing new IT-systems, but the need for ongoing evaluation during development is not covered. Our cases show that key stakeholders are aware of how technology can support usability work, for example by making it easy and inexpensive to update web-based software on central servers, which should make it possible to fix a number of usability issues at a reasonable cost. Unfortunately, our data also show that this possibility is not properly utilized, since focus shifts to other important tasks, even though an insufficient or even defective system is put into use. Furthermore, it might be more difficult than anticipated to upgrade the systems after a large number of users have taken the system into use. We also observe how rigidly relying on standards can introduce new risks if the standards are not necessary and coherent with the visions. Adhering to standards can place considerable demands on scarce resources and remove focus from more critical issues. Approach 3: The shaping of the software development process. We find that the main issues regarding this approach are: (1) How is the development process organized? (2) How do the stakeholders stay engaged in the development process? (3) What tools are advantageous and profitable to apply? We have not yet seen a process taking both organizational usability and user interface usability into account in a controlled and efficient manner. This applies to both the involvement of stakeholders and the use of methods and techniques. So far methods and techniques in HCI are primarily backing user interface oriented usability. This is visible for instance in the many evaluation techniques such as Heuristic Evaluation, Cognitive Walkthrough and
Think Aloud Tests. Techniques for uncovering organizational usability issues are far fewer and less commonly used [6].
5 Conclusions
This study reports on three interview-based case studies of software development projects in which important web-based applications were implemented. We have aimed at describing different stakeholders’ contributions through cross analysis of the development projects. In all three cases the stakeholders appear as individuals without an archetypical role. They all have positions, interests, and competences that make them important individual contributors. The cases show how end product usability depends on various factors in the software development project, such as the presence of living visions, the technological choices, and the applied software development processes. Important usability contributors are found both at the user interface usability level and at the organizational level. While many techniques for developing user interface usability are employed, techniques to support the uncovering of organizational usability are lacking. Particularly important are the vision carriers, who are able to keep the project on track with a clear focus on the organizational usability issues when plans have to be adjusted. Descriptions of work practices and techniques supporting this task are rare, both in human computer interaction and software engineering.
Acknowledgments. This work is part of the USE-project (Usability Evaluation & Software Design) funded by the Danish Research Agency through the NABIIT Programme Committee (Grant no. 2106-04-0022).
References
1. Beyer, H., Holtzblatt, K.: Contextual Design. Morgan Kaufmann, San Francisco (1998)
2. Bødker, K., Kensing, F., Simonsen, J.: Participatory IT Design. The MIT Press, Cambridge, Massachusetts (2004)
3. Juristo, N., Windl, H., Constantine, L.: Introducing usability. IEEE Software, 20–21 (2001)
4. Landauer, T.K.: The trouble with computers. MIT Press, Cambridge, MA (1995)
5. Strauss, A., Corbin, J.: Basics of Qualitative Research. SAGE Publications, Thousand Oaks, CA (1998)
6. Vredenburgh, K., Mao, J., Smith, P., Carey, T.: A survey of user-centered design practice. In: Proc. CHI 2002, Minneapolis, Minnesota, USA (2002)
7. Wixon, D.: Evaluating usability methods: Why the Current Literature fails the Practitioner. interactions 10(4), 29–34 (2003)
Human Performance Model and Evaluation of PBUI Naoki Urano1 and Kazunari Morimoto2 1
SHARP Corporation, Nagaikecho 22-22, Abeno-ku, Osaka-shi, Osaka 545-8522, Japan [email protected] 2 Graduate School of Science and Technology, Kyoto Institute of Technology Matsugasaki, Sakyo-ku, Kyoto-shi, Kyoto 606-8585, Japan [email protected]
Abstract. We analyze and discuss a human performance model for PBUI (Push-Based User Interface) in this paper. PBUI is a user interface method in which a user performs a desired task by selecting a target object that usually represents the task itself. The candidate objects are sequentially and automatically presented to the user by the system. When a target object is presented, the user selects it by a simple action such as pushing a button. In this paper, we propose a human performance model of PBUI and discuss the characteristics of PBUI. We also evaluate the performance of PBUI by comparing it with GUI. Keywords: user interface model, performance model, PBUI (push-based user interface).
for ordinary PC users to manipulate such scroll bars, but it is often hard for non-computer users, beginners, novice users, naïve users, or disabled users [2]. The following cases make GUI difficult to use.
1. Users do not know how to use a graphical pointer and manipulate GUI elements to perform a task.
2. Users themselves have difficulty using a graphical pointer or cannot manipulate GUI elements smoothly to perform a task.
3. Users are in a limited environment in which they cannot use a graphical pointer to manipulate GUI elements.
In case 1, the users are usually beginners, novice users, or naïve users. Note that there are many non-computer users in the world. They usually do not have the opportunity to take lessons on how to use computer devices. Hence, it is sometimes difficult for them to manipulate the computer devices actively. However, it is common for them to watch a display screen and react to it, much as they do when watching TV. Therefore, if we provide a passive user interface, meaning one that requires fewer active operations, they might be able to use a computer more effectively. In case 2, the users are physically handicapped and have difficulty using graphical input devices freely. If we provide a user interface that incorporates a simpler input device, they can use a computer more smoothly. In case 3, the users cannot use graphical input devices because of their situation. It is sometimes difficult to manipulate windows or widgets in a limited environment. For example, your hands are not free to operate complex input devices when you are driving an automobile. In the above cases, a user interface that is simpler and easier than GUI helps users perform tasks that they cannot perform easily in GUI. We propose the push-based user interface (PBUI). It has the following characteristics.
1. Simpler input device
2. Fewer input operations
3. Passive user interface
Simple input devices with fewer input operations are important for beginners or novice users. In PBUI, the user performs an arbitrary task by just pushing a button to select a target object presented by the system. We call it a one-button interface. The target object is usually represented by an icon, image, or graphic whose meaning the user can easily recognize. The passive user interface is an important feature of PBUI. It guides the user to a designated task without any active input from the user. The interface prompts the user to select the target object, representing a task, that the system has presented. In other words, the system pushes a suggestion to the user. That is why we call it a push-based user interface. In most GUIs, users need to manipulate the input device actively. Users move the pointer to the menu bar, open menus, point to the chosen menu item, and click to perform a task. Unless the user takes some action, the system does not change its state and the display does not change at all. In PBUI, users do not need to act first. Instead, users wait for the chosen menu or object to be presented to them. Users just select it by pushing the button when the designated
menu or object is displayed. The system changes the display to guide the user to perform a task. In the above photo image searching application, PBUI displays the photo images (i.e. candidate objects) sequentially, rather than displaying a group of photo images and waiting for the user's active input of pointing, dragging, scrolling, picking, etc. In PBUI, users just wait until a desired photo image (i.e. target object) is displayed. It is unnecessary to know how to use the GUI manipulators of windows such as scroll bars. Rather than the user manipulating the windows, the candidate objects change automatically. The user selects the target object when it is displayed. In this paper we present a human performance model for PBUI, identify important factors of PBUI, and discuss the important issues about PBUI that are listed below.
1. What is the right duration time for displaying candidate objects?
2. What is the right number of candidate objects to be displayed simultaneously to the user?
3. Is it affected by the complexity of objects?
4. What is an effective way to display candidate objects to the user?
5. Performance compared with GUI.
2 PBUI Human Performance Model
There are many possible ways to display candidate objects. PBUI’s performance depends strongly on how the objects are displayed to the user. If candidate objects are displayed one by one, it is easy for the user to recognize and select the target, but it takes a long time to reach the target object, which is very inefficient. On the other hand, if many candidate objects are displayed at once, it is very difficult for the user to find the right target among the many candidates, and the user often misses the target even though it has been displayed. To maximize efficiency, we need a human performance model for PBUI. Based on the model human processor [3], the total time to perform a task consists of perception time, cognition time, and motion time. The task operation is processed as follows.
1. Visual stimulus by displaying candidate objects.
2. Perception: Perceive the candidate objects.
3. Cognition: Recognize the target object.
4. Motion: Push a button to select the object.
The displaying time of an object is determined by the perception process, the cognition process, and the motor process. Perception time, $T_{\mathrm{perception}}$, is the time needed for the user to perceive that candidate objects are displayed. Cognition time, $T_{\mathrm{cognition}}$, is the time needed for the user to recognize the meaning of an object. Motion time, $T_{\mathrm{motion}}$, is the time needed for the user to react to the target object for selection. Thus the total time, $T_{\mathrm{total}}$, needed for the user to decide on and select the target object is the sum of these times, as represented in equation (1).
$$T_{\mathrm{total}} = T_{\mathrm{perception}} + T_{\mathrm{cognition}} + T_{\mathrm{motion}} \tag{1}$$
It is important for an application associated with PBUI to design the appropriate time duration for displaying candidate objects. We assume that the perception time and the motion time are constant for a particular PBUI system. We should carefully design the time duration for the cognition time to maximize the human performance.
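As a minimal sketch of equation (1), the snippet below sums the three stage times. The class name `StageTimes` is hypothetical, and the numeric values are the nominal processor cycle times of the model human processor [3] (about 100 ms perceptual, 70 ms cognitive, 70 ms motor), used here only as illustrative placeholders; they are not values measured in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageTimes:
    """Per-stage durations in seconds for one selection (equation 1)."""
    perception: float
    cognition: float
    motion: float

    def total(self) -> float:
        # T_total = T_perception + T_cognition + T_motion
        return self.perception + self.cognition + self.motion


# Illustrative placeholder values (assumption): nominal model human
# processor cycle times, not measurements from the PBUI experiments.
nominal = StageTimes(perception=0.100, cognition=0.070, motion=0.070)
print(f"T_total = {nominal.total():.3f} s")
```

Under the assumption stated in the text that perception and motion times are constant for a given PBUI system, only the `cognition` field would vary between object sets.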
3 Experiments
3.1 Displaying Time
In this section, we investigate the right duration time for displaying candidate objects. To find a reasonable displaying time for an application, we need to know how long it takes to process a candidate object. We had related experiments before [4]. We used the simple objects shown in figure 1 for the experiments.
Fig. 1. Simple objects used in the experiments
Experiments were carried out as follows.
1. A target object is presented to the test subject, who memorizes it.
2. Candidate objects are randomly displayed in front of the test subject.
3. If the target object is displayed, the test subject selects it by moving a finger from the home position to it. If the target object is not displayed, he or she just releases the finger from the home position to move to the next display.
The average time to process a candidate object was 0.42 sec. The average time to process a target object, that is, a presented candidate object that was the same as the target object, was 0.61 sec. We ran the experiment for the 8, 16, and 24 object cases. In the 8 object case, 8 objects are displayed simultaneously in a 2 x 4 layout (i.e. 2 rows and 4 columns). Correspondingly, candidate objects are displayed in a 4 x 4 layout for 16 objects and a 4 x 6 layout for 24 objects. More time is needed to process multiple candidate objects on the same display. Figure 2 shows the results. When the target object is on the display screen, the process time is shorter than when it is not, because the test subject does not need to examine all the candidate objects. Equation (2) shows the elements of the time required for the user to process one display.
$$T_{\mathrm{process}} = T_{\mathrm{perception}} + n\,T_{\mathrm{cognition}} + T_{\mathrm{eye}}(n) + T_{\mathrm{motion}} \tag{2}$$
where
$n$: number of objects displayed simultaneously
$T_{\mathrm{perception}}$: time required to perceive the display change
$T_{\mathrm{cognition}}$: time required to recognize whether a candidate object is the target or not
$T_{\mathrm{eye}}(n)$: time required to move the eyes from one candidate object to another
$T_{\mathrm{motion}}$: time required to select the target or to move to the next display
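Equation (2) can be sketched as a small function. The paper does not give a closed form for the eye-movement term, so `linear_eye_time` below is a hypothetical stand-in that charges one eye movement between consecutive objects; the 0.23 s default and the stage times in the usage loop are illustrative assumptions, not values reported in this paper.

```python
from typing import Callable


def process_time(n: int,
                 t_perception: float,
                 t_cognition: float,
                 t_eye: Callable[[int], float],
                 t_motion: float) -> float:
    """Equation (2): time in seconds to process one display of n objects."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return t_perception + n * t_cognition + t_eye(n) + t_motion


def linear_eye_time(n: int, per_move: float = 0.23) -> float:
    """Hypothetical T_eye(n): one eye movement between consecutive objects."""
    return (n - 1) * per_move


# Illustrative stage times in seconds (assumed, not measured here).
for n in (1, 8, 16, 24):
    t = process_time(n, 0.10, 0.07, linear_eye_time, 0.07)
    print(f"n={n:2d}: {t:.2f} s per display")
```

Because every object contributes a full cognition term, this form corresponds to a display in which all n candidates are examined, which matches the no-target case described above.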
[Figure 2 plot: displaying time (sec) against number of displaying objects, with one series for displays that contain the target object and one for displays that do not]
Fig. 2. Displaying time for different number of objects
3.2 Appropriate Number of Objects
Based on the experiment and equation explained in the previous section, we designed an experiment to investigate the optimum number of simultaneously displayed objects. In this experiment, we measured the total time to perform the task for each number of simultaneously displayed objects. The test subject processes all candidate objects to select the target object. The total time required for the task is represented in equation (3).
$$T_{\mathrm{task}} = \frac{N}{n}\left(T_{\mathrm{perception}} + n\,T_{\mathrm{cognition}} + T_{\mathrm{eye}}(n) + T_{\mathrm{motion}}\right) \tag{3}$$
where $N$ is the total number of candidate objects.
[Figure 3 plot: task time (sec) against number of displaying objects (1, 8, 16, 24)]
Fig. 3. Task time (total time to perform a task) for different number of displaying objects
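Equation (3) can also be read in reverse. As a minimal sketch, the snippet below takes the average task times reported for this experiment (N = 100; 56.05, 11.42, 16.32, and 15.06 sec for 1, 8, 16, and 24 displayed objects) and divides by the number of displays N/n to obtain the implied time per display. The function name is hypothetical, and treating N/n as a fraction (for example 12.5 displays for n = 8) follows the equation as written rather than any rounding rule stated in the paper.

```python
N_TOTAL = 100  # total number of candidate objects in the experiment

# Average task time (sec) per number of simultaneously displayed objects,
# as reported in the text of this section.
reported_task_time = {1: 56.05, 8: 11.42, 16: 16.32, 24: 15.06}


def implied_display_time(n: int, t_task: float, n_total: int = N_TOTAL) -> float:
    """Invert equation (3): T_process = T_task / (N / n)."""
    return t_task * n / n_total


for n, t_task in reported_task_time.items():
    per_display = implied_display_time(n, t_task)
    per_object = t_task / N_TOTAL
    print(f"n={n:2d}: {per_display:.2f} s per display, "
          f"{per_object:.3f} s per object")

# Condition with the shortest reported task time among those tested.
fastest = min(reported_task_time, key=reported_task_time.get)
print("shortest reported task time at n =", fastest)
```

The per-object figure is simply the task time divided by N, which makes the four conditions directly comparable.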
In this experiment, the total number of candidate objects is 100. The number of test subjects is 6. Each test subject had 80 trials for each number of displayed objects. The average times required for the task are 56.05 sec for 1 displayed object, 11.42 sec for 8 displayed objects, 16.32 sec for 16 displayed objects, and 15.06 sec for 24 displayed objects, as depicted in figure 3. The results show that the optimum number of simultaneously displayed objects is between 8 and 16.
3.3 Complexity of Objects
We discuss how the complexity of objects affects the number of displayed objects that yields good performance. We had related experiments before [5][6]. We used similar objects, depicted in figure 4, to test the performance. The complexity was set based on the number of vectors included in a picture. In figure 4, the complexity of the picture increases from left to right.
[Figure 4: three sample objects labeled Complexity 1, Complexity 2, and Complexity 3]
Fig. 4. Sample objects with different complexity
Experiments showed that, as expected, the user needs more time to recognize an object if its complexity is high. However, the complexity does not affect the optimum number of simultaneously displayed objects. If the optimum number is n, displaying n objects is appropriate for objects of any complexity. The area that an object occupies on the screen might affect the recognition time. We suggest that the object should be large enough for the user to recognize.
3.4 Effective Way to Display and Performance Compared with GUI
In this section we discuss what an effective way to display objects in PBUI is. We suggest two types of PBUI for comparison: an automatic paging user interface and an automatic scrolling user interface. The display designs are depicted in figure 5. The circle represents the position where a candidate object, such as a graphical object in figure 4, is placed. The automatic paging system was tested as follows.
1. When the user’s finger is placed on the home position, 12 candidate objects are displayed for the displaying duration time. The duration time is derived from equation 2, based on a preliminary experiment that determined appropriate values of each element of the equation for the objects.
2. After displaying the page for this time, the system automatically displays the next page, which contains the next 12 candidate objects.
[Figure 5 panels: Automatic Paging and Automatic Scrolling, each showing object positions and a home position]
Fig. 5. Display designs of PBUI
3. When the test subject finds the target object, he or she moves the finger to the target object to select it.
4. The total time to perform the task is measured.
In the automatic scrolling system, candidate objects are presented to the user in a different way. Instead of changing the whole page simultaneously, the objects are smoothly moved to the right. Each object is displayed for a certain time. The method of selecting the target object is the same. Table 1 shows the average time to perform a task in which the target object appears on the tenth page or equivalent. The automatic paging and the automatic scrolling showed about equal performance at every complexity. We do not conclude that one PBUI is better than the other, but it was reported that the accuracy of selecting the right target differs [6]: the accuracy of automatic scrolling is better than that of automatic paging. We suppose that this is because, in automatic paging, the user has to move the eyes actively from one object to the next within a page. It might be easy for the user to miss the target object because the user has to scan all objects in the page within the time set by the system. In automatic scrolling, rather than actively moving the eyes, the user’s eyes stay on roughly the same vertical line to scan all candidate objects. It is easy for the user to scan all objects; in other words, the user seldom misses the target object. We think that the accuracy difference comes from the user’s scanning ability. Thus, we think that automatic scrolling is suitable for naïve users. This is consistent with our assumption that PBUI is a user interface for beginners, novice users, or naïve users.
Table 1. The results of task performance for PBUI on different complexity of objects
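The timing of the automatic paging procedure described in this section can be sketched as follows. The page size of 12 comes from the text; the function names and the 3.0 s page duration are hypothetical, since the paper derives the duration from equation 2 but does not report its value.

```python
def page_of(target_index: int, page_size: int = 12) -> int:
    """1-based page on which the object at 0-based target_index appears."""
    if target_index < 0:
        raise ValueError("target_index must be non-negative")
    return target_index // page_size + 1


def time_until_target_page(target_index: int, page_duration: float,
                           page_size: int = 12) -> float:
    """Seconds elapsed before the page holding the target is first shown."""
    return (page_of(target_index, page_size) - 1) * page_duration


# A target whose index falls on the tenth page, with an assumed
# (hypothetical) page duration of 3.0 seconds.
target_index = 108
print(page_of(target_index))                      # page number of the target
print(time_until_target_page(target_index, 3.0))  # waiting time in seconds
```

The user's own selection time on the target page would be added on top of this waiting time; it is left out here because it depends on the measured stage times.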
Fig. 6. Graph of task performance for PBUI and GUI
3.5 Performance Compared with GUI
We provided two pilot GUIs for comparison. One is a manual paging user interface, and the other is a manual scrolling user interface. The display designs are depicted in figure 7. In the manual paging user interface, the test subject needs to move the finger to the arrow to go to the next page, which contains the next 12 candidate objects. In the manual scrolling user interface, the test subject needs to drag the scroll bar to navigate in the window and find the target object. Table 2 shows the average times.
[Figure 7 panels: Manual Paging, showing object positions, a Next Page arrow and a home position; Manual Scrolling, showing object positions and a scroll bar]
Fig. 7. Display designs of GUI
Table 2. The results of task performance for GUI on different complexity of objects
The performance of PBUI of the automatic paging user interface and the automatic scrolling user interface is about equal to the manual paging user interface. Those three show better performance than the manual scrolling user interface that is widely used in GUI for photo applications.
4 Conclusions
This paper explains the characteristics of PBUI and suggests some important factors of PBUI. PBUI is an alternative user interface for the users discussed in the introduction. We present the human performance model as an equation. Based on the human performance model, we discussed the important factors of PBUI: the duration time for displaying objects, the number of objects to be displayed simultaneously, the complexity of objects, and the displaying method. We summarize our answers to the issues raised in this paper as follows.
1. What is the right duration time for displaying candidate objects? The duration time should be determined using equation 2.
2. What is the right number of candidate objects to be displayed simultaneously to the user? For a simple application such as an image exploring application, the number is between 8 and 16.
3. Is the number of simultaneously displayed candidate objects affected by the complexity of objects? The experiments show that it is independent of the complexity of objects.
4. What is an effective way to display candidate objects to the user? There are many ways to display candidate objects. The automatic scrolling user interface is a typical PBUI in which users do not need to scan the objects actively.
5. Performance compared with GUI. It depends on the application. If the application is very simple, like an image exploring application, PBUI performs as well as, or better than, GUI. It is important to find suitable applications for PBUI.
Showing that PBUI is an effective user interface for real applications is left for future work.
References
1. Margone, S., Shneiderman, B. (eds.): A study of file manipulation by novices using commands versus direct manipulation, Twenty-sixth Annual Technical Symposium, pp. 154–159. ACM, Washington DC (1987)
2. Maulsby, D.L., Witten, I.H.: Inducing programs in a direct manipulation environment, Proc. CHI’89 Conference, Human Factors in Computing Systems, pp. 57–62. ACM, New York (1989)
3. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (1983)
4. Takekuni, T., Urano, N., Morimoto, K., Kurokawa, T.: Proposal of Push-based User Interface and its Operating Characteristics, In: 2003 Japan Ergonomics Society Kansai branch conference proceedings, pp. 146–149 (2003)
5. Takekuni, T., Urano, N., Morimoto, K., Kurokawa, T.: A Study on Number of Objects in Push-Based User Interface, In: Human interface symposium proceedings, pp. 109–112 (2004)
6. Li, Q., Urano, N., Morimoto, K., Kurokawa, T.: A study of the visual Push-Based User Interface that considers practicality. In: The 7th. human media workshop proceedings (2006)
Developing Instrument for Handset Usability Evaluation: A Survey Study Ting Zhang, Pei-Luen Patrick Rau, and Gavriel Salvendy Department of Industrial Engineering Tsinghua University, Beijing 100084, China [email protected]
Abstract. The handset is transforming from a traditional cellular phone into an integrated content delivery platform for communications, entertainment and commerce. Its increasing capabilities and value-added features provide more utility and, at the same time, make the design more complicated and the device more difficult to use. An online survey was conducted to measure users’ perception of the usability level of their current handset using a psychometric type of instrument. A total of 9 usability factors were derived from the results of exploratory factor analysis. These 9 factors explained 65.20% of the overall variance of the data. The average internal consistency in this study is 0.70. Keywords: Handset; Usability; Usability measurements; Usability factors; Instrument; Survey.
end-users need to be considered when measuring the usability of handsets. Survey research allows large amounts of data to be gathered with relatively little effort and supports broad generalization of results [7]. 2) Mobility is quite difficult to simulate in a laboratory setting because of the changing context. The use of such devices in the context of doing other work also has implications for determining the context of use for usability testing [8]. 3) As an inquiry method, the questionnaire survey plays a major role in subjective measurements. However, past reviews of research have indicated a lack of survey studies and psychometric instruments that simultaneously measure multiple key concepts in the quality of experience in software systems [7, 9]. Furthermore, in many cases, the questions of the standardized instruments are not specific enough to investigate handsets [10]. To fill this gap, the present study develops a usability instrument comprising specific design elements and structured usability factors unique to handset devices. The objective of this study is to develop an instrument to measure the perceived usability of handset products. The research focused on two questions: (1) What are the most important usability factors for indicating the overall perceived usability of a handset? (2) How do the factors contribute to the overall perceived usability of a handset? The approach used in this study is expected to provide an innovative and systematic methodology for explaining and measuring the usability of handsets.
2 Literature Review

2.1 Usability of Handset

Many studies have focused on usability testing of individual handset design issues such as keystrokes [11, 12], content presentation [13, 14], battery duration [15], and menu structure [16, 17]. However, few of them identify usability dimensions and design factors across multiple features with regard to users’ subjective feelings about handset use. Chuang, Chang, & Hsu [18] examined the relationship between users’ preferences for mobile phones and their form (hardware) design elements. Participants were asked to judge 26 mobile phone designs using a user preference rating scale for 11 image words. Han et al. [1] measured 88 specific design elements for 36 products using a measurement checklist. Another related study investigated the relationships among the design features of cell phones based on 1,006 college students’ preference ratings of their current cell phones [19]. The results identified 5 design issues that significantly affect users’ overall satisfaction: calling-related features, personal preferences, portability, durability, and aesthetics. All of those studies were based on currently available cell phone models and focused on frequently used features, so their results can only tell us which of the current designs is best. However, mobile technology is developing very quickly, and it is necessary to consider advanced features and functions that are uncommon now but will probably be popular soon. In further research, Ling, Hwang, & Salvendy [20] investigated the relationship between five advanced features and users’ preference
T. Zhang, P.-L.P. Rau, and G. Salvendy
level. The results showed that the color screen, voice-activated dialing, and Internet browsing features strongly predict users’ satisfaction levels. We must be aware that advanced features change quickly, which makes usability research difficult. For instance, the color screen and the camera function may have been attractive features a few years ago; today most cell phones have a color screen, and the camera function has also become a must-have feature for most mobile phone users.

2.2 Handset Usability Dimensions/Factors

Menu structure and user interfaces are critical design features related to basic communication functions: text entry, dialing and messaging, calendar, etc. The usability of those basic features can strongly influence users’ overall satisfaction with the product, so many usability criteria have been studied for them. Efficiency, effectiveness, simplicity (complexity), learnability, consistency, feedback, and memorability are essential and often addressed [21-26]. Ziefle [26] also indicated that predictability, familiarity, and generalizability (transfer of knowledge from a specific interaction to similar interactions) are crucial user criteria for selecting mobile phones. Ji, Park, Lee, & Yun [27] developed a usability checklist consisting of five groups of usability factors for mobile phone user interfaces, based on 21 usability principles. The efficient retrieval of information involves the organization of information, the method of accessing it, and the form of delivery. Information must be presented so that it is both easy to obtain and easy to integrate mentally, because navigation tasks on a handset place heavy cognitive demands on the user’s short-term memory owing to limited screen size, scrolling capabilities, and slower processing [21].
Research on user performance in information retrieval largely focuses on efficiency (speed and errors) and effectiveness (quality of outputs) [16, 22, 28]. Providing direct access to focused, valuable content and simple hierarchies increases efficiency and reduces keystrokes and text entry [16]. Another example of efficient information retrieval is described in [29], which provides a mobile phone feature named two-phase fetch for retrieving mail via Internet services. Other critical usability factors include network connectivity, flexibility [30, 31], and personalization [32]. The emerging mobile commerce (m-commerce) technology promises exciting possibilities, but user experience and acceptance of this technology are not yet well understood. Bruner & Kumar [33] applied the technology acceptance model (TAM) to the consumer context and found that the “fun” attribute contributes to consumer adoption of handheld Internet devices even more than perceived usefulness. It has been strongly suggested that personalization is essential to creating a positive mobile experience [34, 35]. Tarasewich [36] suggested that the context of use and security must be taken into account in the design and use of mobile commerce applications, which are affected by changing environmental conditions. Malloy, Varshney, & Snow [37] suggested that the reliability (dependability) of the wireless network infrastructure is necessary for the success of mobile commerce.
Koutsiouris, Vlachos, & Vrechopoulos [38] provided an evaluation framework tailored to mobile music services, which integrates the key variables (factors) involved in the user-mobile interaction process from both a business and a technical perspective. Location-awareness provides mobile users with topical and personal content that may increase the appeal of mobile guides in different application fields. Based on the results of seven field studies, Kaasinen [39] identified utility and user trust as usability factors that strongly affect user acceptance of location-aware mobile guides. Fithian et al. [40] indicated that privacy would be the primary concern when using location-aware technologies, and that integration (with other functions, e.g. calling), understandability (of icons, labels, options, and how the application works), and feedback (on system status or confirmation of important actions) are crucial determinants of users’ performance and satisfaction with mobile location-aware applications. Apart from the technology aspects, Ciavarella & Paternò [41] address usability criteria for the graphical UI design of mobile guide applications with five concerns: web metaphor, navigation feedback, orientation support in the surrounding environment, minimal graphical interaction, and no redundancy in input commands. Howell, Love, & Turner [42] investigated the effect of interface metaphor and context of use (private/public) on the usability of a hierarchically structured speech-activated mobile city guide service. The results showed that visualization of the metaphor-based service significantly affected participants’ attitudes.
3 Survey Study

A psychometric instrument was developed to gather end-users’ subjective perceptions of the usability of their current handsets. First, handset usability dimensions were selected, pruned, and integrated from various sources [3, 6, 27, 43, 44]. Then, initial items were generated from a series of published usability instruments [2, 45-51] and modified with particular consideration for the identified handset usability dimensions. The first version of the instrument consisted of 98 items. All 98 items were first examined by the author for correlation between items and centrality to the concept of usability. Critiques of the items were then obtained from three PhD students and three master’s students who were familiar with the research topic and instrument design. 10 items were deleted and wording was modified according to those critiques. The final version of the scale comprises 88 items plus one global scale (item 89) measuring the perceived usability of the respondent’s current handset. The instrument asks respondents to indicate how strongly they agree or disagree with each item on a scale from 1 (strongly disagree) to 7 (strongly agree). The global scale can be used to analyze the criterion-related validity of the instrument. Not all items apply to all handsets. If an item does not apply to a respondent’s handset, or the respondent believes that it does not, the respondent marks “Not Applicable” (N/A) for that item.
The survey was administered online, using HTML forms, to a broad sample of individuals in China. Participants were recruited via personal contact, email, university BBS messages, blogs, and Web forum announcements. Demographic information, including the respondent’s age, gender, job, and education level, was collected, together with the respondent’s experience with handsets and the manufacturer and model of his/her current handset. On entering the survey website, respondents were first asked to read a short introduction to handset devices and a general definition of usability. They were then asked to fill in a background questionnaire covering demographic information, experience with handsets, and the manufacturer and model of their current handset. After that, the list of items was presented.
4 Results Analysis

4.1 Respondents

A total of 408 users participated in the survey. Prior to the analysis, 42 cases were deleted because of incomplete or inconsistent responses or repeated submissions. In addition, respondents were checked for fully agreeing or fully disagreeing with both items in each of two pairs of items that expressed opposite views (items 3 and 18, and items 37 and 46). 5 respondents had fully agreed or fully disagreed in both pairs and were therefore excluded from the later data analysis. After these procedures, a total of 361 valid cases (143 males and 218 females) remained for the analysis. Respondents averaged 24.9 years of age (SD = 3.10 years), 4.6 years of handset experience (SD = 1.53 years), and 2.9 years of experience with their current handset (SD = 1.34 years). Handsets from over 22 manufacturers were reported in the survey; the four most popular manufacturers covered 71.2% of the sample: Nokia (34.9%), Motorola (15.8%), Samsung (13.3%), and Sony Ericsson (7.2%).

4.2 Factor Analysis

First, item response means were evaluated to determine whether a large percentage of participant responses created a “floor” or “ceiling” effect. Such an effect is observed when many of the individual scores are at one or both extreme ends of the scale, suggesting that the scale may not have captured the actual variability in responses. All of the survey items had mean responses greater than 2.4 and less than 5.6; therefore no items were excluded due to “floor” or “ceiling” effects. A series of exploratory factor analyses was then run to identify the factor structure of the 88-item instrument. The sample data of 361 responses were examined using principal component factor analysis with the equamax rotation method.
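The consistency screening described in Section 4.1 can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: responses are assumed to be stored as a mapping from item number to a score from 1 to 7 (None for N/A), "fully agreed or fully disagreed" is read as giving the same extreme score (7 or 1) to both items of an opposite-worded pair, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Item pairs worded as opposites, as reported in Section 4.1.
OPPOSITE_PAIRS: List[Tuple[int, int]] = [(3, 18), (37, 46)]
SCALE_MIN, SCALE_MAX = 1, 7  # 1 = strongly disagree, 7 = strongly agree

# Assumed storage format: item number -> score, None = "Not Applicable".
Response = Dict[int, Optional[int]]


def contradicts(response: Response, pair: Tuple[int, int]) -> bool:
    """True if both items of an opposite pair received the same extreme score."""
    first, second = response.get(pair[0]), response.get(pair[1])
    return first == second and first in (SCALE_MIN, SCALE_MAX)


def is_inconsistent(response: Response) -> bool:
    """Assumed rule: flag a respondent only if every pair is contradicted."""
    return all(contradicts(response, pair) for pair in OPPOSITE_PAIRS)


def screen(responses: List[Response]) -> List[Response]:
    """Keep only the respondents that are not flagged as inconsistent."""
    return [r for r in responses if not is_inconsistent(r)]
```

Requiring a contradiction in both pairs, rather than in either one, follows the wording "in both pairs" in the text; a stricter rule would replace `all` with `any`.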
Item reduction was based on four commonly employed criteria: 1) eliminating items with factor loadings less than 0.50 on all factors or greater than 0.50 on two or more factors [52]; 2) eliminating single-item factors [53]; 3) the value of Cronbach’s alpha of each factor should not decrease substantially when an item within that factor is dropped [54]; and 4) the derived structure should be simple and easy to interpret [55, 56].

As seen in Table 1, after an iterative sequence of factor analysis and item reduction, the final instrument consisted of 29 items. A total of 9 factors were derived, which explained 65.20% of the overall variance in handset usability. The first two factors, satisfaction (how satisfied the user is with the product and how much the user enjoys it) and controllability (the user’s ability to regulate, control, and operate the product), accounted for about one third of the total variance. The first five factors accounted for almost 50% of the total variance. The internal consistencies of the 9 factors ranged from 0.60 to 0.84 with an average of 0.70, indicating an acceptable level of internal consistency.

Table 1. Factors, eigenvalues, percentage of variance explained and internal consistencies
No.  Factor           Initial      Total Variance Explained  No. of  Internal Consistency
                      Eigenvalue   (cumulative %)            Items   (Cronbach’s Alpha)
1    Satisfaction     6.5718       22.66                     5       0.84
2    Controllability  2.9111       32.70                     3       0.76
3    Effectiveness    1.8337       39.02                     3       0.74
4    Frustration      1.4759       44.11                     3       0.72
5    Customizability  1.4419       49.08                     3       0.67
6    Navigation       1.3064       53.59                     3       0.66
7    Attractiveness   1.2602       57.93                     4       0.64
8    Helpfulness      1.0908       61.70                     3       0.66
9    Consistency      1.0165       65.20                     2       0.60
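The summary figures quoted in Section 4.2 can be re-derived from the values in Table 1. The sketch below is a plain arithmetic check, not part of the original analysis. It assumes that the eigenvalues come from an analysis of the 29 retained items on standardized data, so that the total variance equals the number of items; the function name is hypothetical.

```python
# Values as listed in Table 1, in factor order 1..9.
EIGENVALUES = [6.5718, 2.9111, 1.8337, 1.4759, 1.4419,
               1.3064, 1.2602, 1.0908, 1.0165]
ITEM_COUNTS = [5, 3, 3, 3, 3, 3, 4, 3, 2]
ALPHAS = [0.84, 0.76, 0.74, 0.72, 0.67, 0.66, 0.64, 0.66, 0.60]


def cumulative_variance(eigenvalues, n_items):
    """Cumulative % of variance, assuming each standardized item contributes 1."""
    running, percentages = 0.0, []
    for eigenvalue in eigenvalues:
        running += eigenvalue
        percentages.append(100.0 * running / n_items)
    return percentages


n_items = sum(ITEM_COUNTS)          # number of retained items
cumulative = cumulative_variance(EIGENVALUES, n_items)
mean_alpha = sum(ALPHAS) / len(ALPHAS)

for factor, pct in enumerate(cumulative, start=1):
    print(f"factor {factor}: {pct:.2f}% cumulative")
print(f"items: {n_items}, mean alpha: {mean_alpha:.2f}")
```

Under that assumption the computed percentages agree with the tabulated column to two decimals, which also indicates that the "Total Variance Explained" column is cumulative.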
5 Conclusion and Discussion

The proposed instrument for handset usability testing was partially derived from the experimental and theoretical base outlined in the published literature. The biggest difference between the proposed instrument and other published instruments is that its factors and items were selected with special consideration for handset characteristics. Its internal consistency in the present survey study is acceptable, but its construct validity and discriminant validity need further evaluation.

The contribution of this study is both theoretical and practical. Few studies focus on measuring subjective perceptions of handset usability. Most published instruments are limited to the traditional dimensions of software usability. Furthermore, the methodology for identifying relationships between usability factors and design features has not been systematically addressed. The results and methodology of the present study address these gaps and contribute to the practice of handset design in industry. Despite increasing efforts in technology development, there is a lack of in-depth inquiry into the underlying phenomena, and the concepts of mobility and mobile users are poorly understood. The approach used in this study is expected to provide an innovative and systematic methodology for explaining and measuring the usability of handsets. The survey results may help build a base of knowledge on this topic and help designers recognize false assumptions and better ground their design choices.

The study was limited in several respects. First, the instrument was tested only in Chinese, which raises a problem of semantic validity because of translation. Not all of the items were taken from established instruments, so content validity and criterion-based validity need to be tested in the future. Second, the survey sample is not large enough to support further statistical analysis. The nomological validity should be validated using structural equation modeling (SEM) in the future. Finally, the test-retest reliability of the instrument should be evaluated. Furthermore, because of the diversity of manufacturers and of models within each manufacturer, it is difficult to perform statistical analysis on individual models, owing to the small sample size for each model. Further usability experiments should be conducted to derive more specific design guidelines for improving specific features.
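The internal consistency values reported for the nine factors are Cronbach's alpha coefficients. For follow-up evaluations of the instrument, a minimal sketch of the standard alpha computation is given below. It assumes complete data (no N/A responses) arranged as one list of scores per item, and the function name is hypothetical; it is not taken from the study's own analysis scripts.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for k items, each a list with one score per respondent.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    if len({len(scores) for scores in items}) != 1:
        raise ValueError("every item needs one score per respondent")

    # Total score of each respondent across the k items.
    totals: List[float] = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

For example, `cronbach_alpha([[1, 2, 3], [1, 2, 3]])` evaluates to 1.0, since two identical items are perfectly consistent.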
References

1. Han, S.H., et al.: Evaluation of product usability: development and validation of usability dimensions and design elements based on empirical models. International Journal of Industrial Ergonomics 26(4), 477–488 (2000)
2. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly 13(3), 318–340 (1989)
3. Han, S.H., et al.: Usability of consumer electronic products. International Journal of Industrial Ergonomics 28(3-4), 143–151 (2001)
4. Jordan, P.W.: Human factors for pleasure in product use. Applied Ergonomics 29(1), 25–33 (1998)
5. Jokela, T., et al.: The standard of user-centered design and the standard definition of usability: Analyzing ISO 13407 against ISO 9241-11. In: Proceedings of the Latin American Conference on Human-Computer Interaction, Rio de Janeiro, Brazil. ACM Press, New York (2003)
6. Zhang, D., Adipat, B.: Challenges, methodologies, and issues in the usability testing of mobile applications. International Journal of Human-Computer Interaction 18(3), 293–308 (2005)
7. Kjeldskov, J., Graham, C.: A review of mobile HCI research methods. In: Chittaro, L. (ed.) Mobile HCI 2003. LNCS, vol. 2795, pp. 317–335. Springer, Heidelberg (2003)
8. Scholtz, J.: Usability evaluation (2004) [cited 2006 Oct. 27th]; Available from: http://www.itl.nist.gov/iad/IADpapers/2004/Usability%20Evaluation_rev1.pdf
9. van Schaik, P., Ling, J.: Five psychometric scales for online measurement of the quality of human-computer interaction in Web sites. International Journal of Human-Computer Interaction 18(3), 309–322 (2005)
10. Lee, Y.S., et al.: Systematic evaluation methodology for cell phone user interfaces. Interacting with Computers 18(2), 304–325 (2006)
11. Klockar, T., et al.: Usability of mobile phones. In: Proceedings of the 19th International Symposium on Human Factors in Telecommunications, Berlin, Germany (2003)
12. Ziefle, M., Bay, S., Schwade, A.: On keys’ meanings and modes: The impact of different key solutions on children’s efficiency using a mobile phone. Behaviour & Information Technology 25(5), 413–431 (2006)
13. Bederson, B.B., et al.: A fisheye calendar interface for PDAs: Providing overviews for small displays. In: Proceedings of CHI’03 Conference on Human Factors in Computing Systems, ACM Press, Ft. Lauderdale, Florida, USA (2003)
14. Bederson, B.B., et al.: DateLens: a fisheye calendar interface for PDAs. ACM Transactions on Computer-Human Interaction (TOCHI) 11(1), 90–119 (2004)
15. Bloom, L., et al.: Investigating the relationship between battery life and user acceptance of dynamic, energy-aware interfaces on handhelds. In: Mobile Human-Computer Interaction - MobileHCI 2004, Proceedings, pp. 13–24 (2004)
16. Buchanan, G., et al.: Improving mobile internet usability. In: Proceedings of the 10th international conference on World Wide Web, ACM Press, Hong Kong (2001)
17. Ziefle, M., Bay, S.: Mental models of a cellular phone menu. Comparing older and younger novice users. In: Mobile Human-Computer Interaction - MobileHCI 2004, Proceedings, pp. 25–37 (2004)
18. Chuang, M.C., Chang, C.C., Hsu, S.H.: Perceptual factors underlying user preferences toward product form of mobile phones. International Journal of Industrial Ergonomics 27(4), 247–258 (2001)
19. Ling, C., Hwang, W., Salvendy, G.: A survey of what customers want in a cell phone design. Behaviour & Information Technology (2005)
20. Ling, C., Hwang, W., Salvendy, G.: Diversified users’ satisfaction with advanced mobile phone features. Universal Access in the Information Society 5(2), 239–249 (2006)
21. Albers, M.J., Kim, L.: User Web browsing characteristics using palm handhelds for information retrieval. In: Professional Communication Conference, 2000. Proceedings of 2000 Joint IEEE International and 18th Annual Conference on Computer Documentation (IPCC/SIGDOC 2000), IEEE, Cambridge, MA (2000)
22. Chittaro, L., Dal Cin, P.: Evaluating interface design choices on WAP phones: navigation and selection. Personal and Ubiquitous Computing 6(4), 237–244 (2002)
23. Christie, J., Klein, R.M., Watters, C.: A comparison of simple hierarchy and grid metaphors for option layouts on small-size screens. International Journal of Human-Computer Studies 60(5-6), 564–584 (2004)
24. Kjeldskov, J., Stage, J.: New techniques for usability evaluation of mobile systems. International Journal of Human-Computer Studies 60(5-6), 599–620 (2004)
25. Marila, J., Ronkainen, S.: Time-out in mobile text input: The effects of learning and feedback. In: Chittaro, L. (ed.) Mobile HCI 2003. LNCS, vol. 2795, pp. 91–103. Springer, Heidelberg (2003)
26. Ziefle, M.: The influence of user expertise and phone complexity on performance, ease of use and learnability of different mobile phones. Behaviour & Information Technology 21(5), 303–311 (2002)
27. Ji, Y.G., et al.: A usability checklist for the usability evaluation of mobile phone user interface. International Journal of Human-Computer Interaction 20(3), 207–231 (2006)
28. Jones, M., et al.: Improving Web interaction on small displays. Computer Networks 31(11-16), 1129–1137 (1999)
29. Rao, H., et al.: iMail: a WAP mail retrieving system. Information Sciences 151, 71–91 (2003)
30. Watters, C., Duffy, J., Duffy, K.: Using large tables on small screen display devices. International Journal of Human Computer Studies 58(1), 21–37 (2003)
31. Watters, C., Zhang, R.: PDA access to Internet content: Focus on forms. In: Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS’03) - Track 4. IEEE Computer Society, Hawaii (2003)
32. Anderson, C.R., Domingos, P., Weld, D.S.: Personalizing web sites for mobile users. In: Proceedings of the 10th international conference on World Wide Web, ACM Press, Hong Kong (2001)
33. Bruner, G.C., Kumar, A.: Explaining consumer acceptance of handheld Internet devices. Journal of Business Research 58(5), 553–558 (2005)
34. Ho, S.Y., Kwok, S.H.: The attraction of personalized service for users in mobile commerce: an empirical study. ACM SIGecom Exchanges 3(4), 10–18 (2002)
35. Venkatesh, V., Ramesh, V., Massey, A.P.: Understanding usability in mobile commerce. Communications of the ACM 46(12), 53–56 (2003)
36. Tarasewich, P.: Designing mobile commerce applications. Communications of the ACM 46(12), 57–60 (2003)
37. Malloy, A.D., Varshney, U., Snow, A.P.: Supporting mobile commerce applications using dependable wireless networks. Mobile Networks and Applications 7(3), 225–234 (2002)
38. Koutsiouris, V., Vlachos, P., Vrechopoulos, A.: Developing and evaluating mobile entertainment applications: The case of the music industry. In: Rauterberg, M. (ed.) ICEC 2004. LNCS, vol. 3166, pp. 513–517. Springer, Heidelberg (2004)
39. Kaasinen, E.: User acceptance of location-aware mobile guides based on seven field studies. Behaviour & Information Technology 24(1), 37–49 (2005)
40. Fithian, R., et al.: The design and evaluation of a mobile location-aware handheld event planner. In: Proceedings of Human-Computer Interaction with Mobile Devices and Services: 5th International Symposium, Mobile HCI 2003, Udine, Italy (2003)
41. Ciavarella, C., Paternò, F.: Design criteria for location-aware, indoor, PDA applications. In: Dignum, F.P.M., Cortés, U. (eds.) Agent-Mediated Electronic Commerce III. LNCS (LNAI), vol. 2003, pp. 131–144. Springer, Heidelberg (2001)
42. Howell, M., Love, S., Turner, M.: The impact of interface metaphor and context of use on the usability of a speech-based mobile city guide service. Behaviour & Information Technology 24(1), 67–78 (2005)
43. Folmer, E., Bosch, J.: Architecting for usability: a survey. Journal of Systems and Software 70(1-2), 61–78 (2004)
44. Hornbaek, K.: Current practice in measuring usability: Challenges to usability studies and research. International Journal of Human-Computer Studies 64(2), 79–102 (2006)
45. Chin, J.P., Diehl, V.A., Norman, K.L.: Development of an instrument measuring user satisfaction of the human-computer interface. In: Proceedings of SIGCHI ’88, Washington, DC. ACM/SIGCHI, New York (1988)
46. Lewis, J.R.: Psychometric evaluation of the post-study system usability questionnaire: the PSSUQ. In: Proceedings of the Human Factors Society 36th Annual Meeting, Human Factors Society, Atlanta, GA (1992)
47. Kirakowski, J., Corbett, M.: SUMI: The software usability measurement inventory. British Journal of Educational Technology 24(3), 210–212 (1993)
48. Lewis, J.R.: IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7(1), 57–58 (1995)
49. Lin, H.X., Choong, Y.Y., Salvendy, G.: A proposed index of usability: A method for comparing the relative usability of different software systems. Behaviour & Information Technology 16(4-5), 267–278 (1997)
50. Kirakowski, J., Claridge, N.: Website Analysis and MeasureMent Inventory (Web Usability Questionnaire) (1998) [cited 2006 Dec. 05th]; Available from: http://www.ucc.ie/hfrg/questionnaires/wammi/index.html
51. Muylle, S., Moenaert, R., Despontin, M.: The conceptualization and empirical validation of web site user satisfaction. Information & Management 41(5), 543–560 (2004)
52. Hair, J.E., et al.: Multivariate data analysis: with readings, 4th edn. Prentice-Hall, Inc, Upper Saddle River, NJ, USA (1995)
53. Stiggelbout, A.M., et al.: Ideals of patient autonomy in clinical decision making: a study on the development of a scale to assess patients’ and physicians’ views. Journal of Medical Ethics 30(3), 268–274 (2004)
54. Chiou, C.F., et al.: Development and validation of the revised Cedars-Sinai Health-Related Quality of Life for Rheumatoid Arthritis Instrument. Arthritis & Rheumatism-Arthritis Care & Research 55(6), 856–863 (2006)
55. Smith, B., Caputi, P., Rawstorne, P.: The development of a measure of subjective computer experience. Computers in Human Behavior 23(1), 127–145 (2007)
56. Wang, Y.-S., Liao, Y.-W.: The conceptualization and measurement of m-commerce user satisfaction. Computers in Human Behavior 23(1), 381–398 (2007)
Part III
Understanding Users and Contexts of Use
Tips for Designing Mobile Phone Web Pages for the Elderly

Yoko Asano1, Harumi Saito1, Hitomi Sato1, Lin Wang2, Qin Gao2, and Pei-Luen Patrick Rau2

1 Cyber Solutions Laboratories, Nippon Telegraph and Telephone Corporation, 1-1 Hikarinooka, Yokosuka-Shi, Kanagawa, 239-0847, Japan
{asano.yoko, saito.harumi, sato.hitomi}@lab.ntt.co.jp
2 Department of Industrial Engineering, Tsinghua University, Shunde Building, Tsinghua University, Beijing, 100084, P.R. China
[email protected], [email protected], [email protected]
Abstract. This paper proposes tips for designing Web pages appropriate for the elderly. The characteristics of mobile phone Web pages and the effects of aging are elucidated. In some cases the elderly had difficulty reading text, finding the focus, operating pages and entering input, and understanding the content. Tips for designing Web pages that are appropriate for the elderly are proposed based on our observations.

Keywords: mobile phone Web pages, Web design, the elderly, aging effect.
This paper proposes tips for designing Web pages that are appropriate for the elderly. The characteristics of mobile phone Web pages and the characteristics of the elderly are surveyed. The behavior of the elderly when accessing the mobile phone Web is observed. Tips for designing Web pages appropriate for the elderly are then proposed based on the results of the observation.
2 Characteristics of Mobile Phone Web

The design and operation of mobile phone Web services differ around the world, and mobile phones also differ within Japan. Here we discuss the common characteristics of mobile phone Web services in Japan. The three major characteristics are shown in Table 1. The first is the small display. The second is that the interface is quite inflexible with respect to font size, color, and so on. The third is that there are few keys to operate.

Small Display. The small display has many negative effects. Only a little information can be displayed at a time. Therefore, text tends to be packed closely together. Lists are aligned. Abbreviated words, symbols, and icons are frequently used to shorten the text.

Many Restrictions on Interface Flexibility. Character size and font cannot be changed. Carriage returns and figures are used to indicate paragraphs. The layout tends to be simple. Color variation is often used to indicate information structure. Only the background color and the character color can be changed. The cursor focus is generally indicated by color reversal. The color tones differ with the terminal type or usage. Moreover, the cursor only jumps from link to link.

Table 1. Characteristics of mobile phone Web

Major characteristics   Characteristics of mobile phone Web
Small display           Only a little information can be displayed at once.
                        Text is closely packed.
                        Lists are aligned.
                        Abbreviated words, symbols, and icons are frequently used.
Interface restrictions  Character size and font cannot be changed.
                        Color variation is often used in lieu of other formatting techniques.
                        The cursor point is generally indicated by color reversal.
                        The cursor jumps from link to link.
Few keys available      The cursor jumps to the next link on down-key input even if the link
                        is to the right of the current cursor position.
                        Display mode must be changed to input characters.
                        The mode key must be pushed many times to change input character type.
                        The same key must be pushed many times to input one character.
Few Keys. The average mobile phone has only about twenty keys, and all operations must be executed through them. The cursor is moved to the next link by the down key regardless of the direction of the next link: even if the links are horizontally aligned, the user must push the down key to move the cursor to the next link. Another big problem is the difficulty of inputting characters. The display mode must be changed in order to input characters. Japanese uses many types of characters: hiragana, katakana, Chinese characters, alphabetic letters, numerals, and so on. The mode key may have to be pushed many times to change the input character type. Moreover, more than 50 kinds of syllabic characters must be accessed through only twenty keys, so the same key sometimes has to be pushed five times to input one character.
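The cost of multi-tap input described above can be made concrete with a small sketch. The key layout below is a simplified assumption (one basic hiragana row per numeric key, in the usual multi-tap order) and the function names are hypothetical; mode switching, voiced marks, small kana, and conversion to Chinese characters are deliberately ignored, so real input takes even more key presses.

```python
from typing import Tuple

# Simplified, assumed layout: each key cycles through one hiragana row.
KANA_KEYS = {
    "1": "あいうえお",
    "2": "かきくけこ",
    "3": "さしすせそ",
    "4": "たちつてと",
    "5": "なにぬねの",
    "6": "はひふへほ",
    "7": "まみむめも",
    "8": "やゆよ",
    "9": "らりるれろ",
    "0": "わをん",
}


def presses_for(char: str) -> Tuple[str, int]:
    """Return (key, number of presses) needed to reach one hiragana."""
    for key, row in KANA_KEYS.items():
        position = row.find(char)
        if position != -1:
            return key, position + 1
    raise ValueError(f"character not in the simplified layout: {char!r}")


def total_presses(text: str) -> int:
    """Total key presses for a string of basic hiragana."""
    return sum(presses_for(char)[1] for char in text)


if __name__ == "__main__":
    for char in "おはよう":
        key, count = presses_for(char)
        print(f"{char}: key {key} x {count}")
    print("total presses:", total_presses("おはよう"))
```

In this layout the character お sits last in its row and needs five presses of the same key, which is the worst case mentioned in the text.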
3 Aging Effect of the Elderly

Many abilities decrease with age. Table 2 shows the aging effects that are related to the use of mobile phones. They are divided into three groups: effects related to physical ability, cognitive ability, and mental load.

Physical Ability. The most important aging effect is poor sight. 80 percent of people in their sixties and 90 percent of people in their seventies suffer from cataracts, which have several symptoms. Most patients suffer a decrease in eyesight. Everything appears fogged with a yellow tint. The ability to distinguish contrast also decreases [4]. Motor ability also decreases: the elderly are not good at detailed work.

Cognitive Ability. The most important impact on cognitive ability is a drop in the ability to discriminate. The elderly take time to discover and recognize information. They tend to have difficulty recognizing the difference between two things even when they are displayed simultaneously. Moreover, it is difficult for them to perceive a change over time. The elderly are also poor at forming mental models. They perceive their operations and information as low-level elements, so they cannot memorize and retrieve them easily. This also impairs spatial ability.

Table 2. Aging effects of the elderly

Category           Common aging effects
Physical ability   Eyesight decreases.
                   Everything appears to have a yellow tint.
                   The ability to distinguish contrast decreases.
                   Weak at detailed work.
Cognitive ability  Takes time to discover and recognize information.
                   Tends to have difficulty in recognizing differences.
                   Tends to have difficulty in perceiving things that change over time.
                   Poor at forming mental models.
                   Difficulty in memorizing and retrieving information.
                   Poor spatial ability.
Mental load        Decline in motivation and understanding.
Y. Asano et al.
Mental Load. A decline in motivation causes a decline in understanding. Moreover, the elderly tend to give up easily when they are confronted with a challenge.
4 Behavior of the Elderly Using Mobile Web

We conducted an experiment to observe how the elderly interacted with some existing mobile phone Web sites. The behavior of the elderly and their comments were collected.

4.1 Method of Experiment

Subjects. Ten subjects participated in the experiment. Four were male and six were female. All subjects were over 55 years old. The average age was 65.4. All subjects had experience in using the telephone call function of mobile phones. Five of them had used the mail function. Only one of them had accessed a mobile phone Web site.

Equipment. One of the most popular mobile phones, the P901i made by NTT DoCoMo, was used in the experiment. It had been on the market for one year and eight months. Its screen displays only 12 by 12 characters.

Objects. Thirteen mobile Web sites were used in the experiment. Two were portal sites, five were electronic commerce sites, two were air ticket reservation sites, three were stock exchange sites, and the remaining one was a broadcasting service site.

Tasks. The subjects were instructed to access all of the mobile Web sites and perform a task specific to each Web site. For example, on the air ticket reservation sites they were instructed to search for a specific flight and to reserve a seat. They were also asked to remark on the Web design and the problems encountered while using the Web sites.

Observation data. The behavior of the subjects as they used the mobile Web sites and their remarks were captured. Particular attention was paid to the behavior and remarks made when they committed an error or were at a loss.

4.2 Results and Considerations

The significant results are shown in Table 3. They were related to visibility, focus recognition, understanding, operation, and page structure.

Visibility. Most subjects remarked that the pages were not easy to read because the characters were too closely packed and because of their weak visual acuity. We noted that white backgrounds made reading easier because the contrast between the characters and the white background tended to be high. In some cases, subjects did not recognize scrolling and blinking objects; it was hard for the elderly to recognize objects that changed rapidly, and they gave up easily.

Focus Recognition. In many cases, the subjects took too long to identify the focus point. This was because there were many color combinations and they could not easily identify which combination indicated the focus point. In some cases, the focus area was too small to identify easily because short words were used. In other cases, the focus color matched the surrounding color so closely that users could not identify the focus easily.
Tips for Designing Mobile Phone Web Pages for the Elderly
Understanding. In many cases, the subjects skipped abbreviated words, foreign words, symbols, and icons because it was difficult for them to understand their meanings. Moreover, characters drawn in icons or figures were too small for the elderly to recognize. Several similar misinterpretation problems were observed. Many subjects thought, wrongly, that red text always meant a warning. This indicates poor ability in forming mental models.

Table 3. Significant problems encountered by the elderly

Category | Major problems
Visibility | Closely packed characters are difficult to read. White background color made reading easier. Scrolling and blinking objects were sometimes skipped.
Focus recognition | They could not easily discern which color was used for focus highlighting. The focus area was too small to recognize because short words were used.
Understanding | They skipped abbreviated words, foreign words, symbols, and icons because they were difficult to understand. They misunderstood red as always indicating a warning.
Operation | They were at a loss when entering characters. They often tried to jump to the next link (to the right of the cursor) by wrongly pressing the right key instead of the down key.
Page structure | They were apt to try to understand the content based on only the information displayed on the screen at a time. They sometimes had difficulty in choosing one among choices when not all of the choices could be displayed at the same time.
Operation. Most subjects were at a loss when entering characters. This was because they failed to form a mental model of the operation of inputting the characters. Moreover, many input operations imposed high loads on the elderly. A lot of erroneous operations were observed when the subjects intended to move the cursor to the next link; they tried to use the right key instead of the down key. This was caused by the mismatch between the directions of cursor movement and those of the operation key.

Page Structure. Most subjects tried to understand the content based on only the information displayed on the current screen; they were not good at remembering content. They sometimes had difficulty in choosing one of several choices when not all of the choices could be displayed at the same time. Moreover, they often got lost on mobile phone Web sites because they could not remember how they reached the present page.
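The key mismatch described under Operation can be made concrete with a small sketch. The page model, link labels, and the behavior of the left and right keys below are hypothetical; the paper states only that the down key moves the cursor to the next link regardless of where that link appears on the screen.

```python
# Hypothetical page model: each link has a (row, column) position on screen.
LINKS = [
    {"label": "News", "row": 0, "col": 0},
    {"label": "Weather", "row": 0, "col": 1},  # displayed to the right of "News"
    {"label": "Stocks", "row": 1, "col": 0},
]


def press(key, focus_index, links=LINKS):
    """Return the focus index after one key press.

    Assumed model: only up/down move the focus, and they follow document
    order rather than on-screen direction. Left/right are assumed not to
    move the focus between links.
    """
    if key == "down":
        return min(focus_index + 1, len(links) - 1)
    if key == "up":
        return max(focus_index - 1, 0)
    return focus_index


start = 0  # focus on "News"
after_right = press("right", start)
after_down = press("down", start)
print(LINKS[after_right]["label"])  # News: the intuitive key does not move the focus
print(LINKS[after_down]["label"])   # Weather: reached with "down" although it lies to the right
```

In this model the link that is visually to the right can only be reached with the down key, which is the situation in which the subjects pressed the wrong key.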
5 Tips for Designing Mobile Web for the Elderly

We propose tips for designing mobile phone Web pages for the elderly based on the characteristics of mobile phone Web service, the aging effects of the elderly, and our observations of their behavior.
Layout. Use color variation and carriage returns to format content into chunks of information that are easier to follow. However, try to keep information density high so that as much information as possible can be seen without scrolling. Moreover, place all choices on one screen so that users can compare them at once.

Visibility. Do not use scrolling or blinking text because the text changes tend to be too fast for the elderly to recognize.

Color. Use only enough color variation to make the information structure understandable. Too many meaningless colors make it hard to recognize which color combination indicates the focus of attention. Only high-contrast color combinations should be used to indicate highlighting.

Words. Do not use abbreviated words, foreign words, symbols, or icons for important words or links because the elderly tend to skip over unfamiliar symbols. Moreover, avoid very short words for link text so that the focus area is large enough to stand out.

Operation. Try to minimize the number of characters that must be input because input operations impose high loads on the elderly. Choosing one of a few choices is easier for them.
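The Color tip asks for high-contrast combinations but gives no numeric criterion. One way to check a candidate focus color pair is the contrast ratio defined in the W3C Web Content Accessibility Guidelines (WCAG) 2.x. The sketch below uses that formula as an assumption; it is not a measure used by the authors, and the example colors are hypothetical.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (identical) up to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


WHITE, BLACK, YELLOW = (255, 255, 255), (0, 0, 0), (255, 255, 0)
print(round(contrast_ratio(BLACK, WHITE), 2))   # 21.0, the maximum
print(round(contrast_ratio(YELLOW, WHITE), 2))  # close to 1: a poor highlight on white
```

WCAG suggests a ratio of at least 4.5:1 for normal text; whether that threshold is adequate for users with cataracts is not addressed by the paper.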
6 Conclusion

The characteristics of mobile phone Web pages and the effects of aging were elucidated. The phones' small displays, interface restrictions, and small number of keys caused many problems when combined with the aging effects of the elderly on physical ability, cognitive ability, and mental load. We found that the significant problems of using mobile phone Web pages were related to visibility, focus recognition, understanding, operation, and page structure. Tips for designing Web pages appropriate for the elderly were proposed based on the results of our observations. The tips proposed in this paper suit the development of mobile Web sites that are usable by various users, from the young to the old.

Acknowledgments. We would like to thank Ms. Mamiko Mori for conducting and managing the observations. She also gave us a lot of valuable recommendations.
References

1. Cabinet Office (ed.): Statistics of coverage of durable goods. Annual Report of Consumption Trend Investigation, March (2006)
2. Statistics Bureau (ed.): Information on the 2005 Population Census of Japan, Ministry of Internal Affairs and Communications (2006)
3. Mobile Society Research Institute (ed.): White Paper on Mobile Society, NTT Publishing, pp. 57–61 (2006)
4. Okajima, K., Takase, M.: Computerized Simulation and Chromatic Adaptation Experiments Based on a Model of Aged Human Lens. Optical Review 8(1), 64–70 (2001)
The Role of Task Characteristics and Organization Culture in Non-Work Related Computing (NWRC)

Gee-Woo Bock, Huei-Huang Kuan, Ping Liu, and Hua Sun

National University of Singapore, Department of Information Systems, School of Computing, 3 Science Drive 2, Singapore 117543
{bockgw, mkuan}@comp.nus.edu.sg
Abstract. Many organizations have scrambled to put control measures and discipline systems in place to deter employees from engaging in NWRC. Since control measures and discipline systems alone are insufficient to curb NWRC at the workplace, we propose to integrate the control perspective with task characteristics and organization culture. Thus, we examine the following research questions: How would the amount of NWRC control mechanisms affect employees’ NWRC behavior under different task characteristics? Does a match between the disciplinary approach and organization culture lead to more effective NWRC management? Two separate studies of full-time employees in various organizations revealed three important findings. First, NWRC control mechanisms were ineffective under a high degree of task non-routineness. Second, the fit between discipline systems and organization culture led to higher employee satisfaction with NWRC management, which in turn led to less time spent on NWRC. Third, no single NWRC discipline system is best for every organization.

Keywords: Non-Work Related Computing, Task Characteristics, Organization Culture, Fit.
characteristics and organization culture in NWRC management. Thus, the research questions of this study are: How would the amount of NWRC control mechanisms affect employees’ NWRC behavior under different task characteristics? Does a match between the disciplinary approach and organization culture lead to more effective NWRC management?
2 Literature Review

A number of terms have been used synonymously with NWRC, such as “junk computing” and “cyberloafing”. Since we are interested in the various forms of NWRC that involve Internet access, we define NWRC in this study as using the Internet during office hours for personal e-commerce (e.g., watching stock prices), personal communication (e.g., instant messaging with MSN), Internet browsing (e.g., reading news on the Internet), downloading files for personal purposes (e.g., movies and music), and Internet gaming (e.g., Yahoo! Games) [1]. As NWRC behavior is performed at the expense of organizational resources, much research has uncovered the negative impact of NWRC on corporate productivity [5]. Bock and Ho [5] attributed this result to the interruptive nature of NWRC activities, which their study found to consist predominantly of emails and Internet messaging. Personal emails and instant messaging distract employees, and the recovery time compounded over these interruptions may waste a great deal of time and lead to lower job performance.

2.1 NWRC Control Mechanisms

Control mechanisms exist to discourage employees from engaging in NWRC at the workplace [4]. Sharma and Gupta [7] reported that 45% of all firms and 17% of Fortune 100 companies use monitoring software of various kinds, and that software that records employees’ keystrokes is used by organizations such as Exxon and the U.S. State Department. Much of the research on control mechanisms is based on General Deterrence Theory [8], which suggests that organizational control approaches can deter employees’ abuse of computing resources by increasing the perceived costs of computer abuse. Although many of these controls have been employed by numerous organizations, they have failed to significantly lessen NWRC behavior. Even with tighter controls, NWRC persists in organizations in the banking industry, which is equipped with strong security controls [2]. Evidently, implementing control mechanisms alone is not sufficient to ensure the success of NWRC management. Organizations are unable to apply General Deterrence Theory adequately in their environments because it does not cover all the factors affecting the effectiveness of NWRC management. In particular, many studies based on General Deterrence Theory did not consider employees’ task characteristics in the implementation of control mechanisms.
2.2 Discipline Systems

Besides control mechanisms, several IS studies have also suggested discipline systems to deal with the misuse of organizational computing resources [2]. Discipline can be meted out in several ways. Organizations are reported to discipline computer abusers through internal sanctions such as suspension and dismissal, or even to report offenses to third parties such as the police or the FBI. There are predominantly two kinds of disciplinary systems: progressive and positive [9]. Progressive discipline provides that increasingly serious punishments be meted out to members of the organization who fail to behave acceptably. Positive discipline, on the other hand, retains the idea of making progressively serious contacts with an employee when work problems arise but avoids rushing to punishment as a means of getting the employee to adhere to rules and regulations. Instead, it seeks to prevent problems through formal and informal managerial practices and the recognition of good performance. Employees are allowed to participate in the disciplinary decision-making process, so they may be more responsible for their own behavior and more willing to follow disciplinary policies [10]. As such, researchers generally believe that positive discipline is more effective than progressive discipline in curbing employees’ misconduct [9, 10].

2.3 Fit

Fit plays an essential role in strategic management and can be described as a theoretically defined match between two variables [6]. Fit as matching is specified without reference to a criterion variable, although its effect on a set of criterion variables can subsequently be examined. The Task-Technology Fit model [11] argues that the relation between users’ task requirements and their usage of organizational IS may play a pivotal role in determining the effectiveness of organizational policies and measures.
Goodhue [11] proposed that information systems (systems, policies, IS staff) have a positive impact on performance only when there is correspondence between their functionality and the task requirements of users. Therefore, we examine whether the fit between task characteristics and control mechanisms does indeed help to enhance NWRC management in organizations (Study 1). Prior research suggests that the match between discipline system and organization culture can help to curb NWRC. Crow and Hartman [12] revealed that health care organizations which neglect the detrimental elements of their culture can find themselves at risk of poor employee relations and ineffectiveness in applying discipline. Schwartz and Davis [13] have recommended that management should consider the cultural risk of implementing strategies. The lack of fit between organization culture and the discipline system may result in employees’ resistance towards the system, ultimately leading to the failure of the discipline system. Thus, we examine whether the match between disciplinary system and organizational culture has an impact on NWRC management (Study 2).
3 Study 1

Study 1 was conducted to examine the effects of fit between control mechanisms and task characteristics on NWRC management. In this study, tasks are broadly defined as the actions carried out by individuals in turning inputs into outputs [14]. The Task-Technology Fit model suggests that task-technology fit will lead to greater performance of the technology [11]. Goodhue [11] measured a two-dimensional construct of task characteristics: non-routineness (non-repetitive and non-analyzable search behavior) and interdependence (reliance on other organizational units). This paper focuses on these two dimensions of task characteristics because they are closely related to the requirement for information processing capabilities. As the objective of NWRC management is to reduce NWRC behavior, the dependent variable in this study is NWRC behavior.

3.1 Task Characteristics-Control Mechanisms Fit

Task non-routineness is defined in this study in terms of the level of structuredness, analyzability, difficulty and predictability of a task [14]. Tasks which have a high degree of non-routineness require employees to engage in intensive analysis, discussion and research in order to minimize task uncertainty and find a solution. Internet browsing can provide an immense amount of information as well as useful resources to support non-routine tasks. Belanger and Slyke [15] found that a certain amount of playful use of the Internet can lead to learning that may be of value to the organization. Thus, certain NWRC activities such as Internet browsing could be perceived as useful for non-routine tasks, since these tasks require employees to acquire more skills and knowledge as well as up-to-date information. If the NWRC control mechanisms within the organization are tight, employees may perceive them as a barrier to increasing their job performance and are hence more likely to ignore them.

Hypothesis 1: The higher the task non-routineness, the weaker the negative effect of NWRC control mechanisms on employees’ NWRC behavior.

Task interdependence is defined in this study as the degree to which the task is related to other organizational units and the extent to which coordination with other organizational units is required [16]. Coordination between interdependent parties needs to be supported by the organization’s information systems. Van de Ven et al. [17] found that departmental communication increased as interdependence among employees increased. Instant messaging software and email provide a good platform for employees to communicate. As these tools match the communication requirements of highly interdependent tasks, prohibiting them would conflict with employees’ task requirements. If NWRC control mechanisms within the organization are tight, employees may perceive them as a barrier to communicating with other employees for work purposes and are more likely to ignore them.

Hypothesis 2: The higher the task interdependence, the weaker the negative effect of NWRC control mechanisms on employees’ NWRC behavior.
3.2 Methodology

A survey was carried out to test the proposed hypotheses. We targeted full-time employees who have easy access to the Internet at work. Forty organizations were contacted and 26 of them participated in the survey. A total of 250 questionnaires were distributed either by mail or in person. After deleting the responses with missing data, there were 167 valid responses (effective response rate = 66.8%). All scale items are operationalized at the individual level. NWRC behavior was measured in terms of the self-reported time spent on NWRC [1]. Respondents were assured that any information they provided would be kept confidential, to minimize underreporting of NWRC behavior.

Before proceeding to test the hypotheses, we tested the validity of the measures. Convergent validity was shown by item-total correlation coefficients above 0.40. Cronbach’s alpha coefficients ranged from 0.60 to 0.81, which shows acceptable reliability for exploratory research [18]. To test discriminant validity, factor analysis with Varimax rotation was performed, and the loadings showed that the constructs are distinct from one another.

3.3 Results and Data Analysis

Results were analyzed using SPSS 13.0. In performing hierarchical regression, we first entered the amount of control mechanisms as a predictor, followed by a task characteristic and finally the interaction term. The hierarchical regression equations show that NWRC control mechanisms, alone or together with non-routineness, do not have a significant impact on NWRC behavior. When the interaction variable is added, the effects of control mechanisms and the interaction variable are significant at p = 0.012 and p = 0.035, respectively. ΔR² is 0.027 and the F statistic is 4.52, which is well above 1 [19]. This shows that task non-routineness moderates the relationship between NWRC control mechanisms and NWRC behavior. Thus, H1 is supported.
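The three-step hierarchical regression described above can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (seeded random numbers with an interaction effect built in), the variable names and coefficients are hypothetical, and the ordinary least squares fit is a plain textbook implementation rather than the SPSS procedure the authors used. The F statistic for the change in R² is computed as (ΔR² / k) / ((1 − R²_full) / (n − p − 1)), where k predictors are added to reach a full model with p predictors.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def f_change(r2_reduced, r2_full, n, p_full, k_added=1):
    """F statistic for the R^2 increase when k_added predictors enter the model."""
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / (n - p_full - 1))


# Synthetic data only; 167 matches the number of valid responses, nothing else is real.
random.seed(0)
n = 167
control = [random.gauss(0, 1) for _ in range(n)]
nonroutine = [random.gauss(0, 1) for _ in range(n)]
interaction = [c * t for c, t in zip(control, nonroutine)]
nwrc = [-0.3 * c + 0.1 * t + 0.4 * ct + random.gauss(0, 1)
        for c, t, ct in zip(control, nonroutine, interaction)]

r2_step1 = r_squared([control], nwrc)                           # control mechanisms only
r2_step2 = r_squared([control, nonroutine], nwrc)               # + task characteristic
r2_step3 = r_squared([control, nonroutine, interaction], nwrc)  # + interaction term
f_step3 = f_change(r2_step2, r2_step3, n, p_full=3)
print(r2_step1, r2_step2, r2_step3, f_step3)
```

A large F for the final step indicates that the interaction term explains variance beyond the main effects, which is the sense in which a task characteristic is said to moderate the effect of control mechanisms. In practice the predictors are usually mean-centered before the product term is formed; the synthetic variables here are already centered near zero.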
The hierarchical regression equations also show that NWRC control mechanisms, alone or together with interdependence, have no significant effect on NWRC behavior. When the interaction variable is added, the effects of control mechanisms and the interaction variable remain insignificant at p = 0.341 and p = 0.561, respectively. ΔR² is 0.002 and the F statistic is 0.34, which is below 1 [19]. This indicates that H2 is not supported.

3.4 Discussion of Results

The first finding shows a significant result for the interaction of task non-routineness and the amount of NWRC control mechanisms. The ineffectiveness of NWRC control mechanisms under high task non-routineness may result from the perceived usefulness of NWRC. Tasks of high non-routineness are typically more difficult to accomplish, and accomplishing them requires up-to-date information and various resources. Since Internet browsing is the most efficient and effective way to get information in today’s workplace, any restriction on Internet browsing would affect the accomplishment of these tasks. As employees are required to browse the Internet to complete their tasks, tight control simply brings inconvenience to their jobs, and they would continue to engage in NWRC because of the usefulness of the Internet for their jobs and their personal agendas. Thus, control mechanisms would be ineffective under a high degree of non-routineness.

Task interdependence may not be a significant moderator between the amount of NWRC control mechanisms and NWRC behavior for two reasons. First, with high task interdependence, success at the workplace is contingent on one another’s output. As such, employees may devote more time to accomplishing their tasks so that their work unit can achieve its collective objectives effectively, instead of spending time on NWRC. Second, to collaborate with others within the organization, employees may use substitutes for communication such as phone calls and face-to-face meetings, which may be more effective for completing tasks. Thus the control mechanisms for instant messaging software have little conflict with interdependence.
4 Study 2

Study 2 focuses on examining the fit between discipline systems and organization culture. In this study, we include another dependent variable, satisfaction with NWRC management, as managers are also interested in finding out whether the fit between discipline system and organization culture can improve satisfaction with NWRC management. Organizational culture has pervasive effects on an organization and is defined as a socially constructed, cognitive reality that is rooted in deeply held perceptions, values, beliefs or expectations that are shared by, and are unique to, a particular organization [20]. Although there are several classifications of organization culture in the literature, the Organization Culture Index [21] is the most appropriate for this study because it minimizes ambiguity by providing three distinct dimensions of organizational culture: bureaucratic, innovative and supportive. Bureaucratic cultures are hierarchical and compartmentalized. They are usually based on control and power, with clear lines of responsibility and authority. Innovative cultures are exciting and dynamic. They are creative places to work in, filled with challenge and risk. Supportive cultures are warm and “fuzzy” places to work in, where employees are friendly and helpful to each other.

4.1 Organization Culture-Discipline System Fit

Management literature suggests that the fit between organization culture and discipline system is crucial for management. Any management idea (including a discipline system), no matter how good it is, will not work in practice if it does not fit the culture [13]. Commanducci [22] also stressed that even management simulations require proper fit with company culture. The disciplinary system adopted in an organization is a type of management practice which can affect employees’ satisfaction with the system and their NWRC behavior. Satisfaction with the discipline system is defined as a generalized positive or negative evaluation of the discipline system (adapted from [23]). The fit between organization culture and the discipline system may result in stronger employee compliance with the discipline system [12], which can ultimately lead to greater satisfaction with NWRC management and less NWRC behavior.
Hypothesis 3: Employees of an organization with a disciplinary system closely matched with its organizational culture will have higher satisfaction with its NWRC management.

Hypothesis 4: Employees of an organization with a disciplinary system closely matched with its organizational culture will engage in less NWRC behavior.

4.2 Methodology

A survey was carried out to test the proposed hypotheses. As in Study 1, we targeted full-time employees who have easy access to the Internet at their workplace. A total of 182 questionnaires were collected from 30 organizations, of which 174 were valid for data analysis (effective response rate = 95.6%). All scale items are operationalized at the individual level. NWRC behavior was operationalized in terms of self-reported time spent on NWRC [1]. As in Study 1, respondents were assured that any information they provided would be kept confidential.

The validity of the measures was also tested. Convergent validity was shown by item-total correlation coefficients above 0.40. Cronbach’s alpha coefficients ranged from 0.84 to 0.91, which shows acceptable reliability for exploratory research [18]. To test discriminant validity, factor analysis with Varimax rotation was performed, and the loadings showed that the constructs are distinct from one another.

A cluster analysis was conducted to identify homogeneous groups of cultural profiles. Three types of organization culture were identified, as the complete linkage dendrogram suggested a partition into three clusters [24]. Within each cluster, t-tests showed that the mean of one culture dimension differed significantly from the means of the other two dimensions at p < 0.01. The bureaucratic culture cluster’s mean on the bureaucratic dimension is significantly different from its means on the innovative and supportive dimensions, with t = 9.19 and t = 10.30, respectively. The innovative culture cluster’s mean on the innovative dimension is significantly different from its means on the bureaucratic and supportive dimensions, with t = 4.30 and t = 3.27, respectively. The supportive culture cluster’s mean on the supportive dimension is significantly different from its means on the bureaucratic and innovative dimensions, with t = 4.24 and t = 4.71, respectively. This shows that each cluster has a strong inclination toward one type of culture.

The clustering procedure described above was then applied to each culture cluster to split it into two sub-clusters based on the means of the scales measuring disciplinary approach. A higher disciplinary approach score indicates positive discipline. One-way ANOVA was carried out to determine whether the means of the two sub-clusters within the same cluster differed significantly. The differences between the two disciplinary groups within the same culture are significant at p < 0.001, providing evidence of the existence of positive and progressive discipline (see Table 1). For example, within the bureaucratic cluster, the mean disciplinary approach score of sub-cluster 1 (1.82) was higher than that of sub-cluster 2 (1.29), with F = 101.46 (p < 0.001). This indicates that sub-cluster 1 has positive discipline while sub-cluster 2 has progressive discipline.
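The one-way ANOVA used to compare the two disciplinary sub-clusters can be sketched in a few lines. The scores below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data. The F statistic is the between-group mean square divided by the within-group mean square; with exactly two groups it equals the square of the two-sample t statistic.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the given groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the individual scores around their own group mean
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means))
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical disciplinary-approach scores for two sub-clusters
sub_cluster_1 = [2, 3, 4]
sub_cluster_2 = [1, 2, 3]
print(one_way_anova_f(sub_cluster_1, sub_cluster_2))  # (1.5, 1, 4)
```

The resulting F is then compared with the critical value for (df_between, df_within) degrees of freedom at the chosen significance level.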
Table 1. Summary of within culture clusters one-way ANOVA test results
4.3 Results and Data Analysis

The hypotheses of Study 2 were analyzed with MANOVA and ANOVA. We conducted a test of between-subjects effects with MANOVA to analyze which individual dependent variables contribute to the significant multivariate effect. A Bonferroni-type adjustment was applied to control experiment-wise error and decrease the chance of Type I error [24]. In this paper, the adjusted alpha is 0.025 (0.05/2). Using this alpha level, we have a significant univariate main effect on satisfaction with NWRC management, with F = 3.882 (p = 0.022), only when organizational culture and disciplinary system are considered together, supporting H3. However, the univariate main effect on the time spent on NWRC is not significant, with F = 0.641 (p = 0.528).

We also analyzed whether the disciplinary system has a significant effect on satisfaction with NWRC management within each culture group using one-way ANOVA (see Table 2). For the bureaucratic and supportive culture groups, satisfaction levels differ between the two disciplinary approaches with F = 6.36 and F = 3.20, respectively, significant at p < 0.10. For the innovative culture group, satisfaction levels differ between the two disciplinary approaches with F = 8.90, significant at p < 0.05. These findings are consistent with the MANOVA results.

Table 2. Summary of one-way ANOVA test results within each culture cluster

Culture cluster | Mean satisfaction, positive discipline | Mean satisfaction, progressive discipline | F_SAT
Bureaucratic (df 1:70, F>2.79, α=0.10) | 4.02 | 4.65 | 6.36
Innovative (df 1:40, F>4.08, α=0.05) | 4.48 | 3.63 | 8.90
Supportive (df 1:60, F>2.79, α=0.10) | 4.8 | 4.1 | 3.20
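The Bonferroni-type adjustment described in Section 4.3 amounts to dividing the family-wise alpha by the number of dependent variables tested. A minimal sketch using the two p-values reported in that section (the function and variable names are illustrative):

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni adjustment."""
    return family_alpha / n_tests


adjusted_alpha = bonferroni_alpha(0.05, 2)  # two dependent variables

# p-values for the univariate effects as reported in the text
p_values = {
    "satisfaction with NWRC management": 0.022,
    "time spent on NWRC": 0.528,
}
significant = {name: p < adjusted_alpha for name, p in p_values.items()}

print(adjusted_alpha)  # 0.025
print(significant)
```

Only the effect on satisfaction falls below the adjusted level of 0.025, which matches the conclusion drawn from the univariate tests.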
4.4 Discussion of Results

The combination of organizational culture and NWRC disciplinary approach has a significant effect on employee satisfaction with NWRC management (H3). We also examined which discipline system fits best with which organization culture by looking at satisfaction. As Table 2 shows, positive discipline is better accepted in innovative and supportive cultures, with higher mean satisfaction (4.48 and 4.8, respectively) than progressive discipline (3.63 and 4.1, respectively). Progressive discipline is better accepted in bureaucratic cultures, with higher mean satisfaction (4.65) than positive discipline (4.02). This shows that positive discipline does not always produce better results than progressive discipline, although, theoretically, positive discipline is superior because it encourages employees to participate in the management process.

The fit between organizational culture and NWRC discipline has no significant impact on time spent on NWRC (H4). This may be explained by the other reasons why employees engage in NWRC, which include rational decision-making factors such as normative awareness regarding NWRC and the influence of peer acquiescence, as well as unconscious factors such as habit [25]. However, post-hoc linear regression shows that satisfaction with NWRC management negatively affects time spent on NWRC (p = 0.002). Thus, the fit between organizational culture and the NWRC discipline system can exert an indirect effect on NWRC behavior through employees' satisfaction with NWRC management.
5 Conclusion and Directions for Future Research

Studies 1 and 2 offer some insights for NWRC management in organizations. First, the universal enforcement of control mechanisms, or simply increasing the amount of disciplinary action against offenders, cannot reduce NWRC behavior. The first study shows that sanctions should be differentiated according to task non-routineness. The Internet activities of employees engaged in non-routine tasks should not be controlled rigidly. Instead, practitioners need to emphasize the balance of NWRC activities in employees’ daily use of the Internet. The second study shows that no single discipline system is best for every organization. The fit between the discipline system and organization culture leads to greater satisfaction with NWRC management and eventually to less time spent on NWRC.

This paper also offers contributions to research. It is a pioneering effort to consider the effects of task characteristics and organization culture on NWRC management. Most of the NWRC literature has focused on examining the antecedents of NWRC behavior (e.g., [25]) or elucidating the consequences of NWRC in organizations [5]. Although “fit” has occupied a central role in the strategic management field [6], it has never been incorporated into NWRC research. However, since this research has not considered the role of industry norms and national culture, future research can extend this paper by examining the effects of industry norms and national culture on NWRC.
References
1. Siau, K., Nah, F.F.H., Teng, L.: Acceptable Internet Use Policy. Communications of the ACM 45, 75–79 (2002)
2. Lee, J., Lee, Y.: A Holistic Model of Computer Abuse within Organizations. Information Management & Computer Security 10, 57–63 (2002)
3. Lim, V.K.G.: The Moderating Effects of Neutralization Technique on Cyberloafing and Organizational Justice. In: The Proceedings of Academy of Management Conference, Denver (2002)
4. Urbaczewski, A., Jessup, L.M.: Does Electronic Monitoring of Employee Internet Usage Work? Communications of the ACM 45, 80–83 (2002)
5. Bock, G.W., Ho, S.L.: Non-Work Related Computing (NWRC): Is there a Productivity Payoff? Accepted and forthcoming in Communications of the ACM
6. Venkatraman, N.: The Concept of Fit in Strategy Research: Toward Verbal and Statistical Correspondence. Academy of Management 14, 423–444 (1989)
7. Sharma, S.K., Gupta, J.T.N.: Improving Workers’ Productivity and Reducing Internet Abuse. Journal of Computer Information Systems 44, 74–78 (2003)
8. Beccaria, C.: On Crimes and Punishments. Bobbs-Merrill, Indianapolis (1963)
9. Osigweh, C.A.B., Hutchison, W.R.: Positive Discipline. Human Resource Management 28, 367–383 (1989)
10. King, K.N., Wilcox, D.E.: Employee-Proposed Discipline: How Well is it Working? Public Personnel Management 32, 197–209 (2003)
11. Goodhue, D.L.: Understanding User Evaluations of Information Systems. Management Science 41, 1827–1844 (1995)
12. Crow, S.M., Hartman, S.J.: Organizational Culture: Its Impact on Employee Relations and Discipline in Health Care Organizations. The Health Care Manager 21, 22–28 (2002)
13. Schwartz, H., Davis, S.: Matching Corporate Culture and Business Strategy. Organizational Dynamics 10, 30–48 (1981)
14. Perrow, C.: A Framework for the Comparative Analysis of Organizations. American Sociological Review 32, 194–208 (1967)
15. Belanger, F., Slyke, C.V.: Abuse or Learning? Communications of the ACM 45, 64–65 (2002)
16. Thompson, J.D.: Organizations in Action. McGraw-Hill, New York (1967)
17. Van de Ven, A.H., Delbecq, A.L., Koenig, R.: Determinants of Coordination Modes within Organizations. American Sociological Review 41, 322–338 (1976)
18. Nunnally, J.: Psychometric Theory. McGraw-Hill, New York (1967)
19. Carte, T.A., Russell, C.J.: In Pursuit of Moderation: Nine Common Errors and Their Solutions. MIS Quarterly 27, 479–501 (2003)
20. Hofstede, G., Neuijen, B., Ohayv, D.D., Sanders, G.: Measuring Organizational Cultures: A Qualitative and Quantitative Study across Twenty Cultures. Administrative Science Quarterly 35, 286–316 (1990)
21. Litwin, G.H., Stringer, R.A.: Motivation and Organizational Climate. Harvard University Press, Cambridge, Massachusetts (1968)
22. Commanducci, M.: Training Can Be Fun: Management Simulations Require Proper Fit with Company Culture. Canadian HR Reporter 11, 15 (1998)
23. Kidwell, R.E., Bennett, N.: Employee Reactions to Electronic Control Systems. Group and Organization Management 19, 203–218 (1994)
24. Coakes, S.J., Steed, L.G.: SPSS Analysis without Anguish: Version 10.0 for Windows. Wiley, Brisbane (2001)
25. Lee, O.K., Lim, K.H., Wong, W.M.: Why Employees do Non-work Related Computing: An Exploratory Investigation through Multiple Theoretical Perspectives. In: Proceedings of Hawaii International Conference of System Sciences, Hawaii (2005)
Searching for Information on the Web: Role of Aging and Ergonomic Quality of Website
Aline Chevalier, Aurélie Dommes, Daniel Martins, and Cécile Valérian
University of Paris X-Nanterre, Cognitive Processes and Interactive Behaviours Laboratory, 200 avenue de la République, 92001 Nanterre cedex, France
{aline.chevalier,adommes,daniel.martins}@u-paris10.fr
Abstract. Despite the rapid growth in the number of websites, many of them still contain ergonomic problems that hinder the cognitive activities of web users. As cognitive aging is generally associated with a decrease in working memory capacity, inhibition failures and a slowing of processing speed, we argue that aging may have negative effects on information search activities, especially when the website incorporates ergonomic problems. In the present experimental study, we compare the performance of younger and older web users searching for information in two websites: one that follows ergonomic recommendations and another with ergonomic problems. The results show that aging had negative consequences on users' information search activities (more time to find information, a greater number of steps required to find information and more cognitive resources involved in the activity). These consequences are more pronounced for the non-ergonomic website than for the ergonomic one.
Keywords: Information search, Cognitive load, Ergonomics, Aging.
Recently, a few researchers have begun to study the cognitive strategies and difficulties that older users experience while searching for information (see, e.g., [2], [33]). Nevertheless, further studies are needed to better understand the cognitive processes and cognitive difficulties involved in searching for information on the Web. To this end, we conducted an experimental study with younger and older web users. This study aimed at determining the influence of aging and of the ergonomic quality of websites on information search activity. The following section provides an overview of the relations between information search activity, cognitive load and cognitive aging. Section 3 presents the experimental study. The results obtained are discussed in Section 4.
2 Searching for Information: Cognitive Load and Aging
During the 1980s and 1990s, several models of the cognitive activities involved in searching for information were proposed (see [1], [9], [11]). The first models described this activity as cyclical: the individual defines a (cognitive) goal, selects an information category, extracts information and integrates it with previously extracted information; the individual repeats this cycle until s/he reaches her/his search goal. Nevertheless, these models do not explain why users fail when searching for information. Rouet and Tricot ([25]; then [34]) defined a more complete cognitive model that includes different factors, such as the degree of precision of the user's objective (vague vs. precise), the extraction of information from one or several sources, and the experience of users. This model is close to those developed for searching in electronic information systems by Marchionini et al. [19] and Shneiderman et al. [32], with one main difference: the latter models did not consider, for instance, the specific differences between the Web and bibliographical database systems. The model developed by Rouet and Tricot describes information search as an activity that is both cyclical (like Guthrie's model [11]) and close to text comprehension, problem-solving and decision-making activities. Accordingly, when an individual searches for information, s/he elaborates a cognitive goal, selects a set of information, extracts information and integrates it with what was previously extracted, and restarts this cycle until s/he reaches the search goal. Searching for information therefore consists in transforming a representation of an information need into a request; its formulation depends on the contents and the constraints imposed by the system (here, the website).
Next, the individual has to choose, among the sources supplied to her/him (e.g., a list of links or items presented on a website), those relevant to her/his information search, by evaluating them against her/his representation of the goal. If the document is relevant, the individual deepens her/his search; if the document contains only partially relevant information or very little relevant information, the individual generally modifies her/his strategy and thus her/his request. Finally, if the document is irrelevant, the individual changes her/his request and sometimes also her/his representation of the goal. According to this model, searching for information is a very complex cognitive activity that requires many cognitive resources in working memory and therefore imposes a high cognitive load.
Sweller [31] distinguishes two kinds of cognitive load:
1. Intrinsic cognitive load is linked to the task at hand. It depends on the difficulty of the content to be learned and on the amount of information that the individual has to process simultaneously in working memory. Intrinsic cognitive load decreases as knowledge in long-term memory increases. Consequently, a high intrinsic cognitive load corresponds either to highly complex material or to an individual's low expertise.
2. Extraneous cognitive load is linked to the presentation of information, which influences the cognitive resources involved.
Intrinsic and extraneous cognitive loads add up to a total cognitive load [22]. As just indicated, working memory plays a central role in information search, since it makes it possible to temporarily maintain and process several pieces of available information together with the individual's goals [3]. Working memory is also one of the factors identified in cognitive aging research as a possible predictor of the age-related declines in performance observed on a wide variety of cognitive tasks, such as reading or problem-solving. Older adults might experience difficulties in temporarily keeping and processing several pieces of information (for a review, see [35]): the amount of information that can be simultaneously processed and stored in working memory would decrease with aging. According to Hasher and Zacks [12], what matters is not so much the size of working memory as the way in which the information it holds is managed with regard to the goal of the task at hand. Age-related differences in memory and other cognitive functions would be attributable to a decline in attentional inhibitory control over the contents of working memory (for a review, see [18]). Older adults might be less able than younger adults to suppress and inhibit irrelevant information. Irrelevant information could overload working memory and thus interfere with the task to be performed.
Inhibition failures associated with aging have been observed in numerous cognitive activities, such as memory, language comprehension and reasoning [12], [18]. Last but not least, one of the most widespread theories of cognitive aging postulates a generalized decrease in the speed of executing processes, independently of the type or structure of the information being processed [28]. Many studies show that the slowing of processing accounts for a significant portion of the age-related variance on a large number of cognitive tasks. Salthouse [28] suggested that two mechanisms underlie this effect: (1) according to the limited time mechanism, the cognitive operations essential to the success of an activity are not all correctly executed by the elderly. Cognitive operations would be executed too slowly to be entirely accomplished in the assigned time, because much of the available time is taken up by early processes. (2) According to the simultaneity mechanism, the outcomes of early processes may be lost before they can be used by later processes. The decline in working memory capacity, inhibition failures and the generalized decrease in processing speed are the main effects of aging on cognition, and seem to explain the decline in older adults' performance observed on numerous simple as well as more complex cognitive tasks, such as information search.
3 Experimental Study
3.1 Research Problem and Objectives
Information searching involves many cognitive resources, which depend on the individual's cognitive capacities and age, as well as on the characteristics and constraints of the website used. Among web interface characteristics, ergonomic quality plays a central role. Indeed, if the website does not fit the users' cognitive capacities, the cognitive load should increase. The possible consequences would be cognitive overload and lostness on the Web [10], [30]; for instance, Nielsen [23] noticed that half of the searches on the Web failed. Moreover, because of age-related changes in cognitive functioning, accessing websites could become more complex for older users, especially when websites do not fit the users' needs, i.e. sites incorporating (many) ergonomic violations. Accordingly, this study aims at determining the influence of the users' age and of the ergonomic quality of the website on:
(a) The time necessary to find information;
(b) The number of steps (i.e. the number of hyperlinks visited by the participants) required to find information;
(c) The amount of cognitive resources involved in finding information;
(d) The participants' usability satisfaction with regard to the visited site.
3.2 Procedure
Forty novice web users (in line with [13]) participated in this study: twenty younger users (M = 31 years old) and twenty older users (M = 64 years old). All of the participants had the same educational level (Bachelor's degree) and used the Internet occasionally for information searching and e-mail. Two versions of the same website, an e-shop selling music products, were created:
• An ergonomic version that was consistent with ergonomic recommendations for web interfaces (ergonomic site, hereinafter referred to as ES, see Figure 1).
• A non-ergonomic site (hereinafter referred to as NES, see Figure 2) that included the main ergonomic problems identified by designers and users in a previous study [7].
Fig. 1. Homepage of the ES
Fig. 2. Homepage of the NES
We chose an e-commerce music site (selling CDs, show tickets, etc.) because such products are bought online by many people and do not require specific knowledge of the site content. The study was divided into two stages:
Stage 1: Participants had to search for information to answer three questions presented successively from the homepage; each question had only one correct answer. The order of presentation of the questions was counterbalanced across participants. The participants' navigation activities (visited pages, etc.) were recorded for analysis. To respect the ergonomic recommendations and to allow search times to be compared between the two sites, two steps (i.e., two hyperlinks) were necessary to find each of the three correct answers. To measure cognitive load, participants had to react, while searching for information, to auditory signals (from the Tholos software developed by [5]) by pressing a pedal with a foot (their hands remained free to use the computer). This made it possible to determine an average reaction time (in milliseconds). Each participant's baseline reaction time, measured during a training phase, was subtracted from the reaction times measured during the experimental task (information search task), thus providing "reaction time interference scores" (RT). These scores allowed us to measure the participants' cognitive resources: the greater the reaction time, the more cognitive resources were involved (for more details, see [5]).
Stage 2: After searching for information, participants had to freely navigate the website in order to answer a usability satisfaction questionnaire comprising seventeen statements (based on the WAMMI [14]). For each statement, participants had to indicate the degree to which they agreed on a 5-point scale; the more satisfactory the participants judged the site to be, the closer to 5 the score was.
3.3 Results
All of the results are presented in Table 1.
Statistical analyses (ANOVAs) were conducted with Age (younger vs. older users) and Site (ES vs. NES) as factors. The analyses considered, in the following order, search time and number of steps (§3.3.1), cognitive load (§3.3.2) and usability satisfaction (§3.3.3).
3.3.1 Time and Number of Steps Necessary to Find Targeted Information
The time necessary to find the targeted information (in sec.) was calculated from the moment the participant saw the homepage to the moment s/he said s/he had found the information. All of the participants succeeded in finding the three correct answers. The younger users needed significantly less time to find information than the older users (F(1,36)=11.151; p<.002): on average, 8.2 sec. and 11.56 sec., respectively. The ES users needed less time than the NES users (on average, 8.28 sec. vs. 11.49 sec., respectively; F(1,36)=10.149; p<.003). The Age*Site interaction was not significant. Post-hoc analyses show that the younger users found the targeted information more quickly with the ES than with the NES (p<.006), whereas the older users did not (see Table 1). In addition, the younger users found information more quickly than the older users when navigating the ES (p<.005), whereas there was no significant difference between the younger and the older users when navigating the NES.
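The F(1,36) values reported in this section are consistent with a 2 × 2 between-subjects design: with forty participants spread over four Age × Site groups, the error term has 40 − 4 = 36 degrees of freedom. The sketch below computes such an ANOVA by hand for a balanced design. It assumes ten participants per group, which the paper does not state explicitly, and the scores are hypothetical values invented for illustration, not the study's raw data; the function name `anova_2x2` is likewise illustrative, and the authors presumably used a statistics package rather than code like this.

```python
from statistics import mean


def anova_2x2(cells):
    """Balanced 2 x 2 between-subjects ANOVA.

    cells maps (age_group, site) -> list of scores, same length in every cell.
    Returns the error degrees of freedom and the three F ratios
    (each effect has 1 degree of freedom because each factor has 2 levels).
    """
    ages = sorted({age for age, _ in cells})
    sites = sorted({site for _, site in cells})
    sizes = {len(scores) for scores in cells.values()}
    if len(ages) != 2 or len(sites) != 2 or len(cells) != 4 or len(sizes) != 1:
        raise ValueError("expected a balanced 2 x 2 design")
    n = sizes.pop()  # participants per cell

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    age_mean = {a: mean(cell_mean[(a, s)] for s in sites) for a in ages}
    site_mean = {s: mean(cell_mean[(a, s)] for a in ages) for s in sites}

    # Sums of squares for the two main effects, the interaction and the error.
    ss_age = 2 * n * sum((m - grand) ** 2 for m in age_mean.values())
    ss_site = 2 * n * sum((m - grand) ** 2 for m in site_mean.values())
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_interaction = ss_cells - ss_age - ss_site
    ss_error = sum((x - cell_mean[key]) ** 2
                   for key, scores in cells.items() for x in scores)

    df_error = 4 * (n - 1)
    ms_error = ss_error / df_error
    return {
        "df_error": df_error,
        "F_age": ss_age / ms_error,
        "F_site": ss_site / ms_error,
        "F_interaction": ss_interaction / ms_error,
    }


# Hypothetical search times (sec.), 10 per group; NOT the study's raw data.
spread = [-1, 1] * 5
data = {
    ("younger", "ES"): [6 + d for d in spread],
    ("older", "ES"): [10 + d for d in spread],
    ("younger", "NES"): [10 + d for d in spread],
    ("older", "NES"): [13 + d for d in spread],
}
result = anova_2x2(data)
print(result["df_error"])  # 36
```

Each F ratio would then be compared with the F distribution on 1 and 36 degrees of freedom to obtain a p value.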
Table 1. Means (and standard errors of the mean, SEM) of time and number of steps to find information, cognitive load, and usability satisfaction scores, according to the users' age and the ergonomic quality of the website

                                              Ergonomic Site (ES)              Non-ergonomic Site (NES)
Measure                                       Younger users   Older users     Younger users   Older users
Mean time to find information (in sec.)       6.12 (0.14)     10.44 (1.04)    10.28 (1.37)    12.68 (1.02)
Mean number of steps                          2.1 (0.07)      2.27            2.23            2.85
Cognitive load (in ms)                        120 (17.21)     145.4 (19.68)   187.9 (9.46)    420.4 (91.95)
Mean usability satisfaction score (out of 5)  3.79 (0.19)     3.67            2.83            2.85

The source layout does not show which cells of the steps and satisfaction rows the remaining SEM values belong to: (0.14), (0.2), (0.1), (0.25), (0.27), (0.27).
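The marginal means quoted in Section 3.3.1 can be recomputed from the four cell means given in Table 1 for search time and number of steps. The sketch below assumes equal group sizes (presumably ten participants per group, which the paper does not state explicitly), so that each marginal mean is the plain average of two cells; `marginal_means` is an illustrative helper, not part of the original analysis.

```python
def marginal_means(cells):
    """Average cell means over the other factor (valid for equal cell sizes)."""
    ages = sorted({age for age, _ in cells})
    sites = sorted({site for _, site in cells})
    by_age = {a: sum(cells[(a, s)] for s in sites) / len(sites) for a in ages}
    by_site = {s: sum(cells[(a, s)] for a in ages) / len(ages) for s in sites}
    return by_age, by_site


# Cell means for search time (in sec.) as given in Table 1.
time_cells = {
    ("younger", "ES"): 6.12, ("older", "ES"): 10.44,
    ("younger", "NES"): 10.28, ("older", "NES"): 12.68,
}
# Cell means for the number of steps as given in Table 1.
step_cells = {
    ("younger", "ES"): 2.1, ("older", "ES"): 2.27,
    ("younger", "NES"): 2.23, ("older", "NES"): 2.85,
}

by_age, by_site = marginal_means(time_cells)
steps_by_age, steps_by_site = marginal_means(step_cells)
for label, table in [("time by age", by_age), ("time by site", by_site),
                     ("steps by age", steps_by_age), ("steps by site", steps_by_site)]:
    print(label, {key: round(value, 3) for key, value in table.items()})
```

For search time this gives 8.2 and 11.56 sec. for the younger and older users, and 8.28 and 11.48 sec. for the ES and NES; the last value differs from the reported 11.49 sec. by 0.01, presumably because the cell means in the table are themselves rounded.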
Recall that the optimal number of steps was the same for the two sites (2 steps). Nevertheless, the younger users made significantly fewer steps than the older users (F(1,36)=5.709; p<.03): on average, 2.17 steps vs. 2.56 steps, respectively. Moreover, the ES users made significantly fewer steps (2.18 on average) than the NES users (2.54 on average) (F(1,36)=4.778; p<.04). Post-hoc analyses show that the older users made significantly more steps than the younger users only when navigating the NES (p<.02). The older users made significantly fewer steps with the ES than with the NES (p<.02), whereas there was no significant difference due to the site for the younger users (see Table 1).
3.3.2 Cognitive Load
Recall that cognitive resources were measured using the Tholos software (for details, see [5]), which makes it possible to determine reaction time interference scores (RT) in milliseconds. The RT of the older users was significantly higher than that of the younger users (F(1,36)=7.207; p<.02): on average, 280.9 msec. vs. 153.95 msec., respectively. The ES generated lower RT than the NES (F(1,36)=12.74; p<.001): on average, 132.7 msec. vs. 302.15 msec., respectively. The interaction between Age and Site was significant (F(1,36)=4.647; p<.04): the older users involved fewer cognitive resources when navigating the ES than the NES (p<.0003), whereas there was no significant difference in the younger group. In addition, the younger users involved fewer cognitive resources than the older users when navigating the NES (p<.002), but there was no significant difference between the younger and the older users when using the ES (see Table 1).
3.3.3 Usability Satisfaction for the Two Websites
At the end of the experiment, the questionnaire presented seventeen statements. Participants had to indicate the extent to which they agreed with each statement on a 5-point scale. The more satisfied a participant was, the closer to 5 the evaluation was.
There was no significant effect of Age (on average, 3.26 for the older users and 3.31 for the younger users). In contrast, the website used by the participants had a significant effect (F(1,36)=14.596; p<.0005; see Table 1): the users were more satisfied when navigating the ES (3.73 on average) than the NES (2.84 on average).
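The reaction time interference score used as the cognitive load measure in Sections 3.2 and 3.3.2 is a simple subtraction. A minimal sketch, assuming the score is the mean reaction time to the auditory signals during the search task minus the mean baseline reaction time from the training phase; the function name and the sample values are hypothetical and are not taken from the Tholos software [5]:

```python
from statistics import mean


def interference_score(task_rts_ms, baseline_rts_ms):
    """Mean reaction time during the search task minus mean baseline RT (ms)."""
    return mean(task_rts_ms) - mean(baseline_rts_ms)


# Hypothetical pedal reaction times in milliseconds, for illustration only.
baseline = [310, 295, 305, 290]        # training phase, no search task
during_search = [450, 430, 470, 410]   # while searching the website

score = interference_score(during_search, baseline)
print(score)  # 140
```

A larger score means the secondary task was slowed more by the search, which the paper interprets as more cognitive resources being devoted to the search.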
4 Discussion and Conclusion
This experiment aimed at determining the effects of the ergonomic quality of a website on the information search activity of younger and older novice users. The results show that older web users needed more time than younger users to find the target information. This finding corroborates the most widespread theories of cognitive aging, which suggest a generalized age-related decrease in the speed of executing cognitive processes, independently of the type or structure of the information being processed [28]. Older adults therefore need more time than younger adults to perform the same task (problem-solving, memorization, etc.). More precisely, and surprisingly, this age-related difference was significant only when participants navigated the ergonomic site. Although this website was user-friendly and fit the users' cognitive capacities, the older adults found the targeted information more slowly than the younger users. Moreover, the younger users found information more quickly when navigating the ergonomic site than the non-ergonomic site, whereas the time to find information did not vary significantly between the two websites for the older users. So, when the site contains numerous ergonomic violations, younger adults need more time to process all the information presented on the layout and to select the information most appropriate for their task. Consequently, as far as search time is concerned, the younger users appear to benefit from ergonomic quality whereas the older users do not. Concerning the number of steps needed to find the targeted information, the younger users made fewer steps than the older users. This finding is in accordance with those obtained by Kubeck et al. [15] with younger and older novice web users. In their study, older participants performed more actions than younger participants to achieve similar performance.
These results may reflect age-related limitations of the cognitive resources available in working memory [27]. Given both the limitations of cognitive resources and the decrease in the speed of executing cognitive processes, older adults may forget certain information, such as the web pages visited, previous results and the general goal of the information search. Consequently, older users may have to go back more often than younger users, and thus make more steps. This explanation is in line with the one suggested by Rouet et al. [26]: older adults may experience more difficulty in managing a set of goals (links), because of the decrease in working memory capacity. The results also show that the ergonomic website required fewer steps than the non-ergonomic one to find information. These findings corroborate the results on the time needed to find information. However, a significant effect of age appeared for the site containing ergonomic violations: the older users made more steps than the younger users to find information. The younger users seemed less disturbed by the ergonomic violations than the older users, who needed to open more web pages. Because of an age-related decline in attentional inhibitory
control over the contents of working memory [12], [18], older adults may experience more difficulty than younger adults in suppressing irrelevant (i.e. distracting) information and in not focusing attention on it. Accordingly, our results show that the older users made more steps with the non-ergonomic site than with the ergonomic site, whereas no significant difference appeared in the younger group. Moreover, no significant age-related difference appeared in the number of steps made in the ergonomic site. The results on cognitive load confirm these findings: the older users involved more cognitive resources than the younger users, and the non-ergonomic site required a higher cognitive load than the ergonomic site. The interaction effect shows that the older users' cognitive load was particularly high while navigating the non-ergonomic site, whereas no significant difference appeared between the two websites in the younger group. Therefore, as just indicated, the non-ergonomic site may generate more difficulties for the older users than for the younger users: the older users may experience difficulties in selecting the information relevant to the goal of the task. Because of a decline in inhibitory mechanisms, the older users may process more information than the younger users, so they need more time, more steps and more cognitive resources to find the target information. In line with Sweller's theory [31], we can also suggest that the older users are more sensitive than the younger users to extraneous cognitive load (here, the load due to the ergonomic violations in the non-ergonomic site). Consequently, the older users would experience more difficulty than the younger users in inhibiting the irrelevant information displayed on the non-ergonomic website. Finally, the younger and older users were more satisfied with the ergonomic site than with the non-ergonomic site; this finding is in line with the performance findings.
To conclude, based on these findings, we argue that a better understanding of the differences between younger and older users in searching for information is required, not only within a specific website but also on the Web with search engine tools. In addition, it seems necessary to help designers adopt a user-centered design activity. At present, research is being conducted to help web designers better understand and consider future users' needs while designing websites [8]. Complementary research is therefore required to help designers focus on the specific needs of older users.
References
1. Armbruster, B.B., Armstrong, J.O.: Locating information in text: A focus on children in the elementary grades. Contemporary Educational Psychology 18, 139–161 (1993)
2. Aula, A.: Older adults’ use of Web and search engines. Universal Access in the Information Society 4, 67–81 (2005)
3. Baddeley, A.: Working memory. Clarendon Press, Oxford (1986)
4. Bhatt, G.: Bringing virtual reality for commercial Web sites. International Journal of Human Computer Studies 60, 1–15 (2004)
5. Cegarra, J., Chevalier, A.: Using Tholos software for combining measures of cognitive load: towards theoretical and methodological improvements (in revision)
6. Charness, N., Schumann, S.E., Boritz, G.M.: Training older adults in word processing: effect of age, training technique, and computer anxiety. International Journal of Technology and Aging 5, 79–106 (1992)
7. Chevalier, A.: Evaluer un site Web: les concepteurs et les utilisateurs parviennent-ils à identifier les problèmes d’utilisabilité? Revue d’Intelligence Artificielle 19, 319–338 (2005)
8. Chevalier, A., Fouquereau, N., Vanderdonckt, J.: The Influence of a Knowledge-Based System On the Designers’ Cognitive Activities: A Study involving Professional Web Designers. Behaviour & Information Technology (in press)
9. Dreher, M.J.: Searching for information in textbooks. Journal of Reading 35, 364–371 (1992)
10. Gwizdka, J., Spence, I.: Implicit Measures of Lostness and Success in Web Navigation. Interacting with Computers (in press)
11. Guthrie, J.T.: Locating information in documents: examination of a cognitive model. Reading Research Quarterly 23, 178–199 (1988)
12. Hasher, L., Zacks, R.T.: Working memory, comprehension, and aging: A review and a new view? In: Bower, G.H. (ed.) The psychology of learning and motivation, pp. 193–225. Academic Press, San Diego (1988)
13. Hölscher, C., Strube, G.: Web search behavior of Internet experts and newbies. Computer Networks 33, 337–346 (2000)
14. Kirakowski, J., Claridge, N., Whitehand, R.: Human Centered Measures of Success in Web Site Design. In: Proceedings of the Human Factors and the Web Workshop. Basking Ridge (1998)
15. Kubeck, J.E., Miller-Albrecht, S.A., Murphy, M.: Finding information on the world wide web: exploring older adults’ exploration. Educational Gerontology 25, 167–183 (1999)
16. Ling, J., van Schaik, P.: The influence of font type and line length on visual search and information retrieval in web pages. International Journal of Human-Computer Studies 64, 395–404 (2006)
17. Lowe, G.S.: Computer literacy. Canadian Social Trends 19, 13–15 (1990)
18. Lustig, C., Hasher, L., Tonev, S.T.: Inhibitory control over the present and the past. European Journal of Cognitive Psychology 13, 107–122 (2001)
19. Marchionini, G., Dwiggins, S., Katz, A., Lin, X.: Information seeking in full-text end-user-oriented search systems: the roles of domain and search expertise. Library & Information Science Research 15, 35–69 (1993)
20. Marquié, J.C., Jourdan-Boddaert, L., Huet, N.: Do older adults underestimate their actual computer knowledge? Behaviour & Information Technology 21, 273–280 (2002)
21. Morrell, R.W., Mayhorn, C.B., Bennett, J.: A survey of World Wide Web use in middle-aged and older adults. Human Factors 42, 175–182 (2000)
22. van Merriënboer, J.J.G., Sweller, J.: Cognitive load theory and complex learning: Recent developments and future directions. Educational Psychology Review 17, 147–177 (2005)
23. Nielsen, J.: Designing Web Usability. New Riders Publishing, Indianapolis (2000)
24. Nielsen, J.: Web Usability for Senior Citizens: 46 Design Guidelines Based on Usability Studies with People Age 65 and Older. Nielsen Norman Group Report (2002)
25. Rouet, J.F., Tricot, A.: Task and activity models in hypertext usage. In: van Oostendorp, H., de Mul, S. (eds.): Cognitive aspects of electronic text processing. Ablex, Norwood, pp. 239–264 (1996)
26. Rouet, J.F., Ros, C., Jégou, G., Metta, S.: Chercher des informations dans les menus WEB: interaction entre tâche, type de menu et variables individuelles. Le Travail Humain 4, 379–395 (2004)
27. Salthouse, T.A.: Working memory as a processing resource in cognitive aging. Developmental Review 10, 101–124 (1990)
28. Salthouse, T.A.: The processing-speed theory of adult age differences in cognition. Psychological Review 103, 403–428 (1996)
29. Slone, D.J.: Internet search approaches: The influence of age, search goals, and experience. Library & Information Science Research 25, 403–418 (2003)
30. Smith, P.A.: Towards a practical measure of hypertext usability. Interacting with Computers 4, 365–381 (1996)
31. Sweller, J.: Cognitive load during problem solving: Effects on learning. Cognitive Science 12, 257–285 (1988)
32. Shneiderman, B., Byrd, D., Croft, B.: Sorting out searching: a user-interface framework for text searches. Communications of the ACM 41, 95–98 (1998)
33. Stronge, A.J., Rogers, W.A., Fisk, A.D.: Web-based information search and retrieval: Effects of strategy use and age on search success. Human Factors 48, 443–446 (2006)
34. Tricot, A., Rouet, J.-F.: Activités de navigation dans les systèmes d’information. In: Hoc, J.M., Darses, F. (eds.): Psychologie ergonomique: tendances actuelles. PUF, Paris (2004)
35. Zacks, R.T., Hasher, L., Li, K.Z.H.: Human memory. In: Salthouse, T.A., Craik, F.I.M. (eds.): Handbook of Aging and Cognition (2nd Edition). Lawrence Erlbaum, Mahwah, pp. 293–357
Creating Kansei Engineering-Based Ontology for Annotating and Archiving Photos Database
Yu-Liang Chi 1, Shu-Yun Peng 2, and Ching-Chow Yang 2
1 Dept. of Management Information System, Chung Yuan Christian University, 200 Chung-Pai Rd., Chung-Li (32023), Taiwan
2 Dept. of Industry Engineering, Chung Yuan Christian University, 200 Chung-Pai Rd., Chung-Li (32023), Taiwan
[email protected], [email protected], [email protected]
Abstract. An ontology is built to establish a classification and conceptualization in a knowledge discipline. With the support of ontology technologies, users can retrieve information in a semantic manner. A primary step in ontology building is concept development. Typical approaches to constructing concepts consult experts or analyze documents. However, ontology-based systems usually do not allow user involvement while the ontology is being developed. To acquire expertise from users, this study utilizes Kansei Engineering to translate human emotions, such as perceptions, feelings, or impressions of things, into the design elements of ontology concepts. The newly designed ontology then depends upon a user-centric conceptual structure. This study is particularly interested in archiving photos by employing an ontology built with user involvement. Empirical lessons show that user involvement can reduce the gap between experts and users in defining concepts.
Keywords: Ontology, Knowledge, Affective Design, Kansei Engineering.
"a specification of a conceptualization" [7]. A conceptualization is an abstract, simplified view of the world that is used for representational purposes. That is, the ontology is a formal description of the concepts, attributes, and relationships involved in constructing common understanding for cognitions of real world events. One of the main utilizations of ontology-based applications is addressing semantic differential among physical expressions. Thus, this study employed the ontology technology in metadata to improve semantic representation. The typical strategy of ontology development was collecting abstract expertise from knowledgeable persons. The expertise was then translated into a concrete conceptual structure of ontology. Nowadays, ontology technologies were widely applied in various areas. In most cases, ontologies relied on experts-centric development procedures including concepts collection, formalization, and construction. Since users rarely participated in ontology building, they had to learn domain expertise before using the systems. Thus, current knowledge systems were far away from public needs. To facilitate user involvement during ontology building, approaches to translate user emotional feeling into design elements of ontology-based metadata schema were essential. This study employed Kansei Engineering in knowledge acquisition by following steps: gathering users’ intuition, merging similar intuitions to a perception, and concluding related perceptions to a conception. This paper is organized as follows. Section 2 describes the specific problem domain. We discuss the reasons of why user involvement is important in ontology building. Section 3 proposes a research design of the ontology-based system; three stages are designed to guide system development. Section 4 is detail designs of user involvement; the Kansei engineering is employed to gather user emotions and translate into ontology concepts. 
Section 5 discusses knowledge representation and query interface design; we propose two steps query procedures including an affective query interface and a temporal photo query interface. Section 6 concludes results of the paper.
2 Problem Description
In most traditional archiving systems, information management utilizes data-level models such as the entity-relationship (ER) model to represent content structure or to design indexing schemas. Research has shown that, because of limitations in expressing dependencies, constraints, generalization, and so on, data models are insufficient to support rich semantics for in-depth applications [8]. The critical issue is that a data-level model provides only text-based search functionality. Some systems claim to enable semantic search using a synonym dictionary. However, semantic issues involve not only synonyms but also in-depth implications and semantic divergence. For example, the plant Alocasia odora belongs to the family Araceae, the order Alismatales, the class Liliopsida, the phylum Magnoliophyta, and finally the kingdom Plantae. That is, Alocasia odora inherits all features from its superior layers. Semantic divergence often arises when a word is used in different domains. For example, ‘mouse’ may name an animal, a computer peripheral, or even a cartoon character. Distinct application domains clearly rest on different cognition and knowledge.
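The inheritance argument above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the authors' system: the parent links follow the taxonomy named in the text, while the feature sets and the function names (`ancestors`, `inherited_features`, `matches`) are hypothetical.

```python
# Parent links for the taxonomy named in the text.
PARENT = {
    "Alocasia odora": "Araceae",
    "Araceae": "Alismatales",
    "Alismatales": "Liliopsida",
    "Liliopsida": "Magnoliophyta",
    "Magnoliophyta": "Plantae",
}

# Illustrative features only (assumption); they are not taken from the paper.
FEATURES = {
    "Plantae": {"photosynthetic"},
    "Magnoliophyta": {"flowering"},
    "Araceae": {"spadix inflorescence"},
}


def ancestors(taxon):
    """Return the chain of superior layers, nearest first."""
    chain = []
    while taxon in PARENT:
        taxon = PARENT[taxon]
        chain.append(taxon)
    return chain


def inherited_features(taxon):
    """Collect the taxon's own features plus those of every ancestor."""
    features = set(FEATURES.get(taxon, set()))
    for ancestor in ancestors(taxon):
        features |= FEATURES.get(ancestor, set())
    return features


def matches(query, taxon):
    """A query for a superior layer also matches the taxa beneath it."""
    return query == taxon or query in ancestors(taxon)
```

A plain text search for "Liliopsida" would miss a photo annotated only as Alocasia odora, whereas `matches("Liliopsida", "Alocasia odora")` succeeds because the hierarchy is consulted.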
On the other hand, semantic-level models provide rich data representation types to describe the real world. Evidence in the literature suggests using a semantic-level model for a better and more flexible information system solution [11][12]. The ontology approach is one semantic representation technique, built to establish a classification and conceptualization in a knowledge discipline. With the support of ontology technologies, users can retrieve information in a semantic manner. As shown in Fig. 1, ontological system development can be roughly divided into two phases: ontology engineering and inference applications. Ontology building is a set of engineering processes that comprise knowledge acquisition and knowledge representation. The principal task of expertise acquisition is creating concepts and organizing them into a hierarchical structure.
Fig. 1. Ontology-based system development includes ontology engineering and inference development
Most ontology developers would agree that building a proper conceptual structure for an application domain is challenging. Typical knowledge acquisition collects expertise from experts and knowledge engineers, or elicits knowledge from documents [3]. In most cases, however, ordinary users can hardly use the knowledge systems because the systems were built by professionals. Ontology development depends heavily on the synthesis of three components: problem domains, applied fields, and human perspectives. Thus, similar but different ontologies often result when these components change. For example, if a knowledge system is designed for the public, user involvement must be considered during ontology development. The leftmost process sketched in Fig. 1, labeled ‘affective design’, is a preprocess of expertise acquisition. This study develops approaches to correlate human perspectives contributed not only by experts but also by ordinary users. To develop an affective ontology-based system, this study is particularly interested in gathering user emotions, such as intuitions, images, and perceptions of things, and translating them into the design elements of concepts. More detailed designs are described in the following sections.
3 Research Design
Ontologies are increasingly applied in industry; however, ontology development approaches are still at an early stage. Several studies have noted that ontology building is more of an art than a science [5]. As shown in Fig. 2, this study is divided into three stages: knowledge acquisition, knowledge representation, and knowledge application.
The first stage deals with constructing a conceptual structure within a specified domain of interest; this study focuses on the vascular plant domain. The main task is acquiring the affective elements of users’ cognitions about a vascular plant. Another issue is how to translate affective elements into ontology concepts. To begin, this study examined the constituent parts of a concept to find places to connect with affective design. A concept is a common understanding of things. An ontology concept concerns which definitions of being are fundamental and under what circumstances they hold. Therefore, a concept can be labeled with a name and contain several definitions. The affective design of this study is based on the Kansei Engineering (KE) method, which translates human perceptions, feelings, or images into concrete design elements as needed by ontology concepts. Kansei Engineering was originally used as an ergonomic technology to capture consumers' psychological perceptions for product development [10]; some revisions of the KE approach were made to fit the needs of our design. More detailed implementations are described in Section 4.
Fig. 2. A design of an affective-aware, ontology-based archiving system
The second stage builds an ontology knowledge base. Within ontologies, the knowledge base can be denoted as K = (T, A) [2]. The expression states that a knowledge base (K) consists of intensional knowledge, the ‘T-box’ (T), and extensional knowledge, the ‘A-box’ (A). The T-box holds the conceptual definitions in a terminology module (i.e., a taxonomy), and the A-box holds assertions about individual states in an assertional module, also called assertional knowledge. As the middle sketch in Fig. 2 illustrates, the T-box is the conceptual structure that results from the first stage. An annotation system helps developers describe the meaning of objects in a semantic manner. This study used digital photos of vascular plants as annotation examples.
The last stage implements information retrieval. As the bottom sketch in Fig. 2 illustrates, users retrieve information via a Web-based interface. The inference system is an application integrator based on a reasoning engine that manipulates the ontology knowledge base and then accesses the digital content system. More details are described in Section 5.
4 User Involvement in Expertise Acquisition
To develop an emotion-aware ontology, expertise acquisition requires experts and users to work together. As illustrated in Fig. 3, the domain concerns how to identify a vascular plant through users’ emotional descriptions. The affective-based ontology is proposed as an emotion-enabled repository to support knowledge sharing and reuse across different applications. To do so, affective ontology building must capture semantic and property sets that can be further used to define the conceptual structure. A property set describes features of the domain objects. For example, plant properties may include the root, stem, flower, and so on. Domain experts contribute their expertise to the property set in this stage. A semantic set consists of vocabulary expressing human emotions about specific properties. For example, words such as tufted, xylem, and stingy can be used to describe a plant stem. The Kansei Engineering process was implemented to collect, refine, and group user cognitions into critical vocabularies. User emotions are gathered in this stage. To link the semantic set and the property set, statistical methods were employed to produce Kansei words. ‘Kansei’ is a Japanese word meaning the psychological feeling or experience that people have in their minds [10]. For example, imagine eating a bowl of ice cream. Some people might feel tastiness and joy, while others might feel fatness. These emotions can be grouped as an abstract concept related to ‘ice cream’. The final stage identifies concept definitions and a conceptual structure. Knowledge engineers utilize analysis tools such as formal concept analysis (FCA) to make a prototype of the ontology.
Fig. 3. Knowledge acquisition and modeling process
To gather plant properties, several resources such as botanical books, encyclopedias, and the Internet were consulted. Two botanists were invited to identify general plant properties, combine similar properties into categories, and identify critical physical parts of plants. Eight parts of the vascular plant were finally determined. Building the semantic set, on the other hand, was much more complex because it required gathering emotions. The four steps of semantic set building are described below:
1. Emotional vocabulary collection: Most emotional vocabularies were adjectives. Several sources such as magazines, online articles, dialogue, and even slang were consulted. Three assistants carried out the collection task for one week. More than 334 words were collected.
2. Data cleaning and grouping: To reduce the number of vocabularies, we first deleted infrequent vocabularies manually. Then, an affinity diagram tool was employed to assemble vocabularies with similar semantics and construct their natural relationships. The affinity diagram was originally applied to organize brainstorming output into useful ideas. This study utilized it to sort emotional vocabularies into naturally related groups. Ninety-six vocabularies were grouped and further formed into bipolar pairs, as illustrated in Table 1.
Table 1. Examples of partial bipolar vocabularies
Big --- Small
Stout --- Slender
Wide --- Narrow
Decorative --- Practical
Wild --- Plant
…
3. User survey: To understand users' semantic differential (SD) when using the above affective vocabularies to describe a plant, a questionnaire survey was developed. The SD questionnaire measures human reactions to stimulus cognitions. Participants were required to rate bipolar vocabularies defined with opposite constructs at each end. A pre-questionnaire was completed by 15 participants. Some vocabularies were removed if their ratings were neutral. A revised questionnaire survey via the Internet received valid responses from 573 of the 812 people polled in two weeks.
4. Data analysis: The results were analyzed using the statistical tool SPSS. This study utilized two functions, factor analysis and cluster analysis, to find the ‘Kansei’ words. Fourteen clusters were identified and then labeled as the Kansei words.
The synthesis stage utilized formal concept analysis (FCA) to recognize relationships between Kansei words and properties. A formal context in FCA is denoted as a triple (G, M, I), where G is a set of objects, M is a set of attributes, and I is a binary relation between G and M [13]. Here, G refers to the parts of a plant and M refers to the Kansei words. The context is usually represented by a context lattice table. As illustrated in Fig. 4, Kansei words are listed in the top row and features are arranged in the left column. A cross symbol indicates that a specific feature has the corresponding Kansei word. The analysis was implemented using the FCA tool Galicia. An interactive system asked developers about each uncertain implication in the context. Users then had to either confirm that the implication was always true or reject it by supplying a counterexample from the existing cases. This counterexample was then added to the formal context. The program stopped when all uncertain implications of the context had been resolved as valid in the universe.
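The confirm-or-counterexample dialogue described above can be sketched in Python as follows. This is a simplified illustration, not the behavior of the Galicia tool: the tiny context, the Kansei words in it, and the function names are hypothetical.

```python
# Hypothetical formal context: plant parts mapped to the Kansei words they carry.
CONTEXT = {
    "stem": {"slender", "wild"},
    "leaf": {"wide", "wild"},
    "flower": {"decorative", "slender"},
}


def counterexamples(premise, conclusion, context):
    """Objects that have every premise attribute but lack part of the conclusion."""
    return sorted(
        obj
        for obj, attrs in context.items()
        if premise <= attrs and not conclusion <= attrs
    )


def holds(premise, conclusion, context):
    """An implication premise -> conclusion holds if it has no counterexample."""
    return not counterexamples(premise, conclusion, context)
```

In this toy context, "wide implies wild" holds, so a developer would be asked to confirm it. Rejecting it means supplying a counterexample such as a new object `root` carrying only `wide`; after that object is added, `holds({"wide"}, {"wild"}, ...)` becomes false and the implication is no longer proposed.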
Fig. 4. A context lattice table example
From the ontology design view, the items listed in both the top row and the left column are all concepts or classes. The cross symbol describes a relation between two concepts. For example, the concept stem has filler classes such as color, quantity, size, shape, and so on. The FCA line diagram functionality can further be used to draw the conceptual structure of the ontology.
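As a rough illustration of how a conceptual structure falls out of a formal context (G, M, I), the following Python sketch enumerates all formal concepts of a small context by brute force. The context and its Kansei words are hypothetical, and the exhaustive enumeration is an assumption chosen for brevity; dedicated FCA tools use far more efficient algorithms.

```python
from itertools import combinations

# Hypothetical formal context: plant parts (G) and their Kansei words (M).
CONTEXT = {
    "stem": {"slender", "wild"},
    "leaf": {"wide", "wild"},
    "flower": {"decorative", "slender"},
}


def common_attributes(objects, context):
    """Attributes shared by every object in `objects`."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Return all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts
```

Each returned pair is a candidate node of the line diagram: for example, the extent {stem, leaf} with the intent {wild} groups the two parts that share one Kansei word. Ordering the pairs by extent inclusion yields the hierarchy.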
5 Knowledge Representation and Query System Design
Ontologies are used to model domain knowledge of interest. Several XML-based ontology representation languages such as RDF, DAML, and OWL provide different schemas and capabilities [9]. This study utilized the Web Ontology Language (OWL), which was the newest ontology language from the World Wide Web Consortium (W3C). An OWL-based ontology consists of classes, properties, and individuals. The Protégé OWL editor was employed to create the OWL ontology. An OWL class is the same as an ontology concept and contains formal definitions. An OWL property is used to link classes. A typical definition is formed by a formula that consists of properties and filler classes to limit the scope of a class.
Fig. 5. An affective query interface allows the user to enter an emotional query
Moreover, description logic (DL) symbols can be added as modifiers to further restrict the definition. For example, an asserted definition of the class stem can be expressed as ∃hasShape Shape hasShape Size. The ‘owl:Thing’ class is the root class of an OWL ontology. The root class connects subclasses to create a hierarchical class structure, also known as the ‘T-box’ of a knowledge base. An OWL individual is also known as an instance of a class. In the annotation stage, developers filled in the values of each property according to the real circumstances of an individual. These asserted individuals are also known as the ‘A-box’. In total, 223 asserted individuals of vascular plants were inserted.
The major improvement of this study is bringing affective design into the conceptual structure of the ontology. Since the data repository employs an ontology as its knowledge base, the system can support the development of an affective user interface. In many cases, people do not really believe something unless they can see with their own eyes that it is true. Therefore, the system first retrieves possible items represented as photos. After the user identifies the target, a second query is sent to the server to get the corresponding information. This study designed a two-stage information retrieval system that supports an affective query interface and a temporal photo query interface.
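The split between a T-box of class definitions and an A-box of asserted individuals can be sketched in Python as follows. Everything here is an assumption made for illustration: the class names, the property name `hasSize`, the individuals, and the closed-world check, which is much simpler than what an OWL reasoner does (a reasoner works under the open-world assumption and would not treat a missing assertion as a failure).

```python
# T-box: hypothetical class hierarchy rooted at owl:Thing.
SUBCLASS_OF = {
    "Stem": "owl:Thing",
    "Shape": "owl:Thing",
    "Size": "owl:Thing",
    "Slender": "Shape",
    "Tall": "Size",
}

# T-box: hypothetical definition; a Stem has some Shape and some Size.
DEFINITIONS = {"Stem": [("hasShape", "Shape"), ("hasSize", "Size")]}

# A-box: asserted types and property values of hypothetical individuals.
TYPES = {"shape1": "Slender", "size1": "Tall"}
ASSERTIONS = {
    "stem1": [("hasShape", "shape1"), ("hasSize", "size1")],
    "stem2": [("hasShape", "shape1")],
}


def is_subclass(cls, ancestor):
    """Walk up the hierarchy until `ancestor` is reached or the root is passed."""
    while cls != ancestor:
        if cls not in SUBCLASS_OF:
            return False
        cls = SUBCLASS_OF[cls]
    return True


def satisfies(individual, cls):
    """Closed-world check that each existential restriction has a witness."""
    for prop, filler in DEFINITIONS.get(cls, []):
        values = [v for p, v in ASSERTIONS.get(individual, []) if p == prop]
        if not any(is_subclass(TYPES[v], filler) for v in values):
            return False
    return True
```

Here `stem1` meets both restrictions, while `stem2` has no asserted size and therefore fails the simplified check.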
Fig. 6. Corresponding information about a vascular plant is retrieved by clicking a photo
Here, suppose that a user wants to query a vascular plant without knowing the plant name. Information retrieval is implemented as the following two-stage process:
● As shown in Fig. 5, users follow the affective directions to select concepts that match their impressions of the search target. After the query is submitted, the knowledge system invokes a reasoner to infer possible answers.
● The qualified items are represented by photos on a temporal Web page. The user identifies the plant by clicking the proper photo. As shown in Fig. 6, more detailed corresponding information is then retrieved and presented to the user.
This study developed an information retrieval mechanism based on a Java-based reasoner. The programmed mechanisms are supported by the Pellet and Jena APIs to infer over OWL ontology bases. With the help of JavaServer Pages, this knowledge system can be used over the Internet.
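The two-stage flow can be sketched in Python as follows. This is a hedged illustration rather than the authors' Java implementation: plain set matching stands in for the reasoner, and the catalog entries, file names, Kansei words, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    name: str
    photo: str
    kansei: frozenset
    details: str


# Hypothetical annotated individuals standing in for the knowledge base.
CATALOG = [
    Plant("Alocasia odora", "alocasia.jpg", frozenset({"wide", "wild"}), "Family Araceae"),
    Plant("Example fern", "fern.jpg", frozenset({"slender", "wild"}), "Placeholder entry"),
]


def affective_query(selected, catalog):
    """Stage 1: return photos of plants whose annotations cover the selection."""
    wanted = frozenset(selected)
    return [plant.photo for plant in catalog if wanted <= plant.kansei]


def photo_query(photo, catalog):
    """Stage 2: return the details of the plant behind the clicked photo."""
    for plant in catalog:
        if plant.photo == photo:
            return plant.details
    return None
```

A user who selects only "wild" is shown both photos and settles the ambiguity visually; clicking `alocasia.jpg` then triggers the second query for the detailed record.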
6 Conclusion
Advantages of ontology-based systems include rich semantics, logic expressions, and knowledge sharing. Generally, ontology-based system development involves several knowledge engineering processes, including expertise collection, knowledge modeling, conceptual structure building, and inference mechanisms. In traditional ontology building, knowledge acquisition is carried out by knowledgeable persons with specialized domain expertise. However, it is doubtful that users and system designers have identical cognitions and viewpoints. In particular, expertise is the basis of ontology building and directly affects system performance. As a result, users may find expert-centered knowledge systems difficult to use. Thus, a crucial problem underlying ontology building is that users’ needs are often ignored. This study developed an approach that incorporates user involvement into concept development. Kansei Engineering was introduced to gather user emotions and translate them into design elements of concepts. Several techniques were utilized to support expertise acquisition, such as a questionnaire test, an affinity diagram, and formal concept analysis. The empirical lessons related to user involvement are as follows. First, Kansei words are the primary summaries of user emotions about the domain of interest. These words can be used as ontology concepts. Second, formal concept analysis can be used to elicit relationships between affective words and properties. Both affective words and properties should be treated as concepts during ontology building. Consequently, user involvement during ontology development is necessary when application systems are designed for the public. Future work should develop more efficient ways to collect user emotions.
Acknowledgments. The authors would like to thank the National Science Council of the Republic of China, Taiwan for financially supporting this research under Contract No. NSC 95-2416-H-033-009.
References
1. Arora, J.: Network-enabled digitized collection at the Central library, IIT Delhi. Intl. Info. and Libr. Review 36, 1–11 (2004)
2. Baader, F., Calvanese, D., McGuinness, D., Nardi, D., Patel-Schneider, P.: The description logic handbook. Cambridge University Press, Cambridge, UK (2003)
3. Chi, Y.-L.: Elicitation synergy of extracting conceptual tags and hierarchies in textual document. Expert Sys. with Apps. 32, 349–357 (2007)
4. Chi, Y.-L., Hsu, T.-Y., Yang, W.-P.: Ontological techniques for reuse and sharing knowledge in a digital museum. The Electronic Libr. 24, 147–159 (2006)
5. Compton, P., Jansen, R.: A philosophical basis for knowledge acquisition. Knowledge Acquisition 2, 241–258 (1990)
6. El-Sherbini, M.: Metadata and the future of cataloging. Library Computing 19, 180–191 (2000)
7. Gruber, T.R.: Towards principles for the design of ontologies used for knowledge sharing. Int. J. of Human-Computer Studies 43, 907–928 (1995)
8. Hull, R., King, R.: Semantic Database Modeling: Survey, Applications, and Research Issues. ACM Computing Surveys 19, 201–260 (1987)
9. Horrocks, I., Patel-Schneider, P.F.: Reducing OWL entailment to description logic satisfiability. J. of Web Semantics 1, 345–357 (2004)
10. Nagamachi, M.: Kansei Engineering: A new ergonomic consumer-oriented technology for product development. Int. J. of Industrial Ergonomics 15, 3–11 (1995)
11. Sugumaran, V., Storey, V.C.: Ontologies for conceptual modeling: their creation, use, and management. Data and Knowledge Engineering 42, 251–271 (2002)
12. Uschold, M., Grueninger, M.: Ontologies: principles, methods and applications. Knowledge Eng. Review 11, 93–155 (1996)
13. Wille, R.: Concept lattices and conceptual knowledge systems. Computers and Mathematics with Applications 23, 493–515 (1992)
Influence of Avatar Creation on Attitude, Empathy, Presence, and Para-Social Interaction
Donghun Chung1, Brahm Daniel deBuys2, and Chang S. Nam3
1 School of Communication, Kwangwoon University, 447-1 Wolgye-Dong, Nowon-Gu, Seoul, Korea 139-701, [email protected]
2 Department of Communication, University of Arkansas, 417 Kimpel Hall, Fayetteville, AR 72701, U.S.A, [email protected]
3 Department of Industrial Engineering, University of Arkansas, 4207 Bell Engineering Center, Fayetteville, AR 72701, U.S.A, [email protected]
Abstract. The present paper focuses on the influence of avatar creation in a video game. More specifically, this study investigates the effects of avatar creation on attitude towards the avatar, empathy, presence, and para-social interaction among female non-game users. As a cyber-self, an avatar is a graphic character representing a user in cyberspace. Avatars are primarily used in the entertainment industry as high-tech novelties, controlled by game users, for high-end video games. Some games provide default game characters that users cannot change, while other games provide various options gamers can choose from. What if game users can create their own avatars? Do they feel psychologically closer to their avatars as their cyber-selves? This study tested for differences in attitude, empathy, presence, and para-social interaction among female non-game users between an avatar creation group and a non-avatar creation group, and found no difference.
Keywords: Avatar, Attitude, Empathy, Presence, Para-Social Interaction, Wii.
Many games also provide users with a variety of avatars and options for creation. Avatars play an important role in gaming because gamers are visually exposed to their avatars. Consciously or unconsciously, gamers interact with their avatars and this interaction directly or indirectly affects entertainment. The role of game character/avatar creation was investigated in a few studies. For instance, Cordova and Lepper [4] found its importance in motivation and engagement in learning, and Lim [5] found that avatar choice leads to greater arousal and identification. In the continuation of those studies, the present research investigated the influence of avatar creation in interactive video games. More specifically, we examined attitude towards avatar, empathy, presence, and para-social interaction as outcomes of avatar creation.
2 Literature Review
Little research has examined the role of avatar creation, but identification with game characters has been studied [6] [7] [8]. This research has shown that various gaming situations (such as violent, story-based, and third-person point-of-view (POV) games) increase the level of identification with the characters. Most interestingly, Lim [5] found that avatar choice leads to greater identification. So what other outcomes will avatar creation bring about? One possible variable is attitude. Many variables influence attitude formation, which refers to the transition from having no attitude towards a given object to having some attitude towards it, either positive or negative [9]. Many researchers have examined cognitive processes to identify the factors that determine attitude formation. As a predisposition, an attitude is formed when people have feelings toward an object, so personal experience is one of the most important factors. Indeed, according to Breer and Locke’s task-experience theory [10], when someone works to achieve a goal, the nature of the task (easy or hard), the operations (individual or collective), and the outcome (success or failure) will shape attitudes. When gamers create their avatars, they have a more unique experience than when they just receive an assigned avatar. Indeed, if gamers enjoy the time and effort spent creating their avatars, find the process easy, and are satisfied with the result, they will more likely have a positive attitude. Therefore, we hypothesized that:
H1. Gamers who create their own avatar will have a more positive attitude towards it than gamers who receive an avatar by default.
Tamborini and his colleagues [11] state that there are nearly as many definitions of empathy as there are individuals attempting to study it. However, as a multidimensional construct, empathy has been examined in affective and cognitive dimensions. The cognitive dimension is a process by which we imaginatively place ourselves in another person’s situation, such as perspective taking and fictional involvement. The affective dimension is a process associated with a tendency to experience strong emotional reactions to another person’s pain or misfortune, such as empathic concern and emotional contagion. Zillmann [12] noted that a viewer’s affective response to a media message was dependent on the veridicality of the portrayal of the circumstances that fostered the emotions of a character on screen.
Cummins’ [13] interpretation is that the audio-visual presentations characteristic of contemporary electronic media have the potential to generate the greatest sense of veridicality and thus have great potential for eliciting affective responses. From this perspective, people who create their own avatar should have a greater sense of veridicality, because they create it based on their realistic ideas, and thus greater affective responses. Therefore, we hypothesized that:
H2. Gamers who create their own avatar will have greater empathy than gamers who receive an avatar by default.
Many scholars in various fields have taken different approaches to understanding presence. To explicate the concept, the International Society for Presence Research (ISPR), a community of scholars interested in the presence concept, has revised a conceptual definition of presence. According to ISPR [14], presence is defined as a psychological state or subjective perception in which, even though part or all of an individual's current experience is generated by and/or filtered through human-made technology, part or all of the individual's perception fails to accurately acknowledge the role of the technology in the experience. Lombard and Ditton [15] pointed out that presence is the perception of non-mediation. In the early stage of conceptualizing presence, Witmer and Singer [16] noted that involvement and immersion are necessary for experiencing presence. They defined involvement and immersion as similar psychological processes. Involvement is defined as a psychological state experienced as a consequence of focusing one’s energy and attention on a coherent set of stimuli or meaningfully related activities and events. On the other hand, immersion is characterized by perceiving oneself to be enveloped by, included in, and able to interact with an environment that provides a continuous stream of stimuli and experiences.
The two are distinguished from one another in that involvement depends on focusing one’s attention and energy on a coherent set of stimuli, while immersion depends on perceiving oneself as a part of the stimulus flow. Witmer and Singer propose that a valid measure of presence should address factors that influence involvement as well as those that affect immersion. So what is the role of avatars in presence? Using a two-way ANOVA with avatar choice and visual point of view (POV) as factors, Lim [5] found that avatar choice leads to a greater sense of presence in the third-person point of view. Therefore, we hypothesized that:
H3. Gamers who create their own avatar will have a greater sense of presence than gamers who receive an avatar by default.
Horton and Wohl [17] conceptualized para-social interaction as the imaginary one-way relationship that viewers develop with people on television. Scholars have examined (imaginary) relationships between reporters and viewers, anchors and viewers, TV characters and viewers, and so on. Recently, Klimmt and his colleague [18] examined how people perceive their avatars as interaction partners using the para-social interaction perspective. The basic concept of para-social interaction is that media users psychologically interact with characters appearing on-screen, and much literature has shown that para-social interaction develops through frequent exposure. If people are more exposed to their avatars by creating their own, the avatar creation group should have a greater sense of para-social interaction. Therefore, we hypothesized that:
H4. Gamers who create their own avatar will have a greater sense of para-social interaction than gamers who receive an avatar by default.
3 Console Games
In the world of next-generation home gaming systems, the emphasis has been on increasing graphical capabilities and processing speed. The XBOX 360 was the first next-generation system to be released to the public (2005). It boasts a dual-layer DVD-ROM that enables HD quality graphics (http://www.gamespot.com/features/6125087/index.html?type=tech). Its processing speed is 3.2 GHz. The PlayStation 3 (PS3) was released at the end of 2006 and is struggling to catch up to the commercial success of the 360. It has a Blu-ray BD-ROM, which also enables HD quality graphics. Its processing speed is also 3.2 GHz, though with a different processor than the 360. Both the PS3 and XBOX 360 can play DVDs. In an attempt to get back into the highly competitive video game market, Nintendo realized that it couldn’t keep up with the hardware capabilities of the PS3 and the XBOX 360. Nintendo decided to approach home gaming from a different perspective, emphasizing playability and interactivity over enhanced visual capabilities. The Nintendo Wii is what emerged from this attempt to change the way video games are perceived. The Wii has several significant differences from the other next-gen gaming platforms (wii.nintendo.com). For one thing, it is very small, at 8.5 in x 6 in x 2 in and 3.84 lbs. It also has wireless connectivity. The Wii is backwards-compatible with the previous Nintendo system (the Gamecube), and many of the games from previous Nintendo systems will be available for download. The ability to play older games with limited graphical capabilities and relatively simplistic gaming features is consistent with Nintendo’s focus on gameplay over graphics. The most innovative feature of the Nintendo Wii is the controller, called the Wii Remote. It contains sophisticated motion-sensing technology that enables a variety of gaming functions. You can swing the controller like a tennis racket to play a tennis game.
You can grab the controller with both hands and steer it like a steering wheel. You can point and shoot in first-person shooters. With an additional controller connected to the Remote, you can box an opponent by engaging in a punching and blocking motion using both hands. The Nunchuk is an additional controller for the other hand that is connected to the Remote. The Nunchuk has an analog joystick to control fine movements, so that motion can be controlled by the left hand while the right hand engages in swinging motions, jabbing movements, or whatever is appropriate for the game. The Nunchuk also contains motion-sensing technology, so controller movements can involve both hands at once. The Remote also contains other features that may contribute to a more immersive experience. It has a rumble feature to supply kinesthetic feedback. The Remote also has a speaker build into the controller itself. Both the Remote and the Nunchuk have an ambidextrous nature that allows for right- or left-handed players to use them with equal facility. The controllers are wireless and interact with the system
Influence of Avatar Creation
using Bluetooth technology via a sensor bar that perches on top or in front of the television. The sensor bar can pick up motion from up to 30 feet away. However, unlike the XBOX 360 and PS3, the Wii can only play Wii discs and Gamecube discs, and its resolution is limited (480p) compared to the high definition available for the PS3 and XBOX 360 (1080i) (http://reviews.cnet.com/Nintendo_Wii/4540-6464_7-31355104-4.html?tag=sub). All three systems have wireless controllers.
4 Method

4.1 Participants and Procedure

Since many game studies have shown that gender and game experience are significant variables leading to different outcomes, this research controlled for both factors. Therefore, only female non-game users were eligible for this research. In addition, participants could not have a pre-existing avatar, because avatar creation is the manipulation in this research. In the end, sixteen female undergraduate students with no game experience were drawn from several communication classes at a large university. Ages ranged from 19 to 25, and the average age was 21.25 (SD=1.57).

All participants filled out a consent form and then entered a game lab which had a TV set, a Nintendo Wii, and a home theater system. Participants who were randomly assigned to the avatar creation group (the experimental group) created their avatars for 6 minutes. Participants who were assigned to the non-avatar creation group (the control group) simply received a default avatar. Since the game provided 3 different default female avatars, we assigned the third female avatar to all participants in the control group to avoid any threat to internal validity. Each group had eight participants.

Many options are available to customize an avatar (known on the Nintendo Wii as a Mii). There are 45 initial random faces to choose from, and then participants can keep choosing from 9 altered versions of the face until “Use this face” is chosen. There are then 9 screens of options to choose from. The initial screen contains Nickname, Favorite (outfit color), Gender, Birthday, Favorite Color (12 options), Mingle (choose whether to let your avatar interact with other avatars), and Mii Creator. The second screen has to do with Body Type. Height and Weight can both be altered along a sliding scale. The third screen has to do with Face. Shape (8 options), Facial Characteristics (12), and Color (6) can all be altered. The fourth screen has to do with Hair.
Style (72), Color (8), and the Side of the Part can be altered. The fifth screen has to do with Eyebrows. Type (24), Color (8), Up/Down Placement, Size, Rotation, and Left/Right Placement can be altered. The sixth screen has to do with Eyes. Type (48), Color (6), Up/Down Placement, Size, Rotation, and Left/Right Placement can be altered. The seventh screen has to do with Nose. Type (12), Up/Down Placement, and Size can be altered. The eighth screen has to do with Mouth. Type (24), Color (3), Up/Down Placement, and Size can be altered. Finally, the last screen has to do with Accessories. Glasses (Type (9), Color (6), Up/Down Placement, and Size), Mustache (Type (4), Color (8), Up/Down Placement, and Size), Mole (Type (2), Up/Down Placement, Size, and Left/Right Placement), and Beard (
D. Chung, B.D. deBuys, and C.S. Nam
Type (4) and Color (8)) can all be altered. Participants in the avatar creation group experienced all these functions and created their own avatars.

Once participants in both groups had their avatars, a researcher explained how to play a tennis game. The researcher showed them how to serve and how to hit a forehand and a backhand. They were asked to play the tennis game for 5 minutes as a training session. All of the lights were turned off and the researcher helped guide them during that time. After this training session, participants played alone for 15 minutes. At the end of the gaming session, participants were given a main questionnaire that asked about attitude towards avatar, presence, empathy, and para-social interaction.

4.2 Instruments

The participants were questioned about attitude towards avatar, empathy, presence, and para-social interaction. Attitude towards avatar had four items which measured participants’ general feeling of favorableness or unfavorableness towards their avatar. A five-point semantic differential scale was employed and all items were retained: I think that my avatar is “useless/useful,” “unimportant/important,” “foolish/wise,” and “unpleasant/pleasant.” The measure was found to be reliable (M=3.36, SD=0.71, α=.72). Empathy was operationally defined as feeling the same way that an observed avatar is feeling, and eight items were newly created, such as “when my avatar was happy, I was
Fig. 1. Overview of the experimental condition
happy,” “my emotional state affected the interaction of my avatar and myself,” etc. Two items were deleted and the measure was reliable (M=2.71, SD=0.73, α=.84). Presence had fourteen items based on Witmer and Singer’s (1998) presence measure [16], which asked about involvement and immersion in the gaming environment. All of the items were newly created and retained. The measure was reliable (M=3.03, SD=0.83, α=.94). Lastly, the para-social interaction measure was adapted from Rubin and Perse [19]. Para-social interaction between gamers and game characters was measured by eight items, all of which were retained, and the measure was reliable (M=2.46, SD=0.90, α=.91). Empathy, presence, and para-social interaction used a five-point Likert scale.

The Nintendo Wii was chosen for this research because of its greater interactivity compared with other systems. An LG 42-inch LCD TV was used with the Wii. It measures 46.3 x 30.2 x 11.8 in and weighs 90.4 lbs with the stand. The resolution is 1366 x 768 and the television system is NTSC-M, ATSC, 64 & 256 QAM. For better sound, a Panasonic SA-HT940 home theater system was used. It has 5+1 channels.
Fig. 2. “I can’t see my avatar”
5 Results

Results showed that no hypothesis was supported. First, there was no significant difference in attitude between avatar creation (M=3.31, SD=.61) and non-avatar creation groups (M=3.41, SD=.83), t(14)=-.26, ns, two-tailed. Second, there was no significant difference in empathy between avatar creation (M=2.92, SD=.57) and non-avatar creation groups (M=2.5, SD=.85), t(14)=1.16, ns, two-tailed. Third, there was
no significant difference in presence between avatar creation (M=2.82, SD=.67) and non-avatar creation groups (M=3.04, SD=1.04), t(14)=-.49, ns, two-tailed. Lastly, no significant difference was found in para-social interaction between avatar creation (M=2.78, SD=.86) and non-avatar creation groups (M=2.14, SD=.88), t(14)=1.48, ns, two-tailed.
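The reported t values can be roughly reproduced from the group means and standard deviations alone. The sketch below assumes a pooled-variance (Student's) two-sample t-test with eight participants per group, which matches the reported 14 degrees of freedom; the paper does not state which variant was used, and the function and variable names are illustrative only.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's two-sample t statistic (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# (creation mean, creation SD, default mean, default SD) as reported in the Results
reported = {
    "attitude": (3.31, 0.61, 3.41, 0.83),
    "empathy": (2.92, 0.57, 2.50, 0.85),
    "presence": (2.82, 0.67, 3.04, 1.04),
    "para-social interaction": (2.78, 0.86, 2.14, 0.88),
}

N_PER_GROUP = 8  # eight participants in each group

results = {
    name: pooled_t(m1, sd1, N_PER_GROUP, m2, sd2, N_PER_GROUP)
    for name, (m1, sd1, m2, sd2) in reported.items()
}

for name, (t, df) in results.items():
    print(f"{name}: t({df}) = {t:.2f}")
```

Because the inputs are rounded to two decimals, the recomputed values differ slightly from the published ones (by about 0.02 at most). In every case the absolute value is well below 2.145, the two-tailed critical value at the .05 level for 14 degrees of freedom, which is consistent with the non-significant results reported.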
6 Discussion

The goal of the present study was to find out the influence of avatar creation on attitude, empathy, presence, and para-social interaction. To answer this question, we divided the participants into two groups, avatar creation and non-avatar creation, and compared the outcomes. The avatar creation group created their own avatar while the non-avatar creation group received a default avatar. The results showed that no hypothesis was supported. There may be several reasons why there were no differences between the avatar creation and non-avatar creation groups.

First, the sample might not have been appropriate. Having only female participants might be a reason why there were no significant effects. According to Lim [5], gender of the game player was a significant factor that determined many aspects of the game play experience, both physiologically and subjectively. More specifically, males exhibited more significant outcomes than females in arousal, heart rate, and valence, and females’ physiological responses did not depend on avatar choice. However, this research recruited only females because it was hard to find male non-game users. Since creating an avatar is the manipulation in this research, we sought people who did not play games and who did not have an avatar. In the preliminary study, we found that most male students enjoyed gaming and had an avatar, so we limited participants to females. In the preliminary research the average video game self-efficacy score, which is a person’s judgment of her ability to play video games, was below the midpoint of the measurement scale (M=2.80, SD=.89). This suggests that female non-game users may not have been appropriate participants, because they were not confident in playing the game.

Second, the tennis game itself was too simplistic. Playing a simple game can be an advantage as well as a disadvantage in gaming research.
We chose this game because the participants were female non-game users. We wanted them to be comfortable with an easy game immediately following the five-minute training session. Cordova and Lepper [4] found that with a more challenging game, greater use of complex operations and greater strategic play had significant outcomes in the participants’ motivations and engagement. Since the tennis game had simple operations, low difficulty, and no strategy, the two groups might not have differed significantly in presence, empathy, and para-social interaction. It may be a good idea to measure “perceived ease of use” for gaming, which means the degree to which the user feels the game to be easy or free of effort.

Third, the manipulation might not have been strong enough. The experimental group was supposed to spend six minutes creating their avatars, but it was hard to know if they really spent the whole time doing so. The experimental group was also told that all nine screens of options should be clicked and used, but again, we had no
way to check up on it. Unfortunately, it is hard to know the role of the manipulation because we did not ask how they felt about creating their avatars.

Fourth, two very important problems existed in terms of avatars. First, the avatars were facing away from the participants during game play, so they did not even see their creation’s face, and all of the characters looked much the same from the back. They also did not get to dress up their avatars, something very popular in avatar use. Second, the cartoonish nature of the avatars may take away from the realism of the experience. Although the Wii tennis game provides gamers with many options, it is hard to make an avatar resemble a real person. This is important because empathy has a veridicality issue, and if participants made their avatars without any reference to themselves, empathy might be hard to elicit.

Finally, longitudinal research is necessary. Para-social interaction develops through repeated exposure over a relatively long period [13]. A single fifteen-minute episode of gaming is not sufficient to identify with the avatar. Though the results were inconclusive in the present study, it is believed that with a longitudinal study, more realistic avatars, and a game in which avatar interaction is maximized, the results will show that avatar creation influences attitude towards avatar, empathy, presence, and para-social interaction.
References

1. Nowak, K.: The Influence of Anthropomorphism on Mental Models of Agents and Avatars in Social Virtual Environments. Unpublished Dissertation, Michigan State University, East Lansing, Michigan (2000)
2. Suler, J.: The Psychology of Avatars and Graphical Space in Multimedia Chat Communities (1997). Online. Retrieved February 8, 2007 from http://www.rider.edu/suler/psycyber/psyav.html
3. Chung, D.: Something for Nothing: Understanding Purchasing Behaviors in Social Virtual Environments. CyberPsychology & Behavior 6, 538–554 (2005)
4. Cordova, D.I., Lepper, M.R.: Intrinsic Motivation and the Process of Learning: Beneficial Effects of Contextualization, Personalization, and Choice. Journal of Educational Psychology, 715–730 (1996)
5. Lim, S.: The Effect of Avatar Choice and Visual POV on Game Play Experiences. Unpublished Dissertation, Stanford University, California (2006)
6. Tamborini, R.: The Experience of Telepresence in Violent Video Games. Paper presented at the 86th annual convention of the National Communication Association, Seattle, Washington (2000)
7. Tamborini, R., Eastin, M., Lachlan, K., Fediuk, T., Brady, R., Skalski, P.: The Effects of Violent Virtual Video Games on Aggressive Thoughts and Behaviors. Paper presented at the 86th annual convention of the National Communication Association, Seattle, Washington (2000)
8. Schneider, E.F., Lang, A., Shin, M., Bradley, S.D.: Death with a Story: How Story Impacts Emotional, Motivational, and Physiological Responses to First Person Shooter Video Games. Human Communication Research 30(3), 361–375 (2004)
9. Oskamp, S., Schultz, P.W.: Attitudes and Opinions, 3rd edn. Lawrence Erlbaum Associates, Mahwah, New Jersey (2005)
10. Breer, P.E., Locke, E.A.: Task Experience as a Source of Attitudes. Dorsey Press, Homewood, Illinois (1965)
11. Tamborini, R., Salomonson, K., Bahk, C.: The Relationship of Empathy to Comforting Behavior Following Film Exposure. Communication Research 20(5), 723–738 (1993)
12. Zillmann, D.: Empathy: Affect from Bearing Witness to the Emotions of Others. In: Bryant, J., Zillmann, D. (eds.) Responding to the Screen: Reception and Reaction Processes, pp. 135–168. Lawrence Erlbaum Associates, Mahwah, New Jersey (1991)
13. Cummins, R.G.: The Entertainment Appeal of Reality Television: The Effect of Direct Address on Empathy, Interactivity, Presence, and Entertainment Value. Unpublished Dissertation, University of Alabama, Tuscaloosa, AL (2005)
14. International Society for Presence Research: The Concept of Presence: Explication Statement (2000). Online. Retrieved February 10, 2006 from http://ispr.info/
15. Lombard, M., Ditton, T.: At the Heart of It All: The Concept of Presence. Journal of Computer-Mediated Communication 3(2) (1997)
16. Witmer, B.G., Singer, M.J.: Measuring Presence in Virtual Environments: A Presence Questionnaire. Presence: Teleoperators and Virtual Environments 7(3), 225–240 (1998)
17. Horton, D., Wohl, R.R.: Mass Communication and Para-Social Interaction: Observations on Intimacy at a Distance. Psychiatry 19(3), 215–229 (1956)
18. Klimmt, C., Hartmann, T., Schramm, H., Vorderer, P.: The Perception of Avatars: Parasocial Interactions with Digital Characters. Paper presented at the 53rd Annual Conference of the International Communication Association, San Diego, California (2003)
19. Rubin, A.M., Perse, E.M.: Audience Activity and Soap Opera Involvement. Human Communication Research 14(2), 246–268 (1987)
Abstract. Much of the world’s knowledge is captured in writing, and shared through writing, and as such is inaccessible to the one eighth of the world’s population who are illiterate. We are developing a software system for the use of this population based on speech and images without written text. We have evaluated basic interaction devices and simple interface metaphors to arrive at the design of an overall interface that is attractive to and usable by illiterate people. We report our usability experiments, and describe our system.

Keywords: HCI, illiterate, speech.
The computer was first introduced into Nepal more than three and a half decades ago, when a second generation IBM 1401 was brought in to expedite the analysis of national census data. Over the years, computers have evolved from isolated machines confined to a few organizations into an essential gadget in consumers' homes, but accessibility is still limited to the literate population. Localization, e-governance, e-medicine, and rural telecentres have helped distribute the technology further in rural areas. But access to and use of these technologies remains extremely uneven. This disparity is a reflection of deeper socio-economic inequalities both between and within societies. Computers and the Internet are often considered effective instruments to empower people, reduce poverty and improve people's lives, yet so far they have only deepened already existing inequalities and divisions. Providing ICT support for illiterates and oral communicators can be one of the most effective ways to alleviate this inequality.

On the Sambad project we aim to support illiterate people directly, as part of this move towards the new multimodal literacy. We believe that if non-literate people are supported through appropriate interfaces, they can benefit from computers. We are developing a robust Multimodal Web Authoring and Browsing Tool (MWAB) with which non-literate users will be able to share their knowledge via their web pages using speech instead of text. We are designing computer interfaces that take into account the cultural needs of illiterates. Illiterate users may not be very sophisticated in their experience of technology, and need simple interfaces. We are proceeding cautiously, and will continue to do so, developing a prototype increment and then evaluating it. Ultimately we want functionality which:

• identifies and authenticates the user
• enables composition and editing of multimedia documents of speech and pictures
• enables storage and retrieval of collections of multimedia documents
• communicates documents to others
2 Information Needs Assessment

For the information needs assessment and requirements analysis for Sambad, we surveyed telecentres in rural areas across Nepal. We conducted semi-structured interviews with 453 local people, of whom 277 were illiterate and 176 literate. We asked all the respondents about their familiarity with computers and about the usefulness and applications of computers. 70% of illiterate respondents were from rural areas, of which 67% were female. The literate participants were primarily from telecentre management, telecentre users, and computer entrepreneurs. We observed both rural life in two villages of west Nepal and the services provided by 45 institutions, including cybercafés, community telecentres, FM-radio stations, and NGOs/INGOs. We also conducted 14 separate focus group discussions (4 with non-literate village dwellers, 3 with media forum people, 3 with ICT entrepreneurs, and 4 with development practitioners and local political leaders) in order to find out what kind of information and applications could be useful for these rural illiterate communities.

Though many illiterates believed that computers were not meant for them because they were unable to align with the technology, it was evident from our survey that there is a strong thirst among illiterate people for knowledge, in the form of pictorial manuals or audio books on pest control, farming, bee-keeping, primary health, and so on. From this experience
Sambad- Computer Interfaces for Non-literates
with illiterates, we learnt that many illiterates do at present act as an indigenous, oral knowledge base for their fellow villagers on issues like curing diseases, good farming practices, etc. Efforts to preserve this traditional knowledge, particularly that of indigenous people, are rare. Once lost, this oral knowledge cannot be retrieved [8]. In addition, people in villages are accustomed to indigenous communication in the form of festivals, story telling, folk dance and songs, etc., all of them conveying some message on social values, education, or politics and thereby supporting oral and cultural continuity. We felt the necessity to produce software that addresses this existing oral practice. When we demonstrated our text-free interfaces, we also found that people can understand images and audio irrespective of their literacy level.

The rural community is in need of information on basic health, agriculture, education, governance, etc., in a form they can understand. In response to these community needs, we are building a system that can generate, edit, store, and share content as speech in users' own language and voice, so that both literates and non-literates can access and contribute information on the same platform.

Nepal ranks 138th out of 177 countries in the human development index, and its human poverty index is 38.1 [7]. The rural economy is based mainly on agriculture. Education and information facilities are limited and basic health facilities are very poor in rural areas. There are around 28000 primary and secondary schools across the country, and only 1259 government employed doctors and 89 public hospitals in Nepal [3]. Most illiterate people live in these rural villages. Information on very basic things that affect the livelihood of the villagers, like agriculture, basic health, education and commerce, can have a greater significance to the rural communities.
On our visits to different telecentres in the mid-western and far-western parts of Nepal we found that, over the years, ICT facilities have penetrated rural communities through the initiatives of the telecentre movement in Nepal. To date, 240 rural telecentres have been established in Nepal. However, the information that is available through these centres is mostly text, posted on a website or in print. We witnessed an active role of literate intermediaries during our field visits and interviews, and came across many non-literates entering telecentres to send and receive emails and chat with their children abroad. We found one such case in a telecentre in Nepalgunj (west Nepal), where a woman who could not read or write came into the telecentre and text chatted with her daughter abroad, with the operator there as intermediary. It struck us as ironic that, although she is a regular visitor there, she always has to wait for the operator’s spare time. We doubt whether the intermediary could express her exact feelings to her daughter in words.
3 Text-Free Technology

As a lead into this project in mid-2005, Roger Tucker focused on the composition of speech messages, believing that these should be as easily produced and edited as are written messages. Work in this area goes back to the 1980s on the Etherphone project (Zellweger et al. [22]), part of the wider Cedar project within Xerox in California. This work was later picked up by Roger Tucker and colleagues at Hewlett Packard when looking for speech tools for personal digital assistants (PDAs), calling this ‘speech as
S. Dhakhwa et al.
data’ (Tucker et al. [17]). The idea in both these systems is to display speech in the word-like form shown in Fig. 1, a sequence of speech ‘chunks’ separated by non-speech ‘silences’. In Tucker’s design study the ideas were reviewed not by illiterate people but by another group of writing-disabled people, dyslexics. This proved very useful.
Fig. 1. Taken from Tucker’s Design Report [16] for Sambad
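The display of speech as 'chunks' separated by 'silences' presupposes some way of finding the silences. The sketch below uses a simple frame-energy threshold purely as an illustration; it is not the Etherphone, Hewlett Packard, or Sambad algorithm, and the frame length, threshold, and minimum pause length are all assumed values chosen for the example.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each frame (the last frame may be shorter)."""
    energies = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))
    return energies


def segment_speech(samples, frame_len=160, threshold=0.01, min_silence_frames=2):
    """Split samples into ('speech' | 'silence', start, end) runs by frame energy."""
    voiced = [e >= threshold for e in frame_energies(samples, frame_len)]

    # Group consecutive frames that share a label: [is_voiced, first_frame, end_frame)
    runs = []
    for idx, is_voiced in enumerate(voiced):
        if runs and runs[-1][0] == is_voiced:
            runs[-1][2] = idx + 1
        else:
            runs.append([is_voiced, idx, idx + 1])

    # Absorb very short pauses inside speech into the surrounding chunk
    merged = []
    for i, (is_voiced, start, end) in enumerate(runs):
        interior = 0 < i < len(runs) - 1
        if not is_voiced and interior and end - start < min_silence_frames:
            is_voiced = True
        if merged and merged[-1][0] == is_voiced:
            merged[-1][2] = end
        else:
            merged.append([is_voiced, start, end])

    return [
        ("speech" if v else "silence", s * frame_len, min(e * frame_len, len(samples)))
        for v, s, e in merged
    ]


# Synthetic signal: silence, speech, a one-frame pause, more speech, silence
signal = [0.0] * 320 + [0.5] * 480 + [0.0] * 160 + [0.5] * 320 + [0.0] * 480
chunks = segment_speech(signal)
print(chunks)
```

With a frame of 160 samples (20 ms at an assumed 8 kHz sampling rate), the one-frame pause is folded into the surrounding speech, so the example yields a single speech chunk bounded by two silences. A real recording would need a threshold adapted to its noise level.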
Software to edit speech as illustrated in Fig. 1 had been implemented within Hewlett Packard but is not available to us, so on Sambad we have been developing our own version, using publicly available algorithms, building on developments in speech processing since that early work, and adding some innovations of our own.

3.1 Text Free User Interfaces for Non-literates

The normal interfaces to computers, such as Microsoft Windows, and the one shown in Fig. 1, are relatively complex, with their metaphors of file paths, menus and command lines. They are targeted mainly at literate users and encompass the metaphors of office environments. It is evident that illiterates are equally able to use modern technologies, for instance televisions, radios, telephones, and motor vehicles, if provided with appropriate training and interfaces [18]. However, computer interfaces may include text and unfamiliar metaphors. We believed that using understandable pictures instead of text could be one approach to reducing such complexity. We knew from Medhi et al. [13] that this should be the case, but wanted to find out what kind of interactions illiterate people felt most comfortable with.

We started by assessing basic interaction devices: keyboard, mouse, and touch screen. For this trial we developed a simple text-free audio recorder as a prototype, with pictorial icons and buttons with audio captions. From other studies we know that pictures and audio can be used to guide illiterate users from one interface to another [11]. We designed widgets and UI components using metaphors that we believed would be readily understood by illiterate and technically inexperienced people (see Fig. 2 and 3).
Fig. 2. Screen to select play/record
Fig. 3. Leaf and Rupee notes for rating
The Godavari Experiment

We undertook a usability trial at the Godavari Marble Works just south of Kathmandu. For this and other usability assessments we followed a combined approach involving observation, performance measures, open interviews and focus group methods [2], [14]. The main objective of the experiment was to assess basic interaction devices (keyboard, mouse, and touch screen) to find which would be the most appropriate. Underlying this lay our overarching objective, to find out what form of text-free user interface is appropriate for non-literates. We established a usability laboratory in the marble works, and arranged a number of usability assessment trials over several days, with the subjects shown in Table 1.

Table 1. Subjects for Godavari experiment

Sex       N     Average age   Std. Dev
Male      13    30.92         11.21
Female    23    29.96         9.06
Total     36    30.31         9.74
Each trial was attended by 4 or 5 people. The subjects were initially interviewed to record basic facts about them, in particular their level of literacy. They were then shown a short video on the computer about the importance of education to get them used to the computer and surroundings, and then introduced to using the computer and the interface shown in Fig. 2. This system asked them to do a number of tasks using the interface device being evaluated, before they went on to listen to prerecorded stories or songs by other non-literate people or record their own story or song. During this we recorded all their keystrokes for later analysis, videoed them with two cameras, and observed them, taking notes of all significant behaviour.

The number of subjects was very small, so though differences were found between different categories of subject and between different devices, none were statistically significant except one. Table 2 summarises the timing for interaction devices, showing that the mean time for the touchscreen is more than two standard deviations less than the times for the other devices. From this we concluded that the touch screen was significantly easier to learn and to use. We could have predicted this, but here it was in the experimental evidence.
Table 2. Mean time consumed by participants using different input devices

Used input device   Mean     N    Std. Deviation
Mouse               8.9982   11   2.7455
Touchscreen         4.3638   8    1.4876
Keyboard            8.5125   4    2.4151
Custom Keyboard     8.4750   4    1.1857
Total               7.4756   27   2.9342
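The comparison drawn from Table 2 can be checked directly from the summary statistics. The sketch below assumes that the gap is measured in units of the touchscreen's own standard deviation, since the text does not say which standard deviation was meant; it also confirms that the Total row is the N-weighted mean of the device rows. Variable names are illustrative.

```python
# Summary statistics copied from Table 2: device -> (mean time, N, standard deviation)
table2 = {
    "Mouse": (8.9982, 11, 2.7455),
    "Touchscreen": (4.3638, 8, 1.4876),
    "Keyboard": (8.5125, 4, 2.4151),
    "Custom Keyboard": (8.4750, 4, 1.1857),
}

touch_mean, _, touch_sd = table2["Touchscreen"]

# Gap between each other device's mean and the touchscreen mean,
# expressed in units of the touchscreen standard deviation (an assumption).
gaps_in_sd = {
    device: (mean - touch_mean) / touch_sd
    for device, (mean, _, _) in table2.items()
    if device != "Touchscreen"
}

# The Total row should equal the N-weighted mean of the device rows.
total_n = sum(n for _, n, _ in table2.values())
weighted_mean = sum(mean * n for mean, n, _ in table2.values()) / total_n

for device, gap in gaps_in_sd.items():
    print(f"{device}: {gap:.2f} touchscreen SDs slower")
print(f"N = {total_n}, weighted mean = {weighted_mean:.4f}")
```

Under this reading every other device is between roughly 2.8 and 3.1 touchscreen standard deviations slower. Measured against the larger mouse standard deviation the gap would be smaller, so the conclusion depends on which spread is taken as the yardstick.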
During the experiment we also observed the performance of each participant on their attitude, ability to use and understanding of the software interface. On each screen we graded them Excellent, Good and Poor. From these grades we arrived at an overall grade shown in the bar chart in Fig. 4, confirming the conclusion above.
Fig. 4. User performance on different input devices (A = Excellent, B = Good, C = Poor)
We found that the text-free interface was readily understood by everybody, though some found the spoken instructions too long and stopped attending to them. We have demonstrated this system to a range of people in the context of explaining what we did. Some of the people who have seen this simple record and playback system have seen its potential for recording oral histories and indigenous knowledge.

3.2 Appropriate Metaphors

Nielsen’s second user interface design heuristic is to “speak the user’s language”, and in particular to use appropriate metaphors [14], a point also taken up by Alan Blackwell [1], with Evers [5] emphasizing the user’s cultural background. A metaphor enables people to learn to use a system quickly by mapping real world objects onto software objects [12]. Expectations about how a metaphor object
in the software would behave will depend on the cultural use of the object in the real world, its affordance [4]. In a country like Nepal with diverse ethnicity and culture, the choice of metaphor takes on a locally cross-cultural dimension. We used knowledge from the surveys and our own knowledge of life in Nepal to select metaphors. In our second prototype we incorporated several metaphors, such as “village”, “cupboard” and “exit door”, all closely connected with daily activities in Nepal. These metaphors were assisted by audio captions. We also used an image-based login system. Pictures and photographs are not new in any culture today, though there is some evidence that sequences are difficult for people to memorize [9]. During usability trials we explained pass images by analogy with a “Secret Spell”, which the users accepted well.
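An image-based login of this kind can be sketched as a check on an ordered sequence of chosen pictures. The paper does not describe how Sambad stores or verifies pass images, so everything below is an assumption made for illustration: the image identifiers are hypothetical, and the sequence is simply treated like a password and stored as a salted hash.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for the key derivation


def _derive(image_ids, salt):
    """Hash an ordered sequence of image identifiers with PBKDF2-HMAC-SHA256."""
    secret = "|".join(image_ids).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", secret, salt, ITERATIONS)


def enroll(image_ids):
    """Return (salt, digest) to store for a user's chosen pass-image sequence."""
    salt = os.urandom(16)
    return salt, _derive(image_ids, salt)


def verify(image_ids, salt, digest):
    """Check a login attempt; the order of the images matters."""
    return hmac.compare_digest(_derive(image_ids, salt), digest)


# Hypothetical pass images picked by a user during enrollment
salt, stored = enroll(["cow", "temple", "mango"])
same_order = verify(["cow", "temple", "mango"], salt, stored)
wrong_order = verify(["temple", "cow", "mango"], salt, stored)
print(same_order, wrong_order)
```

Because the order is part of the secret, this design inherits the memorability concern noted above for sequences; an order-independent variant could sort the identifiers before hashing, at the cost of a smaller space of possible secrets.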
Fig. 5. The Village Metaphor
Fig. 6. The Cupboard Metaphor
The village metaphor (Fig. 5) was adopted from a CD produced by UNICEF, Hyderabad, and was also incorporated into a system used in Indian villages [9]. It classifies the available information and is in effect the root directory (the top-level filing cabinet and folder structure), with sub-directories such as home, school, health-post, and market that are selected by clicking. Evers [5] used a picture of a university campus in an equivalent manner in her research. The cupboard is the most familiar storage device used by people in Nepal, familiar even for the storage of paper files, as anybody who has visited a government office will have experienced. We presented our users with a cupboard UI component to hold their files, with drag and drop to move them. In fact there are two cupboards, one private and one public (see Fig. 6). We included simple file previewing, such as audio playback with speed adjustment and volume control, and an image and album preview. The combination of village and cupboard metaphors gives a fixed three-level hierarchy.

Bungamati Experiment

We undertook usability evaluations of these metaphors at a telecentre located just south of Kathmandu. We had developed a prototype with a login system, the village and cupboard metaphors, and a photo/album viewer. We had 13 female participants, aged from 23 to 51, who had recently joined the CLC (Community Learning
Center) program for illiterates at the Bungamati Telecentre. These participants were functionally illiterate, though some of them could read and write a few letters. From our first experiment we had concluded that the touch screen was the most appropriate user interaction device for illiterates; however, in our second experiment we used only the mouse, because the participants had already seen a mouse and six of them had some experience of using one during literacy classes. The Sambad software is nevertheless enabled for touch screens.
Fig. 7. Performance Grading based on previous experience (A = Excellent, B = Good, C = Poor)
We assigned fixed tasks to individuals and two of us observed them, making notes. We found that 4 of them had difficulty controlling the mouse and using the widgets with it. One of the participants found it difficult to comprehend audio captions in Nepali while the rest had no such problems; we concluded that this particular subject had difficulty because Nepali is her second language. We also found that the participants with prior experience in using the mouse felt at ease using the Sambad UI widgets. The first-time computer users required an introduction to the computer and its peripherals, such as how to use the mouse, before using the software, and then needed more time to understand and use the software (see Fig. 7). Subjects without prior experience of computers were graded C, with only a single A, while those who had prior experience all got A or B. The participants who got C had major difficulties in using the slider component as they did not have full mouse control. Since the majority of the participants could use the metaphors without much difficulty, we conclude that the metaphors we developed were usable.

We found that the participants felt that our metaphors were interesting and entertaining, because the UI was colorful and simple and allowed users to view images and play recorded speech. One participant was very enthusiastic about the interface because she had heard from her daughter that it takes at least 2 months for literate people to learn computers, but she was already using the computer through the Sambad interface. She said, “Wow! It’s damn easy to use; I will take this home and use it in my daughter's computer.”
3.3 A Note on Current Technology

User interfaces without text rely on audio technologies. We reviewed VoiceXML [21] but found it impractical for Sambad. Instead, we use extended Java Swing components with audio encoded in the Speex format. Speex is an open-source, patent-free speech compression library [20]. We found that an 8 kHz, 16-bit audio file compressed with the Speex algorithm is of acceptable quality for voice instructions. We have developed a TTS system for Nepali using the Festival and Festvox systems, and we intend to incorporate Nepali TTS into the Sambad system for all texts. We have produced a basic speech corpus with letter (and syllable) to sound rules, a 6,000-word lexicon of syllabification, and a lexicon of exception pronunciations. We are working in Nepali, though the technology we are developing, apart from TTS, is independent of any specific language. The user interface is localizable in terms of audio captions and culture-specific icons, pictures and logos.
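To give a feel for why compression matters for stored voice instructions, the following minimal Python sketch computes the size of an uncompressed 8 kHz, 16-bit mono recording and compares it with the size at a constant codec bitrate. The bitrate and the caption length are illustrative assumptions, not figures from the paper, and the function names are hypothetical.

```python
# Raw PCM parameters taken from the text: 8 kHz sampling, 16 bits per sample.
SAMPLE_RATE_HZ = 8_000
BITS_PER_SAMPLE = 16


def raw_pcm_bytes(seconds: float, channels: int = 1) -> int:
    """Size in bytes of uncompressed PCM audio of the given duration."""
    bytes_per_second = SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * channels
    return round(seconds * bytes_per_second)


def encoded_bytes(seconds: float, bitrate_bps: int) -> int:
    """Approximate payload size at a constant codec bitrate.

    Container and framing overhead are ignored.
    """
    return round(seconds * bitrate_bps / 8)


# ASSUMPTION: 8 kbit/s is an illustrative bitrate, not a value from the paper.
ASSUMED_BITRATE_BPS = 8_000
# ASSUMPTION: a hypothetical 10-second audio caption.
caption_seconds = 10

raw = raw_pcm_bytes(caption_seconds)
packed = encoded_bytes(caption_seconds, ASSUMED_BITRATE_BPS)
print(f"raw PCM: {raw} bytes, encoded: {packed} bytes, ratio: {raw / packed:.1f}x")
```

Under these assumptions the raw stream is 128 kbit/s, so any codec bitrate well below that shrinks each caption proportionally; the actual bitrate used in Sambad is not stated in the text.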
4 Conclusion

We envisage a situation where access to computers for illiterate people will be possible, so that they can undertake all the basic functions that literate people take for granted. The interface to the computer will be different: visual, with speech and little or no writing. It will also be less cluttered, with most of the complexity of normal operating systems hidden. We envisage that the computers used will be in publicly accessible community telecentres or knowledge centres. However, we also recognize that for somebody who is illiterate such places may seem forbidding, and that there will need to be some persuasive reason for entering. In future we intend to extend our system to be used by visually impaired and physically disabled people so that they can also benefit from computers and the Internet. We also want to extend the system to support literacy education. We might even contemplate a future such as that suggested by Harris (2000): "It is by no means out of the question that in some parts of the world literacy will simply be bypassed by the advent of new communications technologies which make it unnecessary to teach the skills of reading and writing to the whole population" [6, p. 12]. Perhaps Sambad, as we develop it, will be that new technology.

Acknowledgments. We would like to thank the Leverhulme Trust for funding this project, and Madan Puraskar Pustakalaya for hosting the work.
References

1. Blackwell, A.F.: Metaphor in Diagrams. University of Cambridge (1998)
2. Barnum, C.: Usability testing and research. Longman, New York (2002)
3. Central Bureau of Statistics, Government of Nepal (2006) http://www.cbs.gov.np/Nepal%20in%20figure/nepal%20in%20figures%202006.pdf
4. Norman, D.: The Design of Everyday Things. Currency; Reissue edition (1990)
5. Evers, V.: Cross-cultural understanding of metaphors in interface design. Cultural Attitudes towards Technology and Communication, London (1998)
6. Harris, R.: Rethinking Writing. Continuum International Publishing Group, Academic Press (2005)
7. Human Development Report. UNDP (2006) http://hdr.undp.org/hdr2006/statistics/countries/country_fact_sheets/cty_fs_NPL.html
8. Indigenous Peoples of Nepal and Traditional Knowledge. In: International Workshop on Traditional Knowledge, Panama City (September 21-23, 2005)
9. Nielsen, J.: Designing for the cultural “other” - A global perspective on ICT and illiteracy (2007) http://ideas.repec.org/p/hhs/cbsinf/2005_011.html
10. Kress, G., Van Leeuwen, T.: Multimodal discourse: The modes and media of contemporary communication. Hodder Arnold (2001)
11. Goetze, M., Strothotte, T.: An Approach to Help Functionally Illiterate People with Graphical Reading Aids. Smart Graphics Symposium, UK (2001)
12. Stefan, M., Seetharam, D.: WeatherTank: Interface for Non-literate Communities and Ambient Visualization Tool (2006) http://alumni.media.mit.edu/ deva/projects.shtml
13. Medhi, I., Sagar, A., Toyama, K.: Text-Free User Interfaces for Illiterate and Semi-Literate Users. In: Paper presented at the International Conference on Information and Communications Technologies and Development, Berkeley, CA (May 25-26, 2006)
14. Nielsen, J., Jakob, N.: Usability Engineering. Academic Press, San Diego (1993)
15. Ong, W.J.: Orality and Literacy: The Technologizing of the Word. Routledge (1982)
16. Tucker, R.: An Audio-Visual Webpage Editor for Limited Literacy Users. Report commissioned by Pat Hall prior to starting (2005)
17. Tucker, R., Robinson, A.J., Christie, J., Seymour, C.: Recognition-Compatible Speech Compression for Stored Speech. In: Proceedings of ESCA ETRW Workshop on Accessing Information in Spoken Audio, pp. 69–72 (April 1999)
18. Understanding Non-Literacy as a Barrier to Mobile Phone Communication (2006) http://research.nokia.com/bluesky/non-literacy-001-2005/index.html
19. UNESCO Institute for Statistics (2006) http://www.uis.unesco.org/ev.php?URL_ID=6401&URL_DO=DO_TOPIC&URL_SECTION=201
20. Valin, J-M.: The Speex Codec Manual v. 1.0.4. (2006) http://www.speex.org/
21. Voice Extensible Markup Language (VoiceXML) Version 2.0. World Wide Web Consortium (2006) http://www.w3.org/TR/voicexml20/
22. Zellweger, P.T., Terry, D.B., Swinehart, D.C.: An overview of the Etherphone System and its Applications. In: Proc 2nd IEEE Conf on Computer Workstations, pp. 160–168 (1988)
The Balancing Act Between Computer Security and Convenience

Mayuresh Ektare and Yanxia Yang

Trend Micro, Incorporated, 10101 N De Anza Blvd, Cupertino, California, USA
Abstract. In the past, computer virus writers developed malicious code to become famous. This trend has been steadily changing, and we now see a new breed of malicious code written for financial gain. Computer users are vulnerable to such attacks, and security has become a domain that affects every computer user. Users often find themselves performing a balancing act between securing their systems and enjoying the “easy life”. Humans are highly task oriented and tend to discount security if it gets in their way. Some users are unaware of the risks posed by computer viruses, spyware and unprotected networks, while several informed users compromise their security for convenience. With the growing digital infrastructure, the need to network various devices is even more pronounced, which adds to the complexity of protecting it. Few users understand the difference between securing their network and protecting their system from viruses and spyware, and the varying degree of security awareness among users translates into inadequate protection for some networks. This paper reports findings from a user research study that describe the deficiencies and flaws in today’s security software, outlines user behavior to understand users’ perspective on computer and network security, and describes why security is sometimes compromised for convenience. A “virtual gateway” security service model is also proposed to make security transparent to users by providing protection at the Internet service provider level.

Keywords: Computer security, Viruses, Spyware, convenience, user behavior, user experience.
User behavior varies slightly with users’ security knowledge. This paper reports the findings of a user research study performed to identify typical user behavior and the deficiencies and flaws in today’s security software, and proposes a solution to address these issues.
2 Method

A survey was conducted to understand user behavior and to identify when security is compromised for convenience. A total of 110 computer users with varying backgrounds answered questions about their typical behavior, their knowledge of computer security, the tools they use to protect themselves, and the difficulties they face. The gathered data was then analyzed to identify what makes some users compromise their security at times. The data also offers insight into the percentage of users who secure their network to avoid unauthorized access and of users who feel comfortable connecting to someone else’s unsecured network. This behavior helps to determine users’ perception of their vulnerability to a security threat.
3 Results and Discussions

It was noted that when users are faced with a choice between security and convenience, convenience is what they opt for. The survey allowed us to gather user behavior data on different aspects of securing a computer and the home network. The following sections present the analysis of the collected data.

3.1 Security Software

The analysis revealed that although 98% of users understood the risks of leaving their computers unprotected, 36% of users reported that they at times compromise security for convenience. This happens for many reasons, especially when securing their computers gets in the way of quickly achieving their goals. Users reported that they turn off anti-virus protection to speed up the computer, because security software is often resource intensive. Users also mentioned turning off scheduled or automatic updates and scanning in security products to avoid being distracted while using their computers. Users are task oriented and do not want to be interrupted in order to ensure they are well protected. If interrupted, users tend to compromise security for convenience. Average users generally turn their computers on and connect to the Internet to perform some task. Security software can download its updates (a.k.a. virus definition or pattern files) only when the computer is connected to the Internet, but performing this operation while the user is engaged in a task typically has an adverse effect. Users often compromise their security by not downloading the new pattern files, by turning off automatic updates or by disabling the security software entirely. If the process were transparent to users, they would not mind the security software running in the background. The biggest complaint users had with today’s security software was that
it required them to make decisions on items about which they had little or no knowledge. By prompting users to make various choices, the security software interrupts the user’s workflow, thereby forcing the user to choose between security and convenience. To determine deficiencies and flaws in today’s security software, several survey questions were targeted at understanding the problems users face. The data revealed that 40% of users believed their security software restricted access to networked computers, making it difficult to share files within the home network. Because security software forces users to make several decisions along the way, users perceived that it makes sharing files more difficult. 21% of users mentioned that it is difficult to configure the security software, while 22% reported that it prevents them from using their computer the way they are accustomed to. Half of all users thought that it slowed down their computer, and 43% said that they did not believe the security software completely protected them. Although only 6% of users did not have any security software on their computers, 82% thought they were still vulnerable and could get infected by a computer virus. This data suggests that users do not believe that their security software fully protects them from malicious threats. The survey also examined how users select a security vendor and what factors influence their decision. The data revealed that computer system vendors played a significant role in suggesting and providing pre-packaged security tools to users: 35% of users continued using the security software suggested by the computer system vendor. Users who chose their own security tools typically had more knowledge and awareness of security issues. Ease of use turned out to be the most important factor in selecting a security vendor, closely followed by the quality of protection offered.
3.2 Securing Home Networks

Few users understand the difference between securing their network and protecting their system from viruses and spyware. The varying degree of security awareness among users translates into inadequate protection for some networks. Although 98% of users understood the security risks, 18% left their home networks vulnerable to unauthorized access. A few users mentioned that their networks were unsecured because they did not know how to secure them, while some thought securing the network might make it difficult to share files within it. This further suggests that, for a user, ease of use has a higher priority than security. 49% of users reported that they do not hesitate to connect to an unsecured wireless network belonging to someone else. This provides more evidence that users are task oriented and ignore the risk that the wireless network owner may intercept their Internet traffic.
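The percentages reported in Sections 3.1 and 3.2 can be read as approximate head counts out of the 110 respondents. The short Python sketch below does this conversion. It assumes that every percentage is taken over all 110 respondents, which the text does not state explicitly, and the dictionary keys are shorthand labels introduced here for illustration.

```python
RESPONDENTS = 110  # survey size reported in Section 2

# Percentages as reported in Sections 3.1 and 3.2.
# ASSUMPTION: each one is a share of all 110 respondents.
REPORTED_PERCENT = {
    "understand the risks": 98,
    "sometimes trade security for convenience": 36,
    "software hinders file sharing": 40,
    "software slows the computer": 50,
    "not fully protected by software": 43,
    "home network left open": 18,
    "would join an unsecured network": 49,
}


def approx_count(percent: int, total: int = RESPONDENTS) -> int:
    """Round percent of total to the nearest whole respondent.

    Integer arithmetic avoids floating-point surprises at exact halves.
    """
    return (percent * total + 50) // 100


for label, percent in REPORTED_PERCENT.items():
    print(f"{label}: {percent}% is roughly {approx_count(percent)} of {RESPONDENTS}")
```

Because the published percentages are themselves rounded, the resulting counts are estimates and may differ from the true counts by one respondent.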
4 Proposed Solution

The biggest flaw of current security software is that it puts the user in the driver’s seat and requires them to make decisions on things that they have little knowledge about.
A solution to users’ problems emerged once their frustrations with today’s security software were well understood. With the gathered data indicating typical user behavior, taking the security burden off the average user was a clear way to address intentionally compromised security.

Protection at the “Virtual Gateway”

Enterprises deploy a layered security approach, but this model has not been ported to benefit average home computer users. A layered security model involves deploying multiple detection points in a network to identify malicious threats and protect the user’s digital infrastructure. Unlike single-point detection at the computer, it introduces an additional layer at the gateway to scan for and identify threats. For a home computer user, this gateway layer could be deployed and managed by the Internet Service Provider (ISP) at their site. With such layered security, the need for having security software running on the computer while connected to the Internet via the ISP is diminished.

Fig. 1. Proposed Virtual Gateway Solution

This essentially offloads the burden of configuring and managing the security software from the user and provides “Security-as-a-Service”. With the ISP responsible for initial configuration and full control available to the user thereafter, the user does not have to make decisions on items they have little or no knowledge about. The security software on the computer could turn itself on and off based on whether the layer at the virtual gateway is active. This approach also helps speed up the computer by turning off the resource-intensive security software when it is not necessary. These days, most threats are posed by the transfer of data over the Internet rather than from physical media. The virtual gateway protects the user from threats over the Internet but requires users to switch to the local security software for protection from threats that spread via physical media. With the “virtual” gateway approach, users still have to secure their home networks to prevent unauthorized access.
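The on/off behavior described for the local security software can be expressed as a small decision rule. The Python sketch below is only an illustration of that rule under stated assumptions: the class, field and function names are hypothetical, and how a client would actually detect an active gateway layer or physical media is not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class ProtectionContext:
    # Hypothetical inputs; detecting them is outside the scope of this sketch.
    gateway_active: bool          # is the ISP-side scanning layer in the path?
    removable_media_in_use: bool  # is data arriving via physical media?


def local_scanner_should_run(ctx: ProtectionContext) -> bool:
    """Decide whether the on-computer security software should be active.

    Rule sketched from the text: local scanning may be switched off while the
    gateway layer is active, except when threats could arrive via physical
    media, which the gateway does not cover.
    """
    return (not ctx.gateway_active) or ctx.removable_media_in_use


# Gateway active and no physical media: local scanning can rest.
print(local_scanner_should_run(ProtectionContext(True, False)))
# Gateway active but physical media present: local scanning is needed.
print(local_scanner_should_run(ProtectionContext(True, True)))
# No gateway protection (e.g. another network): local scanning is needed.
print(local_scanner_should_run(ProtectionContext(False, False)))
```

Keeping the rule as a pure function of two observable conditions means the user is never asked to decide; whether those conditions can be observed reliably is an open design question that the proposal leaves unaddressed.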
5 Conclusions

Users often compromise security if they have to choose between security and convenience. Users remain better protected when security is transparent and does not interrupt their workflow. Although the majority of users have security software installed, they believe that they are not completely protected and remain vulnerable to malicious threats. The proposed “virtual gateway” model puts the security layer at a place in the network where it has the least impact on the user’s workflow. This provides security as a true service that takes away the need for installing, configuring and maintaining the security software. The suggested “virtual gateway” service model eliminates the balancing act users currently have to perform between security and convenience.
What Makes Them So Special?: Identifying Attributes of Highly Competent Information System Users

Brenda Eschenbrenner and Fiona Fui-Hoon Nah

University of Nebraska-Lincoln, College of Business Administration, Lincoln, Nebraska 68588-0491, USA
Abstract. Information systems (IS) usage is predominant in organizations. The effectiveness and strategic potential of IS, however, depend upon the individuals within the organization who use or rely on IS, both directly and indirectly, to perform their job functions. Individuals differ in their abilities to use IS effectively to maximize task performance. Some individuals far exceed their peer group and realize greater performance levels than others. This research proposes to understand the attributes of these individuals using the Repertory Grid Technique. The technique will be used to identify attributes of highly competent IS users, defined as individuals who are able to utilize IS to its fullest potential and obtain the greatest performance. The attributes identified may point to factors that can then be fostered in other IS users to improve performance.

Keywords: User competence, user attributes, Repertory Grid Technique.
achieved in organizations. For example, Boudreau [3] studied a state institution’s successful implementation of an enterprise system and found different degrees of usage, with some employees struggling to use the new system. One of the users in Boudreau’s study noted that employees “know the buttons to push for their task, but not necessarily what is around [the system]” [3]. Boudreau’s research also identified individuals in the same organization who became functional, experienced users of the system, while others remained less functional and relied on their more proficient colleagues for assistance. These more proficient users became familiar with the system and went beyond rudimentary use to develop processes that better suited their needs. Understanding which attributes contribute to these users’ IS proficiency is the focal point of this research. Identifying these attributes will provide an opportunity to determine which of them can be encouraged or enhanced through training in other IS users. Encouraging or training these attributes may then lead to improved IS usage and greater realized benefits from IS.

Earlier research has studied factors contributing to individuals’ intentions to adopt technology. For example, Compeau and Higgins [4] found that computer self-efficacy influences computer usage and user expectations. Agarwal and Prasad [5] propose personal innovativeness as a moderator in understanding perceptions of new information technology. Chung and Tan [6] studied the antecedents of perceived playfulness and found focused attention (a user’s attention being completely absorbed in the interaction) and control (the perception of being in charge of a given activity) to be important cognitive dimensions. Our research differs from earlier research in that we are interested in understanding why some users are highly competent in utilizing IS and others are not.
Our research question is important because intentions to use or adopt technology, which have been studied extensively in the MIS literature, do not necessarily translate into a quality of IS use that delivers benefits to organizations. Understanding the attributes of highly competent users who are able to achieve maximum performance from an IS is therefore important. Consequently, the question is: “What makes highly competent users so special that they are able to achieve the maximum performance from an IS whereas others are not able to do so?” More specifically, this research focuses on answering the question: “What attributes of IS users influence their ability to fully utilize technology to obtain the greatest benefits from IS?”
2 Literature Review and Theoretical Background

Research in the literature varies in its approach to studying IS usage. For example, the Technology-to-Performance Chain (or Task-Technology Fit) Model recognizes the impact that an appropriate match between technology and task can have on an individual’s performance [7]. Various models that have been used to explain intentions to use IS and perceptions of IS usage behaviors include Theory of Reasoned Action (TRA), Theory of Planned Behavior (TPB), Technology Acceptance Model (TAM), and the Decomposed Theory of Planned Behavior [8-12]. The primary focus of these models is on behavioral intentions and factors that predict the amount of IS usage. However, the amount of IS usage does not necessarily translate into
quality of IS use. Bagozzi and Warshaw [13] noted that, “Since, by definition, reasoned behaviors are not subject to performance impediments, they cannot be considered goals per se. However, when impediments to performance do exist, even if only in the mind of the actor, actual performance will be problematic” [13]. Certain users may then rise to the occasion and obtain superior performance while others may not. Bagozzi and Warshaw introduced the Theory of Trying and tested the Theory of Goal Pursuit and Theory of Planned Behavior in their research, while Beaudry and Pinsonneault [14] expanded upon previous models to study adaptation of technology by developing the coping model of user adaptation. Beaudry and Pinsonneault acknowledge the disruption that adapting or modifying technology can introduce in the work environment and identify strategies that users can implement to cope. For those individuals who follow a problem-focused approach, appraise the outcome of the IT event as an opportunity, and assess that they have control over the situation, a strategy of Benefit Maximization is highly probable. The participants who adopted this approach were quoted as spending hours trying new items on the system, finding new uses for the system, utilizing support services to learn the system, discovering other capabilities by trial and error, and exploring new methods of conducting their business functions. To further explain individual technology usage, other researchers have examined the construct of personal innovativeness or investigativeness and its influence on perceptions and usage intentions [5, 15]. Personal innovativeness is defined as “willingness of an individual to try out any new information technology” [5]. As Bandura [16] notes in reviewing adoption determinants, “the acquisition of knowledge and skills regarding innovations is necessary but not sufficient for their adoption in practice” [16]. Hurt et al.
[17] developed a scale to measure innovativeness based on their view that innovativeness is a personality construct and their definition of innovativeness as a willingness to change. The scale was developed using Rogers and Shoemaker’s [18] characteristics of five innovativeness categories: innovator, early adopter, early majority, late majority, and laggard. In reviewing this research along with Perrewe and Spector [19] and Nicotera et al. [20], the following themes emerged as elements of innovativeness: openness to experience, ambiguity tolerance, rationality, intelligence, optimism, motivation, extroversion, opinion leadership, and resourcefulness. Rank et al. [21] differentiate creativity and innovativeness as follows: “creativity refers to idea generation, whereas innovation refers to idea implementation… Creativity is truly novel, whereas innovation can be based on ideas that are adopted from previous experience or different organizations” [21]. In applying these definitions, one can assess that the participants classified as Benefit Maximizing in Beaudry and Pinsonneault’s study were exhibiting both creativity and innovativeness in their approach to adapting and using IS. Also, they were doing so to such a significant extent that more effective, strategic, and productive benefits were possible. Amabile [22] identifies Components of Creativity as comprising domain-relevant skills (or expertise), creativity-relevant skills (or creative thinking), and task motivation. However, attributes such as the application of an IS user’s innovativeness and creativity have not been fully explored and studied in the literature.
Other research has used or generated concepts for user attributes that can contribute to users’ abilities to utilize IS to achieve maximum performance, as follows.

1. Mindfulness. Butler and Gray [23] address the issue of operationalizing technology in a reliable manner within an organization and doing so with a mindfulness approach. Organizations strive to achieve a certain level of reliable outcomes even with technology that is not entirely reliable. Reliability can be achieved through structured approaches, but a mindfulness approach is also advocated. The mindfulness approach represents greater flexibility in perceiving and interpreting events and breaking existing frameworks to create new approaches or solutions. Individual mindfulness, in particular, includes reasoning about new phenomena, viewing situations from multiple perspectives, evaluating similarities and differences, recognizing the features of the present issue, and orienting in the current situation.

2. Theory of Flow. Research conducted by Ghani and Deshpande [24] and Ghani [25] demonstrates the use of the theory of flow in studying individuals’ learning and use of computers, and the connection that flow has with an individual’s exploratory behavior. Csikszentmihalyi [26] defined flow as “the state in which people are so intensely involved in an activity that nothing else seems to matter; the experience itself is so enjoyable that people will do it even at great cost, for the sheer sake of doing it.” The explanation of individual behaviors provided by the theory is context driven and situation dependent. The optimal flow experience is described as a positive state of mind that affects a user’s exploratory behavior and thereby the extent and, possibly, the quality of IS use.

3. Self-efficacy. Bandura [16] provided the following definition for self-efficacy: “People’s judgments of their capabilities to organize and execute courses of action required to attain designated types of performances. It is concerned not with the skills one has but with judgments of what one can do with whatever skills one possesses” [16]. Compeau and Higgins [4] identify computer self-efficacy as a user’s judgment of their own ability to utilize technology correctly. Self-efficacy, as described in the research, is centered on the individual’s perceptions of their abilities or the individual’s self-confidence in achieving a goal.

4. Symbolic adoption. Symbolic adoption is defined as a user’s voluntary mental acceptance of technology [27-29]. The construct is utilized to set apart acceptance of a system when use is other than voluntary (i.e., symbolic adoption taps largely on one’s genuine behavioral intention in a mandatory context). Agarwal and Karahanna [28] identified dimensions of symbolic adoption as mentally accepting the technology, committing to its usage, positive evaluation of the return to be obtained from using the technology (worthiness), and high levels of enthusiasm and eagerness to engage the technology. Symbolic adoption has been noted as an antecedent of intentions to explore and defined as a user’s mental acceptance, versus the user’s full, productive engagement of a system. The research also indicates a correlation with self-determined motivation.
3 Focus of Research and Research Question

The aim of this research is to identify and understand the attributes of highly competent users of IS, that is, those who are able to achieve superior performance outcomes from IS use. Several constructs have been used to identify these high-performing IS users. Marcolin et al. [2] define user competence as “the user’s potential to apply technology to its fullest possible extent so as to maximize performance of specific job tasks” [2]. Other descriptions characterize superior IS usage as being able to “correctly exploit the appropriate capabilities of software in the most relevant circumstances” [3] and to successfully deploy the Benefit Maximization approach [14]. Adapting from Marcolin et al. [2], we define the highly competent user as one who is able to utilize IS to its fullest potential and obtain the greatest performance from IS use. The focus of this research is to develop an in-depth understanding of the attributes of highly competent users, as defined above, and to relate the attributes identified in this study to existing models, theories, and constructs. In so doing, we intend to generate a better understanding of how highly competent users achieve the levels of performance that they do. In other words, our research question is: “What are the attributes of highly competent users of IS that influence their ability to fully utilize technology to achieve the maximum performance from IS use and obtain the greatest benefits from IS?”
4 Research Method and Procedures

To identify attributes of highly competent users of IS, the Repertory Grid (RepGrid) Technique [30] will be utilized. The strength of the RepGrid technique is in capturing individuals’ personal constructs that bring meaning and understanding to various phenomena [31]. Hence, it is an appropriate technique to uncover attributes of highly competent users. The RepGrid technique consists of three major components: elements, constructs, and links [32-34]. The research procedure involves interviewing users who utilize IS on a regular basis and asking them to identify highly competent IS users that they know as well as least competent IS users that they know. Details of the RepGrid technique are explained in Stewart [31] and Fransella et al. [35]. The research procedures consist of six main steps explained below:

Step 1: Participant Selection. The research participants will be selected from a variety of industries. The sample size for the study will be determined by the point of saturation where no new constructs emerge from interviews with additional subjects. Tan and Hunter [34] indicated that a sample size of 15 to 25 is generally adequate to reach the saturation point. At the beginning of each interview, the participant will be asked to identify the number of IS users that they know fitting the definition of highly competent users of IS (or those considered as close as possible to fitting the definition of highly competent users) as well as the number of IS users that they know fitting the definition of least competent users. These numbers identified will be utilized in the following step.
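The saturation criterion in Step 1 can be illustrated with a short Python sketch that tracks which constructs have already been seen and reports when additional interviews stop contributing new ones. This is a sketch under assumptions: the function name, the `patience` parameter (how many consecutive interviews without new constructs count as saturation) and the example construct labels are all hypothetical, since the text only states that sampling stops when no new constructs emerge.

```python
from typing import Iterable, Optional, Set


def interviews_until_saturation(
    interviews: Iterable[Set[str]], patience: int = 2
) -> Optional[int]:
    """Return the number of interviews conducted when saturation is detected.

    Saturation is taken to mean `patience` consecutive interviews that add no
    construct not already elicited. Returns None if it is never reached.
    """
    seen: Set[str] = set()
    quiet_streak = 0
    for count, constructs in enumerate(interviews, start=1):
        new_constructs = set(constructs) - seen
        seen |= new_constructs
        quiet_streak = 0 if new_constructs else quiet_streak + 1
        if quiet_streak >= patience:
            return count
    return None


# Hypothetical construct labels elicited in four successive interviews.
example = [
    {"curious", "patient"},
    {"curious", "persistent"},
    {"patient"},
    {"persistent", "curious"},
]
print(interviews_until_saturation(example, patience=2))
```

A larger `patience` value makes the stopping rule more conservative, at the cost of more interviews; the choice would need to be justified in the study design.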
Step 2: Select Elements. The next step is to solicit elements, which are the focal point of the study [34]. In this research, the potential elements will be IS users with whom the participant is familiar and who either currently work with or have previously worked with IS. The participant will first be asked to identify the top three highly competent IS users among those they identified in step 1. The participant will be asked to specifically identify highly competent IS users, rather than just any IS users, to ensure that the constructs associated with highly competent users can be identified. Considering that a participant might know a significant number of IS users who may not all be classified as highly competent, selecting any IS users may not capture the highly competent users and, therefore, may not allow the relevant constructs to be brought forth. The participant will then be asked to identify three IS users whom they know well and who are the least competent in terms of their ability to utilize IS. These six identified users will be included in the pool of elements for the RepGrid study. As Fransella et al. [35] note, “elements should be within the range of convenience of the constructs used…they should be representative of the area being investigated” [35]. Therefore, the definitions of IS as well as highly competent users are supplied to ensure that the elements are within the range of convenience (i.e., relevant to the context of the study) and represent this study’s area of interest: attributes of highly competent IS users. According to Stewart [36], one method of selecting elements is to provide the participant with a category within which elements should fall and then allow the participant to select the elements. An example provided by Stewart was to identify the four best and then the four least effective managers that the participants knew.
To make the constructs generated as rich as possible for this study and to capture a holistic view of a highly competent user, the three least competent users are also identified so that attributes that clearly distinguish the two groups of users (i.e., highly competent versus least competent users) are extracted from the participant’s personal constructs. If, instead, merely average users were selected as elements, certain attributes may be harder to generate because some of the essential attributes may overlap: an average user may have a few attributes of a highly competent user as well as some attributes of least competent users. As such, the attributes associated with highly competent users may not emerge as part of the triadic approach to identifying similarities and differences in step 3. Therefore, we elected to pursue the strategy used by Stewart [36] to elicit as rich and inclusive a set of constructs as possible to understand highly competent IS users. Two additional elements that represent the extreme ends of the bipolar constructs, an Ideal User and an Incompetent User, will also be included in the pool of elements to support the construct elicitation process. Each element is listed on a separate card and this complete set of eight elements is then utilized in step 3.
Step 3: Identify Constructs. The construct identifies the interpretation of the elements [34]. In so doing, bipolar labels can be used to divulge a deeper understanding through the development of contrasts. According to Fransella et al. [35], individuals interpret events with the use of bipolar dimensions, or personal constructs, with which they can identify what some person/place/thing is and what it is not.
For example, one set of the bipolar constructs developed by Hunter [33] in researching the qualities of an excellent system analyst was “user involvement-lack of user involvement.” Pervin [37] quotes Bonarius [38] in recognizing that the standardized use of the RepGrid provides a stable and representative set of constructs.
The research participant will first be asked to identify constructs using the triadic approach. More specifically, three elements will be randomly selected by the researcher and the participant will be asked to identify how two of them are similar but different from the third in the context of their ability or inability to utilize IS to its full potential to achieve maximum performance. After a construct is identified, confirmation will be solicited to identify the positive or negative end of the construct. The participant will then be asked to identify the opposite end of the bipolar construct. Also, the laddering approach will be utilized in which questions such as “how” and “why” will be asked to gain further insight into the meanings of the participant’s constructs [34]. Step 4: Develop Links. Links illustrate the relationship between elements and constructs, from the research participant’s perspective, as well as interpretations of similarities and differences [34]. In this research, the participant will first be asked to physically arrange the elements’ cards so they are ranked in terms of representing their relative positions on the bipolar constructs identified. If elements are construed as being the same, they will be placed together so that the participant is not forced to rank one over the other. Then, the participant will be asked to rate the elements on a 1 to 9 scale, with 1 being the negative end and 9 the positive end. In the rating process, the Ideal User would be given a 9 while an Incompetent User would be given a 1 for all constructs. A 9-point scale is utilized considering the maximum number of elements possible is 8 (3 highly competent users, 3 least competent users, 1 Ideal user, and 1 Incompetent user). The use of a 9-point scale allows the participant to provide a separate or similar rating to elements if they choose without being constrained by the scale. 
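The triad draw in Step 3 and the rating rules in Step 4 can be sketched as follows. This is a minimal illustration, not the authors' instrument: the element names are placeholders, the example construct and ratings are invented, and skipping triads that were already presented is an assumption (the text only says the three elements are selected at random).

```python
import random
from itertools import combinations

# Eight element cards as described in the text; the names are placeholders.
ELEMENTS = [
    "Competent 1", "Competent 2", "Competent 3",
    "Least 1", "Least 2", "Least 3",
    "Ideal User", "Incompetent User",
]
SCALE_MIN, SCALE_MAX = 1, 9


def draw_triad(used, rng):
    """Randomly pick three elements, skipping triads already presented.

    Skipping repeats is an assumption; the paper only says the triad is random.
    """
    remaining = [t for t in combinations(ELEMENTS, 3) if frozenset(t) not in used]
    triad = rng.choice(remaining)
    used.add(frozenset(triad))
    return triad


def record_ratings(grid, construct, ratings):
    """Add one construct row to the grid after checking the rating rules."""
    if set(ratings) != set(ELEMENTS):
        raise ValueError("every element needs exactly one rating")
    if any(not SCALE_MIN <= r <= SCALE_MAX for r in ratings.values()):
        raise ValueError("ratings must lie on the 1 to 9 scale")
    if ratings["Ideal User"] != SCALE_MAX or ratings["Incompetent User"] != SCALE_MIN:
        raise ValueError("Ideal User must be 9 and Incompetent User must be 1")
    grid[construct] = dict(ratings)


rng = random.Random(2007)   # seeded only to make the sketch repeatable
used_triads = set()
triad = draw_triad(used_triads, rng)

grid = {}
record_ratings(grid, "explores new features / sticks to routine", {
    "Competent 1": 8, "Competent 2": 8, "Competent 3": 7,   # ties are allowed
    "Least 1": 2, "Least 2": 3, "Least 3": 2,
    "Ideal User": 9, "Incompetent User": 1,
})
```

Storing the grid as a mapping from construct to element ratings keeps each row self-contained, so a construct added late in the interview does not disturb earlier rows.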
Steps 3 and 4 will be repeated until no new constructs emerge or the point of redundancy is reached. Reger [39] indicates that previous research identifies seven to ten triads to be sufficient.
Step 5: Visual Focusing & Review. After the grid’s completion, visual focusing will be utilized in which the participant will be asked to review the grid and evaluate the ratings given to each element for the respective construct to ensure they agree with what has been accomplished. Also, the participant will be asked if the ratings given to the respective elements represent the participant’s conception of an ‘Ideal User’ and ‘Incompetent User.’ To further verify the reliability of the constructs elicited, during the final stage of the interview, the participant will be asked to focus on the highly competent users of IS that they identified earlier and will be asked probing questions such as “How would you describe this person in terms of what makes him/her a highly competent user?”, “Why do you think they are highly competent users?”, “What do you think allows them to utilize the system better than anyone else?”, “How would you label their strengths in terms of what allows them to be a better user than others?”. The constructs identified from the responses will be compared to the existing list. If any new constructs emerge, they will be included in the existing list and steps 4 and 5 will be repeated. The interview will be concluded by asking the participant to rate all of the constructs (positive end only) on a scale from 1 to 9 in terms of the importance of these constructs for an ideal user.
Step 6: Analysis of RepGrids. To conduct a qualitative analysis of the RepGrids generated from the data, the frequency that the constructs are mentioned will be tabulated. Also, the mean average ratings will be developed and reviewed. Constructs that identify general, demographic attributes (e.g., young…old) will not be included in the categorization. Instead, if these general attributes are provided by a participant, the laddering approach will be utilized (asking the participant “how” and “why” as mentioned above) to obtain more specific and descriptive attributes. These more specific and descriptive attributes that are obtained will be included in the categorization (e.g., experienced with technology). As suggested by Tan and Hunter [34], “linguistic analysis can be used to classify groups of common constructs.” The constructs that are generated will be categorized following Stewart’s [31] approach of content analysis and Strauss and Corbin’s [40] open coding methodology. Stewart suggests that, “to perform a content analysis you select a series of categories into which the elements or constructs fall, and then assign the elements or constructs to categories.” During the open coding process, reference will be made to those categories identified in the literature review, and definitions provided will be followed as closely as possible in order to capitalize on the strong theoretical foundation in the literature.
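The tabulation described in Step 6 might look like the sketch below. It rests on two assumptions that the text does not settle: that constructs from different participants have already been coded to common category labels, and that the ratings being averaged are the importance ratings collected at the end of each interview. The function name and the sample data are hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize(grids):
    """Tabulate how often each construct is mentioned and its mean rating.

    grids: list of dicts, one per participant, mapping a coded construct
        label to that participant's importance rating (1 to 9).
    Returns a dict: construct -> (frequency, mean rating).
    """
    frequency = Counter()
    ratings = {}
    for grid in grids:
        for construct, rating in grid.items():
            frequency[construct] += 1
            ratings.setdefault(construct, []).append(rating)
    return {c: (frequency[c], mean(ratings[c])) for c in frequency}


# Invented example data for two participants.
grids = [
    {"experienced with technology": 8, "curious": 6},
    {"curious": 9},
]
summary = summarize(grids)
```

Demographic constructs such as young/old would be excluded before this step, in line with the categorization rule stated above.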
5 Contributions and Future Research The findings from this research will provide an aggregated understanding of the attributes of highly competent users, which will provide insight into “what makes them so special.” The findings will provide a foundation of IS users’ specific attributes that support their ability to obtain maximum performance outcomes from IS use. This foundation is further enhanced with previous research that was utilized in developing the categorizations of the findings. The attributes that are identified can then be assessed for those that can be trained or acquired by others versus those that are not. If users are trained or encouraged to foster similar attributes that are identified as trainable, they may be able to reach higher levels of performance outcomes as well. In future research, specific interventions (e.g., training programs, creativity exercises) that encourage or develop the identified attributes will be explored. For those that are more innate, the attributes may present specific criteria that organizations can utilize in hiring individuals whose attributes will more appropriately fit with the job expectations. Considering organizations are trying to maximize the benefits that can be gained from IS, implementing such mechanisms can help to improve the proficiency of all IS users and lead to improved outcomes for organizations.
References 1. Wu, J.: Realizing the Benefits from Your Investment in BI. DM Review 15(6), 10–65 (2005) 2. Marcolin, B.L., Compeau, D.R., Munro, M.C., Huff, S.L.: Assessing User Competence: Conceptualization and Measurement. Information Systems Research 11(1), 37–60 (2000)
3. Boudreau, M.-C.: Learning to Use ERP Technology: A Causal Model. In: Proceedings of 36th Annual Hawaii International Conference on System Sciences, vol. 232, pp. 235–243 (2003) 4. Compeau, D.R., Higgins, C.A.: Computer Self-efficacy: Development of a Measure and Initial Test. MIS Quarterly 19(2), 189–211 (1995) 5. Agarwal, R., Prasad, J.: A Conceptual and Operational Definition of Personal Innovativeness in the Domain of Information Technology. Information Systems Research 9(2), 204–215 (1998) 6. Chung, J., Tan, F.B.: Antecedents of Perceived Playfulness: An Exploratory Study on User Acceptance of General Information-searching Websites. Information and Management 41(7), 869–881 (2004) 7. Goodhue, D.L., Thompson, R.L.: Task-technology Fit and Individual Performance. MIS Quarterly 19(2), 213–236 (1995) 8. Ajzen, I., Fishbein, M.: Understanding Attitudes and Predicting Social Behavior. PrenticeHall, Englewood Cliffs (1980) 9. Ajzen, I.: From Intentions to Actions: A Theory of Planned Behavior. In: Kuhl, J., Beckmann, J. (eds.) Action Control: From Cognition to Behavior, pp. 11–39. Springer, Heidelberg (1985) 10. Ajzen, I.: The Theory of Planned Behavior. Organizational Behavior and Human Decision Processes 50(2), 179–211 (1991) 11. Davis, F.D.: Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly 13(3), 319–339 (1989) 12. Taylor, S., Todd, P.A.: Understanding Information Technology Usage: A Test of Competing Models. Information Systems Research 6(2), 144–176 (1995) 13. Bagozzi, R.P., Warshaw, P.R.: Trying to Consume. Journal of Consumer Research 17(2), 127–140 (1990) 14. Beaudry, A., Pinsonneault, A.: Understanding User Responses to Information Technology: A Coping Model of User Adaptation. MIS Quarterly 29(3), 493–524 (2005) 15. Nah, F., Tan, X., Beethe, M.: End-users Acceptance of Enterprise Resource Planning Systems: An Investigation Using Grounded Theory Approach. 
In: Proceedings of the Americas Conference on Information Systems (AMCIS) pp. 2053–2057 (2005) 16. Bandura, A.: Social Foundations of Thought and Action, vol. 148, p. 391. Prentice-Hall, Englewood Cliffs (1986) 17. Hurt, H.T., Joseph, K., Cook, C.D.: Scales for the Measurement of Innovativeness. Human Communication Research 4(1), 58–65 (1977) 18. Rogers, E.M., Shoemaker, F.F.: Communication of Innovations. Free Press (1971) 19. Perewwe, P.L., Spector, P.E.: Personality Research in the Organization Sciences. In: Ferris, G.R., Martocchio, J.J. (eds.) Research in Personnel and Human Resources Management, vol. 21, pp. 1–63 (2002) 20. Nicotera, A.M., Smilowitz, M., Pearson, J.C.: Ambiguity Tolerance, Conflict Management Style and Argumentativeness as Predictors of Innovativeness. Communication Research Reports 7(2), 125–131 (1990) 21. Rank, J., Pace, V.L., Frese, M.: Three Avenues for Future Research on Creativity, Innovation and Initiative. Applied Psychology 53(4), 518–528 (2004) 22. Amabile, T.M.: Creativity in Context. Westview Press, Colorado (1996) 23. Butler, B.S., Gray, P.H.: Reliability, Mindfulness, and Information Systems. MIS Quarterly 30(2), 211–224 (2006) 24. Ghani, J.A., Deshpande, S.P.: Task Characteristics and the Experience of Optimal Flow in Human-computer Interaction. The Journal of Psychology 128(4), 381–391 (1994)
25. Ghani, J.A.: Flow in Human-computer Interactions: Test of a Model. In: Carey, J. (ed.) Human Factors in Management Information Systems: An Organizational Perspective, vol. 3, Ablex (1991) 26. Csikszentmihalyi, M.: Flow: The Psychology of Optimal Experience. Harper & Row, 4 (1990) 27. Nah, F.F.-H., Tan, X., Teh, S.H.: An Empirical Investigation on End-users’ Acceptance of Enterprise Systems. Information Resources Management Journal 173, 32–53 (2004) 28. Karahanna, E., Agarwal, R.: When the Spirit is Willing: Symbolic Adoption and Technology Exploration. Working paper, University of Georgia (2003) 29. Karahanna, E.: Symbolic Adoption of Information Technology. In: Proceedings of the International Decision Sciences Institute. Athens, Greece (1999) 30. Kelly, G.A.: The Psychology of Personal Constructs. W.W. Norton (1955) 31. Stewart, V.: Business Applications of Repertory Grid (2005), Retrieved December 7, 2005 from Enquire Within website http://www.enquirewithin.co.nz/BUS_APP/business.htm 32. Easterby-Smith, M.: The Design, Analysis and Interpretation of Repertory Grids. International Journal of Man-Machine Studies 13(1), 3–24 (1980) 33. Hunter, M.G.: The Use of RepGrids to Gather Interview Data About Information Systems Analysts. Information Systems Journal 7(1), 67–81 (1997) 34. Tan, F.B., Hunter, M.G.: The Repertory Grid Technique: A Method for the Study of Cognition in Information Systems. MIS Quarterly 26(1), 39–57 (2002) 35. Fransella, F., Bell, R., Bannister, D.: A Manual for Repertory Grid Technique, 2nd edn. vol. 18. John Wiley & Sons, New York (2004) 36. Stewart, V.: Helpful Hints (2006), Retrieved October 17, 2006 from Enquire Within website http://www.enquirewithin.co.nz/hintsfor.htm 37. Pervin, L.A.: Personality: Theory and Research, vol. 230. John Wiley & Sons, New York (1984) 38. Bonarius, H.: Research in the Personal Construct Theory of George A. Kelly: Role Construct Repertory Test and Basic Theory. In: Maher, B.A. (ed.) 
Progress in Experimental Personality Research, pp. 1–46. Academic Press, New York (1965) 39. Reger, R.K.: The Repertory Grid Technique for Eliciting the Content and Structure of Cognitive Constructive Systems. In: Huff, A.S. (ed.) Mapping Strategic Thought, pp. 301– 309. John Wiley & Sons, New York (1990) 40. Strauss, A., Corbin, J.: Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory, 2nd edn. Sage Publications, Thousand Oaks (1998)
User Acceptance of Digital Tourist Guides Lessons Learnt from Two Field Studies
Bente Evjemo, Sigmund Akselsen, and Anders Schürmann
Telenor R&I, Sykehusvn. 23, N-9294 Tromsø, Norway
{Bente.Evjemo, Sigmund.Akselsen, Anders.Schürmann}@telenor.com
Abstract. Two digital tourist guides have been developed and tested in real settings. Both are outdoor guides adapted to mobile phones, targeting attractions and tourist services within a region and a specific attraction, respectively. Aspects related to simplicity in use, installation procedures, content quality, co-visiting mechanisms, and mechanisms that support links between physical object and digital content should be accentuated in future digital guides.
Keywords: digital tourist guide, user acceptance, field study.
1 Introduction
The self-made tourist and the active tourist are trends within tourism today [29]. Such a tourist finds satisfaction in the process of composing the program for the day, and wants to take part in activities beyond being a spectator. New technology has the power to enrich storytelling through easy access to multimedia information, fast searching mechanisms and immediate response, and by selecting information and presentation formats relevant to the actual user, the user’s preferences and the current context. The digital guide is available whenever needed, and might adapt to individual needs. The great number of digital tourist guides reflects a strong belief in the potential of digital guiding. The guides vary along several dimensions. Some guides are designed for portable devices [2], [4] and others for stationary devices [10], [26]. In some cases the tourists decide the time for loading and presentation [2], [4], and in others the systems decide [5], [20]. Some guides are closely tied to one attraction [1], [10], [24] while others refer to several attractions or a geographical area [4], [14], [25]. Some guides store the content locally on the device [1], while others transfer information on demand through wireless networks, like WLAN [4], [14] and Bluetooth [20]. Automatic positioning is sometimes part of the system [2], [27], [24], [15]. Many digital guides also include map based facilities [3].
This paper reports on user acceptance of two different tourist guides. Both are designed and implemented for mobile phones, targeting tourists travelling within a region and museum visitors, respectively. As ease of use and usefulness are found essential in the process of modelling and understanding user acceptance, these aspects are accentuated [6], [13], [28]. A third aspect, entertainment, is added due to its relevance to mobile phone services [19], [22], [23]. User surroundings and user activities are also considered as factors affecting user acceptance and usability [16].
2 Two Digital Tourist Guides – Two Field Studies
The main functionalities and user interfaces of the tourist guides are briefly presented, as well as the methods used in the two field studies and the results of importance to user acceptance. The technical solutions, methods and results are more thoroughly described in [7], [8], [30].
2.1 RegionGUIDE – Information While Travelling
The RegionGUIDE informed tourists about attractions and tourist services available within a particular region. For testing purposes, content was produced for one particular region only. During the field trial there were more than 200 possible entries to the guide. These points of interest (POIs) contained textual and multimedia information, partly produced for the field trial and partly made available from existing databases. The content was downloaded when needed over a GPRS² network. The specifications of RegionGUIDE were based upon a major tourist survey that was performed to better understand tourist behaviour, roles, and particular interests and tasks [29]. The survey as well as the service development and the field study were parts of a major project³ focusing on mobile services within tourism. As maps are frequently used and constitute an important source of information for people on the move, the main user interface was based upon a scalable map [12], [29]. It was implemented as a J2ME MIDlet⁴. Automatic positioning was part of the initial plan, but due to limited project resources it was not implemented. The service was preset to show the region of Lofoten when activated (see figure 1). Within the map there were clickable icons representing the POIs. These were split into six categories: accommodation, dining, attraction, activity, events, and tourist information. To avoid information overload and too many overlapping icons, attraction was initially selected as the preferred and only category, but all categories could be shown simultaneously.
When an icon was clicked, the map screen was replaced by a short description of the POI and links to additional information. The description was limited to 255 characters to reduce download time and scrolling. Most POIs had a link to a contact information page with street address and phone number. When the user clicked on the phone number, she was asked if she wanted to dial the number. When a mobile phone number was chosen, an option to send an SMS message was given. Some POIs had a link to a WAP page with more extensive information, which could include images and videos.
² General Packet Radio Service.
³ The MOVE project, see http://www.moveweb.no
⁴ http://java.sun.com/javame/index.jsp
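The category filter and the 255-character description limit described for RegionGUIDE can be sketched as below. This is an illustration in Python rather than the J2ME code of the actual guide, which is not shown in the paper. The class and function names are hypothetical, the sample POIs are invented, and enforcing the limit by truncation is an assumption (the descriptions may simply have been authored to fit).

```python
from dataclasses import dataclass

CATEGORIES = ("accommodation", "dining", "attraction",
              "activity", "events", "tourist information")
MAX_DESCRIPTION_CHARS = 255


@dataclass
class Poi:
    name: str
    category: str
    description: str

    def short_description(self):
        """Text for the description page, cut to the 255-character limit."""
        return self.description[:MAX_DESCRIPTION_CHARS]


def visible_pois(pois, selected=("attraction",)):
    """Return the POIs whose icons are drawn for the selected categories."""
    unknown = set(selected) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [p for p in pois if p.category in selected]


# Invented sample entries, not taken from the field trial content.
pois = [
    Poi("Harbour museum", "attraction", "A" * 300),
    Poi("Quay cafe", "dining", "Fish soup and coffee."),
]
default_view = visible_pois(pois)                     # attractions only
everything = visible_pois(pois, selected=CATEGORIES)  # all six categories
```

Defaulting to a single category mirrors the stated aim of avoiding overlapping icons on a small display.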
Fig. 1. Initial map is shown with POIs, including star-like icons that hide several specific icons (left), a simple description page (mid), and a WAP-site with additional information (right)
RegionGUIDE provided a key based and a stylus based user interface. The selection was made automatically during installation, depending on the capabilities of the actual device. Each panning operation in the key based interface moved the map half a screen in the desired direction (see figure 2 for details). In the stylus version, panning was achieved by dragging the map to the desired position, and zooming by drawing a rectangle around the area of interest. Icons at the top left of the screen (figure 2) provided additional functions and toggled between zoom mode and pan mode for the stylus.
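The two navigation operations can be made concrete with a small viewport model. This is a sketch under stated assumptions, not the guide's implementation: the `Viewport` class, its coordinate convention (x grows to the right, y grows downwards) and the choice to preserve the screen's aspect ratio when zooming to a drawn rectangle are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    x: float       # map coordinate of the left edge
    y: float       # map coordinate of the top edge
    width: float   # extent of the visible map area
    height: float

    def pan(self, direction):
        """Key based panning: move half a screen in the given direction."""
        dx, dy = {"left": (-1, 0), "right": (1, 0),
                  "up": (0, -1), "down": (0, 1)}[direction]
        return Viewport(self.x + dx * self.width / 2,
                        self.y + dy * self.height / 2,
                        self.width, self.height)

    def zoom_to(self, rx, ry, rw, rh):
        """Stylus zoom: show the drawn rectangle, centred, keeping the
        screen's aspect ratio (the aspect handling is an assumption)."""
        aspect = self.width / self.height
        if rw / rh > aspect:
            new_w, new_h = rw, rw / aspect    # rectangle is relatively wider
        else:
            new_w, new_h = rh * aspect, rh    # rectangle is relatively taller
        cx, cy = rx + rw / 2, ry + rh / 2
        return Viewport(cx - new_w / 2, cy - new_h / 2, new_w, new_h)


start = Viewport(0, 0, 200, 100)
panned = start.pan("right")
zoomed = start.zoom_to(10, 10, 50, 50)
```

Half-screen steps keep part of the previous view on screen after each key press, which gives some continuity but is still far from the smooth panning that informants later asked for.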
Fig. 2. Key based interface (left) and stylus based interface (right)
The field trial - Methods
The trial was performed in the archipelago of Lofoten in Northern Norway. Tourists waiting for the ferry were invited to a short demonstration of RegionGUIDE, including a few minutes of hands-on use, and afterwards answered a questionnaire. Tourists with capable phones were asked to install and use RegionGUIDE during their stay in Lofoten. A similar recruitment process was performed at the main tourist office in the archipelago. Here the tourists were shown a 3-minute video⁵ describing the functionality of the guide in detail. A total of 107 persons answered the questionnaire. 65 % of them were men, and approximately 90 % belonged to an extended middle age group (25-65). The fact that only 40 % of the informants⁶ had a phone that handled Java software made the recruiting process time consuming and cumbersome. Practical problems related to installation and configuration prevented willing tourists from participating (only 29 % installed the guide). 13 tourists were contacted after the trial period for a semi-structured interview over the phone. These interviews informed the analyses of the questionnaires, which form the basis of the findings in the next section.
Results
One of the informants exclaimed: “This is great! The mobile is always there – and that makes it a handy service!”. Another informant pointed to the future: “This is the coming service!” Most of them seemed surprised to find the phone capable of anything other than calling and messaging. Further, they seemed impressed by the detailed map on the small display and the possibility to navigate and zoom and finally spot the very building of the demonstration. Close to 70 % agreed or partly agreed with the statement: “I think this guide will be useful to me”, and 83 % claimed that they would install and use the system if their mobile terminal allowed them to do so. Automatic positioning was frequently asked for, and these requests seemed to be accentuated by the informants’ knowledge of satellite positioning systems in vehicles. 84 % of the informants stated that the RegionGUIDE was easy to use. This percentage remained high for the informants who actually installed and used the guide. The use, however, was limited as their stay in the region lasted only a few days.
The informants reported problems related to accurate navigation and to finding their way back to a recognisable location in the map if “lost”. Lack of smooth panning seemed to reduce the overall user experience. The only technical problems reported were related to configuration procedures, and presumably had limited impact on their assessment of ease of use. Configuration problems occurred because the devices involved varied in their implementation of J2ME specifications and in their WAP and Internet settings. 81 % agreed or partly agreed with the statement: “I think it will be fun to use RegionGUIDE”. Among those who installed and used the guide, the attitude to this question was less clear. Some of them hesitated and tried to find words that better described their experiences. They suggested, for example: “Well, not actual entertaining or fun, but nice”. Some informants asked for more information and were concerned about the comprehensiveness and quality of the information available.
2.2 MuseumGUIDE – Visit Without Entering
MuseumGUIDE was developed to encourage people to visit a museum located next to a well-reputed gothic cathedral⁷. The museum consists of four separate buildings that enclose a square courtyard. The thick stone walls and the rooms and artefacts inside hold parts of the city history and anecdotes about its inhabitants. The design of the MuseumGUIDE reflects a pedagogical storytelling ambition, which is recommended within tourism and cultural dissemination [20].
⁵ Can be downloaded from http://www.moveweb.no/KartGUIDE_eng.wmv
⁶ Foreigners not included (25 out of 107).
⁷ Nidarosdomen in the city of Trondheim, Norway, is visited by 250 000 every year.
Fig. 3. User interface of the MuseumGUIDE. The silhouette of the cathedral was meant to support the user’s orientation and identification of buildings (left), and the numbers on the building correspond to the entry options (left and mid). Different icons denote the type of media (right).
The MuseumGUIDE was meant for outdoor guiding. It was based upon a set of WAP pages with a hierarchical structure on three levels. The main page invited the user either to explore historical events related to the buildings or to visit particular rooms or parts of the buildings. In addition to short texts, images and graphical sketches, the MuseumGUIDE contained video and audio tracks. Both Norwegian and English versions were available. In total there were 26 audio tracks, varying from 30 to 180 seconds in length, and 14 video tracks, varying from 24 to 150 seconds. Some audio tracks were supposed to be played close to statues of historical persons to support an intended conception of “talking heads”. Still images with calm transitions were extensively used throughout the videos to improve the viewability of the video on small screens. The WAP pages were relatively short to minimize the need for scrolling. The MuseumGUIDE was implemented as a so-called location based service, utilizing advanced antenna technology⁸ and WLAN for positioning, and mobile phones equipped with WLANSIM⁹ cards. The system utilized WAP Push¹⁰ to load WAP pages related to the user’s current location. Initial tests uncovered that the test area had limited 3G coverage. As a result the devices frequently switched between the 3G and 2G networks, which occasionally caused device deadlock. As foreseen, the 2G network was not able to stream video clips, a fact that obviously reduced the user experience. To avoid the risk of push messages adding further confusion to the situation, it was decided to run the trial without using the location aware functions of the guide.
⁸ Cordis RadioEye.
⁹ Wireless Local Area Network, Subscriber Identification Module.
¹⁰ WAP Push Service Load (SL) was used.
The field trial - Methods
The guide was tested by 90 persons during three days in October 2005. Half of them were college students, aged 17-18 years, and their attendance was arranged in advance. A number of the museum staff were booked in as informants, and finally, people crossing the yard during the day were asked to participate. The informants were equipped with mobile phones¹¹ and were given a short introduction to the phone’s user interface and how to use the MuseumGUIDE. In case of technical problems, members of the project staff were present in the courtyard. The informants used the service for approximately half an hour. They were then asked to fill in a questionnaire that collected demographic information, including former experience with mobile phones, and their immediate impressions of the guide. They received two lottery tickets with an approximate value of 6 € for their participation.
Results
The 3G network was unstable throughout the test period, and most informants experienced some kind of technical problems. This unfortunate situation of course influenced the total impression of MuseumGUIDE. In particular the older informants seemed to be distracted by these problems. The younger ones were generally more enthusiastic about the service and focused on future possibilities. The frequency analysis indicates that the system was welcomed by the informants. Respectively 72 % and 75 % of the informants agreed (or partly agreed) with the following statements: “If I get a chance, I will use a guide like this again” and “I will recommend others to use this kind of guide”.
The aspect of usefulness was measured through several items, among them one rather provocative: “To what degree are MuseumGUIDE a substitute to an ordinary museum visit?” Not surprisingly, only 32 % agreed or partly agreed with this statement (50 % in the group that experienced few technical problems). The informants’ general curiosity about the history of the ancient palace was reported strengthened, and this effect was found regardless of gender, age, profession or general historical or cultural interests. The informants enjoyed the untraditional visit at the museum and were fascinated by the potential of the solution. A young boy exclaimed: “I don’t like reading – this suits me”. Most informants preferred audio over video, and more extensive use of still images was asked for. In particular the “talking heads” with extended use of audio effects were appreciated. Also the length of the audio clips seemed appropriate even if some of them were close to three minutes. About 60 % reported that it was occasionally difficult to link the information given by the MuseumGUIDE to the surrounding buildings. The cues given by the graphical design were evidently not sufficient. As the positioning part of the service was not activated during the test, this aspect was not fully investigated. The museum staff was concerned about the limited possibilities to share the MuseumGUIDE with others. In total nearly 40 % missed this opportunity, and among those who usually had company when visiting museums nearly 90 % reported this kind of concern. The importance of ease of use was pointed out by many of the informants. Some found the menus cumbersome to use. One of them reported: “Too much scrolling needed before I find what I’m looking for”. Another informant said: “I’m here to see the court yard, not to watch a display”. Still, 72 % found the guide easy to use. This positive result might be somewhat influenced by the fact that half of them received help to cope with the network problems mentioned earlier. Among those with no network problems, 86 % found the guide easy to use. About 70 % agreed or partly agreed that the use of the keys was intuitive. The functions of the keys were determined by the operating system or WAP browser and not by the guide. Not surprisingly, those who had prior knowledge of the phone found the use of the keys more intuitive than others. Nearly 70 % of the informants agreed or partly agreed with the statement that described the MuseumGUIDE as entertaining. This finding was almost independent of age and technology experience.
¹¹ Nokia 6630 and Sony Ericsson K600i.
3 Discussion
There were great differences between the two services examined, the devices involved, the measuring instruments and the degree of hands-on use that the users actually were offered. Consequently it was not our intention to make numeric comparisons, but rather to use the results from the respective studies to address challenges related to digital tourist guides on mobile devices. Both services got high scores on perceived usefulness, ease of use, and entertainment. Presented without further comments these values indicate successful trials and promising digital guides. However, the actual use was limited, the number of informants was limited and the number of items used to form the measuring constructs was limited. Still there are reasons to emphasize some issues: ease of use, co-visiting mechanisms, and user interface based links between physical objects and digital content. Ease of use is a must and the installation threshold should be very low. This is in line with other studies, in particular the seven field studies performed by Kaasinen [13], where the ease of getting started with the service was stressed. A good user interface makes the user see the possibilities and overlook the limitations of the system. In both systems addressed here the users reported difficulties in navigating. It was obvious that neither keys nor joystick, as implemented in RegionGUIDE, supported the smooth eye movements used when searching an ordinary paper based map. Improvements might be gained by minimising navigation through automatic positioning and by personalising the service according to user preferences and immediate context [3]. If the user is hiking off road the map could for instance automatically be equipped with contour lines. In MuseumGUIDE the graphical presentation of the buildings was seen from a bird’s eye view, as found favourable for digital guides [27]. Initial navigation was supported by map elements (see figure 3), but further selections were supported by an ordinary hierarchical menu structure. This mix of maps and menus seemed to function quite well.
User Acceptance of Digital Tourist Guides Lessons Learnt from Two Field Studies
Baus et al. [3] argue in line with this paper when it comes to demands for co-visiting mechanisms: Exploring a city or an attraction is often a group activity, and mobile guide systems have so far had limited support for social navigation and sharing of experiences. These issues have been addressed through guides that are adapted to other devices [9], [11], [18]. The use of MuseumGUIDE close to the actual buildings was meant to enrich the storytelling and vice versa. However, the informants reported that the link between the stone walls located in front of them and the information given by the MuseumGUIDE was not always obvious. Observations of the informants confirmed these reports, as their attention at times seemed to be directed towards the mobile display and not the fascinating environment. If we consider awareness as a limited resource, these findings tell us that the content provided by the MuseumGUIDE is attention consuming. A visual user interface takes attention away from the physical object in question, whilst listening to audio clips seems to balance the attention much better [1]. A picture or a video is dramatically reduced in size compared to the real world, while a voice from an audio device is rather similar to a voice from a person next to you. This might make a difference. The need for field trials is a matter of ongoing discussion [17], [21]. Our claim is that challenges related to supportive links between physical objects and digital content, the demand for co-visiting mechanisms, and configuration problems would not have been uncovered without studies in the field. Field studies involving avant-garde technologies are highly desirable but also risky, as new technology is more unstable and unpredictable than mature technological platforms. The RegionGUIDE was adapted to the GSM network only, which has sufficient capacity to transmit text and images. Downloading videos was time consuming, but did not provoke technical problems.
The problems occurred, however, in the software installation phase, as the devices vary in their implementation of J2ME specifications and in their WAP and Internet settings. Presumably, ongoing efforts related to mobile device management might reduce these problems. The informants related to RegionGUIDE were mainly middle-aged. They found the content interesting but were not familiar with advanced mobile phones and mobile services. This finding indicates a mismatch between the target group of the study and the technological platform used. This is further illustrated by an informant’s utterance: “This is great - this is the future, but perhaps it suits the younger ones even better.” In the continuation of the project a WAP based version of RegionGUIDE was developed, which avoided the problems related to installing Java programs. The service was then adapted to fit the most common phones and thereby more users. There is a risk, however, that such strategies reduce the appealing image of more sophisticated systems. Another strategy is to tailor content to different user groups, and exploit the advantages of profile mechanisms. Profiles can be established automatically from the user’s actual key strokes, content preferences and customary behaviour. In any case the substance of the guide, the content, should be valuable, updated and of high quality. Production of high quality audio clips and videos is expensive and presupposes skills related to the subject of interest as well as to multimedia and content production for small displays. Thus production of content represents a main hurdle to deployment of digital guides. Further it is necessary to establish standards of
B. Evjemo, S. Akselsen, and A. Schürmann
data formats to ease exchange of data and to set routines that assure availability and maintenance of data sources involved. To these aspects the recent focus on end user content production and the possibilities related to peer-to-peer technologies is both promising and challenging.
4 Conclusion

It is difficult to quantify how factors like usefulness, ease of use, and entertainment impact user acceptance, but the overall impression is that tourists find digital tourist guides appealing. The demand for simplicity might, however, conflict with the use of appealing user interface elements. The field trials shed light on aspects of user acceptance that are not easily uncovered in restricted laboratory environments, such as installation and configuration problems that vary between devices, the demand for co-visiting mechanisms, and user interface elements that visualise the links between physical objects and the digital content.
References 1. Aoki, P.M, Grinter, R.E, Hurst, A., Szymanski, M.H, Thornton, J.D, Woodruff, A.: Sotto Voce: Exploring the Interplay of Conversation and Mobile Audio Spaces. In: Conference on Human Factors in Computing Systems. Minneapolis, pp. 431–438 (2002) 2. Abowd, G D, Atkeson, C G, Hong, J., Long, S., Kooper, R., Pinkerton, M.: Cyberguide: A mobile context-aware tour guide. Wireless Networks 3, 421–433 (1997) 3. Baus, J., Cheverst, K., Kray, C.: A Survey of Map-based Mobile Guides. In: Meng, L., Zipf, A. (eds.) Map-based mobile services Theories, Methods and Implementations, pp. 197–213. Springer, Heidelberg (2005) 4. Cheverst, K., Davies, N., Mitchell, K., Friday, A., Efstratiou, C.: Developing a contextaware electronic tourist guide: Some issues and experiences. In: CHI 2000, The Hague, pp. 17–24 (2000) 5. Cheverst, K., Mitchell, K., Davies, N.: Exploring Context-aware Information Push. Personal Ubiquitous Comput 6(4), 276–281 (2002) 6. Davis, F D.: Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly 13, 319–340 (1989) 7. Evjemo, B., Stenvold, L.A, Akselsen, S., Brede, S., Rafelsen, T.: Opplev Erkebispegården med mobilen! Lokasjonsbasert museumsguide på 3G-nettet. Fornebu, Telenor R&D, R 41. In Norwegian (2005) 8. Evjemo, B., Akselsen, S., Bergvik, S., Schürmann, A., Ytterstad, P.: Turister gjør det med mobilen! Utprøving av mobile tjenester for turister i Lofoten sommeren 2005, Fornebu, Telenor R&D, R 2. In Norwegian (2005) 9. Galani, A., Chalmers, M.: Can you see me? Exploring co-visiting between physical and virtual visitors. Museums and the Web 2002 Boston, pp. 31–40 (2002) 10. Garzotto, F., Cinotti, T.S, Pigozzi, M.: Designing multi-channel web frameworks for cultural tourism applications: the MUSE case study. Museums and the Web, Charlotte, NCarolina, USA , pp. 239–254 (2003) 11. Grinter, R.E, Aoki, P.M, Hurst, A., Szymanski, M.H, Thornton, J.D, Woodruff, A.: Revisiting the Visit. 
Understanding How Technology Can Shape the Museum Visit. In: CSCW’02, pp. 146–155 (2002)
12. Hunolstein, S., Zipf, A.: Towards task oriented map-based Mobile Guides. In: Mobile HCI ’03 (2003) 13. Kaasinen, E.: User acceptance of location-aware mobile guides based on seven field studies. Behaviour & Information Technology 24(1), 37–49 (2005) 14. Kesti, M., Ristola, A., Karjaluoto, H., Koivumaki, T.: Tracking consumer intentions to use mobile services: empirical evidence from field trial in Finland. E-Business Review, IV, pp. 76–80 (2004) 15. Khan, F.: Museum puts tags on stuffed birds. RFID Journal (2004) http:// www.rfidjournal.com/article/articleprint/1110/-1/1 16. Kjeldskov, J., Graham, C., Pedell, S., Vetere, F., Howard, S., Balbo, S., Davies, J.: Evaluating the Usability of a Mobile Guide: The influence of Location, Participants and Resources. Journal of Behaviour and Information Technology 24(1), 51–65 (2005) 17. Kjeldskov, J., Skov, M.B, Als, B.S, Høegh, R.T.: Is it worth the hassle? Value of evaluating the usability of context-aware mobile systems in the field. In: Mobile HCI ’04, pp. 61–73 (2004) 18. Laurillau, Y., Paternò, F.: Supporting Museum Co-visits Using Mobile Devices. In: Mobile CHI’04, pp. 451–455 (2004) 19. Leung, L., Wei, R.: More than just talk on the move: Uses and gratifications of the cellular phone. Journalism and Mass Communication Quarterly 77, 308–320 (2000) 20. Luyten, K., Coninx, K.: Imogl: Take Control over a Context-Aware Electronic Mobile Guide for Museums. In: Workshop on HCI in Mobile Guides, 6th International Conference on Human Computer Interaction with Mobile Devices and Services. Glasgow (2004) 21. Nielsen, C.M, Overgaard, M., Pedersen, M.B, Stage, J., Stenild, S.: It’s worth the Hassle! The added Value of Evaluating the usability of Mobile Systems in the Field. In: NordiCHI 2006, Oslo, pp. 272–280 (2006) 22. Peters, O., Almekinders, J.J., Van Buren, R.L.J., Snippers, R., Wessels, J.T.J.: Motives for SMS use. In: Paper presented at the annual conference of the International Communication Association. 
San Diego, CA (2003) 23. Peters, O., ben Allouch, S.: Always connected: a longitudinal field study of mobile communication. Telematics and Informatics 22, 239–256 (2005) 24. Petrelli, D., Not, E.: User-centred Design of flexible Hypermedia for a Mobile Guide: Reflections on the HyperAudio Experience. UMUAI – User Modeling and User-Adapted Interaction 15(3-4), 303–338 (2005) 25. Poslad, S., Laamanen, H., Malaka, R., Nick, A., Buckle, P., Zipf, A.: CRUMPET: Creation of user-friendly mobile services personalised for tourism. 3G 2001, London, pp. 28–32 (2001) 26. Rocchi, C., Stock, O., Zancanaro, M., Kruppa, M., Krüger, A.: The Museum Visit: Generating Seamless Personalized Presentations on Multiple Devices. Intelligent User Interfaces 2004. Island of Madeira, Portugal (2004) 27. Schilling, A., Coors, V., Laakso, K.: Dynamic 3D Maps for Tourism Applications. In: Zipf, A., et al. (eds.) Design of map-based mobile services, Springer, Heidelberg (2005) 28. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance of information technology: Toward a unified view. MIS Quarterly 27, 425–478 (2003) 29. Viken, A., Akselsen, S., Evjemo, B., Hansen, A.A.: Lofotundersøkelsen 2004, Fornebu, Telenor Research & Development, R 27 In Norwegian (2004) 30. Wang, A.I., Sørensen, C.-F., Brede, S., Servold, H., Gimre, S.: Development of a locationaware application. The Nidaros framework. IFIP TC8, MOBIS. Leeds, UK (2005)
Why Does IT Support Enjoyment of Elderly Life? - Case Studies Performed in Japan Kaori Fujimura1, Hitomi Sato1, Takayoshi Mochizuki2, Kubo Koichiro2, Kenichiro Shimokura1, Yoshihiro Itoh1, Setsuko Murata1, Kenji Ogura1, Takumi Watanabe1, Yuichi Fujino1, and Toshiaki Tsuboi3 1
Nippon Telegraph and Telephone Co. 1-1 Hikari-no-oka, Yokosuka City 239-0847, Japan {fujimura.kaori, sato.hitomi, k.shimokura, itoh.yoshihiro, ogura.kenji}@lab.ntt.co.jp, [email protected], {watanabe.takumi, fujino.yuichi}@lab.ntt.co.jp, [email protected] 2 NTT Resonant Inc. 2-2 Otemachi 2-chome Chiyoda-ku, Tokyo 100-0004 Japan [email protected], [email protected] 3 NTT IT Co. 2-9-1 Furo-cho, Naka-ku, Yokohama City 231-0032, Japan [email protected]
Abstract. To help elderly people remain active in communicating with their families and friends, we are developing always-on communications systems that are based on the exchange of indirect information, the videophone, and touch panel displays. Two field experiments were conducted with elderly people in Japan. One of the experiments was conducted between family members, while the other was performed between elderly people and social workers. The results show that IT can support the enjoyment of elderly life. Keywords: elderly people, always-on communications system, indirect information, videophone.
2 Experiment: Family Members

The first case study was performed in Kashiwazaki City, Niigata Prefecture, Japan. Kashiwazaki City is located near the center of Japan’s main island, about three hours from Tokyo station by train. The city has a population of 94,000 and covers an area of 440.55 square kilometers. 25.6% of the whole population, some 24,000 people, are 65 or older. Since this system exchanges not only indirect presence information but also direct audio-visual information, it can switch between communication modes according to the feeling and condition of the families. This field work mainly deals with communication experiments between family members who live apart.

2.1 Always-On Communication System I

The always-on communications system used in the Kashiwazaki trial consists of a personal computer, a touch panel display, a web camera, a microphone, a temperature sensor, a light sensor, a sound sensor, and Internet access. Because the system has a touch panel display, elderly participants do not have to use the keyboard or mouse.
Fig. 1. Always-on communication system I
Participants attempted five tasks (see Table 1). Each task employed a different always-on communication mode. Fig. 2 and Fig. 3 show screen samples of two modes.

Table 1. Communication modes

communication mode   information of the partner              sensor
(a)                  presence                                PC: on/off
(b)                  presence and environment of the room    PC: on/off, a temperature sensor, a light sensor, a sound sensor
(c)                  sound of daily life                     microphone
(d)                  scenery of daily life                   web camera
(e)                  sound and scenery of daily life         microphone and web camera
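The five modes in Table 1 can be written down as a small lookup structure. The following is only an illustrative sketch: the class name `CommunicationMode`, its field names, and the helper `modes_using` are assumptions introduced here and are not part of the system described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommunicationMode:
    """One row of Table 1 (field names are illustrative assumptions)."""
    label: str
    partner_information: str
    sensors: tuple


# Rows transcribed from Table 1.
MODES = {
    "a": CommunicationMode("a", "presence", ("PC on/off",)),
    "b": CommunicationMode(
        "b",
        "presence and environment of the room",
        ("PC on/off", "temperature sensor", "light sensor", "sound sensor"),
    ),
    "c": CommunicationMode("c", "sound of daily life", ("microphone",)),
    "d": CommunicationMode("d", "scenery of daily life", ("web camera",)),
    "e": CommunicationMode(
        "e", "sound and scenery of daily life", ("microphone", "web camera")
    ),
}


def modes_using(sensor):
    """Return the labels of all modes that rely on the given sensor."""
    return [label for label, mode in MODES.items() if sensor in mode.sensors]
```

For example, `modes_using("microphone")` lists the modes in which the partner's daily-life sounds are transmitted, which are also the modes for which participants later reported privacy concerns.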
K. Fujimura et al.
Fig. 2. Screen samples: communication modes (a) and (b)
In modes (a) and (b), the screen uses a picture icon to express the partner’s current status. When the picture icon is colored, the partner’s system is on, and when the picture is black and white, it is off. When the participant wants to talk to the partner, he or she first touches the partner’s picture icon, then the “to Talk” button. In mode (b), in addition to the partner’s status, a graph of recent sensor data is presented. The participants can estimate the partner’s activity from the graph.
Fig. 3. Screen samples: communication modes (c) and (d)
In mode (c), the microphone was live, so the caller could hear the sounds of daily life, such as chopping sounds from the kitchen or the patter of footsteps. If he or she wanted to talk to the partner, he or she would touch the “to Talk” button. The partner’s camera was then activated. In mode (d), the participant could observe the partner’s room through the web camera. If he or she wanted to talk to the partner, he or she would touch the “to Talk” button. In mode (e), the participant could observe the sounds and sights of the partner’s room. If he or she wanted to talk to the partner, he or she would just call out to the partner.

2.2 Methodology

The participants of this trial were seven split households: elderly parents living in Kashiwazaki and their children living in or close to Tokyo. There were 15 participants, as shown in Table 2.
Table 2. Description of the participants

              elderly   children
male          6         4
female        2         3
average age   72        38
They tried each mode for 10 to 14 days; the period was set randomly. The total duration of the experiment was about three months. If participants wanted to communicate with the partner, they used the videophone, i.e. the audio and visual communication mode provided by the system. Every time they finished talking by videophone, they had to answer a simple questionnaire that was displayed on the system. The system was placed in the room where the participants usually spent their free time, such as the living room. Participants were asked to leave the computer on during the experiment.

2.3 Results

The number of times the participants changed the mode to videophone is shown in Fig. 4. The data indicate that the participants changed to videophone mode many times regardless of the mode.

Fig. 4. Number of times the participants changed the communication mode to videophone (vertical axis: number of times, 0 to 400; horizontal axis: modes (a) to (d))
This frequent activation of the videophone suggests that the exchange of the partner's current status encourages conversation with the partner. The high frequency of activation from mode (d), compared to modes (a) and (c), indicates the effectiveness of always-on visual communication in conveying the partner’s current status. In an interview conducted after the whole experiment, the elderly participants commented that the system allowed them to feel as if they were living with their child, because they could see their faces and were able to engage in small talk easily.
Meanwhile, their children had a sense of security because they could see their parents. The negative impressions expressed by the parents were: 1. a feeling of “not enough information”, 2. a leakage of private information (modes (c) and (e)). Those of their children were: 1. a leakage of private information (modes (c) and (e)), 2. annoyance caused by the feeling of being continually watched. As to usability, three of the eight elderly participants told us that the system was easy to use, while three others felt otherwise.
3 Experiment in the Local Community

The second case study was conducted in Kijo Town, Miyazaki Prefecture, Japan. Miyazaki Prefecture is located in the southwest of Japan, and Kijo Town is located approximately in the center of this prefecture. The town has a population of 5,800 and covers an area of 146.02 square kilometers. 25.1% of the whole population, some 1456, were 65 or older in 2005. In response to the results of the first trial, we developed an always-on communication system that exchanged existence information and supported handwritten E-mail and videophone calls. This field work mainly assessed communication within the local community such as elderly people, social welfare council staff, and social workers, in addition to families living apart.

3.1 Always-On Communication System II

The always-on communications system used consists of a personal computer, a touch panel display, a web camera, a microphone, a communication light with infrared sensor, and Internet access as shown in Figure 5. In the previous experiment, even though we provided a touch panel system and bigger buttons, about 40% of elderly participants felt that the system was difficult to use. In response, we redesigned the system.
Fig. 5. Always-on communication system II (labeled components: web camera, microphone, communication light with infrared sensor, touch panel, PC)
The communication light in Figure 6 illuminates when someone approaches the terminal at the other location.
Fig. 6. Communication light with infrared sensor
The partner’s current status is expressed by fish icons as shown in Figure 7.
Fig. 7. Three status icons to show the partner’s current status: (a) user A is in the room; (b) user A is making a videophone call; (c) user A is away
When the fish icon is swimming (a) and the user touches the fish, the system automatically makes a videophone call to the partner. When the fish icon is in state (b) or (c), the screen changes to the one for writing a handwritten E-mail (Fig. 8).
Fig. 8. Screen for making a handwritten E-mail -“Please call me” in Japanese
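The touch behaviour of the fish icon described above amounts to a simple dispatch on the partner's status. The sketch below illustrates that rule only; the enum `PartnerStatus`, the function `action_on_touch`, and the returned action names are hypothetical and are not taken from the actual system.

```python
from enum import Enum


class PartnerStatus(Enum):
    """The three icon states of Fig. 7 (names are illustrative)."""
    IN_ROOM = "a"        # fish icon is swimming
    ON_VIDEOPHONE = "b"  # partner is making a videophone call
    AWAY = "c"           # partner is away


def action_on_touch(status):
    """Return the action triggered when the user touches the fish icon."""
    if status is PartnerStatus.IN_ROOM:
        # Partner is present and free: place a videophone call automatically.
        return "videophone_call"
    # Partner is busy or away: open the handwritten E-mail screen instead.
    return "handwritten_email"
```

Under this rule the user never has to choose between calling and writing: the single touch target resolves to whichever channel can reach the partner at that moment.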
The system was placed in the room where participants usually spent their free time, such as the living room. Participants were asked to leave the computer on during the experiment.
3.2 Methodology

The participants of this trial were 19 elderly people, 5 case workers, 2 social welfare council staff, and 5 elderly participants’ families as shown in Table 3. The duration of this experiment was about three months.

Table 3. Description of participants

attribution             number of participants   details
elderly                 19                       late 60’s to 90’s
case worker             5
staff                   2                        social welfare council
family of the elderly   5                        3 from Kijo Town, 2 from near Tokyo
We held interviews three times: at the beginning, in the middle, and at the end of the experiment.

3.3 Results

As Fig. 9 indicates, the elderly participants used the videophone quite often. Almost all the participants used the videophone after they checked the presence of the partner by the fish icon or the communication light.
Fig. 9. Number of videophone call requests by elderly participants (vertical axis: number of call requests, 0 to 350; horizontal axis: first and second halves of May and June)
The results gathered from the interviews after the experiment are shown in Fig.10. 87% of the subjects felt relief and a sense of security while using this system. Over 90% of the participants indicated that they enjoyed the experiment, see Fig.11.
Fig. 10. Impression of the whole experiment (from interview replies). Pie chart; legend: Sense of security, Sense of security (somewhat), Not either, Discomfort (somewhat), Discomfort; segment labels: 13%, 47%, 40%
Fig. 11. Impression of the whole experiment (from interview replies). Pie chart; legend: Fun, Fun (somewhat), Not either, Boring (somewhat), Boring; segment labels: 7%, 40%, 53%
The answers to the questionnaires indicated that almost all of the elderly participants enjoyed looking at the changing status as expressed by the communication light or fish icon. Half of them used the videophone and the handwritten E-mail function frequently. By watching the fish icon, people thought about other elderly people more often, and they were happy to be able to check on their safety. Some of the elderly complained about the difficulty of using the videophone or writing the E-mail messages. The videophone problems were howling (acoustic feedback) and voice delay. The elderly with poor hearing tried to turn up the speaker volume, which worsened the howling problem. On the other hand, participants soon became accustomed to the voice delay and felt comfortable holding conversations via the videophone.
4 Conclusion

The results of these two field experiments indicate that always-on communications systems, which present the partners’ current status, provide people with a sense of relief and security. Exchanging current status information induces caring for each other. Checking the partners’ appearance and triggering small talk via videophone tighten family ties and strengthen the community bond. Though there are some issues to be solved, these case studies suggested that IT does indeed make elderly life more enjoyable.

Acknowledgments. We thank all the elderly people, their families and volunteers who cooperated in these field experiments.
References 1. Mochizuki, T., Kubo, K., Fujimura, K., Sato, H., Shimokura, K.: Experimental Consideration for Continuous Connection Communication Environment for Families Living Apart. HCS 2004-6 (In Japanese) 104(109), 29–34 (2004) 2. Itoh, Y., Miyajima, A., Ogura, K., Tatemichi, H., Watanabe, T., Fujino, Y., Fujimura, K., Sato, H.: Communication Service Design by Interhuman Interaction Approach CHI 2006, Montreal (2006) 3. Sato, H., Murata, S., Fujimura, K., Ito, Y., Ogura, K., Watanabe, T., Fujino, Y., Mochizuki, T., Tsuboi, T., Miyajima, A., Takagi, S.: THE INFLUENCE WHICH VISUAL COMMUNICATION SYSTEM EFFECTS ON ELDERLY PEOPLE. In: Proceeding of the Second IASTED International Conference TELEHEALTH. Banff, AB, Canada, pp. 92–99 (July 3-5, 2006)
Design Effective Navigation Tools for Older Web Users Qin Gao1, Hitomi Sato 2, Pei-Luen Patrick Rau 1, and Yoko Asano2 1
Abstract. This research looks at various navigation menu designs and the use of a summary of important content, with the aim of improving older web users’ performance and satisfaction and alleviating their disorientation and task workload. Fifty older participants with abundant computer experience were recruited from senior-citizen universities so as to exclude the influence of lack of experience from the results. During the experiment, participants searched for specific product information on e-commerce websites with different navigation menus and different designs of the summary of important content. Participants using the tab menu were found to be less disoriented than those using the index menu. Providing a summary of important information could effectively reduce the workload the tasks imposed on older users. Keywords: web navigation, older users, usability, navigation tools, summary of important information.
to distinguishing whether the differences or problems found in the experiment results are attributable to age-specific factors or just to the lower level of web experience of older users. The purpose of this study is to empirically examine the influence of various navigation menu designs and the use of a summary of important information on older users’ performance, disorientation, task workload and satisfaction respectively, and to propose specific design guidelines based on experimental results. In this way, problems which were caused by less web experience were excluded from the results. It is hoped that the results will benefit the general population as well, since design modifications that help older users often offer at least some benefit to younger users as well [5].
2 Ageing Effects and Web Navigation

Memory and good sight are two crucial abilities for using the web effectively, since web interaction involves a tremendous range of cognitive skills (e.g., recalling, reasoning, recognition, skill acquisition and comprehension), and important information is often embedded in the visual display in subtle ways. For example, sighted users can distinguish the relevant results of an Internet search from irrelevant advertising material by the format in which the results are presented. The decline of memory in older people is generally agreed upon, but this decline is not global. Previous research shows that Age Associated Memory Impairment (AAMI) is associated with different areas of the brain. It was found that while fixed and pre-learned skills (called “crystalline” intelligence) remain unaffected by ageing, the “ability to solve problems for which there are no solutions derivable from formal training or cultural practices” (called “fluid” intelligence) deteriorates with age [6]. Reduced working memory capacity leads to poorer performance in various tasks, including reasoning tasks (e.g., understanding how a web hierarchy is organised) and procedural memory (e.g., performing a large number of steps in an e-commerce transaction). Previous research shows that older people have more difficulty retracing and navigating a route than younger people [7]. In another study on older people’s performance in web navigation, Meyer et al. found that older people took more steps to finish tasks than young people during web navigation. This difference was most pronounced for the two tasks that required the most steps. They were also more likely to return to pages they had already visited. Index tabs and site maps were also preferred by older people. It appears that older people need means of seeing where they are and where they have been, rather than keeping track of this in their own memory.
Therefore, easy-to-use navigation tools are of special importance for older users. Another cognitive impairment that hinders older people from using the web effectively concerns attention. Selective attention, the ability to pay attention to relevant information in the presence of distracting information, declines with age [8, 9]. This makes older users more easily distracted and overwhelmed by the flood of information on the web, and they thus have more difficulty extracting information from distracting noise. Researchers have also reported declines in performance with age on divided attention tasks, in which the subject must pay attention to more than one task at the same time. Such divided attention problems with age appear to occur only
in complex tasks rather than simple or nearly automatic tasks [10, 11]. These problems may cause trouble when multitasking is required to accomplish a task on the web. Visual impairments significantly affect older people’s effective use of the web. While generally failing vision, reduced focusing power, and corneal flattening require improvements of general user interface design, such as larger font size and stronger contrast, visual field reduction resulting in reduced peripheral vision requires important information to be presented as close to the centre of the screen as possible [12]. The narrow visual field of older users adds to their difficulty in comparing widely separated screen objects (Cerella, 1985), and putting all important information together with an organizer will help older users access important information easily. Results of Charness, Schuman and Boritz’s experiments (Charness et al., 1992) showed that when organizers facilitating reading were used, no significant differences were found between the old and the young readers, while the young group significantly outperformed the old group when no organizers were available. In another study (Thompson & Diefenderfer, 1986), people with low verbal ability, old or young, were found to benefit more from the use of advance organizers. In addition, web designers should also take motor impairments into consideration when designing navigation supports for older people. Stiffening of the joints and arthritis of older users and reduced ability to inhibit interference from neural noise cause difficulty with finer movements and motor coordination, which makes homing in on a small target problematic for older users [13]. It was found that older adults make more sub-movements and are slower in capturing a target with a mouse [13], and they show poorer performance when asked to track a target [14].
3 Navigation Support Tools: Two Level Navigation Menus

For information-rich websites, especially for web portals, two-level menus presenting down-to-child links are nearly inevitable. There are many different ways to present such links, but there are four basic designs:
● Index layout: in the initial state, the first-level menu and the second-level menu are both presented in full.
● Horizontal cascading: in the initial state, only the first-level menu is presented. After an item is clicked, the second-level menu corresponding to the item is presented at the right side of the first-level menu.
● Vertical cascading: in the initial state, only the first-level menu is presented. After an item is clicked, the second-level menu corresponding to the item is presented below the first-level menu item.
● Tab menu: the first-level menu is presented as tabs. In the initial state, the first-level menu and the second-level items contained in the first first-level item are presented. After a menu item is clicked, the second-level menu corresponding to the item is presented below the first-level tab menu.
Fig. 1 shows the four designs of navigation menus which were compared in this study. Which one is most suitable for older users is of interest to us. There are pros and cons of each design. Index menus present both the overall structure and the details of the
Q. Gao et al.
whole hierarchy, but too many links together may clutter the page visually. Whereas cascading menus may reduce “link crowding” visually and improve accuracy and satisfaction by making it easier to access proper links, such menus demand relatively fine motor coordination, and may result in accessibility problems for older users. This problem is more pronounced for horizontal cascading menus, since users need to select the parent link first, then move horizontally along the link button so as not to lose the focus of the selected link, until the cursor is moved onto the panel containing the child links. Tab navigation is widely used for breaking information into a few primary categories. It hides child links without requiring fine movements. However, this metaphor may not be as obvious for older people, for whom both computers and the Internet are brand new experiences.
Fig. 1. Different designs of two-level menus: (a) index menu, (b) tab menu, (c) vertical cascading menu, (d) horizontal cascading menu
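The four designs differ mainly in which links are visible before and after a click. The following is a minimal sketch of that visibility rule, not the experimental software: the menu contents and the function name `visible_links` are hypothetical, and the spatial placement of the second level (to the right versus below) is not modelled.

```python
from typing import Dict, List, Optional

# Hypothetical two-level hierarchy; the category names are illustrative only.
MENU: Dict[str, List[str]] = {
    "Books": ["Fiction", "Travel", "Cooking"],
    "Electronics": ["Phones", "Cameras", "Audio"],
    "Health care": ["Monitors", "Massage", "Supplements"],
}

DESIGNS = ("index", "horizontal_cascading", "vertical_cascading", "tab")


def visible_links(design: str, clicked: Optional[str] = None) -> List[str]:
    """Return the links shown for a design, given the clicked first-level item."""
    if design not in DESIGNS:
        raise ValueError(f"unknown design: {design}")
    first_level = list(MENU)
    if design == "index":
        # Index layout: both levels are always shown.
        return first_level + [child for item in first_level for child in MENU[item]]
    if design == "tab" and clicked is None:
        # Tab menu: the children of the first tab are shown in the initial state.
        clicked = first_level[0]
    if clicked is None:
        # Cascading menus: only the first level is shown until an item is clicked.
        return first_level
    return first_level + MENU[clicked]


for design in DESIGNS:
    print(design, len(visible_links(design)))
```

Counting the visible links per design gives a rough sense of the "link crowding" trade-off discussed above: the index layout exposes every link at once, while the other three designs expose at most one second-level group at a time.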
4 Navigation Support Tools: Summary of Important Information

For web portals and e-commerce websites, the portal page contains a large amount of information, and the important information may be scattered about the page and overwhelmed by less important information. Given that older people are more likely to be overloaded by a flood of information, presenting a summary of important information at a conspicuous location on the page may help them access such information, catch the essence of the website, and navigate. Moreover, the summary can take the form of a list of links or a list of links with summaries. The latter is expected to benefit older users more, since their semantic memory decays less than other types of memory. The effects of such a summary were tested by comparing the two websites shown in Fig. 2. One website provides a summary of the hot items of the whole category at the top of the index page of that category, while the other website does not offer such a summary.
Fig. 2. Website design with/without a summary of important information: (a) with a summary of important information, (b) without a summary of important information
5 Methodology

5.1 Experiment Design

An experiment was conducted to test the effects of various designs of two-level menus and the effects of using a summary of important information. Five e-commerce websites were constructed with identical content but different navigation menu designs and use of the summary, as shown in Table 1. The effects of the various two-level menu designs were examined by comparing user performance and satisfaction among websites A to D, and the effects of the summary of important information were examined by comparing website A and website E. Task duration, task workload, disorientation, and user satisfaction were measured as dependent variables.

Table 1. Design of experimental websites
No.
Index menu
Tab menu
Website A Website B Website C Website D Website E
X X
X -
Horizontal cascading menu X -
Vertical cascading menu X -
Use of summary X
5.2 Participants

Fifty participants, aged 44 to 75, were recruited from two senior-citizen universities in Beijing. These participants had been trained to use computers at the universities and were thus more skilled in using computers than the general population of the same age. The participants were randomly assigned to one of the five websites, with 10 participants in each group. General background information and PC/web experience for each group are shown in Table 2.

Table 2. Descriptive statistics of participants
Group  Age (in years)  Months of computer usage  Months of Internet usage  Hours per week using the web
A      61.7            38.1                      14.88                     4.1
B      61.4            19                        9.7                       4.9
C      55.3            101.8                     49                        6.2
D      61.1            80.78                     35.5                      7.75
E      59.0            54.6                      42.2                      3.4
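The random assignment described in Section 5.2 can be sketched as follows. This is an illustration only, assuming a simple shuffle-and-deal scheme; the paper does not state how the allocation was actually carried out, and the participant identifiers, the seed and the function name `assign_groups` are hypothetical.

```python
import random
from typing import Dict, List


def assign_groups(participants: List[str], groups: List[str], seed: int) -> Dict[str, List[str]]:
    """Shuffle the participants and deal them into equally sized groups."""
    if len(participants) % len(groups) != 0:
        raise ValueError("participants must divide evenly into groups")
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    shuffled = participants[:]
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    return {g: shuffled[i * size:(i + 1) * size] for i, g in enumerate(groups)}


participants = [f"P{n:02d}" for n in range(1, 51)]  # hypothetical participant IDs
allocation = assign_groups(participants, ["A", "B", "C", "D", "E"], seed=2007)
for group, members in allocation.items():
    print(group, len(members))
```

With only 10 participants per group, random assignment does not guarantee balanced groups, which is consistent with the spread in computer experience visible in Table 2.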
5.3 Procedure

Six to ten participants at a time conducted the experiment simultaneously in a testing room. They were first given a background questionnaire covering demographic information, including age, gender, education, and experience with computers and the web. The moderator then gave a brief description of how the study would proceed. Participants were asked to look for specific product information on an e-commerce website. The website provided access to five product categories: housewares, books, electronics, health care, and home décor. Each category was divided into 3 or 4 subcategories. A practice session was then given to help participants understand the operation of the system and the tasks to be performed. Each participant was then given a total of 19 tasks, which were intended to represent the typical tasks users perform on an e-commerce site. For example: How much does a Zhineng slimming belt cost? Where are Omron blood pressure monitors produced? The task order was randomized for each participant by the software system. Participants were given one task at a time and were asked to complete it as fast as possible. Task duration and click data were logged automatically by the system during the execution of tasks. Upon completing all tasks, participants were given a post-study questionnaire that asked about the workload the tasks imposed on them, the degree of disorientation they experienced, and their satisfaction with the tasks.
6 Results

6.1 Testing of the Two-Level Navigation Menu Designs

The data collected were first checked for model adequacy. If model adequacy did not hold, the data were transformed. ANOVA tests were conducted to test whether the different designs of two-level menus had significant influences on older users' performance time, satisfaction, workload and disorientation. The results in Table 3 indicate that there was a significant difference in disorientation (F(3,27) = 4.15, p = 0.013) among the four groups. No significant differences were found for satisfaction, workload or task time among the four groups.

Table 3. Data for testing the effects of two-level menu designs (mean (STD))
Variables             Index layout (N=10)  Horizontal cascading (N=10)  Vertical cascading (N=10)  Tab menu (N=10)  F(3,27)         p
Satisfaction          4.66 (0.40)          4.74 (0.42)                  4.33 (0.28)                4.49 (0.52)      1.92            0.144
Disorientation        2.60 (0.98)          2.01 (0.90)                  1.84 (0.57)                1.32 (0.76)      4.15            0.013*
Workload              41.68 (11.0)         33.70 (13.9)                 28.11 (14.7)               32.06 (13.6)     1.82            0.160
Task duration (a) (s) 82.30 (37.10)        83.78 (22.46)                65.46 (9.26)               70.60 (18.74)    H(3,40) = 3.24  0.356

(a) Kruskal-Wallis test was used since the data could not be transformed to assure the model adequacy.
* p-value < 0.05, suggesting a significant difference.
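The Kruskal-Wallis result for task duration can be checked by hand. Under the usual large-sample approximation, the H statistic for k groups follows a chi-square distribution with k - 1 degrees of freedom, so with four menu designs the reported H = 3.24 is compared against a chi-square distribution with 3 degrees of freedom. The sketch below uses the closed-form upper-tail probability that exists for 3 degrees of freedom; the function name `chi2_sf_df3` is hypothetical, and the approximation itself is an assumption about how the reported p-value was obtained.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x < 0:
        raise ValueError("x must be non-negative")
    # Closed form valid for 3 degrees of freedom only.
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


h_statistic = 3.24  # H reported in Table 3 for task duration
p_value = chi2_sf_df3(h_statistic)
print(round(p_value, 3))
```

The value obtained this way should agree, to three decimal places, with the p = 0.356 reported in Table 3.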
Duncan's tests were run to determine which of the disorientation means differ from each other. As shown in Table 4, a significant difference existed between the tab menu and the index menu (p = 0.002), and a marginally significant difference existed between the vertical cascading menu and the index menu (p = 0.057). Older people were more disoriented with index menus than with tab menus and, marginally, than with vertical cascading menus.

Table 4. Duncan test of disorientation scores of two-level menu designs (p-values)

Variables             Index menu  Horizontal cascading  Vertical cascading  Tab menu
Index menu            -           0.120                 0.057               0.002*
Horizontal cascading  0.120       -                     0.639               0.079
Vertical cascading    0.057       0.639                 -                   0.161
Tab menu              0.002*      0.079                 0.161               -

* p-value < 0.05, suggesting a significant difference.
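The pairwise p-values in Table 4 can be screened with a few lines of code. This is a minimal sketch: the 0.05 level is the one used in the paper, whereas the 0.10 cut-off for "marginal" is an assumption made here for illustration (the paper does not state one), and the short design labels and the function name `classify` are hypothetical.

```python
from typing import Dict, FrozenSet

# Pairwise p-values as reported in Table 4.
P_VALUES: Dict[FrozenSet[str], float] = {
    frozenset(("index", "horizontal")): 0.120,
    frozenset(("index", "vertical")): 0.057,
    frozenset(("index", "tab")): 0.002,
    frozenset(("horizontal", "vertical")): 0.639,
    frozenset(("horizontal", "tab")): 0.079,
    frozenset(("vertical", "tab")): 0.161,
}


def classify(a: str, b: str, alpha: float = 0.05, marginal: float = 0.10) -> str:
    """Label a pairwise comparison; `marginal` is an assumed cut-off."""
    p = P_VALUES[frozenset((a, b))]
    if p < alpha:
        return "significant"
    if p < marginal:
        return "marginal"
    return "not significant"


for pair, p in sorted(P_VALUES.items(), key=lambda kv: kv[1]):
    a, b = sorted(pair)
    print(f"{a} vs {b}: p = {p:.3f} ({classify(a, b)})")
```

Under the assumed 0.10 cut-off, the comparison between the horizontal cascading menu and the tab menu (p = 0.079) would also be labelled marginal, so the choice of cut-off matters when reading the table.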
As the experimental results suggest, the presentation of two-level menus affects the level of disorientation that older users experience on a website. The tab menu is recommended. It seems that the metaphor used by the tab menu is easy for older users to understand. This metaphor helps older users understand how a web hierarchy is organized and build the hierarchy in their minds. The vertical cascading menu is also recommended. Not only was its disorientation score marginally significantly lower than that of the index menu, but its mean workload and task duration also ranked lowest among the four groups, though not significantly. With the horizontal cascading menu, some participants reported problems with the fine movements needed to reach the secondary link panel. By comparison, the vertical cascading menu demands less fine motor control: secondary links are presented after the corresponding primary link is clicked. The index menu, however, should be avoided, since it produced significantly higher disorientation among older users than the tab menu. Several plausible reasons may contribute to this effect. After the experiment, some participants complained about the “crowdedness” of links in the index menu. It seems that both diminished working memory and failing vision make a detailed index menu presenting all the links a less than ideal solution for older users. Though the hierarchy of the website is already explicitly presented by the index menu, it is difficult for older users to internalize the hierarchy with all its details and retain it in memory. The visual clutter also causes problems for aging eyes with less focusing power and possible bifocal problems. Considering that most participants reported visual fatigue as their biggest problem in completing the tasks, visual clutter should be avoided as much as possible for older users.

6.2 Test of the Effect of the Summary of Important Information

The data for testing the effects of the important information summary did not satisfy model adequacy, so only non-parametric tests could be used. Kolmogorov-Smirnov tests were adopted to test whether providing a summary of important information was better for older users than providing none. The results in Table 5 indicate that there was a significant difference in the level of workload between the two groups (p = 0.03). No significant difference was found in task duration, satisfaction, or disorientation.

Table 5. Data for testing the effects of the use of important information summary (mean (STD))
Variables          No summary     With a summary  Z      p-level
Task duration (s)  82.55 (36.77)  62.18 (25.59)   1.59   0.11
Satisfaction       4.58 (0.36)    4.40 (0.75)     -0.15  0.88
Disorientation     2.29 (0.73)    1.86 (0.84)     1.21   0.23
Workload           45.08 (8.79)   31.70 (15.81)   2.12   0.03*
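To make the size of the differences in Table 5 easier to read, the relative change in each group mean can be computed directly from the reported values. This is a worked-arithmetic sketch only; the variable names are hypothetical, and only the workload difference was statistically significant in the study.

```python
# Group means as reported in Table 5: (no summary, with a summary).
MEANS = {
    "task_duration_s": (82.55, 62.18),
    "satisfaction": (4.58, 4.40),
    "disorientation": (2.29, 1.86),
    "workload": (45.08, 31.70),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


changes = {name: percent_change(*pair) for name, pair in MEANS.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

A negative value means the mean was lower for the website with a summary. Percentages computed from group means say nothing about significance, so they complement rather than replace the test results in the table.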
The experimental results suggest that providing a summary of important information reduces the workload of older web users. Though no significant differences were found in task duration or disorientation, the mean values were lower with a summary, which suggests that a summary may also help older web users complete tasks faster and alleviate the disorientation they experience.
7 Conclusion Older people suffer more from usability problems due to numerous physical, cognitive, psychological, and social factors, as well as overall difference in life experience. To address these issues through good website design, empirically validated guidelines which accommodate older people’s requirements are needed.
This study empirically examined the effects of specific design features that aim to help older web users. Four types of two-level menus were tested, as well as the use of a summary of important information. Based on the experimental results, tab menus and vertical cascading menus are recommended for older web users. They help reduce the disorientation older users experience when navigating a website. Index menus should be avoided, as they induce visual clutter and disorientation. Providing a summary of important information at a prominent place on a portal page can reduce the workload of older users and should therefore be offered if possible. In future studies, more specific design features should be tested with older web users so as to develop a comprehensive set of validated guidelines for website design for older people. Without such guidelines, vast information resources on the web would remain inaccessible to older people.
References

1. Coyne, K., Nielsen, J.: Web Usability for Senior Citizens: 46 Design Guidelines Based on Usability Studies with People Age 65 and Older. In: Nielsen Norman Group Report (2002)
2. Conklin, J.: Hypertext: an introduction and survey. Computer 20(9), 17–41 (1987)
3. Meyer, B., et al.: Age group differences in world wide web navigation. In: CHI ’97 extended abstracts on Human factors in computing systems: looking to the future, ACM Press, Atlanta, Georgia (1997)
4. Ellis, R.D., Kurniawan, S.H.: Increasing the usability of online information for older users: A case study in participatory design. International Journal of Human-Computer Interaction 12(2), 263–277 (2000)
5. Worden, A., et al.: Making computers easier for older adults to use: area cursors and sticky icons. In: Proceedings of the SIGCHI conference on Human factors in computing systems, ACM Press, Atlanta, Georgia, United States (1997)
6. Stuart-Hamilton, I.: Intellectual Changes in Late Life. In: Woods, R.T. (ed.) Psychological Problems of Ageing, Wiley, Chichester (1999)
7. Wilkniss, S.M., et al.: Age-related differences in an ecologically based study of route learning. Psychol Aging 12(2), 372–377 (1997)
8. Connelly, S.L., Hasher, L.: Aging and inhibition of spatial location. Journal of Experimental Psychology: Human Perception and Performance 19, 1238–1250 (1993)
9. Kotary, L., Hoyer, W.J.: Age and the ability to inhibit distractor information in visual selective attention. In: Experimental Aging Research (1995)
10. Hartley, A.A.: Attention. In: Craik, F.I.M., Salthouse, T.A. (eds.) The Handbook of Aging and Cognition, Erlbaum, Hillsdale, NJ (1992)
11. McDowd, J.M., Craik, F.I.M.: Effects of aging and task difficulty on divided attention performance. Journal of Experimental Psychology: Human Perception and Performance 14, 267–280 (1988)
12. Kurniawan, S.H., et al.: Personalising web page presentation for older people. Interacting with Computers 18(3), 457–477 (2006)
13. Walker, N., Philbin, D.A., Fisk, A.D.: Age-related differences in movement control: adjusting submovement structure to optimize performance. Journal of Gerontology: Psychological Sciences 52B(1), 40–52 (1997)
14. Jagacinski, R.J., Liao, M.J., Fayyad, E.A.: Generalized slowing in sinusoidal tracking in older adults. Psychology and Aging 9, 103–112 (1995)
Out of Box Experience Issues of Free and Open Source Software

Mehmet Göktürk and Görkem Çetin

Gebze Institute of Technology, 41400 Kocaeli, Turkey
{gokturk,gcetin}@gyte.edu.tr
Abstract. This study addresses the Out-Of-Box Experience (OOBE) usability issues of Free and Open Source Software (F/OSS), considering the outcomes of the distributed development process and the high number of available product choices. A methodology is presented, usability experiments are conducted and results are discussed. The objective was to determine the key factors that affect the usability of F/OSS during the OOBE and the first hours of use. We concluded that the OOBE of F/OSS was significant in software usability perception and possible adoption. User experience, visible structure, consistency and functionality of the interface had a significant impact on the OOBE and the first hours of use. Neither online support nor product box appearance appeared to be important.

Keywords: OOBE, usability, open source.
(OOBE) term has been introduced by researchers primarily focusing on packaged consumer products [5]. It has been demonstrated that OOBE has a significant impact on product adoption where the product market is highly competitive with many alternatives [6]. Since many users already have experience with consumer electronics and mobile products, they treat the instruction manuals in product boxes as redundant. These users rely on the usability of the interface and their existing knowledge to use the product. Many studies have focused on mobile phones, and some on other personal electronic devices and applications [7]. These studies indicate that OOBE design is of the utmost importance for product adoption and for the use of advanced features such as mobile data services, calendars, message boxes and so on. OOBE is highly dependent on fundamental usability principles such as visibility, consistency, affordances, locus of control and feedback, which allow the user to start using the product immediately [8]. Studies have shown that after an unsuccessful OOBE many features of certain products are left unused, or the user abandons the product [9]. In addition, Pirhonen states that learnability is the central factor in OOBE and suggests good use of metaphors to provide better learnability and therefore a better OOBE [10]. It is likewise possible to consider computer software as a consumer product and to expect an OOBE associated with it. A user will go through a learning phase for both computer hardware and software. A “software OOBE” lifecycle model is given in Figure 1.
Fig. 1. Software OOBE lifecycle model
Computer users are exposed to a variety of new software performing identical tasks. This turns the out-of-box experience into an out-of-installer experience for many software products. Since demo, shareware and freeware software downloaded for trial has extremely high turnover rates, a successful OOBE that can lead to adoption is considered essential for such software. As a result, developers are focusing more on visible product features and OOBE usability, thereby raising software awareness and acceptance [11].

1.1 Special Characteristics of Free or Open Source Software

Free and Open Source Software (F/OSS) has special characteristics due to the nature of its development process. First, the entity behind the product is not an organization but a community of volunteers and freelance professionals. Typical user support is provided through online communities and paid consultants [12]. Therefore, immediate customer service is not a viable option. Second, the number of available choices for a particular tool is considerably high. For example, there are over 300 Linux distributions available [13]. The number of available choices for a particular software tool sometimes exceeds 10. This results in quick turnover for open source software. Furthermore, the cost characteristics of F/OSS make it easy for users to switch. Third, there is no strong interface standardization entity in the development community. The lack of standardization and consistency causes usability problems during the OOBE period. This may adversely affect the perceived quality and functionality of the software being used.
2 Methodology

Our aim was to identify the key factors in the OOBE of F/OSS through a laboratory user experiment. For that purpose, we focused on an open source operating system. We created a task list, including installation and basic operations, that matches the definition of OOBE.

2.1 Designing the Task List

The scenario given in the task list covered the process from opening the box through using and experimenting with the basic productivity tools and software. The task list is given in Table 1. A more explanatory version was given to participants as a task scenario sheet. Before the test began, participants were given 20 minutes to warm up and get accustomed to the work environment and a different user interface. The subjects were free to comment on the system and ask the facilitator for help during this period. All sessions were videotaped. No hints or assistance were given for completing the tasks. The participants were not forced to think aloud either, but if they did, everything was written down for further analysis.

Table 1. Task Descriptions

Task  Explanation
T1    Unpack the Linux box
T2    Power on the computer with the CD inserted
T3    Install the system including the most commonly used components
T4    Reboot and log in to the system
T5    1. Browse the web with a browser: Go to www.google.com and search for an image with the keyword “bosphorus”. Save this image to the desktop and open it using a graphics package
      2. Work with a word processor: Create an OpenOffice.org document with a 12 pt heading and a default body font of 10 pt Bitstream Vera Sans. Insert the previously saved graphic or add a graphic from the directory /usr/share/wallpapers, and save the document on the desktop
      3. Send an e-mail: Open the mail client and connect to the e-mail server in everyday use. Send an e-mail to a friend and receive new e-mails in the mailbox. If the subject does not use an e-mail client in daily life, webmail may be used instead
      4. Listen to music: Listen to an artist by inserting a CD into the drive, and adjust the volume if necessary
      5. Connect to an IM system: Run an instant messaging system with a given username and password. Search for and find a friend on the Internet
T6    Log out and power off the machine
T7    Try to get online assistance (if needed)

2.2 F/OSS Software: A Desktop Operating System

We used Xandros, a desktop-centric Linux distribution, mainly because it fit our testing purposes. Xandros was chosen because of its relatively inflexible installation for new users. Software components were carefully preselected by the company, and for each desktop task there is only one corresponding application, which reduces the clutter of the ordinary Linux menu system. Moreover, a complete set of desktop productivity software is installed in the default configuration. The participants were given a full boxed set of Xandros Linux version 2.5. The box included a comprehensive 350-page user guide, a bonus CD with applications, games and tools, and 60 days of e-mail support. The laboratory computer exceeded the system requirements recommended for Xandros OS, for example 512 MB of available RAM against the recommended 128 MB.

2.3 Designing the Questionnaire

The test report was documented using the Common Industry Format for Usability Test Reports version 4, produced by the NIST Industry Usability Reporting Project [14]. Weisberg et al. suggest that a questionnaire should include both closed- and open-ended questions, be pretested to mitigate possible misunderstandings, use questions from existing validated questionnaires, and/or include negatively worded questions to gain validity and reliability [15].
The pre- and post-questionnaires of the test contained both a semantic differential scale and a Likert scale, in order to measure user attitudes and reactions by quantifying subject information. We also designed our OOBE questionnaire in accordance with Weisberg's findings: it included both open-ended and closed-ended questions, and it was handed to 3 evaluators and modified according to their suggestions. Questions measuring satisfaction, learnability, usability and effectiveness were included in the questionnaire. Each of these four items had one or more corresponding questions to test the software. Extreme care was taken to provide a friendly, relaxed atmosphere in the usability testing environment by rearranging the furniture and adding extra items (pens and paper) to increase the participants' satisfaction and eliminate possible alienation. Four formal scripts with well-defined tasks were prepared, aiming to treat all participants equally:

1. Task scenario sheet. Includes guidelines and detailed task items for the tasks to be completed in a given order within a given time frame. The tasks were presented and described in the form of a realistic, short, clear and understandable story.
2. Participant general instructions. Prepared following the CIF guidelines, these instructions briefed the participants on how they could interact with the test administrator and ask for assistance.
3. Participant consent form.
4. A pre-test and post-test questionnaire, including questions on the subject's degree of satisfaction with various product aspects such as aesthetic appeal, ease of use, efficiency, effectiveness and learnability, as well as other topics such as the general success level of the test and the intention to learn Linux in the future.

2.4 Participants and Selection Procedure

During the planning stage, we tried to make sure that the users, tasks and environment represented the intended context of use.
We selected 8 participants (6 male and 2 female) with varying levels of computer skills but with little or no Linux experience or background. All of them were told the reasons for, the aim of and the possible outcomes of the test before the laboratory session began, and were given time to ask general questions about the study. They were asked to complete a pre-test questionnaire with personal questions and questions on their background in computers and operating systems. The participants of the study (the subject group) were occasional users who had heard of F/OSS and Linux before, but none of them had had a compelling experience with a Linux distribution. All of them had Internet access at work or school, used computers very often in their everyday life, and classified themselves as “casual Microsoft Windows users with the ability to install an operating system and connect to the broadband internet”. During the participant selection procedure, the following items were kept in mind and the participants were screened accordingly. The candidate participant should:

− have Internet and network configuration skills
− be familiar with a graphics package
− have used a word processor and the Google search engine before
− have a working e-mail account
− have used an instant messaging application before
− have an excellent knowledge of English computer terminology

All the participants were part-time or full-time workers, aged 22 to 28. The participant group was not segmented, as any division would have resulted in groups of six members or fewer, which is arguably too few for a statistical analysis.
3 Results

Results were obtained by noting the participants' attitudes and verbal reports, evaluating the recorded videos, and measuring task completion times. The results for each task are as follows.

3.1 Box Appearance and Appeal

The distributor chose a traditional off-the-shelf software box design, aiming to establish brand recognition in the same way as other branded OS manufacturers. All participants except P4 could easily open the box and examine the contents. The boxes were closed with an adhesive label, and P4 pointed out that the box was tricky to open and that it was hard to take the book and CDs out. None of the participants examined the back cover, which described features and advanced support options. This may result from the lack of a sense of possession of the box. P3 was very satisfied with the book's level of detail.

3.2 Installing the Operating System

All participants found the installation intuitive, fast and entertaining. All participants but P4 chose the express install, which allowed them to install the operating system in 5 clicks. One participant (P1) asked for an installation floppy, based on past experience, later realized that the box contained none, and inserted the installation CD. Also, only P1 created an ordinary user account and used it. P2 tried to insert the application disk, got no response, and had to ask the test coordinator what to do. P3 commented that it was effortless to install without needing a driver disk. The terminology used appeared adequate (i.e., not technical), although P8 asked what /dev/hda2 meant. Only one participant (P2) found it interesting to follow the notices about security and stability that appeared on the installation screen; the others skimmed the user's guide. The duration of the installation process varied from 12 minutes 27 seconds (P2) to 31 minutes (P4).
The reason it took so long for P4 was that she rebooted the machine, thinking that it had hung during installation initialization. The mean duration of the installation task was 16 minutes and 42 seconds.

3.3 Using the Web Browser

There is a perceived inconsistency on the Linux desktop stemming from the distributed development paradigm. Figure 2 shows different save dialogs with no logically coherent confirmation buttons, colors, button orders, title messages or title-bar button positions. The same inconsistency was seen in the help framework of the corresponding software packages. Although Xandros uses Mozilla as its web browser, its functional equivalence and uncomplicated, analogous interface allowed participants to surf the web. All the participants recognized the “Web browser” icon on the desktop and performed this task without hesitation. However, while finding a suitable “Bosphorus” image and right-clicking on it proved obvious, the saving function puzzled all the participants. Users expected to find the desktop, but the file dialog did not intuitively guide them through saving the image to the desktop. The duration of this task varied from 1 minute 40 seconds to 4 minutes 36 seconds, with a mean of 3 minutes 20 seconds.

3.4 Using the Office Suite

The tested distribution came with an office suite (StarOffice) at hand, so there was no need to install one from CD or the network. While typing and modifying font attributes were easy, the participants lacked guidance to the image file saved in the “Using the web browser” task. P1, P2 and P4 were apparently unable to find the saved picture and insert it into StarOffice. This resulted from the fact that the StarOffice file dialog was considerably different from that of the painting software (Kpaint), an internal K Desktop Environment (KDE) application firmly integrated into the KDE desktop. While the Kpaint file save dialog explicitly showed the desktop icon, this was not the case in StarOffice. P2 liked StarOffice's 10.5 pt font handling system. The duration of this task varied from 1 minute 12 seconds to 7 minutes 14 seconds. The mean value was 3 minutes 40 seconds.

3.5 Sending an E-Mail

Although the original plan was to require participants to run an e-mail client, only P4 had experience with standalone mail software or a personal information manager.
Thus, the task was modified to “sending an e-mail using a webmail service the participant is acquainted with”. P4 easily discovered the Mozilla Mail interface and clicked on the mail icon, which led her to the configuration wizard and mail account setup. Both P2 and P4 were able to finish this task in 1.53 minutes, while it took P3 6 minutes 13 seconds to read e-mails and send an e-mail to a contact. Overall, the task's duration varied from 35 seconds to 6 minutes 13 seconds. The mean value for this task was 2 minutes 37 seconds.

3.6 Listening to a Music CD

Listening to the Alexander Balanescu “Angels & Insects” CD was quite effortless for the participants. When the CD was inserted into the drive, the KDE media player immediately popped up and started playing from the first track, just as P2 had thought aloud: “It should work when I insert the CD”. Two participants (P1, P3) were unable to locate the mixer at once, as they did not discover the Kmix (KDE mixer applet) icon in the system tray. P2 was the fastest participant, completing the task in 47 seconds, while it took P1 3 minutes 5 seconds. The mean value was 1 minute 32 seconds.
Fig. 2. Save dialogs of most anticipated software in a Linux distribution: (a) Mozilla 1.7, (b) Gimp 2.0, (c) OpenOffice.org 1.9.57, (d) Konqueror 3.3.0
3.7 Connecting to an Instant Messaging System

This proved to be the most difficult task of all. The instant messaging (IM) software, named Kopete, was hidden in the menus as “Instant messaging”, which was not meaningful to participants who were looking for ICQ. The term “instant messaging” is a rather technical description, and end users seem to refer to “ICQ”, “MSN” and the like instead. P1 found it nice that Kopete supports most major IM protocols. P2 waited to be connected after opening an ICQ account, which yielded no result. This was probably due to past experience. P2 also could not figure out how to maximize the Kopete window and ended up saying “I can't type here”. P3 and P7 liked the butterfly effect during the MSN connection. P3, P6 and P7 tried double-clicking on a window title bar, saw that the window became shaded, and had to click on it again to restore it. This window behavior, a KDE default, surprised most of the subjects during the test. All the participants were able to connect to their IM accounts. The duration of this task varied from 1 minute 28 seconds to 6 minutes 53 seconds. The mean completion time was 4 minutes 31 seconds.

3.8 Logging Out and Powering Off the Machine

All test subjects were able to log off by choosing the log off menu from the task bar.
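The mean task durations reported in Sections 3.2 to 3.7 can be put on a common scale by converting them to seconds. The following is a small sketch of that bookkeeping, assuming durations written as in the text ("16 minutes 42 seconds", "47 seconds"); the function name `to_seconds` and the dictionary keys are hypothetical.

```python
import re

_DURATION = re.compile(r"(?:(\d+)\s*minutes?)?\s*(?:(\d+)\s*seconds?)?")


def to_seconds(text: str) -> int:
    """Convert strings such as '16 minutes 42 seconds' or '47 seconds' to seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


# Mean completion times as reported in Sections 3.2 to 3.7.
MEAN_DURATIONS = {
    "installation": "16 minutes 42 seconds",
    "web browser": "3 minutes 20 seconds",
    "office suite": "3 minutes 40 seconds",
    "e-mail": "2 minutes 37 seconds",
    "music CD": "1 minute 32 seconds",
    "instant messaging": "4 minutes 31 seconds",
}

total = sum(to_seconds(d) for d in MEAN_DURATIONS.values())
minutes, seconds = divmod(total, 60)
print(f"sum of mean task times: {minutes} min {seconds} s")
```

Summing the means gives a rough idea of how much of a session was spent on the timed tasks themselves, as opposed to the warm-up, briefing and questionnaires.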
M. Göktürk and G. Çetin
3.9 Trying to Get Online Assistance
One of the test tasks was to see whether the test subjects would try to get online assistance. The test showed that the subjects tend to look at the operating system manual first, leaving online assistance as an alternative. This may be attributed to time pressure on the participants, since the quickest test lasted nearly one hour, including the briefing on general instructions and the signing of the consent form.
3.10 Results of Post-questionnaire
We distributed a post-questionnaire consisting of 7 10-point Likert scale questions and 3 open-ended questions. The scaled questions rated aesthetic appeal, ease of use, understandability, consistency, efficiency, effectiveness and learnability. The open-ended questions were used to collect the respondents' likes and dislikes about the test, together with their opinions on how to increase the usability of the test platform. Some of the questions and their average ratings were as follows: “The product box easily opened and set to use” (mean: 6.38/9). “Linux is an easy-to-learn system” (mean: 7.35/9). “If I had time, I would be willing to learn and use Linux” (mean: 7.38/9). “The test has progressed in a successful manner” (mean: 7.88/9). Some remarks from the participants were interesting: “The easiest thing was to start the web browser” (P7), “The interaction between distinct programs can be better defined” (P3), “There could have been a Documents directory on desktop” (P6).
4 Conclusion
We observed an interesting attitude towards free operating systems: heavy criticism and underestimation of existing functions. This attitude is largely affected by the OOBE and the very first hours of use. Our tests indicated that the majority of the problems were visibility and consistency problems. Well-used metaphors were positively perceived by participants. Product packaging and online support were found to be less important. We believe that the F/OSS developer community should pay more attention to OOBE design in order to achieve better software adoption. The advantage of being “free” or “open source” turns into a disadvantage, since a problematic F/OSS OOBE is likely to divert the user to an alternative much more quickly than a problematic commercial OOBE would. A rigorous statistical analysis was not performed, since we aimed to investigate key OOBE issues mostly in qualitative form and the number of participants was limited. This is a limitation of our study. Thus, more rigorous in-depth experiments focusing on individual tasks with a larger number of participants remain as future work.
Acknowledgments. We would like to thank Bilgi University for providing the required facilities and logistics during the F/OSS usability test.
Factor Structure of Content Preparation for E-Business Web Sites: A Survey Results of Industrial Employees in P.R. China
Yinni Guo (1) and Gavriel Salvendy (2)
(1) School of Industrial Engineering, Purdue University, West Lafayette, IN 47907, U.S.A. [email protected]
(2) School of Industrial Engineering, Purdue University, West Lafayette, IN 47907, U.S.A., and Department of Industrial Engineering, Tsinghua University, Beijing 100084, P.R. China [email protected]
Abstract. To enhance the quality of e-business web sites, a study of the factor structure of content preparation is needed. Based on the background literature, a content preparation survey of 70 items was developed and completed by 428 white-collar employees of the XOCECO company in mainland China. The survey aimed at identifying the significant content factors of e-business web sites. The questionnaire had an internal consistency of 0.75. A factor analysis of the data indicated fifteen main content factors for e-business web sites, which account for 60.1% of the total variance. The factors, in order of importance, are: security content, quality content, service content, appearance description, contact information, aid function, customized function, search function, product specification, purchasing aid, price content, detailed description, comment content, matching product, review content. This study concludes with guidelines for the design of content preparation for e-business web sites.
Keywords: Content Preparation; E-business; Factor Structure.
2 Background Literature
2.1 Inadequacy of Usability Study
Previously, HCI researchers focused more on usability, that is, on “how the information should be presented to customers”. Their studies address how to improve the availability, accessibility, efficiency and format of web sites and the underlying systems. However, usability enhancement alone is not enough. A web site could be designed with a highly efficient structure based on usability theories and still be useless if it contains the wrong content. For example, a well-organized e-business web site that provides education information would still be a poor web site, because education information is not what we expect from it. Therefore, attention should be paid not just to “how”, but also to “what kind of information should be presented”. As Proctor et al. (2002) [22] pointed out, the way in which information is organized and displayed is vital for web sites that contain a large amount of data intended to be accessed by a range of users, or even a targeted group of users. Some web sites, like library categories or personal homepages, can be organized straightforwardly, but many cannot. A large number of systems and web sites contain materials that cannot be readily classified into well-established, distinct, and limited topics for search. For companies such as e-business companies, which take full advantage of the potential offered by the web, it is essential that their web sites be prepared and organized in a highly usable manner and with high-quality information. However, this is often not the case. In the study of Kim, Kishore and Sanders (2005) [13], users of Business-to-Business systems reported a number of serious problems during transactions, including difficulties in locating the required information, completing ongoing transactions, finding timely and accurate information, and finding adequate electronic service functions to complete online transactions. Similarly, according to Nielsen et al. (2000) [20], because of low-quality information, users cannot find the items they want 36% of the time. Therefore, more research on web site information and web content is needed.
2.2 The Definition of Content Preparation
The formal definition of content preparation was first proposed by Proctor and colleagues in 2002 [22]. It covers what information needs to be selected, how the information should be stored and organized, how it can be retrieved and how it should be displayed. In this study, we focus on what information needs to be selected. The definition of content preparation builds on previous studies of information quality. Several studies (Katerattanakul & Siau, 1999; Liu & Arnett, 2000; Loiacono, 2000; McKinney, 2002) [12, 16, 17, 18] point out that the information quality of an e-business web site, which has an impact on satisfaction with the online purchasing process, has been identified as an evaluation criterion and a dimension of web site quality and usability. However, content preparation differs from information quality in some respects. Information quality studies cover both the content and the presentation of information, while content preparation is concerned only with what kind of information needs to be presented.
2.3 Content Preparation for E-Business Web Sites
The WWW is a pull medium, which means that audiences have much more control over what they want to see than in traditional media (Pollach, 2005) [21]. They will leave and turn to other web sites if they feel that a web site does not meet their expectations. Recent statistics show that up to 80 percent of satisfied online consumers would shop again within two months and 90 percent would recommend the Internet retailer to others (Cheung, Lee, 2004) [3]. However, 87 percent of dissatisfied customers would permanently leave their Internet merchants without any complaint (Cheung, Lee, 2004) [3]. Similarly, Ginige (2002) [6] found that 84% of systems did not meet business requirements, 53% of systems did not have the required functionality, 79% of projects were late, and 63% of projects exceeded budget. Therefore, ways of preventing such failures in e-commerce development should be studied.
Content Preparation and Customer Satisfaction
Consumers’ perceptions of Internet retailers are mainly built upon their interactions with the web sites. The gap between customers’ expectations and what e-business web sites present is the major cause of e-commerce shortcomings. As Janda et al. (2002) [10] and Szymanski and Hise (2000) [28] suggested, information quality is a strong determinant of consumer satisfaction in Internet shopping. Similarly, Namjae and Hyojae (2003) [19] found a significant relationship among customer needs for specific contents, information usefulness, and the expected benefit from the contents. Customer satisfaction and web site content are also closely related. Girard’s (2002) [7] work on consumers’ preference for shopping on the Internet indicated that the content of a web site affects consumers’ shopping patterns. McKinney et al. (2002) [18] also specified web customer satisfaction as being affected by information quality and system quality.
Similarly, Turban and Gehrke (2000) [29] argued that the quality of the web content determines whether potential customers are attracted to or driven away from the web site. The issue of customer expectation and satisfaction includes not only attracting new customers, but also the challenge of customer retention. As Anderson and Srinivasan pointed out in 2003 [1], satisfaction is one of the most important consumer reactions in Internet shopping, and its importance is reflected in customers’ loyalty. Even for a company’s long-term plans, improving customer satisfaction would improve the company’s market share and profitability (Reichheld and Schefter 2000) [25].
The Investigation of E-Business Content
There have been a few studies of content quality for web sites, especially e-business web sites. Kim, Kishore and Sanders [13] concluded in 2005 that the content dimension consists of three quality constructs: information accuracy, information relevance, and information completeness. Similarly, Barnes and Vidgen (2001) [2] defined web site content quality as the ability to provide accurate, timely, and reliable information as well as the suitability of the information for the users’ purpose. As early as 2000, Huizingh [9] developed a research framework for distinguishing between the content and design aspects. In 2002, Robbins and Stylianou [27] used a framework adapted from Huizingh’s [9] to present a conceptual model which differentiates web site content from design. The conceptual web site
content/design model is used for studying the features of global corporate web sites and for determining whether the content and design features have become globally standardized or whether differences exist as a result of national culture and industry. In their white paper, Proctor et al. (2002) [22] discussed the skeleton of content preparation and areas for future study. The vast amount of information available through the Internet has made it difficult to retrieve information relevant to a specific task. To help ensure that users’ interactions with a system are successful, the preparation of content and its presentation to users must take into account a) what information needs to be extracted, b) the way in which this information should be stored and organized, c) the methods for retrieving the information, and d) how the information should be displayed. Proctor et al. (2003) [23] looked further into content preparation and management for e-commerce web sites. They first investigated how to elicit the knowledge and information that needs to be contained in a particular web site or application. If the appropriate information is not identified, the content-preparation process cannot possibly succeed. Assuming that this information is identified, the second area involves organizing and structuring the content for the web. The organization and structure of the information should reflect the context, content, and users. Numerous methods have been developed to assist developers in determining the elements of information for specific tasks and how these elements should be combined. Not only must an appropriate information architecture be developed, but it must also be paired with an effective search engine that allows users to easily retrieve the information they desire.
The structure and organization of information have to be mapped to the interface display, and the information needs to be conveyed in a manner that promotes successful interactions with users. Many companies have begun to focus more on customer expectation and satisfaction. Amazon.com, for example, at the beginning of the 21st century provided a very high quality of service, including reliability, dependability and trust, in order to have customers buy more than books and music and to gain more repeat business. Today, the architecture of corporate web sites appears to have incorporated a number of common features. These include search capabilities and site maps that recognize the diversity policy statement, security information and location information (Robbins and Stylianou 2002 [27]). Traditional approaches to information quality and content preparation fall short in the context of e-business systems, as they do not adequately encompass and address aspects that are unique to these systems. E-business systems, enabled by the Internet, Web, and hypermedia technologies, are highly dynamic and interactive in nature, utilize rich hypermedia mechanisms in user interfaces for information presentation, and provide a tremendous amount of control over temporal aspects of information delivery to end users (Kim, Kishore and Sanders 2005 [13]).
Factor Structure of E-Business Content Preparation
Preparing and organizing the necessary information is central to the success of content preparation. In this paper, we call it the factor structure of content preparation. The factor structure includes the factors, which are the important categories in the content, and the structure, which includes the relationships between two factors and among factors, and the order of the factors. Web site designers may incorporate these findings into the design of e-commerce sites in an attempt to increase the shopping satisfaction of their users.
Companies that are able to organize and structure information in a way that promotes efficient and effective retrieval will save time and money, promote customer satisfaction and continued business, and gain an advantage over their competitors. There have been some studies of factor structure. McKinney et al. (2002) [18] offered suggestions on the structure of information for e-commerce web sites. They recruited undergraduate and graduate students at a large metropolitan university who had more than two years of Internet experience in order to create instruments for measuring the constructs of web information quality and system quality. According to their results, web sites should be relevant, understandable, reliable, adequate and useful, and should cover a broad scope. Proctor et al. (2003) [23] also provide some general principles. The first is to prevent users from getting lost. The information needs to be structured in a manner that is simple and intuitive, and users must be able to navigate through the web site comfortably and confidently. The second is to communicate the structure of the site to the users. Techniques are needed to help communicate a site’s structure, for example obvious major-section navigation, obvious sub-section navigation, navigational breadcrumbs, and site map help. The last is to satisfy customers. The organization and structure of the information must satisfy the needs, objectives, and preferences of users. For example, links to related web sites or to a table comparing prices or descriptions of similar items can be added. However, these factors remain at a theoretical level. For e-business, we need more applied factors and classifications. Online surveys and other research have identified the factors and classifications that customers expect and consider important. The results of an online survey of 488 individuals in the United States by Lightner (2003) [15] indicated some generally important aspects of a web site.
It shows that respondents are generally satisfied with their online shopping experiences, with security, information quality and information quantity ranking first in importance overall. Lightner [15] also suggested that the web environment, which includes merchandise, navigation, ease of purchase and feedback, is important for customers. Product price is another important feature. According to Hilsenrath (2002) [8], consumer satisfaction rises as prices are driven lower. Similarly, according to a report from TheStreet.com, online consumers are becoming more price conscious than ever. Because of the importance of price, more and more price comparison web sites have appeared. Sites such as www.dealsea.com and www.dealinstyle.com provide retail deal and price information. Transportation agency web sites such as www.priceline.com and www.expedia.com are very popular among customers because they offer a convenient way of comparing prices. Several papers (Liao and Cheung 2002 [14]; Lightner et al. 1996 [15]) indicated that, since the commercialization of the Internet, the speed of access has been of concern to users. Szymanski and Hise (2000) [28] found product information and site design critical in creating a satisfying customer experience. The most closely related research on the information necessary for online shopping was published by Detlor, Sproule and Gupta in 2003 [5]. Fourteen information categories for search and browse tasks were found through open-ended questions answered by 962 participants. Similar research, such as a television advertising study by Resnik and Stern (1977) [26], a tourism web site information study by Corfu, Laranja and Costa in 2003 [4], and a case study of cultural web site content (Karagiogoudi, Karatzas, 2003) [11], also listed some important attributes in web site content design.
2.4 Conceptual Model
Based on the collection and comparison of the previous studies shown in Table 1, a conceptual model consisting of 24 potentially significant factors was defined. Most of them are supported by the background literature. The others, like “Component Description”, “Feeling Description”, “Operation Aid” and “Match Product”, were added according to our observation of e-business web sites in the US (Amazon.com, Ebay.com) and China (Taobao.com, Ebay.com.cn, joyo.com).
Table 1. Potential Factors for E-business Web Sites
Contact Information Search and Category Review News Shipment Service Link Responsibility Security Match Product
Literature Support Detlor et. al (2003) Detlor et. al (2003) Detlor et. al (2003) From Observation Detlor et. al (2003) From Observation Detlor et. al (2003) From Observation Resnik, Stern (1977) Szymanski, Hise (2000) Corfu, Laranja, Costa (2003) Resnik, Stern (1977) Detlor et. al (2003) Resnik, Stern (1977) Hilsenrath (2002) Corfu, Laranja, Costa (2003) Detlor et. al (2003) Berr (2006) Detlor et. al (2003) Detlor et. al (2003) Karagiorgoudi, Karatzas (2003) Lydon, Fennell (2003) Corfu, Laranja, Costa (2003) Lydon, Fennell (2003) Detlor et. al (2003) Corfu, Laranja, Costa (2003) Detlor et. al (2003) Corfu, Laranja, Costa (2003) Karagiorgoudi, Karatzas (2003) Lydon, Fennell (2003) Detlor et. al (2003) Karagiorgoudi, Karatzas (2003) Detlor et. Al (2003) Lightner (2003) From Observation
3 Method
The best way to determine what information customers want in e-business operations is to ask the customers themselves. Hence, a questionnaire was developed for this purpose. The questionnaire was first written in English according to the categories in Table 1 and then translated into Chinese. A paper-based survey was chosen for the convenience of data collection. Factor analysis was chosen to analyze the questionnaire in order to determine how the individual questionnaire items group into specific categories or factors. This analysis provides potentially beneficial information for designers in determining critical factors or attributes.
3.1 Sample
The paper-based survey was sent to XOCECO (Xiamen Oversea Chinese Electronic Co.) in Xiamen, mainland China. XOCECO, also known as PRIMA in the USA, was founded in 1985. It now has about 5,000 employees and manufactures a wide variety of electronics products such as PDP TVs, LCD TVs, HDTVs, LCD monitors, color TVs, fax machines, other telecommunication products and electronic components. The survey was conducted by the director of the company. Initially, 456 subjects participated in the survey. A subject was eliminated if he or she did not answer all the questions. In the later analysis, a subject was also considered unreliable and eliminated if he or she gave precisely opposite responses to the same question. Of the 456 subjects, 16 were removed because of missing answers and 12 because of lack of reliability. Therefore, the study used data from the remaining 428 subjects, 93.9% of the original sample. The participants were mostly young people aged from 20 to 40 with an associate or bachelor's degree. They were balanced in gender, and about half of them were engineers. Most of them had Internet experience, but only half of them had actually purchased something on the Internet.
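The screening steps described for the sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names and the toy respondents are hypothetical, and “precisely opposite response” is assumed here to mean two ratings of a repeated question that mirror each other around the midpoint of the seven-point scale (1 vs 7, 2 vs 6, 3 vs 5). The last lines simply re-check the reported arithmetic (456 - 16 - 12 = 428, about 93.9%).

```python
SCALE_MIN, SCALE_MAX = 1, 7  # seven-point Likert scale used in the survey


def is_complete(answers):
    """A respondent is kept only if every item was answered (no None)."""
    return all(a is not None for a in answers)


def is_consistent(answers, repeated_pairs):
    """Return False if any repeated question received mirrored ratings.

    Assumption: 'precisely opposite' means the two ratings differ and
    sum to SCALE_MIN + SCALE_MAX (1 vs 7, 2 vs 6, 3 vs 5).
    """
    for i, j in repeated_pairs:
        a, b = answers[i], answers[j]
        if a != b and a + b == SCALE_MIN + SCALE_MAX:
            return False
    return True


def screen(respondents, repeated_pairs):
    """Drop incomplete respondents first, then inconsistent ones."""
    complete = [r for r in respondents if is_complete(r)]
    reliable = [r for r in complete if is_consistent(r, repeated_pairs)]
    n_missing = len(respondents) - len(complete)
    n_unreliable = len(complete) - len(reliable)
    return reliable, n_missing, n_unreliable


# Toy data (hypothetical): 4 respondents, 3 items; items 0 and 2 repeat
# the same question.
toy = [
    [6, 5, 6],     # kept
    [7, 4, None],  # dropped: missing answer
    [1, 3, 7],     # dropped: opposite answers to the repeated question
    [4, 4, 4],     # kept
]
kept, n_missing, n_unreliable = screen(toy, repeated_pairs=[(0, 2)])

# Re-checking the counts reported in the paper.
initial, missing, unreliable = 456, 16, 12
remaining = initial - missing - unreliable
retention_pct = round(100 * remaining / initial, 1)
print(remaining, retention_pct)
```

Filtering for completeness before consistency mirrors the order in which the two exclusion rules are described in the text.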
3.2 Instrument
Survey results provide an indication of what customers think they may need in a product or service, or what they may like or dislike. Surveys typically form the first stage of scientific inquiry. Hence, subsequent to this survey study, highly controlled experimental studies will need to be performed in which the participants respond to specific scenarios displayed to them, or to computers with which they can interact. All responses to the survey were scored on a seven-point Likert scale ranging from 1 (strongly disagree) to 7 (strongly agree). Some of the items were worded negatively, following scale development recommendations, to reduce bias. The survey began with general characteristic questions, followed by randomly ordered items on e-business web site content.
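Negatively worded items on a 1 to 7 scale are normally reverse-scored before any reliability coefficient is computed. The sketch below illustrates that step together with Cronbach's alpha. Note the assumptions: the paper does not name the internal-consistency coefficient it reports, so Cronbach's alpha is used here only as a common choice; the function names and the toy ratings are hypothetical and are not the study's data.

```python
from statistics import variance


def reverse_score(score, scale_min=1, scale_max=7):
    """Reverse a rating on a Likert scale (7 -> 1, 6 -> 2, ..., 4 -> 4)."""
    return scale_min + scale_max - score


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (rows) by items (columns).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances are used throughout.
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy ratings (hypothetical): item 1 is assumed to be negatively worded.
raw = [
    [6, 2, 7],
    [5, 3, 5],
    [2, 6, 1],
    [4, 4, 4],
]
NEGATIVE_ITEMS = {1}

recoded = [
    [reverse_score(v) if i in NEGATIVE_ITEMS else v for i, v in enumerate(row)]
    for row in raw
]

alpha_before = cronbach_alpha(raw)
alpha_after = cronbach_alpha(recoded)
print(round(alpha_before, 2), round(alpha_after, 2))
```

In this toy example the coefficient is negative before recoding and high afterwards, which is why the recoding step matters whenever a questionnaire mixes positively and negatively worded items.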
4 Results and Discussion
4.1 Overview
The survey has an acceptable overall internal consistency of 0.749, with a consistency of 0.73 for the product description content and 0.75 for the web site service content. The means of all related content items ranged from 4.62 to 6.54. Some questions have extremely low or high means. Question 44 (Match Product), Question 56 (View Product), and Question 57 (Shopping Records) have the lowest means. These results show that the participants did not consider these items important content. On the other hand, Question 4 (Manufacturer Reputation), Question 21 (Quality Certificate), Question 64 (Privacy) and Question 65 (Security) have high means. These items were therefore considered very important for the online shopping procedure. The standard deviations of all related content items ranged from 0.62 to 1.60. Question 63 (Responsibility), Question 64 (Privacy) and Question 65 (Security) had low standard deviations of 0.64, 0.63 and 0.62, respectively. This indicates that participants had similar attitudes towards the items related to their safety or privacy. In contrast, Question 44 (Match Product), Question 56 (View Product) and Question 57 (Shopping Records) had large standard deviations of 1.39, 1.39 and 1.60, respectively. This shows that participants’ opinions varied considerably on the importance of these items.
4.2 Factor Analysis of Content Preparation for E-Business Web Site
Determining the Number of Factors
By examining the eigenvalues, we found that 15 factors had eigenvalues equal to or greater than 1.00. Moreover, examination of the scree plot shows an elbow point between 15 and 16 factors. Therefore, 15 factors are used in this study, which together explain 60.4% of the total variance.
Defining the Factors
The Appendix shows the results of the orthogonally rotated factor loadings using the principal components procedure. Items with loadings lower than 0.50 were considered insignificant and eliminated. The Appendix also lists the remaining items that have a loading of 0.50 or higher. Only one question, Q17 (benefit of product), does not load on any factor. The Appendix also ranks the content factors by importance.
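The two selection rules used here, keeping factors with eigenvalues of at least 1.00 and keeping items with loadings of at least 0.50, can be sketched as follows. This is an illustrative sketch only and not the authors' analysis: the eigenvalues and loadings are made-up toy values, the function names are hypothetical, and it is assumed that the eigenvalues come from a correlation matrix, so that their sum equals the number of items and the explained share is the sum of the retained eigenvalues divided by the sum of all of them.

```python
def retained_factors(eigenvalues, threshold=1.0):
    """Eigenvalue rule: keep factors whose eigenvalue is >= threshold."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev >= threshold]


def share_of_variance(eigenvalues, retained):
    """Proportion of total variance explained by the retained factors."""
    return sum(retained) / sum(eigenvalues)


def assign_items(loadings, cutoff=0.50):
    """Map each item to its strongest factor if |loading| >= cutoff.

    Items with no loading at or above the cutoff are left unassigned,
    which is the situation reported for Q17 in the study.
    """
    assigned = {}
    for item, by_factor in loadings.items():
        factor, value = max(by_factor.items(), key=lambda kv: abs(kv[1]))
        if abs(value) >= cutoff:
            assigned[item] = factor
    return assigned


# Toy eigenvalues for six hypothetical items (NOT the study's data).
toy_eigenvalues = [2.4, 1.3, 1.0, 0.7, 0.4, 0.2]
kept = retained_factors(toy_eigenvalues)
explained = share_of_variance(toy_eigenvalues, kept)

# Toy rotated loadings (hypothetical item and factor labels).
toy_loadings = {
    "Q1": {"F1": 0.72, "F2": 0.10},
    "Q2": {"F1": 0.31, "F2": 0.55},
    "Q3": {"F1": 0.42, "F2": 0.38},  # below the cutoff on every factor
}
assigned = assign_items(toy_loadings)
print(len(kept), round(100 * explained, 1), assigned)
```

In the study the same two rules, applied to the 70 survey items, led to 15 retained factors and one unassigned item; the scree plot inspection mentioned in the text is a separate, visual check that is not reproduced here.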
Even though Security Content and Quality Content explain only 2% and 3% of the total variance, respectively, they are considered the two most important factors by the customers, immediately followed by Service Content. Conversely, although Factor 1, Purchasing Aid, accounts for 22% of the total variance (by far the largest share of any factor), its mean ranks only 10th among the 15 factors. So when organizations decide on the relative importance of each of the 15 factors, they need to consider both the rank of each factor with regard to its mean and the share of total sample variance each factor explains.
4.3 Effect of Users
The ANOVA results show no significant differences between genders or between education levels. Significant differences do occur, however, between age groups, career groups, purchasing experience groups, and web experience groups. Table 2 presents the results for the age groups. There are some statistically significant differences, such as the difference between the youngest and the oldest participants on “View Aid Content”. “View Aid Content” means the function that
allows customers to check the products they recently viewed or their shopping records. However, these percentage differences are too small to be of practical use. This means that no special design for particular population groups is needed.
Table 2. Comparison Results of Different Age Groups: Mean Score and Standard Deviation
No. | Factor Name | Group 1 (21-30) | Group 2 (31-40) | Group 3 (41-50) | Group 4 (51- ) | % of mean* | F | P
F5 | Quality Information | 6.27 (0.52) | 6.13 (0.52) | 6.01 (0.63) | 6.21 (0.55) | 0.9-4.0 | 4.49 | 0.0041
F7 | Appearance Information | 6.06 (0.65) | 5.92 (0.61) | 5.74 (0.86) | 6.08 (0.72) | 0.3-5.9 | 4.04 | 0.0075
F9 | View Aid Content | 5.07 (1.07) | 5.04 (0.99) | 4.58 (0.91) | 4.48 (1.14) | 0.6-13.2 | 5.60 | 0.0009
Means & Std: each group cell gives the mean with the standard deviation in parentheses.
* The percentage of difference between means.
4.4 Comparison with Previous Study
As mentioned in Section 2.3.3, the most closely related research is that of Detlor et al. (2003). Their study identified 14 important categories for pre-purchase information seeking. Both studies identified important content for e-business web sites; however, the results differ considerably. The Detlor study did not include “Security”, “Aid Function” and “Contact Information”, which were found to be very important in this research. The most likely reason is the difference in survey and experimental methods. Detlor’s categories were derived from open-ended questions and a free-style experiment with 31 undergraduate students, whereas this study generated potential factors from the literature and surveyed 428 industrial employees. In addition, different analysis tools were applied: Detlor et al. had experts code and categorize the free-style responses, whereas this study used the more rigorous tool of factor analysis to derive the results.
5 Conclusions and Recommendations
The Appendix shows that e-business web site designers should provide a secure purchasing environment, detailed descriptions of products, descriptions of services, and multiple aid functions for searching and comparison. In addition, an e-business web site should provide a convenient and flexible environment for customers. The results of this study can be used not only in the design process, but also in the evaluation of e-business web sites. Previous studies on evaluating e-business web sites focus mainly on usability; hence this study could serve as a complement. Based on the significant factors generated from the questionnaire and the items loaded on each factor, an evaluation form could be derived. E-business and online shopping are quite new in China. Although most participants in this study had searched for products online before, 59.8% of the respondents had never actually purchased products from an online store. Moreover, no subject had more than 8 years of online shopping experience, and most “experienced” subjects had begun to try online shopping only one or two years earlier. By comparison, most people in the US have had online shopping experience for a much longer period. This result means that e-business in China is still developing, and most e-business web sites are still under construction. Given China's large population and the current state of development, e-business and online shopping are expected to have a large market. Therefore, this survey could serve as a reference for e-business development in China.
References 1. Anderson, R.E., Srinivasan, S.S.: E-satisfaction and E-loyalty: A contingency framework. Psychology and Marketing 20(2), 123–138 (2003) 2. Barnes, S.J., Vidgen, R.T.: Assessing the Quality of Auction Web Sites. In: Proceedings of the 34th Hawaii International Conference on System Sciences, p. 10 (2001) 3. Cheung, C.F., Lee, W.B., Wang, W.M., Chu, K.F., To, S.: A Multi-perspective Knowledge Based System for Customer Service Management. Expert System with Applications 24(4), 457–470 (2003) 4. Corfu, A., Lanranja, M., Costa, C.: Evaluation of Tourism Website Effectiveness: Methodological Issues and Survey Results. In: Human Computer Interaction: Theory and Practice, Part I, pp. 753–757. LEA, New Jersey (2003) 5. Detlor, B., Sproule, S., Gupta, C.: Pre-purchase Online Information Seeking: Search Versus Browse. Journal of Electronic Commerce Research 4(2), 72–84 (2003) 6. Ginige, A.: Web Engineering: Managing the Complexity of Web Systems Development. In: Proceedings of the 14th international conference on Software engineering and knowledge engineering, pp. 721–729 (2002) 7. Girard, T., Silverblatt, R., Korgaonkar, P.: Influence of Product Class on Preference for Shopping on the Internet. Journal of Computer Mediated Communication, 8(1) (2002) 8. Hilsenrath, J.: Consumer Satisfaction Rises As Prices Are Driven Lower. The Wall Street Journal (November 2002), Retrieved from http://www.theacsi.org/WSJ/wsj_11_18_02.htm 9. Huizingh, E.: The Content and Design of Web Sites: an Empirical Study. Journal of Information and Management 37(3), 123–134 (2002) 10. Janda, S., Trocchia, P.J., Gwinner, K.P.: Consumer perceptions of Internet retail service quality. International Journal of Service Industry Management 13(5), 412–431 (2002) 11. Karagiogoudi, S., Karatzas, E.: Web-site Quality Evaluation, A Case Study on European Cultural Web-sites. In: Human Computer Interaction: Theory and Practice, Part I, pp. 783–787. LEA, New Jersey (2003) 12. 
Katerattanakul, P., Siau, K.: Measuring Information Quality of Web Site: Development of an Instrument. In: Proceedings of the 20th Annual International Conference on Information Systems, pp. 279–285 (1999) 13. Kim, Y., Kishore, R., Sanders, G.: From DQ to EQ: understanding data quality in the context of e-business systems. Communications of the ACM 48(10), 78–81 (2005) 14. Liao, Z., Cheung, M.: Internet-based e-banking and consumer attitudes: An empirical study. Information & Management 39(4), 283–295 (2002) 15. Lightner, N.J.: What Users Want In E-commerce Design: Effects of Age, Education and Income. Ergonomics 46(1-3), 153–168 (2003)
794
Y. Guo and G. Salvendy
16. Liu, C., Arnett, K.P.: Exploring the Factors Associated with Web site Success in the Context of Electronic Commerce. Information and Management 38(1), 23–33 (2000) 17. Loiacono, E.T., Watson, R.T., Goodhue, D.L.: WebQualTM: A Web site quality instrument. Doctoral Dissertation. University of Georgia (2000) 18. McKinney, V., Yoon, K., Zahedi, F.M.: The measurement of web-customer satisfaction: An expectation and disconfirmation approach. Information Systems Research 13(3), 296–315 (2002) 19. Namjae, C., Hyojae, J.: Information Quality on Consumer Perception: Analysis of Web-based Travel Information. International Journal of Digital Management (2003) 20. Nielsen, J., Molich, R., Snyder, C., Farrell, S.: E-Commerce User Experience. Nielsen Norman Group, CA (2001) 21. Pollach, I.: Corporate Self-Presentation on the WWW. Corporate Communications: An. International Journal 10(4), 285–301 (2005) 22. Proctor, R.W., Vu, K., Salvendy, G., et al.: Content Preparation and Management for Web Design: Eliciting, Structuring, Searching, and Displaying Information. International Journal of Human-Computer Interaction 14(1), 25–92 (2002) 23. Proctor, R.W., Vu, K.L., Najjar, L.J., Vaughan, M.W., Salvendy, G.: Content Preparation and Management for E-Commerce Web Sites. Communication of the ACM 46(12), 289–299 (2003) 24. Ranganathan, C., Ganapathy, S.: Key Dimensions of Business to Consumer Web Sites. Information and Management 39, 457–465 (2002) 25. Reichheld, F., Schefter, P.: E-Loyalty: Your Secret Weapon on the Web. Harvard Business Review 78, 105–113 (2000) 26. Resnik, A., Stern, B.L.: An Analysis of Information Content in Television Advertising. Journal of Marketing, 50–53 (1977) 27. Robbins, S.S., Stylianou, A.C.: Global Corporate Web Sites: an Empirical Investigation of Content and Design. Journal of Information and Management 40, 205–212 (2000) 28. Szymanski, D.M., Hise, R.T.: E-satisfaction: an initial examination. Journal of Retailing 76(3), 309–322 (2000) 29. 
Turban, E., Gehrke, D.: Determinants of e-commerce Web site. Human Systems Management 19(2), 111–120 (2000)
Appendix. Importance Factors and Guidelines
Rank | Name | Mean | % Explained | Guidelines
1 | Security Content | 6.42 | 2% | Present all related security and privacy documents to customers in a clear way.
2 | Quality Content | 6.20 | 3% | Detailed information on product quality, including brand information, manufacturer reputation, and quality certificates, should be included.
3 | Service Content | 6.15 | 4% | The web site should include service time, cost, tracking information, and shipment choices.
4 | Appearance Description | 5.99 | 2% | The website should describe the product's appearance from all aspects, not only with a written description but also with aids such as photos or video.
5 | Contact Information | 5.89 | 2% | The website should clearly provide contact information in different forms, such as email, telephone, fax, and address, on the web page.
6 | Aid Function | 5.83 | 2% | The website should provide aid information such as new arrivals and news about the web site.
7 | Customized Function | 5.83 | 2% | The website should allow the customers to set their preferred content or category.
8 | Search Function | 5.77 | 3% | Multiple search functions, such as search by date, brand, price range, and price increment/decrement, should be provided, as well as multiple categories for customers to find products.
9 | Product Specification | 5.77 | 1% | The website should tell customers the specification or benefit of products for easy searching and comparison.
10 | Purchasing Aid | 5.67 | 22% | The website should provide multiple purchasing aid functions and information, such as an operation guide and a printable manual.
11 | Price Content | 5.64 | 2% | The website should provide not only the price, but also price comparison information and discount information.
12 | Detail Description | 5.49 | 10% | The website should describe detailed product information, such as components, ingredients, color, and technical details.
13 | Comment Content | 5.38 | 2% | The website should allow customers to comment on and rate products they have bought, and should make these comments available to other customers.
14 | Matching Product | 5.01 | 2% | The website should provide matching choices or related products on the web page for the customer's target product.
15 | Review Content | 4.95 | 2% | The website should provide a review function that helps customers check the products they have just reviewed or have ordered before.
Abstract. This paper describes a case study, in which PayPal China improved the user experience by streamlining its checkout experience. This project applied User-Centered Design methodologies and involved cross-functional and international collaborations within the company. The outcome of the project drastically improved user satisfaction.
1 Project Background
eBay China is one of the most popular e-Commerce Web sites in China. It facilitates communications and transactions among buyers and sellers. While PayPal has been the primary payment method for eBay transactions in most countries, the PayPal brand and concept were still new to many Chinese Web users. These users were primarily accustomed to making payments through direct bank transfers and face-to-face transactions. It was a business priority to increase the proportion of PayPal transactions on eBay. In addition, eBay and PayPal believed that increasing PayPal transactions would also benefit the broad eBay user communities by having more people use it. Thus, the goal of the project was to streamline the checkout flow of eBay/PayPal China and improve e-Commerce efficiency.
2 Analysis of Business and Design Problems
After analyzing business metrics and holding a series of brainstorming sessions, the team identified the key issues that discouraged Chinese users from using PayPal to complete eBay purchases, and proposed solutions to address them. The first issues to be addressed were the user flows and page designs that Chinese users found confusing. A number of known issues negatively affected the user experience, including:
• The relationship between eBay and PayPal was unclear, particularly to new users. After committing to a purchase on eBay, users were presented with payment options where PayPal was listed as the recommended method. Most users, not understanding the relationship between the two companies, were often confused at the point of payment. Furthermore, if the user selected PayPal to complete their payment, the checkout pages were in PayPal’s look and feel with only a small eBay logo in the corner. This added to the confusion about the relationship between the two companies.
• Non-PayPal users were reluctant to complete the PayPal account creation flow that was required before making the eBay payment. Many users dropped off at this point, unwilling to spend their time opening another account and creating yet another password they would have to remember.
There was also a broad range of business issues that had a fundamental impact on the user experience, including:
• eBay Anfutong was introduced into the Chinese market as the escrow product prior to the introduction of PayPal. Later, PayPal was introduced only as a direct-payment method although it was a solution for all eBay payments, including escrow. The Anfutong brand co-existed with the PayPal brand, causing confusion and internal competition.
A significant number of eBay users dropped off the payment flow due to their confusion and frustration with the issues mentioned above. Based on the analysis of these and other issues, the business and design departments proactively opened up a dedicated project to address them.
3 Design Solutions and Processes
The cross-functional teams worked closely together to compare metrics and share ideas. Key issues were identified and, after collaborative brainstorming, proposals for solutions emerged.
• Optimize page elements
  o Simplify the pages. Special attention was given to removing extraneous elements on the page. Superfluous content and information were removed, and the physical size of the pages was reduced. This reduced the perceived complexity of the earlier pages.
  o Provide more contextual help to users based on their scenario. For example, different Chinese banks have different requirements for users who conduct certain operations. On the page where the user selects the bank they wish to transfer money from, supplemental information specific to the chosen bank was displayed dynamically. Figure 1 shows an instance of the contextual help for a selected bank.
A. Han et al.
Fig. 1. Contextual help pertaining to bank selection
• Clarify relationship between eBay, PayPal and Anfutong
  o Make the transition from eBay to the PayPal checkout flow appear seamless to the users. Previously, the PayPal checkout flow was designed completely in the PayPal look and feel with only a small eBay logo in the corner. Because of this, the relationship between eBay and PayPal was unclear. In response to this issue, we changed the pages to have eBay’s look and feel with the PayPal logo in the corner as well as clear PayPal reference as appropriate throughout the flow. The original and new designs are shown in Figures 2a and 2b.
  o Re-brand Anfutong as “Anfutong by PayPal” to resolve the confusing and competing brands. eBay China decided to re-brand the escrow product to be “Anfutong by PayPal” so both existing and new users would have a clear understanding of the tight connection between PayPal and all payment solutions provided by eBay.
  o Provide easy sign-up to PayPal at the end of the payment flow. By putting the PayPal account creation flow after the payment, the payment flow is no longer interrupted by requiring signup. Having it after the payment flow presents the user with the opportunity to sign up for an account. To encourage users to sign up, the guest user information, gathered from eBay, was pre-populated in the PayPal form to make sign up easier. Users only needed to create and confirm their password before acquiring a fully-functional PayPal account.
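The pre-populated sign-up step can be illustrated with a minimal sketch. The field names and the helper functions below are hypothetical and chosen only for illustration; the paper does not describe the actual form fields or implementation. The sketch copies guest checkout data into a sign-up form and leaves only the password fields for the user to complete.

```python
# Hypothetical field names; the paper does not list the real form fields.
PREFILLABLE_FIELDS = ("name", "email", "address", "phone")
USER_SUPPLIED_FIELDS = ("password", "password_confirmation")


def build_signup_form(guest_info):
    """Return a sign-up form pre-populated from guest checkout data.

    Fields missing from guest_info stay empty, as do the password fields.
    """
    form = {name: guest_info.get(name, "") for name in PREFILLABLE_FIELDS}
    form.update({name: "" for name in USER_SUPPLIED_FIELDS})
    return form


def fields_left_to_fill(form):
    """List the fields the user still has to type in."""
    return [name for name, value in form.items() if not value]


# Illustrative guest data (invented for this example).
guest = {
    "name": "Li Wei",
    "email": "li.wei@example.com",
    "address": "1 Example Road, Shanghai",
    "phone": "000-0000",
}
form = build_signup_form(guest)
print(fields_left_to_fill(form))  # ['password', 'password_confirmation']
```

With complete guest data, the only remaining inputs are the two password fields, which matches the behavior described above.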
Streamlining Checkout Experience – A Case Study of Iterative Design
Fig. 2a. The original PayPal branded page
Fig. 2b. The redesigned eBay branded page with PayPal co-branding
PayPal and eBay have established processes for User-Centered Design. These processes provide a platform that supports collaboration between different team members. However, a number of special situations were encountered during this project.
This project was particularly complicated because of the wide scope of its coverage, the aggressive timeline, and the large number of stakeholders it involved. While a typical design project involves about 3-8 people in the design phase, this project involved close to 30 people from many organizations, including product managers, engineering and web development, content and localization, visual design, business strategies, and of course the user interface team. Through effective collaboration, these team members were able to work globally, with one team in California, USA and the other in Shanghai, China.
4 User Research to Support Design Decisions
Usability testing is especially important when designers come from a different culture than the target market of the product. In this situation, most design templates were created for the United States. Adjustments are often necessary when designs are exported to the China market. After these adjustments, and prior to finalizing the design of this key flow, a thorough usability study was conducted in China, complete with interactive prototypes. This study touched upon all possible scenarios for both new and existing PayPal users. The team was able to draw conclusions that validated many prior speculations and also surfaced new issues, together with suggested design enhancements. These design enhancements were further explored before launch. Two specific unexpected findings were of particular interest to the designers:
• Figure 3 shows the PayPal registration page. On this particular page, PayPal asks security questions for password recovery. The user could choose to answer personal questions such as “What is your Mother’s maiden name” and “What is your city of birth”. Some of these questions proved to be problematic in the Chinese context. For example, while the question “What is your city of birth” is perfectly valid for the United States, it is not recommended for China. In the United States, it would be difficult for a fraudster to guess the answer from among thousands of cities, given the wide geographic spread of the online population. In China, however, the online population is heavily concentrated in a few large cities, such as Beijing and Shanghai. Because of this, the question would be easy for a fraudster to guess correctly in China, and hence it is a less secure question.
• Also, it is common for forms in the United States to have two lines for the address field (often noted as “Address 1” and “Address 2”), so that users can put in their full address. In Chinese, however, the full address most commonly fits easily into one line. As a result, the fields “Address 1” and “Address 2” were interpreted as requesting the user to fill out two different addresses.
These findings are culturally sensitive and critical to achieving a good user experience. If the usability research had not been conducted in China, these issues would not have been identified, ultimately compromising the user experience.
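The security-question finding can be made concrete with a small calculation. The sketch below estimates how often an attacker succeeds by trying the most common answers first. The user counts per city are hypothetical numbers invented for illustration, not data from the study, and the function name is an assumption of this sketch.

```python
def top_k_guess_success(users_per_answer, k):
    """Fraction of accounts an attacker opens by trying the k most common answers."""
    ranked = sorted(users_per_answer, reverse=True)
    return sum(ranked[:k]) / sum(users_per_answer)


# Hypothetical populations of 1000 users each (illustrative only).
# Spread out: 100 birth cities with 10 users each.
spread_out = [10] * 100
# Concentrated: three dominant cities, plus 100 small cities with 3 users each.
concentrated = [300, 250, 150] + [3] * 100

print(top_k_guess_success(spread_out, 3))    # 0.03
print(top_k_guess_success(concentrated, 3))  # 0.7
```

Under these assumed numbers, three guesses succeed for 3% of accounts when answers are spread evenly, but for 70% when answers are concentrated in a few cities. This is the sense in which the same question is weaker for a geographically concentrated population.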
Fig. 3. Issues with security questions and address format
5 Conclusion and Lessons Learned
This project successfully addressed users’ needs with solid solutions to optimize the payment flow. The general feedback gathered from users was positive. Users commented that the new flows were more secure, intuitive and efficient. Moving forward, here are a few key points the team has learned from this project:
• Design and business solutions should be tightly associated in order to address both user needs and business requirements. This can only be achieved through close collaboration among team members.
• Prepare for flexibility when problems arise. User-Centered Design requires strong commitments, especially under tight resource constraints and uncertainties. The team should maintain an open mind in dealing with project dynamics.
• User research is very important to ensure the quality of the design. Particularly, many internationalization issues can be discovered only by diligently conducting user research in the target country with the appropriate cultural context.
Presence, Creativity and Collaborative Work in Virtual Environments
Ilona Heldal1, David Roberts2, Lars Bråthe3, and Robin Wolff2
1 Dept. of Technology Management and Economics, Chalmers University of Technology, SE - 412 96 Göteborg, Sweden [email protected]
2 The Center for Virtual Environments, The University of Salford, M5-4WT Salford, UK [email protected], [email protected]
3 Volvo Powertrain Corp., SE - 40508 Göteborg, Sweden [email protected]
Abstract. Research has identified many different concepts and factors, e.g. immersiveness, presence, performance, interaction, and defined a large number of guidelines that contribute to developing advanced virtual environments (VEs). By reviewing research on differences between individual work and group work, and how it is influenced by these factors, this paper aims to improve understanding of networked collaboration. Allowing creativity is considered to promote higher quality of work in general. The paper examines the impact of creativity on work in VEs, with focus on understanding the relationship between presence and creativity in collaborative virtual environments (CVEs). It is found that important prerequisites for successful outcomes are balance between presence and copresence and providing enough time and space for individual contributions.
Keywords: Virtual environments, individual work, collaboration, creativity, presence, copresence, social interaction, task performance.
e.g. by unnecessary discussions, technical interruptions, or social misunderstandings [1]. For a superior outcome it is not necessarily enough to allow advanced technologies and provide successful networking. How the individuals handle these, how the group is organized, and what the main technical characteristics and the social differences are, also have to be taken into account. Using different technologies in the same networked setting can result in misunderstandings [2]. The aim of this work is to examine the potential of collaborative work in CVEs with the focus on the relationship between presence and creativity and by considering already identified factors such as performance, available technology, technical devices, and experiences. We do not limit this study to examining exclusively VEs, but also consider research from computer-mediated communication (CMC) or computer-supported collaborative work (CSCW).
The paper is structured as follows. Section 2 briefly presents single-user work versus collaboration in order to understand evaluating work in VEs and in CVEs. Section 3 examines creativity, presence and those factors and concepts that are considered to have a great impact on outcomes in the view of how people a) solve problems, b) use technical devices, and c) experience VEs. Section 4 examines the influence of considering presence and creativity for individuals or for group collaboration. Section 5 includes a discussion on the findings and suggests future directions. The last section presents the conclusions.
2 Single-User Work Versus Collaboration
2.1 Taking the Step to Understand Collaborative Outcomes
How a group can be more than the sum of its individuals has preoccupied researchers for several decades. It is difficult to explore the benefits of group work in terms of individual contribution, task and context, and to obtain measurable and consistent results through several projects. Research has acknowledged that, in a well-working collaboration, individuals contribute at most 70% of their optimal individual performance. The results often depend on task, time, group, chosen evaluation method or applied theories [3]. Among the earliest, probably most promising distributed applications are those that support meetings and conferencing. For these, Scott (1999) summarized several group collaboration models in an input, process and output framework that includes task characteristics for the input, grouped according to McGrath’s decision typology: create a plan, choose, negotiate, and execute [4, 5]. Neal et al. argue for the importance of treating the second-level social system effects such as coupling of work, joint awareness and coordination [6]. Understanding such behaviors is important since including social interaction in the evaluation of collaborative work is crucial, complex and difficult [7].
Much applied research on using VEs concerns handling prototypes and models. The establishment of common grounds, conventions, awareness, trust, naturalness and human proximity plays an important role for applications supporting group work. It is found that visualizing strategies, better help for orientation, considering time and problem-solving issues together, more supportive feedback, and knowing more about cultural issues can strengthen distributed work [1, 8, 9].
2.2 Using VEs in Collaborative Settings
Several studies have identified factors such as presence, performance, intuitiveness, interaction, and leadership as important for VEs [1, 10]. Certain of these may be more closely associated with one specific application or type of VE than with another. There are also differences when varying some technical factors of an environment, and there are differences when varying the social setting [12]. For example, varying latency, field of view, different rendering usage, interaction styles [2, 11] or varying object representations, colors, context [1] can result in different performance and presence measures. For CVEs many problems remain and new problems are emerging. Many of the new problems are related to communication modalities and considering the influence of social interaction [13, 14]. The social interaction and the technical interaction often take place in parallel, or they are interconnected in a nondeterministic way [1, 7]. There are only a few works on defining evaluation methods for CVEs [11, 12]. To evaluate collaboration in CVEs that can be focused or unfocused, Tromp has identified three main stages, i.e. beginning, proper collaboration, and ending. These stages are embedded in a “meta-collaboration” context [11]. Understanding the task, problem-solving in relation to available or spent time in the environments, and choosing the right strategies influence the overall outcome [2, 11, 13]. This means quicker plan creation, less time spent for negotiation, and quicker decision-making [5]. Considering these issues is also a requirement for seamless interaction [15, 7]. Object-focused collaboration is common in creative tasks and is well researched in CVEs, and typically involves problem-solving [15]. We therefore use object-focused interaction as the basis for discussion of group work in CVEs.
To acquire an overall view of collaboration when people use CVEs, one must examine, according to [12]: a) how people can work, with special focus on problem-solving, b) how technologies can support this and how users can intuitively use devices, and c) what they experience. Allowing creativity is considered to augment the quality of work [16, 17]. Aiming to obtain creative work in CVEs, however, raises a lot of questions. It is debated whether individual creativity can contribute to group creativity at all [18]. How do the application, technology and social creativity, defined as “joint thinking, passionate conversations, and shared struggles among different people” (Fischer, p 4), impact upon this [18]?
3 Factors Influencing Outcomes
3.1 Creativity
Creativity has been linked to a state of mind known as flow. Flow is defined by Csikszentmihalyi via eight distinct dimensions: (1) clear objectives with immediate feedback, (2) skills suited to challenges, (3) action and awareness merge, (4) allowing high concentration, (5) sense of control, (6) loss of self-consciousness, (7) altered sense of time, which usually seems to pass faster, and (8) it is worth doing for its own sake [19]. We shall briefly overview relevant research on the dimensions of flow and creative problem-solving, creativity-supporting tools, and creative experiences. Table 1 summarizes these and discusses possibilities for support to individuals and groups.
Table 1. Possibilities to support creativity for individuals or for groups
What user(s) do? | Internal vs. external activities | Suggestions for supporting individuals | Suggestions for supporting group collaboration
Solve problems | Internal | By considering time-dependent relations, workflow, e.g. [20], and focus of attention [12]. | Support for integrating individual work in group work, e.g. [18]. Managing to handle easier interpretations and transparency of others’ activities [1], and allowing personal space.
Handle technologies | External: hard to understand sometimes | Define tasks that should be considered for designing creativity-supporting tools, e.g. [21]. | Symmetry helps. Otherwise make the group aware of each other’s possibilities.
Experience | Internal: high-tech might attract | Implement environments prepared to handle experiences, e.g. [16], and allow seamless interaction, e.g. [9]. | Common targets, clear objectives, feedback, rewards. Separate activities where group awareness stimulates group members.
Vass et al. emphasized the importance of considering an appropriate balance between challenges and skills, and of the immediate feedback and the clarity of goals and problem-solving to the sixth flow dimension, which also means "No worry of failure". They also emphasized the value of differentiating time-dependent relations for their workflow model [20]. However, the model cannot readily be extended for group work, and does not consider enhanced experiences for problem-solving. By considering the dimensions identified for flow, Shneiderman suggests certain tasks that have to be considered for designing tools that support creativity: (1) Searching and browsing, (2) Visualizing relationships, (3) Intellectual and emotional support, (4) Allowing free associations, (5) Exploring solutions, (6) Composing artefacts and performances, (7) Reviewing and replaying sessions, (8) Disseminating results [21]. Creativity-supporting tools in distributed scientific communities need to support flexibility in granularity of planning [22]. Roberts later demonstrated characteristics of immersive collaborative environments that can easily provide a seamless workflow through transitions between content and detail in planning [15]. Experiencing flow can be impacted by the way in which people handle different events, and by how they build a conceptual map in VEs. According to Fencott, to support creativity, knowledge about social and technical contexts should be considered already in designing VEs, in order to support VE users in handling sureties, surprises, and shocks [16].
Fischer differentiates between two levels of creativity, viz. historical creativity associated with fundamentally novel ideas and discoveries, and psychological creativity associated with ideas and discovery from everyday work practice. Accordingly, historical creativity can be associated with individual work, while psychological creativity incorporates social group creativity. Fischer found that individual creativity is usually integrated in social creativity. Besides functionality, it has been necessary to consider factors such as cultural diversity, the context of the experiment, individual versus group support, allowing reflection on minority conflicts, and supporting flexibility in granularity of planning. The social structure and mindset contribute better-formulated problem areas and stable environments. The collaboration is examined through several applications, e.g. Creation (collaborative drawing art) and Linux; it often has non-simultaneous characteristics, where the contributors take turns at work [18].
3.2 Presence
Short, Williams and Christie measured the quality of distributed work in 1976, and introduced social presence as the users’ subjective sense of being present in a social setting with another person [23]. Later, presence was identified as a main factor that contributes to an improved experience for VEs. It refers to experiencing being in a place other than where one is physically present. If this sense refers to being there together with one’s partner, then we speak of copresence [10]. In many studies copresence and social presence overlap, although social presence also means a higher-order presence with collaborative partners, involving the collaborators’ awareness of each other’s intelligence and their partners’ awareness of this.
Table 2. Possibilities to support presence for individuals or for groups
What user(s) do? | Internal vs. external activities | Suggestions for supporting individual presence | Suggestions for supporting group collaboration
Solve problems | Internal, although an external observer can see it | Fidelity and sensory information, match between sensors and displays [24]. Clear interaction. By considering focus of attention [12]. | Support for integrating individual work in group work, e.g. creativity [18]. Manage to handle easier interpretations, have awareness transparency of the others’ activities [1], allow seamless communication, show intentions, emotions [2], etc. Consider the dimensions defined by Garau [24].
Handle technologies | External: disturbances. Internal: intuitive technologies | Screen size, immersiveness, fewer breaks in presence [26]. Seamless interaction [21, 9]. Proper technologies, tracking, real time [35]. | Symmetry helps [2]; otherwise make the group aware of each other’s possibilities. Clear object laws [15].
Experience | Both internal and external: immersive technology, design may benefit | Implement challenges [2], allow seamless interaction [9]; for certain applications realism helps [12]. | Support group awareness, quick feedback, common objectives, allowing rewards, problem-free communication, using rich technologies that transmit human cues, important movements, naturalness [1, 9]. Consider cultural differences, leadership, emotion, etc.
Maya Garau examined the relationship between presence and copresence through a number of studies. She summarized the determinants as being: (1) the extent of fidelity and sensory information, (2) the match between sensors and displays, (3) content, and (4) user characteristics [24]. An important question here is how awareness relates to presence and to the collaborative experience. Higher presence does not necessarily result in higher copresence [1]. The collaborative experience of “being there together” has to do with the real-time information on the others: who the others are, how they are represented and what they are doing. This information can be transmitted by the different technologies, i.e. it can be externalized [25]. It may be difficult to sustain a high sense of presence and also a high sense of copresence over time. It seems that trade-offs between presence and copresence can be explained by focus of attention: in certain situations it is not possible to focus on both the space and the other at the same time. It can be hard to explain relationships between presence and copresence when they are evaluated after the studies, as people do not remember their intentions with intuitive actions. Table 2 lists the possibilities to support presence in relation to what people do in VEs.
3.3 Other Factors Influencing Work in CVEs
While collaborative and social presence does make a great contribution to user experiences, the way in which this influences overall performance, usability, and effectiveness is not known [7]. Bystrom and Barfield [27] have previously analyzed collaborative task performance in a VE. They found that task performance is affected by the presence of others and by the level of control. They suggest that experiences in VEs should be grouped according to three factors: presence, the quality of VEs, and task difficulty. Slater et al. argued that efficient performance could be a consequence of the VE experience. Subjects in a more realistic environment performed better, and played better 3D chess, than those in the less realistic environment [28]. Time and efficient workflow also influence effectiveness [12]. People’s behavior, and the way in which they interact socially and with objects, changes over time when immersed in a VE [7]. At the beginning and end people focus on social interaction, while interaction via technical interfaces and the virtual representations [9], avatar appearances [1], and experiencing emotions [29] also play a more important role for task-focused collaboration. Considering flow in the wider context, we can postulate that time is likely to impact on many of the defined dimensions. Time is a factor for interaction with the environment, e.g. immediate feedback and control.
4 Relations Between the Factors Influencing Work in CVEs
There is a huge difference between people solving problems alone and solving them together. Table 3 examines the requirements for individuals and groups in terms of presence and creativity for superior results.
Table 3. Supporting presence and creativity for individuals or for groups
What user(s) do? Solve problems
Individual (presence, creativity): Both presence and creativity need to be supported. High presence is needed for individual creativity.
Collaborative or social group (presence, creativity): Differs for proper collaboration versus peripheral collaboration. Also depends on application type, e.g. object-focused task solving versus learning [12]. Maintaining and sustaining group activities needs social skills, high copresence, and social presence. Group creativity requires awareness of the members’ activity, i.e. social creativity is favored by high social presence [24].
What user(s) do? Handle technologies
Individual (presence, creativity): To reach high presence, intuitive technologies and natural interactions are needed. This is a precondition for supporting individual creativity.
Collaborative or social group (presence, creativity): One can help the others in networked situations to handle technologies and eventual problems [13]. Helping requires different communication modalities. Symmetry or information on asymmetry helps [11, 2]. Naturalistic technologies may support peripheral collaboration, copresence and communicative activities that are required for group creativity [15].
What user(s) do? Experience
Individual (presence, creativity): High individual presence is required for experiencing VEs. Creativity is less important.
Collaborative or social group (presence, creativity): Group maintenance, collaborative activities and awareness are most important for a good outcome. Naturalness, high social presence can help. Time may be influential.
Problem-solving requires mental work which is to a great extent individual [3]. The fact that individual problem-solving also requires high presence is obvious. Following Fischer’s argumentation for supporting creativity for individuals and for groups, we note the need for private space so that the individual can give her best in collaboration [18]. Collaborative problem-solving requires seamless technology. The group can then interact more easily and support peripheral collaboration, which also requires social presence and copresence. Studies show that symmetrical settings help [2]. Technology can sometimes be neglected and used more intuitively [15]. However, the group still needs to cope with differences between the group members [7]. Interruptions caused by non-intuitive devices, bad design, or social interaction can disturb a group [13]. Social behavior in a group can support peripheral communication, strengthening group awareness of the members and vice versa. High social presence is required for maintaining peripheral awareness in networked group activities, allowing coordination, supporting decision-making processes, negotiations and choosing strategies. Accordingly, high social presence in turn can allow increased social creativity. Continuous workflow contributes to increased presence and creativity in CVEs. This can be maintained by avoiding or decreasing disturbances from the surroundings, which can cause breaks in presence and interrupt concentration [26]. Hiding the technology so that people can interact naturally with the VE significantly increases not only performance, but also engagement, motivation, enjoyment, and creativity [15]. Presence contributes to allowing vivid experiences which can generate powerful emotions [29]. This may influence social creativity.
Since temporal issues are crucial in work and many people work for long periods on computers, it is essential to include the effect of time. If users spend longer times in CVEs, they adapt to the technologies and settings, avoid hindrances, and develop workarounds to cope with the perceived disadvantages [1].
5 Discussion and Future Directions
The role of social interaction and social space becomes more important for collaboration [11, 9]. The drawback of asymmetrical setups can be overcome by letting people trade places and learn about each other’s different capabilities. Some of the problems stem from the lack of general theories to guide human performance measurement, difficulties in handling the inverse relationship between operational control and realism, the multiple dimensions of behavior, the question of how to measure cognitive tasks and how creativity is supported, etc. There are at least three stages in sustainable constructive/creative work. The first stage is getting the group together, defining rules and roles, establishing common ground, letting members become aware of each other, and formulating tasks and problems. During this peripheral communication, the socio-cultural context plays an important role. To support group work, establishing and maintaining high copresence probably has the highest impact at this stage. The second stage is performing the creative or constructive work – the actual collaboration. In this stage, there are two possibilities. If the problem or tasks are intended to bring about rather moderate changes/innovations, so-called psychological creativity [18], one part of the aim is to get the entire group to arrive at and accept the relatively foreseeable solutions. In this case it is still important to have simultaneous participation of the entire group. Here, supporting individual presence and thus creativity, and seamlessly integrating it into the group work, is very important. Disturbances from the technical devices and from the group can lower both individual and group results. Personal motivations for contributing to the group are important. Maintaining both collaborative and individual presence is important. Even though ‘isolating’ a member can be disturbing for the group, it may still contribute to overall efficiency.
If the purpose is to find a really innovative and so far unknown idea, which Fischer called historical invention [18], the surroundings that support creativity with tools and tasks are important. The environment needs to allow evaluation of alternative choices, using different strategies. Here creativity has the highest importance. Once the context is set, the group is needed only to keep up the pace and to evaluate the proposed solutions. With this kind of formulation, it is desired to allow space and time for the group individuals to reflect and come up with ideas. The third stage is concluding the work, or certain work steps, making the solution and its evaluation known and giving rewards. In this stage the group must be collective. Another aspect that has to be reevaluated for supporting high copresence and group creativity is consideration of interruptions. We showed that external observations play a great role in evaluating collaborative activities and group work. The observations can distinguish certain critical sequences and activities that the users went through quickly during their work and did not pay attention to.
We have shown that future work has to consider developing seamlessly distributed CVEs that allow creative problem-solving, provide creativity-supporting tools, and allow creative experiences. This often requires knowledge of the influence of presence. The recent literature still lacks a consistent theoretical approach to guide experiential research and technological development towards collaborative applications that support creativity. In our opinion, as a first step we have to identify the main relations between presence and technology, application types, users, and time [12].
6 Conclusions
By reviewing research on differences between individual work and group work, and how they are influenced by presence and creativity, this paper has sought a better understanding of networked collaboration and how to support creative, collaborative activities in CVEs. Allowing creativity is considered to contribute to higher-quality work in general. However, to support group creativity, the peripheral collaboration should be seamless. This requires high copresence for maintaining and seamlessly integrating the individual creativity that contributes to group results. This acknowledges the earlier results presented by Fischer [18], that one of the most important prerequisites for successful outcomes is providing enough time and space for individual creative contributions. Overall, work was examined in terms of three main activities: how people approach problem-solving, how they handle technical devices, and how they experience virtual representations. The analytical distinction has helped us to review the creativity literature for these three areas and to connect it with available presence results, which are essential for evaluating VEs. In relation to methodology we have shown the importance of observations and of examining normative behaviors via laboratory studies.
References
[1] Heldal, I., Bråthe, L., et al.: Analyzing Fragments of Collaboration in Distributed Immersive Virtual Environments. In: Schroeder, R., Axelsson, A. (eds.) Avatars at work and play, pp. 97–130. Springer, Heidelberg (2006)
[2] Heldal, I., Schroeder, R. et al.: Immersiveness and Symmetry in Copresent Scenarios. In: Proc. IEEE VR2005, pp. 171–178 (2005)
[3] Brown, R.: Group Processes: Dynamics within and between groups. Blackwell, Oxford (2000)
[4] Scott, C.R.: Communication Technology and Group Communication. In: Frey, L.R. (ed.) The Handbook of Group Communication Theory and Research, pp. 431–472. Sage, Thousand Oaks (1999)
[5] McGrath, J.E.: Groups: Interaction and Performance. Prentice-Hall, Englewood Cliffs (1984)
[6] Neale, D.C., Carroll, J.M., Rosson, M.B.: Evaluating Computer Supported Cooperative Work Models and Frameworks. In: Proc. of Computer Supported Collaborative Work Conference (CSCW’04), pp. 112–121 (2004)
[7] Schroeder, R., Heldal, I., Tromp, J.: The Usability of Collaborative VEs and Methods for the Analysis of Interaction. Presence 15(6), 655–667 (2006)
[8] Hinds, P., Kiesler, S. (eds.): Distributed Work. MIT Press, Massachusetts (2002)
[9] Wolff, R., Roberts, D.J., et al.: A Review of Tele-collaboration Technologies with Respect to Closely Coupled Collaboration. Int. J of Computer Applications in Technology, Special Issue on: Collaborative Multimedia Applications in Technology (2006)
[10] Slater, M., Sadagic, A., et al.: Small-group behavior in a virtual and real environment: A comparative study. Presence 9(1), 37–51 (2000)
[11] Tromp, J., Steed, A., Wilson, J.: Systematic Usability Evaluation and Design Issues for Collaborative Virtual Environments. Presence 10(3), 241–267 (2003)
[12] Heldal, I.: The Usability of Collaborative Virtual Environments: Towards an Evaluation Framework, PhD thesis 2004, Chalmers University of Technology: Gothenburg (2004)
[13] Heldal, I.: Impact of Social Interaction on Usability for Distributed Virtual Environments, Virtual Reality, Springer (In press)
[14] Wilson, J.R.: If VR has changed. then have its human factors? In: Waard, et al.(eds.) Human Factors in the Age of Virtual Reality. Shaker Publishing, pp. 9–30 (2003)
[15] Roberts, D., Heldal, I., et al.: Factors influencing Flow of Object Focused Collaboration in Collaborative Virtual Environments, Virtual Reality, Special Issue on Collaborative Virtual Environments for Creative People, 10(2), 119–133 (2006)
[16] Fencott, C.: Content and Creativity in Virtual Environment Design. In: Proc. of Virtual Systems and Multimedia ‘99 (1999)
[17] Waterworth, J.A., Waterworth, E.L.: Affective Creative Spaces: the interactive tent and the illusion of being. In: Proc. on Affective Human Factors Design (2001)
[18] Fischer, G., Giaccardi, E., et al.: Beyond Binary Choices: Integrating Individual and Social Creativity, Human-Computer Studies (IJHCS). Special Issue on Creativity 63(4-5), 482–512 (2005)
[19] Csikszentmihalyi, M.: Creativity: Flow and the Psychology of Discovery and Invention. Harper Perennial, New York (1996)
[20] Vass, M., Carroll, J.M., Shaffer, C.A.: Supporting Creativity in Problem Solving Environments. In: Proc. of the 4th Creativity & Cognition Conference, pp. 31–37 (2002)
[21] Shneiderman, B.: Establishing Framework of Activities for Creative Work. Creativity Support Tools, Communication of the ACM 45(10), 116–120 (2002)
[22] Farooq, U., Carroll, J.M., Ganoe, C.H.: Supporting Creativity in Distributed Scientific Communities. In: Proc. of the International GROUP Conference on Supporting Group Work, pp. 217–226 (2005)
[23] Short, J., Williams, E., Christie, B.: The Social Psychology of Telecommunications. John Wiley, London (1976)
[24] Garau, M.: The Impact of Avatar Fidelity on Social Interaction in Virtual Environments. Department of Computer Science. University College London, London (2003)
[25] Polanyi, M.: The Tacit Dimension. Garden City, NY., Doubleday (1966)
[26] Brogni, A., Slater, M., Steed, A.: More Breaks Less Presence. In: Proc. Presence 2003, The 6th Annual International Workshop on Presence (2003)
[27] Bystrom, K.E., Barfield, W.: Collaborative Task Performance for Learning Using a Virtual Environment. Presence 8(4), 435–448 (1999)
[28] Slater, M., Linakis, V., et al.: Immersion, Presence, and Performance in Virtual Environments: An Experiment with Tri-Dimensional Chess. In: Proc. ACM Virtual Reality Software and Technology, pp. 163–172 (1996)
[29] Waterworth, E.L., Waterworth, J.A.: Focus, locus, and sensus: The three dimensions of virtual experience. Cyberpsychology & Behavior 4(2), 203–213 (2001)
Users Interact Differently: Towards a Usability-Oriented User Taxonomy
Fabian Hermann1, Iris Niedermann1, Matthias Peissner1, Katja Henke2, and Anja Naumann2
1 Fraunhofer Institute for Industrial Engineering (IAO), Nobelstrasse 12, 70569 Stuttgart, Germany {Fabian.Hermann,Iris.Niedermann,Matthias.Peissner}@iao.fraunhofer.de
2 Deutsche Telekom Laboratories, Technische Universität Berlin, Ernst-Reuter-Platz 7, 10587 Berlin, Germany {Katja.Henke,Anja.Naumann}@telekom.de
Abstract. This paper proposes a preliminary user taxonomy that describes differences among users when interacting with Information and Communication Technology (ICT) systems. A qualitative study based on expert-ratings was conducted to get a prioritized list of person variables influencing the interaction behavior. Based on this list, eight preliminary user types with different attitudes towards ICT-systems were identified and described. This taxonomy will be tested and validated by empirical investigations. Keywords: user taxonomy, interaction behavior, attitude towards technology, user typology, user segmentation.
The goal of this work is to describe differences among users when interacting with ICT-systems, i.e. how a user interface is used, how well users can deal with UI complexity or usability flaws, which adaptations must be made for particular user types, etc. We propose a general user taxonomy for ICT-systems and products. This classification is intended to help in developing intuitively usable designs, especially devices and services tailored to different user groups. One example of a concrete application of the classification is the generation of different user models for model-based usability evaluation in early stages of product development. It may also be used to stratify samples for usability testing. At present, the taxonomy is preliminary: we conducted a qualitative study in order to obtain strong face validity based on the ratings and experiences of experts working with users. Of course, this version is considered hypothetical and has to be corroborated by empirical investigations. In the following, two major phases of the procedure are described: First, person variables that influence the interaction and usage behavior were identified in an expert study. Second, the taxonomy was developed by defining types based on attitudes.
2 Expert Study
In the expert study, experienced usability engineers were asked for their estimates of which variables influence the interaction behavior, and which typical attitudes and behavioral styles can be observed in users. Six usability experts from different professions (psychologists, designers) participated, all with many years of experience in working with users and user-centered design. First of all, the experts were asked to collect “person variables that influence the interaction with ICT-systems”. In a group meeting, these were collected and discussed. This led to a summarized list of 21 factors altogether, each comprising a number of single variables proposed by the experts. The experts then rated the relevance of these 21 factors. Seven of them were rated as being of low importance and were therefore not considered in the next steps. In the last step of the expert study, each expert was given a questionnaire to estimate the correlations between the factors. Strong deviations between the expert ratings were discussed among the participants to reach a joint understanding. The resulting factors are described in the following section.
2.1 Person Variables Influencing the Interaction Behavior
On the one hand, the expert study yielded the impression that a vast and complex set of cognitive, motivational, and social variables must be considered to get a predictive description of user types. However, the most important result was that the experts set very clear priorities on which variables are important and which are not: As most important the experts rated a set of variables that covers two aspects: First, the knowledge and skills with ICT-systems, interaction mechanisms etc. Second, the general ICT-affinity vs. computer anxiety, and different self-concepts and attitudes towards
ICT in general, e.g. “I don’t understand computers”, “I use computers as a tool”, “I’m fascinated by high-tech”, etc. These factors – knowledge and attitudes – were estimated as being strongly correlated, mainly because both are strongly associated with the previous experience users have. A second set of variables that were seen as correlated describes working styles and general abilities, i.e. general cognitive abilities, problem solving styles and strategies, goal orientation vs. passive behavior style, and conscientiousness. Besides these most important clusters, the following variables were also rated as relevant: domain knowledge (especially in work contexts but also for some consumer products, e.g. digital cameras), language competencies (in particular English in the German market), age (as a strong predictor for many other variables, e.g. experience), and orientation towards social norms, e.g. the influence of the peer group on usage. Further person variables such as gender or cultural background were discussed by the experts but not rated as highly important.
2.2 Qualitative Definition of User Types Based on Attitudes
A user type should be described as a set of values of the person variables that discriminate between parts of the user group and predict usage behavior. For that, we chose the approach of defining the user types based on the attitudes a user has towards ICT-systems. General attitudes and self-concepts like “ICT fascinates me” or “I don’t like interacting with complex technology” are expected to be stable traits that will not change frequently. Of course, their predictive power might be considered weak for interaction behavior in concrete situations. However, the experts expected the general attitudes to be strongly related to the knowledge and experience of a user, which in turn is estimated to be one of the strongest influences.
A further advantage of taking attitudes to describe user types is that a set of distinctive and exclusive user categories can be described based on the same rationale. In contrast, using many dimensions could lead to non-exclusive categories like “The ICT-Anxious” and “The Flexible Problem-Solver”. In the expert workshop, several attitudes were discussed. The most prominent and distinctive of those were taken to define the user types. Each type is characterized by the basic attitude towards ICT and his self-concept; this is expressed in a short statement. The further description of personality and behavior is based on the correlations estimated in the expert study.
3 Results: The Proposed Taxonomy: Eight User Types
Each type of the taxonomy resulting from the expert study is characterized by the basic attitude towards ICT and his self-concept. This is expressed in a short statement. The further description of personality and behavior is based on the correlations estimated in the expert study.
3.1 The ICT-Enthusiast
Basic attitude: “ICT is interesting and fascinating”
Description: ICT-Enthusiasts like ICT-systems, have a high intrinsic motivation to use and to control ICT, and like to get to know them. They are keen on technology more
than on brands or design. They are very experienced, have broad and deep knowledge of many application types, and show a low level of anxiety [3], [4]. Their mental models are not only based on the execution level but also on how technology works. They are medium to highly flexible problem solvers (e.g. they are able to adapt their behavior in unknown situations).
3.2 The ICT-Anxious
Basic Attitude: “I don’t like using ICT, I’m afraid of making errors”
Description: The ICT-Anxious have a negative and cautious attitude towards ICT systems and avoid any kind of ICT-interaction as far as possible. They would get high scores on the anxiety scale and low scores on the self-efficacy scale [5]. Their ICT-experience and knowledge are low. Moreover, they have a weak understanding of how technology works. This type is characterized by rather inflexible problem solving behavior with high field dependency [6]. Typically, they show slow interaction behavior.
3.3 The Efficient ICT-User
Basic Attitude: “I use ICT only as a tool.”
Description: Efficient ICT-Users have a pragmatic attitude towards ICT-systems as a tool and an aid to achieve basic purposes and satisfy everyday requirements. For example, they use a mobile phone to make calls but not for gaming. They adopt new technology only if they can see an added value for their personal aims. They have rather low computer anxiety, feel self-efficient and show a self-estimating attribution style [3]. They have a good understanding of interface structures, such as information architecture, search areas and navigational patterns.
3.4 The ICT-Player
Basic Attitude: “ICT makes fun.”
Description: ICT-Players are sold on new products with a fun factor (e.g. gadgets, the latest mobile phone games, or funny dialects in navigation systems). They do not in general have a high ICT-affinity, but a strong preference for ICT with a fun factor. Their ICT-knowledge is medium, and they have broad experience, mainly on the task level.
However, their mental models of how technology works are not very elaborate. They have no computer anxiety and feel self-efficient with a kind of “let’s play” attitude.
3.5 The Design-Oriented
Basic Attitude: “ICT has to be beautiful and aesthetic.”
Description: Design-oriented users identify themselves via aesthetic, innovative, and visible products. They have a preference for good, intelligent software design. Their general ICT-affinity, ICT-knowledge and experience are mediocre. They do not spend too much effort on learning complex functionalities. Although they have a strong preference for well-designed and innovative products, the product must make sense in general.
3.6 The ICT-Indifferent
Basic Attitude: “I have to use ICT, it is useful but not my hobby.”
Description: The ICT-Indifferent are not very interested in ICT, nor are they emotionally attached to it. They have low anxiety, and estimate their own abilities as weak. Their ICT-knowledge and mental models are rather limited, with only occasional experience with few application types.
3.7 The Suspicious ICT-User
Basic Attitude: “Using ICT is risky.”
Description: Suspicious ICT-Users see ICT as a potential danger for themselves and think that, in general, ICT causes negative developments in society. They avoid activities that might involve the loss of personal data, like e-banking, e-commerce, personal communication via the internet etc. They highly mistrust ICT [7].
3.8 The Perfectionist
Basic Attitude: “It has to be perfect, not only a beta-version.”
Description: Perfectionists like to control their ICT-systems and have a strong demand for good functionality, as well as usability and design. Their wish to control a system is combined with a low tolerance of system bugs, badly designed systems and inconsistencies. They show low computer anxiety and feel very self-efficient with a self-estimating attribution style. Their ICT-knowledge and experience are at the upper level. They have a deep understanding of the applications they use but no broad experience.
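The taxonomy is proposed, among other things, as a basis for stratifying samples for usability testing. The following is a minimal sketch of how that could look, not the authors' method: the type names and basic-attitude statements are taken from Sections 3.1 to 3.8, whereas the `UserType` class, the `stratify` function, the participant identifiers and the idea that each candidate has already been assigned a type are hypothetical assumptions made for illustration.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserType:
    """One type of the taxonomy: its name and its basic attitude statement."""
    name: str
    basic_attitude: str


# Names and statements as given in Sections 3.1 to 3.8.
USER_TYPES = [
    UserType("ICT-Enthusiast", "ICT is interesting and fascinating"),
    UserType("ICT-Anxious", "I don't like using ICT, I'm afraid of making errors"),
    UserType("Efficient ICT-User", "I use ICT only as a tool."),
    UserType("ICT-Player", "ICT makes fun."),
    UserType("Design-Oriented", "ICT has to be beautiful and aesthetic."),
    UserType("ICT-Indifferent", "I have to use ICT, it is useful but not my hobby."),
    UserType("Suspicious ICT-User", "Using ICT is risky."),
    UserType("Perfectionist", "It has to be perfect, not only a beta-version."),
]


def stratify(candidates, per_type, seed=0):
    """Draw up to `per_type` participants for each user type.

    `candidates` is a list of (participant_id, type_name) pairs. How a
    candidate is assigned to a type is outside this sketch (hypothetical
    screening step). Returns a dict mapping type name to sampled ids.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for participant_id, type_name in candidates:
        pools[type_name].append(participant_id)
    sample = {}
    for user_type in USER_TYPES:
        pool = pools.get(user_type.name, [])
        k = min(per_type, len(pool))  # never ask for more than are available
        sample[user_type.name] = sorted(rng.sample(pool, k))
    return sample
```

Sampling the same number of participants per type is only one possible design choice; a validated taxonomy might instead call for quotas proportional to the share of each type in the target population.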
4 Summary and Next Steps
We described a set of user types that were derived from person variables and defined on the basis of general attitudes. The next step will be the empirical validation of the user types. A crucial question for the empirical investigation will be which differentiations between the attitudes can be corroborated and which might only be nuances that are not stable enough to discriminate between two types (e.g., the type “The Perfectionist” was discussed extensively among the experts). Furthermore, the correlations between the different variables such as attitudes and ICT-related knowledge will be reviewed and might change the descriptions of the different types. Once the current taxonomy has been validated, we will attempt to map it to existing segmentations from marketing research. The goal is to enrich the taxonomy with typical data from market segmentations, such as current life situation and demographic variables.
References
1. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interactions. MIT Press, Cambridge (2000)
2. Karlsen, F.: Media Complexity and Diversity of Use: Thoughts on a Taxonomy of Users of Multiuser Online Games. In: Proceedings of the Other Players conference; Center for Computer Games Research, IT University of Copenhagen (2004)
3. Beckers, J.J., Rikers, R.M.J.P., Schmidt, H.G.: The influence of computer anxiety on experienced computer users while performing complex computer tasks. Computers in Human Behavior, 456–466 (2006)
4. Wilfong, J.D.: Computer anxiety and anger: The impact of computer use, computer experience, and self-efficacy beliefs. Computers in Human Behavior, 1001–1011 (2006)
5. Peiser, C.: Measuring attitudes towards information technology. Man and work, 34-48 (2006)
6. Daniels, H.L., Moore, D.M.: Interaction of cognitive style and learner control in a Hypermedia Environment. International Journal of Instructional Media, 369–382 (2000)
7. Siegrist, M., Gutscher, H., Earle, T.C.: Perception of risk: The influence of general trust, and general confidence. Journal of Risk Research, 145–156 (2005)
Reminders, Alerts and Pop-ups: The Cost of Computer-Initiated Interruptions
Helen M. Hodgetts and Dylan M. Jones
School of Psychology, Cardiff University, Cardiff, CF10 3AT, UK {hodgettshm, jonesdm}@cardiff.ac.uk
Abstract. Responding to computer-initiated notifications requires a shift in attention that disrupts the flow of work. The degree of cost associated with resuming the original task following interruption may be dependent upon such factors as the transition between tasks (was the worker able to consolidate his/her place in the main task before engaging in the interruption?) as well as the nature of the interrupting task itself (e.g., length or complexity). The current paper reviews a number of studies from our laboratory that investigate the effects of brief interruptions to the execution phase of computer-based 5-disk Tower of London problems. The results are interpreted within the theoretical framework of the goal-activation model [1] and suggestions are made for practical applications that may help to minimize the disruption caused. Keywords: interruption, Tower of London, goals, activation, memory.
frequently and how recently a goal has been retrieved) and its relevance to the current context (the influence of environmental cues). For a new or interrupting goal to become more active than the others and to exceed the interference threshold, it must be repeatedly sampled or strengthened within a short period of time in order to build up base-level activation. Once selected, the rate of sampling decreases and the goal gradually decays to a level below that of other newer goals. Sometimes – in the case of interruption for example – goals are temporarily suspended and must be resumed later. G-AM makes this possible through a process of priming. The base-level activation of a suspended goal will have decayed through lack of use, but retroactive interference can be overcome if a goal is deemed relevant by the current context. A goal’s associative activation is boosted by environmental cues: Effective cues are formed by co-occurrence and so must be present both at the time of goal suspension and when the goal is to be retrieved. In terms of an interruption, Altmann and Trafton propose that these cues can be encoded during the ‘interruption lag’, the time between the interruption alert (e.g., the telephone starting to ring) and the actual interruption (e.g., engaging in the telephone conversation). G-AM is one of the few theoretical models to have been explicitly applied to the issue of task interruption and it provides a useful basis for the exploration and interpretation of interruption effects.
1.2 The Tower of London Task
Our experiments used a computer-based version of the 5-disk ToL problem [5] as a primary task because it provides a controlled task environment and allows for the assessment of performance at a sufficiently fine-grained level of detail. Participants are presented with a starting array of different colored, equal-sized disks mounted on three pegs (Figure 1).
The aim is to achieve a given goal configuration by moving the disks one at a time, from peg-to-peg. The task involves the formulation, retention and execution of a planned series of actions – processes that are not dissimilar to many everyday computer-based activities (e.g., first formulating and then typing sentences for a report). In the current work it is the execution phase that is interrupted, and we investigate what factors affect retrieval of a planned sequence of action when execution of this plan is unexpectedly broken.
Fig. 1. Screen display during the Tower of London task
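The structure of the task described above (three pegs, five colored disks, one disk moved at a time, a planned move sequence executed towards a goal configuration) can be sketched as a small state model. This is an illustrative sketch only, not the software used in the experiments: the function names, the disk colors, and the assumption that every peg can hold all five disks are hypothetical.

```python
from typing import List, Tuple

# A state holds one tuple of disk colours per peg, listed bottom to top.
State = Tuple[Tuple[str, ...], ...]


def move(state: State, src: int, dst: int, capacity: int = 5) -> State:
    """Move the top disk of peg `src` onto peg `dst` and return the new state.

    `capacity` is an assumption of this sketch: each peg holds up to 5 disks.
    """
    if not state[src]:
        raise ValueError("source peg is empty")
    if len(state[dst]) >= capacity:
        raise ValueError("destination peg is full")
    pegs = [list(peg) for peg in state]
    pegs[dst].append(pegs[src].pop())  # only the top disk can be moved
    return tuple(tuple(peg) for peg in pegs)


def execute(start: State, plan: List[Tuple[int, int]], goal: State) -> bool:
    """Carry out a planned list of (src, dst) moves; True if the goal is reached."""
    state = start
    for src, dst in plan:
        state = move(state, src, dst)
    return state == goal
```

In such a model, an interruption during the execution phase corresponds to pausing partway through `plan`; the remaining moves then have to be retrieved from memory rather than read from the display, which is the retrieval cost the experiments measure.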
Participants were to complete 25 ToL trials, of which either 6 or 8 trials were interrupted (depending on the particular experiment). These interrupted trials were all equivalent in terms of complexity, and required 6 moves to solution. Each interrupted trial was always matched to a control trial which required exactly the same solution path but in which the colors of the disks were changed. Disks were moved by clicking on the buttons below the pegs, and a pop-up box indicated when the goal state had been achieved. Interruption always occurred in the middle of a trial, after the participant had made their third move. The main ToL display would then be replaced by the interrupting task, which in most cases was a mood checklist. A list of six statements along a mood continuum (e.g., “extremely happy,” to “extremely sad”) was presented in the centre of the screen, and participants were asked to select the one that best applied to them by clicking on the statement with the mouse. This task was irrelevant to the main ToL but provided a plausible means with which to interrupt the primary task in the laboratory context. Interruption was always brief and took around 5 s to complete before the participant then continued with the main task at the exact point at which it had been left. Using this general methodology, we have conducted a number of experiments to investigate the effect of brief on-screen interruptions on ongoing task performance.
2 Interruption Length
G-AM incorporates ACT-R’s base-level learning equation, according to which a goal’s base-level activation is dependent upon how recently and how frequently it has been retrieved. It is this constraint that determines the initial rapid build-up of activation of a new goal with repeated sampling, but also the decay of a suspended goal as a power function of time delay. In accordance with the model it was predicted that a goal suspended for longer would be subject to greater decay and therefore be more time consuming to retrieve after interruption. Existing studies have provided evidence both for [6] and against [7] an effect of interruption duration, comparing interruption intervals in the region of 30 s to two minutes. In our experiments we used a more fine-grained approach than previously, investigating the effect on individual move times of on-screen interruptions that were relatively short in duration (less than 30 s). Interruptions of either 5 s or 15 s in duration were introduced to the execution phase of ToL problems [8]. The short interruption required the participant to complete one 5 s mood checklist and the longer interruption required the completion of three checklists that each changed automatically every 5 s. Times taken to make the fourth move are shown in Figure 2.¹ Goal retrieval following interruption incurred a time cost relative to the control condition in which solution execution was continuous. Furthermore, the cost of goal retrieval was greater following the 15 s rather than the 5 s interruption as goals suspended for longer were subject to greater decay.
¹ The original experiment in Hodgetts & Jones (2006a) included a further factor (whether or not participants were aware of how long the interruption would be). However, this manipulation had no effect so the task resumption time data shown here are collapsed across conditions.
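The decay prediction in Section 2 can be illustrated with a minimal sketch of ACT-R's base-level learning equation, B = ln(Σ t_j^(-d)), where t_j is the time since the j-th retrieval of the goal and d is a decay parameter. The function names, the schedule of three rehearsals spaced 1 s apart, the 25 s comparison delay, and d = 0.5 (a conventional ACT-R default) are illustrative assumptions, not values reported in the experiments.

```python
import math

def base_level_activation(ages_s, decay=0.5):
    """Base-level activation: ln of the summed power-law memory traces.

    ages_s holds the time in seconds since each past retrieval of the goal.
    decay=0.5 is an assumed, conventional value, not one fitted to the data.
    """
    return math.log(sum(age ** -decay for age in ages_s))

def activation_after_interruption(interruption_s, rehearsals=3, spacing_s=1.0):
    """Activation of a goal retrieved `rehearsals` times (hypothetical schedule,
    `spacing_s` seconds apart) and then suspended for `interruption_s` seconds."""
    ages = [interruption_s + k * spacing_s for k in range(rehearsals)]
    return base_level_activation(ages)

short = activation_after_interruption(5.0)    # 5 s interruption
long_ = activation_after_interruption(15.0)   # 15 s interruption
longer = activation_after_interruption(25.0)  # hypothetical longer delay

print(short > long_ > longer)              # activation falls with delay
print((short - long_) > (long_ - longer))  # and falls fastest early on
```

Under these assumptions the goal suspended for 15 s ends up less active than the one suspended for 5 s, and each further 10 s of delay costs less activation than the previous one, which is the qualitative pattern a power function implies.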
[Figure 2: bar chart of move time (s), on a scale from 0 to 7, by interruption condition (Control, Short, Long)]
Fig. 2. Time to make the fourth move (in seconds) according to interruption condition
In a further experiment [9] interruption length was manipulated by the number of mood checklists to be completed – either one, three, or five – but the precise timing of these was under the control of the participant rather than being dictated by the computer program. Participants clicked a ‘continue’ button after completion of each checklist, which then either displayed a further checklist or returned the participant to the ToL task. The average lengths of short, medium and long interruptions were 4.48 s, 12.65 s and 20.25 s respectively. Task resumption times following short interruptions were around one second quicker than in the medium condition, a result that parallels the findings of the previous experiment. However, introducing the longer interruption interval did not result in any further increase in resumption time, suggesting that goals suffer a more rapid decline in activation when they are first suspended with a less marked decrease thereafter. G-AM predicts that activation decreases as a power function over time, and the current data seem consistent with this proposal.
2.1 Post-interruption Performance
Interruption incurs a time cost in resuming previously suspended task goals, but does an unexpected break in task exert a more general negative effect on post-interruption performance, beyond that of the resumption lag? In order to gain a general idea of the pattern of move times throughout an entire trial, a sample of three problems was chosen from Experiment 1 of Hodgetts and Jones [8] and subjected to further scrutiny for which all six move times were calculated (Figure 3). The problems selected were the first interruption trial (trial 4), the last interruption trial (trial 25), and an interruption trial from the middle of the experiment (trial 12), as well as their matched controls (problems 13, 9, and 20 respectively). Only move times for perfect trials were used, that is, those that were completed using the correct six-move solution path (approx. 85% of the data). This allowed for a direct comparison between move times in each condition as each move was equivalent. First, participants clearly took longest to make their first move, as this also incorporates planning time. Participants became quicker at planning with practice: plan times were greater for trial 4 than for either of the later trials shown in Figure 3. Other than this, there were no appreciable differences between conditions (short, long or no interruption) in the first half of the trials
because before interruption these problems were all equivalent. A reliable effect at move 4 is apparent in all three of the problems, with a marked difference between interruption and no interruption trials, and a difference also between short and long interruptions. Although no formal analyses were carried out on these data, the graphs allow us to see at a glance whether interruption may have had a more general effect on post-interruption performance, perhaps slowing moves 5 and 6 as well as move 4. It appeared that this was not the case however, and that participants bear the cost of interruption only at the point of goal retrieval (i.e., the ‘resumption lag’).
[Figure 3 panels: (a) Trials 4 (interruption) & 13 (control); (b) Trials 12 (interruption) & 20 (control); (c) Trials 25 (interruption) & 9 (control). Each panel plots move time (s), on a scale from 0 to 20, against move number (1 to 6) for the no interruption, short interruption and long interruption conditions.]
Fig. 3. Move times for perfect trials in Experiment 1, Hodgetts and Jones (2006a). Error bars show standard error.
3 Interruption Complexity
Intuitively we may think that an interruption that is more cognitively demanding would be more disruptive, and our research provides some support for this notion [8]. We find that participants are quicker to resume the primary task following a simple mood checklist task than following a more complex verbal reasoning task of the type “A follows B - - AB” [false] [10], even though they take the same length of time to complete. Furthermore, this effect of interruption complexity is more marked at points of high memory load, when participants are interrupted after their first rather than their third move [11]. Unlike the mood task, the reasoning problem involves several
elements that must each be retained and verified in order to assess the validity of the statement. In ACT-R, associative activation is limited such that the more elements that are active the less associative activation each one receives; it is possible that the number of elements involved in this more complex task leaves less activation available for the maintenance of the suspended goal. A further experiment using mental arithmetic problems found increased task resumption times when participants were interrupted to complete one complex sum (e.g., 37 + 48) rather than a series of four simple sums (e.g., 1 + 3, 2 + 4) [8]. It seems therefore that it is not simply the number of subgoals that are activated during the interruption that is critical for disruption, but rather the processing demands imposed by each and the coordination of multiple task goals in the case of the more complex interruption.
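The limit on associative activation described above can be sketched as follows. In ACT-R a fixed pool of source activation W is shared equally among the n elements currently active, so each one contributes W/n. The function names, the equal association strengths, and the element counts below are simplifying assumptions for illustration; they are not parameters estimated in the experiments.

```python
def source_activation_per_element(n_active, total_w=1.0):
    """Share of the fixed source-activation pool W given to each active element."""
    if n_active < 1:
        raise ValueError("need at least one active element")
    return total_w / n_active

def associative_boost_to_goal(n_interruption_elements, n_goal_cues=1,
                              strength=1.0, total_w=1.0):
    """Associative activation reaching the suspended goal.

    Only the `n_goal_cues` elements linked to the goal pass activation on,
    each weighted by an assumed uniform association `strength`; elements
    belonging to the interrupting task merely dilute the shared pool.
    """
    n_active = n_interruption_elements + n_goal_cues
    w_each = source_activation_per_element(n_active, total_w)
    return n_goal_cues * w_each * strength

simple_task = associative_boost_to_goal(n_interruption_elements=1)
complex_task = associative_boost_to_goal(n_interruption_elements=4)
print(simple_task > complex_task)  # more interrupting elements, less support
```

In this simplified picture, an interrupting task with more elements to hold in mind leaves a smaller share of activation for the cue that supports the suspended goal.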
4 Expectation and Preparation
Interruptions generally consist of two parts: an alert and the interruption proper. The period between these, termed the interruption lag, is thought to be a critical time for ‘preparing to resume’ a to-be-suspended task, perhaps through retrospective rehearsal of the current state or by prospectively encoding a future goal [12]. We manipulated the opportunity for preparing task goals before the onset of interruption by varying the transition between tasks: either the interrupting mood checklist task covered the whole screen so that the ToL task could not be seen (as in previous experiments) or the mood checklist appeared in a box in the top left hand corner of the screen (although the ToL was visible, no further moves could be made on the main task during the interruption period; Figure 4) [13]. Participants were quicker to resume the ToL task in this latter condition because the less abrupt transition between tasks allowed a chance to prepare to-be-suspended task goals before attention was fully drawn to the mood checklist. When the checklist covered the whole screen, there was no specific time for the efficient encoding of contextual cues and so reactivation of the suspended goal after interruption was a more difficult and time consuming process. In further experiments, it was found that a 3-s pause before the onset of a full-screen interruption resulted in similar benefits to primary task resumption as with the corner-screen condition, as this provided a brief time lag for the preparation of
Fig. 4. Mood checklist interruption in the corner-screen condition [13]
task goals before the interruption ensued. This was the case for both mood checklist [13] and verbal reasoning task [14] interruptions. Rearranging the colors of the discs following interruption (but still retaining the original solution path) was found to increase task resumption times because it disrupts the contextual cues that help to prime retrieval of the suspended goal [13]. Some interruptions require immediate attention before any work on the original activity can proceed (e.g., ‘save’ reminders on spreadsheet programs), while some interruptions provide an alert but allow the user to choose a convenient time to switch tasks (e.g., a flashing instant message icon at the bottom of the computer screen). Given the proposed advantages of the interruption lag as a time for preparation before the onset of an interruption, one might expect that an opportunity to control the timing of the interruption lag would be of particular benefit. In one exploratory experiment we found no difference between ‘immediate’ and ‘negotiated’ interruptions in terms of task resumption times [14]. However, it seemed that participants chose to engage in the ‘negotiated’ interruptions immediately around half of the time anyway, despite the opportunity for the secondary task to be deferred. Nevertheless, the idea of negotiated interruptions deserves further investigation, especially the question of whether it would be possible to train participants to take full advantage of this preparatory period which may in turn help to mitigate the effects of interruption.
5 Practical Applications
5.1 Even Brief Interruptions Are Disruptive
The standard mood checklist interruptions were brief and undemanding in content, comparable to many types of pop-up notifications that increasingly invade our computer screens. Such interruptions may seem trivial, but the current work demonstrates that even these may be impacting upon worker efficiency, particularly given their frequency throughout the working day. Having recognised that even these seemingly inconsequential interruptions may be affecting performance, the first obvious practical recommendation would be to minimize such computerised intrusions. For example, instant-messenger systems should be turned off whilst engaging in tasks that require a lot of planning or concentration, or at least set to ‘busy’ so that colleagues are aware that unimportant interruptions are not welcome. Similarly, email alerts could be turned off or a priority system used, e.g., onscreen notifications are only given for those emails that the sender tags specifically as being high priority (or at least alerts are not received for incoming messages that the system detects as being ‘spam’).
5.2 Interruption Length and Complexity
Our work has shown that the effects of interruption are exacerbated by longer or more complex interruptions. Although the current work is limited to particularly short interruption intervals, this may still have implications for the design of computer-initiated alerts, with the recommendation that the necessary information is displayed clearly in order to minimize the time spent reading them. For example, email alerts sometimes provide a number of options (e.g., clear, read, save, delete), but reducing this
to just read or clear may decrease the time spent attending to the notification and therefore minimize disruption. If workers anticipate that a particular interrupting event will require a lot of time or effort to deal with, then they should think about rescheduling for a more convenient time (e.g., responding to an email can often be delayed until the worker reaches a more suitable break point in the current task).
5.3 The Importance of the Interruption Lag
Our work has also highlighted the importance of the temporal structure of interruptions, and in particular the benefit of a brief time lag before the onset of the secondary task. This time can be used to rehearse one’s place in the task, or to prospectively encode cues that will later help to prime retrieval of the suspended goal. Based on the findings of the current experiments, it is recommended that the task context is such that it enables the encoding of these cues before the participant must attend to the interrupting activity. If the interruption is particularly intrusive or attention-grabbing, it may capture attention immediately without allowing the worker a chance to consolidate his or her place in the task. In light of this, moving icons for example should perhaps be avoided as research has found them to be particularly disruptive [15]. Email packages can differ in the types of notification that they use to alert the worker to incoming mail. Some use relatively discreet alerts that appear in the top corner of the screen, while others can be large and intrusive, appearing centrally on the screen and obscuring much of the original activity. We have shown that the latter type of alert may be particularly disruptive, as there is no ‘window of opportunity’ for consolidating primary task goals before the interruption ensues [13]. Email alerts and similar pop-up messages should therefore be as small as possible while still conveying the appropriate information.
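The recommendations so far (filter by priority, respect a ‘busy’ status, prefer a small corner alert, and leave a brief lag before the interruption proper) could be combined into a simple delivery rule. The following is only a sketch: the `Alert` type, the `plan_delivery` function and the action labels are hypothetical names, and the 3 s default merely echoes the pause used in the experiments reported in Section 4.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Alert:
    """Hypothetical incoming notification."""
    message: str
    high_priority: bool = False

def plan_delivery(alert: Alert, user_busy: bool,
                  lag_s: float = 3.0) -> Tuple[str, Optional[float]]:
    """Decide how to present an alert; returns (action, lag before onset).

    Low-priority alerts are held back while the user is marked busy.
    Anything else is shown as a small corner alert, and the interruption
    proper starts only after `lag_s` seconds, leaving time to rehearse
    the current place in the task.
    """
    if user_busy and not alert.high_priority:
        return ("hold_until_break", None)
    return ("show_small_corner_alert", lag_s)

print(plan_delivery(Alert("Newsletter"), user_busy=True))
print(plan_delivery(Alert("Server down", high_priority=True), user_busy=True))
```

Keeping the decision in one small pure function makes the policy easy to adjust, for example by lengthening the lag for more demanding primary tasks.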
Another difference between email packages is that some alerts allow the computer-operator to continue typing whilst they appear on the screen, whilst for others, the alert receives the focus and the original activity is put to the background. In this case, the worker has no choice but to deal with the interrupting alert immediately before he or she is able to continue with the ongoing task, but if a chance is available to finish off the current subgoal during the interruption lag (e.g., complete writing a sentence), then the worker may be able to arrive at a more convenient cognitive breakpoint before dealing with the intruding task. Furthermore, it would also be recommended that the alert actually disappears after a few seconds if not responded to, as then the worker can be made aware of the incoming information without explicitly needing to interrupt the ongoing task to respond to it.
Acknowledgment. Paper supported by the UK’s Economic and Social Research Council grant no. RES-062-23-0101 and by an Overseas Conference Grant awarded by the British Academy (ref. OCG-46404).
References
1. Altmann, E.M., Trafton, J.G.: Memory for goals: An activation-based model. Cognitive Science 26, 39–83 (2002)
2. DeMarco, T., Lister, T.: Peopleware: Productive projects and teams, 2nd edn. Dorset House, New York (1999)
3. Storch, N.A.: Does the user interface make interruptions disruptive? A study of interface style and form of interruption (Report UCRL-JC-108993). Lawrence Livermore National Laboratory, Springfield (1992)
4. Anderson, J.R.: Rules of the mind. Lawrence Erlbaum Associates, Hillsdale, NJ (1993)
5. Ward, G., Allport, A.: Planning and problem-solving using the five-disc Tower of London task. Quarterly Journal of Experimental Psychology 50A, 49–78 (1997)
6. Lahlou, S., Kirsh, D., Rebotier, T., Reeves, C., Remy, M.: Interruptions in the workplace (2000) Available: http://adrenaline.ucsd.edu/edf/Experiment1.htm
7. Gillie, T., Broadbent, D.: What makes interruptions disruptive? A study of length, similarity, and complexity. Psychological Research 50, 243–250 (1989)
8. Hodgetts, H.M., Jones, D.M.: Interruption of the Tower of London task: Support for a goal activation approach. Journal of Experimental Psychology: General 135, 103–115 (2006a)
9. Hodgetts, H.M., Jones, D.M.: Resuming an interrupted task: Activation and decay in goal memory. In: Sun, R., Miyake, N. (eds.) Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006), p. 2506. Lawrence Erlbaum Associates, Mahwah, NJ (2006b)
10. Baddeley, A.D.: A three-minute reasoning test based on grammatical transformation. Psychonomic Science 10, 341–342 (1968)
11. Hodgetts, H.M., Jones, D.M.: Interrupting problem solving: Effects of interruption position and complexity. In: Katsikitis, M. (ed.) Past Reflections, Future Directions: Proceedings of the 40th Australian Psychological Society Annual Conference, Melbourne, Australia, pp. 128–132 (2005)
12. Trafton, J.G., Altmann, E.M., Brock, D.P., Mintz, F.E.: Preparing to resume an interrupted task: Effects of prospective goal encoding and retrospective rehearsal. International Journal of Human-Computer Studies 58, 583–603 (2003)
13. Hodgetts, H.M., Jones, D.M.: Contextual cues aid recovery from interruption: The role of associative activation. Journal of Experimental Psychology: Learning, Memory & Cognition 32, 1120–1132 (2006c)
14. Hodgetts, H.M., Jones, D.M.: Interruptions in the Tower of London task: Can preparation minimize disruption? In: Proceedings of the 47th Annual Meeting of the Human Factors and Ergonomics Society, pp. 1000–1004. HFES, Santa Monica, CA (2003)
15. Ware, C., Bonner, J., Knight, W., Cater, R.: Moving icons as a human interrupt. International Journal of Human-Computer Interaction 4, 341–348 (1992)
The Practices of Scenario Study to Home Scenario Control
Yung Hsing Hu, Yuan Tsing Huang, You Zhao Liang, and Wen Ko Chiou
Chang Gung University, 259 Wen-Hwa 1st Road, Kwei-Shan, Tao-Yuan, Taiwan, 333, R.O.C.
[email protected]
Abstract. Home is where people live, relax and are entertained. Scenario control is a man-machine system that integrates audio/video equipment, lighting, curtains and air conditioning using wireless LAN technology. It defines commonly used scenarios, lets the user interact with the product and operate all equipment quickly, and supports a smart home lifestyle. Using a practice-based design method, this research examines how user-oriented design (UOD) and scenario semantics analysis help designers achieve innovative product design.
Keywords: User-oriented design, Scenario study, Home automation, Practices.
visualized scenario, and also lets the designer make feasibility concrete and specific, experience the scenario, and sense the user’s preferences and needs [4-5]. Home scenario control uses wireless LAN technology with a customization service to integrate audio/video equipment, lighting, curtains, air conditioning and security. The system assists daily life and brings the trend of automation into the home: it provides a clear and simple man-machine system and a customizable scenario control service, lets the user interact with the product, and can create settings with different attributes for different needs, changing scenario and mood at any time. Lighting automation avoids wasteful switching and damage caused by improper use or unsuitable settings, makes objects easy to identify, and enhances ambience to satisfy visual needs. With an automated control system, video equipment and audio settings (on/off, projection equipment, volume control, playback control, etc.) can be controlled simply and quickly, and every corner of the home can be monitored. Scenario control brings the environment into the desired state quickly and removes the problem of having too many remote controls or not knowing which one to use.
2 Methods
This research first collects information on existing products, primarily those of leading manufacturers (CLIPSAL, AMX, CRESTROM, JUNG, ABB), as a reference, and tries to identify the relationships among scenario control, user, and environment. Then, during the practice design process, information is obtained through simulation and analysis with the scenario semantics method, operation of the actual product, and interviews with system integrator (SI) staff, and serves as the design target for the later concept product stage.
2.1 Design Practice Research
Modern scenario controllers come in various types. In this research, the practice design is carried out in cooperation with ADVANTECH Co., Ltd. to improve the existing UBIQ230 using scenario simulation in a user-driven development process. This product was developed for use in mid- to high-end homes,
Fig. 1. UBIQ 230 of ADVANTECH
is wall-mounted with a color monitor, and differs from other products on the market (shown in Figure 1).
2.2 Research Architecture
This research applies the scenario simulation method and verifies it through practice, in order to identify its contribution to the design process. The main subject is scenario lighting control, and the whole design process can be divided into three main parts (shown in Figure 2):
(1) Design strategy: analyze competitors’ products and identify the product’s main functional requirements and its positioning.
(2) Design evaluation: analyze product attributes, including scenario simulation and analysis of how the product corresponds to home styles.
(3) Design correction: survey the actual product, including interviews and operation.
Fig. 2. The research framework
3 Design Process
3.1 Stage 1: Design Strategy
The first stage is product application analysis. First, all related products are listed and their related characteristics identified (shown in Figure 3). The analysis finds that the main functions of wall-mounted scenario controllers are mostly lighting control, curtain control, air conditioner setting, security system, and AV control. When these functions are mapped to the location in
Fig. 3. The application of different field of use
Fig. 4. Analysis the present product functions to applications
Fig. 5. Attribute mapping
the home, three main applications emerge: lighting control, curtain control, and air conditioner setting. These applications are taken as the main functional requirements (shown in Figure 4). The functions of the concept product are then shown on a cross-analysis map, where the x-axis is the level of intuitive operation and the y-axis is the number of functions included. The new scenario controller is positioned as highly intuitive to operate and equipped with basic functions (shown in Figure 5).
3.2 Stage 2: Design Evaluation
The three main functions are related to user, environment, and equipment, and the three main user demands are identified as leisure, ease, and convenience. These demands are then simulated as scenarios in pictures to find which lifestyle and product-use scenario each demand corresponds to (shown in Figure 6), in order to understand the actual use environment and clarify the product appearance for the later concept design.
Fig. 6. Scenario simulation of product requirement
Next, data on home styles are collected and summarized as five types with color attributes (shown in Figure 7), each matched with a suitable adjective: (1) light, (2) mild, (3) colorful, (4) mat, and (5) metal. The first stage stresses the interaction of product, human, and environment in order to find the basic demands on the product, which are also its core value.
Fig. 7. Lifestyle with color attribution
3.3 Stage 3: Design Correction
In the following design process, the researchers operated the product in a building where it was installed and interviewed SI staff, in order to find the parts of the previous product that needed improvement and to gather more design information for the later concept product design stage:
(1) Interface: find a suitable operating interface, the corresponding monitor functions, and the panel module.
(2) System: adjust the lighting and consider additional functions for the color monitor.
(3) Operation: improve the tactile feel of the buttons and the convenience of installation and removal.
(4) Extension: scenario icons and enhanced settings.
4 Discussion
The stages above yield ten critical design issues, shown in Table 1. In accordance with these demands, two concept designs were developed. Pictures and scenario simulation show how the user interacts with the product, and a modular panel shows its compatibility with different living styles, seeking a balance between house and product from the user’s point of view. The results of this research are presented as static illustrations showing the features of the product design, the use simulation, and the match with living styles. The design features are shown in Figure 8, which presents two types, A and B, with the features of the product form and the details of each part.
Table 1. Ten design key issues
1. Interface with attribute style and highly introduction
2. Choice of material: make the tactile quality prominent and blend with the environment
3. Feedback and sound of the buttons
4. Product module characteristics
5. Retain the features of the UBIQ230
6. Correspondence between interface and buttons
7. Develop the added value of the color monitor
8. Immediate fine adjustment of the lighting
9. Easy extension of the scenario system
10. Demand for “one touch”
Then, for the use simulation (shown in Figure 9), the product is imagined in use in suitable scenarios. Finally, exploiting the modular product panel, four materials (metal, plastic, leather, and glass) are used to simulate harmony with different living styles (shown in Figure 10), giving the product varied features and tactile qualities and extending its range of use in the home. This research focuses on improving an existing product while retaining its original features. It produces two entirely new concepts using scenario syntax analysis, with the aim of giving users a better scenario control service: light and audio feedback are used to interact with the user, and different simulations are provided for different scenarios, users, and usages.
Fig. 8. The characteristic of concept design
Fig. 9. Use simulation of product (a)
Fig. 10. Use simulation of product (b)
5 Conclusion
To improve an existing product, some original characteristics must be retained, but two brand-new concepts were introduced with the aim of providing better scenario control services to users. Product design needs a ‘guidance interface’ and simple one-touch buttons, as well as lights and sounds that give good feedback and interact more with the user. Perfect tactile feedback combined with the environment brings additional value.
6 Implications
Regarding technology, system integration can be identified as a future trend, as it can bring many advantages to the family and to ‘family life’ in general. Additionally, ‘e-Platforms’ can help people to connect to the ‘e-World’. After this system matures, applications can be expanded to other public spaces such as hotels, karaoke clubs, museums or cinemas, and it will become a product people depend on.
References
1. Kelley, T.: The Art of Innovation. Doubleday, New York (2001)
2. Moggridge, B.: Design for the Information Revolution. Design DK 4, Danish Design Centre, Copenhagen (1992)
3. IDEO, Innovation Design and Interaction Design - a workshop for ACER 27 (1992)
4. The Handbook of Innovation Strategy Conference in China Productivity Center (1995)
5. Liang, Y.C.: Seven Principle of Innovation Design. The Essay of Speech in the Hong Kong Polytechnic University (2000)
6. JUNG: http://www.jung-katalog.de/index.htm
7. ABB: http://www.abb.fr/
8. CLIPSAL: http://www.clipsal.com.hk/hk/html/lang1/
9. LUTRON: http://www.lutron.com/
10. AMX: http://www.amx.com/
11. ADVANTECH: http://www.advantech.com/
Effects of Time Orientation on Design of Notification Systems
Ding-Long Huang¹, Pei-Luen Patrick Rau¹, Hui Su², Nan Tu¹, and Chen Zhao²
¹ Department of Industrial Engineering, Tsinghua University, Beijing 100084, China
² IBM China Research Lab, Beijing 100094, China
Abstract. This study investigated the effect of time orientation on notification systems. A special game was designed to test users’ performance and perception with notification systems. Significant differences in Interruption and Reaction levels were found between monochronic and polychronic participants. The results showed that time orientation does affect users’ perception and performance with notification systems. Polychronic users perceive a lower level of interruption from notification messages than monochronic users; polychronic users prefer rapid and accurate response to the stimuli provided by the notification system, while monochronic users tend to avoid it.
Keywords: Time Orientation, Notification System, Interruption, Reaction.
The objective of this study is to investigate the effect of time orientation on notification systems. The findings are presented and discussed, with a primary focus on cultural effects on notification systems and their implications for the personalized design of notification systems.
2 Background Literature
2.1 Cultural Differences
Cultural differences, which affect virtually all of human behavior [11, 12], and their relationship with information technology are emerging as a hot spot of research. Many researchers, especially in Human-Computer Interaction (HCI), devote their efforts to making computer-user interfaces sensitive to different classes of users with different cultural styles, so that those users can perform their tasks with computers safely, effectively, efficiently and enjoyably. Cultural differences can be seen not only in the explicit products and artefacts of culture, such as language, food and clothes, but also at a deeper level, such as ways of processing information and attitudes towards time and space [4]. A popular cultural framework was proposed by Hall [8-10], in which he stated that all cultures can be situated in relation to Time Orientation. This dimension shows that different cultures have different perceptions of time. Members of monochronic societies are more likely to have a linear perception of time. They are used to doing one thing at a time and take time commitments seriously. In contrast, members of polychronic societies have a cyclic rather than a linear perception of time. They do not mind doing several things simultaneously, and they regard time frames as useful but are not irritated if they cannot meet them.
2.2 Notification System
Notification systems deliver ever-changing and valuable information while allowing users to perform their current activities in an efficient and effective manner [2]. An increasing number of peripheral displays and notification systems have been designed, such as Microsoft’s Office Assistant, Toast [13], Info-Lotus [13], Sideshow [14], Scope [15], Irwin [16] and even hardware devices like Ambient Orb [17].
These types of displays share the common design goal of providing the user with access to additional information without requiring excessive levels or prolonged periods of attention [1]. Evaluation techniques for notification systems have also been developed, such as Mankoff et al.’s set of discount formative techniques [18] and McCrickard et al.’s notification system categorization framework [3]. McCrickard and his team developed and proposed the IRC framework as a guiding conceptual approach [2, 3]. It is a unifying framework for understanding, classifying, analyzing, developing, evaluating, and discussing notification systems, and it explores the balance between three parameters: interruption, reaction, and comprehension. Interruption (I) was defined as an event prompting transition and reallocation of attention focus from a task to the notification. Reaction (R) was defined as the rapid and accurate response to the stimuli provided by the notification system. Comprehension (C) was
Effects of Time Orientation on Design of Notification Systems
837
treated as the rate of which users remember and make sense of the information provided by the notification system in a later time.
3 Hypotheses
Hypothesis 1: Users who tend to have a monochronic time orientation perceive a higher level of interruption from a notification system than users who tend to have a polychronic time orientation.
Hypothesis 2: Users who tend to have a polychronic time orientation perceive a higher level of reaction to a notification system than users who tend to have a monochronic time orientation.
Monochronic users prefer to do one thing at a time, working on a task until it is finished. To a monochronic user, switching back and forth from one activity to another is not only wasteful and distracting, it is uncomfortable. Thus, they tend to avoid events prompting transition and reallocation of attention focus from a task to the notification, perceive a higher level of interruption from the notification message, and tend to avoid rapid response to the notification message. In contrast, polychronic users love to work on more than one thing at a time. To them, switching from one activity to another is both stimulating and productive. Thus, they perceive a lower level of interruption from the notification message and prefer rapid and accurate response to the stimuli provided by the notification system.
4 Methodology
4.1 Participants
Eighty-two undergraduate and graduate students at Tsinghua University, Beijing, China, were recruited as participants. All had at least one year of experience with computers, using them for 5 hours per week on average. The participants’ ages ranged from 19 to 30 years (mean=22.43, SD=2.025). 44 participants were male and 38 were female. The participants were categorized as having a polychronic or monochronic time orientation according to their responses to the cultural dimension questionnaire [19].
4.2 Task
Participants were asked to complete a simple hitting dog game as the primary task while monitoring the information displayed by the notification system as the secondary task. The game lasted for 5 minutes. Participants’ hit rates with and without notification messages displayed were recorded automatically. After the test, they were asked to fill in a questionnaire with questions about the information shown by the notification system, a subjective rating of the interruption, and general satisfaction.
D.-L. Huang et al.
Fig. 1. The hitting dog game, without notification message appearing
Fig. 2. The hitting dog game, with notification messages appearing
Hitting Dog Game (Primary task). In the game, a dog rose at random from one of nine holes, and the user needed to click on the dog as soon as possible. Every time the user “hit” the dog in time, ten points were awarded. Because the game is simple, the primary task measured only the users’ motor skills and did not depend on other capabilities, such as software proficiency. To familiarize themselves with the procedure, participants practiced for 2-3 minutes.
Notification Monitoring Task (Secondary task). While playing the hitting dog game, participants were asked to observe the information displayed in the bottom right corner of the screen as the secondary task. They were told the following scenario: the user, a member of a group, had received a message from his/her manager about attending an activity at the weekend, and some team members had replied with suggestions on the time, location and content of the activity. The user then tracked that series of notification messages while performing the primary task (the hitting dog game) for five minutes.
There were 26 messages from the manager and four other team members: Yao Ming, Liu Xiang, Liu Dehua and Zhao Wei (this research used these celebrities as team members because users are familiar with them and need little attention to recall them). The performance of the participants in the game, including their hit rate with and without the messages appearing, was recorded and used as the objective rating of the interruption level. The hitting dog game and the notification system prototype, shown in Fig. 1 and Fig. 2, were built with Macromedia Flash™ MX 2004.
Each participant was given a comprehension test immediately after the task. They were asked several questions about the contents of the notification messages, such as “where are you going for the weekend activity” and “what is the time to gather”. There were 8 questions in total: 7 single-choice questions and 1 essay question.
The next part of the questionnaire measured general satisfaction with the interruption level of the notification messages. The scale ran from 1 to 7, 1 being the least and 7 being the strongest.
The last part of the questionnaire consisted of 6 questions on notification displays, designed to measure reaction to the peripheral display, e.g. “While you are editing files or programming on the computer, a notification system prompts you that you have got a new message. What would you do with the prompt?”
4.3 Dependent Variables
Data were collected by recording users’ performance in the hitting dog game and their answers to the questionnaires. Two dependent variables were used.
Interruption level was the variable used to evaluate how seriously the participants were interrupted by the notification messages. This study used both an objective and a subjective measurement to evaluate interruption level.
The objective measurement was the difference between participants’ hit rates with and without the messages appearing in the hitting dog game. The subjective measurement was their rating of general satisfaction with the interruption level of the notification messages.
Reaction level was the variable used to evaluate the extent to which participants tend to avoid events prompting transition and reallocation of attention focus from a task to the notification, and tend to avoid rapid response to the notification message. It was measured by the last part of the questionnaire, which consisted of 6 questions about their reaction to notification displays.
4.4 Independent Variables
Participants’ time orientation was the independent variable in testing the proposed hypotheses. The participants were categorized as having a polychronic or monochronic time orientation according to their responses to the cultural dimension questionnaire [19].
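As a rough illustration of the objective interruption measure described in Section 4.3, the sketch below computes a hit rate for each condition and takes their difference. The function names, the use of percentage points, and the example counts are assumptions made for illustration only; the paper does not state how the rate was normalized.

```python
def hit_rate(hits: int, appearances: int) -> float:
    """Percentage of dog appearances that the participant clicked in time."""
    if appearances <= 0:
        raise ValueError("appearances must be positive")
    return 100.0 * hits / appearances


def objective_interruption(hits_without: int, shown_without: int,
                           hits_with: int, shown_with: int) -> float:
    """Drop in hit rate (percentage points) while messages were on screen.

    Assumption: a positive value means the participant hit fewer dogs
    while notification messages were appearing.
    """
    return hit_rate(hits_without, shown_without) - hit_rate(hits_with, shown_with)


# Hypothetical counts, not data from the study.
drop = objective_interruption(hits_without=90, shown_without=100,
                              hits_with=42, shown_with=50)
print(f"hit rate dropped by {drop:.1f} percentage points")
```

A relative decrease (the drop divided by the hit rate without messages) would be an equally plausible reading of the measure.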
4.5 Procedures
The experiment was conducted with the following procedure:
1. Prior to the test, the demographic information of the participants was collected, such as name, gender, age and education.
2. The participants were then given the cultural dimension questionnaire [19].
3. A short practice and explanation session for the notification system was provided for the participants.
4. The participants carried out the task, which took about 10 minutes.
5. The questionnaire was given to the participants after they completed the task.
The entire process took each participant around 25 minutes to complete.
5 Results
5.1 Effect of Time Orientation on Interruption
It was hypothesized that users who tend to have a monochronic time orientation perceive a higher level of interruption from a notification system than users who tend to have a polychronic time orientation. This research used two measurements of users’ interruption level. One was users’ subjective evaluation of the interruption severity of the notification messages, on a scale from 1 to 7, 1 being the least and 7 being the strongest. The other was the decrease in users’ hit rate when notification messages were appearing compared with when no messages were appearing, which can be viewed as an objective measurement. The results of testing this hypothesis are shown in Table 1. A significant difference was found in users’ subjective evaluation of interruption (t=-2.279, p=0.026), indicating that monochronic users perceive a higher level of interruption from the notification messages than polychronic users. However, no significant difference was found in the objective measurement of interruption (t=-1.267, p=0.210); some factors that could affect users’ hit rate might have been missed.
Table 1. Results for testing hypothesis 1
Polychronic/Monochronic Time Orientation
Variables                  Polychronic (N=36)    Monochronic (N=36)    t         p
                           Mean     SD           Mean     SD
Objective interruption     5.65     4.654        7.31     6.320        -1.267    0.210
Subjective interruption    2.6      1.02         3.3      1.43         -2.28     0.026
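As a cross-check on Table 1, the t statistics can be approximately recomputed from the reported means, standard deviations and group sizes. The sketch below assumes an independent two-sample t-test with pooled variance; the paper does not say which variant was used, and the function name is hypothetical.

```python
import math


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int):
    """Two-sample t statistic with pooled variance, from summary statistics.

    Returns (t, degrees_of_freedom). Assumes independent groups.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# Values copied from Table 1 (polychronic first, monochronic second).
t_objective, df = pooled_t(5.65, 4.654, 36, 7.31, 6.320, 36)
t_subjective, _ = pooled_t(2.6, 1.02, 36, 3.3, 1.43, 36)
print(f"objective:  t = {t_objective:.3f}, df = {df}")
print(f"subjective: t = {t_subjective:.3f}, df = {df}")
```

The recomputed values (about -1.27 and -2.39) are close to the reported -1.267 and -2.28; the small gaps are presumably due to rounding of the means and standard deviations in the table.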
Users’ performance on the secondary task (observing notification messages) could affect their performance on the primary task (the hitting dog game): users who paid much attention to the notification messages were more likely to have a low hit rate than users who tended to ignore the messages. Users’ performance on the secondary task can be indicated by their comprehension level, the mean of which across all participants was 7.63. A t-test of the mean objective interruption was therefore conducted for participants whose comprehension level was above 7.63, that is, participants who had paid relatively close attention to the notification messages. The results are shown in Table 2. The p value for the objective measurement of interruption dropped from 0.210 to 0.085.
Table 2. Results for testing hypothesis 1 (for participants whose comprehension level is above the mean comprehension level)
Polychronic/Monochronic Time Orientation: Polychronic (N=23), Monochronic (N=24)
5.2 Effect of Time Orientation on Reaction
It was hypothesized that users who tend to have a polychronic time orientation perceive a higher level of reaction to a notification system than users who tend to have a monochronic time orientation. The results of testing this hypothesis are shown in Table 3. A significant difference was found in users’ reaction level (t=-2.112, p=0.038), indicating that polychronic users prefer rapid and accurate response to the stimuli provided by the notification system, while monochronic users tend to avoid it.
Table 3. Results for testing hypothesis 2
Polychronic/Monochronic Time Orientation
Variables    Polychronic (N=36)    Monochronic (N=36)    t        p
             Mean     SD           Mean     SD
Reaction     8.3      1.93         9.2      1.64         -2.11    0.038
6 Conclusions and Recommendations
This study has revealed that time orientation does affect users’ perception of and performance with notification systems. Polychronic users perceive a lower level of interruption from notification messages than monochronic users. Polychronic users prefer rapid and accurate response to the stimuli provided by the notification system, while monochronic users tend to avoid it. Based on these findings, this paper suggests that time orientation should be taken into account when designing notification systems for users with different cultural styles, for instance:
− Monochronic people, like Scandinavians and North Americans [10], tend to avoid events prompting transition and reallocation of attention focus from a task to the notification, and perceive a higher level of disturbance from the notification message. To decrease the interruption level for them, text filtering technology [20] can be applied, which enables monochronic users to refuse notification messages that come at specific times, from specific senders, or about specific topics.
− Since polychronic users prefer rapid and accurate response to the stimuli provided by the notification system, while monochronic users tend to avoid it, notification messages provided to polychronic users should be actionable; for example, clicking on a message should let the user reply to its sender. When an anti-virus application presents warning messages to monochronic users, it should be able to take the default action (e.g. deleting the infected files) automatically.
It should be noted that the above suggestions for designing notification systems for users from different countries are based on Hall’s cultural categorization [10], which was developed before the 1990s. The world has moved on, and the situation has probably changed during this period. This is especially true for the young generation in China: most young people are the only child in their family, and many are greatly influenced by Western culture, so they probably have a different cultural style from traditional Chinese. The above suggestions should therefore be applied carefully, and further research on the young generation’s cultural style is necessary.
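The first recommendation above, letting monochronic users refuse messages by time, sender, or topic, can be sketched as a simple rule check. This is only an illustration under stated assumptions: the class and field names are hypothetical, and exact matching on a topic label stands in for the text filtering technology cited as [20], which is not implemented here.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional, Set


@dataclass
class Notification:
    sender: str
    topic: str
    received_at: time


@dataclass
class RefusalRules:
    """Hypothetical user preferences for refusing notifications."""
    blocked_senders: Set[str] = field(default_factory=set)
    blocked_topics: Set[str] = field(default_factory=set)
    quiet_start: Optional[time] = None
    quiet_end: Optional[time] = None

    def in_quiet_period(self, t: time) -> bool:
        if self.quiet_start is None or self.quiet_end is None:
            return False
        if self.quiet_start <= self.quiet_end:
            return self.quiet_start <= t < self.quiet_end
        # Quiet period wraps past midnight, e.g. 22:00 to 07:00.
        return t >= self.quiet_start or t < self.quiet_end

    def refuses(self, n: Notification) -> bool:
        """True if the message matches any sender, topic, or time rule."""
        return (n.sender in self.blocked_senders
                or n.topic in self.blocked_topics
                or self.in_quiet_period(n.received_at))


# Hypothetical rules: no messages during a morning work block.
rules = RefusalRules(blocked_senders={"mailing-list"},
                     quiet_start=time(9, 0), quiet_end=time(12, 0))
message = Notification("manager", "weekend activity", time(10, 30))
print("refused" if rules.refuses(message) else "shown")
```

A real system would more likely defer refused messages than discard them, so that the user can review them after finishing the current task.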
References
1. McCrickard, D.S., Czerwinski, M., Bartram, L.: Introduction: design and evaluation of notification user interfaces. International Journal of Human-Computer Studies 58(5), 509–514 (2003)
2. McCrickard, D.S., Chewar, C.M.: Attuning notification design to user goals and attention costs. Communications of the ACM 46(3), 67 (2003)
3. McCrickard, D.S., Chewar, C.M., Somervell, J.P., Ndiwalana, A.: A model for notification systems evaluation-assessing user goals for multitasking activity. ACM Transactions on Computer-Human Interaction 10(4), 312–338 (2003)
4. Rau, P.-L.P., Liang, S.-F.M.: A study of cultural effects of designing a user interface for a web-based service. International Journal of Services Technology and Management 4(4/5/6), 480–493 (2003)
5. Fukuyama, F.: Trust: The Social Virtues and the Creation of Prosperity. Hamish Hamilton (1995)
6. Hampden-Turner, C., Trompenaars, F.: The Seven Cultures of Capitalism. Piatkus, London (1994)
7. Lessem, R., Neubauer, F.: European Management System. McGraw-Hill, London (1994)
8. Hall, E.T.: Dance of Life: The Other Dimension of Time. Intercultural Press, Yarmouth, ME (1983)
9. Hall, E.T.: Beyond Culture. Doubleday, New York (1987)
10. Hall, E.T.: Understanding Cultural Differences. Intercultural Press, Yarmouth, ME (1990)
11. Ferraro, G.: The culture dimension of international business, 2nd edn. Prentice-Hall, Englewood Cliffs (1994)
12. Terpstra, V., David, K.: The cultural environment of international business, 2nd edn. South-Western Publishing (1985)
13. Jianguang, Z.: Environmental hazards in the Chinese public’s eyes. Risk Analysis 14(2), 163 (1994)
14. Cadiz, J., Gupta, A., Jancke, G., Venolia, G.D.: Sideshow: Providing Peripheral Awareness of Important Information. Microsoft Technical Report 2001-83 (2001). Retrieved from http://research.microsoft.com/pubs/view.aspx?tr_id=488
15. Dantzich, M.v., Robbins, D., Horvitz, E., Czerwinski, M.: Scope: Providing awareness of multiple notifications at a glance. In: Proceedings of the 6th International Working Conference on Advanced Visual Interfaces (2002)
16. McCrickard, D.S.: Maintaining information awareness with Irwin. In: Proceedings of the World Conference on Educational Multimedia/Hypermedia and Educational Telecommunications (ED-MEDIA ’99) (1999)
17. Hsieh, G., Mankoff, J.: A Comparison of Two Peripheral Displays for Monitoring Email: Usability, Awareness, and Distraction. Tech Report UCB//CSD-03-1286, U.C (2003)
18. Mankoff, J., Dey, A.K., Hsieh, G., Kientz, J., Lederer, S., Ames, M.: Heuristic evaluation of ambient displays. Association for Computing Machinery, Ft. Lauderdale, FL, United States (2003)
19. Plocher, T., Zhao, C., Liang, S.-F.M., Sun, X., Zhang, K.: Understanding the Chinese User: Attitudes Toward Automation, Work, and Life. In: Proceedings of HCI International, New Orleans, LA (2001)
20. Oard, D.W., Marchionini, G.: A conceptual framework for text filtering. Report EE-TR96-25 (1996)
Being Together: User’s Subjective Experience of Social Presence in CMC Environments
Ha Sung Hwang and SungBok Park
Department of Journalism and Mass Communication, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul, South Korea
{[email protected], [email protected]}
Abstract. The concept of presence, or “being there,” has become a central issue for many researchers who study human-computer interaction. Although several dimensions of presence have been discussed in the literature, here we focus specifically on social presence as the feeling of “being together” in mediated communication environments by relating it to three concepts: co-presence, mutual awareness and connectedness. We propose that this conceptualization is applicable to studying social interaction through various types of CMC technologies.
Keywords: Social Presence, Co-Presence, Mutual Awareness, Connectedness, Computer-Mediated Communication.
communication environment by relating it to the three concepts: co-presence, mutual awareness and connectedness. We believe that this attempt will help expand the conceptualization of social presence in CMC research. This paper begins by reviewing the classical social presence theory developed by Short, Williams and Christie and presents criticisms of this approach. We then propose a new concept of social presence that can be applied to CMC environments. Finally, by discussing measurement issues of social presence and its effects on CMC, we suggest a direction for future study.
2 Theoretical Background: Social Presence
2.1 Classical Social Presence Theory
The concept of social presence was first introduced in 1976 by Short, Williams and Christie. They define social presence as the “degree of salience of the other person in the interaction and the consequent salience of the interpersonal relationships” [1] and regard it as a characteristic of the medium. In their original work, “The Social Psychology of Telecommunications,” they state:
We regard Social Presence as being a quality of the communications medium. Although we would expect it to affect the way individuals perceive their discussions, and their relationships to the persons with whom they are communicating, it is important to emphasize that we are defining Social Presence as a quality of the medium itself. We hypothesize that communications media vary in their degree of Social Presence, and that these variations are important in determining the way individuals interact [1, p.65].
Short et al. measured social presence with a series of bipolar scales: sociable-unsociable, personal-impersonal, sensitive-insensitive, and warm-cold. The greater the degree of social presence in a medium, the more sociable, personal, sensitive, and warm the medium is perceived to be. In other words, when social presence declines, people feel that messages are more impersonal [2]. To test the extent to which social presence varies among communication media, they compared three media: face-to-face, closed-circuit television and an audio-only system. They found that face-to-face was rated highest in social presence, followed by video (closed-circuit television) and audio only. The factors that contribute to social presence are the capacity to transmit information about facial expression, direction of looking, posture, dress and nonverbal vocal cues. Since communication media differ in conveying such factors, social presence is referred to as a property of a medium in communication [2].
The influence of the classic social presence theory can be seen in the earliest CMC studies. Applied to CMC, the theory suggests that CMC is a medium with low social presence. In text-based CMC environments in particular, the non-verbal cues that determine the degree of social presence (e.g., facial expression, tone of voice, gestures, direction of gaze, posture, and dress) are missing, and this is said to directly decrease social presence.
2.2 Criticisms of Short et al.’s Social Presence
Although Short et al.’s work is credited with providing broad theoretical currency to the concept of social presence, it has been criticized for theoretical and methodological weaknesses. Several critical concerns are as follows:
(a) Although Short et al. consider social presence to be a quality of a medium, their measure of social presence is based on users’ judgments of the medium. Social presence may therefore be regarded not as a characteristic of the medium but as the user’s experience of the medium [3].
(b) The direction of causality is questionable: “whether the actual characteristics of the media are the causal determinants of communication differences or whether users’ perceptions of media alter their behavior” [4, p.164].
(c) Short et al.’s social presence measurement (a series of bipolar, seven-point semantic items including impersonal-personal, unsociable-sociable, insensitive-sensitive, and cold-warm) is regarded as inadequate and inconsistent with its definition. If social presence is defined as “degree of salience of other person,” then the perceived feeling concerning others should be an important measure [5].
As a result, many recent attempts have been made to redefine social presence as a property of the user rather than a property of a communication medium. When we try to distinguish the experiences of individuals in physical environments from those in mediated environments, our understanding of what it means to feel present and what creates that feeling of social presence becomes a more important issue. Researchers have examined social presence in various types of CMC technologies, ranging from low-bandwidth interactive text technologies (e.g., email, bulletin boards) to high-bandwidth audio-visual technologies (e.g., video and computer teleconferencing systems).
In the next section, by exploring theoretical definitions of social presence discussed in the CMC literature, we propose a new conceptualization of social presence that is applicable to the study of CMC.
2.3 Definitions of Social Presence in CMC
This section defines the theoretical construct of social presence as a sense of “being together” by exploring its relationship with the concepts of co-presence, mutual awareness, and connectedness.
Social Presence as Co-Presence
Recent communication and human–computer interaction researchers who study social presence have begun to apply the “spatial metaphor” in understanding the concept of social presence (see [6] for a more detailed discussion). In the context of telecommunications in education, one study describes social presence as the feeling that the people with whom one is collaborating are in the same room [7]. In this view, social presence is the feeling of “co-presence”: individuals feel as if they are located in the same environment. In this case, social presence is the sense of being with another in the same location, space, or room. Therefore, full conditions of social presence are achieved when individuals sense co-presence (being with another) and co-location (same place) at the same time. The term co-presence originated in the work of Goffman [8]. According to him, co-presence occurs when people sense that they are able to perceive others and that others are able to actively perceive them. Further, Goffman [8] explained that co-presence occurs when individuals “sense that they are close enough to be perceived in whatever they are doing including their experiencing of others, and close enough to be perceived in this sensing of being perceived” (p. 17). Therefore, it is argued that co-presence is another way of evaluating the sense of connection with another person [5].
Social Presence as Mutual Awareness
Other researchers expand this concept of co-presence to “mutual awareness.” Biocca and Nowak [9] define social presence as the “level of awareness of the co-presence of another human being or intelligence.” This definition stresses “attention to the sensory properties of the other”, especially an awareness of both the user and the other [10]. In this sense, the two participants in an interaction recognize each other’s existence, feeling, for instance, “I am present with him/her, she/he is present with me.” Moreover, McIsaac and Gunawardena define social presence as the degree to which a person feels “socially present” in a mediated situation, linking the issue to the social context, which includes social interaction [11]. Similarly, social presence is regarded as the feeling of being socially present with another person at a remote location [12]. Such mutual awareness, however, is achieved along with the other’s reaction to the user. Heeter [13] defines social presence as “the extent to which other beings in the world appear to exist and react to the user” in a virtual world.
This definition implies that the reaction of the other allows the user to feel that “he/she is there” in the mediated interaction. Durlach and Slater go further. They see social presence as “virtual togetherness” in a mediated environment [14]. Taken together, users are aware of the “existence of the others” and that “they are there,” and therefore they feel “togetherness” in a virtual space or mediated environment.
Social Presence as Connectedness
Connectedness is regarded as a useful paradigm for understanding mobile phone and text message use. To some scholars, primarily those who study various types of CMC, social presence is a sense of connectedness. In psychology, connectedness is one of the basic human needs, and this fundamental need for belonging promotes social relationships and motivates social behavior [15]. This concept of “connectedness” is related to Biocca et al.’s definition of social presence as psychological involvement. They argue that a sense of social presence is activated when a user perceives access to another intelligence and its reaction to the user in any social interaction (e.g., human-to-human as well as human-to-computer interaction). Therefore, for Biocca et al. a sense of social presence consists of “cognitive states” that involve “some form of mental model of the other” [10]. Other CMC researchers expand this concept of social presence as perception of the other intelligence (or person) to an experience of connectedness, that is, a feeling of being with others. This feeling is realized in particular through Instant Messaging systems (e.g., MSN messenger) that provide an awareness technology called the “buddy list,” which allows users to monitor the online state of others. In a study of college students’ IM usage, Hwang [3] describes social presence as “emotional connectedness” and found that to some extent a sense of social presence occurs in the context of chatting on IM. Moreover, IJsselsteijn et al. [16] suggest that the concepts of social presence and connectedness are complementary, arguing that in awareness systems “the sense of connectedness, the feeling of being in touch can be strong.” Studies suggest that text messages on mobile phones evoke the experience of connectedness. For instance, in a study of communication channel choice, Rettie [17] concluded that the need for connectedness was the most important factor in choosing among four new technologies: Instant Messenger, email, text messages and mobile phones. In general, respondents in the study felt most connected when using mobiles, followed by IM and text, with email providing the least connection. Townsend [18] views the mobile phone as a device that makes a person feel connected and not alone in the world.
3 Effects of Social Presence in CMC Research
This section reviews some results from CMC studies dealing with social presence. It is important to note, however, that these studies used Short et al.’s social presence theory as the main theoretical framework. According to the CMC literature, a medium that evokes high social presence should be able to convey social context and provide for more expressive communication and interaction, while a medium with low social presence would be less able to provide these. Early research on social presence in text-based CMC concluded that CMC was unable to provide social context cues and therefore evoked limited social presence, because it was perceived as an impersonal medium. In particular, text-based CMC is unable to deliver non-verbal cues, and the lack of nonverbal cues to convey feeling and value in a text-based communication medium makes it difficult for a group to reach an agreement when more than simple facts are involved [19]. However, recent text-based CMC research has shown contrary findings. For instance, Daft and Lengel [20] argued that although the lack of nonverbal information would result in pragmatic interchanges, in some cases this could be beneficial: “When messages are very simple or unequivocal, a lean medium such as CMC is sufficient for effective communication. Moreover, a lean medium is more efficient, because shadow functions and coordinated interaction efforts are unnecessary” (p. 571). Other CMC studies suggest that text-based CMC seems to be effective if users learn how to deliver the necessary immediacy to increase interaction. Gunawardena argues that increasing the intensity of immediacy can enhance social presence [4]. For instance, a teacher can help students become better acquainted with each other through self-introductions and social exchanges and by promoting online etiquette.
Walther notes that those who communicate with each other using only a text-based communication medium try to achieve desired levels of immediacy by manipulating verbal immediacy in the textual environment [2].
Several researchers studied students’ perceptions of social presence and found an impact of social presence on CMC. Steinfeld [21] found that social presence was positively related to greater social use of electronic mail. Social presence was also related to college students’ Internet use [22]; “those who perceived the Internet as warm, social and active, used it primarily to fulfill pass time, convenience, and entertainment desires and for interpersonal utility” (p.191). Garramone et al. [23] found that the social presence of computer bulletin boards was higher for interactive than for non-interactive users. In a study of college students’ use of CMC such as electronic mail and bulletin boards, Perse et al. [24] found a positive relationship between social presence and the students’ perception of their own computer expertise. A survey of 130 respondents revealed that weekly computer use and social presence were significant and positive predictors of CMC. “Feeling that electronic mail was sociable, warm, personal, and sensitive was linked to higher levels of CMC. Clearly students used CMC more when they felt that email conveyed more interpersonal presence” (p. 167). Tu [25] examined the relationship between social presence and various types of CMC such as e-mail, bulletin boards, and real-time discussion. The results indicated that email is perceived to possess the highest level of social presence, followed by real-time discussion and bulletin boards. Such different degrees of impact on social presence, he argued, “not only come from the attributes of CMC systems, but also the uses and various perceptions of CMC systems” (p. 21). Attributes of CMC systems (e.g., asynchronous and synchronous), the typing skills required, use of spell checks, speed, and accuracy of messages are critical factors that affect students’ perceptions of social presence in CMC environments.
According to Foulger [26], experienced computer users assessed some text-based media, such as email and computer conferencing, as “rich” or “richer” media than telephone, television, and face-to-face conversations. CMC studies suggest that social presence is a significant predictor of the user’s satisfaction with CMC. Hackman and Walker [27] found social presence to be a predictor of satisfaction in an interactive television class. Gunawardena and Zittle [28] assessed students’ reactions through their range of feelings toward the medium, a text-based computer conference. Students were asked to consider how online interaction fostered a sense of community and in what instructional situations online discussions felt personal or impersonal. Findings suggested that students consider the social presence of other students and instructors an important predictor of satisfaction in a text-based computer conference. Moreover, the results indicated that “participants who felt a higher sense of social presence within the conference enhanced their socioemotional experience by using emoticons to express missing nonverbal cues in written form” (p. 23). Richardson and Swan [29] studied the role of social presence in web-based online learning. They found a positive relationship among students’ perceptions of social presence in online courses, their perceived learning experience, and their satisfaction with the instructor. Their findings indicated that students who felt high social presence learned more from the online courses and were more satisfied with the instructor than students with low perceived social presence. Blocher [30] concluded that confidence, choice, and involvement had impacts on the levels of social presence of learners. When learners felt more confident, made learning choices, and were actively engaged in learning activities, a higher level of social presence was demonstrated.
4 Future Research

Once a general understanding of social presence as a property of the medium user has been reached, how to measure it becomes a central issue for future research. Given that there is no widely accepted measure of social presence, many CMC researchers have called for its development. Subjective measures administered via survey questionnaires appear to be the most common method in CMC settings. For instance, focusing on the concept of immediacy, Gunawardena and Zittle [28] developed a 14-item scale to measure social presence. Tu [25] argues that social presence is a complicated construct that involves privacy, social relationships, communication styles, nature of the task, feedback, and immediacy. He created a 42-item questionnaire that identified social context, online communication, and interactivity as factors that contribute to social presence. Reviewing existing studies in the literature on computer-supported cooperative systems, Lin [31] developed 20 items to measure social presence and classified the 12 items that remained after factor analysis into three factors: “perception of the assistance of group activity to learning,” “social comfort of expressing and sensing affect,” and “social navigation.” Despite these valuable efforts, the instruments are limited for several reasons: (1) the questionnaires contain a large number of questions, so other researchers may be reluctant to add them to their existing surveys; (2) the instruments are typically applicable to a specific CMC environment such as distance collaborative learning, so it is difficult to use them to measure social presence in other types of interactions, goals, or tasks; and (3) many social presence questionnaires include statements addressing technological aspects (e.g., interactivity, immediacy), which can be viewed as factors that contribute to a sense of social presence.

With regard to the methodological issue, we suggest that researchers aim to develop social presence questionnaires that measure the construct as it is defined. This approach will be more effective in establishing a solid and precise measure of social presence because such a measure can be easily generalized to any technology involved. In addition, given that social presence is not a technological term but a psychological term concerning the “perception” of another person in a mediated environment, we suggest that researchers avoid items that address technological issues and/or assumptions when measuring social presence. The technological questions extracted from the measurement of social presence can be measured separately, in the way we will discuss later. Although the importance of social presence in users’ experiences with CMC tools has been acknowledged for some time, relatively little research has explored the factors that contribute to a sense of social presence. Therefore, investigating the diverse variables that produce social presence remains a challenge for CMC researchers. Lombard and Ditton [32] suggest that characteristics of a medium's form (e.g., interactivity, sensory stimuli, visual and audio display) and content (e.g., social realism) as well as the characteristics of the medium users (e.g., personality, prior experience with the medium) are predictive factors that cause a sense of “being there.” Among these variables, characteristics of the medium’s form are specifically
suited to CMC technologies. Social presence studies in the CMC literature have shown that text-based communication such as email is considered to have less potential to evoke social presence than visual media such as video conferencing. To develop measures of technological factors, researchers should consider several aspects: interactivity (e.g., one-way or two-way), immediacy (e.g., synchronous or asynchronous), and display ability (e.g., text-based only or multimedia-based). There are also factors that have not yet been explored, in particular with respect to users’ characteristics. We suggest that researchers consider other factors, such as motivations to use a particular CMC tool, perceived ease of use of that technology, and the user’s need for belonging, and investigate how these variables relate to the user’s sense of “being there” while using telecommunication devices. Investigating such human factors as potential variables that affect social presence will be valuable because it will help answer an important question: how social presence operates in CMC environments.
5 Summary and Conclusion

This paper reviewed the current literature concerning the concept of social presence. A fundamental problem in social presence research is the question of how to measure social presence. By discussing the limitations of Short et al.’s early social presence theory, we argued that social presence is not a characteristic of the medium but a user’s subjective experience of any given medium. We also suggested that an appropriate way to measure social presence is to measure it as conceptualized. We defined social presence as a sense of “being together” in mediated communication by relating it to co-presence, mutual awareness, and connectedness. We hope this conceptualization of social presence can contribute to the CMC literature in exploring the effects of the emergence and proliferation of CMC technologies.
References

1. Short, J.A., Williams, E., Christie, B.: The social psychology of telecommunications. John Wiley & Sons, Ltd, London (1976)
2. Walther, J.B.: Interpersonal effects in computer-mediated interaction: A relational perspective. Communication Research 19(1), 52–90 (1992)
3. Hwang, H.: Predictors of IM use among U.S. College Students: Gratifications Sought, Gratifications Obtained, and Social Presence. In: Paper presented at the International Communication Association, New York (2005)
4. Gunawardena, C.N.: Social presence theory and implications for interaction collaborative learning in computer conferences. International Journal of Educational Telecommunications 1(2/3), 147–166 (1995)
5. Nowak, K.: Defining and Differentiating Copresence, Social presence and Presence as Transportation. In: Paper presented at the Presence Workshop, Philadelphia (2001)
6. Towell, J., Towell, E.: Presence in text-based networked virtual environments or MUDS. Presence: Teleoperators and Virtual Environments 6(5), 590–595 (1997)
7. Mason, R.: Using communications media in open and flexible learning. Kogan Page, London (1994)
8. Goffman, E.: Behavior in Public Places: Notes on the Social Organization of Gatherings. The Free Press, New York (1963)
9. Biocca, F., Nowak, K.: Plugging your body into the telecommunication system: Mediated embodiment, media interfaces, and social virtual environments. In: Lin, C., Atkin, D. (eds.) Communication technology and society, pp. 407–447. Hampton Press, Waverly Hill, VI (2001)
10. Biocca, F., Harms, C., Burgoon, J.: Toward a More Robust Theory and Measure of Social Presence: Review and Suggested Criteria. Presence: Teleoperators and Virtual Environments 12, 456–480 (2003)
11. McIsaac, M.S., Gunawardena, C.N.: Distance education. In: Jonassen, D. (ed.) Handbook for research on educational communications and technology, pp. 403–437. Scholastic Press, New York (1996)
12. Sallnas, E., Rassmus-Grohn, K.: Supporting presence in collaborative environments by force feedback. ACM Transactions on Computer-Human Interaction 7(4), 461–476 (2000)
13. Heeter, C.: Being There: The subjective experience of presence. Presence: Teleoperators and Virtual Environments 1(2), 262–271 (1992)
14. Durlach, N., Slater, M.: Presence in shared virtual environments and virtual togetherness. Presence: Teleoperators and Virtual Environments 9(2), 214–217 (2000)
15. Smith, E., Mackie, D.: Social Psychology. Psychology Press, New York (2000)
16. Ijsselsteijn, W.: Staying in Touch: Social Presence and Connectedness through Synchronous and Asynchronous Communication Media. In: Proceedings of HCI International Conference on Human-Computer Interaction, pp. 924–928. Lawrence Erlbaum Associates, New Jersey (2003)
17. Rettie, R.M.: A Comparison of Four New Communication Technologies. In: Proceedings of HCI International Conference on Human-Computer Interaction, pp. 686–690. Lawrence Erlbaum Associates, New Jersey (2003)
18. Townsend, A.M.: Mobile Communications in the Twenty-first Century. In: Brown, B., Green, N., Harper, R. (eds.) Wireless World, Springer Verlag, United Kingdom (2001)
19. Hiltz, S.R., Johnson, K., Turoff, M.: Experiments in group decision making: Communication process and outcome in face-to-face versus computerized conference. Human Communication Research 13(2), 225–252 (1986)
20. Daft, R., Lengel, R.: Organizational information requirements, media richness and structural design. Management Science 32(5), 554–571 (1986)
21. Steinfield, C.W.: Computer-mediated communication in an organizational setting: Explaining task-related and socioemotional uses. In: Mclaughlin, M.L. (ed.) Communication Yearbook, vol. 9, pp. 777–804. Sage, Newbury Park, CA (1986)
22. Papacharissi, Z., Rubin, A.M.: Predictors of Internet Use. Journal of Broadcasting and Electronic Media 44, 175–196 (2000)
23. Garramone, G., Harris, A., Anderson, R.: Use of political computer bulletin boards. Journal of Broadcasting & Electronic Media 30(3), 325–339 (1986)
24. Perse, E.I., Burton, P., Kovner, E., Lears, M.E., Sen, R.J.: Predicting computer-mediated communication in a college class. Communication Research Reports 9(2), 161–170 (1992)
25. Tu, C.H.: The impact of text-based CMC on online social presence. The Journal of Interactive Online Learning 1(2) (2002) [On-line] Available http://www.ncolr.org
26. Foulger, D.A.: Medium as process: The structure, use, and practice of computer conferencing on IBM’s IBMPC computer conferencing facility. Unpublished doctoral dissertation, Temple University, Philadelphia (1990)
27. Hackman, M.Z., Walker, K.B.: Instructional communication in the televised classroom: The effects of system design and teacher immediacy on student learning and satisfaction. Communication Education 39(3), 196–206 (1990)
28. Gunawardena, C.N., Zittle, F.J.: Social presence as a predictor of satisfaction within a computer-mediated conferencing environment. The American Journal of Distance Education 11(3), 8–26 (1997)
29. Richardson, J.C., Swan, K.: Examining social presence in online courses in relation to students’ perceived learning and satisfaction. JALN 7(1), 68–88 (2003)
30. Blocher, J.M.: Self-regulation of strategies and motivation to enhance interaction and social presence in computer mediated communication. Unpublished doctoral dissertation, Arizona State University (1997)
31. Lin, G.: Social Presence Questionnaire of Online Collaborative Learning: Development and Validity. In: Paper presented at the Association for Educational Communications and Technology, Chicago, IL (October 19-23, 2004)
32. Lombard, M., Ditton, T.B.: At the Heart of It All: The Concept of Presence. Journal of Computer-Mediated Communication 3(2) (1997) [On-line] Available http://www.ascusc.org/jcmc/vol3/issue2/
Age Differences in Performance, Operation Methods, and Workload While Interacting with an MP3 Player

Neung Eun Kang and Wan Chul Yoon

Industrial Engineering Department, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon, Korea
{popcorn, wcyoon}@kaist.ac.kr
Abstract. This study aimed to reveal age-related interaction characteristics through user observations. The interaction behaviors of comparatively younger adults (20s) and older adults (40s to 50s) were examined while they used an MP3 player, and age-related differences in performance, operation methods, and workload were analyzed. The results reveal that the older adults’ higher error frequency, poorer ability in physical operation, and lower subjective performance are due to an age effect, while their higher ratings on the other workload aspects are due to a lack of background knowledge.

Keywords: older adults, younger adults, user observation, MP3 player.
have more difficulties than younger adults when using such complicated devices due to their lack of both the needed abilities and background knowledge. In this study, differences in the interaction behaviors were examined between younger and older adults while using a convergent MP3 player. Through user observations, age-related differences were analyzed in terms of performance, physical operation methods, and workload.
2 Method

2.1 Apparatus

The MP3 player used in the observations was an iriver model IFP-180T, which also provides FM radio and voice recorder functions. The MP3 player employs three buttons and one joystick for control of its functions. The buttons perform different functions according to how long they are pressed. The joystick performs various functions according to how long it is pressed or pushed, and some functions require a combination of these actions. As the MP3 player’s functions depend mainly on hidden physical operations of the UI input devices rather than a simple selection of menu items, the MP3 player is appropriate for comparing how people in different age groups physically interact with small convergence devices. Fig. 1 shows the front and side views of the MP3 player.

[Figure: front view and side view of the player, with labels for the neck strap, MIC, TFT LCD, joystick (+, -, ►►|), STEREO, ►/■, MODE, ●, MEMORY/EQ, and A-B controls]
Fig. 1. Apparatus: iriver IFP-180T MP3 player
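The duration-dependent controls described in Section 2.1 can be illustrated with a small sketch of how a press might be classified as short or long. This is only an illustration: the paper does not report the device's timing, so the threshold value, the control names, and the `PressEvent` type are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold: the paper does not report the device's actual timing.
LONG_PRESS_SECONDS = 1.0


@dataclass(frozen=True)
class PressEvent:
    control: str     # illustrative name, e.g. "joystick" or "mode_button"
    duration: float  # seconds the control was held


def classify_press(event: PressEvent, threshold: float = LONG_PRESS_SECONDS) -> str:
    """Return "long" if the control was held for at least `threshold` seconds."""
    if event.duration < 0:
        raise ValueError("duration must be non-negative")
    return "long" if event.duration >= threshold else "short"


def classify_sequence(events, threshold: float = LONG_PRESS_SECONDS):
    """Classify a series of presses, e.g. a short push followed by a long push."""
    return [classify_press(event, threshold) for event in events]
```

A combined operation such as "a long push after a short push" then corresponds to the sequence `["short", "long"]`. A wider gap between typical short and long durations would make the two classes easier to hit reliably, which is the kind of sensitivity issue the study examines.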
2.2 Subjects

Thirty novice users of MP3 players, 15 younger adults (21-28 years) and 15 older adults (46-58 years), participated in the user observations. To eliminate the influence of different levels of education, only participants whose education exceeded the high school level were selected. Four types of background knowledge (PCs, the Internet, electronic devices, and cellular phones), which were
considered as possibly affecting interaction behaviors, were examined prior to the observations. PC familiarity was examined through three questions: 1) How long have you used PCs? (five-point rating from “less than 1 year” to “more than 15 years”) 2) How often do you use PCs? (five-point rating from “less than one time per month” to “everyday”) 3) How many different kinds of programs do you use? (five-point rating). The total score for PC experience was the sum of the three ratings. Internet experience was measured via the same questions and ratings as those employed for PC experience, with the exception of the usage period, which ranged from “less than 1 year” to “more than 10 years.” Cellular phone experience was also measured in terms of usage period (five-point rating from “less than 1 year” to “more than 10 years”), frequency (five-point rating from “less than three times per week” to “more than ten times per day”), and use of functions (five-point rating based on the number of functions used). Participants’ level of experience with electronic devices was measured on a five-point rating according to the number of devices, such as a TV, radio, VCR, CD player, DVD player, digital camera, and PDA, that the participants had used. T-test results in Table 1 show that the differences in the background knowledge types between the two age groups are significant (p<0.05). These differences reflect the general generation gap between younger and older adults; the participants were thus deemed representative of general users for their respective groups.

Table 1. Age and background knowledge
Variable              t-test p value
Age                   <0.001
Education¹            0.823
PCs                   0.039
Internet              0.024
Electronic devices    <0.001
Cellular phones       <0.001
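The summation scoring described in Section 2.2 can be sketched as follows. The paper states only that each question uses a five-point rating and that the three ratings are summed; coding the ratings as the integers 1 to 5 is an assumption made here, and the function name is hypothetical.

```python
def experience_score(period: int, frequency: int, variety: int) -> int:
    """Sum three five-point ratings into one experience score.

    Assumption: each rating is coded as an integer from 1 (lowest) to 5
    (highest), so the total ranges from 3 to 15.
    """
    ratings = (period, frequency, variety)
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range 1..5: {rating}")
    return sum(ratings)
```

For instance, a participant who rates usage period 5, frequency 4, and program variety 3 would receive a PC experience score of 12 under this coding.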
2.3 Procedure

Before the user observations, an experimenter gave brief explanations about the observation as well as the basic operation of the MP3 player. Given that the aim was to examine the natural interaction behaviors that resulted from the participants’ own intentions and expectations, detailed operation directions for the device were not given. The participants performed six tasks (Table 2), and the interaction behaviors were recorded by both the experimenter and video. NASA-RTLX (Raw Task Load Index)
¹ Education levels: 1 = middle school or less; 2 = high school; 3 = junior college; 4 = university; 5 = master’s degree or higher.
[2] inventory ratings were used to measure the workload of the participants. After the completion of each task, the participants rated six workload aspects (mental demand, physical demand, temporal demand, performance, effort, and frustration) on a 100-point scale in increments of 10. After all tasks were completed, detailed interaction procedures were examined through interviews with the participants by referring to the experimenter’s notes and video records.

Table 2. Task instructions

Task No.   Task instructions
T-1        Turn on the player
T-2        Adjust radio frequency to 95.7MHz
T-3        Save all radio channels automatically
T-4        Change the radio mode to the MP3 player mode
T-5        Fast forward currently playing music
T-6        Play the first music file in the next folder
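The rating procedure in Section 2.3 can be sketched as a small validation and scoring routine. The raw TLX variant is commonly summarized as the unweighted mean of the six subscale ratings; the study itself analyzes the aspects individually, so the overall score below is shown only for illustration. The dictionary keys and the assumption that the scale runs from 0 to 100 inclusive are hypothetical choices.

```python
# Aspect names follow the six workload aspects listed in the text;
# the exact key spellings are illustrative.
ASPECTS = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def validate_ratings(ratings: dict) -> None:
    """Check that every aspect is rated on a 0..100 scale in steps of 10.

    Assumption: the 100-point scale includes both endpoints 0 and 100.
    """
    missing = [aspect for aspect in ASPECTS if aspect not in ratings]
    if missing:
        raise ValueError(f"missing aspects: {missing}")
    for aspect in ASPECTS:
        if ratings[aspect] not in range(0, 101, 10):
            raise ValueError(f"invalid rating for {aspect}: {ratings[aspect]}")


def raw_tlx(ratings: dict) -> float:
    """Return the unweighted mean of the six subscale ratings."""
    validate_ratings(ratings)
    return sum(ratings[aspect] for aspect in ASPECTS) / len(ASPECTS)
```

Keeping the per-aspect ratings alongside any overall score preserves the ability to compare groups on individual aspects, as the study does.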
3 Results

3.1 Performance and Error Frequency

Performance was measured by the task completion rate. The performance of the younger group (58.3%) was higher than that of the older group (50.0%). However, a t-test showed that this difference was not significant (p>0.10). To reveal the effects of age and background knowledge on performance, an analysis of covariance (ANCOVA) was conducted. The main factor was age, and the covariates were education and the participants’ background knowledge of PCs, the Internet, electronic devices, and cellular phones. The analysis showed that neither age nor any type of background knowledge had a significant effect (p>0.10). Error frequency was measured as the ratio of erroneous actions to the total number of actions. The older adults had a higher proportion of erroneous actions (53.7%) than the younger adults (43.6%), and this difference was significant (t-test: p=0.008). ANCOVA results showed that age affected the total error ratio (p=0.033), but the covariates related to types of background knowledge were not meaningful.

3.2 Physical Operation Methods

The analysis of the operation methods revealed that, in general, the younger adults were better than the older adults at carrying out long press/push actions as distinguished from short press/push operations (Table 3). Logistic regression analysis was conducted to examine the effects of age and background knowledge type on applying long operation methods distinctively from short operation methods. The analysis showed that only the age variable was meaningful, and that none of the background knowledge variables were meaningful.
The main reason for the operation method failures by older adults was their poor motor skills. Sixty percent of the older adults failed to accomplish at least one intended operation method, essentially because they could not control the operational sensitivity appropriately. Independent of age, most participants failed to discover the combined operation, as shown by their failures in a task that called for a long push after a short push with the joystick. Only 27% of the younger adults and 13% of the older adults performed this combined operation.

Table 3. Physical operation methods: use of long operations as distinguished from short operations

Operation methods             Younger adults (mean)   Older adults (mean)   Logistic regression: meaningful variables
Joystick – short/long push    80%                     47%                   Age (p=0.0253)
Joystick – short/long press   87%                     67%                   Age (p=0.0895)
Buttons – short/long press    87%                     80%                   Age (p=0.0346)
3.3 Workload

The older adults reported a higher workload than the younger adults on all aspects. The differences were significant except for the reported physical demand (Table 4). ANCOVA results indicated that experience with cellular phones had a meaningful effect on physical demand, temporal demand, and frustration level, while age had a meaningful effect on the performance aspect.

Table 4. Workload
4 Discussion

4.1 Age-Related Differences

Based on the results, older adults can be expected to achieve levels of performance similar to those of younger adults when interacting with a new device, but they will make more errors. In addition, older adults lack motor skills, causing them to make frequent errors when executing intentional physical operations. More erroneous actions and poorer performance of physical operations
are characteristics of the older adults, and this was not found to be related to background knowledge or experience. The lower level of subjective performance of the older adults, which reflects less satisfaction with task achievement, is related to their high error frequency. Although task performance did not differ with age in this study, the NASA-RTLX results showed that the older adults had a higher workload in terms of performance than the younger adults. The older adults made more errors and had more difficulties when performing various operations in their efforts to reach performance levels similar to those of the younger adults. Therefore, the more trial and error older people go through, the less confident they may be about task completion and the heavier the performance workload they may experience. As a result, the older adults may have felt less satisfaction with their task achievement, resulting in a lower subjective performance rating than that of the younger adults. However, other workload aspects imply positive characteristics of older adults’ interaction behaviors. Higher ratings on workload aspects other than performance are due to a lack of background knowledge rather than an age difference; thus, the higher workload ratings of the older adults can be reduced as they acquire more experience.

4.2 Implications

The results obtained here have a number of implications for the design of devices for older adults. In such devices, UI input devices should distinguish clearly between short and long operations in order to match the motor skill level of older adults. As most of the older adults in this study recognized the differences between short and long operations, it is expected that they will perform operation methods as well as younger adults if the control sensitivity is appropriate to their physical ability.
Notably, combined operation methods should be avoided, as users have difficulties with those operation methods regardless of age. The fact that higher workload ratings are due to a lack of background knowledge raises meaningful social issues. As older adults tend to underestimate their performance, they may require encouragement. To reduce levels of frustration and negative attitudes in older adults toward new devices, it is important that society offer more opportunities for older adults to learn and use new technologies in their daily life.
5 Conclusion and Further Study

This study sought to reveal the effects of age and background knowledge on interaction behaviors in terms of performance, errors, physical operation methods, and workload. This study will be helpful in understanding age-related differences in interacting with new devices. However, it is limited in that the results are based on a specific MP3 player and on user groups in their 20s and in their 40s and 50s. To obtain more general results, further studies could include various types of devices and user groups with a wider age range. In addition, further examination of other age-related characteristics regarding strategies, error types, or emotional aspects will offer meaningful insight into the design of devices for older adults.
References

1. Birdi, K., Pennington, J., Zapf, D.: Ageing and Errors in Computer-Based Work: An Observational Field Study. Journal of Occupational and Organizational Psychology 70, 35–47 (1997)
2. Byers, J.C., Bittner, A.C., Hill, S.G.: Traditional and Raw Task Load Index (TLX) Correlations: Are Paired Comparisons Necessary? In: Mital, M. (ed.) Advances in Industrial Ergonomics and Safety I. Taylor and Francis, Philadelphia PA (1989)
3. Chaparro, A., Bohan, M., Fernandez, J., Fernandez, C.S.D., Kattel, B.: The Impact of Age on Computer Input Device Use: Psychophysical and Physiological Measures. International Journal of Industrial Ergonomics 24, 503–513 (1999)
4. Czaja, S.J., Sharit, J.: The Influence of Age and Experience on the Performance of a Data Entry Task. In: Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting, pp. 144–147. Human Factors and Ergonomics Society (1997)
5. Hawthorn, D.: Possible Implications of Aging for Interface Designers. Interacting with Computers 12, 507–528 (2000)
6. Kelly, C.L., Charness, N.: Issues in Training Older Adults to Use Computers. Behaviour & Information Technology 14, 107–120 (1995)
7. Sharit, J., Czaja, S.J.: Ageing, Computer-Based Task Performance, and Stress: Issues and Challenges. Ergonomics 37, 559–577 (1994)
8. Sjölinder, M., Höök, K., Nilsson, L.G., Andersson, G.: Age Differences and the Acquisition of Spatial Knowledge in a Three-Dimensional Environment: Evaluating the Use of an Overview Map as a Navigation Aid. Int. J. Human-Computer Studies 63, 537–564 (2005)
9. Smith, M.W., Sharit, J., Czaja, S.J.: Aging, Motor Control, and the Performance of Computer Mouse Tasks. Human Factors 41, 389–396 (1999)
Appendix: NASA-RTLX

Rating scale definitions (title, endpoints, description)

Mental demand (Low/High): How much mental and perceptual activity was required (e.g., thinking, deciding, calculating, remembering, looking, searching, etc.)? Was the task easy or demanding, simple or complex, exacting or forgiving?

Physical demand (Low/High): How much physical activity was required (e.g., pushing, pulling, turning, controlling, activating, etc.)? Was the task easy or demanding, slow or brisk, slack or strenuous, restful or laborious?

Temporal demand (Low/High): How much time pressure did you feel due to the rate or pace at which the tasks or task elements occurred? Was the pace slow and leisurely or rapid and frantic?

Effort (Low/High): How hard did you have to work (mentally and physically) to accomplish your level of performance?

Performance (Good/Poor): How successful do you think you were in accomplishing the goals of the task set by the experimenter (or yourself)? How satisfied were you with your performance in accomplishing these goals?

Frustration level (Low/High): How insecure, discouraged, irritated, stressed and annoyed versus secure, gratified, content, relaxed and complacent did you feel during the task?
[Rating sheet: one scale per aspect. Mental Demand, Physical Demand, Temporal Demand, Effort, and Frustration Level run from Low to High; Performance runs from Good to Poor.]
A Usability Test of Exchanging Context in a Conference Room Via Mobile Device Interactions

Doyoon Kim, Seungchul Shin, Cheolho Cheong, and Tack-Don Han

Dept. of Computer Science, Yonsei University, 134, Seodaemun-Gu, Seoul 120-749, Republic of Korea
{dykim81, seungchul.shin, balgeum}@yonsei.ac.kr, [email protected]
Abstract. In a community such as a conference, numerous service providers and service users exist, and people interact using contexts. With improvements in context-aware computing and mobile computing technologies, human-computer interactions for exchanging contexts have been increasing. In this paper, we introduce interaction techniques, namely tag interaction and service discovery interaction, that use a mobile device to provide an efficient user interface for exchanging contexts in a conference room. We identified typical situations in which these interactions can be used: in a paper session, in a poster session, and for exchanging individual information among the attendees. We found the two interaction techniques suitable for improving the interactions for exchanging contexts in a conference.

Keywords: Context-awareness, Service discovery, Image based code.
collecting and delivering contexts to the attendees is not very difficult. Third, the purpose of a conference is to enable interactions between the attendees through the sharing and exchange of contexts among themselves, although traditionally these interactions were achieved using physical objects. We classify the mobile interactions used to exchange contexts in a conference into two types: tag interaction and service discovery interaction. Tag interaction uses a tag as an intermediate medium to provide context. To exchange contexts, users have to decode the tag using a particular device. Service discovery interaction uses a service discovery protocol that allows devices to automatically discover nearby network services and to advertise their own services to the network; these services become contexts in the conference. A similar study was carried out in [10], which investigated physical mobile interactions between a user and a smart object by means of a mobile device. It summarizes which physical mobile interaction techniques, such as touching, pointing, and scanning, are appropriate in various smart environments. We divided the interactions in a conference into three scenarios, namely, the interactions during a paper presentation, the interactions during poster sessions, and the interactions among conference participants. The main purpose of a conference is to learn about the research conducted by different people. Moreover, the attendees seek to get to know each other and to share information about themselves. In these situations, interactions in which contexts are exchanged occur frequently; therefore, we believe that it is reasonable to consider these three scenarios.
2 Related Work

2.1 Tag Interaction

The tag interaction technique identifies a tag that contains information by using a decoding device. As camera phones have become popular, image-based tags such as PDF417 [11], Quick Response (QR) code [12], and ColorCode [13] are being used in magazines, television advertisements, newspapers, etc. These tags are used as a pointer, a repository for data, and a marker that provides the location and inclination in Augmented Reality (AR) / Mixed Reality (MR). In [14], RFID tags are used as an index that points to a particular service. The tags have two types of visual symbols: a general tag that identifies the object and a special tag that represents additional information related to the object. In [15], a paper tag system is proposed in which an image code is attached to a sheet of paper. The paper tag system avoids the high cost of RFID, which requires tags, readers, and writers, because a user can generate a tag easily by printing it on paper.

2.2 Service Discovery Interaction

Service discovery interaction performs the discovery of nearby network services; further, mobile devices can advertise their own services to the network by using service discovery protocols. Jini [16] places the corresponding services into a directory called the lookup service. Service users can use the services by accessing this lookup service. The addition of a service to a lookup service consists of two
D. Kim et al.
steps: a discovery step, which determines the available lookup services with which a service can be registered, and a join step, which performs the registration. DEAPspace [17] is a decentralized discovery protocol that is suitable for mobile ad hoc networks. It is a push model solution in which each device periodically broadcasts its service list, which is called its world view.
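The push model described above can be illustrated with a minimal sketch. It is not the DEAPspace protocol itself: timing, expiry of entries, and radio range are ignored, and the class name `Device`, the function `run_round`, and the service names are hypothetical. It only shows how devices that broadcast their service lists end up with a shared view without any central directory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Device:
    """A device that keeps a world view: every service it currently knows about."""

    name: str
    own_services: set[str]
    # Maps service name -> name of the device offering it (hypothetical layout).
    world_view: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A device always knows its own services.
        for service in self.own_services:
            self.world_view[service] = self.name

    def broadcast(self) -> dict[str, str]:
        """Return a copy of the world view, as if pushed to all devices in range."""
        return dict(self.world_view)

    def receive(self, remote_view: dict[str, str]) -> None:
        """Merge a neighbour's broadcast into the local world view."""
        for service, provider in remote_view.items():
            self.world_view.setdefault(service, provider)


def run_round(devices: list[Device]) -> None:
    """One broadcast round: every device pushes its view to every other device."""
    # Take snapshots first so that all broadcasts reflect the state before the round.
    snapshots = [(device, device.broadcast()) for device in devices]
    for sender, view in snapshots:
        for receiver in devices:
            if receiver is not sender:
                receiver.receive(view)


alice = Device("alice-pda", {"business-card"})
bob = Device("bob-pda", {"poster-demo"})
run_round([alice, bob])
print(sorted(alice.world_view))  # ['business-card', 'poster-demo']
```

By contrast, a directory-based design in the style of Jini would keep this mapping in a lookup service that devices register with and query.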
3 Task Analysis
In this section, we analyze the problems and shortcomings in a conference room. We conducted an online survey in which 46 people participated. The participants were aged between 24 and 36 years, with an average age of 27.5. They all had a university degree and had attended a conference at least once; the average over the entire group was 7.2 times. Further, all the participants possessed a mobile phone. We divided the questions into three categories. The first category comprised questions regarding the paper presentation, the second comprised questions regarding the poster session, and the last comprised questions regarding interactions among the conference participants.
Fig. 1. Results of the online survey on the inconveniences caused during the paper session
The questions at the beginning of the questionnaire concerned the inconveniences experienced during the paper presentation. As shown in Figure 1, 72% of the participants believed that the most critical problem in the paper session is that interesting presentations coincide. Moreover, 52% of the respondents felt inconvenienced when seats were full, and 48% believed that the limited time for the Q&A session is a problem because it is not easy to contact the presenter after the presentation.
The next category comprised questions on the poster session. Figure 2 shows that 59% of the participants were inconvenienced while arranging for additional equipment to present a demonstration of their work. When presenters arrange for equipment, they have to remain in their poster section at all times in order to keep an eye on their equipment and to present their demonstration to others, which makes them unable to visit other posters. On the other hand, 57% of the participants found it difficult to understand a poster if its presenter was absent, for example, when authors leave their areas to view others’ work.
A Usability Test of Exchanging Context in a Conference Room
Fig. 2. Results of the online survey on inconveniences in the poster session
We then asked questions on the inconveniences experienced while interacting with conference participants. In a conference, participants interact with each other, establish friendships, and share information about themselves. They exchange business cards to maintain contact with each other. Because they share areas of interest and research, it is highly probable that these participants will meet one another at some other conference. Only a few respondents believed that the large number of business cards they receive, or a shortage of their own business cards, is a big problem. However, some of the participants reported that although they exchange business cards and establish friendships at one conference, a problem arises when they meet again at another conference and either fail to recognize each other or exchange their business cards again. This problem arises because the participants do not organize the business cards they receive.
Fig. 3. Results of the online survey on inconveniences caused while interacting with participants
4 High-Fidelity Prototype
4.1 System Architecture
Figure 4 outlines the architecture of the high-fidelity prototype. The system comprises a user device and a service provider. It supports both service discovery interaction and tag interaction. The proposed system supports both server–client communication between the service provider and the user client and peer-to-peer communication among the user devices.
Fig. 4. The system architecture of a high-fidelity prototype. It consists of a user device and service provider.
PDAs are used as the user devices. The user device confirms the identification of a user or an object using a tag, receives stored data from the service provider, and communicates with other users. It consists of five components and is developed within the .NET Compact Framework. The main control handles user requests; it is required whenever a user needs to receive stored information from the service provider or to communicate with other users. The discovery component is used when a service is requested by a user or advertised to nearby users. The tag reader reads a tag to obtain the identification of a context. The group manager manages a group in a conference and exchanges business cards.
We used a PC as the service provider. It gathers information, provides a streaming service to the user, and manages the members of a group. It has four components: data manager, event manager, membership manager, and communication manager. All the components have been developed within the .NET Framework. The data manager stores detailed information on papers and posters, which may comprise text, audio, and other multimedia formats. The membership manager handles the authentication of a member who wants to access information in the repository. In this paper, we have used a simple ID-and-password method for authentication. The event manager is in charge of all the events in the service provider.
4.2 Tag Interaction
We adopted ColorCode [13] for the tag interaction technique. It was developed by the Media Systems Lab at Yonsei University in South Korea in 1999. ColorCode is a 2D color-based image code that is composed of a set of N × N cells with four colors. Figure 5 [13] shows some examples of ColorCode.
Fig. 5. Examples of ColorCode
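To give a feel for how much an N × N grid of four-color cells can carry, the following sketch writes an integer into such a grid as base-4 digits, two bits per cell. It is an illustration only and not the actual ColorCode encoding: the palette in `COLORS` is a placeholder, and a real code would presumably set aside cells for error detection and orientation, which this sketch ignores. Under these assumptions a 5 × 5 grid can distinguish at most 4^25 = 2^50 values.

```python
# Placeholder palette; the real ColorCode colors and their order are an assumption.
COLORS = ("black", "red", "green", "blue")


def encode(value: int, n: int) -> list[list[str]]:
    """Write `value` as base-4 digits into an n x n grid, most significant cell first."""
    capacity = 4 ** (n * n)
    if not 0 <= value < capacity:
        raise ValueError(f"value must be in [0, {capacity})")
    digits = []
    for _ in range(n * n):
        value, digit = divmod(value, 4)
        digits.append(digit)
    digits.reverse()
    # Cut the flat digit list into n rows of n cells each.
    return [[COLORS[d] for d in digits[row * n:(row + 1) * n]] for row in range(n)]


def decode(grid: list[list[str]]) -> int:
    """Read the grid row by row and rebuild the integer."""
    value = 0
    for row in grid:
        for color in row:
            value = value * 4 + COLORS.index(color)
    return value


grid = encode(2007, 5)
print(decode(grid))  # 2007
```

In a tag used as a pointer, the decoded integer would typically be an identifier that is looked up elsewhere rather than the content itself.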
Several of our previously developed services are adopted to exchange context for interaction: U-Campus services [7] such as U-Profile, U-Messaging, and U-Campus Tour Guide, and U-Town services [6] such as U-Museum Guide. All services use ColorCode and require a camera embedded in a mobile device to recognize a tag.
Fig. 6. High-fidelity tag interaction using ColorCode. (a) A section of a conference time table, (b) a section of a poster, (c) a business card, and (d) recognizing a ColorCode using a PDA.
Figure 6 shows physical objects with ColorCodes attached. Figure 6 (a) shows a section of a conference schedule table. Each paper title has two ColorCodes: one for viewing the presentation and the other for sending an e-mail for Q&A through the U-Messaging service. We assume that each presentation is recorded on the conference server and that the participants can view these presentations at any time. Figure 6 (b) shows a section of a poster. Each poster has three ColorCodes. The first is for playing an audio or video file that explains the poster, the second is for watching a video of the demo, and the last is for sending an e-mail for Q&A. Figure 6 (c) shows a business card with an attached ColorCode that contains personal information. Figure 6 (d) shows a person recognizing a ColorCode to obtain someone’s personal information and store it on a mobile phone instead of carrying a business card.
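The role of a tag as a pointer to a service can be sketched as a lookup from a decoded tag identifier to an action. This is a minimal illustration under stated assumptions, not the prototype's implementation: the identifiers, file paths, e-mail addresses, and the names `TagAction`, `TAG_REGISTRY`, and `resolve` are all hypothetical. The entries mirror the two codes per paper and three codes per poster described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagAction:
    service: str  # hypothetical service names: "play_media" or "send_email"
    target: str   # media path or e-mail address, made up for illustration


# Hypothetical registry on the service provider: decoded tag ID -> action.
TAG_REGISTRY = {
    101: TagAction("play_media", "talks/paper-12.mp4"),                # paper: presentation video
    102: TagAction("send_email", "author12@example.org"),              # paper: Q&A e-mail
    201: TagAction("play_media", "posters/poster-7-explanation.mp3"),  # poster: explanation
    202: TagAction("play_media", "posters/poster-7-demo.mp4"),         # poster: demo video
    203: TagAction("send_email", "author7@example.org"),               # poster: Q&A e-mail
}


def resolve(tag_id: int) -> str:
    """Describe what the user device should do for a decoded tag ID."""
    action = TAG_REGISTRY.get(tag_id)
    if action is None:
        return "unknown tag"
    if action.service == "play_media":
        return f"stream {action.target}"
    if action.service == "send_email":
        return f"compose e-mail to {action.target}"
    return f"unsupported service {action.service}"


print(resolve(101))  # stream talks/paper-12.mp4
print(resolve(999))  # unknown tag
```

Keeping only an identifier in the tag means the printed schedule or poster never has to change when the media behind it is updated.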
4.3 Service Discovery Interaction
We developed a prototype named cherry for the service discovery interaction. It follows the architecture introduced in Section 4.1.
Fig. 7. The high-fidelity design of service discovery interaction. (a) Main menu, (b) time table of the conference, (c) list of paper presentations, (d) detailed information of papers, (e) list of posters, (f) receiving information of a poster from the service provider, (g) finding a nearby user using service discovery, (h) result of the received personal information, (i) list of friends with whom the user exchanged business cards, (j) sending an e-mail.
Figure 7 shows detailed screenshots of cherry. Figure 7 (a) is the main menu. Figure 7 (b) is the schedule of the conference room. Figure 7 (c) is the display shown when the Paper Sessions button in the main menu is pressed. It shows the presentation currently being held and the past presentations. Figure 7 (d) shows the detailed information of a paper selected in Figure 7 (c). A user can watch the video of the author’s presentation or send an e-mail to the author. Figure 7 (e) is the list of posters in the poster session. Figure 7 (f) shows the detailed poster information. It provides explanation and demo files and a simple messaging service to the author. Hence, participants can receive poster information even when the author is not present. To send personal information, the user presses the Business Card button in Figure 7 (a). Then, the user device discovers the nearby user devices, as shown in Figure 7 (g). When the user presses the Send button, his or her personal information is sent to the target person, as shown in Figure 7 (h). Figure 7 (i) and (j) are for sending an e-mail to friends in the conference.
5 Experimental Evaluation and User Study
The experiment was conducted with 21 participants using the prototype described above. All participants were graduates and were between 24 and 36 years of age, with an average age of 27.8. They had all attended a conference previously, and the average over the entire group was 5.2 times. A PDA, a conference schedule table, and a business card were given to the participants. Before starting the experiment, we explained the two interaction techniques and the services they provide. The participants were asked to assume that they were at a conference.
Fig. 8. The results of the first scenario. Left: Tag interaction, right: Discovery interaction.
Fig. 9. The results of the second scenario. Left: Tag interaction, right: Discovery interaction.
Fig. 10. The results of the third scenario. Left: Tag interaction, right: Discovery interaction.
The first scenario was accessing the video of a paper presentation that the participants wanted to see. In the second scenario, the participants were asked to acquire the explanation audio and the demo video file of a poster. The third scenario was exchanging personal information with each other. The last scenario was sending an e-mail to one of the authors. Figures 8 through 11 show the results of the experimental evaluation.
Fig. 11. The results of the last scenario. Left: Tag interaction, right: Discovery interaction.
6 Discussions
The general result is that both interaction techniques improve the interactions used to exchange contexts in a conference. Both require little physical effort. However, the physical effort for tag interaction is slightly higher because ColorCode is sensitive to illumination, so the mobile device sometimes fails to identify it. The participants could find the contexts easily using both interactions. However, they rated the tag interaction slightly higher on the cognitive aspect. This is because tag interaction is used together with physical objects that people are familiar with, such as time tables and business cards. The efficiency of exchanging context was also high for both interaction techniques. Moreover, most of the participants were initially unfamiliar with the tag interaction technique; however, they got used to it easily after trying it several times.
However, the number of participants in the survey was small. In a large conference with many attendees, there will be numerous tags to manage and a large amount of context, so the results could be different. As the amount of context increases, it will become difficult to choose the target context among the many available.
7 Conclusion and Future Work
In this paper, we compared two context exchange interactions that can be carried out in a conference: service discovery interaction and tag interaction. We conducted a survey on the problems of current conferences. We implemented a high-fidelity prototype for the tag and service discovery interactions to evaluate our work. The evaluation suggests that both interaction techniques are useful for people exchanging contexts in a conference. In our future work, we plan to apply the proposed interaction techniques in a large conference with many participants. Moreover, as the number of contexts increases in a large conference, selecting the right context will become an important issue when using the service discovery interaction. Therefore, we plan to investigate an efficient method of gathering the proper context.
Acknowledgments. This work was performed in the research project supported by the Seoul Development Institute.
References
1. Dey, A.K., Abowd, G.D., Salber, D.: A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications. Human-Computer Interaction Journal 16, 97–166 (2001)
2. Kindberg, T., Barton, J., Morgan, J., Becker, G., Caswell, D., Debaty, P., Gopal, G., Frid, M., Krishnan, V., Morris, H., Schettino, J., Serra, B., Spasojevic, M.: People, places, things: web presence for the real world. Int. Mobile Networks and Applications (2002)
3. Moran, T.P., Dourish, P.: Introduction to This Special Issue on Context-Aware Computing
4. Dey, A.K., et al.: The Conference Assistant: Combining Context-Awareness with Wearable Computing. In: Proceedings of the 3rd International Symposium on Wearable Computers, San Francisco, CA, pp. 21–28 (1999)
5. CoolTown home page, http://www.cooltown.hp.com/
6. Han, T.-D., Cheolho, C., Yoon, H.-M., Kim, J.-Y., Jeong, S.-H., Ryu, Y.-S., Kang, B.-S., Kim, H.-K., Lee, S.-W., Vason, P.S., Lee, J.-H., Sohn, Y.-W., Baek, Y.-S., Lee, S.-W., Kang, W.-S., SeongWoon, K.: Implementation of New Services to Support Ubiquitous Computing for Town Life. Software Technologies for Future Embedded and Ubiquitous Systems, 45–49 (2005)
7. Han, T.-D., et al.: Implementation of New Services to Support Ubiquitous Computing for Campus Life. In: Proceedings of the Second IEEE Workshop on Software Technologies for Future Embedded and Ubiquitous Systems, pp. 8–13 (2004)
8. Demeure, I., Faure, C., Lecolinet, E., Moissinac, J., Pook, S.: Mobile Computing to Facilitate Interaction in Lectures and Meetings
9. Sumi, Y., Mase, K.: Supporting the awareness of shared interests and experiences in communities. Int. J. Human-Computer Studies 56, 127–146 (2002)
10. Rukzio, E., Leichtenstern, K., Callaghan, V., Schmidt, A., Holleis, P., Chin, J.: An Experimental Comparison of Physical Mobile Interaction Techniques: Touching, Pointing and Scanning. In: Eighth International Conference on Ubiquitous Computing, pp. 87–104 (2006)
11. International Organization for Standardization: Information technology – Automatic identification and data capture techniques – Bar code symbology – PDF-417. QR Code. ISO/IEC 15438 (2001)
12. International Organization for Standardization: Information technology – Automatic identification and data capture techniques – Bar code symbology – QR Code. ISO/IEC 18004 (2000)
13. ColorCode. ColorZip Media Inc. (2006), http://www.colorzip.com
14. Riekki, J., Salminen, T., Alakarppa, I.: Requesting Pervasive Services by Touching RFID Tags. IEEE Pervasive Computing, 40–46 (2006)
15. Kim, D., Seo, J., Chung, C., Han, T.: Tag Interface for Pervasive Computing. In: Proceedings of the International Conference on Signal Processing and Multimedia, pp. 356–359 (2006)
16. Sun Microsystems: Jini architecture specification, http://www.javasoft.com/products/jini/specs/jini-spec.pdf
17. Nidd, M.: Service discovery in DEAPspace. IEEE Personal Communications 8, 39–45 (2001)
Conceptual and Technical Issues in Extending Computational Cognitive Modeling to Aviation
Alex Kirlik
Human Factors Division and Beckman Institute, University of Illinois at Urbana-Champaign, 405 North Mathews Avenue, Urbana, Illinois, USA
[email protected]
Abstract. A recent trend in cognitive modeling is to couple cognitive architectures with computer models or simulations of dynamic environments, such as flight simulators, to study interactive behavior and embedded cognition. Progress in this area is made difficult by the fact that cognitive architectures traditionally have been motivated by data from discrete experimental trials using static, non-interactive tasks. As a result, additional theoretical problems must be addressed to bring cognitive architectures to bear on the study of cognition in dynamic and interactive environments. I identify and discuss three such problems dealing with the need to model the sensitivity of behavior to environmental constraints, the need to model context-specific adaptations underlying expertise, and the need for environmental modeling at a functional level. These issues do not arise merely out of the needs of “applied” science, but instead signal gaps in the fundamental scientific understanding of cognition and behavior in dynamic, interactive contexts.
Keywords: Computational cognitive modeling, aviation, embedded cognition, human-computer interaction, human performance modeling.
The cognitive architectures available to today’s modeling community such as ACT-R [14], COGENT [15], ADAPT [16], EPIC [17], Soar [18], or Clarion [19] are better suited than were their engineering-based predecessors for describing the internal processes underlying behavior beyond merely “sets of independent of stimulus-response relations” [13, p. 4]. So why is it still so difficult to model a (typically experienced) pilot, driver or videogame player with a cognitive architecture? My aim in this paper is to address this question by providing some distinctions and concepts that will hopefully accelerate progress in modeling interactive behavior and embedded cognition.
2 Issues in Modeling Embedded Cognition
Difficulties in what is sometimes called “scaling up” cognitive modeling to the complexities of dynamic and interactive contexts such as aviation and driving largely have their origins in tasks and data. In particular, qualitative differences exist between the types of tasks and data sets that gave rise to many of the better known cognitive architectures and the types of tasks and data sets characteristic of many dynamic and interactive contexts. A central goal of this paper is to bring some clarity to the description of these qualitative differences and their implications. My hope is that clarifying these distinctions will be useful in moving beyond vague and not particularly informative discussions on the need to “scale up” modeling, to bridge theory and application, or even worse, to move from the laboratory to the “real world.” As I will try to show in the following, what is at issue here is not so much a scaling up as a fundamentally new unit of analysis.
Modeling interactive behavior and embedded cognition raises interesting and challenging theoretical questions that are distinct from the types of theoretical questions that provided the traditional empirical foundation for cognitive architectures. By “distinct” I mean that many of the theoretical questions that arise when modeling dynamic and interactive tasks are not reducible in any interesting sense to the questions that motivated the design of many current cognitive architectures. New and different questions arise, along with their attendant modeling challenges and opportunities. In the following sections I discuss three types of conceptual and technical issues that emerge when examining mismatches between the types of empirical data that have typically motivated the design of cognitive architectures and the types of data confronting modelers of interactive and embedded cognition in operational contexts.
The first issue deals with the fact that cognitive architectures have chiefly been designed to model cognition in discrete and static tasks (i.e., laboratory trials) whereas data on embedded cognition often reflects performance in continuous and dynamic tasks. I suggest that modeling cognition and behavior in the latter type of tasks creates a need to model the manner in which behavior is dynamically sensitive to environmental constraints and opportunities. Doing so may require expanding one’s view of the functional contribution of perception to intelligent behavior. Rather than viewing perception to be devoted solely to reporting the existence of objects and their properties to cognition in objective or task-neutral terms, it may be increasingly important to also view perception as capable of detecting information that specifies opportunities for behavior itself.
The second issue concerns the fact that the design of cognitive architectures has mainly been motivated by data from largely task-naive subjects, or often with subjects with no more than a few hours of task-relevant experience. In contrast, modeling cognition in operational contexts such as aviation and driving often involves data from highly experienced performers. It is impossible to create a good model of a performer who knows more about the task environment than does the modeler. As a result, modeling experienced cognition requires not only expertise in cognitive modeling but also an ability to obtain expert knowledge of the relevant task and environment. While modeling students acquiring Lisp programming or arithmetic skills allows one to obtain this expert knowledge from books, modeling performers in interactive and dynamic domains typically requires detailed empirical study [20]. This knowledge is required not only to guide the development of a cognitive model, but also to develop a formal model of the task environment with which the cognitive model can interact. Finally, I discuss issues that arise out of the profoundly interactive nature of much behavior and embedded cognition in dynamic, interactive contexts such as aviation. In particular I suggest that interactive tasks create a need to view and model the environment much more functionally than may be required when modeling noninteractive contexts. This suggests that a largely physicalistic approach to environmental modeling, for example in terms of the types, locations and features of perceptible objects on a display is likely to be insufficient for understanding cognition and behavior as a functional interaction with the world. Richer techniques for functional-level environmental modeling are needed to marry the functional accounts of cognition provided by cognitive modeling with functional accounts of the environment. 
When modeling interactive behavior and embedded cognition, one can get only so far by trying to couple functional models of cognition with physical models of the environment. A functional perspective must be adopted for both.
2.1 Modeling Sensitivity to Environmental Constraints and Opportunities
One axiom within the engineering-oriented modeling tradition discussed previously concerned the necessity of modeling the environment as a prerequisite to modeling cognition and behavior. As Baron [21] put it:
Human behavior, either cognitive or psychomotor, is too diverse to model unless it is sufficiently constrained by the situation or environment; however, when these environmental constraints exist, to model behavior adequately, one must include a model for that environment. [21, p. 6]
Baron’s comment places a spotlight on the constraining (note: not controlling) nature of the environment as an important source of variance that must be known when modeling behavior. Understanding how environmental constraints and opportunities determine the playing field of behavior is such a mundane exercise in everyday life that we often forget or overlook the important role that it plays. You will obviously not be swimming in the next minute unless you are already sitting near a pool or on a beach. In experimental research, a modeler typically would not get any credit for explaining all the variance associated with things that our subjects do and
not do because a task does and does not provide the opportunity to do those things. Instead, the focus is on explaining variance above and beyond what could be “trivially” predicted by examining the carefully equated opportunities for behavior an experiment affords. All of the cognitive architectures of which I am aware, due to their origins in describing data from experimental psychology, have built into them this focus on explaining variance in behavior above and beyond environmental constraints on that behavior. This can be seen from what these models predict: reaction times which, if the experiment is “well designed,” represent solely internal constraints but not external task constraints (a potential confound); the selection of an action from a set of actions all of which are carefully designed to be equally available to the subject (another potential confound). Cognitive experimentalists typically take great pains to equate the availability of the various actions (e.g., key presses) presented to participants. It is easy to overlook how this tenet of experimental design limits generalization to contexts in which the detection of action opportunities themselves and variance associated with the possibly differing levels of the availability of various actions contribute to variance in behavior. I am hardly the first to note the many differences between the largely static, noninteractive environment of the discrete laboratory trial and environments such as videogames, aviation and driving. But note the implications regarding the necessity of environmental modeling in the two cases. To explain variance in the static laboratory experiment, since credit is given only for explaining or predicting variance above and beyond what is environmentally constrained, no attention need be given to modeling how behavioral variance is environmentally constrained. 
As such, cognitive architectures typically provide no resources explicitly dedicated to this ubiquitous aspect of cognition and behavior in everyday situations. In modeling experimental data, determining which actions are appropriate given the environmental context is a task performed by the modeler and encoded once and for all in the model: it is rarely if ever a modeled inference. This only works because the environment of the laboratory trial is presumed to be static in the sense that all (relevant) actions are always equally available. So the modeler who would like to apply cognitive architectures motivated almost solely by data from such experiments to dynamic, interactive situations is largely on his or her own when determining how to make the model sensitive to environmental constraints and opportunities in a dynamic and interactive fashion. Modeling this type of sensitivity will be necessary any time a performer is interacting with a dynamic, and especially uncertain environment. Both dynamism and uncertainty place a premium on perception to aid in determining the state of the environment in terms of which behaviors are and are not appropriate at a given time. As such, the modeler will be faced with questions concerning the design of perceptual mechanisms to aid in performing this task [22]. If “primitive” perceptual mechanisms are provided by the architecture, the modeler will be faced with questions about which environmental information these mechanisms should be attuned to, and additional “primitive” mechanisms may need to be invented [23]. This may well require reference to an environmental model that represents perceptually available information at a high level of fidelity, and the task of defining perceptual units or objects may present nontrivial problems. All of these issues speak to the question of why it has proven to be difficult
to use computational cognitive architectures to model performers in dynamic, interactive environments.
2.2 Knowing as Much or More Than the Performer
I have already discussed perhaps the most primitive aspect of adaptation to an environment: ensuring that behavior is consistent with environmental constraints on behavior. Assume for a moment that this problem is solved and we are interested solely in examinations of cognition and behavior above and beyond what is so constrained. One finding from the human-machine systems tradition discussed previously is that a good step toward predicting the behavior of experienced performers in dynamic, interactive contexts is to analyze a task in terms of what behavior would be optimal or most adaptive (see Pew [12]). At first blush this approach would seem to dovetail quite nicely with modeling approaches with origins in either rational analysis [14] or ecological rationality [24]. It is important to note, however, that appeals are made to different quarters when one assumes the rationality or optimality of basic cognitive mechanisms and when one assumes the rationality or optimality of experienced behavior. The rationality underlying the design of ACT-R’s memory, categorization and inference mechanisms and the toolbox of fast and frugal heuristics [24] appeals to evolutionary arguments rather than to learning or experience per se. The subjects in experiments performed from the perspective of both these adaptive approaches to cognition are not typically presumed to have any first-hand experience with the tasks studied. The hypotheses that memory exhibits a Bayesian design or that some decisions are made by a recognition heuristic are intended as claims about the human cognitive architecture independent of any task-specific experience. In fact, one can look at learning as accumulating the additional adaptations necessary to perform a given task like an experienced performer instead of like a task-naive novice.
Much, if not most, modeling research done in dynamic, interactive environments is oriented toward understanding and supporting skilled performance. Much, if not most, experimental research done to inform the design of cognitive architectures uses largely task-naive subjects, or at best subjects with only a few hours of instruction or training. It is hardly surprising, then, that researchers interested in modeling the behavior of automobile drivers, videogame players and pilots have to invent their own methods for identifying and codifying the experiential adaptations underlying skilled behavior. This is true even if they select and use a cognitive architecture informed by rationality or optimality considerations, and even if the behavior to be modeled is highly rational or even optimal. Modeling task-naive behavior can be done by equally task-naive scientists. The main requirement is expertise in cognitive modeling. But modeling expert performance also requires expert knowledge of the task environment to which the expert is adapted. Neisser [25] put the matter of modeling expert performance as follows: What would we have to know to predict how a chess master will move his pieces, or his eyes? His moves are based on information he has picked up from the board, so they can only be predicted by someone who has
access to the same information. In other words, an aspiring predictor would have to understand the position at least as well as the master does; he would have to be a chessmaster himself! If I play chess against the master he will always win, because he can predict and control my behavior while I cannot do the reverse. To change this situation I must improve my knowledge of chess, not of psychology [25, p. 183].
Our own experience in modeling experienced performers has taught us that one must often spend as much, if not more, time studying and formally modeling the external task environment than is spent modeling inner cognition. This is required to enable modeling the highly context-specific cognitive adaptations underlying expertise.
2.3 Cognition and the World Function in Concert
In his wonderfully researched and written biography of the late Nobel Prize winning physicist Richard Feynman, James Gleick relates an episode in which MIT historian Charles Weiner was conducting interviews with Feynman at a time when Feynman had considered working with Weiner on a biography. Gleick writes that Feynman, after winning the Nobel Prize, had begun dating his scientific notes, “something he had never done before” [26, p. 409]. In one discussion with Feynman, “Weiner remarked casually that his new notes represented ‘a record of the day-to-day work,’ and Feynman reacted sharply” [26, p. 409]. What was it about Weiner’s comment that drew a “sharp” reaction from this great scientist? Did he not like his highly theoretical research described merely as “day-to-day work”? No, and the answer to this question reflects, to me at least, something of Feynman’s ability to have deep insights, not only into physics, but into other systems as well. Feynman’s reaction to Weiner describing his notes as “a record” was to say: “I actually did the work on the paper.” [26, p. 409].
To which an apparently uncomprehending Weiner responded, “Well, the work was done in your head, but the record of it is still here.” [26, p. 409]. One cannot fail to sense frustration in Feynman’s retort: “No, it’s not a record, not really. It’s working. You have to work on paper, and this is the paper. Okay?” [26, p. 409, italics in the original]. My take on this interchange is that Feynman had a deep understanding of how his work was composed of a functional transaction [27] between his huge accumulation of internal cognitive tools and his external cognitive tools of pencil and paper, enabling him to perform functions such as writing, reflecting upon, and amending equations, diagrams, and so on [28], [29]. Most importantly, note Feynman’s translation from Weiner’s description of the world in terms of physical form (“No, it’s not a record, not really”) into a description in terms of function (“It’s working”).

Why did Weiner have such a difficult time understanding Feynman? External objects, such as Feynman’s notes, do of course exist as things, typically described by nouns. Yet, in our functional transactions with these objects, the manner in which they contribute to cognition and behavior requires that these things also be understood in functional terms, that is, in terms of their participation in the operation of the closed-loop, human-environment system (see [30], on “cyclic interaction”). Weiner, like so many engineering students through the ages, apparently had difficulty in viewing the external world not only in terms of form (nouns) but also in terms of function (verbs).
A. Kirlik
I share this anecdote here because I believe it to be an exceptional illustration of the fact that studying expert behavior not only presents challenges for understanding what the expert knows, but also challenges for understanding how the expert’s environment contributes to cognition and how that contribution should be described [31]. In our own research modeling interactive behavior and embedded cognition, we have found a need to understand a performer’s environment in functional terms, as a dynamic system in operation [32-37]. Human-environment interaction is then understood in terms of a functional coupling between cognition and the environment functionally described. When modeling experienced performers engaged in interactive behavior and embedded cognition, I suggest that one has a much greater chance of identifying regularities in behavior by analysis at the functional level than by searching for these regularities in patterns of responses to stimuli described in physical terms. Modeling the environment in functional terms is also critically important when trying to model how a person might use tools in the performance of cognitive tasks, as the following examples will hopefully demonstrate. I highlight the importance of adopting a functional perspective on environmental modeling for a number of reasons. As mentioned in the opening of this paper, a trend currently exists to couple models with simulations of dynamic and interactive environments such as flight simulators, videogames and the like. While this is an important technical step in the evolution of cognitive modeling, having such an external simulation of course does not obviate the need for addressing the theoretical problem of modeling the environment in functional terms relevant to psychology. A bitmap model of the visual environment, for example, could be helpful in identifying the information available to a model’s perceptual (input) mechanisms. 
This environmental model, however, is insufficient for determining what information a model should perceive at what time in order to mimic human cognition and performance, or for documenting any functional structure of the environment as it is reflected in cognition and behavior. For example, it should be obvious that modeling a bicycle rider requires much more than modeling the environment merely in terms of information or stimuli. One must also model the bicycle and the roadway if one is ever to have a chance at modeling the rider’s behavior. Similar ideas apply directly to cognitive tasks as well. Mental arithmetic can perhaps be modeled directly within many existing cognitive architectures. Arithmetic done as it is normally done, with pencil, paper, hands and eyes, requires functional modeling of these additional components of the human-tool-environment system as well, in order to describe the structure and function of this distributed cognitive system, as well as its dynamic operation.
3 Conclusion

I have suggested that modeling interactive behavior and embedded cognition raises theoretical questions that are distinct from the types of theoretical questions that provided the traditional empirical foundation for many cognitive architectures. By “distinct” I meant that some of the theoretical questions that arise when modeling dynamic and interactive tasks are not necessarily reducible in any interesting sense to the questions that motivated the design of these cognitive architectures. As such,
those attempting to use these architectures to model pilots or other performers in dynamic, interactive domains such as aviation will necessarily have to grapple with the conceptual and technical issues discussed here, among others. These include the need to model dynamic sensitivity to environmental constraints on behavior, the need to model highly context-specific cognitive adaptations, and the need to analyze and model the environment of cognition and behavior in primarily functional terms. I would encourage those involved with bringing cognitive modeling to bear on engineering problems in aviation and other operational contexts to begin to explicitly frame and report their advances on these and related issues in terms of addressing novel scientific problems. These problems have not been, as yet, sufficiently addressed by those modeling cognition in discrete, static laboratory trials. The cumulative progress necessary to advance computational cognitive modeling from scientific curiosity to engineering technique requires recognition of this fact, and additional efforts made to remedy this barrier to both fundamental and applied science.
References

1. Byrne, M., Kirlik, A.: Using computational cognitive modeling to diagnose possible sources of aviation error. International Journal of Aviation Psychology 15(2), 135–155 (2005)
2. Byrne, M., Kirlik, A., Fick, C.: Kilograms matter: Rational analysis, ecological rationality, and closed-loop modeling of interactive cognition and behavior. In: Kirlik, A. (ed.) Adaptive Perspectives on Human-Technology Interaction. Oxford University Press, New York (2006)
3. Byrne, M., Kirlik, A.: Closing the loop on computational models of interactive human performance in aviation. In: Foyle, D., Hooey, B. (eds.) Human Performance Models in Aviation: Surface Operations and Synthetic Vision Systems. Erlbaum, Mahwah, NJ (in press)
4. Gluck, K.A., Pew, R.W.: Modeling Human Behavior with Integrated Cognitive Architectures. Erlbaum, Mahwah, NJ (2005)
5. Foyle, D., Hooey, B.: Human Performance Models in Aviation: Surface Operations and Synthetic Vision Systems. Erlbaum, Mahwah, NJ (in press)
6. Gray, W.D., Schoelles, M.J., Fu, W.: Modeling a continuous dynamic task. In: Taatgen, N., Aasman, J. (eds.) Proceedings of the Third International Conference on Cognitive Modeling, pp. 158–168. Universal Press, Veenendaal, The Netherlands (2000)
7. Shah, K., Rajyaguru, S., St. Amant, R., Ritter, F.E.: Connecting a cognitive model to dynamic gaming environments: Architectural and image processing issues. In: Proceedings of the Fifth International Conference on Cognitive Modeling (ICCM), pp. 189–194 (2003)
8. Salvucci, D.D.: Modeling driver behavior in a cognitive architecture. Human Factors 48, 362–380 (2006)
9. Rouse, W.B.: Advances in Man-Machine Systems Research, vol. 1. JAI Press, Greenwich, CT (1984)
10. Rouse, W.B.: Advances in Man-Machine Systems Research, vol. 2. JAI Press, Greenwich, CT (1985)
11. Sheridan, T.B., Johannsen, G.: Monitoring Behavior and Supervisory Control. Plenum Press, New York (1976)
12. Pew, R.W.: Some history of human performance modeling. In: Gray, W. (ed.) Integrated Models of Cognition. Oxford University Press, New York (in press)
13. Sheridan, T.B.: Humans and Automation: System Design and Research Issues. Human Factors and Ergonomics Society and John Wiley & Sons, Santa Monica, CA (2002)
14. Anderson, J.R., Lebiere, C.: The Atomic Components of Thought. Lawrence Erlbaum Associates, Mahwah, NJ (1998)
15. Cooper, R.P.: Integrating cognitive systems: The COGENT approach. In: Gray, W.D. (ed.) Integrated Models of Cognitive Systems, pp. 414–427. Oxford University Press, New York (in press)
16. Doane, S., Sohn, Y.W.: ADAPT: A predictive cognitive model of user visual attention and action planning. User Modeling and User-Adapted Interaction 10(1), 1–45 (2000)
17. Kieras, D., Meyer, D.E.: An overview of the EPIC architecture for cognition and performance with application to human-computer interaction. Human-Computer Interaction 12, 391–438 (1997)
18. Ritter, F.E., Reifers, A.L., Klein, L.C., Schoelles, M.: Lessons from defining theories of stress. In: Gray, W. (ed.) Integrated Models of Cognitive Systems. Oxford University Press, New York (in press)
19. Sun, R.: The CLARION cognitive architecture: Extending cognitive modeling to social simulation. In: Sun, R. (ed.) Cognition and Multi-Agent Interaction. Cambridge University Press, New York (2005)
20. Gray, W.D., Kirschenbaum, S.S.: Analyzing a novel expertise: An unmarked road. In: Schraagen, J.M.C., Chipman, S.F., Shalin, V.L. (eds.) Cognitive Task Analysis, pp. 275–290. Lawrence Erlbaum Associates, Mahwah, NJ (2000)
21. Baron, S.: A control theoretic approach to modeling human supervisory control of dynamic systems. In: Rouse, W.B. (ed.) Advances in Man-Machine Systems Research, vol. 1, pp. 1–48. Greenwich, CT (1984)
22. Fajen, B.R., Turvey, M.T.: Perception, categories, and possibilities for action. Adaptive Behavior 11(4), 279–281 (2003)
23. Runeson, S.: On the possibility of “smart” perceptual mechanisms. Scandinavian Journal of Psychology 18, 172–179 (1977)
24. Gigerenzer, G., Todd, P.M., the ABC Research Group: Simple Heuristics That Make Us Smart. Oxford University Press, New York (1999)
25. Neisser, U.: Cognition and Reality. W. H. Freeman and Company, New York (1976)
26. Gleick, J.: Genius: The Life and Science of Richard Feynman. Pantheon, New York (1992)
27. Dewey, J.: The reflex arc in psychology. Psychological Review 3, 357–370 (1896)
28. Donald, M.: Origins of the Modern Mind: Three Stages in the Evolution of Culture and Cognition. Harvard University Press, Cambridge, MA (1991)
29. Vygotsky, L.S.: The problem of the cultural development of the child, II. Journal of Genetic Psychology 36, 414–434 (1929); The instrumental method in psychology. In: Wertsch, J.V. (ed.) The Concept of Activity in Soviet Psychology, pp. 134–143. M.E. Sharpe, Armonk, NY (1981)
30. Monk, A.: Cyclic interaction: A unitary approach to intention, action and the environment. Cognition 68, 95–110 (1998)
31. Hutchins, E.: Cognition in the Wild. MIT Press, Cambridge, MA (1995)
32. Kirlik, A.: Requirements for psychological models to support design: Toward ecological task analysis. In: Flach, J., Hancock, P., Caird, J., Vicente, K.J. (eds.) Global Perspectives on the Ecology of Human-Machine Systems, pp. 68–120. LEA, Mahwah, NJ (1995)
33. Kirlik, A.: The ecological expert: Acting to create information to guide action. In: Fourth Symposium on Human Interaction with Complex Systems. IEEE Computer Society Press, Dayton, OH (1998a)
34. Kirlik, A.: The design of everyday life environments. In: Bechtel, W., Graham, G. (eds.) A Companion to Cognitive Science, pp. 702–712. Blackwell, Oxford (1998b)
35. Kirlik, A.: Adaptive Perspectives on Human-Technology Interaction: Methods and Models for Cognitive Engineering and Human-Computer Interaction. Oxford University Press, New York (2006)
36. Kirlik, A., Miller, R.A., Jagacinski, R.J.: IEEE Transactions on Systems, Man, and Cybernetics 23(4), 929–952 (1993)
37. Kirlik, A., Bisantz, A.M.: Cognition in human-machine systems: Experiential and environmental aspects of adaptation. In: Hancock, P.A. (ed.) Handbook of Perception and Cognition: Human Performance and Ergonomics, 2nd edn., pp. 47–68. Academic Press, NY (1999)
Mental Models of Chinese and German Users and Their Implications for MMI: Experiences from the Case Study Navigation System

Barbara Knapp

Siemens AG, CT IC 7, Otto-Hahn-Ring 6, 81739 Munich, Germany
Abstract. This paper presents the results of an empirical study on some aspects of user-centered design of products for the global market. In the context of the case study “navigation system”, Chinese and German users were each confronted with an experimental prototype structured either according to German users’ mental models of a navigation system or according to Chinese users’ mental models. Performance in operating the systems and perceived system attractiveness were measured. Results suggest that the Chinese user group’s performance and the German user group’s perceived attractiveness of the navigation system were negatively affected if the system was based on the other group’s mental model. Keywords: Mental models, Chinese users, German users, quantitative empirical user studies, in-vehicle systems, system structure.
1 Introduction

This study is part of the work for a PhD thesis and, as such, part of a connected series of studies that use the example of the navigation system to gain a better understanding of the relevance of integrating users from different countries into the user-centered design process of products for the global market.
or [4]). The other contains studies that mainly focus on the development process of specific systems and that deal with gathering and analyzing differing user requirements and preferences for users from different countries with regard to the system in question (e.g. [5], [6], [7]). This study focuses on the stage of system design when differences in user requirements or user preferences have already been identified. It seeks to provide information that can contribute to answering the question of how important it is to adapt devices to the preferences of users in different countries. The goal is to obtain quantitative data by applying methods from the field of applied psychological research to analyze interaction with one device, the navigation system, and two user groups, German and Chinese users. As one of the important requirements for a user friendly system is that the interaction should be designed so as to correspond to a user’s mental model of the system [8], the study will concentrate on this aspect.
3 Research Questions

If a German user group is confronted with a system that is based on Chinese users’ mental models or a Chinese user group with a system based on German users’ mental models, how does this influence user performance and perceived system attractiveness when operating the system? Is an influence only visible for experienced users of navigation systems in both countries or also for non-users?
4 Methods

Based on an earlier project stage, in which data on Chinese and German navigation system users’ mental models of the functionality and structure of navigation systems were collected, two experimental prototypes were developed that corresponded to Chinese and German users’ mental models respectively [9]. To answer the research questions above, an experiment was conducted. Both German and Chinese users operated the experimental prototypes derived from either German or Chinese data. During the experiment, objective data on user performance when interacting with the prototypes and subjective data, among them perceived system attractiveness, were obtained.

4.1 Material

Two experimental prototypes were used in the experiment. Both are hierarchically structured menu-based systems that contain the same 74 single navigation functions. The systems differ neither in appearance nor in actual content, only in where the different functions can be found within the system and with which other functions they are grouped together. Functions in one of the systems were grouped according to the mental model that German users had concerning the structure of a navigation system, while the other system was based on the mental models of Chinese users. Both systems were available in German and in Chinese. Participants in the experiment
could not actually activate any functions but could explore the system to find out where each function was located. An additional tracking task, in which the participant had to follow a randomly moving circle with the mouse pointer, was developed to simulate the conditions of divided attention that are typical for the use of a navigation system while driving a car. The AttrakDiff questionnaire by Hassenzahl [10], which measures how attractive users think a certain system is, was also employed.

4.2 Participants

In Germany and in China, 49 persons each participated in the experiment. They were assigned to one of eight different treatment groups according to country of origin, type of system, and previous experience with navigation systems (see Table 1).

Table 1. Experimental conditions and number of participants

                                       Previous experience with navigation system
Country of Origin   Type of System     yes      no       Total
German              German System      12       13       25
German              Chinese System     12       12       24
Chinese             German System      12       12       24
Chinese             Chinese System     12       13       25
Total                                  48       50       98
4.3 Tasks and Procedure

Each participant operated one of the experimental systems and had to solve 32 different tasks as quickly as possible. A task consisted of finding a certain target function within the system structure. During one third of the experiment the participants were asked to also work on the additional tracking task. During the experiment two variables were measured:

• the time a user needed to solve a task (time to solve task)
• the number of unnecessary steps a user took within the system when looking for the target function (number of unnecessary steps)

At the end of the experiment, the users were asked to fill in the AttrakDiff questionnaire to obtain data on the perceived attractiveness of the system used.
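The second performance measure can be illustrated with a minimal sketch. It assumes that "unnecessary steps" means the number of menu selections a participant made minus the minimum number needed to reach the target; the paper does not spell out the exact counting rule, and the menu contents, the function names and the logged path below are all hypothetical.

```python
from collections import deque

# Hypothetical menu hierarchy (not taken from the study's prototypes):
# each key is a menu, each value lists its sub-menus or functions.
MENU = {
    "root": ["Destination", "Route", "Settings"],
    "Destination": ["Enter address", "Recent destinations"],
    "Route": ["Avoid toll roads", "Fastest route"],
    "Settings": ["Volume", "Language"],
}

def shortest_path_length(menu, start, target):
    """Breadth-first search: minimal number of selections from start to target."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return depth
        for child in menu.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    raise ValueError(f"{target!r} is not reachable from {start!r}")

def unnecessary_steps(menu, visited, target, start="root"):
    """Steps actually taken minus the minimum needed (assumed definition)."""
    return len(visited) - shortest_path_length(menu, start, target)

# A participant opens Settings, returns to the root, then finds the target.
log = ["Settings", "root", "Route", "Fastest route"]
print(unnecessary_steps(MENU, log, "Fastest route"))
```

With this definition a participant who goes straight to the target scores zero, so the measure reflects search errors rather than the depth of the menu.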
5 Results

To determine whether there were significant differences in performance between German and Chinese users solving tasks with the German and Chinese system
respectively, analyses of variance (ANOVA) and t-tests were computed. Because homogeneity of variance, one of the prerequisites for ANOVA, was not given for our data according to the Levene test, we considered ANOVA results significant only for p ≤ 0.001. For t-tests the level of significance was set at p < 0.05. Questionnaire data were analyzed with Mann-Whitney U-tests, with p < 0.05 for significant results. Systematic biases in time to solve a task caused by differences in the breadth and depth of the prototypes’ structures were controlled by subtracting the time needed to navigate through the menus, as obtained by a simplified GOMS procedure [11], from the overall problem-solving time.

5.1 Experienced Users: Examples of Performance Data

A univariate analysis of variance with the between-subject factors country of origin and type of system was computed for the time experienced users of navigation systems needed to solve tasks and for the number of unnecessary steps they took while searching for the target item. Results for the time to solve a task show a significant interaction (F(1,1450)=15.25; p<0.001) between the users’ country of origin and type of system (German and Chinese system). Chinese users perform better with the Chinese system than with the German system. Meanwhile, German users’ performance with the German system does not significantly differ from their performance with the Chinese system (see Figure 1). There are also significant main effects for type of system (F(1,1450)=10.92; p=0.001) and country of origin (F(1,1450)=16.94; p<0.001), with the Chinese system supporting faster interaction and German users needing less time to solve tasks overall (see Table 2).

Table 2. Experienced German and Chinese users: mean values for time to solve a task (in ms)
                    German System       Chinese System      Total
Country of Origin   M        SD         M        SD         M        SD
Germany             35385    45723      37092    47717      36270    46742
China               58178    68346      37692    50817      47935    61049
Total               47262    59669      37394    49267      42239    54826
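The correction for menu breadth and depth described in the statistical procedure can be sketched as follows. This is only an illustration of the idea of subtracting an estimated navigation time: the operator times are assumed values in the spirit of keystroke-level estimates, since the paper does not report the parameters of its simplified GOMS procedure, and the function names are hypothetical.

```python
# Assumed operator times in milliseconds; the paper does not report the
# values used in its simplified GOMS procedure.
POINT_MS = 1100   # move the pointer to a menu item
CLICK_MS = 200    # press the mouse button

def navigation_time_ms(min_selections):
    """Estimated pure navigation time for the shortest path to a function."""
    return min_selections * (POINT_MS + CLICK_MS)

def corrected_time_ms(observed_ms, min_selections):
    """Observed solution time minus the estimated navigation time."""
    return observed_ms - navigation_time_ms(min_selections)

# The same observed time weighs differently in a deep and a shallow menu.
print(corrected_time_ms(30000, 4))  # target four levels deep
print(corrected_time_ms(30000, 2))  # target two levels deep
```

The point of such a correction is that a target buried deeper in one prototype does not count against that prototype merely because more selections are physically required to reach it.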
For the number of unnecessary steps there is a significant interaction effect for experienced users’ country of origin and type of system (F(1, 1311)=10.52; p=0.001). Here again Chinese users perform much better with the Chinese system than with the German system, while the difference in performance with the two systems is smaller for the German users. No significant main effect is found (for mean values see table 3).
[Figure 1: bar chart. Y-axis: mean time to solve task (in ms), 0 to 70000. X-axis: country of origin (Germany, China). Series: German system, Chinese system.]

Fig. 1. Experienced German and Chinese users: mean time to solve a task for German and Chinese system respectively

Table 3. Experienced German and Chinese users: mean number of unnecessary steps

                    German System       Chinese System      Total
Country of Origin   M        SD         M        SD         M        SD
Germany             4.54     7.64       5.06     7.22       4.81     7.42
China               7.08     11.79      4.54     6.90       5.73     9.59
Total               5.76     9.93       4.81     7.06       5.26     8.56
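The pattern behind the reported interaction effects can be made concrete from the cell means in Tables 2 and 3. The sketch below computes, for each country, the difference between the German and the Chinese system, and then the difference between those two differences. It is a descriptive illustration only; it says nothing about significance, which the paper established with ANOVA, and the function names are hypothetical.

```python
# Cell means copied from Table 2 (time in ms) and Table 3 (unnecessary steps).
TIME_MS = {
    ("Germany", "German"): 35385, ("Germany", "Chinese"): 37092,
    ("China", "German"): 58178, ("China", "Chinese"): 37692,
}
STEPS = {
    ("Germany", "German"): 4.54, ("Germany", "Chinese"): 5.06,
    ("China", "German"): 7.08, ("China", "Chinese"): 4.54,
}

def system_effect(means, country):
    """Mean with the German system minus mean with the Chinese system."""
    return means[(country, "German")] - means[(country, "Chinese")]

def interaction_contrast(means):
    """Difference between the system effects of the two user groups."""
    return system_effect(means, "China") - system_effect(means, "Germany")

for label, means in (("time (ms)", TIME_MS), ("steps", STEPS)):
    print(label,
          round(system_effect(means, "Germany"), 2),
          round(system_effect(means, "China"), 2),
          round(interaction_contrast(means), 2))
```

For time to solve a task, the Chinese users are about 20.5 seconds slower with the German system, while the German users differ by less than 2 seconds between systems, which is the asymmetry the text describes.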
5.2 Inexperienced Users: Examples of Performance Data

For inexperienced users, too, an ANOVA with the between-subject factors country of origin and type of system was computed for time to solve a task and number of unnecessary steps. Results show a significant interaction effect between user country of origin and type of system for time to solve a task (F(1,1517)=10.55; p≤0.001). As with the experienced users, inexperienced Chinese users perform worse with the German system than with the Chinese system, while there is nearly no difference in performance between the two systems for German users. No interaction effect shows up for the number of unnecessary steps. For time to solve a task there is a significant main effect of type of system (F(1,1517)=9.17; p≤0.001). Here performance with the Chinese system is generally better than with the German system. For number of unnecessary steps we found a significant main effect for country of origin (F(1,1310)=14.05; p≤0.001), with German users generally performing better than Chinese users.
Altogether, inexperienced users need significantly more time to solve tasks (T(2619)=-4.17; p<0.05) and take significantly more unnecessary steps before they find the target item than experienced users (T(2965)=-5.46; p<0.05).

5.3 Examples of Subjective Data

The dimension ATT (Attractiveness Total) from the AttrakDiff questionnaire was analyzed by computing Mann-Whitney U-tests. There is no significant difference in perceived attractiveness between the two systems for the experienced Chinese users. The experienced German users, though, rate the Chinese system significantly lower (Mann-Whitney U = 23.5; p<0.05) with regard to attractiveness than the German system (see Figure 2).

[Figure 2: bar chart. Y-axis: mean attractiveness (ATT), -3 to 3. X-axis: country of origin (Germany, China). Series: German system, Chinese system.]

Fig. 2. Experienced German and Chinese users: ratings of perceived attractiveness for German and Chinese system (negative values correspond to negative ratings, positive values to positive ratings)
Overall, the Chinese users’ ratings are significantly higher than the German users’ ratings (Mann-Whitney U = 77.5; p<0.05). For both inexperienced German and Chinese users no significant differences in their ratings of German and the Chinese system can be found. Still, German users rate both systems significantly lower than Chinese users (Mann-Whitney U = 154.0; p<0.05).
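For readers unfamiliar with the U statistic reported here, a minimal sketch of how it is computed follows. The ratings are hypothetical and are not data from the study; the sketch covers only the statistic itself, not the p-value, which in practice would come from tables or a statistics package.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); a tie counts as half a win for each side."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical ATT ratings on the -3 to 3 scale (not the study's data).
german_ratings = [-1.0, 0.5, 0.0, 1.0]
chinese_ratings = [1.5, 2.0, 0.5, 2.5]

u_german, u_chinese = mann_whitney_u(german_ratings, chinese_ratings)
# The smaller of the two values is the one commonly reported as U.
print(min(u_german, u_chinese))
```

Because the statistic depends only on the ordering of the ratings, it suits questionnaire scales such as ATT, where equal distances between scale points cannot be taken for granted.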
6 Discussion

Performance data for experienced navigation system users clearly show that it is much easier for Chinese users to operate a system that is based on data on mental models gathered in China. An analogous effect is nearly invisible for German users, though. This might be because not only was the same number of functions grouped differently in the German system and the Chinese system, but the mean number of items per menu also differed between the systems. The German system had a relatively low number of items per menu (3.6 compared to 5.0 for the Chinese system) and might have been less efficient to use than the Chinese system to begin with (for similar results see [11]), which would have masked a more positive effect of the German system on German users’ performance. This issue will be explored in further research. Interestingly, although the performance of Chinese users differs greatly between the Chinese and the German system, there is no trace of this in the data on perceived system attractiveness. Chinese users rate both systems as similar in attractiveness, and their ratings are more positive than those of the German users. German users, whose performance is similar with both systems, are nonetheless more critical and rate the Chinese system as significantly less attractive than the German one. With regard to inexperienced users, an interaction effect similar to the one apparent for the experienced users can be shown for time to solve a task but not for number of unnecessary steps. As number of unnecessary steps is a less sensitive measure than time to solve a task, this result is not contradictory but might simply indicate an existing but weak effect. This effect might be caused by mental models that users who have no experience with navigation systems as such may have developed when interacting with other electronic devices (PDA, mobile phone etc.)
or navigation-related PC applications (online city maps etc.) that overlap in functionality with navigation systems. As parts of existing mental models that are applicable to a new system, they might shape even inexperienced users’ expectations [12].
7 Conclusion

The negative effects of developing systems for users in other countries based on user data from only one country have been shown, at least for this example case. German and Chinese users in this study were affected in different ways by systems developed for the other user group: system acceptance was impaired for the German users and performance for the Chinese users. Both factors are important for the success of a product. This study also contributes some new methodological aspects. It shows that research on users in different countries, as input for development, needs to rely on a broader array of methods, not just on either experiments or subjective data such as questionnaires.
Acknowledgments. The research presented in this paper is part of the author’s doctoral work at the University of Regensburg under supervision of Prof. Alf Zimmer in cooperation with Siemens AG, Munich. For support of this work in China the author would like to thank Dr. Zhou Wei, Tian Minghui, Chen Nan and Dr. Qin Xiangang from Siemens SLC, Beijing.
References

1. ISO: ISO 9241-11: Ergonomic requirements for office work with visual display terminals (VDTs). Part 11 – guidelines for specifying and measuring usability. International Standards Organisation, Geneva (1997)
2. Choong, Y.-Y., Salvendy, G.: Design of Icons for Use by Chinese in Mainland China. Interacting with Computers, Special Issue: Shared values and shared interfaces 9, 417–430 (1998)
3. Choong, Y.-Y., Salvendy, G.: Implications for Design of Computer Interfaces for Chinese Users in Mainland China. International Journal of Human-Computer Interaction 11, 29–46 (1999)
4. Röse, K., Liu, L., Zühlke, D.: Design Issues for Mainland China in the Area of Human-Machine-Interaction Design. In: Smith, M.J., Salvendy, G. (eds.) Systems, Social and Internationalization Design Aspects of Human-Computer Interaction, pp. 532–536. Lawrence Erlbaum, Mahwah, NJ (2001)
5. Rau, P.-L., Choong, Y.-Y., Salvendy, G.: A cross cultural study on knowledge representation and structure in human computer interfaces. International Journal of Industrial Ergonomics 34, 117–129 (2004)
6. Honold, P.: Culture and Context: An Empirical Study for the Development of a Framework for the Elicitation of Cultural Influence in Product Usage. International Journal of Human-Computer Interaction 12, 327–345 (2000)
7. Lightner, N., Yenisey, M.M., Ozok, A., Salvendy, G.: Shopping behaviour and preferences in e-commerce of Turkish and American university students: implications from cross-cultural design. Behaviour & Information Technology 21, 373–385 (2002)
8. Jokinen, P., Karimäku, K., Kangas, A.-M.: Demanding Needs for Mobile Phones: A Qualitative User Study on the Young Urban Lower Middle Class in China. In: Evers, V., Röse, K., Honold, P., Coronado, J., Day, D. (eds.) Proceedings of the Fifth International Workshop on Internationalisation of Products and Systems, pp. 105–113. isherUniversity, Kaiserslautern (2003)
9. Streitz, A.: Psychologische Aspekte der Mensch-Computer-Interaktion [Psychological aspects of human-computer interaction]. In: Hoyos, C.G., Zimolong, B. (Hrsg.) Ingenieurpsychologie (S. 240–284). Göttingen: Hogrefe (1990)
10. Knapp, B.: (Manuscript in preparation). Nutzerzentrierte Gestaltung von Informationssystemen im PKW für den internationalen Kontext: Besteht die Notwendigkeit der Berücksichtigung von Nutzerdaten aus verschiedenen Ländern? [User-centered design of in-vehicle information systems for the international context: Relevance of the consideration of user data from different countries]
11. Hassenzahl, M.: Mit dem AttrakDiff die Attraktivität interaktiver Produkte messen [Measuring the attractiveness of interactive products with the AttrakDiff]. In: Hassenzahl, M., Peissner, M. (eds.) Usability Professionals 2004. Stuttgart: German Chapter der Usability Professionals’ Association (2004)
12. John, B.E., Kieras, D.E.: Using GOMS for User Interface Design and Evaluation: Which Technique? ACM Transactions on Computer-Human Interaction 3, 287–319 (1996)
13. Manes, D., Green, P.: Evaluation of a Driver Interface: Effects of Control Type (Knob versus Buttons) and Menu Structure (Depth versus Breadth). Technical report UMTRI-9742, Transportation Research Institute, University of Michigan, USA (1997)
14. Van der Veer, G., Melguizo, P.: Mental Models. In: Jacko, J.A., Sears, A. (eds.) The Human-Computer Interaction Handbook, pp. 52–80. Lawrence Erlbaum, Mahwah, NJ (2003)
Usability Test for Cellular Phone Interface Design That Controls Home Appliances

Haeinn Lee

107A Kiehle Visual Arts Center, 720 Fourth Avenue South, St. Cloud, MN 56301-4498
Abstract. The role of interface design is to enable communication between people and the technical product such as a cellular phone, computer, or PDA. To use the product successfully, the interface design should be easy to use. The objective of this paper is to create a practical and user-friendly interface design for a wireless device to control home appliances. In order to control home appliances with a cellular phone, the author suggests a natural (intuitive) interface design that is friendly and attractive to users, based on their experience, and effectively uses graphic elements such as layout, icon, color, and text. As part of this natural (intuitive) interface design, the author suggests using a wheel key to control a cursor system for navigating a cellular phone screen. A usability test was conducted to determine problems people have while using the prototype. The results of the usability test indicated that the user interface was successful, and participants were satisfied with the prototype. Keywords: Human interaction design, Graphic Design elements, Natural (intuitive) design, Wireless device, Cellular phone, Usability test.
phone interface that people could use to control home appliances at the HCI International Conference 2005 in Las Vegas [1]. In this paper, the author explores several aspects of design, such as layout, icon and image, color, text, and navigation cues, to create the final prototype for a Natural (Intuitive) Interface Design for Controlling Home Appliances. The author defines the phrase “Natural (Intuitive) Interface Design” as a design that builds on people’s prior experience and allows users to manipulate the interface without needing to think about what they are doing. All graphical elements, such as layout, icon and image, color, and text, help the user to remember the interface and navigate it. In human interface design, a usability test is one of the most important methods for finding problems and soliciting users’ practical suggestions, and these suggestions are the most effective input for improving the interface design. Therefore, this paper focuses on the usability test: it studies users’ behavior with the design and judges whether the design is user-centered.
2 Methodology

This section proposes a suitable prototype of a small-screen interface design for a cellular phone. As an example of small-screen interface design, an interface that people can use to control home appliances with a cellular phone is developed. Both the proposed screen interface and the cellular phone running it are simulated on a computer using Macromedia Flash software. Because of the limitations of available technology, it was not possible to put the interface onto a real cellular phone. Using the simulated prototype, a usability test was conducted to find out what problems users have while using it.

2.1 Kinds and Functions of Information Electric Home Appliances

Table 1 provides information on the types of appliances that can be controlled with a cellular phone, PC, or remote control. Samsung and LG already provide a Home Network service that can control several home appliances. Among these home appliances, the most useful and popular ones (air-conditioning, lighting, and microwave oven) were chosen for the cellular phone interface design. Air-conditioning and lighting are very useful to control from outside the home. People expect their house to be warm when they return home. They can also check whether they turned off all the lights, thus saving electricity. If people come home late, they can turn on a lamppost or a front light from their phone. Controlling a microwave is still inconvenient in that the food has to be put in before leaving, but it is likely that microwaves could be combined with refrigerators, so that it will be more convenient to control microwaves remotely in the future. In addition, users who participated in the usability test said that the microwave is the most useful home appliance to control from outside the home.
Table 1. Information electric home appliances: kinds and functions

Home Appliance            Functions
Air Conditioner/Heater    Working/stop control, direction of the wind/temperature setting, power control
Electrical Light          Electric light state confirmation, turn off/on
Oven/Microwave            Cook food (make heat), power control
2.2 Navigating Device Layout

The interface uses a cursor system, which is also used for navigating a PC environment, to navigate the cellular phone interface. If the interface design of a cellular phone is consistent with the PC environment, users will need less time to learn how to use it. To examine this idea, the author drew on a paper titled The effects of experience with a PC on the usability of a mobile product. That paper describes an experiment to determine whether, for users who are familiar with the PC, the desktop metaphor of the PC is more appropriate for a small device such as a PDA or cellphone than a new user interface design dedicated to PDAs. It contrasts this idea with views that claim that the current interface design of PCs cannot be directly applied to mobile products because they have their own specific functions and requirements, so that new design approaches are required to address the special design needs of these devices. The experimental results of that paper show that applying PC metaphors to mobile products can be advantageous because users’ PC experience can reduce the learning effort and thus contribute to an intuitive understanding of the mobile device [2]. Moreover, large numbers of people are familiar with using a cursor system because they already use similar systems on computers or MP3 players. The other benefit of using a cursor system is saving time. In current cellular phone designs, navigating the information structure is a step-by-step process using the arrow keys. To solve this problem, a cursor system was brought to the small screen so that it is convenient to navigate regardless of what the user wants to do.
2.3 Process of Developing a Prototype

Scenario-Based Design. To develop a cellular phone design for controlling home appliances, a scenario-based design was used to identify important characteristics of the user, the user’s behavior, and the navigation map. According to the paper Five Reasons for Scenario-Based Design, “Scenarios evoke reflection in the content of design work, helping developers coordinate design action and reflection.” In addition, “Scenarios can be abstracted and categorized, helping designers to recognize, capture, and reuse generalizations, and to address the challenge that technical knowledge often lags the needs of technical design.” [3] Therefore, a scenario was created that anticipates the user’s actions when he or she uses a cellular phone to control home appliances.
Navigation Map. The navigation map is organized to help people understand the overall sequence design. It is organized into two main categories: Set up and Control. The Set up category is divided into three submenus: Pin number, User name, and Set up picture. The Control category is divided into three submenus: Microwave, Lighting, and Heater.

Storyboard by Flashcard. Using flash cards to understand the navigation steps is a useful method for developing a storyboard. Following the navigation map, the author sketched each step on a flash card and explored several possible screen designs.

2.4 Screen Design Using Graphic Design Elements

Screen Layout. Layout is the most important graphic element that affects the overall design. A clean and consistent layout makes the screen easy for the user to understand and natural to navigate. For a screen design for controlling home appliances, four different kinds of layout systems were explored. The first and second layouts were designed as circular shapes because of the wheel key system; with a wheel key, it is easier and more natural to navigate a circular layout. However, if pictures of home appliances are used as icons, a square layout is a better solution because the square shape saves more space and shows pictures more clearly and precisely (Fig. 1). All contents can be arranged in two different kinds of square layouts in diverse, clean ways. This layout also fits the purpose of this navigation prototype, which is to be more natural to understand and to make contents easier to recognize. Even though people will use the circular wheel key, the wheel controls a cursor, so it does not matter that the layout uses square shapes. Therefore, the author decided to use a square layout for the final screen interface design.
Fig. 1. Square layout
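Returning to the navigation map of Section 2.3, its two categories and their submenus can be written down as a small nested structure. The sketch below is only an illustration: the names `NAVIGATION_MAP`, `path_to`, and `clicks_needed` are hypothetical, and counting one selection per menu level is an assumed cost model, not something measured in the study.

```python
# Hypothetical representation of the navigation map described in Section 2.3.
# Category and submenu names come from the text; everything else is assumed.
NAVIGATION_MAP = {
    "Set up": ["Pin number", "User name", "Set up picture"],
    "Control": ["Microwave", "Lighting", "Heater"],
}


def path_to(target, nav_map=NAVIGATION_MAP):
    """Return [category, submenu] for a submenu name, or None if it is absent."""
    for category, submenus in nav_map.items():
        if target in submenus:
            return [category, target]
    return None


def clicks_needed(target, nav_map=NAVIGATION_MAP):
    """Assumed cost model: one selection per level of the menu path."""
    path = path_to(target, nav_map)
    return None if path is None else len(path)


if __name__ == "__main__":
    for name in ("Lighting", "Pin number", "Oven"):
        print(name, path_to(name), clicks_needed(name))
```

A structure like this makes it easy to check that every appliance is reachable in the same number of selections, which is one way to reason about the "fewer steps" goal of the design.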
Button Image. The author decided to use pictures to identify buttons. Pictures of the user’s own rooms and home appliances are easy to recognize (Fig. 2). Since people can take pictures of their own rooms and home appliances with their camera, they are already familiar with all the images. Therefore, no time is required to figure out the icons and buttons.
Fig. 2. Use picture as button
Color. From several candidate color combinations, the author chose the four enclosed in red squares as finalists according to the author’s criteria (Fig. 3). These four combinations use a different color for each button, so the buttons stand out more than in the other combinations, and the well-organized colors attract the user’s attention.

Fig. 3. Color combinations (those outlined in red are the finalists)
Typography. Typography cannot be a dominant part of the interface design; it is more related to legibility and readability. The typographic choice affects legibility, and the choice of typeface immediately impacts whether a communication is read and how it is perceived. A typeface designed for screen use can increase legibility as well as provide perceptual cues about the approachability and quality of an interface [4]. According to research, the most common typefaces for screens are Verdana, Helvetica, and Arial, which are sans-serif typefaces. These typefaces are clean, easy to read, and optimize the reading process. A serif typeface such as Georgia is more decorated; it works well for design purposes but is less legible and readable on screen. The author chose Arial for the prototype because it is the clearest, most precise, and easiest to read on screen, and because Arial is a basic and common typeface that is available on most devices, such as computers, PDAs, and cellular phones. The author tried to use 14 point for all text on the prototype screen, but it was too big to fit all the contents. Therefore, the author decided to use 14 point Arial for keyword text and 13 point and 12 point Arial for content text.

Navigation Cues. The author focused on a natural and easy way to navigate. The menu structure is organized by home appliance, so that the user can directly choose each home appliance from a menu page. This is a natural and easy way to navigate, and it takes fewer steps (Fig. 4).
Fig. 4. Navigation Cues
3 Prototype of Natural Interface Design

In the methodology section, the author explored several aspects of design (layout, button image, color, typography, and navigation cues) to create a final prototype of a Natural (Intuitive) Interface Design to Control Home Appliances. The author defines the phrase “natural (intuitive) design” as a design that allows users to manipulate the interface without needing to think about what they are doing. It builds on people’s prior experience. Therefore, the author used a cursor system to navigate the interface design. As noted in the methodology section, many people are familiar with using a cursor system because they already use similar systems on computers or MP3 players. To control the cursor, the author decided that a good solution would be to use a wheel key as the mouse. The cellular phone illustrated for the prototype is based on an existing model, the LG TeleCom IM-8500L, because it has a wheel key.
3.1 Final Prototype on Screen

Main Page and Set up Page. When users click the home control button, they see a main page that requires them to type their “User name” and “Pin number”. First-time users, however, have to go to the Set up page first to enter a user name, a pin number, and home appliance pictures (Fig. 5).
Fig. 5. Final Prototype
Set up Picture Pages. The four images in Fig. 6 show the process of setting up the home appliance pictures. The first image shows the home appliance set up screen where the pictures are placed. The second image shows that the user needs to choose a home appliance category from the drop-down list and then click the button for each home appliance. For example, the user clicks the lighting button to put a picture in the space. A group of thumbnail pictures appears, from which the user picks a lighting picture. In the same way, the user can enter pictures of other appliances, such as the Microwave, Oven, and Air Conditioning/Heating. After the user finishes entering pictures of his or her home appliances, the completed screen looks like the last image.
Fig. 6. Final Prototype
Menu Page and Microwave Page. Once the user has set up the pictures on the Set up page, the home appliance buttons appear on the Menu page (see the first image in Fig. 7). There are four clickable buttons: Lighting, Microwave, Oven, and Air condition/Heat. Users can also change this order using drag-and-drop actions. If users want to group kitchen appliances together, they can move the oven button to the top line. The color bar is related to each home appliance. Red is used for the oven because red depicts heat, and blue is used for air conditioning because blue depicts cool temperatures. The second image of Fig. 7 shows the Microwave page. Once the user clicks the microwave button on the Menu page, the Microwave page opens with a control box for the microwave. Red is used for the microwave control panel because of microwave ovens’ association with heat. In the control panel, there are six buttons called “One touch cook.” When users want to cook using these simple controls, they can set a time in a box and click the power button to cook.
Fig. 7. Final Prototype
4 Usability Test

Using the simulated prototype, usability tests were conducted to analyze users’ experience with the prototype, with the aim of designing a user-friendly, small-screen interface to control home appliances. There were 12 participants. Their ages ranged from 25 to 45, and they were equally divided by gender (six men, six women). They were also divided by region: six participants were from Korea and Taiwan, and six were from the USA, so that different groups of people were tested. Completing the usability test required approximately 30 minutes. First, participants were asked for demographic information such as age, occupation, and their use of computers and cellular phones. Second, participants were asked to perform five tasks on the computer, and they had to finish all tasks using the provided simulation. Each task was divided into three to four specific questions in order to obtain more accurate results (Table 2). The software Camtasia was used to capture the screen and record the participant’s voice. Third, participants were asked for their perceptions of the prototype and their preferences related to the tasks or pictures that the author showed them.
Table 2. Tasks and specific questions

Task 1. You are the first user, set up the user name as ‘poeat’ pin number as ‘1459’ to make accessible to home application.
    Find home appliances page
    Find a ‘Set up’ button
    Set up user and pin #
    Find button ‘Next’
Task 2. Bring the picture of Lighting, Microwave, Oven and Air condition
    Find Category of home appliances
    Click each home appliance’s button
    Bring pictures
Task 3. Change orders the kitchen home appliances to the top line, and placed the air condition on the left side.
Task 4. Control the microwave time to 15min, and make the heater temperature 80 degree on second floor.
    Find a Microwave button
    Set up a time of 15 minutes
    Find a power button and makes work
    Find to go back ‘Menu’ button
    Find an ‘Air condition’ button
    Find the place to set up temperature 80 degree, and click ‘Heat’ button
Task 5. Turn off all lightings, and then turn on the lightings of kitchen, and living room
    Find a Lighting button
    Find ‘All off’ button
    Turn on each room
5 Conclusion

A usability test was conducted to find out how well this prototype works and what kinds of problems users have while using it. A usability test is one of the most important methods for finding problems in an interface design and soliciting users’ practical suggestions. The results indicated that the usability test was successful, and all participants gave high ratings for both the design and its usability. Participants mentioned that they would be willing to buy the prototype as a production product if it were not too expensive. The author drew the following conclusions from the prototyping and usability testing results. The first conclusion is that using pictures of home appliances as icons is a natural and simple way to navigate the interface because users are already familiar with seeing their own home appliances. The second conclusion is that using a cursor system is an excellent method for navigating the cellular phone screen because users are already used to it, and it does not create limitations in navigating the screen. The third conclusion is that users prefer type that is clear and precise. Therefore, a bigger type size is better (13 or 14 point fonts) and a clearer typeface is better (Arial or Helvetica). The fourth conclusion is that users like to see a strong color contrast between the background and buttons because it is easier to distinguish between the two. The fifth and final conclusion is that people do not want to go through too many steps; they prefer to click less and use shortcuts to complete the tasks quickly.
Although people were satisfied with the prototype, there are still details that need to be implemented in order to improve both the design and usability of the interface. For example, proper responses need to be provided when users perform an action. If they are not provided, users wonder whether the action was completed. Similarly, a design that requires users to go through too many steps leaves them confused. For example, the correct path in the Air-condition page is as follows: (1) check the current temperature in the first box, (2) enter the temperature in the second box, and (3) click the heat button. However, participants tried to click the heat button immediately when they wanted to increase the temperature, which would require only one step. The author therefore identified several kinds of problems with the interface design and, based on these problems, made the following recommendations to improve user success with the interface: provide immediate feedback, create a clear visual navigation hierarchy, provide flexibility for the user in controlling buttons, keep the interaction metaphor consistent within the context of the current device, provide clear instructions, and provide simple and natural navigation cues that use common user conventions. Due to technology and time limitations, the prototype was simulated on a computer screen. Perhaps, in a future study, it will be possible to run the prototype on an actual cellular phone. The biggest limitation of this study is the small sample size; the author therefore cannot generalize beyond the sample. For future usability tests, the author wants to include more participants, so that the results can be generalized across a broader population.
Even with the limited sample size, the author hopes this study makes the following contributions: wireless content will be easier to use due to a simplified and comfortable interface, wireless products can be developed more quickly due to the simplified interface, and human interface technology associated with cell phones will diversify more rapidly.

Acknowledgments. I want to give a special thank you to my family: my father, mother, and brother for their constant and unconditional love, encouragement, and support. I am deeply indebted to them and dedicate this study to them.
References
1. Lee, H.: Human interface design for controlling home appliances with cellular phones. HCI International 2005, Las Vegas (2005)
2. Jeong, S., Lee, K.: The effects of experience with a PC on the usability of a mobile product. Department of Industrial Design, Korea Advanced Institute of Science and Technology (KAIST), Republic of Korea (no date)
3. Carroll, J.M.: Five Reasons for Scenario-Based Design. In: 32nd Hawaii International Conference on System Sciences. IEEE (1999)
4. Watzmann, S.: Visual Design Principles for Usable Interfaces (Chapter 13). In: The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies, and Emerging Applications, pp. 263–285. Lawrence Erlbaum Associates, NJ (2003)
Validating Information Complexity Questionnaires Using Travel Web Sites
Chen Ling1, Miguel Lopez1, and Jing Xing2
1 School of Industrial Engineering, University of Oklahoma, 202 W. Boyd, Room 124, Norman, OK {chenling, lopezmf}@ou.edu
2 Civil Aerospace Medical Institute, United States Federal Aviation Administration, Oklahoma City, OK [email protected]
Abstract. With the prevalent use of visual interfaces and the increasing demand to display more information, information complexity becomes a major concern for designers. Complex interfaces affect system effectiveness, efficiency, and even safety. Researchers at the Federal Aviation Administration have developed two sets of psychometric questionnaires to evaluate information complexity of air traffic control displays. This study adapted the questionnaires for commercial visual interfaces and validated them with directed and exploratory tasks on three travel websites. The results confirmed that both complexity questionnaires have satisfactory reliability, validity, and sensitivity. However, Questionnaire B demonstrated higher sensitivity than Questionnaire A.
Table 1. Constructs of Questionnaires and Questions in Questionnaire B

Complexity Dimensions    Complexity Constructs       Questions in Questionnaire B
Perception     Quantity  Number of fixation groups   How quickly and easily can you find the information you need on the website?
Perception     Variety   Variety of Visual Features  Does the variety of visual features (e.g., size, color, font, and icons) assist you in acquiring information?
Perception     Relation  Degree of Clutter           How does the website clutter affect reading text and icons?
Perception     Overall   -                           How would you evaluate the perceptual complexity of the website?
Cognition      Quantity  Number of Functional Units  How does the amount of information provided on the website affect information management?
Cognition      Variety   Dynamic Complexity          How do information changes on the website affect the way you process information?
Cognition      Relation  Relational Complexity       Does the way in which information is presented affect your understanding of that information?
Cognition      Overall   -                           How would you evaluate the cognitive complexity of the website?
Action         Quantity  Action Cost                 How does the action cost (transition between action modes e.g. from keyboard to mouse) affect you?
Action         Variety   Action Depth                How does the number of action steps (e.g., number of display windows, pull down menus, and pop up windows) affect you?
Action         Relation  Simultaneous Action Goal    How does the number of action sequence to perform tasks or acquire information affect you?
Action         Overall   -                           How would you evaluate the action complexity of the website?
Grand Overall  -         -                           How would you evaluate the overall complexity of the display?
Based on the complexity framework, Xing [6] developed two sets of questionnaires to evaluate information complexity for air traffic control displays. The first questionnaire (referred to as questionnaire A) is intended to be used for complexity control in acquisition evaluations. It has 13 questions, each corresponding to either one dimension of the complexity constructs or an overall complexity index. Each question is provided with four statements describing different levels of complexity. These statements are used as the multiple choice answers to the question. Subjects need to choose one statement that best describes their perception of the system being evaluated. The second questionnaire (referred to as questionnaire B) is suitable for complexity management during and after design. It contains 13 questions, each accompanied with 3 to 6 statements, with a total of 54 items (See Table 1). To eliminate response
bias, some statements are positive to the question whereas others are negative. Subjects need to grade the degree of agreement to every statement using a six-point rating scale, ranging from strongly disagree to strongly agree. No neutral answer option is provided. This forced-choice format is believed to better elicit subjects’ opinions. Here we provide one example question from each questionnaire on the same complexity construct: perceptual variety. In questionnaire A, the question is “How easy is it for you to find information on the website?” The statements associated with the question are listed as follows:
A. I can find the information effortlessly.
B. I can find the information with a few quick glances.
C. I can find the information by searching in a local area of the display.
D. I have to search through the display to find the information.
Subjects need to select the one statement from A to D that best fits their experience with the evaluated system. In Questionnaire B, the question on perceptual variety is “How quickly and easily can you find the information you need on the display?” followed by four statements. Statements 1, 2, and 4 are positive and statement 3 is negative. Subjects need to indicate their degree of agreement with each of these statements.
1. I know where to look to find the information I need.
2. I can find the information I need without searching.
3. I have to search through the display to find the information I need.
4. I can find the information I need with a few quick glances.
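Because positive and negative statements are mixed within a question, the ratings have to be put on a common direction before they can be combined. The preprocessing described later in the paper expresses every answer as agreement with a negative statement, so that higher values mean higher complexity. The sketch below illustrates that idea for the example question above; the function names are hypothetical, and averaging the statements of one question into a single index is an assumption made here for illustration, not a step the paper spells out.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 6  # six-point agreement scale with no neutral option


def to_complexity_index(rating, positive):
    """Express a rating as agreement with a negative statement.

    Ratings of positive statements are reverse-coded (1 <-> 6, 2 <-> 5, 3 <-> 4);
    ratings of negative statements are kept unchanged.
    """
    if not SCALE_MIN <= rating <= SCALE_MAX:
        raise ValueError(f"rating {rating} is outside {SCALE_MIN}..{SCALE_MAX}")
    return SCALE_MIN + SCALE_MAX - rating if positive else rating


def question_complexity(ratings, positive_flags):
    """Average the recoded ratings of one question (an assumed aggregation)."""
    if len(ratings) != len(positive_flags):
        raise ValueError("one polarity flag is needed per statement")
    return mean(to_complexity_index(r, p) for r, p in zip(ratings, positive_flags))


if __name__ == "__main__":
    # Statements 1, 2, and 4 are positive; statement 3 is negative.
    flags = [True, True, False, True]
    ratings = [5, 6, 2, 5]  # hypothetical answers from one subject
    print(question_complexity(ratings, flags))
```

With this coding, a subject who strongly agrees with the positive statements and disagrees with the negative one ends up with a low complexity index, as intended.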
A concept that is related to complexity is usability. Usability is defined as “the effectiveness, efficiency and satisfaction with which specified users achieve specified goals in particular environments” in ISO 9241-11. Many questionnaires have been developed for interface usability evaluation. One of the most widely used is the Post Study System Usability Questionnaire (PSSUQ) for scenario-based usability evaluation [1]. The questionnaire has 19 items. It evaluates three dimensions of usability: system usefulness, information quality, and interface quality. An overall usability score can be derived by averaging the answers to items in each dimension. The PSSUQ uses a 7-point Likert scale, where higher ratings indicate higher usability. The PSSUQ has been validated and has demonstrated high reliability, with an overall Cronbach’s alpha of 0.97. Its construct validity has been established through factor analysis. After ten years of use, the PSSUQ was still considered reliable and valid [2]. While a few questions in the PSSUQ capture some aspects of complexity, it does not systematically evaluate complexity, nor does it yield much information about the underlying structure of high complexity. On the other hand, Xing’s complexity questionnaire A can elucidate the complexity structure of a visual display because it is based on a well-structured framework, and the statements in questionnaire B can themselves serve as guidelines for reducing complexity. Since air traffic control displays and commercial interactive visual interfaces have much in common, the two complexity questionnaires could be a valuable addition for the usability community. However, neither questionnaire has been validated. The purpose of this study was to adapt Xing’s questionnaires [6] for commercial interactive visual interfaces, validate them, and compare them to the established usability questionnaire PSSUQ.
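As a rough illustration of how PSSUQ responses are typically turned into subscale scores, the sketch below averages the ratings within each dimension. The item-to-subscale mapping (items 1 to 8, 9 to 15, and 16 to 18) follows the commonly cited layout of the 19-item PSSUQ and is an assumption here, since the paper does not list it; computing the overall score as the mean of all answered items is likewise an assumption. The function name `score_pssuq` is hypothetical.

```python
from statistics import mean

# Assumed mapping of 1-based item numbers to subscales; check it against the
# PSSUQ version actually administered before relying on these scores.
SUBSCALES = {
    "system_usefulness": range(1, 9),     # items 1-8
    "information_quality": range(9, 16),  # items 9-15
    "interface_quality": range(16, 19),   # items 16-18
}
N_ITEMS = 19


def score_pssuq(responses):
    """Average 7-point ratings per subscale and over all answered items.

    responses maps an item number (1..19) to a rating (1..7); items that a
    subject skipped may simply be left out of the dictionary.
    """
    for item, rating in responses.items():
        if not 1 <= item <= N_ITEMS:
            raise ValueError(f"unknown item number {item}")
        if not 1 <= rating <= 7:
            raise ValueError(f"rating {rating} is outside 1..7")
    scores = {}
    for name, items in SUBSCALES.items():
        answered = [responses[i] for i in items if i in responses]
        scores[name] = mean(answered) if answered else None
    all_answers = list(responses.values())
    scores["overall"] = mean(all_answers) if all_answers else None
    return scores


if __name__ == "__main__":
    example = {i: 5 for i in range(1, N_ITEMS + 1)}  # hypothetical subject
    print(score_pssuq(example))
```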
We chose three travel websites to validate the questionnaires. With the growth of e-commerce, people tend to plan their travels with web agents. Since planning a trip on a travel website involves a certain degree of complexity, we selected some tasks on the travel websites as a vehicle to validate the two complexity questionnaires. We intend to establish several quality criteria for the psychometric instruments of the complexity questionnaires, including reliability, validity, and sensitivity [4]. The validation focused on the following questions: Are the measured indices from different questionnaires consistent? Does the instrument measure the intended attribute? Are the questionnaires sensitive to experimental manipulation? The objective of our study is to establish these three criteria for the adapted complexity questionnaires.
2 Methodology

2.1 Subjects

Forty-one university students participated in the study, of whom 13 were female and 28 were male. Thirty-four subjects were native English speakers, and the other 7 used English proficiently as a second language. The average age of the subjects was 22.8 years.

2.2 Apparatus

The experiment was performed during November and December of 2006. Microsoft Internet Explorer® version 6.0 was used as the web browser. The websites studied were www.experia.com, www.travelocity.com, and www.orbitz.com. Task performance time was recorded with a Casio stopwatch (continuous) and an iPod Nano® (in stopwatch mode). Three questionnaires were used: two complexity questionnaires (adapted from Xing’s complexity questionnaires A and B) and the PSSUQ.

2.3 Tasks

Before the experiment, a task analysis of each website was conducted to develop and classify the experimental tasks. The experiment consisted of two types of tasks: directed and exploratory. The directed tasks were to be performed using the standard toolbox in the top left corner of each website’s homepage. The toolbox helps users to find the optimal results. The three directed tasks were: 1) Buy tickets for two adults and two children, with a Day’s Inn hotel, from Dallas, TX to Yellowstone National Park on particular dates; 2) Buy a roundtrip ticket for one person from Oklahoma City, OK to Chicago, IL on particular dates; 3) Buy cruise tickets for two adults for a seven-day Western Caribbean cruise departing from Miami, FL, requiring ocean-view rooms and a budget of no more than $900 per person. For the exploratory tasks, subjects were asked not to use the standard toolbox but to follow website links to accomplish the tasks. The three exploratory tasks were: 4) Buy a tour to Paris, France for four nights for your daughter and wife from Miami, FL at the cheapest price; 5) Plan a 7-day honeymoon trip to Hawaii
with a $5000 budget for you and your fiancé from San Francisco, CA for mid-August next year; 6) Find the best deal to go to Las Vegas this weekend. These two types of tasks are representative of tasks on many visual interfaces: those commonly carried out by users and those performed less frequently. Because it may require more mental effort to accomplish the less frequent tasks, the complexity ratings of these tasks are expected to be higher. Therefore, we used both types of tasks to test the sensitivity of the two complexity questionnaires. The questionnaires should be able to detect the differences in complexity between these two types of tasks.

2.4 Procedure

Subjects filled out the informed consent form and a survey on demographic information before the experiment. Based on their reported familiarity with the three websites, the subjects were assigned to the website that they were least familiar with, in order to reduce the variation caused by prior experience. As a result, 14 subjects used Expedia, 13 subjects used Travelocity, and 14 subjects used Orbitz. The experimenter first presented the subjects with the tasks and carefully explained the purpose of the experiment, emphasizing the importance of filling out the questionnaires carefully. The subjects first performed the 3 directed tasks. Then they were asked to fill out the two complexity questionnaires and the PSSUQ usability questionnaire. The order of the tasks was randomly assigned. Then the subjects performed the 3 exploratory tasks and subsequently filled out the three questionnaires again. The order of the three questionnaires was counterbalanced to eliminate potential biases. Throughout the experiment, the experimenter took notes on task performance and comments from the subjects. The experiment took roughly one and a half hours to complete.
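The two assignment steps in the procedure, choosing each subject's least familiar website and counterbalancing the order of the three questionnaires, can be sketched as follows. This is one plausible scheme, not the authors' documented one: cycling through all six orderings is an assumption, the familiarity ratings are hypothetical, and the sketch does not reproduce how the paper ended up with 14, 13, and 14 subjects per site.

```python
from itertools import permutations

QUESTIONNAIRES = ("Complexity A", "Complexity B", "PSSUQ")


def counterbalanced_orders(n_subjects, items=QUESTIONNAIRES):
    """Cycle through every ordering so each is used about equally often."""
    orders = list(permutations(items))  # 3! = 6 orderings for three items
    return [orders[i % len(orders)] for i in range(n_subjects)]


def least_familiar_site(familiarity):
    """Pick the site with the lowest self-reported familiarity rating.

    familiarity maps a site name to a rating where higher means more
    familiar; ties go to the first site listed.
    """
    return min(familiarity, key=familiarity.get)


if __name__ == "__main__":
    for subject, order in enumerate(counterbalanced_orders(41)[:6], start=1):
        print(subject, order)
    print(least_familiar_site({"Expedia": 3, "Travelocity": 1, "Orbitz": 2}))
```

With 41 subjects and six orderings, full balance is not possible; under this scheme five orderings are used seven times and one is used six times.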
3 Results and Analysis

3.1 Questionnaire Adaptation and Data Preprocessing

We first adapted Xing’s complexity questionnaires A and B to fit visual interfaces in general. Some wording was modified based on input from a language professor and subject matter experts. Because of length limitations, we are unable to include the adapted questions in this paper. The original questionnaires have been documented [6], and we will present the full version of the adapted questionnaires elsewhere in the near future. Because each subject answered the questionnaire package twice, there are 82 sets of responses in total. In questionnaire B, multiple questions ask about the same construct, some with positive statements and some with negative statements. In the data preprocessing stage, all answers are transformed to the degree of agreement with the negative statements, so higher agreement indices imply higher complexity levels. Answers to questionnaire A follow the same pattern. The answer indices for questionnaire A range from 1 to 4, with 1 for the complexity level “not complex and easy to use,” 2 for “moderate complexity but manageable,” 3 for “high complexity and hard to manage,” and 4 for “too complex to manage.” The answer indices for questionnaire B range from 1 to 6, with 1, 2, and 3 representing relatively simple interfaces, and 4, 5, and 6 for
relatively complex ones. The average overall complexity rating of all subjects on the three websites is 1.88 from questionnaire A and 2.69 from questionnaire B.

3.2 Reliability

3.2.1 Correlation Analysis

Because questionnaires A and B measure the same complexity constructs, the first step of our analysis is to establish the correlation between the two questionnaires. For each construct, a correlation coefficient and its p-value are obtained. Most correlation coefficients are positive and significant, but three statements in questionnaire B show negative correlations with the corresponding questions in questionnaire A. We refer to those statements as “irrelevant.” These statements and the associated questions are studied further in the next section.

3.2.2 Internal Consistency Analysis on Questionnaire B

In questionnaire B, multiple statements are associated with each complexity construct. The responses to the statements associated with the same construct are expected to be consistent. Internal consistency indices among these statements are calculated for every question in questionnaire B. The questions containing the “irrelevant” statements mentioned earlier yield low Cronbach’s alpha values. However, when we remove those statements, all the resulting Cronbach’s alpha values are above 0.7, which is considered an acceptable level of internal consistency. A closer look at the “irrelevant” statements shows that the first statement, “I can see information better if I ignore some of the colors, fonts, and text formats,” might not have been well comprehended by the subjects. Both the second statement, “Using this website takes moderate mental efforts,” and the third statement, “I can interact with the website to accomplish my tasks but with some effort,” describe a moderate level of complexity.
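The preprocessing of Section 3.1 and the internal consistency check above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the answers are made up, and the standard Cronbach's alpha formula with sample variances is assumed.

```python
from statistics import variance

SCALE_MAX = 6  # questionnaire B answers range from 1 to 6


def to_negative_agreement(answer, positively_worded):
    """Express an answer as agreement with the negative (complex) wording.

    A positively worded statement is flipped (1 <-> 6, 2 <-> 5, 3 <-> 4) so
    that a higher index always means higher complexity.
    """
    return SCALE_MAX + 1 - answer if positively_worded else answer


def cronbach_alpha(items):
    """Cronbach's alpha for `items`, a list of per-statement answer lists.

    items[i][j] is respondent j's (already aligned) answer to statement i.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two statements")
    totals = [sum(answers) for answers in zip(*items)]
    item_variance_sum = sum(variance(answers) for answers in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up answers from five respondents (NOT data from the study).
negative_item = [1, 2, 3, 4, 5]
positive_item = [to_negative_agreement(a, True) for a in [6, 5, 4, 3, 2]]
odd_item = [5, 4, 3, 2, 1]  # runs against the other two statements

print(cronbach_alpha([negative_item, positive_item, odd_item]))  # -3.0
print(cronbach_alpha([negative_item, positive_item]))            # 1.0
```

The toy data are deliberately extreme: one statement that runs against the others drives alpha down, and dropping it restores consistency, which mirrors the effect reported for the three “irrelevant” statements.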
Because other positive and negative statements have been used to measure the same constructs, these three “irrelevant” statements can be removed from questionnaire B without affecting the overall effectiveness of the questionnaire.

3.3 Validity

3.3.1 Construct Validity

To understand how the constructs of questionnaires A and B (see Table 1) are related to each other, and in particular how the individual dimensions of complexity contribute to the overall complexity rating, multiple regression analyses are performed on the responses to both questionnaires. For questionnaire A, the perception, cognition, and action dimensions together account for 46% of the variation in the overall complexity. Within each dimension, the quantity, variety, and relation constructs account for 41% of the variation in perceptual complexity, 37% in cognitive complexity, and 45% in action complexity. For questionnaire B, the perception, cognition, and action dimensions together account for 77.4% of the variation in the overall complexity. Within each dimension, the quantity, variety, and relation constructs account for 59% of the variation in perceptual complexity, 52% in cognitive complexity, and 55% in action complexity.
The higher R-square values derived from the responses to questionnaire B suggest that rating multiple statements for each complexity construct provides broader coverage of complexity issues than choosing just one statement per construct, as in questionnaire A. The current 82 data points are not enough for factor analysis to derive reliable factor loadings. More data points need to be collected to further establish the construct validity of the two complexity questionnaires.

3.3.2 Concurrent Validity

The overall complexity measured by questionnaires A and B is correlated with the three sub-scales of the PSSUQ usability questionnaire [1] and with the overall usability value. All the correlations are significant, and the absolute values of the coefficients are all above 0.55. The results indicate that usability and complexity are negatively related to each other. Websites with lower complexity are easier to deal with and are therefore considered to have higher usability.

3.4 Sensitivity

To measure the sensitivity of the two complexity questionnaires, the responses from both questionnaires are compared between the two types of tasks and among the websites. Because of the different natures of the tasks, the exploratory tasks are expected to be associated with higher complexity. The analysis also aims to discover whether the three functionally similar websites have different levels of complexity. The independent variables in the analysis are the two types of tasks (directed and exploratory) and the three travel websites (Expedia, Travelocity, Orbitz). Since each subject is assigned to use only one travel website to perform both types of tasks, website is a between-subject variable and task type is a within-subject variable. The dependent variables are the responses to the questionnaires and the task performance time (in seconds).
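The mixed design described above can be made concrete with a small data-layout sketch. It does not run the ANOVA itself; it only shows how each subject contributes two ratings (one per task type) under a single website, and how the within-subject contrast and the between-subject group means are formed. All ratings and identifiers are made up for illustration and are not data from the study.

```python
from collections import defaultdict
from statistics import mean

# (subject id, website, task type, overall complexity rating)
# Hypothetical values only; they are NOT taken from the experiment.
RESPONSES = [
    ("s01", "Expedia", "directed", 2), ("s01", "Expedia", "exploratory", 3),
    ("s02", "Expedia", "directed", 1), ("s02", "Expedia", "exploratory", 2),
    ("s03", "Orbitz", "directed", 2), ("s03", "Orbitz", "exploratory", 4),
    ("s04", "Orbitz", "directed", 3), ("s04", "Orbitz", "exploratory", 3),
]


def within_subject_differences(responses):
    """Return {subject: exploratory - directed}, the within-subject contrast."""
    by_subject = defaultdict(dict)
    for subject, _site, task, rating in responses:
        by_subject[subject][task] = rating
    return {s: r["exploratory"] - r["directed"] for s, r in by_subject.items()}


def mean_by_website(responses):
    """Average each subject's two ratings, then average subjects per website."""
    per_subject = defaultdict(list)
    site_of = {}
    for subject, site, _task, rating in responses:
        per_subject[subject].append(rating)
        site_of[subject] = site
    per_site = defaultdict(list)
    for subject, ratings in per_subject.items():
        per_site[site_of[subject]].append(mean(ratings))
    return {site: mean(values) for site, values in per_site.items()}


print(within_subject_differences(RESPONSES))
print(mean_by_website(RESPONSES))
```

Because task type varies within subjects, its effect is judged against the spread of the per-subject differences, whereas the website effect compares separate groups of subjects.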
3.4.1 ANOVA on Usability Questionnaire PSSUQ Results

When the ANOVA is applied to the overall usability ratings of the PSSUQ, task type is found to have a statistically significant effect (p=0.0081) on the overall usability. The exploratory tasks have lower usability (M=4.62, SD=1.452) than the directed tasks (M=5.23, SD=1.074). The website factor and the interaction between website and task type are not significant. Further analysis of the individual dimensions of the PSSUQ (system usefulness, interface quality, and information quality) shows that task type is a significant factor for system usefulness (p=0.0062) and interface quality (p=0.028), but not for information quality (p=0.1345). System usefulness includes aspects of ease of use, learnability, speed, and task performance; interface quality measures whether users find the system pleasant and like it. Significance in both of these dimensions is evidence that users’ interactive experiences with the system do differ between the two types of tasks. The complexity questionnaires should be able to capture this difference.
3.4.2 ANOVA on Questionnaire A Results

When the ANOVA is applied to the overall complexity responses in questionnaire A, task type is found to have a significant effect (p=0.013) on the overall complexity. The interfaces are considered more complex for the exploratory tasks (M=2.05, SD=0.714) than for the directed tasks (M=1.71, SD=0.559). The website factor and the interaction between website and task type are not significant. Further analysis of the individual dimensions (perception, cognition, and action) shows that task type is not a significant factor for perceptual complexity (p=0.64), but is significant for cognitive complexity (p=0.014) and action complexity (p=0.0002). This result makes sense because, when subjects perform the exploratory tasks, they need to figure out which links to follow in order to accomplish the tasks. The navigation steps taken to seek the answer and carry out the tasks usually outnumber those needed with the standard toolbox. Therefore, both the cognitive and action complexity of the interfaces are higher for the exploratory tasks.

3.4.3 ANOVA on Questionnaire B Results

When the ANOVA is applied to the overall complexity responses in questionnaire B, task type is again found to have a significant effect (p=0.037) on the overall complexity. The exploratory tasks are considered more complex (M=2.92, SD=1.301) than the directed tasks (M=2.46, SD=0.872). The website factor and the interaction between website and task type are not significant. Further analysis of the individual dimensions (perception, cognition, and action) shows that task type is a significant factor for all three complexity dimensions: perceptual complexity (p=0.004), cognitive complexity (p=0.028), and action complexity (p=0.027). Exploratory tasks result in higher complexity in all three dimensions than directed tasks. This result differs somewhat from that obtained with questionnaire A.
In addition to cognitive and action complexity, the task type effect on perceptual complexity is also significant. This effect may arise because users need to spend more effort searching for information during the exploratory tasks, whereas in the directed tasks they only need to search for information within the framework provided by the search tool. The fact that questionnaire B detects a difference in perceptual complexity that questionnaire A cannot indicates that questionnaire B is more sensitive than questionnaire A. Another difference that questionnaire A failed to detect is the difference in cognitive complexity among the three travel websites. With questionnaire B, the three websites are found to be significantly different in cognitive complexity (p=0.016). Expedia receives a significantly lower overall complexity score (M=2.08, SD=0.677) than Orbitz (M=2.76, SD=0.938). There is no significant difference in cognitive complexity between Orbitz and Travelocity (M=2.56, SD=0.891). This result reflects the differences in the design of the three websites. Many subjects commented on the difficulty of the tasks on Orbitz. For example, when booking a flight to Yellowstone National Park, Expedia offers many options for choosing the closest airport, but Orbitz simply returns an empty result with no extra help. Orbitz also loses parts of the subjects’ entry data (e.g., accompanying children’s ages) when they return to the booking page. These difficulties may have caused greater cognitive complexity for users trying to figure out how to perform the given tasks.
The differences in cognitive complexity are consistent with the differences in the overall performance time. Task type (p=0.003) and website (p=0.047) are both found to have significant effects on performance time. The exploratory tasks take longer to accomplish (M=1469.76s, SD=435.458s) than the directed tasks (M=1231.93s, SD=389.542s). Expedia takes significantly less time for the tasks (M=1223.00s, SD=360.913s) than Orbitz (M=1524.50s, SD=421.244s). There is no significant difference between Orbitz and Travelocity (M=1301.50s, SD=453.469s). These results are consistent with the analysis based on questionnaire B, which indicates that Orbitz has higher cognitive complexity. Both questionnaires are able to detect the differences in complexity between the two types of tasks, and both demonstrate satisfactory sensitivity. The ANOVA based on questionnaire B yields more significant factors than that based on questionnaire A, including the significant task type effect on perceptual complexity and the website effect on cognitive complexity. We can therefore say that questionnaire B has higher sensitivity than questionnaire A. The reason, again, may be that using more statements for each construct in questionnaire B better reflects opinions on complexity issues.
4 Discussion

In this study, two complexity questionnaires are validated by applying them to three travel websites. The reliability, validity, and sensitivity of the two questionnaires are assessed. The reliability is established through the correlation among the two complexity questionnaires and the PSSUQ usability questionnaire, and through the internal consistency of complexity questionnaire B. The validity is established by the high correlation between the responses to complexity questionnaires A and B and the PSSUQ. The sensitivity is established by the ANOVA of results across different types of tasks and websites. The questionnaires are able to pick up the differences caused by different task types.

The experimental results help to improve the quality of the complexity questionnaires. Some statements in questionnaire B are found to be irrelevant to the complexity constructs and cause confusion in responding. Removing them yields better reliability in terms of internal consistency.

The two types of tasks used in the experiment are complementary. The directed tasks represent the more frequent tasks that users do every day, and the exploratory tasks represent tasks that are not done on a regular basis and take more effort to figure out. Each type alone represents only a part of the interface, so an evaluation based on one type of task cannot fully account for the interface. We propose that future interface evaluation should be based on a combination of these two types of tasks.

Although the two complexity questionnaires were initially developed for air traffic control displays, our results show that, with minor wording modifications, they can also be used to measure the complexity of interactive visual interfaces. The complexity of travel websites is not as high as that of the displays normally used by air traffic controllers. Air traffic control displays present dynamic information, whereas the travel website interfaces present only static information. Therefore, to make the validation results fully applicable to air traffic control, future studies will use dynamic interfaces to further validate the current sets of questionnaires.
Acknowledgement. This research was supported by the Federal Aviation Administration (FAA) Civil Aerospace Medical Institute (CAMI), Oklahoma City, under the grant entitled “Investigating Information Complexity in Three Types of Air Traffic Control (ATC) Displays” (grant number FAA 06-G-013).
References

[1] Lewis, J.R.: IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human-Computer Interaction 7(1), 57–78 (1995)
[2] Lewis, J.R.: Psychometric evaluation of the PSSUQ using data from five years of usability studies. International Journal of Human-Computer Interaction 14(3&4), 463–488 (2002)
[3] Maeda, J.: The Laws of Simplicity: Design, Technology, Business, Life. MIT Press, Cambridge (2006)
[4] Nunnally, J.C.: Psychometric Theory. McGraw-Hill, New York (1978)
[5] Xing, J.: Measures of Information Complexity and the Implications for Automation Design. Federal Aviation Administration, Washington, DC, No: DOT/FAA/AM-04/17 (2004)
[6] Xing, J.: Two questionnaires for complexity evaluation and management in air traffic control displays. Technical report, Federal Aviation Administration, Washington, DC (2007)
Maximizing Environmental Validity: Remote Recording of Desktop Videoconferencing

Sean Rintel

Department of Communication, University at Albany, State University of New York, 1400 Washington Ave, Albany NY 12222, USA
[email protected]
Abstract. This paper discusses the development of the technical methodology for remote recording to maximize environmental validity for a project on how novices develop familiarity with desktop videoconferencing (DVC). It is also a discussion of how the technical setup, as well as the resulting data, was useful for finding usability issues for the company that provided the DVC software.

Keywords: Desktop videoconferencing, novices, familiarity, usability, methodology, environmental validity, remote recording.
1 Introduction

Longitudinal field research on desktop videoconferencing (DVC) reveals the kinds of problems that users consider important, the way they go about solving them, and the limits of user-generated solutions. Many previous studies of DVC that stress collecting rich naturalistic data amenable to qualitative analysis have been conducted either auto-ethnographically by the researchers themselves at work [2] or with novices but in fairly controlled work environments with on-site recording equipment [4, 5]. While both kinds of studies provide excellent results, they are still some distance from the high level of environmental validity that would be achieved if novice users could be recorded remotely in their own environments, on their own timetable, with no extra software, and with no more effort on their part than videoconferencing itself. This is especially the case for home users. If such remote recording can be achieved, then from both an academic and a developer standpoint the resultant data should be of high quality, since it minimizes the impact that the research has on the participants in the crucial moments of actually using the technology.

This paper discusses the development of the technical methodology for a project exploring how novices develop familiarity with DVC. Rejected and failed solutions are discussed to provide a rationale for the chosen setup, and then the challenges and benefits of the setup are outlined for both the academic goals of the project and the usability goals of the industry partner.
DVC’s affordances [3] as part of accomplishing their desired social activities, and how those understandings change over time. Given that these goals are experience-based and descriptive rather than task- or technology-based and evaluative, it was decided from the outset that environmental validity was to be given primacy. In usability research, Nielsen [7] defines validity as resting on whether tests measure “something of relevance to usability of real products in real use outside the laboratory.” Ethnomethodological usability research does not depend on measurement, but its concept of validity could be said to share Nielsen’s emphasis on relevance in situated reality [6, 8]. For this project, I use the term “environmental validity” to mean what most sociological research refers to as “ecological validity”, that is, a concern that a controlled research situation should approximate the real-life situation that is under investigation. I prefer the term environmental validity both to avoid conflict with the Brunswikian [1] definition of ecological validity and as a way of stressing that the physical environment in which DVC takes place is likely to matter a great deal to how people use it.

As such, a primary concern of this project was to approximate the real-life experience of novices trying out DVC for maintaining long-distance personal relationships and to capture that experience as richly as possible. Twelve pairs were to try DVC for two to three months. A combination of observational and interview data was decided upon, with the observational data having primacy. This meant enabling DVC in the participants’ own homes and recording what occurred for later analysis. The observational data collection objective was an unbroken longitudinal record of every DVC interaction engaged in by a pair.
3 Environmental Validity on a Shoestring Budget

3.1 Rejected Local Recording Solutions

Studies on DVC at work frequently use recordings of both on-screen action and action in the physical environment [2, 4, 5]. For this project on home DVC, however, the latter had to be ruled out. Not only were resources too limited for physical environment recording, but the physical and social intrusion that would have resulted from placing suitable recording equipment in homes would have undermined the naturalness of the trials, not to mention raising extreme ethical concerns. On-screen DVC action, then, was to be the crux of the observational data.

To conduct moment-by-moment microanalysis of interactions by pairs via DVC, it is critical to record the synchronized video and audio of each pair member as they experienced it, and to synchronize those recordings with one another such that what the researcher experiences after the fact closely approximates the experience of the participants during the interaction. There were, then, two linked issues. First, how the DVC video and audio for each participant could be accessed and recorded, which impacted directly on environmental validity. Second, how the recordings of separate participants could be synchronized, which was affected somewhat by the requirements of environmental validity. The intrusion and resource factors that led to ruling out physical environment recording also impacted upon the methods of accessing and recording the on-screen DVC action.
Methods for recording DVC with apparatus in participants’ homes were rejected for several reasons. The simplest rejected method was a VCR connected to each participant’s computer that the participants would start and stop for every DVC event. This was rejected primarily due to intrusion problems. Requiring participants to record themselves is highly unnatural, emphasizing their awareness of the unreality of the situation and the knowledge that there is a persistent record of what would ordinarily be ephemeral. The VCR would have to be fitted into the participants’ environments in such a way as to be unobtrusive but still easily accessible, as would an unknown quantity of blank video-cassettes, and recorded video-cassettes would have to be changed, labeled, and stored. Participants would also have had to be trained to use and troubleshoot the VCR (e.g. how to reset the clock if the power went off, since a non-functioning clock often prevents any action until reset). Further complicating such a method was the possibility that participants might be using laptop computers, adding the need to frequently unplug and re-plug the VCR into the computer. As it turned out, over 80% of participants in the study used laptops, so this could have been a very real problem. Even this ‘simple’ solution would allow for both deliberate and accidental lapses in recording, problematizing the longitudinal record.

More complex, automated recording systems, using off-the-shelf or purpose-built digital video recorders, were also rejected for similar reasons. While the physical impact of equipment access and of replacing and storing tapes would have been eliminated, the laptop issue would still have been a factor. And, of course, designing and supplying a bespoke automated recording system that could easily be unplugged and re-plugged was well beyond the resources of the project.

The possibility of using automatic screen-capture software loaded directly on participants’ computers (e.g.
Camtasia) was also tested, but two more problems surfaced. First, the size of the captured video files was overwhelming. Participant computers would have needed vastly upgraded hard drives just to fit the files. Further, in tests on a laptop and a desktop of average power, the rendering of the temporary video file for even a twenty-minute conversation either crashed the computer or took an inordinately long time. Clearly these issues would have directly decreased the realism of the situation for the participants. Second, tests found that after an indeterminate period of time above around five minutes, the video and audio of the recordings would desynchronize, rendering the recordings unusable.

Even had they been feasible, any form of local recording was also determined to add a layer of complexity to working up the data as analyzable recordings because of the need to synchronize individual records. While both VCR and digital video recordings can be synchronized using various forms of time-codes, doing so was not worth the effort given both the environmental validity and practical problems that these systems presented. Having rejected local recording systems, remote recording was the only remaining solution. But coming up with an affordable, unobtrusive, and naturalistic system was also a challenge.

3.2 Rejected Remote Recording Solutions

Remote usability testing solutions have flourished in the last few years (REF). Most purpose-built remote usability software (e.g. Morae) and remote desktop sharing solutions (e.g. RealVNC, Microsoft Remote Desktop, Citrix, WebEx) work on the same
principle: users load and run the remote sharing software locally, and then all on-screen activity is uploaded in real time to the remote research location. This provides the richest possible picture of the use of the application in question, as well as of how users multi-task. The obvious benefit of such a solution is that the environmental validity of the project is increased because recording becomes essentially transparent to the participants: there is no physical impact on the participants’ environments and, since these systems can be run automatically and silently on computer startup, there is also a greatly reduced impact on the actions that participants need to take per DVC event. WebEx and Morae were beyond the resources of this project, but both RealVNC and Remote Desktop (built into Windows XP Pro) were affordable enough to be possible data delivery systems.

In theory these remote sharing solutions seemed ideal for this project, but practical obstacles led to their rejection. It was immediately found that RealVNC, in common with most frame-buffered desktop sharing applications, does not support transmission of the remote computer’s audio, making it unsuitable for this project. Desktop sharing applications based on the Remote Desktop Protocol (RDP), however, such as the Remote Desktop built into Windows XP Pro, do support audio transmission and improved video transmission, and thus tests were conducted using Remote Desktop in Windows XP Pro. The biggest barrier to using remote sharing was participant upload bandwidth. Tests showed that the average capped upload limit of around 300kbps on home broadband connections was too low to support both DVC and a real-time remote desktop connection with full-fidelity video and audio. Those DVC applications that did not freeze had heavily degraded video and audio streams.
Indeed, during the project it was found that this 300kbps average upload rate was on occasion hard-pressed to transmit the DVC upload stream itself, given that many users were (a) sharing their network connections and (b) despite advice to the contrary, running other applications which used upload bandwidth. While home users continue to have severely limited upload bandwidth, the potential for using RDP-based testing of bandwidth-heavy home applications seems to be limited. There is, however, hope for improvement. The remote sharing solution also had the same synchronization problem as the local recording solutions: individual pair members’ recordings would have had to be combined and synchronized. However, unlike the local recording solutions, had the remote sharing solutions adequately delivered the DVC streams, this would certainly have been worth the effort.

The next remote recording solution envisaged was directly tapping into the streams of DVC systems’ servers at their source. This would have split the participant DVC streams as they occurred and directed the copies into recording apparatus at the server location. No software other than the DVC application itself would be in operation, leaving participants’ upload bandwidth untouched. Clearly this solution would have improved upon the remote sharing solution because it would be completely transparent to users, providing the highest level of environmental validity possible. The old story of resource limitation reared its head in two ways to prevent this solution. The project lacked not only the financial resources to run a dedicated DVC server but also the resources to develop software to enable splitting and copying of the DVC streams. Had these resources been available, this would have been an excellent solution.
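The upload constraint that ruled out real-time desktop sharing amounts to a simple budget check. The sketch below is illustrative only: the 300kbps cap comes from the text, but the stream bitrates and the function name are hypothetical values chosen to show the arithmetic, not measurements from the project.

```python
UPLOAD_CAP_KBPS = 300  # average capped home upload rate reported in the text


def fits_upload_cap(stream_kbps, cap_kbps=UPLOAD_CAP_KBPS):
    """Return (fits, headroom_kbps) for a set of concurrent upload streams."""
    total = sum(stream_kbps.values())
    return total <= cap_kbps, cap_kbps - total


# HYPOTHETICAL bitrates, not measured values from the study.
streams = {"dvc_video_audio": 200, "remote_desktop": 250}
fits, headroom = fits_upload_cap(streams)
print(fits, headroom)  # False -150
```

Any combination whose total exceeds the cap leaves negative headroom, which is the situation in which the DVC streams froze or degraded; server-side tapping avoids the problem because it adds no upload stream at the participant's end.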
3.3 Remote Recording of Participants Using a Multi-party Bridge

Having eliminated all local and most remote recording methods due to resource scarcity and validity problems, the only solution remaining was to have participants interact in a small multi-party situation. A pair of interactants and a third party who does not interact are just a form of small group, and thus so long as a group DVC room/channel could be limited to just the desired interactants, and participants consented to being recorded, this solution would provide the desired access. Indeed, not only would access be possible, the access would be largely the same for the researcher as for the participants, and it would have the virtue of allowing one recording to capture all the interaction, removing the problem of synchronizing separate recordings. Although one extra step would be required beyond regular point-to-point DVC (pairs would have to log into a multi-party bridge before seeing their partner and then talking), it would have a very limited impact upon environmental validity, certainly much less than other methods.

However, there was an important environmental validity problem to be solved for remote recording using a multi-party bridge. Most multi-party DVC video window interfaces are designed to indicate all participants in a room/channel to enhance the sense of group cohesion. Apple iChatAV and SightSpeed are just two of many DVC services which display all participants within a single video window. Non-participating members are marked by very obvious placeholders if no camera is attached (mono-colored areas or iconic representations) or by a view of an empty physical area if a camera is attached (see Figure 1).
Fig. 1. Displays from Apple iChatAV and SightSpeed in which non-participating members are marked by overly obvious place holders
While this is useful when the group context needs to be foregrounded, it was a drawback for this approach to remote recording since a placeholder would immediately and constantly alert participants to the fact of their being recorded, significantly reducing the sense that they were having a private dyadic interaction. Pilot testing using SightSpeed indicated that participants would discuss the placeholder, thus it was critical to find a DVC application that minimized the visibility of the researcher. Extensive trials led to the discovery that two DVC services took an alternate approach to multi-party participant video display. Regardless of the number of participants, both Wave Three Inc.’s Session Communication software and iVisit displayed
all participants’ video in individual floating windows that could be resized and positioned anywhere on the desktop. This allowed a group of three to appear as a group of two, since the participants could choose to start only the video windows they wanted and position them anywhere they wanted. Session was chosen for its superior video and audio quality in tests. Figure 2 shows images of Session windows open on a participant’s desktop and on the server desktop.
Fig. 2. Session video windows as they appeared on a participant’s desktop and on the server desktop (participant desktop image is a mockup based on Session’s display capabilities and participant descriptions of video window placement)
Session also had several features which minimized visibility of the researcher. First, its primary call window was small and easily hidden, giving primacy to the participant video windows. Once a conversation was underway, participants were not constantly confronted with evidence of the recording beyond a single name in their contact list, which most participants reported placing away from the video windows or minimizing so as to not be visible at all. Second, when participants logged into their multi-party bridge, although they would see the researcher’s computer on their contact list along with the name of their conversational partner, no video would start automatically. Since starting video required participants to click on a play button next to a contact’s name, participants were told to just start their partner’s video and ignore the researcher’s contact entirely. This quickly led to participants simply logging on to the bridge, starting their partner’s video, and talking. While participants did report awareness of being recorded, none reported feelings of intrusion during interactions. Having solved the access issue and associated environmental validity problem, all that remained was a suitable recording system. Four computers were permanently connected to four separate bridges 24/7 as non-participating contacts. Each recording computer output its video and audio to a video cassette recorder (VCR) with a timer set to record for 9 hours. Figure 3 shows a diagram of the final multi-party bridge setup. With daily tape changes this remote recording system ran seven days a week for nine months.
Maximizing Environmental Validity
Fig. 3. Multi-party bridge remote recording setup
Had more resources been available, an A/V distribution box and three VCRs per computer could have been used for full 24/7 recording. As it was, an automation script (to be discussed shortly) logged all Session activity (logon, logoff, participant connection, errors, etc.). The log helped to pinpoint the times when participants actually began videoconferences, to identify occasions when participants logged in outside of the recorded times, and to diagnose technical problems. However, unplanned contingencies led to a number of challenges, which, along with being useful for refining the setup, also provided usability data in and of themselves.
4 Challenges

Automation of the recording servers was necessary to allow participants to interact on their own schedules. This meant running automation scripts to control Session. The two major areas of automation required were video display and bridge connection.

Automating video display was crucial. As was discussed above, one of the features of Session was that in multi-party bridge situations, no participant video was started automatically. Wave Three’s rationale for manual video startup in bridges is that it prevents participants’ bandwidth from being swamped, which is reasonable. Participants reported having no problems starting video manually, although most would have preferred to be able to set automatic startup. Nevertheless, the need for manual video startup contributed to the environmental validity of this method, since the non-participating researcher’s computer video was not started for participants. Unfortunately, this also meant that, if left unattended, the recording computers would record only participant audio. A script was needed to automatically detect when users had connected to the bridge and then simulate the clicking of the play button next to their contact names. Automating bridge connection was also crucial to ensuring that participant
interactions could be recorded at any time. A script was needed to check that the bridge connection was constantly up, and to reconnect if it went down for any reason.

AutoIt v3 was used to create scripts for the recording computers running Microsoft Windows. Like most scripting applications, AutoIt scripts generally rely on being able to identify and interact with application controls based on unique textual identifiers from the Windows API. However, since Session is a Java application wrapped in the standard Microsoft Windows skin, AutoIt could not access the application controls directly. The solution was to use a combination of indirect methods: watching for the titles of windows and triangulated sets of color changes on the Session interface and desktop to determine what actions to take at any given time. While this inelegant kludge turned out to be effective, in practice it took a little time to smooth out, because for the first few months of the project unexpected color display events prevented script triggers, occasionally preventing video display or bridge connection.

Some unexpected color display events were purely technical and occurred only at the recording computer. These were primarily system or application windows that opened unexpectedly (especially automated application update windows) and covered parts of the Session window or desktop that the script was watching. The quantity and variety of unexpectedly opening windows was surprising, and it took some time to suppress as many of them as possible. Other unexpected color events occurred because of participant actions. An early version of the script expected participants to use assigned user names and searched for color on parts of those predefined names. This was optimistic. When one member of an early participant pair did not use an assigned username (which was intended to prevent identification as well as to enable automation), the script was not triggered and no video was recorded for that participant.

Automated bridge connection and reconnection were initially included in the scripts as a time-saving tool rather than a necessity, as no bridge disconnection ever occurred during testing. However, as the first round of data collection got underway it became apparent that the Session servers were not designed to support users staying logged in to bridges 24/7 as the recording computers were. The first three-month data collection round experienced bridge downtime on 24 occasions: usually for a day and a night, occasionally over a weekend, and once for 4 nights/3 days. After discussions about fixes led to daily reboots of the Session services, the second data collection period of four months experienced only 7 downtimes: usually just overnight, but once for 3 nights/2 days. The bridges could only be restarted by Wave Three support personnel during regular US Pacific Time working hours. While the researcher was checking the bridges several times a day and could notify Wave Three when problems were noticed in these timeframes, some bridge downtime occasions prevented participants from being able to videoconference on demand. Staying permanently logged in to a bridge is admittedly unusual (bridges are designed more for temporary use), but at the time even Wave Three Inc. was not aware that this kind of login would stress the otherwise fairly solid server system. Wave Three is currently diagnosing the cause of these bridge downtimes, an issue which might not have been discovered in ordinary usage. There is, of course, some irony in the fact that, to provide environmental validity, an unusual use of the servers had to be instituted, which then caused some problems.
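The indirect triggering described in this section can be illustrated with a short sketch. It is written in Python rather than AutoIt and is not the project's script: the window title, probe coordinates, colors, and the `sample_pixel` callback are all hypothetical stand-ins, since the paper does not give them. The sketch only shows the decision logic, in which several color probes must agree ("triangulation") and an unexpected window is handled before any probe is trusted.

```python
from typing import Callable, Dict, Iterable, Tuple

Point = Tuple[int, int]
Color = Tuple[int, int, int]

# All values below are hypothetical; the paper gives no titles, coordinates, or colors.
EXPECTED_TITLES = {"Session"}
CONTACT_PROBES: Dict[Point, Color] = {
    (40, 120): (0, 128, 0),
    (44, 120): (0, 128, 0),
    (48, 124): (255, 255, 255),
}


def decide_action(open_titles: Iterable[str],
                  sample_pixel: Callable[[Point], Color]) -> str:
    """Return the next action for the recording computer."""
    titles = set(open_titles)
    # Bridge check: a missing application window is treated as a lost connection.
    if not EXPECTED_TITLES <= titles:
        return "reconnect"
    # An unexpected window (for example an update dialog) may cover the probes,
    # so it is dealt with before any color is read.
    if titles - EXPECTED_TITLES:
        return "dismiss_popup"
    # Triangulation: every probe must show its expected color before clicking.
    if all(sample_pixel(p) == c for p, c in CONTACT_PROBES.items()):
        return "click_play"
    return "wait"
```

A caller would poll `decide_action` on a timer and map each returned string to a concrete desktop action. Requiring all probes to match makes a false trigger less likely, but, as the section notes, it also means that any unanticipated change to the watched pixels (such as a participant choosing a different username) leaves the script waiting.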
5 Results

Given the fairly long road to the technical methodology outlined above, and the challenges faced in implementing it, was the resulting data worth the effort? I believe that it was. Although a combination of technical and scheduling issues with users during setup had to be ironed out to get them talking frequently enough to provide adequate data (which may be the subject of another paper), once most pairs moved into actually using DVC, it quickly became apparent that their comfort level when using DVC in their own environments was high. Indeed, the first pair to complete the study was so comfortable using DVC late at night in their bedrooms that half of their hour-plus conversations would end with one or both falling asleep! Falling asleep ‘with each other’ after a long late-night conversation was a pattern that this pair carried over from their pre-trial behaviors during mobile telephone conversations.

Although that particular manifestation of comfort was unique, common to all pairs was the spontaneous display and employment of physical objects from the local physical environment. This, of course, would be unlikely in a laboratory environment, just as items of such a personal nature would be unlikely in a controlled work environment. Perhaps more interestingly, the real clue to high environmental validity is that displays of objects or behavior by one pair member could be reciprocated by the other. For example, at one point in a conversation one participant decided that he would put on sunglasses. “We’re wearing sunglasses now?” asked his pair member, rummaging around her room to find her own pair, which she then put on (Figure 4).
Fig. 4. Display and reciprocated display of personal items from the local environment
Also common (and reciprocated) were numerous incidents of talking while folding laundry (Figure 5), eating, pulling faces (Figure 5), taboo gestures (e.g. ‘the finger’) and discussion of bodily functions.
Fig. 5. Participants show comfort with the recording situation: Doing laundry and pulling faces
What is important about all of the behavior above is not, of course, that it involves particularly significant incidents. Rather, it demonstrates that, despite most participants reporting being generally aware of being recorded, the environmental validity of the project situation was high enough that the participants felt comfortable enough to frequently do ‘as they would’ instead of only ‘as they ought’.
6 Conclusions

This project on how novices develop familiarity with DVC was premised on the high environmental validity of the trial experience for the novices. Using the multi-party bridges of the Session software allowed the researcher to have virtually the same access to the interactions as that experienced by the participants, without burdening them with equipment in their local environment or with extra software on their computers that would consume resources or bandwidth. Further, the fact that Session displayed user video in separate windows and did not start any contact’s video automatically meant that the recording was almost transparent to the participants. For the industry partner, an interesting side effect of the project connecting to the bridges 24/7 was to turn up a previously undiscovered problem with the Session bridge server system that could prevent on-demand DVC: a critical usability issue. As far as the academic project itself is concerned, the comfort level displayed by the participants bears out the contention that the technical methodology approximated real-world usage closely enough to make valid claims. Thus, while this solution came about as a response to a shoestring budget and technical limitations, and only after rejecting many other solutions, the resulting technical methodology turned out to provide real benefits for environmental validity.
References

1. Brunswik, E.: Perception and the Representative Design of Psychological Experiments, 2nd edn. University of California Press, Berkeley CA (1956)
2. Dourish, P., Adler, A., Bellotti, V., Henderson, A.: Your Place or Mine? Learning From Long-Term Use of Audio-Video Communication. Computer-Supported Cooperative Work 5(1), 33–62 (1996)
3. Gibson, J.J.: The Ecological Approach to Visual Perception. Lawrence Erlbaum, Hillsdale NJ (1979)
4. Heath, C., Luff, P.: Technology in Action. Cambridge University Press, Cambridge UK (2001)
5. Heath, C., Luff, P.: Disembodied Conduct: Communication Through Video in a Multi-Media Office Environment. In: Proc. CHI 1991, pp. 99–103 (1991)
6. Garfinkel, H.: Studies in Ethnomethodology. Prentice-Hall, Englewood Cliffs NJ (1967)
7. Nielsen, J.: Usability Engineering. Academic Press, San Diego CA (1993)
8. Suchman, L.A.: Plans and Situated Actions: The Problem of Human-Machine Communications. Cambridge University Press, Cambridge UK (1987)
The Impact of Moving Around and Zooming of Objects on Users' Performance in Web Pages: A Cross-Generation Study

Hitomi Sato1, Kaori Fujimura1, Lin Wang2, Ling Jin2, Yoko Asano1, Masahiro Watanabe1, and Pei-Luen Patrick Rau2

1 Nippon Telegraph and Telephone Corporation, 1-1 Hikarinooka, Yokosuka City 239-0847, Japan
{sato.hitomi,fujimura.kaori,asano.yoko,watanabe.masahiro}@lab.ntt.co.jp
2 Department of Industrial Engineering, Tsinghua University, Haidian District, Beijing 100084, China
[email protected], [email protected]
Abstract. The rapidly aging population of Japan is now considered a serious social problem. In fact, populations are aging worldwide, and considerable research has been done on the phenomenon. One area that has been researched is Web page design. Some common guidelines for Web content or page designs make it difficult or impossible for people with certain cognitive or visual disabilities to read moving text quickly enough. Movement can also distract these people to such an extent that the rest of the page becomes unreadable, and people with physical disabilities might not be able to move quickly or accurately enough to interact with moving objects [6]. With this in mind, experiments were conducted on 24 people in their twenties and thirties in Yokosuka-shi, Japan and on 18 elderly people in Beijing, China. The results were then compared. Keywords: elderly people, young people, Web sites, object moving, object zooming, time, error, visual fatigue, satisfaction, and workload.
design, such as the Web Content Accessibility Guidelines (WCAG) of the World Wide Web Consortium (W3C) [6], consider Web accessibility from the viewpoint of universal design and explain how to make Web content accessible to people with disabilities. However, few studies have investigated the more specific patterns of performance with Web pages that may be relevant to elderly people [4]. Many people have problems using Web pages because of visual impairment, memory loss or declining cognitive ability, and because of lack of knowledge of information technologies and the resulting lack of skill with computers typical of senior citizens. Considering this set of challenges, which are unique to senior citizens, we have researched guidelines for Web page design for elderly people.
2 Purpose Some common guidelines for Web content or page design make it difficult or impossible for people with certain cognitive or visual disabilities to read moving text quickly enough. Movement can also distract people with cognitive disabilities to such an extent that the rest of the page becomes unreadable, and people with physical disabilities might not be able to move quickly or accurately enough to interact with moving objects [6]. With this in mind, the current experiments were conducted on 18 elderly people in their fifties, sixties, and seventies in Beijing, China [5]. To ascertain the special characteristics of the elderly when performing tasks with objects moving around and zooming, we conducted the same experiment with 24 people in their twenties and thirties in Yokosuka-shi, Japan, and then compared results. The goal of this experiment was to investigate specific patterns of performance with two kinds of movement (moving around and zooming) in Web pages that may indicate differences in ability between the two age groups.
3 Method

We adopted the same two research questions as Wang et al. [5] used in their experiments.

Research Question I. Which combination of moving around and zooming into and away from objects causes the least visual fatigue and workload and the highest satisfaction and performance?

Research Question II. Which speed of object movement causes the least visual fatigue and workload, and the best satisfaction and performance?

3.1 Participants

The current experiments were conducted on 24 people (12 men and 12 women) in their twenties and thirties in Yokosuka-shi, Japan. The ages of the participants ranged from 25 to 38, with a mean of 30.5 years and a standard deviation of 4.41 years. They had been educated for an average of 16.58 years with a standard deviation of 2.48
years. They generally used computers in their jobs and were used to technologies related to Web sites and computers.

3.2 Procedure

The experiment was conducted in a quiet experimental room in a laboratory building belonging to Nippon Telegraph and Telephone Corporation in Yokosuka-shi, Japan. The young people finished their experiments in approximately half an hour. At the beginning of the experiment, each participant was given instructions on the aim of the experiment, how to perform each task and how to fill out questionnaires. They were then asked to fill in a general information questionnaire concerning personal characteristics such as age, years of education, and experience of using computer technologies. The experiment comprised six exercises and six experiment tasks. The purpose of the exercises was to enable participants to become familiar with the operations in the experiment tasks. Six types of visual stimuli using a combination of objects, either stationary or moving at three different speeds (60 pixels per second, 200 pixels per second and 300 pixels per second), and objects with or without zoom in Web pages were employed on tasks and randomly presented to participants on a 15-inch monitor (Table 1 and Fig. 1).

Table 1. Description of experimental prototypes

Prototype    Zooming speed (pix/s)    Moving speed (pix/s)
A1           0                        0
A2           3.5                      0
A4           3.5                      200
B1           0                        60
A3 or B2     0                        200
B3           0                        300
Fig. 1. Movements and transformations in flash animations
Participants were asked to look at a group of objects in Web pages and click on the one that was identical to a previously presented sample object (Fig. 2).
Fig. 2. The interface of the experiment
Each task consisted of six practice pages and ten experimental pages. Each participant’s task time and errors for each task were measured by computer. After finishing each task, each participant was asked to fill out a questionnaire consisting of questions about eye fatigue, satisfaction, and workload. Questions were answered on a seven-point scale. Each variable and questionnaire is described in detail below:

Time. Time was the total time required to complete each visual search task. It was recorded to the nearest one-thousandth of a second by computer.
Error. Error was the total number of excessive clicks utilized to perform the tasks.
Visual fatigue. Visual fatigue was measured by a vision perceptive evaluation questionnaire.
Satisfaction. Satisfaction was the score obtained through a general satisfaction questionnaire, which was a modified version of the satisfaction measure utilized by Cook [2].
Workload. Workload was the score obtained through a general task load questionnaire used by the National Aeronautics and Space Administration [3].
4 Results

The results showed differences in young people’s performance (as measured by task time and error), subjective visual fatigue, subjective satisfaction and subjective workload between tasks with each type of visual stimulus.

4.1 Results of Research Question I

Results of research question I are given in Tables 2 and 3.
Table 2. Overview of results 1 (Research question I)

Dependent variables   A1                  A2                  A3                  A4
                      M         SD        M         SD        M         SD        M         SD
Time (ms)             1790.91   328.75    1900.57   351.78    1802.28   287.02    2120.43   403.53
Error rate            0.04      0.08      0.10      0.16      0.10      0.12      0.19      0.24
Visual fatigue        15.71     8.05      24.08     7.91      21.83     9.57      22.71     7.79
Satisfaction          2.25      1.89      3.97      0.91      4.45      0.90      4.55      0.95
Workload              46.93     18.99     59.64     17.73     59.81     15.30     63.87     16.57
Table 3. Overview of results 2 (Research question I)

Dependent variables   F(3,92)        p
Time (ms)             3.4798         0.1907*
Error rate            H=11.20278     0.0107*
Visual fatigue        4.7044         0.00422*
Satisfaction          3.9212         0.01105*
Workload              H=13.46240     0.0037*
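As a worked illustration of how an F value of the kind reported in Table 3 relates to the means and standard deviations in Table 2, the following minimal sketch computes a one-way ANOVA F statistic from summary statistics alone. It assumes that the four prototypes are treated as independent groups of 24 participants each, which is what the reported degrees of freedom (3, 92) suggest; the function name is illustrative and is not taken from the paper.

```python
from typing import Sequence, Tuple


def anova_from_summary(means: Sequence[float], sds: Sequence[float],
                       n: int) -> Tuple[float, int, int]:
    """One-way ANOVA F from per-group means and SDs, equal group size n."""
    k = len(means)
    grand_mean = sum(means) / k  # unweighted mean is valid because group sizes are equal
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n - 1) * sum(sd ** 2 for sd in sds)
    df_between = k - 1
    df_within = k * (n - 1)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Visual fatigue row of Table 2 (A1 to A4), assuming 24 participants per condition.
f_stat, df1, df2 = anova_from_summary(
    means=[15.71, 24.08, 21.83, 22.71],
    sds=[8.05, 7.91, 9.57, 7.79],
    n=24,
)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

For the visual fatigue row this yields degrees of freedom (3, 92) and an F value of about 4.70, close to the 4.7044 in Table 3; a small difference is expected because the tabulated means and standard deviations are rounded.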
As presented in Table 3, all dependent variables were affected by the experimental prototype. Young people performed tasks faster and more accurately without moving and zooming objects (A1) than with moving and zooming objects (A4) (time: F(3, 92) = 3.4798, p = 0.1907*; error rate: H = 11.20278, p = 0.0107*). Moreover, tasks without moving and zooming objects (A1) caused less subjective visual fatigue than tasks with moving or zooming objects (A2 and A3) or with moving and zooming objects (A4) (F(3, 92) = 4.7044, p = 0.00422*). Participants tended to find more satisfaction in tasks with moving objects (A3) or with moving and zooming objects (A4) than in tasks without moving and zooming objects (A1), and in tasks with moving and zooming objects (A4) than in tasks with zooming objects (A2) (F(3, 92) = 3.9212, p = 0.01105*). Their subjective workload was higher in tasks with moving objects (A3) and with moving and zooming objects (A4) than in tasks without moving and zooming objects (A1) (H = 13.46240, p = 0.0037*). Differences in the mean values of performance time, visual fatigue and satisfaction were assessed with LSD multiple comparison procedures; error rate and workload were assessed with the Kruskal-Wallis nonparametric test because the data were not normally distributed.

4.2 Results of Research Question II

Results of research question II are given in Table 4.
Table 4. Overview of the results (Research question II)

Dependent variables   B1                  B2                  B3                  F(2,69)        p
                      M         SD        M         SD        M         SD
Time (ms)             1720.76   369.95    1802.28   287.02    1788.08   307.97    7.3304         0.00130
Error rate            0.04      0.08      0.10      0.12      0.19      0.19      H=14.78435     0.0006*
Visual fatigue        15.54     6.23      21.83     9.57      19.46     8.21      3.0201         0.05529
Satisfaction          4.04      0.80      4.45      0.90      4.32      0.96      1.3359         0.26964
Workload              52.65     18.15     59.81     15.30     56.42     15.80     1.1360         0.32703
As shown in Table 4, only the error rate was affected by the experimental prototype (H = 14.78435, p = 0.0006*). More errors were found in tasks with objects moving at speeds up to 300 pixels per second (B3) than in tasks with objects moving at speeds up to 60 pixels per second (B1) (z = 3.354102, p = 0.000796*). Because the data were not normally distributed, these differences in error rate were assessed with the Kruskal-Wallis nonparametric test and the sign test.
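For readers unfamiliar with the H statistic reported for error rate, the following minimal sketch shows how a Kruskal-Wallis H value is computed from raw observations, using midranks for tied values and the standard tie correction. The per-participant error rates from the experiment are not available, so the sample data below are purely hypothetical, and the function name is illustrative.

```python
from typing import Dict, Sequence


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H with midranks for ties and the standard tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    rank_of: Dict[float, float] = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # number of observations sharing this value
        rank_of[pooled[i]] = (i + 1 + j) / 2   # midrank of 1-based positions i+1 .. j
        tie_term += t ** 3 - t
        i = j
    if tie_term == n_total ** 3 - n_total:
        raise ValueError("H is undefined when all observations are equal")
    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    return h / (1 - tie_term / (n_total ** 3 - n_total))


# Hypothetical error rates for three speed conditions (not the study's data).
slow = [0.0, 0.0, 0.1]
medium = [0.0, 0.1, 0.2]
fast = [0.1, 0.2, 0.3]
h_value = kruskal_wallis_h([slow, medium, fast])
```

The statistic depends only on the ranks of the pooled observations, which is why it is a common choice when error counts are heavily skewed toward zero, as the reported means and standard deviations for error rate suggest.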
5 Discussion

The results indicated some similarities between elderly and young people, as follows:

1. Both groups can perform tasks without zooming and moving objects more quickly and accurately than ones with zooming and moving objects.
2. Their visual fatigue is caused less by tasks without zooming and moving objects than by ones with zooming and moving.

We also found the following differences between elderly people and young people:

3. Tasks with moving objects reduce elderly people’s performance level in terms of time and error.
4. Tasks with zooming objects increase elderly people’s subjective visual fatigue.
5. Young people are more satisfied with a task with object moving and zooming than one with object zooming.
6. Tasks with object zooming cause a greater subjective workload on young people than tasks without object moving and zooming. Tasks with object moving and zooming also cause a greater workload than those without object moving and zooming.
7. Elderly people can perform tasks better at a slow/middle speed than at a high speed in terms of time and error, while changes in speed of object moving do not affect young people’s performance.

Although the results common to elderly and young people are ordinary and to be expected, one difference was that elderly people’s performance diminished with moving objects in terms of time and error, and that tasks with
object zooming increase their subjective visual fatigue. This indicates that elderly people feel great visual fatigue with object zooming, but it is object moving that diminishes their performance level in terms of time and error. It was not clear whether it took longer for participants to find target objects or whether they clicked on them out of experience. In addition, it was notable that young people are more satisfied with tasks involving object moving and zooming than with those involving zooming only, and more satisfied with tasks involving object moving, or moving and zooming, than with those without moving and zooming. This seems to originate from their need for visual stimulation, which can also be found in computerized games. Indeed, many of our participants commented that object moving and zooming made them work harder on tasks. On the other hand, they feel more subjective workload in tasks with object moving, or with object moving and zooming, than in those without. This indicates that their satisfaction and subjective visual fatigue do not match completely. Finally, elderly people can perform tasks with objects moving at a slow speed faster and more accurately than with those moving at a high speed, while young people were not affected by the speed factor. We think that this is because objects moving at speeds up to 300 pixels per second were not fast enough to affect young people.
6 Conclusion

Based on the results, we can offer the following suggestions to Web designers:

1. On Web sites intended to increase elderly people’s performance level in terms of time and error, avoid moving objects or limit object moving speed to below 60 pixels per second.
2. To avoid subjective visual fatigue in elderly people, limit object moving speed to under 200 pixels per second and do not make objects zoom in or out.
3. If you are young, be aware of your tendency to seek object moving or zooming when you design Web sites for elderly people.

However, it remains unresolved whether users take more time to find the targeted objects (perception) or to click on them (behavior). We hope our results can contribute to developing specific guidelines for Web page design that will help young Web designers and elderly people, and we will continue this line of study to improve Web site design guidelines for elderly people.
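The first two suggestions above could be turned into a simple design-time check. The sketch below is only one possible reading of them: the `Animation` type, the function name, and the treatment of the boundary values (a speed of exactly 60 or 200 pixels per second is flagged) are assumptions made for illustration and are not part of the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Animation:
    moving_speed: float   # pixels per second; 0 means the object is stationary
    zooming_speed: float  # pixels per second; 0 means the object does not zoom


def check_for_elderly_users(anim: Animation) -> List[str]:
    """Return warnings based on suggestions 1 and 2 (thresholds read literally)."""
    warnings = []
    if anim.moving_speed >= 60:
        warnings.append("moving speed is not below 60 px/s: time and error may suffer")
    if anim.moving_speed >= 200:
        warnings.append("moving speed is not under 200 px/s: risk of visual fatigue")
    if anim.zooming_speed > 0:
        warnings.append("object zooms: risk of visual fatigue")
    return warnings


# Prototype A4 from Table 1: zooming at 3.5 pix/s while moving at 200 pix/s.
for message in check_for_elderly_users(Animation(moving_speed=200, zooming_speed=3.5)):
    print(message)
```

A check like this could run over the animation settings of a page before release; it says nothing about the third suggestion, which concerns the designer's own preferences rather than a measurable property of the page.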
References

1. Cabinet Office, Government of Japan: White Paper on Aging Society 2005 (2006). Retrieved October 31, 2006 from the World Wide Web: http://www8.cao.go.jp/kourei/whitepaper/w-2006/zenbun/18index.html
2. Cook, J.R.: Cognitive and social factors in the design of computerized jobs. Purdue University (1991)
3. Hart, S., Staveland, L.: Development of NASA TLX (Task Load Index): results of empirical and theoretical research. Elsevier Science, North-Holland, Amsterdam (1988)
4. Murata, A., Moriwaka, M.: Age Differences in Web Navigation Performance – Effects of Character Size, Grouping, Density of Page, Layout of Items, and Ease of Detection of Links. The Institute of Electronics, Information and Communication Engineers Japan J90-D(3), 788–797 (2007)
5. Wang, L., Sato, H., Jin, L., Rau, P.P., Asano, Y.: Perception of Movements and Transformations in Flash Animations of Older Adults. In: 12th International Conference on Human-Computer Interaction
6. World Wide Web Consortium: Web Content Accessibility Guidelines 1.0 (1995). Retrieved November 1, 2006 from the World Wide Web: http://www.w3.org/TR/WAI-WEBCONTENT/
Entelechy and Embodiment in (Artistic) Human-Computer Interaction

Uwe Seifert and Jin Hyun Kim

Universität zu Köln, Albertus-Magnus-Platz, 50923 Köln, Germany
{u.seifert,jinhyun.kim}@uni-koeln.de
Abstract. This paper points out the complementarity of HCI and cognitive science in studying agents’ interactions with their environments. Embodied interaction is related to embodied and distributed cognition. A theoretical framework based on the distinction “potentiality/actuality” is outlined as an approach to the concept of “reality” in HCI and research on presence and copresence. Within this framework presence and copresence are specified in connection with an agent’s potentiality to act upon its environment, i.e. to actively explore and manipulate its environment. Methodological problems concerning theoretical and empirical research on interaction are sketched. To explore new methodological ideas New Media Art is used as a test-bed and an ongoing exploratory experiment on communicating "emotions" through robots is briefly reported. Keywords: reality, presence, copresence, methodology, New Media Art, robotics, emotion, embodied interaction, embodied cognition, interactionism, distributed cognition.
other words: "the perceptual illusion of non-mediation, i.e. the extent to which a person fails to perceive or acknowledge the existence of a medium during a technologically mediated experience" [18]. Steuer seems to support a somewhat different position at first glance, claiming that presence in a technologically mediated environment is concerned with "mediated experience", and provides a view of mediated communication that transforms the traditional idea of sender–receiver communication [26]. But his term "mediation" seems to refer to the aspect of technological mediation, which aims to represent a technologically non-mediated environment, as can be seen in one dimension he suggests for discussing telepresence, "vividity". "Vividity" means "the representational richness of a mediated environment as defined by its formal features, that is, the way in which an environment presents information to the senses" [26]. Steuer identifies this property with what Rafaeli calls "transparency" [23]. It is obvious that by "representational richness" Steuer means a high degree of representation of objects in a physical environment. Vividity and transparency as properties of a mediated environment hence seem to contribute to the "perceptual illusion of non-mediation", to use the words of Lombard and Ditton [18]. What Steuer stresses with the term "mediation," which he uses to refer to a specific characteristic of a technologically mediated environment, is an aspect of presence as mediated communication, in which a user acts both as sender and as receiver. But in our opinion, mediated communication is not only induced by technological mediation, but is basic to each form of communication, for which the model of sender-receiver communication cannot be taken for granted.

Mantovani and Riva [20] take a distinct position: they understand reality in terms of (cultural) mediation. They claim that there are no "natural" objects, which are "unmediated, pre-technological, and pre-cultural." "'Reality' is not 'outside', … On the contrary, it is continually being negotiated and filtered by artifacts, by means of which we adapt the environment to our needs and at the same time adapt ourselves to the environment in order to exploit the affordances it offers us" [20]. For Mantovani and Riva, it doesn't matter whether the environment is "natural" or "artificial", since "[a]ll worlds are, in various ways, constructed" [20]. From a constructionist point of view of reality, they propose the concept of presence as a social construction.

Taking into account these current discourses, we attempt to develop a framework for studying presence in relation to "reality", especially regarding embodiment as discussed in Human-Computer Interaction (HCI) and in cognitive science.
2 Embodiment in Cognitive Science and HCI: Embodied Interaction and Embodied or Distributed Cognition "Embodiment" seems to be of importance to philosophy, HCI and cognitive science. In "embodied cognition" [22] and "distributed cognition" [12, 13] as well as in "embodied interaction" [7] a system and parts of its environment are viewed as coupled and forming one new system. The functions and actions attributed to the system are now assigned to the whole system consisting of the system or agent and the parts of its environment. Under this perspective two reciprocal relationships are important:
Entelechy and Embodiment in (Artistic) Human - Computer Interaction
one is the reciprocal relationship between the agent and its environment, the "agent-environment fit". The other is the reciprocal relationship between action and perception [9, p. 5]: "Perception provides the information for action, and action generates consequences that inform action." In embodied and distributed cognition in cognitive science and in embodied interaction in HCI, the term "embodiment" seems to refer to very similar, though slightly different, ideas. The difference stems from the different research goals of cognitive science and HCI. Cognitive science is mainly interested in explaining the cognitive capacities and functions of different biological and technological systems acting in their environments. Its general goal is to set up an explanatory science of cognitive systems. HCI is mainly an applied engineering science that designs software and hardware systems with established methods, or develops such methods, to solve specified tasks. In general, HCI, unlike cognitive science, is not interested in solving epistemological problems. The traditional approaches in cognitive science and HCI focused on the system without any further reference to the environment or to the coupling of the system with its environment. For example, cognitivism and connectionism as classical approaches in cognitive science study cognitive processes "in" the system. Cognitivism models and explains the (human) mind as a system situated in the "head" and based on physical symbol processing. One main technical advantage of this approach is the opportunity to use variable binding and the representation of recursive structures to model cognition and perception. Connectionism uses associative sub-symbolic processing in the style of the brain to model and explain cognitive processes. In connectionism the system under study is the brain or mind/brain.
Connectionism provides technical tools well suited to modeling learning, self-organization and classification in cognition. Both approaches focus on the cognitive architecture of the mind/brain and neglect the body and the body’s coupling with its environment. These traditional approaches have been complemented within cognitive science by two new research directions: embodied and distributed cognition. These research directions suggest that Western philosophy, psychology and neuroscience seem to have focused on the wrong distinction: they studied the mind-body relation instead of exploring the animate-inanimate distinction, which underlies the idea of studying primarily the actions or interactions of systems within their environments. Probably the more fruitful primary problem is not the connection between "the mental" and "the physical" but the connection between living and non-living matter [5, p. 763], together with the dependence of the actions taken by a system on its environment. Arbib and Hesse [3, p. 39] describe the basic idea in connection with the term "embodiment" in the following manner: "The nature of our embodiment helps us to create the metaphors through which we organize our multiple experiences. … Our capacity for purposeful action lays the basis for the relation between thought and action and gives a certain practical character to thought. Human thought is not purely abstract, but a mode of praxis. There is no pure cognition because we are essentially embodied. … To come to terms with the thinking subject is to come to terms with the actions and practices its thoughts are implicated in. The theory of the embodied subject claims to transcend mind/body dualism, …" What distinguishes both embodied and distributed cognition from the embodied interaction of HCI is that they focus on an explanation of the processes and operations
U. Seifert and J.H. Kim
underlying actions carried out by a system to act in and on its environment. They focus on the computational explanation of how "habits" or "functions" are embodied or realized in a system, taking its environment into account. As Arthur W. Burks pointed out, Charles S. Peirce’s term "habit" seems to be general enough to cover the central idea behind embodied and distributed cognition if "habit" is interpreted as any set of rules embodied in physical and social systems [3, p. 44]. The system may be an individual of a species, and the rules would cover some innate or learned action patterns. Another example of a system is a social institution and its rule-governed "behavior patterns". Distributed cognition has been used as a research strategy in HCI [13]. Hutchins [12] proposed that the classical approaches to cognitive architecture in cognitive science – cognitivism and connectionism – have to be extended. In distributed cognition the idea is to extend a system’s cognitive architecture to its environment [12, 13]. The same idea can be found in the development of some connotations of the German term "Geist", which has been studied in connection with cybernetics and computer science by Günther [10]. In the idealistic tradition of Hegel or Hartmann one may distinguish between "subjektiver", "objektiver", and "objektivierter Geist", which may be translated as "subjective", "objective", and "objectivized mind" or "objectification of mind". The term "subjective mind" denotes the mind nowadays studied in psychology and cognitive science, and within this framework the mind-brain relationship becomes an important topic of research. Consciousness, awareness, perception, emotion and volition are parts of the (subjective) mind. The nearest English translation of "objective mind" is "culture" as used in cultural anthropology. Culture may be conceived of as habits formed by a group of persons or an institution. One can think of the special atmosphere of a place or an institution.
In cognitive science the meaning of "cognitive artifacts" seems to approximately capture the sense of "objectification of mind". Cognitive artifacts [14] are viewed as extensions of the cognitive architecture of the (human) mind, and in this sense one can think of "objectifications of mind" as extensions of the "subjective mind". This would mean that distributed cognition and idealistic philosophy share some common assumptions. Hollan et al. [12] developed an approach based on the idea of distributed cognition taking into account the environment in studying human computer interaction. This approach, however, seems to entail two problems: 1) Although Hollan et al. [12] take the environment into account in investigating interactions, they focus on the "computer is on the box" approach instead of the "computer" in the environment. But in the near future the interaction of humans with (autonomous) robots and communication through robots in virtual and "real" environments will become of importance [21, 22]. 2) It is centered on well-defined working tasks that do not include creative artistic activities, which are an important field of research on presence and copresence [16]. In HCI, "[e]mbodiment is about engaged action rather than disembodied cognition" (13, p. 198). Dourish [7] distinguishes tangible and social computing as research areas of HCI that are relevant to embodiment as it is covered in his term "embodied interaction". For Dourish [7] tangible and social computing are forms of embodiment. In these research programs, computing is recognized as the physical embeddedness of action in the world and its social embeddedness in systems of meaning. As Dourish [7] remarks, there is a mutual constitution of action and meaning through embodied interaction or practice. Tangible computing is characterized as "an attempt to move
computation out of the "box on the desk" and into the environment." [7, p. 198]. Main aspects of tangible computing come into focus "[by] capitalizing on the contextual factor like presence, location, and activity it sets out to unify computational experience and physical experience, …" [7, p. 198]. Robotics seems to be one relevant field to "move computation into the environment" in which presence, location, and activity as well as social factors play an important role in embodied interaction. For example, in music-making it seems to be of importance to have a physical – a corporeal – realization of the improvising mechanisms of a computational system to catch the attention of the human performer in real-world music improvisations between musician and machine as in the case of the robot musician "Haile" developed by Weinberg and Driscoll [27].
3 Reality, Agency, and Interactionism
But how can embodied interaction be studied, especially in the case of perception? Before investigating presence and copresence in computing one should distinguish three concepts of reality: a) appearance / reality, b) space-time constraints on "the" reality described by physics, and c) actuality / potentiality. The distinction between appearance and reality, which supposes that the phenomenon is not as it appears, is closely related to the connotation of the term "reality" as described by physics. In general this physical "reality" sets constraints on actions. This position is often associated with the epistemological position of scientific realism, which supports the thesis that there is some real nature of a phenomenon behind its appearance. The last meaning of "reality", c), seems to be most important to HCI and is fundamentally different from the first two meanings a) and b). We think that this old, "outdated" meaning as it is captured in the binary relation expressed by "bewirken" or "verwirklichen" – in the sense of "to achieve", "as an effect of", or "to realize" – is more important to HCI in order to understand interaction and interacting entities and their realities than the physical relation expressed by "wirken" or "verursachen" – in the sense of "to have an effect on" or "to cause" – that is at the core of modern physics.1 This third meaning c) goes back to Aristotle and Aquinas and is related to the distinction between potentiality (Greek: dunamis; Latin: potentia, possibilitas) and actuality (Greek: energeia; Latin: actus/actualitas). Dunamis is the potentiality to achieve a certain goal and energeia the process of achieving a goal. Often the goal (Greek: telos; Latin: finis) to be achieved lies outside the process of achieving the goal.
But in some cases the goal (telos) to be achieved, the achieved goal or function (Greek: ergon; Latin: opus, operatio), and the process of achieving the goal (energeia) – activity or actualization – become the same thing, which is called entelechy (Greek: entelecheia; Latin: actualitas). Aristotle uses "entelechy" to denote actualization in general, or the resulting actuality or perfection of something in particular, as opposed to its mere potentiality. Let us illustrate these concepts with some examples. Let us first have a look at perception: the (perceptual) process (energeia) that realizes a (mental) perceiving function (ergon; task, function, product) and its perceptual content (telos;
1 For a logical analysis of these two relations see [8, chapter 3. Analyse des Kausalprinzips, pp. 166-180].
end, success) become one "entity". According to Aristotle, the potentiality (dunamis) in perception is the special perceptual sense (aisthêtikon). Energeia (actus, actualitas) is perceiving as the act or process of perception (aisthêsis). And ergon is perception as the mental function (aisthêma). The percept – the content of the perceptual act – is called aisthêton. Another example is a program or code for the simulation of an environment. In its potentiality it is the environment. If the code is executed we get the environment. This is the actualization of the code and at the same time the actuality of the simulated environment. Execution of the program constitutes a "real" environment in relation to the potential environment of the unexecuted code. Interaction too may best be conceived in these terms: a system and its environment provide the potentiality for a specific form of interaction; the process of a specific interaction realizes a specific function (ergon) to achieve a goal (telos). In some cases the interaction is the goal itself (e.g. communication). Some special goals (telos) can only be achieved as a function (ergon) in actualization (energeia); this special case is called entelecheia or actualitas. "Entelechy" or "actualitas" is synonymous with "reality". For example, the German term "Wirklichkeit", meaning "reality", is a translation of the Latin "actualitas", which goes back to the Greek terms "energeia" and "entelecheia". In this sense reality may be conceived of as some kind of resistance. This may be a wall or a logical problem. One may think of physical or mental resistance. These conceptual distinctions underlying interaction may be subsumed under the term "agency".
If in cognitive science one wants to stress the importance of interaction and agency instead of the embodiment of disembodied function or the extension of the system’s cognitive architecture to its environment one may speak of "interactionism" [1, 2] instead of "embodied or distributed cognition" as opposed to cognitivism and connectionism.
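The program example from this section can be made concrete with a small sketch. It is only an illustration under our own assumptions: the class names `EnvironmentProgram` and `RunningEnvironment`, the one-dimensional corridor, and the `walls` parameter are hypothetical and do not come from any system discussed here. The unexecuted description stands for the environment in its potentiality; calling `execute` stands for actualization; a refused move stands for reality as resistance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProgram:
    """Unexecuted code: the environment only in its potentiality (dunamis)."""
    width: int        # number of cells in a one-dimensional corridor
    walls: frozenset  # cells that resist movement

    def execute(self):
        """Actualization (energeia): running the code yields the environment."""
        return RunningEnvironment(self.width, self.walls)


@dataclass
class RunningEnvironment:
    """The executed simulation: an environment an agent can act in."""
    width: int
    walls: frozenset
    position: int = 0  # current cell of the agent

    def move(self, step):
        """Try to move the agent; return False when the environment resists."""
        target = self.position + step
        if target < 0 or target >= self.width or target in self.walls:
            return False  # resistance: a wall or the edge of the corridor
        self.position = target
        return True
```

In this sketch an agent cannot act on an `EnvironmentProgram` at all, since it offers no `move` operation; only the object returned by `execute` can be explored and can resist.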
4 Perception, Presence, and Copresence in Interaction
From our analysis of the different goals of HCI and cognitive science and of the different meanings of "reality" in connection with interaction, summarized as agency and resulting in interactionism, there follows a phenomenalistic approach to perception in HCI. Concerning the user, HCI is not interested in modeling the mechanisms or processes underlying perception and cognition. Research in HCI primarily uses this knowledge to achieve a goal. From the point of view of embodied interaction it does not matter whether there is a "real" or virtual body acting in a virtual environment or a real environment represented in a "realistic" manner. What does matter is the possibility to explore and manipulate entities in the environment, i.e. presence as the system’s potentiality to act. The presence of a unit means actual presence but, as the analysis of actuality and potentiality has shown, does not necessarily entail physical presence. Presence means not passively receiving information, but the potentiality of a system to actively explore its environment and to manipulate and act on objects, events and other agents in its environment. Further, copresence in this phenomenalistic framework means a system’s potentiality to relate itself to the environment and to objects, events and other agents in its environment. The actualization of presence and copresence, that is, their actuality or entelechy, is the constitution of reality (actualitas
or Wirklichkeit) through interaction. If presence and copresence become actuality then the system is in the state of perceiving and coperceiving. So perception, coperception and reality are functions of an immediate interaction of a system with its environment. This embodied interaction is realized through operations working on representations of affordances that are present in the stimulus information. As a consequence one main goal of research on interaction is to study affordances [9]. The main problem is how to study the development of affordances for some agents and their environment and the representation of affordances in the stimulus information. This means that it is important to discover the affordances that are relevant to an agent and their environment under study. The building up of representations of affordances and new operations to use these representations is another area of research. This is the problem of learning and induction. What are the task constraints and the bodily requirements? How should affordances, the development of affordances in systems, and affordances in environments and actions be studied?
5 Methodological Problems of Empirical Studies on Interaction
How can the affordances underlying the interaction that constitutes "reality" be investigated empirically? As in psychology and cognitive science, in HCI there is a tension between investigations in the laboratory and investigations in real-world situations. In laboratory situations, mental functions are operationally defined and associated with the mind’s capacities to process information. For example, in experimental research memory is viewed as some kind of "internal storage". On the other hand, as we showed in our analysis of distributed cognition, if one studies behavior outside the laboratory, it seems that some parts of a system’s "memory" are outside the system, and a computational modeling of cognitive processes should take this into account. The main problem is to identify the units of behavior and the affordances that are relevant for a system in a given situation or environment. To achieve this goal it will be important to bring together experimental studies in laboratories and in "real" environments [24] in order to gain some hints as to where to look for affordances. How can we gather data on interactions of humans with artificial life systems and robots in augmented environments? If we think of coupled interacting systems, e.g. humans and robots interacting, as forming a new system – extending Licklider's "man-computer symbiosis" [17] – this new system can then be thought of as a new species to be studied in its environment. In general we can conceive of humans, computers, and robots as well as software agents as agents, so that this approach may be thought of as studying the interaction of coupled agents forming a system or complex agent coupled to its environment. Viewed from an ethological-psychological perspective, New Media Art [16] may be regarded as a test-bed to gain data on the interactions of agents in complex environments.
One may especially investigate agents consisting of humans and robots in an augmented environment. In such a research situation one must conceptually distinguish between interaction within the system formed by agents, i.e. a human as an agent interacting with a robotic or software agent, and interaction of the whole system with its environment. The
latter form of interaction may be called adaptation. Within adaptation, ultimate and proximate adaptation should be distinguished. Proximate adaptation is determined by presence and copresence or perception and coperception. However, coperception in this sense means that the interacting agents that form the system perceive their state in relation to their environment. Ultimate adaptation is a relation between the goals developed by the agent system, the changes it induces in the environment, and the constraints set by the environment. The main and most difficult point is to find the behavioral units for the further analysis of complex behavioral patterns, in order to uncover the motivational or intentional basis of the agent’s interaction with its environment and other agents. We are beginning to extend occasional observations of human-computer interaction in New Media Art to more systematic observations, using time series analysis as used in ethology [4, 6, 19] and applied to robot interaction, in order to discover behavioral patterns and their underlying motivations or intentions, and to extend our approach to experimental work. A first explorative experimental study has been undertaken by one of our students in cooperation with colleagues from the technical university in Stockholm, Sweden. Our student Birgitta Burger, in her master's thesis, currently in preparation, designed an exploratory experiment concerning expression and communication of and through robots using Lego Next robots. The basic idea is to test experimentally whether it is possible to recognize intended "emotions": whether rudimentary movements of robots representing or "expressing" these "emotions" can be detected by recipients. The guiding principle behind this experiment is the idea of "expressive poverty", that is, the extent to which it is possible to recognize "emotions" based on rudimentary movements. The robot exhibits three different kinds of movements: slow, fast, and interrupted.
While moving, the robot also performs some kind of "gestures" with its "arms". These movements and gestures are programmed to represent or "express" three categories of intended "emotions": anger, fear, and happiness. One can think of this situation as if the programmer intends to communicate emotions through the robot to another human being. Likewise, it is possible to conceive of the situation as a robot expressing emotions. A questionnaire is used to record answers concerning the "expressiveness" of the robot’s "movements" from persons observing the moving and gesturing robot. As a next step in the experiment, melodies composed especially for this purpose are used. The same sequence of pitches has been used throughout, but certain parameters have been altered to change the "expressiveness" of the tone sequence. These melodies are played in parallel to the movements and gestures of the robot, and the test persons are requested once again to rate on several scales the emotional content "expressed" by the robot. The interpretation of the data is under way.
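One simple way to summarize answers of this kind is to tabulate which intended "emotion" observers report for each presentation. The following sketch is not the analysis used in the study, whose interpretation is still under way. It assumes, purely for illustration, that each answer has been reduced to a single label out of the three categories named above; the function names and the toy answers are hypothetical.

```python
from collections import Counter

# The three intended "emotions" named in the text.
EMOTIONS = ("anger", "fear", "happiness")


def confusion_counts(answers):
    """Count (intended, recognized) pairs from questionnaire answers."""
    counts = Counter()
    for intended, recognized in answers:
        if intended not in EMOTIONS or recognized not in EMOTIONS:
            raise ValueError(f"unknown emotion label: {intended!r}, {recognized!r}")
        counts[(intended, recognized)] += 1
    return counts


def recognition_rate(counts, emotion):
    """Share of presentations of `emotion` that observers labeled correctly."""
    shown = sum(n for (intended, _), n in counts.items() if intended == emotion)
    if shown == 0:
        return None  # this emotion was never presented
    return counts[(emotion, emotion)] / shown


# Hypothetical answers for illustration only; not data from the study.
toy_answers = [
    ("anger", "anger"),
    ("anger", "fear"),
    ("fear", "fear"),
    ("happiness", "happiness"),
]
toy_counts = confusion_counts(toy_answers)
```

Comparing such rates between the movement-only condition and the condition with melodies would be one possible, assumed, way to ask whether the added tone sequence changes how reliably an intended "emotion" is recognized.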
6 Summary and Some Conclusions
We pointed out differences and similarities concerning research on embodiment in cognitive science and HCI. For cognitive science we mentioned embodied cognition and distributed cognition as important research fields in computational studies of embodied cognitive processes and of the mechanisms underlying a system’s interaction with its environment. These special approaches seem to be of relevance for HCI. HCI has been described as embodied interaction. Combining these approaches, i.e. embodied
interaction, embodied and distributed cognition, in research on New Media Art, we hold the viewpoint that "interactionism" seems to be the most appropriate and neutral term to characterize a combination of the research interests and strategies of cognitive science and HCI. Concerning the debate on "reality" in presence and copresence, we argue that although "reality" as described by physics puts constraints on interactions, this concept of "reality" is not as important as is often assumed. What seems to be of greater importance is a system's potentiality to interact with its environment by active exploration and manipulation. As a theoretical tool we propose the use of a conceptual framework based on the distinction of "potentiality / actuality" dating back to Aristotle. In this framework grades of presence may be defined through the system’s potential to explore and manipulate its environment. In this case copresence becomes the system's frame of self-reference in relation to its environment. This points out the importance of combining empirical and theoretical research. Furthermore, experimental work and fieldwork in empirical research should be combined. In our approach we choose New Media Art as a kind of test-bed to test theoretical ideas, to gain data about interactions and affordances, and to develop research methodologies concerning "symbiotic" systems consisting of agents such as humans, robots, and software agents. As a starting point for our empirical field research we try to apply ethological methods to study behavior such as interactions/adaptations of agents with/to "their" environments. Concerning the more experimental aspect we have started research on the "communication" of "emotions" through robots.
References
1. Agre, P.: Computational Research on Interaction and Agency. Artificial Intelligence 72, 1–52 (1995)
2. Agre, P.: Computation and Human Experience. Cambridge University Press, Cambridge (1997)
3. Arbib, M.A., Hesse, M.B.: The Construction of Reality. Cambridge University Press, Cambridge (1986)
4. Bakeman, R., Gottman, J.M.: Observing Interaction: An Introduction to Sequential Analysis, 2nd edn. Cambridge University Press, Cambridge (1997)
5. Barnes, J.: Psyche. In: Gregory, R.L. (ed.) The Oxford Companion to the Mind, 2nd edn., pp. 762–763. Oxford University Press, Oxford (2004)
6. te Boekhorst, I.R.J.A.: Freeing Machines from Cartesian Chains. In: Beynon, M., Nehaniv, C.L., Dautenhahn, K. (eds.) CT 2001. LNCS (LNAI), vol. 2117, pp. 95–108. Springer, Heidelberg (2001)
7. Dourish, P.: Where the Action Is – The Foundations of Embodied Interaction. MIT Press, Cambridge (MA) (2001)
8. Essler, W.K.: Einführung in die Logik. Kröner, Stuttgart (1966)
9. Gibson, E.J., Adolph, K., Eppler, M.: Affordances. In: Wilson, R.A., Keil, F.C. (eds.) The MIT Encyclopedia of the Cognitive Sciences, pp. 4–6. MIT Press, Cambridge (MA) (1999)
10. Günther, G.: Cognition and Volition: A Contribution to a Cybernetic Theory of Subjectivity. In: Günther, G.: Beiträge zur Grundlegung einer operationsfähigen Dialektik, Bd. II. Meiner, Hamburg (1978)
11. Heeter, C.: Being There: The Subjective Experience of Presence. Presence: Teleoperators and Virtual Environments 1(2), 262–271 (1992)
12. Hollan, J.D., Hutchins, E., Kirsh, D.: Distributed Cognition: A New Foundation for Human-Computer Interaction Research. ACM Transactions on Human-Computer Interaction: Special Issue on Human-Computer Interaction in the New Millennium 7(2), 174–196 (2000)
13. Hutchins, E.: Cognition in the Wild. MIT Press, Cambridge (MA) (1995)
14. Hutchins, E.: Cognitive Artifacts. In: Wilson, R.A., Keil, F.C. (eds.) The MIT Encyclopedia of the Cognitive Sciences, pp. 126–128. MIT Press, Cambridge (MA) (1999)
15. Ijsselsteijs, W., Riva, G.: Being There: The Experience of Presence in Mediated Environments. In: Riva, G., Davide, F., Ijsselsteijs, W. (eds.) Being There: Concepts, Effects and Measurement of User Presence in Synthetic Environments, pp. 3–16. IOS Press, Amsterdam (2003)
16. Kac, E.: Telepresence & Bio Art. Networking Humans, Rabbits, & Robots. The University of Michigan Press, Ann Arbor, MI (2007)
17. Licklider, J.C.R.: Man-Computer Symbiosis. In: Wardrip-Fruin, N., Montfort, N. (eds.) The New Media Reader, pp. 74–82. MIT Press, Cambridge (2003)
18. Lombard, M., Ditton, T.: At the Heart of it All: The Concept of Presence (2.1.2007), Available: http://jcmc.indiana.edu/vol3/issue2/lombard.html
19. Maris, M., te Boekhorst, I.R.J.A.: Exploiting Physical Constraints: Heap Formation Through Behavioral Error in a Group of Robots. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1655–1660. IEEE Press, Piscataway (1996)
20. Matovani, G., Riva, G.: Real Presence: How Different Ontologies Generate Different Criteria for Presence, Telepresence, and Virtual Presence. Presence: Teleoperators and Virtual Environments 8(5), 538–549 (1999)
21. Nair, R., Tambe, M., Marsella, S.: The Role of Emotions in Multiagent Teamwork. In: Fellous, J.M., Arbib, M.A. (eds.) Who needs Emotions? The Brain Meets the Robot, pp. 311–329. MIT Press, Cambridge (MA) (2005)
22. Pfeifer, R., Bongard, J.: How the Body Shapes the Way We Think. MIT Press, Cambridge (MA) (2007)
23. Rafaeli, S.: Interactivity: Do Computers Do It Differently? Unpublished Manuscript. Stanford University, Institute for Communication Research, Stanford (CA) (1985)
24. Sharples, M.: Human-Computer Interaction. In: Boden, M.A. (ed.) Artificial Intelligence, pp. 293–323. Academic Press, San Diego (1996)
25. Sheridan, T.B.: Musings on Telepresence and Virtual Presence. Teleoperators and Virtual Environments 1(1), 120–126 (1992)
26. Steuer, J.: Defining Virtual Reality: Dimensions Determining Telepresence. Journal of Communication 4(2), 73–93 (1992)
27. Weinberg, G., Driscoll, S.: Toward Robotic Musicianship. Computer Music Journal 30(4), 28–45 (2006)
28. Zhao, S.: Toward a Taxonomy of Copresence. Presence: Teleoperators and Virtual Environments 12, 445–455 (2003)
Predicting Perceived Situation Awareness of Low Altitude Aircraft in Terminal Airspace Using Probe Questions
Thomas Z. Strybel, Kim-Phuong L. Vu, John P. Dwyer, Jerome Kraft, Thuan K. Ngo, Vanessa Chambers, and Fredrick P. Garcia
Center for the Study of Advanced Aeronautic Technologies, Department of Psychology, California State University, Long Beach, 1250 N. Bellflower Blvd., Long Beach, CA 90840, USA
Abstract. The purpose of the present study was to evaluate the effectiveness of subjective and objective probe questions in predicting situation awareness as measured by the Situation Awareness Rating Technique (SART). The data for this evaluation were taken from a previous investigation in which instrument-rated pilots flew automated ILS approaches into the Dallas-Fort Worth (DFW) Airport while monitoring the status of patrol vehicles proximal to their approach path. At three points during a simulation run, pilots were administered a questionnaire containing seven questions designed to probe situation awareness. At the end of the run, SART was administered. We found that certain probe questions can predict SART scores. However, for these probes to be useful, the questions must be designed in conjunction with scenario development to ensure that operationally critical variables are being probed, and the responses must vary enough to allow relations to be assessed with sufficient statistical power.
Keywords: situation awareness; aviation; simulation.
SA has been viewed as either the cognitive processes involved in achieving high awareness of one’s surroundings or the information that determines the state of the operator’s awareness. Most definitions of SA, in the latter view, assume that good SA requires information about past, present, and future events [3], [4]. The measurement of SA is also difficult because the relationship between SA and performance is not simple. For example, Durso et al. noted that measures of SA may not always predict operator performance because "...the situation might be very simple, or the operator may get lucky" (p. 721) [4]. Nevertheless, the absence of situation awareness has been shown to contribute to operational errors. For example, in a review of major carrier accidents over a four-year period, 88% of the errors could be traced back to low SA [3]. In other reviews, 69% of ATC-reported incidents involved a failure to gather the information needed for good SA [5]. Moreover, the severity of ATC operational errors is related to controller awareness of the error: lower awareness results in more severe errors [6], [7], [8]. The importance of developing valid SA metrics will increase in future years, as we move to more automated air traffic management environments. Automation may change the traditional roles of operators in the airspace, and the effect of these changes on SA needs to be determined. Preliminary research on the effects of automation on SA has identified several potential problems. First, automation might take the operator out of the loop and add more demands for vigilance monitoring, yet human performance on vigilance tasks is poor. Second, once taken out of the loop, operators may not be able to easily regain SA during system failures or emergencies because they were not fully aware of system status prior to the failure [9].
Third, SA will be impacted by the reliability of the automated tools and the tendency for operators to become overly involved with the automation tool itself [10], [11]. For example, datalink has been shown to improve the overall efficiency of communications by reducing the number of communication failures. However, datalink, by reducing or eliminating party line chatter, can also reduce pilot SA of the current ATC workload level [12].
1.1 Situation Awareness Measures
Over 20 years of research has been conducted on SA, and numerous measurement techniques have been developed for the construct. Recent reviews of SA measurement techniques list between 9 and 17 different tests and measures [2], [13]. Although the specific measures vary, these metrics can be roughly classified as probe techniques, subjective ratings, or performance measures. The use of probe techniques involves submitting queries to the operator during a simulation run. These queries are designed to gather information that is assumed to be relevant to SA (e.g., asking a pilot to estimate the distance of a nearby patrol vehicle from his/her aircraft, or "ownship"). The most widely researched and utilized SA probe technique is Endsley’s freeze-probe technique, SAGAT, in which the simulation is stopped in the middle of a simulation run and operators are asked questions about the simulation environment [1], [3]. For air traffic management applications, the queries usually relate to the location and characteristics of aircraft in a sector (for controller SA) or in the vicinity of an aircraft (for pilot SA). According to Endsley, the questions queried should be created from a prior Goal Based Task Analysis
Technique which identifies SA information requirements and classifies them according to which dimension of SA they are tapping (perception, integration, or projection into the future).
SAGAT has been criticized because scenario freezes may disrupt performance. The task of air traffic management is dynamic, and freezing may prevent interactions among elements within the system from unfolding realistically. Endsley found no significant differences in ATC performance between simulation runs in which the scenario was frozen and those in which the scenario was not frozen [1]. However, because controller performance in simulated and real environments is generally very high, the lack of a significant effect of scenario freezes could be the result of inadequate statistical power. Therefore, the question of whether scenario freezes disrupt performance has not been adequately tested. In nuclear power plant operations, SA has been shown to drop during periods of workload transition (i.e., from normal to high conditions) [14]. Scenario freezes may create similar workload-transition costs in other contexts even if no performance change is observed.
Other probe techniques query the operator while the simulation is still running; the scenario is never frozen. SPAM asks questions of controllers or pilots via their landline [8]. The questions are developed by subject-matter experts, so they are relevant to the operator’s task and more compatible with how controllers represent traffic information during the scenario. Durso suggests that part of SA involves knowing where to obtain information, instead of holding the information in memory. With SPAM, SA is measured as the number of correct responses, and response time is used as a secondary indicator of workload.
Rating methods require that either operators provide self-ratings of SA or subject-matter experts rate an operator’s SA. The most widely used measure of SA that makes use of self-reported rating scales is SART [15].
SART is a multidimensional scaling technique that consists of a series of questions that have bipolar responses. The number of dimensions varies between different SART forms. The simplest version, 3D-SART, a simplification of the 10-dimension version, assesses three dimensions of SA: demands on attentional resources (complexity, variability, and instability of the situation), supply of attentional resources (division of attention, arousal, concentration, and spare mental capacity), and understanding (information quantity and information quality). A combined SART scoring technique is often used as an estimate of overall SA:
SART-Combined = Mean Understanding Rating – (Mean Demand Rating – Mean Supply Rating).
The lack of task-specific questions in SART is both an advantage and a disadvantage. Because SART questions are not task specific, SART can be administered to both pilots and controllers. On the other hand, the lack of specificity means that SART does not provide much diagnostic information regarding the causes of poor SA. Nevertheless, SART is the most widely used rating method of SA measurement, most likely because it is easy to administer and score. Moreover, SART measures the operator’s perception of his/her SA, which may or may not be related to performance or actual awareness.
Performance measures assess SA in terms of system variables. Such measures are objective and non-intrusive. However, the assumption that system outcomes are based solely on operator SA is tenuous because the relationship between SA and system performance is complex. Presently, there is no performance measure that has been shown to be directly related to SA level and is independent of other performance
factors, such as workload. In general, performance measures assess SA in scenarios and on tasks that have been carefully developed to probe the operator’s SA. One technique for obtaining performance measures of SA is to introduce errors into the scenario and use speed of detection and accuracy of correction as SA metrics.
1.2 Present Study
The present study is a preliminary evaluation of the effectiveness of subjective and objective probe questions in predicting subjective SA as measured by SART. We used the final approach phase of flight because this flight phase places unique information demands on pilots [16]. The data for this evaluation were taken from a previous investigation in which instrument-rated pilots flew automated ILS approaches into the Dallas-Fort Worth (DFW) Airport while monitoring the status of patrol vehicles proximal to their approach path [17]. These vehicles were characterized as multi-vehicle flights on patrol missions over DFW reservoirs. Furthermore, they were identified as either piloted patrol aircraft or unpiloted patrol aircraft. The patrol vehicles flew zig-zag courses parallel to, and ahead of, the participant’s ownship aircraft (but at substantially slower speeds than ownship), and either leveled off or climbed precipitously in the vicinity of the ownship course Final Approach Fix (FAF), as shown in Figure 1. Also, these patrol vehicles flew either to one side of the ownship course or on both sides (i.e., ‘straddling’ the ownship course). These variables influenced SA by altering the extent to which pilots could predict patrol vehicle proximity to ownship. In this paper, we examined whether pilots’ situation awareness, as captured by SART, could be predicted by different types of probe questions administered throughout the simulation runs.
Fig. 1. Illustration of a patrol vehicle flight path relative to ownship’s approach path. Dotted line A depicts a climbing patrol vehicle flight’s maneuver at the FAF, and dotted line B represents the patrol vehicles in level flight past the FAF.
2 Method
2.1 Participants
Nine instrument-rated pilots (all male) served as participants in the simulation. The group averaged approximately 2,700 hours of flight experience. Participants were recruited through a local flight instruction school in Long Beach, California. Each participant was paid $160 for 8 hours of participation.
2.2 Apparatus and Procedure
The simulation was conducted using the Multi-Aircraft Control System (MACS), Aeronautical Datalink and Radar Simulator (ADRS), and DagVoice suite of software developed by NASA Ames Research Center. MACS allows each individual computer station to be run in one of several modes: air traffic control, pilot, data analyzer, and simulation manager [18]. The MACS pilot interface used in the simulation consisted of a stripped-down, generic, modern commercial transport cockpit with a primary flight display, a navigation display, and a landing gear and flaps setting control. The pilot’s navigation display also showed traffic in the vicinity of ownship, and provided traffic call signs and altitudes. ADRS was used as the radar simulator and communications hub between individual workstations (pilots, pseudopilots, ATC, and simulation manager). DagVoice, a flight simulation voice-over-Internet-protocol (FS-VoIP) software system, allowed multi-channel communication, emulating ATC radio communications [19].
The portion of the DFW terminal area directly involved in this experiment encompassed the low-altitude airspace surrounding the south arrival corridors to the 13 Right and 18 Right runways. The pilots were briefed on the block altitude airspace and crossing corridor clearances that had been issued to the patrol vehicles. In each scenario, one participant flew an approach to Runway 18 Right and a second participant to Runway 13 Right. Pilots were informed of which approach they were to fly and given time to study the relevant approach plate prior to the start of the run. The entry of the two piloted aircraft in each trial was staggered by about 2 minutes to allow the controller time to manage both approaches, and for the experimenters to administer several questionnaires to the pilots during the runs. A confederate pseudopilot controlled other aircraft in the airspace except for the patrol vehicles, which were always automated.
The role of the air traffic controller was played by a confederate, who was trained in basic ATC terminology, and communicated scripted clearances to the pilots. ATC also issued traffic advisories throughout the simulation run to increase party-line verbal communication. Pilots were informed that automated ILS approaches would be used for all runs. As a consequence, pilots did not have to change aircraft flight parameter settings, but instead only monitored flight progress to ensure that the aircraft was on its intended course at the given altitude and speed. Participants were required to manually deploy flaps throughout each approach (from fully retracted to Flaps 40). For landing on 18 Right, each pilot ownship entered the scenario just outside of ICKEL (Waypoint 1) at 3000 ft MSL and 210 KTS. Pilots were instructed to contact Regional Approach Control immediately. An illustration of the task sequence is captured in Figure 2. At initial contact, the controller acknowledged the aircraft, gave instructions to maintain 3000 ft MSL, and passed traffic information regarding the patrol vehicles. At YOHAN (Waypoint 2), the controller issued pilots a speed reduction clearance to 170 KTS. Pilots read back the clearance to ATC, and monitored the aircraft to ensure speed capture. At LEGRE (Waypoint 3), pilots were issued another speed reduction clearance, this time to 150 KTS, cleared for landing on 18 Right, and instructed to contact the DFW tower at NETEE, the FAF for the ILS approach to 18 Right. Pilots again responded by reading back the clearance. They lowered the landing gear and went to Flaps 40 at NETEE. Pilots then performed a short Landing Checklist
Fig. 2. Depiction of pilot activities, and pilot/air traffic controller communications for the ILS approach to Runway 18 Right at DFW
(verifying that gear was extended and that final flaps had been set). A similar procedure was used for landing on 13 Right.
2.3 Situation Awareness Measures
Pilot situation awareness was measured during, and at the end of, each simulation run. During a simulation run, pilots were administered a questionnaire containing seven questions designed to probe situation awareness. One question probed awareness of surrounding traffic (e.g., “How many aircraft were to the left/right of ownship when you were at Waypoint X?”). One question probed awareness of ownship status (i.e., “What was your speed at Waypoint X?”). Four questions probed awareness of the patrol vehicles in the vicinity: “How far was the patrol vehicle from ownship at Waypoint X?”; “What was the patrol vehicle’s clock position relative to ownship at Waypoint X?”; “What was the patrol vehicle’s speed at Waypoint X?”; and “What was the patrol vehicle’s altitude at Waypoint X?”. These six questions assessed pilots’ awareness of objective information in the simulated airspace, meaning that the accuracy of the responses could be determined. The final question asked pilots to rate their perceived threat of possible encroachment of the patrol vehicle in relation to ownship using a 10-point scale, with 1 representing no threat and 10 representing extreme threat. This seven-item questionnaire was administered at three points along the approach in each simulation run (as shown in Figure 2): after pilots read back ATC clearances at initial contact (Waypoint 1), after Waypoint 2, and after the FAF (Waypoint 3). At the end of each run, the Situational Awareness Rating Technique (3D-SART) was administered.
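The combined 3D-SART score defined in Section 1.1 is simple arithmetic on the three component ratings. The following is a minimal sketch of that scoring rule, assuming each component is supplied as a list of 1 to 7 ratings that is averaged before combination. The function name `sart_combined` and the single-run example values are illustrative and are not taken from the study materials.

```python
from statistics import mean


def sart_combined(understanding, demand, supply):
    """Return Understanding - (Demand - Supply), each averaged over its ratings."""
    return mean(understanding) - (mean(demand) - mean(supply))


# Hypothetical single-run ratings on a 1-7 scale (illustrative values only).
example_score = sart_combined(understanding=[5, 6], demand=[4], supply=[3, 5])

# Mean component scores as reported in Table 1, each passed as a one-element list.
table1_check = sart_combined([4.44], [4.16], [4.96])
```

As a rough check, entering the mean component scores reported in Table 1 (understanding 4.44, demand 4.16, supply 4.96) yields 5.24, close to the reported mean combined score of 5.22; the small gap is presumably due to rounding of the published means.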
3 Results
Table 1 presents summary statistics for SART-combined and component scores. SART-combined scores range from 1 to 14, with 14 denoting high SA. Component scores range from 1 to 7, with 7 denoting high SA. From Table 1, situation awareness was moderate at best, and in some conditions was quite low. Previously, SART-combined and understanding were shown to be affected by the interaction of patrol
vehicle type and maneuver [17]. In the present paper, we evaluated whether individual SA probe questions would predict subsequent SART scores administered at the end of each simulation run.
The effectiveness of situation awareness probes, administered during a simulation run, was analyzed by regressing SART post-run scores against responses to the probe questions that were administered at three waypoints during the simulation run. Three separate sets of regression analyses were performed. The first set examined the accuracy of pilot responses to objective questions (e.g., deviation of pilot estimate of patrol vehicle distance from actual distance) against SART scores to determine if these objective measures of situation awareness, averaged across the three administrations of the probes, predicted SART. The second set examined whether pilot estimates of probed variables (e.g., pilot estimate of patrol vehicle distance regardless of accuracy) predicted SART scores. The last set evaluated the degree to which the pilot responses to probe questions at each waypoint predicted SART scores.
Table 1. SART Score Summary
                 Mean    Standard Deviation    High    Low
Combined         5.22    1.54                  8.92    0.50
Understanding    4.44    1.07                  6.33    2.00
Demand           4.16    1.13                  6.67    2.00
Supply           4.96    0.86                  6.50    3.25
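The accuracy analysis scores each objective probe by how far the pilot’s estimate deviates from the actual value, summarized as a root-mean-square error. The sketch below illustrates that computation, assuming the error is aggregated over the three waypoint administrations within a single run; the function name `rms_error` and the distance values are hypothetical and serve only to show the arithmetic.

```python
import math


def rms_error(estimates, actuals):
    """Root-mean-square deviation of paired estimates from actual values."""
    if len(estimates) != len(actuals) or not estimates:
        raise ValueError("estimates and actuals must be non-empty and equal in length")
    squared = [(est - act) ** 2 for est, act in zip(estimates, actuals)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical patrol-vehicle distances (nm) at Waypoints 1-3 for one run.
estimated_nm = [6.0, 4.0, 2.0]
actual_nm = [7.0, 4.0, 4.0]
run_rmse = rms_error(estimated_nm, actual_nm)  # errors of -1, 0 and -2 nm
```

Because the errors are squared before averaging, over- and underestimates do not cancel, and a single large miss weighs more heavily than several small ones.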
For three of the six objective probe questions, the variability in the correct answers between simulation runs was quite small. For example, patrol vehicle relative position to ownship was either 11 or 1 o’clock in every scenario, and very few errors were made by the participants. Performance on the three remaining objective questions was measured as root-mean-square error in patrol vehicle distance from ownship, root-mean-square error in number of aircraft in the left/right quadrant, and root-mean-square error in ownship speed. These variables were used to predict SART-combined and component scores. However, none of these models was significant.
For the second set of analyses, SART-combined and SART component scores were regressed against the following pilot estimates: estimated number of aircraft in the left/right quadrant, estimated distance of patrol vehicle from ownship, estimated ownship airspeed, and perceived threat of patrol vehicle encroachment (rightmost columns in Table 2). The regression equations predicting SART scores from pilot estimates were significant for SART-combined and two of its components: SART-understanding and SART-demand. For SART-combined, a significant regression model was obtained [F(4,66) = 2.70; p = .04; adjusted r² = .09]. The regression model had two significant coefficients: estimated distance [t(66) = -2.58; p = .006] and threat rating [t(66) = -1.992; p = .05]. Estimated distance was shown to be the stronger effect (β = -.368) compared with threat rating (β = -.251). Larger distance estimates and higher threat ratings were associated with lower SART-combined scores. A similar model was obtained for SART-understanding [F(4,66) = 4.91; p = .002; adjusted r² = .18]. As in the previous model, the estimated distance [t(66) = -2.78; p = .004] and threat rating [t(66) = 2.043; p = .045] coefficients were significant. For SART-understanding, however, the distance coefficient was negative, while the threat rating coefficient was positive. That is, lower distance estimates and higher threat ratings were associated with higher SART-understanding scores. A significant regression equation was also obtained for SART-demand scores [F(4,68) = 8.76; p < .001; adjusted r² = .31]. Here, the threat rating and estimated speed coefficients were significant [t(68) = 4.14; p < .001 and t(68) = -3.50; p = .001, respectively]. Higher threat ratings were associated with higher SART-demand scores, but higher speed estimates were associated with lower SART-demand scores. The regression model for predicting SART-supply was nonsignificant.
The significant relationships between probe question responses and SART scores in the aforementioned models used probe responses that were averaged across the three administrations (one at each waypoint) during the simulation run. To determine if those relationships were a by-product of a specific waypoint, we regressed probe responses at each waypoint against SART. As shown in Table 2, changes over the waypoints for each estimate were consistent with actual airspace parameters. Threat ratings also increased from the initial waypoint to the final waypoint, consistent with the experimental manipulation. Significance was obtained for only the SART-understanding [F(12,45) = 4.046; p < .001] and SART-demand components [F(12,45) = 6.39; p < .001]. SART-understanding was predicted by threat rating at waypoint 3 [t(45) = 3.41; p < .001]. SART-demand was predicted by threat rating at waypoint 3 [t(45) = 6.04; p < .001] and estimated speed at waypoint 2 [t(45) = -2.46; p = .018].
Table 2. Actual (Act) and Pilot Estimates (Est) of the Flight Parameters That Were Probed, and Threat Ratings, at Each Waypoint. Standard deviations are provided in parentheses.
Estimate           Average Act    Average Est
Distance (nm)      4.9            4.3 (1.7)
Number Aircraft    3.7            3.4 (1.6)
Speed (kts)        185            184.4 (9.6)
Threat Rating                     3.0 (1.9)
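The reported F statistics and adjusted r² values can be cross-checked against each other, because for an ordinary least-squares model with an intercept both are functions of R² and the degrees of freedom. The sketch below assumes that standard relationship, R² = F·df1 / (F·df1 + df2) and adjusted R² = 1 − (1 − R²)(df1 + df2) / df2. The function names are illustrative; only the F values and degrees of freedom come from the results above.

```python
def r_squared_from_f(f_value, df_model, df_error):
    """Recover R^2 from an overall regression F test with (df_model, df_error)."""
    explained = f_value * df_model
    return explained / (explained + df_error)


def adjusted_r_squared(r_squared, df_model, df_error):
    """Adjust R^2 for the number of predictors; n - 1 equals df_model + df_error."""
    total_df = df_model + df_error
    return 1.0 - (1.0 - r_squared) * total_df / df_error


# SART-combined model: F(4,66) = 2.70
adj_combined = adjusted_r_squared(r_squared_from_f(2.70, 4, 66), 4, 66)

# SART-understanding model: F(4,66) = 4.91
adj_understanding = adjusted_r_squared(r_squared_from_f(4.91, 4, 66), 4, 66)
```

Under these assumptions the SART-combined model gives an adjusted r² of about .09 and the SART-understanding model about .18, in agreement with the values reported for the averaged-probe analyses.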
4 Discussion
From this preliminary investigation of using probe questions to predict subjective SA for pilots flying an ILS approach to DFW, some findings emerged that could potentially contribute to future designs of SA probes. In the first analysis we showed that accuracy of estimated airspace parameters did not predict any SART measure of SA. Although this suggests that SART measures only perceived SA, it is more likely that the small range of potential responses to these probes may have limited any ability to detect the relationships between accuracy measures and SART. Obviously, the development of probe questions should be done in the context of scenario development to ensure that potential probes are operationally relevant and the responses to them have sufficient variability for adequate statistical testing.
Pilot estimates of airspace parameters did significantly predict SART scores in the second analysis. The two variables that accounted for most of the variance explained by the regression models were estimated distance of patrol vehicle from ownship and rating of perceived threat of encroachment. The dominance of distance estimation and threat ratings was probably created by the demands of the scenario. Pilots were briefed on the characteristics of patrol vehicles flying near an approach flight path at the beginning of the experiment. Consequently, pilots paid particular attention to these vehicles.
The significant regression coefficients that were obtained for estimated distance as a predictor of SART-combined and SART-understanding were found only when distance estimates were averaged across all waypoints: Individual distance estimates at each waypoint did not significantly predict SART. However, the distance of the patrol vehicle changed continuously throughout the scenario as the pilot overtook the slow-moving patrol vehicle, and this may have limited the effectiveness of the individual distance estimates at each waypoint in predicting SART.
On the other hand, threat ratings and estimated ownship speed were predictive at specific waypoints. The effectiveness of these predictors may have been due to the scenario design. Estimated speed predicted SA only at the second waypoint, where a speed change had just been issued by ATC. Threat ratings at waypoint 3 (the FAF) significantly predicted SA because the separation between ownship and patrol vehicle was smallest here, and pilots were consequently most concerned about the patrol vehicle.
In conclusion, these preliminary analyses showed that probe questions administered during a simulation can predict SART scores administered after a simulation run.
However, the usefulness of these probes requires that the questions be designed in conjunction with scenario development to ensure that operationally critical variables are being probed, and that sufficient variability in the responses allows assessments of relations with sufficient statistical power. Finally, it should be noted that pilots were concerned about interruptions introduced by the probes during the simulation run, despite the fact that they were essentially monitoring an automated approach into DFW as opposed to actively flying the approach. Effective probes must be administered in a way that does not interfere with the operator’s task.
Acknowledgments. We thank Gary Gershzohn for his help with operational matters and for conducting the pilot classroom training session, and Ken Wells for his advice on DFW terminal area operations. We also acknowledge Tom Prevot, Joey Mercer, and Everett Palmer at the NASA Ames Air Operations Laboratory for providing simulation software and continued technical assistance. We thank Marshall Dion and Shawn Bates, California State University Long Beach, for helping set up and run the simulation.
References
1. Endsley, M.R., Farley, T.C., Jones, W.M., Midkiff, A.H., Hansman, J.R.: Situation Awareness Information Requirements for Commercial Airline Pilots (ICAT-98-1). Mass. Inst. Tech. Intern. Cent. Air Trans (1998)
2. European Air Traffic Management Programme: The Development of Situation Awareness Measures in ATM Systems. HRS/HSP-005-REP-01 (2003)
3. Endsley, M.R.: Toward a theory of situation awareness in dynamic systems. Human Factors 37, 32–64 (1995)
4. Durso, F.T., Bleckley, M.K., Dattel, A.R.: Does Situation Awareness Add to the Validity of Cognitive Tests? Human Factors 13, 721–733 (2006)
5. Endsley, M.R., Jones, D.G.: Situation awareness requirements analysis for TRACON air traffic control (TTU-IE-95-01). Texas Tech Univ, Lubbock, TX (1995)
6. Rodgers, M.D., Mogford, R.H., Mogford, L.S.: The relationship of sector characteristics to operational errors. Air Traf. Cont. Quart. 5, 241–263 (1997)
7. Gosling, G.D.: Analysis of factors affecting the occurrence and severity of air traffic control operational errors. ITS Online (2002)
8. Durso, F.T., Truitt, T.R., Hackworth, C.A., Crutchfield, J.M., Manning, C.A.: En route operational errors and situation awareness. Int. J. of Aviat. Psych. 8(2), 177–194 (1998)
9. Kaber, D.B., Endsley, M.R.: The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theor. Iss. Ergon. Sci. 5, 113–153 (2004)
10. Wickens, C.D.: Automation in air traffic control: the human performance issues. In: Scerbo, M., Mouloua (eds.) Automation technology and Human Performance: Current Research and Trends (1998)
11. Amalberti, R.R.: Automation in aviation: A human factors perspective. In: Garland, D.J., Wise, J.A., Hopkin, V.D. (eds.) Handbook of Aviation Human Factors, pp. 173–192. Erlbaum, New Jersey (1999)
12. Kerns, K.: Human factors in air traffic control/flight deck integration: Implications of datalink simulation research. In: Garland, D.J., Wise, J.A., Hopkin, V.D. (eds.) Handbook of Aviation Human Factors, pp. 519–546. Lawrence Erlbaum, Mahwah (1999)
13. Salmon, P., Stanton, N., Walker, G., Green, D.: Situation awareness measurement: A review of applicability for C4i environments. App. Ergon. 37, 225–238 (2006)
14. Hallbert, B.P.: Situation awareness and operator performance: results from simulator-based studies. In: Proc. IEEE Sixth Ann. Hum. Fact. Mtg, Orlando, Florida (1997)
15. Taylor, R.M.: Situational awareness rating technique (SART): The development of a tool for aircrew systems design. Situational Awareness in Aerospace Operations, AGARDCP 478, 3-1 - 3-37 (1990)
16. Shvaneveldt, R., Beringer, D.B., Lamonica, J., Tucker, R., Nance, C.: Priorities, Organization, and Sources of Information Accessed by Pilots in Various Phases of Flight. DOT/FAA/AM-00/26, Federal Aviation Administration (2000)
17. Dwyer, J.D., Strybel, T.Z., Vu, K.L.: Simulation of Multiple Uninhabited Aerial Vehicles Operating in an Airport Terminal Area. RTO-MP-HFM-135 24, 2–17 (2006)
18. Prevot, T.: Exploring the many perspectives of distributed air traffic management: The multi aircraft control system MACS. HCI-Aero 2002. MIT, Cambridge, MA (2002)
19. Canton, R., Refai, M., Johnson, W.W., Battiste, V.: Development and Integration of Human-Centered Conflict Detection and Resolution Tools for Airborne Autonomous Operations. In: Proc. 15th Intern. Symp. Aviat. Psych. Oklahoma State University (2005)
Co-presence in Shared Virtual Environments: Avatars Beyond the Opposition of Presence and Representation
Jan Soeffner¹ and Chang S. Nam²
¹ Universität zu Koeln, Albertus-Magnus-Platz, 50923 Koeln, Germany, [email protected]
² Department of Industrial Engineering, University of Arkansas, 4207 Bell Engineering Center, Fayetteville, AR 72701, U.S.A., [email protected]
Abstract. Avatars in shared virtual environments are usually described as representations of the users, but they can be much more than just an arbitrary icon ‘standing for’ (re-presenting) somebody who is absent. In multi-user virtual reality, avatars can be experienced by the users as presences or presentations of persons as well as re-presentations, and it is by this property that they allow for co-presence experience. This paper outlines a theory about the relation between persons and their avatars by focusing on both the experience of transmission (as opposed to simulation) and the experience of méthexis or participation (as opposed to representation).
Keywords: avatars, personality, presence, tele-presence, co-presence.
In order to describe this equivalence in human-computer interaction experiences, we have to consider not only the forms of presentation, but also presumptions about the status of transmission and artificial construction – such as the complex phenomenon of tele-presence¹ presumed by the experiencing people. Tele-presence is the technical realization of a remote ‘what’ (in most cases a sensually perceivable form) to a present ‘this’ (e.g. a nearby screen and/or loudspeakers) – a realization that can imply interactivity by working in two directions (i.e. realizing a present ‘what’ to a remote ‘this’). We would like to point out this fact, because the current definition of tele-presence as a “sense of being there” tends to forget the importance of the ‘this’ (i.e. the medium) and therefore potentially loses the difference between teleoperation, virtual operation, and unmediated operation (i.e., the difference between tele-presence, virtually produced presence, and physical presence). Such definitions resort to concepts of “epistemic failure”² and try to define the tele-status by concepts like ‘illusion of non-mediation’ or ‘willing suspension of disbelief’ [1]. The problem of this latter definition is that it does not distinguish between artifacts and transmissions in presence experience. Even though many people willingly fool themselves about the corporeal presence of things presented in a video game, only few will do so about the corporeal presence of the interlocutor in a phone call. Actually we do not think that there is any illusion in this experience. For example, the phone transfers an interlocutor’s voice (a remote ‘that’) to a literal here (into the loudspeaker, the present ‘this’). One can believe this without a great deal of disbelief(s). What leads a number of presence researchers to see no difference between tele-presence and artificial presence? In our opinion, the answer to this question is quite simple.
It is the fact that transmission has technically changed in the last decades. The older telephone realized the remote voice by simply amplifying it. As a device, it had the technical structure of a prosthesis [3]. More recent phones, however, transmit it by artificially constructing the voice in a remote place. They are simulating devices that only have the effects of prostheses. As in the age of analog media, tele-presence is still a process of doubling a ‘what’, but nowadays this doubling process has technically become a mixture of simulation and transmission. This difference becomes more obvious in co-presence experiences that are mediated by avatars within a virtual environment [4]. The experience of co-presence (as opposed to playing a single-user video game) takes place as a sense of being together with others even though these others are not corporeally present. Unlike the example of the phone call, the tele-status (i.e. whether an item is transmitted, recorded or a virtual artifact), in principle, is no longer a sensually perceivable phenomenon. It is something that has to be presumed and/or gained from interpretation. It can be concluded that the meta-physical or meta-sensual aspect of tele-presence is important for inquiries pertaining to the experience of social co-presence in shared virtual environments. Under this condition, co-presence concerns the unperceivable status of the avatars acting ‘in’ or ‘behind’ perceivable things. To return to the medieval definitions, the question about co-presence corresponds to the question “Is a vision of an appearance nothing more than an illusion or phantasm, or is it inspired by a higher (but not sensually perceivable) agency?”
¹ I use this term in the etymological sense of the word: as far-presence. This definition has the advantage of being able to distinguish transmission factors from artificial constructions.
² For a critique of these concepts see [2].
It is the aim of this paper to outline this complex relationship. It tries to describe types of relations between persons and avatars that make people identify with their avatars and perceive the avatars of others as animated.
2 Persona and Appresentation
Etymologically there are some striking parallels between the terms avatar and person. The term avatar derives from Indian cultic objects animated by divine agency. The term ‘person’ derives from the Latin word for theatrical mask: A persona is literally ‘what is sounding through’ an unanimated representation of a face. Accordingly, a persona has two different sides: one physically present but dead item, and one agency animating this item even though it cannot be seen. So both avatars and personae, in the original sense of the word, are visible manifestations of an invisible being. Indeed computer avatars are constructed like theatrical masks in a sense. They are even, perhaps, a more consistent realization of this concept of persona than the masks themselves – the animating agent ‘sounding through’ is no longer sensually present at all.
The ‘seeing behind the things given’ required by a persona, however, is no more unusual than our everyday attribution of personality, especially if we think about the experience of real-life face-to-face co-presence [5]. Indeed no personality can ever be sensually present, since essential parts of it (such as ‘intentionality’, ‘strength of mind’, and ‘character’) can only be deduced from the observation of their effects on the sensually given world. And even if these deductions are facts of ‘embodied information’ in face-to-face communication, the personality of people still does not occur to our senses. Therefore, Goffman’s distinction between embodied and disembodied information seems to be too categorical to work in the case of co-presence in shared virtual environments. Indeed, avatars are also embodied information; and if considered as embodiments of information, our carnal bodies vice versa can be seen as some sort of real-world avatars carrying traces and nothing but traces of our personality and our agency. Avatars are embodiments like our bodies, even though they are virtual ones.
And our bodies are bearers of information about us – just like avatars are. So the discussion has to be open to a more gradual way of defining co-presence than Goffman’s condition of people being “close enough to be perceived in whatever they are doing” (p. 17). We therefore completely agree with this proposal to discuss the question about co-presence in terms of “mutual awareness”, which is a term that is open enough to the complex role avatars play when mediating “togetherness” in and by a virtual environment. Here the question cannot be about “closeness” – as in Goffman’s theory – nor can it be only about the sensory properties the avatar shares with the body of the remote person. So the question has to be posed in a more basic way, i.e.: What makes us believe that a set of sensory properties (be they of a human body or an avatar) is the sensual part of a person? And how do we perceive this person as present?
The most interesting theory about perception of the sensually unperceivable has been developed by Edmund Husserl [6] who, for this phenomenon, used the word “appresentation,” referring to a status of ‘also-there’ (‘Mit-da’) in perception. In other words, appresentation concerns things that spontaneously occur, even though they are not perceived in a
J. Soeffner and C.S. Nam
physical way. Husserl gives the example of the reverse or far side of things, which occurs to our perception even though we are not able to see it; accordingly, he conceives appresentation not as an act of thinking but as part of primordial perception in an interactive environment (which he calls ‘Wirkwelt’, ‘acting world’). In this kind of perceptiveness everything ‘other’ can only be perceived as part of the ‘own.’ Despite the fact that appresentation is a fact of perception, Husserl does not limit his thoughts to presumed shapes or qualities of objects of which we can perceive only small parts. Instead he opens up the concept by considering things that by definition can never occur to the senses. In effect, highly complex phenomena of ‘depth’ that can be presumed to be inexhaustible can occur to primordial perception. Husserl especially refers to the phenomenon of the ‘perception’ of others as being endowed with consciousness (i.e. as ‘subjects’ that occur as presumed bases of their own ‘original sphere’). This kind of apperception can be observed in everyday life. For example, we feel quite surprised if a presumed human being turns out to be a mannequin and our primordial perception switches from the state of appresenting personhood and subjectivity to the state of appresenting thing-ness. It is the difference between this sense of surface and the sense of depth that generates a sense of co-presence with a subject or person, which Husserl accordingly calls facts of immanent transcendence.
3 Avatars as Persons
If the use of avatars in computer-mediated communication is described this way, the question of co-presence has to be considered as dependent on the question of what makes a sensually present item embody a person to our perception. Indeed avatars in computer-mediated co-presence are the addressable focus of the sum of the sensually given items of remote persons in virtual reality, just as bodies are the sum of the sensually given items of present persons. This is neither bizarre nor special. The process of attributing personality to unperceivable things makes it not only possible but easy to attribute personality to inanimate things, thereby influencing the sense of co-presence. Every environment is acting upon us, even if it is not animated. This fact often leads us to very narrow-minded perceptions that a common-sense rationalist would call faults or superstitions: i.e., the attribution of a personified intentionality to animals (especially dogs and cats), non-human forces (e.g. Nature) and even non-living items (e.g. I often insult my computer for his stubborn character and caprices). In his theory about Art and Agency, Alfred Gell [7] gives an example of what a doll can be for a little child. Many people indeed feel co-presence while praying in the presence of statues of the Virgin or Christ, which, to be honest, very rarely talk back. In human-computer interaction research, the most prominent example of this kind of felt co-presence is the famous ‘Eliza’, a system that provides human-like computer-generated answers to people’s questions. Indeed this machine originally worked so well that people analyzed by an Eliza-psychologist wanted to go beyond the machine and talk to the suspected real and human (i.e. corporally addressable) Eliza. This behavior, however, has become less probable in our times. People have become used to computer-mediated interaction and also to pictorially present avatars. Also,
Co-presence in Shared Virtual Environments
people have learned to esteem computer-mediated interaction. Nevertheless these interesting studies show the extreme difference between the experience of felt co-presence of a machine and felt co-presence of a human being mediated by a machine, i.e. the attribution of personality or subjectivity that is lacking in the former. Eliza has led to the discussion about human co-presence in virtual environments; and the trace of this origin can be observed in the extrapolation of a question only implicitly inherent to them: the question of a ‘real’ and a deceptive ‘as-if’ co-presence [8]. And, as in theories about fiction, the question is whether an ‘as if’ is acting as deception (i.e. an unrecognized ‘as-if’ situation) or as a game (i.e. a recognized ‘as-if’ situation). So we are confronting the same problem that made definitions based on ‘epistemic failure’ an insufficient ground for defining telepresence experience: there is no epistemic failure in playing with a machine’s accurate imitation of human behavior, unless there is some Weizenbaum misleading the users into attributing personality and intentionality to Eliza’s texts. The mere fact of ‘acting animated’ does not itself produce co-presence; what produces it is the attribution of something deeper, present behind the appearances. The difference between the sense of real and fictitious co-presence depends not only on the accuracy of imitation, but rather on the appresentation of agency, for which this accuracy is an essential condition but not a sufficient one: even the most perfect imitation of human behavior can produce either a willing or a spontaneous (misled) ‘suspension of disbelief’. In other words, the difference lies between a simple apperception and an apperception lived in a game-like ‘as-if’. Even though Eliza worked only with text, the case of avatars is not completely different in this sense; it is only more complex. We are dealing with a larger scale of communicative stimuli tending to become as rich as (or even richer than) face-to-face interaction, and so allowing for other kinds of coherency or failure of coherency. A current Eliza could have a talking avatar with a voice, face, body, and clothes, all of which are part of the communication.
4 Possessing and Being Possessed
It is also important to look at the other side of co-presence: the identification of a user with her or his avatar. We have stated above that the invisible persona is hidden and cannot simply be deduced from the shape of a communicated personality. On the other hand, we have nothing but the sensually given to communicate it. It has to be read from the traces of the shaping process that we try to observe in the shape. So our body itself is a kind of mask we can shape to some degree, according to the paradoxical task of making this visible item fit the personality we want to communicate. In addition, we know that others are doing the same job. Transcendence in immanence, the co-presence of a person in a sensual phenomenon, or the perception of this phenomenon as endowed with agency, is therefore a twofold relation in interaction. Everybody knows their own body to be not only perceiving but also perceived; and so she or he knows that she or he is corporally presenting and communicating her or his invisible ‘essence’ (or faking it). This applies even more to avatars, which can be shaped much more easily and radically than our bodies. For example, even if people supposed that they were dealing
with an Eliza-like avatar analyst, it might be essential for them to see an avatar they can believe a psychoanalyst could have chosen (it doesn't have to be one of the hundreds of thousands of Freud images, but I am afraid many people would be so mentally lazy as to expect something quite similar). Nevertheless there is also an important difference between our communicating bodies and our communicating avatars. Corporality is part of our communication, and we try to shape our bodies according to what we believe to be adequate, convenient, comfortable or true of us. In this way we possess our bodies. On the other hand, we are also possessed by these us-bearers. They establish the focus of our perception and generate our thoughts and emotions. Neuroscientists even make us believe that all of the shaping is part of this possession. So our corporality provides us with the paradoxical state of having a body and at the same time being one [9]. The difference between our communicating through embodying corporality and our communicating through incorporating avatars, therefore, is not the difference between face-to-face interaction and a mixture of transmission and simulation. Rather, it is the fact that avatars, up to a certain degree, lack this paradoxical state of possessed possession. Of course it is true that we also ‘have’ our avatars and at the same time ‘are’ them. But we ‘are’ our avatars in a much less determined way than we ‘are’ our bodies. Indeed we cannot escape our bodies, but we can escape our avatars. Avatars are communicating bodies only. They do not possess us physically, but only through the laws and restraints of communication. They do not bear our physiology, perception, anatomy, DNA and so on.
The difference between our personalities embodied by our corporality and personalities embodied by our avatars, therefore, is about a paradoxical immediateness of us being possessed possessors of and by our bodies on the one hand, and a radical but mediated way of possessing a body on the other hand. It is not about presence or absence, but about the power we gain over our presence (or more precisely, the appresentations of others) by means of mediation.³ Avatars therefore, at first glance, seem to be the realized dream of the Cartesian thinking being (ens cogitans), i.e. to be really independent from all space-requiring material things (res extensae). In another sense they are quite the opposite, because they are visible and sensual and they do require (virtual) embodiment. So, instead of being the freed thinking substance, an avatar is still an embodiment, even though it is a more freely and more mentally shaped one than the body itself. Avatars thereby have an aspect that is even more personal than our corporal appearance. If a person is really the invisible entity animating our masks, avatars give us much more freedom to make these masks communicative. Unlike our bodies, avatars can be chosen and formed by the immaterial animating essence sounding through them. Again Alfred Gell’s agency theory can clearly illustrate this relationship. For Gell, as for ourselves, agency is not a simply given attribute of a person, but is distributed over several phenomena of the given world, from which it can be abducted. In outlining this theory he distinguishes primary agents (intentional beings who are
³ On the other hand, the facts Goffman described as “situational properties” determined by and determining the “social occasion” [4, pp. 193-197] remain decisive for social interaction in virtual environments as well. The social frames of the virtual gatherings also determine the user’s behavior acting upon the avatar. So the avatars, like our communicating bodies, are determined by the rules and requirements of social interaction and are also mastering their users.
categorically distinguished from mere things or artifacts) from secondary agents (i.e. artifacts through which primary agents distribute their agency); and he states that these secondary agents are an essential part of agency, because the primary agents are affected by the secondary ones (e.g. I can only distribute my agency in this article by getting a part of my computer and giving life to it). This leads Gell to discuss distributed personhood, i.e. the fact that a person is defined by personal agency as an intervention in the causal milieu that generates a variety of material differences in the way things are, from which some particular agency can be abducted. So an avatar, being another kind of secondary agent, would be only one of these manifestations. We would follow this line of argument, but there are still two problems with the definitions. The first is that Gell's concept of the index, which is at the base of his concept of abduction, is too semiotic for presence theory. An index as part of a sign (despite the attribution of a causal relation with the agent designated by it) is always referring to; it is not participating in or with. It always makes something stand for something else; and thereby this something else (the signified) is not present in the phenomenon of the something (the signifier), even if it is also ‘there’ in the same environment. If we understand smoke as an index of fire, the fire is present. However, in our understanding it remains something signified and thus only mentally present, but sensually absent: we ‘know’ it to be present, but we do not perceive it as present, and that is all there is to say about it. In fact Gell lacks the concept of appresentation outlined above, according to which the fire could occur as sensually present even if only smoke is seen and smelled. This gap in Gell’s theory is problematic, especially when talking about co-presence.
Presence research indeed shows us that presence experience will start only if we conceive, sense, or feel an item as part of our original sphere or motor space. This experience might be affected, coined or even produced by our indexical understandings (we see smoke and believe the feeling of warmth to be part of the fire acting upon us; we read a reply to a question and believe in the co-presence of Eliza); but presence experience itself cannot be part of the indexical signification. It is either a conclusion or an inference from it. In order to make Gell’s theory relevant for presence and co-presence research we have to replace indexical representation with Husserl's concept of appresentation. The property of avatars as embodying something invisible (an agency or personality) and thereby making it addressable indeed requires a status beyond the neat opposition of presence and representation. In fact they are both: representations and real presences. The second problem is Gell's lack of consideration of the role of shaping or modeling what is sensually unperceivable. In fact he considers the shape of an artwork as an index only as far as it is the imitation of an already embodied agency, which he accordingly calls the ‘prototype’. However, avatars show us quite the opposite. It is not the imitation of the actual shape of a person’s body that counts for co-presence experience; rather, it is the embodiment of what a person is communicating about what she or he wants, desires or deems appropriate to her or his character and will. But desire and character themselves have no body. They are spiritual forms asking for embodiment. As argued above, it is this need that avatars respond to. Accordingly, embodying the invisible requires not a single but a double process: the abduction of agency that Gell describes, and a shaping process transmitting a spiritual form into the sphere of presentic perception.
5 Methexis: Embodiment by Means of Form
We have started our paper with a parallel to medieval theory, and we would like to finish it with a parallel to ancient theory: the Platonic concept of méthexis (literally ‘with-having’). We consider it not only the best, easiest and most practicable juxtaposition to the complex phenomenon of co-presence by avatars, but also the solution to these theoretical problems. The term méthexis means participation in both senses of the word: taking part in and having (something) in common with. The usual example of this concept is explained in the Republic (325d-353a), in which Plato compares a triangle drawn in the sand with the (essential) idea of a triangle. The realized triangle as such is only sensual, material and therefore unessential. But by having something in common with the idea of the triangle, it participates in its essential being. Méthexis therefore includes both a concept of embodiment and a concept of shaping or forming. And accordingly it is gradual: the better a triangle in the sand is realized, the more it participates in the idea of a triangle. This concept was further developed by the Neoplatonists, who conceived the ideal Platonic truth as a living divine agency. The interesting point for avatar theory is that the Neoplatonists combined this concept of méthexis with the concept of personifying allegories. If the allegories participate in the life of the ideas by means of their forms, they are able to link the concepts of agency and shape, as well as of embodiment and truth: a concept that allowed for adaptation by Christian theology (equating the Platonic truth with God and/or the Holy Spirit). For example, when Marsilio Ficino (In Platonis Ionem) theorized the inspiration of poets with the living spiritual truth by the Muses, the Muses were allegories (not goddesses). Inspiration was nevertheless literally true.
This was possible because of two premises about truth and allegory, which are equally given in the case of personality and avatars: 1) the concept of truth as a living spiritual agency, and 2) the form of the allegories being able to bear this spiritual agency (but not by means of indexicality).⁴ Now if we leave out the Platonic metaphysical assumptions, replace the term and concept of the ideas with the term and concept of personality, and open the concept of méthexis up for non-hierarchical reciprocal interaction, we are quite close to understanding how agency and personality get into avatars. Indeed the shaping of avatars shows the same gradual realization. The more a form accords with an unperceivable essence considered as the true essence of a person, the more present this person will feel ‘as’ (or ‘in’) her or his avatar. And the more the other participants of a virtual gathering believe this accordance to be at the base of the avatar, the more
⁴ In general, personifying allegories, like avatars, consist of two more or less mediated aspects: form and embodiment. In less Platonist examples the form is determined by metaphoric attributes. Here embodiment seems to be nothing but the hat-stand of metaphors. Corporality and spiritual essence are two completely distinct aspects. The most common example of this is the allegory of Justice holding a sword (metaphor for her power) and a balance (metaphor for the weighing of interests), and being blindfolded (metaphor for her disregard of the status of persons). But there are also quite different kinds of personifications, much closer to animistic or demonic concepts of the embodiment of environmental forces as agents, and much closer to our avatars of persons. Here form and embodiment are interwoven in a much more complex way, making embodying personification generate an anthropomorphized addressable juxtaposition of invisible but acting forces.
they will also appresent personality to it. Nevertheless, as in Platonic méthexis, the embodying item takes over only some qualities of the remote item by means of shaping, while the completeness of these qualities remains possible only for the unperceivable personality.
6 Conclusions: Co-presence and Avatars
When a performance of the famous Second Life avatar Anshe Chung was recently sabotaged by a hacker introducing some pink penises and other obscene images into the scene, the discussion that arose was not only about violation of copyright and about vandalism against a work of art. Guntram Gräf, the husband of Anshe Chung’s designer Ailin Gräf, tried to have the video of this very performance banned from YouTube, arguing also that he wanted to protect his wife from further defamation and sexual assaults. This may sound like somebody mixing up virtual artifacts and persons, but it does so only if we see avatars as ‘representations’ and nothing else. The aim of this paper, however, was to argue that avatars can be apt for bearing (parts of) the personality of their users: an avatar, in our eyes, can also be understood as a form of extension of personhood outside the person’s body. In a certain way, therefore, an avatar can vouch for the person’s presence in virtual reality just as a body vouches for a person’s presence in factual reality. And it is apt for co-presence experiences for this very reason. An avatar can generate some of the same apperceptions our bodies provoke in face-to-face communication. This is possible, first, because in multi-user virtual environments avatars evoke sensations of telepresence or transmission (i.e. the sense of somebody being ‘behind’ the avatar and ‘linked’ to the same environment) and not only of simulation. Secondly, it is possible because of the sense that the shape of an avatar in some way participates in the personality of the person it makes present. This can be due to an indexical reference to the creative process the person has undertaken, but also to the sense that the shape this person has chosen has something to do with her or his personality.
The sense of co-presence transmitted by avatars is accordingly due to an appresentation of personality or agency that is helped by the index of live transmission and creation. As such it is a fact helped by representation. It can also be mediated by appresentations of the truer and invisible reality of a person being as much present in the avatar’s form (i.e. shape and actions) as it would be in the person’s corporal presence (or even more). These, in our eyes, are the bases of co-presence experiences in which avatars are involved. They are basic for the case of co-presence by avatars, i.e. the sense of togetherness inside the virtual environment (‘visionary’ experience); and they are fundamental for the case of co-presence with avatars, i.e. the personal experience of being literally in the presence of somebody while being in the presence of her or his avatar (the experience of ‘appearance’). It may not be easy to test these considerations empirically. But we think that some tests could be done to show that participation is the right paradigm to describe co-presence by avatars. If the issues addressed in this paper can be examined empirically, then love, for example, should be felt as intensely as, or even more intensely than, in real life as far as mental and character-based affections are concerned (and this should also be true for imaginative sexuality as opposed to physical sexuality). A comparison with pornographic single-user video games could perhaps show or disprove the
difference, and throw light on the question of whether Anshe Chung is nothing but an artifact (or at best a work of art), or whether she can also be seen as a serious extension of Ailin Gräf’s personality.
References
1. Ditton, T., Lombard, M.: At the Heart of it All. The Concept of Presence. JCMC 3 [on-line serial] (1997), http://www.ascusc.org/jcmc
2. Eco, U.: Kant and the Platypus. Harcourt Brace, New York (2000) (orig. Kant e l’ornitorinco. Garzanti, Milano (1997))
3. Floridi, L.: The Philosophy of Presence: From Epistemic Failure to Successful Observation. Presence: Teleoperators and Virtual Environments 14, 656–667 (2005)
4. Gell, A.: Art and Agency. An Anthropological Theory. Clarendon Press, Oxford (1988)
5. Goffman, E.: Behaviour in Public Places. Notes on the Social Organization of Gatherings. The Free Press, New York (1963)
6. Husserl, E.: Cartesian Meditations. Martinus Nijhoff, The Hague (1960)
7. Palmer, M.T.: Interpersonal Communication and Virtual Reality. Mediating Interpersonal Relationships. In: Biocca, F. (ed.) Interpersonal Communication and Virtual Reality, pp. 277–299. Lawrence Erlbaum, Hillsdale (N.J.) (1995)
8. Plessner, H.: Lachen und Weinen. In: Philosophische Anthropologie. Suhrkamp, Frankfurt a. M. (1970)
9. Zhao, S.: Toward a Taxonomy of Copresence. Presence: Teleoperators and Virtual Environments 12, 445–455 (2003)
Using Memory Aid to Build Memory Independence
Quan T. Tran, Elizabeth D. Mynatt, and Gina Calcaterra
College of Computing & GVU Center, Georgia Institute of Technology, Atlanta, GA 30332 USA
{quantt,mynatt,ginac}@cc.gatech.edu
Abstract. Memory aids provide useful assistance for forgetful people. However, the inherent concern is that the convenience of memory aids can also create detrimental user dependency, thereby creating forgetful people. As a case study, I investigate how an example cooking memory aid, which summarizes which ingredients have been added and how many times, could avoid a user dependency that would otherwise atrophy the cook’s short-term memory recall. How does the cook rely on the memory aid to complete the cooking task? Does the cook use the memory aid more frequently over time? From a group of three young adult participants across five cooking sessions, I report changes in their use and nonuse of the memory aid over two weeks. The findings suggest that the young adults used the memory aid to confirm their own memory recall, thereby bolstering their self-confidence. Consequently, they came to rely on the memory aid less because they learned to trust their own memory recall more, thereby building memory independence from using the memory aid.
Keywords: Home, cooking, memory aid, personal autonomy, self-efficacy.
participants came to use the memory aid less because they learned to trust their own memory abilities more. After brief comparisons to related work, I summarize the design of the Cook’s Collage (i.e., the cooking memory aid) and the experimental procedure. I present case studies from the participants, and I discuss implications from the study results as design guidelines for memory aids in smart living spaces.
2 Related Work
The concern that technological aids have the inherent potential to create detrimental user dependency and over-reliance can apply to any domain and application. In America, using calculators to teach math has been similarly debated [6] since the 1991 National Council of Teachers of Mathematics (NCTM) Curriculum and Standards recommended that “every teacher at every level promote the use of calculators to enhance mathematics instruction.” Calculator critics argue that using the technological crutch produces students who cannot perform basic tasks without a calculator, gives students a false sense of confidence about their math ability, and keeps students from benefiting from one of the most important reasons for learning math: to train and discipline the mind and to promote logical reasoning. Calculator champions argue that calculator use allows students to spend less time on tedious calculations and more time on understanding and solving problems, simplifies tasks while helping students determine the best methods for solving problems, makes students confident about their math abilities, and allows students who would normally be turned off from math because of frustration or boredom to increase their mathematical understanding. Similar controversies apply to providing memory assistance for smart living spaces in the future home. However, I argue that not all memory aids would atrophy one’s memory abilities. It is possible to provide memory assistance that strengthens one’s memory skills. As related memory tools, brain teasers (e.g., [4]) provide entertainment and mental exercise. Rehabilitating a person’s memory abilities after a traumatic brain injury or other memory impairment is the goal of the research field of memory prosthesis and orthosis [3]. For these disabled individuals, user dependency is not detrimental but a sign of successful use, because their memory needs require step-by-step assistance in coping mechanisms and techniques.
I designed the Cook’s Collage as a cooking memory aid specifically for older adults to adopt gradually over time as the natural aging process diminishes their memory abilities [10], but busy parents and forgetful cooks also found value in having the memory aid. I believe that my design intent is similar to how technological aids of universal design emerge from assistive technology that first targets specific user groups and then broadens to a general audience. In particular, I compare the design of the Cook’s Collage to the universal design of closed captioning. As video plays, text captions are displayed that transcribe, although not always verbatim, speech and other relevant sounds. This allows people who are deaf or hard of hearing, learning a new language, beginning to read, in a noisy environment, or otherwise unable to follow the audio to read a transcript of the audio portion of a video, film, or other presentation. The term “closed” in closed captioning means that not all viewers see the captions: only those who decode
or activate them [9]. I posit that the key to assuaging concerns about user dependency on memory aids is to promote memory aids as universal design so that they become generally accepted.
3 Case Study: Cook’s Collage
I built Cook’s Collage to help experienced cooks remember which ingredients they recently added and how many times. Details of the display features and design rationale have been previously reported [7]. The experimental procedure has also been previously reported [8], but I reprise key parameters here so that I can reference them in the following results section. Each participant visited the kitchen laboratory on five days across two weeks. Each day consisted of the following four cooking trial conditions:
• P: preparing punch recipe only
• D: dual task of punch recipe with monitoring stove temperatures
• D+I: same as the above condition plus hallway interruptions
• T+D+I: same as the above condition except with cookie recipe
Participants were free to use their own memory strategies and cooking habits to prepare the given punch and cookie recipes. The cook could choose which ingredient to add in whichever sequence and could return to any ingredient for additional amounts at any time. Except for the first visit (Session 0), the cooking memory aid was provided on all the remaining cooking days (Sessions 1-4). The participants could refer to the Cook’s Collage for an at-a-glance review of their cooking progress whenever needed. Camera coverage was limited to the corner countertop area, so participants were advised to leave the mixing bowl in place. Quick hand movements could be missed by the monitoring system, so participants were advised to hold each ingredient addition momentarily over the mixing bowl. All participants were given an orientation and an informal test with the Cook’s Collage, so they all had a clear understanding of how to work within the limitations of the cooking memory aid to make use of its benefits.
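As a rough illustration of the at-a-glance review described above, the counting idea behind such a memory aid can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Cook’s Collage implementation, which relies on camera coverage of the countertop and a visual display: the class name IngredientTally, its methods, and the recipe quantities are all hypothetical and chosen only for illustration.

```python
from collections import Counter


class IngredientTally:
    """Hypothetical stand-in for the counting idea behind a cooking memory aid."""

    def __init__(self, recipe):
        # recipe maps an ingredient name to the required number of scoops or cups
        self.recipe = dict(recipe)
        self.added = Counter()

    def record(self, ingredient, count=1):
        """Log that `count` units of `ingredient` were added to the mixing bowl."""
        self.added[ingredient] += count

    def summary(self):
        """Return (ingredient, added, required) rows in recipe order."""
        return [(name, self.added[name], need) for name, need in self.recipe.items()]

    def discrepancies(self):
        """Return added-minus-required for every ingredient whose count is off."""
        return {
            name: self.added[name] - need
            for name, need in self.recipe.items()
            if self.added[name] != need
        }


# Illustrative quantities only; these are not the study's actual recipe.
tally = IngredientTally({"water": 13, "orange juice": 3})
tally.record("water", 12)
tally.record("orange juice", 3)
print(tally.summary())        # [('water', 12, 13), ('orange juice', 3, 3)]
print(tally.discrepancies())  # {'water': -1}
```

A cook returning from an interruption would consult only the summary rows; the discrepancy view is the kind of check an experimenter might use afterwards to document counting errors from the video record.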
4 Experimental Findings
Every cooking trial condition was videotaped to document cooking errors, cooking habits, and user interactions with the memory aid. After every cooking trial condition, a structured interview was given for participants to comment on intended cooking/memory strategies, memory aid usage, and a self-efficacy rating [1] of how confident they felt about their ingredient accuracy (on a Likert scale of 1-5). The goal of the case study analyses is to determine whether machine dependency on memory aids is a valid concern by cataloging whether the young adult participants came to use the memory aid more frequently and for more memory needs over time.
4.1 Mrs. P
Mrs. P is a positive example of using the memory aid to build memory independence. She used the memory aid primarily to confirm her counts, thereby bolstering her
self-confidence in her counting accuracy, thereby diminishing her need to use the memory aid. She was the only study participant who accurately added all the ingredients for all trials in all sessions. Details of Mrs. P’s case study have been previously reported [9], but her memory aid usage and self-ratings are summarized as follows. In Session 0, the first day of cooking, before the cooking aid was introduced, her self-confidence rating was lower than her actual accuracy. She doubted her counting for trial P by saying “maybe too few for the water” and for trial D by saying “I may have messed up on the water again.” She doubted her counting accuracy again for trial D+I by saying “I think I added too few of the ginger ale maybe.” In Session 1, the first day with the cooking aid, her self-confidence rose to the maximum rating, matching her actual counting accuracy. She noted using the memory aid in trial P when “I had to put in a lot of water,” and in trial D by “glancing at it, not really verifying anything though. I looked up to see what it said for the water. I knew I had put in 13 cups of water though.” For trial D+I, Mrs. P said, “I used it for all the ingredients to make sure because I had all the alarms. For the orange juice, I only put in two, and when it still said two, I put in another scoop. I waited a while first though to make sure it maybe just hadn’t caught up.” For trial T+D+I, she explained, “After I shut off the alarm, I used it to figure out what my count was. I used it to check for times I added just one scoop so I’d know if I’d actually added it or not.” In short, Mrs. P was using the memory aid to confirm her self-confidence in her counting accuracy. In Session 2, Mrs. P again used the memory aid to confirm and bolster her self-confidence about her counting accuracy.
She was fully confident after she confirmed her counts with the aid for trial P, saying, “I used it when adding the water to make sure of how many I already put in” and again for trial D+I, saying, “for the water, to make sure I did the right amount. When I came back from the alarm, I needed to check it for something, but I can’t remember what ingredient it was.” She was not as confident for trial D because she could not use the aid for confirmation, saying, “At the very end, I was trying to make sure I put everything in. I think I did the soda first, so that’s why I didn’t see it in there, but I’m not sure.” By trial T+D+I, she rarely used the memory aid and was confident in her own counting accuracy.

In Sessions 3 and 4, her self-confidence maintained its maximal rating to equal her actual accuracy. She used the memory aid rarely, only “to double-check if I put in everything correctly” for all the trials except for trial D+I in Session 4 when she “used it for the water- one time when I went to get the alarm I forgot how many I put in so I had to check it when I got back” and for trial T+D+I in Session 4 when she “used it to check my brown sugar- I wasn’t sure how much I put in when I went to the alarm during that one.”

4.2 Ms. L

Ms. L is also a positive example of using the memory aid to build memory independence: her self-confidence and ingredient accuracy steadily improved. Her initial memory strategy for the punch recipe in Session 0 was to tear tiny strips along a paper towel as tally marks, but she discarded this ad hoc strategy once given the Cook’s Collage. She used the cooking memory aid to verify the last counts of various ingredients when she had distractions. She rarely used the memory aid for
Using Memory Aid to Build Memory Independence
trial P across Sessions 1-4 when she had no distractions. By Session 4, she rarely used the Cook’s Collage for any of the trials.

In Session 0, Ms. L miscounted for every trial condition. Correspondingly, she was not confident in her counting accuracy for any of the trials. For trial P in which she added one fewer count of water, Ms. L acknowledged that she definitely miscounted, citing “too much water.” For trial D in which she added two fewer counts of water, Ms. L doubted her counting accuracy by saying “I may have added six scoops of orange juice instead of the required three scoops of orange juice and three scoops of pineapple juice.” She had placed the orange juice container aside to attend to an interruption. Upon return, she almost picked it up again but looked around her cooking area confusedly. For trial D+I in which she added one more scoop of orange juice, Ms. L doubted her counting again, saying, “I might have messed up on the concentrates.” She looked confused while working with the similar looking juice containers. For trial T+D+I in which she added two extra flour scoops, Ms. L acknowledged she definitely miscounted, citing “I added too much flour. I tried to fix it. I may have added ¼ to ½ cup too much.”

In Session 1, Ms. L reduced her counting inaccuracies, but her self-ratings were less accurate. For trial P in which she added all the ingredients correctly, Ms. L doubted her counting accuracy by saying “maybe but I’m not sure what.” For trial D in which she added one more scoop of pineapple juice, Ms. L was fully confident that she did not forget or miscount. For trial D+I in which she added one fewer sherbet scoop, Ms. L again was fully confident that she did not forget or miscount. For trial T+D+I in which she accurately added all ingredients, Ms. L was finally accurate in being fully confident in her counting accuracy.

In Sessions 2 and 4, Ms. 
L continued to add all ingredients correctly for all trial conditions and to be equally confident that she did not forget or miscount any ingredients for any trial conditions. In Session 3, Ms. L was again fully confident in her ingredient accuracy. However, she added three fewer vanilla spoonfuls for trial T+D+I. Otherwise, she was indeed correct for all other trial conditions. For trial P, Ms. L commented, “I didn’t use it [aid]. I just glanced at it.”

4.3 Mr. K

Mr. K is a positive example of using the memory aid to build memory independence: his ingredient accuracy and self-confidence steadily increased. His self-confidence primarily corresponded with his actual ingredient accuracy. His memory aid usage is interesting. He waited on the aid to confirm the final count before adding it. In Session 1, he used the aid for all ingredients. By the end of Session 2, he used the memory aid only for the final count of water. He disputed the accuracy of the memory display in Session 4 for trial D+I, saying, “That [aid showing sherbet] should be 4.”

In Session 0, Mr. K committed ingredient errors and was not confident in his counting accuracy. Mr. K added one more scoop of water for trials P and D+I, and he correspondingly self-rated his counting inaccuracies, saying “may have been off on the water.” For trial D in which he added the ingredients correctly, Mr. K again doubted his counting, saying “I may have added too many water.” For trial T+D+I in which he forgot the vanilla entirely, Mr. K was fully confident he had not forgotten or miscounted.
In Session 1, Mr. K was fully confident that he did not forget or miscount any ingredients for any trial condition. He had indeed added all ingredients correctly for all trials except trial T+D+I, in which he added two fewer spoonfuls of vanilla.

In Session 2, Mr. K accurately self-rated his counting accuracies. Mr. K added all ingredients accurately for trials P, D, and T+D+I, and correspondingly he was fully confident that he did not forget or miscount any ingredients for those trial conditions. For trial D+I in which he added one more scoop of water, Mr. K correspondingly self-rated “I may have added one more scoop of water.”

In Sessions 3 and 4, Mr. K was completely accurate with all the ingredient additions for all trial conditions. Correspondingly, he was fully confident in his ingredient accuracy.
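The case narratives above repeatedly compare two things per trial: whether an ingredient error occurred and whether the participant reported full confidence. A minimal sketch of that comparison follows, using only what is narrated for Mr. K in Session 0. Reducing the 1-5 self-rating to a yes/no "fully confident" flag, and the labels "calibrated", "overconfident" and "underconfident", are assumptions made for illustration; they are not the coding scheme used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    condition: str         # trial label from the study, e.g. "P" or "T+D+I"
    had_error: bool        # True if an ingredient was miscounted or forgotten
    fully_confident: bool  # True if the participant reported no doubt (assumed simplification)


def calibration(trial: Trial) -> str:
    """Compare reported confidence with actual accuracy (hypothetical labels)."""
    if trial.had_error and trial.fully_confident:
        return "overconfident"
    if not trial.had_error and not trial.fully_confident:
        return "underconfident"
    return "calibrated"


# Mr. K, Session 0, as described in the text above.
mr_k_session_0 = [
    Trial("P", had_error=True, fully_confident=False),      # one extra scoop of water, doubted it
    Trial("D", had_error=False, fully_confident=False),     # correct, but doubted his count
    Trial("D+I", had_error=True, fully_confident=False),    # one extra scoop of water, doubted it
    Trial("T+D+I", had_error=True, fully_confident=True),   # forgot the vanilla, fully confident
]

labels = {t.condition: calibration(t) for t in mr_k_session_0}
print(labels)
```

Tabulating these labels per session would give one rough way to see whether self-ratings and accuracy converge over time, which is the trend the case studies describe qualitatively.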
5 Discussion and Future Work

The contributions of this paper are the case study findings that suggest young adult participants were able to use the memory aid to build their memory independence. I chose to focus on three positive examples to show that user dependency is not a requisite for memory aid usage. The other three young adult participants in the study are not presented in this paper in the interest of space. Although similar in trajectory, the trends of their memory aid usage and self-efficacy ratings are less distinct than those presented in this paper. The case studies illustrate the potential to champion the immediate practical benefits of memory aids rather than to criticize their potential underlying dependency.

The curves for memory recall among the young adult participants improved incrementally over time. This slope can be attributed to how effectively the participants learned to use the memory aid. It could also be attributed to learning effects and practice curves for the repeated cooking task itself. The curves for memory aid usage among the young adult participants reach a drop-off point around the final session. This can be attributed to the trial-and-error process of the participants calibrating their cost-benefit tradeoff in depending on the memory aid. It could also be attributed to the novelty effect of the Cook’s Collage as a newly introduced memory aid. More experimental comparisons are needed to tease apart these potential factors.

The technological feasibility of the Cook’s Collage was simulated by using a Wizard of Oz technique because the computational perception requirements are still unattainable with current capabilities. The monitoring latency and the activity recognition inaccuracies limit the Cook’s Collage from being an effective memory oracle. I believe these two restrictions helped to temper the study participants’ reliance on the memory aid. 
A follow-up study will be needed to evaluate user dependency of the Cook’s Collage as a memory aid once the technological solution is robust.
References
1. Bandura, A.: Self-Efficacy: Towards a Unifying Theory of Behavioral Change. Psychological Review, 84(2), 191–215
2. Harris, J.E.: External Memory Aids. In: Gruneberg, M.M., Morris, P.E., Sykes, R.N. (eds.) Practical aspects of memory, pp. 172–179. Academic Press, London (1984)
3. LoPresti, E.F., Mihailidis, A., Kirsch, N.: Assistive Technology for Cognitive Rehabilitation: State of the Art. Psychology Press Ltd (2004)
4. Nintendo, DS Brain Age. Nikoli Co., Ltd (2006)
5. Park, D.C., Smith, A.D., Cavanaugh, J.C.: Metamemories of memory researchers. Journal of Memory in Cognition 18(3), 321–327 (1990)
6. Starr, L.: Educators Battle Over Calculator Use: Both Sides Claim Casualties. Education World: The Educators Best Friend (2002)
7. Tran, Q., Calcaterra, G., Mynatt, E.: How an Older and a Younger Adult Adopted a Cooking Memory Aid. In: Proceedings of HCII: Human-Computer Interaction International, CD-ROM (2005)
8. Tran, Q.T., Calcaterra, G., Mynatt, E. D.: Cook’s Collage: Déjà vu Display for a Home Kitchen. In: Proceedings of HOIT 2005, pp. 15–32 (2005)
9. Wikipedia: The Free Encyclopedia http://en.wikipedia.org/wiki/Closed_captioning
10. Zacks, R.T., Hasher, L., Li, K.Z.H.: Human Memory. In: Craik, F.I.M., Salthouse, T.A. (eds.) The Handbook of aging and cognition, 2nd edn. pp. 293–357. Erlbaum, Mahwah, NJ (2000)
Perception of Movements and Transformations in Flash Animations of Older Adults
Lin Wang1, Hitomi Sato2, Ling Jin1, Pei-Luen Patrick Rau1, and Yoko Asano2
Abstract. With the concurrent rapid growth of the aging population and digital science, providing appropriate information elements on computers and websites has become increasingly significant. This study examined the effects of movements and transformations in flash animations on the performance (time and error) and subjective perception (satisfaction, vision fatigue and workload) of older adults. Eighteen subjects from the University of the Third Age of Railway Ministry of China, all of whom were experienced computer and Internet users, participated in the experiment, in which flash animation mode and moving speed were manipulated as independent variables. The results indicated significant differences among the four animation modes for performance (time and error) and vision fatigue. Significant differences were also found among the three levels of moving speed for performance (time and error) and vision fatigue. Further implications for the design of flash animations for the elderly are discussed.
Keywords: Flash Animations, Older Adults, Movements, Transformations.
2 Literature Review

As people age, they usually experience declines in many of their capabilities, which can affect their ability to use computers and other related technologies. Physical limitations, such as reduced dexterity and precision, can make the use of input devices like mice and keyboards more difficult. Sensory limitations create challenges for the design of computer output. Cognitive limitations affect the design of the interaction itself, with challenges caused by factors such as poorer memory, greater difficulty in learning and slower responses, as well as lower levels of familiarity with computers and associated interfaces [10]. The combined aging effects contribute to loss of confidence and to difficulties in orientation and absorption of information [14].

Among all the aging effects, the most important is the visual aging effect, which is the first obstacle to accessing information presented on computer screens. There are age-related vision deficits both in static contrast sensitivity and discrimination and in dynamic acuity and spatial contrast sensitivity. The vision deficits increase with object motion speed [8]. The upper limit on dynamic sensitivity is determined largely by sensitivity for stationary targets. Thus, the dynamic resolution levels of older adults can be enhanced by optimizing their static acuity [8], [12].

This research chose the widely used flash animations as the research object. Flash is one of the primary tools for creating sophisticated applications, full-fledged games, rich multimedia, and complex interactivity for the Internet because of its outstanding interaction and graphic characteristics [7], [11]. The elders have many chances to encounter flash animations, perhaps even more than young people. For example, flash has been used in the teaching tools that are used to teach computer skills to the elders. Flash is also a popular tool used for entertainment and interaction. 
Its brighter colors, simplicity of format and high novelty reduce boredom and fatigue, which are among the obstacles that inhibit elders from using computers and the Internet [1]. Unfortunately, these same characteristics make appropriate use of flash animations difficult. Flash has been heavily criticized for poor usability since 2000 [9]. Some design guidelines forbid the use of flash as a convenient solution [15]. For the elders, the situation is worse: they are easily disturbed and tired by flickering, moving and deforming animations.

In order to curb the misuse of flash, this research analyzed the basic composition of flash animations. Statistics on flash elements in popular websites were used as the levels of the experiment variables. Finally, flash design guidelines were derived, which should be useful for interface and flash designers. The following model shows the composition of flash animations. Flash animations are generated by three basic animation modes: movements, transformations, and the combination of the two. Different speeds of movement constitute the moving animations. Changes in size, contour, color, brightness and transparency constitute the transformation animations.
L. Wang et al.
Fig. 1. Composition of flash animations
3 Research Questions and Methodology

Since flash animations are composed of basic movements and transformations, two research questions need to be considered:

Research Question I: Among the flash animation modes, which mode is preferred by the elders, meaning lower vision load and workload and better performance?

Research Question II: What speed of movement in flash animations is suitable for the elders, with lower vision load and workload and better performance?

In order to keep objects in foveal vision, two different kinds of eye movements exist: pursuit movements and saccadic movements. In pursuit movements, the eye follows a target moving across the visual field. Saccadic movements are discrete, jerky movements that jump from one stationary point in the visual field to the next. A saccade is used to catch up and bring the target back into foveal vision if the object moves at high speed [13]. Saccadic behavior has two components: saccade and fixation. During the saccade, the visual system suppresses visual input [3], so display information can be properly processed only during fixation. Movements in flash animations need both kinds of eye movements, while transformations need only pursuit movements.

According to the above theory, this study assumed that when only one of the two animation modes existed, movements needed more eye movements than transformations. Thus, movements would impose more vision load and workload than transformations and would lead to worse performance. When the two animation modes existed together, the vision load and workload would be the highest and the performance the worst, compared with conditions in which at most one animation mode existed.

3.1 Variables and Task

Data were collected on performance time, errors, vision fatigue, satisfaction and task load as dependent variables. Performance time was the total time required to complete
each visual search task. Error was the total number of excess clicks used to perform the tasks. Vision fatigue was measured by a visual perception evaluation questionnaire. Satisfaction was the score obtained through a general satisfaction questionnaire, a modified version of the satisfaction measure used by Cook [4]. Task load was the score obtained through the general task load questionnaire used by the National Aeronautics and Space Administration [6]. Two independent variables were manipulated in the experiment: flash animation mode and moving speed. Flash animation mode had four levels: no animation mode, movement mode, transformation mode, and combination mode with both movements and transformations. Moving speed had three levels, 60 pix/s, 200 pix/s, and 300 pix/s, obtained from statistical data on flash elements used in popular websites.
Fig. 2. Movements and transformations in flash animations
Visual search tasks were used in the experiment. Eight items were presented on the computer screen at one time, all with one flash animation mode or moving speed. The participants were asked to click on the target item, which was shown in the top left corner of the screen.
Fig. 3. The interface of the experiment
Six types of interface designs were developed for the system to manipulate animation modes and moving speeds. Four types of the interface were aimed at finding out the perceptive differences among different modes of animations from A1 to A4. The other two types combined with one type in the previous ones were used to test the perception differences of different moving speeds from B1 to B3. They are as follows:

Table 1. Description of the prototypes

Serial code   Animation description                                         Transformation speed (pix/s)   Moving speed (pix/s)
A1            No animation mode                                             0                              0
A2            Transformation mode                                           3.5                            0
A4            Combination mode with both movements and transformations     3.5                            200
B1            Moving with low speed                                         0                              60
A3 or B2      Movement mode or, moving with medium speed                    0                              200
B3            Moving with high speed                                        0                              300
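The six prototypes in Table 1 can be written down as a small lookup structure, which makes it explicit that the movement-mode prototype (A3) and the medium-speed prototype (B2) are the same design reused in both comparisons. This is only an illustrative sketch: the class name, field names and the `level` helper are hypothetical and are not taken from the experiment software.

```python
from typing import NamedTuple, Tuple


class Prototype(NamedTuple):
    codes: Tuple[str, ...]        # serial code(s) from Table 1
    description: str
    transformation_speed: float   # pix/s
    moving_speed: float           # pix/s


# Values copied from Table 1.
PROTOTYPES = [
    Prototype(("A1",), "No animation mode", 0.0, 0.0),
    Prototype(("A2",), "Transformation mode", 3.5, 0.0),
    Prototype(("A4",), "Combination mode", 3.5, 200.0),
    Prototype(("B1",), "Moving with low speed", 0.0, 60.0),
    Prototype(("A3", "B2"), "Movement mode / medium speed", 0.0, 200.0),
    Prototype(("B3",), "Moving with high speed", 0.0, 300.0),
]


def level(code: str) -> Prototype:
    """Return the prototype that carries the given serial code."""
    for prototype in PROTOTYPES:
        if code in prototype.codes:
            return prototype
    raise KeyError(code)


# Levels of the two independent variables.
mode_levels = [level(c) for c in ("A1", "A2", "A3", "A4")]
speed_levels = [level(c) for c in ("B1", "B2", "B3")]

print([p.moving_speed for p in speed_levels])
```

Because A3 and B2 resolve to the same entry, the two comparisons share one set of measurements for that prototype, which is why the same mean time appears for A3 and B2 in the results.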
3.2 Participants

Eighteen older adults participated in the experiment. The ages of the participants ranged from 55 to 75, with a mean of 64.3 years and a standard deviation of 6.6 years. Their mean computer usage was 47.6 months, with a standard deviation of 39.6 months. Although computer proficiency was not very important in this experiment, we chose all the subjects from students in the computer class of the University of the Third Age of Railway Ministry of China. There were several advantages to choosing elders from this university. First, they were not afraid of new computer-related technology. Second, they were willing to take part in the experiments because they thought that the experiments would provide them with knowledge that could not be learned in the classes. Third, they always actively shared what they felt about the prototypes with the experiment organizers.

3.3 Procedure

The experiment was conducted in a quiet computer classroom in the University of the Third Age of Railway Ministry of China. It took each participant approximately half an hour. At the beginning of the experiment, each participant was asked to read and sign the consent form, and then to fill in a general information questionnaire concerning personal characteristics and basic information about computer and Internet usage. The experiment contained six exercises and six groups of experiment tasks. The purpose of the exercises was to familiarize the participants with the operations in the following experiment tasks. After they finished the tasks, the results were presented on the computer screen, showing that they had successfully finished the task. At the same time, participants were required to fill in the questionnaire according to their personal feelings and attitudes.
4 Results and Discussion

4.1 Results of Research Question I

The intention of this research question was to find out the differences in the performance and perception of older adults across flash animation modes. The response variables of time, error, vision fatigue, satisfaction and task load were used to answer the question. Data from these planned comparisons are presented in the following table.

Table 2. Overview of the results (Research question I). Dependent variables: Time (ms), Error rate, Vision fatigue, Satisfaction, Task load.
As presented in Table 2, the planned comparisons among the four animation modes show significant differences in performance time (F(3, 68)=6.98, p=0.0004*), error rate (H=55.76, p<0.0001*), and vision fatigue (F(3, 68)=3.04, p=0.0346*). There were no significant differences for satisfaction and task load. To learn more about the differences among the four animation modes, each pair of modes from A1 to A4 was compared separately. The performance time comparisons show significant differences between A1 and A3 (p=0.0094*), A1 and A4 (p=0.0001*), and A2 and A4 (p=0.0015*) using the LSD test. Mean time with movement mode (A3: M=3564.1, SD=1237.9) was 38.95% longer than with no animation mode (A1: M=2565.1, SD=524.7); mean time with combination mode with both movements and transformations (A4: M=4273.1, SD=1631.4) was 66.59% longer than with no animation mode (A1) and 55.02% longer than with transformation mode (A2: M=2756.5, SD=489.1). The error rate data could not be used in the LSD test, so the Kruskal-Wallis test was used to compare the differences. There were significant differences among the animation modes from A1 to A4. The test results show that the participants had the lowest error rate with the no animation mode (A1).
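The percentages quoted above appear to be the relative increase of the slower condition's mean time over the faster condition's mean time, and the reported degrees of freedom are what a one-way analysis with 18 observations per condition would give. The short check below reproduces both from the figures in the text. The function names are hypothetical, and treating the 18 participants' scores in each condition as independent groups is an inference from the reported F(3, 68), not something the paper states.

```python
def relative_increase(slower: float, faster: float) -> float:
    """Percentage by which the slower mean exceeds the faster mean."""
    return (slower - faster) / faster * 100.0


def one_way_df(groups: int, per_group: int) -> tuple:
    """Between- and within-group degrees of freedom for a one-way ANOVA."""
    return groups - 1, groups * per_group - groups


# Mean performance times in ms, as reported for the four animation modes.
mean_time_ms = {"A1": 2565.1, "A2": 2756.5, "A3": 3564.1, "A4": 4273.1}

for slow, fast in [("A3", "A1"), ("A4", "A1"), ("A4", "A2")]:
    pct = relative_increase(mean_time_ms[slow], mean_time_ms[fast])
    print(f"{slow} vs {fast}: {pct:.2f}% longer")

print(one_way_df(groups=4, per_group=18))  # animation modes
print(one_way_df(groups=3, per_group=18))  # moving speeds
```

Read this way, "X% faster" in the text means that the slower condition took X% more time than the faster one, not that the faster condition took X% less time.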
The vision fatigue comparisons show significant differences between A1 and A2 (p=0.0319*), A1 and A3 (p=0.0498*), and A1 and A4 (p=0.0055*) using the LSD test. Compared with no animation mode (A1: M=10.72, SD=3.98), vision fatigue was 43.00% higher with transformation mode (A2: M=15.33, SD=7.28), 36.29% higher with movement mode (A3: M=14.61, SD=5.09), and 50.28% higher with combination mode with both movements and transformations (A4: M=16.11, SD=6.27).

4.2 Results of Research Question II

The intention of research question two was to find out the differences in the performance and perception of older adults across moving speeds in flash animations. The response variables of time, error, vision fatigue, satisfaction and task load were used to answer the question. Data from these planned comparisons are presented in the following table.

Table 3. Overview of the results (Research question II). Dependent variables: Time (ms), Error rate, Vision fatigue, Satisfaction, Workload.
As presented in Table 3, the planned comparisons among the three moving speeds in flash animations show significant differences in performance time (F(2, 51)=19.99, p<0.0001*), error rate (H=39.33, p<0.0001*), and vision fatigue (F(2, 51)=3.49, p=0.0379*). There were no significant differences for satisfaction and task load. The performance time comparisons show significant differences between B1 and B3 (p<0.0001*) and between B2 and B3 (p<0.0001*) using the LSD test. Mean time with high moving speed (B3: M=6207.9, SD=2329.5) was 120.15% longer than with low moving speed (B1: M=2819.8, SD=703.3) and 74.18% longer than with medium moving speed (B2: M=3564.1, SD=1237.9). The error rate data could not be used in the LSD test, so the Kruskal-Wallis test was used to compare the differences. There were significant differences among the moving speeds from B1 to B3. The test results show that the participants had the lowest error rate with the low moving speed flash animations (B1). The vision fatigue comparisons show a significant difference between B1 and B3 (p=0.0149*) using the LSD test. Vision fatigue with high moving speed (B3: M=18.50, SD=8.22) was 37.65% higher than with low moving speed (B1: M=13.44, SD=4.75).

4.3 General Discussion

The study first presented the composition model of flash animations and identified that flash animations can be decomposed into two basic animations, movements and transformations. Then, according to aging effect theories, research questions were proposed to investigate the different influences of the no animation, movement, transformation and combination modes. Performance time, errors, and vision fatigue showed significant differences for both flash animation mode and moving speed. However, there were no significant differences for the dependent variables of satisfaction and task load. This may be because the experiment system was not complicated enough for the elders to tell the differences. During the experiment preparation period, considering the aging effect, the prototype was made interesting and designed not to demand too much cognitive and task load from the elders. The elders found that the more complicated the prototype was, the more excited they felt. They also had more compliments when they completed a more complicated task. In summary, it is better to provide older users with website elements without animations. However, when animations are needed, a suitable approach is to use animations with only movements or only transformations. The combination mode is not recommended. When moving animations are used, low speed moving animations are recommended.
5 Results and Discussion

The purpose of this study was to investigate the different impacts of flash animations on the computer performance and perception of older adults and how to design appropriate flash animations for elders. Four animation modes were studied, namely no animation mode, movement mode, transformation mode and combination mode. A flash composition model was developed to represent the composition of flash animations. The essence of the results can be summarized as follows:
1. In the flash animation mode test, performance time with movement mode was 38.95% longer than with no animation mode; performance time with combination mode (both movements and transformations) was 66.59% longer than with no animation mode and 55.02% longer than with transformation mode.
2. In the flash animation mode test, the error rate with no animation mode was the lowest.
3. In the flash animation mode test, vision fatigue with transformation mode was 43.00% higher than with no animation mode; vision fatigue with movement mode
was 36.29% higher than with no animation mode; vision fatigue with combination mode (both movements and transformations) was 50.28% higher than with no animation mode.
4. In the flash moving speed test, performance time with high moving speed was 120.15% longer than with low moving speed and 74.18% longer than with medium moving speed.
5. In the flash moving speed test, the error rate with low moving speed was the lowest.
6. In the flash moving speed test, vision fatigue with high moving speed was 37.65% higher than with low moving speed.

Based on the conclusions of this study, the following design guidelines for flash designers are recommended:
• Complicated flash animations that combine movements and transformations cause higher vision fatigue and poorer performance, and should be avoided.
• High speed movements should be avoided in flash animations. 200 pix/s should be the upper limit.
• The elders differ in their perceptual and cognitive abilities to process visual information. Flash animations designed for the elders should provide an animation adjustment function (moving speed, transformation speed and so on) to meet the different needs of older users.
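The second and third guidelines above (an upper limit of 200 pix/s and a user-adjustable speed) can be combined in a single rule: let the user scale the designed speed, but never let the result exceed the recommended limit. The sketch below is one possible reading of those guidelines. The function name, the `user_scale` parameter and the idea of a multiplicative scale are assumptions for illustration, not part of the study.

```python
MAX_MOVING_SPEED = 200.0  # pix/s, the upper limit recommended in the guidelines


def effective_speed(designed_speed: float, user_scale: float = 1.0) -> float:
    """Apply a user-chosen scale to a designed moving speed, capped at the limit.

    designed_speed: speed chosen by the designer, in pix/s
    user_scale: hypothetical user preference, e.g. 0.5 for half speed
    """
    if designed_speed < 0 or user_scale < 0:
        raise ValueError("speed and scale must be non-negative")
    return min(designed_speed * user_scale, MAX_MOVING_SPEED)


# The three speed levels tested in the experiment, at the default scale.
for speed in (60.0, 200.0, 300.0):
    print(speed, "->", effective_speed(speed))

# A user who slows everything to half speed.
print(effective_speed(300.0, user_scale=0.5))
```

Capping after scaling means that a user can always slow an animation down, while speeding it up never pushes it past the speed at which performance and vision fatigue degraded in the experiment.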
References
1. Austin-Wells, V., Zimmerman, T., & McDougall, G. J.: An optimal delivery format for presentations targeting older adults. Educational Gerontology 29 (2003) 493-501
2. Chadwick-Dias, A., McNulty, M., & Tullis, T.: Web Usability and Age: How Design Changes Can Improve Performance. Paper presented at the CUU’03 (2003)
3. Chase, R., & Kalil, R. E.: Suppression of visual evoked responses to flash and pattern shifting during voluntary saccades. Vision Research 12 (1972) 215-220
4. Cook, J. R.: Cognitive and social factors in the design of computerized jobs. Purdue University (1991)
5. Goodman, J., & Lundell, J.: HCI and the older population. Interacting with Computers 17 (2005) 613-620
6. Hart, S., & Staveland, L.: Development of NASA TLX (Task Load Index): results of empirical and theoretical research. North-Holland Elsevier Science (1988)
7. Holzinger, A., & Ebner, M.: Interaction and usability of simulation & animations: A case study of the flash technology. In: M. Rauterberg, M. Menozzi & J. Wesson (Eds.): Human-computer interaction INTERACT 2003 (2003) 777-780
8. Kline, D. W., & Scialfa, C. T. (Eds.): Sensory and perceptual functioning: Basic research and human factors implications. California: Academic Press, Inc. (1996)
9. Nielsen, J.: "Flash: 99% Bad" Alertbox Column. Retrieved Oct. 24th, 2006, from http://www.useit.com/alertbox/20001029.html (2000)
10. Rama, M. D., de Ridder, H., & Bouma, H.: Technology generation and Age in using layered user interfaces. Gerontechnology 1 (2001) 25-40
11. Schaller, D. T., Allison-Bunnell, S., Chow, A., Marty, P., & Heo, M.: To Flash or Not To Flash? Usability and User Engagement of HTML vs. Flash. Paper presented at the Museums & the Web 2004 (2004)
12. Scialfa, C. T., Garvey, P. M., Tyrrell, R. A., Goebel, C. C., Deering, L., & Leibowitz, H. W.: Relationships among measures of static and dynamic visual sensitivity. Human Factors 30 (1988) 677-687
13. Young, L. R., & Stark, L.: A discrete model for eye tracking movements. IEEE Transactions on Human Factors in Electronics 2 (1963) 38-51
14. Zajicek, M.: Successful and available: interface design exemplars for older users. Interacting with Computers 16 (2004) 411-430
15. Zaphiris, P., Ghiawadwala, M., & Mughal, S.: Age-centered Research-Based Web Design Guidelines. Paper presented at the CHI 2005, Portland, Oregon, USA (2005)
Studying Utility of Personal Usage-History: A Software Tool for Enabling Empirical Research
Kimmo Wideroos and Samuli Pekkola
Department of Computer Science and Information Systems, University of Jyväskylä, P.O. Box 35, FI-40014 University of Jyväskylä, Finland
{kimmo.wideroos,samuli}@jyu.fi
Abstract. Managing personal information space and working context is complicated in a computerized environment. One well-known cause of the problem is that digital information is superficially fragmented into different data types and structures. Several unifying approaches have been proposed to facilitate semantic connections between them. Particularly in personal information retrieval, temporal information has turned out to be useful. Hence, in this article, we present an empirical research setting for studying the utility of representing personal usage-history in information retrieval by comparing it with a more traditional hierarchical representation. The research setting is based on a software tool that is described in the article.
Keywords: Personal Information Management, Information Retrieval, Information Visualization, Personal Usage-History.
and applying these results in the personal information management area [11]. Temporal information has been found useful in personal information retrieval, as users usually prefer temporal rankings of search results [10]. Personal information management strategies, tools and needs vary from person to person, making the phenomenon challenging to study. As usual, there are two alternatives. Real-life observations are essential because of the idiosyncratic nature of the phenomenon. However, laboratory studies are also needed to understand more about the general aspects of personal information management [6]. Thus, in this article, we focus on laboratory studies. We describe a software tool that facilitates studying differences in information retrieval performance (with and without representing the subject’s subjective markings, i.e. landmarks) when using a spatial view of a personal usage-history and a spatial view of a document collection.
2 Personal Information Management (PIM)

Personal information management is a vague term that is used from several points of view. In [12], we listed several definitions for PIM. First, PIM can be defined by listing the activities that belong to personal information management (e.g. retrieval). PIM can also be considered as a tool facilitating personal information management (e.g. e-mail), and as software that can be compared to other kinds of software. For example, PIM systems are more intimate and more in tune with the way people think and work than other information systems designed for more general use [1]. So, what are the characteristics that make (or should make) personal information management tools “more in tune with the way people think and work”? We suggest the following aspects.

2.1 Landmarks of Action

“Landmarks refer to the ‘stages’ that have substantial influence on the user’s intentional context in the course of actions. They include, for example: accomplishing a task, being interrupted unexpectedly, shifting between tasks, recognizing something that might well be valuable with respect to another situation when doing something else, and so forth.” [11]. A landmark lies at the intersection of the subjective present and the contemporary meaningful content. The subjective present underlines the subject for whom something is present. Extending the idea of the subjective present to the past leads to the idea of the past present [11]. An event in the subjective past is something that has been perceived as present by a certain person at that moment of time. A landmark as such is an object arising from the user’s activity upon the text. Hence, a landmark can be considered a part of the user’s interactional context [3].

2.2 Personal Usage-History

From the user’s point of view, a common denominator that emphasizes the nature of personal information management is the fact that unstructured shifts between activities on different contents occur in the course of time.
There is no need to change this, but one can take advantage of it by storing the sequences in relation to time and creating a personal usage-history. This history serves as a context for the user’s landmarks.
2.3 User-Subjective Approach

The user-subjective approach to personal information management emphasizes the subjective attributes the user gives to the data [2]. It is a theoretical underpinning for the following principles, which should be taken into consideration when designing personal information management systems. First, regardless of technological format, all information related to the same subjective topic should be classified together (subjective classification principle). Second, the degree of accessibility and visual salience of information should bear some relation to the subjective importance of the information (subjective importance principle). Third, information should be retrieved and viewed by the user in the same context in which it was previously used (subjective context principle). The idea of user-made landmarks and personal usage-history is congruent with the user-subjective approach.
3 Studying Utility of Personal Usage-History

In order to test the utility of the idea of personal usage-history and landmarks, we have designed a software tool that allows testees to search for information in a collection of documents and to make landmarks by selecting parts of the text. The application is able to show the testee’s search history and the document collection with or without showing the selections the testee has made. The idea of the two views (hierarchical and historical) is depicted in Fig. 1. The hierarchical view corresponds to the structured representation of data, while the historical view captures the testee’s browsing history on those documents. Figure 1 also illustrates the landmarks of action the testee has made (Fig. 1c-d).

Fig. 1. a.) Hierarchical and b.) historical views without selections, and with selections SEL 1-4 in c.) and d.), respectively
3.1 Test Subject’s Perspective

From the test subject’s perspective, a test session is divided into two main stages.

First Stage. In the first stage of the experiment, the test person is given a list of themes (s)he is supposed to search for in a document set. Instructions are shown in the same view as the documents (Fig. 2a). The testee is able to inspect the instructions at any time during the first part of the test by pressing a button in the document viewer window. The list of themes (and instructions) may change over time. For example, the testee can get a list of two themes in the beginning, another two themes after some time, and so on. How the instructions are given is configured by the researcher. During the first part of the experiment, the Tool appears as a text browser that shows one document at a time. The document is one from a collection of documents, each on a different theme. The Tool also allows reading the document, marking parts of it and shifting to another document. The testee is instructed to search for information about each theme in the documents and to select appropriate extracts. Selections are done by pasting the text and double-clicking in order to confirm the selection.

Second Stage. A fixed set of questions is used in the second stage. The questions are related to the themes of the first stage. When a question is put to the testee, (s)he is shown a spatial representation of the document collection. The representation is either hierarchical (see Fig. 2b) or a history of her/his search process during the first stage (see Fig. 2c). The testee is instructed to locate the place where (s)he would start searching for an answer, i.e. the most probable location of the answer. Each question is asked six times so that all view-question combinations are covered.
All basic view-question combinations are run through three times: a first round without any additional hints, a second round with the places of the selections visible (red bars in Fig. 2b-c), and a third round with the actual extracts of text visible (Fig. 2d). Each round includes both the hierarchical and the historical view.

3.2 Researcher’s Perspective

Next, the configuration of the test environment and the test session is described from the researcher’s perspective.

Document Collection. In our preliminary test setting, we used 11 documents that were gathered from the Finnish Wikipedia. The collection included 2 biographical articles about classical composers, 2 articles about novel authors, and portrayals of 3 different countries and 4 cities. Most of the articles were featured articles (i.e. considered high quality).

Theme Instructions. The testees were given 3 sets of instructions, each including 2 themes. The overall time was restricted to some ten minutes. The themes were hints to search for a certain kind of information in the document collection; the testees were to mark each piece of information they considered relevant with respect to the themes given in the instructions. Marking a piece of text was a means for the testee to establish a landmark.
Fig. 2. a.) Selection made to a document; b.) Hierarchical view showing selection; c.) Historical view; d.) Selection with extract from document
The selections made by the user were recorded to a file. These records included a timestamp (both the start and end times of making the selection), the document name and the selection’s location in the document. A record was also made whenever the contents of the document viewer window changed. In addition to a timestamp, these records included information about the document shown in the viewer window (i.e. the document’s name and visible area).

Test Protocol. The test software can be customized for different kinds of test settings. This is done by defining configurations for the views to be applied in the test. For example, we used 6 views in our preliminary test setting (see Table 1). An individual test protocol can also be defined for each testee.

Table 1. View configurations used in the test

View  Type          Selections  Selection Text
1     Hierarchical
2     Historical
3     Hierarchical  X
4     Historical    X
5     Hierarchical  X           X
6     Historical    X           X
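The view configurations of Table 1 and the resulting test protocol can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Tool: the names `ViewConfig`, `VIEWS` and `build_protocol` are hypothetical, and shuffling the trials within each round is an assumption about what a "randomized test protocol" might look like.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewConfig:
    view_id: int
    view_type: str         # "hierarchical" or "historical"
    show_selections: bool  # places of the selections are visible
    show_text: bool        # extracts of the selected text are visible


# The six view configurations listed in Table 1.
VIEWS = [
    ViewConfig(1, "hierarchical", False, False),
    ViewConfig(2, "historical", False, False),
    ViewConfig(3, "hierarchical", True, False),
    ViewConfig(4, "historical", True, False),
    ViewConfig(5, "hierarchical", True, True),
    ViewConfig(6, "historical", True, True),
]


def build_protocol(num_questions=6, seed=None):
    """Return a list of (question, view) trials in three rounds.

    Round 1 uses views 1-2, round 2 views 3-4, round 3 views 5-6.
    Shuffling within a round is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    rounds = [VIEWS[0:2], VIEWS[2:4], VIEWS[4:6]]
    protocol = []
    for round_views in rounds:
        trials = [(q, v) for q in range(1, num_questions + 1)
                  for v in round_views]
        rng.shuffle(trials)
        protocol.extend(trials)
    return protocol
```

With 6 questions and 6 views, the sketch yields one trial per view-question combination, so each question appears six times per testee.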
Retrieving Information to Questions. We had 6 questions, which were related to the themes the testees were instructed to search for in Stage 1. The questions were asked according to a randomized test protocol. First, all questions were asked so that views 1 and 2 were shown (see Table 1), then using views 3 and 4, and finally views 5 and 6. Thus, the number of answers was 36 per testee.

Rating Answers. Each answer given by the user was compared to the corresponding predefined correct answer. The correct answer to a question is a selection (or set of selections) from a document (or document collection) that the researcher has considered the best one. After the testee had pointed out a location for an answer, the distance between the answer entered by the testee and the correct one was calculated. In the hierarchical view, the distance was simply the number of lines between the given answer and the correct answer. In the historical view, the testee’s answer was first converted from the time domain to the corresponding hierarchical location in the document collection in order to make calculating the distance possible. If the testee located her/his answer in a wrong document, it was assigned the maximum distance (the number of lines in the whole document collection).

Analysing Results from the Preliminary Test. We conducted a small test with three testees. Each testee answered 6 questions using 6 views, making 36 answers altogether. The views were ranked question-wise, according to the exactness of the answers, and separately for all testees. In order to get an idea of how the different views performed with respect to a specific question, all testees’ ranks were summed up question-wise. These results are illustrated in Fig. 3. The results show that view performance varies considerably from one question to another, emphasizing the importance of the phrasing of questions and the type of documents. We also computed an overall performance for each view by summing all of its ranks. These results are shown in Fig. 4. On the one hand, the results suggest that the selections of extracts really make a difference, and that the historical view can outperform the hierarchical view when selections, either their places or actual extracts, are not shown at all. On the other hand, showing the text extracts does not seem to improve the performance of either view.

Fig. 3. Question-wise performance of views (results from preliminary test with 3 testees)
Fig. 4. Overall performance of views (results from preliminary test with 3 testees)
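The rating and ranking procedure described in Section 3.2 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the Tool's actual implementation: the names `Location`, `ViewEvent`, `distance`, `history_to_location`, `rank_views` and `sum_ranks` are hypothetical, the browsing history is assumed to be a list of time intervals each tied to one document location, and tied views are assumed to share a rank (the paper does not say how ties were handled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    doc: str   # document name
    line: int  # line number within that document


@dataclass(frozen=True)
class ViewEvent:
    start: float        # time the location became visible
    end: float          # time the viewer contents changed again
    location: Location  # what was visible during [start, end)


def distance(answer, correct, total_lines):
    """Lines between answer and correct answer; the maximum
    distance (size of the whole collection) for a wrong document."""
    if answer.doc != correct.doc:
        return total_lines
    return abs(answer.line - correct.line)


def history_to_location(t, history):
    """Convert a point in time picked in the historical view to the
    document location that was visible at that time (assumed model)."""
    for event in history:
        if event.start <= t < event.end:
            return event.location
    raise ValueError("time not covered by the recorded history")


def rank_views(distances):
    """Map view id -> rank (1 = most exact); ties share a rank."""
    ordered = sorted(set(distances.values()))
    return {view: ordered.index(d) + 1 for view, d in distances.items()}


def sum_ranks(per_testee_distances):
    """Sum the ranks of each view over all testees for one question."""
    totals = {}
    for distances in per_testee_distances:
        for view, rank in rank_views(distances).items():
            totals[view] = totals.get(view, 0) + rank
    return totals
```

A lower rank sum indicates a view whose answers were closer to the correct locations. Summing the question-wise totals over all questions would give an overall figure per view, in the spirit of Fig. 4.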
4 Discussion

Frequent shifts between the hierarchical and historical views were problematic for the testees. Two solutions to this problem are apparent. If a within-subject design is considered, both views should be shown at the same time and the testee should be asked to point out the location of the answer to the question in both views. Another solution would be to use a between-subject test design where one group uses only the hierarchical and the other group only the historical view. Both solutions would simplify the test substantially: compared to our preliminary test setting, only half of the test questions are needed. The number of questions needed can also be reduced by omitting the views that show selection texts, because it seems that selections without text already make the difference (see Fig. 4).

The testees found the documents interesting, but the limited time (ca. 10 minutes) was considered too short for familiarizing oneself with the documents and searching for extracts related to the test themes. This can be solved either by giving the testees more time (up to some limit) or by decreasing the number (or size) of documents and task themes. If the test design is simplified as outlined earlier (i.e. no redundant questions needed), the time saved in the second stage of the test can easily be invested in the first part of the test.

As Fig. 3 indicates, the questions and tasks need to be considered carefully. If the testees have enough time to search for information in the first stage of the test, task themes and questions can be made more challenging. Examples are tasks that force the testee to browse several documents to complete the search, e.g. tasks where the testee first has to find the theme in the documents in order to be able to find the information. In addition, when applying a test design in which no redundant questions are needed, the number of unique questions can be increased in the second stage, making the test situation more realistic.

However, our sample is still very small and no far-reaching conclusions can be made. Nevertheless, the test setting and the Tool have proven to be valuable methods that are able to capture the nuances and differences between the hierarchical and historical views, illustrating the utilization of personal usage-history.
References

1. Barreau, D.K.: Context as a Factor in Personal Information Management Systems. Journal of the American Society for Information Science 5 (1995) 327-339
2. Bergman, O., Beyth-Marom, R., Nachmias, R.: The user-subjective approach to personal information management systems. J. Am. Soc. Inf. Sci. Technol. 9 (1997) 872-878
3. Dourish, P.: What we Talk about when we Talk about Context. Personal and Ubiquitous Computing 8 (2004) 19-30
4. Dumais, S., Cutrell, E., Cadiz, J., Jancke, G., Sarin, R., Robbins, D.C.: Stuff I've seen: a system for personal information retrieval and re-use. In: SIGIR '03: Proc. of the 26th annual international ACM SIGIR conference on Research and development in information retrieval. ACM Press (2003) 72-79
5. Karger, D. R., Jones, W.: Data unification in personal information management. Commun. ACM 1 (2006) 77-82
6. Kelly, D.: Evaluating personal information management behaviors and tools. Commun. ACM 1 (2006) 84-86
7. Malone, T. W.: How do people organize their desks?: Implications for the design of office information systems. ACM Trans. Inf. Syst. 1 (1983) 99-112
8. Krishnan, A., Jones, S.: TimeSpace: Activity-Based Temporal Visualisation of Personal Information Spaces. Personal and Ubiquitous Computing 1 (2005) 46-65
9. O'Conaill, B., Frohlich, D.: Timespace in the workplace: dealing with interruptions. In: Proc. of CHI '95. ACM (1995) 262-263
10. Ringel, M., Cutrell, E., Dumais, S., Horvitz, E.: Milestones in time: The value of landmarks in retrieving information from personal stores. In: Proc. of Interact 2003, the Ninth IFIP TC13 Int. Conference on HCI (2003) 184–191
11. Wideroos, K., Pekkola, S.: Presenting the Past: A Framework for Facilitating the Externalization and Articulation of User Activities in Desktop Environment. In: Proc. of the 39th Annual Hawaii International Conference on System Sciences (2006)
12. Wideroos, K., Pekkola, S.: Support for Subjective Time in Personal Information Management. In: Proceedings of IRIS-26. Kristiansand, Norway (2005)
Enable the Organization for UCD Through Specialist and Process Counseling
Natalie Woletz1 and Susanne Laumann2
Abstract. This paper describes two generic counseling approaches, valuable in the field of User Centred Design. The paper differentiates the areas of User Experience from User Centered Design as a holistic approach. The conclusions drawn will suggest which type of consultancy approach to best use for which type of service. With this, the authors especially address external usability consultants.
Keywords: User Centred Design, Process Counseling, Specialist Counseling, Usability, User Experience, Usability Maturity, Usability Consultant.
2 Two Counseling Approaches

Over time, many types of counseling have emerged in various businesses. However, we want to describe two "archetypes", which turned out to be the most helpful and typical ones in the usability business. The two "archetypes" are specialist counseling and process counseling.

Specialist counseling is usually characterized by a clearly defined client commission, and straightforward execution of tasks is expected. The consultants are the experts who perform tasks with specific know-how and hand over a dedicated deliverable as a result. Typically, these tasks are on an operational level, product-oriented and very hands-on. A detailed analysis of the client's organization is usually not desired.

Process counseling, on the other hand, focuses on helping the organization to analyze its actual problem in depth in order to reach an objective. This methodology spends more time on asking questions in the beginning. The objective is to assist the client’s organization in narrowing down the problem, moderating the process of finding the solution and then implementing the solution. Generally speaking, one might say that the concept of process counseling is to help organizations to help themselves.

The two types of counseling described have different roots. Specialist counseling is the traditional one. It became a business of its own in the United States of America during the industrial age. During this time the economy grew and entrepreneurs needed more and more external expertise to help steer their businesses. The consultant's role was comparable to a doctor's: a specific problem was identified and someone who knew the topic was called in and was supposed to fix it or to hand over a "prescription" for how to fix it. While the consultant was active, the client's organization was in a passive role, like a patient being "treated". During the 1960s this traditional consultancy approach moved in a new direction, inspired by the clinical domain.
Edgar Schein in particular [4, 5, 6] shaped a new understanding of consultancy which he called process consultancy. He observed and published various consultancy methods for decades. As Daniel Schugurensky puts it: "If the general philosophy on which Schein's entire work is built upon could be summarized in one premise, it would probably be the old Chinese proverb that goes 'give a man a fish, and he will eat for a day; teach a man to fish, and he will eat for a lifetime.' (…) Schein advocates a theory of consultation that is both collaborative and client-centered. The ultimate goal of the PC (process consultant, authors) approach is an organization skilled in the diagnosis and solving of its own problems, without the aid of outside intervention." [7]
3 User Experience Consulting vs. User Centered Design Consulting

What does that mean for consultants in the Human Computer Interaction (HCI) field? The topics of consultancy in the HCI field are twofold: On the one hand, developers of software or interactive products want to increase the usability and user experience of their products. They call for external usability consultants who focus on the products and try to increase their usability and user experience. We call this “User Experience Consulting”.
On the other hand, some development organizations want to go a step further. They want to be enabled to develop products of optimal user experience themselves. Usually this means they have to enrich their business processes with user centered design activities. In this case, external consultants have to aim at different parts of the organization, not just the development department. We call this “User Centered Design Consulting”. The more a usability expert wants to apply a holistic concept of User Centered Design, the more Process Counseling methodology she or he should use.

What does this advice mean in practical terms? To provide User Experience Consulting usually means performing tasks the clients cannot or do not want to perform themselves for various reasons. Such tasks include conducting a user and task analysis, writing use cases, generating navigation models, designing the graphical user interface, conducting usability tests, etc. The results are then fed back into the client's organization. If the User Experience Consultant is lucky, they get implemented in the product lifecycle of the organization, namely in the “Define” phase (definition of the software, in most organizations done by a product management team) and the “Realize” phase (implementation and testing, in most organizations done in the R&D department and/or the department for quality management).

User Centered Design services concentrate less on product development and much more on a type of organizational development. A UCD Consultant assigned to support an organization in becoming “user centered” is not only expected to put the recommendations into practice but also to consult on a strategic level. This means the UCD Consultant, although an external resource, needs to influence the organization in order to become successful.
Although a difficult task, it is by no means impossible because of the consultant's
• profound user experience know-how,
• knowledge about how to put it into practice (UCD Process),
• capabilities to influence the organization.

In order to bring User Centeredness into an organization, the process consultancy method is the first choice. As opposed to User Experience consulting, the UCD consultant should not only focus on product development but needs to provide services to other departments in the client's organization, which in turn affects more processes (see figure 2).
                       User Experience   User Centered Design
Process Counseling     O                 +
Specialist Counseling  +                 -

Fig. 1. Different counseling approaches for different areas of consultancy
Management Processes – Strategic Planning: As an external resource on a mission to apply UCD, you want to gain sponsors in senior management to a much greater extent. The User Experience Consultant facilitates the process of gaining insight into the role usability plays for the business success of the organization. It is important to ensure that this insight gets anchored in the management tools and makes it into the strategic objectives.

Support Processes – Process- and Information Management: Another process the UCD Consultant needs to connect to is the client's process- and information management, usually located in the quality department. It is important to understand the process maturity of the company. Do they have processes defined? Do they live the processes? Do they improve the processes based on lessons learned? How compliant is the decision making process? Depending on his or her insights, the UCD consultant might want to identify the person with real power behind the processes. The UCD consultant might need to be the link between the officially named head of the processes and the person with the real influence. The consultant needs to identify in which format the information about processes (operating instructions, templates to use, role descriptions etc.) is passed into the organization, and become an author or a coach of an author. Often it is necessary to become, directly or indirectly, a member of the "Process Team".

Customer Relationship Management – Understand: Generally speaking, this process step includes all activities to analyze the business environment in order to increase sales with the customers. This usually includes the identification of market segments, target customers and decision makers. Information is also gathered on how the buyer benefits from the product. Based on this information it is worked out how to approach potential customers. This process is most commonly found in the marketing department.
The interesting link for the UCD consultant is that the people who decide whether a product will be purchased or not are either identical with or highly influenced by the users. In other words, the information derived from UCD methods is highly interesting for the marketing department as well. There are even more synergies to be gained. The insights of user and task analysis are not just useful for the CRM process “Understand” (see figure 2); the very same information can be utilized as input to the PLM process “Define”. This way an organization is able to create a close link between the information base used for development and the one used for marketing and sales. This creates a great benefit for the organization, since organizations often have difficulties tying these processes closely together. If customers start to wonder about the lack of concordance between the marketed messages and the actual product they use, this is a risk for the image of the organization and thus a risk for the business.

Product Lifecycle Management – Product Portfolio Management: This process defines and manages the Product Portfolio, the Product Strategy and the Product Roadmaps. PPM is a strategic, continuous, overall and on-going activity, which accompanies the whole PLM Process. As opposed to making a single product more usable, the PPM is another process where the application of "user centeredness" will take effect on the whole portfolio. User centeredness can influence the way the organization cuts its portfolio. For example, it can be beneficial to cut the products along the users' tasks instead of along different technologies. This holds especially true for so-called enterprise IT. UCD can also support a product family and/or a platform strategy by providing methods for shaping the look and feel of different products which are still supposed to be part of a product family. Last but not least, the UCD methods can provide benefits for the roadmap planning, because user centered thinking considers the circumstances of use and the human behavior, which will influence the purchase decision at least as much as the fact that up-to-date technology has been applied.

Fig. 2. Different processes that are affected by Process Counseling in the area of User Centred Design Consultancy [8]
4 Conclusion

In the HCI field, consultancy services are related to one of two areas: a) enhancing product usability or b) supporting a developing organization in becoming user centered. If an organization wants to become “user centered”, it has to pass through a process of maturation. We have come to the conclusion that this process can only be facilitated through process counseling. Organizations which probably qualify for this type of approach often meet the following criteria:
• They are technology leaders who increasingly face competition.
• The organization has already gained some experience with usability, e.g. through usability tests or user experience counseling.
• The organization follows a product family or platform strategy.
• The organization does not only want to improve a single product fast in a one-time effort, but wants to learn as an organization how to create this benefit for whole product portfolios.
• The organization is a large-scale enterprise.

If the external consultant’s task is to enhance product usability, the process counseling approach would be like breaking a butterfly upon a wheel. From our point of view, the specialist counseling approach is sufficient to achieve the desired outcomes. But it can still be beneficial to apply the process counseling approach even if the organization “only” wants to enhance product usability. It might be the case that the organization itself is not aware of what it really needs. It is the task of the consultant to diagnose the organization's state of maturity and to apply the appropriate “therapy”.
References

1. Karat, J. & Dayton, T. (1995). Practical Education for Improving Software Usability. In Proceedings CHI 1995, ACM Press (1995), 162-169.
2. Rosenbaum, S., Humburg, J. & Rohn, J. (1998). Unpacking Strategic Usability: Corporate Strategy and Usability Research. In Proceedings CHI 1998, 205-206.
3. Rosenbaum, S., Rohn, J. & Humburg, J. (1999). What Makes Strategic Usability Fail? Lessons Learned from the Field. In Proceedings CHI 1999, 93-94.
4. Schein, E. H. (1969). Process consultation volume I: its role in organization development. New York: Addison-Wesley.
5. Schein, E. H. (1987). Process Consultation: Lessons for Managers and Consultants. New York: Addison-Wesley.
6. Schein, E. H. (1998). Process Consultation Revisited: Building the Helping Relationship. Addison Wesley Longman, Inc.
7. Schugurensky, D. (2002). Selected Moments of the 20th Century: 1987 Edgar Schein revisits Process Consultation with a sequel. http://fcis.oise.utoronto.ca/~daniel_schugurensky/assignment1/1987schein.html
8. Siemens Medical Solutions, Official Process Documentation: Med Process House 3.0 based on Siemens Reference Process House 3.0.0
9. Woletz, N., Laumann, S., & Koopmann, T. (2005). Impact of User Centered Design Approach on the Marketing Department. In Proceedings HCII 2005, Vol. 2.
10. Woletz, N. & Zimmermann, D. (2005). Organizational Aspects of the Introduction of a User-Centered Design Process. In Proceedings HCII 2005, Vol. 2.
User Response to Free Trial Restrictions: A Coping Perspective
Xue Yang, Chuan-Hoo Tan, and Hock-Hai Teo
National University of Singapore, Department of Information Systems, 3 Science Drive 2, 117543, Singapore
{yangxue,tanch,teohh}@comp.nus.edu.sg
Abstract. Software vendors often provide software for free download but with restrictions (e.g., time and/or functionality restrictions). The question that arises is to what extent the restrictions should be set to induce users to procure the full version. This study seeks to answer this question by looking from two perspectives: expectation-disconfirmation and coping behavior. Building on these perspectives, we present a research model of users' coping reactions toward software restrictions. We seek to understand user reactions (i.e., derivation of coping strategy) when their expectation toward trial restrictions is negatively disconfirmed. It is further posited that situational control could moderate the relationship between expectation disconfirmation and coping responses. We believe this research will enrich the current IS field and benefit market practitioners.
Keywords: Free trial software (FTS), expectation-disconfirmation paradigm, coping theory.
context, we define such difficulties as related to the functional software evaluation which is conducted to test whether the software is suitable to fulfill the users' utilitarian needs. If the expectation is not met, negative disconfirmation occurs [7], leading to the user experiencing stress in evaluating the FTS [6]. To this end, the effectiveness of trial activities may be affected, triggering coping strategies [8], which in turn affect trial outcome evaluation and purchase decision making (e.g., [6, 9]). To our knowledge, the role of the user’s involvement and trial behaviors in making post-trial commitment decisions has rarely been studied. In this light, this research seeks to answer two questions: (1) what will the user do to cope with the situation when the FTS restrictions are worse than expected? (2) how will the coping thoughts or behaviors influence the user’s decision on whether to purchase the software or not? To answer the research questions, we apply expectation-disconfirmation theory [7] and coping theory [10] to identify the coping behaviors induced by negative disconfirmation on trial restrictions. We believe this research can help to establish a theoretical understanding of the important mediating role of coping behaviors in influencing post-trial decision making (i.e., propensity to procure). Furthermore, by grounding the study in the free trial context, we also examine how different coping strategies would help to actively promote emotional well-being and resolve stress-inducing problems, which has been the most perplexing topic in the coping literature [11].
2 Coping with FTS Restrictions
2.1 Expectation-Disconfirmation in Free Trial
Consumers frequently encounter negative consumption situations that induce stress and negative emotions [12], which may in turn influence consumer behaviors such as judgment and choice tasks [13, 14]. According to expectation-disconfirmation theory (EDT) [7], when observed performance falls below the user’s original expectation, a discrepancy occurs and is reflected in the construct of negative disconfirmation [15]. Furthermore, disconfirmation may play a role in the formation of consumption emotions [16]. Specifically, disappointment arises when the chosen option results in a worse-than-expected outcome [6, 14]. If an individual anticipates undesirable consumption outcomes, emotions associated with threat (e.g., worry or anxiety) are likely to be produced [6, 12]. Moreover, the stressful consumption situation and its resultant negative emotions affect an individual’s post-exposure reaction [17]. On the negative side, the felt negative disconfirmation can lead to goal abandonment, which may be detrimental to software vendors [18]. On the positive side, the consumer can apply proactive coping strategies to deal effectively with the noxious encounter (e.g., [19]). In the current context, we regard the free trial as a self-directed process during which the individual intends to “take the initiative in diagnosing needs, formulating trial goals, identifying human and material resources for trial, choosing and implementing appropriate trial strategies and evaluating trial outcomes” ([20], p.18). In such an environment, users are responsible for identifying what to learn, when to use and how to evaluate the FTS [21]. Hence, when trial restrictions on time and/or
User Response to Free Trial Restrictions: A Coping Perspective
functionality are perceived to be stronger than expected, the situation will likely be appraised as stressful [10], and the affect toward the situation and subsequent trial behaviors may be influenced (e.g., [1]). Following coping theory [10], the user will attempt to cope with the negative situation in order to achieve the software evaluation goal. The coping strategies and activities toward unexpected restrictions are therefore the main focus of this research.
2.2 Appraisal and Coping
Coping, defined as the thoughts and behaviors that people use to manage the internal and external demands of situations appraised as stressful [10] (here, a product consumption situation), is a psychological process embedded in a network of cognitive, attitudinal and behavioral correlates [22]. Based on the appraisal results, an individual devotes effort to coping by performing different actions to deal with the situation at hand [9, 23]. Coping strategies can be decomposed into various subcategories within multidimensional hierarchical latent structures. For example, Duhachek [12] proposes the category of active coping, a form of coping characterized by taking direct action (e.g., action coping) or thinking positively in order to attenuate stressful circumstances (e.g., rational thinking) [5]. Compared with less proactive strategies such as emotional support or avoidance, active coping focuses more directly on the problem at hand. Similarly, in the current context, trial users usually aim to achieve a certain level of understanding of the software product and tend to resolve negative trial conditions (i.e., restrictions) to fulfill this goal. In turn, the activities undertaken to settle a problem such as a short trial period are more likely to influence a trial user’s purchase decision. Thus, in the current study we focus on the effects and outcomes of active coping in terms of their influence on a trial user’s purchase decision making.
Cognitive antecedents such as situational control [12] also have an impact on the adoption of different coping measures.
3 Theoretical Model
3.1 Negative Disconfirmation and Coping
In many consumption situations, such as a violation of expectation caused by external objects [24], a state of affairs that is worse than initially predicted may induce psychological stress, which the individual appraises as a threat to future benefits [25]. In our context, a negatively disconfirmed expectation involving intensive time and/or functionality restrictions can give rise to disappointment [14] and worry as stress emotions [6, 25]. Furthermore, it is suggested that if individuals aim to perform successfully under stressful conditions to achieve salient goals, they have to apply self-regulation such as coping tactics effectively to overcome the elicited negative emotions [26]. Specifically, individuals are prompted to analyze the situation [27] and to change a behavioral state that falls short of the goal (e.g., the quality and intensity of their actions) [28, 29]. To this end, the active coping mechanism including either cognitive or
X. Yang, C.-H. Tan, and H.-H. Teo
behavioral actions is suggested to be the most effective in reaching positive coping results (e.g., [12, 30]).
Coping with Disappointment. Disappointment is closely related to higher scores on the aspect of unexpectedness [31], which indicates a perceived threat to effective trial outcomes. Upon realizing that the restrictions are stronger than expected, a user will attribute the difficulty of fully assessing the software to external parties. While many would choose to terminate the trial [14], those who choose to continue would conduct active cognitive or emotion-focused coping to mitigate the disappointment and maintain interest in future testing [32]. Specifically, rational thinking, in the sense of deliberate attempts to prevent subjective emotions from directing behavior, will likely be adopted [12]. Driven by the disappointment with the restrictions and the urge to fulfill the trial goal, the user may attempt to suppress the negative feeling. For example, the user may try to persuade oneself that it is reasonable for software firms to limit the valid period or disable some components of the FTS to protect their own profit. It should be noted that a user may also perceive a degree of powerlessness, which can result in a tendency to withdraw from the situation [33]. Hence, an increase in negative disconfirmation on the time/functionality restriction may increase the possibility of goal abandonment and reduce the possibility of rational thinking [18, 27], as hypothesized:
H1a: The more negative one’s disconfirmation on the time restriction attached to the FTS, the less likely it is that rational thinking on restrictions will be adopted during the free trial.
H1b: The more negative one’s disconfirmation on the functionality restriction attached to the FTS, the less likely it is that rational thinking on restrictions will be adopted during the free trial.
Coping with Worry.
Besides disappointment, negative disconfirmation on trial restrictions may also raise the emotion of worry [6]. When one worries that personal goals are threatened in the future, the most helpful means is to engage in a problem-focused approach to eliminate the threat (e.g., [34]). As a behavioral coping strategy, action coping, which consists of direct, objective attempts to manage the source of stress, such as concentrating on ways to resolve the problem [12], will be the preferred choice. In the current context of free trial, to make better use of the limited trial period, the user can increase the usage frequency as a specific form of action coping. Similarly, additional sources can be used to extend one’s understanding of how the software functions (e.g., product feature descriptions or third-party information). As long as the elicited threat is mitigated, the goal of adopting the FTS, namely to achieve comprehensive knowledge of the software product, can be fulfilled. However, individuals prefer the coping strategy that promises the greater chance of success and task accomplishment [35]. Thus, when the feeling of worry intensifies, there will be a natural tendency to escape from the situation and to protect oneself from any prospective harm [36]. In the context of free trial, the user will likely forgo the attempt to try the software if he/she perceives a very limited valid period for trial (e.g., 5 days for relatively complicated software) or many critical
components of the FTS being removed (e.g., the clip editing functions of video editing software). Hence, we hypothesize that:
H2a: The more negative one’s disconfirmation on the time restriction attached to the FTS, the less likely it is that action coping will be conducted during the free trial.
H2b: The more negative one’s disconfirmation on the functionality restriction attached to the FTS, the less likely it is that action coping will be conducted during the free trial.
According to coping theory [10], different coping methods usually take place simultaneously and can either facilitate or impede each other in the coping process. Regarding the initiation of coping behaviors, we suggest that action coping, which deals with the situation directly, is more likely to be performed if one first tries to smooth negative feelings in a purely emotional way. In the free trial context, if the individual is able to convince oneself that time restrictions are unavoidable and can be coped with, he/she will likely engage in active software assessment at a higher frequency than originally planned or through more diversified means (e.g., turning to additional product information sources). Thus, we hypothesize that the occurrence of rational thinking will enhance action coping behaviors across the free trial process, as follows:
H3: A higher level of action coping will be performed if a higher level of rational thinking is conducted by the user.
3.2 Coping Outcomes
Whether coping is effective in facilitating adaptation depends on the outcomes of coping [30]. When an appropriate coping strategy is adopted to accomplish a specific goal, the final outcome will likely be positive, and coping is thus considered effective. Although coping outcomes can vary within a broad range, by relating them to one’s goals we can categorize them into two general types.
First, rational thinking can help reduce psychological stress [37] and restore personal emotional stability [9], i.e., a move from a negative to a neutral state. Second, action coping, which involves one in the process of solving problems, increases the likelihood that the stressful situation is resolved compared with non-action [37]. In the free trial context, rational thinking and action coping jointly contribute to the desirable outcomes. On the one hand, when a certain level of rational thinking takes place, one’s emotional state will likely remain relatively stable and may even turn positive. On the other hand, action coping directly addresses the unsatisfactory restrictive conditions. Furthermore, it is suggested that a user’s commitment propensity is primarily determined by the experience of previous product use, such as satisfaction with information system (IS) performance (e.g., [38]). By being involved in a trial process of constantly adjusting oneself both attitudinally and behaviorally, the user tends to develop psychological and emotional attachment or commitment to the target FTS [39]. In the consumption context, commitment is usually linked to sticking with a certain product choice and reluctance to switch (e.g., [40]). Positive emotional experience in turn makes people maintain attention to the current state of affairs with pleasure [41].
Significant effort devoted to resolving the adverse trial problems indicates psychological lock-in with the target as well [42]. As the user expresses stronger commitment to the current product [43], the propensity to purchase the software will increase. Hence, we hypothesize that:
H4: A higher level of rational thinking will positively influence the user’s propensity to purchase the commercial software after the free trial.
H5: A higher level of action coping will positively influence the user’s propensity to purchase the commercial software after the free trial.
3.3 Situational Control
Before taking any action, a rational individual may take the characteristics of the context into consideration when evaluating the effectiveness of different coping mechanisms [33]. Consistent with the contextual formulation, people appraise not only the nature of the encounter or situation (i.e., a stressful environment or a threat to personal well-being), but also the opportunity for personal control over, or the changeability of, the condition [30]. The goodness of fit between the appraisal of controllability and the coping approach [37] enters into the decision about how to cope, besides the main effects of the situation itself. In other words, the ways people cope are also influenced by the available resources they can make use of [10]. Furthermore, in the context of coping, perceived situational control refers to the external, objective facilitation that one expects to be accessible for enabling future coping efforts. Intuitively, greater facilitation helps people tolerate aversive stimuli better and encourages them to initiate and sustain behaviors more positively [44]. In contrast, a serious lack of facilitation may suggest a lower level of feasibility for certain positive coping strategies.
Thus, perceived situational control indicates to the individual whether certain coping measures can be performed, regardless of personal efficacy (i.e., knowledge or skill), and in turn elevates or dampens one’s enthusiasm to engage in the coping process [45]. In the current free trial context, the user’s confidence in applying positive coping strategies is influenced by the interaction between the trial condition and perceived situational control in terms of available spare time. For example, when an individual believes he/she has sufficient personal time for the trial, a restricted trial period leads to lower dissatisfaction with the restricted offer. Thus, the time restriction will more likely be regarded as reasonable or acceptable, and the user will be more willing to resolve the restriction problem. In response to the functionality restriction, sufficient personal spare time enables the user to search for additional product information, which can also help appease a person’s complaints about the restrictions. Hence, we hypothesize a positive moderation effect of perceived situational control in terms of spare time for the trial:
H6a: The degree of perceived situational control will positively moderate the relationship between negative disconfirmation on time/functionality restriction and rational thinking.
H6b: When perceived situational control becomes stronger, it is more likely for rational thinking to be adopted under negative disconfirmation.
H6c: The degree of perceived situational control will positively moderate the relationship between negative disconfirmation on time/functionality restriction and action coping.
H6d: When perceived situational control becomes stronger, it is more likely for action coping to be adopted under negative disconfirmation.
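One conventional way to read the moderation hypotheses is as interaction terms in a regression. The equations below are only an illustrative sketch under that assumption; the paper does not state its estimation technique, and the symbols ND (negative disconfirmation), SC (perceived situational control), RT (rational thinking), AC (action coping) and all coefficients are notation introduced here, not the authors’.

```latex
\begin{align}
RT &= \beta_0 + \beta_1\, ND + \beta_2\, SC + \beta_3\,(ND \times SC) + \varepsilon_1 \\
AC &= \gamma_0 + \gamma_1\, ND + \gamma_2\, SC + \gamma_3\,(ND \times SC) + \varepsilon_2
\end{align}
```

Under this reading, H1 and H2 correspond to negative main effects (\beta_1 < 0, \gamma_1 < 0), while H6 corresponds to positive interaction coefficients (\beta_3 > 0, \gamma_3 > 0), meaning that the negative effect of disconfirmation weakens as situational control grows. The path from rational thinking to action coping (H3) is left out of the second equation for brevity.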
4 Research Methodology
A field experiment was under way at the time of writing. A field experiment suits the current research because it matches the longitudinal nature of free trial practice, in which users usually try the software over a certain period. Users’ responses toward the trial product, characterized by its predefined restrictions, are examined during the experiment, and the decision making related to continued usage is then investigated. The factorial design is based on two types of trial restrictions, time restriction and functionality restriction, which form eight treatments in total. Time restrictions are manipulated through the duration of the trial period, and functionality restrictions are differentiated by the characteristics of the restricted software components. We have developed a scenario that simulates a real free trial by providing participants with differently restricted versions of the same trial software, in order to observe their responses and behaviors in relation to the different restrictions. The invitation to the experiment has been posted in several major online forums that discuss software issues, and participants are invited to visit the experiment website and download the software to try. All participants are randomly assigned to one of the eight treatments, each a different combination of the trial restrictions. To avoid possible confounding effects, we specifically ask participants to treat the experiment as a real trial experience and to assume that the software provided is what they need. During the trial period, participants are allowed to test the software in their own manner and at their own pace. They are required to complete two survey questionnaires before the trial, including the measurement of their response toward trial restrictions, and one post-experiment questionnaire after the trial ends, which also measures their self-reported trial experience.
Any feedback regarding the trial experience and participants’ opinions of the software is welcome. Each participant will be offered the opportunity to win a shopping voucher worth up to $200 in lucky draws held toward the end of the whole series of experiments.
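The random assignment described in this section can be sketched in a few lines. The paper reports only that the two factors yield eight treatments; the 4 x 2 split, the level labels, the function names and the balanced shuffling scheme below are all assumptions made for illustration, not the authors’ procedure.

```python
import itertools
import random

# Hypothetical factor levels: the paper gives only the total of eight
# treatments, not the number or content of the levels of each factor.
TIME_LEVELS = ["time level 1", "time level 2", "time level 3", "time level 4"]
FUNCTION_LEVELS = ["functionality level 1", "functionality level 2"]


def build_treatments(time_levels, function_levels):
    """Return every (time, functionality) combination of the factorial design."""
    return list(itertools.product(time_levels, function_levels))


def assign_participants(participant_ids, treatments, seed=None):
    """Randomly assign participants to treatments with near-equal cell sizes.

    Participants are shuffled and then dealt to the treatments in turn,
    which is one common way (assumed here) to randomize while keeping
    the cells balanced.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: treatments[i % len(treatments)] for i, pid in enumerate(shuffled)}


treatments = build_treatments(TIME_LEVELS, FUNCTION_LEVELS)
assignment = assign_participants(range(40), treatments, seed=1)
```

With 40 hypothetical participants, each of the eight cells receives five people; the seed is fixed only so that the illustration is repeatable.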
5 Conclusion and Contributions
This research focuses on users’ psychological and behavioral reactions toward the time/functionality restrictions that typically distinguish FTS from other software products. Specifically, we aim to explore the scenario in which
user’s expectation toward the restrictions is negatively disconfirmed and to investigate how different reactions lead to continued software usage, which is the most crucial question of this FTS research. The study contributes to research on FTS usage from the consumer’s perspective and to broader disciplines including consumer behavior, consumer psychology, marketing, and Information Systems (IS).
References
1. Kempf, D. S., Smith, R. E.: Consumer Processing of Product Trial and the Influence of Prior Advertising: A Structural Modeling Approach. J. Marketing Research 35 (1998) 325-338 2. Rogers, E. M.: Diffusion of Innovations. The Free Press, New York (1995) 3. Gallaugher, J. M., Wang, Y.: Understanding Network Effects in Software Markets: Evidence from Web Server Pricing. MIS Quarterly 26:4 (2002) 303-327 4. Tang, Q.: Free Trial or No Free Trial: Optimal Software Product Design with Network Externality. The Ninth Americas Conference of Information Systems, August 4-6 (2003) Tampa, Florida, United States 5. Duhachek, A., Iacobucci, D.: Consumer Personality and Coping: Testing Rival Theories of Process. J. Consumer Psychology 15:1 (2005) 52-63 6. Yi, S., Baumgartner, H.: Coping with Negative Emotions in Purchase-Related Situations. Journal of Consumer Psychology 14:3 (2004) 303-317 7. Oliver, R. L.: Cognitive, Affective, and Attribute Bases of the Satisfaction Response. J. Consumer Research 20 (1993) 418-430 8. Pinsonneault, A., Rivard, S.: The Impact of Information Technologies on Managerial Work: From the Productivity Paradox to the Icarus Paradox? MIS Quarterly 22:3 (1998) 287-312 9. Beaudry, A., Pinsonneault, A.: Understanding User Responses to Information Technology: A Coping Model of User Adaptation. MIS Quarterly 29:3 (2005) 493-524 10. Lazarus, R. S., Folkman, S.: Stress, Appraisal, and Coping. Springer Publishing Company, New York (1984) 11. Somerfield, M. R., McCrae, R. M.: Stress and Coping Research: Methodological Challenges, Theoretical Advances, and Clinical Applications. American Psychologist 55 (2000) 620-625 12. Duhachek, A.: Coping: A Multidimensional, Hierarchical Framework of Responses to Stressful Consumption Episodes. J. Consumer Research 32:1 (2005) 41-53 13. Raghunathan, R., Pham, M. T.: All Negative Moods Are Not Equal: Motivational Influences of Anxiety and Sadness on Decision Making.
Organizational Behavior and Human Decision Processes 79 (1999) 127-146 14. Zeelenberg, M., van Dijk, W. W., Manstead, A. S. R., van der Pligt, J.: On Bad Decisions and disconfirmed Expectancies: The Psychology of Regret and Disappointment. Cognition and Emotion 14 (2000) 521-541 15. Bhattacherjee, A., Premkumar, G.: Understanding Changes in Belief and Attitude toward Information Technology Usage: A Theoretical Model and Longitudinal Test. MIS Quarterly 28:2 (2004) 229-254 16. Phillips, D. M., Baumgartner, H.: The Role of Consumption Emotions in the Satisfaction Response. J. Consumer Psychology 12:3 (2002) 243-252 17. Yi, Y.: A Critical Review of Consumer Satisfaction. In: Zeithaml, V. A. (eds.): Review of Marketing. American Marketing Association, Chicago, IL (1990) 68-123
18. Frijda, N. H.: What is the Function of Emotions? In: Ekman, P., Davidson, R. J. (eds.): The Nature of Emotions. Oxford University Press, New York (1994) 112-122 19. Luce, M. F., Bettman, J. R., Payne, J. W.: Emotional Decisions: Tradeoff Difficulty and Coping in Consumer Choice. University of Chicago Press, Chicago (2001) 20. Knowles, M.: Self-Directed Learning: A Guide for Learners and Teachers. Follett, Chicago (1975) 21. Guglielmino, P. J., Guglielmino, L. M.: Moving toward a Distributed Learning Model Based on Self-Managed Learning. SAM Advanced Management Journal (2001) 36-43 22. Carver, C. S., Scheier, M. F.: Situational Coping and Coping Dispositions in a Stressful Transaction, J. Personality and Social Psychology 66 (1994) 184-195 23. Folkman, S.: Making the Case for Coping. In Carpenter, B. N.: Personal Coping: Theory, Research, and Application. Praeger/Greenwood, Westport, CT (1992) 24. Olson, J. M., Roese, N. J., Zanna, M. P.: Expectancies. In: Higgins E. T., Kruglanski A. W. (eds.): Social Psychology: Handbook of Basic Principles. Guilford Press, New York (1996) 211-238 25. Lazarus, R. S.: Stress and Emotion: A New Synthesis. Springer, New York (1999) 26. Brown, S. P., Westbrook, R. A., Challagalla, G.: Good Cope, Bad Cope: Adaptive and Maladaptive Coping Strategies Following a Critical Negative Work Event. J. Applied Psychology 90:4 (2005) 792-798 27. Lazarus, R. S.: Emotion and Adaptation. Oxford University Press, New York (1991) 28. Carver, C. S., Scheier, M. F.: On the Self-Regulation of Behavior. Cambridge University Press, Cambridge (1998) 29. Covington, M. V.: Goal Theory, Motivation, and School Achievement: An Integrative Review. Annual Review of Psychology 51 (2000) 171-200 30. Folkman, S., Moskowitz, J. T.: Coping: Pitfalls and Promise. Annual Review of Psychology 55 (2004) 745-774 31. Frijda, N. H., Kuipers, P., ter Schure, E.: Relations among Emotion, Appraisal, and Emotional Action Readiness. 
Journal of Personality and Social Psychology 57 (1989) 212-228 32. Kahn, B. E.: The Power and Limitation of Social Relational Framing for Understanding Consumer Decision Processes. J. Consumer Psychology 15:1 (2005) 28-34 33. Zeelenberg, M., van Dijk, W. W., Manstead, A. S. R., van der Pligt, J.: The Experience of Regret and Disappointment. Cognition and Emotion 12 (1998) 221-230 34. Laux, L., Weber, H.: Presentation of Self in Coping with Anger and Anxiety: An International Approach. Anxiety Research 3:4 (1991) 233-255 35. Begley, T. M.: Coping Strategies as Predictors of Employee Distress and Turnover after an Organizational Consolidation: A Longitudinal Analysis. J. Occupational and Organizational Psychology 71 (1998) 305-329 36. Roseman, I. R., Wiest, C., Swartz, T. S.: Phenomenology, Behaviors, and Goals Differentiate Discrete Emotions. J. Personality and Social Psychology 67 (1994) 206-221 37. Zeidner, M., Saklofske, D.: Adaptive and Maladaptive Coping. In Zeidner, M., Endler, N. S.: Handbook of Coping. Wiley, New York (1996) 505-531 38. Bhattacherjee, A.: Understanding Information Systems Continuance: An Expectation-Confirmation Model. MIS Quarterly 25:3 (2001) 351-370 39. Fournier, S.: Consumers and Their Brands: Developing Relationship Theory in Consumer Research. J. Consumer Research 24 (1998) 343-373 40. Coulter, R. A., Price, L. L., Feick, L.: Rethinking the Origins of Involvement and Brand Commitment: Insights from Postsocialist Central Europe. J. Consumer Research 30 (2003) 151-169
41. Thatcher, J. B., George, J. F.: Commitment, Trust, and Social Involvement: An Exploratory Study of Antecedents to Web Shopper Loyalty. J. Organizational Computing and Electronic Commerce 14:4 (2004) 243-268 42. Johnson, E. J., Moe, W., Fader, P. S., Bellman, S., Lohse, J.: On the Depth and Dynamics of Online Search Behavior. Management Science 50:3 (2004) 299-308 43. Dick, A. S., Basu, K.: Customer Loyalty: Toward an Integrated Conceptual Framework. J. the Academy of Marketing Science 22:2 (1994) 99-113 44. Schunk, D. H.: Learning Theories: An Educational Perspective. 3rd edn. Prentice-Hall, Upper Saddle River, NJ (2000) 45. Bandura, A.: Self-Efficacy: The Exercise of Control. Plenum, New York (1997)
A Study on the Form of Representation of the User’s Mental Model-Oriented Ancient Map of China
Rui Yang1, Dan Li2, and Wei Zhou1
1 Corporate Technology, Siemens Ltd. China, 7 Wangjing Zhonghuan Nanlu, Beijing 100102 P.R. China
2 Arts & Design Department, College of Architecture and Urban Planning Tongji University, China
{[email protected], [email protected], [email protected] }
Abstract. People often believe that fidelity is an important principle of cartographic information representation, that is, the closer the representation of geographic information is to the real world, the better. However, excessively high fidelity does not necessarily bring about effective navigation and convenient reading, as excessive information may impose a cognitive burden on users and thereby reduce usability. Based on a study of the forms of traditional map representation in China, the author finds that when the ancient people drew maps, they considered the user’s mental model and the specific setting of use, and were good at adopting multiple forms of information representation to lessen the user’s cognitive burden, make reading more intuitive, and bring about effective navigation. This form of user’s mental model-oriented information representation is of some significance as an inspiration for geographic information design.
Keywords: ancient map, information representation, mental model.
humanistic information. The visualized map representation has a good artistic effect and is more convenient to read.
Diversified representation not limited by objective simulation
In the information representation of the ancient maps, the ancient people did not rigidly follow an objective representation of the original geographical features. Rather, depending on the purpose of use, they applied multiple means of flexible representation, such as transformation and deflection of geographical features and changes of visual angle, to match the sensory experience and psychological needs of the viewers and so make the map convenient to use. This article focuses on the second characteristic of the information representation of the ancient maps of China. In Chapter III, we analyze, in connection with typical cases, the form of user’s mental model-oriented information representation. Part IV presents the key findings; Part V is the conclusion, which puts forward a plan for the next step of research.
2 Typical Cases About Ancient Map Information Representation
2.1 Multi-directionally Oriented Representation
Unlike the modern map, which places north at the top, the ancient map had no fixed orientation: its direction was determined by the location of the map user, which replaces the absolute coordinate system in which the upper side is north and the lower side is south. In the ancient map, orientation does not change along with the rotation of the earth; rather, it takes the user as the center and changes along with the user’s location. This is a direct representation of how the people of ancient China imagined the space of the world (that is, they visualized the universe as a space centered on the self), and it better serves the purpose of making the map convenient to use. For instance, the “Xuanhan Xianzhi Tu” (Map of the Xuanhan County Annals) is a hypsometric chart of the entire Xuanhan County published in 1815, which was intended to help the local officials understand the geography of the county for convenient management. In the map, the county seat is drawn in the center, and the symbols and commentary characters for mountains, terrain and architectural constructions are arranged in concentric circles, surrounding the county seat in a deformed centrifugal style and creating a sense of all-round immersion. When viewing this map, the officials would feel as if they stood in the center of the county seat, looking around and viewing the world from their own orientation. At the same time, the entire map is drawn in an isometric view, which still allows the viewer to overlook the entire county and form survey knowledge of the topographic features. In ancient maps intended for navigation, this kind of multi-directional map displays even more clearly the characteristic of taking the simplified user’s mental model as its orientation.
For instance, Zhenghe Hanghai Tu, or Zheng He’s Nautical Map, is a marine navigation map of Zheng He, navigator of the Royal Court of the Ming Dynasty, drawn for his voyages to the Western Seas. The actual physical routes of
User’s Mental Model-Oriented Ancient Map of China
Fig. 1. Xuanhan Xianzhi Tu (Map of the Xuanhan County Annals)
Zheng He’s navigation are very complicated. Representing the complex routes and the great deal of geographic information along the way faithfully would burden not only the drawers in drawing the map but also the users in cognition, to say nothing of convenience of use. To solve this problem, the ancient people adopted a simplified form of representation based on the user’s psychological route.
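The user-centered orientation discussed in this section has a simple modern analogue: re-expressing north-up map coordinates in a frame centered on the user, with the user’s direction of travel pointing up. The sketch below only illustrates that idea; the function name, the (east, north) coordinate convention and the compass-bearing parameter are assumptions introduced here and are not a reconstruction of how the ancient maps were drawn.

```python
import math


def to_user_frame(point, user_pos, heading_deg):
    """Express a north-up map point relative to the user.

    point, user_pos: (east, north) coordinates in the same units.
    heading_deg: the user's direction of travel as a compass bearing,
                 measured clockwise from north.
    Returns (right, ahead): how far the point lies to the user's right
    and how far ahead of the user along the heading.
    """
    dx = point[0] - user_pos[0]
    dy = point[1] - user_pos[1]
    h = math.radians(heading_deg)
    # Rotate the offset so that the heading direction becomes the +ahead axis.
    right = dx * math.cos(h) - dy * math.sin(h)
    ahead = dx * math.sin(h) + dy * math.cos(h)
    return right, ahead
```

For a user heading due east (bearing 90), a landmark lying to the east comes out as straight ahead, and a landmark lying to the north comes out on the left (a negative `right` value), which is the kind of self-centered view the text describes.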
Fig. 2. Part of the Zhenghe Hanghai Tu (Zheng He’s Nautical Map) (1405 AD – 1435 AD), the red line is the representation route of navigation
R. Yang, D. Li, and W. Zhou
Fig. 3. Part of the modern paper map of Zhenghe Hanghai Tu; the broken line represents the navigational route
The purpose of the abovementioned county annals map is to represent survey geographical information, whereas Zheng He’s Nautical Chart was drawn for navigation: its purpose was to help users orient themselves during the voyage. On the journey, the voyagers are placed in the actual physical routes, and as time passes, what they are most concerned about is their own relative location, that is, where am “I” now? What is “my” location in the entire route? Therefore, the Nautical Chart does not display an accurate and objective geographical route according to the absolute coordinate system in which the upper side is north and the lower side is south (Fig. 4); rather, based on the user-centered map orientation and by focusing on the constantly changing location of the voyagers, it pieces together a navigation path that approximates the “user’s psychological route” (Fig. 5). What this path reflects is a simplified track that the various views or scenes project in the
Fig.4. Sketch Map of the Real Geographical Routes of the Navigation
Fig.5. As shown in the map, the green straight line is the psychological route of the user (simplified as a straight line), the black route is the representation path shown in the nautical chart, and these two are very close.
brain over time as the user travels through a complex geographical environment. Like a 360° panoramic photo, the nautical map does not represent the real physical routes across the whole landscape; it only strings together, in time order, the views or sceneries projected in the user’s brain. Therefore, even though the overall geographical track is simplified and straightened into a line, the relationships between local views or sceneries are retained objectively, and viewers can still grasp the overall picture. When voyagers read this map during the journey, the navigation path represented on the map closely matches their simplified psychological route (which approaches a straight line), so they can quickly understand and perceive their location. This example clearly embodies the starting point of the user’s mental model-oriented design of ancient map representation.

2.2 Deformation Representation

Another characteristic of information representation in ancient maps is that they do not aim at a realistic simulation of the objective physical world, but instead emphasize differences in the importance of information and express particular points of view through means such as deformation. In many maps of local topography from the Ming and Qing dynasties, the capital or the prefecture or county administration was drawn in the center of the map to emphasize its importance. Although this sacrificed the objective accuracy of the map, it met the requirements of the target users. Likewise, to emphasize the importance of ancient cities, drawers used very large cartographic symbols for them, which naturally departs from the scale, while some areas, such as islands, were drawn very small, reflecting the subjective view of the user that they were unimportant.
For instance, the Gujin Huayi Quyu Zongyaotu (General Map of the Ancient and Modern Territory of China and Foreign Countries) prominently shows the distribution of all the prefectures, provinces and counties of the Song Dynasty, as well as the positional relationship between China and its neighboring countries. As the purpose of this map was to help the emperor acquire an overall view of his territory and understand its administrative divisions, the reference geographical features outside the main areas of the map were drawn only cursorily. For example, a number of neighboring countries, such as Japan, are shown with only a name and a relatively correct location; their territorial contours are deformed and reduced, and all are indicated with a similar shape. We can therefore infer that the ancient monarchs were concerned mainly with the location of neighboring countries relative to their own, and attached little value to their specific topography and landforms.
Fig. 5. Gujin Huayi Quyu Zongyaotu (General Map of the Ancient and Modern Territory of China and Foreign Countries), 12th year of the Chunxi reign of the Southern Song Dynasty (1185 AD). The territorial contours of neighboring countries such as Japan (marked by the red circle) are deformed and reduced, and are indicated with a similar shape.
2.3 Multiple Projection Representation

Many ancient maps of China are drawn with multiple projection methods. This mixed form of expression was often used in thematic maps in ancient times. In thematic maps made for special purposes, such as military maps and nautical maps, different kinds of geographical features were represented with different projections, with an emphasis on practicality. Take the fourth part of Zheng He’s Nautical Map as an example. In this map, different geographical features use different projections: the ground and water bodies are drawn in top-down view, mountain ranges in side view, and folk houses and similar buildings in an oblique axonometric projection. As the external appearance of the imperial city was of no help to navigation, it was drawn in a simple top view, although, out of reverence for imperial authority, the drawer intentionally scaled it up. As the mountain ranges and buildings were used for positioning and orientation in navigation, they were drawn in detail. In particular, because the direction of a mountain range bears directly on military deployment, mountain ranges were generally represented in detail.
In short, the entire map combines 2D and 2.5D views. Although some geographical features are represented in detail and others simply, the amount of information is sufficient for navigation and helps the user determine location.
Fig. 6. Part of the Zheng He’s Nautical Map
3 Key Findings

The core design philosophy of the ancient maps of China coincides with today’s information design, which is oriented toward the user’s mental model. As advocated by Alan Cooper in his book “The Essentials of Interaction Design”, information designers should understand how and why target users use information products, and should make the designed representation model match the user’s mental model as closely as possible. What kind of information representation, then, fits closely with the user’s mental model? From our exploration of the diverse and flexible means of representation in the ancient maps of China, we abstract two findings, which can serve as a reference for today’s design of information representation under complex paradigms.

3.1 Information Representation Based on Characteristics of User’s Simplified Mental Model to Improve Reading and Interactive Efficiency

Information representation in ancient maps did not aim at accurate, objective simulation; rather, it focused on more simplified and practical information
representation. This suits the characteristics of the user’s mental model well. Users often rely on the graphic conventions they already know from experience to form a relatively simple interpretation. Although such an interpretation may not conform to the implementation model of the “object” or of the objective information, it is nevertheless sufficient for users to understand the information and complete their tasks. Based on the characteristics of the user’s mental model, the forms of ancient map representation are very flexible and diverse. Commonly used measures include the following:

Neglecting or simplifying irrelevant or unimportant information. For instance, in thematic maps only the one or two types of information related to the theme are drawn, while information irrelevant to the users’ needs is omitted. The reference geographical features outside the main areas of the map are drawn only simply, in order to stress the main information and lessen the user’s cognitive burden.

Using the characteristic aspect of a feature as its projection. A single ancient map of China may even combine several projection methods. For each geographical feature, only the projection that shows its characteristic aspect is drawn and the unimportant aspects are simplified, which lessens the drawing burden of the drawer and the cognitive burden of the reader, and improves reading efficiency.

3.2 Drawing of Map Starting from the User-Centered Map Orientation, to Form Effective Navigation

The ancient map neither developed means of navigation as complex as those of today’s electronic maps, nor carried an excess of abstract navigation marks (such as arrows, icons and lead lines). How, then, did it help users navigate well?
The answer is to draw the map from a first-person perspective, so that users identify their own location and, with “I” as the center, extend their reading to the surrounding information to acquire survey knowledge. According to modern navigation theory, people who are traveling feel that they are at the centre of space (egomotion). If the information representation of a map supports this sense of presence, it can strengthen the construction of a cognitive map. Ancient map design accomplished this very well. Ancient maps often placed the viewer at the center of the map (as in the Map of the Xuanhan County Annals), or used the location of the viewer to define the orientation of the map. Both kinds of maps were drawn with an egocentric map orientation: they provide an “I-am-here” viewing angle and make users clearly feel that they are inside the scene, which strengthens their proprioception. At the same time, combining this with the “God’s-eye” view (isometric view) of the absolute coordinate system does not prevent users from forming a survey concept of the whole setting, so the map remains easy to read. On the other hand, maps made for navigation are not bound to an accurate representation of the real, complex routes; rather, they focus on “making clear” the user’s simple mental model. In cognitive terms, voyagers always read a sea route in the simplest and most convenient form. The ancient nautical
map of China started precisely from the perspective of the voyagers: its orientation changed with the location of the users, and what it simulated was the psychological route of the user during navigation. By filtering out all kinds of complex physical information and presenting a schema close to the user’s simplified psychological route, it achieved effective navigation and easy understanding.
4 Conclusion and Future Work

The design of the ancient maps of China took the user’s mental model and the purpose of use as its orientation and created diverse and versatile forms of information representation. This offers new inspiration for the development of contemporary maps, which have tended blindly toward precise representation of feature locations and cartographic boundaries. How to extend the forms of map representation to meet the requirements of today’s users and to fit their mental models will be the focus of our future work. We will continue to extract valuable forms of representation from the ancient maps of China, attempt to apply them in specific present-day usage settings, and explore how the unique forms of information representation of the ancient maps of China can be integrated with current applications.

Acknowledgments. The research presented in this paper is part of the author’s research on Chinese traditional information design at the Interaction Design Department of SLC, Siemens Research China, under the leadership of Dr. Zhou Wei. The author would like to thank Dr. Zhou Wei and other colleagues from Siemens SLC, Beijing, for their support of this work.
References

1. Zhou Xiaoying: Information Understanding-based Information Construction. People’s University of China Press (2005)
2. Ge Zhaoguang: History of Chinese Thoughts (Vol. II). Fudan University Press (2001)
3. Chao Wanru: A Collection of Ancient Map of China. Cultural Relics Publishing House, Beijing (1994)
4. Wang Yong: History of Cartography of China. Sanlian Press, Beijing (1958)
5. Jiang Daozhang: “On Characteristics of the Traditional Cartography of China”. Studies in the History of Natural Sciences, Vol. 17, No. 3 (1998)
6. Sen-dou Chang: “Some Aspects of the Urban Geography of the Chinese Hsien Capital”. Studies in the History of Natural Science, Vol. 17, No. 3 (1998)
7. Yin Chunmin: Enchantment of Brush Work of the Traditional Map of China. Source of manuscript: China Academic Journal Electronic Publishing House (2006)
8. Yu Cang: Orientation in the Ancient Map. Source of manuscript: China Academic Journal Electronic Publishing House (2006)
9. The Nautical Technology of Zheng He’s Delegation. Source of manuscript: Combination of International On-line (2005)
10. Wang Shulian: Cosmic View Reflected in the Ancient Map of China. Source of manuscript: China Academic Journal Electronic Publishing House (2006)
11. Lu Liangzhi: History of Cartography of China. Beijing Publishing House of Surveying and Mapping (1984)
12. Erwin Raisz: Principles of Cartography. McGraw-Hill, New York (1962)
13. Mei-ling Hsu: Descriptive Maps is the Chinese Cartography Tradition. Unpublished paper presented at the 5th International Conference of the History of Science in China, University of California, San Diego, August 5-10, 1988
14. Jiang Daozhang, Liu Tingxiang: “Example of Map of Local Topography of Ming Dynasty”. Research report on Map of Local Topography of China, Institute Geosciences, Chinese Culture University, Taibei (1995)
15. Alan Cooper, Robert Reimann: About Face 2.0: Essence of Interactive Design. Translated by Zhan Jianfeng, Zhang Zhifei. Publishing House of Electronics Industry, Beijing (2005)
16. Derek F. Reilly, Kori M. Inkpen: Map Morphing: Making Sense of Incongruent Maps (2004)
Towards Automatic Cognitive Load Measurement from Speech Analysis

Bo Yin1 and Fang Chen1,2
1 School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, NSW 2052, Australia
2 National ICT Australia (NICTA), Australian Technology Park, Eveleigh 1430, Australia
[email protected], [email protected]
Abstract. Cognitive load, as an indicator of the pressure on working memory during task performance, has attracted growing research interest in recent years. By correctly measuring cognitive load levels, a system can adjust the task procedure to keep the cognitive load within an acceptable range, so that the subject can execute tasks more accurately and efficiently. Among the many approaches to measuring cognitive load, speech-based measurement is attractive because it is non-intrusive and can be performed online. Most existing research on speech-based cognitive load measurement relies on manually extracted features, which prevents practical use. In this paper, some potential speech features, such as the rate of pauses and the rate of pitch peaks, are investigated and shown to be effective. All feature extraction is performed by automatic algorithms.

Keywords: Cognitive load, speech.
working memory is limited, the cognitive load has to be kept below a safe threshold to prevent task failure. Sweller's research [2] on learning scenarios showed that limiting the cognitive load of the learning process increases the effectiveness of learning. According to Sweller’s Cognitive Load Theory [2], there are three types or sources of cognitive load: intrinsic, extraneous, and germane. Intrinsic cognitive load is related to the inherent difficulty of tasks. Extraneous cognitive load is related to the method of instruction or presentation. Germane cognitive load is related to the effort devoted to learning. The capacity of working memory varies between people, so even on the same task different people experience different cognitive loads. To improve the effectiveness of learning and/or reduce the risk of error on key tasks, it is very important to keep the subject’s cognitive load at an acceptable level by adapting the user interface or raising an alert about the risk. Measuring cognitive load is the key technology for this purpose. There are many categories of cognitive load measurement technologies [3]. By the source of the indicators, they can be categorized as physiological measurement, e.g. eye movement [4], electroencephalography (EEG) [5], event-related potentials (ERP) [6], PET, MRI, heart rate [7] and blood pressure; performance measurement [8]; self-report, e.g. the Rating Scale Mental Effort (RSME) [9] and the NASA Task Load Index (NASA-TLX) [10]; and behavioural measurement, e.g. linguistic indices [11]. By output, they can be categorized as online or offline measurement. By interference with the task, they can be categorized as intrusive or non-intrusive measurement. They can also be categorized as subjective or objective and as direct or indirect measurement. An ideal measurement would be accurate, non-intrusive, objective and online.
Considering these requirements, physiological measurement is not suitable because it is intrusive (cost is another issue); self-report measurement is not suitable either, because it is subjective and offline. Performance-based measurement cannot be carried out online. Behavioural measurement is a possible solution, but depends on the application scenario. This paper focuses on speech-based measurement, one of the behavioural measurements, since speech is already present in many real-life tasks, e.g. telephone conversation and voice control. Collecting speech is relatively easy and inexpensive, and speech-based measurement is also non-intrusive.
2 Existing Speech-Based Cognitive Load Measuring Approaches

Although speech is a possible source for measuring cognitive load, using it is difficult because the exact features related to cognitive load have to be found and extracted from the raw speech data. In fact, little existing research deals directly with speech-based cognitive load measurement. Berthold investigated some potential speech features that could indicate high cognitive load in a user modeling context [12]. Features such as the number of sentence fragments, the number of false starts and the number of self-repairs were studied and verified in a Bayesian network-based simulation. Two features, number of sentence fragments
and articulation rate, were shown to be related to cognitive load. However, accurate measurement or classification of different cognitive load levels was not possible. Müller studied the recognition of time pressure and cognitive load in a navigation task [13]. A dynamic Bayesian network was used to learn the patterns related to speech features. Six speech features were used: disfluencies, articulation rate, utterance content quality, number of syllables in an utterance, silent pauses and filled pauses. Jameson carried out a further investigation under the additional condition of background acoustic distraction [14]. The utterance content quality feature was replaced by hesitation and onset latency. The number of syllables was shown to make the largest contribution. Although these researchers have pointed out some potential relationships between cognitive load and particular speech features, measuring or classifying cognitive load has not yet been achieved. Another limitation of the above research is that all speech features were manually labeled, which makes automatic measurement of cognitive load impossible.
3 Experiment Setup

For a particular subject, if the task process stays the same, i.e. the extraneous cognitive load is the same, the overall cognitive load is determined by the intrinsic cognitive load, i.e. by the inherent difficulty level of the task. Therefore, designing tasks with different difficulty levels is a reasonable method for inducing different cognitive load levels in a subject.
Fig. 1. User interface of experiment system
The task is based on traffic management scenarios, in which the subject acts as a traffic controller. The task consists mainly of locating and reporting road/traffic incidents. The related information and interactive street maps are displayed on a large screen (Figure 1). There are four task difficulty levels, based on different city maps. Since these difficulty levels reflect the cognitive load levels induced in subjects, in this paper we refer to tasks with a higher difficulty level as tasks with a higher cognitive load level. The lowest difficulty level uses a Brisbane map, the medium difficulty level uses a Melbourne map, the high difficulty level uses a Sydney map, and the highest difficulty level uses a Canberra map with a 90-second time limit. Each difficulty level has three different tasks. All 12 tasks are repeated with different interaction modalities: speech only, gesture only, and speech plus gesture. The analysis in this paper is based on the speech-only data. The actions that the subject needs to perform in a task are to ‘Zoom in’ and ‘Zoom out’ of maps, ‘Select’ active map elements, ‘Tag’ map elements, and provide information about active map elements. In speech-only tasks, the subject uses speech commands for all actions. For each subject, all tasks together last about 30 minutes, but most of the recording is silence. The voice control interface is designed as a keyword command system, so for each action the operator (subject) speaks only a limited number of individual words. Consequently, the information contained in the speech is also limited. Speech-only data are available from five subjects in total.
4 Automatic Speech-Based Measuring

4.1 Identify Features

To measure cognitive load automatically from speech, suitable speech features first have to be chosen. The major challenge in choosing features is to make sure that they are consistent, quantifiable, and can be acquired automatically. Previous research has suggested that disfluencies, inter-sentential pausing, fragmented sentences and slower speech rate are the major speech cues related to cognitive load [14, 15]. Inspired by research on emotional speech, we consider the prosodic pattern as another cue that may reflect cognitive load. To use these cues in a real-world system, specific features have to be derived from them. The following list shows some possible features, grouped by cue.

Disfluencies
  Interruption rate
  Proportion of effective speech in the whole speech period
  Keywords for correction or repetition
Inter-sentential pausing
  Length and frequency of the big pauses
Fragmented sentences
  Length and frequency of the small pauses
  Length of intra-sentence segments
Slower speech rate
  Syllable rate
Other
  Delay of generating speech
  Particular hybrid prosodic pattern

The research task thus became to find effective features that reflect the above cues and can be extracted automatically, and to find a suitable learning scheme to identify the cognitive load. After investigating different features and considering the characteristics of the tasks (individual words), the frequency of pauses and the prosodic pattern were selected as potential features.

4.2 Feature Extraction

A Voice Activity Detector (VAD) was implemented, based on a customized version of the ETSI distributed speech recognition front-end. A pitch extraction module was added to the front-end to produce pitch values synchronized with the speech/pause detection results. The analysis frame length is 10 ms. Noise in the recording introduces some spurious short speech segments, which split a long pause into small parts. Conversely, unvoiced sounds and small gaps within a word introduce short pause segments that are correctly detected but unwanted, and that split an utterance into many small pieces. To avoid both effects, an FIR low-pass digital filter is applied to the detection output. The cut-off point is set to ¼ of the sampling frequency. Figure 2 shows the detection output before and after filtering; the filtered segmentation is much closer to what is wanted.

[Figure: two panels titled "Before filtering" and "After filtering"; vertical axis from -0.5 to 1.5, horizontal axis from 0 to 3500]
Fig. 2. Speech detection output before and after filtering
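The smoothing step described in Section 4.2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the text specifies only an FIR low-pass filter with a cut-off at ¼ of the sampling frequency, so the windowed-sinc design, the Hamming window, the 11-tap length, the edge padding and the names `lowpass_fir_taps` and `smooth_vad` are all assumptions made for the example.

```python
import math


def lowpass_fir_taps(num_taps=11, cutoff=0.25):
    """Windowed-sinc low-pass taps (assumed design, odd num_taps).

    cutoff is a fraction of the sampling frequency; 0.25 is the value
    stated in the text. The Hamming window and tap count are assumptions.
    """
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unit gain at DC


def smooth_vad(flags, taps):
    """Filter a per-frame 0/1 speech flag sequence.

    The first and last frames are repeated as padding so the output has
    the same length as the input (an assumption; the text does not say
    how edges are handled).
    """
    half = len(taps) // 2
    padded = [flags[0]] * half + list(flags) + [flags[-1]] * half
    return [
        sum(t * padded[i + j] for j, t in enumerate(taps))
        for i in range(len(flags))
    ]


if __name__ == "__main__":
    taps = lowpass_fir_taps()
    flicker = [i % 2 for i in range(40)]  # hypothetical frame-to-frame flicker
    print([round(v, 3) for v in smooth_vad(flicker, taps)[5:10]])
```

With these assumed settings, a speech/pause decision that flips on every frame is flattened toward its mean, while a steady run of speech or pause passes through unchanged. How the filtered values are turned back into speech/pause decisions is not stated in the text, so no thresholding is shown.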
4.3 Rate of Pauses

Since we are interested in the pause segments, the speech segments are removed. The frame-based speech detection results are re-coded into instance-based data, in which each instance gives the length of one continuous pause segment. To better represent the inter-sentential pauses, pause segments shorter than a predefined threshold are also removed. The best value found for this threshold is 100 ms. Several approaches were tried, including decision tree-based learning and Gaussian mixture models (GMM). The most effective indicator is the pause-rate calculated directly from the pause instances. The pause-rate is defined as:
\[ \mathrm{RATE}_{pause} = \frac{N_{pause}}{L_{task}} \tag{1} \]

where $\mathrm{RATE}_{pause}$ is the pause-rate, $N_{pause}$ is the number of pauses, and $L_{task}$ is the length of the current sub-task. All values are calculated on the sub-task basis. Then the pause-rate is normalized:

\[ \mathrm{RATE}_{pause\_norm,L} = \mathrm{RATE}_{pause,L} - \frac{1}{M}\sum_{l=1}^{M} \mathrm{RATE}_{pause,l} + 0.5 \tag{2} \]

where $\mathrm{RATE}_{pause\_norm,L}$ is the normalized pause-rate for CL level $L$, $M$ is the number of CL levels, and $\mathrm{RATE}_{pause,L}$ is the original pause-rate for CL level $L$. The average rate of pauses across all sub-tasks within one difficulty level is calculated as the rate of pauses of this difficulty level. The following figure shows the average rate of pauses from all five available subjects:
Fig. 3. Average rate of pauses from all CL levels and subjects
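The pause-rate computation of Equations (1) and (2) can be sketched as below. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, the sub-task length is assumed to be measured in seconds (the text does not give the unit), and the example frame sequence and per-level rates are invented.

```python
FRAME_SEC = 0.010       # 10 ms analysis frame, as stated in Section 4.2
MIN_PAUSE_FRAMES = 10   # 100 ms threshold from the text, expressed in frames


def pause_lengths(is_speech):
    """Return the length, in frames, of every continuous pause segment."""
    lengths, run = [], 0
    for flag in list(is_speech) + [True]:  # sentinel closes a trailing pause
        if flag:
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    return lengths


def pause_rate(is_speech, min_frames=MIN_PAUSE_FRAMES):
    """Equation (1): retained pauses divided by the sub-task length.

    Assumption: the sub-task length is expressed in seconds.
    """
    n_pause = sum(1 for n in pause_lengths(is_speech) if n >= min_frames)
    return n_pause / (len(is_speech) * FRAME_SEC)


def normalise(rate_by_level):
    """Equation (2): subtract the mean over CL levels, then add 0.5."""
    mean = sum(rate_by_level.values()) / len(rate_by_level)
    return {level: rate - mean + 0.5 for level, rate in rate_by_level.items()}


if __name__ == "__main__":
    # Hypothetical 2-second sub-task: one 200 ms pause and one 50 ms gap.
    frames = [True] * 50 + [False] * 20 + [True] * 25 + [False] * 5 + [True] * 100
    print(pause_lengths(frames))
    print(pause_rate(frames))
    print(normalise({1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8}))
```

In the example, the 50 ms gap falls below the 100 ms threshold and is not counted. After normalization the rates of the levels average 0.5, which removes a subject's overall pausing habit while keeping the differences between levels.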
The trend is clear: the rate of pauses is always higher at a higher cognitive load level. To check that this observation is statistically significant, an ANOVA test was conducted on the values. The null hypothesis (H0) is ‘The rate of pauses does not differ between cognitive load levels’. Correspondingly, in view of the observed trend, H1 is ‘The rate of pauses is significantly higher when cognitive load is higher’. The following table shows the result of the ANOVA test produced by the statistics tool SPSS [16]. The value in the Sig. field is the p-value, i.e. the probability of obtaining an F statistic at least as large as the observed one if H0 were true. Since the value .000 (further digits are truncated in the display) is smaller than 0.05 (the significance level commonly used for acceptance), H0 is rejected at the 0.05 significance level and H1 is therefore accepted.

Table 1. ANOVA test result on average rate of pauses from all subjects, grouped by cognitive load levels
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   .196             3    .065          13.671   .000
Within Groups    .077             16   .005
Total            .273             19
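The quantities in Table 1 follow the usual one-way ANOVA layout: four cognitive load levels and five subjects give 20 values, hence 3 degrees of freedom between groups and 16 within. The sketch below shows how the sums of squares, mean squares and F statistic are obtained. The function name `one_way_anova` and the sample numbers are hypothetical, and the p-value is not computed here because it requires the F distribution, which the paper obtains from SPSS.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and F for a one-way ANOVA.

    groups is a list of lists, one list of observations per group
    (here: one list of normalised rates per cognitive load level).
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": f_stat,
    }


if __name__ == "__main__":
    # Invented values for three groups, only to exercise the function.
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

Each mean square is the sum of squares divided by its degrees of freedom, and F is the ratio of the between-groups mean square to the within-groups mean square, which is how the F column of the table relates to the other columns.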
4.4 Rate of Pitch Peaks

The pitch values are produced by the pitch extraction module of the VAD front-end. To capture the pitch peaks, the pitch values were filtered with an FIR high-pass filter whose cut-off frequency is ¼ of the sampling frequency. A peak detection criterion was then applied through a sliding window. The detection criterion is:
\[ \mathrm{DET}_{pitchpeak} = \begin{cases} \text{true}, & \text{when } \begin{cases} p_{i+1} > p_i, & \text{for any } 0 \le i < N, \text{ and} \\ p_{i-1} > p_i, & \text{for any } -N < i \le 0 \end{cases} \\ \text{false}, & \text{otherwise} \end{cases} \]

where $\mathrm{DET}_{pitchpeak}$ indicates whether the pitch sequence within the current window is an effective pitch peak, $p_i$ is the pitch value at position $i$, with $i = 0$ for the center frame of the current window, and the length of the window is $2N + 1$.
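A sliding-window test of this kind can be sketched as follows. Two points are assumptions rather than statements of the source. First, read literally, the inequalities as printed describe values that rise away from the center frame, which is a local minimum; the sketch assumes that a peak is meant to be a local maximum with strictly falling values on both sides, and exposes the direction as a parameter so that either reading can be tried. Second, the function names and the sample pitch track are hypothetical.

```python
def is_pitch_peak(window, find_maximum=True):
    """Test the centre of an odd-length window against the monotonic criterion.

    With find_maximum=True the values must fall strictly on both sides of
    the centre (assumed meaning of "peak"). With find_maximum=False the
    inequalities are applied as printed, i.e. values rise on both sides.
    """
    n = len(window) // 2
    if len(window) != 2 * n + 1:
        raise ValueError("window length must be odd (2N + 1)")
    sign = 1 if find_maximum else -1
    right = all(sign * window[n + i] > sign * window[n + i + 1] for i in range(n))
    left = all(sign * window[n - i] > sign * window[n - i - 1] for i in range(n))
    return right and left


def pitch_peak_frames(pitch, half_width=2, find_maximum=True):
    """Slide a window of length 2 * half_width + 1 and list the peak frames."""
    return [
        c
        for c in range(half_width, len(pitch) - half_width)
        if is_pitch_peak(pitch[c - half_width : c + half_width + 1], find_maximum)
    ]


if __name__ == "__main__":
    # Hypothetical pitch track in Hz, one value per 10 ms frame.
    track = [100, 110, 130, 120, 105, 100, 102, 101, 100]
    print(pitch_peak_frames(track))
```

Requiring strict monotonic change across the whole window means that a small bump riding on a larger slope is rejected, which is one plausible reason for using a window rather than comparing only adjacent frames.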
In this experiment, a window length of 5 was selected as giving the best results. The detection results are re-coded into a sequence of effective pitch peaks with time stamps. Several approaches were tried, including decision tree-based learning and GMM. The most effective indicator is the rate of pitch peaks calculated directly from the pitch peak instances. The rate of pitch peaks is defined as:
\[ \mathrm{RATE}_{pitchpeak} = \frac{N_{pitchpeak}}{L_{task}} \tag{3} \]

where $\mathrm{RATE}_{pitchpeak}$ is the rate of pitch peaks, $N_{pitchpeak}$ is the number of pitch peaks, and $L_{task}$ is the length of the current sub-task. All values are calculated on the sub-task basis. Then the rate of pitch peaks is normalized:

\[ \mathrm{RATE}_{pitchpeak\_norm,L} = \mathrm{RATE}_{pitchpeak,L} - \frac{1}{M}\sum_{l=1}^{M} \mathrm{RATE}_{pitchpeak,l} + 0.5 \tag{4} \]

where $\mathrm{RATE}_{pitchpeak\_norm,L}$ is the normalized rate of pitch peaks for CL level $L$, $M$ is the number of CL levels, and $\mathrm{RATE}_{pitchpeak,L}$ is the original rate of pitch peaks for CL level $L$. Similarly, the average rate of pitch peaks across all sub-tasks within one difficulty level is calculated. The following figure shows the average rate of pitch peaks from all five available subjects:
Fig. 4. Average rate of pitch peaks from all CL levels and subjects
The trend is clear: the rate of pitch peaks is always higher at a higher cognitive load level. To check that this observation is statistically significant, an ANOVA test was again conducted on the values. The null hypothesis (H0) is ‘The rate of pitch peaks does not differ between cognitive load levels’; correspondingly, in view of the observed trend, H1 is ‘The rate of pitch peaks is higher when cognitive load is higher’. The test result in the following table shows that H0 should be rejected at the 0.05 significance level, and H1 is therefore accepted.

Table 2. ANOVA test result on average rate of pitch peaks from all subjects, grouped by cognitive load levels
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   .368             3    .123          7.806   .002
Within Groups    .252             16   .016
Total            .620             19
5 Conclusions
Speech-based features are possible indicators of cognitive load level. Speech-based measurement is particularly suitable because it is non-intrusive and its features can be extracted automatically. The rate of pauses and the rate of pitch peaks were shown to be potential indicators of the cognitive load level in the designed tasks: both are higher when the cognitive load level is higher. Several constraints remain in this research. Because of the limitations of the task and scenario, the amount of available data is insufficient for training a classifier. Additionally, the current features are significant only on long utterances, and are relatively difficult to compare between subjects because a baseline is lacking. Future research will concentrate on collecting more data and deploying an automatic classifier.
References

1. Mousavi, S.Y., Low, R., Sweller, J.: Reducing Cognitive Load by Mixing Auditory and Visual Presentation Modes. Journal of Educational Psychology 87(2), 319–334 (1995)
2. Sweller, J., Merrienboer, J.J.G.v., Paas, F.G.W.C.: Cognitive Architecture and Instructional Design. Educational Psychology Review 10(3) (1998)
3. Paas, F., Merrienboer, J.J.G.v.: Instructional Control of Cognitive Load in the Training of Complex Cognitive Tasks. Educational Psychology Review 6, 5171 (1994)
4. Kahneman, D.: Attention and Effort. Prentice Hall, New Jersey (1973)
5. Gevins, A., Smith, M.E.: Neurophysiological Measures of Working Memory and Individual Differences in Cognitive Ability and Cognitive Style. Cerebral Cortex 10(9), 829–839 (2000)
6. Scerbo, M.S., et al.: The Efficacy of Psychophysiological Measures for Implementing Adaptive Technology. NASA Langley Research Center, Hampton (2001)
7. Roscoe, A.H.: Assessing Pilot Workload. Why measure heart rate, HRV and respiration? Biological Psychology (34), 259–287 (1992)
8. O’Donnell, R.D., Eggemeier, F.T.: Workload Assessment Methodology. In: Cognitive processes and performance, vol. 42, pp. 1–49. Wiley, New York (1986)
9. Zijlstra, F.R.H., Doorn, L.v.: The construction of a scale to measure perceived effort. Department of Philosophy and Social Sciences, Delft University of Technology, Delft, The Netherlands (1985)
10. Hart, S.G., Staveland, L.E.: Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. Human Mental Workload, pp. 139–183. North-Holland, Amsterdam (1988)
11. Kettebekov, S.: Exploiting Prosodic Structuring of Coverbal Gesticulation. In: ICMI04, State College, PA, USA (2004)
12. Berthold, A., Jameson, A.: Interpreting Symptoms of Cognitive Load in Speech Input. In: UM99 (1999)
13. Müller, C., et al.: Recognizing Time Pressure and Cognitive Load on the Basis of Speech: An Experimental Study. In: UM2001 (2001)
1020
B. Yin and F. Chen
14. Jameson, A., et al.: Assessment of a User’s Time Pressure and Cognitive Load on the Basis of Features of Speech. Journal of Computer Science and Technology (2006) 15. Oviatt, S., Coulston, R., Lunsford, R.: When Do We Interact Multimodally? Cognitive Load and Multimodal Communication Patterns. In: ICMI 04, State College, Pennsylvania, USA (2004) 16. SPSS Inc. http://www.spss.com
Attitudes in ICT Acceptance and Use
Ping Zhang¹ and Shelley Aikman²
¹ School of Information Studies, Syracuse University, Syracuse, NY 13244, USA
² Department of Psychology, Syracuse University, NY 13244, USA
{pzhang, saikman}@syr.edu
Abstract. Information and communication technology (ICT) acceptance and use is a prolific research stream in the information systems (IS) field. One major theoretical influence is the Theory of Reasoned Action/Theory of Planned Behavior (TRA/TPB). While the research stream has achieved high consensus and validation in IS, interest in attitude, an important concept in TRA/TPB, has gone through ups and downs over the past decades because attitude has appeared to lack predictive power for behavioral intention. In this paper, we clarify both conceptual and operational confusions by providing clear definitions of two different types of attitudes and detailing their relationships to each other and to behavioral intention. Empirical data confirm that attitude toward behaviors is a better predictor of intention than attitude toward objects (ICT), and that attitude toward objects has a positive influence on attitude toward behaviors. Attitudes toward a previous version of the software and its use have significant impacts on current attitudes.
Keywords: attitudes, ICT acceptance, ICT use, empirical study.
In combination, attitude toward behavior, subjective norm, and perception of behavior control lead to the formation of a behavioral intention. Over the past decades, TRA and TPB have influenced IS research on technology acceptance, from the formation of the well-known Technology Acceptance Model [6], to a number of alternative models, to a recent attempt at a unified theory of technology acceptance [1], and to several studies that refine the models by considering moderating factors [2], among other factors. Research has converged on the roles of an individual’s beliefs and the various antecedents of those beliefs. Salient behavioral beliefs include perceived usefulness, perceived ease of use, and perceived enjoyment.
Despite the importance of attitude toward behavior in TRA and TPB, the concept of attitude has not always been the focal interest in IS research on technology acceptance. Empirical studies report inconsistent and inconclusive results for the role of attitude in behavioral intention. Based on a survey of the literature, attitude toward using technology is theorized not to be a direct determinant of intention and is thus excluded from the unified theory of acceptance and use of technology (UTAUT) [1].
Our careful examination of the literature shows that it is too early to remove attitude toward behavior from technology acceptance research. In fact, attitude should play an important role in intention if the concept is clearly understood and carefully studied. We posit that conceptual and operational misconceptions of attitude toward behavior in the IS literature may be the cause of the inconsistent and inconclusive results. In this paper, we investigate the theoretical underpinning of different attitude concepts, the structure of attitude, and the roles of different attitudes in forming technology use intention.
Specifically, we focus on two key points in this paper. (1) There are two types of attitudes that have different effects on intention but that have been mixed up in a number of studies: attitude toward ICT as an object (ATO for short), and attitude toward using ICT as a behavior (ATB). We hypothesize that ATB is a strong predictor of intention, while ATO’s impact on intention is fully mediated by ATB. (2) One’s ATO and ATB toward an earlier version of an ICT can have strong impacts on his or her current ATO and ATB. An empirical study is conducted to validate our theoretical positions.
2 Literature Review on Attitude in Technology Acceptance Research
The two threads we use for the literature review are (1) whether the attitude concept being studied is toward an object (ATO) or toward a behavior (ATB), and (2) how the attitude is measured in the study. Most of the reviewed studies emphasize ATB. Some of these studies, however, actually confuse attitudes toward objects with attitudes toward behaviors. The measures for attitude also differ widely. Some measure “global” attitudes, while others measure the informational base of the attitudes. For example, [7, 8] considered perceived ease of use and perceived usefulness as attitude measures. Yet in TAM and other models, these two are the cognitive antecedents of attitude. The measures also vary in whether they address attitude toward ICT (or ICT use) in general or toward the specific target ICT being studied.
In addition, the measures cover different or unbalanced components of attitude, including instrumental (e.g., “beneficial,” “useful,” “valuable”), experiential (e.g., “unpleasant,” “enjoyable”), or seemingly more general ones (“good/bad,” “negative/positive,” “favorable/unfavorable”). The misconception of the attitude construct has been noted by other researchers as well. For instance, in a recent study, Wixom and Todd clearly stated the difference between ATO and ATB, and included ATB in the research model [9]. However, to the best of our knowledge, few studies provide a systematic examination of the two attitude concepts or consider ATO and ATB together in the same study to distinguish them from each other. It is thus our intent in this paper to demonstrate that ATO and ATB are both conceptually different, as supported by theoretical reasoning, and empirically different, as supported by empirical evidence. In addition, we want to clarify the confusion about the structure of attitude that influences how attitude is measured within IS. Toward this end, we present a theoretically supported structure/instrument and then test the psychometric properties of the instrument in the empirical study.
3 Conceptual Development
3.1 Attitude Structure
According to Ajzen and Fishbein, at the time of Wicker’s review [10] of attitude studies, the most popular conceptions of attitude incorporated the traditional trilogy of thinking, feeling, and doing. In contemporary language, attitude was defined as a complex, multidimensional construct comprised of cognitive, affective, and conative components. Yet, most attitude measurement techniques resulted in capturing only the “affect” side of the concept [11]. Our literature review confirmed that this practice does exist. MIS research has widely accepted the attitude definition by Fishbein and Ajzen [5] in that attitude is “an individual’s positive or negative feelings (evaluative affect) about performing the target behavior” [1, 12, 13]. That is, attitudes are often considered overall affective evaluations [4]. The most recent theoretical understanding of the structure of attitude posits that attitude toward behavior contains instrumental (e.g., desirable-undesirable, valuable-worthless) as well as experiential (e.g., pleasant-unpleasant, interesting-boring) aspects; thus attitude measures should contain items representing these two subcomponents [11]. In terms of attitude toward object, it has been well established that attitude should be measured by general evaluative terms such as positive/negative, good/bad, desirable/undesirable, and like/dislike [14].
3.2 The Roles of Attitudes
Research within attitude theory in general makes a distinction between attitudes toward an object and attitudes toward a behavior [4, 11, 15, 16]. To clearly distinguish the two types of attitude, here we adopt the well-established definitions of these concepts in psychology. Attitude toward object (ATO) is defined as “a psychological tendency that is expressed by evaluating a particular entity with some
degree of favor or disfavor” [15] or similarly, as a combination of evaluative judgments about an object [14]. Attitude toward behavior (ATB) is defined as “an individual’s positive or negative feelings (evaluative affect) about performing the target behavior” [5]. In TRA and TPB, attitude is toward a particular behavior [3-5]. In a recent effort to clarify the roles of attitudes in behavior, Ajzen and Fishbein restate the difference between attitude toward object and attitude toward behavior, which is both theoretically and empirically established [11]. The two attitudes have different functions regarding behavior. In particular, attitude toward behavior has been shown to be a much better predictor of behavioral intention and behavior than attitude toward object [4, 11]. This finding can be explained by the principle of compatibility [5, 15]. Generally, for an attitude to be predictive of a behavior, the attitude must be assessed at the same level of specificity as the behavior; that is, the attitude being assessed must be as broad as the behavior in question. It follows that a behavioral intention toward using something (e.g., ICT) would be best predicted by an attitude measure of behavior regarding that something (again, ICT) rather than by an attitude measure of the object, which by definition is assessed at a different level (e.g., attitude toward the ICT itself). Further, Eagly and Chaiken point out that, although these two broad theories of attitudes are generally examined separately (i.e., either attitude toward the object or attitude toward the behavior), it might be useful to examine both together to obtain a good prediction of behavior. For example, once an attitude toward an object has been activated, there are likely still many options of appropriate behaviors to choose from.
If a link is established between an ATO and an ATB (as a possible behavioral choice), then once an ATO has been activated, the attitude toward behavior (ATB) should also be activated; this leads to a good prediction of behavior [15]. In the IS literature, some empirical studies support the claim that attitude toward object predicts attitude toward behavior. For example, Hiltz and Johnson find that attitude toward computers (ATO) has a significant impact on attitude toward alternative media. Thus, to summarize, ATO predicts ATB, which in turn predicts behavioral intention (BI). From this, we postulate the following three hypotheses.
H1: Attitude toward object has a positive impact on attitude toward behavior.
H2: Attitude toward behavior has a positive impact on behavioral intention.
H3: Attitude toward object does not have a direct effect on behavioral intention.
3.3 The Impact of Previous Attitudes
We also explore the role of one’s attitudes toward similar objects (ATO0 for short) and one’s attitudes toward behavior with similar objects (ATB0) on one’s attitudes toward the current product. Attitude research, both in general and in the applied realms of marketing, advertising, and ICT, suggests that prior attitudes and behaviors can have various effects on current attitudes and behavioral intentions. First, research by Ouellette and Wood demonstrates that past behaviors, along with attitudes toward the behavior and subjective norms, predict behavioral intentions even when those past behaviors are not well-learned. Further, research examining these relationships has also found that attitudes toward a similar object and attitudes toward behavior with a similar object impact one’s attitudes related to the current object. For instance,
attitude research demonstrates that once a stimulus is categorized, its evaluation may be consistent with the category evaluation [17]; therefore, if a new ICT product is perceived to be in the same category as a product previously used, the new product is likely to be evaluated in light of the existing category evaluation. This is confirmed by our preliminary results in the e-commerce context: a user can form a positive reaction toward a new website after a 0.5-second exposure to it if the website resembles typical e-commerce website designs. Similarly, research in the domain of marketing and advertising demonstrates that attitudes that a consumer holds toward a parent brand impact attitudes toward new products released by that parent brand [18]; this effect is particularly strong when the new product is similar to existing products [19, 20]. Because participants taking part in a study examining ICT use likely have previous experience with ICT, it is important to take these prior attitudes and behaviors into account. The ATO0 and ATB0 constructs realize the feedback loop idea of the impacts of formed attitudes on new interactions [11]. Because research examining these links has found conflicting results in terms of positive or negative effects, we explore the relationships between ATO0, ATB0, and current attitudes rather than make specific predictions on the directions of these effects. Thus, we have the following four hypotheses.
H4: Attitude toward a similar object has an effect on attitude toward the object.
H5: Attitude toward a similar object has an effect on attitude toward using the object.
H6: Attitude toward using a similar object has an effect on attitude toward the object.
H7: Attitude toward using a similar object has an effect on attitude toward using the object.
4 Methodology
The data for this study were collected as part of a larger field study that used a survey method to understand student evaluations of a course management system, WebCT, at a major northeast university in the US. A new version, WebCT 6.0, was implemented in the Fall 2006 semester. It differs significantly from the previous version (WebCT 4.0), with a new look and new functions, and it requires users to form new mental models of some of its features. The data were collected during the 3rd and 4th weeks of the semester, when students were getting settled in their courses. An announcement for the survey was posted on the WebCT 6.0 homepages of the courses that required WebCT 6.0, with the incentive of a chance to win one of two $100 cash prize drawings. A link directed participants to a survey website hosted at SurveyMonkey.com. The study is thus about mandatory use of a relatively new ICT product. Responses from 121 graduate student participants in the survey were included in this study. All of these participants had used the previous version, WebCT 4.0, for at least one month. They had an average age of 33 years (std=9.7), 15 years (std=6.1) of experience using computers, and 10 years (std=3.2) of experience using the world wide
web. 63% of the participants were female, 68% Caucasian/white, 18% Asian/Pacific Rim, 5% African-American, and 3% Hispanic. The majority of them had started using WebCT 6.0 only at the beginning of the semester; a few had participated in the piloting of WebCT 6.0 in the summer. They had used the previous version, WebCT 4.0, for an average of 14 months (std=10).
The measures for behavioral intention and attitude toward objects were adopted from [9] and [14] respectively. For attitude toward behavior, we constructed the measures based on the guideline in [11]. This measure of attitude toward behavior includes both instrumental and experiential aspects. All constructs were measured using multiple items on 5-point Likert scales (1 for Strongly Disagree, 2 for Somewhat Disagree, 3 for Neither Agree nor Disagree, 4 for Somewhat Agree, and 5 for Strongly Agree).
Data analyses consisted of two phases. The first phase was confirmatory factor analysis (CFA) to assess the measurement model. All constructs were modeled as reflective and measured by multiple indicators. The second phase was to test the research model using a structural equation modeling technique. PLS was used for these analyses.
The measurement model was examined in Phase 1 for convergent and discriminant validity. Convergent validity was assessed by reliability of items, composite reliability of constructs, and average variance extracted (AVE). Discriminant validity was assessed by examining cross-loadings and the relationship between correlations among constructs and the square roots of the AVEs. Reliability of items was assessed by examining each item’s loading on its corresponding construct. A common rule of thumb suggests that the item loading should exceed .70 [21, 22]. Confirmatory factor analysis results showed that all items exhibited loadings of more than .70 on their corresponding constructs, indicating adequate reliability of items. Table 1 shows composite reliability.
AVE measures the amount of variance that a construct captures from its indicators relative to the amount due to measurement error [22]. It is recommended that AVE exceed 0.5. As shown in Table 1, all of the constructs met this guideline. AVE has also been suggested as a means of evaluating discriminant validity [23]. The square root of a construct’s AVE should be greater than its correlations with the other constructs, which indicates that more variance is shared between the construct and its indicators than with other constructs. In Table 1, the shaded numbers on the leading diagonal are the square roots of the AVEs. Off-diagonal elements are the correlations among constructs. All diagonal numbers are greater than the off-diagonal ones, indicating satisfactory discriminant validity of all the constructs. Another criterion for assessing discriminant validity is that no measurement item should load more highly on any construct other than the construct it is intended to measure [22]. An examination of cross-factor loadings shows that all items satisfied this guideline. This provides empirical evidence that ATO and ATB (including ATO0 and ATB0) are distinct constructs. Table 2 shows the outer model loadings in the model context, indicating that all constructs have satisfactory indicator loadings.
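The validity checks described above can be illustrated with a short computation. The following is a minimal sketch in Python, assuming standardized loadings and the commonly used formulas for AVE and composite reliability; the loadings and correlations are hypothetical and are not the values reported in Table 1, and the function names are ours.

```python
from math import sqrt


def average_variance_extracted(loadings):
    """AVE for standardized loadings: the mean of the squared loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability: (sum of loadings)^2 relative to itself plus error variance."""
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return total / (total + error)


def fornell_larcker_ok(ave, correlations):
    """True if sqrt(AVE) of both constructs in every pair exceeds their correlation."""
    return all(
        sqrt(ave[a]) > abs(r) and sqrt(ave[b]) > abs(r)
        for (a, b), r in correlations.items()
    )


# Hypothetical standardized loadings and correlations (NOT the paper's data).
loadings = {
    "ATO": [0.90, 0.88, 0.85],
    "ATB": [0.82, 0.80, 0.78, 0.75],
    "BI": [0.93, 0.91],
}
correlations = {("ATO", "ATB"): 0.70, ("ATO", "BI"): 0.55, ("ATB", "BI"): 0.72}

ave = {name: average_variance_extracted(ls) for name, ls in loadings.items()}
cr = {name: composite_reliability(ls) for name, ls in loadings.items()}

for name in loadings:
    print(f"{name}: AVE={ave[name]:.3f} sqrt(AVE)={sqrt(ave[name]):.3f} CR={cr[name]:.3f}")
print("Discriminant validity holds:", fornell_larcker_ok(ave, correlations))
```

With these illustrative numbers every AVE is above 0.5 and every square root of AVE is above the corresponding correlations, which is the pattern the text reports for Table 1.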
Results from Phase 2, the structural model, are shown in Figure 1. PLS does not use model fit indices; however, the explanatory power of a structural model can be assessed by the R-squared values (variance accounted for) in the dependent latent variables. Figure 1 shows that 55% of the variance in BI is explained by the model, and 67% of the variance in ATB is explained by ATO, ATO0, and ATB0. While ATB has a significant impact on BI, ATO does not. ATO’s impact on BI is fully mediated by ATB. In addition, ATB0 has
a significant positive impact on ATO and ATB; ATO0 has a significant negative impact on ATB. ATO0 also has a negative impact on ATO, but not at a significant level. In summary, all hypotheses except H4 (ATO0->ATO) are supported by the data.
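The idea of full mediation reported above can be made concrete with a small numerical sketch. The study itself used PLS; the example below instead uses ordinary least squares on synthetic data generated so that BI depends on ATO only through ATB. The sample size, coefficients, and the helper `ols` are all assumptions made for illustration and say nothing about the study's actual estimates.

```python
import random


def ols(y, *xs):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    n = len(y)
    cols = [[1.0] * n] + [list(x) for x in xs]
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(ci[t] * cj[t] for t in range(n)) for cj in cols]
        + [sum(ci[t] * y[t] for t in range(n))]
        for ci in cols
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [v - f * w for v, w in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic scores (hypothetical): ATO drives ATB, and only ATB drives BI.
rng = random.Random(0)
ato = [rng.gauss(0, 1) for _ in range(500)]
atb = [0.8 * x + rng.gauss(0, 0.5) for x in ato]
bi = [0.7 * m + rng.gauss(0, 0.5) for m in atb]

a_path = ols(atb, ato)[1]            # ATO -> ATB
total = ols(bi, ato)[1]              # ATO -> BI, ignoring ATB
_, direct, b_path = ols(bi, ato, atb)  # ATO -> BI and ATB -> BI, estimated jointly

print(f"a (ATO->ATB) = {a_path:.3f}")
print(f"b (ATB->BI)  = {b_path:.3f}")
print(f"total effect of ATO on BI  = {total:.3f}")
print(f"direct effect of ATO on BI = {direct:.3f}")
```

Under full mediation the total effect of ATO on BI is clearly nonzero, while the direct effect, estimated with ATB in the model, is close to zero. In OLS the total effect equals the direct effect plus the product of the two paths through the mediator, which is one way to check the computation.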
5 Discussion and Conclusion
We believe that there are misconceptualizations of the attitude construct(s) in the current literature, which make it difficult to compare studies or to draw definitive conclusions from their results. This is a main reason why attitudes have been dropped in many studies and why knowledge regarding the roles of attitudes remains inconclusive. In this paper, we clarify conceptual confusions regarding attitudes in the context of ICT acceptance and use. Specifically, we demonstrate that the target of the attitude, object or behavior, is a necessary distinction. We also demonstrate that an attitude structure that incorporates both experiential and instrumental aspects shows good psychometric properties. Efforts to clarify conceptual confusions about attitude concepts are greatly needed, given the inconclusive and inconsistent empirical findings and the importance of attitude in studying ICT acceptance and use. In this study, we also present a research model indicating the different roles of ATO and ATB in predicting BI, as well as the roles of attitudes toward a similar object and its use. Our empirical data largely support the model. It is interesting to note that the attitude toward the previous version of the software has a negative impact on the attitude toward using the new software. That is to say, the more positive one is about the previous version, the more negative one’s attitude toward using the current version of the software. One potential explanation for this could be the idea of psychological reactance. That is, people do not like their choices being taken away from them or their choices being made for them, and they often react in ways that reflect this.
So, for example, if someone had a positive attitude toward the previous version of the software (many participants had a positive attitude toward WebCT 4.0), they may not want to have to use a new version (many felt that their use of WebCT 6.0 was forced), and therefore react by having a negative attitude toward the new version. It is possible that their attitudes toward the new version could change as they gain experience and become more comfortable with it. Examining fluctuations in these attitudes over time and with increased usage is a fruitful avenue for further research. One research implication has to do with the treatment of attitudes in future empirical studies. Despite the fact that our current understanding of the user technology acceptance phenomenon is based on the attitude measures that Ajzen and Fishbein are currently criticizing [11], future studies should pay attention to the holistic nature of attitude, which includes both instrumental and experiential aspects. Another research implication is to pay attention to the antecedents of ATB because of its important impact on BI. TRA and TPB clearly state the importance of instrumental determinants (various beliefs). Recent work in IS, psychology, and other fields has started to examine the experiential (largely affective) antecedents of ATB. This makes sense, as the structure of attitudes includes both components. The third research implication is that previous empirical studies should be re-examined in light
of the ATO/ATB distinction to draw conclusions on the role of attitudes in behavioral intentions. There are also practical implications of our study. Attitude toward behavior fully mediates the effect of attitude toward object on behavioral intention. Knowing this, ICT designers, managers, trainers, marketers, and other stakeholders should understand that a positive attitude toward a particular ICT will not by itself lead potential users to decide to accept or use the ICT. To increase the chance of potential users adopting and continuing to use an ICT, efforts should be put into identifying antecedents of attitudes toward behavior in addition to attitude toward the ICT. Another implication has to do with the negative impact of attitude toward a previous version of the ICT on current attitude toward behavior. This finding suggests that in the case of mandatory ICT use, if someone holds a positive (negative) attitude toward a similar product, they are more likely to have a negative (positive) attitude toward using the current product. Knowing this, ICT designers, managers, trainers, marketers, and other stakeholders may want to take note of what aspects led to the positive (or negative) previous attitude in order to (1) make sure those positive features are retained in the new product and (2) make it clear to users that the features they found appealing before are still present and/or the features they found distasteful are no longer present. Future research should determine what information people use in forming their attitudes toward ICT and include this information when modeling the relationships between attitudes toward objects, attitudes toward behaviors, and behavioral intentions. Finally, it should be noted that the current research focused on mandatory ICT use. Future research should explore the relationship between prior attitudes, current attitudes, and behavioral intentions with regard to voluntary use of ICT.
References
1. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User Acceptance of Information Technology: Toward a Unified View. MIS Quarterly 27(3), 425–478 (2003)
2. Sun, H., Zhang, P.: The Role of Moderating Factors in User Technology Acceptance. International Journal of Human-Computer Studies 64(2), 53–78 (2006)
3. Ajzen, I.: The Theory of Planned Behavior. Organizational Behavior & Human Decision Processes 50(2), 179–211 (1991)
4. Ajzen, I., Fishbein, M.: Understanding Attitudes and Predicting Social Behavior. Prentice-Hall, Englewood Cliffs, NJ (1980)
5. Fishbein, M., Ajzen, I.: Belief, Attitude, Intention and Behavior: An Introduction to Theory and Research. Addison-Wesley, Reading, MA (1975)
6. Davis, F.: Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly 13(3), 319–340 (1989)
7. Harrison, A.W., Rainer Jr., R.K.: The Influence of Individual Differences on Skill in End-User Computing. Journal of Management Information Systems 9(1), 93–112 (1992)
8. Sambamurthy, V., Chin, W.W.: The Effects of Group Attitudes toward Alternative GDSS Designs on the Decision-making Performance of Computer-Supported Groups. Decision Sciences 25(2), 215–239 (1994)
9. Wixom, B.H., Todd, P.: A Theoretical Integration of User Satisfaction and Technology Acceptance. Information Systems Research 16(1), 85–102 (2005)
10. Wicker, A.W.: Attitudes versus Actions: The Relationship of Verbal and Overt Behavioral Responses to Attitude Objects. Journal of Social Issues 25, 41–78 (1969)
11. Ajzen, I., Fishbein, M.: The Influence of Attitudes on Behavior. In: Albarracín, D., Johnson, B.T., Zanna, M.P. (eds.) Handbook of Attitudes and Attitude Change. Erlbaum, Mahwah, NJ (2005)
12. Davis, F.D., Bagozzi, R.P., Warshaw, P.R.: User Acceptance of Computer Technology: A Comparison of Two Theoretical Models. Management Science 35(8), 982–1003 (1989)
13. Moon, J.-W., Kim, Y.-G.: Extending the TAM for a World-Wide-Web Context. Information & Management 38(4), 217–230 (2001)
14. Crites Jr., S.L., Fabrigar, L.R., Petty, R.E.: Measuring the Affective and Cognitive Properties of Attitudes: Conceptual and Methodological Issues. Personality and Social Psychology Bulletin 20, 619–634 (1994)
15. Eagly, A.H., Chaiken, S.: Attitude Structure and Function. In: Gilbert, D.T., Fiske, S.T., Lindzey, G. (eds.) The Handbook of Social Psychology, pp. 269–322. Oxford University Press, New York (1998)
16. Forgas, J.P.: Feeling is Believing? The Role of Processing Strategies in Mediating Affective Influences on Beliefs. In: Frijda, N.H., Manstead, A.S.R., Bem, S. (eds.) Emotions and Beliefs: How Feelings Influence Thoughts, pp. 108–143. Cambridge University Press, Cambridge, United Kingdom (2000)
17. Wegener, D.T., Carlston, D.E.: Cognitive Processes in Attitude Formation and Change. In: Albarracín, D., Johnson, B.T., Zanna, M.P. (eds.) The Handbook of Attitudes, pp. 493–542. Lawrence Erlbaum Associates, Mahwah, NJ (2005)
18. Aaker, D.A., Keller, K.L.: Consumer Evaluations of Brand Extensions. Journal of Marketing 54, 27–41 (1990)
19. Boush, D.M., Loken, B.: A Process-tracing Study of Brand Extension Evaluation. Journal of Marketing Research 28, 16–28 (1991)
20. Park, C.W., Milberg, S., Lawson, R.: Evaluation of Brand Extension: The Role of Product Feature Similarity and Brand Concept Consistency. Journal of Consumer Research 18, 185–193 (1991)
21. Barclay, D., Higgins, C., Thompson, R.: The Partial Least Squares (PLS) Approach to Causal Modeling, Personal Computer Adoption and Use as an Illustration. Technology Studies 2(2), 285–309 (1995)
22. Chin, W.W.: The Partial Least Squares Approach to Structural Equation Modeling. In: Marcoulides, G.A. (ed.) Modern Methods for Business Research, pp. 295–336. Lawrence Erlbaum Associates, Mahwah, NJ (1998)
23. Fornell, C., Larcker, D.F.: Structural Equation Models with Unobservable Variables and Measurement Errors. Journal of Marketing Research 18(2), 39–50 (1981)
Part IV
Models and Patterns in HCI
Using Patterns to Support the Design of Flexible User Interaction
M. Cecília C. Baranauskas and Vania Paula de Almeida Neris
Institute of Computing, IC - Unicamp, Campinas, São Paulo, Brazil
+55 19 3788-5870
{cecilia, neris}@ic.unicamp.br
Abstract. The social value of Web applications is in their potential to be the conduit for many different types of applications to many different people, using different resources and embedded in diverse contexts. Designing for flexibility involves many people with different skills, interests, and levels of commitment, including designers, developers, and users. Tailorable features in the user interface demand a clear bond between the phases of the whole software lifecycle, from requirements elicitation to the design and development stages. As interaction patterns have been considered a promising approach to bridge the gaps between analysis, design, and implementation of usability-related features, this work first investigates and synthesizes from the literature a set of interaction patterns related to tailoring activities. From this analysis, a semiotic-informed categorization of tailorable user interface features is presented and discussed; an elicitation pattern for tailorable user interface features illustrates the usefulness of the proposal.
Keywords: tailoring, user interface, flexibility, interaction patterns.
designers as well as for the end users themselves. Tailoring involves the concept of “design for change”, such that software applications provide the flexibility to be customized to different organizational contexts and to situations of use that were not anticipated or that have changed [7]. The main benefits of customization emphasized in the literature are greater efficiency [14], greater satisfaction of use [18], and a smaller learning curve when an application is replaced [13, 18]. In this work we consider tailoring that is done explicitly by the end user and not automatically by the software. Designing for flexibility involves many people with different skills, interests, and levels of commitment, including designers, developers, and users. Tailorable features in the user interface demand a clear bond between the phases of the whole software lifecycle, from requirements elicitation to the design and development stages. On the other hand, Interaction Patterns are becoming a valuable approach to bridge the gap between analysis and design [19]. The idea of design patterns, originally introduced in the field of architecture by Alexander [1], has been successfully applied in Software Engineering [6] and, more recently, in the Human-Computer Interaction field [22, 9]. A design pattern is a solution to a recurring design problem; a pattern description consists of at least three parts: a problem, its context, and a solution to it. Another interesting aspect of patterns is that, in the original idea, they were introduced to include the inhabitants in the design process of buildings. Likewise, in tailoring design, they can be used to facilitate communication among all the parties involved, including end users.
Although they are a promising approach to bridging the gaps between analysis, design, and implementation of usability-related features, interaction patterns are presented in different ways by various authors, who use different criteria for categorizing them. Moreover, recent studies have indicated problems in their usability [19]. In this work we draw upon the Semiotic Ladder (SL) [20] to categorize the interaction patterns according to the six SL layers of information: the physical, empirical, syntax, semantics, pragmatics, and social layers. This paper discusses the concepts related to tailoring and its variants and investigates the use of interaction patterns, more specifically Tailorable User Interface Features (TUIF), suitable for the design of flexible applications. Our overall research goal involves: (a) identifying a set of TUIFs based on the literature and best practice in HCI, (b) classifying them according to specific criteria, and (c) developing a structure and format for these interaction patterns, so that they are more usable and useful to designers as well as to developers of tailorable applications. The paper is organized as follows: Section 2 presents the concepts involved in Interaction Patterns and some patterns that are related to tailoring. Section 3 presents principles of Organizational Semiotics and the SL, which provides the theoretical frame of reference for the categorization we have used for the interaction patterns. Section 4 discusses the main findings: the expert knowledge in tailoring that enables us to derive the TUIF patterns and a categorization. Section 5 concludes.
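To make the pattern structure and the SL-based categorization more tangible, the following is a minimal sketch in Python of one way to represent a pattern with its three mandatory parts (problem, context, solution) and to group patterns by Semiotic Ladder layer. The class and field names, the two example patterns, and their layer assignments are hypothetical illustrations; they are not the TUIFs or the categorization derived in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class SemioticLayer(Enum):
    """The six layers of information of the Semiotic Ladder."""
    PHYSICAL = 1
    EMPIRICAL = 2
    SYNTAX = 3
    SEMANTICS = 4
    PRAGMATICS = 5
    SOCIAL = 6


@dataclass(frozen=True)
class InteractionPattern:
    """A pattern description with its three minimal parts plus an SL layer."""
    name: str
    problem: str
    context: str
    solution: str
    layer: SemioticLayer


def group_by_layer(patterns):
    """Map every SL layer to the names of the patterns assigned to it."""
    groups = {layer: [] for layer in SemioticLayer}
    for pattern in patterns:
        groups[pattern.layer].append(pattern.name)
    return groups


# Hypothetical patterns, placed on layers for illustration only.
patterns = [
    InteractionPattern(
        name="Rearrangeable toolbar",
        problem="Users need different commands close at hand.",
        context="Applications with many commands and varied tasks.",
        solution="Let users add, remove, and reorder toolbar items.",
        layer=SemioticLayer.SYNTAX,
    ),
    InteractionPattern(
        name="Group-defined vocabulary",
        problem="Teams name the same concept differently.",
        context="Applications shared across organizational units.",
        solution="Let a group rename labels to match its own terms.",
        layer=SemioticLayer.SOCIAL,
    ),
]

groups = group_by_layer(patterns)
for layer in SemioticLayer:
    print(layer.name.lower(), "->", groups[layer])
```

Keeping the layer as an explicit field lets a catalogue be filtered or reported by layer without changing the pattern descriptions themselves.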
2 Interaction Design Patterns and Tailoring
The original intention of defining patterns, as introduced by the architect Christopher Alexander, was to capture the essence of successful solutions to recurring design
Using Patterns to Support the Design of Flexible User Interaction
problems in a certain context. In [1] he espouses an approach to design that focuses on the interactions between the physical form of buildings and the way that form impacts personal and social behavior [7]. It was a reaction against the kind of buildings that had been built within the modernist tradition, where many of the immeasurable qualities of architecture had been lost. The patterns that his team made strive to resolve the conflicting forces, wants, needs, and fears that exist in the usage of a building [2]. Every pattern describes a recurring problem, its context, the forces that are at play in the situation, and a solution to the problem. The feature that solves the problem is written in a generic but concrete way, so it can be designed in an infinite number of ways while still being readily identifiable. Anyone should be able to see whether a design solution has a particular feature or not. This is especially important in helping non-specialists to participate in the design [2]. In the original idea, then, patterns were introduced to include the inhabitants in the design process of buildings. In the context of software application design, they can likewise be used to facilitate communication between all the involved parties, including final users. In the Computer Science field, Alexander's idea was first explored in Software Engineering. Software patterns became popular with the wide acceptance of the book Design Patterns by Erich Gamma and colleagues, frequently referred to as the Gang of Four [6]. In the Human-Computer Interaction community a large number of design patterns have been formalized [4, 22, 25] for different types of applications, such as e-learning, e-commerce and CSCW, to name a few. A pattern can be seen as a working hypothesis; each pattern represents the current understanding of what the best arrangement is for solving a particular problem [2].
So, the first step in this work, which aims at identifying a set of TUIFs based on the literature and best practice in HCI for the design of tailorable applications, was to identify patterns that are related to tailoring. Ten pattern collections were studied: van Welie [25], Tidwell [22], Borchers [4], Laakso [11], Beck [3], Coram and Lee [5], Arvola [2], Yahoo! [23], CSEG [17] and UC Berkeley [24]. Table 1 shows the selected patterns. As pointed out before, although interaction patterns are a good way to spread specialized knowledge, they are presented in different ways by various authors, who use different criteria for categorizing them. Moreover, recent studies have indicated problems in their usability [19]. In this work we draw upon the Semiotic Ladder (SL) [20] to categorize the interaction patterns that have an impact on the tailorability of the application user interface. Our approach is based on the idea that the development of any software application should consider more than the technical aspects represented by the computer application, including the social system where the computer application is situated. The next section presents Stamper’s SL, which provides a classification for signs that goes from the social layer to the physical layer. The patterns from Table 1 are then distributed in the SL, and the implications of this categorization are discussed.
M.C.C. Baranauskas and V.P. de A. Neris

Table 1. Patterns related to Tailoring

Name and Author: Customized Window - van Welie [25]
Problem: Users ideally want to have fully personalized content
Solution: Use "windows" with select items that users can adapt or click away. Users are presented with areas that look like they are kind of windows. They have a "close" or "minimize" button next to an "edit" or "customize" button. Users can customize what is displayed in the window or take it away completely. The settings are stored for each user and they see their customized window upon return. www.welie.com/patterns/showPattern.php?patternID=customizationwindow

Name and Author: Personalized 'My' Site - van Welie [25]
Problem: Users have a need to define their own page elements
Solution: Create a part of the site that belongs to a user and that is controlled by that user. First log in and then present a customized personal section. Usually the pages are built up using 'modules' that the users have selected. Users can change which modules they want and in which layout and graphical presentation. www.welie.com/patterns/showPattern.php?patternID=my-site

Name and Author: User Preferences - J. Tidwell [22]
Problem: How does the artifact present the actions that the user may take?
Solution: Provide a place or working surface where users can pick their own settings for things like language, fonts, icons, color schemes, and use of sound. Allow users to save those preferences. Devise a set of alternative "canned settings" that users can choose between. www.mit.edu/~jtidwell/language/user_preferences.html

Name and Author: Personal Object Space - J. Tidwell [22]
Problem: There are many things that the user needs ready access to, such as working surfaces, documents, objects, or tools. How should the items in question be organized?
Solution: Allow users to place things where they want, at least in one dimension but preferably in two. Start out with a reasonable default layout, however. Permit stacking, moving, grouping, aligning, "neatness" adjustments, sorting, and other layout operations. Do not capriciously rearrange the user's space -- only do automatic layout if the user specifically requests it! www.mit.edu/~jtidwell/language/personal_object_space.html

Name and Author: Scripted Action Sequence - J. Tidwell [22]
Problem: The user needs to perform the same sequence of actions over and over and over again, with little or no variability. How can the artifact make repetitive tasks easier for the user?
Solution: Provide a way for the user to "record" a sequence of actions of their choice, and a way to easily "play them back" at any time. The playback should be as easy as giving a single command, or pressing a single button, or dropping the action object onto a control of some kind. The user should be able to give the sequence a name of their choice. www.mit.edu/~jtidwell/language/scripted_action_sequence.html

Name and Author: User's Annotations - J. Tidwell [22]
Problem: The artifact is complex and difficult to learn, but will be used again. How can the artifact help preserve the user's hard-won understanding from one use session to the next?
Solution: Support ways for users to add their own comments and other annotations to the artifact. Allow the users to place those annotations physically close to where they are needed, and if possible, allow for simple drawings in addition to text. Let users write private comments, for their own eyes only, and also let them write public ones that other users can read. Save the annotations from session to session, as a part of the artifacts. www.mit.edu/~jtidwell/language/users_annotations.html

Name and Author: Bookmarks - J. Tidwell [22]
Problem: The artifact is large or complex. How can the artifact support the user's need to navigate through it in ways not directly supported by the artifact's structure?
Solution: Let the user make a record of their points of interest, so that they can easily go back to them later. The user should be able to label them however they want, since users are in a better position to choose labels that are memorable to them (see also User's Annotations). Support at least an ordered linear organization, so that a user can rank them according to whatever criteria they choose; if possible, support a grouping structure of some kind. Save the bookmarks for later use. www.mit.edu/~jtidwell/language/bookmarks.html

Name and Author: Movable Panels - J. Tidwell [22]
Problem: The page has several coherent interface "pieces" that don't really need to be laid out in one single configuration; their meanings are self-evident to users, regardless of their location on the page.
Solution: Let the user move the UI pieces around the page at will. Save the layout for the next time the user resumes using the software, especially if it's an important part of his daily life. http://designinginterfaces.com/Movable_Panels

Name and Author: Placeholder - Laakso [11]
Problem: User needs to return to a target directly
Solution: The user chooses a single target from a set of data and saves a shortcut (bookmark) to it. www.cs.helsinki.fi/u/salaakso/patterns/Placeholder.html

Name and Author: Groups and Items - Laakso [11]
Problem: There are two sets of items with many-to-many connections. User typically does not benefit from a generic design showing all the groups where a single item belongs, and all the items that belong to a selected group in the same view.
Solution: Good solutions to Groups and Items problems depend on the user’s goals. Typically, in addition to viewing the groups and items, the user must be able to edit both of them. www.cs.helsinki.fi/u/salaakso/patterns/Groups-and-Items.html

Name and Author: Master and Instances - Laakso [11]
Problem: User has created several copies of an object. There are two kinds of changes he faces in the future: changes that apply only to this specific object, and changes that apply to all of the objects.
Solution: Typically, changes that should apply to all of the created objects are made to a master object, and the instance objects only reflect the changes. www.cs.helsinki.fi/u/salaakso/patterns/Master-and-Instances.html

Name and Author: Pile of Items - Laakso [11]
Problem: The user creates objects by picking them from an infinite stack of objects, i.e. the pile that is like a factory producing objects.
Solution: To create an object, the user either drags it from the pile, or he selects the pile and clicks somewhere, where the object can be created. The pile where the objects come from does not need to be static; the user may edit the properties of the pile to create different kind of objects. www.cs.helsinki.fi/u/salaakso/patterns/Pile-of-Items.html

Name and Author: Drag and Drop Modules - Yahoo! [23]
Problem: The user needs to re-arrange the layout on a web page directly with the mouse.
Solution: Give to users drag and drop modules. This also avoids forcing the user to go to another page in order to re-arrange the layout. http://developer.yahoo.com/ypatterns/pattern.php?pattern=dragdropmodules
3 A Semiotic-Informed Classification for Interaction Patterns for Tailorable Applications
Tailoring involves many levels of complexity, and one basic question remains: how to communicate to users the possibility of tailoring? Mackay [14] has shown that many users do not know what can be customized or how to do it. Some of the reasons pointed out are the absence of documentation and users’ lack of expectation that the application can be customized. Questions regarding change management should also be considered, such as how to notify and follow changes, and how to offer support for changes in documentation and in use. We argue that designing applications that allow some level of tailoring should consider aspects such as: (a) the system architecture and implementation questions, to make change possible; (b) documentation and the expression of the possibility of tailoring, so users know what to change and how; and (c) the social impact of changes. User interface issues as well as organizational aspects are fundamental to support tailoring. A motivation for considering Semiotics in the design of the user interface of tailorable applications rests in the fact that interface elements do not exist as “physical” objects, but as signs. The “brush” in drawing software, for example, “stands for” the real brush and is represented by a collection of pixels on the screen. Using Semiotics, human-computer interaction can be understood in terms of complex processes. Such processes, analyzed only from the perspective of engineering, have been interpreted as purely syntactic phenomena. The analysis using Semiotics recovers the primary function of computer systems as vehicles of signs and supplies an adequate vocabulary for relating computer systems to other types of sign systems [15, 16].
Organizational Semiotics (OS) is a discipline that has roots in Semiotics applied to organizational processes. OS studies the nature, characteristics, function and effect of information and communication within organizational contexts. Organization is considered a social system in which people behave in an organized manner by conforming to a certain system of norms. These norms are regularities of perception, behavior, belief and value that are exhibited as customs, habits, patterns of behavior and other cultural artifacts [12, 21].
An organization can be seen as an information system where agents employ signs to perform purposeful actions. Some of the organization's functions are highly regular, so their rules can be clearly formalized. Within the formalized part of the job, some of it may be highly repetitive and can be automated by computer-based systems. Therefore, any technical system is just part of the formal part of the organization which is, in turn, part of the total organization [12]. The success of a technical system presupposes a formal system, just as a formal system relies on an informal system [21, 12]. OS provides us with some artifacts to study organizations. One of the artifacts that allows a refined classification of information is the Semiotic Ladder [20]. The rationale for the SL is to see information as signs and to define the different aspects or levels of these signs based on the different operations that can be executed upon them. The semiotic ladder consists of the views on signs from the perspective of physics, empirics, syntactics, semantics, pragmatics, and the social world. The addition of a view on information from the social world stresses that information use is always part of human behavior in a social setting, where norms or social conventions govern people’s behavior. The semiotic ladder shows that there are six views on information that together form a complex conceptual structure [12]. The SL's six layers can be used to define information in very different contexts, including computer science. Figure 1 shows the SL applied to Interactive Systems as an adaptation of the original SL from Stamper [20], based on Liu [12].
Fig. 1. Semiotic Ladder applied to Interactive Systems
In the Social World layer, we classify information about the effects of using signs. Here, we consider the processes that invoke, violate and alter social norms. In interactive systems design, we can cite social inclusion as an example of information to be considered in this layer. In the Pragmatics layer, the concern is the use of the interactive system. Interactive systems should help users to do their tasks. Users’ intentions and the communication between user and system are considered here. In the Semantics layer, “meaning” is the main concept. What users understand from the interface and the expressiveness of icons are considered here. In the Syntactics layer, the software structure is the focus. Here we consider information about the group of actions that users will perform in a certain order in the interface. In the Empirics layer, we have information that makes the structure possible using the physical resources, so we have the communication protocols, programming languages and other kinds of communication systems. Finally, in the Physical World layer, we classify information about the hardware resources that are necessary to use an interactive system. As patterns store specialized knowledge, we classified the patterns related to tailoring (described in Table 1) using the SL applied to Interactive Systems. Figure 2 shows the classification.
Fig. 2. Patterns related to tailoring classified in the Semiotic Ladder
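The six layers described above can be held in a small data structure, which makes it easy to tally how many patterns fall on each layer. The following is only a minimal sketch: the names `SemioticLayer` and `layer_coverage` are illustrative, and the sample assignments are hypothetical placeholders, not the classification shown in Fig. 2.

```python
from collections import Counter
from enum import IntEnum


class SemioticLayer(IntEnum):
    """The six layers of the Semiotic Ladder, ordered from bottom to top."""
    PHYSICAL_WORLD = 1  # hardware resources needed to use the system
    EMPIRICS = 2        # communication protocols, programming languages
    SYNTACTICS = 3      # software structure, ordered groups of user actions
    SEMANTICS = 4       # meaning: what users understand from the interface
    PRAGMATICS = 5      # users' intentions, user-system communication
    SOCIAL_WORLD = 6    # effects of sign use on social norms


def layer_coverage(classification):
    """Count how many patterns were assigned to each layer.

    `classification` maps a pattern name to the set of layers it touches.
    Layers with no pattern are reported with a count of zero.
    """
    counts = Counter(
        layer for layers in classification.values() for layer in layers
    )
    return {layer: counts.get(layer, 0) for layer in SemioticLayer}


# Hypothetical placeholder assignments, NOT the classification from Fig. 2.
example = {
    "pattern_a": {SemioticLayer.SYNTACTICS},
    "pattern_b": {SemioticLayer.SYNTACTICS, SemioticLayer.SEMANTICS},
}
coverage = layer_coverage(example)
uncovered = [layer.name for layer, n in coverage.items() if n == 0]
```

Listing the layers with a zero count is one way to make gaps in a pattern collection visible at a glance.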
As expected, Interaction Design patterns do not cover the Physical and Empirics layers. Most of the information stored in the patterns is related to the Syntactics and Semantics layers, being associated mostly with the organization and meaning of interface elements. Not much knowledge has been formalized considering Pragmatics and the Social World, which represent concerns about the use and social implications of tailorable systems. In the next section, we present the TUIFs that were formalized considering the knowledge stored in the studied patterns.
4 Tailorable User Interface Features
We have used Interaction Design patterns related to tailoring as a source of information. Although many of them were not formalized specifically for tailorable applications, they store specialized knowledge about human-computer interaction that can be applied to tailorable systems. By analysing the information provided by the patterns from a semiotic-informed perspective, we arrived at the TUIFs. TUIFs can be understood as characteristics that interfaces that allow tailoring should have. Table 2 shows the TUIFs we propose.
Table 2. Tailorable User Interface Features

TUIF: Show the tailoring options
Patterns related: Customized Window, User Preferences
Goal: Provide a way for users to perceive how to customize what is displayed in the window

TUIF: Change the positions of interface elements
Patterns related: Personal Object Space, Movable Panels, Pile of Items, Groups and Items, Drag and Drop Modules
Goal: Provide a way for users to re-arrange the layout of the interface directly

TUIF: Define own interface elements
Patterns related: Personalized My Site, Master and Instances
Goal: Support ways for users to create or change layout and graphical aspects of interface elements and save these changes.

TUIF: Record actions and points of interest
Patterns related: Scripted Action Sequence, Bookmarks, Placeholder
Goal: Provide a way so users can "record" a sequence of actions or points of interest of their choice, and a way to easily "play them back" at any time

TUIF: Insert comments
Patterns related: User’s Annotations
Goal: Support ways for users to add their own comments and other annotations to the artifact
Table 3. TUIF Elicitation Patterns

Name: I can customize
Problem: Which information needs to be elicited in order to provide users with the “show the tailoring options” feature?
Context: One of the main reasons users do not tailor is that they don’t know they can do it.
Solution - Tailoring Features Elicitation Guide (issues to be discussed with stakeholders):
- Will the application offer different actions to be customized?
- If so, what are the actions that will be customized?
- Is it possible to offer the tailoring function really close to where the action will be performed?
- Is it possible to show the changes in a WYSIWYG format?
- How are users going to save changes?
- How will the application show saved changes the next time the user comes back?

Name: Rearrange layout
Problem: Which information needs to be elicited in order to provide users with the “change the position of interface elements” feature?
Context: Users may want to change the positions of interface elements, rearranging the layout.
Solution - Tailoring Features Elicitation Guide (issues to be discussed with stakeholders):
- Which interface elements could be moved?
- Which movement options will be offered (stacking, grouping, aligning, nestling, …)?
- Is it possible to move the artifact directly?
- How are users going to save changes?
- How will the application show saved changes the next time the user comes back?

Name: Create and Save My Interface Elements
Problem: Which information needs to be elicited in order to provide users with the “define own interface elements” feature?
Context: Users may want to create or change the layout and graphical presentation of interface elements, according to their own understanding.
Solution - Tailoring Features Elicitation Guide (issues to be discussed with stakeholders):
- Will the application allow users to create new elements with new functions?
- If so, could new elements be created in any part of the interface?
- How will users create the new elements (a completely new one or a modified copy)?
- If copies are allowed, how can users be offered the possibility to make changes in only one copy or in all copies?
- How will these new elements be recorded?
- How are users going to name the new elements?
- Will the application allow users to organize and reuse their new elements?
- If so, how can the new elements be organized?

Name: Record and Repeat
Problem: Which information needs to be elicited in order to provide users with the “recording actions and points of interest” feature?
Context: A navigable software system that is possibly large and complex should allow the user to move freely through it in ways not directly supported by the initial design.
Solution - Tailoring Features Elicitation Guide (issues to be discussed with stakeholders):
- Will the application allow users to record different actions they perform (depending on the kind of application, an action means a specific functionality performed by the user, a place visited, etc.)?
- If so, how many actions can be recorded?
- How will these actions be recorded?
- How are users going to name the record?
- Will the application allow users to organize their records?
- If so, how can the records be organized?
- How is playback going to be presented? Is it possible to offer a single button?

Name: My hard-won understanding
Problem: Which information needs to be elicited in order to provide users with the “insert comments” feature?
Context: Interface elements sometimes are not as meaningful as they should be. Users may want to add comments about interface elements that can be read by the same user or by others.
Solution - Tailoring Features Elicitation Guide (issues to be discussed with stakeholders):
- Will the application allow users to make notes on all interface elements?
- Will the application allow users to make different types of notes (in different colors, different formats)?
- How will these notes be recorded?
- How are users going to name the notes?
- Will the application allow users to organize their notes?
- If so, how can the notes be organized?
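Because every elicitation pattern in Table 3 has the same shape (a name, a problem built from the feature name, a context and a list of questions), the guide lends itself to a simple record type. The sketch below is one possible encoding, not a format proposed by the authors; the class name `ElicitationPattern`, its fields and the checklist rendering are illustrative assumptions, and the questions are lightly shortened from the "Rearrange layout" row.

```python
from dataclasses import dataclass, field


@dataclass
class ElicitationPattern:
    """One row of an elicitation guide: a named pattern plus its questions."""
    name: str
    tuif: str       # the Tailorable User Interface Feature being elicited
    context: str
    questions: list = field(default_factory=list)

    @property
    def problem(self):
        # Every problem statement in the table follows this template.
        return ("Which information needs to be elicited in order to "
                f'provide users with the "{self.tuif}" feature?')

    def checklist(self):
        """Render the questions as an unchecked plain-text checklist."""
        return [f"[ ] {question}" for question in self.questions]


rearrange = ElicitationPattern(
    name="Rearrange layout",
    tuif="change the position of interface elements",
    context="Users may want to change interface elements positions, "
            "rearranging layout.",
    questions=[
        "Which interface elements could be moved?",
        "Which movement options will be offered?",
        "Is it possible to move the artifact directly?",
        "How are users going to save changes?",
        "How will saved changes be shown when the user comes back?",
    ],
)

for line in rearrange.checklist():
    print(line)
```

Deriving the problem statement from the feature name keeps the rows consistent, and the checklist form is a convenient handout for a stakeholder meeting.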
5 Conclusions and Future Work
Designing for flexibility involves considering many people with different skills, interests and levels of commitment, including designers, developers and users. Concerns regarding the design of flexible systems must be present from the early stages of the software development lifecycle. Interaction Patterns are becoming a valuable approach to bridge the gap between analysis and design, as they store specialized knowledge about varied interaction aspects. In this work we studied 10 Interaction Design Pattern collections, selected patterns that are related to tailoring and classified them according to an artifact from Organizational Semiotics, the Semiotic Ladder. Based on the specialized knowledge stored in these patterns, we proposed Tailorable User Interface Features. We also structured and formatted knowledge about the features in Elicitation Patterns, so that they can be more usable and useful to designers as well as to developers of these applications. Further work involves studying accessibility patterns and extending the TUIF set with this context in mind. We point out that tailoring can be seen as an instrument to deal with accessibility issues. Offering the end user the possibility of changing and personalizing the system is a way to provide access to many different profiles, contributing to inclusive system usage.
Acknowledgments. The authors thank FAPESP (proc. 06/54747-6) and CNPq (proc. 476381/2004-5) for financial support.
References
1. Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., Fiksdahl-King, I., Angel, S.: A Pattern Language: Towns, Buildings, Construction. Oxford University Press, NY (1977)
2. Arvola, M.: Interaction Design Patterns for Computers in Sociable Use. Journal of Computer Application in Technology. 25(2-3), 128–139 (2006)
3. Beck, K.: User Interface (Last visit: February 2007) http://c2.com/ppr/ui.html
4. Borchers, J.: A Pattern Approach to Interaction Design (Last visit: February 2007) http://www.hcipatterns.org/patterns/borchers/patternindex.html
5. Coram, T., Lee, J.: Experiences – A Pattern Language for User Interface Design (Last visit: February 2007) http://www.maplefish.com/todd/papers/Experiences.html
6. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Boston, USA (1995)
7. Henderson, A., Kyng, M.: There’s no place like home: Continuing Design in Use. In: Greenbaum, J., Kyng, M. (eds.) Design at work: Cooperative Design of Computer Systems, pp. 219–240. Lawrence Erlbaum Ass, Hillsdale, NJ (1991)
8. Jones, M.C., Rathi, D., Twidale, M.B.: Wikifying your Interface: Facilitating Community-Based Interface Translation. DIS 2006, June 26–28, 2006, University Park, Pennsylvania, USA, ACM (2006) 1-59593-341-7/06/0006
9. Juristo, N., Moreno, A., Sanchez-Segura, M.: Using Elicitation Patterns to Gather Usability Functionalities. Universidad Politécnica de Madrid, Facultad de Informática. Internal Document (2006)
10. Kahler, H., Morch, A., Stiemerling, O., Wulf, V.: Computer Supported Cooperative Work: The Journal of Collaborative Computing. CSCW 9(1), 1–4 (2000)
11. Laakso, S.: User Interface Design Patterns (Last visit: February 2007) http://www.cs.helsinki.fi/u/salaakso/patterns/
12. Liu, K.: Semiotics in Information Systems Engineering, 1st edn. Cambridge University Press, Cambridge, UK (2000)
13. Ma, J., Kienle, M., Kaminski, P.: Customizing Lotus Notes to Build Software Engineering Tools. In: Proceedings of the conference of the Center for Advanced Studies on Collaborative, pp. 211–222, Toronto (2003)
14. Mackay, W.: Triggers and Barriers to Customizing Software. In: Proceedings of the SIGCHI conference on Human factors in computing systems: Reaching through technology. New Orleans, pp. 153–160 (1991)
15. Nadin, M.: Interface Design: A semiotic paradigm. Semiotica 69(3/4), pp. 269–302 (1988)
16. Oliveira, O., Baranauskas, M.: A Semiótica e o Design de Software. Technical Report – Computing Institute - Unicamp 98 – 09 (1998)
17. Patterns of Interaction: a Pattern Language for CSCW (Last visit: February 2007) http://www.comp.lancs.ac.uk/computing/research/cseg/projects/pointer/patterns.html
18. Rivera, D.: The Effect of Content Customization on Learnability and Perceived Workload. CHI ’05 extended abstracts on Human factors in computing systems, pp. 1749–1752, Portland, USA (2005)
19. Segerståhl, K., Jokela, T.: Usability of Interaction Patterns. CHI 2006, April 22–27, 2006, Montreal, Québec, Canada. ACM (2006) 1-59593-298-4/06/0004
20. Stamper, R.K., Althaus, K., Backhouse, J.: MEASUR: Method for Eliciting, Analysing and Specifying User Requirements. In: Olle, T.W., Verrijn-Stuart, A.A., Bhabuta, L. (eds.) Computerized assistance during the information systems life cycle, Elsevier Science Publishers, North-Holland (1988)
21. Stamper, R.: Language and computer in organized behavior. In: Riet, R.P., Meersman, R.A. (eds.) Linguistic Instruments in Knowledge Engineering, pp. 143–163. Elsevier, Amsterdam (1992)
22. Tidwell, J.: Common Ground: A Pattern Language for Human-Computer Interface Design (Last visit: February 2007) http://www.mit.edu/~jtidwell/interaction_patterns.html
23. Yahoo! Design Pattern Library (Last visit: February 2007) http://developer.yahoo.com/ypatterns/
24. Web Patterns - A UC Berkeley Resource for Building User Interfaces (Last visit: February 2007) http://harbinger.sims.berkeley.edu/ui_designpatterns/webpatterns2/webpatterns/home.php
25. van Welie, M.: Web Design Patterns (Last visit: February 2007) http://www.welie.com/patterns/
Model-Based Usability Evaluation – Evaluation of Tool Support
Gregor Buchholz1, Jürgen Engel2, Christian Märtin2, and Stefan Propp1
1 University of Rostock, Institute of Computer Science, Albert Einstein Str. 21, 18059 Rostock, Germany {grbuc, [email protected]}
2 University of Applied Sciences, Faculty of Computer Science, Baumgartnerstrasse 16, 86161 Augsburg, Germany {jürgen.engel, [email protected]}
Abstract. Usability evaluation can be accomplished in different ways, depending on individual information interests and specific constraints. In some cases the test user and the usability evaluator are located at different places, for instance in mobile environments or in the case of Internet websites, where the user cannot be observed as in a laboratory situation. The use of multimodal interfaces introduces additional constraints. To overcome these problems, techniques of remote usability testing are applied. The data recorded during the test is structured and afterwards analyzed. A user-centric approach structures the data based on the tasks that the user intends to perform. A task model describes the tasks, composed of subtasks, and the temporal relationships between them. This paper introduces and evaluates two tools, AWUSA and ReModEl, which use task modeling for remote usability evaluation.
Keywords: Remote Usability Evaluation, Task Models.
1 Introduction
This paper introduces two approaches for model-based remote usability evaluation. First AWUSA is evaluated as a framework for model-based website analysis, followed by an introduction of ReModEl, which provides usability evaluation within an integrated model-driven approach for multimodal user interfaces for applications. Subsequently both approaches are compared with each other and finally compared with other approaches, which are ErgoLight, RemUSINE, WebRemUSINE and WET.
analyses can be divided into static analysis procedures at definition time and dynamic analysis, in which the usage data gathered at runtime are evaluated. The AWUSA approach is based on real usage data and combines log file analysis, automated usability evaluation and web mining in one single system. All input, output and processing data are represented in XML format, while graphical output is reported using Scalable Vector Graphics (SVG) [1]. Log file analysis comprises the strategies used to analyze the behavioral patterns (usage patterns) of system users [2]. It can be used to infer the cognitive processes of persons interacting with the software system of interest. The most widespread technique is computer-registered operations, where data about the current system usage is captured automatically. Data evaluation typically includes transition analysis, frequency analysis and sequence analysis [3]. Since traditional usability evaluation methods, such as user testing or usability inspection, are time-consuming and therefore relatively expensive, automation is a worthwhile field of investigation. According to [4], any usability evaluation method can be associated with at least one of the following categories: non-automatic technique, automatic capture, automatic analysis and automatic critique. AWUSA fits into the latter three of these. The goal of web mining is to extract detailed information about website structures, contents and usage. Thus web mining can basically be divided into web structure mining, web content mining and web usage mining. AWUSA emphasizes web usage aspects and makes it possible to identify and distinguish individual users and their particular user sessions, down to the single navigation paths performed during these sessions.
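To make the log file analysis steps concrete, the following sketch splits raw log records into per-user sessions and then runs a frequency and a transition analysis over the resulting navigation paths. It is an illustration only, not AWUSA's implementation: the record layout `(user, timestamp, resource)`, the function names and the 30-minute inactivity timeout used to separate sessions are all assumptions.

```python
from collections import Counter, defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; assumed inactivity limit, not from the paper


def split_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, resource) records into per-user navigation paths."""
    by_user = defaultdict(list)
    for user, timestamp, resource in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((timestamp, resource))

    sessions = []
    for user, hits in by_user.items():
        path, last_seen = [], None
        for timestamp, resource in hits:
            # A long pause closes the current session and starts a new one.
            if last_seen is not None and timestamp - last_seen > timeout:
                sessions.append((user, path))
                path = []
            path.append(resource)
            last_seen = timestamp
        sessions.append((user, path))
    return sessions


def frequency_analysis(sessions):
    """Count how often each resource was requested."""
    return Counter(resource for _, path in sessions for resource in path)


def transition_analysis(sessions):
    """Count how often one resource was followed directly by another."""
    return Counter(pair for _, path in sessions for pair in zip(path, path[1:]))


records = [
    ("u1", 0, "/home"), ("u1", 40, "/search"), ("u1", 90, "/item"),
    ("u1", 5000, "/home"), ("u2", 10, "/home"), ("u2", 20, "/item"),
]
sessions = split_sessions(records)
```

Here the fourth request of user `u1` arrives after a pause longer than the timeout, so it opens a second session for that user.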
The AWUSA framework is designed to achieve the following goals by using automatic website analysis [2]:
− Find and visualize users’ paths on a website
− Find and visualize deviations between intended tasks and actual usage
− Find and visualize locations and events where tasks are canceled prematurely
− Find and visualize areas and situations with poor usability
− Provide plausible reasons for detected usability problems
− Find new goals and tasks of users while touring the website
− Classify different user groups and map them to the various goals and tasks
AWUSA is capable of detecting usability problems of various categories [1]:
Category       Example
Interaction    Form or dialogue box problems
Information    Orientational problems, no navigation within the content
Architecture   Missing cross-links
Design         Inconsistent look & feel of the website
Technical      Syntax errors, broken links, overloaded pages
AWUSA implements two different strategies for usability problem detection. On the one hand, the system directly analyzes the website and its individual webpages in order to identify technical problems. On the other hand, the users’ behavior is analyzed and compared with predefined usage patterns which are known for their inherent usability problems. Thus AWUSA combines the collection and processing of both static and dynamic data. Data mining techniques are used to automatically compute relations between the static structure and content of a website and its related usage. For usability problem detection AWUSA provides the following six core components: Usage Pattern Miner, Usage Pattern Analyzer, Task Usage Miner, Task Usage Analyzer, User Analyzer and Result Analyzer.
The Usage Pattern Miner extracts significant patterns from the usage data and computes a similarity ratio for each pair of paths, corresponding to the sessions’ resource sequences. Sessions with a ratio greater than a predefined threshold value are aggregated into usage patterns. Subsequently the Usage Pattern Analyzer compares the extracted patterns with the underlying website structure. Statistical methods are applied to calculate new data, and data mining techniques are applied to generate association rules.
The Task Usage Miner extracts information about how well the task paths that were defined by the information architects at design time are met by the real usage patterns at runtime. This is done either by selecting all usage patterns whose similarity ratio to the defined task paths is greater than a given value, or by finding actual usage patterns which were assigned to task paths that contain the goal resource of a task. As a next step the Task Usage Analyzer distinguishes between successful and unsuccessful usage patterns. A pattern is defined to be successful with respect to a task path if it contains the task’s goal resource.
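The grouping of sessions by similarity ratio and the success criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how AWUSA computes the similarity ratio, so the sketch substitutes the matching-block ratio of Python's `difflib.SequenceMatcher`, and it uses a simple greedy grouping. The paths, the threshold and all function names are hypothetical.

```python
from difflib import SequenceMatcher

def similarity_ratio(path_a, path_b):
    """Similarity of two resource sequences in [0, 1].

    Assumption: difflib's ratio stands in for AWUSA's own measure.
    """
    return SequenceMatcher(None, path_a, path_b, autojunk=False).ratio()

def group_sessions(sessions, threshold):
    """Greedy grouping: a session joins the first group whose first
    member is more similar to it than the threshold."""
    groups = []
    for path in sessions:
        for group in groups:
            if similarity_ratio(group[0], path) > threshold:
                group.append(path)
                break
        else:
            # No sufficiently similar group exists yet.
            groups.append([path])
    return groups

def is_successful(pattern, goal_resource):
    """A pattern is successful if it contains the task's goal resource."""
    return goal_resource in pattern

# Hypothetical session paths.
a = ["/home", "/products", "/cart", "/checkout"]
b = ["/home", "/products", "/cart"]
c = ["/home", "/about", "/contact"]

groups = group_sessions([a, b, c], threshold=0.7)
```

With these inputs, `a` and `b` fall into one group because they share three of their seven elements in order, while `c` forms its own group. Only `a` is successful with respect to the goal resource `/checkout`.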
Subsequently the individual usages of task paths are analyzed with regard to deviations and breaks in task-path navigation. Statistical methods are applied in order to generate metrics information about the usage of task paths.
The User Analyzer is used to identify user groups. This is achieved by clustering user and navigation data. Some general characteristics (such as connection speed, type of web browser used, etc.) and navigational behavior are compared with characteristics of previously defined target groups.
All of AWUSA's mining and analyzing components described so far store their results in a common file in XML format. This result file is interpreted by the Result Analyzer using well-defined interpretation rules and patterns. Data mining methods are applied to detect possible relations between problem resources (e.g. break-off from task paths), their usage patterns and their static attributes. Further, the mining engine generates possible reasons for the occurrence of these negative events. Finally, statistical metrics (e.g. count of problem resources) are calculated from the result file.
3 Introduction to ReModEl
ReModEl (“REmote MODel-based EvaLuation”) [6] provides functionalities for remote model-based usability evaluation. It has been developed at the University of Rostock since 2005.
3.1 ReModEl Architecture
ReModEl is developed as a client-server system. The server contains different task models, for instance to describe a scenario of writing an email. These models are delivered as a corresponding graphical user interface to the client side [5]. The server contains a task model, which describes the task intended by the user according to the concept of ConcurTaskTrees [7]. To provide a concrete user interface, a navigation structure has to be defined. Therefore a UI designer creates a dialog graph on the basis of the task model [5]. This concrete dialog graph contains information about the user interface for the specific device, such as a PDA, a mobile phone or a notebook. The dialog graph is stored on the server. The data on the server side is interpreted and a user interface is delivered to the client. The client visualizes the multi-modal user interface with the available resources. User interactions are captured and sent to the server, where the events are structured based on the task model.
3.2 Usability Testing with ReModEl
To describe the basic usage of ReModEl, three main parts can be distinguished: a server, a client for the tester and another client interface for the usability expert [6]. After starting the server, a number of task models can be loaded in order to interpret them. The additional information needed to represent the UI is defined by a UI designer, who transforms the task model into a dialog graph and places it on the server. The tester uses client 1 to connect to the server. After selecting the desired task-based application, the corresponding user interface is delivered to the client device and the user can work with it as in a real-life situation.
The server uses a framework for task modeling to receive state change events from the client, which delivers the occurring events to the server. The captured events are collected on the server side in order to combine the task model and the current task states. Since an authentic user interface is provided, influences such as those arising in laboratory situations can be avoided.
Fig. 2. Usability Testing with ReModEl
Client 2, the usability client, uses a specialized user interface that provides an overview of the user interactions as the events occur. Hence the usability expert may intervene directly if an unexpected situation emerges. For this purpose a chat client is integrated. The functionality is based on a Java implementation, which observes the states of the models on the server with parameterized cascaded observers.
Fig. 3. GUI for Usability Expert
The screen is divided into three parts (Fig. 3). The left side visualizes the current state of the task model corresponding to the user application, where the usability expert can compare the executed tasks with the expected ones. A difference might indicate a usability problem. A table in the center lists events that are logged during user interaction. The task states are captured with a timestamp. Columns contain a flag for each task indicating whether it is enabled, running, done, finished or skipped. A comparison with the test case reveals strengths and weaknesses of the user interface. The area on the right side of the screen allows filtering and highlighting of tasks. A more sophisticated use case may offer different user interfaces representing the same task model. Establishing a test scenario with these different UIs can provide a direct comparison of how a task is accomplished.
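The event table described above, together with the observation of task states on the server, can be illustrated with a small observer sketch. ReModEl itself is implemented in Java with parameterized cascaded observers; the following Python code is not that implementation but a simplified stand-in. The class names, the task names and the comparison helper are all hypothetical; only the five task states are taken from the text.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List

class TaskState(Enum):
    ENABLED = "enabled"
    RUNNING = "running"
    DONE = "done"
    FINISHED = "finished"
    SKIPPED = "skipped"

@dataclass
class TaskEvent:
    task: str
    state: TaskState
    timestamp: float

@dataclass
class TaskModelServer:
    """Hypothetical stand-in for the server-side task model."""
    observers: List[Callable[[TaskEvent], None]] = field(default_factory=list)

    def subscribe(self, observer):
        self.observers.append(observer)

    def report(self, task, state):
        # Stamp the state change and notify every registered observer.
        event = TaskEvent(task, state, time.time())
        for observer in self.observers:
            observer(event)

def executed_tasks(log):
    """Tasks that reached the state DONE, in the order they were logged."""
    return [event.task for event in log if event.state is TaskState.DONE]

server = TaskModelServer()
event_log = []
server.subscribe(event_log.append)  # the expert's event table

server.report("Write email", TaskState.ENABLED)
server.report("Enter recipient", TaskState.RUNNING)
server.report("Enter recipient", TaskState.DONE)
server.report("Enter subject", TaskState.SKIPPED)
server.report("Send email", TaskState.DONE)

# Compare the executed tasks with the tasks expected by the test case.
expected = ["Enter recipient", "Enter subject", "Send email"]
missing = [task for task in expected if task not in executed_tasks(event_log)]
```

In this run `missing` contains the skipped task "Enter subject", which is the kind of difference between expected and executed tasks that might indicate a usability problem.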
4 Comparison of Both Approaches
This section compares the presented approaches, beginning with their common purpose and domain of usage and continuing with the input of each application, its basic working principle and finally its output.
The presented approaches AWUSA and ReModEl both focus on remote usability testing. AWUSA evaluates websites, whereas ReModEl evaluates multi-modal user interfaces of applications on different devices. A usability expert may observe the tester at a different place and, moreover, at a different time. The events of the user interaction with the system are captured and structured according to a task-based approach, although this approach is quite different in each tool.
AWUSA evaluates existing websites. Therefore the task structure has to be discovered by an information architect, who identifies goals and corresponding tasks of the website. The steps of user navigation through the website are called task paths. ReModEl employs task models, which are discussed in [7]. A partly automated transformation [8] leads straightforwardly to a device-specific dialog graph and to the final application, which can be evaluated at any stage of the development process. Both approaches comprise a task-based model representing the opportunities for user interaction and a corresponding UI: a website in the case of AWUSA and a graphical application in the case of ReModEl.
The inputs for AWUSA are websites, the discovered task structures and the captured logging information of the test user, which is typically provided by a web server generating log files. ReModEl deals with a task model and the generated application. The provided log files are already structured according to the task-based approach.
ReModEl uses a task modeling framework, which provides access to task model related information during the application's execution. AWUSA analyzes externally captured events of the user interaction. A pre-processing step filters the relevant information and afterwards clusters the events according to the given task structure, which helps the expert to better exploit the collected data. A subsequent web mining step analyzes the observed task path and compares it with the path expected by an expert. Additionally, an analysis of the static website structure delivers further indicators of usability strengths and weaknesses.
The output of AWUSA is furthermore accompanied by a visualization of the stated data as diagrams in SVG format. ReModEl visualizes the application's underlying task model and moreover visualizes all user actions as an animation of the task model. Each interaction changing the state of a task is instantly visible to the usability expert, who is consequently able to provide direct support for the tester. For example, users losing too much time on a short task can be advised to skip it, so that high test plan coverage is nevertheless ensured. Additionally, a log file records the navigation through the task model. Both applications provide output files in XML format to ease further data processing.
In a nutshell, the evaluation of user interaction is lifted to a higher abstraction level: from a view of single events to a task-based perspective. Usability experts can move the focus from examining platform details to the analysis of task execution. A comparison of the task execution expected by an expert with the one observed in the user interaction is suggested. The key difference between the two approaches is the direction of development. AWUSA applies reverse engineering of an existing website to examine the task structure, and ReModEl applies forward engineering to transform a task model into a user interface. Consequently AWUSA provides capabilities to analyze existing sources, whereas ReModEl is integrated into a model-driven approach that covers the whole development process from scratch. The following section gives an overview of other approaches.
5 Comparison with Other Approaches
The ErgoLight Usability Validation Suite [14] is commercial software for usability evaluation developed by ErgoLight Ltd. in Israel. The suite offers both local and remote evaluation. A model-based approach is taken, deploying task models. The method distinguishes three phases. The first phase is the definition of the user interface. A user interface designer specifies user tasks and breaks them down to build a task hierarchy, which is stored in a built-in database. The task model then has to be associated with the tested application: the designer is required to associate the Windows controls with the steps of the model. Opening the tested application in parallel allows drag-and-drop to be used to complete the association. Subsequently ErgoLight applies a task verification to discover defects or inconsistencies at an early stage of development. The designer furthermore has the opportunity to specify problem indicators for the interaction of the test user. A typical indicator of difficulties is the usage of the undo or cancel button. The second phase is the data collection. The tested application is executed on the user side and events are captured. Difficulties can be reported during the test. To report a problem, the user is prompted to state the intended task. Optionally, a free comment by the user may reveal terminology problems or missing features that could be implemented in future versions. The third phase is the evaluation, which classifies the problems and provides statistics on the difficulties that occurred.
RemUSINE (“Remote USability INterface Evaluator”) [12] provides support for remote usability evaluation and was developed by Paternò et al. [13] at the University of Pisa, Italy. It was intended to spare the designer a high level of involvement in the usability evaluation, which leads to increasing costs.
A further goal was to improve the support for model-based usability evaluation on the basis of task modelling. Consequently the input for RemUSINE covers three sources of data: a task model specified according to the ConcurTaskTree notation [7], a log file comprising the events of the user interaction and an association table describing the mapping between log and tasks. The method distinguishes three phases. Firstly, a preparation phase comprises the task modelling as well as the development of the association between the physical events of the user interaction, which are recorded in the log files, and the basic tasks of the task model. Secondly, the automatic analysis phase follows. If temporal relationships between tasks are specified, they are processed as preconditions for each step. The following analysis of the results examines the errors that occurred, discovered task patterns, the duration of task execution and further information. Finally, within the evaluation phase, the designer analyzes the results to improve the user interface.
WebRemUSINE is an extension of the RemUSINE approach for analyzing websites. It was also designed, and has been continuously developed since 2002, at the HCI group of ISTI-C.N.R. in Pisa, Italy [10]. It combines two types of usability evaluation techniques: empirical testing and model-based evaluation. WebRemUSINE enables usability experts to detect and analyze possible inconsistencies between the task model of a website of interest and the actual interactions performed by real users. The input for WebRemUSINE covers, like that of RemUSINE, three sources of data: a task model, a log file comprising the events of the user interaction and an association table describing the mapping between log and tasks. The WebRemUSINE engine computes and processes both the task model and the log file [9]. Working with WebRemUSINE consists of three stages. Within the preparation phase the task model of the website is created by the usability engineer. Further, the logged data is collected and associations between logged user actions and basic tasks are defined. During the subsequent automatic analysis phase the tool processes and examines the logged data and task models in order to calculate and visualize a variety of statistical values related to the tasks performed by the users, such as number of errors made, visit time or page loading time. Finally, within the evaluation phase, the evaluator analyzes the provided results, identifies usability problems and derives possible interface improvements.
The Web Event-logging Tool (WET) incorporates capabilities to capture data about users interacting with web pages. The goal is to provide an inexpensive way to collect reliable and complete sets of usage data from many users at a time [11]. It is intended to avoid the weaknesses of diverse existing approaches, such as server logging or the usage of data collection software. While server logging usually cannot provide a comprehensive picture of all user interactions, e.g. due to caching effects, data collection software causes development or license costs and requires effort for installation and maintenance. In contrast to such approaches, WET supports usability experts with functions to capture and collect web browser events on the client side without the need to install a complex software system on the clients.
For instance, web browser events include mouse movements and clicks, navigation key strokes, loading and unloading of web pages, etc. If such events are detected, they are logged together with timestamp information and various related object properties. The data collection process consists of four major steps. First, within a JavaScript file, the evaluator specifies the events to be logged and the logging method to be used. Afterwards the file has to be placed on the related web server. Then calls to the JavaScript file have to be inserted into the head tags of the web pages of interest. Finally, some code for the related log retrieval method must be added.
Table 1 provides a comparison of the evaluated tools:
Table 1. Comparison of evaluated tools
Application | AWUSA | ErgoLight | ReModEl | RemUSINE | WebRemUSINE | WET
1. Classification
Application evaluation | - | x | x | x | - | -
Website evaluation | x | - | - | - | x | x
Applicable in which stages of development | Testing | From design until testing | From design until testing | Testing | Testing | Testing
2. Preparation before evaluation
Task model | Task structure [1] | Task model [14] | CTT [7] (enhanced) | CTT [7] | CTT [7] | -
Task model within development process | Reengineered from website | Association defined to UI events | Forward engineered to UI | Association defined to UI events | Association defined to website events | Not used
3. Execution of evaluation
User observation (at test time / after testing) | x | x | x x | x | x | x
Evaluation of static structures | x | x | - | - | - | x
To conclude, all tools provide remote usability evaluation, and the first five in the table are task-based approaches differing in the applied notation. Beyond testing, the tools ErgoLight and ReModEl focus on a broader usage in several stages of development. Most tools (AWUSA, ErgoLight, RemUSINE, WebRemUSINE) take the UI as the starting point and reengineer the task model manually or in a partly automated way, whereas ReModEl applies forward engineering, starting with a task model which is transformed into a UI in a partly automated way. All tools provide evaluation of dynamic data from the user interaction, and some tools, primarily the website evaluation tools, provide additional static metrics of the UI.
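The association table that RemUSINE and WebRemUSINE use to map logged events to basic tasks, together with the precondition check derived from temporal relationships, can be sketched in Python as follows. This is only a simplified illustration of the idea described earlier in this section, not the tools' actual analysis: the event names, the task names, the flat precondition sets and the function `analyse_log` are all hypothetical, and a real ConcurTaskTrees model has richer temporal operators.

```python
# Hypothetical association table: physical log event -> basic task.
ASSOCIATION = {
    "click:btn_new": "Open editor",
    "keypress:field_to": "Enter recipient",
    "click:btn_send": "Send email",
}

# Hypothetical temporal constraints: task -> tasks that must be done first.
PRECONDITIONS = {
    "Enter recipient": {"Open editor"},
    "Send email": {"Open editor", "Enter recipient"},
}

def analyse_log(log_events):
    """Map log events to tasks and flag tasks attempted too early.

    Returns (performed tasks, precondition errors, unmapped events).
    """
    performed, errors, unmapped = [], [], []
    for event in log_events:
        task = ASSOCIATION.get(event)
        if task is None:
            # The event has no counterpart in the task model.
            unmapped.append(event)
            continue
        missing = PRECONDITIONS.get(task, set()) - set(performed)
        if missing:
            errors.append((task, sorted(missing)))
        else:
            performed.append(task)
    return performed, errors, unmapped

# Hypothetical log of one user session.
log = [
    "click:btn_new",
    "click:btn_send",      # attempted before a recipient was entered
    "mousemove",           # not associated with any basic task
    "keypress:field_to",
    "click:btn_send",
]

performed, errors, unmapped = analyse_log(log)
```

The first attempt to send the email is recorded as an error because the task "Enter recipient" has not yet been performed; the second attempt succeeds. Counting such errors per task is one way to obtain the kind of statistics the analysis phase reports.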
6 Future Challenges
The introduced approaches demonstrate successfully how the deployment of task models can contribute to remote usability evaluation. Future improvements comprise the following briefly stated challenges. The analysis of captured user interactions should cover the examination of specific behavior for certain user groups or application types. The identification of GUI patterns with good usability for certain tasks may provide decision support for the UI designer. Additionally, a reference path through the task model can be defined for comparing the current user behavior with an expected route provided by an expert. Furthermore, a linking of the task model and models of the object-oriented application logic was recently developed and can be integrated to enhance the model-based approach.
References
1. Tiedtke, T., Märtin, C., Gerth, N.: AWUSA – A Tool for Automated Website Analysis. In: PreProceedings of the 9th Int. Workshop DSV-IS 2002, Rostock, Germany, pp. 251–266 (2002)
2. Tiedtke, T., Krach, T., Späth, C., Märtin, C.: Applying Usage Patterns for Web Analysis. In: Proc. of HCI International, July 22-27, 2005, Las Vegas, Nevada, USA, vol. 2 - The Management of Information: E-Business, the Web, and Mobile Computing, Lawrence Erlbaum Associates (2005)
3. Engel, J.: Usability Evaluation Framework. Master Thesis, Augsburg University of Applied Sciences, Dept. of Computer Science (2006)
4. Ivory, M., Hearst, M.: The State of the Art in Automating Usability Evaluation of User Interfaces. ACM Computing Surveys 33(4), 470–516 (2001)
5. Buchholz, G., Forbrig, P., Dittmar, A., Wolff, A., Reichart, D.: Task Models and Remote Usability Testing. CUSEC (Canadian University Software Engineering Conference), January 19-21, 2006, Montreal, Canada (2006)
6. Buchholz, G.: Untersuchungen zum Remote Usability Testing, Diploma Thesis. University of Rostock, Dept. of Computer Science (2005)
7. Paterno, F.: Model-Based Design and Evaluation of interactive applications. Springer, Heidelberg (1999) ISBN 1-85233-155-0
8. Reichart, D., Forbrig, P., Dittmar, A.: Task Models as Basis for Requirements Engineering and Software Execution. TAMODIA 2004, Prague, pp. 51–58 (2004)
9. Paganelli, L., Paternò, F.: Intelligent Analysis of User Interactions with Web Applications. CNUCE – C.N.R., Pisa, Italy (2002)
10. WebRemUSINE, Web REMote USer INterface Evaluator (01-07-2007) http://giove.cnuce.cnr.it/webremusine.html
11. Etgen, M., Cantor, J.: What does getting WET (Web Event-logging Tool) Mean for Web Usability? User Experience Engineering Division, AT&T Labs, Middletown, NJ, USA (1999)
12. Lecerof, A., Paternò, F.: Automatic Support for Usability Evaluation. IEEE Trans. Software Eng. 26(10), 863–888 (1998)
13. RemUSINE, Remote USability INterface Evaluator (01-09-2007) http://giove.cnuce.cnr.it/ fabio/remusine.html
14. ErgoLight Usability Validation Suite (EUVS) (01-09-2007) http://www.ergolightsw.com/CHI/Company/Articles/Poster-HCI97/Poster-HCI97.htm
User-Oriented Design (UOD) Patterns for Innovation Design at Digital Products
Chiou Wen-Ko, Chen Bi-Hui, Wang Ming-Hsu, and Liang You-Zhao
Chang Gung University, 259 Wen-Hwa 1st Road, Kwei-Shan, Tao-Yuan, Taiwan, 333, R.O.C.
[email protected]
Abstract. Innovation design is the trend of products in the future. User-oriented design (UOD) is a design process which focuses on the needs of the user and develops product concepts for them. The objective of this research is to find the UOD patterns from four digital product cases. The cases included ‘Home Scenario Control’, ‘Wireless Conference Room Facility Controller’, ‘Medical Tablet PC’ and ‘Elderly Care System and Interface’. The results show that we can find ‘real’ needs and users’ ‘problems’ concerning different digital products and that we can also integrate differing opinions from various professional fields.
Keywords: User-Oriented Design, Innovation Design, Digital Products, UOD Patterns.
2 Product Design Process
New product development (NPD) is a complex process of ideas associated with a significant measure of innovation. NPD contains research and development (R&D), production, skill development, marketing and decision-making, and also requires linking science and technology/invention or innovation with the marketplace. Throughout the 1970s and 80s product developers focused on production quality and achieving minimum production cost within long product life cycles, but in the 1990s they began to shift the focus onto time domination and launching a product to market faster than other competitors (Kengpol and O’Brien, 2001). Cooper (2000) identifies six categories of newness in product innovations: (1) new-to-the-world products, that are first of their kind and that create an entirely new market; (2) new product lines, that are not new to the marketplace, but are quite new to the company; (3) additions to existing product lines, that are new items to the company, but fit within an existing product line of the company; (4) improvements and revisions to existing products, that are replacements of existing products in a company’s product line with improvements in performance and perceived value; (5) repositioning, that are new applications for existing products and retargeting of old products to new market segments or for different applications; and (6) cost reductions, that are new products designed to replace existing products in the line.
3 Innovation
3.1 Innovation Model
The studies on the theory of innovation suggest innovation to be a ‘process,’ but there is no agreement about the nature of this process. Rothwell (1994) suggests that there are five generations of innovation models that evolve over time, originating from the linear model of the innovation process. The complexity and interconnection of the model, and the feedback it supplies, increase with the evolution of each generation. In the ‘fifth generation’ of innovation models, Rothwell (1994) describes innovation as a process with a supreme level of interaction, within the company or with external resources, assisted by IT (Information Technology) networking systems.
Fig. 1. Example of the integrated innovation process (fourth generation)
Fourth generation innovation – integrated model (Fig. 1). An integrated model of innovation saw a tight coupling of marketing and R&D activity, together with strong supplier linkages and close coupling with leading customers.
Fifth generation innovation – systems integration and networking model. This model (Table 1) of innovation builds on the integrated model by including strategic partnerships with suppliers and customers, using expert systems, and having collaborative marketing and research arrangements. There is an emphasis on flexibility and speed of development with a focus on quality and other non-price factors.
Table 1. Five generations of innovation models (Rothwell, 1994)
3.2 Innovation Cases from IDEO
IDEO helps organizations innovate through design. Independently ranked by global business leaders as one of the world's most innovative companies, it uses ‘design thinking’ to help clients navigate the speed, complexity and areas of opportunity in today's world. Multidisciplinary teams are at the heart of the IDEO method, as the company believes this is how innovation happens in the world. Quite simply, great work is accomplished by hot teams which include people from various disciplines, including Human Factors, Mechanical Engineering, Healthcare, Business Factors, Electrical Engineering, KidCentric Design, Industrial Design, Manufacturing, Environments, Interaction Design and Software Engineering. Table 2 shows six IDEO innovation cases that used the multidisciplinary team method.
Table 2. Six innovation cases from IDEO (http://www.ideo.com/ideo.asp)
Year | Project Title | Content | Teams
1997 | Stifneck Select for Laerdal Medical | Adjustable extrication collar | Industrial Design, Human Factors, Mechanical Engineering
2001 | DePaul Health Center for SSM Health Care | Patient-Care Delivery Model | Environments, Human Factors
2001 | BodyGem for HealtheTech | Handheld metabolic measurement device | Industrial Design, Manufacturing, Electrical Engineering, Human Factors, Interaction Design, Mechanical Engineering
2003 | Leighton Heart & Vascular Center for Memorial Hospital & Health System | Redefining the patient-centered care experience | Environments, Industrial Design, Human Factors, Interaction Design
2004 | Improved Patient-Provider Service for Mayo Clinic | Better service at a renowned healthcare institution | Environments, Service Design, Human Factors, Interaction Design
2004 | SFMOMA Atrium Redesign for San Francisco Museum of Modern Art | Rethinking a modern art museum lobby | Environments, Industrial Design, Human Factors
4 Scenario and UOD Cases
The term ‘scenario’ is used in different ways. It can refer to events and circumstances in the past, as in “scenario analysis”; Kreifeldt (1987) used the term to describe current user task domains, or tasks specified for user testing; still others, particularly within a design context (Joe, 1997; Welker et al., 1997), use “scenarios” to refer to explorations of possible futures. Fig. 2 shows how scenarios are used: communicating the significance of several different user lifestyles; evaluating different design concepts; keeping idea generation focused on people; exploring needs within an existing concept; and communicating critical user issues to a design team. The following four cases used the ‘scenario process’ as their method to make the UOD cases, and this research aims to build up the UOD patterns from these four cases. The UOD process began by identifying the range of users, goals, tasks and activities which needed to be considered. Ideally this exercise was based on detailed research of users in the context of interacting with products, using methods such as user profiling, field observation, contextual inquiry, protocol analysis and interviews. Throughout this process ‘real’ information about the real users was gleaned, and all information was applied to build up the initial protocol. Finally a protocol was created and tested in a real-life situation.
Fig. 2. Scenario process (Jane and Matthew, 2000)
4.1 Home Scenario Control
This case is based on ‘user-oriented design’, discussing the relationship between scenarios and semantics by using ‘scenario semantics analysis’. In view of the needs of the user at home, ‘design indoor scenarios’ bring the trend of automation into the family by providing interaction between individual and computer, ease of control and management, making systems assist human living, providing scenario control in varying circumstances and facilitating diversification and customization.
Fig. 3. Case of home scenario control
4.2 Wireless Conference Room Facility Controller
This case was a collaboration with Advantech Co. Ltd, Taiwan, which wished to develop the wireless facility controller. Advantech covers the complete market of integrated solutions, from industrial automation to medical computing to home automation. Nevertheless, it is the engineering and marketing departments who are concerned about user-oriented design approaches to fit the new interactive generation. The wireless facility controller is a kind of interactive device which has a wireless touch panel and fast operation keys that can control other automated devices. We applied usability analysis and a scenario-based approach to user-oriented design concepts in order to create an innovative design concept and to synthesize the processes into an interactive product development framework.
Fig. 4. Case of wireless conference room facility controller
4.3 Medical Tablet PC
This case anticipated that a scenario observation approach would define the design strategy for Tablet PC medical applications. Firstly, a preliminary scenario observation of medical workers from different regions was conducted. Secondly, a deeper level of scenario observation was conducted regarding the user. Understanding factors such as the user’s work content, project and environment from morning until evening facilitated the creation and analysis of a ‘work model’, and in turn a full understanding of the usage requirements, i.e. the ‘work situations’ demanded of the Tablet PC, so that a design strategy for medical Tablet PCs could be fully defined.
Fig. 5. Case of medical Tablet PC
4.4 Elderly Care System and Interface
The objective of this case was to find a new concept for building a care system and interface for the elderly in a new elderly living community in Taiwan, and additionally to try to build up this new care system and interface from concepts evolved by the scenario-based workshop group studying the life of the elderly. For the new care system and interface, we collected information from the new elderly community, which is the first of its kind in Taiwan. Through a scenario-based approach, the workshop elicited data from different professional fields.
Fig. 6. Case of elderly care system and interface
5 UOD Patterns
From the four UOD cases, this research tries to build up the UOD pattern for digital products.
Fig. 7. UOD pattern for digital products
6 Discussion and Conclusion Concerning the four cases, we found that ‘Wireless Conference Room Facility Controller’ applied the usability analysis and scenario approach based on UOD concepts to generate innovative concept designs and to synthesize all the processes to develop an interactive product development framework. The ‘Home Scenario Control’ applied the scenario animation for concept building. The ‘Medical Tablet PC’ applied scenario observation approach to find design concepts, and the ‘Elderly Care System and Interface’ applied scenario workshop to develop design ideas. Through these cases, we found that UOD has different processes. Because cases’ features were not the same we needed to modify the UOD processes for them and the products, and through them build up the UOD pattern for the design of digital products. Additionally, we also found that the prototypes ‘fitted’ the needs of the users more than the ‘brainstorming’ by the separate professions. Because we could understand more ‘user’ information from the different areas we could provide appropriate design patterns for digital product innovation design; so in summation,
UOD could be an appropriate approach to concept building for digital products. This approach could increase our understanding of the UOD process, which is important if firms are to meet the challenges of global competition. As the four design cases show, a UOD pattern for digital products can be generated. In future work we could apply this pattern to the design of digital products and test its performance, so that products better ‘fit’ users’ needs.
7 Implication
For practitioners and scholars, the approach suggests that UOD can combine different professional fields and supply more solid concepts in the design of new digital products, as the design cases have shown. It also indicates directions for future study of UOD patterns in innovation design for digital products.
Formal Validation of Java/Swing User Interfaces with the Event B Method
Alexandre Cortier1, Bruno d’Ausbourg1, and Yamine Aït-Ameur2
1 Centre d’Etudes et de Recherches de Toulouse - ONERA, 2 Avenue E. Belin – BP 4025, 31055 Toulouse, France
{cortier,ausbourg}@cert.fr
www.cert.fr
2 LISI/ENSMA, Téléport 2, 1 Avenue Clément Ader – BP 40109 31055 Toulouse, 86961 Futuroscope Chasseneuil, France
{yamine}@ensma.fr
www.lisi.ensma.fr/ihm
Abstract. User Interface (UI) systems are increasingly complex and now support critical activities. UI development therefore needs stronger validation methodologies to ensure the correctness of UI-based applications. This paper investigates the applicability of reverse engineering and formal approaches to the validation of UI correctness. The approach is as follows. An abstract model of the user interface is derived from its Java/Swing source code. This formal execution model is then used to prove that the developed interactive system is in accordance with usability requirements expressed in CTT task models.
Keywords: User Interface, Validation, Formal Methods, Method B, Tasks Model, CTT, Static Analysis.
user-based models describe the tasks the user can achieve with the final system, together with the logical sequences these tasks may follow under user actions [13]. Other models describe the software architecture for the system implementation [6]. Some models also try to describe formally the behavior of the interactive system by representing the structure of the interactions that the system supports [3,4,8,9,12]. More recently, some studies have tried to build such formal models of system behavior by analyzing the source code of the programs that encode this behavior. For example, some approaches analyze the source code statically or dynamically, according to the intended objective (formal validation or test) [5,10,11,16]. In particular, Silva et al. propose models for the reverse engineering of Java/Swing applications, obtained by static analysis of the code [15]. By investigating the Abstract Syntax Tree (AST) of a Java program and by using code slicing operations, an abstract behavioral model (more precisely, an interactor model and a state machine) and an abstract structural model (an event flow graph) can be extracted.
The approach developed in this paper contributes to the validation of interactive systems by suggesting means to demonstrate that an interactive system behaves as intended and specified. Technically, it combines two kinds of models that must be formalized and expressed in the Event B language. Firstly, a given user task model can be considered as a possible specification of the system. We assume that this task model is described by a CTT (ConcurTaskTrees) model [13]. This CTT model must be formalized in an Event B model MSp. Secondly, a static analysis of the Java/Swing code makes it possible to extract an abstract model that captures the behavioral and structural aspects of the encoded system. This extracted model MSy is also expressed in the Event B language. In this context, demonstrating that the system behaves as intended amounts to demonstrating that MSy is a correct refinement of MSp.
This paper is organized as follows. Section 2 presents the general principles of the approach. Section 3 focuses on the Event B model extraction and gives a simple example illustrating the approach. A conclusion is provided in Section 4.
2 Formal Validation of UI Systems: General Principles
This paper suggests using a formal method to contribute to the validation steps of the UI development process. The aim is to preserve designers’ experience and practices with classical development tools while reinforcing the techniques involved in validating user interface systems. Figure 1 sketches the different steps of the approach. These steps are identified graphically by labeled black circles, called tags in the following. Two main steps are identified. The first extracts an Event B model from the source code (Fig. 1, tag A); the second validates the derived Event B model (Fig. 1, tag B) with respect to a user task model.
Fig. 1. Principles of the approach
2.1 Event B Model Extraction: Principles and Technique
The aim is to extract a formal behavioral model of the Java/Swing application. A formal representation of the basic interaction components involved in the source code is needed to capture the UI behavior properly. This representation cannot be built directly from the application's Java code, because the code of these components is not inlined there: the components are defined in the Swing and AWT libraries and imported by the application. Moreover, these components can be shared by all applications, so it is advantageous to avoid reengineering them each time a program is analyzed. An Event B abstract model BSwing of the Swing library is therefore built (Fig. 1, tag 1). This abstract model is then used as a resource to derive the final Event B application model BApplM by static analysis techniques (Fig. 1, tag 2).
2.2 Formal Validation of CTT Task Models
The second step of the approach uses the refinement principle of the Event B method. The objective is to check that the concrete interactive behavior of the application is in accordance with the task model requirements. In other words, the aim is to prove
formally that the human-computer interaction scenarios that can be enforced by the encoded application software are a correct refinement of a more abstract scenario that may be derived from CTT task models. CTT is a notation for task model specifications. It describes task expressions that combine temporal operators and atomic tasks. A CTT task model is a hierarchical, tree-like structure of tasks; it requires the identification of temporal relationships between subtasks at the same level of the tree. We feel that task models can express a significant part of usability requirements. The approach therefore associates a CTT task model, which represents possible user actions, with the extracted Event B model, which represents the effective reactions of the encoded application to those actions. To achieve this goal, the CTT task model is concretized and then formalized into an Event B model BTask (Fig. 1, tag 3). This formalization step has already been studied and can be performed mechanically [3]. BTask is then refined by introducing new B events and variables that denote the UI reactions (Fig. 1, tag 4) described in the BApplM model. The correctness of this last refinement, BValidAppl, can be demonstrated mathematically by discharging the proof obligations generated by the B tools (Fig. 1, tag 5). Once they are discharged, it can be concluded that the Java/Swing application behaves in accordance with the requirements expressed in the CTT task model. More information on the validation process can be found in [7].
Scope and limitations. The current experiments handle Java source code that uses the Swing and AWT libraries. This choice is motivated by the growing popularity of the language in software development. However, the same principles and techniques could be developed and applied to other languages. At present, experiments focus on the analysis and formalization of single-threaded Java programs and applications. Synchronization constraints are not yet considered, and multi-threaded synchronized applications are beyond the scope of this paper.
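To make the idea of checking application behavior against a task model more concrete, the following is a minimal Java sketch, not the Event B refinement proof used in the paper. It assumes a finite task tree built from two CTT operators only (enabling `>>` and choice `[]`), enumerates the traces the tree allows, and checks that a system trace, once restricted to user actions, is one of them. The class `CttSketch`, its methods and the task names (`enterValue`, `clickFE`, `clickEF`) are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a finite CTT-like task tree and a trace conformance check.
class CttSketch {
    // A task denotes the complete traces (sequences of atomic task names) it allows.
    interface Task { List<List<String>> traces(); }

    static Task atomic(String name) {
        return () -> List.of(List.of(name));
    }

    // Enabling (T1 >> T2): a trace of T1 followed by a trace of T2.
    static Task enabling(Task first, Task second) {
        return () -> {
            List<List<String>> out = new ArrayList<>();
            for (List<String> a : first.traces()) {
                for (List<String> b : second.traces()) {
                    List<String> joined = new ArrayList<>(a);
                    joined.addAll(b);
                    out.add(joined);
                }
            }
            return out;
        };
    }

    // Choice (T1 [] T2): a trace of either subtask.
    static Task choice(Task left, Task right) {
        return () -> {
            List<List<String>> out = new ArrayList<>(left.traces());
            out.addAll(right.traces());
            return out;
        };
    }

    // Keeps only the user actions of a system trace and checks that the
    // projected sequence is one of the traces allowed by the task model.
    static boolean conforms(Task model, List<String> systemTrace, Set<String> userActions) {
        List<String> projected = new ArrayList<>();
        for (String step : systemTrace) {
            if (userActions.contains(step)) {
                projected.add(step);
            }
        }
        return model.traces().contains(projected);
    }

    // Assumed task model for a converter: enter a value, then pick one conversion.
    static Task converterModel() {
        return enabling(atomic("enterValue"), choice(atomic("clickFE"), atomic("clickEF")));
    }
}
```

Enumerating traces only works for small, loop-free task trees; the refinement proof described above establishes the corresponding property for all behaviors at once, which this sketch does not attempt.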
3 Extracting an Event B Model from Java/Swing Codes
3.1 Background: The Event B Method
The Event B Method was developed by J.-R. Abrial and is based on model description [2]. A model is defined as a set of variables that are declared and defined in the VARIABLES clause. These variables evolve through events declared and defined in the EVENTS clause. An Event B model encodes a state transition system in which variables represent the state and events describe the transitions from one state to another. An INVARIANT clause defines typing and safety properties on variables in first-order logic.
Generalized substitutions. The initialization event and the other events are described in a B model using generalized substitutions based on Dijkstra's weakest precondition calculus. Consider a substitution S and a predicate P expressing a post-condition; then [S]P represents the weakest precondition that establishes P after the execution of S. The substitutions occurring in Event B models are defined by the expressions shown in Figure 2.
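\begin{align}
[\mathrm{SKIP}]\,P &\Leftrightarrow P \tag{1}\\
[S_1 \parallel S_2]\,P &\Leftrightarrow [S_1]\,P \land [S_2]\,P \tag{2}\\
[\mathrm{ANY}\ v\ \mathrm{WHERE}\ E\ \mathrm{THEN}\ S\ \mathrm{END}]\,P &\Leftrightarrow \forall v.\,(E \Rightarrow [S]\,P) \tag{3}\\
[\mathrm{SELECT}\ E\ \mathrm{THEN}\ S\ \mathrm{END}]\,P &\Leftrightarrow (E \Rightarrow [S]\,P) \tag{4}\\
[\mathrm{BEGIN}\ S\ \mathrm{END}]\,P &\Leftrightarrow [S]\,P \tag{5}\\
[x := E]\,P &\Leftrightarrow P(x/E) \tag{6}
\end{align}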
Fig. 2. Generalized substitutions in Event B models
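The rules of Fig. 2 can be read as predicate transformers: each substitution maps a post-condition to its weakest precondition. The Java sketch below illustrates rules (1), (4), (5) and (6) only, under the assumption that a state is a map from variable names to integers. The class `WpSketch` and its method names are hypothetical, and the encoding is an illustration rather than the semantics implemented by any B tool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: generalized substitutions as weakest-precondition transformers.
class WpSketch {
    // A substitution S maps a post-condition P to the predicate [S]P.
    interface Subst {
        Predicate<Map<String, Integer>> wp(Predicate<Map<String, Integer>> post);
    }

    // Rule (1): [SKIP]P <=> P
    static Subst skip() {
        return post -> post;
    }

    // Rule (6): [x := E]P <=> P(x/E), i.e. P evaluated in the state where x holds the value of E.
    static Subst assign(String x, Function<Map<String, Integer>, Integer> e) {
        return post -> state -> {
            Map<String, Integer> next = new HashMap<>(state);
            next.put(x, e.apply(state));
            return post.test(next);
        };
    }

    // Rule (4): [SELECT E THEN S END]P <=> (E => [S]P)
    static Subst select(Predicate<Map<String, Integer>> guard, Subst body) {
        return post -> state -> !guard.test(state) || body.wp(post).test(state);
    }

    // Rule (5): [BEGIN S END]P <=> [S]P
    static Subst begin(Subst body) {
        return body;
    }
}
```

For example, with the state x = 3 and the post-condition x = 4, `assign("x", s -> s.get("x") + 1)` yields a precondition that holds, whereas `skip()` yields one that does not.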
Substitutions 1, 2, 5 and 6 represent respectively the empty statement, the parallel substitution expressing that S1 and S2 are concurrent, the block substitution and the assignment. Substitutions 3 and 4 are guarded substitutions, in which the predicate E is the guard. An event guarded by E is fired only if the guard is true; when it is fired, the post-condition P is established.
Semantics of Event B models. The semantics of an Event B model is a trace-based semantics with interleaving. A system is characterized by a set of licit traces that correspond to the fired events of the model. In fact, these traces denote sequences of states. Properties can be expressed as constraints on the possible structures of these traces. If the traces are in accordance with the constraints, they satisfy the given properties and are then considered licit. Event-based systems such as interactive systems can be described and formalized with this approach [3]. Moreover, decomposition through refinement makes it possible to build complex systems incrementally. A gluing invariant J(vari,varj) establishes a formal relationship between the variables vari of the abstract model and the variables varj of the refined model. The B tools generate proof obligations (POs) that must be discharged in order to validate the correctness of the refinement. If this correctness is proved, the abstract properties are inherited and satisfied by the refined models. From this description it appears that B is a top-down method using refinement.
3.2 Capturing the Java/Swing Application Behavior
Java/Swing application. The Swing library provides the programmer with widgets for programming the look and feel of interfaces. User actions on widgets generate event objects inside the system. These events are characterized by a set of attributes such as the source of the event (the widget on which the user acts) and the type of the event. Particular objects, named listeners, can be linked to widgets. A listener is characterized by a set of methods whose bodies are defined by the programmer. When a listener listens for an event and that event occurs, the method related to this event is invoked by the Java Virtual Machine (JVM) and executed. The link established at execution time between events and listener methods is enforced by considering the event attributes (source, type). Executing listener methods modifies the internal state of the interactive system. Modifications are of two types: rendering modifications (changes to the state and appearance attributes of widgets) and control modifications (changes to state variables that are not widgets).
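The listener mechanism described above can be sketched in a few lines of Java. The example below links one listener to two buttons and records the two event attributes the text mentions, the source and the type. The class `ListenerDemo`, the widget labels and the `click` helper are hypothetical and introduced only for illustration; `click` delivers an `ActionEvent` to the registered listeners directly so that the sketch can run without showing a window.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.util.ArrayList;
import java.util.List;
import javax.swing.JButton;

// Hypothetical demo class; it is not taken from the paper.
class ListenerDemo implements ActionListener {
    final JButton ok = new JButton("OK");
    final JButton cancel = new JButton("Cancel");
    final List<String> log = new ArrayList<>();

    ListenerDemo() {
        // Link the same listener instance to both widgets.
        ok.addActionListener(this);
        cancel.addActionListener(this);
    }

    @Override
    public void actionPerformed(ActionEvent e) {
        // The reaction is selected from the event attributes: source widget and event type.
        String source = (e.getSource() == ok) ? "ok" : "cancel";
        String type = (e.getID() == ActionEvent.ACTION_PERFORMED) ? "ActionPerformed" : "other";
        log.add(source + ":" + type);
    }

    // Simulates a user click by handing an ActionEvent to every listener linked to the button.
    static void click(JButton button) {
        ActionEvent e = new ActionEvent(button, ActionEvent.ACTION_PERFORMED, button.getActionCommand());
        for (ActionListener l : button.getActionListeners()) {
            l.actionPerformed(e);
        }
    }
}
```

The list returned by `getActionListeners()` plays a role comparable to the set of listener instances that the extracted model associates with each widget.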
Capturing the behavior. The UI behavior can be captured by focusing on the rendering aspects of the application and by observing how these aspects evolve under interaction. The data needed to capture this behavior properly are: (1) the statements that create widget and listener instances and the links established between them; (2) the implementation code of the listener methods, because this code defines the UI reactions to user actions. The statements that create widgets or listeners, and that link widgets to listeners, can be found by starting the analysis with the implementation code of the main() method. A static analysis of this method makes it possible to build the INITIALISATION clause (a particular event) of the Event B model. The UI reactions are defined in the implementation code of the listener methods. These methods are also analyzed and translated into B events. But building a formal execution model of these methods from the source code in a single step is a complex task, so it must be decomposed. A Java program is made up of a set of class declarations. Each class defines a set of attributes and methods. Generally, the body of a method is not a simple sequence of assignments but a sequence of more complex statements, in particular method calls. Moreover, a significant number of classes are imported from libraries, and it is not necessary to represent all the methods and attributes of imported classes in the final execution model. To handle this difficulty, the translation of listener methods into B events is preceded by the following steps: (1) first, the methods of the functional core of the application are abstracted, since they do not affect UI rendering and are therefore not relevant to capturing the interactive behavior of the application; (2) secondly, listener methods are flattened by inlining the methods they invoke. The resulting code is a linear code composed of a set of assignments scheduled by the control structures of the language: conditional statements, sequential compositions and loops. Once these operations are carried out, the listener methods are translated into B events. Each assignment is translated into one B event. In some cases several instructions can be merged into a single B event using parallel substitutions; this merging requires a careful analysis of data dependencies. Variants are introduced to schedule these B events and to reflect the execution order of the assignment statements in listener methods. This translation process is based on the translation rules defined by J.-R. Abrial, used in reverse [1].
Technique. Figure 3 presents the technique used to perform the static analysis of Java/Swing applications. First, a Java Abstract Syntax Tree (AST) is built by parsing the source code (Fig. 3, tag 1). Secondly, abstract interpretations are performed to isolate the interactive part of the Java program (Fig. 3, tag 2); in particular, the functional core of the application is strongly abstracted. The resulting Java AST may be viewed as an execution model of the interactive part of the program. It is then translated into an Event B AST (Fig. 3, tag 3) using predefined transformation rules. This third AST is extended and completed by inserting the generic resource BSwing, previously translated into an Event B AST (Fig. 3, tag 4). The final Event B AST can be pretty-printed into the Event B behavioral and structural model of the application, BApplM.
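The scheduling of assignment events by a variant, described under "Capturing the behavior", can be illustrated with a small Java simulation. The sketch assumes a hypothetical listener body made of two dependent statements, `a = 1; b = a + 1;`, each turned into one guarded event. The class `VariantSketch`, the event names and the integer encoding of the variant are assumptions made for illustration; they are not the output of the authors' translation rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: two sequential assignments modeled as guarded events ordered by a variant.
class VariantSketch {
    int a;
    int b;
    int variant = 2; // number of statements still to execute; each event decreases it
    final List<String> fired = new ArrayList<>();

    static final class Event {
        final String name;
        final BooleanSupplier guard;
        final Runnable body;

        Event(String name, BooleanSupplier guard, Runnable body) {
            this.name = name;
            this.guard = guard;
            this.body = body;
        }
    }

    // Events are deliberately listed in reverse order: the guards on the variant,
    // not the position in the list, reproduce the order of the original statements.
    final List<Event> events = List.of(
        new Event("second", () -> variant == 1, () -> { b = a + 1; variant = 0; }),
        new Event("first", () -> variant == 2, () -> { a = 1; variant = 1; }));

    // Fires enabled events until no guard holds any more.
    void run() {
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Event e : events) {
                if (e.guard.getAsBoolean()) {
                    e.body.run();
                    fired.add(e.name);
                    progress = true;
                }
            }
        }
    }
}
```

Because `b` depends on `a`, the two events cannot be merged into a single parallel substitution here; if the statements were independent, one event with both assignments would be enough.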
Fig. 3. Static analysis of Java/Swing programs: technique
Example: A Money Converter. Figure 4 shows a small program used as an example: a money converter between French Francs and Euros. The user enters the value to convert in the left text field (Fig. 4, tag 2), then performs the conversion by pushing either the Francs->Euros button (Fig. 4, tag 1) to convert to Euros, or the Euros->Francs button (Fig. 4, tag 3) to convert to French Francs. The converted value is displayed in the right text field (Fig. 4, tag 4).
Fig. 4. An example: a money converter
In the initial state, the left text field is the only enabled widget. When the user modifies this text field by entering a value at the keyboard, the buttons become enabled and the user can choose between the two conversions. When the user clicks on a button, the other button becomes disabled and the converted value is displayed in the right text field. Figure 5 gives part of the Java code of the converter example. Two buttons EF and FE and one text field output are instantiated. These widgets are instantiated and initialized in the main method P_converter. Buttons EF and FE are associated with an ActionListener listener that catches the ActionEvent events emitted when the user pushes one of the two buttons. In this case, the actionPerformed() listener method is executed. On the right side of Figure 5, an overview of the obtained BApplM model is given. The INITIALISATION clause defines widget and listener instantiations and initializations. In this model, the total function W_att represents the attributes of widget instances. Three attributes are associated with each widget: visible and enabled represent respectively the visibility and enabling state of the widget, and lists holds the set of listener instances associated with the widget.
public class P_converter implements ActionListener {
    private JButton EF, FE;
    private JTextField output;
Fig. 5. A part of the converter source code and its Event B representation
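Since only a fragment of the Fig. 5 listing survives here, the following is a hedged reconstruction of the behavior described in the text, not the authors' code. It assumes that FE labels the Francs->Euros button and EF the Euros->Francs button, that a key listener on the left field enables the buttons, and that the functional core is a static `convert` method using the official fixed rate of 6.55957 francs per euro. The class name `ConverterSketch` and the field `input` are hypothetical.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.event.KeyAdapter;
import java.awt.event.KeyEvent;
import java.util.Locale;
import javax.swing.JButton;
import javax.swing.JTextField;

// Hypothetical reconstruction of the converter behavior; widgets are not placed in a window.
class ConverterSketch implements ActionListener {
    static final double FRANCS_PER_EURO = 6.55957; // official fixed conversion rate

    final JTextField input = new JTextField(10);
    final JTextField output = new JTextField(10);
    final JButton FE = new JButton("Francs->Euros"); // assumed meaning of FE
    final JButton EF = new JButton("Euros->Francs"); // assumed meaning of EF

    ConverterSketch() {
        // Initial state: the left text field is the only enabled widget.
        FE.setEnabled(false);
        EF.setEnabled(false);
        output.setEnabled(false);
        FE.addActionListener(this);
        EF.addActionListener(this);
        // Typing in the left field enables both buttons (a rendering modification).
        input.addKeyListener(new KeyAdapter() {
            @Override
            public void keyTyped(KeyEvent e) {
                FE.setEnabled(true);
                EF.setEnabled(true);
            }
        });
    }

    // Functional core: this is the part the extracted model abstracts away.
    static double convert(double amount, boolean francsToEuros) {
        return francsToEuros ? amount / FRANCS_PER_EURO : amount * FRANCS_PER_EURO;
    }

    @Override
    public void actionPerformed(ActionEvent e) {
        boolean francsToEuros = (e.getSource() == FE);
        // The button that was not pushed becomes disabled.
        (francsToEuros ? EF : FE).setEnabled(false);
        // Throws NumberFormatException if the field does not hold a number.
        double amount = Double.parseDouble(input.getText());
        output.setText(String.format(Locale.ROOT, "%.2f", convert(amount, francsToEuros)));
    }
}
```

The two branches selected by `e.getSource()` correspond to the kind of conditional that the extracted model expresses as guards on the source of the last user action.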
For each widget type, a similar function represents the specific widget attributes. For instance, Jtf_att associates with each text field widget an attribute named value, which represents the value typed in the text field. The UA variable models the last event emitted in response to the last user action on the UI. This structure encapsulates the source and the type typ of the emitted event (ActionPerformed, KeyTyped, KeyReleased). All these variables are defined in the generic resource BSwing, the Event B abstraction of the Swing library imported in BApplM. A more precise description of the BSwing model can be found in [7].
The actionPerformed listener method is described by two events in the EVENTS clause. The conditional statement is expressed in the guard definitions: the first event is fired when the source attribute of the UA variable is EF (UA’source=EF) and the second when not(UA’source=FE). The other guard conditions are the same for both events and mean that an event is fired if and only if: (1) list1 is an ActionListener and is linked to the source of the last emitted event, (2) the type of the last user action is ActionPerformed. A variant V_aP ensures that an event is not fired several times. In this example, there are no data dependencies between the assignment statements in the sequential composition, so these statements can be merged in the event bodies using parallel substitutions. The method convert, which belongs to the functional core of the application, is abstracted by the constant result.
4 Conclusion
This paper presented an approach that contributes to the formal validation of UI systems. Starting from Java/Swing source code, a profiled projection of the Java code yields an Event B model of the application. This model encapsulates UI behaviors by modeling the Swing library and by abstracting listener methods into a set of B events scheduled by variants. The functional core of the application is abstracted. Validation is achieved with respect to a task model that can be viewed as an application specification. This task model can be encoded using the Event B Method. Once the formalization of the CTT model is obtained, it can be refined by adding the variables and events of the B model extracted from the source code. The resulting refinement links user actions (leaves of the CTT task tree) with UI reactions. Assertions ensure that the final model is deadlock free. In this case, they ensure that the interaction scenarios encoded in the UI application are defined and accepted by the CTT task model. The B tools discharged all the proof obligations generated for the example presented in this paper. With this validation technique, UI development practice remains unchanged. Moreover, except for the concretization of CTT task models and for interactive proofs, all steps of the approach are performed automatically and can be applied to Java/Swing programs. We also think that the proposed approach could be applied to other programming languages used for UI design. We are currently developing a tool that extracts an Event B model from Java/Swing source code and assists the user in performing the validation steps. The use of this approach to validate multimodal applications is also being studied. Finally, we expect to extend this approach by defining multiple views of the application, each view being dedicated to the validation of particular UI properties. A view of the application is obtained by performing a particular abstraction of the Event B model of this application. The abstraction could be validated with the proof techniques associated with refinement in the B method.
References
1. Abrial, J.R.: Event Based Sequential Program Development: Application to Constructing a Pointer Program. In: FME, pp. 51–74 (2003)
2. Abrial, J.R.: The B-Book: Assigning Programs to Meanings. Cambridge University Press, Cambridge (1996)
3. Aït-Ameur, Y., Baron, M., Kamel, N.: Encoding a Process Algebra using the Event B Method. Application to the Validation of User Interfaces. In: ISOLA 2005, Columbia, USA, Springer, Heidelberg (2005)
4. Ausbourg(d’), B.: Using Model Checking for the Automatic Validation of User Interfaces Systems. In: Markopoulos, P., Johnson, P. (eds.) Proceedings of Design, Specification and Verification of Interactive Systems ’98, Abingdon, UK, Springer, Heidelberg (1998)
5. Ausbourg(d’), B., Durrieu, G., Roché, P.: Deriving a Formal Model of an Interactive System from its UIL Description in order to Verify and Tests its Behaviour. In: DSV-IS, pp. 105–122 (1996)
6. Bass, L., Pellegrino, R., Reed, S., Seacord, R., Sheppard, S., Szczur, M.R.: The Arch Model: Seeheim Revisited. In: CHI 91 User Interface Developer’s Workshop (1991)
7. Cortier, A., Ausbourg(d’), B., Aït-Ameur, Y.: Using the Event B Method to contribute to the Formal Validation of User Interface Systems. Technical Report, ONERA-CERT (2007)
8. Duke, D.J., Harrison, M.D.: Event Model of Human-System Interaction. Software Engineering Journal, pp. 3–12 (January 1995)
9. Markopoulos, P.: A Compositional Model for the Formal Specification of User Interface. PhD thesis, University of London (1997)
10. Memon, A., Banerjee, I., Nagarajan, A.: GUI Ripping: Reverse Engineering of Graphical User Interfaces for Testing. In: Proceedings of the 10th Working Conference on Reverse Engineering (WCRE’03), Los Alamitos, CA, USA, vol. 0, p. 260. IEEE Computer Society, Washington, DC, USA (2003)
11. Moore, M.: Rule-Based Detection for Reverse Engineering User Interfaces. In: Proceedings of the 3rd Working Conference on Reverse Engineering (WCRE’96), p. 42. IEEE Computer Society Press, Washington, DC, USA (1996)
12. Palanque, P., Bastide, R., Sengès, V.: Validating Interactive System Design through the verification of Formal Task and System Models. In: Bass, L.J., Unger, C. (eds.) Working Conference on Engineering for Human-Computer Interaction (EHCI’95), pp. 189–212. Chapman & Hall, USA (1995)
13. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, London, UK (1999)
14. ISO/TC159 Sub-Committee SC4. Draft International ISO DIS 9241-11 Standard (September 1995)
15. Silva, J.C., Campos, J.C., Saraiva, J.: Models for the Reverse Engineering of Java/Swing Applications. In: 3rd International Workshop on Metamodels, Schemas, Grammars, and Ontologies (ateM 2006) for Reverse Engineering, Informatik-Bericht series. Johannes Gutenberg-Universität Mainz, Institut für Informatik – FB 8, October (2006)
16. Systa, T.: Dynamic Reverse Engineering of Java Software. In: Proceedings of the Workshop on Object-Oriented Technology, London, UK, pp. 174–175. Springer, Heidelberg (1999)
Task Analysis, Usability and Engagement
David Cox
School of Computing and Communications, Faculty of Technology, Southampton Solent University, East Park Terrace, Southampton SO14 0YN, UK
(+44) (0)23 80 319263
[email protected]
Abstract. Human factors methods such as Hierarchical Task Analysis (HTA) are an important means of improving product usability through user-centred analysis and design. The goal driven nature of HTA is examined in the context of a Human Computer Interaction module in a higher education environment. A study of HTA techniques, exercises, typical errors and engagement is presented in order to determine whether this method promotes learner engagement. The study concludes that it may increase intrinsic motivation and engagement among learners as well as raise awareness of usability.
Keywords: Hierarchical; task; analysis; engage; intrinsic; motivation; usability; interface; human; computer; interaction; user.
1.1 Human Factors and Task Analysis
Human Factors methods as a means of achieving usable design are now wide ranging. They can be used in the analysis of human interaction with a device, in product design and in the evaluation of a system, and at all stages there is now great emphasis on focusing on the user or potential users of systems. Indeed ISO 13407, which covers Human-Centred Design of Systems, emphasises the need to specify and understand the context of use as well as to specify the user and organisational requirements at the earliest stage of a project [3]. Task analysis is an established method of understanding human performance and goals in context. It involves collecting and analysing task data and representing that data so that the tasks are clearly understood. It enables the designer to represent the users’ goals clearly and concisely and provides another vehicle for establishing with users that these are the tasks which they consider to be important.
1.2 A Range of Methods
Task Analysis is only one of a range of human factors methods, and Stanton et al. [4] review and establish a set of techniques suitable for the design and evaluation of systems. These range from Task Analysis and Cognitive Task Analysis to Situation Awareness and Interface Analysis techniques, but they highlight the usefulness of Task Analysis in representing human performance in a particular scenario as well as in breaking down tasks/scenarios into individual steps. Task Analysis techniques include, among others, Critical Path Analysis (CPA), Goals, Operators, Methods and Selection rules (GOMS), Verbal Protocol Analysis (VPA) and Hierarchical Task Analysis (HTA). Generally, whichever technique is used, Task Analysis is seen as an important tool in developing usable products. It involves identifying tasks, collecting and analysing task data so that tasks are fully understood, and then producing some kind of documented representation of the analysed tasks [5].
2 Task Analysis and HTA
Hierarchical Task Analysis extends the clarity of task analysis and centres on a graphical description of the activity under analysis in terms of a hierarchy of goals and sub-goals mixed with plans and operations. It was originally produced in order to provide a better understanding of cognitive tasks [6] and allows the designer to represent the users’ goals (or tasks) as a hierarchical chart. Such a representation promotes a systematic decomposition of goals into clear sub-goals, which in turn can be broken down into further sub-goals. This graphical nature of HTA provides a very simple but very clear and precise method of discussing users’ tasks and determining whether they are of real importance or whether they should be included in a system to allow achievement of an overall objective. HTA can also be seen as a starting point which promotes other ways of resolving problems and developing a more usable system. For example, decision points in HTA can be identified and Cognitive Analysis techniques could be employed to ascertain through interview why users made these decisions. This data can also be fed into the
design process in order to achieve accuracy in the resulting system. Stanton et al. [4] emphasize the popularity and wide use of Hierarchical Task Analysis (HTA) in the context of all task analysis methods. It is very flexible and can be said to lie at the heart of user-centred design because of its adaptability and key initial use in many human factors analysis methods.
2.1 An Example of Hierarchical Task Analysis
An example of a simple HTA chart is shown in Figure 1. The goal of borrowing a book from a library is decomposed into sub-goals (or tasks), with each level ordered from left to right.
Fig. 1. Example Hierarchical task analysis diagram for borrowing a library book. The main goal is broken down into sub tasks/goals in hierarchical order. Numbering is from left to right. Plans describe the operation of the resulting decomposition of a task. Modified from [7].
A level consists of two or more sub-goals, optimally between three and ten. A plan describes the sequence in which the tasks should be carried out, any choices which may occur, and the conditions under which sub-goals would be triggered. The chart can be verified by users and the analysis revised if necessary. Normally, sub-goals are not decomposed any further once the analysis, in conjunction with users, is judged to be complete or fit for purpose and an appropriate operation is reached. A bottom-level operation is indicated by a line.
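The structure of an HTA chart can be captured by a small tree data type, shown below as a Java sketch under stated assumptions. The numbering scheme (0 for the top goal, then 1, 2, 2.1 and so on from left to right) is a common HTA convention assumed here, and the sub-goals and plans of the library example are invented for illustration because the content of Fig. 1 is not reproduced in the text. The class `HtaSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an HTA hierarchy with plans and left-to-right numbering.
class HtaSketch {
    static final class Goal {
        final String name;
        final String plan; // null for a bottom-level operation
        final List<Goal> subGoals = new ArrayList<>();

        Goal(String name, String plan) {
            this.name = name;
            this.plan = plan;
        }

        Goal add(Goal child) {
            subGoals.add(child);
            return this;
        }
    }

    // Lists the chart as numbered lines: "0" for the top goal, then 1, 2, 2.1, ...
    static List<String> outline(Goal root) {
        List<String> out = new ArrayList<>();
        out.add("0 " + root.name);
        walk(root, "", out);
        return out;
    }

    private static void walk(Goal goal, String prefix, List<String> out) {
        for (int i = 0; i < goal.subGoals.size(); i++) {
            Goal child = goal.subGoals.get(i);
            String id = prefix.isEmpty() ? String.valueOf(i + 1) : prefix + "." + (i + 1);
            out.add(id + " " + child.name);
            walk(child, id, out);
        }
    }

    // Checks the rule that every decomposed goal has at least two sub-goals.
    static boolean wellFormed(Goal goal) {
        if (goal.subGoals.isEmpty()) {
            return true; // bottom-level operation
        }
        if (goal.subGoals.size() < 2) {
            return false;
        }
        for (Goal child : goal.subGoals) {
            if (!wellFormed(child)) {
                return false;
            }
        }
        return true;
    }

    // Invented decomposition of the library example; not the content of Fig. 1.
    static Goal libraryExample() {
        return new Goal("Borrow a book from the library", "do 1, 2, 3 in order")
            .add(new Goal("Go to the library", null))
            .add(new Goal("Find the book", "do 2.1; if not found do 2.2")
                .add(new Goal("Search the catalogue", null))
                .add(new Goal("Ask a librarian", null)))
            .add(new Goal("Take the book to the counter", null));
    }
}
```

A check such as `wellFormed` only covers the lower bound on sub-goals; whether a decomposition is complete or fit for purpose remains a judgement made with users, as the text notes.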
3 Task Analysis and an Educational Perspective
This section examines Task Analysis, and in particular HTA, from a higher education viewpoint in the context of learners engaging with human factors methods within an HCI context. It will also focus on the course(s) used in this study.
Task Analysis, Usability and Engagement
3.1 The Background
In general, it is very common to deliver a module such as HCI, Human Computer Interaction and Design (HCID) or User/Usability & Interface Design (UID), within the UK and other state/national higher education systems, in one semester lasting between twelve and fifteen weeks. The time allotted may be three to four hours per week. Alternatively, a module might run through a whole academic year of, say, thirty weeks, but with the number of hours reduced to two or three per week. On computing-based courses, the module will compete for time with other software-based units. Within the module itself, time constraints mean that the issues and skills covered also vie with each other for space within the syllabus. Three broad areas within such a syllabus might be: human cognitive factors; usability and task analysis (or other human factors methods); and design with possible construction.
3.2 Human Factors Methods and Engagement
In a higher education environment, HCI and the concept of usability form a domain which is often new to learners studying BSc courses in computer science or related areas. Because of the diverse range of cognitive and human factors presented to learners, the concept of usability is often difficult to focus on. In order to approach usability and design in a sensible manner, human cognition and human factors methods must always be considered in a logical time frame. This can produce conflict within such degree courses, both in terms of time constraints and student dissatisfaction with what they may see as an unnecessary weight placed not only on areas such as psychology but also on what is perceived as an unusual analysis tool. However, studies show that areas of learning which may be new to computer science and technology students, such as perception, cognition or human factors methods, may have an important value in terms of curiosity, which may stimulate students’ intrinsic motivation.
This in turn may help students to engage more fully with their studies in this area and in turn reveal the important value of methods such as task analysis as a means of achieving a usable product interface. Often engagement is due to perceived career prospects through high grades but it can also be because of the content provided in the module syllabus stimulated by the process of curiosity and the value in learning something new [8, 9]. In other words, a person engages with a topic because they want to, because they are motivated intrinsically. Intrinsic motivation is possibly the most valuable of the different types of motivation [10]. Students who become engaged in this manner will work around the subject and explore it more fully than those who are not motivated in this way. Because of this, the curiosity value of connecting a new human factors method directly with other elements of development within an HCI context may be one way of promoting interest and engaging students in usability analysis and design work. In addition, areas such as human/cognitive factors and human factors methods, can also be seen as a way of engaging students with other parts of their course and the environment in general [11].
3.3 The Courses
The module used in the study is Human Computer Interaction Design (HCID) which focuses on Cognitive Factors, Human Factors Methods and usability but also emphasizes practical Interface Design exercises and assignment work which require some element of functionality. The module is semester-based and is conducted for a period of thirteen weeks. It is aimed at Level 2 (second year) undergraduate students on two separate programmes (effectively courses) who complete the unit in semester one of the academic year: BSc Computer Studies (CS) and BSc Business Information Technology (BIT). Students on both programmes enter their first year with diverse backgrounds and a wide range of experience and ability in software design and program design, development and implementation. Both courses have a programming element running through them and the BIT students will have experienced some analysis techniques in their first year though not task analysis and not in the context of HCI or usability issues.
4 Exercises Within the Study
The study focused on an examination of HTA and on a series of HTA techniques and exercises presented within the HCID module attended by BSc Level 2 students (cf. section 3.3). The study was part of an ongoing larger survey looking at student levels of engagement and attitudes towards aspects of cognition, human factors and perception within a typical HCI module [12]. The study examined some HTA techniques conducted by BSc students. Its main objective was to gain some information on how useful, easy and relevant these techniques are to students within a Usability/HCI module on an honours degree. In addition, it attempted to examine whether students see some value in engaging with a set of exercises in the context of usability and their course.
4.1 HTA Techniques and Exercises Within an HCI Module
HTA was introduced to students as part of a series of tutorials which, for the first three weeks, consisted of an investigation into perception in conjunction with interface design. The tutorials consolidated a series of lectures. In the fourth week of the thirteen-week semester, students were presented with HTA, concepts of usability, the importance of user-centred design, user profiling and understanding what tasks and goals are relevant to users. A practical two-hour tutorial/workshop continued this theme, examining ways in which user information and goals could be elicited. This included very simple HTA examples, such as that illustrated in Figure 1, partly presented on slides and partly drawn quickly on a flip chart, with a detailed explanation of the plans and tasks and how they are constructed. Students were then asked to choose an object or operation relevant to them and a goal associated with this. This might be to make a play-list on an MP3 player or eat a self-made sandwich.
Because of relevance to the module in terms of interaction with interfaces, they were encouraged to select portable or non-portable objects with interfaces if possible but activities, such as eating, were considered to be equally valid for this exercise. They were then provided with a large sheet of paper with associated marker pens and asked
to break that goal down using HTA techniques. They could work in pairs if they wished. Construction of the exercise on paper was selected in order to provide a diversity of approaches within the tutorial sessions. These students mostly conduct any exercise on a screen so this was something different. In addition, it emphasized the difficulty of constructing such charts on large paper sheets in order to communicate with users or groups of users. In the following week, a further lecture consolidated the issues which arose from this tutorial. It looked at HTA in more detail and a second tutorial strengthened the techniques which were established earlier. However, this time the tutorial groups were given set goals such as ‘using a microwave with an auto-sensor’, ‘washing clothes’ or the library example (cf. Fig. 1). They were also encouraged to do the exercise on screen, something with which they were more familiar. In addition to the two weeks of tutorial practice, students were later expected to complete a hierarchical task analysis of a larger case study. This formed part of an assessable assignment which they were required to complete by week 11 of the semester.
4.2 Response to the Exercises
In general, the techniques and methods presented produced a very positive response. The construction of hierarchical task analysis was new to all within the cohort, with many finding the plan and task layout conventions particularly intriguing. Some errors were apparent (cf. 5.1) but many students engaged particularly well in the construction of a hierarchical task analysis in large sheet format, with a considerable number eager to discuss and demonstrate their results to the whole group. In the initial exercise, the cohort’s choice of subject goal to analyse could be roughly divided between portable devices such as mobile phones and general activities such as ‘drinking beer’.
Many students opted for what they thought would be simple goals such as ‘making a phone call’ or ‘writing and sending a text message’ using their hand-held devices. Others opted for an activity or operation which could be broken down into a series of goal tasks such as ‘making a smoothie drink’. The consolidation exercise in the second week was more prescriptive with no choice of subject matter. The group responded as favourably to these exercises as the first set though with slightly less eagerness. They produced some good analyses but with similar errors. The third set of exercises which required students to perform hierarchical task analyses as part of an assessable assignment, exhibited similar but fewer errors. A further analysis of the difficulties encountered by the cohort is in the following section.
5 Study Analysis, Difficulties of HTA and Conclusions
Some comments by students, mainly on the first but also on the second series of exercises, indicate the level of engagement and the difficulties encountered in this human factors method. They are fairly indicative of the general comments by the cohort and were as follows: ‘this is great – it gets me away from doing everything on a screen’; ‘yes, I get it – cool’; ‘why can’t I just have one sub goal’; ‘can’t we just go to this level within the plan’; ‘what we’ve done is put this choice in this box - not in the plan’.
The first two comments perhaps demonstrate a degree of engagement while the remaining comments indicate some of the difficulties and frustration exhibited by some. A discussion follows on the types of difficulties or errors frequently encountered by the cohort and an examination of students’ attitude towards this human factors method.
5.1 Difficulties Encountered
After two 2-hour sessions the initial difficulties were less frequent as were the types of comment indicated above. In the first tutorial, many found that the goal which they had chosen, such as ‘send a text message’ contained more tasks and was more difficult to analyse hierarchically than had originally been thought. A number of studies suggest that HTA training and application times for tasks range from a few hours to substantial periods and are possibly dependent on the size and complexity of the task [2, 6]. This tends to support the results of this exercise with students initially finding the basic construction of the charts difficult for even very simple tasks. Some confused the production of HTA with the construction of charts with which they had some prior experience such as structured programming or hierarchical organisational charts. This involved using notation which was different from that discussed and required in HTA. Another common mistake was simply omitting the plans completely and focusing solely on tasks. A further very typical error was not describing a task clearly as a task. This often involved simply not using a verb. So, for example, when using a mobile phone, a goal: ‘find person’s phone number’ might imply searching address book, checking with a friend, searching on-line, searching user memory, etc. However, rather than use words ‘search’, ‘find’, ‘check’, etc., students simply noted ‘address book’, ‘memory’, etc., which did not clearly define the nature of the goal. The problems described thus far might be broadly categorised as syntactical or layout/notational errors.
Table 1. Common HTA errors by students at the end of the first two tutorial/workshops. Errors are listed in order, the first being the most common.
Popularity  Error Name
1           Omission of plans
2           General confusion with other type of analysis notation
3           Describing the wrong level in a plan or series of tasks (moving back up a level)
4           Task/goal not described clearly (an omission of verbs to describe task)
5           Task/goal described/decomposed as only one sub-goal
6           Attempting to describe conditions or order in task/goals rather than in plans (confusion of plans with tasks)
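Three of the errors listed in Table 1 (omitted plans, a goal decomposed into only one sub-goal, and a task not described with a verb) are mechanical enough to be checked automatically. The following is a minimal sketch of such a check, not a tool used in the study: the `Task` structure, the `check` function and the small verb list `TASK_VERBS` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical verb list, used only for this illustration.
TASK_VERBS = {"search", "find", "check", "send", "write", "select", "enter", "press", "make", "open"}


@dataclass
class Task:
    name: str
    plan: Optional[str] = None
    subtasks: List["Task"] = field(default_factory=list)


def check(task: Task, number: str = "0") -> List[str]:
    """Collect notational problems found in a task and all of its sub-goals."""
    problems = []
    words = task.name.split()
    first_word = words[0].lower() if words else ""
    if first_word not in TASK_VERBS:
        problems.append(f"{number}: task not described with a verb ('{task.name}')")
    if task.subtasks:
        if len(task.subtasks) == 1:
            problems.append(f"{number}: decomposed into only one sub-goal")
        if not task.plan:
            problems.append(f"{number}: plan omitted")
    for i, sub in enumerate(task.subtasks, start=1):
        child = str(i) if number == "0" else f"{number}.{i}"
        problems.extend(check(sub, child))
    return problems


# A deliberately flawed draft, modelled on the mistakes described in the text.
draft = Task(
    "Send a text message",
    subtasks=[
        Task("Find person's phone number", plan="do 1.1", subtasks=[Task("address book")]),
        Task("Write the message"),
    ],
)

for problem in check(draft):
    print(problem)
```

The other errors in the table, such as describing the wrong level or placing conditions inside task boxes, depend on the meaning of the analysis and would still need review by a person.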
A second broad problem area could be termed logic errors though some also include syntactical misjudgements. A very common mistake was attempting to describe a goal as one sub-goal only rather than a series of two or more. On occasion there was an
attempt at rectifying this error by notating tasks in a vertical order leading to further confusion. Another typical problem was confusing plans with tasks. Choices in HTA should be outlined in the plan but students attempted to conflate series of conditions into task boxes which skewed the logic of the overall analysis. Similarly, a goal is described as a series of sub-goals but many assumed that the sub-goals not only describe their parent goal but that the last sub-goal in that series should outline the next goal in the level above. This may be termed ‘moving back up a level’. An attempt was also made to describe a move to an upper level in plans. The common errors by students at the end of the first two sessions are summarized in Table 1. By the time students handed the third series of exercises in as part of an assessable case-study assignment (some six weeks later), the number of errors had diminished.
Table 2. Common errors in HTA constructed six weeks later
Popularity  Error Name
1           Task/goal not described clearly (an omission of verbs to describe task)
2           Omission of plans
3           Describing the wrong level in a plan or series of tasks (moving back up a level)
4           Task/goal described/decomposed as only one sub-goal
The percentage of errors in all areas was also substantially lower. This may partly be because they had much more time to refine the techniques and skills gained and partly because this exercise was assessable, but it does lend support to the view that a more substantial period of time is required to apply this analysis accurately. The most common errors at this stage were mainly syntactical/notational.
5.2 Engagement
The first two comments (cf. 5) and similar ones made by students may indicate a high degree of engagement that positively influenced their work in the later exercise. Certainly the level of enthusiasm exhibited in the first two sessions, together with the production of a markedly improved standard of work in the second session, appeared to influence the quality of the later, mainly error-free, HTA. The enthusiasm and initial curiosity value of HTA may have generated an element of intrinsic motivation which helped students engage in their work and in turn provided a fuller understanding of hierarchical task analysis techniques. Evidence does suggest that curiosity helps in this area and also helps stimulate and motivate other areas of learning. An ongoing survey conducted since 2003 tends to support this view [12]. The survey focuses substantially on aspects of cognitive psychology and human factors within the HCI module in question (cf. 3.3) and points towards the value of human factors in helping motivate students in all aspects of their environment (e.g. programming aspects of their course). The survey is presented as a Likert-type scale with statements such as: ‘I feel the psychology aspect of HCID will help me with other parts of my course’ requiring an indication of agreement ranging from ‘Strongly Agree’ to ‘Strongly Disagree’.
In order to provide a basic gauge of the degree to which students engaged in human factors methods and particularly the task analysis exercises, a question was added to the survey during academic year 2006-07. This is illustrated in Table 3. It was hoped that it would elicit students’ attitude towards HTA and provide an indicator of engagement.
Table 3. Indication of understanding of task analysis, human factors techniques
Statement: ‘I feel I have a fuller understanding of task analysis techniques.’
Response                      Total %
Strongly Agree                18.5
Somewhat Agree                63
Neither Agree nor Disagree    11.1
Somewhat Disagree             3.7
Strongly Disagree             3.7
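As a cross-check on Table 3, the published percentages are consistent with whole-number counts out of the fifty-four completed surveys. The counts below are not reported in the paper; they are an inference from the percentages and are labelled as such in the code. The sketch recomputes the percentages, the combined agreement figure and the response rate from the cohort of sixty-nine.

```python
# Counts per response are inferred from the published percentages,
# not data reported in the paper.
counts = {
    "Strongly Agree": 10,
    "Somewhat Agree": 34,
    "Neither Agree nor Disagree": 6,
    "Somewhat Disagree": 2,
    "Strongly Disagree": 2,
}
cohort_size = 69  # total cohort stated in the text

respondents = sum(counts.values())
percentages = {label: round(100 * n / respondents, 1) for label, n in counts.items()}
agreeing = round(
    100 * (counts["Strongly Agree"] + counts["Somewhat Agree"]) / respondents, 1
)
response_rate = round(100 * respondents / cohort_size, 1)

print(respondents)    # 54
print(percentages)
print(agreeing)       # 81.5
print(response_rate)  # 78.3
```

Under these assumed counts, each of the two disagreement categories corresponds to only two students, which is worth bearing in mind when reading the percentages.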
Fifty-four subjects completed the survey, from a total cohort of sixty-nine. Table 3 reveals students’ positive attitude towards task analysis techniques within this module with 81.5% agreeing (strongly or otherwise) with the sentiment addressed in the statement. While this is not definite evidence that students connected fully with the HTA exercises and techniques, it is a marker of a degree of engagement with these methods.
5.3 Conclusions
In order to engage, learners must be motivated and the study indicates that one way to do this is through intrinsic curiosity in a relevant human factors method. It suggests HTA may be a suitable method for this and that the cohort gained a solid foundation in HTA techniques. The relative reduction in errors in later exercises also points towards a degree of engagement and implies that students can recognise HTA as a useful tool in achieving product usability.
References
1. Barber, P.: Applied Cognitive Psychology. Methuen, London (1998)
2. Stanton, N.A., Young, M.S.: A Guide to Methodology in Ergonomics, Designing for Human Use. Taylor & Francis, London (1999)
3. Bevan, N., Bogomolni, I.: Incorporating User Quality Requirements in the Software Development Process. In: Proceedings of the 4th International Software Quality Week Europe & International Internet Quality Week Europe (QWE 2000) [on line] (2000) http://www.soft.com/QualWeek/QWE2K/Papers.pdf/Bevan.pdf
4. Stanton, N.A., Salmon, P.M., Walker, G.H., Barber, C., Jenkins, D.P.: Human Factors Methods, A Practical Guide for Engineering and Design. Ashgate Publishing Ltd, Aldershot, Hampshire (2005)
5. Annett, J., Duncan, K.D., Stammers, R.B., Gray, M.: Task Analysis. HMSO, London (1971)
6. Annett, J.: Hierarchical Task Analysis. In: Stanton, N.A., Hedge, A., Brookhuis, K., Salas, E., Hendrick, H.W. (eds.) Handbook of Human Factors and Ergonomics Methods. CRC Press LLC, Boca Raton (2004)
7. Preece, J., Rogers, Y., Sharp, H.: Interaction Design: beyond human-computer interaction. John Wiley & Sons Inc, New York (2002) additional ebook at http://www.id-book.com
8. Jenkins, T.: The Motivation of Students in Programming. In: Proceedings of the 6th ITiCSE Conference (ITiCSE ’01), Canterbury, UK, pp. 53–56. ACM Press, New York (2001)
9. Mitchell, M., Sheard, J., Markham, S.: Student motivation and positive impressions of computing subjects. In: Proceedings of the Australasian conference on Computing education, Melbourne, Australia, pp. 189–194. ACM Press, New York (2000)
10. Entwisle, N.: Motivation and Approaches to Learning: Motivating and Conceptions of Teaching. In: Brown, S., et al. (eds.) Motivating Students, pp. 15–23. Kogan Page, London (1998)
11. Cox, D.: A Pragmatic HCI Approach: Engagement by Reinforcing Perception with Functional Design and Programming. In: Proceedings of the 10th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, ITiCSE 05, Monte de Caparica, Lisbon, Portugal, pp. 39–43. ACM Press, New York (2005)
12. Cox, D.: Human Factors in the HCI Learning Process: A Survey. HCI International 2005. In: Proceedings of the 11th International Conference on Human-Computer Interaction, Human Factors Issues in Human-Computer Interaction. Las Vegas, USA. Mira Digital Publishing, St. Louis, CD-ROM (2005)
ORCHESTRA: Formalism to Express Static and Dynamic Model of Mobile Collaborative Activities and Associated Patterns
Bertrand David, René Chalon, Olivier Delotte, and Guillaume Masserey
LIESP Laboratory, Ecole Centrale de Lyon, 36, av. Guy de Collongue, 69134 Ecully Cedex, France
[email protected]
Abstract. Orchestra is a new formalism on which we are working in the field of cooperative systems design. In the CoCSys methodology for Cooperative Capillary Systems design, we transform partial scenarios describing particular cooperative situations into a more comprehensive Cooperative Behaviour Model (CBM). In this paper, we describe our contribution to the need for a graphical formalism able to express, in a natural way understandable by different actors (users, designers, developers, …), different cooperation situations in an ambient intelligence environment (mobile, context-aware, proactive and ubiquitous). ORCHESTRA is complementary to CTT and UML use cases, and its objective is to express clearly cooperation situations (explaining easily synchronous or asynchronous cooperation activities) and the role (active or passive) played at each moment by each actor. We take into account the main concepts of the “cooperative world”, which are Actors, Roles, Groups, Tasks, Processes, Artefacts (Tools and Objects) and Contexts (Platforms, Situations and Users). With the Orchestra formalism we try to express individual and collective behaviours on a sort of music staff. In this way we can model either individual work or organized collective activities. We present this formalism, its metamodel and associated patterns, which express typical configurations of cooperation and facilitate their reuse.
Keywords: CSoCW, Specific Description Language, MDA inspired elaboration process, transformation process, formalism meta-model, description patterns.
conversation activities in accordance with the definition initially proposed by Ellis [10] and adapted by several other authors [8]. This cooperative work can be done in several cooperative situations, characterized initially by Johansen and enhanced by Ellis [9]. CSCW systems are now becoming more and more mobile, context-aware and proactive. We call this kind of cooperative system a Capillary Cooperative System (CCS) [6], by analogy with the network of blood vessels. The purpose of the Capillary CS is “to extend the capacities provided by co-operative working tools in increasingly fine ramifications, hence they can use fixed workstations and handheld devices”. These systems are also becoming pervasive, proactive and ubiquitous. Our final goal is to allow them to evolve in a mixed reality environment (a mixture of real and digital objects and tools) and to put the Ambient Intelligence (AmI) concept into practice. In the following sections we briefly describe our methodology (section 2) and present the CBM content (section 3); we then discuss the formalism features and present the ORCHESTRA concepts (section 4). After that we discuss the pattern approach and give several patterns (section 5). Finally, an illustrative example (section 6), conclusions and perspectives close the paper.
2 Our Approach: CoCSys Methodology
We are studying the design of CSCW systems and we propose an approach and a process called the CoCSys (Collaborative Capillary System) engineering process. The main reason for this more comprehensive process is the need to allow this kind of system to evolve during its use, in step with the users’ skills and expertise and with the evolution of their perception and mastery of the system. Our approach is based on the Model-Based approach [17], which is characterized by a different way of development: “Rather than programming an interface using a toolkit library, developers would write a specification of the interface in a specialized, high-level specification language. This specification would be automatically translated into an executable program, or interpreted at run-time to generate the appropriate interface.” This approach has been used in HCI for several years and is becoming more widely used in other application development fields. The OMG adopted a similar approach as a new development paradigm, called MDA (Model-Driven Architecture) [14]. Other acronyms describing similar approaches are MDE (Model-Driven Engineering) and MDD (Model-Driven Development). In each case, specification at the concrete, abstract or meta level is privileged before studying how to produce executable code. The production is done more or less automatically by transformation or translation of these models. The objective of our approach is to adapt this trend to CSCW. We propose a framework for the design, implementation and evolution of CCS. As described in depth in [5, 7], this approach is based on three main parts: 1/ Scenarios Collection, 2/ Cooperative Behaviour Model (CBM), and 3/ Collaborative Architecture; and three transformation phases: I/ CBM Model Construction, II/ CBM Projection on the Collaborative Architecture and III/ Evolution.
3 Scenarios and Cooperative Behaviour Model
We consider that a scenario allows end users and designers to meet and discuss together the functionalities of the system to be developed. A scenario describes a repetitive activity that should activate an adaptation mechanism, which will be recorded and reused. For us, scenarios are short stories describing precise working situations which occur for different actors. This analytical perception of working situations seems able to capture and express the needs of observers or actors. We ask for as precise a description as possible, i.e. one that indicates, if possible, all actors involved, artefacts used, activities executed and contexts characterising them (devices used, geographical location, temporal situation, …). We collect these scenarios for different collaborative situations. In this way we can consider that this formulation of scenarios is possible, meaningful and useful. Whereas scenarios are short, limited stories, expressed mainly by different actors, the objective of the behaviour model is to discover the overall organization of the cooperative system, in which the main elements are actors, artefacts, tasks, processes and contexts. The designers are in charge of studying the different scenarios and gradually constructing the Cooperative Behaviour Model (CBM). In the model we find comprehensive collections of actors, artefacts, activities and contexts, and also all the relations which bring together the elements necessary for each activity. The different processes are also described, expressing dependencies between tasks and their temporal and organizational constraints. This comprehensive model is able to manage the cooperative system behaviour and will be used during the implementation process, i.e. the projection of this model onto particular hardware, network and software architectures.
The main elements of the CBM are:
• An actor, as an instantiation of one or several roles. A role is a basic element of human behaviour in the system, which can be qualified as Acting (A), Observing (O) or Editing (E), i.e. observing and acting.
• An activity, describing an identified piece of work which a role can do. This activity can also be A, O or E, i.e. an acting, observing or editing activity.
• A process, expressed as a network composed of process states (PS) and process transitions, which can also be qualified by A, O or E.
• An artefact, which can be either a tool or an object. The tool is an instrument used in the task; the object is either input, support or output of the task, qualified by A, O or E.
• A context, a collection of three aspects giving the platform, situation (often logical, physical or geographical location) and user preferences characterising the context. We take into account several platform examples and elements: laptop, PDA, cellular phone, and also active environmental objects (active RFID tags), passive environmental objects (passive tags), …
In the CBM all these elements are expressed and interconnected. Take as an example a user’s role, which is identified by a name, a type, its participation in different actors, the activities which can be done, the process states and transitions in which it can occur, the artefacts (tools and objects) manipulated and the contexts (platforms, situations and user preferences) which apply to the role. These interrelations are also needed for the other elements of the model. They are explicitly or implicitly described and can change during the system’s life, expressing its adaptation and evolution. The list of activities is one of the main components of the CBM. This list is
obtained from the task tree, which can be expressed in CTT [15], an interesting task formalism proposed by Paterno together with its environment (CTTE). Its extension for cooperative activities [13] aims to express cooperative situations. In CTT, collaboration is expressed by individual task trees and by a collaborative task tree. This is useful for expressing tasks, but is insufficient for the more comprehensive view of collaboration that we need. We consider that the tree view of tasks is useful during the task design phase. However, during the organization of activities (the definition of effective collaborations), it is mainly the effective activities (leaves of the task tree) that are important, and their individual or collaborative scope is essential, in relation with effective actors, objects, tools, process states and transitions, and contexts. To express this in a more comprehensive way we propose a new formalism called Orchestra [5].
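The CBM elements and their interconnections described in this section can be sketched as a small data model. This is a hedged illustration only and not the authors' metamodel (which is given in [5]): every class, field and example name below is an assumption chosen to mirror the elements named in the text (actors, roles, activities, artefacts, contexts and the A/O/E qualification).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Participation(Enum):
    ACTING = "A"
    OBSERVING = "O"
    EDITING = "E"  # observing and acting


@dataclass
class Context:
    platform: str          # e.g. laptop, PDA, cellular phone
    situation: str         # logical, physical or geographical location
    user_preferences: str


@dataclass
class Artefact:
    name: str
    kind: str              # "tool" or "object"
    participation: Participation


@dataclass
class Activity:
    name: str
    participation: Participation


@dataclass
class Role:
    name: str
    participation: Participation
    activities: List[Activity] = field(default_factory=list)
    process_steps: List[str] = field(default_factory=list)  # process states or transitions
    artefacts: List[Artefact] = field(default_factory=list)
    contexts: List[Context] = field(default_factory=list)


@dataclass
class Actor:
    name: str
    roles: List[Role] = field(default_factory=list)  # an actor instantiates one or more roles


def activities_of(actor: Actor) -> List[str]:
    """List the activities an actor can perform through all of its roles."""
    return [activity.name for role in actor.roles for activity in role.activities]


# Hypothetical example data.
designer = Role(
    name="Designer",
    participation=Participation.EDITING,
    activities=[Activity("Sketch layout", Participation.EDITING)],
    process_steps=["Drafting"],
    artefacts=[Artefact("Whiteboard", "tool", Participation.ACTING)],
    contexts=[Context(platform="PDA", situation="meeting room", user_preferences="default")],
)
reviewer = Role(
    name="Reviewer",
    participation=Participation.OBSERVING,
    activities=[Activity("Read proposal", Participation.OBSERVING)],
)
alice = Actor(name="Alice", roles=[designer, reviewer])

print(activities_of(alice))
```

Holding the relations on the role, rather than on the actor, follows the description of a role as the element to which activities, process steps, artefacts and contexts are attached.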
4 ORCHESTRA
The objective of Orchestra is to propose a more comprehensive formalism which is able to express together all the main aspects of the CBM. ORCHESTRA adapts musical score notation [18] to our problem of CBM description. For us, the five lines of a staff express five main aspects of the CBM (Fig. 1): the user’s role, the activity concerned, the process state or transition, the artefacts involved in the activity, and the context. These aspects are expressed on their respective lines by placing one or several “notes” containing their names. Each note can receive a stem which indicates the participation of the element (acting, observing or editing). We distinguish the main actor (double arrow) from a secondary actor (simple arrow), as well as an active role from a passive role. A bar line indicates the separation between independent cooperation episodes. To express repetition of an episode we propose four options: an explicit number of repetitions (n), an undetermined number of iterations (+/*), a contextual end (logical condition), and a time-dependent end of iteration (relative or absolute time limit). Each cooperation episode expresses a state or a transition in the cooperation process description network. Within each cooperation episode, sequential ordering from left to right is the implicit temporal option; any other order must be expressed explicitly, either by a jump from the current episode to another, named one, or by a “procedure call” jump to a named episode followed by a return to the previous one. With different types of parentheses, we indicate explicit relations between participating notes. These parentheses express different situations: (…) alternatives, {…} mandatory participation, […] optional participation.
Different key signatures express collaboration properties such as synchronous or asynchronous collaboration, collaboration modes and styles of coordination (computational or social ., implicit z or explicit ---):
@ - Asynchronous with infinite answer delay
@@ - Asynchronous with limited answer delay, corresponding to “on call” participation
& - Synchronous “in-meeting” cooperation
&& - Synchronous “in-depth” cooperation
In synchronous collaboration two different kinds of participation must be distinguished:
• instantaneous, short-term collaboration, also called implicit and expressed by z, e.g. a vote activity,
• long-term participation or collaboration, also called explicit and expressed by gg, e.g. a sketching activity.
Fig. 1. ORCHESTRA main concepts (staff lines: Role, Activity, Process, Artifact, Context; note stems: Acting, Observing, Editing)
In the first case (vote activity) an implicit collaboration is appropriate (short exclusive access to the shared space). In the second case (sketching) explicit participation must be requested and granted (long-term access to the shared space), either by social coordination, i.e. one of the human actors is in charge of this coordination, or by computational coordination, i.e. the computer performs it. Graphically, we express instantaneous collaboration by a dot over the chords concerned; for long-term collaboration we use a horizontal line and a symbol expressing social or computational coordination, i.e. coordination made by one of the actors or by interaction (asking for, receiving and returning the exclusive access right to the shared space). Another important notion in CSCW is awareness. Its objective is to let the different actors know (or not) what has been done by a given actor. The scope of information propagation to other actors must be decided either statically (by the designer) or dynamically (by the actor himself). For the static case, we propose to express awareness in the ORCHESTRA formalism, with special marks for:
• no awareness,
• partial awareness (for specific actors),
• overall awareness (for all actors).
For a more detailed account of the ORCHESTRA formalism, we give its metamodel in [5], which contains the ORCHESTRA and CBM metamodels.
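The concepts of this section can be sketched as a small data model. The following Python fragment is only an illustration under stated assumptions: the class and field names (`Cooperation`, `Awareness`, `Chord`, `Episode`) are hypothetical and do not come from the ORCHESTRA metamodel in [5]; only the four key signatures, the three awareness levels and the five staff lines are taken from the text. The sample values are those of the report-writing example of Fig. 3a.

```python
from dataclasses import dataclass, field
from enum import Enum


class Cooperation(Enum):
    """The four key signatures listed in the text."""
    ASYNC_INFINITE_DELAY = "@"
    ASYNC_ON_CALL = "@@"
    SYNC_IN_MEETING = "&"
    SYNC_IN_DEPTH = "&&"

    @property
    def synchronous(self) -> bool:
        return self.value.startswith("&")


class Awareness(Enum):
    NONE = "none"
    PARTIAL = "partial"
    OVERALL = "overall"


@dataclass
class Chord:
    """One note name on each of the five staff lines (hypothetical structure)."""
    role: str
    activity: str
    process: str
    artifact: str
    context: str


@dataclass
class Episode:
    """A cooperation episode, i.e. what sits between two bar lines."""
    name: str
    cooperation: Cooperation
    chords: list[Chord] = field(default_factory=list)
    awareness: Awareness = Awareness.NONE

    def roles(self) -> set[str]:
        return {chord.role for chord in self.chords}


report = Episode(
    name="Report writing",
    cooperation=Cooperation.ASYNC_ON_CALL,
    chords=[Chord("Student", "Report writing", "Report writing",
                  "Report sheet", "PC")],
)
print(report.cooperation.synchronous, sorted(report.roles()))
```

Stems, repetition marks and parentheses are deliberately left out; they could be added as further fields once their semantics are fixed.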
5 Patterns
As initially expressed by Christopher Alexander [1]: “A pattern is a careful description of a perennial solution to a recurring problem within a building context, describing one of the configurations which brings life to a building.” In software engineering, a design pattern describes a family of solutions for a given software design problem. The pattern is not the solution itself, but a solution framework. The final goal of design patterns is the reusability of software design knowledge. Patterns can be used for different reasons: to improve team communication, to document the state of the art, and to reflect main concepts. Patterns can also help to understand, clarify and document design decisions. They can help to avoid design drift and can improve code structure and code quality. For these reasons, patterns can be useful everywhere (in process, product and activity) as reusable problem-context-solution descriptions. Methodological, functional, process, analysis, scenario, testing and evaluation patterns have been proposed and are useful, as are design, HCI and UI patterns. In our case, we propose patterns for ORCHESTRA whose objective is to express the main cooperation situations in a reusable manner. Our approach to patterns is related to those of Alexander [1], Gamma [11], Borchers [3] and Seffah [12], but we adopt a more comprehensive and generic definition: a pattern is a collection of elements and their relationships, which can be repeatedly encountered or used in the analysis, design, development and use of cooperative systems: Pattern = Problem + Context + (potential) Solution(s). It seems important to highlight the convergence of interest between different pattern users.
In HCI design and groupware design, patterns are useful for the designer (professional) as an expression of best practices, standardization and usability; they are also useful for the final user for standardization (the same thing is done in the same manner in different situations), learnability and usability. Figure 2 gives an open-ended list of ORCHESTRA patterns. They are either finalized chords with appropriate annotations or, more usually, incomplete configurations which can be completed during the instantiation process. Chords are mainly generic, i.e. role, activity, process, artefact (tool or object) and context can be chosen from the corresponding concrete application field instances. An ORCHESTRA pattern is a schema with one or several chords constituting cooperation episode(s), organized temporally and associated with a particular configuration of complementary annotations expressing the nature of cooperation (synchronous or asynchronous), the level of cooperation (asynchronous with infinite delay, on call, in meeting or in-depth cooperation), the coordination style (social or computational), the nature of coordination (implicit or explicit) and awareness (overall, partial or no awareness). To exemplify this approach, we present six important cooperation patterns:
• Intervention appointment: Synchronous or asynchronous, on-call or in-meeting cooperation with computational implicit coordination and no awareness.
• Consultation – vote: Synchronous, in-meeting cooperation with computational and implicit coordination and either overall awareness or no awareness.
• Presentation: Synchronous and in-meeting cooperation with social and explicit coordination with overall awareness.
• In-depth work: Synchronous, in-depth cooperation with computational explicit coordination and partial awareness.
• Questions / Answers: Synchronous, in-meeting activity with social or computational explicit coordination.
• Validation: Asynchronous, on-call cooperation with implicit coordination and no awareness.
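The six patterns listed above can be written down as a small lookup table, which makes it easy to ask which patterns are candidates for a given level of cooperation. This is a minimal sketch, not part of ORCHESTRA itself: the dictionary layout and the helper `candidates` are hypothetical, and a value of `None` marks a property that the list above does not state for that pattern.

```python
# Timing follows from the cooperation level, as defined by the key signatures.
LEVEL_TIMING = {
    "infinite-delay": "asynchronous",
    "on-call": "asynchronous",
    "in-meeting": "synchronous",
    "in-depth": "synchronous",
}

# One entry per pattern; None means "not stated in the pattern list".
PATTERNS = {
    "Intervention appointment": {
        "level": {"on-call", "in-meeting"},
        "coordination": {"computational"},
        "nature": "implicit",
        "awareness": {"none"},
    },
    "Consultation - vote": {
        "level": {"in-meeting"},
        "coordination": {"computational"},
        "nature": "implicit",
        "awareness": {"overall", "none"},
    },
    "Presentation": {
        "level": {"in-meeting"},
        "coordination": {"social"},
        "nature": "explicit",
        "awareness": {"overall"},
    },
    "In-depth work": {
        "level": {"in-depth"},
        "coordination": {"computational"},
        "nature": "explicit",
        "awareness": {"partial"},
    },
    "Questions / Answers": {
        "level": {"in-meeting"},
        "coordination": {"social", "computational"},
        "nature": "explicit",
        "awareness": None,
    },
    "Validation": {
        "level": {"on-call"},
        "coordination": None,
        "nature": "implicit",
        "awareness": {"none"},
    },
}


def candidates(level):
    """Names of the patterns that allow the given cooperation level."""
    return sorted(name for name, p in PATTERNS.items() if level in p["level"])


def timings(name):
    """Synchronous and/or asynchronous, derived from the pattern's levels."""
    return {LEVEL_TIMING[level] for level in PATTERNS[name]["level"]}


print(candidates("on-call"))
print(timings("Intervention appointment"))
```

Deriving the timing from the level avoids recording combinations such as "asynchronous, in-meeting" that the key signatures do not define.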
Fig. 2. Characterisations of several ORCHESTRA patterns (columns: Pattern, S/As, Coop, Coord, Exp/Imp, Aware, Coop. configurations; the Coord, Exp/Imp, Aware and Coop. configurations columns use graphical symbols)
Pattern | S/As | Coop
Intervention appointment | S/As | &/@@
Consultation – Vote | S | &
Presentation | S | &
In-depth work | S | &&
Questions / Answers | S | &
Validation | A | @@
Figure 3a gives the ORCHESTRA description of a report writing activity, which is an instantiation of the validation pattern, and Figure 3b gives the description of a test activity, which is an instantiation of the vote pattern. Names inside the notes are formal; they receive their final names during the instantiation of patterns.
Fig. 3. Two ORCHESTRA patterns instantiations
a - Individual activity “Report writing” (key signature @@). Role: Student; Activity: Report writing; Process: Report writing; Artifact: Report sheet; Context: PC, PDA.
b - Collaborative test activity preparation, execution and treatment (key signature &). Teacher staff: Role: Teacher; Activities: Test submission, Test evaluation; Process: Test-State; Artifacts: Test sheet, Result sheet; Context: PC. Student staff: Role: Student; Activity: Test answering; Process: Test-State; Artifact: Test sheet; Context: PC, PDA.
6 Case Study: Heating Equipment Maintenance Activities
To illustrate the use of the ORCHESTRA formalism, we apply it to heating equipment maintenance activities (Fig. 4) involving six actors: client, secretary, technician, supervisor, expert and clerk. The main scenarios of the maintenance process are the following:
• A client (secondary actor), observing a problem with his heating equipment, phones the repair company to ask for an intervention. The secretary (secondary actor) asks for his profile (address, equipment…), finds him in the database and organizes an appointment with a technician. State: Appointment, Actors: Client, Secretary, Properties: &
• In the morning, before leaving the company, the technician (main actor) loads onto his PDA the information needed for his round (clients and their addresses, nature of intervention …). State: Init, Actor: technician, Properties: @
• At the client’s house, the technician works on the maintenance process: he can study the history file of the supplies and the blueprints to elaborate a diagnosis using appropriate tools, and repair, or ask for spare parts. State: Work, Actors: Client, Technician, Properties: &
• If he cannot establish a diagnosis alone, he can contact his manager (secondary actor) to ask for help and to exchange information. He can also contact, in a synchronous manner, the heating manufacturer’s expert (secondary actor) to study the situation with him. State: Coop, Actors: Technician, Manager, Expert, Properties: &&
• At the end of his round, the technician, back at the company, updates the history file of the visited equipment and gives his intervention statement. State: End, Actor: Technician, Properties: @
• Next day the clerk (secondary actor) produces the financial balance and statement of accounts and either sends the bill to the client or integrates it in the client record. State: FB (Financial Balance), Actor: clerk, Properties: @
Fig. 4. Different ORCHESTRA description of heating maintenance activities example (episodes: & App., @ Init, & Work, && Coop, @ End, @ FB; staffs for Client, Technician and Manager, each with lines R, A, P, A, C)
Figure 4 shows the ORCHESTRA model of this case study and the associated patterns. Individual activities are expressed on one staff. A collaborative activity needs several staffs, one for each role. In this way we describe on the same sheet the participation of each actor in the collaborative episode, which makes it easier to understand.
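The state, actor and property annotations of the six scenarios can be collected in a plain table, from which the number of staffs per episode follows directly. The sketch below is illustrative only: the tuple layout and the helper names `staffs_needed` and `collaborative` are assumptions, while the states, actors and key signatures are copied from the scenario list.

```python
# (state, actors, key signature), copied from the scenario list of the case study.
EPISODES = [
    ("Appointment", ["Client", "Secretary"], "&"),
    ("Init", ["Technician"], "@"),
    ("Work", ["Client", "Technician"], "&"),
    ("Coop", ["Technician", "Manager", "Expert"], "&&"),
    ("End", ["Technician"], "@"),
    ("FB", ["Clerk"], "@"),
]


def staffs_needed(episodes):
    """One staff per participating role, as stated for collaborative episodes."""
    return {state: len(actors) for state, actors, _ in episodes}


def collaborative(episodes):
    """States that involve more than one actor and hence several staffs."""
    return [state for state, actors, _ in episodes if len(actors) > 1]


print(staffs_needed(EPISODES))
print(collaborative(EPISODES))
```

In this case study every episode with more than one actor carries a synchronous key signature (& or &&); the sketch does not enforce this, since the formalism also defines asynchronous cooperation between several actors.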
7 Conclusion
In this paper, we outlined a new formalism called ORCHESTRA, whose objective is to provide a graphical expression of the Cooperative Behavior Model (CBM). The CBM, elaborated from a collection of scenarios, serves as a reference for the transformation process allowing different implementations. As it is important to associate different actors with this constructive process, we propose a formalism which can be used during initial discussions as well as during the implementation and adaptation process. We presented a set of reusable patterns which help to accelerate the design process and make it more powerful. We propose to use them in a pattern oriented walkthrough, in which patterns
are considered as best practices, as a collection constituting an inspiration sourcebook and as a use guide. We presented the use of ORCHESTRA on a case study. Of course, ORCHESTRA gives a global view of cooperation. An in-depth view is necessary to describe completely the content of “notes” with the help of an editor. ORCHESTRA has been tested in several case studies and we may continue to extend it with new concepts as a result of these tests. The connection with mixed reality has not been described in this paper, although we are currently working on it.
References
1. Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., Fiksdahl-King, I., Angel, S.: A Pattern Language: Towns, Buildings, Construction, p. 1216. Oxford University Press, New York (1977)
2. Andriessen, J.H.E.: Working with Groupware: Understanding and Evaluating Collaboration Technology. In: CSCW Series, p. 206. Springer, Heidelberg (2003)
3. Borchers, J.O.: A Pattern Approach to Interaction Design, pp. 369–378. ACM Press John Wiley & Sons, New York (2000)
4. Chalon, R., David, B.: IRVO: an Interaction Model for designing Collaborative Mixed Reality Systems. In: HCI International 2005, Las Vegas, USA, pp. 22–27 (July 2005)
5. David, B., Chalon, R., Delotte, O., Masserey, G., Imbert, M.: ORCHESTRA: formalism to express mobile cooperative applications. In: Dimitriadis, Y.A., Zigurs, I., Gómez-Sánchez, E. (eds.) CRIWG 2006. LNCS, vol. 4154, pp. 163–178. Springer, Heidelberg (2006)
6. David, B., Chalon, R., Vaisman, G., Delotte, O.: Capillary CSCW. In: Proceedings of HCI International, Crete (2003)
7. David, B., Delotte, O., Chalon, R.: Model-Driven Engineering of Cooperative Systems. In: Proceedings of HCI International 2005, Las Vegas, USA, pp. 22–27 (July 2005)
8. David, B.: IHM pour les collecticiels. Réseaux et Systèmes Répartis, Hermès, Paris 13, 169–206 (2001)
9. Ellis, C., Gibbs, S.J., Rein, G.L.: Groupware: some issues and experiences. Communications of the ACM 34(1), 38–58 (1991)
10. Ellis, C., Wainer, J.: A conceptual model of Groupware. In: Proceedings of CSCW’94, pp. 79–88. ACM Press, New York (1994)
11. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns, Elements of Reusable Object-Oriented Software. Addison-Wesley Publishing Company, Reading, MA (1995)
12. Javahery, H., Seffah, A., Engelberg, D., Sinnig, D.: Migrating User Interfaces between Platforms Using HCI Patterns. In: Seffah, A., Javahery, H. (eds.) Multiple User Interfaces: Multiple-Devices, Cross-Platform and Context-Awareness, pp. 241–259. Wiley (2003)
13. Mori, G., Paternò, F., Santoro, C.: CTTE: Support for Developing and Analyzing Task Models for Interactive System Design. In: IEEE Transactions on SE, vol. 28(9) (2002)
14. Object Management Group http://www.omg.org/mda/
15. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Applied Computing Series. Springer, Heidelberg (2000)
16. Rosson, M.B., Carroll, J.M.: Usability engineering: scenario-based development of human-computer interaction. Morgan Kaufmann, San Francisco (2002)
17. Szekely, P.: Retrospective and Challenges for Model-Based Interface Development. In: Vanderdonckt, J. (ed.) CADUI’96, June 5-7, 1996, Namur (1996)
18. Stewart, D.: The Musician’s Guide to Reading and Writing Music. Backbeat, p. 117 (1999)
Effective Integration of Task-Based Modeling and Object-Oriented Specifications
Anke Dittmar and Ashraf Gaffar
Abstract. This paper proposes an integration of task modeling and object-oriented analysis approaches. We argue that task-based approaches are more appropriate to analyze existing working situations and to elicit user needs. In subsequent stages like design and implementation, an object-oriented approach is warranted since most of the developer’s skills, techniques, and tools are better matched to object-oriented representations. We show that such amalgamation, when supported by systematic transformation from a goal- and action-oriented perspective to “thinking in objects”, can have several advantages for both approaches.
into OOA is warranted since the developers’ design environment lends itself easily to object-oriented representations. We show that such amalgamation and transfer of knowledge between goal- and action-oriented perspective and “thinking in objects” has several advantages. First, it can help overcome some well-known traps like the “knowing the correct stopping point” problem in task analysis. Second, it facilitates a smooth integration of two main principles of design: reusing well-proven concepts as well as generating new ideas. Third, developers, and stakeholders in general, might be more aware of the transition from current working practices to envisioned ones. Due to the inherently informal nature of users, and hence requirements, current task modelling relies mainly on informal derivation of user needs. OOA will value more formalism, leading eventually to a formal artefact in terms of a source code. To facilitate a smooth transition into the OOA, we need to reduce this gap by introducing formalism into the tasks. In this paper, we use TaOSpec [2], to specify tasks. It is further claimed that the decomposition of tasks into sub-tasks should stop at a level that seems to reveal how people interact with the task domain objects and what the probable outcome is. This level can be seen as action-oriented with a resulting set of user actions and desired system responses. We suggest that, at this level, “interaction elements” can be created and described in an object-oriented way. Such elements approximate our basic understanding of how corresponding domain objects behave and can be used. They also reflect constraints on this behaviour given by the task. Additional design principles like affordance and visualization [7] as well as design patterns can guide this transformation step. Often, “interaction elements” lead to other actions as well, which help to evolve our task understanding. 
We think that the development of design models needs more rigorous transformations from task models into object-oriented models. We are interested in exploring how task analysis models can influence design decisions and corresponding object-oriented specifications and how those influence envisioned task models again.
2 Related Work
The need for the integration of task modelling and OOA has been recognized by researchers. OOA tends to trim down the early phase of identifying user needs in interactive systems. Fixed scenarios and use cases with informal documentation and analysis are used. Only later stages use more formal models and techniques. Use cases only describe how users interact with the system, but they inadequately describe the overall context. Explicit descriptions of current working practices are rarely found; therefore they do not allow for a rigorous analysis of user requirements. Task models provide better analysis, but there is room for improvement. Most of the current work in model-based design is based on CTT [11]. CTT models clearly describe the hierarchical and sequential decomposition of tasks but not the task domain. This incomplete conceptual understanding makes it hard to map task models into OOA because the change from an action-oriented perspective to an object-oriented one is poorly supported. In [8] and [1] alternative UML notations for task trees in CTT notation are proposed. [5] suggests a mapping from tasks specified in a CTT model to methods in a UML class diagram. Other approaches like Groupware Task Analysis
(GTA) [13] identify the need for a broader design view by combining other aspects into task analysis. The GTA framework focuses on modelling four main facets of an interactive system: tasks, objects, roles and agents. In our approach, we assume that a task analysis can result in formal task models which also specify the task domain objects to be manipulated and used during task execution. Those objects assigned to leaf nodes of current task models are the starting point for a smooth transition into OOA models.
3 A Design Example
We use the simple example of solving the puzzle shown in Fig. 1 to illustrate the idea of an integrated task-based and object-oriented design approach. Given are nine square cards (made of cardboard). The diagonals divide the 4 corners of each card into 8 partitions. Each partition contains one, two, or three stars. The cards have to be arranged in a 3x3 grid under the following condition: for every two neighbouring cards, each pair of adjacent partitions must have the same number of stars. The design goal is to support the task of solving the puzzle by an appropriate interactive system.
Fig. 1. “The mad starry sky” (author: Manfred Schüling, 1997)
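The matching condition of the puzzle can be made concrete with a small sketch. The encoding used here is an assumption made for illustration and is not taken from the paper: a card is a tuple of eight star counts, listed clockwise and starting with the left half of the top edge, so that every edge of the card carries two partitions. The function names are hypothetical as well.

```python
# Assumed encoding (not from the paper): 8 star counts, clockwise, as
# (top-L, top-R, right-T, right-B, bottom-R, bottom-L, left-B, left-T)


def fits_left_of(a, b):
    """True if card a may lie directly to the left of card b."""
    # a's right edge (top half, bottom half) meets b's left edge.
    return a[2] == b[7] and a[3] == b[6]


def fits_above(a, b):
    """True if card a may lie directly above card b."""
    # a's bottom edge (left half, right half) meets b's top edge.
    return a[5] == b[0] and a[4] == b[1]


a = (1, 2, 3, 1, 2, 3, 1, 2)
b = (2, 2, 1, 1, 3, 3, 1, 3)
print(fits_left_of(a, b), fits_left_of(b, a), fits_above(a, b))
```

With this encoding, checking a complete 3x3 layout amounts to applying `fits_left_of` to every horizontal pair and `fits_above` to every vertical pair of neighbouring cards.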
3.1 Formal Representations and Their Intertwined Development
Fig. 2 shows recommended formal representations for our approach. Although an intertwined development of models is assumed, as in [3], this process is also ‘directed’. A solid arrow from model A to model B indicates that A serves to develop B. However, such a process is always accompanied by revisions of A (dashed arrow). (1) Current task models help to discuss a new work allocation (Sect. 3.3). (2) Envisioned task models serve to find requirements on the system (Sect. 3.4). Here, we consider functional requirements as well as some UI requirements (usefulness and usability). (3) Current task models help to find functional specifications (Sect. 3.5). (4) The requirements specification informs the development of a design model (Sect. 3.5). Task models can never describe human activity in its full richness. While in [4] a general task model is recommended, we rather assume a set of current and envisioned formal task specifications describing some aspects of the ways people may follow to
accomplish a certain task. These descriptions are used in the design process. Later on in this paper, we enrich this view by showing where patterns and affordances can guide the development of models (solid lines) but also encourage designers to go back to other models in order to revise or refine them (dashed lines).
Fig. 2. Formal representations in the design approach of interactive systems
3.2 Task Analysis
Task analysis involves data gathering by observation, interviewing etc. In the context of this paper, it is transformed into formal task models which focus on the description of goal-oriented human activity. As mentioned, we use TaOSpec to describe the hierarchical decomposition of tasks, temporal constraints between sub-tasks, and task domain objects in terms of attributes and object states. We then show the role of describing the domain objects in the refinement process.
Example: Current Task Model
Fig. 3 illustrates a first iteration to decompose the example task and to define temporal relations between sub-tasks. We point out that modelling processes are characterized by decisions about what to focus on. Developers should feel free to go back and refine the models if necessary. In the example, cooperative aspects are not described explicitly. Furthermore, advanced strategies like grouping of cards to facilitate the solving process were not specified here for the sake of simplicity.
Fig. 3. Hierarchical and sequential description of current task “Solve puzzle”
In Fig. 4, TaOSpec elements Card, FreeCards, GridPlace, and Grid specify the task domain. c1, c2, c3, c4 are instance elements of Card (denoted by ‘::’) and describe the card illustrated below in “standard position” and rotated by 90, 180, 270 degrees; gp is an instance of GridPlace (its attribute $card: c2 does not fit this place).
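The four instances c1 to c4 of one card can be generated mechanically once a card encoding is fixed. The following sketch is not TaOSpec; it assumes, for illustration only, that a card is a tuple of eight star counts listed clockwise from the left half of the top edge, so that a clockwise quarter turn shifts the tuple by two positions. The names `rotate` and `positions` are hypothetical.

```python
def rotate(card, quarter_turns=1):
    """Rotate a card clockwise by a multiple of 90 degrees.

    Assumed encoding: 8 star counts, clockwise from the left half of the
    top edge, so one quarter turn moves every value two places onward.
    """
    shift = (2 * quarter_turns) % 8
    return card[8 - shift:] + card[:8 - shift]


def positions(card):
    """Standard position plus the rotations by 90, 180 and 270 degrees."""
    return [rotate(card, q) for q in range(4)]


c1, c2, c3, c4 = positions((1, 2, 3, 1, 2, 3, 1, 2))
print(c1, c2, c3, c4, sep="\n")
```

A card with a symmetric star pattern would yield fewer than four distinct positions, which a solver could exploit to avoid redundant attempts.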
Fig. 4. Task domain objects for the example task
Now, using this task domain description, we can refine the task model of Fig. 3 by preconditions and effects on the task domain as shown in Fig. 5. Task models combining if-then knowledge (by task domain objects and their states) and procedural knowledge (temporal relations between sub-tasks) are more expressive and often more concise. For example, sub-task lay out is no longer specified as optional but the selected card ($c) has to match with the grid place, that is $gp has to be in state OK (see (*) in Fig. 5). Thus, a formal description of the task domain might help to get a clearer understanding. Analysts and designers might be encouraged to revise structures like the task hierarchy. So, sub-task lay out card of Fig. 3 is completely revised because it turned out that sub-task select free card stands for two quite different actions. The model fragment in Fig. 5 makes a clearer distinction between two strategies one can observe. People take a card and look for a matching place in the grid or vice versa.
Fig. 5. Revised sub-task lay out card of Fig. 3
Finding the Stopping Point
It is widely accepted that a deeper task understanding, supported by external task representations, leads to better designs. However, “[a] key problem that faces task analysts is knowing the correct stopping point: Too little detail, and the designer does not receive a full specification of requirements; too much, and the designer's options are overly constrained” [9]. We propose that the refinement of a task model should stop at a level where a task-independent, “natural understanding” about how to act on domain objects is achieved. Besides the task of solving the puzzle, people also look at
the pattern of a card, grab a card, lift, rotate, and move it, turn it over, let it drop down from a table, or throw it into the air. Of course, we could refine sub-task select free card of Fig. 5 by saying that a person has to grab a card out of the set of free cards, then lift it, then move the arm etc. However, this description seems not to deepen our task understanding but to inflate the task model. In the example, it was also decided to consider sub-tasks look for matching free grid place and look for matching free card as basic ones. In [7], affordances are introduced. They “provide strong clues to the operations of things”. We propose to stop a task description when the task-independent affordances of domain objects seem to be revealed.
Task Models vs. Task Domain Objects
A comparison of the descriptions of cards in the previous paragraphs shows important differences between (goal-oriented) task models and task domain objects. The goal-oriented task model focuses on the tasks needed to achieve a goal but ignores indirect tasks or sub-tasks, which constrains our understanding of how to act on domain objects. At an early stage of analysis, such a restriction facilitates modelling, which otherwise would be too complex. Task domain objects, on the other hand, are models of “real-life artefacts” in the domain, which focus on their use or manipulation with respect to the corresponding tasks. In a later stage of refinement, this brings more details to the model, and the refinement is more suitable for a transition into objects. For example, the task model first suggests actions like select a card, compare it with the context of a grid place (see Fig. 4), lay it out, or remove it from the grid, with the main focus on how to solve the problem. The task domain object Card looks at the object that will be used to solve the problem, and hence implicitly imposes certain positions of a card, which we get by rotations of 90, 180, 270 degrees. This brings the analysis model closer to the OOA paradigm. We argue that abstractions like goal-oriented task models are necessary to inform as well as to direct the early design process. However, they are not necessarily sufficient. Further analysis and refinement using task domain objects help to bring the models closer to a quality design.
3.3 Task Design – Define the Task Allocation Space
Envisioned task models reflect task design decisions. They reflect the level of automation, new forms of cooperation, or how to apply new artifacts. For example, we might decide that we want a system where users can ask for a partial solution.
The system represents it in an abstract form so that the user still has to arrange the “interactive cards” in order to be able to see a “mad starry sky” (one reason could be that users feel more comfortable if they still have to move cards). It is beyond the scope of this paper to argue why these decisions were made. Fig. 6 illustrates parts of a possible envisioned task model. For specifying sub-task lay out solution, corresponding parts of the existing task model were reused. This also includes a reuse of task domain objects. Current practices were considered worth maintaining. In addition, sub-task ask for solution automates the solution process of the puzzle.
Fig. 6. Part of an envisioned task model of “solve puzzle”, and an initial OOA model
3.4 Requirements Specification – Define the Design Space
Envisioned task models give hints for initial requirements, here specified as OOA models. Task domain descriptions of interactive objects are transformed into classes (in Fig. 6: Solution and InteractiveCard; the grid and the set of free cards are still seen as conceptual task domain objects). Sub-tasks which have to be supported by the interactive system under design (underlined in the figure) are used to find appropriate methods of classes (functional requirements). In the example, the following mappings were applied.
− select free card, look for matching free card, select card in grid, select suggested card → select()
− lay out, remove card, look for matching free grid place → move()
− ask for solution → solve(), show()
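The mapping from sub-tasks to methods can be checked mechanically against class skeletons. The sketch below is only an illustration: the assignment of select() and move() to InteractiveCard, the attributes, and the helper `unsupported` are assumptions; the class names, the method names and the sub-task names are those of the text, and show() on InteractiveCard follows the remark in Sect. 3.4.

```python
class InteractiveCard:
    """OOA class for a card on screen (attributes are assumptions)."""

    def __init__(self, ident):
        self.ident = ident
        self.selected = False
        self.place = None  # None: the card is still among the free cards

    def select(self):
        self.selected = True

    def move(self, place):
        self.place = place

    def show(self):
        return f"card {self.ident} at {self.place}"


class Solution:
    """OOA class for an abstract solution; solve() is left open here."""

    def solve(self):
        raise NotImplementedError

    def show(self):
        return "abstract solution"


# Sub-task -> (class, methods), following the three mappings in the text.
SUBTASK_TO_METHODS = {
    "select free card": (InteractiveCard, ["select"]),
    "look for matching free card": (InteractiveCard, ["select"]),
    "select card in grid": (InteractiveCard, ["select"]),
    "select suggested card": (InteractiveCard, ["select"]),
    "lay out": (InteractiveCard, ["move"]),
    "remove card": (InteractiveCard, ["move"]),
    "look for matching free grid place": (InteractiveCard, ["move"]),
    "ask for solution": (Solution, ["solve", "show"]),
}


def unsupported(mapping):
    """Sub-tasks whose mapped methods are missing from the target class."""
    return [task for task, (cls, methods) in mapping.items()
            if not all(callable(getattr(cls, m, None)) for m in methods)]


print(unsupported(SUBTASK_TO_METHODS))
```

Such a check is a cheap way to notice when a revision of the envisioned task model introduces a sub-task that no class supports yet.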
In addition, method show() was assigned to class InteractiveCard. Task models are also useful for deriving initial behavioral specifications (e.g. state charts, activity diagrams in OOA).
3.5 OOD
Before we illustrate how patterns and affordances can guide the design, we point out another advantage of formalized task knowledge. Current task models often support functional specifications. Basically, to automate a sub-task means to determine the corresponding preconditions and then to select a subset of possible ways to perform the sub-task, which can be translated into an executable program. In the example, a specification of the method solve() of class Solution could look like this:
select grid place (2,2), look for matching free card, // (1)
select grid place (2,1), look for matching free card, // (2)
…
select grid place (3,3), look for matching free card // (9)
where a backtracking mechanism is assumed to go back if action look for matching free card was not successful. The context of matching free card is known in each line (1)-(9). The card, which is chosen in line (2), has to be a “valid left neighbour” of the card chosen in line (1) and so on.
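The backtracking idea can be sketched in a few lines of Python. This is not the Prolog specification mentioned with Fig. 7, and it differs from the specification above in two labelled ways: grid places are filled in row-major order instead of starting at (2,2), and a card is encoded, as an assumption, by eight star counts listed clockwise from the left half of the top edge. The helper `make_puzzle` is hypothetical and only builds a card set that is known to be solvable.

```python
import random


def rotations(card):
    """The four positions of a card (clockwise quarter turns)."""
    return [card[8 - 2 * q:] + card[:8 - 2 * q] for q in range(4)]


def fits(grid, index, card):
    """Check a card against its left and upper neighbours (row-major 3x3)."""
    row, col = divmod(index, 3)
    if col > 0:
        left = grid[index - 1]
        if (left[2], left[3]) != (card[7], card[6]):
            return False
    if row > 0:
        above = grid[index - 3]
        if (above[5], above[4]) != (card[0], card[1]):
            return False
    return True


def solve(free_cards, grid=()):
    """Return a tuple of nine placed cards, or None if no layout exists."""
    if len(grid) == 9:
        return grid
    for i, card in enumerate(free_cards):
        rest = free_cards[:i] + free_cards[i + 1:]
        for position in rotations(card):
            if fits(grid, len(grid), position):
                result = solve(rest, grid + (position,))
                if result is not None:
                    return result
    return None  # dead end: the caller tries its next card or rotation


def make_puzzle(rng):
    """Build nine cards from a valid layout, then rotate and shuffle them."""
    grid = []
    for index in range(9):
        card = [rng.randint(1, 3) for _ in range(8)]
        row, col = divmod(index, 3)
        if col > 0:
            card[7], card[6] = grid[index - 1][2], grid[index - 1][3]
        if row > 0:
            card[0], card[1] = grid[index - 3][5], grid[index - 3][4]
        grid.append(tuple(card))
    cards = [rng.choice(rotations(c)) for c in grid]
    rng.shuffle(cards)
    return cards


cards = make_puzzle(random.Random(0))
layout = solve(cards)
print(layout is not None)
```

Returning `None` from a recursive call plays the role of "look for matching free card was not successful": the enclosing call then continues with its next rotation or its next free card.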
Fig. 7. Some abstract solutions of the puzzle (supplied by an executable specification in Prolog) and interactive cards in an almost 3x3 grid representing the first given abstract solution. On the right side, there are OOD-classes InteractiveCard and Solution.
In addition, the specification of current task domain objects (see Fig. 4) can help to find representations of classes in OOD models. For example, Grid and Card influenced the refinement of OOA-classes Solution and InteractiveCard. However, an identifier for each interactive card had to be introduced for the interactive system to give users a hint how to interpret an abstract solution. In Fig. 7, they are represented by numbers between 1 and 9 in the centre of each card.
4 Guidance by Patterns and Affordances
4.1 Pattern Application
So far, task-specific knowledge was employed to derive design ideas for a supporting interactive system. Patterns in OOA, as general solutions to recurring design problems, rather describe opportunities afforded by software applications in general. They allow designers to extend OOA models or to restructure them in order to prepare them for further extensions or refinements. A pattern application often entails a revision of other models in use. This is illustrated through a small example in Fig. 8. A list pattern applied to class Solution results in a structure to manage a list of
Fig. 8. A pattern applied to class Solution and two revisions of the envisioned task model
solutions. However, this was not considered in the envisioned task model so far. Two possible revisions are shown on the right side of the figure: a) a user can ask for the next solution but has only one set of nine cards to lay the grid, b) a user can handle several sets of cards to lay out several grids in parallel (*| denotes the operator for instance iteration). The decision for a) and/or b) has an influence on the design of the user interface.
4.2 Smooth Transition from Tasks to Concrete Design
An affordance emerges as a relationship that holds between an object and the person acting on it. An object “affords” its functionality when it reveals some usage potential by giving clues to its observer. Although a model-based approach can help to view the design from multiple perspectives, we always deal with abstractions from the ‘real’ objects currently used. This is essential in early design to focus only on important issues related to understanding the problem domain and finding a solution strategy, without getting lost in details. We then need to make the transition into thinking of domain objects, as explained in Sec. 3.2. We further need to complete the transition into concrete objects suitable for the OO approach. The affordance concept can help to ‘refine’ properties and behaviour of current domain objects which were not seen as task-relevant but can influence performance and manipulation as well as aesthetic aspects like the pleasure of using the system.
Fig. 9. Bad transfer of properties of cardboard cards to interactive cards: in the current situation cards are easy to lay out in an exact grid
The concept of affordance allows us to differentiate between the real affordance of physical objects and the perceived affordance of virtual objects, the latter being more suitable for UI design. The example models so far describe that interactive cards can be selected, moved, and rotated (Fig. 7). One could ‘transfer’ the current move-action to a drag-action of the mouse. However, a current move-action often involves lifting the card. Hence, no one would expect, for instance, that an interactive card can be dragged beneath other cards. Another example of an inappropriate ‘transfer’ of behaviour: the ‘real’ cards are squares made of cardboard. It would cost some effort to lay them out as in Fig. 9. In Sec. 3.2, further operations were mentioned, such as turning cards, cutting them into triangles, or looking at their patterns. Although not relevant for the example task, it might be a good decision to transfer some of them too. Every design will offer interactive elements which have no analogy in the current task domain, like the list of solutions of our puzzle. Here again, patterns can help.
5 Summary The paper proposes an integration of task-based and object-oriented design approaches. In particular, designers are encouraged to specify task models of current and envisioned practices. These serve to derive OOA models, which are often used by software designers. Patterns and affordances support an intertwined development of the models in use and let designers pay more attention to details.
References 1. Bastide, R., Basnyat, S.: Error Patterns: Systematic Investigation of Deviations in Task Models. In: Coninx, K., Luyten, K., Schneider, K. (eds.) TAMODIA’2006 (2006) 2. Dittmar, A., Forbrig, P., Heftberger, S., Stary, C.: Tool Support for Task Modelling - A Constructive Exploration. In: Proc. EHCI-DSVIS’04 (2004) 3. Dittmar, A., Gellendin, A., Forbrig, P.: Requirements Elicitation and Elaboration in TaskBased Design Needs More than Task Modelling: A Case Study. In: Coninx, K., Luyten, K., Schneider, K. (eds.) TAMODIA’2006 (2006) 4. Lim, K.Y., Long, J.: The MUSE Method for Usability Engineering. Cambridge University Press, Cambridge (1994) 5. Limbourg, Q.: Multi-Path Development of User Interfaces. PhD thesis, Université catholique de Louvain (2004) 6. Luyten, K., Clerckx, T., Coninx, K., Vanderdonckt, J.: Derivation of a dialog model from a task model by activity chain extraction. In: Jorge, J.A., Jardim Nunes, N., Falcão e Cunha, J. (eds.) DSV-IS 2003. LNCS, vol. 2844, Springer, Heidelberg (2003) 7. Norman, D.A.: The Psychology of Everyday Things. Basic Books, New York (1988) 8. Nunes, N.J., e Cunha, J.F.: Towards a UML profile for interaction design: the Wisdom approach. In: Evans, A., Kent, S., Selic, B. (eds.) UML 2000. LNCS, vol. 1939, pp. 101–116. Springer, Heidelberg (2000) 9. Ormerod, T.C.: Using Task Analysis as a Primary Design Methods: The SGT Approach. In: Schraagen, J.M., Chipman, S.F., Shalin, V.L (eds.) Cognitive Task Analysis, pp. 181–200. Lawrence Erlbaum Associates, Mahwah (2000) 10. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000) 11. Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A notation for specifying task models. In: INTERACT’97, 362–369 (1997) 12. Van der Veer, G.C., Van Welie, M.: Task based Groupware Design: Putting theory into practice. In: Proceedings of DIS 2000, New York, pp. 326–337 (2000)
A Pattern Decomposition and Interaction Design Approach
Cunhao Fang1,2, Pengwei Tian1,2, and Ming Zhong1,2
1 Tsinghua National Laboratory of Information Science and Technology, China
2 Department of Computer Science and Technology, Tsinghua University, China [email protected]
Abstract. This paper explores and discusses the application of a pattern decomposition and interaction design approach to pattern layout design. First, we introduce a Pattern Decomposition Representation Model (PDM). In this model, the reusable parts of a pattern are extracted as pattern primitives. Meanwhile, a module separated from the pattern primitives is defined by the abstract structure of the pattern. Next, an interaction design approach based on the design context and on knowledge-based promotion is proposed, and the implementation is presented at the end.
Keywords: Pattern Design, pattern decomposition model, interaction design.
2 Pattern Decomposition Model
In some application fields of pattern design, the elements of a pattern, i.e. the pattern primitives, are reused frequently, different pattern types contrast sharply, and their structures are regular. Thus, the design activities can be formalized clearly. Taking advantage of these properties, we introduce a model named “Pattern Decomposition Model” (PDM). In the model, the reusable parts of a pattern are extracted as pattern primitives and the pattern layout is separated as the pattern structure. Based on the model, a pattern can be described with a two-dimensional hierarchical structure, called the Pattern Structure Graph (PSG), which usually is a graph without cycles, as shown in Fig. 1.
Fig. 1. The Decomposition of Pattern and the Abstracting of Pattern Layout
A pattern case can be represented by its case structure (the pattern layout), which is the abstract description of the case, and its case data (the pattern elements), which are abstracted from the case. The procedure of pattern design is therefore a case-making process in which a specific pattern layout is applied to various pattern elements.
The procedure of pattern design thus reduces to five steps:
1) The pattern designer seeks an appropriate case and pattern elements according to the design intention.
2) Case adaptation.
3) Edition inside the elements.
4) According to the adapted case layout, the elements are rearranged in the pattern.
5) Interactive edition, color perfection, and completion of the final piece.
In this way, the process of pattern making is described as:
$$\text{Pattern} = \prod_{i=1}^{n} \varphi(\text{Element}_i)$$

Here ∏ stands for the space layout knowledge that is hidden in the case, and φ stands for the transformation and synthesis knowledge of elements.
3 Design Case Representation
Based on the Pattern Decomposition Model, each design case may be represented as:
DesignCase = (EleSet, Position, OpState, FoldModel)
EleSet stands for the set of elements used in the pattern, each identified by its ID in the pattern element database. Position is a pattern element's relative position in the pattern. OpState stands for an element's operating state, such as rotation angle and zoom ratio. FoldModel is the cascade mode among element objects, such as transparent mode, semitransparent mode, and submerge mode. This efficient, abstract description is called a Module; with it, storing a case becomes simple and efficient. To support effective case indexing, Modules are organized in the case database as a standard tree-shaped hierarchy according to their application occasions. Case-adaptation techniques such as object substitution, parameter adjustment, and deductive reuse are used to adapt a solution to a specific application.
Furthermore, we find that the System State at each interaction point in the pattern design process can be described by a four-component array. The compactness of this description makes it possible to record the whole design process. The system collects the user's current design state and the states at the preceding interaction points into a Design Process Series. With it, the user can backtrack through the design process: the system only needs to fetch the earlier abstract description of the System State and reproduce it for the user through a visual transformation. This description technique makes functions such as Undo and Redo relatively simple to realize and, by remembering labels in the design process, even makes it possible to recover the design process at a given interaction point.
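The four-component description and the Design Process Series can be illustrated with a minimal Python sketch. The class `DesignProcessSeries`, its method names, the field types, and the sample element ID `e17` are assumptions made for illustration; they are not taken from TopBC.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class DesignCase:
    """One system state, following the four components named in the text."""
    ele_set: Tuple[str, ...]                    # element IDs in the element database
    position: Tuple[Tuple[float, float], ...]   # relative position of each element
    op_state: Tuple[Tuple[float, float], ...]   # (rotation angle, zoom ratio) per element
    fold_model: str                             # cascade mode, e.g. "transparent"


class DesignProcessSeries:
    """Hypothetical history of system states that supports Undo and Redo."""

    def __init__(self, initial: DesignCase) -> None:
        self._states: List[DesignCase] = [initial]
        self._cursor = 0  # index of the current state

    @property
    def current(self) -> DesignCase:
        return self._states[self._cursor]

    def record(self, state: DesignCase) -> None:
        # A new interaction discards any states that had been undone.
        del self._states[self._cursor + 1:]
        self._states.append(state)
        self._cursor += 1

    def undo(self) -> DesignCase:
        if self._cursor > 0:
            self._cursor -= 1
        return self.current

    def redo(self) -> DesignCase:
        if self._cursor < len(self._states) - 1:
            self._cursor += 1
        return self.current


start = DesignCase(("e17",), ((0.0, 0.0),), ((0.0, 1.0),), "transparent")
series = DesignProcessSeries(start)
rotated = replace(start, op_state=((90.0, 1.0),))  # rotate the element by 90 degrees
series.record(rotated)
```

Because each state is a small immutable value, keeping the whole series in memory is cheap, which is the property the text relies on for backtracking.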
Fig. 2. The Design Process Series used in TopBC
Finally, if we apply the Design Process Series to case recording and store the whole Design Process Series of each case in the Case Database, the whole design process of a case can be reproduced. During this reproduction, the solution adaptation of the case can be carried out at each interaction point in response to user input.
4 Design Context and Conductive Interaction Technology
Research shows that the events selected by users differ between interaction points. The selection depends principally on the system state at the interaction point, which is known as the design context environment, and it is also restricted by domain knowledge and by the user's behavioral habits. By applying knowledge-based reasoning methods, we can derive a series of possible interaction actions at each interaction point and execute the appropriate action after the user's acknowledgement. The reasoning procedure is shown in Fig. 3.
Fig. 3. The process of conductive interaction technology
Design context describes the relationship between system states. According to the design context environment, we can identify the design state at the interaction point and the possible state shifts afterwards; from the various state shifts, we then derive a series of possible interaction actions and place them in the Action Queue. In the TopBC system, design context is the main basis for reasoning, and it is a tree-shaped structure that may be described by a Design Context Tree (Fig. 4). The procedure rule set reflects the restrictions on action execution order imposed by domain knowledge. It verifies the execution order of actions with a set of rules, for example that starting the Element DB always precedes browsing elements, and that local edition always follows case adoption. Hence, the action reasoning process of the TopBC system can be described as:
1) Identify the current System State at each interaction point.
2) Match the current System State against the design context tree and search for possible System State shifts with a breadth-first algorithm.
3) For each state shift, create the corresponding action and put it into the action queue, scheduled by domain knowledge and by statistics of the user's former events.
4) Pop the action queue and wait for the user's acknowledgement.
5) Automatically carry out the action that the user has acknowledged.
Fig. 4. The design context tree used in TopBC (The gray-filled node stands for the current system state.)
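The action reasoning process can be sketched in Python as a breadth-first search over a context tree followed by a ranking step. The tree fragment below reuses node names from Fig. 4, but its parent-child structure, the depth limit, and the usage counts are assumptions for illustration; checks against the procedure rule set are omitted.

```python
from collections import deque
from typing import Deque, Dict, List

# Hypothetical fragment of a design context tree: state -> possible next states.
CONTEXT_TREE: Dict[str, List[str]] = {
    "System State": ["Module Selection", "Element Selection"],
    "Module Selection": ["Global Layout", "Local Edition"],
    "Element Selection": ["Image Edition"],
    "Global Layout": [],
    "Local Edition": ["Color Perfection"],
    "Image Edition": [],
    "Color Perfection": [],
}


def reachable_states(tree: Dict[str, List[str]], current: str,
                     max_depth: int = 2) -> List[str]:
    """Breadth-first search for state shifts within max_depth of `current`."""
    found: List[str] = []
    frontier = deque([(current, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for child in tree.get(state, []):
            found.append(child)
            frontier.append((child, depth + 1))
    return found


def build_action_queue(candidates: List[str],
                       usage_counts: Dict[str, int]) -> Deque[str]:
    """Order candidate actions by how often the user chose them before.

    sorted() is stable, so ties keep their breadth-first order.
    """
    return deque(sorted(candidates, key=lambda a: -usage_counts.get(a, 0)))


candidates = reachable_states(CONTEXT_TREE, "Module Selection")
queue = build_action_queue(candidates, {"Local Edition": 5, "Global Layout": 2})
```

Popping from the left of `queue` yields the action to propose to the user first; a fuller version would filter candidates against the procedure rules before ranking them.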
5 Automatic Promotion
Because the system adopts object-oriented structuring, the edition of an element object is encapsulated inside the object, so the encapsulation of operations and the coherence of the global layout are achieved. However, we must be aware that changes to an object's internal information may harm the harmony of the global pattern layout. Obviously
relying on the user to preserve the harmony of the pattern layout by hand is unreasonable and sometimes inaccurate. The TopBC system achieves automatic promotion from partial operations to the global layout by means of pattern layout regulations and domain knowledge. The following promotion rules are used in the TopBC system:
Rule 1: When the same element object appears several times in a pattern layout (such as the square continual pattern), the edition of this element is projected to all of its instances.
Rule 2: In a symmetrical pattern layout, the edition of a certain space should be projected to its symmetrical space.
Rule 3: The color alteration inside the objects must not harm the global effect; otherwise, the color-perfection operation should be executed.
……
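Rules 1 and 2 can be sketched in Python as follows. The grid representation, the function name `promote_edit`, the restriction to colour edits, and the choice of left-right mirroring as the symmetry are all assumptions made for this example; TopBC's actual data structures are not described in the text.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) of a slot in the pattern layout


def promote_edit(layout: Dict[Cell, str], colors: Dict[Cell, str],
                 edited: Cell, new_color: str, width: int,
                 symmetric: bool = False) -> List[Cell]:
    """Apply a local colour edit and project it to the global layout.

    Rule 1: every cell holding the same element receives the edit.
    Rule 2: in a left-right symmetric layout, the mirror cells receive it too.
    Returns the changed cells in sorted order.
    """
    element = layout[edited]
    targets = {cell for cell, elem in layout.items() if elem == element}
    if symmetric:
        targets |= {(row, width - 1 - col) for row, col in targets}
    for cell in targets:
        colors[cell] = new_color
    return sorted(targets)


layout = {(0, 0): "leaf", (0, 1): "flower", (0, 2): "flower", (0, 3): "bud"}
colors = {cell: "white" for cell in layout}

# Rule 1: editing one "flower" updates both flower instances.
changed_rule1 = promote_edit(layout, colors, (0, 1), "red", width=4)

# Rule 2: editing the leftmost cell also updates its mirror cell.
changed_rule2 = promote_edit(layout, colors, (0, 0), "blue", width=4,
                             symmetric=True)
```

Rule 3 is not covered here, since the text does not say how harm to the global effect is detected.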
6 Conclusion
We propose a pattern decomposition and interaction design approach and have realized a prototype system. Running the prototype system shows that case-based pattern production is an effective approach for the Pattern CAD domain. It helps technology to accumulate and can pool the wisdom and experience of various designers through the exchange of cases. Meanwhile, other techniques used in TopBC, such as object-oriented structuring, the conductive interaction technology based on the design context, and the knowledge-based promotion from partial operations to the global layout, have been applied successfully and enhance the speed and quality of pattern production.
References 1. Marculescu, D., Marculescu, R., Khosla, P.K.: Challenges and opportunities in electronic textiles modeling and optimization Design Automation Conference, 2002. In: Proceedings. 39th 2002, pp. 175 –180 (2002) 2. Polzleitner, W., et al.: Invariant pattern location using unsupervised color2 based perceptual organization and graph2based matching [A]. In: 2001 International Joint Conference on Neural Networks Proceedings [ C ], pp. 594–599. IEEE Press, Washington DC (2001) 3. Rantanen, J., et al.: Smart clothing for the arctic environment [A]. In: 2000 the Fourth International Symposium on Wearable Computers Proceedings [C], Atlanta, GA, pp. 15–23. IEEE Press, New York (2000) 4. Sheng, L., et al.: Application of pattern emulation on weave CAD automatization [A]. In: 2000 the 3rd World Congress on Intelligent Control and Automation Proceedings [C], Hefei,China, pp. 2412–2416. IEEE Press, New York (2000) 5. Burchard, B., et al.: Devices, software, their applications and requirements for wearable electronics [A]. In: 2001 International Conference on Consumer Electronics Proceedings [C], pp. 224–225. IEEE Press, Los Angeles,CA (2001) 6. Cun-hao, F.: The Textile Oriented Pattern Database Design, the bachelor degree thesis of Zhejiang University (in Chinese) (1998)
7. Guo-FU, Y., Lei, X. et al.: Research on Case-Based Reasoning Intelligent CAD for designing Hydraulic Cylinder, Journal of CAD&CG 10(6) (in Chinese) (November 1998) 8. Ling, Y., Hua, F., et al.: A Knowledge-Based Conductive Human-computer interface. Journal of CAD&CG 9(1) (in Chinese) (January 1997) 9. Chun, C., Zhijun, H., Duanqing, X.: An Intelligent CAD/CAM System for Silk Printing, Chinese Journal of Automation, 4(1), (1993) 10. Lee, J.-H., Kim, H.-S., Cho, S.-B.: Accelerating evolution by direct manipulation for interactive fashion design, Computational Intelligence and Multimedia Applications, In: ICCIMA 2001. Proceedings. Fourth International Conference on, 2001 pp. 343 –347 (2001)
Towards an Integrated Approach for Task Modeling and Human Behavior Recognition
Martin Giersich, Peter Forbrig, Georg Fuchs, Thomas Kirste, Daniel Reichart, and Heidrun Schumann
Rostock University, Computer Science Department, Albert-Einstein-Str. 21, 18051 Rostock, Germany
{martin.giersich, peter.forbrig, georg.fuchs, thomas.kirste, daniel.reichart, heidrun.schumann}@uni-rotock.de
Abstract. Mobile and ubiquitous systems require task models for addressing the challenges of adaptivity and situation-aware assistance. Today, both challenges are seen as separate issues in system development, addressed by different modeling concepts. We propose an approach for a unified modeling concept that uses annotated hierarchical task trees for synthesizing models for both areas from a common basic description. Keywords: Task models, human behavior models, dynamic Bayesian networks, user interface design methodology, ubiquitous computing.
1 Introduction
Mobility and ubiquity of computing devices make information technology accessible for user activities that are temporally and, especially, spatially distributed. (We will use the term “task” for such a composite activity.) Computing devices are used “beyond the desktop”, in situations where the user focuses on interacting with the real world (e.g., while repairing a machine, while giving a lecture in a smart classroom, or during a team meeting). In such situations, we wish for systems that are able to assist the user proactively: systems that do not force the user to interrupt his primary task and shift his focus of attention in order to interact with the system.
One example of such proactive assistance is smart environments (smart homes, smart offices, smart meeting rooms, etc.) that support users in their activities through appropriate actuators. For instance, consider a “situation room” or “mission control center” scenario where several users are engaged in cooperative, time-critical decision making (such as disaster mitigation). Here, wall displays may be used alongside standard computer monitors, and even laptops brought in or removed by staff members at arbitrary times. Different specialists on the team require information on different aspects of a problem, or even on different data altogether. A smart room would provide support for these conflicting and dynamically changing information needs by automatically deciding what data to present where, when, and how. This decision is based on the current task at hand (team intention), the capability of the available hardware, and the preferences and needs of the team member(s) interested in the data.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1109–1118, 2007. © Springer-Verlag Berlin Heidelberg 2007
From the viewpoint of user interface development, the consideration of mobile and ubiquitous application scenarios has two important consequences:
1. The set of devices available for interaction may change over time. This raises the challenge of adaptivity: on different devices, the same abstract interaction (“enter phone number”) has to be rendered differently in order to make optimal use of the current device’s interaction mechanisms (consider, for instance, keyboard entry vs. speech recognition).
2. The system becomes aware of a user’s task and its structure. This creates the opportunity of proactive assistance: if the devices in the user’s environment are able to infer her current activity, they are able to trigger actions (such as providing information) without explicit user interaction.
In order to address these consequences, the same basic concept is being investigated in current research: the system is provided with an explicit model of the user’s task, a task model. However, the specific kinds of task models used for addressing the above two challenges differ significantly. In the area of adaptive user interfaces, hierarchical task graphs are used. Tasks can be refined by sub-tasks and their ordering constraints. In the area of proactive assistance, probabilistic models, such as hidden Markov models, are employed for describing human behavior by specifying the probability of different possible actions a user may take in a certain situation. Consequently, models in the two areas are currently developed independently of each other.
Intuitively, one would assume that a model which specifies the temporal orderings of subtasks should have some relation to the model that specifies what a user will probably do next. So it might be possible to generate both models (or at least “templates” for both models) from the same basic description. It is this intuitively plausible idea we want to discuss in this paper.
The further structure of this paper is as follows: in Sec. 2, we provide an in-depth discussion of task models in general and hierarchical task models specifically. In Sec. 3, we outline activity recognition using probabilistic models of human behavior. Then we describe a simple strategy for synthesizing parts of such probabilistic models from extended hierarchical task models in Sec. 4. We discuss the viability of this approach and indicate further research in Sec. 5.
2 Modeling Tasks
By a task model we mean a breakdown of a composite activity into individual atomic steps, between which a partial order may be defined; roughly speaking, a “plan”. The term “action” will denote an atomic step of a task. The concept of task models originates from two research areas. In cognitive psychology, task models have been developed as a means for formally describing human problem-solving behavior. The well-known GOMS model [1] is a very good example of this class of models. It is the foundation of several proposals for model-based user interface design (for instance, [2]). These models
can be used in two ways: (i) for analyzing the cognitive complexity of a given user interface, and (ii) for designing user interfaces by first developing a model of the task at hand and then choosing appropriate dialogue elements for the individual atomic activities.
In signal processing, “task models” have been developed as a means for estimating the actual behavior of a signal source for which only incomplete and noisy observations are available. The fundamental algorithmic approach is Bayesian filtering. A Bayesian filter requires a hypothesis about a signal source’s behavior repertoire, a hypothesis about which behavior will cause what observation, and a set of noisy observations. Based on this information, the filter will yield the most probable explanation for the observed data, i.e., the most probable behavior of the signal source given the observations (see, e.g., [3] for an introduction to Bayesian filtering).
The relation between the two origins of task models becomes clear once we try to track a human user: the observations may be data from (noisy) location sensors (e.g., beacon-based, GPS, UbiSense), accelerometers attached to the user’s body (or her mobile phone), or information about objects touched by the user (using, e.g., RFID tags). The challenge is then to infer which task the user is trying to perform from a given set of sensor data. Clearly, a model correctly describing a human’s strategy for achieving a certain goal is an ideal hypothesis for a Bayesian filter: given a task model and a set of sensor readings, a Bayesian filter will output the user’s most probable goal.
In essence, from the viewpoint of mobile and ubiquitous computing, task models have two important uses:
– As a means for deriving the dialogue structure of a mobile human-computer interface (hierarchical task models).
– As a means for providing activity support for users (and teams) through proactive assistance (probabilistic behavior models).
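The Bayesian filtering idea can be made concrete with a minimal discrete filter step in Python. The two activities, the two sensor readings, and all probabilities below are invented for illustration and do not come from the paper.

```python
from typing import Dict

Dist = Dict[str, float]


def bayes_filter_step(belief: Dist, transition: Dict[str, Dist],
                      likelihood: Dict[str, Dist], observation: str) -> Dist:
    """One predict-and-update step of a discrete Bayesian filter."""
    states = list(belief)
    # Predict: push the previous belief through the behavior model.
    predicted = {
        s: sum(belief[p] * transition[p][s] for p in states) for s in states
    }
    # Update: weight each state by how well it explains the observation.
    unnormalised = {s: predicted[s] * likelihood[s][observation] for s in states}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}


# Hypothetical behavior repertoire: the user is listening or presenting.
belief = {"listening": 0.5, "presenting": 0.5}
transition = {
    "listening": {"listening": 0.8, "presenting": 0.2},
    "presenting": {"listening": 0.3, "presenting": 0.7},
}
# Hypothetical sensor model: where a noisy location sensor reports the user.
likelihood = {
    "listening": {"seat": 0.9, "stand": 0.1},
    "presenting": {"seat": 0.2, "stand": 0.8},
}

posterior = bayes_filter_step(belief, transition, likelihood, "stand")
```

Here `transition` plays the role of the hypothesis about the behavior repertoire and `likelihood` the hypothesis about which behavior causes which observation; a reading near the speaker stand shifts the belief towards "presenting".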
With respect to hierarchical task models, one of the most popular notations is the so-called ConcurTaskTree notation (CTT notation) [2]. In this notation, a compound activity is represented by a task tree. Each node in the tree represents a task; composite tasks may be broken down into subtasks. For each task node, it may be specified whether the activity is executed by the user, by the application, or by an interaction between user and application. In addition, the possible execution sequences of a composite task’s sibling nodes may be further constrained by temporal relations such as “α |=| β” (α and β may be executed in any sequence) or “α >> β” (α has to be executed before β). Fig. 1 presents a typical task tree, describing in a rather simplistic way the agenda of a meeting that consists of three talks by users a, b, c (represented by task nodes A, B, and C, respectively) and a discussion (task node D). The talks can be given in any order, which is specified by the temporal relation order independency (“|=|”). The discussion can only be performed after all talks have been presented. This is specified by the enabling relation (“>>”). Hierarchical task models are used to specify the behavior of users interacting with a software system. They allow the designer to describe the basic temporal structure of
Fig. 1. Task model specifying the schedule of a meeting (A |=| B |=| C >> D)
compound activities. For inferring the activity of a user from sensor data, we need additional information: a specification of how probable a certain execution sequence is. Next, we will look at a current approach to this problem.
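To make the temporal relations of the meeting example concrete, the following Python sketch enumerates the execution sequences that the task tree A |=| B |=| C >> D permits. The function name is hypothetical, and the sketch hard-codes this one tree shape rather than interpreting CTT in general.

```python
from itertools import permutations
from typing import List


def meeting_sequences(talks: List[str], closing: str) -> List[List[str]]:
    """All orderings allowed by `talks` being order independent (|=|)
    and jointly enabling (>>) the `closing` task."""
    return [list(order) + [closing] for order in permutations(talks)]


sequences = meeting_sequences(["A", "B", "C"], "D")
for seq in sequences:
    print(" -> ".join(seq))
```

The task tree admits all six orderings equally; it says nothing about which one is most likely, which is exactly the additional information that activity inference needs.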
3 Inferring Intentions
As outlined above, computing the user’s current activity from sensor data requires a task model that allows statements to be made about the plausibility of sensor data given a specific activity. A system can then try to identify the user’s current task by selecting the task whose action sequence is most plausible with respect to the observed sensor data. Bayesian filtering for identifying a user’s current task has been successfully used in several projects that aim at supporting user activities in classrooms, meeting rooms, and office environments [4,5,6]. Here, dynamic Bayesian networks (DBNs) are increasingly investigated for modeling a user’s activities [7,8].
In our own work, we look at using DBNs for inferring the current task and actions of a team of users. Given (noisy and intermittent) sensor readings of the team members’ positions in a meeting room, we are interested in inferring the team’s current objective, such as having a presentation delivered by a specific team member, a moderated brainstorming, a round table discussion, a break, or the end of the meeting. The basic structure of the DBN we propose for modeling the activities of such a team is given in Fig. 2.
In general, a DBN consists of a sequence of time slices, where each time slice describes the possible state of a system at a given time t. A time slice consists of a set of nodes that represent the system’s state variables at that time. State variables may be connected through directed causal links. A connection such as X → Y means that the current value of Y depends on the current value of X. This dependency is described by a conditional probability table (CPT), such as

                  X = 0    X = 1
P(Y = 0 | X)      0.9      0.3
P(Y = 1 | X)      0.1      0.7
which in this example says that, in case X is 1, the value of Y will be 0 with a probability of 0.3 and it will be 1 with a probability of 0.7. (If X is 0, Y will be 1 with a probability of 0.1 and 0 with a probability of 0.9.)
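As a small illustration, the CPT above can be stored as a nested dictionary and used to compute the marginal distribution of Y. The prior over X used here (both values equally likely) is an assumption made only for this example.

```python
from typing import Dict

# CPT from the text: cpt[x][y] = P(Y = y | X = x)
cpt: Dict[int, Dict[int, float]] = {
    0: {0: 0.9, 1: 0.1},
    1: {0: 0.3, 1: 0.7},
}

# Hypothetical prior over X (not given in the text).
prior_x = {0: 0.5, 1: 0.5}


def marginal_y(prior: Dict[int, float],
               table: Dict[int, Dict[int, float]]) -> Dict[int, float]:
    """P(Y = y) = sum over x of P(X = x) * P(Y = y | X = x)."""
    return {
        y: sum(prior[x] * table[x][y] for x in prior)
        for y in (0, 1)
    }


p_y = marginal_y(prior_x, cpt)
```

Each row of `cpt` sums to 1, as it must for a conditional distribution over Y.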
Fig. 2. Two-sliced dynamic Bayesian network (DBN) modeling team intention inference. It shows the intra-slice dependencies between observable (double-contoured) and hidden variables, as well as the inter-slice dependencies between consecutive states.
Causal links may connect nodes within a time slice; they may also connect nodes between time slices. The latter is used to express the fact that the state at time t depends on the previous state at time t − 1. (We here consider only DBNs that are first-order Markovian, i.e., where the state at time t depends only on the state at time t − 1 and on no earlier states.) The classical problem of Bayesian networks in general, and of DBNs specifically, is that not all node values may be known at a given time. Some nodes may be directly observable, but other nodes may be hidden. Bayesian inference then tries to infer the probability distribution over the hidden nodes’ values from the values of the known (observable) nodes.
In our network given in Fig. 2, only the nodes labeled S are observable (they represent sensor data reporting the position of a user); the other ones are hidden. With this network, we try to model the behavior of a team of three users during a meeting. At the top level, the team node $T_t$ represents the current team intention. The team’s intention at time t depends on what the team has already achieved (T at time t − 1, $T_{t-1}$) and on what the users i are currently trying to achieve (the $U_t^{(i)}$ nodes, $i \in \{a, b, c\}$). If all users have achieved their individual assignments for the current team intention, the team T will adopt a new intention. This may cause new assignments to the users. The $G_t^{(i)}$ nodes represent these, possibly new, assignments. So at each time slice, the team looks at what the users have achieved so far and then decides what the users should do next. The CPT of node T therefore represents the negotiation process by which the team members agree on the next joint activity. For instance, if the team decides that the next activity should be the presentation of user a, it would
assign to user a the task to go to the speaker stand and deliver his speech, while users b and c would be assigned the task to take a seat in the audience.
Whether a user i has achieved his assignment at time t, given by $U_t^{(i)}$, depends on the user’s current action ($A_t^{(i)}$) and his previous assignment $G_{t-1}^{(i)}$. The $A_t^{(i)}$ nodes record the current state of the user’s action (e.g., the user’s current position and velocity in case he has to reach the speaker stand in order to achieve his assignment). What the user is doing at time t depends on his previous action and assignment, $A_{t-1}^{(i)}$ and $G_{t-1}^{(i)}$. Finally, the sensor observations of user i at time t, the nodes $S_t^{(i)}$, depend on the user’s activities at that time. Note that these sensor nodes are the only observable nodes in our model: we cannot directly look into the minds of the users to observe their joint intention. Rather, we take the available sensor data, the set of $S_t^{(i)}$ values for the times up to t, and try to find from these the sequence of values for $T_s$, $s \in \{1, \ldots, t\}$, that best explains the observed data. That is, we try to estimate the team’s negotiations from the observable behavior of the team members.
Once a probabilistic model is available, it allows us to infer user and team intentions. A substantial question is now of course: Where do we get such a model? Is it necessary to create it manually, from scratch, or can we synthesize it from existing information, such as a hierarchical task model? This question will be addressed in the next section.
4 Synthesizing Probabilistic Models
In order to define a complete probabilistic model, sub-models have to be provided for the following three aspects:
– How a team produces a sequence of joint intentions (Team model)
– Which actions a user performs in response to a joint intention, an assignment (User model)
– Which sensor data are caused by what actions (Sensor model)
We will now look at the first topic and discuss how task models such as task trees can be used to simplify the definition of such a model. First, the CPT of a T node basically looks as follows, given $T_{t-1}$.history = h, $T_{t-1}$.activity = α, and the flags $U_t^{(i)}$.done, $i \in \{a, b, c\}$:

                                   ∀i: U_t^(i).done = true      ∃i: U_t^(i).done = false
P(T_t.history = h ∪ {α})           1                            0
P(T_t.history = h)                 0                            1
P(T_t.activity = α)                0                            1
P(T_t.activity = ξ), ξ ≠ α         mmodel(T_t.history, ξ)       0
Fig. 3. Markov model of agenda driven team process
The history slot of the T node records the team’s previous activities¹; the activity slot is the team’s current goal, which the users try to achieve jointly through their individual assignments. If all users are done with their assignments, the team will add the current action to its history and will then choose a new activity ξ. Otherwise, it will continue its current activity α. In this CPT, mmodel is the essential point. mmodel is a function that, given a history h and an action ξ, yields the probability that the team will try ξ after h. mmodel describes our knowledge about what a team will most probably do in a certain situation. For instance, if the possible actions of a team are {A, B, C, D} and if we know that the team has an agenda stating the sequence of actions A, B, C, D, mmodel should assign the highest probability to action B when given the history {A}, modeling the prior expectation that a team tends to follow its agenda. However, mmodel should also assign non-zero probabilities to the other actions in order to account for the possibility of deviations from the agenda. A possible model for a simple four-step agenda that states “A, B, C may come in any order but most probably in the order A, B, C, while D must be the last action” is given in Fig. 3. mmodel essentially specifies a Markov model in which the states are partial histories and the edges are transitions between histories. The problem here is how such a model can be specified efficiently: the number of states (histories) grows exponentially with the number of available actions. Our proposal for solving this problem is to utilize hierarchical task models for defining the structure and transition probabilities of a DBN. Specifically, we use an annotated CTT graph for generating an initial proposal for mmodel.
Basically, a task model M defined over a set of actions A specifies a directed acyclic graph on possible execution histories $h \in 2^A$, with the additional constraint that

$(h, h') \in M \Rightarrow \exists \alpha \in A : h \cup \{\alpha\} = h'$.

¹ Given a set of actions A, an execution history is a set of actions that already have been performed. The set of all execution histories is the power set of A, which we denote by $2^A$. (This model makes the simplifying assumption that the exact sequence of actions is not important – however, it is easy to change the history model to a sequence model.)
This means: in a task model M, a history h' directly results from a history h through the execution of a single action α. The empty history ∅ is the root of this graph². For a given history h, the set C(h) denotes the set of actions that may directly follow this history. C(h) is defined as follows:

$C(h) = \{\, \alpha \in A \mid (h, h \cup \{\alpha\}) \in M \,\}$

Clearly, the graph M directly represents the structure of a corresponding Markov model. At this point, the question to be addressed is how to provide initial proposals for the transition probabilities of this Markov model. The idea is to allow the developer of a task model to annotate a task-tree with additional information from which these initial proposals can be derived. One straightforward approach is to annotate each sibling action α with a “priority” prio(α), a number that indicates how important an early execution of this node is in relation to the other siblings, as outlined in Fig. 4. For independently ordered tasks (order relation |=|), the priorities indicate the probabilities of being executed first. Then, for a given history h and a possible extension ξ ∈ C(h), the probability of a transition from h to h ∪ {ξ} is calculated from the priorities by:

$P(h \cup \{\xi\} \mid h) = \frac{prio(\xi)}{\sum_{\alpha \in C(h)} prio(\alpha)}$
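The priority formula above can be tried out on the meeting example, using the priorities 90, 9 and 1 annotated in Fig. 4. The following Python sketch is an illustration only: the names `PRIO`, `candidates`, `transition_probs` and `build_markov_model` are hypothetical, the structure of the example task tree is hard-coded rather than read from a CTT model, and the priority of D is an assumed value (it does not matter, since D is always the only candidate when it is enabled).

```python
from fractions import Fraction

# Priorities as annotated in Fig. 4. D has no annotation there; the value 1
# is an assumption and has no effect because D never competes with siblings.
PRIO = {"A": 90, "B": 9, "C": 1, "D": 1}
UNORDERED = frozenset("ABC")   # siblings joined by |=|
FINAL = "D"                    # enabled by >> once all others are done


def candidates(history):
    """C(h): actions that may directly follow history h."""
    remaining = UNORDERED - history
    if remaining:
        return remaining
    return frozenset() if FINAL in history else frozenset({FINAL})


def transition_probs(history):
    """P(h ∪ {ξ} | h) for every ξ in C(h), from normalised priorities."""
    c = candidates(history)
    total = sum(PRIO[a] for a in c)
    return {a: Fraction(PRIO[a], total) for a in c}


def build_markov_model():
    """Enumerate all histories reachable from the empty history."""
    model, frontier = {}, [frozenset()]
    while frontier:
        h = frontier.pop()
        if h in model:
            continue
        model[h] = transition_probs(h)
        frontier.extend(h | {a} for a in model[h])
    return model
```

With these priorities, `transition_probs(frozenset("B"))` yields 90/91 for A and 1/91 for C, which round to the values 0.99 and 0.01 shown in Fig. 3.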
The resulting Markov model for the hierarchical task model in Fig. 4 is the one shown in Fig. 3. In this example, the probability that a meeting starts with presentation A is 0.9. Accordingly, it is 0.09 for presentation B, and the probability that a meeting starts with the third talk C is 0.01. If the meeting has started with talk B, the probabilities for the two possible transitions to {B, C} and {A, B} are given by

$\frac{prio(C)}{prio(A) + prio(C)} \approx 0.01$ and $\frac{prio(A)}{prio(A) + prio(C)} \approx 0.99$.

Note that the most probable path through the generated Markov model is indeed the one following the agenda: ∅ → {A} → {A, B} → {A, B, C} → {A, B, C, D}. Also, if an action is taken out of order, the Markov model specifies that the team will try to return to the agenda: when the meeting has been started with B, the most probable following action is to return to the planned sequence by executing A. So the generated Markov model represents the intuition behind the task-tree annotations.

We have just shown that a proposal for a probabilistic model of user behavior can be generated from an annotated hierarchical task model. Therefore, we claim that it is at least possible to exploit an established user interface design methodology (task-tree modeling) for additionally enabling some aspects of proactive assistance in ubiquitous computing systems. In the following, we discuss some of the research issues that arise from this approach.

² If histories are represented by sequences instead of sets, this graph is a tree.

[Figure 4: task tree with the sibling tasks A (priority 90) |=| B (priority 9) |=| C (priority 1) >> D.]
Fig. 4. Extended task model specifying the schedule of a meeting
5 Discussion and Outlook
With the simple strategy outlined above, we have shown that it is possible to synthesize Markov models from annotated hierarchical task-trees that capture the probability of task execution in a team. An interesting question is how well the intended Markov model can be specified by the priority annotations. Not all possible distributions for the transition probabilities can be generated from these task-tree annotations (after all, the number of priority annotations is much smaller than the number of transitions in the Markov model). However, we think it is sufficient if the generated model is approximately correct: the exact transition probabilities are not known in advance anyway and have to be learned from the observation of real user behavior. The generated probabilities only have to be exact enough to permit reasonable system behavior right from the start, before training data is available (of course, the better the initial estimate, the less training data will be required). The salient point is a useful definition of “reasonable” and “approximate”. We think it is possible to provide such definitions; this will be part of our future research.

The specific task-tree annotations and the accompanying probability computation given in Sec. 4 implicitly assume that the team uses a specific agenda management strategy. Specifically, the synthesized model assumes that a team prefers to execute actions in the order of their original priority, independent of the history. We call this a return to agenda strategy. Sometimes, teams might use other strategies. One example is continue with the successor: if the meeting had started with talk B, the most probable next action would be C. That is, the original “successor” of the action actually executed is also the most probable following activity. Another strategy of a team could be to stick to a timetable and to execute each action as close as possible to the original plan.
Different strategies may require different annotations to a hierarchical task model. (For instance, in the case of continue with successor, one needs to provide the priority annotations at the parent task level rather than at the sibling level.) In addition, it is conceivable to provide a set of mechanisms for inheriting such annotations within a task-tree. In order to keep task-tree annotations usable, the set of annotation mechanisms has to be kept as small as possible. Therefore, we need to identify a set of annotations that allows the typical team strategies for agenda management to be specified with sufficient precision. Future research has to identify which strategies are typical, what precision is required, and which set of task-tree annotations is able to capture the required information in a usable way for real-life problems.

To summarize, we have shown that task models are an important tool for addressing salient challenges of mobile and ubiquitous system design: adaptivity and proactive assistance. The models employed today in these areas use quite different modeling concepts, which leads to duplication of work. We have then argued that it is possible to automatically synthesize one kind of model (probabilistic behavior models) by giving simple annotations to the other kind (hierarchical task models). Therefore, it seems possible to generate proactive assistance for every application that already provides a suitably elaborate task model. We think that further research in this area is important for evaluating the potential of this approach and for making its benefits accessible to ubiquitous system development practice.
References
1. Card, S.K., Moran, T.P., Newell, A.: The psychology of human-computer interaction. Lawrence Erlbaum, Mahwah (1983)
2. Mori, G., Paterno, F., Santoro, C.: CTTE: support for developing and analyzing task models for interactive system design. IEEE Trans. Softw. Eng. 28(8), 797–813 (2002)
3. Fox, D., Hightower, J., Liao, L., Schulz, D., Borriello, G.: Bayesian filtering for location estimation. IEEE Pervasive Computing 2, 24–33 (2003)
4. Franklin, D., Budzik, J., Hammond, K.: Plan-based interfaces: keeping track of user tasks and acting to cooperate. In: Proceedings of the 7th international conference on Intelligent user interfaces, New York, NY, USA, pp. 79–86. ACM Press, New York (2002)
5. Bui, H.: A general model for online probabilistic plan recognition. In: IJCAI ’03: Proceedings of the 18th International Joint Conference on Artificial Intelligence, pp. 1309–1315 (2003)
6. Duong, T.V., Bui, H.H., Phung, D.Q., Venkatesh, S.: Activity recognition and abnormality detection with the switching hidden semi-markov model. In: CVPR ’05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Washington, DC, USA, vol. 1, pp. 838–845. IEEE Computer Society Press, Los Alamitos (2005)
7. Patterson, D.J., Liao, L., Fox, D., Kautz, H.A.: Inferring high-level behavior from low-level sensors. In: Dey, A.K., Schmidt, A., McCarthy, J.F. (eds.) UbiComp 2003. LNCS, vol. 2864, Springer, Heidelberg (2003), citeseer.ist.psu.edu/article/patterson03inferring.html
8. Patterson, D.J., Liao, L., Gajos, K., Collier, M., Livic, N., Olson, K., Wang, S., Fox, D., Kautz, H.A.: Opportunity knocks: A system to provide cognitive assistance with transportation services. In: Ubicomp, pp. 433–450. Springer, Heidelberg (2004)
A Pattern-Based Framework for the Exploration of Design Alternatives

Tibor Kunert¹ and Heidi Krömker²

¹ Siemens AG, Automation and Drives, P.O. Box 31 80, 91050 Erlangen, Germany [email protected]
² Technical University of Ilmenau, Institute of Media Technology, P.O. Box 10 05 65, 98694 Ilmenau, Germany [email protected]

Abstract. Design patterns serve the documentation and sharing of proven solutions for recurring design problems. Additionally, patterns can provide guidance on design alternatives. In this paper we present a pattern-based framework to support the designer in the exploration and evaluation of design alternatives and their tradeoffs. Based upon the systematic identification of recurring design problems and solution alternatives and their tradeoffs the framework consists of a generic hierarchy of design problems and solution alternatives as well as of two generic interaction design pattern templates. The presented framework can be used to specify design problems and existing solutions for a specific platform or application domain as well as to think about design alternatives and to develop new solutions. In addition, it can be used to structure interaction design pattern collections. The approach is illustrated by a case for interactive television applications.

Keywords: Interaction design patterns, design patterns, design tradeoffs, interactive television.
and unproven design alternatives. Furthermore, the pattern approach is well suited for the analysis of the design tradeoffs of a particular design alternative because it is also based upon the idea that a proven design solution is an appropriate compromise between different competing forces. In the pattern approach a design problem is the result of competing forces [8]. If there are no competing forces there is no design problem and no guidance is required. The forces referred to in the design pattern terminology can be understood as different requirements and constraints, and the solution approach described in a design pattern is a proven way to balance them. In addition, as a form of design guidance, design patterns provide a format to explicitly state and discuss the specific advantages and disadvantages of a particular design solution. In this way design patterns are well suited to support informed decision making by the designer [5, 8]. Other forms of design guidance, e.g. design principles, guidelines and style guides, have less potential in this regard because they do not indicate how to choose between design alternatives or how to resolve conflicts between competing forces.

Besides the pattern approach, other approaches have been suggested to document interaction design alternatives and their tradeoffs, especially claims analysis [3, 12] and impact analysis [10]. However, in this paper a pattern-based approach for the exploration of design alternatives is followed to take advantage of the strong focus of patterns on the link between design problem and solution.
2 Design Patterns

Design patterns describe successful solutions for recurring design problems [1, 6, 2, 8, 14, 15]. Design patterns usually do not describe isolated solutions but are structured hierarchically, forming a pattern collection or language. The hierarchical order reflects the scope of the design problems addressed: patterns on conceptual design issues are at the top, while patterns on more implementation-related design issues are at the bottom. Within a pattern language different relations between individual patterns exist. Some complement each other and others describe alternative solution approaches.

The design pattern approach is based upon the distinction between proven and unproven design solutions. In the pattern literature the quality criterion of long-term reliability in the real context of use is commonly applied. Especially for interaction design patterns, usability has been suggested as a substitute because interaction design solutions have existed for a relatively short time [8]. Using usability [7] as the quality criterion for design solutions has the advantage that it can be evaluated experimentally with prototypes. Thus emerging design patterns for new technologies, e.g. ubiquitous computing, mobile applications or interactive television, can be identified early in the technology lifecycle. Without design guidance that supports usability and user acceptance early in the technology lifecycle, a new technology may never get beyond its infancy. More specifically, for new technologies the potential of design patterns is seen e.g. in serving as a tool to accelerate usability knowledge sharing in a practice-oriented format and to avoid the establishment of insufficient design standards [4, 13].

Although it is widely recognized by pattern authors that it is difficult to identify recurring design problems, no systematic method has been presented for it so far. Instead, the selection of design problems addressed by existing interaction design pattern collections seems to be rather arbitrary, leaving open whether all design problems are covered within a particular pattern collection.
3 Method

The objective is to support informed design decisions by developing a pattern-based framework for the exploration of design alternatives. The framework is based upon two basic steps:
1. Identification of recurring design problems and solution alternatives.
2. Exploration of the solutions’ design tradeoffs.
As a result of these steps a framework was developed that consists of two components:
• A generic hierarchy of design problems and solution alternatives.
• Two generic templates for interaction design patterns based upon the hierarchy of design problems and solution alternatives.
The framework presented in this paper is generic and is applicable to any platform and application domain. In the following, the conducted steps are described in detail. The results of the individual steps are illustrated using interactive digital television (iTV) applications as an example. Although iTV applications can be as diverse as electronic program guides, voting and betting applications, enhanced or personalized news, educational applications, games and shops, the hardware for user interaction usually is a standard TV or set-top box remote control.

3.1 Identification of Design Problems and Solution Alternatives

Method for the Identification of Design Problems. Solution alternatives are based upon design problems. To be able to explore solution alternatives systematically, the corresponding design problems need to be identified systematically as well. The objective was to identify recurring interaction design problems. To make the coverage of design problems as complete and as diverse as possible, different methods were applied. On the one hand, the perspectives of the users of the applications as well as of the application designers were considered. On the other hand, for each of these two perspectives theoretical as well as empirical analyses were conducted.
To cover the user perspective a theoretical context of use analysis [7] as well as an empirical user task and user requirements analysis [10] were undertaken. For the designer perspective a literature analysis of existing platform-specific design guidance was conducted, as well as an empirical analysis of the designers’ requirements for platform-specific design guidance, using interviews with usability experts for that platform. These different methods deliver different types of results. However, they all can be used to identify and specify recurring interaction design problems. Following this approach, design problems from a user perspective are based upon particular context of use specifics and/or user requirements and user tasks. In other words, the design problem is how to cope with particular context of use specifics as well as how to support particular user tasks and user requirements. The designer perspective, on the other hand, can provide partly different design problems based upon deficits of existing guidance and/or the designers’ requirements regarding problems to be addressed to better support the application design process.

Method for the Identification of Solution Alternatives. In the application design process different solution alternatives exist for each design problem. Solution alternatives represent different solution approaches to a particular design problem. After the recurring interaction design problems had been identified, the possible solution alternatives were explored. The objective was to find as many solution approaches as possible to each of the identified design problems. However, design problems and thus solution alternatives exist on different levels. After the decision for a certain solution alternative, several design decisions are required regarding its concrete design. Each design alternative can be broken down into different components, each one representing another lower-level design problem. Many of these design alternative components in turn consist of different design variables that need to be decided upon in the design process. To provide meaningful guidance for the solution of a design problem, each of these components and design variables needs to be addressed. To identify solution alternatives for iTV applications, an analysis of existing iTV applications as well as a theoretical analysis of potential alternatives was carried out. Design solutions implemented on other platforms or in other application domains were also considered as solution alternatives for iTV design problems. The design alternatives consisted of implemented as well as not yet implemented solutions.

Results. The identified design problems and solution alternatives can be hierarchically classified, resulting in a hierarchy of design problems and solution alternatives.
Objective was to provide a hierarchical overview of design problems and alternatives to support informed design decisions. In addition, a hierarchical structure of design problems facilitates their mapping with solutions within a pattern collection and can serve as structure for the pattern collection itself. Each step conducted to classify the design problems and solution alternatives brought forth results at a different hierarchy level (Table 1).

Table 1. Steps for the hierarchical classification of design problems and solution alternatives and the corresponding results or hierarchy levels

Classification step | Result (hierarchy level)
Identification of recurring high level design problems | Design problems (level 1)
Identification of design alternatives for each high level design problem | Solution alternatives (level 1)
Identification of the components for each design alternative | Design problems (level 2)
Identification of the design variables for each design alternative component | Design problems (level 3)
Identification of possible values for each design variable | Solution alternatives (level 3)
Only for the design problems on level 2 no directly corresponding alternatives are assigned because the problems are broken down into different design variables. Fig. 1 shows the developed generic hierarchy of design problems and solution alternatives. In the naming convention of Fig. 1 the numbers in the boxes represent design problems (1, 2, 3…) while the letters represent solution alternatives (a, b, c…).

[Figure 1: tree diagram spanning hierarchy levels 1 to 3.]
Fig. 1. Generic hierarchy of design problems and solution alternatives
Examples for Interactive TV Applications
• Identification of recurring high level design problems: E.g. Page layout, Navigation and Text input.
• Identification of solution alternatives for each high level design problem: Solution alternatives for the high level design problem “Text input” are Hardware QWERTY keyboard, On-screen QWERTY keyboard, On-screen alphabetic keyboard, Mobile phone keyboard on remote control, On-screen mobile phone keyboard as well as Scrolling alphabet.
• Identification of the components for each solution alternative: Components of the solution alternative “On-screen alphabetic keyboard” are e.g. Keyboard presentation, Character input, Indication of selected character, Cursor presentation, Character correction.
• Identification of the design variables for each solution alternative component: Design variables of the solution alternative component “Cursor” are e.g. its default position and its presentation.
• Identification of the possible values for each design variable: Possible values for the design variable “Presentation” of the solution alternative component “Cursor” are e.g. static and blinking.
For iTV applications the developed hierarchy of design problems and solution alternatives is illustrated for the design problem “Text input” (Fig. 2).
Fig. 2. Hierarchy of interaction design problems and solution alternatives for the interactive TV design problem “Text input”
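The hierarchy for the “Text input” example can also be written down as a small nested data structure. The following Python sketch is only one possible representation: the class names and the helper `open_decisions` are hypothetical, and only the components, variables and values named in the examples above are filled in.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignVariable:          # design problem on hierarchy level 3
    name: str
    values: List[str] = field(default_factory=list)   # alternatives on level 3


@dataclass
class Component:               # design problem on hierarchy level 2
    name: str
    variables: List[DesignVariable] = field(default_factory=list)


@dataclass
class SolutionAlternative:     # solution alternative on hierarchy level 1
    name: str
    components: List[Component] = field(default_factory=list)


@dataclass
class DesignProblem:           # design problem on hierarchy level 1
    name: str
    alternatives: List[SolutionAlternative] = field(default_factory=list)


text_input = DesignProblem("Text input", [
    SolutionAlternative("Hardware QWERTY keyboard"),
    SolutionAlternative("On-screen QWERTY keyboard"),
    SolutionAlternative("On-screen alphabetic keyboard", [
        Component("Keyboard presentation"),
        Component("Character input"),
        Component("Indication of selected character"),
        Component("Cursor presentation", [
            DesignVariable("Default position"),
            DesignVariable("Presentation", ["static", "blinking"]),
        ]),
        Component("Character correction"),
    ]),
    SolutionAlternative("Mobile phone keyboard on remote control"),
    SolutionAlternative("On-screen mobile phone keyboard"),
    SolutionAlternative("Scrolling alphabet"),
])


def open_decisions(problem):
    """List every (alternative, component, variable) triple to be decided."""
    return [
        (alt.name, comp.name, var.name)
        for alt in problem.alternatives
        for comp in alt.components
        for var in comp.variables
    ]
```

Walking the structure with `open_decisions(text_input)` lists the level 3 design problems that a pattern for a given solution alternative would have to address.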
3.2 Exploration of Design Tradeoffs

Each solution alternative has specific advantages and disadvantages, or tradeoffs, that need to be considered and evaluated when designing a particular application. The suitability and selection of an alternative for a particular application depends, for example, on certain context of use specifics: specific user characteristics or a specific use environment can make certain solution alternatives unsuitable. Therefore, to support informed design decisions, the specific tradeoffs of the individual solution alternatives were analyzed and specified.

For iTV applications usability tests were used to identify the tradeoffs of different design alternatives as well as to identify proven solutions to the identified design problems. Existing iTV applications were tested, as well as iTV prototypes specifically built for the development of an iTV interaction design pattern collection. The prototypes systematically implemented different solution alternatives for particular problems based upon the identified hierarchy of design problems and solution alternatives. For the evaluation of the prototypes comparative usability studies were conducted using a within-subjects design. For design problems for which only one design alternative was available, conventional usability tests were carried out. For both test types qualitative as well as quantitative measures for effectiveness, efficiency and satisfaction were applied [7]. After all tasks had been completed, a task-based, semi-structured qualitative post-test interview was conducted with the test participants regarding critical incidents and specific strengths and weaknesses of the tested design alternative. The qualitative results of the thinking-aloud protocols and the post-test interviews were especially valuable for the identification of specific design tradeoffs. The two test types conducted for iTV applications produced different types of results.
On the one hand, comparative usability studies were used to create a usability ranking of the different evaluated solution alternatives based upon the quantitative as well as the qualitative measures. On the other hand, both test types were suitable to identify specific advantages and disadvantages of the tested solutions.
4 Documentation of the Results as Design Patterns

The identified design problems and solution alternatives as well as the tradeoffs of the solution alternatives can be documented in the form of an interaction design pattern collection. The objective is to support informed design decisions by sharing design and usability knowledge amongst designers. The results of the previous analyses can be used to structure the pattern collection as well as to guide the definition of pattern templates.

4.1 Structure of the Pattern Collection

The structure of the interaction design pattern collection can be based upon the identified hierarchy of design problems and solution alternatives (Fig. 1). However, referring to the top level of the hierarchy of design problems and solution alternatives is sufficient to structure the resulting design pattern collection, which is characterized by only two hierarchy levels (Table 2). Levels 2 and 3 of the hierarchy do not need to be represented in the structure of the pattern collection but can be addressed within the patterns on level 2. The objective of limiting the structure of the pattern collection to two hierarchy levels is to create an easy-to-use pattern collection to support acceptance by designers.

Table 2. Classification steps for design problems and alternatives, the corresponding levels in the hierarchy of design problems and alternatives and the corresponding hierarchy level in the pattern collection

Classification step for design problems and solution alternatives | Level in hierarchy of design problems and alternatives | Level in pattern hierarchy
High level design problems | Design problems (level 1) | Level 1
Solution alternatives for the high level design problems | Solution alternatives (level 1) | Level 1
Components of the solution alternatives | Design problems (level 2) | Level 2
Design variables for the solution alternative components | Design problems (level 3) | Level 2
Possible values for the design variables | Solution alternatives (level 3) | Level 2
4.2 Design Pattern Templates

To address the identified high-level design problems (hierarchy level 1 in Fig. 1) as well as the lower-level design problems (hierarchy levels 2 and 3 in Fig. 1), two different design pattern templates were created. While pattern template 1 addresses the question “What solution alternative to choose?”, pattern template 2 addresses the question “How to design a specific solution alternative?”. Template 1 is used for the patterns on level 1 and template 2 for the patterns on level 2 (Table 2). Templates 1 and 2 differ only in the content and structure of their “Problem” and “Solution” sections. The other pattern sections (Name, Examples, Context, Evidence, Related patterns) are identical for both templates.

Patterns using template 1 provide an overview of the solution alternatives to a particular design problem (design problems on hierarchy level 1 in Fig. 1) by specifying and defining the alternatives and by listing their individual tradeoffs in the form of advantages and disadvantages in the problem section (Table 3). In the solution section of template 1 patterns, concrete guidelines are provided specifying when to use which solution alternative.

Table 3. Interaction design pattern template 1 (naming convention refers to Fig. 1)

Name: Name of the addressed high-level design problem (problem on hierarchy level 1), e.g. Design problem 1
Examples: One screenshot of each solution alternative
Context: Description of the use of this design pattern in the application design workflow
Problem:
• Solution alternative 1a: Specification and definition
• Solution alternative 1b: Specification and definition
• Solution alternative 1n: Specification and definition

Alternative | Advantages | Disadvantages
Solution alternative 1a | … | …
Solution alternative 1b | … | …
Solution alternative 1n | … | …

Solution:
• Use solution alternative 1a for/when …
• Use solution alternative 1b for/when …
• Use solution alternative 1n for/when …
Evidence: Empirical usability test results and references to literature
Related patterns: Other design patterns of this collection that should be considered as well.
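To make the section structure of template 1 (Table 3) concrete, it can be expressed as a small record type. The Python sketch below is an illustration under assumptions: the class names, field names and the `render` method are hypothetical, and only the Name, Context, Problem and Solution sections are rendered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlternativeTradeoff:
    name: str                      # e.g. "Solution alternative 1a"
    definition: str                # specification and definition
    advantages: List[str] = field(default_factory=list)
    disadvantages: List[str] = field(default_factory=list)
    use_when: str = ""             # guideline for the Solution section


@dataclass
class Template1Pattern:
    name: str                      # the high-level design problem
    context: str
    alternatives: List[AlternativeTradeoff]
    examples: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    related_patterns: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the Name, Context, Problem and Solution sections as text."""
        lines = [f"Name: {self.name}", f"Context: {self.context}", "Problem:"]
        for alt in self.alternatives:
            lines.append(f"  {alt.name}: {alt.definition}")
            lines.append(f"    Advantages: {'; '.join(alt.advantages)}")
            lines.append(f"    Disadvantages: {'; '.join(alt.disadvantages)}")
        lines.append("Solution:")
        lines += [
            f"  Use {alt.name} for/when {alt.use_when}"
            for alt in self.alternatives
        ]
        return "\n".join(lines)
```

Keeping the tradeoffs and the usage guideline on the same `AlternativeTradeoff` record mirrors the template's intent that every alternative listed in the Problem section also receives a guideline in the Solution section.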
Patterns using template 2 provide, in their problem section, an overview of the design problems associated with the concrete design of a particular solution alternative (design problems on hierarchy levels 2 and 3 in Fig. 1). The solution alternative components and design variables addressed in the problem section of template 2 patterns are identical to the ones specified in Fig. 1. Also in the problem section, these patterns specify the possible values for the design variables of the corresponding solution alternative components together with their tradeoffs (Table 4).
Table 4. Interaction design pattern template 2 (naming convention refers to Fig. 1)

Name: Name of the addressed solution alternative (hierarchy level 1), e.g. Solution alternative 1b
Examples: Screenshots of different design approaches for this solution alternative
Context: Description of the use of this design pattern in the application design workflow
Problem:
• Solution alternative component 1b.1 (design problem on hierarchy level 2):
  Design variable 1b.1.1 (problem on hierarchy level 3): Specification of possible variable values and their trade-offs.
  Design variable 1b.1.2 (problem hierarchy level 3): Specification of possible variable values and their trade-offs.
• Solution alternative component 1b.2 (problem on hierarchy level 2):
  Design variable 1b.2.1 (problem on hierarchy level 3): Specification of possible variable values and their trade-offs.
  Design variable 1b.2.2 (problem hierarchy level 3): Specification of possible variable values and their trade-offs.
Solution:
• Solution alternative component 1b.1:
  For design variable 1b.1.1 use the value… (alternative on hierarchy level 3)
  For design variable 1b.1.2 use the value… (alternative on hierarchy level 3)
• Solution alternative component 1b.n:
  For design variable 1b.2.1 use the value… (alternative on hierarchy level 3)
  For design variable 1b.2.2 use the value… (alternative on hierarchy level 3)
Evidence: Empirical usability test results and references to literature
Related patterns: Other design patterns of this collection that should be considered as well.
5 Conclusion

The developed generic hierarchy of design problems and solution alternatives has proven to be a suitable basis for the development of generic interaction design pattern templates aimed at supporting designers in the exploration and evaluation of design alternatives and their tradeoffs. After developing an interaction design pattern collection for iTV applications by applying the presented framework, we can conclude that the framework supports the pattern creation process. It can be used not only to structure interaction design pattern languages, but also to guide and structure the development of the pattern content. Patterns corresponding to this framework describe the different solution alternatives for a problem and discuss their specific tradeoffs, thus supporting an informed design decision. In addition, the framework may also support the finding of new design solutions that have not been implemented yet. As raised by Chung et al. [4], a systematic pattern-based framework may facilitate the structured search for new solutions, as accomplished by Mendeleyev’s periodic table in chemistry.

Future research is required to evaluate patterns developed using this framework regarding their practical utility in the application design process. The evaluation would need to distinguish between different phases of the application design process because the patterns might provide different levels of support in different design phases. In addition, it needs to be investigated what level of maturity a platform or application domain must have reached for patterns to be of support in the design process.
T. Kunert and H. Krömker
Acknowledgement. This work is supported by the EC 6th Framework IST NoE “3DTV” under Grant 511568.
References
1. Alexander, C., Ishikawa, S., Silverstein, M., et al.: A Pattern Language – Towns – Buildings – Construction. Oxford University Press, New York (1997)
2. Borchers, J.: A Pattern Approach to Interaction Design. Wiley, Chichester UK (2001)
3. Carroll, J.M.: Making use: Scenario based design of human computer interaction. MIT Press, New York (2000)
4. Chung, E.S., Hong, J.I., Lin, J., Prabaker, M.K., Landay, J.A., Liu, A.L.: Development and Evaluation of Emerging Design Patterns for Ubiquitous Computing. In: Proc. of Designing Interactive Systems (DIS) 2004, pp. 233–242. ACM Press, New York (2004)
5. van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites: Patterns, Principles, and Processes for Crafting a Customer-Centered Web Experience. Addison-Wesley, Boston MA (2003)
6. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns. Addison-Wesley, Reading MA (1995)
7. ISO 9241-11: Ergonomic Requirements for Office Work with Visual Display Terminals (VDTs) – Part 11: Guidance on Usability. International Organization for Standardization (ISO), Geneva (1998)
8. Mahemoff, M.J., Johnston, L.J.: Pattern Languages for Usability: An Investigation of Alternative Approaches. In: Proc. Australian Computer Human Interaction Conference OZCHI ’98, pp. 132–139. IEEE Computer Society, Adelaide Australia (1998)
9. Nielsen, J.: Usability Engineering. Morgan Kaufmann Academic Press, Boston (1993)
10. Preece, J., et al.: Human-Computer Interaction. Addison-Wesley, Wokingham, UK (1994)
11. Preece, J., Rogers, Y., Sharp, H.: Interaction Design: Beyond human-computer interaction. Wiley, New York (2002)
12. Rosson, M.B., Carroll, J.M.: Usability Engineering: Scenario-Based Development of Human-Computer Interaction. Morgan Kaufmann, San Francisco (2002)
13. Saponas, S.T., Prabaker, M.K., Abowd, G.D., Landay, J.A.: The Impact of Pre-Patterns on the Design of Digital Home Applications. In: Proc. of Designing Interactive Systems (DIS) 2006, pp. 189–198. ACM Press, New York (2006)
14. Tidwell, J.: Designing Interfaces. O’Reilly, Sebastopol CA (2005)
15. van Welie, M., van der Veer, G.C.: Pattern Languages in Interaction Design. In: Rauterberg, M., et al. (ed.) Proc. of IFIP INTERACT ’03, pp. 527–534. IOS Press, Amsterdam (2003)
Tasks Models Merging for High-Level Component Composition
Arnaud Lewandowski1, Sophie Lepreux2, and Grégory Bourguin1
1 Laboratoire d'Informatique du Littoral (LIL), 50 rue Ferdinand Buisson, F-62100 Calais, France {lewandowski,bourguin}@lil.univ-littoral.fr
2 University of Valenciennes, LAMIH, Le Mont-Houy, F-59313 Valenciennes Cedex 9 [email protected]
Abstract. As users become more and more demanding about the software environments they use, they need environments that let them integrate new tools in response to their emerging needs. However, most high-level component composition solutions remain out of reach for users. Building on an innovative approach that aims to provide more understandable components, we propose in this paper a new mechanism to assist high-level component composition. The approach realizes this composition by assembling tasks models. The assistance we propose is based on an adaptation of tree algebra operators and can automatically merge tasks trees in order to assist the integration of high-level components into a more global environment.
more easily integrated afterwards, especially through the use of tasks models. Following this approach, we propose to integrate high-level components into a global environment by assembling their individual tasks models into the more global task of the integrating environment. Interesting as it is, this proposition raises some questions. In this paper we focus particularly on the means that could assist the merging of several tasks models. In the second part of the paper, we present a solution developed in previous work on composing XML trees with specific operators. Since the tasks models of our components are described in XML documents, we propose in the third part of the paper to adapt this XML tree composition solution to the merging of several tasks models. This proposition aims to assist the composition of high-level components that have been developed according to the TO approach. Finally, we illustrate the proposition with an example of high-level component composition through the assisted merging of the components’ individual tasks models.
2 The Task-Oriented Design Approach

Software component composition is a large and complex research area, and many technical solutions have been proposed for it. For example, distributed components such as CORBA components [13], EJB (Enterprise JavaBeans) [1], or Web Services [4] have been conceived with their future integration in mind. Some of them are associated with composition languages [12] that allow the fine-grained integration of these components or services inside software applications. Such technical solutions are usable only by software development experts, because of their complexity, their implementation cost and the specificity of the techniques used [0]. However, these different methods follow the same principle: it is possible to dynamically discover objects on the Internet, to instantiate them, to discover their public methods and possibly their event channels, and to determine how the global environment will integrate and pilot them (through specific method calls). Useful as they are, these mechanisms mainly address the technical dimension of the problem. Indeed, finely and dynamically integrating a tool supposes not only that we are able to use it, but also that we understand how to use it. And even if some documentation support may exist (like the Javadoc for JavaBeans components, or WSDL descriptions for Web services), this semantic problem remains. Every developer has been faced with it: introspecting the list of the public methods of a specific component, even with their documentation, is generally not sufficient to integrate it finely. Knowing the methods does not tell you, for instance, in which order you have to call them for the component to work properly. The Task-Oriented (TO) approach is intended to answer this lack of semantics in high-level components and to raise the abstraction level of their composition.
Our approach is to consider that each high-level component aims at supporting a specific kind of activity. Our goal is to provide the means to contextualize the many tools or components involved in the realization of a global task. In other words, the global environment has to manage what we call the inter-activities [8], i.e. the links
existing between the activities supported by the many tools integrated in a global environment in order to support a composed and global activity. We then consider that each component supports the task it has been designed for. Indeed, the designer of a specific tool or high-level component has created the underlying mechanisms and its interface in order to provide adequate support for a specific generic task. Thus, a mailing component supports the realization of mailing tasks, a chat component supports synchronous discussion activities, etc. Contextualizing a tool is therefore equivalent to contextualizing an existing task within the frame of a more global task, such as Co-writing an article for HCI-International, where a mailing tool may be associated with a word processor, a chat and other tools. In order to facilitate this contextualization and to address the dynamic integration problems, we propose to make better use of the component’s tasks model, a kind of missing link that generally disappears between the design stage and the delivered code. Tasks models are generally used at the beginning of the software development process, but their use progressively fades during the process and finally disappears behind an object-oriented design approach inspired by the computer engineering background. This classical software design approach tends to transform tasks models into object models, from which the class-based structure of the produced component implicitly emerges. The original tasks model is swamped, implicitly inscribed in the complexity of the produced source code. Indeed, task-oriented approaches are rarely used, if at all, during the design and development cycle, that is, after requirements collection and analysis.
Nevertheless, at the stages where they are used, tasks models often serve as shared objects that improve communication between the many actors (including the future users) involved in the complex software development process. Tasks models also contain useful information describing how the tool functions, and help in understanding it. The Task-Oriented design approach aims to keep the benefits of tasks models during the whole software development process, and even during the composition or integration stages. In order to facilitate high-level component composition, the TO approach proposes to include the tasks model of a component within it [9]. This approach consists in explicitly preserving the links between the functional source code and the tasks model it is based on. Figure 1 summarizes part of the architecture of a high-level component developed according to the TO approach, or TO component. Not only does the embedded tasks model add some semantics to the component and help in understanding it, but it can also be used for its integration. Thus, the developer of a specific component can specify on its tasks model which parts (or subtasks) can be “shunted”, i.e. realized by the global environment. For example, in the chat component of Figure 1, the “connect” subtask could be shunted by the global environment calling the connect() method with the right arguments. This call may have the effect of realizing the “validate” subtask and, as a consequence, the “connect” subtask, skipping the corresponding interface. The purpose of this paper is not to describe in detail how these links between the functional code and the tasks model are kept inside the TO component. Briefly, the tasks model, described in an XML document, contains some information about the tasks that can be realized through the call of a specific method by any global environment integrating this component. For such tasks, a specific field indicates which method the task corresponds to. These specific methods are
grouped together into a kind of wrapper class so that the global integrating environment can make the appropriate calls to pilot the component. Such TO components are thus developed with their future integration in mind, since the designers/developers are able to put in them the appropriate methods that will shunt some of their subtasks and thereby adapt their behavior.
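The link between a task and the method that realizes it can be pictured with a small sketch. The paper does not give the XML schema or the wrapper interface, so everything below is an assumption made for illustration: the element and attribute names (task, name, method), the ChatWrapper class and the shunt helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tasks model for a chat component. The element and attribute
# names ("task", "name", "method") are assumptions made for this sketch.
CHAT_MODEL = """
<task name="chat">
  <task name="connect" method="connect">
    <task name="enter login"/>
    <task name="enter password"/>
    <task name="validate"/>
  </task>
  <task name="use"/>
</task>
"""


class ChatWrapper:
    """Hypothetical wrapper class grouping the methods an environment may call."""

    def __init__(self):
        self.connected_as = None

    def connect(self, login, password):
        # Stand-in for the component's real connection logic.
        self.connected_as = login
        return True


def shuntable_tasks(model_xml):
    """Map each task that declares a method link to the name of that method."""
    root = ET.fromstring(model_xml)
    return {t.get("name"): t.get("method")
            for t in root.iter("task") if t.get("method")}


def shunt(wrapper, model_xml, task_name, *args):
    """Realize a subtask by calling its linked method instead of showing its interface."""
    method_name = shuntable_tasks(model_xml)[task_name]
    return getattr(wrapper, method_name)(*args)


chat = ChatWrapper()
links = shuntable_tasks(CHAT_MODEL)
done = shunt(chat, CHAT_MODEL, "connect", "alice", "secret")
```

In this sketch the integrating environment never inspects the wrapper's public methods directly; it reads the tasks model to learn that "connect" can be shunted, and only then makes the call.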
Fig. 1. Architecture of a TO component
Therefore, high-level component composition could be realized through the composition, or merging, of several tasks models in a more global one, supposing that these components are “TO components” that include their tasks model. The global environment, managing the global task, could integrate the many tools by integrating their tasks models. One of the benefits of this approach is that it raises the abstraction level required for assembling components. It removes the need to look at the public methods of the component and to understand how to call them in order to properly integrate the component, since this information is obtained through the tasks model. The purpose of this paper is to propose means that can help this integration of tasks models and the merging of parts of them. The solution we propose is inspired by results obtained in the domain of XML tree composing applied to the merging of graphical user interfaces. We now introduce these previous results before presenting how we use them for TO components merging.
3 XML Tree Composing

A tasks model can be expressed in XML, and the resulting XML document can be viewed as a tree. We propose to use tree algebra to manipulate the XML document in which the tasks model is written. The TAX model (Tree Algebra for XML) [6] defines a data tree as a rooted, ordered tree, such that each node carries data (its label) in the form of a set of attribute-value pairs. Each node has a special, single-valued
attribute called tag whose value indicates its type. A node may have a content attribute representing its atomic value. Each node has a virtual attribute called pedigree, drawn from an ordered domain. The pedigree carries the history of “where it comes from”. Pedigrees play a central role in grouping, sorting and duplicate elimination. Originally proposed for database management, this model is also well suited to manipulating XML documents from the Human-Computer Interaction domain. In [7], for instance, some operators (such as the Union, Fusion, Selection, Difference and Equals operators) have been adapted in order to manipulate Graphical User Interfaces (GUIs) defined with the XML-based UsiXML UIDL [10]. In addition, a plug-in (ComposiXML) has been developed for the GrafiXML editor to compose GUIs [7]. The following example illustrates how this principle of XML tree composing, based on the TAX model and applied to GUIs, works. Figure 2 shows the tree representation of a Union operator applied to two input interfaces. This result is computed from two XML trees in the case of horizontal union (this layout precision is specific to the Concrete User Interface operator). The input interfaces and the resulting one (a Final User Interface on the Java platform) are presented in Figure 3. The Union operator creates a new window whose width is equal to the sum of the widths of the two input windows. It also adds a box in the tree to indicate the new type of layout (horizontal in this example). Then the duplicates are deleted; for instance, as the two buttons Save and Close appear on each form, they will be deleted from one of them (the user chooses which ones are deleted). To do this, the algorithm compares elements using the tag “default value” and the associated content. It uses the pedigree to know whether the parent elements are repetitive. In the example, the box (with type = horizontal) is repetitive if its children are two buttons with default values equal to Save and Close; the pedigree is used to determine this.
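The data-tree notions above (tag, content, pedigree) and the duplicate elimination performed by Union can be sketched as follows. This is a deliberately reduced illustration, not the ComposiXML implementation: the Node class, the stamp helper, the pedigree format and the rule that duplicates are always dropped from the second tree (rather than chosen by the user) are assumptions of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str                      # the node's type
    content: str = ""             # atomic value, if any
    pedigree: str = ""            # where the node comes from, e.g. "form1/0"
    children: list = field(default_factory=list)


def stamp(node, origin):
    """Record a pedigree on every node: source name plus the path from its root."""
    node.pedigree = origin
    for i, child in enumerate(node.children):
        stamp(child, f"{origin}/{i}")
    return node


def union(left, right, layout="horizontal"):
    """Put both trees under a new box, dropping children of `right` that duplicate `left`."""
    seen = {(c.tag, c.content) for c in left.children}
    kept = [c for c in right.children if (c.tag, c.content) not in seen]
    pruned = Node(right.tag, right.content, right.pedigree, kept)
    return Node("box", layout, "union", [left, pruned])


form1 = stamp(Node("window", "Customer", children=[
    Node("input", "Name"), Node("button", "Save"), Node("button", "Close")]), "form1")
form2 = stamp(Node("window", "Order", children=[
    Node("input", "Item"), Node("button", "Save"), Node("button", "Close")]), "form2")

merged = union(form1, form2)
remaining = [(c.content, c.pedigree) for c in merged.children[1].children]
```

After the union, the second form keeps only its Item field; the pedigree still records that this node came from position 0 of form2, which is the kind of provenance information the text says is used when deciding whether elements are repetitive.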
Fig. 2. The tree resulting from the application of a Union operator on two interfaces
We think that this principle based on tree algebra can also be applied at a higher level, at the tasks model level. That is what we now illustrate through an example, showing how the XML tree composing approach can assist the TO components composition problem.
Fig. 3. Union of two user interfaces without repetition of the common part. The resulting interface is described by the tree of Figure 2.
4 Tasks Models Merging Using Composition Operators

4.1 The Composition Problem

We start from the assumption that we have at our disposal two components, a chat tool and a shared whiteboard, that have been developed according to the Task-Oriented design approach. The architecture of each of these two stand-alone components is therefore similar to the one illustrated in Figure 1. As explained earlier, the use of tasks models should ease the composition of such components. Indeed, assembling such components, in which the tasks model is linked to the functional code, can be realized through the integration, or linkage, of their individual tasks models into a more global tasks model, that of the global environment. As each individual tasks model is linked to the code of the component it describes, the global environment will be able to know which methods to call for this component to be properly integrated. The composition principle based on tree algebra provides assistance for realizing this integration by helping to merge several subtasks.
Fig. 4. Similar parts between the Chat tasks model and the Whiteboard tasks model. Using the tree algebra composition approach, we want to assist the merging of these two subtasks so that the global environment integrating the two components can handle the two connection processes simultaneously.
The embedded tasks models of these two high-level components appear in Figure 4. Both include a specific and similar connection subtask corresponding to a specific interface asking the user for his/her login and password (and possibly other information specific to each component, like the channel for the chat, or the board to join for the whiteboard). If we integrate these tools in the global environment without any specific merging, the environment will just launch the tools without any particular configuration, and the user will have to identify him/herself twice (once for the chat and once for the whiteboard), since each has its own required “connect” or “identification” subtask. We propose a mechanism to assist the integration of these tasks models by merging the appropriate subtasks.

4.2 Tasks Models Composing Using Tree Algebra

We can imagine several composition scenarios. In this paper, we focus on the following one: we want the global environment to carry out the connection phase (that is, the “connect” subtask for the chat and the “identification” subtask for the whiteboard) at the very beginning; after that, the user will be able to use both tools in parallel. Looking at the tasks models of these two components (see Figure 4), assisting this scenario consists in: 1) extracting the two subtasks, or sub-trees, tied to the connection processes; 2) merging these two subtasks into one; and finally 3) plugging into the global tasks model the three resulting sub-models (the one containing the result of the merging, and the tasks models of the two components). To realize this, we adapt the tree composing approach presented before.
Fig. 5. Part of the XML tree obtained by transformation of the tasks model of the chat component. Some examples of the key concepts used during tree algebra transformation appear on it: (a) a tag, (b) a tag’s content, and (c) a pedigree.
The tasks models are described and stored in XML documents. The first step consists in transforming them into XML trees. Figure 5 illustrates this transformation applied to the tasks model of the chat component. These XML trees
serve as the basis for the assisted composition. According to our scenario, we want to merge the Connect subtask of the chat and the Identification subtask of the whiteboard. In terms of the tree algebra and the work on GUI composition presented in part 3, this corresponds to the Union operator. There are many ways to implement a Union operator between two trees. The most immediate possibility consists in creating a new tree whose root identifier is equal to the name of the global task; a subtask is then created and the two input trees (corresponding to the tasks models of the two TO components) are simply plugged into this subtask. However, this basic solution does not merge the similar connection subtasks of the two components; it does not address the problem of the repeated connection process. A second alternative consists in deleting one of the two similar subtasks. This was the choice made for the implementation of the Union operator in the context of GUI composition (cf. part 3). In this case, two solutions are possible: deleting either the connect subtask of the Chat or the identification subtask of the shared whiteboard. But in both solutions a problem remains: in the global environment, the user should be able to use the two components in parallel. If we delete one of the two similar subtasks (the connect subtask of the Chat, for example) and perform every necessary connection in the remaining one (the identification subtask of the whiteboard), the resulting tree is not coherent, since the Chat task may be initiated before the connection, which now stands in the whiteboard. The enabling relationship between the “connect” and the “use” subtasks of the Chat rules out this solution. The other solution, keeping the Chat connection and removing the whiteboard identification, presents the same problem. The algorithm we propose to implement the Union operator between two tasks trees is the following.
It first creates a new sub-tree that will contain the merging of the two similar subtasks. This merging is possible since both connection subtasks have a similar structure. The algorithm imports one of them into the new tree, and adds to it the missing subtasks appearing in the second one, according to their order in the corresponding input tree. In order to respect the enabling relationship between the connection subtask and the use subtask of each component, the created task must be placed before the components’ individual tasks models in the global model. Once the two subtasks have been merged, only one interface is presented to the final user at the beginning, and s/he fills in the form only once. The task generated by the merging takes charge of the necessary method calls on the two components so that their own “connect” subtasks are effectively realized. This is possible because each TO component contains the links between some tasks of its model and specific methods of its code (cf. part 2). The resulting behavior is illustrated in Figure 6 by the tasks model of the global environment, which integrates the result of the merging process and the tasks models of the two components. As we can see, the enabling relationship is kept between the merged subtask that realizes the connection processes and the parallel use of the two TO components. This simple example illustrates the kind of merging this approach is able to realize, depending on the choices of the user who finally assembles the components. This proposition assists the integration of individual tasks models. If the person in charge of this integration specifies which subtasks are similar and should be merged (in our example, the two “connect” subtasks), our solution is able to merge these subtasks into one task in the global model, and possibly merge the corresponding interfaces too.
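The Union algorithm described above can be sketched in a few lines, under several assumptions that are not taken from the paper: the Task class, the subtask names, and the operator symbols (">>" for enabling, "|||" for parallel use, borrowed from CTT-style notations) are hypothetical. Dropping each component's own connection subtask from its adapted model stands in for shunting it.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    operator: str = ""            # relation between children: ">>" enabling, "|||" parallel
    children: list = field(default_factory=list)


def find(model, name):
    """Return the direct subtask of `model` called `name`."""
    return next(c for c in model.children if c.name == name)


def merge_subtasks(first, second, name):
    """Import the children of `first`, then insert those only `second` has,
    each one right after the sibling that precedes it in `second`."""
    merged = list(first.children)
    position = 0
    for child in second.children:
        names = [c.name for c in merged]
        if child.name in names:
            position = names.index(child.name) + 1
        else:
            merged.insert(position, child)
            position += 1
    return Task(name, first.operator, merged)


def without(model, subtask_name):
    """Adapted copy of a component model whose given subtask is handled elsewhere."""
    return Task(model.name, model.operator,
                [c for c in model.children if c.name != subtask_name])


def union(global_name, a, a_sub, b, b_sub):
    """Global model: the merged subtask enables the parallel use of both components."""
    merged = merge_subtasks(find(a, a_sub), find(b, b_sub), "connect")
    tools = Task("use tools", "|||", [without(a, a_sub), without(b, b_sub)])
    return Task(global_name, ">>", [merged, tools])


chat = Task("chat", ">>", [
    Task("connect", ">>", [Task("enter login"), Task("enter password"),
                           Task("choose channel"), Task("validate")]),
    Task("use chat")])
board = Task("whiteboard", ">>", [
    Task("identification", ">>", [Task("enter login"), Task("enter password"),
                                  Task("choose board"), Task("validate")]),
    Task("use whiteboard")])

env = union("global task", chat, "connect", board, "identification")
merged_names = [c.name for c in env.children[0].children]
```

Here the merged connection task lists the login and password once, followed by the board and channel choices and a single validation; it is placed before the parallel block so that the enabling relationship with each component's use subtask is preserved.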
Fig. 6. Part of the tasks model of the global environment after the merging process. It contains first the tasks model resulting from the merging process, and the two adapted tasks models of the integrated components.
5 Conclusion and Future Work

Providing means that allow users to adapt their software environments by finely and dynamically integrating high-level components is truly a challenge. A first problem lies in the fact that components are generally hard to understand. We have presented the Task-Oriented approach, which proposes a new way to construct more understandable high-level components, especially through the use of tasks models. According to this approach, assembling TO components means composing tasks models. In order to assist this integration, we have proposed a mechanism inspired by previous work on tree algebra applied to GUI composition. The individual tasks models of TO components are described in XML documents, so assembling tasks models can be seen as assembling XML trees, and tree operators can be adapted to this specific domain. We have illustrated this approach with an example: the integration of two TO components with the automatic merging of two similar subtasks they share. The tasks models are transformed into trees, on which we apply a Union operator that merges the identified similar subtasks. The resulting tasks models are then integrated into the global model of the integrating environment. Even if this example seems very specific, the approach can be extended to other operators and technologies. The only requirement is that the high-level components be developed according to the TO approach. We are now pursuing our efforts to generalize this approach and provide more efficient assistance to high-level component composition by users.

Acknowledgments. The present research work has been supported by the “Ministère de l'Education Nationale, de la Recherche et de la Technologie”, the “Région Nord Pas-de-Calais” and the FEDER (Fonds Européen de Développement Régional) during the projects MIAOU and EUCUE. The authors gratefully acknowledge the support of these institutions. The authors also thank Jean Vanderdonckt for his contribution concerning UsiXML.
References
1. Blevins, D.: Overview of the Enterprise JavaBeans Component Model. In: [5], pp. 589–606 (2001)
2. Clerckx, T., Luyten, K., Coninx, K.: The Mapping Problem Back and Forth: Customizing Dynamic Models while preserving Consistency. In: TAMODIA 2004, 15-16 November, Prague, Czech Republic (2004)
3. Cubranic, D., Murphy, G.C., Singer, J., Booth, K.S.: Learning from project history: a case study for software development. In: Proc. of CSCW04, pp. 82–91. ACM Press, New York (2004)
4. Ferris, C., Farrel, J.: What are web services? Comm. of the ACM 46(6), 31 (2003)
5. Heineman, G.T., Councill, W.T. (eds.): Component-based software engineering: putting the pieces together. Addison-Wesley Longman Publishing Co., Inc, Boston (2001)
6. Jagadish, H.V., Lakshmanan, L.V.S., Srivastava, D., Thompson, K.: TAX: A Tree Algebra for XML. In: Ghelli, G., Grahne, G. (eds.) DBPL 2001. LNCS, vol. 2397, pp. 149–164. Springer, Heidelberg (2002)
7. Lepreux, S., Vanderdonckt, J., Michotte, B.: Visual Design of User Interfaces by (De)composition. In: Proc. of DSV-IS 2006 (Dublin, Ireland, July 26–28, 2006), pp. 26–28. Springer, Heidelberg (2006)
8. Lewandowski, A., Bourguin, G.: Inter-activities management for supporting cooperative software development. In: Nilsson, et al. (eds.) Advances in Information Systems Development: Bridging the Gap between Academia and Practice, vol. 1, pp. 155–167. Springer, Heidelberg (2005)
9. Lewandowski, A., Bourguin, G., Tarby, J.C.: Les Modèles de Tâches pour la Contextualisation des Composants. In: Proc. of the 10th Intern. Conf. Ergo’IA, Bidart/Biarritz, France, pp. 147–154 (October 11-13, 2006)
10. Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López, V.: UsiXML: a Language Supporting Multi-Path Development of User Interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, pp. 200–220. Springer, Heidelberg (2005)
11. Suchman, L.: Plans and Situated Actions. Cambridge University Press, Cambridge (1987)
12. Van der Aalst, W.: Don’t go with the flow: Web services composition standards exposed. Trends Controversies Jan/Feb 2003 issue of IEEE Intelligent Systems (2003)
13. Wang, N., Schmidt, D.C., O’Ryan, C.: Overview of the CORBA component model. In: [5], pp. 557–571 (2001)
Application of Visual Programming to Web Mash Up Development
Seung Chan Lim, Sandi Lowe, and Jeremy Koempel
MAYA Design, 2730 Sydney St, Building 2 Suite 300, Pittsburgh, PA 15203, USA {slim, lowe, koempel}@maya.com
Abstract. The ongoing adoption of the latest Web development patterns such as AJAX is helping to enhance the user experience on the Web. Moreover, there is now API-based support from various vendors that allows seamless fusion of disparate data sources into a single application. However, the barrier for Web designers to integrate such features into their Web applications remains high, which hampers a wider proliferation of such novel Web applications. In this paper, we conduct an experiment to see whether visual programming is appropriate for allowing Web designers to integrate the aforementioned features. For the experiment, we have developed a prototype, tentatively named WIPER, that allows Web designers to incorporate pre-built JavaScript components into live Web pages using drag-and-drop. We combined rapid revision with usability testing to iteratively advance our prototype. Working with users, we have learned that, with some targeted refinements, the visual programming paradigm can be very effective in achieving our goal.
Keywords: Visual Programming, Dataflow Architecture, JavaScript, Rapid Prototyping, End-User Programming.
Building such rich Web applications typically involves extensive use of JavaScript. JavaScript is used to modify the parts of the Web page that need updating, and to also retrieve remote data using patterns such as AJAX. Although several third-party JavaScript code samples exist for download, integrating and configuring them has proven difficult for Web designers. In this paper, we conduct an experiment to see if the paradigm of visual programming can help Web designers to more easily employ modern Web development paradigms such as AJAX and other scripted real-time interactions. We define Web designers as those who have enough skill and inclination to mock-up pages, but struggle with or avoid the programming required to add rich interactions to their designs. Visual programming is a graphical paradigm that has been tested under various conditions to facilitate end-user programming [4,5,6,7,8,9]. We combined rapid iteration with usability testing to advance our prototype visual programming tool, tentatively named WIPER. Through the course of this work we tested the usability of the tool with fourteen Web designers by asking them to complete an identical Web development task. The task involved the creation of a rich Web application that allows page visitors to access, display and interact with data retrieved from a remote server using the AJAX pattern. Incremental design modifications were made to our prototype after each user test until consecutive user tests indicated that users were having little difficulty accomplishing the task at hand. Each user session was preceded by a brief demonstration that illustrated dragging components, consulting the help text, wiring components and testing the page.
2 Related Work

Visual programming has been a topic of interest to many researchers for over two decades. Examples of visual programming environments range from the early works of Pygmalion [10] and Prograph [11] to VIPERS [12], LabView [13] and Interstacks [14]. WIPER builds on the interaction paradigms found in these systems. The most popular examples of Web programming tools include Macromedia Dreamweaver [15] and Adobe GoLive [16]. Although these WYSIWYG editors help style and lay out the contents of a Web page in a visual manner, they do not provide any visual means to integrate and configure third-party JavaScript components that provide rich interaction patterns and/or access remote services via AJAX. Examples of work that focused on the design of a GUI allowing access to remote services include efforts by Wolber et al. [17] and Mosconi et al. [18]. Wolber et al. proposed a WYSIWYG editor that could directly access relational databases. The work by Mosconi et al., named “Alligator,” used the visual programming paradigm to allow the design of Web application flow that involved the execution of URL-addressable server-resident programs. Neither of the two efforts involved the incorporation of rich interaction patterns. Other examples of end-user programming tools intended for the Web include systems such as DENIM [19] and the image-oriented tools proposed by Takao
Application of Visual Programming to Web Mash Up Development
Shimomura [20]. DENIM focused primarily on the site-map design phase of Web development, and the work by Takao Shimomura concentrated on the traditional full-page refresh paradigm. Although there has been extensive research into systems intended to script dataflow among disparate large-grain Web applications [21,22,23,24,25], none has focused on providing small-grain dataflow among JavaScript components within a Web page. Recent efforts by Datamashup [26] also showcase interesting solutions for helping end users build Web mash ups more easily.
3 Initial Design
WIPER can be launched directly from the FireFox Web browser. In the initial design, right-clicking on the content area of the Web page and then choosing “Edit HTML using WIPER” from the context menu spawned WIPER in a new window (Fig 1). JavaScript components are integrated into the Web page by dragging a component from the library (the white sidebar shown in Fig 1) and dropping it onto the light blue canvas region of the WIPER interface. Once a component is integrated, it can be made to communicate with other components using the traditional wiring paradigm seen in other visual programming environments. Wiring is also done by drag and drop. Dragging from an output terminal of a component dispenses a wire. Dropping the wire on the input terminal of another component creates a connection (Fig 2). The visual programming tool provides three in-context interfaces: a text-based property editor, a text-based HTML editor, and a help documentation viewer (Fig 2).
4 Implementation
WIPER is built on the JavaScript Dataflow Architecture (JDA) [27]. JDA allows Web applications to be built from JavaScript black-box components that pass messages to one another via arbitrarily complex dataflow channels. WIPER is implemented as an extension to the FireFox Web browser. This approach has three main advantages. First, users can access the tool to edit the Web page they are currently viewing without having to launch another application. Second, WIPER can directly manipulate the target Web page in real time. Third, users can immediately test the resulting Web page as viewed in the FireFox Web browser. WIPER prefixes all modifications it makes to the HTML of the target Web page with well-documented comments, and keeps them cleanly separated from the pre-existing content. The resulting HTML is also free of any automatically generated JavaScript. This makes it easier to open the resulting HTML in a text or WYSIWYG editor for further editing.
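The wiring idea behind a dataflow architecture of this kind can be illustrated with a small sketch. It is written in Python for brevity rather than JavaScript, and it is not the JDA API: the `Component` class, its methods, and the terminal names are assumptions made purely for illustration. The "fetcher" fabricates data locally in place of a real AJAX call.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple


class Component:
    """A black-box component with named terminals (hypothetical API, not JDA)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Input terminal name -> handler invoked when a message arrives.
        self._handlers: Dict[str, Callable[[Any], None]] = {}
        # Output terminal name -> list of (component, input terminal) wired to it.
        self._wires: Dict[str, List[Tuple["Component", str]]] = defaultdict(list)

    def on_input(self, terminal: str, handler: Callable[[Any], None]) -> None:
        """Declare an input terminal and the code that reacts to its messages."""
        self._handlers[terminal] = handler

    def wire(self, output: str, target: "Component", target_input: str) -> None:
        """Connect one of this component's outputs to another component's input."""
        if target_input not in target._handlers:
            raise ValueError(f"{target.name} has no input terminal {target_input!r}")
        self._wires[output].append((target, target_input))

    def emit(self, output: str, message: Any) -> None:
        """Send a message to every input terminal wired to the given output."""
        for target, target_input in self._wires[output]:
            target._handlers[target_input](message)


# Three illustrative components: a text field, a fake remote fetcher, a viewer.
text_field = Component("text-field")
fetcher = Component("photo-fetcher")
viewer = Component("photo-viewer")
displayed: List[str] = []

# Stand-in for a remote request: photo names are made up from the user name.
fetcher.on_input(
    "username-in",
    lambda user: fetcher.emit("response-out", [f"{user}-photo-{i}.jpg" for i in (1, 2)]),
)
viewer.on_input("photos-in", displayed.extend)

text_field.wire("value-out", fetcher, "username-in")
fetcher.wire("response-out", viewer, "photos-in")

text_field.emit("value-out", "alice")
print(displayed)  # ['alice-photo-1.jpg', 'alice-photo-2.jpg']
```

Because components only know their own terminals, the same fetcher could be rewired to a different viewer without changing its code, which is the property a visual wiring tool relies on.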
S.C. Lim, S. Lowe, and J. Koempel
Fig. 1. WIPER launched in a new window
Fig. 2. Users can look up help documentation for any given component
5 Method
We designed a think-aloud usability test consisting of three parts. First, we provided a brief tutorial of the tool and gave the user the opportunity to ask questions about it. Next, we provided the user with a pre-existing, plain-vanilla Web page: “My Cool Blog”. The first task was as follows: “This is your cool blog. You want to let your blog visitors flip through the photos from your Flickr account on your blog page.” This task required the user to connect several components and change both their property configurations and their HTML configurations. This task seemed to successfully familiarize the user with the application. When the user had progressed through this task, we asked them to: “Please move this feature to the bottom of the right-hand column.” This task required the user to shift their focus away from the original WIPER prototype and complete the task by manipulating the HTML and CSS.
We modified the prototype immediately when we identified interface problems. This gave us the opportunity to verify whether our solutions were having the desired effect.
6 Rapid Design Iterations
In this section we detail the observations made over the course of fourteen user tests and four design iterations.
6.1 Observation: Clarify the Relationship Between Parts
Our first user test indicated that it could be difficult for users to form a correct mental model of how WIPER affects the HTML and CSS of the Web page. Our initial design had separated the resulting Web page from the editing environment (Fig 1). Web designers could add components to the page with the prototype environment, but were required to edit the HTML and CSS files to reposition or resize the components. This is why the second task asked users to “Please move this feature to the bottom of the right-hand column.” The Web designers we tested had a difficult time distinguishing what the prototype could do from the design work they were required to perform manually.
6.2 Observation: Visual Programming: Advantages and Responsibilities
All of our users made positive remarks about the visual programming aspects of the system. They liked the ability to drag and drop components onto the canvas and the immediate visual feedback provided by our tool. However, the visual programming metaphor places an additional responsibility on the tool designers. Users expected that dragging a component to a particular place on the canvas would result in a related position on the Web page. For example, when users dropped a component on the bottom right of the canvas, they expected it to show up on the bottom right of the Web page.
6.3 Refinement: Side-by-Side View
Users’ difficulty understanding the relationship between parts and their expectations for more direct manipulation led us to three important changes. The first change is the side-by-side design view, which increases the visibility of the system. When the WIPER interface is invoked, a new frame slides in from the left, pushing the Web browser frame to the right. The resulting interface presents two frames situated side-by-side. For reference, the original Web browser view can be seen in Fig 3, and the same view after WIPER has been invoked can be seen in Fig 4. As seen in Fig 4, the frame on the left houses the visual programming environment, and the frame on the right contains the current Web page. The side-by-side view was chosen for two reasons. First, the proximity between the visual programming environment and its output allows users to receive real-time
feedback as they modify the page. Second, the side-by-side view allows for better use of screen real estate. At most popular resolutions, a floating window housing the visual programming environment proved too difficult to use due to its size. In the side-by-side view, the user can resize either frame horizontally to take up more or less space. The component library was moved to the bottom of the screen. To return to the normal Web browser view, users can click on the red “x” situated to the right of the left-hand frame.
6.4 Refinement: Screen Overlay and the Use of Halos
Our second significant change is the use of a screen overlay and halos to help users understand that they are in an editing mode and to provide a visual connection between the components on the left-hand panel and their resulting display elements on the Web page. As seen in Fig 4, when WIPER opens, a semi-transparent screen overlay is cast on top of the original Web page. The screen overlay is designed to achieve two goals. First, it indicates to users that the Web page they are working on has now entered a non-interactive editing mode. Second, it provides a platform on which halos can be rendered around any visualized JavaScript components they have incorporated thus far. The color of each halo on the Web designer’s page (right-hand side) matches the halo on its JavaScript component counterpart in the visual programming environment (left-hand side). This design is intended to help users better associate the component as displayed in the visual programming environment with its rendering in the Web browser. This color coordination was used throughout the interface, even when the in-context interfaces were accessed. This helped users remember which component they were editing and retain the visual connection between the component and its rendering on the Web page. Relocating and resizing elements on the screen overlay moves the corresponding HTML elements in real time.
6.5 Refinement: Explicit Naming Scheme
To correctly communicate the real behavior of the components, we first removed the rollover effect so that the labels would no longer look clickable. Next, we relabeled the component terminals to better communicate their function. For example, on component A the “username” label became “username-in,” indicating that the username should originate from some other component rather than be set by modifying component A. Similarly, “response” became “response-out,” indicating that the response should be sent out of the component rather than be wired to loop within it. These changes appear to have solved the problem: we did not observe it with any subsequent user.
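One side benefit of direction-bearing labels is that a tool could check a wire mechanically. The sketch below is only an illustration of that idea, assuming the “-in” and “-out” suffixes described above; the function names are hypothetical and nothing here is taken from WIPER itself.

```python
def terminal_direction(label: str) -> str:
    """Return 'in' or 'out' based on the label suffix (assumed convention)."""
    if label.endswith("-in"):
        return "in"
    if label.endswith("-out"):
        return "out"
    raise ValueError(f"label {label!r} does not declare a direction")


def can_wire(source_label: str, target_label: str) -> bool:
    """Accept a wire only when it runs from an output terminal to an input terminal."""
    return (
        terminal_direction(source_label) == "out"
        and terminal_direction(target_label) == "in"
    )


print(can_wire("response-out", "username-in"))  # True
print(can_wire("username-in", "response-out"))  # False
```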
Fig. 3. Original Web page as viewed in FireFox
Fig. 4. FireFox with WIPER open
6.6 Observation: Consistent, Immediate Feedback Is Crucial
In our third iteration, with some of the other problems apparently resolved, we were able to diagnose a feedback problem. We realized that our prototype was misleading users with inconsistent behavior. In iterations 2 and 3, users were provided with some immediate feedback. For example, changing a text label would render instantly on the resulting page. Other actions required users to manually exit the editing mode and refresh the page. Users interpreted the lack of real-time feedback as a sign that they had performed an incorrect action, and would continue to modify their work, often “breaking” work that was in fact correct and only needed to be tested.
6.7 Refinement: Provision of Real-Time Feedback
One of the most important lessons learned during the iterative design process was to make sure that users are provided with consistent real-time feedback for all of their actions. We updated the prototype such that any dataflow configuration changes, component additions and/or customizations are also reflected immediately. This means that the Web browser always displays the most up-to-date result of user modifications. When we implemented this change we observed considerable improvement in users’ comfort with the task. To test the Web page, for example to check whether a button click responds as intended, the designer should close the visual programming environment and return to the normal Web browser view. The last few users did not exhibit any problem understanding the need to exit editing mode to test interactive elements like button behavior.
7 Conclusion and Future Work
Throughout this research, we were able to incorporate several important refinements to our prototype visual programming tool. By the end, two consecutive users succeeded in the task with very little difficulty. Our work has shown that with targeted refinements, the paradigm of visual programming can successfully help Web designers take advantage of modern Web techniques such as AJAX and other scripted real-time interaction capabilities. Our work also opens up interesting avenues for further exploration. As the success of the visual programming paradigm at large relies heavily on the provision of a sizable component market, the question of scale is unavoidable. The ability to gracefully scale the component library remains an issue. As the number of components grows and their variety expands, the issue of message formats will arise. More specifically, it will be important for the tool to provide adequate feedback on whether the output originating from one component is appropriate to use as input to another component. We can envision the use of visual cues to indicate such compatibility, coupled with additional real-time feedback from the tool that mimics the attract/repel phenomenon of a magnet. A feature of JDA not addressed by this prototype is the notion of hierarchical component encapsulation and decomposition. This also directly affects the optimal use of screen real estate dedicated to the visual programming canvas. We believe one of the best approaches to managing the complexity of large-scale applications is hierarchical decomposition. Further, being able to encapsulate a number of components in one larger package would be a novel way to introduce refactoring. We believe it is important to provide features in the tool that directly address these two paradigms. The direct manipulation feature provided by the tool currently addresses only fixed positions and dimensions.
Although one can easily load the HTML modified by the tool in a text or a WYSIWYG editor to take advantage of the dynamic layout capabilities of HTML, additional work will be needed to provide such functionality directly from the tool’s interface.
A minor hindrance to the overall experience is the free-form text editor used by the in-context editing interfaces, which could be improved with columnar input fields like those found in tools such as Visual Basic.
References
1. Garrett, J.: Ajax: A New Approach to Web Applications. http://www.adaptivepath.com/publications/essays/archives/000385.php
2. Wikipedia: mashup (Web Application Hybrid). http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29
3. Google Maps Mania. http://googlemapsmania.blogspot.com/
4. Johnston, W.M., Hanna, J.P., Millar, R.J.: Advances in dataflow programming languages. ACM Comput. Surv. 36(1), 1–34 (2004)
5. Myers, B.A.: Visual programming, programming by example, and program visualization: a taxonomy. In: Mantei, M., Orbeton, P. (eds.) Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA, April 13 - 17, 1986). CHI ’86, pp. 59–66. ACM Press, New York (1986)
6. Meyer, R.M., Masterson, T.: Towards a better visual programming language: critiquing Prograph’s control structures. In: Proceedings of the Fifth Annual CCSC Northeastern Conference on the Journal of Computing in Small Colleges (Ramapo College of New Jersey, Mahwah, New Jersey, United States). Meinke, J.G. (ed.) Consortium for Computing Sciences in Colleges. pp. 181–193 (2000)
7. Schmucker, K.J.: Rapid prototyping using visual programming tools. In: Tauber, M.J. (ed.) Conference Companion on Human Factors in Computing Systems: Common Ground (Vancouver, British Columbia, Canada, April 13 - 18, 1996). CHI ’96, pp. 359–360. ACM Press, New York (1996)
8. Whitley, K.N., Blackwell, A.F.: Visual programming: the outlook from academia and industry. In: Wiedenbeck, S., Scholtz, J. (eds.) Papers Presented At the Seventh Workshop on Empirical Studies of Programmers (Alexandria, VA, USA). ESP ’97, pp. 180–208. ACM Press, New York (1997)
9. Wilcox, E.M., Atwood, J.W., Burnett, M.M., Cadiz, J.J., Cook, C.R.: Does Continuous Visual Feedback Aid Debugging in Direct-Manipulation Programming Systems? In: Proceeding of CHI’97 (Atlanta, GA), ACM/SIGCHI,
10. Smith, D.C.: Pygmalion: a Creative Programming Environment. Doctoral Thesis. UMI Order Number: AAI7525608 (1975)
11. Cox, P.T., Mulligan, I.J.: Compiling the graphical functional language PROGRAPH. In: Proceedings of the 1985 ACM SIGSMALL Symposium on Small Systems (Danvers, Massachusetts, United States). SIGSMALL ’85, pp. 34–41. ACM Press, New York (1985)
12. Bernini, M., Mosconi, M.: VIPERS: a data flow visual programming environment based on the Tcl language. In: Costabile, M.F., Catarci, T., Levialdi, S., Santucci, G. (eds.) Proceedings of the Workshop on Advanced Visual interfaces (Bari, Italy, June 01 - 04, 1994). AVI ’94, pp. 243–245. ACM Press, New York (1994)
13. National Instruments LabView, http://www.ni.com/labview
14. http://www.maya.com/web/what/papers/maya_interstacks_scripting.pdf
15. Macromedia Dreamweaver, http://www.macromedia.com/dreamweaver
16. Adobe GoLive, http://www.adobe.com/golive
17. Wolber, D., Yingfeng, S., Yih Tsung, C.: Designing Dynamic Web Pages and Persistence in the WYSIWYG Interface. In: Proceedings of IUI’02 (San Francisco, CA), ACM/SIGCHI/SIGART, NY, pp. 228–229 (2002)
18. http://www2003.org/cdrom/papers/poster/p326/XHTML/p326-mosconi.html
19. Lin, J., Newman, M.W., Hong, J.I., Landay, J.A.: DENIM: Finding a Tighter Fit between Tools and Practice for Web Site Design. In: Proceedings of CHI’00 (The Hague, The Netherlands), ACM/SIGCHI, NY, pp. 510–517 (2000)
20. Shimomura, T.: A Page-Transition Framework for Image-Oriented Web Programming. ACM SIGSOFT Software Engineering Notes, vol. 29(2), pp. 10–10. ACM, NY (2004)
21. Bauer, M., Dengler, D.: InfoBeans - Configuration of Personalized Information Services. In: Proceedings of IUI’99 (LA, USA), pp. 153–156 (1999)
22. Davis, H.C., Hall, W., Heath, I., Hill, G., Wilkins, R.: Towards an Integrated Information Environment with Open Hypermedia Systems. In: Proceedings of ECHT’92 (Milan, Italy), pp. 181–190 (1992)
23. Ito, K., Tanaka, Y.: A Visual Environment for Dynamic Web Application Composition. In: Proceedings of SIGWEB’03 (Nottingham, UK), ACM/SIGWEB, NY, pp. 184–193 (2003)
24. Kistler, T., Marais, H.: WebL - A Programming Language for the Web. In: Proceedings of WWW7, (Brisbane, Australia, 1998), Computer Networks, vol. 30, No.1-7, pp. 259–270 (1990)
25. Sahuguet, A., Azavant, F.: Building Intelligent Web Applications Using Lightweight Wrappers. Data & Knowledge Engineering 36(3), 283–316 (2001)
26. http://datamashup.com
27. Lim, S.C., Lucas, P.: JDA: a step towards large-scale reuse on the web. In: Companion To the 21st ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications. OOPSLA ’06, Portland, OR, USA, October 22 - 26, pp. 586–601. ACM Press, New York (2006)
Comprehensive Task and Dialog Modelling
Víctor López-Jaquero and Francisco Montero
Laboratory on User Interaction & Software Engineering (LoUISE), University of Castilla-La Mancha, 02071 Albacete, Spain
{victor, fmontero}@dsi.uclm.es
Abstract. Task modelling has proven useful as a basis for user interface (UI) design. Although different notations have been proposed, the ConcurTaskTrees (CTT) notation has become the most widespread notation for task model specification. However, this notation suffers from a lack of modularity, making the creation and modification of real-world applications a cumbersome process. This paper describes a notation inspired by CTT that allows the intuitive specification of both the tasks the user is supposed to perform through the user interface and the dialog between the user and the user interface. Furthermore, the notation makes use of an abstract operation set to help in the automatic or semi-automatic generation of a user interface that conforms to the specified model.
Keywords: User interface design, abstract user interfaces, task models, dialog models.
models are used to generate automatically or semi-automatically a user interface compliant with the requirements captured in these models. The transformation from a set of declarative models into a running user interface can follow different approaches. Nevertheless, the most widely used approaches take as the cornerstone of their design process either a task model [18][15], a domain model [1] or both [11]. This paper introduces a visual notation for task model specification that is used within a model-based approach, AB-UIDE [11], as the main model guiding the whole model-based design method. This notation takes inspiration from the ConcurTaskTrees notation [15], introducing greater modularity and including dialog modelling. The visual notation has been designed to be close to the UML statecharts notation [7], to make it easier for the large community of UML practitioners to get into model-based user interface development.
2 From Domain Model to User Interface Generation
A domain model encapsulates the important entities of a particular application domain together with their attributes, methods and relationships. Within the scope of UI development, it describes the objects that the user requires in order to carry out his tasks. Most applications rely on a database to achieve their objectives. This data dependency inspired several projects aimed at automatically generating a user interface from the data it was supposed to handle. Examples of such projects are Janus [1] and Teallach [6]. Although these domain-based approaches are useful for quickly generating a user interface to access some data, the usability of the resulting user interface is rather low. Domain-based generation approaches produce complex user interfaces, because users see many elements at the same time. Moreover, since user tasks are not taken into account, the dialog within the user interface is rather limited and constrained, producing quite static user interfaces. Another drawback of domain-based user interface generation is the lack of proper grouping of the elements the user requires to perform a task, which reduces the productivity of the application. Next, we elaborate on the task-driven model-based approach we use, and on the arguments that directed us towards this solution.
3 From Task Model to User Interface Generation
A task model describes the tasks the user is allowed to perform through the user interface. A task model can be expressed in many different ways, some of them coming from the Software Engineering community, such as UML statecharts, activity or use case diagrams [16] or Petri nets [2], and some more specific to the human-computer interaction community, such as CTT or User Action Notation (UAN) [8]. The derivation of a user interface from a task model adds an additional view to the design process: the user. Thus, taking into account which tasks the user is allowed to
perform and the temporal relationships between those tasks, it is possible to increase the overall system usability, for example by grouping related widgets or by hiding all or most of the information irrelevant to the current task. Therefore, relying on a task model for user interface generation rather than just on a domain model is an important step towards improving the usability of user interfaces built by applying model-based techniques.
3.1 Task and Domain: A Marriage of Convenience
A task model by itself is not enough to generate a high-quality user interface; additional information is required. Although the task model includes information about the tasks the user is supposed to carry out with the application, it does not include information about the data that those tasks require. Thus, we find it necessary to relate the tasks to the data they require. Therefore, a marriage between task and domain models is required in order to generate a good user interface. For instance, if the user is supposed to perform an input task where the data type of the input is integer, the generation process should generate a set of widgets that are appropriate for that kind of task and for that data type. Another fact that a generation process relying on task and domain models must take into account is the cardinality of the domain object the tasks are related to. For instance, suppose the user asks the system to show the phone numbers for a client. In this case the cardinality between the output task “Show phone numbers” and the method of the domain object returning those phone numbers is one-to-many (1,*). Therefore, the generation process should generate a set of widgets able to show a set of data entries (for instance a list box).
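The way task type, data type and cardinality jointly drive widget selection can be made concrete with a minimal sketch. The class, the field names and the selection rules below are illustrative assumptions only; they are not the rule set used by AB-UIDE or any other tool mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskBinding:
    """Links a task to the domain data it needs (all field names are illustrative)."""
    task_type: str   # "input" or "output"
    data_type: str   # e.g. "integer", "string"
    many: bool       # True for one-to-many (1,*) cardinality


def choose_widget(binding: TaskBinding) -> str:
    """Pick an abstract widget kind; the rules are examples, not a real rule set."""
    if binding.task_type == "output":
        # A one-to-many output needs a widget that can show several entries.
        return "list box" if binding.many else "label"
    if binding.task_type == "input":
        return "numeric field" if binding.data_type == "integer" else "text field"
    raise ValueError(f"unknown task type: {binding.task_type!r}")


show_phone_numbers = TaskBinding(task_type="output", data_type="string", many=True)
enter_quantity = TaskBinding(task_type="input", data_type="integer", many=False)

print(choose_widget(show_phone_numbers))  # list box
print(choose_widget(enter_quantity))      # numeric field
```

Neither the task model nor the domain model alone carries all three inputs to `choose_widget`, which is the point of relating the two.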
In some approaches this last situation is modelled by specifying a single output task with one-to-one cardinality and marking that task as repetitive, but some tweaking is required to make it work properly.
3.2 Dialog Modelling
By relating the task model to the domain model we obtain valuable extra information for the generation of the user interface. However, this is still not enough. For instance, if a task model is created in CTT, it is possible to describe the tasks the user will be allowed to perform and the temporal constraints between those tasks. However, it is not as easy to describe the dialog between the system and the user, that is to say, to describe situations where different branches are available depending on the actions the user takes. Figure 1 shows an example in which a user authenticates in a webmail application (GMail¹) by providing a username and a password. The CTT notation is powerful and useful for analysing the authentication task and describing which data the user should provide and how he interacts with the system. Nevertheless, it is not flexible enough to embrace the interaction in all its dimensions. For instance, if the designer wants to specify a task model describing the behaviour exhibited in figure 1, it is really hard to specify all the situations arising from the user’s interaction. In the example in the figure, it is hard to specify the possible error states that a login task can produce (informing the user about a wrong password, a wrong username or both).
¹ http://gmail.google.com
(a) User login in a webmail application.
(b) The user enters a wrong password or username.
Fig. 1. Authentication task in a webmail application
Exceptional situations such as those shown in this small example have a direct impact on the usability of a software product, and they are among the limitations of current task models that our method overcomes. Next, our technique for task and dialog modelling is described.
4 A Comprehensible Task and Dialog Modeling Approach
For the reasons given in the previous section, a task model has been devised that takes inspiration from ConcurTaskTrees [15], the well-known technique within the human-computer interaction community, from UML statecharts diagrams [7], and from the Canonical Abstract Prototypes [5]. Next, the most prominent features of the task-modelling approach are described in depth.
4.1 Closer to UML
In this approach, we bring human-computer interaction task modelling techniques closer to the large community of UML practitioners by adopting a notation much like UML statecharts diagrams. A task or subtask model specification begins with a circle whose background colour is black (starting state). The end of the specification is indicated by a black circle surrounded by another circle (see figure 2), as used in UML statecharts diagrams (final state).
[Figure: statechart with the tasks Login and Perform operation, transitions labelled start and end, and a (1,*) repeat annotation]
Fig. 2. Tiny example for a task model specification
Each state in the diagram can be either a task or an action. Tasks can be further refined, while actions are elemental tasks that cannot be further refined. Tasks are represented by the same symbol that is used for a state in statecharts diagrams (see figure 3a). To represent actions the states have been stereotyped. The stereotype used is called “action” (see figure 3b). All these tasks and actions have a set of properties to better describe the purpose and presentation of the intended goal. Tasks, actions, the starting state and the final state are linked by transitions. A transition from a state S to another state R means that the task control flow can go from S to R if the condition given in the label of the transition is met (sequentially). The available labels for the transitions are detailed later in the paper.
[Figure: (a) Task: a state labelled Login. (b) Action: a state labelled Enter pin with the «action» stereotype]
Fig. 3. Representation of tasks and actions
4.2 Modularity
Modularity is one of the most important concerns in modelling any complex system. For instance, in UML the designer can split the design into packages to better organize the structure and readability of the models. Moreover, it also allows the designer to create a complex model where some parts are underspecified while other parts are fully specified. Our approach also takes modularity into account. First, the designer can design the general task structure (in our method this structure is derived from a previous enriched use case model capturing the initial requirements). The tasks created for that general structure are then refined either with new tasks or with the actions that will allow the user to carry out those tasks. Figure 4 depicts an example of this kind of modularity. The task Login (which represents logging in at a bank ATM) has been decomposed into two actions: Enter login/card and Enter pin. Notice that the resulting actions also have a starting and a final state that represent the beginning and the end of the Login task. The designer can edit or refine a task by double-clicking on it. Right-clicking allows the properties of either a task or an action to be edited.
[Figure: the task model of Fig. 2 (Login, Perform operation, start, end, (1,*)) with a Refines link from the Login task to the actions «action» Enter login/card and «action» Enter pin]
Fig. 4. Refining a task
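The refinement idea can be sketched as a small tree structure in which tasks may be refined and actions may not. The `TaskNode` class and its methods are hypothetical names introduced for illustration; they do not reproduce the data model of the tool described in this paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskNode:
    """A task or an action in the model (class and field names are illustrative)."""
    name: str
    is_action: bool = False  # actions are elemental and cannot be refined
    children: List["TaskNode"] = field(default_factory=list)

    def refine(self, *children: "TaskNode") -> "TaskNode":
        """Attach the tasks or actions that detail this task."""
        if self.is_action:
            raise ValueError(f"action {self.name!r} cannot be refined")
        self.children.extend(children)
        return self

    def fully_specified(self) -> bool:
        """True when every branch below this node ends in actions."""
        if self.is_action:
            return True
        return bool(self.children) and all(c.fully_specified() for c in self.children)


login = TaskNode("Login").refine(
    TaskNode("Enter login/card", is_action=True),
    TaskNode("Enter pin", is_action=True),
)
perform = TaskNode("Perform operation")  # deliberately left unrefined

print(login.fully_specified())    # True
print(perform.fully_specified())  # False
```

A model mixing `login` and `perform` corresponds to the situation described above, where some parts are fully specified while others are still underspecified.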
4.3 LOTOS-Based Temporal Relationships
This notation has been enriched by including the same LOTOS [9] operators used within CTT. The graphical notation for these operators has been adapted to make it easier to use within the statecharts notation. For instance, in figure 2 the repeat operator is represented as a transition from a state to the same state. Between the parentheses the designer can specify the minimum and maximum number of times that the task can be repeated. By default, a transition between two states denotes a sequential temporal relationship between them. Notice that in figure 4 Enter login/card and Enter pin have a concurrent temporal relationship. Therefore, both actions can be executed concurrently in no particular order. The graphical representation of concurrency between two tasks is the same one used in CTT.
4.4 Detailed Dialog Modelling with Abstract Tools
Constantine [4][5] proposed a set of abstract tools that represent, in an abstract manner, the complete set of actions that can be performed in a user interface. This set of actions was devised after years of experimenting and gathering feedback from developers. Constantine uses this set of abstract tools to represent canonical user interfaces (abstract user interfaces). In our approach these abstract tools are applied to express the dialog between the tasks and the actions, that is to say, which actions from the user or the system are required to take a transition in our task model from one state to another. A state can have several outgoing transitions. Thus, the task flow control will choose the right transition according to the actions taken by the user or the system. For instance, figure 5 models a scenario where a bank client wants to make a deposit. The client first enters the amount to deposit and puts the money in the ATM slot (waiting for the ATM to acknowledge that the client has put the right amount of money in the slot is expressed as a post-condition of the Enter amount action that must hold before the action is considered finished). If the post-condition does not hold, an error is raised and the transition for the error abstract action is taken. Otherwise, the system takes the transition labelled start. For this transition to be taken, the user must enter the amount to deposit and confirm/accept it.
[Figure 5: statechart with the states Enter amount, Show balance and Invalid Amount; transitions are labelled start and error.]
Fig. 5. Example of dialog modelling
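The deposit dialog described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' notation or tooling: the names DialogState, next_state and amount_acknowledged are hypothetical, the post-condition is reduced to a simple comparison, and the transition from Invalid Amount back to Enter amount is assumed because the figure topology is not recoverable from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

Context = Dict[str, float]


@dataclass
class DialogState:
    name: str
    # Post-condition that must hold before the state's action counts as finished.
    postcondition: Callable[[Context], bool]
    # Outgoing transitions, keyed by the abstract tool that labels them.
    transitions: Dict[str, str] = field(default_factory=dict)


def next_state(states: Dict[str, DialogState], current: str, ctx: Context) -> Optional[str]:
    """Follow 'start' if the post-condition holds, otherwise follow 'error'."""
    state = states[current]
    tool = "start" if state.postcondition(ctx) else "error"
    return state.transitions.get(tool)


def amount_acknowledged(ctx: Context) -> bool:
    # Hypothetical check: the ATM counted exactly the amount the client entered.
    return ctx["entered"] > 0 and ctx["entered"] == ctx["counted"]


def no_condition(ctx: Context) -> bool:
    # Stands in for a state without a post-condition.
    return True


STATES = {
    "Enter amount": DialogState(
        "Enter amount",
        amount_acknowledged,
        {"start": "Show balance", "error": "Invalid Amount"},
    ),
    "Show balance": DialogState("Show balance", no_condition),
    # Assumed: after reporting the error the dialog returns to Enter amount.
    "Invalid Amount": DialogState("Invalid Amount", no_condition, {"start": "Enter amount"}),
}

ok = next_state(STATES, "Enter amount", {"entered": 50.0, "counted": 50.0})
bad = next_state(STATES, "Enter amount", {"entered": 50.0, "counted": 20.0})
```

Here ok names the state reached when the counted money matches the entered amount, and bad the state reached when it does not. Keeping the transitions in a dictionary keyed by abstract tool mirrors the idea that the flow control picks the outgoing transition from the action that actually occurred.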
In short, by adding the abstract tools to the specification of our task model we provide the designer with a powerful tool that makes dialog modelling easier and more intuitive. Moreover, it also makes the generation/transformation process much easier, by providing additional meaningful information regarding the transitions from one task to another and the type of operation required from the user in order to force each transition.
4.5 Tasks and Actions Properties
Tasks and actions need to be described in order to provide meaningful information for the generation process. In our approach a set of predefined properties has been defined (see figure 6), although it can be extended by the designer to add custom properties. It is almost impossible to cover a priori every property a designer might need for the transformation process leading to the generation of the final user interface, since different sets of properties are required to apply different heuristics or transformational approaches. Each task or action has a descriptive name, a natural-language description of the goal of the task or action, and a type that specifies whether it is abstract, for input, for output, etc. Each also has a frequency attribute that stores how often the designer expects the task/action to be executed by the user/system. This attribute is quite useful in finding a good task layout for the generated final user interface. The precondition attribute includes a set of expressions that must evaluate to true for the task to be enabled. The expressions are evaluated as a logical program: the precondition fails whenever any of the expressions it includes is not successfully evaluated. Postconditions work in a similar manner; in this case, however, all the expressions must evaluate to true before a transition to another state is taken.
Evaluating either the precondition or the postcondition can raise error exceptions. These error exceptions can be handled within the dialog modelling by means of the error abstract tool. For the specification of both precondition and postcondition attributes, OCL [19] (Object Constraint Language) is used. OCL is widely used among UML practitioners to express constraints in a variety of UML diagrams. Finally, any task can be represented at the abstract level either in a FreeContainer or in a Container [16]. The difference between the two kinds of container is that the first one is a root container that cannot be included within any other container. Designers can choose to leave this attribute blank and postpone the decision until the generation process.

Task name      Login
Description    The bank customer enters the card or the login.
Type           Abstract
Frequency      High
Precondition   NULL
Postcondition  Customer.checkLogin()
Presentation   FreeContainer

Action name    EnterLogin_card
Description    The user types in the login or enters the card.
Type           Input
Frequency      High
Precondition   NULL
Postcondition  Customer.currentLogin != ""

Action name    EnterPin
Description    The user types in the password for the login.

Fig. 6. Properties for Login task and its associated actions
Notice that the expressions used in either the precondition or the postcondition attribute can include any valid OCL expression. In these expressions the designer can also make use of any public method or attribute of the classes defined for the domain model.
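To make the evaluation rule for preconditions and postconditions concrete, the following Python sketch models the EnterLogin_card action from figure 6. It is an illustration only: Python callables stand in for the OCL expressions, and the class names Action, Customer and ConditionError as well as the attribute current_login are hypothetical stand-ins for the paper's domain model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class ConditionError(Exception):
    """Raised when a postcondition does not hold."""


@dataclass
class Customer:
    # Stand-in for the domain class referenced by the conditions.
    current_login: str = ""


Condition = Callable[[Customer], bool]


@dataclass
class Action:
    name: str
    description: str
    type: str
    frequency: str
    preconditions: List[Condition] = field(default_factory=list)
    postconditions: List[Condition] = field(default_factory=list)

    def is_enabled(self, customer: Customer) -> bool:
        # The action is enabled only if every precondition evaluates to true.
        # An empty list (NULL in figure 6) is trivially satisfied.
        return all(cond(customer) for cond in self.preconditions)

    def finish(self, customer: Customer) -> None:
        # Every postcondition must hold before a transition may be taken.
        if not all(cond(customer) for cond in self.postconditions):
            raise ConditionError(f"postcondition of {self.name} does not hold")


enter_login = Action(
    name="EnterLogin_card",
    description="The user types in the login or enters the card.",
    type="Input",
    frequency="High",
    preconditions=[],
    postconditions=[lambda c: c.current_login != ""],
)
```

Calling enter_login.finish(Customer()) raises ConditionError because the login is still empty; such an exception is what the dialog model would route to the error abstract tool.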
5 Conclusions
Task modelling has become the cornerstone of model-based user interface design. Different approaches have been proposed, but CTT seems to be the most widely used. Nevertheless, CTT is not used as much as it should be, because of a lack of tools integrating that task modelling technique within the whole development process and because it is quite far from what most developers are used to when modelling their applications: UML. In this paper a task modelling approach is introduced that takes inspiration from the strong points of CTT to create a graphical notation similar to UML statechart diagrams, in order to bring HCI community modelling techniques closer to the large mass of UML practitioners. Moreover, the notation has been enriched with abstract tools to provide an easy and clear notation for the dialog between the system and the user. Although model-based user interface design approaches have reached some degree of maturity, they are not used by developers as much as they should be. To build a bridge between the HCI research community and developers we need to devise notations able to attract developers towards good practices for user interface design. In this paper, we have tried to take another step towards building the bridge between both communities.
Acknowledgments. This work is partly supported by the Spanish PAI06-0093-8836, CICYT TIN2004-08000-C03-01 and PCC05-005-1 grants.
References
[1] Balzert, H., Hofmann, F., Kruschinski, V., Niemann, C.: The JANUS Application Development Environment - Generating More than the User Interface. In: CADUI 1996, pp. 183–208 (1996)
[2] Bastide, R., Palanque, P.A.: Implementation Techniques for Petri Net Based Specifications of Human-Computer Dialogues. In: CADUI 1996, pp. 285–302 (1996)
[3] Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Bouillon, L., Vanderdonckt, J.: A Unifying Reference Framework for Multi-Target User Interfaces. Interacting with Computers 15(3), 289–308 (2003)
[4] Constantine, L.L., Lockwood, L.A.D.: Software for Use. Addison-Wesley, London, UK (1999)
[5] Constantine, L.: Canonical Abstract Prototypes for abstract visual and interaction design. In: Jorge, J.A., Jardim Nunes, N., Falcão e Cunha, J. (eds.) DSV-IS 2003. LNCS, vol. 2844, Springer, Heidelberg (2003)
[6] Griffiths, T., Barclay, P., McKirdy, J., Paton, N., Gray, P., Kennedy, J., Cooper, R., Goble, C., West, A., Smyth, M.: Teallach: A model-based user interface development environment for object databases. In: Proceedings of UIDIS’99, pp. 86–96. IEEE Press, New York (1999)
[7] Harel, D.: Statecharts: A visual formalism for complex systems. Science of Computer Programming 8, 231–274 (1987)
[8] Hartson, R., Gray, P.: Temporal Aspects of Tasks in User Action Notation. Human Computer Interaction 7, 1–45 (1992)
[9] Information Process Systems - Open Systems Interconnection - LOTOS - A Formal Description Based on Temporal Ordering of Observational Behaviour. ISO/IS 8807 (1988)
[10] Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López Jaquero, V.: UsiXML: a Language Supporting Multi-Path Development of User Interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, Springer, Heidelberg (2005)
[11] López Jaquero, V., Montero, F., Molina, J.P., González, P., Fernández Caballero, A.: A Seamless Development Process of Adaptive User Interfaces Explicitly Based on Usability Properties. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, Springer, Heidelberg (2005)
[12] Montero, F., López Jaquero, V., Vanderdonckt, J., González, P., Lozano, M.D.: Solving the Mapping Problem in User Interface Design by Seamless Integration in IdealXML. In: 12th International Workshop on Design, Specification and Verification of Interactive Systems (DSV-IS’2005), Newcastle upon Tyne, England, July 13-15. Springer, Heidelberg (2005)
[13] Myers, B., Hudson, S.E., Pausch, R.: Past, present, and future of user interface software tools. ACM Trans. Comput.-Hum. Interact. 7(1), 3–28 (2000)
[14] Oeschger, I., Murphy, E., King, B., Collins, P., Boswell, D.: Creating Applications With Mozilla. O’Reilly (September 2002)
[15] Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A Diagrammatic Notation for Specifying Task Models. In: Interact’97, pp. 362–369. Chapman & Hall, Sydney, Australia (1997)
[16] da Silva, P.: Object Modelling of Interactive Systems: The UMLi Approach. Ph.D. Thesis. University of Manchester, N.W. Paton (supervisor), UK (2002)
[17] Puerta, A.R.: A Model-Based Interface Development Environment. IEEE Software, pp. 40–47 (1997)
[18] Reichart, D., Forbrig, P., Dittmar, A.: Task models as basis for requirements engineering and software execution. In: Proceedings of the 3rd Annual Conference on Task Models and Diagrams. TAMODIA ’04, vol. 86, pp. 51–58. ACM Press, New York (2004)
[19] Warmer, J., Kleppe, A.: The Object Constraint Language: Precise Modeling with UML. Object Technology Series. Addison-Wesley, London, UK (1999)
Structurally Supported Design of HCI Pattern Languages
Christian Märtin and Alexander Roski
Augsburg University of Applied Sciences, Faculty of Computer Science, Baumgartnerstr. 16, D-86161 Augsburg, Germany
{maertin, roski}@informatik.fh-augsburg.de
Abstract. HCI pattern languages represent an important software engineering concept and offer proven design and architectural solutions to developers of interactive systems and user interface designers. However, due to their poor organizational structures the effective usage of many existing pattern languages is not clear and easy enough to let developers quickly find appropriate patterns for solving their current design problems. In order to raise pattern language usability, there is a need for a sound definition of the hierarchical structure of pattern languages and a rule based workflow for constructing future pattern languages. The structural approach presented in this paper will provide the designer with a technique to ensure the development of efficient and usable pattern languages. Keywords: Pattern Language, HCI, Structured Hierarchy, Regulated Links.
2 Structured Pattern Language Design Approach
2.1 Dividing the Language into Hierarchical Steps
Dividing a pattern language into several planes or levels is a well-known approach that was already presented in [4]. But whether the requirements for finding patterns, e.g. for a complex web-based software system, can really be covered by as few as four specific planes of patterns is still an open question. We believe that it is not always the best solution to avoid additional planes by combining the respective multifaceted patterns in one of the few defined planes. There should rather be an opportunity to decompose the overall problem class covered by the pattern language into different pattern language planes. For this reason we present an approach without strongly defined layers, but with methodical steps for including patterns with different levels of abstraction, grade of detail or refinement. These grades represent the part of the problem that is solved by the current pattern in relation to the underlying problem class of the whole language. Each level contains a collection of patterns that together solve all of the design requirements for a given part of the overall problem at a specific level of detail. This means that the first (top) pattern of the language describes a relatively abstract solution which represents 100% of the main problem class of the language. Each of the following patterns stands for a smaller percentage of the overall problem, but gives a more concrete or specific solution for the represented part of the main problem class. The number of refinement or diversification steps is variable and not limited.
Fig. 1. Separation into different hierarchical steps
Figure 1 illustrates this concept. The pattern at the top always represents the full 100% of the underlying problem class modelled by the whole language. As you can see, the subsequent patterns only cover smaller parts of the main problem class. It does not matter whether they cover 50% or only 10% (see part A of the figure); in both cases the two patterns on the same level are required to solve the partial problem. In part B of the figure you might assume that all patterns in the second step represent 33.33%, but this is not necessarily the case. The overall percentage covered by the patterns in one plane can be smaller than that of the parent pattern. Part C emphasizes that patterns at a deeper level present a higher grade of detail or refinement and consequently cover a smaller percentage of the main problem class than their predecessor. However, it is important that all patterns within the same plane represent the same level of detail. All deeper patterns represent the problem and its solution at a more concrete level.
These requirements certainly constitute the greatest challenge when developing a pattern language with a good logical organization. The question is: how can you separate the problems targeted by the patterns without knowing which other problems might still occur and which grade of detail they will correspond to? To solve this problem a simple method can be used. Starting at the general problem class, with the abstract pattern at the root of the pattern language graph, you decompose the problem class into several smaller problems and their patterns. In the beginning you will not yet know whether the decomposed patterns correspond to the same level of detail or not. However, if you continue with the decomposition process you will find new problems which cover smaller percentages of the higher patterns' problems, and a pattern tree is created. In case you want to link an existing pattern to a newly created pattern which is on the same or even on a higher level, you have detected a gap in the pattern hierarchy. This gap implies either that the pattern you want to link to is not diversified into enough subpatterns, so that you have to fill in an additional subpattern, or that the new problem corresponds to a more detailed pattern with less percentile coverage at a deeper level. In principle the resulting pattern hierarchy should provide the pattern language user with a simple and clear navigation structure. Additionally, he or she will be able to better estimate the size of an existing problem within the overall application context. On the other hand, by applying this problem decomposition process, the pattern language developer has a way to find possible logical gaps in the language.
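The decomposition into levels with shrinking coverage can be illustrated with a small Python sketch. The class Pattern, the function coverage_violations and the example percentages are hypothetical. The check that each child covers less than its parent follows the description of figure 1; the additional check that the children together do not exceed their parent's coverage is our own assumption, since the text only states that the sum may be smaller.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Pattern:
    name: str
    coverage: float  # percentage of the main problem class, 100 for the root
    children: List["Pattern"] = field(default_factory=list)
    parent: Optional["Pattern"] = field(default=None, repr=False)

    def add(self, child: "Pattern") -> "Pattern":
        """Attach a subpattern one hierarchical step below this pattern."""
        child.parent = self
        self.children.append(child)
        return child

    @property
    def level(self) -> int:
        # The root is level 0; every decomposition step adds one level.
        return 0 if self.parent is None else self.parent.level + 1


def coverage_violations(root: Pattern) -> List[str]:
    """List the places where a deeper pattern fails to refine its parent."""
    problems: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node.children:
            if child.coverage >= node.coverage:
                problems.append(f"{child.name} does not refine {node.name}")
            stack.append(child)
        # Assumed rule: siblings together cover at most what the parent covers.
        if sum(c.coverage for c in node.children) > node.coverage:
            problems.append(f"children of {node.name} exceed its coverage")
    return problems


# Part A of figure 1: two subpatterns covering 50% and 10%.
root = Pattern("Root", 100.0)
a = root.add(Pattern("A", 50.0))
b = root.add(Pattern("B", 10.0))
clean = coverage_violations(root)

# A subpattern that claims more than its parent is reported.
a.add(Pattern("A1", 60.0))
broken = coverage_violations(root)
```

In this sketch clean is empty, while broken reports the misplaced pattern A1. The percentages are rough estimates made by the language designer, so such a check is a consistency aid rather than a proof of good structure.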
2.2 A System for Connecting the Patterns
To be useful for the system developer, a pattern language needs a well and conveniently constructed system of connections between its patterns in order to allow systematic navigation through the pattern hierarchy during the development process. However, when is a system of connections constructed conveniently?
Fig. 2. Overview about the links of Tidwell's Pattern Language
Figure 2 shows a visualization of all the links in an existing pattern language for user interface development [6]. Looking at it, you may doubt whether this language is constructed conveniently. By using the links, a user will move from pattern to pattern, but it can happen that he or she will not find some patterns that are useful for the design. Some of the patterns are completely isolated from the rest of the language. We think that writing patterns totally cut off from their environment may lead to suboptimal and inefficient design. On the other hand, we also find it useless to write patterns which are connected to nearly all other patterns in the environment. We guess that the connection structure of this example language was created that way because it was not clear to the language designer which path a user would take through the pattern language. In an interactive system design project of a certain size it is an absolute necessity for the designer to have a design plan. Instead of jumping from one place of construction to another, he or she should systematically follow this construction plan from the overall content structure to the logical structure, the abstract presentation structure and the concrete presentation structure, down to the specific layout, positioning and coloring aspects. Consequently, an appropriately constructed system of connections for such a comprehensive pattern language could be useful. Because of this we propose special restrictions or rules for possible links within a pattern language.
Fig. 3. Graph of possible and not possible links
During problem decomposition, links are implicitly established to subsequent patterns in the next layer of the same part of the pattern tree. However, we also allow references to patterns at a deeper level. We call this kind of connection a special link. For such connections there are special rules to follow. Figure 3 shows the three possible classes of links: the implicit links (painted in black), the special links (green, possible) and the non-allowed links (red, not possible). The red lines with the numbers 1 and 2 are forbidden because they are drawn within the same tree and are therefore not necessary. The line with the number 3 is not allowed because it refers to a pattern within the same step.
2.3 Possible Run Through
In a run through, a potential user walks through the pattern language in a predefined way. In order to follow a structured construction plan, we also provide the user with a structured way through the language when designing an interactive application. As figure 4 shows, there is a global entrance point that represents the abstract pattern for the general problem class of the whole pattern language. Of course it can happen that a user is looking for a pattern which is somewhere at an inner level of the pattern language. In order to find the relevant pattern, the user has to search for the specific pattern by following the problem decompositions from the top to the bottom of the pattern graph, i.e. from the most abstract to the most detailed patterns. By selecting the correct nodes, the user is automatically guided to the required pattern.
Fig. 4. Overview of a systematic run through including special-links
After the user has found the specified problem, he or she enters the phase of problem diversification. In this phase the advantages of the described structure become evident. Here the user wants to solve a problem, for example the problem described in the pattern with the letter i. While the user retrieves the solution information contained in the pattern, he or she will realize that additional useful information is available (b, h and e) referring to special parts of the current problem. Continuing this procedure in each of the referenced patterns, the user reaches the deepest level of this tree and consequently is fully informed about how to solve the current problem most efficiently and effectively. The green (lighter colored) arrows represent the basic run through which is possible with the implicit connections. The blue ones (labelled (4,5), (5,8), (15,8)) show some logical links which are automatically created when the user follows the predefined way. So it is not necessary to explicitly establish such relationships during the design of the pattern language. This also includes the connections with special links (dotted line).
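The two phases of a run through can be sketched as two small graph functions in Python. This is a hypothetical illustration: the pattern names and the graph below are invented and do not reproduce figure 4, and the function names find_path and diversify are our own.

```python
from typing import Dict, List, Optional

# Implicit links created by problem decomposition (parent -> subpatterns).
CHILDREN: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["d"],
    "d": ["g"],
    "c": ["f"],
}

# Special links to deeper patterns in another part of the tree.
SPECIAL: Dict[str, List[str]] = {"c": ["d"]}


def find_path(node: str, target: str) -> Optional[List[str]]:
    """Search phase: follow the decompositions from the entry point to the target."""
    if node == target:
        return [node]
    for child in CHILDREN.get(node, []):
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [node] + sub_path
    return None


def diversify(start: str) -> List[str]:
    """Diversification phase: visit every pattern referenced from the start pattern."""
    visited: List[str] = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.append(node)
        linked = CHILDREN.get(node, []) + SPECIAL.get(node, [])
        # Reverse so that patterns are visited in the order they are listed.
        stack.extend(reversed(linked))
    return visited


path = find_path("a", "c")
reading_list = diversify("c")
```

With this graph, path leads from the global entry a to pattern c, and reading_list contains c, its subpattern f, and the patterns d and g reached through the special link.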
3 Example of a Web Design Pattern Language Implementation
To demonstrate the approach we present a typical situation during the development of a pattern language. In this example you will notice how easy it is to avoid a logical gap by simply arranging the patterns on the different hierarchical levels. All this is made possible by just following the rules defined above.
Fig. 5. Situation of avoiding a logical gap
The given situation is a pattern language for web design which has only five levels at the moment. At the top, as the global entry, there is a pattern that gives an overview of possible categories of websites together with category-independent design knowledge. On the next layer down, the developer has already written patterns about a community site, a portal site and a personal site. These three patterns represent the same grade of detail and are therefore located on the same hierarchical step. For the community site, there are two more patterns below, which describe how to design a chat and a forum. In the portal domain, there is a special pattern for online-shopping, which further references some more detailed search function patterns. Now, a developer might wish to express that an online-shop could also include a forum. In this case he or she will realize that it is not possible to establish a link (special link) between the “online-shop” pattern and the “forum” pattern, because both patterns reside on the same level and links of this kind are not allowed. As it is a fact that online-shops can include a forum, the developer knows that the “forum” and the “chat” pattern have to be shifted down one level. This is the only way to create a reference from the online-shopping pattern, but it also implies that a new pattern has to be inserted at the former position of both patterns. To resolve this simple dilemma, he or she has to find a suitable pattern which fits into the new gap. “Communication” would be such a pattern, because it serves as a parent pattern for “forum” and “chat” and as a child pattern for “Community”. Community, e.g., might serve as an overview pattern for possible communication functionalities, which may sometimes be quite helpful for understanding the various reasons for the different communication techniques. Without this new pattern, a logical gap would have existed, and it would have been more difficult for a novice webpage designer to follow the way through the pattern language. Figure 5 shows the whole process at a glance. At the beginning, only the patterns “Chat” and “Forum” (red 1, 2) resided in the left column directly under the “Community site” pattern. Then these patterns were shifted down one level (red 3) and the “Communication” pattern (red 4) was inserted. Finally, the developer is able to establish the link from the online-shopping pattern to the Forum pattern (red 5). Using our structural approach, a pattern language developer will clearly notice from the beginning that “Chat” or “Forum” should not reside on the same level as the online-shopping pattern. During the complex design of a new pattern language, the language developer might otherwise often overlook such simple logical dependencies between the patterns of the language. With the help of the structured design approach presented in this paper, this will not occur. The same approach can also be used for changing existing pattern languages in a structurally consistent way as well as for merging several pattern languages into one.
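The link rules and the web design example can be checked mechanically. The Python sketch below is one possible reading of the rules, not the authors' tool: a link to a direct subpattern is implicit, a link between a pattern and one of its own ancestors or descendants is treated as forbidden ("within the same tree"), a link to a pattern that is not deeper than the source is forbidden, and everything else is a special link. The function names and the root name "Websites" are hypothetical.

```python
from typing import Dict, Optional

Parents = Dict[str, Optional[str]]


def level(parents: Parents, name: str) -> int:
    """Number of decomposition steps between the root and the pattern."""
    depth = 0
    while parents[name] is not None:
        name = parents[name]
        depth += 1
    return depth


def is_ancestor(parents: Parents, ancestor: str, name: str) -> bool:
    current = parents[name]
    while current is not None:
        if current == ancestor:
            return True
        current = parents[current]
    return False


def classify_link(parents: Parents, source: str, target: str) -> str:
    if parents[target] == source:
        return "implicit"
    if is_ancestor(parents, source, target) or is_ancestor(parents, target, source):
        return "forbidden: same tree"
    if level(parents, target) <= level(parents, source):
        return "forbidden: target not deeper"
    return "special"


# Situation before the restructuring (each pattern mapped to its parent).
before: Parents = {
    "Websites": None,
    "Community site": "Websites",
    "Portal site": "Websites",
    "Personal site": "Websites",
    "Chat": "Community site",
    "Forum": "Community site",
    "Online-shopping": "Portal site",
}

# After inserting "Communication" and shifting Chat and Forum down one level.
after: Parents = dict(before)
after.update({
    "Communication": "Community site",
    "Chat": "Communication",
    "Forum": "Communication",
})

link_before = classify_link(before, "Online-shopping", "Forum")
link_after = classify_link(after, "Online-shopping", "Forum")
```

Before the restructuring the link from Online-shopping to Forum is rejected because both patterns sit on the same step; afterwards it is classified as a special link, which matches the outcome described for figure 5.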
4 Conclusion
This paves the way for the correct and efficient use of the resulting pattern languages by application developers and user interface designers. As the existing structure only implements the implicit and other regulated links, the system will still be consistent when changes or expansions to a language are made. The user will always keep the overview of the inter-relations between the patterns of the language and will be guided only to the necessary information. Thus, pattern languages that use the presented hierarchical structuring approach might add a new level of design experience to the HCI community.
Two evaluation prototypes of pattern languages that use our structural approach are currently under way at our lab. One language is targeted at patterns for designing and evaluating web applications. The other one is dedicated to the organization of patterns for the semi-automatic design of user interfaces for industrial automation applications.
References
[1] Alexander, C., Ishikawa, S., Silverstein, M.: A Pattern Language. Oxford University Press, Oxford, UK (1977)
[2] Alexander, C.: The Timeless Way of Building. Oxford University Press, Oxford (1979)
[3] van Duyne, D.K.: The Design of Sites: Patterns, Principles and Processes for Crafting a Customer-Centered Web Experience, 1st edn. Addison Wesley, Reading, MA (2002)
[4] van Welie, M., Traetteberg, H.: Interaction Patterns in User Interfaces. In: 7th Pattern Languages of Programs Conference, August 13-16, 2000, Allerton Park Monticello, USA (2000)
[5] Tidwell, J.: Interaction Design Patterns. In: Proceedings of the Pattern Languages of Programming PLoP98 (1998)
[6] Tidwell, J.: http://www.designinginterfaces.com
[7] Marcus, A.: Patterns within Patterns. Interactions 11(2), 28–34 (2004)
[8] Tiedtke, T., Krach, T., Märtin, C.: Multi-Level Patterns for the Planes of User Experience. In: Proc. of HCI International, Las Vegas, Nevada, USA, July 22-27, 2005. Theories Models and Processes in HCI, vol. 4, pp. 22–27. Lawrence Erlbaum Associates, Mahwah, NJ (2005)
[9] Gamma, E., et al.: Design Patterns. Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, Mass. (1995)
[10] Seffah, A., Forbrig, P.: Multiple User Interfaces: Towards a Task-Driven and Patterns-Oriented Design Model. In: Forbrig, P., Limbourg, Q., Urban, B., Vanderdonckt, J. (eds.) DSV-IS 2002. LNCS, vol. 2545, pp. 118–132. Springer, Heidelberg (2002)
Integrating Authoring Tools into Model-Driven Development of Interactive Multimedia Applications
Andreas Pleuß and Heinrich Hußmann
Department of Computer Science, University of Munich, Munich, Germany
{pleuss, hussmann}@cip.ifi.lmu.de
http://www.medien.ifi.lmu.de
Abstract. The Multimedia Modeling Language (MML) is a platform-independent modeling language for model-driven development of interactive multimedia applications. Using models provides several advantages, such as well-structured applications and better coordination of the different developer groups involved in the development process. However, the creative tasks (like graphical design of the user interface and the design of media objects) are better supported by traditional informal methods and tools. In particular, multimedia authoring tools such as Adobe Flash are well established for multimedia application development. In this paper we show how MML and authoring tools can be integrated, using Flash as an example. To this end, we transform the MML models into code skeletons which can be directly loaded into the Flash authoring tool to perform the creative design tasks and finalize the application. In that way, the strengths of models and authoring tools are combined. The paper shows the required level of abstraction for the models, introduces a metamodel and a suitable code structure for the Flash platform, and finally presents the transformation.
1 Introduction
With the growing pervasion of everyday life by computers, many application areas are appearing where rich, comfortable and possibly entertaining user interfaces become more and more natural. In this paper we deal with “multimedia user interfaces”, which make intensive use of different kinds of media (such as audio, video, graphics, and animation) and provide sophisticated user interfaces adapted individually to the user’s tasks and information. In particular, we address highly interactive systems which may include complex application logic. Classical examples are e-learning or training applications, simulations or computer games. Newer application areas are, for instance, home entertainment systems or infotainment systems in cars. Such interactive multimedia applications are often developed using authoring tools such as Adobe Flash, which includes the programming language ActionScript. Such tools provide excellent support for the creative development tasks.
J. Jacko (Ed.): Human-Computer Interaction, Part I, HCII 2007, LNCS 4550, pp. 1168–1177, 2007. © Springer-Verlag Berlin Heidelberg 2007
However, they lack support for structuring the application. The ActionScript code can be scattered all over the application and is very difficult to maintain. Furthermore, there is very little support for teamwork and for coordination between the different developer groups for user interface design, software design and media design. The need for better support of software engineering principles in multimedia application development is clearly stated by various publications in this area (e.g. [1]). To address these issues, we proposed in [2,3] a modeling language for model-driven development of multimedia applications called Multimedia Modeling Language (MML). Our idea is to combine this approach with the advantages of the existing tools. Therefore, during the design phase only the overall structure and behavior of the application are specified in the MML models. Detailed behavior and concrete visual design, however, should be created in the authoring tools. We achieve this by generating code skeletons from the MML models which can be directly loaded into the authoring tools for the implementation phase. The code skeletons contain placeholders which then have to be filled out and arranged within the tool. In this paper we demonstrate the feasibility of this concept using the Flash authoring tool as target platform. The paper is structured as follows: In section 2 we briefly summarize MML. Section 3 introduces the target platform, Adobe Flash, and presents an overview of the main ideas of the approach. Section 4 elaborates a suitable code structure for Flash applications, summarizes the transformation and shows how the resulting documents are processed within the authoring tool.
2 MML
The Multimedia Modeling Language (MML) is a platform-independent language for model-driven development of multimedia applications. It supports a design phase for multimedia applications and allows generating code skeletons for different platforms. The language is based on UML 2.0 and integrates concepts from different approaches in user interface and multimedia modeling. Based on the results of requirements analysis (such as user task models [4] and storyboards), four kinds of models are provided: structural model, scene model, abstract user interface model (which is enhanced to a media user interface model) and interaction model. In the following we briefly summarize the MML models, using as an example a Jump’n Run gaming application like those at [5]. For further details on MML please see [2,3].
The structural model describes the structure of the application logic (domain model) in terms of an extended UML class diagram. The classes from the domain model are referred to as application entities. They can be associated with media components. For example, a character in a Jump’n Run game is represented by an animation. If required, the inner structure of the media components can be defined. This is only necessary when the inner structure is relevant for other parts of the application. An example is a character in a Jump’n Run game: its legs should be animated when the character moves. Thus, the legs have to be realized as moveable parts of the animation, and in some cases it must also be possible to access them from the application logic. As such issues often concern different developer groups (usually the software designer and the media designer), it is important to define them in the model.
The scene model defines the scenes of the application and the transitions between them in terms of an adapted UML state chart. A scene represents a specific state of the application’s user interface and is an abstraction of the screen concept in graphical user interfaces. Scenes in a Jump’n Run game are, for instance, the menu, the game and the highscore. Because of their dynamic character caused by temporal media objects, multimedia scenes have an inner state specified by attributes as well as operations to affect their state. In particular, they have so-called entry-operations to initialize the scene and exit-operations to clean it up.
The abstract user interface model describes for each scene the user interface in terms of abstract user interface components (AUI components). For the AUI components in MML we reuse the concepts provided by user interface modeling approaches (e.g. [6,7]). The set of AUI components currently supported in MML includes input component, output component, action component, and different specializations of them. The AUI components required within a scene can be derived, for instance, from task models and are usually specified by the user interface designer. As a core concept of MML we enhance the abstract user interface with relationships to the media components from the structural model: some of the AUI components can be realized by one or more of the media components. Obviously, output components are often realized by media components. But input components can also be realized by media components, as e.g. an animation can be clicked or dragged and dropped. Furthermore, the abstract user interface model is enhanced with sensors.
A sensor represents an event caused by a temporal media component, such as a collision sensor for animations (triggering an event when an animation is moved over another object on the screen) or a time sensor for videos (triggering an event when the video has reached a specific point on its timeline). The enhanced abstract user interface model is referred to as the media user interface model.
Finally, the interaction model describes, for each scene, how the AUI components and the sensors from the media user interface model trigger operations from the structural model. This model is an adapted UML activity diagram in which the actions are restricted to operation calls.
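As a rough illustration of the scene model and the sensors summarized in this section, the following Python sketch treats scenes as states with entry and exit operations and lets a collision sensor trigger an operation. All names (Scene, SceneModel, Box, CollisionSensor, the event names) are assumptions made for this example and are not part of MML.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Scene:
    """One state of the user interface, with entry and exit operations."""
    name: str
    trace: List[str]  # shared log, only used to make the demo observable

    def entry(self) -> None:
        self.trace.append(f"enter {self.name}")  # initialize the scene

    def exit(self) -> None:
        self.trace.append(f"exit {self.name}")   # clean the scene up


class SceneModel:
    """Scenes plus named transitions, in the spirit of a UML state chart."""

    def __init__(self, scenes: List[Scene], start: str) -> None:
        self.scenes: Dict[str, Scene] = {s.name: s for s in scenes}
        self.transitions: Dict[Tuple[str, str], str] = {}
        self.current = self.scenes[start]
        self.current.entry()

    def add_transition(self, source: str, event: str, target: str) -> None:
        self.transitions[(source, event)] = target

    def fire(self, event: str) -> bool:
        target = self.transitions.get((self.current.name, event))
        if target is None:
            return False  # no such transition from the current scene
        self.current.exit()
        self.current = self.scenes[target]
        self.current.entry()
        return True


@dataclass
class Box:
    left: float
    top: float
    width: float
    height: float


def overlaps(a: Box, b: Box) -> bool:
    """Axis-aligned bounding boxes intersect."""
    return (a.left < b.left + b.width and b.left < a.left + a.width
            and a.top < b.top + b.height and b.top < a.top + a.height)


@dataclass
class CollisionSensor:
    """Triggers an operation when two media objects overlap."""
    a: Box
    b: Box
    on_collision: Callable[[], object]

    def check(self) -> bool:
        if overlaps(self.a, self.b):
            self.on_collision()
            return True
        return False
```

In this sketch the sensor plays the role that the interaction model assigns to it: it does nothing itself except call an operation, here a scene transition.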
3 General Approach for the Target Platform
For the transformation from MML models to Flash code skeletons we apply the concepts of model-driven development (see e.g. [8]), such as explicit metamodels (to define the models) and explicit, modular transformations between them. MML is defined using a MOF-compliant metamodel. We use the Eclipse Modeling Framework (EMF) for its implementation. The transformations are
Integrating Authoring Tools into Model-Driven Development
[Figure 1, UML class diagram. Classes shown: FlashDocument, Element, ASScript, Item, Class, BitmapItem, SoundItem, FontItem, SymbolItem, VideoItem, MovieClip, Instance, Shape, Text, SymbolInstance, ComponentInstance. Attributes shown: depth : Integer, height : Double, left : Double, name : String, top : Double, width : Double (on Element), and a further name : String. Association ends shown: +actionScript (0..1), +item (0..n), +symbolItem (0..1), +libraryItem (1), +symbol (0..1).]
Fig. 1. Extract of Flash metamodel including the main elements of Flash documents
defined using the Atlas Transformation Language (ATL), a declarative language close to the OMG standard QVT (Queries, Views and Transformations) [9]. The transformation is performed in two steps. First, the MML models are transformed to Flash models. For this purpose we specify a Flash metamodel, which is presented in this section. The actual mapping from the platform-independent MML concepts to the platform-specific concepts of Flash is performed during this transformation. In a second step, we transform the Flash model into the final code skeletons. This is mainly a straightforward transformation. However, it is more complex than conventional code generation (like the transformation from a Java model into Java code), as we aim to generate files for the Flash authoring tool. The resulting files can be directly loaded and processed in the tool, using its sophisticated support for the creative design tasks. This requires, on the one hand, strict compliance with the authoring tool’s internal document structures and, on the other hand, a way to produce the corresponding binary files.
In this section we summarize the capabilities of the Flash authoring tool and the resulting general structure of Flash applications. The main concepts (denoted in italics in the following) are reflected in the simplified extract of our Flash metamodel in figure 1. Afterwards we explain our general approach for creating code for the authoring tool. On that basis we propose in section 4 a mapping from MML models to a suitable Flash application skeleton.
The Flash authoring tool was originally developed for the creation of graphics and animations. The tool is timeline-based, i.e. the temporal dimension of animations and behavior is represented by a timeline consisting of several frames. A frame owns a two-dimensional space (called stage, not part of the metamodel) where 2D vector graphics (shapes and text) and other media objects can be placed.
A third dimension (z-axis) is realized by layers to define which object is the topmost when several objects overlap each other on the stage. An
animation means that some graphic changes (e.g. its position) over time, i.e. over the frames in a timeline. A symbol is a complex graphical object (the term movie clip is often used as a synonym, as movie clip is the most important type of symbol). A symbol contains a timeline whose frames can hold any content that the main timeline can hold. This means that a symbol may contain arbitrarily complex content, even other symbols and animations. Thus, symbols and animations can be hierarchically nested to arbitrary depth.
Each Flash document contains a library which holds all media objects of the document. When a symbol is created or any media object is imported into the authoring tool, it is automatically added to the library. The items in the library can be instantiated multiple times in one or more frames on the timeline. An instance usually has an instance name and a location within a frame.
Since version 4, Flash has included a scripting language called ActionScript, which has evolved continuously. Flash MX 2004 introduced ActionScript 2, which supports the object-oriented concept of classes. Classes have to be specified in separate ActionScript class files. In particular, it is now possible to attach ActionScript classes to symbols in the library of a Flash document. This is a very interesting opportunity, as the symbol and the associated code together form a complex object consisting of programming logic and a (possibly very complex) visual representation. The associated ActionScript class automatically has access to all properties of the symbol, as if they were class properties, including visual elements nested inside the symbol. Furthermore, events on the symbol (e.g. mouse clicks) can be processed in the class simply by specifying the corresponding event handler operations. Such a connection between symbols and ActionScript classes is an important concept which we use intensively in our generated code (see section 4).
The file format for Flash documents is a proprietary binary format with the file extension FLA. For execution, the files are compiled into SWF files which run within the Flash player, available as a plugin for browsers. SWF is an open format, but as it is a compiled format, SWF files cannot be edited comfortably within the authoring tool. Hence, for our purposes we aim to generate FLA files.
To solve the problem of creating the proprietary FLA files, we use the extension mechanism of the Flash authoring tool. Extensions must be specified in JavaScript and make it possible to automate every action that can be done manually in the authoring tool, e.g. creating symbols. For that purpose the tool provides a kind of document object model, similar to that in browsers for HTML documents. These scripts must have the file extension JSFL and can be executed either within the authoring tool or from the command line (if the Flash authoring tool is available on the system). We use this mechanism to generate FLA files by generating a JSFL file which can be executed on the command line and then creates the FLA content according to the Flash model (figure 2).
A core problem in Flash is the weak support for structuring applications. The program flow of the application can be determined, for instance, by ActionScript code, by the timeline or by a combination of both. ActionScript code can be attached to symbols, symbol instances, and frames. The sources on Flash
[Figure 2, diagram labels: MML Model, ATL Transformation, Flash Model, JSFL File, ActionScript Class Files, Execution of JSFL file, FLA Files.]
Fig. 2. Approach for the overall transformation
in the literature and on the web provide various frequently used patterns for many different small-scale problems, but there exists no common solution for the overall structure of Flash applications. An important contribution in this direction is provided by [10], who applies several object-oriented patterns in Flash (e.g. the Model-View-Controller pattern, MVC) and presents a framework for the overall application structure. However, this approach is restricted to ActionScript code and completely omits the usage of the authoring tool, and is hence not suitable for our purposes. On the other hand, the feedback on the web (e.g. in forums like http://flashforum.de) on books like this and the latest changes in Flash provided by Adobe show the general demand for better support of software engineering principles in tools like Flash.
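The JSFL-based generation step described earlier in this section can be sketched as a small generator that turns a much simplified Flash model into JSFL text. The sketch is written in Python; the FlashDocumentModel class and the example file URI are hypothetical, and the emitted JSFL calls (fl.createDocument, library.addNewItem, fl.saveDocument) are recalled from the JSFL reference and should be verified against it before use.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlashDocumentModel:
    """Hypothetical, simplified stand-in for one FlashDocument of the Flash model."""
    file_uri: str                                   # target FLA location (assumed URI form)
    movie_clips: List[str] = field(default_factory=list)  # library symbols to create


def to_jsfl(doc: FlashDocumentModel) -> str:
    """Emit JSFL source that creates the document and its library movie clips."""
    lines = ["var doc = fl.createDocument();"]
    for name in doc.movie_clips:
        # json.dumps produces a double-quoted literal that is also valid
        # JavaScript for ordinary symbol names.
        lines.append(f'doc.library.addNewItem("movie clip", {json.dumps(name)});')
    lines.append(f"fl.saveDocument(doc, {json.dumps(doc.file_uri)});")
    return "\n".join(lines) + "\n"
```

The returned text would be written to a file with the extension JSFL and executed by the authoring tool. Drawing the placeholder rectangle inside each movie clip and attaching ActionScript classes are omitted here.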
4 Transformation and Resulting Flash Code Skeletons
In this section we describe how to transform the platform-independent MML models into useful code skeletons in Flash. As described in section 3, the literature on Flash provides various patterns, but there is no common solution for an overall structure of Flash applications which includes both ActionScript and the features of the authoring tool. Thus, two issues have to be addressed: first, we need to identify a suitable structure for Flash applications; then, the concrete mapping from MML model elements into this structure has to be defined.
Our proposed structure is based on the following considerations. The most important requirement for the Flash application structure is the usage of the authoring tool for creating and editing visual objects. Hence we generate FLA files which contain placeholders (annotated rectangles, see figure 5) for the media components and the AUI components. In addition, the application should be well-structured using common concepts, so that it is not restricted to a specific purpose or size and the generated code is easy to understand. Thus, we use object-oriented ActionScript code for the non-visual parts of the application. We make use of the capabilities of ActionScript 2 and place all ActionScript code into separate class files. As proposed e.g. by [10], we use the MVC pattern to structure the ActionScript code. For the connections between
[Figure 3, diagram labels: Scene1 (for each scene in the application) with <<MovieClip>> Scene1 and <<MovieClip>> AUI_Component1, each with an attached Scene1 or AUI_Component1 element; sharedMedia (once for the whole application) with Media1 and <<MovieClip>> MediaComponent1; references (only if realized by a media object); model with ApplicationEntity1.]
Fig. 3. General structure of Flash applications resulting from the transformation. The names of the artifacts indicate the MML model elements which they result from.
the visual elements in the authoring tool and the corresponding ActionScript code we use the ability to attach ActionScript classes to movie clips. User interface objects of other kinds are simply encapsulated in movie clips. To support teamwork, we divide the FLA part of the application into many small FLA files, since efficient version management of a single FLA file is usually not possible because FLA is a binary file format. To support the development of large applications we provide a package structure for the ActionScript classes and a folder structure for the FLA files.
Figure 3 shows an overview of the resulting structure. The element names in figure 3 indicate the MML model elements from which they are derived during the transformation. The ActionScript classes for the application entities contain the class properties derived from the MML class diagram. For operations we generate only the operation signature, as the operation body is not specified in MML. We believe that the operation bodies are specified more efficiently by hand, directly in the target language, using the platform-specific constructs and libraries. The classes generated from application entities correspond to the ‘model’ in terms of the MVC pattern. The ActionScript classes for the scenes contain operations which perform the transitions between the scenes according to the MML scene model. The ActionScript classes for the AUI components contain event handler operations (depending on the type of AUI component). They correspond to the ‘controller’ in terms of the MVC pattern.
For each media component in MML we generate a separate FLA file containing a movie clip in its library which encapsulates a placeholder. The movie clip is given a name which can be used to refer to it from other files. This ensures that a media component can be reused multiple times within an application, as is possible in MML.
If the media component is a kind of graphic or animation, the placeholder will usually be filled in directly in the Flash authoring tool. For instance, for an animation heroAnimation in a Jump’n Run application, a FLA document heroAnimation is generated which contains in its library a movie clip heroAnimation containing a placeholder (figure 4(a)). The generated movie clip
[Figure 4, panels: (a) Placeholder in the library; (b) After double-click; (c) Replacing with custom content.]
Fig. 4. Replacing the movie clip generated for the media component heroAnimation
Fig. 5. Screenshot from the Flash authoring tool showing a FLA document generated for the scene Game. (The window is reduced to the most important elements.)
can be edited in the authoring tool as easily as any other manually created movie clip: a double-click on the movie clip opens its content on the stage (figure 4(b)), where it can be replaced by any graphics or animation using the authoring tool’s various editing capabilities (figure 4(c)). Other media objects (which cannot be created in Flash) will be imported from the file system into the movie clip.
The FLA files generated for the scenes contain the actual user interface of the application (the different ‘screens’). They contain the elements generated for the AUI components from the MML model. Figure 5 shows as an example a screenshot of the Flash authoring tool after loading the FLA document generated for a scene Game of a Jump’n Run application. Each AUI component is represented by a movie clip (in the library and on the stage) which encapsulates its specific content. This allows us to directly associate it with the corresponding ActionScript class.
AUI components which are not realized by media components are mapped to conventional Flash widget components (placed inside the encapsulating movie clip). In this case the encapsulating movie clip has no visual representation of its own on the stage besides the contained widgets. The widgets are labeled with the element name. In figure 5, there are three (invisible) movie clips: one for the output component playerName containing a generated text label, one for the output component playerScore also containing a generated text label, and one
for the action component exit containing a generated button. The movie clip representing an AUI component can encapsulate multiple widgets if necessary, for example a text field and a related text label. As explained in section 3, the elements nested in a movie clip can be accessed from the associated ActionScript class as if they were class properties.
If the AUI component is realized by a media component (in the MML model), the generated code uses another capability of the Flash authoring tool: to reuse a movie clip in multiple documents, it can be referenced by other movie clips in other FLA documents. In this case the destination movie clip retains its original name and properties, but its contents are replaced with those of the referenced movie clip. Changes in the referenced movie clip also appear in the referencing movie clip. We use this mechanism to reuse the movie clips generated for the media components (e.g. heroAnimation in figure 4) in one or more scenes. For instance, the scene Game in figure 5 contains a movie clip heroAnimation which references the heroAnimation from heroAnimation.fla. In the screenshot, the referenced heroAnimation has already been edited, while the referenced enemyAnimation and platformGraphics still contain their default placeholder rectangles.
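To illustrate the code-skeleton side of the mapping in this section, the Python sketch below emits an ActionScript 2 class file for an application entity: properties are declared, and each operation gets a signature with an empty body, since MML does not specify operation bodies. The Entity and Operation data classes, the package name model, and the example members are assumptions made for illustration; this is not the authors' ATL-based generator.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Operation:
    name: str
    params: List[Tuple[str, str]] = field(default_factory=list)  # (name, type)
    returns: str = "Void"


@dataclass
class Entity:
    """Hypothetical, simplified view of an MML application entity."""
    name: str
    package: str = "model"
    properties: List[Tuple[str, str]] = field(default_factory=list)  # (name, type)
    operations: List[Operation] = field(default_factory=list)


def to_actionscript2(entity: Entity) -> str:
    """Render an ActionScript 2 class skeleton as text."""
    # ActionScript 2 has no package block; the package is part of the class name.
    out = [f"class {entity.package}.{entity.name} {{"]
    for prop, typ in entity.properties:
        out.append(f"    private var {prop}:{typ};")
    for op in entity.operations:
        args = ", ".join(f"{n}:{t}" for n, t in op.params)
        out.append(f"    public function {op.name}({args}):{op.returns} {{")
        out.append("        // body intentionally left empty, to be written by hand")
        out.append("    }")
    out.append("}")
    return "\n".join(out) + "\n"
```

For an entity Hero in package model, the result would be saved as model/Hero.as, since ActionScript 2 expects the folder structure to mirror the package name.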
5 Conclusion
In this paper we present a transformation from MML, a language for model-driven development of interactive multimedia applications, to code skeletons for the widespread professional authoring tool Adobe Flash. Further technical contributions of the paper are the MOF-compliant Flash metamodel and the proposed general structure for Flash applications. As they are independent of the modeling approach, they can be reused in other projects which aim to make use of the Flash authoring tool, e.g. web-engineering approaches which aim to generate rich internet applications (e.g. [11]). Currently, approaches in this area usually use frameworks like Flex, but these do not support individual user interfaces created in the Flash authoring tool.
Our approach is based on existing concepts from the literature where possible. In particular, we use the abstract user interface model which is common to many approaches in the field of user interface modeling [12]. Thus, it is possible to combine the multimedia-specific aspects of our approach with, for example, concepts for context-sensitive user interfaces as presented in [6]. Our work is also based on concepts from [13], an existing modeling approach for multimedia applications. However, to our knowledge none of these approaches aims at generating code skeletons for an authoring tool like Flash.
MML and the Flash metamodel are implemented using the Eclipse Modeling Framework (EMF). Currently no custom MML editor exists, but there is an extension for the UML tool MagicDraw which allows creating MML models. The transformations are specified with ATL (see section 3). First user tests with the presented concepts were performed in several student projects, mainly in the lecture “multimedia programming”, where students developed in teams of 5
to 6 persons (relatively complex) multimedia applications with MML (see [5]). The lessons learned from these practical projects are already integrated into the current version of MML and the Flash code structure. In all, the paper provides a general proof of concept for the integration of models and authoring tools and shows the required level of abstraction for the models. This results in a combination of the strengths of both technologies: well-structured applications and better coordinated cooperation of developers through models as well as excellent support for the creative design by established authoring tools. In general, the idea of integrating modeling with more informal techniques and tools for the creative development tasks might be another step towards a better integration of software engineering and human-computer interaction.
References
1. Hirakawa, M.: Do Software Engineers Like Multimedia? In: IEEE International Conference on Multimedia Computing and Systems (ICMCS). IEEE (1999)
2. Pleuß, A.: Modeling the User Interface of Multimedia Applications. In: Wang, F. (ed.) FORTE 2005. LNCS, vol. 3731, Springer, Heidelberg (2005)
3. Pleuß, A.: MML: A Modeling Language for Interactive Multimedia Applications. In: 7th IEEE International Symposium on Multimedia (ISM 2005). IEEE (2005)
4. Paternò, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: A Diagrammatic Notation for Specifying Task Models. In: Interact’97, Chapman & Hall, Sydney, Australia (1997)
5. University of Munich: Lecture Multimedia-Programmierung, Summer Term 2006 (2006) http://www.medien.ifi.lmu.de/studiengang-neu/galerie/mmp-ss06/
6. Van den Bergh, J., Coninx, K.: Towards Modeling Context-Sensitive Interactive Applications. In: SoftVis 2005, ACM Press, New York (2005)
7. Constantine, L.L.: Canonical abstract prototypes for abstract visual and interaction. In: DSV-IS 2003. LNCS, vol. 2844, Springer, Heidelberg (2003)
8. Kleppe, A., Warmer, J., Bast, W.: MDA Explained. Addison-Wesley, Reading (2003)
9. Jouault, F., Kurtev, I.: On the architectural alignment of ATL and QVT. In: Proceedings of the 2006 ACM Symposium on Applied Computing (SAC), ACM, New York (2006)
10. Moock, C.: Essential ActionScript 2.0. O’Reilly Media (2004)
11. Bozzon, A., Comai, S., Fraternali, P., Carughi, G.T.: Capturing RIA concepts in a web modeling language. In: WWW 2006, ACM, New York (2006)
12. Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Souchon, N., Bouillon, L., Florins, M., Vanderdonckt, J.: Plasticity of user interfaces: A revised reference framework. In: TAMODIA, INFOREC Publishing House Bucharest (2002)
13. Sauer, S., Engels, G.: UML-based behavior specification of interactive multimedia applications. In: HCC’01, IEEE (2001)
A Survey on Transformation Tools for Model Based User Interface Development
Robbie Schaefer
Paderborn University, C-LAB, Fürstenallee 11, 33102 Paderborn, Germany
[email protected]
Abstract. As a wide variety of interaction devices and modalities has to be supported by user interface developers, model-based user interface development is receiving increasing attention. Especially when context- and user-awareness come into play, handcrafting a user interface becomes almost impossible. In model-based user interface development, usually several models are applied to describe different aspects of the user interface or to provide varying levels of detail. The relations between the models representing those levels of abstraction are established through transformations, a concept which is also applied in software engineering with the Model Driven Architecture (MDA). In this paper we review several transformation systems and discuss their applicability for model-based user interface development.
Keywords: User Interface Engineering, Model Driven Architecture, Model Based User Interface Development, Transformation Tools.
models while others work on their representations, some are integrated in the model while others are applied externally, some are observable and modifiable while others are hardcoded and not accessible. For this reason, we will review several transformation systems and discuss their applicability for model-based user interface development. The considered transformation approaches and tools are graph transformations (GT) as applied in UsiXML [13], ATL [10], TXL [6], 4DML [4], UIML's internal transformation capability [1], XSLT [11], GAC [8] and RDL/TT [18].
2 Selection and Comparison Criteria
Transformations are an essential part of many domains in computer science and applied computing, for example the transformation of programs, data and models. Since the focus of this survey is on transformations being used for the engineering of user interfaces, a selection had to be made from the plethora of transformation tools. Even if we constrained ourselves to model transformation approaches used in the MDA, as done in [7], we would end up with a comparison of more than twenty candidates with a rich set of comparison features. However, model-based user interface development has taken a variety of paths in the past and has provided several models and transformations other than those used in software engineering. For this reason we selected not only model transformation tools from the MDA but also transformation tools which are common practice in model-based UI development, as well as approaches which have interesting properties that may be exploited in engineering user interfaces and have in fact been used for that purpose.
In order to compare the selected transformation tools, we did not go into the same level of detail as in [7] but took a rather practical approach and looked at several criteria which are of particular importance for model-based UI development. First of all, the programming model is compared: while the distinction between imperative and declarative programming does not allow the expressiveness of an approach to be evaluated, it is important with respect to the UI developer's familiarity with one or the other approach. Furthermore, we looked at the capabilities to transform models, XML and code. The distinction between model and XML transformation is needed since many UI models are described in XML and many FUIs are also XML-based, e.g. XHTML, XForms, WML, VoiceXML etc.
Another important aspect is whether the transformation approach is capable of generating code beyond XML-based FUIs, while the ability to perform complex mappings, as opposed to linear mappings, is evidence of the expressiveness of the approach. Furthermore, we looked at the extensibility and parameterizability of the tools, which make the transformation possibilities more versatile, especially with the plasticity of user interfaces for different devices, contexts, users and modalities in mind.
3 Selected Transformation Approaches
Before comparing the selected transformation languages, we will give a brief introduction to each of them and show how they have been used for the
transformation of user interface models or representations and as such would fit within model-based user interface design.
3.1 Graph Transformations (GT) in UsiXML
A formal, purely declarative approach to model transformation is established with graph transformations as shown in [3], since many models can be designed with the underlying structure of a directed graph. Graph transformations are quite common in tools in the MDA domain, for example in AtoM3 [12]. For this survey, we selected UsiXML as a candidate which uses GT, since UsiXML is specifically designed for the multi-path development of user interfaces and is one of the first approaches which have been proven to be MDA-compliant [22]. The models of UsiXML are based on graphs, and therefore the model mappings of UsiXML are specified with graph transformations which consist of a set of transformation rules [13]. Each rule consists of a Left Hand Side (LHS) matching a graph G, a Negative Application Condition (NAC) not matching G, and a Right Hand Side (RHS) which is the result of the transformation. The LHS may be augmented by additional attributes to further constrain the matches, thus adding to the expressiveness. Since graph transformations allow mappings between any models that are based on a graph, UsiXML allows reification, abstraction and translation between the models. The limitations of this approach lie only in the construction (or reengineering) of the FUI, since the FUI usually is not represented as a graph. Translations between two different FUI formats are also neither possible nor intended with UsiXML.
3.2 ATL
Another language used for model transformation is ATL [10]. ATL follows a hybrid approach in that the user can choose whether to use ATL purely declaratively or to employ imperative features in addition.
The declarative aspect is provided by matching rules, where a source pattern is described through a set of source types and an OCL expression which constrains the source types. The target pattern is constructed in a similar way by specifying a set of target types from the target meta-model and a set of bindings which are used to initialize the features of the target types. While this declarative approach is very straightforward, it may be hard to specify more complex rules with it. For this case, ATL allows an action block with imperative constructs to be added to the rules, or even external code to be called for the logic. Since ATL operates on the models themselves, and not on representations of the models such as XML representations, it is not suited as a transcoding tool for other purposes but only for model transformations. ATL has been successfully applied to the model-driven engineering of plastic user interfaces [21].
3.3 TXL
TXL [6] is a transformation language which is designed for multiple purposes, especially the transformation of programming languages, and is not constrained to any source or target format. This is established through two components:
• A specification of the structure to be transformed, based on grammars in Backus-Naur Form.
• A set of transformation rules based on pattern/replacement pairs and functional programming.
Since the rules are specified in a functional way and the first part of a TXL specification only describes the grammatical structure, TXL can also be considered a mostly declarative language. TXL has also been proven to be capable of model transformations [15]. In fact, the grammar support allows taking the final step from a concrete model to a representation in a programming language, which is not possible with graph transformations.
3.4 4DML
The transformation language of 4DML (four-dimensional markup language) [4] was originally developed to adapt web content for people with special needs and is therefore considered here. It is designed to transform between different notations and as such serves a similarly rich application domain as TXL. But while TXL is intended for transforming programming languages which can be represented as a syntax tree, 4DML supports the transformation of data which comes in a matrix-like structure. The transformation of 4DML documents is specified purely declaratively through the definition of a source pattern matching and the definition of a target model. While 4DML seems to be strong in transforming between completely different languages where the source is organized in an n-dimensional structure, it is somewhat artificial to impose a matrix structure on documents which are organized as trees or graphs.
3.5 UIML Peers
The User Interface Markup Language (UIML) [1] is an XML-based language for describing all relevant aspects of a user interface, such as structure, style, content and behavior. A distinctive aspect of UIML is the capability to define connections to the backend logic and to provide a vocabulary which maps UIML to other UIML instances or target languages.
The latter two aspects are covered in the "peers" section of UIML and provide the transformation features considered in this survey. The "presentation" section of UIML includes mappings of classes (and components) and their properties to target format constructs, while the "logic" section is used to manage the connection to the application logic. A presentation section usually has a name, which allows different presentation sections to be provided for different target formats. For example, there may be one presentation section for VoiceXML and one for HTML. Mappings of classes and their properties are not necessarily restricted to XML-based formats; they may also target, e.g., Java constructs. Since UIML’s mapping facility matches class names and provides new values for the matched objects, it can be regarded as declarative. The mapping is, however, only linear and therefore too simple to support complex restructuring tasks. This also limits its use when it comes to model transformation. The obvious advantage of the UIML approach is that the abstract UI representation and the transformation to the FUI can be specified in the same language. Therefore the model based approach using
UIML as presented in [2] uses UIML-internal mappings from the AUI to the CUI and from the CUI to the FUI but requires an external transcoding approach from the task level to the AUI.
3.6 XSLT
The transformation language XSLT [11] is designed for transforming XML-based input into textual (mostly XML-based) output. The input of an XSLT program is a set of XML-based documents. The output can be XML or plain text. With plain text output, an XSLT processor can generate languages other than XML. An XSLT definition defines a set of template rules which associate patterns with templates. Each rule consists of a matching pattern, optional mode and priority attributes, and a template. Matching pattern expressions are defined by a subset of the XPath language and are evaluated with respect to the currently processed (matched) node or the root node. The matching process considers the node’s name, attributes, location in the tree and position in the list, and results in a set of nodes that can be used to provide parameters for the template or as a base for further matching. Expressions operate on five basic types: booleans, numbers, strings, node-sets, and result tree fragments (the first four come from XPath, the last is added by XSLT). Processing usually starts at the root node. When a pattern is successfully matched, the associated template (construction pattern) is executed, recursively where required; the mode may be changed, and matching optionally continues from each matched node. For execution, XSLT provides variables and parameters which can be passed between template rules. For pattern processing, XSLT provides literals, constants, variables, and keys (for cross-referencing), with conditions, list iterations, recursion, sorting, and numbering as control structures. For advanced processing, XSLT offers a powerful set of built-in string functions for creation, deletion, replacement, copying, and concatenation.
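The rule-matching cycle described above can be imitated in a few lines of Python. This is only an analogy and not XSLT: patterns are reduced to element names, elements without a rule are skipped (XSLT's built-in rules would instead descend into their children), and the abstract UI vocabulary (group, input, action) is invented for the example. Only the standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, List

Template = Callable[[ET.Element], ET.Element]
RULES: Dict[str, Template] = {}  # pattern (here: a tag name) -> template


def template(match: str) -> Callable[[Template], Template]:
    """Register a template rule for elements with the given tag name."""
    def register(fn: Template) -> Template:
        RULES[match] = fn
        return fn
    return register


def apply_templates(nodes: Iterable[ET.Element]) -> List[ET.Element]:
    """Process nodes in document order; elements without a rule are skipped."""
    return [RULES[n.tag](n) for n in nodes if n.tag in RULES]


@template("group")
def group(node: ET.Element) -> ET.Element:
    div = ET.Element("div", {"id": node.get("name", "")})
    div.extend(apply_templates(list(node)))  # continue matching on the children
    return div


@template("input")
def text_input(node: ET.Element) -> ET.Element:
    label = ET.Element("label")
    label.text = node.get("label", "")
    ET.SubElement(label, "input", {"type": "text", "name": node.get("name", "")})
    return label


@template("action")
def action(node: ET.Element) -> ET.Element:
    button = ET.Element("button")
    button.text = node.get("label", "")
    return button


def transform(source: str) -> str:
    """Start at the root element and serialize the constructed result tree."""
    root = ET.fromstring(source)
    return ET.tostring(apply_templates([root])[0], encoding="unicode")
```

Calling transform on a group that contains an input and an action yields a div with a labeled text field and a button. Each template function corresponds to one template rule, and the recursive call in group plays the role of continuing the matching from the matched node.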
While the XSLT processing foundation lies in functional programming, the processing allows imperative statements such as iterations and conditions. Therefore XSLT can be considered a hybrid approach.

3.7 GAC

The General Adaptation Component (GAC) [8] has been developed to make web applications more adaptable. In contrast to the other transformation languages presented here, GAC provides explicit means to reference context data to control the adaptation process and is able to modify the contextual data. Since its purpose is to adapt web content, it is able to process HTML and XML in general. The architecture of GAC is notable in that it does not use XSLT to describe the transformation rules, which is otherwise quite common practice in that domain, but provides an RDF-based configuration of the adaptation process. The GAC configuration consists of rules which are bound to conditions. The rules can be of two types: rules for adaptation and rules for updating the usage context. The adaptation rules allow deletion and substitution of XML fragments as well as separation (the process of sourcing fragments out and making them accessible via links) and the inverse process. As the rules provide clear instructions on which operations to perform when a condition holds, we consider this approach to be more imperative.
A Survey on Transformation Tools for Model Based User Interface Development
3.8 RDL/TT

The Rule Description Language for Tree Transformation (RDL/TT) [18] evolved from a domain specific language for adapting Web content to different devices into a transcoding language for multiple purposes, including context-dependent transformations of XML-based UI descriptions. RDL/TT employs a Java-oriented syntax to define the transformation rules, which operate on the DOM tree of the XML document. It combines simple search patterns, based on tag names or a collection of tags, with complex restructuring rules on the matches found. A notable property of RDL/TT is the use of variables which may convey contextual information and thus allow different flows of transcoding operations for varying preferences, target platforms and contexts of use. The transcoding rules are specified in an imperative manner and provide several control structures such as branches and loops together with calls to predefined transcoding functions. The set of transcoding functions is extensible by compiling additional transcoding libraries into the tool, which has for example been used to include image processing rules to adapt visual content besides the user interface itself. In practice, RDL/TT has for example been used for context-based adaptation of web content in [17] and with a generic user interface format in [16].
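To make the imperative, context-driven style concrete, the following sketch mimics it in Python. It is not RDL/TT syntax; the context variables `screen_width` and `device`, the tag names, and the function `transcode` are assumptions chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

def transcode(root, context):
    """Restructure the tree in place; context variables steer the control flow."""
    # Branch on a (hypothetical) context variable: narrow screens lose images.
    if context.get("screen_width", 1024) < 320:
        for parent in list(root.iter()):
            for child in list(parent):
                if child.tag == "img":
                    parent.remove(child)
    # Loop over the matches of a simple tag-name search pattern and rewrite
    # them, but only for one (hypothetical) target platform.
    if context.get("device") == "phone":
        for table in list(root.iter("table")):
            table.tag = "ul"
            for row in list(table.iter("tr")):
                row.tag = "li"
    return root

DOC = '<page><img src="logo.png"/><table><tr>A</tr><tr>B</tr></table></page>'

phone = transcode(ET.fromstring(DOC), {"device": "phone", "screen_width": 240})
print(ET.tostring(phone, encoding="unicode"))
```

The same rule set yields different results for different contexts because the branches read the context variables, whereas the search patterns themselves stay simple.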
4 Comparison and Discussion

Table 1 shows the support of different transformation characteristics for the languages we discussed. If a feature is supported, it is marked with a "+" in the table; if not, it is marked with "-". If a supported feature is put in brackets, it is supported in principle (maybe with some additional effort), but the language is not specifically designed to support that property.

Table 1. Comparison of general transformation language properties
Feature Declarative Imperative Model Transformation XML Transformation Code Transformation Code Generation Complex Mapping Extensible Parameterizable
ATL + + +
GT + +
TXL + (+)
4DML + (+)
XSLT + + (+)
GAC + (+)
UIML + (+)
RDL + (+)
-
-
(+)
(+)
+
+
-
+
-
-
+
(+)
-
-
-
-
-
-
+
+
(+)
-
+
(+)
+
+
+
+
+
+
-
+
+ -
-
-
-
-
+
-
+ +
While the distinction between declarative and imperative programming says nothing about the capability of the transcoding language, it may be a selection criterion for programmers who feel more familiar with one of these programming models. On the other hand, a clear distinction between declarative and imperative transformation systems is not always possible. While, for example, Graph Transformations (GT) are clearly declarative, the declarative aspects of TXL are somewhat diluted. For this reason, the marks for declarative and imperative indicate the strongest tendencies. With respect to the ability to transform models (and as such to transform UI models), only ATL and graph transformations are really designed for it. However, tools which operate on XML documents are capable of processing the XML representations of the models, so XSLT, GAC and RDL/TT are in principle capable of model transformations. For TXL it is also possible, but here the structure must first be established through an appropriate grammar. 4DML also has to establish the structure first, which holds for any type of input for 4DML. The ability for model transformation is weakest in the UIML peers section. In fact it is not designed for model transformation at all, but it at least allows the transformation from the AUI to the CUI and to the FUI. The latter transformation is the actual goal of the peers section. Since graph transformations and ATL are designed to work on the model only, they are not capable of processing general XML documents, and even less of processing arbitrary code.¹ XSLT, RDL/TT and GAC only work with XML documents and as such are not usable for code transformation as done with TXL, although XSLT and RDL/TT can at least produce non-XML code out of an XML document. The code generation ability is, however, more evident in 4DML, TXL and UIML.
In the scope of code transformation, the UIML peers section, on the other hand, is only able to match UIML elements, but it is very well suited to producing output for different target languages. Apart from UIML peers, all considered languages allow complex mappings, which means restructuring the source they operate on. UIML only provides a linear one-to-one mapping. While RDL/TT is both extensible with additional functionality and parameterizable, only a few of the other approaches come with these features: ATL allows extensions by calling native operations, and GAC is able to process and change context information. While in GAC the context modification happens within the rules, it is separated in RDL/TT: the context information is fed to variables, but processing and modifying the context information is performed with a different rule set.
¹ Of course, XML document trees can be interpreted as graphs and as such are potentially subject to graph transformations, but we look explicitly at embedded graph transformations as in UsiXML.

5 Conclusion

In this paper, we compared several transformation tools and languages with respect to a selected set of criteria that we consider important for model based user interface development from a more practical view. For this reason we included the programming model, since it may be a primary criterion for a developer who is familiar with either declarative or imperative programming. We also identified different levels of transformation: model to model transformations, transformations on XML representations of models, and code transformations. The ability to generate code with a transformation tool is of equal importance in order to cover the complete modeling pipeline. While the capability of specifying complex transformations is very important for most applications, extensibility and parameterizability are more important to a subset of user interface development tasks, for example for building context-dependent applications. Given the variety of modeling tasks, it is impossible to definitively recommend or dismiss any of the compared candidates, which have different strengths and weaknesses for different applications. For purely model driven approaches, graph transformations and ATL are good choices, but the XML-processing tools will also do if the model representations come with an XML syntax. Most problems can be seen with 4DML, since a matrix structure has to be established first, which is rather unnatural for user interface models, although it indirectly supports most of the required features. The UIML peers section, on the other hand, provides only very few features but has proven to be very strong in connecting to specific target toolkits. In short, the choice of the transformation tool largely depends on the models, their applied representation and the targeted application. Sometimes a combination of different transformation approaches is advisable, for example when graph transformations are used for model to model transformations on higher levels of abstraction but the last step towards the final UI requires code generation.
In addition, many user interface modeling tools come with internal transformations which are neither observable nor controllable. However, their models may be of interest to the user interface developer and are often available with an XML schema. It therefore makes sense to use the desired models with a self-developed rule set for one of the XML-processing approaches (XSLT, GAC, RDL/TT and, to a lesser extent, TXL and 4DML) in order to bypass the tools' fixed internal transformations and, for example, to target new formats or improve the tools' transformation results. To further compare the performance, the code size of the transcoding rule representations, the ease of definition and other specific aspects of the transformation tools, more detailed tests are required, as we did for UIML, XSLT and RDL/TT in the past [16]. Furthermore, an evaluation against the design features developed in [11] would be appropriate to get a denser picture and to give user interface developers as well as modeling tool developers a higher level of detail for their choice of transformation approach.
References

1. Abrams, M., Helms, J.: User Interface Markup Language (UIML) Specification, Working Draft 3.1. OASIS (2004)
2. Ali, M.F., Pérez-Quiñones, M., Abrams, M.: Building Multi-Platform User Interfaces with UIML. Multiple User Interfaces - Cross-Platform Applications and Context-Aware Interfaces, pp. 95–118. John Wiley & Sons, Ltd, New York (2004)
3. Andries, M., Engels, G., Habel, A., Hoffmann, B., Kreowski, H.-J., Kuske, S., Plump, D., Schürr, A., Taentzer, G.: Graph transformation for specification and programming. Science of Computer Programming 34(1), 1–54 (1999)
4. Brown, S.S.: Conversion of notations. Technical report, University of Cambridge (2004)
5. Calvary, G., Coutaz, J., Thevenin, D., Limbourg, Q., Bouillon, L., Vanderdonckt, J.: A Unifying Reference Framework for Multi-Target User Interfaces. Interacting with Computers 15(3), 289–308 (2003)
6. Cordy, J.R.: The TXL Source Transformation Language. Science of Computer Programming 61, 190–210 (2006)
7. Czarnecki, K., Helsen, S.: Classification of Model Transformation Approaches. OOPSLA'03 Workshop on Generative Techniques in the Context of Model-Driven Architecture (2003)
8. Fiala, Z., Houben, G.-J.: A Generic Transcoding Tool for Making Web Applications Adaptive. In: Proceedings of the CAiSE'05 Forum. CEUR Workshop Proceedings (2005)
9. Gerber, A., Lawley, M., Raymond, K., Steel, J., Wood, A.: Transformation: The Missing Link of MDA. In: Corradini, A., Ehrig, H., Kreowski, H.-J., Rozenberg, G. (eds.) ICGT 2002. LNCS, vol. 2505, pp. 90–105. Springer, Heidelberg (2002)
10. Jouault, F., Kurtev, I.: Transforming Models with ATL. In: Bruel, J.-M. (ed.) MoDELS 2005. LNCS, vol. 3844, pp. 128–138. Springer, Heidelberg (2006)
11. Kay, M.: XSL Transformations (XSLT) Version 2.0, W3C Working Draft. World Wide Web Consortium (2002)
12. de Lara, J., Vangheluwe, H.: AToM3: A Tool for Multi-formalism and Meta-modelling. In: Kutsche, R.-D., Weber, H. (eds.) ETAPS 2002 and FASE 2002. LNCS, vol. 2306, pp. 174–188. Springer, Heidelberg (2002)
13. Limbourg, Q., Vanderdonckt, J., Michotte, B., Bouillon, L., López-Jaquero, V.: Usixml: A language supporting multi-path development of user interfaces. In: Bastide, R., Palanque, P., Roth, J. (eds.) Engineering Human Computer Interaction and Interactive Systems. LNCS, vol. 3425, pp. 200–220. Springer, Heidelberg (2005)
14. Nunes, N.J., e Cunha, J.F.: Wisdom - A UML Based Architecture for Interactive Systems. In: Palanque, P., Paternó, F. (eds.) DSV-IS 2000. LNCS, vol. 1946, pp. 191–205. Springer, Heidelberg (2001)
15. Paige, R., Radjenovic, A.: Towards Model Transformations with TXL. In: First International Workshop on Metamodelling for MDA, pp. 162–177 (2003)
16. Plomp, J., Schaefer, R., Mueller, W.: Comparing Transcoding Tools for Use with a Generic User Interface Format. Extreme Markup Languages (2002)
17. Schaefer, R., Mueller, W., Dangberg, A.: Fuzzy Rules for HTML Transcoding. In: Hawaii International Conference on System Sciences, HICSS 35 (2002)
18. Schaefer, R., Mueller, W., Dangberg, A.: RDL/TT: A Description Language for the Profile-Dependent Transcoding of XML Documents. In: Proceedings of the first International ITEA Workshop on Virtual Home Environments (2002)
19. Soley, R. and the OMG Staff Strategy Group: Model Driven Architecture. White Paper. Object Management Group (2000)
20. Sottet, J.-S., Calvary, G., Favre, J-M.: Towards Model Driven Engineering of plastic User Interfaces. Model Driven Development of Advanced User Interfaces MDDAUI '05. In: CEUR Workshop Proceedings, vol. 159 (2005)
21. Vanderdonckt, J.: A MDA-Compliant Environment for Developing User Interfaces of Information Systems. In: Pastor, Ó., Falcão e Cunha, J. (eds.) CAiSE 2005. LNCS, vol. 3520, pp. 16–31. Springer, Heidelberg (2005)
A Task Model Proposal for Web Sites Usability Evaluation for the ErgoMonitor Environment

André Luis Schwerz¹, Marcelo Morandini², and Sérgio Roberto da Silva¹

¹ Informatics Department – State University of Maringa, Av. Colombo, 5790 – Bloco 19 – Centro – Maringá – PR, Brazil
² School of Arts, Sciences and Humanities – University of São Paulo, Av. Arlindo Bettio, 1000 – Ermelino Matarazzo – Brazil
{[email protected], [email protected], [email protected]}
Abstract. In this paper we present a task model for the usability monitoring environment called ErgoMonitor. ErgoMonitor performs a usability evaluation of websites through the selective collection and analysis of data from log files that record the real interactions between end users and a web interface. However, the ErgoMonitor depends on the prior identification of the users' expected behaviors. This activity is conducted by a usability specialist, who must observe the website characteristics, or rely on a previous diagnosis from a traditional usability evaluation, to define which tasks (behaviors) should be included in the evaluation script. We therefore developed a mechanism to register the expected user behaviors, conceiving the Monitoring Tasks and Behaviors Model. This mechanism enabled the ErgoMonitor to perform usability evaluations of websites based on their log files.

Keywords: Interactive Systems Usability Evaluation; Human-Computer Interaction; Web Sites; Server Log Files.
A Task Model Proposal for Web Sites Usability Evaluation
StateWebCharts [2] in confrontation with the scenarios obtained from the task model, represented by the diagrammatic notation proposed in [6]. This method addresses evaluation in the initial phase of a project; however, it still needs tools to aid the UE phase by analyzing how users interact with the Web UI based on the developed scenarios. To apply a UE with the support of the ErgoMonitor, a specialist in usability and in the application domain, whom we call the evaluator-operator, has to specify the tasks that compose the evaluation script. However, in this environment the evaluator-operator must manually identify the tasks of the script in the server log files. This work is very slow and error-prone when the log files contain a large number of hits. This paper therefore proposes a tool that reduces the work the evaluator-operator needs to identify the tasks in server log files. The tool allows the evaluator-operator to elaborate the Monitoring Task and Behavior Model that the ErgoMonitor uses to monitor usability. Section 2 presents an overview of the main steps of the ErgoMonitor environment. Section 3 describes the proposed Monitoring Task and Behavior Model as an improvement of the ErgoMonitor. Section 4 demonstrates how the Monitoring Task and Behavior Model can be used for cleaning server log files. Section 5 presents and discusses the results of applying the ErgoMonitor to an institutional website.
2 ErgoMonitor

The ErgoMonitor environment was proposed as a usability monitoring system for Web UIs that collects and analyzes the hits in the log files, which record the real interactions between the users and the Web UI. Based on data selected from the log files and on models of expected behaviors, the tool calculates taxes (rates) and metrics that quantify the usability of the website. All these procedures are imperceptible and invisible to the users. The user interacts normally with the Web UI and, while these interactions occur, the server stores the data relating to them in its log files. Later, the ErgoMonitor collects and analyzes these files to elaborate usability measures of the Web UI. Figure 1 illustrates an overview of the components of the ErgoMonitor environment. As can be seen, the environment is divided into components that represent the activities performed by the evaluator-operator and components that represent the activities performed by the ErgoMonitor software. In the next paragraphs we describe each of these components.

Monitoring Analysis – In this activity the evaluator-operator observes the web application and selects which tasks must compose the Monitoring Tasks and Behavior Model. The evaluator-operator can rely on a traditional UE to define which tasks must be included in the evaluation script.

LogControl – This component is composed of a group of programs that clean the server log files and count each user behavior based on the Monitoring Tasks and Behavior Model. The expected output of LogControl is the Verified Tasks and Behavior Model, which supports the determination of the taxes and metrics. The LogControl is described in Section 4.
A.L. Schwerz, M. Morandini, and S.R. da Silva
Fig. 1. An overview of ErgoMonitor
Usability Taxes and Metrics Determination – This step determines the taxes and metrics that quantify a Web UI. A tax is an intermediate measure that qualifies the interaction. A metric represents a usability measure as established by the ISO 9241 standard [3] (basically metrics of effectiveness and efficiency). Depending on the resources of the interface, the ErgoMonitor can supply the evaluator-operator with the taxes and metrics shown in Table 1.

Historic Parameters Database – This component is a database that can store information about previously performed UEs. The data stored in this database can be compared with the data obtained from the users' interaction with a new Web UI (when there is the need for a re-design, for instance), thus verifying the contribution of the modifications made in this new Web UI. Moreover, an evaluator who has experience with similar UIs can establish values considered permissible for the users' interactions. From the verified and the permissible values, the evaluator is able to elaborate a monitoring report that may confirm the existence of previously diagnosed usability problems or indicate new problems that must be analyzed later.

Table 1. Usability metrics and taxes generated by the ErgoMonitor environment

Usability Taxes
• amount of Verified Behaviours (VB)
• amount of Successful Behaviours (SB)
• amount of SB with Help (SH)
• amount of SB with Error (SE)

Usability Metrics
• Rate of Efficiency: $\dfrac{SB}{VB}$
• Mean Time to Task: $\dfrac{\sum time_{SB}}{SB}$
• Incidents in Success Rate: $\dfrac{SH + SE}{SB}$
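The three metrics of Table 1 follow directly from the four taxes. A minimal sketch in Python is given below; the function name, the dictionary keys and the sample numbers are hypothetical and not taken from the ErgoMonitor implementation.

```python
def usability_metrics(vb, sb, sh, se, success_times):
    """Compute the Table 1 metrics from the taxes VB, SB, SH and SE.

    `success_times` holds one duration (e.g. in seconds) per successful
    behaviour, so its length is expected to equal SB.
    """
    if vb <= 0 or sb <= 0:
        raise ValueError("VB and SB must be positive to form the ratios")
    if len(success_times) != sb:
        raise ValueError("one time value per successful behaviour is expected")
    return {
        "rate_of_efficiency": sb / vb,
        "mean_time_to_task": sum(success_times) / sb,
        "incidents_in_success_rate": (sh + se) / sb,
    }

# Made-up sample values, only to show the call.
metrics = usability_metrics(vb=4, sb=2, sh=0, se=1, success_times=[181, 241])
print(metrics)
```

The guard clauses matter in practice: a task that nobody attempted, or that nobody completed, has no defined ratio and should probably be reported separately rather than as zero.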
Report Generation – The evaluator-operator receives the usability taxes and metrics calculated by the ErgoMonitor and, using the permissible parameters, can detect possible usability problems in the website.

We showed that the ErgoMonitor was conceived as a system able to monitor the real interactions of real users who are actually using the website in their own work environments. This defines the focus of attention of the UE: usability problems that effectively occur, even without full knowledge of the conditions of the context of use involved in such problems. However, the environment had only been defined and systematized, and the evaluator-operator was forced to identify the tasks in the server log files manually. Moreover, the cleaning mechanism for the log files had not been implemented. We present these innovations in the next sections.
3 The Monitoring Task and Behavior Model

The result of the monitoring analysis activity is the Monitoring Tasks and Behaviors Model. Morandini defined that the evaluator-operator is responsible for modeling the tasks that the ErgoMonitor uses to perform a UE by analyzing the users' behaviors when they interact with these tasks. However, the evaluator-operator has to manipulate server log files to identify the accesses to the first page and the last page of a task by means of special characters. This work is extremely difficult for websites that have a large number of tasks; it may therefore lead to modeling errors and demands a high workload from the evaluator-operator.

3.1 Task Specification Problem Formalization

A task is an objective associated with an ordered set of actions that can satisfy that objective in the appropriate contexts [8]. Translated into our context, a task is a sequence of steps that the user must perform to reach a specific objective. Thus, the evaluator-operator has to determine the steps that compose each task, establishing its beginning and end. Each of these steps is an access to a webpage and must be identified by its URL. Formally, we can define a task $k$ as the quadruple $t_k = \langle S, p_o, p_f, \delta \rangle$, where $S$ is a finite set of webpages (of size $n$) that must be viewed by the users for the task, $p_o$ is the first webpage, $p_f$ is the last webpage, which identifies the success of the task, and $\delta: S \to S$ is the transition function that identifies the change from a webpage $p_i$ to another webpage $p_j$, where $i, j \le n$. We then define the Monitoring Tasks and Behaviors Model as $TM = \{t_1, t_2, \ldots, t_m\}$, where $m$ is the number of tasks in the model. It is important to remember that each webpage is identified by a URL and that the server log files store, in sequential order, a list of URLs that determines the access history of each user.
A common task in a website could be "User Register", which the evaluator-operator could analyze as illustrated in Figure 2.
Fig. 2. Example of a task
When users want to register themselves in a website, they must access index.html, then register.php and, finally, success.html. However, on finding a problem in the page register.php, the user can access the help page (help.php). If a problem persists and the user insists on registering, he/she is directed to the error page (error.php). The user only succeeds in his/her objective on reaching the success page (success.html). Formally, we can define the example of Figure 2 as: • • • •
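Since the four items of the formal definition are not reproduced above, the following Python sketch shows one way the "User Register" task could be written down as the quadruple of Section 3.1. The class `Task` and its field names are hypothetical. The transition pairs are an assumption derived from the prose description, not a transcription of Figure 2, and the transitions are stored as a set of page pairs because register.php has several possible successors.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    pages: frozenset        # S: pages that belong to the task
    first_page: str         # p_o
    last_page: str          # p_f: reaching it means success
    transitions: frozenset  # delta, stored as allowed (p_i, p_j) pairs

    def __post_init__(self):
        # Basic consistency checks on the quadruple.
        if self.first_page not in self.pages or self.last_page not in self.pages:
            raise ValueError("p_o and p_f must belong to S")
        for src, dst in self.transitions:
            if src not in self.pages or dst not in self.pages:
                raise ValueError(f"transition {src} -> {dst} leaves S")

    def allows(self, src, dst):
        """True if moving from page `src` to page `dst` is part of the task."""
        return (src, dst) in self.transitions

# Assumed transitions for the example task (see the caveat above).
USER_REGISTER = Task(
    name="User Register",
    pages=frozenset({"index.html", "register.php", "help.php",
                     "error.php", "success.html"}),
    first_page="index.html",
    last_page="success.html",
    transitions=frozenset({
        ("index.html", "register.php"),
        ("register.php", "help.php"),
        ("help.php", "register.php"),
        ("register.php", "error.php"),
        ("error.php", "register.php"),
        ("register.php", "success.html"),
    }),
)

# The Monitoring Tasks and Behaviors Model TM is then a collection of tasks.
TASK_MODEL = [USER_REGISTER]
```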
3.2 An Overview of the Registration of the Monitoring Tasks and Behaviors Model

The main focus of this work is to develop a mechanism with which the ErgoMonitor's evaluator-operator can register and manipulate the task model. We developed an application using Borland Delphi 6 (currently being converted to the Java language because of a portability requirement) together with a MySQL 4.1 database, in which the evaluator-operator can register the website to be evaluated and then register its webpages by their URLs. With this information defined, the evaluator-operator can register the tasks that he/she wants to evaluate. Figure 3 presents the class diagram of the Monitoring Tasks and Behaviors Model of the ErgoMonitor. The classes shown in dark gray and light gray are presented in the next section and do not belong to the Monitoring Tasks and Behaviors Model.
Fig. 3. Class Diagram of Monitoring Tasks and Behaviors Model
4 LogControl

The LogControl is a set of programs that are executed sequentially. These programs receive the Monitoring Tasks and Behaviors Model and the server log files and generate the Verified Task and Behaviors Model. The stages that compose the LogControl are:
• Cleaning: the elimination of unnecessary information from the log files, leaving only the URLs of the pages accessed by the users;
• Search: the search for occurrences of the URLs that are in the Monitoring Tasks and Behaviors Model specified in the previous stages; and
• Organization: the manipulation of the data to organize the occurrences of behaviors in the log files in order to facilitate the determination of the usability taxes and metrics.

4.1 The Cleaning Mechanism

The purpose of the cleaning mechanism is to eliminate from the log files the accesses that are unnecessary for the UE. To do so, the cleaning mechanism receives the server log file and the Monitoring Tasks and Behaviors Model and generates a filtered log file. We can consider a website as a set of web resources, represented by $R = \{r_1, r_2, r_3, \ldots, r_v\}$, where $v$ is the number of resources available in the website, and the set $U = \{u_1, u_2, u_3, \ldots, u_w\}$ represents all the users that have access to the website. Thus, it
is possible to represent a hit in the log files as $l_i = \{u_i, d, h, c, r_i\}$, where $u_i \in U$, $r_i \in R$, $d$ represents the access date, $h$ represents the access hour and $c$ is the status code that identifies whether the request succeeded or failed. When users view a page, they are viewing the set of resources that this page represents. Thus, we can represent a pageview as $p_i = \{r_{i1}, r_{i2}, \ldots, r_{ip}\}$, where $r_{ij} \in R$ and $p \le v$. We know that $r_{i1}$ represents the resource requested by the user, and the others are embedded resources, which are requested by the web browser. On the other hand, the Monitoring Tasks and Behaviors Model presents sequences of URLs for each task, that is, sequences of pages that must be visited by the user. We defined that a task $k$ can be represented by the quadruple $t_k = \langle S, p_o, p_f, \delta \rangle$, where $S$ is the set of webpages that compose the task. Hence, we can say that $S = \{p_1, p_2, \ldots, p_n\}$, where $n$ is the maximum number of pageviews for a task. Now we can clean the log file in such a way that each record no longer represents an access to a resource but an access to a page. Thus, one hit in the filtered log file is represented as $lf_i = \{u_i, d, h, c, p_i\}$. Therefore, we can define the cleaning problem as follows: given a log file with hits $l_i$ and a Monitoring Task and Behaviors Model $TM$, transform the hits of this file into $lf_i$ in such a way that all the hits in the file represent a pageview.

4.2 The Search and the Organization

The search is the process that identifies the users' behaviors in the log files, and the organization is the process that scores these behaviors. The expected output file of these processes is the Verified Tasks and Behaviors Model with the following information:
• Task accesses;
• Task success;
• Task unsuccess;
• Task with error page access; and
• Task with help page access.
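The cleaning step of Section 4.1 can be sketched as follows. This is a simplified illustration in Python, not the LogControl implementation: it assumes log lines shaped like the fragment shown later in Table 2 (which carries no status code), it keeps a hit only when the requested resource is one of the pages of the task model, and the names `Hit`, `parse_hit` and `clean` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class Hit(NamedTuple):
    user: str      # u_i, here the client IP address
    date: str      # d
    hour: str      # h
    resource: str  # r_i, the requested URL

# Assumed line shape: IP - - [date:hour] "url" (straight or curly quotes).
HIT_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^:\]]+):([^\]]+)\] ["\u201c]([^"\u201d]+)["\u201d]'
)

def parse_hit(line: str) -> Optional[Hit]:
    """Turn one log line into a Hit, or None if the line does not match."""
    match = HIT_PATTERN.match(line.strip())
    return Hit(*match.groups()) if match else None

def clean(lines, model_pages):
    """Keep only hits whose resource is a page of the task model."""
    hits = (parse_hit(line) for line in lines)
    return [h for h in hits if h is not None and h.resource in model_pages]

LOG = [
    '201.22.90.195 - - [30/Nov/2005:09:50:11] "/index.html"',
    '201.22.90.195 - - [30/Nov/2005:09:50:12] "/images/menu.jpg"',
    '201.22.90.195 - - [30/Nov/2005:09:51:50] "/register.php"',
    '201.22.90.195 - - [30/Nov/2005:09:51:50] "/images/logo.jpg"',
    '201.22.90.195 - - [30/Nov/2005:09:53:12] "/success.html"',
]
MODEL_PAGES = {"/index.html", "/register.php", "/help.php",
               "/error.php", "/success.html"}

for hit in clean(LOG, MODEL_PAGES):
    print(hit.user, hit.date, hit.hour, hit.resource)
```

Dates and hours are kept as plain strings on purpose, so that the sketch does not reject log entries with unusual timestamps; a real implementation would also need the status code $c$ and a grouping of hits per user.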
A hit $i$ of the filtered server log file was defined as $lf_i = \{u_i, d, h, c, p_i\}$. On the other hand, a task $k$ was defined by the quadruple $\langle S, p_o, p_f, \delta \rangle$, where $\delta$ is the function that defines the transitions between the pages of task $k$. In order to produce the Verified Tasks and Behaviors Model, the program must search the filtered log file for the first transition of the task, that is, a transition from page $p_o$ to any page of the set $S$ allowed by the transition function $\delta$. After finding the first transition, it must mark one access for this task and continue the analysis until it finds the last page $p_f$ (task success) or a transition that does not match the transition function $\delta$ of this task (task unsuccess). We defined that each page belongs to a type (content, error or help) in the Monitoring Tasks and Behaviors Model (see Figure 3). Thus, we can determine when a user accessed an error page and/or a help page during his/her interaction with the Web UI.
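The search and scoring procedure just described can be sketched in Python as below. The sketch assumes that the filtered log has already been reduced to the page sequence of a single user, and it reuses an assumed transition set for the "User Register" task; the function name `score` and the result keys are hypothetical.

```python
def score(pages, first, last, delta, page_type):
    """Count task behaviours in one user's filtered page sequence.

    pages     : list of page URLs in access order
    first/last: p_o and p_f of the task
    delta     : set of allowed (p_i, p_j) transitions
    page_type : maps a page to "error" or "help"; other pages are "content"
    """
    result = {"accesses": 0, "success": 0, "unsuccess": 0,
              "with_error": 0, "with_help": 0}
    i = 0
    while i < len(pages) - 1:
        # Look for the first transition of the task, starting at p_o.
        if pages[i] != first or (pages[i], pages[i + 1]) not in delta:
            i += 1
            continue
        result["accesses"] += 1
        seen_types = set()
        outcome = "unsuccess"
        j = i + 1
        while True:
            seen_types.add(page_type.get(pages[j], "content"))
            if pages[j] == last:                      # reached p_f
                outcome = "success"
                break
            if j + 1 >= len(pages) or (pages[j], pages[j + 1]) not in delta:
                break                                 # left the task
            j += 1
        result[outcome] += 1
        result["with_error"] += int("error" in seen_types)
        result["with_help"] += int("help" in seen_types)
        i = j                                         # resume after this behaviour
    return result

# Assumed transitions of the example task, not a transcription of Figure 2.
DELTA = {("index.html", "register.php"), ("register.php", "help.php"),
         ("help.php", "register.php"), ("register.php", "error.php"),
         ("error.php", "register.php"), ("register.php", "success.html")}
TYPES = {"error.php": "error", "help.php": "help"}

SEQUENCE = ["index.html", "register.php", "success.html",
            "index.html", "register.php", "error.php",
            "register.php", "success.html"]

print(score(SEQUENCE, "index.html", "success.html", DELTA, TYPES))
```

A behaviour that ends because the log runs out is counted as unsuccessful here; whether such truncated sessions should be treated differently is a design decision the text above leaves open.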
In Figure 3, the classes painted in dark gray implement these processes and are briefly described below:
• File: this class represents the filtered server log file;
• File_Page: this associative class represents the pages (besides IP, date and time) that are in the filtered server log file; and
• Evaluation: this class represents the execution of the evaluation, and its attributes represent the calculated behaviors.

To demonstrate the operation of LogControl, Table 2 presents a fragment of a log file with possible accesses to the task illustrated in Figure 2. Notice that the accesses in bold are those that the cleaning process identifies as belonging to the task illustrated in Figure 2.

Table 2. Fragment of log file of accesses in the User Register task

201.22.90.195 - - [30/Nov/2005:09:50:11] “/index.html”
201.22.90.195 - - [30/Nov/2005:09:50:12] “/images/menu.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/logo.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/menu_register.jpg”
201.22.90.195 - - [30/Nov/2005:09:50:13] “/images/animation.swf”
201.22.90.195 - - [30/Nov/2005:09:51:50] “/register.php”
201.22.90.195 - - [30/Nov/2005:09:51:50] “/images/logo.jpg”
201.22.90.195 - - [30/Nov/2005:09:53:12] “/success.html”
201.22.90.195 - - [31/Nov/2005:10:30:11] “/index.html”
201.22.90.195 - - [31/Nov/2005:10:30:12] “/images/menu.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/menu_register.jpg”
201.22.90.195 - - [31/Nov/2005:10:30:13] “/images/animation.swf”
201.22.90.195 - - [31/Nov/2005:10:31:12] “/register.php”
201.22.90.195 - - [31/Nov/2005:10:31:13] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:33:12] “/error.php”
201.22.90.195 - - [31/Nov/2005:10:33:40] “/register.php”
201.22.90.195 - - [31/Nov/2005:10:33:40] “/images/logo.jpg”
201.22.90.195 - - [31/Nov/2005:10:34:12] “/success.html”
Fig. 4. The search and organization processes in the fragment of filtered log file
In Figure 4 we present the result of the search and organization processes applied to the filtered log file fragment of Table 2. Moreover, Figure 4 shows the users' behaviors when interacting with the task. These behaviors compose the Verified Tasks and Behaviors Model, which is the output of LogControl.
5 Experiment
We used the ErgoMonitor environment with the additional modules to perform a UE of the Construtora Cidade Verde (CCV) website¹. We collected the log files for December 2005, January 2006 and March 2006, which together amount to 22.6 MB of text with more than 113 thousand hits. The evaluator-operator generated the Monitoring Tasks and Behaviors Model with 64 tasks covering the main functionalities of the website. After registering the Monitoring Tasks and Behaviors Model, we cleaned the log files, which resulted in a set of filtered log files totaling 3442 hits in 638 KB, a considerable reduction in both size and number of hits. This shows that identifying the tasks in the log file manually is impractical, given the large number of hits and the high number of tasks. Finally, Table 3 shows the results for some of the tasks evaluated in the CCV website. Accesses to error and help pages could not be calculated because the website has no pages of these types.

Table 3. Results of tasks evaluated in the website of CCV

Task                          Access  Success  Unsuccess
New Building                      53        9         44
Building under Construction      200       76        124
News                              12       12          0
Useful Links                     366      366          0
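The counts in Table 3 can be turned into success rates with simple arithmetic. The sketch below is an illustration only: the rates are derived here from the published counts and are not figures reported by the authors, and the names `TABLE_3` and `success_rate` are made up for this example.

```python
# Counts copied from Table 3: task -> (accesses, successes).
TABLE_3 = {
    "New Building": (53, 9),
    "Building under Construction": (200, 76),
    "News": (12, 12),
    "Useful Links": (366, 366),
}


def success_rate(accesses, successes):
    """Fraction of accesses that ended in success (0.0 if never accessed)."""
    return successes / accesses if accesses else 0.0


def unsuccess_count(accesses, successes):
    """Accesses that did not end in success, as in the last table column."""
    return accesses - successes


# Derived values, not part of the original table.
RATES = {task: success_rate(*counts) for task, counts in TABLE_3.items()}
```

For instance, the New Building task succeeds in 9 of 53 accesses, about 17%, while News and Useful Links succeed in every recorded access.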
6 Conclusions
In this paper we presented a mechanism to register the Monitoring Tasks and Behaviors Model. This mechanism provides the evaluator-operator with an interface to register the tasks that he/she wishes to evaluate. We also presented ErgoMonitor as an environment able to monitor the usability of a website. However, ErgoMonitor could not be applied to websites with a large volume of data in their log files and a large number of tasks to be evaluated, because the evaluator-operator had to spend considerable effort identifying the tasks in the log files manually. After the Monitoring Tasks and Behaviors Model is produced, the evaluator-operator must provide the log files that he/she wishes to use in the evaluation. LogControl receives these log files and, with the support of the Monitoring Tasks and Behaviors Model, cleans them by eliminating the data that are unnecessary for the UE. In addition, LogControl identifies the users' behaviors in the filtered log files to obtain the Verified Tasks and Behaviors Model. We thus verified that implementing these approaches made ErgoMonitor able to perform quantitative UE on large websites and reduced the workload of the evaluator-operator.

¹ This website can be accessed at http://www.construtoracidadeverde.com.br and presents the portfolio of a building construction company.
References
1. Downton, A.: Engineering the Human-Computer Interface. McGraw-Hill, London (1992)
2. Harel, D.: Statecharts: a visual formalism for complex systems. Science of Computer Programming, pp. 231–274 (1987)
3. ISO 9241. Ergonomic Requirements for Office Work with Visual Display Terminals, Part 11 Usability Statements; Draft International Standard ISO 9241-11 (1993)
4. Lea, M.: Evaluating User Interface Designs. User Interface Design for Computer Systems, Chichester, pp. 134–167 (1988)
5. Nielsen, J.: Projetando Web Sites - Designing Web Usability, Editora Campus, p. 416 (2000)
6. Paterno, F., Mancini, C., Meniconi, S.: ConcurTaskTrees: a Diagrammatic Notation for Specifying Task Models, INTERACT 97, pp. 362–369. Chapman & Hall, Sydney (1997)
7. Sears, A.: Layout Appropriateness: A Metric For Evaluating User Interface Widget Layout. IEEE Transactions on Software Engineering 19(7), 707–719 (1993)
8. Storrs, G.: The Notion of Task in Human-Computer Interaction. In: HCI 95 - 10th Annual Conference of the British Human-Computer Interaction Group, University of Huddersfield, UK (1995)
9. Treu, S.: User Interface Evaluation: A Structured Approach, p. 351. Plenum Press, New York (1994)
10. Winckler, M., Palanque, P., Farenc, C., Pimenta, M.: Task-Based Assessment of Web Navigation Design. In: Proc. of TAMODIA’02 Task Models and Diagrams for User Interface Design, Bucharest, Romania (2002)
Model-Driven Architecture for Web Applications
Mohamed Taleb, Ahmed Seffah, and Alain Abran
Abstract. A number of Web design problems continue to arise, such as: (1) decoupling the various aspects of Web applications (for example, business logic, the user interface, navigation and information architecture); and (2) isolating platform specifics from the concerns common to all Web applications. In the context of a proposal for a model-driven architecture for Web applications, this paper identifies an extensive list of models aimed at providing a pool of proven solutions to these problems. The models span several levels of abstraction, such as business, task, dialog, presentation and layout models. The proposed architecture shows how several individual models can be combined at different levels of abstraction into heterogeneous structures, which can be used as building blocks in the development of Web applications.
Keywords: Models, Model-Driven architecture, Software engineering, Web applications, MDA, architecture.
interactive, platform-independent and run on the client Web browser across a network. This paper is aimed at providing a pool of proven solutions to many recurring Web design problems. Examples of such problems include: (1) decoupling the various aspects of Web applications, such as business logic, the user interface, navigation and information architecture; and (2) isolating platform-specific issues from the concerns common to all Web applications. In this paper, the definition of a software architecture from [1] is adopted: “the structure of the subsystems and components of a software system and the relationships between them typically represented in different views to show the relevant functional and non functional properties.” This definition introduces the main architectural elements (for instance, subsystems, components and connectors) and covers the ways in which to represent them, including both functional and non-functional requirements, by means of a set of views. To address these problems, a pool of proven solutions is proposed here in the form of an architecture and the related models for a model-driven architecture for Web applications. These individual models can then be combined at different levels of abstraction into heterogeneous structures, which can be used as building blocks in the development of these applications.
This paper is organized as follows: section 2 introduces related work on model-oriented architectures in general, such as the Model-Driven Architecture; section 3 describes the model-oriented architecture proposed here and some of the models that we have identified and formalized; finally, section 4 presents a summary and directions for future work.
2 Related Work
The concept of a model plays a central role in most scientific disciplines; some even regard it as the "confluence of sciences". In computer science, models have always existed, but for a long time they were relegated to the background relative to source code, which still plays a dominant role in industry. This tendency is, however, being reversed: the model is moving from a “contemplative” status (interpreted by humans) to a “productive” status (interpreted by processors). Moreover, whereas models have so far been used in the design phase, and more recently in the development phase, they must from now on be embedded in the software itself to allow it to evolve dynamically. This systematic, dynamic and tool-supported use of models has led model-driven engineering to be regarded as a major paradigm of software engineering.
2.1 MDA Model
Models are commonly used to represent complex systems flexibly. They can be viewed at many levels of abstraction, and complementary model views can be combined to give a more intelligible, accurate view of a system than a single model alone. Meservy and Fenstermacher [2] claim that many software development experts have long advocated using models to understand the problem that a system seeks to address; yet development teams commonly employ models only in the early stages of modeling. Often, once construction begins, the teams leave the models behind and never update them to reflect their changing conceptions of the project. Most software developers would agree that modeling should play a role in every project [2]. However, there is no clear consensus on what that role should be, how developers should integrate modeling with other development activities, and who should participate in the modeling process [2].
In 2001, the Object Management Group introduced the Model-Driven Architecture (MDA) initiative as an approach to system specification and interoperability based on the use of formal models (i.e., definite and formalized models) [3, 4, 5, 6, 7]. The main idea of MDA is to specify the business logic in the form of abstract models. These models are then mapped (partly automatically) to different platforms according to a set of transformation rules. The models are usually described in UML in a formalized manner, so that they can serve as input for the tools that perform the transformation process. The main benefit of MDA is the clear separation of the fundamental logic behind a specification from the specifics of the particular middleware that implements it. In other words, the MDA approach distinguishes the specification of the operation of a system from the details of the way in which the system uses the capabilities of its platform. This architectural separation of concerns constitutes the foundation of MDA and serves three main goals: portability, interoperability and reusability [8, 9, 10].
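As a rough illustration of mapping one abstract model onto different platforms through transformation rules, consider the sketch below. It is not an OMG-defined API or an MDA tool: the `Entity` class, the `RULES` type-mapping tables and the two target platforms (SQL and TypeScript) are assumptions made purely for this example.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """A platform-independent description of a business entity."""
    name: str
    attributes: dict  # attribute name -> abstract type ("string", "integer")


# Hypothetical transformation rules: abstract type -> platform-specific type.
RULES = {
    "sql": {"string": "VARCHAR(255)", "integer": "INTEGER"},
    "typescript": {"string": "string", "integer": "number"},
}


def to_platform(entity, platform):
    """Apply the rules of one platform and return platform-specific text."""
    types = RULES[platform]
    if platform == "sql":
        cols = ", ".join(f"{n} {types[t]}" for n, t in entity.attributes.items())
        return f"CREATE TABLE {entity.name} ({cols});"
    fields = "; ".join(f"{n}: {types[t]}" for n, t in entity.attributes.items())
    return f"interface {entity.name} {{ {fields} }}"
```

The same `Entity` instance yields a table definition for one platform and an interface declaration for the other, which is the separation between business logic and platform specifics that the text describes; real MDA transformations operate on UML models and are considerably richer.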
The MDA approach comprises three main steps:
• Specifying the system independently of the platform that supports it
• Specifying target platforms
• Transforming the system specification into a specification for a particular platform
In short, MDA makes a sharp distinction between models of:
• The business (the Computation-Independent Model, or CIM), sometimes called a domain model,
• The business model in a specific technology context (the Platform-Independent Model, or PIM), and
• A model that is tied to the business and uses platform-specific code (the Platform-Specific Model, or PSM).
Two other steps can be integrated into the MDA development process:
• Capture requirements in a CIM. The Computation-Independent Model captures the domain without reference to a particular system implementation or technology. The CIM would remain the same even if the systems were implemented mechanically, rather than in computer software, for example.
• Deploy the system in a specific environment. Here, the question is how to deploy the system on several specific platforms and environments.
2.2 Compositional Structured Component Model
The CSCM model [11] is designed to allow the construction of software components with variable lists of functionalities, selected according to the components’ composition descriptor instances at runtime. The capability of a CSCM component to select the required functionalities tackles the issue of excessive unwanted functionalities. Furthermore, software maintenance, modification and reuse can be significantly eased and simplified. According to its authors, the power of CSCM components [11] can be exploited efficiently in the development of software application families, which are most likely to reuse coarse- to large-grained software components across families of applications with different functional configurations and capabilities.
3 The Proposed Architecture
3.1 Overview
To tackle some of the weaknesses identified in related work, we propose a set of concepts and a five-tier, model-driven architecture that serves as a generic classification schema for Web software architecture.
3.2 Models Taxonomy
A taxonomy of models is proposed next. Examples of models are also presented to illustrate the need to combine several types of models to provide solutions to complex problems at the five architectural levels. This list is not exhaustive: there is no doubt that more models are needed, and that others have yet to be discovered. A number of Web models have been suggested; for example, the OMG’s Model-Driven Architecture [3, 4, 5, 6, 7], Si Alhir’s Understanding the Model Driven Architecture (MDA) in Methods & Tools [8], Paternò’s Model-Based Design and Evaluation of Interactive Applications [9], Vanderdonckt’s Task Modelling in Multiple Contexts of Use [10], Msheik’s Compositional Structured Component Model: Handling Selective Functional Composition [11], and Puerta’s Modeling Tasks with Mechanisms [12]. In our work, we investigate how these existing collections of models can be used as building blocks within the context of the proposed five-layer architecture. The question we try to answer is which models, at which level, solve which problem. An informal survey conducted in 2004 by the HSCE Research Group at Concordia University identified at least five types of Web models that can be used to create a model-oriented Web software architecture. Some examples of proposed models are presented below.
3.3 Domain Model
The Domain Model is sometimes called a business model. This model encapsulates the important entities of an application domain together with their attributes, methods and relationships [13]. Within the scope of user interface development, it defines the objects and functionalities accessed by the user via the interface.
Such a model is generally developed using the information collected during the business and functional requirements stage. It defines the list of data and the features or operations to be performed in different ways, i.e., by different users on different platforms. The first model-based approaches used a domain model to drive the user interface at runtime. These domain models describe the application in general and include some specific information for the user interface. For example, the domain model [13] includes:
• a class hierarchy of objects which exist in the application,
• properties of the objects,
• actions which can be performed on the objects,
• units of information (parameters) required by the actions, and
• pre- and post-conditions for the actions.
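The five elements listed above can be made concrete with a small sketch. The class and function names (`Action`, `DomainClass`, `perform`) are hypothetical and are not taken from [13]; the code only shows one way in which a class hierarchy, properties, actions, parameters and pre- and post-conditions might fit together.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    name: str
    parameters: List[str]                  # units of information required
    precondition: Callable[[dict], bool]   # must hold before the action
    effect: Callable[[dict, dict], None]   # updates the object's properties
    postcondition: Callable[[dict], bool]  # must hold after the action


@dataclass
class DomainClass:
    name: str
    parent: Optional["DomainClass"] = None  # link in the class hierarchy
    actions: Dict[str, Action] = field(default_factory=dict)

    def find_action(self, name):
        """Look the action up in this class, then up the class hierarchy."""
        cls = self
        while cls is not None:
            if name in cls.actions:
                return cls.actions[name]
            cls = cls.parent
        raise KeyError(name)


def perform(cls, state, action_name, **args):
    """Run an action on an object's property dictionary `state`."""
    action = cls.find_action(action_name)
    missing = [p for p in action.parameters if p not in args]
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    if not action.precondition(state):
        raise RuntimeError("precondition failed")
    action.effect(state, args)
    if not action.postcondition(state):
        raise RuntimeError("postcondition failed")
    return state
```

A user interface generator could, under these assumptions, inspect `parameters` to decide which input fields an action needs and use `precondition` to enable or disable the corresponding control.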
In their basic form, domain models should represent the important entities together with their attributes, methods, and relationships. This kind of domain model corresponds to the object model of recent object-oriented software development methods. Consequently, the only real way to integrate user interface and system development is the simultaneous use of the data model. That is why recent model-based approaches include a domain model known from software engineering methods. Four other models are then derived from this model: the task, dialog, presentation and layout models.
3.4 Task Model
This model describes how activities can be performed to reach the user’s goals when interacting with an interactive system [9]. Using task models, designers can develop integrated descriptions of the system from a functional and interactive point of view. Task models are typically hierarchical decompositions of tasks and subtasks into atomic actions [10]. In addition, the relationships between tasks are described in terms of the execution order or the dependencies between peer tasks. The tasks may carry attributes such as importance, duration of execution and frequency of use. For our purposes, we reuse the following definition:
A task is a goal together with the ordered set of tasks and actions that would satisfy it in the appropriate context. [13]
This definition makes explicit the intertwined nature of tasks and goals. Actions are required to satisfy goals. Furthermore, the definition allows tasks to be decomposed into sub-tasks, with some ordering among the sub-tasks and actions. To complete this definition, we need to define goal, action, and artifact:
A goal is an intention to change or maintain the state of an artifact (based on [13]).
An action is any act that has the effect of changing or maintaining the state of an artifact (based on [13]).
An artifact is an object that is essential for a task. Without this object, the task cannot be performed; the state of this artifact is usually changed in the course of task performance. Artifacts are real things existing in the context of task performance, that is, in the business. Artifacts are modeled as objects and represented in the business model. This implies a close relationship between the task model and the business model.
With these definitions, we can derive the information that needs to be represented in a task model. According to [13], one task description includes:
• one goal,
• a non-empty set of actions or other tasks which are necessary to achieve the goal,
• a plan of how to select actions or tasks, and
• a model of an artifact, which is influenced by the task.
Consequently, the development of the task model and of the domain model is interrelated. One of the goals of model-based approaches is to support user-centered interface design. Therefore, they must enable the user interface designer to create the different task models. This model is itself derived from one other model: the domain model.
3.5 Dialog Model
This model provides dialog styles to achieve tasks and proven techniques for the dialog. The dialog model defines the navigational structure of the user interface. It is a more specific model and can be derived in good part from the more abstract task, user and business object models. A dialog model is used to describe the human-computer conversation. It specifies when the end user can invoke commands, functions and interaction media, when the end user can select or specify inputs, and when the computer can query the end user and present information [14]. In other words, the dialog model describes the sequencing of input tokens and output tokens and their interleaving. It describes the syntactic structure of human-computer interaction, in which the input and output tokens are the lexical elements. In particular, this model specifies the user commands, interaction techniques, interface responses and command sequences permitted by the interface during user sessions. This model is itself derived from two other models: the domain and task models.
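The idea that a dialog model specifies the command sequences permitted by the interface can be sketched as a transition table. The states, commands and the `run_dialog` function below are invented for illustration; the paper does not prescribe any particular notation for dialog models.

```python
# Hypothetical dialog model: state -> {user command -> next state}.
DIALOG = {
    "home": {"open_form": "form"},
    "form": {"submit": "confirmation", "cancel": "home"},
    "confirmation": {"done": "home"},
}


def run_dialog(commands, start="home", dialog=DIALOG):
    """Replay a command sequence; raise if a command is not permitted."""
    state = start
    for command in commands:
        allowed = dialog.get(state, {})
        if command not in allowed:
            raise ValueError(f"{command!r} is not permitted in state {state!r}")
        state = allowed[command]
    return state
```

In this reading, the keys of each inner dictionary are the input tokens the user may issue in a given state, and the table as a whole defines the navigational structure of the interface.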
3.6 Presentation Model
The Presentation Model describes the visual appearance of the user interface [13]. This model exists at two levels of abstraction: the abstract and the concrete presentation model. It defines the appearance and the form of presentation of the application on the Web page. This model provides solutions for how the contents or the related services can be visually organized into working surfaces, the effective layout of multiple information spaces and the relationship between them. These solutions define the physical and logical layout suitable for specific Web pages such as home pages, lists and tables. A presentation model describes the constructs that can appear on an end user’s display, their layout characteristics, and the visual dependencies among them. The displays of most applications consist of a static part and a dynamic part. The static part includes the presentation of the standard widgets such as buttons, menus, and list boxes. Typically, the static part remains fixed during the run-time of the interactive system, except for state changes such as enable/disable and visible/invisible. The dynamic part displays application-dependent data that typically change during run-time (e.g., the application generates output information; the end user constructs application-specific data). The abstract presentation model provides an abstract view of a generic interface, which represents the corresponding task and dialog models. This model is itself derived from three other models: the domain, task and dialog models.
3.7 Layout Model
The Layout Model is realized as a concrete instance of an interface. This model consists of a series of user interface components that define the visual layout of the user interface and the detailed dialogs for a specific platform and context of use. There may be many concrete instances of the layout model that can be derived from the presentation and dialog models. The Layout Model provides conceptual models and architectures for organizing the underlying content across multiple pages, servers, databases and computers. It is concerned with the look and feel of Web applications and with the construction of a general drawing area (e.g., a Canvas widget); all output inside a canvas must be programmed using a general-purpose programming language and a low-level graphical library. This model is itself derived from four other models: the domain, task, dialog and presentation models.
4 Summary and Future Work
In this paper, we have identified and proposed five categories of models, with examples, for a model-driven architecture for Web applications, in order to resolve many recurring Web design problems. Examples of such problems include: (1) decoupling the various aspects of Web applications, such as business logic, the user interface, navigation and information architecture; and (2) isolating platform-specific problems from the concerns common to all Web applications. Our discussion has focused on the way to specify a model-driven architecture using particular models such as Domain, Task, Dialog, Presentation and Layout. Future work will require the classification of each model and the illustration of each of them in UML class and sequence diagrams. Next, transformation rules and relationships between models will have to be defined so that the models can be combined into a new methodology for software development, based on the pattern categories that we defined and formalized in our previous work and on the models proposed and defined in the present paper.
References
1. Buschmann, F., Meunier, R., Rohnert, H., Sommerlad, P., Stal, M.: A System of Patterns: Pattern-Oriented Software Architecture, West Sussex, England. John Wiley & Sons, New York (1996)
2. Meservy, T.O., Fenstermacher, K.D.: Transforming Software Development: An MDA Road Map. IEEE Computer 38(8), 52–58 (2005)
3. An ORMSC White Paper, ormsc/05-04-01: A Proposal for an MDA Foundation Model, V00-02, OMG Group (2005) [Online] available at http://www.omg.org/docs/ormsc/05-04-01.pdf
4. Desmond Dsouza, Kinetium: Model-Driven Architecture and Integration Opportunities and Challenges, OMG Group (2001) [Online] available at ftp://ftp.omg.org/pub/docs/ab/01-03-02.pdf
5. Richard Soley and the OMG Staff Strategy Group: Model-Driven Architecture, OMG Group (2000) [Online] available at ftp://ftp.omg.org/pub/docs/omg/00-11-05.pdf
6. Dr. Jishnu Mukerji: Document number ormsc/2001-07-01, Architecture Board, ORMSC, Model Driven Architecture (MDA) – Technical Perspective, OMG Group (2001) [Online] available at http://www.omg.org/docs/omg/01-07-01.pdf
7. Miller, J., Mukerji, J.: MDA Guide Version 1.0.1, OMG doc.omg/2003-06-01 (2003) [Online] available at http://www.omg.org/docs/omg/03-06-01.pdf
8. Alhir, S.S.: Understanding the Model Driven Architecture (MDA), Methods & Tools, Vol. 11, No. 3, pp. 17–24 (2003) [Online] available at: http://www.methodsandtools.com/archive/archive.php?id=5 OR http://home.comcast.net/ salhir/UnderstandingTheMDA.PDF
9. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
10. Vanderdonckt, J.Q., Limbourg, Q., Souchon, N.: Task Modelling in Multiple Contexts of Use. In: Forbrig, P., Limbourg, Q., Urban, B., Vanderdonckt, J. (eds.) DSV-IS 2002. LNCS, vol. 2545, pp. 77–95. Springer, Heidelberg (2002)
11. Msheik, H., Abran, A., Lefebvre, E.: Compositional Structured Component Model: Handling Selective Functional Composition. In: IEEE 30th EUROMICRO Conference, pp. 74–81 (2004)
12. Puerta, A.R., Tu, S.W., Musen, M.A.: Modeling Tasks with Mechanisms. International Journal of Intelligent Systems, 8 (1993)
13. Schlungbaum, E.: Model-based User Interface Software Tools: Current State of Declarative Models. Technical Report 96-30, Graphics, Visualization and Usability Center, Georgia Institute of Technology, CADUI’96 workshop in Namur, Belgium (1996)
14. Puerta, A.R.: Model-Based Interface Development Environment. IEEE Software 14, 41–47 (1997)
HCI Design Patterns for PDA Running Space Structured Applications
Ricardo Tesoriero, Francisco Montero, María D. Lozano, and José A. Gallud
Laboratory of User Interaction and Software Engineering, Albacete Research Institute of Informatics, University of Castilla-La Mancha, 02071 Albacete, Spain
{ricardo, fmontero, mlozano, jgallud}@dsi.uclm.es
Abstract. Nowadays, mobile activities such as m-commerce, m-learning, etc., are being increasingly adopted. Information availability will be a key feature in future applications. Public spaces, such as shops, libraries, museums, etc., do not have enough information available to visitors, mainly due to physical space constraints. In this context, PDAs provide a balance between physical dimensions and processing power capable of supporting Augmented and Immersive Reality (A&IR) features. However, they have several limitations (e.g., screen space). Two usability evaluations of a PDA application currently running at the MCA (the Cutlery Museum of Albacete, Spain) identified some improvements. To reuse these solutions, this paper presents a collection of HCI design patterns for PDAs that run this kind of Space Structured Application (SSA).
Keywords: Information presentation, Interaction design, HCI standards, Graphical user interface, Architectures for interaction, Computer-augmented environment, Computer-mediated virtual spaces, Interaction techniques, platforms and metaphors.
To address this situation, design patterns [1] [2] [7] seem to be an appropriate tool to provide generic solutions that enable designers to solve these problems [8] [12] [14] [15] [16] [17]. This paper is organized as follows. First, based on usability reports on a concrete application [12], we define an environment composed of a set of applications (Space Structured Applications) in which these patterns are valid. We then define categories that group the patterns according to the problems they solve. For reasons of space, we describe only one pattern per category in detail. Finally, conclusions and future work are presented.
2 Space Structured Applications
The definition of Space Structured Application was a consequence of usability evaluations [12] performed on a production system, the MCA application [6]. Analyzing the usability evaluations, we detected some HCI problems when using a PDA running the MCA system, and designed solutions for them. We then noticed that these solutions were general enough to be reused in applications that share the characteristics of the MCA system. We therefore characterized these applications as Space Structured Applications (SSAs) or m-space applications. An SSA models physical spaces (buildings, floors, rooms, etc.) that contain extra information about the objects located in these places. Users can use this metaphor to browse information or to locate an object or space physically. The main aim of an SSA is to make available information about physical spaces that would otherwise be unavailable due to, for instance, physical space restrictions. Besides, objects that are not physically available may be made available virtually. An object virtually represented in an SSA may provide context to a physical object, and vice versa. This contextualized information leads to a richer user experience. Public spaces such as museums, libraries and shops may be modelled by SSAs. A key aspect of SSAs is that the physical space is part of the application, and the physical position of an object is used to address the extra information about it. Most SSAs model public spaces, so accessibility becomes a key issue in these applications. The following list summarizes key aspects that should be covered by SSAs:
1. User position and orientation are essential to keep virtual and physical spaces synchronized.
2. Most SSA users are visitors, so the effort needed to browse information should be minimized.
3. An SSA may be used to guide people through the physical space instead of just letting them browse it, thus taking an active interaction role too.
4. Accessibility is a key issue to address because an SSA may help disabled people to interact with their environment.
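The spatial metaphor described in this section (buildings, floors and rooms that contain objects) can be sketched as a simple tree. The `Space` class and the `locate` function are hypothetical names introduced for this illustration and do not come from the MCA application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Space:
    """A physical space (building, floor, room) and what it contains."""
    name: str
    objects: List[str] = field(default_factory=list)
    children: List["Space"] = field(default_factory=list)


def locate(space, target, path=()):
    """Return the chain of space names leading to `target`, or None."""
    here = path + (space.name,)
    if target in space.objects:
        return here
    for child in space.children:
        found = locate(child, target, here)
        if found is not None:
            return found
    return None
```

Under these assumptions, locating an object returns its position at every level of the hierarchy, which is the information a user needs both to browse the virtual space and to find the object physically.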
3 Related Work
Currently, there are many examples in which pattern languages have been successfully applied to HCI environments. One of the first pattern collections applied to HCI was Common Ground: A Pattern Language for Human-Computer Interface Design [13]. This collection presents loosely coupled patterns organized in two levels: primary patterns and sublanguages. As an evolution of this catalogue, Designing Interfaces: Patterns for Effective Interaction Design [14] added new patterns to its predecessor. Other collections and languages followed, for instance Martijn van Welie's Interaction Design Patterns [17], which organizes patterns according to interface types (Web Design Patterns, GUI Design Patterns and Mobile UI Design Patterns). Another interesting book related to HCI patterns is Jan Borchers's A Pattern Approach to Interaction Design [4] and, more recently, The Design of Sites [15], which presents a complete collection of patterns oriented to Web site applications. Finally, design patterns have been applied to mobile applications. For instance, [10] presents classes of patterns (as pattern languages) related to mobile interaction.
4 Proposal of Categories and Design Patterns
The pattern language proposed in this paper is not as general as Tidwell’s [13] [14], but not as concrete as Gamma’s [7]. It presents solutions for a specific environment without providing a concrete implementation. We based the pattern characterization on Roth’s [10]. However, we modified this proposal, which was based on [7], to use the following characteristics: name, synopsis, context, forces, solution, consequences, implementation, example and schematic description. The collection of HCI design patterns is organized into four categories; each category focuses on related problems. We defined the categories to make it easier to identify the best patterns to apply to a given problem. For reasons of space, we give a summary of each category and a brief description of each pattern, and present only one pattern of each category in detail. The whole catalog of design patterns is available in a Technical Report [12].
4.1 Orientation Patterns
This category introduces HCI patterns that help users to get oriented in a physical space. These patterns improve the synchronization between virtual and physical space in order to locate users within the space. They address the issues described in point 1 of the SSA characteristics. The patterns belonging to this category are the following:
1. You are here (aka Address): A user tries to identify a space somehow. Public spaces are usually identified by names, so these should be supplied to the user. This pattern is widely used on the Web.
2. Multi-Layer Map: Sometimes users need to know their physical position within a space. Physical spaces are structured as a hierarchy, and the user's position can be determined by his or her position at each level.
3. Signs: This pattern helps users to get oriented when they spend a long time in a space and get lost there. A sign is used to synchronize virtual and physical space.
4. Door at back: This pattern helps users to get oriented when a space transition occurs. A space transition happens when a user moves virtually and physically from one space to another, for instance from one room to another.
As an example of this category we present Door at Back.
Door at Back
1. Synopsis: Spaces are graphically represented by maps. Large buildings have several rooms. As all the rooms of a building do not fit on the screen at once, each one is represented by a different screen, producing space transitions when a user moves from one space to another.
2. Context: Users pass through different rooms while visiting buildings. When users move from one space to another, a transition occurs on the screen.
3. Forces: This interface transition leads to user disorientation between physical and virtual space.
4. Solution: Virtual space orientation is usually represented by a map. This map should be automatically oriented so that the door used by the user is at the bottom of the screen. The door should be clearly marked, for example with an arrow pointing in the same direction as the user, as seen in Fig. 1.
5. Consequences: The user gets oriented in the space by recalling at first sight the view of the room that he or she had when entering it for the first time.
6. Schematic Description:
Fig. 1. Sample of “Door at back” pattern
7. Related Patterns: It can be used jointly with Address. Map orientation may be combined with layout changes depending on map shape (Layout patterns). Based on the principles of the W3C Common Sense Suggestions for Developing Multimodal User Interfaces [18], this pattern focuses on:
• Satisfying real-world constraints, taking into account physical suggestions and environmental suggestions (physical space orientation is treated in this pattern).
• Communicating clearly, concisely, and consistently with users (an arrow represents the user's entrance direction).
• Making users comfortable by easing the user’s short-term memory (the arrow helps users to get “back to the basics”, that is, the moment he or she entered the room).
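The orientation rule in the Solution of Door at Back can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the stored map is north-up, that each door is recorded by the wall it sits on, and that rotation happens in 90-degree steps. The names `Wall`, `rotation_for_entry`, `rotate_cw` and `door_on_screen` are hypothetical.

```python
from enum import Enum


class Wall(Enum):
    """Wall holding the entry door, as a compass bearing on the stored north-up map."""
    NORTH = 0
    EAST = 90
    SOUTH = 180
    WEST = 270


# Midpoint of each wall in normalized screen coordinates
# (origin at the top-left corner, y grows downward).
WALL_MIDPOINT = {
    Wall.NORTH: (0.5, 0.0),
    Wall.EAST: (1.0, 0.5),
    Wall.SOUTH: (0.5, 1.0),
    Wall.WEST: (0.0, 0.5),
}


def rotation_for_entry(door_wall: Wall) -> int:
    """Clockwise rotation in degrees that brings the entry door to the bottom edge."""
    return (180 - door_wall.value) % 360


def rotate_cw(point, degrees):
    """Rotate a normalized screen point clockwise about the centre, in 90-degree steps."""
    x, y = point
    for _ in range((degrees // 90) % 4):
        x, y = 1.0 - y, x
    return (x, y)


def door_on_screen(door_wall: Wall):
    """Screen position of the entry door after the map has been re-oriented."""
    return rotate_cw(WALL_MIDPOINT[door_wall], rotation_for_entry(door_wall))
```

With this convention the door always ends up at the bottom centre of the screen, so an arrow drawn at the door and pointing up the screen points in the direction the user walked when entering.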
We can relate this pattern to Tidwell’s HCI patterns [13] [14]. The following sublanguages may apply to this pattern definition: Go Back One Step and Go Back to a Safe Place (the arrow may be used to go to a safe place to orient the user); Bookmark (the entrance is automatically marked as a safe place); and Remembered State (the state at the time the user entered the room). From the point of view of Van Welie et al. [16], we can relate this pattern to the feedback (the user gets oriented based on a previously known position) and visibility (user guidance) problems. So, it improves Learnability and Memorability.
4.2 Layout Patterns
Layout patterns were introduced to organize SSAs. Screen resolution on mobile devices is restricted, while the information to be displayed grows because of the virtual/physical space relationship and the extra information attached to objects. Point 2 of the SSA characteristics can be addressed using the following patterns in this category:
• Landscape: This pattern proposes using the PDA in landscape orientation.
• Vertical-Horizontal Layout: Modify the application layout according to the information to be displayed.
• Layout Transition: It shows the transition when the layout changes.
As an example of this category we present Vertical-Horizontal Layout.
Vertical-Horizontal Layout
1. Synopsis: Information to be displayed on a portable device's screen should be optimized because screen space is limited.
2. Context: Usually, there are two types of information to be displayed: main information (information that fulfils the screen's objective) and secondary information (additional information needed to perform other operations).
3. Forces: The shape and size of the main information vary. For instance, maps, photos and videos may be displayed in portrait or landscape.
4. Solution: To optimize screen visualization for the main information, the screen layout is changed to fit the main information as well as possible. Secondary information is displayed “around” the main information to keep it available.
5. Consequences: The main information is optimized to fit the screen and secondary information is displayed in the available space.
6. Schematic Description:
Fig. 2. Sample of “Vertical-Horizontal Layout” pattern
7. Related Patterns: This pattern is closely related to Landscape and Layout Transition. This pattern satisfies the following principles of the W3C Common Sense Suggestions for Developing Multimodal User Interfaces [18]:
• Communicate clearly, concisely, and consistently with users by switching presentation modes when information is not easily presented in the current mode. The screen layout adapts the interface to the main information. It keeps the interface as simple as possible, changing the control layout instead of the controls themselves.
• Make users comfortable by reducing the learning gap of a new user interface.
The related sublanguages of Tidwell’s HCI patterns [13] [14] are: Disabled Irrelevant Things (although secondary items are not disabled, they are not treated at the same level of relevance as main information) and Good Defaults (the default information layout changes according to the main information to be displayed). If we analyze this pattern from the perspective of [16], the problems it addresses are related to Conceptual Model and Natural Mapping (the user knows exactly how to perform operations if he or she had previous experience with the interface before the layout transformation). We try to cope with the Learnability and Memorability usability issues.
4.3 Guide Patterns
Design patterns in this category are used to model routes and paths that guide users through any physical space based on their preferences. So, point 3 and, to a lesser extent, point 1 of the SSA characteristics are addressed here. The patterns belonging to this category are the following:
• Free Will Navigation: This pattern provides a method to access spaces at any level through the application using cursor keys only.
• Routes: The Routes pattern provides routes to focus a visit on user preferences.
As an example of this category we present Free Will Navigation.
Free Will Navigation (aka Up-Down and Left-Right or No Guide)
1. Synopsis: Virtual space navigation is performed with cursor keys only.
2. Context: Usually, people using SSAs do not have both hands free (for example, they carry baggage). So, people should be able to hold and operate a device with one hand only.
3. Forces: As space navigation is one of the most important operations in this kind of application, it should be easy to perform with one hand and quick to learn.
4. Solution: To cope with this navigation problem we propose to control navigation with the cursor buttons as follows. The Left – Right keys navigate across space levels (inter-level). The Right button goes one level into the selected space (if a piece is selected in a showcase, pressing the Right button goes into the selected piece), while the Left button goes one level up (if a showcase is being shown, pressing the Left button goes to the room enclosing this showcase). The Up – Down buttons navigate across spaces on the same level (intra-level): they select a subspace within the current space by moving the selection among the labelled items. Labelling the actions that the cursor keys represent on screen provides action feedback to the user. See Fig. 3.
5. Consequences: The user is aware of the navigation destination when using the cursor keys. If the proposed control is accepted as a standard for SSAs, the learning gap will be minimized. A disadvantage of using labels is that they may obscure the map.
6. Schematic Description:
Fig. 3. Sample of “Free Will Navigation” pattern
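The key assignment proposed in the Solution of Free Will Navigation can be illustrated with a small sketch. It is a minimal model under assumptions, not the authors' implementation: spaces form a tree (for example building, room, showcase, piece), the selection wraps around at the ends of a level, and going one level up keeps the space just left selected. The class names `Space` and `Navigator` and the key strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Space:
    """A node in the space hierarchy (e.g. building > room > showcase > piece)."""
    name: str
    children: List["Space"] = field(default_factory=list)
    parent: Optional["Space"] = field(default=None, repr=False)

    def add(self, child: "Space") -> "Space":
        """Attach a subspace and return it."""
        child.parent = self
        self.children.append(child)
        return child


class Navigator:
    """Tracks the space being shown and which of its subspaces is selected."""

    def __init__(self, root: Space):
        self.current = root
        self.selected = 0  # index into current.children

    def press(self, key: str) -> None:
        kids = self.current.children
        if key == "UP" and kids:
            # Intra-level: move the selection to the previous labelled item.
            self.selected = (self.selected - 1) % len(kids)
        elif key == "DOWN" and kids:
            # Intra-level: move the selection to the next labelled item.
            self.selected = (self.selected + 1) % len(kids)
        elif key == "RIGHT" and kids:
            # Inter-level: go one level into the selected subspace.
            self.current = kids[self.selected]
            self.selected = 0
        elif key == "LEFT" and self.current.parent is not None:
            # Inter-level: go one level up, keeping the space we left selected.
            child = self.current
            self.current = child.parent
            self.selected = self.current.children.index(child)
```

Keys that have no meaning in the current state (Right on a leaf, Left at the root) are ignored, which keeps the four-key interface usable with one hand without extra confirmation steps.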
7. Related Patterns: The main relationship is with the Landscape layout pattern, because portable devices such as PDAs can be used with one hand only if they are in landscape position. It is also related to Right-Left Handed Users. Regarding the principles of the W3C Common Sense Suggestions for Developing Multimodal User Interfaces [18], this pattern satisfies the following:
• Satisfy real-world constraints by assigning the cursor keys to the most common operation in this kind of application. It also applies physical suggestions by using one hand only instead of both hands.
• Communicate clearly, concisely, and consistently with users by using the same keys throughout the navigation system regardless of space level (keeping the interface simple).
According to Tidwell’s sublanguages [13] [14], this pattern is related to: Optional Detail On Demand (the user accesses information according to space level); Short Description (information about navigation is displayed on screen); and Convenient Environment Actions (people usually go one level up or down only). Finally, from Van Welie’s [16] perspective, the problems it addresses are related to Visibility (user guidance: navigation can be used to guide users across a building); Affordance (it uses the space metaphor); and Feedback (operations are labelled). The usability issues we try to cope with are Learnability and Memorability. Note: We propose this pattern as a standard way of navigating across SSAs.
4.4 Accessibility Patterns
The Accessibility category groups patterns that can be applied to improve application access for disabled people. Patterns related to point 4 of the SSA characteristics are grouped here.
• Space Audio Perception: A voice tells the user which space he or she has selected.
• Right – Left Handed Users: It adapts an SSA designed using the Landscape pattern to be used by right- or left-handed people.
• Zoom: It provides controls to change the font size when users are reading documents.
As an example of this category we present Right-Left Handed Users.
Right-Left Handed Users
1. Synopsis: This pattern adapts the system to be used with the user's most skilled hand.
2. Context: Usually people are not equally skilled with both hands. So, if an application should be used with one hand only, it is logical that the hand used to perform operations be the skilled one.
3. Forces: Users may be right-handed or left-handed.
4. Solution: The solution rests on two measures: mirroring the screen horizontally and changing the cursor control behaviour (Up - Down) (Left - Right).
5. Schematic Description:
Fig. 4. Sample of “Left-Right handed users” pattern
6. Related Patterns: Regarding the principles of the W3C Common Sense Suggestions for Developing Multimodal User Interfaces [18], this pattern satisfies the following:
• Satisfy real-world constraints by using the easiest mode available on the device to perform each task.
• Communicate clearly, concisely, and consistently with users by making commands consistent; following the organizational suggestions keeps the interface simple.
Relating this pattern to Tidwell’s [13] [14], we found that it is related to Convenient Environment Actions (actions are adjusted to the user’s perspective). This pattern improves flexibility by providing explicit control. It also improves learnability and memorability.
5 Conclusions
As technology advances, new types of applications emerge, among them a set of applications known as SSAs or m-space applications. These applications have new characteristics and constraints that define them. Problems like user position and orientation, information browsing, visitor guidance and accessibility must be managed. To solve these problems, we proposed an HCI design pattern language.
Patterns have been grouped into four categories that solve related problems, providing a useful way to identify them. One of the most relevant contributions of this paper is the proposal of a standard control to navigate across virtual spaces using cursor keys only (the “Free Will Navigation” pattern). Finally, we think that SSAs are not mature enough and that more research is needed to achieve natural HCI within these environments. Thus, we propose some future work in this field. Our work in progress is currently focused on the evaluation of these patterns. Usability evaluation tests are being designed to measure usability before and after these patterns are applied. We are also considering performing these tests inside and outside an HCI lab and comparing the measurements. To improve HCI in SSAs, sensors could replace manual navigation with cursor keys. Many currently available technologies (RFID, barcodes, IrDA, etc.) could be used to provide location-aware applications. Finally, we think that the catalogue presented in this paper is not complete, so the addition of new patterns is being considered.
Acknowledgements. We would like to thank the Spanish CICYT project TIN200408000-C03-01 for funding this work, which was also supported by the grant PCC-05005-1 from JCCM.
References
1. Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., Fiksdahl-King, I., Shlomo, A.: A Pattern Language: Towns, Buildings, Construction. Oxford University, New York (1977)
2. Alexander, C.: The Timeless Way of Building. Oxford University Press, New York (1979)
3. Baber, C., Bristow, H., Cheng, S.L., Hedley, A., Kuriyama, Y., Lien, M., Pollard, J., Sorrell, P.: Augmenting Museums and Art Galleries. Human-Computer Interaction. In: INTERACT ’01. The International Federation for Information Processing, Tokyo, Japan, pp. 439–447 (2001)
4. Borchers, J.: A Pattern Approach to Interaction Design. John Wiley & Sons, New York (2001) ISBN-10: 0471498289. ISBN-13: 978-0471498285
5. Elliot, G., Phillips, N.: Mobile Commerce and Wireless Computing. Addison-Wesley, London, UK (2004) ISBN-10: 0201752409. ISBN-13: 9780201752403
6. Gallud, J.A., Penichet, V.M.R., Argandeña, L., González, P., García, J.A.: Digital Museums: a multi-technological approach. In: HCI-International Conference 2005, Lawrence Erlbaum Associates, Las Vegas, USA (2005)
7. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Abstraction and Reuse in Object-Oriented Designs. In: Nierstrasz, O. (ed.) ECOOP 1993. LNCS, vol. 707, Springer, Heidelberg (1993)
8. Gary, S., Simon, S.: A Service Management Framework for M-Commerce Applications. Mobile Networks and Applications 7, 199–212 (2002) ISSN: 1383-469X
9. Georgiev, T., Georgieva, E., Smrikarov, A.: M-Learning - a New Stage of E-Learning. In: International Conference on Computer Systems and Technologies - CompSysTech’2004 (2004)
10. Roth, J.: Patterns of Mobile Interaction. In: Roth, J. (ed.) Journal Personal and Ubiquitous Computing, vol. 6(4), Springer, London (September 2002) ISSN 1617-4909 (Print) 1617-4917
11. Tesoriero, R., Lozano, M.D., Gallud, J.A., Penichet, V.M.R.: Evaluation the user experience of PDA-based software applied in art museums. In: 3rd Conference on Web Information Systems and Technologies 2007 WebIST. Barcelona, Spain (2007)
12. Tesoriero, R., Lozano, M.D., Gallud, J.A., Montero, F.: Technical Report. HCI design patterns for SSA to PDA applications in art museums (2007) https://www.dsi.uclm.es/trep.php?&codtrep=DIAB-07-01-2
13. Tidwell, J.: Common Ground: A Pattern Language for Human-Computer Interface Design (1999) URL http://www.mit.edu/~jtidwell/common_ground_onefile.html
14. Tidwell, J.: Designing Interfaces: Patterns for Effective Interaction Design. O’Reilly (November 2005) ISBN-10: 0-596-00803-1. ISBN-13: 9780596008031. http://designinginterfaces.com/
15. Van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites. Addison-Wesley Professional (July 2002) ISBN-10: 020172149X. ISBN-13: 978-0201721492
16. Van Welie, M., Trætteberg, H.: Interaction Patterns in User Interfaces. In: Pattern Languages of Programs Conference (PLoP 2000), Allerton Park, Monticello, Illinois, USA (2000)
17. Van Welie, M.: Interaction Design Patterns (2007) http://www.welie.com/patterns/index.html
18. W3C Working Group: Common Sense Suggestions for Developing Multimodal User Interfaces. Note, September 11, 2006 (2006) http://www.w3.org/TR/mmi-suggestions/
Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments
Kristof Verpoorten, Kris Luyten, and Karin Coninx
Hasselt University, Expertise Centre for Digital Media and transnationale Universiteit Limburg, Wetenschapspark 2, 3590 Diepenbeek, Belgium
{kristof.verpoorten, kris.luyten, karin.conix}@uhasselt.be
Abstract. In this paper we introduce a monitoring system that supports the user in executing tasks in an ambient intelligence environment. In contrast with traditional environments, the goal of the user cannot always be defined beforehand, but is determined while the user interacts with the environment. The monitor observes the user's activities and learns to correlate a set of user actions with a goal. The system maps activities to a task model and reuses these models to take appropriate actions when similar user actions are observed later.
Keywords: task patterns, activity patterns, ambient intelligence environment, pro-active agent system.
the agent. Our system uses a graph-based representation of the environment, where each node represents a resource (device) in the environment or a property of a resource. The edges specify the relations between these resources or properties. A relation that is often used is the distance between resources. The graph contains all relevant data about the user's environment. At this moment, relevant data is limited to information about other devices in the neighbourhood of the user's device. The node representing the user's device also contains a link to information about the device's internal context (running programs, battery level, ...). The combination of this information with the environment information is enough for the agent to support the user in his or her tasks.
The system's goal is to be self-learning: by observing the actions of users and the associated goals, it creates patterns of interaction that can lead to such a goal. For the actual learning, decision trees and a sliding window are used [7]. The decision trees capture relevant information about the environment and how it affects the user’s goals. The sliding window compares a set of user actions to previous interaction patterns to determine the next action that needs to be supported and how it contributes to the task at hand. The task and goals provide a context for the pattern. By using decision trees and the sliding window simultaneously, we increase the likelihood that a correct action is selected and can anticipate user behaviour that was not encountered before.
A self-learning system often makes incorrect choices and invokes incorrect behaviour that hinders rather than supports the user. The main problem in an ambient environment is that there is no standard way to give feedback to the system and correct its behaviour. For this purpose we investigate the use of predefined design patterns in these environments [10] that allow the user to interact with the self-learning system. In this paper we consider patterns that go beyond the traditional Alexandrian definition, which covers architectural or design patterns [15]. Besides HCI-oriented task and feature patterns, the recognition of patterns in activity data plays an important role in our approach.
2 Related Work
To standardize the way the user can interact with a ubiquitous computing environment, Chung et al. have developed and evaluated several design patterns for ubiquitous computing [10]. They developed an initial pattern language for ubiquitous computing consisting of 45 pre-patterns describing application genres, physical-virtual spaces, interaction and systems techniques for managing privacy, and techniques for fluid interactions. Each of their pre-patterns consists of a name and a letter-number pair, the pattern's background, the problem that the pattern addresses, the solution(s) to the problem, and references to work related to the pattern. Their patterns tend to focus on high-level issues, such as user needs, rather than on specific user interfaces and interaction techniques. An evaluation showed that the pre-patterns helped both new and experienced designers unfamiliar with ubiquitous computing to avoid design problems. These patterns are of great use to us when developing our context-sensitive pro-active application.
Software applications are becoming more and more complex, and so are user interfaces. Therefore, Sinnig et al. argue that a disciplined form of reuse is needed for user interface development [17]. They explored the combination of components and patterns for this purpose. Patterns help in the reuse of well-known and proven design solutions, and play a significant role throughout the whole UI development process. Components embody reusable solutions at the implementation level. As a case study, they discussed the development of a web application for selling and managing IP phone service using the PCB (pattern- and component-based) process. This research shows that it is possible to build a UI using patterns.
Pattie Maes [11] suggests the idea of agents acting as personal assistants. These agents acquire their competence by learning from the user, as well as from agents assisting other users. A few prototypes were built using this technique, including agents that provide assistance with meeting scheduling, e-mail handling, news filtering and entertainment selection. The problem with this approach is that each agent monitors only one program (agenda, e-mail client, ...). We think more useful results can be obtained by monitoring the entire environment and the user's personal device.
Charles Isbell et al. [12] try to predict which task the user wants to execute with a remote control. They deploy a user interface that allows the user to execute a task using only one remote to control several devices. They describe tasks as clusters of similar commands, which are often used together. With the recorded data of 2 users over several weeks they collected enough interactions to divide all available commands into several clusters. Each cluster represents a task the user is performing, during which he or she uses the commands in the cluster. They predict the next task the user is likely to perform by looking at the previous tasks he or she has performed.
That way the user interface of the remote is adapted to support the probable next task, avoiding a cluttered UI with too many buttons. The way they predict the next task has some similarities with our sliding window approach (section 4.1). Another resemblance with our approach is that when our agent is not entirely sure about the next action to take, it will not execute an action on behalf of the user, but will adapt the UI on the user's device to make it easier for the user to execute the task himself if desired. However, in contrast to our work, they focus only on devices that can be controlled using a remote. Our system is designed to support the user with almost everything he or she can do using his or her PDA. Our system will also take over easy tasks from the user once it has had enough training, while the remote developed by Isbell et al. only adapts its UI to help the user in executing his or her task.
Byun and Cheverst propose using context history together with user modelling and machine learning techniques to create pro-active applications. In [13] they describe an experiment to examine the feasibility of their approach for supporting pro-active adaptations in the context of an intelligent office environment. The application uses two different approaches to obtain pro-activity: pro-active rule-based adaptation and pro-active modelling adaptation. With the first approach, users have to reconfigure the system as their preferences change. The second approach automatically adapts those predefined rules when observation makes clear that the user's preferences have changed. Decision trees are used to learn the patterns of the user's behaviour. Decision trees have the advantage of being much easier for designers to understand than other machine learning techniques such as neural networks. Byun's conclusion is that context history has a concrete role in supporting pro-active adaptation in ubiquitous computing environments. This work supports our decision to use context history as our main source of data when trying to predict the user's next action. Both our decision tree and sliding window techniques (section 4.1) use context history.
Obviously, a pro-active agent takes some control away from the user. Is the user willing to make this sacrifice? Barkhuus and Dey have examined this very question in [14]. Their conclusion is that the user is willing to accept a large degree of autonomy from applications as long as the application's usefulness is greater than the cost of limited control.
3 Environment Representation
Our system represents the environment of the user in a form that a computer program can process. The system monitors the environment continuously (through a set of sensors) and stores all relevant information in a graph-like structure. An example of such a graph is shown in figure 1.
Fig. 1. An environment graph representing the user's device and several other devices in the vicinity
Each node of the graph represents a resource (device) in the environment or a property of a resource, and the edges represent relations between the resources. The node representing the user's device is at the center of the graph. It has outgoing edges to all other devices in the vicinity that can be of any relevance to the user's context. It also has an outgoing edge to a node representing the device's internal context. The internal context node contains all relevant information about the device itself (running programs, battery level, ...). This representation of the environment is sufficient for our current needs, but it is still work in progress and will be extended when necessary (e.g. by using the CoDAMoS ontology [16]).
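The environment graph described in this section can be pictured with a small data structure. The following is a minimal sketch under assumptions, not the system's actual representation: the class `EnvironmentGraph`, its methods, the relation names `"distance"` and `"internal_context"`, and all device names and values are hypothetical and chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class EnvironmentGraph:
    """Nodes are resources or properties; edges are labelled relations between them."""
    nodes: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    edges: List[Tuple[str, str, str, Any]] = field(default_factory=list)

    def add_node(self, node_id: str, **properties: Any) -> None:
        """Register a resource (or property) node with optional attributes."""
        self.nodes[node_id] = dict(properties)

    def relate(self, source: str, relation: str, target: str, value: Any = None) -> None:
        """Add a directed edge: source -[relation: value]-> target."""
        self.edges.append((source, relation, target, value))

    def outgoing(self, source: str, relation: str) -> List[Tuple[str, Any]]:
        """Targets (with edge values) reachable from `source` over `relation`."""
        return [(t, v) for s, r, t, v in self.edges if s == source and r == relation]


def devices_within(graph: EnvironmentGraph, device: str, max_metres: float) -> List[str]:
    """Devices whose recorded distance from `device` is at most `max_metres`."""
    return [t for t, d in graph.outgoing(device, "distance") if d <= max_metres]


# Illustrative environment: the user's PDA sits at the centre of the graph.
env = EnvironmentGraph()
env.add_node("pda")
env.add_node("pda.internal", running_programs=["mail", "agenda"], battery_level=0.8)
env.add_node("printer")
env.add_node("projector")
env.relate("pda", "internal_context", "pda.internal")
env.relate("pda", "distance", "printer", 12.0)
env.relate("pda", "distance", "projector", 2.5)
```

Keeping the internal context as a separate node, rather than as attributes of the device node, follows the description above and lets an agent query device state and surroundings through the same edge lookup.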
4 Patterns in Ambient Intelligence Environments
4.1 Activity Patterns
At the most fundamental level, patterns are considered as recurring sequences of user (inter)actions inside the ambient intelligence environment. In our approach, on the syntactical level, the universe of interactions is an alphabet (from now on referred to as the interaction alphabet). Each letter in that alphabet is an atomic interaction. A word is a composition of letters of the interaction alphabet, and can be related to a sequence of activities that directly contributes to one of the tasks that can be executed in the ambient intelligence environment. The length of such a word, and thus of the sequence of actions, is limited by the maximum number of interactions required to complete a dialog (introduced in section 1). A word is always executed in a particular context. This means it is related to a (set of) task(s) from the task model and to a (set of) node(s) from the environment graph. The former indicates that the interactions are done to complete a task, the latter that these interactions make use of resources and have an execution context.
Based on the longstanding ideas of processor instruction prediction [5] and more recent work using Markov models [6], we think the dialog model that guides the user through the user interface can be composed dynamically. Three stages need to be tackled here:
1. feature extraction from the observations;
2. determining the next actions that can follow the observed set of actions;
3. predicting the likelihood of each of these actions.
We describe each step in more detail in the next paragraphs.
When the user is interacting with the environment, the agent continuously monitors his or her actions in the ambient environment. Each action is encoded as a letter from the interaction alphabet and added to a list containing all executed actions.
After a while, interesting patterns start to occur in that list. Several actions very often group together and form recognizable subsets (interaction words). The system can look for these subsets in the user’s actions by comparing “partial words” already executed by the user, and use them to determine the next action the user is likely to execute. With each action the user executes, the system checks whether the user’s activity is similar to previously recorded interaction patterns. It does this by creating a window containing the last x actions executed by the user (a partial word). The system uses several lengths between a maximum and a minimum length for this window. It first tries to find recurring patterns using the maximum window length. With a long window it is harder to find a match, but the likelihood of an action predicted from such a match is higher. When it does not find a match, it looks for matches with shorter windows, thus predicting actions with a lower likelihood. This goes on until the system finds a match or reaches the minimum window length. When a matching window is found (figure 2), the action that occurs directly after the match will probably be the next action the user is going to execute. The reasoning behind this is simple: when the x previous actions are part of a previously recorded word in the list, one can predict the next actions by completing the word. Currently we use a simple sliding window approach since we assume that the actions belonging to an interaction word are always executed sequentially. One can imagine more complex pattern matching algorithms if parallel actions in an interaction word are allowed: in that case the sequence becomes less important than the actual occurrence of an action in a word.
Fig. 2. Finding a matching pattern using the "sliding window" technique
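The sliding window search described above can be sketched as follows. This is a minimal illustration rather than the authors' code: actions are assumed to be single-character letters of the interaction alphabet, the function name `predict_next` and its parameters are hypothetical, and preferring the most recent earlier match is an assumption the text does not state.

```python
from typing import List, Optional, Tuple


def predict_next(history: List[str], min_len: int, max_len: int) -> Optional[Tuple[str, int]]:
    """Predict the next action from the list of executed actions.

    Tries the longest window first and shrinks it until an earlier occurrence
    of the window is found. Returns (predicted_action, matched_length), or
    None when no window between max_len and min_len matches.
    """
    longest = min(max_len, len(history) - 1)
    for length in range(longest, min_len - 1, -1):
        window = history[-length:]  # the partial word: the last `length` actions
        # Scan earlier positions, most recent first. The match must be followed
        # by at least one recorded action, so it cannot be the window itself.
        for start in range(len(history) - length - 1, -1, -1):
            if history[start:start + length] == window:
                return history[start + length], length
    return None
```

For the history `a b c d x a b c d y a b c` with window lengths between 2 and 4, no earlier occurrence of the four-letter window `y a b c` exists, so the search falls back to `a b c` and predicts `d` with a matched length of 3. The matched length can then serve as a rough indicator of how much to trust the prediction.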
A second machine learning method used by our system is decision trees. Instead of relating the user’s actions to previously executed actions, they are related to the user’s environment context at the time of execution. Each time the user executes an action, it is added to the decision tree together with the current context. The next time a similar context occurs, the agent will be able to predict the expected action. Both the sliding window and the decision tree techniques are used simultaneously when the agent tries to predict the user’s actions. By using both techniques together, we can estimate the prediction’s likelihood. When both techniques have the same outcome, that outcome is more likely than when the two algorithms disagree on the prediction. The agent has to be able to estimate the likelihood of the predicted pattern. It is important that this is done as accurately as possible, because wrong predictions will make the user lose trust in the system. Because we use both sliding windows and decision trees, there are two ways to calculate a prediction’s likelihood. First, when using the sliding window, the length of the partial word that matches a word in the list can be used to indicate the likelihood. The longer the word, the more likely it is that the predicted action will be executed. Secondly, the decision tree can also be used to calculate the likelihood. Each time the agent “sees” a situation it has already encountered before, the likelihood of the action executed in this situation is incremented.
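The way the two likelihood sources could be combined can be illustrated with a short sketch. Everything here is an assumption made for illustration: the text gives no formula, so the scoring rule in `combine` is hypothetical, and `ContextCounter` is a plain frequency table keyed by exact context, used as a simple stand-in for the decision tree rather than an implementation of one.

```python
from collections import Counter, defaultdict


class ContextCounter:
    """Counts actions per context; a simple stand-in for the decision tree."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def observe(self, context: frozenset, action: str) -> None:
        """Record that `action` was executed in `context` (count is incremented)."""
        self._counts[context][action] += 1

    def predict(self, context: frozenset):
        """Return (action, relative_frequency) for a known context, else None."""
        seen = self._counts.get(context)
        if not seen:
            return None
        action, count = seen.most_common(1)[0]
        return action, count / sum(seen.values())


def combine(window_pred, context_pred, max_len):
    """Merge both predictions into (action, score in [0, 1]).

    window_pred is (action, matched_length) or None; context_pred is
    (action, relative_frequency) or None. The scoring rule is an assumption.
    """
    if window_pred is None and context_pred is None:
        return None
    if window_pred is None:
        return context_pred
    w_action, w_score = window_pred[0], window_pred[1] / max_len
    if context_pred is None:
        return w_action, w_score
    c_action, c_score = context_pred
    if w_action == c_action:
        # Agreement: treat the two sources as independent evidence.
        return w_action, 1.0 - (1.0 - w_score) * (1.0 - c_score)
    # Disagreement: keep the stronger prediction, but halve its score.
    best = (w_action, w_score) if w_score >= c_score else (c_action, c_score)
    return best[0], best[1] / 2
```

The only properties taken from the text are that longer window matches and more frequently seen situations raise the likelihood, and that agreement between the two techniques makes a prediction more likely than disagreement. Any monotone rule with these properties would serve equally well.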
4.2 Task Patterns
The previous section discussed the first level of patterns we use in our approach: patterns that are unlike other HCI or design patterns and have more in common with data patterns as they occur in machine learning techniques. This section considers a more classic HCI pattern: the task pattern. A task pattern offers a solution to a particular problem (e.g. find item in list) for a given context. A task pattern is a recurring solution to a problem we encounter while doing task analysis. However, in the situation described above there is no predefined task model that describes the system; rather, there is a set of observed user activities and available resources. We discuss how to progress toward dynamically selecting task patterns based on the observed activity patterns. This type of task pattern is also referred to as a feature pattern: a pattern that describes the activities of the user while using a certain feature of a system and that is part of a bigger task model [8]. The activity pattern and related environment graph describe the context of the pattern. Based on previous work [9], the user actions included in the activity pattern are interactions with services accessible from the user environment. Interaction with these services is done using the interaction resources available in the user environment and is transparent for the end-user. The service-interaction description that is included in the services contains a feature pattern that can be combined with other feature patterns to obtain a structured task model for the predicted goal of the user. Figure 3 gives an overview of this approach. Each service has a feature pattern attached that serves as a building block for an overall task model when a combination of services is being used. Each action from the activity is related to a service that is offered in the environment.
A set of actions implies a set of services being accessed, and each service has its own specific interaction behaviour described by a task specification. Since a task specification can be considered equivalent to a dialog or behaviour model (depending on the notation used), the different dialog models of the services involved and of the services related to the predicted actions can be merged into one dialog model. Once this is completed, the user interface elements the user is likely to interact with next can be found by following the different possible paths in the dialog model. Figure 3 shows exactly this approach: notice that for the predicted actions there is also a dialog model, which is merged with the previous dialog model by an edge that allows the user to progress from using the presentations for services X and Y toward using the presentation of service Z. The edges that occur between dialog models belonging to different action words are automatically added and depend on how the task models (feature patterns) of the services are merged. For example, suppose the top node of the task model for service X is A and for service Y it is B. Since X and Y are accessed together, these top nodes will be merged as (A ||| B), while a merge with the top node C from service Z will result in (A ||| B) >> C, since service Z is only used after services X and Y are used.
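The merge rule from the example can be written down as a short sketch. The class and function names are hypothetical, and only the two operators used in the text are covered: `|||` for services that are accessed together and `>>` for services that are predicted to be used afterwards.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    """Top node of the feature pattern attached to one service."""
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Combined:
    """Two task expressions joined by a temporal operator."""
    operator: str
    left: object
    right: object

    def __str__(self):
        return f"({self.left} {self.operator} {self.right})"


def interleave(tops: List[object]):
    """Services accessed together: join their top nodes with |||."""
    merged = tops[0]
    for top in tops[1:]:
        merged = Combined("|||", merged, top)
    return merged


def merge_feature_patterns(current_tops, predicted_tops):
    """Currently used services enable (>>) the services predicted to follow."""
    current = interleave(current_tops)
    if not predicted_tops:
        return current
    return Combined(">>", current, interleave(predicted_tops))


a, b, c = Task("A"), Task("B"), Task("C")
print(merge_feature_patterns([a, b], [c]))  # ((A ||| B) >> C)
```

The printed expression corresponds to (A ||| B) >> C from the text, apart from the outer parentheses that this simple `__str__` always adds.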
Task-Based Prediction of Interaction Patterns for Ambient Intelligence Environments
Fig. 3. On the left the set of services accessible in the environment with their associated dialog models. On the right the action words with the user activities and their corresponding services. The combination of services leads to a combined dialog model guiding the user through the user interface.
5 Current Status and Future Work
Currently, both the sliding window and the decision trees are implemented. The system is not yet able to detect the environment context itself. At this moment, tests are done by simulating the environment with a software component. The sliding window and decision trees are trained on the simulated environment, just as they would be if the context were detected by the system itself. The agent will also suggest actions when it is able to make a reliable prediction and will try to help the user to execute the predicted action by adapting the user interface to the user’s needs. Future work is to replace the software component that simulates the environment by one that can actually detect the environment context using sensors available to the system. These are standard sensors present on any modern device (WiFi, Bluetooth, …). The usage of task and feature patterns and their corresponding dialog models is also still work in progress. In previous work we already developed some basic components to support this approach, such as service annotation [9] and task prediction [6].
6 Conclusions
In this paper we introduced a monitoring system to support the user in an ambient intelligence environment. The system is able to make sense of the environment context and learns the user’s expectations in certain contexts. Next time it encounters a similar environment, it will try to support the user by either executing the action on
her or his behalf, or by adapting the user interface to support the user’s interactions in the ambient environment. Patterns are used both for monitoring and predicting actions and for adapting the user interface.
Acknowledgments. Part of the research at EDM is funded by EFRO (European Fund for Regional Development), the Flemish Government and the Flemish Interdisciplinary institute for Broadband Technology (IBBT). Funding for this research was also provided by the Fund For Scientific Research Flanders (F.W.O. Vlaanderen), FWO project nr G.0461.05.
References
1. Baldonado, M., Chang, C.-C.K., Gravano, L., Paepcke, A.: The Stanford Digital Library Metadata Architecture. Int. J. Digit. Libr. 1, 108–121 (1997)
2. Bruce, K.B., Cardelli, L., Pierce, B.C.: Comparing Object Encodings. In: Ito, T., Abadi, M. (eds.) TACS 1997. LNCS, vol. 1281, pp. 415–438. Springer, Heidelberg (1997)
3. van Leeuwen, J. (ed.): Computer Science Today. LNCS, vol. 1000. Springer, Heidelberg (1995)
4. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs, 3rd edn. Springer, Heidelberg, New York (1996)
5. Patterson, D.A., Hennessy, J.L.: Computer Architecture: A Quantitative Approach, 2nd edn. Morgan Kaufmann Publishers (1996) ISBN 1-55860-329-8
6. Rigole, P., Clerckx, T., Berbers, Y., Coninx, K.: Task-Driven Automated Component Deployment for Ambient Intelligence Environments. Accepted for the Elsevier Journal on Pervasive and Mobile Computing, in press
7. Mitchell, T.M.: Machine Learning. McGraw-Hill Science (1997) ISBN 0-07042-807-7
8. Javahery, H., Seffah, A., Engelberg, D., Sinnig, D.: Multiple User Interfaces: Multiple-Devices, Cross-Platform and Context-Awareness, ch. 12 Migrating User Interfaces between Platforms Using HCI Patterns. Wiley (2003)
9. Vermeulen, J., Vandriessche, Y., Clerckx, T., Luyten, K., Coninx, K.: Service-interaction Descriptions: Augmenting Services with User Interface Models. In: Proc. of EHCI-HCSE-DSVIS’07 (March 2007)
10. Chung, E.S., Hong, J.I., Lin, J., Prabaker, M.K., Landay, J.A., Liu, A.L.: Development and evaluation of emerging design patterns for ubiquitous computing. In: Proceedings of the 2004 Conference on Designing Interactive Systems: Processes, Practices, Methods, and Techniques (Cambridge, MA, USA, August 01 - 04, 2004). DIS ’04, pp. 233–242. ACM Press, New York (2004)
11. Maes, P.: Agents that reduce work and information overload. Human-computer interaction: toward the year 2000. pp. 811–821 (1995)
12. Isbell Jr., C.L., Omojokun, O., Pierce, J.S.: From devices to tasks: automatic task prediction for personalized appliance control. Personal Ubiquitous Computing 8(3-4), 146–153 (2004)
13. Byun, H.E., Cheverst, K.: Utilising context history to support proactive adaptation. Applied Artificial Intelligence 18(6), 513–532 (2004)
14. Barkhuus, L., Dey, A.K.: Is context-aware computing taking control away from the user? Three levels of interactivity examined. In: Dey, A.K., Schmidt, A., McCarthy, J.F. (eds.) UbiComp 2003. LNCS, vol. 2864, pp. 149–156. Springer, Heidelberg (2003)
15. Alexander, C.: A Pattern Language: Towns, Buildings, Construction (Center for Environmental Structure Series) (1977) ISBN 0-19-501919-9
16. Preuveneers, D., Van den Bergh, J., Wagelaar, D., Georges, A., Rigole, P., Clerckx, T., Berbers, Y., Coninx, K., Jonckers, V., De Bosschere, K.: Towards an extensible context ontology for ambient intelligence. In: European Symposium on Ambient Intelligence, pp. 148–159 (November 2004)
17. Sinnig, D., Javahery, H., Forbrig, P., Seffah, A.: Patterns and Components for Enhancing Reusability and Systematic UI Development. In: Proceedings of HCI International, Las Vegas, USA (2005)
Patterns for Task- and Dialog-Modeling
Maik Wurdel1, Peter Forbrig1, T. Radhakrishnan2, and Daniel Sinnig2
1 Software Engineering Group, Department of Computer Science, University of Rostock, Albert-Einstein-Str. 21, 18051 Rostock, Germany {maik.wurdel, pforbrig}@informatik.uni-rostock.de
2 Department of Computer Science, Concordia University, 1455 De Maisonneuve Blvd. West, H3G 1M8, Montreal, Canada {krishnan, seffah, d_sinnig}@cs.concordia.ca
Abstract. The term Context of Use has been treated with much attention in HCI in recent years. In this paper, the integration of context information into task models will be described. The notion of context is formulated and used to annotate the task model. The reuse of such context-sensitive task models in light of task patterns is also examined.
Keywords: task modeling, context-sensitivity, task patterns, context of use.
1 Introduction
The development of UIs is complex and requires the integration of different disciplines. Model-based UI development has gained much attention from various researchers [5, 6, 8, 9], due to its ability to foster the integration of different viewpoints in the early stages of the development process. In general it tackles the problem of UI development by using different declarative models and the relationships between these models. The task model, as a description of the tasks and goals, is a commonly accepted starting point for model-based UI development processes. Other models that have to be taken into account describe the environmental circumstances of the execution of tasks. The Context of Use (CoU), as an abstraction of these circumstances, influences the tasks a user has to fulfill. Note that some tasks might not be useful or possible in a certain context. Adapting the UI to the context can improve human-computer interaction by providing a UI tailored to the specific CoU. In this paper, we will demonstrate how context models are integrated into a model-based UI development process, with particular emphasis on the task model. A definition of the term context and a formalization are given, which are later used to enrich task models. Finally we illustrate how context-sensitive task patterns can be used as building blocks for the creation of context-sensitive task models. The idea of reusing generic model fragments by means of patterns will be illustrated.
manage complexity by abstracting from low-level implementation details. In HCI, there are different kinds of UI models that can be taken into account to describe the various facets of the UI. Among those models, the task model has gained special attention, as it often forms the starting point from which the UI development should be initiated. Task models describe the tasks (and sub-tasks) a user has to execute to achieve a certain goal. A task is a hierarchical structure, which expresses the activities a user has to accomplish to fulfill this task. A goal is understood as a result a user wants to obtain after the task execution. Task modeling is a user-centered approach and thus task model designers concentrate on users and capture their activities. Considerations about how a user can reach a goal using a certain software system can foster usability. Even when task models are not used for UI generation, they help to capture usability requirements, since understanding the task world of the user can lead to a better UI design. Model-based UI development describes the process of (semi-)automated UI generation by using a set of declarative models, which cover all important aspects of the envisioned UI. Most model-based UI approaches specify the user, task, domain (application), platform, dialog, layout and/or presentation model [7, 8]. Model-based UI development can be seen as a series of model transformations, where abstract models (e.g. task, user, domain model) gradually evolve into more concrete models (e.g. dialog, layout, presentation model), which finally result in the implementation of the UI. Since the design of UI models is complex and error prone, tool support is needed to carry out model-based UI development efficiently. Tedious tasks in particular can be supported or automated. Furthermore, tool support is able to hide technical details of the technologies used, so that the design is made at a conceptual level.
A model-based UI development environment can consequently be understood as a software system that helps software designers to execute a certain model-based UI development process. Typical functions of such an environment are the design, validation, and animation of the model instances. Furthermore, the environment should provide decision-making assistance and involve the end-user in all stages of the development. Generating prototypes from the designed models to evaluate the decisions made helps to integrate the stakeholders’ needs in early stages of development. Design decisions can be reconsidered based on the feedback given.
Fig. 1. Model-based UI development process and its steps
Fig. 1 shows the general rationale of a model-based UI development process. It starts with an analysis of the goals and tasks, which results in an overall task model. This step will be further detailed in a subsequent section. Next, the resulting task model
has to be adapted to the current context by taking into consideration aspects of the user, the end-device and the environment. This refined task model is less complex, since tasks that are unnecessary for this context are already filtered out. Based on the information in the task model the dialog is constructed. It specifies groupings of tasks into dialog views and defines transitions between the various dialog views. At this stage of the process an abstract prototype can already be generated (more details on this generation process can be found in [4]), which exemplifies the general application flow of the later UI. This prototype is based on the designed dialog structure and the temporal relationships of the involved tasks. The creation of the dialog structure is followed by the definition of the presentation and layout models. The former associates interaction elements (e.g. buttons, text fields, labels) with the various tasks, whereas the latter describes the arrangement and the look & feel of these elements. Having described our model-based UI development methodology, we will now discuss the different types of task models that may be involved in the various development steps. In general we distinguish between three different types of task models:
1. Task model of the problem domain (analysis task model)
2. Context-sensitive task model of the envisioned software system
3. Task model of the software system for a particular CoU (context-insensitive)
The analysis of the domain of interest results in the analysis task model (1.). It reflects the knowledge of the domain: how a problem can be tackled in a general way, independent of a software system [4]. The analysis is performed in close cooperation with the domain expert. After considering which problems should be solved by the envisioned software system, a corresponding envisioned task model is designed (2.).
It is a boiled-down version of the previous task model and omits tasks that will either not be tackled by the envisioned software system or do not relate to the software system itself. This model has to reflect the behavior of the envisioned interactions between the user and software [4] and describes the task world for all platforms, users and environments. The context-insensitive task model, on the other hand, is the task model that has to be fulfilled by a certain user using a certain device in a particular environment (3.). It is a filtered version of the previous task model. During the transformation only the applicable tasks for a particular context are considered. Note that a context-sensitive task model describes a set of context-insensitive task models. Thus, a context-insensitive task model can be seen as an instance of the corresponding context-sensitive task model. The next section will clarify the term CoU and propose a model, which is used to annotate task models. Later on we will come back to the different types of task models to reflect context dependencies.
3 Context of Use
With the advent of mobile and ubiquitous computing the development of interactive systems has become increasingly complex. The interactive behavior of the system needs to be adapted to a wide range of people having different skills and
using different end-devices. In addition the usage of mobile devices is not bound to a predefined location of use and hence versatile environmental factors need to be taken into account as well. In this paper we summarize the entirety of influential factors under the term Context of Use (CoU). The context of use is any information that can be used to characterize the situation of the environment, the user and the device of a software system, which is regarded as relevant for the interaction of the system and the user.
Fig. 2. Decomposition of the CoU into sub-models
From our point of view a holistic approach has to cover the device, the user of the system and the environment of the system and the user. There is consensus that these models interrelate with each other [1]. According to our previous definition we propose the categorization illustrated in Fig. 2.
3.1 Meta-model
In this section we propose a generic meta-model for the CoU model. It consists of variables and expressions. More precisely, the CoU model is defined by a set of variables where each variable has a unique name and a domain of discrete values. Furthermore, an order is defined on the values of the domain. Based on these variables, expressions are defined by comparisons of variables and Boolean logic.
Brief Example
Based on the proposed meta-model of the CoU we will exemplify its application for the sub-model “Device”. As depicted below, it consists of a set of representative variables with pre-defined domains.
Variables:
CPU (low, medium, high)
Memory (<64Mb, 64Mb, 128Mb, 256Mb, 512Mb, >1024Mb)
Bandwidth (low, medium, high)
Input capabilities (pen, cell phone keyboard, keyboard)
Monitor resolution (<800x600, 800x600, 1024x768)
These variable definitions allow us to build expressions, which describe our possible devices. For example, we define (in a very simplified manner) the device PDA as follows:
PDA := CPU = low ∧ Memory < 256Mb ∧ Input capabilities = pen
This procedure can be used to define other devices, which in turn form a “smaller than” and “greater than” relationship (i.e. PDA < laptop < PC). In the modeling process of a certain sub-model the number and form of variables are not limited, since we defined context as any information that is regarded as important. Thus, the context is closely related to the domain of interest. A fact might be considered mandatory in one domain of interest, whereas in another it is dispensable. Consequently the model of the CoU is defined with respect to the requirements of the current project and the domain.
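To make the meta-model more tangible, the following sketch encodes variables with ordered discrete domains and the PDA expression from the example. It is only one possible encoding: the names `Variable`, `equals`, `less_than` and `all_of`, as well as the two sample contexts, are assumptions introduced for illustration and are not part of the approach described here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variable:
    """A CoU variable: unique name plus an ordered domain of discrete values."""
    name: str
    domain: Tuple[str, ...]  # ordered from smallest to largest value

    def rank(self, value):
        # Raises ValueError for values outside the domain.
        return self.domain.index(value)


CPU = Variable("CPU", ("low", "medium", "high"))
MEMORY = Variable("Memory", ("<64Mb", "64Mb", "128Mb", "256Mb", "512Mb", ">1024Mb"))
INPUT = Variable("Input capabilities", ("pen", "cell phone keyboard", "keyboard"))


# Expressions are functions from a context (variable name -> value) to a Boolean.
def equals(variable, value):
    return lambda ctx: variable.rank(ctx[variable.name]) == variable.rank(value)


def less_than(variable, value):
    return lambda ctx: variable.rank(ctx[variable.name]) < variable.rank(value)


def all_of(*expressions):
    return lambda ctx: all(expression(ctx) for expression in expressions)


# PDA := CPU = low AND Memory < 256Mb AND Input capabilities = pen
is_pda = all_of(equals(CPU, "low"), less_than(MEMORY, "256Mb"), equals(INPUT, "pen"))

handheld = {"CPU": "low", "Memory": "128Mb", "Input capabilities": "pen"}
desktop = {"CPU": "high", "Memory": "512Mb", "Input capabilities": "keyboard"}
print(is_pda(handheld))  # True
print(is_pda(desktop))   # False
```

Storing the domain as an ordered tuple is what makes comparisons such as Memory < 256Mb meaningful for values that are labels rather than numbers.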
4 A General Approach of Integrating Context-Sensitivity into Task Models
As already mentioned, the tasks a user has to accomplish to reach a goal are highly related to the context of execution. Some tasks may not be executable or may be useless in particular contexts (e.g. “downloading a file without internet connection”). Thus, task models should provide a mechanism to accommodate CoU concerns. Some attempts ([6, 9]) were made in the HCI community to introduce an envisioned task model, which describes the interaction of a software system with its users by taking into account context information. Such a model contains all possible tasks that can appear in a particular context of use. In [5] Luyten proposes a new node type to introduce a decision mechanism in order to derive a task model for a certain context. The decision mechanism is based on conditions, which are built with respect to the possible CoU. This approach, however, suffers from the lack of a sound definition of the CoU. Our approach bears resemblance to [5] and extends the common notation of CTT by introducing a new node type: the
Fig. 3. Example of the extended task model
decision node. The sub-nodes of the decision node are annotated with constraints, which denote when a particular node is used. Fig. 3 shows an example for such a task model. Constraints are built according to the defined expressions in the model of the CoU. The evaluation of the extended task model is performed by traversing the task tree (starting from the root node) in preorder. For each encountered decision node the following steps are performed:
1. Evaluation. Evaluation of every constraint of the sub-nodes.
2. Selection. Selection of the matching branch depending on the constraints.
3. Replacement. Replacement of the decision node by the selected branch.
Fig. 4. Evaluation of a decision node
Fig. 5. Context-sensitive task model in our tool
Since expressions and constraints are based on variables, both can only be evaluated when all of their contained variables have been assigned a concrete value. An example of the evaluation algorithm is given in Fig. 4. The process of evaluating a whole task tree is a top-down process and can be done automatically once all variables have valid values. The input is the generic task model and a model of a particular CoU, and the output is a task model adapted to the current CoU. We implemented a tool that accomplishes this process automatically. Please note that this procedure is a general approach and not limited to device or user constraints. It can be used to increase the expressiveness of task models in general. An example of a context-sensitive task model is given in Fig. 5. The figure shows a task tree in our tool, and the property view shows that the sub-task of the decision node has a constraint attached that expresses whether this task is executed in a particular CoU or not.
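The evaluation of decision nodes (evaluation, selection, replacement during a preorder traversal) can be sketched as follows. This is not the code of the tool mentioned above. The sketch assumes that constraints are plain functions over a dictionary of CoU variables, that the first matching branch is selected, and that a decision node without a matching branch is dropped; the registration task tree is a hypothetical example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Context = Dict[str, str]


@dataclass
class TaskNode:
    name: str
    children: List["TaskNode"] = field(default_factory=list)
    # Constraint of a branch below a decision node; None means "always applies".
    constraint: Optional[Callable[[Context], bool]] = None
    is_decision: bool = False


def reify(node: TaskNode, ctx: Context) -> Optional[TaskNode]:
    """Return a copy of the tree in which decision nodes are replaced by branches."""
    if node.is_decision:
        for branch in node.children:
            # Steps 1 and 2: evaluate the constraints and select a matching branch.
            if branch.constraint is None or branch.constraint(ctx):
                # Step 3: the selected branch takes the place of the decision node.
                return reify(branch, ctx)
        return None  # assumption: no applicable branch removes the sub-tree
    children = [reify(child, ctx) for child in node.children]
    return TaskNode(node.name, [child for child in children if child is not None])


def names(node: TaskNode) -> List[str]:
    """Task names in preorder."""
    return [node.name] + [n for child in node.children for n in names(child)]


registration = TaskNode("Register", [
    TaskNode("Enter personal data", is_decision=True, children=[
        TaskNode("Enter mandatory data",
                 constraint=lambda c: c["Input capabilities"] == "pen"),
        TaskNode("Enter mandatory and optional data",
                 constraint=lambda c: c["Input capabilities"] != "pen"),
    ]),
    TaskNode("Validate data"),
    TaskNode("Receive login information"),
])

print(names(reify(registration, {"Input capabilities": "pen"})))
# ['Register', 'Enter mandatory data', 'Validate data', 'Receive login information']
```

Because `reify` builds a new tree, the context-sensitive model stays untouched and can be evaluated again for another CoU.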
5 Accelerating the Design Through the Use of Task Patterns
The design of formal models is a tedious and error-prone process. Reuse of already designed solutions is becoming more and more crucial. One way to reuse model artifacts is through the use of patterns. Alexander describes a pattern as “a three-part rule, which expresses a relation between a certain context, a problem and a solution” [2]. A pattern provides the core solution of a recurrent problem and provides guidance on its usage and its application. The application of a pattern to a model follows a general process independent of the kind of pattern used. It starts with the identification of a sub-model to which a pattern should be applied. Next an appropriate pattern is selected, which is followed by an adaptation and instantiation of the pattern. Finally the pattern is integrated into the original model [10]. A task pattern defines a reusable and generic task structure that encapsulates a well-defined functionality for a recurrent design problem in task modeling. In order to provide a high degree of reuse, patterns have to support at least three concepts: Encapsulation, Composition and Adaptation. Task patterns have to encapsulate a particular functionality. Furthermore, to foster reuse, patterns should be composable. Adaptation of a pattern during the application process is necessary to allow the use of a pattern in various situations. Hence, a wider spectrum of application is offered. In this paper we extend the idea of “traditional” task patterns [10] by capturing context dependencies. As already described, the modeling of the CoU is domain specific and thus has to be performed with respect to the requirements of the current project. Hence, the model of the CoU differs from project to
project. This fact demands a high degree of adaptability of context-sensitive task patterns, since the constraints used have to be adaptable as well to match the requirements of the current project. This adaptation is performed during the pattern application process. Therefore, patterns contain constraints which are, at first, rather abstract and will be adapted to concrete constraints during the instantiation phase. For instance, a pattern should not contain domain-specific roles, since the reuse of such a pattern is bound to this domain. Instead, abstract roles should be used, which can be mapped to domain-specific roles (during the pattern application) with respect to the CoU model. This concept ensures a higher degree of reuse and customization. We developed a pattern language consisting of five context-sensitive patterns and twelve context-insensitive patterns [10, 12]. An example of a pattern, the Context Sensitive Registration Pattern, is given below. For further information on the description of patterns (meta-pattern) or on other patterns, the reader is referred to [12].
Context Sensitive Registration Pattern (adapted from [11])
Problem: Delivering goods or performing business transactions usually requires a large amount of personal data, which has to be entered by the client.
When to Apply: When users repeatedly use the software system to order goods or perform transactions, the ability to store their personal data helps to satisfy the customer. Furthermore, personalized offers can be provided based on the stored data.
Solution: Offer a registration process and store the personal data of the customer in a profile to avoid the repeated input of personal data for each purchase process. The registration process contains a set of mandatory and a set of optional data, which are entered by the client. After validating the entered data, the data is stored and the login information is provided to the customer for future login.
Rationale: The user enters the mandatory and optional data, which is followed by a validation to avoid input errors and the submission of the data. Afterwards login information is provided to the client.
Context of Use: A registration process can be accomplished under different CoU. If the input capabilities of the device are limited, only the mandatory data of the user are entered, since entering values takes a while using devices like PDAs or mobiles. Otherwise the normal process of entering the mandatory and optional data is performed.
Structure (Task Structure): (figure)
Related Patterns: Multi-Value Input Form, Login, Booking
In order to facilitate the instantiation and application of patterns we have developed a tool called C-PIM (Context and Patterns in Modeling) that supports all stages in the general process of pattern-based design. In the next section we will illustrate the tool briefly.
5.1 C-PIM: A Tool Support for the Application of Task Patterns and the Derivation of Context-Sensitive Task Models
The C-PIM tool provides access to a library of patterns, helps the designer in the pattern selection process and assists in applying patterns to the task model. More
precisely, the pattern adaptation phase, resulting in a pattern instance, is assisted by an interactive wizard. The adaptation includes the specification of occurrences of tasks, the assignment of values to domain variables and the adaptation of constraints with regard to the CoU model.
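The adaptation of abstract constraints to concrete ones can be pictured with the following sketch. It does not reproduce C-PIM or its wizard. The classes, the placeholder name `RICH_INPUT` and the simplified registration pattern are hypothetical and only show the idea of binding abstract constraints to expressions of a project-specific CoU model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]
Constraint = Callable[[Context], bool]


@dataclass
class PatternTask:
    name: str
    abstract_constraint: str = ""  # placeholder name; empty means "always applies"


@dataclass
class TaskPattern:
    name: str
    tasks: List[PatternTask]

    def instantiate(self, bindings: Dict[str, Constraint]) -> List[Tuple[str, Constraint]]:
        """Bind every abstract constraint to a concrete CoU expression."""
        needed = {t.abstract_constraint for t in self.tasks if t.abstract_constraint}
        missing = needed - set(bindings)
        if missing:
            raise ValueError(f"unbound constraints: {sorted(missing)}")
        always = lambda ctx: True
        return [(t.name, bindings.get(t.abstract_constraint, always)) for t in self.tasks]


registration = TaskPattern("Context Sensitive Registration", [
    PatternTask("Enter mandatory data"),
    PatternTask("Enter optional data", abstract_constraint="RICH_INPUT"),
    PatternTask("Validate data"),
])

# Project-specific binding of the abstract constraint to the CoU model.
instance = registration.instantiate(
    {"RICH_INPUT": lambda ctx: ctx["Input capabilities"] == "keyboard"})

pda = {"Input capabilities": "pen"}
print([name for name, applies in instance if applies(pda)])
# ['Enter mandatory data', 'Validate data']
```

Keeping the pattern free of concrete CoU variables and supplying them only at instantiation time is what allows the same pattern to be reused across projects with different CoU models.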
Fig. 6. Task model and available patterns in C-PIM Eclipse plugin
Since context-sensitive task models are more complex than regular ones, a reification process resulting in “ordinary” task models should be provided to foster understanding. Our tool provides this functionality by analyzing the defined contexts entailed in the CoU model. Afterwards the reification wizard offers the selection of the envisioned CoU. Based on the entered values the decision nodes are evaluated and a task model for this context is generated. This task model can be used for further development. The tool is implemented in Java as an Eclipse plugin and thus is integrated in our model-based development environment. By using the Eclipse platform a seamless integration of various model types is possible.
6 Conclusion
In this paper we illustrated how the CoU model can be integrated into an overall model-based UI development process. We provided a formal definition of the CoU, based on constraints and a set of context conditions. The integration of context conditions into model-based UI development was demonstrated for the task model. In particular we extended the task model by introducing a new node type: the decision node [5]. We also illustrated how context-sensitive task patterns can be used as building blocks for the construction of context-sensitive task models. We developed a pattern library which includes five context-sensitive and twelve ordinary task patterns and their relationships [10, 12]. The application of patterns to task models is supported by our C-PIM tool, which supports the major steps of pattern application (selection, adaptation, instantiation and integration). Furthermore, the generation of context-insensitive task models from context-sensitive ones is automated. There are several aspects which are promising for future examination. One such future avenue is the integration of CoU information into the dialog model (whose
creation immediately follows the task model according to model-based UI development methodology). Another future avenue deals with the further investigation of dialog patterns and their relationship to task patterns, since the application of a task pattern can exclude or recommend the application of certain dialog patterns.
References
1. Abi-Aad, R. et al.: CoU: Context of Use Model for User Interface Design. In: HCI International, 2003. Greece (2003)
2. Alexander, C., Ishikawa, S., Silverstein, M.: A Pattern Language: Towns, Buildings, Construction, xliv, p. 1171. Oxford University Press, New York (1977)
3. Booch, G., Jacobson, I., Rumbaugh, J.: The Unified Modeling Language User Guide, 2nd edn. Addison-Wesley, Upper Saddle River, NJ (2005)
4. Forbrig, P., et al.: User-Centered Design and Abstract Prototypes. In: SHAKER (ed.) Proceedings of BIR2003: Berlin. pp. 132–145 (2003)
5. Luyten, K.: Dynamic User Interfaces Generation for Mobile and Embedded Systems with Model-Based User Interface Development, Universiteit Maastricht: Maastricht (2004)
6. Mori, G., Paternò, F., Santoro, C.: CTTE: Support for Developing and Analyzing Task Models for Interactive System Design. IEEE Trans. Softw. Eng. 28(8), 797–813 (2002)
7. Puerta, A.R.: Model-Based Interface Development Environment. IEEE Softw. 14(4), 40–47 (1997)
8. Schlungbaum, E., Elwert, T.: Automatic User Interface Generation from Declarative Models. In: CADUI 1996 Presses Universitaires de Namur (1996)
9. Seffah, A., Forbrig, P.: Multiple User Interfaces: Towards a Task-Driven and Patterns-Oriented Design Model. In: Proceedings of the 9th International Workshop on Interactive Systems. Design, Specification, and Verification, pp. 118–132. Springer, Heidelberg (2002)
10. Sinnig, D., Forbrig, P., Seffah, A.: Patterns in Model-Based Development. In: Position Paper in INTERACT 03 Workshop entitled: Software and Usability Cross-Pollination: The Role of Usability Patterns (2003)
11. Welie, M.: Patterns in Interaction Design, Accessed January 2007 http://www.welie.com/
12. Wurdel, M.: Tool Support of Patterns for Task Models. In: Department of Computer Science, University of Rostock: Rostock (2006)
Author Index
Abascal, Julio 594 Abran, Alain 1198 Adikari, Sisira 373 Aikman, Shelley 1021 Aït-Ameur, Yamine 1062 Akatsu, Hiroko 3 Akselsen, Sigmund 746 Almirall-Hill, Magí 446 Altankhuyag, P. 614 Amarsaikhan, D. 614 Anastassova, Margarita 383 Ando, Masaya 393 Araki, Sachiyo 49 Asano, Yoko 675, 765, 921, 966 Atwood, Michael E. 11, 40 Baek, Seungyup 559 Baranauskas, M. Cecília C. 1033 Bartneck, Christoph 20 Berkman, Ali Emre 397 Bevan, Nigel 407 Bi-Hui, Chen 1053 Bock, Gee-Woo 681 Bourguin, Grégory 1129 Bråthe, Lars 802 Buchholz, Gregor 1043 Burkhardt, Jean-Marie 383 Buur, Jacob 30 Calcaterra, Gina 959 Cao, Shan Shan 289 Carriço, Luís 428 Çetin, Görkem 774 Chalon, René 1082 Chambers, Vanessa 939 Chen, Fang 1011 Chen, Yan 420 Chen, Yunan 40 Cheong, Cheolho 862 Chevalier, Aline 691 Chi, Yu-Liang 701 Chiou, Wen Ko 156, 490, 827 Chung, Donghun 711 Coninx, Karin 1216
Cortier, Alexandre 1062 Cox, David 1072
d’Ausbourg, Bruno 1062 da Silva, Sérgio Roberto 1188 Daimoto, Hiroshi 49 Darzentas, J. 569 David, Bertrand 1082 deBuys, Brahm Daniel 711 Delotte, Olivier 1082 Dhakhwa, Sagun 721 Dittmar, Anke 1092 Dommès, Aurélie 691 Dong, Jianming 796 Duarte, Carlos 428 Düchting, Markus 58 Dwyer, John P. 939 Ektare, Mayuresh 731 Engel, Jürgen 1043 Eschenbrenner, Brenda 736 Evjemo, Bente 746 Ewert, Bernd 796 Ezzedine, Houcine 632 Fajardo, Inmaculada 594 Fang, Cunhao 1102 Ferre, Xavier 68 Forbrig, Peter 1109, 1226 Forsman, Catherine 78 Freeman, Jonathan 262 Frøkjær, Erik 642 Fu, Xiaolan 340 Fuchs, Georg 1109 Fujimura, Kaori 756, 921 Fujino, Yuichi 756 Fujioka, Ryosuke 438 Furuta, Kazuo 298 Gaffar, Ashraf 1092 Gallud, José A. 1206 Gao, Qin 675, 765 Garcia, Fredrick P. 939 Garreta-Domingo, Muriel 446 Georgakis, Panagiotis 350
Georgiakakis, Petros 453
Ghimire, Ganesh Bahadur 721
Giersich, Martin 1109
Go, Kentaro 88, 140
Göktürk, Mehmet 774
Gómez-Carnero, Susana 463
Gong, Dun-wei 472
Greenwood, Kristyn 133
Grötzbach, Lennart 360
Guimarães, Nuno 428
Gundelsweiler, Fredrik 174
Guo, Yi-nan 472
Guo, Yinni 784
Gyobu, Ikuko 223
Hall, Patrick A.V. 721
Han, Alice 796
Han, Tack-Don 862
Heesom, David 350
Heldal, Ilona 802
Henke, Katja 812
Heo, Jeongyun 482
Hermann, Fabian 812
Hinum, Klaus 604
Hirasawa, Naoki 140
Hirose, Yoko 108
Hodgetts, Helen M. 818
Hofer, Ron 98
Hong, Ji 127
Horner, John 11
Hosono, Naotsune 3
Hu, Yung Hsing 827
Huang, Ding-Long 835
Huang, Ding Hau 156, 490
Huang, Lixian 420
Huang, Yuan Tsing 827
Hußmann, Heinrich 1168
Hwang, Ha Sung 844
Hwang, Wonil 499
Ito, Kei 223
Itoh, Yasuhisa
Itoh, Yoshihiro
Jin, Ling 921, 966
Jokela, Timo 536
Jones, Dylan M. 818
Jounila, Ilari 527
Kang, Neung Eun 854
Kantola, Niina 536
Karjalainen, Sami 544
Karvonen, Kristiina 549
Kasai, Hideaki 140
Kawai, Yuki 438
Khong, Chee Weng 273
Kim, Andrew 559
Kim, Doyoon 862
Kim, In Ki 559
Kim, Jin Hyun 929
Kirlik, Alex 872
Kirste, Thomas 1109
Knapp, Barbara 882
Ko, Sang Min 517
Koempel, Jeremy 1139
Koichiro, Kubo 756
Kolski, Christophe 632
Komatsu, Hidehiro 223
Koutsabasis, P. 569
Kowalski, Luke 133
Kraft, Jerome 939
Krömker, Heidi 1119
Kuan, Huei-Huang 681
Kunert, Tibor 1119
Kurosu, Masaaki 49, 108, 140, 393, 579
Laporte, Philippe 632
Laumann, Susanne 985
Lee, Haeinn 891
Lepreux, Sophie 1129
Lessiter, Jane 262
Lewandowski, Arnaud 1129
Li, Dan 1001
Li, Lulu 420
Liang, Jun 146
Liang, You Zhao 156, 490, 827
Lim, Seung Chan 1139
Lindgaard, Gitte 164
Lindqvist, Janne 549
Ling, Chen 901
Liu, Ping 681
López, Juan Miguel 594
Lopez, Miguel 901
López-Jaquero, Víctor 1149
Lowe, Sandi 1139
Lozano, María D. 1206
Luo, Qi 420
Luyten, Kris 1216
Lynch, Neil 373
Lyons, Michael J. 20
Mahlke, Sascha 164
Manandhar, Prakash 721
Märtin, Christian 1043, 1159
Martins, Daniel 691
Masserey, Guillaume 1082
McDonald, Craig 373
Medinilla, Nelson 68
Mégard, Christine 383
Memmel, Thomas 174
Miki, Hiroyuki 3
Miksch, Silvia 604
Ming-Hsu, Wang 1053
Mizuno, Masamitsu 49
Mochizuki, Takayoshi 756
Montero, Francisco 1149, 1206
Mor, Enric 446
Morandini, Marcelo 1188
Morgan, Maggie 184
Morimoto, Kazunari 652
Mun, Jae Seung 517
Murata, Setsuko 756
Mynatt, Elizabeth D. 959
Nah, Fiona Fui-Hoon 736
Nam, Chang S. 711, 949
Naumann, Anja 812
Nebe, Karsten 58, 194
Neris, Vania Paula de Almeida 1033
Newell, Alan 184
Ngo, Thuan K. 939
Nguyen-Ngoc, Anh Vu 204
Niedermann, Iris 812
Nieters, James 214
Ning, Liu 587
Ogura, Kenji 756
Ohmann, Susanne 604
Okada, Hidehiko 438
Okamoto, Makoto 223
Papadimitriou, George 453
Park, Sanhyun 482
Park, SungBok 844
Peissner, Matthias 812
Pekkola, Samuli 976
Peng, Shu-Yun 701
Pleuß, Andreas 1168
Pohl, Margit 604
Popow, Christian 604
Propp, Stefan 1043
Psaromiligkos, Yannis 453
Qin, Xiangang 289
Radhakrishnan, T. 1226
Rau, Pei-Luen Patrick 662, 675, 765, 835, 921, 966
Reichart, Daniel 1109
Reiterer, Harald 174
Rester, Markus 604
Retalis, Symeon 453
Rintel, Sean 911
Roberts, David 802
Rodeiro Iglesias, Javier 463
Roski, Alexander 1159
Rouillard, José 632
Ruiz, Natalie 232
Saeed, Akbar 242
Saito, Harumi 675
Salminen, Tapio 252
Salvendy, Gavriel 499, 624, 662, 784
Sandhu, Jaspal S. 614
Sato, Hitomi 675, 756, 765, 921, 966
Savoy, A. 624
Schaefer, Robbie 1178
Schumann, Heidrun 1109
Schürmann, Anders 746
Schwerz, André Luis 1188
Seffah, Ahmed 1198
Seifert, Uwe 929
Shanghong, Xiao 283
Shimokura, Kenichiro 756
Shin, Seungchul 862
Sinnig, Daniel 1226
Söffner, Jan 949
Song, Chiwon 482
Spinhof, Lonneke 407
Spyrou, T. 569
Stienstra, Marcelle 30
Strybel, Thomas Z. 939
Su, Hui 835
Sun, Hua 681
Taib, Ronnie 232
Takahashi, Hideaki 108
Taleb, Mohamed 1198
Tan, Chuan-Hoo 991
Tanimoto, Ryo 438
Tarby, Jean-Claude 632
Teo, Hock-Hai 991
Tesoriero, Ricardo 1206
Thapa, Ishwor 721
Tian, Pengwei 1102
Ting, Shang 587
Tingling, Peter 242
Tran, Chi Dung 632
Tran, Quan T. 959
Tseng, Winnie 796
Tsuboi, Toshiaki 756
Tu, Nan 835
Uldall-Espersen, Tobias
Urano, Naoki 652