Lecture Notes in Computer Science
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board

David Hutchison, Lancaster University, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Switzerland
John C. Mitchell, Stanford University, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Germany
Madhu Sudan, Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany
LNCS 6765
Constantine Stephanidis (Ed.)
Universal Access in Human-Computer Interaction
Design for All and eInclusion
6th International Conference, UAHCI 2011
Held as Part of HCI International 2011
Orlando, FL, USA, July 9-14, 2011
Proceedings, Part I
Volume Editor

Constantine Stephanidis
Foundation for Research and Technology - Hellas (FORTH)
Institute of Computer Science
N. Plastira 100, Vassilika Vouton, 70013, Heraklion, Crete, Greece
and
University of Crete, Department of Computer Science, Crete, Greece
E-mail: [email protected]
ISSN 0302-9743 e-ISSN 1611-3349
ISBN 978-3-642-21671-8 e-ISBN 978-3-642-21672-5
DOI 10.1007/978-3-642-21672-5
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2011928824
CR Subject Classification (1998): H.5, K.6, H.3-4, C.2, D.2, J.1, J.3
LNCS Sublibrary: SL 3 – Information Systems and Applications, incl. Internet/Web and HCI
Foreword

The 14th International Conference on Human–Computer Interaction, HCI International 2011, was held in Orlando, Florida, USA, July 9–14, 2011, jointly with the Symposium on Human Interface (Japan) 2011, the 9th International Conference on Engineering Psychology and Cognitive Ergonomics, the 6th International Conference on Universal Access in Human–Computer Interaction, the 4th International Conference on Virtual and Mixed Reality, the 4th International Conference on Internationalization, Design and Global Development, the 4th International Conference on Online Communities and Social Computing, the 6th International Conference on Augmented Cognition, the Third International Conference on Digital Human Modeling, the Second International Conference on Human-Centered Design, and the First International Conference on Design, User Experience, and Usability. A total of 4,039 individuals from academia, research institutes, industry and governmental agencies from 67 countries submitted contributions, and 1,318 papers that were judged to be of high scientific quality were included in the program. These papers address the latest research and development efforts and highlight the human aspects of design and use of computing systems. The papers accepted for presentation thoroughly cover the entire field of human–computer interaction, addressing major advances in knowledge and effective use of computers in a variety of application areas. This volume, edited by Constantine Stephanidis, contains papers in the thematic area of universal access in human-computer interaction (UAHCI), addressing the following major topics:

• Design for all methods and tools
• Web accessibility: approaches, methods and tools
• Multimodality, adaptation and personalisation
• eInclusion policy, good practice, legislation and security issues
The remaining volumes of the HCI International 2011 Proceedings are:
• Volume 1, LNCS 6761, Human–Computer Interaction—Design and Development Approaches (Part I), edited by Julie A. Jacko
• Volume 2, LNCS 6762, Human–Computer Interaction—Interaction Techniques and Environments (Part II), edited by Julie A. Jacko
• Volume 3, LNCS 6763, Human–Computer Interaction—Towards Mobile and Intelligent Interaction Environments (Part III), edited by Julie A. Jacko
• Volume 4, LNCS 6764, Human–Computer Interaction—Users and Applications (Part IV), edited by Julie A. Jacko
• Volume 6, LNCS 6766, Universal Access in Human–Computer Interaction—Users Diversity (Part II), edited by Constantine Stephanidis
• Volume 7, LNCS 6767, Universal Access in Human–Computer Interaction—Context Diversity (Part III), edited by Constantine Stephanidis
• Volume 8, LNCS 6768, Universal Access in Human–Computer Interaction—Applications and Services (Part IV), edited by Constantine Stephanidis
• Volume 9, LNCS 6769, Design, User Experience, and Usability—Theory, Methods, Tools and Practice (Part I), edited by Aaron Marcus
• Volume 10, LNCS 6770, Design, User Experience, and Usability—Understanding the User Experience (Part II), edited by Aaron Marcus
• Volume 11, LNCS 6771, Human Interface and the Management of Information—Design and Interaction (Part I), edited by Michael J. Smith and Gavriel Salvendy
• Volume 12, LNCS 6772, Human Interface and the Management of Information—Interacting with Information (Part II), edited by Gavriel Salvendy and Michael J. Smith
• Volume 13, LNCS 6773, Virtual and Mixed Reality—New Trends (Part I), edited by Randall Shumaker
• Volume 14, LNCS 6774, Virtual and Mixed Reality—Systems and Applications (Part II), edited by Randall Shumaker
• Volume 15, LNCS 6775, Internationalization, Design and Global Development, edited by P.L. Patrick Rau
• Volume 16, LNCS 6776, Human-Centered Design, edited by Masaaki Kurosu
• Volume 17, LNCS 6777, Digital Human Modeling, edited by Vincent G. Duffy
• Volume 18, LNCS 6778, Online Communities and Social Computing, edited by A. Ant Ozok and Panayiotis Zaphiris
• Volume 19, LNCS 6779, Ergonomics and Health Aspects of Work with Computers, edited by Michelle M. Robertson
• Volume 20, LNAI 6780, Foundations of Augmented Cognition: Directing the Future of Adaptive Systems, edited by Dylan D. Schmorrow and Cali M. Fidopiastis
• Volume 21, LNAI 6781, Engineering Psychology and Cognitive Ergonomics, edited by Don Harris
• Volume 22, CCIS 173, HCI International 2011 Posters Proceedings (Part I), edited by Constantine Stephanidis
• Volume 23, CCIS 174, HCI International 2011 Posters Proceedings (Part II), edited by Constantine Stephanidis

I would like to thank the Program Chairs and the members of the Program Boards of all Thematic Areas, listed herein, for their contribution to the highest scientific quality and the overall success of the HCI International 2011 Conference. In addition to the members of the Program Boards, I also wish to thank the following volunteer external reviewers: Roman Vilimek from Germany, Ramalingam Ponnusamy from India, Si Jung “Jun” Kim from the USA, and Ilia Adami, Iosif Klironomos, Vassilis Kouroumalis, George Margetis, and Stavroula Ntoa from Greece.
This conference would not have been possible without the continuous support and advice of the Conference Scientific Advisor, Gavriel Salvendy, as well as the dedicated work and outstanding efforts of the Communications and Exhibition Chair and Editor of HCI International News, Abbas Moallem. I would also like to thank the members of the Human–Computer Interaction Laboratory of ICS-FORTH, and in particular Margherita Antona, George Paparoulis, Maria Pitsoulaki, Stavroula Ntoa, Maria Bouhli and George Kapnas, for their contribution toward the organization of the HCI International 2011 Conference.

July 2011
Constantine Stephanidis
Organization
Ergonomics and Health Aspects of Work with Computers

Program Chair: Michelle M. Robertson

Arne Aarås, Norway Pascale Carayon, USA Jason Devereux, UK Wolfgang Friesdorf, Germany Martin Helander, Singapore Ed Israelski, USA Ben-Tzion Karsh, USA Waldemar Karwowski, USA Peter Kern, Germany Danuta Koradecka, Poland Nancy Larson, USA Kari Lindström, Finland
Brenda Lobb, New Zealand Holger Luczak, Germany William S. Marras, USA Aura C. Matias, Philippines Matthias Rötting, Germany Michelle L. Rogers, USA Dominique L. Scapin, France Lawrence M. Schleifer, USA Michael J. Smith, USA Naomi Swanson, USA Peter Vink, The Netherlands John Wilson, UK
Human Interface and the Management of Information

Program Chair: Michael J. Smith

Hans-Jörg Bullinger, Germany Alan Chan, Hong Kong Shin’ichi Fukuzumi, Japan Jon R. Gunderson, USA Michitaka Hirose, Japan Jhilmil Jain, USA Yasufumi Kume, Japan Mark Lehto, USA Hirohiko Mori, Japan Fiona Fui-Hoon Nah, USA Shogo Nishida, Japan Robert Proctor, USA
Youngho Rhee, Korea Anxo Cereijo Roibás, UK Katsunori Shimohara, Japan Dieter Spath, Germany Tsutomu Tabe, Japan Alvaro D. Taveira, USA Kim-Phuong L. Vu, USA Tomio Watanabe, Japan Sakae Yamamoto, Japan Hidekazu Yoshikawa, Japan Li Zheng, P. R. China
Human–Computer Interaction

Program Chair: Julie A. Jacko

Sebastiano Bagnara, Italy Sherry Y. Chen, UK Marvin J. Dainoff, USA Jianming Dong, USA John Eklund, Australia Xiaowen Fang, USA Ayse Gurses, USA Vicki L. Hanson, UK Sheue-Ling Hwang, Taiwan Wonil Hwang, Korea Yong Gu Ji, Korea Steven A. Landry, USA
Gitte Lindgaard, Canada Chen Ling, USA Yan Liu, USA Chang S. Nam, USA Celestine A. Ntuen, USA Philippe Palanque, France P.L. Patrick Rau, P.R. China Ling Rothrock, USA Guangfeng Song, USA Steffen Staab, Germany Wan Chul Yoon, Korea Wenli Zhu, P.R. China
Engineering Psychology and Cognitive Ergonomics

Program Chair: Don Harris

Guy A. Boy, USA Pietro Carlo Cacciabue, Italy John Huddlestone, UK Kenji Itoh, Japan Hung-Sying Jing, Taiwan Wen-Chin Li, Taiwan James T. Luxhøj, USA Nicolas Marmaras, Greece Sundaram Narayanan, USA Mark A. Neerincx, The Netherlands
Jan M. Noyes, UK Kjell Ohlsson, Sweden Axel Schulte, Germany Sarah C. Sharples, UK Neville A. Stanton, UK Xianghong Sun, P.R. China Andrew Thatcher, South Africa Matthew J.W. Thomas, Australia Mark Young, UK Rolf Zon, The Netherlands
Universal Access in Human–Computer Interaction

Program Chair: Constantine Stephanidis

Julio Abascal, Spain Ray Adams, UK Elisabeth André, Germany Margherita Antona, Greece Chieko Asakawa, Japan Christian Bühler, Germany Jerzy Charytonowicz, Poland Pier Luigi Emiliani, Italy
Michael Fairhurst, UK Dimitris Grammenos, Greece Andreas Holzinger, Austria Simeon Keates, Denmark Georgios Kouroupetroglou, Greece Sri Kurniawan, USA Patrick M. Langdon, UK Seongil Lee, Korea
Zhengjie Liu, P.R. China Klaus Miesenberger, Austria Helen Petrie, UK Michael Pieper, Germany Anthony Savidis, Greece Andrew Sears, USA Christian Stary, Austria
Hirotada Ueda, Japan Jean Vanderdonckt, Belgium Gregg C. Vanderheiden, USA Gerhard Weber, Germany Harald Weber, Germany Panayiotis Zaphiris, Cyprus
Virtual and Mixed Reality

Program Chair: Randall Shumaker

Pat Banerjee, USA Mark Billinghurst, New Zealand Charles E. Hughes, USA Simon Julier, UK David Kaber, USA Hirokazu Kato, Japan Robert S. Kennedy, USA Young J. Kim, Korea Ben Lawson, USA Gordon McK Mair, UK
David Pratt, UK Albert “Skip” Rizzo, USA Lawrence Rosenblum, USA Jose San Martin, Spain Dieter Schmalstieg, Austria Dylan Schmorrow, USA Kay Stanney, USA Janet Weisenford, USA Mark Wiederhold, USA
Internationalization, Design and Global Development

Program Chair: P.L. Patrick Rau

Michael L. Best, USA Alan Chan, Hong Kong Lin-Lin Chen, Taiwan Andy M. Dearden, UK Susan M. Dray, USA Henry Been-Lirn Duh, Singapore Vanessa Evers, The Netherlands Paul Fu, USA Emilie Gould, USA Sung H. Han, Korea Veikko Ikonen, Finland Toshikazu Kato, Japan Esin Kiris, USA Apala Lahiri Chavan, India
James R. Lewis, USA James J.W. Lin, USA Rungtai Lin, Taiwan Zhengjie Liu, P.R. China Aaron Marcus, USA Allen E. Milewski, USA Katsuhiko Ogawa, Japan Oguzhan Ozcan, Turkey Girish Prabhu, India Kerstin Röse, Germany Supriya Singh, Australia Alvin W. Yeo, Malaysia Hsiu-Ping Yueh, Taiwan
Online Communities and Social Computing

Program Chairs: A. Ant Ozok, Panayiotis Zaphiris

Chadia N. Abras, USA Chee Siang Ang, UK Peter Day, UK Fiorella De Cindio, Italy Heidi Feng, USA Anita Komlodi, USA Piet A.M. Kommers, The Netherlands Andrew Laghos, Cyprus Stefanie Lindstaedt, Austria Gabriele Meiselwitz, USA Hideyuki Nakanishi, Japan
Anthony F. Norcio, USA Ulrike Pfeil, UK Elaine M. Raybourn, USA Douglas Schuler, USA Gilson Schwartz, Brazil Laura Slaughter, Norway Sergei Stafeev, Russia Asimina Vasalou, UK June Wei, USA Haibin Zhu, Canada
Augmented Cognition

Program Chairs: Dylan D. Schmorrow, Cali M. Fidopiastis

Monique Beaudoin, USA Chris Berka, USA Joseph Cohn, USA Martha E. Crosby, USA Julie Drexler, USA Ivy Estabrooke, USA Chris Forsythe, USA Wai Tat Fu, USA Marc Grootjen, The Netherlands Jefferson Grubb, USA Santosh Mathan, USA
Rob Matthews, Australia Dennis McBride, USA Eric Muth, USA Mark A. Neerincx, The Netherlands Denise Nicholson, USA Banu Onaral, USA Kay Stanney, USA Roy Stripling, USA Rob Taylor, UK Karl van Orden, USA
Digital Human Modeling

Program Chair: Vincent G. Duffy

Karim Abdel-Malek, USA Giuseppe Andreoni, Italy Thomas J. Armstrong, USA Norman I. Badler, USA Fethi Calisir, Turkey Daniel Carruth, USA Keith Case, UK Julie Charland, Canada
Yaobin Chen, USA Kathryn Cormican, Ireland Daniel A. DeLaurentis, USA Yingzi Du, USA Okan Ersoy, USA Enda Fallon, Ireland Yan Fu, P.R. China Afzal Godil, USA
Ravindra Goonetilleke, Hong Kong Anand Gramopadhye, USA Lars Hanson, Sweden Pheng Ann Heng, Hong Kong Bo Hoege, Germany Hongwei Hsiao, USA Tianzi Jiang, P.R. China Nan Kong, USA Steven A. Landry, USA Kang Li, USA Zhizhong Li, P.R. China Tim Marler, USA
Ahmet F. Ozok, Turkey Srinivas Peeta, USA Sudhakar Rajulu, USA Matthias Rötting, Germany Matthew Reed, USA Johan Stahre, Sweden Mao-Jiun Wang, Taiwan Xuguang Wang, France Jingzhou (James) Yang, USA Gulcin Yucel, Turkey Tingshao Zhu, P.R. China
Human-Centered Design

Program Chair: Masaaki Kurosu

Julio Abascal, Spain Simone Barbosa, Brazil Tomas Berns, Sweden Nigel Bevan, UK Torkil Clemmensen, Denmark Susan M. Dray, USA Vanessa Evers, The Netherlands Xiaolan Fu, P.R. China Yasuhiro Horibe, Japan Jason Huang, P.R. China Minna Isomursu, Finland Timo Jokela, Finland Mitsuhiko Karashima, Japan Tadashi Kobayashi, Japan Seongil Lee, Korea Kee Yong Lim, Singapore
Zhengjie Liu, P.R. China Loïc Martínez-Normand, Spain Monique Noirhomme-Fraiture, Belgium Philippe Palanque, France Annelise Mark Pejtersen, Denmark Kerstin Röse, Germany Dominique L. Scapin, France Haruhiko Urokohara, Japan Gerrit C. van der Veer, The Netherlands Janet Wesson, South Africa Toshiki Yamaoka, Japan Kazuhiko Yamazaki, Japan Silvia Zimmermann, Switzerland
Design, User Experience, and Usability

Program Chair: Aaron Marcus

Ronald Baecker, Canada Barbara Ballard, USA Konrad Baumann, Austria Arne Berger, Germany Randolph Bias, USA Jamie Blustein, Canada
Ana Boa-Ventura, USA Lorenzo Cantoni, Switzerland Sameer Chavan, Korea Wei Ding, USA Maximilian Eibl, Germany Zelda Harrison, USA
Rüdiger Heimgärtner, Germany Brigitte Herrmann, Germany Sabine Kabel-Eckes, USA Kaleem Khan, Canada Jonathan Kies, USA Jon Kolko, USA Helga Letowt-Vorbek, South Africa James Lin, USA Frazer McKimm, Ireland Michael Renner, Switzerland
Christine Ronnewinkel, Germany Elizabeth Rosenzweig, USA Paul Sherman, USA Ben Shneiderman, USA Christian Sturm, Germany Brian Sullivan, USA Jaakko Villa, Finland Michele Visciola, Italy Susan Weinschenk, USA
HCI International 2013
The 15th International Conference on Human–Computer Interaction, HCI International 2013, will be held jointly with the affiliated conferences in the summer of 2013. It will cover a broad spectrum of themes related to human–computer interaction (HCI), including theoretical issues, methods, tools, processes and case studies in HCI design, as well as novel interaction techniques, interfaces and applications. The proceedings will be published by Springer. More information about the topics, as well as the venue and dates of the conference, will be announced through the HCI International Conference series website: http://www.hci-international.org/

General Chair
Professor Constantine Stephanidis
University of Crete and ICS-FORTH
Heraklion, Crete, Greece
Email: [email protected]
Table of Contents – Part I
Part I: Design for All Methods and Tools

Visual Mediation Mechanisms for Collaborative Design and Development . . . Carmelo Ardito, Barbara Rita Barricelli, Paolo Buono, Maria Francesca Costabile, Antonio Piccinno, Stefano Valtolina, and Li Zhu
Design for the Information Society . . . Agata Bonenberg
Classifying Interaction Methods to Support Intuitive Interaction Devices for Creating User-Centered-Systems . . . Dirk Burkhardt, Matthias Breyer, Christian Glaser, Kawa Nazemi, and Arjan Kuijper
Evaluation of Video Game Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Joyram Chakraborty and Phillip L. Bligh
Emergent Design: Bringing the Learner Close to the Experience . . . . . . . Joseph Defazio and Kevin Rand
Eliciting Interaction Requirements for Adaptive Multimodal TV Based Applications . . . Carlos Duarte, José Coelho, Pedro Feiteira, David Costa, and Daniel Costa
Use-State Analysis to Find Domains to Be Re-designed . . . . . . . . . . . . . . . Masami Maekawa and Toshiki Yamaoka
An Approach Towards Considering Users’ Understanding in Product Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Anna Mieczakowski, Patrick Langdon, and P. John Clarkson
Evaluation of Expert Systems: The Application of a Reference Model to the Usability Parameter . . . Paula Miranda, Pedro Isaias, and Manuel Crisóstomo
Investigating the Relationships between User Capabilities and Product Demands for Older and Disabled Users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Umesh Persad, Patrick Langdon, and P. John Clarkson
Practical Aspects of Running Experiments with Human Participants . . . Frank E. Ritter, Jong W. Kim, Jonathan H. Morgan, and Richard A. Carlson
A Genesis of Thinking in the Evolution of Ancient Philosophy and Modern Software Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Stephan H. Sneed
Understanding the Role of Communication and Hands-On Experience in Work Process Design for All . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christian Stary
Digitizing Interaction: The Application of Parameter-Oriented Design Methodology to the Teaching/Learning of Interaction Design . . . Shu-Wen Tzeng
A Study on an Usability Measurement Based on the Mental Model . . . . . Yuki Yamada, Keisuke Ishihara, and Toshiki Yamaoka
Part II: Web Accessibility: Approaches, Methods and Tools

Enabling Accessibility Characteristics in the Web Services Domain . . . Dimitris Giakoumis, Dimitrios Tzovaras, and George Hassapis
Intelligent Working Environments for the Ambient Classroom . . . . . . . . . Maria Korozi, Stavroula Ntoa, Margherita Antona, and Constantine Stephanidis
Adaptive Interfaces: A Little Learning Is a Dangerous Thing. . . . . . . . . . . Kyle Montague, Vicki L. Hanson, and Andy Cobley
Part I: User Models, Personas and Virtual Humans

Standardizing User Models . . . Pradipta Biswas and Patrick Langdon
Integral Model of the Area of Reaches and Forces of a Disabled Person with Dysfunction of Lower Limbs as a Tool in Virtual Assessment of Manipulation Possibilities in Selected Work Environments . . . Bogdan Branowski, Piotr Pohl, Michal Rychlik, and Marek Zablocki
Modeling the Role of Empathic Design Engaged Personas: An Emotional Design Approach . . . Robert C.C. Chen, Wen Cing-Yan Nivala, and Chien-Bang Chen
Accessible UI Design and Multimodal Interaction through Hybrid TV Platforms: Towards a Virtual-User Centered Design Framework . . . Pascal Hamisu, Gregor Heinrich, Christoph Jung, Volker Hahn, Carlos Duarte, Pat Langdon, and Pradipta Biswas
Modelling Cognitive Impairment to Improve Universal Access . . . . . . . . . Elina Jokisuu, Patrick Langdon, and P. John Clarkson
User Modeling through Unconscious Interaction with Smart Shop . . . . . . Toshikazu Kato
Supporting Inclusive Design of User Interfaces with a Virtual User Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pierre T. Kirisci, Patrick Klein, Markus Modzelewski, Michael Lawo, Yehya Mohamad, Thomas Fiddian, Chris Bowden, Antoinette Fennell, and Joshue O Connor Virtual User Concept for Inclusive Design of Consumer Products and User Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Yehya Mohamad, Carlos A. Velasco, Jaroslav Pullmann, Michael Lawo, and Pierre Kirisci Modeling Users for Adaptive Semantics Visualizations . . . . . . . . . . . . . . . . Kawa Nazemi, Dirk Burkhardt, Matthias Breyer, and Arjan Kuijper
Table of Contents – Part II
An Investigation of a Personas-Based Model Assessment for Experiencing User-Centred Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Wen Cing-Yan Nivala, De-Lai Men, Tin-Kai Chen, and Robert C.C. Chen Numerical Analysis of Geometrical Features of 3D Biological Objects, for Three-Dimensional Biometric and Anthropometric Database . . . . . . . Michal Rychlik, Witold Stankiewicz, and Marek Morzynski
Part II: Older People in the Information Society

Designing Interactive Pill Reminders for Older Adults: A Formative Study . . . Sepideh Ansari
Older User Errors in Handheld Touchscreen Devices: To What Extent Is Prediction Possible? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Michael Bradley, Patrick Langdon, and P. John Clarkson
Affective Technology for Older Adults: Does Fun Technology Affect Older Adults and Change Their Lives? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ryoko Fukuda
Muntermacher – “Think and Move” Interface and Interaction Design of a Motion-Based Serious Game for the Generation Plus . . . . . . . . . . . . . Holger Graf, Christian Tamanini, and Lukas Geissler
Preliminary Framework for Studying Self-reported Data in Electronic Medical Records within a Continuing Care Retirement Community . . . . . Kelley Gurley and Anthony F. Norcio
Using Motion-Sensing Remote Controls with Older Adults . . . Thomas von Bruhn Hinné and Simeon Keates
Design Lessons for Older Adult Personal Health Records Software from Older Adults . . . Juan Pablo Hourcade, Elizabeth A. Chrischilles, Brian M. Gryzlak, Blake M. Hanson, Donald E. Dunbar, David A. Eichmann, and Ryan R. Lorentzen
Design and Development a Social Networks Platform for Older People . . . Chien-Lung Hsu, Kevin C. Tseng, Chin-Lung Tseng, and Boo-Chen Liu
In Search of Information on Websites: A Question of Age? . . . Eugène Loos
Preliminary Findings of an Ethnographical Research on Designing Accessible Geolocated Services with Older People . . . Valeria Righi, Guiller Malón, Susan Ferreira, Sergio Sayago, and Josep Blat
An Experiment for Motivating Elderly People with Robot Guided Interaction . . . Ryohei Sasama, Tomoharu Yamaguchi, and Keiji Yamada
Connecting Communities: Designing a Social Media Platform for Older Adults Living in a Senior Village . . . Tsai-Hsuan Tsai, Hsien-Tsung Chang, Alice May-Kuen Wong, and Tsung-Fu Wu
A Telehealthcare System to Care for Older People Suffering from Metabolic Syndrome . . . Kevin C. Tseng, Chien-Lung Hsu, and Yu-Hao Chuang
Narrating Past to Present: Conveying the Needs and Values of Older People to Young Digital Technology Designers . . . Elizabeth Valentine, Ania Bobrowicz, Graeme Coleman, Lorna Gibson, Vicki L. Hanson, Saikat Kundu, Alison McKay, and Raymond Holt
Evaluating the Design, Use and Learnability of Household Products for Older Individuals . . . Christopher Wilkinson, Patrick Langdon, and P. John Clarkson
Part III: Designing for Users Diversity

Disable Workstation Development: A Multicompetence Approach to Human Behaviour Analysis . . . Giuseppe Andreoni, Fiammetta Costa, Carlo Frigo, Sabrina Muschiato, Esteban Pavan, Laura Scapini, and Maximiliano Romero
An Information Theoretic Mouse Trajectory Measure . . . . . . . . . . . . . . . . . Samuel Epstein, Eric S. Missimer, and Margrit Betke Comparative Study between AZERTY-Type and K-Hermes Virtual Keyboards Dedicated to Users with Cerebral Palsy . . . . . . . . . . . . . . . . . . . Yohan Guerrier, Maxime Baas, Christophe Kolski, and Franck Poirier
New Trends in Non-visual Interaction - Sonification of Maps . . . . . . . . . . . Vidas Lauruska
Opportunities in Cloud Computing for People with Cognitive Disabilities: Designer and User Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . Clayton Lewis and Nancy Ward
Access-a-WoW: Building an Enhanced World of Warcraft™ UI for Persons with Low Visual Acuity . . . G. Michael Poor, Thomas J. Donahue, Martez E. Mott, Guy W. Zimmerman, and Laura Marie Leventhal
Audiopolis, Navigation through a Virtual City Using Audio and Haptic Interfaces for People Who Are Blind . . . Jaime Sánchez and Javiera Mascaró
Do Hedonic and Eudaimonic Well-Being of Online Shopping Come from Daily Life Experience? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jia Zhang and Hiroyuki Umemuro
Part V: Eye Tracking, Gestures and Brain Interfaces

Eye Tracking and Universal Access: Three Applications and Practical Examples . . . Michael Bartels and Sandra P. Marshall
Interpreting 3D Faces for Augmented Human-Computer Interaction . . . Marinella Cadoni, Enrico Grosso, Andrea Lagorio, and Massimo Tistarelli
Social Environments, Mixed Communication and Goal-Oriented Control Application Using a Brain-Computer Interface . . . Günter Edlinger and Christoph Guger
Tactile Hand Gesture Recognition through Haptic Feedback for Affective Online Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Hae Youn Joung and Ellen Yi-Luen Do
Gesture-Based User Interfaces for Public Spaces . . . . . . . . . . . . . . . . . . . . . Andreas Kratky
Towards Standardized User and Application Interfaces for the Brain Computer Interface . . . Paul McCullagh, Melanie Ware, Alex McRoberts, Gaye Lightbody, Maurice Mulvenna, Gerry McAllister, José Luis González, and Vicente Cruz Medina
Head Movements, Facial Expressions and Feedback in Danish First Encounters Interactions: A Culture-specific Analysis . . . . . . . . . . . . . . . . . . Patrizia Paggio and Costanza Navarretta
Part I: Universal Access in the Mobile Context

Results of the Technical Validation of an Accessible Contact Manager for Mobile Devices . . . Jon Azpiroz, Juan Bautista Montalvá Colomer, María Fernanda Cabrera-Umpiérrez, María Teresa Arredondo, and Julio Gutiérrez
12
BrailleTouch: Mobile Texting for the Visually Impaired . . . . . . . . . . . . . . . Brian Frey, Caleb Southern, and Mario Romero
A System for Enhanced Situation Awareness with Outdoor Augmented Reality . . . Jan A. Neuhöfer and Thomas Alexander
Implementation of the ISO/IEC 24756 for the Interaction Modeling of an AAL Space . . . Pilar Sala, Carlos Fernandez, Juan Bautista Mocholí, Pablo Presencia, and Juan Carlos Naranjo
210
220
230
Part III: Driving and Interaction

Towards an Integrated Adaptive Automotive HMI for the Future . . . Angelos Amditis, Katia Pagle, Gustav Markkula, and Luisa Andreone
Spaces of Mutable Shape and the Human Ability to Adapt . . . . . . . . . . . . Katarzyna Kubsik
Using a Visual Assistant to Travel Alone within the City . . . Yves Lachapelle, Dany Lussier-Desrochers, Martin Caouette, and Martin Therrien-Bélec
Part I: Speech, Communication and Dialogue

Greek Verbs and User Friendliness in the Speech Recognition and the Speech Production Module of Dialog Systems for the Broad Public . . . Christina Alexandris and Ioanna Malagardi
3
Intercultural Dynamics of First Acquaintance: Comparative Study of Swedish, Chinese and Swedish-Chinese First Time Encounters . . . . . . . . . . .
Jens Allwood, Nataliya Berbyuk Lindström, and Jia Lu
An Experimental Study of the Use of Multiple Humanoid Robots as a Social Communication Medium . . . . . . . . . . . 32
Kotaro Hayashi, Takayuki Kanda, Hiroshi Ishiguro, Tsukasa Ogasawara, and Norihiro Hagita
A Multitasking Approach to Adaptive Spoken Dialogue Management . . . . . . . . . . . 42
Tobias Heinroth, Dan Denich, and Wolfgang Minker
From Clouds to Rain: Consolidating and Simplifying Online Communication Services with Easy One Communicator . . . . . . . . . . . 52
Jeffery Hoehl and Gregg Vanderheiden
Use of Speech Technology in Real Life Environment . . . . . . . . . . . . . . . . . . Ruimin Hu, Shaojian Zhu, Jinjuan Feng, and Andrew Sears
Collaborative Editing for All: The Google Docs Example . . . . . . . . . . . 165
Giulio Mori, Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, and Victor M.R. Penichet
Acoustic Modeling of Dialogue Elements for Document Accessibility . . . . . . . . . . . 175
Pepi Stavropoulou, Dimitris Spiliotopoulos, and Georgios Kouroupetroglou
Part III: Universal Access in Complex Working Environments

Seeing the Wood for the Trees Again! SMART – A Holistic Way of Corporate Governance Offering a Solution Ready to Use . . . . . . . . . . . 187
Fritz Bastarz and Patrick Halek
Using Human Service Center Interfaces and Their Information to Foster Innovation Management . . . . . . . . . . . 195
Klaus-Peter Faehnrich, Kyrill Meyer, and Benjamin Udo Strehl
Key Features of Subject-Oriented Modeling and Organizational Deployment Tools . . . . . . . . . . . 205
Albert Fleischmann and Christian Stary
Table of Contents – Part IV
XXXIX
The Effect of a GPS on Learning with Regards to Performance and Communication in Municipal Crisis Response . . . . . . . . . . . 215
Helena Granlund, Rego Granlund, and Nils Dahlbäck
Crisis Management Training: Techniques for Eliciting and Describing Requirements and Early Designs across Different Incident Types . . . . . . . . . . . 225
Ebba Thora Hvannberg and Jan Rudinsky
Development of Mobile Evacuation Guides for Travellers and Rescue Personnel . . . . . . . . . . . 235
Viveca Jimenez-Mixco, Jan Paul Leuteritz, Eugenio Gaeta, Maria Fernanda Cabrera-Umpierrez, María Teresa Arredondo, Harald Widlroither, and Mary Panou

Stakeholder-Driven Business Process Management: “An Evaluation of the Suitability, Adequacy and Effectiveness of Quality and Process Management” . . . . . . . . . . . 244
Michael Jungen
Evaluation of a Mobile AR Tele-Maintenance System . . . . . . . . . . . 253
Michael Kleiber and Thomas Alexander
Knowledge in Digital Decision Support System . . . . . . . . . . . 263
Erika Matsak and Peeter Lorents
Examining the Current State of Group Support Accessibility: A Focus Group Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . John G. Schoeberlein and Yuanqiong (Kathy) Wang
Age Dependent Differences in the Usage of a Desktop VR System for Air Force Mission Planning and Preparation . . . . . . . . . . . 292
Carsten Winkelholz and Michael Kleiber
A Concept for User-Centered Development of Accessible User Interfaces for Industrial Automation Systems and Web Applications . . . . . . . . . . . 301
Farzan Yazdi, Helmut Vieritz, Nasser Jazdi, Daniel Schilberg, Peter Göhner, and Sabina Jeschke
Part IV: Well-Being, Health and Rehabilitation Applications

Forms of Interaction in Virtual Space: Applications to Psychotherapy and Counselling . . . . . . . . . . . 313
Shagun Chawla and Nigel Foreman
Control of Powered Prosthetic Hand Using Multidimensional Ultrasound Signals: A Pilot Study . . . . . . . . . . . 322
Xin Chen, Siping Chen, and Guo Dan
A Top-k Analysis Using Multi-level Association Rule Mining for Autism Treatments . . . . . . . . . . . 328
Kelley M. Engle and Roy Rada
Pregnancy Test for the Vision-Impaired Users . . . . . . . . . . . 335
Tereza Hyková, Adam J. Sporka, Jan Vystrčil, Martin Klíma, and Pavel Slavík

Effect of Spinal Cord Injury on Nonlinear Complexity of Skin Blood Flow Oscillations . . . . . . . . . . . 345
Yih-Kuen Jan, Fuyuan Liao, and Stephanie Burns

Developing Protégé to Structure Medical Report . . . . . . . . . . . 356
Josette Jones, Kanitha Phalakornkule, Tia Fitzpatrick, Sudha Iyer, and C. Zorina Ombac

Increasing Physical Activity by Implementing a Behavioral Change Intervention Using Pervasive Personal Health Record System: An Exploratory Study . . . . . . . . . . . 366
Hadi Kharrazi and Lynn Vincz
Exploring Health Website Users by Web Mining . . . . . . . . . . . 376
Wei Kong and Josette Jones
Double Visual Feedback in the Rehabilitation of Upper Limb . . . . . . . . . . . 384
Liu Enchen, Sui Jianfeng, and Ji Linhong
Can User Tagging Help Health Information Seekers? . . . . . . . . . . . 389
Malika Mahoui, Josette Jones, Andrew Meyerhoff, and Syed Ahmed Toufeeq
Interactive Medical Volume Visualizations for Surgical Online Applications . . . . . . . . . . . 398
Konrad Mühler, Mathias Neugebauer, and Bernhard Preim

Age-Adapted Psychoacoustics: Target Group Oriented Sound Schemes for the Interaction with Telemedical Systems . . . . . . . . . . . 406
Alexander Mertens, Philipp Przybysz, Alexander Groß, David Koch-Koerfges, Claudia Nick, Martin Kaethner, and Christopher M. Schlick

Bringing the Home into the Hospital: Assisting the Pre-Discharge Home Visit Process Using 3D Home Visualization Software . . . . . . . . . . . 416
Arthur G. Money, Anne McIntyre, Anita Atwal, Georgia Spiliotopoulou, Tony Elliman, and Tim French
Design of a Paired Patient-Caregiver Support Application for People Coping with Cancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christine M. Newlon, Robert Skipworth Comer, Kim Wagler-Ziner, and Anna M. McDaniel
The Relationships between Morphology and Work for the Nursing Performance of Hand Controls in Emergency Surgery . . . . . . . . . . . 448
Chao-Yuan Tseng and Fong-Gong Wu
Upper Limb Contralateral Physiological Characteristic Evaluation for Robot-Assisted Post Stroke Hemiplegic Rehabilitation . . . . . . . . . . . 458
Lap-Nam Wong, Qun Xie, and Linhong Ji
A Case Study of the Design and Evaluation of a Persuasive Healthy Lifestyle Assistance Technology: Challenges and Design Guidelines . . . . . . . . . . . 464
Jie Xu, Ping-yu Chen, Scott Uglow, Alison Scott, and Enid Montague
Novel Human-Centered Rehabilitation Robot with Biofeedback for Training and Assessment . . . . . . . . . . . 472
Runze Yang, Linhong Ji, and Hongwei Chen
Intensity Analysis of Surface Myoelectric Signals from Lower Limbs during Key Gait Phases by Wavelets in Time-Frequency . . . . . . . . . . . 479
Jiangang Yang, Xuan Gao, Baikun Wan, Dong Ming, Xiaoman Cheng, Hongzhi Qi, Xingwei An, Long Chen, Shuang Qiu, and Weijie Wang

Handle Reaction Vector Analysis with Fuzzy Clustering and Support Vector Machine during FES-Assisted Walking Rehabilitation . . . . . . . . . . . 489
Weixi Zhu, Dong Ming, Baikun Wan, Xiaoman Cheng, Hongzhi Qi, Yuanyuan Chen, Rui Xu, and Weijie Wang
Part V: Universal Access to Education and Learning

From “Reading” Math to “Doing” Math: A New Direction in Non-visual Math Accessibility . . . . . . . . . . . 501
Nancy Alajarmeh, Enrico Pontelli, and Tran Son

Accessible Education for Autistic Children: ABA-Based Didactic Software . . . . . . . . . . . 511
Silvia Artoni, Maria Claudia Buzzi, Marina Buzzi, Claudia Fenili, and Simona Mencarini
Educational Impact of Structured Podcasts on Blind Users . . . . . . . . . . . 521
Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, and Giulio Mori

A New Structure of Online Learning Environment to Support the Professional Learning . . . . . . . . . . . 530
Wenzhi Chen and Yu-How Lin

Behaviour Computer Animation, Communicability and Education for All . . . . . . . . . . . 538
Francisco V. Cipolla Ficarra, Jacqueline Alma, and Miguel Cipolla-Ficarra

An Intelligent Task Assignment and Personalization System for Students’ Online Collaboration . . . . . . . . . . . 548
Asterios Leonidis, George Margetis, Margherita Antona, and Constantine Stephanidis

Using Interface Design with Low-Cost Interactive Whiteboard Technology to Enhance Learning for Children . . . . . . . . . . . 558
Chien-Yu Lin, Fong-Gong Wu, Te-Hsiung Chen, Yan-Jin Wu, Kenendy Huang, Chia-Pei Liu, and Shu-Ying Chou

Integration of a Spanish-to-LSE Machine Translation System into an e-Learning Platform . . . . . . . . . . . 567
Fernando López-Colino, Javier Tejedor, Jordi Porta, and José Colás
Towards Ambient Intelligence in the Classroom . . . . . . . . . . . 577
George Margetis, Asterios Leonidis, Margherita Antona, and Constantine Stephanidis
Learning Styles and Navigation Patterns in Web-Based Education . . . . . . . . . . . 587
Jelena Nakić, Nikola Marangunić, and Andrina Granić
Phynocation: A Prototyping of a Teaching Assistant Robot for C Language Class . . . . . . . . . . . 597
Akihiro Ogino, Haruaki Tamada, and Hirotada Ueda
A Generic OSGi-Based Model Framework for Delivery Context Properties and Events . . . . . . . . . . . 605
Jaroslav Pullmann, Yehya Mohamad, Carlos A. Velasco, and Stefan P. Carmien

Inclusive Scenarios to Evaluate an Open and Standards-Based Framework That Supports Accessibility and Personalisation at Higher Education . . . . . . . . . . . 612
Alejandro Rodriguez-Ascaso, Jesus G. Boticario, Cecile Finat, Elena del Campo, Mar Saneiro, Eva Alcocer, Emmanuelle Gutiérrez y Restrepo, and Emanuela Mazzone
Relationship between BPM Education and Business Process Solutions: Results of a Student Contest . . . . . . . . . . . 622
Werner Schmidt
Design of a Multi-interface Creativity Support Tool for the Enhancement of the Creativity Process . . . . . . . . . . . 632
George A. Sielis, George A. Papadopoulos, and Andrina Granić
Shadow Expert Technique (SET) for Interaction Analysis in Educational Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Christian Stickel, Martin Ebner, and Andreas Holzinger
Visual Mediation Mechanisms for Collaborative Design and Development

Carmelo Ardito 1, Barbara Rita Barricelli 2, Paolo Buono 1, Maria Francesca Costabile 1, Antonio Piccinno 1, Stefano Valtolina 2, and Li Zhu 2

1 Dipartimento di Informatica, Università degli Studi di Bari, Via Orabona 4, 70125 Bari, Italy
{ardito,buono,costabile,piccinno}@di.uniba.it
2 Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano, Via Comelico 39/41, 20135 Milano, Italy
{barricelli,valtolin,zhu}@dico.unimi.it
Abstract. Collaborative design involving end users has emerged as a response to the need felt by various organizations to adapt software to specific environments and users. Over time, users and environments evolve; this is another reason why software has to be modified. Different stakeholders, including consultants, designers internal to the organization and, more recently, end users, have to collaborate among themselves, and possibly with the software providers, to shape software. Such stakeholders face fundamental challenges in learning how to communicate and in building a shared understanding. Researchers are now addressing such challenges. This paper contributes to this innovative research by formally defining visual mediation mechanisms for collaborative design. A case study illustrating their application is discussed.

Keywords: collaborative design, mediation mechanisms, end-user development, meta-design, communities of practice.
infrastructure. Researchers are now exploring different ways of effectively supporting the collaborative activities of the diverse stakeholders in such ecosystems. Particular attention is devoted to the development of software environments and tools that enable end users to effectively participate in this collaboration and in adapting software at use time. Recent developments based on the Web 2.0 and the Semantic Web, like weblogs, podcasts, RSS feeds, social software such as wikis, and social networking, are already examples of collaborative design environments that permit user-generated content. Fischer refers to a new world based on cultures of participation, in which end users evolve from passive software consumers to active producers [2] and are involved in various activities of end-user development [3]. In several development practices, communication and collaboration with end users take place through channels that are separated from the actual software, e.g., phone or e-mail, thus limiting end users’ participation in collaborative design. Another main problem is that stakeholders are very diverse: characterized by different cultures and skills, they use different languages and notations and adopt different documentation styles, i.e., they belong to different Communities of Practice (CoPs). Following Wenger, we refer to a CoP as a group of people who share a common practice and address a common set of problems [4]. CoPs develop their own languages and notations to express and communicate their knowledge, problems and solutions. Examples of CoPs are software developers, software consultants, and end users. The CoPs involved in the design, development and evolution of a certain software system represent a Community of Interest (CoI), defined in [5] as a community of communities brought together to solve a problem of common concern.
CoIs stress the importance of combining voices from different CoPs; however, they face fundamental challenges in communicating among the CoPs and in building a shared understanding, which is the basis for their collaboration. Members of the CoI keep collaborating at use time, whenever there is the need to modify or evolve the software; this forces the development of technical means to relate and integrate users’ and developers’ views in order to provide a seamless way of moving between use and design of software, facilitating its adaptation to users’ needs and environments. Such technical means include new modeling languages, architectures that support multilevel design and development, but also mediation mechanisms, which permit the communication between professional developers’ environments and end users’ environments across the ecosystem. The novel contribution of this paper is the formal definition of visual mediation mechanisms supporting collaborative design. Their application to a case study referring to a Web portal, which advertises products of shops of various types, is also discussed. The paper is organized as follows. Section 2 discusses the mediation process. Section 3 presents the formal specification of visual mediation mechanisms, and Section 4 illustrates how they are applied to a case study. Finally, Section 5 reports conclusions.
2 Mediation Process

A mediation process allows two human actors (shortly, actors) to reach a common understanding, related to a specific domain, with the support of an agent, the mediator [6].
Wiederhold defined a mediator as “a software module that exploits encoded knowledge about certain sets or subsets of data to create information for a higher layer of applications” [7]. The concept of mediator has been used in the field of Web services to manage the interoperability among either software agents and Web services or Web services themselves [8], [9]. In the collaborative design context analyzed in this paper, the mediation process consists of exchanging messages between two actors playing a certain role in the collaboration [10]. These actors are generally members of different CoPs; they use dedicated interactive environments to reason on and perform their activities. Similarly to what is described in [11], each such environment exploits an interaction language that complies with the CoP’s notations, culture and role in the collaboration, in order to be usable by the CoP members. Each environment is equipped with an engine that acts as mediator by translating the incoming messages into the CoP’s interaction language. The first two steps of a generic mediation process are illustrated in Fig. 1. The human actor H1 sends a message (Message1) to another human actor H2. Before reaching H2, Message1 is captured and managed by “Mediator2” (the engine of H2’s interactive environment that acts as mediator for it), which, by exploiting the Knowledge Base (KB) for the current domain, translates it into the interaction language used by H2, so that H2 can understand it. The translated message, Message1’, is then delivered to H2. In the second step represented in Fig. 1, H2 replies to H1’s message by sending a new message (Message2) to H1. In analogy with the first step, the message is captured and managed by “Mediator1” (the mediator for H1’s interactive environment), which, by exploiting the knowledge base, translates it into the interaction language used by H1. The translated message (Message2’) is then delivered to H1.
Fig. 1. Mediation process between two human actors (H1 and H2)

The CoPs’ environments support the actors involved in the communication process by allowing them to create and exchange boundary objects and annotations on them. Boundary objects are artifacts of the interactive system and are part of the message sent by a human actor (a member of a CoP), but they are received and interpreted differently by another human actor according to the background and expertise of the CoP s/he belongs to [12]. Boundary objects are used all the time to sustain communication in face-to-face collaboration; for example, blueprints, sketches and drawings are used during discussion in design engineering; similarly, digital images are used during discussions in medicine and the natural sciences. Discussants may express their annotations on a boundary object by using sticky notes. Turning to interactive software systems, boundary objects are used to support computer-mediated communication among members of CoIs who are geographically distributed and work asynchronously on the same project. The diversity of the members of the CoI is addressed by making available boundary objects that are adaptable to the various cultures and contexts of use. Members of the CoI that constitutes the design team interact and negotiate a concept by exchanging messages based on boundary objects as a concrete representation of what they mean. Boundary objects serve as externalizations that capture distinct domains of human knowledge and hold the potential to lead to an increase in socially shared cognition and practice [13], [14]. They carry information and context and can be used to translate, transfer and transform knowledge between CoI members [12]. These objects are dynamic; they can be changed and manipulated to carry more information. In a collaborative design context, the effectiveness of a boundary object is directly related to how it is translated from tacit knowledge to explicit knowledge and back from explicit knowledge to tacit knowledge between different actors [15]. Since the information carried by boundary objects can be implicit, annotations allow each actor to explicitly explain the modifications s/he introduces in the boundary objects.
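The two-step message translation of Fig. 1 can be sketched in code. The following is a minimal illustrative sketch, not part of the system described in the paper: the class names (`KnowledgeBase`, `Mediator`), the dictionary-based knowledge base, and the term-by-term translation are all simplifying assumptions.

```python
class KnowledgeBase:
    """Shared domain knowledge: maps (source language, target language, term)
    triples to the term in the target interaction language."""
    def __init__(self, mappings):
        self._mappings = mappings

    def translate(self, src, dst, term):
        # Terms without a known mapping are passed through unchanged.
        return self._mappings.get((src, dst, term), term)


class Mediator:
    """Engine of one actor's environment: translates incoming messages
    into that actor's interaction language by exploiting the KB."""
    def __init__(self, kb, language):
        self.kb = kb
        self.language = language

    def deliver(self, message):
        # Translate every term of the incoming message for the receiver.
        return [self.kb.translate(message["lang"], self.language, t)
                for t in message["terms"]]


# Hypothetical domain knowledge relating two CoPs' vocabularies.
kb = KnowledgeBase({
    ("provider", "editorial", "photo gallery"): "TPL-GALLERY",
    ("editorial", "provider", "TPL-GALLERY-8"): "photo gallery (8 pictures)",
})

mediator_h2 = Mediator(kb, "editorial")  # mediates for actor H2
mediator_h1 = Mediator(kb, "provider")   # mediates for actor H1

# Step 1: H1 -> Mediator2 -> H2 (Message1 becomes Message1')
msg1 = {"lang": "provider", "terms": ["photo gallery"]}
print(mediator_h2.deliver(msg1))   # ['TPL-GALLERY']

# Step 2: H2 -> Mediator1 -> H1 (Message2 becomes Message2')
msg2 = {"lang": "editorial", "terms": ["TPL-GALLERY-8"]}
print(mediator_h1.deliver(msg2))   # ['photo gallery (8 pictures)']
```

Each actor thus only ever reads messages already rendered in her/his own CoP's interaction language, while the shared knowledge base carries the cross-CoP mappings.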
3 Mediation Mechanisms

The elements involved in a mediation process constitute a Mediation Mechanism (MM), defined as:

MM = (Mediator, KB, MVL)

where:
- Mediator is the agent that supports the two human actors in reaching an agreement through a mediation process;
- KB is the knowledge base accumulated in a specific domain in which the actors collaborate;
- MVL (Mediation Visual Language) is the visual language constituted by the set of messages exchanged by the two human actors by means of the Mediator. MVL is defined as follows:

MVL = {MVLmsg1, …, MVLmsgn}

A MVLmsg is a message defined as MVLmsg = ⟨Data, Metadata⟩. Data describe the object of the mediation; metadata specify some characteristics of the sending and receiving actors and of the digital platform used in the communication. When an actor sends a message, a mediation process starts. Such a message is constituted as follows.
• Data:
─ EP, an executable program, that is, the software artifact that the two actors are collaboratively designing, or part of it;
─ A, the annotation that the sender attached to the EP in order to communicate with the receiver;
• Metadata:
─ S, the profile of the human actor that acts as the sender;
─ R, the profile of the human actor that acts as the receiver;
─ Pl, the specification of the hw/sw platform being used to access the program.

All messages following the first one convey the contributions of the involved actors. The metadata related to the profiles of the communicants are sent only in the first MVL message and are not repeated in the following ones. In the messages, boundary objects are augmented with multimedia annotations to support negotiation among the different actors. Boundary objects and annotations are explicitly represented by MVL. A mediation mechanism enables actors of different CoPs, working with different software environments, to properly communicate and collaborate within the ecosystem, since it makes concepts expressed in the language of one CoP understandable by the members of another CoP. In [11], examples of message exchanges between actors of different CoPs, each working with the software environment specific to the CoP the actor belongs to, are provided. That paper also describes the Software Shaping Workshop (SSW) methodology, whose main idea is that all stakeholders, including end users, are “owners” of a part of the problem (e.g., software engineers are experts of technology, end users know the application domain, human-computer interaction experts deal with human factors). They all bring their own expertise into the collaboration and exchange ideas in order to converge toward a common design. Different software environments (called Software Shaping Workshops) are provided to each community of stakeholders (CoPs) in order to allow them to actively participate in system design, development and evolution.
The case study illustrated in the next section has been designed according to the SSW methodology.
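The MVL message structure defined in this section can be sketched as a set of record types. The field names follow the paper's definitions (EP, A, S, R, Pl); the Python dataclass encoding and the example values are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Data:
    ep: str          # EP: the executable program under collaborative design
    annotation: str  # A: the annotation the sender attached to EP

@dataclass
class Metadata:
    sender: Optional[str] = None    # S: sender profile (first message only)
    receiver: Optional[str] = None  # R: receiver profile (first message only)
    platform: Optional[str] = None  # Pl: hw/sw platform specification

@dataclass
class MVLMessage:
    # MVLmsg = <Data, Metadata>
    data: Data
    metadata: Metadata

# The first message of a mediation process carries the full metadata;
# follow-up messages omit the communicants' profiles.
first = MVLMessage(
    Data(ep="virtual-shop-window", annotation="show more pictures"),
    Metadata(sender="content provider", receiver="editorial staff",
             platform="web browser / desktop"),
)
reply = MVLMessage(
    Data(ep="virtual-shop-window-v2", annotation="gallery enlarged"),
    Metadata(),  # profiles are not repeated after the first message
)
```

The `Optional` metadata fields reflect the rule stated above that profiles are transmitted only once per mediation process.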
4 A Case Study: The Virtual Showroom

The case study described in this paper refers to the Web portal of a company that provides virtual windows for shops of various kinds. Through its portal, the company sells advertisement spaces (the virtual shop windows) to shop owners, in order to allow them to advertise their goods. One of the novelties of the system is that such owners are allowed to create and manage the content of the virtual windows, so that they can update them as they like at any time. The virtual windows may be created using different templates, which are made available by the company and sold at different prices: the more the complexity (in terms of combined multimedia elements) grows, the higher the virtual window price becomes. Fig. 2 depicts a virtual window for a women’s accessories shop, whose template is composed of a left side with a textual description and a right side with a photo gallery showing four pictures.
Some stakeholders in this case study are grouped into four distinct CoPs:
─ Customers, i.e., Web surfers, who interact with and/or browse the virtual shop windows;
─ Content providers, i.e., shop owners, who provide the contents to be shown in their virtual shop window(s);
─ Editorial staff, who create the virtual windows’ templates to be used by the content providers;
─ Administrators, who shape the software environment in which the editorial staff designs the virtual windows’ templates.

Administrators, editorial staff, and content providers collaborate in designing the portal by interacting with software environments specifically developed for them. Editorial staff and content providers are not required to be computer experts. On the other hand, administrators have to be familiar with the application domain and should also have some experience in Web development. The software environment used by the administrators is designed by professional software developers, but this is out of the scope of this paper, so this CoP is not considered here. The reader interested in the whole meta-design process can refer to [11]. To provide an example of a mediation process in this case study, consider the situation in which the content provider wants a feature of her/his virtual shop window modified. As illustrated in Fig. 3, s/he uses the annotation tool available in the system to annotate that specific feature, explaining the changes s/he wants. The feature is the photo gallery, which is highlighted with a dashed border. Through this annotation, which in Fig. 3 overlaps the main picture, the content provider requests from the editorial staff a modification of the virtual shop window template in order to be able to show a higher number of pictures.
Fig. 2. An example of virtual window of a women’s accessories shop. The template is composed of a textual description on the left and a photo gallery on the right.
Fig. 3. An example of annotation containing a request of the content provider. The photo gallery is highlighted by surrounding it with a dashed border and the annotation explaining the requested modification is overlapped on the main picture.
A mediation process is thus activated. The mediation mechanism involved in this process consists of the three components defined in Section 3; specifically, Mediator is an engine that is part of the environment used by the content provider to make her/his requests to change the template of the virtual shop window; KB contains all the information necessary to translate the content of the message (i.e., EP and A) into a specific request for the editorial staff; MVL is the visual language composed of the messages exchanged during the whole mediation process between content provider and editorial staff. The first message is constituted as follows.
• Data:
─ EP, the virtual shop window environment that the content provider is asking to modify;
─ A, the annotation that the content provider attached to the EP in order to communicate her/his request to the editorial staff;
• Metadata:
─ S, the profile of the content provider (the sender in this mediation process);
─ R, the profile of the editorial staff (the receiver in this mediation process);
─ Pl, the specification of the hw/sw platform used to access the Web portal.
The receiver, a member of the editorial staff, gets the message translated according to the language of her/his software environment, which is customized to the needs of that CoP. As shown in Fig. 4, the content of the message is rendered in a table at the bottom of the screen, which communicates the same meaning intended by the content provider who sent the message but uses a different visual representation, with codes that the editorial staff understands well. If the editorial staff member can directly manage the request, s/he performs the necessary software modification and communicates it to the content provider; otherwise, s/he activates a second mediation process with a member of the administrator CoP, to whom s/he reports the request for modifications. In the first mediation process, the reply message to the content provider consists of the virtual shop window application EP, modified according to the content provider’s request, and the annotation A created by the editorial staff to explain the performed modifications. If the content provider is satisfied with the solution s/he gets, the mediation process is concluded; otherwise it continues iteratively until the content provider and the editorial staff reach an agreement.
Fig. 4. Editorial staff environment: the request sent by the content provider is shown
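The iterative negotiation between content provider and editorial staff can be summarized as a simple request/reply loop that terminates on agreement. The sketch below is a hypothetical illustration: the function names, the dictionary-based request, and the toy satisfaction criterion are all assumptions, not part of the described system.

```python
def mediation_rounds(request, apply_modification, is_satisfied, max_rounds=10):
    """Iterate request/reply rounds until the requester is satisfied.

    apply_modification: how the receiver (e.g. editorial staff) modifies EP;
    is_satisfied: how the sender (e.g. content provider) reviews the reply.
    Returns the accepted reply and the number of rounds taken.
    """
    for round_no in range(1, max_rounds + 1):
        reply = apply_modification(request)  # modified EP plus annotation A
        if is_satisfied(reply):              # provider accepts: process concluded
            return reply, round_no
        request = reply                      # otherwise the negotiation continues
    raise RuntimeError("no agreement reached within the round limit")


# Toy run: the provider is satisfied once the gallery holds >= 8 pictures,
# and each round the editorial staff enlarges the gallery by two slots.
grow = lambda req: {"gallery_size": req["gallery_size"] + 2}
accepts = lambda rep: rep["gallery_size"] >= 8

result, rounds = mediation_rounds({"gallery_size": 4}, grow, accepts)
print(result, rounds)  # {'gallery_size': 8} 2
```

The round limit is a safeguard of the sketch only; the paper describes the iteration as continuing until agreement is reached.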
5 Conclusions

This paper has discussed and provided a formal definition of visual mediation mechanisms for the collaborative design, development and evolution of software. Mediation mechanisms provide a means to improve communication and cooperation among all stakeholders involved in the design of software artifacts, including end users. This communication is fundamental in order to cooperatively create software adapted to users’ needs and context of work. A case study referring to a Web portal that provides advertisement for shops of various kinds has been presented; it provides an example of how visual mediation mechanisms are used to permit collaboration among the different stakeholders.

Acknowledgments. This work is supported by the Italian MIUR and by grant PS_092 DIPIS. Li Zhu acknowledges the support of the Initial Training Network “Marie Curie
Actions”, funded by the FP7 People Programme with reference PITN-GA-2008-215446 “DESIRE: Creative Design for Innovation in Science and Technology”. The authors thank Nicola C. Cellamare for his collaboration in the case study.
References

1. Software Engineering Institute: Ultra-Large-Scale Systems: The Software Challenge of the Future, http://www.sei.cmu.edu (last access on February 22, 2010)
2. Fischer, G.: Cultures of Participation and Social Computing: Rethinking and Reinventing Learning and Education. In: Ninth IEEE International Conference on Advanced Learning Technologies (ICALT), pp. 1–5. IEEE Computer Society, Riga (2009)
3. Lieberman, H., Paternò, F., Wulf, V. (eds.): End User Development, vol. 9. Springer, Dordrecht (2006)
4. Wenger, E.: Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press, London (1998)
5. Fischer, G.: Extending Boundaries with Meta-Design and Cultures of Participation. In: 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries (NordiCHI 2010), pp. 168–177. ACM, Reykjavik (2010)
6. Boulle, L.: Mediation: Principles, Process, Practice. LexisNexis Butterworths, Chatswood (2005)
7. Wiederhold, G.: Mediators in the Architecture of Future Information Systems. Computer 25(3), 38–49 (1992)
8. Burstein, M.H., McDermott, D.V.: Ontology Translation for Interoperability among Semantic Web Services. AI Magazine – Special Issue on Semantic Integration 26(1), 71–82 (2005)
9. Vaculin, R., Neruda, R., Sycara, K.: The Process Mediation Framework for Semantic Web Services. International Journal of Agent-Oriented Software Engineering 3(1), 27–58 (2009)
10. Zhu, L., Mussio, P., Barricelli, B.R.: Hive-Mind Space Model for Creative, Collaborative Design. In: 1st DESIRE Network Conference on Creativity and Innovation in Design, Desire Network, Aarhus, Denmark, pp. 121–130 (2010)
11. Costabile, M.F., Fogli, D., Mussio, P., Piccinno, A.: Visual Interactive Systems for End-User Development: A Model-Based Design Methodology. IEEE T. Syst. Man Cy. A 37(6), 1029–1046 (2007)
12. Carlile, P.R.: A Pragmatic View of Knowledge and Boundaries: Boundary Objects in New Product Development. Organization Science 13(4), 442–455 (2002)
13. Jennings, P.: Tangible Social Interfaces: Critical Theory, Boundary Objects and Interdisciplinary Design Methods. In: 5th Conference on Creativity & Cognition (C&C), pp. 176–186. ACM, London (2005)
14. Resnick, L., Levine, J., Teasley, S.: Perspectives on Socially Shared Cognition. American Psychological Association, APA (1991)
15. Fong, A., Valerdi, R., Srinivasan, J.: Boundary Objects as a Framework to Understand the Role of Systems Integrators. Systems Research Forum 2(1), 11–18 (2007)
Design for the Information Society Agata Bonenberg Faculty of Architecture, Poznan University of Technology, Nieszawska 13C, 61-021 Poznan, Poland [email protected]
Abstract. The aim of this paper is to discuss the accomplishments of contemporary design, focusing on its flexible and adaptive features, which meet the demands of migrating, mobile societies. With the expansion and popularization of information and communication technologies in recent decades, traditional space-use patterns have evolved. Divisions and borders between work and leisure, public and private, slowly lose their importance, and users often seek "multi-use", "multi-task" and "open space" solutions. The research is based on projects developed at the Faculty of Architecture, Poznan University of Technology, under the supervision of the author. Keywords: Evolution-based adaptation, adaptive change in modern design, modification, adjustment, reactivity, devices, utilitarian/common use/everyday objects.
which requires more technological and stylistic innovation than in the case of architecture. This is related to the fact that the removal of an unwanted architectural structure is a difficult undertaking. In order to secure a specified group of clients for a product, or a family of products, the designer must take into consideration the continuous change of their needs. This evolution focuses particularly on strategies for bonding users to a product, or a family of products. Analogies to the adjustment of everyday objects to external requirements may be found in similar processes existing in nature. The following paper therefore uses terms borrowed from the biological sciences, "evolution-based adaptation" and "adaptive changes", which illustrate well the rules governing modern requirements towards products. Just as living creatures have their specific adaptation strategies, designers create products applying the utility criteria particularly important for the product. The aim is to improve subsequent generations of objects with regard to ergonomics, and also to meet the functional and aesthetic needs of users in the constantly changing world of new technologies.
2 Evolution-Based Adaptation of Design Objects

Throughout their historical development, objects have evolved under the influence of technical possibilities and the development of science, ergonomics, fashion and various aesthetic trends [3]. This is particularly noticeable when we analyze them from a certain time perspective. The need to adapt was usually triggered by market conditions and was the driving force of the evolution of everyday objects. Some products and pieces of equipment existing for years have changed completely, some only to a certain extent; many have vanished. Example: the evolution-based adaptation of a tram:
Fig. 1. Evolution-based adaptation illustrated with the example of a tram with an electric drive (from left): 1A – a tram in Warsaw (1920), part of a postcard from Warsaw; 1B – a contemporary tram used in Poznan (2010), photo by the author; 1C – following the evolutionary path, an image of a "future tram", a project developed at the Faculty of Architecture, Poznan University of Technology, by Wacław Kłos, under the supervision of A. Bonenberg
14
A. Bonenberg
Several basic factors can be distinguished which modify the forms of everyday objects:
− Technology: new materials, engineering solutions
− Culture: fashion, painting, trends in aesthetics
− Economy: manufacturing, production and use costs
3 Adaptive Change in Modern Design

Designers take into account the necessity of making adaptive changes under the influence of the environment and external stimuli. This, however, must be preceded by a thorough analysis of the design problem: defining the task and collecting information on the currently available solutions. Many once-acquired features and properties of products are "inherited": the same features and properties reappear in subsequent editions of the products. An important task is to distinguish the general (unchanging) features of a given product from the features subject to adaptive change. Examples include equipment operated under extreme conditions (in hot or cold environments, or at height), using different types of power supply and under strict requirements for operational safety. All stimuli causing adaptive changes may be classified as development factors, determinants or stimulators. There are several types of adaptive change: modification, adjustment and reactivity.

3.1 Modification

Modification can be defined as a type of recomposition performed to meet the requirements of specific groups of consumers. In this case, the designers have to work with the object features and the environment taken into consideration, and establish:
• Practical and communication functions
• Aesthetic and semantic functions
• Function of indication and symbolic meaning1
Fig. 2. Contemporary variants of the design of a coffee machine: products intended for the same use, applying similar technology, adapted to the needs of different target groups. Each of the examples has a different aesthetic expression, a different semantic function and different symbolism.
1
After: Steffen, D.: Design als Produktsprache: der "Offenbacher Ansatz" in Theorie und Praxis. Verlag form GmbH, Frankfurt/Main (2000).
Fig. 3. A system of light-weight division walls which can divide living or working space within a few minutes, designed at the Faculty of Architecture, Poznan University of Technology. The system allows users to enjoy open spaces while keeping the basic functional divisions of the interior. (Author: M. Marchewicz, under the supervision of A. Bonenberg)
Fig. 4. An example illustrating adjustment of a wall-shelf system, which may be personalized to a large extent. The solution accommodates different loads on the system over the course of use and ensures easy access to articles of different shapes and sizes. The design was developed by G. D'Angelo under the supervision of A. Bonenberg, during the Design course at the Faculty of Architecture, Poznan University of Technology.
Fig. 5. A modular bus stop of changeable length, depending on the traffic intensity at a given transport hub. Each bus stop is equipped with at least one module with a digital timetable and information on the arrival time of the next vehicle. The design was developed by E. Erdmann under the supervision of A. Bonenberg, Faculty of Architecture, Poznan University of Technology.
While describing the phenomena related to adaptation, one should not forget behavioral adaptation, i.e. the modification or development of new products due to changes in space-use patterns. The interior design of living and working areas increasingly encourages flexibility and multi-purpose use, in response to lifestyle changes.

3.2 Adjustment

Adjustment is a relatively short-lasting recomposition involving reversible changes in an object. The aim is to meet the dynamically changing needs of its user.
Fig. 6. A folding desk intended for multi-functional living spaces. The desk is equipped with an integrated lighting system, drawers and two additional spaces for storing books. The design was developed by A. Szymczak under the supervision of A. Bonenberg, Faculty of Architecture, Poznan University of Technology.
3.3 Reactivity

Reactivity can be defined as sensitivity to environmental stimuli, together with the functional and structural adaptation possibilities of a given device. In modern design, reactivity is usually characteristic of objects that use information and communication technologies. Such devices initiate appropriate actions depending on phenomena in their environment.
4 Creative Process: Selecting the Strategy for Adaptive Changes

The adaptive changes specified above and the strategies of their implementation constitute the creative process. The stages of the design process for utilitarian objects, in the traditional understanding, are illustrated in Table 1. At each stage it is possible to apply the elements of adaptation, adjustment or reactivity. The sooner these concepts are taken into account in the design, the greater impact they have on the final result of the design process. Design is most often perceived as the introduction of adaptive changes related to human factors (aesthetics, the changeable understanding of beauty [5] related to cultural phenomena), while omitting the economic, ecological and technological factors, decisions on which are made solely by the manufacturer. Yet these areas have special potential for the introduction of new, innovative solutions. "Design is not just
Fig. 7. An example of reactive design: a conceptual project of a telephone with the function of detecting people within a predefined radius. The telephone "senses" similar devices in the proximity, making the profiles of available users visible. The idea of this invention is to facilitate face-to-face social contacts based on the technologies created for internet community networks.
what it looks like and feels like. Design is how it works" [6]: this quote from one of the most revolutionary entrepreneurs, Steve Jobs, seems to confirm this truth. The interest and role of a designer should go far beyond the question of form. The influence of the action schedule on the implementation of adaptive changes is illustrated in Table 2. However, regardless of the revolutionary nature of the applied concepts, utilitarian objects must always comply with the basic requirements of ergonomics, safety of use, transportation possibilities, storage, packaging and waste management. These
Table 1. A scheme of design actions: from a concept through readiness for production [4]. Source: teaching materials, Hans-Georg Piorek, Dipl.-Designer.

IDEA

Phase 1: SEARCHING / ANALYSIS (aim: recognition of the problem)
− Defining the task
− Gathering information
− Analysis of the present state
− Definition of the target group
− Briefing preparation

Phase 2: CONCEPT DEVELOPMENT (aim: solution variants)
− Division of functions
− Looking for solutions
− Creation of concept variants
− Assessment of the variants
− Establishing grounds for realization

Phase 3: DESIGNING (aim: problem solving)
− Ergonomics check
− Computer development of models
− Development of prototypes
− Assessment of the design
− Design approval

Phase 4: IMPROVEMENT / ELABORATION (aim: problem solving)
− Development of details
− Improvement of the general form
− Voting for the realization
− Expenditures check
− Approval for realization
issues are connected with the product-environment relation and the adjustment of living space to the abilities of the human body and mind. A special role is played by ergonomics, as it changes its scope of meaning and extends semantically: it evolves in the direction indicated by technological and civilizational development, becoming more sensitive to the needs and abilities of individuals.
5 Summary

Three pillars of the changeability of utilitarian forms can be distinguished:
1. Evolution-based adaptation of utilitarian objects
2. Adaptive changes in the modern design of utilitarian objects
3. Selection of the strategy for adaptive changes
Table 2. Influence of the action schedule on the implementation of adaptive changes (adaptive changes made by the designer).

− Analysis: The analytical part of the design process is the basis for selecting the direction of adaptive changes. Gathering information and analyzing the current state of the market helps in selecting the direction of changes. The analytical part concludes with a list of design requirements.
− Elaboration of the concept: At the concept stage it is possible to explore different solutions. While creating the variants of a concept and assessing them, it is worth extending the spectrum of possibilities and offers.
− Designing: At the stage of computer development of models and prototypes, changes may be introduced provided that they do not disturb the coherence of the concept. At this stage the design, and the adaptation, adjustment or reactive changes applied to it, are subject to assessment.
− Improvement of the elaboration: At the stage of improvement and elaboration, change is only possible within the range of the corrected elements: worked-out details and the improved general form. At this stage the possibilities for adaptive changes are limited.
These represent the rules crucial for contemporary design processes. The strategy of adaptation to market conditions and of improving subsequent generations of utilitarian objects decides the success in the difficult, constantly changing world of consumer demands. The evolution and diversity of design objects cannot be understood without the social, cultural and economic context, which is dominated in present times by the mobility of the information society.
References
1. Castells, M.: The Rise of the Network Society. PWN, Warsaw (2007)
2. Bonenberg, A.: Ergonomic Aspects of Urban and Social Dynamics. In: Vink, P., Kantola, J. (eds.) Advances in Occupational, Social, and Organizational Ergonomics, pp. 837–846. Taylor & Francis Group, Boca Raton (2010)
3. Fiell, C.P.: Design of the 20th Century. Taschen, Köln (2005)
4. Steffen, D.: Design als Produktsprache: der "Offenbacher Ansatz" in Theorie und Praxis. Verlag form GmbH, Frankfurt/Main (2000)
5. Bonenberg, W.: Beauty and Ergonomics of the Living Environment. In: Vink, P., Kantola, J. (eds.) Advances in Occupational, Social, and Organizational Ergonomics, pp. 575–581. Taylor & Francis Group, Boca Raton (2010)
6. Jobs, S.: The Guts of a New Machine. The New York Times (November 30, 2003), http://www.nytimes.com/2003/11/30/magazine/30IPOD.html
Classifying Interaction Methods to Support Intuitive Interaction Devices for Creating User-Centered-Systems Dirk Burkhardt, Matthias Breyer, Christian Glaser, Kawa Nazemi, and Arjan Kuijper Fraunhofer Institute for Computer Graphics Research, Fraunhoferstraße 5, 64283 Darmstadt, Germany {dirk.burkhardt,matthias.breyer,christian.glaser, kawa.nazemi,arjan.kuijper}@igd.fraunhofer.de
Abstract. Nowadays a wide range of input devices is available to the users of technical systems. In particular, modern alternative interaction devices, known from game consoles etc., provide a more natural way of interaction. Supporting them in computer programs, however, is currently a big challenge, because a high effort must be invested to develop an application that supports such alternative input devices. We therefore designed a concept for an interaction system that supports the use of alternative interaction devices. Its central element is a server, which provides applications with a simple access interface for supporting such devices. It is also possible to address an abstract device by its properties; the interaction system then takes over the mapping from a concrete device. To realize this idea, we also defined a taxonomy that classifies interaction devices by their interaction method and by the required interaction results, such as recognized gestures. By integrating this interaction system it becomes feasible to develop user-centered systems, because an adequate integration of alternative interaction devices provides a more natural and easier to understand form of interaction. Keywords: Multimodal Interaction, Human-Centered Interfaces, Human-Computer Interfaces, Gesture-based Interaction.
On desktop computers, by contrast, gesture-based interaction has so far not been very successful. On these systems the traditional interaction devices, mouse and keyboard, are the devices most often used to control applications. Only multi-touch monitors are used in some usage scenarios for easier interaction, e.g. in public domains. The reason is mostly the missing support in programs and applications. But even where alternative interaction devices are supported, their usage may not be adequate in all use case scenarios of a program. In different use cases, different interaction metaphors are needed to provide useful interaction. For example, when presenting a picture, abstract gestures are useful to instruct the viewer program to zoom or rotate the picture; but if the user navigates through the menu of such a program, only simple gestures are appropriate, like pointing at an entry or panning the display in a direction. Furthermore, interaction devices often provide additional technical features: e.g. the Nintendo WiiMote controller contains accelerometers, which are useful for supporting gesture-based interaction, but the WiiMote also utilizes an infrared camera, which allows it to be used as a pointing device. For an adequate support of modern interaction devices, all the possible interaction methods and paradigms should be supported, but it is a challenge to address different kinds of interaction methods when multiple modern interaction devices are used. In this paper we introduce a taxonomy of possible interaction methods for existing interaction devices. This taxonomy captures the currently applied interaction methods, based on the interaction systems and devices available today. We also describe a generic interaction analysis system we have developed, which handles different interaction devices and the forms of gestural interaction they support, based on the defined taxonomy of interaction methods.
Hence the system handles the technical aspects, such as detecting gestures or adapting, e.g., the pointing mechanism of the WiiMote to a physical display. By using this interaction analysis system, a developer no longer needs to spend effort on developing a gesture recognition system. Via an API the developer can also declare which interaction methods should be supported or disallowed in the program's use case scenarios. The interaction analysis system automatically determines the supported devices and their technical aspects in order to use the relevant sensors. Multiple interaction methods are supported, of course, so different kinds of interaction methods are possible in a single use case scenario.
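The paper does not publish the API itself, so the following is only a minimal client-side sketch of how declaring supported or disallowed interaction methods might look; all names (`InteractionClient`, `allow`, `disallow`, `accepts`) are illustrative assumptions, not the actual Fraunhofer API:

```python
# Hypothetical sketch of an application declaring which interaction
# methods the interaction analysis system should accept for its use case.
class InteractionClient:
    def __init__(self):
        self.allowed = set()
        self.disallowed = set()

    def allow(self, method):
        """Enable an interaction method for the current use case scenario."""
        self.allowed.add(method)
        self.disallowed.discard(method)
        return self

    def disallow(self, method):
        """Explicitly block an interaction method."""
        self.disallowed.add(method)
        self.allowed.discard(method)
        return self

    def accepts(self, method):
        """Check whether events produced by this method would be delivered."""
        return method in self.allowed and method not in self.disallowed

# A picture viewer might allow pointing and gestures but block speech input.
client = InteractionClient().allow("pointing").allow("gesture").disallow("speech")
print(client.accepts("gesture"), client.accepts("speech"))  # True False
```

The chained calls mirror the idea in the text that support is declared per use case scenario rather than per concrete device.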
2 Related Works

Different approaches exist for classifying interaction devices, i.e. for specifying existing input and output devices. Classifications exist especially for input devices, providing the possibility to group devices according to similar properties. In the following section we give a small overview of existing classifications for input devices. One of our goals is to use such a classification for conceptualizing and developing an interaction system; therefore we also introduce some existing interaction systems that provide intuitive usage.
22
D. Burkhardt et al.
2.1 Classifications for Interaction Devices

In the past, some effort was put into defining classifications of input devices, mainly with the goal of better understanding the devices, so that optimal devices could be found for specific tasks, or similarities between devices could be used to replace one device by another adequate one [1]. One of the first classifications was the taxonomy defined by Foley et al. [2]. The classification focused on graphical user interface applications and the typical tasks that can be performed in them, for example selecting a graphical object, orienting it, etc. Foley's taxonomy differentiates between devices that can perform these tasks and the way they can do it, in particular whether their active principle is direct or indirect. Buxton et al. [3] made the observation that there is a major difference in the way devices can produce the same output. Buxton called these differences pragmatics and further divided the devices by the way they produce the output (position, motion, pressure, etc.), ending up with a taxonomy which could represent simple devices. To finally be able to classify all input devices, even virtual ones, Mackinlay [4] took Buxton's taxonomy and built a formal approach which can be used to classify most input devices. Mackinlay describes an input device with a 6-tuple:
T = ⟨M, In, S, R, Out, W⟩

where:
− M is an operation manipulator that describes which value is changed by the device, e.g. Rz means rotation around the z-axis
− In is the input range in which the device can be manipulated, e.g. a touchscreen has an input range of [0, touchscreen size]
− S is the actual state of the device
− R is a function that maps between In and Out
− Out is the output range which results from In and the function R
− W describes the inner workings of the device
Fig. 1. Mackinlay's taxonomy of input devices [4]: (a) merge operations; (b) combined devices in Buxton's taxonomy
These tuples can be combined by three different merge operations to form more complex input devices:
─ Connect: connects the Out of one device with a fitting In of another device
─ Layout composition: indicates the position of two simple devices on a superordinate device
─ Merged composition: similar to layout composition, but the merged values define a new complex data type
It is hard to fit devices for video- or audio-based recognition into these taxonomies. In an alternative approach, Krauss [5] divides devices at the top level into coordinate-based and non-coordinate-based interaction devices, so his classification (see Fig. 2) is driven by the output values of the devices.
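As a sketch, Mackinlay's 6-tuple and the Connect merge operation above can be modeled directly; the field types, the range check, and the example devices below are simplifying assumptions (W, the inner workings, is omitted):

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class InputDevice:
    """Simplified model of Mackinlay's 6-tuple <M, In, S, R, Out, W> (W omitted)."""
    manipulator: str                      # M: which value the device changes, e.g. "Rz"
    in_range: Tuple[float, float]         # In: range in which the device can be manipulated
    state: float                          # S: current state of the device
    mapping: Callable[[float], float]     # R: function mapping In to Out
    out_range: Tuple[float, float]        # Out: output range resulting from In and R

def connect(producer: InputDevice, consumer: InputDevice) -> bool:
    """'Connect' merge operation: the producer's output range must fit
    within the consumer's input range for the two devices to be chained."""
    lo, hi = producer.out_range
    return consumer.in_range[0] <= lo and hi <= consumer.in_range[1]

# Hypothetical example: a touchscreen axis feeding a screen-coordinate mapper.
touch = InputDevice("Tx", (0.0, 300.0), 0.0, lambda x: x / 300.0, (0.0, 1.0))
mapper = InputDevice("Px", (0.0, 1.0), 0.0, lambda x: x * 1920.0, (0.0, 1920.0))
print(connect(touch, mapper))  # True: the normalized output fits the mapper's input
```

The reverse composition, `connect(mapper, touch)`, fails because the mapper's pixel output range does not fit the touch device's input range.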
Fig. 2. Classification of input devices by Krauss [5]
2.2 Gesture-Based Interaction Systems

Some effort has been invested in bringing gesture-based interaction to normal computers and into applications, to allow easier control. In particular, the interaction devices of game consoles are becoming interesting for use on normal computers, because they are well known and, next to their easy usage, they provide different methods for interacting with a system. Early interaction devices, like old data gloves, were designed for only one interaction scenario. In contrast, a modern controller like the WiiMote, an input device developed by Nintendo, can recognize acceleration in three directions and can therefore be used for gesture recognition, next to using its infrared camera as a pointing device or using its buttons to control an application directly. For using the WiiMote on a computer, different implementations are available. One of the most flexible programs for using the WiiMote as a gesture-based device is WiiGee1. Recognition is implemented by a statistical comparison of the acceleration inputs with previously trained acceleration data; WiiGee can then classify the observed sequence as a gesture [6][7].
1
http://www.wiigee.org
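To illustrate the general idea of matching observed acceleration sequences against trained ones, here is a deliberately simple nearest-neighbor sketch; note that this is not WiiGee's actual statistical recognizer, and all data and function names are invented for the example:

```python
import math

def distance(seq_a, seq_b):
    """Euclidean distance between two equal-length 3-axis acceleration sequences."""
    return math.sqrt(sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(seq_a, seq_b)))

def classify(sample, trained):
    """Return the name of the trained gesture closest to the observed sample."""
    return min(trained, key=lambda name: distance(sample, trained[name]))

# Hypothetical trained data: two tiny gestures recorded as (x, y, z) triples.
trained = {
    "swipe_right": [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "lift":        [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.0, 0.0, 1.0)],
}
sample = [(0.9, 0.1, 0.0), (1.8, 0.0, 0.1), (1.1, 0.0, 0.0)]
print(classify(sample, trained))  # → swipe_right
```

A real recognizer additionally has to handle sequences of different lengths, sensor noise, and rejection of non-gestures, which is why trained statistical models are used in practice.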
There are also video-based applications that recognize faces [8], and some early approaches recognize human gestures in video-based systems [9]. A more advanced approach is Microsoft's Kinect2, which uses depth images to recognize gestures. Speech recognition is a field which is receiving more and more attention and works well for a limited vocabulary. Many systems, such as cell phones, car navigation systems, operating systems and computer games, support speech recognition.
3 Classification for Interaction Methods

In a first approach to creating a gesture-interaction system (see Fig. 3) that supports different kinds of alternative interaction devices, we focused on supporting different kinds of devices [10]. During this work we recognized that many devices can be used for different interaction methods. For instance, the WiiMote is an accelerometer-based device which can be used for gestures; next to this it utilizes an infrared camera, which allows its use as a pointing device; and it can be used as a key-based controller because of the buttons provided on top of the controller. Furthermore, for some interaction methods various other devices exist which allow the same form of interaction. Representatives of 3-dimensional accelerometer-based devices are the WiiMote, the Playstation Move controller and most of the smartphones currently on the market.
Fig. 3. Architecture of our Gesture-Interaction-System [10]
By default, a separate implementation is necessary for every supported controller, and also for modules which, e.g., recognize gestures. This is not an adequate procedure, so we created another approach, which we describe in the following sections. To create an effective tool that also supports newly emerging alternative interaction devices, we need a concept for organizing them, and for reusing modules, like the gesture recognition module, multiple times. Therefore, we designed a multi-dimensional classification of interaction devices, grouped by interaction method and interaction result.

3.1 Classification of Interaction Methods

In the past, interaction devices were designed for a single use scenario; e.g. a 3D mouse was designed for interacting directly within a virtual 3-dimensional
2
http://www.xbox.com/de-de/kinect
Fig. 4. Classification taxonomy for the possible interaction methods
world. In contrast to these devices, the modern interaction devices of modern game consoles etc. utilize multiple technologies: the Playstation Move controller, for example, integrates accelerometers, a gyroscope, a compass and buttons, which in sum allow a manifold interaction style. For grouping interaction devices, we conceptualized a taxonomy that groups controllers by their basic underlying technology (see Fig. 4). Of course, every interaction device can be placed in multiple groups, depending on the required technical feature. This classification is necessary so that a developer can specify which kind of interaction method he wants to support, or what kinds of interaction devices he wants to allow to interact with his application. The important difference between this characterization and the existing approaches for classifying interaction devices is that we separate the devices by their technical use method and by which sensors etc. are responsible for this method.

3.2 Classification of the Interaction Results

In practical use, a user, and moreover a developer, has specific requirements on the results produced by an interaction device. If an interaction within a graphical user interface is planned, the results from the used controllers must be coordinates etc. This can only be ensured if the taxonomy provides the ability to filter devices by the result data they generate. When interacting through a graphical interface, the user can use different kinds of interaction devices, depending on the needed results, e.g. coordinates for a 3-dimensional environment. Today only 1-, 2- or 3-dimensional environments are known. But in view of further research approaches, or the use of, e.g., multi-dimensional
Fig. 5. Classification taxonomy for addressing the interaction result for a previously defined interaction method
combinations of multiple 3D environments, we also consider n-dimensional interaction results; of course, as of today we have found no real n-dimensional (n>3) interaction device, nor an environment that would require such a device.

3.3 Usage of Interaction Devices by Classifying Interaction Methods

Our taxonomy for classifying interaction methods to support intuitive interaction devices is a combination of the two sub-classifications described above. The result of coupling both classifications is an array in which every interaction device (restricted to input devices) can be classified (see the array in Fig. 6). With this array it is possible to determine interaction devices with similar features, and to address interaction devices that support a required interaction method and generate a preferred interaction result, such as complete coordinates within a graphical user interface.
Fig. 6. The entire classification taxonomy for classifying interaction methods to support intuitive interaction devices, in which every input device can be arranged
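The array of (interaction method × interaction result) cells can be sketched as a lookup table in which one device may occupy several cells; the device names and category strings below are illustrative assumptions, not the paper's actual vocabulary:

```python
from collections import defaultdict

# Each cell of the classification array is keyed by (method, result);
# a set per cell, since several devices can share one cell.
registry = defaultdict(set)

def register(device, method, result):
    """Place a device in the classification array; one device may occupy many cells."""
    registry[(method, result)].add(device)

def find(method=None, result=None):
    """Return all devices matching the requested method and/or result type."""
    return {d for (m, r), devs in registry.items()
            if (method is None or m == method) and (result is None or r == result)
            for d in devs}

# A device can be classified under multiple interaction methods (cf. the WiiMote).
register("WiiMote", "accelerometer-based", "gesture")
register("WiiMote", "optical", "2d-coordinates")
register("WiiMote", "key-based", "button-events")
register("PS Move", "accelerometer-based", "gesture")

print(sorted(find(method="accelerometer-based", result="gesture")))  # ['PS Move', 'WiiMote']
```

Querying one cell yields all devices that can satisfy a developer's declared method and result requirements, which is exactly the addressing-by-features idea described above.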
4 Concept for an Interaction System

To create an extensible interaction system, we conceptualized a system that follows the presented taxonomy. One of the most important features of the conceptualized system is the possibility to address devices with a specific interaction metaphor and a specific return value. This enables further applications to be developed on top of the interaction system, which can then select the preferred way of interacting within the application. In this way the developer of an application can, for instance, define that every kind of accelerometer-based device will be supported for gesture-based interaction. The consumer then has the choice to use such a device, and it does not matter whether it is a WiiMote, a Playstation Move controller or a smartphone with accelerometers. To achieve such a system, and also to support recognizing gestures with a device, every cell of the array equates to an abstract device. This will be
achieved by modeling every device as a virtual device. The processing of the data, e.g. computing the final coordinates or recognizing a performed gesture, is fully taken over by the interaction system. Via an API the application can send information to the server, such as the enabled kinds of interaction devices or additional supported gestures. As return values the application receives the coordinates or the performed gestures. The interaction system thus saves a lot of effort for developers who want to enable gesture-based interaction within their applications or programs. Fig. 7 shows the overall concept of our system. An application communicates with our system by sending XML messages over the network layer. Upon connection, the system sends an XML message that lists all available devices, and then expects an XML message describing which devices should be operated and how they should be connected. Every device is implemented as a plugin which the server can load automatically. This makes our system generic enough to support further, newly emerging devices such as Microsoft Kinect.
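The paper does not specify the XML schema of these messages, so the device-list message and element names below are a purely hypothetical sketch of how a client might parse the server's device announcement:

```python
import xml.etree.ElementTree as ET

# Hypothetical device-list message; the actual schema is not published in the paper.
device_list = """
<devices>
  <device id="wiimote-1" method="accelerometer-based" result="gesture"/>
  <device id="wiimote-1" method="optical" result="2d-coordinates"/>
  <device id="psmove-1" method="accelerometer-based" result="gesture"/>
</devices>
"""

def gesture_capable(xml_text):
    """Pick the ids of devices offering an accelerometer-based gesture interface."""
    root = ET.fromstring(xml_text)
    return sorted({d.get("id") for d in root.findall("device")
                   if d.get("method") == "accelerometer-based"
                   and d.get("result") == "gesture"})

# The client could answer with a message selecting these devices for operation.
print(gesture_capable(device_list))  # ['psmove-1', 'wiimote-1']
```

The same id appearing under several method/result pairs mirrors the classification array, where one physical device occupies several cells.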
Fig. 7. Architecture and data mapping of interaction devices by the Gesture Interaction System: (a) architecture of the interaction system; (b) mapping of the device data to the screen
The system in general consists of three components. The central and most important component is the server of the interaction system, which manages all available interaction devices as well as the modules for the supported interaction methods. Next to this central system, an API library is available so that the developer of a program can use the interaction system. The API is planned for C# (.NET) and for Adobe Flex, which allows web applications to support modern devices as well. The third component is a tool for configuring new gestures. This Learning Tool is designed only for modes in which gesture interaction is used; with it, developers can train new gestures for controlling their applications. The Learning Tool generates an XML configuration file, which can later be committed via the API to the server of the interaction system. In this way the server knows the new gestures and is able to recognize them.
D. Burkhardt et al.
5 Discussion

This interaction system is an additional system to support alternative interaction devices. Our main research scope lies in an adaptive visualization framework for visualizing semantic data. To enable an intelligent user interface, we have to support intuitive information visualization on the one hand, and on the other hand we have to provide an easy form of interacting with visualizations on the input side. For this purpose we need an interaction system that distinguishes between a specific method of interaction (and a specific type of result value, such as a performed gesture or coordinates) and the technique used to realize it, e.g., accelerometers. This is especially necessary if a kind of navigation other than a coordinate-based one is to be supported. For example, how can gesture-based interaction with graphs be provided? The challenge is that purely directional navigation is problematic: if, for instance, five nodes are positioned at the top right, it is hard to determine which node should be selected when a gesture toward the top right is performed. An alternative approach is therefore required to provide adequate gesture-based navigation through graphs. Similarly, it is hard to find a gesture-based approach for timeline and geographical visualizations.

Another point of discussion is the completeness of the taxonomy. We have so far applied it to currently existing input devices, especially those of game consoles, and we also tried to take most kinds of smartphones into account. For all these devices we determined the features they utilize in order to differentiate their interaction methods and forms. We then tried to determine the possible results that are required by external applications. In its presented form, the taxonomy cannot be guaranteed to cover all kinds of interaction. We supported all common kinds of interaction, but new approaches may appear that have to be taken into account, requiring an extension of the taxonomy.
6 Conclusion

In this paper we introduced our taxonomy for classifying interaction devices depending on their interaction metaphor and their interaction results. In contrast to other existing classifications, our approach is primarily driven by its use in adaptive semantic visualizations, i.e., by the need to select specific kinds of interaction devices or devices with specific forms of interaction. In addition, the taxonomy groups devices with similar forms of interaction, which helps to abstract these devices and their significant properties, and thus makes it possible to develop, e.g., gesture-recognition modules for such an abstract device and consequently for all related devices grouped under it. This classification idea is used in our interaction system, which allows the use of alternative interaction devices such as the WiiMote or the PlayStation Move controller. With this system, every currently available input device can be classified by its utilized interaction method and by its interaction result, e.g., coordinates or an identified gesture. The interaction system consists of two parts: the central server, which manages the devices and provides an interface for external applications, and the API used within applications, which communicates with the server and receives the results of interactions from the connected devices.
Classifying Interaction Methods to Support Intuitive Interaction Devices
The conceptualized system can be extended with further and newly emerging input devices, and for the determined input results, further modules can be developed to support, e.g., different kinds of gesture recognition. This can be useful when accurate recognition is required for normal interactions in applications, as well as when less accurate real-time recognition is needed, as in games. In general, integrating this interaction system makes it possible to develop a user-centered system, because an adequate integration of alternative interaction devices provides a more natural and easier to understand form of interaction.

Acknowledgements. This work was supported in part by the German Federal Ministry of Economics and Technology as part of the THESEUS Research Program. For more information please see http://www.semavis.com.
Evaluation of Video Game Interfaces Joyram Chakraborty and Phillip L. Bligh Department of Computer Science The State University of New York, The College at Brockport, Brockport, NY 14420 USA [email protected], [email protected]
Abstract. The interface is an essential part of every video game. However, research into understanding the modern game player's preferences is lacking. This paper reports preliminary findings from the evaluation of a computer game user interface aimed at determining specific user preferences.

Keywords: User Interface Evaluation, Video Game Design, End User Preferences.
1 Introduction

The game interface is an essential part of every video game. Regardless of how artistic, usable or functional the interface is, it remains the primary conduit of interaction for the game player. It is essential for game developers to understand the common problems associated with game interfaces as well as the analytical techniques used to solve them. However, little research has been carried out to understand end user preferences. Game developers rely on personal preferences and creative programming techniques and tools to develop games in the hope of successful market penetration. The purpose of this study is to evaluate the interface of a video game to gain an understanding of the end user.
2 Background

Research has indicated that the following four evaluation techniques are most commonly applied [1, 9, 10]:

1. The Cognitive Walkthrough technique measures the usability of an interface by using a cognitive learning model to evaluate how easily the interface can be learned.
2. The Heuristic Evaluation technique involves employing usability experts to inspect interfaces. They use predefined criteria to evaluate the problems with the interface. This has been found to be the most effective evaluation technique, but it relies on the availability of an expert.
3. The Pluralistic Evaluation technique is a process in which developers, users and experts walk through the interface together. Its advantage is the diversity of perspectives involved in the evaluation.
4. Formal Usability Inspections are a process in which human factors experts use a cognitive model of the task to evaluate interfaces. The advantage of this process is that the experts can perform a walkthrough more efficiently.
The literature indicates that heuristic evaluation methods are the most effective alternative to empirical testing. This technique is better at finding a larger percentage of design flaws in an interface, although its effectiveness relies heavily on the quality of the experts available [1, 7, 8, 9, 10]. The advantage of heuristics and their reliance on experts was also confirmed in a study that compared cognitive walkthrough and heuristic evaluation techniques using system designers and experts [7]. When the two methods were performed by experts, heuristics had a clear advantage; when only designers performed the evaluations, both methods performed equally well.

Over the last two decades, researchers have recognized the need to expand and modify heuristic evaluation approaches for video game interfaces. Modifying and creating new heuristic approaches had already been done in other contexts. For instance, in one non-game study, the author examined enhanced heuristic evaluation techniques [10]: heuristic criteria were combined and then analyzed for effectiveness, and after comparing the results the most effective heuristics were noted. In another study, researchers [6] introduced Heuristic Evaluation for Playability (HEP), a heuristic evaluation created specifically for the evaluation of gameplay. According to the findings, HEP was "reviewed by several playability experts and game designers." A comparative study of the HEP method and end user observation revealed specific problems that could only be found through observation. In a study on customizing evaluations for video games, Pinelle et al. divided video games into six genres: role-playing, sports, shooter, puzzle, strategy, and adventure [4]. They then took twelve common usability problems found in 108 different reviews (eighteen for each genre) and mapped their occurrences to each genre. The common occurrences of usability problems for each genre were shown in radial charts. After identifying problems common to specific genres, they discussed the implications those problems could have for evaluation techniques. In another study, researchers developed ten criteria for evaluating video game interface problems [5]. Five video game players with experience analyzing interfaces were recruited to evaluate a specific game using heuristic evaluation criteria developed from the researchers' previous work. After evaluating the game by playing it and applying the given criteria, the problems the evaluators reported were recorded, and the evaluators used a severity scale to classify them. There was significant overlap in reported problems, but some evaluators found unique problems. This study did not take into account the engagement or fun factor of the game.

Research on video game interface evaluation suggests that a different type of evaluation criteria must be developed for evaluating interfaces in games. Our study
attempts to prove that a classification of game players based on their interface preferences is possible. This new classification of users could be used as criteria for evaluating interfaces.
3 Methodology

A Windows-based gaming application called "Mayor City" was developed using Java and JOGL under the Eclipse platform. In "Mayor City" the goal of the player is to build a road from one side of a three-dimensional map to the other. The player starts with a small amount of seed money (used to build roads and buildings) and a population of zero. Each "game" day the player receives an income based on a percentage of the total income of all the money-making buildings they have built, plus a small base income. In order to build money-making buildings, the player must first build population buildings to meet the population requirements of the specific building.
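The daily income rule can be illustrated with a short sketch. The base income, percentage rate, and all names below are invented for illustration, since the paper does not give the actual values:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the daily income rule described above:
// a small base income plus a percentage of the total income of
// all money-making buildings the player has built.
public class IncomeDemo {
    static final int BASE_INCOME = 50;       // invented value
    static final double INCOME_RATE = 0.10;  // invented percentage

    static int dailyIncome(List<Integer> buildingIncomes) {
        int total = 0;
        for (int income : buildingIncomes) {
            total += income;
        }
        return BASE_INCOME + (int) (INCOME_RATE * total);
    }

    public static void main(String[] args) {
        List<Integer> shops = new ArrayList<>();
        shops.add(200);
        shops.add(300);
        // 50 base + 10% of 500 = 100
        System.out.println(dailyIncome(shops)); // prints 100
    }
}
```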
Fig. 1. Mayor City
The following elements were added to the game to measure the player's preferences:

1. The game input consists of mouse menu interaction and corresponding keyboard controls. Any action in the game that can be done with the mouse can also be done with the keyboard.
2. Two parts of the interface swap location during play. The first is the side menu, which swaps from the left side to the right side at random; this menu allows the user to select a picture of the building they are planning to build. The second is the game text: the top bar displays the population and the funds, and the bottom bar displays the game timer. These elements swap location at random.
3. The entire interface changes color between a cyan background with black text and a blue background with white text.
4. In "Mayor City" the sky darkens and brightens according to the time of day. The color of the sunsets and sunrises varies randomly between red and green.
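The random swapping of interface elements might be implemented along these lines. The class, the coin-flip scheme, and the variant names are assumptions, since the paper does not describe the implementation:

```java
import java.util.Random;

// Hypothetical sketch: the game flips independent coins to pick the
// layout and color variants whose preference the study measures.
public class LayoutRandomizer {
    enum MenuSide { LEFT, RIGHT }
    enum ColorScheme { CYAN_BLACK_TEXT, BLUE_WHITE_TEXT }

    private final Random rng;

    LayoutRandomizer(Random rng) { this.rng = rng; }

    MenuSide pickMenuSide() {
        return rng.nextBoolean() ? MenuSide.LEFT : MenuSide.RIGHT;
    }

    ColorScheme pickColorScheme() {
        return rng.nextBoolean() ? ColorScheme.CYAN_BLACK_TEXT
                                 : ColorScheme.BLUE_WHITE_TEXT;
    }

    public static void main(String[] args) {
        LayoutRandomizer r = new LayoutRandomizer(new Random());
        System.out.println(r.pickMenuSide() + " / " + r.pickColorScheme());
    }
}
```

Randomizing each variant independently means every participant sees both alternatives of every element over a session, which is what allows a binary preference question afterwards.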
A demographic and end user preference survey was developed. This survey instrument was pilot tested. A Microsoft Access database was developed to implement this survey.
4 Procedure

Permission was sought from the Institutional Review Board at SUNY Brockport prior to the start of this study. Subjects were recruited from undergraduate programs at SUNY Brockport and informed of their rights and their ability to cease participation at any time. Once they accepted the consent agreement, they were presented with the Ishihara color blindness test to screen for normal color vision. If they passed the test, they proceeded to the study itself. Subjects were then allowed to play the game for as long as they wanted. Each player was provided with a basic gameplay instruction document as well as verbal instructions and help as needed. After the conclusion of the game, the subjects were asked to fill out our demographic and game response survey in the Access database application. (For the complete set of questions, please email the author.)
5 Results

Data was collected from 24 undergraduate subjects from SUNY Brockport: 11 female and 13 male students of various disciplines who spoke English as their primary language. On average, these participants indicated that they had been browsing the Internet for approximately 5 years. They further indicated that they spent an average of 5 hours a day working on their computers, of which an average of 2 hours was spent on the Internet. The participants were asked to complete a binary-scaled questionnaire to gauge their interface preferences; for example, participants were asked to respond to the question "Which contrast scheme did you prefer?" using a binary scale. The table below shows the responses from the 24 participants. The results indicated that there were no clear preferences for specific user interface features, with the exception of mouse versus keyboard preferences. Surprisingly, the majority of the users preferred the keyboard, which was not in line with the literature findings. The results did not indicate any obvious gender differences, as both male and female participants showed similar interface preferences.
Age Ranges: 18 to 20: 8; 21 to 24: 12; 25 to 30: 1; 31 to 35: 2; 36+: 1
Gender: Male: 13; Female: 11
Preferred Text Contrast: Dark text on bright background: 13; Bright text on dark background: 11
Preferred Control Interface: Mouse: 6; Keyboard: 18
Preferred Sunset Sky Color: Green: 12; Red: 12
Preferred Sunrise Sky Color: Green: 13; Red: 11
Preferred Side Menu Location: Right-side: 17; Left-side: 7
Preferred Menu Background Color: Cyan: 13; Blue: 11
Preferred Date Text Location at Bottom: Yes: 17; No: 7
Preferred Population and Funds Text Location at the Top: Yes: 10; No: 14
6 Conclusions

This is a preliminary investigation of evaluation techniques for video game interfaces. The findings of this study indicate that there is clearly a need for further research on evaluation techniques. The end users in this study showed no obvious preferences, with the exception of the keyboard over the mouse. The results indicated that changing interface features, such as color and menu positioning, did not affect the end users.
The results from the study are not surprising given the generally high level of familiarity with computer-based gaming that the end users indicated. The enormous diversity of computer games available to end users has ensured that their gaming experience levels are high. The literature confirms these findings.
7 Future Work

This study is far from complete. This report is the first in a series of ongoing studies examining evaluation techniques in the development of gaming user interfaces. The next step would be to revise the questionnaires to correct any deficiencies. It would be interesting to observe whether the results can be replicated with a larger sample size. The Mayor City game could be further enhanced with more tasks and more interface options. In addition, this study should be replicated with different demographics, such as children or older adults, to gauge their interface preferences. The literature on best evaluation practices is inconclusive. An interesting methodology would be the use of eye tracking to help researchers analyze end user cognition patterns; this pattern analysis could be extended to other demographics, such as children and older adults, and the study could be replicated with users from other cultures. These findings would be of great interest to game developers as they seek to further the reach of video games by understanding the preferences of end users.
References

1. Hollingsed, T., Novick, D.G.: Usability inspection methods after 15 years of research and practice. In: Proc. of the 25th Annual ACM International Conference on Design of Communication, pp. 249–255. ACM Press, El Paso (2007)
2. Juul, J., Norton, M.: Easy to use and incredibly difficult: on the mythical border between interface and gameplay. In: Proc. 4th International Conference on Foundations of Digital Games (2009)
3. Barr, P., Noble, J., Biddle, R.: Video game values: Human-computer interaction and games. Interacting with Computers 19(2), 180–195 (2007)
4. Pinelle, D., Wong, N., Stach, T.: Using genres to customize usability evaluations of video games. In: Proc. 2008 Conference on Future Play: Research, Play, Share, Toronto, Ontario, Canada, November 3-5, pp. 129–136 (2008)
5. Pinelle, D., Wong, N., Stach, T.: Heuristic evaluation for games: usability principles for video game design. In: Proc. CHI 2008, pp. 1453–1462 (2008)
6. Desurvire, H., Caplan, M., Toth, J.A.: Using heuristics to evaluate the playability of games. In: Extended Abstracts of the 2004 Conference on Human Factors in Computing Systems, pp. 1509–1512. ACM Press, New York (2004)
7. Desurvire, H.W., Kondziela, J.M., Atwood, M.E.: What is gained and lost when using evaluation methods other than empirical testing. In: Proc. Conference on People and Computers VII, York, United Kingdom, pp. 89–102 (January 1992)
8. Jeffries, R., Miller, J.R., Wharton, C., Uyeda, K.M.: User Interface Evaluation in the Real World: A Comparison of Four Techniques. In: Proc. CHI 1991. ACM, New Orleans (1991)
9. Nielsen, J.: Usability inspection methods. In: Proc. ACM CHI 1995 Conf., Denver, CO, May 7-11, pp. 377–378 (1995)
10. Nielsen, J.: Enhancing the explanatory power of usability heuristics. In: Proc. SIGCHI Conference on Human Factors in Computing Systems: Celebrating Interdependence, Boston, Massachusetts, United States, April 24-28, pp. 152–158 (1994)
Emergent Design: Bringing the Learner Close to the Experience Joseph Defazio and Kevin Rand School of Informatics and Department of Psychology IUPUI – Indiana University Purdue University at Indianapolis 535 W. Michigan St., Indianapolis, IN 46202 {jdefazio,klrand}@iupui.edu
Abstract. The creative process of design is at the foundation of serious game and simulation development. Using a systematic approach, the designer of serious simulations or games analyzes the best approach for delivering an interactive learning experience; one that harnesses emergent forms of behavior, requiring both the learner and the technology to engage in an open-ended cycle of productive feedback and exchange. According to Collins [1], "Beyond simply providing an on/off switch or a menu of options leading to 'canned' content, users should be able to interact intuitively with a system in ways that produce new information. Interacting with a system that produces emergent phenomena is what I am calling interactive emergence" (4th Annual Digital Arts Symposium: Neural Net{work}).

Keywords: creative process, emergent design, serious game design, health education simulation.
The Suicide: Prevention, Intervention simulation is a work in progress that uses emergent design, coupling an intuitive interactive experience with a focus on usability. The authors' intention is to provide learners with an educational experience while they engage in interactive dialogue with virtual characters, in situations that increase awareness of potential suicidal behavior, promote and encourage effective communication with members of specific at-risk populations, and enhance participation and activation skills in working with members of a population, thereby improving mental health outcomes.
2 The Creative Process of Emergent Design

The creative process of emergent design is essentially a problem-solving process. In it, the designer is actively engaged in systematic problem analysis, much as an architect would design a blueprint for a building or a computer scientist would develop code structures for an application. Emergent design is an evolutionary process: the simulation begins to reveal its structure and performance attributes as the design evolves. According to EmergentDesign.org, "Emergent design holds both emergence—the bubbling up of new phenomena from the random interaction of individuals—and design—the planned progression of events towards a goal—in a dynamic embrace that maintains a higher level view of the process, and thus transcends the apparent duality by facilitating the interaction between the two approaches. It seeks to reduce the friction points that inhibit the free flow of information and to allow for experimentation towards a goal, so that product development is more like the leaves of trees reaching for the light than central planning." [4].

Within emergent design, this serious health education simulation serves as a way of testing specific hypotheses within the Interpersonal-Psychological Theory of Suicide, which posits that the motivation for suicide is a function of two psychological factors: belongingness and burdensomeness. For example, the interactive simulation could be configured to intervene only on belongingness or only on burdensomeness, and these versions of the game could then be compared in terms of their impact on reducing suicide risk (e.g., the two versions of the simulation could be compared at two different college campuses). This would answer empirical questions about which psychological need has a greater impact on suicidality. Moreover, hypotheses could be tested as to whether there is an additive benefit to intervening on both burdensomeness and belongingness, or whether intervening on just one is sufficient to reduce the risk of suicide. The goals for the Suicide: Intervention, Prevention simulation are presented in Table 1. Studying suicide is obviously a difficult endeavor given the ethical and moral issues surrounding it; hence, it is difficult to conduct research that involves experimental manipulation in order to determine causality. With this interactive serious health education simulation, however, researchers could experimentally manipulate important variables while still behaving ethically. In summary, this simulation would provide a modifiable tool that could be used to answer specific research questions related to suicide and expand our understanding of the mechanisms involved in suicidality.
Table 1. Goals for the Suicide: Intervention, Prevention Simulation

Goal: Raise awareness of risk factors. Example: Marked change in behavior or appearance.
Goal: Model appropriate intervention and help-seeking behaviors. Example: Caring, empathy, compassion.
Goal: Inform of resources for help-seeking. Example: Present sources for help and guidance.
Goal: Model appropriate follow-up behaviors. Example: Follow-up, provide continual support.
Cavallo claims that emergent design manages the overall process [5]. Emergent design uses the visual messages from artifacts in the process of systematic analysis. Kitamura, Kakuda, and Tamaki likewise support this claim, stating that emergent design uses "the concepts of evolution, adaptation, learning, and coordination" [6], which are addressed iteratively.
3 Using Artifacts to Inform the Design

In the design and development of simulations in health education, the authors use available turnkey systems for animators, content creators, visualization and special effects. Applications can be developed easily that provide 3D motion data, volumetric 3D meshes, and video surface textures to the game design team, reducing production time. For game developers, the entire production pipeline is shortened, production-ready animation data is available instantly, and animators can implement characters, graphics, and animated sequences without extensive design and development production.

Fig. 1. 3D Modeled Character

Fig. 2. Street Scene

For game designers, instant 3D characters are created in seconds. Game and simulation designers and developers have access to faster, cheaper, and better motion capture for 3D applications. The process of design incorporates both the representation of the artifact being designed and the process of development by which the design is completed [7]. Design can be defined as the intention of making original plans, patterns, and/or schemes towards an aim or goal. For example, Figure 1 shows a 3D character designed using Mixamo, Figure 2 shows a screen capture from Sims 3, and Figure 3 shows the character embedded into the street scene [14]. The character developed in Mixamo is animated in a walk cycle; it can be tweened to move along the sidewalk path depicted in the street scene.

Fig. 3. Embedded Character/Scene
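Tweening a character along a path amounts to interpolating its position between waypoints. The following sketch is generic linear interpolation under invented names, not code from the authors' pipeline:

```java
// Hypothetical sketch: linear tween of a character's 2D position
// between two waypoints of a sidewalk path.
public class TweenDemo {
    static double lerp(double a, double b, double t) {
        return a + (b - a) * t; // t in [0, 1]
    }

    // Returns {x, y} at fraction t of the way from start to end.
    static double[] tween(double[] start, double[] end, double t) {
        return new double[] {
            lerp(start[0], end[0], t),
            lerp(start[1], end[1], t)
        };
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {10.0, 4.0};
        double[] mid = tween(a, b, 0.5);
        System.out.println(mid[0] + ", " + mid[1]); // prints 5.0, 2.0
    }
}
```

Chaining such interpolations over the waypoints of the sidewalk path moves the walk-cycle character smoothly through the scene.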
Although Seels and Richey define design in the context of instructional design, as "The planning phase in which specifications are constructed" [8], the same premise can be used for serious game and simulation design. Development can be defined as a stage or phase in the gradual growth, evolution, or advancement towards an aim or goal; it is "The process of translating the design specifications into physical form" [8]. Design and development are related processes, and both have goals and outcomes for the product to be developed. One of the biggest time constraints in game and simulation design and development is the design and animation of a 3D character: "Animating an articulated 3D character currently requires manual rigging to specify its internal skeletal structure and to define how the input motion deforms its surface" [9]. Within the emergent design process, the authors engaged in rapid game development as the main context for design and development, exploring several game engines and software applications that would allow quick prototyping. Mixamo [10] was used for character development and animation. The Sims 3 simulation game was used for graphics, backgrounds, and assets (e.g., furniture, home content, etc.).
The development of serious educational games and simulations has proven to be complex, time-consuming and costly [11], [12]. Serious game and simulation designers and developers continually face time constraints and demands on the delivery of the end product. One issue at the forefront is how to reduce development and production time. According to Pagulayan et al., "There is a great deal of pressure on designers to utilize new technologies which may break old interaction models" [13]. The authors recognize these limitations and have adopted emergent design as the driving factor in the design of the educational simulation titled Suicide: Prevention, Intervention.
4 Conclusions

The goals of the Suicide: Prevention, Intervention simulation are to provide an engaging interactive experience that educates the learner about suicide prevention. The objectives are to: 1) raise awareness of risk factors (e.g., marked change in behavior or appearance, agitated or depressed mood, drug or alcohol abuse, etc.); 2) model appropriate intervention and help-seeking behaviors; 3) inform of resources for help-seeking; and 4) model appropriate follow-up behaviors. The authors of this article are actively designing and developing a serious simulation titled Suicide: Intervention, Prevention. This work in progress is being developed as an educational simulation (serious game) that imparts skills, knowledge, and values by allowing the learner to interact with virtual characters who exhibit mental health issues (e.g., suicidal or harmful behavior). Learners are able to think, interact, and solve behavioral problems in social situations that demand effective communication, analysis, and response in various health education scenarios. Using four programmed scenarios, the learner engages with distinct issues regarding suicidal thoughts as portrayed by characters in the simulation. This article has focused on the creative process of emergent design and the usability issues affecting the overall production flow of this serious simulation.
References

[1] Collins, D.: Breeding the Evolutionary: Interactive Emergence in Art and Education. Paper given at the 4th Annual Digital Arts Symposium: Neural Net{work}, April 11-12. University of Arizona, Tucson (2002)
[2] Kahn, K., Pattison, T., Sherwood, M.: Simulation in Medical Education. Medical Teacher 33(1), 1–3 (2011)
[3] Zyda, M.: From Visual Simulation to Virtual Reality to Games. IEEE Computer 38(9), 25–32 (2005)
[4] EmergentDesign.org: What is Emergent Design (2009), http://emergentdesign.org/chrysalis/topics/emergence/
[5] Cavallo, D.: Emergent Design and Learning Environments: Building on Indigenous Knowledge. IBM Systems Journal 39(3), 768–781 (2000)
[6] Kitamura, S., Kakuda, Y., Tamaki, H.: An Approach to the Emergent Design Theory and Applications. Artificial Life and Robotics 3, 86–89 (1998)
[7] Dym, C.L.: Engineering Design: A Synthesis of Views. Cambridge University Press, Cambridge, MA (1994)
[8] Seels, B.B., Richey, R.C.: Instructional Technology: The Definition and Domains of the Field. Association for Educational Communications and Technology, Washington, DC (1994)
[9] Baran, I., Popović, J.: Automatic Rigging and Animation of 3D Characters. ACM Transactions on Graphics (TOG) 26(3), 72–78 (2007)
[10] Mixamo: Animation in Seconds (2010), http://www.mixamo.com
[11] Nadolski, R., Hummel, H., van den Brink, H., Hoefakker, R., Slootmaker, A., Kurvers, H., Storm, J.: Emergo: Methodology and Toolkit for Efficient Development of Serious Games in Higher Education. In: Electronic Games, Simulations and Personalized eLearning (2007)
[12] Westera, W., Hommes, M.A., Houtmans, M., Kurvers, H.J.: Computer-Supported Training of Psychodiagnostic Skills. Interactive Learning Environments 11(3), 215–231 (2003)
[13] Pagulayan, R., Keeker, K., Fuller, T., Wixon, D., Romero, R.: User-centered Design in Games (revision). In: Jacko, J., Sears, A. (eds.) Handbook for Human-Computer Interaction in Interactive Systems. Lawrence Erlbaum Associates, Mahwah (2006) (in press)
[14] Defazio, J., Rand, K., William, A.: Serious game design: Reducing production time constraints. In: 1st Annual Midwest Health Games Conference, Indianapolis, USA (2010)
[15] Kharrazi, H., Faiola, A., Defazio, J.: Healthcare Game Design: Behavioral Modeling of Serious Gaming Design for Children with Chronic Diseases. In: Salvendy, G., Jacko, J. (eds.) Proceedings of the 13th International Conference on Human-Computer Interaction, San Diego, CA. Lawrence Erlbaum, Mahwah (2009)
Eliciting Interaction Requirements for Adaptive Multimodal TV Based Applications Carlos Duarte, José Coelho, Pedro Feiteira, David Costa, and Daniel Costa LASIGE, University of Lisbon, Edifício C6, Campo Grande 1749-016 Lisboa, Portugal {cad,pfeiteira}@di.fc.ul.pt, {jcoelho,dcosta,thewisher}@lasige.di.fc.ul.pt
Abstract. The design of multimodal adaptive applications should be strongly supported by a user-centred methodology. This paper presents an analysis of the results of user trials conducted with a prototype of a multimodal system, in order to elicit requirements for the multimodal interaction and adaptation mechanisms being developed for a framework supporting the development of accessible ICT applications. Factors related to visual and audio perception and motor skills are considered, as well as multimodal integration patterns.

Keywords: Multimodal interaction, Adaptation, User trials.
from doing it. This means that a whole new set of interaction techniques will have to make their way into the living room. Gesture recognition will become part of the interaction experience, as is already happening, first with the Nintendo Wii and more recently with the Microsoft Kinect. But we can expect voice recognition also to become part of this new interactive TV scenario, as well as alternative "remote controls" such as smartphones or tablet devices. Besides this variability of environments and interaction devices, the target audience makes this a truly inclusive problem, since everyone can be a user of such platforms, regardless of age, knowledge and physical, sensorial or cognitive abilities. Moreover, each user will have his or her own preferences. Users will be more comfortable with one interaction device than with others, and even that may vary depending on the task they are accomplishing. One approach to tackling the problems raised by all these variables (environment, devices and users) is to employ adaptation techniques. These allow an intrinsically multimodal system to exploit to the maximum the advantages of multimodal interaction, most notably the possibility of interacting naturally, with the added benefits in terms of learnability and ease of use, and of letting users choose the modality that is most adequate, based on their appraisal of the current situation [1]. Adaptation will allow a system to exploit the knowledge it might have about its users in order to provide the most appropriate means of interaction for a given task in a given environment, while being aware of the context-defining parameters [2]. To support efficient and effective adaptation in a multimodal setting, it is of the utmost importance to correctly identify the adaptation variables and the interaction patterns users reveal. This paper reports on the efforts made to tackle the aforementioned issues in the context of the European Union funded GUIDE project (www.guide-project.eu).
By employing a user-centred approach, interaction requirements are being elicited in order to understand how users' abilities impact their perception and their use of different interaction means. This has allowed for the identification of several interaction requirements, initially targeted at specific modalities (e.g., visual presentation issues). Simultaneously, we have been observing how users integrate multiple modes of interaction when offered the possibility to explore them in a combined or independent fashion.
2 Context

The studies reported in this paper were made in the context of the European Union funded project GUIDE (Gentle User Interfaces for Elderly Citizens). GUIDE aims to provide a framework and toolbox for adaptive, multimodal user interfaces that target the accessibility requirements of elderly users in their home environments, making use of TV set-top boxes as a processing and connectivity platform. GUIDE envisions bringing to users the benefits of multimodal interaction, empowering interaction through natural and seamless interaction modes that do not require learning and that will be able to convince users to adopt them [3]. This includes the modalities used in natural human-human communication, such as speech and pointing. Additionally, being based on the TV as the central processing
hub, GUIDE also encompasses remote controls as another interaction modality. Moreover, in an attempt to explore novel interaction scenarios, and even to promote mobility in the explored interaction settings, GUIDE includes tablets as an additional device that can be employed for both input and output modalities. The majority of the system's output will be transmitted, as expected, through the TV set. Complementing it, besides the aforementioned tablet, is the possibility of haptic feedback through the remote control. As can easily be perceived from the above description, GUIDE aims to implement a fully multimodal system, offering its users a range of modalities that they can explore in order to address both their impairments and the context variations that might come into being. In order to avoid the excessive configuration complexity that such richness might impart, GUIDE includes a set of adaptation capabilities to harness the complexity of managing its full array of modalities. As such, adaptation will impact both multimodal fusion, by taking advantage of known multimodal interaction patterns and by adapting individual recognizers in order to increase their efficiency, and multimodal fission, by generating output presentation customized to the system's current user, considering a diversity of factors, the most important being the user's perceptual, motor and cognitive skills and abilities. Adaptation will also impact the dialogue management component, thus addressing the interaction flow with applications based on users' abilities. In order to support such system characteristics it is fundamental to employ a user-centered design methodology. The user is the pivotal subject for the system's adaptation mechanisms. User information drives the adaptation, and knowledge about the user is fundamental for the fusion and fission processes and the dialogue management.
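As a rough illustration of what time-window-based multimodal fusion involves (this is an illustrative sketch, not GUIDE's actual implementation; all names, the window length and the scoring rule are our own assumptions), a late-fusion step might merge recognizer events as follows:

```python
from dataclasses import dataclass

@dataclass
class ModalityEvent:
    modality: str      # e.g. "speech" or "pointing"
    target: str        # interpreted target, e.g. a button identifier
    timestamp: float   # seconds since interaction start
    confidence: float  # recognizer confidence in [0, 1]

def fuse(events, window=2.0):
    """Merge events falling within `window` seconds of the earliest one.

    Redundant events (both modalities naming the same target) reinforce
    each other; conflicting events are resolved by summed recognizer
    confidence. Returns the winning target, or None if no events exist.
    """
    if not events:
        return None
    events = sorted(events, key=lambda e: e.timestamp)
    first = events[0]
    group = [e for e in events if e.timestamp - first.timestamp <= window]
    scores = {}
    for e in group:
        scores[e.target] = scores.get(e.target, 0.0) + e.confidence
    return max(scores, key=scores.get)
```

For example, a pointing event and a speech event that redundantly name the same option within the window would be fused into a single selection of that option; an adaptive system could additionally weight each modality's confidence by what it knows about the user's abilities.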
As a result, a profound characterization of its target users group is one of the project’s milestones, and several iterations of user studies are planned in the scope of this project. The next section details the first of these iterations.
3 User Studies

As mentioned above, in order to collect the information required for the characterization of the target population, several user studies have been planned. These will proceed in an iterative fashion, intermingled with development, allowing for updated feedback on evolutions of the envisioned framework and its deployment. In this section we describe the set-up of the first user study conducted in the scope of the project. Given that this was the first study, the possibility of using low-fidelity prototypes was considered. This, however, was readily discarded, given that we wished to capture reactions to what the technology could offer its users as close to reality as possible. Having already available the majority of the individual technologies envisioned for the project, we opted to build a multimodal system to explore individual reactions to the interaction means made available, and to study patterns of usage [4]. The recognizers already supported by the system include pointing (either freehand, through a motion sensing system, or with a remote control; in our case, we used the Nintendo Wii remote control). Speech recognition was not included at this time, but tasks accepting speech input were nevertheless included in the studies, in order to assess how representatives of the target population employ it. Output was
achieved through a TV, including text, audio and video. An avatar was also used; with it we attempted to understand if it could contribute to the user's empathy towards the system. The script included tasks to exercise perceptual, motor and cognitive skills. With them we aimed at exploring how users perceive and prefer visual elements (font size, font and background colors, object placement, etc.) and audio stimuli (volume). We also included a set of tasks to exercise the user's cognitive abilities. Furthermore, even though in some tasks users were free to interact in any way they wished, including combining modalities, we included tasks requiring users to explicitly use more than one modality, in an attempt to observe which integration patterns were used. The trials were conducted over a period of one week. Nine elderly people participated: five men and four women. Their average age was 74.6 years, the youngest being 66 years old and the oldest 86. Sessions lasted about 40 minutes, in which participants familiarized themselves with the available means of interaction and executed the required tasks. In the next section we discuss our findings, based mostly on qualitative findings from the trials' observation and on the participants' remarks made during and after the trials. It should be stressed that these were the first trials, which is why they had such a small number of participants. Currently, a new set of trials is underway, involving several scores of participants, in which findings from the first trials, reported here, have already been used to improve the system's performance.
4 Discussion

As mentioned previously, the trial's script took into consideration visual, audio and motor requirements. Moreover, in some tasks participants were free to choose how they wished to interact with the trial's application, while in other tasks they were specifically requested to interact with a specific modality or combination of modalities. In most of the script's questions participants had to perform the same task with different modalities or with a different rendering parameter, and after performing the task they had to report their favorite configuration for that task. In some other questions participants were simply asked to select one of several differently rendered presentations without having to explicitly perform any task. The trials began with a simple explanation task, in which participants were asked to select one of four options presented visually on screen as four buttons. They had to do so through finger pointing, using the Wii remote to point, speech, and a combination of speech and one of the pointing options. Afterwards, different presentations were explored: target placement (near the center or near the edges of the screen), target size, target spacing, target color, text size and text color. Only for the color presentations did participants simply select their preferred option. In all others, they had to perform a task, be it selecting one of the targets or reading the text aloud. The trial then proceeded with an audio rendering task in which participants were asked to repeat text that was rendered to them through a TTS system at different volume levels. This was followed by motor tests, in which users had to perform gestures (not simply pointing) both without surface support and using a tablet-emulating surface.
Afterwards, users had to perform a selection task in all combinations of output modality (options presented visually, aurally or both combined) and input modality (selection made through pointing, speech or both combined). Finally, they had to assess what would be their preferred modality for being alerted to an event when watching TV and when browsing photos on the TV. Options included on-screen text, TTS, the avatar, or combinations of the three. The test ended with a comparison between the avatar and a video of a person presenting the same content, to assess the avatar's ability to generate empathy with future system users. In the following paragraphs we report our findings, based on the participants' expressed opinions but also on our observations, in situ and of the trials' recordings.

4.1 Visual Perception

Regarding visual presentation there were two main foci: targets (e.g., buttons) and text. Targets were analyzed according to their placement, size, separation and content. Concerning target placement, most participants (6) preferred targets near the center of the screen rather than near the edges. Most participants (7) preferred the larger targets. The majority (6) preferred the version with greater separation over the version with the targets closer together. Reasons given for this preference include the targets being easier to see (and understand), in addition to movement-related issues (being easier to select). Regarding the content of targets, participants showed a clear preference for solutions that provide strong visual contrast. The more popular choices were white text on a black or blue background, and black text on a white background, with a single participant electing blue text on a yellow background. There seemed to be some consensus among participants that strong colors are tiring. Concerning text presentation, both size and color were evaluated.
Six text sizes were presented to participants. The largest was a 100-pixel font (meaning that approximately five words would fill half the TV screen), and the smallest a 12-pixel font. Intermediate sizes were 80, 64, 40 and 24 pixels. Only one participant preferred the largest font. Five participants preferred the second largest size and three participants the third largest. No participant preferred any of the three smallest sizes.

Table 1. Participants' preferences regarding text color (in rows) for different background colors (columns)
Background:    White   Black   Blue   Green
White text       -       7       8      2
Black text       7       -       -      5
Blue text        -       -       -      2
The remaining five single votes went to other text/background combinations (red, green, orange, yellow and gray text were also offered).
Text color was evaluated against different background colors. Text colors considered were white, black, blue, red, green, orange, yellow and gray. Background colors used were white, black, blue and green. Participants mostly opted for high-contrast combinations. Table 1 shows the participants' expressed preferences, with text color in the rows and background color in the columns. Values in the table's cells represent the number of participants who selected that particular combination of text and background color.

4.2 Audio Perception

The audio tasks evaluated participants' ability to perceive messages at different volumes. Five volumes were employed. The test started at the loudest setting and then decreased each time by half the previous volume; after reaching the lowest volume the procedure was repeated, but increasing the volume by complementary amounts. Three participants preferred the loudest volume, with all but one participant preferring one of the three loudest volumes. However, some participants noted that the highest volume was, in their opinion, too high. One interesting finding, reported by some participants, was that their comfortable audio level differed depending on whether the volume was decreasing or increasing. For instance, one participant reported she could understand the spoken message only at the first two volumes when the volume was decreasing, but could understand the three loudest volumes when the volume was increasing. Other examples of such behavior were observed and reported. In addition to reporting their preferred volume setting, several participants also reported being comfortable with the three loudest volumes.

4.3 Motor Skills

Participants found both free-hand pointing and pointing with the Wii remote to be natural interaction modalities, meeting our goal of providing interaction modalities that are natural and do not require learning to be used.
When comparing free-hand pointing and Wii pointing, the majority (6 participants) preferred free-hand pointing. Nevertheless, there were some interesting remarks. One participant stated she preferred the Wii remote because it reminded her of the remote control she is used to handling when watching TV at home. Other participants changed their preferences during the trial, after becoming more accustomed to both options. These participants typically moved from an initial preference for the remote to a preference for free-hand pointing. For the design of pointing interaction devices it was possible to gather some relevant indications. For instance, one participant pointed almost exclusively with her finger, barely moving her arm and hand, which was especially challenging for the motion tracking system. Other participants changed the pointing hand depending on where the target was on screen, using the left hand to point at targets on the left side of the screen and the right hand for targets on the right side. This knowledge can be used, for instance, to adapt the presentation based on the abilities of the user's arms. Participants were also asked to perform a set of representative gestures (e.g., circling, swiping) both in free air and on a surface representative of a tablet. Eight of
the participants preferred performing the gestures in free air. Some participants justified this preference by saying that in that manner they could be more expressive. Others reported that making gestures in free air is similar to making gestures when engaged in conversation with another person, and thus feels more natural. Another raised the issue that when performing gestures on the tablet she would not be able to see any content that might be displayed on it. Participants were also asked whether they preferred to perform those gestures with one hand or with both hands. Four participants expressed no preference in this regard, with two preferring two-handed gestures and three preferring one-handed gestures. Regarding their preferences and abilities when asked to perform pointing tasks with targets in the four corners of the TV screen, it was not possible to identify a clear tendency in the collected results. One participant found it easier to point at the top right, two preferred the bottom right, one preferred the top edge, one preferred the bottom edge, and four did not express any preference.

4.4 Specific Modality and Multimodal Patterns

One important topic to address when using both gestures and speech is how users combine them, specifically whether and how they make use of deictic references. One initial observation was that the purpose of combining modalities was not clear to all participants. This could, however, be attributed to their being required to combine modalities to do the same task they had just done with a single modality. Most participants employed multiple modalities in a redundant fashion, speaking the option's text and pointing at it. From the interaction analysis it was possible to identify some integration patterns. Four participants pointed before speaking, while another four spoke before pointing. For the remaining participant no clear tendency was found. A few participants combined pointing with spoken deictic expressions.
In one of the trials the participant used a deictic reference while pointing, and then followed it by speaking the option's text, introducing a different interaction pattern. Multimodal interaction was explicitly exercised in the context of a selection task performed with visual, audio and combined presentations, and with selection possible through pointing, speech and combined usage of these two modalities. Regarding the presentation, participants unanimously expressed a preference for the system to employ redundant visual and audio presentation (all 9 participants selected this option). Observations of some participants' behavior showed that when the system presented options both visually and aurally, they did not wait for all options to be presented, answering as soon as they perceived the right answer. Regarding the input, seven participants were satisfied with having simply speech recognition, commenting that it is much easier to perform selection in this fashion. Two participants expressed a preference for a system in which they can combine pointing and speaking. Different combinations of modalities to alert the user were also considered, in the context of two different scenarios: watching TV and browsing pictures on the TV screen. For the TV watching scenario preferences varied. Two participants preferred the alert to be rendered using speech synthesis only. Another two preferred to be alerted by avatar and text message, four preferred text and audio, and one the avatar
only. In the photo browsing scenario similar variability was found. Two participants preferred alerts through speech synthesis only, two preferred the avatar and text, one preferred text and audio, and two preferred the avatar only. Other interesting observations regarding the use of multiple modalities were made. Some participants, even in tasks where they were required to select only by pointing, ended up also speaking the option's text, without even noticing it. In one case, even after being asked to point at the correct option, a participant just spoke the option's text. Concerning the use of speech alone for selection operations, it is important to understand whether users simply read aloud one of the choices presented to them or use alternative speech. Although most participants simply read aloud one of the presented options, we also witnessed some natural utterances, like saying "the fourth" instead of "option 4", which was the text presented on screen.

4.5 Other Topics

Although they were not available to be used during the trial, participants were shown a keyboard and a mouse and were asked whether they would like to have them available to interact with GUIDE. The majority (6 participants) did not. Some even expressed that they could not really understand why the system should employ those devices if users could interact through the natural means they had already tried (speech and gestures), while others simply stated that those interaction means are harder to use than speech or pointing. The use of an avatar in the system was also assessed. The participants' reaction to the avatar was not as positive as expected. There are some justifications for this: the avatar employed was too small and was not properly configured with respect to emotion expression. However, it was possible to gain some knowledge about the use of an avatar in the context of the proposed system.
Some participants expressed a wish for the avatar to look better, less cartoonish, in order to make them feel better about it. An important observation was that four out of the nine participants responded to the avatar's greeting message as if they had been greeted by a person. This is an indication that avatars can be used to promote bonding between users and the system.
5 Conclusions

Due to their intrinsic nature, the design of multimodal interaction systems should be strongly founded on user-centred design techniques. That is exactly what is being done in the context of the GUIDE project, and some of the initial findings have been reported in this paper. User trials with a multimodal system are being conducted in order to characterize user abilities, and also to inform future evolutions of the system. These results are expected to have impact both at the individual modality level and through the multimodal integration patterns found. This paper presented a summary of the findings, and of the impact that users' visual perception, audio perception and motor skills can have on the interaction design of
such systems. Additionally, some multimodal interaction patterns were observed and reported. What is clearly supported by the data acquired so far is the need for adaptation mechanisms in order to provide adequate interaction mechanisms to a user population with such a diversity of abilities. One example of how adaptation could be exploited became evident as a result of the observations conducted so far: participants selected their pointing hand based on where the target was on screen. This knowledge can be used to decide on presentation details. For instance, if we know the user has impairments affecting the left hand, we can present the selectable targets on the right side of the screen, offering what should be a more comfortable interaction experience. These user trials are ongoing, so further observations and insights are expected in the near future and will be reported in due time.
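The hand-placement adaptation rule described above can be sketched very simply. This is an illustrative example only; the `profile` dictionary, its keys and the 0.2 threshold are our own assumptions, not part of the GUIDE framework:

```python
def target_side(profile):
    """Decide on which side of the screen to place selectable targets.

    `profile` is an assumed user-profile dictionary with per-arm ability
    scores in [0, 1], e.g. inferred from observed pointing behaviour.
    Returns "left", "right", or "both" when no clear difference exists.
    """
    left = profile.get("left_arm", 1.0)
    right = profile.get("right_arm", 1.0)
    if abs(left - right) < 0.2:   # no clear asymmetry: keep default layout
        return "both"
    return "right" if right > left else "left"
```

A fission component could consult such a rule when generating the output presentation, so that a user with an impaired left hand sees selectable targets on the right side of the screen.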
References
1. Oviatt, S., Darrell, T., Flickner, M.: Special issue: Multimodal interfaces that flex, adapt, and persist. Commun. ACM 47(1), 1 (2004)
2. Duarte, C., Carriço, L.: A conceptual framework for developing adaptive multimodal applications. In: Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 132–139. ACM Press, New York (2006)
3. Dumas, B., Lalanne, D., Oviatt, S.: Multimodal interfaces: A survey of principles, models and frameworks. In: Lalanne, D., Kohlas, J. (eds.) Human Machine Interaction. LNCS, vol. 5440, pp. 3–26. Springer, Heidelberg (2009)
4. Duarte, C., Feiteira, P., Costa, D., Costa, D.: Support for inferring user abilities for multimodal applications. In: Proceedings of the 4th Conferência Nacional em Interacção Pessoa-Máquina, Aveiro, Portugal (2010)
Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications Peter Forbrig, Anke Dittmar, Jens Brüning, and Maik Wurdel University of Rostock, Department of Computer Science, Albert Einstein Str. 21, 18055 Rostock, Germany {peter.forbrig,anke.dittmar,jens.bruening, maik.wurdel}@uni-rostock.de
Abstract. This paper discusses approaches for specifying workflows based on task models. These task models represent activities of stakeholders in different ways. It is shown how the development process of workflow specifications can be supported to obtain hierarchical, structured and sound specifications. Furthermore, a language, CTML, is introduced that was developed to specify activities in smart environments. The language also has the potential to be used for general workflow specifications. It is demonstrated how cooperative work can be specified using this language.

Keywords: Stakeholder-driven Specifications, Business process modeling, Workflow specifications, Task Modeling.
developed. Despite the fact that a lot of other languages have been developed, the CTT notation is currently the one most often used in publications. In addition to its application during requirements engineering, task models can also be used for modeling business processes or workflows. This idea is not new. Traetteberg already mentioned in [22] that "Workflow concepts are compatible with task modeling concepts, although they have different foci and strengths". Furthermore, task trees generally are structured workflow models [11] that provide further benefits. In [4] the correlation between CTT and UML activity diagrams is shown and a structured editor for activity diagrams is presented. Structured workflow models are less error prone and better readable and understandable [14]. Unreadable and unstructured workflow models resembling "spaghetti code" models [9] are avoided. Task tree models rule out unsound constructs like the vicious cycle and are deadlock- and livelock-free and sound by design. Although expressive power is lost compared to flow-chart-oriented languages like BPMN, there are many benefits to using task trees as workflow models, and consequently it seems promising to do so. The paper is structured as follows. In Section 2 the foundation for modeling business processes with task trees is given. Section 3 presents CTML, a modeling language originally developed for smart environments. Finally there will be a discussion and a summary.
2 Modeling Cooperative Activities with Task Trees

In this section the modeling of workflows with task trees is discussed in more detail. The hierarchy is an integral part of the workflow specifications, and binary temporal relations are used to specify the control flow. The temporal operators are related to the operators of the process algebra LOTOS [3].

2.1 Control Flow Specification Using Temporal Operators

CTT is currently the most cited language for task models. Tasks are arranged hierarchically, with more complex tasks decomposed into simpler sub-tasks. CTT distinguishes between several task types, which are indicated by the icon of the task node. There are abstract tasks, which can be further decomposed into combinations of the other task types, including interaction, application and user tasks (see Fig. 1 for an overview of the available task types). The task type denotes the responsibility of execution (human, machine, interaction, cooperation with human).
Fig. 1. Task types for task models
Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications
53
Tasks are connected pairwise by binary temporal operators. In addition to the binary temporal operators there are unary ones that relate to a single task. Some CTT operators are listed in Table 1; a complete listing and explanation of the CTT operators can be found in [18]. Operators have priority orders, which are important for interpreting different operators on the same level. For example, the priority of the order independence operator ( |=| ) is higher than that of the enabling operator ( >> ). The only unary operator listed in Table 1 is iteration. Fig. 2 gives an impression of the specification of a task model.

Table 1. Selected temporal operators of CTT
Operator name         Notation
choice                T1 [] T2
order independence    T1 |=| T2
interleaving          T1 ||| T2
disabling             T1 [> T2
enabling              T1 >> T2
iteration             T*
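As an informal illustration of these operator semantics, the following sketch computes which atomic actions may start next in a small task tree. This is a minimal model of our own (not any CTT tool's API): it covers only the enabling and order independence operators, tracks only completed actions, and simplifies order independence to "any not-yet-completed child may start":

```python
class Task:
    """A task tree node: either an atomic action (no children) or a
    composition of children under one temporal operator."""
    def __init__(self, name, op=None, children=()):
        self.name, self.op, self.children = name, op, list(children)

def enabled(task, done):
    """Return the set of atomic actions that may start next, given the
    set `done` of already completed action names."""
    if not task.children:                  # atomic action
        return set() if task.name in done else {task.name}
    if task.op == ">>":                    # enabling: strict sequence;
        for child in task.children:        # the first unfinished child
            e = enabled(child, done)       # gates all later ones
            if e:
                return e
        return set()
    if task.op == "|=|":                   # order independence
        out = set()                        # (simplified: any remaining
        for child in task.children:        # child may be chosen next)
            out |= enabled(child, done)
        return out
    raise ValueError("unknown operator: %r" % task.op)

# The "Configure Equipment" subtree of the presentation example (Fig. 2):
configure = Task("Configure Equipment", ">>", [
    Task("start devices", "|=|",
         [Task("Start Projector"), Task("Start Laptop")]),
    Task("Connect Laptop & Projector"),
])
```

With nothing completed, both "Start Projector" and "Start Laptop" are enabled; only after both are done does "Connect Laptop & Projector" become enabled, reflecting how the enabling operator sequences the subtree.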
An example of a CTT model is given in Fig. 2, which shows how a presenter may give a talk. The abstract root task "Give Presentation" is decomposed into four child tasks. The tasks on the second level of abstraction are connected with the enabling operator (>>) in order to specify that one task has to be performed before the next can start (e.g., "Present" can only be performed after "Configure Equipment" has been executed). An exception to this is "Leave Room", as it can be performed at any time due to the disabling operator ([>), resulting in a premature abortion of the currently running task. "Configure Equipment" furthermore consists of the tasks "Start Projector", "Start Laptop" and "Connect Laptop & Projector". These basic tasks are connected with the order independence (|=|) and enabling operators. The order independence operator defines the sequential execution of the tasks in arbitrary order, meaning that once one of the tasks is started the other has to
Fig. 2. Task model for giving a presentation
wait for the first one to terminate. Tasks which are not further decomposed are actions and are considered atomic. They represent the smallest entity of execution (e.g., "Start Projector").

2.2 Specification of Cooperative Work

In order to support the specification of collaborative (multi-user) interactive systems, CTT has been extended to CCTT (Cooperative ConcurTaskTrees) [16]. A CCTT specification consists of multiple task trees: one task tree for each involved user role, and one task tree that acts as a "coordinator" and specifies the collaboration and global interaction between the involved user roles. An example of the formalism is given in Fig. 3. Task models for the roles presenter and listener are given at the top and on the lower right-hand side of the figure, respectively. The model specifying the coordination of the individual tasks is depicted on the lower left-hand side. For each action in the coordinator task model, a counterpart in the role-specific task model has to be defined, which is denoted by the dotted lines in the figure. In essence, the coordinator task specification adds additional execution constraints to the individual task models. In the given example it is specified that "Wait for Questions" of the role "Presenter" needs to be performed before the "Listener" is allowed to perform "Ask Question". After that, "Answer Question" of the role "Presenter" can eventually be executed.
Fig. 3. Cooperative Concurrent Task Tree Model for Presentation
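The interplay of the enabling and order-independence operators described above can be illustrated with a small executable sketch. The classes and functions below are purely illustrative (CTT itself defines a notation, not a programming API):

```python
# Minimal sketch of CTT-style temporal operators (illustrative only;
# the class and function names are ours, not part of any CTT tool's API).
import random

class Task:
    def __init__(self, name):
        self.name = name
        self.done = False

    def perform(self):
        self.done = True
        return self.name

def enabling(tasks):
    """The enabling operator (>>): each task may only start
    after its predecessor has terminated."""
    return [t.perform() for t in tasks]

def order_independence(tasks):
    """The order-independence operator (|=|): the tasks run in an
    arbitrary order, but once one has started the others must wait
    for it to terminate (no interleaving)."""
    shuffled = tasks[:]
    random.shuffle(shuffled)
    return enabling(shuffled)

# "Configure Equipment": start projector and laptop in either order,
# then (enabling) connect them.
projector, laptop = Task("Start Projector"), Task("Start Laptop")
connect = Task("Connect Laptop & Projector")
executed = order_independence([projector, laptop]) + enabling([connect])
```

Whatever order the first two actions take, "Connect Laptop & Projector" is always executed last, mirroring the decomposition of "Configure Equipment" in Fig. 2.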
The main shortcoming of CCTT is that the language provides no means to model several actors simultaneously fulfilling the same role, and it assumes that an actor fulfills only one role within a CCTT specification (a strict one-to-one mapping of actors and roles).
Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications
3 A Collaborative Task Modeling Language
The collaborative task modeling language (CTML) was developed in conjunction with modeling efforts in smart environments. It supports the idea of stakeholder-driven process management and has the potential to be used outside the context of smart environments. We will shortly discuss the fundamentals and main features of the language. The design of CTML is based on four fundamental assumptions:
1. Role-based Modeling. In limited and well-defined domains the behavior of an actor can be approximated through her role.
2. Hierarchical Decomposition and Temporal Ordering. The behavior of each role can be adequately expressed by an associated collaborative task expression.
3. Causal Modeling. The execution of tasks may depend on the current state of the environment (defined as the accumulation of the states of all available objects) and may in turn lead to a state modification.
4. Individual and Team Modeling. The execution of tasks by individual users may contribute to a higher-level team task.
Based on these assumptions, a collaborative task model is specified in a twofold manner:
1. Cooperation Model. Specifies the structural and behavioral properties of the model.
2. Configuration(s). Hold runtime information (like initial state and assignment) and simulation/animation configurations.
For each Cooperation Model several Configurations may exist in order to describe the different situations in which the model is used. Fig. 4 shows a schematic sketch of a cooperation model. Elements in the inner circle show modeling entities of the cooperation model (postfixed with "-1"), whereas diagrams outside of the inner circle show specifications realizing the corresponding entities (postfixed with "-2"). On a higher level of abstraction the cooperation model specifies the entities relevant to task modeling.
Therefore, roles (e.g., A-1), devices (e.g., B-1), a location model (C-1), a domain model (D-1) and a team model (E-1) can be specified. The potential actions a user is able to perform are determined by his or her role(s). More precisely, a role is associated with a collaborative task model (A-2 in Fig. 4), which is visually represented by a task tree in a CTT-like notation [18]. Tasks are arranged hierarchically, defining a tree structure. Atomic tasks, i.e., tasks that are not further refined, are referred to as actions. In addition, tasks on the same level of abstraction can be connected via temporal operators defining the temporal order of task execution. Roles categorize users of the same kind in terms of capability, responsibility, experience and limitations within the domain. Thus, roles are abstractions of actors sharing the same characteristics.
Fig. 4. Schematic Cooperation Model for Meeting Scenario
Role modeling is a common concept in
software engineering ([6; 10]) to reduce complexity and build systems for diverse users. What constitutes a certain role and distinguishes it from another depends on the system and the development approach. In [10] it is stated that a user is not limited to one role at a time and that role switching often takes place. In CTML the role concept is employed to define the pool of actions of a user by means of task expressions. In task analysis and modeling this approach is quite common but is usually restricted to a one-to-many relation between role and user [15; 16]. However, this is a rather rigid constraint. In the domain of smart environments it is frequently the case that an actor changes his role at runtime and that one role is performed by several actors simultaneously. This may be the case in our modern business world as well. The role concept implemented in CTML covers this case. In the example of Fig. 4 the roles are Presenter, Listener and Chairman. They represent the different types of stereotypical behavior in the meeting scenario. Besides the cooperation model, a CTML specification also contains one or more configurations providing essential runtime information for the cooperation model. A configuration represents the information necessary for a concrete situation. This allows testing different settings for the same cooperation model with little effort by defining different configurations. As the cooperation model relies on a role-based specification, the actors operating in the environment need to be defined together with a corresponding actor-role mapping. More precisely, an actor may fulfill more than one role concurrently, and a role may be assigned to different actors simultaneously. Moreover, not only concurrent role fulfillment is allowed: all other temporal operators defined in CTML are possible as well. None of the currently existing task modeling approaches supports this, even though it is a common case in cooperative work. Taking the example of the "Conference Session", one can imagine an actor presenting a paper in front of the audience but also listening to other presentations afterward. Therefore, the simultaneous (or, more precisely, temporally ordered) performance of more than one role is an important feature of the language; it also allows separating roles from one another, since they are assembled at runtime. Thus modularization and separation of concerns are achieved. Additionally, some properties of actors are defined (e.g., their initial position in the environment). On the left-hand side of Fig. 5 an example Configuration for the schematic Cooperation Model of Fig. 4 is depicted. Not all of the aforementioned information has a visual counterpart, but the actor-role mapping is represented by arrows. More precisely, it is specified that Leonard only acts as Presenter, whereas Penny fulfills the roles Presenter and Listener simultaneously. Sheldon acts as Chairman. The precise assignment of temporal operators for an actor fulfilling more than one role is performed in a dialog, which is shown on the right-hand side. Currently it is specified that Penny first acts as Presenter and afterward as Listener.
Fig. 5. Configuration "Scenario 1" for Cooperation Model "Conference Session"
Table 2. Examples of preconditions and effects
Preconditions:
Role        Task
Presenter   Start presentation
Presenter   Respond to question
Chairman    Announce open discussion
Effects:
Role        Task
Presenter   End presentation
Presenter   Leave room
Chairman    Announce open discussion
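Such an actor-role mapping can be pictured as a simple data structure. The following sketch is purely illustrative: the class names are invented, and CTML tools do not expose a programming API of this form:

```python
# Sketch of a CTML-style configuration: several actors, each mapped to
# one or more roles combined by a temporal operator. Illustrative only;
# these names are not part of any CTML tool.
from dataclasses import dataclass, field

@dataclass
class RoleAssignment:
    roles: list                # roles the actor fulfills
    operator: str = "|||"      # temporal operator combining the roles;
                               # ">>" means one role after the other

@dataclass
class Configuration:
    name: str
    mapping: dict = field(default_factory=dict)  # actor -> RoleAssignment

scenario1 = Configuration("Scenario 1", {
    "Leonard": RoleAssignment(["Presenter"]),
    "Penny":   RoleAssignment(["Presenter", "Listener"], operator=">>"),
    "Sheldon": RoleAssignment(["Chairman"]),
})

# Penny first acts as Presenter and afterward as Listener:
penny = scenario1.mapping["Penny"]
```

Defining a second Configuration object with a different mapping would correspond to testing another scenario against the same cooperation model.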
A configuration can be considered a scenario under which the cooperation model is tested or used; sometimes, however, one might test only certain features of the model. In the context of a smart environment the scenario is concrete. More generally, it can be considered an abstract specification of a workflow, specifying the general behavior of all possible participants.
In addition to temporal relations between tasks, CTML allows more specific relations to be expressed, in a way similar to OCL. These constraints can be specified between tasks but also between other model elements. In this way CTML allows the specification of preconditions and effects of tasks. With respect to the task expressions of the roles chairman and presenter, the preconditions shown in Table 2 can be defined. The first precondition defines that the presenter is only allowed to start his presentation if the talk has been announced by a Chairman. The second precondition states that responding to questions can only be performed if the Chairman has opened the discussion. The precondition of the chairman states that the open discussion can only be announced once all presenters have finished their presentations. Preconditions defined on this level of abstraction integrate well with the CTML approach of role-based descriptions. Quantifiers specify how many of the actors fulfilling the role are addressed (one or all). As for effects, the task "End presentation" sets the attribute presented of the presenter to true. If a presenter leaves the room, he is outside. The opening of the general discussion by the chairman has the effect that all notebooks are switched off.
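A role-quantified precondition such as "the discussion may only be opened once all presenters have finished" can be sketched as follows. All helper names are invented for illustration; CTML expresses such constraints in an OCL-like notation, not in Python:

```python
# Sketch of role-quantified preconditions and effects (Table 2).
# The function and attribute names are illustrative only.

actors = [
    {"name": "Leonard", "role": "Presenter", "presented": False},
    {"name": "Penny",   "role": "Presenter", "presented": False},
]

def precondition_announce_open_discussion(actors):
    # Quantifier "all": every actor fulfilling the role Presenter
    # must already have finished presenting.
    return all(a["presented"] for a in actors if a["role"] == "Presenter")

def effect_end_presentation(actor):
    # Effect of the task "End presentation": set attribute 'presented'.
    actor["presented"] = True

# The chairman may not open the discussion yet...
assert not precondition_announce_open_discussion(actors)
# ...but after every presenter has performed "End presentation", he may.
for a in actors:
    effect_end_presentation(a)
assert precondition_announce_open_discussion(actors)
```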
4 Discussion
In this paper, an approach to modeling workflows in a hierarchical, structured way by using task trees is discussed. Task trees are frequently used in the HCI community to specify tasks and user actions in a goal-oriented, or rather problem-oriented, and consequently user-centered way. The models are structured and for that reason readable and understandable [14]. Originally, CTML was developed to specify activities in a smart environment, but it can be used to model general workflows as well. Activities of stakeholders are specified by specific models. Constraints between tasks, roles, devices and locations (the general context) can be specified as preconditions and effects. Additionally, an activity model specifying the cooperation can be used, and temporal relations can be specified that describe the dependencies between the different roles a user acts in. The idea of specifying a separate task model for every role (stakeholder) could be called subject-oriented, because the main idea is very similar to that of [8]; the term stakeholder-driven would fit as well. Stakeholders that play the role of bystanders can also be modeled. In this case the task model might be reduced to one task only. However, certain restrictions can still be specified regarding the states of other elements of the model. This can, e.g., mean that no device louder than a certain level is allowed in the neighborhood of a bystander. During software development, models are adapted and incrementally refined. In order to define an appropriate notion of refinement, different refinement relations were introduced for CTML [27]. They specify to what degree a certain task specification can be adapted. This could be very helpful for workflow management systems as well.
5 Summary and Outlook
In this paper, task modeling is compared to business process and workflow modeling. In this context it is stated that task modeling normally goes beyond business process modeling, down to a fine-granular level of user actions. Nevertheless, notations of task models can be used for workflow specifications. It might be useful to separate models with different levels of detail. In this way, workflow management systems can be combined with user interface generation systems that present workflow data to the user in a usable way. Detailed task models have proven to be useful for interactive systems in this respect.
References
1. Annett, J., Duncan, K.D.: Task Analysis and Training Design. Occupational Psychology 41, 211–221 (1967)
2. Balzert, H.: Lehrbuch der Software-Technik: Basiskonzepte und Requirements Engineering. Spektrum, Heidelberg (2009)
3. Bolognesi, T., Brinksma, E.: Introduction to the ISO Specification Language LOTOS. Computer Networks and ISDN Systems 14(1), 25–59 (1987)
4. Brüning, J., Dittmar, A., Forbrig, P., Reichart, D.: Getting SW Engineers on Board: Task Modelling with Activity Diagrams. In: Gulliksen, J., Harning, M.B., van der Veer, G.C., Wesson, J. (eds.) EIS 2007. LNCS, vol. 4940. Springer, Heidelberg (2008)
5. Card, S., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Erlbaum, Hillsdale (1983)
6. Constantine, L.L., Lockwood, L.A.D.: Software for Use: A Practical Guide to the Models and Methods of Usage-Centered Design. Addison-Wesley, Boston (1999)
7. Diaper, D., Stanton, N.: The Handbook of Task Analysis for Human-Computer Interaction. Lawrence Erlbaum Associates, Mahwah (2003)
8. Fleischmann, A., Lippe, S., Meyer, N., Stary, C.: Coherent Task Modeling and Execution Based on Subject-Oriented Representations. In: Proc. TAMODIA, pp. 78–79 (2009)
9. Gabriel, M., Ferreira, V., Ferreira, D.R.: Understanding Spaghetti Models with Sequence Clustering for ProM. In: Business Process Management Workshops (BPM 2009 International Workshops), Ulm. LNBIP, vol. 43 (2009)
10. Johnson, H., Johnson, P.: Task Knowledge Structures: Psychological Basis and Integration into System Design. Acta Psychologica 78, 3–26 (1991)
11. Kiepuszewski, B., ter Hofstede, A.H.M., Bussler, C.J.: On Structured Workflow Modelling. In: Wangler, B., Bergman, L.D. (eds.) CAiSE 2000. LNCS, vol. 1789, pp. 431–445. Springer, Heidelberg (2000)
12. Kristiansen, R., Trætteberg, H.: Model-Based User Interface Design in the Context of Workflow Models. In: Winckler, M., Johnson, H. (eds.) TAMODIA 2007. LNCS, vol. 4849, pp. 227–239. Springer, Heidelberg (2007)
13. Larman, C.: Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3rd edn. Prentice Hall, Englewood Cliffs (2004)
14. Laue, R., Mendling, J.: The Impact of Structuredness on Error Probability of Process Models. In: Information Systems and e-Business Technologies, 2nd International United Information Systems Conference (UNISCON 2008), Klagenfurt. LNBIP, vol. 5 (2008)
15. Molina, A.I., Redondo, M.A., Ortega, M., Hoppe, U.: CIAM: A Methodology for the Development of Groupware User Interfaces. Journal of Universal Computer Science 14, 1435–1446 (2008)
16. Mori, G., Paternò, F., Santoro, C.: CTTE: Support for Developing and Analyzing Task Models for Interactive System Design. IEEE Trans. Software Eng. 28(8), 797–813 (2002)
17. OCL: http://www.omg.org/technology/documents/modeling_spec_catalog.htm#OCL
18. Paternò, F.: Model-Based Design and Evaluation of Interactive Applications. Springer, Heidelberg (2000)
19. Penichet, V.M.R., Lozano, M.D., Gallud, J.A., Tesoriero, R.: User Interface Analysis for Groupware Applications in the TOUCHE Process Model. Adv. Eng. Softw. 40(12), 1212–1222 (2009)
20. Poluha, R.G.: Application of the SCOR Model in Supply Chain Management. Youngstown, New York (2007)
21. Scheer, A.-W.: ARIS: Business Process Modeling. Springer, Heidelberg (1999)
22. Traetteberg, H.: Modelling Work: Workflow and Task Modelling. In: Vanderdonckt, J., Puerta, A.R. (eds.) Computer-Aided Design of User Interfaces II: Proceedings of the Third International Conference on Computer-Aided Design of User Interfaces, Louvain-la-Neuve, Belgium, October 21-23, pp. 275–280. Kluwer, Dordrecht (1999)
23. van der Aalst, W., Desel, J., Kindler, E.: On the Semantics of EPCs: A Vicious Cycle. In: Geschäftsprozessmanagement mit Ereignisgesteuerten Prozessketten (EPK 2002), Trier (2002), http://www.wiso.uni-hamburg.de/fileadmin/wiso_fs_wi/EPKCommunity/epk2002-proceedings.pdf (accessed November 30, 2010)
24. van der Aalst, W., ter Hofstede, A.: YAWL – Yet Another Workflow Language (Revised Version). QUT Technical Report FIT-TR-2003-04, Queensland University of Technology, Brisbane (2003)
25. Vanhatalo, J., Völzer, H., Koehler, J.: The Refined Process Structure Tree. In: Dumas, M., Reichert, M., Shan, M.-C. (eds.) BPM 2008. LNCS, vol. 5240, pp. 100–115. Springer, Heidelberg (2008)
26. Weske, M.: Workflow Management Systems: Formal Foundation, Conceptual Design, Implementation Aspects. Habilitation Thesis, University of Münster (2000)
27. Wurdel, M.: Making Task Modeling Suitable for Stakeholder-Driven Workflow Specifications. PhD Thesis, University of Rostock (submitted December 2010)
A Method to Solve the Communication Gap between Designers and Users
Jeichen Hsieh1, Chia-Ching Lin1, and Pao-Tai Hsieh2
1 Department of Industrial Design, Tunghai University, Taiwan, R.O.C.
2 Department of Commercial Design, National Taichung Institute of Technology, Taiwan, R.O.C.
[email protected], [email protected], [email protected]
Abstract. There are always discrepancies when abstract design concepts are transferred to solid products. How can one make sure that design concepts are conveyed exactly via products? Developing early-stage prototypes for tests and surveys is one solution. This research repeatedly applies POE (Post-Occupancy Evaluation) to prototypes from students' design cases. The results revealed that product prototype POE can anticipate the performance of products in the final evaluation, just as a pre-production evaluation can predict post-production consumer reception. They also suggest that the performance of product prototypes under POE would become clearer if extraneous variables were strictly controlled in advance. Two cases show chaotic phenomena; probing the field of students' design activities with grounded theory might help to unearth further discoveries. Keywords: Post-Occupancy Evaluation, Prototype, Cognitive Differences.
concrete answer after the design program; one needs to examine whether it meets the original objectives. Coughlan and Mashman [4] note that when more than one prototype is available, the customer is usually asked to choose the most attractive proposal for further development or for improvement of what is missing. In such a single-session assessment the customer may not make the right judgments and may easily provide false information, causing errors in design decisions and leading to the risk that a design which should be developed is rejected, or that one which should be rejected is developed further. Moreover, is the proposal the customer finds most attractive actually consistent with the original target market and the original design intent? This needs to be verified. By adopting the POE (Post-Occupancy Evaluation) assessment method from the construction sector, a designer can understand both the design intent and the user's intent and thereby shorten the gap between them; POE has been applied in architecture, urban planning and other fields. Subsequent researchers have applied it to industrial design to examine whether users can really perceive the intentions behind a product's design and planning; the assessment can then be used to acquire knowledge for improving the next generation of the product. Since an ex-post POE review can serve as a product improvement policy, the question arises whether implementing POE for evaluation and improvement already at the prototype stage of development has significant effects. That is the focus of this research. In education, there are often many steps between the first ideas and the completion of a model, accompanied by the teacher's guidance for improvement. This is similar to industry practice, where a design goes through internal assessment decisions made by superiors.
However, in an interview the former design director of Toyota Motor stated that early decisions made by superiors often result in errors, with no further assessment taking place due to systemic problems. We can see that such a practice is not necessarily effective. This research responds by developing a prototype POE method for the field of industrial design: repeated cycles of evaluation and amendment are imported into the assessment of students' design prototypes, the gap between design concept and actual performance is measured, and recommendations are given to encourage the students to better approximate their design intent. The research then attempts to understand whether the implementation of POE in these cases has a significant impact on the students.
2 Literature Review
2.1 POE Method
This review covers POE research and previous assessments of prototypes: it grasps the principles and characteristics of POE, explores its content and evolution, and, by asking what a prototype is, establishes the scope of prototype investigation. The POE method examines the effectiveness of the environment and of architectural design and planning, accumulating and sorting data as a basis for future improvements. Used in industrial design, by contrast, POE should be able to examine product design and planning information and reveal whether users can really perceive the intentions behind the design and planning of a product. Liguan Yao, in the first case study applying POE to a home appliance (a refrigerator), used POE in product design to understand the differences in ideas between users and designers, the changes a product undergoes after being put into use, possible acts of user misuse, and new patterns of behavior. He compared the differences between POE in architecture and in industrial design, and established a suitable POE operating model for industrial products. Views on the nature and functions of POE differ somewhat. Sommer defines POE as "a systematic examination of the use of the built environment in order to obtain useful information"; Rabinowitz says POE is "a method for understanding and improving the effectiveness of the use of the built environment"; domestic scholars have proposed "use assessment", which "specifically refers to research on the use of a building or the built environment". Meanwhile, Shyh-Meng Huang thinks that "POE's point of view first emphasizes whether there is correspondence between space and activity (or use), where conflicts occur, and how to resolve these contradictions". In addition, according to Zimring and Reizenstein, POE uses "interviews, observation, questionnaires and other methods to understand users' views on all aspects of a building's performance", which is a more operational definition, defining the object as the relationship between people and buildings. Another definition by Zimring describes POE as "diagnostic work on the environment, where this environment has been designed for human use". All in all, POE can be generalized as treating past efforts as material for reflection and decision-making, seeking greater understanding and a road to further improvement.
A narrower view looks at architectural design from the perspective of users: exploring responses to the built environment to improve decision-making in similar future design cases; assessing the functional performance of buildings and comparing their use with the original planning objectives and content; or comparing actual use with the architects' assumptions about use, in order to understand the differences and explore their causes. POE is a rationalized, systematic process, emphasizing an objective position and method of evaluation, with characteristics such as explicit assessment criteria; this is what distinguishes its results from the "Architectural Review" (critique). In other words, POE applies social science research methods to study the relationship between the built environment and people (especially users) with integrity and in depth. Preiser divides the effects of POE application, by their time horizon, into three stages:
1. Short-range effects:
(1) determine the success or failure of the building;
(2) make recommendations on problems that need to be fixed;
(3) provide information for the development project budget.
2. Medium-range effects:
(1) help determine whether the building should continue to be used, be modified, be rebuilt, or undergo other measures;
(2) resolve the supply dimension of existing buildings.
3. Long-range effects:
(1) POE findings can inform the design of future buildings;
(2) basic design information can be enhanced and the formulation of assessment criteria and guidelines improved;
(3) methods for measuring the qualitative aspects of past building performance can be improved.
Furthermore, Preiser [5] presented in his writings three levels of POE study that can be used for buildings or facilities of any form or size:
1. Indicative POE: the main purpose is to find advantages and disadvantages, successes and failures; it is a short-term assessment. It generally uses four methods (document collection, performance evaluation, field reconnaissance and interviews) and requires from 2-3 hours up to one or two days.
2. Investigative POE: where an indicative POE shows that an issue is important, an investigative POE gets to the bottom of the topic with a more detailed investigation. Assessment criteria are formulated mainly on the basis of the related literature and with reference to recent assessments of similar cases; its credibility is therefore higher and it is more objective. It needs about 160 to 240 hours.
3. Diagnostic POE: a comprehensive and in-depth assessment involving many variables, using questionnaires, surveys, observation, actual measurement, scientific method and other data collection methods. It aims to understand the relationships between the actual environment and facilities, behavior and the variables, as well as among the variables themselves. The results can be compared with scientific standards and are quite accurate; they can enhance knowledge of the building and design guidelines, and even lead to amendments of laws and regulations. It takes from several months up to a year or more.
2.2 Industrial Design Prototypes
Searching Science Direct and UMI ProQuest Digital Dissertations with the keywords Evaluation, Product and Industrial Design Prototype yields little literature on ex-post evaluation methods of the POE kind, which compare outcomes against the original planning and design information, applied to prototypes or products. As shown in Section 2.1, compared with its use in the construction field, POE has been applied by Liguan Yao for the first time to product evaluation (a refrigerator) and by Cai Shijie to product innovation for a dehumidifier. The context of these POE applications is not deeply understood: for example, there is no account of which projects POE should assess, how these projects are generated, or how the three levels of POE can be used in the field of industrial design. Coughlan and Mashman [4] argue that in a single-session assessment of a prototype the customer may not make the right judgments and may easily provide false information, causing errors in design decisions. They therefore advocate more than one assessment, letting customers look at several prototypes in order to reduce the potential crisis of "edge" evaluations, and conclude that a pre-production market assessment should be able to predict customer acceptance after launch. Cultural context often plays an important role in the aesthetic aspects of a product and in its aftermarket success; the manufacturer's image and reputation, and people's past experience with related products and functions, also affect judgments of the product. When considering a potential product, these other assessment factors cannot be ignored. Further, Crilly et al. [1] summarize products as having three major functions (aesthetic, symbolic and practical) and suggest that further research should develop appropriate ways to examine all aspects of product performance, and even to predict whether a product can stimulate the expected customer reactions; Heufler [6] likewise identifies the aesthetic, symbolic and practical basic functions of products, so the three-basic-functions argument has so far been fairly consistent. Based on the above, Coughlan and Mashman hold that in current product design there is still room for developing prototype evaluation, and that pre-production evaluation is related to after-market customer response. Liguan Yao points out that applying POE to the assessment of industrial products is feasible and that POE surveys can assess and revise products from generation to generation. If POE is used for prototype evaluation, the feedback and feed-forward functions of POE echo each other: the prototype POE assessment provides feedback for making improvements, while its effect on the product acts as feed-forward (e.g., the predictive effect Coughlan and Mashman describe above). This may form a better approach to prototype assessment. Baxter [3] holds that the models and prototypes used in the design process can be divided into models characterizing structure, prototypes characterizing function, and prototypes characterizing both structure and function.
2.3 Research Questions
Whether POE is applied in the construction sector or in the field of industrial design, and irrespective of whether POE in industrial design keeps the spirit of the original POE, the common research approach is to report the cognitive differences between the original planning and the users: the product being assessed is an established fact, and the conclusions provide directions for improvement. These studies do not address whether POE can reduce the cognitive gap between design intent and users. In the product design process, designers need to be sure that users understand their thinking; one way is to verify, through development reviews and early prototype testing, whether the design concepts are conveyed appropriately in the conversion. Besides choosing a suitable research approach, one also has to know how it works. The focus of this research is therefore to expose the performance of "POE for shortening the cognitive differences between design intent and users" when the POE method is first applied to prototypes: using prototyping and testing with a repeated Assess & Refine process to implement product prototype POE, it asks whether the use of prototype POE leads to a significant difference for the subjects in the final assessment, and whether the subjects' final responses can be predicted, just as a pre-production evaluation can predict customer acceptance after market launch.
3 Methodology
The prototype assessments were carried out by students of the Department of Industrial Design, Tunghai University, and include: (1) chair prototype POE in the sophomore design course; (2) acrylic-work POE followed by redesign in the junior design course; (3) a senior graduation project spanning two semesters, with an experimental group and a control group of two students each. Figure 3.1 shows the order of the processes. First, to understand the feasibility of applying POE to prototypes, the sophomore chair design prototypes were evaluated directly; the literature was then searched again, and field data were integrated to construct theory, so that the theory is not only drawn from the field but the literature is also provisionally tested and verified in the field. The POE literature review therefore took place after the chair prototype study had ended, and designers with practical experience were consulted to understand how industry implements prototype assessments. After the chair prototype had established the feasibility of POE, and drawing on the literature review, POE approaches for industrial design product prototypes were developed following the practices of the construction sector, and a prototype POE model was constructed by documentary induction. The study then gradually narrowed its focus to the POE of design intent and perceived user performance, using student work in an attempt to feed the POE review information back to the students and encourage them to better approximate their design intent. To understand the implementation of prototype POE across multiple products (the chair prototype POE faced only a single product), the junior-year acrylic works submitted by the students were assessed: a first POE was conducted according to procedure, followed by redesign and a second POE.
At the same time, for the senior graduation project, the experiment was designed first in order to grasp the time course across the two semesters. The whole senior experiment was divided into five milestones, and the operational variables and the arrangements for the experimental and control groups were established. After the first two milestones of the experiment, the first semester of the study came to an end; the assessment work done so far was then reviewed and fed back into the prototype POE approach, and the remaining three assessments were completed in the following semester.
4 Results

The implementation of POE began by establishing its feasibility for prototypes and proceeded by repeatedly applying POE to the students' design prototypes. The findings are as follows. POE was applied to the students' works through repeated evaluation, with recommendations for improving the prototypes fed back at each stage, and a final evaluation by the subjects. In the Scheffé post hoc comparison, the difference between the works of the experimental and control groups did not reach the 0.05 level of significance: there were differences in performance, but not significant ones. Ranking the works by performance, first place went to experimental-group student B's

A Method to Solve the Communication Gap between Designers and Users

single rinsing mug and second place to control-group student D's cold-tea set, which means that works produced with POE did not necessarily perform better, and works produced without POE did not perform relatively worse: for these four students, POE had no significant effect on performance. For the experimental group's works, the fourth-milestone and final assessments were compared by independent-sample t-test and by Scheffé post hoc comparison; neither reached the 0.05 level of significance, indicating no significant difference between the two assessments. In other words, the fourth-milestone evaluation already gives a glimpse of how a new generation of works will perform in the final in-school assessment, so POE of student prototypes can be expected to predict final-assessment performance, much as a pre-production assessment can be used to predict customer acceptance after a product is launched. That POE showed no significant effect on the students' works reflects the fact that, following the POE literature, the research was carried out under natural rather than controlled experimental conditions. In the process it gradually emerged that external variables were difficult to isolate well, for example designers overturning evaluation verdicts, and the design items differing from one another. This means that applying POE to prototypes requires better-controlled experimental conditions; control measures such as fixing the product items and not allowing designers to overturn results should help clarify the effectiveness of prototype POE.
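The Scheffé comparison used here can be made concrete in code. The sketch below is not the authors' actual analysis; it is a minimal hand-rolled Scheffé post hoc test over invented score data (the group names and numbers are hypothetical), shown only to illustrate the procedure.

```python
import numpy as np
from scipy import stats

def scheffe_pairwise(groups, alpha=0.05):
    """All-pairs Scheffé post hoc comparison after a one-way ANOVA.

    Returns (i, j, mean difference, significant?) for each pair of groups.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    # within-group (error) sum of squares and mean square
    sse = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)
    mse = sse / (n - k)
    f_crit = stats.f.ppf(1 - alpha, k - 1, n - k)
    results = []
    for i in range(k):
        for j in range(i + 1, k):
            diff = np.mean(groups[i]) - np.mean(groups[j])
            se2 = mse * (1 / len(groups[i]) + 1 / len(groups[j]))
            # Scheffé: compare diff^2 / ((k-1) * se^2) against the F critical value
            f_stat = diff ** 2 / ((k - 1) * se2)
            results.append((i, j, diff, f_stat > f_crit))
    return results

# hypothetical evaluation scores for an experimental and a control group
experimental = [78, 82, 75, 80]
control = [76, 79, 74, 81]
for i, j, diff, sig in scheffe_pairwise([experimental, control]):
    print(f"group {i} vs group {j}: diff={diff:.2f}, significant={bool(sig)}")
# → group 0 vs group 1: diff=1.25, significant=False
```

With these invented scores the difference between groups does not reach significance, mirroring the kind of result reported above.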
When the items are not the same, it is even harder to attribute effects to the POE diagnosis and feedback on the prototype. In the case of the teapot and the robot lamp, the robot lamp itself may be difficult for subjects to accept; but if two table lamps had been designed, with the robot form as the key design difference, the basis of comparison would have been more consistent. In senior milestone two, the assessment of Kathy Chang's single rinsing mug found a functional problem: the cup diameter was too small, so the brush bristles pressed against the inside face of the cup and obstructed the flow of water. After the POE diagnosis was fed back, the functional assessment at milestone three found that the bristles no longer obstructed drinking from the cup. As Norman [2] describes, iterative testing and re-design serves design at the behavioural level (function) well, but is not necessarily suited to the visceral level (aesthetics) or the reflective level (symbolic function). Some recommendations are as follows. Judging from the works of the experimental and control groups, there was no significant difference in performance: the cases did not show a significant benefit to the students, but the diagnostic feedback of prototype POE was more significant in prompting the senior designers to reflect. For short-term assessment, prototype POE is recommended only at the first two levels, indicative and investigative. Indicative POE, which identifies the main problems, advantages and disadvantages, and successes and failures, requires from two or three hours up to a day or two; the milestone assessments in this study were investigative POE of the problems found at the indicative level, and this more detailed investigation takes about two weeks in all.
Diagnostic POE, which relates variables to other variables, requires from months to a year or more [5]; this study did not use this highest level of POE, which seems too time-consuming to match

J. Hsieh, C.-C. Lin, and P.-T. Hsieh

prototype development, because its results would come back to the designers too late in product development unless a considerable length of time were available.

1. To complete the skills needed for prototype POE (architectural POE has had thirty years of development behind it), this study mainly proposes the current prototype POE approaches. More specific items, such as detection methods for the environmental aspects of function or for the cultural and social context of symbolic function, and better ways of detecting them, remain for follow-up studies to itemise and, with specialised knowledge, integrate into prototype POE research approaches.

2. The "sub-block investigation method" did not work well in this research. A building can be partitioned into blocks fairly easily, but partitioning a product into blocks tends to destroy its significance (on Gestalt-psychology grounds) and makes the scope of each block difficult to define. For example, block A of the robot desk lamp was defined as running from the palm to the shoulder, but why not from the palm to the elbow? There seems to be no sufficient reason to support such a division. The consequence is that one may know only that the prototype performs poorly on aesthetics, not "which part is poor in beauty". This raises the question of how to detect exactly where a negative aesthetic (or other) evaluation is located. At present only qualitative records of the subjects' general impressions of the prototype can be kept, and if the designer wants to know exactly which part or range the subjects did not appreciate, it is difficult to clarify. Future development of the prototype approach should improve this.

3. Industry in Taiwan tends to shy away from more innovative designs. Combining POE with decision science might support the evaluation of prototypes and make designers more convincing.

4.
From the chair-prototype POE we found that the general assessors' scores for the subjects were higher than the teachers'; the different roles people occupy, and the length of time they occupy them, seem to influence assessment, much as different degrees of business involvement in the field affect purchase decisions. Prototype POE should therefore be able to work more closely with marketing and business studies on such different forms of possession, for example a fully packaged product, an unpackaged product that any consumer may touch, and a packaged but partly exposed product presented without design instructions, and their impact on purchase decisions.

5. In the assessment, qualitative records of associations showed that items similar in appearance to a subject's own belongings can evoke deeper associations or emotional attachment; the single rinsing mug, for example, reminded one subject of time with a partner. Since emotional attachment can sometimes be changed or enhanced through methods such as advertising, thereby making consumers feel that products are "theirs" [6], prototype POE could be used to explore how using, or not using, such items affects advertising effectiveness or consumer purchase decisions.

6. The current assessment relies heavily on human interpretation of the data, so diagnoses and recommendations vary between assessors, and the problem accumulates as more product prototypes are assessed. It would be worthwhile to organise rules for analysis and diagnosis, for example using correlation analysis to determine why a project performs as it does, or to sort out rules for building an expert system.
7. POE surveys have been used to assess successive generations of products and buildings, but no study has followed up a product or building after diagnosis, so it is not shown whether the next generation is better. In terms of student design activities, a follow-up could track the next cohort on the same topic: if one term's works are diagnosed by POE, whether the next generation's works perform better could then be explored.

8. Glimpses of chaotic phenomena in the cases were beyond the scope of this study. Scholars in different fields have found chaotic phenomena, and follow-up study in this direction may open a "nonlinear" perspective on industrial design, for example by taking a grounded-theory approach directly into students' design activities to observe chaotic phenomena and create new concepts and theory.
References
1. Crilly, N., Moultrie, J., Clarkson, P.J.: Seeing Things: Consumer Response to the Visual Domain in Product Design. Design Studies 25(6), 547–577 (2004)
2. Norman, D.A.: Emotional Design: Why We Love (or Hate) Everyday Things (trans. Weng, Q.L.). Garden City Culture Limited (2005)
3. Baxter, M.: Product Design: A Practical Guide to Systematic Methods of New Product Development. Kuni Press (1998)
4. Coughlan, P., Mashman, R.: Once is Not Enough: Repeated Exposure to and Aesthetic Evaluation of an Automobile Design Prototype. Design Studies 20(6), 553–563 (1999)
5. Preiser, W.F.E., Rabinowitz, H.Z., White, E.T.: Post-Occupancy Evaluation. Van Nostrand Reinhold Company, New York (1988)
6. Heufler, G.: Design Basics: From Ideas to Products. Longxi Books (2005)
Teaching the Next Generation of Universal Access Designers: A Case Study Simeon Keates IT University of Copenhagen, Rued Langgaards Vej 7, 2300 Copenhagen, Denmark [email protected]
Abstract. This paper describes the development of the “Usability and Accessibility” course for M.Sc. students at the IT University of Copenhagen. The aim is to examine whether this course provides an effective and useful method for raising the issues around Universal Access with the designers of the future. This paper examines the results and conclusions from the students over 5 semesters of this course and provides an overview of the success of the different design and evaluation methods. The paper concludes with a discussion of the effectiveness of each of the specific methods, techniques and tools used in the course, both from design and education perspectives. Keywords: usability, accessibility, universal access, education.
1 Introduction It is widely accepted that there is a need to adopt user-centred [1] or user-sensitive [2] design processes when designing user interfaces. It is also widely accepted that there is a need to design for the widest possible range of users [3]. Design approaches such as Universal Design [4], Inclusive Design [5], and Countering Design Exclusion [6] have been developed as means of ensuring that user interfaces support the concept of Universal Access [7]. However, it is unusual to find any of these concepts taught explicitly within university Computer Science degree programs. Often they are taught within subjects such as Interaction Design, if at all. This paper describes a combined Usability and Accessibility course for graduate students. It will explain how students with little to no background in either topic can gain pragmatic skills and experience in a comparatively short space of time.
Students on the DDK line come from a wide variety of backgrounds. Approximately half of the students attending the course have received a traditional computer science education. The other students have had a more humanities-based education. The students are all typically mature and are either returning to education after a few years of work experience or are completing the degree as part of their on-the-job training. Almost all of the students describe their interest and motivation for taking the course as being to learn how to make websites more usable, even though websites are not explicitly mentioned in the course description. The DDK line consists of a mix of mandatory courses and elective ones. The full course structure is shown in Table 1. Table 1. The DDK study line for the M.Sc. degree at the IT University of Copenhagen
Semester       Courses
1st Semester   “Interaction design” (15 ECTS)
               “Media and communication” (7.5 ECTS)
               “Web design and web communication” (7.5 ECTS)
2nd Semester   “Innovation and concept development” (7.5 ECTS)
               “Introduction to coding, databases and system architecture” (7.5 ECTS)
               Elective 1 (7.5 ECTS)
               Elective 2 (7.5 ECTS)
The “Usability and Accessibility” course is a specialism option in the 3rd semester. Other choices include:
• “Digital culture and community”
• “Globalisation, organisation and communication”
• “Digital aesthetics: theory and practice”
• “Mobile communication: design-related, business-related and social context”
These are all 15 ECTS courses, each constituting one-eighth of the 120 ECTS M.Sc. degree, and they run twice every year, in the Spring and Autumn (Fall) semesters.
3 Course Structure The “Usability and Accessibility” course is 15 weeks long and is structured around the development of a web shop. In the first teaching session, the students are asked to interview each other and to complete a skills and interests questionnaire. Students are then placed into groups of 4 or 5 students with at least 2 experienced coders in each
group, although all students will have taken the mandatory course on Databases, which teaches the basics of PHP programming, and the course on Web Design, which teaches HTML, XML and the basics of JavaScript. Students are tasked with building a simple web-shop from scratch in the first 2 weeks of the semester. The tight deadline is specifically to emulate the time pressure in most commercial environments. No explicit usability or accessibility goals are presented. This version of the web-shop is then frozen and the students are not allowed to modify it. A copy of the web-shop is made and, over the next 10 weeks of the course, usability and accessibility theory are introduced. The students develop another version of the web-shop, with explicit consideration of the usability and accessibility requirements. Students are introduced to usability and accessibility theories in an order that supports the continuing development and refinement of their web-shops. The course is expected to take 20 hours per week of student time, with 2-hour lectures and 2-hour exercise sessions twice a week (Wednesday and Friday), giving 8 hours of direct tuition per week, the remainder of the time being self-guided study by the students, typically work on their projects. Usually, the first morning of lectures introduces new theory. The first exercise session is focused on applying that theory in an exercise that is unrelated to the project. The second morning of lectures then examines the application of the theory and introduces further theory. The second afternoon of exercises is then focused on applying the theory to the web-shop project. At the end of the semester, the students are asked to prepare a 10-page project report in the ACM CHI publication format [8] along with a 5-page supplementary report, which can be formatted as they choose. They are examined on a combination of the 10-page report, a 20-minute group presentation and a 20-minute individual oral examination.
The students were told to focus on being able to justify quantitatively whether their revised sites were more usable and accessible than their original (frozen) sites. 3.1 Course Participants In total 116 students have enrolled in the course over the 5 semesters discussed in this paper (Autumn 2008 to Autumn 2010). Between them, they have developed 48 different web-shops – 24 original (frozen) versions and 24 revised versions. 3.2 The Design Brief Once students have been placed in their groups of 4, they are given a design brief, which states that: • The students have been hired by a fictional company to produce a web-shop within 2 weeks that offers a list of specified British products to their employees as a reward for a record-breaking year of sales. • The web-shop is to consist of a welcome/splash page explaining the offer, a product selection page, a delivery page and an order confirmation page. • Each employee is to either choose a single product (Autumn 2008 and Spring 2009) or is to receive between 5 and 10 stars to spend (all other semesters). All stars must be “spent” to reduce delivery costs before the order can be completed.
The students are then given a list of between 60 and 75 British products to offer on their web-shop. A number of those products are deliberately chosen to be unfamiliar to non-British people, such as mince pies and Christmas crackers. The aim is to encourage the students to learn to research products for themselves and also to ensure that their web-shops communicate the nature of the products effectively, rather than simply relying on the users' familiarity with brand and product names. Between 30% and 50% of the products on the list were changed each semester, both to reflect the change between Christmas and Summer holiday rewards and to minimize the effects of designs being passed down from semester to semester. The change from selecting a single product to spending 10 stars was made because, although the newer project is more complex to code, it offers a richer interaction and thus more data to analyse in the final reports. Having developed a working web-shop, the students then have to improve the design through the application of usability and accessibility methods. 3.3 Usability Methods The students are introduced to usability methods in increasing order of complexity, and in an order that makes sense for the re-design of their web-shop. Card sorting. Card sorting was used by the students to decide on the best potential clusters for their products (e.g. Sweets, Healthcare products) and also to ensure that the products were in the correct cluster. Personas. Personas are usually developed from a known user group and are typically used to describe particular sectors of the target users that are of specific interest to the designers. In this case, though, since the target user group was fictional, the personas were developed to represent broad user types and were used to prompt the students to consider different user patterns of behaviour in conjunction with heuristic evaluation. Heuristic evaluation.
The students developed specific use cases based on the personas that they had developed and then performed a heuristic evaluation to identify potential usability issues with their “frozen” sites. User trials. At the end of the semester the students performed user trial evaluations of their original (frozen) and revised sites. They had to recruit a minimum of 4 (later 6) users plus a user who was blind. No assistance was given in finding the blind user, to encourage the students to learn where to find such users. Before conducting the final set of user trials, they also had to perform a pilot study with at least one user. The students typically used screen-recording software, such as Silverback and Camtasia, to record the trials. They were encouraged to collect as much quantitative data as possible. 3.4 Accessibility Methods The stipulation that at least one of the users in the final user trials had to be blind meant that each group had to explicitly consider the accessibility of their web-shop. To this end, the students were introduced to common accessibility evaluation tools.
Cynthia Says. The students were first asked to use HiSoftware’s Cynthia Says Portal to identify how many Web Content Accessibility Guidelines (WCAG) Priority 1, 2 and 3 errors [9] their sites had. Although WCAG is commonly accepted as the default standard for web accessibility in the Universal Access community, this was the first time almost all of the students had encountered it. Wave. As many students found the Cynthia Says Portal output very difficult to visualise, they were asked to repeat the WCAG evaluation using WebAIM’s Wave Web Accessibility Evaluation Tool, which produces a marked-up version of the web page being analysed, with red, yellow and green markers indicating the location of potential problems (the yellow and red markers) or successes (the green markers). Vischeck. About 8% of the male population is colour blind, so to check whether this presents a problem to users of their sites, the students are instructed to evaluate their sites using Vischeck. The aim is to establish whether users with Deuteranopia (red/green colour deficit), Protanopia (red/green colour deficit) or Tritanopia (blue/yellow colour deficit) would experience difficulties using their sites. Screen reader. While the WCAG compliance tools such as Cynthia Says and Wave are useful in identifying basic coding issues, simply conforming to those standards does not guarantee an accessible or usable website. To check this, the students are asked to use a screen reader such as WebAnywhere or JAWS to browse their web-shops aurally. Exclusion calculator. To evaluate the potential number of users that may be excluded from using their sites, the students are asked to perform a comparative exclusion analysis using either of the exclusion calculators from the Engineering Department at the University of Cambridge.
The calculators require the students to estimate the level of functional capability required to use a product and then report the total number of people within the British population who do not possess those levels of functional capability. The aim of introducing the exclusion calculators is to indicate prevalence of impairment in the general population.
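Tools such as Cynthia Says and Wave automate large batteries of checks like these. As a toy illustration (not how those tools are actually implemented), a single WCAG-style check, flagging images without a text alternative, can be written with Python's standard library; the markup fed to it below is invented:

```python
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Flags <img> tags that lack an alt attribute (a WCAG text-alternative check)."""
    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # an empty alt="" is valid (decorative image); only a missing alt is flagged
        if tag == "img" and "alt" not in attrs:
            self.missing.append(attrs.get("src", "<no src>"))

checker = AltTextChecker()
checker.feed('<img src="pie.jpg" alt="A mince pie"><img src="cracker.jpg">')
print(checker.missing)  # → ['cracker.jpg']
```

Real checkers combine dozens of such rules, which is why the students are asked to read the tool output against the WCAG checkpoints rather than treat it as a pass/fail verdict.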
4 Review of the Usability and Accessibility Methods As discussed above, the “Usability and Accessibility” course introduced the students to a number of common design and evaluation methods and tools. 4.1 Consideration of Usability and Accessibility None of the groups considered accessibility in their initial designs. Where the first versions of their web-shops were accessible, this was solely due to using valid HTML
coding. This is both encouraging, in that it demonstrates that accessibility can be achieved by following standards, and concerning, in that no students considered accessibility until formally instructed to do so. Comparatively few groups considered explicit usability goals either. When they were considered, the goals were vaguely formulated, often making reference to “user experience,” but with no set targets or objectives. By the end of the course, all groups had clearly defined usability and accessibility objectives. By far the most common usability definition adopted was that from ISO 9241-11, specifically the “extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.” [10] The most commonly adopted definition of accessibility was broadly that the site must be “usable and accessible to a blind user using a screen reader.” While this definition does not meet the usual definition of Universal Access – to be usable and accessible by as many people as possible in as many contexts of use as possible – it is a major step in the right direction. 4.2 Usability Methods and Tools The card sorting exercise is useful in helping the students to consider their product groups and especially to identify products that were in the wrong product group. The personas are not very useful in their traditional role in a user-centred design process. However, this is not surprising as they are entirely fictional and not developed in the correct way. They are useful, though, as a design tool in reminding the students that the users may exhibit a variety of browsing patterns and IT skills.
The most successful strategy observed to date is a trio of personas that exhibit the following browsing patterns:
• The quick user – someone who wants to complete the process as quickly as possible
• The careful user – someone who wants to consider all of the possibilities to get the best possible value
• The uncertain user – someone who changes their mind frequently, might visit the site multiple times before deciding and possibly chooses products based on someone else’s recommendations
The heuristic evaluation often proves very useful in identifying many usability issues with the original (frozen) versions of the web-shops. However, this technique has also proven to be the most problematic for the students to use. In the final user trial evaluations of the original and revised web-shops, the times when the users expressed a preference for the original site can in almost all circumstances be traced back to the heuristic evaluation stage. Heuristic evaluation is known to identify many potential usability issues on a website. However, the method provides comparatively little information about the priority of each issue. Consequently, the students often assign each issue the same priority and attempt to fix them all. In doing so, they sometimes end up with a revised site that is visually more complex than the original site through the addition of FAQs, contact addresses, more robust error-checking, etc. While the users often respond positively to the new additions to the site in terms of trustworthiness, for example, they also sometimes feel
that the flow of the interaction has become more cumbersome and less streamlined. Many of the students manage to walk the fine line, providing a richer and more secure user experience without compromising the effectiveness of the site. Some groups, however, make their sites so complex that user satisfaction is adversely affected. Finally, the user trials at the end of the semester are generally regarded by the students as the most useful usability evaluation method, and the user trials with the blind users are often the most interesting and personally rewarding. However, it is also accepted that user trials take much longer to perform and are more resource intensive. 4.3 Accessibility Methods and Tools The students typically find the visual presentation of WCAG violations from Wave extremely useful in identifying where the accessibility problems are on each page. However, the detailed analytical feedback from Cynthia Says is typically more useful in identifying where the problems lie in the coding. All groups to date have used a combination of both applications in developing the revised versions of their web-shops. Vischeck is often harder for the students to interpret. A number of students have tried to adjust the colour schemes of their sites to still look visually appealing to themselves, without appreciating that their colour preferences (with unimpaired colour vision) may not be shared by someone with a colour vision impairment. Most students, though, use Vischeck to look for insufficient colour contrast for each of the three colour vision impairment types, which is usually more successful. The exclusion calculators usually do not offer enough data resolution to register the changes made between the original and revised versions of each site, with often only minor differences in exclusion reported between the two versions.
This is because the limiting factors in the ability to use the web-shops are imposed by the hardware used in the interaction (the keyboard, mouse and screen) rather than the design of the web-shops themselves. The accessibility tools that are most universally praised and used by the students are the screen readers. Trying to complete purchases using only the screen readers quickly makes it clear why the pages have to be well-structured and also why options such as “Skip to Content” are so important.
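Of the checks in this section, the colour-contrast check lends itself particularly well to automation, because it follows directly from WCAG's published definitions of relative luminance and contrast ratio. A minimal implementation of those formulas:

```python
def relative_luminance(rgb):
    """WCAG relative luminance from 8-bit sRGB values."""
    def channel(c):
        c = c / 255.0
        # sRGB linearisation as defined in WCAG 2.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """WCAG contrast ratio; >= 4.5 passes level AA for normal-size text."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # → 21.0
```

Black on white gives the maximum ratio of 21:1; a site's actual foreground/background pairs can be fed in the same way to check them against the 4.5:1 threshold.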
5 Review of the Course The “Usability and Accessibility” course is under constant review to keep it fresh and relevant to the students. The review process involves student feedback as well as setting new pedagogical and learning goals. 5.1 Student Response to the Course Midway through each semester, the students on the course are invited to provide anonymised feedback through an online questionnaire. Responses are rated on a Likert scale of 1 (I completely disagree) to 6 (I completely agree).
The students are very satisfied overall with the course (response mean = 4.8 out of 6.0 to the statement “Overall conclusion: I am happy about this course”). They also feel that the quantity of work on the course is about right (mean response = 3.4 out of 6.0 to the statement “My time consumption for this course is above the norm of about 20 hours/week”) and that the course is highly relevant to their future employment (response mean = 5.1 out of 6.0 to the statement “I think this course is highly relevant for my future job profile”). These results indicate that the students respond positively to this course. Qualitative feedback demonstrates very clearly that the students respond most positively to the very practical and applied nature of the course, with its focus on learning pragmatic skills rather than simply classroom theory. 5.2 Course Name and Student Enrolment In Spring 2010, the name of the course was changed from “Usability with Project” to “Usability and Accessibility”. Figure 1 shows the student enrolment on the course before and after the name change. It can clearly be seen that the student enrolment decreased from a mean of 24.7 students to 14 students per semester with the change in the name of the course. This suggests that the concept of “accessibility” is still problematic in persuading students that this is a topic worthy of their attention. It is worth noting, though, that Denmark does not have a formal anti-discrimination law along the lines of the 1990 Americans with Disabilities Act [11] or the 1995 UK Disability Discrimination Act [12]. Thus it is not clear whether the student response to the course name change would be the same in countries where there is a clear legal imperative to consider accessibility in the design of websites.
Fig. 1. The number of students enrolled for each semester of the Usability with Project and the Usability and Accessibility courses
It is worth noting, though, that in their final exams the students are usually most excited by the work that they have done with the blind users. So it is not that they are prejudiced against the work, simply that they do not see the relevance of “accessibility” to their own careers. This shows that there is still work to be done in making accessibility and Universal Access more mainstream concepts.
5.3 Experimental Design and Analysis Pre-course It became clear during the first semester of the “Usability with Project” course that the students from a non-scientific background were struggling with the quantitative analysis elements of the project. It was clear, for example, that many of them had never been introduced to fundamental concepts such as probabilities and could not make the connection between a probability of 0.5 being the same as a 50% chance. As such, a new pre-course was introduced – “Experimental Design and Analysis.” This course runs in the second semester of the degree programme and teaches basic statistics assuming no pre-knowledge. It covers basic probabilities all the way up to multivariate analysis of variance. Since the introduction of this course, the overall quality of reports submitted for the “Usability with Project” and “Usability and Accessibility” courses improved substantially, with the students able to better understand the role of the statistical tests and spontaneously performing Kolmogorov-Smirnov or Q-Q plot analyses to ensure that the data is normally-distributed before applying a paired Student t-test or ANOVA. If the data is not normally-distributed, they usually perform a Wilcoxon signed-rank test. This is remarkable in students that have often had no statistical training prior to the “Experimental Design and Analysis” course. Since the introduction of that course, no students have lost marks because of incorrect statistical analyses in their projects.
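The analysis pipeline the students follow (check the normality of the paired differences, then choose a paired t-test or a Wilcoxon signed-rank test) can be sketched with SciPy. The data below are hypothetical task-completion times, and the Shapiro-Wilk test stands in for the normality check (the course text also mentions Kolmogorov-Smirnov and Q-Q plots):

```python
import numpy as np
from scipy import stats

def compare_paired(before, after, alpha=0.05):
    """Pick and run the appropriate paired comparison.

    A normality check on the paired differences comes first; a paired
    t-test is used if normality holds, a Wilcoxon signed-rank test otherwise.
    """
    diffs = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    normal = stats.shapiro(diffs).pvalue > alpha
    if normal:
        name, result = "paired t-test", stats.ttest_rel(before, after)
    else:
        name, result = "Wilcoxon signed-rank", stats.wilcoxon(before, after)
    return name, result.pvalue

# hypothetical task-completion times (seconds) on the frozen and revised sites
frozen = [61, 55, 70, 64, 58, 66, 73, 60]
revised = [52, 50, 63, 55, 49, 60, 64, 51]
name, p = compare_paired(frozen, revised)
print(name, round(p, 4))
```

Wrapping the decision in one function mirrors the marking criterion: the report must show that the normality check happened before the test was chosen, not merely that a test was run.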
6 Conclusions
Overall, the "Usability and Accessibility" course provides a model for teaching both usability and accessibility theory and practice in the later stages of a Bachelor programme or in a Masters programme. It shows that students from a wide variety of backgrounds can respond positively to the challenges the course presents, and the student response to the course is overwhelmingly positive. However, there is still room for concern over the tailing off of enrolment since the name change, which suggests that "accessibility," and thus Universal Access, is still widely perceived as a niche interest rather than a mainstream activity within the student community. Since the students on the DDK course are mature students with industry experience before enrolling in the programme, this suggests that the attitude is also widespread in Danish industry. This is clearly a challenge that needs to be met.
Teaching the Next Generation of Universal Access Designers: A Case Study
Use-State Analysis to Find Domains to Be Re-designed
Masami Maekawa and Toshiki Yamaoka
Abstract. Even when individual HCI problems are solved, the solutions rarely add up to a high-value improvement; they tend to be small fixes of narrow scope. This proposal therefore pays attention to the situations in which problems occur. We attempted to clarify the domains to be re-designed by applying mathematical analysis methods to use-state keywords extracted as data from descriptions of those situations. A trial experiment suggests that the method is feasible, although some limits of the method were also found. In addition, a comparative experiment confirmed how the results differ from those of a common classification method. Keywords: Use-state Analysis, Design, Human Computer Interaction, Context.
2 Literature Review
In the field of software engineering, requirements engineering offers techniques for clarifying what is to be designed and built. The basic approaches to acquiring and analysing requirements are the demand-extraction type, the goal-oriented type, and the domain-model type [3]. This chapter discusses the demand-extraction type, the scenario method used with it, and the goal-oriented type.
The demand-extraction type elicits demands primarily through interviews. It rests on the assumption that stakeholders, including users, hold demands latently. However, the quality of an interview depends on the interviewer's skill, and it is difficult to grasp the overall structure of the demands. These characteristics mean that each extracted demand tends to be solved individually; the resulting lack of coherence often produces contradictory or conflicting solutions, as shown in Fig. 1.

Fig. 1. Idea of solution with individual investigation

In the scenario method, states such as the user's behaviour and the system's reactions after the system is built are described as a scenario. Users' needs are extracted through communication with the user based on the scenario, which makes it easy to elicit demands of high validity [4]. To use the scenario method, the developer or designer must describe beforehand a scenario showing the state after the system is constructed. However, the discussion depends on what the scenario covers, and the result changes accordingly: when only a narrow part of a procedure is imagined, only demands within that narrow range tend to be found and discussed. What range of subjects to set in the scenario therefore remains an open issue for the developer or designer.
The goal-oriented type holds that demands on a system become clear by clarifying the stakeholders' goals. The top goal is then repeatedly resolved into sub-goals, and functions and operations are allocated to the lowest-level sub-goals [3]. Rather than solving each individual problem, this technique searches for root causes through the causal relations between problems and derives goals from them. However, HCI problems have complex and unclear causal relations [5], so the analysis methods used in the goal-oriented type are not easy to apply to them.
Contextual design is an analysis method that uses user surveys and modelling to understand the context and situation of product use inclusively. It begins with unstructured interviews with users; the use context is then modelled from five viewpoints ("Flow", "Sequence", "Artifact", "Culture", and "Physical"), and solutions are found inductively from these models. Its strength is that the entire use state can be described [6]. What is wanted now is a method that finds the fundamental problem by grasping the relations between the small problems that have been found, as shown in Fig. 2.

Fig. 2. Idea of solution with use-state analysis
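The goal decomposition used by the goal-oriented type can be pictured with a small sketch: the top goal is resolved into sub-goals, and functions are allocated to the leaf sub-goals. The tree and function names below are invented for illustration.

```python
# A minimal goal tree in the spirit of goal-oriented requirements analysis:
# the top goal is repeatedly resolved into sub-goals, and functions are
# allocated to the leaf sub-goals (all names are illustrative).
goal_tree = {
    "find a relevant book": {
        "formulate a query": {
            "enter keywords": "function: keyword input field",
            "restrict by category": "function: category filter",
        },
        "judge the results": {
            "scan result summaries": "function: result list view",
            "check availability": "function: holdings display",
        },
    },
}

def leaf_functions(node):
    """Collect the functions allocated to leaf sub-goals, depth-first."""
    if isinstance(node, str):
        return [node]
    return [f for child in node.values() for f in leaf_functions(child)]

print(leaf_functions(goal_tree))
```

Walking the tree yields the functions that the decomposition has allocated; in HCI, as the text notes, building such a tree is hard because the causal relations between problems are unclear.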
The aim of this study resembles the goal-oriented type and contextual design in that it targets inclusive re-design rather than the improvement of individual problems. In the proposed technique, the problem itself receives little attention, because the focus is not on causal relations but on the situation in which the problem occurs. In contextual design, the situation and state are captured as elements of an inclusive understanding of the current context, not as objects of mathematical analysis. The proposed technique instead treats the situation and state as a group of information, including background, reasons, and remote causes, that relates to the found problem. Use-state keywords describing the context in which problems occur are treated as data to be analysed mathematically, and the domains to be re-designed are worked out from them.
3 Process
3.1 Problem Finding
Techniques such as task analysis, observation, and protocol analysis are used to find problems. Except for protocol analysis, these techniques can be carried out without recruiting participants, so they are often used in product development. When applying them, the scenes in which the product will be used are specified. Recording the situations in which problems occur becomes easier when they are described using the situation items listed below.
1. Situation items concerning the user
• Attribute or sense of value
• Role
• Physical posture
• Psychological condition
2. Situation items concerning the background
• Stage of work or time zone
• Environment or place
• Limiting condition
• Things used alongside
3. Situation item concerning the interface
• Thing concerned
Fig. 3. Structure of situation items of use
Fig. 3 shows the structure of the use aspects to be observed or assumed [7]. Each element can be expressed with the situation items, and the arrows connecting the elements show their relations. The arrows are also thought to carry limiting conditions between elements, and these limiting conditions may become design requirements. The situation items were derived as follows. Descriptions and items that define the situation are included in the user roles, personas, and scenarios used to assume the situations in which a product will be used. The situation items concerning the user can therefore be drawn chiefly from the elements of user roles and personas, and the other situation items from example scenarios. Specifically, the set of "Needs", "Concerns", "Expectations", "Behaviour", and "Responsibilities" included in the user roles advocated by Constantine and Lockwood [8] was enumerated as "Role" and "Attribute or sense of value". The characteristics of the persona advocated by Cooper [9] were enumerated as "Attribute or sense of value". Moreover, Carroll describes scenarios as follows: references to and assumptions about the situation setting are included in the scenario, which makes the status clear. The scenario is therefore thought to include the situation items needed to describe the status correctly. The contents of scenarios described in several documents considered good examples were surveyed, and the following elements were extracted: "Thing concerned" as an item concerning the interface; "Stage of work or time zone", "Place", "Limiting condition", and "Things used alongside" as items concerning the background; and "Attribute or sense of value", "Role", "Posture", and "Psychological condition" as items concerning the user. The situations concerning the background include limiting conditions, whether technical, social, physical, or economic, that are thought to be remote causes. "Things used alongside" are things the user needs in order to accomplish the work; they supplement the product being used to achieve it. "Things concerned" are the objects of interaction, such as buttons and displays. A weakness of task analysis performed by designers as an inspection method is that the assumed situations are limited to the range in which the designers already recognise that problems may happen. Keeping the above situation items in mind cannot eliminate this weakness, but it should make it easier to avoid missing situations that ought to be assumed. Furthermore, if the assumed situations of each found problem are recorded, any situation that was not assumed is clearly absent from the record. When an insufficient assumption in the problem-finding step is recognised, problems under the missed situation can then be added. Thus, traceability of the assumed situations in the problem-finding step is secured, and recovery is expected to be easy and swift.
3.2 Description of Situation and Extraction of Use-State Keywords
Problems are found using task analysis, observation, protocol analysis, and similar techniques. They are recorded, and descriptions of the situations in which the problems occur are recorded in rows divided by the situation items.
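As an illustration of how a found problem might be recorded against the situation items of Section 3.1 and its use-state keywords extracted, consider the sketch below. All field values and the choice of a dictionary encoding are invented; the paper prescribes no particular data format.

```python
# A hypothetical record of one found problem, using the situation items of
# Section 3.1 as the template (all values are invented for illustration).
problem = {
    "problem": "User cannot tell which field accepts the author name",
    "situation": {
        "attribute_or_sense_of_value": "infrequent library visitor",
        "role": "student preparing a report",
        "physical_posture": "seated at a public terminal",
        "psychological_condition": "in a hurry",
        "stage_of_work_or_time_zone": "start of search",
        "environment_or_place": "library entrance hall",
        "limiting_condition": "no staff nearby to ask",
        "things_used_alongside": "memo with the book title",
        "thing_concerned": "search form input fields",
    },
}

def extract_keywords(record, items):
    """Pull the keyword-like phrases for the situation items chosen for
    analysis (the trial experiment in Section 4 used four of the items)."""
    return [record["situation"][item] for item in items]

chosen = ["stage_of_work_or_time_zone", "limiting_condition",
          "things_used_alongside", "thing_concerned"]
print(extract_keywords(problem, chosen))
```

In practice the extraction is done by a human reading each situation description, and one description may yield several keywords; the sketch only shows the shape of the data that feeds the analysis step.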
Keywords that plainly express the situation are extracted from each situation description; two or more keywords may be extracted from one description.
3.3 Analysis
Once the use-state keywords are extracted, it is possible to focus on a keyword common to two or more problems and pull out the problems that fall under it. However, because the association between problems and use-state keywords is many-to-many, it is not realistic to list the corresponding problems for every keyword, and a view centred on individual keywords makes it hard to grasp the range of the problems. The keywords are therefore analysed and classified into patterns with Hayashi's quantification method III. First, a categorical data table is made, with the use-state keywords on the horizontal axis and the problems on the vertical axis. A cell is set to "1" if the problem falls under the use-state keyword, and "0" otherwise. Next, cluster analysis is performed on the category scores of each keyword in the extracted dimensions. The resulting dendrogram is expected to make the relations between use-state keywords easy to understand. The problem descriptions themselves are not analysed, but they are used as reference information when the contents of the clusters are interpreted. Moreover, if the product concept is prepared before the analysis, needless or inappropriate use-state keywords can be excluded from the data, which should make it easier to find problems under appropriate situations that agree with the concept.
3.4 Interpretation
One product has various kinds of problems arising in various situations, so various domains to be re-designed naturally exist. The use-state keywords are divided into clusters according to the shape of the dendrogram, and the domains are specified by interpreting the contents of each cluster. A description of a specific problem may make a domain easier to picture; in that case it can be used as reference information. In this way, the domains can be derived by treating the use-state keywords extracted from the situation descriptions as data and applying cluster analysis.
3.5 Using the Interpreted Contents
This proposal does not pay attention only to the problems that have already surfaced. Because the interpreted contents describe domains and subjects rather than problems, the approach suits development of the creation type rather than the problem-solving type. An idea that does not fit the shape or UI of a physical product may therefore be conceived; in that case it is likely that a service or function, rather than the product itself, will be designed.
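The tabulation and clustering pipeline of Section 3.3 can be sketched in code. The sketch is illustrative only: it replaces Hayashi's quantification method III (which is closely related to correspondence analysis) with a simple Jaccard co-occurrence distance between keywords, and uses single-linkage agglomerative clustering in place of a full dendrogram. All problem and keyword names are invented.

```python
from itertools import combinations

# Toy incidence data: each problem is tagged with the use-state keywords
# it falls under (the 0/1 table of Section 3.3, stored as sets).
problems = {
    "P1": {"searching", "first-visit", "keyword-entry"},
    "P2": {"searching", "keyword-entry"},
    "P3": {"browsing-results", "first-visit"},
    "P4": {"browsing-results", "map"},
    "P5": {"map", "first-visit"},
}
keywords = sorted(set().union(*problems.values()))

def jaccard_distance(a, b):
    """Distance between two keywords = 1 - Jaccard overlap of the problem
    sets they occur in (a stand-in for the category-score distance that
    quantification method III would provide)."""
    pa = {p for p, ks in problems.items() if a in ks}
    pb = {p for p, ks in problems.items() if b in ks}
    return 1 - len(pa & pb) / len(pa | pb)

def cluster(items, dist, n_clusters):
    """Single-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [{i} for i in items]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(dist(a, b)
                               for a in clusters[ij[0]]
                               for b in clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters

for c in cluster(keywords, jaccard_distance, 3):
    print(sorted(c))
```

Each resulting group of keywords would then be interpreted, as in Section 3.4, as a candidate domain to be re-designed.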
4 Trial Experiment
To confirm the effect of the proposed technique, a trial experiment was carried out. The subject was the book-search web pages of three libraries.
4.1 Methodology
The experiment collaborators were six students of design or practising designers, labelled A-F; all had prior experience of task analysis and all used the internet daily. All problems were brought together in one sheet, and the situation of each problem's occurrence was described according to the situation items. Next, use-state keywords were extracted from the situation descriptions. To keep the wording and extraction of keywords consistent, a single experimenter extracted all the use-state keywords from all the situation descriptions. The differences in the assumed situations for the items concerning the user and for "Place" were judged to be minor, so these were excluded from the data; "Stage of work or time zone", "Limiting condition", "Things used alongside", and "Thing concerned" were used in the analysis. The use-state keywords were analysed with Hayashi's quantification method III to examine their patterns, and cluster analysis was then performed on the category scores of each keyword on each dimension. The resulting dendrogram shows the relations between the use-state keywords in tree form, which makes the situations easier to picture, clarifies how they relate, and prompts discussion of the situations assumed for re-design.
A comparative experiment was then conducted against the proposed technique. The method was to classify the problems through discussion, in the manner of the KJ method, a common practice in design work. The collaborators were four designers who had not taken part in the trial experiment. The data were the 190 problems previously extracted in the trial experiment; neither the situation descriptions nor the use-state keywords were used at all. After confirming the outline of the book-search web system, the collaborators began classifying the problem descriptions.
4.2 Result
190 problems were found, including repetitions, and 68 kinds of use-state keywords were extracted. The keywords were analysed as categorical data with Hayashi's quantification method III, and the keywords and the obtained scores on 5 axes were then subjected to cluster analysis, which produced the dendrogram shown in Fig. 4. From the shape of the dendrogram, the number of clusters was set to 6, and the candidate domains to be re-designed are summarised in Fig. 5 together with the use-state keywords included in each cluster.
Fig. 4. Dendrogram of state keywords
Fig. 5. Clustered use-state keywords and domains extracted
Turning to the comparative experiment, the problems were ultimately classified into seven kinds: "Retrieval", "Visual representation and display", "Semantic content transferability of mark", "Layout", "Procedure design", "Map", and "Help". Because each problem has many aspects to be considered, it could have been classified in many different ways by aspect; in practice, plural aspects mixed during classification, and the classification above was the result, apparently influenced by the collaborators' experience and by how easy and plain each class was to form. Many of the resulting ideas focused on a specific button or display, and some added functions. No idea aimed at reconsidering whether a function should exist at all; the ideas instead made existing functions easier to use. Overall, the ideas derived from the problems in each class were narrowly scoped, and most were specific plans to solve individual problems rather than responses to a problem theme.
5 Consideration
5.1 Comparison with a Common Technique
Contents belonging to two or more of the groups formed in the comparative experiment appear together in almost every cluster derived by the proposed technique. For instance, use-state keywords grouped by the common technique under "Search", "Procedure design", and "Help" are included in one cluster, and keywords belonging to "Search", "Visual representation and display", and "Layout" are spread across three clusters. This suggests that the proposed technique has the effect of inviting combined domains to be re-designed.
5.2 Problem Finding and Situation Description
Using the situation items as a template for describing the situations in which problems occur appears to support a detailed grasp of each problem, including its situation. In the experiment, the situation items were narrowed to the 4 kinds mentioned above. Naturally, only the keywords that appear in the dendrogram are data showing the situation, and this is not enough: the situations and states excluded from the analysis must be added back, as common states, to perceive each situation correctly. In an actual design process it is usual for the status to be limited according to the product concept, which suggests that the proposed method can be used after selecting situation items in conformity with the concept. However, when the situation items analysed are extremely few, the technique is expected to have little effect; this point requires further work.
5.3 Use-State Keywords
The number of use-state keywords was considered per collaborator and per product. Few keywords are extracted from a single product; to extract varied keywords, it was confirmed that problems should be collected from plural products. The likely reason is that each product is designed for its own assumed situations and restrictions, so investigating plural products surfaces a variety of them as problems. No significant difference in the number of keywords was confirmed between three people each examining a different product and one person examining every product; however, the number of keywords obtained by one person is relatively small even when plural products are investigated.
By bringing together each person's investigation of each different product, varied keywords are included in the data, which leads to use-state keyword extraction with few omissions.
5.4 Analysis and Interpretation
By interpreting a group of use-state keywords with reference to the depth of the relations between keywords, as read from the form of the dendrogram, the problem situations can mostly be pictured. The proposed method aims to contribute to the development of more creative and attractive products; its advantage is that latent domains to be re-designed can be derived from already recognised problems, and the analysis results can potentially be used as objective information. The method is expected to function well in the upstream process of product development. When the use-state keywords classified into each cluster were examined, it was found that two or more keywords belonging to "Stage of work or time zone" are rarely included in the same cluster. This is a limit that depends on the sorting algorithm of the analysis method; another way is therefore needed to clarify wider domains spanning plural stages of work or time zones.
6 Conclusion and Future Plan
This study proposed a technique for deriving the domains to be re-designed in order to develop attractive products. The following points were confirmed. The domains to be re-designed can be found by mathematically analysing the use-state keywords extracted from descriptions of the problem situations. Until now, these domains have often been decided subjectively, from the experience and intuition of designers, so the technique may prove effective in the upstream stage of product development. However, the method has the limit that another way is needed to find wider domains spanning plural stages, and research on this theme is necessary. Additional research is also needed to clarify the applicable conditions and restrictions: for instance, the method may be difficult to use when the situation items are few, or when serious problems occur only in unusual situations. Research on these points should be advanced further.
References
1. Preece, J., et al.: Human-Computer Interaction. Addison-Wesley, UK (1994)
2. Kurosu, M., Ito, M., Tokitsu, T.: Guide Book of User Engineering, p. 124. Kyoritsu Shuppan, Tokyo (1999) (in Japanese)
3. Ohnishi, J., Tsumaki, T., Shirogane, J.: Introduction to Requirements Engineering. Kindaikagakusha, Tokyo (2009) (in Japanese)
4. Carroll, J.M.: Making Use: Scenario-Based Design of Human-Computer Interactions. The MIT Press, Boston (2000)
5. Tahira, H.: Usability Problem Solving with Graph Theory. In: Conf. of Human Interface Symposium, Sapporo (2002) (in Japanese)
6. Beyer, H., Holtzblatt, K.: Contextual Design. Morgan Kaufmann Publishers Inc., San Francisco (1998)
7. Maekawa, M., Yamaoka, T.: One Proposal Concerning Framework to Clarify Design Requirement. In: Proceedings of the 57th Annual Conference of JSSD, Ueda (2010) (in Japanese)
8. Constantine, L., Lockwood, L.: Software for Use. ACM Press, Boston (1999)
9. Cooper, A.: About Face 3. Wiley Publishing Inc., Indianapolis (2007)
An Approach towards Considering Users' Understanding in Product Design Anna Mieczakowski, Patrick Langdon, and P. John Clarkson Engineering Design Centre, Department of Engineering, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ, United Kingdom {akm51,pml24,pjc10}@eng.cam.ac.uk
Abstract. Although different techniques for supporting the process of designing exist, there is, at present, no easy-to-use and pragmatic way of helping designers to infer and analyse product representations that users form in their heads and to compare them with designers’ own understanding of products. This paper is part of ongoing research that attempts to develop an approach for supporting designers in identifying, during the early stages of the design process, whether specific product features evoke similar understanding and responses among the users as among the designers of those features. Keywords: Inclusive Design, Product-User Interaction, Mental Models, Cognitive Representations, Prior Experience.
1 Introduction
Features of many modern products are largely unusable for the majority of users, as they frequently embody the intentions of designers and unverified assumptions about users' needs and wants. To design more inclusively, designers require guidance and tools that help them better understand users' understanding of product features, the goals users want to achieve in relation to products, and the actions they exert on product features to achieve those goals. This paper describes the goal and structure of a new easy-to-use and pragmatic technique for designers, called the Goal-Action-Belief-Object (GABO) approach, developed to help designers model users' understanding and use of products and compare it with their own conceptual models.
postulating that it is a small-scale model of external reality that people carry in their heads and use to consider various actions, conclude which is best, and apply that knowledge to future problem-solving. Nearly forty years later, Johnson-Laird [8] suggested that individuals construct internal models of the external world that enable them to make inferences and predictions, understand and explain phenomena, decide what actions to perform and control their execution. Norman [10] argues that the system image observable by the user should be consistent with the designer's conceptual model, and that the mental model a user brings to bear on a given task should be consistent with both. Payne [15] claims that users' models of products are fragmentary, and that as a result people have difficulty interacting correctly with products. diSessa [12], meanwhile, notes that mental models are heavily influenced by knowledge of and experience with previously encountered products.
3 Prior Experience
Experience is a critical factor in how easy a product is to learn and use [1]. Rasmussen [3] claims that people process information on three levels: (1) the skill-based level, which is automatic and non-conscious; (2) the rule-based level, which is guided by 'if (precondition) then (action)' rules; and (3) the knowledge-based level, which handles unfamiliar situations for which no rules from previous experience are available. Reason [11] suggests that rule-based attempts at tasks are always tried first, as people in general are "furious pattern matchers". If the problem can be pattern-matched and only minor corrective rules need to be applied, processing takes place at the skill-based level. If, however, a 'pre-packaged solution' cannot be found at the rule-based level, information processing is carried out at the slower and more laborious knowledge-based level.
4 Modelling Design Activity
It has been suggested that extracting, measuring and comparing mental models is best achieved through modelling [18]. Payne [14] believes that models of user interaction with products should be developed with a view to furthering psychological theory and providing conceptual and practical models in HCI. However, a plethora of tools for designers were developed well before the theoretical foundations of the concept of mental models were sufficiently established, and as a direct consequence those models are "very confusing and lack the predictive power to be of any practical or explanatory value" [16]. A further consequence is that product designers are given very little guidance in adequately representing and comparing functional, conceptual and user information during product design. Following a number of years spent researching design activity at a large UK engineering company, Aurisicchio and Bracewell [17] propose the use of diagrams for documenting the structure of design information. They argue that diagrams can be supported by a wide range of computer-based diagramming tools (e.g., Visio, SmartDraw, Mindjet, Compendium, Cambridge Advanced Modeller), and, since they are more visual than linear text documents, they are better for spotting patterns and
gaining insights. Also, using diagrams is beneficial as they group together information that needs to be used together and minimally label that information, place similar information at adjacent locations, minimise shifts of attention and automatically support a number of perceptual inferences.
5 Requirements for an Easy-to-Use Modelling Approach
Modelling techniques such as ACT-R [5], GOMS [6], SOAR [2] and TAFEI [4] were previously developed to help designers focus on users' goals and actions. Most of them incorporate a task analysis of what a person is required to do to achieve a goal and what operational difficulties they face [9]. However, for various reasons, including the complexity of their architectures and the specific skills needed to use them, none has so far been effectively transferred to product design in a way that benefits end users. Consequently, to offset the lack of an easy-to-use and pragmatic technique for modelling users' understanding and use of products and comparing it with designers' conceptual models, we developed the Goal-Action-Belief-Object (GABO) modelling approach for designers. The GABO approach was developed in conjunction with information elicited from interviews with twenty product designers and observations of fifty users of everyday products. Our goals for its development were that it should be visual, easy and quick to understand, implement and use; that it should lead to improvements in design practice that increase the chances of producing an accessible and usable product; and that designers should find significant productivity and differentiation gains in using it.
6 Rules of the GABO Modelling Approach

The GABO approach works on the premise that the contents of people's minds (i.e. their knowledge, theories and beliefs) should be explored by designers in order to better understand users' behaviour in relation to products [15]. Consequently, it aims to represent the minimum information about users' preferences needed to identify areas of a design where people's mental models are compatible with the designer model and where they differ. It should be noted that the GABO approach does not focus on the structure of the human mind and is not in any way aimed at representing the actively changing parts of mental models. Instead, it intends to capture and represent the current, static contents of users' mental models during interaction with a given product.

6.1 Structure of the GABO Modelling Approach

The GABO approach encourages designers to focus on users' understanding and use of products by representing the following elements:
1. goals that users want to achieve during interaction with products,
2. correct and incorrect actions that they exert on product interfaces,
3. beliefs about actions that they bring from previous interactions with other products to interactions with new products, and
4. understanding of the impact of their actions on functional objects.
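The four element types above lend themselves to a labelled-graph representation. The following Python sketch is illustrative only (the class and label names are our assumptions, not part of the GABO specification):

```python
# Hypothetical sketch of a GABO model as a labelled graph.
# Element kinds follow the paper; everything else is illustrative.
from dataclasses import dataclass, field

ELEMENT_TYPES = {"goal", "action", "belief", "object"}

@dataclass(frozen=True)
class Element:
    kind: str   # one of ELEMENT_TYPES
    label: str  # semantic annotation, e.g. "toast bread"

@dataclass
class GaboModel:
    vertices: set = field(default_factory=set)  # set of Element
    edges: set = field(default_factory=set)     # set of frozenset vertex pairs

    def add(self, element: Element) -> Element:
        assert element.kind in ELEMENT_TYPES
        self.vertices.add(element)
        return element

    def link(self, a: Element, b: Element) -> None:
        self.edges.add(frozenset((a, b)))

user = GaboModel()
goal = user.add(Element("goal", "toast bread"))
action = user.add(Element("action", "pressing lever"))
user.link(goal, action)
print(len(user.vertices), len(user.edges))  # 2 1
```

Representing models as vertex and edge sets in this way also makes the set-theoretic comparison described in Section 6.2 straightforward.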
An Approach towards Considering Users' Understanding in Product Design
When using the GABO approach, designers are required to create three types of models: (1) an engineering model, (2) a designer model, and (3) a number of individual user models. The engineering model is essentially a functional analysis diagram encompassing the objects of a given product; the engineer's beliefs about the function of each object; the external interface-based and internal object-based actions that need to be taken to change objects' states and transform available inputs into the desired outputs; and the goals that pertain to actions driven by human input and applied to the objects. The structure of the engineering model differs from that of the designer model and the individual user models, as its purpose is to act as a reference for designers when they construct the designer model. The designer model includes the overall goal(s) that the designer thinks potential users should set for their product usage; the actions that need to be taken to bring users to achieving their goals with that product; the beliefs that correspond to users' actions and contain the designer's description of the appearance, functionality and behaviour of that product; and the objects that designers envisage sitting on top of the product's interface and being exploited by users. The structure of an individual user model is very similar to that of the designer model, as it captures the overall goal(s) that a given user wants to accomplish with a particular product; the actions that the user wants and thinks they need to perform to achieve those goal(s); the beliefs that correspond to the user's actions and provide the user's internal understanding of the appearance, functionality and behaviour of that product; and the objects that the user uses to carry out their actions. The DRed platform [17] has been used during development of the GABO approach as a testing ground for representation and comparison purposes.
However, it should be noted that the GABO approach is intended to be platform-independent. When the DRed software was used, the GABO approach's goal, action, belief and object elements were each assigned a corresponding DRed element. Accordingly, DRed's Task (Pending) element was used as the goal symbol, the Answer (Open) element was chosen to represent an action, the Issue (Open) element was used as the belief symbol, and the Block (Internal) element was selected as the counterpart for an object. The GABO approach's corresponding DRed elements are shown in Table 1.

Table 1. Four elements of the GABO approach and their corresponding DRed symbols

GABO Approach Element Type | Corresponding DRed Element Type
Goal                       | Task (Pending)
Action                     | Answer (Open)
Belief                     | Issue (Open)
Object                     | Block (Internal)
The modelling of products using the GABO approach involves four stages. In the first stage, designers refer to a drawing of the actual engineering model of a given product to better understand which product elements and operations are functionally and structurally possible before they design the interface features that will sit on top of the engineering parts. Figure 1 shows an example of an engineering model of a complex-to-use toaster represented using the GABO approach in the DRed software.
Fig. 1. Engineering model of a complex-to-use toaster drawn using the GABO approach
In this example, the engineering model has been represented using the object elements; the belief elements, to explain the objective of each object; and the main relationships between different objects. The action element was used to connote human input, and DRed's External Block element to signify external entities such as human input, bread, mains power and environmental air. In the second stage of using the GABO approach, designers are required to draw one collective product model using goal, action, belief and object elements and to accompany each element with simple descriptive text based on a rigorous semantic coding language of their own choice, for the purpose of pattern-matching elements between designer and user models during model comparison. For instance, the semantic coding language used to draw the designer and user models in this paper uses different verb and noun combinations depending on the type of element being described: goal (verb + noun), action (verb+ing + noun), belief (noun + to + verb) and object (noun). An example of a designer model of a complex-to-use toaster represented using the GABO approach in DRed is shown in Figure 2. Due to the sheer size of this designer model, only a part of it is included in this paper.
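The verb/noun coding scheme can be checked mechanically. The regexes and example labels below are our own illustrative assumptions, not the authors' exact grammar:

```python
# Minimal sketch of the verb/noun coding patterns for the four GABO
# element types (patterns and labels are illustrative assumptions).
import re

PATTERNS = {
    "goal":   re.compile(r"[a-z]+ [a-z]+"),      # verb + noun, e.g. "toast bread"
    "action": re.compile(r"[a-z]+ing [a-z]+"),   # verb+ing + noun, e.g. "pressing lever"
    "belief": re.compile(r"[a-z]+ to [a-z]+"),   # noun + to + verb, e.g. "lever to press"
    "object": re.compile(r"[a-z]+"),             # noun, e.g. "lever"
}

def valid_label(kind: str, label: str) -> bool:
    """Return True if the label matches the coding pattern for its element type."""
    return bool(PATTERNS[kind].fullmatch(label))

print(valid_label("goal", "toast bread"))     # True
print(valid_label("action", "pressing lever"))  # True
print(valid_label("action", "press lever"))   # False: missing -ing form
```

Enforcing a fixed grammar like this is what makes the later pattern-matching between designer and user model elements well defined.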
Fig. 2. Part of designer model of a complex-to-use toaster drawn using the GABO approach
In this example, the designer model of a complex-to-use toaster has been represented using the goal element, to signify the master user goal pertaining to the toaster's usage; the action elements, to indicate the sequence of actions that designers think users have to perform to accomplish their goal with the toaster interface; the belief elements, to provide the designer's description of the appearance, functionality and behaviour of the toaster features on which users are required to exert their actions, and to elucidate the position of different features and how user actions need to be taken to operate them correctly; and the object elements, to specify the functional features that designers envisage sitting on top of the toaster interface and being exploited by users.
Fig. 3. User model of a complex-to-use toaster drawn using the GABO approach
In the third stage of using the GABO approach, designers need to run observations with users of products using a verbal (think-aloud) protocol and keep a log of users' goals, actions, beliefs and objects. Subsequently, designers should use the information collected from study participants to draw several individual user models using goal, action, belief and object elements and, as in the designer model, accompany each element with simple descriptive text based on a rigorous semantic coding language of their own choice. As with the elements used in the designer model example, the elements in the individual user models have been annotated with four different verb and noun combinations. An example of an individual user model of a complex-to-use toaster represented using the GABO approach in DRed is shown in Figure 3. In this example, the individual user model of a complex-to-use toaster has been represented using the goal element, to signify the master user goal pertaining to the toaster's usage; the action elements, to indicate the sequence of actions that the user wants and thinks they need to perform to accomplish their goal with the toaster interface; the belief elements, to provide the user's internal understanding of the appearance, functionality and behaviour of the toaster features on which they are required to exert their actions, and to convey the position of different features and how the user thinks their actions need to be taken to operate these features correctly; and lastly the object elements, to stipulate the functional features that the user is familiar with, accustomed to operating and immediately associates with a toaster. It should be noted that the block element has a blue background only in the designer model; the background colour of all the transcluded block elements in the user models is white.
In the fourth stage of using the GABO approach, designers need to compare the similarities and differences between each user model and the designer model using an appropriate algorithm (either manually or computationally), check the degree of compatibility between the designer model and the individual user models, and make appropriate design decisions relating to the inclusivity of future product features.

6.2 The GABO Comparison Procedure

The GABO approach stipulates that any two models (a designer model and an individual user model) are compared on the basis of: (1) the presence of the same vertices in the two models, and (2) the connectivity between two given vertices in the two models. The comparison procedure is carried out (either manually or computationally) using a function from set theory that measures similarity between graphs with common vertex and edge sets [13]. This function is used for measuring both the presence of vertices and the connectivity between vertices in the designer model and the individual user models, with the designer model acting as the main model against which each user model is checked for compatibility. The assumption is that by using this function, designers will be able to make close estimates of the degree of compatibility of their intended goals, actions, beliefs and objects regarding product usage with those of heterogeneous users. The function for checking the presence of vertices in the designer model and the user models is as follows:
P(D, U) = |V_D ∩ V_U| / |V_D| ,    (1)
where
• P = a value between 0 and 1, where
  o 0 means that an individual user model is not compatible with the designer model,
  o 1 means that an individual user model is 100% compatible with the designer model, and
  o any value between 0 and 1 indicates the degree of compatibility of a user model with the designer model (e.g. if P(D, U) = 28/88, then the compatibility level equals 0.31, i.e. 31% compatible);
• D = designer model;
• U = user model;
• V_D = the set of vertices in the designer model;
• V_U = the set of vertices in the user model;
• V_D ∩ V_U = the set of all vertices that are members of both V_D and V_U.
This function assumes that two vertices (one from the designer model and one from the user model) are the same if they belong to the same element type, for instance the belief element type, and contain the same semantic grammar. The function for checking the connectivity of vertices in the designer model and each user model is as follows:
C(D, U) = |E_D ∩ E_U| / |E_D| ,    (2)
where
• C = a value between 0 and 1, where
  o 0 means that an individual user model is not compatible with (dissimilar to) the designer model,
  o 1 means that an individual user model is 100% compatible with (identical to) the designer model, and
  o any value between 0 and 1 indicates the degree of compatibility of a user model with the designer model (e.g. if C(D, U) = 30/114, then the compatibility level equals 0.26, i.e. 26% compatible);
• D = designer model;
• U = user model;
• E_D = the set of edges in the designer model;
• E_U = the set of edges in the user model;
• E_D ∩ E_U = the set of all edges that are members of both E_D and E_U.
The connectivity function assumes that two edges are equal if they join two vertices that belong to the same element type and have identical semantics.
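Because both measures are ratios of set intersections, they are direct to compute once the models are encoded as sets. A minimal Python sketch (the toy vertex labels are our illustrative assumptions):

```python
# Sketch of the two compatibility measures, Eqs. (1) and (2), using
# Python sets. Vertices are (element_type, label) pairs and edges are
# frozensets of vertices, matching the equality rules stated above.
def presence(d_vertices: set, u_vertices: set) -> float:
    """P(D, U) = |V_D ∩ V_U| / |V_D|"""
    return len(d_vertices & u_vertices) / len(d_vertices)

def connectivity(d_edges: set, u_edges: set) -> float:
    """C(D, U) = |E_D ∩ E_U| / |E_D|"""
    return len(d_edges & u_edges) / len(d_edges)

# Toy designer model D and user model U (labels are assumptions):
VD = {("goal", "toast bread"), ("action", "pressing lever"), ("object", "lever")}
VU = {("goal", "toast bread"), ("object", "lever")}
ED = {frozenset({("goal", "toast bread"), ("action", "pressing lever")})}
EU = set()

print(round(presence(VD, VU), 2))  # 0.67 -> 67% of designer vertices shared
print(connectivity(ED, EU))        # 0.0  -> no designer edges shared
```

Note that dividing by |V_D| (and |E_D|) makes the measure asymmetric: it scores how much of the designer model is recovered in a user model, which matches the paper's use of the designer model as the reference.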
7 Evaluation of the GABO Modelling Approach

The usefulness and effectiveness of the GABO approach were evaluated with eight designers, aged between 29 and 52, from a range of small and large organisations based in the UK, during two five-hour workshop sessions. The designers were asked to work on two redesign tasks: one required them to redesign the interface of a household product with a complex-to-use interface (either a toaster or a coffee maker) using a method of their choice, and the other required them to redesign the interface of one of the aforementioned two products using the GABO approach. When the tasks were completed, each designer was asked to individually fill out an evaluation questionnaire composed of a number of quantitative and qualitative questions. Overall, the designers gave an average rating of 5 on a 7-point scale for how useful the GABO approach was in identifying and capturing users' understanding and the problems users encounter during product use. The same procedure was used to investigate how well the approach helped designers compare their understanding of product functionality with that of users, yielding an average rating of 5.5. For ease of use, the designers gave the GABO approach an average score of 4.3. In addition, five designers believed that the GABO approach helped them to produce a better design than the alternative approach, while three designers said that they would need more time with it to determine whether it was better or worse than the alternative method.
8 Discussion and Conclusion

This paper discussed the role of mental models and prior experience in product-user interaction and existing modelling techniques for representing users' goals and actions. Since there is at present no easy-to-use, pragmatic technique for representing and comparing designers' and users' understanding and usage of everyday products, this paper proposes the GABO approach for designers, which bridges that gap. The GABO approach consists of four stages in which designers need to: (1) refer to the engineering model of a product to better understand how different product parts interact with one another; (2) create a designer model of that product using appropriately annotated goal, action, belief and object elements and compare it with the engineering model to see what features should be mounted on top of the underlying functional parts; (3) investigate how different users understand and use product features, and create several individual user models using goal, action, belief and object elements annotated in the same semantic style as their counterpart elements in the designer model; and (4) compare the designer model with the individual user models using a function from set theory (either manually or computationally), check the degree of compatibility between the designer model and the user models, and make appropriate design decisions relating to the inclusivity of future product features. Results from the evaluation study with eight designers indicate that designers find the GABO approach fairly useful and effective in identifying key similarities and differences in designers' and users' understanding and usage of products.
References

1. Langdon, P.M., Lewis, T., Clarkson, P.J.: Prior experience in the use of domestic product interfaces. Universal Access in the Information Society 9, 209–225 (2009)
2. Laird, J.E., Newell, A., Rosenbloom, P.S.: SOAR: An architecture for general intelligence. Artificial Intelligence 33, 1–64 (1987)
3. Rasmussen, J.: Skills, rules, and knowledge: Signals, signs, and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics 13, 257–266 (1983)
4. Stanton, N.A., Baber, C.: Validating task analysis for error identification: Reliability and validity of a human error prediction technique. Ergonomics 48, 1097–1113 (2005)
5. Anderson, J.R.: Rules of the mind. Lawrence Erlbaum Associates, Hillsdale (1993)
6. Card, S., Moran, T.P., Newell, A.: The psychology of human-computer interaction. Lawrence Erlbaum Associates, Hillsdale (1983)
7. Craik, K.J.W.: The nature of explanation. Cambridge University Press, Cambridge (1943)
8. Johnson-Laird, P.N.: Mental models. Harvard University Press, Cambridge (1983)
9. Kirwan, B., Ainsworth, L.K.: A guide to task analysis: The Task Analysis Working Group. Taylor and Francis, London (1992)
10. Norman, D.A.: The design of everyday things. Basic Books, London (2002)
11. Reason, J.T.: Human error. Cambridge University Press, Cambridge (1990)
12. di Sessa, A.: Models of computation. In: Norman, D.A., Draper, S.W. (eds.) User-centered system design: New perspectives in human-computer interaction, pp. 201–218. Lawrence Erlbaum Associates, Hillsdale (1986)
13. Goldsmith, T.E., Davenport, D.M.: Assessing structural similarity in graphs. In: Schvaneveldt, R.W. (ed.) Pathfinder associative networks, pp. 75–87. Ablex Publishing Corporation, Norwood (1990)
14. Payne, S.J.: On mental models and cognitive artefacts. In: Rogers, Y., Rutherford, A., Bibby, P.A. (eds.) Models in the mind: Theory, perspective and application, pp. 103–118. Academic Press, London (1992)
15. Payne, S.J.: Mental models in human-computer interaction. In: Jacko, J.A., Sears, A. (eds.) The human-computer interaction handbook: Fundamentals, evolving technologies and emerging applications, pp. 63–76. Taylor & Francis, New York (2008)
16. Rogers, Y.: Mental models and complex tasks. In: Rogers, Y., Rutherford, A., Bibby, P.A. (eds.) Models in the mind: Theory, perspective and application, pp. 145–149. Academic Press, London (1992)
17. Aurisicchio, M., Bracewell, R.H.: Engineering design by integrated diagrams. In: Proceedings of the 17th International Conference on Engineering Design (ICED 2009), Stanford, California, USA, pp. 301–312 (2009)
18. van Engers, T.: Knowledge management: The role of mental models in business system design. PhD thesis, Department of Computer Science, Vrije Universiteit (2001)
Evaluation of Expert Systems: The Application of a Reference Model to the Usability Parameter

Paula Miranda¹, Pedro Isaias², and Manuel Crisóstomo³

¹ Escola Superior de Tecnologia de Setúbal, IPS, Campus do IPS, Estefanilha, 2910-761 Setúbal, Portugal
[email protected]
² Universidade Aberta, Rua Fernão Lopes, 9, 1º Esq., 100-132 Lisboa, Portugal
[email protected]
³ Universidade de Coimbra
[email protected]
Abstract. This study presents an expert systems' performance evaluation model, which is then used to evaluate an existing expert system as a way of testing its applicability. The proposed model's evaluation criteria are: usability, utility, quality, interface, structure, productivity and return. Information systems, and especially expert systems, are today a real necessity for any organisation intending to be competitive. Given this scenario, organisations investing in these systems aim to progressively ensure that the investment they have made is contributing to the organisation's success. Hence, it is fundamental to evaluate expert system performance. The evaluation assesses an expert system's fit with its original requirements and objectives and determines whether its performance satisfies its users and meets the organisation's strategic goals.

Keywords: Expert systems, performance evaluation, evaluation model, usability.
The literature presents several methods and models for the evaluation of information systems in general and knowledge-based systems in particular. Nonetheless, they are not very systematic, they are difficult to apply and they are based on informal concepts, with modest foundations and little practical experience behind them. According to [2], the evaluation of a knowledge-based system is a multifaceted problem, with numerous approaches and techniques. The results the system produces should be evaluated along with its characteristics: usability, ease of improvement and the impact on users of not using the system. A review of the literature demonstrates that authors hold numerous viewpoints on what should be considered when evaluating an expert system. [3] believe expert systems can be evaluated and analysed through the two most important performance measures: efficiency and effectiveness. Effectiveness is the level of accomplishment of objectives and relates to the outputs of the system, while efficiency correlates the inputs (resources) with the outputs; efficiency implies the use of the least amount of resources. [4] have a different view, advocating that the best way to evaluate and test the efficiency of an expert system is to ensure it satisfies its users and responds to their feedback. User satisfaction is also presented by [1] as a success measure in terms of an expert system's use and effectiveness: if users are pleased with the system, they will feel motivated to use it, and that will improve their individual impact. [5], on the other hand, believe that evaluation is the process that ensures the usability, quality and utility of an expert system.
2 Evaluation Model

This study proposes an evaluation model of expert systems performance, based on a review of various studies conducted on this subject.

2.1 Initial Considerations

When evaluating an expert system, its evaluation has to be taken as a global objective. An expert system cannot be seen as an isolated object, but as an integrated entity, used by people in a specific context to achieve certain objectives [7]. The review of the literature identified several parameters for expert systems evaluation. These parameters demonstrate that expert systems performance should be evaluated from the perspective of the existing influences between user, system and organisation. According to [8], in this type of system, users and technologies are often jointly responsible for the accomplishment of tasks; hence, both the technical and the organisational evaluation are important. To conduct an accurate evaluation of this type of system it is essential to take into consideration the interactions involved in the process where the expert system is implemented: user, system and organisation. [9] underlines, in his study, the fact that the interdependency between these entities is determinant for the accomplishment of a task. Expert systems bring many benefits to users. They are an excellent support for users, in the sense that they can reduce the time users spend performing a task and also decrease the users' workload. An expert system can make users more successful in the
accomplishment of their tasks and help them make decisions with higher precision. A user with good working tools is, undoubtedly, a satisfied user, since the workload and the occurrence of mistakes are reduced and the user can make better decisions. Nonetheless, misuse of the system may lead to failed expectations. Expert systems also bring benefits to organisations. An expert system can lead to more precise and homogeneous decision making and represent a reduction in task completion time. Due to its fundamental characteristic of disseminating knowledge, it gives specialists the possibility of being freed from tasks that were previously their responsibility, so that they have time to perform other tasks. Nevertheless, if the organisation does not create the conditions for the expert system to operate at its best, for example by subscribing to a maintenance and update programme, the system will not deliver the expected performance. The user, by operating the system correctly, unquestionably contributes to the global performance of the organisation.

2.2 Model

As mentioned by [10], the majority of the models described in the literature are not clear and organised in their identification of evaluation parameters. In the attempt to design a theoretical model, it was possible to identify parameters defined by the majority of the authors. The parameters that can be used in expert systems performance evaluation, identified in the variety of models proposed in the literature, were grouped with the intention of contributing to their identification and systematisation. They were organised according to the role of the entities involved in the evaluation process: the user, the system and the organisation.

Table 1. Parameters and respective expert systems evaluation criteria, according to the involved entities
Entity       | Parameter    | Criteria
User         | Usability    | Learning, Error tolerance, Efficiency, Effectiveness, Satisfaction
User         | Utility      | Scope, Suitability
User         | Quality      | Reliability, Consistency, Coherence, Decision quality, Update time, Response time
System       | Interface    | Ease of navigation, Screen design, Input and output quality, Suggestibility, Search easiness
System       | Structure    | System stability, Transaction processing, Transaction time, Error rate, Historic
Organisation | Productivity | Motivation, Benefit, Efficiency, Effectiveness, Task optimisation, Reports, Cost
Organisation | Return       | Competitiveness, Cost reduction, Return over investment
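The entity → parameter → criteria hierarchy of Table 1 maps naturally onto a nested data structure. The sketch below is illustrative only; in particular, the unweighted mean-of-criteria scoring is our assumption, not part of the proposed model:

```python
# Table 1 as a nested mapping, plus a simple (assumed, unweighted)
# mean-of-criteria score per parameter for illustration.
MODEL = {
    "User": {
        "Usability": ["Learning", "Error tolerance", "Efficiency",
                      "Effectiveness", "Satisfaction"],
        "Utility": ["Scope", "Suitability"],
        "Quality": ["Reliability", "Consistency", "Coherence",
                    "Decision quality", "Update time", "Response time"],
    },
    "System": {
        "Interface": ["Ease of navigation", "Screen design",
                      "Input and output quality", "Suggestibility",
                      "Search easiness"],
        "Structure": ["System stability", "Transaction processing",
                      "Transaction time", "Error rate", "Historic"],
    },
    "Organisation": {
        "Productivity": ["Motivation", "Benefit", "Efficiency",
                         "Effectiveness", "Task optimisation",
                         "Reports", "Cost"],
        "Return": ["Competitiveness", "Cost reduction",
                   "Return over investment"],
    },
}

def parameter_score(scores: dict, entity: str, parameter: str) -> float:
    """Mean of the criterion scores available for one parameter."""
    criteria = MODEL[entity][parameter]
    rated = [scores[c] for c in criteria if c in scores]
    return sum(rated) / len(rated)

# e.g. partial usability ratings on a 1-5 scale:
print(parameter_score({"Learning": 4, "Satisfaction": 5}, "User", "Usability"))  # 4.5
```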
The table above categorises the entities involved in the process, the evaluation parameters and the respective criteria that make up the model this research proposes (Table 1).

2.2.1 User Perspective

From the user perspective, three parameters were identified for the evaluation of expert systems' performance: usability, utility and quality.

Usability Parameter

The usability parameter was identified, with more or less emphasis, by several authors [10], [2] and [5], and can be defined as the ease with which the system is used. Table 2 lists the evaluation criteria proposed for the usability parameter: learning, error tolerance, efficiency, effectiveness and satisfaction.

Table 2. Usability parameter – evaluation criteria

Criterion       | Definition
Learning        | Ease with which the users perform the tasks.
Error tolerance | Identification of the errors made by the user.
Efficiency      | The required effort to attain a certain objective.
Effectiveness   | Quantity of successfully accomplished tasks.
Satisfaction    | User's subjective satisfaction.
Utility Parameter

Another parameter considered from the user's perspective for performance evaluation is utility [11], [1]. Utility refers to how much benefit the user perceives to gain. According to [2], the performance of the system should also be evaluated by its utility. Thus, it is important to assess whether the expert system is useful in the resolution of the problems it was designed to solve, to determine whether the expert system is executing what is expected of it. The Technology Acceptance Model (TAM), developed by [12] to explain the behaviour of computer system users, has demonstrated that if a system is really useful, the user tends to use it despite any existing difficulties in doing so [13].

Table 3. Utility parameter – evaluation criteria

Criterion   | Definition
Scope       | Level of correspondence between the system and reality.
Suitability | Relation between the decisions made by the system and the needs of users.
Quality Parameter

The quality parameter is referred to by [14], [15], [10], [16] and [5]. The technical definition established by the International Organization for Standardization (ISO) states that quality is fitness for use: conformity to requirements.
Table 4 shows the criteria associated with the quality parameter: reliability, consistency, coherence, decision quality, update time and response time.

Table 4. Quality parameter – evaluation criteria

Criterion        | Definition
Reliability      | Level of trust the users have in the system's decisions.
Consistency      | Integrity relation between different pieces of information in the system.
Coherence        | System's capacity to reflect reality.
Decision quality | Level of understanding of the decisions and their suitability to the needs of the user.
Update time      | Capacity of the system to make decisions in the shortest period of time while remaining available for updates, so that no conflicts occur in the decision-making process.
Response time    | Time the user has to wait from the request until a decision is obtained.
2.2.2 System's Perspective

From the system's perspective, two parameters were identified for the evaluation of expert systems performance: interface and structure. The interface and structure parameters allow criteria to be identified from a technological perspective, i.e. according to software and hardware aspects.

Interface Parameter

The interface, cited by [14] and [11], relates to the design of the system and access to information, while the structure concerns the technological base of the system. The interface of an expert system is the set of characteristics that users employ to interact with the system. Thus, the interface is everything the user has available to control the system, and everything the system has to produce the effects (responses) of the user's actions. For the interface parameter, five criteria were identified: ease of navigation, screen design, input and output quality, suggestibility and search easiness.

Structure Parameter

The next parameter, related to the system, is structure. The evaluation criteria of this parameter are: system stability, transaction processing, transaction time, error rate and historic. The criteria concerning the structure were referenced by [14].

2.2.3 Organisation's Perspective

From the organisation's perspective, two parameters were identified for the evaluation of expert systems performance: productivity and return. The productivity and return parameters enable measurement of the benefits that the organisation gains from the acquisition of the expert system.

Productivity Parameter

The productivity parameter's criteria are: motivation, benefit, efficiency, effectiveness, task optimisation, reports and cost. They are intended to evaluate aspects related to financial impact and production. These criteria were identified by diverse authors, namely [15], [10] and [3].
Return Parameter

The return parameter is intended to underline the relation between the expectations of the investment and the perceived return. Its criteria are competitiveness, cost reduction and return over investment.
3 Case Study

The case study conducted in this research aims to demonstrate the applicability of the previously proposed expert system evaluation model, using the Exsys Corvid "Cessna Citation X Diagnosis System".

3.1 Describing the Application of the Exsys Corvid "Cessna Citation X Diagnosis System"

Exsys Corvid is a world-leading, user-friendly tool used in the development of expert systems; its systems report an impressive return on investment and a unique competitive advantage (Exsys 2008). This study chose the "Cessna Citation X Diagnosis System", an expert system developed by the aircraft manufacturer Cessna for its Cessna Citation X planes, using the Exsys Corvid software. According to Cessna, the use of this expert system has significantly reduced the downtime of their Cessna Citation X aircraft, allowing for a substantial reduction of costs and maximising the safety and satisfaction of its users. Using this case study, a scenario was proposed that reports a fault on the starting of one of the engines of the Cessna Citation X plane. More specifically, one of the engines fails when the starting process is initiated. When the fault occurs, a set of lights in the cabin are switched on and, depending on which lights are switched on, recommendations are progressively given, followed by a final recommendation on how to solve the problem.

3.2 Methods

A questionnaire was designed (see appendix) to demonstrate the feasibility of adopting the proposed model and to provide a method to assess the level of satisfaction that users of this type of system experience. It uses a Likert scale that indicates the user's level of agreement or disagreement with the given statements. The users assessed the statements on levels that vary from 0 (N/A) to 5 (Totally agree). The questionnaire embodies some of the factors responsible for the satisfaction or dissatisfaction of expert systems' users.
The questionnaire was given to a total of 37 students from two classes of the “Management Science” Master's degree at the Higher Institute of Economics and Management of the Lisbon Technical University, Portugal. One class comprised students from the “Decision Systems” module and the other students from “e-Business Models and Technologies”. The users started by working through the test scenario in order to become familiar with the test environment, in this case Exsys Corvid (http://www.exsys.com) and the scenario “Expert System - Cessna Citation X”. Only then did they proceed to the questionnaire itself.
106
P. Miranda, P. Isaias, and M. Crisóstomo
4 Results

This section displays the results obtained with the evaluation of the learning criterion of the usability parameter. It is the product of a statistical treatment of the questionnaires' data. The first part characterises the users and the second evaluates the learning criterion.

4.1 User's Profile

The respondents' average age was 24.5 years (n=37; SD=2.96). Respondents were asked whether it was the first time they had used expert systems: 14 answered ‘No’ (37.8%) and 23 answered ‘Yes’ (62.2%) (N=37). Respondents who answered ‘No’ were then asked what their frequency of expert systems usage was: 10 answered ‘Rarely’ (71.4%) whilst only 4 (28.6%) answered ‘at least once a week or more’ (N=14). Those who had used expert systems before were also asked how much time per week they used them. The majority (13, corresponding to 92.9% of these respondents) used them “less than an hour” (N=14). This shows very little experience with expert systems.

4.2 Usability Evaluation

The results presented in this section concern the learning criterion for usability evaluation.

4.2.1 Learning Criterion

The questions used to evaluate this criterion are:

A.1 It is easy to learn how to use the system.
A.2 It took too much time to learn how to use the software.
A.3 It is easy to remember how to use the system after a period of usage interruption.
A.4 The information the system supplies is easy to apprehend.
A.5 The help/explanation messages are clear.
A.6 The language the system uses is adequate.
A.7 The commands' names are suggestive and easy to use.
A.8 To learn how to use the system it is necessary to have extensive experience in the use of expert systems.
A.9 Before using the system it is necessary to read much documentation.
A.10 It is easy to successfully complete a task for the first time in the system.
A.11 It was possible to use the software from beginning to end without resorting to external help.
A.12 The system encourages the exploration of new features.

The learning criterion registered, in general, a high median. Table 5 presents the detailed results. Table 5 indicates satisfaction at level 4 – Agree for questions A1, A3, A4, A5, A6, A7, A10, A11 and A12 (with the highest percentage of respondents indicating this choice). This means that with regard to these questions users, in general, agreed
Evaluation of Expert Systems
107
Table 5. Frequencies and percentages regarding the learning criterion

[Table 5 lists, for each question A1–A12 and in total, the frequencies and percentages of answers at each classification level: 0 – NA, 1 – Totally Disagree, 2 – Disagree, 3 – Neither Agree/Nor Disagree, 4 – Agree and 5 – Totally Agree; cell values omitted.]
that the system is easy to understand and that it is easy to remember how to use it, the information given by the system is effortlessly assimilated, the messages the system supplies, the language it employs and the names of the commands are accessible, a task can be accomplished for the first time without resorting to external help, and the system encourages the use of new features. With regard to questions A1, A3, A6 and A11, the second highest score relates to people who chose 5 – Totally Agree, indicating that these questions, globally, represented the highest users' satisfaction among all the questions with a high median. Nonetheless, questions A4, A5, A7, A10 and A12, besides having a 4 – Agree score, had a significant number of individuals choosing 3 – Neither Agree nor Disagree, reflecting a certain indifference in responding to these questions. It is necessary to underline that answers A2 and A8 present a majority score of 2 – Disagree, followed by a significant second choice of 1 – Totally Disagree. This means that users, in general, disagreed that the process of learning how to use the software is time-consuming and requires extensive experience. Finally, answer A9 had a 1 – Totally Disagree score followed by a considerable 2 – Disagree score, meaning that the majority of users disagreed that it is necessary to read much documentation before using the system; it therefore becomes possible to conclude that it is easy to learn how to use the system.

4.2.2 Others

The other criteria (error tolerance, efficiency, efficacy and satisfaction) were also subject to the same kind of evaluation.
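The frequency-and-median tabulation behind tables like Table 5 can be sketched in a few lines of Python. The response data and helper below are illustrative assumptions, not the study's actual data; the scale labels follow the questionnaire described in Section 3.2.

```python
# Hedged sketch of a Table 5-style tabulation: per-level frequencies,
# percentages and the median for one Likert question.
# The responses below are hypothetical, not the study's data.
from collections import Counter
from statistics import median

SCALE = {0: "NA", 1: "Totally Disagree", 2: "Disagree",
         3: "Neither Agree/Nor Disagree", 4: "Agree", 5: "Totally Agree"}

def tabulate(responses):
    """Return {level: (count, percent)} and the median answer."""
    counts = Counter(responses)
    n = len(responses)
    rows = {level: (counts.get(level, 0), 100.0 * counts.get(level, 0) / n)
            for level in SCALE}
    return rows, median(responses)

# Hypothetical answers to question A1 from 37 respondents:
a1 = [4] * 18 + [5] * 10 + [3] * 6 + [2] * 2 + [1] * 1
rows, med = tabulate(a1)
print(f"A1 median = {med} ({SCALE[med]})")
for level in sorted(SCALE, reverse=True):
    count, pct = rows[level]
    print(f"{level} - {SCALE[level]:27s} {count:2d} ({pct:4.1f}%)")
```

With this hypothetical distribution the median lands at 4 – Agree, matching the pattern reported for most learning-criterion questions.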
4.3 Discussion

The analysis of the results in Table 5 allowed an expert system usability evaluation for the learning criterion. Most of the criteria evaluated in this study indicated high usability. The efficiency, efficacy and satisfaction criteria obtained more convergent indices in terms of the users' opinions and, consequently, greater confidence in the results, which indicated high usability. Satisfaction was undoubtedly the criterion with the best performance. For certain criteria, the number of respondents responsible for reducing the usability evaluation of that criterion, or for provoking a large dispersion in the answers, was very small, thus reinforcing the final result. From the presented results we can verify the applicability of the proposed model to the usability parameter, especially for the learning criterion.
5 Conclusions

Expert Systems can be used as decision aids and are part of the set of tools used in organizational strategy, working as differentiation and competition devices. The importance of evaluating Expert Systems is unquestionable. The impact of Information Systems in general, and Expert Systems in particular, on organizations is of increasing importance, interconnecting the most varied activities. Investments in these systems are high and the benefits are difficult to measure due to the high level of subjectivity of the variables under evaluation. Nevertheless, this is a mandatory task. There are several barriers to such evaluation: systems' complexity, the evolving nature of Information Technologies, the non-existence of methodological evaluation patterns, demands in terms of human resources, finance and time, lack of immediate perception of the evaluation benefits, and the perception of limitations or failure aspects in the organizations. The study presented had the goal, as described in Section 1, of proposing a model to evaluate the performance of Expert Systems. The proposal was evaluated through a questionnaire that can be applied to an organization's Expert System. The implementation shows the applicability of the proposed model and its results.
References

[1] Cascante, P., Plaisent, M., Maguiraga, L., Bernard, P.: The Impact of Expert Decisions Support Systems on the Performance of New Employees. Information Resources Management Journal 15(4), 64–78 (2002)
[2] Barr, V.: Applications of Rule-base Coverage Measures to Expert System Evaluation. Knowledge-Based Systems 12, 27–35 (1999)
[3] Turban, E., Aronson, J.: Decision Support Systems and Intelligent Systems, 6th edn. Prentice-Hall, New Jersey (2000)
[4] Anumba, C., Scott, D.: Performance Evaluation of Knowledge-Based System for Subsidence Management. Structural Survey Journal 19(5), 222–232 (2001)
[5] Cojocariu, A., Munteanu, A., Sofran, O.: Verification, Validation and Evaluation of Expert Systems in Order to Develop a Safe Support in the Process of Decision Making. Computational Economics 0510002, EconWPA (2005)
[6] Yang, C., Kose, K., Phan, S., Kuo, P.: A Simulation-Based Procedure for Expert System Evaluation. In: Proceedings of the IEA/AIE 13th International Conference, New Orleans, June 19-22 (2000)
[7] Miranda, P., Isaias, P., Crisostomo, M.: Expert systems evaluation proposal. In: Smith, M.J., Salvendy, G. (eds.) HCII 2007. LNCS, vol. 4557, pp. 98–106. Springer, Heidelberg (2007)
[8] Grabowski, M., Sanborn, S.: Evaluation of Embedded Intelligent Real-time Systems. Decision Sciences Journal 32(1), 95–123 (2001)
[9] Mauldin, E.: An Experimental Examination of Information Technology and Compensation Structure Complementarities in an Expert System Context. Journal of Information Systems 1, 19–41 (2003)
[10] Guida, G., Mauri, G.: Evaluating Performance and Quality of Knowledge-Based Systems: Foundation and Methodology. IEEE Transactions on Knowledge and Data Engineering 5(2), 204–224 (1993)
[11] Waterman, D.: A Guide to Expert Systems. Addison-Wesley, Reading (1986)
[12] Davis, F.: Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly 13(3), 319–340 (1989)
[13] Burgarelle, R., Carvalho, R.: Avaliação do Uso de Sistemas de Informação Académica por Alunos de Graduação em Ciência da Informação. In: Actas do VIII ENANCIB – Encontro Nacional de Pesquisa em Ciência da Informação, Salvador, Bahia, Brasil (October 2007)
[14] Lawrence, L., Miller, W., Okogbaa, G.: Evaluation of Manufacturing Expert Systems: Framework and Model. The Engineering Economist 37, 293–314 (1992)
[15] Kirani, S., Zualkernan, I., Tsai, W.: Comparative Evaluation of Expert System Testing Methods. In: Proceedings of the 1992 IEEE International Conference on Tools with AI, Arlington, VA (November 1992)
[16] Lynn, M., Murray, M.: Expert System Domain Identification, Evaluation and Project Management: a TQM Approach. International Journal of Quality and Reliability Management 13(3), 73–83 (1996)
Investigating the Relationships between User Capabilities and Product Demands for Older and Disabled Users

Umesh Persad1, Patrick Langdon2, and P. John Clarkson2

1 Product Design and Interaction Lab, Centre for Production Systems, The University of Trinidad and Tobago, Trinidad and Tobago, West Indies
2 Cambridge Engineering Design Centre, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
[email protected], {pml24,pjc10}@eng.cam.ac.uk
Abstract. This paper presents the results of a study that specifically looks at the relationships between measured user capabilities and product demands in a sample of older and disabled users. An empirical study was conducted with 19 users performing tasks with four consumer products (a clock-radio, a mobile phone, a blender and a vacuum cleaner). The sensory, cognitive and motor capabilities of each user were measured using objective capability tests. The study yielded a rich dataset comprising capability measures, product demands, outcome measures (task times and errors), and subjective ratings of difficulty. Scatter plots were produced showing quantified product demands on user capabilities, together with subjective ratings of difficulty. The results are analysed in terms of the strength of correlations observed taking into account the limitations of the study sample. Directions for future research are also outlined. Keywords: Inclusive Design, Product Evaluation, User Capability Data, Disability.
difficulty in operating controls in a little more than 50% of the cases (after measuring maximum force exertions and the force required by the control) [3]. Steenbekkers et al. also concluded that laboratory measures have limited predictive value for difficulties experienced in daily life, and that it is not clear how individual measures of capability combine to enable task performance [4]. Most human factors data tend to be for relatively homogeneous populations, and how to use capability data to make real-world predictions of difficulty and exclusion for disabled people is not well understood [4,5,6]. As a precursor to further developing analytical evaluation methods and collecting human capability data to support them, this fundamental problem needs to be addressed. The study presented in this paper aims to shed some light on the predictive ability of user capability measures in the context of a capability-demand product interaction framework. Determining the relevant capability measures and their relationships with task outcome measures is a necessary first step before developing valid and robust analytical evaluation methods for inclusive design.
2 Study Design

An empirical study was conducted in which participants used four consumer products chosen to represent activities of daily living: (1) a clock-radio, (2) a mobile phone, (3) a food blender and (4) a vacuum cleaner (Fig. 1). The study was conducted at the usability lab of the Cambridge Computer Laboratory. Prior to commencing the study, the four products were chosen and various characteristics were measured, including sizes and colours of text, sizes and colours of interface features (chassis, buttons, handles etc.) and the push/pull/rotational forces required for activation. After ethical approval was obtained from the Cambridge Psychology ethics committee, older and disabled users were recruited from organisations in and around Cambridge such as the University of the Third Age (U3A), CAMSIGHT and the Hester Adrian Centre (Papworth Trust). Nineteen participants were recruited in total. Participants first signed a consent form and were given a 10 GBP voucher for participating in the study. They were then asked questions to gather demographic, medical and product experience information. Participants were also asked to rate their experience with the four consumer products and to describe how they would go about using these products to perform tasks (one task per product). Data was recorded on a questionnaire sheet and via an audio recorder. Secondly, a series of capability tests was administered using a range of measurement devices. These included sensory tests of visual acuity, contrast sensitivity and hearing level; cognitive tests of short term working memory, visuo-spatial working memory, long term memory and speed of processing (reaction time); and motor tests such as push/pull forces exerted by each hand in different positions, walking speed and balance time. Participants had a short break after the sensory and cognitive capability assessment was performed.
Some of the participants chose to take breaks during the capability testing session when they became tired. All capability testing data was recorded on a pre-designed testing sheet or a computer database for the computer based cognitive testing (CANTABeclipse from Cambridge Cognition).
Fig. 1. Products selected for the study (from the left): Matsui Clock Radio, Siemens Mobile Phone, Breville Blender, and Panasonic Vacuum Cleaner
Thirdly, participants performed one task with each of the products while being videotaped, with task order randomised to avoid order effects. The tasks performed were: (a) Clock radio: setting the time to 4.30 PM, (b) Mobile phone: taking the ringer off via the menu, (c) Blender: blending a banana and water as if making a smoothie/drink, and (d) Vacuum cleaner: vacuuming a piece of carpet until clean. These tasks were analysed for essential constituent actions before the commencement of the study. On completion of each task, subjective difficulty and frustration ratings were collected from each participant for these selected actions using a visual analogue scale ranging from 0 to 100. After completing the four tasks, participants were debriefed and thanked for participating in the study.
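The per-participant randomisation of task order described above can be sketched as follows. The deterministic seeding scheme is an illustrative assumption (useful for auditability), not a procedure reported in the study; the product names are those used in the tasks.

```python
# Sketch of randomising the order of the four product tasks per participant
# to avoid order effects. The seeding scheme is an illustrative assumption.
import random

TASKS = ["clock radio", "mobile phone", "blender", "vacuum cleaner"]

def task_order(participant_id, base_seed=2011):
    """Return a reproducible shuffled task order for one participant."""
    rng = random.Random(base_seed + participant_id)
    order = TASKS[:]          # copy so the module-level list is untouched
    rng.shuffle(order)
    return order

for pid in (1, 2, 3):
    print(pid, task_order(pid))
```

Seeding from the participant identifier makes each assignment reproducible, so the order any given participant received can be reconstructed later from the log.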
3 Results and Analysis

The collected data was entered into SPSS for statistical analysis and the video data was analysed in Adobe Premiere Pro. The videos of each participant were analysed, and task times and errors were extracted. The SPSS data consisted of 19 participants (Mean Age = 62.68, SD = 9.20) with demographic data, capability data, and task outcome measures (times, errors, difficulty and frustration ratings) for each product task. The data was analysed by generating scatterplots of rated difficulty versus measured capability for constituent actions, and analyzing the strength of linear correlations. Fig. 2 shows the proportion of participants who attempted, succeeded or failed the task with each of the consumer products. Of the 16 participants who performed the clock radio task, 56% successfully completed the task, and of the 16 participants who performed the mobile phone task, 19% completed it successfully. Of the 19 participants who performed the blender task, 100% successfully completed the task, and of the 18 participants who performed the vacuum cleaner task, 100% completed it successfully. Thus the mobile phone task and the clock radio task had the highest and second highest failure rates, respectively.
Fig. 2. Graph of proportion of participants who attempted, failed and succeeded each of the four tasks
Mean difficulty and frustration ratings were plotted for each product and compared. The mobile phone had the highest mean ratings across all the products for difficulty in starting the task (M=78.24, SD=30.00), difficulty in working out subsequent actions (M=84.59, SD=28.96), and overall mental demand (M=76.47, SD=30.25). The mobile phone also had the highest mean rating for frustration experienced during the task (M=48.89, SD=41.82). In terms of visual demands, the small text on the clock radio display was rated the most difficult to see (M=52.37, SD=34.78), followed by seeing the numbers on buttons (M=46.05, SD=36.84) and seeing the actual buttons (M=39.47, SD=41.16) on the mobile phone. The physical actions of opening (M=47.37, SD=30.25) and closing (M=38.68, SD=26.03) the blender cover and pushing the vacuum cleaner forward (M=28.06, SD=30.69) were rated as the most difficult actions on average. In terms of overall mental demands, the mobile phone ranked the highest (M=76.47, SD=30.25), followed by the clock radio (M=42.94, SD=37.54), the blender (M=28.11, SD=32.58) and the vacuum cleaner (M=27.94, SD=27.60). For mean frustration ratings, the mobile phone once again ranked the highest (M=48.89, SD=41.82), followed by the vacuum cleaner (M=28.33, SD=36.22), the clock radio (M=26.39, SD=39.91) and the blender (M=22.11, SD=36.03). In the following sections, the relationships between measured user capabilities and difficulty ratings are examined further. Due to space limitations, an overview of the results is given with illustrative examples.

3.1 Sensory and Motor Capabilities

Scatter plots between visual capabilities and rated difficulty in visual actions were generated, as shown in Fig. 3. Similar graphs were also used for motor actions. The plots show an increasing user capability measure on the horizontal axis, while the vertical axis shows the rated difficulty score ranging from 0 to 100.
A red vertical dashed demand line is plotted on the graph to indicate the specific demand of the
Fig. 3. Graphs of visual capability and demand (top row) and motor capability and demand (bottom row)
product feature being considered. Weak correlations (r values) were considered to be 0.4 and below, moderate between 0.4 and 0.7, and strong above 0.7, for both positive and negative correlations. The top row of Fig. 3 shows a fairly strong negative relationship between reading numbers on the digital display of the clock radio and the contrast sensitivity measured for each participant: r(16) = -0.782, p < 0.01. It also shows a strong correlation for actions involving seeing product features, for example seeing the cord retractor button on the vacuum cleaner: r(14) = -0.771, p < 0.01. Five of eight actions involving seeing textual features produced strong correlations, while the remaining three produced weak to moderate correlations. Three of six actions involving seeing product features produced strong correlations, while the remaining three produced very weak correlations. The bottom row of Fig. 3 shows example correlations for motor actions. For motor actions and manipulations (pushing buttons, sliding and twisting, pushing and pulling etc.), only weak to moderate linear relationships were found. For example, the left graph in the bottom row of Fig. 3 shows a weak linear relationship between finger push force and difficulty in pushing the clock radio buttons: r(17) = 0.105. Actions such as lifting the blender and moving the vacuum cleaner around showed stronger correlations. For example, in pushing the vacuum cleaner forward (right graph in the bottom row of Fig. 3), there was a moderate linear relationship with rated difficulty: r(13) = -0.564, p < 0.05.
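The correlation analysis can be sketched in plain Python: Pearson's r between a capability measure and rated difficulty, classified with the weak/moderate/strong thresholds given above. The data points are hypothetical, not values from the study.

```python
# Sketch of the correlation analysis: Pearson's r between a measured
# capability and rated difficulty (0-100), with the strength thresholds
# used in the paper. The data points below are hypothetical.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def strength(r):
    """Classify |r|: <= 0.4 weak, <= 0.7 moderate, above that strong."""
    a = abs(r)
    return "weak" if a <= 0.4 else "moderate" if a <= 0.7 else "strong"

# Hypothetical contrast-sensitivity scores vs. rated difficulty:
capability = [1.20, 1.40, 1.50, 1.65, 1.80, 1.90]
difficulty = [80, 70, 55, 40, 20, 10]
r = pearson_r(capability, difficulty)
print(f"r({len(capability) - 2}) = {r:.3f} ({strength(r)})")  # df = n - 2
```

The `r(16)`-style notation in the paper reports the degrees of freedom (n - 2), which is why the printed value uses `len(capability) - 2`.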
3.2 Cognitive Capabilities

In order to investigate the relationships between measured cognitive capabilities and task outcome measures, graphs were plotted of task time, errors, difficulty starting the task, difficulty in selecting subsequent actions and overall mental demand against four main cognitive variables: (1) short term working memory (digit span), (2) visuo-spatial working memory (span length), (3) speed of processing (reaction time) and (4) long term memory (GNTpercentcorrect). In the case of errors, short term working memory was found to correlate moderately with errors for the blender r(17) = -0.493, p < 0.05 and the vacuum cleaner r(16) = -0.516, p < 0.05. Visuo-spatial working memory showed some fairly strong correlations with blender errors r(15) = -0.819, p < 0.01 and vacuum errors r(14) = -0.700, p < 0.01. Long term memory showed a significant relationship with blender errors: r(15) = -0.638, p < 0.01, vacuum errors: r(15) = -0.763, p < 0.01 and clock radio errors: r(14) = -0.502, p < 0.05. However, mobile phone errors had a weak linear correlation with long term memory: r(14) = -0.059. Some of these relationships are shown graphically in Fig. 4.
Fig. 4. Example graphs of relationships between errors and short term working memory, visuospatial working memory and long term memory
Fig. 5. Example graphs of relationships between errors and a Euclidean model of cognitive capability
Fig. 6. Graph of relationships between overall mental demand ratings and self rated experience
In order to further investigate the relationship between outcome measures and cognitive capabilities, the four cognitive variables were used to calculate a derived variable (scaled to an interval between 0 and 1) based on one of the following models: MAX, MIN, CITY-BLOCK and EUCLIDEAN. MAX and MIN models used the maximum and minimum value of the four cognitive variables respectively. The
CITY-BLOCK metric used the sum of the four cognitive variables, in this case equivalent to the arithmetic mean of the cognitive variables. The EUCLIDEAN metric, the square root of the sum of the squares of the four cognitive variables, was also used. Correlations between these cognitive models and task time, errors, difficulty starting the task, difficulty with the next action and overall mental demand were investigated. In general, all cognitive models produced very weak linear relationships with the outcome measures, except for the EUCLIDEAN cognitive capability model in relation to the blender errors and the vacuum cleaner errors shown in Fig. 5 (blender errors r(15) = -0.711, p < 0.01 and vacuum cleaner errors r(14) = -0.804, p < 0.01). Fig. 6 shows the relationships between overall mental demand and self-rated experience. The clock radio r(14) = -0.631, p < 0.01 and vacuum cleaner r(15) = -0.622, p < 0.01 correlations were moderate in strength. The mobile phone showed participants with relatively high ratings of mental demand even with moderate to high experience ratings, while the blender showed participants with low ratings of mental demand even though they had relatively low experience ratings.
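A minimal sketch of the four derived models is given below, assuming each cognitive variable has already been scaled to [0, 1]. Dividing the CITY-BLOCK sum and the EUCLIDEAN norm by their maxima so the result stays in [0, 1] is an assumption about the scaling, and the input scores are illustrative.

```python
# Sketch of the MAX, MIN, CITY-BLOCK and EUCLIDEAN models for combining
# four cognitive variables already scaled to [0, 1]. The rescaling of the
# CITY-BLOCK and EUCLIDEAN metrics back into [0, 1] is an assumption;
# the input scores are hypothetical.
from math import sqrt

def combine(vars01, model):
    n = len(vars01)
    if model == "MAX":
        return max(vars01)
    if model == "MIN":
        return min(vars01)
    if model == "CITY-BLOCK":   # sum of the variables; rescaled = mean
        return sum(vars01) / n
    if model == "EUCLIDEAN":    # root of sum of squares, rescaled by sqrt(n)
        return sqrt(sum(v * v for v in vars01)) / sqrt(n)
    raise ValueError(f"unknown model: {model}")

# Scaled scores: digit span, visuo-spatial span, processing speed, LTM.
scores = [0.6, 0.8, 0.4, 0.7]
for m in ("MAX", "MIN", "CITY-BLOCK", "EUCLIDEAN"):
    print(f"{m:10s} {combine(scores, m):.3f}")
```

Each model compresses the four capability measures into a single derived variable, which is then correlated against the outcome measures as described above.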
4 Discussion

The overview of results presented in the previous sections suggests that, given the limitations of the current study, measures of low-level visual, cognitive and motor capabilities in general correlate weakly to moderately with outcome measures such as time, errors and rated difficulty. In the case of vision, correlations were moderate to strong, indicating that the essential low-level capabilities utilized in real-world task performance were being captured and adequately modelled with a linear relationship. The blender and vacuum cleaner showed stronger correlations with cognitive variables such as visuo-spatial memory and long term memory. This difference between product types is significant in that it indicates that different resources may be drawn upon depending on the complexity of the task at hand. Further, it indicates the possibility that multiple models may be needed to capture the workings of the human cognitive system in task performance. The EUCLIDEAN model used to combine the four cognitive variables correlated strongly with errors for the blender and vacuum cleaner, but not for the clock radio and mobile phone. Another type of model may therefore be needed to account for such cognitively demanding products. The results also indicate that there is scope for exploring alternatives to a linear reductionist model for describing human capability in disabled populations. It is possible that disabled people utilise multiple low-level capabilities in a non-linear way, relying on a system of accommodation and coping strategies that would be difficult to model accurately with simple linear models. The graphs also show that ratings of difficulty could have a large spread at a given level of capability. It is possible that multidimensional judgements are being made in rating difficulty with single actions, taking into account the level of difficulty with other actions in the task and the cognitive state of the user.
The sample size (19 users) in the study was relatively small, and the results presented are indicative. However, the methodology for investigating capability-demand relationships with a large sample should be similar, as it simultaneously
captures all the essential user, product and task parameters for analysis of their interrelationships. The trade-off would be the increased cost and resources required for conducting a larger scale investigation.
5 Conclusions and Further Work The data collected is being analysed further to look at models consisting of multiple user capabilities and how they relate to errors, times and rated difficulty. This would involve the use of both linear and non-linear models to determine whether task outcome measures could be predicted using lower level sensory, cognitive and motor capability measures. Further studies are planned using a similar methodology and a larger sample size to investigate capability-demand interaction in older and disabled populations.
References

1. Keates, S., Clarkson, J.: Countering Design Exclusion: An Introduction to Inclusive Design. Springer, Heidelberg (2003)
2. Persad, U., Langdon, P., Clarkson, J.: Characterising user capabilities to support inclusive design evaluation. Universal Access in the Information Society 6(2), 119–135 (2007)
3. Kanis, H.: Operation of Controls on Consumer Products by Physically Impaired Users. Human Factors 35(2), 325–328 (1993)
4. Steenbekkers, L.P.A., van Beijsterveldt, C.E.M., Dirken, J.M., Houtkamp, J.J., Molenbroek, J.F.M., Voorbij, A.I.M.: Design-relevant ergonomic data on Dutch elderly. International Journal for Consumer & Product Safety 6(3), 99–115 (1999)
5. Kondraske, G.V.: Measurement tools and processes in rehabilitation engineering. In: Bronzino, J.D. (ed.) The Biomedical Engineering Handbook, vol. 2, ch. 145. CRC Press, Boca Raton (2000)
6. Kondraske, G.V.: A working model for human system-task interfaces. In: Bronzino, J.D. (ed.) The Biomedical Engineering Handbook, vol. 2, ch. 147. CRC Press, Boca Raton (2000)
Practical Aspects of Running Experiments with Human Participants

Frank E. Ritter1, Jong W. Kim2, Jonathan H. Morgan1, and Richard A. Carlson3

1 College of Information Sciences and Technology, The Pennsylvania State University
2 Department of Psychology, University of Central Florida
3 Department of Psychology, The Pennsylvania State University
{frank.ritter,jhm5001,racarlson}@psu.edu, [email protected]
Abstract. There can often be a gap between theory and its implications for practice in human-behavioral studies. This gap can be particularly significant outside of psychology departments. Most students at the undergraduate or early graduate levels are taught how to design experiments and analyze data in courses related to statistics. Unfortunately, there is a dearth of materials providing practical guidance for running experiments. In this paper, we provide a summary of a practical guide for running experiments involving human participants. The full report should improve the practical methodology for running studies across the diverse topics in the thematic area of universal access in human-computer interaction. Keywords: Experiments, Human Participants, Universal Access.
appalled by the lack of this common but undocumented sense when it is reported by researchers applying psychology methods outside of psychology.

1.1 Why Do We Need a Practical Guide?

In general, scientific inquiries in the areas of human-computer interaction (HCI), human factors, cognitive psychology, and cognitive science involve human participants. One distinguishing factor of these disciplines, and thus of experiments in these areas, has been the centrality of the human participant. Consequently, working in these areas requires understanding not only the theoretical and ethical issues incumbent in running human participants but also the practical aspects of the process itself. To frame this discussion, we are working to provide an overview of this process and related issues.

1.2 Purpose of This Paper

In this paper, we present a summary of a practical guide (Ritter, Kim, & Morgan, 2009) that can help research assistants (RAs) to run experiments effectively and more comfortably. Our purpose is to provide hands-on knowledge and actual experimental procedures. We are generally speaking here from a background rooted in cognitive psychology, cognitive ergonomics, and HCI studies. Because it is practical advice, we do not cover experimental design or data analyses, and it may be less applicable in more distant areas.

1.3 Who Is This Report Useful For?

We believe that this synopsis of our report and the longer summary are useful to anyone who is starting to run research studies, training people to run studies, or studying the experimental process. In particular, it is useful for students, teachers, lab managers, and researchers in industry. It is especially useful to computer scientists and other technologists who might run an empirical user study to test new ways to support universal access.
2 Contents

We focus on topics that are important for running HCI-related user studies concerning diverse populations and universal interaction. We also discuss the importance of repeatable and valid experiments, and the ethical issues associated with studies involving human participants.

2.1 Overview of the Components

Table 1 outlines several of the major components of the larger report. Here, we examine these components with respect to studies examining universal access for diverse populations.
Practical Aspects of Running Experiments with Human Participants
121
Table 1. Important components for working with diverse populations

Component             Explanation
Scripting             Writing a script to ensure standard procedures are observed.
Missing subjects      How do you deal with participants who do not show up?
Decorum               How do you dress and how do you address the participants?
Recruiting            How do you recruit a diverse yet representative set of participants without unwanted bias?
Literature            What literature should you read to prepare for running a study?
Debriefing            How to debrief after a study session.
Payments              How to arrange payment for the participants, and the importance of performing this correctly.
Piloting              The need to run pilot subjects to practice the method, and to find where the method (e.g., the script) needs to be modified.
Simulator studies     The role for simulated studies and how to treat model results as data.
Chances for insights  Reiterating the importance of being observant, and thus ready for further insights while running studies.
2.2 Repeatability and Validity

When running an experiment, ensuring its repeatability and validity is of the greatest importance, assuming the experiment is conducted ethically. Running the experiment in exactly the same way for each participant is essential, and so is reducing unwanted variance in the participants' behavior. Ensuring this repeatability is partly the job of the RAs, who often are not informed about these concepts and their practical application. Senior colleagues should therefore strive to make these concepts clear to RAs, while RAs should strive to give each participant a consistent and comfortable but neutral testing experience. It is important to understand how participants will complete the task and to work towards uniformity across all iterations of the procedure for each subject. Repeatability of the experiment is a necessary condition for scientific validity.

There are, however, several well-known effects that can affect the experimental process. Chief among these is the experimenter effect: the influence of the experimenter's presence on the participants. It is important to be aware not only of this effect but also of how it can vary across experimenters. Depending upon the experimental context, the experimenter effect can lead to either increased or decreased performance; its magnitude and direction generally depend upon the type and extent of personal interaction between the participant and experimenter.

Besides the experimenter effect, there are other risks to the experimental process. We highlight some here and illustrate how to avoid them, either directly or through proper randomization.
Randomization is particularly important because you will most likely be responsible for implementing treatments, while understanding the other risks will help you take steps to minimize them. Finally, there are other experimental effects that are outside of your control—we do not cover these here. Even though you
122
F.E. Ritter et al.
cannot eliminate all contingent events, you can note idiosyncrasies and, with help from the principal investigator, either correct them or report them as a potential problem.

Another common source of variation across trials is the effect of the experimental equipment. For instance, if you are having participants interact with a computer or other fixed display, you should take modest steps to make sure that the participant's distance to the display is the same for each subject. This does not necessarily mean putting up a tape measure, but in some cases it does. Be aware that viewing distance can influence performance and in extreme cases can lead to blurred vision, irritated eyes, headache, and movement of torso and head (e.g., Rempel, Willms, Anshel, Jaschinski, & Sheedy, 2007). These factors, in that they represent deviations in the testing protocol, can be risks to validity. Furthermore, if subjects are picking up blocks, cards, or other objects, the objects should either always be in the same positions or always be randomly placed, because some layouts of puzzles can make the puzzles much easier to solve. The experimental setup should be consistent across all trials. There will be other instances where variations in the apparatus can lead to unintended differences, and you should take advice locally to learn how to reduce these risks.

2.3 Ethics

There are several topics that you need to keep in mind when working with participants. Chief among these are the ethics pertaining to experimental procedures and to the gathering and reporting of data, including published and unpublished documents. If you have any questions, you should contact the lead researcher (or principal investigator), or other resources at your university. Because we would like to generalize the results to a wide population, indeed the whole population if possible, it is critical to recruit a representative sample of the population in question.
It has been noted that experimenters do not always recruit from the whole population. In some studies, there are good reasons for recruiting more heavily from one sub-group, either because of increased risk to vulnerable groups (e.g., non-caffeine users in a caffeine study) or because of limited access (e.g., treatment programs for drug addicts). In these cases, there are principled procedures for ensuring reliability. Where possible, however, experimenters should recruit a representative population. This may mean putting up posters outside your department, and it may entail paying attention to the proportion of participants of any one sex or age among the recruits. Where the proportions are detrimental, correcting these imbalances will most likely require more recruiting. As a research assistant, you can be the first to notice this, and you can help address these issues by bringing them to the attention of the investigator.

Coercion is a violation of the human rights of participants. It is necessary to avoid any procedures that restrict participants' freedom of consent regarding their participation in a study. Some participants, including minors, patients, prisoners, and individuals who are cognitively impaired, are more vulnerable to coercion. For example, enticed by the possibility of payment, minors might ask to participate in a study. If they do so without parental consent, this is unethical because they are not old enough to give their consent; agreements by a minor are not legally binding.
Students are also vulnerable to exploitation. The grade economy presents difficulties, particularly for courses with an integrated lab component. In these cases, professors must not only offer an experiment relevant to the students' coursework but also offer alternatives to participating in the experiment. To address these problems, it is necessary to identify potential conditions that would compromise the participants' freedom of choice. For instance, in the course-credit example, recall that it was necessary for the professor to provide an alternative way to obtain credit. In addition, this means ensuring that no other form of social coercion has influenced the participants' choice to engage in the study. Teasing, taunts, jokes, inappropriate comments, or implicit quid pro quo arrangements are all inappropriate. These interactions can lead to hard feelings (that is why they are ethical problems!) and to the loss of good will towards you, your lab, and potentially science in general.

When preparing to run a study, you must have procedures for handling sensitive data. There are at least two categories to be aware of: (1) sensitive data that you have identified and prepared for before the experiment; and (2) unexpected sensitive data that arises during the course of the experiment. Personal data, or data that is intrinsically sensitive, should be handled carefully. Information on an individual pertaining to his or her race, creed, gender, gender preference, religion, friendships, etc., must be protected. This data should not be lost or mislaid. It should not be shared with people not working on the project, either formally, if you have an IRB that requires notice, or informally, if the IRB does not have this provision (formal tracking of who has access to experimental data occurs more rigorously in the US than in some other countries). You should seek advice from your colleagues about what practices are appropriate in your specific context.
In some situations, you are not allowed to take data from the building, and in most cases, you are encouraged to back it up and keep the backed-up copy in another safe and secure location. In nearly all cases, anonymizing data, that is, removing names and other ways data can be associated with a particular individual, removes most or all of the potential problems.

Data that arises during the experiment (e.g., the subject's responses) can have implications beyond the scope of the study. This can include subjects implicating themselves in illegal activity, or unintentionally disclosing an otherwise hidden medical condition. For example, if you are administering caffeine, and you ask the subjects what drugs they take in order to avoid known caffeine agonists or antagonists, you may find information about illegal drug use. If you take the participant's heart rate or blood pressure, you may discover symptoms of an underlying disease. It is important to prepare experimenters for these situations. Generally, preparation for a study should involve discussions about how to handle sensitive data, and whether there is a chance that the study may reveal sensitive data about the participants. You should fully understand your institution's policies regarding sensitive data, and how to work with the participants when sensitive information becomes an issue. If you have questions, you should ask the principal investigator.
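As a concrete illustration of the anonymization step described above, the sketch below replaces names with opaque participant codes. The names, the data layout, and the `participant_code` helper are all hypothetical; real projects should follow the procedure approved by their IRB.

```python
# Hypothetical session log: (name, trial, reaction time in ms).
raw = [("Alice", 1, 412), ("Bob", 1, 530), ("Alice", 2, 398)]

# Map each name to a stable, opaque participant code. The mapping
# table is the re-identification key: store it separately from the
# data, under the same protections as the raw data.
codes = {}

def participant_code(name):
    if name not in codes:
        codes[name] = "P%02d" % (len(codes) + 1)
    return codes[name]

anonymized = [(participant_code(n), t, rt) for n, t, rt in raw]
# The released rows carry codes only; Alice's two trials share "P01".
```

Sequential codes are used here for readability; some labs prefer random codes so that ordering reveals nothing about recruitment order.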
3 Major Aspects of Working with Diverse Populations

Nearly every aspect of preparing and running a study is influenced by the precautions necessary for working with studies of universal access and diverse populations.
We examine just a few, noting that some researchers may find other issues more important for their work.

3.1 Recruiting

Recruiting participants for your experiment can be a time-consuming and potentially difficult task, but it is important for producing meaningful data. An experimenter should therefore plan recruitment for the research study carefully with the lead researcher (or principal investigator). Ask yourself, "What are the important characteristics that my participants need to have?" Your choices will be under scrutiny, so having a coherent reason for which participants are allowed or disallowed into your study is important.

First, it is necessary to choose a population of interest from which you will recruit participants. For example, if an experimenter wants to measure the learning effect of foreign-language vocabulary, it is necessary to exclude participants who have prior knowledge of that language. On the other hand, if you are studying bilingualism, you will need to recruit people who speak two languages. In addition, it may be necessary to consider age, educational background, gender, etc., to correctly choose the target population.

Second, it is necessary to decide how many participants you will recruit. The number of participants can affect your final results: the more participants you can recruit, the more reliable your results will be. However, limited resources (e.g., time and money) often force an experimenter to find the minimum number of participants. You may need to refer to previous studies to get some idea of the number of participants, or you may need to calculate the statistical power for a given sample size, if possible (most modern statistics textbooks discuss this and teach you how to do it, e.g., Howell, 2007).
Finally, you may have to consider whether the sample is, in fact, too large (e.g., in online surveys), either because the sample size is wasteful of resources or because it is likely to exaggerate any correlations found in the data. With large sample sizes, trivial or meaningless effects can be found to be statistically significant (reliable). This is not a common problem, but if you arrange to test a large class or use online data you may encounter it.

There are several ways that participants can be recruited. The simplest way is to rely on the experimenters to find participants. In simple vision studies, this is often done because the performance differences between people on these tasks are negligible, and knowledge of the hypothesis being tested does not influence performance. Thus, the results remain generalizable even with a small number of participants.

An alternative to personal recruiting is to use a sample of convenience. Samples of convenience consist of people who are accessible to the researcher (e.g., classroom-based research). Many studies use this approach, so much so that it is often not even mentioned. Generally, for these studies, only the sample size and some salient characteristics that might influence the participants' performance are noted. These factors might include age, major, sex, education level, and factors related to the study, such as nicotine use in a smoking study, or the number of math courses in a tutoring study. There are often restrictions on how to recruit appropriately, so stay in touch with your advisor and/or IRB.
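The sample-size question raised above can be put on a rough quantitative footing with a standard power calculation. The sketch below uses the common normal approximation for a two-group comparison; the effect size, alpha, and power values are illustrative assumptions, not recommendations (see Howell, 2007, for a fuller treatment).

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sample
    t-test, via the normal approximation: n = 2 * ((z_a + z_b) / d)^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" standardized effect (d = 0.5) at conventional settings:
print(n_per_group(0.5))  # 63 per group (the exact t-test gives ~64)
```

The same function also shows why large samples can make trivial effects significant: for a tiny effect of d = 0.1, it reports over 1500 participants per group, so with thousands of online respondents even negligible differences will pass a significance test.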
In studies using samples of convenience, try distributing an invitation e-mail to a group mailing list (e.g., students in the psychology department or an engineering department), with the approval of the list manager and your advisor. You can also post recruitment flyers on a student board, or place an advertisement in a student newspaper. Use all the resources and channels available to you efficiently.

There are disadvantages to using a sample of convenience. Perhaps the greatest is that the resulting sample is less likely to lead to generalizable results, because the subjects you recruit are less likely to represent a sample from a larger population. Students who participate often differ from other students. For instance, selection bias is a potential issue because such students are likely already interested in experimental methods and the hypotheses behind them, making them more vulnerable to experimenter effects. Furthermore, the sample itself may have hidden variability: the subjects you recruit with one method (e.g., e-mail) may differ from those recruited with another (e.g., a poster). We also know that they differ over time; those who come early to fulfill course requirements are generally more conscientious than those who come later. So, ensure that these participant types are randomly assigned to the various conditions of your study.

The largest and most carefully organized sampling approach is a random sample. In this case, researchers randomly sample a given population by carefully applying sampling methodologies meant to ensure statistical validity by making participant selection equally likely across the given population. Asking students questions at a football game as they go in does not constitute a random sample; some students do not go (selection bias). Other methods, such as selecting every 10th student based on a telephone number or ID, introduce their own biases.
For example, some students do not have a publicly available phone number, and some subpopulations register early to get their ID numbers. Truly choosing a random sample is difficult, and you should discuss how best to do this with your lead researcher.

Another approach to recruiting participants is a subject pool. Subject pools are generally groups of undergraduates who are interested in learning about psychology through participation; most psychology departments organize and sponsor them. You should discuss this as an option with your lead researcher and, where appropriate, learn how to fill out the requisite forms. If the students in the study are participating for credit, you need to be particularly careful about recording who participated, because the students' participation and the proof of that participation represent part of their grade.

A whole book could be written about subject pools. Subject pools are arrangements that psychology or other departments provide to assist researchers and students. The department sets up a way for experimenters to recruit subjects for studies. Students taking particular classes are provided either credit towards the class requirement or extra credit. When students do not wish to participate in a study, alternative approaches for obtaining course credit are provided. The theory is that participating in a study provides additional knowledge about how studies are run, and about the particular study itself. Researchers, in turn, receive access to a pool of potential subjects.
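Returning to the advice above about randomly assigning participant types to conditions: block randomization is one common way to do this while keeping group sizes balanced even if data collection stops early. The sketch below is generic; the condition names and participant count are invented for illustration.

```python
import random

def block_randomize(n_participants, conditions, seed=None):
    """Assign arriving participants to conditions in shuffled blocks,
    so that group sizes never differ by more than a partial block."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    order = []
    while len(order) < n_participants:
        block = list(conditions)  # one slot per condition per block
        rng.shuffle(block)
        order.extend(block)
    return order[:n_participants]

# Hypothetical two-condition study with 10 scheduled participants:
schedule = block_randomize(10, ["new interface", "control"], seed=42)
```

Because each block contains every condition exactly once, early-arriving (more conscientious) and late-arriving participants are spread across conditions rather than piling up in one.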
3.2 Experimental Methods: Background Readings for Your Study

Many introductory courses in statistics focus primarily on introducing the basics of analysis of variance (ANOVA) and regression. These tools are unsuitable for many studies analyzing human-subject data where the data is qualitative or sequential. Care, therefore, must be taken to design an experiment that collects the proper kinds of data. If ANOVA and regression are the only tools at your disposal, we recommend that you find a course focusing on the design of experiments featuring human participants, as well as the analysis of human data. We also recommend that you gather data that can be used in a regression, because regression supports stronger predictions: not just that a factor influences a measure, but in what direction (!) and by how much.

Returning to the topic of readings, it is generally useful to have read in the area in which you are running experiments. This reading will provide further context for your work, including discussions about methods, types of subjects, and pitfalls you may encounter. For example, the authors of one of our favorite studies, an analysis of animal movements, note that data collection had to be suspended after the researchers were chased by elephants! If there are elephants in your domain, it is useful to know about them. There are, of course, less dramatic problems, such as common mistakes subjects make, correlations in stimuli, self-selection biases in a subject population, power outages, printing problems, or fewer participants than expected. While there are reasons to be blind to the hypothesis being tested by the experiment (that is, you do not know what treatment or group the subject you are interacting with is in, so that you do not implicitly or inadvertently coach the subjects to perform in the expected way), if there are elephants, good experimenters know about them, and prepared research assistants particularly want to know about them!
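To make the earlier point about regression concrete: a fitted slope reports both the direction and the size of an effect, not merely that one exists. The toy numbers below are invented purely for illustration.

```python
def ols(x, y):
    """Ordinary least squares for one predictor:
    slope = cov(x, y) / var(x); intercept = mean(y) - slope * mean(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Invented data: practice trials (x) vs. task completion time in s (y).
trials = [1, 2, 3, 4, 5]
seconds = [60, 52, 47, 41, 35]
slope, intercept = ols(trials, seconds)
# slope is -6.1: each extra practice trial predicts roughly six
# seconds faster performance (direction AND magnitude), which an
# ANOVA omnibus test alone would not report.
```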
As a result, the reading list for any particular experiment is both important and variable. You should talk to other experimenters, as well as the lead researcher, about what you should read in preparation for running or helping to run a study.

3.3 Piloting

Conducting a pilot study based on the script developed for the research study is important. Piloting can help you determine whether your experimental design will successfully produce answers to your questions. If any revision to the study is necessary, it is far better to find and correct it before running multiple subjects, particularly when access to subjects is limited. It is, therefore, helpful to think of designing experiments as an iterative process characterized by a cycle of design, testing, and redesign. In addition, you are likely to find that this process works in parallel with other experiments and may be informed by them (e.g., lessons learned from ongoing related lab work).

Thus, we highly recommend that you use pilot studies to test your written protocols (e.g., instructions for experimenters). The pilot phase provides experimenters the opportunity to test the written protocols with practice participants, and is important for ironing out misunderstandings, discovering problematic features of the testing equipment, and identifying other conditions that might influence the participants. Revisions are a normal part of the process; please do not hesitate to revise your
protocols. This will save time later. There is also an art to knowing when not to change the protocol; your principal investigator can help judge this! It is also useful at this stage to write the method section of your paper. Not only is your memory much fresher, but you can also show other researchers your method section and receive suggestions from them before you run the study, which is definitely a good time to get suggestions. These suggestions can save you a lot of time, in that these reviews essentially constitute another way of piloting the study.

3.4 Chances for Insights

Gathering data directly can be tedious, but it can also be very useful and inspiring. Gathering data gives you a chance to obtain insights about aspects of behavior that are not usually recorded, such as the user's questions, their posture, and their emotional responses to the task. Obtaining these kinds of insights, and the intuition that follows from these experiences, is an important aspect of the experimental process. Gathering data is particularly important for young scientists: it gives them a chance to see how previous data has been collected and how studies work. Reading will not provide you this background or the insights associated with it; rather, this knowledge only comes from observing the similarities and differences that arise across multiple participants in an experiment. So, be engaged as you run your study and then perform the analysis. These experiences can be a source for later ideas, even if you are doing what appears to be a mundane task.

In addition, being vigilant can reduce the number and severity of problems that you and the lead investigator will encounter. Often, these problems may be due to changes in the instrument, or to changes due to external events. For example, current events may change word frequencies for a study on reading. Currently, words such as bank, stocks, and mortgages are very common, whereas these words were less prevalent three or four years ago.
4 Conclusions

Once a science is mature, practitioners know the methods; while a science is still growing, however, methods and procedures have to be taught more explicitly. Furthermore, when methods move between areas (e.g., techniques associated with behavioral studies moving from psychology to computer science and engineering), there must be an effort not only to document and disseminate these methods but also to formally transfer them, with greater attention given by senior investigators.

In our presentation we provide practical advice regarding conducting experiments with human participants. We are working on extending and polishing a written guide that will be useful to anyone who is starting to run research studies, training people to run studies, or studying the experimental process. We expect this guide will be particularly helpful to students who are not in large departments, or who are running participants in departments that do not have a long history of conducting human-based research.
Currently, the report is in use at five universities in the US, Canada, and England for graduate and advanced undergraduate courses in cognitive science, human factors engineering, and human-computer interaction. As a colleague noted, this report contains just common sense. In this case, we have found that the common sense is not so common, and that new researchers, both students and those taking up a new methodology, need a good dose of common sense.
Acknowledgements. This work was sponsored by ONR (W911QY-07-01-0004 and #N00014-10-1-0401).
References

1. Howell, D.C.: Statistical methods for psychology, 6th edn. Thomson, Belmont (2007)
2. Rempel, D., Willms, K., Anshel, J., Jaschinski, W., Sheedy, J.: The effects of visual display distance on eye accommodation, head posture, and vision and neck symptoms. Human Factors 49(5), 830–838 (2007)
3. Ritter, F.E., Kim, J.W., Morgan, J.H.: Running behavioral experiments with human participants: A practical guide. Tech. Report No. 2009-1, Applied Cognitive Science Lab, College of Information Sciences and Technology, The Pennsylvania State University (2009)
A Genesis of Thinking in the Evolution of Ancient Philosophy and Modern Software Development

Stephan H. Sneed

jCOM1, Münchner Straße 29 - Hettenshausen, 85276 Pfaffenhofen, Germany
[email protected]
Abstract. This paper is a brief discussion of the issue of modeling evolution. The question posed is where modeling theory comes from and where it is going. A parallel is drawn between the evolution of modeling theory and ancient philosophy. In the end, both must come to the conclusion that a theory must be applicable to a human problem and must lead to a solution of that problem; otherwise, it is useless. It is pointed out that some methodological approaches are even detrimental to reaching a good solution: they only absorb costs and effort and lead nowhere. Just as Aristotle rounded out the ancient philosophical discussion, the S-BPM method appears as the next logical step in the evolution of modeling methods. In the end, some research issues are discussed, for the S-BPM method as well as for modeling comparison in general.

Keywords: SA, OOA, OOP, S-BPM, paradigm change, modeling methods, programming languages, software development, evolution.
2 Formal Theories in IT

Let us define executable machine instructions as the base of any abstraction, with index 0, and therefore as the atomic elements of higher-level abstractions. Any higher-level abstraction of machine code is a formal theory above this base with an index n. Any model consisting of more than one atomic algorithm of such a formal theory is called a system. The problem with these definitions is that they are recursive and not bounded. Two instructions can already form a system. A function is already a system, as is anything that is not atomic like a single instruction or a single data item. Hence, referring to something as a system consisting of functions that consist of instructions is absolutely arbitrary. The fact is that we have systems encapsulated by other systems, and so on; there can be an infinite nesting of systems.

So each programming language is an abstraction with an index n of the atomic instructions at level 0, and hence a formal IT theory. Each executable program written in this language is a model of this theory and can be interpreted by the computer. In the case of modeling methods, one has to distinguish between those that are executable and those that are not. Some modeling languages like SADT were never intended to be executable. Originally UML was not intended to be executed; now it is. In any case, executable modeling languages (or their executable subsets) are formal IT theories about the code they govern. A model of such a theory can be executed, hence is an abstraction of machine code, and hence an IT system.

This brings us to an interesting point: what is actually the difference between a programming language and an executable modeling method? They have the following aspects in common:

• Programming languages and modeling methods are both level-n abstractions of machine code.
• Models of programming languages and models of modeling methods are both called IT systems.
The aspects in which programming languages and modeling methods differ are the following:

• The algorithms in programming languages have a textual representation, whereas the representation of algorithms in an executable modeling method is graphical. Actually, most modeling languages reach a point where graphics no longer suffice; then they revert to text in tags, like OCL as an extension of UML.
• Usually the modeling method is an abstraction of a certain programming language and has the abstraction index n+1 when the programming language is of abstraction index n.

It becomes clear that the type of algorithm representation, whether textual or graphical, is not crucial for the theory itself. We could well imagine a graphical Java editor where an if statement could be placed in a diamond-shaped form as with structured analysis. Thus, the representation of algorithms is actually not relevant, since it can be replaced by other representations. The other aspect is the degree of the abstraction level. A modeling method will usually have a higher
abstraction index than a programming language, but this aspect does not reflect a causal relationship. We could also imagine a modeling method that is directly based on a particular machine code. In the end, there is no real difference between programming languages and executable modeling methods. They are both nothing but abstractions, at a certain level n, of machine code. Both are developed in order to reduce complexity and to make the systems described with them easier to comprehend. Thus both of them have to be treated equally when talking about their value to the user.
3 Formal IT Theories and Their Value

Value can be expressed in terms of cost effectiveness, reduced time, and enhanced comprehension. What does cost effectiveness mean in the sense of formal IT theories? In the first place, any business IT system has the purpose of enhancing user revenues. So, to be of value, a language or model must be cost effective, i.e., inexpensive to use. A language which requires a longer time to formulate a solution is less valuable than one which fits the problem better and can be expressed more quickly. General-purpose languages are generally not suited to particular problems and require more time to formulate; this is the advantage of domain-specific languages.

The third potential value is more problematic. What does it mean to be comprehensible? Are there certain properties that make a formal language easier to comprehend than others? Are they objective? Can they even be? Nietzsche says that every product of the human mind is only relevant for humans. So, according to him, there are no general criteria for good and bad languages, and he gives good reasons for his claim [Nietzsche1999]. Languages can only be judged in terms of their value to the user. If so, the preferable language would be the one which requires the fewest mappings between the human user and the IT system.
Fig. 1. The figure shows a simple and a complex mapping. The simple mapping occurs where real-world (or natural-language) and modeling-world entities are the same, while the complex mapping occurs where they differ.
4 A Genesis of Thinking

It is, of course, not possible to discuss the complete evolution of Greek philosophy and of IT programming and modeling methods within this short paper. But there is also no need for this; it suffices to concentrate on those key aspects that determined a new evolutionary step. The evolution of model construction seems to be common to both ancient philosophy and modern system design. The requirements for proper models, the arguments for and against certain approaches, as well as the criticism and condemnation, follow exactly the same track. This evolution, "based on the problem and dictum of Parmenides describes [in both disciplines] a curious curve of thinking" [Buchheim2003], which will be described in this chapter.

Parmenides and Bottom-up System Design. The beginning of software design methodology coincides with the point in time at which the ancient Greeks started to reflect on the nature of the world and the reason for their being. Heraclitus, a contemporary of Parmenides, pointed out the difficulties one has when discussing an ever-changing and evolving world, while Parmenides reflected on the construction of models as well. "The one most famous claim made by Parmenides that can serve for constructing any ontology" [Buchheim2003] is the following: "What is thought must correspond to what exists" [Parmenides, Fragments, B3]. This claim of Parmenides presupposes a tight relationship between the intelligible and the existing entities. The fulfillment of this axiom is what we will be pursuing when comparing the evolution of ancient philosophy with that of system modeling. Regarding ancient philosophy we have to rely on the conclusions reached at that time, but as for system design, i.e.
business engineering, we can determine whether a model fulfills the claim of Parmenides or not: the claim of Parmenides is fulfilled by a modeling approach in IT when the respective model can be automatically transformed into coding instructions performing specific functions with interfaces to the outer world, that is, into an executable system. This is true only if the model is an abstraction of the underlying machine code. In software design, the first approaches to dealing with program complexity were the hierarchical, function-structured approach of IBM in the USA (HIPO) and the hierarchical, data-structured approaches of Jean-Dominique Warnier in France (LCP) [Warnier1974] and Michael Jackson in England (JSP) [Jackson1983]. The situation in software development is, according to Warnier, what he calls the space and time problem: one imagines how a program is executed in time and writes down the instructions to be executed in space [Warnier1974, p. 15]. He claimed that “this approach is not satisfactory, for it leads us to attach equal importance both to details and to essential points; it must be replaced by a hierarchical approach which allows us to go from abstractions to details, i.e. from the general to the particular” [Warnier1974, p. 15]. This led to a structured approach, in which data is grouped according to some criteria into a hierarchy of data types, where each node of the hierarchy can be a set of data or an elementary data attribute. This data hierarchy was then mapped into a corresponding hierarchy of functions, so that for every node of the data tree there was a corresponding node in the function tree for processing it. There are, in the end, as many functional hierarchies as there are input/output data structures.
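The mapping from a data hierarchy to a function hierarchy can be sketched in a few lines of Python. The report structure, the function names and the aggregation rule below are hypothetical illustrations of the LCP/JSP principle, not Warnier's or Jackson's actual notation:

```python
# Hypothetical sketch of the Warnier/Jackson idea: a hierarchical data
# structure is mirrored by a hierarchy of processing functions, one
# function per node of the data tree.

# Data hierarchy: a report consists of groups, each group of detail lines.
report = {
    "title": "Sales",
    "groups": [
        {"region": "North", "lines": [10, 20]},
        {"region": "South", "lines": [5]},
    ],
}

# Function hierarchy mirrors the data hierarchy node for node.
def process_line(amount):        # leaf node of the data tree
    return amount

def process_group(group):        # inner node: one group of lines
    return sum(process_line(l) for l in group["lines"])

def process_report(report):      # root node: the whole structure
    return {g["region"]: process_group(g) for g in report["groups"]}

print(process_report(report))    # {'North': 30, 'South': 5}
```

Because every node of the data tree has exactly one processing function, the program structure can be derived mechanically from the data structure, which is the sense in which the model "is" the program.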
A Genesis of Thinking in the Evolution of Ancient Philosophy
133
Each functional hierarchy corresponds to a subprogram. A program consists of all subprograms derived from the data structures used by that program. This hierarchical, data-oriented approach introduced by Warnier was a means of modeling IT systems from the bottom up, based on the data they use. He also proposed a set of diagrams for depicting the data structures and their corresponding functional structures. The major reason why his method was not widely accepted was the difficulty involved in drawing those diagrams. Michael Jackson in England came up with a similar approach, but his tree diagrams were much more intuitive and easier to draw. Therefore the Jackson method of modeling computer programs became much more widespread. Furthermore, the Jackson structure diagrams could be mapped 1:1 into a program design language from which COBOL programs were generated. Thus, there was a direct, uninterrupted path from the model to an executable system. The prerequisite was, of course, that the model be at the same semantic level as the program itself, so that in the end the model is the program. Every single data item and elementary operation is contained within the model, thus fulfilling the premise of Parmenides. However, what was the advantage of this method other than changing the textual representation into a graphic one? Socrates and Plato, Structured Analysis and Object Orientation. With the increasing power of the computer, the size of IT systems increased as well. By the end of the 1970s the average commercial program size had surpassed 2000 statements, rendering it impossible to apply a bottom-up approach with the then existing tools. This led to the so-called software crisis of the 1970s, a crisis which continues even after 30 years. “The major cause of this crisis is that the machines have become several orders of magnitude more powerful! 
To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem” [Dijkstra1972]. Thus the situation of any IT manager trying to understand all the algorithms (whether graphical or textual) under his responsibility could well be described as that of a human striving for the wisdom of god. But Socrates claimed: “Human wisdom is worth little or nothing at all” [Socrates, Apology 23a]. So, according to Socrates, it is not at all possible for a human to attain the complete and definite truth. The only thing that can be done is to try to get close to it by human assumptions and the testing of hypotheses, without knowing where this may lead in the end. This is exactly the thesis put forward by the advocates of agile development. They claim it is impossible to model and plan a complex IT system in advance. The only way to construct one is through trial and error. “While the Pre-Socratic philosophy had been about achieving the truth by reducing concepts to existing reality, Socrates was looking for the most accurate alphabetization of the claimable, which was directed towards searching for the truth, but without ever really obtaining it” [Buchheim2003, p. 208]. This is equally true for agile development using object-oriented design. The analyst starts to study a given problem (“Analysis is the study of a problem…” [DeMarco1978, p. 4]) whose solution may, after many months of discussion with the user and the development team (“The analyst is the middleman between the user, who decides what has to be done, and the development team, which does it.” [DeMarco1978, p. 7]), turn out to be completely different in the end than it might have appeared at the beginning. Socrates knew quite well that
human beings cannot grasp complex reality, as the following quote shows: “That which we believe to understand clearly, we should look upon skeptically, for it is surely in error, even though we might all agree that we understand it” [Socrates, Sophistes 242a]. This was the main motivation for the structured and, later, the object-oriented design methods. “The principal goal of Structured Analysis is to minimize the probability of critical analysis phase errors” [DeMarco1978, p. 9]. If errors are to be avoided within the analysis phase, they are avoided by constructing the proper models, gained through proposal and discussion. To quote the literature, “OO analysis and design […] are powerful ways to think about complex systems” [Martin1992, p. xi]. Hence all analysis and its models, whether structured or object-oriented, serve as a basis of discussion to avoid errors within the analysis phase, thus corresponding to the philosophical approach of Socrates: “The highest good for man is to produce [trains of thought] each day that you hear me talking about in discussions and thereby put myself and others in question” [Buchheim2003, p. 210]. It was pointed out before that Socrates had his doubts about the Parmenidean approach of depicting reality in accordance with a single detailed view of the world. He is quoted as saying “Human wisdom is insignificant compared to the true wisdom of god that is not at human disposal” [Buchheim2003, p. 206]. The wisdom of god in terms of information technology would mean nothing less than knowing the content of each and every single instruction in a system, or its representation within a model. Creating models by abstracting elementary statements in the way that Warnier and Jackson did might work for small programs. However, with the increasing power of the computer, the size of IT systems increased as well. But the entities of DeMarco's modeling approach are not physical entities like modules, interfaces and code blocks. 
They are abstract features defined in a specification made independently of the machine code. This means it was never intended to derive executable programs from the model. DeMarco proposed to stop extending the top-down model in a downward direction once one believes that the “lowest-level bubbles can be completely described in a mini-spec of about one page” [DeMarco1992, p. 84]. Object-oriented analysis and design is more of a bottom-up approach in the sense of Warnier and Jackson. It starts with the data objects of a system and determines what happens to them. Once the building blocks are determined, the designer then decides what to do with them. The object model is really derived from the data model, as depicted by Shlaer and Mellor in their book on modeling the world with data [ShlaerMellor1988]. Unlike structured design, which was never intended to be used as a basis for code generation, object-oriented design can correspond to the physical system, i.e. to the code, provided it is broken down far enough. The UML-2 language makes this possible, as claimed by Ivar Jacobson in his keynote speech at the International Conference on Software Maintenance in Florence in 2001, entitled “UML – all the way down” [Jacobson2001]. By that he meant that with the new UML-2 it would be possible to go all the way from the requirements to the code. Only one language would be required, and that is UML. The problem with this one-language approach is that the one language inevitably becomes just another programming language. As Harry M. Sneed noted in his article “The Myth of Top-Down Software Development”, quoting DeMillo, Perlis and
Lipton, “’the program itself is the only complete and accurate description of what the program will do.’ The only way to make the specification a complete and accurate description of the program is to reduce the specification to the semantic level of the program. However, in doing this, the specification becomes semantically equivalent to the program” [Sneed1989, p. 26]. So the OOA and OOD methods are nothing more than graphical representations of OO programming languages. The relevant principles of OO programming languages are inheritance, encapsulation (information hiding), locality of reference and polymorphism [Rumbaugh1991]. Each and every class extends the all-integrating superclass “Object”. This corresponds exactly to Plato's theory of ideas. Individual ideas are all derived from a single overriding idea, the idea of the good, and each lower idea has its part in the higher one. With his notion of the good, Plato introduced something that he assumed to be “the organizing principle of the world” [Buchheim2003, p. 211]. The benefit of this concept is that information can be moved out into the superclasses (in a complex inheritance structure) in order to minimize the size of the lower-level classes. But, of course, this requires a very intelligent design of the superclasses, a degree of intelligence most software developers do not possess. “Our constitution will be well governed if the enforcement agencies are always aware of and adhere to the higher order governing principles” [Plato, Politeia 506b]. Probably the perfect inheritance structure is just as much a utopia as Plato's perfect state described in the Politeia. If we define a superclass for creatures that walk on two legs and call them humans, as well as one class for animals that fly through the sky called birds, what do we do with the ostrich? Often things have less in common than we think they have. The Next Step of the Evolution: Aristotle and S-BPM. 
The advantages of object-oriented design and object-oriented programming are clear: like the top-down approach of structured analysis and UML-1, they make things simple and quick to discuss, so that the most important parts can receive the deepest discussion, as Socrates proposed. Object-oriented design languages, e.g. UML, attempt to be both a descriptive language and an implementation language, in the sense of Plato's universal, all-encompassing model. Aristotle's criticism of his predecessors in ancient philosophy applies equally well to the early design methods. SD and UML-1, as well as non-formal BPM methods (like ARIS), were never intended to be an abstraction of the physical solution, i.e. the machine code. They exist only on paper or as a graphical model within some drawing tool. In fact, they have no guaranteed reference to the executable machine code. They may describe something totally different from what has been implemented. And, since the code changes while the design documentation remains static, the model soon loses its value, if it ever had one. In our fast-changing world such models represent a lost investment. The slightest change to the code renders the model obsolete. The two descriptions no longer match. The problem is that model and code are depictions of separate worlds. This is the same as with Plato's theory of ideas: the ideas form another reality with a completely different ontological status. How can we use an explanation from world A to explain something that lies in world B when the two worlds differ so much? This was the reason for Aristotle to make the following claim:
“If these were now separated from each other, there would be no science of the ones, while the others would not exist” [Aristotle, Metaphysics 1031b]. Object-oriented design methods and their graphical representation UML-2 have reunited model and reality, but they are lacking in another respect. The formal OO theory defines objects, with methods and properties, as its main entities. But in the real world, objects are passive. An invoice in the real world does not have any functionality like “calculate sum”. In the real world, this functionality belongs to the actor responsible for preparing the invoices. He calculates the sum and adds it to the invoice. If this actor is automated, it is still an actor and hence a subject, not an object. An object which controls the behavior of other objects, such as the control object proposed by Jacobson, is an artificial entity that exists neither in the real world nor in natural language. The use case and the control object are no more than patches that were added to make the original method usable. So, instead of defining new artificial entities like Plato's ideas or control objects with functionality, Aristotle demands that one define subjects: “Definitions exist either only or at least preferably of substances (subjects)” [Aristotle, Metaphysics 1031a]. The only objects that should be allowed are data beans. Anything with more functionality than get and set methods is no longer an object but a subject. With its introduction of the subject, the S-BPM method is in tune with this major criticism of Aristotle. The S-BPM method takes this claim seriously by introducing the notion of subjects, denoting any kind of actor within a business process: “The acting elements in a process are called SUBJECTS. SUBJECTS are abstract resources which represent acting parties in a process. SUBJECTS are process specific roles. They have a name, which gives an indication of their role in the process. 
SUBJECTS send or receive messages to or from other process participants (SUBJECTS) and execute actions on business objects” [Fleischmann2010, p. 12]. SUBJECTS are distinguished from each other by their behavior, which consists of chains of elementary operations such as SEND, RECEIVE and ACTION. Thus the SUBJECTS of an S-BPM model are already implemented subsystems in the form of prefabricated building blocks. Their specific behavior can be modeled by drag and drop in order to define the unique behavior of their relationships. They may represent an existing or planned system, a human actor or an organizational unit. In the case of a system, they need to be linked to that system, but the system's functionality can be simulated as long as the connection is not yet completed. In the case of a human actor, the human interface is already provided by the SUBJECT. The relations of the SUBJECTS are defined within the S-BPM communication view, where “it is defined which SUBJECTS participate in a process and which messages they exchange” [Fleischmann2010, p. 12]. Altogether they form an S-BPM business process. It is important to note that a modeling method defines only the syntax of a language. The semantics come with the modeling itself and are provided by the entities and relationships the model artifacts represent in the real world. Since existing legacy systems or actors have to interact with the new actors, they should be modeled first. For existing actors and constraints, the model is made according to the real world. Actors that do not yet exist are modeled according to the role
they should play. Otherwise the process might be executable according to a weaker definition, but not according to the strong definition required for process operation. It seems that the bottom-up approach also has to be applied, in part, to the semantics (not discussed within this paper) in order to get a fully operational process.
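The Aristotelian separation argued for above — passive business objects as mere data beans, with all behavior residing in subjects — might be sketched as follows. The class names and the "calculate sum" rule are illustrative assumptions, not S-BPM code:

```python
# Hypothetical sketch: objects are passive data beans, while behavior
# belongs to subjects (actors). Invoice offers nothing beyond get/set;
# the Accountant subject carries the "calculate sum" functionality.

class Invoice:
    """A pure data bean: only get/set access to its fields."""
    def __init__(self):
        self._items = []
        self._total = None

    def get_items(self):
        return list(self._items)

    def set_items(self, items):
        self._items = list(items)

    def get_total(self):
        return self._total

    def set_total(self, total):
        self._total = total

class Accountant:
    """A subject: the actor responsible for preparing invoices."""
    def prepare(self, invoice):
        # The subject acts on the passive business object.
        invoice.set_total(sum(invoice.get_items()))
        return invoice

invoice = Invoice()
invoice.set_items([100, 250, 50])
Accountant().prepare(invoice)
print(invoice.get_total())  # 400
```

In this reading, any method on Invoice beyond the accessors would turn the object into a disguised subject, which is exactly what the criticism of control objects targets.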
5 Conclusion The issues to be considered in developing a “better” implementation language as a successor of the OO paradigm are, according to Demelt and Maier, the following [DemeltMaier2008]: better structuring capability, higher language abstraction, development of formal methods, and tool support. A more valuable model in terms of cost effectiveness, ease of use and comprehensibility is achieved by the S-BPM method, by taking the subject out of the objects and putting it where it belongs, as in the real world, where subjects are independent of the objects they deal with. Moreover, S-BPM is a pure abstraction of the underlying Java code and remains in tune with it throughout the system's evolution. The S-BPM method is based on formal theory and is supported by the Metasonic tool suite, which ensures that model and implementation are always synchronized. Just as Aristotle adopted from Plato the principle of a complete and all-encompassing description of reality, a description which combines generality with specifics, so has the S-BPM method taken over the principles of abstraction from object-oriented design. However, instead of having a single, central hierarchy, as with Plato and OO, S-BPM follows the Aristotelian philosophy by having several parallel hierarchies. It distributes the complexity by subject. In that way, the problems associated with deep inheritance trees are avoided. Also, the objects are separated from the subjects. This corresponds to Aristotle's principle of distinguishing between the category and the members of categories, which can be either subjects or objects. The S-BPM method is devoted to being a mirror image of the real world, reflecting the real-life subjects, objects and actions. The model can be mapped 1:1 not only to the code but also to the business process the code is embedded in. In so doing, it becomes cost effective, easy to use and easy to understand, thus achieving maximum value for the user. 
Finally, S-BPM fulfills all the requirements of Demelt and Maier with regard to creating a new paradigm for software development.
References
1. Buchheim, T.: Ist wirklich dasselbe, was zu denken ist und was existiert? – Klassische griechische Philosophie. In: Fischer, E., Vossenkuhl, W. (eds.) Die Fragen der Philosophie. C.H. Beck, Munich (2003)
2. DeMarco, T.: Structured Analysis and System Specification. Yourdon, New York (1978)
3. Demelt, A., Maier, M.: Paradigmenwechsel: Was kommt nach der Objektorientierung? In: Objektspektrum 2008/6. SIGS DATACOM, Troisdorf (2008)
4. Dijkstra, E.: The Humble Programmer (EWD340). Communications of the ACM (1972)
5. Fleischmann, A.: What is S-BPM? In: Buchwald, H., Fleischmann, A., Seese, D., Stary, C. (eds.) S-BPM ONE. CCIS, vol. 85. Springer, Heidelberg (2010)
6. Jackson, M.: System Development. Prentice/Hall International, Englewood Cliffs (1983)
7. Jacobson, I.: Four Macro Trends in Software Development. Keynote, in: Proceedings of the Conference on Software Maintenance, Florence. IEEE Computer Society Press, Washington (2001)
8. Martin, J., Odell, J.: Object-oriented Analysis & Design. Prentice/Hall International, Englewood Cliffs (1992)
9. Nietzsche, F.: Über Wahrheit und Lüge im außermoralischen Sinne. Kritische Studienausgabe Band 1. DTV/de Gruyter, München (1999)
10. Rumbaugh, J., Blaha, M., Premerlani, W., Eddy, F., Lorensen, W.: Object-Oriented Modeling and Design. Prentice-Hall, Englewood Cliffs (1991)
11. Shlaer, S., Mellor, S.: Object Oriented Systems Analysis: Modeling the World in Data. Prentice Hall, New Jersey (1988)
12. Sneed, H.: The Myth of ‘Top-Down’ Software Development and its Consequences for Software Maintenance. In: Proceedings of the Conference on Software Maintenance, Miami 1989. IEEE Computer Society Press, Washington (1989)
13. Sneed, H.: Software Entwicklungsmethodik. Verlag Rudolf Müller, Köln (1996)
14. Warnier, J.D.: Logical Construction of Programs. Van Nostrand Reinhold Company, Paris (1974)
Understanding the Role of Communication and Hands-On Experience in Work Process Design for All
Christian Stary
Department of Business Information Systems – Communications Engineering, University of Linz, Austria
[email protected]
Abstract. The paper motivates the explicit recognition of communication and hands-on experience when stakeholders design work processes, both on the individual and on the organizational level. As a straightforward implementation, Subject-oriented Business Process Management is reviewed. Its constructs for modelling, and the resulting capabilities for seamless execution when using a corresponding suite, are discussed. In particular, it is shown how stakeholders can articulate their way of task accomplishment in terms of communication relationships while producing an executable model. As the behaviour of all stakeholders participating in a specific business process can be expressed in terms of communication acts, adjusting individual and task-relevant flows of communication leads to a complete picture of an organization in operation. Moreover, subject-oriented representations allow executing the resulting workflow without further transformations. They enable interactive experience of business processes, which in turn facilitates (collective) reflection and redesign. In this way, stakeholders can trigger and control seamless round-trips in organizational development. This minimizes development costs and social risks, since alternative ways of task accomplishment can be negotiated before becoming operational in daily business. Keywords: Work process modeling, Subject-oriented Business Process Management, Participatory Design, seamless roundtrip engineering, articulation and negotiation.
acts. However, the activities in operation and development of an organization decide upon its success and failure. The latter, functional perspective on organizations is addressed by, and handled through, traditional techniques in Business Process Management (BPM). They structure organizational behavior and arrange the organization of work according to sequences of functions for task accomplishment (cf. Laudon et al., 2005). In the course of modeling, a chain of activities (functions) is defined according to temporal and/or causal relationships when handling work tasks. Communication can be represented in such models by overlaying communication relationships, e.g., defining communication flows (cf. Scheer, 2001). Following this approach, organizations are primarily described through functional decomposition rather than through the adjustment of communication acts. As social systems, organizations are living systems: they behave ‘non-trivially’ — their behavior cannot be (pre-)determined externally, nor described by specifying causal relationships (Varela et al., 1974; von Foerster, 2003). A certain input is processed according to the inner state and the activity patterns of an organization. Situational context influences individual behavior and triggers work activities. In accordance with that, business process models should reflect the stakeholder-specific situational context of functions, focusing on communication acts (cf. Wiesenfeld et al., 1998). Such a shift requires rethinking management in general: scientific management techniques as proposed by Taylor still dominate in leading organizations, and center on control and efficiency (Hamel, 2007). They consider organizations as ‘trivial’ machines, and strive for deterministic behavior patterns. In a world where adaptability and creativity drive business success (cf. Hamel, 2007) they need to be adapted to, if not replaced by, mechanisms of co-ordination and guidance (cf. Böhle et al., 2004). 
Such a shift allows sharing not only ideas and expertise, but also values, interests and objectives (cf. Back et al., 2004, www.vernaallee.com), coming back to the initially mentioned observation of values as drivers of change. Today's change management and organizational development processes rely on the use of information and communication technologies. They link communities by providing the technical infrastructure for collaboration and for knowledge representation, processing, and sharing (cf. Van den Hooff et al., 2004). Organizations that have implemented environments recognizing the nature of social systems report significant benefits in terms of knowledge transfer efficiency, response time and innovation (Deloitte, 2002). In this paper we discuss the baseline of socio-technical work (re-)design, namely subject-oriented business process management of organizations. As it actively supports describing work processes from a stakeholder perspective and their immediate execution, scientific management of development processes can be replaced by self-control when handling change requests and ideas (cf. Bieber et al., 2002; Hill et al., 2009). It allows breaking through the (vicious) cycle of re-inventing organizations by identical means, which might lead to self-referential results and cause more economic damage than benefit, as Kotter argues metaphorically: ‘The (penguin) colony ensured that changes would not be overcome by stubborn, hard-to-die traditions’ (Kotter, 2008).
Understanding the Role of Communication and Hands
141
2 Organizational Key Asset Communication In this section the essential role of communication is shown, as S-BPM strives for complete and coherent descriptions of business processes. The perspective of individuals is addressed in section 2.1; the flow of communication between stakeholders is elaborated in section 2.2. 2.1 Individual Work and Process Completion Subject-oriented Business Process Modeling (S-BPM) is based on the conceptual understanding of processes as functionally interacting subjects, i.e. actors, agents or services, such as accountants, sales persons, information systems, or knowledge management systems (see also Fleischmann et al., 2011). Subjects can not only be persons or software applications; they can also be a combination of both, for instance a person entering data into a software application. A process is considered the structured interaction of the subjects involved in a business or work transaction. Subjects transfer information and coordinate their work
Fig. 1. An employee’s communication perspective when specifying a vacation request process
142
C. Stary
by exchanging messages. Messages can be exchanged synchronously, asynchronously, or in a combined form. The synchronization type can be specified depending on the message type and the sending subject. Each subject has an input pool as a mailbox for incoming messages. The synchronization type is defined using attributes of the input pool. In figure 1 the behavior of the subject Employee in terms of exchanging messages along a simple vacation application process is shown, starting with filling in a vacation request form to be sent to the subject Manager. In the same, though complementary, way the interaction between Manager and Employee can be specified (see figure 3). In general, each subject involved in a business process sends and receives messages, and accomplishes some tasks without interaction. The definition and the behavior of a subject depend on the order of sent and received messages, the tasks being accomplished, and the way these influence its behavior. If a subject sends a message, the information transferred with that message is derived from user inputs or computed by some application. These send functions are executed before a message is sent. Vice versa, if a subject accepts a message, a corresponding receive function is executed. The information received through the message is used as an input to that function. These receive and send functions represent so-called refinements of a subject. They constitute the interface of a subject to the applications used by the subject. Once all subjects participating in a specific process have been identified, each of them can be specified in terms of its communication behavior in the way explained above (cf. Fleischmann et al., 2011). To complete the description of a process in terms of communication lines, the interfaces have to be adjusted, as exemplified for the vacation application in figures 2 and 3.
Fig. 2. Organizational perspective as set of communication interactions
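A minimal sketch of the input-pool mechanism, assuming a simple queue per subject; the class and method names are invented for illustration and do not reflect the Metasonic suite's actual API:

```python
# Minimal sketch of the input-pool idea: every subject owns a mailbox
# into which other subjects deposit messages; the receiver takes them
# out at its own pace (asynchronous exchange).
from collections import deque

class Subject:
    def __init__(self, name):
        self.name = name
        self.input_pool = deque()   # mailbox for incoming messages

    def send(self, receiver, msg_type, payload=None):
        # Deposit a message in the receiver's input pool.
        receiver.input_pool.append((self.name, msg_type, payload))

    def receive(self):
        # Asynchronous receive: returns None if the pool is empty.
        return self.input_pool.popleft() if self.input_pool else None

employee = Subject("Employee")
manager = Subject("Manager")

employee.send(manager, "vacation request", {"days": 5})
sender, msg_type, payload = manager.receive()
print(sender, msg_type, payload)
```

A synchronous variant would block the sender until the receiver takes the message out of the pool; in the sketch above this would amount to bounding the queue at size zero.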
2.2 Subjects as Drivers of Processes The flow of communication in a networked, subject-driven process environment can best be illustrated by proceeding with the vacation process example. The behavior of the Manager is complementary to the Employee's. The messages sent by Employee are received by the subject Manager and vice versa. Figure 3 shows the behavior of the subject Manager. The Manager waits for the vacation application of Employee. Upon receipt, the vacation application is checked (a state). This check can result in either an approval or a rejection, leading to the corresponding state. The subject
Fig. 3. Adjusted behavior of the subject Manager
Employee receives the result (i.e. the decision). In case the vacation application is approved, the subject Human Resource Department is informed about the successful application. In terms of S-BPM, the subject HR receives the approved vacation application and adds it to Employee's days-off record, without further activities (process completion). The description of a subject defines the sequence of sending and receiving messages, and the processing of internal functions, respectively. In this way, a subject specification contains the sequence of functions to be triggered, so-called services (an abstraction from implementation). These services can be standard ones for communication, like send, or predicates dealing with specific objects, such as those required when an employee files a vacation application form (vacation request in figure 1). Consequently, each node (state) and transition has to be assigned an operation. The implementation of that operation does not matter at this design stage, since it can be handled by (business) object specifications. A service is assigned to an internal functional node. If this state is reached, the assigned service is triggered and processed. The end conditions correspond to the links leaving the internal functional node. Each result link of a sending node (state) is assigned a named service. Before sending, this service is triggered to determine the content or parameters of the message. The service determines the values of the message parameters transferred by the message. Analogously, each output link of a receiving node (state) is also assigned a named service. When a message is accepted in this state, that service is triggered to identify the parameters of the received message. The service determines the values of the parameters transferred by the message and provides them for further processing. These services are used to assign a certain meaning to each step in a subject's behavior. Services allow defining the functions used in a subject. 
All of these are triggered in a synchronous way, i.e. a subject only reaches its subsequent state once all triggered services have been completed. The functions of a subject are defined by means of objects. In this way, a process specification can be completed for automated execution.
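The assignment of services to states and transitions can be sketched as a small state machine. The states mirror the Employee behavior of figure 1, while the service names and the context dictionary are assumptions made for illustration, not the tool's actual model format:

```python
# Sketch: each node (state) of a subject's behavior has a service
# assigned to it; entering the state triggers the service, and the
# transition leads to the next state.

def fill_in_form(ctx):
    ctx["form"] = {"applicant": ctx["who"], "days": ctx["days"]}

def send_request(ctx):
    # A send service determines the message content before sending.
    ctx["outbox"] = ("Manager", "vacation request", ctx["form"])

def await_answer(ctx):
    ctx["state_log"].append("waiting for decision")

# Behavior: state -> (assigned service, next state)
behavior = {
    "fill in vacation request": (fill_in_form, "send to manager"),
    "send to manager": (send_request, "wait for answer"),
    "wait for answer": (await_answer, None),   # end of this walk
}

def run(start, ctx):
    state = start
    while state is not None:
        service, state = behavior[state]
        service(ctx)   # service triggered on entering the state
    return ctx

ctx = run("fill in vacation request",
          {"who": "Employee", "days": 5, "state_log": []})
print(ctx["outbox"][0])   # Manager
```

The synchronous triggering described in the text corresponds here to the loop only advancing once the current service call has returned.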
3 Development Key Asset: Self-directed Interactive Experience

After completing subject behavior specifications and their mutual adjustment, stakeholders can execute syntactically valid models. In section 3.1 the procedure is exemplified. Such interactive experiences can be shared by reflecting on the model, as shown in section 3.2 using a semantic content management approach.

3.1 Seamless Interactive Workflow Execution

In S-BPM, in order to enable direct process experience, stakeholders are empowered to execute what they have modeled without further transformation (seamless processing). For instance, an employee entitled to apply for vacation is able to create a new instance of a process specification (see also www.metasonic.de). After creating the process instance, the stakeholder is guided through the process. He/she is asked by the BPM suite which transition he/she wants to follow. For instance, once the stakeholder knows that he/she has to fill in the business message form with the corresponding data, and that the form has to be sent to the manager, he/she follows the transition "send". In the state "Prepare Message and select Receiver", following the transition "send", he/she fills in the business object with the data required for an application for vacation. In the following figure, the elements of the user interface created by the S-BPM suite are shown:
1. refers to the name of the current state: "Prepare Message and select Receiver"
2. gives the title of the process instance: "Request for vacation"
3. shows the creation date of the process instance
4. is the form for filling in the business object data
Fig. 4. User interface of the execution engine (workflow system) in state “prepare message and select the person(s) to be addressed”
Understanding the Role of Communication and Hands
145
The stakeholder (in this case, subject 1, 'Employee') can add all the required data for a vacation request to the business object and send it to his/her manager, who is the owner of another subject (subject 2, 'Manager'). Since S-BPM focuses on the individual work perspective, stakeholders need to know only the communication interfaces when participating in organizational development: the behavior description of the subject Employee allows sending the vacation request to other subjects, such as the Manager or the Human Resource Department. S-BPM utilizes the metaphor of exchanging e-mails, however focused on certain task accomplishments involving certain business objects (i.e., the content of the mail).

The workflow execution allowing interactive process experience follows a simple protocol. A stakeholder (subject 1, e.g., Employee) starts with the select activity and selects the send transition. After that, the action "prepare message and select address" is executed, and in another state the message is sent to another stakeholder (subject 2, e.g., Manager). Now subject 1 again reaches the state "select". In state "start" subject 2 receives the message. In the following state, "follow up action", the content of the received message is read and the corresponding action is executed by a certain person (or system) who is the owner of subject 2. In the case of the vacation application, this follow-up action is the manager's decision whether the vacation application is accepted or denied. This decision must be sent to subject 1 (Employee). In the state "select", subject 2 (Manager) decides to follow the send transition, prepares the message with the result of the decision, and sends it to subject 1 (Employee). In general, when a subject sends a message, the sending state is connected with the corresponding receive state in the receiving subject. Subject 1 sends a message to subject 2 in state 2. Subject 2 receives that message in state "start".
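The send/receive protocol just described can be sketched with one in-memory mailbox per subject. The subject and message names mirror the running example (Employee, Manager), but the mailbox API is invented for illustration.

```python
# Hedged sketch of the subject-to-subject messaging protocol.
from collections import deque

mailboxes = {"Employee": deque(), "Manager": deque()}

def send(sender, receiver, message):
    mailboxes[receiver].append((sender, message))

def receive(subject):
    # raises IndexError if no message has arrived yet
    return mailboxes[subject].popleft()

# Employee prepares and sends the vacation request.
send("Employee", "Manager", {"type": "vacation_request", "days": 5})

# Manager receives it in state 'start', decides, and answers.
sender, msg = receive("Manager")
decision = "approved" if msg["days"] <= 10 else "denied"
send("Manager", "Employee", {"type": "decision", "result": decision})

# Employee receives the result (process completion follows).
_, answer = receive("Employee")
print(answer["result"])   # -> approved
```

The approval rule (at most 10 days) is an arbitrary placeholder; in S-BPM the decision belongs to the owner of the Manager subject.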
Following this line of interaction, a complete business process can be executed in an interactive way.

3.2 Self-directed Roundtrip Engineering

Creating new knowledge in the course of organizational development requires members of a knowledge-building network to collaboratively pose questions and comments, and to intentionally seek alternative solutions in order to expand the social system's capabilities (Hakkarainen et al., 2004). Supporting such mutual learning scenarios, a subject-oriented workflow system is capable of providing operational evidence for each stakeholder involved in a certain process. A behavior proposal of one of the involved stakeholders in a process can also be studied by others immediately after completing and validating a subject-oriented model. However, posing questions and commenting requires more than execution. It requires social media, such as chats, forums, blogs, or process portfolios, and proper content management to preserve contextual findings. Otherwise, the process of organizational development cannot be traced.

Figure 5 shows such an approach: the subject-oriented behavior specification of the subject Manager when handling vacation requests is embedded into a semantic content management system. From the content perspective, each model can be enriched with meta data, such as documents assigned to a process, and role bindings, such as the behavior of the subject Manager. They also allow for navigation (left side in figure 5, with main categories 'behavior' and 'documents'). Business process specifications handled as
Fig. 5. Knowledge sharing support – content perspective
content elements can not only be tagged with meta data, they can also be annotated (i.e., enriched with links, comments, text, videos, etc.) and become part of specific views that can be exchanged among members of an organization in the course of reflection and sharing. Any meta data, such as 'behavior' for diagrammatic content, lays the ground for focused interactions sharing experience. Coupling meta data and annotations with topic-specific forum entries, role-specific blogs, or chats allows co- and re-constructing information spaces for organizational development and mutual understanding. In the course of generating process models, other domain-specific content might either be created from scratch or added to existing content. By embedding links to forum entries, blogs, or chats into annotations, stakeholder-specific perspectives can be created and kept in user views. They structure the space for sharing information and interacting, as social interaction is based on exchanging (stakeholder-specific) views. A view is generated like an empty overhead slide and put on top of content elements (see on the right side of the screen a view termed 'My lecture notes' for annotating a diagram to indicate idealized subject behavior). The selection of content is supported by structured navigation (as shown on the left side of the screen in figure 5) and by filters directing the search for specific categories of information, such as behavior diagrams.
In a semantic content management system, all annotations of a development process are stored, just like original content elements, in user-specific views, including the links to communication entries. Users can manage their views, including deleting them and transferring them to other members of the organization. The transfer of views is essential, as collaboration in virtual communities can be enabled through sharing views. Views that have been set public by users can be taken up by other users, who might import them into their list of individual views and study them on top of the concerned content elements. Stakeholders taking up existing views might continue working this way, i.e., setting views public after supplementing existing annotations. Such back-and-forth transfers lead to cascaded views and, in this way, to traceable processes in organizational development, both on the content and the interaction level (cf. Stary, 2011).
4 Conclusive Summary

'For the first time since the dawning of the industrial age, the only way to build a company that's fit for the future is to build one that's fit for human beings as well.' (Hamel, 2007). Following this finding, which indicates the need for new ways in management, we need to revisit organizations as social systems. From such a perspective, stakeholders are driven by communication acts when accomplishing their tasks and when moving forward to novel structures of work.

Traditional techniques of Business Process Management arrange process-specific information around functional activities that are linked by causal and/or temporal relationships. They do not support modeling organizations from a communication perspective in the first place. The discussed Subject-oriented Business Process Management technique supports stakeholders in articulating their way of task accomplishment in terms of communication relationships while producing an executable model. The latter allows for interactive experience of specific business processes, according to individual and task-relevant flows of communication. The immediate execution of subject-oriented models facilitates (collective) reflection and re-design of individual proposals. In combination with semantic content management, stakeholders can discuss alternative ways of task accomplishment along seamless roundtrips before incorporating them into daily business.
References

1. Bieber, M., Goldman-Segall, R., Hiltz, S.R., Im, I., Paul, R., Preece, J., et al.: Towards Knowledge-sharing and Learning in Virtual Professional Communities. In: 35th Annual Hawaii International Conference on System Sciences (HICSS-35 2002). IEEE Computer Society Press, Los Alamitos (2002)
2. Back, A., von Krogh, G.: Knowledge Networks. Springer, Berlin (2004)
3. Böhle, F., Pfeiffer, S., Sevsay-Tegethoff, N.: Die Bewältigung des Unplanbaren [Coping with the Unplannable]. VS Verlag für Sozialwissenschaften, Wiesbaden (2004)
4. Deloitte: Collaborative Knowledge Networks. Deloitte Consulting and Deloitte & Touche, New York (2002)
5. Fleischmann, A., Stary, C.: Whom to Talk to? A Stakeholder Perspective on Business Process Development. Universal Access in the Information Society (2011) (in press)
6. Hakkarainen, K., Palonen, T., Paavola, S., Lehtinen, E.: Communities of Networked Expertise: Professional and Educational Perspectives. Advances in Learning and Instruction Series. Earli & Elsevier, London (2004)
7. Hamel, G.: The Future of Management. Harvard Business School Press, Boston (2007)
8. Hill, J.B., Cantara, M., Kerremans, M., Plummer, D.C.: Magic Quadrant for Business Process Management Suites. Gartner Research, G00164485 (February 18, 2009)
9. Laudon, K.C., Laudon, J.P.: Essentials of Management Information Systems: Managing the Digital Firm, 6th edn. Pearson, Upper Saddle River (2005)
10. Levine, R., Locke, C., Searls, D., Weinberger, D.: The Cluetrain Manifesto: The End of Business as Usual. Perseus, Cambridge, MA (2000)
11. Kotter, J.: Our Iceberg is Melting. Management of Change, Ashley Kreuer, NY (2008)
12. Luhmann, N.: Soziale Systeme. Grundriß einer allgemeinen Theorie [Social Systems: Outline of a General Theory]. Suhrkamp (2006)
13. Stary, C.: Perspective Giving - Perspective Taking: Evidence-based Learning in Organizations. International Journal of Information and Knowledge Management 10 (2011)
14. Scheer, A.-W.: ARIS - Modellierungsmethoden, Metamodelle, Anwendungen [ARIS - Modeling Methods, Metamodels, Applications], 4th edn. Springer, Berlin (2001)
15. Tsai, W., Ghoshal, S.: Social Capital and Value Creation: The Role of Intrafirm Networks. The Academy of Management Journal 41(4), 464–476 (1998)
16. Van den Hooff, B., De Ridder, J.A.: Knowledge Sharing in Context: The Influence of Organizational Commitment, Communication Climate and CMC Use on Knowledge Sharing. Journal of Knowledge Management 8(6), 117–130 (2004)
17. Varela, F.J., Maturana, H.R., Uribe, R.: Autopoiesis: The Organization of Living Systems, Its Characterization and a Model. BioSystems 5, 187–196 (1974)
18. Von Foerster, H.: Wahrheit ist die Erfindung eines Lügners [Truth Is the Invention of a Liar]. Carl Auer, Heidelberg (2003)
19. Wiesenfeld, B.M., Raghuram, S., Garud, R.: Communication Patterns as Determinants of Organizational Identification in a Virtual Organization. Journal of Computer-Mediated Communication 3(4) (1998), http://jcmc.huji.ac.il/vol3/issue4/wiesenfeld.html
Extending Predictive Models of Exploratory Behavior to Broader Populations

Shari Trewin1, John Richards1,2, Rachel Bellamy1, Bonnie E. John1, Cal Swart1, and David Sloan2

1 IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532, USA
{trewin,ajtr,rachel,bejohn,cals}@us.ibm.com
2 School of Computing, University of Dundee, Dundee DD1 4HN, Scotland
{johnrichards,dsloan}@computing.dundee.ac.uk
Abstract. We describe the motivation for research aimed at extending predictive cognitive modeling of non-expert users to a broader population. Existing computational cognitive models have successfully predicted the navigation behavior of users exploring unfamiliar interfaces in pursuit of a goal. This paper explores factors that might lead to significant between-group differences in the exploratory behavior of users, with a focus on the roles of working memory, prior knowledge, and information-seeking strategies. Validated models capable of predicting novice goal-directed exploration of computer interfaces can be a valuable design tool. By using data from younger and older user groups to inform the development of such models, we aim to expand their coverage to a broader range of users.

Keywords: Cognitive modeling, information foraging, usability testing, accessibility, interface design, older users.
models, apply broadly, and what (if any) enhancements to current models will better predict the behavior of users of different ages.
2 Modeling Skilled Behavior

HCI professionals are generally capable of constructing keystroke-level models by hand, but this can become tedious if the design is complex. The correct placement of mental operators is also error-prone, undercutting model accuracy. To increase modeling efficiency and decrease modeling errors, John et al. created CogTool [7] to support the storyboarding of designs, the demonstration of tasks against those designs, and the resultant automatic generation of accurate models.

Models of expert performance have begun to develop a more nuanced representation of users that incorporates differences between groups. Daily et al. [8] applied personalized models to explain the performance of individual subjects across two different tasks. Further work by Rehling et al. [9] applied individual models based on performance in one task to predict performance in a second task. Jastrzembski and Charness [10] developed 'younger' and 'older' GOMS models, based on cognitive parameters suggested by prior literature. Two mobile phone tasks were modeled: dialing a number and sending a text message. The models predicted performance differences observed in an older and a younger group of participants performing these tasks. These parameters were later mapped to equivalent ACT-R parameters, and again the age-specific parameters produced a better fit to the data than the default ACT-R parameters [11]. In further work, John and Jastrzembski [12] explored a series of models built using CogTool. The most successful model for the older adult group revisited the written instructions in the dialing task more frequently to extract the next set of digits to dial. The best models were able to detect statistically significant differences between the age groups, though the match to observed times (within 10–20% of observed times) was less good than that of the original hand-built GOMS models (within 5% of observed times). Awareness of the role of working memory in task performance proved instrumental in the production of high-quality models of skilled performance.
3 Modeling Exploratory Behavior

Navigating an unfamiliar interface can be described using the information foraging framework, initially developed to describe the behavior of people navigating the Web [13], [14], [15] and subsequently used to model the behavior of users in other information-rich environments, such as debugging unfamiliar code in an integrated development environment [16], [17]. The basic idea underlying information foraging is that a user evaluates which item (e.g., a link) to select next by assessing the semantic relationship between the goal (expressed as a series of linguistic tokens) and the item (similarly expressed as a series of tokens). This semantic relationship is characterized by its "information scent", with higher-scent items being more likely to be selected than lower-scent items. Various strategies for deciding when an item has a high enough scent to follow it, what to do when no links on the current page have a
high enough scent, how rapidly the memory of having selected an item decays, and so on, can all be modeled within a general-purpose cognitive architecture such as ACT-R [18]. The computation of information scent is based on an underlying text corpus and is roughly a measure of the degree to which the goal and the item are "near" each other in the corpus. Several computational cognitive models of information seeking on the Web, incorporating information foraging theory, have been shown to make good predictions of users' interaction with Web sites [19], [20].

Our approach to modeling exploratory behavior is to build on CogTool-Explorer [21], a version of CogTool that implements novice goal-directed exploration through information foraging. CogTool-Explorer uses the ACT-R cognitive model [22], and incorporates SNIF-ACT 2.0, a computational cognitive model of how people use information scent cues to make navigation decisions on the Web [23]. In SNIF-ACT, items on a Web page are evaluated one by one, and a decision is made either to "satisfice" (choose the best seen so far), look further, or go back. In CogTool-Explorer, headings, and then items within headings, are evaluated in the order dictated by Halverson and Hornof's Minimal Model of Visual Search [24].

3.1 Calculating Information Scent

Successful modeling of novice behavior depends crucially on the ability to make human-like judgments of the similarity between a goal and an item in a user interface. A number of algorithms have been explored, including latent semantic analysis (LSA), pointwise mutual information, generalized LSA, and Vectorspace approaches. All of these algorithms depend on the presence of a large corpus of text that represents the user's knowledge, and their performance improves as the corpus size increases. By default, CogTool-Explorer uses LSA calculations based on the TASA corpus.
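A single foraging step of the kind described above can be sketched as follows. The scent measure here is a plain cosine similarity over bag-of-words vectors, a crude stand-in for LSA; the satisficing threshold and all link names are invented for illustration, not taken from SNIF-ACT.

```python
# Toy information-foraging step: evaluate links one by one and satisfice
# (follow the best link seen so far) once its scent is good enough.
import math
from collections import Counter

def cosine(a, b):
    # bag-of-words cosine similarity as a stand-in scent measure
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def explore(goal, links, satisfice_at=0.5):
    best, best_scent = None, 0.0
    for link in links:                  # links evaluated one by one
        s = cosine(goal, link)
        if s > best_scent:
            best, best_scent = link, s
        if best_scent >= satisfice_at:  # good enough: stop and follow
            return ("follow", best)
    return ("follow", best) if best else ("go_back", None)

print(explore("watch batteries", ["Antiques", "Watch Batteries", "Art"]))
```

With no link above threshold and no partial match at all, the sketch "goes back", loosely mirroring the strategy choices the models make.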
TASA is a corpus often used for LSA calculations and incorporates texts that a first-year college student would be expected to have read. Blackmon's AutoCWW also used this approach. Other research has explored corpora based on material found on the Web, on Wikipedia, or in New York Times articles. Pointwise mutual information is a simpler measure that captures how likely it is to find word A in a text given that it contains word B, adjusting for the frequency of A and B in the corpus. It has the advantage that the corpus can be more easily extended and scaled. For large corpora, several studies have found it to perform better than LSA on similarity judgments [25]. Stone et al. [26] report that the Vectorspace model performed better than LSA in predicting eye movements in a web search task. More sophisticated scent calculations have also been developed. The CoLiDeS model of web navigation [27] integrates five factors to assess information scent: semantic similarity based on LSA, elaboration of terms to include similar terms, word frequency, previous experience, and literal matching between the goal and the user interface object being assessed.

3.2 The Role of Between-Group Differences

If models of information seeking are to be useful to designers (who must design for a range of user abilities), the underlying models must be broadly applicable. However,
little is known about between-group differences in information foraging behavior. Blackmon et al.'s experiments [19], upon which both the AutoCWW and CogTool-Explorer models are based, were performed with college students. The evaluations of the SNIF-ACT theory were based on data gathered from researchers and academics, and from individuals recruited on the web about whom little is reported, except that they were able to install specialized software in order to participate in the experiment. It is plausible that these models represent strategies and approaches primarily used by relatively young people with high levels of cognitive function.

Age and cognitive skills are particularly interesting dimensions to study. Older adults are a significant, and growing, group of technology users that is often underrepresented in both research and design. They also tend to be a more heterogeneous group, in part due to the variable effects of age-related changes in cognitive, physical, and sensory abilities. An important research question, therefore, is whether these models are applicable to older adults.
4 Factors Influencing Goal-Directed Exploration

This section discusses three factors that may lead to significant differences not captured in current models.

4.1 Working Memory

Working memory is the mental process by which information is temporarily stored and manipulated in the performance of complex cognitive tasks such as language understanding, learning, and reasoning. Many different types of information can be held in working memory, including visual, verbal, and emotional information [28]. Measures of working memory correlate well with performance on many tasks, especially those that involve both memory and processing.

Working memory capacity declines with age and is considered to be a key determinant of age-related differences in performance on cognitive tasks [29], [30], [31]. Age differences in performance on memory tasks are greater for tasks involving more executive processing [32], [33]. Reuter-Lorenz and Sylvester [34] suggest that this may be due to the use of executive processing resources to compensate for declines in memory ability. It is well known that recall is a more difficult memory task than recognition. Craik and McDowd [35] showed that increased age amplifies the cost of memory recall. Older adults may also be less able to suppress interfering memories [36], [37] during encoding and retrieval of information.

Baddeley and Hitch's influential model of working memory [38] posits three primary subcomponents: an 'attentional-controlling system' (or 'central executive'), a 'phonological loop' that stores and rehearses verbal and other auditory information, and a 'visuo-spatial scratch pad' that manipulates visual images. More recently, an 'episodic buffer' component has been proposed, serving the purpose of integrating working memory information with that in long-term memory [39].

Working memory has a role in performance that goes beyond age-related differences.
Impairments in aspects of working memory have been found with a number of disabilities, including schizophrenia [40], ADHD [41], and learning
disabilities [42]. This impacts performance in many tasks. Naumann et al. [43] provided strategy training to individuals in how to learn from hypertext, and found that the effectiveness of this training depended on the working memory resources of the individuals. For learners with poorer working memory, the strategy training actually hindered their performance. Furthermore, modern computing contexts often tax working memory with frequent interruptions and multi-tasking demands.

4.2 Prior Knowledge

Prior knowledge has been found to influence goal-directed exploration. Older adults, having greater life experience on which to draw, may outperform younger adults in ill-defined Web-based tasks that rely on background knowledge of the topic [44]. The effect of knowledge on exploration is core to information foraging. Scent depends on the perceived relationship between what is on the screen and the user's goal. To date, information foraging models have treated scent as a lexical relationship, calculated relative to lexical relationships in a pre-existing corpus. One can think of scent calculations as using a corpus of documents to 'stand in' for the user's knowledge.

There are likely to be generational differences in background knowledge. For example, different age groups will have learned different topics at school, and different age groups will tend to have read different books and been exposed to different media. The effects of background knowledge are readily apparent in different English-speaking cultures. For example, when asked to find 'vacuum cleaners', people who have grown up in the UK will likely follow a link that says 'Hoovers'. Use of the term 'Hoover' to refer to a vacuum cleaner is common in the UK, so it will be background knowledge. Among English speakers who have grown up in the US, however, younger adults are very unlikely to click such a link, as it would not appear to be related to their goal.
In information foraging terms, for someone who has grown up in the US, 'Hoover' would not have high scent when the search goal was 'vacuum cleaner'. These differences in background knowledge may be sufficient to change users' exploration behavior, and background knowledge may be similar enough within a generation to see generational differences in exploration behavior.

4.3 Information-Seeking Strategies

In an information foraging task, participants must maintain many items in memory: the current goal, the paths already explored, and other promising unexplored paths, in addition to making judgments of the information scent of each option. As a result, performance in such tasks may be quite sensitive to working memory capacity. It may even be the case that people with lower working memory capacity employ very different strategies in order to compensate.

In a loosely constrained information-seeking task, Fairweather [45] observed that while there were no differences in task success between an older and a younger group, there were significant differences in how the task was approached, and strong age-related tendencies to follow particular paths and visit particular zones. Older adults made more use of guided small steps, performed more fruitless searches, and left the original site more often. Fairweather concluded that the observed performance
differences could not be attributed to taking the same steps in a less efficient way – fundamentally different approaches were being used. In a study of adaptive foraging behavior in a virtual landscape, both younger and older adults adjusted the parameters of their foraging strategy in response to task characteristics, but the performance of the older group was poorer, even when an optimal strategy was explicitly taught [46]. Hanson also observed strategy differences between older and younger individuals performing Web search tasks, but no significant differences in task success [47]. Paxton et al. [48] report MRI-based results that also suggest a strategy change in older adults in a task that stressed maintenance of goal-relevant information. They hypothesize that problems with goal maintenance lead to this shift. Cognitive models of foraging strategies, when compared with human data, can be a useful way to explore possible alternative strategies in different user groups.
5 Research Directions

In light of the discussion above, our research is exploring the following questions: How do the strategies of younger and older adults compare on information foraging tasks? What cognitive skills, including working memory, correlate with task performance and strategy use? How well can information foraging models predict human data? What is the influence of information scent algorithms and corpora?

5.1 Collecting Human Data from a Broader Population

Our approach is to gather very detailed data from older and younger adults performing a goal-directed search task on the Web, to augment existing knowledge of human information foraging. Our data include keyboard, mouse, and eye movement information. Our initial experiments utilize an information space derived from a popular online auction site in the UK. Items in this space are organized in a three-level category hierarchy, with the top level comprising 27 categories such as "Antiques", "Art", "Baby", "Jewellery & Watches", "Musical Instruments", and "Photography". Research participants are asked to find the proper third-level category for a named item such as "Watch batteries". In this case the correct path would be "Jewellery & Watches", "Watches", "Watch Batteries". Each of the second-level and third-level screens has a "Go Back" button, making it easy to navigate back up the hierarchy. In addition, each screen has a reminder string of the form "Looking for x" to remind participants of their goal. By examining the eye-movement data we will be able to see what is considered, how long each item is considered, and whether the reminder string is utilized.

5.2 Modeling Between-Group Differences in Information Foraging

ACT-R models offer predictions of how motor actions, eye movements, and cognition are interwoven in the performance of a task. We plan to develop models that represent theories of how users are performing the task.
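The three-level category task described above can be sketched as a toy data structure. Only two top-level categories from the text are shown, and the third-level lists beyond "Watch Batteries" are invented; the lookup helper is illustrative only, standing in for the ground-truth path against which participants' navigation is scored.

```python
# Toy fragment of the three-level category hierarchy (names partly invented).
hierarchy = {
    "Jewellery & Watches": {
        "Watches": ["Watch Batteries", "Wristwatches"],
    },
    "Antiques": {
        "Furniture": ["Chairs", "Tables"],
    },
}

def find_path(item):
    """Return the correct top/second/third-level path for a named item."""
    for top, subs in hierarchy.items():
        for second, thirds in subs.items():
            if item in thirds:
                return [top, second, item]
    return None

print(find_path("Watch Batteries"))
# -> ['Jewellery & Watches', 'Watches', 'Watch Batteries']
```

A participant's click sequence, including any "Go Back" steps, can then be compared against this correct path.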
Different corpora to represent different knowledge bases are another possible avenue of exploration.
Comparing different models with human data will test these theories, leading to a deeper understanding of age-related differences in information foraging. The cognitive model currently underlying CogTool-Explorer has been successful in predicting user behavior in Blackmon's dictionary search tasks [19]. Although it is an ACT-R model, the goal, and the best link seen so far, are stored in ACT-R's goal buffer, where they are not subject to decay. Differences in working memory abilities cannot be represented in this model. If issues of forgetting the goal term, or the best link seen so far, arise in the data, more sophisticated models will be required to capture these effects. ACT-R's working memory mechanism will provide a basis for models that account for forgetting.

The ACT-R cognitive architecture implements a model of working memory, including a visual buffer and an auditory buffer. The attentional-controlling system is implemented as a fixed amount of memory activation that is spread across memory items. Each memory item has a level of activation based on the time since it was created or accessed, and on connections to other active elements. Those items with an activation level above a threshold are active 'working memory' chunks. Anderson et al. [49] describe experiments supporting this 'source activation' model, and ACT-R models can account for many effects observed in psychological experiments [50]. Huss and Byrne [51] further proposed an ACT-R-based model of an articulatory loop, in which an individual repeatedly subvocally verbalizes an item in order to rehearse it and maintain its activation in working memory. Their model was able to reproduce the effects of list length and stimulus length on a list recall task. Subvocal (or even vocal) articulation is a commonly used memory strategy and may be important for accurate modeling of tasks involving memory.
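The time-based part of this activation mechanism can be illustrated with ACT-R's base-level learning equation, B = ln(sum_j t_j^(-d)), where the t_j are the times since each access of a chunk and d is the decay rate (0.5 by ACT-R's default). The threshold value below is arbitrary, and the sketch omits spreading activation from connected elements.

```python
# Sketch of ACT-R-style base-level activation with decay (spreading
# activation omitted; threshold chosen arbitrarily for illustration).
import math

def base_level_activation(access_times, now, d=0.5):
    """B = ln(sum over accesses of (now - t)^(-d))."""
    return math.log(sum((now - t) ** (-d) for t in access_times))

# A chunk rehearsed three times recently stays above threshold; a chunk
# accessed once, long ago, decays below it.
recently_rehearsed = base_level_activation([1.0, 5.0, 9.0], now=10.0)
long_unused = base_level_activation([1.0], now=100.0)

threshold = 0.0
print(recently_rehearsed > threshold, long_unused > threshold)
```

This is how rehearsal (as in the articulatory loop above) keeps an item active: each access adds a fresh, slowly decaying term to the sum.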
The ultimate goal of our research is to derive and validate models capable of being embedded in CogTool-Explorer and used by designers to predict the behavior of a broad range of users exploring unfamiliar Web sites or user interfaces. Using data from older and younger user groups, the applicability of current models can be examined, and new models developed. We have discussed working memory, prior knowledge, and information-seeking strategies. These three factors, when reflected in models, may help to account for between-group differences in exploratory behavior.

Acknowledgments. This research was supported by an Open Collaborative Research grant from the IBM Research Division, and by RCUK EP/G066019/1 "RCUK Hub: Inclusion in the Digital Economy".
References 1. Card, S.K., Moran, T.P., Newell, A.: The Psychology of Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale (1983) 2. Gray, W.D., John, B.E., Atwood, M.E.: Project Ernestine: Validating a GOMS Analysis for Predicting and Explaining Real-World Task Performance. Human-Computer Interaction 8(3), 237–309 (1993) 3. Callander, M., Zorman, L.: Usability on Patrol. In: CHI 2007 Extended Abstracts on Human Factors in Computing Systems, San Jose, CA, USA, April 28 - May 03, pp. 1709–1714. ACM, New York (2007)
156
S. Trewin et al.
4. John, B.E., Kieras, D.E.: Using GOMS for User Interface Design and Evaluation: Which Technique? ACM Transactions on Computer-Human Interaction 3(4), 287–319 (1996) 5. Luo, L., John, B.E.: Predicting Task Execution Time on Handheld Devices Using the Keystroke-Level Model. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2005), Portland, Oregon, April 2-7. ACM, New York (2005) 6. Knight, A., Pyrzak, G., Green, C.: When Two Methods Are Better Than One: Combining User Study with Cognitive Modeling. In: CHI 2007 Extended Abstracts on Human Factors in Computing Systems, San Jose, CA, USA, April 28 - May 03, pp. 1783–1788. ACM, New York (2007) 7. John, B.E., Prevas, K., Salvucci, D.D., Koedinger, K.: Predictive Human Performance Modeling Made Easy. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2004), pp. 455–462. ACM, New York (2004) 8. Daily, L., Lovett, M., Reder, L.: Modeling Individual Differences in Working Memory Performance: A Source Activation Account. Cognitive Science 25, 315–353 (2001) 9. Rehling, J., Lovett, M., Lebiere, C., Reder, L., Demiral, B.: Modeling Complex Tasks: An Individual Difference Approach. In: Proceedings of the 26th Annual Conference of the Cognitive Science Society, Chicago, IL, August 4-7, pp. 1137–1142 (2004) 10. Jastrzembski, T.S., Charness, N.: The Model Human Processor and the Older Adult: Validation in a Mobile Phone Task. Journal of Experimental Psychology: Applied 13, 224–248 (2007) 11. Jastrzembski, T.S., Myers, C., Charness, N.: A Principled Account of the Older Adult in ACT-R: Age Specific Model Human Processor Extensions in a Mobile Phone Task. In: Proceedings of the Human Factors and Ergonomics Society 54th Annual Meeting, San Francisco, CA, September 27-October 1 (2010) 12. John, B.E., Jastrzembski, T.S.: Exploration of Costs and Benefits of Predictive Human Performance Modeling for Design. In: Salvucci, D.D., Gunzelmann, G. (eds.) 
Proceedings of the 10th International Conference on Cognitive Modeling, Philadelphia, PA, pp. 115–120 (2010) 13. Pirolli, P., Card, S.: Information Foraging in Information Access Environments. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1995), pp. 51–58. ACM Press, New York (1995) 14. Pirolli, P.: Computational Models of Information Scent-Following in a Very Large Browsable Text Collection. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1997), pp. 3–10. ACM Press, New York (1997) 15. Pirolli, P.: Information Foraging Theory: Adaptive Interaction with Information. Oxford University Press, New York (2007) 16. Lawrance, J., Bellamy, R., Burnett, M., Rector, K.: Using Information Scent to Model the Dynamic Foraging Behavior of Programmers in Maintenance Tasks. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2008), pp. 1323–1332. ACM Press, New York (2008) 17. Lawrance, J., Burnett, M., Bellamy, R., Bogart, C., Swart, C.: Reactive Information Foraging for Evolving Goals. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2010), pp. 25–34. ACM, New York (2010) 18. Anderson, J.R.: The Adaptive Character Of Thought. Erlbaum, Hillsdale (1990) 19. Blackmon, M., Kitajima, M., Polson, P.: Tool for Accurately Predicting Website Navigation Problems, Non-Problems, Problem Severity, and Effectiveness of Repairs. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2005), pp. 31–40. ACM Press, New York (2005)
Extending Predictive Models of Exploratory Behavior to Broader Populations
157
20. Chi, E., Rosien, A., Supattanasiri, G., Williams, A., Royer, C., Chow, C., Robles, E., Dalal, B., Chen, J., Cousins, S.: The Bloodhound Project: Automating Discovery of Web Usability Issues Using the InfoScentTM Simulator. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2003), pp. 505–512. ACM Press, New York (2003) 21. Teo, L., John, B.E.: Towards a Tool for Predicting Goal-Directed Exploratory Behavior. In: Proceedings of the Human Factors and Ergonomics Society 52nd Annual Meeting, pp. 950–954 (2008) 22. Anderson, J., Lebiere, C.: The Atomic Components of Thought. Erlbaum, USA (1998) 23. Fu, W., Pirolli, P.: SNIF-ACT: A Cognitive Model of User Navigation on the World Wide Web. Human-Computer Interaction 22(4), A355–A412 (2007) 24. Halverson, T., Hornoff, A.: A Minimal Model for Predicting Visual Search in Human Computer Interaction. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2007), pp. 431–434. ACM Press, New York (2007) 25. Turney, P.D.: Mining the Web for Synonyms: PMI-IR Versus LSA on TOEFL. In: Proceedings of the Twelfth European Conference on Machine Learning, Freiburg, Germany, pp. 491–502 (2001) 26. Stone, B., Dennis, S., Kwantes, P.J.: A Systematic Comparison of Semantic Models on Human Similarity Rating Data: The Effectiveness of Subspacing. In: The Proceedings of the Thirtieth Conference of the Cognitive Science Society (2008) 27. Kitajima, M., Blackmon, M.H., Polson, P.G.: Cognitive Architecture for Website Design and Usability Evaluation: Comprehension and Information Scent in Performing by Exploration. In: Proceedings of the HCI International Conference (2005) 28. Mikels, J., Larkin, G., Reuter-Lorenz, P.: Divergent Trajectories in the Aging Mind: Changes in Working Memory for Affective Versus Visual Information with Age. Psychology and Aging, APA 20(4), 542–553 (2005) 29. 
Salthouse, T.A.: Differential Age-Related Influences on Memory for Verbal-Symbolic Information and Visual-Spatial Information. Journal of Gerontology 50B, 193–201 (1995) 30. Park, D.C., Lautenschlager, G., Hedden, T., Davidson, N.S., Smith, A.D., Smith, P.K.: Models of Visuospatial and Verbal Memory Across the Adult Life Span. Psychology and Aging 17, 299–320 (2002) 31. Brown, S.C., Park, D.C.: Theoretical Models of Cognitive Aging and Implications for Translational Research in Medicine. The Gerontologist 43(suppl. 1), 57–67 (2003) 32. Dobbs, A.R., Rule, B.G.: Adult Age Differences in Working Memory. Psychology and Aging 4, 500–503 (1989) 33. Salthouse, T.A., Babcock, R.L.: Decomposing Adult Age Differences in Working Memory. Developmental Psychology 27, 763–776 (1991) 34. Reuter-Lorenz, P., Sylvester, C.: The Cognitive Neuroscience of Working Memory and Aging. In: Cabeza, R., Nyberg, L., Park, D. (eds.) Cognitive Neuroscience of Aging: Linking Cognitive and Cerebral Aging, pp. 186–217. Oxford University Press, Oxford (2005) 35. Craik, F., McDowd, J.: Age Differences in Recall and Recognition. Journal of Experimental Psychology: Learning, Memory and Cognition 13(3), 474–479 (1987) 36. May, C.P., Hasher, L., Kane, M.J.: The Role of Interference in Memory Span. Memory and Cognition 27, 759–767 (1999) 37. Lustig, C., May, C.P., Hasher, L.: Working Memory Span and the Role of Proactive Interference. Journal of Experimental Psychology: General 130, 199–207 (2001) 38. Baddeley, A.D., Hitch, G.J.: Working memory. In: Bower, G.A. (ed.) Recent Advances in Learning and Motivation, pp. 47–89. Academic Press, London (1974)
39. Baddeley, A.: The Psychology of Memory. In: Baddeley, A., Kopelman, M., Wilson, B. (eds.) The Essential Handbook of Memory Disorders for Clinicians, ch.1. John Wiley & Sons, Chichester (2004) 40. Silver, H., Feldman, P., Bilker, W., Gur, R.C.: Working Memory Deficit as a Core Neuropsychological Dysfunction in Schizophrenia. American Journal of Psychiatry 160, 1809–1816 (2003) 41. Marusiak, C., Janzen, H.: Assessing the Working Memory Abilities of ADHD Children Using the Stanford-Binet Intelligence Scales. Canadian Journal of School Psychology 20(1-2), 84–97 (2005) 42. Swanson, H.: Individual Differences in Working Memory: A Model Testing and Subgroup Analysis of Learning-Disabled and Skilled Readers. Intelligence 17(3), 285–332 (1993) 43. Naumann, J., Richter, T., Christmann, U., Groeben, N.: Working Memory Capacity and Reading Skill Moderate the Effectiveness of Strategy Training in Learning from Hypertext. Learning and Individual Differences 18(2), 197–213 (2008) 44. Chin, J., Fu, W., Kannampallil, T.: Adaptive Information Search: Age-Dependent Interactions Between Cognitive Profiles and Strategies. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 2009), Boston, MA, USA, April 04–09, pp. 1683–1692. ACM, New York (2009) 45. Fairweather, P.: How Younger and Older Adults Differ in Their Approach to Problem Solving on a Complex Website. In: Proceedings of 10th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2008). ACM Press, New York (2008) 46. Mata, R., Wilke, A., Czienskowski, U.: Cognitive Aging and Foraging Behavior. J. Gerontol. B Psychol. Sci. Soc. Sci. 64B(4), 474–481 (2009) 47. Hanson, V.: Influencing Technology Adoption by Older Adults. Interacting with Computers 22, 502–509 (2010) 48. Paxton, J., Barch, D., Racine, C., Braver, T.: Cognitive Control, Goal Maintenance, and Prefrontal Function in Healthy Aging. Cerebral Cortex 18(5), 1010–1028 (2008) 49. 
Anderson, J., Reder, L., Lebiere, C.: Working Memory: Activation Limitations on Retrieval. Cognitive Psychology 30, 221–256 (1996) 50. Anderson, J., Bothell, D., Lebiere, C., Matessa, M.: An Integrated Theory of List Memory. Journal of Memory and Language 38, 341–380 (1998) 51. Huss, D., Byrne, M.: An ACT-R/PM Model of the Articulatory Loop. In: Detje, F., Doerner, D., Schaub, H. (eds.) Proceedings of the Fifth International Conference on Cognitive Modeling, pp. 135–140. Universitats-Verlag Bamberg, Germany (2003)
Digitizing Interaction: The Application of Parameter-Oriented Design Methodology to the Teaching/ Learning of Interaction Design Shu-Wen Tzeng Department of Industrial and Graphic Design, Auburn University, Auburn, Alabama 36849, USA [email protected]
Abstract. The development of digital technology has changed the way users interact with products and forced industrial design educators to rethink the role of design education with respect to both the integrity and suitability of the current design curriculum. This study seeks a better way of teaching and learning Interaction Design within the discipline of Industrial Design, with consideration of the nature of interaction design and students’ learning mode. This paper introduces a newly created interaction design methodology, and a case study of its application in a graduate-level interaction design course illustrates how the methodology can be applied in the development of an interaction design, making the teaching and learning of Interaction Design more effective and enjoyable. Keywords: Interaction Design, Design Methodology, Design Education.
In school, changes in technology are also forcing design educators to rethink the role of design education with respect to both the integrity and the suitability of the current design curriculum. It is apparent that we are now at the leading edge of a wave of technological change that will affect all aspects of everyday life in a profound way. The next generation of designers will need new skills and knowledge to negotiate this new terrain. As the focus of product design has shifted from physical functionality and aesthetics to user interfaces, and finally to ubiquitous interactive paradigms, industrial design students must have a solid appreciation of Interaction Design topics to succeed. Understanding Interaction Design will allow designers to create products that are usable by everyone, extending the impact of user-product interaction to a diverse set of users within many domains. Yet industrial design educators are struggling to teach Interaction Design due to the lack of methods that adapt to students’ learning mode (in this case, the Net Generation’s learning mode) and that are easy to learn and quick to examine. Because the design of an interaction interface often depends on both the author’s thinking and the presumed audience’s experience, educators must equip their students with the ability to solve problems using procedures and analytic methods. The solution to this problem relies on educators’ extensive study of Interaction Design and continuous attempts at creating effective teaching strategies. Therefore, this study seeks a better way of teaching and learning Interaction Design in the discipline of Industrial Design. The goal is to equip the next generation of industrial designers with the ability to address users’ substantial interaction needs.
2 Interaction Design in the Field of Industrial Design 2.1 The Nature of Interaction Design According to Bill Moggridge, who coined the term “interaction design” together with Bill Verplank in the late 1980s, Interaction Design is the study of devices with which a user can interact [2]. In other words, Interaction Design defines the behavior (the "interaction") of an artifact or system in response to its users. Because the behavior of an artifact is often embodied in both the abstract content and the concrete form of an interface design, it makes sense to assume that any interaction design is composed of user, content and form of design, as shown in Figure 1.
Fig. 1. The three main components in an interaction design
These three components, the user, the content, and the form of design, are both independent of and correlated with each other: the content and the form of design can influence the user’s experience and productivity, but have no power to change the user’s profile or intentions. The user is, without doubt, the center of an interaction design, since the ultimate goal of Interaction Design is to create a great user experience. Note that the user consideration here focuses on users’ intentions rather than users’ needs: the way a user chooses to interact with an interface is often determined by his or her purpose or intentions in accessing the content. The second consideration, the content, refers to the services provided by an artifact. The content is collected and delivered in different ways based on pre-determined functions. Because Interaction Design is concerned with the user’s productivity rather than that of the product, it is more interested in the delivery strategy of the content than in the content itself. In their design research, Blair-Early and Zender found that the type of content delivery is one of the most important factors contributing to the quality of an interaction design, and that it has more power to influence the end-user experience than the content itself [3]. The form of design, often referred to as the interface, is the means by which users interact with content for a purpose. The form can be presented through different media, from a physical object to a gesture-oriented interaction, and exists in many kinds of products in our everyday life. As the number of interfaces and the diversity of users grow, the need for effective interaction design increases. Therefore, the form of design has gradually become the main focus in an interaction design.
As interaction design becomes a new terrain for industrial designers, educators are under an obligation to teach it to their students effectively. 2.2 Teaching Interaction Design to Industrial Design Students Similar to other interaction-design-related disciplines, the interaction design curriculum in industrial design begins with design research, which investigates user, process, cultural context, form, and content, and continues with an iterative design process including idea exploration, design execution, and design validation. Technically, this is a process that moves from collecting abstract information (often involving the understanding of user behaviors and goals), to defining content functions (within a conceptual framework and with an emphasis on the values of the content), and finally to constructing the concrete structure of a user interface design. This process can be presented in the structure shown in Figure 2. Unlike other disciplines, industrial design places more focus on the communication between the user and the artifact through the form of design, often in the fashion of visual displays of quantitative information and visual explanations. It is widely believed that design principles can help designers achieve an effective interface [3], [4], [5], [6], [7], [8]. Hence, many interaction design professionals have attempted to improve interface design primarily by exploring and analyzing existing forms (patterns) of interface design, or by providing design principles to guide the design of interaction. Two of the most famous examples are the 1992 version of the “Macintosh Human Interface Guidelines” published by Apple Computer, which suggests thirty-eight index entries for icons in an interface design [4], and the Nielsen
Fig. 2. The general process and factors in an interaction design
Norman Group’s 106-page report, “Site Map Usability”, which delivers twenty-eight guidelines “to improve an interaction design” [9]. These design principles have two problems: they are too vague to be useful, especially for design students or junior designers; and, as final experiences, they provide no indication of how they may be achieved. Therefore, design educators need to figure out a better way to teach interaction design, especially how design principles can be applied to an interaction design. What follows is a description of the proposed design methodology, which has been used for teaching/learning interaction design since 2008. This design method was created with consideration of students’ learning mode and ease of examination for future design validation, making learning more efficient and effective.
3 The Idea of Parameter-Oriented Design Methodology The Parameter-Oriented Interaction Design Methodology is partly inspired by the parameter concept proposed by Blair-Early and Zender [3], and partly inspired by the digital technology that is extensively used in the everyday life of current college students. Many current university students belong to the ‘Net Generation’, a label used to describe today’s young adults. This group of individuals, born between 1980 and 1994 [10], is characterized by familiarity with and reliance on digital technologies. A number of social psychologists have argued that the digital culture in which the Net Generation has grown up has influenced their preferences and skills in a number of key areas related to education. For example, the Net Generation are said to prefer receiving information quickly; expect immediate answers; prefer multi-tasking and non-linear access to information; and rely heavily on communications technologies to access information and to carry out social and professional interactions [11], [12]. Compared to an analog signal, a digital signal is easier to recognize, define, and manipulate, meeting the needs of the Net Generation in terms of simplicity and speed [13]. The digital signal takes the form of a parameter, a computation from recorded data, to
present its relative value in a system. Therefore, the Parameter-Oriented Design Methodology should be an ideal way of teaching/learning interaction design. In their research on identifying user interface design principles for interaction design, Blair-Early and Zender propose a set of “parameters” for achieving an effective interface. They believe that principles in isolation do not provide sufficient guidance to inform design decisions; only when parameters and principles work together can they drive innovation and empower designers. Based on the three-component model of an interaction design, Blair-Early and Zender define the “parameters” that govern an effective interface as: User Intention: Hunter-Browser. Content Delivery: Reference-Educational-Inspiration-Entertainment. Interface Type: Linear-Hierarchy-Matrix-Web. Note that the Interface Type here refers only to the structure of an interface design rather than the form of an interaction design; hence, only the parameters defined for user and content are considered in this design methodology. The descriptions of these parameters are as follows: User Intention. The Hunter: the hunter is focused, precise, and often destination-driven. The hunter values the speed and efficiency of an interface, and rarely deviates from the initial content direction to discover a new path. A hunter may scan large quantities of information quickly in order to find predetermined target information or content. The Browser: the browser is intent on the journey and, in many cases, may not have a final content destination in mind. The browser is less focused and driven in the search for content, and more likely to be open to new experiences. Content Delivery Strategy. Reference: a content delivery strategy designed to serve discrete bits of information to users. The reference source is driven to provide as much information as possible in as few steps as possible.
Educational: a content strategy designed to instruct, often in a step-by-step fashion. Educational content is driven to educate its audience. Inspiration: a content strategy designed to motivate or inspire. Often, the inspirational source has a more personal connection to the audience through calls to action and directives. The inspirational source derives its trust from emotional response and personal connection rather than from factual data. Entertainment: an entertainment delivery strategy is designed to amuse and geared to draw a browser audience. It establishes a more direct connection with the audience and requires direct participation from the user. Entertainment sources are the most open to interpretation, and may even require audience participation in establishing the content. It is important for designers to define these parameters before working on design exploration. Because these parameters are derived from research results, identifying design parameters can help designers clarify some essential questions. Once these parameters are identified, designers can define the level of interaction based on their knowledge of user intentions and content delivery strategies, applying appropriate design principles to their design and making the interaction design more effective and relevant. Note that the level of interaction (the desired experience) is the key to this design methodology; it should be considered within each single design principle. Therefore, the structure of the Parameter-Oriented Design Methodology can be presented as in the following figure (Figure 3):
Fig. 3. The structure of the Parameter-Oriented Design Methodology
At the very center of this model is the user intention, which is classified into two parameters (the hunter and the browser); this is followed by the consideration of content strategy, which is divided into four main types: the reference, the educational, the inspiration, and the entertainment. All of the design principles are used to guide design decisions by defining the appropriate level of interaction in the fashion of digital parameters. Take “the principle of feature exposure” and “the principle of user autonomy” for example: defining the form of an interaction design for a browsing user experiencing entertainment content relies on an understanding of both user intention and content strategy. Based on the gathered information, it is clear that intensive feature exposure is important and that more authorship is required to achieve an effective interaction design in this kind of scenario. Therefore, if these design elements are presented on a scale of nine numbers, from 1 to 9, the design decisions can be illustrated on the following chart:
Fig. 4. The design parameters identified by interaction designers based on the considerations of user intention, content strategy, and level of interaction
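The kind of decision record charted in Fig. 4 can also be captured in a small data structure. The sketch below is our own hypothetical encoding (the class and field names are invented for illustration, not part of the published PODM): the two interaction parameters plus a 1–9 level chosen for each design principle considered.

```python
from dataclasses import dataclass, field
from enum import Enum

class UserIntention(Enum):
    HUNTER = "hunter"
    BROWSER = "browser"

class ContentStrategy(Enum):
    REFERENCE = "reference"
    EDUCATIONAL = "educational"
    INSPIRATION = "inspiration"
    ENTERTAINMENT = "entertainment"

@dataclass
class DesignDecision:
    """One design brief: the two interaction parameters, plus the level
    (on the 1-9 scale) chosen for each design principle considered."""
    intention: UserIntention
    strategy: ContentStrategy
    principle_levels: dict = field(default_factory=dict)

    def set_level(self, principle: str, level: int) -> None:
        if not 1 <= level <= 9:
            raise ValueError("principle levels use a 1-9 scale")
        self.principle_levels[principle] = level

# The browsing-user / entertainment-content scenario from the text:
# intensive feature exposure and high user autonomy (authorship).
brief = DesignDecision(UserIntention.BROWSER, ContentStrategy.ENTERTAINMENT)
brief.set_level("feature exposure", 8)
brief.set_level("user autonomy", 8)
```

Recording decisions this way keeps each principle's level explicitly tied to the user intention and content strategy that justified it, which is the iterative review the methodology calls for.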
Note that the decisions related to “Feature Exposure” and “User Autonomy” align roughly with the interaction parameters above them, so that each design decision is taken in reference to the specific considerations being addressed. This process is then repeated with each design principle in an iterative fashion. In practice, consideration of just a few principles generally leads to a design theme or system that encompasses the other principles. With its great flexibility in accounting for all the relevant factors in an interaction design, the Parameter-Oriented Design Methodology is both easy to practice and open to invention and innovation. Recipes for making an interaction design are replaced by guidance in establishing landmarks and contexts for a particular user intention and content strategy, making the teaching/learning of interaction design more effective and enjoyable.
4 The Application of Parameter-Oriented Design Methodology to the Teaching/Learning of Interaction Design 4.1 The Execution of Parameter-Oriented Design Methodology in Design Projects This section describes how the Parameter-Oriented Design Methodology was applied to a social networking device design produced by two industrial design graduate students. The Friend Finder was designed for a hunter persona within a reference interface concept. Based on the students’ understanding of user intention and content strategy, the resulting design parameters are graphically illustrated below:
Fig. 5. Student example, Brian Williams and Lauren Weigel. Design parameters were placed within a matrix based on the user intention and content strategy and how closely they relate to the search query.
The goal of this project was to develop features and systems within a person’s mobile ecosystem that better enable users to manage connections to information and to others, in order to create more meaningful destinations. Because the hunter requires immediate information with the least amount of input, and the reference device aims to give the user a way of navigating and connecting within their social network, the design parameters consist of relatively low user autonomy, numerous clear landmarks with a consistent design language, and conventional visual metaphors. Some interface designs from this project are presented in Figure 6.
Fig. 6. Student example, Brian Williams and Lauren Weigel. Friend Finder interaction design and process of use.
4.2 Educational Result and Student Benefits of the Application of PODM to the Design of Interaction Because the PODM application project followed a case-study project, students were provided with enough background information to understand the dilemmas in an interface development process. Since the guidance was clearly prescribed and based on the parameters of an interaction interface, it became much easier for design students to create their first interaction design works. The produced prototypes were published in exe format and sent out to potential users for design evaluation. Although the effectiveness of the designs was hard to verify due to the small sample size, the design students received a great deal of feedback and insight for future design improvement. When asked to respond on a five-point Likert scale to broader questions about this design methodology in an end-of-semester learning evaluation, most students either strongly agreed or agreed that the PODM is great for training the decision-making process, useful for guiding project actions, and enhances their understanding of Interaction Design. Although a relevant and effective interaction design may take more effort and time for design testing and refinement, the design students in this class built strong confidence in design exploration and decision-making, preparing them for their future careers.
5 Summary As mentioned before, without more detailed knowledge of the effects of executing an interaction design, design principles can only be applied intuitively, which is difficult for design students. As a result of applying the Parameter-Oriented Design Methodology to projects conducted by design students, it is clear that more precisely defined parameters for the form of interaction design are needed in order to create distinct possibilities aimed toward a target experience. Moreover, the integrated approach of the PODM, combining user, content, and form, should be comprehensive enough to guide design students or practicing designers to make relevant design decisions more effectively. Even so, more work needs to be done: continued research on how to convert intuition into explicit knowledge, and thereby make interaction design education more effective, is greatly needed in this digital era. Design educators are now at the leading edge of a wave of not only technological but also educational change that will profoundly affect all aspects of how a product can be designed.
References 1. Mak, B.: Tomorrow’s Design Landscape: Anywhere Interactions. Innovation Journal, IDSA, Dulles, 43–49 (2009) 2. Moggridge, B.: Designing Interactions. The MIT Press, Massachusetts (2007) 3. Blair-Early, A., Zender, M.: User Interface Design Principles for Interaction Design. Design Issues 24(1), 85–107. The MIT Press, Massachusetts (2008) 4. Apple Computer Inc.: Macintosh Human Interface Guidelines. Addison-Wesley, Boston (1992) 5. Tufte, E.: Visual Display of Quantitative Information and Visual Explanations. Graphics Press, CT (1997) 6. Cooper, A., Reimann, R.: About Face 2. Wiley, Boston (2003) 7. van Duyne, D.K., Landay, J.A., Hong, J.I.: The Design of Sites. Addison-Wesley, Boston (2003) 8. Tognazzini, B.: Maximizing Human Performance, Nielsen Norman Group, http://www.asktog.com/basics/03Performance.html 9. Nielsen Norman Group: Site Map Usability, Fremont, California (1998) 10. McCrindle, M.: New Generations at Work: Attracting, Recruiting, Retaining & Training Generation Y, p. 7. McCrindle Research, Sydney (2006) 11. Frand, J.L.: The Information-Age Mindset: Changes in Students and Implications for Higher Education. EDUCAUSE Review 35, 15–24 (2000) 12. Prensky, M.: Digital Natives, Digital Immigrants. On the Horizon 9(5) (2001) 13. Befooty, C.: What are the Advantages and Disadvantages of Analog vs. Digital Communication?, http://www.answerbag.com/q_view/31169#ixzz12pAElagr
A Study on an Usability Measurement Based on the Mental Model Yuki Yamada, Keisuke Ishihara, and Toshiki Yamaoka 930, Sakaedani, Wakayama City, Wakayama, 640-8510, Japan [email protected]
Abstract. When there is a distance between the user’s model and the designer’s model, the system is perceived as poor. We constructed a scale based on six viewpoints of similarity and put it into practice. As a result, we found that it is possible to roughly measure satisfaction with usability. Once this study is complete, the method should be useful for all interface designers and usability engineers. Keywords: Mental Model, Similarity, Usability, Interface.
1 Introduction
When a person using a system finds a cognitive distance between his/her own image of the system (the User's Model) and the system itself (the Designer's Model), he/she perceives the system as poor (Fig. 1). This own image is called the Mental Model [1]. The purpose of this study is to measure this cognitive distance. The Mental Model is incomplete, unstable, unscientific, and ambiguous [2]. Because of these features, it is difficult to determine what the Mental Model is. However, users must have viewpoints from which they find the cognitive distance. We therefore focused on the similarity among models.
2.2 Materials and Methods
The participants were six students (including an author) who study cognitive ergonomics and interface design. We prepared six different digital cameras as representative examples of user interfaces. First, the participants operated the cameras and built Mental Models to understand each system. Second, they wrote out points of similarity among the cameras (Fig. 2). Third, they presented their own viewpoints and grouped the opinions (Fig. 3). Last, these points of similarity were integrated into six.
Fig. 2. Writing out viewpoints
Fig. 3. Grouped viewpoints
2.3 Result
We made the following (provisional) scale for measuring similarity. Each item is rated on a five-point scale from 1 (disagree) to 5 (agree):
1. Appearance of indication is good.
2. Structure of indication is good.
3. Ordering of indication is good.
4. Shapes of operating parts are good.
5. Layouts of operating parts are good.
6. Steps of operation are good.
3 Experiments
3.1 Purpose
The purpose of the experiments was to use the scale in practice and check how it functions. We also investigated the importance of each item in the scale.
3.2 Materials and Methods
The participants were 23 students (average age: 22.7, SD: 1.2; 15 male, 8 female). We prepared three types of digital cameras, three types of cell phones, and three types of electronic dictionaries. After hearing an explanation, the participants repeated the following process for each product. Last, we interviewed them about their experiences with the products.
1. The participants performed three tasks with the product (Fig. 4). We asked them to talk through their ideas about the product in order to observe their cognitive processes (protocol analysis).
2. The participants filled in two evaluation forms (Fig. 5). One was the similarity form; the other was the SUS (System Usability Scale) form [3].
Fig. 4. Operation
Fig. 5. Evaluation
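The SUS form mentioned above is scored with the standard procedure: odd-numbered items contribute their rating minus 1, even-numbered items contribute 5 minus their rating, and the sum is multiplied by 2.5 to give a 0-100 score. A minimal sketch in Python; the ratings shown are hypothetical, not taken from the study:

```python
def sus_score(ratings):
    """Compute the standard System Usability Scale score (0-100)
    from ten item ratings, each on a 1-5 scale."""
    if len(ratings) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for i, r in enumerate(ratings):
        # Odd-numbered items (index 0, 2, ...) are positively worded:
        # they contribute (rating - 1); even-numbered items are
        # negatively worded: they contribute (5 - rating).
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5

# Hypothetical ratings for one participant and one product
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # → 85.0
```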
3.3 Result 1. Observation
Many kinds of cognitive behavior were observed; we will not take up individual examples in detail. Small problems occurred early: for example, some users could not reach a clear judgment, and some suspended their operation. In the last stage of a problem, users lost track of where they were (Situation Awareness [4]) and operated the system randomly. Some unconscious errors were also observed.
3.4 Result 2. Comparing the Similarity Measurement and SUS
The relation between the total similarity score (the average of the six similarity scores) and the SUS score was investigated with Pearson's correlation coefficient test. The result revealed a strong correlation between the two (r = 0.8023, p < 0.001) (Fig. 6). Correlation tests run separately for each product showed coefficients ranging from r = 0.6 to r = 0.8. The results of the evaluation of the three digital cameras are shown in Fig. 7 and Fig. 8. One-way analysis of variance (ANOVA) was used to compare means among groups, and the results were significant. Post hoc analyses were performed with the Least Significant Difference (LSD) test. On the similarity scale, Camera 1 scored significantly lower than Camera 2; on the SUS scale, Camera 1 scored significantly lower than Camera 2 and Camera 3.
Fig. 6. Correlation of Similarity and SUS
Fig. 7. Similarity score
Fig. 8. SUS score
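The correlation reported in Section 3.4 can be reproduced with a straightforward Pearson coefficient computation. The sketch below, in Python, uses hypothetical similarity and SUS scores, not the study's data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-participant scores for one product: total similarity
# (mean of the six similarity items, 1-5) and the matching SUS score
similarity = [2.5, 3.0, 3.5, 4.0, 4.2, 2.8]
sus = [55.0, 62.5, 70.0, 82.5, 85.0, 60.0]
print(round(pearson_r(similarity, sus), 3))
```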
3.5 Result 3. Relation of Each Similarity
Multiple linear regression analysis was used to determine the relation among the similarity scores, with the SUS score as the outcome variable and the six similarity scores as predictor variables. The coefficient of determination was 0.71 (p < 0.01). The standardized partial regression coefficients and Pearson's product-moment correlation coefficients are shown in Fig. 9. The results show that Similarity 3 and Similarity 4 influence the user's satisfaction.
3.6 Result 4. Estimating Users' Cognitive Experiences
Correspondence analysis showed a positioning of products and users (Fig. 10). For example, the figure shows the positioning result for the digital cameras; the dotted line shows the result of a cluster analysis. The figure indicates that Camera 1 is more similar to Camera 3 than to Camera 2.
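The regression analysis of Section 3.5 can be sketched as follows. The data are simulated so that items 3 and 4 (columns 2 and 3) dominate, mirroring the reported finding; all names and values are illustrative, not the study's:

```python
import numpy as np

# Hypothetical data: six similarity-item scores per observation
# (predictors) and the corresponding SUS score (outcome)
rng = np.random.default_rng(0)
X = rng.uniform(1, 5, size=(30, 6))
y = 20 + 8 * X[:, 2] + 7 * X[:, 3] + 1 * X[:, 0] + rng.normal(0, 3, 30)

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination R^2 (residual mean is 0 with intercept)
resid = y - A @ beta
r2 = 1 - resid.var() / y.var()

# Standardized partial regression coefficients: b_j * sd(x_j) / sd(y)
std_beta = beta[1:] * X.std(axis=0) / y.std()
print("R^2 =", round(float(r2), 2))
print("standardized coefficients:", np.round(std_beta, 2))
```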
Fig. 9. Multiple linear regression analysis & Correlation analysis
Fig. 10. Correspondence analysis & Cluster analysis
4 Discussion
We tried to measure the cognitive distance between a user's mental model and the designer's model of a system, using a similarity scale obtained by comparing models of digital cameras. The results correlated with the SUS score; we therefore consider that the similarity score measured usability and cognitive distance. SUS contains ten items that a system should satisfy, whereas the similarity score can measure elements of the Mental Model directly. From this point of view, we regard SUS as denotation and similarity as connotation. The multiple linear regression analysis showed the importance of the ordering of operation and indication, whereas the shapes and layouts of operating parts were not important. It is not clear whether this rule applies to all cases. Preece states that a structural model is not needed except when repair is required [5], and the result of our study agrees with this opinion. On the other hand, Gentner insists on structural importance
(structural-mapping theory) in terms of analogical processes [6]. It is likely that users concentrate their consciousness on the process of use when they evaluate a system, even if that process is built after structural mapping. Many types of errors were observed. According to Reason's error classification, skill-based errors are not related to thinking, and violations are intentional errors [7]. We therefore think that knowledge-based and rule-based "mistakes" are related to cognitive distance. Correspondence analysis and cluster analysis show the positioning and grouping of products and users, so a designer or researcher can see the relations at a glance. In the future, we will repeat the test to increase the reliability and validity of the similarity scale.
References
1. Norman, D.A.: The Psychology of Everyday Things. Basic Books, New York; Japanese edition translated by Hisao, N., through Tuttle-Mori Agency (1990)
2. Norman, D.A.: Some Observations on Mental Models. In: Gentner, D., Stevens, A.L. (eds.) Mental Models, pp. 7–14. Lawrence Erlbaum Associates, Mahwah (1983)
3. Brooke, J.: SUS - A quick and dirty usability scale (1996)
4. Endsley, M.R.: Situation Awareness. In: International Encyclopedia of Ergonomics and Human Factors, vol. 1, pp. 551–554 (2001)
5. Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., Carey, T.: Human-Computer Interaction, pp. 134–137. Addison-Wesley, Reading (1994)
6. Gentner, D., Colhoun, J.: Analogical processes in human thinking and learning (in press)
7. Reason, J., Hobbs, A.: Managing Maintenance Error, pp. 39–60. Ashgate Publishing, Gower House
Enabling Accessibility Characteristics in the Web Services Domain
Dimitris Giakoumis1,2, Dimitrios Tzovaras1, and George Hassapis2
1 Informatics and Telematics Institute, 6th km Charilaou-Thermi Road, Thessaloniki, GR-57001, Greece
2 Aristotle University of Thessaloniki, Department of Electrical and Computer Engineering, Greece
{dgiakoum,tzovaras,ghass}@iti.gr
Abstract. Accessibility in ICT and web-based applications has become an issue of great importance in recent years. However, the notion of accessibility has until recently been undervalued in the web services domain. Trying to fill this gap, this paper presents work conducted towards enabling web services (WSs) with accessibility characteristics, so as to ensure that HCI through applications utilizing them is accessible. For this purpose, a WS accessibility assessment framework has been deployed, based on guidelines which, if followed, can ensure that accessible WSs are developed. To further facilitate the development of accessible WSs, a WS accessibility assessment tool has been developed on the basis of the proposed framework. In its current implementation, the tool can automatically assess whether SOAP- or REST-based services conform to the proposed guidelines. By using this tool, developers are significantly assisted in developing accessible web services, or in enriching already developed, non-accessible ones with accessibility characteristics so as to make them accessible.
Keywords: Web Services, accessibility, assessment, Human Computer Interaction, User-centered design.
during the past years, no similar care has been taken so far in the web services domain. Web services (WSs) [9] play a key role in content delivery and service provision through the Internet. A vast number of companies and non-profit organizations provide content and services through the Web using this still evolving technology. At present, two types of WSs can be considered the most widely used: SOAP-based [9][10] and REST-based [11] ones. The utilization of web services has been boosted through standardization efforts already made regarding their proper specification and interoperability [12][13]. However, the standards deployed so far do not take into consideration the fact that the content and functionality derived from WSs and delivered through end-user client applications should also be accessible to people with disabilities. As a result, Human-Computer Interaction (HCI) with end-user applications that provide content or functionality derived through WS utilization may be inappropriate for users with special needs. Trying to fill this gap, this paper presents work conducted towards enabling web services with accessibility characteristics, so as to ensure that HCI through user agents utilizing them is accessible.
Fig. 1. The typical Web Service Utilization chain
Considering the typical web service utilization chain (Fig. 1), an end-user application typically contacts a WS and retrieves content, which is then presented to the end user through an appropriate interface. For end users with special needs, however, there can be cases where the delivered content cannot be presented appropriately through the user agent, because no appropriate accessibility-related information (e.g., alternative text for a delivered image) is provided through the service. Furthermore, even if the user agent is capable of presenting the retrieved content appropriately, the content itself may be inappropriate for the end user. For instance, a lower-limb-impaired end user could be directed by an info-mobility service to a restaurant that is inappropriate for her/his special needs (e.g., one with no ramp at its entrance for wheelchair users). The present work tries to fill exactly these gaps by building upon the notion of an "accessible web service" and deploying a framework which ensures that web services are developed so as to allow WS-based HCI to be accessible on both the presentation and the content level. In this context, an "accessible web service" is defined as a WS that 1) is well-defined, well-working and easy to integrate within client applications, 2) has accessibility features that enable client applications to show the delivered content to end users with special needs, and 3) provides content that contains appropriate information, so that the content itself is actually helpful for end users with special needs.
Within the proposed framework, developed in the context of the ACCESSIBLE project [14], web service accessibility guidelines are defined which, if followed, ensure that the "WS invocation" part of the WS utilization chain allows for accessible HCI at the end user – user agent level. These guidelines are categorized in three layers: "core functional", "basic accessibility" and "extended accessibility". Following the proposed guidelines during the design and development of web services can lead to the development of "accessible" services, enhanced with accessibility features. In this context, to further facilitate the design of accessible services, these guidelines form the basis of the deployed WS accessibility assessment framework, which is capable of assessing whether a service can be regarded as accessible. Within this framework, three accessibility classes are defined (Class "A", "AA" and "AAA"), which build upon the three layers of WS accessibility guidelines and provide the means for service categorization in terms of the accessibility guidelines followed. WS accessibility assessment is further supported by a tool that has been developed to automatically assess whether the service under evaluation conforms to the proposed accessibility guidelines. This tool facilitates the accessibility assessment of both SOAP- and REST-based WSs. Thus, by elaborating a concrete WS accessibility assessment framework and developing a tool for automatic WS accessibility assessment, the present work aims to facilitate the future development of accessible web services, enriched with appropriate accessibility characteristics that ensure accessibility at the HCI level of the web service utilization chain.
2 Web Service Accessibility Assessment Framework
The main idea behind this work is that the interaction between client applications and web services can be enhanced with accessibility features. The intent of these features is to ensure that the "Client Application – WS" interaction part of the WS utilization chain (Fig. 1) allows for accessible HCI at the "End User – Client Application" level. For this purpose, the WS accessibility assessment framework developed assesses whether web services are accessible. Within this framework, WS accessibility is defined on the basis of a three-layer architecture (Fig. 2), comprising the "core functional", "basic accessibility" and "extended accessibility" layers. The rationale behind the proposed layers is that in order for a service to be considered fully accessible, it has to:
1. Be well-defined, well-working and easy to integrate within client applications, so that developers of client applications can use the service's functionality and/or provided information effectively within their application's operational context. This requirement defines the concept behind the core functional layer.
2. Have accessibility features that enable the client applications invoking the service to show the delivered content in an accessible way, with respect to the special needs of impaired user groups. This requirement defines the concept behind the basic accessibility layer.
3. Provide data which contains enough information for the content itself to be helpful to impaired users, i.e., information adapted to their special needs. Based on this requirement, the extended accessibility layer is defined.
Whereas the two latter layers deal with accessibility-related issues, the first deals with the characteristics a WS must have in order to be properly functional and easy to integrate within client applications. Obviously, dependencies exist among the above layers. For example, in order for a service to have "basic accessibility", it first has to have the core functional features that ensure its proper "core functionality". The concept behind this dependency can be summarized as: "In order to make a service better in terms of accessibility, it should be working (well enough) first."
Fig. 2. Accessible Web Service Evaluation framework base
2.1 WS Accessibility Classes and Guidelines
The three aforementioned accessibility layers (core functional, basic and extended accessibility, respectively) form the basis for defining three service accessibility classes (A, AA and AAA), which provide the means for service categorization based on service accessibility features. As shown in Fig. 2, within the proposed framework a set of guidelines is defined for each accessibility layer. If these guidelines are followed, they provide a web service with the functional and accessibility features (core, basic and extended) that enable it to belong to the corresponding accessibility class. Furthermore, for each proposed guideline, a set of specific techniques is defined within the framework, which can be used to assess whether an already developed service belongs to a specific accessibility class or not. Focusing on the meaning of each accessibility class and its relation to the accessibility layers, "Class A" accessible services are accessible for "typical use", since they have features that ensure their proper "core functionality". In this context, "typical use" refers to a service's ability to provide, through the exchange of appropriate messages, functionality and information that can be delivered to users through "basic" User Interfaces (UIs) of client applications; these are UIs which do not necessarily provide access to user groups with special needs, but provide content to non-impaired users in a proper, effective and useful way. The guidelines that should be followed in order for a service
to belong to this class refer to the general functional/operational characteristics of a service, which ensure the effective and efficient integration of its functionality and delivered information within appropriate "basic" client applications. One example is the Level 1 guideline stating that no one-way operations should be defined within a WS, since the client should always be properly informed, in response to the invocation request message sent, that the WS it tried to invoke was indeed properly invoked. "Class AA" accessible services are then defined as those which are "accessible for typical use" (i.e., they have the core functional features) and also have features that enable client applications to use their functionality and show the retrieved content appropriately to various user groups with special needs; these services thus have basic accessibility features as well. As an example, following the rationale behind the WCAG guidelines defined for web content (in particular, the one stating that alternative text should accompany any non-text content delivered through the web), one of the Level 2 WS guidelines suggests that a service providing non-text content (e.g., images) upon invocation should also provide a text element that contains all the important information conveyed through the images. Finally, "Class AAA" services are those which are "accessible for typical use", i.e., they have both core functional and basic accessibility features, and whose content additionally contains adequate accessibility information so as to be useful enough to impaired users. For instance, a service that provides route calculation functionality may return, apart from maps, textual information that can be presented to the end user through an assistive device (integrated in the client application).
However, if the service has not taken the needs of the impaired end user into account during the route calculation process, the returned route may eventually be inaccessible. These services thus have extended accessibility features which allow them to provide content with information adapted to the special needs of impaired user groups. As an example, a Level 3 guideline of the framework proposes that WS operations delivering information regarding "points of interest" (e.g., restaurants, cinemas, etc. in a city) should also provide, for each point of interest, information regarding its accessibility status with respect to different impaired user groups (e.g., wheelchair users). As the Level 2 WS accessibility guideline example above also shows, the guidelines defined within the proposed WS accessibility assessment framework were formed after thoroughly reviewing W3C's WCAG standardization guidelines for the accessibility of web content, in an effort to enhance web services with the concepts behind the Web Content Accessibility Guidelines proposed by W3C. Concluding, the proposed accessibility guidelines are categorized on the basis of three guideline levels: Level 1, Level 2 and Level 3, corresponding to the three accessibility classes A, AA and AAA, respectively. In order for a service to belong to a specific class, it should meet the guidelines of the corresponding level and thus have the required functional and accessibility features. It has to be noted, however, that the defined guidelines follow the SHALL-SHOULD-MAY convention, as outlined in RFC 2119 [15], and a distinction has been made between "mandatory" and "non-mandatory" WS guidelines at each level. This was done because some guidelines (e.g., alternative text accompanying delivered images) can be considered of higher importance than others towards ensuring that a service belongs to a specific
class of accessibility. Thus, even though all defined guidelines are suggested to be followed by WS developers where applicable, specific subsets of guidelines are defined as "mandatory" (and others as "non-mandatory") at each level; for a service to belong to an accessibility class, the respective level's mandatory guidelines "shall" be followed. All of the proposed guidelines have been stored in an appropriate ontology in order to be used in the web service assessment process described in the following.
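As an illustration of how a guideline check of this kind can be automated, the sketch below (in Python) flags WSDL portType operations that define no output message, in the spirit of the Level 1 "no one-way operations" guideline described above. It is a simplified stand-in for the actual assessment tool, and the WSDL fragment is hypothetical:

```python
import xml.etree.ElementTree as ET

WSDL_NS = "{http://schemas.xmlsoap.org/wsdl/}"

def one_way_operations(wsdl_xml):
    """Return the names of operations that declare an <input> but no
    <output> message, i.e. candidate violations of the Level 1
    'no one-way operations' guideline (simplified sketch)."""
    root = ET.fromstring(wsdl_xml)
    bad = []
    for op in root.iter(WSDL_NS + "operation"):
        if op.find(WSDL_NS + "input") is not None and \
           op.find(WSDL_NS + "output") is None:
            bad.append(op.get("name"))
    return bad

# Minimal hypothetical WSDL fragment: one request-response operation
# and one one-way operation
wsdl = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="DemoPort">
    <operation name="GetImage">
      <input message="tns:GetImageRequest"/>
      <output message="tns:GetImageResponse"/>
    </operation>
    <operation name="Notify">
      <input message="tns:NotifyRequest"/>
    </operation>
  </portType>
</definitions>"""

print(one_way_operations(wsdl))  # → ['Notify']
```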
3 Web Service Accessibility Assessment in Practice
In order to enable developers to assess whether their WSs are enabled with the necessary accessibility characteristics, a Web Service Assessment module has been developed (Fig. 3), responsible for the evaluation of web services through the above-described assessment framework. Regarded as a black box, this module takes as input the "Web Service Description Language" (WSDL) [16] or "Web Application Description Language" (WADL) [17] file describing the WS under evaluation, together with the WS accessibility guidelines defined within the ACCESSIBLE ontology, and produces as output the result of the WS accessibility assessment process. During service assessment, the module communicates with (a) the WS ontology, in order to get information regarding the defined web service accessibility guidelines, and (b) an EARL (Evaluation and Report Language) [18] based reporting tool, responsible for translating the assessment result into EARL-based reports. These reports follow a standardized format and can thus be used to present the result to the tool users in an appropriate way.
Fig. 3. Block Diagram of the Accessible Web Service Assessment Module
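To make the EARL-based reporting step concrete, the sketch below hand-rolls one minimal EARL assertion in Turtle syntax. The subject URL and guideline identifier are hypothetical, and a real report would also record the assertor and test mode:

```python
def earl_assertion(subject_url, test_id, passed):
    """Render one minimal EARL assertion in Turtle: an earl:Assertion
    linking a test subject, a test criterion and a pass/fail result."""
    outcome = "earl:passed" if passed else "earl:failed"
    return (
        "@prefix earl: <http://www.w3.org/ns/earl#> .\n\n"
        "[] a earl:Assertion ;\n"
        f"   earl:subject <{subject_url}> ;\n"
        f"   earl:test <{test_id}> ;\n"
        "   earl:result [ a earl:TestResult ;\n"
        f"                 earl:outcome {outcome} ] .\n"
    )

print(earl_assertion("http://example.org/service?wsdl",
                     "http://example.org/ws-guidelines#level1-no-one-way",
                     passed=False))
```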
As depicted in Fig. 3, the Accessible Web Service Assessment module consists of the following main components:
The Service Definition Parser: This component is responsible for parsing the formal XML-based files describing a web service; in the current implementation, these can be either WSDL- or WADL-based. The tool takes as input the WSDL or WADL file describing a web service (SOAP- or REST-based, respectively) and produces Java structures that hold information about the WS, appropriate for further processing and accessibility evaluation. It has to be noted that since WADL files are not at the moment a commonly used standard for describing REST WSs, the tool also offers its users the capability to define the structure of their WS through an appropriate UI, without needing to provide a WADL file; in this case the use of the service definition parser is omitted. Finally, the service definition parsing module is extensible, so as to allow (after the integration of further appropriate modules) the parsing of further service types that could be defined in the future, given that these services can be regarded as black boxes with specific input and output data structures.
The Service Alignment Tool: This tool offers the service evaluator the capability to "align" web services to the accessible WS "ideal operations" defined within the ACCESSIBLE ontology. The alignment process enables the tool to identify specific requirements that the input/output structures of the service under assessment should have, based on the service category. For instance, a web service operation that returns information related to points of interest has to follow some Level 3 WS accessibility guidelines that differ from those followed by a service offering route calculation capabilities. For the purposes of this process, a set of ideal operations has currently been defined.
Some of these ideal operations refer to general-purpose WSs, such as the image, audio, video or textual info provider operations, whereas others are more specific and deal, for instance, with "info-mobility" WSs, such as the "points of interest" info provider and "route calculation" operations. As an example, two of the already defined ideal operations are shown in more detail in Fig. 4:
- Image Provider operation. Inputs: none. Outputs: Image Object, Image Object URL, Alternative Text.
- Points of Interest Provider operation. Inputs: User Category. Outputs: Point of Interest {User Group, POI Accessibility Status, POI Accessibility Status Details}.
Fig. 4. Example ideal WS operations defined in the ACCESSIBLE ontology
The image provider ideal operation holds the minimum necessary elements that a web service delivering images (e.g., maps) should have in order to be considered accessible. In particular, as denoted in its output (Fig. 4), an alternative text element should accompany every image delivered through the WS. This alternative text should be a description of the image that can be handled, for example, by text-to-speech modules integrated in the end-user application utilizing the service, so that the information conveyed through the image is accessible to blind end users as well. Similarly, the Points of Interest info provider ideal
operation defines that a web service of this kind should provide, in each delivered block of information regarding a specific "point of interest" (e.g., a restaurant), additional information about the POI's accessibility status with respect to different end-user categories (e.g., wheelchair users). Apart from enabling the tool to identify the WS type and whether specific elements exist in the input/output data structures, these ideal operations also provide developers with a more direct guide to the accessibility characteristics their service should have (with respect to the WS category it belongs to) in order to be regarded as accessible.
The Accessible Web Service Evaluator: This component is responsible for interpreting and combining the information gathered from the above-described modules, so as to conclude whether the service under assessment belongs to a specific accessibility class. In particular, it takes as input (a) the information derived from parsing the WSDL file, (b) the information derived from aligning the service's operations to the concepts defined within the "ideal" ones, and (c) the web service accessibility guidelines defined within the ontology. The Accessible WS Evaluator combines these three inputs and produces as output the WS accessibility assessment result, which is then passed to the EARL-based reporting tool responsible for generating the accessibility report.
3.1 WS Accessibility Assessment Procedure
The assessment procedure supported by the Web Service Assessment tool consists of the following steps:
1. Parsing of the web service's definition (WSDL or WADL) file: During this initial step, the definition file parser acquires information regarding the operations defined within the service under evaluation.
During the parsing process, all the information contained in the WSDL or WADL file is transferred into Java-based structures, appropriate for further processing and evaluation of the accessibility guidelines.
2. Automatic evaluation of the service's accessibility status based on the information acquired in step 1: In this step, the information acquired in step 1 is used by the Web Service Evaluator to evaluate a limited set of the service accessibility guidelines, namely those that can be checked automatically using only this information.
3. Alignment of the service's operations to the "ideal operation" elements defined within the ontology: Utilizing the service alignment capabilities offered by the ASK-IT Service Alignment Tool, the Accessible Web Service Assessment module acquires further information regarding the service's operations and their input and output structures. In this process, the service assessment tool aligns the operations defined within the service's WSDL file, together with their input and output elements, to corresponding ones defined within the "ideal operations". The produced alignments are then used by the Accessible WS Evaluator during step 4.
4. Automatic evaluation of the service's accessibility status based on the combined information acquired from steps 1 and 3: In this step, all information acquired
from steps 1 and 3 is used by the Accessible WS Evaluator to evaluate a broader set of guidelines than that assessed in step 2.
5. Manual evaluation of the guidelines that cannot be assessed automatically using the information from steps 1 and 3: In this step, the (human) evaluator manually evaluates the service against the guidelines not checked up to step 4. For this task, the evaluator is offered the capability to invoke the web service's operations using the dynamic invocation option provided by the WSDL parser, in order to check the remaining accessibility guidelines.
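Steps 2 and 4 above can be sketched as a simple completeness check of an aligned operation against its ideal operation. The dictionary field names below are illustrative, not the tool's actual data model:

```python
# Required output elements per ideal operation (hypothetical names,
# modeled on the ideal operations of Fig. 4)
IDEAL_OUTPUTS = {
    "ImageProvider": {"image_object", "image_object_url", "alternative_text"},
    "PointsOfInterestProvider": {"poi", "user_group", "poi_accessibility_status"},
}

def evaluate_operation(op):
    """Return the ideal-operation output elements missing from an
    aligned operation; an empty set means the guideline is met."""
    required = IDEAL_OUTPUTS.get(op["aligned_to"], set())
    return required - set(op["outputs"])

# A hypothetical image-delivering operation that lacks alternative text
op = {"name": "getMap",
      "aligned_to": "ImageProvider",
      "outputs": ["image_object", "image_object_url"]}
print(sorted(evaluate_operation(op)))  # → ['alternative_text']
```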
4 Conclusions
This paper has presented work conducted towards enabling accessibility characteristics in the WS domain. The target is to ensure that the "Client Application – WS interaction" part of the WS utilization chain allows for accessible HCI at the "End User – Client Application" level. For this purpose, this work takes a step forward and proposes a three-layer WS accessibility assessment framework which can form the basis for the future development of "accessible web services": services that are well-defined, well-working and easy to integrate within client applications, and that provide content which is accessible to impaired users and conveys information that can be actually helpful to them, adapted to their special needs. The WS accessibility assessment tool developed on the basis of the proposed framework assists WS developers in creating services that conform to the accessibility guidelines and are thus enabled with the characteristics that make them accessible.
Acknowledgments. This work was partially funded by the EC FP7 project ACCESSIBLE - Accessibility Assessment Simulation Environment for New Applications Design and Development, Grant Agreement No. 224145 (www.accessible-eu.org).
References

1. Thatcher, J., Waddell, C.D., Henry, S.L., Swierenga, S., Urban, M.D., Burks, M., Regan, B., Bohman, P.: Constructing Accessible Web Sites. Glasshaus, San Francisco (2003)
2. Sierkowski, B.: Achieving web accessibility. In: Proceedings of the 30th Annual ACM SIGUCCS Conference on User Services, SIGUCCS 2002, Providence, Rhode Island, USA, November 20-23, pp. 288–291. ACM, New York (2002)
3. Abascal, J., Arrue, M., Fajardo, I., Garay, N., Tomas, J.: Use of guidelines to automatically verify web accessibility. Universal Access in the Information Society 3(1), 71–79 (2004)
4. Lazar, J., Dudley-Sponaugle, A., Greenidge, K.D.: Improving web accessibility: a study of webmaster perceptions. Computers in Human Behavior, The Compass of Human-Computer Interaction, 20(2), 269–288 (2004)
5. Petrie, H., Kheir, O.: The relationship between accessibility and usability of websites. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2007, San Jose, California, USA, April 28 - May 03, pp. 397–406. ACM, New York (2007)
D. Giakoumis, D. Tzovaras, and G. Hassapis
6. Altmanninger, K., Wöß, W.: Accessible graphics in web applications: dynamic generation, analysis and verification. In: ICCHP 2008: Proceedings of the 11th International Conference on Computers Helping People with Special Needs, pp. 378–385 (2008)
7. World Wide Web Consortium "W3C", http://www.w3.org
8. W3C WCAG 2.0, http://www.w3.org/TR/WCAG20
9. Curbera, F., Duftler, M., Khalaf, R., Nagy, W., Mukhi, N., Weerawarana, S.: Unraveling the Web Services Web. IEEE Internet Computing 6, 86–93 (2002)
10. W3C SOAP Specifications, http://www.w3.org/TR/soap/
11. Costello, R.L.: Building Web Services the REST Way, http://www.xfront.com/REST-Web-Services.html
12. W3C Web Services Activity, http://www.w3.org/2002/ws/
13. Web Services Interoperability Organization "WS-I", http://www.ws-i.org/
14. ACCESSIBLE Project Official Website, http://www.accessible-eu.org/
15. Internet Engineering Task Force RFC 2119, http://www.ietf.org/rfc/rfc2119.txt
16. Web Services Description Language (WSDL), http://www.w3.org/TR/wsdl
17. Web Application Description Language (WADL), http://www.w3.org/Submission/wadl/
18. W3C Evaluation and Report Language (EARL) Schema, http://www.w3.org/TR/EARL10-Schema/
Results from Multi-dimensional Accessibility Assessment

Rogério Bandeira, Rui Lopes, and Luís Carriço

LaSIGE, University of Lisbon, Edifício C6 Piso 3, Campo Grande, 1749-016 Lisboa, Portugal
{rbandeira,rlopes,lmc}@di.fc.ul.pt
Abstract. This paper discusses the variability of website accessibility using a multi-dimensional evaluation that considers specific sets of relevant guidelines according to different devices and different disability types. We use an accessibility evaluation framework that can explore different combinations of guidelines from web content accessibility and mobile web best practices, and apply it to evaluate a set of interesting case studies. The results obtained show that web content presents different accessibility issues for specific disability types, always a subset of the universal accessibility assessment. Regarding the device dimension, the assessment results show significant differences depending on the web resource representation for different devices. In all cases, the dissimilarities between the general accessibility assessment and the evaluations for specific disabilities were visible.

Keywords: Mobile Web, Accessibility, Assessment.
Finally, at the confluence of mobility and accessibility, things get even worse. The intrinsic features and limitations of mobile devices, along with the accessibility requirements, are a hindrance but also a challenge to comprehensive web content development and interaction. Again, the variability in the evaluation of websites according to the disability type dimension and the delivery context dimension needs to be addressed. This paper addresses these multi-dimensional assessment issues. It presents a multi-dimensional accessibility assessment approach allowing web content accessibility evaluation for different selectable disability profiles and different delivery contexts. The assessment results of the case studies carried out are presented and discussed.
2 Related Work

Two sets of guidelines stand out in the quest to develop accessible and mobile-friendly web content. The Web Content Accessibility Guidelines (WCAG) [1, 2] define a set of rules to make websites accessible to people with disabilities, whereas the Mobile Web Best Practices (MWBP) [3] define rules for making websites more usable from a mobile device. Although their correlations are well documented [4], developers are still unfamiliar with each of them, let alone their combination. Moreover, if we take into account each disability and the corresponding usage and accessibility constraints, the dimensions of the puzzle become even more intricate. Several tools are already available for assessing web sites in terms of their accessibility [5, 8] and in terms of their mobile usage [5, 7, 8]. In general, though, they tend to adopt approaches where guidelines are applied regardless of the target users, the target devices, or the conjunction of mobile and accessibility constraints. Even if recent work [9, 10] is emerging that addresses some of these nuances, a comprehensive approach is still lacking that addresses the variability of accessibility according to each disability type and device delivery context.
3 The Approach

The multi-dimensional accessibility assessment approach allows for web content accessibility evaluation regarding different selectable disability profiles and different delivery contexts, such as mobile and desktop user agents. Both accessibility assessments based on WCAG conformance evaluation and disability-type-specific assessments based on subsets related to targeted disability types [10, 13] can be performed. Regarding the delivery context dimension, the rationale for the selection and application of guidelines aiming at mobile web accessibility evaluation for specific disability profiles follows these steps [11, 12]:
1. Get the WCAG guideline sets relevant to the targeted disability type;
2. Select the related MWBP relevant to the targeted disability type;
3. Select the guidelines relevant to mobile web content adequacy regardless of the users' special needs;
4. Exclude guidelines that become irrelevant to the targeted disability type when content is accessed from mobile devices.

For the first step, the relevance of each WCAG guideline can be established with regard to each disability type [10, 13]. For the second step, the standard mapping between WCAG and MWBP [4] is used as a first approach. The third step covers guidelines referring to aspects of mobile usage adequacy that do not relate to the accessibility issues of any disability group, such as character encoding, clarity, preferred content format, content format support, cookies, etc.; these have no relation to any disability type or WCAG best practice issue, but are critical to general mobile device interaction. Guidelines such as "page size limit" or "link target format" are good examples. In principle, then, the conjunction of these two MWBP subsets, i.e., the WCAG-related one and the accessibility-independent one, along with the disability-specific WCAG set, constitutes the whole relevant set that should be used to assess content for a given disability. However, a deeper analysis revealed some interesting, potentially controversial, issues. Consider a blind user, or a user who by rule turns off the image download option in the user agent of his/her mobile device. Applying MWBP image-related tests for guideline conformance (e.g., "images specify size") can produce failure results that are irrelevant for that specific usage. In fact, not specifying the image size will not change the user experience at all, since the image will not be downloaded anyway. Those cases are reflected in step 4 of the method as guidelines that can be excluded from the relevant sets for specific disability types, such as the visually impaired. This approach was reflected in a proof-of-concept tool, MWAAT, which fully addresses its basic concepts [11].
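The four-step selection above amounts to simple set algebra: take the union of the disability-relevant WCAG set, the related MWBP set, and the disability-independent mobile set, then subtract the exclusions. The guideline identifiers below are invented for illustration, not taken from the actual WCAG/MWBP mapping tables.

```python
def relevant_set(wcag_for_disability, mwbp_for_disability,
                 mwbp_mobile_general, excluded_for_disability):
    """Steps 1-3 combine three guideline sets; step 4 removes exclusions."""
    combined = wcag_for_disability | mwbp_for_disability | mwbp_mobile_general
    return combined - excluded_for_disability

# Hypothetical example for a blind user profile: with images not downloaded,
# "images-specify-size" (excluded in step 4) drops out of the relevant set.
blind = relevant_set(
    wcag_for_disability={"text-alternatives", "keyboard-access"},
    mwbp_for_disability={"link-target-format"},
    mwbp_mobile_general={"page-size-limit", "images-specify-size"},
    excluded_for_disability={"images-specify-size"},
)
assert blind == {"text-alternatives", "keyboard-access",
                 "link-target-format", "page-size-limit"}
```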
This provides the necessary support for web developers, designers and assessment experts to conduct rapid, yet specialized, accessibility assessments focused on different disability types, for web sites also tailored to mobile devices.
4 Multi-dimensional Accessibility Assessment Scenarios

Using the abovementioned method and tool, we accessed different web site resources, simulating access from default desktop and mobile OK delivery contexts [3]. We evaluated the accessibility of the received web content representations in the following web resource evaluation scenarios:

• Web site's default representation accessibility evaluation for:
  • All disability types
  • Blind disability type
  • Deaf disability type
  • Color blind disability type
  • Motor impaired disability type
• Web site's mobile OK representation mobile accessibility evaluation for:
  • Mobile adequacy evaluation (no disability)
  • All disability types
  • Blind disability type
  • Deaf disability type
  • Color blind disability type
  • Motor impaired disability type
• Web site's default representation mobile accessibility evaluation for:
  • Mobile adequacy evaluation (no disability)
  • All disability types
  • Blind disability type
  • Deaf disability type
  • Color blind disability type
  • Motor impaired disability type
For simplicity, the tables for each case study are organized as follows:
• The first table reflects the accessibility evaluation for the default representation according to the scenarios presented above, namely, the assessment for no specific disability, then for the blind, deaf, colour blind and motor impaired disability types, in this order.
• The second table reflects the mobile accessibility evaluation for the mobile representation. In this second table, the first row (Disability: NONE) depicts the mobile assessment without considering accessibility issues, whereas the following rows adhere to the abovementioned organization.
• The third table reflects the mobile accessibility evaluation for the default representation, using the same ordering as the second one.
Fig. 1. Web portal mobile OK representation mobile adequacy evaluation
The fourth (percentage) column represents the ratio between the number of nodes with warnings and the total number of nodes. The last column represents the ratio between the number of warnings and the total number of nodes. Warn=Err signifies that failures and warnings were counted together. The different received HTTP response contents were evaluated according to the referred approach. We see from the Web Portal mobile adequacy evaluation example that, although the site supports a mobile OK representation, the web content even in that representation still exhibits several non-adequate issues, such as onclick events with associated JavaScript that most mobile devices do not support, and several image tags still lacking the height and/or width specification, with the resulting rendering issues derived therefrom. The next section presents the aggregate results obtained from the performed evaluations.
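The two ratio columns described above can be sketched as a small computation; the function and field names here are assumptions for illustration, not the tool's actual output format.

```python
def ratios(total_nodes, nodes_with_warnings, warnings):
    """Return (nodes-with-warnings / total nodes, warnings / total nodes)."""
    return (nodes_with_warnings / total_nodes, warnings / total_nodes)

# E.g. a page of 200 nodes, 18 of them carrying 30 warnings in total:
node_ratio, warn_ratio = ratios(200, 18, 30)
assert node_ratio == 0.09   # 9% of nodes have warnings
assert warn_ratio == 0.15   # 15 warnings per 100 nodes
```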
5 Multi-dimensional Accessibility Assessment Results

The tables below present the summary results for three case studies, covering the evaluation scenarios presented above.

• Web portal assessment case study:
Fig. 2. Web portal default representation accessibility assessment results
Fig. 3. Web portal mobile representation mobile accessibility assessment results
Fig. 4. Web portal default representation mobile accessibility assessment results
• Web magazine assessment case study:
Fig. 5. Web magazine default representation accessibility assessment results
Fig. 6. Web magazine mobile representation mobile accessibility assessment results
Fig. 7. Web magazine default representation mobile accessibility assessment results
To start with, consider the accessibility dimension. The first table of each case study shows figures that raise interesting questions. Primarily, it is clear that most of the specific disabilities have far fewer issues than the general case, since each disability's relevant set of guidelines is a subset of the available tests. Furthermore, a deeper analysis of the evaluation results showed that even when the numbers are similar between disabilities, the actual issues raised generally correspond to different guidelines. This reinforces the decision to offer a disability-specific testing option; for example, for the deaf case the site is completely accessible. At the other end, the color blind case, which has a significant number of issues, can easily be explained by the fact that most of them are warnings regarding the contrast of background and foreground colors. In fact, the actual contrast is not tested and might be correct. Looking now at the mobility dimension, we should focus attention on the first line of the second and third tables of each case study. Here it is clear that the mobile representation (second table) presents a much smaller page size than the default representation (third table) in all case studies. This occurs because all sites have a mobile-specific content representation, which usually offers a much simplified version of the site. For sites without this feature, the results of the second and third tables would be the same.

• Web financial portal assessment case study:
Fig. 8. Web financial portal default representation accessibility assessment results
Fig. 9. Web financial portal mobile representation mobile accessibility assessment results
Regarding all the case studies and the percentage of issues raised, it is noticeable that the web magazine shows the best improvement rates (9% to 1% in nodes and 15% to 1% in warnings) from the default to the mobile representation. The financial portal shows no improvement, but the site is already quite good in its default format
Fig. 10. Web financial portal default representation mobile accessibility assessment results
(third table). Finally, the Web Portal case study shows minimal improvement (9% to 6% and 10% to 9%). A closer look at the HTML content and the evaluation results revealed an advertisement using JavaScript code, not supported by many mobile phones, and some tags with unspecified image height or width attributes. Considering the first two lines of the second table of each case study, corresponding to the mobile representation mobile accessibility assessment, one can observe that the number of warnings increases, as expected, from the mobile adequacy case to the mobile accessibility case (Web Portal: 6 to 13; Web Magazine: 1 to 2; Web Financial Portal: 7 to 11). This reinforces the notion that assessing mobile adequacy is not the same as assessing mobile accessibility. Comparing the same two initial rows with those of the third table, it is evident that, in absolute numbers, the gain in accessibility is enormous when comparing the mobile representation with the default one, both in terms of the number of nodes with warnings and the total number of warnings. Looking at percentages, that is also true for most case studies. The exception is the number of warnings in the web portal's case (last column of the tables). Again, the explanation lies in the aforementioned advertisement. Overall, the verified improvement in the accessibility of the mobile representations versus the default ones is in accordance with the well-known overlap of guidelines. Nevertheless, this overlap is far from complete, and sometimes the reasons and the corrective measures differ between the mobile adequacy issues and the accessibility ones. Finally, it is worth comparing the rows of the second table (or of the third, which follows a similar pattern). In all case studies, the differences between the general accessibility evaluation and the specific disability ones are noticeable, as referred to in the analysis of the first table.
An interesting one is the blind disability evaluation scenario. Particularly in the web magazine case, the mobile accessibility issues disappear. A more detailed analysis of the results shows that the issue found in the mobile adequacy assessment (first row) that disappears in the blind assessment (third row) refers to the missing image size specification, which is considered irrelevant in the latter.
6 Discussion

This paper presents and discusses the results obtained from the evaluation of different web resources, accessed from different delivery contexts and for different disability types. The different representations of each web resource, for each of the delivery contexts, were assessed according to different evaluation scenarios, including standard accessibility without regard to any particular type of disability, and accessibility considering specific types of disabilities such as visual, hearing, color blind, motor and cognitive impairments. The results obtained support the assertion that, as expected, web contents have far fewer accessibility problems with regard to each specific disability type than when assessed in the indiscriminate general case. A deeper analysis of the results showed that even when the numbers are similar between the different types of disabilities, the actual problems raised generally correspond to different guidelines not being observed. Regarding the mobility dimension, it was clear that the mobile representation presents a much smaller size than the default representation in all case studies. This confirms that these websites have a particular representation to be accessed from mobile contexts, which usually offers a simplified, more suitable version. In absolute numbers, the gain in terms of accessibility is huge when comparing the mobile representation with the default representation, both in terms of the number of nodes and in the number of warnings and errors. Looking at percentages, the same is true for most case studies. In general, the expected improvement in accessibility of mobile representations versus standard representations was observed. In all cases, the differences between the general accessibility assessment and the evaluations for specific disabilities were visible.
References

1. Chisholm, W., Vanderheiden, G., Jacobs, I.: Web Content Accessibility Guidelines 1.0 (May 1999), http://www.w3.org/TR/WCAG10/ (last accessed on January 27, 2011)
2. Caldwell, B., Cooper, M., Reid, L.G., Vanderheiden, G.: Web Content Accessibility Guidelines 2.0 (December 2008), http://www.w3.org/TR/WCAG20/ (last accessed on January 27, 2011)
3. Rabin, J., McCathieNevile, C.: Mobile Web Best Practices 1.0 (July 2008), http://www.w3.org/TR/mobile-bp/ (last accessed on January 27, 2011)
4. Chuter, A., Yesilada, Y.: Relationship between Mobile Web Best Practices (MWBP) and Web Content Accessibility Guidelines (WCAG) (July 2009), http://www.w3.org/TR/mwbp-wcag/ (last accessed on January 27, 2011)
5. Abou-Zahra, S., et al. (eds.): Web Accessibility Evaluation Tools: Overview, http://www.w3.org/WAI/ER/tools/ (last accessed on January 27, 2011)
6. W3C, W3C mobileOK Checker, http://validator.w3.org/mobile/ (last accessed on January 27, 2011)
7. dotMobi, mobiReady, http://mobiready.com/launch.jsp?locale=en_EN (last accessed on January 27, 2011)
8. Benjamin, S., et al.: D 2.1 - State of the art survey in accessibility research and market survey. Technical report, ACCESSIBLE, Grant Agreement No 224145 (September 2009)
9. Yesilada, Y., Chen, T., Harper, S.: Mobile Web Barriers for the Barrier Walkthrough (July 2008), http://wel-eprints.cs.manchester.ac.uk/93/1/riam_D3.pdf (last accessed on January 27, 2011)
10. Brajnik, G.: Barrier Walkthrough: heuristic evaluation guided by accessibility barriers (2009), http://users.dimi.uniud.it/~giorgio.brajnik/projects/bw/bw.html (last accessed on January 27, 2011)
11. Bandeira, R., Lopes, R., Carriço, L.: Towards mobile Web accessibility evaluation. In: ETAPS, FOSS-AMA Workshop, Paphos, Cyprus (March 2010)
12. Carriço, L., Lopes, R., Bandeira, R.: Crosschecking MWBP for Visually Impaired Persons (2011) (submitted for publication)
13. Chalkia, E., et al.: D 3.1 - HAM Accessible Harmonized Methodology. Technical report, ACCESSIBLE, Grant Agreement No 224145 (September 2009)
A Harmonised Methodology for the Components of Software Applications Accessibility and its Evaluation

Eleni Chalkia and Evangelos Bekiaris

Hellenic Institute of Transport, Centre for Research and Technology Hellas, 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, PO BOX 60361, GR-57001, Greece
{hchalkia,abek}@certh.gr
Abstract. Accessibility today is gaining more and more ground, becoming a real necessity in daily living and everyday needs. Authorities and experts are putting a lot of effort into accessibility, especially in the software application domain. Despite this, ICT applications and systems are still not fully accessible. The main idea of the ACCESSIBLE project is to contribute towards better accessibility for all citizens. This will be achieved by increasing the use of standards and by the development of an assessment simulation environment, as well as a harmonized methodology that links all the accessibility components. In this paper we present the general harmonised methodology introduced in the ACCESSIBLE project to correlate the proposed accessibility components. Attention is also given to the evaluation of the ACCESSIBLE harmonised methodology, as well as to future plans.

Keywords: Accessibility, harmonization, disability.
researchers and practitioners realised that access to a computer-based system is often denied to large numbers of potential users as a result of the system's design. In the old days, it was widely believed that the interaction ability of an individual is simply subject to his/her functional characteristics. Yet we now understand that it is the design of the system, in combination with the functional characteristics of the user, that renders the person able or unable to interact with it. Based upon this insight, in ACCESSIBLE we tried to correlate and link the characteristics of disabled users with their functional limitations, the assistive technologies, and the design guidance offered by existing accessibility guidelines within one Harmonised Methodology.
2 ACCESSIBLE Harmonised Methodology Methodological Framework

The ACCESSIBLE Harmonized Methodology is an attempt to harmonise each ACCESSIBLE area separately by correlating all their components, at first one by one and finally all together. This is a human-oriented methodology, which is why the disability type and the ICF classification have been placed at the kernel of our framework. Additionally, the methodology has been developed so that it can be traversed in either direction, in order to induce the respective results and cover the needs of all users. Furthermore, the ICF classification provides a concrete classification of impairments of the body structures, which ensures no overlap. To this end, experts in ACCESSIBLE worked on linking user types (e.g., disability types) to certain ICF body structures and their related impairments (e.g., see Fig. 1).
Fig. 1. Using the ICF classification as a base for harmonizing multiple user types
After this first step, experts in ACCESSIBLE worked on defining a classification of interaction limitations based on the disability type and ICF classification core. The interaction limitations are, in essence, a detailed explanation of the functional limitations that arise from the disability types, and a presentation of the points that should be checked in order for a web site to be accessible to individuals with these disabilities. Subsequently, the link between the aforementioned
components and the assistive devices was established. As illustrated in Fig. 2, assistive technologies are correlated with the disability types, the ICF classification and, further on, with the interaction limitations.
Fig. 2. Towards translating ICF body structures into interaction limitations and thereupon relating individual assistive technologies to specific body structures and/or disability types
In addition, the "translation" of the disability types and the ICF body structure impairments into interaction limitations further facilitates the linking of existing guidelines and heuristics from the literature to specific body structures and thereby to user types (see Fig. 3). Although it is often very difficult to understand what type of user benefits the most from a given guideline (because it is hard for inexperienced developers to understand a disability or its effects), it is much easier to correlate a guideline to an explicitly described interaction limitation.
Fig. 3. Towards harmonizing design guidance with assistive technology and user types
Fig. 4. Overview of the ACCESSIBLE harmonised methodology for measuring accessibility
Last, but not least, the above workplan allows us to implement assessment rules derived from one or more guidelines, and to use the above classification (organised into an ontology) in order not to lose track of which user types benefit and which assistive technologies are affected (see Fig. 4). Ultimately, in this way, a developer will be in a position to initiate an assessment by defining (alone) any one of the following: user group(s), guideline collection(s), assistive technology(ies), or any other classification that can be integrated into this schema.
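The chained classification described above can be sketched as a set of linked lookup tables, following the path disability type → ICF body structure → interaction limitation → guideline. All identifiers below are invented illustrations, not entries of the actual ACCESSIBLE ontology; "WCAG-1.1.1" and "WCAG-1.2.2" refer to real WCAG 2.0 success criteria used here only as examples.

```python
# Hypothetical fragments of the linked classification.
DISABILITY_TO_ICF = {"blind": {"eye"}, "deaf": {"ear"}}
ICF_TO_LIMITATION = {"eye": {"cannot-perceive-images"},
                     "ear": {"cannot-perceive-audio"}}
LIMITATION_TO_GUIDELINE = {"cannot-perceive-images": {"WCAG-1.1.1"},
                           "cannot-perceive-audio": {"WCAG-1.2.2"}}

def guidelines_for(disability):
    """Follow disability -> ICF body structure -> interaction limitation -> guideline."""
    out = set()
    for structure in DISABILITY_TO_ICF.get(disability, set()):
        for limitation in ICF_TO_LIMITATION.get(structure, set()):
            out |= LIMITATION_TO_GUIDELINE.get(limitation, set())
    return out

assert guidelines_for("blind") == {"WCAG-1.1.1"}
assert guidelines_for("deaf") == {"WCAG-1.2.2"}
```

Because every link is a relation, the same tables can be inverted to answer the converse questions, e.g., which user types benefit from a given guideline, which is exactly the traceability the ontology is meant to preserve.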
3 Harmonized Methodology Components

3.1 Disabilities and ACCESSIBLE User Groups

HAM has been based upon the user groups identified in the ACCESSIBLE project. The target user groups consist of two main categories. The first category includes developers, designers and accessibility assessors, and the second category includes people with disabilities and/or older people. The disabilities/impairments taken into account in the ACCESSIBLE project include cognitive, hearing, visual, communication (receiving and producing) and upper-limb impairments.

3.2 HAM Interaction Limitations

Identifying the interaction limitations that exist in HCI and the barriers they cause is essential. Much effort has taken place in this field, from which we drew inspiration to create an inductive clustering of the possible functional limitations that each user from the ACCESSIBLE user groups may confront while trying to interact with any computerised device. In ACCESSIBLE, in order to identify the interaction limitations of each disability type, we built upon the "Barrier Walkthrough heuristic evaluation guided by accessibility barriers" methodology [3] of the project "Barrier Walkthrough: an accessibility evaluation
method" and adapted it to the HAM needs. The classification of the interaction limitations according to disabilities, and later on against guidelines, is an adaptation of the heuristic walkthrough method used for usability investigations, where the principles are replaced by barriers. The basic underlying idea is that, for testing and assessment purposes, it is better to start from known types of problems rather than from general design guidelines. Thus, the barriers that may occur while interacting with a device are the first element correlated to the disability type, and afterwards to the guidelines that should be followed in order to overcome each barrier.

3.3 HAM User and Assistive Technology

Most people with disabilities require assistive or adaptive devices to help them render or view Web content. Those in the disability technology field refer to these devices or software interfaces by many names, including access systems, assistive technology, adaptive technology, and adaptive computing. We will refer to these devices as assistive technologies. An assistive technology device is defined by the Assistive Technology Act of 1998 (ATA) as "any item, piece of equipment, or product system, whether acquired commercially, modified, or customized, that is used to increase, maintain, or improve the functional capabilities of individuals with disabilities". Assistive technology promotes greater independence by enabling people to perform tasks they were formerly unable to accomplish, or had great difficulty accomplishing, by providing enhancements to, or changed methods of interacting with, the technology needed to accomplish such tasks. Likewise, disability advocates point out that technology is often created without regard to people with disabilities, creating unnecessary barriers to hundreds of millions of people.

3.4 HAM4 Design Guidance

Web applications.
The most well-known and used framework for web content accessibility is the Web Content Accessibility Guidelines (WCAG) of the W3C. The WCAG were created in 1995 by the TRACE R&D Centre to prepare a set of recommendations for making HTML pages viewed in web browsers more accessible to users with disabilities. The definitive version was published by the World Wide Web Consortium (W3C) in May 1999. Though the WCAG were initially created as recommendations, in many countries they have been incorporated into legislation, because information policy-makers found them a convenient tool for determining whether a Web site is accessible. The first country to do so was the USA, which included the guidelines in the Americans with Disabilities Act (ADA); many other European countries followed. After WCAG 1.0, a new, more evolved version came from the W3C on 11 December 2008, after a slight change in the W3C's orientation. WCAG 2.0 has 12 guidelines organized under 4 principles: perceivable, operable, understandable, and robust. For each guideline there are testable success criteria, at three levels: A, AA, and AAA [4]. WCAG 2.0 builds on WCAG 1.0, which it replaces.
Mapping Web content guidelines to disabilities. The guidelines and checkpoints included in the HAM are the most critical ones from a great variety of guidelines that differ from one country to another and from one standardisation body to another. This variety gives developers, on the one hand, the opportunity to choose the most appropriate among them and, on the other, the risk of getting lost in a chaos of hard-to-understand and difficult-to-implement terms. This is why the correlation between the guidelines and the other components of the ACCESSIBLE ontology is such a demanding procedure. In the ACCESSIBLE harmonised methodology, the correlation between the disabilities and the WCAG 2.0 guidelines has been developed. To this end, all other guidelines, whether official or institutional, as well as research guidelines developed on the basis of WCAG, can easily be integrated into this methodological framework.

Mobile applications. Accessible mobile applications imply a novel approach to the construction of both accessible applications and mobile applications: in the first case, the hardware/technological constraints of a mobile device have to be taken into account, as well as how these impose new constraints on different kinds of disabilities; in the second case, accessibility must be intertwined with the existing solutions for the constraints imposed by the device. Consequently, ensuring mobile accessibility of applications requires taking into account the dichotomy between the constraints imposed by each domain on the other. The approach proposed for the ACCESSIBLE HAM4 in the Mobile Applications domain focuses on this dichotomy: to ensure mobile and accessible applications, the different disabilities specified in HAM4 are mapped onto the mobile and accessibility constraints.

Mobile application guidelines.
It has been noted several times that the constraints imposed by accessibility are akin to those imposed by the limitations of mobile devices (cf. [5]). Examples such as properly structured information, correct (and linear) labelling of forms, or media equivalence of contents illustrate this assertion. Consequently, striving for an accessible application is (partially) striving for a usable mobile application. Thus, a starting point for evaluating the accessibility of a mobile application is ensuring that the application is in fact usable in a mobile-centric environment. This solution, though, runs into problematic situations due to the diverse ecosystem of mobile devices. However, abstracting away from these constraints, there are general-purpose usability guidelines that can be applied to the mobile applications domain, as well as mobile-specific development guidelines that help build usable mobile applications [6]. Another set of guidelines for mobile applications concerns those targeted at the Web. Mobile Web applications have been gaining momentum in recent years, due to the ever-increasing proliferation of Web-enabled mobile devices (especially smartphones). The W3C created its Mobile Web Initiative [7] for “Making Web access from a mobile device as simple as Web access from a desktop device.” This motto has been pursued in several directions, such as studying social developments (e.g. the increasing usage of mobile devices in developing regions and its intersection with the Web) and easing the task of creating Web sites that are usable on mobile devices.
A Harmonised Methodology for the Components
The Mobile Web Best Practices (MWBP) define a set of checkpoints (akin to WCAG's) that developers must/should take into account to ensure that a Web page or Web site is properly functional and tailored to mobile devices. A two-level structure narrows the MWBP down to a subset of machine-verifiable checkpoints, called the MobileOK Basic Tests [8]. Mapping MWBP to Disabilities. In ACCESSIBLE, we propose a new approach to Mobile Accessibility Guidelines (MAG) based on leveraging existing guidelines from both WCAG and MWBP. We followed a three-step methodology to define the MAG: first, the MWBP are mapped onto WCAG; then, we leverage this mapping to associate the MWBP with different disabilities (so as to support personalised accessibility assessment, as defined by ACCESSIBLE); finally, we define a subset of MWBP checkpoints that can be applied/transposed to non-Web scenarios. Web services. An accessible service should:
1. be well-defined, well-working and easy to integrate within client applications;
2. provide content which can be accessed by impaired users; and
3. provide content with information which is actually useful and helpful for the impaired users accessing it.
These requirements provide the three accessibility layers which form the basis for the Accessible Web Service accessibility evaluation. Three service accessibility classes (A, AA and AAA) build upon the three accessibility layers (the concepts of core, basic and extended accessibility respectively), thus providing the means for service categorisation based on service accessibility features.
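The MWBP-to-disabilities mapping described above composes two relations: MWBP checkpoints are first mapped onto WCAG success criteria, and those criteria are in turn associated with disabilities. A minimal sketch of that composition follows; the checkpoint names, the criterion-to-disability associations and the function name are invented for illustration and are not part of the actual ACCESSIBLE ontology:

```javascript
// Step 1 (hypothetical data): MWBP checkpoints mapped onto WCAG 2.0
// success criteria.
const mwbpToWcag = {
  'NON_TEXT_ALTERNATIVES': ['1.1.1'],
  'TAB_ORDER': ['2.4.3']
};

// Hypothetical associations between success criteria and disabilities.
const wcagToDisabilities = {
  '1.1.1': ['blindness', 'low vision'],
  '2.4.3': ['blindness', 'motor impairment']
};

// Step 2: derive which disabilities an MWBP checkpoint is relevant to,
// by composing the two mappings.
function disabilitiesFor(mwbpCheckpoint) {
  const criteria = mwbpToWcag[mwbpCheckpoint] || [];
  const result = new Set();
  for (const c of criteria) {
    for (const d of wcagToDisabilities[c] || []) result.add(d);
  }
  return [...result].sort();
}
```

Composing the mappings this way is what enables the personalised assessment mentioned above: given a user's disability profile, only the relevant MWBP checkpoints need to be evaluated.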
Fig. 5. Accessible Web Service Evaluation framework base
As shown in Fig. 5, for each accessibility layer a set of guidelines (of importance Level 1, 2 and 3 respectively) is defined. These guidelines, if followed, provide a developed service with the functional and accessibility features (core, basic and extended) that enable it to belong to the corresponding accessibility class. Furthermore, for each proposed guideline, a set of specific techniques is defined, which can be used to check whether an already developed service belongs to a specific accessibility class or not.
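The class assignment implied by Fig. 5 can be sketched as follows: a service reaches class A when all Level-1 (core) guidelines are satisfied, AA when the Level-2 (basic) guidelines are also satisfied, and AAA when the Level-3 (extended) guidelines hold as well. The guideline identifiers below are placeholders, not the actual ACCESSIBLE guideline names:

```javascript
// Hypothetical guideline IDs per layer; real guidelines differ.
const levels = {
  core: ['C1', 'C2'],      // Level 1 guidelines
  basic: ['B1'],           // Level 2 guidelines
  extended: ['E1']         // Level 3 guidelines
};

// Given the set of guidelines a service satisfies, return its
// accessibility class (or null if it fails even the core layer).
function serviceClass(passed) {
  const all = ids => ids.every(id => passed.includes(id));
  if (!all(levels.core)) return null;
  if (!all(levels.basic)) return 'A';
  if (!all(levels.extended)) return 'AA';
  return 'AAA';
}
```

The cumulative structure mirrors WCAG conformance: each class presupposes the layers below it.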
In order for a service to be considered “fully accessible”, it has to:
1. Be accessible to developers of client applications who want to use the service's functionality and/or provided information within their application's operational context. This requirement defines the concept of the core functional layer.
2. Have accessibility features that enable the client applications invoking the service to present the delivered content in a way that is accessible for all users, respecting the special needs of impaired user groups. This requirement defines the concept of the basic accessibility layer.
3. Provide data containing enough information for the content itself to be helpful for impaired users, adapted to their special needs. This requirement defines the concept of the extended accessibility layer.
Description Language. An accessible SDL application is one that operates well, satisfying the needs of both impaired and non-impaired users. The application must be capable of processing user inputs and providing output content which is easy to interpret, useful and helpful. These requirements define two accessibility layers which form the basis of the Accessible SDL application evaluation: the basic accessibility layer and the extended accessibility layer. For an application to be basically accessible, it must follow the set of rules-guidelines defined for that level of accessibility; to be fully accessible, it must be a basically accessible application (satisfying the rules of the basic accessibility layer) that also follows the rules-guidelines defined for extended accessible SDL applications. Furthermore, for each guideline, a set of techniques has been defined that can be used to check whether an already developed SDL application belongs to a specific class or not.
4 ACCESSIBLE Harmonised Methodology Evaluation

The evaluation of the HAM was a process that took a long time to complete and included the following actions:
1. Create a contact list of partners with experts in the field of accessibility.
2. Create a template for receiving feedback on the methodology, the mapping, the ontology and the tools (these templates are included in Annex 13 of the current Deliverable).
3. Organise an open call for experts via the project web site, inviting experts in various fields to participate and validate the ACCESSIBLE HAM.
4. Create one separate document for each ACCESSIBLE field (web applications, web services, mobile-web applications, SDL) and send it to the experts. The documents included the following parts of the current Deliverable:
   a. Contents
   b. Short description of the followed methodology and its components
   c. Presentation of the respective domain and mapping of domain guidelines to disabilities
   d. Presentation of the HAM table for the respective domain
5. Distribute the documents, accompanied by the evaluation template, to the experts.
6. Receive feedback.
7. Organise the 1st ACCESSIBLE Workshop, held in conjunction with ICCHP 2010 in July 2010 in Vienna, focusing on the validation of the HAM through an open discussion on this issue (the results of the Workshop are noted in its minutes).
Following these steps, we conducted a thorough evaluation of the HAM that brought very positive comments and results to the project. The point on which the most positive comments were made, highlighted by all the reviewers, is the description of the disability types and their sub-categorisation into type, subtype, ICF, functional limitation, etc. Even the creator of the Barrier Walkthrough methodology, Giorgio Brajnik, on whose work the HAM was based, stated: “I like a lot the way in which in HAM you characterize a disability type (type, subtype, ICF, functional limitation, etc). I think I'll try to import that in my next description of the BW method.” Other promising comments included: “the idea of correlating all these components is of a great importance. It can assist us-developers in all the steps of developing to assessing our applications.”
Acknowledgments. This work was partially funded by the EC FP7 project ACCESSIBLE - Accessibility Assessment Simulation Environment for New Applications Design and Development, Grant Agreement No. 224145.
References
1. European Information Society, Activities, e-Inclusion: Communication – European i2010 initiative on e-Inclusion – to be part of the information society, http://ec.europa.eu/information_society/activities/einclusion/policy/i2010_initiative/index_en.htm
2. European Information Society, Activities, e-Inclusion: Accessibility – Opening up the Information Society (2009), http://ec.europa.eu/information_society/activities/einclusion/policy/accessibility/index_en.ht
3. Brajnik, G.: Web Accessibility Testing with Barriers Walkthrough (March 2006), http://www.dimi.uniud.it/giorgio/projects/bw
4. Caldwell, B., Cooper, M., Reid, L., Vanderheiden, G.: Web Content Accessibility Guidelines (WCAG) 2.0. W3C Recommendation, World Wide Web Consortium (W3C) (2008), http://www.w3.org/TR/WCAG20/
5. Disability Rights Commission: Inclusive Design (2003), http://www.drc.org.uk/drc/Documents/Inclusive_design.pdf (accessed 06/01/03)
6. Jones, M., Marsden, G.: Mobile Interaction Design. John Wiley & Sons, Chichester (2005)
7. Mobile Web Initiative, World Wide Web Consortium, http://www.w3.org/Mobile
8. Owen, S., Rabin, J.: MobileOK Basic Tests 1.0. W3C Recommendation, World Wide Web Consortium (W3C) (2008), http://www.w3.org/TR/mobileOK-basic10tests/
An Architecture for Multiple Web Accessibility Evaluation Environments Nádia Fernandes, Rui Lopes, and Luís Carriço LaSIGE, University of Lisbon, Edifício C6 Piso 3 Campo Grande, 1749 - 016 Lisboa, Portugal {nadia.fernandes,rlopes,lmc}@di.fc.ul.pt
Abstract. Modern Web sites leverage several techniques that allow for the injection of new content into their Web pages (e.g., AJAX), as well as manipulation of the HTML DOM tree. As a consequence, the Web pages presented to users (i.e., the browser environment) differ from the original structure and content transmitted through HTTP communication (i.e., the command line environment). This poses a series of challenges for Web accessibility evaluation, especially for automated evaluation software. In this paper, we present an evaluation framework for performing Web accessibility evaluations in different environments, with the goal of understanding how similar or distinct these environments can be in terms of their Web accessibility quality. Keywords: Web Accessibility, Web Accessibility Evaluation Environments.
This paper presents an evaluation framework for performing Web accessibility evaluations in different environments. Taking into account that existing automated evaluation procedures usually operate on the original HTML, conclusions about the accessibility quality of a Web page can be incomplete or, in extreme cases, erroneous. It is therefore important to assess the transformed HTML documents and understand how deep the differences from the original document are.
2 Related Work

To help create accessible Web pages, the Web Accessibility Initiative (WAI) developed a set of accessibility guidelines, the Web Content Accessibility Guidelines (WCAG) [9], that encourage creators (e.g., designers, developers) to construct Web pages according to a set of best practices. When this happens, a good level of accessibility can be guaranteed [1, 2]. Although these guidelines exist and are supposed to be followed by creators, most Web sites still have accessibility barriers that make it very difficult or even impossible for many people to use them [1]. Thus, WCAG can also be used as a benchmark for analysing the accessibility quality of a given Web page. Web accessibility evaluation is an assessment procedure to analyse how well the Web can be used by people with different levels of disabilities, as detailed in [1]. Optimal results are achieved by combining the different approaches to Web accessibility evaluation, taking advantage of the specific benefits of each [1]. Therefore, conformance checking [3], e.g., with the aid of automated Web accessibility evaluation tools, is an important step in accessibility evaluation. Automated evaluation is performed by software, i.e., it is carried out without the need for human intervention, which has the benefit of objectivity [2]. To verify where and why a Web page is not accessible, it is important to analyse the different resources that compose the Web page. Two examples of automatic accessibility evaluators are EvalAccess [6], which produces quantitative accessibility metrics from its reports, and the automatic tests of UWEM [5]. In the past, the predominant technologies on the Web were HTML and CSS, which resulted in static Web pages. Today, newer technologies (e.g., JavaScript) appear on top of these and, consequently, the Web is becoming more and more dynamic. Nowadays, user actions and/or automatically triggered events can alter a Web page's content. 
Because of this, the presented content can differ from that initially received by the Web browser. To address this problem, accessibility evaluation should be applied in new environments, e.g., in the Web browser context. Current automated evaluations do not consider these changes to the HTML document and, because of that, their results can be wrong and/or incomplete. Expert and user evaluations, being performed in the browser, do not suffer from these changes. The importance of the Web browser context for evaluation results is starting to be considered and is already exploited in three tools: Foxability, the Mozilla/Firefox Accessibility Extension, and the WAVE Firefox toolbar [7]. However, these tools focus only on evaluating Web pages according to WCAG 1.0. Furthermore, since their evaluation procedures are embedded as browser extensions, they are more limited in terms of their application.
Also, since these tools focus on aiding developers in fixing accessibility problems, the resulting outcomes of evaluations are user-friendly and thus less machine-friendly. Moreover, this “browser paradigm” – as it is called in [7] – is still very preliminary. Until now, to the best of our knowledge, the differences between results in different evaluation environments have not been clear. To perform correct comparisons, it must be guaranteed that tests are implemented in the same way in the different environments, thereby reducing implementation bias.
3 Web Accessibility Evaluation Environments

Our study focuses on two main environments: Command Line and Browser. The Command Line environment represents the typical environment for automated evaluation, which includes existing evaluators that can be accessed online; here the evaluation is performed on the original HTML document. In the Browser environment, users interact with the Web page, and the evaluation is performed on the transformed version of the HTML document. Consequently, to better grasp the differences between the environments, we defined an architecture that allows for leveraging the same evaluation procedures in any environment, as detailed below. Afterwards, we explain how we implemented the ideas of this architecture, as well as how it was validated.

3.1 Architecture
The architecture of our evaluation framework is composed of five components, as depicted in Figure 1: the QualWeb Evaluator, the Environments, the Techniques, the Formatters and the Web Server. The QualWeb Evaluator is responsible for performing the accessibility evaluation of Web pages using the capabilities provided by the Techniques component; it uses the Formatter component to tailor the results into specific serialisation formats, such as EARL reporting [8]. Finally, the QualWeb Evaluator can be used in different Environments. The Techniques component contains the individual front-end inspection code intended to be used in the evaluation. In our case we chose WCAG 2.0 [9], because it is one of the most important accessibility standards. The Techniques component is built so that other techniques can be added at any time. The Browser is the environment where the transformed HTML is used and the evaluation is performed in a browser. In the Browser, two mechanisms can be considered for delivering the evaluation results: Server and Embedded. In the Server mechanism, the HTML document is evaluated and the result is sent to the Web Server for subsequent analysis. In the Embedded mechanism, the evaluation results are injected into the HTML document and shown to developers/designers directly within the Web page. Furthermore, other environments can be added to the Environments component in order to supply different HTML representations.
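The separation of concerns described above can be sketched in a few lines of JavaScript. The interfaces below are assumptions for illustration only (not the actual QualWeb API): techniques are functions over a DOM tree supplied by whatever environment is in use, and a formatter serialises the collected results:

```javascript
// Hypothetical sketch of the component wiring: the evaluator is
// environment-agnostic; the environment supplies the DOM tree.
function QualWebEvaluator(techniques, formatter) {
  this.techniques = techniques;  // Techniques component
  this.formatter = formatter;    // Formatter component (e.g. EARL)
}

QualWebEvaluator.prototype.evaluate = function (dom) {
  const results = this.techniques.map(t => t(dom));
  return this.formatter(results);
};

// Example: one trivial "technique" and a JSON formatter (placeholders
// standing in for real WCAG 2.0 techniques and EARL serialisation).
const hasTitleTechnique = dom =>
  ({ technique: 'title', outcome: dom.title ? 'passed' : 'failed' });
const jsonFormatter = results => JSON.stringify(results);

const evaluator = new QualWebEvaluator([hasTitleTechnique], jsonFormatter);
```

Because the evaluator only depends on the DOM it is handed, the same instance can run over the original HTML (command line) or the transformed DOM (browser).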
3.2 Implementation

In order to compare the proposed evaluation environments, we must use the same accessibility evaluation implementation. Given that one of the environments is the Web browser, we are restricted to JavaScript as the implementation language. Thus, to develop the Command Line version of the evaluation process, we leveraged Node.js, an evented I/O framework based on the V8 JavaScript engine. In addition to standard Node.js modules, we used several other ancillary modules, including:
Fig. 1. Architecture of the Evaluation Framework
─ Node-Static, which allowed for serving static files into the browser environment;
─ Node-Router, a module that supports the development of dynamic behaviours, which we used to implement the retrieval and processing of evaluation results, and
─ HTML-Parser, which provides support for building HTML DOM trees in any environment.
Besides these standard modules, we also implemented a set of modules for our evaluation framework, including:
─ EARL module, which allows for the creation of EARL documents from the defined templates and parses EARL files using the Libxmljs library, and
─ Evaluator module, which performs the accessibility evaluation with the implemented techniques.
Next, we present an excerpt from the implementation of WCAG 2.0 technique H64:

    // position and addElement are defined elsewhere in the module.
    function inspect(DOMList) {
      if (typeof DOMList == "undefined" || DOMList.length == 0)
        return;
      for (var i = 0; i < DOMList.length; i++) {
        position++;
        if (DOMList[i]["type"] == "tag" &&
            (DOMList[i]["name"] == "frame" ||
             DOMList[i]["name"] == "iframe")) {
          if (DOMList[i]["attribs"]["title"] != "" &&
              DOMList[i]["attribs"]["title"] != "undefined" &&
              DOMList[i]["attribs"]["title"] != "''") {
            addElement(position,
              'cannotTell: title could not describe frame or iframe', "");
          } else {
            addElement(position, 'failed', "");
          }
        }
        inspect(DOMList[i]["children"]);
      }
    }
    exports.startEvaluation = startEvaluation;

Next, we present additional details on how we implemented both evaluation environments, as well as the report generation and processing capabilities.
Command Line Environment. This environment obtains the HTML document from a URL using an HTTP request, executes the QualWeb evaluator on the HTML DOM tree, and serialises its outcome into EARL. All of these processes are implemented with a combination of the HTML-Parser, EARL, and Evaluator modules, executed from a command line.
Browser Environment. This environment uses a bookmarklet (Figure 2) to trigger the execution of the evaluation within the browser. Bookmarklets are a kind of browser bookmark that has
the particularity of pointing to a URI that starts with the javascript: protocol, followed by pure JavaScript commands. Thus, when a user activates the bookmarklet, these commands are executed.
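As a hypothetical illustration of the mechanism (not the actual QualWeb bookmarklet, and with a placeholder URL), a bookmarklet that injects an evaluation script into the current page could be built like this:

```javascript
// A bookmarklet is just a javascript: URI; activating it runs the
// embedded code, which here appends a <script> element to the page.
// The script URL is a placeholder, not the real QualWeb location.
var bookmarklet =
  "javascript:(function(){" +
  "var s=document.createElement('script');" +
  "s.src='http://example.org/evaluator.js';" +
  "document.body.appendChild(s);" +
  "})();";
```

Saving this string as a bookmark's URL makes the browser execute the wrapped function in the context of whatever page is currently open.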
Fig. 2. Evaluation execution example on Browser
In the case of our evaluator, this bookmarklet injects the necessary functions to obtain the HTML DOM tree of the current Web page, executes the QualWeb evaluator, and sends the evaluation results to a server component. These results are transformed into the EARL serialisation format and subsequently stored. To implement this browser-server execution and communication mechanism, we used the following modules:
─ Bootstrap, to import the required base modules, and
─ LAB.js, to inject all of the evaluation modules into the browser's DOM context.
Report Generation and Processing. Finally, to generate the evaluation reports containing the accessibility quality results, we used the following modules:
─ Node-Template, to define EARL reporting templates,
─ Libxmljs, to parse EARL reports, and
─ CSV module, to recreate a comma-separated-values (CSV) counterpart of a given EARL report. This module allowed for better inspection and statistical analysis with off-the-shelf spreadsheet software. Moreover, to the best of our knowledge, there was no existing tool that parses EARL and outputs CSV.
While the EARL format allows for the specification of evaluation results, we had to extend EARL with a small set of elements to allow for the analysis of the outcomes of our experiment. Hence, we defined a Metadata field that supports the specification of an HTML element count, as well as a Timestamp stating the specific time when the evaluation was performed. The EARL reports served as the basis for generating CSV reports. Due to the extensiveness of the EARL reports generated by our evaluator, especially with respect to the parsing effort and consequent memory consumption of generic DOM parsers, we implemented the EARL-to-CSV transformation procedures with SAX events. Next, an EARL document example in RDF/N3 format: 
    <#QualWeb>
      dct:description ""@en ;
      dct:hasVersion "0.1" ;
      dct:location "http://qualweb.di.fc.ul.pt/" ;
      dct:title "The QualWeb WCAG 2.0 evaluator"@en ;
      a earl:Software .

      dct:title "H25"@en ;
      a earl:TestCase .

      dct:description ""@en ;
      dct:hasVersion "0.1" ;
      dct:title "The QualWeb WCAG 2.0 evaluator"@en ;
      a earl:Software ;
      foaf:homepage qw: .

      dct:description "description"^^rdf:XMLLiteral ;
      dct:title "Markup Valid"@en ;
      a earl:TestResult ;
      earl:info "info"^^rdf:XMLLiteral ;
      earl:outcome earl:passed ;
      earl:pointer <1> .

3.3 Testability and Validation

We developed a test bed comprising a total of 102 HTML documents, in order to verify that all the implemented WCAG 2.0 techniques provide the expected results. Each HTML document was carefully hand-crafted and peer-reviewed within our research team, in order to guarantee a high level of confidence in the correctness of our implementation. For each technique, success and failure cases were created to test all the possible technique outcomes. To get a better perspective on the implementation of our tests, we leveraged the examples of success and failure cases described for each WCAG 2.0 technique. The graph depicted in Figure 3 shows the number of HTML test documents defined for each technique implemented in the QualWeb evaluator. To test the proper application of the implemented techniques in the two evaluation environments, we defined a small meta-evaluation of our tool. This meta-evaluation consisted of triggering the evaluation on the command line with a small automation script, as well as opening each of the HTML test documents in the browser and triggering the evaluation through the supplied bookmarklet. Afterwards, we compared the evaluation outcomes for all HTML test documents against the previously defined expected results. Since all of
An Architecture for Multiple Web Accessibility Evaluation Environments
213
Fig. 3. Number of Test Documents per Technique
these HTML tests do not include JavaScript-based dynamics that transform their respective HTML DOM trees, we postulated that the implementation returns the same evaluation results in both evaluation environments.
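The outcome comparison performed in the meta-evaluation can be thought of as a pairwise check over per-technique results. The result shape {technique, outcome} below is an invented illustration, not the actual QualWeb report format:

```javascript
// Hypothetical comparison of per-technique outcomes produced by the
// command line and browser environments for the same test document.
function sameOutcomes(commandLineResults, browserResults) {
  if (commandLineResults.length !== browserResults.length) return false;
  return commandLineResults.every(function (r, i) {
    return r.technique === browserResults[i].technique &&
           r.outcome === browserResults[i].outcome;
  });
}
```

Running such a check over all 102 static test documents is what supports the postulate above: with no script-driven DOM transformation, both environments should agree.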
4 Conclusions and Future Work

We presented an architecture for multiple Web accessibility evaluation environments, implemented for the Command Line and Browser environments. The architecture was used successfully in accessibility evaluation tests. In this work, new modules were implemented to facilitate this type of evaluation; these modules will be made available online. Some limitations of this work are: the evaluations do not occur at exactly the same time in both environments, so we could not guarantee 100% that Web page generation artefacts were not introduced between requests for each of the evaluated Web pages; and the injection of accessibility evaluation scripts could be blocked by cross-site scripting (XSS) dismissal techniques. Ongoing work is being conducted in the following directions: 1) an in-depth implementation of WCAG 2.0 techniques for different front-end technologies, as well as their application in different settings and scenarios; 2) implementation of more WCAG 2.0 tests; 3) continuous monitoring of changes in the HTML DOM, thus opening the way for the detection of more complex accessibility issues, such as WAI-ARIA live regions [12]; 4) detecting the differences in DOM manipulation, in order to understand the typical actions performed by scripting in the browser context; and 5) the implementation of additional evaluation environments, such as developer extensions for Web browsers (e.g., Firebug), as well as support for interactive analysis of evaluation results embedded in the Web pages themselves.
5 Firebug: http://getfirebug.com/
Acknowledgements. This work was funded by Fundação para a Ciência e Tecnologia (FCT) through the QualWeb national research project PTDC/EIA-EIA/105079/2008, the Multiannual Funding Programme, and POSC/EU.
References
1. Harper, S., Yesilada, Y.: Web Accessibility. Springer, London (2008)
2. Lopes, R., Gomes, D., Carriço, L.: Web Not For All: A Large Scale Study of Web Accessibility. In: W4A 2010: 7th ACM International Cross-Disciplinary Conference on Web Accessibility. ACM, Raleigh (2010)
3. Abou-Zahra, S.: WAI: Strategies, Guidelines, Resources to Make the Web Accessible to People with Disabilities – Conformance Evaluation of Web Sites for Accessibility (2010), http://www.w3.org/WAI/eval/conformance.html (last accessed November 11, 2010)
4. Sullivan, T., Matson, R.: Barriers to Use: Usability and Content Accessibility on the Web's Most Popular Sites. In: CUU 2000: Proceedings of the 2000 Conference on Universal Usability, pp. 139–144. ACM, New York (2000)
5. Velleman, E., Meerveld, C., Strobbe, C., Koch, J., Velasco, C.A., Snaprud, M., Nietzio, A.: Unified Web Evaluation Methodology, UWEM 1.2 (2007)
6. Vigo, M., Arrue, M., Brajnik, G., Lomuscio, R., Abascal, J.: Quantitative Metrics for Measuring Web Accessibility. In: W4A 2007: Proceedings of the 2007 International Cross-Disciplinary Conference on Web Accessibility (W4A), pp. 99–107. ACM, New York (2007)
7. Fuertes, J.L., González, R., Gutiérrez, E., Martínez, L.: Hera-FFX: A Firefox Add-on for Semi-automatic Web Accessibility Evaluation. In: W4A 2009: Proceedings of the 2009 International Cross-Disciplinary Conference on Web Accessibility (W4A), pp. 26–34. ACM, New York (2009)
8. Abou-Zahra, S., Squillace, M.: Evaluation and Report Language (EARL) 1.0 Schema. Last Call WD, W3C (October 2009), http://www.w3.org/TR/2009/WD-EARL10Schema-20091029/
9. Caldwell, B., Cooper, M., Chisholm, W., Reid, L., Vanderheiden, G.: Web Content Accessibility Guidelines 2.0. W3C Recommendation, World Wide Web Consortium (W3C) (2008), http://www.w3.org/TR/WCAG20/
10. Sullivan, T., Matson, R.: Barriers to Use: Usability and Content Accessibility on the Web's Most Popular Sites. In: CUU 2000: Proceedings of the Conference on Universal Usability, pp. 139–144. ACM, New York (2000)
11. Vigo, M., Arrue, M., Brajnik, G., Lomuscio, R., Abascal, J.: Quantitative Metrics for Measuring Web Accessibility. In: W4A 2007: Proceedings of the 2007 International Cross-Disciplinary Conference on Web Accessibility (W4A), pp. 99–107. ACM, New York (2007)
12. Craig, J., Cooper, M.: Accessible Rich Internet Applications (WAI-ARIA) 1.0. W3C Working Draft, W3C (September 2010), http://www.w3.org/TR/wai-aria/
Overview of 1st AEGIS Pilot Phase Evaluation Results Maria Gkemou and Evangelos Bekiaris Hellenic Institute of Transport, Centre for Research and Technology Hellas 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, PO BOX 60361, GR-57001, Greece {mgemou,abek}@certh.gr
Abstract. This paper presents the most significant results emerging from the users' evaluation of the accessible solutions developed (and/or utilised as a starting point) in the ÆGIS IP project (Open Accessibility Everywhere: Groundwork, Infrastructure, Standards; http://www.aegis-project.eu) of the 7th European Framework Programme. Users participating in the first of the three evaluation rounds scheduled within the project represented all user clusters targeted by the project. The emerging results, which are considered overall positive, will constitute the basis for the optimisation to be carried out before the next evaluation round of the project. Keywords: eAccessibility, Open Source Software, iterative evaluation.
major mainstream ICT devices/applications domains, namely the desktop applications area, rich internet applications (RIA), and the Java-based mobile devices domain. The accessible solutions developed in the project have been and will be tested in three evaluation rounds and one final demonstration phase with all types of users (users with impairments and developers) and other stakeholders (i.e. tutors, evaluators, carers, etc.) targeted by the project, and they will also undergo in each case a parallel technical validation by the development teams of the project (and also by external developers in the second and third evaluation rounds). This paper aims to present the most important results of the first evaluation round conducted with users in the project. Chapter 2 gives an overview of the pilot activities held and a short description of the prototypes tested, Chapter 3 presents in short the profile of the users participating in the pilots, and, finally, Chapter 4 presents indicative results from the Human Factor Assessment with users. All pilot activities in the project have been and will be conducted on the basis of a thorough, User Centred Design based and Ethics Policy compliant evaluation framework and detailed experimental plans, which will themselves be optimised based on the results derived from each evaluation round.
2 Overview of 1st Phase Pilot Activities and Tested Prototypes

2.1 Prototypes/Solutions Tested in the First Evaluation Round

In the context of the first evaluation round, 10 prototypes were tested with users, and 5 of them also underwent technical validation. During the first 12-16 months of the ÆGIS project, 6 preliminary prototypes were developed by the respective development teams in ÆGIS, reaching an adequate maturity level to be included in its first evaluation phase. These are the “Accessible Contact Manager and Phone Dialer”, the “Concept Coding Framework OOo Symbols”, the “DAISY Production”, the “GnomeShell Magnifier”, the “Haptic RIA maps” and the “ÆGIS RIA Developer tool”. In addition to the above prototypes, 4 concepts/prototypes developed outside of ÆGIS were selected for testing: the “Open Speech Access to the GNOME Desktop environment” by Sun, the “AIM Real-Time Instant Messenger” by AOL, the “Oratio for Blackberry” by RIM and, finally, a collection of Text-To-Speech sample files for language evaluation. Each of them is shortly described below. The Accessible Contact Manager and Phone Dialer aims to show how the contact manager application presents the information and contact details of each contact; it includes a special support feature for users with cognitive impairments, allowing them to recognise each contact by graphical (picture), textual (label) and auditory (voice of the contact) information. The Concept Coding Framework OOo Symbols aims to make the text-based environment of a standard office application suite – OpenOffice.org (OO.org) – accessible as a productive tool also for users with more
Overview of 1st AEGIS Pilot Phase Evaluation Results
profound problems in relation to text – both in terms of writing and reading. This is achieved by providing graphical symbol support in addition to TTS reading support: graphic symbols illustrate the meaning of the words as they are entered into the text, or when text content is loaded from a file. The DAISY Production aims to demonstrate that it enables end users to create digital talking books in DAISY format from an (accessible) ODF document. The GnomeShell Magnifier provides magnification for visually impaired users; the aim in this case was to test several features implemented so far in the screen reader for the GNOME Desktop: the Magnification Factor, Fullscreen, Moveable Lens and Scroll at Screen Edges features. The Haptic RIA maps prototype aims to provide visually impaired users with an easy-to-use means of accessing conventional 2D maps. The user can interact with the produced 3D model of the map and examine its properties; the developed framework analyses the map image so as to extract the enclosed information. While navigating, audio messages provide information about the current position of the user (e.g. street name). The Open Speech Access to the GNOME Desktop environment is not an actual prototype; the aim of the plan designed was to test the existing Orca screen reader with open desktop applications and the Firefox web browser, including ARIA-enabled applications, and also to test the screen reader customisation functionality. The AIM Real-Time Instant Messenger constitutes a working implementation of AOL's commercial real-time text communication application (AOL is a project beneficiary). The Oratio for Blackberry by RIM (project beneficiary) provides a screen reading software solution that allows people with severe visual impairments to access and operate BlackBerry smartphones.
In the context of ÆGIS, a proof of concept with users will take place in order to get feedback about the screen reader features, its performance, and the inherent performance of the Accessibility API and of the Text-to-Speech (TTS) engine, so as to decide whether to apply them in relevant ÆGIS work. This prototype is built using an accessibility API and aims to verify the success of using this third generation of accessibility with this AT. Also, TTS sample files were provided for language evaluation, targeting the evaluation of the TTS engine and its further improvement in the context of the project. Finally, the ÆGIS RIA Developer tool presents the basic idea of the accessibility support for RIA application developers.

2.2 Overview of First Evaluation Round Activities

The 10 prototypes were evaluated in total with 185 users with impairments, 56 experts of various types (e.g. tutors, accessibility evaluators, consultants, etc.) and 7 developers (who tested the ÆGIS RIA Developer tool). Tests were conducted across 6 test sites (in 4 countries), namely in Belgium by EPR and KUL, in Spain by FONCE, in Sweden by SU-DART and in the UK by ACE and RNIB. It should be noted that the users of each test site tested more than one prototype, which implies that the number of testing sessions was much higher than the absolute number of users participating in the assessment. The following table gives an overview of the pilot activities of the first round of ÆGIS.
M. Gkemou and E. Bekiaris

Table 1. ÆGIS Pilot activities overview
3 Pilot Participants Profile

A series of selection criteria was defined and agreed upon by the whole ÆGIS Consortium (test sites and developers) to be considered in the participant recruitment for the different evaluation phases within the project. The selection criteria for the end users with impairments were the type and severity of disability, gender, age, previous experience with the devices analysed in the project (mobile, computer, internet), previous experience with AT, and previous experience of participation in similar activities of past or ongoing projects. The table below shows the intended distribution of the sample to be recruited (where the given percentages are the minimum for each variable, in order to ensure a controlled pool of participants), the actual
percentages according to the participants recruited at each site, and the deviation, if any, in the number of users needed to reach the minimum. Looking at the different columns of the table, one can notice only very slight deviations from the minimum acceptable percentages, which means that the sample of users recruited to carry out the first-phase evaluation tests was correctly controlled by the Consortium so as to ensure the consistency and representativeness of the results. It also implies the statistical strength of the emerging results.

Table 2. ÆGIS Pilot participants profile according to the originally defined recruitment criteria
The selection criterion for experts and tutors is the type of disability with which each one has experience, which is directly linked to the beneficiaries of each prototype. In the case of developers, the criterion is the type of area (mobile, desktop, RIA), which is also related to the environment of the prototype. This means that the selection for the recruitment of experts, tutors and developers is not as specific and exhaustive as the one followed for the users with disabilities.
4 Human Factors Assessment Results

4.1 Human Factors Assessment Techniques

The evaluation approach followed in ÆGIS needed to encompass all types of users that interact directly or indirectly with its solutions. Therefore, as shown in the table below, besides performance testing with end-users deploying naturalistic observation methods, focus groups were planned in order to involve tutors, experts and other relevant stakeholders. It is important to note that ÆGIS has tried to gather both subjective and objective measurements, aiming not to evaluate the users' performance but, on the contrary, to evaluate the systems' performance through the users' interaction with them. As such, a combination of contextual inquiry and performance testing took place, where the users were asked to perform designed tasks (adapted to their type of impairment) with the ÆGIS prototype/proposed solution. Performance
testing scenarios were developed stepwise per prototype, and, where applicable, recommended execution times were defined for each step, to serve as thresholds for later analysis. On the basis of the scenarios, service diaries were developed to allow the testing supervisors to keep track of the users' performance testing while applying at the same time the Think Aloud or Co-Discovery protocols.

Table 3. First evaluation phase techniques and tools applied in Human Factors assessment

Evaluation technique | Tools to be used
Training workshops and consent forms | Consent forms signed by end-users, and training manuals
Questionnaires/Interviews addressing end-users | Subjective: Pre-test form
Performance testing combined with Naturalistic Observations and Contextual Inquiry, addressing end-users | Objective: Service diaries and video/sound/screen recordings, in combination with – Subjective: Think Aloud/Co-Discovery Protocol/open questions
Questionnaires/Interviews addressing end-users | Subjective: Post-test forms, including standard scales and prototype-specific questions
Focus Groups involving all types of experts | Subjective: Open questions, free discussion and questionnaire
4.2 Indicative Human Factors Assessment Results

The most important results, of a qualitative nature, emerging from the questionnaires, interviews and the Think Aloud/Co-Discovery open questions during the testing with users, as well as from the focus group sessions with experts, are summarised below per prototype/mock-up/solution tested in the first evaluation round. It should be highlighted that the results provided below are only indicative (the length of this paper prohibits a more extended discussion) and refer only to those prototypes/solutions developed within the context of the project.

Accessible Contact Manager: Overall, this prototype received positive feedback from users and unanimous overall appreciation from the focus group experts, despite the series of technical problems noticed, some of which related to controlling the phone. Users were specifically interested to know whether it will eventually also work on Symbian and Android (the prototype tested was developed in J2ME). Interoperability with communication aids was considered essential. Room for optimisation was identified in a number of functions: the scroll function should have a border on the side and be improved in general, an alphabet list should be added, it should be possible to adjust and customise settings, the most frequently used contacts should appear at the top of the list, better touch screen response is required, the small, unclear and hard-to-manage screen buttons need improvement (as do the icons and texts on some displays), better synchronisation of feedback messages is required, etc. The prototype was considered useful especially for people with mobility impairments (e.g. MS, degenerative muscle disease), because the pictures are easy to touch, and for elderly people, because
it works with pictures instead of small characters (they do not need an “elderly phone”), and, finally, for persons with mild cognitive disabilities. However, there are still some navigation problems making it hard for people with manual dexterity problems.

Concept Coding Framework Ooo Symbols: Bliss users seemed to benefit the most. The modifications proposed encompass placing the text under the symbols, the possibility to show symbols only, without text, to choose among multiple alternative symbols, to add one's own sets of symbols, to store preferences in a smart way, etc. Availability of languages other than English (e.g. Dutch) seems essential, whereas it would be interesting to be able to deal with phrases in some way. Focus group experts expressed interest and requested smooth interaction with TTS synthesis reading support.

DAISY Production: It was considered a good, free way to generate DAISY material through OpenOffice.org. But it became evident that the quality of the DAISY output is totally dependent on the quality, structure and tagging of the input document. Users were unable to find options in the prototype to select different audio compression settings. Lexicon files incorporated into the prototype would allow the user to correct errors in the audio, such as pronunciation. Proposals for improvement encompass a better set of “voices”, easier installation on networks, more graded example documents in ODT and DAISY form, easier choosing and finding of the location of the output DAISY files, provision of continuous feedback about the current stage in the execution of tasks, and the possibility to choose pitch and speed. Focus group experts commented in addition that support for document structure would be appreciated; also, that Odt2Daisy should enable the user to choose the voice, so as not to consume resources if the user already has a TTS installed and does not want to install a different one.
GnomeShell Magnifier: The “movable lens” feature was considered the most intuitive and useful, and the scroll-at-screen-edges feature very helpful when surfing the web. Proposed improvements encompass additional keyboard commands for scrolling the screen and changing the magnification level, more stable screen movement that avoids flickering, making the lens resizable so that it is customisable, a small window letting the user get an overview of the full screen, improved colour, contrast and visibility options for mouse and cursor, language availability, combination of the features with speech, and improved quality of the characters at large magnifications (scale was also a discussion point).

Haptic RIA maps: Some test sites (RNIB and EPR) did not manage to test this prototype, due to hardware compatibility issues and installation problems. However, high overall interest was expressed. Recommendations for improvement encompassed language support, support for the user in knowing towards which cardinal point s/he is heading, improved performance of the device to further refine the pointer and convey more sensations, and improved user feedback (vibration for street intersections and additional “push” info about directions or nearby places).

ÆGIS RIA Developer tool: According to the developers, the position of the different elements and application areas is simple and intuitive. However, the description of the
controls is not fully clear, at least for developers unfamiliar with these development environments. The focus groups highlighted the risk of throwing too many warnings and too much information at the user. Recommendations for improvement included direct editing of the styles of the components and language availability (other than English).

The aggregated results on User Acceptance (measured on a qualitative basis through the system acceptance scale [3]) are shown in the following figure as a further reflection of the results presented above. As shown, all prototypes were positively rated on both indices of the scale (“Usefulness” and “Satisfaction”), with the exception of the RIA Developer tool, the Open Speech Access to the GNOME Desktop environment and the Haptic RIA Maps prototypes.
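The acceptance scale cited above is a nine-item semantic-differential instrument yielding the two indices reported here. Purely as an illustration, the per-participant aggregation could be sketched as follows, assuming the commonly published scoring (items rated from -2 to +2; items 3, 6 and 8 mirrored; Usefulness the mean of items 1, 3, 5, 7, 9 and Satisfaction the mean of items 2, 4, 6, 8) — the official scoring sheet should be checked before reuse, and the ratings shown are hypothetical, not project data.

```python
# Sketch: turning Van der Laan-style acceptance ratings into the two
# indices ("Usefulness", "Satisfaction"). Item numbering, mirroring and
# the example ratings are assumptions for illustration only.

MIRRORED = {3, 6, 8}                 # items with the positive pole printed on the right
USEFULNESS_ITEMS = (1, 3, 5, 7, 9)
SATISFACTION_ITEMS = (2, 4, 6, 8)

def acceptance_indices(ratings):
    """ratings: dict mapping item number (1-9) to a score in -2..+2."""
    # Reverse-score the mirrored items so that higher is always better.
    scored = {i: (-r if i in MIRRORED else r) for i, r in ratings.items()}
    usefulness = sum(scored[i] for i in USEFULNESS_ITEMS) / len(USEFULNESS_ITEMS)
    satisfaction = sum(scored[i] for i in SATISFACTION_ITEMS) / len(SATISFACTION_ITEMS)
    return usefulness, satisfaction

# One hypothetical participant rating one prototype:
u, s = acceptance_indices({1: 2, 2: 1, 3: -2, 4: 1, 5: 1, 6: -1, 7: 2, 8: -2, 9: 0})
# u = 1.4, s = 1.25
```

Averaging these per-participant pairs across a test site then gives the kind of aggregated (Usefulness, Satisfaction) point plotted per prototype.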
When defining the tasks on which each prototype would be tested, in some cases and wherever applicable, the test site experts had recommended an average “acceptable” performance time for each task (and, in fact, for each step of the task) for each type of user. The performance of the users was recorded by the test conductors in service diaries formulated in advance, together with the average percentage of task completion and the average number of prompts each user needed in order to complete the task. Table 4 provides these aggregated results (at task level), with the originally recommended time also indicated wherever applicable. As shown in the table, the most considerable deviation is noticed for the Haptic RIA Maps prototype, whereas the most “difficult” prototypes for the users in general, judging from the average completion percentage, seemed to be the Accessible Contact Manager and the RIA Developer tool. However, especially in the Accessible Contact Manager case, the blame, as reported by the test conductors, should be put on the task complexity and not on the prototype itself. The number of prompts for every prototype was never more than 2, which is acceptable for the level of technical maturity reached at this first phase.
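The kind of per-task aggregation behind these results can be sketched as follows; the record layout, task name and sample values are illustrative assumptions, not the project's actual diary data.

```python
# Sketch: aggregating service-diary entries into per-task averages
# (completion percentage, prompts) and the deviation of the mean
# execution time from the recommended threshold.

from statistics import mean

# One diary entry per user and task:
# (task id, completion fraction 0..1, prompts given, execution time in seconds)
diary = [
    ("add_contact", 1.0, 0, 95.0),
    ("add_contact", 0.8, 2, 140.0),
    ("add_contact", 0.6, 1, 120.0),
]

# Threshold recommended in advance by the test-site experts (hypothetical value).
RECOMMENDED_TIME = {"add_contact": 100.0}

def aggregate(entries, task):
    rows = [e for e in entries if e[0] == task]
    avg_completion = mean(r[1] for r in rows)
    avg_prompts = mean(r[2] for r in rows)
    time_deviation = mean(r[3] for r in rows) - RECOMMENDED_TIME[task]
    return avg_completion, avg_prompts, time_deviation

completion, prompts, deviation = aggregate(diary, "add_contact")
# completion = 0.8, prompts = 1.0, deviation ≈ +18.3 s over the threshold
```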
Finally, the demonstration of the TTS engine was not that convincing. It is interesting to note that most people had not heard eSpeak before. Overall, prosody and punctuation got good ratings, whereas the lowest pitch and slowest rate were preferred. According to the focus group experts, it is good that it exists as a free, last-resort alternative, but it is applicable only for users who use it to support their own reading, e.g. blind users, and again only if no better alternatives are available. Also, it is not applicable for communication with other people, and certainly not for users with cognitive and/or perceptual problems. The voices are artificial and sound quite robotic and unnatural: the pronunciation, intonation, pauses and rhythm of the voice need improvement to make the speech more natural and closer to the human voice. Intelligibility must be improved, because currently many users do not understand the content correctly. It would also be appropriate to provide more speed options for the audio. The above confirms that a lot of work remains to be done by ÆGIS and the whole research community in this area (it should be remembered that this part of the evaluation served as one of the starting points for the optimisation and development work to be held in the context of the project, and not as a validation of work already done).
5 Conclusions, Lessons Learned and Next Steps

This paper presents indicative results emerging from the users' evaluation of the accessible solutions developed (and/or utilised as a starting point) in the ÆGIS IP project. The users participating in the first of the three evaluation rounds scheduled within the project represented all user clusters targeted by the project: 10 prototypes were evaluated with 185 users with impairments, 56 experts of various types (e.g. tutors, accessibility evaluators, consultants, etc.) and 7 developers (as end-users of the RIA Developer tool), in total across 6 test sites (in 4 countries). Overall, high interest was expressed by users in all prototypes tested. Users especially appreciated the availability of alternatives to commercial products and expressed their interest in participating in the next evaluation rounds of the project. The partial lack of local language adaptation of the prototypes was challenging in many cases, but, as reported by the test conductors, the users did make a big effort to
provide useful feedback to ÆGIS. Some of the prototypes were still at an early stage of development, with reduced functionality, which resulted in a limited evaluation of all the features expected. Despite the above, a great deal of valuable input was gathered to enable optimisation of the prototypes, but also of the procedure to be followed in the next evaluation rounds. The evaluation results of this phase (as well as of the upcoming phases) will be provided in the form of short guidelines to the developer teams of the project, to allow optimisation of their prototypes for the next round. In addition, the outcomes of the evaluation will finally enable the Consortium to come up with recommendations and feedback for standards in the relevant areas of research. Last but not least, for the evaluation process itself, the Consortium has received feedback on how to optimise the process to be followed and the supporting tools in the next phases across all aspects.

Acknowledgments. This work was partially funded by the EC FP7 project ÆGIS (Open Accessibility Everywhere: Groundwork, Infrastructure, Standards), Grant Agreement No. 224348.
References
1. Annex I - “Description of Work”, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing, September 8 (2008), http://www.aegis-project.eu/
2. Gkemou, M., Bekiaris, E., et al.: Novel framework and supporting material for the inclusive evaluation of ICT, Deliverable 1.5.1, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (May 2010)
3. http://www.hfes-europe.org/accept/accept.htm
An End-User Evaluation Point of View Towards OSS Assistive Technology

Maria Gkemou (1), Evangelos Bekiaris (1), and Karel Van Isacker (2)

(1) Hellenic Institute of Transport, Centre for Research and Technology Hellas, 6th Km Charilaou-Thermi Road, Thermi-Thessaloniki, PO BOX 60361, GR-57001, Greece
(2) Marie Curie Association, 31 Osvobozhdenie avenue, Karina, 1st floor, ap. 4, Plovdiv 4023, Bulgaria
{mgemou,abek}@certh.gr, [email protected]
Abstract. This paper describes the evaluation framework developed in the ÆGIS IP project (Open Accessibility Everywhere: Groundwork, Infrastructure, Standards; http://www.aegis-project.eu) of the 7th European Framework Programme, in the context of its overall User Centred Design plan, focusing on the experimental planning of the first of its three scheduled evaluation rounds. The ÆGIS evaluation framework may serve as a valuable manual for testing in the overall eInclusion area, beyond the narrow context of the project.

Keywords: eAccessibility, Open Source Software, iterative evaluation, User Centred Design.
open source based generalised accessibility support into the major mainstream ICT devices/applications domains, namely the desktop applications area, Rich Internet Applications (RIA), and the Java-based mobile devices domain. One of the very first activities of ÆGIS was a state-of-the-art survey of the European Assistive Technology (AT) industry and of the availability of past (European) surveys and data regarding end users' usage of and satisfaction with AT [5]. As this literature survey showed, AT is widely used and has in many cases improved the lives of many end-users. However, the data seem to indicate that a majority of people with disabilities (people with vision impairments seemingly being an exception) do not use AT or are simply unaware of existing AT, may lack appropriate training to use it properly and, if they do use it, are often disappointed with what it offers in relation to what they need. All findings emerging from this survey, such as the lack of local language versions of AT or of a common policy regarding reimbursement schemes, were cross-checked and confirmed against the findings of the studies undertaken by the project. They constitute first evidence that User Centred Design (UCD) is necessary in the context of AT/AAC prototype development, in order to place the end user, user organisations and support teams at the fulcrum of the overall iterative design and testing process. This is essential for a genuinely iterative approach to AT/AAC design and is being strictly followed within ÆGIS. This paper presents the evaluation approach established and followed in ÆGIS, as an integral part of the UCD implementation plan developed in the project, in order to accommodate the evaluation of the project's Open Source Software (OSS) solutions from the end-user point of view and beyond. The paper is structured as follows.
The second Chapter provides an outline of the targeted end-users and other users of the project outcomes; the third Chapter briefly presents the ÆGIS UCD plan, one stage of which is the iterative design and development stage; the fourth Chapter presents the evaluation framework and experimental planning, focusing on the first evaluation round conducted; and the fifth Chapter discusses the benefits and added value of the plan described herein.
2 Targeted User Groups

The number of people with disabilities in Europe is estimated at between 10% and 15% of the total population (between 50 and 75 million people in the EU27), which gives an idea of the number of Europeans at risk of exclusion, as well as of the number of potential beneficiaries of accessible Information and Communication Technologies (ICT). The prevalence of both disabilities and other minor functional limitations is strongly related to age; thus, the already high demand for eAccessibility solutions will increase substantially as the population ages. Meanwhile, population ageing is the most important demographic process in Europe in recent years, and it is expected to intensify sharply over the coming years due to two main factors: increasing life expectancy and falling birth rates. The findings of the Labour Force Survey and other surveys indicate that disability increases with age; approximately two-thirds of disabled people are elderly. On the basis of these facts, the major end-user groups targeted in ÆGIS, shaping its UCD, are the following:
• Mainstream ICT devices and services developers, who will use the ÆGIS OAF and tools to build accessible applications and services. In the context of the ÆGIS project several clusters of developers are considered, such as developers of Assistive Technology (AT), of operating systems, of mainstream applications, of web content and of mobile applications, as well as accessibility evaluators and manufacturers.
• People with disabilities, including the elderly, constituting the direct users of all accessible services and applications resulting from ÆGIS. The types of disabilities or diversity in functioning targeted within the project are blind and low-vision users, users with motor impairments (upper limb), cognitive impairments, learning difficulties, and hearing, speech and communication impairments. As a complement, and transversally, older people are being taken into account, above all insofar as functional impairments increasingly appear in old age. It should also be noted that throughout the project, and especially during the phase of personas construction (see the following Chapter), some combinations of the above groups (representing persons with multiple impairments) have emerged (and have also been represented in the project pilots).
• Other stakeholders and groups with an interest in the ÆGIS design processes, public or private, institutional or community groups, such as organisations and tutors of people with disabilities and elderly people, service providers (mobile connectivity and applications, web content providers, training institutions), resellers of software and other assistive technology, AT research centres, public bodies/government and governmental agencies, standardisation bodies, health care and emergency support service providers, public/private social security service providers and insurance companies.
3 ÆGIS UCD Plan and Starting Point for Evaluation

The UCD implementation plan of the project constitutes the cornerstone of the work in all its phases, from the user needs and modelling phases to the iterative design, development and evaluation of the ÆGIS outcomes. Emerging from a thorough literature survey and an identification of all applicable methods for each UCD phase, it constituted the basis for the framework for the iterative evaluation of the project outcomes, in three phases and at 4 test sites (Belgium, Spain, Sweden and the UK). The UCD implementation plan as proposed by ÆGIS has four sequential phases. In the first phase, the analysis of the users (both end-users with disabilities and developers and experts), their tasks and their contexts is the main focus; based on these analyses, insights into the problems and needs of the users are collected. Central to phase two is the translation of these problems and needs into a format which can be used throughout the remainder of the project, such as personas, use cases, user scenarios, user requirements, conceptual models, etc. In the third phase, the conceptual models are developed into prototypes in an iterative, co-design approach; the prototypes, of increasing fidelity, are iteratively evaluated with users. In the fourth phase, the final working prototype resulting from the previous phase is tested in the field. The four phases are schematically presented in Fig. 1. The interdependencies between the research activities are indicated in the overall context of the iterative
process. The four phases are followed for each of the ÆGIS application domains (i.e. mobile, desktop, Rich Internet Applications). When possible, the UCD activities run in parallel for each of these domains, or are combined. For instance, in the first phase, the user, task and context analyses were conducted simultaneously for all application domains.
Fig. 1. The UCD phases and the interdependencies between the research activities as defined by ÆGIS [4]
The first two phases of the UCD plan correspond to the overall “Modelling the User” stage of the project, as indicated in the following figure, whereas the third and fourth phases of the UCD plan constitute the “Iterative Design and Development” stage. It should be mentioned that the UCD techniques applicable for ÆGIS in each phase emerged from an extensive literature survey held at the very beginning of the project [4]. All tangible results of UCD Phases 1 and 2, encompassing the ÆGIS use cases, personas, application scenarios and conceptual models, have constituted the basis for the specific application scenarios and experimental plans that oriented the evaluation of the first 10 prototypes of ÆGIS in the first evaluation round, which took place from May 2010 until the end of July 2010. This was only the first of the three evaluation rounds scheduled in the project in the context of the iterative design and development stage.
4 Iterative Evaluation Framework and Experimental Planning

4.1 Iterative Evaluation Framework

ÆGIS has developed a horizontal evaluation plan involving several sequential iterative heuristic evaluations with both experts (representing various stakeholder groups) and end-users (users with disabilities and developers), as appropriate, at various stages of the development lifecycle of all the proposed prototypes and applications. Sequential evaluations involve a series of evaluation techniques that run in sequence, such as contextual inquiry, naturalistic observations, performance testing, technical validation, etc. [3]. The iterative nature of testing is one of the core concepts maintained throughout the project. More specifically, ÆGIS is committed to building user-based and technical validation into all stages of the development lifecycle, from the very first prototypes until the pre-release stage; this is the major reason for scheduling 3 evaluation rounds and one final demonstration phase in the context of the project. After each phase, the evaluation outcomes will provide valuable feedback to the various design and development teams of the project, whereas the experience gained by the distributed evaluation groups will be consolidated and documented appropriately in order to serve as reliable, yet raw, input to standards and exploitation plans.

4.2 Applicable Evaluation Categories

The applicable evaluation categories of ÆGIS, as currently identified, are the technical validation, conducted by the developers; the Human Factors assessment, enabled through the participation of end-users (developers and users with disabilities) and several types of experts; and the socio-economic and impact assessments, which will be performed off-line, utilising feedback from the three evaluation rounds such as Willingness to Have (WTH) and Willingness to Pay (WTP) data (to be gathered mainly in the 2nd and 3rd evaluation phases).
The high-level objectives of the four targeted evaluation types are presented in the table below, where, in each case, the user groups involved directly or indirectly in that type of evaluation are also indicated.

4.3 First Evaluation Phase Prototypes and Experimental Plans

In the context of the first evaluation round, 10 prototypes were tested with users and 5 of them also underwent a technical validation. During the first 12-16 months of the ÆGIS project, 6 preliminary prototypes were developed by the respective development teams in ÆGIS and reached an adequate maturity level to be included in the first evaluation phase of ÆGIS. These are the “Accessible Contact Manager and Phone Dialer”, the “Concept Coding Framework Ooo Symbols”, the “DAISY Production”, the “GnomeShell Magnifier”, the “Haptic RIA maps” and the “ÆGIS RIA Developer tool”. In addition to the above prototypes, 4 concepts/prototypes developed outside of ÆGIS were selected for testing, either because they were considered a basis for future relevant ÆGIS implementation, so that their evaluation was critical for ÆGIS in order to
Table 1. ÆGIS high level assessment objectives
decide whether the project should comply with them and to what extent, or because their testing practically constituted the first step of further development (as is the case for the Text To Speech sample files evaluation). These are the “Open Speech Access to the GNOME Desktop environment” by Sun/Oracle, the “AIM Real-Time Instant Messenger” by AOL, the “Oratio for Blackberry” by RIM and, finally, a collection of Text To Speech sample files for language evaluation. Before structuring the specific experimental plan for each prototype/solution, a manual that would serve as a supporting tool during the training and test sessions was developed for each one, indicating: contact details of the development team in the project, reference to the corresponding project workplan items under which the respective work has been held, reference to the final official or internal documentation reporting the methodology and outcomes of this work, reference to the relevant use cases of the project, a description of the aim of the prototype, the clusters of users addressed (following the project's classification presented in Chapter 2 of this document), a short functional description of the prototype, the technologies deployed, limitations/restrictions/dependencies, hardware and software requirements, guidelines for installation and use with specific interaction examples, and, finally, major mitigation actions for the most common problems that could arise during use and testing.
On this basis, a very specific validation plan was designed for each prototype/solution, consisting of the following:
• Regarding the Human Factors Assessment: the HF research hypothesis (applicable for the 1st phase); an experimental plan outlining the key indicators, the metrics and ways of measurement, the success thresholds set by the ÆGIS Consortium per high-level objective for the HF assessment (also presented in Section 4.2 of this paper), the overall process to be followed, and, finally, the tasks that would orient the performance testing phase.
An End-User Evaluation Point of View Towards OSS Assistive Technology
• Regarding the Technical Validation: the technical validation research hypothesis (applicable for the 1st phase), the main technical evaluation objectives, key indicators, measuring tools and success thresholds, measuring conditions, and a short description of the technical validation (in the lab).

In addition, it was specified which test site would test which prototype/solution, with which cluster of users and how many of each, whereas, for the technical validation, the conductor and site were also defined (in most cases the original development team itself).

4.4 First Evaluation Phase Human Factors Assessment

Taking as a starting point the results of the first two UCD phases, an investigation was conducted of all evaluation techniques, approaches and tools applicable to ÆGIS. On this basis, the techniques and tools, both conventional and novel, most relevant to the specific prototypes of the 1st evaluation phase were identified in the context of the validation plans of the first phase. The evaluation approach followed in ÆGIS encompassed all types of users who interact directly or indirectly with its solutions (discussed in Chapter 2). After being informed of the scope of the evaluation, completing the consent form (in line with the concrete Ethics Policy established in the project) and attending the training session, the users were asked to fill in a pre-test form (different for each prototype). ÆGIS aimed to gather both subjective and objective measurements, not in order to evaluate the users' performance but, on the contrary, to evaluate the systems' performance through the users' interaction with them. As such, a combination of contextual inquiry and performance testing took place, in which the users were asked to perform designed interaction tasks (varying according to their type of impairment) with the ÆGIS prototype/proposed solution.
The starting point for the task descriptions was the project's initial Use Cases, Personas, application scenarios and conceptual models from the “Modelling the User” stage of the project. For each stepwise performance testing scenario developed per prototype, recommended execution times were defined for each step, to serve as thresholds for later analysis. On the basis of the scenarios, service diaries were developed to allow the testing supervisors to keep track of the performance testing. The respective task descriptions were given to the users verbally and in writing (when not prohibited by the type of impairment) prior to the testing (during the training phase). In all cases, except for the TTS sample files evaluation, the Think-Aloud protocol was applied. Also, in the case of the AIM prototype, the Co-discovery protocol was applied to every pair of users: since it is a realtime-text communication tool, it was considered interesting to monitor the interaction of each user with the application when both are using it at the same time. After the performance testing was over, the users were asked to fill in a post-test form, which consisted of 3 standard scales (one for user acceptance, one for accessibility evaluation and one for workload evaluation) and prototype-specific questions. In cases where the users could not read, or were unable for any other reason to answer the questionnaire, an interview on the same questions was conducted instead, whereas, whenever applicable, sign language interpreters were engaged at the test sites to interact with the user
(following the principles of the Ethics Policy of the project). In general, it is important to highlight that, although the questionnaires (both pre- and post-test) were handed to the users in printed form (or another form, according to impairment), each case was handled individually as an interview and, if needed, additional comments and open questions were raised by the interviewer and written down for further consideration during the consolidation of results. It should be noted that in some cases different questions were formulated for different user groups, even for the same prototype, in both the pre-test and the post-test questionnaires.

Closing the evaluation, focus groups were formed locally at each test site, consisting of experts, advanced users acting as experts, evaluators, tutors, trainers, etc. Each focus group was guided by one moderator from the respective test site. Each focus group session began with a presentation of the prototype (or prototypes) by the moderator. Then the participants were invited to experiment with the prototype freely. After that, the moderator posed some key questions to the participants in order to open a discussion (varying per prototype). Notes were kept by the moderator (or an assistant) during this process. In some cases, the focus group discussion was also videotaped, with the permission of the participants. At the end, a short questionnaire was completed by each focus group participant (with different questions for each prototype). In order to facilitate the moderators' task, specific guidelines were issued in advance.

Last but not least, an important aspect of the ÆGIS research and development activities, and, more specifically, of the evaluation activities, is the monitoring of compliance with the project's ethics policy.
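The performance-testing bookkeeping described earlier, where recommended execution times per scenario step serve as thresholds for later analysis, can be sketched as a simple check. The step names and times below are purely illustrative, not taken from the project's service diaries:

```python
# Illustrative sketch: flag performance-test steps whose recorded
# execution time exceeded the recommended per-step threshold, as a
# testing supervisor's service diary would record them.
def flag_slow_steps(recorded, recommended):
    """recorded / recommended: dicts mapping step id -> seconds."""
    return [step for step, t in recorded.items()
            if t > recommended.get(step, float("inf"))]

# hypothetical scenario steps for a contact-manager task
recommended = {"open_app": 10, "find_contact": 20, "dial": 15}
recorded = {"open_app": 8, "find_contact": 35, "dial": 12}
print(flag_slow_steps(recorded, recommended))  # → ['find_contact']
```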
An Ethics Policy has been established in ÆGIS, with which evaluation, among other activities, must be fully compliant. Local Ethics Committees were formed at each test site to apply the Ethics Manual and to monitor the trials' adherence to it [2]. Part of the Ethics Policy developed in the project were the instantiated consent forms and the processes to be followed for each impairment-specific user group (during training and testing sessions).

4.5 First Evaluation Phase Technical Validation

On the other hand, in the context of the technical validation, the developers themselves have also measured (to the extent enabled by the maturity of the prototypes) their systems' performance, and their results have been compared and consolidated with the user trials' results where comparable. Again, a very specific technical validation plan was formulated for each prototype undergoing technical validation, identifying the technical key indicators (applicable for this phase), the measuring methods and tools, and the success thresholds for each metric. However, it should be noted that all of the above will be better performed in the second and third evaluation rounds of the project (although they were also planned for the first round and partially executed). The improved fidelity of the 1st-phase prototypes and the additional medium-fidelity (Me-Fi) to high-fidelity (Hi-Fi) prototypes to be tested in the 2nd phase will enable the incorporation of more sophisticated measurements and tools and, as such, more valuable results.
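The validation plans described in the preceding sections (research hypothesis, key indicators, metrics and ways of measurement, success thresholds, performance-testing tasks) could be represented roughly as follows. The field names are illustrative assumptions, not taken from the ÆGIS deliverables:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of one per-prototype validation plan; every name
# here is illustrative, not the project's actual data model.
@dataclass
class Indicator:
    name: str                 # key indicator, e.g. task completion time
    metric: str               # unit of measurement
    measurement: str          # measuring method or tool
    success_threshold: float  # threshold per high-level objective

@dataclass
class ValidationPlan:
    prototype: str
    hypothesis: str                            # research hypothesis (1st phase)
    indicators: list = field(default_factory=list)
    tasks: list = field(default_factory=list)  # performance-testing tasks

plan = ValidationPlan(
    prototype="Accessible Contact Manager and Phone Dialer",
    hypothesis="Users can complete a call task within the target time",
)
plan.indicators.append(Indicator(
    name="task completion time",
    metric="seconds",
    measurement="screen recording",
    success_threshold=60.0,
))
print(len(plan.indicators))  # → 1
```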
4.6 First Evaluation Phase Participants and Their Recruitment

The 10 prototypes were evaluated with a total of 185 users with impairments, 56 experts of various types (e.g. tutors, accessibility evaluators, consultants, etc.) and 7 developers (who tested the ÆGIS RIA developer tool). Tests were conducted across 6 test sites (in 4 countries), namely in Belgium by EPR and KUL, in Spain by FONCE, in Sweden by SU-DART and in the UK by ACE and RNIB. It should be noted that the users at each test site tested more than one prototype, which implies that the number of testing sessions was much higher than the absolute number of users participating in the assessment. The approach followed for the participants' recruitment was defined from the very beginning of the project, on the basis of a series of selection criteria, each with a threshold defined as a minimum acceptable percentage. The selection criteria addressed gender (targeting as equal a representation of males and females as possible; 40% acceptable minimum), age (25% in each of the 18-34 and 35-54 age groups and 15% in the >55 age group), previous experience in using mobile phones (60%), previous experience in using computers (70%), previous experience in using the internet (40%), previous experience with AT (50%), and participation in similar activities (30%). Finally, it should be mentioned that, from the very beginning of ÆGIS, it was decided that 50% of the same users would be engaged across all evaluation rounds and studies.
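The minimum-percentage selection criteria above lend themselves to a simple quota check. The percentages below come from the text, but the checking function and the sample counts are illustrative assumptions:

```python
# Minimum acceptable shares per recruitment criterion, as quoted in the
# paper; the checking logic and the example counts are illustrative.
MIN_SHARE = {
    "female": 0.40, "age_18_34": 0.25, "age_35_54": 0.25, "age_55_plus": 0.15,
    "mobile_experience": 0.60, "computer_experience": 0.70,
    "internet_experience": 0.40, "at_experience": 0.50,
    "similar_activities": 0.30,
}

def unmet_quotas(counts, total):
    """counts: dict criterion -> number of recruited participants meeting it.
    Returns the criteria (with their achieved share) that fall below minimum."""
    return {c: counts.get(c, 0) / total
            for c, minimum in MIN_SHARE.items()
            if counts.get(c, 0) / total < minimum}

# hypothetical sample of 185 users (the paper's total)
counts = {"female": 80, "age_18_34": 50, "age_35_54": 60, "age_55_plus": 30,
          "mobile_experience": 120, "computer_experience": 140,
          "internet_experience": 90, "at_experience": 100,
          "similar_activities": 60}
print(unmet_quotas(counts, 185))  # → {} (all quotas met in this example)
```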
5 Conclusions

This paper presents the evaluation framework established in the ÆGIS FP7 project, in the context of its iterative UCD plan, to support the testing activities that have been and will be performed in three evaluation rounds and one final demonstration phase. The focus has mainly been the first evaluation round of the project, in the context of which 10 OSS prototypes/solutions were tested. The experimental plans, the testing, and the participant recruitment and ethics processes followed have been explained in brief. The first evaluation round has been completed, and the consolidated results, from both the technical validation and the human factors assessment testing with users, are available. The 10 prototypes were evaluated with a total of 185 users with impairments and 61 other users, representing various types of experts (e.g. tutors, accessibility evaluators, consultants, etc.) and developers. Tests were conducted across 6 test sites (in 4 countries), namely in Belgium by EPR and KUL, in Spain by FONCE, in Sweden by SU-DART and in the UK by ACE and RNIB. It should be noted that, although a very specific experimental plan and measuring tools were developed for each of the solutions tested, the overall approach followed was common across all prototypes and all test sites, in order to allow meaningful comparisons in the later analysis of the results. Contextual inquiry, naturalistic observation techniques, performance testing and focus groups were the evaluation techniques deployed in the first evaluation round. The evaluation framework presented in this paper will most probably be revisited according to the feedback derived during the first tests. The same is true for the supporting measuring tools. As such, the tests of the first round (and of each subsequent
round) will serve not only for the optimisation of the accessible solutions tested, but also for the optimisation of the evaluation framework and plan itself. To conclude, it should be noted that the usefulness of the evaluation framework established for the needs of the ÆGIS project is not limited to the relatively narrow context of the project; on the contrary, it should be seen as a useful guide for testing in the whole eInclusion area.

Acknowledgments. This work was partially funded by the EC FP7 project AEGIS (Open Accessibility Everywhere: Groundwork, Infrastructure, Standards), Grant Agreement No. 224348.
References
1. Annex I - "Description of Work", ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing, September 8 (2008), http://www.aegis-project.eu/
2. Bekiaris, E., G(k)emou, M., et al.: ÆGIS Ethics Manual, Deliverable 5.6.1 (revision for M18), ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (May 2010)
3. G(k)emou, M., Bekiaris, E., et al.: Novel Framework and Supporting Material for the Inclusive Evaluation of ICT, Deliverable 1.5.1, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (May 2010)
4. Pajo, S., et al.: User Groups' and Stakeholders' Definition and UCD Implementation Plan, Deliverable 1.1.1, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (February 2010)
5. Sulmon, N., et al.: Analysis of the Context of ICT Use, Deliverable 1.1.2, ÆGIS project, CN. 224348, 7th Framework Programme, ICT & Ageing (February 2010)
A Method to Automate the Ranking of Web Pages According to User Defined Accessibility Ratings

Alice Good
School of Computing, University of Portsmouth, PO1 3AE, UK
[email protected]
Abstract. This paper presents the final results of an investigation into a means of automating the rating of web pages according to their accessibility to specific user groups, including users with visual impairments, restricted mobility and dyslexia. The research comprises three integrated, user-centred studies that contributed to the development of this work. The data collected will help to develop a better method for disabled users to search for and easily locate accessible web pages. The research investigated how web pages can be rated for accessibility using specific algorithms designed according to user defined ratings of accessibility. The results presented in this paper demonstrate that re-ordering search results, by ranking web pages according to user defined ratings, could provide a better user experience for people with disabilities.

Keywords: Accessibility, Algorithms, Disabilities, HCI.
adjustment into the design process. Consequently, poor design leads to poor accessibility [6,7]. With age comes increased disability: in the elderly, vision and mobility deteriorate over time. With an aging population, the severity of this problem continues to grow: a greater percentage of people alive today are likely to be experiencing age-related impairments than in the past. There is a definite need to 'bridge the gap' between elderly users and the Internet, and with almost a fifth of the UK population aged over 65, it is a significant market to exclude [8]. Many of these people use the Internet to control their finances, keep up to date with current affairs and stay in contact with family and friends. The Internet can therefore be said to be a valuable tool in reducing the risk of social exclusion, a concern that many of the elderly face. Considering that the elderly are very likely to develop age-related impairments such as visual and/or mobility-related disabilities, web accessibility becomes an important consideration. Initiatives may be in place to encourage the elderly to get online, but poor web accessibility is likely not only to affect the user's ability to access information but also to create a less than favourable web experience.

Better solutions are needed to improve accessibility for special user groups. Much research has been dedicated to devising ways of modifying content according to users' needs. The focus has thus been placed firmly upon the alteration of content, a process that retains a certain amount of fallibility due to its dependence upon correctly coded pages, i.e. web pages that have been written with the relevant HTML (for example 'alt' tags) so that screen readers are able to inform the user. However, correctly coded, accessible web pages can be difficult to find.
There needs to be a practical and effective way to guide users to accessible content. One possible solution could be adapting the order of search results according to user needs. This would enable a means of retrieving accessible information best suited to the users' needs and disabilities. Furthermore, such a method would contribute towards a better user experience.

1.1 Inter-related Studies

This research has featured a series of inter-related studies, which are outlined below.

Study One. In the initial requirements phase, user defined ratings of web page elements known to affect accessibility for specific user groups were obtained [9]. The study included participants with a wide range of sensory, physical and cognitive impairments. Participants were asked to specify and rate, with a score of one to three, the elements they felt affected the accessibility of web pages. Many of the rated elements had previously been specified by the WAI; this was particularly true for the visually impaired participants.

Study Two. These ratings formed the basis of a set of algorithms designed for each user group, using the user defined web page elements that reduce accessibility, as outlined in study one [9]. The objective of the algorithms was to analyse the accessibility of a web page for a specific user group and assign a rating based upon the page's accessibility. Using these algorithms, users can enter a search query and the Web results are then re-ordered according to how accessible they are to that user group.
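A rating scheme of the kind study two describes, in which user-defined severity ratings of one to three per element are combined into a per-group page score, might look like the following sketch. The element names and weights are assumptions for illustration, not the paper's actual algorithms:

```python
# Illustrative sketch: combine user-defined 1-3 severity ratings for
# problematic page elements into a single accessibility rating for a
# user group; a higher rating means a more accessible page.
GROUP_WEIGHTS = {  # hypothetical weights, not the study's real values
    "blind": {"missing_alt_text": 3, "layout_tables": 2, "unlabelled_forms": 3},
    "short_sighted": {"fixed_font": 3, "low_contrast": 3},
}

def accessibility_rating(page_elements, group, max_rating=10):
    """page_elements: set of problem elements detected on the page."""
    weights = GROUP_WEIGHTS[group]
    penalty = sum(w for el, w in weights.items() if el in page_elements)
    return max(max_rating - penalty, 0)

print(accessibility_rating({"fixed_font"}, "short_sighted"))  # → 7
```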
Study Three. The focus of this paper is to present the final stage of this exploratory study, namely study three. The aim of this study was to evaluate the effectiveness of the algorithms in re-ordering search results according to user-perceived accessibility, in comparison with the Google search engine. An empirical evaluation was conducted using the same participants as in study one. The purpose was fundamentally to assess whether participants had a better Web experience using the Algorithm Based Search.

1.2 Algorithm Based Search

The proposed system, known as Algorithm Based Search, incorporates a user model specific to the disability group as well as a web model. The information in the user model is used to determine which algorithm is used to rate the accessibility of a web page. The type of algorithm applied depends upon whether the user is blind, short-sighted, motor restricted or dyslexic; the user supplies this information by specifying which of these user groups best fits their needs. A database acts as the web model, representing the accessibility of each page and storing attributes such as the URL and the accessibility rating of each web page. Each web page is analysed according to the constraints incorporated within the algorithm. When a user submits a query, the system refers to the user model to establish which algorithm is assigned. The algorithm then individually analyses the first 20 web pages that the search engine returns and inputs the URL and the accessibility rating it has given each page into the database. The system then ranks the web pages according to the assigned ratings and presents the 20 pages to the user as a list of URLs. Pages with high ratings are considered more accessible than those with low ratings.
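The query-time flow described above (rate the first 20 results, store each URL and rating in the web model, present the list ordered by rating) can be sketched as follows. `fetch_results` and `rate_page` are hypothetical stand-ins for the search-engine query and the group-specific rating algorithm:

```python
# Minimal sketch of the Algorithm Based Search flow; not the paper's code.
def rerank(query, user_group, fetch_results, rate_page, n=20):
    urls = fetch_results(query)[:n]          # first n search results
    web_model = {url: rate_page(url, user_group) for url in urls}  # DB role
    # present URLs ordered by accessibility rating, highest first
    return sorted(web_model, key=web_model.get, reverse=True)

# usage with toy stand-ins for the search engine and rating algorithm
results = rerank(
    "scottish castles", "blind",
    fetch_results=lambda q: ["a.com", "b.com", "c.com"],
    rate_page=lambda url, g: {"a.com": 2, "b.com": 9, "c.com": 5}[url],
)
print(results)  # → ['b.com', 'c.com', 'a.com']
```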
2 Method

A simulator was built using the algorithms created from the user-defined ratings of accessibility obtained in the first study of this research [9]. Each disability user group had its own algorithm: blind, short-sighted, motor restricted and dyslexic. The simulator was built in HTML. It enabled a user to specify which user group they were in and the content of each of the two search queries (Scottish castles and health information). Twenty web pages were analysed in advance according to the rating system incorporated within the algorithm. For example, for a short-sighted user, the user-defined rated elements from the first study found to affect accessibility were fixed fonts and low contrast between background and font colour. The simulator allowed the user to switch between the set of results ordered according to the applied algorithm and the set of results provided by Google. Each user set consisted of 10 search results per query. There were two search queries, so each user set had two sets of results, one for 'Scottish castles' and one for 'Preventing asthma'. There was also the default set of search results provided by Google for each search query.
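One plausible concrete check behind the short-sighted algorithm's "low contrast" element is the WCAG contrast-ratio formula between foreground and background colours. The formula below is the standard WCAG relative-luminance computation; that this is how the study measured contrast is an assumption, since the paper does not specify it:

```python
# WCAG 2.0 relative luminance and contrast ratio (sRGB, 0-255 channels).
def luminance(rgb):
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Ratio from 1:1 (identical) up to 21:1 (black on white)."""
    l1, l2 = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # → 21.0
```

A page whose computed ratio falls below a chosen threshold would trigger the "low contrast" penalty in the group's rating.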
For this study, an empirical investigation was used to measure the effectiveness of the system in directing users to accessible pages. The system has been designed to improve the accessibility of search results for disabled users compared to a traditional search engine. In this study, users were asked to rate only the first ten pages of the search results presented, as this amount was deemed a suitable sample; Google and most other online search engines limit results to 10 per page.

2.1 Participants

This study included 31 participants. It is possible to obtain excellent results using small sample groups, as research undertaken by Nielsen and Landauer reveals: sample groups consisting of no more than five people can yield meaningful data, and frequent tests with smaller groups can produce better results than large-scale studies [10]. Participants ranged in age from 16 to 65; both males and females were included. Computer skills ranged from novice to intermediate. All users reported that they experienced impairments affecting the ease with which they were able to access web-based information.

2.2 Procedure

The participants were asked to perform two searches for specific information using the Internet. They were asked to enter a search query regarding a health-related issue, first using the algorithm based search and then using Google search. Users were then asked to rate the pages, using a Likert scale, in terms of ease of access and suitability of content.
3 Results

Respondents were asked to rate web pages in terms of ease of access and precision of content using both the Algorithm Based Search system and Google. The results indicated statistically significant differences between respondents' perceptions of the ease of access and content of the web pages under Google search and Algorithm Based Search.

3.1 Comparison of Responses

The Likert-scale responses were converted onto a numerical scale, with 'strongly disagree' responses assigned the minimum weight of 1 and 'strongly agree' responses assigned the highest weight of 5. The differences in responses between the Algorithm Based Search and Google were calculated for each respondent and each web page. Differences in responses were recorded for each web page to analyse how the behaviour of the respondents changed as they recorded their responses for pages 1-10. A flat line above the zero mark would be the ideal outcome, as this would suggest that respondents consistently rated the Algorithm Based Search results better than the Google results. The mean, median and standard deviation were calculated for the difference in
responses for each web page. Figs. 1 and 2 show the mean, median and standard deviation of the difference in responses for each web page.

Health Information Search. As Fig. 1 shows, the mean, median and standard deviation were generally positive, with pages 8 and 6 being the exceptions, where respondents felt that the Google search results were better than the algorithm based search results. Also, the difference in responses was not consistent, showing 4 crests and 4 troughs. Respondents reported particularly negative responses for web page 6, while responses for web page 8 were only marginally negative. However, the graph suggests that overall the respondents felt that the Algorithm Based Search results were better than the Google search results in terms of ease of access and suitability of content.

Scottish Castle Search. Fig. 2 suggests that respondents felt positive about the algorithm based results, but their responses turned negative as they moved down the search results. This may be because the Scottish castle related web pages were very graphical, with many images; a high ratio of graphics to text creates accessibility issues for blind users. Because the algorithm checks for fonts and contrast, two attributes which do not greatly affect images, the algorithm results lower down the order may have been less appealing to the respondents than the Google results. Nevertheless, the respondents did report a significant and positive improvement in results with the algorithm.
[Figure omitted: per-page (Page 1-10) standard deviation, mean and median of response differences]

Fig. 1. Health Information Search Statistics
[Figure omitted: per-page (Page 1-10) standard deviation, mean and median of response differences]

Fig. 2. Scottish Castle Search Statistics
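The Likert conversion and per-page differencing described in Section 3.1 can be sketched as follows (the response values are illustrative, not the study's data):

```python
# Map Likert labels to 1-5 weights and take per-respondent, per-page
# differences between the Algorithm Based Search and Google ratings.
LIKERT = {"strongly disagree": 1, "disagree": 2,
          "neither agree nor disagree": 3, "agree": 4, "strongly agree": 5}

def differences(algo_responses, google_responses):
    """Each argument: list of Likert labels, one per web page."""
    return [LIKERT[a] - LIKERT[g]
            for a, g in zip(algo_responses, google_responses)]

diffs = differences(["agree", "strongly agree", "disagree"],
                    ["neither agree nor disagree", "agree", "agree"])
print(diffs)  # → [1, 1, -2]
```

Positive values mean the respondent rated the algorithm based result higher for that page, matching the "flat line above zero" ideal described above.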
3.2 Statistical Testing of Results

The Wilcoxon signed ranks two-tailed test was used to determine differences between users' responses on which search method was more convenient to them in terms of accessibility and in terms of finding the information they were looking for. This non-parametric test of significance was considered most appropriate because the measure uses an ordinal scale. The Wilcoxon signed ranks test approximates a z test. To further determine the magnitude of differences, all analyses were conducted at the p < .01 significance level. We test the hypotheses:

H0: The median response is 3, i.e. neither agree nor disagree.
H1: The median response is >3, i.e. the respondents felt a noticeable enhancement in results when searching using the algorithm compared to searching using the Google search engine.

The following summary statistics were reported:

Median: 10, Mean: 9.612903, Standard deviation: 3.116381
Median: 14, Mean: 16.22581, Standard deviation: 8.452651

The Wilcoxon signed rank test was then performed using SPSS version 11.5.
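The paper used SPSS for the test. For illustration, the W+ and W- statistics it reports can be computed with a small pure-Python routine; this is the standard textbook computation (rank the absolute non-zero differences, with tied values receiving average ranks, then sum the ranks of positive and negative differences), not the authors' code:

```python
# Wilcoxon signed-rank statistics: returns (W+, W-, n of non-zero diffs).
def wilcoxon_w(diffs):
    nonzero = [d for d in diffs if d != 0]      # zeros are discarded
    ordered = sorted(nonzero, key=abs)           # rank by |difference|
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg = (i + 1 + j) / 2                    # average rank for tied |d|
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(r for k, r in ranks.items() if ordered[k] > 0)
    w_minus = sum(r for k, r in ranks.items() if ordered[k] < 0)
    return w_plus, w_minus, len(nonzero)

print(wilcoxon_w([1, 2, -1, 3, 2]))  # → (13.5, 1.5, 5)
```

The smaller of W+ and W- is then compared against the tabulated critical value for the given n, exactly as done in the critical-value comparisons below.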
Health Information Search. The results obtained for the health related query were: W- = 72.50, W+ = 227.50, N = 24, p <= 0.01346. The large difference between W+ and W- suggests that we can reject the null hypothesis: the users did experience a difference in browsing experience under the two settings, Google search and Algorithm Based Search. This conclusion was confirmed by the critical value comparison. The selected (smaller) value for this search is W-. The critical value in the Wilcoxon signed rank test table for n = 24 at the 95% significance level is 91. W- = 72.50 and Tcritical (95% significance) = 91. Thus
W-(72.50) < Tcritical (95% significance)
This means that we can reject the null hypothesis that respondents' responses for the two searches are identical. The higher value of W+ suggests that more respondents felt that their browsing experience was enhanced by using the algorithm based search; if the respondents had felt a negative difference in browsing experience, we would have obtained a higher value of W-. In conclusion we can say that, at the 5 percent significance level, the respondents felt that, for the health related query, the algorithm based search results were better in terms of accessibility and content than the Google search results.

Scottish Castle Search. The results obtained for the Scottish castle related query were: W- = 109.50, W+ = 268.50, N = 27, p <= 0.02760. Again, there is a large difference between the W+ and W- values, suggesting that we can reject the null hypothesis: the users did experience a difference in browsing experience under the two settings, Google search and algorithm based search. This can only be confirmed by the critical value comparison. The selected (smaller) value for this search is W-. The critical value in the Wilcoxon signed rank test table for n = 27 at the 95 percent significance level is 119. W- = 109.50 and Tcritical (95% significance) = 119. Thus
W- < Tcritical (95% significance)
This means that we can reject the null hypothesis that respondents' responses for the two searches are identical. The higher value of W+ suggests that more respondents felt that their browsing experience was enhanced by using the
Algorithm Based Search. If the respondents had felt a negative difference in browsing experience, a higher value of W- would have been obtained. In conclusion it can be said that, at the 5 percent significance level, the respondents felt that, for the Scottish castle related query, the Algorithm Based Search results were better in terms of accessibility and content than the Google search results.
4 Conclusions and Recommendations

The research has provided a means of rating Web content according to user needs. The results presented in this paper indicate that, overall, participants had a better user experience when using the Algorithm Based Search. The algorithms developed will also go some way towards improving overall accessibility by creating a filtering system that identifies and rates pages according to the user's accessibility needs. One of the objectives of this work was to develop these algorithms with the input of those directly affected by accessibility issues; it is believed that these individuals possess the knowledge and experience necessary to assist in addressing the problem of accessibility. While the project has resulted in significant gains, there is considerable scope for future research. It is necessary to create algorithms for other impairments, such as auditory impairments and cognitive impairments other than dyslexia. Additionally, specific visual impairments, apart from blindness and short-sightedness, also need to be addressed. Most importantly, algorithms need to be developed to meet the accessibility needs of users with multiple impairments. This is a particular challenge, as it will require reconciling the competing aims of different forms of impairment. In spite of the expected degree of difficulty, this research is essential: multiple levels of disability affect many users. For example, the growing number of elderly users face a host of impairments as they age: eyesight fails, arthritis sets in, and other neurological conditions may develop.

4.1 Recommendations

As a result of the research undertaken for this project, the following recommendations for further research are suggested.
- Develop algorithms that allow for more elements of web pages known to affect accessibility to be included.
- Enable user-specific algorithms to offer different levels of search according to the extent of accessibility checking required, for example low, medium and high.
- Develop algorithms that allow users to specify which elements they wish to check for.
- Perform the study on a larger group of users.
- Conduct a comparative study between Google's accessible search engine and the algorithm based search for users with visual impairments.
- Develop additional algorithms to address the needs of users with impairments not covered by this research.
- Develop algorithms to meet the needs of users with multiple impairments.
- Refine the simulator used to test the algorithms to ensure that it functions at a higher level.
- Implement a system based upon the simulator and test its performance with live searches.
References
1. Lazzaro, J.: Adaptive Technologies for Learning and Work Environments. American Library Association, USA (1993)
2. Webb, C.: Ensuring the Internet Revolution Reaches the Blind (2001), http://www.theage.com.au/news/2001/01/22/FFX5ORDL7IC.html
3. Hackett, S., Parmanto, B., Zeng, X.: Accessibility of Internet Websites Through Time. ACM SIGACCESS Accessibility and Computing Archive (77-78), 32–39 (2003)
4. Bevan, T., Ahmed, A.: An Investigation into Web Accessibility Standards as a Practical Study with Older and Disabled Users. ACM SIGACCESS Accessibility and Computing Archive 88, 9–14 (2007)
5. Lang, S.: Social Networking Sites Exclude Disabled Users, Says Charity. The Guardian, January 23 (2008)
6. Shneiderman, B.: Leonardo's Laptop: Human Needs and the New Computing Technologies. MIT Press, Cambridge (2002)
7. Preece, J., Rogers, Y., Sharp, H.: Interaction Design. John Wiley & Sons, Inc., New York (2001)
8. Pinder, A.: Help to Bridge the Digital Divide. Computer Weekly, June 27 (2002)
9. Good, A., Jerrams-Smith, J.: Enabling Accessibility and Enhancing Web Experience: Ordering Search Results According to User Needs. In: Stephanidis, C. (ed.) HCI 2007. LNCS, vol. 4556, pp. 34–44. Springer, Heidelberg (2007)
10. Nielsen, J., Landauer, T.: A Mathematical Model of the Finding of Usability Problems. In: Proceedings of ACM INTERCHI 1993, Amsterdam, The Netherlands, pp. 206–213 (1993)
Issues in Web Presentation for Cognitive Accessibility

Clayton Lewis

Coleman Institute for Cognitive Disabilities and Department of Computer Science, 430 UCB, Boulder CO 80309 USA
[email protected]
Abstract. For people with cognitive disabilities, access to mainstream content is crucial, for educational materials, and for access to other information and services essential to participation in society. Key features for these users are clear, simple presentation (of navigation and interaction as well as content), multimodal presentation to assist with difficulty in processing text, and access to definitions of unfamiliar terms. Simple configurability of presentation, ideally via online profiles, is also important. The Fluid project family is developing technical approaches for realizing these facilities automatically, without requiring content and service providers to develop separate sites for accessibility.

Keywords: Inclusive design, Web accessibility, cognitive disabilities.
1 Introduction

Access to information on the Web is increasingly a necessity of modern life. As a participant in a focus group of people with brain injury said, when asked about using the Web, “Without it, how could I buy insurance?” But access is often difficult for people with cognitive disabilities, even for sites that comply with accessibility guidelines (Small et al. [1]). The features of Web sites that enhance cognitive accessibility are becoming clearer (see for example WebAIM [2]). Clear, simple presentation (of navigation and interaction as well as content), multimodal presentation (auditory as well as visual) to assist with difficulty in processing text, and access to definitions of unfamiliar terms, are all helpful. But how are sites with these desirable characteristics going to be created?
The situation is even worse for cognitive accessibility, because simple presentation, a key accessibility feature, in some ways works against other priorities that designers must attend to.

To illustrate, consider a page on amazon.com, a popular e-commerce site, that describes a book for sale. The page is very complex; there are more than 150 controls displayed “above the fold”, that is, in the portion of the page shown on a single screen of a typical browser display. The total number of controls on the complete page is very much higher. Many of these controls are only rarely used. For example, what proportion of buyers will want to add a book to a “wish list”, or want to contribute an image of their own to the book page? Yet all viewers of the page get these controls.

The designers who created this page were not being careless. Rather, they were responding to a fact of life that has become clearer and clearer since Landauer and colleagues pointed it out in the early 80s: packing as much information as possible onto a single screen gives a more effective design, for typical users. This is true because of negative consequences of splitting material among multiple screens. If a collection of items, or choices, is to be shown, the initial presentation can be made smaller only by pushing some information onto other screens. This creates two problems. First, the initial screen has to contain some kind of category descriptions, such as a product type, rather than the products themselves. But category descriptions inevitably lead to uncertainty about where a needed item is to be found [4]. Does “clothing” include shoes? Second, the need to move between multiple screens introduces overhead that is avoided when all of the items are shown on a single screen.

So there are good reasons why typical Web pages often seem cluttered, and have large numbers of controls. But these pages are often confusing and difficult for people with cognitive disabilities.
Is there a way to provide simpler pages in the face of these good reasons?
3 Web Pages for Mobile Devices Are Simpler

There is a common situation in which simpler pages are already being created: Web presentations intended for viewing on mobile devices, with their small screens. Like many organizations that operate on the Web, Amazon has a mobile version of its site, and these pages are radically simpler than those produced for laptop or desktop browsers. Some users with disabilities prefer to use these mobile sites, when they are available, to avoid clutter. But there is a tradeoff in doing this, because the smaller screen (and controls) on a mobile device may themselves be more difficult for some users. A user who needs a large font size, for example, is likely to have difficulty working with the small screen on a mobile device. It is possible for sophisticated users to solve this latter problem, and view the mobile version of a Web site on a large screen device. Special effort is required to do this, because normally the Web request a browser sends to a server identifies the kind of device that is making the request. This identifying information is used to determine which version of the site to deliver: requests marked as coming from mobile devices are given the mobile version of the site. But there are legitimate ways to control the
identifying information that the browser sends. For example, there is a Firefox add-on [5] that allows a user to request the mobile version of a site regardless of the device they are actually using.
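The mechanism described above can be sketched in a few lines. This is a minimal illustration, not the add-on's actual implementation: a client obtains the mobile version of a site by sending a mobile browser's identifying string in the request. The `MOBILE_UA` string here is a hypothetical example; real identifying strings vary by device and browser.

```python
import urllib.request

# Hypothetical mobile identifying string (real strings vary by device).
MOBILE_UA = "Mozilla/5.0 (iPhone; CPU iPhone OS like Mac OS X) Mobile/8A293"

def mobile_request(url: str) -> urllib.request.Request:
    """Build a request that identifies itself as a mobile browser,
    so the server delivers its simpler mobile version of the site."""
    req = urllib.request.Request(url)
    # Override the header the server uses to decide which version to send.
    req.add_header("User-Agent", MOBILE_UA)
    return req

req = mobile_request("http://www.example.com/")
print(req.get_header("User-agent"))  # the overridden identifying string
```

Opening this request with `urllib.request.urlopen(req)` would then fetch whatever version the server chooses to deliver to a client identifying itself this way.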
4 User Profiles Should Include Preference for Mobile Versions

For many users, including many users with cognitive disabilities, it is too difficult to manage the procedure needed to request mobile versions of Web content. Further, having managed to do this on one computer, they must repeat the process if they need to use a different computer, for example a public computer in a library or senior center. The Global Public Inclusive Infrastructure initiative (GPII [6]) aims to allow users to set up a profile, maintained online on a server accessible from any machine, that could include a preference for mobile versions. Having set up such a profile, and given the necessary supporting infrastructure envisioned by GPII, a user would see mobile versions of content, whenever it is available, on any computer they might use. (The same profile facility will also support other preferences, such as for large fonts.) The profile approach greatly increases users’ ability to work independently. A user who needs assistance to create a profile can get the help they need once, and get the benefit of the profile thereafter without needing additional help.
5 A Better Approach to Providing Multiple Versions of Content Is Needed

The profile approach to accessing simplified content has some attractive features. One is that designers of mobile sites have already chosen which content to preserve on small screens, and which to suppress. This means that no third party is intervening to modify the provided content, perhaps distorting the content provider’s intended result. But not all Web sites have corresponding mobile versions. Providing a mobile site is expensive, requiring a great deal of work above and beyond what organizations have already invested in their sites for larger screens. Commercial organizations can often afford to make the added investment, because the mobile site can pay for itself through added revenue. But noncommercial organizations face a challenge. As mobile access to the Web continues to grow, an organization’s users will come to expect, or even demand, good access to content. But the organization may realize no additional revenue from a mobile site. My own organization, the University of Colorado, has recognized just this dilemma: future students will expect mobile access, but the funds needed to provide it will have to be diverted from other needs. It should be possible to reduce the cost of providing alternate versions of a site by allowing content developers to assign a priority to different bodies of content. Implicitly, designers of mobile sites are already doing this, by choosing to present some content while suppressing other content. Making these designations explicit would allow a single collection of content to be shown on screens of different sizes, with low priority content only rendered when screen size permits. Users could then request to see only high priority content, even if they are using a large screen, by placing an appropriate preference in their profile. Further, users (or
content developers) could choose among different treatments for low priority content. The simplest, but most restricted, treatment would be for low priority content just to be suppressed. An alternative would place low priority content on a subsidiary page, viewed when the user operates a “show me more” control to request it. A further possibility would have multiple “more” controls, associated with different categories of low priority content on a page.
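The three treatments just described can be made concrete with a short sketch. This is a hypothetical illustration (not an actual Fluid API) of rendering priority-tagged content: low-priority items are suppressed outright, moved behind a single "show me more" control, or grouped under per-category "more" controls.

```python
def render(fragments, treatment="suppress"):
    """fragments: list of (text, priority, category) tuples.
    Returns what appears on the main page, plus any subsidiary content."""
    high = [t for t, p, _ in fragments if p == "high"]
    low = [(t, c) for t, p, c in fragments if p == "low"]
    if treatment == "suppress":
        # Simplest treatment: low-priority content is dropped entirely.
        return {"page": high}
    if treatment == "more_page":
        # One subsidiary page behind a single "show me more" control.
        return {"page": high + ["[show me more]"], "more": [t for t, _ in low]}
    if treatment == "more_by_category":
        # Several "more" controls, one per category of low-priority content.
        groups = {}
        for t, c in low:
            groups.setdefault(c, []).append(t)
        return {"page": high + [f"[more: {c}]" for c in groups], "more": groups}
    raise ValueError(treatment)

page = [("Book title", "high", None),
        ("Buy now", "high", None),
        ("Add to wish list", "low", "lists"),
        ("Share your images", "low", "community")]

print(render(page))                      # low-priority content suppressed
print(render(page, "more_page"))         # one subsidiary page
print(render(page, "more_by_category"))  # per-category "more" controls
```

A real framework would of course render markup rather than strings, but the choice among treatments is the same branching decision shown here.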
6 The Fluid Project Is Developing Technology to Support Alternative Presentations of Web Content

Fluid [7] is an international collaborative project to develop community source Web technology to enhance Web accessibility. One aspect of Fluid is the development of a form of IoC (inversion of control) technology to support alternative presentation of Web pages. IoC technology allows developers to specify alternative implementations of needed functions, and rules that govern when particular alternatives are selected. In the framework being developed by Fluid, these selection rules will allow the rendering of Web pages to depend on information from the request sent by a browser (from which screen size can be determined), or by information from a user profile, or both, in addition to information encoded in the Web page itself. The framework will thus support all of the options just described, allowing low priority content to be rendered in any of a number of ways, based on user preference as well as screen size. This technology can be used by developers who want to support users who need simple presentations.
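The IoC idea described above, alternative implementations plus selection rules, can be sketched as follows. This is an illustrative toy, not the Fluid framework's actual API: alternative renderers are registered, and rules choose one based on the browser request and the user's profile, with the profile taking precedence.

```python
# Registry of alternative implementations of the same rendering function.
RENDERERS = {
    "full": lambda items: list(items),
    "simplified": lambda items: [i for i in items if i["priority"] == "high"],
}

def select_renderer(request, profile):
    """Selection rules deciding which registered alternative to use."""
    # Rule 1: an explicit profile preference for simple presentation wins,
    # even on a large screen.
    if profile.get("prefers_simple"):
        return RENDERERS["simplified"]
    # Rule 2: small screens (reported in the browser request) get the
    # simplified version automatically.
    if request.get("screen_width", 1024) < 600:
        return RENDERERS["simplified"]
    return RENDERERS["full"]

items = [{"text": "Buy now", "priority": "high"},
         {"text": "Wish list", "priority": "low"}]

# Large screen, but the user's profile asks for simple presentation:
renderer = select_renderer({"screen_width": 1440}, {"prefers_simple": True})
print([i["text"] for i in renderer(items)])  # → ['Buy now']
```

The key property, which the Fluid framework provides in a much more general form, is that page code calls an abstract rendering function while the rules, not the page, decide which concrete implementation runs.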
7 Accessibility Technology with Mainstream Benefits Has Significant Advantages

While Fluid supports enhanced accessibility, its flexibility, as just explained, also enables it to meet the needs of any Web developer who needs to provide content on devices with different screen sizes, whether they are concerned about accessibility or not. This means that the use of the Fluid framework, with its attendant accessibility benefits, can spread widely across the Web. The accessibility payoff is that any developer who uses Fluid to support different screen sizes will, with no additional effort, also provide enhanced support for users who need simplified presentations. This happens whether or not the developer cares about accessibility. Wide adoption will of course enhance the sustainability of the Fluid technology itself. As a community source project, Fluid will benefit from a wide user base of organizations and individuals motivated to maintain and enhance the framework. To encourage wide adoption, Fluid has additional design objectives. One is to work without constraining designers’ control of the appearance, or styling, of their pages. Another is to be maximally compatible with other libraries or frameworks that designers may choose to use. For example, Fluid carefully manages its use of names so as to avoid collisions between its code and the code used by other frameworks. Information about these design objectives, and currently available versions of the framework, is available at [7].
8 Conclusion

In the future we may have the technology to automatically create a clear, simple version of a body of content, based on advances in natural language processing. But in the meantime human effort is required to do this, and the benefits of clarity and simplicity in themselves are only rarely sufficient to motivate designers to pursue them. Fortunately, the powerful motive to deliver content on small screens leads to similar results. By adding infrastructure that allows users to express their preference for these clear and simple presentations, we can enhance accessibility for people with cognitive disabilities in the short term.

Acknowledgements. Thanks to the Coleman Institute for Cognitive Disabilities and the Rehabilitation Engineering Research Center for Advancing Cognitive Technologies, funded by the National Institute for Disability and Rehabilitation Research, for supporting the preparation of this paper. Thanks to Antranig Basman and Colin Clark, Fluid project architects, for explanations of the Fluid IoC framework.
References

1. Small, J., Schallau, P., Brown, K., Appleyard, R.: Web accessibility for people with cognitive disabilities. In: CHI 2005 Extended Abstracts on Human Factors in Computing Systems (CHI 2005), pp. 1793–1796. ACM, New York (2005)
2. WebAIM, http://webaim.org/
3. Loiacono, E.T., Romano Jr., N.C., McCoy, S.: The state of corporate website accessibility. Commun. ACM 52(9), 128–132 (2009)
4. Dumais, S.T., Landauer, T.K.: Using examples to describe categories. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI 1983), pp. 112–115. ACM, New York (1983)
5. Modify-headers, https://addons.mozilla.org/en-US/firefox/addon/modify-headers/
6. Global Public Inclusive Infrastructure, http://GPII.org
7. Fluid project, http://www.fluidproject.org/
A Study of Accessibility Requirements for Media Players on the Web

Lourdes Moreno, María Gonzalez, Paloma Martínez, and Ana Iglesias

LaBDA Group, Computer Science Department, Universidad Carlos III de Madrid, Avda. Universidad 30, 28911 Leganés, Madrid, Spain
{lmoreno,mgonza1,pmf,aiglesia}@inf.uc3m.es
Abstract. Multimedia content pervades the Web, and access to it should be provided for all people. For this reason, accessibility requirements must be considered, including synchronized alternative resources such as captions and audio description, among others. In addition, it is very important to take accessibility requirements into account in the player itself, to avoid barriers and to ensure access to the multimedia content as well as to its alternative resources. This paper presents an overall study of standards and players with respect to accessibility requirements. Moreover, solutions to improve the accessibility features of the YouTube player are presented. Based on this study, we have distilled a set of guidelines for including accessibility requirements in players. Furthermore, we suggest an agile evaluation process which indicates the order in which accessibility guidelines should be checked. Finally, the proposed evaluation method is put into practice with a case study: accessibility features are evaluated in three widely used players.

Keywords: Web accessibility, user agent, media players, standard, evaluation, accessibility requirement.
It is of no practical use to include an accessible video with audio description if a blind user, in the end, cannot access this audio alternative due to accessibility barriers within the media player itself. Therefore, media players should be developed according to the User Agent Accessibility Guidelines (UAAG) [4] of WAI and Universal Design criteria. Furthermore, it is imperative that Web designers are familiar with the UAAG as well as the existing media players capable of the accessible delivery of multimedia content.
2 Background

The User Agent Accessibility Guidelines (UAAG) explain how to make user agents – including Web browsers, media players, and assistive technologies – accessible for people with disabilities, and particularly, how to increase the accessibility of Web content. In general, media players must ensure that all audio and video controls are accessible from the keyboard alone and can be operated by a user with a screen reader. Of the two versions of the UAAG currently available, the earlier UAAG 1.0 (approved in 2002) is considered the version of reference, while UAAG 2.0 is currently being developed to help make a new generation of user agents functionally accessible, to provide a gateway to alternative information based on user technologies, and to align with WCAG 2.0 (a W3C Recommendation). Following Guideline 1.2 of WCAG 2.0 (“provide alternatives for time-based media”) [1], media content must be accompanied by synchronized media alternatives such as captions (or subtitles for deaf people), audio description, sign language, etc. The UAAG includes guidelines to ensure that players provide support for these media alternatives. User agents such as players may conform to UAAG 2.0 at one of three conformance levels, depending on the level of the success criteria that have been satisfied: "A" (the player satisfies all of the Level A success criteria), "AA" or "Double-A" (it satisfies all of the Level A and Level AA success criteria), or "AAA" or "Triple-A" (it satisfies all of the success criteria). Current media players have, to a greater or lesser extent, accessibility features conforming to the UAAG. Players are either embedded in a web page or run as standalone applications. Embedding the player in a web page allows the user to access the content without opening another application, as with many Flash players, but standalone players usually have more control options.
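The conformance rule just described is a simple cumulative check, sketched below as a small helper (an illustration of the rule, not part of any UAAG tooling):

```python
def conformance_level(a_met: bool, aa_met: bool, aaa_met: bool):
    """Return the UAAG conformance level a player reaches, given which
    sets of success criteria it fully satisfies (None if Level A fails).
    Higher levels are cumulative: AA requires all A criteria, and so on."""
    if a_met and aa_met and aaa_met:
        return "AAA"
    if a_met and aa_met:
        return "AA"
    if a_met:
        return "A"
    return None

print(conformance_level(True, True, False))  # → AA (a Double-A player)
```

Note that satisfying higher-level criteria alone earns nothing: a player meeting every AAA criterion but missing one Level A criterion does not conform at any level.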
In general, the standalone versions of media players are far more accessible than the embedded versions [5]. Several media formats offer caption capabilities; the best known players are RealPlayer (www.real.com), Windows Media Player (http://windows.microsoft.com), QuickTime (www.apple.com) and iTunes (www.apple.com). iTunes, and the videos it syncs with iPod-family devices, can display subtitles with elegant closed captions (captions that can be shown or hidden, depending on user preference) [6].
In relation to audio description, it is recommended that video players include an audio description track that runs the length of the video; the player may then expose audio description controls to the user. Most videos online are delivered via Adobe Flash-based in-page video players. Flash has an excellent compression system that can deliver high-fidelity audio and high-resolution video without taxing bandwidth, and Flash is installed in most browsers [6]. For these reasons, Flash media players have long been used on the Web, as with the YouTube Video Player. Although the YouTube Video Player provides subtitles, it presents accessibility problems, as will be shown in Section 4. HTML5 [7] offers a huge step forward in web accessibility for embedded media players. Basically, the new standard provides new commands, such as