CONTEMPORARY ERGONOMICS 1998
Proceedings of the Annual Conference of the Ergonomics Society Royal Agricultural College Cirencester 1–3 April 1998 Edited by
M.A.HANSON Institute of Occupational Medicine Edinburgh
THE ERGONOMICS SOCIETY
UK
Taylor & Francis Ltd, 1 Gunpowder Square, London EC4A 3DE
USA
Taylor & Francis Inc., 1900 Frost Road, Suite 101, Bristol, PA 19007–1598
This edition published in the Taylor & Francis e-Library, 2004. Copyright © Taylor & Francis Ltd 1998 except papers by P.J.Goillau et al., R.S.Harvey, and two papers by L.M.Bouskill et al. © British Crown Copyright 1998/DERA published with the permission of the Controller of Her Britannic Majesty’s Stationery Office. And except the paper by L.A.Morris © British Crown Copyright 1998 reproduced with the permission of the Controller of Her Britannic Majesty’s Stationery Office. The views expressed are those of the author and do not necessarily reflect the views or policy of the Health & Safety Executive or any other government department. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher. A catalogue record for this book is available from the British Library. ISBN 0-203-21201-0 Master e-book ISBN
ISBN 0-203-26949-7 (Adobe eReader Format) ISBN 0-7484-0811-8 (Print Edition)
Cover design by Hybert Design
Preface

Contemporary Ergonomics 1998 are the proceedings of the Annual Conference of the Ergonomics Society, held in April 1998 at the Royal Agricultural College, Cirencester. The conference is a major international event for Ergonomists and Human Factors Specialists, and attracts contributions from around the world.

Papers are chosen by a selection panel from abstracts submitted in the autumn of the previous year and the selected papers are published in Contemporary Ergonomics. Papers are submitted as camera ready copy prior to the conference. Each author is responsible for the presentation of their paper. Details of the submission procedure may be obtained from the Ergonomics Society.

The Ergonomics Society is the professional body for Ergonomists and Human Factors Specialists based in the United Kingdom. It also attracts members throughout the world and is affiliated to the International Ergonomics Association. It provides recognition of competence of its members through the Professional Register.

For further details contact:
The Ergonomics Society
Devonshire House
Devonshire Square
Loughborough
Leicestershire LE11 3DW
UK
Tel/Fax (+44) 1509 234 904
Contents
HUGH STOCKBRIDGE MEMORIAL SESSION: Ergonomics and Standards
Introduction (DM Anderson) 2
Ergonomics standards—the good, the bad and the ugly (T Stewart) 3
International standardisation of graphical symbols for consumer products (FR Brigham) 8
The UK human factors defence standard: past, present and future (RS Harvey) 13

STEPHEN PHEASANT MEMORIAL SESSION: The Contribution of Ergonomics to the Understanding and Prevention of Musculoskeletal Disorders
Introduction (S Lee and D Stubbs) 20
The role of physical aspects (PW Buckle) 21
The combined effects of physical and psychosocial work factors (J Devereux) 25
The role of psychosocial factors (AK Burton) 30

MUSCULOSKELETAL DISORDERS
Interpreting the extent of musculoskeletal complaints (C Dickinson) 36
People in pain (DM Anderson) 41
Prevention of musculoskeletal disorders in the workplace—a strategy for UK research (LA Morris, R McCaig, M Gray, C Mackay, C Dickinson, T Shaw and N Watson) 46
A musculoskeletal risk screening tool for automotive line managers (A Wilkinson, RJ Graves, S Chambers and R Leaver) 51
Risk assessment design for musculoskeletal disorders in healthcare professionals (C Beynon, D Leighton, A Nevill and T Reilly) 56
Ergonomic microscopes—solutions for the cyto-screener? (JL May and AG Gale) 61
Musculoskeletal discomfort from dancing in nightclubs (SL Durham and RA Haslam) 66

MANUAL HANDLING
Is the ergonomic approach advocated in the Manual Handling Regulations being adopted? (KM Tesh) 72
Control of manual handling risks within a soft drinks distribution centre (EJ Wright and RA Haslam) 77
Training and patient-handling: an investigation of transfer (JA Nicholls and MA Life) 82
Risk management in manual handling for community nurses (P Alexander) 87
Children's natural lifting patterns: an observational study (F Cowieson) 92
Manual handling and lifting during the later stages of pregnancy (T Reilly and SA Cartwright) 96
Posture analysis and manual handling in nursery professionals (JO Crawford and RM Lane) 101
POSTURE
Can orthotics play a beneficial role during loaded and unloaded walking? (DC Tilbury-Davis, RH Hooper and MGA Llewellyn) 108
Investigation of spinal curvature while changing one's posture during sitting (FS Faiks and SM Reinecke) 113
The effect of load size and form on trunk asymmetry while lifting (G Thornton and J Jackson) 118
The effect of vertical visual target location on head and neck posture (R Burgess-Limerick, A Plooy and M Mon-Williams) 123

OFFICE ERGONOMICS
Is a prescription of physical changes sufficient to eliminate health and safety problems in computerised offices? (RM Sharma) 130
An evaluation of a trackball as an ergonomic intervention (B Haward) 135
Old methods, new chairs. Evaluating six of the latest ergonomic chairs for the modern office (A Esnouf and JM Porter) 140

NEW TECHNOLOGY
Development of a questionnaire to measure attitudes towards virtual reality (S Nichols) 146
Orientation of blind users on the World Wide Web (MP Zajicek, C Powell and C Reeves) 151
"Flash, splash and crash": Human factors and the implementation of innovative Web technologies (A Pallant and G Rainbird) 156

WORK STRESS
Determining ergonomic factors in stress from work demands of nurses (DW Jamieson and RJ Graves) 162
A risk assessment and control cycle approach to managing workplace stress (RJ Lancaster) 167

TELEWORKING
Teleworking: Assessing the risks (M Kerrin, K Hone and T Cox) 174
Evaluating teleworking—case study (S Campion and A Clarke) 179
TEAM WORKING
Team organisational mental models: an integrative framework for research (J Langan-Fox, S Code and G Edlund) 186
The impact of IT&T on virtual team working in the European automotive industry (C Carter and A May) 191

WORK DESIGN
The effect of communication processes upon workers and job efficiency (A Dickens and C Baber) 198
A case study of job design in a steel plant (HT Neary and MA Sinclair) 203
The effects of age and habitual physical activity on the adjustment to nocturnal shiftwork (T Reilly, A Coldwells, G Atkinson and J Waterhouse) 208
Job design for university technicians: work activity and allocation of function (RF Harrison, A Dickens and C Baber) 213

SYSTEM DESIGN AND ANALYSIS
Allocation of functions and manufacturing job design based on knowledge requirements (CE Siemieniuch, MA Sinclair and GMC Vaughan) 220
The need to specify cognition within system requirements (IS MacLeod) 225
Analysis of complex communication tasks (J Wikman) 230

INFORMATION SYSTEMS
Health and safety as the basis for specifying information systems design requirements (TG Gough) 236
Cognitive algorithms (R Huston, R Shell and AM Genaidy) 241

DESIGN METHODS
Rapid prototyping in foam of 3D anthropometric computer models in functional postures (S Peijs, JJ Broek and PN Hoekstra) 248
The use of high and low level prototyping methods for product user interfaces (JVH Bonner and P Van Schaik) 253
Creative collaboration in engineering design teams (F Reid, S Reed and J Edworthy) 258
DESIGN AND USABILITY
Pleasure and product semantics (PW Jordan and AS Macdonald) 264
A survey of usability practice and needs in Europe (MC Maguire and R Graham) 269
Cultural influence in usability assessment (A Yeo, R Barbour and M Apperley) 274

INTERFACE DESIGN
Interface display designs based on operator knowledge requirements (F Sturrock and B Kirwan) 280
Understanding what makes icons effective: how subjective ratings can inform design (SJP McDougall, MB Curry and O de Bruijn) 285
Representing uncertainty in decision support systems: the state of the art (C Parker) 290
Representing reliability of at-risk information in tactical displays for fighter pilots (M Piras, S Selcon, J Crick and IRL Davies) 295
Semantic content analysis of task conformance (A Totter and C Stary) 300

WARNINGS
Warnings: a task-oriented design approach (JM Noyes and AF Starr) 306
Effects of auditorily-presented warning signal words on intended carefulness (RS Barzegar and MS Wogalter) 311
Listeners' understanding of warning signal words (J Edworthy, W Clift-Matthews and M Crowther) 316
Perceived hazard and understandability of signal words and warning pictorials by Chinese community in Britain (AKP Leung and E Hellier) 321

VERBAL PROTOCOL ANALYSIS
Thinking about thinking aloud (MJ Rooden) 328
Adjusting the cognitive walkthrough using the think-aloud method (M Verbeek and H van Oostendorp) 333
Verbal protocol data for heart and lung bypass scenario simulation "scripts" (J Lindsay and C Baber) 338
Use of verbal protocol analysis in the investigation of an order picking task (B Ryan and CM Haslegrave) 343

PARTICIPATORY ERGONOMICS
Selecting areas for intervention (BL Somberg) 350
Participatory ergonomics in the construction industry (AM de Jong, P Vink and WF Schaefer) 355
User trial of a manual handling problem and its "solution" (D Klein, WS Green and H Kanis) 360

INDUSTRIAL APPLICATIONS
Case study: a human factors safety assessment of a heavy lift operation (WI Hamilton and P Charles) 366
The application of ergonomics to volume high quality sheet printing and finishing (ML Porter) 371
The application of human factors tools and techniques to the specification of an oil refinery process controller role (J Edmonds and C Duggan) 376
Feasibility study of containerisation for Travelling Post Office operations (G Rainbird and J Langford) 381
MILITARY APPLICATIONS
The complexities of stress in the operational military environment (MI Finch and AW Stedmon) 388
The development of physical selection procedures. Phase 1: job analysis (MP Rayson) 393
The human factor in applied warfare (AE Birkbeck) 398

AIR TRAFFIC MANAGEMENT
Getting the picture—Investigating the mental picture of the air traffic controller (B Kirwan, L Donohoe, T Atkinson, H MacKendrick, T Lamoureux and A Phillips) 404
Developing a predictive model of controller workload in air traffic management (AR Kilner, M Hook, P Fearnside and P Nicholson) 409
Assessing the capacity of Europe's airspace: The issues, experience and a method using a controller workload model (A Majumdar) 414
Evaluation of virtual prototypes for air traffic control—the MACAW technique (PJ Goillau, VG Woodward, CJ Kelly and GM Banks) 419
Development of an integrated decision making model for avionics application (D Donnelly, JM Noyes and DM Johnson) 424
Psychophysiological measures of fatigue and somnolence in simulated air traffic control (H David, P Cabon, S Bourgeois-Bougrine and R Mollard) 429

DRIVERS AND DRIVING
What's skill got to do with it? Vehicle automation and driver mental workload (M Young and N Stanton) 436
The use of automatic speech recognition in cars: a human factors review (R Graham) 441
Integration of the HMI for driver systems: classifying functionality and dialogue (T Ross) 446
Subjective symptoms of fatigue among commercial drivers (PA Desmond) 451
How did I get here? Driving without attention mode (JL May and AG Gale) 456
Seniors' driving style and overtaking: is there a "comfortable traffic hole"? (T Wilson) 461
Speed limitation and driver behaviour (D Haigney and RG Taylor) 466
The ergonomics implications of conventional saloon car cabins on police drivers (SM Lomas and CM Haslegrave) 471
The design of seat belts for tractors (DH O'Neill and BJ Robinson) 476

NOISE AND VIBRATION
Auditory distraction in the workplace: a review of the implications from laboratory studies (S Banbury and D Jones) 482
Transmission of shear vibration through gloves (GS Paddan and MJ Griffin) 487
The effect of wrist posture on attenuation of vibration in the hand-arm system (TK Fredericks and JE Fernandez) 492

HAND TOOLS
Criteria for selection of hand tools in the aircraft manufacturing industry: a review (BP Kattel and JE Fernandez) 498
Exposure assessment of ice cream scooping tasks (PG Dempsey, R McGorry, J Cotnam and I Bezverkhny) 503

THERMAL ENVIRONMENTS
The effect of clothing fit on the clothing Ventilation Index (LM Bouskill, N Sheldon, KC Parsons and WR Withey) 510
A thermoregulatory model for predicting transient thermal sensation (F Zhu and N Baker) 515
The user-oriented design, development and evaluation of the clothing envelope of thermal performance (D Bethea and KC Parsons) 520
A comparison of the thermal comfort of different wheelchair seating materials and an office chair (N Humphreys, LH Webb and KC Parsons) 525
The effect of repeated exposure to extreme heat by fire training officers (JO Crawford and TJ Milne) 530
The effects of self-contained breathing apparatus on gas exchange and heart rate during fire-fighter simulations (KJ Donovan and AK McConnell) 535
The effect of external air speed on the clothing ventilation index (LM Bouskill, R Livingston, KC Parsons and WR Withey) 540

COMMUNICATING ERGONOMICS
Commercial planning and ergonomics (J Dillon) 546
Human factors and design: Bridging the communication gap (AS Macdonald and PW Jordan) 551
Guidelines for addressing ergonomics in development aid (T Jafry and DH O'Neill) 556
Determining and evaluating ergonomic training needs for design engineers (J Ponsonby and RJ Graves) 560
Ergonomic ideals vs genuine constraints (D Robertson, S Layton and J Elder) 565
GENERAL ERGONOMICS
Another look at Hick-Hyman's reaction time law (TO Kvålseth) 572
Design relevance of usage centred studies at odds with their scientific status? (H Kanis) 577
The integration of human factors considerations into safety and risk assessment systems (JL Williamson-Taylor) 582
The use of defibrillator devices by the lay public (T Gorbell and R Benedyk) 587
Occupational disorders in Ghanaian subsistence farmers (M McNeill and DH O'Neill) 592

AUTHOR INDEX 599
SUBJECT INDEX 603
HUGH STOCKBRIDGE MEMORIAL SESSION: ERGONOMICS AND STANDARDS
HUGH STOCKBRIDGE MEMORIAL LECTURES Ergonomics and Standards
Introduction Towards the end of his career, Hugh Stockbridge became known for, amongst other things, his keen interest in ergonomics standards, but he was not originally a 'standards man'. In fact his individualistic style was very far from standard. In the days before the publication of Murrell's book Ergonomics (Chapman and Hall, 1965), we had only one 'cookbook' published in the UK that could be seen as a standard. This was the 1960 publication from the MRC/RNPRC: Human Factors in Design and Use of Naval Equipment, intended for use by Royal Navy designers.

Hugh, however, was ever the innovative experimenter in the grand tradition of Cambridge psychologists, and where ergonomics data was lacking, produced some of his own. An example was the designs for micro shape coded knobs, intended to be used on the old 'Post office keys', published internally at Farnborough in 1957. His continuing interest in factors affecting the design of indicators and such controls was later reflected in a paper with Bernard Chambers in the journal Ergonomics (Taylor and Francis, 1970).

By the end of the '60s, Hugh was already involved in his great project, as a member of a working party of the RNPRC to carry out a revision of the earlier handbook. The working party carried out a thorough review of the proliferating human engineering handbooks and surveys, and by consideration of best ergonomics principles, conceived a new publication of 11 Chapters, to be produced in a ring binding to facilitate updating and additions. But that is the subject of Roger Harvey's paper in this memorial session. Hugh also contributed a particularly interesting, and still valid, critique of ergonomics handbooks and journals to Human Engineering, by Kraiss and Moraal (Verlag TüV Rheinland GmbH, 1975). Much later, Hugh became secretary of the Study Group concerned with the creation of DEF STAN 00–25, and to quote from a contemporary colleague, '…his cryptic minutes were a work of art—inimitable, yet very informative once decoded and embellished with the knowledge of those who had the need to know.'

The other two papers are complementary in many ways. Tom Stewart covers the considerable time he has spent developing and using more general International Standards in ergonomics. It will be especially interesting to hear his comments on the application of ergonomics principles to creating standards—and the standards he considers have made a negative contribution to ergonomics! The final paper, contributed by Fred Brigham, illustrates amongst other things the particular problems encountered in the development of quite specific standards for graphical symbols for use on consumer products intended for an international market. The promotion of appropriate testing procedures to ensure usability of such symbols must be a very important and interesting part of the standard making process. Hugh would have been proud of this session.
ERGONOMICS STANDARDS—THE GOOD, THE BAD AND THE UGLY

Tom Stewart
Managing Director
System Concepts Limited
2 Savoy Court, Strand
London WC2R 0EZ
www.system-concepts.com

Hugh Stockbridge was an active supporter of ergonomics standardisation. In this presentation, I will draw some conclusions about the process of developing standards (particularly International Standards) and about the usefulness and usability of the resulting standards themselves. As the title suggests, there can be major problems with both the process and the standards (the Ugly) or there can be minor problems with both (the Bad). However, most of the time, the process works well and the resulting standards have improved the ergonomics quality of products and systems and are well received by industry and users (the Good).
Introduction One of Hugh Stockbridge’s most endearing qualities (apart from his sense of humour) was his enthusiasm for irreverence. I believe he would have approved of my choice of title and been pleased that his memory was being cherished in this session although he was also at home on the stage. This session is like a posthumous award for Hugh and it is clear that there are some similarities between standards and show business. Just as awards ceremonies often seem rather incestuous, esoteric and irrelevant to the real world, many standards seem to be aimed more at ergonomists and their concerns (in their terminology, structure and emphasis) than at standards users in the real world. Taking the analogy of awards a step further, I will therefore review ‘The Good, The Bad and The Ugly’ aspects of ergonomics standardisation in reverse order. These observations are based on my own experience as Chairman of ISO/TC 159/SC 4 Ergonomics of Human System Interaction and as an active developer of ergonomics standards for more than 15 years. But before exposing the sordid side of standards making, I would like to explain what the process of International Standardisation should involve.
The process of International Standardisation International standards are developed over a period of several years and in the early stages, the published documents may change dramatically from version to version until consensus is reached (usually within a Working Group of experts). As the standard becomes more mature (from the Committee Draft Stage onwards), formal voting takes place (usually within the
parent sub-committee) and the draft documents provide a good indication of what the final standard is likely to look like. Table 1 shows the main stages.
Table 1 The main stages of ISO standards development
The Ugly side of standards Although ergonomics standards are generally concerned with such mundane topics as keyboard design or menu structures, they nonetheless generate considerable emotion amongst standards makers. Sometimes this is because the resulting standard could have a major impact on product sales or legal liabilities. Other times the reason for the passion is less clear. Nonetheless, the strong feelings may result in what I have called the ugly side of standards. In terms of the standardisation process, the ugly side includes:
• large multinational companies exerting undue influence by dominating national committees. Although draft standards are usually publicly available from national standards bodies, they are not widely publicised. This means that it is relatively easy for well informed large companies to provide sufficient experts at the national level to ensure that they can virtually dictate the final vote and comments from a country.
• end users' requirements being compromised as part of 'horse trading' between conflicting viewpoints. In the interests of reaching agreement, delegates may resort to making political trade-offs largely independent of the technical merits of the issue.
• national pride leading to uncritical support for a particular approach or methodology. In theory, participants in Working Group meetings are experts nominated by member bodies in the different countries. They are not there to represent a national viewpoint but are supposed to act as individuals. However, as one disillusioned expert explained to me 'sometimes the loudest noise at a Working Group meeting is the grinding of axes'.
However, it is not just the process which is ugly. The standards themselves can leave much to be desired in terms of brevity, clarity and usability as a result of:
• stilted language and boring formats. The unfriendliness of the language is illustrated by the fact that although the organisation is known by the acronym ISO, its full English title is the International Organisation for Standardisation. The language and style are governed by a set of Directives and these encourage a wordy and impersonal style.
• problems with translation and the use of 'Near English'. There are three official languages in ISO—English, French and Russian. In practice, much of the work is conducted in English, often by non-native speakers. As someone who only speaks English, I have the utmost respect for those who can work in more than one language. However, the result of this is that the English used in standards is often not quite correct—it is 'near English'. The words are usually correct but the combination often makes the exact meaning unclear. These problems are exacerbated when the text is translated.
• confusions between requirements and recommendations. In ISO standards, there are usually some parts which specify what has to be done to conform to the standard. These are indicated by the use of the word 'shall'. However, in ergonomics standards, we often want to make recommendations as well. These are indicated by the use of the word 'should'. Such subtleties are often lost on readers of standards, especially those in different countries. For example, in the Nordic countries, they follow recommendations (shoulds) as well as requirements (shalls), so the distinction is diminished. In the USA, they tend to ignore the 'shoulds' and only act on the 'shalls'.
The Bad side of standards I used the expression 'ugly' to describe the result of extreme passion in the development of standards. In this part of the paper, I discuss what might be seen as the result of too little passion. The bad side is that standardisation is very slow as a result of:
• an apparently leisurely pace of work. One of the reasons is that there is an extensive consultation period at each stage with time being allowed for national member bodies to circulate the documents to mirror committees and then to collate their comments. Another reason is that Working Group members can spend a great deal of time working on drafts and reaching consensus only to find that the national mirror committees reject their work when it comes to the official vote. It is particularly frustrating for project editors to receive extensive comments (which must be answered) from countries who do not send experts to participate in the work. Of course, the fact that the work is usually voluntary means that it is difficult to get people to agree to work quickly.
• too many experts. This might sound like an unlikely problem but given the long timescale mentioned above it can be a significant factor in slowing down the process. The reason is that many experts are only supported by their organisations for a relatively short time and are then replaced by other experts. Every time a new expert joins the Working
Group, there is a tendency to spend a lot of time explaining the history and to some extent starting the process again. Similarly, each expert feels obliged to make an impact and suggest some enhancement or change in the standard under development. Since the membership of Working Groups can change at virtually every meeting (which are usually three or four months apart), it is not uncommon for long-standing members to find themselves reinstating material which was deleted two or three meetings previously (as a result of a particularly forceful individual). While I do not accept that we have produced bad standards (at least in our committee), our standards have been criticised for being too generous to manufacturers in some areas and too restrictive in other areas. The 'over-generous' criticism misses the point that most standards are setting minimum requirements and in ergonomics we must be very cautious about setting such levels. However, there certainly are areas where being too restrictive is a problem. Examples include:
• ISO 9241–3:1992 Ergonomics requirements for work with VDTs: Display Requirements. This standard has been successful in setting a minimum standard for display screens which has helped purchasers and manufacturers. However, it is biased towards Cathode Ray Tube (CRT) display technology. An alternative method of compliance based on a performance test (which would be technology independent) is still under development and is unlikely to be finalised in the near future.
• ISO CD 9241–9 Ergonomics requirements for work with VDTs: Non-keyboard input devices. This standard is suffering because technological development is faster than either ergonomics research or standards making. Although there is an urgent need for a standard to help users to be confident in the ergonomic claims made for new designs of mice and other input devices, the lack of reliable data forces the standards makers to slow down or run the risk of prohibiting newer, even better solutions.
The Good side of standards I would not spend my time (largely unfunded) developing standards if I did not believe that they are largely good for ergonomics. Major strengths in the process are that it is:
• based on consensus. Manufacturers (and ergonomists) make wildly different claims about what represents good ergonomics. This is a major weakness for our customers who may conclude that all claims are equally valid and there is no sound basis for any of it. Standards force a consensus and therefore have real authority in the minds of our customers. Achieving consensus requires compromises, but then so does life.
• international. Although there are national and regional differences in populations, the world is becoming a single market with the major suppliers taking a global perspective. Variations in national standards and requirements not only increase costs and complexity, they also tend to compromise individual choice. Making standards international is one way of ensuring that they have impact and can help improve the ergonomics quality of products for everyone.
We have produced a number of useful standards over the past few years. These are not only useful in providing technical information in their own right but serve to ensure that ergonomics issues are firmly placed on management agendas. Many organisations feel obliged to take standards seriously and therefore even if they were not predisposed towards ergonomics initially, the existence of International Standards ensures that they are given due consideration. As consultants, we know that basing our recommendations on agreed standards gives them far greater authority than citing relevant research. There is not space in this paper to list all the relevant standards but a few key examples include three from the ISO 9241 series and ISO 13407.
• ISO 9241–2:1992 Guidance on Task Requirements. This standard sets out key points on job and task design and provides a sound basis for persuading managers and system developers that such issues require proper attention if systems are to be successful.
• ISO 9241–3:1992 Visual Display Requirements. This standard allows purchasers to have some confidence in the ergonomic quality of computer displays. This is particularly important for managers who wish to meet their obligations under the European Directive on work with display screen equipment.
• ISO 9241–10:1996 Dialogue Principles. This standard sets out some key principles of dialogue design and gives useful examples to illustrate how the relatively simple principles apply in practice. The European Directive also requires employers to ensure that systems meet the principles of software ergonomics. This standard gives them an external benchmark which they can incorporate in procurement specifications.
• ISO DIS 13407 Human Centred Design for Interactive Systems. This standard is an attempt to solve the problem of developing ergonomics standards quickly enough in a fast changing technical environment. The standard provides guidance for project managers to help them follow a human-centred design process. By undertaking the activities and following the principles described in the standard, managers can be confident that the resulting systems will be usable and work well for their users. If their customers require evidence of human-centredness, the standard gives guidance on how to document the process.
The way forward Although I believe standards are an important tool for the ergonomist, many people find them difficult to understand and use. In part, this is because people sometimes expect too much from standards. They cannot represent the latest ideas and they are not going to help much with the more creative parts of design. However, they often represent important constraints and may give some guidance on what has worked in the past. The best way to really understand what is going on in standards is to get involved. This will give you advance warning of future standards, the opportunity to influence the content of standards and an understanding of the context in which they have been developed. You will then find it much easier to make effective use of standards. If you do not know who to contact, let me know what you are interested in helping with (email
[email protected]) and I’ll send you details.
INTERNATIONAL STANDARDISATION OF GRAPHICAL SYMBOLS FOR CONSUMER PRODUCTS

Fred Brigham
Philips Design, Building HWD, PO Box 218, 5600 MD Eindhoven, The Netherlands
The paper discusses international activities concerned with the standardisation of graphical symbols and complementary activities in a major electronics company focussing on symbols used for consumer products. Issues relating to the practical application of symbols in an industrial setting are described, including procedures for new symbols and tools to provide worldwide access. Results from user tests of a proposal for international symbols are presented to illustrate some of the problems involved in designing comprehensible symbols. The paper concludes by stressing the need to focus on the communicative processes involved.
Introduction The main organisations concerned with the international standardisation of graphical symbols are as follows:
International Organization for Standardization (ISO) The main ISO technical committee dealing with graphical symbols is ISO TC145, which has three subcommittees dealing with public information symbols (SC1), safety signs and symbols (SC2), and graphical symbols for use on equipment (SC3). The main publication containing graphical symbols for use on equipment is ISO 7000. The symbols for use on equipment are developed by the technical committee responsible for the equipment following the rules in ISO 3461–1 (General principles for the creation of graphical symbols) and ISO 4196 (Use of arrows).
International Electrotechnical Commission (IEC) The IEC committee responsible for graphical symbols for use on electrotechnical equipment is IEC SC3C. New symbols are proposed by technical committees or by national standardisation organisations, and SC3C plays an active role in the approval procedure. The main publication containing graphical symbols for use on electrotechnical equipment is IEC 60417. This document is produced electronically from a database. The database may be linked to a web site in the near future allowing users to search for suitable symbols and either download the drawings or order them on CD-ROM.
ISO/IEC Joint Technical Committee 1 This joint technical committee is concerned with information technology and, because of its size and importance, can be considered separately from the two parent organisations. JTC1 Working Group 5 is responsible for graphical symbols for office equipment and also icons. A collective standard containing graphical symbols for office equipment is being prepared.
International Telecommunication Union (ITU) Graphical symbols to assist users of telephone services are published in ITU-T Recommendation E.121. This document includes symbols for videotelephony which have been developed by The European Telecommunication Standards Institute (ETSI) using the Multiple Index Approach which is described in ETSI Technical Report ETR 070 (1993).
Use of graphical symbols for consumer products Philips is a multinational company which produces a wide range of consumer products. As most of the products are electrotechnical, IEC 60417 is the major source of symbols. Figure 1 shows examples of symbols from IEC 60417 adapted for use within Philips. Many of the symbols are also used as icons.
Figure 1. Some graphical symbols used by Philips

A major concern of Philips Design is the usability of products and this is taken into account in the policy with regard to the use of graphical symbols and the development of new symbols. Some of the practical issues arising are as follows:
Symbols or text In general, text describing a function in the user’s native language is likely to be understood better than a graphical symbol. Where there is no compelling reason to use graphical symbols it is better to use text. However, graphical symbols are useful where there is insufficient space to use text, eg. on a remote control, or where use of the equipment must be language independent.
Pictogram or abstract symbol The symbols on the left of Figure 1 are pictograms, i.e. they depict an object. Those on the right are abstract symbols. It might be thought that the pictograms can be more easily understood than the abstract symbols. The meaning of abstract symbols has to be learned but some are well understood, eg. the arrows used for “play”, “fast forward” etc. on tape players. Furthermore a pictorial representation may limit the application of the symbol or may become out of step with developments in technology.
Approval of new symbols In a company as diverse as Philips with relatively independent lines of business operating in all parts of the world, it is necessary to control the use of graphical symbols. This is done in
the consumer products area by means of a company standard containing “approved” symbols. Where no appropriate symbol can be found either in the company standard or in relevant international standards, a new symbol must be developed. The Philips internal procedure involves submitting the proposed new symbol for immediate comment by an expert panel followed by circulation to a wider group of interested parties prior to publication in the standard. Wherever possible, new symbols are tested as part of the user interface.
Accessing the information The printed company standard for symbols is in the process of being replaced by a web site on the Philips intranet. This has the advantage that the up-to-date collection of approved symbols and new proposals under consideration can be accessed from anywhere in the world. The database can be searched using keywords and, when appropriate symbols have been found, the drawings can be downloaded as electronic files for immediate use. The approval procedure for new symbols has also been made more effective and efficient by linking this to the web site.
Testing graphical symbols It has been noted above that symbols should be tested wherever possible to ensure that their meaning will be understood. The following example illustrates some of the issues involved in developing symbols which can be reliably comprehended and distinguished. The testing of symbols is part of the Philips quality policy for consumer products which is committed to putting in place processes and tools which enable quality of use to be managed in a systematic way.
Symbols for timer functions In response to the needs of the electrotechnical industry, an IEC proposal was circulated containing symbols for the time functions “elapsed time”, “remaining time”, “programmable start” and “programmable stop”. The symbols concerned are shown as Set 1 in Table 1. Although the functions may initially appear complex, they are all functions which may be found on consumer equipment in the living room, bedroom or kitchen. Philips was concerned about the comprehensibility of the proposed symbols and decided to test them.
Testing procedure The test of pictogram associativeness from the ETSI Multiple Index Approach, ETR 070 (1993), provided the basis for the test method. The procedure was as follows: • One of the functions was described and drawings for all four functions presented. Subjects were asked to choose the drawing which best represents the function. • This was repeated for all four functions with the order randomised between subjects and stressing that each choice should be independent. • Subjects were typical users of consumer equipment (N≥24 for full test). Two further sets of symbols were tested at separate times. Set 2 was used in a pilot test conducted by Philips to explore possible alternatives to Set 1. Set 3 was circulated in the most recent IEC draft.
Results The number of correct selections of the symbols (hit rate) is shown in Table 1. The figure in brackets is the percentage of subjects correctly selecting the symbol concerned. Where the hit
rate differs significantly from that expected by chance (25%), this is indicated by the asterisks (Chi-square goodness of fit test, **=α<.01, ***=α<.001. Note: the expected frequencies for Set 2 are less than 5.)

Table 1. Results of testing the timer symbols
The results for Set 1 indicate that the symbols for elapsed time and remaining time were correctly selected significantly less than would be expected by chance. These are not just bad symbols. In combination with the other two symbols they are actively misleading. Set 2 shows a large improvement in the discriminability of the symbols. The pair of symbols proposed for elapsed/remaining time work well and can be distinguished from each other and from the other pair. However, they were considered unsuitable for the subsequent draft because they could potentially refer to any elapsed and remaining quantity, and not necessarily to time. Set 3 appeared in the most recent IEC draft and the symbols are likely to be approved for publication (with minor graphical enhancements).
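The significance check described above is simple to reproduce. The sketch below is a minimal illustration only, using hypothetical hit counts rather than the data reported in Table 1: for each symbol it compares the number of correct selections against the 25% expected by chance among four alternatives, using a chi-square goodness-of-fit test on the correct/incorrect split (1 degree of freedom).

```python
# Minimal sketch of the hit-rate check described above.
# The counts below are hypothetical, not the data reported in Table 1.

def chi_square_vs_chance(correct, n, p_chance=0.25):
    """Chi-square goodness-of-fit statistic (1 df) for a correct/incorrect split."""
    exp_correct = n * p_chance
    exp_incorrect = n * (1 - p_chance)
    incorrect = n - correct
    return ((correct - exp_correct) ** 2 / exp_correct
            + (incorrect - exp_incorrect) ** 2 / exp_incorrect)

# Critical values of chi-square with 1 degree of freedom.
CRITICAL = {0.01: 6.63, 0.001: 10.83}

# Hypothetical hit counts for the four timer symbols, 24 subjects each.
hits = {"elapsed time": 3, "remaining time": 2,
        "programmable start": 14, "programmable stop": 16}

for symbol, correct in hits.items():
    chi2 = chi_square_vs_chance(correct, n=24)
    stars = "***" if chi2 > CRITICAL[0.001] else "**" if chi2 > CRITICAL[0.01] else ""
    print(f"{symbol}: hit rate {correct / 24:.0%}, chi-square = {chi2:.1f} {stars}")
```

Note that a significant statistic only flags a departure from chance; whether the hit rate lies above or below 25% has to be read from the counts themselves, which is how a symbol can be actively misleading rather than merely uninformative.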
Discussion Had it not been for the intervention of Philips, the symbols in Set 1 might have been published by the IEC. On the basis of the test results it was possible to analyse why the results were so poor. The challenge was to provide clear graphical cues which distinguish the two pairs of related symbols (elapsed/remaining time and programmable start/stop) and also distinguish the symbols within each pair. The graphical features in Set 3 (single versus double clock hands, large arrow heads, large dots) are all intended to enhance the distinctions and the results show the positive effects. It might be thought that much higher hit rates should be achieved but it should be borne in mind that this is an extremely demanding test in view of the fact that the functions are so similar. If one of the symbols had been tested with three completely different symbols, the hit rate might have been 100%. This reflects a difficulty with the Multiple Index Approach, i.e.
the individual results are entirely dependent on the other symbols in the set which is tested. This means that the set chosen must have ecological (i.e. contextual) validity and the results cannot be extrapolated to other situations. A further important issue is learning. The test procedure can be repeated to reveal learning effects but normal practice is to focus on the first time comprehensibility or discriminability of the symbols. In practice, users may come across the same symbols repeatedly and subsequent recognition may therefore be more important.
Designing effective symbols It is not possible to provide detailed guidance on how to design effective symbols within the scope of this paper but a number of important issues will be mentioned. Firstly it is essential to focus on the communicative processes involved in the understanding of graphical symbols. Barnard and Marcel (1984) provide an excellent exposition of this approach. It is important not to focus on the representation of objects for their own sake but to provide semantic elements which communicate the appropriate message resulting in the appropriate behavioural response of the user. The two symbols shown in Figure 2 illustrate the point:
Figure 2. Alternative proposals, "Boiler empty" (left) and "Fill boiler" (right)

A set of symbols was required for a domestic iron with a separate water container (boiler). The "Boiler empty" symbol was designed to complement several other symbols which depicted the boiler. However, an analysis of the message to be communicated indicated that the purpose of this symbol was to prompt the user to fill the boiler. This understanding led to the design of the alternative symbol "Fill boiler" which gives a clearer message to the user. The results of testing the timer symbols highlight the need to consider carefully how related groups of symbols can be discriminated from other groups, and how the individual symbols within the groups can be discriminated from each other. The separate needs of designing for first time comprehension and later recognition also need to be addressed by the appropriate use of graphical and semantic elements. More detailed guidelines, including guidelines for the management of the development process, can be found in Horton (1994).
References
Barnard, P. and Marcel, T. 1984, Representation and understanding in the use of symbols and pictograms. In Easterby, R.S. and Zwaga, H.J.G. (eds) Information Design 1984, (John Wiley and Sons, Chichester), 37–75
ETSI Technical Report ETR 070, June 1993, The Multiple Index Approach (MIA) for the evaluation of pictograms (European Telecommunication Standards Institute, Sophia Antipolis)
Horton, W.K. 1994, The Icon Book: Visual Symbols for Computer Systems and Documentation (John Wiley & Sons, New York)
THE UK HUMAN FACTORS DEFENCE STANDARD: PAST, PRESENT AND FUTURE

Roger S Harvey
Systems Psychology Group Centre for Human Sciences DERA Farnborough Hants GU14 6TD
This paper provides a historical review of the development of the United Kingdom Defence Human Factors Standard, DEF STAN 00–25. This can trace its origins to the early 1960s when, as a joint Royal Naval Personnel Research Committee/Medical Research Council (RNPRC/MRC) handbook for designers, the text was intended for use within the Royal Navy and to be a summary of data and guidelines. Further developments of this early venture led to what is now known as DEF STAN 00–25, with tri-Service applications, and initially with a list of Parts which deliberately mirrored those of the RNPRC/MRC handbook. At the end of 1997 the text of this and many other DEF STANs was made available on the Internet, thus making the knowledge widely available in digital form.
Early beginnings The Navy Handbook From the Second World War onwards the UK Ministry of Defence has attached significance to the role that ergonomics can play in the evolution of equipment matched to the cognitive and physical demands of users and maintainers. In recent years substantial programmes and methodologies such as MANPRINT (initiated within the USA) and Human Factors Integration (HFI, a UK programme derived in part from the core of MANPRINT methodologies and activities) have provided the necessary impetus and encouragement for system designers and engineers. However during the first three decades after the Second World War the role played by a number of enthusiastic, committed, psychologists and human factors scientists was crucial in building upon the foundations provided by the then MOD Research Establishments such as the Clothing Equipment and Physiological Research Establishment (CEPRE), the RAF Institute of Aviation Medicine (RAF IAM), and the Royal Aircraft Establishment (RAE). These Establishments, other smaller units, and their successors such as the Army Personnel Research Establishment (APRE), have provided a continuing commitment to ergonomics and human factors culminating in the amalgamation, during the last two years, of many of their constituent elements to form the Centre for
Human Sciences (CHS) as a sector of the Defence Evaluation and Research Agency (DERA). In UK MOD headquarters during the 1960s and 1970s a small number of psychologists, including Hugh Stockbridge, Edward Elliot and Ken Corkindale, provided the necessary leadership and encouragement within the lengthy corridors of Old Admiralty Building and Main Building, so as to secure continual funding and impetus for a number of the elements of those early units and establishments. The combination of MOD (intra-mural) research, and extra-mural funding for universities, for example, ensured the continuation of the British heritage of the application of ergonomics within defence equipment design. Hand in hand with this background of activity it was clear that the provision of clearly written technical documentation would add the necessary human factors data and guidance that designers and engineers needed for their tasks within Industry. In the 1960’s Hugh Stockbridge had moved from the wooden huts of CEPRE Farnborough to the corridors of the Old Admiralty Establishment in Spring Gardens (just off Whitehall) and it was here that a Handbook took shape that was to provide the solid footing for the document which would eventually become the United Kingdom equivalent of US MIL-STD 1472, namely DEF STAN 00–25. Under the auspices of the Royal Naval Personnel Research Committee, and in collaboration with staff of the Medical Research Council, the “Royal Navy Handbook for Designers” was produced during the 1960’s. This took the form of a plastic ring-binder (in an appropriate shade of blue and smartly emblazoned with gold lettering on the cover) containing 12 typewritten chapters, each one devoted to a self-contained topic of human factors and ergonomics data and design guidance. The Handbook was an early success, although it has to be said that the landscape layout of the pages meant that it was considered by some to have “difficult handling characteristics”. The emphasis in the Handbook was upon easily readable data, and checklists of actions. Intended in the first instance for designers and engineers who lacked specialist human factors knowledge, it was successful enough to find its way onto the shelves of many psychologists and ergonomists who found it invaluable as technical summaries and several copies still exist at CHS and other establishments. However, the early success of this venture was temporarily cut short a few years later when funding for a projected second edition could not be found. Paradoxically this was to prove a blessing in disguise, because it provided the necessary impetus for the Senior Psychologist (Naval) to gather together a number of scientists from all three Service human factors units and Establishments in order to lay the groundwork for a future tri-Service document. With the encouragement of Edward Elliott and Ken Corkindale, Hugh Stockbridge became associated with the newly formed editorial planning committee and, since all defence committees had to have a name, suggested its imaginatively inappropriate acronym SCOTSH (Steering Committee for the Tri-Service Handbook). At an early stage it was agreed that the original Navy Handbook would become the model for the contents of the prospective Tri-Service equivalent, and MOD Directorate of Standardization confirmed that the final document would form a new Defence Standard, DEF STAN 00–25 Human Factors for Designers of Equipment.
The Present The development of DEF STAN 00–25 And so the scene was set for the conversion of the Navy Handbook into what is now the somewhat more garish orange covered Directorate of Standardization—sponsored DEF STAN 00–25 booklets, covering a substantially greater range of technical information than the original Handbook. Over the next 15 years Hugh's dream of a stand-alone Defence Standard devoted to human factors and ergonomics data and design guidelines was gradually implemented, culminating in the publication of the last of 12 Parts of DEF STAN 00–25 with a structure deliberately based on the original Navy Handbook. In the late 1970s the author was invited to join the editorial subcommittee of SCOTSH chaired by Dr Maurice Elwood of APRE, and later succeeded him in this position. This subcommittee then took over all responsibility for commissioning and editing of Parts, and later formed up as a Directorate of Standardization Committee with a descriptor which can only be described as prosaic (El8) when viewed alongside Hugh's imaginative SCOTSH. Over the ensuing years Hugh's plan was slowly put into place and for those who are unfamiliar with the contents they are given in Table 1 below, together with the more recent Part 13.

Table 1. DEF STAN 00–25 Current Contents
In the early days authors were drawn almost exclusively from University Departments; for example the late John Spencer was the author of the first edition of Parts 6 and 7. In later years a number of companies responded to invitations to tender for authorship; BAe Sowerby Research Centre provided the authors of Parts 10 and 12, whilst System Concepts Ltd provided the authors for the revised editions of Part 6 and Part 7. The protracted gestation period of some 10 years for all Parts was due principally to the limited availability of funding in each FY. However there were (fortunately) rare instances when an early draft did not quite seem to "gel" and substantial re-writing delayed the original timetables. The original draft of Part 5 (Stresses and Hazards) was one such draft which underwent significant editorial changes, including one which caused a certain amount of "black humour" at the time. Within the section on the effects of nuclear detonation on the human body, it was noted that the author had stated that the most serious effect of such
detonation was blindness. Regrettably there seemed to be no mention in any part of the text of the effects of radiation, nor of the likely deaths to be caused by explosion! Few Parts were without their last minute panic at the final proof stage. For Part 7 (Visual Displays) it was the discovery that a diagram of a submarine display was upside down, with the result that the vessel seemed to be diving when it should have been surfacing, and vice versa. For Part 10 (Controls) it was the description of a control which was so garbled that Hugh Stockbridge felt that it was more appropriate for a Zulu spear; the original text had described a particular control as "a ball-shaped knob-knob". Each Part was intended to act as an up-to-date sourcebook of data and guidance for designers of defence materiel. Then, as now, three guiding principles governed the authors. Firstly, the Parts were NOT intended to be voluminous textbooks in their own right—this was undoubtedly the most difficult principle to adhere to and was the cause of much grief at the editorial stages of several Parts! Secondly, particular care had to be taken to ensure the applicability of each Part to all three Services. Finally, the primary readership was designers representing a wide spectrum of technical background and knowledge, but who were NOT assumed to be human factors specialists. The use of this Standard has now become common within the defence sector, but it finds increasing application within the civil sector of industry, both here in the UK and elsewhere. When compared with MIL-STD 1472E, for example, there is clearly some duplication of approach although naturally there is substantial tailoring to local markets. However it is interesting to note that increasing numbers of requests for DEF STAN 00–25 now come from the USA. Recent requestors have included the US Coastguard, and the Ford Motor Co Research Centre, Michigan. Until very recently the text of DEF STAN 00–25 has been published exclusively via the traditional medium of paper, although it is interesting to note that some 10 years ago MOD gave approval to two defence contractors to produce hypertext versions for internal Company use. Undoubtedly the future lies in other media, as the final section of this paper indicates.
The digital future During the last 5 years Assistant Director/Standardization (AD/Stan) has been examining the feasibility of making Defence Standards available digitally. A number of Defence Standards have been available on CD-ROM for some time, but the exponential growth in the application and use of the Internet has forced a recent examination of what would have been a decidedly novel means of dissemination to the early authors and editors of the Naval Handbook. During 1996 plans were made to launch UK Defence Standards on the Internet, through the medium of the AD/Stan pages on the World Wide Web. These plans have recently been confirmed and it is hoped that UK Defence Standards will be accessible on the Internet shortly after December 1997.
Dedication This short paper was written as a contribution to a session at the Ergonomics Society 1998 Annual Conference held in memory of the late Hugh Stockbridge, Honorary Fellow of the Ergonomics Society.
Acknowledgement The author would like to thank his colleague Ian Andrew, and Donald Anderson, for their initial encouragement to submit this paper, and for their helpful memories contributed during the drafting process. Any errors remain those of the author. © Crown Copyright 1998
STEPHEN PHEASANT MEMORIAL SESSION: THE CONTRIBUTION OF ERGONOMICS TO THE UNDERSTANDING AND PREVENTION OF MUSCULOSKELETAL DISORDERS
STEPHEN PHEASANT (1949–1996) MEMORIAL LECTURES The Contribution of Ergonomics to the Understanding and Prevention of Musculoskeletal Disorders
Sheila Lee 7 Antrim Grove London NW3 4XP
David Stubbs Centre for Health Ergonomics EIHMS, University of Surrey Guildford, Surrey GU2 5XH
INTRODUCTION As a scientist Stephen Pheasant believed in the common sense of ergonomics. He cared passionately about the design of work and about the physical comfort, psychological symbiosis and job satisfaction of the working man/woman. He believed that the spirit of the worker should be reinforced by work rather than depleted by it, as he had so often found to be the case. Stephen was a 'hands on' ergonomist. In addition to employing the utmost academic rigour, he returned time and again to the dissecting room, where he would examine specimens at first hand to try and discover where the connections may lie between the working actions and the resulting injury. One of his common phrases was 'God is in the details'. His profound knowledge of anatomy led him to research and explore the anatomical—biomechanical—physiological details by which ergonomic injuries occurred. He would never tire of discussing the subject with colleagues, clinicians and students. The demands of consumerism concerned him. He was concerned that high production methods frequently caused musculoskeletal injury and psychological stress. He was concerned at the extent of old fashioned authoritarian styles of management. But what concerned and upset him most was (to use his own words) that when people became injured they should be treated like old pieces of broken factory or office machinery, and simply dismissed. The importance that Stephen attached to the topic of musculoskeletal disorders is reflected in the following papers, which address the contribution of ergonomics to our understanding and prevention of these conditions. They emphasise both physical and psychosocial work factors. In particular the papers highlight the importance of combinations of exposures to both sets of factors in the manifestation of musculoskeletal disorders. A series of challenges are presented for ergonomics which are considered a suitable addition to Stephen's legacy. They also serve as a testament to Stephen, who will always be remembered with affection, respect and regard, and for many as an inspiring friend.
THE CONTRIBUTION OF ERGONOMICS TO THE UNDERSTANDING AND PREVENTION OF MUSCULOSKELETAL DISORDERS: THE ROLE OF PHYSICAL ASPECTS Peter Buckle Reader Robens Centre for Health Ergonomics EIHMS, University of Surrey Guildford, Surrey, GU2 5XH
The vast contemporary research literature reflects current concern over musculoskeletal disorders. As this research has become available it has become apparent that simple theories of “trauma” to tissues, followed by pain and then recovery, do not adequately explain the observed facts. Similarly, the relationship between so-called “physical” factors and the development of, in particular, back and upper limb disorders, is complex and does not allow a simple prediction of the effects that changes in physical exposure will have on musculoskeletal outcomes. The failure to “explain” the origins and aetiology of these disorders through an examination of physical factors has led to an exploration of psychological, organisational and sociological factors. This research has also proved, so far, to be of only limited help in explaining the phenomena. It can be argued that, through an increasing focus on these elements, the relative importance of physical factors has been diminished. The epidemiological evidence in support of work-related, and particularly physical factors and combinations of physical factors, in the development of a number of musculoskeletal disorders is, nevertheless, strong. It can be argued that ergonomic interventions that reduce such exposures may still prove to be the most effective means of reducing the prevalence of these disorders.
Introduction The role of physical and psychosocial factors in the development and expression of musculoskeletal disorders has been well documented, with physical factors being identified in many early documents and texts. The increasing contemporary research literature reflects the concern over these disorders. As this research has become available, it has become apparent that simple theories of “trauma” to tissues, followed by pain and then recovery, do not adequately explain the observed facts (e.g. Armstrong et al, 1993).
Similarly, the relationship between so-called “physical” factors and the development of, in particular, back and upper limb disorders, is complex and does not allow a simple prediction of the effects that changes in physical exposure will have on musculoskeletal outcomes. The failure to “explain” the origins and aetiology of these disorders through an examination of physical factors has led to an exploration of psychological, organisational and sociological factors (e.g. Bongers et al, 1993, 1995; Sauter and Swanson, 1996; Buckle, 1997a). This research has also proved, so far, to be of only limited help in explaining the phenomena.
Epidemiological Evidence It can be argued that, through an increasing focus on these elements, the relative importance of physical factors has been diminished. The epidemiological evidence in support of work-related, and particularly physical factors and combinations of physical factors, in the development of a number of musculoskeletal disorders is strong (NIOSH, 1997; Buckle, 1997b). The relative importance of these factors is selectively presented in Table 2 (see NIOSH, 1997). From such an examination, it can be argued that ergonomic interventions that reduce such exposures may still prove to be the most effective means of reducing the prevalence of these disorders.
Evidence from Intervention Studies Ergonomists have consistently failed to provide sufficient evidence of the efficacy of work system interventions. Whilst the major challenges are methodological, if they are to be overcome then simplistic notions of physical exposure must be replaced with an understanding of the components of physical exposure and their interactions. Simple interventions are implemented routinely by many practitioner ergonomists. Few of these appear to be evaluated with any degree of rigour, and only rarely are control data gathered to allow the relative effect of changes to be obtained. Thus it is not possible to state whether interventions based on physical factors alone are sufficient or whether more complex interventions are required. Interventions at a societal level (e.g. European Union Directives) may be of only limited benefit without greater evidence of the potential benefits. For example, a recent HSE review of the Manual Handling at Work Directive showed that of the organisations surveyed only 30% responded, of whom only 30% had heard of the directive; since only about 9% (30% of 30%) of the organisations surveyed are therefore known to have heard of it, it is possible that around 90% of organisations have made little headway in addressing these problems.
Conclusions This paper reiterates the need to utilise existing data on physical factors more effectively in the workplace and to document the results. It does not underestimate the importance of other factors (e.g. psychosocial and individual factors) in the manifestation of these disorders, but suggests that understanding the complex interactions between variables and within individuals is still a distant goal.
Table 1 Historical Perspective of Exposure and Back Pain
Table 2 NIOSH (1997) Summary Table
References Armstrong, T.J., Buckle, P.W., Fine, L.J., Hagberg, M., Jonsson, B., Kilbom, A., Kuorinka, I., Silverstein, B.A., Sjogaard, G., Viikari-Juntura, E. 1993, A conceptual model for work-related neck and upper-limb musculoskeletal disorders, Scand J Work Environ Health, 19, 73–84 Bongers, P.M., de Winter, C.R., Kompier, M.A.J., & Hildebrandt, V.H. 1993, Psychosocial factors at work and musculoskeletal disease, Scandinavian Journal of Work Environment and Health, 19, 297–312 Buckle, P. 1997a, Upper limb disorders and work: the importance of physical and psychosocial factors, Journal of Psychosomatic Research, 43, 1, 17–25 Buckle, P. 1997b, Work related upper limb disorders, British Medical Journal, 315, 1360–3 HSE 1990, Work related upper limb disorders: a guide to prevention Linton, S.J. 1990, Risk factors for neck and back pain in a working population in Sweden, Work and Stress, 4, 41–49 NIOSH 1997, Musculoskeletal disorders and workplace factors, DHHS (NIOSH) Publication No. 97–141 Sauter, S.L. & Swanson, N.G. 1996, An ecological model of musculoskeletal disorders in office work. In S.D.Moon & S.L.Sauter (eds.), Beyond Biomechanics: Psychosocial Aspects of Musculoskeletal Disorders in Office Work (Taylor and Francis, London), 1–22
THE COMBINED EFFECTS OF PHYSICAL AND PSYCHOSOCIAL WORK FACTORS Jason Devereux
Research Fellow Robens Centre for Health Ergonomics EIHMS, University of Surrey Guildford, Surrey, GU2 5XH
Physical and psychosocial work factors have been implicated in the complex aetiology of musculoskeletal disorders. Psychosocial work factors differ from individual psychological attributes in that they are individual subjective perceptions of the organisation of work. An ergonomic epidemiological study was undertaken to determine the impact of different combinations of physical and psychosocial work risk factors upon the risk of musculoskeletal disorders. Physical work factors are more important determinants of recurrent back and hand/wrist problems than psychosocial work factors. The greatest risk of musculoskeletal problems occurs when workers are exposed to both physical and psychosocial work risk factors. Ergonomic strategies should, therefore, aim to reduce physical and psychosocial risk factors in the workplace.
Introduction “We are fiercely competitive in our consumption of goods and services; and our sense of self-worth is tied up in our use of status symbols. This lies at the root of our stress levels.” (Pheasant, 1991) Stephen Pheasant realised that humanity has created a ‘milieu’ of self-imposed stress by increasing the demand for goods and for services. Satisfying the demand has been formalised into work organisation goals, culture and beliefs, but at what cost to the individual worker, from whom, ironically, the demand originated? The work organisation imposes physical and psychosocial stressors upon the individual. The physical stressors originate from environmental, manual handling and other physical demands, and the psychosocial stressors originate from perceptions of the organisation and the way the work is organised. Models have been proposed that describe the probable pathways by which these work factors can impose a threat such that symptoms, signs and diagnosable pathologies of musculoskeletal disorders can ensue (Bongers et al, 1993; Sauter and Swanson, 1996; Devereux, 1997a). The model by Devereux (1997a) proposed that
the individual perceptions of the work organisation (e.g. the social support, the control afforded by the work and the demands imposed) may be influenced by the capacity to cope with such psychosocial stressors. Individual capacity may be affected by a number of factors including previous injury, cumulative exposure to work risk factors, age, recovery, and beliefs and attitudes towards pain. Beliefs, attitudes and coping skills have collectively been referred to as psychosocial factors by some (Burton, 1997), but to minimise confusion they are referred to as individual psychological attributes in this text. The relationship between risk factors and musculoskeletal problems may be dependent on the definition of the latter (Leboeuf-Yde et al, 1997) and also on the interrelationship between risk factors (Evans et al, 1994). For example, many studies have simply considered the effects of either physical or psychosocial work factors upon the risk of musculoskeletal disorders. Some studies have considered both sets of factors but have assumed the relationships between them to be independent when in reality such factors co-exist. The effect that the co-existence of these risk factors has upon the risk of musculoskeletal disorders has not been adequately investigated (Devereux, 1997a; Devereux, 1997b). An ergonomic epidemiological investigation was therefore conducted to examine the effects upon the musculoskeletal system of physical and psychosocial work factors acting in different combinations, in a work organisation which employed workers engaged in manual handling, driving and sedentary office work. Ethical permission for the cross-sectional study was obtained from the University of Surrey Committee on Ethics.
Methods Company work sites from around the UK were randomly selected to participate in the study. The mixed-gender study population (N=1514) was given a self-report questionnaire that included information on personal data and demographics, physical and psychosocial work factors, and musculoskeletal symptoms. Questions on physical and psychosocial work factors had been validated elsewhere (Wiktorin et al, 1993; Hurrell and McLaney, 1988). Most of the physical scales had a kappa coefficient greater than or equal to 0.4 (except for bent-over posture and trunk rotation) and all the psychosocial scales had acceptable alpha coefficients (0.65–0.95). The musculoskeletal symptom questionnaire covered the head/neck, trunk and upper and lower limbs. Items for the lower back had been validated against a physical examination using a symptom classification scheme proposed by Nachemson and Andersson (1982). The kappa values for the 7-day and 12-month prevalence were 0.65 and 0.69 respectively (Hildebrandt et al, 1998). Physical and psychosocial work factors that had been shown in previous epidemiological studies to increase the risk of back disorders by a factor of 2 or greater were selected to classify individual workers into one of four physical/psychosocial exposure groups. For physical exposure the criteria consisted of heavy frequent lifting, or relatively lighter but frequent lifting performed as well as driving. The physical exposures were quantified with respect to level or amplitude and frequency or duration. For psychosocial work factors, mental demands, job control and supervisor and co-worker social support were used to classify workers into low and high exposure groups. Subjects not satisfying the low/high physical and psychosocial criteria were excluded from the analysis. Recurrent back disorder cases were defined as having experienced problems more than 3 times or for longer than one week in the previous year, which were also present within the last 7 days at the time of the survey. These back problems had not been experienced before starting the present job. Univariate Mantel-Haenszel chi-squared statistics were used to test the hypothesis of no association between the exposure criteria variables and recurrent back disorders. Crude and logistic regression analyses provided estimates of the risks associated with exposure to different physical/psychosocial work factor combinations. The potential confounding/modifier effects of age, gender and cumulative exposure (defined as the number of years spent in the present job) were controlled for in the logistic regression.
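The adjusted analysis outlined above can be sketched in code. The following is a minimal illustration only, not the analysis actually run in the study: it assumes Python with pandas and statsmodels, a hypothetical data file and hypothetical column names (recurrent_back, exposure_group, age, gender, years_in_job), and it omits the univariate Mantel-Haenszel tests. Exponentiating the fitted coefficients gives odds ratios for each physical/psychosocial exposure combination relative to the low/low reference group, adjusted for age, gender and cumulative exposure.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical input: one row per respondent with a 1/0 recurrent back disorder
# outcome and an exposure_group label ('low/low', 'high/low', 'low/high', 'high/high').
df = pd.read_csv("survey_responses.csv")

# Logistic regression with the low physical / low psychosocial group as reference,
# adjusting for age, gender and cumulative exposure (years in present job).
model = smf.logit(
    "recurrent_back ~ C(exposure_group, Treatment(reference='low/low'))"
    " + age + C(gender) + years_in_job",
    data=df,
).fit()

odds_ratios = np.exp(model.params)       # adjusted odds ratios
conf_int = np.exp(model.conf_int())      # 95% confidence intervals
print(pd.concat([odds_ratios, conf_int], axis=1))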
Results There were 869 valid responses to the survey (57% of the total study population, N=1514). Non-respondents did not differ with respect to gender, age or cumulative exposure. Recurrent back disorders were prevalent in 22% of the valid survey responses. Of the 869 valid questionnaire responses, 638 workers were classified into low/high physical and psychosocial exposure groups. The gender, age and cumulative exposure of the exposure-stratified population and the valid questionnaire population did not differ. The univariate analysis for recurrent back problems showed that heavy frequent lifting (>16 kg, ≥1–10 times per hour) and lighter frequent lifting (6–15 kg, ≥1–10 times per hour) performed as well as driving for half the working day (p<0.001) were associated with recurrent back disorders. A forward bent-over posture of more than 60 degrees for greater than a quarter of the working day (p<0.05) was also found to be significantly associated with recurrent back problems. Trunk rotation of 45 degrees for greater than a quarter of the working day increased the risk of experiencing recurrent back disorders but was not statistically significant at the 5% level. A perceived high workload was associated with recurrent back problems (p<0.05). Mental demands, job control, and supervisor and co-worker social support factors were not found to be statistically associated with recurrent back disorders when considered independently.
Figure 1. The risk of recurrent back problems for different exposure groups
Figure 1 shows the combined risk effects of physical and psychosocial work risk factors associated with recurrent back disorders. High exposure to physical work risk factors had a greater impact upon the risk of recurrent back problems than high exposure to psychosocial work risk factors. For workers highly exposed to physical work risk factors and with relatively low exposure to psychosocial work risk factors, the risk of experiencing recurrent back problems was approximately 3 times greater than for workers exposed to a lesser extent to both physical and psychosocial work risk factors. The risk increased to approximately 3.5 times for workers highly exposed to both physical and psychosocial work risk factors compared to those exposed to a lesser extent to both sets of factors. A similar exposure-risk relationship was observed for self-reported symptoms in the hands/wrists experienced both within the last 7 days and within the last 12 months at the time of the survey, using the same exposure criteria. The risk to the hands/wrists due to high exposure to both physical and psychosocial work risk factors was approximately seven times greater than that due to exposure to these risk factors to a lesser extent (OR 6.94, 95% CI 3.79–12.82). An exposure-risk relationship was not observed for the same definition in the neck region. After controlling for the effects of age, gender and cumulative exposure, a similar exposure-risk relationship was observed for each exposure group and recurrent back disorders except for the low physical-high psychosocial exposure group, for which the risk was equal to unity.
Discussion Exposure to a combination of psychosocial work risk factors seems to have a greater impact on the level of risk than individual psychosocial work factors considered separately (Bongers and Houtman, 1995). The greatest risk of experiencing musculoskeletal disorders was associated with high exposure to both physical and psychosocial work risk factors, but physical work risk factors were more important determinants than psychosocial work risk factors for recurrent back and hand/wrist disorders. A Swedish epidemiological study also showed that a combination of heavy lifting and a poor psychosocial work environment increased the risk of back pain and neck pain compared to exposure to neither work factor (Linton, 1990). However, that study design did not permit the analysis of the other possible exposure combinations. In the present study, associations with neck problems were not observed. Workers in the low-exposure groups performed tasks that were associated with neck and shoulder disorders, so they were not truly unexposed with respect to this anatomical region. As a result, it could not be determined whether the combination of high exposure to physical and psychosocial work risk factors increased the risk of neck disorders. A cross-sectional study does not allow exposures to be measured before the onset of musculoskeletal problems, and so reporting biases may have been present for exposures and self-reported symptoms. The influence of these biases was controlled by assessing current exposures and recently experienced symptoms. Self-reported exposures were also tested using observation, instrumentation and interview methods, and it was found that workers could provide accurate reports for the exposure criteria (Devereux, 1997a). The cross-sectional study was also limited in examining exposure-disease causation, but temporal data on the outcome measure provided strong evidence that the exposures currently experienced were associated with the development of back problems. The combined risk effects were not limited to the back and have also been shown to be present for the hands/wrists.
Conclusions The relationship between physical and psychosocial work risk factors is complex and is not fully understood but reduction in exposure to both sets of factors is needed in risk prevention strategies for musculoskeletal disorders. Ergonomic interventions should be targeted at the organisation of work and the individual worker to reduce the psychosocial work stressors and also the physical stressors. The consumption of goods and services remains unabated and will be driven to higher levels in the years to come. Work organisations should strive to achieve a balance between satisfying the demand and maintaining a healthier workforce.
References Bongers, P.M., de Winter, C.R., Kompier, M.A.J., & Hildebrandt, V.H. 1993, Psychosocial factors at work and musculoskeletal disease, Scandinavian Journal of Work Environment and Health, 19, 297–312 Bongers, P.M. and Houtman, I.L.D. 1995, Psychosocial aspects of musculoskeletal disorders. Book of Abstracts, Proceedings of the Prevention of Musculoskeletal Disorders Conference (PREMUS 95), 24–28 September, Montreal, Canada, (IRSST, Canada), 25–29 Burton, A.K. 1997, Spine update—Back injury and work loss: Biomechanical and psychosocial influences, Spine, 22, 2575–2580 Devereux J.J. 1997a, A study of interactions between work risk factors and work related musculoskeletal disorders, Ph.D. Thesis. University of Surrey Devereux, J.J. 1997b, Back disorders and manual handling work—The meaning of causation, The Column, 9, 14–15 Evans, G.W., Johansson, G., & Carrere, S. 1994, Psychosocial factors and the physical environment: Inter-relations in the workplace. In C.L.Cooper & I.T.Robertson (eds.), International review of industrial and organizational psychology, (John Wiley, Chichester), 1–30 Hildebrandt, V.H., Bongers, P.M., Dul, J., Van Dijk, F.J.H., & Kemper, H.C.G. 1998, Validity of self-reported musculoskeletal symptoms, Occupational and Environmental Medicine, In press Hurrell, J. & McLaney, M. 1988, Exposure to job stress-a new psychometric instrument, Scandinavian Journal of Work Environment and Health, 14 Supplement 1, 27–28 Leboeuf-Yde, C., Lauritsen, J.M., & Lauritzen, T. 1997, Why has the search for causes of low back pain largely been nonconclusive?, Spine, 22, 877–881 Linton, S.J. 1990, Risk factors for neck and back pain in a working population in Sweden, Work and Stress, 4, 41–49 Nachemson, A.L. & Andersson, G.B.J. 1982, Classification of low-back pain, Scandinavian Journal of Work Environment and Health, 8, 134–136 Pheasant, S. 1991, Ergonomics, Work and Health (Macmillan Press, London) Sauter, S.L. & Swanson, N.G. (1996). An ecological model of musculoskeletal disorders in office work. In S.D.Moon & S.L.Sauter (eds.), Beyond Biomechanics: Psychosocial Aspects of Musculoskeletal Disorders in Office Work (Taylor and Francis, London), 1–22 Wiktorin, C., Karlqvist, L., & Winkel, J. 1993, Validity of self-reported exposures to work postures and manual materials handling, Scandinavian Journal of Work Environment and Health. 19, 208–214
STEPHEN PHEASANT MEMORIAL SYMPOSIUM: THE ROLE OF PSYCHOSOCIAL FACTORS A Kim Burton
Spinal Research Unit, University of Huddersfield, c/o 30 Queen Street, Huddersfield HD1 2SP
Low back trouble affects a majority of workers at some time in their lives; many recover, but some become significantly disabled. The notion of achieving primary control through ergonomic intervention, based on biomechanics principles, whilst intuitively attractive, has so far been unhelpful. Biomechanics/ergonomic considerations can sometimes explain the first recalled onset of low back pain, but there is little evidence that secondary control based solely on these principles will influence the risk of the progression to chronic disability. More promising are intervention programs that take account of the psychosocial influences surrounding disability. Ergonomics can assist by ensuring that workplaces are comfortable and accommodating, for both fit and back-troubled workers.
Introduction That low back trouble (LBT) is an increasing problem in industrialised society is axiomatic, despite the efforts of ergonomists, clinicians and legislators. There is a dichotomy: ergonomists and biomechanists strive to reduce physical stress at the workplace with the intent of lowering the risk of musculoskeletal problems, yet clinicians and psychologists are suggesting that rehabilitation of the back-injured worker should involve not only activity, but physical challenges to the musculoskeletal system.
Background Epidemiology may point to a link between physically hard work or whole body vibration and back pain, but this link is not universal; seemingly much depends on the definitions used for back pain and workload. Similarly, reports of an association between heavy work and absenteeism are not entirely consistent. However, experimental biomechanical evidence does suggest that strenuous work is likely to be detrimental; in vitro experiments simulating physiological occupational loads can produce fatigue damage to numerous spinal tissues (Adams and Dolan, 1995; Brinckmann et al. 1988). Thus, a reduction of occupational loads should limit work-related back trouble, but despite the gradual reduction in occupational physical stressors back pain has not decreased; in fact, disability due to LBT has increased. Back pain can be as prevalent among sedentary workers as among manual workers, but heavy jobs do seem to be associated with increased work loss. Work-related LBT should be viewed against the high background level of reporting of a symptom which has an undetermined pathology, a propensity for recurrence and a variable tendency to progress to disability. The identification of risk factors is problematic, and it is difficult to be certain that a particular job is involved in causation. A consequence of the symptoms is often an inability (or reluctance) to perform activities of daily living as well as work activities. The following discussion explores some interrelationships between damage, symptom reporting and disability.
Spinal damage The intervertebral disc is presumed to be at greatest risk of damage from physical stress. Disc degeneration is influenced only modestly by work history; the greatest proportion of the explained variation in degeneration can be accounted for by genetic influences, though age does have some influence (Battié et al. 1995). When matched for age, sex and work-related factors, disc herniations were found in 76% of an asymptomatic control group compared with 96% of the symptomatic group; the presence of symptoms was related to neural compromise and psychosocial aspects of work, but not to the exposure to physical stressors (Boos et al. 1995). A new method for quantifying overload damage from radiographs has enabled comparison of cohorts exposed to heavy work with those exposed to light work; irreparable damage was associated only with jobs entailing excesses of loading or vibration, suggesting that current regulations are adequate protection against overload damage (Brinckmann et al. 1998). Other structures may also sustain damage. Deficient intrinsic spine muscles or a lack of motor control may increase the risk of straining muscles or ligaments (Cholewicki and McGill, 1996), but recovery should be fairly rapid. Irrespective of whether damage to spinal structures can be identified or quantified, there is no doubt that workers do get painful backs, and some will believe it is their work which is to blame. A study of sick-listed blue-collar workers found that 60% of patients believed that work demands had caused their back trouble, but neither an assessment of workload (e.g. lifting, bending) nor calculated compression loads predicted the rate of return to work or sick-leave during follow-up (Lindstrom et al. 1994).
Injury, recurrence and work loss Some data suggest that the risk of LBT is associated with the dynamics of lifting, and one study has linked epidemiological findings with quantitative biomechanical findings, though causation was not established (Marras et al. 1993). Experienced industrial workers seemingly have a reduced risk of LBT compared with inexperienced workers, but this may be related more to muscular coordination aiding spinal stability than to lowered spinal loads (Granata et al. 1996).
Workers in similarly demanding occupations can have varying symptomatology. A study of nurses in Belgium and The Netherlands has shown a significantly lower prevalence of back trouble (and other musculoskeletal complaints) in the Dutch nurses, despite the fact that their average workload was substantially greater than that of their Belgian counterparts. Overall, symptoms and work loss were not related to workload. The Dutch nurses differed strikingly on a range of psychosocial variables; they were less depressed and significantly more positive about pain, work and activity (Burton et al. 1997). A large general population study (Croft et al. 1995) has found that new episodes of LBT are more likely for those who are psychologically distressed, even for first onsets. An industrial study (Bigos et al. 1991) found that reported first injuries were not related specifically to job demands, rather to psychosocial factors such as low job satisfaction. Police officers in Northern Ireland have proved useful for studying first-onset LBT; they compulsorily wear body armour weighing >8 kg. Compared with an English police force without body armour, they showed a reduced survival time to first-onset. It was also found that working in vehicles comprised a separate risk, but the effect of exposure to armour and vehicles was not additive. The proportion of officers with persistent (chronic) back complaints did not depend on the length of exposure since first-onset; rather, chronicity was associated with psychosocial factors (distress and blaming work) (Burton et al. 1996). There is little support for a relationship between recurrence and work demands. The best predictor of future trouble seems to be a previous history, with perception of work demands being more important than objective measurement (Troup et al. 1987), and dissatisfaction with work being a significant factor. The term re-injury may be a misnomer (Bigos et al. 1991). Workers with current LBT have been shown to have lower scores for job satisfaction and social support but, surprisingly, absenteeism and work heaviness were not related to these parameters (Symonds et al. 1996; Burton et al. 1997). But other attitudes and beliefs do seem to be relevant. Psychosocial factors such as negative beliefs about the inevitable consequences of LBT, inadequate pain control strategies, fear-avoidance beliefs and the belief that work was causative have all been found to relate to absenteeism. The relationship between attribution of cause, job satisfaction and pain perception is complex, but a simple educational intervention program (comprising workplace broadcasting of a pamphlet stressing the benign nature of LBT, the importance of activity and the desirability of early work return) is capable of creating a positive shift in beliefs, with a concomitant reduction in extended absence (Symonds et al. 1995). There is accumulating evidence that early return to the same task is beneficial and does not heighten the risk of recurrence of symptoms (or do further damage). A three-year follow-up of occupational musculoskeletal injuries (including LBT) found that those whose workloads had been reduced did not report fewer problems (Kemmlert et al. 1993). In fact, a successful rehabilitation program for patients with subchronic back pain has advocated early return to unrestricted duties as part of a combined graded activity/behavioural therapy approach (Lindstrom et al. 1992).
Clinical studies in workers’ compensation back pain patients have found that delayed functional recovery was associated with psychosocial factors more than with perceived task demand (Hadler et al. 1995), and that longer spells off work were associated with a poor outcome (Lancourt and Kettelhut, 1992).
The reluctance to confront normal physical challenges seen in back-disabled workers has been termed activity intolerance, which is variously linked to individual response to pain, the belief that a specific injury must be the cause of the pain, and behavioural roles such as suffering. The question obviously arises as to the origin of the various relevant psychosocial traits. There is clinical evidence that psychological profiles predictive of chronicity are present very early in the course of the back pain experience (Burton et al. 1995); seemingly they are not a result of prolonged pain.
Effectiveness of ergonomic intervention Supportive evidence for the belief that ergonomic intervention will reduce the impact of occupational low back pain is not compelling. The only intervention which has been formally evaluated is worker training in manual handling techniques; whilst lifting techniques can be improved, the effect on injury rates has not been clearly demonstrated (Smedley and Coggon, 1994). A recent rigorous evaluation of a ‘back school’ approach to injury prevention found that the programme did not reduce the rate of injury, time off work, or rate of reinjury, even though the subjects’ knowledge of safe behaviour was increased (Daltroy et al. 1997).
Summary On balance, there is evidence to support the notion that biomechanics-based ergonomic improvements to the workplace have some potential to limit first-time back injury; they should therefore be deployed where practicable. The possible role of ergonomics in reducing recurrence rates seems limited at best; conversely, there is no convincing evidence that continuance of work is detrimental in respect of disability. A proportion of workers with back pain, having inappropriate beliefs about the nature of their problem and its relationship to work, will develop fear-avoidance behaviours because of inadequate pain coping strategies; they then begin to function in a disadvantageous way and drift into chronic disability. This issue may best be tackled by a combination of organisational and psychosocial interventions intended to make the workplace comfortable and accommodating (Hadler, 1997).
References Adams, M.A. and Dolan, P. (1995) Recent advances in lumbar spinal mechanics and their clinical significance. Clin Biomech 10, 3–19. Battié, M.C., Videman, T., Gibbons, L., Fisher, L., Manninen, H. and Gill, K. (1995) Determinants of lumbar disc degeneration: a study relating lifetime exposures and MRI findings in identical twins. Spine 20, 2601–2612. Bigos, S.J., Battié, M.C., Spengler, D.M., Fisher, L.D., Fordyce, W.E., Hansson, T., Nachemson, A.L. and Wortley, M.D. (1991) A prospective study of work perceptions and psychosocial factors affecting the report of back injury . Spine 16, 1–6. Boos, N., Reider, V., Schade, K., Spratt, N., Semmer, M. and Aebi, M. (1995) The diagnostic accuracy of magnetic resonance imaging, work perception, and psychosocial factors in identifying symptomatic disc herniations. Spine 20, 2613– 2625. Brinckmann, P., Biggemann, M. and Hilweg, D. (1988) Fatigue fracture of human lumbar vertebrae. Clin Biomech 3 (Suppl. 1), s1-s23.
Brinckmann, P., Frobin, W., Biggemann, M., Tillotson, M. and Burton, K. (1998) Quantification of overload injuries to thoracolumbar vertebrae and discs in persons exposed to heavy physical exertions or vibration at the work-place. Part II. Occurrence and magnitude of overload injury in exposed cohorts. Clin Biomech 13 (Supplement), (in press). Burton, A.K., Tillotson, K.M., Main, C.J. and Hollis, S. (1995) Psychosocial predictors of outcome in acute and subchronic low back trouble. Spine 20, 722–728. Burton, A.K., Tillotson, K.M., Symonds, T.L., Burke, C. and Mathewson, T. (1996) Occupational risk factors for the first-onset of low back trouble: a study of serving police officers. Spine 21, 2612–2620. Burton, A.K., Symonds, T.L., Zinzen, E., Tillotson, K.M., Caboor, D., Van Roy, P. and Clarys, J.P. (1997) Is ergonomics intervention alone sufficient to limit musculoskeletal problems in nurses? Occup Med 47, 25–32. Cholewicki, J. and McGill, S.M. (1996) Mechanical stability of the in vivo lumbar spine: implications for injury and chronic low back pain. Clin Biomech 11, 1–15. Croft, P.R., Papageorgiou, A.C., Ferry, S., Thomas, E., Jayson, M.I.V. and Silman, A.J. (1995) Psychologic distress and low back pain: evidence from a prospective study in the general population. Spine 20, 2731–2737. Daltroy, L.H., Iversen, M.D., Larson, M.G., Lew, R., Wright, E., Ryan, J., Zwerling, C., Fossel, A.H. and Liang, M.H. (1997) A controlled trial of an educational program to prevent low back injuries. New England Journal of Medicine 337, 322–328. Granata, K.P., Marras, W.S. and Kirking, B. (1996) Influence of experience on lifting kinematics and spinal loading. In: 20th Annual Meeting, Georgia Tech, Atlanta, USA: American Society of Biomechanics. Hadler, N.M., Carey, T.S. and Garrett, J. (1995) The influence of indemnification by workers’ compensation insurance on recovery from acute backache. Spine 20, 2710–2715. Hadler, N.M. (1997) Workers with disabling back pain. New Eng J of Med 337, 341–343. Kemmlert, K., Orelium-Dallner, M., Kilbom, A. and Gamberale, F. (1993) A three-year follow-up of 195 reported occupational over-exertion injuries. Scand J Rehabil Med 25, 16–24. Lancourt, J. and Kettelhut, M. (1992) Predicting return to work for lower back pain patients receiving workers compensation. Spine 17, 629–640. Lindstrom, I., Ohlund, C., Eek, C., Wallin, L., Peterson, L. and Nachemson, A. (1992) Mobility, strength and fitness after a graded activity program for patients with subacute low back pain: A randomized prospective clinical study with a behavioral therapy approach. Spine 17, 641–652. Lindstrom, I., Ohlund, C. and Nachemson, A. (1994) Validity of patient reporting and predictive value of industrial physical work demands. Spine 19, 888–893. Marras, W.S., Lavender, S.A., Leurgens, S.E., Rajulu, S.L., Allread, W.G., Farthallah, F.A. and Ferguson, S.A. (1993) The role of dynamic three-dimensional trunk motion in occupationally-related low back disorders: The effects of workplace factors, trunk position and trunk motion characteristics on risk of injury. Spine 18, 617–628. Smedley, J. and Coggon, D. (1994) Will the manual handling regulations reduce the incidence of back disorders? Occup Med 44, 63–65. Symonds, T.L., Burton, A.K., Tillotson, K.M. and Main, C.J. (1995) Absence resulting from low back trouble can be reduced by psychosocial intervention at the work place. Spine 20, 2738–2745. Symonds, T.L., Burton, A.K., Tillotson, K.M. and Main, C.J.
(1996) Do attitudes and beliefs influence work loss due to low back trouble? Occup Med 46, 25–32. Troup, J.D.G., Foreman, T.K., Baxter, C.E. and Brown, D. (1987) The perception of back pain and the role of psychophysical tests of lifting capacity. Spine 12, 645–657.
MUSCULOSKELETAL DISORDERS
INTERPRETING THE EXTENT OF MUSCULOSKELETAL COMPLAINTS Claire Dickinson
HSE, Magdalen House, Trinity Road, Bootle, L20 3QZ.
A number of studies have compared the extent of musculoskeletal complaints in adult working populations using the Nordic musculoskeletal questionnaire. The design of such studies has typically involved cross-sectional cohorts of occupational groups being used as referent populations to each other. The self-reported complaints of aches, pain or discomfort relating to a particular body area lead to a percentage being determined for those reporting positively. The difficulty lies in then deciding when a particular percentage indicates that there is reason for concern and action is needed. The current paper proposes an interpretation system based on the annual prevalence, discusses its limitations and offers suggestions on its future development.
Introduction The Nordic Musculoskeletal Questionnaire is a valuable tool enabling large-scale surveys into the extent of self-reported musculoskeletal complaints (Kuorinka et al, 1987). It has been extensively cited in the technical literature describing the state of occupational populations (e.g. David and Buckle, 1997; Williams and Dickinson, 1997). Following a series of publications of referent data (Ydreberg and Kraftling, 1988), HSE embarked on an evaluation of the questionnaire and produced both long and abridged standardised versions for their own use (Dickinson et al, 1992). Whether using the original, HSE’s or a self-devised questionnaire, a similar form of question has been used to enquire about the extent of troubles such as aches, pains or discomfort in the last year (Annual Prevalence), in the last week (Weekly Prevalence), or that has prevented activity in the last year (Annual Disability). Users have then produced a series of tables showing the percentage of that occupational group reporting positive complaints for nine defined body areas. Many researchers have designed their cross-sectional studies such that the data for the occupational group of interest are compared to one or more control or referent populations. This enables the findings to be interpreted in a wider context. In the absence of an appropriate control group, this approach produces little more than a statement that population X reports more or fewer complaints in a particular body area than populations A, B or C. However, what is usually needed at this point is a form of interpretation and conclusions on the extent of the problem among this occupational group, the part of the body which is particularly affected and, if this is the case, what to do about it. At the current time, employers and researchers seem to lack a basis for deciding on further action, apart from using their personal judgement to establish a threshold (X%) for action.
Action Levels With this in mind, annual prevalence figures from a number of HSE studies of self-reported musculoskeletal complaints were reviewed to see if a simple interpretation system could be devised. This could also be described as an attempt at defining where priority should be allocated given limited resources. The data covered workers operating 64 different systems of work. The occupations included trades or different operating systems in the cotton, ceramic, food processing, construction and garment manufacturing sectors as well as production line assembly, packing and supermarket cashiers. The data from each occupational group were considered separately but in total included 1781 males and 4704 females. Given the size of some populations that were studied, it was not feasible to group within an occupation by separate age groups or any indices such as the length of time in a given job. Table 1 shows three action levels based on the annual prevalence data. The median values plus an arbitrary 10% form the row described as “high”. The median values less 10% form the “medium” row. The “low” row covers the percentages which fall below the medium level. Table 1. % Annual Prevalence Action Levels
Key: N—Neck, Sh—Right and left shoulders, E—Both elbows, WH—Wrist and hands, UB—Upper Back, LB—Lower Back, H—Hips, K—Knees, A—Ankles.
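As a rough illustration of how such action levels could be derived and applied, the sketch below (Python with pandas) assumes a hypothetical table of Nordic-questionnaire-style yes/no responses with hypothetical column names; it is not the procedure used to produce Table 1, which was based on 64 specific systems of work with males and females treated separately.

import pandas as pd

# Hypothetical input: one row per respondent, a 'system_of_work' column and
# one 1/0 column per body area recording a positive annual-prevalence report.
responses = pd.read_csv("nordic_survey.csv")

body_areas = ["neck", "shoulders", "elbows", "wrists_hands", "upper_back",
              "lower_back", "hips", "knees", "ankles"]

# % annual prevalence for each body area within each system of work.
prevalence = responses.groupby("system_of_work")[body_areas].mean() * 100

# Action levels: median across systems of work plus or minus an arbitrary
# 10 percentage points, as described in the text.
medians = prevalence.median()
high_threshold = medians + 10
medium_threshold = medians - 10

def action_level(body_area, observed_prevalence):
    # Classify a newly observed % annual prevalence for one body area.
    if observed_prevalence >= high_threshold[body_area]:
        return "high"
    if observed_prevalence >= medium_threshold[body_area]:
        return "medium"
    return "low"

print(action_level("neck", 50))   # e.g. 50% annual prevalence of neck complaints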
It is possible to compare newly collected data with the values shown in Table 1 and focus on those body areas where high levels are located, in order to establish if there is a problem and identify the body area affected. For example, if a population study established that 50% of the females were self-reporting neck complaints in the last year, then serious attention might be given to establishing why this may be. For the same situation, a value of 41% would indicate that action was still merited, but with lesser urgency. Thirdly, if a “low” level was established, further action should not be overlooked if accessible and straightforward, but may be regarded as of lesser priority until ‘high’ and ‘medium’ situations had been tackled. The 10% criterion is purely arbitrary. Alternatives might be suggested which reflect the range of responses. However, when applying the current system to the original 64 systems of work, about seven emerge at the ‘high’ action level, and they do seem to be the ones where, subjectively, improvements are thought to be particularly merited.
Alternatives The HSE’s work on supermarket cashiers has extended this approach slightly further (Mackay et al, 1998). Tables are presented in Mackay et al (1998) showing benchmarks for the annual prevalence, weekly prevalence and annual disability, based on the variation found amongst check-out operators operating different systems of work. Two sets of figures are shown, depending on the size of the population surveyed. It is anticipated they will serve to assist retail managers in deciding whether their check-out operations are of an acceptable standard overall and to prioritise areas for remedial action.
Wider Considerations The advantages of the proposed interpretation system are many. The rational basis for action reduces unnecessary costs and assists in prioritising resources. It also provides an opportunity to move forward with a proportional response. There are, however, some limitations to the approach shown in Table 1. In particular, there was no selection of specific occupational groups in an attempt to represent UK industry. Their inclusion was simply based on the availability of data. Secondly, there is the lack of age or exposure sensitivity. However, while these action levels are tentatively suggested given the absence of other criteria, there is a clear recognition that improved criteria could be derived in the future and, indeed, this is desirable. Self-report questionnaires do not elicit what has caused the onset of complaints, nor do they attribute causation entirely to working activity. Musculoskeletal disorders are by their nature associated with a multi-factorial aetiology, with the factors involved in their onset grouped as personal, clinical, organisational or concerned with aspects of the work-space. Hence, a sensible, measured response to “high” levels found in the workplace might include an ergonomic hazard-spotting exercise with further measurements to pin-point where the risk lies. This may take many objective and subjective forms, such as the assessment of vibration levels, psychosocial assessment, or measurements of workload, equipment or activity levels. Recording the organisational system for dealing with reports of musculoskeletal aches and pains in a workplace, and the prevailing climate, is often overlooked in studies. Boocock et al (1997) have reported, however, that these factors have a strong influence on the extent of reporting. In a situation where there are good employer-employee relations, widescale positive self-reports of musculoskeletal complaints may occur with the expectation that an active response will follow, irrespective of the level of the observable risk. The opposite situation is more likely to be found: in a climate of job uncertainty it may be perceived by respondents that a positive report on a questionnaire may lead to a threat to their livelihood, and hence under-reporting ensues. Another confounder to be overcome is the high variability in the severity of the symptoms reported by respondents. A complementary clinical examination to determine grades of severity or a diagnosis is potentially useful here. Alternatively, the use of an extended questionnaire which includes questions on severity (defined as the effect on the person) or the extent of disability (described as the prevention of doing activities) would seem to be important in defining which cases are experiencing the more costly forms of musculoskeletal disorder. Such questioning might include:
Severity: whether a doctor or similar has been consulted within a period of time, or the frequency or duration of experiencing symptoms.
Disability: days or number of occasions absent from work, or any reduction in home or working activities.
Given such limitations in deriving a system for prioritising action, it may be prudent to extend any future system to incorporate severity and disability indices at the very least.
Conclusion This paper has presented one approach that can be used to interpret annual prevalence data and in prioritisation for remedial action. Whilst an interpretation system based on annual prevalence has its uses, a system that encompasses prevalence, severity and disability measures would seem to be more acceptable in the long term. Supplementing the use of questionnaire surveys with ergonomic hazard-spotting or a comprehensive risk assessment provides an opportunity to establish the extent of reporting and the nature of the improvements required to manage the problem of concern.
References Boocock, M., 1997, Personal Communication re: Relative Risks Project David, G. and Buckle, P., 1997, A questionnaire survey of the ergonomic problems associated with pipettes and their usage with specific reference to work-related upper limb disorders, Applied Ergonomics, 28, 4, 257–262. Dickinson, C.E., Campion, K., Foster, A.F., Newman, S.J., O’Rouke, A.M.T. and Thomas, P. 1992, Questionnaire development: an examination of the Nordic Musculoskeletal Questionnaire, Applied Ergonomics, 23, 3, 197–201. Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorenson, F., Anderson, G. and Jorgensen, K., 1987, Standardized Nordic Questionnaires for the analysis of musculoskeletal symptoms, Applied Ergonomics, 18, 3, 233–237. Mackay, C., Burton, K., Boocock, M., Tillotson, M., Dickinson, C.E., 1998, Musculoskeletal Disorders In Supermarket Cashiers. HSE Research Report. HSE Books. Williams, N. and Dickinson, C.E., 1997, Musculoskeletal complaints in lock assemblers, testers and inspectors, Occupational Medicine, 47, 8, 479–484. Ydreberg, B and Kraftling, A., 1988, Referensdata Till Formularen FHV 001 D, FHV 002 D, FHV 003 D, FHV 004 D och FHV 007 D. Rapport 6. The Foundation for Occupational Health Research and Development. Orebro.
The views expressed are those of the author and not necessarily those of the HSE.
PEOPLE IN PAIN Donald Anderson
Centre for Occupational and Environmental Medicine, The National Hospital, Pilestredet 32, N-0027 Oslo, Norway.
Many people at work suffer from some degree of musculoskeletal illness, but these make up only a subset of the general population, of whom up to 86% may complain of pain from this source. A survey of occupational health centres was carried out in a health district in Norway to determine the occurrence of musculoskeletal problems in the working population. A questionnaire was sent out including questions about the number of patients seen per week, the proportion diagnosed with a musculoskeletal problem, how the diagnosis was made, and what proportion benefitted from treatment. Other questions were asked about the type of work and possible causes for the problem. The possibilities are mooted for general ergonomics education and training as a supplement to more conventional intervention.
Introduction It is axiomatic that many people at work suffer from some degree of pain caused by musculoskeletal illness, an affliction that seems now to be pandemic, and numerous authors have reported specific studies of selected parts of the working population. A few examples of such studies include office workers (Grandjean, 1988), paediatric surgeons (Cowdery and Graves, 1997), bus drivers (Kompier et al, 1987), and garage mechanics (Torp, 1997). These occupational groups, however, represent only subsets of the population at large, and more researchers are now reporting surveys of the general population. Natvig et al (1995) found that 86% of respondents to a general population survey in Norway had had pain of musculoskeletal origin in the previous year. Hagen et al (1997), in another Norwegian survey of 20,000 people (59% response), concluded that up to 61% had experienced musculoskeletal pain in the previous month. Other studies have taken place in Sweden and Denmark with similar results. Table 1 shows some of these data. In some of these studies the results are based on self-reports of pain in the musculoskeletal system, often using the Nordic questionnaire (Kuorinka et al, 1987) or equivalent. Without confirmation of diagnosis by clinical follow-up this may involve some level of over-reporting, but it nevertheless gives an indication of the scale of the problem. Of the two general population studies, Natvig et al (1995) relied only on the self-reports, but Hagen et al (1997) arranged clinical diagnosis of some 160 cases to identify inflammatory rheumatoid arthritis as a separate group (ca. 7%). Whatever the diagnosis, specific or not, and regardless of whether the condition is work-related or not, a major problem of pain appears to exist, likely to be having a major impact on the quality of life for sufferers at home and at work. Table 1. Percentage prevalence of reported ms. pain
1: Grandjean (1988); 2: Cowdery and Graves (1997); 3: Kompier et al (1987); 4: Torp (1996); 5: Natvig et al. (1995); 6: Hagen et al. (1997)
The Centre for Occupational and Environmental Medicine (SYM) at the National Hospital in Oslo has been established for three years. Although it is a resource with responsibility for an entire health district in Norway, covering approximately 437,000 people at work, little information was available on what potential problems SYM might be expected to investigate within the field of ergonomics, or on the extent of the workload. To attempt to quantify the potential workload, and as a marketing exercise, SYM therefore carried out a survey amongst 110 occupational health centres in the district.
Conduct of the survey The aim of the survey was to determine, by means of a questionnaire, the extent of problems of musculoskeletal illness in the working population, as identified by occupational health professionals. The questionnaire, containing only ten questions (to encourage replies), was sent out to 110 occupational health centres, ranging from district and community services to those offered by large companies to their employees. A covering letter also explained the function of SYM and the expertise available in the areas of medicine, occupational hygiene and ergonomics. Questions were asked about the number of patients seen per week, the proportion diagnosed with a musculoskeletal problem, how diagnosis was normally made (multiple choice), and what proportion may have benefitted from treatment. Other questions were asked about the type of work and possible causes for the problem, including workplace design (multiple choice) and the possible influence of so-called psychosocial factors (multiple choice). There were obvious limitations to the extent and reliability of the data obtained in this way, but the survey was intended mainly to be of use in planning SYM’s strategy in relation to ergonomics.
Results Conclusions have been drawn from analysis of 55 replies (a 50% response). These included 20 industry-based services, covering paper and glass-making, biscuit manufacture, engineering, forestry and wood products, and the police. Fourteen of the sample were public or community based, and 21 others included private physicians and occupational health clinics. A few replies included comments about the questionnaire. Some respondents thought that musculoskeletal illness should have been clearly defined; others had difficulty with the expression ‘patient’ in relation to company employees, and ‘cases’ may have been a better word. Some were not convinced that their opinions about possible cause were valid, and others claimed not to have adequate or sustained records. A few physicians explained that as company medical officers they may diagnose, but treatment and follow-up were sometimes the province of a patient’s general practitioner. It transpired that the attendant occupational medical officers were mostly full-time, but some were performing a part-time occupational service in addition to their private practice, and some centres were staffed only by physiotherapists. Some company-related occupational health services were being phased out and ‘outsourced’.
Cases seen (patients) and diagnosed with musculo-skeletal problems Most respondents saw fewer than 25 cases per week, some saw between 25 and 35, whilst a few saw as many as 45 or more per week. Of these, some diagnosed more than 50% of their cases, nearly half diagnosed 25–50%, and a few diagnosed less than 25% of their cases as suffering from musculoskeletal illness. More than half of respondents apparently relied only on reported pain for their diagnosis, although many reported using joint mobility and/or other unspecified tests to confirm diagnosis.
Improvement after treatment Half of the respondents saw improvement in 25–50% of their cases after treatment, some in less than 25%, and a few saw improvement in more than 50% of their cases.
Work-related factors/non-work factors About half the respondents considered that heavy physical work was the prime cause of the complaint, but only slightly fewer felt that heavy mental work was just as important. Light mental work was considered more likely than light physical work to be a contributor. Nearly half of the respondents considered football or handball to be a major contributor, with nearly as many citing home decorating and carpentry. Some thought that home personal computer usage (Internet?) was a likely cause.
Workplace design Office and VDU work and seating accounted for the majority of the workplaces implicated, along with construction machinery and benchwork (assembly/fitting). Some respondents were themselves responsible for assessment of workplace design factors, with some others relying on physiotherapists or similar professionals to carry out this work.
Psycho-social factors and influence Nearly three quarters of the respondents considered that problems of employment contributed to patients’ illness; nearly half saw problems with marriage and children as a contributory factor, but only a few considered economic problems as important. More than a quarter saw a combination of marriage and children problems and problems at work as important factors. A few saw all these factors as important in combination. Fewer than 10% of respondents thought that such causes were influential in more than 50% of their cases, but more than half considered them to be so for between 25 and 50% of cases. A few saw such factors as influential in less than 25% of cases.
Discussion and conclusions The data collected lack strict statistical validity, but a number of services sampled across a cross-section of industry gave considered replies. The data confirm that a great many people are diagnosed as suffering from musculo-skeletal problems. The number of cases seen on a weekly basis suggests a scale of problem beyond the available resources to follow up adequately and to identify reliably any occupational or leisure basis for the complaints. To achieve accurate diagnosis, clinical examination of the patient and follow-up inspection and assessment of the job and workplace seem desirable in all cases, which would impose a very heavy load on the occupational physician or other specialist colleagues. Although some reported assessment of workplaces takes place, it is not clear just how much intervention is being recommended, but between 25 and 50% of cases were successfully treated. However, the study did not explore long term benefits, or the percentage of recurrence or chronic suffering. A wide variety of workplaces and tasks were implicated as possible causal factors, with office and data work being predominant, contributing to heavy mental workload as a cause of illness. Operating construction and other plant, and mechanical workshop activity, provided the basis for much of the heavy physical work reported as a cause of complaint. Non-work activity included sport, home decorating and carpentry. The impression was confirmed that so-called psycho-social factors are having an impact on musculoskeletal problems, and that high among these are problems at work and the fear of unemployment, as well as difficulties with marriage and children. More surprisingly, economic and housing problems seemed less important, although the influence of a combination of all these factors is recognised by many of the respondents. The results from this study reinforce the evidence from many surveys and detailed studies that a major problem exists, and more than a suspicion that resources to combat the problem are inadequate. Faced with such evidence of ‘demand’ for ergonomics intervention, what can be done? Do-it-yourself seems to be one alternative, armed with published tools like RULA from McAtamney and Corlett (1992). This systematic workplace assessment method, coupled with some basic rules for design/re-design, will help to reduce the risks of contracting work-related upper limb disorders. Other guidelines are also available. These intervention methods require dedicated application and time, but are effective for redesign of working situations where ergonomics may not have been involved from the start of the design (product or production). Methods to ameliorate symptoms ‘on-line’ may also be effective, such as pausgymnastik, adopted from Sweden and recommended by Pheasant (1991), and others. As yet unfinished studies in Sweden and Norway also appear to show the benefits of exercise as both preventive and remedial in the treatment of musculoskeletal problems. In the United Kingdom, the recently formed Body Action Campaign (1997) is actively involved in remedial and preventive work amongst schoolchildren, and this initiative suggests the possibility of more widespread education and training. What may also be called ‘the Lothian initiative’ was launched in Edinburgh by Andrews and Kornas (1982). These authors produced a programme of ‘Ergonomics Fundamentals for
Senior Pupils’, including a workbook for teachers, intended to educate secondary school students in basic ergonomics. Although the experiment was abortive at that time, evidence may now be emerging that the programme could be effective in giving young people an awareness of ergonomics, helping them to assess products and environments and to react to poor conditions in their working life. Hopefully, this will lead to a point where they can influence their own working conditions. Another programme, reported by Albers et al (1997), teaches apprentice carpenters in the construction industry an awareness of ergonomics, with promising results: over half the apprentices completing the course reported using the information they received, and about the same number said that they changed the way they worked following the course. Other similar examples are beginning to appear in the literature, giving encouraging evidence for the notion that one way forward is through education, where, by example and diffusion, a slow process of general health improvement will result.
References
Albers, J.T., Li, Y., Lemasters, G., Sprague, S., Stinson, R. and Bhattacharya, A. 1997. An Ergonomic Education and Evaluation Program for Apprentice Carpenters. Amer. J. Indust. Med., 32, 641–646.
Andrews, C.J.A. and Kornas, B. 1982. Ergonomics Fundamentals for Senior Pupils. Napier College (now Napier University), Edinburgh.
Cowdery, I.M. and Graves, R. 1997. Ergonomic issues arising from access to patients in paediatric surgery. Contemporary Ergonomics 1997. (Taylor and Francis, London.)
Grandjean, E. 1988. Fitting the task to the man: a textbook of occupational ergonomics. (Taylor and Francis, London.)
Hagen, K.B., Kvien, T.K. and Bjørndal, A. 1997. Musculo-skeletal pain and quality of life in patients with non-inflammatory joint pain compared to rheumatoid arthritis: A population survey. The Journal of Rheumatology, 24.
Kompier, M., de Vries, M., van Noord, F., Mulders, H., Meijman, T. and Broersen, J. 1987. Physical Work Environment and Musculo-skeletal Disorders in the Busdrivers Profession. In P. Buckle (ed.), Musculoskeletal Disorders at Work. (Taylor and Francis, London.)
Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F., Andersen, G. and Jorgensen, K. 1987. Standardized Nordic Questionnaire for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18, 3, 233–237.
McAtamney, L. and Corlett, E.N. 1992. Reducing the Risks of Work Related Upper Limb Disorders: A Guide and Method. The Institute for Occupational Ergonomics, University of Nottingham, Nottingham, UK.
Natvig, B., Nessiøy, I., Bruusgaard, D. and Rutle, O. 1995. Musculoskeletal symptoms in a local community. Euro. J. Gen. Practice, 1, March.
Pheasant, S. 1991. Ergonomics, Work and Health. (Macmillan Press, London.)
Torp, S., Riise, T. and Moen, B.E. 1996. Work-related musculoskeletal symptoms among car mechanics: a descriptive study. Occ. Med., 46, 6, 407–413.
PREVENTION OF MUSCULOSKELETAL DISORDERS IN THE WORKPLACE—A STRATEGY FOR UK RESEARCH L Morris, R McCaig, M Gray, C Mackay, C Dickinson, T Shaw and N Watson Health and Safety Executive, Magdalen House, Trinity Road, Bootle, L20 3QZ.
Research plays an important role in the Health and Safety Executive’s (HSE) strategy for the prevention of musculoskeletal disorders, the leading cause of occupational ill-health in the UK. Over forty musculoskeletal research projects have been funded since the early 1980s and the findings have assisted HSE in advising industry about the nature and extent of musculoskeletal risks and appropriate control measures. HSE is currently reviewing its musculoskeletal research portfolio with the aim of mapping out research priorities for the next 5 to 10 years. This paper describes the development of a musculoskeletal research strategy and outlines its major themes including research on pathomechanisms, risk factors, strategies for exposure assessment, health surveillance methods and intervention studies.
Introduction Musculoskeletal disorders (acute and chronic) are the leading cause of self-reported occupational ill health in the UK with an annual prevalence now estimated at over 900,000 cases caused by work (Health and Safety Commission 1997). The reported conditions can be grouped in four categories (McCaig 1996):-
• transient soft tissue pains related to poor work posture and task design
• discrete soft tissue lesions such as carpal tunnel syndrome
• chronic pain syndromes affecting the lower back and limbs
• chronic degenerative disorders such as osteoarthritis of the hip
The health consequences of these conditions range from transient aches and pains to chronic problems which may lead to permanent disability. Acute injuries to the musculoskeletal system, resulting from poor task design and overexertion, are also a cause for concern. Handling injuries, for example, account for around a third of all over-3 day injuries reported to HSE.
Research plays an important role in HSE’s strategy for the prevention of acute and chronic injury to the musculoskeletal system. HSE has to be able to get accurate information about risk factors and control measures in order to undertake its core activities successfully, including publication of guidance, publicity campaigns and workplace visits by inspectors. All this work is heavily dependent on scientific knowledge drawn from the literature and from HSE’s own research programme. Since the early 1980s, HSE has funded over forty research projects on musculoskeletal issues and this together with the findings from in-house field studies and technical investigations has helped to shape the advice given to industry. In the past, HSE’s musculoskeletal research strategy has been incremental, research questions being generated by ongoing policy development and operational activities. While this approach had largely met organisational requirements, a review of occupational health policy identified a need for a more strategic look at the programme, examining the range of topics covered by current and completed projects and identifying significant gaps in knowledge. The primary objective was to map out and prioritise research themes for projects to be commissioned by HSE over the next five to ten years. The strategy also aimed to identify topics for collaborative research at both national and international levels with a view to maximising the benefits gained from limited research resources. It is interesting to note that, internationally, a more strategic approach to musculoskeletal research is being adopted as countries seek to ensure that programmes address agreed national priorities. In the USA and Finland, for example, recent musculoskeletal research programmes have focused on preventive measures and workplace interventions (Haartz and Sweeney 1995, Viikari-Juntura 1995).
The Strategy Development Process Recent publications have considered the process of research strategy development (Rantanen 1992, National Institute for Occupational Safety and Health (NIOSH) 1996). An important element is consultation involving stakeholders (employer and employee organisations, occupational health practitioners, research funding bodies etc.,) as well as experts and researchers. The criteria for defining priority topics are driven largely by expert and stakeholder opinion and experience shows that several iterations of the process may be needed before consensus is achieved (NIOSH 1996). In developing the HSE strategy, information on research needs was gathered from a wide range of sources including:-
• a review of previous and current extramural research projects
• research workshops on specific topics (back pain, upper limb disorders (Harrington et al 1996), diagnostic criteria)
• an overview of the scientific literature to identify emerging research themes
• musculoskeletal research strategies published by other funding bodies or professional organisations, eg. NIOSH (1996)
A strategy development group, with representatives from HSE’s technical, policy and research interests was formed to structure the information and develop an initial draft. An important aspect of the work was the development of a framework, which integrated research needs related to all types of work-related musculoskeletal disorders as well as acute injuries and manual handling accidents. This was linked to regulatory requirements for risk assessment and control and was based on the ergonomics concept of the degree of match between task demands and individual capabilities. The research needs identified in the draft strategy were debated at a research seminar attended by leading research workers, technical experts, occupational health practitioners and representatives of employer and employee organisations. The seminar programme focused on developing trends in musculoskeletal research with sessions on health outcomes, psychosocial factors, exposure assessment and control. International developments in research and standards were also considered. The views expressed in this forum are being incorporated in a second draft of the strategy which will be subject to further rounds of consultation. An important requirement to be met in developing a research strategy is that it should be widely communicated to stakeholders and feedback sought (Rantanen 1992). The HSE strategy will ultimately be published, therefore, in order to encourage wider discussion and to assist funding organisations in the planning of research programmes.
Emerging Research Themes The review of the extramural projects showed that almost half of the HSE sponsored projects had addressed issues connected with the development of manual handling guidance, reflecting research needs associated with the introduction of the Manual Handling Operations Regulations 1992. Comparatively few studies had been undertaken on the musculoskeletal risks associated with keyboard tasks or on clinical aspects such as disease mechanisms and diagnostic criteria. Significant gaps were identified in relation to risk factors for the development of upper limb disorders and the interactions between them. Other research needs indicated by the review included studies of the mechanisms of cumulative musculoskeletal injury and the development of design guidelines for workplaces and tasks. While some of these issues had been addressed in the scientific literature, further applied research was necessary to develop practical workplace guidance, an important objective of HSE research. The draft strategy paper built on this review and the earlier research workshops (Harrington et al 1996), identifying research needs related to risk factors, exposure measurement, health outcomes, health surveillance, medical management and workplace interventions. These areas encompass the wide spectrum of issues associated with the assessment and control of musculoskeletal risks in the workplace, methodological issues associated with epidemiological research being included alongside practical management concerns. Table 1 summarises some of the main research issues identified to date. These are presented in order to illustrate the general direction being taken by the strategy and are subject to further consultation before any consideration is given to funding.
Table 1 HSE Musculoskeletal Research Strategy—Summary of Research Issues
Future Development The development of any research strategy is a continuous process and the direction and content of HSE sponsored musculoskeletal research is likely to change as research findings are evaluated. The strategy is seen as a useful management tool for planning and commissioning new research and has been designed with flexibility in mind. While the strategy is based on current research trends, experience shows that some provision must be made to accommodate unforeseen scientific developments. Musculoskeletal research draws on a number of parent disciplines and new directions can emerge from basic research. Advances in research into pain mechanisms, for example, have led to new methodologies for investigating chronic pain syndromes in the upper limbs. It can be concluded that there is much to be gained from a strategic approach to the planning of musculoskeletal research. The consultative process involving a wide range of stakeholders will help to ensure that limited research resources are appropriately targeted and that research findings are evaluated against agreed objectives. A published national strategy, which is regularly updated, also provides an ideal vehicle for collaboration, enabling governmental, academic and industrial research resources to be shared in the pursuit of common goals.
References
Haartz, J.C. and Sweeney, M.H. 1995, Work-related musculoskeletal disorders: prevention and intervention research at NIOSH. In Nordman, H. et al, Sixth US-Finnish Joint Symposium on Occupational Health and Safety, Research Report 3, (Finnish Institute of Occupational Health, Helsinki), 135–141
Harrington, J.M. et al 1996, Work related upper limb pain syndromes—origins and management. Unpublished report on research priorities workshop, (Institute of Occupational Health, Birmingham)
Health and Safety Commission 1997, Annual Report and Accounts 1996/97, (HSE Books, Sudbury)
McCaig, R.H. 1996, Managing musculoskeletal disorders—an overview from a medical perspective. Proceedings, Ergonomics and Occupational Health: Managing Musculoskeletal Disorders, London, 3 December 1996, (The Ergonomics Society, Loughborough)
National Institute for Occupational Safety and Health 1996, National Occupational Research Agenda, DHHS (NIOSH) Publication 96–115, (NIOSH, Cincinnati)
Rantanen, J. 1992, Priority setting and evaluation as tools for planning research strategy. Scandinavian Journal of Work, Environment and Health, 18, Suppl 2, 5–7
Viikari-Juntura, E. 1995, Prevention program on work-related musculoskeletal disorders. In Nordman, H. et al, Sixth US-Finnish Joint Symposium on Occupational Health and Safety, Research Report 3, (Finnish Institute of Occupational Health, Helsinki), 151–154

The opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the Health and Safety Executive.
A MUSCULOSKELETAL RISK SCREENING TOOL FOR AUTOMOTIVE LINE MANAGERS A Wilkinson*, RJ Graves*, S Chambers**, R Leaver** * Department of Environmental & Occupational Medicine University Medical School, University of Aberdeen Foresterhill, Aberdeen, AB25 2ZD ** Occupational Health Department Land Rover, Rover Group Solihull
The routine assessment of musculoskeletal disorders (MSD) risk at line management level in industry is an important step in risk management. Tools such as Rapid Upper Limb Assessment etc., provide various means of trying to integrate MSD risk assessment but appear to use differing criteria and emphasise risk to different parts of the body. A study was undertaken to develop a Statutory Musculoskeletal Assessment Risk Tool (SMART) and to assess the effectiveness of the company’s current risk assessment tool. Two groups of twenty line level managers acted as subjects, one group using the old tool and the other the new tool. The results showed that there was an improvement in the accuracy and sensitivity of risk identification using the new tool.
Introduction The routine assessment of musculoskeletal disorders (MSD) risk at line management level in industry is an important step in risk management. Tools such as RULA (McAtamney and Corlett, 1993) and OWAS (Kant et al, 1990), supplemented by the Health and Safety Executive’s Manual Handling Operations Regulations (MHOR) (HSE, 1992), provide various means of trying to integrate MSD risk assessment. The former tools appear to use differing criteria and tend to emphasise risk to different parts of the body. As a first stage in any statutory risk assessment, there is a need for a tool that helps users to identify risks as defined by the MHOR, and those risks which can lead to upper limb disorders, indicated by criteria such as those used by the HSE (1990, 1994). As part of an initiative to reduce the incidence of work related MSD on site through increased awareness, an automotive company intended to make its line managers responsible for screening for sources of risk within their work areas. An existing tool (Associate joB Analysis, ABA, BMG AG/Rover Group, 1996) was available but there was some concern that it did not highlight sources of MSD risk accurately enough. The company wanted screening tools which were simple and straightforward to use, enabling accurate, quick and reliable assessments to be made of the risk from individual jobs without the need for detailed training in ergonomics. In addition, the tool(s) needed to provide enough information so that
task requirements could be assessed to help in initial job placement, rotation and/or rehabilitation, not necessarily by managers but by the occupational health staff. A study was undertaken to develop a statutory based manual handling and musculoskeletal risk assessment tool. This effectively provided a first stage indication of potential risk using the MHOR and so could be termed a Statutory Musculoskeletal Assessment Risk Tool (SMART). In addition the study needed to assess the effectiveness of the company’s current risk assessment tool (Wilkinson, 1998).
Approach Overview First, a prototype SMART was developed by examining the literature to identify appropriate criteria. This consisted of two sections. The first was a modification of a MHOR worksheet, to be used to determine whether any of the tasks exceeded MHOR guidance. The second section was to be used to determine whether there were other MSD risks. Each section was intended to provide risk scores indicating whether there was high risk (red), medium risk (amber) or low risk (green). The design was intended to be visual and simple enough for line managers to use routinely. For the experimental evaluation of the two assessment tools, video recordings of tasks from a vehicle assembly line were compiled to represent a range of typical work activities found on site. The tasks were selected to cover a wide range of postural work activity and load handling. Two groups of twenty line level managers acted as subjects, once they had been trained, one group using the ABA tool and the other the new tool. The time taken and accuracy of the assessments were recorded. A team consisting of an ergonomist, an occupational physician, two physiotherapists, two occupational health nurses and a health and safety officer determined the ‘gold standard’ levels of risk present in each of the tasks. The latter were used as a baseline for analysing the results from the experimental study.
Stage 1 This involved developing a prototype SMART. The first part involved developing a summary sheet, using job process sheets (internal job descriptors) as a basis for breaking down the job into elements. The next two sections concerned manual handling and general musculoskeletal risk assessments. The manual handling section reflected the needs of the MHOR and used diagrams to take the user through the assessment process. As guideline loads needed to be reduced to take account of factors such as repetition, correction factors were provided in tabular form. Where a high risk was identified, this implied that a more detailed assessment would be needed. The section on general musculoskeletal risk took account of different body parts, including the neck, shoulders, back, wrists and hands. Postures were generally assessed against criteria such as repetition, force and duration to provide an integrated score. Examples of line tasks were selected for the study and included one task for training purposes and ten for the experimental assessments. Tasks were chosen to cover high, moderate and low risk levels in relation to MHOR, RULA and OWAS.
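The red/amber/green banding described above lends itself to a short computational illustration. The Python sketch below shows one way a two-section screening score of this kind could be banded; all item names, guideline values, correction factors and thresholds are invented for illustration, since the paper does not reproduce the tool's actual values.

```python
# Illustrative two-section, traffic-light screening score. All item names,
# guideline values, correction factors and band thresholds are hypothetical;
# the published tool's actual values are not reproduced in this paper.

def band(score, amber_at, red_at):
    """Map a numeric section score onto a traffic-light risk band."""
    if score >= red_at:
        return "red"     # high risk: a more detailed assessment is needed
    if score >= amber_at:
        return "amber"   # medium risk
    return "green"       # low risk

def manual_handling_score(load_kg, guideline_kg, repetition_factor):
    """Section 1: compare the handled load with a corrected guideline weight."""
    corrected_guideline = guideline_kg * repetition_factor  # e.g. 0.8 for frequent lifts
    return load_kg / corrected_guideline   # values above 1 exceed the guideline

def msd_score(posture_points):
    """Section 2: sum integrated posture/repetition/force points per body part."""
    return sum(posture_points.values())

# Example assessment of one task (all numbers invented)
section1 = manual_handling_score(load_kg=12, guideline_kg=10, repetition_factor=0.8)
section2 = msd_score({"neck": 2, "shoulders": 3, "back": 4, "wrists_hands": 1})
print("Manual handling:", band(section1, amber_at=0.75, red_at=1.0))
print("General MSD:    ", band(section2, amber_at=6, red_at=10))
```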
Actual risk levels present in each task, and the key causal factors of that risk were established to the satisfaction of a professional panel of judges including an Occupational Physician, a physiotherapist and an ergonomist. This provided the gold standard referred to earlier. For the pilot phase of the study subjects were selected from occupational health staff members to ensure some level of ergonomic awareness based on previous health and safety training. The subjects worked through an assessment of the training task using the SMART, and then were asked to use it to make assessments of the ten experimental tasks. The results of the pilot study were compared with those of the professional team. The prototype SMART was reworked and amended on the basis of these results and comments from the subjects.
Stage 2 This involved comparing groups on performance in terms of accuracy, speed and ease of use using the prototype SMART and ABA assessment tool. Forty line level managers (as a representative sample of the intended users of the final form) were selected to take part in the experimental phase; of whom at least twenty were pretrained in the use of the current ABA form. A summary sheet was produced for use with the ABA assessment, so that the output from this form was comparable with the output of SMART in terms of high, medium and low risk. Twenty subjects pre-trained in ABA made up one experimental group (Non expert ABA). They worked through the assessment of the training task using the ABA and the ABA summary sheet, then completed the assessment of the ten experimental tasks using the ABA form unassisted. The remaining twenty subjects were used for the assessment of the prototype SMART (Non expert SMART). These were trained in using the SMART by working through the form and assessing the training task. Following this, they made the assessments of the ten experimental tasks unassisted. Analyses of accuracy were undertaken in relation to the risk levels for each factor by comparing performance of the professional panel in using the ABA and SMART against the non expert ABA and the non expert SMART groups. Similarly, comparisons between the groups were undertaken to assess speed in relation to the time taken to make the assessments of the ten experimental tasks, and usability in relation to a Likert scale evaluation of the form’s clarity etc. Accuracy was taken as the percentage of the risk assessments agreeing with the control (gold standard). The assessment of accuracy depended upon comparing performance between groups and recording false positive and false negatives. A false positive was defined as finding a degree of risk where there was none i.e. the performance was oversensitive. False negatives occurred where there was risk but this was not found i.e. performance was not sensitive enough.
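As a rough illustration of the accuracy measure just defined, the sketch below scores a set of assessor ratings against gold-standard panel ratings, counting any over-rating as a false positive and any under-rating as a false negative (a simplification of the definitions above). The task names and ratings are invented, not the study's data.

```python
# Illustrative accuracy / false-positive / false-negative comparison against
# gold-standard panel ratings. Ratings are ordered green < amber < red; the
# task names and ratings below are invented.

LEVELS = {"green": 0, "amber": 1, "red": 2}

def compare(assessor, gold):
    """Return percentage agreement, false positives and false negatives."""
    agree = fp = fn = 0
    for task, g in gold.items():
        a = assessor[task]
        if a == g:
            agree += 1
        elif LEVELS[a] > LEVELS[g]:
            fp += 1   # risk reported where the panel saw less: oversensitive
        else:
            fn += 1   # risk missed: not sensitive enough
    n = len(gold)
    return 100 * agree / n, 100 * fp / n, 100 * fn / n

gold    = {"task1": "red",   "task2": "green", "task3": "amber", "task4": "green"}
manager = {"task1": "amber", "task2": "green", "task3": "red",   "task4": "green"}

accuracy, false_pos, false_neg = compare(manager, gold)
print(f"accuracy {accuracy:.0f}%, false positives {false_pos:.0f}%, false negatives {false_neg:.0f}%")
```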
Results and Discussion
Figure 1. Comparison of overall accuracy of assessment for non-experts versus experts for both tools
Figure 2. Comparison of accuracy of assessment for non-experts versus experts for both tools for neck assessments (group means)
Figure 1 shows a comparison of overall accuracy of assessment for Non-experts versus Experts for both tools. Overall there were lower percentages of False Negative scores when using the SMART compared to the ABA tool. This appears to indicate that performance was more sensitive in detecting risk with the SMART. Examining the accuracy of the Non-expert ABA versus the Expert ABA groups in more detail, it can be seen that both had similar performance; although their Accuracy scores were similar, the Expert ABA group appeared to have marginally higher accuracy. It can be concluded that performance was reasonably similar using the ABA form. Examining the accuracy of the Non-expert versus the Expert SMART groups, it can be seen that both had similar performance in terms of Accuracy, although the Expert group had a slightly lower percentage of Accuracy scores. Analyses were carried out for each of the sections of the SMART. The first involved examining the manual handling assessments (Section one). This showed that both the Non-expert and Expert groups had better sensitivity in detecting risk with the SMART. The next section (MSD) covered necks, shoulders, arms, wrists, hands, backs and legs. Figure 2 shows a comparison of accuracy of assessment for Non-experts versus Experts for both tools for neck assessments (group means). Both groups using SMART were more accurate and had better sensitivity in detecting risk. The results for the other parts of the body tended to be mixed; for example, risk to the shoulders and arms tended to be detected more accurately with the ABA tool, but the SMART was more sensitive in detecting risk. The results of the study indicate that SMART has the potential for overall accuracy and sensitivity in detecting risk for line managers. As such it seems to be a sensitive means of highlighting potential risk. There is, however, a trade-off between how much sensitivity is needed in practice and the amount of risk in a specific task, and this needs to be examined in more detail. In addition, more work should be undertaken to improve this tool in relation to risk to certain parts of the body.
References
BMG AG/Rover Group 1996, ABA Associate joB Analysis, ABA Rev/6 DOC September 1996, BMG AG/Rover Group
Health and Safety Executive 1990, Work related upper limb disorders: a guide to prevention, HMSO, London
Health and Safety Executive 1992, Manual handling: guidance on regulations. Manual Handling Operations Regulations 1992, HMSO, London
Health and Safety Executive 1994, Upper limb disorders: assessing the risks, Health and Safety Executive
Kant, I., Notermans, J.H.V. and Borm, P.J.A. 1990, Observations of the postures in garages using the Ovako Working Posture Analysing System (OWAS) and consequent workload reduction recommendations, Ergonomics, 33, 2, 209–220
McAtamney, L. and Corlett, E.N. 1993, RULA: a survey method for the investigation of work related upper limb disorders, Applied Ergonomics, 24, 91–99
Wilkinson, A. 1998, Development of a manual handling and musculoskeletal risk assessment screening tool for line managers at an automotive plant, MSc Ergonomics Project Thesis, Department of Environmental and Occupational Medicine, University of Aberdeen, Aberdeen
RISK ASSESSMENT DESIGN FOR MUSCULOSKELETAL DISORDERS IN HEALTHCARE PROFESSIONALS Caryl Beynon, Diana Leighton, Alan Nevill, Thomas Reilly School of Human Sciences Liverpool John Moores University Mountford Building Liverpool, L3 3AF
An ergonomic check-list was developed to assess the risk of performing certain nursing and physiotherapy tasks. This was in response to extensive epidemiological work identifying the magnitude of the problem. Questionnaires were used to assess the musculoskeletal symptoms experienced by nurses and physiotherapists and life-time prevalence was 49%. Low back/buttocks/upper legs was identified as the anatomical area most affected. The risk assessment pro-forma was based on guidelines provided by the Health and Safety Executive but was amended with reference to the questionnaire results. A scoring system was devised so an overall risk score for performing a specific task can be identified. The study indicates the benefits of using epidemiology results when devising an ergonomic risk assessment procedure.
Introduction It was evident from a review of the literature that nursing is frequently cited as an occupation with a high risk of back problems (Hildebrandt, 1995). This constitutes a huge financial burden and potentially long periods of sickness absence from work. Whilst a plethora of studies concerning back pain within the nursing profession exists, this area of research is rarely expanded to include other anatomical sites. Other healthcare professionals have rarely been cited in the literature. Physiotherapists are often neglected in research, possibly because it is assumed that they have a superior understanding of body mechanics and in particular back protection (Molumphy et al., 1985). In order to quantify the prevalence of various musculoskeletal disorders and to enable comparisons to be made between the nursing and physiotherapy professions, comprehensive epidemiological investigations must be undertaken. While epidemiology is important in giving a preliminary overview of the problem, it is then necessary to establish the possible causes of this occupational strain using more objective measures. Only then can guidelines to reduce the risk factors be implemented. The
aim of this study was to utilise the results from an epidemiological investigation in order to develop a risk assessment procedure through which this could be achieved.
Epidemiology of Musculoskeletal Disorders Methodology A confidential questionnaire designed for self-administration was utilised within a cross-sectional investigation. Questionnaires were distributed to 4220 nurses within 7 hospitals and 794 physiotherapists in 20 hospitals. Head manager nurses/superintendent physiotherapists distributed the forms either directly to the sample group or via the heads of wards, depending on the numbers involved. The questionnaires were distributed randomly to gain a cross-section of ages, grades, specialties and gender. The results were analysed using the SPSS statistical package; chi-squared and logistic regression tests were utilised in the analysis of the data. Results A response rate of 44% (n=349) was obtained for the survey of physiotherapists; the questionnaire was completed by 19% (n=813) of the nursing personnel sampled. The sample characteristics of both populations are shown in Table 1. Table 1. Sample characteristics of questionnaire respondents (means and standard deviation are reported)
The lifetime prevalence of musculoskeletal disorders of various locations was 49%. The point prevalence was 20.7%. Almost half (20.7/49 ≈ 42.2%) of those who had suffered symptoms at any time in their working lives were therefore exhibiting symptoms at the time of the questionnaire. Point prevalence gives an indication of the immediate impact, but many sufferers reported recurring symptoms which may not have been present at the time of questioning and which would therefore have gone undetected in the study if lifetime prevalence had not also been recorded.
Respondents indicated the site of musculoskeletal symptoms on an anatomical diagram; these sites were grouped into specific areas for analysis. The anatomical areas and the corresponding percentages of individuals who had experienced symptoms in these areas are shown in Table 2. There was no significant difference between nurses and physiotherapists in the relative percentages who had suffered a musculoskeletal disorder during their working life (p>0.05). However, the location of disorders was significantly different (p<0.05) between the two samples: physiotherapists suffered more symptoms relating to the wrist, fingers, hand and forearm, knee and lower limb than the nursing sample. Table 2. Percentage of individuals that had experienced symptoms in each defined anatomical area at some period in their working life
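The contingency counts underlying the location comparison are not reported here, so the fragment below only illustrates how a profession-by-site difference of this kind could be tested with a chi-squared statistic; the counts are invented.

```python
# Illustrative chi-squared test of symptom location by profession.
# The counts below are invented; the paper reports percentages only.
import numpy as np
from scipy.stats import chi2_contingency

# rows: nurses, physiotherapists
# columns: low back, neck/shoulder, wrist/finger/hand/forearm, knee/lower limb
counts = np.array([
    [310, 90, 60, 40],
    [120, 55, 60, 45],
])

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.4f}")
```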
Perceived causes Regarding lifetime prevalence, 36.4% of respondents with musculoskeletal symptoms could recall a specific causal incident; patient handling and lifting was indicated as the cause by 66.7%. Similarly, of those personnel who attributed their symptoms to continued exposure to a stressor, patient handling and lifting was implicated by 51.3% of respondents. Logistic regression analysis was employed to indicate factors with predictive value for musculoskeletal disorders in general and back pain specifically. Performing manual lifts, and the number of lifts performed by the nurses and physiotherapists, were not significant indicators of the prevalence of musculoskeletal symptoms. However, other factors were shown to have predictive value, but these are beyond the scope of this paper.
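The original analysis was run in SPSS; purely as an illustration of the logistic regression step described above, the sketch below fits a comparable model in Python on simulated data. The predictor names and data are hypothetical.

```python
# Illustrative logistic regression of reported musculoskeletal symptoms on
# candidate predictors. Variable names and data are hypothetical; the study's
# own analysis was carried out in SPSS.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "manual_lifts_per_shift": rng.poisson(8, n),
    "years_in_post": rng.uniform(0, 30, n),
    "is_physiotherapist": rng.integers(0, 2, n),
})
# Simulated binary outcome: 1 = musculoskeletal symptoms reported
true_logit = -1.0 + 0.03 * df["years_in_post"]
df["symptoms"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

X = sm.add_constant(df[["manual_lifts_per_shift", "years_in_post", "is_physiotherapist"]])
result = sm.Logit(df["symptoms"], X).fit(disp=False)
print(result.summary())   # coefficients, standard errors and p-values
```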
Development of the ergonomic risk assessment pro-forma The results of the epidemiological survey indicated a need to develop an objective risk assessment procedure. The assessment pro-forma was developed based upon the guidelines provided by the Health and Safety Executive, and incorporates information relating to occupational, environmental, organisational and personal factors. Pilot work was performed on a range of personnel by attending a variety of hospital wards and physiotherapy departments at Southport and Formby District General Hospital to ensure all typical actions could be recorded. The results of the epidemiological study were also used in the risk assessment development. For example, the questionnaire indicated the relatively high proportion of physiotherapists with problems in the wrists and fingers, so a section on finger and wrist force was included in the risk assessment pro-forma. Sub-sections of the check-list detailing the task, posture, load, environmental conditions, the psychological state of the individual and the forces acting on the wrists and fingers were included. An example of one sub-section is given in Table 3.
Table 3. Example of one sub-section used in the risk assessment pro-forma
A scoring system was devised for each sub-section and totalled to indicate the overall measure of risk for the specific activity. Certain tasks/postures are assigned a score depending on risk; for example, trunk flexion of 45° scores 2, compared to flexion of 90° which scores 4. A short description of the task was included at the time of recording so that a composite score was associated with specific activities. Considering numerous individuals and collecting a large sample of data will reduce any large individual differences by providing mean scores for performing a specific task. A total of 45 hours of risk assessment data will be collected, with a risk assessment performed every 10 minutes within each hour period. Data collection is currently underway and results will be compared with the questionnaire responses to determine exactly which aspects of the profession are detrimental in terms of the onset of musculoskeletal disorders. Of those respondents who attributed their symptoms to a single event, 66.7% identified patient handling and lifting as the cause. Similarly, of those personnel who attributed their symptoms to continued exposure to a stressor, patient handling and lifting was implicated by 51.3% of respondents. Patient handling is frequently cited as the most common cause precipitating a period of low back pain in both nursing (Jensen, 1990) and physiotherapy (Bork et al., 1996), but the logistic regression failed to find that lifting per se had predictive value for the onset of musculoskeletal disorders. The risk assessment will indicate which specific lifting and handling tasks, or components of tasks, are detrimental, and other factors which may have been overlooked with the pre-occupation with manual handling. The epidemiological analysis identified high and low risk specialties. A range of both high and low risk wards were chosen for the risk assessment, to be performed at a District General Hospital on Merseyside. The assessor ‘shadows’ one member of staff for a one-hour period during the course of their working day and an instantaneous assessment is performed every 10 minutes. By remaining with the member of staff continuously for the whole hour, the assessor is also able to assess the psychological characteristics of the individual, which were shown to be important in the questionnaire analysis. Assessments take place at different times of the day, and incorporate personnel of both sexes and a range of grades, to ensure a cross-section of information is collected.
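Only the trunk-flexion example above (45° scoring 2 and 90° scoring 4) comes from the paper; the sketch below shows, with otherwise hypothetical items and weights, how instantaneous observation scores of this kind might be summed and averaged over a one-hour shadowing period.

```python
# Illustrative scoring of instantaneous observations on a pro-forma of this
# kind. Only the trunk-flexion scores (45 deg -> 2, 90 deg -> 4) come from the
# paper; all other items, points and cut-offs are hypothetical.
from statistics import mean

def trunk_flexion_points(angle_deg):
    if angle_deg >= 90:
        return 4
    if angle_deg >= 45:
        return 2
    return 0

def observation_score(trunk_flexion_deg, load_points, environment_points,
                      psychological_points, wrist_finger_points):
    """Total risk score for one instantaneous observation."""
    return (trunk_flexion_points(trunk_flexion_deg) + load_points
            + environment_points + psychological_points + wrist_finger_points)

# Six instantaneous assessments, one every 10 minutes of a one-hour period
observations = [
    observation_score(50, 3, 1, 2, 1),
    observation_score(95, 4, 1, 2, 2),
    observation_score(20, 1, 0, 1, 0),
    observation_score(60, 3, 1, 1, 1),
    observation_score(30, 2, 0, 2, 1),
    observation_score(85, 3, 1, 2, 2),
]
print("mean risk score for this task:", round(mean(observations), 1))
```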
Conclusion It is apparent that nurses and physiotherapists are highly susceptible to experiencing musculoskeletal disorders of a perceived work-related origin. It is also apparent that these healthcare professionals perceive patient handling tasks to be instrumental in the onset of symptoms. This objective risk assessment procedure, developed in response to the results of
extensive epidemiology, may be applied within the work environment to establish the overall risk of performing various occupational tasks. Data collection and analysis are underway to explore which specific occupational activities incur the greatest risk to personnel. Most importantly, the observation check-list may be completed manually and is a quick, non-intrusive method of collecting large amounts of data.
References
Bork, B.E., Cook, T.M., Rosecrance, J.C., Engelhardt, K.A., Thomason, M.E.J., Wauford, I.J. and Worley, R.W. 1996, Work-related musculoskeletal disorders among physical therapists, Physical Therapy, 76, 827–835
Hildebrandt, V.H. 1995, Back pain in the working population: prevalence rates in Dutch trades and professions, Ergonomics, 38, 1283–1298
Jensen, R.C. 1990, Back injuries among nursing personnel related to exposure, Applied Occupational and Environmental Hygiene, 5, 38–45
Molumphy, M., Unger, B., Jensen, G.M. and Lopopolo, R.B. 1985, Incidence of work-related low back pain in physical therapists, Physical Therapy, 65, 482–486
ERGONOMIC MICROSCOPES—SOLUTIONS FOR THE CYTO-SCREENER? J.L.May & A.G.Gale
Applied Vision Research Unit, University of Derby, Mickleover, Derby DE3 5GX, UK
Previous studies have found that microscope users demonstrate a high number of visual and postural problems potentially resulting in poor productivity, discomfort and fatigue. These problems have potential serious consequences particularly in the medical setting as they may lead to errors in medical diagnosis. This paper reviews international literature and details some of the findings of previous ergonomic studies regarding microscopy. The main problems found in microscopy both in industrial and medical settings are detailed. Potential solutions to some of the problems identified and the practicality of applying these to a cytology laboratory are discussed.
Introduction Many assembly and inspection tasks in the electronics industry require the use of a microscope to ensure efficient manufacture and quality control. Microscopes are also used widely in medical and research settings to closely analyze cultures or samples of cells to aid diagnosis and treatment of various diseases. The intensive use of microscopes in some situations, however, has led to reported problems, the majority of which appear to fall into two categories: visual and postural.
Visual problems One of the first documented accounts of what is termed “operational microscope myopia” is by Druault (1946), who reported that near sightedness (myopia) and double vision in physicians were the result of undue accommodation and convergence caused by the use of microscopes. In cytology screening, where microscope use is intensive, more recent studies have found that 73% of cytology screeners reported eye strain (Hopper et al, 1997). Similar levels exist in the electronics industry. Frenette and Desnoyers (1986) conducted tests of visual fatigue on cyto-technicians both before and after they started work and compared the results with a group of haematologists, who did not use a microscope as intensively. Some 31% of cyto-technicians showed symptoms of blurred vision at the end of the day compared with just 3% of the haematologists. Elias (1984) also reported an increase in the prevalence of visual symptoms for microscopists working more than 20 hours per week compared to those working less. Reasons for visual fatigue may include:
• Long periods of Accommodation. Intense periods of microscope work require the eyes to perform long periods of accommodation possibly resulting in the condition referred to as “temporary or operational myopia” (Zoz et al, 1972).
• Ophthalmic factors. Various ophthalmologic factors such as long-sightedness or astigmatism increase a person’s susceptibility to visual fatigue (Zoz et al, 1972).
• Microscope Illumination. If the illumination is too bright a distinct “pulling sensation” can be felt (Burrells, 1977). Exposure to bright light may increase retinal detachment and myopia (Ostberg and Moss, 1984).
• Environmental Conditions. The microscopist may be subject to glare from reflective table tops, high levels of illumination and sunlight. Insufficient air movement and humidity may contribute to eye problems.
Postural Problems Sustained voluntary and involuntary contractions of the ocular and neck muscles when using microscopes can give rise to headaches and stiffness in the neck (Simons, 1942). In industry, 45% of microscope workers suffered from muscular ailments (Soderberg, 1978), while in the medical field Hopper et al (1997) found that 78% of cyto-screeners in the UK reported muscular pain. This was experienced every day by 28.6%, while a further 53.7% experienced muscular discomfort during each week. The areas of the body where discomfort was most commonly reported were the neck, shoulders, upper and lower back and wrists. These users found it harder to concentrate and had lower job satisfaction. Key factors are:
• Maintaining a Fixed Posture. Microscope work involves users adopting a fixed position for long periods of time. The number of hours spent in a fixed position while using a microscope was related to the reporting of discomfort symptoms in the neck and back (Grieg and Caple, 1987).
• Performing Small Repetitive Movements. Microscopists move slides by continuous operation of the controls using small precise movements of the hands. The control location often forces the user to adopt awkward hand/arm positions.
• Inadequate Microscope and Furniture Design. Various types of microscopes are not adjustable in terms of eye piece height, angle or control position and this may lead to muscular discomfort (Soderberg, 1978). Unsuitable benches and chairs were often used and inadequate space provided (Hopper et al, 1997).
• Stress. Maintaining long periods of concentration has a fatiguing effect on the user (Johnsson, 1981). The pressure of not making a diagnostic error is very high, which again can lead to additional stress and fatigue.
Implications/Recommendations
1) Minimize the need for microscopy
Many microscope jobs in health and industry have been eliminated by automation and alternative viewing systems. Potential problems of these systems, however, include inadequate resolution and poor colour rendition, which render them unsuitable for several
tasks. The introduction of new technology is also likely to pose new ergonomic problems to be addressed.
2) Design and purchase suitable equipment and furniture
Recommended microscope viewing characteristics (e.g. Ostberg and Moss, 1984) are:
• Eye pieces with built-in artificial depth cues may help to reduce accommodation.
• The interpupillary distance (IPD) of the eyepieces should be adjustable between 50–76mm, corresponding to a viewing distance of 1.0–2.5 dioptres and a convergence angle of 3–10 degrees (a short sketch relating these figures is given after these recommendations).
• A flat field is necessary for a clear image at both the periphery and the centre of the visual field. A detector device could warn operators when the visible light is too bright for long term viewing.
• Coloured filters should be used to filter out unhealthy parts of the spectrum.
• To accommodate differences in fusion capacity a phoria adjustment is needed.
Eye piece height and angle are determining factors in neck angle, and some of the postural problems could be eliminated if the eye pieces could be adjusted to allow the operator to adopt a more horizontal line of sight (Soderberg, 1978). It should be possible to adjust the distance from the front of the microscope to the front of the eye piece to accommodate individuals with a large abdominal depth. To reduce stress on the shoulders and neck, the focusing controls should be brought down close to table top height. The maximum distance of the controls from the edge of the table should be suitable for the smallest user, to minimise the tendency to bend forward. Some ability to adjust the position of the controls would be beneficial. Recent improvements in microscope design have been made and some microscope manufacturers are now starting to offer more ‘ergonomically designed’ microscopes incorporating some of these features. Purchasing such modern microscopes and additional aids, however, may be beyond financial possibility and the user may be left with an inadequate microscope for some time. Kumar and Scaife (1979), analyzing measurements of muscle activity in microscope workers, found that even small changes in workstation design, such as the degree of incline and height of the table top, produced significant changes in work posture and muscle activity. It was recommended that the height of the table and chair should be adjustable to reduce stress on the back and shoulders. Thinner benches which still provide adequate stability are needed, and footrests need to be provided if the user is unable to rest their feet on the floor. Sufficient working space for the task must be provided (Soderberg, 1978).
3) Modify Existing Equipment
Modifications may range from cost free to very expensive, and the extent to which they are implemented is dependent upon the financial support available. It is possible to buy “add-ons” or inserts to adjust the height and angle of the eye piece. To minimize the stress placed on the wrists by continuous operation of the controls, products are now becoming available which, by flicking a lever or using a mouse, move the stage and adjust the fine focus controls electronically. Cheaper modifications can be achieved by placing stands under the microscope to adjust the height and angle. While this reduces the need to bend over the microscope, it also raises the
height of the controls further above bench height, thus increasing the stress placed on the shoulders. Any change in eye piece angle towards the operator will reduce strain on the neck. The microscope can be tilted forward by placing blocks under the rear edge of the base; the more it is tilted forward, however, the more likely items are to fall off the stage. Padding has also been provided in some laboratories to minimize sharp corners on benches and to rest the hands on equipment which is cold to touch. Users have also adapted boxes and buckets as footrests when these have been inadequately provided.
4) Minimize the time spent on task
Users who perform the most microscope work experience more discomfort and visual problems (Rohmert and Haider, 1986). It is therefore important to minimize the time each person spends using a microscope by alternating with other tasks. Where ergonomic approaches regarding redesign of work tasks or equipment are not feasible, rest breaks and shortened working hours must be considered. This is difficult for cyto-screeners within the UK as there are very few tasks with which they can rotate.
5) Provide Appropriate Training
New microscopists should be given training in how to set up their microscope. All microscopists should be aware of how to use and adjust their workstation to minimise discomfort. Both eye exercises and a physical exercise programme may reduce fatigue (Haines and McAtamney, 1993; MacLeod and Bannon, 1973).
6) Provide Vision Screening for Microscopists
Individuals with astigmatism, myopia, and hyperopia may be unsuitable for intensive microscope work (Olsson, 1985). It may therefore be beneficial to screen users initially for existing visual problems and at regular intervals thereafter.
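The eyepiece recommendation above quotes a viewing distance of 1.0–2.5 dioptres and a convergence angle of 3–10 degrees. As a check on how these figures relate, the sketch below converts dioptres to metres (distance = 1/dioptres) and estimates the convergence angle from the interpupillary distance; with the quoted IPD range it roughly reproduces the quoted angle range. The geometry here is a standard approximation, not taken from the paper.

```python
# Relationship behind the eyepiece figures quoted above: accommodation in
# dioptres is the reciprocal of the viewing distance in metres, and the
# convergence angle follows from the interpupillary distance (IPD).
import math

def viewing_distance_m(dioptres):
    return 1.0 / dioptres

def convergence_angle_deg(ipd_mm, distance_m):
    half_angle = math.atan((ipd_mm / 1000.0 / 2.0) / distance_m)
    return 2.0 * math.degrees(half_angle)

for dioptres in (1.0, 2.5):
    d = viewing_distance_m(dioptres)
    for ipd_mm in (50, 76):
        angle = convergence_angle_deg(ipd_mm, d)
        print(f"{dioptres} D -> {d:.2f} m; IPD {ipd_mm} mm -> {angle:.1f} deg")
```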
Conclusion There is evidence of visual and musculoskeletal problems amongst microscope workers. Due to the fixed posture required in microscopy it is important that equipment and furniture are provided which are suitable for the user. Although ergonomic problems were originally highlighted some time ago, many of the difficulties regarding the usability of microscopes for intensive purposes still remain. Some manufacturers have recently addressed some of the issues concerning microscope design by providing adjustable features on particular models. In some settings, however, it has not been possible to afford these new microscopes and users are left to work with cheaper models which offer very little in terms of flexibility and adaptability. Some companies also produce ‘ergonomic’ attachments which do improve the adjustability of certain microscope features, but these are often only suitable for use with a certain kind of microscope. It is also unclear whether the range of adjustment that they give will be suitable for all potential users in their own working environment. It is important therefore that a wider ergonomic approach is considered when looking at the problems of microscope usage. This should incorporate not only the design and layout of the equipment and furniture used but also the user’s job design, satisfaction and organizational pressures. Users should also be screened to ensure their suitability for the job performed, and trained to set up their workstation appropriately for their own needs and requirements.
Acknowledgment We would like to thank the NHSCSP who kindly funded this work.
References
Burrells, W. 1977, Microscope Technique. A Comprehensive Handbook for General and Applied Microscopy. (New York, NY: John Wiley and Sons).
Druault, A. 1946, Visual Problems Following Microscope Use. Annals Oculistique, 138–142.
Elias, R. and Cail, F. 1984, Work with Binocular Microscopes—Visual and Postural Strain. INRS Cahiers de notes documentaires, 117, 451–456.
Frenette, B. and Desnoyers, L. 1986, A Study of the Effects of Microscope Work on the Visual System. Proceedings of the 19th Annual Meeting of the Human Factors Association of Canada, Richmond (Vancouver), August 22–23, 1986.
Grieg, J. and Caple, D. 1987, Optical Microscopes in the Research Laboratory. Proceedings of the 24th Annual Conference of the Ergonomics Society of Australia, Melbourne.
Haines, H. and McAtamney, L. 1993, Applying Ergonomics to Improve Microscopy Work. Microscopy and Analysis, July, 15–17.
Hopper, J.A., May, J.L. and Gale, A.G. 1997, Screening for Cervical Cancer: The Role of Ergonomics. In S.A. Robertson (ed.) Contemporary Ergonomics 1997, 38–43.
Johnsson, C.R. 1981, Cytodiagnostic microscope work. In O. Ostberg and C.E. Moss (1984), Microscope work—Ergonomic Problems and Remedies, Proceedings of the 1984 International Conference on Occupational Ergonomics, Rexdale, Ontario, Canada. Human Factors Association of Canada, 402–406.
Kumar, S.W. and Scaife, W.G.S. 1979, A Precision Task, Posture, and Strain. Journal of Safety Research, 11, 28–36.
MacLeod, D. and Bannon, R.E. 1973, Microscopes and Eye Fatigue. Industrial Medicine and Surgery, 42, (2), 7–9.
Olsson, A. 1985, Ergonomi i mikroskoparbete. Rifa AB, Stockholm.
Ostberg, O. and Moss, C.E. 1984, Microscope work—Ergonomic Problems and Remedies. Proceedings of the 1984 International Conference on Occupational Ergonomics, Rexdale, Ontario, Canada. Human Factors Association of Canada, 402–406.
Rohmert, W., Haider, E., Hecker, C., Mainzer, J. and Zipp, P. 1986, Using a Microscope for Visual Inspection and Repair of Printed Circuits, Ceramic Films and Microchips. Zeitschrift für Arbeitswissenschaft.
Simons, D., Day, E., Goodell, H. and Wolf, H. 1942, Experimental Studies on Headache: Muscles of the Scalp and Neck as Sources of Pain. Research Publications Association, 23, 228–244.
Soderberg, I. 1978, Microscope work II. An Ergonomic Study of Microscope Work at an Electronic Plant. Report No. 40, National Board of Occupational Safety and Health, Sweden.
Zoz, N.I., Kuznetsov, J., Lavrova, M. and Taubkina, V. 1972, Visual Hygiene in the Use of Microscopes. Gigiena Truda i Professional’nye Zabolevanija, 16, (2), 5–9.
MUSCULOSKELETAL DISCOMFORT FROM DANCING IN NIGHTCLUBS S L Durham and R A Haslam
Health & Safety Ergonomics Unit Department of Human Sciences Loughborough University Loughborough, Leicestershire, LE11 3TU
This paper provides preliminary evidence of the extent of musculoskeletal discomfort arising from dancing in nightclubs. Subjective data were collected by means of a postal questionnaire (n=50), and structured interviews (n=50). An experiment was undertaken, measuring ankle deceleration during dancing, allowing different floor/shoe combinations to be compared. A high proportion of survey respondents reported having musculoskeletal discomfort at some time (86% postal survey; 84% interview respondents), a notable proportion of those questioned in the nightclub (52%) reported discomfort at the time of the interview. For the experiment, 10 subjects danced under 4 conditions. Ankle decelerations as high as 18.3g were measured on the hard-floor/hard-shoe condition. Further research is recommended to confirm the nature and extent of the problem.
Introduction Musculoskeletal disorders affect a significant proportion of the population at some time in their lives. While considerable research has been, and is being, undertaken on the problem in the workplace and in competitive sport, less attention has been paid to musculoskeletal injury arising from leisure activities in the general population. Attendance at nightclubs and ‘raves’ (all night dance parties) is booming, with estimates that several hundred thousand individuals participate every week (Jones, 1994). ‘Ravers’ dance for long periods of time on hard, unyielding floor surfaces such as concrete, while wearing footwear with poor shock absorption. This investigation was prompted by concern among the ‘rave’ community regarding aches and pains experienced during and after dancing. Additional factors that might be involved are use of drugs and elevated body temperatures, both of which might mask the onset of discomfort, exacerbating the problem. McNeill and Parsons (1996) found that 76% of respondents to a questionnaire survey reported use of drugs such as 3,4-methylenedioxymethamphetamine (ecstasy). In a thermal chamber experiment recreating an hour of dancing in nightclub conditions, McNeill and Parsons measured deep
body temperature increases to 38.2 °C, with mean skin temperature rising close to deep body temperature. The research reported in this paper sought to provide preliminary evidence of the extent of musculoskeletal discomfort arising from dancing in nightclubs. The research involved three studies: a postal survey, structured interviews in nightclubs, and an experiment in which ankle decelerations during dancing were measured, enabling different floor/shoe combinations to be compared.
Postal Survey Method A postal questionnaire survey was undertaken to obtain initial data, with particular reference to longer term effects. The questionnaire was based on the Nordic Musculoskeletal Questionnaire (Kuorinka et al, 1987), and was distributed on a convenience basis to 100 regular nightclub attendees. Participants were recruited through relevant Internet newsgroups, nightclub/promoter mailing lists and personal contacts.
Results The survey achieved a response rate of 50%, with 50 completed questionnaires (25 male, 25 female). Results are summarised in table 1. Table 1. Summary of results from postal survey
A small but significant negative association was found between time spent dancing and symptoms (Pearson r=-0.36, p<0.05). This may be because respondents within this sample experiencing discomfort restrict their dancing. No other significant relationships were found.
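As an illustration of the calculation behind figures of this kind, the sketch below computes a Pearson correlation coefficient and its p-value for hypothetical minutes-danced and symptom-score data; the variable names and values are invented for illustration and are not the survey data.

# Illustrative only: hypothetical values standing in for the survey variables.
from scipy import stats

minutes_danced = [240, 180, 300, 120, 210, 360, 150, 270, 200, 330]
symptom_score = [3, 4, 1, 5, 3, 0, 4, 2, 3, 1]

r, p = stats.pearsonr(minutes_danced, symptom_score)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")  # a negative r would mirror the survey finding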
Nightclub Interviews Additional data were collected by means of an interview survey of dancers in a London nightclub. The nightclub was selected on the basis of its concrete dance floor, a hard surface thought likely to maximise problems. The music played on the night of the survey was in the ‘hard techno’, ‘hard trance’ and ‘gabba’ styles, music with a high number of beats per minute, likely to encourage hard, fast dancing.
Method Interviews were undertaken by 5 interviewers, who approached individuals who were resting or walking around and who had previously been observed dancing. Interviews took place in the early morning, between 0300 and 0600 hours, to allow time for dancing to have taken place. The interview schedule was adapted from the postal questionnaire and had similar content.
Results The survey collected data from 50 participants (34 male, 16 female). Results are summarised in table 2. Table 2. Summary of results from nightclub interviews
A significant positive association was found between time spent dancing and symptoms (Pearson r=0.67, p<0.05). No other significant relationships were found within the data.
Laboratory Experiments The purpose of the laboratory experiments was to allow the effects of different floor/shoe combinations to be examined.
Method Subjects were 10 university students, 5 male and 5 female, mean age 21.5 years (±2.1), all regular nightclub attendees. Decelerations were measured using 2 accelerometers, positioned at the left ankle at the base of the fibula and at the mid-lumbar region of the lower back. Subjects were asked to dance to an extract of music in their usual manner, under 4 conditions. Deceleration data were logged for 1 minute within each condition. The 4 conditions were: (1) hard footwear, hard floor; (2) hard footwear, soft floor; (3) soft footwear, hard floor; (4) soft footwear, soft floor. The order of conditions was balanced across subjects.
Results Deceleration data were analysed by identifying the 10 peak decelerations for each subject and calculating the mean for the ankle and the lower back (figures 1 and 2).
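The peak-selection step can be pictured with a short sketch: assuming the logged deceleration trace is available as a simple array, impact peaks are located and the ten largest are averaged. The signal, sampling length and threshold below are invented for illustration and are not the experimental data.

# Minimal sketch of the peak analysis; the trace here is synthetic, not study data.
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)
trace_g = np.abs(rng.normal(2.0, 3.0, size=6000))   # 1 minute of deceleration samples (g)

peaks, _ = find_peaks(trace_g, height=1.0)           # candidate impact peaks
top10 = np.sort(trace_g[peaks])[-10:]                # the 10 largest peak decelerations
print(f"mean of 10 peak decelerations: {top10.mean():.1f} g")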
Figure 1. Deceleration at Ankle

A mean peak deceleration as great as 18.3g was measured at the ankle for one subject. The highest mean peak deceleration measured at the lower back position was 10.5g. A significant effect of floor surface on deceleration was found (p<0.05), with the hard flooring producing the highest decelerations. No other significant relationships were found.
Figure 2. Deceleration at Lower Back
Discussion The results indicate that the ‘ravers’ participating in this study engage in energetic, high impact dancing, often while under the influence of alcohol or other drugs. The high proportion of survey respondents reporting musculoskeletal discomfort is notable, with levels exceeding 80% in both surveys. The main sites of discomfort were the lower back, knees and ankles/feet. The laboratory experiments found high decelerations and demonstrated an effect of floor surface on ankle deceleration. Caution is needed in drawing conclusions from this study. The self-selected nature of the postal survey sample could have resulted in disproportionate representation of those with musculoskeletal problems. The results of the interview survey are considered less likely to have been affected by selection bias, but the possibility of sampling effects remains. It is clear that use of intoxicants could have affected the responses of both groups of survey participants. It is concluded that there is sufficient evidence of a problem to warrant further research.
References Jones D, 1994, Rave New World. Programme transcript: Equinox, November 1994 (Channel 4: London) Kuorinka I, Jonsson B, Kilbom Å, Vinterberg H, Biering-Sørensen F, Andersson G, Jørgensen K, 1987, Standardised Nordic questionnaires for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18, 233–237 McNeill M and Parsons KC, 1996, Heat stress in night-clubs. In: Contemporary Ergonomics 1996, edited by Robertson SA (Taylor & Francis: London), 208–213
MANUAL HANDLING
IS THE ERGONOMIC APPROACH ADVOCATED IN THE MANUAL HANDLING REGULATIONS BEING ADOPTED? Kevin Tesh
Senior Ergonomist Institute of Occupational Medicine, 8 Roxburgh Place, Edinburgh EH8 9SU
This paper describes the results of employers’ responses to the Manual Handling Operations Regulations in terms of reducing risks using the ergonomic approach. Risk reduction factors considered how the work was done (the task); what was handled (the load); where the load was handled (the working environment) and who handled the load (the individual). The range of measures reported showed that the new ergonomic approach advocated was generally being taken on board by organisations, but the practical implementation of some of the risk reduction measures was not always effective, as indicated by a small number of follow-up site visits.
Introduction Manual handling has long been recognised as a major cause of occupational injury and ill-health. In 1994/95 over 115 million days of certified sickness absence were attributed to back problems and more than 50,000 work-related handling injuries were reported to the enforcing authorities. In 1982 the Health and Safety Commission (HSC) circulated a consultative document on new Regulations and Guidance relating to manual handling at work (HSC, 1982). This represented a major departure from previous legislation on this topic in that it sought to emphasise that the risk of injury from manual handling was not simply a function of the weight being handled, by describing an ergonomic approach to identifying the sources of risk of injury in manual handling activities. This approach can now be seen in the Council of the European Communities (CEC) 1990 European Directive on the minimum health and safety requirements for the manual handling of loads (CEC, 1990). This directive led to the publication of the Manual Handling Operations Regulations (MHORs) (HSE, 1992) and associated guidance, which came into force in the UK on 1 January 1993. British industry, at least those businesses aware of the Regulations, has been trying to comply with the duties since that date. In 1996 the HSC decided to evaluate the effectiveness of these Regulations and Guidance against a background of a wider review of a whole range of health and safety legislation recently imposed on industry. The Institute of Occupational Medicine (IOM) in Edinburgh was commissioned by the HSE to conduct a large-scale, industry-wide survey as part of that
process (Tesh et al., 1997). The IOM had tested the usability of the Guidance by non-ergonomists in the workplace before the Regulations were implemented, resulting in a more effective document (Tesh et al., 1992). The main aims of the project were to study both employers’ and employees’ awareness of, interpretation of and response to the Regulations; to evaluate their appropriateness; and to see how organisations went about implementing the legislation. This paper describes and discusses the results of employers’ responses to the Regulations in terms of reducing manual handling risks using the ergonomic approach.
Methods An employer survey questionnaire addressed all issues relating to the Regulations and Guidance. The questionnaire asked about knowledge and relevance of the Regulations, their implementation, costs and benefits, as well as the usefulness of the Guidance provided. Particular attention was paid to making the questionnaire easy to complete, to encourage a good response rate. A questionnaire was posted to a stratified sample of 5,000 employers covering the following ten industrial sectors: Manufacturing; Construction; Wholesale and Retail; Agriculture/Horticulture; Transportation/Communications; Finance; Local Government; NHS and Ambulance Trusts; Fire Brigades; and Services. Organisations were selected from the single person self-employed up to large companies with 100 or more employees throughout Britain. Responses to this questionnaire were weighted to allow for the distribution of the sample by size and sector. The study was particularly interested in the steps taken by employers to reduce manual handling risks, and so the main factors that employers must consider when making an assessment of manual handling operations were included in the questionnaire under the headings of: the tasks; the loads; the working environment; the individual and training. Not all factors listed in Schedule 1 of the Regulations could be considered, as this would have significantly increased the length of the questionnaire, discouraging respondents from completing and returning it. In order to validate appropriate sections of the employer questionnaire responses, nineteen follow-up company site visits were conducted. The results of these visits are not fully discussed in this paper, although some explanatory information gathered from them is used to supplement the employer responses on the risk reduction strategies. Information on the perception and awareness of the Regulations and Guidance amongst employees was also gathered using two approaches, viz. through trade union representatives and through employers during the follow-up company visits. The results of this employee survey are not addressed in this paper.
Results Employers were asked to tick the main examples of reducing manual handling risks if they had both heard of the MHORs and had at least partly implemented the Regulations within their organisation. The responses from the 5,000 employers were weighted by the distribution of organisations in Great Britain, in order to give a representative picture of Britain as a whole. The results quoted are the weighted figures.
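As an aside on the weighting step, the idea is simply to re-scale each respondent’s answers so that the achieved sample matches the known distribution of organisations by size and sector. The sketch below illustrates post-stratification weighting with invented strata and counts, not the survey’s actual figures.

# Illustrative post-stratification weighting; strata and counts are hypothetical.
population_share = {"small": 0.70, "medium": 0.20, "large": 0.10}  # share of GB organisations
sample_counts = {"small": 300, "medium": 450, "large": 750}        # respondents per stratum

n_sample = sum(sample_counts.values())
weights = {s: population_share[s] / (sample_counts[s] / n_sample) for s in sample_counts}

# Each respondent's answers are multiplied by the weight for their stratum, so
# over-represented strata count for less and under-represented ones for more.
print(weights)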
The Task All the examples listed in the questionnaire relating to reducing the risk by altering the task were used. Except for reducing pushing and pulling efforts, each of the approaches had been employed by 40–60% of the organisations. Reduction of lifting from floor level and from above shoulder level appeared to be the most popular measure, along with reducing carrying distances, while reductions in pushing and pulling efforts were implemented by only 27% of the organisations. This possibly reflects the relative ease with which such measures could be introduced. NHS Trusts had a high percentage who reduced the amount of twisting and stooping and lifting from floor level. Reducing lifting from floor level or above shoulder level was lowest in the Transport sector. Although reducing pushing and pulling was generally low, this example was mentioned by a high percentage in the NHS Trusts and Finance sectors. The relatively low score for reducing pushing and pulling under the task category may be because respondents do not easily identify measures to reduce this particular risk, such as improved trolley maintenance procedures. However, other measures such as improving floor conditions and the layout of the workplace, under the environment category, can have the same desirable effect. Similarly, reducing the load, under the load category, will also result in lower pushing and pulling efforts. Clearly, the categories are interrelated and can influence each other. Varying the work was the risk reduction method least frequently mentioned by the Fire Brigade. Varying the work was more prevalent in larger organisations, presumably because they have the flexibility to do this. Generally, all examples were stated more often by larger companies.
The Load Relatively few organisations had addressed elements under the ‘load’ category to reduce manual handling risks, compared with changing the way an individual undertook the job (i.e. the task). Making a load easier to handle (16%), more stable (18%) and safer to handle (17%) were reportedly tried less often than the 40–60% achieved under the task category. The exceptions were reducing the size or weight of the load (35%) and providing employees with more information about the loads, which was the most commonly stated means of reducing the risk attributable to the load itself (57%). This approach was generally high in all sectors, with Agriculture the lowest. The widespread use of the provision of information as a risk-reduction technique is possibly because this is also a specific requirement under Regulation 4(1)(b)(iii). Providing load information would also be expected to be high as it is a non-physical change, and in most instances would be an easier and cheaper option to implement in order to reduce the manual handling risks associated with the load. These findings could be interpreted as suggesting that organisations had less scope for modifying characteristics of the load, and they provide further justification for the emphasis on factors other than the load, such as who, where and how the load is handled. The next most commonly stated example was reducing the weight or size of the load, which was highest in NHS Trusts and Local Government and lowest in Wholesale/Retail and Services. There were no strong relationships between size of organisation and the methods used, except that larger organisations tended to give employees more information compared with smaller organisations.
The Working Environment Clearing obstructions, improving the layout of workplaces, and improving the lighting were introduced by almost half (48%) and over a third (36% and 34%) of the organisations respectively under the environment risk-reducing category. In most cases these risk reduction measures could simply be introduced by improving the standard of housekeeping, allowing handlers more room and clearer access along handling routes so that more acceptable postures can be adopted. As a consequence, manual handling risks in the working environment could be reduced or eliminated relatively easily and cheaply by adopting good housekeeping measures. Less than one quarter of respondents mentioned evening out floor surfaces, reducing risks associated with steps and ramps, and controlling temperature and draughts. NHS Trusts had the highest percentage who improved the layout and cleared obstructions. Construction, Manufacturing, Finance and Local Government had a high prevalence of clearing obstructions as a means of reducing risks. There were clear trends in the data for size of organisation. Almost all means of reducing risk by altering the environment were more frequently used in the larger organisations.
The Individual Risk reduction measures on ‘selecting the individual’ and ‘providing training’ (together with ‘other factors’) were implemented by between a third and almost two thirds of the organisations. Providing training in handling techniques was the most common, and was virtually 100% in the Local Government, NHS Trust and Fire Brigade sectors. Considering the relatively high number who identified this as a requirement of the Regulations, this result is not surprising. Along with other information and training methods, this provision is in most cases the easiest to implement and therefore tends to appeal as a quick and easy risk reduction measure. While it is recognised that training and information have an important role to play, it is also widely acknowledged that they will not be particularly effective in reducing manual handling problems if the design of the workplace, the loads handled and what operators are asked to do still result in awkward handling postures. A third of the organisations employed selection procedures to identify individuals not physically suited to manual handling. It would be interesting to analyse what selection criteria organisations employed, as the most likely factors such as strength, age and gender have been shown in the scientific literature not to correlate particularly well with reduced manual handling risks. On the other hand, some factors such as previous back problems and long absences from work due to holidays and illnesses have shown a positive relationship. However, this level of detail was not collected during the survey. There was a strong trend related to company size, with larger organisations more likely to provide training in good handling techniques. However, this may be due to the fact that many of the larger organisations, such as Local Government, NHS Trusts and Fire Brigades, have large numbers of staff employed in jobs where handling training has traditionally been an integral part of general training.
Site Visits The outcome of the site visits showed that, as with the postal survey results, organisations were adopting a wide range of ergonomic measures to reduce manual handling risks. The extent of risk reduction measures was influenced by non-manual handling issues such as new
technology, process efficiency and general investment. The level of the manual handling assessments, in terms of job coverage and the detail contained within them, was lacking in some cases. There were also opportunities to do more to reduce manual handling risks by using the generic and collective approaches advocated in the Regulations.
Conclusions The results of the postal survey, with an overall expected response rate of 30%, showed that the range of measures reported for reducing manual handling risks was encouraging and that the ergonomic approach advocated was being taken on board. The most common risk reduction methods employed by organisations were: reducing lifting from floor level or from above shoulder level, under the ‘task’; providing more load information, under the ‘load’; improving workplace layout, under the ‘environment’; and providing training in manual handling techniques, under the ‘individual and training’ section. While the site visits confirmed a wide range of risk reduction steps, the practical implementation of some of these measures was not always effective. The coverage and quality of the assessments, and hence the risk reduction strategies adopted, were also a concern. What appeared less obvious were the arrangements in place for monitoring, auditing and co-ordinating these risk reduction methods to ensure that measures were being effectively implemented.
Acknowledgements The author would like to thank the HSE for funding this research, and co-workers at the IOM who contributed to the main research project from which this paper is taken.
References Council of the European Communities (1990). Council Directive of 29 May 1990 on the minimum health and safety requirements for the manual handling of loads where there is risk particularly of back injuries to workers. (Fourth individual Directive within the meaning of Article 16(1) of Directive 89/391/EEC). Official Journal No. L156/9–13 (90/269/EEC). Health and Safety Commission (1982). Consultative Document: Proposals for Health and Safety (Manual Handling of Loads) Regulations and Guidance Notes. London: HMSO. Health and Safety Executive (1992). Manual Handling. Manual Handling Operations Regulations 1992. Guidance on Regulations L23. London: HMSO. Tesh KM, Symes AM, Graveling RA, Hutchison PA, Wetherill GZ (1992). Usability of manual handling guidance. Edinburgh: Institute of Occupational Medicine (IOM Report TM/92/11). Tesh KM, Lancaster RJ, Hanson MA, Ritchie PJ, Donnan PT, Graveling RA (1997). Evaluation of the Manual Handling Operations Regulations 1992 and Guidance. HSE Contract Research Report No. 152/1997. Sudbury: HSE Books.
CONTROL OF MANUAL HANDLING RISKS WITHIN A SOFT DRINKS DISTRIBUTION CENTRE Liz Wright1 and R A Haslam Health and Safety Ergonomics Unit Department of Human Sciences Loughborough University Leicestershire LE11 3TU
This paper describes an investigation into the presence of manual handling risks, and measures put into place to control these risks, within a large soft drinks distribution centre. Company risk assessments had identified risk associated with handling activities and described training as the control. Postures were analysed using OWAS, and the NIOSH equation was used to estimate levels of risk. The manual handling training programme was evaluated by comparing content with recommended criteria taken from the literature. Manual handling risks were found in both warehouse and delivery areas, some being classed as “excessive” using the NIOSH calculation. The study recommended other means of addressing manual handling risks.
Introduction Injuries, particularly to the back, resulting from handling activities produce a significant proportion of reported injuries (HSE, 1992). There is a high reported rate of back injury among warehouse workers (Ljungberg et al, 1989), with increasing levels of back injury within the drinks industry; professional drivers have also been shown to have a relatively high prevalence of musculoskeletal injury (van der Beek et al, 1993). Manual materials handling (MMH) tasks can be evaluated using various well documented approaches: biomechanical, psychophysical, epidemiological and physiological (Mital et al, 1997). These approaches have helped determine the primary risk factors involved with manual handling and resultant musculoskeletal injuries. This study reviewed the presence of manual handling risks to warehouse operatives and delivery drivers within a soft drinks distribution centre, and evaluated control measures. Risk assessments had been performed by the organisation in response to the Management of Health and Safety at Work Regulations 1992, and had identified manual handling as a problem.
1 Now at Human Applications, 139 Ashby Road, Loughborough, Leicestershire, LE11 3AD
Training was specified as the control measure, and the company specifically requested an evaluation of their training programme. The company had also attempted to reduce the effects of MMH by other means such as raising some of the pallets to a more acceptable height, but had not assessed the effects of these changes. There have been numerous case studies investigating MMH in a variety of work situations (such as Burdorf and Vernhout, 1997; Hickson and Megaw, 1994; Vessali and Kothiyal, 1997) in most cases using a combination of two or more methods. An interview based questionnaire was used to obtain worker information on the presence of musculoskeletal disorders and opinions on training. Postures of both groups were analysed using the Ovako Working Posture Analysing System, and levels of risks in different situations were estimated using the NIOSH equation and HSE guidelines.
Methods The study included two groups of employees; warehouse operatives and local delivery service (LDS) drivers. Within the warehouse is a “break bulk” area, where cases of soft drinks are loaded onto LDS lorries. Operatives drive powered pallet trucks to locations in the break bulk area, and select, or “pick”, cases either onto pallets or into cages, which they then take to the loading area. Lorries are loaded with either cages or pallets of product for delivery to small and medium retail outlets and drivers manually unload cases at the customer location. The study therefore involved two work methods in the warehouse and delivery, as well as two groups of workers.
Semi-structured interviews A questionnaire was used as a framework for semi-structured interviews conducted with 19 warehouse operatives and 12 drivers, selected during different shifts over a two week period. This included questions on training attendance and opinions, work experience and job satisfaction, as well as a section derived from the Nordic questionnaires (Kuorinka et al, 1987) on musculoskeletal disorders.
Ovako Working Posture Analysing System, OWAS OWAS (Louhevaara and Suurnakki, 1992) postural analysis is a method devised to classify working postures of the back, arms and legs, giving an estimation of the potential for musculoskeletal injury. The position of body components and the activity being performed are coded and recorded at timed intervals. Initial observations of workers identified the specific activities. A sample period of 20 seconds was used following pilot trials. Warehouse operatives were observed from the start of picking onto a new cage or pallet until completion. Drivers were observed from the start of unloading lorries to the point where they begin to wheel product onto premises.
National Institute of Occupational Safety and Health (NIOSH) Equation The revised NIOSH equation (Waters et al, 1993) was used to determine the level of risk associated with some MMH tasks. The calculation considers factors including horizontal and vertical distances, frequency of lifting, asymmetry and coupling (grip) to determine the Recommended Weight Limit (RWL). A “load constant” of 23 kg is used as a maximum, which is then reduced by multipliers according to the specific lifting conditions. The Lifting Index (LI) compares the actual weight being handled with the RWL, providing an estimated level of risk, with the probability of low back pain increasing as the LI increases (Waters et al, 1997). The developers of the NIOSH equation agree that “many workers will be at an elevated risk if the LI exceeds 3.0” (Waters et al, 1993).
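To make the structure of the calculation concrete, a minimal sketch of the revised NIOSH equation is given below. The horizontal, vertical, distance and asymmetry multipliers follow the published metric formulas (Waters et al, 1993); the frequency and coupling multipliers are passed in directly, since their published look-up tables are omitted here, and the example lift values are invented for illustration only.

# Sketch of the revised NIOSH lifting equation (metric form, Waters et al, 1993).
# FM and CM are normally read from look-up tables; here they are supplied directly.
def recommended_weight_limit(h_cm, v_cm, d_cm, a_deg, fm, cm):
    LC = 23.0                               # load constant, kg
    HM = min(1.0, 25.0 / h_cm)              # horizontal multiplier
    VM = 1.0 - 0.003 * abs(v_cm - 75.0)     # vertical multiplier
    DM = 0.82 + 4.5 / d_cm                  # distance multiplier
    AM = 1.0 - 0.0032 * a_deg               # asymmetry multiplier
    return LC * HM * VM * DM * AM * fm * cm

# Hypothetical lift: load held 40 cm from the body, lifted from 30 cm to 100 cm,
# with 45 degrees of trunk twist, moderate frequency and fair coupling.
rwl = recommended_weight_limit(h_cm=40, v_cm=30, d_cm=70, a_deg=45, fm=0.88, cm=0.95)
lifting_index = 12.0 / rwl                  # actual load weight (kg) divided by the RWL
print(f"RWL = {rwl:.1f} kg, LI = {lifting_index:.2f}")

In this invented example the RWL comes out at around 8 kg, so a 12 kg case gives an LI of about 1.5, above the level at which the equation begins to indicate increased risk for some workers.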
Training evaluation A list of topics to be included in training was developed from the literature (see table 1) and was used to evaluate the training. Training is provided by a number of workers, all of whom have completed a five day course provided by a consultancy. Training sessions and the information given to those who attend are based on that provided by the consultants. Table 1 List of criteria used to assess the manual handling training (Based on Birnbaum et al, 1993; Chaffin and Andersson, 1991; Chaffin et al, 1986; Chartered Society of Physiotherapy (CSP), 1994; HSE, 1992b; Kroemer, 1992; Troup and Edwards, 1985)
Results Semi-structured interviews The most frequently reported musculoskeletal disorder over the last year was lower back problems, reported by 47% of warehouse operatives (n=19) and 58% of drivers (n=12). Knee problems were reported by 50% of drivers and 32% of warehouse operatives. 50% of drivers also reported neck trouble. Of those reporting back trouble, 22% (7 workers) reported having been absent from work because of it during the last year, 16% reported changing duties (for example, carrying out light duties for a period of time) and half claimed to have reduced their activities at home or at work.
Ovako Working Posture Analysing System, OWAS The OWAS categories were used to study combinations of postures and individual back postures. Differences in postural combinations and in individual back postures were analysed using a chi-square test on the observations in each group, with the null hypothesis “there is no difference between cages or pallets with respect to proportions of harmful postures observed” (in either warehouse or delivery). Harmful postures were examined relative to the activities in which they occurred. A minimum of 100 observations is recommended for each job or task (Louhevaara and Suurnakki, 1992). A total of 531 observations were made of operatives picking. There were significant differences (p<0.01) between cages and pallets, both for postural combinations and for individual back postures, with fewer harmful postures when using cages. A total of 603 observations were recorded on delivery. There was a significant difference between postural combinations (p<0.05) in favour of cages, but the difference was not significant when comparing individual back postures.
When using pallets, the activities with the highest proportions of harmful postures were lifting and lowering, reaching for products, and climbing on and off the lorry.
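By way of illustration, the chi-square comparisons described above test whether the proportion of harmful postures differs between the two conditions. The counts in the sketch below are invented and do not correspond to the study’s observations.

# Illustrative chi-square test on a 2x2 table of observation counts
# (harmful vs acceptable postures, cages vs pallets); figures are hypothetical.
from scipy.stats import chi2_contingency

observed = [[45, 220],   # cages:   harmful, acceptable
            [90, 176]]   # pallets: harmful, acceptable
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p:.4f}")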
National Institute of Occupational Safety and Health (NIOSH) Equation Calculations were made to provide estimations of the risks associated with various other conditions, including different worker techniques and the effects of raised areas. An LI of less than 1 was possible in some areas when lifting close to the body with no twisting. Increasing the horizontal distance when lifting from or to the back of a pallet or cage significantly reduced the RWL (LI of 3.1 and 2.4 respectively). The reduction was slightly less when using cages, due to their smaller depth. Worker posture reduced the RWL when twisting (from 11.82 kg to 10.17 kg) or when the load was not held close to the body. When lifting crates, workers cannot slide the load close to them; lifting crates from the back to the front of a pallet resulted in an LI of over 4.
Training Comparison of the training session content with the criteria is shown in table 2. A large number of interviewees described the session as “interesting”, “good”, “informative” and “thorough”, and said it increased their awareness of lifting and of how the body works. The training concentrated on a particular technique without discussing realistic approaches to actual working conditions. Table 2 Results of comparative evaluation
Coverage: “Good” if area covered fully, “Satisfactory” if some aspects are covered, “Poor” if this area not covered.
Discussion The organisation had performed risk assessments as part of the requirements under the Management of Health and Safety at Work Regulations (1992) and these identified manual handling as a high risk activity. More detailed manual handling risk assessments had not taken place.
There had been attempts to control these handling risks, but this study concluded that such risks were still present. Training was identified by the organisation as the primary control measure. The study emphasised the need for training to be used as a secondary control measure and recommended other control methods, including the introduction of equipment and changed work practices. Recommendations were also made regarding the training itself, primarily to ensure that it was relevant to specific work conditions. The use within the study of more than one methodology was an important means of identifying some of the complex factors involved in MMH. The OWAS and NIOSH methods in some instances resulted in conflicting recommendations. Consideration of workplace factors as well as actual worker postures, along with subjective information, allowed realistic recommendations for reducing manual handling risks to be made.
References Burdorf, A and Vernhout, R, 1997, Reduced physical load during manual lifting activities after introduction of mechanical handling aids. In Seppala, Luopajarvi, Nygard and Mattila (eds.) Proceedings of the 13th Triennial Congress of the International Ergonomics Association, June 29th–July 4th, Tampere, Finland. Vol 4 Musculoskeletal Disorders and Rehabilitation. Health and Safety Executive, 1992, Manual handling: guidance on regulations (HMSO, London) Hickson, J and Megaw, T, 1994, An example of job redesign in a large automotive plant, Contemporary Ergonomics 1994, (Taylor and Francis, London) 400–5. Kuorinka, I, Jonsson, B, Kilbom, A, Vinterberg, H, Biering-Sorensen, F, Andersson, G, Jorgensen, K, 1987, Standardised Nordic questionnaires for the analysis of musculoskeletal symptoms, Applied Ergonomics, 18, 3, 233–237 Ljungberg, A.S, Kilbom, A and Hagg, G.M, 1989, Occupational lifting by nursing aides and warehouse workers, Ergonomics, 32, 1, 59–78 Louhevaara, V and Suurnakki, T, 1992, OWAS: a method for the evaluation of postural load during work, Training Publication II. (Institute of Occupational Health, Helsinki, Finland) Mital, A, Nicholson, A.S and Ayoub, M.M, 1997, A guide to manual materials handling (Taylor and Francis, London) Van der Beek, A.J, Bruijns, P.W, Veenstra, M.S and Frings-Dresen, M.H.W, 1993, Energetic and postural workload of lorry drivers during manual loading and unloading of goods. In Marras WS, Karwowski W, Smith JL and Pacholski L (eds.) The Ergonomics of Manual Work Vessali, F and Kothiyal, K, 1997, A case study of the application of the revised NIOSH guide and OWAS to manual handling in a food processing factory. In Adams (ed.) Proceedings of the 30th Annual Conference of the Ergonomics Society of Australia, Sydney, 1994 Waters, T.R, Putz-Anderson, V, Garg, A and Fine, L, 1993, Revised NIOSH equation for the design and evaluation of manual lifting tasks, Ergonomics, 36, 749–776 Waters, T, Baron, S, Haring-Sweeney, M, Piacitelli, L, Putz-Anderson, V, Skov, T and Fine, L, 1997, Evaluation of the revised NIOSH lifting equation: a cross sectional epidemiological study. In Seppala, Luopajarvi, Nygard and Mattila (eds.) Proceedings of the 13th Triennial Congress of the International Ergonomics Association, June 29th–July 4th, Tampere, Finland. Vol 4 Musculoskeletal Disorders and Rehabilitation. Finnish Institute of Occupational Health.
TRAINING AND PATIENT-HANDLING: AN INVESTIGATION OF TRANSFER J A Nicholls and M A Life Ergonomics and HCI Unit University College London 26 Bedford Way London WC1H 0AP
This paper describes an exploratory study comparing novices’ compliance with patient-handling training in the taught classroom setting with their later compliance in the workplace. Their workplace performance was also compared to that of experts. Reasons for non-compliance were explored. Novice workplace performance was significantly worse than classroom performance and worse than the workplace performance of experts. The differences were, in part, attributable to the failure of the practical element of the programme to support workplace handling. Experts were more likely to: decompose the task into subunits; think in advance; and respond to unexpected events. It is suggested that training needs to provide more support for novices’ acquisition of practical knowledge. Such training would enable development of a set of advance planning behaviours to enhance performance in diverse clinical situations.
Introduction Little consensus exists regarding the effectiveness of training in patient-handling (Wood, 1987; Videman et al 1989). Much training design has viewed handling as a variant of simple lifting. Hence, training has been based on a physical model of the trainee, relying on the assumption that the acquisition of a set of lifting skills will produce safe handling in the clinical workplace. Most studies of training effectiveness have examined performance in either the classroom (e.g. Troup and Rauhala, 1987) or the workplace (Takala and Kukkonen, 1987), but few have investigated both the acquisition of handling skills and their subsequent application. Furthermore, handling patients is a complex task which requires both overt behaviours and associated mental behaviours. The presumption that ineffective training may be attributable to trainees’ lack of ‘physical’ skill has led to scant attention being paid to the required mental behaviours, such as planning and decision making during a lift. The general aim of this study was to explore the existence and nature of the problem of ineffective training. Specific aims were: 1) to determine whether novice performance differs in the classroom and workplace; 2) to compare novice and expert workplace performance and behaviours; and 3) to consider the adequacy of the design of the training programme.
Methods and materials The study was designed to explore the problem in as naturalistic a manner as possible. The use of a novice group of subjects in both the classroom and workplace, and of the expert group in the workplace, provided some level of control over the variable of lifters’ knowledge. A between-groups comparison of novices’ and experts’ workplace performances was possible, and a within-groups comparison of novices’ classroom and workplace performances. To assess performance, a checklist was developed based on that previously used by Chaffin et al (1986). Raters were asked to assess subjects’ (Ss) performance with respect to 5 criteria concerning: closeness of load, erectness, smoothness, avoidance of twisting and adequacy of grip. In health workplaces, optimum body postures often cannot be assumed because of environmental constraints, so the criteria in the checklist were all defined in a relative sense using the qualifier ‘as possible’. A 16-item questionnaire was developed to facilitate exploration of the relative adequacy of the lecture and practical parts of the programme, and to provide a basis for understanding any differences in classroom and workplace performance.
Procedure Eleven undergraduate physiotherapy students and eleven expert therapists participated in the study. Each expert had more than 3 years’ full time, post-qualification clinical experience. Novices were videoed whilst performing a handling task on a simulated patient in the classroom following completion of training. The training was based on the teaching of basic principles reflected in the ‘straight back, bent knees’ profile. The class setting permitted application of the principles of handling as taught. As part of the assessment, Ss were questioned to test their lecture knowledge and were scored using a 1–5 scale, with a score of 3 representing an acceptable level of knowledge. The videorecordings were subsequently assessed by two independent raters using the performance checklist. Each rater had more than 10 years’ experience of making observational assessments of students’ performance. Each criterion was scored on a 0–3 scale, so performance scores between 0 and a maximum of 15 could be achieved. To pass the classroom assessment, Ss had to score at least 60% on the practical part of the assessment, i.e. a checklist score of 9 or more. Ss were videoed again two years later, whilst performing two handling tasks during the course of their normal working routines, on a working day chosen at random. The observer followed Ss as unobtrusively as possible. To try to ensure as rich a capture of data as possible, the observer also noted the behaviours carried out by a subject in association with the task performed (e.g. clearing the workspace). Following completion of the handling task, Ss were interviewed. Assessment of Ss’ behaviours was made from the videos and from the observer’s notes.
Data Analysis Each video was assessed by one of two independent raters on two separate occasions. The raters were trained by exposing them to videos of ‘good’ and ‘poor’ performance in order to establish a consistent set of criteria. Each rater judged each subject’s performance, in the classroom and the workplace, using the performance checklist. The behaviours were informally classified from the videorecordings and experimenter’s notes on an arbitrary temporal basis into one of three stages: the preparatory stage (i.e. activities prior to actual change of a patient’s position), the
lifting stage (i.e. activities concerned with changing a patient’s position), and the post-lifting stage (i.e. activities subsequent to a patient being moved).
Results It was considered that the data met the requirements for the use of parametric analyses (Huck and Cormier, 1996). Data were analysed for a total of 33 tasks: 1 task performed by each expert and each novice in the workplace, and 1 task performed by each novice in the classroom.
Performance scores Reliability of checklist: Intra-class correlations were used to estimate the inter- and intra-rater reliability of the checklist. All of the scores were significantly associated (r=0.92). Classroom performance scores of novices: The mean performance score achieved by the novices was 11.91 (sd=1.37). Subjects’ lecture knowledge was also found in all cases to be of an acceptable standard (mean=3.45, sd=0.52). Workplace performance scores of novices and experts: The mean performance score achieved by the novices was 8 (sd=3.16). To test the hypothesis that there would be a significant difference between novices’ classroom and workplace scores, a Student’s t-test was used. This revealed a significant difference (t=4.87, df=10, p=0.0007), indicating a decline in performance in the workplace. The mean performance score achieved by the experts in the workplace was 12.91 (sd=0.87). A two sample t-test revealed a significant difference between novices’ and experts’ workplace scores (t=3.4, df=20, p=0.004), suggesting that novices’ performance might be amenable to improvement.
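The two comparisons can be sketched as follows: a paired t-test for the within-subject classroom versus workplace scores, and an independent two-sample t-test for the novice versus expert workplace scores. The score lists below are invented stand-ins, not the study data.

# Illustration of the two t-tests used; the scores below are hypothetical.
from scipy import stats

novice_class = [12, 11, 13, 12, 10, 14, 11, 12, 13, 12, 11]
novice_workplace = [8, 7, 10, 9, 5, 12, 6, 8, 9, 7, 7]
expert_workplace = [13, 12, 14, 13, 12, 13, 12, 13, 14, 13, 13]

t_within, p_within = stats.ttest_rel(novice_class, novice_workplace)        # paired comparison
t_between, p_between = stats.ttest_ind(novice_workplace, expert_workplace)  # two-sample comparison
print(f"paired: t = {t_within:.2f}, p = {p_within:.4f}")
print(f"independent: t = {t_between:.2f}, p = {p_between:.4f}")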
Workplace behaviours In order to explain the performance differences, subject behaviour was examined in detail. Many behaviours were common to novices and experts. For example, Ss in both groups communicated with the patient and assessed the clinical situation. In the preparatory phase, experts seemed more likely than novices to: decompose the task into sub-units; clear the workspace of relevant objects (i.e. those likely to interfere with performance of a task); adjust the bed; arrange pillows on the bed; position the patient’s feet; and check that the patient was free of encumbrances to movement. During the lifting phase, experts appeared to be more likely to: move the patient to the edge of the bed/chair; continue instructing the patient during the manoeuvre; and respond to unexpected events. Conversely, novices appeared less likely to break the task down into sub-units and to clear the workspace appropriately; for example, novices tended to reposition objects whose position was irrelevant to the execution of a lift. Novices also seemed to be less able to respond to unexpected events. There appeared to be fewer differences in the behaviours performed by novices and experts in the actual lifting phase, although, as the performance scores showed, the quality of execution of the behaviours differed markedly.
Reasons for novices’ non-compliance in the workplace No problems were identified regarding the lecture knowledge components of the programme. The main problems concerned the practical aspects of the programme; nine novices identified insufficient practical time, and eight reported difficulty in remembering the practical parts. Just under half of the subjects reported having a poor understanding of the practical parts. Most Ss claimed that their ability to handle had improved since their classroom assessment, and almost all claimed to have been motivated when being assessed in the classroom.
Discussion This exploratory study aimed to expose the existence of a problem concerning training in handling by examining the performance and behaviour of novices and experts. Overall, the results indicate marked differences between novices’ performances in the classroom compared to the workplace, and between novices’ performances in the workplace and those of experts. Differences were also found between the behaviours of novices and experts, confirming the existence of a problem.
Performance scores In general terms, adequate performance in the class followed by failure to perform adequately in the workplace suggests a problem with transfer of the knowledge acquired during training. All the novices achieved reasonably high classroom performance scores, indicating that, in this study, as in Videman et al’s (1989) investigation, the taught techniques could in fact be learnt. The observed level of classroom performance is, however, markedly higher than that found by Troup and Rauhala (1987) and Videman et al (1989) when using a single three-point scale. Although Videman et al (1989) found a significant difference between the performances of trained and untrained subjects, the mean score of the trained group failed to reach 50% of the maximum. It could be argued that, in the present study, the criteria upon which the performance checklist is based are related, leading to a subject’s performance being over-assessed. However, the size of the comparative difference makes it seem unlikely that it can be attributed solely to possible over-assessment. It seems more likely that Videman et al’s (1989) low score suggests either unrepresentatively poor subject groups or possibly an inappropriate scoring technique. Whereas our assessment of performance was derived from five relatively circumscribed subscores, Videman et al (1989) used a single measure of performance which incorporated assessment of less easily defined aspects of performance (e.g. how well a lift was planned). Experts’ workplace scores, whilst still ‘imperfect’, were significantly better than the scores of novices. Methodological issues aside, the main implication of the difference is that the deficiency in novice performance might be amenable to being reduced. To explore this further, it was necessary to consider the different ways in which novices and experts behave.
Workplace behaviours and reasons for novices’ non-compliance Although as standardised as possible, the recording and classification of behaviours was subjective and this must be borne in mind when considering the behavioural data. Nevertheless, in comparison to novices, experts’ behaviour suggested two characteristics: experts appeared to organise the task across a longer time horizon and they seemed to be more likely to respond to unexpected events. Experts evidenced behaviours which seemed to indicate that they had thought through their intended plan of action, e.g. in the case of a catheterised patient transferring from bed to chair, experts were more likely to check that the catheter was free to move. More detailed forward planning was also suggested by the fact that the experts were more likely to explain the task to the patient by breaking it down into subsections. In contrast, novices were, for example, less likely to clear the destination area suggesting that they had not thought through the intended plan of action in the same way as experts. This was supported by the fact that, unlike the experts, novices tended to explain the task to the patient in terms of the overall aim rather than by breaking the task down into
subsections. Novices’ apparent inability to break the task down into sub-sections in advance suggests a bottom-up strategy of problem-solving, with little advance planning of the consequences of their current actions. Such expert-novice differences have been observed in other domains (e.g. Chi et al, 1983). Novices also seemed less able to respond to unexpected events, resulting in their frequently being found in situations where, for example, the chair was too far from the bed. Experts seemed less likely to be ‘stranded’ by a poorly positioned chair and, if this did happen, they were more likely to be able to respond. Either the experts were more able to predict likely problems and so prevent them, or else their strategies for action were sufficiently flexible to allow them to respond to the unexpected. Reflecting back to the training programme, novices may be deficient in the high level knowledge required to plan the sequencing of their movements, and to adjust this planning in response to changing situational demands. Novices’ questionnaire responses suggested that poor workplace performance concerned those components of the programme associated with the provision of practical knowledge. The suggestion is that it is these elements which need redesigning or extending. The most clear-cut finding was that the amount of practice was insufficient, leading to the practical knowledge being inadequately acquired and so failing to support handling in the workplace. It may not be that the programme fails to promote the acquisition of practical knowledge per se, but that it fails to foster its acquisition at a sufficiently high level to support the development of a flexible set of behaviours which can be recruited in a novel situation. This seems to leave novices with little option other than to perform the task as best they can, in other words, to rely on the low level knowledge that they have acquired. In conclusion, this study has demonstrated a problem concerning novices’ use of taught handling skills in the workplace, and has identified weaknesses in the training. We suggest that it is reasonable to attempt to reduce the problem by enhancing practical training so that novices’ ability to consider their behaviours in advance is supported.
References Chaffin, D.B., Gallay, L.S., Wooley, C.B. and Kuciemba, S.R. 1986, An evaluation of the effect of a training program on worker lifting postures. International Journal of Industrial Ergonomics 1 127–36 Chi, M.T., Glaser, R. and Rees, E. 1983, Expertise in problem solving. In Sternberg, R.J. (Ed) Advances in the psychology of human intelligence, (Vol 2. Lawrence Erlbaum. Hillsdale, New Jersey). Huck, S.W. and Cormier, W.H. 1996, Reading statistics and research. Harper Collins. NY. St Vincent, M. and Tellier, C. 1989, Training and handling: an evaluative study. Ergonomics 32 191–210 Takala, E.P. and Kukkonen, R. 1987, The handling of patients on geriatric wards. Applied Ergonomics 18 17–22 Troup, J.D.G., and Rauhala, H. 1987, Ergonomics and training. International Journal of Nursing Studies 24 325–30. Videman, T., Rauhala, S., Lindstrom, K., Cedercruetz, G., Kamppi, S., Tola, S. and Troup, J. 1989, Patient handling skill, back injuries and back pain. An intervention study in nursing. Spine 14 148–56 Wood, D.J. 1987, Design and evaluation of a back injury prevention programme within a geriatric hospital. Spine 12 77–82
Risk Management in Manual Handling for Community Nurses Pat Alexander
Back Care Adviser Herts Handling Training and Back Care Advice 36 Barlings Road, Harpenden, Herts, AL5 2BJ
A mixed method study was set up to evaluate a risk management programme, using a quantitative survey of both managers and operational staff, and to explore risk taking behaviour in community nurses with reference to manual handling practices. Nurses obliged to work in situations where managers’ recommendations had not been implemented were 3 times more likely to have taken sick-leave for back/neck pain in the last 12 months. Different perceptions of risk by managers and operational staff were revealed, which could be addressed by joint training or consultation.
1 Introduction In the past, physiotherapists have often trained care staff in the moving and handling of patients. In 1992 the author was asked to set up a training department for a Community Trust, which would also train staff from outside agencies as an income generating activity. In view of the emphasis of the forthcoming European initiative on manual handling, leading to the Manual Handling Operations Regulations 1992 (MHOR 1992), a strategy was devised that trained all Community Nurse Managers in Risk Assessment and Reduction, alongside a manual handling programme for all hands-on community nurses. A form for recording risks and recommendations for manual handling was compiled in 1993 and a programme implemented to raise awareness of risk and educate staff in good practice. It was expected that financial savings would result from a decrease in litigation and in sick leave for musculo-skeletal problems. However, the Trust method of recording sick leave did not allow access to these data, and it is well-known that there is under-reporting of accidents in the NHS (National Audit Office, 1996). Thus it was decided that a survey of self-reported back/neck pain would be necessary to establish a baseline against which to evaluate the effect of the programme. It was anticipated that, due to the Care in the Community Act (1990), the level of dependency of patients nursed in the community would increase. In order to reduce the effect of confounding variables, such as an increased input from Social Services reducing the amount of hands-on nursing required from Trust staff, the survey included questions on changes seen to have influenced community nursing.
2 Methods In order to explore the hypothesis, a quasi-experimental survey was conducted, followed by semi-structured interviews. The measurements taken were the amount of self-reported sick leave taken for back/neck pain, from the data concerning the hands-on nurses, and the managers’ ability to implement their own recommendations for risk reduction in manual handling strategies. As Trust sickness data and accident reports were considered to be neither accessible nor reliable, two anonymous postal surveys were conducted among 61 Community Nurse Managers and 165 hands-on community nurses. The two questionnaires were based on the Witney Back Survey (Harvey, 1985), with additional questions on risk assessment and related themes. A response rate of 69% was obtained from the managers and 55% from the hands-on staff. The results of the two questionnaires were analysed and interviews were conducted with volunteer subjects from management and hands-on staff from a range of geographical areas in the Trust.
3 Results The managers’ questionnaire showed the following factors were believed to prevent implementation of their recommendations for safer practice.
Fig 1 showing Factors preventing implementation of manager’s recommendations
The hands-on staff showed a different perception of the factors influencing their back/neck pain.

Table 1 showing Factors perceived by hands-on community nurses to influence back/neck pain

Lifting heavy patients          40.9%   n=38
Stooping                        68.8%   n=64
Space constraints               61.2%   n=57
Lifting boxes, etc              19.3%   n=18
Driving                         30.1%   n=28
Problems in patient's home      44.0%   n=41
The data relating to the epidemiology of back/neck pain were collected only from the hands-on nurses. The lifetime prevalence of back/neck pain was found to be 83% (n=78), the point prevalence 32.2% and the annual prevalence 72%. There was a significant association between the number of hours worked and the annual prevalence of back/neck pain (significant at the 5% level, 1 degree of freedom). Table 2 showing Significant association between the number of hours worked and the annual prevalence of back/neck pain (significant at the 5% level, 1 degree of freedom)
There was also a significant relationship (significant at the 5% level, 1 degree of freedom) between those nurses who had taken sick leave for back/neck pain in the last 12 months and working where the recommendations made by their manager had not been implemented. Table 3 showing Significant relationship between non-implementation of manager’s recommendations and sick leave due to back/neck pain in last 12 months (chi-square test, significant at the 5% level, 1 degree of freedom)
4 Discussion
The nurses in this study, contrary to many of those in other studies (Buckle, 1987), showed no significant relationship between lifetime, annual and point prevalence of back/neck pain and age or length of time as a nurse. However, this group was small compared to other studies. Many of those studies used different time bands for prevalence and a multiplicity of methodologies, and are therefore not easily comparable. Nor is back pain itself an easy problem to define, being a symptom rather than a disease. Recent research (Knibbe and Friele, 1996) shows similar findings of lifetime prevalence of back/neck pain amongst Dutch community nurses: their figure of 87% compares with the 83% revealed in this study.
One of the most powerful results is that a nurse obliged to work in a situation where her manager has not been able to implement her own recommendations for safe practice is three times more likely to have taken sick leave for back/neck pain in the last 12 months. This finding alone shows the importance of a thorough assessment of the risks of manual handling, and emphasises the necessity of implementing the recommendations, or of altering the package of care delivered until such time as a safe system of work is in place.
Research shows that nursing carries an increased risk of work-related back/neck pain (HSE, 1992). Entering the nursing profession could thus be seen as an exercise in risk-taking behaviour, but many people believe in their own personal immunity (Adler et al, 1992). The zero-risk theory postulates that, although people aim to eliminate risky behaviour, this is often not effective owing to faulty perception of risk (Pitz, 1992). Many nurses have been misled into believing that correct use of technique will protect them from injury (Harber et al, 1988), whereas it is not solely the provision of training but an ergonomic approach that will improve the situation. Thus their apparent risk-taking behaviour is non-deliberative (Yates, 1992) and linked to a lack of up-to-date knowledge of research-based practice.
Owing to the apparent increase in the number of very dependent patients in the community, including the terminally ill, it appears that, despite the increased input from Social Services revealed in this research, those patients requiring nursing need more complex intervention. It is known that these types of patients are physically and mentally more demanding to nurse (Hignett and Richardson, 1995), and nurses may well be prepared to take risks for those patients who have a limited life expectancy.
Hands-on community nurses believe that stooping and space constraints are their main problems, whereas their managers believe that non-availability of hoists and space constraints are the main problems. Perhaps this is because relatively few of the managers now perform much of the practical work of dressings, etc., so they are not personally reminded of the risks of holding a static posture. As Adams (1995) states, the decision makers are often removed from the consequences of their actions.
5 Conclusion
Differing perceptions of risk between managers and staff were revealed in this study. The MHOR 1992 emphasise that the workforce must be consulted whilst making the handling assessment; perhaps this could be a joint training issue. Community Nurse Managers must understand the importance of implementing their prescriptions for safe practice in manual handling, including the legal and financial implications. Procedural solutions alone, such as
training, have a limited effect (Buckle et al, 1992). The findings of this research confirm those of many others, in that a multi-faceted approach including design and engineering solutions is essential to reduce the high incidence of work-related back/neck pain in community nursing.
References
Adams, J. 1995, Risk. London: UCL Press
Adler, N.E., Kegeles, S.M. and Genevro, J.L. 1992, Risk taking and health. In Yates, J.F. (ed) Risk-taking Behaviour. Chichester: John Wiley and Sons Ltd.
Buckle, P.W. 1987, Epidemiological aspects of back pain within the nursing profession. International Journal of Nursing Studies, 24(4), 319–324
Buckle, P.W., Stubbs, D.A., Randle, P. and Nicholson, A. 1992, Limitations in the application of materials handling guidelines. Ergonomics, 35(9), 955–964
Harvey, J. 1985, The Witney Healthy Back Survey: a survey of staff attitudes towards lifting patients and objects, and the causes of back pain at Witney Community Hospital. Oxfordshire Health Unit: Centre for Health Promotion.
Health and Safety Executive. 1992, Manual Handling: Guidance on Regulations. London: HMSO
Hignett, S. and Richardson, B. 1995, Manual handling human loads in a hospital: an exploratory study to identify nurses' perceptions. Applied Ergonomics, 26(3), 221–226
National Audit Office. 1996, Health and Safety in NHS Acute Hospital Trusts in England. London: The Stationery Office.
Pitz, G.F. 1992, Risk taking, design and training. In Yates, J.F. (ed) Risk-taking Behaviour. Chichester: John Wiley and Sons Ltd.
Yates, J.F. 1992, Epilogue. In Yates, J.F. (ed) Risk-taking Behaviour. Chichester: John Wiley and Sons Ltd.
CHILDREN'S NATURAL LIFTING PATTERNS: AN OBSERVATIONAL STUDY
Fiona Cowieson
Department of Health Studies, Brunel University, Borough Road, Twickenham, London TW7 5DU
This preliminary observational study explored children's natural approaches to a lifting task. A trained observer viewed sagittal-plane video recordings of eighteen children lifting a box. Lifting performance was assessed using an observational checklist, and posture was categorised according to criteria for stoop, squat or semisquat. The video recordings were digitised to provide angular (trunk, knee) and distance (between subject and load) data. Fourteen children stooped and none adopted a squatting posture. It is suggested that schoolchildren do not naturally lift 'correctly' and that there may be a basis for targeting training at children aged 7 or younger.
Introduction
Since 1992 the provision of training in manual handling for adult workers has been required by law (HSE, 1992). Mostly, such training encourages people to lift 'correctly' by keeping the back straight and holding the load close to the body. The biomechanical argument for this is clear. However, there is little, and conflicting, evidence that training programmes are effective either in reducing low back pain or in changing working postures (Troup and Edwards, 1985; Pheasant, 1991). Indeed, everyday observation suggests that adults do not automatically lift in a 'correct' manner. Conversely, anecdotal evidence suggests that children do in fact adopt a squatting posture when lifting. If children do lift 'correctly', there may be a basis for introducing lifting training at an earlier age before this advantage is lost. Equally, if children do not lift in a 'correct' way, it may suggest that training programmes for adults are, by attempting to replace habits which have developed over a long period of time, doomed to failure. No studies have investigated children's natural preferences when lifting, yet it seems reasonable to suggest that a greater understanding of children's lifting preferences may provide information which could be useful to the future development of training programmes. Hence, this exploratory study aimed to examine the naturalistic lifting behaviour of prepubertal children.
Method

Subjects
Eighteen children participated in the study (mean age = 7.5±0.4 yrs; mean body mass index (W/H²) = 22.16±6.1). All subjects were right-handed. Using personal or parental
questionnaires, subjects were screened for any major visual or musculoskeletal disorders which might affect their lifting behaviour. None of the subjects had received any lifting instruction in school.
Procedure
This is an observational study of prepubertal children performing a natural lift. All subjects wore shorts and T-shirts. Four markers were attached to the skin overlying: 1) the spinous process of the 7th cervical vertebra, 2) the lateral malleolus, 3) the lateral knee joint line and 4) the greater trochanter. Each subject stood 2 metres away from a cardboard box weighing 682 g, with dimensions 375 mm × 440 mm × 125 mm (H×W×D). Using standardised instructions, each subject was asked to lift and place the box on a table. Each child performed three lifts with a 30-second rest between each lift. Continuous video recordings were made with a camera placed in a standardised position 4 metres lateral to the load in order to obtain a sagittal view of the lifting posture. No child was permitted to view other children lifting.
Data acquisition
Manual digitisation of the position of each skin marker was carried out by a trained observer using a real-time video frame grabber and a standard computer. The reliability and accuracy of the digitising system were tested against a series of known angles and found to be high (mean angular error = 0.86°±0.57°, SE 0.13°).
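As a rough illustration of the validation step just described, the sketch below compares a set of digitised angles against known reference angles and summarises the absolute error. The angle values are hypothetical; only the form of the summary statistics follows the text.

```python
# Hypothetical digitiser validation: compare digitised angles against known angles
# and summarise the absolute error (mean, SD, SE).
import numpy as np

known     = np.array([10.0, 30.0, 45.0, 60.0, 90.0, 120.0])
digitised = np.array([10.9, 29.2, 45.8, 60.6, 90.7, 119.1])

errors = np.abs(digitised - known)
mean_err = errors.mean()
sd_err = errors.std(ddof=1)
se_err = sd_err / np.sqrt(len(errors))
print(f"mean angular error = {mean_err:.2f} deg (SD {sd_err:.2f}, SE {se_err:.2f})")
```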
Angle and distance data
Angle and distance data were obtained from the videos by digitising the positions of the four skin markers together with a fifth point corresponding to the estimated centre of gravity of the box. The four markers enabled definition of two angles, one representing forward inclination of the trunk (torso angle) and the other knee posture (knee angle). The horizontal distance between the subject and the box was determined from the marker on the lateral malleolus and the digitised fifth point corresponding to the estimated centre of gravity of the box.
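The following sketch shows one plausible way to derive the torso angle, knee angle and horizontal distance from digitised sagittal-plane marker coordinates. The exact angle conventions (trunk inclination measured from the vertical; an included knee angle between thigh and shank segments) and the example coordinates are assumptions, since they are not defined explicitly in the text.

```python
# A minimal sketch of deriving the two angles and the horizontal distance from
# digitised 2-D (x, y) marker positions in the sagittal plane.
import numpy as np

def torso_angle(c7, trochanter):
    """Forward inclination of the trunk, in degrees from the vertical (assumed convention)."""
    dx, dy = np.subtract(c7, trochanter)
    return np.degrees(np.arctan2(abs(dx), dy))

def knee_angle(trochanter, knee, malleolus):
    """Included angle at the knee between thigh and shank segments (degrees)."""
    thigh = np.subtract(trochanter, knee)
    shank = np.subtract(malleolus, knee)
    cos_a = np.dot(thigh, shank) / (np.linalg.norm(thigh) * np.linalg.norm(shank))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

def horizontal_distance(malleolus, box_centre):
    """Horizontal distance between the subject and the load (same units as input)."""
    return abs(box_centre[0] - malleolus[0])

# Hypothetical marker coordinates (metres) for one video frame
c7, troch, knee, mall, box = (0.35, 1.05), (0.20, 0.55), (0.25, 0.30), (0.32, 0.05), (0.55, 0.10)
print(torso_angle(c7, troch), knee_angle(troch, knee, mall), horizontal_distance(mall, box))
```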
Performance data
The Chaffin Lift Evaluation Record (Chaffin et al, 1986) was used to assess subjects' performance. This provides binary (Y/N) data based on 6 criteria: one- versus two-handed lifting, smoothness of lifting, avoidance of twisting, proximity of load, erectness of trunk, and grip. Subjects scored 1 point for each criterion they complied with; hence the maximum possible performance score was 6. Each lift was categorised as squat, semisquat or stoop. The main criterion used for the categorisation was the flexion of the trunk relative to the vertical: squat, trunk flexed up to 30° from vertical; semisquat, trunk flexed between 30° and 60°; and stoop, trunk flexed beyond 60° from vertical, towards horizontal. Posture was assessed from the first identified frame in which the box began to be lifted from the floor. A trained observer familiar with the criteria for rating the videos viewed each video as often as she wished until satisfied that the lifts had been correctly classified.
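A minimal sketch of the posture categorisation and checklist scoring described above is given below. The band boundaries follow the stated criteria, while the handling of values falling exactly on a boundary is an assumption.

```python
# Posture categorisation from trunk flexion, plus the six-criterion binary score.
def classify_posture(trunk_flexion_deg):
    """Categorise a lift from trunk flexion relative to vertical (degrees)."""
    if trunk_flexion_deg <= 30:
        return "squat"
    elif trunk_flexion_deg <= 60:
        return "semisquat"
    return "stoop"

def chaffin_score(criteria_met):
    """Sum of six binary (yes/no) criteria; the maximum possible score is 6."""
    assert len(criteria_met) == 6
    return sum(bool(c) for c in criteria_met)

# Example: a lift with 72 deg trunk flexion meeting 4 of the 6 criteria
print(classify_posture(72))                                    # -> "stoop"
print(chaffin_score([True, True, False, True, False, True]))   # -> 4
```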
Results
Descriptive statistics were used for angular and distance data and performance scores. The lifting behaviour of each of the eighteen children was consistent between successive lifts.
Squatting was not observed in any of the children, so two categories emerged: semisquat and stoop. Four children (22%) were classified into the semisquat posture and 14 (78%) into the stoop category. The results presented in Table 1 summarise the principal data for each group. As can be seen, there was little difference between subjects who semisquatted and those who stooped in terms of their proximity to the load and their performance scores.

Table 1. Torso and knee angles, horizontal distance and performance scores
Discussion
Squatting, with an almost vertical trunk posture, was not observed in any of the children. Most children (14) adopted a stoop posture, with the trunk flexed toward the horizontal. A minority (4) adopted a semisquat, thereby conforming to a safer lifting posture. Clearly, the present study refutes anecdotal reports of children adopting a squatting posture when lifting. For all of the children, lifting behaviour was consistent between all three lifts, making it seem reasonable to presume that approaches to lifting are well established by 7 years of age.
By giving children specific instructions to lift a standard box, this study aimed to balance the requirement to observe children's natural behaviours with some degree of control over the task requirements. It could be argued that a truly unobtrusive study method would have been preferable, but given that all of the children consistently lifted in the same manner, and as they can be presumed to be ignorant of 'correct' lifting, it seems reasonable to assume that the observed behaviours were representative of the children's 'usual' approaches to this type of lifting task. As none of the children adopted a squatting posture, it seems clear that children as young as 7 may, as a consequence of their lifting behaviours, be at risk of injury. Sheldon (1994) has previously suggested that children should be targeted with training related to manual handling, and this study could be considered to provide some support for that line of
thinking. However, the age group which should be targeted remains open to question. In the present study, the decision to investigate 7-year-olds was an arbitrary one, based on the requirement to investigate children who were old enough to respond reliably to instructions. Yet most of these children adopted the less safe stooping posture when lifting. At face value this might suggest that training should be directed at younger children. A prerequisite to the development of such training is future work to investigate not only the lifting preferences of younger children but also the influence of task variables on lifting posture. Differences in the functional postures adopted by individuals performing the same task have previously been recognised (Ikeda et al, 1991), so it is perhaps unsurprising that some children adopted a semisquat posture. Clearly, the different postures observed may be related to individual differences such as anthropometry or strength. Although further work is required to determine whether there is a relationship between anthropometry and lifting posture, these children may lift in a way which suits their particular individual characteristics. If this is the case, one implication might be that training in manual handling for either adults or children should focus on preserving individual differences in lifting style rather than on the imposition of a 'correct' lifting behaviour. Finally, horizontal distance was remarkably consistent for subjects in each of the two categories, with most children naturally electing to stand close to the load. If this is confirmed in larger-scale studies, it would seem to suggest that the instruction to 'stand close to the load' may in fact be redundant.
References
Chaffin, D.I.B., Gallay, L.S., Woolley, C.B. and Kuciemba, S.R. 1986, An evaluation of the effect of a training program on worker lifting postures, International Journal of Industrial Ergonomics, 127–136
Health and Safety Executive. 1992, Guidance on Manual Handling Regulations, L23. Health and Safety Executive, HMSO
Ikeda, R.E., Schenkman, M.L., Riley, P.O. and Hodge, W.A. 1991, Influence of age on dynamics of rising from a chair, Physical Therapy, 71, 61–69
Pheasant, S.T. 1991, Ergonomics, Work and Health (Macmillan, Edinburgh)
Sheldon, M.R. 1994, Lifting instruction in children in an elementary school, J. Orthop. Sports Phys. Ther., 19, 105–108
Troup, J.D.G. and Edwards, F.C. 1985, Manual Handling and Lifting: an information and literature review with special reference to the back. Health and Safety Executive, HMSO
MANUAL HANDLING AND LIFTING DURING THE LATER STAGES OF PREGNANCY T.Reilly and S.A.Cartwright
Research Institute for Sport and Exercise Sciences Liverpool John Moores University Mountford Building, Byrom Street Liverpool, L3 3AF
Women may have to maintain manual handling/lifting activities, in domestic and/or occupational roles, during pregnancy. In the present study, observations were made at weeks 24–26 and 36–38 of pregnancy, and 12–16 weeks after birth (n=7). On 3 successive days, anthropometry, isometric lift performance and self-selected dynamic lifts (6 lifts/min for 10 min) were assessed, respectively. Ten controls, matched for age and body size, were measured over the same time frame. Performances in the isometric lifts, at knee and at waist height, did not change during later pregnancy. Dynamic lifting performance was unchanged over this same period, although perceived exertion increased. Loads handled by the pregnant subjects were lower than for the control subjects. Performance post-partum improved compared to measurements during the final trimester of pregnancy. Perceived exertion post-partum varied for individual body locations more than for whole-body exertion. It seems that lifting performance is not seriously compromised throughout pregnancy when the load is self-selected, and that isometric endurance in particular is improved post-partum.
Introduction
Women may be obliged during pregnancy to maintain handling/lifting activities, whether in domestic or occupational roles. There is also an increasing commitment among pregnant women to stay at work as long as possible prior to giving birth and to resume physical work soon afterwards. Changes occurring during pregnancy include weight gain, alterations in the centre of mass and body shape, and adaptations in gait. There are also alterations in the ventilatory, cardiovascular, oxygen transport and endocrine systems that contribute to the healthy development of the foetus without compromising maternal requirements.
Physiological and anatomical changes during pregnancy are not necessarily detrimental to physical performance. Previously we have reported that lifting performance (isometric endurance lift; vertical and asymmetric lifts) was not impaired during pregnancy, up to near term (Sinnerton et al., 1993, 1994). Here we report observations made repeatedly pre-term over 3 days, with follow-up post-partum. For the dynamic lifting tasks it was important that the methods involved self-selection of the load, so that the perceived capabilities of the pregnant women were monitored. Consequently, for the pregnant women all of the tests were at submaximal intensity. For both the pregnant and control subjects, the data obtained at the end of the second trimester (weeks 24–26) are used for reference purposes.
Methods
Seven women agreed to participate in this part of a larger study concerned with the lifting performance of pregnant women. The women were aged 32 (±2) years and, on their first visit to the laboratory, weighed 64 (±5) kg at week 13. For this study they were measured pre-term at weeks 24–26 (Stage 4) and weeks 36–38 (Stage 5), and again 12–16 weeks after the birth. Ten women who were matched for age and body size and were not pregnant were measured at the same times to form a control group. The measurements were made on three consecutive days on each of the test occasions. All participants had been familiarised with the procedures. On the first day body mass was recorded and skinfold thicknesses were measured using a skinfold caliper (Harpenden) over biceps, triceps, subscapular and suprailiac sites. An isometric lift (at both knee and waist height) was performed on day 2, and vertical and asymmetric dynamic lifts were performed on day 3. All lifts were performed in the afternoon, at approximately the same time on each occasion. Subjects wore appropriate clothing and were reminded not to eat, drink coffee or smoke prior to the performance of the tests. The procedures were approved by both the University's and the Liverpool Maternity Hospital's Ethics Committees.
Performance of the lifting tasks was as previously described (Sinnerton et al., 1993; 1994). The isometric lifts were measured using the dynamometer validated by Birch et al. (1994). The lifts were performed at both knee and waist height and were adjusted to the anthropometric requirements of each participant. The criterion was the length of time the applied force could be maintained within a strictly defined range. This was determined to be 35–45% of a maximum, previously established using the control group.
The dynamic lifting tasks incorporated adaptations to the psychophysical method of Snook (1978). Participants selected the maximum acceptable load (MAL) they believed themselves able to lift repeatedly for 10 min at a frequency of 6 lifts per minute (down and up). The bags, from which the load was chosen, gave no visual clues as to the amount they weighed. A standard set of instructions was given to each participant. The weight in the tote box could be adjusted at any point throughout the lift. The dynamic lift was performed vertically and the asymmetric lift was through an angle of 90°. The height of the lift was over a fixed distance in both cases, approximately from waist to knee height. Immediately following completion of each task, perceived exertion was rated using Borg's (1982) scale. The participants were asked to rate the task for general (whole-body) and localised (muscular, breathing) effects.
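The endurance-time criterion for the isometric lifts can be illustrated with a short sketch: the time for which a sampled force trace stays within the 35–45% band of a previously established maximum. The trace, the sampling rate and the rule that a trial ends when force first drops below the band are assumptions for illustration, not details taken from the protocol.

```python
# Hypothetical endurance-time calculation for a force held within a 35-45% band.
import numpy as np

def endurance_time(force_n, max_force_n, fs_hz, band=(0.35, 0.45)):
    """Seconds from first entering the band until force first drops below it (assumed rule)."""
    lo, hi = band[0] * max_force_n, band[1] * max_force_n
    in_band = (force_n >= lo) & (force_n <= hi)
    if not in_band.any():
        return 0.0
    start = np.argmax(in_band)                  # first sample within the band
    below = np.nonzero(force_n[start:] < lo)[0]
    end = start + below[0] if below.size else len(force_n)
    return (end - start) / fs_hz

# Hypothetical 10 Hz trace: ramps up, holds near 40% of a 1000 N maximum, then fades
trace = np.concatenate([np.linspace(0, 400, 20), np.full(300, 400.0), np.linspace(400, 200, 50)])
print(f"Endurance time: {endurance_time(trace, 1000.0, 10):.1f} s")
```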
Results
Body mass, measured during the post-partum stage, was significantly less than the body mass measured in the later stages of pregnancy. Mean body mass showed a decrease in the post-partum stage (67.0±8.1 kg); this decrease was significant between the post-partum stage and some of the stages of pregnancy, namely Stage 4 (74.7±6.7 kg; P<0.05) and Stage 5 (77.0±6.5 kg; P<0.01). When the post-partum body mass was compared with the body mass of the control group at all stages, the results were similar. The total skinfolds measured in the post-partum stage were compared with the total skinfolds during pregnancy; the results were not significantly different (P>0.05). When the total skinfolds of the post-partum group were compared with those of the control group, no notable differences were found.
Performances in the isometric lifts, at knee and at waist height, did not change during the later stages of pregnancy. Generally, the performances were more variable in the pregnant subjects compared to the reference group of non-pregnant women. The endurance times at knee height were marginally greater in the control subjects, whereas the pregnant subjects were slightly better in endurance performance at waist height. None of these differences reached statistical significance (P>0.05). For the pregnant group, the duration at waist height was greater than at knee height (P<0.01), but this difference was also observed in the control subjects.
The performance on the dynamic lifting tasks was unchanged over this same period, although there was an increase in the rating of perceived exertion. This applied to both the vertical and the asymmetric lifts. Loads handled by the pregnant subjects were lower (P<0.01) than those employed by the control subjects. Performance post-partum improved in the isometric lift at waist height and in the dynamic lifts compared to the measurements obtained during the later stages (final trimester) of pregnancy. Perceived exertion post-partum varied in all tests for individual body locations more than for whole-body exertion.

Table 1. Endurance time for isometric lifting at knee height and waist height, and maximal acceptable lift (MAL) for pregnant and control groups
The post-partum results were similar for the two dynamic lifts (P>0.05). Nevertheless, the performances by the mothers did not reach the values attained by the non-pregnant control participants (P<0.05).
Perception of whole-body exertion (RPE) during the isometric lift at waist height post-partum did not differ from that during pregnancy (P>0.05). There was a decrease in the RPE values for the back from 10±3 at Stage 4 to 8±2 post-partum. The decrease for the lower legs from Stage 5 (11±3) to post-partum (8±2) was also significant (P<0.05). There were no significant differences between whole-body RPE post-partum and the values reported by the control group. Similarly, for the dynamic lifts there were no changes between the later stages of pregnancy and post-partum for whole-body ratings of exertion. Ratings for the back were higher at Stage 4 (12±3), and ratings for the abdominals were higher (P<0.05) at Stage 5 (12±3), than post-partum (9±3). For the asymmetric dynamic lift, the main difference was a decrease in ratings for the abdominals from 11±3 at Stage 5 to 9±3 post-partum (P<0.05).
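The within-subject contrasts reported in this section (for example, body mass at Stage 5 versus post-partum) are of the kind normally tested with a paired comparison. The sketch below uses a paired t-test on hypothetical individual values chosen only to be consistent with the reported group means; the actual analysis and data are not reproduced here.

```python
# Hypothetical paired comparison of body mass at Stage 5 versus post-partum (n = 7).
from scipy.stats import ttest_rel

stage5_kg     = [70.3, 85.1, 74.6, 79.9, 82.4, 71.8, 74.9]   # hypothetical individual values
postpartum_kg = [61.2, 79.4, 63.5, 70.2, 72.8, 60.9, 61.0]

t, p = ttest_rel(stage5_kg, postpartum_kg)
print(f"t = {t:.2f}, p = {p:.4f}")   # p < 0.01 would match the reported contrast
```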
Discussion
The maximum acceptable load (MAL) results during pregnancy, for both the vertical and asymmetric dynamic lifts, did not change as pregnancy progressed. This implied that dynamic lifting performance could be maintained throughout the course of pregnancy. Although there were no visual clues as to the amount the women were lifting, each subject was evidently able to select a suitable amount at each stage. However, in contrast to the findings for the isometric endurance lifts, the results showed that the pregnant group, at all stages, lifted substantially less than the control group. This is in agreement with the findings of Masten and Smith (1988), who suggested that non-pregnant women are significantly stronger than pregnant women. One might at first try to explain this difference by the increase in body mass and the change in shape which occurs as pregnancy progresses, but this explanation would not be as plausible when applied to the earlier stages of pregnancy.
The psychophysical methodology chosen means it is difficult to state that the pregnant women were not as 'capable'; they may well have been capable, but chose not to lift as much. The purpose of using the psychophysical methodology was to enable the women to exercise an element of subjective judgement. This is crucial throughout pregnancy, when so many changes are occurring to the woman's body. Therefore, when the pregnant women were asked to select a load for the frequency and time period specified, they chose the 'maximum' amount with which they felt comfortable. Nicholson and Legg (1986) noted that when subjects were asked to produce the maximum weight acceptable for a given time, the load subjectively corresponded to "fairly light" on the Borg scale, highlighting the "acceptable" notion of the methodology. It may have been the case that the women underestimated this amount in order to 'be on the safe side'. Also, for the very reason that MAL was used, that is, to include subjective judgement, it follows that it tends to be influenced or limited by psychological factors. Ayoub et al. (1980) considered that the psychophysical methodology was concerned with the relationship between sensations and their physical stimuli. Previous research has also found individual interpretation to be a disadvantage (e.g. Snook, 1985; Mital, 1983). Other factors may influence the amount lifted, such as motivation and previous experience of lifting tasks. The fact remains that the pregnant women were consistent in their estimations and maintained their dynamic lifting performance throughout the course of pregnancy.
The results in both groups of women suggest that there is no difference between the performance of the vertical and asymmetric lifts. This is in contrast to previous research
which has shown that the amount lifted vertically tends to be greater than that lifted asymmetrically (Garg and Badger, 1986). The lack of differences between the dynamic lifts in this report may again be, in part, inherent in the psychophysical methodology (however, it must be noted that the 1 RM results did not show variations between vertical and asymmetric lifts). With the control group the amount lifted as a percentage of the 1 RM was at least 50%. This figure varied, often influenced by the amount the subjects felt they could lift as a maximum. These figures did not change significantly over the entire period of testing, suggesting that if subjects did tend to under- or over-estimate the amount they were to lift, they did so consistently over a period of nearly 12 months.
It seems that lifting performance is not compromised appreciably throughout pregnancy when the load is self-selected. In this study subjective exertion increased whilst lifting performance was maintained. Self-chosen workload was unaffected by physiological changes as the pregnant women neared term, and performance improved following the birth. Further research is envisaged to establish the behavioural adaptations that occur during pregnancy which permit the maintenance of manual handling performance, and the factors underlying the improvement post-partum.
Acknowledgements
This work was supported by a grant from the Health and Safety Executive.
References
Ayoub, M.M., Mital, A., Bakken, G.M., Asfour, S.S. and Bethea, N.J. 1980, Development of strength and capacity norms for manual handling activities: the state of the art. Human Factors, 22, 271–283
Birch, K., Sinnerton, S., Reilly, T. and Lees, A. 1994, The relation between isometric lifting strength and muscular fitness measures. Ergonomics, 37, 87–93
Borg, G. 1982, Psychophysical bases of perceived exertion. Medicine and Science in Sports and Exercise, 14, 377–381
Garg, A. and Badger, D. 1986, Maximum acceptable weights and maximum voluntary isometric strength for asymmetric lifting. Ergonomics, 29, 879–892
Masten, M.Y. and Smith, J.L. 1988, Reaction time and strength in pregnant and non-pregnant employed women. J. Occ. Med., 30, 451–456
Mital, A. 1983, The psychophysical approach to manual lifting: a verification study. Human Factors, 25, 485–491
Nicholson, L.M. and Legg, S.J. 1986, A psychophysical study of the effects of load and frequency upon selection of workload in repetitive lifting. Ergonomics, 29, 903–911
Sinnerton, S., Birch, K., Reilly, T. and McFadyen, I.M. 1993, Weight gain and lifting during pregnancy. In E.J. Lovesey (ed.) Contemporary Ergonomics (Taylor and Francis, London), 305–307
Sinnerton, S., Birch, K., Reilly, T. and McFadyen, I.M. 1994, Lifting tasks, perceived exertion and physical activity levels: their relationship during pregnancy. In S.A. Robertson (ed.) Contemporary Ergonomics (Taylor and Francis, London), 101–105
Snook, S.H. 1978, The design of manual handling tasks. Ergonomics, 21, 963–985
Snook, S.H. 1985, Psychophysical considerations in permissible loads. Ergonomics, 28, 327–330
POSTURE ANALYSIS AND MANUAL HANDLING IN NURSERY PROFESSIONALS Joanne O.Crawford and Rhonda M.Lane
Industrial Ergonomics Group School of Manufacturing and Mechanical Engineering University of Birmingham Edgbaston Birmingham B15 2TT
The incidence of back pain in nurses and others involved in handling patients is well documented. However, this is not the case for nursery professionals. This study aimed to investigate the prevalence of back pain and poor posture. A modified version of the Nordic musculoskeletal questionnaire was administered, working postures were observed using OWAS, and participants rated selected work tasks using the Borg RPE scale. The results indicated that the prevalence of low back pain is similar to that found in nurses. Participants also experienced pain in other body sites. The main activities contributing to poor postures were play activities and meal supervision. Recommendations include a formal risk assessment of this environment and education for staff to increase awareness when handling children.
Introduction
This study examines the incidence of back pain, body discomfort and working postures in nursery carers. There is an absence of direct research material in this area; however, much of the research relating to nursing professionals and manual handling is relevant (Stubbs et al, 1983; Baty and Stubbs, 1987; Pheasant and Stubbs, 1992). Corlett et al (1993) found that caring for children can cause a high level of postural stress, and that lifting children from the floor is likely to cause and aggravate low back pain. Recommendations were made for those working with children, including checking the height from which the child is lifted, ensuring the side of the cot is down, kneeling or squatting to the child's level and carrying the child close to the trunk (Corlett et al, 1993).
The impetus for this study was the perception that nursery carers and carers of older children with disabilities were at an increased risk of injuries and postural problems. Babies and young children tend to occupy low positions on the floor; thus, in order to lift them, nursery carers have to lift from below knee level. Although babies and young children are perceived as small and lightweight, the frequency with which they are lifted is a further
risk factor. The aim of the study was to examine the prevalence of back pain, other areas of body discomfort and the postures adopted by nursery carers at work.
Method

Participants
Twelve female participants working in two nurseries took part in the study. Eight were observed in nursery one, five in nursery two… The age of the staff ranged from 18 to 47 years and the length of time working in the nursery environment ranged from 6 months to 21 years. Six of the participants worked with children aged 6 weeks to 18 months; the remainder worked with children from 18 months to 5 years. In nursery one, staff worked over a 10-hour period with a one-hour break during the day. In nursery two, staff worked for an 8-hour period with a one-hour break. The ratio of staff to children was one staff member to three children under 18 months and one staff member to five children over 18 months.
Back Pain and Postural Analysis
The questionnaire used to assess discomfort was a modified version of the Nordic Musculoskeletal Questionnaire (Kuorinka et al, 1987). The modifications to the questionnaire included an additional section on perceived fitness levels and further biographical details. The questionnaire was administered to participants in a structured interview format. The Ovako Working Posture Analysis System (OWAS) was used to identify and classify working postures (Karhu et al, 1977). Participants were observed for 45 minutes each at intervals of 30 seconds. This allowed 90 observations to be collected per participant. Although it is recommended that data collection be carried out using a video camera, this was not permitted in the nursery environment. The weights of the children were estimated using growth tables provided by Samtrock (1994).
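The work-sampling scheme described above (one posture code every 30 seconds for 45 minutes, giving 90 observations per participant) can be summarised as in the sketch below; the action-category codes are hypothetical.

```python
# Hypothetical tally of OWAS action categories from time-sampled observations.
from collections import Counter

OBS_INTERVAL_S = 30
SESSION_MIN = 45
n_obs = SESSION_MIN * 60 // OBS_INTERVAL_S   # = 90 observations per participant

# Hypothetical action-category codes (1-4) for one participant's 90 samples
action_codes = [1] * 54 + [2] * 25 + [3] * 9 + [4] * 2

counts = Counter(action_codes)
for category in sorted(counts):
    share = 100 * counts[category] / len(action_codes)
    print(f"OWAS action category {category}: {share:.1f}% of observation time")
```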
Task Evaluation
Nursery staff were asked to rate the main tasks they performed using the Borg scale (Borg, 1985).
Results

Back Pain
The point prevalence of low back pain was 4 of the sample (33.3%) and the 12-month period prevalence was 6 of the sample (50%). During the previous 12 months, 16.7% of the sample reported low back pain lasting between one and seven days, and 8.33% experienced pain on a daily basis. Table 1 shows the point and 12-month prevalence of pain in various body areas.
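For reference, the prevalence figures quoted above follow directly from yes/no questionnaire responses, as in the sketch below; the response lists are hypothetical and chosen only to reproduce the quoted proportions.

```python
# Hypothetical yes/no responses (1 = pain reported) for the twelve participants.
pain_now      = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # low back pain at the time of the survey
pain_last_12m = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0]   # low back pain in the previous 12 months

point_prevalence  = 100 * sum(pain_now) / len(pain_now)
period_prevalence = 100 * sum(pain_last_12m) / len(pain_last_12m)
print(f"Point prevalence: {point_prevalence:.1f}%")      # 33.3%
print(f"12-month prevalence: {period_prevalence:.1f}%")  # 50.0%
```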
Posture Analysis
The OWAS observations resulted in 1170 observations being collected. These were analysed using MINITAB. The data were divided into two groups to represent carers of babies (under 18
months) and toddlers (over 18 months). The percentage of time spent on each particular work activity is shown in Figure 1.

Table 1. Point and 12-month prevalence of pain
Figure 1. Percentage of time spent on work activities
There were significant differences between the two rooms (p<0.001). In the baby room, 38.9% of the time was spent carrying out activities with the children. In the toddler room, more time was spent supervising meals (27.6% compared with 19.8%). The distribution of OWAS action levels is shown in Table 2. As can be seen from the table, the carers observed spent at least 40% of the observation time in postures which require action to be taken. The main activities contributing to the poor back and neck postures were play activities in the baby room and meal supervision in the toddler room.
Task Evaluation
Using the Borg scale, participants rated various work tasks. Nappy changing was rated as 15 (hard, heavy); lifting babies and children, getting toy boxes out and getting nappy baskets out were rated as 13 (somewhat hard); and play activities were rated as 9 (very light).
Other Observations
During the observation periods at the nursery a number of work practices were noticed. These included the use of children's chairs by staff, the use of travel cots (at floor height, with sides that do not drop down), the moving of heavy equipment, e.g. sandboxes (25 kg), without
knowledge of the weight, and the high storage of clothing and nappy baskets (176 cm from the floor).

Table 2. Distribution of OWAS action levels during the observation time
Discussion
The prevalence of back pain was similar to that found by Stubbs et al (1983). Fifty percent of the sample experienced pain in multiple sites, which implies that although the load on the back can be reduced, the pain may simply be shifted to other body sites. As found from the OWAS analysis, participants were spending approximately 40% of their time in action categories 2 to 4, indicating that action should be taken in the near future to improve posture and, for some tasks, immediately. However, one of the difficulties of working in a childcare environment is that infants who are not capable of standing must be lifted from floor or cot height. In a childcare environment there is also the difficulty of maintaining an upright posture, as there is a need to get down to the child's level to interact. Lifting recommendations for dealing with children were suggested by Corlett et al (1993); however, it must be questioned whether they take account of the repetitive nature of the nursery carer's job.
Most of the equipment in the nurseries studied is designed to fit children under 5 years old. Nursery carers were using chairs designed for children, and this practice should be avoided, as prolonged sitting in a poorly fitting chair without back support can increase the risk of back pain (Luopajarvi, 1990). Lifting babies onto changing areas and into high chairs was rated as somewhat hard using the Borg RPE scale. Although the use of raised work areas and high chairs does reduce the need to stoop, the actual lifting of the children to those areas does represent a high risk. The method used to lift children involves holding the child away from the body to place them in high chairs. This perhaps reflects a need to examine both the design of high chairs and how children are placed in them. The use of removable trays on the high chair is also recommended.
The use of travel cots in the nursery should be avoided. Travel cots are designed to be portable and for short-term use. Their use as a permanent tool in the nursery means that carers have to place children into them, reaching over the side of the cot, down to 10 cm above the floor. This risk factor can be avoided by using cots with drop sides that are higher from the floor. The storage of children's belongings at a higher level is necessary in the nursery environment. However, the present height of 176 cm from the floor is considered unnecessarily high and involves carers reaching above shoulder height. It would be
recommended that this height be reduced to allow the carers to reach but prevent the children from doing so.
Recommendations
The nature of the nursery carer's job is such that there will always be handling of infants and bending to the child's level to interact with them. There are a number of recommendations that can be made as a result of this study, although the numbers were small. A full risk assessment should be made of the tasks that nursery professionals carry out. This information could then be fed back to the carers to increase their knowledge regarding the weights of children and equipment and how they could change the tasks to reduce the handling risk. There is also a clear need for nursery staff to be educated in the risks involved. The workplace also needs to be improved by providing adult-sized equipment for staff when carrying out administration work or taking rest breaks.
Further Research
Recommended further research in this area would be to examine the manual handling training given to nursery carers and whether this is adequate for the work tasks that they do. There is also a clear need for ergonomics information to be provided to nursery managers regarding the purchase and use of equipment such as cots and high chairs.
References
Baty, D. and Stubbs, D.A., 1987, Postural stress in geriatric nursing. Journal of Nursing Studies, 24, 339–344
Borg, G., 1985, An Introduction to Borg's RPE-scale. Movement Publications, Ithaca, New York
Corlett, E.N., Lloyd, P.V., Tarling, C., Troup, J.D.G. and Wright, B., 1993, The Guide to the Manual Handling of Patients. National Back Pain Association, London
Karhu, O., Harkonen, R., Sorvali, P. and Vepsalainen, P., 1977, Observing work postures in industry: examples of OWAS application. Applied Ergonomics, 12, 13–17
Kuorinka, I., Jonnson, B., Kilbom, A., Vinterberg, H., Biering-Sorensen, F., Anderson, G. and Jorgensen, K., 1987, Standardised Nordic questionnaire for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18, 233–237
Luopajarvi, T., 1990, Ergonomics analysis of workplace and postural load. In M.I. Bullock (ed.) Ergonomics: the physiotherapist in the workplace. (Churchill Livingstone, Edinburgh)
Pheasant, S. and Stubbs, D.A., 1992, Back pain in nurses. Applied Ergonomics, 23, 226–233
Samtrock, J.W., 1994, Child Development. (Brown and Benchmark, Madison USA)
Stubbs, D.A., Buckle, P.W., Hudson, M.P. and Rivers, P.M., 1983, Back pain in the nursing profession II. Ergonomics, 26, 767–779
POSTURE
CAN ORTHOTICS PLAY A BENEFICIAL ROLE DURING LOADED AND UNLOADED WALKING? David C.Tilbury-Davis1, Robin H.Hooper1, Mike G.A.Llewellyn2
1 Human Sciences Dept., Loughborough University, Loughborough, LEICS. LE11 3TU
2 Protection & Performance, Centre for Human Sciences, Defence Evaluation and Research Agency, Farnborough, HANTS. GU14 6TD
Increased knee flexion occurs post heel contact whilst carrying a heavy load. To establish the influence of knee orthotics that inhibit anterio-tibial displacement on the changes induced by load carriage, ten military subjects were assessed under four conditions (unloaded, 20 kg load, 40 kg load and 40 kg load + orthotics). Ankle and knee flexion/extension angular displacements and velocities were derived. Ground reaction force data and peak force-time parameters were derived. Force data were expressed as a percentage of body weight. Significant differences were found in propulsive impulse, work and power, as well as in vertical impulse, work and power; these were increased by the knee orthotics. Knee flexion during load carriage was not reduced by the orthotics (p>0.05). No benefit of orthotics for carrying heavier loads was demonstrated, while there was an increase in physiological cost.
Introduction
The physiological effects of load carriage have been well documented and reviewed (Knapik et al, 1996). The electromyographic activity of certain muscles has also been documented (Bobet et al, 1984; Ghori et al, 1985; Holewijn, 1990; Harman et al, 1992). Fewer studies have looked at the kinematic and kinetic effects of load carriage. Increased knee flexion occurs post heel contact during load carriage (Kinoshita, 1985). Stance phase duration is unchanged by load carriage of up to 50% of body weight, but swing phase duration is decreased (Ghori et al, 1985; Kinoshita, 1985; Martin et al, 1986). Ground reaction force, braking force, propulsive force and lateral force are all increased by increasing load (Kinoshita, 1985; Harman et al, 1992), although the change in vertical force is not proportional to the load carried (Harman et al, 1992). The effects on medial force are inconclusive (Kinoshita, 1985; Harman et al, 1992). Hip flexion and knee flexion post heel contact are increased by load carriage, along with anterio-posterior rotation of the foot about the distal ends of the metatarsal bones (Kinoshita, 1985; Martin et al, 1986). During fixed-speed walking the stride length may shorten as load mass is increased (Kinoshita, 1985; Martin et al, 1986; Harman et al, 1992).
The use of orthotics is widespread in sport and clinical rehabilitation. Increased stability in lax joints has been shown with the use of knee orthotics and/or prophylactic taping (Andersen et al, 1992). However, these interventions have been shown to affect lower-extremity motion and vastus lateralis electromyographic activity (Cerny et al, 1990; Osternig et al, 1993), as well as increasing intramuscular pressure (Styf et al, 1992). Knee orthotics may also cause greater extensor torques at the hip and ankle, with more work produced at the hip and less at the knee (DeVita et al, 1996). Use of knee orthotics designed for anterior cruciate ligament injuries, which inhibit anterio-tibial displacement or 'giving' of the knee joint, might provide support to offset this flexion, giving a more upright posture during load carriage. Would such orthotics therefore assist when carrying loads by attenuating the increase in knee flexion?
Methods

Subjects
Data were collected from seven healthy males for the unloaded assessment of the knee orthotics (mean age 22.7±3.9 years; mean stature and mass 1.79±0.06 m and 75.96±7.04 kg respectively). Data for the loaded conditions were collected from ten further healthy males (mean age 24.5±3.5 yrs, mean stature 1.77±0.07 m, mean mass 78.03±6.66 kg), drawn from serving military personnel whose job involved load carriage, because task experience has been shown to influence task performance (Littlepage et al, 1997; Vasta et al, 1997).
Apparatus
The knee orthotics (Masterbrace™ 3, Johnson & Johnson Orthopaedics) weighed 3.6–3.9 kg per pair. A Kistler force plate (model 9281B) was used to record ground reaction force parameters. Functional landmarks were identified with 5 markers (5th metatarsal head, lateral malleolus, 0.15 m proximal to the lateral malleolus, 0.15 m distal to the glenohumeral condyle and at the glenohumeral condyle). The movements of the ipsilateral limb were filmed in the sagittal plane (Panasonic F15 camera with lens WV-LZ14/8AFE). Footage was recorded (Panasonic AG7330). A Peak 5 (Peak Performance Technologies Inc.) video motion analysis system was used to analyse marker motion. Force data were synchronised with the kinematic data using a threshold trigger setting of 0.1904 V (for all subjects and weights) and a Peak Performance Event Synchronisation box.
Procedures
Following ethical clearance (DERA Ethics Committee) and informed consent from subjects, height, weight and leg length were measured. The subjects were familiarised with the procedures and the orthotics. A start point at least 5 paces from the force plate was established, and subjects walked over the plate at a self-selected velocity, continuing for a further 5 paces past the force plate. When the subjects were ready they were asked to walk towards the plate whilst focusing on a point in the distance. This was repeated until 5 clean contacts (foot landing mid-plate) with the force plate had occurred and the data had been recorded. For the unloaded study all
subjects completed 10 right-foot contacts (5 unbraced, 5 braced) and 10 left-foot contacts (5 braced, 5 unbraced). When carrying loads, only the right limb was studied. The starting condition for each subject was randomised, the sequence being unloaded, 20 kg, 40 kg and 40 kg with orthotics.
Data Analysis
After 2-D reconstruction of the raw kinematic data (filtered using a Butterworth optimal filter), the ankle and knee flexion/extension angular data were derived, along with their respective angular velocities. Ground reaction force data were derived from the force plate, along with peak force-time parameters. The braking/propulsion impulse ratio was also calculated to assess the constancy of walking velocity (Hamill et al, 1995). All data were expressed as a percentage of contact time to normalise the individual time phases, and force data were expressed as a percentage of body weight. Peak parameters were statistically analysed using a t-test for comparing dependent samples in a repeated-measures experiment, corrected for correlated samples to overcome inter-subject leg length variation. The intra-subject coefficients of variation (CoV) were calculated for peak force parameters (Winter, 1984). The curves were plotted showing the mean ± 95% confidence intervals; CoVs for the ground reaction force, knee flexion and ankle flexion curves were also calculated (Winter, 1984). A difference between control and trial was accepted where the trial curve lay outside the 95% confidence intervals of the control.
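Part of the force-plate processing described above, normalising the anterio-posterior force to body weight and integrating it into braking and propulsive impulses and their ratio, can be sketched as follows. The force trace, sampling rate and sign convention (negative values taken as braking) are assumptions for illustration.

```python
# Hypothetical braking/propulsion impulse calculation from an anterio-posterior force trace.
import numpy as np

def braking_propulsion(fy_n, body_mass_kg, fs_hz):
    """Return braking impulse, propulsive impulse (%BW*s) and their ratio."""
    bw = body_mass_kg * 9.81
    fy_pct = 100.0 * np.asarray(fy_n) / bw               # force as % body weight
    dt = 1.0 / fs_hz
    braking = np.trapz(np.clip(fy_pct, None, 0.0), dx=dt)      # area of negative (braking) phase
    propulsion = np.trapz(np.clip(fy_pct, 0.0, None), dx=dt)   # area of positive (propulsive) phase
    return braking, propulsion, abs(braking) / propulsion

# Hypothetical anterio-posterior force trace for one stance phase (N, sampled at 100 Hz)
t = np.linspace(0.0, 0.6, 61)
fy = -150.0 * np.sin(np.pi * t / 0.3) * (t < 0.3) + 150.0 * np.sin(np.pi * (t - 0.3) / 0.3) * (t >= 0.3)
brake, prop, ratio = braking_propulsion(fy, 78.0, 100.0)
print(f"braking = {brake:.2f} %BW*s, propulsion = {prop:.2f} %BW*s, ratio = {ratio:.2f}")
```

A ratio close to 1 would indicate that the braking and propulsive phases balance, i.e. the walking velocity was approximately constant over the contact, which is the purpose of the check attributed to Hamill et al (1995).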
Results
The orthotics made no significant difference to any angular velocities or displacements during unloaded gait, nor to the ground reaction forces. The latter are in agreement with those reported by Chao et al (1983). When comparing the unloaded conditions (braced and unbraced), maximum medial force, mediolateral power and mediolateral work in the non-dominant (N-D) leg were significantly different (Table 1, p<0.01).

Table 1. Unloaded
During loaded gait, medio-lateral, anterio-posterior and vertical work and power were significantly increased (p<0.01) by wearing knee orthotics. Wearing orthotics increased medial force and decreased lateral force (p<0.05). Vertical and propulsive impulses were increased, along with loading rate and positive torque (p<0.01). Braking impulse and vertical thrust were significantly reduced (p<0.01), as was maximum propulsive force (p<0.05). When analysing the curves, mean intra-subject CoVs for the 40 kg and 40 kg-with-orthotics conditions were low for knee flexion (16%, 20%) and ankle flexion (20%, 24%), as well as for ground reaction force (21%, 27%). During load carriage no kinematic differences were caused by the orthotics (p>0.05); specifically, knee flexion was not reduced by the orthotics
when carrying loads (peak mean knee flexion braced: 22.5°; unbraced: 25.0°), the maximum flexion range of the orthotics not being reached.
Discussion

Unloaded
Our data show that wearing knee orthotics has no adverse effect upon the gait of healthy adults. This is different from the effects of a knee-ankle-foot orthosis (Cerny et al, 1990). It seems that adverse changes in gait caused by orthotics occur only when ankle flexion is restricted. The small medio-lateral changes observed in the non-dominant limb are consistent with a greater rigidity at the knee, which assists in maintaining balance. This lends support to the theory that the non-dominant limb acts as the control limb for medio-lateral balance and the dominant limb is primarily propulsive in nature (Matsusaka, 1985; Sadeghi et al, 1997).
Loaded
In our data the 40 kg load ranged from 47% to 64% of subject body weight. Although load carriage of up to as much as 64% of the subject's body weight does not grossly affect sagittal-plane gait motion, it may cause significant increases in the moments occurring at the hip, knee and ankle, as well as increasing the impact and mediolateral balance forces. In an attempt to attenuate these changes, knee orthotics were used as an intervention. Our data suggest that wearing knee orthotics during load carriage does not reduce knee flexion and may cause an increase in the physiological cost of load carriage. The significant differences in the mediolateral factors suggest some asymmetry, but the high mean intra-subject CoVs mask any systematic differences. The reduction in propulsive force and vertical thrust may relate to a possible increase in metatarsal flexion at toe-off, as suggested by Kinoshita (1985) and Martin et al (1986), and is supported by the increase in positive torque and in vertical and propulsive impulses. The highly significant differences in propulsive and thrust forces may be due to the increased inertia of the lower limbs, in agreement with Martin (1985) and DeVita et al (1996). In conclusion, there was no benefit in wearing knee orthotics when carrying loads. Further work with orthotics in 3-D, where the load carried is normalised as a percentage of body weight and lower limb moments are quantified, may show clearer but modest benefits.
Acknowledgements
The Defence Clothing and Textiles Agency, Science and Technology Division supported this work. We would also like to thank Rene Nevola and Andrew Brammell for their assistance with data collection and preparation.
References
Anderson, K., Wojtys, E.M., Loubert, P.V. and Miller, R.E. 1992, A biomechanical evaluation of taping and bracing in reducing knee joint translation and rotation, The American Journal of Sports Medicine, 20, 416–421
Bobet, J. and Norman, R.W. 1984, Effects of load placement on back muscle activity in load carriage, European Journal of Applied Physiology, 53, 71–75
Cerny, K., Perry, J. and Walker, J.M. 1990, Effect of an unrestricted knee-ankle-foot orthosis on the stance phase of gait in healthy persons, Orthopaedics, 13, 1121–1127
Chao, E.Y., Laughman, R.K., Schneider, E. and Stauffer, R.N. 1983, Normative data of knee joint motion and ground reaction forces in adult level walking, Journal of Biomechanics, 16, 219–233
DeVita, P., Torry, M., Glover, K.L. and Speroni, D.L. 1996, A functional knee brace alters joint torque and power patterns during walking and running, Journal of Biomechanics, 29, 583–588
Ghori, G.M.U. and Luckwill, R.G. 1985, Responses of the lower limb to load carrying in walking man, European Journal of Applied Physiology, 54, 145–150
Hamill, J. and Knutzen, K.M. 1995, Types of mechanical analysis. In Stead, L. (ed.) Biomechanical Basis of Human Movement, (Williams & Wilkins, Media, USA), 458–489
Harman, E., Man, K.H., Frykman, P., Johnson, M., Russell, F. and Rosenstein, M. 1992, The effects on gait timing, kinetics, and muscle activity of various loads carried on the back, Medicine and Science in Sports and Exercise, 24, S129
Holewijn, M. 1990, Physiological strain due to load carrying, European Journal of Applied Physiology, 61, 237–245
Kinoshita, H. 1985, Effect of different loads and carrying systems on selected biomechanical parameters describing walking gait, Ergonomics, 28, 1347–1362
Knapik, J., Harman, E. and Reynolds, K. 1996, Load carriage using packs: a review of physiological, biomechanical and medical aspects, Applied Ergonomics, 27, 207–215
Littlepage, G., Robinson, W. and Reddington, K. 1997, Effects of task experience and group experience on group performance, member ability, and recognition of expertise, Organisational Behaviour and Human Decision Processes, 69, 133
Martin, P.E. 1985, Mechanical and physiological responses to lower extremity loading during running, Medicine and Science in Sports and Exercise, 17, 427–433
Martin, P.E. and Nelson, R.C. 1986, The effect of carried loads on the walking patterns of men and women, Ergonomics, 29, 1191–1202
Matsusaka, N. 1985, Relationship between right and left legs in human gait, from a viewpoint of balance control. In Biomechanics IX-A, (Champaign, Illinois, USA), 427–430
Osternig, L.R. and Robertson, R.N. 1993, Effects of prophylactic knee bracing on lower extremity joint position and muscle activation during running, The American Journal of Sports Medicine, 21, 733–737
Sadeghi, H., Allard, P. and Duhaime, M. 1997, Functional asymmetry in able-bodied subjects, Human Movement Science, 16, 243–258
Styf, J.R., Nakhostine, M. and Gershuni, D.H. 1992, Functional knee braces increase intramuscular pressures in the anterior compartment of the leg, The American Journal of Sports Medicine, 20, 46–49
Vasta, R., Rosenberg, D., Knott, J.A. and Gaze, C.E. 1997, Experience and the water-level task revisited: does expertise exact a price?, Psychological Science, 8, 336–339
Winter, D.A. 1984, Kinematic and kinetic patterns in human gait: variability and compensating effects, Human Movement Science, 3, 51–76
INVESTIGATION OF SPINAL CURVATURE WHILE CHANGING ONE’S POSTURE DURING SITTING Frederick S.Faiks* and Steven M.Reinecke**
*Steelcase Inc. Grand Rapids MI 49501 USA **Univ. of Vermont, Vermont Back Research Center, Burlington VT 05405 USA
As sedentary, static work postures have become increasingly prevalent in our workplaces, musculoskeletal problems, in particular low back pain and discomfort, have also increased. Researchers agree on the importance of changing one's posture while providing adequate back support. This study provides the basis for developing a backrest that accommodates natural human motion. Kinematic motion of twenty subjects was recorded in a seated position. While moving from flexion to extension, thoracic kyphosis increases and lumbar lordosis increases. Thoracic curvature changed uniformly through the full range of motion (80°–115°). Lumbar curvature changed only as the thigh-torso angle exceeded 95°. The path and rate of curvature of the lumbar spine (L3) is independent of the path and rate of curvature of the thoracic spine (T6), and is a function of the complex combined motion of pelvic rotation and variations in spinal curvature. These findings suggest that a backrest should provide independent lumbar and thoracic support to ensure that the backrest continues to support one's posture while promoting natural patterns of motion of the spine.
Introduction Sitting is the most frequently assumed working posture; approximately 75 percent of the workforce has sedentary jobs. However, prolonged static sitting is frequently accompanied by discomfort and musculoskeletal complications that result from sustained immobility (Hult, 1954; Eklund, 1967; Magora, 1972; Kelsey, 1975; Lawrence, 1977). Reinecke et al. (1985) showed a correlation between static seated postures and back discomfort, concluding that individuals are better able to sit for prolonged periods when they can change their posture throughout the day. Several researchers have evaluated the physiologic effects of changing one’s posture or, more directly, of spinal motion. Holm and Nachemson (1983) investigated the effects of various types of spinal motion on metabolic parameters of canine intervertebral discs. They suggest that the flow of nutrient-rich fluids to and from the intervertebral discs increases with spinal movement. Adams (1983) also found that alternating periods of activity and rest, thereby introducing postural change, further boosts the fluid exchange, helping to nourish the discs. Grandjean (1980) is another who maintains that alternately loading and unloading the spine (through movement) is ergonomically beneficial, because the process pumps fluid in and out of the disc, thereby improving nutritional supply.
Chaffin and Andersson (1984) have reported that the two most important considerations in seating are adequate back support and allowance for movement or postural change. Good seating should allow a worker to maintain a relaxed, but supported, posture and should allow for freedom of active motion over the course of the day. Kroemer (1994) noted that a backrest should allow for stimulation of the back and trunk muscles by moving through, and holding the back in, various postures. While freedom of movement is beneficial, sustained activation of the trunk muscles also generates spinal compression; a backrest can support the trunk and serve as a secondary support mechanism, thereby reducing the necessary muscle forces and reducing the compressive loading of the spinal column. In summary, active movement and postural changes are inevitable, and in fact desirable, throughout the day. Schoberth (1962) recommends changing postures around a relaxed, upright, seated posture to minimize muscular activity and the static muscular load needed for sitting. Most researchers agree that motion should be incorporated in seating while the body is being supported in different postures. Little information is available on spinal curvature and pelvic rotation while a person is moving in a seat. The objective of this study is to describe the kinematic movement of the upper trunk and to use this information to aid designers in developing a backrest that actively accommodates natural human motion in a relaxed and unrestricted manner. The resulting backrest system should support the body continuously, throughout the entire range of motion, but should not constrain natural movement. A backrest that naturally moves with a person while continuously providing support would allow the sitter to gain the physiologic benefits of spinal motion.
Methods Subjects: Twenty subjects (10 female, 10 male) participated in the study. Among the men, heights ranged from 163.2 to 188.4cm (mean 176.2cm) and weights ranged from 59.5 to 93.4kg (mean 75.9 kg). Among the women, heights ranged from 144.8 to 177.8cm (mean 165.9cm) and weights ranged from 46.3 to 81.6kg (mean 60.1kg).
Procedures: Targets, each consisting of a light-emitting diode (LED) and a 1cm calculator battery, were attached to the skin over the posterior vertebral body at the following locations: thoracic vertebrae T1–T3–T6–T8–T10–T12 and lumbar vertebrae L1–L3–L5 (Figure 1), and at the mid-point of the femur and tibia, while the subjects were seated in the test fixture. The test fixture allowed subjects to move, unsupported, from a forward-flexed position (80° trunk-thigh angle) to an extended, reclined position (115°) without affecting their natural motion (Figure 2). During the data collection period, seat-pan tilt was adjusted to three positions: -5° rearward, 0° horizontal and +5° forward tilt. Positioned behind the seat pan was a fixed backrest that served as a “safety backrest.” The backrest provided confidence as a backstop at the fully reclined position. The backrest was split, with a 20-cm gap between the two lateral supports, allowing enough room so that the LED targets would not become compressed when subjects adopted a fully reclined position. Subjects were positioned and adjusted to the test fixture for seat-pan height (popliteal height) and buttock position. A removable positioning support ensured that all subjects’ buttocks were positioned in the same location.
Figure 1. Position of LED markers on spine
Figure 2. LED’s depict spinal and pelvic motion
Seat-pan depth was 44 cm with a 2.5 cm foam pad upholstered over a flat surface. Once seated, subjects practiced moving through the full range of motion: 80° forward flexion to 115° extension. Once the subjects felt comfortable and natural with the motion, time-lapse photographs were taken at a rate of 4 frames per second. Each test position was repeated to evaluate repeatability. Subjects repeated the motion for all three seat-pan angles: +5° forward tilt, 0° horizontal and -5° backward tilt. The test procedure was repeated with a 76.2 cm work surface placed in front of the subject. Subjects’ arms rested on the top of the work surface in the forward-flexed position.
Results Both lumbar and thoracic curvature were measured using the National Institute for Occupational Safety and Health (NIOSH) method. Lordosis angles were determined by drawing a line connecting the points of the corresponding posterior vertebral body at L1 to L3 and L3 to L5. At the superior margins of L1 and L5, perpendicular lines were drawn so that their intersection formed the angle of lordosis at the lumbar region. Kyphosis angles were determined by drawing a line connecting the points T1 to T6 and T6 to T12. At the superior margins of T1 and T12, perpendicular lines were drawn so that their intersection formed the angle of kyphosis at the thoracic region. Table 1. Average change in curvature from 80° to 115° forward flexion
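Because the perpendiculars in the construction above meet at the same angle as the chords they are drawn from, the lordosis or kyphosis angle reduces to the angle between the two chords joining the marker points (L1–L3 and L3–L5, or T1–T6 and T6–T12). The sketch below is illustrative only: it assumes two-dimensional sagittal-plane marker coordinates and invented values, and it is not the software used in the study.

    import numpy as np

    def curvature_angle(p_sup, p_mid, p_inf):
        # Angle (degrees) between the chord p_sup->p_mid and the chord p_mid->p_inf.
        # Equivalent to the intersection angle of the perpendiculars in the NIOSH
        # construction, e.g. p_sup = L1, p_mid = L3, p_inf = L5 for lumbar lordosis.
        upper = np.asarray(p_mid, float) - np.asarray(p_sup, float)
        lower = np.asarray(p_inf, float) - np.asarray(p_mid, float)
        cos_a = np.dot(upper, lower) / (np.linalg.norm(upper) * np.linalg.norm(lower))
        return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

    # Hypothetical sagittal-plane positions (x = anterior, z = vertical), in cm
    print(round(curvature_angle((1.0, 10.0), (0.0, 5.0), (0.5, 0.0)), 1))  # ~17 degrees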
The average change in thoracic curvature was 3° (S.D. 1.6°) for flexion angles between 80° and 95° and 2.7° (S.D. 1.7°) between 95° and 115°. The average change in lumbar curvature was 0.08° (S.D. 1.1°) for flexion angles between 80° and 95° and 4.5° (S.D. 1.8°) between 95° and 115°. Pelvic rotation was monitored by the translation of the thigh (femur) posteriorly, as the pelvis rotates rearward about the ischial tuberosity. Quantitative data for the exact amount of pelvic rotation could not be obtained. However, qualitative assessments of the magnitude of pelvic rotation were made. The amount of pelvic rotation was observed from the distance the thigh moved rearward; the greater the distance, the greater the amount of hip rotation. The average rearward thigh translation was 5.57 cm (S.D. 1.24).
Conclusion Thoracic region of the spine 1) Thoracic curvature becomes more kyphotic as the person reclines (figure 3). 2) Variance among subjects was significant. 3) Change in curvature was consistent for the full range of motion, from 80° to 115°. 4) Seat-pan angle did not affect thoracic curvature. 5) Subjects displayed greater kyphosis when they were seated at a work surface.
Lumbar region of the spine 1) Lumbar curvature becomes more lordotic as the person reclines (figure 3). 2) Variance among subjects was significant. 3) Changes in curvature occurred primarily from 95° to 115° of the range of motion. 4) Seat-pan angle did not affect lumbar curvature. 5) Lumbar curvature was not affected by the presence of a work surface.
Figure 3. Thoracic kyphotic and lumbar lordotic curvature increases as a person reclines
Pelvic rotation 1) Pelvic rotation decreased when subjects were seated at a workstation. 2) Variance among subjects was significant. 3) Seat-pan angle did not affect pelvic rotation.
Discussion Seating design should be considered from the perspective of the end users and their postural requirements. Thus, a main objective should be to determine the ways in which a chair can support the body while, at the same time, providing for unrestricted movement. One should expect a chair to conform to, or accommodate, the body, rather than expecting the user to conform to the shape of the chair. In order to refine design criteria consistent with this expectation, this study was conducted to record kinematic motion of the back during unrestricted movement.
It was found that the motion of the upper trunk represents a combination of spinal movement and pelvic rotation. As a seated individual moves from a forward-flexed position (80° trunk-thigh angle) to a reclined position (115°), both thoracic kyphosis and lumbar lordosis increase. The path and rate of motion of the lumbar spine (L3) are independent of the path and rate of motion of the thoracic spine (T6); additionally, both parameters vary with the complex, combined motion of pelvic rotation, as well as changes in spinal curvature. To provide maximal support, a chair’s backrest should follow the motion of the back while the seated individual changes position. The backrest must, therefore, be flexible enough to provide continuous support in both an upright and a reclined position. This study demonstrates the need for a backrest that can change its contouring as an individual moves. The thoracic region of the back requires a backrest that is capable of providing an increasingly concave surface as one reclines further backward, while the lumbar region requires a surface that is capable of increasing in convexity. Chairs which feature a single-plane surface cannot provide this type of support. In contrast, a dynamic backrest, one with a changing surface contour, will ensure that the back is supported in all natural seated postures. Knowledge gained from this study of motion can lead to a design solution that addresses the complexities of human movement and one that provides more comfortable and healthy seating than do conventional chair designs.
References
Adams M.A. 1983, The effect of posture on the fluid content of lumbar intervertebral discs. Spine, 8(6)
Bendix T., Winkel J., Jessen F. 1985, Comparison of office chairs with fixed forwards or backwards inclining, or tiltable seats. European Journal of Applied Physiology, 54, 378–385
Chaffin D.B., Andersson G.B.J. 1984, Occupational Biomechanics, (John Wiley & Sons, New York)
Eklund M. 1967, Prevalence of musculoskeletal disorders in office work. Socialmedicinsk, 6, 328–336
Grandjean E. 1980, Fitting the Task to the Man, Third Edition, (Taylor and Francis, London)
Holm S., Nachemson A. 1983, Variations in nutrition of the canine intervertebral disc induced by motion. Spine, 8(8), 866–874
Hult L. 1954, Cervical, dorsal and lumbar spine syndromes. Acta Orthopaedica Scandinavica (Supplement 17)
Kelsey J. 1975, An epidemiological study of the relationship between occupations and acute herniated lumbar intervertebral discs. International Journal of Epidemiology, 4, 197–205
Kroemer R. 1994, Sitting (or standing?) at the computer workplace. In Lueder R. and Noro K. (eds.), Hard Facts about Soft Machines: The Ergonomics of Seating, (Taylor & Francis, London), 181–191
Lawrence J. 1977, Rheumatism in populations. (William Heinemann Medical Books Ltd, London)
Magora A. 1972, Investigation of the relation between low back pain and occupation. 3. Physical requirements: Sitting, standing and weight lifting. Industrial Medicine, 41, 5–9
Reinecke S., Bevins T., Weisman J., Krag M.H. and Pope M.H. 1985, The relationship between seating postures and low back pain. Rehabilitation Engineering Society of North America, 8th Annual Conference, Memphis, Tenn.
Schoberth H. 1962, Sitzhaltung, Sitzschaden, Sitzmöbel. (Springer-Verlag, Berlin)
THE EFFECT OF LOAD SIZE AND FORM ON TRUNK ASYMMETRY WHILE LIFTING Gail Thornton* and Joanna Jackson**
*Formerly of Coventry University **School of Health and Social Studies, Colchester Institute
The influences of load characteristics on trunk asymmetry during a lifting manoeuvre were investigated. Asymmetry was defined as the degree of rotation and side flexion occurring in the thoracolumbar spine. Using a same-subject, repeated-measures design, angular range of motion in the coronal and transverse planes was measured using a tri-axial goniometer. Objects of equal mass but different dimensions and form were used. ANOVA revealed that there was no significant difference in the range of motion during the lifting of the three objects. It was concluded that load size and form did not significantly influence the degree of trunk asymmetry while lifting.
Introduction The association between manual materials handling and occupational low back pain has been well documented and widely reported. Many ergonomic evaluation techniques rely on evidence based upon static biomechanical assessments of spinal loading during lifting (Marras et al, 1993). In addition, many of these assessments only consider sagittally symmetrical positions of the body. Waters et al (1993) demonstrated that consideration of the dynamic components of lifting could be crucial to a proper understanding of the causes of back injury. Asymmetry of the trunk during lifting has been strongly associated with an increased risk of back injury. Significant risk factors for back injury include repetitive twisting or side bending when lifting, even when loads are relatively light (Bigos et al, 1986). Asymmetry (twisting and side bending) has been found to influence individual capability by reducing trunk strength and increasing the degree of strain put on the intervertebral disc (Kelsey, 1984; Shiraz-Adl, 1989). The dynamic components of lifts, defined as angular ranges of motion, acceleration and velocity (Marras et al, 1993), have also been associated with increased spinal loading.
Features of the object/load to be lifted, such as its size, bulk and unpredictability, have also been linked to the possibility of manual materials handling becoming more hazardous. Lift styles, lift frequency, weight of load, size of load and static strength have all been investigated. There appears to be a paucity of literature examining the effect of load size and form on dynamic three-dimensional motions of the trunk. The degree of asymmetry occurring when lifting varying loads of the same weight may give an indicator of risk. Marras et al (1995) identified the need to establish trunk motion characteristics of “in vivo” occupational lifting conditions to give improved reasoning about the mechanisms involved with back injury. Knowledge of three-dimensional spinal position is one of the key elements he identifies as essential in any evaluation of injury risk during manual handling. During this study asymmetry was defined as the degree of rotation and side flexion taking place in the trunk. Rotational movements occur in the transverse plane of the body, about a vertical axis. Side flexion occurs in the coronal plane about a sagittal axis.
Methodology A same-subject, repeated-measures experimental design was used. All subjects were measured under three different lifting conditions. The objects to be lifted had a mass of 8kg; two were boxes of different dimensions and one was a beanbag. The aim was to simulate loads that were bulky and awkward, small and compact, and with contents that were susceptible to shifting. A convenience sample of fifteen subjects was drawn from a student population. None of the subjects had any previous known back injury or pathology. None of the subjects had received any formal training in manual materials handling. All subjects volunteered for the study and gave written informed consent to participate. During each lift spinal motion was recorded using the lumbar motion monitor (LMM). The LMM is an electrogoniometer which is capable of measuring the instantaneous position of the thoracolumbar spine in three dimensions (Marras, 1995). The LMM consists of an exoskeleton, which represents the posterior guiding system of the spine; this is attached to the subject by a two-piece harness allowing the LMM to track the subject’s trunk motion. The validity of the LMM has been established (Marras, 1992), as has its inter-tester and intra-tester reliability (Gill and Callaghan, 1997). The LMM is calibrated in its carrying case and this gives it a zero position for all three planes of movement. A subject wearing the LMM will therefore show a negative reading in the sagittal plane when in relaxed standing, which represents the lumbar lordosis. Each subject wore the LMM with the harness fitted in relation to set bony landmarks. The upper metal edge of the pelvic harness was aligned with the junction between the L5/S1 vertebrae and the lower metal edge of the thoracic harness was aligned with the inferior angle of the scapula. All subjects were informed that the load would be no greater than 10kg. The subjects lifted the three objects in a random order to eliminate order effects. Each subject stood at a marked point 125cm in front of a plinth and the object to be lifted was placed on the floor
25cm in front of the subject. The subject was instructed that on the command “1 2 3 Go” they were to lift the object and place it onto the plinth; they could move freely, using any style of lift they wished. Each subject lifted the three objects during one test session. Data was collected using the Industrial software available for the LMM. The range of motion analysed was calculated using the upper and lower range summary statistics. This is based on the peak ranges of motion recorded, in this study, for the movements of side flexion and rotation to the left and right. Statistical analysis was undertaken using SPSS for Windows.
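As a minimal illustration of the range-of-motion summary described above (the study itself used the LMM's Industrial software), the peak-to-peak range of a recorded angle trace can be taken as the upper peak minus the lower peak; the trace below is invented.

    def range_of_motion(angles_deg):
        # Peak-to-peak range (degrees): upper peak minus lower peak of the series.
        return max(angles_deg) - min(angles_deg)

    # Hypothetical side-flexion trace for one lift (degrees; left negative, right positive)
    side_flexion = [-2.0, 3.5, 7.8, 12.1, 9.0, 1.5, -4.2]
    print(round(range_of_motion(side_flexion), 1))  # 16.3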
Results Fifteen healthy subjects were used in the study; ten females and five males. Table 1. Subject Data
Data was analysed using SPSS for Windows, release 6.0. Descriptive data is presented in Table 2. Table 2. Side flexion and rotation in the thoracolumbar spine
A one-way analysis of variance (ANOVA), repeated measures, was computed on each set of data (side flexion and rotation). Results are presented in Tables 3 and 4. Table 3. ANOVA for variable rotation
Table 4. ANOVA for variable side flexion
The results from the ANOVA suggest that changes in load size or form did not significantly affect the degree of asymmetry taking place in the trunk during lifting. Closer examination of the descriptive data of individual subjects revealed that they appear to fall into two subgroups: one demonstrating very little variation in spinal motion between the different conditions, the other having considerable variation when lifting the different objects. The majority of subjects demonstrated more rotation taking place than side flexion when lifting any of the three objects.
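As an illustration only (the study itself used SPSS), a one-way repeated-measures ANOVA of this form can be reproduced with the AnovaRM class in statsmodels; the subject numbers, load labels and rotation values below are hypothetical.

    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    # Hypothetical long-format data: one rotation range of motion (degrees)
    # per subject and load condition (within-subject factor).
    data = pd.DataFrame({
        "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3],
        "load": ["large_box", "small_box", "beanbag"] * 3,
        "rotation": [14.2, 16.8, 15.1, 10.5, 12.0, 11.3, 18.4, 19.9, 18.0],
    })

    result = AnovaRM(data, depvar="rotation", subject="subject", within=["load"]).fit()
    print(result)  # F value and p value for the effect of load form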
Discussion The results of this study did not show a significant difference in spinal motion when lifting three different loads; however, all the tasks did produce substantial three-dimensional motion of the trunk. The following limitations and observations of the study should be considered:
• Only a small number of subjects were used; within this number a clearly identifiable group appeared to have obvious differences in spinal motion between the different conditions.
• The subjects were allowed little time to acclimatise to wearing the LMM. It could have influenced, altered or restricted subjects’ movement.
• The hand position used during the lift will have influenced trunk motion. This may have changed during the task once the lifter had assessed the weight of the object.
• There appeared to be more spinal motion when lifting the small box. The large box could have made subjects more cautious, so a more controlled lifting style may have been adopted. This may also relate to the influence of weight knowledge on lifting technique, where there is anticipation of the amount of effort required to lift a load.
• Lift style was not controlled in the study. Making a visual recording of subjects’ lifting technique would have allowed for consideration of this variable and of the effect of hand position.
• The range of motion analysed was the range through which the subjects moved in side flexion and rotation. Consideration could also have been given to the maximum range of movement achieved in any direction and to the interactions between the movements.
• It should not be forgotten that there are many factors which can increase the risk of injury whilst lifting.
References
Bigos, S.J., Spengler, D.M., Martin, N.A., Zeh, J., Fisher, L., Nachemson, A. and Wang, M.H. 1986, Back injuries in industry: A retrospective study. II. Injury factors, Spine, 11, 246–251
Gill, K.P. and Callaghan, M.J. 1997, Intratester and intertester reproducibility of the lumbar motion monitor as a measure of thoracolumbar range, velocity and acceleration, Clinical Biomechanics, 11, 418–421
Kelsey, J.L., Githens, P.B., White, A., Holford, T., Walter, S.D., O’Connor, T., Ostfield, A.M., Weil, U., Southwick, W.O. and Calogero, J.A. 1984, An epidemiological study of lifting and twisting on the job and risk for acute prolapsed lumbar intervertebral disc, Journal of Orthopaedic Research, 2, 61–66
Marras, W.S., Fathallah, F.A., Miller, R.J., Davis, S.W. and Mirka, G.A. 1992, Accuracy of a three-dimensional lumbar motion monitor for recording dynamic trunk motion characteristics, International Journal of Ergonomics, 9, 75–87
Marras, W.S., Lavender, S.A., Leurgans, S.E., Sudhakar, L.R., Allread, W.G., Fathallah, F.A. and Ferguson, S.A. 1993, The role of dynamic three-dimensional trunk motion in occupationally-related low back disorders, Spine, 18, 617–628
Marras, W.S., Lavender, S.A., Leurgans, S.E., Fathallah, F.A., Allread, W.G. and Sudhakar, L.R. 1995, Biomechanical risk factors for occupationally related low back disorders, Ergonomics, 38, 377–410
Shiraz-Adl, A. 1994, Biomechanics of the lumbar spine in sagittal/lateral moments, Spine, 19, 2407–2414
Waters, T.R., Andersen, V.P., Garg, A. and Fine, L.J. 1993, Revised NIOSH equation for the design and evaluation of manual lifting tasks, Ergonomics, 36, 749–776
THE EFFECT OF VERTICAL VISUAL TARGET LOCATION ON HEAD AND NECK POSTURE Robin Burgess-Limerick, Anna Plooy, & Mark Mon-Williams
Department of Human Movement Studies The University of Queensland, 4072 AUSTRALIA
Twelve participants viewed a visual target placed in 6 vertical locations ranging from 30° above to 60° below horizontal eye level. This range of vertical target location was associated with a 37° change in head orientation, and a 53° change in gaze angle with respect to the head. The change in head orientation was predominantly achieved through changes in atlanto-occipital posture. Consideration of these data in light of preferred gaze angle data, and neck muscle length/tension relationships, suggests that visual targets should be located at least 15° below horizontal eye level.
Introduction A change in the vertical location of visual targets influences both the vertical gaze angle of the eyes relative to the head, and the orientation of the head relative to the environment (Burgess-Limerick et al., in press; Delleman, 1992). In general, a large range of vertical gaze angles and head orientations might be combined to view a visual target in any particular vertical location. Similarly, any given head orientation may be achieved through combining a large range of trunk orientations, cervical postures, and positions of the atlanto-occipital joint. The aims of this paper are: (i) to describe the gaze angles and postures adopted to view a large range of vertical target locations; and (ii) to explore the consequences of these changes in terms of potential musculoskeletal discomfort.
Method Twelve participants self-selected the height and backrest inclination of an adjustable chair. A small-screen television (4.5×5.5 cm) connected to a video player was mounted at 15° intervals on a 65 cm arc. The arc was positioned so that its centre was at the same height as each participant’s eyes in the self-selected sitting position. The television was placed in six positions: +30, +15, 0, -15, -30, -45 and -60° with respect to a virtual horizontal line at eye level. Each position was presented 3 times in random order. A modified Stroop task (the
television displayed a single word written in a contradictory colour, e.g. the word “red” would appear written in green, and participants were requested to name the word rather than the colour it was written in) was performed for one minute in each trial, with data collection (at 10 Hz) occurring during the last 10 seconds. Optotrak (Northern Digital) provided the three-dimensional location of infrared-emitting diodes placed on the outer canthus (OC), the mastoid process (MP) on a line joining the external auditory meatus and the outer canthus, the spinous process of the seventh cervical vertebra (C7), and the greater trochanter (GT). The markers were used to define head and neck angles in the sagittal plane. These angles were used to describe the position of the head and neck, modelled as three rigid links articulated at two pin joints at the atlanto-occipital joint and C7. The position of the head with respect to the external environment was described by calculating the position of a line joining the OC and MP markers (the ear-eye line) with respect to the horizontal. Gaze angle is reported with respect to the ear-eye line.
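A minimal numerical sketch of the angle definitions just described, assuming two-dimensional sagittal-plane coordinates (x positive forward, y positive upward) for the OC and MP markers; the marker and target positions below are invented for illustration.

    import math

    def ear_eye_angle(oc, mp):
        # Inclination (degrees) of the ear-eye line (MP -> OC) above the horizontal.
        return math.degrees(math.atan2(oc[1] - mp[1], oc[0] - mp[0]))

    def gaze_angle_re_head(oc, mp, target):
        # Gaze angle (degrees) of the line OC -> target, relative to the ear-eye line;
        # negative values indicate gaze below the ear-eye line.
        gaze = math.degrees(math.atan2(target[1] - oc[1], target[0] - oc[0]))
        return gaze - ear_eye_angle(oc, mp)

    # Hypothetical positions (cm): mastoid process, outer canthus, low visual target
    mp, oc, target = (0.0, 0.0), (9.0, 2.0), (65.0, -20.0)
    print(round(ear_eye_angle(oc, mp), 1), round(gaze_angle_re_head(oc, mp, target), 1))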
Results and Discussion The average effect of vertical target location on vertical gaze angle and posture is summarised in Figure 1. Participants responded to changes in visual target location with an approximately linear change in both head inclination (as described by ear-eye position) and gaze angle. Fixation on a visual target which varied through a 90° vertical range was achieved by an average change in head orientation (ear-eye line) of 37° and a change in gaze angle relative to the head of 53° (the average ratio of head inclination to gaze angle change was 0.70). Whilst all participants exhibited linear changes in both variables (all individual participant correlations were greater than 0.93), considerable individual differences existed in the ratio of changes in head orientation to changes in gaze angle relative to the head (from 0.45 to 1.12). Changes in head orientation were achieved predominantly through altering the position of the atlanto-occipital joint (measured here as head angle) and, to a lesser extent, by changing cervical posture (neck angle). The average change of 37° in head orientation across the target locations was produced by an average change in atlanto-occipital position of 28°, a 7° change in the posture of the cervical spine, and a 2° change in trunk inclination. Previous research suggests that subjective preference is for visual targets to be located such that the eyes are rotated downwards relative to the head. Kroemer and Hill (1986) reported the average preferred gaze angle as 35° below the Ear-Eye line for visual targets at 1m. We suggest that the reason for this preference is as follows. An observer’s eyes must converge to maintain single vision of near visual targets. Ocular vergence is produced by activation of the medial recti muscles of the eye. The muscles responsible for raising the eye (the superior obliques) also create a horizontal divergent force on the eye. Raising the eyes to view a target thus requires increased activation of the medial recti to maintain single vision. Conversely, the muscles which lower the eyes also tend to create convergence, thus reducing the activation required by medial recti. Visual discomfort may result from prolonged high activation levels of medial recti. Indeed, anomalies of vergence are considered to be the primary cause of visual discomfort when fixating near targets (Mon-Williams, et al., 1993). This simple mechanical model explains why observers prefer to look downwards to view near targets. It also explains why the preferred vertical gaze angle gets progressively lower as objects get closer (Kroemer & Hill, 1986).
Figure 1: (A) Posture as a function of target position; (B) Gaze angle and head orientation as a function of target position; and (C) schematic representation of postural and gaze angle changes as a function of selected target positions.
Interpretation of the consequences of the postures adopted to view different visual targets also requires consideration of the biomechanics of the head and neck. The head and neck system comprises a rigid head located above a relatively flexible cervical spine. Flexion and extension are possible at the atlanto-occipital and cervical joints. The ligaments and joint capsules are relatively elastic, especially within the mid range, and a large range of movement is possible without significant contribution from passive tissues. The centres of mass of the head, and of the head and neck combined, are anterior to the atlanto-occipital and cervical joints. Consequently, when the trunk is vertical, extensor torques about the atlanto-occipital and cervical joints are required to maintain static equilibrium. A large number of muscles with diverse sizes, characteristics, and attachments are capable of contributing to these torques. The suboccipital muscles, which take origin on C1 and C2 and insert on the occipital bone, are capable of providing extensor torque about the atlanto-occipital joint only; others (such as semispinalis capitis) provide extensor torque about cervical as well as atlanto-occipital joints; while others provide extensor torque about cervical vertebrae only. Increased flexion at the atlanto-occipital joint increases the horizontal distance of the centre of mass of the head from its axis of rotation (level with the mastoid process). Similarly, with the trunk in a vertical position, an increase in flexion of the cervical spine increases the horizontal distance of the centre of mass of the head and neck combined from the axes of rotation in the vertebral column (and, all else remaining the same, the horizontal distance of the head from its axis of rotation). Hence, with the trunk approximately vertical, both atlanto-occipital and cervical flexion increase the torque required of the extensor musculature to maintain static equilibrium. However, the head and neck complex is inherently unstable, especially in the upright position (Winters & Peles, 1990), and the neck muscles must do more than just balance the external forces acting on the system. For the system to be stable, additional co-contraction is required to increase the stiffness of the cervical spine and prevent buckling. The consequence is that significant muscular activity is probably still required even if the head and neck are positioned to minimise the flexor torque imposed by gravitational acceleration. Indeed, the necessity for muscle activity to stabilise the cervical spine is likely to be greater when it is relatively extended (Winters & Peles, 1990). The tension-generating capability of a muscle is highly dependent on its length. In general, changes in posture at the atlanto-occipital and cervical joints will alter both the moment arm and the average fibre length of the muscles active to provide both the required extensor torque and stiffness. While accurate measurements of moment arm and fibre length changes are unavailable, it is clear that muscle fibres which produce extensor torque will be shortened to some extent by increased extension of the head and neck. The best estimates available (Vasavada et al., in press) suggest that the force-generating capabilities of the muscles which cross the atlanto-occipital joint rapidly deteriorate with extension of the atlanto-occipital joint from neutral.
The suboccipital muscles in particular are relatively short, and even a small change in average fibre length caused by extension of the atlanto-occipital joint is likely to cause significant decrement in their tension generating capabilities. Yet it is precisely these muscles which appear to be primarily responsible for vertical movements about axes high in the cervical spine (Winters & Peles, 1990).
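To make the static-equilibrium argument above concrete: the gravitational flexor torque that the extensors must balance is simply head weight multiplied by the horizontal offset of the head's centre of mass from the joint axis, so any flexion that moves the centre of mass forward raises the required extensor torque. The head mass and offsets below are illustrative assumptions, not measurements from this study.

    G = 9.81  # gravitational acceleration, m/s^2

    def extensor_torque(head_mass_kg, horizontal_offset_m):
        # Static extensor torque (N.m) needed to balance the head about the joint axis.
        return head_mass_kg * G * horizontal_offset_m

    # Assumed 4.5 kg head; increasing atlanto-occipital flexion moves the centre of
    # mass further forward of the axis, so the required extensor torque rises.
    for offset in (0.02, 0.04, 0.06):  # metres
        print(f"{offset * 100:.0f} cm offset -> {extensor_torque(4.5, offset):.2f} N.m")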
In this experiment participants responded to a 90° change in vertical target location by changing cervical posture by only 7° on average, but changed the position of the atlanto-occipital joint by nearly 30°. A neutral posture corresponds to a posture in which the ear-eye line is approximately 15° above horizontal (Jampel & Shi, 1992). This corresponds to the average posture adopted by participants in this experiment when the visual target was located between 15° and 30° below horizontal eye height. Higher visual targets are likely to lead to postures in which the tension-generating capabilities of the sub-occipital muscles are reduced. These relationships between joint angle and tension-generating capabilities may account for the observation by Jones et al. (1961) that the most comfortable sitting posture corresponded to a head orientation in which the ear-eye line was approximately horizontal. Such a posture would correspond to the average posture adopted by participants in this experiment when the visual target was more than 45° below horizontal eye level. In summary, the implications of this experiment are that workplaces should be designed such that visual displays are located at least 15° below horizontal eye level, and possibly lower, depending on the distance of the visual target from the observer.
References
Burgess-Limerick, R., Plooy, A., Fraser, K., & Ankrum, D.R. in press, The influence of computer monitor height on head and neck posture. International Journal of Industrial Ergonomics.
Delleman, N.J. 1992, Visual determinants of working posture. In M. Mattila and W. Karwowski (eds.), Computer applications in ergonomics, Occupational Safety and Health, (Elsevier, Amsterdam), 321–328.
Jampel, R.S. and Shi, D.X. 1992, The primary position of the eyes, the resetting saccade, and the transverse visual head plane. Investigative Ophthalmology and Visual Science, 33, 2501–2510.
Jones, P.P., Gray, F.E., Hanson, J.A. & Shoop, J.D. 1961, Neck-muscle tension and the postural image. Ergonomics, 4, 133–142.
Kroemer, K.H.E. and Hill, S.G. 1986, Preferred line of sight angle. Ergonomics, 29, 1129–1134.
Mon-Williams, M., Plooy, A., Burgess-Limerick, & Wann, J. in press, Gaze angle: A possible mechanism of visual stress in virtual reality headsets. Ergonomics.
Mon-Williams, M., Wann, J., & Rushton, S. 1993, Binocular vision in a virtual world: Visual deficits following the wearing of a head mounted display. Ophthal. Physiol. Opt., 13, 387–391.
Vasavada, A.N., Li, S., & Delp, S.L. in press, Influence of muscle morphometry and moment arms on the moment-generating capacity of human neck muscles. Spine.
Winters, J.M. and Peles, J.D. 1990, Neck muscle activity and 3-D head kinematics during quasi-static and dynamic tracking movements. In J.M. Winters & S.L.-Y. Woo (eds.), Multiple muscle systems: Biomechanics and movement organisation, (Springer Verlag, New York), 461–480.
OFFICE ERGONOMICS
IS A PRESCRIPTION OF PHYSICAL CHANGES SUFFICIENT TO ELIMINATE HEALTH AND SAFETY PROBLEMS IN COMPUTERISED OFFICES? Randhir M Sharma Division of Operational Research and Information Systems The University of Leeds Leeds LS2 9JT Although this remains a highly controversial and emotive issue, it appears that a time has finally arrived when it has become ‘acceptable’ to suggest that certain health conditions may have computer use as a significant contributory factor. Directives issued by the European Commission under article 118a of the Treaty of Rome are widely recognised as the way forward. They consist of recommendations designed to reduce the likelihood of health problems arising as a result of computer use. The primary focus of these directives is the physical components of the office. They are essentially a prescription of physical changes. However, the work environment comprises both physical environment and job organisation (Choon Nam Ong 1990). Can any approach which focuses solely on the physical components be successful in reducing health complaints?
Introduction It is estimated that by the turn of the century at least two out of every three people who work will use a VDU (Bentham 1991). The potential costs of computer use related health problems are therefore immense. Sharma (1996) presented the results of a survey conducted amongst staff and students of the School of Computer Studies at the University of Leeds and computer users employed by a large newspaper in India. The results indicated that many users were suffering, or had suffered from health problems at some time. Most importantly, with regard to the work described in this paper the results showed that most users felt that they did not know enough about the subject of health and safety and that more could be done to inform them. The starting point of the experiment described in this paper is the observation that users had a variety of preferences for the techniques which could be employed to educate them about health and safety problems. The two most popular suggestions made were firstly, to use an introductory lecture and secondly, to distribute information in a variety of formats. In addition to these suggestions it was agreed that it would be useful to assess the effect of an ‘ideal working environment’.
Method The experiment was conducted with the staff in India who had participated in the original survey. There were three reasons for the choice of participants. Firstly, it was not possible, due to the high turnover of both staff and students at the School of Computer Studies, to ensure that those who had taken part in the initial survey would be able to take part in the experiment. Secondly, funds for the experiment were made readily available in India. Finally, the level of understanding concerning health and safety issues did not display a large variance. This would allow any changes to be identified much more easily. Users were split into four groups. After an initial survey, interventions were made in three of the four groups. After three months the groups were surveyed again in order to assess whether the interventions made had resulted in the reduction of any health problems. The first survey was carried out in September 1996; the second, follow-up survey was conducted in January 1997. Although the
time between the two surveys was quite short, the results obtained suggest that it had been sufficient to allow changes to manifest themselves. The interventions made are detailed below:
Group 1. Existing chairs were replaced with new ergonomically designed chairs with adjustable height, rake and armrests. Glare screens were provided in order to counter the effects of fluorescent lighting. Wrist rests were introduced in order to provide cushioning for arms and wrists. Copy holders were supplied to allow information which was being typed to be in the line of view of operators. The height of display units was raised to encourage users to adopt an upright working posture. Finally, display screens were repositioned in order to encourage a greater distance between the user and the screen. An important point to note is that users were not given any guidance about how the equipment should be used or any additional information on working practices. The aim in doing this was to simulate typical work environments where often both ergonomic furniture and accessories are introduced without advice about how these should be used.
Group 2. The second group was provided with information in a number of different formats. Booklets and leaflets containing information about various health problems were located in offices. These were placed in such a way that they could neither be removed from offices nor obscured. There was, however, no compulsion for the users to read this information. Two types of poster were also positioned on the walls of the offices used by this group. The first type was of A3 size; these contained single sentences in large bold fonts. The sentences used contained guidelines about working distance, frequent breaks and a reminder to stretch regularly. The second type of poster illustrated a ‘correct’ working posture.
Group 3. The third group was given an introductory forty-five minute talk about health problems associated with computer use, and the steps that could be taken to eliminate these problems. The problems covered were musculoskeletal injuries, visual problems and postural problems. The talk also contained a question and answer session which allowed the audience to ask about anything which they did not understand.
Group 4. The fourth group was used as the control group. No further intervention was made in this group.
Results The survey focused on four key areas: working habits, frequency of problems, contribution of particular components to problems and, finally, the level of knowledge about health and safety. The working habits examined were the number of hours per week spent using a computer, the time between breaks, the distance between the user and the screen, and whether or not the user stretched before or during working. The questions concerning the frequency of problems and the level of blame attached to each component both used a four-point scale. In the case of problem frequencies, 1 indicated a high frequency and 4 indicated never. For the questions regarding the level of blame, 1 indicated a large contribution and 4 indicated no contribution at all. The results produced were analysed using the ‘Wilcoxon Signed Rank Test’.
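For illustration only (the ratings below are invented, not the study's data), a paired before/after comparison of ordinal ratings of this kind can be run with the Wilcoxon signed-rank test in scipy:

    from scipy.stats import wilcoxon

    # Hypothetical 4-point problem-frequency ratings (1 = high frequency, 4 = never)
    # for the same ten users before and after an intervention.
    before = [2, 1, 3, 2, 2, 1, 3, 2, 4, 2]
    after = [3, 2, 3, 3, 2, 2, 4, 2, 4, 3]

    stat, p = wilcoxon(before, after)
    print(f"W = {stat}, p = {p:.3f}")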
Group 1 Surprisingly, the frequencies of certain health problems rose. The mean problem frequency ratings can be seen in Table 1. When the working habits of the group were assessed it was found that users were working a greater number of hours per week and sitting closer to the screen, despite the fact that monitors had been repositioned. Users were also working longer
between breaks. Fewer users were stretching before or during work. There was no change in the level of knowledge of terminology. Surprisingly, when asked if they felt that they knew enough about health and safety, there was a 35% increase in the number of users who felt that they knew enough, despite there being an increase in the frequency of health problems. When questioned about the factors that they considered responsible for their problems, there was a significant reduction in the level of blame attached to those components which had been modified or replaced. Table 1. Mean Problem Frequency Ratings (n< denotes that the sample size was not large enough to make a comparison)
Group 2 The frequencies of several health problems fell quite dramatically in this group. This fall can be attributed quite clearly to the changes in the working habits of the users. Although the number of hours per week spent using a computer did not change, the time between breaks fell, the distance between the user and the screen increased and the number of users who stretched before or during work increased by 65%. The level of familiarity with terminology also rose; after three months 50% of respondents were familiar with the terms RSI and WRULD. Despite this increase in knowledge and the reduction in health problems, none of
the respondents felt that they knew enough about health and safety. When questioned about components of the office which users felt were responsible for problems, there was no change from the previous survey. It appeared that, despite being better informed, users were unable to isolate particular components of the office as being responsible.
Group 3 Group 3 showed very little change. The introductory talk, although simple in its content, did not have any lasting effect. There was no change in the frequency of health problems or in the working habits of the users. There was also no change in the level of familiarity with terminology or the level of blame attached to individual components. The only observation made from this group was that, surprisingly, some of the users in this group now felt that they knew enough about health and safety.
Group 4 Group 4 was the control group; no changes were observed in either the frequency of health problems or the working habits of users. The level of knowledge about health and safety and the level of blame attached to individual components were also unchanged.
Discussion The worrying aspect about group 1 is that it was an attempt to simulate a typical employer response. Modifications are often made to working environments because of ergonomic concerns without any supporting guidance as to how this equipment should be used. The results of group 1 indicate that physical changes alone are not sufficient to reduce the frequency of health complaints. The significant change in the working habits of the users is also a matter of concern. The results indicated that the users had adopted working habits which had countered any benefits that may have been obtained from the changes made. The most interesting observation concerned the level of blame attached to individual components. Although the frequency of certain health problems had risen, the level of blame attached to individual components of the office fell. The author believes that the group was reluctant to blame those components of their office which they believed were causing them problems because they had been modified or replaced by an ‘expert/authority’ figure. If the results obtained here are typical, they have very serious implications. It appears that by simply modifying the environment, we may do little more than silence complaints. The frequency of problems had risen, yet despite this there was a decrease in the blame given to individual components; there was also an increase in the number of people who were satisfied with their level of knowledge. Generally the consensus among users was that all that could be done had been done and that little more could be done to help. This is dangerous; we do not want health problems as a result of computer use to be seen as the norm. The changes made to the environment of group 1 appeared to have done more harm than good. This does not, however, mean that they are not necessary. Much research has been done to assess the effectiveness of ergonomic aids. This research has shown that there are benefits to be achieved. The results obtained from group 2 suggest that in order for such aids to be effective they have to be implemented in parallel with a package of measures which educate users about the manner in which they should be used. Group 2 displayed a significant reduction in health problems after only three months. There was also a significant change in the working habits of the users. These improvements were obtained without the introduction of changes to the working environment. The importance of the format in which information is delivered was clearly demonstrated by this group. Individuals learn differently and favour different techniques. As with any form of education it is important that the audience is given due consideration when the mechanisms of delivery for such information are considered. The information distributed to group 2 was in a variety of formats in order to ensure as
The final observation from group 2 was that improvements were made at a minimal cost. The cost of the improvements was the paper on which the information was printed. The costs of preventative measures, and in particular the costs of meeting legislative requirements, are often cited as an obstacle to the implementation of ‘best practice’ as defined by legislation (Gough & Sharma 1998). The results of group 2 have shown that a healthy workplace can be achieved without the need for a large scale financial investment. An introductory lecture or presentation was one of the most popular suggestions made by respondents in the initial survey. The results of group 3 illustrate clearly how unsuccessful this technique was. Overall, little change was achieved; the presentation had very little effect on the group. It is immediately apparent from these results that a more comprehensive approach is required. Unfortunately, in many organisations users do not even receive this. The group had not remembered any of the presentation and realistically could not have been expected to. They had been forced to attend a single presentation and then asked to try and absorb information which was completely alien to them. They had no feedback and no one to whom they could put questions or from whom they could receive guidance. The results of group 4 indicated that little change had occurred in the group during the three months. This lack of change was important if we are to assume that the changes witnessed in the other groups occurred as a result of the interventions made.
Conclusion The approach used for group one was a component-based approach. The focus of the approach was on the physical characteristics of the working environment. The results of this approach were that, firstly, the frequency of problems had increased and, secondly, the blame attached to those components which had been modified had fallen. The users were suffering but were unable to pinpoint why. The key problem with article 118a is that it is primarily component-based. The document focuses mainly on the hardware found in computerised offices. Regulation 7 looks at the provision of information; section 1a of this regulation states ‘Every employer shall ensure that operators and users at work in his undertaking are provided with adequate information about all aspects of health and safety relating to their workstations’. This particular guideline is, however, all too easily forgotten. The results of group 2 indicate that a much greater amount of attention needs to be paid to this regulation. Only by educating users can we equip them to deal with the health problems which have been shown to be linked to computer use. Simply giving users the latest in furniture and ergonomic accessories is no guarantee of a healthier workplace. Any methodology which seeks to eliminate health and safety problems must have as its foundation a comprehensive programme of education. More importantly, however, this programme needs to be tailored to the needs of the user. It has to contain the information users need in the format that users want. Otherwise resources spent on modifying computerised offices in order to protect users will do little more than harm them further.
References:
Bentham P. 1991, VDU Terminal Sickness: Computer Risks and How to Protect Yourself. (Green Print)
Choon-Nam Ong 1990, Ergonomic Intervention for Better Health and Productivity. In S. Sauter (ed.), Promoting Health and Productivity in the Computerised Office, (Taylor & Francis)
HSE 1992, Display Screen Equipment Work: Guidance on Regulations.
Gough T.G. & Sharma R.M., Health & Safety in Information Technology—an International Educational Issue, accepted for BITWORLD ’98.
Sharma R.M. 1997, Health and Safety in Computerised Offices: The Users’ Perspective. In S.A. Robertson (ed.), Contemporary Ergonomics 1997, (Taylor & Francis, London), 257–262.
AN EVALUATION OF A TRACKBALL AS AN ERGONOMIC INTERVENTION Barbara Haward
Robens Institute for Health Ergonomics University of Surrey Guildford, GU2 5XH
In the workplace, intensive mouse users are observed to develop work related musculoskeletal disorders more frequently than keyboard only users. As an intervention measure to assist users with symptoms, a trackball has been used, but never strictly evaluated. A workplace study was undertaken with experienced mouse users performing their normal work activities using a questionnaire to gather data and select subjects for follow up study. In the small follow up study, changing the input device had little effect on subjects’ perceptions of tiredness and discomfort and work practices. Subjects perceived fatigue and discomfort as related to work load and long working hours rather than to the input device. This, associated with observations on motivation levels and degree of control over their work, indicates the importance of psychosocial factors in the work system.
Introduction Worldwide use of information technology has been increasing rapidly since the early 1970s, especially with the introduction of low cost, smaller computer hardware resulting in the proliferation of desktop computers. Software development has led to the introduction of more accessible packages that rely on Graphical User Interfaces (GUIs). These incorporate windows, ‘pop up’ menus, icons and dialogue boxes on the computer screen. Most available software packages use GUIs and are dependent on input devices such as the mouse, using ‘point and click’ operations. With the increasing use of these in the workplace, it is likely that mouse users are at risk of developing work related musculoskeletal problems as a result of postures adopted and the nature and duration of work tasks. Little epidemiological research has been undertaken in this area despite mice being a standard feature of most desktop computer configurations. Research carried out has been on simulated activities of short duration (e.g. Karlqvist, 1994) or on mouse users who use the device for short time periods (Hagberg, 1994). Real workplace problems have been encountered with computer users who use mice intensively to carry out work tasks, often 6–8 hours per day. These users have been observed to develop and report work related musculoskeletal disorder symptoms at an apparently greater frequency than keyboard-only computer users. As an intervention measure, an alternative input device, a trackball, has been used in the workplace for those reporting symptoms. This
has enabled users to continue working and has appeared to ‘help’ in the reduction of symptom severity. The use of a trackball as an intervention measure has never been strictly evaluated; therefore the study aims were to:
1. Evaluate whether changing a mouse for a trackball had any effects on: (a) Subjective perception of fatigue and discomfort. (b) Exposure to two of the possible factors implicated in the onset of work related musculoskeletal disorders—posture and work organisation (rest breaks, work duration, work rates).
2. Explore the characteristics of musculoskeletal symptom reporting within a group of mouse users with reference to: (a) Time spent using the mouse (b) Number of work breaks (c) Length of work breaks.
The work was carried out as a workplace study in a large Information Technology company using subjects who were experienced mouse users and who performed normal work activities for the duration of the study.
Methods Initial Study The aim was to gain information regarding musculoskeletal symptom prevalence, so a questionnaire was devised to gain both retrospective information and subjects’ perceptions of their work activities. This contained questions on general demographics and work organisation (hours worked at VDU, time spent using mouse, number and length of work breaks) and focussed on musculoskeletal health symptoms using a modified Standard Nordic Questionnaire (Kuorinka, 1987). Responses to the initial questionnaire were matched against selection criteria to choose healthy (i.e. symptom free) subjects for the follow-up study. These criteria were:
(a) Willing to participate.
(b) No reported musculoskeletal symptoms in the previous 12 months or 7 days.
(c) Mouse user.
(d) VDU user for >5 hours of the working day.
(e) Mouse used for >50% of VDU tasks.
Initial data analysis indicated that none of the sample matched these criteria; therefore (b) and (e) were modified as follows:
(b) No reported musculoskeletal symptoms in the previous 7 days in the upper limbs only.
(e) Mouse used for >50% of VDU tasks, or for 26–50% of tasks with ≥6–7 hr VDU use.
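The modified screening rule can be read as a simple boolean filter over the questionnaire records. The sketch below is one possible reading of the criteria listed above; the field names are hypothetical.

    def meets_followup_criteria(r):
        # Criteria (a), (c), (d) unchanged; (b) and (e) as modified above.
        heavy_mouse_use = (r["mouse_share"] > 0.50 or
                           (0.26 <= r["mouse_share"] <= 0.50 and r["vdu_hours"] >= 6))
        return (r["willing"]
                and not r["upper_limb_symptoms_7_days"]
                and r["uses_mouse"]
                and r["vdu_hours"] > 5
                and heavy_mouse_use)

    respondent = {"willing": True, "upper_limb_symptoms_7_days": False,
                  "uses_mouse": True, "vdu_hours": 7, "mouse_share": 0.40}
    print(meets_followup_criteria(respondent))  # True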
Follow up study The purpose of the study was to evaluate the effect of substituting a trackball for a mouse; therefore objective and subjective measurements were used to detect any changes occurring.
RULA (McAtamney and Corlett, 1993), Body Part Discomfort (Corlett and Bishop, 1976), workplace environment measures (lighting levels, temperature) and anthropometry were used. Each device was used for a 3-week study period and subjects completed one questionnaire per device type at the end of this time. Questionnaire content was designed to elicit subjective information regarding fatigue and discomfort and work practices. RULA posture analysis was undertaken at 4 separate times per subject and device type. Device types used were a standard IBM 2-button mouse of curved, rounded shape and a Logitech Trackman Marble 3-button trackball.
Results Initial Study 83% (n=29) of the sample (n=34) reported at least one musculoskeletal problem in the previous 12 months. Prevalence of symptoms by body part was significant for neck, neck and back, and 12-month wrist symptoms from Chi-squared test results. There was little observed difference between percentage mouse use and symptom frequency, and for some body parts (right wrist, neck and upper back) a higher frequency of symptoms was reported for less than 50% mouse use per day. Similarly, there were no observed differences between number and length of work breaks taken and symptom occurrence. (Number and length of work breaks ranged from 0–7 per day and 0–30 minutes duration respectively.)
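The Chi-squared comparisons referred to above can be illustrated in outline. The counts below are entirely hypothetical (they are not the study data) and serve only to show how reported symptoms might be cross-tabulated against mouse-use level and tested; scipy's chi2_contingency is an assumed choice of routine, not anything the author used.

```python
# Hedged sketch: hypothetical counts only, not the study data.
from scipy.stats import chi2_contingency

#             symptoms  no symptoms
table = [[12, 4],    # >50% mouse use per day (hypothetical)
         [17, 1]]    # <=50% mouse use per day (hypothetical)

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```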
Follow up study For the 6 subjects who participated, subjective feelings of tiredness were compared by subject and device type, and there was little observed difference between device type and tiredness rating (refer to Table 1). Table 1
Subjects rated whether their level of tiredness had changed during the study period (refer to Table 2), but this was not perceived to be related to the device type used. Table 2
Change in rate of work pace was subjectively assessed (summarised in Table 3) and again little difference was observed. Table 3
Physical discomfort ratings also indicated no differences between device type and body part discomfort for the group as a whole, although at an individual level one subject did comment that they had less wrist ache using the trackball. RULA grand scores were ‘3’ for all users and both device types except for one ‘4’ rating. When the upper limb scores alone were analysed, there was greater variation, with a range of ‘3’ to ‘5’ ratings, but this was not significant when subjected to Chi-squared testing. Although some workstation mismatches were apparent from physical measurements, these were not confirmed by RULA posture observations. Environmental measurements were taken to identify any factors that could alter subjects’ perceptions of their work environment and hence contribute to psychosocial aspects of the workplace. Temperature and relative humidity ranged from 20–23°C and 35–45% respectively, within desirable limits for air-conditioned offices, and workstation illuminance varied from 360–900 lux.
Discussion and Conclusions Percentage mouse use was not observed to be significant with respect to musculoskeletal symptom occurrence, which did not concur with previous experience or research findings. Hagberg (1994) found differences between low (2 hours/week) and high (10 hours/week) mouse users with regard to shoulder-scapular and hand-wrist-finger symptom prevalence. Karlqvist (1994) found subjective discomfort ratings higher for keyboard operators than mouse users for the neck and shoulder region but that mouse users reported higher discomfort for the forearm/wrist region. The number and length of work breaks also had no apparent effect on symptom occurrence, but there is a lack of available data to compare this against. Changing the input device had little observable effect on the way that subjects perceived and subjectively rated their tiredness, changes in tiredness and body part discomfort. Based on subjects’ comments it was apparent they perceived tiredness and fatigue to be more related to work load levels than to device type being used. Greater differences may have been reported if the trackball had been located in a different position on the desktop to the mouse, but physical space constraints prevented this. Harvey and Peper (1997) showed a centrally placed trackball resulted in lower muscle tension levels than the mouse to the right side of the keyboard. Swanson et al (1997) assessed VDU keyboards and found that different designs had little impact on subjects’ reports of fatigue and discomfort. This supports the findings of the current study that workstation hardware changes are not perceived by users as being related to fatigue and discomfort, but that other work organisation factors are of greater importance.
Changing the input device did not change the subjects’ work organisation in terms of number and length of work breaks and overtime hours worked. One explanation for this is that, because subjects were free of musculoskeletal disorder symptoms, they had no reason to modify their work behaviour to cope with any discomfort and pain. Posture assessments for the study subjects indicated that the only difference between the devices was the observed reduction in ulnar deviation when using the trackball. Previous studies (Karlqvist, 1994) have found ulnar deviation to be a risk factor for upper limb discomfort and pain. The observed reduction in ulnar deviation may be beneficial to someone experiencing ulnar nerve/ulnar region symptoms, but this small study does not provide sufficient evidence to substantiate this. From observations and discussions with the follow up subjects it was apparent that they perceived tiredness and fatigue as more related to excessive workloads and long working hours than to the device type being used. They were a highly motivated group, familiar with high workloads and working to time deadlines, and coped well. This suggests the relevance of psychosocial factors in the work system, which have emerged as being of importance in the multifactorial nature of work related musculoskeletal disorders. It is recommended that trackballs should still have a role in an ergonomics intervention programme, but as part of a range of intervention measures employed to alleviate work related musculoskeletal disorder symptoms where both physical and psychosocial attributes of the workplace are considered.
References
Corlett, E.N. and Bishop, R.B. 1976, A technique for assessing postural discomfort. Ergonomics, 19(2), pp 175–182.
Hagberg, M. 1994, The ‘mouse arm’ syndrome: concurrence of musculoskeletal symptoms and possible pathogenesis among VDU operators. In Grieco, A., Molteni, G., Piccoli, B. and Occhipinti (eds.) Work with Display Units 94, (Elsevier Science, Holland) pp 381–385.
Hales, T.R., Sauter, S.L., Peterson, M.R., Fine, L.J., Putz-Anderson, V., Schliefer, L.M., Ochs, T.T. and Bernard, B.P. 1994, Musculoskeletal disorders among visual display terminal users in a telecommunications company. Ergonomics, 37(10), pp 1603–1621.
Harvey, R. and Peper, E. 1997, Surface electromyography and mouse use position. Ergonomics, 40(8), pp 781–790.
Karlqvist, L., Hagberg, M. and Selin, K. 1994, Variation in upper limb posture and movement during word processing with and without mouse use. Ergonomics, 37(7), pp 1261–1267.
Kuorinka, I., Jonsson, B., Kilbom, A., Vinterberg, H., Biering-Sorenson, F., Andersson, G. and Jorgensen, K. 1987, Standardised Nordic Questionnaire for the analysis of musculoskeletal symptoms. Applied Ergonomics, 18(3), pp 233–237.
McAtamney, L. and Corlett, E.N. 1993, RULA—A survey method for the investigation of work related upper limb disorders. Applied Ergonomics, 24(2), pp 91–99.
Swanson, N.G., Galinsky, T.L., Cole, L.L., Pan, C.S. and Sauter, S.L. 1997, The impact of keyboard design on comfort and productivity in a text entry task. Applied Ergonomics, 28(1), pp 9–17.
Old methods, new chairs. Evaluating six of the latest ergonomic chairs for the modern office. Alan Esnouf and Professor Mark Porter. Sarum Road Hospital, Winchester SO22 5HA; Department of Design and Technology, Loughborough University, Loughborough LE11 3TU
In a large office complex such as Shell-Mex House, where 90% of the work force are seated at workstations for most of their working day, it is imperative to ensure that the work force stays as comfortable as possible. Using a variety of evaluative methods this project evaluated six of the latest chairs on the market to recommend a suitable chair for the office staff of Shell-Mex House. The relative merits of long and short term evaluations were also compared. One chair, 6 (Sedus Paris), was rated as the most preferred; it produced the least discomfort in all eleven body parts of the Corlett questionnaire (Corlett and Bishop 1976), and was also rated as the most comfortable in the paired comparison study.
Introduction The work on chair comfort was started in earnest in the forties and fifties with work by Akerblom (1948) and Keegan (1953), and the momentum has continued ever since. The methodology to date is very varied but basically fits into two categories, subjective or objective, and can be further subdivided into field and laboratory studies. The research so far indicates that there is no one definitive method for predicting chair comfort, therefore this project employed a combination of subjective field studies and objective laboratory studies, namely paired comparisons, a field evaluation study and a pressure distribution study. The definition of comfort used in this project was that used by Shackel et al. (1969) and others; their definition of comfort was a lack of discomfort.
Aims To establish the extent to which short term trials can be used as a replacement for longer trials (a full day) when selecting a suitable visual display unit (VDU) user chair. A second aim of the project was to select a comfortable chair for VDU users in Shell-Mex House.
The body part questionnaires produced similar results; chair 6 (Sedus) showed the least amount of discomfort. In the four body segments upper arms, legs, lower arms and shoulders, none of the subjects recorded any discomfort at the end of their working day in chair 6 (Sedus). At least 50% of the subjects recorded the least amount of discomfort in all 11 body segments in the Corlett questionnaire. Table 1 shows chair 6, the Sedus, to be rated the most preferred chair with the least recorded percentage of discomfort in all 11 body segments. Table 1. Order of preference from analysis of Corlett questionnaire.
Discussion It is interesting to note that despite the many similarities between chair 1 (Simultan) and chair 6 (Sedus), chair 6 was preferred to chair 1 in every parameter evaluated. The major difference identified from the chair feature checklist was that 10 of the subjects considered the seatpan cushion too hard or much too hard in chair 1, the Simultan. Also of note was that in this study the upper back was the area of most discomfort. It may be that with the increased use of computers this is a trend that will increase. The validity of the comfort/discomfort measures was well proven, with the two versions of the Corlett questionnaire, the Shackel comfort scale and a comfort/discomfort question in the chair feature checklist producing remarkably similar results.
Conclusion From the literature it is evident that expert opinion, standards and recommendations are not a reliable method for predicting chair comfort, and can only be used as a starting point when attempting to select a comfortable chair. This study has shown that long term user trials are a viable method of selecting a chair for the modern office. It is also true to say that no one chair will ever suit everyone, so it is important to have alternatives available. The inclusion of a chair feature checklist helped to provide information on what made one chair more comfortable when compared to another.
Seat Pressure Distribution Study Three subjects sat in the same six chairs as used in the previous studies described above: one female, 151cm stature (10th percentile), weight 45kg; one male, 175cm stature (50th percentile), weight 75kg; one male, 190cm stature (95th percentile), weight 93kg.
Paired Comparisons study In the paired comparisons study 48 male and 52 female subjects took part; their weight ranged from 107lbs/48.6kg (lightest female) to 191lbs/90kg (heaviest male). Stature ranged from a 4ft 11ins/150cm female to a 6ft 5ins/195cm male. Age range was 23 to 55 years old. All six chairs were aesthetically similar and a similar shade of blue despite having a variety of different ergonomic features. Chairs 6 and 1, the Sedus and Simultan, were fully synchronised and dynamic. Chairs 3, 4 and 5, the Criterion, Dauphin and Sitrite, had dynamic backrests and tilting seatpans. Chair 2, the Ahrand, had the addition of adjustable seatpan depth.
Procedure. The trials were conducted in rooms very similar to the subjects’ workplace. Each subject was given the same information on the trial and a checklist on which to record their preferences. The checklist included all possible pairings of the six chairs, in an order such that no chair was paired in succession. Subjects started at different places on the checklist so that no comparison was always made at the beginning or end of the trial. They were given a few minutes on each chair to adjust it to their liking, and were instructed to make their selection quickly based on their initial sensations in the chair. They had to record their preferred cushion, backrest and whole chair. Each trial took approximately 45 minutes per subject.
Results Using the method of paired comparison (Guildford, 1954), chair 6, the Sedus Paris, was clearly the most preferred chair in all three parameters (seat cushion, backrest and whole seat); chair 1 (Simultan) was the second most preferred chair. Figure 1 shows the scale value for each chair. The zero value was set to the least preferred chair and is not absolute, but the scale values do allow a relative assessment of the six chairs as the scale is at least linear.
Figure 1. Scale value from paired comparisons
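For readers unfamiliar with the paired comparison scaling referred to above, the following is a minimal sketch of how interval scale values of the kind plotted in Figure 1 can be derived from preference counts. The count matrix is hypothetical and the Thurstone/Guilford Case V treatment is an assumption about the exact variant applied; this is not the authors' analysis code.

```python
# Minimal Thurstone/Guilford Case V sketch; the counts are hypothetical.
import numpy as np
from scipy.stats import norm

prefer = np.array([      # entry [i, j]: number of subjects preferring chair i over chair j
    [ 0, 60, 70],
    [40,  0, 65],
    [30, 35,  0],
], dtype=float)
n_subjects = 100.0

p = prefer / n_subjects          # proportion preferring the row chair over the column chair
np.fill_diagonal(p, 0.5)         # a chair is treated as tied with itself
p = p.clip(0.01, 0.99)           # avoid infinite normal deviates for unanimous preferences
z = norm.ppf(p)                  # unit normal deviates
scale = z.mean(axis=1)           # Case V scale value for each chair
scale -= scale.min()             # anchor the least preferred chair at zero, as in Figure 1
print(scale)
```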
Discussion When analysing the data there was no noticeable relationship between gender, stature or weight. This could well indicate that features of each chair played a major part in subject
selection. The methodology of paired comparisons however is not precise enough to be categorical in making such a statement. The only chair with a specifically shaped lumbar support was rated as the least preferred chair. The first and second most preferred chairs were very similar in nearly all respects and were both synchronised dynamic chairs. The method of paired comparison can be successfully used in identifying the most preferred chair from a group of chairs. The methodology however does leave many unanswered questions, and does not tell us why a particular chair was the most preferred, or what aspects of the chair made it the most preferred.
Field Survey In the field survey 30 subjects reflecting the user population of Shell-Mex House, including data processors, word processors and interactive users, took part. There were 15 females ranging in stature from 152.5cm to 170cm (10th to 90th percentile) (Pheasant, 1990), weighing from 44.5kg to 85kg, age range 25 to 55. The 15 males ranged from 155cm to 190cm (10th to 95th percentile), weighing from 55kg to 90kg, age range 25 to 55. All of the subjects spent the majority of their working day at their workstations.
Procedure The same six chairs were used as in the paired comparison trial. The subjects used each chair for one full day on Tuesday, Wednesday and Thursday over two weeks to reduce some of the effects of Monday freshness and Friday staleness. The use of the chairs was balanced to ensure that the chairs were not evaluated in the same order throughout the trials. Subjects were instructed on chair adjustment and left instructions should more adjustments be required. Once subjects had settled with the chair adjusted to their liking they filled in two Corlett comfort questionnaires, a mood scale, and Shackel’s 11 point comfort scale. Shortly before the end of the working day subjects filled in a second copy of the above questionnaires plus a chair feature checklist.
Results The Shackel scale showed that only 6% of subjects felt uncomfortable in chair 6 (Sedus) by the end of the day, compared to 50% in chair 3 (Criterion) (Figure 2).
Figure 2. Percentage of subjects recording discomfort on the Shackel scale.
Procedure The pressure study was conducted using a Tally Mk 3 pressure monitor with a seatpan matrix of 140 pressure reading points and a matrix of 40 points for the backrest. The chairs were adjusted appropriately for each subject, with the seatpan and backrest fixed to eliminate variations in subjects’ positions, and subjects rested their arms lightly on the arm rests. The pressure readings were then computed into pressure charts and colour contour maps to give a better visual appreciation of the pressure readings.
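The conversion of the recorded pressure matrix into a colour contour map can be sketched as below. The 14 x 10 grid layout and the readings themselves are assumptions for illustration only; the monitor's own charting software is not reproduced here.

```python
# Hedged sketch: hypothetical readings on an assumed 14 x 10 seatpan grid.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
seatpan = rng.normal(loc=40, scale=15, size=(14, 10)).clip(min=0)  # hypothetical pressure values

plt.contourf(seatpan, levels=10, cmap="hot")
plt.colorbar(label="pressure (arbitrary units)")
plt.title("Seatpan pressure distribution (hypothetical data)")
plt.show()
```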
Results The data was given an ‘eyeball analysis’ as described by Reinecke et al (1986). The pressure readings correlated with each subject’s size and weight, indicating that the data had been recorded correctly. From the data no one chair significantly reduced seat pressure, but the Sedus did distribute the pressure more evenly, particularly in the heaviest subject.
Discussion In this study static seat pressures were measured and, in keeping with previous studies, the highest recorded pressures were over the ischial tuberosities. The results from this study agree with the findings of both subjective evaluations of the six chairs. The Sedus chair showed slightly less pressure overall, and also better distribution of pressure with less of a hot spot under the ischial tuberosities and the sacral area.
Project discussion/conclusions The paired comparison was a quick and easy method of evaluating chair comfort. However, it gave no information as to why the Sedus was the most comfortable chair of the six evaluated. The short term study took each member of staff away from their workplace for approximately 45 minutes. The long term study took each member of staff away from their work for no more than 20 minutes and at the same time gave a huge amount of useful information on subjects’ preferences in chair comfort, areas of discomfort and features of each chair which increased or decreased their comfort. The time involved in the long term study was that of the individual responsible for the analysis of the collected data; the time cost for the paired comparison was more than twice that required for the long term study. There are literally hundreds of chairs on the market today described as ergonomic; unfortunately this does not mean comfortable. The methodology used in this study enables companies to involve their employees in selecting a comfortable chair for their use in an inexpensive way and ensures a truly ergonomic approach.
References
Akerblom, B. 1948, Standing and sitting posture with special reference to the construction of chairs. Doctoral dissertation, Nordiska Bokhandeln, Stockholm.
Keegan, J.J. 1953, Alterations of the lumbar curve related to posture and seating. Journal of Bone and Joint Surgery, 35-A, 589–603.
Shackel, B., Chidsey, K.D. and Shipley, P. 1969, The assessment of chair comfort. Ergonomics, 12(2), 269–306.
Guildford, J.P. 1954, Psychometric Methods (London: McGraw-Hill Book Co.).
Pheasant, S.T. 1990, Anthropometrics: An Introduction, 2nd edition (London: B.S.I.).
Corlett, E.N. and Bishop, R.P. 1976, A technique for assessing postural discomfort. Ergonomics, 19, 175–182.
NEW TECHNOLOGY
DEVELOPMENT OF A QUESTIONNAIRE TO MEASURE ATTITUDES TOWARDS VIRTUAL REALITY Sarah Nichols Virtual Reality Applications Research Team (VIRART) Department of Manufacturing Engineering and Operations Management University of Nottingham, University Park Nottingham, NG7 2RD [email protected]
The attitude of users towards Virtual Reality (VR) may influence their experience of the effects of VR use. This paper describes the development of a questionnaire to measure potential users’ attitudes towards VR. Responses from 167 questionnaires from staff and students at the University of Nottingham were analysed. A principal components analysis revealed eight factors that comprised an overall attitude to VR, including awareness, perceived usefulness, perceived desirability of VR and concerns about health effects of VR use. Experience of VR was found to be positively correlated with some aspects of attitude to VR, and men were found to generally have a more positive attitude to VR.
Introduction As Virtual Reality (VR) technology has developed over the last few years, it has become apparent that there are a number of effects, both negative and positive, that result from VR use. These effects include a feeling of presence (or “being there”) in a Virtual Environment (VE) (the “world” simulated by VR technology) and negative effects such as the experience of symptoms akin to motion sickness either during or after VR use. The influential factors on these effects have previously been classified as being associated with the VR system (e.g. headset design, resolution of visual display), VE type and design (e.g. number and type of objects, amount of interactivity afforded), circumstances of use (e.g. length of period of VR use) and individual user characteristics (Nichols et al., 1997). Many individual user characteristics may influence the effects experienced after VR use, including gender, visual characteristics or previous experience of motion sickness after travelling in cars, boats or planes, or experience of sickness after simulator use. However, the individual characteristic of interest here is the attitude of users towards VR. Questionnaires to measure attitudes towards computers have been found to comprise several contributing factors such as computer anxiety, enjoyment, satisfaction, perceived usefulness and experience (Loyd & Gressard, 1984; Igbaria et al., 1994). Attitudes vary amongst different areas of the population, make a contribution to effectiveness of computer use in the workplace and can be influenced by training (Torkzadeh & Koufteros, 1993).
Therefore if VR is to be implemented as a workplace tool it is useful to examine attitudes of users before VR use. A more positive attitude should lead to a more effective and enjoyable use of VR as a workplace or educational tool. If this relationship is found to exist, training programmes for VR use should have the improvement of VR user attitudes as a primary aim. However, there are aspects of VR which mean that a Computer Attitudes Scale cannot necessarily be used to measure attitudes towards VR. These aspects include the three dimensional, interactive and user-centred nature of Virtual Environments (VEs), the large amount of physical freedom available to the participant and the idea that a participant experiences a sense of presence in a VE. This paper presents a questionnaire developed to measure potential and existing users’ attitudes towards VR. The questions consider a number of aspects of VR use, including whether VR is thought to be a useful technology for work and leisure, whether people would be willing to use VR in their workplace, whether VR systems should include a headset, whether there are any health issues perceived to be associated with VR use, and whether VR use provides the opportunity for social interaction. In order for a questionnaire to be used as a statistical tool that provides parametric data, a number of statistical procedures must be performed. This analysis also allows the identification of component factors of an overall attitude towards VR. In addition, the relationships between VR attitudes, experience of VR, age and gender are examined.
Method Initial VR Attitudes Questionnaire Design An initial version of the questionnaire was developed. The items included were derived from the author’s knowledge of computer attitude questionnaires and experience with participants in VE experiments over the last 2 1/2 years. 45 items were produced. A short introduction to Virtual Reality and Virtual Environments was included with the questionnaire so that people who had not previously seen or used VR would still be able to complete the questionnaire. The scale used in the questionnaire was a five point Likert scale. In addition, for purposes of questionnaire development, respondents were given the additional option to select a “don’t understand” option, in order to identify those questions which were badly worded or confusing to those with low experience of VR, and to avoid a high central tendency.
Participants The participant sample consisted mainly of students and staff from the University of Nottingham who either completed the questionnaire during a lecture or were approached by the author and asked to return the completed questionnaire via internal mail. 284 questionnaires were distributed in total. 167 completed questionnaires were received (94 (56.3%) male, 73 (43.7%) female; response rate=58.8%). The mean age of respondents was 20 yrs 11 months (range=18–36 yrs, SD=3 yrs 1 month). The sample also included a wide spread of people with different levels of experience with technology in general and VR in particular for the purposes of test construction. As the development process continues and the questionnaire is used in experiments it is hoped that this population will be expanded to include other potential users of VR such as a military population.
Questionnaire Analysis Scoring Initially, the appropriate direction of scoring for each questionnaire item was estimated by observation. A high score indicated a positive attitude. If there was any doubt about the appropriate direction of scoring then the item was initially classified as “neutral” and the direction of scoring was later determined statistically (see “Item Analysis”). 16 questions were scored as “positive”, 15 as “negative” and 14 as “neutral”. Questionnaires where more than five responses of “don’t understand” were given were eliminated from further statistical analysis. It was felt that the responses on these questionnaires were unreliable and not suitable for inclusion in the factor analysis. This resulted in the elimination of data from eight respondents. Further examination of these respondents revealed that all but one had a low level of experience or knowledge of VR. The remaining respondent was in fact a VR programmer, which obviously resulted in him having a very high level of VR knowledge and experience. However, this respondent spoke English as a second language. Although this data is not suitable for statistical analysis at this stage, it is intended that this questionnaire should be suitable for use by people of all levels of VR experience and should be understandable by speakers of English as a second language; therefore this information is used in further examination of question appropriateness and wording.
Item Analysis Several types of item analysis were carried out. Firstly, the responses of “don’t understand” were examined by question. This revealed that 13 questions had more than 2.5% of respondents not understanding the question, 3 of which were not understood by more than 5% of respondents. No elimination of questions was made on the basis of this data alone; however, it was taken into consideration when further item analysis was completed. After the “don’t understand” responses had been examined, the data from the eight respondents were eliminated and the remaining responses of “don’t understand” were converted into neutral scores (3 points for positively and negatively worded questions) for all further analysis. At this stage the items previously classified as “neutral” were assigned a direction according to the direction of their correlation with the total of all items to which a direction had already been assigned. As a result of this, four more questions were positively scored and four items negatively scored. After this analysis one question was eliminated due to having a very high central tendency and having had six responses of “don’t understand”. After reliability analysis was performed, items which had a value of Cronbach’s Alpha <0.15 were eliminated. As a result of this, the five remaining neutral items were eliminated. This resulted in no neutral items remaining, and a final value of Cronbach’s Alpha of 0.886.
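A minimal sketch of the reliability step is given below. The response matrix is a random placeholder, and the elimination criterion is interpreted here as a corrected item-total correlation below 0.15, which is an assumption about what the "Cronbach's Alpha <0.15" criterion refers to.

```python
# Hedged sketch: hypothetical data; the 0.15 cut-off is read as an item-total correlation.
import numpy as np

def cronbach_alpha(responses: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items score matrix."""
    k = responses.shape[1]
    item_variances = responses.var(axis=0, ddof=1).sum()
    total_variance = responses.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

def corrected_item_total(responses: np.ndarray) -> np.ndarray:
    """Correlation of each item with the total of the remaining items."""
    totals = responses.sum(axis=1)
    return np.array([
        np.corrcoef(responses[:, i], totals - responses[:, i])[0, 1]
        for i in range(responses.shape[1])
    ])

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(159, 39)).astype(float)   # hypothetical 1-5 Likert scores
print(cronbach_alpha(responses))
print(np.where(corrected_item_total(responses) < 0.15)[0])     # candidate items to eliminate
```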
Factor Analysis A Principal Components Analysis was performed with Orthogonal Varimax rotation (see Table 1) after it was confirmed that the data met the criteria required for a factor analysis to be performed (Ferguson & Cox, 1993). The criterion of assignment of items to the eight factors extracted was a loading of at least 0.40. The factors were named after four independent observers were shown the items grouped according to their factor loadings and
Table 1. Factor loadings of questionnaire items
asked to assign titles to each factor. These four observers then discussed the titles they had thought of, and agreed on a suitable final name.
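The extraction and rotation step can be sketched as follows. This is not the author's analysis code: the item scores are random placeholders, and principal components of the item correlation matrix followed by a Kaiser varimax rotation are assumed as the concrete procedure, with the 0.40 loading criterion used to group items by factor.

```python
# Hedged sketch: hypothetical scores; eight components, varimax rotation, 0.40 criterion.
import numpy as np

def varimax(loadings: np.ndarray, gamma: float = 1.0, max_iter: int = 100, tol: float = 1e-6) -> np.ndarray:
    """Kaiser's varimax rotation of a loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3 - (gamma / p) * rotated @ np.diag((rotated ** 2).sum(axis=0)))
        )
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation

rng = np.random.default_rng(1)
scores = rng.normal(size=(159, 39))                 # hypothetical item scores
corr = np.corrcoef(scores, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
top = np.argsort(eigvals)[::-1][:8]                 # retain eight components
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])  # unrotated principal component loadings
rotated = varimax(loadings)
for factor in range(rotated.shape[1]):
    items = np.where(np.abs(rotated[:, factor]) >= 0.40)[0]
    print(f"factor {factor + 1}: items {items}")
```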
Relationship between individual characteristics and attitudes There were no significant correlations between age and either total VR attitude or any of the subscale totals. VR experience was found to be correlated with overall attitude to VR
(rs=0.349; p<0.001), VR Desirability (rs=0.301; p<0.001), VR Awareness (rs=0.457; p<0.001), VR Usefulness (rs=0.357; p<0.001) and Enthusiasm for VR (rs=0.354; p<0.001). Independent samples t-tests showed that responses of men and women differed in overall attitude (t=4.27; df=157; p<0.001), VR Health Anxiety (t=3.23; df=157; p<0.001), VR Desirability (t=2.43; df=157; p<0.02), VR Awareness (t=4.78; df=157; p<0.001) and Enthusiasm for VR (t=3.28; df=157; p<0.001). In all cases men were found to have a more positive attitude than women.
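The statistics quoted above can be reproduced in outline with scipy. The sketch below uses invented data and hypothetical variable names purely to show the form of the Spearman rank correlations and independent-samples t-tests.

```python
# Hedged sketch: invented data, illustrating the form of the reported tests only.
import numpy as np
from scipy.stats import spearmanr, ttest_ind

rng = np.random.default_rng(2)
n = 159
experience = rng.integers(0, 5, size=n)                      # hypothetical VR experience ratings
overall_attitude = experience * 2 + rng.normal(size=n) * 3   # hypothetical attitude totals
gender = rng.integers(0, 2, size=n)                          # 0 = female, 1 = male (hypothetical)

rho, p_value = spearmanr(experience, overall_attitude)
print(f"rs = {rho:.3f}, p = {p_value:.4f}")

t, p_value = ttest_ind(overall_attitude[gender == 1], overall_attitude[gender == 0])
print(f"t = {t:.2f}, p = {p_value:.4f}")
```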
Conclusions This preliminary analysis has shown that attitude towards VR is comprised of a number of factors, including awareness of VR capability and use, perception of how useful VR is, anxiety towards VR use and general enthusiasm for using VR. Previous experience or knowledge of VR was found to be strongly correlated with five of the factors and the overall attitude, indicating that experience generally has a positive effect on improving attitude. This analysis has succeeded in shortening the initial questionnaire. However, it may be the case that there is still some repetition in the questions, and further items could be removed. This could also result in the removal of some questions which were commented on as being ambiguous, such as those which did not distinguish between headset and desktop VR systems, and confusions between VR in general and VEs. It may also be that confusions such as these would be eliminated once general public knowledge about VR technology is increased, although actual differences in attitudes towards different types of systems may exist; if this is the case then this issue must be examined in detail. Finally it is important to note that there may be different requirements of a VR attitudes questionnaire from different types of users. One of the main benefits of VR technology is that it is not restricted to use by English speaking, able-bodied workers in higher education systems. The development of further versions of this questionnaire is proposed, where question wording is improved for speakers of English as a second language, shorter versions of questionnaires are developed for use with children, and single questions from each factor are used in a symbolic form for assessment of attitudes of people with learning disabilities.
References Ferguson, E. & Cox, T. (1993) Exploratory Factor Analysis: A Users’ Guide. International Journal of Selection and Assessment, 1(2), 84–94. Igbaria, M., Schiffman, S.J. & Wieckowski, T.J. (1994) The respective roles of perceived usefulness and perceived fun in the acceptance of microcomputer technology. Behaviour & Information Technology, 13(6), 349–361. Loyd, B. & Gressard, C. (1984) Reliability and factorial validity of computer attitude scales. Educational and Psychological Measurement, 44, 501–505. Nichols, S., Cobb, S. & Wilson, J.R. (1997) Health and Safety Implications of Virtual Environments: Measurement Issues. Presence: Teleoperators and Virtual Environments, 6(6). Torkzadeh, G. & Koufteros, X. (1993) Computer user training and attitudes: a study of business undergraduates. Behaviour & Information Technology, 12(5), 284–292.
ORIENTATION OF BLIND USERS ON THE WORLD WIDE WEB Mary Zajicek, Chris Powell, Chris Reeves* The Speech Project, School of Computing and Mathematical Sciences Oxford Brookes University, Gipsy Lane, Oxford OX3 OBP, UK Tel: +44 1865 484683, Fax: +44 1865 483666 Email: [email protected] *Royal National Institute for the Blind 224, Great Portland Street, London WIN 6AA
The aim of our work is to make the wealth of information on the World Wide Web more readily available to blind people. We wish to enable them to make quick and effective decisions about the usefulness of pages they retrieve. We have built a prototype application called BrookesTalk which we believe addresses this need more fully than other Web browsers. Information retrieval techniques based on word and phrase frequency are used to provide a set of complementary options which summarise a Web page and enable rapid decisions about its usefulness.
Introduction This paper describes the results of evaluation of BrookesTalk, a web browser for blind users developed at the Speech Project at Oxford Brookes University. The aim was to evaluate the utility of the multi-function virtual menubar provided by BrookesTalk with blind users including those based at the Royal National Institute for the Blind. The aim of the project is to provide a tool which will enable blind users to ‘scan’ web pages in the way sighted users do (Zajicek and Powell, 1997a) and find the useful information that is out there. BrookesTalk is designed to extract an information rich abbreviated form of a web page and present it using speech so that the blind user can make a quick decision as to whether the page will be useful or not. It also provides a means by which blind users can store particular sentences and mark places as they move around the web.
BrookesTalk Described BrookesTalk is a small speech output browser which is independent of conventional browsers and also independent of text to speech software applications using Microsoft speech technology. It includes the functionality of a standard Web browser for the blind (Zajicek and Powell, 1997b) such as pwWebSpeak(TM) in that it can break up the text part of a Web page into headings and links and read out paragraphs etc. However the main aim is to provide an orientation tool
for blind users in the form of a virtual toolbar of functions that will provide different synopses of a Web page to help the user decide whether it will be useful to them or not. Users can select from a menu comprising a list of headings, a list of links, a list of keywords, a list of bookmarks, an abridged version of the page, a list of scratchpad entries and a summary of the page, and can also reach and read out chunks of text which are organised hierarchically under headings. It is expected that the user will pick tools from this virtual toolbar which complement one another for the particular type of page under review. The list of keywords consists of words which are assumed (Luhn, 1958) to be particularly meaningful within the text. These are found using standard information retrieval techniques based on word frequency. Abridged text is also compiled of special sentences which have been isolated using trigrams. The scratchpad allows users to send any sentence they are listening to, which they consider important or worth noting, to the scratchpad simply by pressing a key. They can then play back lists of sentences linked to particular pages. The summary of the page includes author defined keywords, the number of words in a page, the number of headings and the number of links.
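A minimal sketch of the word-frequency keyword extraction described above follows. It is not the BrookesTalk source: the stop-word list is an illustrative fragment and the tag stripping is deliberately crude.

```python
# Hedged sketch of Luhn-style keyword extraction; stop-word list is an illustrative fragment.
import re
from collections import Counter

STOP_WORDS = {"the", "and", "of", "to", "a", "in", "is", "for", "that", "with", "on", "are", "all"}

def keywords(html: str, top_n: int = 10) -> list[str]:
    text = re.sub(r"<[^>]+>", " ", html)                 # crude removal of HTML tags
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

print(keywords("<h1>Equal Opportunities Policy</h1><p>The policy applies to all staff of the organisation.</p>"))
```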
Keywords and Abridged Text Explained Keywords The following examples of keyword extractions were found for three different Web pages shown in Table 1. Table 1. Examples of Keyword Extractions
An evaluation of the usefulness of keywords compared with headings or links is given in the next section.
Abridged Text When keywords are used, contextual information imparted by the position of a word in a sentence is lost. Extraction of three word key phrases, or trigrams, preserves some word position information. The technique is based on ‘word level n-gram analysis’ in automatic document summarisation (Rose and Wyard, 1997). To provide a measure of similarity, groups of words appearing together, rather than individual words, are compared.
To reduce the number of word-level mismatches due to the normal changes in spelling required by grammar, each element of a trigram was assigned the stem of a word rather than the word itself. The trigrams presented were ranked by frequency. Table 2
The RNIB Web page ‘Equal Opportunities Policy’, a 238 word document, gave the five trigrams of frequency >1 shown in Table 2. High frequency trigrams occur twice and low frequency trigrams once, providing at first glance little to distinguish between them. Many of the words in the trigrams are noise words which are required for grammatical correctness and are not content bearing. A summation of the frequency of the trigram, the number of content words in the trigram and the number of keywords in the trigram appears as the score for the trigram in Table 2, which also shows key trigrams for the ‘Equal Opportunities Policy’ page. Abridged pages were created by computing the key trigrams of a page, according to score, and then creating a page consisting of the sentences in which the trigrams appeared. Abridged pages on average worked out to be 20% of the size of the original text and, unlike keyword lists, are composed of well formed (comprehensible) sentences.
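The abridging procedure described above can be sketched as follows. This is a reconstruction under assumptions, not the BrookesTalk implementation: the suffix-stripping stemmer and stop-word list are illustrative stand-ins, the score simply sums trigram frequency, content-word count and keyword matches as the text describes, and the keyword set passed in is assumed to already be stemmed.

```python
# Hedged sketch of trigram-based abridging; stemmer, stop-words and scoring are stand-ins.
import re
from collections import Counter

STOP_WORDS = {"the", "and", "of", "to", "a", "in", "is", "for", "that", "with", "on", "are", "all"}

def stem(word: str) -> str:
    return re.sub(r"(ing|ed|es|s)$", "", word)           # crude illustrative stemmer

def sentence_trigrams(sentence: str) -> list[tuple[str, str, str]]:
    words = [stem(w) for w in re.findall(r"[a-z]+", sentence.lower())]
    return list(zip(words, words[1:], words[2:]))

def abridge(text: str, keywords: set[str], n_trigrams: int = 5) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text)
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update(sentence_trigrams(sentence))

    def score(trigram):                                   # frequency + content words + keyword matches
        content = sum(w not in STOP_WORDS for w in trigram)
        matched = sum(w in keywords for w in trigram)
        return counts[trigram] + content + matched

    top = set(sorted(counts, key=score, reverse=True)[:n_trigrams])
    return [s for s in sentences if top & set(sentence_trigrams(s))]

page_text = ("The policy applies to all staff. The equal opportunities policy is reviewed "
             "every year. All staff receive a copy of the policy when they join.")
print(abridge(page_text, keywords={"policy", "staff"}, n_trigrams=3))
```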
Evaluation Keyword Evaluation Preliminary experiments were performed to assess the usefulness of the keyword list as an indicator of page content compared to the headings list or the links list. Headings and keywords in particular were judged to be roughly comparable in that they provided a list of indicating words or short phrases. The argument for incorporating keywords in the BrookesTalk menubar is that it provides more flexibility for the user in summarising the Web page. If the author has truly encapsulated the meaning of subsections of the page in headings then headings should provide a significantly better indicator of page content than keywords. However headings are often represented as images which do not provide speech output, or are eye-catching rather than informative. In this case keywords could provide a better summary. The aim of BrookesTalk is to provide flexibility with a range of tools to aid orientation which can override many of the vagaries of Web page authoring.
Users’ perception of the usefulness of the representation was measured by asking them to evaluate the usefulness of describing a Web page using the three different types of summary representation: headings, links/anchors, and keywords. Twenty subject users were shown the different representations for six different Web pages. The pages were chosen to maximise variability. Subjects gave a score between 0 and 5 for each representation. The sum of the scores for each representation, together with the percentage of the total score it represented, was taken to give an indication of its effectiveness; results are shown in Table 3. We see that users perceived that keywords provide a considerable improvement on the use of links to orientate users to Web pages. Headings gave the best score, but the score for keywords was not significantly different. Table 3. Scores for different representations
Usability And The BrookesTalk Environment The prototype BrookesTalk was used by a group of blind users including those at the Royal National Institute for the Blind (RNIB). User acceptance was no problem as this group were committed to finding out what software is available for blind people. They were all technically able, although our ultimate goal is to develop software for non technical users so that all blind people can use the Web. Earlier versions of BrookesTalk required TextAssist software for the speech synthesis. This often required patching in Windows’95 and caused discouraging technical complications before getting started. BrookesTalk was then re-written to run on the Microsoft Speech engine for Windows’95. While increasing portability, the speech engine currently uses a lot of precious disc space. Both versions are available. BrookesTalk uses different voices for conceptually different parts of a Web page. This was appreciated by most but described as irritating by one. We plan to make different voices optional in the future. BrookesTalk runs without any visual display at all and does not run with another browser open. Users felt it would be useful to have the visual equivalent of the spoken page available at the same time so that sighted co-workers could be called in for clarification or work cooperatively with the blind user.
Evaluating The Functionality Of BrookesTalk Users were observed to rely heavily on one function rather than move between different summarising representations. They had been encouraged to try using the different functions to complement one another, as they provide different views of the page. Users said that they usually know what type of page they are searching for (research work, entertainment, product details etc.). They therefore know how useful headings are likely to be and can use keywords accordingly.
Surprisingly, one user orientated himself by using the movement between links key 90% of the time. We had not anticipated that he would build his conceptual model of the page by looking at what was behind it. This approach will be investigated fully! The abridged version of the page received most criticism. The trigram analysis could easily pick out the wrong trigrams as being significant, and important headings were frequently left out of the summary. The algorithm for picking trigrams is not very stable; it can easily be influenced by irrelevant words. It was suggested that trigrams should carry some kind of semantic weighting if they appear in the title or headings. The scratchpad worked well and provided an easy way of saving important sentences from the page. As yet sentences are linked to pages and a new scratchpad must be started with each new page. Users suggested that sentences could be tagged as related to search themes. In this way sentences from several different pages could be grouped by theme.
Conclusion Initial evaluation highlighted the potential usefulness of features designed to improve navigation, such as the use of keywords and page summary. Blind users emphasised the potential of a tool such as BrookesTalk to sort through what they referred to as the ‘increasing pile of paper that arrived on their desks’ during the working day. Methods used to translate HTML formatting to speech could easily be applied to other formatted documents. They felt that the product is fairly accessible for this stage of its development; but some problems exist with accessing links. Useful aspects included vocal notification of HTML markup (i.e. if text is a heading, alt-text etc.) and being able to move between headings and start speech from specified points in the text. By incorporating usability at such an early stage in the development process the system is more likely to meet user needs, with the next step being a more structured approach to obtaining feedback. This will allow faster communication of problems and improvements, ensure all functionality is assessed and highlight potential new areas of development.
References
Luhn, H.P. 1958, The automatic creation of literature abstracts. IBM Journal of Research and Development, 2, 159–165.
Rose, T. and Wyard, R. 1997, A Similarity-Based Agent for Internet Searching. Proceedings of RIAO’97.
Zajicek, M. and Powell, C. 1997a, Building a conceptual model of the World Wide Web for visually impaired users. Contemporary Ergonomics 1997, (Taylor and Francis, London), 270–275.
Zajicek, M. and Powell, C. 1997b, Enabling Visually Impaired People to Use the Internet. IEE Colloquium ‘Computers helping people in the service of mankind’, London.
“FLASH, SPLASH & CRASH”: HUMAN FACTORS AND THE IMPLEMENTATION OF INNOVATIVE WEB TECHNOLOGIES Adam Pallant and Graeme Rainbird
RM Consulting, Royal Mail Technology Centre, Wheatstone Road, Dorcan, Swindon SN3 4RD email to
This paper is based upon the results of a heuristic evaluation of early design concepts for a new Royal Mail Web site. The design concepts consisted of static page mock-ups incorporating innovative Web technologies as implemented in existing Web sites. The acceptability of these design features in light of the particular constraints imposed by the Web is discussed. It is argued that reliance on innovative technologies is likely to exclude a proportion of potential customers. The paper concludes that Web designers should strive for simplicity rather than innovation, thereby making their sites accessible to the widest possible audience while still fostering positive customer perceptions.
Introduction The World Wide Web (‘Web’) is increasingly regarded by corporations as an important point of contact with customers, as well as a potentially valuable sales channel. Royal Mail has had a presence on the Web since 1995, providing customers with information about its range of products and services. This original site was seen as outdated, however, and new designs incorporating innovative Web technologies were being considered. The principal business objectives of the new site were:
– To allow customers to learn about and purchase Royal Mail products and services.
– To exploit the opportunities of the Web and provide new ways for customers around the world to interact with Royal Mail.
– To foster positive customer perceptions about Royal Mail.
Human factors involvement was sought to evaluate the design concepts for the new Royal Mail site. The principal objective of the evaluation was to assess the ‘acceptability’ of the innovative technologies against established usability principles and design guidelines.
Design Concepts The concepts proposed by the design team consisted of a series of static, non-functional page mock-ups illustrating the proposed ‘look and feel’ of the site (see Figure 1 below for an example). These designs were based upon the use of innovative Web technologies already implemented in a sample of existing Web sites.
Figure 1. Mock-up of Web page for new Royal Mail site
The key features of the design concepts included:
– The display of an animated ‘Splash’ screen welcoming visitors to the site but providing no content in its own right (somewhat like a book cover).
– The use of horizontal ‘Channels’ and vertical ‘Services’ toolbar structures to support navigation around the site.
– The use of the ‘Flash’ plug-in extending the functionality of the Web browser by supporting the use of animated graphics, including ‘mouse-over’ animations and text fields, within the site.
– The use of ‘frames’, or separate page areas, housing distinct content and navigation functions.
– The display of page content in additional fixed width ‘pop-up’ Web browser windows.
A heuristic evaluation of existing sites was performed to assess the potential impact of these design features on the usability of the proposed Royal Mail site.
Usability The usability of each of a sample of existing Web sites was evaluated with particular regard to the implementation of the key design features described above. Issues were classified as either enhancing [✓], or reducing [✕], the usability of each site. A sample of the issues identified for each design feature is listed in Table 1 below. Table 1. Usability issues identified during heuristic evaluation of sample Web sites [key: ✓=enhanced usability; ✕=reduced usability]
The implementation of each design feature had a dramatic impact upon the usability of the sample sites. The results of the heuristic evaluation were therefore used to inform a list of recommendations intended to guide designers in successfully implementing these features in the Royal Mail site. The usability of the design features was also likely to depend upon the specific context of use. Users with relatively little Web experience, for example, were more likely to be confused by pop-up browser windows and mouse-over animation. Indeed, whatever their potential usability benefits, it was suggested that reliance on innovative Web technologies risked excluding a proportion of the potential user population.
Accessibility The major constraint of Web design is the variability of the potential task context and hence the difficulty in specifying user requirements. It is often impossible to predict who will visit a site, why they will go, and how they will get there (not to mention what they will do once there). The Royal Mail site was intended to be accessible to the widest possible audience, whatever their user characteristics (e.g. philatelists vs. Royal Mail employees), task requirements (e.g. ‘browsing’ vs. ‘searching’), or platform (e.g. hardware, software or system settings). It was argued that the difficulties inherent in making the site universally accessible were likely to be exacerbated by reliance on innovative technologies. For example, potential Royal Mail customers may not have had access to plug-in technology such as Flash for any of a number of reasons:
– They may have been unwilling to download the plug-in just to view the contents of the site, particularly if they were using slow modem connections.
– They may have been viewing the site with a text only browser, or have disabled graphics in their browser settings, and consequently have had no requirement for ‘animated graphics’.
– They may have experienced difficulties installing the plug-in. The installation process can be complicated, particularly when it starts to go ‘wrong’, as the following on-line ‘troubleshooting’ guide excerpt illustrates: “..If you can’t find the file NPSPL32.DLL, click on “Start”. Select “Find…Files or Folders”. Under “Name & Location”, enter “NPSPL32.DLL” in the Named field. Make certain you have selected the upper root of your hard drive (C: in most cases)…”
– Their browser settings may have prevented them from installing the plug-in. Having ‘Safety Settings’ set to ‘High’ in Internet Explorer 3.02 (the default setting), for example, prevents Flash from being downloaded. Similarly, having ‘JavaScript’ disabled in Netscape 2.x interferes with the installation of the plug-in.
– They may have been using a platform which did not support the plug-in. For example, Mac users must have installed Netscape 3.x or above to use Flash.
– Their server ‘firewall’ may have prevented them from downloading Flash.
– They may have been unable to find the plug-in. Several sites, for example, directed users to a ‘plug-in directory’ page, leaving them to search for Flash from among the dozens of available plug-ins.
Not only could reliance on innovative technologies exclude potential customers, it could cause severe problems for customers who do manage to access the site content. The latest technologies are often unstable, unpredictable and inadequately tested. Downloading plug-ins, opening additional browser windows, running animations and rendering frames are all prone to draining system resources and crashing even (or especially) the most up-to-date browsers. As one commentator notes: “The problem with ‘bleeding edge’ technology is that the blood on the floor ends up being yours.” (Bystrom, 1996).
Conclusion Innovative technologies have the potential to enhance Web site usability. The ‘mouse-over’ animations enabled by Flash, for example, can support users in identifying ‘clickable’ page areas (or links). While meeting the expectations of some customers, however, Web designs relying on innovative technologies are likely to exclude others. One approach to resolving the tension between innovation and accessibility is configurability: tailoring a customer’s experience of a site to suit their particular requirements. This approach has the advantage of allowing the site to exploit the latest technologies, while still leaving the content accessible to a wide range of potential customers. Such ‘configuration’ can be achieved in one of two ways:
– Automatically: by having the server determine the task context and act accordingly (‘server-push’). For example, the server can detect which browser a customer is using and send only data appropriate to that platform (a minimal sketch of this approach follows the list).
– Manually: by allowing the customer to select the version most appropriate to their requirements (‘client-pull’). For example, customers can be presented with options such as whether to view the site with or without frames, graphics, proprietary plug-ins etc.
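A minimal sketch of the ‘server-push’ option is given below. It is illustrative only: the page bodies and the capability test on the User-Agent header are hypothetical, and Python's standard wsgiref server is used simply to keep the example self-contained; it is not how Royal Mail's site was built.

```python
# Hedged sketch: hypothetical pages and a placeholder capability test on the User-Agent header.
from wsgiref.simple_server import make_server

RICH_PAGE = b"<html><body>Animated, frame-based version of the page</body></html>"
BASIC_PAGE = b"<html><body>Plain HTML version of the page</body></html>"

def app(environ, start_response):
    agent = environ.get("HTTP_USER_AGENT", "")
    # Placeholder test: anything not announcing itself as a graphical browser gets the basic page.
    body = RICH_PAGE if "Mozilla" in agent else BASIC_PAGE
    start_response("200 OK", [("Content-Type", "text/html")])
    return [body]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```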
Innovative technologies such as Flash, however, are essentially ‘presentational’ tools and as such stand at odds with the essential value of the Web as a universally accessible information system. Indeed, the original design of the Web and its underlying language (HTML) was based on encoding the meaning of information rather than its presentation. An alternative approach to Web design is simplicity: avoiding innovative technologies and making the site content accessible to the widest possible audience. Simple sites are generally easier to use, more stable, less error-prone, more broadly compatible and easier to maintain. Royal Mail subsequently determined that the use of innovative technology in their Web site was inappropriate and inconsistent with customer perceptions of the corporate brand as ‘reliable’, ‘solid’ and ‘traditional’. The new site can be seen at .
References
Bevan, N. (1997) “Usability Issues in Web Site Design.” in National Physical Laboratories site at .
Bystrom, C. (1996) “ThreeToad Browser Tips.” in ThreeToad MultiMedia site at .
Flanders, V. (1996) “Web Pages That Suck.” at .
Levine, R. (1995) “Sun Guide to Web Style.” in Sun On the Net site at .
Nielsen, J. (1997) “The Alertbox: Current Issues in Web Usability.” at .
Quinn, L. (1997) “Why Write Accessible Pages?” in Web Design Group site at .
Sullivan, T. (1997) “All Things Web.” at .
WORK STRESS
DETERMINING ERGONOMIC FACTORS IN STRESS FROM WORK DEMANDS OF NURSES DW Jamieson, RJ Graves
Department of Environmental & Occupational Medicine University Medical School University of Aberdeen Foresterhill Aberdeen AB25 2ZD
A number of authors have suggested that ergonomics may be of considerable use for tackling the root cause of work related stress by identifying and eliminating sources of stress from the workplace. Evidence from a number of sources suggests that nurses may be one of the groups most affected by work related stress within the National Health Service (NHS). The current study aimed to identify characteristics of NHS nursing tasks creating an imbalance between work demand and the ability of nurses to cope. This was achieved by developing a checklist questionnaire with the power to identify such factors. Results from 135 nurses (34%) identified a variety of task related factors contributing to stressful demand.
Introduction It has been estimated that stress related illnesses are responsible for more absenteeism from work than any other single cause (Rees and Cooper, 1992). The Health and Safety Executive (HSE) treat hazards leading to stress related illnesses in the same way as hazards leading to physical injury. Guidelines (HSE, 1995) state: “Employer’s…have a legal duty…to ensure that health is not…at risk through excessive and sustained levels of stress…from the way work is organised, the way people deal with each other at…work, or from the day to day demands…on their workforce.” The implication for UK employers is that they have a legal responsibility towards stress in relation to the Health and Safety at Work Act 1974 and the Management of Health and Safety at Work Regulations (HSE, 1992). Failure of employers to take this into consideration has led to several cases of civil litigation, the most significant being Walker versus Northumberland County Council in 1994, the first time an employee had been awarded compensation in such a case. Current approaches to work related stress have focused on establishing the root cause so that changes to the wider work environment can be made to reduce stress (primary intervention). An example of this type of approach was devised by the Organisation for Promoting Understanding in Society (OPUS, 1995) for the Health Education Authority, which aimed to tackle stress at an organisational level. As well as the organisational approach to stress, other authors have postulated that an ergonomic approach may be of considerable
benefit in reducing work related stress (see for example, Smith and Sainfort, 1989; Williamson, 1994). Despite the fact that these authors have suggested an ergonomic approach to work stress, there are relatively few studies that examine the use of ergonomic methods to respond to stress or which measure the stress related consequences of ergonomic interventions (Williamson, op cit.). Previous research has shown that high levels of occupational stress are experienced by all occupational groups within the NHS. It has been suggested that nurses may be one of the groups most affected by work stress within the NHS (Rees and Cooper, op cit.). Other studies of nurses have reported high levels of chronic tiredness, high rates of absenteeism and widespread job dissatisfaction (OPUS, op cit.). The aim of this study was to develop a tool which could identify ergonomic factors causing stress related work demands providing enough information to recommend stress reducing changes. Ergonomic factors were defined as characteristics of a task creating an imbalance between demand and individual coping. It was also intended that data provided by the tool could be used to determine the correlation between work demand and work stress, identify which groups reported the highest levels of stressful demand, and identify which tasks were found the most stressful. From previous research there appeared to be no ergonomic intervention strategies which had been tested or evaluated. Smith and Sainfort (op cit.) use work tasks as one of the main elements in their ‘balance theory of stress within an ergonomic framework’. According to this theory, if the task requirements do not match individual capability then this increases the probability of stress. From work such as this, it is suggested that an ergonomic approach may be of use. It follows from this that there is a need to establish exactly what it is about the task that causes the misfit. Of particular concern are factors related to the practicability and endurability of work which are unsuitable in that they may contribute to work which the individual finds unacceptable and unsatisfying, increasing the probability of stress. Once it has been established which tasks cause excess demands and which factors are responsible, it should be possible to make direct changes to the nature of the task in order to reduce stress. An ergonomic intervention strategy which could be used to deal with work related stress was developed for this study from the reviewed literature (see Figure 1).
Approach The current study was influenced by stage two of the control cycle for the management of stress. This involves analysing the possibly stressful situation, identifying psychosocial hazards or demands, and diagnosing the harm being caused. The measurement of stress should be based primarily on ‘self report’ measures of how people perceive their work and the experience of stress (see Cox, 1993). The current approach was to identify excessive demands caused by work tasks and determine whether these task demands led to symptoms of stress, as well as assessing the contribution of ergonomic factors. A checklist questionnaire with rating scale questions using the Likert scaling method was developed in four main stages. The first stage involved the identification of typical work tasks performed on a daily basis by the majority of the workforce under survey. This was done by conducting interviews with a random sample of nurses of varying grades on different wards.
Figure 1 Ergonomic intervention strategy to deal with work related stress
Nurses were asked to provide a breakdown of their daily duties and, from this, appropriate task descriptors were selected which were deemed suitable by nurses. The ten task descriptors included: making beds, washing dependent patients, lifting patients and paperwork. Stage 2 of the development of the questionnaire was the identification of demands potentially arising from work tasks. Cox (op cit.) states that the nature of work related stress hazards or demands can be classed as psychosocial and refer to job and organisational demands which have been identified as causing stress and/or poor health. From the nine categories identified by Cox, four demands were taken from the task design category (lack of task variety, meaningless work, under use of skills, lack of feedback) and four from the workload/work pace category (work too hard, too much work, lack of control over pacing, high levels of time pressure). Stage 3 was the identification of stress symptoms potentially arising from work demands. These were selected from the General Health Questionnaire (GHQ) developed by Goldberg (1972) which was designed to detect current diagnosable psychiatric disorders. Potential cases are identified on the basis of checking twelve or more of the sixty symptoms. As the
incidence of symptoms reported by a subject increases so does psychological disturbance and the probability of being a psychiatric case. For the purpose of this study the GHQ 28 was considered which contains symptoms selected via factor analysis. The symptoms are categorised as anxiety and insomnia, somatic symptoms, social dysfunction and severe depression providing disturbance scores for each. These categories have been consistently highlighted as symptoms of stress (see Cox, op cit.) and so the symptoms within them could be used to assess psychological stress, as opposed to general health. The twelve symptoms selected from the GHQ 28 included: not satisfied with carrying out a task (social dysfunction); feeling run down (somatic symptoms); lost sleep over worry (anxiety and insomnia); thinking of yourself as worthless (depression). More consideration of social dysfunction was relevant since stress was defined as an imbalance between demands and coping resulting in people behaving dysfunctionally at work. Stage 4 involved incorporating a section into the appropriate questions allowing respondents to list causative factors if they found a task ‘very’ or ‘excessively’ demanding. The Likert scaling method used in the checklist questionnaire produced three different types of score. The first was the specific task demand score which subjects awarded to each of the ten tasks in terms of the eight demands considered. The higher the score, the more demanding subjects found the task. The second type of score was the overall work demand score which was the sum of the specific task demand scores. The higher the overall work demand score, the more demanding subjects found their work in terms of the demands of the 10 tasks considered. The third type of score was the overall work stress score which was how many of the 12 stress symptoms subjects reported they had experienced as a result of work demands. The higher the overall work stress score, the more severe the level of psychological disturbance, as predicted by the GHQ. After a pilot study of the checklist questionnaire (n=78), and subsequent changes, the final version was sent, individually addressed, to 321 nursing staff on medical and surgical wards within a Hospital NHS Trust. Questionnaires from the pilot and main study were combined for the main analysis (n=399).
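To make the scoring scheme concrete, the sketch below shows one way the three scores could be computed from a respondent's ratings. It is a minimal illustration only: the 0–4 Likert coding, the variable names and the suggested correlation check are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the three questionnaire scores described above.
# The 0-4 Likert coding and all names are illustrative assumptions.

N_TASKS, N_DEMANDS, N_SYMPTOMS = 10, 8, 12

def specific_task_demand(task_ratings):
    """Sum a respondent's 0-4 ratings of one task on the eight demands
    (e.g. lack of task variety, high levels of time pressure)."""
    assert len(task_ratings) == N_DEMANDS
    return sum(task_ratings)

def overall_work_demand(all_task_ratings):
    """Sum the specific task demand scores over the ten tasks."""
    assert len(all_task_ratings) == N_TASKS
    return sum(specific_task_demand(r) for r in all_task_ratings)

def overall_work_stress(symptoms_reported):
    """Count how many of the twelve GHQ-derived symptoms were reported."""
    assert len(symptoms_reported) == N_SYMPTOMS
    return sum(bool(s) for s in symptoms_reported)

# With one demand score and one stress score per respondent, the association
# reported in the Results could then be checked with, for example:
#   from scipy.stats import spearmanr
#   rho, p = spearmanr(demand_scores, stress_scores)
```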
Results and Discussion Questionnaires were returned by 34% (n=135) of the sample. These were fairly representative of medical and surgical wards, grade, age, and level of nursing experience. In addition, one third of respondents reported no demand and slightly more reported no stress, which indicates there was not a strong response bias. A strong, statistically significant correlation was found between overall work demand scores and overall work stress scores (Spearman rho=0.58, p=0.00005). This suggests that the nature of the tasks was contributing to stressful demand, and so supports the search for ergonomic factors which might be responsible. It was also found that ‘D’ grades aged 20 to 29 working full time (n=32) reported significantly higher overall work demand scores (p=0.023) and overall work stress scores (p=0.0002) compared with the rest of the sample and were classed as high risk. The tasks reported as being the most demanding by the high risk group can be viewed as the tasks
causing the highest levels of work stress. Tasks were ordered according to the sum of specific task demand scores reported by each respondent in the high risk group, which gave an overall demand score for each task. The higher this score, the greater the probability of an individual performing that task experiencing stress. The most demanding task reported by the high risk group was making beds, followed by paperwork and then helping with doctors’ rounds. Comments taken from questionnaires returned by the high risk group indicated task related factors contributing to stressful demand; changing these factors might reduce the demand. An example of these comments in relation to paperwork is shown in Table 1.
Table 1 Example of factors contributing to demand and appropriate demand reducing changes in relation to paperwork
The current study has shown that it was possible to develop a tool which could assess stressful demand at work, identify contributory ergonomic factors, and suggest control measures. The results support findings from other studies in relation to the way in which work demand causes stress (for example, Smith and Sainfort, op cit.). The study can be viewed as a valuable addition to the limited amount of intervention strategies to deal with work related stress. The lessons learned from this study may be of help in guiding future research into stressful demand at work.
References
Cox, T. 1993, Stress research and stress management: putting theory to work, Health and Safety Executive Contract Research Report Number 61
Goldberg, D.P. 1972, The detection of psychiatric illness by questionnaire, (London: Oxford University Press)
Health and Safety Executive 1974, Health and Safety at Work Act 1974, (HMSO, London)
Health and Safety Executive 1995, Stress at Work (A guide for employers), HS(G) 116, (HSE Books, Sudbury, Suffolk)
Health and Safety Executive 1992, Management of Health and Safety at Work Regulations 1992, (HMSO, London)
OPUS, 1995, for the Health Education Authority, Organisational stress in the NHS: An intervention to allow staff to address organisational sources of work related stress
Rees, D. and Cooper, C.L. 1992, Occupational stress in health service workers in the UK, Stress Medicine, 8, 79–90
Smith, M.J. and Sainfort, P.C. 1989, A balance theory of job design for stress reduction, International Journal of Industrial Ergonomics, 4, 67–79
Williamson, A.M. 1994, Managing stress in the workplace: Part II, the scientific basis for the guide, International Journal of Industrial Ergonomics, 14, 171–196
A RISK ASSESSMENT AND CONTROL CYCLE APPROACH TO MANAGING WORKPLACE STRESS Rebecca J Lancaster Institute of Occupational Medicine 8 Roxburgh Place Edinburgh, EH8 9SU
A Health and Safety Executive (HSE) publication proposed that the assessment and control cycle approach, already applied to physical health and safety risks, be adopted to manage stress at work. The Institute of Occupational Medicine (IOM) has developed an Organisational Stress Health Audit (OSHA) using this approach. The feasibility of this was tested in a study commissioned by the Health Education Board for Scotland (HEBS). The OSHA is a three tiered approach. Stage One involves the identification of sources of stress. Stage Two investigates areas of major concern and generates recommendations for risk reduction. Stage Three evaluates the effectiveness of the recommendations made in reducing risk. This paper presents the background to this organisational approach and a discussion of its feasibility in managing workplace stress in an NHS Trust.
Introduction Stress is a real problem in the workplace, often resulting in high sickness absence and staff turnover coupled with low morale and performance. Various intervention strategies have been suggested to combat the detrimental effects of workplace stress. Murphy (1988) emphasised the following three levels of intervention, which have since been widely accepted: (1) Primary or organisational stressor reduction, (2) Secondary or stress management training, and (3) Tertiary, encompassing counselling and employee assistance programmes (EAPs). Whilst there is considerable activity at the secondary and tertiary levels, primary/organisational reduction strategies are comparatively rare (Murphy, 1988 and DeFrank & Cooper, 1987). An HSE publication (HSE, 1995), providing guidelines for employers on how to manage workplace stress, has recognised identifying and controlling causes of stress at source as the most appropriate strategy. The HSE recommend the assessment and control cycle approach for managing physical hazards in the workplace, e.g. Control of Substances Hazardous to Health (COSHH), and suggest that this same approach be adopted in controlling psychological stressors. The Institute of Occupational Medicine (IOM) has developed an Organisational Stress Health Audit (OSHA) for the identification and control of work-related stress. The feasibility of this was tested in a research study commissioned by the Health Education Board
for Scotland (HEBS). This study applied the approach in three organisations: heavy industry; telecommunications; and an NHS Trust. This paper describes its application in one of these, namely the NHS Trust. The OSHA is a three tiered approach, covering hazard identification, risk assessment, review of existing control measures, recommendations for improved control and evaluation of control. Stage One provides an organisational overview by identifying the presence or absence of work related stressors and opportunities for risk reduction, many of which can be implemented by the organisation without further external input. Stage Two focuses on investigating, in more detail, areas of particular concern identified in Stage One. Stage Three involves assessing the extent to which recommendations in Stages One and Two have been implemented and their effectiveness in reducing organisational stress. A database of known causes of work-related stress was compiled from the scientific literature and this formed the background to the OSHA. Over recent years, numerous researchers have carried out extensive studies designed to validate or indeed refute the existence of work characteristics which impact upon employees’ mental health (eg Cooper & Marshall 1976, Karasek & Theorell 1990, Warr 1992, Cox 1993, and Nilsson 1993). In general there is a high level of consensus concerning those psychosocial hazards of work which are considered to be stressful or potentially harmful (Cox, 1993). However, although aspects relating to non-work issues, such as the home/work interface, are acknowledged to some degree, there is a general lack of information relating to other factors which have been clearly shown to have a significant impact on mental health, eg physical hazards, industry specific pressures and company policies. The IOM researchers developed their approach based on all four components, ie Environmental, Physical, Mental and Social, rather than just the work content/context divide. By addressing these four components, the total work sphere, ie all possible work-related stressors, is investigated. In addition, by placing work-related stress within this acknowledged health and safety framework, there is more likelihood of stress being accepted and treated in conjunction with the other types of work-related hazards. Development of the stressor database involved ascertaining the traceability of the various ‘stressors’ to be included in the IOM approach. For the Environmental and Physical components, the majority of health related issues are already covered by legislation such as the Control of Substances Hazardous to Health Regulations (1988) and the Management of Health and Safety at Work Regulations (1992), and a checklist was collated using the current legislative documents on all work-related physical and environmental hazards. Knowledge concerning the latter two areas, ie Mental and Social, is not codified in this way, so a review of relevant research material was carried out to establish the full range of work-related factors which should be addressed in a comprehensive organisational stress audit. The OSHA is centred around semi-structured interviews tailored to the specific needs of the organisation under investigation. Representatives of all levels and functions within the organisation are interviewed. The line of questioning follows the known causes of work-related stress in the database.
The interviews are constructed from a database of questions relating to the following areas: Health & Safety, Organisational Structure, Communication, Management/supervisory skills, Training, Staff Support Facilities, Policy, Sickness Absence, Contracts/Terms of Employment, Changes/Incidents, Work Characteristics and ‘General’. The interviews themselves are undertaken by Occupational Psychologists due both to the nature of the study and the need to interpret and analyse the interviews as the sessions progress.
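To make the tailoring step concrete, the sketch below shows one way an interview schedule could be assembled from a question bank organised by the areas listed above. It is purely illustrative: the role labels, example questions and selection rule are hypothetical assumptions, not the IOM instrument itself.

```python
# Illustrative sketch of tailoring the semi-structured interview to the
# interviewee's function. The question areas come from the text above;
# the roles, example questions and selection rule are hypothetical.
QUESTION_BANK = {
    "Work Characteristics": [
        ("all", "Which day-to-day tasks do you find most demanding?"),
        ("manager", "How are strategic decisions communicated to your staff?"),
    ],
    "Sickness Absence": [
        ("manager", "What trends have you seen in sickness absence?"),
    ],
    "Staff Support Facilities": [
        ("all", "What support is available to you when workload peaks?"),
    ],
}

def build_interview(role):
    """Keep, within each question area, only the items relevant to this role."""
    return {
        area: [text for audience, text in items if audience in ("all", role)]
        for area, items in QUESTION_BANK.items()
    }

# e.g. an employee representative is not asked management-only questions:
employee_schedule = build_interview("employee")
```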
Method Preliminary information In applying the OSHA within the NHS Trust, a certain amount of background information was requested, including: issues pertaining to the economic and competitive climate surrounding the organisation; organisational structure; and data on trends in possible indicators of stress-related problems (sickness absence, staff turnover). This information was then used to determine a profile of the organisation and presented to the IOM internal stress team, which comprises business related staff including those experienced in finance and personnel management as well as appropriate scientific staff such as psychologists and occupational physicians. Issues were raised through this presentation which helped in constructing the semi-structured interviews.
Stage One A Steering Group was formed by the Trust to identify the representatives for interview and to discuss how the work would be communicated throughout the Trust. All directorates were to be included and guidance was provided on the roles and functions required for interview. Subsequently, directorate representatives on the Steering Group identified appropriate persons for interview. All selected interviewees were contacted by IOM auditors regarding possible participation and provided with background material on the study. When the final list of participants was collected, the semi-structured interviews were tailored according to the interviewee’s function. For example, it would be inappropriate to ask a Managing Director about specific work tasks and an Employee Representative about strategic decisions. Results of the interviews were then disseminated to the IOM internal team, presenting the sources of stress identified and possible opportunities for risk reduction. All issues were considered with regard to the potential impact both on employee health and on the organisation. A report was then presented to the organisation; some of its recommendations have since been implemented without further IOM involvement. The report also contained recommendations for Stage Two, proposing detailed investigations of the major concerns.
Stage Two A number of recommendations were made for further investigation and, from these, it was agreed that the role of the Charge Nurse should be examined, in particular the conflicting roles of ward manager and provider of patient care. The investigation involved focus group interviews, and a sample of charge nurses was asked to complete a number of published scales and a tailored questionnaire. These Stage Two investigations identified training, information and support needs of charge nurses aimed at reducing causes of stress, promoting health and well-being, and optimising performance.
Stage Three Due to the time constraints of the project, evaluation of the process within the Trust was limited to a review of Stage One via feedback questionnaires administered to interviewees and the Steering Group Leader. The interviewee questionnaire asked participants whether or not the issues that were important to them had been tackled and how they felt about the interview process. The questionnaire administered to the Steering Group Leader asked more about the organisation’s perspective of the process and the outcomes of the audit.
Results Stage One Stage One was successful in identifying a number of sources of stress, for example: pace of change; uncertainty about the future; and work overload, which is perhaps not surprising given recent changes in the NHS at large. The OSHA also successfully identified a number of sources of stress at a local level, including poor communication among certain groups of staff and poor relations between different professions.
Stage Two The Stage Two investigations confirmed many of the findings of Stage One, despite the fact that different individuals were interviewed. The following stressors were identified in both stages: poor communication; lack of feedback and a formal appraisal system; lack of clarity of the Business Manager role; and lack of support from Occupational Health. In addition the following stressors were identified in this detailed investigation: lack of allocated time to manage; duplication of effort in administrative tasks; poor management of change; and lack of accountability. Recommendations were made to reduce the risk associated with the stressors identified, an example of which is illustrated below:
Stressor: lack of allocated time to manage
Recommendation: There is a need for experienced staff at ward management level, who understand operational issues and yet have sufficient influence within the hierarchy to influence strategic development. Consideration should be given to a supernumerary Charge Nurse role with responsibility for a number of wards, and clinical management devolved to Charge Nurses.
Stage Three An average of 94% of interviewees reported that their interview addressed relevant causes of stress in the organisation.
Discussion The application of the OSHA in the Trust is part of an ongoing programme of work to tackle workplace stress. There are constant changes in the NHS and it may be argued that the individual Trust is limited in terms of what it can do. This study has demonstrated that there are a number of possibilities for reducing workplace stress at this local level. The approach follows the risk assessment and control cycle in terms of hazard identification; risk assessment; review of existing control measures; recommendations for improving control; and evaluation of controls. The approach has proved to be effective in meeting these steps. Although the Trust showed a willingness to implement the recommendations, due to the limited timescale of the study, evaluation of their impact on reducing stress at source has not been carried out. It is hoped, as part of this ongoing programme of work, that the impact of the changes will be evaluated. This evaluation is intended to include a review of the impact of changes on sickness absence, job satisfaction, and staff turnover throughout the Trust, as well as reviewing the impact on the role of Charge Nurses specifically by re-administering the standard questionnaires.
It was possible to ensure an organisational, cross-functional approach whereby all levels and functions within the Trust were represented in interview. There is the possibility of selection bias as the directorate representatives identified people for interview. However, in many instances, the selection was determined by job title rather than specific individual. It is recommended, in future applications of the approach, that an organisational chart be supplied to the external auditors (IOM Team) complete with job titles and names of post holders, so that they can select the participants in order to eliminate any selection bias. The success of the approach is due, in part, to its flexibility in meeting the specific needs of the organisation. This is achieved through the development of the company profile to allow tailoring of the semi-structured interview, coupled with in-depth information from representatives of the organisation. These are the main advantages that the approach has over existing audit tools which administer a ‘standard’ questionnaire to all employees. Organisations commented on the minimal disruption caused during administration of the approach. The commitment and enthusiasm of the company contact is crucial to the success of the approach and this person should be selected with great care.
Conclusions The study reported here has demonstrated the feasibility of addressing stress in the same manner as physical hazards in the workplace and adopting a risk assessment-hazard control approach to reducing stress at source. Using appropriately skilled staff, backed by others to advise on the interpretation and evaluation of findings, it has been possible to identify sources of occupational stress and to indicate avenues for risk reduction. These recommendations have been recognised as practicable by the Trust and some have already been acted upon. The timescales of the work precluded the inclusion of an evaluative phase to determine the success of the outcome in terms of reduced stress at work. However, a full evaluation is envisaged as part of this ongoing programme to tackle stress within the Trust.
References
HSC 1988, Control of Substances Hazardous to Health Regulations, (HMSO, London)
Cooper, C.L., Marshall, J. 1976, Occupational sources of stress: a review of the literature relating to coronary heart disease and mental ill health, Journal of Occupational Psychology, 49, 11–28
Cox, T. 1993, Stress research and stress management: Putting theory to work, Health & Safety Executive Contract Research Report No. 61/1993, (HMSO, London)
DeFrank, R.S., Cooper, C.L. 1987, Worksite stress management interventions: Their effectiveness and conceptualisation, Journal of Managerial Psychology, 2, 4–10
Karasek, R.A., Theorell, T. 1990, Healthy Work: Stress, Productivity and the Reconstruction of Working Life, (Basic Books, New York)
HSC 1992, Management of Health and Safety at Work Regulations, (HMSO, London)
Murphy, L.R. 1988, Workplace interventions for stress reduction and prevention. In C.L. Cooper, R. Payne (Eds) Causes, Coping and Consequences of Stress at Work, (Wiley, Chichester)
HSE 1995, Stress at Work: A guide for employers, (HSE Books, Sudbury)
Nilsson, C. 1993, New strategies for the prevention of stress at work, European Conference on Stress at Work—A call for action: Brussels Nov 1993, Proceedings, (European Foundation for the Improvement of Living and Working Conditions, Dublin)
Warr, P.B. 1993, Job features and excessive stress. In R. Jenkins, N. Coney (Eds) Prevention of Mental Ill Health at Work, (HMSO, London)
TELEWORKING
TELEWORKING: ASSESSING THE RISKS Maire Kerrin, Kate Hone and Tom Cox
Centre for Organizational Health and Development Department of Psychology University of Nottingham University Park Nottingham NG7 2RD
This paper discusses the risks to well-being which may be associated with teleworking and reports a small comparative study of ‘teleworkers’ and office workers from within the same organisation. The survey showed that in this organisation older and female workers were most likely to choose teleworking as an option. The survey also revealed the teleworkers used VDUs for longer and took fewer rest pauses than office workers matched for age and gender. However, these differences in working practices were not associated with any differences in health outcomes. This study also highlighted practical problems with applying current definitions of teleworking and the paper therefore presents an alternative conception to guide future research in this area.
Introduction There are various indicators that teleworking is becoming a more prevalent form of work (e.g. Huws, 1993). Furthermore, improvements in information technology and telecommunications mean that there is now the opportunity for more and more traditionally office-based work to be performed from an employee’s own home. This move is being encouraged by policy makers because of the potential to create jobs in rural areas and to reduce commuting. However, less attention has typically been paid to the effects which teleworking might have on individual workers. Clearly it is important to assess the psychosocial and physical impact of this new form of working and to ensure that appropriate measures are taken to protect worker well-being. In a recent review for the European Foundation for the Improvement of Living and Working Conditions, Cox et al (1996) have highlighted a range of hazards (both physical and psychosocial) which may be associated with teleworking. These include poor design of workstations, social, organisational and physical isolation, poor relations with work colleagues and superiors, lack of social support, poor work patterns, poor break taking habits, poor management, lack of promotion opportunities, conflicting demands of work and home, and lack of training. Some of these types of hazard are known to be associated with the experience of stress and/or poor health outcomes such as musculoskeletal problems and eyestrain (Cox, 1993). Furthermore, some of the potential hazards identified (e.g. isolation; workplace design for some classes of self-employed teleworkers) are not adequately covered
by current E.C. legislation (Cox et al, 1996). However, despite the potential importance of this issue, very little empirical research has assessed either the risks or the health outcomes associated with teleworking. There is some limited evidence that teleworking may be associated with poor health outcomes; for instance, an unpublished survey by the Institution of Professionals, Managers and Specialists (IPMS) of 103 teleworking professionals from the civil service in the UK found a higher incidence of health problems (such as eyestrain, backache and joint pain) than would be expected given the type of work being performed. Similarly a UK Employment Department survey of non-employee teleworkers carried out in 1992 found a number of workers with cramped and potentially dangerous working conditions and a high incidence of Work Related Upper Limb Disorders (WRULD) and other work related problems. However, as neither of these studies employed a control group it is impossible to determine whether the problems experienced were specifically related to teleworking or were merely indicative of the particular organisational culture or job type involved. A comparative study of teleworkers and non-teleworkers was carried out as part of the PATRA project (see Dooley et al, 1994). They found no evidence of an increase in work-related health problems in teleworkers compared to office workers (in fact office workers reported more eye-strain, headaches and upper limb pain). However, they do not report the number of respondents studied, nor what types of work were being performed, nor the demographics of the sample. Judging from other published work from the PATRA project (e.g. Dooley, 1996) it would appear that their work was based on a relatively small sample of highly heterogeneous teleworkers from over 7 different European countries, performing a range of different types of job for a range of organisation types. Given that these variables are likely to have an important impact on outcome measures it is unlikely that a study of this kind can tell us anything meaningful about the distinct effect which teleworking has on individuals. The aim of the survey reported here was to investigate the relationship between teleworking, work hazards and health outcomes while holding other work-related variables constant. The study compared teleworkers and office workers from within the same organisation, performing broadly similar jobs. During the first stage of the research questionnaires were sent to all members of sections of the UK part of the organisation where teleworking was offered as an option to the employees. The data collected at this stage was used to investigate any differences between those in the sample identified as teleworkers and those identified as office workers. Teleworkers were identified as those who used telecommunications and IT in order to perform their work at a location other than the central office, at least 2 days per week. This criterion was based on several existing definitions of telework (Cox et al, 1996; Huws, 1993). During the second stage of the research each teleworker was matched to a non-teleworker from the same organisation according to age and gender for statistical comparisons. This approach had the advantage over previous research of controlling extraneous variables. The survey assessed a range of hazards which can be associated with work and a range of health outcomes including general well-being and the incidence of WRULDs.
Given the paucity of existing data, and the conflicting nature of the findings which have been reported to date, no specific predictions were made regarding the expected results.
Method Respondents The survey was sent to ninety employees of a multi-national computer company who were free to choose whether to telework or work exclusively in the central office. Fifty-two questionnaires were returned. Thirty-eight respondents identified themselves as office workers. Fourteen identified themselves as teleworkers. However, of the self-identified teleworkers, only ten could be accepted as teleworkers using the criterion of working away
from the central office at least two days per week; the remaining four were excluded from further analysis. The demographics of the sample obtained during the first stage of the research are shown in Table 1.
Table 1. Sample demographics
In the second stage of the research the ten teleworkers were matched to ten of the office workers according to age and gender.
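As an illustration of the selection and matching logic just described, the sketch below classifies respondents against the two-days-per-week criterion and pairs each teleworker with an office worker of the same gender and closest age. The field names and the exact matching rule are assumptions for illustration only; the paper does not describe how the matching was implemented.

```python
# Sketch of the teleworker criterion and the age/gender matching described
# above. Field names and the nearest-age rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Respondent:
    age: int
    gender: str
    days_away_per_week: float   # days worked away from the central office
    uses_telecoms_it: bool      # uses telecoms/IT to work at that location

def is_teleworker(r):
    """Apply the criterion used in the study: remote work with telecoms/IT
    on at least two days per week."""
    return r.uses_telecoms_it and r.days_away_per_week >= 2

def match_controls(teleworkers, office_workers):
    """Pair each teleworker with the same-gender office worker closest in age."""
    pool = list(office_workers)
    pairs = []
    for t in teleworkers:
        candidates = [o for o in pool if o.gender == t.gender]
        control = min(candidates, key=lambda o: abs(o.age - t.age))
        pool.remove(control)
        pairs.append((t, control))
    return pairs
```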
Questionnaire Design The questionnaire was designed to measure both work-related hazards and negative health outcomes. Psychosocial, physical and organisational hazards were assessed using the Organisational Health Questionnaire (Smewing and Cox, 1995), the Work Characteristics Checklist (Atkins, 1995) and an Ergonomics check list for assessing VDU workstation design against the DSE regulations. Health outcomes were assessed using the General Well-Being Questionnaire (GWBQ)(Cox et al, 1983). Work-related upper limb disorder (WRULD) symptoms and eyestrain were assessed using the ‘Mannequin’; a pictorial representation of the human upper body on which respondents rate the frequency and severity of pain experienced.
Results During the first stage of the research, analysis of the demographics of the full sample (see Table 1 above) revealed that the teleworking group were significantly older than the non-teleworking group (t(38)=2.49, p<0.05). Furthermore there appeared to be an unequal distribution of male and female workers into the two types of work, with females being more likely to telework than males (χ2(1,40)=4.75; p<0.05). Because of the substantial demographic differences between the samples of teleworkers and non-teleworkers, the analyses reported below are based only on the matched sample of teleworkers and office workers. Note that statistical comparison of age and gender in these groups suggested that the matching had been successful.
Physical Hazards The teleworkers reported using their VDUs for longer (M=5.9 hrs/day) than the office workers (M=4.6 hrs/day) and taking fewer rest breaks (M=3.3 per day) than office workers (M=5.5 per day). Both of these differences were found to be significant (t(9)=4.78, p<0.05; t(9)=5.39, p<0.05 respectively). Scores on the workstation design checklist revealed that both sets of workers had workstations which met about the same number of the required design features (mean=14.4 out of a possible 22 for office workers, 14.33 for teleworkers). However, there was a greater range of scores for the teleworkers (7–20) compared to the office workers (12–19), indicating a wider variability in the adequacy of workstations.
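A brief sketch of the kind of matched-pair comparison reported above is given below. The figures are invented placeholders rather than the study data, and a paired t-test is assumed here because each of the ten teleworkers was matched to one office worker (hence nine degrees of freedom); the paper does not state which test was used.

```python
# Matched-pair comparison of daily VDU hours (placeholder data, not the
# study's). A paired t-test is an assumption consistent with t(9) above.
from scipy.stats import ttest_rel

teleworker_vdu_hours = [6.5, 5.0, 6.0, 5.5, 7.0, 6.5, 5.5, 6.0, 5.0, 6.0]
office_vdu_hours     = [5.0, 4.0, 4.5, 5.0, 5.5, 4.5, 4.0, 5.0, 4.5, 4.0]

t_stat, p_value = ttest_rel(teleworker_vdu_hours, office_vdu_hours)
print(f"t({len(teleworker_vdu_hours) - 1}) = {t_stat:.2f}, p = {p_value:.3f}")
```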
Psychosocial and Organisational Hazards Measures of perceived job control, job demand and organisational support (using Work Characteristics Checklist, Atkins, 1995) did not show any significant differences between the
teleworkers and non-teleworkers. There were also no differences in the two groups’ perceptions of the health of their organisation (as measured by the Organisational Health Questionnaire).
Health Outcomes On the WRULD Mannequin, office workers reported feeling discomfort due to their work in a mean of 3.7 areas (s.d.=2.26), while teleworkers reported feeling discomfort in a mean of 2 areas (s.d.=2.54). The mean maximum discomfort reported in any one body area was 1.8 (out of 5) for office workers and 1.2 for teleworkers. The mean total discomfort score (number of areas × severity) for office workers was 5.1 and for teleworkers was 2.9. None of these differences were significant. There were no significant differences between office workers and teleworkers on measures of general well-being.
Discussion The results of the survey did not provide evidence for any difference between the teleworkers and office workers in terms of either work related hazards or health outcomes. One exception to this general pattern was the finding that teleworkers were spending significantly longer working at a VDU and were taking significantly fewer rest pauses when working at the VDU compared to matched office workers. However, this potential hazard was not associated with any increase in negative health outcomes as measured by the Mannequin and the GWBQ. The demographics of the initial sample suggested that teleworking employees were significantly more likely to be older and female. The findings also indicated that teleworking was still a less popular choice than office working. As teleworking was optional within the organisation the results suggest that certain types of people are more likely to choose teleworking in preference to working at an office. This highlights the need for caution when interpreting the results of comparisons between teleworkers and non-teleworkers because the effects may be due to differences in the sample rather than due to teleworking (for example there tend to be gender differences on GWBQ scores which would prove a confound if more women are found to telework). The current study tried to avoid such problems by matching teleworkers to non-teleworkers. However, because relatively few of the respondents could be defined as teleworkers this led to a very small final sample size. It is interesting to note that while the initial sample of teleworkers available in the current study was small (N=14), the final sample used in the analysis was made even smaller by the strict cut-off criterion which was used to define who was and was not a teleworker. In fact some of those who defined themselves as teleworkers were actually spending a greater proportion of their time in the central office than those who defined themselves as office workers. Similar problems have been reported in other teleworking research (e.g. Dooley, 1996). This raises the question of whether current definitions of telework capture anything psychologically meaningful. It could be argued that while we know so little about teleworking and its effects on the individual, the introduction of arbitrary cut-off points to decide who to study and who not to study is unhelpful. Such cut-off points may actually lead to a loss of useful data about the impact of teleworking. For example it may be that some self-defined teleworkers make frequent visits to the central office in order to cope with problems brought about by the act of teleworking (such as isolation from work colleagues, lack of training opportunities, etc.). It would thus be beneficial to investigate the relationship between work characteristics, outcomes and the time spent in various locations, rather than simply using the time variable to distinguish between groups of teleworkers and non-teleworkers. Similarly the other main defining characteristic of telework is the use of Information Technology and telecommunications equipment in order to perform work away from the central office location. Researchers have argued about what kind of technology is needed before working at home can be deemed to be “telework” (i.e. is using a telephone and laptop enough or do you need a fully networked computer?). However, again it can be argued that while we know so little about the impact of teleworking, arbitrary technological cut-offs in choosing who to
study should be avoided in the same way as arbitrary time cut-offs. Thus it is proposed that researchers should study how outcomes vary with the sophistication of the technology employed rather than excluding those using particular types of technology. A new conceptualisation of teleworking can therefore be proposed in order to assist future research design. Based on the discussion above it is argued that teleworking can be conceptualised according to two core dimensions: (1) proportion of working time spent away from the traditional work environment, and (2) extent to which telecoms and IT are used for working away from the traditional work environment. Using these core dimensions of telework (CORDiT) it is possible to incorporate the useful aspects of previous definitions without their drawbacks. All workers can be placed along each of these two dimensions, and the relationships between these measures and key outcome measures can be assessed. These outcome measures would include those discussed in the current paper such as musculoskeletal problems, eye-strain and well-being. However, they could also include measures which will be of interest to organisations such as productivity. It is also necessary to note that the effects of teleworking on health and productivity outcomes may not be direct. There are very many possible intervening variables which may mediate or moderate the effect. These can be divided into variables which relate to the specific job (e.g. type of work, task demands), those which relate to the organisation (e.g. promotion opportunities, control over pacing), those which relate to the social and physical environment of work (e.g. isolation, work station design), and those which relate to home life (work/home interface, support from relatives). These variables will be valuable in understanding telework and how it has its effects on worker well-being. It is also important to note that these variables are not unique to telework (a view which seems to be implicit in much previous telework research) but apply to all forms of work. This is where the dimensional approach to defining teleworking is important because it allows office workers (who will score low on both core dimensions of teleworking) to be considered within the same conceptual framework as all types of teleworker. Thus it is proposed that future research should explore the impact of teleworking using a multidimensional framework such as CORDiT. Other important variables (such as job type/organisational culture) should either be controlled as in the current study, or, if enough respondents are available, included in analysis of the results. In this way it will be possible to substantially increase our understanding of how teleworking affects workers. Such understanding is vital if workers are to be protected from any potential ill effects of teleworking.
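A minimal sketch of how the proposed CORDiT framing might be operationalised is given below. The 0 to 1 scaling of the two dimensions and the choice of a well-being score as the outcome are assumptions for illustration; the paper does not prescribe any particular coding.

```python
# Sketch of the CORDiT idea: every worker, including office workers, is
# placed on the two core dimensions and outcomes are related to those
# positions rather than to a teleworker / non-teleworker label.
# The 0-1 scaling and the example outcome are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class WorkerProfile:
    time_away: float        # proportion of working time spent away from the
                            # traditional work environment (0 to 1)
    telecoms_it_use: float  # extent telecoms/IT support that remote work
                            # (0 = none, 1 = fully networked)
    wellbeing: float        # an outcome measure, e.g. a well-being score

def cordit_data(workers):
    """Return (time_away, telecoms_it_use, outcome) triples, so each
    dimension can be related to the outcome measure in later analysis."""
    return [(w.time_away, w.telecoms_it_use, w.wellbeing) for w in workers]

# An office worker scores low on both dimensions but stays in the analysis:
office_worker = WorkerProfile(time_away=0.05, telecoms_it_use=0.1, wellbeing=3.2)
```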
Acknowledgements The authors would like to thank Lisa Long for collecting the empirical data.
References
Atkins, L. (1995) Work characteristics and upper limb disorder in clerical and academic staff in university departments. Unpublished BA project, Department of Psychology, University of Nottingham.
Cox, T. (1993) Stress Research and Stress Management: Putting Theory to Work. Sudbury: HSE Books.
Cox, T., Griffiths, A., and Barker, M. (1995) The Social Dimensions of Telework. Report to the European Foundation for the Improvement of Living and Working Conditions.
Cox, T., Thirlaway, M., Gotts, G. and Cox, S. (1983) The nature and assessment of general well being. Journal of Psychosomatic Research, 27, 353–359.
Dooley, B., Byrne, M-T., Chapman, A., Oborne, D.J., Heywood, S., Sheehy, N. and Collins, S. (1994) The teleworking experience. In S.A. Robertson (Ed.) Contemporary Ergonomics. London: Taylor and Francis.
Dooley, B. (1996) At work away from work. The Psychologist, 9, 155–158.
Huws, U. (1993) Teleworking in Britain. A report to the Employment Department. Employment Department Research Series No 18.
EVALUATING TELEWORKING—A CASE STUDY Samantha Campion and Anne Clarke HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics. LE11 3RG Telephone: +44 1509 611088 email: [email protected] The TWIN project (TeleWorking for the Impaired, Networked Centres Evaluation) was involved in data collection from up to 41 teleworkers with disabilities at 10 pilot sites in 5 countries across Europe. This paper describes the evaluation methodology developed within the project and the results in terms of teleworking as an option for employment of people with disabilities.
Introduction The overall objective of the TWIN project was to assess the opportunities for development and interconnection of specialised telework centres aimed at the integration of disabled people in the labour market at a pan-European level. People with disabilities are under-represented in the workplace (Andrich & Alimandi, 1995) and teleworking was perceived to be a method whereby people with disabilities could be integrated into the working environment. The TWIN project was set up to explore this possibility through studying existing examples of pilot sites that were supporting teleworking for people with disabilities, both to determine the barriers and opportunities that exist and to distil information on best practice. The TWIN project involved seven partners, five of which were either pilot sites or were in touch with pilot sites. HUSAT’s role was assessment and evaluation of the existing pilot sites. Initially a literature review was carried out to identify previous work in the area, and it became clear that whilst actual reports of teleworking and case study information are many and varied, when it comes to how to evaluate teleworking, information is scarce. A methodology was developed within the project and was used to assess the situation at the pilot sites. First the methodology developed will be discussed, then a broad outline of the results will be reported.
A Teleworking evaluation methodology In the same way that teleworking itself imposes changes from traditional working methods on the worker, the manager and the organisation in carrying out the process of work (Gray, et al 1993; Huws, 1993; Korte and Wynne 1996), it also imposes a requirement for a specific evaluation methodology. This paper describes the methodology that was used within the TWIN project.
Objectives The objectives of the evaluation were to define a common methodology for monitoring and collecting data from the pilot sites in a form that would enable the output from the evaluation to feed into the major deliverable, a document reporting the global knowledge acquired during the project in terms of job opportunities; trans-national teleworking; efficiency and productivity; impacts on the employer, the networked centres and the teleworkers; rehabilitation opportunities; social impact and human factors issues.
Procedure It was found within the TWIN project that the most useful way to structure the evaluation to provide the necessary data was to split the issues into 3 main levels, these being individual, organisational and wider (societal) issues. Use of this method is corroborated by other studies of telework (Nilles, 1987; Korte et al., 1988, and Qvartrup, 1993). Baseline evaluation Initially the TWIN project undertook a ‘baseline’ evaluation of the pilot sites, individual teleworkers, potential teleworkers and trainees. Two questionnaires were prepared: a site questionnaire and an individual questionnaire. These questionnaires allowed a picture to develop of the present situation, leading to:
• Identification and selection of different groups of users and additional stakeholders who could take part in the evaluation.
• Identification and selection of comparison groups (controls) who could take part in the evaluation.
• Identification of opportunities for networking/teleworking between pilot sites.
It became clear after the initial data capture exercise that the circumstances at the pilot sites were not uniform. Whilst all the pilot sites were involved in promoting teleworking initiatives for the disabled, each site had found its own particular methods for doing this, and worked within its own national and commercial arena. Full Evaluation Taking into account the above baseline evaluation and the wishes of the pilot sites involved in the project, the full evaluation took the form of a number of longitudinal case studies. Given the diversity of the 10 pilot sites, in terms of organisational structure and access to teleworkers, the output of the monitoring and assessment to be reported back was defined centrally (by HUSAT). However, the method of collecting the data at each pilot site was decided locally. This allowed the most appropriate form of data collection to be used at each site. Where sites did not have the necessary skills, HUSAT performed a support role, and defined more closely the tools to be used to collect the data. A range of tools appropriate for the specific information being collected were made available to the pilot sites, who then chose which they would implement locally. This was felt to be the best method, given the cultural, language and structural differences that existed between sites. The information was then formatted by the local TWIN partner, and reported to HUSAT in a uniform and consistent manner. Information was then analysed and disseminated to all pilot sites, again in a uniform and consistent manner.
Results of using the methodology The evaluation methodology used within the TWIN project was very successful, in that it identified the main barriers and opportunities that existed in trying to implement teleworking for people with disabilities, and enabled the production of a document detailing best practice. There were particular areas that were observed to work well and others that needed improvement. These were identified as:
• Allowing flexibility of data collection. The TWIN partners reported that a semi-formal method of interviewing allowed a richer picture to develop and added value to the reporting process, allowing surrounding issues to be discussed as well.
• Repeated handling of data and translation issues. The effects of translation could have been alleviated by the employment of professional translators, or by the use of yes/no or tick box type answers as much as possible, to reduce the translation needs to a minimum.
• Frequency of reporting too high for levels of change occurring. The effect on teleworkers of having to respond to the monitoring activities became gradually more negative, resulting in lower quality and quantity of responses.
• Teleworkers interpreting that the monitoring was of them rather than of telework. In a minority of cases, the monitoring procedure was misinterpreted as a means for the pilot sites to monitor teleworkers’ own personal progress (or lack thereof through no fault of their own). This misunderstanding led to outright non-cooperation.
Results of the evaluation Opportunities identified Theoretically teleworking allows a new way of working where the work can come to the worker rather than the worker having to travel to the work. In this way people with disabilities can carry out tasks in their own time and within their individual capabilities. The other benefit that teleworking offered was an opportunity to be assessed without reference to disability, as the employer may well not be aware of the disability of the teleworker. Teleworking also offered the opportunity to carry out work whilst remaining in an adapted environment and without the requirement to travel, which can be difficult and can consume limited individual energy resources. Barriers identified Whilst teleworking can be seen to offer opportunities to people with disabilities, there are also barriers. The main barrier identified within the TWIN project, a factor in all but one of the participating sites, was the benefits trap. An assessment of disability is often an all-or-nothing assessment, which does not take account of fluctuations in the ability to perform work. People with disabilities who take on work when they are able find their benefits stopped completely, and may well find that when they are unwell and unable to perform work, they cannot claim disability benefit. Therefore, given the uncertainty of their ability to perform work in the future, and given the amount they would have to earn to make up for the loss of benefits, many people with disabilities felt that the risks were too great for them to take up teleworking. The situation was only found to be different in Greece, where benefits were not linked to working or not working, but were paid regardless. However, even in this situation there were barriers. The main barrier seemed to be the ‘culture shock’ that many of the people with
disabilities experienced. They had been used to having their time as their own, and found it difficult to carry out work to an acceptable quality and to a set timetable. However the work they were given was boring and repetitive, and support for the workers was limited. In Finland the workers were well supported and had access to suitable equipment and training. There seemed to be a real will to succeed, shown in the fact that the work continued after the end of the project; however, even in this case the level of actual teleworking employment was low. Another barrier for many of the teleworkers was lack of equipment. The cost of adequately equipping a teleworker was found to be beyond the means of many of the pilot sites. This is to some extent a ‘chicken and egg’ situation, in that to carry out work as teleworkers people need access to equipment and training, but the funds needed to finance that equipment and training are not available. Another barrier was employer perceptions. Teleworking is still a new concept to many employers, and they are wary of trying it. Combining trying to sell the concept of teleworking with trying to persuade employers to take on people with disabilities creates a double barrier for the pilot sites to overcome. There are prejudices against both teleworking and employing people with disabilities that are unfounded but very strong. This problem was highlighted by the fact that during the whole 18 months of the project, only one or two people managed to find work that was not directly supplied by a pilot site. In some ways these problems were understood before the project started, and some pilot sites attempted to act as ‘work brokers’ for the teleworkers. However there were limited resources and little willingness from employers to consider this form of employment as a valid and valuable one.
Discussion Although these results seem to indicate little success for the pilot sites and for the individual teleworker, there were opportunities above and beyond those of teleworking for the sake of earning a living. The access to equipment and training, even when limited, allowed the potential teleworkers to learn new technologies and skills. There was an improvement in quality of life, and even where work was not forthcoming, whilst many potential teleworkers found themselves frustrated, many more preferred the present situation to what had been available previously and were satisfied. In hindsight the project may have been over-ambitious in its aims. The access to technology and training, and the concomitant improvement in the quality of life of many of the people with disabilities that took part in the project, are laudable outcomes in themselves without the necessity of also finding employment in 10 months. Continuing the training, enabling people with disabilities to learn new skills and new technology, may eventually lead to work opportunities in time, so long as the training and access to technology continue. The project encountered the perennial problem of teleworking being seen as a panacea. Teleworking is not a job; it is a method whereby a job is performed. Therefore it is necessary to have a skill or activity, and then to deliver that skill or activity via teleworking; the absence of a skill still means poor employability. During the project one pilot site found itself inundated with calls from people with disabilities who wanted to work as teleworkers; however, when asked what service they intended to offer, they replied ‘teleworking’. They wanted to attend courses on how to become teleworkers, then offer their teleworking skills to employers. This misperception of teleworking is not uncommon, and led to resistance amongst employers.
In many cases there is a need to supply two sets of skills: the skills to perform a particular work role, and the skills to deliver that work role via teleworking. These extra teleworking skills will be in both technological and personal areas, including an understanding of the technology, use of modems and data transfer mechanisms, how to deal with the social and practical issues of teleworking, and self-motivation and time management, to name but a few.
Conclusions The evaluation methodology used within the TWIN project, of a baseline evaluation followed by a set of longitudinal case studies, allowed a rich description of the issues involved in teleworking for people with disabilities. The major opportunities and barriers were identified, and these were found to be consistent across the pilot sites. Although teleworking can provide a means by which people with disabilities can be integrated into the work environment, there are many barriers to be overcome. However, even where the training, skills and equipment have not yet led to employment, the situation can still be termed a success for the individual in terms of improved quality of life. The issue of whether teleworking is an enabling technology or a ghettoising of the disabled (i.e. where they are not introduced into the mainstream of the working environment, but instead are kept in their own homes) depends on the reasons for teleworking. It is recommended that teleworkers spend some proportion of their time ‘in the office’ to maintain links with the rest of the organisation. Teleworking is not a way for employers to avoid adapting the workplace to enable access. However, in extreme cases where the disability precludes these visits, working totally from the home or a telecentre is a viable alternative to no employment.
Acknowledgements This paper was based on work carried out by the TWIN project (TeleWorking for the Impaired, Networked centres evaluation) funded by the CEC under programme DGXIII. The authors wish to acknowledge all of the project partners who contributed to the success of the project.
TEAM WORKING
TEAM ORGANISATIONAL MENTAL MODELS: AN INTEGRATIVE FRAMEWORK FOR RESEARCH. Janice Langan-Fox, Sharon Code, and Geoffrey Edlund.
Department of Psychology The University of Melbourne, Australia.
In recent years, researchers from a broad and disparate range of disciplines have explored the utility of the ‘mental model’ (Rogers, 1993), ‘collective mind’ (Weick & Roberts, 1993), ‘cognitive maps’ (Langfield-Smith, 1992), and ‘team mental model’ (Cannon-Bowers, Salas, & Converse, 1993). In general, these diverse literatures present a confusing array of concepts and meanings with little coherence or systematic research. Further, much of the research stems from aviation psychology, where samples are homogeneous and topics uniquely applicable to defence scenarios (eg, cockpit behaviour) are favoured. The present research integrates cognitive, social and aviation psychology by presenting a framework for analysing ‘team organisational mental models’. This unifying model should be useful for the ergonomist and psychologist investigating shared cognition, group dynamics, and teamwork in organisations.
Introduction. Teams, organisations, and mental models: Building a framework for research. Recent reviews of the mental models literature (eg, Klimoski & Mohammed, 1994) have highlighted the lack of consensus on what is meant by the term ‘mental model’. Besides this problem of definition, there are empirical and theoretical problems in developing a conceptual model for research purposes. For instance, apart from several mental model frameworks (see Cannon-Bowers, Salas, Tannenbaum, and Volpe, 1995), there appears to be no theory of mental models which would enable accurate predictions in different contexts. What is needed is an integrative framework which links the mental models literature to those other variables of interest which have been associated with the concept, eg, situation awareness (Endsley, 1995) and team skills (Cannon-Bowers et al, 1995). The current paper addresses these issues through the development of a methodology which casts light upon the everyday problems and issues confronting people in the workplace, and which was developed through a series of pilot studies in a large, national, government business enterprise. The research aimed to answer the following questions: (a) What are the shared understandings between employees involved in problem solving teams, and how should these
be measured? and (b) What team dynamics variables contribute to the development of shared mental models? A more general aim was to assess the success of a training program designed to facilitate employee participation (EP) in problem solving teams about workplace issues. The training program emphasised the importance of reaching consensus and shared or common perceptions in solving these problems. The concept of mental models originated with Craik (1943), who argued that we construct internal models of the environment around us which form the basis from which we reason and predict the outcome of events before carrying out action. More recently, Rogers (1993, p. 2) defined mental models as “internal constructions of some aspect of the external world that can be manipulated enabling predictions and inferences to be made”. Wilson and Rutherford (1989) argued that mental models are constructed from the background knowledge that an individual has of a system or task, and consist of just those aspects which are needed to solve a problem at a particular point in time. Other authors (eg, Bainbridge, 1991) more typically use the term mental model to refer to certain contents of long-term memory. They argue that working memory is constructed from moment to moment, from the interaction of the contents of long-term memory (including mental models) and information from the environment. Here, the concept will mainly apply to the long-term representation of knowledge; however, it is acknowledged that the contents of a user’s mental model may be made available to working memory. In order to work successfully on decision making tasks, groups must perceive, encode, store, and retrieve information in a parallel manner; the quality of the group’s output will depend not only on the information available to individual group members, but also on the ‘shared mental model’ present in the group (the overlap between individual mental models). Shared mental models help team members to explain what is happening during task performance and to arrive at common explanations across members. In turn, these lead to the development of shared expectations for task and team demands. Cannon-Bowers et al (1993) suggest that effective team performance can only occur when team members share an understanding of both the task and team, and the general context in which they operate. Their comment underscores the importance of situation awareness, with the link between team mental models and situation awareness having been proposed by several authors (eg, Smith & Hancock, 1995). Whilst there are large numbers of studies in aviation psychology and group psychology, there appear to be few studies available on team mental models in organisations. As noted by Klimoski and Mohammed (1994), most researchers have simply assumed the existence of shared knowledge structures, and have not attempted to measure them. Further, no attempt has been made to empirically examine processes which contribute to team mental models in organisations. Thus various literatures were reviewed in order to develop an integrated framework for research: EP, team dynamics, and team mental models.
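As a purely illustrative aside (it is not part of the authors’ framework), the idea of a shared mental model as the overlap between individual mental models can be made concrete: elicit the set of concepts each team member associates with a workplace problem, then compute the average pairwise overlap. The sketch below, with hypothetical members and concepts, uses Jaccard similarity for this purpose.

```python
# Illustrative sketch (not the authors' method): quantify the 'overlap'
# between individual mental models as the mean pairwise Jaccard similarity
# of the concepts each team member associates with a work problem.
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Proportion of concepts shared by two members' models."""
    return len(a & b) / len(a | b) if (a | b) else 1.0

def shared_model_index(models: dict) -> float:
    """Mean pairwise overlap across all members; 1.0 = identical models."""
    pairs = list(combinations(models.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical elicited concepts for a three-person problem solving team.
team = {
    "member_a": {"deadline", "staffing", "quality", "customer"},
    "member_b": {"deadline", "staffing", "training"},
    "member_c": {"deadline", "quality", "training", "customer"},
}
print(f"shared mental model index: {shared_model_index(team):.2f}")
```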
Team Organisational Mental Models. EP team dynamics Much of the interest and support for research on groups can be attributed to their prevalence in organisations and society. Groups are widespread in modern organisations, with autonomous work groups, labour-management committees, product development teams, and
executive committees representing some of the many examples of groups that now form vital parts of organisational life. Understanding how such groups function therefore has important theoretical and practical implications. Researchers are also devoting more attention to how groups manage their external relations and adapt to changes in the environment (Argote & McGrath, 1993). Industrial democracy is defined as “the significant involvement of workers in the important decisions that affect their lives”, and is achieved through employee participation (Davis and Lansbury, 1996, p. 6). In recent years, EP has grown significantly in both Britain and the US, with approximately half of all large unionised manufacturing firms running ongoing EP programs. One difficulty in analysing problem solving groups at work, such as EP teams, lies in the fact that the ‘problem’ can be ongoing, and/or the output, product or solution is not readily observable or quantifiable, nor conveniently measured at one particular point in time (eg, at the beginning of team formation). For our purpose, it was relevant and appropriate to understand team processes rather than, say, group performance (eg, output in some form), given that the teams had been involved in solving different workplace problems for differing periods of time, in different types of work units. Because the research on group and team dynamics is extensive and spans a number of different domains, the team mental model literature was researched with a view to selecting those group or team variables which were directly relevant to team organisational mental models and to employee participation. These included status characteristics, teamwork behaviours and skills, individual differences, and situation awareness (or team-environment interaction).
Status characteristics: The link between shared mental models and group process Status characteristics theory (SCT) is a cognitive theory which contends that influence in small groups (eg, decision making groups) is caused by expectations about performance activated by status characteristics, “…any characteristic that has differentially evaluated states that are associated directly or indirectly with expectation states” (Berger, Fisek, Norman, & Zelditch, 1977, p. 35). Status characteristics may be external to the group (diffuse), or emerge during the course of interaction (specific). Diffuse status characteristics include gender, education, seniority, and work level. Expectations associated with a specific status characteristic are limited to a particular task or situation (eg, expert status). In social interactions, high status affiliates of a status characteristic are expected to possess greater general competence than low status affiliates. High status affiliates are thus treated by others as though they are more competent than low status affiliates. Once formed, this shared order of performance expectations tends to become self-fulfilling, by shaping the interactant’s propensity to offer goal related suggestions, and the likelihood that others will attend to, positively evaluate, or accept these. Such deferential behaviours in effect treat their recipient as a more valued contributor to the group goal, and consequently communicate situational worthiness. In turn, status processes may indirectly affect the development of team mental models, via their more direct effect on goal commitment, communication openness, and teamwork.
Teamwork behaviours and skills. Based on an extensive review of the past literature on team performance, Cannon-Bowers et al (1995) proposed that the specification of competencies for teams in the workplace should be derived from: (1) the requisite knowledge, principles and concepts underlying the team’s effective performance; (2) the repertoire of required skills and behaviours necessary to perform the team task effectively; and (3) the appropriate attitudes on the part of team members that foster effective team performance. On the basis of this review, Cannon-Bowers et al (1995) argued that teamwork can be defined in terms of the following skill dimensions: adaptability, shared situational awareness, performance monitoring and feedback, leadership/team management, interpersonal relations, coordination, communication, and decision making, and that these teamwork skills are instrumental to the development of shared mental models. Beyond these team-level skills, it is also important to investigate the contribution of certain individual differences.
Individual differences. Individual difference variables which have been hypothesised to affect the development of shared understandings in teams include tenure, group history, proximity, goal commitment, and perceptions of task difficulty. It should be noted that some individual difference variables (eg, tenure) have the potential to act as status characteristics, since individual differences may become salient when they discriminate between the actors in the situation, even when they are not directly connected to the task. Initially irrelevant characteristics will become involved in a social situation unless their applicability is challenged (eg, by previous information). If they are not challenged, the members of the group, as part of the normal course of interaction, will act as though they are relevant to the task.
Situation awareness (SA) and team-environment interaction (TEI). For the most part, SA has been applied in aviation teams and in military decision-making environments (eg, Endsley, 1990). Salas, Prince, Baker, and Shrestha (1995) proposed that the construct also applies to other types of teams. Orasanu (1990) described a crew’s shared mental model as arising out of the articulation of SA (interpreting situation cues) and metacognition (defining the problem and devising a plan for coping with it). Orasanu argued that casting a situation into a commonly shared frame of reference integrates information into an overall coherent picture (a mental model). The extent to which this picture can be achieved may depend on group process variables such as group history, and the quality of communication and trust between group members. Some authors see SA as a characteristic of the agent, who may be said to have good or poor SA on the basis of his/her propensity (Endsley, 1988). Others (eg, Smith & Hancock, 1995) argue that SA is not resident in the agent, but rather exists in the invariant interaction of the agent and the environment. It is also possible to talk about team SA, “the sharing of a common perspective among two or more individuals regarding current environmental events, their meaning and predicted future states” (Wellens, 1989, p. 6). Thus team-environment interaction was an important variable in the research framework. Teams have for many years been a feature of daily working life, with most organisations requiring workers to be involved in shared work activities and goals. It is timely to assess how team processes relate to constructs representing shared cognition about the team task and its importance, and to assess the extent of influence of various group
differences by virtue of group, individual, and status characteristics. The framework drawn above suggests the important variables in this endeavour.
References.
Argote, L. and McGrath, J.E. 1993, Group processes in organisations: Continuity and change. In C.L.Cooper and I.T.Robertson (eds.) International Review of Organisational and Industrial Psychology 1993 Volume 8, (John Wiley, Chichester)
Bainbridge, L. 1991, Mental models in cognitive skill. In A.Rutherford and Y.Rogers (eds.) Models in the Mind, (Academic Press, New York)
Berger, J.M., Fisek, H., Norman, R.Z. and Zelditch, M. 1977, Status Characteristics and Social Interaction: An Expectation-States Approach, (Elsevier, New York)
Cannon-Bowers, J.A., Salas, E. and Converse, S. 1993, Shared mental models in expert team decision making. In John Castellan, Jr. (ed.) Individual and Group Decision Making, (Lawrence Erlbaum Associates, New Jersey), 221–246
Cannon-Bowers, J.A., Tannenbaum, S.I., Salas, E. and Volpe, C.E. 1995, Defining team competencies and establishing training requirements. In R.Guzzo and E.Salas (eds.) Team Effectiveness and Decision Making in Organisations, (Jossey-Bass, San Francisco), 333–380
Craik, K. 1943, The Nature of Explanation, (Cambridge University Press, Cambridge)
Endsley, M.R. 1988, Design and evaluation for situation awareness enhancement. In Proceedings of the Human Factors Society 32nd Annual Meeting, (Human Factors and Ergonomics Society, Santa Monica), 97–101
Endsley, M.R. 1990, Predictive utility of an objective measure of situation awareness. In Proceedings of the Human Factors Society 34th Annual Meeting, (Human Factors and Ergonomics Society, Santa Monica), 41–45
Endsley, M.R. 1995, Toward a theory of situation awareness in dynamic systems, Human Factors, 37(1), 32–64
Klimoski, R. and Mohammed, S. 1994, Team mental model: Construct or metaphor? Journal of Management, 20, 403–437
Langfield-Smith, K.M. 1992, Exploring the need for a shared cognitive map, Journal of Management Studies, 29, 349–368
Orasanu, J. 1990, Shared mental models and crew decision making. Paper presented at the 12th Annual Conference of the Cognitive Science Society, (Cognitive Science Society, Cambridge)
Salas, E., Prince, C., Baker, D.P. and Shrestha, L. 1995, Situation awareness in team performance: Implications for measurement and training, Human Factors, 37(1), 123–136
Smith, K. and Hancock, P.A. 1995, Situation awareness is adaptive, externally directed consciousness, Human Factors, 37(1), 137–148
Weick, K.E. and Roberts, K.H. 1993, Collective mind in organisations: Heedful interrelating on flight decks, Administrative Science Quarterly, 38, 357–381
Wellens, A.R. 1989, Effects of communication bandwidth upon group and human machine situational awareness and performance (Final Briefing), (Armstrong Aerospace Medical Research Library, Ohio)
Wilson, J.R. and Rutherford, A. 1989, Mental models: Theory and application in human factors, Human Factors, 31, 617–634
THE IMPACT OF IT&T ON VIRTUAL TEAM WORKING IN THE EUROPEAN AUTOMOTIVE INDUSTRY Chris Carter and Andrew May HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics. LE11 3RG Telephone: +44 1509 611088 email: [email protected], [email protected]
The following paper is based on some of the results from the TEAM (Team-based European Automotive Manufacture) project. The objective of TEAM was to investigate how advanced IT&T (Information Technology and Telecommunications) tools could support working between virtual team members along the automotive supply chain, with the aim of improving product quality and reducing the costs and time-to-market of introducing a new vehicle. A TEAM software demonstrator was developed, based on the results of a user requirements exercise, and a series of workplace-based user evaluations was carried out involving manufacturer and supplier engineers. User and technical evaluations have shown the potential for improving communication and collaboration during the Product Introduction Process (PIP).
Introduction The European automotive industry is in the process of globalisation (Lamming, 1994; Simpson, 1996) and is characterised by increased delegation of technical responsibility for the design, production and integration of assemblies to Tier 1 suppliers and further down the supply chain. Global market pressures dictate increasing adoption of Concurrent Engineering, in which the integration of design and manufacturing activities and maximum parallelism of working practices are sought. This is becoming the de facto method for product development (Lawson and Karandikar, 1994), and to achieve it there is a fundamental need for technology to support distributed engineering teams (Londono, Cleetus et al., 1992). Despite the emergence of new technology, standard day-to-day communication and collaboration between European automotive engineers at different locations still centres heavily on telephone calls, face-to-face meetings, faxes and the post. These methods are often either inefficient, or not very effective at representing engineering issues and enabling interactive, real-time problem resolution. The TEAM project investigated how advanced IT&T tools such as audio-video conferencing, application sharing, shared whiteboards and distributed product data libraries might enable more effective and efficient working between geographically separated manufacturers and suppliers along the automotive supply chain. This paper outlines the evaluation approach, results and conclusions from the user evaluations carried out within the project.
TEAM Project Evaluations A user requirements exercise undertaken at the beginning of the project identified the basic requirements for virtual team working within the European automotive industry. A ‘best in class’ software demonstrator was then developed to run over heterogeneous hardware platforms, and a series of user and technical evaluations and demonstrations was undertaken in the UK, Italy, Ireland and France. A total of approximately 40 engineers in the UK and Italy took part in the user trials. All were potential end-users of TEAM-type technology in that they worked with colleagues in their company or other companies at remote sites. Initial trials centred on realistic scenarios of use involving engineers at manufacturer and supplier sites. These employed standard human factors evaluation techniques such as direct observation, questionnaires and structured interviews, and identified improvements and changes to the system functionality and ease of use; these were incorporated where possible. Evaluation criteria for the TEAM demonstrator were established, based on the efficiency, effectiveness and satisfaction of using TEAM to support PIP activities. The PIP is defined here as all of the stages necessary to go from product conception to volume manufacture, i.e. concept, feasibility, packaging, detailed design, pre-production validation, and on-going problem solving. The following evaluation criteria were used:
• Degree of actual use of TEAM for day to day virtual team working (to enable quantification of the resources used, relating to efficiency of working).
• Effectiveness of using TEAM for virtual team working (how well collaborative engineering activities were supported).
• Initial and ongoing support requirements for TEAM type technology when implemented in the workplace.
• Reliability of the hardware, software and networks in a working environment.
• Perceived and realisable benefits compared with current and alternative practices.
• Satisfaction with, and motivation to use, TEAM (all stakeholder perspectives).
• The impact of TEAM on the organisational and business activities.
• The extent of buy-in and commitment from stakeholders for implementation.
Final evaluations involved real working by engineers, solving live design issues on current vehicle programmes. These final user trials, employing the above evaluation criteria, are outlined below. Supplier and manufacturer engineers at Rover Group, Key Plastics and TRW Steering Systems Ltd undertook a series of collaborative sessions over a period of 4 weeks in the workplace. The evaluation approach taken was simple and robust, non-intrusive, and required minimal effort from the users. Session proformas were completed by the engineers for each collaborative session; these recorded the participants involved in the session, the engineering project under discussion, the aims of the session, the applications used, details of both electronic and paper-based information used, whether additional data were required or would have been useful during the session, how successful the session was with respect to its aims, the degree of overall satisfaction with the session, and improvements or changes to the system which would have made the session more successful. The collaborative working in the UK addressed the following design issues on new vehicles: the design of a Power Assisted Steering reservoir; the design of steering components; and the design of the PCB, casing and labelling for an engine management ECU.
In Italy, work centred around: the design of a dashboard system; design of a vehicle seat; problem solving on a door design; and work on a suspension system.
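The session proforma fields described above lend themselves to a simple structured record. The sketch below is purely illustrative: the field names and the rating scales are assumptions rather than the project’s actual paper proforma, but it shows how such session data could be captured for later analysis.

```python
# Hypothetical session record; not the TEAM project's actual proforma layout.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SessionProforma:
    participants: List[str]
    project: str                   # engineering project under discussion
    aims: str
    applications_used: List[str]   # e.g. shared whiteboard, Annotator, WebProM
    data_used: List[str]           # electronic and paper-based information
    extra_data_needed: bool        # would further data have been useful?
    success_rating: int            # how far the session met its aims (assumed 1-5 scale)
    satisfaction_rating: int       # overall satisfaction (assumed 1-5 scale)
    suggested_improvements: str = ""

session = SessionProforma(
    participants=["manufacturer engineer", "supplier engineer"],
    project="steering reservoir design",
    aims="agree mounting envelope",
    applications_used=["shared whiteboard", "WebProM"],
    data_used=["CAD image of reservoir", "packaging drawing"],
    extra_data_needed=True,
    success_rating=4,
    satisfaction_rating=3,
)
```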
After the final collaborative session, structured interviews were carried out with all the participating engineers, the systems support personnel, and business managers. The interviews addressed the requirements for a system such as TEAM, the benefits and drawbacks, and also looked at the wider issues concerning the future implementation of TEAM technology in the business, organisational issues and the impact on supplier-manufacturer relationships.
TEAM Technology A range of user trials was undertaken. Primary rate 6-channel ISDN was used as the Wide Area Network (WAN) to link the UK user companies; this offered the optimum price/performance ratio and a bandwidth of up to 384 kbit/s. Cisco 4500-M routers interfaced between the company Local Area Networks (largely based on Ethernet) and the ISDN WAN. In Italy, trials between Fiat CRF and Magneti Marelli were undertaken using 8-channel ISDN. Broadband trials were also carried out between Rover UK and Siemens Automotive (France) using a mixed SuperJanet/ATM network, providing 2 Mbit/s. The TEAM demonstrator comprised a suite of proprietary and bespoke applications. User sites employed differing software according to their specific requirements. The basic applications used in the UK were ‘Communique’ for audio/video/whiteboard conferencing, ‘TeamConference’ for real-time sharing of applications, ‘Annotator’ for on-line creation and management of minutes and ‘WebProM’ to provide a common data management environment within a distributed team. A key requirement for the TEAM system was that, as far as possible, the demonstrator should run over heterogeneous platforms, since a wide variety of platforms exist along supply chains.
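As a back-of-envelope illustration (not taken from the project documentation), the WAN figures above follow directly from the 64 kbit/s capacity of an ISDN B-channel. The sketch below reproduces the 384 kbit/s aggregate for six channels and shows a purely hypothetical media budget for one collaborative session.

```python
# ISDN arithmetic: each B-channel carries 64 kbit/s, so aggregate = channels x 64.
ISDN_B_CHANNEL_KBITS = 64

def isdn_bandwidth_kbits(channels: int) -> int:
    return channels * ISDN_B_CHANNEL_KBITS

for channels in (6, 8):
    print(f"{channels}-channel ISDN: {isdn_bandwidth_kbits(channels)} kbit/s")

# Hypothetical allocation for a session over 6 channels (illustrative figures only).
budget = {"audio": 64, "video": 128, "shared whiteboard / data": 128}
print("headroom:", isdn_bandwidth_kbits(6) - sum(budget.values()), "kbit/s")
```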
Results Summary of TEAM use During the narrowband sessions, the average session length was 35 minutes (range 20–50 minutes), with sessions fairly focused on specific design issues. One engineer at each site generally participated, with up to two additional participants for some sessions. The TEAM sessions generally took the following form: preparation of an agenda by the party calling the conference (using Annotator) and storage of this agenda using WebProM; joint viewing and discussion of the agenda using the shared whiteboard; importing of CAD images into the whiteboard (this was done as preparation in some cases); discussion of design issues and annotation of imported CAD images; and saving of whiteboard pages as minutes. The most useful applications were the shared whiteboard, WebProM, Annotator and the audio tool. There was also limited use of the video for showing physical components: images were captured and then imported into the whiteboard. Face-to-face video was not seen as useful for engineering discussions. For the types of design activities undertaken during the trials, application sharing of CAD packages was also requested; however, where this had been tested over 6-channel ISDN, the performance and reliability had been poor with complex CAD models. TEAM sessions were characterised as being very successful in achieving the aims of the sessions but, from the users’ point of view, only partly satisfactory. Where the aims of the sessions were not achieved, the main reasons were the lack of availability of key people and the need to consult specific data which had not been anticipated, and so an inability to reach concrete decisions. The main reasons that users felt only partly satisfied with TEAM were the poor usability (both during start-up and in use), the lack of robustness and problems due to the non-integration of the TEAM system with company networks.
Impact on the PIP TEAM was judged a substantial improvement over current PIP working practices, although the actual benefits were slightly negated by poor quality audio, a lack of robustness and ease of use, and a need for integration with other company systems. Overcoming these limitations would produce a major improvement in current collaborative engineering working practices. The time savings projected by the participating engineers for different stages of the PIP using TEAM are shown in Table 1.
Table 1. Projected time savings using TEAM, for stages of the PIP
The prerequisites for these savings and a more detailed cost-benefit analysis are given in May, Carter et al. (1997) and Distler, Carpenter et al. (1997). A major stated benefit was the ability for engineers and designers to exchange ideas more easily and hold more effective and detailed collaborative technical discussions than is possible over the phone, with less possibility of misunderstandings and ambiguities. It was still felt that face-to-face meetings (around a terminal if necessary) enabled the easiest technical discussions, but these were often inefficient due to the large amount of travel involved. For the impact on product quality, the view was that with TEAM a similar final product quality would be reached, but it would be reached earlier, as there would be more ‘right first time’ design. It was also felt that many changes during the build phase would be eliminated, leading to a reduction in time to volume and in the prototype phase. More detail could be built into the prototypes, allowing detail to be finalised earlier in the PIP.
Impact on roles and the company From an individual’s perspective, TEAM should reduce the time spent out of the office and increase the effectiveness and efficiency of an engineer. It is likely that engineers will feel they have more control over their jobs, as they will be able to react more quickly and effectively to design issues as they arise. Engineers stated that they did not want the introduction of technology such as TEAM to greatly affect current roles within companies; rather, it was seen as offering tools to help facilitate the way they work. However, if widely implemented, it is likely that TEAM will initiate some changes to company roles and the organisation. CAD engineers may need to become more involved in project management, and involved earlier in the PIP, and there will be an emphasis on using information more fully throughout the PIP. Some activities, such as quoting on tenders, will happen earlier. Responsibilities and authorities of individuals may need to be redefined, as workstation-based engineers will have powerful communication tools, but not necessarily the authority to make decisions using them. Companies and supply chains should feel more integrated, as communication tools reduce the isolation felt by sites which are geographically remote. Although it is likely that TEAM would reduce the frequency of traditional face-to-face meetings between customers and suppliers, it is seen as actually building stronger relationships between companies due to the more interactive nature of discussions and more frequent communications.
Conclusions The user companies are pursuing implementation strategies for the technology demonstrated in TEAM. Careful planning was recognised as essential, as with any system operating within a complex environment. A recommendation was made that the technology should be introduced at the beginning of a new vehicle programme so that appropriate systems, processes and working practices can be developed, and the problems of legacy data minimised. Initial emphasis should be placed on simple tools that improve the communication between customers and suppliers, before more complex functionality is offered. Before successful implementation, the following additional issues would need resolving:
• Agreement on an appropriate IT strategy between distributed team members.
• Resolution of costs, especially for smaller companies lower down the supply chain.
• Continuing awareness building amongst end-users and top management.
• The maintenance of company security for commercially sensitive project data.
• Integration of TEAM into company networks, for interoperability and data access.
• Major improvements in the reliability and ease of use.
• The further development by IT providers of collaborative tools that work satisfactorily over narrowband networks.
The IT&T demonstrated in TEAM offers distinct improvements over some current engineering working practices. Benefits include time savings across all stages of the PIP, easier and more effective discussions between distributed project teams, less possibility of misunderstandings or ambiguities, and an ability to react more quickly.
Acknowledgements This paper was based on work carried out by the TEAM (AC070) (Team-based European Automotive Manufacture) project, funded by the CEC under the ACTS (Advanced Communications Technologies & Services) programme DGXIII. The authors wish to acknowledge all of the project partners who contributed to the success of the project.
References
Distler, K., Carpenter, P., Caruso, P., D’Andrea, V., Doran, C., Fontana, P., Foster, P., May, A., McAllister, W., Pascarella, P. and Savage, R. 1997, TEAM (AC070) Deliverable DRR016: Cost-Benefit and Impact Analysis. München, Siemens AG.
Lamming, R. 1994, A review of the relationships between vehicle manufacturers and suppliers, DTI/SMMT Report.
Lawson, M. and Karandikar, H.M. 1994, A Survey of Concurrent Engineering. Concurrent Engineering: Research and Applications, 2(1), 1–6.
Londono, F., Cleetus, K.J., Nichols, D.M., Iyer, S., Karandikar, H.M., Reddy, S.M., Potnis, S.M., Massey, B., Reddy, A. and Ganti, V. 1992, Coordinating a Virtual Team. West Virginia, CERC–TR–RN–92–005, Concurrent Engineering Research Centre, West Virginia University.
May, A., Carter, C., Joyner, S., McAllister, W., Meftah, A., Perrot, P., Pascarella, P., Chodura, H., Doblies, M., Carpenter, P., Caruso, P., Doran, C., D’Andrea, V., Foster, P., Pennington, J., Sleeman, B. and Savage, R. October 1997, TEAM (AC070) Deliverable DRP013: Final Results of Demonstrator Evaluation. Loughborough, HUSAT Research Institute.
Simpson, G. 1996, Components of success—or failure—for the 21st century. Society of Motor Manufacturers and Traders Conference, ‘Driving tomorrow’s world’, Birmingham, UK.
WORK DESIGN
THE EFFECT OF COMMUNICATION PROCESSES UPON WORKERS AND JOB EFFICIENCY Anne Dickens and Chris Baber
Industrial Ergonomics Group School of Manufacturing and Mechanical Engineering University of Birmingham Edgbaston Birmingham B15 2TT
This paper examines the way that communication structures affect workers and job efficiency. It focuses on a company using manufacturing cells in which support is based away from the shopfloor. It was found that remotely based support led to sequential communication processes, which in turn meant long lead-times. Interviews with staff showed that being locked into a highly sequential and rigid system led to low levels of job satisfaction and to frustration at an inability to change the system for the better. A strong link was clearly shown between organisational design, communication processes and job satisfaction.
Introduction Previous research has shown that the majority of companies implementing manufacturing cells do not make changes to their organisational structure to embrace these flexible production methods (Dickens and Baber, 1997a). This paper determines the implications that this has for the effectiveness of communication processes. The effect that these communication processes have upon worker satisfaction and job efficiency is also examined. The following paragraphs describe the terms used in this paper. The organisational structure of a factory consists of two main systems: the manufacturing system and the support system. The type of manufacturing system in this study is cellular, i.e. the system is divided into smaller, autonomous manufacturing units. Every manufacturing cell requires assistance, and the support system provides this. Due to the nature of the study, no measurements are taken of cell communication structures, only of support system structures. A support system consists of up to thirteen support functions, each of which performs specific tasks. These functions are: information technology, maintenance, stores control, engineering, logistics, quality, design and development, human resource management, procurement, sales, marketing, production planning and finance (Dickens and Baber, 1997b). The results from a postal survey (Dickens and Baber, 1997a) show that the majority of support functions are centralised, i.e. not based upon the shopfloor, but in remote offices. Similarly, support is functionally divided so that people performing the same or
similar tasks are based together. This leads to a sequential approach to work. Bessant (1991) claims that flexible models of production are incompatible with older forms of organisation, especially those stressing division of labour and rigid bureaucratic forms. This study investigates Bessant’s claims by examining the effect of support system configuration upon communication processes. The company selected for the case study was representative in that it exhibited all the configurational traits described in the paragraph above: the company had two years’ operational experience of manufacturing cells, and centrally based support functions with functional division of work. The site had 800 employees and an annual turnover of approximately £60 million.
Methodology Scenario measurement was selected as the foundation of the case study. This means selecting a common factory procedure and charting its communication structure. A scenario called ‘scheduling’ was selected due to its criticality: if the scheduling process fails, the manufacturing cell stops. Scheduling extends from the forecast of demand to the delivery of the parts to the cell. The study examined the scheduling of one sub-assembly for one cell. The sub-assembly was typical in that it had to be ordered from an external supplier (outsourced), and required additional press-shop operations before delivery to the cell. The cell in question is representative in that it manufactures 29 varieties of one product and employs 15 full-time operatives and a cell manager. To measure the scenario, outline flow process charts were used (International Labour Office, 1979) because of their flexibility. This method not only allows the communication structure to be quantified, but also allows inferences to be made regarding job efficiency. Job efficiency for a support system is summarised using the following measures: lead-time, on-time deliveries, and the number of non-value-adding stages within a process. A preliminary examination of the scheduling process showed that two charts would be necessary: one documenting the physical stages of the scheduling process, and another monitoring flow through the Manufacturing Resource Planning (MRPII) system. Both systems operate in conjunction with one another to complete the scheduling scenario.
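As a minimal sketch of how the three job efficiency measures named above could be computed from a charted process, the following fragment uses invented stage names, durations and delivery outcomes; it illustrates the measures themselves, not the case-study data.

```python
# Illustrative only: derive lead-time, non-value-adding stage count and
# on-time delivery percentage from a simple representation of a process chart.
from dataclasses import dataclass
from typing import List

@dataclass
class Stage:
    name: str
    duration_days: float
    value_adding: bool

def summarise(stages: List[Stage], deliveries: List[bool]) -> dict:
    return {
        "lead_time_days": sum(s.duration_days for s in stages),
        "non_value_adding_stages": sum(not s.value_adding for s in stages),
        "on_time_deliveries_pct": 100 * sum(deliveries) / len(deliveries),
    }

# Hypothetical figures, not the figures measured in the case study.
process = [
    Stage("forecast demand", 5, True),
    Stage("raise purchase order", 2, True),
    Stage("await sign-off", 3, False),
    Stage("re-key data into MRPII", 1, False),
    Stage("press-shop operations", 4, True),
    Stage("deliver to cell", 1, True),
]
print(summarise(process, deliveries=[True, False, False, False]))
```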
Results The Communication Structure Each box in Figure 1 represents a single stage in the communication structure and each arrow indicates the direction in which the sequence proceeds. The unmarked arrows show that the process stage moves ahead unhindered. Arrows labelled ‘not OK’ show the path of the process if a problem occurs. Because the MRPII system and communication structure operate concurrently, certain facets overlap. As a result, the grey boxes in Figure 1 depict stages where MRPII is accessed in some way. The bold type in Figure 1 depicts the departments in which tasks are carried out.
Figure 1. Communication structure for scheduling scenario
The MRPII System Structure
Figure 2. Flow of information through MRPII system
In Figure 2, the lines on the diagram show information being either entered into the MRPII system or requested from it. The boxes represent the parties who process this information in some way.
Summary of Job Efficiency Only 3.23% of the components involved in the scheduling scenario arrived at the cell on time. The following table summarises the job efficiency measures.
Table 1. Summary of Job Efficiency Measures
* The sum of the two is still one month because they operate simultaneously
Discussion The results show that the scheduling process has become part of the critical path in terms of manufacturing lead-time. This means that the overall time taken to complete a product can be either lengthened or shortened by the scheduling process, and the manufacturing cell has no control over this. This is demonstrated by the fact that 96.77% of parts are delivered late. This is caused by inefficiencies within the communication structure, which are discussed further below.
Support System Configuration Because all support functions within the communication process are functionally based, a sequential approach to work has evolved; each function performs its own task before passing it on to another function. This means that communication takes longer, contributing to long lead-times. A sequential approach similarly leads to individual support functions having little understanding of the tasks performed by other functions. This manifests itself negatively in that individual functions become self-focused rather than team-focused. When interviewed, the staff felt that the system design was poor. They believed that the sequential nature of the system meant they were trapped in a rigid process, and were often blamed when parts were late. The latter was true even when it was the process, rather than the individual, which caused the delay. For example, when the manufacturing cell did not receive its parts on time, energy was spent apportioning blame rather than solving the fundamental problem of a poor communication structure. Staff also felt that their suggestions to improve the process were ignored by management. An alternative to functionally based support would be the use of multi-functional groups. This would allow the use of verbal face-to-face communication, which is a more time-effective method than paper systems.
The Communication Structure The complexity of the communication structure is demonstrated by the number of stages in the scheduling process. When the scheduling process is examined in conjunction with MRPII, there are 25 steps and 12 departments in total. The communication system evolved as the company expanded, and this lack of formal design is borne out by the complex, rigid network of links and loops. System re-design could eliminate up to 70% of these stages as non-value-adding.
The Use of MRPII All of the staff interviewed also distrusted the information contained within the MRPII system. Data was incorrectly entered into the system, resulting in inaccurate outputs. In addition, data was mistrusted because of two-way information flows (see Figure 2), involving forecasting (pushing information into the system) and ‘backflushing’ (pushing information back through the system once the product has been completed, in order to adjust stock levels). This results in system ‘lags’, making it difficult to judge the accuracy of information at any given time. It also means that information is constantly changing, which makes decision-making difficult and inaccurate for functions such as purchasing. These lags also negate the benefits of interfacing outsourced companies with the system. It is questionable whether MRPII should be used at all within the company. MRPII is a rigid system primarily designed to control ‘push’ systems and, as such, is often incompatible with flexible manufacturing systems such as cells. It is likely that a Kanban system would simplify the scheduling process.
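To illustrate the contrast being drawn here (this is not part of the case study), the sketch below models the pull logic a simple Kanban loop would substitute for MRPII’s forecast-driven push: a replenishment order is raised only when the cell actually consumes a container, and the fixed number of cards caps work-in-progress.

```python
# Minimal single-part Kanban loop between a cell and its supplier (illustrative).
from collections import deque
from typing import Optional

class KanbanLoop:
    def __init__(self, cards: int, container_qty: int):
        # Cards start attached to full containers waiting at the cell.
        self.cards_at_cell = deque(range(cards))
        self.container_qty = container_qty

    def consume_container(self) -> Optional[str]:
        """Cell empties a container; its card travels back as the order."""
        if self.cards_at_cell:
            card = self.cards_at_cell.popleft()
            return f"replenish {self.container_qty} parts (card {card})"
        return None  # all cards already out: WIP is capped, no over-ordering

    def receive_container(self, card: int) -> None:
        """Supplier delivers a full container; its card rejoins the cell."""
        self.cards_at_cell.append(card)

loop = KanbanLoop(cards=3, container_qty=50)
print(loop.consume_container())  # an order is triggered by actual consumption
```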
Conclusions The communication structure led to long lead-times, such that the scheduling process became part of the critical path for the manufacturing process, and was therefore capable of limiting throughput. The flow chart results show that the flow of information and components is overly complex and not conducive to a typical cellular manufacturing pull system, resulting in the need for re-design.
References
Bessant, J. 1991, Managing Advanced Manufacturing Technology: The Challenge of the Fifth Wave, (NCC Blackwell, Manchester)
Dickens, A., Baber, C. and Quick, N. 1997a, Support system configurations: design considerations for the factory of the future. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor and Francis, London), 510–515
Dickens, A. and Baber, C. 1997b, Distributed Support Teams in Manufacturing Environments. In S.Procter and F.Mueller (eds.) Teamworking, (University of Nottingham Print, Nottingham), 75–91
International Labour Office 1979, Introduction to Work Study, Third Edition, (ILO Publications, Geneva)
A CASE STUDY OF JOB DESIGN IN A STEEL PLANT. H.Neary and M.A.Sinclair
Department of Human Sciences Loughborough University Loughborough Leics., LE11 3TU
The paper outlines the design of a critical new job in a rolling mill. Comments were requested on the project of which this case study formed a part, and are given at the end of the paper.
Introduction The study took place in a large privately-owned Company, said to be one of the most efficient in the global steel industry. The plant for this study (the ‘Beam Mill’) is part of a large complex. Slabs of cold steel are reheated and rolled into I-shaped beams. These are large, structural beams, of the sort used for road bridges and oil platforms. Rolling mills typically are long, linear processes, working on batch sizes of one, with three batches on the process line and several in the reheating furnace being prepared to go on the line. Processing is 24 hours a day, 7 days a week. The Company has been carrying out incremental changes to its processes for many years. Most of these changes have been aimed at improving the quality of the product by introducing automation and new machinery; other changes have been aimed at reducing labour costs and increasing efficiency. This particular study had several objectives: to improve the productivity of the plant (by increasing throughput); to improve the quality of the product (by reducing dimensional variability); and to improve the flexibility and agility of the plant in its response to the marketplace (by improved layout of equipment, more integration of automation, better utilisation of storage areas, and a wider product range from this plant). The focus of attention was the ‘hot saws’ area of the plant. In this area, beams arrive which have been rolled to the correct cross-sectional shape from the original steel slabs, and are still very hot. The beams are cut to size, to fulfil customer orders, and go to the Banks until being called off for delivery to the customer. In May 1995, the Company began the ‘hot saws’ refurbishment with a budget of approximately £16 million, with the intention of completing the project by 1 September 1997. The development was costly and time consuming, involving several suppliers and numerous individuals from a wide range of backgrounds. The project involved reshaping of the process line; installing two in-line higher-capacity saws in place of the two old parallel saw tracks; replacing the IT applications and integrating them more into the whole production control system; refurbishing the control room; and reducing the manning levels by a ratio of 4:1. The hot saws development has had its problems and setbacks; for example, there were several serious delays with the software design. The software supplier, a sub-contractor to the saws supplier, had had no experience of the steel industry beforehand, and in May 1997 was about
ten months behind schedule in producing the ‘final solution’ for 1 September 1997. However, the installation of the equipment was to go ahead on schedule, without the new control system, with the new saws commissioned at the end of August 1997. Therefore, it was necessary to devise an ‘interim solution’, based on the current control system. In April 1997 work on the interim solution was begun within the Company, involving several IT and Engineering Departments. The solution was constrained by the tight timescale, and only limited development was possible. It would incorporate minimal changes to the current mill software system, yet use the new hardware. The consequence of this was that the ‘interim solution’ would require its own job description and training needs, different to those currently in place and different to those anticipated for the ‘final solution’; but this problem was temporary and could be solved during the commissioning period. Both solutions were developed concurrently, leading to uncertainty among the stakeholders regarding the actual operation of the new system. Training plans were being compiled by the Training Department in the Company. However, the main supplier had not yet provided the appropriate documentation and training in relation to the hardware. Therefore, the adequacy and timing of training were expected to cause problems for the workers trying to learn about the new system and their role within it before going ‘live’ on 1 September 1997. These problems would be exacerbated because the ‘interim solution’ had not had sufficient time in its development to have been properly user-tested. The Case Study was carried out over the summer of 1997, concentrating on the redesign of the Saw Controller’s job in the hot saws Control Room, and dealing only with the ‘final solution’.
Problem definition for the Case Study The goals set for the Case Study were as follows.
• A description of the new Saw Controller’s job (i.e. the ‘final solution’).
• Knowledge requirements for the job, as a basis for a training needs assessment.
• As a subsidiary task, comment on the approach adopted in the project, as a contribution to self-awareness and organisational learning within the Company.
Execution of the Case Study A ‘user-centred’ approach was adopted. Stakeholders (i.e. people directly affected by any changes) in the hot saws development were initially identified by discussion with the project managers for the development, and interviews started in May. ‘Snowballing’ was used to expand the initial, basic set of stakeholders. This method is known to have problems when applied to a general population, but in a structured organisation it was deemed satisfactory. The first interview was a pilot interview, involving the Unit Trainer from the hot saws, to give a better insight into the roles of operators at the hot saws and to develop a set of questions that could be posed to the saw operators. The interviews included task analysis of the existing job and what the operators expected to change in the new development. Their personal opinions about the full situation were also sought. The interviews were conducted on a one-to-one basis. Some problems were experienced with shopfloor operators at this stage, for resistance-to-change reasons. On average an interview lasted between 30 and 40 minutes. In total, including stakeholders in management positions, 35 interviews were conducted. Management interviews were more exploratory and individualised than those for shopfloor operators, because each manager had a different background and a different input into the hot saws development project. As well as conducting interviews in the Beam Mill, visits to other rolling mills both on and off site were arranged. These visits provided further insight into the automation to be introduced and operators’ opinions regarding the automation, its effects on their jobs, and so on. An advantage of the incremental improvements approach is that there are always
analogies to the current improvement somewhere fairly close, which can act as guides for the current development. Observational studies were also conducted at the hot saws to establish the sequence of events, time cycles, and so on, to enable a simulation of the workloads for the Saw Controller under normal and worst-case scenarios to be undertaken. On the basis of findings from these different approaches, a new job description was developed. This involved three lengthy interviews with relevant members from management followed by a number of workshops, with the following aims:
• to incorporate the expertise of those knowledgeable about sawing operations;
• to ensure that the new job description was realistic;
• to engender a degree of ownership of the final description by those likely to undertake the job.
The interviews with management established the ‘official’ job description. The first workshop involved the unit trainer from the hot saws. The workshop’s aims were to develop a clear statement (in flowchart form) of the tasks involved in the new operation of the hot saws, in full automation mode and in manual mode, as well as a list of critical incidents that might occur. For the subsequent workshops, two saw controllers, as well as the hot saws unit trainer, provided knowledge requirements for the operation of the hot saws in the final solution, using a new technique described in another paper in this conference by Siemieniuch et al. The participants in these workshops were recommended by management as being senior personnel with a good understanding of the likely Saw Controller’s job for the final solution. A knowledge tree was devised from these workshops, which was then utilised to develop the knowledge and skills required to perform the Saw Controller’s job.
New job description for the Saw Controller Outline of the old working context and job. Before September 1997, four operators per shift were in the saws control room: a Saw Controller and a Saw Driver for each track. On the shopfloor, there was one Section Controller, one Rolling Utility Man, and two Saw Utility Men. Each of the tracks had one saw blade located at a fixed point on the track. The control room is located to the side of the tracks and the operators have direct vision of the hot saws. The Saw Controller decided the sequence in which the customers’ orders were cut. Bars were identified by ‘mill codes’, used to assist the tracking and processing of bars through the mill. The Saw Controller had a number of objectives to follow when deciding this sequence:
• to cut priority orders first;
• to limit the number of orders open at any one time (usually 3 orders per saw);
• to ensure that the steel is being cut at the optimal rate;
• to ensure that maximum yield is achieved from each beam.
This was the manual mode of operation, in which the Saw Controller entered manually into the computer both the order and item number, obtained from the ‘saw sheets’ (shift schedules based on customer orders). It was under the Saw Controller’s instruction that the Saw Driver cut the steel. There was a second method of operation, called the ‘optimisation method’, in which a software algorithm decided the cutting pattern, monitored by the Saw Controller. The Saw Driver’s job did not change in the optimisation method. In theory, the Saw Controller had only to monitor the technology and deal with unexpected situations.
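The kind of decision the ‘optimisation method’ automates can be illustrated with a simplified sketch (this is not the mill’s actual algorithm): given a hot beam of known length and the outstanding order lengths, choose cuts so that yield, the proportion of the beam turned into saleable product, is as high as possible. A greedy longest-first heuristic and hypothetical lengths are used purely for clarity.

```python
# Simplified, illustrative cutting plan: greedily select the longest order
# lengths that still fit in the beam, then report the resulting yield.
from typing import List, Tuple

def plan_cuts(beam_length_m: float, order_lengths_m: List[float]) -> Tuple[List[float], float]:
    cuts, remaining = [], beam_length_m
    for length in sorted(order_lengths_m, reverse=True):
        if length <= remaining:
            cuts.append(length)
            remaining -= length
    yield_fraction = sum(cuts) / beam_length_m
    return cuts, yield_fraction

# Hypothetical figures: a 24 m beam against outstanding customer order lengths.
cuts, beam_yield = plan_cuts(24.0, [11.0, 9.5, 7.0, 6.0, 3.5])
print(cuts, f"yield = {beam_yield:.0%}")
```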
Description of the new working context and the proposed job In the new situation the Saw Driver role has disappeared, and there is only one Utility role in the saws area. The new hot saws operation will be almost fully automated. There will be a single
saw track through which all bars will travel. There will be two saws, one fixed and one movable. Beams will be cut in accordance with 'load plans' (customer orders recategorised into loads leaving the plant) and 'mill codes' will be eliminated. This will change many of the processes within the system. The manning level for the saws control room will be reduced from four operators to one. The role of the operator will be to monitor the sawing process and the operational technology. However, it was calculated that under 'worst case' conditions a single operator could not cope. The 'worst case' scenario is when the load plan requires short beams of minimal cross-section and the process has to be operated in full manual mode (i.e. the order of cutting has to be calculated, and then manually controlled). This would be too demanding for one individual. A management decision would then be necessary, either to operate at a higher manning level or to stop the mill. The job will comprise the following tasks:
• Initial preparation, and re-start after maintenance. The operator runs a 'Virtual Bar' past the hot saws.
• Blade Changes and Beam Section Changes. The Saw Controller ensures that the whole mill has slowed down and that all technicians have been informed, and then monitors the change.
• Normal Rolling Duties. The Saw Controller has only a passive, monitoring function to perform.
• Detection and Prevention of Problems/Reaction to Alarms. The Saw Controller will use numerous computer interfaces to monitor equipment and to prevent time delays and critical incidents by early detection of problems. This requires communicating with other parts of the mill to maintain 'situation awareness'.
• Manual intervention to carry out a standard task. Several different levels of manual intervention are possible with the new automation, including full manual operation. It will also be possible to interrupt the automation, mid cycle, to cut a test piece, etc. The system will then be re-instated, and would re-optimise orders for the remaining runout length.
• Carry out Secondary Tasks. These are:
- general housekeeping tasks in the control room (the operator is also expected to clean, paint and de-scale equipment on the mill);
- assisting the mechanical and electrical engineers during maintenance tasks, by operating the equipment at the request of the engineers;
- paperwork: the operator will log any incidents, subsequent events and actions taken, as part of good management practice. Manual proforma sheets will be provided, as will start-up check sheets.
Training and selection issues The above tasks, defined as the Saw Controller's job, are possible to complete in the available time, when the hot saws equipment and software are fully operational, and given careful selection and training of the operators involved. However, full training plans were not available during the case study, so no comment on them is possible. Nevertheless, a significant change in the job requirements and in the working environment will occur, and preparation of the workforce is a critical issue. It should be noted that, from a lifecycle perspective, the operators in a plant may be seen as the "designers' on-site representatives", responsible for realising the cost-effectiveness of the plant as planned by the designers, and for developing more efficient ways to operate the plant (Rasmussen and Goodstein 1986). Hence, it is necessary to ensure that the operators know not only what to do, but also why they are doing it; i.e. that they have some insight into the design of the plant and its behaviour. This demands a different approach to the planning of training, compared to the usual approach. Another issue not known to be addressed in the plans, but of concern to the workforce, is that of the promotion routes associated with this new job. It is not clear how people can progress into this job, become more skilled, achieve promotion and advance elsewhere in the Company.
Uncertainty about this issue is known to be a source of resistance to change; furthermore, in an environment in which downsizing is an established long-term trend, those who retain their jobs but cannot see a recognisable future for themselves may regard themselves as next in line, and therefore have no better motivation than those who know their jobs will cease to exist. Finally, given the potential for boredom inherent in this new job, there is an important issue in maintaining situation awareness and the sharpness of problem-solving skills. It was recommended that issues of retraining and the simulation of problems should be addressed.
General comments on the project as a whole The authors were asked to make any additional comments about the project thought to be relevant. These are summarised below.
• It is well known that a task in which operators are required to monitor a process for extended periods of time with little involvement will produce wandering of attention, lassitude, and a loss of situation awareness. The scope of this project has not permitted coverage of this issue, but it is one for management to consider, given the significant role of the Saw Controller in the whole process.
• Many of the tasks in the 'final solution' will be novel for the operators, requiring new skills or higher levels of expertise in existing skills. Training plans will require considerable attention if a trouble-free transition is to occur.
• Given a highly automated process, it is known that operators can become deskilled in fixing problems. Operators in an adjacent plant are aware of this in their own jobs. Simulation facilities may be an answer, for operators both to learn and to maintain their operational and problem-solving skills. Current training plans do not envisage such facilities, either for training or under operational conditions.
• There may be problems with skills progression and promotion surrounding this new job.
• Some avoidable problems occurred during the implementation of the new process during this project which seem to be due to poor communications. This is a major issue, and refers to communications between operators and designers; between management personnel, to avoid 'over-the-wall' problems; and between the various organisations involved.
• It appears that the new ways of working will mean changed priorities for jobs within the Beam Mill. The Saw Controller now moves to centre stage in determining what happens in the Mill. There will inevitably be knock-on effects for other jobs in the Mill, which will need further consideration.
• The overall design philosophy seems technocentric, when current thinking is moving towards a more socio-technical philosophy for engineering design. This shift is perhaps best expressed as moving from 'what technology is needed to solve this problem?' to 'what is the right operational combination of people and technology to meet business objectives?'. There seems to be a case for the Company to move in this direction as well. The development of this approach will require time before it reaches the same level of process maturity as the current method, but it should obviate the need for the oft-repeated mantra of the project, "Whatever it takes, we will deliver the goals". Given the number of times process improvements have been carried out in the Company, this mantra should not be necessary; it is indicative of the need for a different approach.
References Rasmussen, J. and L.P.Goodstein (1986). Decision support in supervisory control. Analysis, design and evaluation of man-machine systems, Varese, Italy, 2nd IFAC/IFIP/IFORS/IEA Conference.
THE EFFECTS OF AGE AND HABITUAL PHYSICAL ACTIVITY ON THE ADJUSTMENT TO NOCTURNAL SHIFTWORK T.Reilly, A.Coldwells, G.Atkinson and J.Waterhouse Research Institute for Sport and Exercise Sciences Liverpool John Moores University Mountford Building, Byrom Street Liverpool, L3 3AF
The purpose of this research was to determine the influences of age and habitual physical activity levels on the adjustment to and tolerance of nocturnal shiftwork. Participants included young (mean age 23.4 years) and older (mean age 48.9 years) male shiftworkers who operated on a slow-rotating and backward-rotating shift. Circadian rhythm characteristics were determined for 5 days on each of 3 work-shifts (night, afternoon, morning). Leisure time activity was quantified by means of a questionnaire. Younger subjects had higher amplitudes in their circadian rhythms and adapted more quickly. Active subjects showed a similar trend in rhythm amplitude and adjusted more quickly to nightwork. The older subjects were better suited to the morning shift than their younger counterparts. The observations support a scheduling scheme which takes age (but not necessarily habitual activity level) into account.
Introduction An appreciable proportion of the national workforce is engaged periodically in shiftwork. Shiftwork refers to any regularly taken employment outside the day-working window, defined arbitrarily as the hours between 07:00 and 18:00 hours (Monk and Folkard, 1992). It has been estimated that 10–25% of individuals in employment participate in shiftwork, with higher rates among manual workers in Britain and among part-time workers in America (Young, 1982; Reilly et al., 1997). About 18% of European workers are thought to engage in nocturnal shiftwork for at least one quarter of their working time and in the USA 20 million full-time employees are involved in shiftwork (Costa, 1997). In view of the scale of nocturnal shiftwork use in industry, the consequences of shiftwork systems on human factors issues are worthy of investigation. Shiftwork causes disturbances of the normal sleep-wake cycle and circadian rhythm. Ageing is thought to be associated with a reduced ability to tolerate circadian phase shifts such as occurs when starting on a nocturnal work-shift. There is concern also that ageing workers have more health-related problems than younger colleagues when the human body clock which regulates circadian rhythms is disrupted (Waterhouse et al., 1992). Such
disruption occurs after travelling across multiple time zones or engaging in nocturnal workshifts. Habitual physical activity may act as a time-signal for the body clock and so influence circadian rhythm characteristics (Redlin and Mrosovsky, 1997). Atkinson et al. (1993) reported higher amplitudes in circadian rhythms of physically fit subjects compared to a group of inactive individuals studied under nychthemeral conditions. It has been suggested that this higher amplitude is characteristic of a tolerance to circadian phase shifts, such as occurs in adjusting to night work or following long-haul flights (Harma, 1995). There is no agreement regarding the causality of this relationship. For the ergonomist, these considerations are relevant to the design of work schedules best suited to ageing employees. There are consequences also for the selection of individuals according to tolerance of night work. Therefore, the purpose of this study was to determine the influences of age and physical activity on the adjustment to and tolerance of a nocturnal shiftwork regimen.
Methods Twenty male shiftworkers, comprising employees of a car manufacturer, police officers and security officers, were recruited for the study. They were divided into a young (n=9; mean age±SD= 23.4±2.3 years) and an old (n=11; mean age 48.9±5.2 years) group. All subjects worked a slow-rotating and backward-rotating shift, i.e. night, afternoon, morning. Subjects were also subdivided into active and inactive sub-groups based on reports in the leisure-time Physical Activity Questionnaire (Lamb and Brodie, 1991). The physical activity status of the active group was calculated as 55.9±10.5 units, compared with 4.7±2.6 units for the inactive group, based on energy expenditure over a 14-day period. Observations were made over the solar day whilst on the various shifts, as outlined in Table 1. Subjects recorded their own oral temperature measured by means of a digital clinical thermometer (Phillips, Eindhoven). Grip strengths for right and left hands were measured using a hand-held spring-loaded dynamometer (Takei-kiki Kogyo, Tokyo). Peak expiratory flow was measured with a flow-meter (Airmed, London). Arousal was self-rated using a visual-analogue scale. Measurements were made where feasible every 2 h for the 5 days on each of the 3 work-shifts.
Table 1. Times at which measures were recorded by subjects on each shift
Rhythm characteristics were determined by means of cosinor analysis (Nelson et al., 1979). Comparisons between groups were made using analysis of variance. The participants also completed the Standard Shiftwork Index at the beginning of the study (Barton et al., 1990). The index contains six sections on general biographical information, sleep and fatigue, health and well-being, social and domestic situation, coping, and chronotype (morningness/eveningness). The detailed observations gained using the inventory are not included in this report.
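Cosinor analysis fits a cosine of fixed period (here 24 h) to the serial measurements, yielding the mesor (rhythm-adjusted mean), the amplitude and the acrophase (time of peak). The study used the method of Nelson et al. (1979); the sketch below is only a minimal least-squares illustration of the same idea, with invented oral-temperature readings.

```python
# Minimal single-cosinor fit: y(t) = M + A*cos(2*pi*t/period + phi)
# Illustrative only; the study used the method of Nelson et al. (1979).
import numpy as np

def cosinor(times_h, values, period=24.0):
    w = 2 * np.pi / period
    X = np.column_stack([np.ones_like(times_h),
                         np.cos(w * times_h),
                         np.sin(w * times_h)])
    mesor, beta, gamma = np.linalg.lstsq(X, values, rcond=None)[0]
    amplitude = np.hypot(beta, gamma)
    acrophase_h = (np.arctan2(gamma, beta) / w) % period  # time of peak, in hours
    return mesor, amplitude, acrophase_h

# Invented oral-temperature readings every 2 h over one day, peaking near 17:00
t = np.arange(0, 24, 2.0)
y = 36.8 + 0.4 * np.cos(2 * np.pi * (t - 17) / 24) + np.random.normal(0, 0.05, t.size)
print(cosinor(t, y))  # mesor ~36.8 degC, amplitude ~0.4 degC, acrophase ~17 h
```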
Results Overall, the younger subjects were found to have higher amplitudes in their circadian rhythms and faster adaptations of the rhythms to nightwork. The older subjects seemed better suited to the morning shift than their younger counterparts. This corresponded to an increased tendency to ‘morningness’ in the older group. Those subjects with a high level of leisure time activity possessed larger circadian rhythm amplitudes than inactive individuals but not faster adaptations of the rhythms to nightwork.
Night shift There was a significant time of day effect for both old and young subjects in oral temperature. For the first night worked, the age by time of day interaction was significant for oral temperature (P<0.05) although the times for the peak values did not differ between groups. The difference was significant at 06:00 hours, with the older group showing a significantly lower temperature at the end of the night shift. The interaction was also apparent on the third shift on night work, the older group experiencing a lower (P<0.05) oral temperature earlier in the shift at 04:00 hours. The mean oral temperature decreased progressively over each successive night in both young and old participants. Grip strength and subjective arousal showed similar trends to those of oral temperature. The old group reported significantly lower values than the young group (P<0.05) at 06:00 h on the first night and at 04:00 h on the third and fifth nights. A significant ‘time of day’ effect was observed over each of the five nights for peak expiratory flow. Nevertheless, the same trends were evident in the old as in the young group.
Table 2. Acrophases for the young and old groups for the first and fifth nights on night, morning and afternoon shifts
Morning Shift Oral temperature demonstrated a characteristic circadian rhythm on each of the five days (P<0.0001). The age effect and the age by time of day interaction were significant for all five days (P<0.001). Right grip strength of the older group increased progressively during the first morning shift until 10:00h. Values for the young group increased between 06:00 and 08:00h and then remained fairly stable for the remainder of the shift period up to 14:00 hours. The grip strength of the older group decreased following the 14:00h peak but remained above the 06:00h value. The differences between the groups were smallest at 06:00h. The older group demonstrated greater alertness scores at 06:00h compared to the young group (P<0.05).
Afternoon Shift Oral temperature showed significant effects of time of day and of age (P<0.0001) for each of the five days. The time of day by age interaction was significant (P<0.005) for days 1, 3 and 4. Time of day effects and the interaction with age were significant for all five days (P<0.0001) for right grip strength. The interaction effect was not significant for left grip strength (P=0.053). Alertness demonstrated significant time of day effects on all five days (P<0.001). The age effect was significant on days 4 and 5 and the time of day by age interaction was significant (P<0.05) on all days except the first one on this shift schedule.
Discussion Colquhoun and Folkard (1978) reported data for a first night of shift-work. The data indicated that for the first night shift the normal (that is, diurnally phased) circadian rhythm is maintained. The present data for the first night shift show a similar finding, although both the old and young subjects showed a gradual decrease in temperature throughout the night. The differences on both the first and third nights in oral temperature between the old and young subjects suggest that the phasing or amplitudes of the groups' circadian rhythms are different. The lower temperature of the older subjects at the end of the night shift (06:00h) suggests that, as a group, the older individuals may have had more difficulty in adjusting to work at this time. This result is contrary to the findings for the morning shift, where the differences between older and younger subjects were smallest at 06:00h. When sleep is also considered, the present findings are supported in the literature (Reilly et al., 1997). Following a night shift, the older subjects' circadian rhythms are usually more disturbed and the individuals are less capable of performing work. Conversely, when subjects have slept during the night, the differences between the older and younger subjects tend to be relatively small. The early cessation of sleep (to commence work at 06:00h) seems to have had a large influence on young subjects, causing a relatively poor performance. However, for the older subjects, with an increased 'morningness' (Reilly et al., 1997), the 06:00h start might have had little disruptive effect on their sleep, thereby allowing a relatively good performance.
Despite the higher amplitudes of rhythm in the younger subjects, this group had more difficulties in adjusting to the morning shift. This contradicts the theory that high-amplitude rhythms predict good tolerance to phase shifts. The reduced differences in performance between the old and young groups during the morning shift further suggest the potential benefits of re-scheduling the work-shifts of individuals by age. If older individuals were scheduled to work morning shifts, their relative performance would be high when compared to a younger group. The increased 'morningness' of the older individuals would explain this finding. It must be noted that although the differences between the old and young groups were smallest in the morning, the older subjects did not out-perform the young individuals. During the night shift, the younger individuals performed consistently better than the older subjects. Thus the utilisation of older workers within shift systems is probably most effective during the morning and afternoon shifts. Overall, the adjustment to repeated phase shifts including nocturnal work was shift-dependent, and would support a scheduling scheme which takes age (but not necessarily habitual physical activity) into account.
Acknowledgements This work was supported by a grant from the Health and Safety Executive.
References Atkinson, G., Coldwells, A., Reilly, T. and Waterhouse, J. 1993, A comparison of circadian rhythms in work performance between physically active and inactive subjects. Ergonomics, 36, 273–281 Barton, J., Folkard, S., Smith, L.R., Spelton, E.R. and Tattersall, PA. 1990, Standard Shiftwork Index Manual. Sheffield, MRC/ESRC Social and Applied Psychology Unit Colquhoun, W.P. and Folkard, S. 1978, Personality differences in body temperature and their relation to its adjustment to night work. Ergonomics, 21, 811–817 Costa, G. 1997, The problem: shiftwork. Chronobiology International, 14, 89–98 Harma, M. 1995, Sleepiness and shiftwork: individual differences. Journal of Sleep Research, 4 Suppl. 2, 57–61 Lamb, K.R. and Brodie, D.A. 1991, Leisure time physical activity as an estimator of physical fitness: a validation study. Journal of Clinical Epidemiology, 44, 41–52 Monk, T.H. and Folkard, S. 1992, Making Shiftwork Tolerable, (Taylor and Francis, London) Nelson, W., Tong, U., Lee, J. and Halberg, F. 1979, Methods for cosinor rhythmometry. Chronobiologia, 6, 305–323 Redlin, U. and Mrosovsky, N. 1997, Exercise and human rhythms: what we know and what we need to know. Chronobiology International, 14, 221–229 Reilly, T., Waterhouse, J. and Atkinson, G. 1997, Ageing, rhythms of physical performance and adjustment to changes in the sleep-activity cycle. Occupational and Environmental Medicine, 54, 812–816 Waterhouse, J., Folkard, S. and Minors, D. 1992, Shiftwork, health and safety. An overview of the scientific literature 1978–1990, London, HMSO Young, B.M. 1982, The shift towards shiftwork. New Society, 61, 96–97
JOB DESIGN FOR UNIVERSITY TECHNICIANS: WORK ACTIVITY AND ALLOCATION OF FUNCTION R.F.Harrison*, A.Dickens and C.Baber
Industrial Ergonomics Group, School of Manufacturing & Mechanical Engineering, University of Birmingham, Birmingham, B15 2TT, United Kingdom (* Presenter of paper).
The following paper examines the role of technicians within a university environment. Five technicians specialising in various disciplines were selected to form the research sample. Over a period of five days, the constituents of the technicians' jobs were defined using a flow process chart. The distance travelled, time taken, tasks performed and problems encountered were noted in order to highlight any inefficiencies. Further information regarding job satisfaction was obtained using a Job Diagnostic Survey (JDS). The results of the flow process chart show that a significant percentage of the total time was spent walking to task destinations and waiting for appropriate tools and access. Also, no formal procedure for allocating and prioritising tasks was used within the university, and technicians performed tasks as and when they arose.
Introduction Research into job design has traditionally centred around manufacturing and service industries (Adair and Murray, 1994). Little research has been carried out into the role of technicians at Universities and other higher education institutions. The role of a technician is to provide the appropriate technical support needed by the academic staff so that they can teach students effectively. This means that a wide range of skills are often needed by the technician, and their job can often require a high degree of movement between various locations. Due to a lack of job design guidelines for technicians, their roles tend to be evolutionary by nature, i.e. the methods employed by a technician result from his or her own experience of working within that particular role. It is the aim of the study to identify the type of activities technicians perform during a working day, and determine any inefficiencies that may result from evolutionary job design. In addition, the technicians will be questioned regarding their feelings toward their job, in order to determine whether or not job design affects motivation.
Methodology Sample Characteristics Five technicians were selected as participants for the study. In order to be representative, the sample was chosen to encompass as many as possible of the technicians' job-types and responsibilities. Although their individual job-types varied considerably, all were similar in that they produced work on an 'as needs' basis for both students and academics. In addition, all have a wide range of skills and a high degree of mobility within the university. A minimum of five years' experience was deemed necessary for all participants in the study. This ensured that the study investigated the job itself and was not confounded by the inexperience of the technician. The technicians' ages ranged between 28 and 55 years.
Data Collection Three methods of data collection were considered for the study: activity sampling, travel diagrams and flow process charts (International Labour Office, 1979). Activity sampling was not used due to the disruptive nature of the data collection, whilst travel diagrams were considered too time-consuming for the technicians to complete. The flow process chart was selected as the most suitable for the study, being simple to complete, yet comprehensive.
The flow process chart The flow process chart as outlined by the International Labour Office (1979) was used. As the flow process chart is normally used to investigate factory jobs, slight adaptation was required to ensure that it accurately measured the parameters necessary for this study. The original chart categorised work activities as follows: 'operation', 'transport', 'delay', or 'storage'. It was deemed more appropriate, and easier to understand, when the descriptions in Table 1 were used to categorise the activities of the technician:
Table 1. Categorisation of Work Activities
The technicians were required to complete the flow process charts themselves, including a brief description to accompany each activity in order to aid analysis. In addition, technicians were asked to detail the distance they travel when they are required to move. In order to minimise inter-participant and intra-participant variability when categorising activities, every technician was trained for at least one hour prior to beginning a study. All technicians were also supplied with guidelines to support training, and as a source of reference during the study. By allowing the technicians to chart their own work, it was found that the suspicion normally associated with work studies, whereby the presence and observations of an outsider could easily be misinterpreted, was lessened.
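Analysing the completed charts amounts to totalling the time and distance recorded against each activity category, which is how breakdowns such as those in Figure 1 and Table 2 are produced. The sketch below illustrates such a summary; the category names and chart entries are invented, since the adapted categories of Table 1 are not reproduced here.

```python
# Illustrative summary of flow process chart entries (all entries invented).
# Each entry: (category, duration in minutes, distance in metres).
from collections import defaultdict

entries = [
    ("Productive work", 35, 0),
    ("Movement", 4, 180),
    ("Delay", 10, 0),      # e.g. waiting for access to a locked laboratory
    ("Productive work", 50, 0),
    ("Movement", 6, 240),
    ("Delay", 8, 0),       # e.g. tool unavailable
]

time_by_cat = defaultdict(float)
distance = 0.0
for category, minutes, metres in entries:
    time_by_cat[category] += minutes
    distance += metres

total = sum(time_by_cat.values())
for category, minutes in time_by_cat.items():
    print(f"{category}: {minutes / total:.0%} of recorded time")
print(f"Distance walked: {distance:.0f} m")
```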
Job Diagnostic Survey To measure beyond the purely physical aspects of the technicians' role, another means of measurement was employed. A Job Diagnostic Survey (JDS) was selected, as described by Hackman and Oldham (1980). Using the JDS it is possible to examine work dimensions such as motivation and job satisfaction. All results for the JDS are measured on a 7-point preference scale, where 1 is low and 7 is high. The only exception is the Motivating Potential Score (MPS), which ranges from 1 to 343. This score is derived by averaging the 'skill variety', 'task identity' and 'task significance' scores, and multiplying the result by 'autonomy' and 'feedback from the job'. The higher the score, the more motivated the individual. The JDS was chosen because it is regarded as a 'good' diagnostic tool (Hackman and Oldham, 1980). There is, though, an absence of firm evidence about the validity and reliability of some JDS measures, especially 'growth need strength' (Van der Zwaan, 1975).
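On that definition the MPS follows directly from the five core dimension scores; the short worked example below uses invented scores, and shows that a job rated 7 on every dimension reaches the quoted ceiling of 343.

```python
# Motivating Potential Score (Hackman and Oldham, 1980); dimension scores are on a 1-7 scale.
def mps(skill_variety, task_identity, task_significance, autonomy, feedback):
    return ((skill_variety + task_identity + task_significance) / 3) * autonomy * feedback

print(mps(7, 7, 7, 7, 7))            # 343.0 - the maximum of the scale
print(mps(5.5, 5.0, 6.0, 5.5, 4.0))  # invented profile: 121.0
```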
Results Flow process chart Despite comprehensive training, it was found that one technician's data were inconsistent and as a result were discarded. The remaining data for the flow process charts are depicted below in Figure 1.
Figure 1. Average Breakdown of Activities Per Day The graph shows that a high proportion of the technicians' time is non-value-adding, i.e. non-productive. In the worst case, 49.3% of the technicians' time was non-value-adding, whilst in the best case it was 30%. Table 2. Average Distance Travelled in One Day (m)
The data in the table above details the distance moved by the technicians whilst performing their tasks. On average, a university technician will walk 1166 metres per day.
Job Diagnostic Survey Table 3. Results of Job Diagnostic Survey
The JDS shows that overall, technicians are very satisfied with their jobs and relatively highly motivated. This is reflected by the MPS which was higher than the normative data for professional or technical workers (Hackman and Oldham, 1980). However, low satisfaction ratings were shown for pay, supervision and feedback.
Discussion The results showed a high proportion of non-value-adding time due to the distances moved and delays. From the activity descriptions (not shown in this paper) it could be seen that these delays encompassed: talking to students about jobs, unavailability of tools, interruptions from staff or students, and the inability to gain access to essential areas. The JDS showed that technicians were highly motivated individuals with a high degree of job satisfaction. However, despite these positive results, they felt strongly about pay, supervision and feedback. Although the results suggest that more supervisory support is needed, it is theorised that greater gain in terms of job efficiency will come from further examining the technicians' job design. Possible improvements to job design, based on qualitative insights gained from carrying out the study, are discussed in the following paragraphs. The first improvement involves determining the routine jobs performed by technicians, such as maintenance, laboratory classes and paperwork. At present, routine jobs are fitted into
the technicians' day whenever time is available. If set times are allocated, routine jobs can be structured in the most efficient way, taking into account distances and tools required. Technicians will not be available to aid students and staff during this time and can therefore perform routine jobs with the minimum of delays. The second type of structural improvement relates to the introduction of flexible time. Flexible time will be allocated for requests for one-off jobs by students. Time is currently wasted because students are unsure of which technician they need for certain problems, and they will often approach several technicians before finding the correct one. It is suggested that a 'technical services' board is displayed within the department. This would list all the technicians, their specialities, and a means of contacting them during 'flexible time'. In addition, the board would refer students to a World Wide Web page containing in-depth information about each technician's role. By using the web pages in conjunction with the technical services board, students could quickly ascertain which technician most suited their needs. This would lead to a reduction in unnecessary interruptions by students. The third improvement involves the use of request forms for students booking work with particular technicians during their flexible time. The request form would ask for information regarding the nature of a job, alongside a means of contacting the student. By having request forms, technicians could prioritise and plan work effectively, and they would be able to refer back to previous jobs, which may help direct current ones. The final improvement relates to the lack of feedback, as indicated by the JDS. The 'request form' could have an additional section in which students give feedback, upon completion of a job, about the technician's performance. To increase the level of feedback from academics and supervisors, annual reviews of technicians should take place. The comments given by students on 'request forms' could provide valuable assessment information. An annual review would similarly allow the technician to highlight job design improvements drawn from their experience (Hackman and Morris, 1975). It should be noted that the above suggestions attempt to increase the efficiency of the technician and represent a starting point; all systems require continuous improvement to remain effective. Similarly, due to the mobile nature of the job, it would be impossible to achieve 100% productive time, but it is expected that a target figure should be approximately 90%.
Conclusions The study raised some interesting points regarding the work of the technician. However, it would be beneficial to undertake an inter-university study to examine whether or not these problems exist to the same extent in other institutions.
References Adair, C.B. and Murray, B.A. 1994, Break-Through Process Redesign (Rath and Strong, USA) Hackman, J.R. and Morris, C.G. 1975, Group tasks, group interaction process, and group performance effectiveness: A review and proposed integration. In L.Berkowitz (ed.), Advances in experimental social psychology (Academic Press, New York), 15–34 Hackman, J.R. and Oldham, G.R. 1980, Work Redesign (Addison-Wesley, USA) International Labour Office 1979, Introduction to Work Study (ILO Publications, Geneva) Van der Zwaan, A.H. 1975, The Sociotechnical Systems Approach: A Critical Evaluation, International Journal of Production Research, 13, 149–163
SYSTEM DESIGN AND ANALYSIS
ALLOCATION OF FUNCTIONS AND MANUFACTURING JOB DESIGN BASED ON KNOWLEDGE REQUIREMENTS. C.E.Siemieniuch, M.A.Sinclair and G.M.C.Vaughan HUSAT Research Institute Elms Grove Loughborough Leics LE11 1RG
This approach addresses the design of new business processes, rather than upgrades of cells. Organisations are construed as configurations of knowledge, embodied in humans and machines, utilising data to create information, and its physical manifestation (products for sale). The problem is to optimise this configuration of knowledge and its allocation to humans and machines. We start from: the operating environment; a knowledge taxonomy; and a functional description of the process. This results in the allocation of functions, the definition of human roles, and the distribution of management functions very early in the design process.
Introduction The frame of reference for this paper is manufacturing industry. For simplicity, we define two categories of problems for the allocation of functions—the major facility (e.g. a new process line), and the cell (e.g. the reconstruction of a manufacturing cell). We address the former of these two problems, which represents a step-change in the organisation's operations; the latter problem, the gradual, incremental improvement to established processes, has been addressed by many authors (e.g. Meister and Rabideau 1965; Döring 1976; Kantowitz and Sorkin 1987; Mital et al. 1994a; Mital et al. 1994b). An organisation can be construed as a configuration of knowledge, embodied in humans and machines, which utilises data to create information (e.g. the product data model), and its physical manifestation (products for sale). The problem is to optimise this configuration of knowledge and its allocation to humans and technical systems. In doing this, a particular goal for this method was to enable practitioners to carry out function allocation at a very early stage in systems design, as the textbooks tell us to do without offering useful suggestions. Note that each allocation decision creates at least one extra interaction task (e.g. coordination). Dekker and Wright (1997) have argued, and managers will agree, that materials transformation activities are seldom the source of manufacturing inefficiencies. More often, failings in the associated information processing, communication and co-ordination tasks are the main source, and these failings also cause most of the accidents. Hence, early definition of roles and responsibilities means that engineers can be given needs for control and communication in time for inclusion in the process design, rather than having to cobble together solutions at a late stage in the design, when many design decisions are fixed. Three premises underlie the methodology:
• There is a basic, generic structure of knowledge for the manufacturing domain. Abstract models of companies exist in practice and in text (Vernadat 1996).
• The design process for the structure of a company is invariant over the organisation's hierarchy (i.e. the levels of the hierarchy do not matter) and its business processes (i.e. any process can be designed using the same method).
• This knowledge configuration can be constructed per process, and accumulated for the whole facility. For process groups with devolved management, this is acceptable; however, we believe that in a facility where similar processes occur in parallel the methodology needs elaboration to deal with cross-process issues.
The starting points for the approach, which provide the only relatively stable set of parameters for design in the early stages, are as follows:
• the operating environment (market conditions and company policies),
• a knowledge taxonomy for manufacturing functions/processes,
• a function-based description of the activities in the facility.
There are two components for the DSS tool which is the embodiment of the approach: the positioning component and the role structuring component. The Positioning component is a spreadsheet tool that captures the 'framework of forces' that act on a company arising from its internal and external environments. This results in a set of five process characteristics, expressed as points on five separate continua. The continua are:
• Structure: from wholly project- to wholly function-based
• Control: from wholly project- to wholly function-based
• Process: from wholly sequential to wholly parallel
• People: from entirely specialist to entirely generalist skill sets
• Tools: from completely automated to completely manual tools (not used at present)
This component is described in (Brookes and Backhouse 1996), and is not considered further here. The positioning tool is not essential, as long as the organisation has a formal statement of policies on these continua. However, a property of the positioning tool revealed on several occasions is its encouragement of discussion about important organisational issues that are usually not considered, because everybody 'knows' what the situation is. The discussions uncover hidden assumptions, uneven distribution of information, and errors in understanding, and it is recommended that the positioning tool is used initially.
The Allocation of Functions and Role Structuring component has two main modules: the functions/knowledge database and the sets of rules which act on the database. Firstly, there is the generic functions/knowledge database, which contains a set of connected, decomposable functions, each of which is serviced by a particular set of knowledge classes and associated expertise levels. The functions part of this database comprises a set of generic manufacturing processes, which can be tailored by a user organisation to fit its circumstances. These functions are derived from process charts for similar processes in a range of different manufacturing companies, and by reference to standards (e.g. BSI 7000). Secondly, there is a set of six rulesets, described below.
Meta-rules. These operate on the subsequent rulesets. Their function is to switch on or off particular rules, depending on the output from the positioning component and the users' choices. They translate the output from the first part of the tool into constraints for the second part. In particular, they resolve conflicts which can occur in the output from the positioning tool (e.g. "everyone must be omni-competent, but we must have specialists").
Process characteristics rules. There are four types: Structure, Control, Process and People. These adjust the relationships between the process functions, their management, and the configuration of knowledge serving these functions. They amend a generic set of functions to fit the particular business process being considered.
Allocation of function rules. Following (Mital, Motorwala et al. 1994a; Mital, Motorwala et al. 1994b), we consider that there are three categories for these rules: (a) Mandatory allocation: There are mandatory reasons for allocating a function to humans or machines; for
example, safety practice, legal requirements, or engineering limitations, (b) Balance of value: rules based on estimates of the relative goodness of human and machine technology for performing the intended function, (c) Knowledge and communications characteristics: This category includes rules, which could be employed to alter the pre-set attributes of functions in the database. These attributes are in effect pre-determined answers to the questions below in Table 1, which the user can change to fit the company circumstances. Humans bring to the workplace particular abilities to perceive and interpret information, to think, and to act, in the context of a variable environment. Automated systems are unlikely to be competent to do this for some time to come. Consequently, if any of these abilities are required in order to perform some function, then the function must be allocated to humans. Question classes which explore this are as shown in Table 1. They have been worded such that a ‘No’ answer implies that the function should be carried out by humans. Note that this does not mean that a human must perform the function unaided; merely that a human must be in direct, real-time and online control of the function. It is possible that as the design progresses, these decisions can be reviewed.
Table 1: Classes of questions for allocation of function rules.
Knowledge rules. These are used to identify matches between the functions in terms of the knowledge required for each of the functions. There are 16 of these, in a sequential set. The first rule is the most restrictive in its application conditions, and the last is the most relaxed, allowing almost any two functions to be matched and combined. Matching is on the basis of four criteria (or fewer, depending on the rule): adjacency of the functions; completeness of the match in knowledge classes; completeness of the match including levels of expertise; and homo-location within a defined process sub-section. The consequence of applying only the first rule is that a multitude of single-function jobs are produced; if only the last is applied, a few comprehensive jobs are produced, suitable only for polymaths. Consequently, the set is applied in sequence, with provision for back-tracking to permit alternative groupings of functions to be achieved.
Role definition rules. Their function is to control the groupings generated by the knowledge rules, so that at the end of the exercise a sensible set of function groupings has been achieved, perhaps with some functions still left dangling, and no nonsensical groupings have been produced. Note that these rules take little cognisance of workload, though they do crudely recognise the concept of a 'headful' of knowledge.
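The paper gives no rule syntax, so the following is only a schematic illustration of the general idea behind the knowledge rules: functions carry sets of knowledge classes with expertise levels, and progressively more relaxed matching criteria merge them into candidate roles, with the strictest rule yielding many single-function jobs and the most relaxed yielding a few broad ones. All function names, knowledge classes and criteria in the sketch are invented.

```python
# Schematic illustration of grouping functions by shared knowledge requirements.
# The real tool uses 16 sequential rules with back-tracking; this sketch applies
# two invented criteria (strict, then relaxed) to made-up functions.

functions = {
    # function: {knowledge class: expertise level 1-5}
    "plan cutting sequence":  {"sawing process": 4, "order management": 3},
    "monitor saw automation": {"sawing process": 3, "control interfaces": 3},
    "liaise with mill":       {"order management": 2, "communications": 3},
}

def match(f1, f2, require_levels):
    k1, k2 = functions[f1], functions[f2]
    shared = set(k1) & set(k2)
    if not shared:
        return False
    if require_levels:                  # stricter rule: expertise levels must agree
        return all(k1[k] == k2[k] for k in shared)
    return True                         # relaxed rule: any shared knowledge class

def group(require_levels):
    roles, assigned = [], set()
    names = list(functions)
    for i, f1 in enumerate(names):
        if f1 in assigned:
            continue
        role = [f1]
        for f2 in names[i + 1:]:
            if f2 not in assigned and match(f1, f2, require_levels):
                role.append(f2)
                assigned.add(f2)
        assigned.add(f1)
        roles.append(role)
    return roles

print(group(require_levels=True))   # stricter rule: more, narrower roles
print(group(require_levels=False))  # relaxed rule: fewer, broader roles
```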
Job design guidelines. These comprise advice to the user on what to do about the 'danglers', and on further editing of the role groupings developed above, to make them more suitable to the characteristics of the process and the organisation in which the process occurs. This extra editing will arise from the users' knowledge of the local context, to which the tool could not be privy. The output from this will be an agreed set of roles, comprising the operational functions and the process management functions.
Authority vs. empowerment rules. Up to this point, roles have been defined only with implicit 'boundaries'. The only identified transactions across these boundaries are those defined by the process: traffic in operational information, products, and the like. This last rule-set now defines the nature of the management links between the roles, so that a process can be controlled. The rules define role boundaries, and the relationships between operational roles, between process management roles, and between both of these. This is accomplished by the use of 'Paste functions', discussed below.
Paste functions Eight types of paste functions have been defined (so-called because they are 'pasted' between functions). With the exception of the 'Congruence' paste function, they indicate the boundaries between two roles. Different types of relationship between roles are shown by different combinations of the paste functions (e.g. 'autocratic', 'empowered', or 'peer-to-peer'). Once inserted, these Paste Functions comprise the structure and communications for management of the process under consideration.
Congruence. This facilitates the working of concurrent functions, and provides notifications regarding the availability, status and timeliness of the data flows between them. This occurs within a role.
Hand-over. This indicates the hand-over of responsibility and authority from one role to another role (most often from a project management role to an operational role).
Targeting. This paste function provides a context for the hand-over or delegation of responsibility and authority. Targeting is also a one-way paste function operating between one role and another.
Co-ordination. The Co-ordination paste function establishes two-way communication between two roles. In order to achieve this, certain functions must be present within the management part of each of the two roles linked by this paste function: these functions are 'plan', 'monitor', 'co-ordinate' and 'report'.
Integration. The Integration paste function ensures that activities within different roles are working to the same (moving) goals and with the same parameters. This does not imply control by one role over another. Wherever an information or data link exists between roles, an Integration paste function will be required.
Control. This paste function means that one role has ultimate authority and responsibility for any group of functions carried out by other role(s).
Delegation. This is a variation of the Hand-over paste function, in that it includes tightly-constrained conditions (e.g. no variance allowed in budget, allocation of resources, timescales, etc.), and typically occurs with the Control paste function.
Propagation. The Propagation paste function transfers information between processes rather than along a process. It enables organisational learning.
Upon completion of the actions of these rule-sets, and of the inputs by the users during their control of the rule-sets, there will be an output providing the functions allocated to humans (and, by default, those allocated to technology), the grouping of these functions into roles, a listing of the knowledge classes and levels of skill required for these roles, and the organisational structures into which these roles fit, together with the nature of the management communications between them.
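As a purely notional illustration, the paste functions can be represented as typed links between roles; the role names and link combinations below are invented and do not come from the SIMPLOFI case studies.

```python
# Notional representation of paste functions as typed links between roles.
from dataclasses import dataclass
from enum import Enum

class Paste(Enum):
    CONGRUENCE = "congruence"      # within a role, between concurrent functions
    HAND_OVER = "hand-over"
    TARGETING = "targeting"
    CO_ORDINATION = "co-ordination"
    INTEGRATION = "integration"
    CONTROL = "control"
    DELEGATION = "delegation"
    PROPAGATION = "propagation"    # between processes, enabling learning

@dataclass
class Link:
    source: str
    target: str
    paste: Paste

# Invented example: an 'empowered' relationship might combine targeting with
# co-ordination, whereas an 'autocratic' one might combine control with delegation.
links = [
    Link("process manager", "saw controller", Paste.TARGETING),
    Link("process manager", "saw controller", Paste.CO_ORDINATION),
    Link("saw controller", "mill operations", Paste.INTEGRATION),
]

for link in links:
    print(f"{link.source} -> {link.target}: {link.paste.value}")
```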
Evaluation of this approach This methodology has been developed in the SIMPLOFI (“Simultaneous Engineering through People, Organisation and Function Integration”) project, funded by the Engineering and Physical Sciences Research Council, UK. Six industrial companies took part. They ranged from ‘large’
to ‘medium-sized’; none were ‘small’. The manufacturing domains were computers, automobiles, materials testing, railway subsystems, mining equipment, and materials handling. In each of the companies a case-study approach was adopted, partly to develop the methodology, and partly to demonstrate its relevance to users in their endeavours both to understand and to improve their current processes. General responses from the users involved were: • •
the technical jargon and the concepts were unfamiliar to the users, and some effort was necessary before the methodology would be usable by managers and others. the exercise was thought-provoking.and in some cases had led to a redefinition of existing problems. Some felt relief that their thinking was supported by the tool.
Summarising the strengths and weaknesses of the approach:
+ It provides a tool for the organisation to explore the staffing of future processes, or process re-engineering ideas, where there are few tools currently available.
+ It provides an opportunity to influence the important, early process design decisions from a sociotechnical viewpoint, rather than just a technical viewpoint.
+ It provides a documented audit trail of users' decisions and the consequences of these decisions, impartially, and in more detail than is usually the case.
+ It provides a means of clarifying and crystallising differences between users in their visions of the future and its consequences.
+ It provides worked-through alternatives for the user to consider.
+ It provides a basis for early recognition of training needs and selection requirements.
+ Paste Functions allow comparisons of management structures, and are the basis for an evaluation tool with metrics for the appropriateness of management structures.
+ The approach is being explored for incorporation into Reference Architectures for enterprise modelling, and the development of IT infrastructures and applications.
- In its current form, the methodology requires domain experts for its use.
- The methodology is still a prototype, and requires further work and validation.
- The methodology is designed to work with sparse, early design information. Necessarily, therefore, it can only offer guidance to users, not solutions.
- The tool cannot deal with workload issues; in other words, it can tell you what kinds of roles are necessary, but not how many of each is necessary.
- For the tool to be properly effective, it should be one among a suite of interoperable tools for business process/enterprise modelling. This is not yet available.
References Brookes, N.J. and C.J.Backhouse (1996). Understanding Concurrent Engineering practice: a case-study approach, Dept of Manufacturing Engineering, Loughborough University, LE11 3TU. Dekker, S.A. and P.C.Wright (1997). Function allocation: a question of task transformation not allocation. ALLFN'97—Revisiting the allocation of functions issue, Galway, IEA Press, Louisville. Döring, B. (1976). Analytical methods in man-machine system development. Introduction to human engineering. K.-F.Kraiss and J.Moraal. Köln, Verlag TÜV Rheinland GmbH: Ch. 10. Kantowitz, B. and R.Sorkin (1987). Allocation of functions. Handbook of human factors. G.Salvendy. New York, J.Wiley & Sons: 355–369. Meister, D. and G.F.Rabideau (1965). Human factors evaluation in system development. New York, John Wiley & Sons. Mital, A., A.Motorwala, et al. (1994a). "Allocation of functions to humans and machines in a manufacturing environment: Part 1—Guidelines for practitioners." International Journal of Industrial Ergonomics, 14(1 and 2): 3–31. Mital, A., A.Motorwala, et al. (1994b). "Allocation of functions to humans and machines in a manufacturing environment: Part 2—Scientific basis (knowledge basis) for the Guide." International Journal of Industrial Ergonomics, 14(1 and 2): 33–49. Vernadat, F.B. (1996). Enterprise modelling and integration. London, Chapman & Hall.
THE NEED TO SPECIFY COGNITION WITHIN SYSTEM REQUIREMENTS Iain S MacLeod Aerosystems International West Hendford Yeovil, BA20 2AL UK
Systems Engineering separates system design into logical specification and physical design stages. After the logical stage, the system design is approached by various iterations of physical design processes. The usual consideration of Human Factors (HF) at the logical stage of design is in the form of human constraints on system design. This form of HF consideration is necessary. However, human system-related performance requirements should also be considered. Therefore, HF requirements should be carefully introduced as system functional requirements at the logical phase. Otherwise, HF influence on design has no trace to system requirements, and HF work is commonly performed late in the physical phase. Also, design requirements for HF should arguably result from an early consideration of system cognition. By cognition we refer to system goal-related properties of work concerned with system direction & control, situation analysis, system management, supervision, knowledge application, and anticipation.
Introduction Systems Engineering (SE) separates system design into the logical specification of system functions and performance as a basis for subsequent physical design stages (see IEEE 1220, 1994). The logical stage allows requirements capture and functional/performance specification that is logical and implementation-free. After the logical stage, the system design is approached by various staged iterations of physical design processes, for example a design stage of synthesis, where requirements are equated with proposed system architectures and candidate technologies. Traceability threads are maintained to the initial specification to help ensure that the initial requirements are met by the design and that design changes are noted. Change details are recorded, detailing not only the thread to the original requirements but also change origins and the reasons for the change. As technology introduces greater complexity to systems, it is important that tenets such as those of SE are applied to design processes to promote quality in design. Unfortunately, Human Factors (HF) is poorly considered at the logical stage and is only considered in detail by the systems engineering process during physical design. The result of this lack of early consideration is that there is a sparsity of HF requirements on which to base physical design. There is poor traceability of HF activities to system requirements and, therefore, its benefits to design are hard to determine and its cost is difficult to justify to the engineering design world. This relegates much of the HF contribution to design to 'tidy-up' activities around the engineered design of the system. The usual consideration of HF at the logical stage is in the form of human constraints on system design, for example cabin dimensions and seats. These details are important in the
support of human work within the designed system. However, they add little to the specification of the human functions within the system and of their expected performance. This article reasons that the system requirements for cognition should be approached initially within the early specification of the system. It is argued that by this means operator performance can be properly considered within the design of a system. It is true that all engineered systems are designed to meet specified performance criteria, but operator performance is poorly considered within design processes. Moreover, only through consideration of the system requirements on the operator can the trust of the operator in the system be promoted. By cognition we refer to system goal-related properties of work concerned with system direction & control, situation analysis, system management, supervision, knowledge application, anticipation of events, and associated teamwork (MacLeod, 1996). These properties are normally relegated solely to the domain of human expertise and are not considered within system requirements/performance specification. However, it will be argued that in an advanced technology system it is important to recognise what functions of cognition are required in support of the system's model, and to include them in the specification. Arguably, only through careful early specification can man and machine functions be adequately complemented within a system. Moreover, looking to the future, machine assistants to the human must possess cognitive functions if they are truly to complement human performance in system control.
HF and Systems Engineering (SE) SE places much emphasis on the capturing of system requirements and the performance specification of these requirements. Consequently, a sub-specialisation of the discipline, termed Requirements Engineering, has developed. Requirements Engineering formulates the customer's requirements for the system, requirements that are often ambiguous and incomplete, into a set of logical functional and performance statements that represents a total system specification. SE's strength is that it represents a multi-disciplinary approach to the engineering of systems. Unfortunately, SE specification processes typically only cater for HF constraints on the system, constraints defining system design boundaries. Further, HF constraints are not generally accompanied by performance requirements and mainly consider issues related to working space and habitability. HF constraints tend to be concerned with limits of human capability rather than the required human contribution to the specified performance of the system. Current SE practices logically specify the system to be engineered. Within this specification there may be implicit HF requirements. However, traditionally these requirements have only been considered during physical design, in relation to the Human Machine Interface. Therefore, the majority of the needed human contribution to system performance is met by operator expertise, often developed separately from any design intent. Operator expertise will always be required within a man-machine system, if for no other reason than the need to cater for uncertainties in the system operating environment (MacLeod & Wells, 1997). This expertise is created as a product of personnel selection, training, and experience with systems. Expertise encompasses high-quality operator performance requirements within the system, including anticipation and system control. However, a design aim should be that the associated operator expertise should never need to be developed largely to cater for inefficiencies in the physical design of the system. It is a truism that changes in technology change the nature of human work. For example, as advancing technology promotes increasing levels of system automation, so the human role within a system becomes one more of supervision than of direct control, more of cognitive-type activity than of manifest physical activities. Such changes should be accompanied by an understanding of the nature of these changes. Understanding of these changes must be
accompanied by the development of new approaches and methods applicable to system design, developments that allow high quality engineered systems to be produced that fit their performance requirements. SE is a growing discipline. In the U.S.A., SE is starting to approach the issues of HF within specification. The International Council on Systems Engineering (INCOSE) has HF as one of its main themes at its next conference in Vancouver in 1998. In the UK, the British Psychological Society Special Interest Group on Engineering Psychology has an active working group examining the problems associated with the specification of cognitive functions in relation to the design of systems.
What are System Cognitive Functions (SCFs)?
A Function is stated here to be a system property that is latent until required, has an expected level of performance, and is appropriately evoked by engineered automation or the system operator through the application of effort (MacLeod & Scaife, 1997). In contrast, a Task involves effort and is a system's planned application of its functionality towards the satisfaction of explicit system goals. Tasks may involve one or more functions and may solely reside within the engineered system, be unique to the work performed by the system operator, or involve the use of functions from both. Tasks can be physical in nature, cognitive in nature, or a combination of the two (MacLeod, 1993). Further, the term cognition (e.g. encompassing knowing, understanding, anticipation, directing, mediation of skilled application, and control) can apply to functions and tasks resident in man, in machine, or in both (Hollnagel & Woods, 1983). With the human operator, activities and actions are necessary to address task performance. Mediating between operator system tasks and activities are the system operator's pertinent Cognitive Functions. Thus: Task >> Cognitive Function >> Activity >> System Feedback. However, for the sake of system design, system related cognitive functions differ from the system operator's cognitive functions. SCFs are concerned with system performance issues and not with the individual expertise and cognitive processes of an operator. Whilst operator cognitive functions, as introduced above, are tuned to the living needs of the individual, SCFs are functions that are required solely to allow the system to meet its designed performance. Importantly, the specified cognitive functions should not attempt to represent a model of the human, a system which has far too many variables to be of practical use (Chapanis, 1996). Rather, the SCFs should cover what the operator has to understand and activate in relation to the work situation and its associated operating procedures for control, direction, and management of the system. In the system they represent a functionality complementary to both engineering functionality and operator cognitive functionality. Therefore, SCFs are supported by operator cognitive functions but are concerned with purely system issues such as system control and management. As such, it is possible to engineer some SCFs. For example, some system management functions could sensibly be automated to complement the human tasks in the management and supervision of the system. Any SCFs considered here reside within the sphere of overall system requirements. Moreover, the eventual performance of these functions by man or machine, arrived at during the iterative and physical processes of design, can be represented by tasks undertaken by the engineered system, the system operator, or both.
The Association of SCFs with Other SE Functions
SE is similar to other forms of engineering in that a system is devised 'Top Down', from performance specification to a decomposition of functions into an associated hierarchy of sub-functionality. It is suggested that SCFs can be considered in a similar fashion, not as a
decomposition of the cognitive function, but as an association of the appropriate cognitive function with the engineering-derived function. By this method, not all engineered functions would have an associated cognitive function, though some engineered functions might have many. Here we have a more complete method of considering the transposition between the evocation of engineering functions and the performance of system tasks. By considering the cognitive functionality required to support the system's engineered functions it will not only be easier to determine the form of system tasks, and the participation in these tasks of system components; it will also be possible to consider automation in a new light. Because cognition is now in the system frame of consideration, the cognitive functions that are needed to support the functions resident in technology can be made explicit. This means that cognitive functions can either be automated through the new technology, if the adopted technology allows and it is desirable, or they can be better understood so as to assist the training of operators and the design of an HMI that is task oriented rather than system function oriented.
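As an illustration of this association (a minimal sketch only, not taken from the paper; the function names, data structures and example entries are hypothetical), an engineering-derived function can be recorded together with zero, one or many associated SCFs, so that functions left without cognitive support become visible during requirements capture rather than being left to undocumented operator expertise:

    # Minimal sketch: associating SCFs with engineering-derived functions.
    # All names and entries below are invented for illustration.
    from dataclasses import dataclass, field

    @dataclass
    class SCF:
        name: str           # e.g. "anticipate centre-of-gravity shift"
        performance: str    # expected level of performance, stated qualitatively

    @dataclass
    class EngineeredFunction:
        name: str
        scfs: list = field(default_factory=list)   # zero, one or many associated SCFs

    functions = [
        EngineeredFunction("transfer fuel between tanks",
                           [SCF("anticipate centre-of-gravity shift",
                                "warn before operating limits are reached"),
                            SCF("supervise automatic transfer",
                                "detect a failed transfer within seconds")]),
        EngineeredFunction("format navigation display"),   # no associated cognitive function
    ]

    # Engineered functions with no associated SCF can then be reviewed explicitly.
    print([f.name for f in functions if not f.scfs])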
The Form of SCFs
We earlier discussed the meaning of cognition. It is now necessary to consider the particular form and nature of SCFs. Firstly, we are complementing existing forms of functionality within the system: engineering and operator cognitive. Secondly, that complementation will exist in relation to specific system task areas: currently suggested as management, supervision, direction, control, and analysis as associated with evoked engineering functionality. Thirdly, the existence of the SCFs may or may not be manifest to other component parts of the system. If the SCFs are not manifest, the form may be that of system error or status checking, where other components of the system may not require the results of the associated error or status checking tasks provided the results are within pre-defined or acceptable limits. If manifest, the SCFs will have various forms depending on what parts of the system need to be aware of the results of the evoked SCFs. The forms in this case should therefore be task and context sensitive. For example, the form of an SCF-assisted warning on the safety of the system would be different from the form of an SCF-assisted advisory message. Within the engineered system the interface of the SCFs would be a matter of engineering physical design but would have implications for the safety criticality (or any other criticality) of the system. In the case of interfacing with the system operator's cognitive functions, consideration would have to be given to which operator sense or senses are most appropriate as perception receptors, as well as to the information content of the communication. In many cases good Human Computer Interface practice could be applied. However, with advanced technology systems there should also be distinctions and separate considerations applied to certain forms of intra-system communication, for example: 1) the difference between the system performance and communication of an activity, and the system performance of inter-component assistance; 2) the difference between the passing of information between components and the proffering of advice from one component to another. There are of course parallel arguments that can be entered into the debate: the arguments on the nature of consciousness (e.g. Penrose, 1997), the differences between language and conscious thought (e.g. Hardcastle, 1995), whole/part perception (e.g. Arnheim, 1986), and the very existence of cognitive functions as discussed in this article. However, what is argued for in this article is the capture of a system function, an SCF, which will complement other system functions to the benefit of system performance. Such functions do not have to mirror the processes of the brain, and do not have to have sentience or be aware, but must have a high quality ability to assess situations from gained information, both internal and external to the system. They must also be capable of communicating effectively. What will definitely be needed, not only for any SCF-based approach but also to keep pace with new advances in technology, is the creation and adoption of new design methods. To be adopted
these design methods must complement those already existing in disciplines such as SE. Some of these methods may already exist but are currently treated with some scepticism. An example of such disputes over the validity of methods can be seen in psychology, in the debate over the use of qualitative versus quantitative methods. It is submitted that qualitative methods would be required to determine the form and nature of SCFs.
The Issues
The issues discussed in this article are all related to one issue, namely that: advanced technologies normally have no accompanying and accepted design philosophies and methods for their effective and quality incorporation into current system design processes. It is essential that better design processes are found to allow the quality adoption of new and advanced technologies as they emerge. If old and inappropriate methods are applied to the adoption of new technologies within design, the result is invariably the production of systems that fail to meet performance requirements. Further, if the designers do not understand the implications of adopted new technologies, then the delivered system will not only under-perform as a system, its very operation is an unquantified risk. This short article has introduced the idea of System Cognitive Functions (SCFs) within design as a method of improving the requirements specification of systems and, through so doing, of approaching an understanding of the roles of new advanced technologies and their implications for Human Machine System design. Design processes need to incorporate a greater understanding of the system requirements on human performance, and of the support that must be offered to human cognition, if the system is to operate under control towards its designed performance goals. It is suggested that the 'old and inappropriate methods' are endemic within HF. It is time for HF to join and complement the other system design disciplines to assist in the development of high quality design processes that allow the adoption of advanced technologies. It is even possible that HF practitioners could lead the way.
References
Arnheim, R. (1986). The trouble with wholes and parts. New Ideas in Psychology, 4, 281–284.
Chapanis, A. (1996). Human Factors in Systems Engineering. Wiley, New York.
Hardcastle, V.G. (1995). Locating Consciousness. John Benjamins Press, Amsterdam & Philadelphia.
Hollnagel, E. & Woods, D.D. (1983). Cognitive systems engineering: New wine in new bottles. International Journal of Man-Machine Studies, 18, 583–600.
IEEE P1220 (1994). Standard for the Application and Management of the SE Process. IEEE Standards Department, Piscataway, NJ.
MacLeod, I.S. (1996). Cognitive quality in advanced crew system concepts: The training of the aircrew-machine team. Contemporary Ergonomics 1996, Taylor & Francis.
MacLeod, I.S. and Scaife, R. (1997). What is Functionality to be Allocated? In Proceedings of ALLFN'97, Galway, Ireland.
MacLeod, I.S. and Taylor, R.M. (1993). Does Human Cognition Allow Human Factors (HF) Certification of Advanced Aircrew Systems? In Proceedings of the Workshop on Human Factors Certification of Advanced Aviation Technologies, Toulouse, France, 19–23 July, Embry-Riddle University Press, FLA.
MacLeod, I.S. and Wells, L. (1997). Process Control in Uncertainty. In Proceedings of the 6th European Conference on Cognitive Science Approaches to Process Control, Baveno, Italy, September.
Penrose, R. (1997). The Large, the Small and the Human Mind. Cambridge University Press, Cambridge, UK.
ANALYSIS OF COMPLEX COMMUNICATION TASKS Jonas Wikman
Communication Research Unit Department of Psychology Umeå University, S-901 87, Umeå Sweden
The present paper places a specific emphasis on problem solving and decision making tasks where the tools for execution and completion have distinct communicative features. Two possible analytical strategies for such tasks are critically examined: a laboratory-based, micro level strategy and a strategy based on ergonomics. The aim is to compose an analytical approach that can account for task uncertainty, variation, and dynamics in communication settings. Systems analysis and constructs from organisational psychology and constructivistic theory are suggested as means to overcome some of the current limitations.
Introduction
Modern real life tasks are characterised by a high degree of uncertainty. The trends in working life towards flexible working hours, flexispace, team, and project organisations, conditioned by the proliferation of information technology, accentuate the dynamic aspects of work. Thus, a characterisation of modern, discretionary tasks must be guided by analytical approaches that can account for the critical dimensions mentioned. The present paper is concerned with the analytical and methodological consequences of the evolution of work practice and follows the discussions that have accompanied the development of task analysis. Emphasis is on real-life tasks in naturalistic settings where the primary means for execution and completion is interpersonal communication. Traditional work or job analysis methods have focused on tasks with physical and procedural characteristics, existing in stable work environments. The seemingly clear, established and uncomplicated relation between a task and its goal has made explicit analysis of goal hierarchies redundant, since goal hierarchies have, supposedly, been embedded in practice and customs. However, substantial changes in working life have taken place and the demands on human performance have shifted from primarily physical to cognitive. As a consequence, the discrepancy between reality and models of analysis has become apparent. The new complex, process-oriented tasks that require activities such as co-ordination, planning, decision-making, and communication do not offer the behavioural indicators needed to specify the parameters that regulate behaviour.
This underspecification leads to an inability to validate the mediating processes that determine task performance.
Problems with task analysis
It is possible to distinguish between two evolving task analysis traditions. The theory driven, nomothetic approach that concentrates on the micro level of task analysis (e.g. Wickens, 1992) is closely related to the dominant, basic research trend in cognitive psychology. The essence of the classic critique of this laboratory-based, cognitive approach, put forth by Neisser (1976) and others, concerns construct validity and can be summarised as follows: in the laboratory a limited number of variables can be manipulated independently. However, the only thing that guarantees this independence is the simple and stable environment that characterises the experimental situation, in which the boundaries are clear and relevant contextual factors are eliminated or held constant. In real life situations there may be no controls that would warrant this independence. It is a possible scenario that, when the experimental control is lifted, contextual and confounding variables are added that might invalidate an experimentally confirmed independence. This independence may be the basic condition on which the theorising rests and, in effect, the whole line of reasoning will be degraded. In addition, the approach is based on a model of reality where closed systems are linearly and additively combined into seemingly open ones where boundary conditions are constant. Such a model has proved to be of little use for practitioners (Hollnagel, 1982). An alternative, more or less atheoretical approach has been criticised for suffering from problems that are the reverse of those of the micro level tradition. The starting point for this ergonomics-based approach is the task together with the restrictions and/or properties of the situation as they occur in the specific workplace. Researchers construct heuristic and customised models that are well grounded in the situation and comprise all relevant contextual variables. The prime ambition has been to solve the practical problem at hand and not to develop a general methodological and theoretical base. The concepts and models applied are validated in the situation, but transfer to other contexts is secondary (Rasmussen, 1993). Consequently, the representation of contextual factors in these models usually has high construct validity. However, detailed analysis is often restricted to specific domains such as aviation and process control. In these contexts the boundary conditions are thoroughly explored but lack general cognitive explanations, which limits external validity. In general, the inability of ergonomists to build theory that stretches outside the immediate application area invites critics to question the already shallow theoretical foundation.
The validation issues and the systems definition
While micro-level analysts ignore contextual dependence, the alternative ergonomic approach is overwhelmed by it. This has implications for an analysis of problem solving tasks with communicative features: the theoretical and practical base needs to be broadened so that the absence of valid psychological constructs within the ergonomic approach can be rectified, thereby offering the necessary means for generalisation. The basic issue is what system is under study. Following Cook and Campbell's (1979) discussion on validity, it should be noted that the time dimension does not have to be the sole parameter that affects the development of the function between two situations; it may be the situational or the
populational dimensions, or an interaction between some combination of the three, that produces the changes between the first and second states. The question concerns what the difference is between the original empirical corroboration of the fact and the generalisation: Is it the nature of the system or is it the parameters given in the definition of the system? With an underspecified systems definition it is impossible to determine where the boundary between the system under study and other systems is drawn. The approach suggested in this paper is based on systems theory (Katz & Kahn, 1978). The systems approach requires a description of boundary conditions, the interaction between the system and its context, and the structure of the dynamic transformation process within the system with an emphasis on regulatory feed-back loops.
Systems analysis—state of the art
From the perspective of Rasmussen (1993) and others, there seems to be little controversy in stating that the major inadequacy in contemporary human factors applications is the lack of explicit systems analysis. Practitioners approach the task without paying attention to basic company goals, functions and processes, which are crucial in order to identify existing options for change.
Possible theoretical specifications
The question is how an approach should be broadened in order to meet current constraints. The suggestion is a multiple levels of analysis approach (Pfeffer, 1985), with the inclusion of the role concept, the incorporation of communication in the analysis of organisational processes, and the use of organisation theoretic concepts such as context, external environment, technology, goals, and organisational structure. The theoretical and methodological framework of organisation theory offers, through these macroergonomic factors (Hendrick, 1995), tools that can readily be used to guide analysis. However, the complex systems under study are not exhaustively defined in terms of principles of organisational structure and behaviour. The study of communication behaviour at work has received little attention in current task analysis approaches. Yet, interpersonal communication is usually an integral part of task behaviour. It is also possible that the constructivistic school can contribute to the analysis of complex tasks. Theoretically, the constructivists develop the idea of a cognitive system and view the individual as a consciously reflecting organism engaged in intelligent social action in which information is selected, transformed and enriched. The constructivists suggest an extension of the role concept which may refine the systems approach and allow the researcher to view social systems as consisting of roles rather than individuals.
Task-related communication—theoretical and methodological implications
To what extent are macro-organisational principles mirrored in the task related communication at work, and to what extent can socially constructed models be inferred from communication data? In the first case it is reasonable to expect a correlation, since the work activities are integrated in an organisational system and the task related discourse should reflect this system. Such communication data can, in principle, be determined relatively objectively, since they consist of technological, administrative, and other tangible concepts. On the other
hand, it may be problematic to interrelate different levels of analysis—to integrate macro and micro perspectives. Since the focus is on invariant features of behaviour, such an analysis does not account for the fact that there are different subjective perspectives. Methodologically, this criticism is one of the major contributions from the constructivists, i.e. the methods used to study these processes must be sensitive to individual variation and account for subjective data. However, in the constructivistic tradition socially constructed concepts and models are inferred from discursive data. It is an inductive approach with distinct problems of unique, subjective descriptions and interpretations, resulting in difficulties when generalising across cases. In an explorative approach to studying communication in complex tasks, it would be safer to anchor the analysis in well-defined categories that follow a preliminary task description, governed by system concepts and organisation principles, than to rely on a constructivistic strategy. There are several methodological implications for observation of task related communication. For instance, what kind of research strategy should be employed when the focus is on complex tasks within a real-life context, when the question posed is 'how', and when the goal is to draw valid inferences about the strategies used in task execution? What strategy should be used when the task is characterised by great complexity, when several aspects covary naturally, when the boundaries between the task and its context are not clearly evident, and when it is not possible to gain control over behavioural events and how they unfold over time? What research strategy should be selected when the boundary conditions may not be constant during the sequence of events, and when there are important serial and parallel processes proceeding simultaneously? With reference to these questions, and the previous elementary discussion of different approaches to task analysis, an advanced case study approach (Campbell, 1984), accompanied by controls that guard against the construct and external validity deficiencies connected to the practitioner-based ergonomics approach, seems to be the correct strategy. Opportunities to intentionally replicate case studies should be utilised; that is, the selection of cases to study should be complementary and allow replication. In order to generalise findings analytically, the psychological constructs needed to satisfy the validity issue must be added. If these cannot be found in existing theory they can be anchored in the system analysis. By making analytical rather than statistical generalisations, the replication logic that case studies rely on is equivalent to the logic behind the generation of experimental studies (Yin, 1984).
Content analysis of communication data Communication data are only a sample of evidence needed to infer task related knowledge and strategies. The size of the sample and the degree of biased selection may vary across actors and settings. When using quantitative indicators, this generates problems. Is, for instance, the frequency of utterances monotonically related to the significance of discourse content? An even more complicated issue is the relation between a verbal utterance and the cognitive meaning of the sender and the receiver. On the one hand, the relation can be difficult to determine regarding whether thought is mirrored through speech. In complex settings of real life communication between parties, the validity concept becomes hard to grasp. One sensible way to assess validity for communicative tasks is to use the problem building phase (Simon, 1992) as a reference point, and the circumstantial validity evidence becomes the degree of convergence between preparation and conversation.
On the other hand, the same utterance is very likely to vary in regard to the relative importance that different actors attach to it, both within and across different situations. The interlocutors communicate fragments of their partly common representations: what they regard as important factors and how they perceive the relationship between these factors. The information received is added to existing representations or used to alter priorities or relations within them. Thus, performance in tasks where actors have to interact socially depends not only on the actions of the sender, nor only on the responses from the receiver, but on their joint, situated contribution to the issues discussed. This yields problems in classifying data into the psychologically meaningful categories required to describe and explain the task strategies of the actors. Utterances can be objectively recorded and also classified with high inter-observer agreement as long as well-defined and simple categories are being used. This is the case when the classification follows a preliminary task description governed by system concepts and unequivocal organisation principles.
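By way of illustration only (this is not a method taken from the studies cited here; the categories and codings are invented), agreement between two coders assigning utterances to such simple, well-defined categories can be quantified with a chance-corrected index such as Cohen's kappa:

    # Sketch: chance-corrected inter-observer agreement for utterance coding.
    from collections import Counter

    def cohen_kappa(coder_a, coder_b):
        """Cohen's kappa for two coders over the same sequence of utterances."""
        assert len(coder_a) == len(coder_b)
        n = len(coder_a)
        observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
        freq_a, freq_b = Counter(coder_a), Counter(coder_b)
        expected = sum((freq_a[c] / n) * (freq_b[c] / n)
                       for c in set(coder_a) | set(coder_b))
        return (observed - expected) / (1 - expected)

    # Invented codings of six utterances into three task-description categories.
    coder_a = ["technical", "administrative", "technical", "goal", "technical", "goal"]
    coder_b = ["technical", "administrative", "goal",      "goal", "technical", "goal"]
    print(round(cohen_kappa(coder_a, coder_b), 2))   # about 0.74 for this example

High kappa values would support the claim that the chosen categories are simple and unequivocal enough for objective classification; low values would suggest that the preliminary task description needs sharpening.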
Conclusions
The research strategy suggested in this paper, and applied in two empirical studies by Strangert et al. (1996) and Wikman and Strangert (1996), has been case studies in field settings. Some rather severe theoretical and methodological problems in task analysis have been outlined in this paper, and the suggested approach must be able to handle the deficiencies outlined. This approach does not capture the subjective meaning of the interlocutors. Still, the results from our empirical studies support case-study based task analysis as a promising initial step towards a general explanation of communication in complex tasks.
References
Campbell, D.T. (1984). Foreword. In R. Yin, Case study research: Design and methods. (Beverly Hills: Sage Publications)
Cook, T.D. & Campbell, D.T. (1979). Quasi-experimentation: Design & analysis issues for field settings. (Boston: Houghton Mifflin Company)
Hendrick, H.W. (1995). Future directions in macroergonomics. Ergonomics, 38, 1617–1624
Katz, D. & Kahn, R.L. (1978). The social psychology of organisations. (New York: John Wiley)
Neisser, U. (1976). Cognition and reality: Principles and implications of Cognitive Psychology. (San Francisco: W.H. Freeman & Company)
Rasmussen, J. (1993). Analysis of tasks, activities and work in the field and in laboratories. Le Travail humain, 56, 133–155
Simon, H.A. (1992). What is an explanation of behaviour? Psychological Science, 3, 150–161
Strangert, B., Wikman, J., & Strangert, C. (1996). Analysis of communication strategies in Systems Inspections. (Umeå University, Department of Psychology)
Wickens, C.D. (1992). Engineering Psychology and Human Performance, 2nd ed. (New York: Harper Collins Publishers Inc.)
Wikman, J., & Strangert, B. (1996). Task uncertainty in systems assessment. (Umeå University, Department of Psychology)
Yin, R. (1984). Case study research: Design and methods. (Beverly Hills: Sage Publications)
INFORMATION SYSTEMS
HEALTH AND SAFETY AS THE BASIS FOR SPECIFYING INFORMATION SYSTEMS DESIGN REQUIREMENTS Tom G Gough
Information Systems Research Group School of Computer Studies University of Leeds Leeds, LS2 9JT
Health and safety at work has been a concern over many years most recently in respect of the users of computer-based information systems. However, developers of such systems still seem to see health and safety as someone else’s responsibility. Little attempt has been made to incorporate consideration of health and safety into the information systems design process. An examination of health and safety in the workplace suggests that health and safety issues have implications for all aspects of information systems development and that guidelines on health and safety may provide an appropriate basis for the initial specification of design requirements and lead to the construction of systems which do not put the health and safety of their users at risk.
Introduction Despite the fact that health and safety at work has been a concern over many years, most recently in relation to the use of computing equipment, information systems developers still seem to see health and safety as someone else’s responsibility. Although information systems developers would find it hard to disagree with the contention that one aim of information systems development should be to produce information systems which do not put the health and safety of their users at risk, it is not obvious that much effort has been expended generally to incorporate consideration of health and safety into the literature on the information systems design process. This paper begins with a brief review of the various aspects of health and safety in the workplace to which information systems developers should be paying attention, if the systems they build are to ‘fit safely’ into the context in which they are to be used. It then briefly examines the assertion that insufficient attention is paid to
issues of health and safety by the proponents of various information systems development methodologies. The implementation of the European Union Directives in the UK will be used to provide the design baseline for an exploration of whether guidelines on health and safety could be used as the basis of an initial specification of information systems design requirements. The paper concludes with a preliminary assessment of the implications of such an approach for the theory and practice of information systems development.
Health and Safety—Issues and Implications
Health and safety issues in the workplace may be grouped into four general categories: those associated with the workplace itself; those identified as related to fatigue; those that are stress related; and the risks associated with the use of workstations that incorporate VDUs. Many of these issues are interrelated and need to be addressed together, even if discussion is nominally about a single issue. Most recent attention has been focussed on the problems that are perceived to be associated with workstations and the use of VDUs. This is largely because the growth in the use of VDUs and, in particular, keyboards linked to VDUs, has resulted in an increase in the claimed incidence of repetitive strain injury (RSI). An initial assessment of the health and safety issues outlined above suggests that the implications for information systems design cover the whole of the process from analysis to installation of computer-based information systems, in that the issues identified cover all aspects of the design process if they are to be effectively addressed.
Information Systems Design Methodologies
If the contention that attention needs to be paid to health and safety throughout the information systems design process is true, it would be reasonable to expect that consideration of health and safety issues would be a feature of information systems development methodologies. It is obviously unlikely that any designer of a computer-based information system would set out to build a system that put the health and safety of the users of such a system at risk. However, as noted earlier, little attempt is made to address health and safety in the literature on how to build information systems. Any discussion tends to be fragmentary or even non-existent. This is perhaps to be expected since the literature on both theory and practice leaves the reader with the impression that health and safety is not the responsibility of the information systems designer.
This criticism of insufficient attention to health and safety is valid across the range of information systems methodologies. A review of a sample set of methodologies to support this assertion will be found in Gough (1991). A subsequent review in Gough (1995) showed little improvement over that in Gough (1991), despite the intervening implementation of the EU Directives on health and safety requirements for work with display screen equipment. There is little sign of improvement despite the fact that there has been legislation in place to implement the Directives and that there has been a considerable increase in the number of reported cases of 'RSI'. For example, not one of some 40 papers at a recent conference on information systems methodologies (Jayaratna and Fitzgerald, 1996) addressed to “…issues ranging across the spectrum from social to technical concerns” identifies health and safety as a key issue. Smith (1997) contains no more than a brief reference to health and safety despite its focus on “the major user issues in information systems development”. In a text aimed at providing a foundation course in information systems (Avison and Shah, 1997), there appears to be no reference to health and safety. Robson (1997) appears simply to repeat the limited discussion in the first edition (Robson, 1994), suggesting that the author sees no need to encourage information systems developers to regard health and safety as integral to the information systems design process. These three brief reviews, with their illustrative examples, are offered to support the claim that the information systems development community in general sees health and safety as not the responsibility of the information systems designer in terms of the process of design.
The Setting and Application of Minimum Standards
The European Union Directives (agreed on 29 May 1990), incorporated into Article 118A of the Treaty of Rome, were aimed at improving the health and safety of workers in the workplace and at ensuring that all Member States have comprehensive and comparable health and safety legislation. In the UK the Directives were implemented via new regulations and codes of practice issued under the Health and Safety at Work etc Act 1974 by the Health and Safety Commission (1992), supported by Guidance on the Regulations from the Health and Safety Executive, of which that on Display Screen Equipment (Health and Safety Executive, 1992) has particular relevance to the information systems design process. The 'Guidance' (Health and Safety Executive, 1992) covers all nine regulations addressed to display screen equipment as well as the Schedule (which draws on the minimum requirements set out in the relevant European Union Directive) and an Annex containing detailed guidance on workstation minimum requirements. This detailed guidance is concerned with both the physical characteristics of
the workplace and the workstation, and with the systems to be used by the workstation user. Increasing emphasis is rightly being placed on the importance of all the people engaged in the information systems development process, and it is now widely recognised that 'people factors' have a greater influence on the success or failure of computer-based information systems than technical factors. However, this recognition of the centrality of the potential user of information systems has not yet been reflected in the advice provided by the proponents of the range of information systems development methodologies. These are still focussed on the data, the processes or the task, the latter in terms of task content rather than the implications of the 'doing' of the task for the person engaged in it. The continuation of these approaches, even if supported by increased user involvement (as widely advocated), seems unlikely to reduce the incidence of the problems associated with the use of VDUs. Similarly, more attention is being paid to the importance of the organisational context in the effective operation of computer-based information systems. Little attention is addressed to the physical context in which such systems operate, apart from the occasional discussion of the physical components of the workstation. Continuing to ignore the other aspects of the environment in which users work is likely to lead to users continuing to experience VDU-related problems, since many health and safety issues are interrelated and all of them ought to be addressed within the information systems design process. Using the advice offered by the Health and Safety Executive (1992) as a template for the initial specification of requirements, and as a basis for the implementation and installation of the resulting information system, would offer the opportunity to build 'better' systems. If the advice on the health and safety issues were followed within the specification process, the resulting system built to meet such a specification could reasonably be expected not to put the health and safety of its users at risk. The arguments in favour of the efficacy of such an approach are the same as those for ensuring that audit and security provisions are integral to information system specification rather than less effective (and more expensive?) later post-implementation additions. Insisting that the physical context is given equal weight with the organisational context would require designers to be aware of the physical implications of their design for their clients in the workplace, not merely as a visual representation on a screen. Designers would need to re-acquire the broader understanding of the systems environment which was lost with the adoption of a narrow technical systems analysis approach in the late sixties and has never been fully recovered. Designers would then be in a good position to assist their clients with advice on making the physical environment one which did not put the health and safety of its occupants at avoidable risk.
Conclusion
Health and safety issues still appear to the majority of information systems designers to be someone else's responsibility, and the long-standing concern about health and safety in the workplace, underlined by the recent legislation in relation to the risks associated with the use of computer-based information systems, does not seem to be reflected in the theory and practice of information systems. If, however, information systems designers were to use the advice available in support of the recent legislation as the starting point for requirements specification, as input to the implementation of such specifications, and as part of the planning for the installation of the resulting systems, such systems would contribute to the production of healthier and safer workplaces and reduce the incidence of the problems associated with VDU usage.
References
Avison, D.E. and Shah, H.U. 1997, The Information Systems Development Life Cycle; A First Course in Information Systems, (McGraw-Hill, London).
Gough, T.G. 1991, Health and Safety Legislation—Threat or Opportunity. In M.C. Jackson, G.J. Mansell, R.L. Flood, R.B. Blackham and S.Y.E. Probert (eds.) Systems Thinking in Europe, (Plenum Press, New York), 171–175.
Gough, T.G. 1995, Health and Safety Legislation Implications for Job Design. In S.A. Robertson (ed.) Contemporary Ergonomics 1995, (Taylor and Francis, London), 446–450.
Health and Safety Commission 1992, Approved Code of Practice—Workplace (Health, Safety and Welfare) Regulations 1992, (HMSO, London).
Jayaratna, N. and Fitzgerald, B. (eds.) 1996, Lessons Learned from the Use of Methodologies, (British Computer Society and University College Cork, Cork).
Robson, W. 1994, Strategic Management and Information Systems—An Integrated Approach, (Pitman Publishing, London).
Robson, W. 1997, Strategic Management and Information Systems—An Integrated Approach, Second Edition, (Pitman Publishing, London).
Smith, A. 1997, Human-Computer Factors: A Study of Users and Information Systems, (McGraw-Hill, London).
COGNITIVE ALGORITHMS Ronald Huston, Richard Shell, and Ashraf Genaidy
Department of Mechanical, Industrial and Nuclear Engineering University of Cincinnati, Cincinnati, OH 45221–0116, USA
This paper presents a new procedure, using cognitive algorithms, for studying complex human-based systems. The procedure uses fuzzy logic, linguistic variables, and participant (worker) expertise to establish the fundamental characteristics of the systems. These characteristics then provide for the identification of system norms which in turn lead to standards and a basis for system modification and optimization.
Introduction
The traditional methods of characterizing, evaluating, and regulating work systems have become increasingly ineffective. They simply cannot cope with the complexity brought on by technological advances and higher productivity demands. There is a need for new intelligent systems which can effectively relate to these increased complexities. The objective of this paper is to provide a basis for the development of such systems. This is accomplished through a discussion of the following concepts: system complexity; cognitive algorithms; and experiments.
System Complexity
Definition
The concept of complex systems has been succinctly stated by Lewin (1993) as: “the phenomena that out of the interaction of individual components at a local level emerges a global property which feeds back to influence the behavior of the individual components”. For example, the interaction of species within an ecosystem might confer a degree of stability within it. Stability in this context is an emergent property. As another example, in economics, the aggregate interaction of manufacturers, distributors, marketers, consumers, and financiers forms a modern capitalistic system. A more abstract example is the relation between knowledge and wisdom. Here knowledge is analogous to the “local interaction” and wisdom to the
“emergent global property”. In this context, the interaction of different entities of knowledge gives rise to wisdom, which utilizes the knowledge base. Such utilization in turn expands the knowledge base, thus producing a dynamic interaction contributing to system complexity.
Domains of Complexity
Weaver (1948) characterized complexity in terms of three regions: 1) organized simplicity; 2) disorganized complexity; and 3) organized complexity. “Organized simplicity” describes systems with only a few variables (or parameters)—usually only one, two, or three—and a high degree of determinism. Such systems dominated analyses in the physical sciences prior to 1900. An example is Newtonian mechanics of one or two particles. At the other extreme, “disorganized complexity” describes systems with very large numbers of variables, regarded as random variables. To study such systems analysts have formulated various probabilistic and statistical theories. According to Weaver, however, there exists a great middle region between these extremes which he categorized as “organized complexity”. In this region there is a sizable number of variables which are integrated into an organized whole. To study these systems analysts have developed some relatively new techniques including operations research, linear programming, integer programming, fuzzy logic, neural networks, bond graphs, chaos theories, and genetic algorithms. Our contention is that organized complexity can be expanded to include the two extremes. Specifically, organized simplicity is a special case of organized complexity, and it is questionable that “randomness” even exists in the physical world. In 1992 Kosko strongly questioned the presence of randomness in the real world, indicating that uncertainty aspects of complexity are deterministic in nature. Alternatively, Prigogine (1985) suggests that reality lies somewhere between determinism and randomness.
Use of Fuzzy Logic and Linguistic Variables
One method of measuring complexity is to employ the principles of fuzzy logic and fuzzy sets developed by Zadeh (1965) and others. Zadeh envisioned the application of fuzzy sets to human-based systems such as economic and ergonomic systems (as in the research here). Fuzzy logic is based upon a premise of “incompatibility” which states that “as the complexity of a system increases, our ability to make precise and yet significant statements about its behavior diminishes until a threshold is reached beyond which precision and significance (or relevance) become almost mutually exclusive” (Zadeh, 1973). Fuzzy logic employs linguistic variables to describe the contents of a fuzzy set. According to Zadeh (1975) a linguistic variable is an entity whose values are not numerical but are instead qualitative words or phrases in a natural language. The use of words or phrases has the advantage of being less specific than numerical variables and thus easier to apply when describing complex systems. Linguistic characterization of system complexity is especially useful with human-based systems since people tend to think and act in such descriptive terms as
“high and low”, “hot and cold”, “far and near”, “fast and slow”, “heavy and light”, etc. In an application of these concepts Tichauer (1978) transformed numerical lifting values into ranges defined by linguistic variables. This procedure arguably led to a more meaningful interpretation of results.
Cognitive Algorithms
A cognitive algorithm is a mental procedure for solving a problem. It is a product of human thinking and reasoning about a problem. Humans think in “analog” instead of “digital” terms, using “continuous” instead of “discrete” variables. This means that information is processed in “chunks” or “segments” as opposed to specific data. The edges or boundaries of these chunks of information are often ill defined, but this allows the application of fuzzy set theory as exposited by Zadeh (1973). Our thesis is that cognitive algorithms can be used in solving problems with complex systems. The best resource for establishing these algorithms is people with “expertise” in the system being studied. We discuss expertise and the derivation of the algorithms in the following two subsections.
Expertise
Perhaps the major distinction between experts and non-experts (or novices) in a given field is that experts confront problems with “skill-based” behavior whereas novices confront problems with “knowledge-based” behavior. That is, experts will solve routine problems almost subconsciously, or automatically, using skills acquired by repeatedly solving the same type of problems. Alternatively, novices approach the same problems consciously, relying on whatever knowledge they may have about the problem and its solution. When asked to describe this skill-based approach to problems, experts will typically respond by saying, “I don’t know. I just do it.” This does not mean that experts approach problems capriciously. It simply means that the expert has solved the problem at hand so many times that the knowledge required is stored in patterns in long-term memory, which is retrievable without conscious mental effort.
Development of Cognitive Algorithms
The “input” and “output” variables of cognitive algorithms will be linguistic variables. These variables must be capable of ranging between the extremes of all possible variables and states encountered in the workplace (for example “light” to “heavy”). The algorithms will convert the input or independent variables into the desired output or dependent variable. Within this context, a cognitive algorithm is defined as an aggregate of linguistic operations which establishes a rational relationship between the input and output variables. For example, the dependent variable might correspond to the safety behaviors essential to minimizing the risk of adverse health effects during performance of various tasks. Experts would then enable the algorithm development through their experimental assessment of the linguistic variables.
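A minimal sketch of such an algorithm is given below. It is not the authors' implementation: the membership ranges, the rule table, and the numerical example are invented, and only illustrate how numeric workplace measurements could be fuzzified into linguistic values and then mapped to a linguistic output by an expert-elicited rule table.

    # Sketch of a cognitive algorithm built from linguistic variables.
    # Membership ranges (kg, cm) and the rule table are assumptions for illustration.
    def triangular(x, left, peak, right):
        """Degree of membership of x in a triangular fuzzy set."""
        if x <= left or x >= right:
            return 0.0
        return (x - left) / (peak - left) if x <= peak else (right - x) / (right - peak)

    LOAD_SETS = {"light": (0, 5, 10), "medium": (5, 15, 25), "heavy": (20, 25, 30)}   # kg
    DIST_SETS = {"close": (0, 15, 30), "medium": (20, 45, 70), "far": (60, 75, 90)}   # cm

    def linguistic_value(x, sets):
        """Label whose fuzzy set gives x the highest membership."""
        return max(sets, key=lambda label: triangular(x, *sets[label]))

    # Expert-derived rule table: (load, distance) -> lifting effort (invented values).
    RULES = {
        ("light", "close"): "very low",   ("light", "medium"): "low",       ("light", "far"): "moderate",
        ("medium", "close"): "low",       ("medium", "medium"): "moderate", ("medium", "far"): "high",
        ("heavy", "close"): "moderate",   ("heavy", "medium"): "high",      ("heavy", "far"): "very high",
    }

    def lifting_effort(load_kg, distance_cm):
        key = (linguistic_value(load_kg, LOAD_SETS), linguistic_value(distance_cm, DIST_SETS))
        return RULES[key]

    print(lifting_effort(8.0, 65.0))   # a "light" load at a "far" distance -> "moderate" here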
It should be noted that, during the experimental assessment, information regarding input variables should not be randomized when presented to the experts. Instead the information should be presented in an organized fashion, as in ascending or descending order. Expressed more succinctly: “Don’t randomize, organize.”
Experimental Validation
An experiment was conducted on thirty male workers engaged in infrequent lifting. Twenty-nine of the workers had at least five years of experience (one had forty years), but one worker had only six months of experience. Each worker was asked, based on his knowledge and experience, to assess the effects of load and horizontal distance on lifting effort for three heights of lift. The input variables were then the loads and the horizontal distances. Each of these variables was assigned three values. The load values were “light”, “medium”, and “heavy”. The distance values were “close”, “medium”, and “far”. The output variable was assigned one of nine linguistic values. The results showed that lifting effort increased with an increase in either load or horizontal distance. These findings are consistent with those reported in the ergonomics literature. The results of this preliminary study indicated that the load effect on lifting effort is more pronounced than that of horizontal distance, although in a few cases workers recognized both input variables as equally important.
Conclusions
To summarize, the concept of cognitive algorithms with the use of linguistic variables provides a means of studying a wide variety of complex human-based systems. The lifting experiments validate the approach, but more needs to be done before the procedure is firmly established. Specifically, a system of linguistic mathematics needs to be developed and implemented to produce more precise relationships between the input and output variables, incorporating the nonlinear effects occurring at extreme values of the input variables.
References
Kosko, B., 1992, Neural Networks and Fuzzy Systems—A Dynamical Systems Approach to Machine Intelligence, (Prentice-Hall, Englewood Cliffs, New Jersey).
Lewin, R., 1993, Complexity: Life at the Edge of Chaos, (Macmillan Publishing Company, New York).
Prigogine, I., 1985, New perspectives on complexity. In The Science and Praxis of Complexity, (The United Nations University, Tokyo), 107–118.
Tichauer, E., 1978, The Biomechanical Basis of Ergonomics: Anatomy Applied to the Design of Work Situations, (Wiley-Interscience Publication, New York).
Weaver, W., 1948, Science and complexity, American Scientist, 36, 536–544.
Zadeh, L.A., 1965, Fuzzy sets, Information and Control, 8, 338–353.
Zadeh, L.A., 1973, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on Systems, Man and Cybernetics, SMC-3, 28–44.
Zadeh, L.A., 1975, The concept of a linguistic variable and its application to approximate reasoning—I, Information Sciences, 8, 199–249.
DESIGN METHODS
RAPID PROTOTYPING IN FOAM OF 3D ANTHROPOMETRIC COMPUTER MODELS IN FUNCTIONAL POSTURES Siel Peijs, Johan J.Broek and Pyter N.Hoekstra
Delft University of Technology Faculty of Design, Engineering and Production Subfaculty of Industrial Design Engineering Jaffalaan 9, 2628 BX Delft, the Netherlands
This paper describes the first-stage results of a feasibility study of rapid prototyping in foam of 3D anthropometric computer models in functional postures. The following three phases will be discussed: first, the creation of a list of surface points that simulate the anthropometric model's outside geometry; next, the conversion of these points into B-spline descriptions of the model's constituent body-members; and thirdly, the input of these descriptions into the available CAD/CAM software to realize the foam milled model. We will briefly discuss possible next steps that could lead from the automatic creation of anthropometric computer models using 3D anthropometric whole body scans of subjects in standardized postures to the rapid prototyping of these models in functional postures.
Introduction
The last ten years have seen the birth and subsequent growth and flowering of a new scientific discipline and accompanying technology: rapid prototyping. Rapid prototyping can briefly be described as the fabrication of a physical model directly from a 3D CAD design. Where the production of a prototype with classical techniques might readily take a few weeks, this can now be shortened to days or even hours: hence the term 'rapid' prototyping. At the moment two main techniques can be discerned. First there is LMT: Layer Manufacturing Technique (additive fabrication, e.g. Stereolithography), where the physical model is built by adding material layer by layer. The second technique (subtractive fabrication such as milling, cutting etc.) is more closely associated with classical techniques in that it removes material step by step. This last technique is used at the Subfaculty of Industrial Design Engineering in the SRP (Sculpturing Robot Project), see e.g. Vergeest and Tangelder (1996), where a CAD/CAM software controlled six degree-of-freedom foam milling robot is applied for the rapid prototyping of free form surfaces. Since ADAPS (Anthropometric Design Assessment Program System) is also available to us to visualize and manipulate 3D anthropometric models on a computer screen into
workspace related functional postures, a feasibility study of the linkage of both projects was carried out. The result—a foam milled anthropometric model (or selected body parts)—might be useful for presentation purposes and for highlighting Man Product Interaction.
Conversion
The following is quite straightforward. What we have is one of the 3D computer models of ADAPS, visualized on a display in a workspace related functional posture. What we would like to end with is an anthropometric model in foam. In between we have a number of conversion steps that transfer information about the anthropometric model's outside geometry, via the definition of surfaces of enclosed body parts, into the data needed to calculate the milling paths for the robot tool.
The ADAPS Model An ADAPS-model consists of a set of linear branched chains, containing twenty-five links or body members. Relative to these links we define surface-points which, together with a number of lines between these points, determine the outside geometry of the model (see Figure 1). A link’s orientation is defined by its joint-angles relative to the connecting, more proximal link. Changing the orientation of a link will change the absolute position of the surface-points in space (while keeping constant their relative position to the link) and in this way the model’s outside shape. We should note here that the lines used to display the anthropometric model only simulate a 3D entity but do not, as such, automatically form well defined boundaries of enclosed volumes (see e.g. the regions of the model’s elbows in Figure 2).
Figure 1. Schematic representation of an ADAPS-model via links, surface points and connecting lines
Figure 2. Lines do not enclose volumes in the regions of e.g. the model’s elbows
We will come back to this in the next paragraph. All we have to do now in transferring available surface geometry information of an ADAPS-model to the next phase is the production of a list of co-ordinates of selected surface-points (and implicitly the knowledge of what lines should connect them).
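The sketch below illustrates this idea in code. It is not the ADAPS implementation: the two-link chain, its dimensions and its surface-points are invented, and a single rotation axis is used for brevity. It only shows how surface-points stored relative to a link acquire new absolute positions when the joint angles of the chain change, producing the list of co-ordinates handed to the next phase.

    # Sketch: link-relative surface-points expressed in world co-ordinates for a
    # given set of joint angles.  Chain, lengths and points are invented.
    import numpy as np

    def rot_z(angle_rad):
        c, s = np.cos(angle_rad), np.sin(angle_rad)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    def absolute_surface_points(joint_angles, link_lengths, relative_points):
        """Walk down a simple serial chain and return every surface-point in world co-ordinates."""
        origin, orientation = np.zeros(3), np.eye(3)
        world_points = []
        for angle, length, points in zip(joint_angles, link_lengths, relative_points):
            orientation = orientation @ rot_z(angle)      # orientation relative to the proximal link
            world_points += [origin + orientation @ p for p in points]
            origin = origin + orientation @ np.array([length, 0.0, 0.0])   # distal end of this link
        return world_points

    # Two links ("upper arm", "forearm"), each with two link-relative surface-points (metres).
    relative = [[np.array([0.10, 0.04, 0.0]), np.array([0.20, -0.04, 0.0])],
                [np.array([0.05, 0.03, 0.0]), np.array([0.15, -0.03, 0.0])]]
    for p in absolute_surface_points([np.pi / 4, -np.pi / 6], [0.30, 0.25], relative):
        print(np.round(p, 3))   # the list of co-ordinates passed on to the surface-definition phase

Changing either joint angle changes only the world positions printed here; the link-relative co-ordinates themselves stay fixed, which is exactly how a posture change alters the model's outside shape.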
Surfaces
The next phase is the construction of surfaces from these ADAPS coordinates with the objective to recreate (as closely as possible) the original ADAPS-model's geometry. The definition of 3D surfaces is handled using B-splines. Parameterized B-spline (NURBS) surfaces are defined by the input of the X-, Y- and Z-coordinates of 'control' points in two directions. These points form a 3D rectangular web of rows and columns. The control points for the surface part HEAD, for example, are thus selected by the ADAPS coordinates of 5 times 6 points (see Figure 3).
•
Figure 3. Definition of control points in two directions ( ) for surface part HEAD
•,
The conversion that is now needed consists of a simple transfer-table linking the point numbering in the ADAPS-description with the numbering of the control points. In this way various surfaces (TORSO, L-ARM, R-ARM, L-LEG, R-LEG etc.) can be defined that represent the constituent body-members of an ADAPS model. We mentioned earlier that the lines used in visualizing an ADAPS-model do not always form the boundaries of an enclosed volume. Care therefore had to be taken in the definition of the surfaces to ensure that all adjacent surfaces together form a closed object (so that it can be milled). Extra surfaces had to be defined in the pelvic region to close a possible gap between the torso and the legs. Since the number of surfaces is not a limiting factor, the choice was made to model the nose and the breasts separately, making the modelling of the head and torso easier.
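The transfer-table idea can be sketched as follows. The point numbers and co-ordinates below are invented for the example (the real ADAPS numbering is not reproduced here); the sketch only shows how a 5 times 6 transfer-table turns a dictionary of ADAPS co-ordinates into the rectangular web of control points that defines one surface part.

```python
import numpy as np

# ADAPS surface-point co-ordinates produced in the previous phase, keyed by
# point number.  The numbers and positions here are invented for illustration.
adaps_points = {n: (np.cos(n / 10.0), np.sin(n / 10.0), n * 0.01)
                for n in range(100, 160)}

# Transfer-table for the surface part HEAD: a 5 x 6 grid, each cell holding
# the ADAPS point number that supplies the corresponding control point.
HEAD_TRANSFER = [
    [100, 101, 102, 103, 104, 105],
    [110, 111, 112, 113, 114, 115],
    [120, 121, 122, 123, 124, 125],
    [130, 131, 132, 133, 134, 135],
    [140, 141, 142, 143, 144, 145],
]

def control_net(transfer_table, points):
    """Return a (rows, cols, 3) array of control-point co-ordinates: the
    rectangular web of rows and columns that defines one B-spline surface."""
    return np.array([[points[n] for n in row] for row in transfer_table])

head_net = control_net(HEAD_TRANSFER, adaps_points)
print(head_net.shape)  # (5, 6, 3): ready to feed into a NURBS surface definition
```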
SIPSURF The next phase links the defined surfaces and prepares the data needed for the milling process. This we can accomplish by using the SIPSURF software that was developed in the Sculpturing Robot Project SRP. SIPSURF—short for Simple Interactive Program for SURFaces—allows the definition of the various surfaces and their relation to each other. It enables visualizing the results as a 3D web, as a surface or as a rendering. Geometric errors can be detected and readily repaired. When everything is satisfactory a module is started that automatically calculates the trajectories for the milling robot. The milling can now be left to the hardware: a combination of
a turntable and a six degree-of-freedom foam milling robot. In this way we could start with an ADAPS model as displayed on a computer screen (Figure 4) and end with the realized object in foam as depicted in Figure 5. Foam models of 30 cm height were realized in two milling resolutions. Corresponding milling times are given in Table 1.
Figure 4. ADAPS model as displayed on the screen
Figure 5. Realized model in foam with milling resolution of 0,5 mm
Table 1. Milling resolutions and milling times
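SIPSURF and its trajectory module are not reproduced here. Purely as an illustration of the kind of calculation that the module automates, the sketch below generates a simple zig-zag (raster) tool path over a placeholder parametric surface at a chosen step-over resolution, showing how moving from a coarse step to 0.5 mm multiplies the number of tool positions. The surface function, the 300 mm size and the step values are assumptions for the example, not the settings actually used in the project.

```python
import numpy as np

def surface(u, v):
    """Placeholder parametric surface f(u, v) -> (x, y, z) in mm; in the real
    system this would be one of the closed B-spline surfaces of the model."""
    return np.array([300.0 * u, 300.0 * v, 6.0 * np.sin(8 * u) * np.cos(6 * v)])

def raster_toolpath(step_mm, size_mm=300.0):
    """Zig-zag (raster) path over the parameter domain with a step-over of
    step_mm on a model of size_mm height; returns a list of tool positions."""
    n = int(size_mm / step_mm) + 1
    path = []
    for i in range(n):
        v = i / (n - 1)
        cols = range(n) if i % 2 == 0 else reversed(range(n))
        for j in cols:                     # alternate direction on each pass
            path.append(surface(j / (n - 1), v))
    return path

coarse = raster_toolpath(step_mm=2.0)   # coarser resolution: fewer tool positions
fine = raster_toolpath(step_mm=0.5)     # 0.5 mm resolution: ~16 times as many
print(len(coarse), len(fine))
```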
Discussion This feasibility study has shown that the linkage between an ADAPS-model in a functional posture displayed on a computer screen and a realized object in foam is certainly possible. Some remarks, however, have to be made.
- Exerted milling forces in combination with the stiffness of the foam material used (including the scale of the foam model and the mill resolution) resulted in the need for an extra supporting surface under the foam model’s left foot (as may be seen in Figure 5); otherwise the foot could break.
- ‘Filling the gaps’, e.g. between the torso and the legs, is at the moment an ad-hoc procedure depending on the specific functional posture of the ADAPS model. Special closing techniques like blending and filleting, however, are available and can be incorporated at a later stage.
- At the moment there is a tiny gap between the surface model’s breasts and the torso. This is not noticeable in the foam model (the mill diameter is much larger than the gap) but it will have to be remedied when producing foam models of larger size. We end this paragraph with the following: there is ongoing research in the area of 3D Surface Anthropometry using Whole Body Surface Scans of subjects in standardized postures—see e.g. HQL (1995), Daanen (1995), CARD (1996) or Robinette and Daanen (1996). As stated elsewhere (Hoekstra, 1997), ergonomists are especially interested in predictions of real-life situations. Much research is still needed to derive arbitrary, but functional, postures from scan data of only a few standardized postures. The possibility of realizing true-to-life human models (or body parts) in foam—to be incorporated within workspace mock-ups or product prototypes—might help in highlighting human workspace/product interaction issues.
Conclusions In this study we have found that rapid prototyping in foam of 3D anthropometric computer models in functional postures is certainly feasible. Via successive steps of surface geometry extraction, the definition of parameterized (NURBS) B-spline surfaces and the automatic calculation of the tool-path trajectories for a foam milling robot, models in two milling resolutions were realized. Although much work is still needed in fully automating the entire process, this ongoing project seems very promising in highlighting human workspace interactions to the designer or ergonomist.
Acknowledgments We would like to express our gratitude to Bram de Smit for his valuable help with the production of the foam milled models and to Henk Lok for help with the illustrations.
References CARD, 1996, Computerized Anthropometric Research and Design Lab, Fitts Human Engineering Division of Armstrong Laboratories at Wright-Patterson Air Force Base, Ohio, USA. http://www.al.wpafb.af.mil/~cardlab/ Daanen, H.A.M. 1995, 3D-Oppervlakte Antropometrie, Ned. Mil. Geneesk. Tijdschrift, 48, 171–178, e-mail: [email protected] Hoekstra, P.N. 1997, On Postures, Percentiles and 3D Surface Anthropometry. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor & Francis, London), 130–135 HQL, 1995, Research Institute of Human Engineering for Quality Life, Osaka, Japan. http://www.hql.or.jp/eng/index.html Robinette, K.M. and Daanen, H.A.M. 1996, CEASAR Proposed Business Plan, Armstrong Lab. (AFMC), Wright-Patterson AFB, Ohio, USA, personal communication Vergeest, J.S.M. and Tangelder, J.W.H. 1996, Robot machines Rapid Prototype, Industrial Robot, 23(5), 17–20
The use of high and low level prototyping methods for product user interfaces
John V H Bonner* and Paul Van Schaik+
*Institute of Design, +School of Social Sciences, Teesside University, Middlesbrough TS1 3BA
As part of a research project investigating the development of novel user interfaces for consumer products, two different types of design and evaluation methods were appraised. These two methods were high and low level software-based prototyping of novel user interfaces for two types of consumer product. In the first study, we used high level, interactive prototypes with a high level of functionality and visual fidelity. With these prototypes, we conducted a structured series of formal evaluation methods with thirty subjects. The second study was less structured, with a mixture of low level, local prototypes representing object and behavioural elements of a product interface. These were tested using much smaller groups of subjects. This paper compares and contrasts these methods, presents some specific advantages and disadvantages of each, and outlines problems associated with both approaches.
Introduction The use of interface prototypes to test interactivity is advocated as an integral and vital part of the development process (Gould and Lewis, 1985; Hix and Hartson, 1993; Newman and Lamming, 1995). Prototypes can be developed to different degrees of fidelity. Typically they are either non-interactive, and generally paper based, or interactive. Interactive prototypes represent different levels of fidelity and have been defined here as either low level (where a local or specific element of the interface design is produced) or high fidelity/high level (where all or most of the functionality, and often the form, of the interface is fully represented). The use of non-interactive prototypes, which can also be defined as scenario tools (van Harmelen, 1989; Carroll and Rosson, 1992), is not discussed here, although they do have a significant role, particularly in the early conceptual stages of an interface design process. This paper specifically examines the role of high and low level interactive prototypes and explores the advantages and disadvantages of both approaches, along with observations made during the development of a range of novel interfaces for a washing machine and a microwave oven. This research forms part of an EPSRC funded project to develop guidelines and design tools for the development of novel interfaces for consumer products. The exploration of the development of novel interfaces is important as consumer products become more complicated through technological convergence with computers and telecommunications. The scope of the project was initially limited to the development of design guidelines, but our research has identified the need for design tools far earlier in the development process, where
guidelines have not proved to provide effective support (Henninger et al 1995). One important design tool that we have therefore examined is the use of software-based prototyping for development and evaluation. Two case studies are discussed below.
High level prototypes In the first study, three types of novel interfaces were proposed. These were based on a range of criteria including user requirements information on conventional consumer products, gained from focus group sessions conducted by one of the collaborating partners, along with technical developments that were anticipated by both of the industrial collaborating partners. The novel interfaces were:
• Animated Object Display. This dialogue style used graphic representations of three washing programming parameters (spin speed, fabric type and wash temperature) as an animated object which could be altered using conventional toggle switches to change the colour, shape and speed of the animated object. This type of interaction dialogue has previously been used to represent aircraft flight information (Wickens and Andre 1988).
• Drag and Drop. This dialogue style was selected to examine the use of finger-controlled ‘direct manipulation’ (dragging and dropping icons on a display panel to ‘design or build’ a washing programme).
• Auditory Display. Auditory information plays an important role in many mechanically based products but is under-utilised in many computer based products. Therefore, we were keen to see if the addition of auditory displays for functions such as washing cycle status and control feedback would increase the user’s understanding of product functionality.
The purpose of the study was two-fold: firstly, to explore the effectiveness of a range of design and evaluation methods; and secondly, to derive design guidelines related to the different interaction styles. It is the former objective that is of concern here. One of the design methods under review was the effectiveness of prototyping as a design tool to develop novel interfaces whilst also identifying usability problems. The level of novelty of the proposed interfaces posed an interesting dilemma. In order to obtain meaningful feedback from any user trials the novel interaction styles needed to be set in context. Allowing users to explore novel interaction styles without contextual information would not identify the correct type of usability problems. Furthermore, the type of evaluation methods we wanted to validate required that the user developed a comprehensive understanding of the different novel interfaces. The prototypes, therefore, needed, to some extent, to be representative and recognisable as a washing machine control panel or interface. This suggested that a high level prototype would be most appropriate. The prototypes were developed using Macromedia Director (v 4.0), a multimedia authoring software package. An experienced prototype developer/designer was used, and initial concepts were developed using pen and paper during discussions with one of the collaborating partners and then translated into an interactive prototype. The prototype developer made notes during the development process and recorded any problems or design decisions needing to be made. In light of this process, the following observations were made.
Advantages Using a high level prototype proved to be excellent for addressing usability problems from a top-down perspective. The level of novelty within the different interaction styles prompted a much larger number of design issues than anticipated, which needed to be resolved. However, we found very little published interface design guidance which could assist or support the decision-making process. We found little evidence during the user trials to suggest that users did not accept the embodiment of the prototype. This was established using a method known as ‘Teach-Back’ (van der Veer, 1990). Feedback from the users suggested that the prototype itself
did not obstruct their acceptance of the concepts presented to them. As the functionality of the interfaces was largely complete, we could perform keystroke data-logging. This allowed us to set a wide variety of tasks and monitor performance measures, allowing quite subtle differences in behavioural activity between the different interfaces to be measured. Finally, the high level prototypes created credibility for the development work; whilst this does not have any direct impact on the development process, a good visualisation of the design problem instilled confidence in the design proposals.
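The keystroke data-logging mentioned above can be illustrated with a small sketch. The prototypes themselves were built in Macromedia Director, so the Python below is only a language-neutral illustration; the event names, the ‘set_wash’ task and the measures are hypothetical, and the sketch shows no more than how timestamped events support task-time and error-count measures.

```python
import time

class InteractionLogger:
    """Timestamped log of interface events for one evaluation session."""
    def __init__(self):
        self.events = []

    def log(self, task, event, target=None):
        self.events.append({"t": time.time(), "task": task,
                            "event": event, "target": target})

    def task_time(self, task):
        """Elapsed time between the first and last logged event of a task."""
        ts = [e["t"] for e in self.events if e["task"] == task]
        return max(ts) - min(ts) if ts else 0.0

    def error_count(self, task):
        return sum(1 for e in self.events
                   if e["task"] == task and e["event"] == "error")

# Illustration: one subject programming a wash cycle on the prototype panel.
log = InteractionLogger()
log.log("set_wash", "press", "fabric_type")
log.log("set_wash", "press", "temperature")
log.log("set_wash", "error", "spin_speed")   # wrong control touched
log.log("set_wash", "press", "spin_speed")
log.log("set_wash", "press", "start")
print(log.task_time("set_wash"), log.error_count("set_wash"))
```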
Disadvantages One of the primary objectives of the high level prototypes was to arrive at a working interpretation of the dialogue styles, in context, as quickly as possible, so that the underlying design concepts could be assessed. This caused the most significant problem during the development process. As usability problems arose, design decisions had to be made to overcome them. The literature proved inadequate in many situations in providing an indication of where a solution might be sought. Most design problems, therefore, could only be resolved by conducting many low level prototype studies. However, the difficulty of non-contextual interpretation of the design proposals would then become a problem. An example of this concerned the behavioural properties of icons on the ‘Drag and Drop’ interface. These icons could be picked up by the finger and moved around the touch screen. We had little knowledge of how these should be highlighted, what type of ‘drag’ qualities they should possess, and how they should indicate that they had arrived at the target area. To overcome this, basic heuristic rules (Nielsen, 1993) were used to assess the behavioural characteristics, although in some instances even these were inadequate. Problems identified further down the development process were often more difficult to resolve, purely because a good deal of re-coding would have been required. Some of these problems only manifested themselves when a series of elements of the interfaces became inter-related. The key rationale for developing a high level prototype was to ensure that the interaction dialogues were evaluated in context. However, we found that this advantage was also counterproductive. During the trials subjects were asked to evaluate the novel interfaces against a ‘standard’ washing machine interface (which was also presented in prototype form). We found that users tended to make direct comparisons with their own appliance at home rather than using the ‘standard’ interface as the benchmark. This cast doubt on the validity of some of the findings from the subjective comparative measures that were undertaken.
Low level prototypes Whilst the first study provided a top-down approach to the development of novel interfaces, we decided to use a bottom-up approach in the second study to allow for comparisons between the two. In the second study, one of the novel interaction styles (Drag and Drop) was developed further but using a different application domain. For this study, a novel microwave control panel was developed. The approach was to develop alternative design proposals of the interface using low level prototypes and provide exposure of the conceptual design elements to users, creating more immediate feedback. An existing microwave oven (upon which the novel interface was based) was assessed in terms of its functionality and usability. In discussion with one of the collaborating partners, a new interface was designed using a series of paper prototypes. At this stage a series of interaction design issues emerged and it was agreed that these would be resolved using a set of low level prototypes. These were: alternative methods for selecting icons; presentation styles for animated icons; dragging behaviour for moving icons across the touch panel; making multiple icon selections; and iconic representation. User trials were conducted using very small samples (usually 5–6 subjects) and alternatives were presented in pairs, where the user had to select one of them against three assessment criteria. As the subject groups were so small a paired comparison test could not be
conducted, so design solutions were based purely on frequency counts. An amalgam of these local prototypes was then incorporated into another local prototype which offered more functionality. This prototype is shortly to be evaluated again using small user groups before being developed into a high level prototype.
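With samples of only five or six subjects per trial, selection between paired alternatives reduces to counting choices. The sketch below illustrates that frequency-count decision rule only; the icon names and the data are invented, not taken from the study.

```python
from collections import Counter

def preferred_design(pair_choices):
    """pair_choices: one entry per subject per assessment criterion, naming
    the design chosen from a pair.  With 5-6 subjects no formal paired
    comparison test is possible, so the decision rests on frequency counts."""
    counts = Counter(pair_choices)
    winner, _ = counts.most_common(1)[0]
    return winner, counts

# Invented data: 5 subjects x 3 criteria for one pair of candidate icons.
choices = ["icon_A", "icon_A", "icon_B", "icon_A", "icon_B",
           "icon_A", "icon_B", "icon_A", "icon_A", "icon_A",
           "icon_B", "icon_A", "icon_A", "icon_B", "icon_A"]
print(preferred_design(choices))   # ('icon_A', Counter({'icon_A': 10, 'icon_B': 5}))
```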
Advantages This approach meant that there was less investment in key concepts or design proposals allowing for more diverse ideas to be considered. The ‘fluidity’ of the design concept could remain for a longer period of time. There was also a higher degree of impartiality about the design process where a design problem could be decided through a user trial rather than attempting to use heuristic evaluation methods. The decision making process was generally quick once a format for the development of the prototypes, and the setting up of the user trials, had been established. This approach did prompt other design solutions to emerge and in one case, it was possible to merge two favoured design solutions together as they were mutually compatible. However, we do not know whether there is any confounding interference by adopting this approach.
Disadvantages By identifying a series of design problems and resolving them through low level prototypes, there appeared to be a lack of coherence in the design process. This became more apparent when the relationships between the interface elements were combined. For example, we found that the preferred icon for ‘auto-defrost’ was presented in a static format. Introducing this to the next level of prototyping, where some animation was then being considered, caused problems, as we felt that a different, but less preferred, example would be more appropriate within this broader context. As anticipated in the previous study, we did find a lack of contextual relevance which prevented users from making informed decisions about the design proposals presented using the low level prototypes. One user even abandoned the trials as she did not understand what was expected of her or what the purpose of the trial was. In addition, many subjects hinted that they were making arbitrary decisions about the usability of the different design proposals because of the lack of context. Obviously these types of trials have little reliability and validity, although there is evidence to suggest that small user groups can, if the trials are conducted in the right manner, reveal a high level of usability problems (Thomas, 1996).
General Observations Many of the advantages and disadvantages of high and low level prototyping are well reported; for example, Rudd et al (1996) offer advice on the applicability and resource implications of both approaches. Although our findings support these general advantages and disadvantages, we found that neither approach satisfied the following important methodological requirements that arose during the development process of these two studies. Neither approach offered a satisfactory means of enabling contextual usability to be incrementally assessed, which is important in the evaluation of consumer products. Also, both approaches were unsatisfactory in handling the effect of interrelated user interface elements (for example the interaction between icon representation and icon behaviour), both in terms of development and evaluation. In practice, high level prototypes do allow these inter-related elements to be evaluated, but at the cost of committing too many design decisions early in the conceptual development process. Conversely, low level prototypes allow more ‘fluidity’ in the design process but lack contextual relevance and the ability to resolve usability problems at a high level of abstraction from the interaction process. These methodological concerns may be symptomatic of developing novel interfaces, where less is generally known about specific interface design principles. This would suggest
that, within the field of HCI, interface principles for established types of user interface (rather than for novel interfaces) have matured to a degree where conventional dialogue style design principles are well understood, and it is therefore possible to determine more distinct roles for both approaches to prototyping.
Conclusions Our findings suggest that supporting both the creative development of novel interaction styles and interface designs, while also explicitly addressing usability issues, requires a design tool or guidelines offering support on how and when to use high and low level prototyping to achieve contextual validity. On balance, we found that pushing towards a high level prototype revealed far more of the important contextual usability problems. What was needed was ‘time-out’ from this process to conduct low level prototype studies that would reveal valid results to support the development programme. It is the integration of these two approaches that we hope to review as part of our design tools and guidelines development programme.
References Carroll, J.M. and Rosson, M.B. (1992), Getting around the task-artefact cycle: how to make claims and design by scenario. ACM Transactions on Information Systems, 10(2), pp 181–212. Gould, J.D., Lewis, C.H., (1985) Designing for Usability—Key Principles and What Designers Think, Communications of the ACM, 28, pp 300–311 Harmelen, M., van, (1989) Exploratory User Interface Design Using Scenarios and Prototypes, In Sutcliffe A. and Macaulay L. (eds), People and Computers V: Proceedings of the Fifth Conference of the British Computer Society (Cambridge), Cambridge University Press, pp 191–202 Henninger, S., Haynes, K., Reith, M.W., (1995) A Framework for Developing Experience-Based Usability Guidelines, Proceedings of the Symposium on Designing Interactive Systems (DIS ‘95), Ann Arbor MI, pp 43–53 Hix, D., Hartson, H.R., (1993) Developing User Interfaces: Ensuring Usability Through Product and Process, J Wiley and Sons, New York. Rudd, J., Stern, K., Scott, I., (1996) Low vs. High Fidelity Prototyping Debate, Interactions, January, Vol 3.1, pp 76–85 Newman, W.M., Lamming, M.G., (1995) Interactive System Design, Addison-Wesley, Reading, UK. Nielsen, J., (1993) Usability Engineering, AP Professional, Boston. Thomas, B., (1996) ‘Quick and Dirty’ Usability Tests, In Jordan, P.W., Thomas, B., Weerdmeester, B.A., and McClelland, I.L. (eds), Usability Evaluation in Industry, Taylor and Francis Ltd, London. Veer, G.C., van der (1990) Human-Computer Interaction: Learning, Individual Differences, and Design Recommendations, Haveka, Alblasserdam. Wickens, C.D. and Andre, A.D. (1988) Proximity compatibility and the object display, Proceedings of the Human Factors Society 32nd Annual Meeting, Santa Monica, CA, pp 1335–1339
CREATIVE COLLABORATION IN ENGINEERING DESIGN TEAMS Fraser Reid, Susan Reed, and Judy Edworthy Department of Psychology University of Plymouth Plymouth, PL4 8AA United Kingdom
We analysed video recordings of engineering design team meetings for the occurrence of visual and non-visual design reasoning, drawing space activity, and conversational grounding. Three interactional patterns were observed: (a) designers evolved complementary specialisms, either generating visual ideas, or focusing on customer requirements and design constraints; (b) non-visual design reasoning was highly interactive, whilst design visualisation consisted of uninterrupted bursts of design ideas from single individuals; (c) conversational grounding was initiated by the speaker during visualisation sequences, but by the listener in non-visual sequences. These interactional patterns have major implications for the design of virtual workspaces, and we evaluate the concept of “seamless collaboration media” in the light of these results.
Shared workspaces and interactive design The use of shared workspaces has become a major focus of research interest as developers strive to build real-time computer systems capable of supporting designers working at a distance from each other. These virtual workspaces typically combine a variety of media, and might include live video images, electronic whiteboards, structured drawing tools, hypertext editors, and other tools. The aim is to allow people in different locations to see, point to, sketch, or write down design ideas whilst simultaneously holding a conversation with their colleagues. Support for interactive sketching is now widely regarded as an essential component of virtual workspace systems, and this rests on three assumptions concerning the role of sketching in collaborative design. The first is that freehand sketching stimulates visual imagination, and allows the designer to capture and manipulate emergent visual ideas. Secondly, sketches provide designers with a common task focus and an expressive medium through which to externalise and communicate design ideas. Thirdly, collaborating through sketches encourages designers to build on each others’ ideas, and to combine them to form novel design solutions. In short, interactive sketching is assumed to play an indispensable role in the design process by providing a creative forum in which designers can explore a problem space and reason interactively about new design solutions. This reasoning lies behind the development of “seamless collaboration media” that integrate interpersonal and group processes within a shared workspace (eg. Ishii et al, 1995). However, the collaborative process of building on and interactively developing creative designs is as yet poorly understood. Tang’s (1991) milestone study of collaborative drawing
highlights the importance of simultaneous visual access to the drawing surface, but sheds little light on how designers support and contribute to each others’ design reasoning. In this paper, we describe an observational study which examines how co-located teams of design engineers incorporate freehand sketches into the interactive process of design reasoning.
Design reasoning in engineering team meetings Our data was gathered from six teams of engineering design students working on realistic design briefs as part of their final year of training at the University of Plymouth. Each team consisted of about six designers drawn from different specialisms (mechanical and materials engineering, manufacturing, design technology, etc.). Half of the teams designed a lightweight, low-cost portable river crossing system for use by aid workers, whilst the remaining teams designed an electrically adjustable changing bed for use in schools and clinics. Each team was presented with customer requirements, market analyses, supplier and manufacturing information, and large-format paper pads and sketching materials. Using unobtrusive wide-angle and overhead cameras, we collected time-stamped video recordings from the first meeting of each team, and focused specifically on active design episodes: extended interchanges in which potential designs were formulated, developed, and evaluated. Each episode was divided into simple speech units corresponding to the design ideas they conveyed, or to the conversational functions they carried out. Speech units were then coded to produce three separate event sequences for each team. Firstly, where speech units were accompanied by actions (sketching, writing, pointing, or gesturing) performed on or over the shared drawing surface, these actions were coded and recorded. Secondly, speech units associated with conversational grounding (Clark and Brennan, 1991) were coded as grounding requests if the speaker sought verbal or nonverbal evidence to confirm that a listener understood an utterance, or as grounding offers if the listener provided unsolicited verbal or nonverbal evidence that they had (or had not) understood a speaker’s utterance. Thirdly, speech units conveying design reasoning were classified as visual arguments if visible workspace actions were necessary to their communication, and as non-visual arguments if this was not the case. Visual arguments typically conveyed potential design solutions or solution fragments, whilst non-visual arguments mostly referred to customer requirements, materials specifications, or other constraints on the design. To assess reliability of coding, a randomly selected team protocol (constituting a 10% sample of the pool of speech units) was independently coded by two of the authors. An overall intercoder agreement of 83% was obtained, with significant kappa agreement coefficients for design reasoning, κ=.75, and conversational grounding, κ=.76. To our surprise, we found that visual reasoning necessitating the use of a shared workspace was relatively infrequent in our design teams, even during the active design episodes selected for detailed analysis. Of the 3225 speech units in the sample, only 829 (25.7%) were classified as visual arguments, whilst 1391 (43.1%) were classified as non-visual arguments. This pattern was consistent over teams: non-visual arguments were significantly more numerous than visual arguments in five of the six design teams (χ2 ranging from 6.20 to 68.96, all with df=2), with one team showing a similar, but marginally non-significant (χ2=5.06), pattern. Almost all (791) of the visual arguments in the sample were accompanied by visible workspace activity, mainly sketching (383 units) and pointing to sketches (280 units).
Furthermore, use of the shared workspace was not confined to visual design reasoning: over a quarter of the non-visual arguments (380 units) were accompanied by workspace activity, in this case mainly pointing (183 units) and gesturing (167 units). We also found—as Goldschmidt (1995) would have predicted—that the designers in our study adopted specialised and complementary roles within the design process, each
contributing certain kinds of design reasoning to the whole. Chi-square tests revealed that in five of the six design teams, it was the most active member that specialised in producing visual design ideas (χ2 ranging from 11.09 to 53.73, dfs between 4 and 6), using the shared workspace to gesture, sketch, and point to sketches. Other team members supported the visualisation specialist by focusing on customer requirements and design constraints. Just one team deviated from this pattern, but even here we found evidence of role specialisation, with the most active member emphasising non-visual arguments (χ2=50.41, df=6), and two other designers sharing responsibility for producing visual design ideas.
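For readers unfamiliar with the reliability measure used above, the sketch below shows how percentage agreement and Cohen’s kappa are obtained from two coders’ category labels. The ten-unit sequence and the labels are invented for illustration; the study’s own protocols are not reproduced.

```python
from collections import Counter

def agreement_and_kappa(coder1, coder2):
    """Percentage agreement and Cohen's kappa for two coders' category labels."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    c1, c2 = Counter(coder1), Counter(coder2)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    return observed, (observed - expected) / (1 - expected)

# Invented coding of ten speech units as visual (V), non-visual (N) or other (O).
coder_a = ["V", "N", "N", "O", "V", "N", "V", "O", "N", "N"]
coder_b = ["V", "N", "N", "O", "V", "N", "N", "O", "N", "O"]
agreement, kappa = agreement_and_kappa(coder_a, coder_b)
print(f"agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```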
Interactive design and conversational grounding We then investigated how the engineers in our study coordinated their design ideas with those of their colleagues. One possibility we explored was that teams would engage in interactive design visualisation, working simultaneously to visualise potential solutions, combining these to arrive at an acceptable design. Alternatively, our teams might prefer to exchange visual and non-visual arguments on an alternating basis, with visualisation specialists generating new ideas which others in the team promptly evaluate. To explore this, we analysed transitions between the design arguments produced by the engineers, and compared these with a statistical model computed from the baseline rates for arguments produced by each team. The results of this analysis are shown in Figure 1(a). In this analysis, the identities of individual designers are unimportant: instead, we focused on transitions between arguments produced by the same speaker (S-S), and argument transitions between different speakers (S-O). Solid links indicate transitions averaged over the six teams that significantly exceed those predicted by the model, whilst the lighter links indicate chance level transitional probabilities. Links that are significantly weaker than those predicted by the model have been omitted from the figure.
Figure 1. (a) Transitions between visual arguments (Vs, Vo), non-visual arguments (Ns, No), and other speech units (Os, Oo) (b) Grounding requests (Res, Reo) and offers (Ofs, Ofo) associated with visual arguments (c) Grounding requests and offers associated with non-visual arguments
Two quite distinct interactional patterns emerged from this analysis. Firstly, argument transitions between designers were more likely to occur during non-visual reasoning sequences (Ns-No; φ=.14, p<.01). These sequences consisted of brisk, highly interactive exchanges of single-argument speaking turns. In contrast, visualisation sequences involved extended speaking turns by individual designers, consisting of chains of visual arguments and other utterances (Vs-Vs; φ=.23, p<.01; Vs-Os; φ=.08, p<.05), accompanied by sketching, pointing, and figural gesturing.
Inspection of these chains revealed uninterrupted turns of up to six arguments to be common, with the longest chain consisting of eleven consecutive visual arguments. Surprisingly, we could find little evidence of interactive visualisation, or of alternation between the visual and non-visual arguments of visualisation specialists and other team members. Visualising a design solution involves developing novel—or at least initially unshared—ideas, and communicating these to colleagues requires effort and persistence. We therefore expected our designers to request and offer positive evidence of conversational grounding more frequently during visual than non-visual argument sequences. Lag sequential analysis was used to test this idea. Because grounding information can also be conveyed by a design idea (eg. by one designer completing another person’s utterance, or expanding on their ideas), we tested whether grounding occurred more or less frequently at the same time as visual and non-visual design arguments (lag 0), as well as immediately following them (lag 1). The results of these analyses are shown in Figures 1(b) and 1(c). Again, solid links are those that occur significantly more frequently than expected by chance. In line with our expectations, designers requested evidence of grounding significantly more frequently than chance after presenting a visual argument (Vs-Res, lag 1; φ=.10, p<.01). However, they also offered evidence of grounding more readily on hearing a colleague present a non-visual argument (Ns-Ofo, lag 1; φ=.14, p<.01). Evidently, the initiative for establishing grounding resides with the speaker in visual sequences, but with the listener in non-visual sequences. Our results suggest that visual argumentation, and the workspace activity that typically accompanies it, discourages turn transitions, leaving the speaker with the initiative either to continue their speaking turn, or to invite a colleague to speak. In contrast, non-visual argumentation is highly interactive. One explanation for this is that the management of conversational turn-taking differs in the two modes of design reasoning. The introspective character of visualisation sequences, together with their external focus on sketches or other workspace materials, may signal a special state of conversational disengagement that suppresses the transition relevance of the clausal, phrasal, or lexical boundaries that ordinarily signal turn completion (Goodwin, 1981). Furthermore, the direction of the speaker’s gaze is an important turn-taking cue, but only when speech is disfluent and hesitant, or when the baseline level of speaker gaze is low (Beattie, 1979). These are exactly the conditions we observed in the visualisation sequences: designers struggled to verbalise their ideas, occasionally ceasing vocalisation altogether, and at the same time directed their gaze, not towards the listener, but to the work surface and the sketches they were creating. From their perspective, gaze avoidance helps them cope with the cognitive demands imposed by visual problem solving. From the listener’s perspective, the visualiser’s averted gaze signals their temporary non-availability for interaction. Under these conditions, turn transitions are likely to be solicited by explicit displays of recipiency by the visualiser.
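As an illustration of the kind of transition analysis reported in this section, the sketch below counts lag-1 transitions in a coded sequence of speech units and compares them with the counts expected from each code’s baseline rate under a simple independence model. The code labels and the short sequence are invented, and the published analysis (which used the full team protocols and φ statistics) is not reproduced here.

```python
from collections import Counter
from itertools import product

def lag1_transitions(sequence):
    """Observed lag-1 transition counts and the counts expected by chance
    from each code's baseline rate (a simple independence model)."""
    pairs = list(zip(sequence, sequence[1:]))
    observed = Counter(pairs)
    base = Counter(sequence)
    n, n_pairs = len(sequence), len(pairs)
    expected = {(a, b): (base[a] / n) * (base[b] / n) * n_pairs
                for a, b in product(base, repeat=2)}
    return observed, expected

# Invented coded sequence: Vs = visual argument (same speaker), No = non-visual
# argument (other speaker), Re = grounding request, Of = grounding offer.
seq = ["Vs", "Vs", "Vs", "Re", "No", "Of", "No", "Of", "No",
       "Vs", "Vs", "Re", "No", "Of"]
obs, exp = lag1_transitions(seq)
for pair in sorted(obs):
    print(pair, "observed:", obs[pair], "expected: %.2f" % exp[pair])
```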
Implications for the design of virtual workspaces These observations have several implications for the design of virtual workspace technologies. Firstly, the finding that only one quarter of all speech units observed here required access to a shared workspace surely suggests that only a very small proportion of the activities of design teams will require the processing power and advanced display technologies currently under consideration for shared workspace systems. This does not, of course, mean that such systems can dispense with support for real-time communication. The highly interactive non-visual interchanges that we observed clearly need sufficient bandwidth to carry synchronous group-wide communication, though the evidence accumulated over the last twenty years points overwhelmingly to the adequacy of high-fidelity audio connectivity, rather than more costly video connectivity, for this purpose.
Other categories of interaction in the present study, however, are likely to depend on high levels of video connectivity. Just how much connectivity is desirable depends on the type of design reasoning in which the team is engaged. The majority of the workspace actions accompanying speech in the present study involved producing and pointing to sketches, and this can readily be supported by an audio link coupled to a “virtual sketchbook”, such as VideoDraw (Tang and Minneman, 1990)—a system that allows concurrent access to a drawing surface over which hand gestures and sketches by different designers can be superimposed, and high levels of interaction supported. It is important to note that over one quarter of the non-visual arguments observed in the present study involved some form of visible workspace activity—mainly pointing and gesturing—for which this level of video connectivity is likely to be sufficient. However, the visualisation sequences we observed clearly require the additional capability to monitor the direction of a colleague’s gaze and therefore his or her focus of attention. In order to establish the recipiency of the visualiser and avoid unwarranted interruptions, listeners need visual access to the shared drawing surface and to a view of the visualiser that provides information on their attentional focus. This could, of course be achieved using a three-quarters video viewpoint that encompasses both the person and the workspace. But the visualiser needs at the same time to signal turn completion using familiar cues, including gazing directly at the listener. Here it might reasonably be thought that the “seamless” ClearBoard workspace developed by Ishii (eg. Ishii et al, 1995) would provide the unimpeded access to work and interpersonal spaces necessary for effective collaboration. This system allows designers working from remote locations simultaneously to draw on, and talk through, a virtual “glass window” onto which a head-and-shoulders video image of the partner is superimposed. Designers can not only see and manipulate the same images on the drawing surface, but can also see each other and what they are doing. One of the advantages claimed for this system is that less eye and head movement is needed to switch focus between the drawing surface and the partner’s face, making eye-contact easier to establish. However, it also makes it more difficult to avoid. Our present findings imply that any enhancement of the shared workspace which prevents designers from visibly disengaging from their colleagues is likely to perturb not only the process of visualising design ideas, but also the subtle and precise coordination between speech and eye gaze associated with the production and exchange of speaking turns. Careful empirical investigation of design interactions in these environments is now needed to evaluate these conclusions.
References Beattie, G.W. 1979, Contextual constraints on the floor-apportionment function of gaze in dyadic conversation, British Journal of Social and Clinical Psychology, 18, 391–392 Clark, H.H. and Brennan, S.E. 1991, Grounding in communication. In L.B.Resnick, J.M.Levine and S.D.Teasley (eds.) Perspectives on Socially Shared Cognition, (American Psychological Association, Washington), 127–149 Goldschmidt, G. 1995, The designer as a team of one, Design Studies, 16, 189–209 Goodwin, C. 1981, Conversational Organization: Interaction Between a Speaker and a Hearer, (Academic Press, London) Ishii, H., Kobayashi, M. and Grudin, J. 1995, Integration of interpersonal space and shared workspace: ClearBoard design and experiments. In S.Greenberg, S. Hayne and R.Rada (eds.) Group-ware for Real-Time Drawing: A Designer’s Guide, (McGraw-Hill, London), 96–125 Tang, J. and Minneman, S. 1990, VideoDraw: A video interface for collaborative drawing. In Proceedings of the ACM/SIGCHI Conference on Human Factors in Computing, (ACM Press, New York), 313–320 Tang, J.C. 1991, Findings from observational studies of collaborative work, International Journal of Man-Machine Studies, 34, 143–160
DESIGN AND USABILITY
PLEASURE AND PRODUCT SEMANTICS Patrick W.Jordan Senior Human Factors Specialist, Philips Design, Building W, Damsterdiep 267, P.O. Box 225, 9700 AE Groningen, The Netherlands
Alastair S.Macdonald Course Leader, Product Design Engineering, Glasgow School of Art, 167 Renfrew Street, Glasgow G3 6RQ, Scotland
Human factors has tended to focus on pain. As a profession, it has been very successful in contributing to the creation of products that are safe and usable and which, thus, spare the user physical, cognitive and emotional discomfort. However, little attention seems to have been paid to the positive emotional and hedonic benefits—pleasures—that products can bring to their users. This paper examines the relationship between product semantics and pleasure in use, within the structure of the ‘Four Pleasure Framework’. Studies such as this represent human factors’ first steps towards establishing links between product properties and the types of emotional, hedonic and practical benefits that products can bring to their users.
Introduction “I can sympathise with other people’s pains, but not with their pleasures. There is something curiously boring about someone else’s happiness.” This quote comes from Aldous Huxley’s 1920 book ‘Limbo’; however, it could almost be a motto for ergonomics. Ergonomics journals, conference proceedings, and textbooks seethe with studies of pain: back pain, upper limb pain, neck pain, pain from using keyboards, pain from using industrial machinery, pain from hot surfaces—these are just a few examples from studies that have been presented at recent Ergonomics Society Conferences. As a discipline, ergonomics has been focused on eliminating pain, whether it be in the form of physical pain, as in the examples above, or the cognitive/emotional discomfort that can come from interacting with products that are difficult to use. Meanwhile, the idea that products could actually bring positive benefits—pleasures—to users seems to have been largely ignored. So, whilst ergonomics has had a great deal to offer in terms of assuring product usability and safety, it seems to have had very little to contribute in terms of creating products that are positively pleasurable. The case for ergonomists to take the lead in addressing the issue of pleasure with products has been made elsewhere (Jordan 1997a) and a framework for approaching the issue—the four pleasures—has been proposed (Jordan 1997b). In this paper, this framework will be summarised and illustrated with examples that show the relationship between pleasure and product semantics.
Pleasure with Products Pleasure with products is defined as: “…the emotional, hedonic and practical benefits associated with products.” (Jordan 1997a)
The Four Pleasures The four pleasure framework was originally espoused by Canadian anthropologist Lionel Tiger (Tiger 1992) and subsequently adapted for use in design (Jordan 1997b). The framework models four conceptually distinct types of pleasure—physio, socio, psycho and ideo. Summary descriptions of each are given below with examples to demonstrate how each of these components might be relevant in the context of products.
Physio-Pleasure This is to do with the body—pleasures derived from the sensory organs. They include pleasures connected with touch, taste and smell as well as feelings of sexual and sensual pleasure. In the context of products, physio-p would cover, for example, tactile and olfactory properties. Tactile pleasures concern holding and touching a product during interaction. This might be relevant to, for example, the feel of a TV remote control in the hand, or the feel of an electric shaver against the skin. Olfactory pleasures concern the smell of the new product. For example, the smell inside a new car may be a factor that affects how pleasurable it is for the owner.
Socio-Pleasure This is the enjoyment derived from the company of others. For example, having a conversation or being part of a crowd at a public event. Products can facilitate social interaction in a number of ways. For example, a coffee maker provides a service which can act as a focal point for a little social gathering—a ‘coffee morning’. Part of the pleasure of hosting a coffee morning may come from the efficient provision of well brewed coffee to the guests. Other products may facilitate social interaction by being talking points in themselves. For example a special piece of jewellery may attract comment, as may an interesting household product, such as an unusually styled TV set. Association with other types of products may indicate belonging in a social group—Porsches for ‘Yuppies’, Dr. Marten’s boots for skinheads. Here, the person’s relationship with the product forms part of their social identity.
Psycho-Pleasure Tiger defines this type of pleasure as that which is gained from accomplishing a task. It is the type of pleasure that traditional usability approaches are perhaps best suited to addressing. In the context of products, psycho-p relates to the extent to which a product can help in accomplishing a task and make the accomplishment of that task a satisfying and pleasurable experience. For example, it might be expected that a word processor which facilitated quick and easy accomplishment of, say, formatting tasks would provide a higher level of psycho-pleasure than one with which the user was likely to make many errors.
Ideo-Pleasure Ideo-pleasure refers to the pleasures derived from ‘theoretical’ entities such as books, music and art. In the context of products it would relate to, for example, the aesthetics of a product and the values that a product embodies. For example, a product made from bio-degradable materials might be seen as embodying the value of environmental responsibility. This, then, would be a potential source of ideo-pleasure to those who are particularly concerned about environmental issues. Ideo-pleasure would also cover the idea of products as art forms. For example, the video
cassette player that someone has in the home is not only a functional item, but something that the owner and others will see every time they enter the room. The level of pleasure given by the VCR may, then, be highly dependent on how it affects its environment aesthetically.
Product Semantics Product semantics refers to the ‘language’ of products and the messages that they communicate (Macdonald 1997). Product language can employ metaphor, allusion, and historical and cultural references, whilst visual cues can help to explain the proper use or function of a product. What follows are a number of examples, demonstrating the link between product semantics and pleasure with products.
Karrimor’s Condor Rucsac Buckle This side release buckle closes with a very positive ‘click’. Visual, audio, and tactile feedback combine to ensure that the rucsac looks, feels and sounds good. These physio-pleasures are part of projecting the benefit of a reliable and reassuring fastening.
Global Knives Global knives are a new concept in knives, designed and made in Japan. The blades are made from a molybdenum/vanadium stainless steel and are ice tempered to give a razor-sharp edge. The integral, hollow handles are weighted to give perfect cutting balance with minimum pressure required. The comfort of the knife in the hand and the aesthetic sensation of the finely balanced weight are both physio-pleasures. Because of their smooth contours and seamless construction, the knives offer no crevices for food and germs to collect in and thus are exceptionally hygienic. This provides the user with a feeling of reassurance—a psycho-pleasure.
NovoPen™ Traditionally, those suffering from diabetes had to use clinical-looking syringes and needles. The NovoPen™ is a device for the self-administration of precise amounts of insulin. Its appearance is rather like that of a pen—this provides a more positive signal than that of the hypodermic syringe, whose image is coloured by medical and drug-abuse associations. This offers the user both ideo- and socio-pleasure, by playing down any stigma that the user and others may associate with syringes and/or the medical condition. The NovoPen™ also incorporates tactile and colour codes which refer to the different types of insulin dosage that may be required. These provide sensory back-up and contribute to the product’s aesthetic profile. The technicalities of administering precise dosages have been translated into easy human steps and a discreet but positive click occurs when the dose is prepared for delivery. This provides the psycho-pleasure of reassurance to the user in what might otherwise be a rather daunting task. Finally, the NovoPen™ also provides physio-pleasure through its tactile properties—the pen is shaped to fit the hand comfortably and the surface texture, achieved through spark erosion, is pleasant to the touch.
Samsonite Epsilon Suitcase Journeys through air terminals can be fraught with stress—both physical and psychological. The three handle options on the Samsonite Epsilon Suitcase provide a number of comfortable options for lifting, tilting or trailing. The handle material is a non-slip rubberised coating which does not become sweaty or slippy in use. These features, providing physio-p, reduce the stress associated with the situation. The design of the suitcase’s castors allows a controllable and responsive movement in the ‘trailing’ mode, the suitcase ‘obeying’ the needs of the user, providing a degree of psycho-p over other suitcases which do not obey their owners’ will.
Mazda Car Exhaust ‘Kansei Engineering’ is a term coined by Nagamachi (1995) for turning emotions into product design, and has been extensively employed in automotive design. The Mazda team has engineered the sound emitted by the MX5 Miata to evoke association with classic (British?) sports cars, satisfying ideo-p (macho, youthful associations), and socio-p (I have arrived, have I not?!). Table 1 gives a summary of the benefits associated with the products within the context of the four pleasure framework.
Table 1. Four pleasure analysis of the benefits associated with the example products.
Designing Pleasurable Products The examples given have demonstrated that, through their semantics, products can provide different types of pleasure to their users. Even from this little selection, it is clear that products can bring practical, emotional and hedonic benefits to users which go beyond those associated with concepts such as ergonomic design and usability. The four pleasure framework gives a useful structure within which to approach the issue of pleasure with products. In particular, it has proved useful at the beginning of the product creation process, as a vehicle for discussion and agreement between human factors, design, product management, marketing, engineering and market research, as to what the main benefits delivered by a product should be. These agreements lead to the unity of purpose that is so important in creating products that deliver clear benefits and tell a clear ‘story’. For example, Jordan (in preparation) describes a set of benefits that might be agreed for a new photo camera, with young professional women as the target group. They are summarised in Table 2. Having agreed that these are the benefits to be delivered, the entire product development team can then concentrate on these. Having these common aims in mind when developing the technology, design and marketing material for a product ensures that all disciplines are working to a common goal.
Conclusions A number of examples have been given, illustrating links between product semantics and pleasure in use. This was based on a qualitative analysis within the context of the Four Pleasure Framework. Because of its simplicity and accessibility, this framework is a useful tool, suited to the multi-disciplinary nature of product development. It supports constructive, focused and progressive co-operation to move the design along. Such approaches enable human factors to move beyond usability to support the creation of products that are a positive pleasure to use—products that will delight the customer.
Table 2. Four Pleasure Analysis of product requirements for a camera aimed at young women of high socio-economic status (from Jordan, in preparation).
References Jordan, P.W., 1997a, Putting the pleasure into products, IEE Review, November 1997, 249–252 Jordan, P.W., 1997b, A Vision for the future of human factors. In K.Brookhuis et al. (eds.) Proceedings of the HFES Europe Chapter Annual Meeting 1996, (University of Groningen Centre for Environmental and Traffic Psychology), 179–194 Jordan, P.W., in preparation, The four pleasures—human factors for body, mind and soul, submitted to Behaviour and Information Technology Macdonald, A.S., 1997, Developing a qualitative sense. In N.Stanton (ed) Human Factors in Consumer Products, (Taylor and Francis, London), 175–191 Nagamachi, M., 1995. The Story of Kansei Engineering, (Kaibundo Publishing, Tokyo) Tiger, L., 1992, The Pursuit of Pleasure, (Little, Brown and Company, Boston)
A SURVEY OF USABILITY PRACTICE AND NEEDS IN EUROPE Martin Maguire and Robert Graham
HUSAT Research Institute Loughborough University The Elms, Elms Grove Loughborough, Leics. LE11 1RG Tel: +44 1509 611088, Fax: +44 1509 234651 [email protected], [email protected] A survey was carried out to determine the state of current usability practice in Europe to assist with the dissemination of usability information and services to industry and EC projects. The survey shows that organisations are aware of the need for usability in the design process and carry out relevant usability activities, although the extent to which they are performed may vary e.g. for bespoke systems versus ‘off-the-shelf’ products. The paper also reports on current usability activities at different stages in the design process, possible methods for enhancing usability activities, and requirements for usability information.
Introduction and Aims The human factors community has produced a wide range of methods and guidelines that can assist the system development process for the benefit of the end users. To ensure that these potential human factors inputs are used effectively, it is important to consider the views of companies and organisations that might receive them. An earlier survey carried out by Dillon et al (1993) found that the greatest need is for usability methods which fit in with the development lifecycle. This paper looks at the specific needs for usability support within the lifecycle in more detail. HUSAT is part of a project called INUSE (Information Engineering Support Centres), funded by the European Commission (EC) which has set up a network of usability support centres to provide usability advice to industry and to the Telematics community (INUSE, 1997). As part of the project, it was planned to survey current usability practices within European organisations, the needs for usability support and the form that such support should take. The results of the survey are presented under the following headings:
• Scale of usability activities performed
• Usability information and support currently employed
• Comparison of bespoke system with ‘off-the-shelf’ packages
• Usability activities in the design process
• Methods of incorporating usability into system design
• Types of information to assist with usability activities
• Methods of disseminating information about usability
Survey Sample The survey data was collected via a self-completion questionnaire. The population sample was drawn from existing company contacts, delegates at EC Telematics project concertation
meetings, the UK HCI Conference, and a Ministry of Defence suppliers exhibition in the UK. Based on a target number of 140 individuals, 65 responses (a rate of 46%) were received. As the responses are drawn from those people sufficiently interested to complete the survey, the results may tend to show a generally positive attitude towards usability activities. Arguably however, such respondents are in a good position to comment on their organisations’ own usability practices. Of the 65 respondents, most came from industrial companies (40), with others representing government departments (14), academic and industrial research centres (11). The majority of the respondents were from the UK (37), but nine other European countries were represented, including Greece, Belgium, Germany, the Netherlands, Denmark, Italy, Sweden, Spain and Norway. Respondents came from a range of occupations. Many stated that their principal role within their organisation was project manager (25). Other common roles were human factors specialists (13) and designers or software engineers (11). There were also representatives from marketing, system procurement, R&D, user representatives in a design team and quality assurance. The organisations produced a range of systems e.g. communications, management, financial and public information systems. They varied in size from small groups of less than 10 employees to large multi-national companies.
Results Scale of usability activities performed A question was asked about the scale of usability activities carried out within the organisation. In response, 18% stated that they carried out little or no such activity, 60% said that they carried out usability work on a small scale, while 22% stated that they carried it out on a large scale.
Usability information and support currently employed The different types of usability advice incorporated are shown in Table 1 below, in order of frequency. It is interesting to note that general literature sources are rated more highly than standards. Other sources of information are only used by about half of the organisations or less.
Table 1. Current use of usability information and support
Comparison of bespoke systems with 'off-the-shelf' packages Over half of the organisations who responded to the questionnaire (55%) designed custom or bespoke products for a single client. About a third of the organisations (34%) designed mass market or 'off-the-shelf' packages. The differences between the two classes of product are shown in Figure 1. Around 90% of developers of bespoke products discussed requirements with the customer or purchaser, while only 50% of the developers of off-the-shelf systems did the same. Few 'off-the-shelf' system developers (18%) involve end-users in the design team, while considerably more bespoke system developers (42%) do so. These differences are perhaps less surprising bearing in mind that for mass market products, there is typically no specific customer or set of users to focus on. It would therefore be desirable to offer such companies clearer advice about how to set up representative customer panels and to recruit users from the outside world with relevant characteristics.
Figure 1. Comparison of usability activities for bespoke and ‘off-the-shelf’ products
Usability activities in the design process One section of the survey considered different usability activities within the design process. For each activity, respondents could select one from a given set of responses to indicate how fully each activity was considered and carried out. The activities included:
1) Getting consideration of usability into the initial contract for the system.
2) Management of usability activities during the project.
3) Gaining access to the right users at the right time.
4) Analysing the user and task characteristics and organisational setting.
5) Collection/specification of the user and organisational requirements.
6) Iterative development and evaluation of prototype solutions.
7) Evaluation of the final product with users.
8) Field trials of the product in use.
9) User support of the product after purchase.
For most of the above activities, between 44% and 51% of respondents stated that they felt they were carried out fully whether part of a quality plan or not. However for activities 3, ‘Gaining access to the right users at the right time’ and 4, ‘Analysing the user and task characteristics and organisational setting’ only 37% of respondents felt they were considered properly. Others stated that the task had not been carried out fully, that they had too little knowledge to consider it properly, or even that it was not seen as important by management or staff. These results indicate the need for greater dissemination of usability context analysis techniques (Bevan and Macleod, 1994). Also the problem of gaining access to users to make inputs into the design process or to act as subjects within usability evaluations should be considered and planned early on in the lifecycle, if the project is to have representative user input. Perhaps the most fundamental activity listed above is 1, ‘Getting consideration of usability into the initial contract for the system’. While 51% of respondents felt that this was
carried out fully, only 12% incorporated it into a quality plan. This indicates the need to consider linking usability and quality planning more closely. It is also of interest to note that for activity 9, ‘User support after purchase’, although 46% stated that they carry it out fully, a high proportion (24%) felt that they had too little knowledge to consider it properly or that it was not seen as of major importance. Taking the results as a whole, it seems that the majority of organisations do see the need for, and do perform, relevant usability processes during the design lifecycle. However for most of the activities less than half do so fully, indicating the need for more support and information.
Methods of incorporating usability into system design Subjects rated various ways in which usability could be incorporated or improved in their company, on a scale from 1 (not useful) to 5 (essential). Table 2 below, shows the mean ratings given to each method. Table 2. Methods of incorporating usability
Respondents rated highly the idea of training designers to help themselves, and of providing access to users and to tools to help them carry out usability processes. There appears to be less interest in employing external usability specialists, although this is partly because many respondents have usability expertise in-house. Analysing these results further, it was found that companies developing 'off-the-shelf' systems feel more strongly about the need for external expertise (mean rating 3.6 versus 2.5 for bespoke developers). This may be because the user population being designed for, and the tasks that users carry out, are less well defined. It was also found that companies who carry out usability activities on a large scale feel a greater need for improved prototyping tools than those where it is on a smaller scale. They requested both (i) better prototyping tools, and (ii) usability knowledge to be built into the tools. This may be seen by developers as a cost-effective way of inputting usability into the design lifecycle.
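As a rough illustration of how such ratings can be summarised, the sketch below computes mean ratings on the 1–5 scale split by developer type; the response values and field names are invented and are not the survey's data.

# Hypothetical illustration of the rating analysis described above:
# mean usefulness ratings (1 = not useful, 5 = essential) split by
# developer type. All figures and field names are invented.
from statistics import mean

responses = [
    {"type": "off-the-shelf", "external_expertise": 4},
    {"type": "off-the-shelf", "external_expertise": 3},
    {"type": "bespoke", "external_expertise": 2},
    {"type": "bespoke", "external_expertise": 3},
]

def mean_rating(rows, method, developer_type):
    scores = [r[method] for r in rows if r["type"] == developer_type]
    return mean(scores)

for dev in ("off-the-shelf", "bespoke"):
    print(dev, round(mean_rating(responses, "external_expertise", dev), 1))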
Types of information to assist with usability activities Respondents were asked to rate the types of information that might assist with usability within the system development process. The mean ratings given in Table 3 below (on the scale: 1=not useful to 5=essential) show that respondents are most interested in information to supplement processes e.g. evaluation, prototyping and user requirements specification. Table 3. Types of information to assist usability
Topics for guidelines on interface design were also investigated. It was found that respondents seemed to find guidelines on broader issues such as navigation and screen layout more useful than interface specifics such as using colour or icons. Again, this may reflect the fact that designers may be tied to a particular style, limiting their choice of available colours or icons. Alternatively, it may be that issues such as user navigation are seen as more complex, and designers need more guidance than they already possess.
Methods of disseminating information about usability Finally, respondents rated the various methods by which usability information could be received. These are shown below (Table 4) in order of preference, the scores based on a scale of 1=very poor to 5=excellent. Table 4. Methods of disseminating usability information
Respondents felt that training is a good method of receiving usability information. Another highly rated approach was to build usability principles into the software design tools themselves. However, general usability guidelines, whether in the form of handbooks, video or CD-ROM were seen as slightly less useful, perhaps as they require more effort to apply.
Conclusions The findings seem to show that many organisations do carry out usability activities, although these may not be on a large scale or applied fully across the design lifecycle. It was found that 'off-the-shelf' developers do not involve potential customers and end users as much as bespoke developers, and so more advice to support this activity could be offered. Similarly, a greater awareness of usability standards could increase their usage, and more help on analysing the user context is also needed. It is recommended that agreement to gain access to potential users is made at an early stage to ensure good user representation. Finally, the approach of training companies seems to be one of the most attractive options for integrating human factors knowledge into the lifecycle, while the idea of software design tools incorporating usability principles would also be of interest to organisations in the future. Acknowledgement: Project IE 2016 INUSE is funded by the European Commission's Telematics Applications Programme.
References Bevan, N. and Macleod, M. 1994, Usability measurement in context, Behaviour and Information Technology 13(1–2), 132–145, Jan–Apr 1994, London: Taylor & Francis. Dillon, A., Sweeney, M. and Maguire, M. 1993, A survey of usability engineering within the European IT industry—Current needs and practices, Proceedings of the HCI ‘93 Conference, J.Alty, D.Diaper and S.Guest, (Eds), Cambridge Univ. Press, People and Computers VIII, Loughborough, pp. 81–94., Sept 1993. INUSE 1997, see http://www.npl.co.uk/inuse
CULTURAL INFLUENCE IN USABILITY ASSESSMENT
Alvin Yeo
Computer Science Dept., University of Waikato, Private Bag 3105, Hamilton, New Zealand
Robert Barbour
Science, Mathematics & Technology Education Research Centre, University of Waikato, Hamilton, New Zealand
Mark Apperley
Computer Science Dept., University of Waikato, Private Bag 3105, Hamilton, New Zealand
A study was conducted in Malaysia to identify cultural factors that may affect results of usability assessment techniques. The usability evaluation techniques used in this study were “think aloud”, System Usability Scale and interviews. The results indicate that cultural factors such as power distance and language might influence responses in the different usability assessments. Recommendations on how to reduce the cultural effects are also reported.
Introduction Results of usability tests conducted in the domestic market may not be valid internationally (Nielsen, 1990; Fernandes, 1995). The software must be tested in the target market. Testing is necessary to ensure that the software is acceptable and does not cause offence to the target community. As more software is marketed globally, software developers have to take into account cultural issues that may impact usability testing. Methods that work in Western cultures, e.g. the United States (US), may not work in other cultures (Herman, 1996). Fernandes (1995) describes lessons learnt by Claris Corporation in Japan:
• "Questions regarding how comfortable or how much they "like" the product were removed because they involved feeling and emotion which are issues that the Japanese are not accustomed to responding to.
• Japanese women spoke very softly. This puts a huge premium on the quality of the microphone and where it was placed.
• Co-discovery techniques were used but they became problematic when people of differing status were put in the room together. In particular, women were found to talk very little when they were paired with a man." (Fernandes, 1995).
In Singapore, a subject actually broke down and cried during a software evaluation session (Herman, 1996). However, during the post-test interview, the subject was very positive about the software. This behaviour is believed to be attributable to the Eastern culture whereby it is "considered culturally unacceptable to criticise the designer directly or openly, as this may cause the designers to lose face" (Herman, 1996). Usability assessment is conducted to improve the usability of the software. Thus, if the results of usability assessments are misconstrued, the software's success might be compromised. It is crucial that lessons from the above experiences be taken into account to ensure that
accurate, reliable and valid results can be drawn from the usability assessments. These issues are especially important given that most, if not all, usability assessment techniques originate from the West and are used in the East, one of the fastest growing and potentially biggest software markets (Software Publishers Association, 1996). There exist few studies of how these usability evaluation techniques fare in Eastern countries. This lack of literature is surprising, as the US (the biggest exporter of software in the world) earns more than half of its revenue from outside the US. The more we know about the target users, the greater the likelihood of success of the software. As highlighted in the examples, studies of the potential problems of using these Western techniques in the East are needed to ensure more accurate results from usability evaluation.
Research Aims The aims of this research were to: identify the cultural factors that affect usability testing, examine how these cultural factors affect usability testing, and identify ways to improve usability testing by reducing the cultural effects. The study was conducted in Malaysia, one of the fastest growing markets in Asia. Results from this Malaysian study may be indicative of results in other Asian countries as Malaysia shares similar cultural attributes with its neighbours.
Method To identify the cultural factors that may affect usability testing, data was collected from an experiment using three methods of usability assessment. The three usability assessment methods were "think aloud" (described as probably the most valuable usability engineering method (Nielsen, 1993)), the System Usability Scale (SUS) and the interview method. The SUS is identified as a "quick and dirty" method of gauging users' response to an interface's usability (Brooke, 1995). Although the SUS is only ten questions long, it correlates well with SUMI: a University of Cork study found a correlation of 0.8588 between SUS and SUMI scores (Holyer, 1994). In the experiment, seventeen experienced spreadsheet users who were staff members of a Malaysian university completed a set of tasks in a spreadsheet with a Bahasa Melayu (Malaysia's national language) interface. All the subjects had used Excel 5.0 in their work. Their reported spreadsheet use ranged from at least one hour a week to four hours a day. Experienced spreadsheet users were recruited, as we believed that they would be able to transfer their knowledge of one spreadsheet to another. An effort was made to select subjects from the different levels of the organisation; the subjects' occupations ranged from clerks to managers. This diversity was to allow for a representative perspective from the different levels of the organisation and also to take into account the status difference findings of Fernandes (1995) above. The higher status subjects included managers, lecturers and tutors, whereas lower status subjects included clerks and administrative assistants. The experiment was conducted by one of the authors, who is a Malaysian and a tutor. During the experiment, the users were required to "think aloud" and the session was tape-recorded. Also, the subjects' interactions (keystrokes) were recorded by a logging routine in the spreadsheet. Once the users had completed the tasks, they were asked to evaluate the spreadsheet's usability by filling in a SUS survey form. After completing the SUS, the subjects were also interviewed to obtain their opinions on the spreadsheet they had just used. The log files were used to assist the transcription of the "think aloud" sessions. The interviews were also transcribed. Data from all three sources were aggregated. A general framework of grounded theory was used in the data analysis to identify cultural factors that may have been present in the experiment.
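The SUS form referred to above is scored with the standard Brooke (1995) formula: odd-numbered items contribute the response minus one, even-numbered items contribute five minus the response, and the total is multiplied by 2.5 to give a score from 0 to 100. A minimal sketch of that scoring, with an invented example response set, follows.

def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from ten
    responses on the 1-5 agreement scale (Brooke, 1995)."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Hypothetical subject's responses, for illustration only.
print(sus_score([4, 2, 4, 1, 5, 2, 4, 1, 4, 2]))  # -> 82.5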
Results and Analysis The transcriptions of the "think aloud" and interview sessions of all seventeen subjects were examined. The subjects' responses on the System Usability Scale were also scored. Overall, all the subjects had problems completing the tasks in the spreadsheet, as they had to contend with the Bahasa Melayu interface as well as the fact that the spreadsheet was a DOS spreadsheet which did not support mouse use. As the spreadsheet used was a DOS spreadsheet, a lot of negative comments were expected, especially as all the users had been using Excel 5.0. It was observed that the subjects who were of higher status (than the experimenter) were "harsher" or more frank in their comments (note that the italicised English comments in this paragraph were translated from Bahasa Melayu). The negative comments made by the lower status subjects (compared to the experimenter) were more subtle. Some of the negative comments made by nine of the ten higher-ranked subjects about the spreadsheet include "…old fashioned" [S4; subjects are identified by such codes in the study], "Excel is easier" [S8], "…very outdated" [S10], "I was taken aback it wasn't that friendly I have to be frank with you" [S14], "difficult/complicated" [S15], "I don't like it…it's difficult" [S18], "…system still at a very primitive level" [S20], "For a beginner, it's quite difficult for them to learn this…" [S21], "I think it's really difficult to learn" [S22]. However, only two of the seven lower-ranked subjects made negative comments: "not user friendly" [S2] and "I think Excel is easier" [S11]. The other comments made by the remaining five lower-ranked participants were mainly positive: "…not that bad a utility" [S3], "I think it's okay… I think we better use this in our [office]" [S7], "I feel it's more effective if we use this." [S12], "The spreadsheet is good…to umm…replace Excel…" [S13], "I believe it…if we learn…it's easier to use" [S19]. From the above comments, it would appear that the lower status subjects were more positive and more receptive towards the spreadsheet. This result also correlates with the SUS scores, whereby the lower status subjects' scores were significantly greater than the higher status subjects' scores (t-test at the 5% significance level), i.e. lower status subjects rated the spreadsheet's usability more favourably than the higher status subjects. Although all the subjects had problems with the DOS spreadsheet, it would seem the lower status subjects "liked" the spreadsheet more than the higher status subjects. The above results suggest that one possible cultural factor that may affect the results of usability assessment is power distance. Power distance is "the extent to which the less powerful members of institutions and organisations within a country expect and accept that power is distributed unequally" (Hofstede, 1994). From a sample of 50 countries, Hofstede (1994) identified Malaysia as the country having the highest power distance. This means that Malaysians in general are willing to accept inequality in power as being normal. The power holder's authority is unquestioned; subordinates who question it would be seen as improper and disrespectful (Abdullah, 1996). Furthermore, in a high power distance country, employees are "afraid" of their employers, as employers wield powers such as the authority to fire employees. Thus, a person of higher status and power (e.g. a manager) will be more likely to voice his or her feelings of discontent to a person of lower rank (e.g. a
subordinate). However, the reverse is not true: a person of lower status is unlikely to go against a higher-ranked person for fear of retribution. The experiment results can be explained with respect to the power distance characteristic. The lower status subjects probably did not like the spreadsheet any more than the higher status subjects did. However, the lower status subjects were more positive about the spreadsheet as they did not want to question or "go against" the experimenter (a tutor, considered a high-status power holder), perhaps for fear of retribution or of appearing disrespectful to the experimenter. The lower status subjects were thus less critical and "less honest" in their
responses, to the extent of suggesting the use of the DOS spreadsheet in the office and also replacing Excel with the DOS spreadsheet. On the other hand, the subjects of higher status were more likely to voice their dissatisfaction, as these subjects were of the same status or higher in the organisational hierarchy (compared with the experimenter) and had little fear of retribution. This observation is supported by the frank comments made by the higher status subjects that the spreadsheet was "old fashioned", "very primitive" and "very outdated". Another interesting observation is that the subjects, in their "think aloud" sessions and interviews, used predominantly English interspersed with Bahasa Melayu (this excludes any reference to parts of the software with prompts or commands in Bahasa Melayu). This observation is probably due to the fact that Malaysians are bilingual (speaking both English and Bahasa Melayu). As such, if an experimenter fluent in both languages is used, the subjects would be able to "think aloud" or respond in their preferred language. Responses in the native tongue are more accurate, as the subjects are more likely to describe usability problems more clearly in their native language than in another language. Forcing a subject to use a language other than their native tongue might impose further cognitive load, which may in turn affect techniques such as the "think aloud" session, during which a subject is required to complete tasks and think aloud at the same time. For example, subject S12, whose native tongue is Bahasa Melayu, attempted to use English in the "think aloud" session. S12's speech was hesitant, and more information might have been garnered if S12 had spoken in his native language. Furthermore, the ability of an experimenter to communicate in the native tongue of the subjects may assist in forming a better rapport. This view is supported by Morais (1997), who found that in Malaysia, top management used more Bahasa Melayu to reduce the status gap and to reach out to their employees, while the workers used more English.
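As an illustration of the status comparison reported above, the SUS scores of the two groups could be compared with an independent-samples t-test at the 5% level, as sketched below; the score values are invented and are not the study's data.

# Illustrative only: compare SUS scores of lower- and higher-status
# subjects with an independent-samples t-test at the 5% level.
# The score values are invented; they are not the study's data.
from scipy import stats

lower_status_sus = [72.5, 80.0, 67.5, 75.0, 70.0, 77.5, 65.0]
higher_status_sus = [45.0, 52.5, 40.0, 55.0, 47.5, 50.0, 42.5, 57.5, 35.0, 48.0]

t, p = stats.ttest_ind(lower_status_sus, higher_status_sus)
print(f"t = {t:.2f}, p = {p:.4f}")
if p < 0.05:
    print("Lower-status subjects rated usability significantly higher")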
Recommendations One implication of the power distance finding is that, in order to obtain an honest and accurate appraisal of a software product, an experimenter of lower status than the subject may be required. Otherwise, if the experimenter is a manager (higher status) and the subject/usability assessor is a clerk (lower status), an accurate (honest) assessment might not result, as the clerk may say what the manager wants to hear for fear of offence or retribution. Linguistically, it would be ideal to have usability testers fluent in the language of the users, as this provides better opportunities for the users to speak their native tongue. Forcing subjects to think aloud in a language other than their native tongue might impose further cognitive load. Furthermore, an experimenter who is fluent in the subject's native tongue may be able to form a better rapport with the subject more quickly, and this may aid in getting better results in the usability study.
Conclusion The status factor appears to influence the results of the usability assessment techniques, whereby higher status subjects were more honest than the lower status subjects in their usability assessment responses. To ensure more honest (accurate) responses, the status of the experimenter should be equal to or lower than the status of the subjects. Furthermore, an experimenter who is fluent in the subject's native language may be able to elicit more information if the subject is able to communicate in their native language. This factor is of importance in techniques where verbal responses are required, e.g. the "think aloud" and interview methods.
Further Work Two other factors that may influence the result of the usability evaluation were also examined. The factors were the gender of the subjects and the familiarity of the subject with the experimenter. Although gender does not seem to play any role in our study, initial results indicate that a subject who is familiar with the experimenter may be more honest in the evaluation. Further investigation of these factors is being carried out.
Acknowledgments The authors would like to thank Borland International for permission to use the TCALC spreadsheet source code in their research.
References Abdullah, A. 1996, Going Glocal, (Malaysian Institute of Management, SNP Offset, Shah Alam, Malaysia) Brooke, J.B. 1995, SUS—A Quick and Dirty Usability Scale, In P.Jordan (ed.) Usability Evaluation in Industry, (Taylor and Francis) Fernandes, T. 1995, Global Interface Design. (Academic Press, Chestnut Hill, MA) Herman, L. 1996, Towards Effective Usability Evaluation in Asia: Cross-cultural Differences. In J.Grundy and M.Apperley (eds.) Proceedings of OZCHI’96: Sixth Australasian Computer Human Interaction Conference, Hamilton, New Zealand, Nov., 1996, (IEEE), 135–136 Hofstede, G. 1994, Cultures and Organizations: Software of the Mind, Paperback Edition. (HarperCollins, Glasgow) Holyer, A. 1994, Methods for Evaluating User Interfaces, Cognitive Science Research Paper No. 301, School of Cognitive and Computing Sciences, University of Sussex, Available as http://www.cogs.susx.ac.uk/cgi-bin/htmlcogsreps?csrp301 Morais, E. 1997, Talking in English But Thinking Like a Malaysian: Insights from a Car Assembly Plant. In Proceedings of English Is An Asian Language: The Malaysian Context Symposium, Kuala Lumpur, Malaysia, (Forthcoming) Nielsen, J. 1990, Usability Testing of International Interfaces. In J.Nielsen (ed). Designing User Interfaces for International Use, (Elsevier) Nielsen, J. 1993, Usability Engineering, Paperback Edition, (Academic Press) Software Publishers Association. 1997, Asia Pacific Application Software Sales Rise 22% in 1996, Available as http://www.spa.org/research/releases
INTERFACE DESIGN
INTERFACE DISPLAY DESIGNS BASED ON OPERATOR KNOWLEDGE REQUIREMENTS Fiona Sturrock and Barry Kirwan*
Industrial Ergonomics Group, School of Manufacturing & Mechanical Engineering, University of Birmingham, Birmingham, B15 2TT, United Kingdom.
This study was aimed at designing interface displays which support different types of knowledge for a specific nuclear power plant scenario. An in-depth and exhaustive scenario exploration revealed that the current interfaces did not sufficiently support the operators’ knowledge requirements for the scenario, and much vital information could be considered inert. Interface redesign recommendations were made on the basis of the required type of knowledge, and as a result several new interfaces were designed which would aid in the re-activation of the inert knowledge. These interfaces were designed to be selectively called-up via icons on the original display and are aimed at specifically supporting operators during diagnosis of disturbances.
Introduction Diagnosis of disturbances in complex and dynamic environments such as nuclear power plants (NPP) is considered to constitute one of the greatest cognitive demands experienced by the human operator (Vicente, 1995; Woods et al, 1994). However, the operator is not expected to perform unaided. The role of the interface is not only to represent system data in a meaningful and expected format but also to support operator diagnosis. Ultimately, the interface design should activate or trigger the correct type of knowledge from memory, allowing the operator to generate a quick and accurate diagnosis of the disturbance. Research has for many years recognised the importance of supporting the operator through the interface, especially in complex human-machine systems where incidents are frequently abnormal and unfamiliar, and hence the potential for disaster is great. Therefore, how to represent the dynamic properties of a system to an operator is one of the most important research questions of the current period (Brehmer, 1995). There are few detailed guidelines or methodologies associated with the design of such interfaces available to designers, thus rendering interface design a relatively unstructured process
which is unsatisfactory given the significance of such displays (Seamster et al, 1997). A poorly designed interface may conceal important diagnostic information, or alternatively the presentation of such information on the display may not trigger the utilisation of the relevant knowledge, and as a result the operator's diagnostic ability may be impaired. The research described in this paper is the application of the Types Of Knowledge Analysis (TOKA) approach (Sturrock and Kirwan, 1996) to a classic nuclear power plant training scenario (Steam Generator Tube Rupture¹), focusing predominantly on the redesign of the interface displays.
* Currently Head of Human Factors at ATMDC, NATS, UK.
Type Of Knowledge Analysis (TOKA) approach The TOKA approach to interface design is based on the premise that the differing demands placed on the operator during abnormal or novel scenarios will require a different interface design from that needed during normal or steady-state conditions. The TOKA approach comprises five steps. The first step involves analysing the knowledge concerning the disturbance that is available from the interface, followed by (Step 2) an analysis of the knowledge required by the operator to diagnose the disturbance. The third step is to categorise such available and required knowledge according to the six 'Type Of Knowledge' (TOK) categories shown in Table 1. The fourth step involves identifying where the required knowledge is not supported by the interface, and subsequently making recommendations concerning the redesign of the interface. The final step in the TOKA approach is to test the redesigned displays against the original displays. Table 1. TOK Characteristics
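Step 4, identifying required knowledge that the interface does not make available, can be thought of as a simple set comparison per TOK category. The sketch below is one possible reading of that step; the knowledge items and TOK labels are invented for illustration and are not taken from Table 1.

# A rough sketch of TOKA Step 4: flag required knowledge items that the
# current interface does not make available (candidate inert knowledge).
# The items and TOK labels here are invented for illustration.
required = {
    "TOK4": {"SG level vs manifold steam output", "feedwater flow to faulty SG"},
    "TOK2": {"expected RY13 trend after trip"},
}
available = {
    "TOK4": {"feedwater flow to faulty SG"},
    "TOK2": set(),
}

for tok, needed in required.items():
    unsupported = needed - available.get(tok, set())
    if unsupported:
        print(f"{tok}: redesign to support {sorted(unsupported)}")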
Steam Generator Tube Rupture (SGTR) scenario analysis Initially, an in-depth scenario exploration and knowledge elicitation were carried out using the abundance of literature on this scenario (compared with other NPP scenarios), a very experienced operator, and a full-scope Pressurised Water Reactor simulator at the OECD Halden Reactor Project in Norway. Steps 1–4 of the analysis formed part of an earlier study and readers are directed to Sturrock and Kirwan (1997) for a detailed account of the methodology and results. Table 2, however, shows an extract of the analysis. Although the SGTR scenario is a classic fault, with operators being trained to recognise the symptoms and to take corrective actions, the analysis revealed that the knowledge or information required in order to diagnose the scenario was not optimally supported by the interface, e.g. many of the main symptoms of the scenario are of low salience, despite their relatively high relevance to the context of the disturbance. Table 2. Extract of SGTR scenario results
¹ A steam generator tube rupture scenario in a nuclear power plant, if undetected, could lead to the radioactive contamination of the secondary side of the nuclear process, which could threaten the lives of the plant personnel and eventually the environment.
TOKA-based Interface Designs A detailed examination of the inert knowledge led to the generation of design concepts that would potentially reactivate such knowledge. A summary of the redesign recommendations associated with each TOK is shown in Table 3. The structure of the interface re-design process was aided via semi-structured interviews with several complex-process interface designers, in order to compensate for the lack of formal interface design guidelines. Table 3. Interface redesign recommendations associated with TOK
The resultant redesigned displays are accessible from the original displays (as they support diagnosis rather than routine monitoring) via icons. Each icon represents a different TOK which can be selectively called-up at any time and in any order. Figure 1 shows an example of a redesigned TOKA display for the SGTR scenario which would support TOK 4 (maximum integration of systems).
Figure 1. TOKA-based interface design supporting TOK 4
Figure 1 shows all four steam generators (SG) and their direct link with the main steam manifold, therefore potentially triggering the knowledge that the manifold steam output will remain constant despite the increasing levels in RY13 (on the current interface the link is not directly shown, i.e. it is inert). Also displayed is the steam output from the manifold via a trend graph, which shows the steam output over a period of time. This supports the operator by providing indications concerning how the abnormal functioning of RY13 affects the other parts of the plant (TOK 4). If the operator wants such information from the original displays then he/she must know what systems are likely to be affected (assuming that the malfunctioning component can be identified) and subsequently access the appropriate displays; this, however, assumes that such knowledge will be triggered from memory. Also displayed on this TOK 4 format are the emergency feedwater pumps, which show the direct link between the pumps in the feedwater system and the SG. Such a design feature supports TOK 4 by emphasising the increased feedwater supply to the SG, which is an abnormal condition. It is anticipated that such TOK-supportive displays would result in operators generating more accurate and rapid diagnoses of the disturbance by supporting the knowledge that is required for diagnosis.
Discussion The TOKA approach identifies the parts of the original interface that do not currently support the operator in terms of knowledge requirements, through a highly detailed and structured
walk-through of the task and evaluation of the current interface. The redesign of the interface is prioritised, allowing the inclusion of the most important redesigned features (in terms of supporting operators' knowledge requirements for the task), unlike traditional interface design processes. The potential utility of such a task-analysis-based tool to codify and re-present inert knowledge has implications for interface design. Whereas traditional interface designs rely heavily on designers' expertise and experience, the TOKA approach to interface design, although still requiring the designer to be experienced, follows a five-step structure producing displays which have been designed specifically to support operators during diagnosis. The TOKA approach itself leads the designer to search for these 'Types Of Knowledge', which may otherwise be overlooked. Although this approach is still in its preliminary stages, it addresses current research problems such as what should and should not be displayed to the operator.
Summary The results of this analysis, briefly reported in this paper, show that it is possible to design interface displays based on operators' knowledge requirements. The SGTR scenario displays have been redesigned according to the results of this study, and the next phase of this research will be to compare operator performance using the redesigned interfaces with performance using the original interface. The end goal of the research is to develop a task analysis tool, based on the TOKA approach, which can be used prospectively for determining interface requirements for operator diagnostic support and also for generating recommendations for display design. Acknowledgements: The authors would like to thank the HRP staff and the operator for their co-operation, time and enthusiasm. Thanks must also go to the interface designers (from the Halden Reactor Project, British Nuclear Fuels Limited, and Rolls Royce Associates) for their expertise and help. Disclaimer: The opinions expressed in the paper are those of the authors and do not necessarily reflect those of their respective organisations or the HRP.
References Brehmer, B., 1995, Feedback delays in complex dynamic decision tasks. In P.A.Frensch and J.Funke (eds.), Complex Problem Solving: The European Perspective, (Lawrence Erlbaum Associates, New Jersey), 103–130 Seamster, T.L., Redding, R.E., and Kaempf, G.L., 1997, Applied Cognitive Task Analysis In Aviation, (Ashgate Publishing, Aldershot) Sturrock, F., and Kirwan, B., 1996, Mapping knowledge utilisation by nuclear power plant operators in complex scenarios. In S.A.Robertson (ed.), Contemporary Ergonomics, (Taylor and Francis, London) 165–170 Sturrock, F., and Kirwan, B., 1997, Inert knowledge and display design. In S.A. Robertson (ed.), Contemporary Ergonomics, (Taylor and Francis, London) 486–491 Vicente, K.J., 1995, Ecological interface design: A research overview. In T.B.Sheridan (ed.), International Federation Of Automatic Control Symposium: Analysis, Design And Evaluation Of Man-Machine Systems (Pergamon Press, Cambridge), 623–628 Woods, D.D., Johnassen, L.J., and Sarter, N.B., 1994, Cognitive systems factors, state of the art report, Behind human error: Cognitive systems, computers and hindsight, CSERIAC 94–01
UNDERSTANDING WHAT MAKES ICONS EFFECTIVE: HOW SUBJECTIVE RATINGS CAN INFORM DESIGN
Siné J.P. McDougall*, Martin B. Curry† and Oscar de Bruijn*
* Department of Psychology, University of Swansea, Singleton Park, Swansea SA2 8PP.
† Human Factors Department, Sowerby Research Centre, British Aerospace plc, Filton, Bristol BS12 7QW.
Icons and symbols are now routinely used across a wide range of user interfaces. This has led researchers and designers to explore the mechanisms that are thought to make icons effective. However, research has been hampered by a lack of clarity about exactly which icon characteristics affect user performance. To help researchers address this issue, we propose that subjective ratings can be used to measure individual icon characteristics and to control them experimentally. This paper reviews the research that we have carried out using subjective ratings to explore the effects that icon concreteness, complexity and distinctiveness have on user performance. We make recommendations for design practice on the basis of these findings.
Introduction The escalation in the use of icons to convey information has paralleled the routine incorporation of computing technology within people's everyday lives. To date, the choice of which icons to use has largely depended on international standards, guidelines, and guesswork (e.g. Shneiderman, 1992). As a result, recent research has focused on the need to arrive at a better understanding of the characteristics that make a good icon (e.g. Scott and Findlay, 1991). In order to arrive at a proper understanding of what makes icons effective, researchers need to be able to identify the effects that different icon characteristics have on user performance. This paper examines the success of previous research in realising this goal and highlights the need for greater levels of experimental control. A good way of controlling icon characteristics experimentally is to obtain subjective ratings of each characteristic. Although there has been a long tradition of using subjective ratings to control item characteristics for words and pictures (e.g. Quinlan, 1992), ratings have yet to be applied to icon research. This problem was addressed by obtaining ratings for a corpus of icons on a variety of icon characteristics that included concreteness, complexity, and distinctiveness. These ratings were used in a series of experiments which examined the roles played by complexity, concreteness and distinctiveness in icon effectiveness. The results of these experiments are reviewed in order to show how this methodology can be applied to icon research. The data show that each of these properties has different
behavioural effects and that these can change as a result of user experience. The implications of these findings for icon design practice are discussed.
Concreteness and complexity One of the strongest claims made for icons is that they are easier to use because they are concrete. Concrete icons tend to be more visually obvious because they depict objects, places and people that we are already familiar with in the real world. Abstract icons, in contrast, have an indirect correspondence with our experience and typically represent information by more ambiguous means using shapes, arrows and lines. The evidence available from both research and practice suggests that users exploit the visual metaphor of the real world created by concrete icons and that, as a result, they are easier to understand (e.g. Rogers, 1986). This difference in meaningfulness between concrete and abstract icons has been referred to as the 'guessability gulf' by Moyes and Jordan (1993), since the meaning of concrete icons can be guessed more easily than that of abstract icons when they are first encountered. One way of accounting for the performance advantages observed for concrete icons arises from the extra detail provided when depicting objects, places and people. Garcia et al (1991) measured this detail by applying a complexity metric to icons used in previous studies and found concrete icons to be consistently more complex than abstract icons (e.g. Arend et al, 1989; Rogers, 1986). This suggests that concrete icons must be more complex in order to be easier to use. This assumption, however, contrasts sharply with the recommendations of design guidelines, which emphasise the importance of keeping icons as simple as possible (e.g. Gittins, 1986). In order to determine which of these propositions was correct, we obtained subjective ratings of concreteness and visual complexity for a large corpus of 240 icons. These were drawn from icons in everyday use on machinery, cars, aircraft, computers and public information signs. Forty raters were asked to assess the concreteness of each icon using a 1–5 rating scale. Another group of 40 were asked to rate the visual complexity of the icons using the same scale. A strong correlation between the perceived concreteness and visual complexity of the icons would support the idea that it is extra visual detail that enables users to employ the visual metaphor. We found, however, that this correlation was virtually non-existent (r=–0.03, p>0.05). What this means is that, contrary to previous research, the addition of extra visual detail is not what makes concrete icons visually obvious.
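The check reported above amounts to a Pearson correlation between the two sets of mean ratings; a minimal sketch follows, with invented values standing in for the 240-icon corpus.

# Illustration of the concreteness/complexity correlation check; the
# rating values are invented stand-ins for the 240-icon corpus.
from scipy import stats

concreteness = [4.2, 1.8, 3.5, 2.1, 4.8, 1.2, 3.0, 2.7]  # mean 1-5 ratings
complexity = [2.9, 3.1, 1.5, 4.2, 2.0, 3.8, 2.6, 1.9]

r, p = stats.pearsonr(concreteness, complexity)
print(f"r = {r:.2f}, p = {p:.3f}")  # an r near zero -> no reliable relationship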
The effects of concreteness and complexity on user performance Given that users rate icon concreteness and complexity differently, we might expect that the effects they exert on user performance will also be different. Indeed, research to date suggests that while increasing concreteness enhances user performance (e.g. Rogers, 1986; Stammers et al, 1989), increasing visual complexity diminishes it (Scott, 1993). This, however, does not tell us anything about the locus of these effects on user performance and when they are likely to be important. A study was therefore carried out to examine the role of concreteness and complexity in more detail. Four types of icons were selected from the 240 which had been previously rated. These were (a) concrete and complex (b) concrete and simple (c) abstract and complex and (d) abstract and simple. In all, there were 72 icons—18 of each type. Twenty volunteers were asked
to carry out a search-and-match task in which they were given an icon function and asked to match it to one of the icons in the display (see Figure 1). Eight other icons were used as a background, two of each type (see (a)—(d) above). The volunteers saw the full set of icons many times in order to simulate growing experience. This made it possible to examine how accurately and quickly volunteers could respond as they learned icon-function relationships.
Figure 1. An example of the computer screen display in the search-and-match task During initial learning, icon concreteness appeared to be very important; it enabled icon meaning to be guessed more accurately and quickly. After 9–10 exposures, however, performance differences between concrete and abstract icons disappeared. This suggests that concreteness effects are temporary and are likely to have minimal effects on icons that are frequently used. The visual complexity of icons was also found to affect the time that users took to respond, although it had little effect on accuracy. Users took longer to respond to visually complex icons than those which were simple. This finding was not affected by the number of times icons were presented. This means that the visual complexity of the icons used in time-critical systems will be an important determinant of performance that is unlikely to diminish with experience. Thus, he design of such systems should place emphasis on the creation of simple interfaces rather than hoping for any performance improvements that may result from experience.
Distinctiveness Icon distinctiveness is another important factor in determining how easy icons are to use. Defining distinctiveness is notoriously difficult, and it is often characterised in terms of how discriminable icons are from one another. Distinctiveness is often thought to be the product of an icon's global features (e.g. shape, colour, size) rather than of its local features (i.e. details within icons; see Arend et al, 1987). There is strong evidence that icons differing in their global features can be found more easily in an array because visual search is quicker (e.g. Scott, 1993). However, in the usability research carried out to date it is often difficult to distinguish distinctiveness from visual complexity. For example, Byrne (1993) states that "simple icons (those discriminable based on a few [global] features) seem to help users, while complex icons are no better than simple rectangles".
Following on from our previous research, we used ratings to quantify icon distinctiveness. A group of 30 volunteers were asked to rate the extent to which they felt target icons 'stood out' from other background icons in an array of nine (similar to the array shown in Figure 1). These icons were the same as those used to determine the effects of concreteness and complexity. The targets were shown against a series of different backgrounds: mixed (using the four different types of icon, as in Figure 1 above) or wholly concrete, abstract, simple or complex (forming a uniform background). Subjective ratings suggested that icon distinctiveness was largely determined by the contrast between a target's characteristics and the characteristics of the background icons. For example, simple target icons were rated as being most distinctive against a background of complex icons. Similarly, concrete icons were rated as most distinctive when presented against a background of abstract icons. There was little effect of distinctiveness when icons were presented against a mixed background. To summarise, there appear to be two types of contrast: (a) a visual contrast that relates primarily to differences in icon complexity, and (b) a semantic contrast that relates primarily to differences in concreteness. Taken together, these findings suggest that the source of distinctiveness effects is unlikely to lie solely in visual search (as suggested in previous research) but may also affect users' ability to select icons on the basis of their meaning. The effect of creating visual and semantic contrasts on user performance was subsequently examined.
The effects of distinctiveness on user performance Users were asked to complete the same search-and-match task as shown in Figure 1. However, in this instance, the displays were designed to create a contrasting background for the target icon. In one study visual contrasts were used, e.g. the lock (a simple icon) shown in Figure 1 was set against a background of complex icons. In a second study semantic contrasts were used, e.g. the lock (which is also concrete) was set against a background of abstract icons. So, how was user performance affected by these contrasts? Our findings show that generating contrasts can be very effective in enhancing user performance, but designers need to be aware of exactly which types of contrast are likely to be effective. When visual contrasts were created by contrasting the target's complexity with the background display, it was clear that user performance was faster when the target was simple and the background was complex. However, when a complex icon was presented against a simple background, user performance was not improved. The behavioural effects generated by the use of contrast are, therefore, not symmetrical, and designers should not treat icon contrasts as being two sides of the same coin. When semantic contrasts were used in displays, similar effects were observed. Concrete icons presented against an abstract background enabled users to respond very quickly in the search-and-match task. However, when the contrast was reversed, with abstract icons being presented against a background of concrete icons, user performance was not enhanced. Once again, this shows that the behavioural effects created by generating a contrast were not symmetrical.
Design implications of research findings To conclude, we believe that our findings are clearly relevant to design practice. The implications of our findings are summarised below.
• The addition of visual detail does not make icons more visually obvious. • The difference in user performance between concrete and abstract icons known as the ‘guessability gulf’ is short-lived and does not affect experienced users. • Concrete icons are likely to be most useful in public information systems or for icons that are rarely used (such as warnings). • The use of complex icons should be avoided in displays where user responses are timecritical. Even experienced users respond more slowly to complex icons. • Both visual and semantic contrasts can be created within displays. • Contrasts do not always enhance user performance and it is important for designers to be clear about exactly what contrasts are effective.
Acknowledgements This research has been supported by British Aerospace grant (SRC/UCS/060495). The icons in Figure 1 have been reproduced with permission of the British Standards Institute; International Electrotechnical Commission; Microsoft Corporation.
References Arend, U., Muthig, K.P. & Wandmacher, J. (1987). Evidence for global feature superiority in menu selection by icons. Behaviour and Information Technology, 6, 411–426. Byrne, M.D. (1993). Using icons to find documents: simplicity is critical. Proceedings of INTERCHI ‘93, 446–453. Garcia, M., Badre, A.N. & Stasko, J.T. (1994). Development and validation of icons varying in their abstractness. Interacting with Computers, 6, 191–211. Gittins, D. (1986). Icon-based human-computer interaction. International Journal of ManMachine Studies, 24, 519–543. Moyes, J. & Jordan, P.W. (1993). Icon design and its effect on guessability, learnability and experienced user performance. In J.D.Alti, D.Diaper & S.Guest, People and Computers VIII. Cambridge: Cambridge University Press. Quinlan, P. (1992). The MRC Psycholinguistic Database. Cambridge: Cambridge University Press. Rogers, Y. (1986). Evaluating the meaningfulness of icon sets to represent command operations. In M.D.Harrison & A.F.Monk (Eds.), People and computers: Designing for usability. Cambridge: Cambridge University Press. Rohr, G. & Keppel, E. (1985). Iconic interfaces: Where to use and how to construct. In H.W.Hendrick & O.Brown (eds.), Human factors in organisation design and management. Amsterdam: Elsevier Science Publishers. Scott, D. & Findlay, J.M. (1991). Future displays: A visual search comparison of computer icons and words . In E.J.Lovesey, Contemporary Ergonomics: Proceedings of the Annual Conference of the Ergonomics Society, 246–251. London: Taylor & Francis. Scott, D. (1993). Visual search in modern human-computer interfaces. Behaviour & Information Technology, 12, 174–189. Shneiderman, B. (1992). Designing the User Interface: Strategies for Effective HumanComputer Interaction. 2nd edition. Reading, MA: Addison-Wesley.
REPRESENTING UNCERTAINTY IN DECISION SUPPORT SYSTEMS: THE STATE OF THE ART Caroline Parker
HUSAT Research Institute Loughborough University, LE11 1RG
There is increasing interest in the potential of Decision Support Systems (DSS) in agriculture. DSS are usually based on simulation models with which a degree of uncertainty is always associated. In response to the practical problem of how to present this uncertainty to non-technical users a literature review was undertaken. The maturity of this technology in other industrial sectors led to the expectation that answers already existed. Results so far suggest that this belief was unfounded. This paper is a first attempt to collate the answers which do exist and some general guidelines for presentation are given.
Introduction This paper stems from a very specific practical problem: the need to produce a design solution to an interface requirement for an agricultural decision support system. DESSAC (Decision Support Systems for Arable Crops) is a MAFF Link-funded project which has as part of its remit the development of a decision support system (DSS) for winter wheat fungicide use. A DSS can be defined as a tool which helps the user to make better decisions by providing access to a model- or rule-based representation of the decision area and to supporting information. There is increasing interest in the potential of these tools in the agricultural and horticultural industries. Agricultural DSS, in common with any system attempting to describe and predict natural processes, are not capable of giving definitive answers. The emphasis is on support and not decision making. Agricultural DSS contain one or more simulation models which approximate the interactions between biological systems. As these models are only estimates, there is always a degree of uncertainty associated with their output (often shown as a probability distribution). A major requirement at the interface is the expression of this uncertainty as well as the general estimate of risk. The DESSAC system needs to display a variety of solutions to a spray plan problem so that the user can identify the best-fit solution and the differences (or lack of these) between the risk levels associated with them. Farm-based users need to know what the worst, best and most likely outcomes, and the spread between them, might be, so that they can make realistic comparative decisions. What exactly is the problem in displaying something that the statistical sciences have been expressing for a long time? Firstly, there is the non-technical background of the target audience. The DESSAC project is working on the assumption that the user population will be computer
literate but have little familiarity with statistics or the nature of modelling uncertainty. Previous work in the area suggests that few agricultural users are aware of the limitations of models; there is a tendency either to accept the output of the tool as 100% accurate or to view it with an extreme degree of prejudice. Good communication of uncertainty is always critical to good decision making (Cleaves, 1995) but obviously more so under these circumstances. Secondly, there is the underlying assumption that these systems should also improve the general level of decision making. The reason that agricultural DSS are funded is the urgent need to reduce, or to target more effectively, the use of agro-chemicals in the UK. The human interface to the DSS therefore has three jobs to do:
• to guide the user in the direction of better decision making,
• to present decision supportive information to the user, and
• to make best use of human capabilities.
It has to do this in the context of a non-technical user group, working with a complex problem in real time. As this is a problem common to all decision support systems, an early literature review was conducted in the strong belief that answers would already exist. Six months later, few solutions have been found. This paper is a first attempt to bring together the answers which do exist and to place them in a context which is meaningful for the developers of decision support systems.
Types of uncertainty DSS are based either on simulation models, or on rules extracted from domain experts, or on a mixture of the two, and their answers to the questions posed by the user will always have a degree of uncertainty around them. But what exactly do we mean by uncertainty in this context? What types of uncertainty have been identified? Finding an appropriate categorisation for uncertainty should make it easier to group design solutions in a way which is useful for DSS developers. Krause and Clark (1993) provide a branching classification system in which types of uncertainty are first divided according to whether they relate to a single proposition or to a group of propositions; then whether they arise from ignorance or from conflict; and finally into eight subcategories (op cit. p.7). The critical ones for this discussion relate to the unary set and are: Indeterminate Knowledge (vagueness); Partial Knowledge (confidence); Equivocation; Ambiguity; and Anomaly (error).
Table 1: Uncertainty types within information categories
A DSS provides the user with many types of information, e.g. the set identified by Brookes (1985). Each type of information may bring with it its own form of uncertainty. Table 1 above lists Brookes' information retrieval categories and suggests the types of uncertainty, as defined by Krause and Clark, to be found within them. The table shows that different types of uncertainty are present in the various information categories and that there is a further division, between input and method: uncertainty within a DSS relates either to the data fed into the system, and/or to the method (models or rules) used to generate the answers to decision enquiries (Arinze, 1989). In the case of the first two of Brookes' categories, the mechanisms for generating an answer are purely mechanical and based on very simple and very complete algorithms. No uncertainty is generated by mechanisms (models or rules) in these cases; the only possible source of uncertainty is that associated with the data fed into them. As all data is prey to anomaly (input error) and much real data contains missing values (indeterminate knowledge), it must be assumed that these types of uncertainty are always associated with inputs. Where data is generated (i.e. by a model) rather than input there will be uncertainty associated with the mechanism used for the generation (predictive information). In the case of agricultural systems any prediction of weather conditions or disease progress will be prone to 'partial knowledge' uncertainty because it is impossible to produce an accurate simulation of these complex and chaotic systems. Any other use of models or rules is prone to the same type of uncertainty. On the output side these systems may produce ambiguous results where it is not clear which of a number of outcomes is preferable. Rule bases are also prone to equivocation, where two or more rule sets are equally applicable. Another level of categorisation which is particularly relevant to uncertainty in DSS is the type of data being displayed, i.e. whether it is nominal, ordinal, interval or ratio numeric data or whether it is textual. Expert systems, based on rules extracted from experts, may produce numeric or textual data, but the numeric data is likely to be an expression of an expert opinion. Model-based systems, on the other hand, produce numeric uncertainty based on the application of equations to numeric inputs (inputs which may of course be based on non-numeric judgements).
Display solutions Many papers describe the performance of graphical vs. tabular vs. textual displays for interval and ratio type numeric data, often with conflicting findings; others suggest that a mix of tabular and graphical displays produces the best performance (e.g. Bennett and Flach, 1992). In general the literature seems to suggest that a graphical format is the easiest to interpret, even for small data sets (Melody Carswell, 1997) and, given the additional difficulties surrounding uncertainty, a graphical format may be considered to be the better approach. However, the influence of the graphical display is also found to be highly dependent on the type of task it is intended to support. The right graphical display is thus needed to express specific task uncertainties. In the only experiment of its type this survey was able to locate, Ibrekk and Morgan (1987) looked specifically at the problem of graphically representing uncertainty to non-technical users. Two types of users (non-technical and technically aware) were presented with nine graphical displays of the same data, with and without instruction. The displays were: a point estimate with error bar; six displays of probability density (a discretised display, a pie chart, a conventional display, a mirror-image display, and horizontal bars shaded to show density using either dots or vertical lines); a Tukey box plot; and a cumulative distribution function. Six forms of the probability density display were used because formally equivalent representations are not often psychologically equivalent. Subjects were asked to make judgements about realistic events such as the depth of predicted snowfall and flood. They found that the performance of a display depended on the information
that the subject was trying to extract, and concluded that displays that explicitly show the information people need show the best performance. Pie chart displays were found to be potentially misleading, and subjects displayed a tendency to select the mode rather than the mean unless the mean was explicitly marked. Where subjects were asked to make judgements about probability intervals in displays that did not forcefully communicate a sense of probability density, there was a tendency for them to use a linear proportion strategy equivalent to an assumption of a uniform probability density. Explanations had little effect on performance although there was evidence that subjects were trying to use them. Another finding was that there was little difference between the performances of the technical and non-technical groups, suggesting that a 'rusty' knowledge of statistics, or a graduate degree, will not necessarily improve performance. Designs which support non-technical users will therefore be equally valuable to the technically literate. The alternative to graphical or tabular representations of uncertainty is textual/verbal representation. Budescu and Wallsten (1995) investigated information processing, choice behaviour and decision quality when subjective uncertainty was expressed linguistically or numerically. They identified two further categorisations of uncertainty, 'precise' and 'vague'. Uncertainty is precise if it depends on external, quantified random variation, and vague if it depends on internal sources related to lack of knowledge or judgements about the nature of the database (op. cit.). These distinctions map well onto Brookes' information categories. Precise uncertainty can be linked to the first three categories, and vague uncertainty to the last two, because they relate very much to internal, user-based uncertainty. In a later experiment, Olson and Budescu (1997) found that verbal representations outperform numeric ones when the nature of the underlying uncertainty is also vague. The best mode of communication, they suggest, is the one which most clearly matches the nature of the event and the source of its underlying uncertainty.
Summary While there is insufficient room in this short paper to expand on, or even describe, all of the data gathered in this exercise, it is possible to make broad recommendations. It would appear that there are three key issues which impact on the form of representation for displaying uncertainty:
• the type of uncertainty being displayed (e.g. Krause's categories);
• the type of data being displayed (e.g. numeric, textual, precise or vague); and
• the user's requirement for information to support the task (mean, mode, etc.).
In the latter case users may, or may not, need to see the reasoning behind the data. Tufte (1997) argues quite forcibly that they do, Ackoff (1967) that they don't. In the agricultural domain it seems likely that many farmers will not want to see the reasoning behind the data whereas most agronomy consultants and the more technically minded farmers will. The answer seems to be user-dependent, requiring a layered interface approach. The literature surveyed to date would seem to suggest the use of graphics as a first choice for representing numeric data of the 'precise' type, i.e. relating to factual, instructive and predictive information, with tabular representation providing additional support. Where the data may be either numeric or textual and the type of uncertainty is 'vague', i.e. factual inferential and causal inferential information, then textual representation is to be preferred. The results of Ibrekk and Morgan's experiment suggest that, within graphical display types, mirror-image displays and the shaded bar displays of probability density are the best for communicating the ranges that variables assume, and box plots or simple error bars are the best way of communicating means. It may be the case, however, that explicitly marking the mean on the probability density display would produce the optimal solution for mean and range.
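To make the display recommendations above concrete, the following is a minimal illustrative sketch (not part of the original study) of the kind of graphical formats discussed: a shaded probability density with the mean explicitly marked, paired with a simple point-estimate-and-error-bar view. The quantity plotted (a predicted yield loss) and its distribution are invented purely for illustration.

```python
# Illustrative sketch only: a shaded probability density with the mean marked,
# next to a point estimate with an error bar. The data are hypothetical.
import numpy as np
import matplotlib.pyplot as plt

mean, sd = 12.0, 3.0                      # hypothetical model estimate and spread
x = np.linspace(mean - 4 * sd, mean + 4 * sd, 400)
density = np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Shaded density with the mean explicitly marked, so users can read off both
# the most likely outcome and the plausible range.
ax1.fill_between(x, density, color="lightgrey")
ax1.axvline(mean, color="black", linestyle="--", label="mean")
ax1.set_xlabel("Predicted yield loss (%)")
ax1.set_ylabel("Probability density")
ax1.legend()

# Point estimate with an error bar (mean +/- 2 SD) as a simpler companion display.
ax2.errorbar([1], [mean], yerr=[2 * sd], fmt="o", capsize=5, color="black")
ax2.set_xlim(0, 2)
ax2.set_xticks([])
ax2.set_ylabel("Predicted yield loss (%)")

plt.tight_layout()
plt.show()
```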
Conclusion The title of this paper refers to the 'state of the art' in the representation of uncertainty in decision support systems, and this paper has made a first attempt to define it. However, while there is a great deal of work to which this paper, given the limited space, has not referred, the tragedy is that most of it is microscopic in scope. Published work deals with human limitations in relation to different display mechanisms, to the nature of uncertainty, to the differences between expert and non-expert users and to many other issues surrounding the problem. Very little is directly relevant to the designer of decision support tools. Indeed, this review has made it apparent that not much has changed since the publication of Morgan and Henrion's book in 1990, where they concluded that: "for the most part the absence of empirical studies of the relative virtues of alternative displays means that the choice for displays remains largely a matter of personal judgement" (op. cit. p. 220). A great deal more targeted research is required if developers of decision support tools are to be given the practical guidelines they need.
Acknowledgements The author wishes to thank Loughborough University; the DESSAC project, MAFF and HGCA for funding; Murray Sinclair; David Parsons of Silsoe Research Institute; and the many leads and pointers provided by email contacts and colleagues to their own and others' work in this area.
References
Ackoff, R.L. 1967. Management Misinformation Systems. Management Science, 14(Series B), 147–156.
Arinze, B. 1989. Developing Decision Support Systems from a model of the DSS/User Interface. In G.I.Doukidis (Eds.), Knowledge based management support systems, 166–182. (Chichester: Ellis Horwood.)
Bennett, K.B. & Flach, J.M. 1992. Graphical Displays: Implications for Divided Attention, Focused Attention, and Problem Solving. Human Factors, 34(5), 513–533.
Brookes, C.H.P. 1985. A Framework for DSS Development. In Transactions of the Fifth International Conference on DSS.
Budescu, D.V. and Wallsten, T.S. 1995. Processing Linguistic Probabilities: General Principles and Empirical Evidence. In J.Busemeyer, R.Hastie and D.L.Medin (Eds.), The Psychology of Learning and Motivation, 275–318. (New York: Academic Press.)
Cleaves, D.A. 1995. Assessing and Communicating Uncertainty in Decision Support Systems: Lessons from an Ecosystem Policy Analysis. AI Applications, 9(3), 87–102.
Ibrekk, H. & Morgan, M.G. 1987. Graphical Communication of Uncertain Quantities to Non-technical People. Risk Analysis, 7(4), 519–529.
Krause, P. and Clark, D. 1993. Representing Uncertain Knowledge: An Artificial Intelligence Approach. (Kluwer Academic Publishers.)
Melody Carswell, C. & Ramzy, C. 1997. Graphing small data sets: should we bother? Behaviour and Information Technology, 16(2), 61–71.
Morgan, M.G. and Henrion, M. 1990. Uncertainty: A guide to dealing with uncertainty in quantitative risk and policy analysis. (Cambridge: Cambridge University Press.)
Olson, M.J. and Budescu, D.V. 1997. Patterns of Preference for Numerical and Verbal Probabilities. Journal of Behavioural Decision Making, 10, 117–131.
Tufte, E.R. 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. (Cheshire, Connecticut: Graphics Press.)
REPRESENTING RELIABILITY OF AT-RISK INFORMATION IN TACTICAL DISPLAYS FOR FIGHTER PILOTS Maddalena Piras1, Stephen Selcon2, Jeffrey Crick2 and Ian Davies1
1 Department of Psychology, University of Surrey, Guildford, Surrey GU2 5XH
2 Human Factors Group, Systems Integration Department, Air Systems Sector, DERA, Farnborough
We report a study of representing the reliability of 'at-risk' information to pilots. The Launch Success Zone (LSZ) shows the pilot whether they are within firing range of an enemy missile. Here we compare four ways of representing the reliability of LSZs. Three designs (qualitative, quantitative and graphical) displayed threat information above the symbols representing enemy aircraft, while a fourth display integrated threat information with the representation of the LSZ. Using a visual search paradigm, it was found that the graphical representation produced the fastest decisions, and that hostiles with the highest risk levels were detected most quickly.
Introduction We report a study that is part of a research program to identify optimal ways of representing 'certainty' information in tactical displays during air-to-air combat (see Selcon et al., 1995; Crick et al., 1997). Head-down displays (HDDs) can represent whether the pilot is within the firing range of enemy missiles (the Launch Success Zone or LSZ), but the reliability or certainty of these boundaries is variable. However, it is also possible to indicate the reliability of these boundaries, and the questions we address here are whether such information should be included in HDDs and, if so, how. Previous work by Selcon et al. (1995) and Crick et al. (1997) found that pilots could make tactical use of LSZs. Further, these LSZs were used most effectively when depicted in a graphical format. Similarly, Kirschenbaum and Arruda (1994) found that graphical forms of representing uncertainty for ships produced better performance than verbal representations. The present experiment extended Selcon et al.'s investigation by including information representing the reliability of the LSZs. Specifically, we compared four ways of representing certainty: qualitative, quantitative, graphical and integrated. The qualitative display represented certainty with abbreviations above the symbol for the enemy aircraft, e.g. VL (very low). The quantitative display gave the certainty information as percentage risk scores placed above the enemy symbol. The graphical display represented certainty by the length of a bar positioned above the enemy. The integrated display represented certainty by how continuous the line depicting the LSZ was.
Method Subjects There were 16 civilian subjects, all members of DERA, with ages ranging from 19 to 30 years; there were eight men and eight women. All had normal or corrected-to-normal eyesight, and none were pilots.
Apparatus The stimuli were displayed on a Silicon Graphics ZX workstation and the keyboard was used as the response device. The display showed the 'ownship' (the pilot's own location) at the bottom of the display, represented by a triangle with a direction indicator (a line) protruding from it (see Figures 1–4). All displays showed three symbols representing enemy aircraft: circles with direction indicators. The enemy aircraft were at headings of either 180°, 150°, 120° or 90°. Each hostile aircraft was ranked in terms of its threat. Thus, the hostile could either be of a high, medium or low threat. Each enemy aircraft had its own Launch Success Zone (LSZ). The LSZ of one hostile was always covering the ownship. In Figures 1–4 the ownship is represented by a triangle and the hostile aircraft by circles. The LSZs are shown as discrete regions displaced from the respective hostile in the direction of flight. Thus the direction indicators point at the respective LSZ, while the reliability of the LSZ is shown in a variety of ways, as follows. 1) Qualitative: (Figure 1) symbols for enemy aircraft had lettering above them representing the level of certainty: VL (very low certainty); L (low certainty); QL (quite low certainty); QH (quite high certainty); H (high certainty) and VH (very high certainty).
Figure 1. Qualitative representation of uncertainty
2) Quantitative: (Figure 2) numbers were positioned above each enemy aircraft representing the level of certainty. The percentage levels were as follows: 6%, 18%, 32%, 66%, 78% and 93%. The lowest percentage represented the lowest certainty, and the highest percentage represented the highest certainty.
Figure 2. Quantitative representation of uncertainty
3) Graphical: (Figure 3) a bar was positioned above each enemy aircraft with shading corresponding to the level of certainty. The more shading the bar had, the more reliable the LSZ information was.
Figure 3. Graphical representation of uncertainty.
4) Integrated: (Figure 4) the continuity of the line representing the LSZ was varied to represent certainty or reliability. The more continuous the line, the greater the certainty.
Figure 4. Integrated representation of uncertainty.
There were sixteen base scenarios. Each scenario was used six times; each time, however, a different combination of certainty levels was displayed. Only one aircraft at a time represented the highest certainty; no two aircraft would display the same certainty at a given time. Scenarios were counterbalanced due to the variations in position of the enemy aircraft on the screen. Each condition consisted of 96 stimuli plus five practice stimuli.
Procedure Participants were asked to respond to the enemy aircraft which had the highest certainty. They responded by making a keyboard response in the form of aircraft 1, 2 or 3. Before the start of each condition, specific instructions for that condition were given. When the practice trials were completed, participants had the opportunity to ask any questions. The experimental session then started; it was subdivided into four blocks of 96 trials. The order in which conditions were presented to each subject was counterbalanced using a Latin square design.
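As a purely illustrative aside, the following sketch shows one simple way such a Latin square counterbalancing scheme might be generated; the condition labels match the four display types described above, but the cyclic construction and the assignment of rows to subjects are assumptions, not the authors' actual schedule.

```python
# Illustrative sketch (not the authors' code): a cyclic 4x4 Latin square of
# condition orders, so each display condition appears once in every serial
# position, with rows cycled across subjects.
conditions = ["qualitative", "quantitative", "graphical", "integrated"]

def latin_square(items):
    """Return a cyclic Latin square: row i is the item list rotated by i."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

square = latin_square(conditions)
for subject in range(16):                       # 16 subjects in the study
    order = square[subject % len(conditions)]   # rows reused every 4 subjects
    print(f"Subject {subject + 1:2d}: {order}")
```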
Results Data from one subject were discarded due to high error scores (over 50%), leaving the total number of subjects at 15. Figure 5 shows the mean reaction times across subjects for each kind of display and each risk level. The clearest trend that can be seen in Figure 5 is that RTs for the graphical display are lower than for the other three types. In addition, RTs to high-risk symbols seem to be lower than for other risk levels. A two-way ANOVA (display by threat) supported these impressions. Both main effects were significant: display (F=82.1; d.f.=3,42; p<0.001) and threat level (F=17.2, d.f.=2,28; p<0.001). The first effect reflects the graphical display producing the fastest performance, and the second the faster responses to high-risk symbols. There was no suggestion of any interaction between the two main effects.
Figure 5. Mean RTs for each level of threat and for each type of display
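For readers unfamiliar with this kind of analysis, the sketch below shows how a two-way repeated-measures ANOVA of the sort reported above might be run on simulated data; the data, effect sizes and the use of statsmodels' AnovaRM are illustrative assumptions and do not reproduce the study's results.

```python
# Illustrative sketch with simulated data (not the study's): a two-way
# repeated-measures ANOVA of reaction time by display type and threat level.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
displays = ["qualitative", "quantitative", "graphical", "integrated"]
threats = ["low", "medium", "high"]

rows = []
for subject in range(1, 16):                      # 15 subjects retained
    for disp in displays:
        for threat in threats:
            rt = 2.5 + (0.0 if disp == "graphical" else 0.75)  # illustrative effect
            rt -= 0.15 if threat == "high" else 0.0
            rows.append({"subject": subject, "display": disp, "threat": threat,
                         "rt": rt + rng.normal(0, 0.2)})

data = pd.DataFrame(rows)
result = AnovaRM(data, depvar="rt", subject="subject",
                 within=["display", "threat"]).fit()
print(result)
```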
Discussion There were two clear effects in the results. First, the graphical representation of reliability produced the fastest RTs. The size of this effect was substantial: RTs to the graphical display were about 0.75 seconds faster than to the next fastest symbols. Second, there was an effect of threat level: medium threat levels produced the slowest RTs and high threat levels produced the fastest RTs. However, the size of the threat effect was small relative to the symbol effect (about 0.15 seconds) and probably reflects subjects searching the display radially about the ownship at the bottom of the display (see Figures 1–4).
The four types of symbols evaluated in this study were already under consideration by the display designers. While the trends in our results are clear in supporting the use of the graphical display, it is of course possible that some other design would produce better performance than the current best design. The choice of symbols is largely governed by ‘craft knowledge’ as there is no adequate theory of symbol processing that fits all possible types of symbol. However, given the size of the effect found, and the significance of such an effect in the fast moving world of the fighter pilot, it is worth extending the research to evaluate other display symbols, and the stability of results across a range of tasks and expertise.
Acknowledgements We are very grateful to colleagues at DERA, Craig Shanks, Maitland De Souza, Susan Driscoll and Alex Bunting, for valuable discussions and help with the experiment. We would also like to thank Nigel Woodger of University of Surrey, for laying out the paper.
References
Crick, J.L., Selcon, S.J., Piras, M., Shanks, C., Drewery, C., and Bunting, A. 1997, Validation of the explanatory concept for decision support in air-to-air combat. Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting, in press.
Kirschenbaum, S.S. and Arruda, J.E. 1994, Effects of graphic and verbal probability information on command decision making. Human Factors, 36(3), 406–418.
Selcon, S.J., Bunting, A., Coxell, A., Lal, R., and Dudfield, H. 1995, Explaining decision support: an experimental evaluation of an explanatory tool for data-fused displays. Proceedings of the 8th International Symposium on Aviation Psychology, 1, 92–97, Columbus, OH.
Semantic Content Analysis of Tasks Conformance
Alex Totter, Chris Stary
University of Linz, Department of Business Information Systems, Communications Engineering, Freistädterstrasse 315, 4040 Linz, Austria
Design principles and usability measurements, such as task conformance, are widely used in the course of information system development and user interface evaluation. Although there exist commonly accepted frameworks for these principles and measurements, such as the ISO-standard 9241 Part 10, techniques for development and evaluation vary to a great extent when implementing them. As a consequence, the results of utilizing the principles and measurements for software development and evaluation lack quality in terms of reliability, validity and objectivity. In order to overcome this deficiency, semantic content analyses and, subsequently, analytical definitions capturing the meaning are required for each of the principles and measurements. In this paper the results of the semantic content analysis for one of the major principles, namely tasks conformance, are reported. The semantic content analysis has been performed on six techniques for user-interface evaluation that contain different interpretations of tasks conformance. The results should be used to avoid further diversification in the interpretation of design principles and evaluation measurements.
Introduction Design principles and usability measurements, such as task conformance and adaptation, are widely used in the course of user interface development and evaluation. They are part of design and evaluation methodologies, such as EVADIS II (Oppermann et al., 1992), of international standards, such as ISO 9241—Part 10 (1990), and of directives, such as 90/270/EEC (EU-Directive, 1990). Their understanding of task conformance is mostly based on the following interpretation: "A dialogue supports task conformance, if it supports the user in the effective and efficient completion of the task. The dialogue presents the user only those concepts which are related to the task" (ISO 9241 Part 10, 1990). In order to gain insights into the concept and practical impact of task conformance for design and evaluation, first a conceptual and then a semantic content analysis provide the basis for determining the semantics (meaning) of the principle itself and related measurements. Such a specification of meaning lays the ground for an analytical definition of the principle. This definition can then be used to develop reliable, objective, and valid techniques for the development and the evaluation of user interfaces. Figure 1 illustrates the addressed cycle for the improvement of quality in general: in a first step, the principles and measurements that are part of standards as well as of techniques for design and evaluation are identified. For each of the principles and measurements the descriptions as well as their utilization in different techniques have to be acquired, compared, and analyzed. This second step is termed semantic content analysis. In case the use of a principle or measurement in
different techniques leads to different descriptions, further activities are required to ensure proper understanding. These activities comprise an explicit identification of the meaning (an analytical definition) through an analysis of the meaning of a principle or measurement, as for instance proposed by Bortz and Döring (1995). Meaning analysis increases the transparency of the subsequent operational definition, since it provides the terminological and conceptual framework for the development of techniques for user interface development and evaluation. Figure 1 shows this transition and its result on the right side. Once the semantics of a principle or measurement has become transparent, its operational definition can be performed on a sound epistemological basis.
Figure 1. Methodological Framework for Quality Improvement
In this paper we focus on the semantic content analysis of a particular principle for design and evaluation, namely tasks conformance. The benefits of the analysis are demonstrated through elaborating the terminological and conceptual deficiencies in the context of developing a proper technique of evaluation.
The Investigation The inputs to the semantic content analysis have been extracted from the following techniques: ABETO (Technology Consulting Nordrhein-Westfalen, 1994), Ergonomics-Checker (Technology Consulting Nordrhein-Westfalen, 1993), EVADIS II (Oppermann et al., 1992), Evaluating Usability (Ravden and Johnson, 1989), IsoMetrics (Willumeit et al., 1996), and Software Checker (TCO, 1992). The selection of these techniques was based on the criteria of availability and accuracy: (i) how difficult is it to get access to the technique and use it practically? (ii) does it provide a description of task conformance similar to standards, such as ISO 9241 Part 10 (1990)? Only those parts of the techniques that focus on the evaluation of software have been considered for the analysis. Most of the techniques are based on the ISO-standard 9241 Part 10. According to the goal of our study, it had to be investigated whether the selected techniques provide a theoretically sound operational definition of task conformance based on their descriptive interpretations of this principle. The semantic content analysis has been based on all of the questions of the six selected techniques. Overall, 74 questions have been identified with respect to task conformance exclusively. For each of the techniques the identified questions have been cross-checked for
mutual semantic correspondence. In order to ensure objectivity, the entire set of cross-checks has been performed by two independent evaluators who are experts in the field of software ergonomics. Figure 2 shows this first step of the semantic content analysis. This step identifies the redundant questions of task conformance within the set of techniques under investigation (see the first and second columns of Table 1).
Figure 2: Mutual Cross-check of Techniques with Questions Concerning Tasks Conformance Exclusively
In the second step of the qualitative content analysis the questions of each technique have been checked mutually against those questions of all other techniques that have not been related to task conformance initially. Again, the semantic correspondence of the questions has been checked.
Figure 3: Cross-check of Questions Concerning Task Conformance (TC) with Non-Task-Conformance Questions of Other Techniques
Figure 3 details one iteration of this step. The first input to the analysis, namely the set of questions concerning task conformance, remains identical. However, in contrast to step 1, the second input is the set of questions of the other technique(s) that is not directly related to task conformance, i.e. the set of questions that has not been involved in step 1. For each of the techniques the set of cross-checks as shown in Figure 3 has been performed, again by two independent experts in the field of software ergonomics. Step 2 has been considered to be completed when all the questions had been cross-checked for semantic correspondence(s). The results of this step can be represented in a correspondence matrix (partly shown in Table 1). This step has identified all those questions that can not only be assigned to task conformance but also to other principles. Such multiple assignments indicate problems of validity, due to mutual dependencies.
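As an illustration only, the following sketch shows one way the correspondence judgments produced by such cross-checks might be tallied; the technique names, question identifiers and judgments are invented, and the real analysis was performed manually by the two evaluators.

```python
# Hypothetical sketch: tallying judged correspondences between task-conformance
# (TC) questions and questions of other techniques. All identifiers are invented.
from collections import Counter

# Each judged correspondence pairs a TC question with a question from another
# technique and records the principle that question is assigned to there.
correspondences = [
    ("EVADIS:TC-03", "IsoMetrics:Q12", "task conformance"),
    ("EVADIS:TC-03", "ABETO:Q07", "controllability"),
    ("IsoMetrics:TC-01", "SoftwareChecker:Q21", "self-descriptiveness"),
    ("ABETO:TC-05", "EVADIS:Q33", "controllability"),
]

# Step 1 count: correspondences that stay within task conformance.
within_tc = sum(1 for _, _, principle in correspondences
                if principle == "task conformance")

# Step 2 count: TC questions that also map onto other principles.
other_principles = Counter(principle for _, _, principle in correspondences
                           if principle != "task conformance")

print(f"Correspondences within task conformance: {within_tc}")
print("TC questions also assigned to other principles:")
for principle, count in other_principles.most_common():
    print(f"  {principle}: {count}")
```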
Table 1: Semantic Correspondences
Following the tradition of qualitative studies (e.g., Bortz, Döring, 1995) the results of the previous steps are analyzed in a descriptive-statistical way. The analysis of results of step 1 (mutual cross-checks of TC-questions) has led to a semantic correspondence of 22 out of 73 questions concerning task conformance exclusively—one question has been found three times in different techniques in the context of task conformance. Table 2: Number of TC-Questions Also Assigned to Other Categories
The analysis of the results of step 2 (cross-checking TC-questions against questions assigned to other categories of measurement) answers the following question: are there any questions that concern task conformance (TC-questions) which are also utilized to measure principles other than task conformance? Overall, 38 out of 73 TC-questions correspond semantically to questions that are assigned to other principles in other techniques. Given 51
TC-questions, i.e. the reduced set of questions according to the 22 semantic correspondences found in step 1, 25 questions (about 50 %) have a semantic correspondence to questions assigned to other categories of measurement. Particular questions have been used up to 5 times to measure other principles than task conformance. Overall, 47 TC-questions have been found in other categories than TC. Table 2 shows the principles that contained TC-questions. The leading principle is controllability, followed by self-descriptiveness and individualization.
Conclusions Although the principles for designing and evaluating user interfaces are based on common frameworks, their operational definition has led to results that lack reliability, objectivity, and validity. In order to overcome these deficiencies, a semantic content analysis and a subsequent meaning analysis of the principles and measurements are required. In this paper, a first step towards the empirically sound operational definition has been made through performing a semantic content analysis. The semantic consistency of one of the major design principles and measurements, namely task conformance, has been examined based on 6 techniques for user interface evaluation. Like most of the results of qualitative analyses the results of the semantic content analysis should be used for further empirical work. In our case the follow-up investigation should comprise a meaning analysis of task conformance enabling an analytical definition of this principle.
References
Bortz, J. & Döring, N.: Research Methods and Evaluation (in German), 2nd edition. Springer, Berlin, 1995.
EU-Directive 90/270/EEC: Human-Computer Interface. Occupational Health and Safety for VDU-Work (5th Directive, Art. 16 Par. 1 of the Directive 89/391/EEC). In: EU-Bulletin, Vol. 33, L 156, Minimal Standard (Art. 4 & 5), Par. 3, p. 18, 21.06.1990.
ISO 9241 Part 10: Ergonomic Dialogue Design Criteria, Version 3, Committee Draft, December, 1990.
Oppermann, R., Murchner, B., Reiterer, H. & Koch, M.: Ergonomic Evaluation. The Guide EVADIS II (in German), de Gruyter, Berlin, 1992.
Ravden, S. & Johnson, G.: Evaluating Usability of Human-Computer Interfaces. Ellis Horwood, Chichester, 1989.
TCO, Swedish Confederation of Professional Employees: Software Checker—An Aid to the Critical Examination of the Ergonomics Properties of Software, Handbook and Checklist, Sweden, 1992.
Technology Consulting Nordrhein-Westfalen: Ergonomics-Checker (in German), Technik und Gesellschaft, Vol. 14, Oberhausen, 1993.
Technology Consulting Nordrhein-Westfalen: ABETO—Work Sheets. Oberhausen, 1994.
Ulich, E.: Psychology of Work (in German). 3rd edition, vdf, Zürich, 1994.
Willumeit, H., Gediga, G. & Hamborg, K.: IsoMetrics: A Technique for Formative Evaluation of Software in accordance to ISO 9241/10 (in German). In: Ergonomie und Informatik, March (1996), 5–12.
WARNINGS
WARNINGS: A TASK-ORIENTED DESIGN APPROACH Jan Noyes* and Alison Starr**
* Department of Experimental Psychology, University of Bristol 8 Woodland Road, Bristol BS8 1TN, UK ** Smiths Industries Aerospace, Cheltenham GL52 4SF, UK
Air traffic continues to increase, a trend which is expected to persist well into the next century with a concomitant increase in accident and incident rates (Last, 1995). Given the prevalence of human error in aircraft operations, the design of the warning system is of paramount importance since it often provides the crew with the first indication of a potential problem. This paper will discuss the findings from recent research on civil aircraft warning systems carried out at Smiths Industries Aerospace in conjunction with the University of Bristol and British Airways; some of the issues associated with the design of current warning systems will be considered. It is concluded that the use of task-oriented as opposed to fault-oriented handling of information may be the way forward for the next generation of warning systems.
Designing for Error Humans make errors, although most of the time these errors are inconsequential with no ill or long-term effects. However, in safety-critical systems such as those involved with aircraft operation, human error may have catastrophic effects. Although it is not possible to prevent humans from making mistakes, every attempt must be made when designing systems to minimise the opportunities for human error, and for remedial actions to be carefully planned for easy assimilation and execution by the crew. The point of contact between the flight-deck crew and the aircraft informing them of critical changes in the state of various aircraft systems is usually the warning system. Consequently, special attention needs to be applied to its design in order to accommodate any errors which the crew might make (Billings, 1997). When considering the causes of aviation incidents and accidents, human error is implicated in a large number. Figures differ according to definitions and method of calculation, but human error has been given as a causal factor in 80% of fatal aircraft accidents in general aviation and 70% in airline operations (Jensen, 1995). Recent statistics indicate there were 1063 accidents world-wide in commercial jet aircraft between 1959 and 1995, of which 64.4% cited flight crew as a primary cause (Boeing, 1996).
However, it should be noted that human error is a portmanteau expression that does not differentiate between errors made due to lapses in professional skills and errors arising due to ordinary human failings. The area of human error has been well-researched, although the development of a precise definition and in-depth understanding of human error continues to prove to be a difficult and elusive goal. A common viewpoint exemplified by Rasmussen (1987) is that human errors arise because of a mismatch between the human and the task or human-machine misfits. Frequent misfits are likely to be considered design errors, while occasional misfits may arise due to variability on the part of the system (component failures) or the human (human errors). Both external and internal factors may be responsible for this mismatch, although it is generally thought that internal traits, e.g. skill levels, are not as influential as external factors in their contribution to human error. External performance shaping factors of relevance to the design of aircraft warning systems might include: (i) inadequate human engineering design, e.g. violation of population stereotypes resulting in sequence and selection errors; (ii) inadequate work space and work layout, which may contribute towards fatigue, decreased productivity and increased errors; (iii) inadequate job aids, e.g. poorly written manuals and procedures may lead to uncertainty and errors on the part of the operator (see, Noyes and Stanton, 1997). In the avionics application there are specific difficulties associated with studying human error. For example, in some situations, accidents are likely to be catastrophic. As a result, evidence about the cause of the accident is often lost. The main participants may be deceased, thus hampering the search for the causes of the errors. This type of situation is exacerbated by our limited understanding of the role of the human operator in accident processes (Kayten, 1989). In summary, the complexities of human behaviour make studying human error a challenging task with many difficult theoretical problems (Leplat, 1987). There exists no single theory or model which predicts the occurrence of human errors, which would provide an initial step towards learning more about the causes, and hence, the prevention of errors. Often there is no right or wrong decision, only the best decision for any given set of characteristics and for any given point in time. Often, the outcome in terms of the results of making errors and subsequent decisions is not known until later. Consequently when designing systems, failure to know in detail why a human error occurs makes the development of a solution strategy both difficult and inefficient. When considering methodologies for the study of human error, these are very different from studying behaviour based on simple, rule-directed decisions. Laboratory studies of human error and post-hoc analyses of incidents which involve collecting individual accounts/ reactions, etc. are often not fruitful in terms of yielding definitive information about the causes of making errors (Nagel, 1988). Consequently, the approach taken here was to use a self-report technique developed through extensive observation and interview studies, and in conjunction with analyses of accident and incident data.
Flight-deck Crew Survey The findings presented here emanate from a questionnaire survey of 1360 commercial flightdeck crew (representing a return rate of just over 40%). This questionnaire on aircraft warning systems was developed through an extensive knowledge elicitation process; for example, user requirements of current and future warning systems were assessed by taking a
descriptive (do you have this feature/function?) followed by a prescriptive approach (would you like it?). Respondents were asked to state the extent of their agreement with 51 statements on a 7-point Likert scale from ‘strongly agree’ through to ‘strongly disagree’. Further details concerning this methodological approach are given in Noyes, Starr and Frankish (1996). The purpose of a warning system is manifold, from alerting the crew to an actual or potential malfunction, through to providing evidence of the problem, and guidelines for remedial actions. Consequently, the features of a good warning system include the provision of a complete set of warnings with enough information (to anticipate problems before they arise, and to be aware of them when they do arise), guidance to deal with the situation, provision of information about secondary consequences, and the reduction of ‘false’ and ‘nuisance’ warnings.
Brief Summary of User Responses Findings from the questionnaire survey indicated that the majority of respondents felt that warning information was complete in terms of appropriateness of warnings to a given situation1, sufficient to identify the problem(s)2, and providing direction towards corrective procedures3 (with over 80% of crew agreeing that the warnings in their aircraft are effective in directing them towards appropriate actions). But in general, they did not think that current warning systems provided adequate information about secondary consequences of malfunctions4 5, and this was a feature which they viewed favourably. In addition, the questionnaire results indicated that crew would like warning systems to have a predictive capacity allowing the anticipation of problems6 7. False warnings were generally thought to be undesirable8, but were not viewed as a significant problem; this is presumably because they are a rare occurrence (although one respondent made the comment that “one false warning is ‘too often’”). A clear need was expressed by crew to have systems adept at handling multiple warnings9, and it was felt that current systems were not as supportive as they might be in these situations10.
Footnotes
1 Descriptive statement—"All warnings appropriate to the situation are given"
2 Prescriptive statement—"The warnings provided in my aircraft are usually sufficient to identify immediately the source of the problem"
3 Descriptive statement—"Warnings that are given are effective in directing me to appropriate procedures for dealing with the problem"
4 Descriptive statement—"Flight-deck displays provide adequate information about secondary consequences of malfunctions (e.g. inoperative systems, restrictions on operational procedures, etc.)"
5 Prescriptive statement—"Flight-deck displays should provide information about secondary consequences of malfunctions"
6 Descriptive statement—"The flight-deck instrumentation available in my aircraft is effective in enabling problems to be anticipated before warnings are triggered (e.g. by indicating parameters that are slightly in error, but still within tolerance)"
7 Prescriptive statement—"Flight-deck instrumentation should enable problems to be anticipated"
8 Descriptive statement—"False warnings appear too often"
9 Descriptive statement—"It is easy to interpret warning displays when several warnings appear at the same time"
10 Prescriptive statement—"When several warnings conditions are active, only the most important should be displayed"
Discussion The provision of warning information on civil flight-decks has changed significantly over the years from the early distributed warning lights through to the introduction of multifunction displays with associated system schematics and checklists (see, Starr, Noyes, Ovenden and Rankin, 1997, for a full history of the evolution of warning systems on civil aircraft). Although this research programme has only been conducted with a single airline with 10 different aircraft types (and as such may not fully represent the views of flight-deck crew in other airlines), it is generally accepted that current commercial aircraft warning systems have the ability to provide a large amount of data. However, they tend not to: (i) integrate data from several sources into a format determined by the current situation, e.g. phase of flight; (ii) allow anticipation of malfunctions by conveyance of prediction information concerning abnormal conditions to the crew; (iii) provide advanced indication of the consequences of crew decision-making and actions. It could therefore be concluded that most conventional warning systems are fault-oriented, and corrective actions are directed towards management of the immediate problem; priorities being determined according to a pre-determined hierarchy. Current warning systems tend to present a large amount of ‘unprocessed’ information, which essentially lacks integration and situational modification across and within sources to aid diagnosis and corrective actions. Although the basic functions of warning systems are unlikely to change significantly in the future, it is likely that the amount of information which can be made available to the crew will continue to increase as it extends to include more system parameters and external events. Looking to the future, part of the solution to improve upon current systems may be to develop task-oriented warning systems. Future warning systems could perhaps aim to rectify this potential crew ‘information overload’ situation by providing information tailored to the overall aircraft situation, offering a range of options to be evaluated in the light of future operational requirements. The development of ‘soft displays’ supported by powerful computational resources could facilitate the development of task-oriented warning systems which provide information tailored directly to users’ current requirements. This type of interface should aid the performance of the human operator in terms of the management and presentation of warning information; this in turn should reduce the opportunities for human error. It will also be in keeping with the views expressed by the current user population of civil flight-decks as demonstrated in this research programme.
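As a purely conceptual illustration of the difference between fault-oriented and task-oriented handling, the sketch below ranks a set of invented warnings by their relevance to the current phase of flight before falling back on a fixed severity order; the phases, warnings and weighting scheme are assumptions and do not represent any actual avionics design.

```python
# Conceptual sketch only: ordering active warnings by relevance to the current
# flight phase rather than by a fixed fault hierarchy. Everything is invented.
from dataclasses import dataclass

@dataclass
class CrewAlert:
    name: str
    base_priority: int          # fault-oriented severity (1 = highest)
    relevant_phases: set        # flight phases in which the alert matters most

PHASE_NOW = "approach"

active = [
    CrewAlert("Engine 2 oil pressure low", 2, {"climb", "cruise"}),
    CrewAlert("Landing gear not locked", 3, {"approach", "landing"}),
    CrewAlert("Cabin altitude high", 1, {"climb", "cruise"}),
]

def task_oriented_order(alerts, phase):
    """Rank alerts so those relevant to the current task come first,
    using fault severity only to break ties."""
    return sorted(alerts,
                  key=lambda a: (phase not in a.relevant_phases, a.base_priority))

for alert in task_oriented_order(active, PHASE_NOW):
    print(alert.name)
```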
Acknowledgements This work was carried out as part of a UK Department of Trade and Industry funded project, IED: 4/1/2200 ‘A Model-Based Reasoning Approach to Warning and Diagnostic Systems for Aircraft Application’. Thanks to British Airways for their participation in this research programme, and especially to all flight-deck crew who completed interviews and questionnaires. Thanks are also due to the late David Eyre for the meticulous statistical analyses carried out on the questionnaire data.
References
Billings, C.E. 1997, Aviation Automation: The Search for a Human-centred Approach, (LEA, New Jersey)
Boeing 1996, Table of all accidents—World-wide commercial jet fleet, Flight Deck, 21, 57.
Jensen, R.S. 1995, Pilot Judgement and Crew Resource Management, (Avebury Aviation, Aldershot)
Kayten, P. 1989, Human performance factors in aircraft accident investigation. In Proceedings of the 2nd Conference on Human Error Avoidance Techniques, Herndon, VA, (SAE International, Warrendale, PA), Paper 892608, 49–56
Last, S. 1995, Hidden origins to crew-caused accidents. In Proceedings of IFALPA Conference, Interpilot, June Issue, 5–15
Leplat, J. 1987, Some observations on error analysis. In J.Rasmussen, K.Duncan and J.Leplat (eds.), New technology and human error, (Wiley, Chichester), 311–316
Nagel, D.C. 1988, Human error in aviation operations. In E.L.Wiener and D.C.Nagel (eds.), Human factors in aviation, (Academic Press, San Diego), 263–303
Noyes, J.M., Starr, A.F. and Frankish, C.R. 1996, User involvement in the early stages of the development of an aircraft warning system. Behaviour & Information Technology, 15(2), 67–75.
Noyes, J.M. and Stanton, N.A. 1997, Engineering psychology: Contribution to system safety. Computing & Control Engineering Journal, 8(3), 107–112.
Rasmussen, J. 1987, The definition of human error and a taxonomy for technical system design. In J.Rasmussen, K.Duncan and J.Leplat (eds.), New technology and human error, (Wiley, Chichester), 23–30
Starr, A.F., Noyes, J.M., Ovenden, C.R. and Rankin, J.A. 1997, Civil aircraft warning systems: A successful evolution? In Proceedings of IASC '97 (International Aviation Safety Conference), edited by H.M.Soekkha, (VSP BV, Rotterdam, Netherlands), 507–524
EFFECTS OF AUDITORILY-PRESENTED WARNING SIGNAL WORDS ON INTENDED CAREFULNESS Rana S.Barzegar and Michael S.Wogalter Ergonomics Program, Department of Psychology North Carolina State University Raleigh, North Carolina 27695–7801 USA
This study investigates whether signal words such as DANGER, WARNING, and CAUTION, presented under different vocal conditions, influence intended compliance. Male and female participants listened to cassette tapes of signal words presented by a male or female speaker in monotone, emotional, and whisper voice styles at either a low or high sound level. The results showed that female speakers produced significantly higher ratings of intended carefulness. Of the five signal words examined, DEADLY received the highest ratings, followed by DANGER; and NOTICE received the lowest carefulness ratings. WARNING and CAUTION did not differ. The safety implications of these results are discussed.
Introduction Current warning design standards and guidelines recommend the use of signal words to alert individuals to the presence and level of potential hazards. Standards and guidelines in the US generally recommend DANGER, WARNING, and CAUTION to indicate high to low levels of hazard, respectively (e.g., ANSI, 1991; FMC Corporation, 1985). According to ANSI (1991) these terms have been assigned the following definitions. DANGER should be used to indicate immediate hazards that will result in severe personal injury or death. WARNING is recommended for use with hazards or unsafe practices that could result in severe personal injury or death. Finally, CAUTION is recommended for hazards or unsafe practices that could result in minor personal injury and/or product or property damage. Research has consistently shown that people do, in fact, perceive DANGER to connote a significantly greater hazard than both WARNING and CAUTION, but people do not differentiate between the two latter terms (e.g., Wogalter and Silver, 1990; 1995). Other research has investigated whether alternate terms, such as DEADLY and LETHAL, are useful in conveying different hazard levels (Wogalter and Silver, 1990; 1995). All previous research on signal words has evaluated their effectiveness as presented visually in the print medium. Although there is research on nonverbal auditory warning signals (e.g., see Edworthy and Adams, 1996 for a review), there has been no research on the effects of auditory/voiced/verbal signal words. The present research is an initial attempt to examine the effects of voiced signal words on connoted hazard (intended carefulness ratings). Previous studies suggest that voiced warnings have potential for effective warning
communication. Wogalter and Young (1991) and Wogalter et al. (1994) showed that voiced warnings produced greater compliance than the same message in print. One benefit is that the receivers of the information do not need to be looking in a particular direction, as would be needed with visually presented information (Wogalter and Young, 1991; Wogalter et al., 1994). Another benefit of voiced warnings is their potential utility for informing those who have difficulty reading the English language, including children and individuals with vision problems. With recent advancements in digital speech technology, voiced warnings could be used to communicate hazards of various types under various conditions. The present study examines the effects of signal words presented in monotone, emotional, and whisper voices on intended compliance. Sound levels (dBA) were manipulated (low vs. high) with the amplitude levels equated among the three voicing methods. Mershon and Philbeck (1991) found that a whisper presented at the level of normal speech is significantly more salient and arousing than normal speech. In addition, gender was examined with respect to both the speaker (i.e., presenter or source) and the participant (i.e., listener or receiver). Although 43 words were used as stimuli in this research, the present article describes the results of the five terms that have been investigated most extensively in previous research (DEADLY, DANGER, WARNING, CAUTION, and NOTICE). Three of these terms, DANGER, WARNING, and CAUTION, are recommended by ANSI (1991) to indicate high to low levels of hazard, respectively. Previous research by Wogalter and Silver (1995) and Wogalter et al. (1997) has shown that DEADLY connotes a substantially greater hazard than DANGER. NOTICE is a nonhazard-related term recognized by ANSI (1991) to call attention to important information (Westinghouse Product Safety Label Handbook, 1981).
Method Participants Seventy-two undergraduate students taking an introductory psychology course at North Carolina State University participated. They were compensated with credit towards the course. An equal number of males and females participated.
Stimulus materials The signal words were taken from a list of 43 words investigated by Wogalter and Silver (1995). They are shown below in alphabetical order: ALARM, ALERT, ATTENTION, BEWARE, CAREFUL, CAUTION, CRITICAL, CRUCIAL, DANGER, DANGEROUS, DEADLY, DON'T, EXPLOSIVE, FATAL, FORBIDDEN, HALT, HARMFUL, HAZARD, HAZARDOUS, HOT, IMPORTANT, INJURIOUS, LETHAL, NECESSARY, NEEDED, NEVER, NO, NOTE, NOTICE, POISON, PREVENT, PROHIBIT, REMINDER, REQUIRED, RISKY, SERIOUS, SEVERE, STOP, TOXIC, UNSAFE, URGENT, VITAL, WARNING.
The above words were arranged in 18 random orders, each recorded on a separate audio cassette tape. The recordings were produced in a sound chamber using a Marantz PMD201 professional portable cassette recorder, Audio-Technica ATR30 vocal/instrument microphone, microphone stand, TDK DS-X90 audio tapes and Koss TD/60 enclosed ear headphones.
Each speaker produced three recordings, one in each voicing method (monotone, emotional, and whisper) with a different random order word list for each. Each recording consisted of signal words presented at a rate of 8 s intervals (onset to onset) with a quiet period between each word. Three male and three female speakers were used to make the recordings.
Procedure Participants were informed that they would hear a series of words presented on three cassette tapes. The instructions were to listen to each word and rate "How careful would you be after hearing each word?" based both on its meaning and on how it is presented. Ratings were made on a 9-point Likert-type scale with the following verbal anchors placed at the even-numbered points: 0—not at all careful, 2—slightly careful, 4—careful, 6—very careful, and 8—extremely careful. Each participant heard three tapes, monotone, emotional, and whisper, in different random orders. Sound level (low: 60 dBA vs. high: 90 dBA) and speaker gender (male vs. female) were manipulated between participant genders. All tapes heard by a given participant were presented either at the low or high sound level and by a male or female speaker. Participants were randomly assigned to conditions based on a schedule such that an equal number of males and females participated in the sound level and word order conditions an equal number of times.
Results The data were examined using a 2 (Sound level: low vs. high) X 2 (Speaker gender: male vs. female) X 2 (Participant gender: male vs. female) X 3 (Voicing method: monotone vs. emotional vs. whisper) X 5 (Signal Words: DEADLY vs. DANGER vs. WARNING vs. NOTICE vs. CAUTION) mixed-model design analysis of variance (ANOVA). The last two variables, voicing method and signal words, were repeated measures factors; all others were between-subjects factors. The ANOVA showed a significant main effect of speaker gender, F(1, 60)= 13.95, p<.001. Female speakers (M=5.10) produced higher carefulness ratings than male speakers (M=4.18). Although participant gender failed to reach the conventional p level generally considered necessary for significance, F(1, 60)=3.82, p=.055, the means showed the tendency for male participants (M=4.9) to give higher ratings for intended carefulness than female participants (M=4.4). The ANOVA showed a significant main effect of voicing method F(2, 120)= 6.86, p<.01. Comparisons among the means, using Tukey’s Honestly Significant Difference (HSD) test, showed that the emotional voicing method (M=4.93) produced significantly higher carefulness ratings (p<.05) than the monotone (M= 4.30). The whisper voice style (M=4.68) was intermediate and was not significantly different from the other two conditions. In addition, a significant main effect was found for signal words F(4, 240)= 137.80, p<.001. Tukey’s HSD test showed that all paired comparisons were significant (DEADLY, M=6.35; DANGER, M=5.28; WARNING, M=4.44; CAUTION, M=4.25; and NOTICE, M=2.87), except between WARNING and CAUTION. Table 1. Means as a function of voicing method and signal word
The ANOVA also indicated the presence of three significant interaction effects. Table 1 presents the means for the interaction between voicing method and signal words, F(8, 480)=2.56, p<.01. The emotional voicing method produced significantly higher ratings than the monotone for both WARNING and NOTICE. In addition, NOTICE voiced emotionally was rated higher than NOTICE whispered. DEADLY whispered was rated higher than DEADLY voiced in monotone. There were no significant voicing-method differences for CAUTION and DANGER. Table 2. Means as a function of speaker gender and signal word
Speaker gender and signal word interacted, F(4, 240)=6.82, p<.001. The means in Table 2 show that female speakers consistently produced higher carefulness ratings than male speakers for all signal words, except NOTICE. These 2 factors interacted with sound level in a three-factor interaction of sound level, speaker gender, and signal words, F(4, 240)=3.37, p<.05. The means for this interaction, displayed in Table 3, depict a similar pattern to the speaker gender by signal word interaction described above, with two relatively minor magnitude changes as a function of sound level. The speaker gender difference is larger for DEADLY in the low sound level condition and for WARNING in the high sound level condition. Note that the greatest intended carefulness was produced with DEADLY spoken in a low level female voice. Table 3. Means as a function of sound level, speaker gender, and signal word
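For readers wishing to reproduce this kind of analysis with current tools, the following is a simplified sketch of a mixed-design ANOVA for a subset of the factors reported above: voicing method as a within-subjects factor and speaker gender as a between-subjects factor. The pingouin library, the simulated ratings and all variable names are illustrative assumptions, not the authors' data or code.

```python
# Simplified sketch of a mixed-design ANOVA (one within, one between factor).
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
voicings = ["monotone", "emotional", "whisper"]
rows = []
for s in range(12):                                   # simulated subjects
    speaker = "female" if s % 2 else "male"           # between-subjects factor
    for v in voicings:                                # within-subjects factor
        rows.append({"subject": s, "speaker_gender": speaker, "voicing": v,
                     "rating": rng.normal(4.5, 1.0)})  # 0-8 carefulness scale
df = pd.DataFrame(rows)

aov = pg.mixed_anova(data=df, dv="rating", within="voicing",
                     subject="subject", between="speaker_gender")
print(aov[["Source", "F", "p-unc"]])
```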
Discussion Various parameters of auditorily presented signal words can affect receivers' intended carefulness. For the most part, emotionally toned voices produced the highest carefulness ratings, particularly compared to the monotone voices. Perhaps the higher ratings for the emotional tone reflect the way people would naturally vocalize a hazard. In emergency-type communications, people become excited and emotional, speaking at a higher pitch and at a faster rate. Therefore, the emotional tone may cue listeners to the urgency of the situation. Research has shown that nonverbal auditory signals presented at a faster rate and at higher frequencies increase perceived urgency (Edworthy and Adams, 1996). Related to this are the higher carefulness ratings when the signal words were presented by female speakers.
This concurs with previous findings showing that higher physical frequencies (i.e., the female voice) produce greater perceived urgency (Edworthy and Adams, 1996). The perceived hazard levels associated with the signal words were ordered high to low as follows: DEADLY, DANGER, WARNING, CAUTION, and NOTICE. This order is consistent with previous research on visually presented signal words (Wogalter and Silver, 1995). Several other results were also consistent with previous research on visually presented signal words. First, there was no significant difference between WARNING and CAUTION on perceived hazard (i.e., intended carefulness) (Wogalter and Silver, 1990). Second, DEADLY was consistently rated higher than DANGER (Wogalter and Silver, 1995; Wogalter et al., 1997). Third, the low ratings for NOTICE for both male and female participants reflect the fact that this term has no specific hazard-related implications. Several complex interactions were noted in the analysis. We will withhold elaborate explanations until there is additional evidence and replication. Clearly these results have implications for safety. Modern technology has provided voice-recordable transistor chips (found in greeting cards and answering machines), which, when combined with one or more detection systems (e.g., motion, infrared, sound), can potentially communicate effective, timely warnings. Only a few of the many sound parameters were investigated in the present study. Other parameters of voice warnings still need to be examined.
References
ANSI. 1991, American national standard on product safety signs: Z535.1–5, (American National Standards Institute, New York)
Edworthy, J., and Adams, A. 1996, Warning Design: A Research Perspective, (Taylor & Francis, London), 129–178
FMC Corporation. 1985, Product safety sign and label system, (Santa Clara, CA: Author)
Mershon, D.H., and Philbeck, J.W. 1991, Auditory perceived distance of familiar speech sounds, Paper presented at the Annual Meeting of the Psychonomic Society, (San Francisco, CA)
Westinghouse Printing Division. 1981, Westinghouse product safety label handbook, (Trafford, PA: Author)
Wogalter, M.S., Frederick, L.I., Herrera, O.L., and Magurno, A.B. 1997, Connoted hazard of Spanish and English warning signal words, colors, and symbols by native Spanish language users. Proceedings of the 13th Triennial Congress of the International Ergonomics Association, IEA '97, 3, 353–355
Wogalter, M.S., Racicot, B.M., Kalsher, M.J., and Simpson, S.N. 1994, The role of perceived relevance in behavioral compliance in personalized warning signs. International Journal of Industrial Ergonomics, 14, 233–242
Wogalter, M.S., and Silver, N.C. 1990, Arousal strength of signal words. Forensic Reports, 3, 407–420
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted strength and understandability by children, elders, and non-native English speakers, Ergonomics, 38, 2188–2206
Wogalter, M.S., and Young, S.L. 1991, Behavioural compliance to voice and print warnings. Ergonomics, 34, 79–89
LISTENERS’ UNDERSTANDING OF WARNING SIGNAL WORDS Judy Edworthy, Wendy Clift-Matthews & Mark Crowther
Department of Psychology University of Plymouth Drake Circus Plymouth PL4 8AA
This paper presents two studies which look at the interaction between the arousal strength of signal words and the way in which they are spoken. In the first study, listeners rated the urgency, appropriateness and believability of eight signal words which were presented in either an appropriate, or an inappropriate, voice tone by human speakers. Listeners' judgements on all three measures were strongly affected by the way in which the words were spoken. In a second study the words were presented in a synthesized format and were subjected to some basic urgency modelling. Results from this study were more ambiguous, although differences in urgency were noted. The research implications of the findings are discussed.
Introduction Much evidence exists to show that the perceived urgency of nonverbal auditory warnings can be influenced by their acoustic structure. For example, it has been demonstrated that warnings which are higher in frequency, louder, faster, and vary along a number of other dimensions such as pitch contour, amplitude envelope, rhythm and so on are rated as being more urgent than warnings with lower values of these parameters (Edworthy et al, 1991). There is also evidence to show that people's responses to warnings designed in such a way as to sound acoustically urgent or nonurgent vary in important, practical ways (e.g. Bliss et al, 1995). Increasingly, speech warnings are used where other, more traditional types of warnings might have been used. Thus an interesting research question arises as to the extent to which the urgency, as well as the believability and appropriateness, of speech warnings can be influenced by those same acoustic parameters that influence nonverbal auditory warnings. In particular, there is the question of the interaction between the semantic content of a speech message and the way, acoustically, it is presented. This paper presents a pair of studies which begin to look at this interaction. They form the basis of a more comprehensive set of ongoing studies which look at the design of, and response to, speech warnings in multitask environments.
Study 1: Natural speech and signal words It is well established that some words typically used on warning labels, usually known as signal words, vary systematically in their arousal strength (e.g. Wogalter & Silver, 1995). For example, words like Deadly and Danger always score higher ratings than words like Don't and Note. The most stable of these words can be used to create a scale of semantic urgency which can then be manipulated acoustically in order to address the interaction between the semantic content of the word and the way in which it is presented acoustically. For example, the extent to which the urgency of a word such as Danger is influenced by the way in which it is spoken can give us some insight into the way the acoustic structure of the word, and its semantic meaning or strength, interact and contribute to the overall impression of the word. As acoustic analysis of speech sounds is complex, we decided in the first instance (and as a precursor to experiments which will involve acoustic analysis) simply to ask two speakers, one male and one female, to speak a set of eight signal words in both an appropriate and an inappropriate manner, leaving the speakers to decide how to say each of the words.
Method Two human speakers, one male and one female, were asked to speak the eight signal words Lethal, Deadly, Poison, Danger, Beware, Warning, Attention and Don't in both an appropriate and an inappropriate manner. The speakers were left to decide how the words should be spoken. Forty-three participants were asked to rate three features of these words: first, the urgency on a 0–100 scale; second, the appropriateness of each of the words on a 1–8 scale; and third, the believability of each of the words on a 1–8 scale. Each scale was selected because of its similarity to those used in earlier studies which had asked participants to rate these dimensions. Each stimulus was heard twice by each participant, in a randomised order. The stimuli were presented on cassette tape.
Results and Discussion Three sets of measures were taken for each word: its urgency, its believability and its appropriateness. The results for the urgency measure revealed a sex difference (with the female voice producing higher scores overall than the male voice). However, this may be due to individual differences between the speakers, so it is not emphasized here. The central finding for the urgency measure is that main effects were found for both style of speaking (appropriate or inappropriate) and signal word, as well as an interaction between style and signal word for the female speaker (F=102.98, df=1, p<.001 for style; F=9.06, df=7, p<.001 for signal words; and F=10.39, df=7, p<.001 for the interaction between style and signal word). For the male speaker, a main effect for style was found (F=89.78, df=1, p<.001) and an interaction between style and word was found (F=3.03, df=7, p<.005). These results show that the way in which each of the speakers spoke the words had a very large effect on the urgency of the words. Words spoken in an appropriate manner were judged to be considerably more urgent than those presented in an inappropriate manner. However, the word itself also had a fairly prominent effect on listeners' judgements, producing an interaction in each case and a main effect in the case of the female speaker. The results for the female speaker in particular show that the words already known to possess higher levels of arousal produced higher ratings of urgency. Generally, the pattern shown was that the words remained more or less in their expected order, from Deadly at the top to Don't at the bottom,
but with the scores for the appropriate words being higher than those for the inappropriate words. The results for the appropriateness and the believability measures demonstrated a very similar pattern. For the appropriateness measures, separate 2-way ANOVAs on the female and the male stimuli (inappropriate/appropriate×word (8 levels)) revealed main effects for appropriateness and an interaction between appropriateness and word (F=1033, df=1, p<.001 for female appropriateness; F=422, df=1, p<.001 for male appropriateness; F=5.325, df=7, p<.001 for the interaction between appropriateness and word for the female speaker; and F=2.6168, df=7, p<.05 for the interaction between appropriateness and word for the male speaker). The effect for word itself was significant in the case of the female speaker, but not in the case of the male speaker. Thus the majority of the variance was accounted for by the contrast between scores for the appropriately and the inappropriately spoken words. The pattern for the believability measures was very much the same as the pattern for the appropriateness measures, for both speakers. Similarly to the appropriateness measure, the factor accounting for most of the variance was the contrast between the appropriately spoken words and the inappropriately spoken words. Thus the pattern of response was similar across all three measures, showing effects for signal words in line with previous research on written words (e.g. Wogalter & Silver, 1995) and very large effects for the way in which a word is spoken.
Study 2: Synthesized speech and signal words Previous studies on nonverbal warnings have shown that factors such as intensity, frequency and speed have considerable effects on the perceived urgency of auditory warnings (e.g. Edworthy et al, 1991). It is likely that these were amongst the key factors that our two live speakers varied in producing their appropriate and inappropriate versions of the signal words. These three factors would seem to be the primary (and technically the most readily available) features which can be manipulated on almost any digitized speech generation system, so we decided simply to take a very basic synthesizer to explore whether manipulation of these features could produce discernible changes in listeners’ judgements of urgency, appropriateness and believability in a similar manner to live speakers.
Method Two ‘speakers’, one male and one female, were chosen from a set of available voice types on a ‘Text’LE’ program found within a ‘Soundblaster 64’ sound system run on a PC using Windows 95. Each speaker was given the same eight words as before—Lethal, Deadly, Poison, Danger, Beware, Warning, Attention and Don’t—and the appropriateness of each of the words was manipulated by setting the pitch and speed levels of one version of the word considerably higher than the other version of the word. The former were then labelled ‘appropriate’ and the latter ‘inappropriate’. Each of the words was then recorded on digital tape for use in the study. Each stimulus was then presented twice to 43 participants, who were asked to rate the urgency, appropriateness and believability of each of the words as in Study 1.
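As a present-day illustration of the kind of pitch and speed manipulation described above (not the Text'LE/SoundBlaster procedure used in the study), a recorded signal word could be processed as follows. The librosa and soundfile libraries, the file names and the parameter values are assumptions for illustration only.

```python
# Hedged sketch: deriving 'urgent' and 'non-urgent' versions of a spoken word
# by shifting pitch and changing speaking rate.
import librosa
import soundfile as sf

y, sr = librosa.load("danger.wav", sr=None)                  # a recorded signal word

# 'Appropriate' (urgent) version: higher pitch and faster delivery
urgent = librosa.effects.pitch_shift(y, sr=sr, n_steps=4)    # up four semitones
urgent = librosa.effects.time_stretch(urgent, rate=1.3)      # 30% faster

# 'Inappropriate' (non-urgent) version: lower pitch and slower delivery
calm = librosa.effects.pitch_shift(y, sr=sr, n_steps=-4)
calm = librosa.effects.time_stretch(calm, rate=0.8)

sf.write("danger_urgent.wav", urgent, sr)
sf.write("danger_calm.wav", calm, sr)
```

Intensity could be varied in the same way by scaling the signal amplitude, which would cover the three parameters discussed in the text.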
Results and Discussion Again, three sets of measures were taken for each word: its urgency, its believability and its appropriateness. In this study, the male and female data were combined and analyzed in a series of 3-way sex×word×style ANOVAs. For the urgency measures, no effect was found between the two speakers (F=0.512, df=1, p=.48). A significant effect was found for word (F=10.42, df=7, p<.001), as was a significant effect for style (appropriate vs inappropriate) (F=12.40, df=1, p<.005). Interactions were found between speaker and word, and word and style. The major part of the variance here was accounted for by the effects of word and style, and so the results are largely similar to those found for the live speakers. Some departures from the live speaker results were demonstrated by the appropriateness and the believability measures, however. Although some significant results were obtained, similar to those found earlier, the most striking thing about both the appropriateness and believability measures was that no significant difference was found between the 'appropriate' and 'inappropriate' versions of each of the words on either the appropriateness or the believability measures. In other words, the basic acoustic manipulations that we applied did not influence listeners' judgements of the believability and appropriateness of the words, even though some effect for urgency was found. The results for these two measures show strong effects for word (F=4.8, df=7, p<.001 for appropriateness scores and F=7.46, df=7, p<.001 for believability scores), again in line with earlier findings for written signal words. Some interactions were obtained, as before. No effects were found for speaker across all three sets of data.
General Discussion This pair of experiments brings forward a number of interesting research points which will need to be elaborated more fully in future studies. However, the key points to emerge from the results can be summarised as follows. First of all, the results of both experiments show clearly that signal words already known to vary in their arousal strength when presented in visual form (e.g. Wogalter & Silver 1995) produce the same general pattern of results when presented in spoken form: words known to be high in arousal strength such as Deadly and Danger are rated consistently higher than words such as Attention and Don't. This was true across each of the experiments, on every rating. There are some minor inconsistencies but as a rule the results mirror those for written words in a striking way. The second main feature of the results is that the way in which a word is rated is tempered by the way in which it is spoken. Clearest of all is that words spoken in an appropriate manner (in a style freely chosen by the speaker in this case) produce much higher ratings of urgency, appropriateness and believability than those spoken in an inappropriate way. (The results are less clear for the synthesized speech, which we will come to later.) In Study 1 any sort of acoustic analysis was purposely avoided, because we were interested primarily in whether speakers can convey appropriateness and inappropriateness to listeners: the results of Study 1 show very clearly that they can, and that listeners are sensitive to these contrasts. In fact, Study 1 shows that the most important factor in the study was the style of speaking, which is an area where fruitful research might be carried out in the future. A secondary aspect of these results is that they also show that the urgency, appropriateness and believability of
signal words can be reduced or increased by the way in which they are spoken. For example, the word Deadly can be made less urgent by speaking it in an inappropriate manner. In some ways this is similar to the effect that can be obtained by varying the colours used to emphasise signal words. Braun et al (1994) show, for example, how colours and words can trade off against one another depending upon how they are combined. The same may be true for spoken signal words and the way in which they are presented acoustically. Turning to the results for the digitized words, and their comparison with live speakers, the results show that at least to some extent urgency can be altered by manipulating pitch and speed variables: thus, as a first pass, such manipulations might be adequate for design purposes, and if intensity were also included such designs and manipulations might be quite effective. However, the results for believability and appropriateness suggest that such manipulations do not make the 'urgent' words any more believable and appropriate than their 'nonurgent' counterparts. This distinction was very clearly delineated for the live speakers, and no doubt this is because a live speaker is doing very much more with the words when speaking them in the two styles than simply raising the pitch, speeding up the words and making them louder. The numerous interactions which were obtained between word and speaker, and speaker and style, also draw attention to the subtlety of the interaction between speaker and listener which is taking place. The exploration of this interaction, as well as the detailed acoustic analysis which will need to be performed in order to understand it more fully, forms the next phase of this research programme.
References
Bliss, J.P., Gilson, R.D. and Deaton, J.E. 1995, Human probability matching behaviour in response to alarms of varying reliability, Ergonomics, 38, 2300–2312
Braun, C.C., Sansing, L., Kennedy, R.S. and Silver, N.C. 1994, Signal word and colour specifications for product warnings: an isoperformance application. Proceedings of the 38th Annual Conference of the Human Factors and Ergonomics Society (Human Factors and Ergonomics Society, Santa Monica), 1104–8
Edworthy, J., Loxley, S.L. and Dennis, I.D. 1991, Improving auditory warning design: relationship between warning sound parameters and perceived urgency. Human Factors, 33, 205–31
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted strength and understandability by children, elders, and non-native English speakers. Ergonomics, 38, 2188–2206
PERCEIVED HAZARD AND UNDERSTANDABILITY OF SIGNAL WORDS AND WARNING PICTORIALS BY CHINESE COMMUNITY IN BRITAIN Angela K.P.Leung & Elizabeth Hellier Department of Psychology, City University, London, EC1V 0HB
This study investigated the hazard perceptions and understandability ratings of 43 signal words and 12 warning pictorials in the Chinese and English populations in London. The results showed that, across all 43 signal words, the understandability ratings of the Chinese subjects were significantly lower than those of the English subjects, although no significant difference was found for the eight commonly used signal words such as DANGER and WARNING. Also, the Chinese subjects were found to perceive similar hazard levels from the signal words as the English subjects. A shorter list of 12 signal words was selected on the basis of understandability. For the pictorials, both the comprehension rates and the understandability ratings of the Chinese subjects were significantly lower than those of the English subjects. The implications of these findings for hazard communication are discussed.
Introduction Most standards and guidelines on warning design recommend the use of signal words (e.g. DANGER) on warnings for the purpose of calling attention to the safety sign and conveying the degree of potential seriousness of the hazard (FMC Corporation, 1985). The standards usually recommend the signal words DANGER, WARNING, and CAUTION to denote the highest to lowest levels of hazard, respectively. However, research in this area has been equivocal. While some studies (Dunlap et al, 1986) have found significant differences in connoted hazard between the words DANGER and CAUTION, other studies (e.g. Leonard et al, 1986; Wogalter et al, 1987) reported no reliable differences between risk ratings of the words DANGER, WARNING and CAUTION. Studies in the USA have used participants other than students as subjects (Wogalter and Silver, 1995); however, little research has been carried out in Britain to assess whether the non-native English speaking population perceives the same level of hazard from commonly used signal words such as DANGER and WARNING. Since the Chinese community is the third
largest minority ethnic community in Britain, they were used as subjects to see if they perceived the hazard levels of the signal words in the same way as the English population. Warning designers have increasingly made greater use of pictorials in hazard communications. Some research has found that warning pictorials and icons might be useful in assisting hazard communication when the verbal information cannot be read or understood (Leonard and Karnes, 1993); nevertheless, other studies (e.g. cited by Casey, 1993) found that pictorials are not always easy to understand. There are four hypotheses in this study:
a) for signal words, the Chinese subjects will have lower understandability ratings than the English subjects
b) for signal words, the Chinese subjects will have different hazardousness ratings from the English subjects
c) no significant difference is expected in the understandability rating between the Chinese subjects and the English subjects
d) there will be no significant difference in the comprehension rates between the Chinese subjects and the English subjects
Another purpose of this study is to develop a list of potential signal words that probably would be understandable to the Chinese population as well as the English population.
Method Subjects Ninety-six subjects participated in this study: 48 Chinese subjects and 48 English subjects. The Chinese subjects were able to read and speak English, but their first language was not English.
Stimuli and Materials All subjects were asked to complete a questionnaire which consisted of questions on signal words and warning pictorials. The presentation and question orders of the questionnaire were randomised. The 43 signal words which were used in the study of Wogalter and Silver (1995) were included in the questionnaire. Subjects were given two questions to rate on. The first question was 'How much hazard, do you think, is implied by this word?' Ratings were made on a 9-point scale with the anchors: (0) no hazard, (2) slight hazard, (4) some hazard, (6) serious hazard, (8) extreme hazard. The second question was 'How understandable, do you think, is this word?' Ratings were made on a 9-point scale with the anchors: (0) not at all understandable, (2) somewhat understandable, (4) understandable, (6) very understandable, (8) extremely understandable. For warning pictorials, twelve signs which conform to the British Standard and to the new Safety Signs at Work Regulations 1994 were used. The 12 signs included: 1) seat belt must be worn, 2) helmet must be worn, 3) breathing mask must be worn, 4) guard must be used, 5) eye wash, 6) fire exit, 7) fire extinguisher, 8) wet floor, 9) corrosive substance, 10) risk of explosion, 11) no entry and 12) do not operate. All pictorials were shown in black and white and measured approximately 4.5 cm×5 cm. There were two tasks for the subjects. The first task was to write down the meaning of the pictorial as specifically as possible. The second task was to answer the following
question for each pictorial: 'How understandable, do you think, is this pictorial?' Ratings were made on a 9-point scale with the anchors: (0) not at all understandable, (2) somewhat understandable, (4) understandable, (6) very understandable, (8) extremely understandable.
Procedure Subjects were tested individually. All subjects were told that they were to complete a questionnaire which was about some signal words and warning pictorials, both of which are sometimes used to indicate hazard.
Results Signal words A 2 (Chinese and English subjects)×8 (signal words: NOTE, ATTENTION, NOTICE, CAREFUL, DANGER, CAUTION, WARNING, DEADLY) ANOVA was performed using understandability rating as the dependent variable. The ANOVA did not show a significant main effect of nationality, F(1,94)=2.67, p>0.05. This result suggested that there was no significant difference in the understandability ratings of the eight commonly used signal words between the Chinese and English subjects. However, an independent t-test found that, for the 43 signal words, the mean of the means of understandability ratings of the Chinese subjects (M=4.98) was significantly lower than that of the English subjects (M=5.64), t(84)=2.67, p<0.05. This indicates that apart from these eight commonly used signal words, there were some signal words, e.g. HALT, PROHIBIT, FORBIDDEN, etc. which appeared to have lower understandability ratings in the Chinese subjects than in the English subjects. A 2×8 (signal words: NOTE, ATTENTION, NOTICE, CAREFUL, DANGER, CAUTION, WARNING, DEADLY) ANOVA was performed using hazardousness ratings as the dependent variable. The ANOVA did not show a significant main effect of nationality, F(1,94)=2.65, p>0.05. This result suggested that there was no significant difference in the hazardousness ratings of the eight commonly used signal words between the Chinese and English subjects and that it did not support the second hypothesis. However, there was a significant main effect of signal word, F(7,658)=308.64, p<0.01, with DEADLY, DANGER, WARNING, CAUTION, CAREFUL, ATTENTION, NOTICE, NOTE rated from the greatest to least on overall hazardousness i.e. when hazardousness ratings were collapsed across nationality. The results of subsequent Newman-Keuls tests found that DANGER was rated significantly higher on hazardousness than either WARNING or CAUTION. Nonetheless, WARNING was not rated significantly higher on hazardousness than CAUTION.
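The independent t-test reported above compares the 43 per-word mean understandability ratings of the two groups. A minimal sketch is given below; the scipy library and the placeholder arrays are assumptions for illustration and are not the study's data or code.

```python
# Minimal sketch of the independent-samples t-test over per-word mean ratings.
import numpy as np
from scipy import stats

chinese_means = np.random.default_rng(0).uniform(3, 7, size=43)  # placeholder data
english_means = np.random.default_rng(1).uniform(4, 8, size=43)  # placeholder data

t, p = stats.ttest_ind(chinese_means, english_means)
print(f"t({len(chinese_means) + len(english_means) - 2}) = {t:.2f}, p = {p:.3f}")
# With 43 word means per group, df = 43 + 43 - 2 = 84, matching the t(84) above.
```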
Warning pictorials A 2 (Chinese and English subjects)×12 pictorials ANOVA was performed using understandability ratings as the dependent variable. The result showed a significant main effect of nationality, F(1,94)=8.91, p<0.05, and suggested that the mean of understandability ratings of the Chinese subjects (M=3.38) was significantly lower than that of the English subjects (M=4.16). This result did not support the third hypothesis. A 2×12 pictorials ANOVA was performed using comprehension rates as the dependent variable. The ANOVA showed a significant main effect of nationality, F(1,94)=10.15, p<0.01, and suggested that the mean of comprehension rates of the Chinese subjects (M=0.58) was
significantly lower than that of the English subjects (M=0.70). This result did not support the fourth hypothesis.
Discussion The result of the ANOVA for the eight commonly used signal words (NOTE, ATTENTION, NOTICE, CAREFUL, DEADLY, including the three most commonly used: CAUTION, WARNING, DANGER) suggested that there was no significant difference in the understandability ratings between the Chinese and the English subjects. One possible reason for this finding could be, as Wogalter and Silver (1995) suggested, that in their limited exposure to the English language, the Chinese received training on the intended meanings of these commonly used words (perhaps through formal instruction, or through paying close attention to the gradations of English word meanings or to verbiage on products manufactured in English-speaking countries). However, a further test indicated that, for all 43 signal words, the understandability ratings of the Chinese subjects were significantly lower than those of the English subjects. This suggests that apart from these eight commonly used signal words, there were some signal words, e.g. HALT, LETHAL, PROHIBIT, FORBIDDEN, etc., which appeared to have lower understandability ratings in the Chinese subjects than in the English subjects. Therefore, it is recommended not to use these less common words when the target population is the Chinese community in Britain. One purpose of this study was to construct a list of words that would be understandable to the Chinese population as well as the English population. In order to produce a list from the data of this study, three criteria were used: 1) words that received mean understandability ratings of less than 4.0 were excluded; 2) words for which the standard deviation exceeded 2.0 were excluded; 3) words with a significant difference in understandability ratings between the Chinese and English subjects were excluded. Using the three criteria, 31 words were eliminated. The 12 remaining words are: CAREFUL, CAUTION, HARMFUL, SERIOUS, DANGER, FATAL, DANGEROUS, HAZARD, HAZARDOUS, TOXIC, EXPLOSIVE and POISON. The list of words derived in this way should be interpretable by both the Chinese and English populations. Lists such as this one, as well as that of Wogalter and Silver (1995), would be useful to individuals designing warnings and for selecting alternative words that convey various hazard levels, including substitutes for other words. A designer should select words that are the most understandable to the target population and that differ significantly along the hazard dimension. The results suggested that there was no significant difference in the hazardousness ratings for all 43 signal words between the Chinese and English subjects and that the Chinese subjects did perceive similar connoted hazard levels from the set of signal words as the English subjects. However, the results showed that DEADLY was rated significantly higher on hazardousness than DANGER. DANGER was rated significantly higher on hazardousness than either WARNING or CAUTION. Nonetheless, WARNING was not rated significantly higher on hazardousness than CAUTION. These results concurred with the findings of other
studies, such as Wogalter and Silver (1990) and Dunlap et al (1986), in that they did not support the difference between WARNING and CAUTION asserted in standards and guidelines (FMC Corporation, 1985).
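The three exclusion criteria described above can be expressed as a simple filter over a per-word summary table. The sketch below is an illustration only; the pandas library, the column names and the example rows are assumptions, not the study's data.

```python
# Hedged sketch of applying the three exclusion criteria to a per-word summary.
import pandas as pd

summary = pd.DataFrame({
    "word":         ["CAREFUL", "HALT", "POISON"],
    "mean_underst": [6.1, 3.2, 6.5],     # mean understandability rating (0-8)
    "sd_underst":   [1.1, 2.4, 1.0],     # standard deviation of the ratings
    "group_diff_p": [0.40, 0.01, 0.35],  # p-value of Chinese vs English difference
})

retained = summary[(summary["mean_underst"] >= 4.0) &   # criterion 1
                   (summary["sd_underst"] <= 2.0) &     # criterion 2
                   (summary["group_diff_p"] >= 0.05)]   # criterion 3
print(retained["word"].tolist())   # e.g. ['CAREFUL', 'POISON']
```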
Warning pictorials The result of the ANOVA suggested that there was a significant difference in the understandability ratings of the 12 pictorials between the Chinese and English subjects, and indicated that the understandability ratings of the Chinese subjects were significantly lower than those of the English subjects. This result supports the point made by Casey (1993), illustrated by the tragedy in Baghdad in 1972, that pictorials are not always easy to understand even when the language barrier has supposedly been lifted. Therefore, it is very important for designers to choose appropriately understood pictorials in order to convey the intended meanings. It is also recommended that designers confirm the meaning of the pictorials with the target population and use them prudently. The other results suggested that there was a significant difference in the comprehension rates of the 12 pictorials between the Chinese and English subjects and indicated that the mean of comprehension rates of the Chinese subjects was significantly lower than that of the English subjects. This result concurred with the findings of an Australian study by Cairney and Sless (1982), in which the results of the first recognition test for the Vietnamese participants were significantly lower than those of the other groups. In summary, no significant difference was found in the understandability between the Chinese subjects and the English subjects for the eight commonly used signal words, but there was a significant difference in understandability for the other, less commonly used signal words. The other results suggested that the Chinese subjects did perceive similar connoted hazard levels from the set of signal words as the English subjects. The other findings of the study suggested that both the comprehension rates and understandability ratings of the 12 pictorials of the Chinese subjects were significantly lower than those of the English subjects.
References
Cairney, P. and Sless, D. 1982, Communication effectiveness of symbolic safety signs with different user groups, Applied Ergonomics, 13, 91–97
Casey, S.M. 1993, Set Phasers On Stun and Other True Tales of Design, Technology and Human Error, (Aegean Publishing Company)
Dunlap, G.L., Granda, R.E. and Kustas, M.S. 1986, Observer perceptions of implied hazard: Safety signal words and color words, Research Report No. Tr00.3428
FMC Corporation 1985, Product Safety Sign and Label System, (FMC Corp., Santa Clara)
Leonard, S.D. and Karnes, E.W. 1993, Development of warnings resulting from forensic activity, Proceedings of the Human Factors Society 37th Annual Meeting, 501–505, (Human Factors and Ergonomics Society, Santa Monica)
Wogalter, M.S. and Silver, N.C. 1990, Arousal strength of signal words, Forensic Reports, 3, 407–420
Wogalter, M.S. and Silver, N.C. 1995, Warning signal words: connoted strength and understandability by children, elders and non-native English speakers, Ergonomics, 38(11), 2188–2206
VERBAL PROTOCOL ANALYSIS
THINKING ABOUT THINKING ALOUD M.J. (Theo) Rooden
School of Industrial Design Engineering Delft University of Technology 2628 BX Delft, the Netherlands e-mail: [email protected]
In this paper the possibilities and limitations of thinking aloud, as a technique to elicit perceptions and cognitions during everyday product usage, are discussed. Findings from the literature are compared to experiences with the application of thinking aloud in users’ trialling conducted at TU Delft.
Introduction In users’ trialling, users in action are observed to gain information to improve the design at hand. Subjects may be asked to use existing consumer products or design models, ranging from rough sketches to working prototypes. Observations of user activities are the most important data from users’ trialling. An overview of interaction difficulties can help designers focus on certain aspects of the design. However, when information is available on users’ perceptions and cognitions at the moment of experiencing interaction difficulties, users’ trialling can really be a design tool, since causes of difficulties may be traced. Users’ perceptions and cognitions are as a rule not directly observable. The most direct method to elicit perceptions and cognitions is called thinking aloud (TA), which means that subjects are asked to verbalise their thoughts concurrently, while using a product. When pros and cons of TA in users’ trialling are discussed, Ericsson and Simon (E&S) are often referred to, as proponents of TA. However, E&S (1993) regard TA as a formal method, with a lot of rules and restrictions. More detailed insight in their views on TA is required to assess possibilities of TA in users’ trialling. In this paper, E&S’s basic considerations are presented. These views are then discussed in the context of users’ trialling both in general and with regard to specific experiences with the application of TA in an experimental study. This leads to some proposals on how to benefit from the methods of TA in users’ trialling.
Thinking aloud (considerations of E&S) Techniques of TA originated in cognitive psychology, and they are often applied to gain insight into thought processes. TA and derived techniques are extensively described by E&S (1993). Results of empirical research on TA are discussed in the frame of an information processing theory. Cognitive processes are regarded as sequences of internal states successively transformed by a series of information processes. Information is stored very briefly in sensory stores, then in short term memory (STM), with limited capacity and intermediate duration, and long term memory (LTM), with very large capacity and relatively permanent storage. Ericsson & Simon state that only the contents of STM can be verbalised. In automated behaviour no information is heeded in STM; therefore, there is nothing relevant to be verbalised. When someone is writing a letter, for instance, he or she is not thinking about the pen and the way it is manipulated, but is probably thinking about the contents of the letter. A verbal report will not reveal information about usage of the pen. Only tasks which require some form of problem solving allow for relevant TA. Information from thought processes available in oral form is easiest to verbalise, because this can be done directly. In other cases, translation from thought to verbalisation is necessary. The vocabulary available may not always be adequate to express, for instance, pictorial information and manipulations (E&S, 1993, p92) in detail. In research into thought processes it is important not to disturb these thought processes by TA. E&S (1993, p79) argue that asking subjects for explanations concurrently changes these processes. When E&S's requirements for successful TA are met, verbal protocols may capture a considerable part of the thought process. In those cases a formal analysis is suggested in which the protocols are segmented and the segments are categorised. An 'ideal' retrospective report is given by subjects immediately after completion of the task, with much information still in STM (E&S, 1993, p19).
Thinking aloud in users’ trialling E&S’s considerations and users’ trialling It is clear that TA in users’ trialling is performed under conditions far from ideal to get ‘rich’ protocols, especially when usage of everyday products is investigated. Using consumer products consists partly of automated behaviour, and actions, such as manipulations may be difficult to verbalise in detail. Studying human computer interaction probably benefits more from TA, because then more problem solving may be taking place, and more information is already in oral form. It is expected that thinking aloud when using everyday products may reveal little about the thought processes at the moment. Therefore it is misleading to justify application of TA in users’ trialling by simply referring to Ericsson and Simon.
Different aims of TA in users’ trialling In users’ trialling aims of TA are different from, and possibly much more modest than charting thought processes. Having subjects think aloud may help to learn which information from a product is used in what way. The aim is not to get ‘rich’ verbal protocols which can be analysed in a formal way. Verbal reports are additional to data from direct observation of use actions. Each utterance of information is valuable, and spontaneous verbalisations of
perceptions are welcome as well. Unless subjects are forced to think aloud, there seem to be no large disadvantages to applying TA, although one never knows for sure whether TA interferes with carrying out the task. Information on usage elicited retrospectively does not necessarily reflect thoughts during usage, especially when subjects are asked for explanations. In many cases, however, it can help to understand what happened during usage.
Experiences with thinking aloud in users' trialling In an experiment, TA was applied to elicit users' perceptions and cognitions. The aim of the experiment was to investigate possibilities for users' trialling early in design processes. In this paper the focus is only on experiences with TA in this experiment. Subjects (aged 20 to 78) operated a non-professional blood pressure monitor. They were asked to think aloud concurrently in a casual way. Information was also elicited retrospectively. Some of the characteristics of the verbal reports, which are illustrated by an extract from one of the protocols (figure 1), are discussed.
Figure 1. Extract from a verbal report.
No information on skilled behaviour Usage of everyday products is expected to consist largely of skill- and rule-based behaviour (Rasmussen, 1986; Kirlik, 1995). Also in usage of the blood pressure monitor, parts of the interaction consisted of automated behaviour, for instance pushing a button to switch the
monitor on, and using a Velcro fastening. No detailed information was supplied about these actions, as could be expected.
Subjects remain silent for longer periods There was large variation in the amount of subjects' verbalisations. However, most subjects remained silent for longer times. The fact that subjects remained silent cannot be totally explained by the fact that skilled behaviour and manipulations are non-verbalisable. Subjects may have had other thoughts at the moment. Maybe they judged these thoughts irrelevant for the research (Wright, 1980; Dobrin, 1986). This may have been the case while waiting for the results to appear. Maybe they simply forgot to think aloud. Subjects may also have remained silent because they are 'bad' verbalisers. In more formal TA it is advised to throw away poor verbal reports, to select subjects with good verbal capacities, and to train subjects to think aloud. It is also advised to remind subjects to think aloud when they remain silent. We refrained from all this, because we did not want these means to interfere with carrying out the task. We only intervened when the interaction got stuck. It might be expected that subjects would start verbalising when getting into trouble (Bowers and Snyder, 1990; Ohnemus and Biers, 1993). However, this was not often the case, apart from general remarks like 'I don't understand'. Maybe they focused all their mental capacities on solving the problem (Page and Rahimi, 1995) and doing a good job, starting to verbalise again when getting out of trouble.
Mainly procedural information Most verbalisations consisted of verbalised actions. As this is often observable information, such verbalisations are not very helpful (except when a drawing of a design proposal is presented to the subjects), and they prevent subjects from really verbalising thoughts at that time. Sometimes subjects verbalised what they perceived, they read product graphics aloud for instance. This is valuable information from TA. The fact that manipulations are difficult to verbalise is not a serious problem, when these manipulations can be directly observed. Bowers and Snyder (1990) and Ohnemus and Biers (1993) also found that concurrent TA mainly yields procedural information. They prefer retrospective probing, as this yields more design relevant information. However, retrospective information is not questioned by them.
Retrospective information to be treated with caution Subjects commented afterwards on a video viewing of their usage, and we asked questions. Sometimes actions were contradicted, for instance the order of applying the cuff and switching the monitor on. Sometimes subjects came up with a list of more or less plausible reasons for certain actions. Presumably, they tried to justify their actions. Negative comments may be more useful. When a subject is asked ‘did you see the sticker with instructions on the cuff?’, he or she will probably not answer ‘no’ when in fact he or she did see it. However, some subjects may hesitate to admit having overlooked the sticker.
Discussion From E&S's views it might be concluded that in users' trialling spontaneous, concurrently expressed remarks are most valuable, and at the same time that the verbal reports will be poor. The verbalisations seem to be only traces of thought processes, which consist of much more than can be voiced. Retrospective techniques, which include interviewing, should be applied with caution, because retrospective reports may not reflect thoughts at the moment of task performance. The goals of TA in a design context differ from E&S's goals of TA. Verbal reports labelled as useless by E&S may be very useful for designers. Techniques of TA may be modified. Although retrospective reports may suffer from memory processes, they should not be put aside altogether, as these techniques are supposed to yield design-relevant information. There are a few alternative methods to elicit perceptions and cognitions, mainly developed in industry. However, these techniques bring along other uncertainties. In co-discovery (Kemp and van Gelderen, 1996), where subjects work in pairs, conventions of conversation may interfere with expressing thoughts. Question-asking protocols (Kato, 1986) obfuscate regular usage. Such techniques may be beneficial in specific cases, respectively to inspire designers and to find out when users need help from a manual or a help-desk.
Literature
Bowers, V.A. and Snyder, H.L. 1990, Concurrent versus retrospective verbal protocol for comparing windows usability, Proceedings of the Human Factors Society 34th Annual Meeting, 1270–1274
Dobrin, D.N. 1986, Protocols once more, College English, 48, 713–725
Ericsson, K.A. and Simon, H.A. 1993, Protocol Analysis, (MIT Press, Cambridge, MA)
Kato, T. 1986, What "question-asking protocols" can say about the user interface, Int. J. Man-Machine Studies, 25, 659–673
Kemp, J.A.M. and van Gelderen, T. 1996, Co-discovery exploration: an informal method for the iterative design of consumer products. In P.W. Jordan et al. (eds.) Usability evaluation in industry, (Taylor & Francis, London), 139–146
Kirlik, A. 1995, Requirements for psychological models to support design: toward ecological task analysis. In J. Flach (ed.) Global perspectives on the ecology of human-machine systems, (Lawrence Erlbaum Associates, Hillsdale), 68–120
Ohnemus, K.R. and Biers, D.W. 1993, Retrospective versus concurrent thinking-out-loud in usability testing, Proceedings of the Human Factors Society 37th Annual Meeting, 1127–1131
Page, C. and Rahimi, M. 1995, Concurrent and retrospective verbal protocols in usability testing: is there value added in collecting both?, Proceedings of the Human Factors Society 39th Annual Meeting, 223–227
Rasmussen, J. 1986, Information processing and human-machine interaction: an approach to cognitive engineering, (North Holland, Amsterdam)
Wright, P. 1980, Message-evoked thoughts: Persuasion research using thought verbalization, Journal of Consumer Research, 7, 151–175
ADJUSTING THE COGNITIVE WALKTHROUGH USING THE THINK-ALOUD METHOD A case study on detecting learnability problems in software products
Marjolijn Verbeek and Herre van Oostendorp
Cap Gemini Nederland B.V. Methods and Tools P.O.Box 7525, NL-3500 GN Utrecht [email protected]
Utrecht University Department of Psychonomics Heidelberglaan 2, NL-3584 CS Utrecht [email protected]
The aim of this study was to analyze the sensitivity of the cognitive walkthrough method and to construct an improved version of the walkthrough question form by adjusting it with the assistance of the think-aloud method. These two methods were applied to evaluate the ease of learning of a graphical user interface. It appeared that the think-aloud method had additional value over the cognitive walkthrough method, because it detected a wider range of learnability problems, at least for novice users. The results of the two methods were integrated in a new, adjusted cognitive walkthrough form.
Introduction Several contributions have been dedicated to the cognitive walkthrough method since it was launched (e.g. Wharton et al, 1992). This study had the aim of contributing to the effectiveness of the cognitive walkthrough in detecting learnability problems in software products. The walkthrough method focuses primarily on ease of use for first-time users. In other words, the cognitive walkthrough assesses a user interface for support of learning by exploration. Exploration-supporting interfaces will be of growing importance as the number of different system applications increases, as does the population of end-users who have often received no formal training. Ease of learning is therefore an important aspect of a usable software product. Three categories of learnability can be distinguished. The first category involves support of task-driven user-events, while the other two indicate the degree of exploration-support of an interface design: 1. the 'task to action mapping' of the interface object involved (the object provides insight into the situation in which it can be used because it provides cues compatible with the user's goal); 2. the 'name to effect mapping' of the interface object involved (the object's label or cues on the screen provide a clear indication of its function); and 3. the 'affordance' (the user directly perceives how to operate on the object, e.g. to drag and drop) (Draper & Barton, 1993). The present study applied the cognitive walkthrough (Lewis et al, 1990) and the think-aloud method (Ericsson & Simon, 1993) to detect learnability problems of the three types mentioned above.
The Experiment Twenty end-users took part in this study. Half of this group were novice users and the other half were experienced users; both groups were randomly divided over the two methods in a between-subjects design. Novice users had domain knowledge (planning, resource allocation, time scheduling, etc.), but had not yet had the opportunity to put it into practice; they were, however, experienced with Microsoft Windows® applications. Experienced users also had to have practical experience with the product. Both the walkthrough and the think-aloud method evaluated the same graphical user interface of a fully operating system, Project Workbench® PMW (short: PMW). PMW is marketed by ABT Corporation and provides a project scheduling, tracking, reporting and analysis capability for managing a wide variety of project environments, from small maintenance activities to multiple, complex projects and programmes. Four tasks were evaluated, which were representative of the system. Two tasks were performed using one of the two methods, the second task being more complex than the first. The other two tasks were control tasks; task performance was measured here in order to be sure that the two groups were comparable.
Cognitive Walkthrough Method In order to map the mental processes, we needed to know what goals and which action repertoire were at the users' disposal. Next, to assess the learnability, the user goals and actions were compared to the formal goals and actions, i.e. the goals and actions required by the interface in order to successfully perform a task. The cognitive walkthrough is a theoretically structured evaluation process that takes the form of a list of questions (see Table 1; Lewis et al, 1990). The evaluation of each step in a task involves the subjects answering the questions while interacting with the interface. The subject begins by giving a description of the current goals and the next action (questions 1 and 2). The next series of questions (questions 2a through 6) evaluate the ease with which the subject will be able to correctly select that action and execute it. Next, a description of the system response is given, as perceived by the user (question 7). Questions 7a and 7b evaluate the adequacy of the system response. The final questions (questions 8 to 8b) evaluate the user's ability to form an appropriate goal for the next action or to detect that the task has been completed. The 'task to action mapping' is inferred from the answer to question 3. The 'name to effect mapping' of the object mentioned by the subject at question 2 is inferred from the answer to question 4. The 'affordance' is inferred from question 6, where the user explains the interaction with the object mentioned at question 2.

Table 1. Cognitive walkthrough question form
1. Describe your immediate goal:
2. The (first/next) atomic action you take:
2a. Is it obvious that action is available? Why/why not?
3. Is it obvious that action is appropriate to goal? Why/why not?
4. How do you associate the description with action?
4a. Problem associating? Why/why not?
5. Are all other available actions less appropriate? For each, why/why not?
6. How are you going to execute action?
6a. Problems? Why/why not?
7. Execute the action. Describe system response:
7a. Is it obvious that progress has been made toward goal? Why/why not?
7b. Can you access needed information in system response? Why/why not?
8. Describe appropriate modified goal, if any:
8a. Is it obvious that goal should change? Why/why not?
8b. If task completed, is it obvious? Why/why not?
Think-aloud Method Via the think-aloud method, insight is obtained into the user's thoughts during task performance (see for example De Mul & Van Oostendorp, 1996). With this insight one can find out where the user's attention is drawn while interacting with the interface. Analysis of the verbal protocols then makes it possible to determine whether there is a good fit between what users want and can do and what the system requires and provides in terms of possibilities and feedback. From the verbalizations of the subjects, the interaction with each interface object was judged in terms of problems resulting from one of the three learnability categories…
Results
Data Analysis In order to investigate the learnability problems detected by the cognitive walkthrough method, the goal-action structures of the users were mapped onto the goals and actions which are required or supported by the interface. From this comparison we were able to determine what percentage of users found the solution path, i.e. the sequence of formal goals and actions required by the design of the interface. The number of deviating goals and actions was also analysed. Every deviating user action was judged as performed on the ground of either the 'task to action mapping', the 'name to effect mapping' or the 'affordance' of the interface object involved. In sum, the three categories of learnability were inferred from the answers to the corresponding questions, as mentioned earlier. In this way, each interface object on which an action other than the one of the solution path was performed could be identified as causing one of the three types of learnability problems. An object was judged as problematic if less than 60% of the users found the formal action belonging to the solution path.
Table 2. Median proportions of general success for every object
In preparation for the data analysis of the verbal protocols, a list of 23 PMW objects was composed which have to be used in order to perform the tasks successfully. The verbal protocols of the subjects could then be segmented according to these objects. Every verbalization relates to a planned or realized action of the subject with regard to an object, and can thus be seen as a 'user-event' (i.e. an attempt to learn by exploration or to find a suitable object). We time-stamped each user-event and cross-referred it with the object in question. Then followed a binary decision for each of the three categories of learnability: 1. Is the object found when it is needed ('task to action mapping')? 2. Is the object understood when explored ('name to effect mapping')? and 3. Is the object successfully operated when used ('affordance')? All user-events
were listed chronologically for each object of the interface (Draper & Barton, 1993). We obtained a success-failure proportion with values between 0 and 1 by dividing the number of successes by the sum of successes and failures. This figure reflects the general success of exploration of that object per subject, where a value of 1 means that the object involved has been used completely successfully. Then, for the total group of subjects, the median value of the general success figure was computed for every interface object. This value indicates the proportion of general success over the total frequency with which an object was used. The median was chosen as the statistic because it suits the quantitative comparison of the two sets of data. One part of the scoring scheme is shown in Table 2 (see Verbeek, 1997, for details). Table 2 shows, for example, that the object TAR (number 14, with the user-event "allocate a resource to the task") was not used very successfully. It scores especially low on the category 'name to effect mapping' as well as on the category 'affordance'. This means that the interface object involved causes learnability problems, in the sense that the object did not sufficiently show what it does, and the user could not directly perceive how to operate it. The results shown in Table 2 seem to imply that user-events with an exploration-driven character are weakly supported by the interface. This is in contrast to the task-driven user-events, since hardly any 'task to action mapping' problems were found.
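As an illustration of the scoring just described, the following minimal Python sketch computes a per-object general-success value (the median over subjects of each subject's success proportion) and flags low-scoring objects. The event data, object names and the cut-off used below are invented for illustration and do not reproduce the study's actual data.

from statistics import median

# Hypothetical user-events: (subject, interface object, learnability category, success 0/1).
events = [
    ("s1", "TAR", "name_to_effect", 0),
    ("s1", "TAR", "affordance", 0),
    ("s1", "Gantt", "task_to_action", 1),
    ("s2", "TAR", "affordance", 0),
    ("s2", "Gantt", "task_to_action", 1),
]

def general_success(events):
    """Per object: median over subjects of successes / (successes + failures)."""
    per_subject = {}                      # (object, subject) -> [successes, attempts]
    for subject, obj, _category, success in events:
        counts = per_subject.setdefault((obj, subject), [0, 0])
        counts[0] += success
        counts[1] += 1
    per_object = {}                       # object -> list of per-subject proportions
    for (obj, _subject), (successes, attempts) in per_subject.items():
        per_object.setdefault(obj, []).append(successes / attempts)
    return {obj: median(proportions) for obj, proportions in per_object.items()}

scores = general_success(events)
# Flag objects with low median general success (the 0.5 cut-off is illustrative only).
problematic = [obj for obj, score in scores.items() if score < 0.5]
print(scores, problematic)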
The Comparison between Methods With the cognitive walkthrough and the think-aloud method, learnability problems were detected and identified as being related to one of three categories: 'task to action mapping', 'name to effect mapping' or 'affordance' of an interface object. These problems refer to the interface objects indicated as problematic by the cognitive walkthrough method (used successfully by less than 60 percent of the subjects) and by the think-aloud method (used successfully by less than 50 percent of the subjects). The cognitive walkthrough method detected learnability problems mainly of the category 'task to action mapping', which was true for both groups of subjects. In contrast, the think-aloud method detected learnability problems of all three categories: 'task to action mapping', 'name to effect mapping' and 'affordance' of interface objects. These problems were specifically experienced by novice users. While interpreting the conclusions one has to keep in mind that this study was specifically aimed at detecting learnability problems. Therefore, no absolute conclusion can be drawn about the usability of PMW. From these results we may conclude that the think-aloud method had additional value over the cognitive walkthrough, because a wider range of problems was detected, at least for novice users, involving particularly the exploration-driven user-events. The learnability problems detected by the think-aloud method but not by the walkthrough were regarded as "missing values" in the efficiency of the cognitive walkthrough method. The results of the two methods were integrated in a new, adjusted cognitive walkthrough form, with which all three types of learnability problems can be detected. The adjustment was made primarily by modifying questions 4 and 7, which measure 'name to effect mapping' and 'affordance' respectively. Secondly, whereas in the original form the first questions were task oriented and the later ones exploration oriented, placing question 4 at the beginning of the adjusted form allows the user to first observe and explore the interface more thoroughly before selecting an action. In this way a more exploration-oriented task performance is simulated and more emphasis is given to the 'name to effect mapping' and 'affordance' of interface objects.
Discussion Usually, the cognitive walkthrough is applied by the designer of the evaluated system or by an expert in cognitive psychology. The expert walks through the interface design, simulating the user's interaction with the interface while performing a specific task. In the present study, however, the cognitive walkthrough was performed by the subjects themselves. This difference had the advantage that we could derive data directly from representative users instead of experts merely estimating the users' goals and actions. The combination of the two methods can detect a broader range of learnability problems than either does individually. This case study demonstrated that the think-aloud method was needed to develop an adjusted cognitive walkthrough, because the think-aloud method detected problems that the walkthrough did not detect. We believe that the main cause of this is that there is a close correspondence between the verbalizations and the actual processes used to perform the task. In spite of the verbalizations' lack of coherence and partial incompleteness (for example, leaving unanswered how the solution was generated in detail and why a given action structure was adopted among many possible scenarios), they still provide a more accurate picture than the interrogative approach of the cognitive walkthrough. The subjects who performed the think-aloud method were focused on completing the task, the verbalizing of the heeded information being secondary. The subjects who performed the walkthrough, however, paid relatively more attention to the question form than to completing the task. Of course, the new version of the cognitive walkthrough should be tried out in order to test its efficiency empirically.
References
De Mul, S. and van Oostendorp, H. 1996, Learning user interfaces by exploration, Acta Psychologica 91, 325–344
Draper, S.W. and Barton, S.B. 1993, Learning by exploration, and affordance bugs, Adjunct Proceedings of INTERCHI '93 Conference, April 24–29 1993, (ACM, Amsterdam, The Netherlands), 75–76
Ericsson, K.A. and Simon, H.A. 1993, Protocol analysis. Verbal reports as data, Revised edition, (MIT Press, Massachusetts)
Lewis, C., Polson, P., Wharton, C. and Rieman, J. 1990, Testing a walkthrough methodology for theory-based design of walk-up-and-use interfaces, Proceedings of CHI '90 Conference, April 1–5 1990, (ACM, Seattle, Washington), 235–242
Verbeek, M.L. 1997, Adjusting the cognitive walkthrough method with the assistance of protocol analysis. A case study of detecting learnability problems in a software product, Dutch Internal Report, (Utrecht University, The Netherlands)
Wharton, C., Bradford, J., Jeffries, R. and Franzke, M. 1992, Applying cognitive walkthroughs to more complex user interfaces: experiences, issues, and recommendations, Proceedings of CHI '92 Conference, May 3–7 1992, (ACM, Monterey, California), 381–388
Acknowledgements Thanks are due to ABT Benelux B.V. for providing the means for conducting this study. We especially thank Dr. Maarten Bakker for his enthusiasm during the study and for his useful comments on a previous version of this paper. In addition, we thank the participants for their cooperation in the experiment.
VERBAL PROTOCOL DATA FOR HEART AND LUNG BYPASS SCENARIO SIMULATION “SCRIPTS” Joyce Lindsay* and Chris Baber
Industrial Ergonomics Group, School of Manufacturing & Mechanical Engineering, University of Birmingham, B15 2TT, United Kingdom.
A perfusionist is the medical professional who operates the heart and lung bypass circuit during open heart surgery. The perfusion circuit pumps oxygenated blood to the patient's body tissues while the heart is temporarily arrested. Currently, perfusionists are given practical training during real surgery cases. This paper discusses how the current training regime in the UK could be improved to help trainees become more competent. As a small part of more extensive research towards the development of a training simulator, this study uses verbal protocol data, real-time parameter monitoring and critical incident technique (CIT) data to form a database of perfusion scenarios. Once extensive data have been collected, the database will be transformed into "scripts" from which simulations will be run.
Introduction In cardiopulmonary bypass (CPB) surgery the heart is temporarily arrested to allow surgical repair. Meanwhile, the patient's tissue oxygenation is maintained by the perfusionist, who also manages a number of other physiological parameters to maintain an optimal internal environment for the patient. Fully qualified perfusionists operate the circuit on their own. Trainee perfusionists, on the other hand, operate under the supervision of a qualified professional. While the complex nature of perfusion means that on-the-job training is an acceptable method of training, the critical nature of perfusion suggests that it is not. Because the heart surgery team (i.e. surgeons plus perfusionists plus anaesthesiologists) is tightly coupled, trainees learning during clinical cases may add risk to the patient in three ways: their own error may directly affect the patient; their error may influence the surgeon or the anaesthesiologist, causing them to make an error; or the surgeon or anaesthesiologist may take an action which causes the perfusionist to make an error. With the advance of medical technology there has been a corresponding rise in the incidence of operator errors (Cooper et al, 1984). Bogner (1994) attributes this in part to a lack of task-specific training in the medical domain, to which perfusion is no exception. The training of UK perfusionists is currently regulated by the European Board of Cardiovascular Perfusionists (EBCP). Trainee perfusionists learn theoretical basics before they begin on-the-
job training in which they must complete 100 clinical cases and be considered of a suitable standard before they can be assessed (Davis, 1996). The EBCP is currently introducing the European Certificate of Cardiovascular Perfusion (ECCP) to create a uniform standard of perfusion training across Europe and to phase out national qualifications (Sanger, 1997). Candidates for the ECCP must satisfy certain criteria:
• They must have graduated from an accredited institution.
• They must have practised clinical perfusion for at least two years.
• They must be currently practising clinical perfusion.
• Their supervisor(s) must confirm the following:
- That the applicant has a minimum of 100 clinical cases and that they are competent to practise unsupervised.
- That the applicant can competently avoid and manage perfusion accidents.
- That the applicant can set up and operate a wide range of equipment used for CPB (EBCP, 1997).
Broadly, these criteria can be both quantified and assessed against specific standards. However, the criterion "that the applicant can competently avoid and manage perfusion accidents" could prove problematic to assess, for the following reasons. Since critical incidents and failures in CPB are rare (Wheeldon 1981, in Longmore 1981), it is likely that a trainee will first encounter a critical incident as a fully qualified perfusionist. Even if a trainee experiences a critical incident, the supervisor will assume responsibility. In either case, the supervisors cannot claim that the trainee can competently avoid and manage perfusion accidents: if trainees cause a critical incident then they have not managed to avoid it, and if they avoid critical incidents altogether they have not shown that they can cope with one. It is suggested that an additional technique is required to improve training, as inadequate training usually manifests itself in critical incidents (Weinger and Englund, 1990). Botney et al (1993) demonstrated that very few anaesthesiologists could deal with simulated critical incidents in the manner most likely to reduce risk to the patient. Like perfusionists, anaesthesiologists rarely experience critical incidents and have not been specifically trained to deal with them. A similar phenomenon may therefore exist among perfusionists. A training simulator would provide the opportunity to experience and practise routine and critical scenarios outwith clinical cases. The aim of this research is to produce a training simulator for perfusion to provide such training opportunities, including:
• Introduction to perfusion techniques.
• Recognition of and response to critical incidents.
• Recognition of monitor artefacts.
• Practice in operating real equipment to acquire skills.
• Provision of greater trainer control, for example by allowing repeated practice of one process until it has been mastered.
• Testing of trainee progress.
Thus, a data collection technique was employed which elicited enough detail to develop "scripts" for perfusion simulation without disrupting the perfusionist.
Methodology The methodology was divided into two main phases: verbal protocol (VP) and temporal physiological parameter recording. Five perfusionists were asked to provide concurrent VPs during surgery; in other words, they were asked to provide a running commentary on their actions and the associated reasoning. The data were recorded as non-intrusively as possible, using a simple arrangement of a tie-pin microphone and a walkman. This method also recorded background noises, which often provided clues as to the stage of surgery or to what the perfusionist was referring if their explanation was not entirely clear. Recording took place during clinical cases; each operation was recorded for twenty to forty-five minutes and covered putting the patient onto bypass, bypass itself, and bringing the patient off bypass. The analyst remained present throughout so that the perfusionist felt that they were talking to someone rather than to themselves. Concurrently, the analyst recorded the values of the most important monitors in the perfusion circuit (Lindsay, Baber and Carthey, 1997) at two-minute intervals. The data were recorded onto prepared charts.
Results The data from the VP were fully transcribed and separated into discrete tasks. Analysis involved checking the transcripts for those events in the circuit which prompted the perfusionists to take certain actions. Examples of these are given in Table 1. These events were then matched against the parameter values recorded by the analyst to determine the value(s) which prompted particular action(s). For instance, the perfusionist may have commented that "patient blood pressure was high" so he would "increase the isoflurane administration levels". The charts indicated what a "high blood pressure value" was and by how much the isoflurane administration was increased. From an extensive collection of VP and parameter data it is hoped to create a database of information from which simulation "scripts" can be formulated. The scripts will include routine scenarios and critical events where the perfusionist may have to problem-solve. Such scenarios will be identified by a critical incident technique (CIT), which is already underway, and the future implementation of failure modes and effects analysis (FMEA). Each script will contain contextual information (i.e. patient and operation details) because this may influence the response adopted by the perfusionist. In one example, when the patient had just been put onto bypass, the perfusionist commented that the venous oxygen levels were high, which indicated that the patient was not receiving enough oxygen. At a cooler temperature, however, this would have been an acceptable level of venous oxygen. Such examples suggest that perfusionists maintain excellent situational awareness, noting the suitability of physiological parameters as they vary throughout surgery. In the VP study, to date, only one critical incident has been witnessed by the analyst. The perfusionist has to calculate the width of circuit lines from patient details and a calculation chart. Because this chart is difficult to use, the perfusionist inadvertently chose a venous line which was too big for the patient. This was only discovered when the patient was on bypass and the venous line began returning a lot of air to the reservoir. The patient came to no harm but, had anything else gone wrong, they could have been injured.
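To illustrate how matched VP and parameter data might populate such a database, the sketch below pairs a time-stamped comment with the nearest two-minute parameter reading and stores the result as a candidate trigger-response record. The parameter names, times, values and record layout are assumptions made for illustration; the paper does not prescribe a data format.

# Hypothetical sketch of matching a verbal-protocol event to the nearest
# two-minute parameter reading and recording it as a candidate script rule.
parameter_log = [  # (minutes into bypass, {parameter: value}); values invented
    (10, {"arterial_pressure_mmHg": 62, "venous_O2_sat_pct": 68}),
    (12, {"arterial_pressure_mmHg": 95, "venous_O2_sat_pct": 70}),
]
vp_events = [  # (minutes into bypass, observation, action taken)
    (12.5, "patient blood pressure was high", "increase isoflurane administration"),
]

def nearest_reading(log, t):
    """Return the parameter snapshot recorded closest in time to t."""
    return min(log, key=lambda entry: abs(entry[0] - t))[1]

script_rules = []
for t, observation, action in vp_events:
    snapshot = nearest_reading(parameter_log, t)
    script_rules.append({
        "context": {"stage": "on bypass", "time_min": t},   # context influences the response
        "trigger": {"observation": observation, "parameters": snapshot},
        "response": action,
    })
print(script_rules)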
Table 1: Verbal Protocol; Examples of Action & Response Scenarios
Discussion Training simulation is "the exercise of the operator on a mimic of the condition in which the subject will perform their work" (Stammers, 1983). This need not be a high-fidelity arrangement, since effective training can be obtained using limited fidelity (Rolfe and Waag, 1982). It is envisaged that the initial simulation will involve a static representation of the system to which the perfusionist is expected to respond. The scripts developed from the VP and parameter data will direct the course of a simulation. The scripts will adopt the following format (a simple illustrative sketch follows below).
• The perfusionist will be given contextual information (i.e. the type of surgery, the stage of surgery and patient details).
• The static representation will include a set of physiological values which will be updated over time; the perfusionist must interpret the situation and outline their responses.
• The data from the verbal protocols will be compared with the perfusionists' responses.
• Eventually, more challenging circumstances will be included where the information is incomplete or there are unsafe physiological values. Such scripts will be derived from the CIT and FMEA data.
• Finally, the activities of other members of the surgery team must be included, as the perfusionist does not operate the circuit in isolation. Team activities and communication will therefore be studied and incorporated into the scripts.
It is impossible for the perfusionists to verbalise every thought and action involved. Thus, much more VP and parameter data will have to be recorded to create a substantial database. These data will eventually be combined with CIT and FMEA data to include a wide range of failures in the scripts.
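As a rough, purely illustrative rendering of the script format listed above, the sketch below represents a single script step and compares a trainee's stated response with the response recorded in the verbal protocols. All field names and values are hypothetical; the eventual simulator format remains to be defined.

# Hypothetical representation of one simulation script step.
script_step = {
    "context": {"surgery": "CABG", "stage": "cooling", "patient": {"age": 58, "weight_kg": 80}},
    "displayed_values": {"arterial_pressure_mmHg": 95, "venous_O2_sat_pct": 70},
    "recorded_response": "increase isoflurane administration",   # from the VP database
}

def run_step(step, trainee_response):
    """Present context and values, then compare the trainee's response with the
    perfusionist response recorded in the verbal protocols."""
    print(step["context"], step["displayed_values"])
    match = trainee_response.strip().lower() == step["recorded_response"].lower()
    return {"trainee": trainee_response, "expected": step["recorded_response"], "match": match}

print(run_step(script_step, "Increase isoflurane administration"))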
Conclusions
1. That certain physiological parameter values trigger specific responses by the perfusionist and that the latter are governed by the context.
2. That perfusionists maintain an accurate awareness of the situation around them.
3. That VP is an excellent means of data collection in these conditions to give insight into circuit operation.
4. That recording of physiological parameters every two minutes provides sufficient detail for the simulation scripts.
5. That many more VP, parameter records, CITs and FMEAs will have to be undertaken to create a substantial database for script development.
6. That the criteria required for candidates to sit the ECCP are inadequate. 100 clinical cases will not prepare trainees for all eventualities or provide the opportunity to develop skills to cope with critical situations.
7. That a simulator run by scenario scripts would be flexible, providing trainer control over the experiences of the trainee.
References
Bogner, S.; 1994; Human Error in Medicine; Lawrence Erlbaum Associates Ltd.
Botney, R., Gaba, D.M. and Howard, S.K.; 1993; The role of fixation error in preventing the detection and correction of a simulated volatile anaesthetic overdose; Anaesthesiology; 79(3a).
Davis, M.; 1996; Personal Communication; Chief Perfusionist; Great Ormond Street Hospital for Sick Children, London.
European Board of Cardiovascular Perfusion; 1997; Perfusion Announcement; Perfusion; 12(2); pp. 80.
Lindsay, J., Baber, C. and Carthey, J.; 1997; Criticality analysis of potential failure in heart and lung bypass systems during neonatal open heart surgery; The Principles of Risk Assessment and Management for Programmable Electronic Medical Systems; 9th December, 1997, Strand Palace Hotel, London.
Rolfe, J.M. and Waag, W.L.; 1982; Flight simulators as flight devices: some continuing psychological problems; Communications to the Congress of IAAP, Edinburgh; Roneo; 10 pp.
Sanger, K.; 1997; Personal Communication; Chief Perfusionist; The Royal Infirmary of Edinburgh and the Royal Hospital for Sick Children in Edinburgh.
Stammers, R.B.; 1983; Simulators for training. In T.O.Kvalseth (Ed.), Ergonomics of Workstation Design; London, Butterworths; pp. 229–242.
von Segesser, L.K.; 1997; Perfusion education and certification in Europe; Perfusion; 12; pp. 243–246.
Weinger, M.B. and Englund, C.E.; 1990; Ergonomics and human factors affecting anaesthetic vigilance and monitoring performance in the operating room environment; Anaesthesiology; 73; pp. 95–102.
Wheeldon, D.R.; 1981; Can cardiopulmonary bypass be a safe procedure? In D.Longmore (Ed.), Towards Safer Cardiac Surgery.
USE OF VERBAL PROTOCOL ANALYSIS IN THE INVESTIGATION OF AN ORDER PICKING TASK Brendan Ryan and Christine M.Haslegrave
Institute for Occupational Ergonomics University of Nottingham, Nottingham NG7 2RD
Verbal protocol reports were collected from eleven subjects both during and shortly after completion of familiar handling tasks in their usual workplace. Additional reports were collected whilst the subjects watched a video recording of their performance. Supplementary questions were used to collect further information on factors which were not discussed by the subjects. The reports give an indication of the range of information which can be collected from subjects in relation to manual handling tasks, and show that it is difficult to obtain detail with regard to posture, movements, handling techniques and characteristics of the loads handled.
Introduction Self report methods have been used extensively in practical situations to collect information relating to accidents and injuries. They have been used with varying degrees of success in the collection of data on exposure to risk factors associated with back injuries. In this study, verbal protocol reports were collected to gain a greater understanding of how subjects approach manual handling situations, and how they perceive, process and report information relating to manual handling risk factors.
Method Verbal protocol reports were collected from eleven subjects both during (concurrent protocol) and shortly after (retrospective protocol) completion of familiar handling tasks in their usual workplace, which was a distribution warehouse. The tasks involved the transfer of various items from warehouse racking to a roll container. The largest item handled was a cardboard box (approximate dimensions 0.5×0.5×0.5m, weight 5kg), which had to be pulled out from a low level of the racking to pick items of stock which were stored inside.
Standardised instructions were read to the subjects, requesting them to think aloud during the execution of the order picking task. Movements, posture and verbalisations by the subject during the execution of the task were recorded using video tape. Retrospective protocols were recorded on audio tape immediately on completion of the task. Ten of the subjects returned to participate in day two of the study, forty-eight hours later. A second retrospective report was collected in order to investigate the effect of a delay in reporting. This was followed by a further report produced whilst watching the video recording of the task completed on day one of the study. Finally, supplementary questions were put to subjects. All recordings were transcribed and tabulated to facilitate comparisons between subjects and the various methods. Postures and movements during the performance of the task were viewed on video tape by the experimenter. Written descriptions of these postures and movements were tabulated alongside the verbal reports from the subjects.
Results Five of the subjects were only able to provide very general information, either lacking task-related detail or showing evidence of poor understanding of the instructions. The reports of the other six subjects were analysed in detail.
Concurrent report The concurrent report aimed to collect details of thoughts while performing the task. It was clear that the subjects’ main focus of attention was on the selection and locating of items to make up the order on the pick list they had been given. Many subjects focused on racking location numbers and order details which were read from the pick list. Several subjects paid close attention to the products and attempted to identify and produce verbal labels for these. However, attention wandered frequently, with some subjects making reference to hobbies, ambitions or other non-work related activities. Gaps were evident in other reports where subjects appeared to have found difficulty in producing verbalisations at the same time as attending to their task. There were very few spontaneous reports of postures or techniques being used to handle items.
Retrospective reports The retrospective reports aimed to determine how well subjects could recall thoughts which had entered their attention during performance of the task. These were repeated two days later in order to evaluate the effect of a period of delay on the ability to report such thoughts. Generally, the retrospective reports contained less descriptive detail than the concurrent reports. The subjects tended to report general objectives or goals of the task (e.g. “picked the stock”, “put them in the roll cage”). Concurrent and retrospective reports often contained similar subject matter, but some subjects reordered their thoughts. In addition, new information on the subjects’ thoughts on their performance of the task was often provided in the retrospective reports. Thus, it is not possible to determine whether the reports contained details of actual thoughts at the time of execution of the task, or whether they are based on reconstructed information.
The second retrospective report showed further loss of information, particularly in relation to descriptive details of the task.
“Retrospective with video” report After the second retrospective report, the subjects were again asked to describe their task and the video recording was used as an additional prompt with regard to the activities carried out. Details were again requested of recall of thoughts. Some subjects provided information of a better quality than others, and reports were produced in close synchronisation with the video images. Others referred to what they “usually”, “often” or “sometimes” thought during the execution of the task. These reports appeared to introduce some new information but would be subject to the limitations relating to the reconstruction of thoughts, identified in the previous sections.
Supplementary questions The final series of questions was included to collect information which subjects had not given during their spontaneous reports. The questions addressed the awareness of postures, movements and handling techniques, and the awareness of characteristics of the loads. There was little evidence to suggest that conscious attention was devoted to postural risk factors in this study. The subjects, almost without exception, admitted that they did not think about the characteristics of the loads when they were of the weight and size used in this study.
Discussion The reports give a good indication of the range of information which can be collected from subjects both during and after completion of a typical handling task, and the type of detail which they spontaneously report without the need to resort to questioning or prompting. In general, the reports lacked descriptive detail, and contained few or no references to handling technique, posture or movements. Baril-Gingras and Lortie (1990) refer to handling activities as a complex combination of different operations, which are rarely one continuous movement. Subjects do not appear to think about and describe handling activities in this way, and they fail to provide complete breakdowns of the phases involved in the tasks. The reports suggest that subjects perceive handling activities in more "global terms", enabling reports which are goal orientated or state the general objective of the task. It is not clear whether subjects have the ability to access information which will allow them to adequately describe the handling activities. The final interviews failed to extract this information and comments from some subjects appeared to confirm that they do not have access to such information in tasks which they describe as "routine" or "automatic". This might be expected, since Schmidt (1982) explains the contribution of motor programs to posture and movement patterns. This evidence of unconscious processing suggests that it would be difficult to use self report methods for the purpose of collecting information on postures, movements and handling techniques. In a similar manner, subjects made very few references to characteristics of the load. Their reports, and responses to the supplementary questions, suggest that the subjects made subconscious decisions which may not fully account for the potential effect of size, weight or
other attributes of the items handled. One of the loads was of significant size and weight, and was positioned in such a way that the opening flap of the box caused difficulties for the subjects in removing it from the bottom level of the racking. Several subjects briefly acknowledged this difficulty but made no other references to characteristics of the load, or to the problematic postures which they adopted during this handling activity. The weights of many of the items in the current study were admittedly small, and the "size-weight" illusion described by Wiktorin et al (1996) could be an important factor in the underestimation of the loads. It is intended to repeat the study with a wider distribution of load sizes and weights in order to see whether this raises the awareness of both the load characteristics and risk factors associated with posture and movement patterns. In the final part of the study, careful consideration was given to the question wording in an attempt to minimise bias (Sheehy 1981). Sheehy refers to a trade-off between attempts to provide complete verbal reports and the erroneous information which may result from interrogation, and the responses to the questions have therefore been interpreted with caution. The questions were located at the end of the experiment to minimise the effect of bias on the other elements of the study. When considering the methodology, the limitations of verbal protocol analysis are well documented in the literature (Bainbridge, 1995; Nisbett and Wilson, 1977; Leplat and Hoc, 1981; Ericsson and Simon, 1993). A number of these methodological problems are of concern in the current study. Several subjects referred to difficulties putting their thoughts into words and it is possible that this is particularly a problem in verbalising handling techniques. Ericsson and Simon (1993) discuss the problems which subjects may have in transforming non-orally encoded information into speech, so it is perhaps not a surprising finding in this study that the subjects focused on aspects of the task which may be easier to report. The retrospective reports gave evidence of revision of accounts of the task, at least in terms of ordering thoughts, giving the impression that the task had been approached in a more logical fashion than the former report indicated. Sheehy (1981) warns of the potential for reconstruction in self reported information, and explains how the subject's conceptual recognition of the events may be different from the chronological order of events. Additional information was obtained from some of the retrospective reports, but it is not possible to determine whether this was based on the subject's actual thoughts, or originated from reconstructed comments which the subject "thinks he must have thought". The "retrospective with video" report extracted new information in a number of cases. These reports may also be susceptible to the effects of reconstruction, but it is difficult to either prove or disprove this. The widely differing quality of reports from subjects while they were watching the video recording appears to confirm Schmidt's (1982) reservations about the abilities of subjects to critically observe their own performance because of limited "viewing skills" to separate relevant from irrelevant aspects of the action.
Conclusions - The reports give an indication of the range of information which subjects process and are able to describe during the execution of handling tasks and the types of information which are accessible and inaccessible.
- The reports contain remarkably few references to posture, movements, handling techniques or factors relating to the load. Subjects did not naturally provide detailed breakdowns of the stages involved in handling activities. The failure to obtain such information could suggest: (i) that the subjects do not have access to this information; or (ii) that the ability to report on such factors is not well established, in which case subjects may have a greater awareness of these details than is apparent from the reports, and progress may be achieved by efforts to improve the ability of subjects to provide reports. In either case, the findings raise questions with regard to the validity of any subsequent questions which might attempt to collect such information.
- The verbal protocol methodology has been shown to contain many limitations, such as the difficulties of putting thoughts into words and time limitations which prevent subjects from mentioning all that passes through their minds. These may be particular problems in the verbalisation of handling techniques.
- Further studies are now necessary to investigate the awareness of posture, movement and handling techniques when handling tasks involve heavier and different types of loads.
References
Bainbridge, L. and Sanderson, P. 1995, Verbal protocol analysis. In J.R.Wilson and E.N.Corlett (eds) Evaluation of Human Work. A practical methodology, Second Edition, (Taylor and Francis, London), 169–201
Baril-Gingras, G. and Lortie, M. 1990, Analysis of the operative modes used to handle containers other than boxes. In B.Das (ed) Advances in Industrial Ergonomics and Safety II, (Taylor and Francis), 635–642
Ericsson, K.A. and Simon, H.A. 1993, Protocol Analysis: Verbal reports as data, Revised Edition, (The MIT Press, Cambridge, Massachusetts)
Leplat, J. and Hoc, J-M. 1981, Subsequent verbalisation in the study of cognitive processes, Ergonomics, 24, 743–755
Nisbett, R.E. and Wilson, T.D. 1977, Telling more than we can know: Verbal reports on mental processes, Psychological Review, 84, 231–259
Sheehy, N.P. 1981, The interview in accident investigation. Methodological pitfalls, Ergonomics, 24, 437–446
Schmidt, R.A. 1982, Motor Control and Learning. A Behavioural Emphasis, (Human Kinetics Publishers, Champaign, Illinois)
Wiktorin, C., Selin, K., Ekenvall, L., Kilbom, A. and Alfredson, L. 1996, Evaluation of perceived and self-reported forces exerted in occupational materials handling, Applied Ergonomics, 27, 231–239
PARTICIPATORY ERGONOMICS
SELECTING AREAS FOR INTERVENTION
Benjamin L.Somberg
User-Centered Design Department AT&T Labs Room 2K-337 101 Crawfords Corner Road Holmdel, NJ 07733 USA
When seeking to effect organizational change, the determination of which aspects of the organization are to undergo change should be based on objective data regarding existing problems in the environment. However, it is unprofitable to seek change that is not supported by the organization, and thus the personal preferences of the stakeholders must be given considerable weight. A method for balancing the use of objective data and attention to stakeholder preferences was used to help an organization select an opportunity for change. This process ensured involvement of stakeholders by assigning specific participatory roles, and it forced stakeholders to agree in advance on the criteria for selecting an area for change. This produced a decision that was supported by all stakeholders and ensured that a verifiable problem within the organization was being solved.
Introduction and Thesis Attempts to effect organizational change often face the obstacle of a lack of agreement among stakeholders on which aspects of the organization are the best candidates for change. Clearly it is productive to address only those issues that represent verifiable problems with cost-justifiable solutions. However, recommendations based on objective analyses may be devalued because they are not consistent with management's view of the existing environment. Meanwhile, political considerations and anecdotal evidence often influence stakeholders' decisions about which of several opportunities for change should be seized. To overcome the barrier concerning selection of an area for change, two conditions must be met. First, a decision needs to be based as much as possible on objective data, and second, there must be sensitivity to the possibly conflicting goals of all important stakeholders. A recent project offers a successful example of meeting these two conditions through the use of a "focus area selection process" that was data-driven, but had high stakeholder involvement.
Focus Area Selection Procedure This project involved a division within the company that has a critical role in the maintenance of the public telecommunications network.1 Because of the introduction of new technology in the network, the role and demands of this organization were undergoing significant transition, providing an excellent opportunity to investigate ways to enhance the effectiveness and efficiency of the operations within the division. The organization is a large one with a broad scope, and it was known from the beginning that there were more opportunities to effect change than could be handled by the resources devoted to the effort. Some selection of critical areas would have to occur. The types of change that would be considered included enhancements to the work environment (including support tools), job and task design, and organizational restructuring. The project was conceived as a four-phase effort. The first phase was a high-level overall analysis of the operations within the division, concentrating on identifying sources of inefficiency, sources of error, and opportunities for change. At the conclusion of that analysis, there was to be a selection of one or more "focus areas" which would receive detailed attention. In the third phase, an analysis of those focus areas would be performed and a recommended program for organizational change within the boundaries of those focus areas was expected. Finally, the fourth phase of the project called for the development of an overall plan for organizational change, incorporating the lessons learned from the selected focus areas. In order to help ensure participation across the division, the following participatory roles were identified:
• Analysis team: a small group of technical personnel with responsibility for performing the analysis of the operations and for making specific recommendations regarding organizational change.
• Core team: a group of key managers from the division who periodically reviewed the project status and made final decisions regarding the overall project direction.
• Stakeholders: a larger group of managers or process owners within the division who would likely be affected by the results of the project. They played a consulting role on the project and participated in some key decisions.
It was recognized from the outset that the selection of the focus areas was going to be a critical step in the project. A considerable effort was to be devoted to a detailed analysis of the focus area and it was anticipated that significant, concrete recommendations for change would result. If the focus areas were selected based upon an unbiased understanding of the benefits and obstacles associated with each potential area, there would be confidence that the remainder of the project would be devoted to solving verifiable problems and that significant benefit to the organization could be achieved. However, the project had numerous stakeholders with competing interests and there was potential for territoriality to overwhelm objectivity. Consequently, it was decided to attempt to reach a priori consensus among the stakeholders on a process for selecting the focus areas. Although the focus area selection process was to be the second phase of the project, defining that process occurred in parallel with the high-level analysis.
1 The goal of this paper is to describe a methodology, rather than to discuss the results of an analysis. Consequently, some of the proprietary details of the involved organizations have been stated in general terms or altered in ways that do not affect the goal of the paper.
Step One: Selection Criteria Nominations The first step in selecting a focus area was for the stakeholders to construct a list of potential criteria by which focus areas could be selected. Stakeholders were told generally how these criteria would be used, but were not given any examples or restrictions on what they could suggest. The stakeholders were asked to generate as many possible criteria as they wished. Twelve unique selection criteria were offered, as shown in Table 1.
Table 1. Nominated selection criteria
Step Two: Voting on Selection Criteria Once a set of possible selection criteria had been obtained, stakeholders were asked to help choose a final set of criteria that would be used to select the focus areas. Stakeholders were sent a list of the twelve nominated selection criteria and were asked to rank each nomination as High (very important), Medium (moderately important), or Low (relatively unimportant), with the constraint that there could be no more than four nominations placed into either of the two higher categories. Based upon the voting, four criteria were adopted for use in selecting the focus areas.
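The paper does not state exactly how the votes were tallied; as a hedged sketch, the fragment below assumes simple weights (High = 2, Medium = 1, Low = 0), enforces the stated four-per-level constraint, and adopts the four highest-scoring criteria. The criterion names and votes are placeholders, not the project's actual data.

from collections import Counter

# Hypothetical ballots: stakeholder -> {criterion: "High" | "Medium" | "Low"}.
votes = {
    "stakeholder_a": {"criterion_A": "High", "criterion_B": "Medium", "criterion_C": "Low", "criterion_D": "High"},
    "stakeholder_b": {"criterion_A": "High", "criterion_B": "High", "criterion_C": "Medium", "criterion_D": "Low"},
}
WEIGHT = {"High": 2, "Medium": 1, "Low": 0}  # assumed weighting, not from the paper

def validate(ballot, max_per_level=4):
    """Check the stated constraint: no more than four nominations rated High, and no more than four Medium."""
    counts = Counter(ballot.values())
    return counts["High"] <= max_per_level and counts["Medium"] <= max_per_level

scores = Counter()
for stakeholder, ballot in votes.items():
    if not validate(ballot):
        raise ValueError(f"{stakeholder} exceeded the per-level limit")
    for criterion, level in ballot.items():
        scores[criterion] += WEIGHT[level]

adopted = [criterion for criterion, _ in scores.most_common(4)]  # four criteria were adopted
print(adopted)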
Step Three: Generation of Candidate Focus Areas It is important to note that the nomination and voting on selection criteria were completed in the absence of any knowledge on the part of the stakeholders about what focus areas were being considered. This was an intentional effort to prevent stakeholders from using their votes
on selection criteria to promote the choice of a preferred focus area. Once the set of selection criteria had been adopted, candidate focus areas could be disclosed. The focus areas were generated from the results of the high-level analysis of the division's operations and represented areas in which the analysis revealed an opportunity to enhance the effectiveness or efficiency of the organization. Stakeholders were allowed to supplement this list of candidate focus areas, but no such suggestions were received. A few of the candidate focus areas are summarized in Table 2.
Table 2. Sample focus areas
Step Four: Rating of Candidate Focus Areas For each of the four previously-adopted selection criteria, a five-point rating scale was adopted. Guided by the results of the high-level analysis, the analysis team rated each candidate focus area on each of the four selection criteria. These ratings were reviewed by the core team.
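Step four can likewise be sketched as a small aggregation. The fragment below assumes, purely for illustration, that the four criterion ratings are summed to rank the candidates; the focus-area and criterion names and the ratings are placeholders rather than the project's data.

# Hypothetical ratings: focus area -> {criterion: rating on the five-point scale}.
ratings = {
    "focus_area_1": {"criterion_A": 4, "criterion_B": 5, "criterion_C": 3, "criterion_D": 4},
    "focus_area_2": {"criterion_A": 2, "criterion_B": 3, "criterion_C": 4, "criterion_D": 3},
}

def overall(rating_profile):
    """Combine the four criterion ratings into a single score (here: a plain sum)."""
    return sum(rating_profile.values())

ranked = sorted(ratings, key=lambda area: overall(ratings[area]), reverse=True)
print(ranked)  # highest-rated candidate first, as reviewed by the core team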
Step Five: Selection of Focus Area(s) Given that a process for selecting focus areas had been agreed to in advance and that the selection criteria had been selected by the stakeholders, it would have been possible at this point to tally the ratings for the candidate focus areas and select the one(s) with the highest overall rating. However it was decided that more would be gained through involving the stakeholders once again in the selection process, particularly if the options could be presented to the stakeholders in a manner that would encourage the use of the pre-established process. By this time all stakeholders had received substantial information, in the form of interim reports and presentations, about the major results of the high-level analysis. They had available to them supporting information about each of the candidate focus areas, including data to indicate the magnitude and effects of each problem area. Core team members were sent packages that summarized what was known about the candidate focus areas. Specifically the key findings of the analysis were reviewed along with a table that showed the relationship between these findings and the candidate focus areas. Core team members were asked individually to select one or more focus areas based upon this input. This was followed by a
meeting of the core team and the analysis team where their selections were reviewed and a final determination of a focus area was made.
Results This process resulted in the selection of a focus area that met a pre-established set of selection criteria.2 This focus area was also one that had been identified as a source of inefficiency by the operations analysis and one for which opportunity for significant change had been indicated. Although one of the core team members was less pleased with the outcome of the selection process than the others,3 there was unanimous agreement that the process was fair and objective. The division manager was particularly supportive of the process and often reminded other core stakeholders that they had agreed to the process in advance. In the end, all stakeholders supported the decision, and planning to perform the work involved with the selected focus area proceeded readily.
Conclusions The goal of the process described here was to select an opportunity for enacting organizational change that would solve a real problem and that would be supported by a significant portion of the affected organization. Achieving this objective required a delicate balance of sensitivity to the conflicting preferences of the stakeholders and reliance on objective indicators of true organizational problems. The process contained three major elements that helped reach this goal.
1. The process endeavored to achieve as much involvement by the stakeholders as possible, as it is well-known that people who are involved in a process are more likely to be supportive of the outcome of the process. This was accomplished by assigning roles with specific responsibilities to members of the organization and by soliciting input across the organization at each step in the process.
2. The process for selecting a focus area was defined and agreed to by the stakeholders in advance of any discussion about content. It was assumed that once discussions about content had begun, stakeholders would find it difficult to engage in process negotiations in an unbiased manner. As a specific example of this principle, the stakeholders were asked to agree to criteria that would be used to select a focus area before the candidate focus areas were revealed.
3. Even though care had been taken to achieve a priori consensus on a process for selecting focus areas, it was assumed that people with strong interest in the outcome of the process would find it difficult to apply the process objectively and consistently. Consequently, results were packaged in a way that made it difficult for analysis results to be ignored and encouraged conformance to the established process. In fact, the candidate focus area that was ranked highest on the selection criteria was the one that was ultimately chosen for more detailed analysis.
2 Although the process permitted the final selection to consist of more than one focus area, due primarily to resource limitations and the complexity of the highest-ranked candidate, only one focus area was selected.
3 As one might expect, this core team member had rated the selected focus area as a relatively low priority.
PARTICIPATORY ERGONOMICS IN THE CONSTRUCTION INDUSTRY
A.M. de Jong (A), P. Vink (B), W.F. Schaefer (C)
(A) Delft University of Technology, Faculty of Civil Engineering, Dept. of Building Technology and Building Processes, P.O. Box 5048, 2600 GA Delft, The Netherlands, E-mail [email protected], Facsimile +31 15 2784333
(B) NIA TNO, P.O. Box 75665, 1070 AR Amsterdam, The Netherlands
(C) Eindhoven University of Technology, Faculty of Architecture, Building and Planning, Department of Production and Construction, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
Work in the construction industry often results in high physical strain on the worker, which may be reduced by introducing technological innovations at construction sites. However, many innovations are not implemented at the sites in the Netherlands. Participatory ergonomics is a method to improve the implementation of innovations by involving the target group during development. This paper reports on an evaluation study of two development processes of innovations for construction sites. The first project, for painters, was initiated to explore working methods that reduce physical strain and involved fewer workers as participants than the second project. The second project, for installation workers, was initiated by a company and aimed at developing tools to support workers without changing the working methods. Although the projects differ in participatory approach, the goals of both projects have been achieved.
Introduction The method of participatory ergonomics was defined by Noro and Imada (1991); it involves employees as sources for problem solving and for improving product quality by stimulating their desire to improve organizational effectiveness and quality of work. Three advantages of this method can be distinguished: (1) integration of human factors at different organizational departments, (2) efficient use of sources of information and (3) using opinions of employees to improve quality of work. At a recent conference the necessity of input from workers was emphasized by different authors (e.g. Jensen, 1997; Landau and Wakula, 1997; Moir, Buchholz and Garrett, 1997). In this paper, projects will be discussed which were carried out with the method of participatory ergonomics. In a previous conference paper (De Jong, 1997) the method of participatory ergonomics of NIA TNO, the Dutch organization for labour issues, was analyzed. Only a brief overview of the method will be given here. The projects follow the six steps of the
method as shown in Table 1. The analysis indicated that for each step of the process different participatory approaches and groups must be chosen. Also, in some steps participation is necessary to a greater extent than in other steps.
Table 1. Step by step approach (Vink et al., 1995)
This paper aims to analyze the goals of two projects in relation to the participatory approach. In 1997 two projects were carried out in the Netherlands with the method, by different project managers. The first was carried out in a large installation company and the second was done in cooperation with two middle-sized painting companies. The projects cannot be compared with each other on the outcome of the process, since they were performed under different circumstances. However, the different goals of the projects can be compared with each other in relation to the participatory approach and the methods and techniques that were used in the process.
Project 1: painters The project was at first aimed at developing devices to support the painter during standing activities. However, in the participatory process the problems concerning working conditions turned out to be more diverse. Therefore, the scope was broadened and solutions were proposed for a number of important problems of the work, such as reaching and repetitive movements. The process will be described using the step by step approach shown in Table 1. In each step the participatory approach will be outlined.
Step 1 The project was funded by the painters' trade organization and was carried out by NIA TNO and two painting companies. However, the companies were not involved at the initiation of the project, but only from the beginning of step 2. Four experts of NIA TNO suggested the participatory approach and handled the organization and communication of the project. The project proposal was written by this group and sent to the trade organization for financial support. After approval, the two companies were asked to join the project group.
Step 2 Video recordings were made at locations of the two painting companies. These showed that painters do not only paint; they also have to build scaffolding, transport materials and equipment, and prepare the surfaces for painting. The problems did not only occur during painting but also, and even more so, during the other activities. The painters were asked informally about their body discomfort and the activities in which it occurred.
Step 3 A solution session was prepared and organized by students as a special course. Two executives of one company and several experts were present to explain the activities of the painters. A special technique to improve creativity during such a session was used by a student chairman to generate as many ideas and solutions as possible. This produced a lot of ideas for solving the different problems, such as the repetitive movements, the reaching and the painting job itself. The ideas were not worked out further than drawings and texts.
Step 4 The ideas were first categorized by type of problem and prioritized by the three project members of NIA TNO. Fifteen solutions remained, which were drawn and put in a booklet with grading forms. The solutions concerned alternatives for scaffolding, alternatives for painting and preparing activities, and alternatives for supporting devices. These booklets were sent by mail to one company and handed over with an explanation to the other. The solutions in the mailed booklets were given much lower grades than those in the booklets that were handed over in person. However, both companies graded the solutions that would eliminate the painting job very low, since they considered painting to be the principal job they do. Steps 5 and 6 have not yet been carried out; the financial support only extended to the first four steps of the project. Next year financial support for the remaining part of the process will be requested from the trade organization, and the two companies both indicated they would like to cooperate.
Project 2: installation workers This development project is part of a larger work-improvement process. This part of the project was aimed at developing devices to reduce physical strain for three important problems: static load, kneeling and manual transport. The process will be described as in the previous paragraph.
Step 1 This project was a follow-up of a previous project aimed at improving company culture towards working conditions. The installation company gave NIA TNO the assignment to start a development project for mechanical devices to reduce physical strain for three problems: manual transport, kneeling and static load. The company had already installed a special committee for working conditions, which was joined by the project manager of NIA TNO. In every step several experts were asked to contribute to the project.
Step 2 The physical strain for the problems mentioned previously was analyzed with questionnaires, which were administered by safety and health coordinators. These questionnaires asked the workers to point out problem activities and to indicate the number of times they are performed. This was a good method to get more information about the circumstances in which the problems occur. The data were set out in graphics to make them comprehensible.
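This kind of questionnaire analysis can be illustrated with a simple tally of reported problem activities, as sketched below. The activities and counts are hypothetical and are not the project's data.

```python
# Illustrative tally of questionnaire responses per problem activity.
# Activities and reported frequencies are hypothetical examples only.

from collections import Counter

# Each response: (activity, times performed per day as reported by one worker)
responses = [
    ("manual transport of pipes", 12),
    ("kneeling while fitting", 20),
    ("static load during overhead work", 8),
    ("manual transport of pipes", 15),
    ("kneeling while fitting", 18),
]

totals = Counter()
for activity, times_per_day in responses:
    totals[activity] += times_per_day

for activity, total in totals.most_common():
    print(f"{activity:35s} reported {total} times per day in total")
```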
Step 3 Again, a solution session was organized, but this time by NIA TNO in a round table meeting with workers, safety and health coordinators, chiefs and experts. An expert chaired the session
and started with some graphics of physical strain to show the importance of the problems. The participants were then asked to explain in which situations the problems occurred and what caused them; a better definition of the problem leads to better solutions. At first, solutions were drawn or written down individually; later the results were discussed in the group. A number of criteria were used in the discussion to prioritize the solutions, and the group made a final judgement based on the necessity of each solution. Among other things, the selected solutions concerned transport devices, fixing equipment and supporting devices.
Step 4 Afterwards the committee evaluated the solutions in terms of their effects. Some existing solutions were found in one department which could also be used elsewhere; these solutions have to be diffused into other departments of the company. Other solutions had to be either bought or developed in cooperation with production companies and NIA TNO.
Step 5 An information day, organized for all safety and health experts of the company, was intended to make them aware of the existence and availability of the solutions. Information was given on the safety and health regulations, physical strain and the policy of the company. Most important, however, was the presence of the prototypes of the solutions. Furthermore, a booklet containing all solutions and the production companies that sell the devices was handed out. The next step is to buy solutions and introduce them in the departments through the safety and health experts. The project is currently in this phase and will be evaluated by the author.
Differences in participation In this section some important differences in participation are briefly reviewed.
• Aimed at improving existing situations or creating new working methods. Since project 2 had to produce working results in the short term, the working methods themselves were not evaluated as a concept. The working methods of the painters in project 1, however, were discussed and evaluated. This resulted in very radical solutions, such as the elimination of the painting job.
• Determining problems with or without workers. Step 2 was carried out in project 1 without questionnaires to the painters, whereas project 2 did involve questionnaires. In project 1 the focus can therefore fall on problems which the painters themselves would not have prioritized.
• Developing solutions with or without workers and companies. Step 3 involved no painters and only a small number of installation workers. Only one painting company contributed to the solution session, whereas the installation company gave several people the chance to contribute, which resulted in interesting discussions between work floor and management.
• Feedback to workers. Project 1 gave more feedback to all workers in the company than project 2: the painters were informed of the solutions by grading them in step 4. Project 2 involved only a small number of workers and did not give feedback to other workers until step 5.
Discussion The two projects, which both used the participatory approach, had very different organizations and used very different methods and techniques. The context of a project often determines the type of approach that is chosen. If a company initiates the project, it is involved in the process to a great extent. If, on the other hand, the trade organization initiates the project, the developing company is in the center and more companies are involved, which means that they do not 'own' the solutions. Project 1 showed an interesting difference in the grades given by the two companies. In the first company no verbal explanation was given of the solutions and the grading method, which had its effect on the grades compared with the second company. This suggests that the outcome can differ because of the technique used. The use of participatory groups also differed to a great extent. In project 1 mainly students were involved in the solution session, whereas in project 2 only the company was involved. This may be the cause of the difference in the solutions: companies cannot forget about daily working methods as easily as outsiders can, and therefore generate less radical solutions.
Conclusion The projects discussed in this paper had very different approaches and used very different techniques. Both projects produced the requested outcome. Project 1 was more radical and changed working methods, whereas project 2 developed solutions to reduce physical workload within existing working methods. The participatory approach of project 1 was aimed at defining, in theory, the possibilities for reducing workload, whereas the approach of project 2 was aimed at finding, in the short term, practical solutions that actually worked. Project 1 therefore involved fewer workers and less management than project 2, because this was simply not necessary to obtain good results. The methods and techniques were also adjusted specifically to the goal of each project. Further research will analyze these and other projects with respect to implementation and adoption.
USER TRIAL OF A MANUAL HANDLING PROBLEM AND ITS “SOLUTION” D.Klein, W.S.Green and H.Kanis
Department of Product and System Ergonomics School of Industrial Design Engineering Delft University of Technology Jaffalaan 9, 2628 BX Delft, the Netherlands e-mail: [email protected]
A study (Klein, 1997) was conducted at a major beer company with the aim of solving some of the manual handling problems in the distribution of 50 L beer kegs to pubs. It started with a user trial of the so-called Keg Buggy, a device developed by the company to reduce the workload. During the user trial it turned out that the whole foundation for the development of the Keg Buggy was missing: the actual problems had never been analysed properly. This paper first shows how strategic design decisions were made from behind the desk at an early stage, only for it to become clear much later that the whole product idea was on the wrong track. Secondly, the project illustrates how a user trial can be used to establish the occurrence of manual handling problems and, especially, to determine the reasons for their occurrence.
Introduction Distribution system of kegs The main distribution units of beer for pubs are the steel 50 L beer kegs. They are transported to pubs in lorries, together with other goods such as liquors, wines and soft drinks. The lorry crews have all kinds of small equipment to help them get the goods from the lorries to the pubs. The goods sometimes have to go over thresholds, through narrow passages, upstairs to pubs on the first floor or down into cellars. The current Dutch guidelines concerning manual handling do not allow lifting more than 23 kg or pulling with more than 200 N; when postures are poor these limits are even lower (Kluver, 1992). Because of the weight of a full beer keg (66.5 kg), it was expected that these guidelines were frequently exceeded.
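As a rough illustration of why exceedance was expected, the keg weight can be compared directly with the guideline figures above. The sketch below simply restates that comparison; the 9.81 m/s² gravitational conversion is standard and nothing here is drawn from the study itself.

```python
# Quick comparison of full-keg handling against the Dutch guideline
# figures quoted above (limits are lower still when postures are poor).

KEG_MASS_KG = 66.5
LIFT_LIMIT_KG = 23.0
PULL_LIMIT_N = 200.0
G = 9.81  # gravitational acceleration, m/s^2

print(f"One-person lift: {KEG_MASS_KG} kg is "
      f"{KEG_MASS_KG / LIFT_LIMIT_KG:.1f}x the {LIFT_LIMIT_KG:.0f} kg limit")
print(f"Shared between two people: {KEG_MASS_KG / 2:.2f} kg each, still above the limit")
print(f"Keg weight expressed as a force: {KEG_MASS_KG * G:.0f} N "
      f"(pulling limit is {PULL_LIMIT_N:.0f} N)")
```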
Earlier research on the manual handling problems As early as 1985 the beer company carried out an analysis of the manual handling problems during the distribution of beer kegs (Snijders, 1985). The resolution of that research was low: it only
measured the amount of bending needed while distributing the beer kegs. It showed that a lot of bending occurred, but not why or when. In 1992 a second project was conducted, analysing which handling methods were causing problems (Kluver, 1992). It still didn’t produce answers about why the workers used the heavy and demanding handling methods.
Developing a product to reduce the workload In 1995, a student in mechanical engineering proposed to redesign a product used for carrying bricks so that it would solve the problems with transporting kegs. The management, eager to get results, accepted the proposal. The student did exactly what was agreed upon: develop a vehicle to handle beer kegs. It was meant to be suitable for taking kegs from pallets or from the top racks of wheeling containers, for transporting kegs from lorry to pub, and, especially, for lowering kegs into cellars.
Figure 1: The original product for carrying bricks (a), the first prototype (b) and the Keg Buggy (c) as tested at the actual delivery situations by the lorry crews.
The final product has an electric winch, a battery and a special hook for grabbing kegs by the handles or from the side. The design was prototyped and analysed in "laboratory conditions": the actual lorry crews were not involved, nor was the product tested at the actual surroundings in and around pubs. From this analysis it was concluded that the design of the buggy should be optimised before it was ready for testing in the actual surroundings. The construction firm that developed a new prototype still had no analysis of the situations at pubs or the current handling methods. In 1997 the product was finally developed sufficiently for the company to consider it suitable for testing in a user trial.
User trial of the Keg Buggy and current methods Lay-out of user trial The user trial was conducted with the actual lorry crews. They were asked to first show how they would normally handle the kegs at each delivery situation, before testing the Keg Buggy. In this way the nature and size of the manual handling problem that was actually solved could be estimated. The user trials took place across the Netherlands, to even out the inter-regional
differences between building styles of pubs, delivery demands and styles of handling. The buggy was tested and compared with the normal handling methods on comprehensibility, effectiveness in reducing the workload, time efficiency and safety. The methods used were video recording and analysis, interviews, questionnaires and personal experience of the work situation by the designer.
The Keg Buggy as an all-round keg-handling device It was soon evident that the buggy was completely useless as an all-round vehicle for handling kegs. For instance, for transporting kegs over the ground, the buggy wasn’t competitive with the currently used hand trucks that are smaller, lighter, very tough and can carry two kegs at a time. The buggy was also meant to solve the problems with taking kegs from the top racks of wheeling containers. Normally a worker would pull a keg out by hand and use too much force while guiding it to the floor. However, the worker already has the alternative of using a drop cushion to break the fall. That method is much more efficient and practical than any tackling device can ever be, but still the workers never use it. The reason is that the workers don’t consider lowering a keg to be a problem and also find getting and placing a drop cushion too much of a hassle. As the relatively small and light drop cushion is already too much of a bother to use, the chances of the heavy and slow Keg Buggy ever being used are zero. The choice was soon made to focus on the most specific and typical function of the Keg Buggy: lowering kegs into cellars.
Figure 2: Some of the functions of the Keg Buggy. a) Taking a keg from the top rack of a wheeling container. b) Lowering a vertical keg straight into a cellar over a slide. c) Lowering a horizontal keg into a cellar in two phases.
Lowering kegs into cellars the normal way The workers first showed how they would normally lower kegs into each cellar before trying out the Keg Buggy. Surprisingly, it turned out that in most cellar situations there was no manual handling problem at all, or only a small one. In the easiest cellar situations neither worker needs to lift the keg at all: the top man can simply roll the keg through the hatch and let it fall down onto a drop cushion, or use other methods that are no problem according to the guidelines concerning manual handling. At other times, the workers would need to lift or swing the keg from the ground for part of a second, for instance to avoid the falling keg damaging some pipelines directly under the hatch, to place the keg onto a slide, or to make sure that a drop cushion is hit in the centre.
Figure 3: a) Example of a very easy cellar situation: simply roll the keg in without any lifting. b) Because the top man has to carefully aim the keg on the centre of a drop cushion, he needs to swing it away from the edge. c) Walking down with a keg step by step. d) An awkward combination of obstacles under the hatch and a deep cellar make considerable force and bad posture unavoidable while lowering a keg.
On only a few occasions are the cellar situations so awkward that they cause considerable manual handling problems. At one of the situations encountered there was a stairway with vulnerable marble steps, so each keg had to be carried down. A slide over the steps, as seen at other delivery situations, would have solved this problem and been more efficient at the same time. The Keg Buggy, however, formed no solution at all. Other serious manual handling problems occurred because of awkward combinations of small hatches and obstacles under the hatch. The conclusion of this analysis was that lowering of kegs into cellars itself doesn't cause the manual handling problems, but awkward cellar entrances do. All the methods currently used for lowering kegs are very efficient. It takes about 20 seconds cycle time to lower a keg step by step over a stairway and situations with slides and drop cushions can be twice as fast. Speed is very important to the company, but also to the workers: they prefer to work fast and save time to drink coffee, to talk or to be back home earlier.
Using the Keg Buggy for lowering kegs Besides the fact that lowering kegs into cellars isn’t as hard as presumed, the Keg Buggy is unacceptable as an alternative. Even if all the design mistakes of the prototype were solved, the whole concept of lowering kegs with a heavy motorised tackling device is much too slow: lowering the hook system, attaching it to the keg, taking the buggy to the hatch, positioning it, lowering the keg to the bottom, unhooking the hook system, hoisting up the empty hook and taking the buggy away for the next keg took at least a minute at the trials, even if nothing went wrong. This is completely unacceptable to the workers, who are used to doing the same work in a few seconds.
Discarding the concept of the Keg Buggy The management board, which had issued the development assignment over two years earlier and had been debating what to do about the manual handling problems for 12 years, was shown the results of both the analysis of the current distribution methods and the user trial with the Keg Buggy. They were quickly convinced to discard the whole concept of lowering kegs into cellars with something like a Keg Buggy.
Continuation of the project This paper deals with the first stage of a significant trial, design and development process, the later stages of which will be reported in detail separately. A thorough analysis of the manual handling problems and, more importantly, the reasons for their occurrence, was made to determine which types of keg handling overstepped the guidelines and why. This is the link which had been missing in earlier studies. Now ideas could be generated which would solve the real problems. Several ideas for potential solutions were generated and connected with the various manual handling problems. Their feasibility was estimated, for instance by conducting simple user trials at an early stage. It was concluded that the distribution of 50 L kegs could be made to conform completely to the guidelines, provided that the beer company would in future have some minimal requirements for delivery situations to rule out the most awkward and infrequently occurring problems. The most promising problem-solution combination was chosen to be developed first. The information regarding the delivery situations and also the mentality and attitude of the workers, as gathered during the user trials, was integrated into the design process.
Discussion As demonstrated, the choices that are made at an early development stage can carry a project in the wrong direction for years. In this case, and with the very best of motivation, it was decided what type of solution was going to improve the working conditions before gaining the necessary insight into the real, as opposed to perceived, problems. By conducting a user trial of the distribution methods at an early stage, the Keg Buggy could have been eliminated as a solution, better ideas generated and considerable time and money saved. User trials are often seen only as tools to seek out and eliminate design mistakes in prototypes or even end products (Roozenburg and Eekels, 1991), but trials can also be used to evaluate functions before any product is developed. User trials can be a source of inspiration for the generation of new product ideas (Kanis and Green, 1996) in addition to helping evaluate their feasibility.
References Kanis, H., Green, W.S. 1996, Deel III; gebruik, cognitie en veiligheid, (Technische Universiteit Delft, Delft) Klein, D. 1997, The manual handling of kegs. Graduation report. (Technische Universiteit Delft, Delft) Kluver, B.D.R., Riel, M.P.J.M. van, Snijders, C.J. 1992, Toetsing van de fysieke belasting bij de distributie van bierfusten aan de richtlijnen en normen van de EG aan de Arbowet, (Erasmus Universiteit Rotterdam, Rotterdam) Roozenburg, N.F.M. and Eekels, J. 1991, Produktontwerpen: Structuur en methoden, (Technische Universiteit Delft, Delft) Snijders, C.J. 1985, Analyse van de fysieke belasting tijdens de distributie van fusten, (Erasmus Universiteit Rotterdam, Rotterdam)
INDUSTRIAL APPLICATIONS
Case Study: A Human Factors Safety Assessment Of A Heavy Lift Operation W Ian Hamilton 1 & Phil Charles2
1Human Engineering Limited, Shore House, 68 Westbury Hill, Westbury-on-Trym, Bristol, BS9 3AA
2Amec Process & Energy, Unit 2 Altec Centre Minto Drive, Altens Industrial Estate, Aberdeen, AB12 3LW
This paper presents the results of a human factors analysis of an offshore heavy lift operation. The lift team comprised 18 roles, which were all analysed using HTA and timeline analysis techniques. The data were then subjected to human HAZOP and communications analysis procedures, which revealed certain operational vulnerabilities. A full set of control measures was specified to manage these risks. The operation was then performed successfully. The work serves to illustrate the value which the timely intervention of human factors assessment can bring to a major engineering project.
Introduction Part of the development of Marathon Oil (UK)'s Brae B platform to accommodate the Kingfisher field required the installation of new separator and pipework modules. Although these modules weighed 220 tonnes in total, it was determined that they could be transported to the field on a normal supply vessel and hoisted onto the platform using the drilling draw works and a cantilevered lifting frame. This was a radical alternative to the traditional method of using a very expensive heavy lift barge. The operation would involve a large number of personnel on board the platform, the supply vessel and the standby vessel. From the outset it was recognised that effective command and control would be critical to the safety of the operation. Furthermore, this was an activity which was outwith the normal experience of all the participants, as it was the first time that this technique had been used in the North Sea. To address this, the prime contractor commissioned a human factors analysis of the lift operation to highlight any potential human operability hazards and to identify appropriate risk control measures.
Task Analysis Team Structure A hierarchical task analysis of the lift operation was performed down to the function level using the ATLAS tool (Human Engineering, 1996), based on a review of the lift plan
documentation which had been prepared by the customer. This served to identify the organisation of roles and responsibilities within the operation. Essentially, the lift team comprised 18 roles as illustrated in Figure 1. In the interests of brevity only the lead roles (identified by the shaded boxes) are described here. These descriptions are presented in Table 1.
Figure 1. The organisation of the lift team
Table 1. Lead roles within the lift team
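The hierarchical task analysis referred to above can be pictured as a nested structure of tasks, each with a responsible role, in the spirit of Figure 1 and Table 1. The sketch below is a hypothetical fragment only: the task names and decomposition are invented for illustration, and the study's full analysis was held in the ATLAS tool.

```python
# Illustrative fragment of a hierarchical task analysis held as nested records.
# Task names, decomposition and role assignments are hypothetical examples.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    role: str = ""               # lead role responsible for the task
    subtasks: List["Task"] = field(default_factory=list)

lift_operation = Task("Perform module lift", subtasks=[
    Task("Position supply vessel", role="Captain"),
    Task("Attach rigging to module", role="Rigging crew", subtasks=[
        Task("Check colour-coded attachment points", role="Rigging crew"),
    ]),
    Task("Hoist module with draw works", role="Driller"),
])

def print_hierarchy(task: Task, depth: int = 0) -> None:
    """Walk the hierarchy and print each task with its responsible role."""
    print("  " * depth + task.name + (f"  [{task.role}]" if task.role else ""))
    for sub in task.subtasks:
        print_hierarchy(sub, depth + 1)

print_hierarchy(lift_operation)
```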
Site Visit The analyst visited the offshore installation to perform a site inspection and to interview key personnel who would be involved in the lift operation. As a result of this data collection activity, the analysis data were verified and extended. Following the revision a full timeline model of the lift operation was developed. This not only captured the order of activities for each member of the lift team, but also revealed the dependencies between activities. This defined the lift operation as comprising eleven stages. At this point the analysis was also taken down another level of detail. This level defined the co-ordination activities,
responsibilities, equipment used, and training needs for each functional activity. From this it was possible to map out the transfer of responsibilities between lead roles through each stage of the operation.
Human HAZOP Analysis The operational sequence of activities, represented in the timelines, was subjected to a human hazard and operability (HAZOP) analysis. This is a formal methodology which is similar to the standard engineering HAZOP procedure (Kirwan, 1994, pp 95–99), but which makes use of guide words that are more appropriate for the identification of human errors. In this case, the guide words were organised in the form of a systems ergonomics checklist. This checklist was applied to each activity and the results were recorded in a set of standard data fields within ATLAS. The critical information defined for each activity included the following: Operator (critical role), Command/Initiation, Action, Equipment used, Response, Error/Hazard, Type of error, Consequence, etc. This process revealed the major hazards and their consequences associated with each activity. It also classified the nature and cause of each hazard.
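A minimal sketch of how such a record might be held per activity is shown below. The field names follow those listed above; the example values are hypothetical and are not taken from the study's ATLAS data.

```python
# Illustrative record structure for one activity in the human HAZOP analysis.
# Field names follow the text; the example values are hypothetical.

from dataclasses import dataclass

@dataclass
class HazopRecord:
    operator: str          # critical role
    command_initiation: str
    action: str
    equipment_used: str
    response: str
    error_hazard: str
    error_type: str
    consequence: str

example = HazopRecord(
    operator="Driller",
    command_initiation="Radio command from lift co-ordinator",
    action="Stop draw works at designated point",
    equipment_used="Draw works, radio",
    response="Hoisting stops with load at correct height",
    error_hazard="Command not heard; draw works not stopped",
    error_type="Communication failure",
    consequence="Load over-travels; risk of collision with structure",
)

print(example)
```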
Communications Analysis As expected, most of the hazards arose through potential co-ordination errors. As a result, a full communications analysis was also performed. This included an examination of all of the communication facilities available on the Brae B. In addition, the analysis developed a command and co-ordination sequence which represented the commands which either initiate or terminate the co-ordinated actions.
Principal Hazards & Risks The HAZOP analysis revealed a number of human operability hazards and vulnerabilities. Some of these are summarised in Table 2.
Figure 2. A photograph of the lift operation in progress
Table 2. Principal human operability hazards and risks
Control Measures The following section describes some of the risk control strategies which were adopted to minimise the risks which had been identified.
Briefing Packs To combat the concerns over team competence the training needs for all participants were specified in detail. Individual briefing packs were prepared for every member of the lift team. These complemented the verbal briefings and hands-on training which the team members received. Each contained a full description of the operation, roles and responsibilities, communication and safety rules, and a personal checklist of actions. The safety rules also ensured proper response to platform alarms. The risk of complacency at the second lift was controlled by having a comprehensive debrief following the first lift. The second lift then occurred after a suitable rest break. Also,
checklists were prepared for each lift and signed off by the required individuals prior to each lift being authorised to proceed.
Sterile Area To control the risk to spectators, the lift operation was performed within a sterile area into which access was restricted to essential personnel, and only then under the authority of the lift co-ordinator. This rule was enforced emphatically in the vicinity of the moving loads.
Ambiguous Communication Where the communications analysis revealed gaps in the co-ordination sequence, appropriate remedial measures were recommended. Also a comprehensive communication protocol was specified to ensure the use of only reserved and unambiguous language. Similarly, where the analysis revealed a special vulnerability to communications failure, back-up non-verbal methods were recommended. Co-ordination was further enhanced by the introduction of a GO/HALT procedural check following every stage of the operation.
Positioning Of The Boat The Captain was aided in the accurate positioning of his vessel by the use of a visual marker system for N-S positioning, and by instructions relayed from an observer located on the lower deck of the platform, who was able to check W-E positioning.
Handling Of Rigging Strict procedures were developed for the management of the rigging gear on the lift modules. The same rigging was used onshore and offshore, and the boat based rigging crew attended the onshore loadout to ensure that they were intimately familiar with the gear and procedures. In addition, the rigging gear was colour coded to ensure that it was only attached at its designated points.
Failure To Stop Draw Works To ensure that the Driller would know when to stop the draw works, even in the event of a radio failure, the lift ropes were marked with coloured bands to designate the various stop points for each stage of the lift.
Conclusions The lift operation took place successfully and without incident in June 1997. The operation is depicted in Figure 2. This case study serves to illustrate the value which can be added to the planning of a major engineering operation by the timely application of human factors analysis techniques. In addition, the work illustrates how a wide range of outputs and specifications can be derived from the analysis data; thus demonstrating that such intervention can also be highly cost effective.
References Human Engineering Limited, 1996, ATLAS—A Practical Task Analysis System. (Software Created For The Apple Macintosh, Version 1.1K). Kirwan, B. 1994, A Guide to Practical Human Reliability Assessment (Taylor & Francis, London)
THE APPLICATION OF ERGONOMICS TO VOLUME HIGH QUALITY SHEET PRINTING AND FINISHING Mic L.Porter
University of Northumbria at Newcastle, Newcastle-upon-Tyne, NE1 8ST (0191) 227 3155 [email protected]
Four projects undertaken in commercial printing companies have highlighted many areas of concern to an ergonomist. The companies involved were all "sheet printers" producing high quality, specialist, cut items that were then shrink wrapped and despatched to the customer. The high speed presses used were capable of sophisticated and ultra high quality printing but could not be set up, maintained or cleaned in an ergonomically acceptable manner. The printed work was then inspected, "jogged" and guillotined. The cut sets, perhaps held with a rubber band, were then "fan" inspected before wrapping and despatch. In all of these "finishing" tasks poor ergonomics was identified, remedial actions found and implementation started. Although desirable, major modification to the presses was not possible; nevertheless, the application of ergonomics was found to be justifiably beneficial to the organisations.
Introduction In 1491, possibly the year of his death, William Caxton published "Journals of Health". In the subsequent 500 years of sheet printing the fundamental intention of applying ink to paper in precisely defined locations so that the highest quality reproduction occurs has changed little. Indeed, small "Private Presses" still exist, using equipment very much like that of Caxton, although metal has largely substituted for wood. This is not, however, the case for the plants in which these ergonomic audits were undertaken. Single colour, single side printing at rates of one or two sheets per minute has now become 220 (or more) sheets per minute, printed in several colours and possibly on both sides simultaneously. However, the printer and all those who support them still focus on the quality of the image produced, to the extent that virtually everything is subservient to the interlinked twin goals of quality and speed. This is particularly true for the presses themselves, where the centre of attention is the paper path and not operator ergonomics. Each of the main stages of the process will now be discussed, although in some cases precise details cannot be given because of commercial and other confidentiality constraints.
Observations from four Printing Plants
Printing The presses generally operated at between 6000 and 10000 sheets per hour, although greater rates were possible. The largest sheet size that could be handled was 720mm×1020mm, while 970mm×700mm and 810mm×650mm were more typical. The "weight" of the papers ranged from under 80gm per square metre (gsm) to over 110gsm. Thus, when the press is running it can require between 250kg and 810kg of paper per hour and, due to the extra weight of the applied ink and varnish, 260kg to 845kg to be removed.
The wooden pallets upon which the paper was supplied (in 25 ream, 12500 sheet loads) could vary from 5.38kg to 7.32kg (mean of 6.19kg for 5) in one plant to a maximum of 12.13kg in another. The chipboard protection for the top of the paper (c1550mm high) typically weighed between 1.6kg and 1.8kg. The wooden pallets and protection were, when not in use, manually handled and often stacked above shoulder height in order to minimise the floor area required and the congestion caused near to the presses. In one of the plants "continuous running bars" were used to support the paper while another stack was loaded. These steel rods weighed about 1kg each and had to be inserted among, and removed from, the paper stack at about shoulder height.
Powered lift trucks were used for the carriage of paper from store to press and from the press to the temporary storage/drying area, but hand pallet trucks were used for transport close to the presses. The force required to start to move these hand pallet trucks was often found to be in excess of 250N, and between 100N and 200N was typically required to keep the load moving. When in the press a similar force was applied to the pallet, with a foot, to hold it "hard home" while the truck was removed.
During normal running the printers must also transport ink and varnishes. In the case of the smaller print runs these could come in plastic kegs of 2.5kg, 5kg and 10kg nominal weight that would require lifting onto or into the press. In one case heavier ink kegs (>25kg) were used from which the ink could be pumped; while these did not require lifting, they often required dragging into position and large forces were then applied to transfer the pump from the empty to the full keg. The force required for these operations could exceed 250N, the capacity of the measuring equipment available. Another large, but less obvious, musculoskeletal hazard associated with the handling of ink is the use of the spatula to transfer ink from the tub to the rollers in the ink reservoir. The viscosity of the ink would vary with the precise specification, colour (in one plant red was often found to be stiffer than blue) and, obviously, the temperature. In order to minimise the viscosity the kegs were often kept in a warm water bath or balanced on (hot) electric motors.
When running normally the noise of the presses was at, or above, "the second action level" (90dB(A)) (SI 1989:1790). Chemical hazards were also present from contact with the inks and varnishes and their fumes/vapours. In one plant ultraviolet (UV) cured varnish was used, requiring hazardous high intensity UV lighting sources (PIAC 1993).
When long jobs are running on a press it will typically be stopped for routine cleaning and maintenance during every shift and also in response to changes in print quality. The printing plates, and possibly "blankets" and "wipers", will also need changing between jobs, together with a full clean of the ink reservoir and ducts.
These operations involve heavy manual work, often with strong solvents, and are generally undertaken in congested areas where the postures that might be adopted are constrained. A "short" run aluminium plate for a "litho" press might weigh only 0.63kg but one "grown" for a large intaglio press might be 5.65kg. The various
rollers and wipers might weigh between 10kg and 32kg, while blanket cleaners might weigh 30kg, and all will require routine exchange. In all cases the ability to adopt desirable postures and to use mechanical devices to lift and manoeuvre these heavy loads is severely limited.
The press operators were an ageing, all male, close-knit group who, although confident in their work and their ability to produce the highest quality work that their machines were capable of, often wished to move onto fixed shift patterns. This is not generally an option, given that the capital investment involved in a large press is readily justified only by continuous operation. In one plant a group of five printers all reported suffering pain or discomfort as a result of the work; in one case their doctor had diagnosed "Tennis Elbow" and in another a non-specific wrist sprain. The printers also reported several traumatic injuries, often associated with slips, trips and falls, that had occurred to others. It would not be possible to greatly improve the fundamental ergonomics of the presses without a complete re-design, major expenditure and a long period of time.
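The paper consumption figures quoted at the start of this section can be cross-checked from the sheet rates, sizes and grammages given above. The sketch below is a rough order-of-magnitude check using those published ranges; it is not part of the original audit data.

```python
# Rough cross-check of the hourly paper mass fed to a press, using
# rate x sheet area x grammage, with figures from the ranges above.

def paper_mass_kg_per_hour(sheets_per_hour, sheet_w_mm, sheet_h_mm, gsm):
    """Mass of paper fed to the press per hour, in kilograms."""
    area_m2 = (sheet_w_mm / 1000.0) * (sheet_h_mm / 1000.0)
    return sheets_per_hour * area_m2 * gsm / 1000.0

# Lower bound: slower run, a typical smaller sheet, light paper.
low = paper_mass_kg_per_hour(6000, 810, 650, 80)      # ~253 kg/hour
# Upper bound: fast run, the largest sheet, heavy paper.
high = paper_mass_kg_per_hour(10000, 720, 1020, 110)  # ~808 kg/hour

print(f"Approximately {low:.0f} to {high:.0f} kg of paper per hour")
```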
Sheeting All of the plants undertook some inspection before the sheets were cut. In the case of the highest quality products every sheet was checked by an inspector on both sides, and the edges and corners were also "flicked". This work was generally undertaken by female staff at sloping benches, either standing or sitting on high chairs. In the case of the large sheets (eg 810mm×650mm or more) sitting was not an option for most, as it made it impossible to view or handle the far edge. The rate of working depended upon many factors, including print quality, image complexity, page size and the extent to which one page "stuck" to the next. A typical workload would consist of 13–17 reams (500 sheets each) daily, implying a steady "pull-over" rate of over 2 sheets per second. Each ream would weigh between 14kg and 23kg and might be difficult to "roll over" or for the (male) porters to pick up and carry. In another plant, where sheet inspection was less commonly undertaken, the work was carried out standing and the sheets were transferred between two stacks kept at waist height by the use of pallet "scissor jacks".
In one case all 22 staff present were questioned. Ten reported some aches, discomfort or pain in the lower back, neck and shoulder that they associated with work, but none regularly took medication to subdue or "cure" the pain. One person had sought medical treatment for "wrist sprain" from the company nurse. In another plant 12 staff were questioned and 8 reported aches, discomfort or pain; one was taking "over-the-counter" analgesics and two, with wrist pains, had sought treatment from either their own or the company doctor. Injuries reported also included small cuts from the paper and dermatological reactions to the inks and varnish; the latter could be a particular problem if it remained undried/uncured by this stage.
Knocking-up/jogging At several stages in the printing process there can be a need to "break up" the paper, aerate the stack and thus ensure the precise register of each sheet. In the case of the fastest presses this can be necessary before the paper goes in, but in most cases the task only occurs between the printing and finishing stages. The knocking-up can be entirely manual, where a wad of paper is "flicked", "shimmied" and the sheets separated. Alternatively, the wad may be "broken" and then loaded onto a "jogging table" that mechanically agitates the paper against an edge. In each case the wad of sheets handled at any one time will vary, but 25mm thickness and 10kg are typical. Greater loads may be handled by staff attempting to "work ahead" or when the sheet size is small.
In most cases the load handled is of concern, as not only is the paper difficult to keep together but it must be "flicked" in mid air, typically at about shoulder height, and transferred from one location to another. The workplace can be set up with the "jogging table" (c750mm high) and a supply pallet 1000mm apart, with the person standing between the two. Unless some form of pallet jack was used, the height from which the paper was transferred from the supply pallet would vary from above 1800mm for the first sheets to 150mm for the last. This is heavy work that was always found to be undertaken by males. In one plant only one of the nine people undertaking the work reported musculoskeletal pain (lumbar back) and one more refused to answer. There can also be dermatological risks with this work, especially if any uncured or undried ink/varnish is present, but, with one exception, it was not thought possible to do the work wearing gloves or barrier cream.
Guillotine Operator The guillotine operator will receive the paper "knocked-up" and in a standard quantity, usually 1000 sheets. In all cases the guillotines observed were pre-set to cut following the operator's orientation of the paper and command. In all cases the paper was pushed or pulled on low friction surfaces and the final "cut sets" pushed aside. In this aspect of the work smaller products not only required more cuts but also more manoeuvring and near-maximal reaches into the guillotine, and were thus more likely to lead to musculoskeletal injury than larger sets. The removal of the waste is also of concern: although yields of 95%+ would be designed for, these could, under some conditions, drop to 75%. Thus 1000 sheets could result in scrap weighing between 3kg and 10kg; this was often thrown, backhand, into high-sided skips. In another plant the scrap was loaded into plastic sacks which, when full, were carried to, and lifted into, a skip with sides 1650mm from the floor. In this plant the mean sack weight was found to be 12.3kg (10.5kg–14.6kg). The operators generally dealt with two sacks at a time, one in each hand. The guillotine operators observed were all male and had formed strong "buddy" groups. In one plant eight operators were questioned and two reported pain/discomfort in their back, neck and shoulder that they felt was "work related"; neither had sought medical attention about this matter. The work is, largely, machine paced.
Set Inspection The inspection of the cut sets can vary greatly depending upon the range of products produced. In one plant, for example, the sets are inspected by "flicking"/"fanning" to see if the image moved, altered in colour, etc., an operation that is repeated for each end/corner and, in some cases, for both sides. Smaller (eg 135mm×95mm) or near-square set sizes are, effectively, stiffer and the task more hazardous, especially if it is undertaken in free air rather than with the set supported by the worksurface. In another case only a cursory glance was given to sample sheets, from which the entire set was accepted or rejected. In either case this was a highly subjective inspection, with undefined decisions made as to what the customer would accept and what they would not. The distribution of products could vary greatly (Table 1) and many of the most common products were found to be ergonomically undesirable to work with. For example, sets of 320–370mm×140–195mm×70–80mm with weights between 2.97kg and 3.97kg were, musculoskeletally, difficult to work with but often found. The use of rubber bands to hold the sets together also created musculoskeletal hazards when they were either put on or removed; however, they were only used if the sets were handled away from "float" tables.
Table 1. Summary of cut set sizes (two ream) found in one plant
The set inspection work was undertaken by females and, generally, some degree of job rotation occurred. In some cases work was undertaken while sitting, with the product, in tote boxes, delivered and removed by conveyor belt. In other plants standing staff collected and despatched work, in sets, on "float" tables. In one plant, where this task was undertaken by standing staff, it was also rotated (within the shift) with the loading of pallets for despatch and with the shrink wrapping of the bundled product. In this plant twenty members of staff were questioned. Seven reported musculoskeletal pain or injury that they associated with the work and three had received medical treatment: carpal tunnel syndrome (2) and lateral epicondylitis with weakness of grip (1). Three of those who reported no discomfort noted that they had started on this work less than three months earlier.
Wrap and Despatch This task again required the handling of sets, possibly the frequent removal of rubber bands, and the presentation of the required bundles to banding or shrink wrapping machines. Again, large span grips could be required to hold the product during presentation and, possibly, during the loading of pallets or the transfer of bundles to storage. The loading of pallets, especially if they were not adjusted in height, presented musculoskeletal hazards, as well as the danger of tripping when walking among them. This work was undertaken in all plants by females, except for the reloading of the banding or shrink wrapping machines. Although infrequent, this could be a hazardous task: loads of 23.5kg could require handling, with constrained postures, into the machines. However, none of those undertaking the maintenance work reported any injury, nor could any records be found.
Conclusion Six of the jobs to be found in high quality sheet printing have been described and found to raise many areas that would benefit from the attention of an ergonomist. In those plants where this work has been undertaken, some actions have already been taken to improve the ergonomics of the task, workplace and conditions, especially in the finishing stages. Further changes and better risk monitoring are planned for the future. However, the presses themselves remain a major concern that cannot be tackled to any great extent by the end user. It is to be hoped that press manufacturers are at present addressing such issues, especially those associated with cleaning and "set-up", and that the next generation of presses will be much more ergonomically acceptable than those currently in use.
References Printing Industry Advisory Committee (1993), Safety in the use of inks, varnishes and lacquers cured by ultraviolet light or electron beam techniques, (HMSO, London) Statutory Instrument, Noise at Work Regulations 1989 (SI 1989:1790)
THE APPLICATION OF HUMAN FACTORS TOOLS AND TECHNIQUES TO THE SPECIFICATION OF AN OIL REFINERY PROCESS CONTROLLER ROLE Janette Edmonds1 and Chris Duggan2
1Human Engineering Limited, Shore House, 68 Westbury Hill, Westbury-on-Trym, Bristol, BS9 3AA
2BP Oil UK Limited, Coryton Refinery, The Manorway, Stanford-Le-Hope, Essex SS17 9LL
This paper discusses the use of human factors tools and techniques to solve real human issues in industry. The paper uses the example of a project involving the specification of an oil refinery process controller role. The investigation is used to illustrate the importance of effective and efficient use of human factors methods, to meet the usual industrial constraints of short timescales and limited budgets.
Introduction The aim of this paper is to discuss the application of human factors tools and techniques to support decisions concerning human related issues. To support the discussion, reference is made to an investigation which was undertaken on a BP oil refinery in Essex. The investigation focused on a proposal to introduce a new process controller role, and, in particular, the implications this would have on the workload for that role, the effect on team structure and function, and subsequent training requirements.
Background To The Study The investigation was part of a process of extensive technological and organisational change taking place on the oil refinery. The plant is divided into five areas: the Cracking Complex, Fuels, Lubes, Utilities and Product Movements. At the time of the investigation, these areas were controlled locally in five separate control rooms. The main change for the refinery was that these five control rooms were to be amalgamated into one centralised control building (CCB), remote from the plant. In addition, new digital control systems were being introduced to some areas which relied on manual systems, or a combination of manual and automatic systems. As a consequence of the move to the CCB, and the subsequent requirement for remote control, a new process controller (PC) role was proposed for one of the plant areas: product movements. Some concerns were expressed regarding whether the new PC role presented unacceptable levels of workload, and whether the remoteness of the PC from the rest of the product movements team had implications for the ability of the team to cope with new methods of working. Due to these concerns, Human Engineering undertook an investigation to:
• Analyse the workload for the proposed role
• Identify potential impacts on team structure and function
• Identify the training requirements for the team to support the changes
In addition to supporting specific decisions that would be made by the area management, the aim of this investigation was to support a consultation process with the work force. This was to ensure that the work force were involved in the development of their own job roles and to explore the issues that they would have extensive knowledge of as subject matter experts.
Human Factors Intervention Development Of A Job Specification The investigation began with an initial phase of consultation to develop a clear outline of the future job role for the PC working from the CCB. Data were collected separately from the area management and the nominated persons who would take responsibility for the new PC duties. The area management were asked to describe the duties that the PC would undertake, including proposed new tasks and any additional equipment that would be introduced. They were also asked to describe potential high workload scenarios. The nominated PCs were asked to describe the tasks currently undertaken during routine day and night shifts, potential ‘worst case scenarios’, where it was envisaged that the PC would be under high workload, and how they handle such situations. The plans for new equipment and new job tasks, which they were aware of, were also discussed. The PCs were asked to rate the workload for all future tasks, using a five point rating scale. This was used to anchor ratings between respondents. From the information gathered, a job specification was prepared, detailing the tasks that the PCs would undertake from the CCB.
Workload Analysis The data from the initial consultation were used to develop task models of the PC role, using a software task analysis tool (Human Engineering Limited, 1996). A series of baseline task models were developed to reflect busy routine day and night shifts, but without any specific incidents which would increase the workload significantly, referred to as 'routine' scenarios. The baseline task models were then developed further to reflect situations where particular types of upsets occurred at different times during the day and night shifts, referred to as 'high workload' scenarios. These included: emergencies on the product movements area, emergencies on the jetties, and emergencies occurring elsewhere on the refinery. The task models focused only on the role of the PC. The task models were verified by the area management and by the nominated PCs as being representative. Workload profiles were then calculated for the routine and high workload scenarios. The workload profiles were 'demand' based calculations which summed the workload demand ratings for each task in a given time frame. This was based on the D model (Aldrich et al, 1988), as follows:
D(t) = Σ_a A_at · d_at
where A_at = 1 if action a exists at event sequence number t and A_at = 0 otherwise; a denotes the actions, t is the event sequence number, and d_at is the demand for action a at event sequence number t.
The workload demand calculation samples the tasks within specified time intervals over the duration of the 24 hour shift period. For each time interval, the 'demand ratings' are summed. The workload profile then illustrates the summed demand ratings over a specified period, i.e. 24 hours.
It was found that, during a routine 'busy' day or night shift, the PC would be working at maximum capacity for less than 50% of the shift, and that there would be workload peaks above one person's capacity for less than 10% of the shift duration. Closer investigation of the tasks causing the workload peaks led to the conclusion that the workload problem could be eliminated by effective task rescheduling. In addition, the workload could be reduced further by acceptable temporary suspension of certain tasks, and through the area teams being aware of the demands being made on the PC at any given time.
During 'high' workload scenarios, however, it was found that the PC would be working at, or beyond, one person's capacity for between 29% and 55% of the duration of the upset, depending on the type of upset. It was therefore concluded that two people must be available to work the panel during upsets. However, during the initial stages of an upset it was assumed that support (from the Process Technician (PT)) would be unavailable, as the PT job was primarily designed to be outside on the plant. Although the equipment was designed to be 'fail safe', it was also recommended that the following provisions be made:
• To increase the availability of support, for example by having more than one team member and/or other CCB-located people trained to support the PC whilst the PT is unavailable
• To provide additional training to the PC and PT to improve their skills at diagnosis of upsets
• To provide additional training to the PC and PT to act quickly and efficiently during the initial few minutes of an upset
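As an illustration of the demand-based calculation described above, the sketch below sums hypothetical demand ratings for the tasks active within fixed sampling intervals. The task names, timings and ratings are invented for illustration; the study's ratings came from the PCs' five-point scale and the timings from the ATLAS task models.

```python
# Illustrative sketch of a demand-based workload profile.
# Tasks, times and demand ratings are hypothetical examples only.

from collections import defaultdict

# Each task: (name, start_minute, end_minute, demand rating on a 1-5 scale)
tasks = [
    ("monitor tank transfers", 0, 120, 2),
    ("line up jetty pumps", 30, 60, 4),
    ("respond to level alarm", 45, 55, 5),
]

INTERVAL = 15  # sampling interval in minutes

def workload_profile(tasks, shift_minutes=24 * 60, interval=INTERVAL):
    """Sum the demand ratings of all tasks active in each time interval."""
    profile = defaultdict(int)
    for name, start, end, demand in tasks:
        for t in range(0, shift_minutes, interval):
            # The task contributes its demand if it is active in this interval
            if start < t + interval and end > t:
                profile[t] += demand
    return profile

for t, demand in sorted(workload_profile(tasks).items()):
    if demand:
        print(f"{t:4d}-{t + INTERVAL:4d} min: summed demand {demand}")
```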
Consultation With The Area Teams The PC job specification and the results of the workload analysis were put forward for discussion with the area teams. Shift team members were asked to comment on the proposed organisation of the new team structure and the impact of the PC role on the team function. The proposed re-organisation of the team activities was based on the following:
• New plant area divisions for product movements (previously decided by shift teams)
• The reduced requirement for continuous supervision of ships on the jetties (due to the installation of surveillance equipment, electrically operated valves and ship position alarms)
Outside team members would retain 'control' of their areas, but the PC would operate some of the automatically controlled valves. The shift team members felt that they could support the PC by:
• Duplicating the surveillance control equipment for the jetties, so that the jetty technician could take responsibility away from the PC during peak workload situations
• Developing a working practice for ensuring that all team members regularly input relevant data to the computer system which records the latest tank levels and contents (as the PC would have greater reliance on this information)
• Maintaining good team communication via the radio network
It was generally felt that the workload of the outside team would increase, especially during upsets, as the PT would be required to support the PC in the CCB. The following recommendations were made to reduce the workload of the outside shift teams to enable them to cope with the changes:
• To improve co-ordination and mutual understanding with other departments within the refinery for more efficient problem solving
• To reduce unnecessary enquiries by providing an on-line information system which could be accessed by other refinery departments requiring information about the product movements plant activities
• To increase the reliability of specific plant equipment
• To develop working practices for product movements and clear roles and responsibilities for team members
• To develop protocols for emergency situations
Top level task analyses were undertaken for each of the new team roles. Using the task analyses, a comparison was made between old and new task knowledge/skill demands. A training requirements matrix was then developed, with support from the shift training technicians. The key training requirements were as follows:
• Diagnosis of upsets and emergency responses
• Cross area training within the product movements team
• Familiarisation with other refinery departments
• Additional safety and fire-fighting training
• Refresher training for outside team members to support their continued knowledge and understanding of the PC tasks
• Refresher training for PCs to support their continued knowledge and understanding of the plant tasks
It was recommended that the training be delivered prior to the move to the CCB, and that it be a mixture of formal, on-the-job, and simulation training.
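A training requirements matrix of this kind can be held as a simple mapping from roles to training topics, from which a role-by-topic view can be generated. The sketch below is illustrative only: the roles follow those named in the text, but the topic assignments are hypothetical rather than the matrix developed with the shift training technicians.

```python
# Illustrative training requirements matrix: roles mapped to training topics.
# Role names follow the text (PC = process controller, PT = process technician);
# the topic assignments themselves are hypothetical examples.

training_matrix = {
    "PC": ["Diagnosis of upsets and emergency responses",
           "Refresher training on outside plant tasks"],
    "PT": ["Diagnosis of upsets and emergency responses",
           "Refresher training on the PC (panel) tasks"],
    "Jetty technician": ["Cross area training within product movements",
                         "Additional safety and fire-fighting training"],
}

# Print a simple topic-by-role view of the matrix.
topics = sorted({topic for needs in training_matrix.values() for topic in needs})
for topic in topics:
    roles = [role for role, needs in training_matrix.items() if topic in needs]
    print(f"{topic}: {', '.join(roles)}")
```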
Recommendation It was concluded that the proposed PC role was viable, given the requirements for additional support discussed in the previous sections.
Discussion The aim of this paper is to illustrate the advantages of human factors being applied in a timely, efficient and highly structured manner. An investigation to specify an oil refinery process controller role was described to support this aim.
The Case Study The aim of the investigation was to assess the viability of the proposed new role of the PC, in terms of the PC workload, the potential impact on the team structure, and any subsequent requirements for training. Indeed, the customer required independent advice about the viability of the role, and wanted a relatively quick assessment without it being prohibitively expensive (in terms of the assessment itself or any subsequent solutions). In addition, the customer required a neutral 'body' to be involved in the consultation process to ease the transition to the CCB. The customer requirements, therefore, dictated how the investigation needed to be undertaken. It was essential that:
• The focus of the investigation was on the PC role
• The analyses were conducted in sufficient detail to answer the question, but within budget and timescale constraints
• The investigation supported the consultation process, as well as answering the question
The ingredients for meeting these requirements relied on three key factors:
• A comprehensive set of task analyses
• A high level of contact with team members and area management, and immediate and effective feedback between them
• The use of highly structured tools and techniques for conducting the analyses
The task analyses were extremely important for shaping the whole of the investigation. Through gathering sufficient detail of the PC tasks, and the tasks of each product movements team member, a great deal of clarity was gained very quickly. In addition, all areas of uncertainty or concern were recorded and fed back between the refinery workers and the area management in a timely and structured manner. This helped to gain clarity at the early stages of the investigation, and indicated to the workers that their opinions were being heard and given a high level of attention. The investigation involved the application of a number of human factors techniques. These included task analysis, workload analysis, training needs analysis, and data collection through interviews. To support the process, a software task analysis system was used. This helped to streamline the process of undertaking the analyses and supported the re-use of task analysis data for the workload analysis and training needs analysis.
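As an illustration of how task analysis records can be re-used in this way, the following is a minimal sketch in Python (it is not the ATLAS software referred to in the References); the task names, roles, demand ratings and skills shown are hypothetical examples rather than data from the investigation.

    from dataclasses import dataclass, field

    @dataclass
    class Task:
        name: str
        role: str                  # e.g. "PC" or an outside-team role (hypothetical labels)
        demand: int                # illustrative 1-5 workload rating for the task
        skills: set = field(default_factory=set)

    def workload_by_role(tasks):
        """Sum the illustrative demand ratings per role - a crude workload summary."""
        totals = {}
        for t in tasks:
            totals[t.role] = totals.get(t.role, 0) + t.demand
        return totals

    def training_gap(old_tasks, new_tasks):
        """Skills demanded by the new roles but absent from the old ones."""
        old_skills = {s for t in old_tasks for s in t.skills}
        new_skills = {s for t in new_tasks for s in t.skills}
        return new_skills - old_skills

    # Invented example data, for illustration only.
    old = [Task("Monitor ships at the jetty", "Jetty technician", 2, {"ship surveillance"})]
    new = [Task("Operate remote valves from the CCB", "PC", 3,
                {"ship surveillance", "remote valve operation"})]
    print(workload_by_role(new))   # {'PC': 3}
    print(training_gap(old, new))  # {'remote valve operation'}

The point of the sketch is simply that the same task records feed both summaries, which is the re-use the paper describes.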
The Case Study With Respect To The Application Of Human Factors
The investigation took two months to complete. The solutions required some minimal additional costs for training, engineering maintenance time, and time to develop the operating and emergency procedures. It was considered that the investigation had answered the questions it intended to answer, and that neither the human factors intervention nor the resulting recommendations were prohibitively expensive. It was subsequently recognised by the area management that the intervention had satisfactorily met their requirements. Indeed, through further discussions and circulation of the report to other area managers, it became evident to them that human factors interventions could support them in a variety of similar ways. The paper demonstrates how the structured use of tools and techniques can support projects in many ways. In particular, it shows how objective data can be used to answer specific questions and to support a consultation process. It also shows how the efficient use of the methods can be effective, even when timescales are short and without being prohibitively expensive. It also became evident during the work programme that it was essential for the human factors intervention to be visible and not seem unnecessarily complex to those involved, especially when they do not have a detailed knowledge of human factors. Finally, whilst undertaking this and other work programmes, it has become clear that the role of the human factors practitioner is at least twofold. Firstly, it is essential to ensure that work is undertaken professionally, efficiently and cost-effectively to answer the project questions. It is also important for human factors practitioners to market themselves effectively to increase the awareness of the importance and the range of applications of human factors.
References Aldrich, T., Szabo, S., & Bierbaum, G.R., 1988, The Development and Application of Models to Predict Operator Workload During System Design. In G.McMillan (ed.) Human Performance Models, (NATO AGARD) Human Engineering Limited, 1996, ATLAS—A Practical Task Analysis System. (Software Created For The Apple Macintosh. Human Engineering Limited. Version 1.1K)
FEASIBILITY STUDY OF CONTAINERISATION FOR TRAVELLING POST OFFICE OPERATIONS Graeme Rainbird and Joe Langford RM Consulting, Royal Mail Technology Centre, Wheatstone Road, Dorcan, Swindon, SN3 4RD
This paper describes the feasibility review of a major operational change to Royal Mail’s Travelling Post Offices, in particular the introduction of trays and roll cages for handling mail. Task analysis of the operation and workshops with TPO staff identified a range of relevant issues. Initial concepts were developed and evaluated. Technical problems, such as methods for securing roll cages and trays, were solved at a high level. A risk assessment study was employed to evaluate the acceptability of the concepts with key stake-holders. It was concluded that containerisation of TPOs is viable, and that the approach was a good illustration of the value of employing ergonomics methodology as a fundamental part of the design and development process.
Introduction
Royal Mail strives to achieve next-day delivery for a very high proportion of first-class mail. Letters and packets which must travel long distances across the country can present a problem. Travelling Post Offices (TPOs) are trains on which mail is sorted in transit to help meet delivery schedules. The rail network has been used to transport and store mail for over a hundred and fifty years. TPO design has changed little over the last 60 years and the newest rolling stock dates back to 1977. Mail is taken into carriages in mail bags; the mail is tipped and sorted, rebagged, and despatched at stations along the route. Simple wooden sorting frames, consisting of a number of boxes, are used to segregate the letter mail. All mail movement and sorting tasks are done by hand. Mail profiles (the type and proportions of letters and packets) have changed over time. In addition, new mail streams and services have been introduced, including ‘Priority’ mail: high-value items which require increased levels of security. TPOs have been adapted over time in a ‘piecemeal’ fashion to accommodate these requirements. A more fundamental change to the whole mail distribution network has also taken place over the last five years. Traditionally all mail was transported in bags. Now containerisation
has been introduced, so that most mail is handled in trays, which are transported in roll cages. This improves the efficiency of mail handling, but has had a profound effect on many processes and equipment across Royal Mail. TPOs have not yet been containerised. Mail is still shipped in bags, and as such the TPOs are incompatible with the overall network. The viability of containerisation of TPOs is currently under review. Ergonomists in RM Consulting were requested to determine the feasibility of containerising the TPO operation and to identify options for subsequent design and development.
Review of the Existing TPO Operation
The first step was to develop an in-depth understanding of the current TPO operation. The ergonomists rode the TPO and interviewed staff, and a task analysis was then completed. At a high level, the task includes: preparing the carriages for departure; loading the mail to the appropriate carriage for sortation (mail is pre-sorted by ‘divisions’ which relate to the station at which it is despatched); sortation of mail at sorting frames following departure; clearing down bundles of mail to bags; and despatch of bags at stations along the line.
Figure 1. An example TPO work plan
Outputs from Workshops with TPO Staff
An early activity involved TPO postal staff and supervisors, and managers from headquarters, taking part in a workshop to review the current operation and consider the issues for a containerised system. Participants reported their likes and dislikes and identified areas for improvement with the current TPOs. Areas for improvement included:
Equipment
Some features of the current sorting fittings are liked; for instance, the glass-bottomed pigeonholes make checking easier at clear-down. Staff recognised, however, that in many respects the fittings are out of date. For example, there are no frames specifically for large envelopes; the seating is inadequate and uncomfortable; bag labels are awkward to change and difficult to read.
Environment
Many aspects of the TPO environment are unsatisfactory. The carriages are too hot in summer and too draughty in winter. Lighting is poor and there is much dust from mail bags. Floors are slippery when wet. Train rocking at high speed causes discomfort and occasional minor collisions between people. TPOs are also cramped, which is a problem during peak workloads.
Task
Time is the key factor for all tasks, including sorting, clearing down, tying bags and loading/unloading at stations. The job is generally well liked and there is a good collaborative atmosphere. The sorting staff like the challenge of meeting targets under demanding conditions. The most unpopular task is moving mail bags between carriages.
Initial Concepts for Containerisation
Participants at the workshop were also asked to develop their own ideas for the design of a containerised TPO. Their solutions were drawn up on flip charts and provided a very rich source of information identifying critical issues. The example shown in Figure 2 is intended to illustrate the nature of the responses, rather than any specific detail.
Figure 2. An example design for a containerised TPO from the user workshop
The designs from the workshop could all be categorised as one of two fundamentally different types: either store and sort mail in separate carriages, or store and
sort mail in the same carriage. This factor will influence all the other design issues such as loading methods, storage, floor plans, allocation of tasks etc. Each type of design has advantages and disadvantages. If the mail is stored and sorted in separate carriages the staff have minimal contact with the roll cages which minimises hazards associated with containers on a moving train, and may reduce refurbishment costs. The major disadvantage is that the mail must be transported a greater distance to the work area before and after sortation. It was generally accepted by the workshop participants that if the mail can be stored and sorted safely in the same carriage there will be significant operational benefits. However, regardless of the option selected, TPO staff must be in contact with roll cages and trays when the train is moving to be able to retrieve and replace mail.
Figure 3. One of the design concepts for securing roll cages and mail trays.
For the containerisation of TPOs to be viable, therefore, it must be possible to restrain the roll cages and trays in the event of a disaster such as a train crash. To demonstrate that this was feasible, a range of restraint designs was developed, with input from industrial designers and mechanical engineers. This initial process was not intended to identify the final solution, but to demonstrate that a solution was achievable. Figure 3 shows one potential securing method. In this example, the straps in front of the roll cage are elasticated to allow access to the trays.
Evaluation and Risk Assessment
The design study showed that in principle it is possible to adequately secure roll cages and trays. It was clear, however, that there would be different hazards in the operational environment and significant modifications to the ways of working. It was felt that at this stage it was important to carry out a more formal assessment to test the acceptability of the concepts to the users and other key stake-holders. To this end, an initial Hazop study was carried out with TPO staff, managers, safety consultants and representatives of the train operating company. The study concluded that, in general, the risks would not be any greater than for the current TPO system. Additional design features were also identified through this process. For example: the roll cage restraint system should have an interlock system to ensure all containers are secured before departure; all securing latches etc. should be designed to accommodate staff wearing gloves; and the doors between carriages must be negotiable by staff using two hands to carry trays.
Conclusions
• Containerisation of TPOs is a viable operational concept which would be acceptable to TPO staff
• Storing and sorting mail in the same carriage would present an operational advantage, as the time taken retrieving and replacing mail to and from storage would be minimised
• There are many opportunities for improving the TPO equipment, environment and tasks in an upgraded system
A full-scale model of half a train carriage has been obtained, and a mock rail platform has been built to allow the loading of roll cages and mail trays. The next stage of the project is to determine the optimum carriage layouts and to build prototypes. These will be tested with TPO staff, and detailed layout and equipment design issues will be explored. If successful, the testing and development programme will be continued on real TPOs. The approach taken by the project team is considered to have been very successful so far in that:
• User opinions and knowledge, gained at a very early stage, were key to forming the initial design options
• The logical steps of the process meant that any potential project stoppers would be identified at a very early stage, thereby preventing unnecessary spend
• User buy-in and communication from the outset are likely to be important if and when the new system is introduced.
MILITARY APPLICATIONS
THE COMPLEXITIES OF STRESS IN THE OPERATIONAL MILITARY ENVIRONMENT
Matthew I. Finch & Alex W. Stedmon
Centre for Human Sciences DERA Portsdown West Fareham, Hants. PO17 6AD. UK. Tel: +44 (0)1705 336424 Email: [email protected]
Stress is a highly problematic concept to define, and may be considered as a generic term for responses to four main categories of stressors. Stressors arise in a variety of manifestations and effects, depending on the individual concerned, the situation encountered and the combination of stressors therein. Consideration is given to stressors typically experienced in operational military environments, ranging from the pilot in a fast-jet cockpit or naval officer in an operations room, to the soldier on the battlefield. By the very nature of this environment, conventional methods of stress measurement are impractical. Attention is focused on literature that details non-invasive measures of physiological correlates of stress. It is argued that this approach is suited to the complex and dynamic nature of the working environment and may be exploited to improve selection, training and combat systems.
The Complexity of Stress
Stress can be very loosely defined as “the process of adjusting to circumstances that disrupt, or threaten to disrupt, a person’s equilibrium” (Bernstein et al, 1988). Whilst this definition is somewhat vague, it does illustrate the difficulty of trying to formulate a universal definition. That a recent ESCA-NATO Workshop failed to provide a single standard definition, and that it arrived at six definitions, illustrates, as Cox states, how “elusive…[and]…poorly defined” stress can be (in: Murray et al, 1996). Murray et al (1996) propose that stressors are the factors which induce a state of stress in an individual or situation. Whilst this may not be very helpful in itself, it allows a further definition of the stress/stressor relationship, which Murray et al place into four distinct orders. Zero order effects manifest themselves in physical changes brought about by the stressor; first order effects relate to physiological changes brought about by the stressor; second order effects are concerned with psychological changes brought about by the cognitive interpretation of a stressor; and third order effects are the re-interpretation of second order effects so that stress effects are compounded.
One of the reasons for this is that the interpretation of stress is highly subjective: it will differ between individuals and situations and may even differ for a particular individual at different times. This subjectivity of stress is supported by Albery (1989), whose studies have shown that, although biodynamic stresses can affect subjective measures of workload, this effect is not necessarily reflected by objective task performance. Formalising the stress/stressor relationship in this way forms a common basis for interpreting various stress effects so that, for example, “researchers into speech produced by pilots of high performance aircraft are most concerned with zero-order effects [vibration and G-force], whereas researchers looking at workload would be more interested in second-order effects [which are more prone to individual changes]” (Murray et al, 1996). What is still unclear about the concept of stress is that whilst it is possible to identify specific stressors that cause individuals to become stressed, the cause and effect relationship is still not fully understood.
A Taxonomy of Stressors
An inherent problem in defining stressful events is that “it is often difficult to define the characteristics of a specific situation which are stressful” (Baber et al, 1996). In an attempt to deal with this problem, members of the ESCA-NATO Workshop drew up a taxonomy of stressors, based on Hayre’s (in: Trancoso and Moore, 1995) four categories: Physical, Chemical, Physiological and Psychological. This is detailed in Table 1 and elaborated in relation to combat environments.
Table 1. A Taxonomy of Stressors
One thing that is apparent from the list is that stressors are not exclusive to one particular category. For example, sleep deprivation may well manifest itself in a number of ways that
serve to define it under three of the four categories. Furthermore, not all the effects of a stressor may arise in any given scenario. The stress effects on an individual may arise from independent stressors, combinations of stressors and even the subjective interpretation of stress effects, which may serve to compound the initial stress episode. The effects of these stressors are, therefore, highly specific, not only depending on the task and operational environment, but also on particular individual traits.
Measuring Stress
Various strategies have been developed to examine the complex interrelationships of stressors. Whilst traditional methods for measuring psychological stress have relied upon the use of questionnaires, another approach is to assess the physiological parameters associated with stress. However, as Cox (1985) argues, there are no direct physiological measures of stress, only physiological correlates. These correlates include such variables as the level of adrenaline in urine samples (Kagan & Levi, 1975), certain metabolites in blood samples (De Leeuwe et al, 1992) and REG/EEG patterns (Montgomery & Gleason, 1992). Endresen et al (1987) suggest that immunological parameters may be used as a psychological stress indicator. Psychological stress produces immunological changes in animals, and increasing evidence suggests that this may also be the case for humans. That said, the relationship between stress and the immune system is complex, and best understood in conjunction with individual coping and defence mechanisms. Physiological correlates of stress pertaining to arousal are particularly relevant to time-pressure situations. In their study, Cail & Floru (1987) found that, for participants in a time-pressure condition, performance, error rate, EEG beta index and heart rate were significantly higher than in a self-paced condition. Jorna (1993) states that the “monitoring of heart rate in aviation research provides a global index of pilot workload.” Although it is undoubtedly a useful technique, and indeed Jorna’s results indicate that cardiovascular changes may be a good physiological correlate for stress, he also argues that it is “more complex to assess and therefore less often used.” This point is particularly pertinent when considering the dynamics of the operational military environment, which may serve to confound the measure.
The Operational Military Environment
As Driskell & Salas (1991) state, there are few settings that come to mind when one envisions extreme stress environments: airline emergencies are one setting, natural disasters are another, but certainly the military combat environment is one of the most hostile situations in which humans must operate. In many ways this working environment is unique, with servicemen operating at the limits of their cognitive and physical ability, endurance and stamina, in hostile environments, and with acute life/death consequences. In addition, the nature of military life promotes strong social bonds which, whilst possibly offering a means of social support and indirect stress management, also impose a strong sense of peer pressure. This, in itself, may act as a stressor when individuals are faced with a situation they cannot cope with, but which they perceive either as something they should be able to cope with or as something others are coping with.
Whilst laboratory studies, and studies conducted in benign environments, attempt to control all but a few stressful variables, the military environment cannot be controlled in such a manner. This has two immediate effects: one cannot carry out experimental procedures in the field, and if any measures of stress are to be taken they must be taken in a manner that is neither invasive nor disruptive. It is clearly just as impractical to ask a pilot to complete a mood questionnaire during a 9-G turn as it is to take a blood sample from a soldier during a fire fight. Indeed, these impracticalities mean that psychological or direct physiological testing of active combatants is impossible. The reactions of an individual to these extreme stressors may not be known until they are placed in the environment and, when they are, they must perform their set tasks efficiently. Indeed, as Driskell & Salas (1991) state, “it is precisely in [the combat environment] that effective task and mission performance is most critical.” It is possible, however, to employ specific physiological techniques in the field, but one must be aware of certain practical limitations when these are used. For example, using electrical skin conductance to assess emotional stress would be affected by sweating, possibly due to hard physical exertion, which in itself may not be a stressor in this particular instance. Similarly, heart rate monitors would require advanced algorithms to differentiate between changes due to stress and those due to running while carrying a wounded colleague.
A Case for Non-Invasive Measures
Measurement techniques that are non-disruptive must be, by definition, non-invasive. From a practical perspective, any disruption to the combat scenario carries risks to the success of the mission and the well-being of the personnel. Whilst there has been little research into non-invasive measures of stress in the field, some measurement techniques can be readily incorporated into systems and kit that servicemen already use, for example helmet-mounted EEGs, helmet-mounted blink monitors and chest-mounted heart rate monitors. Different working environments will require different characteristics of the equipment used to measure and record physiological correlates. For soldiers on the battlefield, the weight and size of equipment are of paramount importance. Indeed, with the manufacture of modern materials, the onus is on the equipment suppliers to produce lighter and smaller end products. For this reason, any equipment used for measurement must adhere to these principles. It may be, however, that the pilot in a fast-jet cockpit can be directly attached to more equipment, which is bulkier and heavier, due to the fact that he/she does not have to physically carry the equipment or move about extensively within the immediate environment. Weight and size considerations may be of less importance still for a naval officer in an operations room. However, within this working environment, consideration must be given to the potential crowding effects and the need to move around unhindered when the room is at its busiest. From the literature, significant effects can be found for non-invasive measures in relation to particular stressors. Monitoring of heart rate, total eye blinks, blink duration and electromyograms (EMGs) supports the notion that physiological correlates of stress can be measured without unduly affecting operator performance (Albery, 1989). Morris (1985) also supports the use of eye blink measurements as predictors of performance decrements due to stress.
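To illustrate the kind of processing such measures imply, the following is a minimal sketch in Python of a baseline-relative heart-rate flag from a chest-mounted monitor. The 20% threshold and 60-sample window are arbitrary illustrations, and, as noted above, elevation alone cannot distinguish stress from physical exertion, so a fielded system would need the more advanced algorithms referred to earlier.

    def flag_sustained_elevation(bpm_samples, baseline_bpm, window=60, threshold=1.20):
        """Return start indices of windows whose mean heart rate exceeds threshold x baseline."""
        flags = []
        for i in range(len(bpm_samples) - window + 1):
            window_mean = sum(bpm_samples[i:i + window]) / window
            if window_mean > threshold * baseline_bpm:
                flags.append(i)
        return flags

    # Invented trace (one sample per second): resting, then a period of elevation, then recovery.
    resting = 62
    trace = [62] * 120 + [85] * 120 + [63] * 60
    print(len(flag_sustained_elevation(trace, resting)))  # number of flagged windows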
Concluding Remarks
In order to address the complexities of stress in the operational military environment some formal framework is required, such as the taxonomy detailed above. From this theoretical basis the delicate interplay of combinations of stressors can be rationalised and accounted for in the selection and training of personnel, and in systems design.
References
Albery, W.B., 1989, The effect of sustained acceleration and noise on workload in human operators, Aviation, Space, and Environmental Medicine, 60(10 part 1), 943–948
Baber, C., Mellor, B., Graham, R., Noyes, J., Tunley, C., 1996, Workload and the use of ASR: the effects of time and resource demands, Speech Communication, 20, 37–53
Bernstein, D.A., Roy, E.J., Srull, T.K., Wickens, C.D., 1988, Psychology, (Houghton Mifflin Company, Boston)
Cail, F. and Floru, R., 1987, Eye Movements and Task Demands on VDU. In J.K. O’Regan and A. Levy-Schoen (eds.) Eye Movements: From Physiology to Cognition, (North Holland, Amsterdam), 603–610
Cox, T., 1985, The nature and measurement of stress, Ergonomics, 28, 1155–1163
De Leeuwe, J., Hentschel, U., Tavanier, R., Edelbroek, P., 1992, Prediction of endocrine stress reactions by means of personality variables, Psychological Reports, 79(3,1), 791–802
Driskell, J.E., Salas, E., 1991, Overcoming the effects of stress on military performance: human factors, training, and selection strategies. In R. Gal and A.D. Mangelsdorff (eds.) Handbook of Military Psychology, (John Wiley, Chichester), 183–193
Endresen, I.M., Vaernes, R., Ursin, H., Tonder, O., 1987, Psychological stress-factors and concentration of immunoglobulins and complement components in Norwegian nurses, Work and Stress, 1(4), 365–375
Jorna, P.G.A.M., 1993, Heart rate and workload variations in actual and simulated flight, Ergonomics, 36(9), 1043–1054
Kagan, A., and Levi, L., 1975, Health and environment—psychosocial stimuli: a review. In L. Levi (ed.), Society, Stress and Disease, Vol. 2, (Oxford University Press, New York)
Montgomery, L.D., Gleason, C.R., 1992, Simultaneous use of rheoencephalography and electroencephalography for the monitoring of cerebral function, Aviation, Space, and Environmental Medicine, 63(4), 314–321
Morris, T.L., 1985, Electrooculographic indices of changes in simulated flying performance, Behaviour Research Methods, Instruments, and Computers, 17(2), 176–182
Murray, I.R., Baber, C., South, A., 1996, Towards a definition and working model of stress and its effects on speech, Speech Communication, 20, 3–12
Trancoso, I., and Moore, R., (eds.), 1995, Proceedings of the ESCA-NATO Workshop on Speech Under Stress, Portugal
THE DEVELOPMENT OF PHYSICAL SELECTION PROCEDURES. PHASE 1: JOB ANALYSIS
Mark Rayson
Optimal Performance Ltd, Old Chambers, 93–94 West Street, Farnham, Surrey GU9 7EB, United Kingdom
A number of occupations remain physically demanding, requiring a high level of physical capability for successful performance. Matching worker capabilities with occupational requirements by selecting personnel who possess the necessary physical attributes avoids irrational discrimination, reduces the risk of injury and ensures operational effectiveness. This is the first in a series of papers which describe the development and application of a systematic approach to setting and validating occupation-related physical selection standards, using the British Army as an example. This paper identified and quantified the most physically-demanding tasks within each occupation. Complexities were encountered during the data collection and analysis which are discussed. Criterion tasks which represented common military activities were defined (a single lift, carry, repetitive lift and loaded march), and all occupations in the British Army were allocated a level of performance.
Introduction Despite increased automation in the work-place, a number of occupations remain physically demanding, requiring a high level of physical capability for successful performance. Occupations in the Armed Services, Civilian Services (e.g. Police, Fire, Ambulance, Prison) and heavy industry (e.g. mining, steel, construction) are prime examples where workers require a minimum level of physical capability to be able to perform tasks required of their occupation. Matching worker capabilities with occupational requirements by selecting personnel who possess the necessary physical attributes avoids irrational discrimination, reduces the risk of injury and ensures operational effectiveness. Physical selection procedures have been in place for some time in certain occupations. But where physical capability has been assessed, it is often through tests and standards which have been derived pragmatically. For example, the British Army have minimum physical performance requirements for recruits on tests of sit-ups, pull-ups to a bar and a 1.5 mile run. However, despite the potentially serious consequences of recruiting to these occupations personnel who are physically sub-standard, these entry criteria have not been assessed against bona fide occupational requirements. For occupations where performance is of paramount importance and where the lives and safety of the public and other members of the work-force may be at risk, the basis for physical selection requirements should be operational effectiveness. Operational effectiveness is ultimately dependent upon the ability of each worker to perform the required tasks to the required standards. There are several approaches for developing occupation-related physical selection standards. The preferred approach involves assessing workers on the performance of real occupational tasks (ensuring a high content validity). However, this approach is often impracticable, especially for job applicants, for reasons of safety, skill requirements or logistics. Consequently, tests are often used in place of occupational tasks either as substitutes (criterion validity approach) or as simulations of all or part of the tasks (construct validity approach). Whichever approach is adopted, the tests must be predictive, quantitative, reliable, safe, practicable and non-discriminatory. This is the first in a series of three papers which describe the development and application of a systematic approach to setting occupation-related physical selection standards, using the British Army as an example. This paper describes Phase 1 which involved a job analysis to
identify the most physically-demanding aspects of each occupation. Subsequent papers describe Phase 2 (setting physical selection standards) and Phase 3 (validation and application of physical selection standards).
Method
A variety of techniques for gathering data about the physical demands of British Army occupations were used, including questionnaires, interviews, observation, and physiological, biomechanical and psychophysical measurement techniques. A job analysis questionnaire was administered to the Arms and Service Directorates: to identify the most physically-demanding tasks required of personnel; to obtain a detailed description of the task elements; and to cluster all occupations which shared similar task demands. The questionnaires requested detailed information on the most physically-demanding tasks that all personnel under their command would be required to perform under peace-time conditions. Where it was practicable to do so, the tasks identified from the questionnaires were quantified in the field using multi-disciplinary techniques. The objective of the field measurements was to provide quantitative data describing the requirements of the tasks for each cluster of occupations. One hundred and twenty trained male and female soldiers [mean age 22.2 (sd 3.1) years, height 1776 (sd 69.3) mm, body mass 73.8 (sd 10.6) kg] performed the military tasks. Subjects were ‘representative’ of their occupation, medically classified as ‘fully fit’, and familiar with performing the tasks. The study was approved by the Army Personnel Research Establishment’s Ethics Committee. Consent was provided by all subjects. Performance of the tasks was recorded on video tape for documentation purposes and for subsequent biomechanical analysis. The images from two frame-synchronized cameras were recorded onto video tape recorders and a video time-code signal was ‘burnt’ onto the video image. Digitisation was performed using a Peak Video Illustrator. Once digitised, selected frames were replayed and measurements of distance and angle were made using scaling rods as a reference. Where possible, direct measurements of the mass of equipment handled by personnel were made using calibrated weighing scales or dynamometers. Dynamometers were also used to measure the peak forces exerted by individuals on selected manoeuvres. Sizes of objects, and horizontal and vertical distances of movements of subjects and objects, were measured using tape measures. Heart rate (HR) and rate of oxygen uptake (VO2) were measured to assess the demands of some tasks on the cardio-respiratory system. HR was measured using Sport Tester PE 3000 monitors (Polar). HRmax was measured during a Multistage Fitness Test (Ramsbottom et al, 1988). Oxygen uptake was measured using Oxylogs (P.K. Morgan).
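As an illustration of how the heart rate data can be expressed relative to HRmax (the form reported in Table 5 below), the following is a minimal sketch in Python; the figures are invented for the example, and in the study HRmax came from each subject's Multistage Fitness Test result.

    def percent_hrmax(task_hr_samples, hr_max):
        """Mean task heart rate expressed as a percentage of the individual's HRmax."""
        mean_hr = sum(task_hr_samples) / len(task_hr_samples)
        return 100.0 * mean_hr / hr_max

    # Hypothetical subject: HRmax of 195 beats/min; task HR sampled every 5 s.
    example = percent_hrmax([150, 158, 162, 165, 160], hr_max=195)
    print(f"{example:.0f}% HRmax")   # about 82% HRmax, within the 55-88% range reported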
Results
Responses to the job analysis questionnaire were received from all Directorates. Eighty-six physically-demanding tasks, required of all personnel within an occupation, were identified. All occupations which shared similar task demands were clustered. For example, the various medical and nursing occupations were clustered, as their most physically-demanding task was a common one: to evacuate a casualty on a stretcher. Observations and measurements were subsequently made on 64 of the 86 tasks. The detailed results have been published in an internal Ministry of Defence Report (Rayson et al, 1994). The main findings are summarised below. The frequency of occurrence of the principal actions used during the tasks is shown in Table 1. Fifty-five percent of tasks involved a combination of actions, with lifting and carrying comprising the majority of these (89%).
Table 1. Frequency of principal actions used during tasks
The vertical travel distance of the lifts ranged from ground level to overhead. The start and end heights of the lifts are shown in Table 2.
Table 2. Start and end heights of lift tasks
The horizontal distances of the carry tasks ranged from 2 to 500m. The distribution of distances is shown in Table 3.
Table 3. Horizontal distances of carry tasks
The tasks were performed by teams of between one and eight people. Thirty-seven percent of tasks were single-person and 63% were multi-person. Where objects were handled, loads ranged from 10 to 111 kg per person. The distribution of loads is shown in Table 4. Where loads and forces were shared by more than one person, a simplistic approach was adopted whereby the total load was divided by the number of people in the team.
Table 4. Loads handled
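Expressed as a formula, the equal-sharing assumption used to produce the per-person figures in Table 4 is simply

\[ L_{\text{person}} = \frac{L_{\text{total}}}{n}, \qquad \text{e.g. } \frac{160~\text{kg}}{4~\text{people}} = 40~\text{kg per person}, \]

where the 160 kg, four-person case is a purely illustrative example rather than a measured task; as noted in the Discussion, this simplification is likely to underestimate the demand on each individual.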
Mean %HRmax values ranged from 55% to 88%. The frequency with which the categories of %HRmax were achieved is shown in Table 5.
Table 5. Percentage of maximum heart rate achieved during tasks
Mean values of oxygen uptake ranged from 1.16 to 2.92 l.min–1. The frequency with which the categories of VO2 were achieved is shown in Table 6.
Table 6. Oxygen uptake during tasks
Discussion
The decision to ask Arms and Service Directorates to shortcut the job analysis process by identifying and defining the most physically-demanding tasks and by specifying minimum standards of performance was partially successful. Although key tasks were identified, the quantity and quality of response was variable. As reported in a job analysis of the United States Army (Sharp et al., 1980), the responses represented experienced opinion rather than observed practice. For some tasks the difficulty in providing a detailed description was understandable. For example, the Infantry task of “assaulting a prepared enemy position” involved a sequence of sub-tasks, the precise details of which are dictated by the mission. Some of these sub-tasks included: an approach march over variable terrain and for a variable distance; ‘closing in’ on the enemy position whilst alternating sequences of short sprints, crawling and shooting; a final assault to secure the position; and evacuation of casualties. For other tasks, the responses were either unnecessarily complicated by respondents, or subsequent observation and measurement in the field revealed the tasks to be unrealistic.
The importance of obtaining a detailed and accurate description of the job requirements as the first step in establishing physical selection standards cannot be over-stated. The components of a task are so intrinsically linked to performance that they directly determine performance outcome. For example, handling awkwardly shaped, larger, and asymmetrical objects, or objects which do not afford good grip, or increasing the vertical height, spine-to-load distance, and frequency of lifting, all decrease lifting capability (Ayoub and Mital, 1989). In view of the fundamental inadequacy in the definition of the tasks, there appeared little chance that the majority could be used as reliable criterion tasks.
One of the most important findings from the job analysis was the predominance of material handling activities. With only a few exceptions, the demonstrated tasks involved lifting, lowering, carrying, pushing or pulling items of equipment. This was not surprising
given the frequency with which material handling activities had been reported by both the United States’ and Canadian Armed Forces (Sharp et al., 1980; Allen et al., 1984). Although some extremely heavy lifts were recorded (e.g. 90 kg fork lift, 100 kg generator lift, 111 kg T-bar lift), the vertical distance of these lifts was often small and the start and finish heights were largely in the optimal lift range (i.e. ground to waist height). The heaviest lifts to head height or higher were recorded during bridge-building and lifting generators on to vehicles: both involved 44 kg lifts per person. Where tasks involved handling very heavy loads, strategies were adopted by the soldiers to minimise the stress on the lumbar spine. For example, the 111 kg T-bar was lifted with the legs straddling the object and the hands positioned between the legs. The 100 kg generator was raised by extending the legs whilst supporting the generator handle in the crook of the arms. Involvement of the arm muscles was thereby minimised.
Two thirds of the tasks were multi-person, i.e. they required involvement by more than one soldier. Some of these tasks involved simultaneous sharing of a given manoeuvre, such as the numerous two- and four-person lifts, whilst others involved a chain of personnel consecutively contributing to the manoeuvre, e.g. loading barmines. Both forms of multi-person activity complicated the process of identifying and analysing an individual’s contribution in successfully completing the task. A few studies on multi-person tasks have been published. Kroemer (1974) suggested that in the case of shared pulling or pushing, the force recommendations for one person should be multiplied by the respective factor (e.g. two or three) using the assumption that the load is shared equally. For simplicity, this was the principle adopted in calculating the loads per soldier cited in this paper. However, this method is likely to underestimate the actual strength requirements on soldiers (Davies, 1972; Karwowski and Mital, 1986; Sharp et al., 1993).
A number of examples of large and variable-shaped objects were measured, including various missiles, generators and scanners, camouflage nets etc., which compelled unusual methods of handling. Other objects were either asymmetrical in load distribution (generators, missiles, drawbars etc.) or had unstable loads (camouflage nets, fuel cans, food pots etc.), thereby reducing performance (Ayoub and Mital, 1989). Similarly, although the majority of material handling tasks afforded good coupling between the worker and the object by the provision of handles, a number of tasks did not.
Tasks which were judged by the author to involve a significant cardiovascular component were assessed for HR and VO2 response. However, the intricate relationship between HR and VO2 and the mode (Petrofsky and Lind, 1978; Rayson et al., 1995), intensity and duration (Ayoub and Mital, 1989) of the tasks, combined with the inadequacy in definition of the task components, allowed a very limited interpretation of the cardiovascular data. Future measurement of these variables during a job analysis is not recommended unless the tasks can be adequately defined. Many of the measured tasks were skilled, multi-person activities involving complex manoeuvres, usually involving equipment, and often performed in restricted space, forcing unusual postures. The inability to define the tasks precisely meant that the majority were performed and measured self-paced.
Many of the complicating factors described confounded attempts to standardise the tasks during their demonstration and quantification and conspired against their adoption as criterion tasks. However, close scrutiny of the tasks revealed considerable duplication and overlap. The majority of identified short-comings could be overcome and the project progressed by using the data collected during the job analysis, combined with subject-matter expert opinion, to define generic military tasks for use as criterion tasks. These generic criterion tasks would remain task-related, as was originally intended, but would not attempt to encompass every task that had been identified from every occupation. Rather the generic tasks would be typical and representative of a cluster of similar activities. The standards could vary by occupation. The experience in other nations supported the need to develop generic criterion tasks. The United States Army progressively reduced the complexity of their task classification system to encompass eventually only lifting tasks which were reduced to five load categories (Department of the Army, 1982). The Canadian Armed Forces concentrated on four common military tasks comprising a casualty evacuation (stretcher carry), an ammunition box lift, a maximal effort dig, and a loaded march (Stevenson et al, 1988). No attempt was made to set standards which were occupation-specific. Rather, minimum acceptable standards were set which were common for all personnel. The strength of adopting generic criterion tasks lay in its simplicity and manageability. A relatively small number of generic tasks would need to be identified, protocols developed and minimum acceptable standards of performance agreed. The weakness lay in the shift away from the real occupational requirements to the notion of generic or representative tasks.
However, if the tasks were rationalised empirically and were deliberated and refined by a team of subject-matter experts, it could be argued, if a legal challenge arose, that all reasonable action had been taken given the time course and resources available. Consequently, four generic criterion tasks were developed to represent the key activities identified in the job analysis. They comprised a Single Lift, Carry, Repetitive Lift and Loaded March. The feasibility and logistics of administering the generic criterion tasks as ‘the gold standards’ against which any future tests could be validated influenced the selection and design of the tasks. The diversity of physical requirements in the different occupations was encompassed by setting three standards, referred to as Levels 1, 2 and 3, for each of the four generic criterion tasks. Defining the standards and allocating personnel to levels were achieved on the basis of both the empirical data from the job analysis and subject-matter expert opinion. Where objective data existed to set the standards confidently and allocate occupations to a particular level, this method prevailed. But where no empirical data existed, or where the empirical data fell between Levels, subject-matter expert opinion was sought. Involving subject-matter experts gave the selected tasks greater face validity and demonstrated the acceptability of the set standards within the organisation.
Acknowledgement
This work was commissioned by the Ministry of Defence to the Army Personnel Research Establishment, Farnborough, Hampshire, UK. The author wishes to acknowledge the contributions to this work by DG Bell, DE Holliman, RV Nevola, M Llewellyn, A Cole, RL Bell, WR Withey, and MA Stroud.
References Allen, C.L., Nottrodt, J.W., Celentano, E.J., Hart, L.E.M. & Cox, K.M. (1984). Development of occupational physical selection standards (OPSS) for the Canadian Forces—summary report. Technical Report 84-R-57. DCIEM North Yorks, Canada. Ayoub, M.M. & Mital, A. (1989). Manual materials handling. London: Taylor and Francis. Davies, B.T. (1972). Moving loads manually. Applied Ergonomics, 3, 190–194. Department of the Army (1982). Women in the Army. Policy Review. Washington AC 20310. USA. European Economic Community (1976). Council Directive of 9 February 1976 on the implementation of equal treatment for men and women… Official Journal of the European Communities, 14 Feb 1976, 1, 39–42. Karwowski, W. & Mital, A. (1986). Isometric and isokinetic testing of lifting strength of males in teamwork. Ergonomics, 29, 7, 869–878. Kroemer, K.H. (1974). Horizontal push and pull forces. Applied Ergonomics, 5, 2, 94–102. Nottrodt, J.W. & Celentano, E.J. (1987). Development of predictive selection and placement tests for personnel evaluation. Applied Ergonomics, 18, 4, 279–288. Petrofsky, J.S. & Lind, A.R. (1978). Comparison of metabolic and ventilatory responses of men to various lifting tasks and bicycle ergometry. Journal of Applied Physiology: Respiratory, Environmental and Exercise Physiology, 45, 64–68. Ramsbottom, R., Brewer, J. & Williams, C. (1988). A Progressive Shuttle Run test to estimate maximal oxygen uptake. British Journal of Sports Medicine, 22, 141–144. Rayson, M.P., Bell, D.G., Holliman, D.E., Llewelyn, M., Nevola, V.R. & Bell, R.L. (1994). Physical selection standards for the British Army. Phases 1 and 2. Technical Report 94R036, Army Personnel Research Establishment, Farnborough, UK. Rayson, M.P., Davies, A., Bell, D.G. & Rhodes-James, E.S. (1995). Heart rate and oxygen uptake relationship: a comparison of loaded marching and running in women. European Journal of Applied Physiology, 71, 405–408. Sharp, O.S., Wright, J.E., Vogel, J.A., Patton, J.F., Daniel, W.L., Knapik, J. & Korval, D.M. (1980). Screening for physical capacity in the US Army. Technical Report T8/80, US Army Research Institute of Environmental Medicine, Natick, USA. Sharp, M.A., Rice, V., Nindl, B. & Williamson, T. (1993). Maximum lifting capacity in single and mixed gender three-person teams. In Proceedings of the Human Factors and Ergonomics Society 37th Annual Meeting. Statement of Defence Estimates (1990). Hansard, 1, 66:748, 6 February 1990. Stevenson, J.M., Andrew, G.M., Bryant, J.T. & Thomson, J.M. (1988). Development of minimum physical fitness standards for the Canadian Armed Forces. Phase III. Ergonomics Research Laboratory, Queen’s University at Kingston, Canada.
THE HUMAN FACTOR IN APPLIED WARFARE A E Birkbeck
Ballistics and Impact Group Mechanical Engineering Department University of Glasgow G12 8QQ
Warfare over the centuries has changed radically in some aspects. This has been brought about by improved technology, better understanding of materials and improvements in engineering. However, there is one limiting factor that has not changed throughout the ages. He has been called many names: foot slogger, grunt, the PBI. It does not matter what you call him; he is the infantry soldier. He has always been the main limiting factor in warfare.
Introduction
Before the advent of motorised, horse-drawn or other troop transport, the only way an army could move about the country was by covering the ground on foot. This practice carried on until the 1860s and the American Civil War, when, for the first time, railroads were used to transport whole armies. In this day and age of advanced mechanisation it may seem superfluous that the soldier should still be physically trained in order to improve his stamina and endurance but, during the Falklands war in 1982 and due to the lack of transport, the infantry had to resort to the traditional method of covering long distances: carrying everything and walking!
Marching and load carrying
Marching
In Roman times a legionnaire travelling between countries had to carry everything he possessed: his armour, his weapons and his food. In all, the total load was about 30kg (it is not clear if this figure includes the weight of the three stakes each legionary carried for the purpose of building a palisade when they camped each night). On campaign they were expected to march between 40 and 48km/day, and they called themselves Marius’s mules because of the weight they had to carry (Watson, 1983). The soldier of today carries between 25 and 30kg and, when not being ferried in an armoured support vehicle, he can march 35 to 45km/day. A point that should be taken into account is that the Roman mile is 0.92 of the statute mile
and, given the time allowed to march the distances recorded, the pace works out very close to the British Army’s rate-of-march of 3 miles/hour with a ten-minute halt included (British Army Drill Manual). There have been times when loads have been in excess of those mentioned, but these have been exceptional circumstances, e.g. the Normandy landings (Wilmot, 1952). Likewise the parachute landings at Arnhem, but the distances on these occasions were comparatively short and did not involve day after day of constant marching.
In the Middle Ages there was a regression in the ability of an army to cover the distances that the Roman armies could and did travel. One of the main reasons for this was the make-up of the army. With the downfall of the Roman Empire and its armies and with the slide into feudalism, the king no longer kept a large regular standing army. At the prospect of a war, each of the nobles who supported him was expected to bring with him a number of his local people, notably archers and spearmen. While the archers were expected to train in the use of the bow (Hardy, 1992), they were not trained in other aspects of military discipline (Lloyd, 1908) such as marching, load carrying, or skill with the sword, which was considered a gentleman’s weapon. The army itself moved not in an organised marching column but rather in the manner of a football crowd, travelling at the pace of the slowest member and only assuming any sort of formation as it drew close to the enemy. The distance marched each day was 19 to 24km (Hardy, 1992) and was controlled by the terrain they had to cross with the baggage wagons. As one can imagine, the weather played a large part in this as well.
Load carrying
In a medieval army the only people with any sort of weight to worry about were the knights, with armour that weighed about 30kg. On the march this would not be worn but was carried on the accompanying wagons. The infantryman, apart from his helmet and possibly a leather jerkin, only had his bow or spear and the clothes he stood up in. In some cases individuals may have had a sword or knife looted from a previous battle, and in accounts of the battle of Agincourt each bowman and spearman also carried a 6ft stake (Bradbury, 1985). From this we can assume that the weight carried by the soldier was not excessive.
In modern times the need for the infantry soldier to march any great distance has been reduced by the availability of motorised transport, but there are times when this is not possible. These occasions do not often occur, but the one great advantage the modern infantry soldier enjoys over his predecessors is his personal load carrying equipment. The legionnaire carried his equipment with the aid of a T- or Y-shaped pole (Upcott and Reynolds, 1955; Lloyd, 1908) balanced over his shoulder in the manner of a bricklayer using a hod. The medieval soldier carried his possessions wrapped in his blanket and slung over his shoulder, a practice that was carried on during the American Civil War in the early 1860s. The modern infantry soldier’s equipment is worn basically on his belt, with the load being supported by padded straps over his shoulders. The style of the equipment has changed and improved since the start of this century, but the basic application is still there and, for all the improvements in the equipment and materials, the modern soldier still cannot carry a greater load and function any more successfully than could the Roman legionnaire.
Arms and Armour
Arms
Another aspect with which a soldier has to become familiar is “skill at arms”, whether in modern times with a rifle or in medieval times with a bow. In the Middle Ages the longbow ruled the battlefield because of its range and its ability to strike at the enemy and prevent him closing on a friendly position. The archer in peacetime was exposed to a rigid training regime. He practised as a youngster with a light-draw bow. As he got older, a heavier-draw bow and heavier arrows increased his strength and marksmanship (Hardy, 1992) so that in time of war he would function in a cohesive body. In battle the archers would open fire at a maximum range of “330 yds”, i.e. 300m, using volleys of arrows at a rate of up to 17 arrows a minute. At this rate of fire they had 5 or 6 arrows in the air at any one time (Strickland, 1997). As the range decreased to 200 yds the best shots would start to select individual targets and their rate of fire would drop to between 8 and 10 arrows a minute. An infantryman before World War I was trained to use his rifle at extreme ranges varying from 800 to 1000 yds, but during the First World War experience showed that the ability to shoot at 1000 yds was superfluous in modern warfare. During the latter part of the war and up to today, most military rifles have been designed with sights that do not go beyond 500m. Today’s soldiers, armed with an assault rifle, are not expected to engage the enemy beyond 400m and the effective battle range is 300m (British Army Individual Skill at Arms Manual, 1975). The rate of applied fire is 8 to 10 aimed shots a minute, a range and rate of fire that would be familiar to the medieval bowman as the maximum range of his arrow and his aiming capability. One noticeable difference between the bowman and the rifleman is their weapons. The bow weighs 1.5kg while the rifle weighs 4.5kg. The bowman has to be strong enough to draw a bow with a draw weight of between 85lbs and 150lbs or greater (Hardy, 1992), and hold it steady long enough to aim and loose an arrow. The rifleman has to be able to hold his heavier rifle and carry out the same drill as the archer before firing. The limiting factor appears to be the same for the bow or rifle: each user needs to have the strength to hold the weapon steady, the ability to acquire a target, a steady aim and a controlled firing technique. It is this sequence of operations that restricts the number of targets and the range at which they can be seen and engaged by the individual.
Hand-thrown weapons such as the Roman pilum weighed 3kg and had an effective range of about 30m (Upcott and Reynolds 1955). While a modern hand grenade weighs about a third of this, it is still thrown the same distance and not three times further. The weight of the object does not appear to matter very much, as it comes down to the individual person. Modern javelins are thrown over 100m, but they weigh about 1kg and are launched by running and then throwing, whereas the pilum and the grenade were thrown from a standing position without the benefit of a run.
Armour
Armour has been worn throughout the ages. During the Greek wars before 490 BC the armour weighed about 40kg and required a slave to carry it between battles (Lloyd 1908). After 490
BC, with the change to a more mobile style of warfare, the armour evolved and became lighter. This is shown in the changes to helmets, from the earlier heavy pot helmet to the Corinthian style. These used a thinner wall section of the same material and a better overall design, and ended up weighing between 0.9 and 1.5kg (Blyth, 1995). In contrast, the modern battle helmet, constructed of a polymer composite material, weighs between 1.2 and 1.5kg (Courtaulds, 1997). During the early Roman period (200 BC to AD 40) the body armour was of a chain-mail construction and weighed 12 to 15kg. After AD 40 the style changed to plate armour in the form of the Lorica Segmentata, which weighed about 9kg. Today’s multi-layered body armour weighs 8.6kg, very much in line with that of the late Roman armour. In the Middle Ages the armour worn by the knights covered the whole body, but even that became lighter as it became more common to fight on foot as opposed to on horseback. This was mainly because of the threat of the longbow. The common style of armour used during this period weighed around 30kg and the only other item the knight had to carry was his chosen weapon. Again it appears that 30kg is the weight a fit man can carry on his person and still function normally, whether legionary or modern infantryman.
Diet
Carrying his food was a new concept introduced at the time of Julius Caesar. All the armies of earlier times, such as Alexander the Great's, had to forage for their food and rely on the surrounding countryside to sustain them. This method of feeding an army has one advantage and several disadvantages. The advantage is that there are no supply lines and no baggage train. The disadvantages are that the diet cannot be controlled and the troops have to eat what is available. If the army stays in the same area for some time, foraging uses up the local resources and the army then goes hungry. It is also at the enemy's mercy should they carry out a scorched-earth policy and deprive it of all possible food supplies. The advantage gained by carrying a food supply is that it allows an army to be independent of the immediate environment. The Roman soldier lived as part of an 8-man squad that organised and carried its own camping and cooking equipment. Evidence on display in the Hunterian Museum, University of Glasgow, points to a surprisingly varied diet which included pasta, meat, fish, fruit and vegetables. He also had beer and wine to drink. These were supplied to each squad along with a ration of grain from the stores, which was ground into flour and baked into bread and rolls. From this it can be deduced that the Roman soldier had a reasonably healthy diet, which would be reflected in his general health and strength, an advantage when it came to training exercises. Physical training sessions were included in the basic training, and the soldiers were encouraged to take part in sports such as running, jumping, swimming and the carrying of heavy packs in order to increase their stamina during route marches and weapons training (Watson, 1983). In contrast, one has only to read through the numerous books about warfare in the Middle Ages to realise the desperate living and eating conditions. As noted earlier, the armies then not only moved like a football crowd, they also ate like a plague of locusts, taking everything from the surrounding area. Even in Napoleonic times the idea of the army supplying
rations to its soldiers was only partially realised, and the troops had to forage to supplement what they were issued. During the 19th century the Commissary Department started supplying the army with food, although in the 1850s, during the Crimean War, the system broke down and the army was reduced to living in conditions like those of the Middle Ages. That war demonstrated the importance of a regular supply of food being available to the troops. During World War II rationing was even extended to the civilian population in the UK, and the diet, while not very exciting, was designed to give everyone the correct balance for a healthy life. Today's soldiers fare much better than those of the past, having access to a wide variety of food to sustain them and allow them to carry out the strenuous tasks they are asked to perform.
Conclusion
It is apparent that, irrespective of the changes in the technology of warfare over the past two millennia, the dominant constraint has been the infantryman. He cannot function properly if he is expected to march further than 35 to 45km/day, or to carry a load or wear armour greater than 30kg in weight. Full armour has not been used for several centuries but helmets are still in use. History has shown that helmets heavier than 1.6kg are not practical, as the neck cannot support greater loads during vigorous physical exercise. The effective distance of hand-thrown weapons, irrespective of their weight, is still much the same, implying that the human arm is the limiting factor. Likewise, for weapons that require to be aimed, 300m appears to be the optimum range at which a target can be seen clearly, acquired and hit. Finally, over the years, the practice of supplying the soldier with regular food has helped him to perform with greater efficiency. To this day nobody has invented anything better than the well-equipped, well-trained and well-fed Infantry Soldier.
References
Blyth, P.H., 1995, Proc. Light Armour Systems Symposium, RMCS Shrivenham.
Bradbury, J., 1985, The Medieval Archer, pp 130–131.
British Army Drill Manual, Appendix C, Time and Pace, p 157.
British Army Individual Skill at Arms Manual, 1975.
Hardy, R., 1992, Longbow: A Social and Military History, 3rd Ed., pp 102, 212–216, 218, J.H. Haynes & Co Ltd.
Lloyd, Col (retd) E.M., 1908, A Review of the History of Infantry, pp 3, 37, 75, Longmans, Green, and Co.
Strickland, M., 1997, Personal communication, Dept of Medieval History, Glasgow University.
Upcott, Rev A.W. and Reynolds, A., 1955, Caesar's Invasion of Britain (translated 1905, 15th Ed.), pp 17–18.
Watson, G.R., 1983, The Roman Soldier, 2nd Ed., pp 54–55, 62, 65–66, Thames and Hudson, Pitman Press, Bath.
Wilmot, C., 1952, The Struggle for Europe, pp 223 and 239.
Acknowledgement: Thanks are due to Mr B McCartney of Glasgow for access to his private library.
AIR TRAFFIC MANAGEMENT
Getting The Picture: Investigating The Mental Picture Of The Air Traffic Controller
Barry Kirwan, Laura Donohoe, Toby Atkinson, Heather MacKendrick, Tab Lamoureux and Abigail Phillips
Human Factors Unit, ATMDC, NATS, Bournemouth Airport, Christchurch, Dorset, BH23 6DF
Air traffic controllers have a mental representation of what is happening on the radar screen in front of them, including what has happened, what is probably going to happen, what could happen, and what they would like to happen and are in fact trying to achieve. This representation, whether primarily pictorial in nature or verbal or both, is generally referred to as ‘the picture’. Controllers talk of ‘having the picture’, as a necessary prerequisite for controlling air traffic, and also talk about ‘losing the picture’ as a rare event in which their abilities to control traffic break down. Future automation may impact upon this picture, and so it is useful to gain an understanding of the picture, what it is and how it works, and what affects it. This paper reports the initial results of a series of interviews with controllers, and insights from a recent experiment, which shed some light on the complexities and potential variations of this mental representation.
Background
Air traffic controllers control air traffic primarily on the basis of a radar display and flight strip information, the former showing aircraft position and flight level, and the latter giving an indication of where the aircraft originated, their destination, and by what route. Air traffic moves within a three-dimensional space as a function of (the fourth dimension) time. The controllers therefore have a two-dimensional 'picture' in front of them (via the strips and the radar picture, which updates in real time), but they must also be predicting where the aircraft are going to be in the near and medium future, and be aware of other possibilities, such as unplanned deviations in heading, speed or (more rarely) altitude. They must therefore have a mental 'picture' which operates in four dimensions. This is necessary in order to detect and avoid any potential conflicts between aircraft, and to facilitate the smooth, orderly and efficient (called 'expeditious') movement of air traffic. In order to do this,
controllers generally agree they need to have an internal or mental 'picture', which is based on the actual radar picture, strip information, and communications, etc. As air traffic density continues to increase in the near future, it is highly likely that some form of computer assistance or automation will need to be implemented to enable the controllers to handle the increased traffic load (and their own workload) safely and efficiently. For example, it is planned to replace paper strips either with electronic versions or with object-oriented displays attaching the information to the labels on the radar display which show the position of the aircraft. Communications techniques and equipment are also likely to change, requiring less oral radio-telephony between controller and pilot, and instead relying more on computerised messages which will be 'up-linked' and 'down-linked' between air traffic controllers on the ground and the aircrew in the cockpit. Additionally, several new tools are in development to supplement controller functions, such as conflict prediction and resolution, and the sequencing and spacing of aircraft on final approach to an airport runway. The question is, what impact will such tools or supportive semi-automation have on the controller's performance, and in particular on the controller's 'picture'? In order to understand the picture, the first phase of research has proceeded in two related strands. The first strand involved carrying out a series of interviews with approximately twenty operationally valid air traffic controllers on the nature of the picture. The second strand of the research has focused on a detailed investigation of two controllers' situation awareness and eye movements in a series of real-time simulations. This required the use of situation awareness debriefs (a modified SAGAT technique—Endsley and Kiris, 1995) and a head-mounted eye tracker, together with a limited amount of auto-confrontation (where controllers review their own performance in handling traffic—they are able to do this via a video replay of their eye track superimposed on the recorded scene).
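As an aside on the situation awareness debriefs mentioned above, the following is a minimal, purely illustrative sketch of how recalled aircraft positions might be scored against the actual traffic at a simulation freeze. The callsigns, coordinates (in NM) and tolerance are assumptions for illustration only; they do not describe the scoring scheme actually used in this study.

```python
# Purely illustrative sketch of scoring a situation awareness debrief at a
# simulation freeze: recalled aircraft positions are compared with the actual
# traffic picture. Callsigns, coordinates (NM) and the tolerance are invented.
from math import hypot

actual_positions = {"BAW123": (10.0, 42.0), "AFR456": (25.5, 30.0), "DLH789": (5.0, 12.5)}
recalled_positions = {"BAW123": (12.0, 40.0), "AFR456": (26.0, 31.0)}  # DLH789 not recalled

TOLERANCE_NM = 5.0  # assumed acceptable positional error

for callsign, (ax, ay) in actual_positions.items():
    if callsign not in recalled_positions:
        print(f"{callsign}: not recalled")
        continue
    rx, ry = recalled_positions[callsign]
    error = hypot(ax - rx, ay - ry)
    verdict = "within tolerance" if error <= TOLERANCE_NM else "inaccurate"
    print(f"{callsign}: positional error {error:.1f} NM ({verdict})")
```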
Results
Interviews on the picture
Table 1 shows the main interview questions asked of the controllers, and Table 2 shows some of the types of answers gained from the interviews. Clearly there is a diversity of ideas on what is meant by the picture; indeed, one respondent stated that there was no picture, and that having the picture was a euphemism for being confident and skilled enough to handle traffic fluently and safely. From a safety perspective, what was most interesting were the insights gained into what can make maintaining the picture difficult. This can lead to 'losing the picture', a catastrophic breakdown of the ability to control traffic. Controllers who had had this unnerving experience stated that they recognised it was about to happen (by getting behind on tasks and becoming purely reactive rather than proactive), and called in support either to help them to regain the picture, and their confidence, or to take over. This extra person effectively adds cognitive 'processing power' to the task. Given the experience the aviation industry has had with the introduction of cockpit automation and its effects on pilots (e.g. Billings, 1997: creating many secondary automation management tasks, and increasing workload just at the point where something unusual happens on the flight deck, etc.), considerations based on insights such as those in Table 2
(Q10) may help forestall similar problems occurring for ATC as future automation and consequent interface, procedure and role changes are introduced. The interviews undertaken during these studies represent a first set of data, and it is intended to continue conducting these 'picture' interviews, since new information is still being generated from the more recent interviews. It is also desirable to extend the range of ATC jobs/positions being analysed, to gain a full appreciation of different picture types and aspects. One interesting ATC position to analyse would be oceanic control, since it is procedurally controlled at present, without a radar picture, though this may change in the future as new ATC technology arrives.
Table 1—Sample interview questions on 'the picture'
Experimental investigation of the picture
Given the important caveat that only two controllers were the subjects of this investigation (each with 7 years of operational experience), the investigation nevertheless produced some interesting results. The main result, gained from reviewing the eye tracking data, the situation awareness scores, and the results of the auto-confrontations, was that these two controllers appeared to have entirely different notions of what constituted the 'picture', although their performance on the same traffic samples was similar and fully met the requirements of the job. The first controller had a primarily visual picture, based on the radar display. This controller maintained the picture with regular circular scanning of the radar display including
its periphery, by frequent contact with the aircraft, and by monitoring and updating the strips. This controller's picture was good globally, i.e. more was remembered by this controller during the situation awareness debriefs immediately after each exercise. At one point during the study, the simulation was frozen after this controller had spent some time monitoring and re-ordering the strips, temporarily ignoring the radar display. The situation awareness measure showed that the controller did in fact have all the required information about the aircraft, but the locations were all similarly inaccurate, corresponding to the controller's last visual sweep of the display. This suggested a strong visual and topographical picture for this controller.
Table 2—Typical answers to a subset of the questions
The other controller had less global awareness. This controller’s awareness was on a more local level, and so the situation awareness debriefs showed less information on locations and details of aircraft, except those this controller had been dealing with at the time (for these aircraft the SA was good). This controller appeared to have a more verbal and less visual/ spatial ‘picture’, in the sense of having a list of priorities of aircraft to deal with in sequence. Once these aircraft had been dealt with, it appeared that information on them was released from working memory. This processing and discarding of information has been noted before in a study of expert versus novice controllers (the latter appeared to try to remember everything, but the experts’ performance was better, as they only remembered the essential information—Stein, 1993).
What is interesting is that performance was equal (in terms of parameters such as conflict avoidance, and quality of service to aircraft) between the two controllers, though one controller had a far better global situation awareness than the other. However, it should be noted that, due to the unpredictability of conflict situations (where two aircraft can potentially lose standard required separation, and therefore have a risk of collision), the simulation freeze and SA debrief rarely coincided with such an incident, which is where a performance difference (in terms of conflict reduction) between the two picture ‘styles’ might manifest itself. Taken together with the eye tracking data and the auto-confrontation, however, this pilot experiment suggested evidence for at least two different picture types, scanning strategies, and processing styles.
Discussion
The results presented have potential implications for future ATC systems. First, the range of picture types needs to be fully understood by ATM system designers. Second, certain picture types may favour certain future ATC display and support-tool concepts more than others. Third, there is the question of which picture type is best (safest; most expeditious; maximising situation awareness and optimising workload), if indeed there is a 'best picture', given projected future traffic levels. Fourth, which parts of the picture should be left to the controller, and which parts, if any, should be supported or even automated? As traffic increases, is it tenable that the controller will be able to maintain the picture, or will there be more reliance on automation, or will the controller have a very different picture in the future? Fifth, how will the role of the controller change, e.g. will the controller's job become more supervisory in nature, does such a role change necessitate giving up the picture, and will the controller still be able to intervene effectively in such a role? Sixth, and more fundamentally, how is the picture first evolved during training, and do individuals have a predilection for certain picture types, or can any controller learn to have a particular type or style of picture? It is hoped that the planned future research can at least begin to answer, or give insights into, some of these questions in a practical ATC context.
Acknowledgements: The authors would like to thank all the controllers who participated in both studies.
Disclaimer: The opinions expressed in this paper are those of the authors and do not necessarily represent any policy or intent on behalf of the parent organisation.
References
Billings, C.E. (1997) Aviation Automation: The Search for a Human-Centred Approach, (Mahwah, NJ: Lawrence Erlbaum Associates).
Endsley, M.R. and Kiris, E.O. (1995) Situation awareness global assessment technique (SAGAT) TRACON air traffic control version user guide, (Lubbock, TX: Texas Tech University).
Stein, E.S. (1993) Tracking the visual scan of air traffic controllers. Proceedings of the 7th International Symposium on Aviation Psychology, Vol 2, 812–816.
Developing a predictive model of controller workload in Air Traffic Management Andrew Kilner; Michael Hook; Paul Fearnside; Paul Nicholson. Human Factors Unit, Air Traffic Management Development Centre, Bournemouth Airport, Christchurch, Dorset, BH23 6DF, UK.
Workload has long been used as a metric to indicate system performance and operator strain (Moray 1979). The National Air Traffic Services (NATS) Air Traffic Management Development Centre (ATMDC) uses a unique methodology and toolset based on Wickens' Model of Multiple Resources (Wickens 1992) to analyse and predict workload for a given set of circumstances. PUMA (Performance and Usability Modelling in ATM (Air Traffic Management); Hook 1993) takes a task-analytic approach, inferring workload from observational task analysis, cognitive debriefs (auto-confrontation) and interviews with subject matter experts. A large and detailed model of a concept of air traffic operation is developed and is subsequently used to predict workload. This paper describes the PUMA workload analysis process and reviews several projects in which PUMA has had an impact in terms of Human Machine Interface (HMI) design and workload assessment.
Introduction
In order to conduct research and development work in ATM at an early stage in the design life-cycle, the ATMDC uses the PUMA toolset as one of its methodologies. PUMA is used in part to evaluate prototype tools for the Air Traffic Controller (ATCO) which enable more aircraft to be handled. The PUMA toolset enables the determination of controller workload given a particular way of working (known as an operational concept or OC), and a particular airspace sectorisation, route structure and traffic sample (known as a scenario). The analyst can thus carry out initial investigations to determine those areas of an operational concept which are particularly workload intensive. The effect on workload of proposed changes to the operational concept (e.g. task restructuring/interface reconfiguring) may also be examined, i.e. the PUMA toolset offers the ability to predict workload. This approach is possible because PUMA calculates an estimate of the overall workload placed upon the controller by each of the tasks and actions undertaken by that controller (tasks and actions are the basic
levels of controller interactions with the ATM system). By varying the sequence of tasks and actions, the operational concept can be varied to investigate alternative methods of working.
The PUMA Toolset and Methodology
The PUMA toolset consists of a number of window-based graphical editors. These allow the user to define both the operational concept and the scenario, and hence to generate estimates of controller workload. A variety of editors allow the user to examine and edit operational concepts and scenarios. These facilities may be used, for instance, in an experiment to optimise the structure of a given ATM task.
The PUMA Methodology/Process
Video/Observational Task Analysis
Video recordings of the controller (over the shoulder), of the radar/interface equipment and of the controller's face are taken and combined into a single picture during real time simulations and operational analysis. In such trials the controller is required to provide instantaneous self assessment (ISA, Kilner 1994) data pertaining to how much workload the controller is experiencing. This assists in the identification of high workload areas of the trial. Such high workload areas are then examined after the trial during a cognitive debrief. A period of the trial (typically of 20 minutes duration) will then be translated by hand into timeline format, detailing all observable tasks and actions for a given controller role. This is known as the observational task analysis (OTA).
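To make the use of ISA ratings concrete, the following is a minimal sketch of how high-workload periods might be flagged for the cognitive debrief. The five-point scale, the threshold of 4 and the sample ratings are assumptions for illustration, not a description of the actual ATMDC procedure.

```python
# Minimal sketch: flagging high-workload periods from ISA ratings so they can
# be examined in the cognitive debrief. The five-point scale, the threshold of
# 4 and the ratings themselves are illustrative assumptions only.
isa_ratings = [
    # (elapsed minutes into the trial, ISA rating on an assumed 1-5 scale)
    (2, 2), (4, 3), (6, 4), (8, 5), (10, 4), (12, 3), (14, 2),
]

HIGH_WORKLOAD_THRESHOLD = 4

high_workload_periods = [t for t, rating in isa_ratings
                         if rating >= HIGH_WORKLOAD_THRESHOLD]

print("Minutes to revisit in the cognitive debrief:", high_workload_periods)
```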
Cognitive Debrief
During this part of the methodology the PUMA analyst and ATCO work through a structured cognitive debrief, using verbal protocol analysis. The aim is to capture all of the covert (cognitive) activities of the controller (including judgements, decision making, and planning). The covert actions are then entered into the OTA file, which contains all overt and covert actions associated with the period of controller work of interest.
Task Modelling
Controller actions are modelled in terms of their start and end times, and their times of occurrence. Tasks can then be defined in terms of which actions constitute which tasks. Typical tasks include coordination, accepting aircraft and resolving conflicts between aircraft. Tasks are then "generified", or averaged, to produce a standard version of each task in terms of its typical duration and composition; generification allows the differences between controller styles to be accounted for in the final task model.
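A minimal sketch of the generification step described above is given below. The task names, durations and data format are invented for illustration and do not reflect PUMA's internal representation.

```python
# Illustrative sketch of task "generification": averaging observed instances of
# each task type into a single standard version. Field names and figures are
# invented; they do not reflect PUMA's actual data format.
from collections import defaultdict
from statistics import mean

# Hypothetical observational task analysis (OTA) records: (task type, duration in seconds)
observed_tasks = [
    ("accept_aircraft", 12.0), ("accept_aircraft", 15.5), ("accept_aircraft", 13.2),
    ("resolve_conflict", 42.0), ("resolve_conflict", 55.3),
    ("coordination", 20.1), ("coordination", 17.8),
]

by_type = defaultdict(list)
for task, duration in observed_tasks:
    by_type[task].append(duration)

# The generic task is the averaged version of all observed instances of that type.
generic_tasks = {task: mean(durations) for task, durations in by_type.items()}

for task, duration in sorted(generic_tasks.items()):
    print(f"{task}: typical duration {duration:.1f} s")
```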
Workload Attribution
Once a model of generic tasks exists, the Workload Analysis Tool may be run. This particular tool uses a British Aerospace (BAe) implementation of Wickens' "M" model. The M model contributes to the calculation of overall workload on the basis of a number of limited capacity information processing resources, or channels, that allow the controller to undertake actions.
The use of a conflict matrix within PUMA adjusts the workload calculations in terms of the penalties (extra workload) incurred by two or more processing channels interfering with one another as they compete for cognitive processing time. The Workload Analysis Tool provides a representation of the calculated workload over time for the generic tasks examined by the toolset. In this way, the workload intensive tasks can be identified directly from the workload graph.
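The calculation described above might be sketched as follows. The channel names, demand values and conflict penalties are invented for illustration; they are not the coefficients of the BAe "M" model implementation used in PUMA.

```python
# Illustrative sketch of a multiple-resources workload calculation with a
# conflict matrix, in the spirit of the approach described above. Channel
# names, demand values and penalty weights are invented.
from itertools import combinations

# Demand placed on each processing channel by two concurrent actions (0..1).
concurrent_actions = {
    "monitor_radar":    {"visual": 0.6, "cognitive": 0.3},
    "rt_communication": {"auditory": 0.5, "verbal": 0.4, "cognitive": 0.3},
}

# Conflict matrix: extra workload incurred when two channels compete at once.
conflict_penalty = {
    ("cognitive", "cognitive"): 0.4,
    ("auditory", "visual"): 0.1,
}

def total_workload(actions):
    # Base workload: sum of channel demands across all concurrent actions.
    base = sum(d for demands in actions.values() for d in demands.values())
    # Penalties: every pair of concurrent actions is checked for channel conflicts.
    penalty = 0.0
    for (_, a), (_, b) in combinations(actions.items(), 2):
        for ch_a in a:
            for ch_b in b:
                key = tuple(sorted((ch_a, ch_b)))
                penalty += conflict_penalty.get(key, 0.0)
    return base + penalty

print(f"Workload estimate for the concurrent actions: {total_workload(concurrent_actions):.2f}")
```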
Task & Action Verification
Once tasks have been generified, their structure is then reviewed to ensure that they are operationally realistic. This process entails the examination and editing of generic tasks by a subject matter expert. Here, a suitably experienced controller will examine each generified task in terms of its constituent actions, their duration, sequencing and associated workload. Once the generic tasks and their associated workload profiles are approved, further analysis and optimisation can begin.
Workload Analysis
The workload associated with any particular generified task can be calculated so as to determine its contribution to the overall workload resulting from a sequence of tasks. This means that the high workload tasks, or task-combinations, can be studied and optimised in terms of their sequencing. The effects of implementing a new technology concept or HMI may also be explored. An additional facility for further task workload analysis is known as the static analysis. In such analysis, data are provided on the number of occurrences of each task, and on the percentage of time which the controller would spend performing that task or action. This information can aid in the identification of (predominant) tasks which might benefit most from optimisation.
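A minimal sketch of the static analysis just described, counting task occurrences and the share of the observed period spent on each, is given below; the timeline entries and the 20-minute period are illustrative assumptions.

```python
# Minimal sketch of a static analysis over a task timeline: occurrences of each
# task and the percentage of the observed period spent performing it.
# The timeline entries are invented for illustration.
from collections import Counter, defaultdict

observation_period_s = 20 * 60  # an analysed 20-minute period, as in the text

# (task type, start time in s, end time in s)
timeline = [
    ("accept_aircraft", 10, 25), ("coordination", 40, 60),
    ("resolve_conflict", 100, 150), ("accept_aircraft", 300, 312),
]

occurrences = Counter(task for task, _, _ in timeline)
time_on_task = defaultdict(float)
for task, start, end in timeline:
    time_on_task[task] += end - start

for task in occurrences:
    share = 100.0 * time_on_task[task] / observation_period_s
    print(f"{task}: {occurrences[task]} occurrence(s), {share:.1f}% of the period")
```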
Previous Applications
The following section describes two studies in which NATS has applied PUMA in different settings. In the first project, the Programme for Harmonised Air Traffic Management Research in Eurocontrol (PHARE), PUMA was used to analyse a real time simulation, make recommendations for change, and then analyse the subsequent real time simulation with and without the recommendations implemented. The second project briefly describes how PUMA was used in conjunction with fast time computer based simulations to provide a more detailed analysis of workload.
PHARE
PUMA has been used at various development stages of the Ground Human Machine Interface (GHMI) for the Programme for Harmonised Air Traffic Management Research in Eurocontrol (PHARE), a European future ATM concept that has both ground and airborne elements. PUMA analysis was undertaken on video data collected for both the tactical and planner en-route controller positions during the PHARE Demonstration 1 trials held in 1995. PUMA analysis identified areas of high controller workload when the PHARE GHMI was used under different operating conditions (or operational concepts). PUMA analysts were then, with the help of subject matter experts (SMEs), able to identify causes of high
workload, and to develop recommendations for improving tool interface design, information display, controller training and task structuring (Rainback et al 1997). Using the PUMA data collected from the PHARE Demonstration 1 trial, it was possible to model the recommended changes to the operational concept. The workload curves produced by the PUMA model then allowed the proposed changes to be refined before they were implemented in a later real-time simulation. The revised operational concept produced a marked reduction in controller workload. Under the original operational concept the controllers experienced difficulties maintaining control of the sector and operating in a composed manner. However, under the revised operational concept the controllers had a planned and structured means of operating, and were able to maintain control of their sectors. The PUMA findings were substantiated by comments made by controllers in the debrief who, when presented with the same traffic sample as used in the original operational concept but under the revised operational concept, said, "It's a different traffic sample, that's why it is so much easier to control". Another controller said, "you must have made huge changes to the user interface, because it's so much easier to use." In fact, the various changes recommended from the PUMA findings each required only very slight alterations to the interface (for example colour coding and highlighting).
TOSCA—Fast time model work
Work undertaken within the TOSCA project (Testing Operational Scenarios for Concepts in ATM, an investigation of the concept of free flight in European airspace) used PUMA to predict controller workload for scenarios generated by fast-time simulations of aircraft routing strategies. PUMA was used to add a finer level of detail to the workload calculations associated with fast time models. In this way the hypothetical workload calculations provided by the fast time model could be augmented with the PUMA data derived from actual controller-system interactions. The approach taken was to develop a PUMA model of an operational concept of air traffic control which was based on PHARE concepts, and to use data obtained from fast-time computer models of air traffic management. The observational task analysis model from PHARE was augmented with information on conflict resolution strategies obtained by questionnaire-based interviews with controllers. The PUMA model of workload was then generated by taking a sequence of events generated by the fast-time simulation and mapping these to events from the PHARE data. The resultant PUMA model was then used to predict the sequence of tasks that might be performed by tactical and planner controllers when presented with the developing traffic situation generated by a fast-time simulation (Kilner et al 1997).
Summary
The workload metric within PUMA is based on Wickens' Multiple Resource Theory (MRT). MRT provides a relatively sophisticated method of assessing workload in ATM. ATM is a relatively high workload environment in which the controller must task share. Using MRT enables the analyst to
ensure that the effects of competing resources within these tasks are assessed and measured. It is intended that in the near future several other workload metrics will also be imported into PUMA: total time on task, Time Line Analysis and Prediction (TLAP) and Visual Auditory Cognitive Psychomotor (VACP). All these metrics will be available within PUMA to ensure that the operational project under consideration has the most appropriate metric available. The process of validating the PUMA method and toolset will also begin in the near future. Initially PUMA, as a task analytic measure of workload, will be assessed against a measure of primary task performance and, where appropriate, against subjective and objective measures of workload. These measures are recorded as a matter of course during real time simulations held at the ATMDC. PUMA's greatest strength lies in its ability not only to measure workload but also to predict the workload associated with, as yet, undeveloped concepts of operation. This process (described above) allows those concepts of operation that are unlikely to yield benefits to NATS to be filtered from the development process before the move to more resource intensive real time simulation. The prediction of workload and the ability to measure workload, supplemented with the ability to accept data directly from fast time computer models, mean that PUMA is a highly flexible tool that can be incorporated at various stages in any design cycle. PUMA can be used to test design concepts at an early stage in the lifecycle of a project and also to undertake a workload comparison between several preferred options for a particular interface element. This paper has described the basis of a PUMA analysis of workload, how PUMA is applied within operational projects to refine proposed methods of operation, and how it adds value to fast time computer simulations.
References
Hook, M.K., 1993, PUMA version 2.2 User Guide, Roke Manor Research Ltd Internal Report X27/HB/1320/000.
Kilner, A.R. and Turley, N.J.T., 1984, Development and assessment of a personal weighting procedure for the ISA tool, ATCEU Internal Note No 69.
Kilner, A., Hook, M. and Marsh, D.M., 1996, TOSCA WP8: Workload Assessment, Report reference TOSCA/NAT/WPR/08.
Moray, N. (ed), 1979, Mental Workload: Its Theory and Measurement, New York: Plenum Press.
Rainback, F., Hudson, D. and Lucas, A., 1997, PD1+ Final Report, Eurocontrol PHARE/CAA/PD3–5.2.8.4/SSR; 1.
Wickens, C.D., 1992, Engineering Psychology and Human Performance, New York: HarperCollins.
ASSESSING THE CAPACITY OF EUROPE'S AIRSPACE: THE ISSUES, EXPERIENCE AND A METHOD USING A CONTROLLER WORKLOAD
Arnab Majumdar
Centre for Transport Studies, Department of Civil Engineering, Imperial College of Science, Engineering and Technology, South Kensington, SW7 2BU
European airspace often operates at or beyond capacity, leading to substantial delays and inefficiencies. The design, planning and management of European airspace is a highly complex task, involving many issues. Prime amongst these issues is the level of the workload of the air traffic controllers. This paper considers the issues involved in airspace capacity determination and its relation to the controller's workload capacity. The impact of sector and air traffic features on the controller's workload is described, whilst the last section outlines ongoing research work at Imperial College to estimate airspace capacity given the various factors involved.
1. Introduction
The air traffic control (ATC) system plays an integral part in the safe and orderly movement of air traffic and relies upon a balance between technology and air traffic controllers. Air traffic doubled in Europe during the last decade, much in excess of the predictions upon which the developments of the national ATC systems were based. A recent study (ATAG, 1992) forecasts that the total number of flights in Western Europe will increase by 54% from 1990 to 2000, and by 110% from 1990 to 2010, leading to more than 11 million flights a year in Western Europe. This air traffic is unevenly spread throughout the continent, with a core area where traffic density is the highest. Modelling studies show that this area will expand, and that there will soon be a situation of almost impossible air traffic density where the skies are busiest over Europe. This will lead to an ever-increasing workload being placed upon the ATC network in Europe; it is generally held that workload within the ATC system has risen dramatically in the past few years, with further growth predicted in the future. There has also been an increase in the complexity of traffic. Responding to this, the aviation authorities in Europe introduced the European Air Traffic Control Harmonisation and Integration Programme (EATCHIP), administered by EUROCONTROL (European Organisation for the Safety of Air Navigation), to integrate and harmonise the present, disparate European ATC systems by 2000, and then to design a future air traffic control system. This relies heavily upon high levels of technology and automation to aid controllers (Majumdar, 1994). Currently, in most Western European en-route airspace sectors, controller workload is the primary limitation of the ATC system—a concept which is difficult to understand, comprising both tasks required directly for the control of individual aircraft and associated tasks. This situation is likely to remain in the short to medium term, and thus controller workload will remain the dominant factor in determining sector and system capacity; any modification of the airspace structure which reduces controller workload should increase airspace capacity.
2. Airspace Capacity
Unlike in land transport, the meaning of the term capacity for airspace is non-trivial. In road design, for example, it is possible to estimate capacity in terms of the number of vehicles when the flow on the road is saturated.
Similarly, for a railway line, it is possible to estimate the maximum number of trains permissible, given the safety requirements and level of signalling technology, on any particular railway.
For certain sectors, EUROCONTROL have available the declared capacity for that sector, in terms of the number of flights per hour. However, these figures need to be treated with caution. First of all, they are the capacities that controllers declare available for their particular sector for the peak hour at a given time period. Such a figure gives no indication of the traffic mix behind it, such as the proportion of aircraft ascending or descending, nor of the type of aircraft, e.g. large or small. Furthermore, the values of these capacities change after a period of time depending upon the season, additional technology, etc. A better indicator of the capacity of a sector than simply the aircraft per hour is capacity miles, measured in nautical miles per hour. This is defined, for a sector, as the declared capacity multiplied by the average route length within that sector. It is an indicator of the real capacity of the sector in that it measures the capability of a sector to resolve "real transportation" problems, including both the traffic flow and flight distance parameters. It is, in effect, the maximum flight miles which can be controlled in a given sector over a long period. But again, given that one of the terms is the declared capacity, the concerns with this figure noted above remain.
In considering the capacity of the airspace system, there is a need to consider three factors (Stamp 1990): the physical pattern of routes and airports; the pattern of traffic demand, both geographic and temporal; and any ATC routing procedures designed to maximise the traffic throughput. The prime concern of the ATC authorities is the total number of flights that can safely be handled, rather than the number of passengers or other measures. System capacity expressed by these measures is linked in a complex fashion by average aircraft size, load factors and demand for non-passenger flights. Forecasts of these depend in turn upon assumptions about market forces, about diurnal, weekly and seasonal demand patterns, and about any anticipated responses of airlines to these and to any predicted capacity constraint. A second factor is the time span for the capacity estimate, e.g. a peak hour or day, a "typical" busy hour or day, or a whole year. Linkages between such measures are complex and depend upon the diurnal, weekly and seasonal demand profiles. The tendency has been for such diurnal and seasonal profiles to become flatter. Finally, there is a need to specify the geographical area considered, e.g. from an individual ATC sector all the way to the European air traffic network. In general, the wider the area being considered, the more complex is the task of estimating capacity, since such an estimate must be built up in stages, starting with the capacities of the individual airports and airspace sectors and progressing through to the individual ATCCs. In a system agreed to be operating at its capacity limit, there is likely to be some room for extra flights on particular routes or at particular times. This unused "spare capacity" exists because it is presumably not sufficiently attractive in economic terms for airlines to operate additional services. Therefore, the total system capacity will not be the simple sum of the capacities of all the constituent parts, but can only be estimated once the pattern of demand is specified, with different patterns giving different capacities.
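As a worked illustration of the capacity-miles measure defined above (the figures used here are assumed purely for illustration), a sector with a declared capacity of 40 flights per hour and an average route length of 60 NM would have

$$\text{capacity miles} = \text{declared capacity} \times \text{average route length} = 40\ \text{flights/h} \times 60\ \text{NM} = 2400\ \text{NM/h}.$$

The same declared capacity over a sector with shorter average routes would yield a proportionally smaller figure, which is why the measure better reflects the "real transportation" work done by the sector.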
A system as complex as the UK or European ATC network is not static, but instead changes dynamically so as to react to expected capacity constraints. Consequently, the estimate of total capacity is likely to be more robust if several parts of the system “saturate” at about the same time. In such a situation, the opportunity for finding relatively simple ways of alleviating the constraints is much reduced. The geographical division of airspace over continental areas of Europe is based upon national boundaries, with straight lines as limits over maritime areas and analysis of this operational division shows that the limits of a sector have usually been defined in relation to the acceptable workload within its airspace (EUROCONTROL, 1990). When the traffic levels increase, subdivision of the sectors has been the normal method of ensuring acceptable workload limits. However, the biggest regions of Europe have now reached the limit at which further subdivision results in an increase of the coordination workload outweighing the decrease generated by the reduction of traffic handled. In addition, the room for manoeuvre in a sector is reduced with subdivision. In this connection, it should be noted that the airspace as such is not saturated, but that the flow of data to be processed by the controller is becoming excessively heavy. Therefore,
it is the controller who is saturated in terms of the tasks that he must do. Therefore, in most Western European en-route airspace sectors—with the current level of technology—the air traffic control sector capacity is determined primarily by the workload of the controller, both for directly observable tasks and for mental tasks that also need to be done if the traffic is to be handled safely. In terms of simple traffic flow into a sector, i.e. the number of aircraft irrespective of their size or attitude, the traffic load of the sector defines the average hourly traffic demand between 06:00 and 18:00. It is an indicator for controller workload (used for estimating capacity exploitation), and the hourly traffic load is proportional to the routine workload. Often, instead of the hourly flow, the number of aircraft—ignoring attitude and size—within a sector at any instant is an equally good indicator for a sector. This represents the number of aircraft a controller must control at any time, and obviously too large a number will lead to control difficulties. This instantaneous load defines the average instantaneous number of aircraft within a sector and is proportional to the monitoring workload of the controller.
3. Sector Complexity
The concept of sector complexity is one which poses considerable problems. An FAA review of 1995 (Mogford et al. 1995), based on the literature available, defined ATC complexity as a construct—a process which is not directly observable, but gives rise to measurable phenomena—that is composed of a number of sector and traffic complexity dimensions or factors. These factors can be either physical aspects of the sector, e.g. size or airway configuration, or those relating to the movement of traffic through the airspace, e.g. the number of climbing flights. Some factors cover both sector and traffic issues, e.g. required procedures and functions. The FAA ATC complexity term refers to the effect on the controller of the complexity of the airspace and the air traffic flying within it. In theory, the structure of a sector is separate from the characteristics of the air traffic. However, when considering ATC complexity, it is not useful to separate these concepts and consider them in isolation. A certain constellation of sector features might be easy to handle with low traffic volume or certain types of flight plans. More or different traffic might completely change this picture. When there is no traffic in the sector, there is no complexity (i.e. there is no effect on the controller). On the other hand, a given level of traffic density and aircraft characteristics may create more or less complexity depending on the structure of the sector. Traffic density alone does not define ATC complexity, but it is one of the variables that influences complexity and so is a component of complexity. Its contribution to ATC complexity partially depends on the features of the sector. ATC complexity generates controller workload, and it is the thesis of the FAA review that controller workload is a construct influenced by four factors. The primary element consists of a constellation of ATC complexity factors. Secondary components (acting as mediating factors) include: the cognitive strategies the controller uses to process air traffic information; the quality of the equipment (including the computer-human interface); and individual differences (such as age and amount of experience). Controller workload originates from the sector and the aircraft within it.
The procedures required in the sector, the flight plans of the aircraft, traffic load, weather and other variables form the basis for the tasks the controller must complete. The amount of workload experienced by the controller may also be modulated by the information processing strategies adopted to accomplish the required tasks. Such techniques have been learnt in training or evolved on the job, and may vary in effectiveness. The influence of a complex ATC environment on workload can be ameliorated through the use of strategies that will maintain safety through, for example, simpler or more precise actions. The effect of equipment on workload is also relevant to ATC. The controller's job may be made easier if a good user interface and useful automation tools are available. This will ensure that adequate and accurate information is presented to the controller to allow for effective task completion. Personal variables, e.g. age, proneness to anxiety and amount of experience, can also influence workload. Variations in skill between controllers can be
pronounced. These factors can have a strong effect on the workload experienced by a given controller in response to a specific array of ATC complexity factors.
4. The use of the RAMS controller workload model in airspace capacity research
At Imperial College we have begun a study of the factors involved in ATC capacity for Europe, and of their quantitative loading, as a function of the controller's workload. To estimate the airspace capacity of Europe's system, the following steps are required:
• defining the physical characteristics of the system—simple but time-consuming once the airspace sectorisation has been designed
• using the RAMS methodology
To study the effectiveness of control measures on the flows of traffic being simulated, the RAMS simulator (EUROCONTROL, 1995), developed by EUROCONTROL, is used to measure workloads associated with existing or proposed ATC systems and organisations. Due to its conflict detection/resolution mechanisms, and its flexible user interfaces and data preparation environment, RAMS allows the user to carry out planning, organisational, high-level or in-depth studies of a wide range of ATC concepts. RAMS has access to the EUROCONTROL database maintained for description and definition of the European aviation environment, e.g. airspace, airports and traffic loadings, to aid data preparation. It then simulates the flow of traffic through the defined area using accurately modelled 4-dimensional flight profiles for any one of 300 currently supported aircraft types. Realistic simulation is helped by the use of advanced conflict detection algorithms and rule based resolution systems. During the simulation, recordings are made to assess the associated workloads placed upon the relevant controllers required to undertake the simulated activity. In the RAMS model, each control area is associated with a sector, which is a 3-dimensional volume of airspace as defined in the real situation. Each sector has two control elements associated with it, Planning Control and Tactical Control, which maintain information regarding the flights wishing to penetrate them, and have associated separation minima and conflict resolution rules that need to be applied for each control element. To obtain the tasks for the controllers, RAMS uses the ATC Tasks Specification used for the European Airspace Model (EAM). This lists a total of 109 tasks undertaken by controllers, together with their timings and position, for a number of reference sectors in Europe. These tasks are grouped into five major areas: internal and external co-ordination tasks; flight data management tasks; radio/telephone communications tasks; conflict planning and resolution tasks; and radar tasks. The reference sectors chosen cover the core European area upper airspace, and include sectors in the LATCC region, the Benelux countries, France and Germany. The capacity of the sector is estimated by identifying the individual tasks that the controller must undertake, the time needed to achieve each task, and their frequency for a particular pattern of traffic and routings. The total workload can then be estimated by summing these times. The capacity of the sector is the traffic flow such that the estimated workload does not exceed a set criterion (EUROCONTROL, 1996). However, the question of the set criterion for controller workload at capacity is also complex. In the UK, where the DORATASK model of controller workload is used, the following definition of controller capacity has been agreed with operational ATC managers (Stamp, 1992): a controller cannot be fully occupied, on average, for more than 48 minutes of an hour, and a controller can only work in excess of 57 minutes in any one hour in forty. This guards against traffic flows that produce workloads in excess of 57 minutes in any one hour. In using the EAM, which is the precursor of the RAMS model of controller workload, there are two values generally used by EUROCONTROL in the interpretation of controller loadings: the PEAK HOUR PERCENTAGE LOADING and the AVERAGE PERCENTAGE LOADING.
To assist in the interpretation of these loadings, approximate criteria are used to describe each loading (figures represent percentages of 60 minutes):
Severe: PEAK HOUR loading >70%; AVERAGE (3 hour) loading >50%
Heavy: PEAK HOUR loading >55%; AVERAGE (3 hour) loading >40%
Moderate: PEAK HOUR loading <55%; AVERAGE (3 hour) loading <40%
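The following is a minimal sketch of a task-time-summation capacity check of the kind described above. The task times and frequencies are invented; only the 48-minutes-per-hour occupancy criterion and the peak-hour loading bands are taken from the text.

```python
# Minimal sketch of a task-summation workload check of the kind described
# above. Task times and per-flight frequencies are invented; the 48-minute
# criterion and the loading bands below follow the figures quoted in the text.

# Average time per task (seconds) and occurrences per aircraft handled
# (illustrative figures only).
task_times_s = {"radar_monitoring": 25, "rt_communication": 30,
                "coordination": 20, "conflict_resolution": 40}
tasks_per_aircraft = {"radar_monitoring": 2, "rt_communication": 3,
                      "coordination": 1, "conflict_resolution": 0.3}

def workload_minutes(flights_per_hour):
    seconds_per_flight = sum(task_times_s[t] * tasks_per_aircraft[t] for t in task_times_s)
    return flights_per_hour * seconds_per_flight / 60.0

def sector_capacity(criterion_min=48):
    # Largest hourly flow whose estimated workload stays within the criterion
    # (cf. the DORATASK figure of 48 occupied minutes per hour).
    flow = 0
    while workload_minutes(flow + 1) <= criterion_min:
        flow += 1
    return flow

def peak_hour_band(loading_pct):
    # Bands follow the EAM interpretation criteria quoted above.
    if loading_pct > 70:
        return "severe"
    if loading_pct > 55:
        return "heavy"
    return "moderate"

capacity = sector_capacity()
demand = 12  # an assumed hourly demand to classify
loading = 100 * workload_minutes(demand) / 60
print(f"Estimated sector capacity: {capacity} flights per hour")
print(f"Loading at {demand} flights per hour: {loading:.0f}% ({peak_hour_band(loading)} peak-hour loading)")
```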
The main research concerns determining the impact of various sector, route and traffic factors on the controller's workload, and then determining the "equivalence factors" for the effect of different types of aircraft movement on the controller's workload. For the purposes of our research, the modulating factors on the amount of workload experienced by the controller are ignored. We take as the starting point the concluding remarks of the study on sector complexity by the FAA (Mogford et al. 1995), that in any future study of ATC complexity "it would be more beneficial to focus further investigation on ATC complexity on refining our understanding of the complexity factors so that intelligent sector design and traffic management studies become feasible. It should be possible to discover how much weighting each salient complexity factor has in determining overall complexity and controller workload. In this way, ATC environments could be created that have predictable effects on the controller" (p20). Table 1 outlines some of the sector, route and traffic characteristics whose impact on controller workload, and ultimately capacity, is to be determined using RAMS. The technique required in this case is response surface methods, used to estimate the nature of the surface.
TABLE 1. Some Airspace characteristics in Europe
Another aspect of the research is to develop "equivalence factors" for aircraft movements, similar to the concept of passenger car units (pcus) which is widely used in urban road traffic design, and in particular at signalised junctions. The saturation flow of a traffic stream at a signal depends upon, amongst other items, the number of heavy vehicles in the stream: as their number increases, so the saturation flow decreases. This effect is represented by ascribing passenger car units (pcus) to vehicles of various classes (heavies, two-wheelers, etc.) so that saturation flow can be written as a constant number of pcus per unit time. Similarly, it seems intuitive that different types of flow will have different impacts on the capacity of a sector, e.g. it should be relatively easy to control a sector where there are, say, twenty flights all in cruising attitude, as opposed to a sector where there are 10 flights ascending and 10 descending. The additional workload elements involved in the latter case will include those of greater conflict detection, mental separation calculations, etc. Indeed, from earlier work undertaken by the CAA (Stamp, 1992) with DORATASK, it is known that different types of flight movements have differential impacts on controller workload. As has been stated earlier, in terms of air traffic capacity in Europe, it is controller saturation which determines capacity. These equivalence factors are simply weighting factors which allow account to be taken of the influence of different types of aircraft attitude on the flow through a sector when the controller is working at "capacity", i.e. the flow at "controller saturation".
References
ATAG (1992) European traffic forecasts, ATAG: Geneva, Switzerland.
EUROCONTROL EXPERIMENTAL CENTRE (1995) RAMS system overview document, Model Based Simulations Sub-Division, EUROCONTROL, Bretigny-sur-Orge, France.
EUROCONTROL EXPERIMENTAL CENTRE (1996) RAMS User Manual Version 2.1, EEC/RAMS/UM/0013, Model Based Simulations Sub-Division, EUROCONTROL, Bretigny-sur-Orge, France.
Majumdar, A. (1994) Air traffic control problems in Europe—their consequences and proposed solutions, Journal of Air Transport Management, 1(3), 165–177.
Mogford, R.H., Guttman, J.A., Morrow, S.L. and Kopardekar, P. (1995) The complexity construct in air traffic control: a review and synthesis of the literature, DOT/FAA/CT-TN92/22, Department of Transportation/Federal Aviation Administration Technical Center, Atlantic City, NJ.
Stamp, R.G. (1992) The DORATASK method of assessing ATC sector capacity—an overview, DORA Communication 8934, Issue 2, Civil Aviation Authority, London.
EVALUATION OF VIRTUAL PROTOTYPES FOR AIR TRAFFIC CONTROL—THE MACAW TECHNIQUE Peter Goillau, Vicki Woodward, Chris Kelly and Gill Banks Air Traffic Control Systems Group (ATCSG), DERA, St. Andrew’s Road, Malvern, Worcestershire, WR14 3PS, UK
In a complex domain such as Air Traffic Control (ATC) it is not always possible to prototype a system and its human-computer interface for assessment before implementing the final system. There is a requirement for a technique which enables a human factors input to be made at the earliest conceptual stages of the system lifecycle. The MAlvern Cognitive Applied Walkthrough (MACAW) approach builds on Software Engineering and Cognitive Walkthroughs, but adds a number of practical and applied dimensions geared towards the specific needs of the ATC domain. The MACAW technique was employed during a project to assess different options for automation in future Air Traffic Management (ATM) systems. Paper specifications supplemented by user interface visualisations were used as ‘virtual prototypes’ for preliminary assessment. Results from representatives of the user population yielded valuable early insights into the likely benefits, problems and usability of the ATM automation approaches.
Introduction
Conventional human factors wisdom (Hopkin, 1995) advocates prototyping a system and its Human-Computer Interface (HCI) for expert assessment, before proceeding to implement the final system. However, constructing a fully working prototype in a complex domain such as Air Traffic Control (ATC) is a substantial activity, requiring significant resources of time and manpower. The use of appropriate software tools can ease the prototyping process, but again the learning curve and the necessary programming support associated with such tools are not trivial. In the real world, there is a recurring problem of the first human factors assessments occurring too late, often well into the system development lifecycle. There is therefore a concomitant requirement for a structured but flexible technique, beyond the so-called 'cheap and dirty' usability methods, which enables a human factors input to be made at the earliest conceptual stages of the lifecycle when only a paper specification exists. The present paper addresses this gap in available techniques by extending walkthroughs using paper-based 'virtual prototypes'.
Background
The evaluation technique developed and employed by the DERA ATCSG is known as MACAW (MAlvern Cognitive Applied Walkthrough). The MACAW approach builds on the work on Cognitive Walkthroughs and Software Engineering design walkthroughs, but adds a number of practical and applied dimensions geared towards the specific needs of the ATC application domain.
Pluralistic Walkthroughs
MACAW takes as its starting point the work of Bias (1991) on multi-user, informal Pluralistic Walkthroughs. Bias' work at IBM used three types of 'expert': product developers, human factors specialists and representatives of the expected user population. The multi-user paradigm gave valuable perspectives from different viewpoints, and its informal approach was found to be a useful way of articulating interface usability issues before building a prototype or implementing software.
Cognitive Walkthroughs
The walkthrough technique is derived from the code walkthrough review method of Software Engineering (Preece et al, 1994). As in the software version, the goal of HCI design walkthroughs is to detect potential problems early on in the design process so that they may be avoided. System designers 'walk through' a set of well-defined tasks, estimating the actions necessary to achieve the tasks and predicting end users' behaviour and problems. An inherent problem with this approach—particularly for novel systems—is that the evidence on which to make reliable predictions of user behaviour may not be available (given that the end users themselves are not represented in the walkthroughs). Polson et al (1992) describe their Cognitive Walkthrough as a hand simulation of the cognitive activities of a user. They also aim to identify potential usability problems by taking a micro-level, strongly cognitive stance closely mirroring cognitive task analysis. The walkthrough process is structured around specific questions which embody cognitive psychological theory; thus the reviews are particularly appropriate for investigating how well the proposed interface meets the cognitive needs of the intended users. However, a potential problem once again is that no actual users are involved, so Polson's approach relies heavily on how familiar the system designers conducting the walkthrough are with the cognitive set and the domain knowledge of their intended users. Although the strengths of walkthroughs are acknowledged, the overly formal and detailed academic stance and the lack of user input are significant drawbacks.
The MACAW technique MACAW aims to retain the firm cognitive theoretical foundations and multi-user perspectives of these previous approaches, but also to add a level of practicality relevant to the evaluation of future Air Traffic Management (ATM) automation options, derived from experience of conducting human factors ATM trials at DERA Malvern (Kelly and Goillau, 1996). The underlying agenda was to assess the strengths, weaknesses and usability issues of selected automation concepts, so that an informed decision could be made regarding their future selection and implementation in ATM projects.
MACAW key features
• representative users from the scientific and Air Traffic Control Officer (ATCO) populations
• written high-level paper specifications of controller tasks, categorised into the cognitive activities of Communication/Monitoring/Decision-Making/Planning/Negotiation (based on previous task analyses of ATC)
• coverage of the Departure/En-Route/Approach/Landing phases of an aircraft’s typical flight across Europe
• screen shots of visualised exemplar ATM automation concept interfaces
• framework of semi-structured interviews using a questionnaire format
• video-taping of walkthroughs for later off-line analysis
MACAW questionnaire A standard questionnaire (Figure 1) was used to facilitate the walkthrough of:
• task actions, associated goals and any task-goal mismatches
• perceived advantages and problems likely to be encountered with the automation
• implications for cognitive processing: memory, perception, understanding, learning
• estimates of ATCO performance, workload and capacity (Goillau and Kelly, 1997)
• estimates of errors and error types: commission and omission
• implications for other issues: timing, system failure, feedback needs, locus of control
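For illustration only, the kind of record such a walkthrough might yield for each task step can be sketched as a simple structured type. The field names below are hypothetical; they merely mirror the questionnaire headings above and are not taken from the MACAW materials.

from dataclasses import dataclass, field
from typing import List

# Hypothetical record for one MACAW walkthrough step; the field names simply
# mirror the questionnaire headings above and are not taken from the paper.
@dataclass
class WalkthroughStep:
    cognitive_activity: str                 # Communication / Monitoring / Decision-Making / Planning / Negotiation
    flight_phase: str                       # Departure / En-Route / Approach / Landing
    task_actions: List[str] = field(default_factory=list)
    goals: List[str] = field(default_factory=list)
    goal_mismatches: List[str] = field(default_factory=list)
    perceived_advantages: List[str] = field(default_factory=list)
    perceived_problems: List[str] = field(default_factory=list)
    cognitive_implications: List[str] = field(default_factory=list)   # memory, perception, understanding, learning
    performance_estimates: str = ""         # workload and capacity judgements
    error_types: List[str] = field(default_factory=list)              # commission / omission
    other_issues: List[str] = field(default_factory=list)             # timing, system failure, feedback, locus of control

# Example entry for one step of a walkthrough session (content invented)
step = WalkthroughStep(
    cognitive_activity="Monitoring",
    flight_phase="En-Route",
    task_actions=["scan radar label for conflict alert"],
    perceived_problems=["trust in the automation's conflict detection"],
)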
Application of the MACAW technique As part of the European Commission funded ‘RHEA’ project (Role of the Human in the Evolution of ATM Systems), DERA’s role was to assess selected options for automation in future Air Traffic Management systems. For the purposes of the RHEA project, the MACAW walkthrough technique involved interviewing six subjects as they talked through the ATC scenario and automation options. Two in-house scientific experts participated in the first sessions as a pilot study. This resulted in a number of refinements to the experimental procedure and MACAW questionnaire. Four very experienced, retired ATCOs were then used as subjects in the main experimental sessions. Each scientific expert or ATCO was walked through individually by a two-person team of questioner and observer. The ATC scenario and three ATM automation options had been defined on paper as high-level task specifications. Time did not permit the construction of working prototypes. Also available were screen snapshots of a typical European flight region obtained from the DERA Real-time ATC Facility and Testbed (RAFT). These snapshots were imported into Microsoft PowerPoint, and the latter’s drawing facility was used to visualise potential exemplar interfaces for each of the automation concepts. The paper specifications, supplemented by the user interface visualisations, were thus used as ‘virtual prototypes’ and provided a starting point for detailed MACAW discussions. A semi-structured interview was employed for each walkthrough, using the MACAW questionnaire as a framework. A written instructions sheet and an explanation of the questionnaire terms were also employed. Each ATC scenario and automation option took approximately half a day to complete. At the end of the complete set of four walkthroughs, each subject was encouraged to compare and contrast the three ATM automation concepts, again using the MACAW questionnaire as a framework.
Figure 1. MACAW Questionnaire
Observations on MACAW use There is clearly a limit to the quantity and quality of comments that can be gleaned from paper specifications and static pictures of exemplar screen interfaces. However, in the absence of working prototypes the MACAW approach was found to be very effective in involving controllers and incorporating their wealth of operational experience into the early assessment process. The interviews provided a valuable insight into the controllers’ opinions about the ATM automation concepts, and were usefully combined with the opinions of human factors experts and system designers. The individual interviews yielded a rich set of data, including the likelihood and the types of errors which might occur. Videoing the interviews proved to be a good backup mechanism, and having a questioner and observer team worked well. The MACAW questionnaire was helpful as a standard way of structuring the interview and looking at different aspects of automation usability, though it was not always possible to separate out each cognitive component during the walkthroughs. Considering the ATC scenario and automation options, some trends were evident in the subjects’ preferences and comments. (These will be covered in a separate publication). A wide range of views was received from the subjects, though to identify any general ATCO preferences would require a much larger sample size. The high-level paper specifications were generally felt to contain insufficient detail and were refined by the ATCOs in the course of the walkthroughs. All subjects felt that trusting the automation was a major issue, as were the speed and accuracy of the automation responses. Although all the automation concepts were construed as providing potential benefits, each ATCO suggested different improvements to the concepts which might be implemented in the final design. This stresses the importance of including end users as active participants and stakeholders in the design and evaluation process.
Conclusions and recommendations MACAW is a promising extension of the walkthrough approach. It was used successfully to investigate the strengths, weaknesses and usability issues of potential ATM automation concepts as ‘virtual prototypes’ prior to formal prototyping and system implementation. It was found beneficial to combine multiple stakeholder viewpoints from ATCOs and human factors experts. Some interesting insights into future automation system use and likely errors emerged in the course of this study. Problems were encountered with the small number of subjects and with insufficient detail in the scenario specifications; these should be borne in mind for future MACAW validation work. At present, the MACAW approach can be commended as a cost-effective technique for involving the users and initially assessing usability aspects of virtual prototypes in future Air Traffic Management systems. It requires more extensive validation. The technique might be suitable for other complex application domains such as command and control, process control and aerospace.
References
Bias, R. 1991, Walkthroughs: efficient collaborative testing, IEEE Software, September 1991, 94–95
Goillau, P.J. and Kelly, C.J. 1997, MAlvern Capacity Estimate (MACE)—a proposed cognitive measure for complex systems. In Harris, D. (ed.) Engineering Psychology and Cognitive Ergonomics, Volume 1: Transportation Systems, (Ashgate Publishing, Aldershot), 219–225
Hopkin, V.D. 1995, Human factors in Air Traffic Control, (Taylor & Francis, London)
Kelly, C.J. and Goillau, P.J. 1996, Cognitive Aspects of ATC: Experience from the CAER and PHARE simulations. Paper presented at Eighth European Conference on Cognitive Ergonomics (ECCE’8), University of Granada, Spain, 10–13 September
Polson, P.G., Lewis, C., Rieman, J. and Wharton, C. 1992, Cognitive Walkthroughs: a method for theory-based evaluation of user interfaces, International Journal of Man-Machine Studies, 36, 741–773
Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S. and Carey, T. 1994, Human-Computer Interaction, (Addison-Wesley, Wokingham)
Acknowledgements The RHEA project was part-funded between 1996 and 1997 under the European Commission’s RTD Transport Programme DG VII Directorate contract No AI-95-SC.107. The RHEA partners were NLR (Netherlands), Sofréavia (France), Thomson-CSF Airsys (France), NATS (UK) and DERA (UK). The views expressed in this paper are the authors’ and are not necessarily those of DERA, the other RHEA partners or the European Commission. © British Crown Copyright 1998/DERA Published with the permission of the Controller of Her Britannic Majesty’s Stationery Office
DEVELOPMENT OF AN INTEGRATED DECISION MAKING MODEL FOR AVIONICS APPLICATION Doug Donnelly*, Jan Noyes** and David Johnson*
* Faculty of Engineering, University of Bristol Bristol BS8 1TN, UK ** Department of Experimental Psychology, University of Bristol 8 Woodland Road, Bristol BS8 1TN, UK
Ever since the first commercial flight, the amount and complexity of information available to flight deck crew have continued to increase. Although modern avionics systems have provided many benefits for aircraft operations, the advent of automation can lead to a decrease in crew awareness, especially in abnormal situations. One solution is the development of error tolerant systems which not only aid the crew in detecting and diagnosing problems, but also provide feedback to the crew on their actions. This paper will propose an integrated decision model, which has been developed taking into account the known characteristics of the civil flight deck decision environment and the human decision making capabilities of crew, with a particular focus on situation awareness. The model identifies points in the decision process where errors may be made and suggests that these may be used as intervention points for decision support, to prevent errors or to help recover from them.
Introduction The flight deck is a unique environment for decision making: it is complex, dynamic, and subject to distractions, time pressure and at times an overload of information. Consequently, in order to design systems for use on the flight deck, it is essential to understand how crew act under certain conditions, how they respond to certain situations, and most importantly how they make decisions (Abbott et al., 1996). There have been several attempts in the past to form models of human decision making, the most successful of which tend to be based on Naturalistic Decision Making (NDM) theories. NDM research considers decision making in operational settings with experienced operators and so it is eminently suited to the civil flight deck. However, due to the unique characteristics of the flight deck, there are certain aspects of crew decision making which do not seem to be covered by existing NDM models and theories. Furthermore, those theories which are aimed specifically at aviation decision making do not seem to capture the complete picture. A model of crew decision making is
needed which can bridge the gap between understanding decision making and improving it. This paper outlines an Integrated Decision Model (IDM) which draws on other theories but is more descriptive of crew behaviour and applicable to supporting crew decision-making. This model highlights areas of weakness in decision making and the types of errors that can be made; it may also be used to point to areas where these errors may be detected, and where decision support may intervene to correct them.
An Integrated Decision Model There are certain characteristics of crew decisions which are essential to understanding flight deck decision making. First and perhaps most important is ‘Situation Awareness’ (SA). Many believe that a good SA is the key to effective decision making (Orasanu, 1995). Endsley (1994) described SA as consisting of three levels: level 1 SA concerns the perception of events, level 2 involves a comprehension of these events, and level 3 is the projection of future developments. This is very similar to Klein’s Mental Representation (MR) which he outlined in his Recognition-Primed Decision (RPD) model (Klein, 1993). This MR consists of knowledge of what is happening (similar to level 1 SA), knowledge of the rules governing the situation (level 2 SA), and knowledge of possible consequences, or expectancies for the future (level 3 SA). The key difference in aviation decision making is that the crew begins with a high SA which may degrade over time, unlike other experienced decision makers such as fire fighters, who acquire SA as the situation clarifies. This is an important reversal since a potential for error occurs when SA degrades (i.e. when the crew’s MR differs from the real situation), as opposed to when a situation is not correctly assessed.
Figure 1. The proposed Integrated Decision Model
It shows that the crew’s MR, and the difference between this and the actual situation, play a key role in the decision process, since any decisions are based on this. In the case of the flight deck, where procedures have been previously determined, experience is essential in matching the information and cues to a familiar situation, in order to maintain the MR and to know which procedure is relevant. This is where Klein’s RPD model is most appropriate. The model shows that there are three paths which the crew may take in making a decision. If there is not enough information, or the situation is complex, s/he may seek more information to clarify his/her representation of the situation. If the crew is satisfied with the representation, s/he may form intentions to act, may consider the consequences of these actions, and may even perform a mental simulation such as that described by Klein (1993). However, under certain circumstances, a short cut may be taken, which bypasses this process of forming intentions and considering consequences. When a situation is routine or if there is time pressure, the person may act or react automatically. This automaticity seems to be the key to the versatility of human decision making and problem solving, but can also be its downfall. Finally, there will be effects and consequences of the crew’s actions, or failure to act. Since aviation decision making is a continuous process, these effects will feed back to the crew in the form of changing events and trends. This feedback is a vital part of decision making. It is a kind of fail-safe, a way of detecting errors and correcting them. If the situation changes unexpectedly, or if the feedback does not correspond to the crew’s MR, they will have to adjust their MR, or take a fresh look at the situation. Many actions or errors on the flight deck are recoverable, but if the crew is distracted, is operating in poor conditions (e.g. at night or in bad weather), or has too high a workload, this vital feedback may be missed. This is where errors turn into accidents.
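Purely as an illustrative caricature, and not as anything taken from the paper, the three paths and the feedback loop described above can be expressed as a simple selection rule; the thresholds, scales and function names below are invented for exposition.

# Caricature of the IDM's three decision paths; thresholds, names and the
# scoring scheme are invented for exposition, not taken from the paper.
def choose_path(mr_confidence: float, routineness: float, time_pressure: float) -> str:
    """Pick which IDM path a crew member might take on this decision cycle.

    mr_confidence : how well the mental representation (MR) is believed to
                    match the actual situation (0 = no idea, 1 = certain)
    routineness   : how familiar/automatic the situation feels (0-1)
    time_pressure : how little time is available (0 = ample, 1 = none)
    """
    if routineness > 0.8 or time_pressure > 0.8:
        # Short-cut path: act automatically, bypassing consideration of consequences
        return "automatic action"
    if mr_confidence < 0.5:
        # Representation judged inadequate: seek more information to clarify the MR
        return "seek information"
    # Deliberate path: form intentions, consider consequences, mentally simulate
    return "form intentions and consider consequences"

def update_mr(mr_confidence: float, feedback_matches_mr: bool) -> float:
    """Feedback loop: if the effects of actions do not match the MR, confidence degrades."""
    return min(1.0, mr_confidence + 0.1) if feedback_matches_mr else max(0.0, mr_confidence - 0.3)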
Experimental Study An important aspect of the IDM, and one which often leads to error, is the short-cut path which bypasses the consideration of consequences. The two main conditions under which this route is taken are time pressure and automaticity (when actions or situations become routine). An experimental study was undertaken to examine the actions of decision makers under these conditions. It was hypothesised that decision makers do not consider the full consequences of their actions under conditions of time pressure and automaticity, and therefore make more errors under these conditions. Participants were asked to perform a process control task, analogous to a flight deck scenario with some of the key characteristics of NDM. The task was a computer simulation of an industrial distillation plant. The user had to regulate the distillation of alcohol to a desired level of purity, while producing as much distillate as possible. There was also a secondary, event-handling task where they were asked to respond to a message. There were two possible responses, thus requiring a simple decision to be made by the user. The frequency of occurrence and type of event were controlled such that the response became increasingly automatic. The time allowed to respond to the event was also controlled so that the participants were placed under time pressure. There were 18 male and 11 female participants: 14 in the control condition and 15 in the experimental condition (high time pressure and automaticity). Their mean age was 24.9 years, with a standard deviation of 2.09 years. An independent-subjects design was used for both trials, with participants randomly allocated to conditions. Automaticity and time pressure were
used as independent variables, while the number of correct responses to an event and the response times were the dependent variables. The participants’ approach to the process control task was also important since this would directly affect their responses to the events. For example, the process was not easy to control and demanded a high mental workload in order to collect the required volume and purity of distillate. However, by taking a less active approach, the participant collected a less pure, smaller quantity of alcohol and the task became much simpler. The overall purity and volume of the collected alcohol therefore gave an indication of the approach taken by the participant. Finally, a questionnaire was completed by each participant to provide a subjective measure of automaticity, time pressure, performance and motivation.
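As a hedged sketch only, the event-handling manipulation described above might be scheduled along the following lines; the intervals, deadlines and event names are invented, since the actual values used in the study are not reported here.

import random

# Illustrative scheduler for the secondary event-handling task; all numbers
# and event names are invented, not taken from the study.
def schedule_events(condition: str, n_events: int = 20, seed: int = 0):
    """Yield (onset_time_s, event_type, response_deadline_s) tuples."""
    rng = random.Random(seed)
    if condition == "experimental":
        # Frequent, same-type events with a short deadline: high automaticity
        # and high time pressure.
        mean_interval, deadline, event_types = 30.0, 3.0, ["valve alarm"]
    else:
        # Control: infrequent, mixed events with a generous deadline.
        mean_interval, deadline, event_types = 90.0, 10.0, ["valve alarm", "pump warning"]
    t = 0.0
    for _ in range(n_events):
        t += rng.expovariate(1.0 / mean_interval)
        yield (round(t, 1), rng.choice(event_types), deadline)

for onset, etype, deadline in schedule_events("experimental", n_events=3):
    print(f"t={onset:>6}s  {etype:<12}  respond within {deadline}s")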
Results The results showed that, although time pressure greatly affected the way in which the participants made decisions, the increasing routineness, or automaticity, of the decisions had little effect. This was shown both in the number and types of incorrect decisions made under time pressure, and by secondary measures such as the response times and the questionnaire answers. However, the questionnaire results suggested that there did exist a certain amount of automaticity. The participants found themselves responding automatically to events, but correcting themselves in time. This may relate to Rasmussen’s skill or rule level of information processing (Rasmussen, 1993), rather than the knowledge level. Rasmussen showed that different tasks required differing levels of mental processing, according to the nature of the task. He also showed that humans made errors in knowing which level of processing is appropriate. This may correspond to the different paths in the IDM, with the skill and rule processing levels relating to the short-cut path used under time pressure and automaticity, and the knowledge level being represented by the consideration of consequences. The objective results did show that under time pressure or distractions, the decision maker’s actions could be automatic, and this might imply that there are two kinds of automaticity, simple and complex, again relating to Rasmussen’s different levels of skill/knowledge. The event-handling used in the experiments would have dealt with ‘simple automaticity’, i.e. skill/rule level decisions (yes/no decisions) which are simple enough not to warrant any real consideration of consequences under any conditions, but which, under conditions of time pressure or distraction, can lead to mistakes or slips. ‘Complex automaticity’ would involve decisions which require knowledge level processing in order to be made correctly, but which would, if the same decision was made frequently enough, encourage the decision maker to become complacent and not use this knowledge level.
Future Directions It is important that the proposed model for decision making is validated, both in terms of its suitability to the flight deck and in its conclusions about crew behaviour. This could be done through further experimental studies to investigate the actions of decision makers in various situations. The main aim of such validation would be to determine the key areas of weakness in crew decision making. If the IDM proves to be a valid model of the decision process, it would then be possible to use it in the improvement of decision making on the flight deck. This improvement may come through a combination of training, procedures and the design of systems to support the crew’s decisions.
It would seem, from the proposed IDM, that two of the main areas of weakness in the decision process are the formation and maintenance of an accurate MR, and the consideration of the consequences of actions or inaction. A major problem, however, is that these two areas lie within the internal decision process (shown by the dashed line). This makes it extremely difficult to know when an error has been made and what type of error it is, thus making intervention from flight deck systems unfeasible. The only means of interaction with the crew is through the actions they perform and the information presented to them. This is why solutions to human error have traditionally relied upon training to improve the internal decision processes, and the improvement of information displays. Many of the systems on the flight deck, such as information displays, tend to be error-preventative, as opposed to error-tolerant. However, using the IDM it may be possible to design error-tolerant decision support systems. Traditional decision support systems have generally been based on normative decision models, and are not designed for use in situations where time is short and information is not freely available. Many decision support systems proposed for use on the flight deck rely on artificial intelligence technology which is not yet available, and are not based on an understanding of crew decision making. The IDM highlights areas where decision support could intervene to aid the crew. An effective intervention point for decision support would be to provide feedback on the effects or consequences of crew actions. This could also help to clarify or even restore crew situation awareness. Such a system would essentially be a warning system which gathers information on the aircraft and its present environment, so that it can provide the crew with an accurate picture of the situation. If there is reason to believe that the aircraft is in an unsafe condition, or if the crew’s actions may place the aircraft in danger, the system could inform the crew of this. It is important that the design of flight deck systems is based on an understanding of the way in which the crew act and make decisions. They must allow the crew the freedom to use their own decision strategy, whilst providing support in potentially unsafe conditions. The proposed Integrated Decision Model, if validated, will allow such systems to be more effective and may lead to improved decision making on the flight deck.
References
Abbott, K., Slotte, S. and Stimson, D. 1996, The Interfaces Between Flightcrews and Modern Flight Deck Systems. (Report of the Federal Aviation Administration Human Factors Team). Department of Transportation, Washington, DC
Orasanu, J. 1995, Situation Awareness: Its role in flight crew decision making. In Proceedings of the 8th International Conference on Aviation Psychology, Columbus, Ohio
Endsley, M. 1994, Situation Awareness in dynamic human decision making: theory. In R.Gilson, D.Garland and J.Koonce (eds.), Situation awareness in complex systems, (Embry-Riddle Aeronautical University Press, Daytona Beach, Florida) 27–58
Klein, G. 1993, A recognition-primed decision model of rapid decision making. In G.Klein, J.Orasanu, R.Calderwood and C.Zsambok (eds.), Decision making in action: Models and methods, (Ablex, New Jersey)
Rasmussen, J. 1993, Deciding and doing: decision making in natural contexts. In G.Klein, J.Orasanu, R.Calderwood and C.Zsambok (eds.), Decision making in action: Models and methods, (Ablex, New Jersey)
PSYCHOPHYSIOLOGICAL MEASURES OF FATIGUE AND SOMNOLENCE IN SIMULATED AIR TRAFFIC CONTROL
Hugh David Eurocontrol Experimental Centre 91222 Bretigny-sur-Orge, CEDEX, France
P.Cabon, S.Bourgeois-Bougrine, R.Mollard Laboratoire de l’Anthropologie Applique 45 Rue des Saints-Peres 75270 Paris, France
Eight Air Traffic Controllers carried out exercises using a TRACON II Air Traffic Control (ATC) Simulator. After a training and familiarisation day, the controller carried out four simulation exercises, two with low and two with high traffic load. His performance during each exercise was recorded. A self-assessment questionnaire for fatigue and a test of event-related potential (ERP) were applied, and a sample of saliva was taken for cortisol analysis before and after each experimental session. The NASA-TLX was completed after each exercise and a test of alpha-rhythm attenuation was carried out at the start and end of each day. Cortisol analysis, ERP and fatigue/sleep questionnaires are recommended for further use in Real Time (RT) simulations.
Introduction The Eurocontrol Experimental Centre has probably the longest and certainly the widest experience of large-scale RT simulation of Air Traffic Control (ATC). One continuing concern has been the ‘objective’ measurement of the effects of carrying out ATC on the controller. Recent developments in ATC (Brookings et al, 1996, using a TRACON/Pro simulator), and elsewhere (Fibiger et al, 1986—hormonal responses to stress, Kramer et al, 1987—Event-Related Potentials) suggested that sufficient progress had been made to justify a further study of the feasibility of measuring physiological and psychological correlates of stress in ATC.
Experimental Equipment and Design The TRACON II simulator is a PC-based RT simulator, displaying a simulated radar picture, with tables of strips for actual and expected traffic. Control orders were inserted via the keyboard, and a speech generator ‘spoke’ the controllers’ orders and pilots’ communications. A labelled map of the (London) TMA area employed, a table of data for the eight airports involved, and a table interpreting the keyboard orders were provided at all times. Exercises were nominally thirty minutes long, but continued until all the traffic had left the area, requiring up to 20 minutes after the last aircraft entered. Eight male Air Traffic Controllers carried out individual exercises using a TRACON II ATC Simulator during two days each. During the first day, the controller carried out an initial exercise during which the displays and controls were demonstrated. The controller was familiarised with the EEG and other procedures employed during testing. Two exercises with low and high workload respectively were then carried out, during which the controller was prompted and assisted as necessary. During the second day, the controller carried out four simulation exercises, two with a low and two with a high level of traffic load. The high level of traffic corresponded to 20 or 30 aircraft entering in 30 minutes, depending on the controllers’ estimated capacity. The low level was half the high level. Although the same exercises were used throughout, no exercises were seen twice by the same controller.
Measurements The controller’s performance during each exercise was recorded. A self-assessment questionnaire for fatigue and a test of event-related potential (ERP) were applied, and a sample of saliva was taken for cortisol analysis before and after each experimental session. The NASA-TLX was completed after each exercise and a test of alpha-rhythm attenuation was carried out at the start and end of each day. A test of ERP was also carried out during simulation exercises, where the controller was able to carry it out. (The ERP test required the controller to listen to 150 auditory tones, counting the high frequency tones—about one third. The process takes about five minutes, and appears to the controller as a potentially distracting secondary task.) The controller filled in questionnaires concerning his sleep pattern before, during and after the two days’ experimentation.
Results
Sleep loss Controllers showed no significant changes in the time for which they slept, and felt less sleepy and tired on waking.
NASA-TLX Controllers rated the higher loaded exercises to be more difficult. The mental demand component was rated highest for the high traffic session at the end of day one. Controllers rated their performance higher after the high load exercises, although objective performance, measured as the ratio of the TRACON score to the maximum value, decreased.
Alpha Attenuation Test The alpha attenuation test (Stampi et al, 1995) compares the proportion of alpha-rhythm observed in the EEG when the eyes are shut with that when they are open. In principle, when a controller becomes sleepy, his alpha rhythm should decrease when his eyes are closed and increase when they are open; the ratio of alpha power eyes closed/open is the alpha attenuation coefficient (AAC). A high AAC implies high alertness and vice-versa. In this experiment, no significant effect was observed.
Subjective ratings of sleepiness and fatigue A clear circadian rhythm, with a marked post-lunch dip, was observable. Sleepiness and fatigue were generally closely related, although they diverged on the afternoon of the second day, where controllers felt less sleepy, but more fatigued.
EEG—Spectral Analyses Theta rhythm (4–7Hz) was high during the high traffic training session of day 1, consistent with the view that theta activity is related to learning processes. During the measured exercises, there was a shift from low frequency (Delta and Theta rhythm 1–7Hz) to higher frequencies (Alpha and Beta rhythm 8–30Hz) for higher workload samples, consistent with increased alertness.
ERP—Event-Related Potential The relative amplitude of the P300 potential decreased significantly after high load exercises, suggesting a measured fatigue effect attributable to the higher workload. (P300 amplitude during the exercise decreases considerably, but the technique is not practically applicable in RT simulations.)
Cortisol Salivary cortisol normally shows a strong circadian rhythm, and did so here. Comparisons were therefore made in terms of the change in cortisol levels after an exercise compared with the level measured before. During the training day there was a significantly greater increase in cortisol for the high-load exercise than for the low-load exercise. During the measured day, no such effect was observed. The controllers, however, reported a higher subjective workload via the NASA-TLX. This discrepancy may be attributed to the controllers being better able to cope with the workload difference that they perceived.
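To make the alpha attenuation coefficient described under the Alpha Attenuation Test above concrete, a minimal computational sketch follows; the sampling rate, alpha band limits and segment handling are illustrative assumptions rather than the study’s actual parameters.

import numpy as np
from scipy.signal import welch

# Minimal sketch of the alpha attenuation coefficient (AAC): the ratio of
# alpha-band EEG power with eyes closed to that with eyes open.  The 8-12 Hz
# band and 256 Hz sampling rate are assumptions for illustration only.
def alpha_power(eeg: np.ndarray, fs: float = 256.0, band=(8.0, 12.0)) -> float:
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(np.trapz(psd[mask], freqs[mask]))

def alpha_attenuation_coefficient(eeg_eyes_closed: np.ndarray,
                                  eeg_eyes_open: np.ndarray,
                                  fs: float = 256.0) -> float:
    # A high AAC implies high alertness; a sleepy subject shows a lower ratio.
    return alpha_power(eeg_eyes_closed, fs) / alpha_power(eeg_eyes_open, fs)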
Individual Cortisol Rates There were significant differences between the four controllers having higher cortisol levels (HC) and the four having lower cortisol levels (LC). During the training day, the HC group performed significantly better, and tended to rate the workload higher. During the measured day, however, the HC group performed significantly better in low traffic, but showed a strong decrement in performance in high traffic. The HC group showed a marked increase in cortisol after the high workload training exercise, and after the afternoon high workload measured exercise, during which their performance deteriorated. Consideration of subjective and objective evidence of sleepiness and fatigue suggests that the HC group were more ‘fatiguable’ than the LC group. They slept longer during and after the experiment, and felt more tired when they woke.
Discussion This feasibility study, although based on relatively few subjects, has demonstrated that various objective measures can be relevant to ATC. These results differ in some respects from those of Brookings et al (1996). Brookings’s subjects were younger and less experienced than those in this study, and received six hours of training, rather than the four used here. Brookings compared short sequences of 15 minutes of low, medium and hard workload, defined by manipulating task complexity (pilot skill, traffic mix) or traffic volume (6, 12, or 18 aircraft), without considering the relative capacity of the controllers. Most measures of fatigue showed high sensitivity to the effect of task demand. ERP showed relatively lower amplitude immediately after high workload exercises. Fatigue and sleepiness tend to dissociate under these conditions. The demands of the task require a high level of alertness, which induces further fatigue. (Similar effects are observed elsewhere—for example in the later stages of long-distance flights—Cabon et al, 1996.) ERPs cannot practically be recorded during simulations, both because they form a distracting secondary task, and because, in RT simulations, the controller speaks, which disrupts the ERP signal. Similar problems arising from speech and movement affect other electrophysiological measures, such as heart-rate variability, respiration rate, or overall EEG observations. Salivary cortisol is relatively easy to collect. There appear to be significant differences between low and high cortisol individuals. High cortisol individuals tend to be more affected by (simulated) ATC than low cortisol individuals. Subjective measures are sensitive, and easy to administer. The elaborate cross-comparison of scales appears redundant to controllers, and may be omitted. Where RT simulations are concerned, the physical workload scale may well be redundant. It is clear that the underlying pattern of sleep distribution may affect, and be affected by, the learning or practice of ATC, whether as shift-work in the real workplace, or where controllers are displaced from their normal environment while participating in RT simulations. ‘Sleep logs’ as used in this study are relatively
cheap and simple, but they may need to be checked by the use of an ‘actometer’—a sophisticated wrist-watch-like device which records sleep quantity or quality.
Conclusions
Transfer to Real Time Simulation Determination of sleep patterns by sleep log/actometry and subjective self-evaluation of sleepiness throughout the simulation. Evaluation of stress reaction by salivary cortisol concentration—12 controllers—before and after selected exercises. Evaluation of psychophysiological impact of task load using ERP, with one or two controllers on heavily loaded sectors measured each day.
Further TRACON Studies More detailed studies of brain function, examining topographical effects (where in the brain does activity take place?) as well as Fourier Analysis (what frequencies are involved?). Other measurement methods—eye-movement, blink rate, pupil diameter. Other control input methods (speech, on-screen graphical, etc.). Effects of ‘less skilled’ computer-generated pilots.
References
Brookings J.B., Wilson G.F. and Swain, C.R. 1996, Psychophysiological responses to changes in workload during simulated Air Traffic Control, Biological Psychology, 42, 361–377
Cabon P., Mollard R., Mourey F., Bougrine S. and Coblenz A. 1996, Towards a general and predictive model of fatigue in aviation. In Proceedings of the Fourth Pacific Conference on Occupational Ergonomics, Taipei, Taiwan, 622–625
Fibiger W., Evans O. and Singer G. 1986, Hormonal Responses to Graded Mental Workload, European Journal of Applied Psychology and Occupational Psychology, 55, 339–43
Kramer A.F., Donchin E. and Wickens C.D. 1987, Event-Related Potentials as indices of mental workload and attentional allocation. In Electrical and Magnetic Activity of the Central Nervous System: Research and Clinical Applications on Aerospace Medicine, AGARD Conference Proceedings No. 432, 14–1 to 14–14
Stampi, C., Stone P. and Michimori A. 1995, A new quantitative method for assessing sleepiness, the Alpha Attenuation Test, Work and Stress, 9(2/3), 368–376
DRIVERS AND DRIVING
WHAT’S SKILL GOT TO DO WITH IT? VEHICLE AUTOMATION AND DRIVER MENTAL WORKLOAD Mark Young & Neville Stanton Department of Psychology University of Southampton Highfield SOUTHAMPTON SO17 1BJ
Although vehicle automation is not unfamiliar on today’s roads, future technology has the potential to reduce driver mental workload in addition to relieving physical workload. Previous work in our laboratory has determined that mental workload decreases significantly as more levels of automation are introduced. The current paper addresses the question of whether this picture changes across levels of driver skill, by measuring the mental workload of drivers at four different levels of skill, and under four different levels of automation. The preliminary data reported in this paper demonstrate that level of driver skill has no effect on subjective mental workload; however, it does interact with level of automation on a secondary task measure. These results are interpreted with respect to potential effects on performance, with implications for safety on the roads of tomorrow.
Introduction There are a number of vehicle technologies on the horizon, some of which are intended to assist drivers in their task (e.g., navigation aids), whilst others are designed to relieve the driver of certain aspects of the driving task. So far, automation has merely operated at a physical level; however, future systems seek to take over some psychological elements of driving. It is these latter devices which we are concerned with. Adaptive Cruise Control (ACC) assumes longitudinal control of the user’s vehicle, controlling both speed and headway, whilst Active Steering (AS) copes with lateral control, keeping the car within its lane. Both devices are expected to be on the road within the next decade, and both have the potential to reduce driver mental workload. Although this may at first sound advantageous, the arrival of automation is accompanied by a whole new set of problems. Stanton & Marsden (1996) use the history of automation problems in aviation as a basis for summarising the potential problems which may arise in road vehicle automation. For instance, overdependence on the automated system may lead to skill degradation, and this in turn could result in more serious consequences in the event of system failure. These views have been espoused by other noteworthy researchers in the field (e.g., Bainbridge, 1983; Norman, 1990; Reason, 1990), and there is a general consensus that supervisory control (cf. Parasuraman, 1987) is not a task best suited for humans. One specific problem associated with automation is that of mental workload (MWL). Automated systems have the potential for imposing both underload and overload. Under normal circumstances, operators are faced with fewer tasks than they were previously able to cope with; however, in an automation failure situation, they are immediately forced into a situation of overload (cf. Norman, 1990). It is precisely this problem which our research is concerned with. Underload is at least as serious an issue as overload (Leplat, 1978; Schlegel, 1993); however, its effects on performance have not been fully documented as yet. The
present paper describes the latest in a series of studies designed to examine the relationship between automation, mental workload and performance.
Previous research There is very little work in the public domain specifically directed at evaluating vehicle automation. Nilsson (1995) used a driving simulator to investigate the effects of ACC on performance in critical situations. In comparison with manual driving, ACC drivers were found to be four times as likely to crash in the situation where collisions occurred (approaching a stationary queue). However, Nilsson (1995) did not find any workload differences between the groups on a subjective measure (the NASA-TLX; Hart & Staveland, 1988). Instead, performance differences were attributed to drivers’ expectations about the ACC system. Most of the other research of which we are aware in this field has been conducted in our own laboratory—the Southampton Driving Simulator. Stanton, Young & McCaulder (in press) explored the effects of ACC failure on performance. Faced with a malignant failure scenario, one-third of all participants crashed into the lead vehicle. In this experiment, a secondary task did demonstrate a significant difference in MWL between manual and ACC-assisted driving. With an apparent discrepancy between the workload results of the two studies described above, Young & Stanton (in press) performed a detailed experiment into automation and workload. Participants were asked to drive under four levels of automation: manual, ACC, AS, and ACC+AS. Both the secondary task and the NASA-TLX were used to measure workload, and the pattern of results for each was identical. Using ACC alone did not reduce MWL significantly when compared to manual driving; however, there was a significant reduction when AS was engaged, and a further significant drop when both devices were used. These studies are discussed further by Stanton & Young (1997).
Skill and automaticity With these results in mind, then, the major theme of the most recent experiment addresses the question of whether this pattern of results changes across levels of driver skill, and in particular, degree of automaticity. Driving is a skilled activity which is a classic example of an automatic behaviour (Stanton & Marsden, 1996). Such overlearned responses can prove advantageous, as demonstrated by the braking response to vehicles pulling out into the path of others (see Nilsson, 1995), or they can have adverse consequences, as when drivers have a strong expectation that a junction will be clear and fail to see oncoming traffic (Hale, Quist & Stoop, 1988). When automation is engaged, all drivers—novices and experts alike—essentially satisfy the criteria for automaticity (that is, fast, attention-free, unconscious processing). If a situation of increased demand essentially transforms an expert into a novice (cf. Bainbridge, 1978), it is surely plausible to assume the reverse would be true in a situation of unusually low demand (i.e., driving with automation). However, whereas the expert has an increased knowledge base to draw upon, the novice is deprived of this. Therefore, it is important to understand how automation may affect MWL and performance for drivers of all skill levels. By measuring the mental workload of drivers at four levels of skill, and under four levels of automation, it should be possible to determine how automation and automaticity interact. Furthermore, Liu & Wickens (1994) claim that whilst subjective workload is influenced by the presence of automation, a secondary task can discriminate automatic from nonautomatic processing. Therefore, the present study extended that of Young & Stanton (in press) by repeating the procedure for drivers at three additional levels of skill: novice, learner, and advanced.
Method A mixed between- and within-subjects design was used. Level of automation constituted the within-subjects variable, with four levels: manual, ACC, AS, and ACC+AS. Driver skill level was the between-subjects factor, again with four levels: novice (i.e., never driven before), learner (currently learning but does not hold a full licence), expert (holds a full licence), and advanced (member of the Institute of Advanced Motorists in the UK). The latter group was chosen as a
high-level skill group because the Institute of Advanced Motorists (IAM) provides further training for drivers with a full licence, and its members are statistically 75% less likely to be involved in an accident than other drivers without such training. The number of participants in each group was 12, 20, 30 and 6 respectively. For the expert group, the same data were used as those gathered by Young & Stanton (in press), and the general procedure for all groups was as described in that paper. Dependent measures were the NASA-TLX for subjective workload, administered immediately following each trial, and a visuo-spatial secondary task, designed to occupy the same attentional resources as driving. This was treated as an additional task to driving, and thus used as a measure of spare attentional capacity. The variable subjected to analysis was the number of correct responses to the secondary task during the trial. A series of primary task data were also recorded; however, these will not be covered in this paper.
Results A repeated measures analysis of variance (ANOVA) was performed on each of the NASA-TLX and the secondary task sets of data, including level of experience in the model as a between-subjects factor. Only the overall workload (OWL) score was analysed for the NASA-TLX; more detailed investigation of the subscales is beyond the scope of this paper. It must be emphasised that these analyses were performed on a reduced data set, and more robust conclusions will be available in the near future. For the OWL score, then, a significant main effect was found for the within-subjects factor of level of automation (F(3, 192) = 151.99; p < 0.001). There were no significant results for level of experience, nor for the interaction between the two variables. Further exploration revealed that the effect of automation was exactly as expected on the basis of Young & Stanton’s (in press) work. That is, manual and ACC-assisted driving did not differ in workload (t(67) = 2.04; p = 0.045); however, there was a significant reduction when AS was engaged (t(67) = 13.07; p < 0.001), and a further significant reduction when both devices were used (t(67) = 8.57; p < 0.001). Significance levels in these tests have been adjusted to allow for the possibility of type I error. The results are illustrated in figure 1.
Figure 1. Mean OWL score for each skill group at all four levels of automation
The secondary task data were more intriguing. Again a significant main effect emerged for level of automation (F(3, 180) = 204.77; p < 0.001), and the effect of experience alone was nonsignificant. However, a significant interaction did arise between the two variables (F(9, 180) = 2.26; p < 0.025). The data are represented graphically in figure 2. Paired comparisons of the data for level of automation, pooled across experience levels, reveal a similar pattern to the NASA-TLX data. However, now there is also a marginally significant difference between manual driving and ACC supported driving (t(66) = –2.53; p < 0.02), such that there is reduced MWL in the latter. The significant workload reductions
for driving with AS (t(64) = –12.23; p < 0.001) and for driving under full automation (t(64) = –11.53; p < 0.001) still stand.
Figure 2. Mean secondary task score for each skill group at all four levels of automation
To tease out the interaction, these paired comparisons were repeated within each level of experience. For the learner and expert groups, the pattern resembles the now familiar results obtained previously. That is, there is no significant difference between manual and ACC driving; however, using AS reduces workload significantly, and using both ACC+AS has a further significant effect. For the novices, the effects of AS and ACC+AS are the same; however, ACC marginally reduces workload when compared to manual driving (t(11) = –3.39; p < 0.01). Similarly, in the advanced group there again appears to be more of a stepwise reduction in workload across the four levels of automation, although in this case the statistical significance is slightly less conclusive. There are marginal differences between manual and ACC (t(5) = –4.44; p < 0.01), between ACC and AS (t(5) = –3.59; p < 0.02) and between AS and ACC+AS (t(5) = –3.57; p < 0.02). We are cautious in interpreting these data, given the number of tests performed and the sample size used. There are clearly significant differences between manual and AS driving (t(5) = –6.13; p < 0.005), and between ACC and ACC+AS driving (t(5) = 5.14; p < 0.005).
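As an illustration of the kind of adjusted pairwise testing reported in these results, the sketch below applies Bonferroni-corrected paired t-tests across the four automation levels; the data are random placeholders, the array layout is assumed, and Bonferroni is offered only as one common choice of type I error adjustment, since the specific correction used here is not stated.

from itertools import combinations
import numpy as np
from scipy.stats import ttest_rel

# Hedged sketch of Bonferroni-adjusted paired comparisons between automation
# levels.  `scores` maps each level to one score per participant, in the same
# participant order; the values are random placeholders, not study data.
rng = np.random.default_rng(0)
levels = ["manual", "ACC", "AS", "ACC+AS"]
scores = {lvl: rng.normal(loc=60 - 10 * i, scale=8, size=68) for i, lvl in enumerate(levels)}

pairs = list(combinations(levels, 2))
alpha = 0.05 / len(pairs)          # Bonferroni-adjusted criterion
for a, b in pairs:
    t, p = ttest_rel(scores[a], scores[b])
    verdict = "significant" if p < alpha else "n.s."
    print(f"{a:>7} vs {b:<7} t = {t:6.2f}, p = {p:.4f}  ({verdict} at adjusted alpha = {alpha:.4f})")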
Discussion If, as is the claim of Liu & Wickens (1994), subjective workload measures are influenced by the presence of automation, then the current study indicates that this factor is completely independent of experience. That is, levels of vehicle automation have the same effect on an individual’s perception of MWL whether they are a complete novice or an advanced driver. This consistent effect demonstrates that ACC alone has no effect on subjective workload; however, AS does reduce perceived MWL, and this reduction is augmented further when both ACC and AS are used simultaneously. The secondary task, which according to Liu & Wickens (1994) can discriminate automatic from nonautomatic processing, paints a more interesting picture. As far as learners and experts are concerned, the pattern of results mirrors that for the NASA-TLX. This is an expected result, for the secondary task is also a widely accepted measure of MWL (e.g., Schlegel, 1993). However, this view is upset when the novice and advanced groups are considered. Here, there seems to be a more stepwise reduction in workload as more levels of automation are introduced. Although some of the results were statistically inconclusive, we feel that with the increased data set we intend to collect, this pattern will become more robust. That is, these groups perform better on the secondary task when driving with ACC+AS than they do with AS alone, which in turn is better than ACC alone, and finally performance with ACC alone is better than when driving manually.
As far as the automaticity paradigm is concerned, this is an extremely intriguing finding. The fact that the influence of automation on drivers’ spare attentional capacity is mediated by their level of experience means this is certainly an area which merits deeper exploration. Although it is currently difficult to explain why the results for novices and advanced drivers should be equivalent, further research should be able to shed some light on the matter. For now, it is perhaps interesting to note that part of the IAM assessment includes a running commentary on one’s driving. This is in deference to known research about expert performance, as automatic behaviours should be processed unconsciously. To say that a highly trained person must always perform at a conscious level is something of a paradox (cf. Barshi & Healy, 1993); however, the merits of such processing in an unexpected situation cannot be denied. Indeed, in this respect the advanced driver does perform in a similarly declarative manner to the novice, albeit at a much higher level of abstraction. In light of this, perhaps the equivalence of responses in these two groups should not be so surprising.
Conclusions and future research It has been found that subjective MWL in a driving task is heavily influenced by the presence of automation, irrespective of the driver’s level of experience. However, a secondary task measure does reveal differences in the pattern of responses across levels of skill. This is interesting from the perspective of relating automation to automaticity, for the secondary task is a measure of automaticity as much as it is of workload. The most obvious next step is to analyse the primary task performance data. It may be the case that intermediate levels of automation adversely affect performance under normal circumstances. For instance, a driver’s steering performance may be different if driving with ACC than under manual conditions. Similarly, longitudinal control may be affected by the introduction of AS. Using these data in conjunction with the workload results reported here will help us understand whether mental underload is detrimental to performance.
References
Bainbridge, L. 1978, Forgotten Alternatives in Skill and Work-load, Ergonomics, 21, 169–185
Bainbridge, L. 1983, Ironies of Automation, Automatica, 19, 775–779
Barshi, I. & Healy, A.F. 1993, Checklist Procedures and the Cost of Automaticity, Memory and Cognition, 21, 496–505
Hale, A.R., Quist, B.W. & Stoop, J. 1988, Errors in Routine Driving Tasks: A Model and Proposed Analysis Technique, Ergonomics, 31, 631–641
Hart, S.G. & Staveland, L.E. 1988, Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P.A.Hancock & N.Meshkati (Eds.), Human Mental Workload, (Elsevier Science, North-Holland) 139–183
Leplat, J. 1978, Factors Determining Work-load, Ergonomics, 21, 143–149
Liu, Y. & Wickens, C.D. 1994, Mental Workload and Cognitive Task Automaticity: An Evaluation of Subjective and Time Estimation Metrics, Ergonomics, 37, 1843–1854
Nilsson, L. 1995, Safety Effects of Adaptive Cruise Controls in Critical Traffic Situations, Proceedings of the Second World Congress on Intelligent Transport Systems, 3, 1254–1259
Norman, D.A. 1990, The ‘Problem’ with Automation: Inappropriate Feedback and Interaction, not ‘Over-Automation’, Phil. Trans. R. Soc. Lond. B, 327, 585–593
Parasuraman, R. 1987, Human-Computer Monitoring, Human Factors, 29, 695–706
Reason, J.T. 1990, Human Error (Cambridge University Press, Cambridge)
Schlegel, R.E. 1993, Driver Mental Workload. In B.Peacock & W.Karwowski (Eds.), Automotive Ergonomics, (Taylor & Francis, London) 359–382
Stanton, N.A. & Marsden, P. 1996, Drive-By-Wire Systems: Some Reflections on the Trend to Automate the Driver Role, Safety Science, 24, 35–49
Stanton, N.A. & Young, M.S. 1997, Driven to distraction? Driving with automation, Proceedings of Autotech ’97, 77–86
Stanton, N.A., Young, M.S. & McCaulder, B. in press, Drive-By-Wire: The Case of Driver Workload and Reclaiming Control with Adaptive Cruise Control, Safety Science
Young, M.S. & Stanton, N.A. in press, Automotive Automation: Investigating the Impact on Driver Mental Workload, International Journal of Cognitive Ergonomics
THE USE OF AUTOMATIC SPEECH RECOGNITION IN CARS: A HUMAN FACTORS REVIEW Robert Graham HUSAT Research Institute, The Elms, Elms Grove, Loughborough, Leics. LE11 1RG tel: +44 1509 611088 email: [email protected]
Automatic speech recognition (ASR) has been successfully incorporated into a variety of domains, but little attention has been given to in-car applications. Advantages of speech input include a transfer of loading away from the over-burdened visual-manual modality. However, the use of ASR in cars faces the barriers of high levels of noise and driver mental workload. This paper reviews some of the likely in-car applications of ASR, concluding that its widespread adoption will be driven by the requirement for hands-free operation of mobile phone and navigation functions. It then discusses some of the human factors issues which are pertinent to the use of ASR in the in-car environment, including dialogue and feedback design, and the effects of the adverse environment on the speaker and speech recogniser.
Introduction Automatic speech recognition (ASR) technology has been successfully incorporated into a variety of application areas, from telephony to manufacturing, from office to aerospace. However, so far, little attention has been given to in-car applications. This is perhaps surprising given that one of the major advantages of speech input over manual input is that the eyes and hands remain free. The task of safe driving could clearly benefit from a transfer of loading from the over-burdened visual-manual modality to the auditory modality. Indeed, numerous studies have confirmed the potential adverse safety impacts of operating a visual-manual system (e.g. a mobile phone or car radio) while on the move. This situation is likely to be exacerbated by the rapid growth of Intelligent Transportation Systems (ITS) such as navigation or traffic information systems, which require complex interactions while driving. As well as improving driving safety, ASR could increase the accessibility and acceptability of in-car systems by simplifying the dialogues between the user and system, and the processes of learning how to use the system. The main difficulty facing the incorporation of speech into in-car systems comes from the hostile environment. Noise (from the vehicle engine, road friction, passengers, car radio, etc.) can adversely affect speaker and speech recognition performance. The car is also characterised by a variety of concurrent tasks to be carried out while using speech (particularly the primary task of safe driving), and varying levels of driver mental workload. As well as these and other human factors issues, Van Compernolle (1997) suggests that the
automotive industry is a slow acceptor of new technologies in general, and that there has been confusion in the past about which speech applications to implement. Despite these barriers, ASR may be useful for a number of different in-car applications, each with particular requirements (in terms of vocabulary, dialogue, etc.). These applications are outlined in the sections below. There then follows a general discussion of human factors issues relevant to the use of ASR in cars.
Applications of ASR in Cars Likely in-car applications for ASR can be put into three groups: standard vehicle functions (including the stereo), phones, and navigation/information systems. Van Compernolle (1997) rates the importance of incorporating ASR into these functions as low, high and essential, respectively.
Standard Vehicle Functions Any non-safety-critical vehicle control may benefit from the incorporation of speech recognition, particularly those whose interfaces have multiple control options. A prime candidate is the car stereo (radio/tape/CD). For example, Haeb-Umbach and Gamm (1995) discuss a system in which speaker-independent continuous-speech recognition is employed to access various functions (e.g. "CD four, track five"), and speaker-dependent recognition allows users to define their own names for radio stations (e.g. "change to BBC now"). Using speech for the car stereo benefits from compatibility in input and output modalities; that is, the auditory input of a speech command results in the auditory feedback of the change in radio or CD output. Other applications include the car's climate control system, and the mirrors, windscreen wipers, seats, etc. Although speech input to such basic car systems may not have significant advantages over manual input for most users, it could allow drivers with physical disabilities to use their arms and/or legs solely for the most important tasks of safe driving.
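To make the two recognition modes concrete, the sketch below maps recognised phrases of the kind quoted above onto stereo actions. It is only an illustration: the alias table, number vocabulary and action names are assumptions of the sketch, not details of the Haeb-Umbach and Gamm system.

    # Minimal sketch: mapping recognised utterances to car-stereo actions.
    # The alias table and action names are illustrative assumptions.

    # Speaker-dependent aliases the user has trained for radio stations.
    STATION_ALIASES = {"bbc": 88.6, "classic": 100.9}

    NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

    def interpret(utterance):
        """Return an (action, arguments) pair for a recognised utterance."""
        words = [w.strip(",.") for w in utterance.lower().split()]

        # Speaker-independent continuous commands, e.g. "CD four, track five".
        if "cd" in words and "track" in words:
            numbers = [NUMBER_WORDS[w] for w in words if w in NUMBER_WORDS]
            if len(numbers) == 2:
                return ("play_cd", {"disc": numbers[0], "track": numbers[1]})

        # Speaker-dependent station names, e.g. "change to BBC now".
        for alias, frequency in STATION_ALIASES.items():
            if alias in words:
                return ("tune_radio", {"mhz": frequency})

        return ("reject", None)   # unknown command: re-prompt the driver

    print(interpret("CD four, track five"))   # ('play_cd', {'disc': 4, 'track': 5})
    print(interpret("change to BBC now"))     # ('tune_radio', {'mhz': 88.6})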
Phones In recent months, a number of high-profile legal cases have argued the dangers of operating a mobile phone while driving. These have led to adjustments in the Highway Code in the UK, and the belief that legislation preventing the manual operation of phones on the move is inevitable (Vanhoecke, 1997). Consequently, much effort is being invested towards the development of voice-operated, hands-free kits, initially for keyword dialling and eventually for all phone functions. The former application requires speaker-dependent recognition, allowing the user to dial commonly-used numbers through keywords (e.g. “mum”, “office”). The latter needs speaker-independent, continuous-word recognition for inputting numbers or commands. The ASR capability may be either incorporated into the phone, accessed over the mobile network, or pre-installed by the car manufacturer.
Navigation and Travel/Traffic Information Technologies such as navigation systems (which aid drivers in planning and finding their destinations) or travel/traffic information systems (which inform drivers of local 'events' such as accidents, poor weather, services, etc.) have the potential to greatly increase the complexity of driver-system interactions. Current systems require the driver to input information, such as a journey destination, while on the move, and often use an array of buttons or rotary switches to accomplish this. The near future is also likely to see a rise in the prevalence of driver-requested services (for example, the ability to investigate the availability of parking spaces in a town, or the location of the next petrol station), for which ASR could be even more useful. Fully voice-operated navigation requires that a user can input thousands of possible geographical names to specify a destination. Apart from the obvious technical difficulties of
large-vocabulary, phoneme-based recognition, there are the added problems that the system must cope with multi-national names and with poor pronunciation by users unfamiliar with the place name they are speaking. One solution is to incorporate standard word recognition for commonly-used names, with a fall-back to a spelling mode for less frequent inputs (Van Compernolle, 1997). Of course, spelling itself is a complex process for a speech recogniser due to the highly confusable 'e-set' (b, c, d, e, g, etc. all sound similar).
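The fall-back policy can be sketched as a small decision routine. Only the word-then-spelling fall-back and the e-set difficulty come from the sources cited above; the lexicon, confidence threshold and letter-confirmation step below are assumptions made for illustration.

    # Sketch of the word-recognition / spelling-mode fall-back described above.
    # Lexicon, threshold and the letter-confirmation step are illustrative.

    FREQUENT_NAMES = {"london", "birmingham", "manchester", "loughborough"}
    E_SET = set("bcdegptv")           # acoustically confusable spelled letters
    CONFIDENCE_THRESHOLD = 0.7

    def enter_destination(word_hypothesis, word_confidence, spelled_letters=None):
        # First try whole-word recognition against the frequent-name lexicon.
        if word_hypothesis in FREQUENT_NAMES and word_confidence >= CONFIDENCE_THRESHOLD:
            return ("accept", word_hypothesis)

        # Otherwise drop into spelling mode and ask the driver to spell the name.
        if spelled_letters is None:
            return ("prompt", "Please spell the destination name")

        # Letters from the confusable 'e-set' are queued for explicit confirmation.
        to_confirm = [letter for letter in spelled_letters if letter in E_SET]
        return ("confirm_letters", to_confirm)

    print(enter_destination("loughborough", 0.85))          # ('accept', 'loughborough')
    print(enter_destination("brzeg", 0.40))                  # ('prompt', ...)
    print(enter_destination("brzeg", 0.40, list("brzeg")))   # confirm 'b', 'e', 'g'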
Human Factors Issues of ASR in Cars Much has been written in the past about the human factors of ASR (see, for example, Hapeshi and Jones, 1988; Baber, 1996). The following sections discuss those issues which are particularly pertinent to the incorporation of ASR in cars.
User Population It is generally accepted that a variety of user variables (age, gender, motivation, experience, etc.) may affect the success of the speaker in operating an ASR system. The avionics industry is probably the closest to the automotive industry in terms of the environmental demands on speech recognition (concurrent tasks, noise, etc.); however, whereas aircraft pilots tend to be highly-motivated, well-trained, younger and male, car drivers make up a heterogeneous sample of the general public. Indeed, it should be noted that there are very few successful public applications of ASR, probably due to the wide variation in speaking style, vocabulary, etc. Perhaps the most important factor is experience. Both the user’s experience with the specific recognition system, and their experience with technology in general may affect performance. Users who are computer literate may well adapt to speech systems more readily than naive users (Hapeshi and Jones, 1988), but they may also have over-inflated expectations of the technology. The implication for the design of in-car ASR is that both pre-use and online training must be provided. For example, the system designed by Pouteau et al (1997) allows the user to ask the system for assistance at any stage of the dialogue, to which the system responds with its current state and allowable operations. This system also provides automatic help if the user falls silent in the middle of a dialogue. As well as help for naive users, the interface should adapt for expert users; for example, shortening the dialogues as the user becomes familiar with them to avoid frustration.
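A minimal sketch of these two adaptation ideas follows: help on request (or on silence) that reports the current state and allowable operations, and prompts that shorten with experience. The wording, state names and expertise threshold are assumptions of the sketch, not the Pouteau et al design.

    # Illustrative only: context help plus prompts that shorten with experience.

    PROMPTS = {
        "destination": {
            "novice": "Please say the town you want to drive to, for example 'Loughborough'.",
            "expert": "Destination?",
        },
    }

    ALLOWED_OPERATIONS = {"destination": ["say a town name", "spell the name", "cancel"]}

    def prompt_for(state, uses_so_far, expert_after=5):
        """Verbose prompt for new users, terse prompt once the dialogue is familiar."""
        style = "expert" if uses_so_far >= expert_after else "novice"
        return PROMPTS[state][style]

    def help_for(state):
        """Returned when the driver asks for help or falls silent mid-dialogue."""
        return "We are at '{0}'. You can: {1}.".format(state, ", ".join(ALLOWED_OPERATIONS[state]))

    print(prompt_for("destination", uses_so_far=1))   # full novice prompt
    print(prompt_for("destination", uses_so_far=9))   # terse expert prompt
    print(help_for("destination"))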
Dialogue Initiation Leiser (1993, p.277) notes that "an unusual feature of user interfaces to in-car devices is that there will be a combination of user-initiated and system-initiated interaction. For example, interaction with a car stereo will be largely user-initiated. A car telephone will demand a roughly equal mixture…an engine monitoring system will be largely system-initiated". Both types of dialogue initiation must be carefully designed. Because of the prevalence of sounds in the car which are not intended as inputs to the ASR device (e.g. the radio or speech from/with passengers), user-initiated dialogues must involve some active manipulation. One possibility is a 'press-to-talk' button mounted on or near the steering wheel. However, this may result in one of the major potential advantages of voice control over manual control (hands-free operation) being lost. An alternative is some keyword to bring the system out of standby mode (e.g. "wake up!", "attention!", "system on!"). In system-initiated dialogues, care must be taken not to disrupt the user's primary task of safe driving. Unless an intelligent dialogue management system which estimates the driver's spare attentional capacity is incorporated (see Michon, 1993), the system may request information when the driver is unable to easily give it. Therefore, system prompts should be designed to reassure the driver that a dialogue can be suspended and successfully taken up again later (Leiser, 1993).
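The two initiation routes can be summarised in a small state machine, sketched below. The wake keywords are those quoted above; everything else (state names, the notion of a pending prompt) is an assumption of the sketch rather than a published design.

    # Minimal dialogue-initiation sketch: press-to-talk or wake keyword for
    # user-initiated dialogues, and suspendable system-initiated prompts.

    WAKE_KEYWORDS = {"wake up", "attention", "system on"}

    class DialogueManager:
        def __init__(self):
            self.state = "STANDBY"        # speech is ignored until activated
            self.pending_prompt = None    # a system-initiated prompt awaiting the driver

        def on_user_event(self, event):
            """event is 'press_to_talk' or a recognised utterance."""
            if self.state == "STANDBY":
                if event == "press_to_talk" or event in WAKE_KEYWORDS:
                    self.state = "ACTIVE"
            elif event == "suspend":      # driver postpones the current dialogue
                self.state = "STANDBY"    # any pending prompt is kept for later

        def on_system_prompt(self, prompt, driver_busy):
            """System-initiated dialogue, e.g. an engine warning needing a reply."""
            if driver_busy:
                self.pending_prompt = prompt          # hold until spare capacity exists
                return "We can continue this later."
            self.state = "ACTIVE"
            return prompt

    dm = DialogueManager()
    dm.on_user_event("wake up")
    print(dm.state)                                            # ACTIVE
    print(dm.on_system_prompt("Oil pressure is low.", True))   # deferred politely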
Feedback Feedback is any information provided by an ASR system to allow the user to determine whether an utterance has been recognised correctly and/or whether the required action will be carried out by the system (Hapeshi and Jones, 1988). As a general human factors principle, some sort of feedback should always be provided, and it has been shown that this increases system acceptance (Pouteau et al, 1997). For many in-car applications, feedback will be implicit ('primary feedback'); that is, the action of the system (e.g. change of radio station, activation of windscreen wipers, phone ringing tone) will directly inform the user what has been recognised. In these cases, additional feedback from the speech system may not be required. If explicit ('secondary') feedback is necessary for cases when system operation is not obvious, or when the consequences of a misrecognition are particularly annoying, there are a number of possibilities. A simple system of tones has been found to be efficient and well-liked for certain ASR applications, but in the car environment there are likely to be a variety of easily-confusable abstract tones present. Spoken feedback is transient and makes demands on short-term memory (Hapeshi and Jones, 1988). It is also impossible to ignore, and may be irritating for the user. Visual feedback via a simple text display has the advantage that it can be scanned as and when required, but requires the eyes to be taken off the road. A combination of spoken and visual modes may be preferable (Pouteau et al, 1997).
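The implicit/explicit distinction amounts to a simple decision rule, sketched below. The two input flags and the chosen modality mix are illustrative assumptions; only the preference for combining speech with a text display comes from the source cited above.

    # Feedback-selection sketch: implicit feedback when the action is self-evident,
    # otherwise explicit feedback combining speech with a scannable text display.

    def choose_feedback(action_self_evident, misrecognition_costly):
        if action_self_evident and not misrecognition_costly:
            return []                     # implicit ('primary') feedback is enough
        return ["speech", "text"]         # explicit ('secondary') feedback

    print(choose_feedback(True, False))   # [] - e.g. the wipers simply start
    print(choose_feedback(False, True))   # ['speech', 'text'] - e.g. deleting a phone entry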
Effects of Noise The failure of ASR devices to cope with the noisy car environment is probably the main reason why in-car applications of speech input have been unsuccessful in the past. Noise can adversely affect both speaker and speech recognition performance at a number of levels. First, noise can impact directly on the recognition process by corrupting the input signal to the recogniser. Second, speakers tend to sub-consciously adapt their vocal effort to the intensity of the noise environment (the 'Lombard Effect'), which then adversely affects recognition accuracy. Third, noise can cause stress or fatigue in the speaker, which affects the speech produced, which in turn affects the recognition accuracy. And so on. Noise can also impact on cognitive processes outside speech production, which may affect the ability of the user to carry out the required tasks concurrently. ASR in the noisy car environment may be less problematic than in other environments such as offices or industrial settings, as it is more predictable. Although in-car noise comes from a variety of sources (engine, tyres, wind, radio, passenger speech, etc.), the speed of the vehicle can give reference points for reasonably effective noise reduction (Pouteau et al, 1997). Technological solutions for coping with noise include selection of appropriate microphone arrays, acoustic cancellation (especially from known sources such as the radio), and active noise suppression through masking or spectral subtraction (Van Compernolle, 1997). Poor accuracy associated with the Lombard Effect can be reduced by training the speech recognition templates in a variety of representative noisy environments. However, because in-car recording can be expensive, some success can be found by artificially degrading speech with environmental noise (Van Compernolle, 1997). Also, for a given individual, the effects of noise on speech production are relatively stable; therefore, speaker-dependent training of particular template sets for 'noisy speech' may be an effective solution (Hapeshi and Jones, 1988).
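Spectral subtraction, one of the suppression techniques named above, can be illustrated in a few lines. The frame length, floor factor and the assumption that the opening frames are speech-free are choices made for this sketch, not recommendations from the cited work.

    # Minimal spectral-subtraction sketch (illustrative parameter choices).
    import numpy as np

    def spectral_subtraction(signal, frame_len=256, noise_frames=10, floor=0.01):
        frames = signal[:len(signal) // frame_len * frame_len].reshape(-1, frame_len)
        spectra = np.fft.rfft(frames, axis=1)

        # Estimate the noise magnitude spectrum from frames assumed to be speech-free
        # (e.g. captured just before the driver starts to speak).
        noise_mag = np.abs(spectra[:noise_frames]).mean(axis=0)

        # Subtract the noise estimate from each frame's magnitude, keep the phase,
        # and floor the result so magnitudes never go negative.
        mag = np.maximum(np.abs(spectra) - noise_mag, floor * noise_mag)
        cleaned = mag * np.exp(1j * np.angle(spectra))
        return np.fft.irfft(cleaned, n=frame_len, axis=1).ravel()

    # Example: broadband noise throughout, a 1 kHz 'speech-like' tone after 0.375 s (8 kHz rate).
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    signal = 0.3 * rng.standard_normal(8000)
    signal[3000:] += np.sin(2 * np.pi * 1000 * t[3000:])
    print(spectral_subtraction(signal).shape)   # (7936,)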
Effects of Workload and Stress Speech has also been shown to be vulnerable to the effects of speaker workload or stress (e.g. Graham and Baber, 1993). Sources of driver mental workload include (a) the driving task itself (e.g. lane-keeping, speed choice, keeping a safe headway and distance from other vehicles), (b) the driving environment (e.g. traffic density, poor weather, road geometry, etc.) and (c) the use of in-vehicle systems (e.g. the presentation, amount and pacing of information to be assimilated and remembered). For in-car applications of ASR, this implies that the
speech recogniser may fail just when it is needed most: in high-workload conditions where the driver cannot attend to visual-manual controls and displays. Similar to the strategy adopted to overcome noise, enrolment of speech templates under a variety of representative task settings may reduce the effects of stress or workload on ASR performance. As users tend to revert to speaking more-easily-recalled words under stress, system vocabularies should be designed to be 'habitable'. The size and complexity of the vocabulary might also be reduced in stressful situations (Baber, 1996).
Conclusions Despite the barriers, it seems very likely that ASR will be widely incorporated into cars in the near future. Legislation relating to the use of mobile phones on the move and the rapid growth of the ITS market will drive its adoption. However, little research has been directed towards the use of ASR in the car environment. Further work is required into ASR dialogue design for in-car applications, particularly with respect to the mode and timing of feedback while the user is engaged in the concurrent driving task. Work is also required into the effects of the particular sources of noise and mental workload found in the in-car environment on the speaker and speech recogniser.
Acknowledgements This work was carried out as part of the SPEECH IDEAS project, jointly funded by the ESRC and the DETR under the UK Government’s LINK Inland Surface Transport programme. For further details of the project, please contact the author.
References Baber, C. 1996, Automatic speech recognition in adverse environments, Human Factors, 38(1), 142–155 Graham, R. and Baber, C. 1993, User stress in automatic speech recognition. In E.J. Lovesey (ed.) Contemporary Ergonomics 1993, (Taylor & Francis, London), 463–468 Haeb-Umbach, R. and Gamm, S. 1995, Human factors of a voice-controlled car stereo. In Proceedings of EuroSpeech '95: 4th European Conference on Speech Communication and Technology, 1453–1456 Hapeshi, K. and Jones, D.M. 1988, The ergonomics of automatic speech recognition interfaces. In D.J. Oborne (ed.) International Reviews of Ergonomics, (Taylor & Francis, London), 251–290 Leiser, R. 1993, Driver-vehicle interface: dialogue design for voice input. In A.M. Parkes and S. Franzen (eds.) Driving Future Vehicles, (Taylor & Francis, London), 275–293 Pouteau, X., Krahmer, E. and Landsbergen, J. 1997, Robust spoken dialogue management for driver information systems. In Proceedings of EuroSpeech '97: 5th European Conference on Speech Communication and Technology, vol. 4, 2207–2210 Van Compernolle, D. 1997, Speech recognition in the car: from phone dialing to car navigation. In Proceedings of EuroSpeech '97: 5th European Conference on Speech Communication and Technology, vol. 5, 2431–2434 Vanhoecke, E. 1997, Hands on the wheel: use of voice control for non-driving tasks in the car, Traffic Technology International, April/May '97, 85–87
INTEGRATION OF THE HMI FOR DRIVER SYSTEMS: CLASSIFYING FUNCTIONALITY AND DIALOGUE Tracy Ross
HUSAT Research Institute The Elms, Elms Grove Loughborough, Leics, LE11 1RG Telephone: +44 1509 611088 email: [email protected]
The implementation of advanced driver information and vehicle control systems is becoming more widespread, with many systems already appearing on the market. If information overload and reduced driver safety are to be avoided, then consideration must be given to the integration of the HMI to such systems. This paper reports on research which aims to provide human factors design advice to vehicle manufacturers and system suppliers. The main focus of this paper is the development of two classification systems: one for detailed system functionality, the other a generic description of the inputs and outputs that make up a dialogue. The way in which these will contribute to the development of the design advice is described.
Introduction It is widely accepted within the transport telematics community that the piecemeal introduction of advanced driver information and control systems into the vehicle is undesirable, and raises concerns regarding the potential for information overload and human error. An integrated systems approach is required to ensure that the design of the interface (i.e. the input and output mechanisms) used to present information to the driver is soundly based on human factors principles. To date there is a lack of research which analyses and synthesises current ergonomics knowledge in this area, in order that it may provide practical guidance for the integration of the interface to any combination of telematics systems in a modular fashion. An EPSRC-funded project (INTEGRATE) is investigating this topic, with the ultimate aim of developing design advice for vehicle manufacturers and system suppliers. This paper reports on part of the state-of-the-art review conducted at the beginning of this project (Ross et al, 1997). The whole review covered a range of topics relevant to in-vehicle HMI integration, namely: the relevant cognitive psychology literature; human factors research into HMI integration in both automotive and aerospace applications; a review of display and control technologies of potential use in the vehicle; and a summary of relevant design guidelines and standards existing and under development.
This paper focuses on two other areas of the review which are fundamental to the development of design advice planned for later in the project. These are (a) the classification of the detailed functionality of the systems considered for integration and (b) a generic classification which can be applied to each individual input and output involved in the system dialogue. In addition some practical examples of the design of an integrated in-vehicle system are provided, based on earlier commercial work at HUSAT.
Development of the Classifications Functional Classification The literature was reviewed in order to discover any existing, well accepted classifications of in-vehicle driver systems. It is necessary within INTEGRATE to develop such a classification for three reasons: first, to clarify the scope of the project for those involved; second, to create a list of system functions to discuss with manufacturers during the 'Industry Requirements' stage, the next stage of the project; and finally, to form a basis for a more detailed system classification later in the project (i.e. at the advice stage, when concrete guidelines are likely to be required for functional integration). Research on functional classifications showed that the most detailed work has been conducted in the U.S. In particular the University of Michigan Transportation Research Institute (UMTRI) (Serafin et al, 1991), the Battelle Human Factors Transportation Center (Lee et al, 1997) and the University of Iowa (Mollenhauer et al, 1997) have conducted such research from a human factors viewpoint (most other classifications have been driven by the technology). The most 'official' classification has been produced by the standards organisation ISO (ISO, 1996). The decision of the project was thus to use the ISO classification as a basis for the INTEGRATE approach, incorporating additional items from the UMTRI, Battelle and University of Iowa work and, particularly, extending the list to include conventional functionality, e.g. that of the audio system, speedometer and control pedals. The classification produced will be adapted as necessary over the life of the project. Due to space restrictions, only the main headings of the classification can be reproduced here, each with an example function. In the full listing, each heading has up to ten sub-headings (Ross et al, 1997).
On-Trip Driver Information (e.g. prevailing traffic conditions)
Route Planning, Guidance & Navigation (e.g. dynamic route guidance)
Personal Information Services (e.g. filling station location)
Mobile Office Services (e.g. phone)
Entertainment/Comfort (e.g. radio)
Electronic Financial Transactions (e.g. road pricing)
Incident Management (e.g. emergency call)
Emergency Notification/Personal Security (e.g. automatic collision notification)
Longitudinal Collision Avoidance (e.g. intelligent cruise control systems)
Lateral Collision Avoidance (e.g. automatic steering/lane support)
Intelligent Junctions (e.g. clarification of right of way rules)
Vision Enhancement (e.g. enhancement of the night road scene)
Safety Readiness (e.g. driver alertness monitoring)
Vehicle Status/Warnings (e.g. oil level)
Mechanical Controls (e.g. steering wheel)
'Secondary' Controls/Displays (e.g. side mirrors)
Classification of Inputs and Outputs Previous commercial work at HUSAT on the topic of integration necessitated the development of a classification system for a multi-function driver system. This system included conventional driver tasks, advanced driver information and some elements of vehicle control. The need for the classification of inputs and outputs arose from the requirement to 'code' such a large number of interactions in a way that would aid in the specification for an interface design to such a system. Due to the commercial nature of this work, the classification system was not based on a thorough research review and, as such, relied solely on the human factors expertise of the staff. The current project was partly inspired by this piece of work and thus it was appropriate to review the literature, in hindsight, to assess any other similar classification systems. The subsequent search was disappointing. Little was found which described inputs and outputs in sufficient detail to be of use to the project. There were some exceptions. In the driving area specifically, the previously referenced work of Battelle and the University of Iowa showed some activity in this area. In the wider human factors arena, there was little which was of practical use for this project. Therefore, the classification produced here is based on that developed during the commercial work at HUSAT. This classification has not been validated in any way to date. Its use within the project during the development of the design advice will go some way towards this validation.
Figure 1. The classification of inputs used by the INTEGRATE project
Figure 2. The classification of outputs used by the INTEGRATE project
Developing Design Advice Classifying the inputs and outputs which make up the system dialogue is a first step towards developing generic design advice for integrated systems. The past commercial work at HUSAT on the feasibility of an integrated system employed this approach with some success. However, this was for a limited number of subsystems only. The approach to the design was, for each system function, to classify each dialogue component (inputs and outputs) according to the above system. This enabled the identification of the range of input/output types to be incorporated in the design for that particular set of sub-systems. The next stage was to list these types and, for each one, to identify the most appropriate input and output method(s). An example for input is that for an action which falls into the category 'user initiated, direct, move, vertical', the appropriate input devices are up/down buttons or a turn knob. As an example for output, for an information item which is 'system generated, direct, information, status', e.g. traffic messages, it would be most suitable to use speech in conjunction with text (possibly symbols if well known) together with an initial alert tone. (A sketch of this mapping step appears at the end of this section.) By this process, it was possible to identify both the minimum and optimum sets of input and output devices that could/should be specified to achieve a usable integrated system. For the system in question the design solution incorporated: • A new, dedicated display area, showing icons of the various sub-systems (those that are 'active' are illuminated, the one currently in use is indicated) • A flexible format display area in three sections: a user input bar at the top; a large centre section where prompts, feedback and other system output will always appear; a soft key label bar, with six associated keys. • Three new flexible input devices: a quartet of vertical and horizontal 'move' keys, an alphanumeric keypad and a pair of positive/negative keys (for yes/no, proceed/cancel, etc.). • New dedicated displays/controls as follows: steering wheel/stalk controls for frequent/urgent functions; a parking aid display in the rear; a speaker for auditory output.
• Dedicated displays/controls using existing equipment, e.g. set cruising speed associated with the speedometer, activation of rear parking aid by engaging reverse gear. This preliminary specification is obviously only the first step in the design of a usable and safe system for use on the move. For example, no attempt was made to identify what should be accessible whilst driving, as this can only be decided during the detailed design of a system. However, it is a good starting point for the INTEGRATE project. What the project hopes to do is to refine this first part of the approach and, more significantly, to develop design advice in order to support the subsequent decisions that a design team has to make.
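The mapping step referred to above, from classified dialogue components to candidate devices or presentation mixes, can be sketched as a pair of lookup tables. Only the two worked examples given in the text are filled in; the tuple ordering and the lookup helper are assumptions of this sketch, not part of the INTEGRATE deliverable.

    # Sketch of the classification-to-device mapping; only the two examples from
    # the text are included, and the structure itself is illustrative.

    INPUT_MAPPING = {
        ("user initiated", "direct", "move", "vertical"):
            ["up/down buttons", "turn knob"],          # alternative input devices
    }

    OUTPUT_MAPPING = {
        ("system generated", "direct", "information", "status"):
            ["speech", "text", "initial alert tone"],  # presentation mix, e.g. traffic messages
    }

    def devices_for(kind, category):
        """Look up candidate devices (inputs) or the presentation mix (outputs)."""
        mapping = INPUT_MAPPING if kind == "input" else OUTPUT_MAPPING
        return mapping.get(category, ["no guidance yet"])

    print(devices_for("input", ("user initiated", "direct", "move", "vertical")))
    print(devices_for("output", ("system generated", "direct", "information", "status")))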
Industry Requirements The next stage of the project is to identify the industry requirements for the format and content of the design advice to be produced. Until this is complete, the exact nature of the rest of the research is uncertain. It is hoped, however, that use can be made of more advanced technologies for conveying the design advice. For example, certain car companies are already employing time-saving design tools in other areas of vehicle design. Such software systems endeavour to create an 'environment' within which the design team are free to explore different implementation options. The software has in-built rules and knowledge which restricts the design to that which is technically or legally possible as determined by current knowledge. That is, the designer does not need to 'know' the rules and regulations in order to design within them. The INTEGRATE project will investigate the possibilities of offering human factors advice in this way.
Acknowledgements The INTEGRATE Project is funded by the EPSRC Innovative Manufacturing Initiative, Land Transport Programme, Telematics Call. Other partners in the project are Coventry University Knowledge Based Engineering Centre, and the Motor Industry Research Association.
References ISO 1996, Transport Information and Control Systems: Fundamental TICS Services, Technical Report of ISO TC204/WG1 Architecture, Taxonomy and Terminology, May 1996 Lee, J.D., Morgan, J., Wheeler, W.A., Hulse, M.C. and Dingus, T.A. 1997, Development of Human Factors Guidelines for Advanced Traveler Information Systems (ATIS) and Commercial Vehicle Operations (CVO): Description of ATIS/CVO Functions, Publication No. FHWA-RD-95–201, (U.S. Department of Transportation, Federal Highway Administration) Mollenhauer, M.A., Hulse, M.C., Dingus, T.A., Jahns, S.K. and Carney, C. 1997, Design decision aids and human factors guidelines for ATIS displays. In Y.I. Noy (ed.) Ergonomics and Safety of Intelligent Driver Interfaces, (Lawrence Erlbaum, Mahwah, New Jersey), 23–61 Ross, T., Burnett, G., Graham, R., May, A. and Ashby, M. 1997, State-of-the-Art Review: Human Machine Interface Integration for Driver Systems, INTEGRATE Project, Deliverable 1. EPSRC Innovative Manufacturing Initiative, Land Transport Programme, Telematics Serafin, C., Williams, M., Paelke, G. and Green, P. 1991, Functions and Features of Future Driver Information Systems, Technical Report UMTRI-91–16, (University of Michigan Transportation Research Institute)
SUBJECTIVE SYMPTOMS OF FATIGUE AMONG COMMERCIAL DRIVERS P.A.Desmond Human Factors Research Laboratory University of Minnesota 141 Mariucci Arena 1901 Fourth St. S.E, Minneapolis MN 55414 U.S.A
A study of real-life driving is reported in which the subjective symptoms of fatigue were explored in commercial drivers. Drivers completed subjective measures to assess fatigue, mood and cognitive interference before and after their driving trip. A post-drive measure of active coping was also administered. Prior to the driving trip, drivers also completed the Fatigue Proneness scale of the Driver Stress Inventory. The findings showed that subjective fatigue was characterised by changes in mood and cognitive state. Drivers not only experienced increased fatigue but also experienced increased tension, depression, annoyance, and cognitive interference. The study also showed that post-drive tension and fatigue related to drivers' reports of fatigue reactions to real driving, as measured by the Fatigue Proneness scale. However, active coping was unrelated to changes in most of the subjective state measures. Introduction Driver fatigue remains a significant problem for the commercial driving industry. Many studies have been conducted to examine professional drivers' performance over prolonged periods of time (e.g. Mackie & Miller, 1978). However, as McDonald (1984) points out, researchers have largely neglected drivers' subjective experience of fatigue. Desmond (1997) has stressed the importance of the subjective component of fatigue in the light of transactional theories of the effects of stressors on driver performance (Matthews, 1993). Transactional theories of driver stress propose that stress reactions are the product of a complex dynamic interaction between the individual and his or her environment such that stress outcomes are the result of the driver's appraisal of the demands of driving and coping strategies. Since fatigue and stress share similar energetical properties (e.g. Cameron, 1974), we can propose that the driver's appraisals and coping strategies also play an important role in driver fatigue. The present study attempted to explore this possibility in a study of professional drivers' subjective states.
Recent simulator studies of driver fatigue (e.g. Desmond, 1997) have shown that the subjective pattern of the fatigue state is a complex one. In Desmond's studies, drivers performed both a fatiguing drive, in which they performed a demanding secondary attention task in addition to the primary task of driving, and a control drive without a secondary task. Drivers completed a variety of subjective state measures to assess mood, fatigue, motivation and cognitive interference before and after the drives. The findings indicated that following the fatiguing drive, drivers experienced increased subjective fatigue symptoms such as physical and perceptual fatigue, as well as boredom, de-motivation and apathy. Drivers also experienced increased tension, depression and cognitive interference indicating that the fatiguing drive was mildly stressful. Thus, it is expected that fatigue will be characterised by changes in mood and cognitive state in the present study. Desmond also investigated the relationship between Fatigue Proneness, a dimension of driver stress measured by the Driver Stress Inventory (DSI: Matthews, Desmond, Joyner, Carcary & Gilliland, 1997), active coping, and a variety of state measures in these studies. The findings indicated that Fatigue Proneness predicted changes in state measures of fatigue, mood and cognitive interference. Moreover, active coping predicted changes in the fatigue state measures, and also predicted changes in mood and cognitive state measures. These studies provided support for the utility of a transactional model of stress that incorporates a Fatigue Proneness trait. The studies showed that Fatigue Proneness, like other DSI dimensions such as Aggression and Dislike of Driving, relates to specific mood states and coping strategies. In the present study, the Fatigue Proneness scale and an active coping scale were used to predict changes in measures of fatigue, mood and cognitive state in a sample of professional drivers. It was expected that both Fatigue Proneness and active coping would predict changes in fatigue, mood and cognitive state measures. Method Fifty-eight Australian professional truck drivers participated in the study. Drivers ranged in age from 23 to 61 years (M=37.09). Time since obtaining an Australian truck driver's license ranged from 2 years to 38 years (M=15.08). The duration of driving trips ranged from 6 hours to 18 hours and 50 minutes (M=11 hours and 38 minutes). All drivers completed measures of fatigue, mood and cognitive components of subjective stress states before and after their driving trip. An 11-item fatigue scale was used to measure fatigue. Mood was assessed with the UWIST Mood Adjective Checklist (Matthews, Jones & Chamberlain, 1990). A shortened version of the modified Cognitive Interference Questionnaire (Sarason, Sarason, Keefe, Hayes & Shearin, 1986) was used to assess intruding thoughts. The CIQ requires subjects to rate the frequency with which they experienced specific thoughts. The scale comprises 4 items relating to task-relevant interference and 6 items relating to task-irrelevant personal concerns such as personal worries, friends, and past events. An unpublished post-drive measure of active coping was also administered. This scale consists of 12 items concerning coping strategies that relate to the driving task itself. Drivers also completed the DSI Fatigue Proneness scale before their driving trip.
Results Table 1 gives means and standard deviations for the state measures. Two-tailed paired-subjects t-tests were calculated for each state variable to measure the extent to which the level of pre-drive state differs from the level of post-drive state (see Table 1). The results of these analyses show significant increases in tense arousal, depression, fatigue and task-relevant and task-irrelevant cognitive interference following the drive. In addition, a significant decrease in energetic arousal was found following the drive. Table 1. Descriptive statistics for pre- and post-drive state measures
**p<.01, ***p<.001
Pearson correlations were calculated to measure the possible associations between the Fatigue Proneness and active coping scales and state measures. Partial correlations were also calculated in which each pre-drive state variable was controlled in order to determine if the change in subjective state was related to the Fatigue Proneness and active coping scales. Table 2 gives these correlations. Table 2. Correlations between Fatigue Proneness, Active Coping and state measures
*p<.05, **p<.01.
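For readers who wish to reproduce this kind of analysis, the sketch below shows a paired t-test on pre- versus post-drive scores and a first-order partial correlation of Fatigue Proneness with a post-drive state measure controlling for its pre-drive level. The generated data are random placeholders; only the sample size matches the study.

    # Paired t-test and first-order partial correlation, as used for Tables 1 and 2.
    # The generated data are placeholders, not the study's values.
    import numpy as np
    from scipy.stats import ttest_rel, pearsonr

    rng = np.random.default_rng(1)
    n = 58                                            # drivers in the study
    pre_fatigue = rng.normal(10, 3, n)
    post_fatigue = pre_fatigue + rng.normal(4, 3, n)  # fatigue rises after the trip
    fatigue_proneness = 0.4 * post_fatigue + rng.normal(0, 2, n)

    # Two-tailed paired-subjects t-test on pre- vs post-drive fatigue.
    t, p = ttest_rel(post_fatigue, pre_fatigue)
    print(f"t({n - 1}) = {t:.2f}, p = {p:.4f}")

    def partial_corr(x, y, z):
        """First-order partial correlation r_xy.z from the standard formula."""
        r_xy, r_xz, r_yz = pearsonr(x, y)[0], pearsonr(x, z)[0], pearsonr(y, z)[0]
        return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

    # Does the change in fatigue relate to Fatigue Proneness, controlling pre-drive level?
    print(f"partial r = {partial_corr(fatigue_proneness, post_fatigue, pre_fatigue):.2f}")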
The results show that Fatigue Proneness was positively associated with pre- and post-drive tense arousal, fatigue and task-relevant and irrelevant cognitive interference. Fatigue Proneness was also negatively associated with pre- and post-drive hedonic tone, and with post-drive energetic arousal. The partial correlations indicate that the increase in tense arousal and fatigue during the drives was related to the Fatigue Proneness scale. Pearson correlations between the active coping scale and state measures show that active coping is unrelated to the state measures. The partial correlations show that the decrease in anger/frustration during the drive was related to active coping while changes in the other state measures are unrelated to active coping. Thus, in contrast to the findings of Desmond's (1997) simulator studies, it appears that the increase in fatigue found in the present study does not relate to active effortful coping. There is a concern that the correlations shown in Table 2 may be an artifact of the duration of driving trips that might, conceivably, be confounded with both Fatigue Proneness and state measures. Thus, in order to address this possibility, driving trip duration was correlated with the state measures, Fatigue Proneness and active coping scales. The results showed that trip duration was unrelated to Fatigue Proneness and state measures. Table 3 gives the state change scores for the significant state changes found in the present study and Desmond's (1997) two simulator studies. The scales show consistency in the direction but not in the magnitude of change. With the exception of energetic arousal, the magnitude of change for the state measures is substantially larger in the simulator studies than in the present field study. These results suggest that the simulator drives elicit stronger and more stressful reactions than the real drives. Table 3. Standardised state change scores for field and simulator studies
Discussion The results of the study support the first hypothesis proposed. The first hypothesis stated that subjective fatigue would be characterised by changes in mood and cognitive state. The findings from the pattern of changes in the subjective state measures provide support for this hypothesis. Drivers not only experienced increased fatigue but also experienced increased tension, depression, annoyance and task-relevant and task-irrelevant cognitive interference. Thus, this study has
replicated the results from Desmond's (1997) simulator studies, and provides further evidence of the emotional and cognitive changes that characterise fatigue. The second hypothesis stated that Fatigue Proneness and active coping would be related to changes in mood and cognitive state. The results of the study provide some support for the first part of this hypothesis. Fatigue Proneness was found to relate to changes in tension and fatigue. This result is consistent with the findings of Desmond's simulator studies. However, active coping was unrelated to changes in almost all of the subjective state measures. This latter result is inconsistent with the findings from the simulator studies in which active coping was found to relate to post-drive fatigue, mood, and cognitive state. This inconsistency may be explained by the difference in the magnitude of subjective state change found in the present study and the simulator studies. The results showed that state change for most of the subjective measures was substantially larger in the simulator studies than in the field study. Thus, it appears that the simulator drives were experienced as more stressful by drivers than the real drives. The simulated environment represents a novel situation for drivers and its unfamiliarity may serve to heighten stressful reactions. An alternative explanation is that the determinants of active coping may differ in simulated and real drives. Post-drive active coping was higher in the field study, implying drivers may be generally more motivated to cope with stress in real life. In conclusion, the present study supports the utility of a transactional model of driver stress in accounting for the relationships between Fatigue Proneness, cognitive processes and affective reactions in the real-world context. However, the role of coping in mediating the relationship between Fatigue Proneness and acute fatigue reactions requires further investigation. References Cameron, C. 1974, A theory of fatigue. In A.T. Welford (ed.), Man Under Stress, (Taylor and Francis, London). Desmond, P.A. 1997, Fatigue and stress in driving performance. Unpublished doctoral thesis, University of Dundee, Scotland. Mackie, R.R. & Miller, J.C. 1978, Effects of Hours of Service, Regularity of Schedules and Cargo Loading on Truck and Bus Driver Fatigue. Santa Barbara Research Park, Goleta, California. Matthews, G. 1993, Cognitive processes in driver stress. In Proceedings of the 1993 International Congress of Health Psychology, (ICHP, Tokyo). Matthews, G., Jones, D.M. & Chamberlain, A.G. 1990, Refining the measurement of mood: The UWIST Mood Adjective Checklist. British Journal of Psychology, 81, 17–24. Matthews, G., Desmond, P.A., Joyner, L.A., Carcary, W. & Gilliland, K. 1997, A comprehensive questionnaire measure of driver stress and affect. In Traffic and Transport Psychology, (Elsevier, Amsterdam). McDonald, N.J. 1984, Fatigue, safety and the truck driver, (Taylor and Francis). Sarason, I.G., Sarason, B.R., Keefe, D.E., Hayes, B.E. & Shearin, E.N. 1986, Cognitive interference: Situational determinants and traitlike characteristics. Journal of Personality and Social Psychology, 51, 215–226.
HOW DID I GET HERE?—DRIVING WITHOUT ATTENTION MODE J.L.May & A.G.Gale
Applied Vision Research Unit, University of Derby, Mickleover, Derby, DE3 5GX, UK. Tel\Fax 44 1332 622287 E-mail: [email protected]
Driving without attention mode (DWAM) is a state where the driver loses awareness while driving. Evidence of this is when the driver suddenly realises their location without being able to recall the actual process of having driven to get there. DWAM is not only found in car drivers but may also be a causal factor in some cases of SPaDs (Signals Passed at Danger) experienced by train drivers. The number of accidents directly caused by DWAM is unknown, but it clearly constitutes a serious hazard. This paper reviews research in the area of DWAM and discusses its relevance to all vehicle drivers. Measures introduced to help prevent DWAM are discussed.
Introduction Many drivers recount some experience of "waking up" while driving and realising that they have driven some distance without being able to recall exactly how they got there (Kerr, 1991). Driving is a highly visual task requiring constant monitoring of the road environment to ensure that vehicle control is maintained. The presence of drowsiness or states of inattention which might occur while the driver is travelling may therefore result in inappropriate responses, or no responses, being made to a potentially hazardous situation. Such actions are often attributed by the person concerned to inattentiveness, a lowered state of awareness or fatigue leading to a failure to react adequately to the changes in the road situation. There is evidence, however, to suggest that it is more than just a case of driver fatigue (Furst, 1971), with drivers often observed to sit in the normal driving position gazing straight ahead with a glassy stare. Williams (1963) attributed this cognitive state to trance inducive features in the driving situation such as repetitive and monotonous stimulation, particularly on highways. Drivers also often report that they were in a "trance-like state" rather than asleep. This has led to the term "highway hypnosis" being widely used to refer to this state of inattention while driving. One of the earliest accounts of the term 'road hypnotism' is by Brown (1921):
"A large limousine was rolling north at 15 miles an hour. At the rear a similar vehicle approached moving faster…. There was ample space for the second car to pass, but to my astonishment it came up behind and crashed squarely into the first machine. It was absurd. The second driver [a chauffeur] had sat at ease, his hands on the wheel, his gaze straight ahead. There was nothing to divert his attention…. Asleep at the wheel—sound asleep. The driver had been gazing at the bright streaming roadway flowing smoothly beneath him. Its monotonous sameness concentrated his mental faculties to the point of inducing momentary self hypnotism." This highlights some of the beliefs concerning highway hypnosis, namely: a hypnotic trance, sleep inducive features of the road, fatigue, pre-occupation and a recognition of the condition by professional drivers. The symptoms commonly detailed in the literature are: a trance-like state or glassy stare, late recognition of road hazards, a gradually developing steering bias, or auditory and visual hallucinations, with the driver eventually even falling asleep. The condition seems more likely to occur on familiar roads or featureless countryside under conditions of monotonous travelling "where the lack of novelty promotes passive and automatic responses" (Williams and Shor, 1970). Hallucinatory experiences and poor judgement are commonly reported by truck drivers (Wertheim, 1981). This term 'highway hypnosis', although popular with the media, is a barrier to scientific inquiry into this problem (Brown, 1991). More recently this condition has therefore been referred to as Driving Without Attention Mode (DWAM) (Kerr, 1991), being defined as a state of inattention or loss of awareness to the driving task by the person controlling the vehicle. Although mainly documented in car driving, DWAM has also been recognised as potentially affecting train drivers and aircraft pilots (e.g. Kerr, 1991). It becomes more prominent as people control vehicles for prolonged periods. Similar phenomena have been shown to occur when performing routine industrial operations. It is possible therefore that many repetitive tasks performed under predictable and monotonous conditions could produce a similar cognitive state.
Theories of causes of driving without attention mode. Only a few explanations have been offered of this phenomenon. Some of the theories do not adequately explain its nature or origins and are very difficult to validate in experimental research. a) Fatigue McFarland and Moseley (1954) suggested that fatigue may be the most important factor in DWAM. However, while fatigue may facilitate the occurrence of this condition, drowsiness in car driving has been shown to also occur when there is no evidence of excessive fatigue (Roberts, 1971). Currently there is insufficient evidence to determine the exact relationship between fatigue, sleepiness and DWAM, although fatigue may be a contributory factor but not necessarily a causal one. b) Hypnosis Williams (1963) suggested that the monotony of the surroundings and the necessity to attend to only a very small part of the visual field might induce some sort of hypnotic trance. There is no direct evidence to support a relationship between hypnosis and DWAM but it is recognised that the hypnotic state is affected by sleep deprivation.
c) Hyperinsulinism Roberts (1971) suggested that the occurrence of excessive drowsiness might be due to functional hyperinsulinism, an over-sensitivity to a certain concentration of sugar in the blood. This may lead a person (particularly narcoleptics, who suffer from unsuspected attacks of sleepiness) to experience sudden attacks of lowered consciousness. However, the symptoms are too common within the population of drivers for this to be a general explanation (Wertheim, 1991). d) Automation of Driving Task The use of automatic or higher cognitive processing may cause some states of DWAM (cf. Reason, 1987). Vehicle control requires a mixture of controlled and automatic responses. When the driving environment becomes more predictable, less feedback is required, and it is such predictability which induces DWAM. For instance, train drivers may experience false expectations of signal aspects which could restrict their perception and assimilation of true information (e.g. a signal may be interpreted as orange when it is in fact red). The probability of making such errors increases with task proficiency. Buck (1963) attributed possible causes of some SPaDs to factors such as a driver incorrectly estimating his location or totally losing his position on the track and missing a signal or selecting the wrong signal. Drivers have to learn the route before they are allowed to drive it and thus a state of automatic processing or inattention may occur when driving on a well-learned route. e) Oculomotor Control and Alpha Activity Wertheim (1991) proposed that DWAM is associated with a lessening of reliance upon attentive stimulus information and a move towards internally governed oculomotor control, based on internal representations. This is a shift from actively reacting to changes in the environment to monitoring an unchanging mental representation of the predictable conditions. f) Monotony Monotonous tasks can depress the levels of performance and arousal due to lessened sensory stimulation. DWAM may develop by continuously looking at the same objects in the visual field moving in a predictable pattern. A monotonous road situation, however, does not always imply a predictable one (e.g. when driving in thick fog).
Problems of DWAM Quantification Few official accident records have been kept detailing the occurrence of DWAM and therefore it is very difficult to gain an understanding of the size of the problem. Retrospective interviewing of drivers is problematic.
Definition Limited public awareness of DWAM, and ambiguity between it and other potential states such as drowsiness, may lead to under-reporting. For example some train drivers may have a tendency to drowse whilst driving (Endo and Kogi, 1975).
Experimental work DWAM is a far-reaching problem and existing experimental work may not have addressed all the contributing factors or considered all their interactions. For instance, in addition to the above, the effects of common drugs such as alcohol, caffeine and nicotine need to be determined. The role of diurnal variations and biological rhythms may also be important, particularly for pilots who frequently cross time zones.
Possible Solutions Devices A variety of devices have been developed which monitor the physiological alertness of the body and activate a warning when the driver becomes drowsy. For instance, devices registering the pulse, blood pressure, muscular reflex or eyelid reflex have been studied. Two examples highlighted by Roberts (1971) are the 'Buzz Bonnet' and the 'Autoveil device'. The response to devices such as these, however, can occur so late that their value is greatly diminished. It is also unlikely that one single device universally addresses the problem of DWAM, and it is important to look at each individual task and each user's limitations and capabilities. Wilde and Stinson (1983) found that certain types of vigilance devices for train drivers were not linked with direct control of the train. These could in fact divert the driver's attention away from driving the train to the task of cancelling the warning, and thus fail to focus the driver's attention. Vigilance devices should direct the driver's attention to some specific train driving task such as speed control. The Automatic Warning System (AWS) was designed to warn train drivers of signal aspects, and possibly it may not be effective in the long term. It is feasible that, due to the frequency with which the driver has to cancel the AWS, they may learn to respond automatically. Cancelling the AWS may therefore become somewhat ineffective against attention loss. Also, while such vigilance devices and safety systems have been introduced, it may be many years before such systems are implemented widely. The problem of SPaDs must therefore still be addressed.
Steps the driver can take. There are various measures a driver can take if experiencing DWAM, such as taking a break or listening to a radio. The ability of the driver to take such steps will depend on: • their recognition of symptoms and awareness that they are not attending to the task. If the features of the road do have some hypnotic effect, however, then the driver may not be aware of it. • their recognition and understanding of DWAM. They may think that once they have recognised the symptoms of DWAM then they will be able to keep themselves awake and so continue driving. • their willingness and ability to perform such steps. For example a driver may be under a time pressure to reach a destination.
Driver Education Many drivers are unaware of DWAM, under-recognise its occurrence, and do not know what measures to take to reduce it. It is important that the driver recognises the
symptoms of DWAM so that they can make an appropriate judgement on their physical and mental fitness to drive.
Roadway Engineering This needs to determine ways in which novelty and variability can be introduced into the driver’s task and environment. Such measures include the introduction of minor curves on long straight stretches of road, different types of road surfaces producing changes in noise and vibration, and rumble strips placed at the side of the road.
Conclusion Research into DWAM needs to bring together all transportation areas and not just concentrate on car driving. Theoretical explanations tend now to emphasise higher levels of learning and automation, in conjunction with the predictability of the external vehicle scene. Further research is needed to look at the role of all possible contributory factors. Current devices to counteract DWAM may not be fully appropriate in addressing the problem. Human factors research is needed, not only in defining the problem and its components, but also in assessing the suitability of devices and roadway engineering to address the problem and assess the long term benefits.
References Brown, W. 1921, Literary Digest, June 4th, 69, 56–57. Brown, I. 1991, Highway Hypnosis: Implications for Road Traffic Researchers and Practitioners. In A.G. Gale et al. (eds.) Vision in Vehicles III, (Elsevier Science Publishers B.V., North-Holland). Buck, L. 1963, Errors in the Perception of Railway Signals, Ergonomics, 11(6). Endo, T. & Kogi, K. 1975, Monotony Effects of the Work of Motormen during High Speed Train Operation, Journal of Human Ergology, 4, 129–140. Furst, C. 1971, Automizing of Visual Attention, Perception and Psychophysics, 10, 65–69. Kerr, J.S. 1991, Driving without attention mode (DWAM): A normalisation of inattentive states in driving. In Vision in Vehicles III (op. cit.). McFarland, R.A. & Moseley, A.L. 1954, Human Factors in Highway Transport Safety, (Harvard School of Public Health, Boston). Reason, J.T. 1987, The cognitive basis of predictable human error. In E.D. Megaw (ed.) Contemporary Ergonomics, (Taylor and Francis, London). Roberts, H.J. 1971, The Causes, Ecology and Prevention of Traffic Accidents, (Charles C. Thomas, Springfield, Illinois, USA). Wertheim, A.H. 1981, Occipital alpha activity as a measure of retinal involvement in oculomotor control, Psychophysiology, 18(4), 432–439. Wertheim, A.H. 1991, Highway Hypnosis: A Theoretical Analysis. In Vision in Vehicles III (op. cit.). Wilde, G.J.S. & Stinson, J.F. 1983, The Monitoring of Vigilance in Locomotive Engineers, International Journal of Accident Analysis and Prevention, 15(2), 87–93. Williams, G.W. & Shor, R.E. 1970, An Historical Note on Highway Hypnosis, Accident Analysis and Prevention, 223–225. Williams, G.W. 1963, Highway Hypnosis: A Hypothesis, International Journal of Clinical and Experimental Hypnosis, 103, 143–151.
SENIORS’ DRIVING STYLE AND OVERTAKING: IS THERE A “COMFORTABLE TRAFFIC HOLE”? Tay Wilson
Psychology Department, Laurentian University Ramsey Lake Road Sudbury, Ontario, Canada, P3E 2C6 tel (705) 675–1151
"Conventional wisdom" held by many on both sides of the Atlantic is that driving near the speed limit results in being overtaken by a continuous stampede of drivers, resulting in significant trip delay and thus necessitating driving with ever more speed to avoid being overtaken. This conventional wisdom is controverted by an on-road study on the British A1, in which a relatively "comfortable traffic hole" was discovered, consisting of driving in the inside lane wherever possible at a speed just above the legal speed of lorries (60 m.p.h.). Actual trip time delay caused by involuntary slowing for traffic events was found to be of the order of only 10 minutes over journeys of about five hours. Trip aspects relevant to mental load and risk and subjective time estimate implications are discussed. "Conventional wisdom" held by many on both sides of the Atlantic is that driving near the speed limit results in being overtaken by a continuous stampede of drivers and thus, by being trapped in the inside lane behind slow traffic, incurring significant trip time delay. Extending this reasoning leads to a sort of unstable positive feedback loop in which drivers adopt driving styles involving traveling at ever higher speeds to avoid being overtaken. This tactic presents real risk, not least to older and some other categories of drivers who are not inclined towards high-speed driving. In a series of three earlier on-road studies of overtaking (Wilson and Neff, 1995; Wilson, 1996; Wilson, 1997a, b; see above), actual patterns of overtaking on various Canadian roads have been examined which belie much of this "wisdom". On the basis of this earlier work, it is hypothesized here that there exists, on the British A1, a relatively "comfortable traffic hole" which older and other drivers can use to reduce apparent mental workload and risk. This "comfortable traffic hole" consists of driving in the slow lane just faster than the posted speed limit for lorries (60 m.p.h.). Furthermore, it is hypothesized that, contrary to subjective estimates by many (Wilson and Ng'andu, 1994) of "lost" time due to this driving-style strategy, little actual trip time will be lost.
Method The test driver was a senior citizen aged 77 with a valid driver's licence, a half century of driving experience, training as a multi-vehicle military driver and 40 years' experience of driving the A1 several times a year. The car was a 1997 Volkswagen Polo with only a few hundred miles on the odometer. Two normal trips along the A1 were chosen. The northbound trip began at 2:29 p.m. on April 17, 1997 at the Stoke Rochford entrance to the A1 and terminated at 7:28 p.m. at Berwick. The southbound trip began at 3:30 p.m. at the Felton entrance to the A1 and ended at 8:20 p.m. at the turn-off from the A1 to Cambridge. The driver was instructed to drive, as normal, at a target speed of 60 to 65 m.p.h., just faster than the speed limit for lorries on the road (60 m.p.h.) and considerably slower than the posted speed limit for cars (70 m.p.h.). The target speed dictated that the test driver would spend most of his time in the inside or slow lane. Each trip included a rest lasting about forty five minutes. The experimenter recorded on-going speed, overtaking events, involuntary slowing and other salient events. Since the experimenter had ridden as passenger with the test driver many times and had had many conversations about driving during this time, it is likely that observations by the experimenter caused minimal disturbance to normal driving.
Results
Table 1 tabulates the cars and lorries overtaking, and being overtaken by, the test driver over 30-minute trip segments for the northbound trip (trip 1) and the southbound trip (trip 2) on the A1 in Britain. (Rests of 40 and 55 minutes were taken during the two trips.) On the northbound trip, about 15 times as many cars (438) overtook the test driver as were overtaken by him (28), while he overtook about ten times as many lorries (106) as overtook him (10). On the southbound trip, about nine times as many cars (370) overtook the test driver as were overtaken by him (43), while he again overtook about ten times as many lorries (126) as overtook him (12). (χ2, p<0.005 for all four comparisons.) No significant mean or variance differences between the northbound and the southbound trip were found for car overtaking, for lorry overtaking, for being overtaken by cars or for being overtaken by lorries. Thus one is unable to reject the null hypothesis that there was no difference in the overall overtaking experience on the two trips, and hence they may be regarded as two typical A1 trips. Table 2 lists the occasions, with salient comments, on which the test driver was forced to slow below his preferred cruising speed or for roundabouts, by time of day, duration in minutes (all fractions of minutes were rounded up to the next whole minute), overtaking experience during the slow-down (overtaking or being overtaken by cars or lorries) and enforced slow-down speed. Consider involuntary slowing to accommodate traffic conditions in Table 2. In the two trips, only one minute of enforced slowing below the test driver’s preferred cruising speed was caused by a slower car ahead, six minutes by traffic merging into lanes of the A1, and sixteen minutes below Newcastle by slower lorries, while 28 minutes was caused by slower lorries on the one-lane section of the road below Berwick. Seven minutes of enforced slowing was due to seeing brake lights ahead for various reasons including traffic build-up. Ten and eleven minutes of enforced slowing were due to roundabouts and warning signs respectively.
A clear road with no visible traffic ahead was experienced for six minutes on each trip. The test driver spent only three minutes in the fast lane on each of the trips beyond immediate overtaking-and-return manoeuvres, both times while traveling slowly in heavy traffic.
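The chi-square comparisons reported above can be reconstructed, for illustration, as goodness-of-fit tests of each overtook/was-overtaken pair of counts against an equal split; the paper does not state the exact form of the test, so the short Python sketch below is an assumption rather than the author's analysis.

    from scipy.stats import chisquare

    # Counts summarised in the Results text, as
    # (vehicles that overtook the test driver, vehicles overtaken by him).
    pairs = {
        "trip 1, cars":    (438, 28),
        "trip 1, lorries": (10, 106),
        "trip 2, cars":    (370, 43),
        "trip 2, lorries": (12, 126),
    }

    for label, counts in pairs.items():
        stat, p = chisquare(list(counts))   # expected frequencies default to an equal split
        print(f"{label}: chi-square = {stat:.1f}, p = {p:.3g}")

Under this assumed test form, all four comparisons give p values far below 0.005, in line with the significance level quoted.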
Discussion
The data here support the existence of a ‘comfortable traffic hole’ within which seniors and others can normally drive relatively comfortably, without undue delay or stressful events. From the time spent in involuntary driving at 60, 55, 50, and 45 m.p.h. respectively, it can be determined that the distance ‘lost’ because of involuntary reductions in speed below the 65 m.p.h. target speed amounted to a total of twenty-four miles over ten hours of trip time, or, in round terms, about 10 minutes of extra driving time on each of the two trips. (About 25% of this time (Newcastle in trip 1 and construction in trip 2) would have been difficult to “save” by any amount of more aggressive driving because all traffic lanes were clogged and moving at the same speed.) In conclusion, there was no meaningful trip delay caused by the driving strategy of the test driver. (Wilson and Ng’Andu (1994) found evidence suggesting that high-accident drivers report subjective trip time to be higher than actual time, whereas low-accident drivers report no difference.) Second, consider the events. Over the two trips on the A1, beyond the short time delays described above, the test driver experienced no occasion on which he was not in his preferred, comfortable, and “legal and safe” position on the road. Only on one occasion (for a merging bus) did the test driver brake to slow down in a manner perceptible to the experimenter, and that was best described as light braking. The conclusion is that there does appear to be a driving style on the A1 suitable for many older and other drivers which provides a relatively lower risk and lower mental load journey without losing significant trip time, namely driving just faster than the speed limit for lorries (60 m.p.h.) in the inside lane for as much of the journey as possible. (Note that driving at this speed results in being overtaken about once per minute by cars and twice per hour by lorries.)
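The "lost time" estimate above can be made explicit with a short calculation: for each spell of involuntary slowing, the distance shortfall relative to the 65 m.p.h. target is accumulated and then converted back into extra driving time at the target speed. The minutes assigned to each speed band in the Python sketch below are placeholders for the Table 2 values, which are not reproduced here, so the printed figures are illustrative only.

    # Illustrative lost-time calculation; the minutes per speed band are
    # placeholders, not the actual Table 2 durations.
    TARGET_MPH = 65
    slowing_minutes = {60: 20, 55: 15, 50: 10, 45: 10}   # hypothetical durations

    shortfall_miles = sum((TARGET_MPH - speed) * minutes / 60
                          for speed, minutes in slowing_minutes.items())
    extra_minutes = shortfall_miles / TARGET_MPH * 60

    print(f"distance shortfall: {shortfall_miles:.1f} miles")
    print(f"extra driving time: {extra_minutes:.1f} minutes over both trips")

Applied to the actual Table 2 durations, this appears to be the calculation underlying the twenty-four miles (about 10 minutes per trip) quoted in the Discussion.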
References
Wilson, Tay and Ng’Andu, Bwalya, 1994, Trip time estimation errors for drivers classified by accident and experience. In S.A.Robertson (ed.) Contemporary Ergonomics (Taylor & Francis, London), 217–222.
Wilson, Tay and Neff, Charlotte, 1995, Vehicle overtaking in the clear-out phase after an overturned lorry has closed a highway. In S.A.Robertson (ed.) Contemporary Ergonomics (Taylor & Francis, London), 299–303.
Wilson, Tay, 1996, Normal traffic flow usage of purpose built overtaking lanes: a technique for assessing need for highway four-laning. In S.A.Robertson (ed.) Contemporary Ergonomics (Taylor & Francis, London), 329–333.
Wilson, Tay, 1997a, Overtaking on the Trans-Canada: conventional wisdom revised. In S.A.Robertson (ed.) Contemporary Ergonomics (Taylor & Francis, London), 104–109.
Wilson, Tay, 1997b, Improving drivers’ skill: can cross cultural data help? In Don Harris (ed.) Engineering Psychology and Cognitive Ergonomics (Ashgate, Aldershot), 1, 395–401.
Table 1. A1 Car and Lorry Overtaking by 30 min. segments when driving at a 60–65 mph target speed.
Table 2. Involuntary slowing on the A1 to 60, 55, 50, 45 mph and for roundabouts by time, duration in minutes (Dur.) and car (C) or lorry (L) and overtaking event.
SPEED LIMITATION AND DRIVER BEHAVIOUR
Di Haigney1 and Ray G.Taylor2
1 Road Safety Dept., RoSPA, 353 Bristol Road, Birmingham B5 7ST
2 Applied Psychology Division, Aston University, Aston Triangle, Birmingham B4 7ET
Engineering models of driver behaviour suggest that if more stringent physical limits were enforced on speeding behaviour, a significant decrease in the frequency of speeding-related injury accidents would occur. Driver behaviour was tested under various speed limitation conditions on the Aston Driving Simulators. Participants also completed a questionnaire testing for affective response, attention and awareness of task performance per limitation condition. Differential effects of limitation were noted in driving behaviour, as well as in the frequency of accident types. The overall mean accident frequency did not vary significantly across conditions.
Introduction
ECMT (1984) indicates that a reduction in average speed throughout Europe of about one kilometre per hour could save seven percent of fatal accidents on the roads per year. Surveys carried out by the U.K. Department of Transport (DoT, 1994) involving over nine million vehicles, however, confirm a widespread disregard for speed limits, with some 60 percent of drivers exceeding the posted speed limit on motorways. Given this apparent reluctance of the public to obey posted limits, speed reduction and the concomitant reduction in injury accidents predicted above could be achieved directly through the installation of speed limiting devices in vehicles. Enforced speed limitation may not necessarily result in safety benefits of the magnitude cited above, as the ECMT (1984) safety benefit calculations are based solely upon a ‘non-interacting engineering’ model of driver behaviour. In brief, this model assumes that any changes to accident risk caused by an intervention will be passively accepted by the road user population—such that an improvement in safety can be predicted using an engineering calculation (Adams, 1985). Other patterns of behaviour are thought to remain relatively unaffected, with ‘knock on’ behavioural effects being of negligible magnitude. The ‘non-interacting’ assumption of the engineering approach has been challenged by a number of studies which have found very little direct correspondence between the predicted safety benefits calculated and the actual recorded changes in traffic accidents (Evans, 1985).
The weak relation between engineering calculations and actual safety benefit may be attributed in part to poor data collection and analysis practice (Haigney, 1995), although a number of researchers also maintain that the low association is an indication that behavioural ‘compensation’ occurs. That is, the safety benefit arising from an engineered safety intervention is exploited by drivers to allow some change in performance associated with greater positive utility (e.g. increased speed), which effectively reduces the safety benefit realised through the intervention (Wilde 1982). This study attempts to assess the responses of drivers to the introduction of enforced maximum speed restrictions of varying levels of stringency.
Method
The Aston Driving Simulator (ADS) is a fixed base, closed loop simulator in which a participant is able to interact fully with a computer generated environment via the manipulation of a steering wheel, brake and accelerator pedals, spatially arranged so as to mimic the operation of an automatic vehicle. The ADS registers and stores data on the participant’s manipulation of these controls and of the user’s performance in the simulated environment (e.g. position in the carriageway, collisions with other vehicles, collisions with the edge of the carriageway) each half second. A monitor generates a view of a single lane carriageway ‘populated’ with other simulated car drivers who travel at a steady thirty miles an hour on both sides of the road. These simulated ‘others’ are able to interact intelligently with the user through overtaking when appropriate to do so. Prior to either a practice run on the ADS or data collection, all participants were read standardised instructions outlining the experimental procedure and conditions. Each participant was told that they had been allotted fifty ‘points’ which would be reduced by a certain amount according to the ‘severity’ of collision experienced on the ADS, namely: Head-on crash=25 points lost; Rear of car in front=10 points lost; Veering off the road=5 points lost. Following a collision, the simulated vehicle would be centred back onto the left-hand carriageway and the participants would be required to pull away again from a stationary position. Participants were also informed that subjects with points remaining at the end of all experimental runs would receive this number of points in pounds sterling. Participants who continued to collide with objects after having lost all points were to pay the fifty points and the excess amount in pounds sterling to the experimenter. At this point, participants either chose to sign a slip agreeing to these conditions or did not participate further in the experiment. After having been read these instructions and having agreed to them, participants were allotted a ten-minute ‘practice’ run on the driving simulators in order to become familiar with the working of the ADS. No speed limitation or other form of performance restriction was experienced in the practice run. Following the practice run, participants knowingly entered the experimental driving conditions through responding to on-screen prompts via a keyboard. The four experimental conditions comprised restrictions on the maximum speed possible throughout each ‘run’ of 30mph, 45mph, 70mph and 120mph (the latter being, in effect, a free speed condition). Participants experienced each of the four experimental driving conditions in randomised order. The speed limitation for each condition was displayed on the monitor to participants prior to each ‘run’. Each participant experienced the same ‘track dynamics’ under each limitation condition. After completing the four experimental sessions on the ADS, participants completed a Driver Attitude Questionnaire (DAQ), developed in part from a driving attitude questionnaire devised by Parker et al (1994) and in part from a questionnaire devised by Hoyes (1990). The DAQ assesses perceived risk per condition, awareness of changes in driving behaviour per condition and the participant’s utility of speed. Subjects were paid in pounds sterling for any ‘points’ remaining after the experimental run. Participants who had expended their allocation of points and who had an ‘excess’ to be
paid to the experimenter, were informed that they would be invoiced after all participants in the experiment had been run. In reality, and delayed in order to maintain the credibility of the mechanism, these subjects were informed, after all subjects had completed the experimental run on the ADS, that no monies were in fact required to be paid.
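As an illustration of the incentive scheme described in the Method, the Python sketch below encodes the stated deductions and the payment rule; the function and key names are mine, and the treatment of an "excess" as a debt owed in pounds is one reading of the procedure rather than a statement of exactly how it was administered.

    # Sketch of the points-based incentive scheme: 50 starting points, 25 lost
    # per head-on crash, 10 per collision with the car in front, 5 per run off
    # the road; remaining points are paid in pounds (names are illustrative).
    DEDUCTIONS = {"head_on": 25, "rear_end": 10, "off_road": 5}
    STARTING_POINTS = 50

    def payout(collisions):
        """Return (points remaining, pounds owed by the participant)."""
        lost = sum(DEDUCTIONS[kind] * n for kind, n in collisions.items())
        remaining = STARTING_POINTS - lost
        return max(remaining, 0), max(-remaining, 0)

    # Example: one rear-end collision and two runs off the road leaves 30 points.
    print(payout({"head_on": 0, "rear_end": 1, "off_road": 2}))   # (30, 0)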
Results
Forty participants—twenty male, twenty female—with a mean age of 27.44 years (SD 8.56) were tested on the Aston Driving Simulator (ADS). All participants held full, current driving licences, with mean driving experience of 8 years (SD 8.94; minimum 1 year; maximum 34 years) and mean exposure of 7600 miles per annum (SD 1449.49). Repeated measures ANOVAs were calculated for all the dependent variables provided by the ADS. Correlations between ADS variables and responses to the DAQ were calculated where appropriate. Data logged during the practice runs were not included in the analyses.
Speed Variables
Mean speed was not found to differ significantly when tested by the gender of the participants (DF=1; f=0.390; p<0.539), age (DF=1; f=1.058; p<0.175), driving experience (DF=2; f=0.202; p<0.819) or driving exposure (DF=1; f=0.646; p<0.431). Mean speed was found to differ significantly between speed limitation conditions (DF=3; f=300.5855; p<0.001). As maximum speed capability became increasingly restricted, mean speed rose as a percentage of this capability. In the 30mph condition, for example, subjects’ mean speed represented 95.64% of capability, whereas mean speed accounted for 83.03% of capability at 45mph, 77.38% at 70mph and 48.62% in the 120mph condition. Participants also demonstrated reduced variation in speed from the mean value in the more restricted conditions, with SDs for mean speed of 2.1260 in the 30mph condition, 6.5791 in the 45mph condition, 10.9756 in the 70mph condition and 14.2820 in the 120mph condition. An effect on speed variance resulting from collisions across conditions was assessed and found to be nonsignificant for each type of collision recorded (collision with vehicle in front [DF=2; f=0.356; p<0.702]; collision with verge [DF=9; f=1.472; p<0.160]; head-on collision [DF=1; f=0.303; p<0.584]).
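For reference, the percentages of capability reported above convert back into approximate absolute mean speeds as follows; the Python listing is a simple rearrangement of the reported figures, not additional data.

    # Approximate absolute mean speeds implied by the reported "percentage of
    # capability" figures for each speed limitation condition.
    percent_of_capability = {30: 95.64, 45: 83.03, 70: 77.38, 120: 48.62}

    for limit_mph, pct in percent_of_capability.items():
        mean_speed = limit_mph * pct / 100
        print(f"{limit_mph:>3}mph condition: mean speed approx. {mean_speed:.1f}mph")

This gives roughly 28.7, 37.4, 54.2 and 58.3 mph respectively, so the absolute difference in mean speed between the 70mph and 120mph conditions is small despite the large difference in permitted maximum.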
Acceleration Variables
Calculations of mean acceleration data from the ADS refer to the angular positioning of the accelerator pedal, in effect the pressure applied to the pedal. Significant differences were not recorded across any of the demographic variables: sex (DF=1; f=0.026; p<0.873), age (DF=1; f=0.086; p<0.772), experience (DF=2; f=0.386; p<0.685), exposure (DF=1; f=0.088; p<0.770). Mean acceleration was found to vary significantly across conditions, decreasing as speed limits increased (f=16.409; df=3; p<0.001). Acceleration variance was found to vary significantly across conditions (f=35.2376; df=3; p<0.001), increasing as speed limits increased—possibly indicating a greater frequency of overtaking or attempted overtaking manoeuvres (refer to ‘Lateral position variables’ below). Statistically significant differences across conditions were not detected when the effect of collisions on accelerator variance was controlled (collision with vehicle in front [DF=2; f=0.496; p<0.612]; collision with verge [DF=9; f=1.060; p<0.394]; head-on collision [DF=1; f=0.487; p<0.487]).
Braking Variables
Mean braking is determined through a numeric scale relating to the degree of pedal travel, indicating the pressure applied to this control.
No demographic variable exhibited a significant difference in mean braking values: sex (DF=1; f=0.545; p<0.469), age (DF=1; f=0.156; p<0.697), experience of driving (DF=2; f=0.228; p<0.798), exposure to driving (DF=1; f=0.258; p<0.546). Mean braking was found to vary significantly across conditions, increasing as speed limits increased (df=3; f=31.9623; p<0.001).
Lateral Position Variables
Position variables represent the distance between the ‘participant’s vehicle’ and the centre line of the ‘road’. Lateral position was not found to vary significantly across the demographic variables of sex (DF=1; f=1.663; p<0.211), age (DF=1; f=0.045; p<0.834), experience of driving (DF=2; f=0.369; p<0.696) or exposure to driving (DF=1; f=1.385; p<0.092). The mean position of the ‘vehicle’ was found to vary significantly across conditions (df=3; f=14.3198; p<0.001), with mean metres from the centre of the carriageway increasing as speed limits increased. Mean position was found to be significantly correlated with mean speed (r=0.63; p<0.001), speed variance (r=0.74; p<0.001), and acceleration variance (r=0.50; p<0.01). Position variance was tested across demographic variables and was also found to be nonsignificant across each: sex (DF=1; f=0.342; p<0.565), age (DF=1; f=1.495; p<0.235), experience of driving (DF=2; f=1.256; p<0.306), exposure to driving (DF=1; f=0.376; p<0.546), although position variance was found to vary significantly across conditions (DF=3; f=11.611; p<0.001).
Accident Variables
Head-on collisions were tested against demographic variables and were found to be nonsignificant by sex (DF=1; f=1.950; p<0.177), age (DF=1; f=1.852; p<0.188), experience (DF=2; f=0.580; p<0.569) and exposure (DF=1; f=0.002; p<0.969). Collisions with the vehicle in front were also found to be nonsignificant across sex (DF=1; f=0.461; p<0.504), age (DF=1; f=0.190; p<0.667), experience (DF=2; f=0.088; p<0.916) and exposure (DF=1; f=0.633; p<0.435). The mean frequency of both car-car collision types increased significantly with less restrictive limitation (head-on collision [DF=3; f=11.343; p<0.001], collision with the vehicle in front [DF=3; f=10.245; p<0.001]). Collisions with the verge were not significant across demographic variables: sex (DF=1; f=0.389; p<0.540), age (DF=1; f=2.113; p<0.161), experience (DF=2; f=1.006; p<0.383), exposure (DF=1; f=0.420; p<0.524), although this collision type was significant across condition (DF=3; f=4.365; p<0.005). Mean accident frequency overall did not show a significant difference across conditions (DF=3; f=3.328; p<0.996).
Driver Attitude Questionnaire (DAQ)
44% of the sample indicated that most accidents occurred to them in the 70mph condition, although the greatest overall accident frequency actually occurred in the 45mph condition. If only car-car collisions are considered, then the 120mph condition had the greatest total number of accidents, and 40% of participants indicated that they felt at increased risk in this condition. Participants indicated their awareness of differential acceleration, braking and positioning across the conditions, in line with the ADS data. The majority of participants (61%) reported paying increasing levels of attention as speed limits were increased, with participants also tending to rate the higher speed limit conditions as being more enjoyable (56%). Although participants acknowledged that speeding was one of the main causes of accidents (76%), they exhibited very relaxed attitudes towards speeding offenders, considering them to be unlucky to have been caught (87%), with most agreeing that it was acceptable to drive faster than a posted speed limit (73%), possibly reflecting the majority
opinion that speed limits were set too low (72%). Most participants agreed that they compensated for road conditions which they perceived to be safe by driving faster (81%).
Discussion
The decreased frequency of severe accidents in the more highly restricted conditions would agree with the ‘non-interactive’ engineering approach, although the lack of significance across conditions exhibited by the mean frequency of accidents overall, especially when considered against the significant shifts in driver performance across conditions, could be held to indicate some process of compensation. Drivers exhibited an increase in ‘risk-acceptance behaviours’ (Wagenaar and Reason, 1990) via a tendency to ‘floor’ the accelerator pedal in the more highly restricted conditions, to the point where mean speeds were at 95.64% of capability. This appeared to be linked to the frustration and boredom expressed by the subjects on the DAQ for these limitation conditions. Increases in ‘safety behaviours’ (Matthews et al., 1991) were only demonstrated in the less highly restricted conditions. Drivers’ subjective perception of risk appeared to be associated most closely with car-car collision events, rather than total accident frequency, and increased perception of risk was also associated with increased attention and enjoyment of the task, all of which rose with speed capability. In summary, the data suggest that speed limitation may prove an effective means through which to reduce the likelihood of severe accidents, although not all accident typologies will be affected equally. It should also be noted that the majority of participants showed evidence of risk taking behaviour in order to alleviate boredom and frustration arising from the more restrictive limitation conditions. Given that subjects also indicated they regarded ‘speeding’ as acceptable, since posted limits were set ‘too low’, it may be that any feasibility study examining the introduction of speed limitation technology should also consider the rationale underlying the limits established per carriageway, in order to counter the behaviours noted above.
References
Adams, J.G.U. 1985, Risk and Freedom—The Record of Road Safety Regulations (Bottesford Press, Nottingham, U.K.)
Department of Transport 1994, Vehicle speeds in Great Britain, 1993, Department of Transport Statistics Bulletin (94) 30, U.K.
European Conference of Ministers of Transport 1984, Costs and benefits of road safety measures (ECMT, Paris)
Evans, L. 1985, Human behaviour feedback and traffic safety, Human Factors, 27, 555–576.
Haigney, D.E. 1995, The reliability and validity of data, Roads, 17(2), 11–17.
Hoyes, T.W. 1990, Risk Homeostasis, Master of Science Thesis (Hull University, UK)
Matthews, G., Dorn, L. and Glendon, A.I. 1991, Personality correlates of driver stress, Personality and Individual Differences, 12, 535–549.
Nilsson, G. 1982, The effect of speed limits on traffic accidents in Sweden, VTI Report No. 68, S-58101, 1–10 (National Road and Traffic Research Institute, Linköping, Sweden)
Parker, D., Reason, J.T., Manstead, A.S.R. and Stradling, S.G. 1994, Driving Errors, Driving Violations and Accident Involvement (Driver Behaviour Research Unit, Department of Psychology, University of Manchester, U.K.)
Wagenaar, A.C. and Reason, J.T. 1990, Types and tokens in road accident causation, Ergonomics, 33(10–11), 1365–1375.
Wilde, G.J.S. 1982, The theory of risk homeostasis: implications for safety and health, Risk Analysis, 2, 209–225.
THE ERGONOMICS IMPLICATIONS OF CONVENTIONAL SALOON CAR CABINS ON POLICE DRIVERS
S.M.Lomas* and C.M.Haslegrave
Institute for Occupational Ergonomics, University of Nottingham, Nottingham, UK
* Now at Applied Vision Research Unit, University of Derby, Derby, UK
To minimise cost, police forces have to use vehicle fleets consisting of relatively conventional cars. However, the way in which police officers use these vehicles differs greatly from that of the general public, for whom the cars are primarily designed. Due to the nature of their job, officers often spend a considerable amount of time sat in the vehicle, and there is often a need to enter and exit quickly and easily. A further complication arises from several officers sharing one car, with each requiring a unique driving position in order to drive in a comfortable and safe environment. Problems also arise as a result of the equipment officers are required to carry and, increasingly, from the wearing of protective vests and body armour. This paper presents the results obtained from a study of police drivers and their interaction with the car cabin.
Introduction
People in general are spending more time in their cars as a result of several factors, such as increases in commuter traffic, business related journeys and traffic congestion. Thus the ergonomics of the car cabin becomes increasingly important to enable the driver to perform their work safely and efficiently with minimum effort and discomfort. In recent years the number of possible adjustments to the car seating and controls has significantly increased. The potential driving position adjustment of the car has been shown by Gyi and Porter (1995) to be a significant factor in the amount of reported discomfort. Cars which permit increased freedom of movement and improved postures by having adjustable features result in fewer people being absent from work. In many respects police officers are an extreme case for a vehicle ergonomist. Not only do they spend a considerable amount of their job sat in their police car, either driving or performing other activities, but they also tend to be significantly larger than the average driver. The nature of their job and the high average mileage increase the possibility that a police driver may already have suffered injury through a previous incident, and this may compound the problem.
The aim of this study was to investigate the problems of using conventional vehicles as police cars. In order to understand this it was necessary to study the anthropometric characteristics of the police force population, understand the various tasks that police cars perform and obtain the comments and opinions of police officers.
Methods
Initially, meetings were held with the fleet managers and representatives from a number of police forces to gain an understanding of how best to access the population and to obtain background information on the officers, the cars they used and the roles and tasks they perform. Following this, a questionnaire was developed to elicit information from individual officers. The questionnaire was distributed to police stations which had a mixture of Traffic, Section and Traffic Support Officers. This provided information from officers who had different duties (see Table 1), drove on different road types and drove different car types.
Table 1. Classification of Police Officer Duties
To provide more detail than would have been possible with a questionnaire alone, 25% of questionnaire respondents participated in a follow-up interview. The interview was used to obtain anthropometric data, determine the driving position and collect the officers’ opinions. As such, the interviews were carried out in the officers’ own police cars. The results from the questionnaire, interviews and case study were wide-ranging; therefore only some of the major findings are summarised in this paper.
Results
The questionnaire was mainly completed by Traffic Police, as may be seen from Table 2. This reflects the composition of officers at the large police stations targeted.
Table 2. Summary of Questionnaire Respondents
Stature
The interviewed police officers were found to be of considerably larger stature than the general population, as may be seen in Table 3.
Table 3. Stature Percentiles
Discomfort and Driving
46.4% of officers reported that they experienced discomfort when driving or sat in the driving seat of their police car. Table 4 shows where these problems were reported.
Table 4. Region of Officers’ Discomfort
Duration of Driving
Officers spend a considerable amount of time driving or sat in the police car. 75.9% of officers drive for at least 20 hours per week, whilst 81.3% drive/sit in the driver’s seat for at least 20 hours per week.
Adjustment of the Driving Position
54.4% of officers share a car with more than 5 other officers. Officers commented that adjustment needed to be quick and easy to use, durable and repeatable. A small number of officers pointed out that entry to a vehicle that had previously been used by a small driver was often difficult.
Head Restraints
Most of the cars in the study had head restraints which could be adjusted in height and tilt. Of the ones that were fixed, officers commented that they were positioned at an incorrect height and were uncomfortable. Officers did not like the lack of control over them. However, in the interviews the vast majority of officers did not adjust the head restraint or check its position.
Seat Belt Height Anchorage
The majority of the cars had manual adjustment of the seat belt anchorage height. One make of car automatically adjusted the location of the seat belt depending on the position of the person. Officers’ general comments regarding this type of adjustment revealed varied opinions. Some liked the fact that they were not required to remember to adjust it; others disliked the lack of control and the position that the seat belt adopted. The wearing of protective vests resulted in officers being unable to feel the position of the seat
belt. In some cases this was viewed as an advantage, as belts were reported as being uncomfortable. In other instances the body armour impeded exit from the car because officers were unable to sense that they were still wearing the seat belt.
Steering Wheel
The level of adjustment of the steering wheel was reported as unsatisfactory by 8% of drivers. Problems were reported relating to obscured controls and displays, impeded entry and exit, and difficulty in regaining a preferred position due to the lack of fixed detents.
Seat Height
25.9% of officers were dissatisfied with the seat height. An inappropriate seat height resulted in officers either hitting their head on the car roof or exerting an excessive amount of effort when exiting the car. A number of officers expressed significant irritation at this, as 43.2% enter their police car more than 15 times per day. A significant difference between civilian and police car derivatives is that the latter are not fitted with sunroofs. This results in police cars having notably more headroom, partially offsetting the relative increase in occupant size. Despite this, some officers still experienced difficulties with the amount of available head clearance. A particular problem identified was drivers hitting their heads on the roof whilst travelling over speed ramps at high speed.
Lumbar Adjustment
27.8% of respondents believed that inadequate lumbar support was linked to their discomfort. Of those that drove vehicles which had lumbar support adjustment, 30.4% were still dissatisfied. All of the lumbar adjustments in the cars surveyed adjusted in the horizontal axis only. During the interview a number of officers stated that some form of vertical adjustment would be needed to provide an adequate level of support. Additionally, several officers complained that wearing body armour rendered seat lumbar support ineffective.
Seat Back Inclination, Wear and Tear
The majority of officers (87.5%) were satisfied with the seat back inclination. However, a repeated comment related to concern over the durability of the vehicle interior and, in particular, the degradation of the support offered by the seat back due to the accelerated wear experienced by police cars as a result of the frequent entry, exit and adjustment.
Use of Equipment Belts
The most significant problem experienced by police officers in vehicles appears to be related to the wearing of equipment belts. Items carried on the belt, in particular the baton, frequently get caught in the seat, causing wear and tear, hindering exit and impairing the position of the seat belt across the hips. Removal of the belt whilst driving is not considered viable, as it would place unacceptable restrictions on the officer’s ability to perform their duty.
Use of Body Armour
Two types of protective vest were generally worn by a number of officers: an overt vest (worn over the shirt and jumper), which was standard issue by each force, and a covert vest (worn under the clothing), which was purchased by individual officers. Officers reported problems with the overt vest being too restrictive, riding up the neck and digging into the mid-back when seated. Additionally, officers reported difficulties encountered when entering and exiting the vehicle. In a case study to investigate the implications of wearing body armour on an individual officer, it was found that the overt vest significantly increased the chest depth (96th to 99th percentile) and abdominal depth (85th to 99th percentile) of the wearer.
Discussion and Conclusions
Driving for more than 20 hours per week whilst at work has been reported as resulting in increased sickness absence (Gyi and Porter, 1995). Additionally, several studies have indicated that driving is a causal factor in the prevalence of lower back pain (e.g. Porter et al, 1992). A report in Police Review (25th July, 1997) highlighted that Traffic officers are particularly vulnerable to back pain. The issue of police officers banging their heads on the roof of the vehicle when driving over speed ramps is a serious concern, not only in terms of the health effects on the officers but also in relation to the possible momentary loss of concentration and control. Due to the large number of drivers who use each car, adjustments are required which are quick and easy to use, repeatable and durable. The inclusion of fixed detents in the adjustment of the steering wheel, for example, would improve the repeatability of the driving position. Most officers prefer to have the ability to adjust elements of the driving position or safety restraint system. However, the interviews showed that in the case of head restraints and seat belts the driver very rarely made any adjustment. When used selectively, i.e. for the seat belt anchorage height, automatic adjustment systems could have significant benefits in terms of optimum positioning and speed of adjustment. It is important that the integrity of the interior is maintained throughout the vehicle’s life. In particular, the seat in a police car must be robust due to the frequency of entry and exit and the wear and tear that protective equipment and clothing place on it.
References
Gyi, D.E. & Porter, J.M. 1995, Musculoskeletal troubles and driving: a survey of the British public. In Robertson, S.A. (ed.) Contemporary Ergonomics 1995 (Taylor & Francis, London), 304–308.
Police Review, 1997, 25th July, Providing back support, Police Review.
Porter, J.M., Porter, C.S. & Lee, V.J.A. 1992, A survey of car driver discomfort. In Lovesey, E.J. (ed.) Contemporary Ergonomics (Taylor & Francis, London), 262–267.
THE DESIGN OF SEAT BELTS FOR TRACTORS
D H O’Neill* and B J Robinson**
*Silsoe Research Institute, Silsoe, Bedford
**Transport Research Laboratory, Crowthorne, Berks
The use of seat belts in tractor cabs is negligible. Although very few tractors are fitted with seat belts, there is some evidence that the use of seat belts in tractors fitted with cabs would save lives. The requirements of seat belts for on-road and off-road use are reviewed, and the apparently conflicting needs, which may exert a strong influence on their acceptability and use, are outlined. Farmer attitudes towards the use of seat belts are discussed and original experimental data from a study of seat belt use, addressing farmers’ comments, are presented.
Introduction
The legislation governing seat belt use has not, so far, demanded the use of seat belts on tractors. Nevertheless, some tractors, usually the more expensive models, are sold with seat belts fitted and there are (International) Standards covering their design, particularly concerning anchorage points1. The main purpose of wearing a seat belt is to prevent the driver from being ejected from the seat. This applies specifically to tractors with ROPS (Roll-Over Protective Structures), usually in the form of cabs, rather than to “open” tractors, since being thrown clear is one effective way of avoiding injury when a cabless tractor overturns. However, overturning is more common off-road than on-road. A recent study of on-road accidents involving agricultural vehicles (Robinson, 1994) concluded that seat belts could save 15 to 20 fatalities and serious injuries per year. However, to be effective on-road, where collisions with other (large) vehicles are the most serious threat to safety, a 3-point (ie diagonal) design would be preferred. This presents a conflict with off-road use, where tractor drivers consider such a design to be too restricting and a 2-point (ie lap belt) design would be preferred.
1 ISO 3776 (1994). Tractors for agriculture—seat belt anchorages.
The subject of this paper is a research project, commissioned by the Department of Transport2 and undertaken by the Transport Research Laboratory and Silsoe Research Institute, to investigate the use of seat belts on-road, with a view to reducing on-road fatalities.
Attitudes of suppliers and drivers
The corporate opinion of the suppliers was that “we fulfil our responsibility by providing anchor points and stock seat belts which are provided as an option”. Except for JCB “Fastracs”, these are invariably lap belts. The difficulty of fitting a diagonal belt to a suspended tractor seat was frequently raised: this would entail all three anchorage points moving with the seat. Two small samples of farmers, from areas of contrasting farming systems, were interviewed. Collectively, the farmers had little experience of wearing seat belts on tractors but, because the owners of JCB Fastracs were specifically targeted, two farmers from Cheshire and three farmers from Scotland (around Perth and Inverness) had had the opportunity to use seat belts. Their attitudes are summarised in Table 1.
Table 1. Tractor drivers’ attitudes to on-road wearing of seat-belts
The solitary driver who does wear a belt drives a FASTRAC and stated that, to do so, he would have to be on a relatively long (roughly an hour or more) and high-speed (over 40 mph) journey. Two criticisms were made by a majority of the farmers (who had little or no experience of seat belt use on tractors):
• Seat belts would be too restrictive on body movement;
• It would be a nuisance to keep unfastening and fastening when they dismount and mount so frequently. This is regarded as a source of unnecessary delay.
Half the farmers said they spent little time on the road and that the combined exposure and risk of an injury was too small to justify wearing a lap belt. However, half said they would wear them off-road on “steep” slopes. Some of the farmers, particularly those with a generally positive attitude towards wearing seat belts, felt that a lap belt could be more hazardous than no belt, as injuries suffered by use of the belt could be worse than injuries without it. A minority of farmers commented that seat belts:
• would get dirty, would get in the way and so become dangerous;
• were acceptable, although their use should not be made compulsory (it should be a matter of individual judgement).
2 now Department of the Environment, Transport and the Regions
Some comments were made on how driver behaviour might change, causing tractors to be driven more dangerously, if the wearing of a seat belt offered the driver a “false sense of security”. This risk compensation argument has been used against many safety improvements and has usually been found to be false, but it remains conjectural and does emphasise the importance of drivers’ perceptions of risk, danger and injury. In this respect training could play a major role. In fact, a number of farmers suggested that public money would be spent more effectively on training drivers to drive safely than on addressing the consequences of dangerous driving.
Design and fitting of seat belts
The particular difficulty with fitting and using seat belts in agricultural vehicles is the low-frequency, high-amplitude movement of the seat. It is almost inevitable that, unless seat belts are designed with this in mind to ensure that they are comfortable and do not unduly hinder the driver’s operations, wearing rates will remain very low. From the design point of view, fitting seat belts, especially diagonal ones, to suspended seats is a major problem because of the additional strength requirement for the seat and/or floor. Changes in seat design to accommodate a shoulder-level anchorage point are unlikely to find favour with drivers, as a high seat back would hinder many off-road operations. It is apparent that drivers remove head restraints, when they are fitted, because of the need for rearward monitoring during many off-road activities. To avoid the need for a high seat back, a shoulder-level anchorage point independent of the seat could be fitted, but this would be difficult to implement because of the problem of locating a suitable anchorage point (which would not be able to move with the seat). From the driver’s point of view, wearing a diagonal belt would be too restrictive on body movement off-road, although our findings suggested that, if an accommodating design could be specified, farmers (drivers) would be more amenable to using seat belts. However, the majority felt that their use should not be mandatory. There are two main variants of lap belts: those with loose straps which tend to fasten centrally, and those with one retractable strap which fastens into a fixed mechanism. These factors, together with the design of the buckle or fastening mechanism, can influence user acceptability. The retractable type may or may not have emergency locking (eg an inertia reel), which allows the length of the belt to change as the driver makes small movements but prevents large or sudden movements.
Laboratory tests
To provide information on what real practical difficulties might be involved, particularly regarding nuisance and delays, a short series of laboratory tests was carried out. The tests made use of Silsoe Research Institute’s three-axis vibration rig. This is effectively an immobile tractor, mounted on three hydraulic rams. These rams are controlled by a computer in such a way that the vibrations produced are a realistic simulation of those experienced on a real tractor. Test subjects can sit on the rig and perform various tasks which are also representative of real conditions. For the purposes of this project, four subjects of varying shapes and sizes were asked to complete various tasks. Each trial simulated on-road driving and lasted approximately 25 minutes, during which each subject had to get off and on the rig five times. Between these
events, other simulated driving tasks, typical of road use, were undertaken. These tasks included both stretching forward and looking to the rear of the tractor. Each trial was recorded on video tape which facilitated subsequent work-study analysis. The subjects completed a questionnaire after each trial to give their own comments on the use of seat belts. Observing the behaviour of the subjects also provided some insight into the nuisance factors. The four subjects were all experienced tractor drivers and ranged from 55kg to 106kg in weight, from 1.65m to 1.80m in height, and 330mm to 455mm in hip breadth. Details are given in Table 2; subject 3 was female. All testing was carried out with a KAB U2 seat with XH2 suspension unit. Four seat belt conditions were used by each subject:
i) no belt fitted
ii) standard (static) lap belt fitted
iii) retractable lap belt fitted
iv) retractable lap belt and arm-rests fitted to seat.
Table 2. Relevant anthropometric data for the subjects
Laboratory findings
Objective data were obtained by studying the video recordings to determine the times needed for unfastening and fastening the seat belts (not for condition i) and the times taken to dismount and remount the rig between successive spells of simulated driving (see Figure 1). These have been interpreted to give the following objective information.
• Seat belts increased the time taken to get off and on the tractor (3-axis vibration rig) by roughly 7.5 s (p<0.01).
• The time required to manipulate (fasten and unfasten) the seat belts (all conditions combined) was approximately 9 s.
Figure 1. Dismounting and remounting times
• The retractable belts were unfastened more quickly than the standard belts (p<0.01), but there were no significant differences in fastening times.
From the questionnaires the following subjective information was obtained.
• The retractable belts were preferred by 3 of the 4 subjects. No subject preferred the standard belt.
• The subject who did not express a preference was the quickest at manipulating the standard belt, but differences between subjects for manipulating belts were not significant.
The researchers’ observations are summarised below.
• The quicker unfastening of the retractable belts is attributed to the one part being fixed and the other not needing to be placed anywhere.
• The slower fastening of the retractable belts is attributed to occasional seizures of the inertia reel.
• The subjects’ overall preference for the retractable belt is attributed to their considering the comfort during use to be more important than occasional difficulties in fastening.
• The effect of the arm rests is unexpected, in that the belts (retractable only) were manipulated more quickly in this condition. This could be due to the fact that the arm-rest trial was always conducted last for each subject and, hence, the practice gained and the increased understanding of the characteristics of the inertia reel enabled the subjects to use the retractable belt more quickly.
Conclusions
Retractable lap belts are the most acceptable form of tractor occupant restraint system. Test subjects found them to be comfortable, not to hinder them in a variety of simulated driving tasks, and to add less than 8 seconds to a typical dismount-remount operation. The small selection of farmers questioned were not averse to wearing seat belts when they judged them to be appropriate, but strongly disagreed with their compulsory use. Driver education and persuasion might be the most effective methods to maximise wearing rates.
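To put the time penalty in perspective, the sub-8-second figure from the conclusions can be scaled by an assumed number of mount/dismount cycles in a working day; the cycle counts in the Python sketch below are assumptions for illustration, not observations from the study.

    # Back-of-envelope scaling of the seat-belt time penalty: roughly 7.5 s
    # extra per dismount-remount cycle, multiplied by assumed daily cycle counts.
    EXTRA_SECONDS_PER_CYCLE = 7.5

    for cycles_per_day in (10, 20, 40):        # assumed mount/dismount frequencies
        extra_minutes = cycles_per_day * EXTRA_SECONDS_PER_CYCLE / 60
        print(f"{cycles_per_day} cycles/day -> about {extra_minutes:.1f} min extra")

Even at 40 cycles per day the added time is around five minutes, which bears on the farmers' perception of fastening and unfastening as a source of unnecessary delay.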
Acknowledgement
This research was funded by the UK Department of Transport, 1995–96.
Reference
Robinson, B.J. (1994). Fatal accidents involving “other motor vehicles”, 1991–92. TRL Project Report PR/VE/100/94. Crowthorne: Transport Research Laboratory. (Unpublished report available on direct personal application only).
NOISE AND VIBRATION
AUDITORY DISTRACTION IN THE WORKPLACE: A REVIEW OF THE IMPLICATIONS FROM LABORATORY STUDIES
Simon Banbury1 and Dylan Jones2
1 Air Human Factors, Defence Evaluation and Research Agency, Farnborough, UK.
2 School of Psychology, University of Wales, Cardiff.
In the decades between 1950 and 1980, research on the behavioural effects of noise emphasised the role of the intensity of the sound, searching for a threshold at which white noise would impair performance. Latterly, emphasis has shifted to the distracting effects of sound, particularly the effects of speech on cognitive processing, and these experiments have produced distinctly different conclusions. Findings in this area may have important implications for an understanding of efficiency at the workplace. The support for three key claims is reviewed: (a) that the degree of distraction is unrelated to the intensity of the sound; (b) that speech and non-speech sounds are functionally equivalent in their effect; and (c) that the sound must be segmented into a sequence of events that are different in timbre or pitch for it to be disruptive. The implications of these results for practical settings are discussed.
Introduction
The effects of background sounds on task performance are of great relevance to the study of efficiency at the place of work, whether the work is undertaken in the office or on the flight deck of an aircraft. Whilst the effects of loud white noise on cognitive processing have been generally very inconsistent, the deleterious effects of extraneous speech on cognitive processing are more consistent and, arguably, more relevant to the modern workplace. A number of laboratory studies have shown large, consistent and replicable disruption of performance, effects which appear to have relevance for the workplace. The first report of what became known as the “irrelevant speech effect” was by Colle and Welsh (1976). They established the classic paradigm that has been used with minor variation since: typically, experimental subjects are asked to retain, and report in correct serial order, a visually-presented sequence of verbal items (usually consonants). After seeing the list the subjects are asked to rehearse the items for a short interval and then to report them in serial order when prompted. For some of the lists, background sound is played, but in these cases the subjects are expressly asked to ignore any sound they hear.
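The core of this paradigm is the strict serial-position scoring of a short visually-presented list; the Python sketch below illustrates that scoring rule. The list length and the simulated response are illustrative choices, not parameters taken from Colle and Welsh (1976).

    # Schematic of serial recall scoring: a consonant list is recalled in order
    # and an item counts as correct only if reported in its original position.
    import random

    CONSONANTS = list("BCDFGHJKLMNPQRSTVWXZ")

    def make_list(length=8):
        """A to-be-remembered sequence of consonants, as in the classic paradigm."""
        return random.sample(CONSONANTS, length)

    def serial_score(presented, recalled):
        """Proportion of items recalled in their correct serial position."""
        return sum(p == r for p, r in zip(presented, recalled)) / len(presented)

    presented = make_list()
    recalled = presented[:5] + presented[6:] + [presented[5]]   # sixth item reported last
    print(serial_score(presented, recalled))   # 5 of 8 positions correct -> 0.625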
Early studies established key features of this disruption by irrelevant speech, among them that the degree of disruption was not dependent on the meaning of the sound (a conclusion supported in much subsequent work: Salamé & Baddeley, 1982; Jones, Miles & Page, 1990; Banbury and Berry, 1997), nor on its intensity (at least within a limited range). However, recent research has found that the effect is not confined to speech, since the effect can be found with tones (Jones & Macken, 1993) or pitch glides (Jones, Macken & Murray, 1993). Thus, it is now referred to as the irrelevant sound effect. These laboratory findings have a number of important implications for an understanding of efficiency at the workplace. Thus, this paper reviews the following key features of the “irrelevant sound effect”: ➀ that the degree of distraction is unrelated to the intensity of the sound; ➁ that speech and non-speech sounds are functionally equivalent in their effect; and ➂ that the sound must be segmented into a sequence of events that are different in timbre or pitch for it to be disruptive.
Intensity and meaning have no effect
The effects of white noise on cognitive processing have been generally very inconsistent, showing improvements in performance, reductions in performance or no effects at all (see Jones, 1990 for a review). However, the occurrence of white noise is relatively rare in most workplaces, so this review will concentrate on the effects of more common background sounds, such as speech (e.g. from co-workers) and tones (e.g. from alarms and auditory feedback in the operation of equipment). Indeed, there is little evidence to suggest that the effects of white noise are similar to those of speech. Most notably, the effect of irrelevant speech on serial recall is independent of intensity; the disruption is roughly the same whether the sound level is equivalent to a whisper [48dB(A)] or a shout [76dB(A)]. This is true whether the level is varied between trials or within trials (Tremblay and Jones, 1998). That the meaning of sound is an important determinant of disruption, although intuitively plausible, has received little experimental support. A number of studies, using serial recall tasks, have shown that speech in a language a person does not understand leads to disruption (Colle and Welsh, 1976; Colle, 1980; Banbury and Berry, in press), and that the effect is roughly the same as for narrative English for English speakers (Jones, Miles and Page, 1990). However, a recent study by LeCompte, Neely and Wilson (in press) contests the assumption that the disruption observed on serial recall tasks is independent of meaning. They found some tentative evidence that meaningful speech disrupts serial recall more than meaningless speech. However, their results merely indicated a trend in their data, and failed to reach conventional significance levels. Studies on the effect of meaning on more complex cognitive tasks have been inconsistent. Reading tasks, such as text comprehension, are susceptible to the meaning of background speech (Martin, Wogalter and Forlano, 1988; Jones, Miles and Page, 1990). However, Banbury and Berry (1997, in press) have shown that the disruption of mental arithmetic and memory for prose tasks by background speech was not mediated by its meaning. Nevertheless, for seriation-based tasks (i.e. those that require the order of information to be maintained correctly) the weight of evidence suggests that the disruption is independent of intensity and independent of meaning.
Speech and non-speech are equipotent
Salamé and Baddeley (1982) suggest a model to account for the disruption by background sounds. They assume that in the process of reading, material in written form is transformed into phonological code, a code which is based on the sound of the material rather than its appearance or meaning. This set of codes conflicts in memory with phonological codes resulting from the privileged access of speech to phonological memory. They suggest a mechanism in which the degree of disruption is proportional to the phonological similarity of items from the two sources. Thus only speech, they argue, can show an irrelevant speech effect. Evidence for this assertion is based on a serial recall task in which sequences of single-digit numbers were used as the to-be-remembered materials. Irrelevant auditory materials with different degrees of similarity to the to-be-remembered sequences were compared: semantically-similar words (integers) and phonologically-similar words (tun—one, gnu—two, tee—three, sore—four, etc.) gave marked but comparable degrees of disruption, while the effect of phonologically-dissimilar words (tennis, jelly, tipple, etc.) was significantly less, but still appreciably worse than a quiet control (Salamé and Baddeley 1982, Experiment 5). The closer the resemblance of heard and rehearsed materials, it was argued, the greater was the degree of disruption. However, their model based on the phonological similarity between the two streams remains controversial. Effects may be found with non-verbal memory and also with non-verbal irrelevant sounds. Studies by Jones, Farrand, Stuart and Morris (1995) using spatial memory tasks with no verbal component, and Jones and Macken (1993) using random tones, suggest that some factor other than phonological confusion is responsible for the disruptive effects observed. Furthermore, a study by Jones, Macken and Murray (1993) found that random pitch glides interspersed with short periods of silence could also produce disruptive effects similar to those of speech. Recent research by LeCompte, Neely and Wilson (in press), however, contests Jones and Macken’s finding that the disruption from tones and syllables (i.e. speech and non-speech sounds) is equipotent. An exact replication of their paradigm, albeit with twice the number of participants, found that speech disrupted serial recall to a greater degree than did tones. However, Jones and Macken’s (1993) “equipotentiality hypothesis” has been supported by Banbury and Berry (in press, 1997), who generally found that speech and the noise from office equipment (i.e. consisting of telephone, printer and fax noise, but no speech) produced similar levels of disruption to memory for prose and mental arithmetic tasks. Overall, these results suggest that tones and speech are equipotent, which confounds any account based on the similarity of the visual and auditory material. Instead, some simple analysis of the auditory stream, insensitive to the gross acoustical differences between speech and steady-state tones, serves as the basis for disruption. This simple analysis underpins the “Changing State hypothesis” put forward by Jones, Madden and Miles (1992) to account for these results.
“Auditory changing state” is a necessary condition
The Changing State hypothesis originally formulated by Jones, Madden and Miles (1992) argues that the “irrelevant” sound stream has to show an appreciable acoustic variation (in all
but intensity) from one segmented entity to the next, rather than assuming that the sound has to be “speech-like” before it disrupts serial recall. A necessary precursor to changing state must be some means of segmenting this sometimes physically continuous signal into its component units. If the onset of a sound is masked, such as in a speech babble, the disruptive effect is rather small. Sounds that do not contain sharp transitions in energy, such as continuous pitch-glides, therefore show reduced levels of disruption. Once a sound is segmented, the important determinant of disruption is the degree of stimulus mismatch between successive stimuli. However, this effect is non-monotonic. As the difference in acoustic properties of successive items in the series increases (such as within a series from a single voice or a single instrument) there is initially an increase in disruption; however, beyond the point at which the differences become so great that each member of the sequence constitutes a distinct perceptual object (such as a different voice, or a different instrument) the degree of disruption begins to diminish. That is, physical change increases disruption as long as the identity of the ‘object’ producing the sound remains the same throughout the series (see Jones, Alford, Bridges, Tremblay & Macken, 1997 for a discussion).
Implications for the workplace
The practical implications of the deleterious effects of extraneous sounds are fairly clear. Extraneous sound is increasingly common in a range of work environments, such as open-plan offices, aircraft cockpits and various kinds of command and control centres. If irrelevant speech impairs performance on tasks involving primary memory, then job performance may be affected adversely. A small number of studies have attempted to research the effects of background noise in open-plan office environments (e.g. Banbury and Berry, 1997; Banbury and Berry, in press). Consistent with many observational studies conducted after the introduction of open-plan offices in the 1960s, these results highlight people’s susceptibility to disruption from extraneous background noise, even when they are not attending to the noise. Banbury and Berry’s (1997) finding that non-speech sounds, such as telephones and printers, can cause as much disruption as irrelevant speech sounds is of particular interest. This has clear implications for office work, particularly with the increasing trend by corporations to move toward open-plan offices. Not only is the performance of office workers likely to be affected by the conversations of their co-workers but also by the office equipment they have to use. Office planners need to find ways of reducing subjective noise levels, for example by providing adequate partitioning and sound-insulating materials. Alternatively, they could consider introducing a continuous noise that serves to mask not only the background speech, but also the equipment noise, so that these sounds become less distinct from one another (Jones and Macken, 1995). Other settings, such as control centres and aircraft cockpits, have received less scrutiny in this respect1. Considering that much of the complex, error-critical decision making in the military and civil environments is undertaken in noisy aircraft cockpits and control centres, it is surprising that relatively little research to date has been conducted in these domains. Experiments currently in progress at DERA (Farnborough) are investigating the effects of extraneous sounds in aircraft cockpits and airborne control centres (SB).
1 Nevertheless, it is clear that any reduction in disruption by background sounds is only possible through a better understanding of why background sounds are disruptive, rather than reducing the intensity of the sounds per se.
References
Banbury, S., & Berry, D.C. (in press). Disruption of office related tasks by speech and office noise. British Journal of Psychology.
Banbury, S., & Berry, D.C. (1997). Habituation and dishabituation to speech and office noise. Journal of Experimental Psychology: Applied, 3, 1–16.
Colle, H.A. (1980). Auditory encoding in visual short-term recall: Effects of noise intensity and spatial location. Journal of Verbal Learning and Verbal Behavior, 19, 722–735.
Colle, H.A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal Learning and Verbal Behavior, 15, 17–31.
Jones, D.M. (1995). The fate of the unattended stimulus: Irrelevant speech and cognition. Applied Cognitive Psychology, 9, 23–38.
Jones, D.M. (1993). Objects, streams and threads of auditory attention. In A.D.Baddeley & L.Weiskrantz (Eds.), Attention: Selection, awareness and control. Oxford: Clarendon Press.
Jones, D.M. (1990). Recent advances in the study of human performance in noise. Environment International, 16, 447–458.
Jones, D.M., Alford, D., Bridges, A., Tremblay, S., & Macken, W.J. (1997). Organisational factors in selective attention: The interplay of acoustic distinctiveness and auditory streaming in the irrelevant sound effect. (Manuscript submitted for publication).
Jones, D.M., Farrand, P., Stuart, G., & Morris, N. (1995). The functional equivalence of verbal and spatial information in serial short-term memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 1008–1018.
Jones, D.M., & Macken, W.J. (1993). Irrelevant tones produce an irrelevant speech effect: Implications for phonological coding in working memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 369–381.
Jones, D.M., & Macken, W.J. (1995). Auditory babble and cognitive efficiency: Role of number of voices and their location. Journal of Experimental Psychology: Applied, 1, 216–226.
Jones, D.M., Macken, W.J., & Murray, A.C. (1993). Disruption of visual short-term memory by changing-state auditory stimuli: The role of segmentation. Memory and Cognition, 21, 318–328.
Jones, D.M., Madden, C., & Miles, C. (1992). Privileged access by irrelevant speech to short-term memory: The role of changing state. Quarterly Journal of Experimental Psychology, 44A, 645–669.
Jones, D.M., Miles, C., & Page, J. (1990). Disruption of reading by irrelevant speech: Effects of attention, arousal or memory? Applied Cognitive Psychology, 4, 645–669.
LeCompte, D.C., Neely, C.B., & Wilson, J.R. (in press). Irrelevant speech and irrelevant tones. Journal of Experimental Psychology: Learning, Memory and Cognition.
Martin, R.C., Wogalter, M.S., & Forlano, J.G. (1988). Reading comprehension in the presence of unattended speech and music. Journal of Memory and Language, 27, 382–398.
Salamé, P., & Baddeley, A.D. (1982). Disruption of short-term memory by unattended speech: Implications for the structure of working memory. Journal of Verbal Learning and Verbal Behavior, 21, 150–164.
Tremblay, S., & Jones, D.M. (1998). The role of habituation in the irrelevant sound effect: Evidence from the effects of token set size and rate of transition. Journal of Experimental Psychology: Learning, Memory and Cognition (in press).
TRANSMISSION OF SHEAR VIBRATION THROUGH GLOVES Gurmail S.Paddan and Michael J.Griffin
Human Factors Research Unit Institute of Sound and Vibration Research University of Southampton Southampton SO17 1BJ England
International Standard ISO 10819 (1996) proposes a test method for the measurement of the vibration transmission of gloves but does not address the testing of gloves during exposure to vibration in a shear axis. Experiments have been conducted to measure the frequency transmissibility of shear axis vibration through a selection of gloves to the palm of the hand. Ten gloves, some ‘antivibration gloves’, were used in the investigation. The subjects (8 males) pushed on a horizontal handle with a force of 20 Newtons; no grip force was applied. The frequency-weighted magnitude of random horizontal vibration on the handle was 5 ms–2 r.m.s.; vibration was measured at the vibrating handle and at the palm of the hand using a palm-glove adaptor. Transmissibilities between the handle and the palm adaptor are presented for all gloves and all subjects. It was found that a few gloves gave resonances at about 200Hz and attenuated vibration at frequencies above about 300Hz. Other gloves had resonances at higher frequencies and some gloves amplified the shear axis vibration at all frequencies up to about 1000 Hz.
Introduction
The hands of operators are exposed to multi-axis vibration when using vibrating tools. The directions of the vibration include the three translational axes: fore-and-aft, lateral and vertical, as specified in British Standard BS 6842 (1987). Some tools expose the hands of operators to vibration predominantly in an axis parallel to the surface of the hand: vibration in a shear axis. Examples include a percussive chipping hammer when holding the chisel. International Standard ISO 10819 (1996), which might be used to determine the vibration transmissibility of a glove, considers the transmission of vibration through the palm area of the glove to the hand in a direction perpendicular to part of the palm. With the glove worn on the hand, the horizontally vibrating handle is held such that acceleration occurs along the forearm, that is, in the z-axis of the hand. The measures obtained according to ISO 10819 are used to determine whether a glove can be considered to be an ‘antivibration glove’. However, International Standard ISO 10819 (1996) does not address the testing of gloves during exposure to vibration in a shear axis. This paper presents data on the transmission of vibration in the shear axis as a function of frequency for a selection of gloves. The transmission of shear axis vibration through gloves
has received little previous attention, even though many vibrating tools expose the hands of workers to high levels of shear vibration. The variability in the transmission of shear vibration between subjects and between gloves has been determined. The data presented in this paper are taken from a larger study which also investigated the effect of push force on the transmission of shear vibration.
Equipment and Procedure
The experiment was conducted using an electrodynamic vibrator, Derritron type VP30, powered by a 1500 watt amplifier. A basic handle comprising a steel bar of diameter 32mm and length 102mm was attached to the vibrator such that the grip of the hand would be horizontal and in line with the axis of vibration. The first resonance of the handle occurred at approximately 1340Hz. Acceleration was measured at two locations: on the vibrating handle, and between the palm of the hand and the glove using a palm adaptor of mass 9.21 grams (ISO 10819 states a maximum mass of 15 grams). The accelerometers were of piezoelectric type (Brüel and Kjær type 4374), each with a mass of 0.65 gram. The acceleration signals from the two locations were passed through charge amplifiers (Brüel and Kjær type 2635) and then acquired into a computer-based data acquisition and analysis system. The subjects stood on a horizontal surface and applied a downward force with the right hand on to the laterally vibrating handle. The subjects held their forearms horizontal at an angle of 90° to the axis of vibration. The gloved hand was placed on the handle such that the metacarpal bones were horizontal and at right angles to the axis of vibration. The elbow formed an angle of approximately 180° between the forearm and the upper arm. There was no contact between the elbow and the body during the measurements. A downward push force of approximately 20N was applied during the measurements; no grip force was applied. A copy of the written instructions given to subjects is shown in the Appendix. Ten commercially available gloves were tested (see Paddan, 1996, for details of the gloves). In accord with ISO 10819 (1996), the gloves were worn by the subjects for at least 3 minutes prior to the vibration measurements. The room temperature during the tests fluctuated between 22°C and 25°C (the standard specifies a temperature range of 20±5°C) and the relative humidity varied between 37% and 51% (the standard specifies that the relative humidity shall be below 70%). The experiment was approved by the Human Experimentation Safety and Ethics Committee of the Institute of Sound and Vibration Research. Eight right-handed male subjects participated in the inter-subject variability study (mean age 28.75 years; mean weight 71.25kg; mean height 1.78m). Each subject was exposed to the vibration eleven times: once with the ungloved hand and once with each of the ten gloves. A commercial data acquisition and analysis system, HVLab, developed at the Institute of Sound and Vibration Research of the University of Southampton, was used to conduct the experiment and analyse the acquired data. A computer-generated Gaussian random waveform having a nominally flat acceleration spectrum was used with a frequency-weighted acceleration magnitude of 5.0 ms–2 r.m.s. at the handle. The frequency weighting used was Wh as defined in British Standard BS 6842 (1987). The frequency range of the input vibration was 6 Hz to 1800 Hz. The waveform was sampled at 6097 samples per second and low-pass filtered at 1800 Hz before being fed to the vibrator. Acceleration signals from the handle and
the palm adaptor were passed through signal conditioning amplifiers and then low-pass filtered at 1800 Hz via anti-aliasing filters with an elliptic characteristic; the attenuation rate was 70 dB/octave in the first octave. The signals were digitised into a computer at a sample rate of 6097 samples per second. The duration of each vibration exposure was 5 seconds.
Analysis
Transfer functions were calculated between acceleration on the handle (i.e. the input) and acceleration measured at the palm-glove interface adaptor (i.e. the output). The ‘cross-spectral density function method’ was used. The transfer function, Hio(f), was determined as the ratio of the cross-spectral density of input and output accelerations, Gio(f), to the power spectral density of the input acceleration, Gii(f): Hio(f)=Gio(f)/Gii(f). Frequency analysis was carried out with a resolution of 5.95Hz and 124 degrees of freedom.
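As a rough illustration of this cross-spectral density method, the short Python sketch below estimates Hio(f) from digitised handle and palm-adaptor records; NumPy and SciPy, and the variable names, are assumptions made for this example rather than the HVLab software actually used in the study.

import numpy as np
from scipy import signal

def transmissibility(handle, palm, fs=6097.0, resolution=5.95):
    """Estimate H_io(f) = G_io(f) / G_ii(f) between handle acceleration
    (input) and palm-adaptor acceleration (output)."""
    nperseg = int(round(fs / resolution))      # segment length sets the frequency resolution
    f, Gii = signal.welch(handle, fs=fs, nperseg=nperseg)      # power spectral density of the input
    _, Gio = signal.csd(handle, palm, fs=fs, nperseg=nperseg)  # cross-spectral density of input and output
    H = Gio / Gii                              # complex transfer function
    return f, np.abs(H)                        # magnitude, i.e. the transmissibility

# Hypothetical usage with the 5 s records sampled at 6097 samples per second:
# f, T = transmissibility(handle_accel, palm_accel)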
Results and Discussion
Figure 1 shows individual shear transmissibilities between the handle and the adaptor for the 8 subjects pushing on the handle with a force of 20N. The palm adaptor was inserted between the palm of the hand and the glove. The transmissibilities show different dynamic characteristics for the 10 gloves. However, there appear to be two main categories for the data shown: gloves which demonstrate a low-frequency peak in transmissibility (i.e. below about 300Hz) and gloves which show a peak in transmissibility over the frequency range 300Hz to 700Hz. Gloves which show a low-frequency peak in transmissibility are gloves 2 and 10; these show significant attenuation of vibration for frequencies above about 400Hz. All the other gloves could be put into the second category. (Gloves 2 and 10, together with glove 6, showed low-frequency peaks in transmissibility when tested using vibration perpendicular to the palm, see Paddan, 1996.) It is interesting to note that glove 5 showed almost no attenuation in vibration at the adaptor compared to the vibration on the handle. Indeed, the glove amplified the shear vibration from the handle to the adaptor by more than 50% at the peak transmissibility between 600Hz and 1000Hz. Variation in glove transmissibility in the shear axis between individuals is apparent in Figure 1. An example of the variation is seen for glove 3 where one subject showed a transmissibility of 0.18 at 1200Hz whereas another subject showed a transmissibility of 0.96: the maximum response being over 5 times the minimum transmissibility. Variation of a similar order has been seen for glove transmissibilities perpendicular to the surface of the palm (Paddan and Griffin, 1997). Median transmissibilities for the ten gloves were calculated and are presented in Figure 2. A large variation in transmissibility is apparent between gloves. At a frequency of 875Hz, glove 5 (highest curve) shows a median transmissibility that is over 19 times greater than the median transmissibility for glove 2 (lowest curve). In view of the high levels of shear vibration to which the operators of vibrating tools are exposed (see for example Nelson, 1997), the revision of ISO 10819 (1996) should consider the additional measurement of the transmission of vibration in shear axes. A factor that is of importance in the transmission of shear vibration through a glove is the effect of grip force; some vibrating tools require operators to hold tools with high grip forces. Other work has shown that an increase in force results in increased transmission of vibration through the glove to the palm of the hand.
Figure 1. Glove shear transmissibilities for 8 subjects pushing with a force of 20N (5.95Hz frequency resolution, 124 degrees of freedom)
Figure 2. Median shear vibration transmissibilities for 10 gloves
The effect of the transmission of shear vibration on the hands has received little attention. Since the vibration transmitted to the hand in the shear direction can be as great as the vibration occurring perpendicular to the palm, this is a topic which requires further consideration.
Conclusions
A large variation in shear transmissibilities has been found for the 10 gloves tested. Some gloves transmitted only low-frequency vibration (below about 300Hz) while other gloves also transmitted frequencies well above about 300Hz. Some gloves offer no attenuation of shear axis vibration at any frequency below 1000Hz: they appear to amplify shear axis vibration at all relevant frequencies.
References
British Standards Institution 1987, British Standard Guide to Measurement and evaluation of human exposure to vibration transmitted to the hand, BS 6842. London: BSI
International Standards Organization 1996, Mechanical vibration and shock—Hand-arm vibration—Method for the measurement and evaluation of the vibration transmissibility of gloves at the palm of the hand, ISO 10819 (1996)
Nelson, C.M. 1997, Hand-transmitted vibration assessment—a comparison of results using single axis and triaxial methods. United Kingdom Group Meeting on Human Response to Vibration, ISVR, University of Southampton, 17–19 September 1997
Paddan, G.S. 1996, Effect of grip force and arm posture on the transmission of vibration through gloves. United Kingdom Informal Group Meeting on Human Response to Vibration, MIRA, Nuneaton, 18–20 September 1996
Paddan, G.S. and Griffin, M.J. 1997, Individual variability in the transmission of vibration through gloves. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor and Francis, London), 320–325
Acknowledgements
This work has been carried out with the support of the United Kingdom Health and Safety Executive.
Appendix
Following are the instructions that were given to the subjects taking part in the experiments on the transmission of shear vibration through gloves.
INSTRUCTIONS TO SUBJECTS
SHEAR VIBRATION: EFFECT OF PUSH FORCE
The aim of this experiment is to measure the effect of push force on the transmission of shear vibration through gloves to the palm of the hand. Please stand and rest your right hand on the handle in front such that the forearm is horizontal. Ensure that your right arm (upper and lower) is not in contact with your body. Throughout each vibration exposure, you are required to apply a downward push force on the handle. You must ensure that the adaptor is inserted between the glove and the palm of the hand, and positioned such that the accelerometer in the adaptor is in line with the direction of vibration. Just prior to the start of each run, which the experimenter will indicate, you are to place your right hand on the handle and apply the required push force. This position is to be maintained throughout the vibration exposure. You are free to terminate the experiment at any time. Thank you for taking part in this experiment.
THE EFFECT OF WRIST POSTURE ON ATTENUATION OF VIBRATION IN THE HAND-ARM SYSTEM Tycho K.Fredericks1 and Jeffrey E.Fernandez2
Department of Industrial and Manufacturing Engineering Human Performance Institute, Western Michigan University Kalamazoo, MI 49002–5061 USA
1
2Deparment of Industrial and Manufacturing Engineering National Institute for Aviation Research, Wichita State University Wichita, KS 67260–0035 USA
Eight male university students served as subjects in this experiment. Subjects were required to perform a simulated riveting task in three wrist postures (neutral, 1/3 maximum flexion, 2/3 maximum wrist flexion) while vibration was measured at the hand-handle interface and the styloid process of the ulna. Results indicated that wrist posture had a significant effect on vibration transmitted from the hand-handle interface to the wrist. The neutral wrist posture, in the majority of cases, was associated with the highest degree of attenuation. Decrements in vibration amplitude from the hand-handle interface to the wrist were as high as 88.89%.
Introduction
According to the United States Bureau of Labor Statistics (1997), for 1995 Work-Related Musculoskeletal Disorders (WMSD) due to repeated trauma declined by 7 percent over the previous year. This, on the surface, seems encouraging until this finding is linked with the lost-time data (time away from work). Carpal Tunnel Syndrome, a WMSD commonly associated with repeated trauma, was documented to have a median of 30 days away from work (BLS, 1997). Couple the lost-time data with the frequency data, and multiply by an indirect cost ranging from $60,000–$100,000 (CTD News, 1993), and it is understandable why industry desires to keep this illness at bay. For these reasons, the research community has been using the psychophysical approach in an attempt to develop acceptable work frequencies. A plethora of previous psychophysical studies have investigated the effects of wrist posture (Marley and Fernandez, 1995), applied force (Kim and Fernandez, 1993), and gender (Davis and Fernandez, 1994) on maximum acceptable frequency (MAF). A more recent study (Fredericks and Fernandez, in press) investigated the effect of vibration, wrist posture, and applied force on MAF. In that study it was determined that MAF decreased significantly with a deviation in wrist posture and an increase in applied force. It was also determined that decrements in MAF due to vibration were 36% while decrements due to wrist posture were 19%. This indicated that vibration transmitted from a rivet gun, as a risk factor in the development of WMSD, is of more concern than wrist posture. The present study builds upon that work to determine the effect of wrist posture on the attenuation of the vibration from the hand-handle interface to the wrist.
Methods and Procedures
Subjects and Design of Experiment
Eight males from the university population served as subjects in this experiment. Each subject was required to perform a riveting task commonly found in the aircraft industry. Wrist postures (neutral, 1/3 maximum flexion, 2/3 maximum wrist flexion) were varied in accordance with a pilot study conducted at an aircraft company located in the United States. For details of that study refer to Fredericks and Fernandez (in press). Applied force level and coupling force were held constant. A randomized complete block design with subjects as blocks served as the statistical design for this experiment. All trials were presented to the subjects in random order.
Equipment
A workstation was designed to simulate sheet metal riveting activities commonly found in the aircraft industry. A United Air Tool brand pneumatic hand-held rivet gun coupled with a pistol-type grip was used as the hand tool in the simulated task. The weight of the tool was 3.5 pounds. Three tri-axial Endevco 23 accelerometers were used to measure vibrations. The preamplification of the signal was performed by Endevco Model D 33 series signal conditioners. The analog-to-digital (A/D) conversion of the vibration signal was performed by a Keithley Metrabyte DAS 16F A/D board housed in a Zenith 80286 microcomputer. Processing of these vibration signals was done using the “Snap Master” signal processing software. One accelerometer was glued to a transducer mount (Rasmussen, 1982) to measure vibration levels entering the hands and another accelerometer was glued to a wrist mount (Farkkila, 1978) to measure vibration levels at the styloid process of the ulna.
Procedures
All tasks were performed with the subject’s preferred hand. All anthropometric measurements of the hand and wrist were also taken using the preferred hand. For each wrist posture, plus a replication of one posture, a psychophysically adjusted drilling frequency was determined by the method of adjustment. The subject was allowed to adjust the frequency of the task using the up and down arrows on the keyboard for the first 20 minutes. At the end of the 23rd minute, physiological measures (heart rate, blood pressure) and ratings of perceived exertion were taken. The subject then momentarily halted activities while the vibration equipment was fitted (this took approximately 1.5 minutes). This equipment included the previously described transducer mount for the hand, an arm clasp for the wrist, and the EMG sensors. Subjects then continued the task and a 2-second sample was recorded digitally at 7000Hz.
Results and Discussion
Descriptive statistics for the eight male subjects are presented in Table 1. A statistical comparison was made between the subjects’ height, weight, and grip strength in this study and those from a larger population (Viswanath and Fernandez, 1992; Ayoub et al., 1985). Results of t-tests indicate that there were no significant differences between the two, suggesting that the subjects used in this study may be representative of the larger population. Hand tool vibration was measured in the three orthogonal directions as recommended by NIOSH (1989). The signals, corresponding to the X, Y, and Z axes, were recorded and processed in several different ways. A root-mean-square (RMS) value was calculated to determine the energy content of the vibration. Further analysis occurred after a Fast Fourier Transform was performed
to determine acceleration values within specific frequency regions. The frequency regions reviewed included: 0–100 Hz (FWA), 101–200 Hz (FWB), 201–3500 Hz (FWC), and 0–3500 Hz (FWD). ANOVA results for the effects of wrist posture on RMS acceleration values and on frequency-weighted acceleration values in the regions FWA, FWB, FWC, and FWD for the hand-arm system are presented in Table 2. Although vibration data were collected at the coupling, the styloid process of the ulna, and the proximal end of the ulna, only results for the vibration at the first two locations are presented in this article. The combined effect of acceleration associated with the vibration in all three basicentric orthogonal directions was also calculated (NIOSH, 1989).
Table 1. Subject Descriptive Statistics (n=8)
Table 2. ANOVAs for vibration response variables
A Duncan’s Multiple Range test was performed to determine the allocation of differences for all vibration response variables significantly (α=0.05) affected by wrist posture. It was determined, for RMS values in the Z-axis, that the neutral wrist posture had significantly
higher acceleration values associated with it as compared to 1/3 and 2/3 maximum wrist flexion. Similar findings were determined in the x-axis (0–100 Hz range), z-axis (0–100 Hz range), combined effect of all three axes (0–100 Hz range), x-axis (101–200 Hz range), and combined effect of all three axes (101–200 Hz range). For the combined RMS values, it was determined that there were no significant differences between acceleration values obtained in the neutral and 2/3 maximum wrist postures and no significant difference between 2/3 maximum flexion and 1/3 maximum flexion. Similar findings were found in the z-axis in the 101–200 Hz range. In all of the cases tested, the neutral wrist posture had higher acceleration values associated with it. This could be attributed to the type of contact the hand has with the handle in the neutral posture versus the other wrist postures. Previously it has been shown that there is a difference between the impedance of the flat hand pushing on a plate (Miwa, 1964 a, b), the impedance of the hand gripping handles of different diameters (Reynolds and Falkenberg, 1984), and the impedance with a palm grip or finger grip around a handle (Reynolds and Keith, 1977). It is believed that wrist posture had an effect on the contact area of the hand with the handle, thus influencing the vibration transmission characteristics. In addition, wrist posture did not have a significant effect (α=0.05) on vibration greater than 200 Hz, implying that vibration above 200 Hz was not transmitted through the hand. These findings coincide with Reynolds and Keith (1977). Table 3 presents the findings of the attenuation of vibration from the hand-handle interface to the wrist. In the majority of the cases the highest degree of attenuation occurs when the wrist is in the neutral posture. Decrements in vibration amplitude were seen as high as 88.89%.
Table 3. Percent attenuation from hand-handle interface to wrist.
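As a purely illustrative sketch of the arithmetic behind these results (Python and NumPy are assumptions here, not the software used in the study), the first function below gives an r.m.s. acceleration within one of the frequency regions defined above (e.g. FWA, 0–100 Hz), and the second gives the percent attenuation from the hand-handle interface to the wrist reported in Table 3.

import numpy as np

def band_rms(accel, fs=7000.0, f_lo=0.0, f_hi=100.0):
    """r.m.s. acceleration of the signal restricted to one frequency region."""
    n = len(accel)
    spectrum = np.fft.rfft(accel)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0  # zero bins outside the region
    filtered = np.fft.irfft(spectrum, n)             # back to the time domain
    return np.sqrt(np.mean(filtered ** 2))

def percent_attenuation(rms_hand, rms_wrist):
    """Percent reduction in vibration amplitude from hand-handle interface to wrist."""
    return (1.0 - rms_wrist / rms_hand) * 100.0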
Conclusion
Wrist posture has a significant effect on vibration attenuation. The neutral wrist posture has the highest attenuation associated with it. Engineers should consider different means to decrease the vibration transmitted through the hand-arm system. Investigation of alternative handle shapes, materials, and tool mechanics may decrease the vibration transmitted through the hand-arm system, thus reducing the risk of WMSD.
References
Ayoub, M.M., Smith, J.L., Selan, J.L., Chen, H.C., Fernandez, J.E., Lee, T.Y., and Kim, K.H. (1985). Lifting in unnatural postures. Institute for Ergonomics Research, Texas Tech University, Lubbock, Texas.
Davis, P.J. and Fernandez, J.E. (1994). Maximum acceptable frequencies performing a drilling task in different wrist postures. Journal of Human Ergology, 23(2), 81–92.
Farkkila, M.A. (1978). Grip force in vibration disease. Scand. J. Work Environ. Health, 4, 159–166.
Fredericks, T.K. and Fernandez, J.E. (in press). The effect of vibration on psychophysically derived work frequencies for a riveting task. International Journal of Industrial Ergonomics.
Kim, C.H. and Fernandez, J.E. (1993). Psychophysical frequency for a drilling task. International Journal of Industrial Ergonomics, 12, 209–218.
Marley, R.J. and Fernandez, J.E. (1995). Psychophysical frequency and sustained exertion at varying wrist posture for a drilling task. Ergonomics, 38(2), 303–325.
Miwa, T. (1964a). Studies in hand protectors for portable vibrating tools. 1. Measurements of the attenuation effect of porous elastic materials. Industrial Health, 2, 95–105.
Miwa, T. (1964b). Studies in hand protectors for portable vibrating tools. 2. Simulations of porous elastic materials and their applications to hand protectors. Industrial Health, 2, 106–123.
NIOSH (1989). Occupational Exposure to Hand-Arm Vibration. Cincinnati, OH: Department of Health and Human Services, Public Health Services, Centers for Disease Control, National Institute for Occupational Safety and Health, Division of Standards Development and Technology Transfer, DHHS (NIOSH) Publication No. 89–106.
Rasmussen, G. (1982). Measurement of vibration coupled to the hand-arm system. In Brammer, A.J. and Taylor, W. (eds.) Vibration effects on the hand and arm in industry. New York, NY: John Wiley & Sons, 157–167.
Reynolds, D.D. and Keith, R.H. (1977). Hand-arm vibration. Part I. Analytical model of the vibration response characteristics of the hand. J. Sound and Vibration, 51(2), 237–253.
Reynolds, D.D. and Falkenberg, R.J. (1984). A study of hand vibration on chipping and grinding operators. Part II: Four-degrees-of-freedom lumped parameter model of the vibration response of the human hand. J. Sound and Vibration, 95(4), 499–514.
Tallying the true costs of on the job CTDs. CTD News, 1993.
U.S. Department of Labor, Bureau of Labor Statistics. (1997). Occupational Injuries and Illnesses in the United States by Industry, 1995. Washington, D.C.: U.S. Government Printing Office.
Viswanath, V. and Fernandez, J.E. (1992). MAF for males performing drilling tasks. Proceedings of the 36th Annual Human Factors Society Meeting, 692–696.
HAND TOOLS
CRITERIA FOR SELECTION OF HAND TOOLS IN THE AIRCRAFT MANUFACTURING INDUSTRY: A REVIEW Bheem P.Kattel and Jeffrey E.Fernandez
Department of Industrial and Manufacturing Engineering National Institute for Aviation Research Wichita State University Wichita, KS 67260–0035, USA
There is no evidence of extensive research on selection criteria for hand tools for specific tasks. Tools ergonomically suited to one task, individual, or environment may not be suitable for another environment or individual. Success in reducing this mismatch depends on reliable selection criteria. The heavy expense incurred by aircraft manufacturing industries in purchasing hand tools (especially rivet guns) needs to be justified from the point of view of economics as well as the quality of life of employees. This study reviews the literature related to the selection criteria of hand tools for different types of work. Some factors recommended for consideration when selecting hand tools for the aircraft manufacturing industry include size, weight, shape, forces exerted, and vibration.
Key Words: Rivet guns, Vibration, Hand-arm, Musculoskeletal Disorders.
Introduction
Hand tools find use in all types of environments: in kitchens, in garages, and in industry. The type, size, and shape of a hand tool depend on the nature of the task to be performed. Tools may be entirely manual or manually operated but power driven. Improper use of hand tools has been found to cause a variety of problems, the main effects of which are felt in the upper extremity. A recent trend in the manufacturing sector has been the production of ergonomically designed products. The market is flooded with products claimed by the manufacturers to be ergonomically sound, and hand tools are no exception to this trend. There is no recorded evidence of these hand tools being evaluated from all aspects to justify the cost. Tools which are ergonomically suited to one task, individual, or environment may not be suitable for other tasks, individuals, or environments.
Modern industrial development has made the production of hand tools a significant activity. However, poor design combined with excessive use makes hand tools a potential causative factor for the development of CTDs of the hand, wrist, and arm (Armstrong, 1983; Aghazadeh and Mital, 1987). The tools usually regarded as best for the job are the ones that minimize physical stress by placing low force demands on the hand, that are not awkward to hold and handle, and that minimize shock, recoil, and vibration (Radwin and Smith, 1995).
Factors affecting hand tool selection
Posture
Kim and Fernandez (1993) concluded from their study on psychophysical frequency for a drilling task that the task frequency for a drilling operation should be lowered as force and wrist flexion angle are increased. Halpern and Fernandez (1996), Fernandez et al. (1991), and Klein and Fernandez (1997), while exploring the effect of arm and wrist posture on pinch strength, concluded that elbow posture had a significant effect on chuck pinch strength while the shoulder posture did not. Posture affected the endurance time as well. Similarly, Kim et al. (1992) and Fernandez et al. (1993) concluded that wrist flexion had a significant effect on grip strength.
Force
The force requirements for a job are often related to the weight of the tools being handled. Force demands that exceed an operator’s strength capabilities may cause loss of control, leading to unintentional injury and poor work quality (Radwin and Smith, 1995). The force requirement can be classified into grip force and applied force. Dahalan and Fernandez (1993), from a study on the psychophysical frequency for a gripping task, concluded that the maximum acceptable frequency for a gripping task significantly decreased as the required gripping force and duration of grip increased.
Repetitiveness
The risk of developing a hand or wrist disorder is significantly increased for workers performing highly repetitive and forceful exertions (Silverstein et al., 1987). Silverstein et al. (1986) and Rodgers (1986) have reported that a cycle time of less than 30 seconds can be considered highly repetitive.
Contact stress
Some areas of the body are better suited to bearing contact stress than others. Contact stress is related to the force and area of contact, described by the pressure exerted against the skin (Radwin and Smith, 1995). Fransson-Hall and Kilbom (1993), from a study on the sensitivity of the hand to surface pressure, concluded that the most sensitive areas were the thenar area, the skinfold between the thumb and index finger, and the area around the os pisiforme.
Vibration
Burdorf and Monster (1991) investigated vibration exposure and health complaints among riveters and controls in an aircraft company. The results of the cross-sectional study provided some evidence that the use of impact power tools could result in neurovascular symptoms and damage to bones and joints in the hand-arm system.
Tool weight and load distribution
Generally, the tool weight should be less than 2.3kg. However, for precision operations it should be less than 0.4kg (1.00lb) (Rodgers, 1986). According to Chaffin and Andersson (1993), the combined weight of tool, hose, and power cord can be considerable for commercial drills, sanders, buffers, etc. The effect of weight is further aggravated by the additional muscle actions necessary to precisely position and stabilize a tool during operation.
Triggers
Shut-off mechanisms used in power-driven hand tools can affect the stress generated in the hand-arm system while using the tool. Kihlberg et al. (1993) showed significant differences in arm movements and reaction forces between three tools: one with instantaneous shut-off, one with a more slowly declining torque curve, and one with maximum torque maintained for some time before shut-off. The smallest values were found with the fast shut-off tool while the delayed shut-off tool caused the largest values.
Feed and reaction force
A tool with a rapid coupling obtains better results operating on hard joints than tools with slower disengaging functions (Lindqvist, 1993). Freivalds and Eklund (1993) concluded from a study that using electrical tools at lower rpm levels and under-powering pneumatic tools would result in larger impulses and more stressful ratings.
Handles
Khalil (1973) indicated that there was a decrease in the muscle effort required for a given torque as the diameter of the handle varied from 1.25 inches. According to Shih and Wang (1996), among different handle shapes the triangular shape was the most favorable, followed by square, hexagonal and circular. Cochran and Riley (1986) concluded that for tasks involving thrust, push or pull type activities, the best handle shape would be triangular followed by rectangular, and that handles with square and circular cross-sections were the worst. According to Johnson (1988), the best grip diameter, which required the least effort, was 5cm. According to Fernandez et al. (1991), an increase in handle diameter could cause an increase in the maximum wrist flexion but at a lower grip strength.
Grip span and type
Fransson and Winkel (1991) concluded that a span of 50–60mm for females and 55–65mm for males would produce the maximum resultant force between the jaws of the tool for both conventional and ‘reversed’ grip types. Kilbom et al. (1993) showed that during one-minute repetitive handgrip exercise with simultaneous demands on force and moderate demands on precision, subjects could use 40–50% of their maximal grip force.
Friction
Higher friction between hand and handle facilitates a good grip and hence less grip force exertion. Bobjer et al. (1993) concluded from a study that the coefficient of palmar friction was dependent on the size of the surface areas in contact with the skin and that there was a low correlation between coefficient of friction and perceived discomfort.
Recommendation for the aircraft industry
Based on the literature review presented above, Table 1 gives a summary of the criteria recommended for the selection of hand tools in the aircraft manufacturing industry.
Table 1. Factors and mode of measurement to be considered for evaluation of rivet guns.
Concluding Remarks
Proper selection of hand tools can reduce the risk of developing CTDs of the upper extremities. The available literature mostly deals with studies performed to establish safe limits for various physical characteristics of hand tools. Further research is required to establish criteria for selecting hand tools for specific operations.
References Aghazadeh F. and Mital A. 1987, Injuries due to hand tools. Applied Ergonomics, 18(4), 273–278 Armstrong T.J. 1983, An ergonomic guide to carpal tunnel syndrome. American Industrial Hygiene Association Journal, 43(2), 103–116 Bobjer, O., Johansson, S-E. and Piguet, S. 1993, Friction between hand and handle: Effects of oil and lard on textured and non-textured surfaces; perception of discomfort. Applied Ergonomics, 24(3), 190–202 Burdorf A. and Monster A. 1991, Exposure to vibration and self-reported health complaints of riveters in the aircraft industry. Annals of Occupational Hygiene, 35(3), 287–298 Cochran D.J. and Riley M.W. 1986, The effects of handle shape and size on exerted forces. Human Factors, 28(3), 253–265
Dahalan J.B. and Fernandez J.E. 1993, Psychophysical frequency for a gripping task, International Journal of Industrial Ergonomics, 12, 219–230 Fernandez, J.E, Dahalan, J.B., Klein, M.G. and Kim, C.H. 1991, Effect of handle diameter on maximum wrist flexion and extension. In W.Karwowski and J.W.Yates (eds.) Advances in Industrial Ergonomics and Safety III, 1991, (Taylor & Francis, London), 351–157 Fransson C. and Winkel J. 1991, Hand strength: the influence of grip span and grip type, Ergonomics, 34(7), 881–892 Fransson-Hall C. and Kilbom A. 1993, Sensitivity of the hand to surface pressure, Applied Ergonomics, 24(3), 181–189 Freivalds A. and Eklund J. 1993, Reaction torques and operator stress while using powered nutrunners, Applied Ergonomics, 24(3), 158–164 Halpern C.A. and Fernandez J.E. 1996, The effect of arm posture on peak pinch strength, Journal of Human Ergology, 25(1), 141–148 Halpern C.A. and Fernandez J.E. 1993, The effect of wrist posture and pinch type on endurance time. In W.Marras, W.Karwowski, J.Smith, and L.Pacholski (eds.) The Ergonomics of Manual Work, (Taylor & Francis, London), 323–326 Johnson S.L. 1988, Evaluation of powered screwdriver design characteristics, Human Factors, 30(1), 61–69 Khalil T. 1973, An Electromyographic methodology for the evaluation of industrial design, Human Factors, 15(3), 257–264 Kihlberg, S., Kjellberg, A. and Lindbeck, L. 1993, Pneumatic tool torque reaction: reaction forces, displacement, muscle activity and discomfort in the hand-arm system, Applied Ergonomics, 24(3), 165–173 Kilbom, A., Makarainen, M., Sperling , L., Kadefors, R. and Liedberg, L. 1993, Tool design, user characteristics and performance: a case study on plate-shears, Applied Ergonomics, 24(3), 221–230 Kim C.H. and Fernandez J.E. 1993, Psychophysical frequency for a drilling task, International Journal of Industrial Ergonomics, 12, 209–218 Kim, C.H., Marley, R.J. and Fernandez, J.E. 1992, Prediction models of grip strength at varying wrist positions. In S.Kumar (ed.) Advances in Industrial Ergonomics and Safety IV 1992, (Taylor & Francis, London), 783–788 Klein M.G. and Fernandez J.E. 1997, The effects of posture, duration, and force on pinching frequency, International Journal of Industrial Ergonomics, 20, 267–275 Lindqvist, B. 1993, Torque reaction in angled nutrunners, Applied Ergonomics, 24(3), 174–180 NIOSH, 1989, Criteria for a recommended standard: Occupational exposure to hand-arm vibration, DHHS (NIOSH) Publication No. 89–106, 86–92 Radwin R.G. and Smith S.S. 1995, Industrial power hand tool ergonomics research: current research, practice, and needs, DHHS (NIOSH) Publication No. 95–114. Rodgers, S.H. 1986, Ergonomic Design for People at Work, Vol.2, (Van Nostrand Reinhold, New York) Shih Y.C. and Wang M.J.J. 1996, Hand/tool interface effects on human torque capacity, International Journal of Industrial Ergonomics, 18, 205–213 Silverstein, B.A., Fine, L.J. and Armstrong, T.J. 1987, Occupational factors and carpal tunnel syndrome, American Journal of Industrial Medicine, 11, 267–274
EXPOSURE ASSESSMENT OF ICE CREAM SCOOPING TASKS Patrick G.Dempsey, Raymond McGorry, John Cotnam and Ilya Bezverkhny Liberty Mutual Research Center for Safety and Health 71 Frankland Road Hopkinton, MA 01748 U.S.A.
One area of upper-extremity cumulative trauma disorder research and practice that has been relatively ignored is the quantification of forces required by repetitive tasks. While wrist posture and frequency have been measured with some precision, forces are often evaluated qualitatively. This study extended a previously developed hand tool kit to specifically measure grip and scoop head forces on an aluminum ice cream scoop. The instrumented scoop was used to investigate forces required to scoop nine different flavors of ice cream between approximately –19° C and –12° C. Eight subjects participated in the experiment. Of particular interest was the percentage of maximum voluntary grip contraction forces required. The results indicate that ice cream scooping demands force exertions that exceed recommendations for repetitive exertions.
Introduction
Repetitive work with hand tools is thought to be a risk factor for work-related upper-extremity cumulative trauma disorders (WRUECTDs). The primary task-related risk factors for WRUECTDs are force, posture and repetition. Repetition can be readily measured through observation, and wrist posture can be directly measured with reasonable precision by electro-mechanical goniometers. However, direct measurement of force has been somewhat ignored, although there have been some efforts to directly measure forces at the hand-to-tool coupling. In general, assessments of muscular force associated with repetitive tasks involving the upper extremity have been fairly qualitative. Electromyography (EMG) presents one alternative, but the difficulties associated with the lack of control afforded by most workplaces make EMG an infeasible alternative. Another approach is to use force-sensitive resistors at the point(s) of force application. The resistors can be intrusive and must be placed rather precisely on contact points. Previously-developed hand tool technology was extended and utilized to directly measure forces required to scoop nine flavors of ice cream across a range of temperatures. The primary focus was assessing grip force, due to the involvement of the flexor tendons in WRUECTDs.
Methods
Subjects
Four male and four female subjects voluntarily participated in the experiment. The mean ages (standard deviation) of the male and female subjects were 34.0 (11.3) and 33.8 (3.3) years, respectively.
Apparatus
An instrumented ice cream scoop was used to collect data on the grip force exerted on the handle and on the moments acting about the head of the ice cream scoop. Grip force was measured with two button load cells mounted orthogonally within the handle of the ice cream scoop. Moments about the center of the handle were measured in two planes using strain gauges mounted on the interface between the handle and the scoop head. Linear regressions were used to relate voltages to forces (N) and moments (N•m). The r2 values for the four channels ranged between 0.997 and 0.999. The signals were passed through a differential amplifier to gain amplifiers. The gained output was passed to a low-pass filter with a 16Hz cutoff frequency. The filtered signal was sent to an analog-to-digital converter, a 512k microprocessor, and over an RS-232 line to the serial port of a personal computer. The “dull” scoop head, the electronic scoop, and the original scoop on which the electronic scoop was modeled are shown in Figure 1.
Figure 1. Dull scoop head, electronic scoop, and original scoop.
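The calibration step described above (linear regressions relating channel voltages to known forces and moments) can be pictured with the following sketch; NumPy and the function name are illustrative assumptions, not the procedure actually used with the instrumented scoop.

import numpy as np

def calibrate_channel(voltages, applied_loads):
    """Fit a volts-to-load calibration line for one channel and report r-squared."""
    v = np.asarray(voltages, dtype=float)
    y = np.asarray(applied_loads, dtype=float)   # known loads in N (or N*m for moment channels)
    slope, intercept = np.polyfit(v, y, 1)       # linear regression
    predicted = slope * v + intercept
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r_squared = 1.0 - ss_res / ss_tot            # 0.997-0.999 was reported for the four channels
    return slope, intercept, r_squared

# A raw voltage record is then converted with: load = slope * voltage + intercept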
A mock-up dip chest was constructed from wood to simulate a two-lid Kelvinator dip chest used in ice cream shops. Figure 2 shows a subject scooping ice cream from the mock-up dip chest. A digital probe thermometer was used to measure ice cream temperatures immediately after scooping. Six flavors of ice cream, two flavors of frozen yogurt, and one flavor of sorbet were used.
Figure 2. Experimental setup.
Procedures
Subjects were provided limited training. The training covered maintaining a straight wrist while scooping, scooping straight across the ice cream, and scooping from the highest point in the box of ice cream to the lowest.
Maximum voluntary contraction (MVC) data for grip force were collected with the electronic scoop handle at least one day prior to the experimental tasks. The Caldwell regimen, with modifications used by Berg et al. (1988) and Dempsey and Ayoub (1996) for collecting pinch strength data, was used. The average of two trials within ±15% of each other was used as the datum. A range of temperatures was studied. The goal was to test the ice cream at –18°C, –15°C, and –12°C. Due to difficulties controlling temperature with precision, the testing was performed over a range of –19.3°C to –12.4°C. Subjects performed the experimental tasks on three different days. On each day, subjects performed two replications of scooping for each flavor, for a total of 18 scoops per session. The order of presentation of the flavors during a session was random. On the three different days, the ice cream was at different temperatures for each session. Due to differences in contents, location in the freezer, etc., not all flavors tested were at the same temperature; however, during any given session all flavors were at similar temperatures relative to the range of temperatures tested. Subjects performed the experiment in pairs. One flavor at a time was selected at random and placed in the mock-up. Each subject scooped two scoops from the box placed in the mock-up dip chest. Before each scoop was performed, the subject dipped the scoop in a room-temperature water bath and shook off any excess water. Only one subject was in the laboratory at a time; the second subject waited outside. The scoops of ice cream were placed into plastic bowls for weighing. One experimenter controlled data acquisition and a second experimenter weighed the ice cream and transferred the boxes from the freezer to the mock-up and vice-versa. Once the four scoops were complete, the thermometer was inserted into the box of ice cream approximately 2.5cm below the surface. The temperature was recorded once the reading on the thermometer stabilized. The process was repeated for the other flavors.
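A minimal sketch of the two-trial rule used for the MVC data follows; interpreting the ±15% criterion against the larger of the two trials is an assumption made for this illustration.

def maximum_grip_force(trial_1, trial_2, tolerance=0.15):
    """Average two MVC trials if they agree within the tolerance; otherwise
    signal that a further trial is needed."""
    if abs(trial_1 - trial_2) <= tolerance * max(trial_1, trial_2):
        return (trial_1 + trial_2) / 2.0
    return None  # trials too far apart - repeat the measurement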
Results
Grip force readings for each trial were averaged and divided by each subject’s maximum grip force (MGF), yielding the average %MGF. An unexpected finding was a tendency for subjects not to exert a higher %MGF at colder temperatures. Ordinary least-squares regression was used to examine the relationship between temperature and %MGF. A significant relationship between %MGF and temperature was found for only one flavor. In this case, %MGF decreased approximately 4% for each 0.6°C increase in temperature. Given the general lack of significance, the %MGF was calculated across all temperatures for each flavor. The averages for each flavor ranged from 40.2% to 50.6%. Average values for subjects ranged between 36.0% and 68.8%. The total amount of grip force exerted to scoop a given weight of ice cream was analyzed. The grip forces for each trial were integrated and divided by the weight scooped. The reason for dividing by the weight scooped is that the exact amount of ice cream scooped could not be controlled. Figure 3 shows the relationships between this quantity and temperature for each flavor (functions are from regressions). For all flavors, temperature significantly affected integrated grip forces (using α=0.05). With the exception of one flavor, the second-order quadratic terms for temperature were significant. Since the significance level for the second-order term (0.0529) was just slightly over 0.05 for that flavor, the quadratic function is reported in Figure 3. The moments the head forces create about the center of the handle were analyzed in the same manner as the grip forces, i.e. the moments were integrated and divided by the weight scooped for a particular trial. For all flavors, temperature significantly affected the integrated scoop moments. Four of the regression functions were linear, and five were second-order quadratics (see Figure 4).
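The force analysis just described might be sketched as follows; Python/NumPy and the variable names are assumptions made for illustration only.

import numpy as np

def percent_mgf(grip_force_trace, mgf):
    """Average grip force in a trial as a percentage of the subject's
    maximum grip force (MGF)."""
    return 100.0 * np.mean(grip_force_trace) / mgf

def integrated_force_per_gram(grip_force_trace, fs, grams_scooped):
    """Time integral of grip force over a trial divided by the weight of
    ice cream scooped (the quantity plotted in Figure 3)."""
    return np.trapz(grip_force_trace, dx=1.0 / fs) / grams_scooped

def quadratic_fit(temperature, response):
    """Second-order least-squares fit of a response variable against
    temperature, as used for most flavors."""
    return np.polyfit(temperature, response, 2)  # coefficients of t^2, t, constant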
An important aspect of scooping is the amount of time required per scoop of ice cream. Force and duration of an exertion can be used to estimate the “acceptability” of a given exposure pattern, so the amount of time scooping per gram of ice cream was analyzed. Figure 5 shows the relationships between temperature and the amount of time required per gram of ice cream for the nine flavors tested. The relationships were second order for all flavors except one.
Figure 3. Relationship between temperature and integrated grip forces for 9 flavors tested.
Figure 4. Relationship between temperature and integrated scoop moments for 9 flavors tested.
Figure 5. Relationship between temperature and time per gram for 9 flavors tested.
Discussion
The experiment reported here provides quantification of the task demands associated with scooping ice cream. The authors were unable to find any ergonomic literature related to scooping ice cream. Likewise, the study presents quantitative estimates of exerted grip force while using hand tools. Individual subjects averaged up to almost 70% of their maximum grip force, which exceeds guidelines for even occasional exertions. For example, Putz-Anderson (1988) recommends limiting the magnitude of repetitive exertions to less than 30% of maximum, and limiting all exertions to 50% of maximum. It should be noted that values over 90% were observed for individual trials. These percentages are conservative in that MGF was collected with a neutral posture. Wrist deviation (ulnar/radial or flexion/extension) decreases the prehensile strength capabilities of the hand (e.g., Dempsey and Ayoub, 1996; Putz-Anderson, 1988). Thus, the actual %MGF values may have been somewhat higher for trials where wrist deviation was involved. Figures 3 and 4 indicate that integrated scoop moments and grip force follow the same pattern as the time/gram relationships shown in Figure 5. A somewhat unexpected finding was that subjects do not necessarily increase forces when scooping harder (colder) ice cream; rather, the time spent scooping is increased. Most relationships between peak forces and moments versus temperature were not statistically significant. Hopefully, the expansion of such direct measurement technologies will lead to the quantification of task demands associated
with repetitive hand tool use. Direct measurement of force may permit quantitative relationships between force and WRUECTD outcomes to be established. Such knowledge would be very beneficial to designers and practitioners. Currently, epidemiological evidence linking wrist posture, force, and repetition to the incidence and severity of WRUECTDs only allows for gross judgments concerning the risk associated with various work designs. Predictive models would allow for more quantitative comparisons of different work designs. Likewise, direct measurement provides the opportunity to gain considerable insight into the human responses to work with hand tools, some of which are not intuitive. The relationships presented in Figures 3–5 permit work practices to be developed with more precision than was afforded before the experiment. In particular, the graphs show that forces, moments, and time per gram increase markedly at temperatures below –14°C. This point may be used as a goal for the coldest temperature at which ice cream is scooped. Often, ice cream is stored at a temperature colder than it is scooped at, indicating that intermediate storage freezers should have sufficient capacity to allow time for the ice cream to reach serving temperature. The %MGF and time values for particular temperatures might also be used to suggest work/rest regimes. Of course, the actual time content of the particular job being examined would be necessary in addition to the results presented here. Several limitations of this study should be mentioned. The experiment was performed with nine flavors of a particular brand of ice cream assumed to provide a representative sample with regard to additives (e.g., nuts, fruit) and consistencies. It is possible that the results (in absolute terms) are specific to the flavors tested. However, it does not seem unreasonable to extrapolate the results, at least qualitatively, to other brands. Additionally, the subjects were not highly trained; however, comparison of the results presented here to data collected in the field indicates that results under actual working conditions are quite similar.
Acknowledgments The authors would like to thank Richard Holihan, Walter Majkut and Peter Teare for their assistance with the design, construction, and calibration of the electronic scoop, and with the construction of the mock-up dip chest. The authors would also like to thank the eight volunteers who participated in the experiment.
References Berg, V.J., Clay, D.J., Fathallah, F.A., and Higginbotham, V.L., 1988, The effects of instruction on finger strength measurements: Applicability of the Caldwell regimen. In F.Aghazadeh (ed.) Trends in Ergonomics/Human Factors V, (Elsevier, Amsterdam), 191–198 Dempsey, P.G., and Ayoub, M.M. 1996, The influence of gender, grasp type, pinch width and wrist position on sustained pinch strength, International Journal of Industrial Ergonomics, 17(3), 259–273 Putz-Anderson, V. (ed.) 1988, Cumulative trauma disorders: A manual for musculoskeletal diseases of the upper limbs, (Taylor and Francis, London)
THERMAL ENVIRONMENTS
THE EFFECT OF CLOTHING FIT ON THE CLOTHING VENTILATION INDEX Lisa Bouskill1, Nicola Sheldon1, Ken Parsons1 and W R Withey2
(1) Loughborough University, Loughborough, LE11 3TU (2) Centre for Human Sciences, Defence Evaluation Research Agency, Farnborough, GU14 0LX
To quantify the effect of clothing ‘fit’ on the exchange of air between clothing and the environment 9 male subjects undertook 2 exposures to an environment of ta=5.0 (1SD=0.3) °C, tr=5.0 (0.3) °C, Va=0.12 (0.02) ms-1 and rh=62 (1)%. During one exposure the subjects wore a ‘large’-size, 2-piece air-impermeable suit and during the other exposure they wore a ‘small’-size suit of the same design. During both exposures, air exchange—the Ventilation Index (VI)—was measured whilst subjects performed 3 activities: standing with no movement, stepping on and off a platform and rotating each limb in turn at a constant rate. VI was 28% higher (P<0.01) in the large suit compared to the small suit when standing with no movement, and 15% higher (P<0.01) for all activities combined. It is concluded that the rate of heat exchange between the skin and the environment depends on clothing fit.
Introduction It is not always the case that correctly fitting clothing is available for an individual when required. An inappropriate fit can be detrimental to the wearer’s performance due to a number of factors. Clothing which is too tight may restrict movement and may rub against areas of the body, causing chafing. Similarly, clothing which is too loose may present a snag or trip hazard and may be generally uncomfortable to wear. The incorrect fit of clothing may also generate problems associated with incompatibility between garments in an ensemble or between garments and equipment used in association with the clothing. In terms of clothing design and thermal physiology, the fit of clothing influences both the micro-environment volume within the ensemble and the ability to generate ‘pumping’ within it. In turn, these properties influence the ventilation characteristics of an ensemble and thus the effect the ensemble has on the exchange of sensible and evaporative heat between the skin and the environment. It has therefore been suggested that “the measurement and control of the pumping effect seems to be the key for further advances in the field of thermal insulation of clothing ensembles” (Vogt et al, 1983).
Our study was conducted as part of a larger project to quantify the effects of different aspects of clothing garments and ensembles on the exchange of heat between the skin and the environment. Of particular interest was the relationship between values obtained from standardised measurements in laboratory conditions, such as intrinsic insulation (Icl) and evaporative resistance (Iecl), and the ‘resultant’ values of these parameters (Vogt et al, 1983) obtained when the clothing is worn in workplace situations. The study used the tracer gas technique to determine the air exchange rate between the clothing and the environment; when combined with the measurement of micro-environment volume this provided the Ventilation Index (VI), the product of the air exchange rate and the micro-environment volume (Birnbaum and Crockford, 1978).
The aim of our study was to examine the effect of fit of clothing on VI. The null hypothesis (H0) was that VI would not be related to clothing fit. The alternative hypothesis (H1) was that loose-fitting clothing would have a higher VI than tight-fitting clothing. If a relationship can be quantified then it will be possible to relate clothing fit, through VI, to its effects on clothing insulation and on the heat exchange between the wearer and the environment, thus making evaluations of thermal risk in the workplace more accurate.
Materials and Methods Subjects Nine healthy, physically active males volunteered to participate in the study. Their physical characteristics are given in Table 1. They were fully informed of the objectives, procedures and possible hazards of the study and completed a Form of Consent before exposure. During the study, left aural temperature and heart rate were recorded as safety measures. Table 1. Subject physical characteristics. Mean (1SD)
Clothing and Clothing Fit During the study subjects wore the following clothing: underpants (own), short socks, soft ‘trainers’ and a 2-piece oversuit comprising trousers with an elasticated waist and a zip-fronted, long-sleeved jacket with an elasticated bottom hem. The oversuit was made from GoreTex™ material with elasticated cotton wrist cuffs and ‘popper’ fastenings at the ankles. The jacket zip was protected with a ‘popper’-fastened wind baffle. Subjects wore a ‘large’-size suit and a ‘small’-size suit of the same design during separate exposures. The order of wearing the suits was randomised. For this study it was necessary to make an estimate of the way in which the suit fitted the subjects. Various ways to quantify this fit were considered. For reasons of speed, practicability and simplicity the following technique was adopted. Seven anatomical landmarks were used as measuring sites: forearm, upper arm, chest, abdomen, hip, thigh and
lower leg. At each site the ‘excess’ fabric of the suit was ‘pinched’ away from the body, and the extent of this ‘excess’ measured. The mean of these values was calculated and used as an indication of the fit of the suit (Table 2).
Determination of the Ventilation Index VI is the product of the clothing air exchange rate and micro-environment volume. These were measured in separate test sessions as follows (Bouskill et al, 1997):
Air exchange Rate The air exchange rate of the ensemble was obtained using a tracer gas technique. Nitrogen was flushed through the clothing at a constant rate using a system of distribution tubes. The gas in the micro-environment was sampled using a system of sampling tubes connected to a small vacuum pump and an oxygen analyser. The time taken for the oxygen concentration in the micro-environment to return to 19% from 10% was used to calculate the rate of air exchange between the micro-environment and the external environment.
Micro-environment Volume The volume of air trapped within each ensemble (i.e. large-suit and small-suit) was determined in triplicate in a separate test session. The subject wore over the ensemble a 1-piece air-impermeable suit, sealed at the neck, which enclosed the entire body including the hands and feet. Air was evacuated from this suit until the pressure, as measured using a U-tube manometer attached to a perforated tube fed down the ensemble trouser leg, began to change. This was taken to be the point at which the air in the oversuit had been evacuated such that it just lay on top of the ensemble. Evacuation continued until no more air could be removed from the suit. This additional volume of evacuated air was taken to represent the micro-environment volume.
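A minimal sketch of how these two measurements combine into the index is given below. It assumes a well-mixed micro-environment diluted exponentially by ambient air, an assumption not stated in the paper; the wash-in time and volume are invented placeholders, not study data.

```python
import math

# Sketch only: Ventilation Index from the tracer-gas wash-in time and the
# evacuated micro-environment volume (placeholder values throughout).

O2_AMBIENT = 20.9               # % oxygen in room air
O2_START, O2_END = 10.0, 19.0   # oxygen window timed in the study, %
wash_in_time_min = 1.6          # time for O2 to rise from 10% to 19% (placeholder)
micro_volume_l = 25.0           # evacuated micro-environment volume, litres (placeholder)

# Under exponential dilution, C(t) = C_amb - (C_amb - C_start) * exp(-n*t),
# so the air exchange rate n (air changes per minute) follows from the
# measured wash-in time:
n = math.log((O2_AMBIENT - O2_START) / (O2_AMBIENT - O2_END)) / wash_in_time_min

ventilation_index = n * micro_volume_l   # litres of air exchanged per minute
print(f"air exchange rate = {n:.2f} min^-1, VI = {ventilation_index:.1f} l/min")
```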
Test Protocol The test protocol consisted of two separate exposures (one for each size of suit), in a controlled-environment chamber, to the following thermal conditions: ta=5.0 (1SD=0.3) °C, tr=5.0 (1SD=0.3) °C, Va=0.12 (1SD=0.02) ms-1 and rh=62 (1SD=1)% (water vapour pressure=0.54 (1SD=0.01) kPa). During each exposure subjects performed 3 activities: a. Standing stationary. b. A ‘step-up’ routine, each step movement taking 1.2 seconds as cued by a metronome. c. A rotating limbs routine, each limb being moved individually in large arcs, with each arc taking 4.8 seconds to complete as cued by a metronome. VI was determined 3 times for each activity, from which an average value was calculated. The 3 measurements were made consecutively, but the order of presentation of activities was balanced between subjects.
Results Clothing fit and VI values are given in Table 2. VI increased by 28%, 15%, 8% and 15% respectively for the standing stationary, stepping and rotating limbs activities and for all activities combined when comparing the large suit with the small suit. In practical terms this translates to increases in ventilation of 1.9 l·min-1, 1.7 l·min-1, 1.0 l·min-1 and 1.5 l·min-1 respectively for the 3 activities and all activities combined.
Preliminary statistical analysis (Student’s t-tests) showed significantly higher values for the standing stationary activity in the large suit compared with the small suit (P<0.01). However, for the stepping and rotating limbs activities the differences between suits were not statistically significant. The mean VI for all activities combined was greater in the large suit than in the small suit (P<0.05). The relationship between clothing fit and the mean VI for all activities combined is shown for each subject in Figure 1. Table 2. Clothing fit and Ventilation Index values for the standing with no movement, stepping and rotating limbs activities with the ‘small’ and ‘large’ suits
Figure 1. Relationship between fit of clothing and Ventilation Index when wearing the large and small suits.
Discussion The primary aim of this study was to examine the effect of the fit of clothing on the clothing ventilation index. As expected, each individual had their own unique value of ‘fit’ of the GoreTex suit, and different values of VI (Figure 1). The overall trend of the data is that loose-fitting clothing allowed greater ‘pumping’, i.e. ventilation, than did the tight-fitting clothing. The mean increase when standing stationary was 28% for the large suit compared with the small suit. The reason for this unexpectedly large difference is not clear, but is presumably related to the large suit allowing air to exchange through the openings (the neck, cuffs and leg bottoms) by means of natural convection. This emphasises the importance of this avenue of sensible and evaporative heat exchange with the environment for maintaining heat balance in the workplace. GoreTex is relatively impermeable to air, so in this study, in which the ambient air speed was low (0.12 ms-1), most of the air exchange probably took place through the openings alone. Clothing with a higher air permeability would be expected to allow air exchange by diffusion through the fabric of the garments. In these circumstances the relative contribution of the natural convection component would be smaller. The practical consequences of this must be taken into account when selecting clothing for work in environments in which air speed could significantly lower resultant insulation, or in selecting a limiting work environment when the clothing ensemble for that particular workplace is invariable. During this study it was also observed that when the clothing was very tight the ability of the subject to perform some tasks was impaired. Subject 2 was restricted, in both the stepping and rotating limbs activities, because of the tightness of both the small and large suits around his waist and thighs. This explains why Subject 2 did not achieve a higher VI in the large suit. This study has shown that it is in the interest of individuals and their employers to ensure that clothing is of a suitable fit, because ill-fitting clothing restricts movement and also because pumping may have detrimental physiological effects if the air temperature of the environment is cold. However, in warm and hot air temperatures pumping may assist heat loss by convection and evaporation, and thus be of benefit to the wearer.
References Birnbaum, R.R. and Crockford, G.W. 1978, Measurement of clothing ventilation index, Applied Ergonomics, 9, 194–200 Bouskill, L.M., Withey, W.R., Watson-Hopkinson, I. and Parsons, K.C. 1997, The relationship between the clothing ventilation index and its physiological effects. In: Proceedings of the Fifth Scandinavian Symposium on Protective Clothing, Elsinore, Denmark, 5/8 May 1997, R.Nielsen and C.Borg (eds.), 36–40 Vogt, J.J., Meyer, J.P., Candas, V., Libert, J.P. and Sagot, J.C. 1983, Pumping effect on thermal insulation of clothing worn by human subjects, Ergonomics, 26, 963–974 The support of the Ministry of Defence, DERA Centre for Human Sciences, is acknowledged. © British Crown Copyright 1998/DERA. Published with the permission of the Controller of Her Britannic Majesty’s Stationery Office
A THERMOREGULATORY MODEL FOR PREDICTING TRANSIENT THERMAL SENSATION Fangyu Zhu and Nick Baker The Martin Centre for Architectural and Urban Studies Department of Architecture, University of Cambridge Cambridge, CB2 2EB
Comprehensive experiments to investigate human reactions to both transient and spatially inhomogeneous thermal environments have been reported in the literature, but the relationship between thermal sensation and environmental parameters has yet to be made explicit. In this research project, a 37-node human thermoregulatory model has been constructed. The model is able to distinguish the effects of spatial and temporal changes of conditions around the body. A thermal sensation model will be developed to translate a given body thermal state into a corresponding thermal sensation on the ASHRAE seven-point thermal sensation scale. This paper mainly describes the physiological model.
Introduction The ISO standard 7730 (1994) is mainly based on Fanger’s research under steady-state conditions (Fanger, 1970). Fanger produced the PMV and PPD indices; the PMV index predicts the mean thermal sensation and the PPD index predicts the percentage of people likely to be dissatisfied. However, dissatisfaction with air-conditioned buildings designed according to the standard is widespread. The rigid indoor temperature limits lead to needless heating and cooling, which typically cause high consumption of energy. Furthermore, sick building syndrome (SBS) is common in these buildings. A new philosophy of building design is emerging with an interest in the use of structure and form to moderate the climate, allowing the temperature to fluctuate within limits during the day. Recently developed task-conditioning systems function by creating transient and highly asymmetric environments around the workstation (Heinemeier et al., 1990). By deliberately departing from the conventional goal of HVAC practice of steady, isothermal, low-speed air flow uniformity within the entire room, task-conditioning designs have highlighted the shortcomings of Fanger’s numerical models of human thermal balance, which only resolve the steady-state heat and mass fluxes at whole-body level. This research project includes the development of two models. A 37-node human thermoregulatory model has been developed and validated against experimental results.
A thermal sensation model will be built to transform the multivariable thermal state into a single thermal sensation index. The process by which a given thermal environment produces a corresponding thermal sensation is illustrated in Figure 1.
Figure 1. Thermoregulatory and thermal sensation model
The Thermoregulatory Model Main features of the model The Human Body Thermoregulatory Model (HBTM) is based mainly on Stolwijk’s 25-node model (Stolwijk, 1971). In order to represent a person in a real room, several modifications have been made to the original model, in which the environment was assumed to be isothermal and the body naked. Clothing is represented by two additional layers in each segment. Heat and humidity transfer resistance and the ‘pumping effect’ due to movements are considered. Convective and radiative heat losses are computed for the different segments. The heat transfer coefficients are calculated from the experimental results of de Dear et al (1997), which were measured on an articulated thermal manikin. The evaporative heat loss from the skin and the heat loss by respiration were considered to be constant in the original model. In this model, they are computed from the equations in the ASHRAE Handbook of Fundamentals (1989). The model has been written in FORTRAN and runs quickly under UNIX.
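The authors' FORTRAN implementation is not reproduced here; the minimal sketch below only illustrates the kind of passive-system bookkeeping a Stolwijk-type node model performs for a single segment (tissue layers exchanging heat by conduction, with dry heat loss from the skin). All coefficients are invented placeholders, and the active control system (vasomotor, sweating and shivering responses) is omitted.

```python
import numpy as np

# Sketch of the passive system of one body segment in a Stolwijk-type node
# model: four tissue layers exchanging heat by conduction, with the skin
# losing heat to the environment by convection and radiation.
# All coefficients are placeholders, not the values used in HBTM.

LAYERS = ["core", "muscle", "fat", "skin"]
C = np.array([60e3, 40e3, 15e3, 8e3])     # heat capacity, J/K (assumed)
K = np.array([4.0, 3.0, 2.0])             # conductance between adjacent layers, W/K (assumed)
Q_MET = np.array([8.0, 1.0, 0.3, 0.2])    # metabolic heat, W (assumed)
H_C, H_R, AREA = 3.0, 4.7, 0.15           # convective/radiative coefficients (W/m2K) and area (m2), assumed

def step(T, t_air, t_rad, dt=1.0):
    """Advance layer temperatures T (deg C) by dt seconds (explicit Euler)."""
    q = Q_MET.copy()
    cond = K * (T[:-1] - T[1:])           # conduction from inner to outer layers
    q[:-1] -= cond
    q[1:] += cond
    # dry heat loss from the skin to the environment
    q[-1] -= AREA * (H_C * (T[-1] - t_air) + H_R * (T[-1] - t_rad))
    return T + dt * q / C

T = np.array([36.8, 36.3, 35.0, 33.5])    # initial layer temperatures, deg C
for _ in range(3600):                     # one simulated hour at 25 degC
    T = step(T, t_air=25.0, t_rad=25.0)
print(dict(zip(LAYERS, T.round(2))))
```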
Validation of the model Several researchers have measured body temperatures of subjects who were dressed in shorts and in the sitting-resting position (Thellier et al, 1994). The subjects were exposed for 1 hour at a neutral temperature of 28°C, then quickly transferred for a 2 hour exposure to a hot or cold condition. The temperatures were taken after one hour’s exposure to the second environment. Presented in Figure 2 are the two measured mean skin temperatures and the predictions by Thellier’s model, along with those from HBTM. Figure 3 shows the experimental values (Wang, 1992) and predicted results of head, trunk and feet skin temperatures. The agreement between computed and measured results is reasonable. There is a change in the slope of the predicted curve between 29°C and 31°C, when thermoregulatory reactions against cold change into reactions against heat. Wang (1992) conducted experiments with sudden temperature changes, which were simulated by HBTM. Although several experiments have been used for comparison, only one is described here. After 30 minutes at an ambient temperature of 25°C, the subjects stayed for 60 minutes at an ambient temperature of 35°C, followed by an additional 30 minutes of recovery in a 25°C environment. The clothing insulation and mass of the six segments are presented in Table 1.
Figure 2. The predicted and measured mean skin temperatures in quasi-steady state
Figure 3. The predicted and experimental head, trunk and feet skin temperatures in quasi-steady state
Figure 4. Mean skin temperature, head and trunk core temperatures in transient conditions
The computed and measured mean skin temperatures, and the calculated head and trunk core temperatures, are shown in Figure 4. There is reasonable agreement between simulated and observed values, but the model seems to react somewhat faster than reality. Although this phenomenon has been identified by many authors, a good explanation has not yet been found. Table 1. The clothing insulation and mass of the six segments
Thermal sensation model Humphreys and Nicol (1996) pointed out that the PMV equation employs different comfort criteria for estimating conditions which would yield thermal neutrality, and for assessing the effects of departures from neutrality. It was shown that this difference of criteria undermines the estimation of PMV wherever the clothing insulation differs from 0.6 clo. They considered that the above weakness comes from the misleading idea of thermal load, which is the central concept of PMV. We think that a further reason for the above weakness might be the limitation of Fanger’s physical model. It only includes the passive system and does not include the control system. We know that the human body is capable of maintaining heat balance within relatively wide limits of the environmental variables. There is a neutral comfort interval for people, within which the control system can be neglected. Therefore Fanger’s PMV is a good predictor of the optimal comfort state. But when the environment departs from the neutral comfort conditions, the effort necessary for thermoregulation increases, and the control system cannot be neglected. In the dynamic state, the control system works all the time. The mean skin temperature and the sweat secretion cannot be kept at the comfort values, so Fanger’s thermal load does not exist. Because both the control and passive systems are considered in HBTM, it can overcome the limitation of Fanger’s method mentioned above. To complete the development of a thermal sensation model, it is necessary to devise a relationship between the thermal sensation and the thermal state; this relationship is called the new predicted mean vote (NPMV).
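The NPMV equation itself has not survived reproduction in this copy. A plausible form, consistent with the symbol list below and with the assumptions stated later, is a weighted sum of the departures of the regulatory-effort terms from their neutral values plus the net heat storage; this reconstruction is an assumption, not necessarily the authors' exact formulation:

NPMV = K1(Q − Q0) + K2(BF − BF0) + K3(Es − Es0) + K4·S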
where:
Q is the metabolic heat production (W/m2)
Q0 is the neutral metabolic heat production (W/m2)
BF is the blood flow rate (l/h·m2)
BF0 is the neutral blood flow rate (l/h·m2)
Es is the sweat secretion rate (W/m2)
Es0 is the neutral sweat secretion rate (W/m2)
S is the net heat storage (W/m2)
K1, K2, K3 and K4 are appropriate weighting factors.
NPMV is based on the following two assumptions: (1) a person at steady state in a neutral comfort environment expends minimal effort for thermal regulation and the net heat storage is zero; (2) as the environment changes from the above condition the effort necessary for thermoregulation increases, and the changes in thermal sensation may be correlated with changes in the effort required for thermoregulation and the net heat storage. The best estimates of the neutral values and the weighting factors will be found by correlating the thermoregulatory model results with experimental data. These values depend on the metabolic rate.
Conclusion The thermoregulatory model presented in this paper can be used to predict the global and local physiological parameters of the human body. It considers the local interactions between the body and the environment. Validation work, using independent experimental measurements, showed satisfactory agreement with predictions of skin temperatures. Further refinements can be made as more and better experimental data become available. A thermal sensation model is currently under development. The whole model will be able to predict thermal sensation in transient and asymmetric thermal environments.
References ASHRAE 1989, Handbook of Fundamentals SI Edition, (American Society of Heating, Refrigerating and Air Conditioning Engineers, Inc., Atlanta) de Dear, R.J., Arens, E., Zhang, H. and Oguro, M. 1997, Convective and radiative heat transfer coefficients for individual human body segments, International Journal of Biometeorology, 40, 141–156 Fanger, P.O. 1970, Thermal Comfort, (Danish Technical Press, Copenhagen) Heinemeier, K.E., Schiller, G.E. and Benton, C.C. 1990, Task conditioning for the workplace: issues and challenges, ASHRAE Transactions, 96, 678–688 Humphreys, M.A. and Nicol, J.F. 1996, Conflicting criteria for thermal sensation within the Fanger predicted mean vote equation, Proceedings of CIBSE/ASHRAE Joint National Conference, (Harrogate, UK) ISO 7730 1994, International Standard 7730, Moderate Thermal Environments: Determination of the PMV and PPD Indices and Specification of the Conditions for Thermal Comfort, (ISO, Geneva) Stolwijk, J.A.J. 1971, A mathematical model of physiological temperature regulation in man, NASA CR-1855, (NASA, Washington) Thellier, F., Cordier, A. and Monchoux, F. 1994, The analysis of thermal comfort requirements through the simulation of an occupied building, Ergonomics, 37, 817–825 Wang, L. 1992, Research on Human Thermal Responses During Thermal Transients, M.Sc. dissertation, (Tsinghua University, P.R.China)
THE USER-ORIENTED DESIGN, DEVELOPMENT AND EVALUATION OF THE CLOTHING ENVELOPE OF THERMAL PERFORMANCE Damian Bethea and Ken Parsons Human Thermal Environments Laboratory (HTEL) Department of Human Sciences Loughborough University Loughborough, Leicestershire, LE11 3TU England, UK Tel: +44 (0) 1509 22 81 65 email: [email protected] [email protected] The Clothing Envelope of Thermal Performance (CETP) provides a range of limits describing an ensemble in terms of wearer comfort and/or safety over a range of environmental parameters and work rates. This study took a systems approach to the user-oriented design, development and evaluation of the CETP. 4 Methods of Presentation (MoPs) of the CETP and their associated documentation were iteratively designed using ergonomics principles and guidelines. The MoPs were an Area Graph, a Table, a Psychrometric Chart and a Decision Tree. Following a pilot trial, 2 MoPs were selected for a formal user trial with a representative user population. The results of the Formal User Trial led to the recommendation that the Area Graph be the Method of Presentation for the CETP. The successful application of the CETP may help to reduce heat stress amongst wearers of PPC in industry.
Introduction Personal Protective Clothing (PPC) is selected and allocated on the strength of its hazard protection qualities and, as such, is described by manufacturers in terms of its “technical specification”, i.e. permeability, material abrasiveness etc. A consequence of this increased protection from the hazardous environment is that the clothing may not adequately facilitate the transfer of heat from the body to the environment, potentially causing heat stress. Health & safety managers/advisors are presented with little or no information about the effects protective clothing will have on the wearer in different environmental situations. The concept of a clothing envelope of thermal performance could be applied, which would specify the clothing ensemble in terms of its thermal performance over a range of environmental parameters. Each envelope is specific to an ensemble and for a given type of work will indicate the range of limits within which human comfort and/or safety will be achieved. If the ensemble performs within the envelope, the performance is acceptable. This project was concerned with the user-oriented design, development and evaluation of a Clothing Envelope of Thermal Performance (CETP). It considered both the generation of the CETP using current climatic ergonomics knowledge and the use of user-oriented design
and evaluation principles and guidelines to ensure that the CETP is practical and usable. The successful application of the CETP may help to reduce heat stress amongst wearers of PPC in industry.
Systems design approach A 4-stage systems approach was developed for the completion of the objectives, although only the first 3 stages were completed (see Figure 1). This design process was developed to accommodate the time schedule and resources available.
Figure 1. Flow Process Chart of the design, development and evaluation of the CETP
Stage 1: Synthesis Stage The Synthesis Stage involved the initial planning and design of the CETP paper interface. Design objectives were identified following interviews with experienced Health & Safety Executive (HSE) inspectors and with health & safety managers, occupational hygienists, product managers etc. from industry. The users were then formally defined as those people responsible for the health and safety PPC policy within their company and will be referred to as health & safety managers/advisors. The data characteristics of the CETP were generated using the 2-node model of human thermoregulation. The model was adapted to provide the combinations of all four environmental parameters and the work rate values that would cause an increase in core temperature to 38°C for a particular ensemble, thereby creating the limits of the envelope. The following were the specifications used:
Model Specification The model was rewritten so that the core temperature was set at 38°C and the intrinsic clothing insulation at 1.5 clo. The skin wettedness was set at 1, which is the maximum skin wettedness for acclimatised workers, and the relative humidity was varied in 10% increments from 10% to 100%.
Problem Specification Inputs The inputs for the model were metabolic rate (from 75–225 W/m2 in increments of 25 W/m2), air velocity (from 0.2–1 m.s-1 in increments of 0.2 m.s-1) and mean radiant temperature (from 30–90°C increasing by 10°C increments).
Outputs Interpretation This resulted in 560 combinations of the three inputs, producing a predicted air temperature for each 10% increment in humidity that results in a rise of core temperature to 38°C after 1 hour of exposure. Following the generation of the data from the 2-node model, 4 methods of presenting the CETP were developed as a paper interface. They were an Area Graph, a Table, a Decision Tree and a Psychrometric Chart (see Figures 2 and 3). In parallel with this development of the MoPs, the associated documentation (Information & Instructions) was iteratively developed according to the user requirements, the functional specification and the specific design of each MoP. Design guidelines, such as how to present information in tables, the use of colour, spatial cues etc., were used in the development of both the MoP interfaces and their documentation. Methods used included assigning categories such as low, medium and high to air velocity (air movement), radiant temperature and work rate, and the use of colour coding for these categories. Definitions of the categories and their ranges were supplied in the Instructions.
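A minimal sketch of the kind of generation loop described above is given below: for each combination of the input grid it searches for the air temperature at which a 2-node model predicts a core temperature of 38°C after 1 hour. The function predict_core_temp is a hypothetical placeholder standing in for the adapted 2-node model, not part of the original work.

```python
import itertools

MET   = range(75, 226, 25)            # metabolic rate, W/m2
V_AIR = [0.2, 0.4, 0.6, 0.8, 1.0]     # air velocity, m/s
T_RAD = range(30, 91, 10)             # mean radiant temperature, deg C
RH    = range(10, 101, 10)            # relative humidity, %

def predict_core_temp(t_air, t_rad, v_air, rh, met):
    """Crude monotone surrogate for the adapted 2-node model (placeholder only,
    so the sketch runs end to end; NOT the real model: clo=1.5, wettedness=1, 1 h)."""
    return (36.8 + 0.004 * met + 0.02 * (t_air - 20.0)
            + 0.01 * (t_rad - 20.0) + 0.005 * rh - 0.3 * v_air)

def limiting_air_temp(t_rad, v_air, rh, met, lo=-10.0, hi=60.0, tol=0.1):
    """Bisection for the air temperature giving a predicted core temperature of 38 degC.
    (No check that the bracket contains a root; a full implementation would add one.)"""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if predict_core_temp(mid, t_rad, v_air, rh, met) < 38.0:
            lo = mid          # still below the limit: the limit lies at a warmer air temperature
        else:
            hi = mid
    return (lo + hi) / 2

envelope = {
    (met, v, tr, rh): limiting_air_temp(tr, v, rh, met)
    for met, v, tr, rh in itertools.product(MET, V_AIR, T_RAD, RH)
}
```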
Figure 2. Representation of the Area Graph MoP. Colour coding is not shown and has been substituted with Letters for the air temperature categories and formatted lines for the work rates. The arrows show the increase in work rate for each category. This is the Area Graph for low radiant temperature at low air movement.
Figure 3. Representation of the Table MoP. Colour coding is not shown. (Note: this is the Table for low radiant temperature at low air movement.) An important aspect of developing the MoPs was allocating functions to each of the parameters, as it would have been impossible to represent all the data on a single interface. A fundamental requirement of this allocation of function was that the user’s decision process, when using the CETP, had to be identical irrespective of the MoP they were using.
Stage 2: Basic Design The 4 MoPs were further developed and evaluated using Heuristic Analysis and Informal Walk-Throughs. During this stage, the User Trial Pack (UTP), consisting of an Introduction, Instructions, Scenarios, Questionnaires etc., was developed. Due to the explicit differences between the MoPs, the same UTPs could not be used for all methods. Therefore the MoPs, their instructions and the UTPs for each MoP were analysed to ensure that there were no confounding errors in the nature of the information provided.
User Trials User trials were used to evaluate the 4 MoPs against usability criteria (e.g. ease of use, performance), including definitions of user characteristics and accuracy in using the MoP and its Instructions for use. Objective differences between the MoPs were evaluated by providing the participants with an Example Scenario of a working environment that needed to be assessed. The environmental parameters and the nature of the work rate were provided. The participant was then required to interpret the CETP, using the associated documentation and MoP, to determine whether the hypothetical clothing represented by the CETP was suitable for the environment and work described in the Scenario. They were then provided with answers to the questions so they could evaluate their own performance. The Example Scenario therefore acted as a training tool. They then answered 2 further Scenario Questions. Likert scales were used to evaluate the subjective responses to usability criteria, such as ease of use, accuracy of the CETP and appropriateness of information. The results from the Scenario Questions would provide the only objective data from the User Trial. Subjective data provided supportive information, but objective performance was the main criterion used for the assessment of the MoPs.
Stage 3: Summative Evaluation A pilot study was carried out using 24 student volunteers to assess the usability of the 4 MoPs in a between-subjects design (6 subjects per MoP). From this study the Area Graph and Table were selected for a formal evaluation by a representative population.
Experiment 2 40 health and safety managers were sent UTPs by post (20 per MoP). The results of this User Trial were used to propose a final CETP interface which could be used by those people responsible for PPC policy in industry. The independent variable was the MoP (Area Graph or Table), with the dependent variables being the usability criteria. The objective data from all 3 Scenarios, the subjective data and personal data were then collated and analysed.
Experiment 2—Results The results of the Formal User Trial did not reveal any significant differences between the usability criteria of the Table and the Area Graph methods of presentation. Those differences that were observed are discussed below. The instructions: No significant differences in the subjective ratings of any of the usability criteria were observed. Therefore differences were not due to the instructions. The Scenarios: A training effect following the Example Scenario was evident, with the subjects achieving a higher positive score in the next two Scenarios. However, the Table produced a high number of false positives for the air temperature range answers in Scenario 1, resulting in an erroneous assessment of the ensemble’s suitability. The Method of Presentation: The Area Graph performed better, both in the Scenarios and in the subjective ratings for the usability criteria. Other Issues: The air temperature ranges were too large, because all the parameters that would result in an increase in core temperature to 38°C were included in the CETP. Most users complained that the CETP did not tell them what PPC should be worn and therefore it was not directly addressing their issues.
Conclusions The use of the 2-node model was satisfactory for the requirements of this project; however, more sophisticated models should be used if the CETP is to be developed further. The Area Graph is the Method of Presentation that is recommended for the CETP. Although it has been shown that a user performance specification can be provided and used to support the selection, procurement and allocation of appropriate protective clothing, it also needs to be made more specific to the environment in which the PPC is to be worn (e.g. job-, task-, environment- and clothing-specific). At this stage of development it would appear that the CETP would be best suited to representing PPC in situations where generic ensembles are worn, e.g. the military, the fire brigade, the nuclear power industry etc.
Acknowledgements The authors would like to acknowledge the help of the following people: Len Morris, Dr. Ron McCaig, Paul Evans and Andrew Phillips from the HSE, Dr. Reg Withey from DERA, Mike Harris from Loughborough Consultants and Geoff Crockford. Gratitude is also expressed to those who took part in the study.
A Comparison of the Thermal Comfort of Different Wheelchair Seating Materials and an Office Chair. Humphreys, N., Webb, L.H., Parsons, K.C.
Department of Human Sciences, Loughborough University, Loughborough, Leicestershire, LE11 3TU
This study aimed to determine whether there were any thermal comfort differences between a standard office chair (wool, viscose) and four different wheelchair seating materials. The materials were tested in a neutral environment of 23°C, 70% humidity, with a predicted mean vote of 0. Measurements used included thermal sensation, thermal comfort, stickiness, subjects’ skin temperatures and temperatures of the chair seat pan and back. Differences were found between the chairs where the body was in contact with the seat pan and back. The subjective responses indicated that on their ‘bottom’ the office chair (wool, viscose) was generally slightly warmer, less comfortable and more sticky than the wheelchairs, whilst Wheelchair D (PVC coated) was slightly cooler than all the other chairs. The objective temperature data support the subjective data.
Introduction The thermal environment is one important aspect of achieving a “quality indoor environment for people”. Much work has been done on the thermal comfort requirements of the general population. However, until now little research has been carried out on the thermal comfort requirements of people with physical disabilities. One particularly relevant area of interest is that of the different seating materials used in wheelchairs. In studies of people without physical disabilities the experimenter can control the seating which the subject group uses. However, wheelchairs are specific to the individual who uses them, and people cannot always transfer to a chair chosen by the experimenter. This study therefore examines the thermal comfort effects of four different wheelchair seating materials compared with those of a standard office chair. The subjects used in the study were not disabled, as the thermal comfort requirements (Predicted Mean Vote, BS EN ISO 7730 (1995)) of this population are known, thereby avoiding the uncontrollable variable of the effects of different disabilities, whose influence on thermal comfort requirements is as yet unknown.
Aim The aims of this study are twofold. One: To determine whether there are any thermal comfort differences between a standard office chair (wool, viscose) and wheelchair seating materials. Two: To determine if there are thermal comfort differences between different seating materials on wheelchairs. Hypothesis 1 - Office chairs and wheelchair seating materials do not differ in terms of thermal comfort. Hypothesis 2 - Different wheelchair seating materials do not differ in terms of thermal comfort.
Method To determine the optimum indoor temperature in which to evaluate the seating materials, a number of small studies were undertaken. A two-day survey of a day centre for people with physical disabilities measured the air temperature, radiant temperature and humidity of the centre during the hours of occupation. In addition, two pilot studies were conducted in a thermal chamber, one at an air temperature (ta) of 23°C and relative humidity (rh) of 70%, predicted mean vote=neutral (PMV=0), the other at ta 29°C, 50% rh, predicted mean vote=slightly warm to warm (PMV=+1.5). The studies suggested that an air temperature of 23°C, relative humidity of 70%, PMV=0, would provide sensitive conditions and reflected the air temperature of the day centre. These conditions were then used for the main study.
Subjects and Clothing Five male subjects without physical disabilities participated in the study. Age=22.8±6 years, height=183±9cm and weight=73±11kg. Each subject wore their own footwear, cotton socks and underpants. They were provided with trousers and a shirt, both of 65% polyester and 35% cotton, and a sweatshirt of 70% cotton and 30% polyester. The estimated clothing insulation was 1.0 clo.
Experimental Design Table 1 Material Composition of Seats
Four identical wheelchairs were fitted with four different seating materials. The experimenter was not informed of the materials used and the chairs were labelled A, B, C, D and office chair. See Table 1 for the composition of the seating materials. The subjects were exposed to a different chair on five consecutive mornings; the order of exposure was determined by a 5×5 Latin square. The subjects always sat in the same part of the chamber and the seating was moved around according to the exposure for the session.
Measurements The experimental protocol was similar to that of Webb and Parsons (1997). Subjective and objective measures were taken. The ISO/ASHRAE 7-point thermal sensation scale was used, as well as subjective scales for thermal comfort, stickiness, preference, satisfaction and discomfort of specific seat areas. Both overall and body-area responses were recorded. In addition, at the end of the three hours subjects were asked to rate the chair of the present session in relation to the chair of the previous session. The skin temperatures of the subjects were taken at the mid anterior and posterior thigh, chest and lower back. Seat temperatures were taken at the chair back and pan. The environment was measured across three regions of the chamber, at subject chest level. Measurements taken were: air temperature, globe temperature, humidity, air velocity and plane radiant temperature. Equipment used included: Grant and Eltek series 1000 and 1001b data loggers with EU skin thermistors, a 150mm diameter black globe, a Vaisala HMP 35DGT humidity and temperature probe, a Byral 8455 air velocity sensor and a Brüel and Kjær Indoor Climate Analyser (Type 1213) with humidity, temperature, air velocity and plane radiant probes.
Procedure The group of five subjects arrived at the laboratory thirty minutes before the experimental session commenced. Procedures were explained, consent and health check forms completed. The four skin thermistors for measuring subject skin temperatures were attached to the subjects and oral temperatures taken. The standard clothing (of correct size) was given to each subject. Objective measurements were set to record every minute for the three hours. Subjective forms were completed prior to entering the chamber, on entering the chamber and every fifteen minutes thereafter. The subjects sat in an upright but relaxed position in their chairs, watching light entertainment videos for the duration of the session. Any major deviations from the posture were corrected.
Results The target environmental conditions were achieved (see Table 2). Table 2. Actual Experimental Conditions
Thermal Sensation Body areas not in contact with the chair were voted neutral by all five subjects on all five chairs. Body areas in contact with the chairs, i.e. posterior thigh, bottom and lower back, showed some variation between the chairs, but within one actual mean vote (AMV). Votes for the bottom area were close to neutral for all chairs except the office chair (wool, viscose), for which the actual mean vote was generally higher than for the other chairs, at neutral to slightly warm (AMV=0–1).
Thermal Comfort All the chairs were found to be “not uncomfortable”. Scores for the bottom-body area showed slight variation between the chairs, with the office chair (wool, viscose) showing actual mean votes of ‘slightly uncomfortable’. However, in general differences between the chairs were marginal.
Stickiness In the neutral environment to which subjects and chairs were exposed no stickiness was experienced. Only the office chair (wool, viscose) showed scores nearing ‘slightly sticky’ in the bottom-body area, towards the end of the three hours.
Preference Subjects expressed different preferences between their chairs in terms of their desire to be warmer or cooler. When seated in Chair A (woven nylon), 20 % of subjects wished to be cooler, whilst when seated in Chair D (PVC coated) 40% would prefer to be warmer. For the other three chairs subjects preferred no change to the environment.
Satisfaction In terms of thermal comfort, 40% of subjects were dissatisfied with Chair A (woven nylon), whilst the other four chairs were all found to be satisfactory. When subjects were asked if they would be satisfied in a particular chair for the whole day, 40% said they would be dissatisfied with Chair A (woven nylon) and the office chair (wool, viscose), and 20% would be dissatisfied with Chair D (PVC coated). A rank order of preference was established. From most preferred to least preferred this was Chair C (woven polyester), D (PVC coated), B (expanded PVC), A (woven nylon) and the office chair (wool, viscose).
Skin Temperatures There were no statistically significant differences between the skin temperatures of the subjects across the chairs. However, Chair D (PVC coated) was cooler at the posterior thigh, by 0.4–0.8°C compared with the other wheelchairs and by 1.7°C compared with the office chair (wool, viscose), and at the lower back it was 1.1–1.3°C cooler than the other chairs.
Chair Temperatures There was a 3.9°C difference between the coolest and warmest chair back and chair pan; the coolest was Chair D (PVC coated) and the warmest was the office chair (wool, viscose). Tukey’s pairwise comparison showed that Chair D (PVC coated) was significantly cooler (p=0.05) than the other chairs and that the office chair (wool, viscose) was significantly warmer (p=0.05) than all the other chairs.
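A minimal sketch of the kind of pairwise comparison reported above is given below; the seat-pan temperatures in it are invented placeholders rather than the study data, and only the structure of the analysis is shown.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Sketch only: Tukey pairwise comparison of seat-pan temperature across the
# five chairs.  All temperature values are invented placeholders.

chairs = np.repeat(["A", "B", "C", "D", "office"], 5)
seat_pan_temp = np.array([
    31.2, 31.5, 31.1, 31.4, 31.3,   # Chair A, woven nylon (placeholder)
    31.0, 31.3, 31.2, 31.1, 31.4,   # Chair B, expanded PVC (placeholder)
    31.1, 31.2, 31.0, 31.3, 31.2,   # Chair C, woven polyester (placeholder)
    29.0, 29.3, 29.1, 29.2, 29.4,   # Chair D, PVC coated (placeholder)
    33.0, 33.2, 32.9, 33.1, 33.3,   # office chair, wool/viscose (placeholder)
])

result = pairwise_tukeyhsd(endog=seat_pan_temp, groups=chairs, alpha=0.05)
print(result)  # table of pairwise mean differences with a reject flag at alpha=0.05
```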
Discussion When evaluating the different seating materials in a neutral environment, no differences were found with regard to sensation, comfort and stickiness between the four wheelchair materials. The four materials were varying combinations of polyester, nylon and PVC; three of the materials had a PVC foam filling of approximately 2mm and a PVC-coated polyester backing, and one did not. The material that showed slight differences was the office chair (wool, viscose), which had a wool, viscose and nylon weave, a PVC foam filling of approximately 5cm and a wooden backing. This chair gave results of slightly warm, slightly uncomfortable and slightly sticky, mainly in the ‘bottom’ area of the body. The office chair (wool, viscose) was also the lowest ranked chair. It was found to be satisfactory at any one moment; however, 40% said they would not find it satisfactory to sit in all day. Subjects’ posterior thigh skin temperatures were 1.7°C warmer than when sat in Chair D (PVC coated) and the seat pan temperature was 3.9°C warmer than Chair D (PVC coated)
(which was the coolest chair). The office chair (wool, viscose) was significantly warmer than all the other chairs (p=0.05). Chair C (woven polyester) was ranked first in preference. However, there were no performance indicators other than ranking to differentiate Chair C (woven polyester) from Chair B (expanded PVC), which was ranked 3rd. Chair D (PVC coated) was ranked 2nd despite 40% of subjects wanting to be warmer when in this chair, and 20% would not be satisfied seated in it all day. Subject skin temperatures at the posterior thigh and back were cooler by around 1°C than when in other chairs, and the seat pan and back were significantly cooler (p=0.05) than those of the other chairs. It is not possible to say from the data whether the cooler effect of this chair was preferred by subjects or what the implications of this would be for warmer or cooler environments. It was, however, ranked 2nd. Chair A (woven nylon) was ranked 4th. This chair did not differ in performance with regard to sensation, comfort and stickiness from the other wheelchairs; however, on the preference ratings, 20% of subjects wished to be cooler when seated in this chair, and 40% were dissatisfied both at any one time and if they were to sit in the chair all day.
Conclusions Overall it was found that the office chair (wool, viscose) differed slightly in terms of thermal comfort compared with the different wheelchair seating materials. Differences with regard to preference were also found between the different wheelchair seating materials, although no differences in terms of thermal comfort, sensation and stickiness were found. Subjective and objective results supported each other; although some slight variation was found, the differences were negligible and therefore insignificant in a neutral environment. Chair D (PVC coated) was found to be the coolest chair, and the subjects would have preferred to be warmer when seated in this chair. However, in ranking the chairs, Chair D (PVC coated) was ranked second. It cannot therefore be assumed that the cooler chair was a negative attribute in a neutral environment; however, it may have different implications in warmer or cooler environments. Except for the office chair (wool, viscose), it is possible that factors other than the sensation, comfort and stickiness ratings influence subjects’ seating material preference. The warmest chair was the office chair (wool, viscose). The key differing factor between the coolest, warmest and other chairs appears to be material thickness and foam padding rather than material type. Further work needs to be carried out to evaluate the materials in warmer and cooler environments, and to evaluate more closely the effect of material thickness, type and backing on the thermal comfort of the seating materials. A further study is now being undertaken to evaluate the thermal comfort of the same seating materials in a slightly warm to warm environment: predicted mean vote 1.5, air temperature 29°C, 50% relative humidity.
References BS EN ISO 7730 (1994): Moderate Thermal Environments—Determination of the PMV and PPD Indices and Specification of the Conditions for Thermal Comfort. 2nd ed. (ref. no ISO 7730:1994(E)) International Standards Organisation, Geneva. Webb and Parsons (1997) Thermal Comfort Requirements for People with Physical Disabilities. BEPAC and EPSRC Sustainable Building mini conference, UK 5/6 February. http://www.iesd.dmu.ac.uk/bepac/sustbuild/conf/heat_com.htm#
THE EFFECT OF REPEATED EXPOSURE TO EXTREME HEAT BY FIRE TRAINING OFFICERS Joanne O.Crawford and Tracey J.Milne
Industrial Ergonomics Group School of Manufacturing and Mechanical Engineering University of Birmingham Edgbaston Birmingham B15 2TT
The study examined the effect on fire training officers of repeated exposure to high temperatures. Physiological measures included heart rate and aural temperature. Sub-maximal baseline tests were carried out with participants in P.E. kit and in fire kit, and oxygen consumption was measured. On entry to the hot conditions, aural temperature and heart rate were monitored. The results indicated a significant increase in heart rate between the two baseline conditions. There were no significant differences found between baseline measures of heart rate and temperature (wearing fire kit) and first entry to the hot conditions. Heart rate and temperature data obtained from the second entry to the hot conditions were found to be significantly higher than in all other conditions. The results indicate a need to further examine working time and recovery time when working in high temperatures.
Introduction Firefighting has long been associated with high physiological demands. Many studies, of both simulated and actual fire suppression, have measured workloads of 60–80% VO2max and up to 95% HRmax (Sothmann et al, 1992; Manning & Griggs, 1983). Both physiological and psychological stresses (including boredom and anxiety) contribute to the overall hazards affecting the well-being of firefighters (Lim et al, 1987). Personal Protective Equipment (PPE) is essential, but when fully kitted up an additional 20–30kg in weight is added, which causes a significant increase in oxygen consumption, heart rate and ventilation rate (Louhevaara, 1984; Borghols et al, 1978; Sykes, 1993; Love et al, 1994). High temperatures add an additional physiological load; typical working temperatures are thought to be 38°C to 66°C, but air temperatures as high as 232°C in structural fires have been recorded (Abeles et al, 1973). Although actual firefighting tasks have been examined, the nature of firefighting is such that on a day-to-day basis firefighters are not continually working in extreme environments.
The role of fire training officers is, however, different in that they are training recruits in hot conditions and are entering fire training houses more than once per day wearing full fire kit and self-contained breathing apparatus. Although previous research has recommended maximum limits for core temperature when working in high temperatures, body temperature is not necessarily monitored on a day-to-day basis. The aim of this study was to examine the physiological strain on fire training officers when they were exposed to hot conditions more than once per day, by monitoring heart rate and aural temperature and comparing the data collected with baseline data.
Method Six participants took part in the study; all were professional male fire brigade training officers. The equipment consisted of a bicycle ergometer (Monark Ergomedic 818), an Aerosport TEEM 100 metabolic analyser, two heart rate monitors (Polar Sports Tester and Polar Vantage NV) and two aural thermometers (Grant Instruments, Cambridge). The aims of the study were explained to all participants. Baseline data were collected from the participants in two sub-maximal tests: one wearing P.E. kit and one wearing full fire kit and carrying SCBA. Each test was sub-maximal and consisted of 4 workloads on the bicycle ergometer, each held for a period of 5 minutes with a 5 minute break between workloads. The starting workload was 20W, followed by 3 further workloads in increments of 20W, the final workload being 80W. Due to time constraints all baseline data had to be collected on the same day, but participants were allowed recovery time between tests. Data collected during the baseline tests included oxygen uptake, heart rate and aural temperature. Physiological monitoring used when the training officers were in the fire house included heart rate monitors and aural thermometers. Aural thermometers were placed on the training officers twenty minutes before recording commenced. Temperature was recorded prior to entry to the fire house and on exit. Heart rate was recorded continuously during the working period. Air temperature was continuously monitored in the fire house. The time spent in the fire house varied between 10 minutes and 18 minutes for each participant.
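The paper does not state which prediction method was applied to the sub-maximal data; the minimal sketch below shows one common approach, fitting the linear heart-rate/oxygen-uptake relationship across the four workloads and extrapolating to an age-predicted maximum heart rate (220 − age). The workload data are invented placeholders, not the study data.

```python
import numpy as np

# Sketch only: predicting VO2max by linear extrapolation of the sub-maximal
# HR-VO2 relationship to an age-predicted HRmax.  All values are placeholders.

age = 35
vo2 = np.array([0.9, 1.2, 1.5, 1.8])   # oxygen uptake, l/min at 20, 40, 60, 80 W (placeholder)
hr  = np.array([95, 110, 126, 141])    # heart rate, beats/min at the same workloads (placeholder)

slope, intercept = np.polyfit(hr, vo2, 1)   # fit VO2 = slope*HR + intercept
hr_max = 220 - age                          # age-predicted maximum heart rate
vo2_max_pred = slope * hr_max + intercept
print(f"Predicted VO2max: {vo2_max_pred:.2f} l/min")
```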
Results The participants were all male fire training officers, aged between 29 and 42 years. Average predicted VO2max was found to be 2.88 l/min (range 2.51 to 3.33 l/min). Heart rate and aural temperature data obtained during the final work phase of the baseline tests are shown in Table 1. The heart rate and aural temperature data for the hot conditions are shown in Table 2. The data are presented graphically in Figures 1 and 2 for each of the four test conditions. Initial examination of the heart rate data found that the sample were working at 78.8±12% of maximum heart rate during the second entry to the fire house (range 67% to 99%). The results are shown in Table 3. Statistical analysis found that, when comparing the baseline tests, there was a significant increase in heart rate when wearing fire kit (p<0.02). Using ANOVA to analyse the heart rate data, it was found that heart rates were significantly higher on the second entry to the fire house than in all other conditions (p<0.001). However there were no significant differences found between wearing fire kit in the baseline condition and the first entry to the fire house.
Table 1. Baseline Data
Table 2. Data from Hot Conditions
Table 3. Average Heart Rate and Percentage of Predicted Maximum Heart Rate
Statistical analysis found that within the baseline tests there were no significant differences in aural temperature between wearing PE kit and fire kit. Using ANOVA to analyse the aural temperatures, significant differences were again found, with the highest temperatures after the second entry to the fire house (p<0.01). It was interesting to note that there were again no significant differences between the baseline condition wearing fire kit and the first entry to the fire house. Environmental temperature monitored during the study ranged from 44°C to 240°C.
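The exact ANOVA design is not specified in the paper; the minimal sketch below shows one possible implementation, a repeated-measures ANOVA of aural temperature with condition as a within-subject factor for the six officers. The temperature values are invented placeholders, not the study data.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Sketch only: repeated-measures ANOVA of aural temperature across the four
# conditions (six officers x four conditions).  All values are placeholders.

conditions = ["PE_kit", "fire_kit", "entry_1", "entry_2"]
temps = np.array([
    [36.6, 36.8, 37.0, 37.9],
    [36.5, 36.7, 36.9, 38.1],
    [36.7, 36.9, 37.1, 38.3],
    [36.6, 36.8, 37.0, 37.8],
    [36.8, 36.9, 37.2, 38.5],
    [36.5, 36.6, 36.9, 39.3],
])

long = pd.DataFrame(
    [(s, c, temps[s, j]) for s in range(temps.shape[0])
     for j, c in enumerate(conditions)],
    columns=["subject", "condition", "aural_temp"])

print(AnovaRM(long, depvar="aural_temp", subject="subject",
              within=["condition"]).fit())
```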
Figure 1. Heart Rate Data
Figure 2. Aural Temperature Data
Discussion The participants who took part in the study were of a variety of ages and their predicted VO2max was found to reach the minimum level recommended by Sothmann et al (1992) for firefighting. However, the use of sub-maximal testing to predict maximum oxygen consumption can introduce errors of around ±15%. The increase in heart rate across conditions supports the majority of previous literature. What was interesting from the study was the significant increase in both heart rate and aural temperature between the two hot conditions, indicating that physiological strain was significantly higher on the second entry to the fire house. This appears to indicate a cumulative effect between the two conditions. The lack of difference in heart rate and aural temperature between the baseline condition wearing fire kit and the first entry to the fire house would also support a cumulative effect between the two hot conditions. This raises further questions about repeated exposure in this environment. One participant’s aural temperature reached 39.3°C on the second entry to the fire house. Although aural temperature is not a direct measure of core body temperature, it does indicate an increase in body temperature. This
would suggest that there is a need to examine further the time spent in hot conditions and the amount of time allowed for recovery between entries. The limitations of this study were the small sample of six participants, the lack of control over fire house conditions, and the variation in time spent within the hot conditions. It is very difficult when using live fire exercises to control temperature, as it is affected by outside weather conditions. However, to obtain realistic live data for fire training officers this was one of the main methods available, and it presents future challenges in this field.
Future Research From this study a number of recommendations can be put forward for future research. The number of participants should be increased. There is also a need for more controlled conditions in terms of ambient temperature and standardised working time within the hot conditions. It would also be recommended that skin temperature be monitored as this may be a better measure to use in this type of environment. The cumulative effect of heat and wearing PPE should also be examined further in the future.
References
Abeles, F.J., Delvecchio, R.J. and Himel, V.H., 1973, A fire-fighter's integrated life protection system, phase 1, design and performance requirements. Grumman Aerospace Corp, New York
Borghols, E.A.M., Dresen, M.H.W. and Hollander, A.P., 1978, Influence of heavy weight carrying on the cardiorespiratory system during exercise. European Journal of Applied Physiology, 38, 161–169
Lim, C.S., Ong, C.N. and Phoon, W.O., 1987, Work stress of firemen as measured by heart rate and catecholamines. Journal of Human Ergology, 16, 209–218
Louhevaara, V.A., 1984, Physiological effects associated with the use of respiratory protective devices: a review. Scandinavian Journal of Work, Environment and Health, 10, 275–281
Love, R.G., Johnstone, J.G.B., Crawford, J., Tesh, K.M., Graveling, R.A., Ritchie, P.J., Hutchison, P.A. and Wetherill, G.Z., 1994, Study of the physiological effects of wearing breathing apparatus. Institute of Occupational Medicine Technical Memorandum TM/94/05
Manning, J.E. and Griggs, T.R., 1983, Heart rates in fire-fighters using light and heavy breathing equipment: similar near-maximal exertion in response to multiple work conditions. Journal of Occupational Medicine, 25, 215–218
Sothmann, M., Saupe, K., Jasenhof, D. and Blaney, J., 1992, Heart rate responses of firefighters to actual emergencies: implications for cardiorespiratory fitness. Journal of Occupational Medicine, 34, 27–33
Sykes, K., 1993, Comparison of conventional and light BA cylinders. Fire International, 143, 23–24
The effects of Self-Contained Breathing Apparatus on gas exchange and heart rate during fire-fighter simulations
Kerry Donovan & Alison McConnell
Sports Medicine and Human Performance Unit, Applied Physiology Research Group, School of Sports & Exercise Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT.
The aims of the present study were to quantify the effects of wearing self-contained breathing apparatus (SCBA) during an occupation-specific laboratory-based test ('Firetest') and to validate the test. 8 fire-fighters and 10 civilians performed the Firetest with and without fire-fighter SCBA. The group mean results with and without SCBA wear were as follows: VO2 increased from 1.81 to 2.34 l·min-1 (p<0.01); VCO2 from 1.65 to 2.29 l·min-1 (p<0.001); VE from 53.49 to 72.74 l·min-1 (p<0.001); Fc from 123 to 152 bpm (p<0.001). TTOT fell from 2452 to 1882 msec (p<0.01). Fire-fighters using this protocol reported that the Firetest was more representative of fire-fighting activities than other tests in current use. The Firetest will be used to test the respiratory responses of fire-fighters to SCBA wear and to monitor changes in performance following various training interventions.
The aims of the present study were, to quantify the effects of wearing selfcontained breathing apparatus (SCBA) during an occupation-specific laboratory-based test (‘Firetest’) and to validate the test. 8 fire-fighters and 10 civilians performed the Firetest with and without fire-fighter SCBA. The . group mean results with and without SCBA wear were as follows. VO2 . increased from 1.81 to 2.34 l·min-1 (p<0.01); VCO2 from 1.65 to 2.29 l·min-1 . (p<0.001); VE from 53.49 to 72.74 l·min-1 (p<0.001), Fc from 123 to 152 bpm (p<0.001). TTOT fell from 2452 to 1882 msec (p<0.01). Fire-fighters using this protocol reported that the Firetest was more representative of firefighting activities than other tests in current use. The Firetest will be used to test the respiratory responses of fire-fighters to SCBA wear and to monitor the changes in performance following various training interventions.
Introduction All active fire-fighters are trained to use Self-Contained Breathing Apparatus (SCBA). Accurate determination of the cost of SCBA wear to respiratory function is therefore desirable. However, testing of fire-fighters during real emergencies is generally impractical. Most of the published reports examining fire-fighters' responses to SCBA wear have utilised standard laboratory protocols, i.e. treadmill walks and cycle ergometry (Faff & Tutak, 1989; Wilson et al, 1989). These are not entirely representative of fire-fighting actions, which typically involve whole-body and upper-body activity (Sothmann et al, 1991). To address this discrepancy, we have developed a test which simulates fire-fighting activities in a laboratory environment (the Firetest). Published research indicates that the respiratory accessory muscles are activated at an earlier stage in exercise when wearing SCBA (Louhevaara et al, 1995). Furthermore, evidence suggests that SCBA compresses the thorax, "preventing free and efficient movement" (Louhevaara et al, 1985, p. 215). This restriction, the added mass of the SCBA and the upper-body work typical of fire-fighting activities may increase respiratory demands, resulting in significant changes in breathing pattern and ventilation. The aim of the present study was twofold: 1) to validate the Firetest in the light of previous work carried out in the laboratory and in the field, and 2) to quantify the ventilatory and metabolic effects of wearing full fire-kit including SCBA (dry mass ~23 kg) during the Firetest.
Methods 18 male volunteers successfully completed the study: 8 were professional fire-fighters and 10 were civilians (see Table 1). All were physically active and none were smokers. Ethics committee approval and informed written consent were obtained prior to the study. Table 1: Group physical characteristics.
Physiological measurements Respiratory air flow was measured breath-by-breath using an ultrasonic phase-shift flow meter (Flowmetrics, Birmingham, UK) to monitor inspired and expired tidal volumes, total breath duration (TTOT), peak inspiratory and expiratory flow rates, mean inspiratory flow rate, respiratory frequency (FR), and minute ventilation (VE). A mass spectrometer analysed inspired and expired air (Airspec MGA 2000, UK). Sampling was performed using a fine-bore polythene catheter (~2.5 m long) inserted into the flow meter manifold, which was incorporated into the inlet port of a modified SCBA face-mask. Heart rate (Fc) was monitored using a Sportstester (Polar Oy, Finland). Any Fc above 95% Fcmax would have resulted in termination of the test; no test was terminated under this criterion. Breathlessness was monitored at one-minute intervals using a twelve-point modified Borg scale.
Procedure Visit 1: Anthropometric measures were made, followed by a treadmill test to volitional fatigue to measure VO2max. Lung function was measured before and after the VO2max test. Failure to meet any of the guidelines laid down by the Home Office (WM Fire, personal communication) led to the removal of volunteers from the study. The fitness data were used to calculate the Firescore, obtained using a modified version of a test designed by Davis & Dotson (1982). Scores <1065 do not meet requirements, >1065 are "Fair", >1150 are "Good" and >1250 "Excellent". The range of scores for the group was 1079–1336. Visit 2: Tests of lung function were followed by a 5-minute rest. The volunteers then performed the Firetest. Heart rate (Fc) was monitored throughout. Volunteers performed the Firetest in PE kit and training shoes. Visit 3: Visit 2 was repeated, but this time wearing full fire-fighter turn-out kit including SCBA. Following a debrief, the fire-fighters indicated that the Firetest produced physiological responses similar to those encountered whilst wearing SCBA. The Firetest is a nine-stage, sub-maximal, progressive test (see Table 2). Computer storage limitations required the Firetest to be monitored in two halves (stage 5 was used to open a new data file and thus could not be monitored). The treadmill speed and gradient were kept constant (5 km/h, 6%) during each relevant stage.
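The Firescore bands quoted above translate directly into a simple look-up. The helper below is a hypothetical illustration (it is not part of the Davis & Dotson procedure itself), and the handling of the exact boundary values is an assumption.

```python
def firescore_band(score: float) -> str:
    """Classify a Firescore using the bands quoted in the text.
    Boundary handling (which band a threshold value falls in) is assumed."""
    if score < 1065:
        return "Does not meet requirements"
    if score < 1150:
        return "Fair"
    if score < 1250:
        return "Good"
    return "Excellent"

# The group's range of 1079-1336 spans "Fair" to "Excellent":
for s in (1079, 1150, 1336):
    print(s, firescore_band(s))
```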
Table 2. The nine-stage task-specific Firetest
Statistical analysis Data were the means of the last ten breaths of the final minute of each monitored stage (1 data point per stage per variable). Student's t-tests for paired observations were performed to test for differences. Pearson's product moment coefficient tested for associations between variables. A probability of 5% was accepted as significant.
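The data reduction and association test described here are straightforward to express. The sketch below uses invented arrays, takes the mean of the last ten breaths of a stage as the single data point, and uses scipy's Pearson correlation as one way of carrying out the stated association test.

```python
# Illustrative sketch of the data reduction and correlation described above;
# all values are invented, not study data.
import numpy as np
from scipy import stats

def stage_value(breath_by_breath, n_last=10):
    """One data point per stage: mean of the last n breaths of that stage."""
    return float(np.mean(breath_by_breath[-n_last:]))

# Invented breath-by-breath minute ventilation (l/min) for one stage
ve_breaths = np.array([60, 63, 66, 69, 70, 72, 73, 74, 75, 74, 76, 75])
print(f"stage value: {stage_value(ve_breaths):.1f} l/min")

# Invented per-subject Firescores and cardiac strain (fraction of Fcmax)
firescore = np.array([1079, 1120, 1150, 1190, 1230, 1260, 1300, 1336])
cardiac_strain = np.array([0.92, 0.90, 0.88, 0.85, 0.83, 0.80, 0.78, 0.75])
r, p = stats.pearsonr(firescore, cardiac_strain)
print(f"r = {r:.2f}, p = {p:.4f}")
```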
Results Table 3. Results of Visits 2 & 3 Means (±SDev), n=18
The volume of air used during SCBA wear is of great relevance to fire-fighters and was calculated by summing the expired tidal volume per breath (litres, BTPS). A t-test for paired data showed that air use increased significantly between visits, from a mean of 1152 l (±135) to 1601 l (±247, n=18, p<0.01), a mean SCBA cost of 39% (range 32 to 51%). These data show that most of the volunteers would have remained within safe working limits (~1850 l), and none would have exhausted the air supply (~2250 l). A Pearson's product moment correlation coefficient indicated a significant negative association between a volunteer's Firescore and the cardiac strain (82% Fcmax) imposed by SCBA (r=-0.59, p<0.025). There was also a significant negative correlation between the Firescore and the mean aerobic strain (54% VO2max) during Visit 3 (r=-0.67, p<0.005).
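The air-use figure is a simple accumulation of expired tidal volumes. The sketch below shows that bookkeeping with invented breath data, using the approximate 1850 l safe working limit and 2250 l cylinder capacity quoted above.

```python
# Hypothetical sketch of the air-use calculation described above: total air
# used is the sum of expired tidal volumes (litres, BTPS). Breath data and
# breath counts are invented for illustration only.
import numpy as np

def total_air_use(expired_tidal_volumes_l):
    """Sum breath-by-breath expired tidal volumes (litres, BTPS)."""
    return float(np.sum(expired_tidal_volumes_l))

rng = np.random.default_rng(0)
vt_with_scba = rng.normal(2.2, 0.2, 730)       # invented breaths, with SCBA
vt_without_scba = rng.normal(1.9, 0.2, 600)    # invented breaths, without SCBA

used_with = total_air_use(vt_with_scba)
used_without = total_air_use(vt_without_scba)
scba_cost_pct = 100 * (used_with - used_without) / used_without

SAFE_WORKING_LIMIT_L = 1850   # approximate figure quoted in the text
CYLINDER_CAPACITY_L = 2250    # approximate figure quoted in the text

print(f"air used with SCBA: {used_with:.0f} l (limit {SAFE_WORKING_LIMIT_L} l)")
print(f"SCBA cost: {scba_cost_pct:.0f}%")
print("within safe working limit:", used_with <= SAFE_WORKING_LIMIT_L)
```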
Three volunteers who successfully completed Visit 1 failed to complete the Firetest during Visit 3. They cited exhaustion, light-headedness and/or upper-body fatigue as reasons for early termination. The mean Borg score at test termination for these three volunteers indicated "severe-to-very severe" breathlessness.
Discussion An early study by Louhevaara et al (1985) showed that wearing SCBA increased fire-fighters' VO2 during progressive treadmill walks by ~0.54 l·min-1. The present study indicates a mean SCBA-induced increase in VO2 of 0.53 l·min-1. These data also suggest that changes in respiratory demand generally exceeded 25% (see Table 3), greater than has been reported elsewhere (Louhevaara et al, 1986). These results support the rationale for the Firetest, namely that whole-body and upper-body work increase the loading on the respiratory system and elicit greater respiratory demands than the mass of the SCBA alone. Although VT increased significantly between tests, the difference was modest compared with the differences seen in the other variables (5% rather than >25%). The small increase in VT during Visit 3 cannot be explained by normal exercise dynamics, as the 'levelling off' point during the maximal test was 3.25 l (58% of mean FVC), whereas the mean VT seen during Visit 3 was only 2.17 l (38% FVC). The majority of the increase in VE can therefore be attributed to decreased TTOT and the concomitant increase in FR (–36% and +29%, respectively). This hypothesis is consistent with the non-significant changes in the ventilatory equivalent for O2 (VE/VO2) between the two conditions. There was a slight but significant increase in duty cycle (TI/TTOT), suggestive of an increased inspiratory demand. These results show that SCBA may limit VT, causing the increased metabolic demands to be met primarily by increases in FR and changes in duty cycle. This breathing pattern is inefficient, decreases effective alveolar ventilation and may result in premature respiratory fatigue. A study which examined the effects of industrial respirators (incl. SCBA) suggested that individuals with VO2max >50 ml·kg-1·min-1 have the greatest chance to "override the effect of respirator work on performance" (Wilson et al, 1989, p. 92). However, the three subjects who failed to complete Visit 3 had a mean VO2max of ~49.6 ml·kg-1·min-1. The Wilson et al (1989) study used standard laboratory protocols (maximal treadmill and cycle ergometer tests), which did not necessarily generate fire-fighter-specific respiratory demand. It follows that VO2max (ml·kg-1·min-1) may not be the most important determinant of SCBA tolerance during standard fire-fighter tasks. Although there were significant negative associations between the Firescore and both aerobic strain (% VO2max) and cardiac strain (% Fcmax), only 44% and 34% respectively of the variation could be accounted for by the variation in Firescore. This may be the result of the relatively well-trained status of the group as a whole. It is expected that a more heterogeneous group would demonstrate wider variations in Firescores and a greater negative association between Firetest score and cardiac and aerobic strain. This was partially demonstrated by the three volunteers who failed to complete Visit 3: their mean cardiac strain and aerobic strain (96% Fcmax and 59% VO2max respectively) were higher at termination than the means for the group during Visit 3. Further studies utilising the Firetest will attempt to test this hypothesis by assessing the responses of less well trained volunteers.
Research into load carriage and walking indicates that the addition of a heavy backpack (or SCBA) forces walkers to lean forwards (Gordon et al, 1983). Such alterations to normal gait are resisted by eccentric and isometric contraction of various muscle groups, including those of the lower back and the abdominals. As the abdominal muscles are active during forced expiration, this may have knock-on effects upon respiration, leading to respiratory fatigue. Likewise, isometric contraction of the shoulders, upper chest and upper limbs may also impact negatively on the respiratory muscles. Furthermore, the mass of the SCBA and the aforementioned inspiratory restriction may exacerbate respiratory strain. The effects that such potential respiratory fatigue has upon respiratory function will be the object of future investigation. The fire-fighters in the present study reported that the Firetest was more representative of SCBA activities than other tests in current use and elicited responses similar to those generated during fire-fighting activities that require SCBA. This protocol will be used to test the respiratory responses of fire-fighters to SCBA wear and to monitor changes in performance following various training interventions.
Summary The results of the present study show that most of the volunteers were able to cope well with the added strain of the SCBA, indicating that the physiological characteristics of this group seem to be reasonable for fire-fighters and supporting the recent findings of Louhevaara et al (1995). We have quantified the effects of wearing SCBA during the Firetest and we are confident that the Firetest is a useful tool for investigating the effects of SCBA on fire-fighter performance, ventilation and breathing pattern in the laboratory environment. This was supported by the fact that three volunteers who successfully completed Visit 1, and thus had met the Home Office minimum requirements for UK fire-fighters, failed to complete the Firetest in Visit 3.
References
Davis, P., Dotson, C., & Santa Maria, D., (1982), "Relationship between simulated fire-fighting tasks and physical performance measures," Medicine and Science in Sports and Exercise, 14, 67–71.
Faff, J., & Tutak, T., (1989), "Physiological responses to working with fire-fighting equipment in the heat in relation to subjective fatigue," Ergonomics, 32, 629–638.
Gordon, M., Goslin, B.R., Graham, T., et al., (1983), "Comparison between load carriage and grade walking on a treadmill," Ergonomics, 26, 289–298.
Louhevaara, V., Ilmarinen, R., Griefahn, B., Kunemund, C., & Makinen, H., (1995), "Maximal physical work performance with European standard based fire-protective clothing system and equipment in relation to individual characteristics," Eur J Appl Physiol, 71, 223–229.
Louhevaara, V., Smolander, J., Tuomi, T., Korhonen, O., & Jaakkola, J., (1985), "Effects of an SCBA on Breathing Pattern, Gas Exchange and Heart Rate during Exercise," J. Occ. Med., 27, 213–216.
Sothmann, M., Saupe, K., Raven, P., et al., (1991), "Oxygen consumption during fire suppression: error of heart rate estimation," Ergonomics, 34, 1469–1474.
Wilson, J., Raven, P., Morgan, W., Zinkgraf, S., Garmon, R., & Jackson, A., (1989), "Effects of pressure-demand respirator wear on physiological and perceptual variables during progressive exercise," Am. Ind. Hyg. Assn. J., 50, 85–94.
THE EFFECT OF EXTERNAL AIR SPEED ON THE CLOTHING VENTILATION INDEX
Lisa Bouskill 1, Ruth Livingston 1, Ken Parsons 1 and W R Withey 2
1 Department of Human Sciences, Loughborough University, LE11 3TU
2 Centre for Human Sciences, Defence Evaluation and Research Agency, Farnborough GU14 0LX
To quantify the importance of external air movement on the heat exchange through clothing, 9 male subjects, wearing a wind-resistant GoreTex™ suit, were exposed twice to an environment of ta = tr = 5.0 (1 SD = 0.3)°C and rh = 62 (1 SD = 1)%. One exposure had still air (0.12 (0.02) ms-1); during the other, air speed was 3.06 (0.04) ms-1. The Ventilation Index (VI) of the suit was measured while subjects performed 3 physical activities: standing stationary, stepping onto and off a raised platform, and rotating limbs. The high air speed increased VI by 37%, 53% and 52% respectively for the 3 activities (P<0.01). These data imply that wind can induce large sensible and evaporative heat losses, even in wind-resistant clothing.
Introduction An increase in the movement of air through a clothing ensemble (often known as the ‘pumping’ or ‘bellows’ effect) reduces both its thermal insulation and its apparent evaporative resistance, generally resulting in a higher heat loss from the skin than expected. This may be advantageous when working in circumstances which require heat loss to maintain a safe deep-body temperature (eg in a hot work-place, or when wearing encapsulating clothing); however, it may be disadvantageous when circumstances require heat conservation. Thus, it has been suggested that: “…the measurement and control of [the pumping effect] seem to be the key to further advances in the field of thermal insulation of clothing ensembles…” (Vogt et al, 1983). One of the factors that, in principle, can affect clothing ventilation is the speed of external air movement (wind). Several mechanisms may operate. Air may enter garments through the fabrics, depending upon their air permeability, on the condition of the seams and on garment age and general condition. Air may also move through openings (sleeves, cuffs, gaps around the collar) and pre-designed vents. The presence of restrictions such as a belt or harness, and a tighter fit of the clothing may reduce the magnitude of these effects.
It is therefore clear that for a complete evaluation of the worker and his thermal environment it is essential to quantify the relationship between wind speed and the consequent changes in heat exchange between the skin and the external environment. The aim of this preliminary study was to examine the effect of wind on the amount of air moving through a clothing ensemble, as measured by the clothing Ventilation Index (VI) (Birnbaum and Crockford, 1978). The null hypothesis (H0) was that, in the particular clothing ensemble chosen for the study, increased air speed would have no effect on the VI. The alternative hypothesis (H1) was that increased air speed would increase VI.
Materials and Methods Subjects Nine healthy, physically-active males, age range 19 to 30 years, volunteered to participate in the study. They were fully informed of the objectives, procedures and possible hazards of the study and completed a Form of Consent before exposure. Left aural temperature and heart rate were recorded as safety measures. Physical characteristics of the subjects were: Height=1.81 (1 SD=0.06) m; Weight=78.46 (9.13) kg; DuBois Surface Area=1.99 (0.13) m2; Age=22.56 (3.17) years.
Test Protocol Each subject was exposed twice (once in each air speed), in a controlled-environment chamber, to the following thermal conditions: ta=tr=5.0 (0.3) °C, and rh= 62 (1) %. In the ‘still air’ exposure, air speed (Va) at chest height was 0.12 (0.02)ms-1; in the second exposure air speed was increased to 3.06 (0.73)ms-1 using a Micromark 16-inch diameter, pedestal fan, positioned 0.75m in front of the subject. The order in which subjects were exposed to the low and high speeds was randomised.
Physical Activities During each exposure subjects performed 3 activities:
• Standing stationary: with feet together, arms by sides;
• Stepping onto and off a platform 150 mm high, each step movement taking 1.2 s, as cued by a metronome;
• Rotating each limb individually in large arcs, each arc taking 4.8 s to complete, as cued by a metronome.
VI was determined 3 times in succession during each activity (the method is described below), from which an average value was calculated. The order of presentation of the 3 activities was balanced between subjects.
Clothing ensemble Subjects wore: underpants (own), short socks, soft ‘trainers’ and a 2-piece suit comprising trousers with an elasticated waist and a zip-fronted, long-sleeved jacket with elasticated bottom hem. The suit was a commercially-available item of leisurewear made from GoreTex™ material, with cotton elasticated wrist cuffs, and popper fastenings at the ankles. The jacket zip was protected with a popper-fastened wind baffle. During the study the windbaffle on the jacket and the ankle poppers were fastened. Each subject wore a size of suit appropriate to his stature and physique.
This type of suit was chosen because GoreTex™ fabric has a very low air permeability. The effect of wind on the clothing ventilation would therefore be caused mainly by pumping, and little influenced by the permeation of air through the fabric. Thus, the combination of relatively air-impermeable garments and closed neck, cuffs and ankles, represents a clothing condition in which low values of VI would be expected and therefore any effect of external air movement on VI should be maximised.
Determination of the Ventilation Index. VI is defined as the product of the rate at which the air within the clothing is exchanged and the volume of the micro-environment available for this exchange, i.e. VI (litres per minute) = Air exchange rate (per minute) × Micro-environment volume (litres). Air exchange rate and micro-environment volume were measured as follows (Bouskill et al, 1997): Air exchange rate: The rate of air exchange between the micro-environment (i.e. the space between the skin and the suit) and the external environment was measured using a tracer gas technique. Nitrogen was flushed through the micro-environment using a system of distribution tubes worn next to the skin. The gas in the micro-environment was sampled using a separate system of tubes, also worn next to the skin, connected to a vacuum pump and an oxygen analyser. The time taken for the oxygen concentration of the micro-environment gas to return to 19% from 10% was used to calculate the air exchange rate. This determination was repeated 3 times for each physical activity, in both air speeds. Micro-environment volume: The volume of air trapped within the ensemble was determined in triplicate in a separate session. For this measurement the subject donned, over the clothing ensemble, a 1-piece, air-impermeable oversuit sealed at the neck, which enclosed the whole body including the hands and feet. Air was evacuated from this oversuit until the micro-environment pressure (measured on a water-filled manometer attached to a perforated tube placed in one trouser leg) began to change. This was taken to be the point at which the air-impermeable oversuit lay just on top of the clothing ensemble. Evacuation continued until no more air could be removed from the clothing. This additional volume of evacuated air was taken to be the micro-environment volume.
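The VI calculation above is the product of the measured exchange rate and the trapped-air volume. The sketch below assumes a simple well-mixed exponential dilution model to convert the 10% to 19% oxygen recovery time into an exchange rate; this model and the illustrative numbers are assumptions, not the exact procedure of Birnbaum and Crockford (1978).

```python
import math

AMBIENT_O2 = 20.9   # % oxygen in the external air

def air_exchange_rate(t_recovery_min, c_start=10.0, c_end=19.0):
    """Air exchanges per minute from the O2 recovery time, assuming a
    well-mixed micro-environment diluted exponentially by outside air
    (an assumed model; the original tracer-gas procedure may differ)."""
    return math.log((AMBIENT_O2 - c_start) / (AMBIENT_O2 - c_end)) / t_recovery_min

def ventilation_index(t_recovery_min, micro_env_volume_l):
    """VI (l/min) = exchange rate (per min) x micro-environment volume (l)."""
    return air_exchange_rate(t_recovery_min) * micro_env_volume_l

# Illustrative values only (not study data): 25 l of trapped air,
# O2 recovering from 10% to 19% in 3 minutes.
print(f"VI = {ventilation_index(3.0, 25.0):.1f} l/min")
```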
Results The mean (and 1 SD) values of VI measured for all 9 subjects in each of the 3 physical activities, and the 2 air-speed conditions, are given in Table 1. Table 1: Mean Ventilation Index in 3 physical activities and 2 air-speed conditions (for details see text)
In both the still air and high air-speed conditions, physical activity increased the VI. In still air the increases were 52% and 79% for the stepping and rotating limbs activities, respectively; in the high air speed the corresponding increases were 70% and 98%. For all activities, the mean VI was greater in the high air speed condition than in the still air condition, the differences being 37%, 53% and 52% for the standing, stepping and rotating limbs activities, respectively. Preliminary statistical analysis using Student's t-tests shows that all these differences were statistically significant (P<0.01). To show the consistency of the effects observed, the values for individual subjects are shown in Figure 1. As expected, each subject had a unique value for VI in each physical activity. Previous experience (Bouskill et al, 1997 and Bouskill et al, 1998) has shown that this value is a function of clothing fit and the type of movement involved. Therefore comparison of the VI values between subjects adds no useful information.
Figure 1: Ventilation Index in the 9 individual subjects Values are given for 3 physical activities and 2 air-speed conditions 1, 3 and 5 are the ‘still’ air condition; 2, 4 and 6 the ‘high’ air speed condition
Discussion This study examined the effect of increasing external air velocity on the clothing ventilation index, and is part of a wider study of factors which affect sensible and evaporative heat loss from the clothed worker. This information can be used for a variety of practical purposes, for example, to set safe exposure times for hot or cold conditions, or to calculate effective work/ rest schedules to maintain worker productivity and efficiency. Some calculations, for example, BS EN ISO 12515, assume that the values of the sensible and evaporative resistances required in the calculations can be the ‘intrinsic’ values measured in laboratory conditions often using thermal manikins. However, several studies have shown that the values obtained in working conditions—the ‘resultant’ values—can be lower (see
Havenith et al, 1990). Part of this difference can be explained by the increased convective heat loss caused by air movement through the clothing as a result of wearer movement and ‘wind’ speed. Although there have been studies to determine an empirical relationship between intrinsic and resultant resistances (Havenith et al, 1990), we know of no studies which have attempted to quantify systematically the factors that comprise the overall effect, eg air flow through the clothing layers. The VI is a measure of air flowing through the microenvironment; it therefore has the potential to quantify the convective heat loss this will induce. The VI values measured in the different physical activities in this study are typical of those obtained in a previous study (Bouskill et al, 1997), and are reproducible—the difference between the triplicate measures in the present study being less than 10%. The consistent effect of the increased air speed shows that the amount of air moving through the microenvironment of this clothing ensemble was sensitive to the air speed. GoreTex™ fabric has a very low air permeability, so most of the wind-induced increase must have taken place through the openings in the suit, even though these were in the ‘closed’ position. The mechanism for this may be direct displacement of the micro-environment air, or convective exchange arising from the wind creating a duct effect at garment openings. From our data it can be concluded that wind has a significant effect on clothing ventilation, and therefore on sensible and evaporative heat transfer between the clothed worker and the external environment, even in wind-resistant clothing. This affects thermal strain, and should be considered in assessments of the worker’s thermal environment.
References
Birnbaum, R.R. and Crockford, G.W. 1978, Measurement of clothing ventilation index. Applied Ergonomics, 9, 194–200.
Bouskill, L.M., Withey, W.R., Watson-Hopkinson, I. and Parsons, K.C. 1997, The relationship between the clothing ventilation index and its physiological effects. In: R. Nielsen and C. Borg (eds) Proceedings of the Fifth Scandinavian Symposium on Protective Clothing, Elsinore, Denmark, 5–8 May 1997, 36–40.
Bouskill, L.M., Sheldon, N., Parsons, K.C. and Withey, W.R. 1998, The effect of clothing fit on the ventilation index. In M.A. Hanson (ed) Contemporary Ergonomics 1998, (Taylor and Francis, London).
British Standards Institution, 1997, Hot environments—Analytical determination and interpretation of thermal stress using calculation of required sweat rate. BS EN ISO 12515:1997.
Havenith, G., Heus, R. and Lotens, W.A. 1990, Resultant clothing insulation: a function of body movement, posture, wind, clothing fit and ensemble thickness. Ergonomics, 33, 67–84.
Vogt, J.J., Meyer, J.P., Candas, V., Libert, J.P. and Sagot, J.C. 1983, Pumping effect on thermal insulation of clothing worn by human subjects. Ergonomics, 26, 963–974.
Acknowledgement Support from the DERA Centre for Human Sciences is acknowledged © British Crown Copyright 1998/DERA. Published with the permission of the Controller of Her Britannic Majesty’s Stationery Office.
COMMUNICATING ERGONOMICS
Commercial Planning and Ergonomics Jane Dillon
RM Consulting Royal Mail Technology Centre Wheatstone Road, Dorcan SWINDON SN3 4RD
The Human & Environmental Consultancy is a newly formed group within RM Consulting. It provides three high-level products: ergonomics consultancy, safety consultancy and environmental consultancy. It was perceived that the commercial success of the new unit would, to a significant extent, depend on a clear marketing strategy. This perception was reinforced by an increasingly complex and competitive marketplace and the speed of technological change. The commercial planning process would enable the creation of marketing objectives and the identification of the resources and financial investment needed to achieve them. This would ultimately lead to greater profitability and growth. The paper describes how the principles of marketing theory were applied to the activities of a specialised consultancy unit and discusses the impact of the commercial planning process in determining the direction and focus of the group's activities.
Introduction The Human & Environmental Consultancy is a separate commercial unit within RM Consulting, one of the main UK consultancies dealing in postal and distribution services. The group was formed in April 1997 following a major reorganisation of RM Consulting and brought together three existing groups working in the fields of Safety, Ergonomics and Environment. The commercial climate in which the group operates is increasingly competitive and constantly changing. It was recognised that a clear marketing strategy could play an important role in shaping future success. There were initially a number of barriers to the marketing process. The main problem was a lack of basic marketing and planning skills within the group. This was coupled with a knowledge that the resources required to produce the plan would interfere with short term work and current financial performance. Nevertheless, the ever increasing commercial pressures meant that the status quo was not an option. RM Consulting itself was committed to strategic market planning and had created a Marketing Consultancy within the organisation. The production of a commercial plan became a required management activity. The process was facilitated by the Marketing Consultancy who provided the model for the plan and important support throughout.
This paper discusses in broad outline the elements that made up the planning process and the implications for a small consultancy group. It does not address the theory of marketing planning or the relative merits of different planning techniques.
Purpose and Benefits The overall purpose of undertaking commercial planning was to identify and create a sustainable competitive advantage for the group in line with RM Consulting's business plan. The process involved a logical sequence of activities leading to the setting of marketing objectives and the formulation of plans for achieving them. The benefits of commercial planning were perceived to be increased profitability and improved productivity due to:
• identification of opportunities and threats
• specification of sustainable competitive advantage
• preparedness to meet change
• improved communication
• better resource allocation
• more market-focused activity
Method The methods used to produce the commercial plan were:
• group discussions
• information gathering and analysis
• brainstorming
• discussions with key customers
Elements of the Commercial Plan The commercial plan consisted of four main elements: Purpose, Situation Analysis, Strategy Formulation, and Resource Allocation and Monitoring. These are described below.
Purpose This requires the production of a purpose statement which clearly sets out the role of the group, its business and its distinctive competence. It is produced mainly through group discussion. The statement is no more than a page long.
Situation analysis This section is concerned with understanding the current position of the group in relation to a number of key areas which are described below.
i) Environmental Review This reviews elements in the business environment (political, economic, social, legislative, technological, etc.) over which the group has no control and helps to identify opportunities and threats.
ii) Market Analysis. The objective is to analyse all aspects of the current and potential customer base. This helps to establish the most important products and customers; whether there is over-dependence on a particular product or customer; which products or projects are coming to the end of their life; and how any lost income might be replaced.
iii) Competitive Analysis. The objective is to identify competitors' main strengths and weaknesses and to assess their current and future impact on the group.
iv) Key Capabilities Analysis The generic skills (e.g. leadership, project management, report writing etc.) and technical skills which are required to deliver products are mapped. The output is an understanding of the capabilities needed within the team and where shortfalls exist. Capabilities which are under-utilised are highlighted.
v) SWOT Summary This acts as a summary of the key Strengths, Weaknesses, Opportunities and Threats (SWOT) which were highlighted during the previous activities. It provides a snapshot view of the group’s current business position on which to build objectives and action plans.
vi) Creation of Assumptions The final activity in the Situation analysis is to make assumptions about the factors over which the group has no control. The main inputs are the results of the environmental review and the SWOT summary.
Strategy Formulation This is a key part of the planning process which is designed to examine the group's relative competitive position and the attractiveness of different product areas. Products are compared with the objective of choosing those which have the greatest potential for future development. A number of techniques are available to assist this analysis; the Directional Policy Matrix (DPM) was selected as the most appropriate. The position of a product on the matrix is determined by scoring and weighting each product against Critical Success Factors (CSFs) and Market Attractiveness Factors (MAFs). CSFs are the key things which need to be right in order to succeed in a specific market, e.g. value for money or knowledge of the organisation. CSFs are generated by brainstorming all the success factors that the group feel are important to customers. The five most important are then identified and given a weighting factor. Each product is then scored for the group and contrasted with the group's main competitor for that product. A MAF is an inherent characteristic of a market. It should not reflect the position of a product in a market but should represent the market forces, e.g. competitiveness, market growth or the threat of new entrants. Each MAF is given a weighting and each product is scored against each MAF.
The results are used to produce a matrix with a bubble representing each product (see Figure 1). The diameter of each bubble represents the amount of net income and the position of the bubble shows the relative competitive position and the attractiveness of different product areas.
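The weighting-and-scoring arithmetic behind the matrix can be sketched in a few lines. The example below is generic and entirely invented (the factors, weights, scores and income figures are not the group's actual plan); it simply illustrates how weighted CSF scores relative to a competitor and weighted MAF scores could position each bubble.

```python
# Hypothetical sketch of the Directional Policy Matrix scoring described above.
# All factors, weights, scores and income figures are invented for illustration.

def weighted_score(scores, weights):
    """Weighted sum of factor scores; weights are assumed to sum to 1."""
    return sum(s * w for s, w in zip(scores, weights))

csf_weights = [0.3, 0.25, 0.2, 0.15, 0.1]   # five Critical Success Factors
maf_weights = [0.4, 0.35, 0.25]             # three Market Attractiveness Factors

products = {
    # name: (group CSF scores, competitor CSF scores, MAF scores, net income £k)
    "P1": ([8, 7, 9, 6, 7], [6, 6, 7, 5, 6], [8, 7, 6], 120),
    "P2": ([7, 8, 6, 7, 8], [7, 7, 6, 6, 7], [7, 8, 7], 80),
    "P3": ([5, 6, 5, 6, 5], [7, 8, 7, 7, 6], [4, 5, 5], 60),
}

for name, (own_csf, rival_csf, maf, income) in products.items():
    # Relative competitive position: group's weighted CSF score vs competitor's
    competitive_position = (weighted_score(own_csf, csf_weights)
                            / weighted_score(rival_csf, csf_weights))
    market_attractiveness = weighted_score(maf, maf_weights)
    print(f"{name}: position {competitive_position:.2f}, "
          f"attractiveness {market_attractiveness:.1f}, bubble ~{income}k")
```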
Figure 1. Directional Policy Matrix for Group’s Products (P1–P3)
The DPM analysis provides the main basis for creating marketing objectives for the group. Figure 2 shows suggested market strategies for different positions on the matrix.
Figure 2. Strategies for different positioning on the Directional Policy Matrix
Appropriate marketing objectives for the data in Figure 1 would be to invest and grow market share for products 1 and 2 whilst maintaining current market share for product 3.
Resource allocation and monitoring The final stage of the plan is to:
• plan the effective deployment of the group's capabilities
• identify the people required to achieve the objectives formulated above
• assess the physical resources needed
• create a detailed finance plan
• prepare an action plan which details how the previously identified objectives can be met.
Discussion In evaluating the success of the commercial planning process, it is necessary to weigh the costs involved in producing the plan against the benefits which have been achieved as a result. The creation of the plan involved a considerable amount of work, probably the equivalent of at least 4 man-weeks in staff costs. Additional support from the marketing team was required, particularly for the DPM analysis. It is unlikely that an effective plan could have been produced without such professional support. However, about 50% of the work involved in producing the plan would have been done in some form as part of normal management activity, e.g. finance planning, resource planning and capability analysis. It is difficult, and probably too soon, to quantify the commercial benefits which have resulted directly from the commercial planning process. Many of the objectives in the plan have certainly been achieved, e.g. broadening the customer base and increasing market share for specific products. However, it could be argued that these would have been achieved in any event. The main benefit to the group has been to increase the level of commercial awareness in the group; everyone in the team was involved in some part of the planning process. In addition:
• Clear marketing actions were identified
• Support for internal investment was secured
• Training needs to support the plan were identified
• Some activities were dropped
• Communication with top management was facilitated.
Conclusion On balance, the commercial planning process was successful, although the full benefits are difficult to quantify at this stage. It forced a rigorous self-examination and yielded some useful and surprising results. It demonstrated that a specialist consultancy like any commercial enterprise, must understand its markets and invest to succeed.
HUMAN FACTORS AND DESIGN: BRIDGING THE COMMUNICATION GAP Alastair S.Macdonald Course Leader, Product Design Engineering, Glasgow School of Art, 167 Renfrew Street, Glasgow G3 6RQ, Scotland
Patrick W.Jordan Senior Human Factors Specialist, Philips Design, Building W, Damsterdiep 267, PO Box 225, 9700 AE Groningen, The Netherlands
Poor communication can arise between different professions due to the limitations of the language each specialism employs. Despite the growth of expertise in the fields of product design and human factors, the communication gap between them can still result in under-optimised products which fail to deliver usability, quality and enjoyment in use. The ambition of this paper is to examine the emergence of new tools which facilitate the development of a common language between the professions. In particular, the paper highlights communication techniques based on the use of images drawn from popular culture as a means of expressing design issues and people's rational, emotional and hedonic responses to design.
Limitations of Language As Wittgenstein wrote in his Tractatus in 1922, 'Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt.' Translated into English, the sentence reads: 'The limits of my language indicate the limits of my world', and it summarises well the difficulty of communication between different professions. Despite the growth of expertise in the fields of product design and human factors, many products still betray a poor understanding and consideration of the end-user through their lack of usability, quality and enjoyment in use. Consultation with industry-based human factors specialists and designers suggests that a major reason for this is poor communication between human factors and design professionals. The two fields of product design and human factors share areas of concern. Both are user-centred. Both are concerned with how products, tasks, and environments 'fit' people. Design is a visually-orientated, artistic profession. The designer tends to be a generalist, combining aesthetic, ergonomic and technological elements to produce an improved or innovative product. The designer may utilise many forms of specialist knowledge during the design process to help bring a product 'into being'. By comparison, many human factors specialists see their own discipline as a science. The ergonomist tends to be more of a specialist in one area of his/her field, assisting the designer with particular expertise or data during the design process, e.g. at the outset with standards and anthropometric data, and later with user trials.
Whilst designers may prefer visual communication aids, many human factors specialists may favour communicating via technical reports. This sort of reporting can be seen by designers as dull, produced in a difficult-to-use format, and more suited to a university laboratory than a commercial design studio. The language which ergonomics uses has limited the extent to which it can communicate its own field effectively to others. On the other hand, visual tools can help orient human factors material to suit designers’ strengths as well as assisting all those in product development teams facing difficult qualitative value judgments.
Improving communication: understanding designers How could ergonomists and designers communicate better with one another? Firstly, it would be useful to understand the nature of design and how designers operate: 1) Designers are practitioners. Relevant knowledge and expertise is applied as required during the design process. Good designers usually assemble teams of specialist expertise. A good product design education provides students with a project-led studio ethos, mirroring the way that designers work in the ‘real world’ of practice. 2) Design practice is increasingly focused on the consumer. A designer’s remit is usually to come up with a usable, attractive, safe and commercially successful design within a given time scale and budget. This is a pressured activity where all knowledge is ultimately embodied in designs—in drawings, models, and prototypes. 3) In a world dominated by legislation and documents, ‘linguistic’ skills are commonly regarded as the most important, but in the world of practice, spatial and visual skills are highly developed in eg designers, surgeons and engineers. Designers also tend to be visually literate (visualate), and respond well to visually-orientated learning and reference material. 4) Design is seldom a precise science. Design processes often contain so-called ‘fuzzy’ problems with no readily identifiable single correct solution. As a result, designers tend to be speculative, to have the ability to progress an idea without knowing all the facts—using a type of ‘fuzzy-logic’, to develop and preserve the ‘sense’ of a product, and they become adept at making value judgments.
Improving communication: developing new tools This view of designers suggests the need for ‘bridging tools’ to facilitate a common language between the professions. Because of the timescales within which designers have to work, they have developed a number of quick, visual, ‘desk-top’ methods for sharing their ideas. Examples of two such tools are discussed here, one used in the educational and the other in the professional context.
Focus Boards At Glasgow School of Art, the Product Design Engineering course is multi-disciplinary in nature and involves engineering, product design, and human factors specialists teaching together in the design studio with students who will emerge as fully qualified engineers. As a profession, engineers have traditionally had a poor reputation for visual presentation and communication skills, but the Glasgow model of educating them has demonstrated that engineering students can easily acquire and develop an attractive ‘skills set’ including visual
communication skills—normally the domain of the product designer. This set of skills is regarded as a valuable asset by industry. Focus Boards (FBs) comprise one of a number of visual methods used in the studio to develop product concepts and details. Using images generated from a number of sources—magazines, catalogues, hand-drawn sketches, diagrams, and photographs together with key words, the FBs help develop a sense of the desired qualities in the end product, to locate and focus on the market in terms of customer profile and lifestyle, and to develop a visual reminder and understanding of the environment and context of use of the product (Table 1). One area of the FBs introduces the idea of 'parallel products', which discusses qualities, features and details in terms of existing and familiar products already displaying some of the desired qualities. Through careful editing during their genesis—rather like working on a number of drafts of a script—students learn to give order and coherence to the boards and to discriminate quite clearly the particular qualities for carefully targeted end-users. Unlike a typescript, concept qualities are illustrated by tangible and accessible-to-all examples. These FBs ensure that all members of the teaching/learning team—designer, ergonomist, engineer and student alike, are drawing from clear, focused examples. This ensures that the engineering is human-centred and that a shared vision of the product is developed. Table 1. A typical Focus Board schematic
Figure 1. An example of a student Focus Board in use
Figure 1 shows a student visualising the requirements for an on-board heater for an emergency life raft through a Focus Board.
Visual Communication Boards The 'Humanware' function at the Groningen studio of Philips Design is responsible for the input of a number of disciplines into the creation process for a huge variety of household products, from kettles to sunbeds. These disciplines include social science disciplines, such as sociology, anthropology and psychology; specialist design disciplines, such as interaction design, trend analysis and multimedia design; and human factors. As part of a design department, Humanware works hand in hand with product designers. However, it is also a bridging competence between design and the other disciplines involved in the product creation process, such as market research, marketing, engineering and product management. "…it's a bit like Shakespeare really: it all sounds very good, but it doesn't actually mean anything." This quote from the P.G. Wodehouse character Bertie Wooster might be seen as a reflection on the way that different disciplines view verbal and written communications from others. The multidisciplinary nature of Humanware itself, and the involvement of the function with other disciplines, presents a challenge in terms of communication. Although written reports have a place in this communication, there are disadvantages associated with them in this context of use. Different disciplines use different jargon, and concepts that are understood by some disciplines are not meaningful to others. For example, talk of concepts such as 'consistency', 'compatibility' and 'visual clarity', whilst meaningful to human factors specialists, may mean very little to the other disciplines in the product creation process. It is equally unlikely that practitioners of these disciplines will want to slog through pages of text in order to be educated as to the meaning of such concepts! Because of these potential communication difficulties, and because design is a visually oriented profession, the majority of Humanware's communication is conducted visually. The main tool used for this is the Visual Communication Board (VCB). VCBs employ images to communicate concepts and the relationship between those concepts and the symbols, product semantics and form language by which they are embodied. Because the images are drawn from popular culture and are readily recognised by all, they facilitate an effective discussion by all involved in the product creation process.
Figure 2. Exploring a ‘masculine’ image
Figure 3. A typical ‘Four Pleasure’ board
A product briefing required a product to project a ‘masculine’ image. But what does this really mean? The VCB (Figure 2) portrays a number of images of masculinity, from macho, through traditional, to SNAG (Sensitive New Age Guy). Figure 3 shows images associated with each of the four pleasures: physio, socio, psycho and ideo (see Jordan and Macdonald 1998).
Figure 4. Human Diversity
Figure 4 shows images of human diversity. Who are our target group?
Conclusion As increasing numbers of different specialists join product development teams e.g. anthropologists, linguists, sociologists, designers and human factors specialists, it will be crucial for new ‘bridging tools’ between these disciplines to be developed to allow a shared vision. Pirkl and Babic (1988) have already produced visual demographic charts which translate physiological data into product design specifications, a great facility for bridging two disciplines. If human factors is to get its message across to the design community, it is essential that it embraces a more visually-oriented approach to communication. This is particularly relevant in the light of recent developments in human factors, for instance in the emerging desire to articulate issues such as pleasure in products which lie outside the human factors specialists’ traditional sphere of interest. One way to achieve this would be to ensure ergonomists had more training in envisioning information and qualitative value judgement during their education.
References
Jordan, P.W. and Macdonald, A.S. 1998, Pleasure and product semantics. In M.A. Hanson (ed.) Contemporary Ergonomics 1998, (Taylor and Francis, London)
Macdonald, A.S. 1997, Developing a qualitative sense. In N. Stanton (ed.) Human factors in consumer products, (Taylor and Francis, London), 175–191
Pirkl, J.J. and Babic, A.L. 1988, Guidelines and strategies for designing transgenerational products, (Copley Publishing Group, Acton, Massachusetts)
Wittgenstein, L. 1922, Tractatus logico-philosophicus, (Kegan Paul, London), Proposition 5.6
GUIDELINES FOR ADDRESSING ERGONOMICS IN DEVELOPMENT AID T Jafry and D H O’Neill
International Development Group, Silsoe Research Institute, Wrest Park, Silsoe BEDFORD MK45 4HS
Subsistence agriculture in developing countries involves work which is physically demanding and time consuming. Much of the work is done manually because there are no suitable tools and implements available, because people cannot afford to purchase tools, or because they have no access to tools and implements. Generally, women have longer working days than men because, in addition to agricultural work, they are responsible for domestic work and looking after children. The British Department for International Development (DFID) has recognised that the application of ergonomics can help to reduce the drudgery and fatigue of the agricultural and domestic work of poor people, particularly women. An ergonomics guide has been produced and is to be implemented as a matter of policy within DFID by the end of 1998. The ergonomics guide is described in this paper.
Introduction The most valuable resource of any country is its people, particularly so in developing countries where other resources may be more limited. Furthermore, very poor people have little more than their own resourcefulness (physiological and intellectual) on which to depend for gaining their livelihoods. The application of ergonomics will enable poor people to make effective use of their abilities and optimise their performance. In terms of development, the benefits of a system which takes into account people and the way they work include:
• food security and enhanced economic status of households through increased productive capacity
• reduced drudgery and fatigue from work
• more time available, especially for women
• improved health and quality of life
• fewer accidents and injuries.
The White Paper on International Development Through sustainable development, DFID's (Department for International Development) aim is to achieve a reduction by one half in the proportion of people living in extreme poverty by 2015. The White Paper on International Development (DFID 1997) states that sustainable development requires the management and maintenance of four sorts of "capital" which support human well-being:
• created capital: including machinery and equipment
• natural capital: the environment and natural resources
• human capital: human skills and capacity
• social capital: social relations and institutions
A policy which ensures that the human resources in a country are properly utilised, through bridging the relationship between these "capitals" (especially human and created capital), can contribute directly to DFID's aim. Introducing ergonomics (people-technology interaction) as a matter of policy within DFID will help to achieve this and will also complement other disciplines such as economics, environmental and social development issues. In order to accommodate the human factors, guidelines were developed as a practical tool for aid professionals to ensure that ergonomics issues are adequately addressed in future aid programmes and projects.
Methods Development of the guidelines The guidelines were developed from three sources of information. Firstly, listening to the needs of the customer, in this case DFID. Their requirements were to make the document short, user-friendly and written in a positive tone. Secondly, the authors’ own knowledge and experience of identifying and solving ergonomics problems in developing countries made a major contribution. Thirdly, information was taken from reviewing the literature generated by other authors covering ergonomics concerns in developing countries.
Testing the guidelines A draft document was produced in January 1997 and circulated to DFID aid professionals for their comments. Their views on how to improve the draft document were received and incorporated into the guidelines, which have now been finalised.
Results The guidelines document is split into 10 sections. The whole document cannot be described in this short paper, but the most important section, which contains the ergonomics checklist for project screening, is given in Table 1.
Ergonomics Checklist The questions in the checklist are underpinned by a single more fundamental question: how will the project affect people? The checklist is split into five ergonomics concerns: the individual; tools and equipment; working conditions; technology transfer; and accidents and injuries. The questions require either a YES or NO answer.
Table 1. Ergonomics Checklist
The remaining sections of the document contain appropriate information for aid professionals to:
• learn what ergonomics is and why ergonomics inputs are beneficial
• be shown how to use a checklist to screen projects for potential ergonomics problems
• be informed of the key ergonomics concerns in developing countries
• be provided with case studies which explain how the checklist is used
• be guided on where to get further advice and information.
Discussion Guidelines have been developed as a practical tool to help aid professionals identify ergonomics problems or potential problems in their programmes and projects. Use of the guidelines has a number of benefits. In summary, these will: • help DFID in their quest to eliminate poverty, through bridging the gaps between people and technology; • contribute to improving the design and implementation of DFID’s projects, particularly regarding people-technology interaction, and thereby increasing their effectiveness;
• facilitate the development, adoption or adaptation of tools and equipment to be better matched to the physical and mental capabilities of local (indigenous) people; • provide benefits to the millions of people who work in the cottage industries, through simple low-cost improvements to the organisation of work, workplace layouts and work schedules. A programme of follow-up work is currently being conducted to integrate ergonomics into DFID policy, research and bi-lateral programmes. The guidelines document will play a key role in achieving this. The work will also continue to raise the profile of ergonomics in development aid.
Acknowledgements The authors would like to acknowledge the British Department for International Development for funding this work.
References DFID. 1997, Eliminating world poverty—a challenge for the 21st century, White Paper on International Development, (The Stationery Office Limited, London)
DETERMINING AND EVALUATING ERGONOMIC TRAINING NEEDS FOR DESIGN ENGINEERS J Ponsonby, RJ Graves
Department of Environmental & Occupational Medicine University Medical School, University of Aberdeen Foresterhill, Aberdeen, AB25 2ZD
There have been many attempts to design and run different levels of ergonomics training for engineers and other professionals. The current study aimed to identify engineering ergonomic training needs in order to develop and evaluate a support aid. The first stage involved surveying a sample of design engineers to identify apparent gaps in their knowledge. The information was used to design an ergonomics support aid for engineers and a training package. The support aid was structured to help the designer to think about the whole process, including analysis of the tasks, sources of information for the worker, and the implications for physical aspects of the task. A pilot study with physiotherapists showed that the aid improved their performance. The second stage involves an experimental study with two samples of engineers and is ongoing.
Introduction Engineers are fundamental to the design process and can be highly influential in relation to ergonomics issues at various levels of system design. It has been recognised for some time that improving engineers’ knowledge of ergonomics can improve human factors aspects of work design (Graves, 1992). There have been many attempts to design and run different levels of ergonomics training for engineers and other professionals (see for example, Graves et al, 1996). Little experimental evaluation of the effectiveness of different types and levels of training, however, appears to have been carried out. Classic approaches to ergonomics training seem to emulate the skills of the ergonomist in designing the training material. It is not clear whether engineers and/or designers have the same ergonomic training needs or whether different types of engineer require the same type or degree of knowledge. Woodcock and Galer Flyte (1997) surveyed automotive designers in relation to product design and found that although ergonomics was considered throughout design, it was not considered rigorously enough. In addition, there are questions about the levels of existing knowledge and training of designers involved in manufacturing processes.
Graves et al (op cit.), in training around 160 manufacturing “design” engineers, were faced with varying levels of engineering knowledge and capability. At the end of the two day courses, small teams of participants undertook design exercises based on their work environments to confirm their ability to take account of basic ergonomics principles in their design solutions. Although the majority appeared to be able to use the task and risk analysis approach and the workspace tools, there was a surprising variation in basic engineering knowledge. The current study aimed to identify engineering ergonomic training needs in order to develop and evaluate a training package designed to satisfy these needs (Ponsonby, 1998). This paper describes the development of the support aid and the preliminary results from a pilot study with physiotherapists to test components of the aid.
Approach Overview The first stage involved surveying a sample of design engineers from manufacturing and process companies and identifying apparent gaps in their knowledge. In the first part of the second stage, the information was used to help design a prototype ergonomics support aid for engineers and an associated training package. A pilot study of the prototype support aid was then undertaken to assess the effectiveness of the material. The second stage involved designing an experimental study with two samples of engineers. This is ongoing and is designed to determine the levels of ergonomics knowledge used by both samples of engineers in an applied synthetic task. One sample will be trained using the package and then both retested using a synthetic task to measure differences in performance.
Stage 1 The two companies involved were a cardboard manufacturer (Company A) and a biscuit manufacturer (Company B). Informal interviews were undertaken with a sample of engineers to determine their level of knowledge of ergonomics, where they gained that knowledge and how they used it. A questionnaire was developed from the results of these discussions. A pilot study of the questionnaire was undertaken on each site. Following this, several of the questions were altered, and the final questionnaire was then sent out to a wider audience and other sites. The questionnaire was used to identify gaps in the knowledge and thinking of the design engineers. This information was used in the development of the support aid. The main survey was undertaken by post to engineers in both companies throughout the United Kingdom and abroad (fourteen design engineers in Company A and twenty in Company B). Several informal on-site visits were used to help determine the types of ergonomic risk to which operators were being exposed. A specific area of production was chosen to be investigated in more detail, as an example of where gaps in knowledge had been identified, to help in the development of the support aid. Figure 1 illustrates an overview of the process, with the numbers referring to specific sections. Each section contained worksheets designed to support the user’s decision making. The support aid was structured to help the designer think about the worker in the whole process and the role of the worker. The next steps covered
analysis of the tasks, sources of information for the worker, and the implications for physical aspects of the task.
Figure 1 Overview of ergonomics support aid sections
In addition, the extent and costs of musculo-skeletal problems were assessed by examining sickness absence data, insurance claims and turnover. Information was collected on the different tools and check lists available at the moment to assist both design engineers and anyone else involved in looking at ergonomic problems, both existing and new. The
questionnaires were analysed to identify ergonomic gaps in knowledge, and these results, in conjunction with the task analysis and current support tools, were drawn together to create a support aid to be used by design engineers to improve their application of ergonomics to both new and existing projects. A pilot study of the prototype support aid was undertaken by four volunteer physiotherapy staff prior to the main exercise to identify any problems. They had the same pre-assessment exercise intended for the engineers. The four volunteers were given an instruction sheet, asked to watch a video recording of a sample of work practices, and asked to identify any ergonomic issues, concentrating on possible musculoskeletal problems. A separate information sheet provided additional information on the work area. The results from this were scored against pre-set criteria. This was followed by a three hour training session which involved instruction and practice in using the prototype support aid. The training was a condensed version of that planned for the engineers because of time constraints on these volunteers. A post-assessment similar to the pre-assessment was undertaken. This involved using the support aid. This was scored using the pre-set criteria and a questionnaire was completed by all the participants at the end of the exercise. Modifications were made to the support aid following the post-assessment exercise.
Stage 2 The project plan was to carry out a study of the support aid in both industrial environments, but Company B had difficulty in committing time to the project. The revised plan involved having two groups of design engineers from Company A, with both groups having the same pre-course assessment to identify two equally matched groups for the experimental trials. One would use the new support aid while completing the same video based exercise. The two groups would be compared to see if using the support aid improved the ergonomic performance of the engineers. The pre-training assessment is ongoing due to delays at the company.
Preliminary Results and Discussion Stage 1 Eighteen out of 34 questionnaires (53%) were returned from the design engineers (12 from Company A and 6 from B). Four respondents had a degree, two a Diploma, five an HNC in engineering and seven “other”. Ten said they had limited ergonomics knowledge, seven moderate and one extensive knowledge. The questionnaire responses of the respondents from both companies to technical ergonomic questions revealed that the overall number of correct answers tended to be low. For example, for all but one question on manual handling and one on the upper limb, half or fewer of the respondents knew the correct answers. This showed that the designers did not appear to have adequate ergonomics knowledge. Table 1 shows the results from the pilot trials of the support aid with the physiotherapists. All the group improved upon their initial score after training, indicating that they considered more of the pre-set ergonomic criteria using the support aid. But the maximum score overall even after using the support aid was just above half the total attainable (25 out of 42, see Table 1), indicating they were still missing over half of the relevant points or were not recording them.
Table 1 Percentage improvement between pre and post test scores from pilot trials of the support aid with the physiotherapists
Table 2 shows the results from the pilot trials of the support aid with the physiotherapists in relation to the detailed pre-set criteria categories. Clearly there were a number of areas where further support could be required, e.g. task analysis, sources of information and of physical demand, and additional information.
Table 2 Percentage improvement between pre and post test scores in relation to pre-set criteria
From the pilot study, however, it can be concluded that the performance of the physiotherapists improved by using the ergonomic support aid. In addition, the video exercise seemed to be suitable for testing the criteria both pre and post training. A number of sections within the tool were improved to take account of post pilot study feedback. It now remains to be seen whether a similar improvement can be obtained from the engineers in the on-going next stage of the study.
References Graves, R.J. 1992, Using ergonomics in engineering design to improve health and safety. In J.D.G.Hammer (ed.) European Year of Safety and Hygiene at Work Special Edition Safety Science, 15, (Elsevier, Amsterdam) 327–349 Graves, R.J. Sinclair, D. Innes, I. Davies, G. Bull, G. Burnand M. 1996, Applying ergonomics research and consultancy in a manufacturing design process to reduce musculoskeletal risk. Annual Scientific Meeting of the Society of Occupational Medicine, Society of Occupational Medicine: Birmingham Ponsonby, J. 1998, An evaluation of the ergonomic training needs for design engineers, MSc Ergonomics Project Thesis, Department of Environmental and Occupational Medicine, University of Aberdeen: Aberdeen Woodcock, A. Galer Flyte, M.D. (1997) ADECT—Automotive Designers Ergonomics Clarification Toolset. In S.A.Robertson (ed.) Contemporary Ergonomics 1997, (Taylor and Francis, London), 123–128
ERGONOMIC IDEALS vs. GENUINE CONSTRAINTS Duncan Robertson, Simon Layton & Jayne Elder
Human Engineering Limited Shore House, 68 Westbury Hill, Westbury-on-Trym, Bristol, BS9 3AA.
Ergonomics is by its very nature an applied discipline. University courses prepare students well in terms of technical knowledge, but often give less guidance on the constraints imposed in a commercial environment. This paper presents case examples illustrating the experience of incorporating ergonomics into the design of two train cabs.
Background This paper describes the experience of incorporating ergonomics into the design of two train cabs, whilst taking into account the genuine constraints associated with such a project. The cabs were developed by Human Engineering Limited for a UK train manufacturer. Cab ‘A’ was for use in driver/guard operation, and cab ‘B’ was for use in driver only operation (DOO). Each type of operation has a number of unique requirements with respect to the cab design. The two cabs also differed in their layout. Cab ‘A’ was a ‘full-width’ cab, with the driving position on the left, and the non-driver on the right. Cab ‘B’ however, had a central gangway to allow two units to be coupled together. Using these gangways passengers can walk the length of the train. This design resulted in the driver’s console being approximately one third smaller than in the full-width cab—with the same amount of equipment to be located in both cabs.
Approach The approach that was taken can be broken down into three stages. Initially (during Stage 1) a task analysis of the driver’s role was performed for each of the existing cabs. The second stage was to model the cab environment, using the human 3D modelling package MQPro. The aim of this stage was to assess elements of the design, such as access/ egress, clearance, driver posture, reach envelopes and sightlines. For example, Figure 1 shows the reach envelope for a 5th percentile (%ile) UK male, within cab ‘A’.
Figure 1–5th %ile UK male manikin within cab ‘A’
During Stage 3 a number of elements of the design work were carried out in parallel. This included the specification and design of controls, audible warnings, displays, and the panel layouts. The proposed traction-brake controller was also assessed.
Constraints Due to the nature of the project there were a number of constraints imposed upon the cab designs. These constraints can be divided into four categories: cost, engineering, panel/ control size and time constraints. Cost constraints included, for example, limitations in the new technology incorporated into the cabs; a number of the controls and displays, such as the speedometer, brake pressure gauges, etc., were required by the manufacturer to be tried and tested, off-the-shelf items. The structural design of the front end of the train and cab also imposed engineering constraints on the location of controls. For example, the position of the connecting corridor in cab ‘B’ limited the positioning of controls on the driver’s right side wall. In addition, the size of certain controls and console panels restricted the control location and orientation. Consequently this had a knock-on effect on the grouping of controls in that there was sometimes insufficient space available to achieve the ideal functional grouping. Finally, the entire modular train development, from concept to production, was taking place over a period that was half the industry standard at that time.
Design Process Figure 2 shows the design process which was followed throughout the project for items within the cabs with examples for each stage. Initially design recommendations were made by the ergonomists taking into account a set of constraints (for example, the switches and buttons must be from one of two suppliers). The preferred solutions were sometimes rejected by the client (or a supplier) when additional constraints restricted the recommendation’s viability.
Figure 2—Design process (on left), and a worked example
When recommendations were rejected, a number of alternative solutions would then be put forward. These not only conformed to the additional constraints, but also applied as many ergonomic principles as possible. For example, if it was not possible to locate a control in the ergonomically ideal position due to insurmountable structural constraints, the most suitable alternative would be identified and principles of coding, size, etc., would be adhered to. In the instances when an item of equipment failed to meet several ergonomic principles, a formal statement would be submitted to the client outlining the potential problems for the driver. The client would, on the basis of that evidence, then decide the course of action to take to enable as many ergonomic principles to be applied as possible. The following paragraphs outline two case examples.
Case Examples Communications Unit The communications unit initially proposed for the cab (which was to be supplied by a company within the manufacturer’s group) did not comply with a number of ergonomic principles (see Figure 3A). These included the grouping, labelling, spacing and colour coding of controls. Therefore, whilst acknowledging constraints such as overall size and general functionality of the unit and minimising the number of major changes, the ergonomists then proposed their preferred solution (see Figure 3B). The controls on the panel were spaced and labelled appropriately, lines were used to group related controls, and colour was used to identify important controls (e.g. ‘acknowledge alarm’ and ‘emergency egress’).
Figure 3—Communications Unit: A) supplier’s original design, B) preferred solution, C) compromise design
Due to space restrictions, and constraints in the manufacturing process used, the space between the controls could not be increased. Therefore, a compromise solution was put forward to the client which took into consideration this constraint (see Figure 3C). This solution still incorporated the grouping and labelling principles used in the preferred solution.
Traction-brake Controller (TBC) Track Layout The traction-brake controller is a hand control used by the driver to initiate either traction and power or braking. The driver either pushes the controller to start braking or pulls it to engage power. The track is divided into a number of steps. The customer for Cab ‘B’ requested a continuous movement from ‘low brake’ to ‘full brake’, four defined steps for power settings and a ‘Reserve Power’ step which would only be used in emergencies. Figure 4A shows the preferred solution which was recommended to the client. The triangle between the ‘low brake’ and the ‘full brake’ gives the impression of a continuous movement. To enter the ‘Reserve Power’ notch the driver must move the TBC to the left and break a witness seal. The seal would deter the driver from entering the notch in a non-emergency situation. The braking labelling was coloured red, and the power labelling green. Figure 4B shows the final TBC track layout. The extra notch to enter ‘Reserve Power’ could not be achieved due to a combination of space, engineering and cost restrictions. Complete words could not be used in the labelling because of space constraints. Therefore
alternative labelling had to be used e.g. ‘H’ for ‘full brake’, ‘L’ for ‘low brake’ and ‘E’ for ‘emergency brake’. However, red was still used to differentiate between power and braking, and the triangle (also coloured red) was still used to identify the continuous braking. Green could not be used to colour the power steps because of cost constraints.
Figure 4—TBC track layout: A) the ergonomic ‘ideal’, B) the final track layout.
Conclusion Ergonomists must deal with design constraints, be they cost, engineering, political etc., regularly during their working lives. Unless the constraint ‘goal posts’ can be moved through dialogue with the client, compromises must be made. A human factors/ergonomics degree provides a student with the technical knowledge needed to produce ergonomically ideal solutions to problems. However, in most cases students are not given guidance on how to make compromises to their designs. Such guidance would make for a smoother transition between the academic and commercial sectors for newly graduated students whilst maintaining ergonomic integrity in design and a harmonious relationship with the client.
GENERAL ERGONOMICS
ANOTHER LOOK AT HICK-HYMAN’S REACTION TIME LAW Tarald O.Kvålseth
Department of Mechanical Engineering University of Minnesota Minneapolis, MN 55455, USA
The classic Hick-Hyman’s law of choice reaction time is re-examined in two respects. First, it is re-emphasized that this lawlike relationship holds for the total set of stimulus-response pairs, but is not capable of accounting for the reaction times of individual stimuli. Second, while Hick-Hyman’s law is based on an abstract information-theory measure that lacks any meaningful interpretations, an alternative model is considered based on a predictor variable that has intuitively appealing interpretations in probabilistic terms. Some numerical data are used to compare the two models.
Introduction Consider the general reaction-time paradigm involving n potential stimuli with a probability distribution P=(p1,…, pn) and with each stimulus requiring a specific response. That is, Stimulus i occurs with probability pi on any given trial, with i=1,…,n and p1+…+pn=1, and the subject responds as quickly as possible. In the case of error-free performance, or nearly so, the classic relationship known as Hick-Hyman’s law, after Hick (1952) and Hyman (1953), is given by
RT = a + b I(P)    (1)
where RT is the overall mean reaction time for all stimuli, a and b are parameters to be empirically determined, and I(P) is the mean information content, or uncertainty, of one stimulus event as measured by the information measure due to Shannon (1948), i.e. I(P) = -(p1 log2 p1+…+pn log2 pn). When the stimuli are all equally likely (i.e., p1=…=pn=1/n), Equation (1) reduces to
RT = a + b log2 n    (2)
In the case of a single stimulus (n=1), it follows from both equations that RT=a so that a is the so-called simple reaction time. While the general validity of this law as expressed by Equations (1)–(2) has been established on the basis of numerous empirical studies, it does not hold for the reaction time to individual stimuli. In the present paper, we shall briefly re-examine this limitation of Hick-Hyman’s law, which often appears to be either neglected or misunderstood in the published literature. Also, we shall consider an alternative formulation for the overall mean reaction time that is based on the expected probability or repetition probability as a meaningful independent variable. Some empirical results will also be presented.
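As a minimal illustration of Equations (1)–(2) (not part of the original paper), the following Python sketch computes I(P) for a hypothetical stimulus distribution and the reaction times the law would predict; the probabilities and the parameter values a and b are assumptions chosen only for illustration.

import numpy as np

def shannon_information(p):
    # Mean information content I(P) in bits (Shannon measure)
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log2(p)))

P = [0.5, 0.25, 0.125, 0.125]   # hypothetical stimulus probabilities
a, b = 200.0, 150.0             # assumed simple RT (msec) and slope (msec/bit)

rt_overall = a + b * shannon_information(P)   # Equation (1)
rt_equal = a + b * np.log2(len(P))            # Equation (2), equally likely stimuli

print(rt_overall, rt_equal)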
RT for Individual Stimuli Even though the formulation in Equation (1) appears to be appropriate as an average description for the total stimulus set, it does not appear to apply to the reaction time for individual stimuli. According to Hick-Hyman’s law, the reaction time RTi for Stimulus i with probability pi should be a linear function of the information content (uncertainty) I(pi) = -log2 pi of the ith stimulus event, with the same parameter values as those of a and b in Equation (1) for the same experimental data set. However, as was first pointed out by Hyman (1953), this requirement does not appear to be met; he stated that “we would expect the mean reaction time to each of the components within a condition to fall on the regression line which was fitted to the over-all means of the conditions…. But such was not the case (p. 194).” As a specific example, Hyman (1953) mentioned the case of one subject and observed reaction times 306 and 585 msec for two stimuli with respective information content of I(13/16)=.30 and I(1/16)=4.00 bits, whereas the corresponding reaction times predicted by the fitted regression model for all stimuli were 258 and 824 msec, respectively. Without providing the specific data, Hyman (1953) stated that such findings were typical for all subjects and conditions, i.e., low-information stimuli have larger reaction times and high-information stimuli have smaller reaction times than those predicted by the overall model in Equation (1) involving all stimuli. Such inconsistent findings have raised serious concern about the general validity of Hick-Hyman’s law (see, e.g., Laming, 1968, p. 10; Luce, 1986, pp. 392, 405). While this limitation of the law is appreciated by some authors, it is either ignored or misunderstood by others. Some authors have specifically and incorrectly stated that the reaction times for individual stimuli fall along the linear function in Equation (1) for the average reaction times of all stimuli (e.g., Wickens, 1992, p. 318). If the reaction time RTi for a particular stimulus is considered to be a linear function of its information content I(pi) = -log2 pi, then the inconsistency referred to above would imply that this linear function has parameter values that differ from those of the overall mean relationship in Equation (1). That is, if Equation (1) and the equation
RTi = a + b I(pi)    (3)
are both fitted to the same set of experimental data, we would expect the parameter estimates to be different for Equations (1) and (3), whereas they should be the same according to Hick-Hyman’s law. In order to test the proposition of differing parameter values for the two fitted models, we shall reanalyze some experimental data by Fitts et al. (1963; Fig. 2, Session 4) as also given
by Fitts and Posner (1967, p. 102) and involving n=9 stimulus-response pairs as well as data by Theios (1975) for n=2. When Equations (1) and (3) are fitted to Fitts et al.’s data by means of linear regression analysis, the following results are obtained: (4) with the values of the coefficient of determination R2=.98 for the RT model and R2=.92 for the RTi model. Similarly, based on the data from Theios (1975), (5) with the respective R2 values of .92 and .90. It is apparent from these results that the parameter values for Equations (1) and (3) may indeed differ substantially. This is especially true for the results in Equation (5) where the values of the slope parameter b differ by a factor of about three between the two linear models and the estimated values of the simple reaction time (a) differ by 63 msec. When looking at scatter plots for these data (see, e.g., Fitts and Posner, 1967, p. 103), it would also appear that Equation (3) is simply not an appropriate formulation, with clear departures from linearity for extreme values of I(pi). Thus, while Hick-Hyman’s law appears to be an appropriate relationship for the total stimulus set, it fails to account for the reaction time to individual members of the stimulus set. To quote Hyman (1953), “If, however, we are interested in the behavior of the components making up the conditions, we must find different laws and equations (p. 194).” However, no such well-fitting model for individual stimuli appears yet to have been presented in the published literature.
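The comparison of fitted parameters described above can be sketched in Python as follows; since the Fitts et al. and Theios values are not reproduced here, the arrays below are hypothetical placeholders that only illustrate the fitting procedure (ordinary linear regression of reaction time on information content).

import numpy as np

# Hypothetical per-condition data for Equation (1): mean information I(P) vs overall mean RT
I_cond = np.array([1.0, 1.75, 2.5, 3.0])
RT_cond = np.array([340.0, 450.0, 560.0, 630.0])

# Hypothetical per-stimulus data for Equation (3): I(pi) = -log2(pi) vs mean RT per stimulus
p_i = np.array([0.5, 0.25, 0.125, 0.125])
RT_i = np.array([320.0, 410.0, 490.0, 500.0])
I_i = -np.log2(p_i)

b1, a1 = np.polyfit(I_cond, RT_cond, 1)   # slope and intercept for Equation (1)
b3, a3 = np.polyfit(I_i, RT_i, 1)         # slope and intercept for Equation (3)

# Under Hick-Hyman's law (a1, b1) and (a3, b3) should coincide;
# the reanalysis in the text finds that they can differ substantially.
print((a1, b1), (a3, b3))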
An Expected Probability Model Hick-Hyman’s law in Equations (1)–(3) is based on the proposition that the choice reaction time is a linearly increasing function of the stimulus information content (uncertainty), using the information measures developed by Shannon (1948). Although the measures by Shannon are the most commonly used ones in various fields of study, entire families of alternative information measures have been proposed (see, e.g., Kapur, 1994). However, while Shannon’s and other information measures have a number of desirable mathematical properties, they have an important limitation: they lack meaningful interpretations. The numerical values of such measures are abstract numbers lacking any meaningful or operational interpretations in some probabilistic sense. As a potential alternative and meaningful predictor of the overall mean reaction time, we shall consider the self-weighted arithmetic mean probability M(P) = p1²+…+pn², i.e., the weighted arithmetic mean of the probabilities p1,…,pn, weighting each probability with itself. The M(P) is also called the expected probability (e.g., Fry, 1965, p. 210) since it is the statistical expectation of a random variable that takes on the potential values pi with probabilities pi for i=1,…,n. It may also be called the repetition probability since M(P) can be interpreted as the probability that the same stimulus occurs twice during two independent experimental trials (replications). Furthermore, M(P) may be interpreted in a predictive sense as the probability of correctly predicting which stimulus will occur on any given trial when the various stimuli are a priori predicted to occur with given probabilities p1,…,pn (by, say, tossing an n-sided die whose sides appear with probabilities p1,…,pn). Clearly, pi² is the probability of a correct prediction for
Stimulus i (i.e., the probability of predicting that Stimulus i will occur when it does actually occur) so that M(P), as the sum of all the pi² (i=1,…,n), is the probability of correct prediction whichever stimulus occurs on any given trial. It seems reasonable to postulate that the overall mean reaction time RT is some monotonic decreasing function of M(P). With M(P) interpreted as the expected stimulus probability, increasing value of M(P) implies increasing overall expectancy and level of preparation on the part of a subject, which in turn causes faster responses (reduced RT). Similarly, with M(P) interpreted as the probability of a correct stimulus prediction, it seems intuitive that any increase in this probability should cause RT to decrease since responses to correctly predicted stimuli ought to be faster than those involving erroneous predictions. Finally, with M(P) being a repetition probability, as stated above, there is considerable experimental evidence to suggest that RT should be decreasing in M(P); see, for example, Luce (1986, Sec. 10.3) for a review of studies showing that reactions tend to be faster when a stimulus is repeated (sequential) than when it is not. As to the specific form of this functional relationship, the following power model appears to provide reasonably good fits to experimental data:
RT = α[M(P)]^(-β)    (6)
and, when p1=…=pn,
RT = α n^β    (7)
where α and β are positive parameters to be empirically determined. When n=1, RT=α so that α is the simple reaction time. The relationship in Equation (7) has previously been proposed by Kvålseth (1980). To explore the goodness of fit of Equations (6)–(7) to experimental data and make comparisons with Hick-Hyman’s law in Equations (1)–(2), a number of published data sets were reanalyzed. The models in Equations (6)–(7) were fitted to the experimental data by means of nonlinear regression analysis and the appropriate coefficient of determination (R2) was properly computed (Kvålseth, 1985). The results are summarized in Table 1. It is apparent from the R2 values in Table 1 that both Hick-Hyman’s law in Equations (1)–(2) and the power model in Equations (6)–(7) provide good fits to the experimental data, with little difference between the two models. However, while Hick-Hyman’s law is based on the measure I(P) that lacks any operational interpretation, the power model is based on the mean (expected) probability M(P) that has intuitively appealing interpretations in terms of the repetition probability or the probability of correct stimulus predictions as discussed above. According to Equation (6), a relative or fractional change in this probability (i.e., ∆M(P)/M(P)) causes a fractional change in reaction time that is proportional to the probability change, with the negative proportionality constant -β ranging from -.12 to -.50 for the data in Table 1.
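A hedged Python sketch of the M(P)-based analysis follows. The power form RT = α[M(P)]^(-β) is the reconstruction of Equation (6) given above, consistent with the fractional-change property stated in the text; the data points are invented purely to show how the nonlinear fit and R² might be computed and do not correspond to the reanalysed data sets.

import numpy as np
from scipy.optimize import curve_fit

def expected_probability(p):
    # M(P) = p1^2 + ... + pn^2, the expected (repetition) probability
    p = np.asarray(p, dtype=float)
    return float(np.sum(p ** 2))

def power_model(m, alpha, beta):
    # Reconstructed Equation (6): RT = alpha * M(P)**(-beta)
    return alpha * m ** (-beta)

print(expected_probability([0.5, 0.25, 0.125, 0.125]))   # M(P) for an unequal distribution

# Hypothetical equiprobable conditions, so M(P) = 1/n for n = 1, 2, 4, 8
M = np.array([1.0, 0.5, 0.25, 0.125])
RT = np.array([210.0, 320.0, 460.0, 640.0])   # invented mean RTs (msec)

(alpha, beta), _ = curve_fit(power_model, M, RT, p0=(200.0, 0.3))

# Coefficient of determination computed as 1 - SSres/SStot (cf. Kvalseth, 1985)
residuals = RT - power_model(M, alpha, beta)
r2 = 1.0 - np.sum(residuals ** 2) / np.sum((RT - RT.mean()) ** 2)
print(alpha, beta, r2)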
Table 1. Some sample comparisons between Equations (1)–(2) and (6)–(7), with the unit of RT being msec.
Notes:
(a) Average data for the four subjects were used; (b) Merkel’s data are, for instance, given by Keele (1986); and (c) These data were for the “discrimination” situation.
References Crossman, E.R.F.W. 1953, Entropy and choice time: The effect of frequency unbalance on choice response, Quarterly Journal of Experimental Psychology, 5, 41–51. Fitts, P.M., Peterson, J.R. and Wolpe, G. 1963, Cognitive aspects of information processing: II. Adjustment to stimulus redundancy, Journal of Experimental Psychology, 65, 423–432. Fitts, P.M., Posner, M.I. 1967, Human Performance, (Brooks/Cole, Belmont, CA). Fry, T.C. 1965, Probability and Its Engineering Uses, Second Edition, (Van Nostrand, Princeton, NJ). Hick, W.E. 1952, On the rate of gain of information, Quarterly Journal of Experimental Psychology, 4, 11–26. Hyman, R. 1953, Stimulus information as a determinant of reaction time, Journal of Experimental Psychology, 45, 188–196. Kapur, J.N. 1994, Measures of Information and Their Applications, (Wiley, New York). Kaufman, H., Lamb, J.C. and Walter, J.R. 1970, Prediction of choice reaction time from information of individual stimuli, Perception & Psychophysics, 7, 263–266. Keele, S. 1986, Motor control. In K.Boff, L.Kaufman and J.Thomas (eds.) Handbook of Perception and Human Performance, Vol. 2: Cognitive Processes and Performance, (Wiley, New York). Kvålseth, T.O. 1980, An alternative to Hick-Hyman’s and Sternberg’s laws, Perceptual and Motor Skills, 50, 1281–1282. Kvålseth, T.O. 1985, Cautionary note about R2, The American Statistician, 39, 279–285. Laming, D.R.J. 1968, Information Theory and Choice-Reaction Times, (Academic Press, London). Luce, R.D. 1986, Response Times: Their Role in Inferring Elementary Mental Organization, (Oxford University Press, Oxford). Shannon, C.E. 1948, A mathematical theory of communication, Bell System Technical Journal, 27, 379–423, 623–656. Theios, J. 1975, The components of response latency in simple human information processing tasks. In P.M.A.Rabbitt and S.Dornic (eds.) Attention and Performance V, (Academic Press, London), 418–440. Wickens, C.D. 1992, Engineering Psychology and Human Performance, (Harper Collins, New York).
DESIGN RELEVANCE OF USAGE CENTRED STUDIES AT ODDS WITH THEIR SCIENTIFIC STATUS? H.Kanis
School of Industrial Design Engineering Delft University of Technology Jaffalaan 9, 2628 BX Delft, the Netherlands
The application of reliability/repeatability and validity as criteria to assess the scientific status of empirical research becomes less relevant to usage centred studies of domestic products as these studies become more relevant to, and applicable by, designers.
Introduction Notions like reliability/repeatability and validity may be seen as necessary criteria in the establishment of the scientific status of empirical research. The question dealt with in this paper is to what extent these criteria, in the area of Ergonomics/Human Factors (E/HF), are appropriate for usage centred research for the design of everyday products.
Criteria Reliability/repeatability In E/HF, the degree to which measurements or observations are free from dispersion is addressed by the terms reliability (from the social sciences) or repeatability/ reproducibility (from the technical sciences), see Kanis, 1997a. Aside from the unfortunate difference in terminology, these notions as such reflect the amenability of measurement to repetition, which constitutes a basic consideration in scientific research.
Validity In E/HF, the extent to which observations are free from deviation or bias is addressed by the term validity, drawn from the social sciences. The identification of deviation thrives on limitations in dispersion: the more repeatable measurement results are, the narrower the range within which these results cannot be demonstrated to differ systematically. The difficulty with the term ‘validity’ is its wide interpretative span, ranging from the ‘tenability’ of a model, via measurement results as ‘being (un)biased’ to a method ‘doing a good job’ and the ‘adequateness’ or ‘acceptability’ of a particular approach or procedure, i.e. ‘valid(ity)’ used as common parlance, see Kanis, 1997b. In order to avoid semantic confusion in this paper, empirical findings, or conclusions based on those findings, will be assessed for ‘deviation’, rather than ‘(in)validity’.
Usage centred research for everyday product design In Figure 1, a graphical representation is given of the functioning of a product operated by a user. In Kanis (1998), it is argued that user activities (perceptions/ cognitions, use actions, including any effort involved) are the key-issues in usage oriented design of everyday products. For these user activities, human characteristics as indicated at the right in Figure 1, mainly serve as tokens for general boundary conditions (Green et al., 1997), rather than as a base to predict future activities and experiences of users. The attention paid to human characteristics in textbooks on design engineering seems to be the result of the relatively good measurability of those characteristics, rather than their design relevance. This is further illustrated by looking into the application of measurement criteria to different types of human involvement.
Different types of human involved measurement/observation In Kanis (1997a), the following distinction is put forward for human involved observation in the area of E/HF: - measuring ‘at’ human beings, such as anthropometrical characteristics with a mainly passive/involuntary role of subjects, e.g. body mass, arm length, hand breadth; - measuring/observing ‘through’ human beings, i.e. the recording of (results of) activities in carrying out a task such as pronation/supination, the performance in any force exertion, and the number of work movements; - registration of self-reports, i.e. about perceptions, cognitive activities, and experienced effort aired by subjects on the basis of their internal references. Matched with Figure 1, this tripartition results in the following typification of human involved measurement/observation within the user-block: - ‘at’-measurands (with the term measurand adopted from ISO 1993, as the object or phenomenon intended to be measured) at the right, i.e. as human characteristics, - ‘through’-measurands occurring both as human characteristics (at the right), e.g. eyesight, memory capacity, reaction time, joint flexibility, exertable forces, and as user activity (at the left), particularly as use actions in the operation of products, and - ‘self’-measurands at the left, including perceptions/cognitions and effort experienced in any user activity. Figure 1. The functioning of a product operated by a user in order to achieve some goal (from Kanis, 1998)
Occasionally, ‘through’- and ‘self’-measurands are combined, e.g. in the exertion of a force experienced as comfortable. Assuming for the time being the adequateness of this structuring of human involved research in a design context, the viability of research criteria (see above) is now scrutinised for the identified types of measurands.
Specification of dispersion ‘At’-measurands This type of measurement resembles the classical ideal from the natural sciences, involving measurands which can be specified in a method, that is: as existing ‘out there’. Throughout subjects, a more or less constant or so-called homoscedastic dispersion can be expected, which means that the repeatability is a constant.
‘Through’-measurands In this case, the involvement of human activities generally precludes numerous repetitions per subject due to possible carry-over. Hence the application of research designs involving several subjects with only a limited number of repetitions per subject, in particular the test-retest. In this type of research, a heteroscedastic dispersion tends to be found, including some proportionality throughout subjects between the difference and the mean of test-retest results. In Kanis (1997a), the relevance of accounting for different types of dispersion patterning, i.e. the non-constancy of the repeatability, is discussed as regards the setting of margins in design. For that matter, E/HF studies regularly discuss dispersion in the case of ‘through’-measurands deficiently, on the basis of the reliability coefficient r (Pearson), a fallacy originating from the social sciences.
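A small Python sketch (not taken from Kanis’s papers) of the point being made: with test-retest data one can inspect whether the within-subject difference scales with the within-subject mean, something the reliability coefficient r does not reveal. The force values are hypothetical.

import numpy as np

# Hypothetical test-retest force exertions (N) for a handful of subjects
test = np.array([120.0, 180.0, 250.0, 310.0, 400.0])
retest = np.array([110.0, 195.0, 230.0, 345.0, 355.0])

diff = test - retest            # within-subject test-retest difference
mean = (test + retest) / 2.0    # within-subject mean

# Heteroscedastic dispersion: the size of the difference tends to grow with the mean,
# so repeatability is not a constant across subjects.
spread_vs_level = np.corrcoef(mean, np.abs(diff))[0, 1]

# The Pearson reliability coefficient r summarises association only and says
# nothing about how the dispersion is patterned.
r = np.corrcoef(test, retest)[0, 1]
print(spread_vs_level, r)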
‘Self’-measurands. In general, the registration of self-reports cannot be treated in terms of dispersion, since a reasonably argued repetition for this type of recording is illusory due to the irrevocability of perceptive or cognitive experiences. The same applies to the reporting of effort in view of the evasiveness of any alleged constancy of internal references in human beings.
Questioning deviation In order to avoid the muddled use of the concept of validity, and also since corroboration, rather than a once and for all ‘validation’, is all that is achievable, questioning syntaxes have been developed. Basically, two distinctive notions may be questioned when deviation is observed (Kanis, 1997b): (i) a claimed measurand, see the questioning syntax in Figure 2, and (ii) a proposition underpinning a prediction such as an inference or generalisation. In the first case (i), some considerations (theoretical, logical) predicting a certain relationship between fa and fb° are taken for granted, i.e. as prop° (see Figure 2), for instance that a maximally exerted force Fmax should exceed a comfortably exerted force, Fcomf. As occasionally it is found that Fmax ≤ Fcomf (Kanis, 1994), in such a case at least Fmax should be questioned as a claimed measurand. In the second case (ii), the empirical observations to be compared are adopted as unquestioned. This leaves only prop as amenable to reconsideration when the comparison fa°#fb° yields a ∆, see note in Figure 2. This is at issue in measuring Fcomf and Fmax (see example above) if fa is termed as ‘the maximum force exerted, given the instruction to do so’, which both renders fa unquestioned by definition and raises theoretical questions as to psycho-motoric differences between subjects after the same instruction.
Figure 2. Questioning a claimed measurand
‘At’-measurands In case of evidence for deviating measurement results, the identification of deviation is more or less straightforward, such as doing a measurement again with a critical eye for a suspect part of it, or applying a different method (triangulation).
‘Through’-measurands The example of force exertion (see above) illustrates that this type of measurand may not be completely specified by a method: the same instruction can evoke different reactions of people. In addition, an instruction can never be guaranteed to be unambiguous. Hence, the identification of deviation is evasive in as far as ‘through’—measurands rely on interpretations and reactions by people.
‘Self’-measurands This type of measurand largely escapes from specification by any method. Then, identification of deviation is seen as virtually impossible insofar as internally referenced human activities cannot reasonably be linked to observables in order to produce supporting or conflicting evidence. An example is people reporting their perceptions and cognitions. What they say cannot, by definition, be questioned as a claimed measurand if termed as ‘people’s utterances, being asked in a certain way to air what they see/saw, think/thought (etc.)’, i.e. as fa° in the syntax in Figure 3. It is theoretical considerations, i.e. prop, on which the inference Figure 3. Questioning a proposition
Design relevance of usage centred studies at odds with their scientific status? 581
is based that what people for instance say in ‘thinking aloud’, is what they are really thinking. This inference is troublesome to challenge because of the difficulty to produce evidence to compare with, i.e. fb0 in Figure 3, see further Rooden, 1998.
Summary and discussion The message of the summary in Table 1 seems clear: the more design relevant usage centred research is, the less this research can accommodate the adopted scientific criteria, i.e. the specification of dispersion (‘reliability’/‘repeatability’) and any questioning of deviation (‘validity’). The point is that measurands of interest in design are essentially interactive, rather than conceivable as existing ‘out there’. Hence, it may come as no surprise that usage centred design efforts can only benefit to a limited extent from ‘hard’ science thriving on the positivistic ideal that any interactive ‘fuzziness’ in user-measurement confrontation may be nullified. How can interaction be both the target in the study of product usage and, simultaneously, be denied in the application of measurement techniques which are applied to observe that interaction? For that matter, the interactiveness of human involved measurement largely undermines the call for ‘valid’ methods in E/HF. Virtually, in design oriented usage centred research, the notion of validity reduces to ‘credibility’ or ‘plausibility’, as seems viable in qualitative research.
Table 1. Design relevance vs. scientific criteria for different types of measurands
References Green, W.S., Kanis, H. and Vermeeren, A.P.O.S. 1997, Tuning the design of everyday products to cognitive and physical activities of users. In S.A.Robertson (ed.) Contemporary Ergonomics, (Taylor and Francis, London), 175–180 ISO, 1993 Guide to the expression of uncertainty in measurement, International Standard Organisation, Geneva Kanis, H. 1994, On validation, In Proceedings of the Human Factors Society 38th Annual Meeting, (Human Factors and Ergonomics Society, Santa Monica, CA, USA). 515–519 Kanis, H. 1997a, Variation in measurement repetition of human characteristics and activities, Applied Ergonomics, 28, 155–163 Kanis, H. 1997b, Validity as panacea? In Proceedings 13th IEA Congress, (Finnish Institute of Occupational Health, Helsinki), 235–237 Kanis, H. 1998, Usage centred research for everyday product design, Applied Ergonomics, 2, 75–82 Rooden, M.J. 1998, Thinking about thinking aloud. In M.Hanson (ed.) Contemporary Ergonomics, (Taylor and Francis, London), this issue
THE INTEGRATION OF HUMAN FACTORS CONSIDERATIONS INTO SAFETY AND RISK ASSESSMENT SYSTEMS J.Lola Williamson-Taylor, Ph.D, MIOSH
AWE Plc, Aldermaston, Reading, UK. RG7 4PR
This paper describes a methodology for Human Factors Integration (HFI) into safety and risk assessment for safety case purposes. It explains the identification, assessment and screening of human factors contribution to major hazard scenarios and the treatment of human deficiency/recovery tendencies in safety management systems, safety culture and organisational factors as an integral part of the risk assessment process. The paper is freely formatted and aimed at risk assessment specialists with safety case expertise.
1.0 Introduction The safety and risk assessment of hazardous installations such as chemical, explosives and nuclear processing facilities, their supporting functions and management control systems are required to incorporate human factors considerations. The complexity of human involvement in such facilities means that the associated human factors are not constrained to the human and machine interface type only, but also include those inherent in the safety management systems, safety culture and organisational frameworks (Andersen et al, 1990; Joksimovic et al 1993; Pate-Cornell, 1990). In the UK, the safety case approach has been fairly well established as an effective and comprehensive means of justifying the safety of hazardous installations throughout their life cycle. The safety case requirement is set to be extended to other industry sectors, as seen from the recent extension to the offshore, railway and mining industries. The incoming Control of Major Accident Hazard Regulations (COMAH), currently known as the SEVESO II directive, will further underpin the current regulatory trend. The risk assessment of major accident hazards and significant safety concerns is a crucial part of a safety case for regulatory purposes such as licensing and permissioning. The causes
of technical and hardware failures and operator errors have been linked to deep-seated human factors in the management decision processes and organisational frameworks. However, these are often omitted or poorly treated in safety cases, mainly due to lack of a practical integration methodology. Also, the scope of these human factors is not seen to be open to a formal definition and an objective assessment by many specialist risk assessors. This paper describes a “true” human factors integration into the safety/risk assessment process for safety cases. The integration is based on clear principles and achieved by identifying the critical path for the appropriate treatment of the various types of human factors considerations throughout the risk assessment process.
2.0 The Basic Principles If we are to adequately consider both the direct and indirect human factors with significant impact on safety, we must understand the level at which they can be properly treated within an assessment process. An appropriate methodology must be applied to achieve a credible and reliable result. This is the basic philosophy behind the methodology. For safety cases, a clear strategy of approach that takes account of the safety assessment system in use and the extent to which advanced risk assessment techniques are applied needs to be established. Competency of the specialist assessors and reviewers, interdisciplinary team practices and use of specialist contractors should be included when formulating the strategy. The identification of a critical path for the integration of a framework for the identification and assessment of direct and indirect human factors embedded in the safety management system and organisational domains should be unambiguous and transparent. The HFI process must be compatible and harmonised with each level within the risk assessment process. To produce a reliable deterministically-based assessment, an appropriate balance must be struck between the qualitative and quantitative assessments and targets. The probabilistic treatment of human error within the Probabilistic Risk Assessment (PRA) must acknowledge the sparsity of good quality data and must validate the data, sources and assumptions made.
3.0 Safety Management System and Organisational Frameworks An important aspect of a safety case is the demonstration of the effectiveness of the Safety Management System (SMS) and the organisation for safety. This aspect is often described but not backed up by an assessment of its reliability. The key controlling elements within the safety management system and organisation frameworks need to be included in the risk assessment. Here, it is essential to make a distinction and appreciate the difference between human error causing an immediate failure in the system and that which makes hazardous events more probable.
For the purpose of a safety case, a powerful two stage technical audit procedure, such as Critical Examination (CE) for safety assessment studies (Williamson-Taylor, 1995) can be applied at the early stage of the risk assessment. The first stage is a broad scope technical audit followed by the application of a set of specialist ergonomic audit tools to assess specific aspects such as procedures and communications, training and supervision, human and machine interface, safety culture, the environment, organisation for safety— including decision processes for manning, safety related posts, technical support, use of contractors etc.
4.0 The Human Factors Integration into Safety Assessment Methodology The description of a risk assessment process for a hazardous facility integrating human factors considerations is given below.
Preliminary Safety Review and Human Factors Considerations The key to a competent treatment of both the direct and indirect human factors within a risk assessment framework is one of identification. A methodology capable of high level identification of the hazardous systems and operations, high impact human involvement in process/systems operations and key SMS subsystems should be applied. The screening of the output from the application of a methodology such as CE includes the identified tasks associated with high hazards, high impact and vulnerable task activities, and SMS sub-systems whose functionality depends on a high degree of human success. Management issues relating to general occupational safety and health rather than major accident hazards can be appropriately filtered into the relevant management system for their adequate treatment.
Formal Hazard Identification and Human Factors Identification The human tasks, hazardous systems and operations and key SMS subsystems identified in the previous stage are further assessed using dedicated techniques—HAZard and OPerability study (HAZOP) and/or Failure Mode and Effect Analysis (FMEA) for process and hardware systems; an appropriate combination of Task Analysis (TA) tools for specific human tasks; and Human HAZOP for safety management sub-systems supported by appropriate TA supplementary tools. Organisational factors and other factors having influencing effects on human performance are assessed using dedicated ergonomic tools. The various assessments are best performed in parallel. This will require good project management, planning and use of competent multi-skilled assessors. The output from these assessments is a schedule of faults and human error sequences and conditions leading to events which can be either initiating or top events. The qualitative screening of incredible faults and human errors can be carried out using engineering and expert knowledge, aided by short-cut or coarse risk criteria.
Risk Analysis Integrating Human Error Analysis This stage involves the estimation of the potential consequence of a given event and the probability of the event occurring to estimate the risk. Fault Tree and Event Tree analysis are the most commonly used methods to provide the architecture of the relationship between the top events and initiating events and conditions. Only a proportion of hazardous events will require detailed quantification of the fault sequences and risk. The architecture of the relationship between the immediate precursor to a top event and the immediate precursor to a sub-event that can lead to a top event provides a good qualitative basis for examining the possible combination of initiating events for the minimum cut set analysis. The quantification process should include human reliability quantification using methods such as HEART (Kirwan, 1994) as an integral part of the process. Comparison with a target for overall risk acceptability, using the ALARP principle and tolerability of risk criteria, will thus have incorporated well structured human factors considerations. The deductions from the risk assessment are a set of technical guidelines, engineering and management systems such as safe operating envelopes for processes and systems, key operating procedures and minimum standards, mechanical integrity and maintenance standards for safety critical and related systems, organisation for safety, emergency plans/response and crisis management etc. These are SMS sub-systems run by people through the organisational decision systems required to comply with the safety standard and risk level defined by the safety case. The full scope of human factors for safety and continuous improvement becomes inherent and demonstrable in the deduced systems.
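As an illustration of how a human error probability can enter the quantification stage alongside hardware failures, the following Python sketch computes a top-event probability from minimal cut sets under a rare-event approximation. The event names, probabilities and cut-set structure are hypothetical and are not taken from the paper; a human error probability would in practice be derived with a human reliability method such as HEART before comparison against a tolerability/ALARP target.

# Hypothetical basic-event probabilities; the human error probability would in
# practice come from a human reliability method such as HEART.
basic_events = {
    "valve_fails_closed": 1.0e-3,
    "alarm_unavailable": 5.0e-3,
    "operator_omits_check": 3.0e-2,
}

# Hypothetical minimal cut sets for one top event: every event in a set must occur.
minimal_cut_sets = [
    ["valve_fails_closed", "operator_omits_check"],
    ["alarm_unavailable", "operator_omits_check"],
]

def cut_set_probability(cut_set, probs):
    # Probability of one minimal cut set, assuming independent basic events
    p = 1.0
    for event in cut_set:
        p *= probs[event]
    return p

# Rare-event approximation: top-event probability ~ sum of minimal cut set probabilities,
# to be compared against a tolerability/ALARP target in the safety case.
p_top = sum(cut_set_probability(cs, basic_events) for cs in minimal_cut_sets)
print(p_top)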
The Demonstration of Safety of a Hazardous Facility
The demonstration of safety and of the risk level will draw on the evidence from both the qualitative and quantitative assessments and on the reliability of the deduced supporting SMS.
Validation
Validation of the HFI methodology for safety assessment and safety cases involves evaluating the quality and reliability of the methodology and verifying the correctness of the tools within it. The detailed methodology upon which this paper is based (Williamson-Taylor, 1996) was independently validated on this basis. Pilot and real applications have demonstrated the methodology's ability to achieve its objective.
5.0 Conclusion
Human factors considerations should address the full scope of all significant human influences within a safety case of a hazardous installation. A structured methodology such as the one upon which this paper is based is required. This methodology demonstrates that human factors in safety management systems and organisational domains can be integrated into a safety case.
References
Anderson, N., Schurman, D. and Wreathall, J. 1990, A structure of influences of management and organisational factors on unsafe acts at the job promoter level, Proceedings of the Human Factors Society 34th Annual Meeting, Orlando, FL, 8–12 October 1990, 881–884 (The Human Factors Society, Santa Monica, CA, USA).
Joksimovich, V., Orvis, D. and Moien, P. 1993, Safety culture via integrated risk management programme, Proceedings of the Probabilistic Safety Assessment International Topical Meeting, Clearwater Beach, FL, January 26–29, 220–226 (American Nuclear Society, La Grange Park, IL, USA).
Kirwan, B. 1994, A Guide to Practical Human Reliability Assessment (Taylor and Francis, London).
Paté-Cornell, M.E. 1990, Organisational aspects of engineering systems safety: the case of offshore platforms, Science, 250, 1210–1217.
Williamson-Taylor, J. 1995, Critical Examination Methodology for Safety Case Preparation (AWE plc, internal publication).
Williamson-Taylor, J. 1996, Human Factors Integration into Safety Assessment Processes and Safety Cases (AWE plc, internal publication).
THE USE OF DEFIBRILLATOR DEVICES BY THE LAY PUBLIC
Tracy Gorbell and Rachel Benedyk
Ergonomics and HCI Unit, University College London, 26 Bedford Way, London WC1 0AP
A defibrillator is a device used to apply an electric shock to a patient’s chest to stop the haphazard activity of the heart that results in cardiac arrest. It is proposed that the lay public could apply defibrillation in advance of the paramedic’s arrival. To investigate several ergonomic aspects of novice use of the device, sixty-one subjects were recruited into one of two experimental designs, with or without training. Four performance measures were assessed using real defibrillators on a simulated task. The results of the study show that performance was better for those subjects with training. Times to defibrillation depended on the usability of particular features. Resuscitation experience did not appear to influence the results. Lay people could successfully defibrillate with optimal designs of device.
Introduction
Successful resuscitation from cardiac arrest due to heart attack depends on what is commonly known as the 'chain of survival' (Bossaert and Koster, 1992; Resuscitation Council UK, 1994). The chain is characterised by four links: 1) early access, 2) early cardiopulmonary resuscitation (CPR), 3) early defibrillation and 4) early advanced care. The degree of success achieved in resuscitating an individual depends on the rapid application of this chain in an emergency. Typically the first two links have been taught to the lay public through First Aid and CPR courses. As part of a wider initiative to improve the chances of survival from sudden cardiac death, several authors have supported extending the role of the lay public to include stage three, defibrillation (Weisfeldt et al, 1996; Bossaert and Koster, 1992). Placing automated external defibrillators (AEDs) in the community has been suggested for, among other locations, densely populated areas such as airports. The analogy put forward is that AEDs could become like 'fire extinguishers'. However, is this what is really
meant by such an analogy: that anyone could pick up an AED and use it in a life-threatening situation without training?
Although the technology used to analyse cardiac rhythms is not new, recent advances in AED design have increased the possibility of their use by lay individuals. AEDs aimed at public access are portable and maintenance-free, provide audible and/or visual prompts, and require no recognition of complex heart rhythms by the user. Whilst differences in design will determine the exact sequence of use, operating an AED involves several key actions (sketched in outline at the end of this introduction). Once the AED is turned on, verbal and/or visual instructions prompt the operator to connect the electrodes. The electrodes are positioned on the patient's exposed chest as indicated on the electrode packaging (or on the AED itself). Analysis of the heart rhythm is then initiated, either automatically or by pressing an 'analyse' button. The AED then decides whether it is necessary to administer a shock. If advised, the operator delivers the shock by pressing the appropriate button. Prior to each shock, the operator is required both to make a visual check of the area and to give a command to "stand clear". These safety actions are important to ensure that no one is inadvertently shocked. If a shock is not advised, the operator is prompted to commence CPR.
Given that a role exists for others outside the medical profession, who are first responders and what does the term lay person mean? First responders can be characterised as those individuals who, as part of their profession, are likely to be first on scene in an emergency, e.g. the police. Lay person responders, on the other hand, can be distinguished as those individuals who are responsible for others within their work domain, e.g. airline cabin crew. The common factor amongst these groups is that they have some degree of training. However, if the analogy of the fire extinguisher is applied to AED use, then a further category of users arises: the untrained, non-medical public. One of the main advantages of improved defibrillator technology is that less knowledgeable individuals can be taught how to use an AED. Weisfeldt et al (1996) suggest that the training requirements of the public "depend on whether the objective is familiarity with the concept or ease in use of the device".
The introduction of unfamiliar devices and techniques previously considered outside the lay public domain requires both careful thought and careful implementation. Ergonomic design challenges include how to package an AED so that it communicates both its purpose and its operation even to a lay user. At the same time the design should be such that it can be used effectively and safely. Furthermore, it is not only necessary to look at the individual design features of a system or device, but to consider the device in the full context in which it will be used. For this, a way of identifying possible intervention strategies that guard against error and optimise design is needed (Benedyk and Minister, 1997).
The purpose of this study was to investigate several ergonomic aspects of AED use by the lay public. Several key areas were identified: whether resuscitation experience, training versus no training, and certain features of AED design would affect the user's operation of AED units; and, finally, whether training on one AED transfers to a second, different AED.
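The key operator actions described above can be summarised as a short sequence of prompts and decisions. The outline below is a hypothetical sketch of that sequence, not the control logic of any real AED; the prompts, the shock limit and the auto_reanalyse option are illustrative assumptions only.

```python
# Hypothetical outline of the operating sequence described above; this is
# not the firmware or protocol of any real AED, and the prompts, shock
# limit and auto_reanalyse option are illustrative assumptions only.

def aed_sequence(analysis_results, auto_reanalyse=True, max_shocks=3):
    """Walk through the operator prompts for a list of analysis outcomes
    (True = shock advised, False = no shock advised)."""
    print("Power on: attach electrodes to the patient's bare chest as indicated")
    shocks_given = 0
    for shockable in analysis_results:
        if not auto_reanalyse:
            print("Press ANALYSE to analyse the heart rhythm")
        print("Analysing heart rhythm...")
        if not shockable:
            print("No shock advised: begin CPR")
            return
        print("Shock advised: check the area, call 'stand clear', press SHOCK")
        shocks_given += 1
        if shocks_given == max_shocks:
            break
    print("Continue CPR and wait for the ambulance service")

# A unit that requires the operator to start each analysis (akin to a
# manual re-analyse design), with two shockable rhythms then a non-shockable one:
aed_sequence([True, True, False], auto_reanalyse=False)
```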
Method
Sixty-one subjects were recruited into one of two experimental designs, with training or without training. Two AED designs were used and four performance measures were assessed using a real AED on a simulated task. All subjects were asked to complete a questionnaire regarding the usability of the AED units.
Experiment 1—With Training
Thirty-two subjects participated in Experiment 1 (16 CPR-trained individuals and 16 novices). Subjects were divided into four groups so that both a CPR-trained and a novice group received training with either AED-A or AED-B. Each group received a two-hour training session which included a lecture, practice time and assessment of the subjects' AED performance. A modified section of the advisory external defibrillator protocol (as recommended by the London Ambulance Service Steering Committee) was used in the experiment (Figure 1). Four performance measures (Figure 2) were assessed using a performance checklist. A week later, each group was re-assessed using the same scenario. On this occasion, half of the subjects within each group were re-assessed using the same AED; the other half were re-assessed using a different AED unit. This was considered to be of interest because of the implications for future training programmes and the possibility of standardising public access defibrillator units.
Figure 1. Modified section of defibrillation protocol followed in scenario
Experiment 2—Without Training
Twenty-nine subjects (10 paramedics, 9 CPR-trained individuals and 10 novices) participated in Experiment 2. Each subject was asked to try to deliver three shocks using both AED units without prior instruction. Half of the subjects in each population used AED-A followed by AED-B; the others did the reverse. The scenario and the assessment of performance were the same as those described in Experiment 1. Unless otherwise stated, the Wilcoxon Rank Sum test was used to analyse the data from both experimental designs.
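Purely as an illustration of the analysis approach, and using hypothetical times rather than the study data, the Wilcoxon Rank Sum test can be applied to two groups' times to first defibrillation as follows (here via scipy).

```python
# Illustration only: Wilcoxon rank sum test on hypothetical times to
# first defibrillation (seconds) for two groups; not the study data.
from scipy.stats import ranksums

times_aed_a = [78, 85, 90, 95, 102, 110, 118, 125]   # hypothetical
times_aed_b = [60, 64, 70, 72, 75, 80, 83, 88]        # hypothetical

statistic, p_value = ranksums(times_aed_a, times_aed_b)
print(f"Wilcoxon rank sum statistic = {statistic:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Difference significant at the 5% level")
else:
    print("No significant difference at the 5% level")
```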
Figure 2. Description of performance measures recorded
Results and Discussion
Experiment 1
With training, no significant differences (at the 5% level) in performance scores were observed between novice and CPR-trained subjects. To accommodate novice subjects, the scenario used (Figure 1) did not require subjects to perform CPR. Thus, with training, both groups could perform the task set in the experiment; resuscitation experience did not appear to confer an additional advantage.
Whilst AED action and safety scores did not differ between the two units, times to defibrillation were significantly quicker (at the 5% level) for subjects trained to use AED-B. With AED-A, the electrodes were not pre-connected to the main unit, so additional steps had to be completed by the user to connect the electrode lead. Furthermore, after the first analysis cycle and subsequent shock, the sequence returns to the point of pushing the analyse button, whereas AED-B automatically re-analyses the patient's heart rhythm. Differences in times to defibrillation are therefore partly intrinsic to the AED unit.
When subjects were re-tested using a second, unfamiliar defibrillator, differences in performance were observed. Using a paired sample t-test, times to defibrillation were significantly slower (at the 5% level) for subjects trained on AED-B and re-tested on AED-A. In contrast, for those trained on AED-A and re-tested on AED-B, times to defibrillation were not significantly better or worse (at the 5% level). AED action points were lost by both novice and CPR-trained groups and on both units; interestingly, it was always action 4 (correct positioning of electrodes) where points were lost. Whilst the positioning of the electrodes is the same, the shape, size and labelling of the electrodes differ between the two units. In general, performance deteriorated if subjects were re-tested using the more complex of the two AED units.
Standardisation of future AED designs would be advisable. However, because several companies manufacture AEDs, a more practical approach may be to standardise certain features; for example, all AEDs could have the same pre-connected electrode arrangement.
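Again as an illustration only, with hypothetical values rather than the experimental data, the paired sample t-test used for the re-test comparison could be run as follows.

```python
# Illustration only: paired sample t-test on hypothetical times to
# defibrillation for the same subjects on their training unit and on the
# unfamiliar unit at re-test; not the experimental data.
from scipy.stats import ttest_rel

trained_unit = [62, 70, 66, 75, 68, 73, 71, 80]      # hypothetical, seconds
unfamiliar_unit = [85, 92, 80, 99, 88, 95, 90, 104]  # hypothetical, seconds

t_statistic, p_value = ttest_rel(trained_unit, unfamiliar_unit)
print(f"t = {t_statistic:.2f}, p = {p_value:.3f}")
```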
Experiment 2
Without training, times to defibrillation did not differ significantly (at the 5% level) between the three subject groups using either AED unit. In contrast, safety scores were significantly better (at the 5% level) for the expert group than for the novice and CPR-trained subjects. Points were consistently lost for safety actions 2 and 3 (Figure 2): simply, the untrained lay subjects were not aware of the required safety actions. Although no significant differences (at the 5% level) in AED action scores were found between the subjects, all groups consistently lost points associated with AED action 4. As with the trained subjects, resuscitation experience as such did not have an effect; more specifically, it was prior experience of defibrillation that affected the results. Comparisons between the three subject groups using both AED units demonstrated that times to defibrillation were again significantly quicker (at the 5% level) using AED-B.
If the analogy of the fire extinguisher is applied to AEDs and taken literally, it would imply that AED use does not require training. The results of this study, however, demonstrated that significant differences (at the 5% level) in performance did exist between trained novice and CPR-trained subjects and their non-trained counterparts. The untrained groups took the longest times to defibrillation and lost the most safety points, highlighting the need for some instruction. As public access defibrillation will be an unfamiliar concept to many, in the absence of experience a good conceptual model can be provided through training.
Conclusion
In conclusion, all subjects were able to use an AED. Differences in performance suggest that lay individuals will need training. Although AED design in general supported ease of use, improvements to existing designs need to be driven from the end-users' point of view. Ergonomic research can therefore support that conducted by the medical and engineering professions. As these results were obtained from small samples, further investigation in this area is suggested.
References
Benedyk, R. and Minister, S. 1997, Evaluation of product safety using the BeSafe method. In N. Stanton (ed), Human Factors in Consumer Products (Taylor and Francis, London), in press.
Bossaert, L. and Koster, R. 1992, Defibrillation: methods and strategies, Resuscitation, 24, 211–225.
Resuscitation Council UK, 1994, Advanced Life Support Manual, 2nd edition (Burr Associates).
Weisfeldt, M.L., Kerber, R.E., McGoldrick, P., Moss, A.J., Nicol, G., Ornato, J.P., Palmer, D.G., Riegel, B. and Smith, S.C. 1996, American Heart Association report on the Public Access Defibrillation Conference, December 8–10, 1994, Resuscitation, 32, 127–138.
OCCUPATIONAL DISORDERS IN GHANAIAN SUBSISTENCE FARMERS
Marc McNeill¹ and Dave O'Neill²
¹Department For International Development, 94 Victoria Street, London SW1E 5JL
²Silsoe Research Institute, Silsoe, Bedford MK45 4HS
A survey of 100 (male) subsistence farmers in the Brong Ahafo region of Ghana was undertaken to identify the predominant causes of ill-health in this sector of the population. Injuries from cutlass accidents and back pain were found to be prevalent (79% and 76% respectively), with back pain being the more debilitating, accounting for an average of 19 days lost from work. A greater number of working days were lost to gunshot wounds (60), broken bones (38) and snakebites (29), but these were less prevalent. The use of hand tools was heavily implicated in many of the activities associated with the onset of ill-health. It is concluded that improved designs of hand tools could increase the farmers' productivity and quality of life.
Introduction
In Ghana, agriculture accounts for 47.8% of GDP, employs about 60% of the total labour force and contributes 70% of total export earnings (GSS, 1994). The majority of this is small-scale subsistence farming, in which manual labour contributes an estimated 90% of the energy used for crop production (FAO, 1987). The full potential of this energy is often not realised, the workers' physical capacity being reduced by ill health arising from occupational disorders: diseases or injuries attributable to work practices, work demands or the work environment (Rainbird and O'Neill, 1993). A perception seems to persist that occupational health is solely an industrial concern and that health and safety issues are less of a problem in the agricultural sector than in the industrial sector (Mohan, 1987). Whilst some research has been conducted into occupational disorders in industrially developing countries, very little has focused upon agriculture. Rainbird and O'Neill (1993), in their review of occupational disorders affecting agricultural workers in tropical developing countries, grouped agricultural occupational disorders into three broad categories: health problems associated with pesticides, musculoskeletal disorders, and occupational diseases such as zoonoses and farmer's lung. They specifically excluded
occupational accidents, which may be a significant cause of lost productivity in agriculture. Nogueira (1987) described a survey of agricultural accidents carried out in Brazil, in which 9.22% of workers suffered accidents at work, of which 45.98% were caused by hand tools. In Ghana, where most farming activities are carried out using hand tools, the incidence of injuries from hand tools may be expected to be greater.
A participatory rural appraisal (PRA), along the lines described by O'Neill (1997), conducted with farmers in the Brong Ahafo (BA) region of Ghana, suggested that accidents, injuries and illnesses arising from agricultural activities are not uncommon. In particular, musculoskeletal disorders were identified as a problem, with a majority of farmers complaining of lower back pain. Injuries from hand tools were common, farmers claiming that lacerations from slashing the bush with cutlasses or weeding with hoes were a regular hazard. Other occupational disorders that farmers claimed to suffer from included thorn pricks from weeds such as Acheampong (Chromolaena odorata) and speargrass (Imperata cylindrica), gunshot wounds and fever from working in the sun. Occupational disorders from post-harvest agro-processing activities, which are mostly carried out by women, were also found to be common. These usually involve much drudgery, with repetitive upper body motions (e.g. stirring, kneading, pounding) in unpleasant environments (e.g. smoke, dust). For a more detailed account of occupational health in agro-processing, refer to Fajemilehin and Jinadu (1995).
Discussions with medical personnel in clinics and hospitals in the Wenchi district of BA and with traditional herbalists supported the hypothesis that occupational disorders are a problem for farmers. Whilst malaria is by far the most common cause of morbidity and of admission to hospital in Wenchi district, it is by no means the only cause. In 1996, accidents (trauma and burns) were the sixth most common cause of morbidity (Antwi, 1997). Whilst there is no indication as to the causes of these, the medical personnel and herbalists suggested that agricultural accidents may be the most frequent. From the records at one hospital, morbidity amongst farmers that may be related to occupation (such as trauma, lower back pain and snake bites) accounted for approximately 11% of all cases seen.
Given the often hazardous nature of many of the activities in which they are involved, this study aimed to establish how Ghanaian subsistence farmers are affected by occupational disorders.
Methodology
From earlier PRA work with farmers and discussions with health personnel, a questionnaire was constructed covering the major occupational disorders that had been identified. It was piloted before being incorporated into a larger survey of farming practices in the Wenchi district. Whilst women are also farmers (and indeed their burden of agricultural work may be greater, as they undertake activities such as agro-processing and water and firewood collection along with tending the farm), the logistics of this limited survey prevented them from being included. Hence the questionnaire was administered to farmers (predominantly male heads of household) in four villages in the Wenchi district. A total of 100 farmers from 168 households were interviewed.
Results and Discussion
Table 1 provides a summary of the days lost and the costs of disorders for various activities, from information collected over two cropping seasons (i.e. one year).
Table 1. Mean costs and days lost from occupational disorders
Musculoskeletal disorders
Back pain was suffered by 76% of the farmers. This may be related to the extended periods of hard work in awkward postures observed during many agricultural activities: all the activities to which back pain was attributed (Figure 1) are traditionally undertaken using short-handled hoes and cutlasses that necessitate a stooping posture. Several farmers claimed that chronic back pain left them unable to work for long periods; the mean number of days lost from back pain was 19. Complaints of chest pain were made by 42% of the farmers over the last two cropping seasons. The most commonly cited cause of this was making yam mounds; this activity involves the farmer bending over, using a short-handled hoe to move soil between his legs to create a mound approximately 0.5 m high.
An opportunity sample of 40 farmers from the original 100 were also asked whether they were suffering from lower back pain now and whether they had suffered lower back pain in the last year. The point prevalence was 48%, whilst 77% of farmers claimed they had suffered from lower back pain in the last year.
Figure 1. Activities attributed to causing back pain
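For readers unfamiliar with the two measures, point and period (one-year) prevalence are simple proportions of the sample reporting the complaint at the time of interview and at any time in the preceding year respectively. The counts in the sketch below are approximate back-calculations from the reported 48% and 77% of the 40 farmers, shown only to make the arithmetic explicit.

```python
# Point vs period prevalence as simple proportions. The counts below are
# approximate back-calculations from the reported percentages (48% and 77%
# of 40 farmers), shown only to illustrate the arithmetic.

def prevalence(cases, sample_size):
    return 100.0 * cases / sample_size

sample_size = 40
cases_now = 19        # approx. 48% of 40: back pain at the time of interview
cases_last_year = 31  # approx. 77% of 40: back pain at some time in the year

print(f"Point prevalence:  {prevalence(cases_now, sample_size):.1f}%")
print(f"Period prevalence: {prevalence(cases_last_year, sample_size):.1f}%")
```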
Hand tools
The cutlass is a multi-purpose tool, being used in clearing the bush (slashing and cutting), planting (digging holes with the blade end), weeding (turning over the soil with the blade end) and harvesting (cutting and digging). Over the two cropping seasons the most common occupational disorder affecting farmers was cutlass injury. The weight of the cutlass and its handle design may be important factors in the incidence of cutlass accidents. Hoes are predominantly used during land preparation (i.e. making mounds) and during weeding. As well as being associated with musculoskeletal disorders, hoeing caused injuries: 42% of farmers claimed they had sustained an injury while using a hoe.
Burns and fever
Farmers burn their land during the dry season to clear the soil for planting and to rid the land of weed seeds and pests. Fires are also lit by hunters to drive out animals. With intensified cultivation, longer dry seasons and the increasing spread of grasses, the fires can easily get out of hand. Burns are therefore common, with 50% of farmers claiming to have been injured, predominantly during the dry season. Burns may not be the only health hazard from bush fires: almost 74% of fevers during the dry season were attributed to burning. Whilst many of these fevers may be malarial (farmers do not distinguish between malaria and other fevers), it is suggested that they may be symptoms of upper respiratory problems from smoke and dust inhalation, or of heat stress and heat-related illnesses.
Pesticide problems
The results from this survey indicate that 28% of farmers had suffered from sickness following chemical use. This is an indication of acute pesticide poisoning rather than of the effects of long-term exposure to pesticides, which would require objective or clinical analysis, such as measurement of the inhibition of cholinesterase activity, to reveal (Rainbird and O'Neill, 1993). Several reasons for the incidence of pesticide poisonings are suggested in Table 2.
Table 2. Suggested reasons for incidence of pesticide poisonings
Snake bites and injuries from plants
Acheampong is a common weed that is claimed to have medicinal properties and is used in the preservation of corpses. When it is dried, however, the sharp ends are thought to be poisonous and present a significant hazard of injury and infection: injuries from Acheampong were reported by 69% of the farmers. Speargrass is also a hazard, with the risk of lacerations and puncture wounds. Discussions with farmers suggested that these injuries occur mainly around the feet and ankles. Snake bites, which are universally feared, also occur around the lower legs. Many farmers wear Wellington boots to protect themselves from these hazards; however, they are expensive and inappropriate for the tropical environment. There is, therefore, an apparent need for comfortable, low-cost leg protection.
Conclusions
The results of this survey indicate that occupational disorders are a major problem in Ghanaian subsistence agriculture. Injuries from hand tools, musculoskeletal complaints (back pain) and fever attributed to work are the most common. The immediate cost to the farmers, both in terms of lost work and the financial burden of treatment, be it traditional or allopathic, can be considerable. When farmers have only a limited window, dictated by climatic changes, in which to undertake certain activities, an injury or illness sustained at these times can have serious consequences for the success of the crop.
Many of the occupational disorders identified in this study could benefit from improvements following a participatory ergonomics approach. For example, whilst the cutlass and hoe are the traditional tools used by subsistence farmers, it is apparent that they cause many injuries, and the posture required to use them may be a contributing factor in the high incidence of back pain. Nwuba and Kaul (1986) investigated the biomechanical and physiological aspects of using short- and long-handled hoes. They found that the short-handled hoe exerted considerable spinal muscle force and associated this with the "sharp pains low in the back when hoeing". It also had a 64% greater demand in terms of work rate and 51% greater energy expenditure per unit volume of soil moved when compared with the long-handled hoe. Yet, whilst such improvements as long-handled hoes may appear to be beneficial to farmers, there may be cultural or traditional reasons why an ergonomics intervention would be resisted. Freivalds (1987) suggested that the lack of impetus for changing tool design is a resigned view arising
from the belief that no further improvement is possible to a tool which has been used by many people for many years. Johnson and O'Neill (1979) noted that many attempts have been made to introduce improved tools, such as scythes, into Africa, but that these have mostly failed; they suggested that the main reason, in broad terms, was that a participatory approach had not been taken. By applying a participatory, multi-disciplinary ergonomics approach to the causal factors of the occupational disorders identified in this paper, it is considered that accidents, injuries and ill-health can be reduced. This should result in raised work capacity, improved health and higher productivity (Elgstrand, 1985).
Acknowledgement This paper is an output from a project funded by the UK Department for International Development (DFID) for the benefit of developing countries. The views expressed are not necessarily those of the DFID.
References
Antwi, Y. 1997, Personal communication (District Director for Health Services, District Health Services, Ministry of Health, Wenchi District).
Elgstrand, K. 1985, Occupational safety and health in developing countries, American Journal of Industrial Medicine, 8, 91–93.
Fajemilehin, B.R. and Jinadu, M.K. 1995, African Newsletter on Occupational Health and Safety, 5, 38–39.
FAO, 1987, African Agriculture: The Next 25 Years (Food and Agriculture Organisation, Rome).
Freivalds, A. 1987, The ergonomics of tools. In D.J. Oborne (ed), International Reviews of Ergonomics, 1, 43–75 (Taylor and Francis, London).
GSS, 1984, Quarterly Digest of Statistics (Ghana Statistical Service).
Johnson, I.M. and O'Neill, D.H. 1979, The role of ergonomics in tropical agriculture in developing countries. In Ergonomics in Tropical Agriculture and Forestry: Proceedings of the 5th Joint Ergonomics Symposium, Wageningen, 125–129.
Mohan, D. 1987, Injuries and the poor worker, Ergonomics, 30(2), 373–377.
Nogueira, O.P. 1987, Prevention of accidents and injuries in Brazil, Ergonomics, 30(2), 387–393.
Nwuba, E.I.U. and Kaul, R.N. 1986, The effects of working posture on the Nigerian hoe farmer, Journal of Agricultural Engineering Research, 33, 179–185.
O'Neill, D.H. 1997, Participatory ergonomics with subsistence farmers. In S.A. Robertson (ed), Contemporary Ergonomics 1997, Proceedings of the Ergonomics Society 1997 Annual Conference (Taylor and Francis, London), 232–237.
Rainbird, G. and O'Neill, D. 1993, Work-related Diseases in Tropical Agriculture: A Review of Occupational Disorders Affecting Agricultural Workers in Tropical Countries (Silsoe Research Institute).
AUTHOR INDEX
Alexander, P. Andersen, D.M. Apperley, M. Atkinson, G. Atkinson, T.
87 2, 41 274 208 404
Baber, C. 198, 213, 338 Baker, N. 515 Banbury, S. 482 Banks, G.M. 419 Barbour, R. 274 Barzegar, R.S. 311 Benedyk, R. 587 Bethea, D. 520 Beynon, C. 56 Bezverkhny, I. 503 Birkbeck, A.E. 398 Bonner, J.V.H. 253 Bourgeois-Bougrine, S. 429 Bouskill, L.M. 510, 540 Brigham, F.R. 8 Broek, J.J. 248 Bruijn, O.de 285 Buckle, P.W. 21 Burgess-Limerick, R. 123 Burton, A.K. 30 Cabon, P. Campion, S. Carter, C. Cartwright, S.A. Chambers, S. Charles, P. Clarke, A. Clift-Matthews, W. Code, S. Coldwells, A. Cotnam, J. Cowieson, F. Cox, T. Crawford, J.O. Crick, J. Crowther, M. Curry, M.B.
429 179 191 96 51 366 179 316 186 208 503 92 174 101, 530 295 316 285
David, H. Davies, I.R.L. Dempsey, P.G. Desmond, P.A. Devereux, J. Dickens, A. Dickinson, C. Dillon, J. Donohoe, L. Donovan, K.J. Donnelly, D. Duggan, C. Durham, S.L.
429 295 503 451 25 198, 213 36, 46 546 404 535 424 376 66
Edlund, G. Edmonds, J. Edworthy, J. Elder, J. Esnouf, A.
186 376 258, 316 565 140
Faiks, F.S. Fearnside, P. Fernandez, J.E. Finch, M.I. Fredericks, T.K.
113 409 492, 498 388 492
Gale, A.G. 61, Genaidy, A.M. Goillau, P.J. Gorbell, T. Gough, T.G. Graham, R. Graves, R.J. Gray, M. Green, W.S. Griffin, M.J. Haigney, D. Hamilton, W.I. Harrison, R.F. Harvey, R.S. Haslam, R.A. Haslegrave, C.M. Haward, B.
456 241 419 587 236 269, 441 51, 162, 560 46 360 487 466 366 213 13 66, 77 343, 471 135
Hellier, E. Hoekstra, P.N. Hone, K. Hook, M. Hooper, R.H. Humphreys, N. Huston, R.
321 248 174 409 108 525 241
Jackson, J. Jafry, T. Jamieson, D.W. Johnson, D.M. Jones, D. Jong, A.M. de Jordan, P.W.
118 556 162 424 482 355 264, 551
Kanis, H. Kattel, B.P. Kelly, C.J. Kerrin, M. Kilner, A.R. Kirwan, B. Klein, D. 360 Kvålseth, T.O.
360, 577 498 419 174 409 280, 404
Maguire, M.C. Majumdar, A. May, A. May, J.L. McCaig, R. McConnell, A.K. McDougall, S.J.P. McGorry, R. McNeill, M. Milne, T.J. Mollard, R. Mon-Williams, M. Morris, L.A. Neary, H.T. Nevill, A. Nicholls, J.A. Nichols, S. Nicholson, P. Noyes, J.M.
269 414 191 61, 456 46 535 285 503 592 530 429 123 46 203 56 82 146 409 306, 424
Oostendorp, H.van O’Neill, D.H.
333 476, 556, 592
572
Lamoureux, T. Lancaster, R.J. Lane, R.M. Langan-Fox, J. Langford, J. Layton, S. Leaver, R. Lee, S. Leighton, D. Leung, A.K.P. Life, M.A. Lindsay, J. Livingston, R. Llewellyn, M.G.A. Lomas, S.M.
404 167 101 186 381 565 51 20 56 321 82 338 540 108 471
Macdonald, A.S. Mackay, C. MacKendrick, H. MacLeod, I.S.
264, 551 46 404 225
Paddan, G.S. Pallant, A. Parker, C. Parsons, K.C. Peijs, S. Phillips, A. Piras, M. Plooy, A. Ponsonby, J. Porter, J.M. Porter, M.L. Powell, C. Rainbird, G. Rayson, M.P. Reed, S. Reeves, C. Reid, F. Reilly, T. Reinecke, S.M. Robertson, D.
487 156 290 510, 520, 525, 540 248 404 295 123 560 140 371 151 156, 381 393 258 151 258 56, 96, 208 113 565
Robinson, B.J. Rooden, M.J. Ross, T. Ryan, B.
476 328 446 343
Schaefer, W.F. Selcon, S. Sharma, R.M. Shaw, T. Sheldon, N. Shell, R. Siemieniuch, C.E. Sinclair, M.A. 203, Somberg, B.L. Stanton, N. Starr, A.F. Stary, C. Stedmon, A.W. Stewart, T. Stubbs, D. Sturrock, F.
355 295 130 46 510 241 220 220 350 436 306 300 388 3 20 280
Taylor, R.G. Tesh, K.M. Thornton, G.
466 72 118
Tilbury-Davis, D.C. Totter, A.
108 300
Van Schaik, P. Vaughan, G.M.C. Verbeek, M. Vink, P.
253 220 333 355
Waterhouse, J. Watson, N. Webb, L.H. Wikman, J. Wilkinson, A. Williamson-Taylor, J.L. Wilson, T. Withey, W.R. Wogalter, M.S. Woodward, V.G. Wright, E.J.
208 46 525 230 51 582 461 510, 540 311 419 77
Yeo, A. Young, M.
274 436
Zajicek, M.P. Zhu, F.
151 515
SUBJECT INDEX
adaptation age agriculture air-to-air combat air traffic management allocation of functions 3D anthropometry attention attitudes audit, ergonomic audit, stress auditory distraction automatic speech recognition automation automotive industry avionics
466 208, 461 290, 592 295 404, 409, 414, 419, 429 213, 220 248 456 146 371 167 482 441 436 51 306
backrest biomechanics blind users BSI
113 30 151 3
case study CEN chairs, see seating children Chinese clothing cognition cognitive representation cognitive walk-through collaboration comfort commercial planning communication concurrent engineering constraints construction consumer products see products containerisation CSCW cultural issues
77, 179, 203, 366, 371 3 92 321 510, 520, 540 225 230 333 258 140, 525 546 191, 198, 230 191 565 355 381 191 274
dancing decision support design design needs disability discomfort display design, see interface design drink distribution driver behaviour drivers, see driving driving
66 290, 424 8, 225, 264, 269, 565 560 30, 179 471
education elderly drivers engineers engineering design equipment design ergonomic application ergonomic intervention error, human error, pilot evaluation evaluation methods
130, 551 461 560 258 587 366, 371, 376 21, 135, 350 456 424 300, 419 253
fatigue financial planning fire fighters fuzzy logic
429, 451 546 530, 535 241
gloves guidelines
487 130, 446, 556
hand-arm vibration, see vibration hand tools hazard HCI health and safety see also occupational health, safety heart rate heart surgery heat HMI integration hospitals
77, 360 461, 466 436, 441, 446, 451, 456, 461, 471
503 316, 321 156 130, 174, 236, 520 530, 535 338 530 446 56, 82, 87, 162, 167, 338
ice cream icons in-vehicle telematics industrially developing countries infantryman information systems design information theory intelligent systems intelligent transportation systems interface design Internet, the ISO
503 285 446 592 398 236 572 241 441 253, 280, 285, 290, 295, 333 156 3
job design job efficiency job satisfaction
203, 213, 220, 376 198 198, 213
kinematic knee flexion knowledge requirements
113 108 280
legislation lifting see manual handling
72, 236
mail processing manual handling management of change management, line marketing material medical equipment mental models mental workload methodology
381 72, 77, 82, 87, 92, 96, 101, 118, 343, 360 203 51 546 525 587 186, 404 436 36, 179, 203, 220, 236, 253, 328, 343, 350, 376, 565 61 295, 388, 393, 398 248, 572 414 409
microscopes military models workload predictive see also mental models mouse movement musculoskeletal disorders
135 92 21, 25, 30, 36, 41, 46, 51, 56, 56, 61, 66 135, 371, 492, 498
navy neck noise nursery carers nurses see also hospitals
13 123 482 101 56, 87, 162
occupational health see also health and safety offices organisational change orthotics OWAS
41, 355, 393, 592 130, 140, 482 350 108 101
pain participatory ergonomics patient-handling perfusion personal protective clothing personnel selection physical risk factors physical activity police policy posture psychophysical psychosocial risk factors pregnancy product semantics products prototype, rapid prototype, virtual
41 355 82 338 471, 520 393 21, 25 208 471 556 61, 101, 123, 248 492 25, 30 96 264 8, 253 248 419
questionnaire design
146
railways reaction time research rivet guns risk assessment risk management
381 572 46 498 51, 56, 72, 167, 174, 366, 582 87
safety
366, 381, 476, 582
see also health and safety safety management systems sculpturing robot seat belts seating self contained breathing apparatus self report shared work spaces sheet printing shift work signal words simulator situation awareness software usability sound levels speech speed spine standards strategy stress suitability for tasks support systems survey symbols syntax system requirements
582 248 476 113, 140 535 343 258 371 208 311 338 306, 424 274 311 151 466 113 3, 8, 13, 393, 487 46 162, 167, 388, 451 300 198 269, 560 8 577 225
task analysis task demands team working teleworking thermal comfort thermal environments thermoregulatory model trackerball tractors training trunk asymmetry
230, 300 162 186, 191 174, 179 515, 525 510, 515, 520, 530, 540 515 135 476 82, 560, 587 118
usability usage centred design user trials
264, 269, 274, 253, 333 360, 577 328, 360
validation vehicle design verbal protocol analysis vibration virtual reality virtual teams vision visual strain
577 471, 476 328, 333, 338, 343 487, 492, 498 146 191 123 61
walking warnings wheelchair workload work process analysis work systems World Wide Web WRULDs
108 306, 311, 316, 321 525 376, 409 350 241 151, 156 135, 174, 503