Usability Testing for Developing Effective Interactive Multimedia Software: Concepts, Dimensions, and Procedures

Lecturer, Department of Educational Technology, Hanyang University, 17, Haengdang-dong, Seongdong-gu, Seoul, 133-791, KOREA
FAX: +82-2-2291-9697
Email: suhlee@garam.kreonet.re.kr
* Manuscript received Sept. 24, 1998; revised Feb. 02, 1999

Introduction

Interactive multimedia is an entirely new kind of media experience born from TV and computer technologies (Cotton & Oliver, 1993). It is increasingly used for learning in schools as well as for training in corporate settings. It can be a powerful tool in the hands of performance technologists, including instructional and multimedia designers. Multimedia refers to any computer-mediated software or interactive application that integrates text, color, graphical images, animation, audio, and full-motion video in a single application (Hall, 1996; McKerlie & Preece, 1993; Northrup, 1995; Tolhurst, 1995). Multimedia software may use some or all of these modes of communication; however, it is more than a collection of multiple media. As a complex interaction of stimuli (McKerlie & Preece, 1993), interactive multimedia software aims to deliver both usability and functionality.

Usability, a key concept of human-computer interface, is concerned with making computer systems easy to learn and easy to use through a user-centered design process (Preece et al., 1994). Poorly designed computer-based interactive multimedia systems can be extremely annoying to users. Usability, equated with such concepts as ‘user-friendliness’ or ‘ease of use,’ is not a new concept, but it is relatively new to the field of computer software production and is rarely well defined (Morgan, 1995). According to Shackel (1991), the usability of a system can be defined as:
Usability has been of prime concern to multimedia software designers since IBM established the User Interface Institute (UII) in 1986. The major computer hardware and software companies, including Microsoft, IBM, Hewlett-Packard, WordPerfect, Borland, Lotus, American Institutes for Research (AIR), Apple, and DEC, have integrated usability testing into their software development processes (Dieli, 1989; Reed, 1992). Usability testing can have a significant impact on the instructional product development cycle as well as on the instructional systems design process.

Formative evaluation, sometimes called prototype evaluation or learner validation (Smith & Wedman, 1988), can be considered a theoretical base of usability testing. Formative evaluation is used to obtain data to guide the revision or improvement of instructional materials. As the process of collecting data and information in order to design and improve the effectiveness of an instructional product, formative evaluation is an essential part of developing multimedia software (Dick & King, 1994; Flagg, 1990; Laurillard, 1994; Patterson & Bloch, 1987; Skelton, 1992). Formative evaluation has been incorporated into nearly every systematic design model because it serves as the quality control component and focuses on improving cost-effectiveness throughout the product-development cycle rather than only at the end (Dick & King, 1994; Tessmer, 1994; Thiagarajan, 1991). The root of formative evaluation can be traced back to the idea of course improvement through evaluation. Cronbach (1963) defined evaluation as:
Usability testing and formative evaluation have become integrated into the instructional design process for quality improvement (Russell & Blake, 1988). The purposes of the present study were to explore the issues surrounding the usability testing of interactive multimedia software, to build a set of comprehensive dimensions necessary to conduct usability testing, and to describe basic procedures for usability testing of multimedia software. The results can provide a useful framework to help performance technologists, including instructional and multimedia designers, evaluate multimedia usability testing in the process of developing effective interactive multimedia software.

User-Centered Design and Usability Testing

User-Centered Design

User-centered design, as the process of integrating user requirements, user interface validation, and testing into standard software design methods, is an approach which views knowledge about users and their involvement in the design process as a central concern (Preece et al., 1994). The principle of user-centered design is therefore to involve users in the design decisions for a particular product, to understand the users’ needs, and to address those needs in very specific ways (Morariu, 1988; Rubin, 1994). Designers must understand who the users will be and what tasks they will do (Shackel, 1991). This requires direct contact with the actual users at their place of work.

Well-designed multimedia programs are easy to interpret and understand, and they contain visible clues to their functions, while poorly designed multimedia software lacks such clues and can be difficult and frustrating to use. Norman (1990) identifies two key principles that help to ensure user-centered design: visibility and affordance. The correct parts should be visible and convey the correct message. Affordances can provide strong clues to the operations of things.
Controls need to be visible, map well onto their effects, and suggest their functionality through their design (Preece et al., 1994). Multimedia designers must see the user as the center of the multimedia system instead of as a mere peripheral (Shackel, 1991). User-centered design implies that good system design depends upon solving the dynamic, interacting needs of the four principal components of any user-system situation: users, tasks, productivity, and usability (Dumas & Redish, 1993). Gould and Lewis (1985) describe three principles for user-centered design: (a) early focus on users and tasks; (b) empirical measurement of product usage; and (c) iterative design in the production process. To sum up, the key concepts of user-centered design are early focus on users and tasks, early and continual usability testing, empirical measurement, and integrated and iterative design.

Usability Testing

Usability can be defined as "a measure of the ease with which a system can be learned or used, its safety, effectiveness and efficiency, and attitude of its users towards it" (Preece et al., 1994, p. 722). Based upon this definition, the usability of multimedia software can be measured by how easily and effectively a specific user can use the multimedia program, given particular kinds of support, to carry out a fixed set of tasks in a defined set of environments (Chapanis, 1991). Usability testing determines whether a system meets a pre-defined, quantifiable level of usability for specific types of users carrying out specific tasks. Traditionally, software products, including information materials and multimedia software, have been evaluated by means of marketplace reviews, magazine reviews, and beta tests, but these approaches leave too little time for major modifications and improvement of products (Reed, 1992; Skelton, 1992).
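The idea of a pre-defined, quantifiable level of usability for specific users and tasks can be made concrete with a small sketch. The following Python example is illustrative only and is not taken from the literature cited here: the class names, the field names, the task names, and the choice of time and error limits as the measured quantities are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class UsabilityCriterion:
    """A pre-defined target for one task (hypothetical structure)."""
    task: str
    max_seconds: float  # slowest acceptable completion time (assumed measure)
    max_errors: int     # most errors tolerated in one attempt (assumed measure)


@dataclass
class Observation:
    """What was recorded for one participant attempting one task."""
    task: str
    seconds: float
    errors: int


def meets_criterion(obs: Observation, criterion: UsabilityCriterion) -> bool:
    """True if a single observed attempt stays within both limits."""
    return obs.seconds <= criterion.max_seconds and obs.errors <= criterion.max_errors


def pass_rate(observations: list[Observation], criterion: UsabilityCriterion) -> float:
    """Share of attempts at the criterion's task that met the criterion."""
    relevant = [o for o in observations if o.task == criterion.task]
    if not relevant:
        raise ValueError(f"no observations recorded for task {criterion.task!r}")
    return sum(meets_criterion(o, criterion) for o in relevant) / len(relevant)
```

A test team might, for instance, decide in advance that a task passes only if some agreed share of participants meet the criterion, and then compare `pass_rate(...)` with that share. Both the threshold and the measures would come from the test plan, not from this sketch.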
As the process of observing and collecting data from users while they interact with multimedia prototypes, usability testing can be used to address a system’s usability problems before it goes into production. The aim of usability testing is not in itself to solve problems or to enable a quantitative assessment of usability (Patterson, 1994). Rather, it provides a means of identifying problem areas and of extracting information concerning problems, difficulties, weaknesses, and areas for improvement. Even if usability testing reveals difficulties or faults that cannot be corrected in the model under development, the information is still important for the designers in planning a future release of the product (Chapanis, 1991; Dieli, 1989).

Usability testing may serve a number of different purposes: to improve an existing product, to compare two or more products, or to measure a system against a standard or a set of guidelines (Lindgaard, 1994). It can also be used as a comparison test, in which the usability of a product is compared against that of competitors’ products, and as a verification tool, a way to check user reaction to new features (Reed, 1992). Usability testing is concerned with ‘fitness for use of a system,’ and as such it can be a powerful instructional systems development (ISD) tool for identifying problems with the multimedia interface as experienced by the specific user rather than as designed by the instructional systems designers (Davies, 1995). Together with usability testing, rapid prototyping is beginning to emerge in the multimedia production process as a way to test design approaches and user interfaces, and it is expected to shorten the software development cycle while increasing effectiveness (Henson & Knezek, 1991; Northrup, 1995).
Reed (1992) offers the following maxims of usability for software developers: (a) design for the software end user, not for the designers/clients; (b) test the multimedia software, not the user; (c) test usability with real users early and often; (d) don’t test everything at once; (e) measure performance of real-world tasks with the software, not the functionality of the program; and (f) test for usability problems that software designers never imagined.

Dimensions of Usability Testing

Usability can be specified and tested by means of a set of operational dimensions. Usability dimensions reviewed in the literature for interactive multimedia software include: ease of learning (Guillemette, 1995; Lindgaard, 1994; Nielsen, 1990a; Reed, 1992; Shackel, 1991); ease of use (Guillemette, 1995; Nielsen, 1990a, 1990b); ease of remembering (Nielsen, 1990a); performance effectiveness (Lindgaard, 1994; Reed, 1992; Shackel, 1991); few errors and system integrity (Guillemette, 1995; Nielsen, 1990a; Reed, 1992); flexibility (Guillemette, 1995; Lindgaard, 1994; Shackel, 1991); and user satisfaction (Nielsen, 1993; Reed, 1992; Shackel, 1991). In addition to the above, Lindgaard (1994) advocates the category of usability defects, including "navigation, screen design and layout, terminology, feedback, consistency, modality, redundancies, user control, and match with user tasks" (p. 33).

It is important for multimedia designers to note that there are tradeoffs involved in user interface designs with respect to the usability parameters (Guillemette, 1995; Nielsen, 1990a). Chapanis (1991) explains that "well designed systems often simplify operations, reduce maintenance requirements, and sometimes do other good things as well" (p. 364). However, the nature and importance of these factors differ among groups of users and the tasks performed.
Based upon the results of the literature review on usability components, usability testing dimensions can be classified into five general categories: (a) learnability; (b) performance effectiveness; (c) flexibility; (d) error tolerance and system integrity; and (e) user satisfaction.

Learnability

Learnability refers to "the ease with which new or occasional users may accomplish certain tasks" (Lindgaard, 1994, p. 30). Learnability problems may result in increased training, staffing, and user support or corrective maintenance costs (Guillemette, 1995; Lindgaard, 1994; Nielsen, 1990a, 1990b; Shackel, 1991). Users should quickly be able to understand the most basic commands and navigation options and to use them to locate wanted information. In addition to having easily understood functionality, multimedia systems should be easy to remember. Casual users should have no problems remembering how to use and navigate the system after periods of non-use. Memorability can also give users the ability to transfer their knowledge of using and navigating one information base to another information base with the same engine (Nielsen, 1990a, 1990b).

Performance Effectiveness

Multimedia products should be designed to achieve a high level of productivity. Effectiveness, measured in terms of speed and error, refers to levels of user performance (Lindgaard, 1994; Shackel, 1991). After learning the multimedia software, users should become more expert at using it over time (Robertson, 1994).

Flexibility

Flexibility refers to variations in task-completion strategies supported by a multimedia system. The freedom to use a range of different commands to achieve similar goals adds to the flexibility of the system, although not necessarily to its learnability for new users (Lindgaard, 1994). The effects of flexibility may be measured by differences in performance as a function of the absence or presence of added features in the multimedia software.
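The measurement ideas above (effectiveness in terms of speed and error, and flexibility as a difference in performance with and without an added feature) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each test session is recorded as a pair of completion time in seconds and error count; the function names are hypothetical, and a real study would also need an appropriate statistical test rather than a simple difference of means.

```python
from statistics import mean


def summarize(sessions):
    """Average speed and error for a list of (seconds, errors) session records."""
    return {
        "mean_seconds": mean(seconds for seconds, _ in sessions),
        "mean_errors": mean(errors for _, errors in sessions),
    }


def feature_effect(with_feature, without_feature):
    """Performance without the feature minus performance with it.

    Positive values mean sessions with the added feature were faster or
    had fewer errors on average (an assumed, simplified reading of
    'differences in performance').
    """
    with_summary = summarize(with_feature)
    without_summary = summarize(without_feature)
    return {key: without_summary[key] - with_summary[key] for key in with_summary}
```

For example, `feature_effect(sessions_with_shortcuts, sessions_without_shortcuts)` would report how much average completion time and error count changed when a hypothetical shortcut feature was available.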
Error Tolerance and System Integrity

It is desirable that users make few errors while using a multimedia system. Design accommodations should be made so that when errors do occur, users can easily recover from them (Nielsen, 1990a; Robertson, 1994). System integrity is the prevention of data corruption or loss (Reed, 1992). For multimedia software to have high integrity, no critical errors may occur.

User Satisfaction

Multimedia software should be enjoyable to use and aesthetically pleasing to users. User satisfaction should be within acceptable levels of user cost in terms of tiredness, discomfort, frustration, and individual effort, so that satisfaction leads to continued and enhanced usage of multimedia software (Lindgaard, 1994). Motivational elements in interactive multimedia software, including typographical cueing, color, graphical images, animation, and sound, can motivate the user and increase satisfaction, but they should follow the principles of motivational design (Lee & Boling, 1996).

The general dimensions of usability testing discussed above are summarized in Table 1. Together with categories of usability defects such as those illustrated by Lindgaard (1994), they can be used to capture valuable information for improving the quality of multimedia software during production. Designers can choose the usability defect categories in terms of user, task, and environment.

Table 1. Dimensions of Usability Testing
Procedures for Usability Testing

A variety of methods/techniques for usability testing are available, and they can serve different purposes and circumstances. According to Conyer (1995), there are six typical methods for usability testing: heuristic evaluation methods; pluralistic walkthroughs; formal usability inspection; empirical methods; cognitive walkthroughs; and formal design analysis. Table 2 is a brief summary of usability testing methods/techniques. More detailed explanations can be found elsewhere (see Conyer, 1995). Each method/technique for usability testing has its advantages and limitations. These methods/techniques can be combined with the various data collection methods summarized in Table 3 (Conyer, 1995; Corry, Frick, & Hansen, 1997). Evaluation experts can choose among the methods and data collection tools according to the purposes and circumstances of the usability test.

Table 2. Methods/Techniques of Usability Testing
Table 3. Data Collection Methods for Usability Testing
Seven Basic Procedures of Usability Testing

Different methods address different purposes and involve a combination of user and usability testing. However, conducting a useful usability test takes planning and attention to detail. The following are the general procedures of usability testing for effective multimedia development (Dumas & Redish, 1993; Rubin, 1994):
The usability test plan is critical to a successful test and is the foundation for the entire test. It covers the how, when, where, who, why, and what of usability testing (Rubin, 1994). Test plan formats can vary according to the type of test and the degree of formality required in an organization; however, they should include the purpose, problem statement/test objectives, user profile, usability testing method/technique, task list, test environment/equipment, test monitor’s role, data to be collected, and so on. The test plan can also be used as a communication vehicle among the members of the usability testing team.

The selection and recruitment of participants is a crucial element of the usability testing process. Selecting and recruiting participants involves identifying and describing the relevant skills and knowledge of the person(s) who will be users of a software product. The results of usability testing will only be valid if the participants are typical end users of the multimedia software. If ‘inappropriate’ people are recruited, it does not matter how much effort you put into the rest of the test preparation: the results of the usability test will be questionable and of limited value.

For every usability test there are materials to be prepared in addition to the software you are testing. These include a screening questionnaire for participant selection, legal forms such as nondisclosure agreements and tape consent forms, an orientation script, data collection instruments, task scenarios, prerequisite training materials, a posttest questionnaire, and debriefing guides (Dumas & Redish, 1993; Rubin, 1994). Before conducting a usability test, the physical test environment and the staff who will conduct the test must be prepared. It is important to develop all required test materials well in advance of the time you will need them. After preparations are completed, the next step is to conduct the usability test.
Conducting a usability test is a demanding physical and emotional exercise. There exists an almost endless variety of sophisticated usability testing methods; however, the typical test consists of four to ten participants, each of whom is observed and questioned individually by a test monitor seated in the same room. The step-by-step testing activities of this stage can be found in other sources (see Dumas & Redish, 1993 or Rubin, 1994).

Debriefing the participant refers to the interrogation and review with the participant of his or her actions during the performance portion of a usability test. For every usability test, the goal should be to understand why every error, difficulty, and omission occurred for every participant in every session (Rubin, 1994). The debriefing session is the final opportunity to fulfill this goal before you let the participant walk out the door. The debriefing session allows you to resolve any questions remaining after a session and gets the participants to explain things that you could not see, such as what they were thinking during usability testing.

The process of compiling and analyzing data involves placing all the data collected into a form that allows you to discern patterns. The compilation of data should go on throughout the test sessions. After transforming the raw data into more usable summaries, it is time to make sense of the whole thing. For the data summary, it is important to choose data analysis methods that match the types and levels of data collected. Typically, there are two distinct processes with different deliverables for the analysis of data (Rubin, 1994). The first process is a preliminary analysis and is intended to quickly identify critical problems, so that the developers can work on these immediately without having to wait for the final report.
The second process is a more comprehensive analysis, which takes place during a two- to four-week period after the usability testing (Rubin, 1994). Its deliverable is a final, more exhaustive report.

After the data have been analyzed, a final report that focuses on solving problems and improving the quality of the interactive software should be produced. The report should include an executive summary, methods, results, findings and recommendations, and an appendix section. The final report needs to target the development team members so they can develop more effective interactive software.

Guidelines for Conducting Usability Testing

Based upon Rubin’s (1994) study, we can summarize the basic guidelines for monitoring a usability test. They include guidelines on probing and assisting the participant, implementing the ‘thinking aloud’ technique, and some general recommendations on how to work with participants during a usability test. A case study illustrates how user-centered design and usability testing can help make usable and useful multimedia software for Web sites (Corry, Frick, & Hansen, 1997).
In summary, as usability testing becomes more prominent and as more research on usability testing occurs, we will see many creative variations and improvements to usability testing methods and techniques. As knowledge about usability testing grows, practitioners will be able to choose more effective and efficient methods and techniques that are appropriate to their goals and circumstances.

Concluding Remarks: Expanding Usability

Usability testing, as an emerging and expanding research area of human-computer interface, can provide a means for improving the usability of multimedia software design and development through quality control processes. In the process of usability testing, evaluation experts should consider the nature of the users and the tasks they will perform, the tradeoffs supported by the iterative design paradigm, and real-world constraints in order to effectively evaluate and improve multimedia software. The best way to carry out usability testing is to watch and listen to real users interacting with a multimedia program in real situations. In order to do this, usability experts need to be in the field, where they can see how real users work with real multimedia software. It is the responsibility of performance technologists, especially multimedia developers, to make multimedia software simple to use and simple to understand, yet still powerful enough for the task. The issue is no longer whether to conduct usability testing, but how to conduct useful usability testing.

References