This study compares usability feedback from target learners with feedback from multimedia designers in the evaluation of a computer-based instruction software prototype about telecommunication. In the context of formative evaluations of courseware prototypes, the study provides information for defining more cost-effective evaluation strategies and methods, and for specifying valid instruments and tools for evaluating early prototypes.
The analysis distinguishes two aspects of evaluation behavior: i) user-interface issues and ii) pedagogical dimensions. A standard instrument for measuring usability factors (QUIS 5.5b, University of Maryland) is combined with think-aloud and heuristic evaluation techniques to collect feedback from 15 target users (Engineering students enrolled in a Midwestern American university, subdivided into three groups: 5 Americans, 5 Chinese-Koreans, 5 Indian-Pakistanis) and 5 educational multimedia designers from the same university.
In addition to evaluating the courseware itself, all 5 designers were asked to evaluate the summaries of the data collected from the learners, which included quantitative reports, qualitative reports and multimedia files of the critical incidents. Designers were then asked to rate the usefulness of these summaries.
On the quantitative side of the study, descriptive statistics, non-parametric comparisons (Kruskal-Wallis and Mann-Whitney tests) and cluster techniques were applied to the answers to find patterns and contrasts. Feedback from the target learner groups was compared with feedback from the designer group, and more general trends were examined with all subjects combined. Gender comparisons were also studied.
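As an illustration of the kind of non-parametric comparison named above, the following minimal sketch computes the Mann-Whitney U statistic for two small groups of ratings. It is not the procedure used in the study: the function names `average_ranks` and `mann_whitney_u` are hypothetical, the ratings are invented for illustration, and the sketch stops at the U statistic without computing a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks for values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Invented ratings on a 1-9 scale (NOT data from the study).
designer_ratings = [5, 6, 4, 5, 6]
learner_ratings = [7, 8, 6, 7, 9]

u_designers, u_learners = mann_whitney_u(designer_ratings, learner_ratings)
print(u_designers, u_learners)
```

The smaller of the two U values is the one usually compared against a critical value; with samples as small as five per group, an exact table rather than a normal approximation would normally be used.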
On the qualitative side of the study, critical incident multimedia segments were produced for each subject, with screen and audio captures of the problems that subject encountered; navigational maps were generated for each subject; written comments about the prototype were collected; and finally a descriptive list of errors was generated, comparing the types of errors and the overlap in error detection between subjects.
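The comparison of error-detection overlap described above can be sketched with simple set operations. This is only an illustration under stated assumptions: the subject identifiers, error identifiers and the helper names `detection_counts` and `unique_to` are hypothetical and do not come from the study.

```python
def detection_counts(found_by):
    """Map each error id to the number of subjects who detected it.

    found_by maps a subject id to the set of error ids that subject found.
    """
    all_errors = set().union(*found_by.values())
    return {e: sum(e in errors for errors in found_by.values()) for e in all_errors}


def unique_to(group_a, group_b):
    """Return the errors found by someone in group_a and by nobody in group_b."""
    return set().union(*group_a.values()) - set().union(*group_b.values())


# Invented detections (NOT data from the study).
designers = {"D1": {"E1", "E2", "E3"}, "D2": {"E2", "E4"}}
learners = {"L1": {"E2", "E5"}, "L2": {"E1"}}

counts = detection_counts({**designers, **learners})
only_learners = unique_to(learners, designers)
only_designers = unique_to(designers, learners)
print(sorted(counts.items()), sorted(only_learners), sorted(only_designers))
```

Errors that appear in `only_learners` are the ones a designer-only evaluation would miss, which is the kind of overlap question the descriptive list of errors addresses.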
The study provides a number of indications for defining better and more efficient strategies for usability testing. Designers reported that the qualitative information was, in general, more useful than the information provided by the quantitative instruments. Designers tended to be more critical of both the interface aspects and the pedagogical dimensions, and found significantly more errors than the learners.
Of the three target learner groups, the American group was the most efficient in finding errors, the Indian group came second and the Chinese group third. The ranking was reversed for the grades given to the software: the Chinese group was much more forgiving when answering QUIS and the Reeves/Harmon Questionnaire. One important trend in the comparison of males and females was that females were systematically more positive about the prototype.
A major conclusion is that, practically speaking, designers were much more efficient than users when carrying out the usability evaluation, but could not completely replace learners (some errors were found only by users). Experts are better at managing the dual task of learning and critiquing a new interface while learning the content at the same time. Users generally cannot do both at once and frequently drop one of the two, most often the interface critique.
Finally, the variability of feedback within learners as well as within designers was remarkably high, which confirms a trend found in previous studies in this field. Directions for further research are discussed. Methodological considerations for further work include the relative usefulness of combining quantitative and qualitative methods, the issue of when to use designers as opposed to target learners, and the importance of gathering information from different ethnic user groups when developing software for an international audience. Gender differences are also an important topic for the agenda of future research in this field.