Involving Effectively Teachers and Students in the Life Cycle of an Intelligent Tutoring System
Maria Virvou, Victoria Tsiriga
Introduction
There have been criticisms of the pedagogical worth of ITSs. For example, Laurillard (1993) claims that ITS research does not offer education a clear pedagogical alternative besides individual tutoring. However, individual tutoring is still a very important feature and can be extremely useful in a real classroom, where an ITS may help the human tutor to provide more individualised help to his/her students. For example, Koedinger and Anderson (1993) report, among the benefits of the use of their geometry ITS, the following: “While students were engaged in interacting with the tutor, the human tutor was free to roam around the classroom giving extra help to poorer students who needed it or challenging better students to do more than they might otherwise. In other words, the individualised help provided by the ITS gave the teacher a greater chance to provide individualised help himself/herself.” Indeed, one of the major problems of a mathematics tutor in a class is that s/he cannot check the answers of all the students of his/her class simultaneously. Therefore, s/he can be significantly assisted by an ITS that performs individualised error diagnosis on students’ solutions.
We believe that the main criticisms of ITSs, such as that they are not useful in real classrooms or that they are boring, can be refuted by the true involvement of school teachers and students in all the phases of the life cycle of an ITS (Virvou & Tsiriga 1999a). In addition, applying evaluation methods during more than one phase of the life cycle of an ITS can further improve the effectiveness and usability of the system. Therefore, an iterative development approach is required in order to ensure multiple evaluations. The life cycle of EasyMath is based on the Rational Unified Process (Booch et al. 1997; Quatrani 1998), which is an object-oriented software engineering process that supports multiple iterations of the software development phases. This includes multiple evaluations that should take place at various stages, from the early phases until the completion of the development.
School teachers are the most appropriate resource for understanding real problems that students often face while learning. This kind of expertise can be very useful during the development of an ITS. Moreover, students can also provide helpful information both about their own cognitive ability and about the way they prefer to learn. Therefore, school teachers have been involved throughout the life cycle of EasyMath.
The idea of involving end users (such as students) and experts (such as human tutors) is not new in knowledge and/or software engineering. For example, the standard requirements analysis in software engineering involves end users; similarly, knowledge acquisition in knowledge engineering involves experts. Still, knowledge acquisition has often been referred to as a “bottleneck” in the business of designing and applying expert systems to real problems (Hayes-Roth et al. 1983). In addition, the involvement of real users and experts is often neglected in later stages of the development, resulting in systems which are not very usable.
In this paper we describe the life cycle of a system that has standard features of an ITS, but is also useful in real classrooms. We have combined methods and trends from older and recent literature concerning ITSs and software/knowledge engineering to reinforce the idea that teachers and students can play a crucial role in the usefulness of an ITS. Intelligent Tutoring Systems are traditionally considered good at diagnosing and responding to the needs of individual students. However, they have often been criticised as being mainly research products that have not been used in real classrooms. For example, Boyle (1997) reports that, at the plenary session at CAL ’93, the chairman asked the audience whether there was anyone who had seen an ITS used successfully in an educational setting, and only one person said that she had seen one, but the students found it boring.
The body of the paper is organised according to the life-cycle model of EasyMath, which is described in section 2. In section 3 we present the work that was done for the requirements capture of the system; this was an empirical study that involved both students and human teachers. We then describe the architecture of EasyMath, providing details about each one of the system’s components. The fifth section presents the methods used for the evaluation of the design of the student modelling and advice generation components. We then introduce the overall evaluation of the system, providing information about both teachers’ and students’ involvement in this phase of development. The final section discusses certain advantages and limitations of EasyMath, and sketches possible improvements and extensions for future work.
Model of life cycle of EasyMath
One of the main aims of a tutoring system is to assist the student in the learning process. However, as Jones et al. (1993) point out, there is no design process or method that can predict the outcomes or preempt all the learner’s problems; therefore, the design process must be iterative. A traditional software engineering life-cycle model, such as the waterfall model (Sommerville 1992), does not provide such flexibility. Therefore, a model of this kind does not seem suitable for the development of an intelligent tutoring system unless it is reformed to enhance iteration. An iterative approach, such as the one in the spiral model, makes it easier to accommodate tactical changes in requirements, features or schedule (Boehm 1988; Boehm 1996). However, we have chosen an object-oriented life-cycle model, the Rational Unified Process (Quatrani 1998). This model has been chosen for two reasons. Firstly, it supports multiple iterations of the software development phases, a crucial matter while working with systems that should perform usability and learning tasks in an integrated way. Secondly, it is an object-oriented methodology, which is better suited to the development of graphical user environments, such as EasyMath.
Using the Rational Unified Process, the software life cycle is broken into cycles, each cycle working on a new generation of the product. The process divides one development cycle into four consecutive phases: inception, elaboration, construction and transition.
In the inception phase, an empirical study was conducted which involved school teachers and students. This study was held in November 1997. As a result, a library of the most common students' errors was constructed. In addition, at this stage of the development of EasyMath, the teachers were asked to specify what they would like the software to do to help them in class.
The elaboration phase resulted in the description of the system architecture, along with the development of a primary executable release of the tutor. The elaboration phase was conducted in the spring of 1998. One of the school teachers participated in the elaboration phase. His role was to provide information about the kind of exercises that were encoded into the system and several requirements connected to the use of the product in real school environments. As Dix et al. (1993) point out, evaluation is an integral part of the design process and should take place throughout the design phase of the life cycle. Therefore, school teachers also participated in the evaluation of the design of the student modelling and advice generation components. In addition, two teachers along with ten students of the same grade were involved in the evaluation of the primary executable release. This evaluation was based on qualitative evaluation methods, such as observation and questionnaires. The construction of the questionnaires was based on a set of “usability” heuristics presented in Nielsen (1994). The comments made at this stage of the tutoring system development formed the basis for the refinement of the requirements as well as the construction of the final system.
In the construction phase, the prototype system that was developed in the previous phase was extended, leading to a second executable release of the tutoring system. The outcome of the construction phase is a product ready to be used by its end users. The product that resulted from this phase was also tested thoroughly in terms of its usability and functionality. For the formative evaluation of the second executable release, a set of “learning with software” heuristics was used. The results of the formative evaluation form the basis for the future development plans concerning EasyMath. This phase of the development of EasyMath started in the autumn of 1998 and was completed a year later.
An empirical study involving school teachers and students
In the inception phase of the life cycle of EasyMath, an empirical study was conducted, involving four school teachers and 240 students from eight different classes of the same grade. This empirical study was conducted in order to identify what teachers and students would like the software to do to help them in class. The findings of the empirical study were used for the construction of a library of the most common misconceptions that students have while working with algebraic powers. This knowledge was particularly useful for the design of the student modelling and the advice generation components of EasyMath.
At first, the teachers were asked to prepare a test that would cover the whole domain of algebraic powers. The test was prepared by the four school teachers in collaboration. It contained questions about every case of the calculation of powers, the conversion of numbers into powers, the various operations that can be conducted between powers and the calculation of a power raised to another power.
In the second stage of the empirical analysis, all of the 240 students were given the test and were asked to answer all of the questions. The teachers then corrected the students' answers, noting down each one of the mistakes made by the students, along with what they thought the underlying misconception was. The students' mistakes were then analysed so that as many errors as possible could be categorised. As a result, the most common mistakes were selected to be included in the bug list of EasyMath. For example, for the division of powers, the following most common error categories were identified:
The most common categories of error identified, their causes and the appropriate advice for each category were encoded into the domain knowledge, student modeller and advice generator of EasyMath.
Description of EasyMath
The architecture of EasyMath follows the main line of Intelligent Tutoring Systems (ITS) architectures. It is widely agreed that the major functional components of an ITS architecture are the domain knowledge, the student modeller, the advice generator and the user interface (Hartley and Sleeman 1973; Burton and Brown 1976; Wenger 1987; Nwana 1990). In this section, we describe EasyMath by describing each one of the components involved in the system.
The domain knowledge of EasyMath contains the explicit representation of the theoretical concepts of algebraic powers. This component mainly contains the knowledge of how to solve exercises correctly. In addition, it comprises the knowledge about the general patterns of exercises that are used during the construction of several types of exercises.
The student modelling component is the part of an ITS that manages the explicit assumptions about a student with respect to his/her interaction with the tutor. Traditionally, student modelling in ITSs has been approached in two different ways, the buggy model and the overlay model. The buggy model assumes that a student may employ incorrect as well as correct reasoning, whereas the overlay model assumes that the student may not have a complete knowledge of the domain but has no incorrect knowledge. The student modelling component of EasyMath has been based on the buggy approach. The first system that introduced this approach was BUGGY (Brown & Burton 1978). Since then it has been used in many other systems (Langley et al. 1987; Sleeman et al. 1990; Hoppe 1994). These systems reconstruct the problem-solving process and generate mal-rules from hypothesised faulty solution paths, which are used for the modelling of students’ misconceptions or procedural bugs.
As Bertels (1994) points out, from a purely pedagogical point of view, the buggy model is far superior to the overlay model, because the system is able to explain why the student made a particular error; however, it is difficult to define a complete library of bugs. In this direction, VanLehn (1988) suggests that such a library presupposes the availability of a careful analysis of students’ problem-solving behaviour. The bug library of EasyMath has in fact been constructed as a result of the empirical analysis described in Section 3.
If a student gives an erroneous answer, EasyMath tries to perform error diagnosis by generating the faulty solution procedures. If one of them is found to match the student’s answer, EasyMath gives the user the appropriate kind of advice and records the type of error identified in the student’s personal record, so that it can maintain the history of his/her progress. This record constitutes the individual long-term student model (Rich 1979; Jones & Virvou 1991). In the case of EasyMath, the most common faulty solution procedures, which were identified in students’ transcripts during the empirical study, have been encoded in the student modelling component. For example, when a student multiplies a power by a number that can be converted into a power with the same base as the first power, s/he may make the following types of error:
Figure 1a. A case of erroneous answer
Figure 1b. Another case of erroneous answer
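The diagnosis procedure described above (generate the faulty solutions, then match them against the student's answer) can be sketched in a few lines of Python. This is a minimal illustration, not the EasyMath implementation: the first operand 2^3, the `wrong_conversion` mal-rule (reading 8 as 2^4) and all function and variable names are assumptions made for the example; only the two advice messages follow the system messages quoted in this section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Power:
    base: int
    exponent: int


def exponent_of(n: int, base: int) -> int:
    """Return e with base**e == n; raise ValueError if n is not such a power."""
    if base < 2 or n < 1:
        raise ValueError("expects base >= 2 and n >= 1")
    e, value = 0, 1
    while value < n:
        value *= base
        e += 1
    if value != n:
        raise ValueError(f"{n} is not a power of {base}")
    return e


def correct_solution(p: Power, n: int) -> Power:
    # Correct procedure: convert n to a power of p.base, then add the exponents.
    return Power(p.base, p.exponent + exponent_of(n, p.base))


def wrong_conversion(p: Power, n: int) -> Power:
    # Hypothetical mal-rule (an assumption): n is read as base**(n // base),
    # e.g. 8 is treated as 2**4, and the exponents are then added correctly.
    return Power(p.base, p.exponent + n // p.base)


def multiplied_exponents(p: Power, n: int) -> Power:
    # Mal-rule: the conversion is right, but the exponents are multiplied.
    return Power(p.base, p.exponent * exponent_of(n, p.base))


# Bug library: each faulty procedure is paired with its advice template.
BUG_LIBRARY = [
    (wrong_conversion,
     "There has been an error in the transformation of number {n} into a power."),
    (multiplied_exponents,
     "The transformation of {n} into a power is correct, but there has been an "
     "error in the multiplication of the two powers. You have multiplied the "
     "exponents instead of adding them."),
]


def diagnose(p: Power, n: int, answer: Power, history: List[str]) -> Optional[str]:
    """Return advice for the answer and log the identified error type, if any."""
    if answer == correct_solution(p, n):
        return "Correct."
    for malrule, advice in BUG_LIBRARY:
        if malrule(p, n) == answer:
            history.append(malrule.__name__)  # long-term record of error types
            return advice.format(n=n)
    return None  # no encoded faulty procedure reproduces this answer


history: List[str] = []
print(diagnose(Power(2, 3), 8, Power(2, 9), history))
```

The `history` list stands in for the student's personal record; an answer that no encoded procedure reproduces simply yields no diagnosis.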
In the case where the student’s answer is of the type illustrated in [Figure 1a], the answer of the system is: “There has been an error in the transformation of number 8 into a power.” In the case where the student’s answer is of the type illustrated in [Figure 1b], the answer of the system is: “The transformation of 8 into a power is correct, but there has been an error in the multiplication of the two powers. You have multiplied the exponents instead of adding them.” The advisor is the part of an intelligent tutoring system that is responsible for the didactic strategy of the system. The advisor of EasyMath is responsible for the following tasks:
Figure 2. Exercise where the student has to provide an explicit answer
Figure 3. Multiple-choice question about the multiplication of two powers
The student, while working with EasyMath, is given three different types of exercise to select from. The first type of exercise is the one where the student has to provide an answer explicitly, using the predefined editors for the base and exponent values. An example of the first type of exercise is illustrated in [Figure 2]. The second type of exercise consists of multiple-choice questions, where the student has to choose the correct answer among the alternatives provided by the system. An example of a multiple-choice question is illustrated in [Figure 3]. The last type of exercise is a puzzle. The puzzle consists of a hidden picture which is divided into a number of squares, each of which hides an exercise about algebraic powers. In order for the picture to be revealed, the student has to answer all of the exercises correctly. This feature of EasyMath is quite important pedagogically, since puzzle-type games make the learning environment more stimulating and thus less boring (Woolf & Hall 1995). In addition, as Amory et al. (1998) point out, games appear to motivate students intrinsically and represent one of the best uses of multimedia in education.
The construction of all types of exercises in EasyMath is done dynamically using randomly generated numbers. Furthermore, the advisor makes sure that the exercises provided are different each time a student attempts one type of exercise. The construction of multiple-choice questions is based on a random number generator, the domain knowledge and the student model. The construction of a new multiple-choice question consists of a number of steps. At first, the random number generator provides EasyMath with numbers that serve as bases and exponents of the powers that form the question. For each type of question there is one correct procedure for solving the exercise, and there are a number of faulty procedures encoded in the system’s library of errors to represent common students’ mistakes.
The result of the correct procedure, contained in the domain knowledge component, is used as the correct answer to the multiple choice question. For each question, two faulty procedures among the ones encoded in the bug list are selected randomly and are used to form the erroneous answers to the multiple choice question. For example, in the case of multiplication of powers, while EasyMath is constructing a new multiple choice question, it performs the following steps:
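A minimal Python sketch of this construction procedure for the multiplication of two powers with the same base is given here. It is an illustration under stated assumptions rather than the EasyMath code: the function names, the number ranges, the third mal-rule (`multiply_bases_only`) and the retry loop that discards questions whose alternatives coincide numerically are all hypothetical. The first two feedback messages are the ones the system is reported to display.

```python
import random

PREFIX = "Identified problem in the multiplication of powers: "


def correct_product(a, m, n):
    # Correct procedure: keep the base and add the exponents.
    return (a, m + n)


def multiply_bases_and_exponents(a, m, n):
    return (a * a, m * n)


def multiply_exponents(a, m, n):
    return (a, m * n)


def multiply_bases_only(a, m, n):
    # Hypothetical extra mal-rule, added so that a random choice is possible.
    return (a * a, m + n)


# Bug list: faulty procedures paired with the feedback shown when selected.
BUG_LIST = [
    (multiply_bases_and_exponents,
     PREFIX + "you have multiplied the bases and the exponents instead of "
              "keeping the same base and adding the exponents."),
    (multiply_exponents,
     PREFIX + "you have multiplied the exponents instead of adding them."),
    (multiply_bases_only,
     PREFIX + "you have multiplied the bases instead of keeping the same base."),
]


def make_question(rng):
    """Build one multiple-choice question with one correct and two faulty answers."""
    while True:
        # Step 1: random numbers serve as the base and the exponents.
        a, m, n = rng.randint(2, 9), rng.randint(2, 6), rng.randint(2, 6)
        # Step 2: the correct procedure yields the correct alternative
        # (feedback None marks it as correct).
        options = [(correct_product(a, m, n), None)]
        # Step 3: two faulty procedures are selected at random from the bug list.
        for malrule, feedback in rng.sample(BUG_LIST, 2):
            options.append((malrule(a, m, n), feedback))
        # Assumption: retry when two alternatives have the same numeric value
        # (e.g. m = n = 2 makes m + n equal to m * n).
        if len({base ** exp for (base, exp), _ in options}) == 3:
            break
    rng.shuffle(options)
    return {"operands": (a, m, n), "options": options}


question = make_question(random.Random())
a, m, n = question["operands"]
print(f"{a}^{m} * {a}^{n} = ?")
for (base, exp), _ in question["options"]:
    print(f"  {base}^{exp}")
```

Pairing every distractor with the mal-rule that produced it means the feedback for a wrong choice is available as soon as the student selects it.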
In the example of [Figure 3], the correct answer is the second choice. However, if the student selects the first answer, then EasyMath prompts him/her with the message: “Identified problem in the multiplication of powers: you have multiplied the bases and the exponents instead of keeping the same base and adding the exponents.” If the student selects the third choice, then EasyMath prompts him/her with the message: “Identified problem in the multiplication of powers: you have multiplied the exponents instead of adding them.”
The construction of multiple-choice questions may be done in a completely automatic way by EasyMath as described above. In addition, human tutors are provided with a facility to collaborate with EasyMath in the construction of multiple-choice exercises. In particular, a human tutor may type in a new question, which is solved by the system correctly and in all possible faulty ways that EasyMath knows. Then the teacher is presented with the faulty results and algorithms so that s/he can select the erroneous answers for the multiple-choice question.
Finally, the advisor is responsible for acting when a student makes an error. In such cases, it tries to respond in the most appropriate way by informing the user about the cause of the error and showing him/her a relevant message. The advice message is selected using the knowledge of the most common student mistakes and the corresponding causes of error.
The user interface of EasyMath is a multimedia user interface, which involves graphics and sounds so that it can attract the student’s interest in the subject. Graphics in EasyMath, such as pictures and animations, are used in order to attract the students’ interest and attention. Sounds, on the other hand, are used not only to signal "obvious" errors but also to provide a different means of informing the student about his/her progress in a test, as well as about the effect of his/her actions on the system. For example, if the student provides a correct answer to an exercise, s/he is rewarded with a congratulation message.
Evaluation of the design of the student modelling and advice generation involving school teachers
The correctness of the error diagnosis and advice generation processes is considered very important for the overall performance of an intelligent tutoring system. For that reason, we evaluated the design of the error diagnosis procedure and the generation of explanations of errors, in order to ensure that the expertise incorporated in them would be acceptable to the majority of school teachers. As Dix et al. (1993) point out, a design can be evaluated before any implementation work has started, to minimise the cost of early design errors. They also state that most techniques for evaluation at this stage are analytic and involve using an expert to assess the design against cognitive and usability principles.
The evaluation was conducted by 10 school teachers before the implementation of the system. At this stage we aimed at evaluating the student modelling component in terms of its cognitive principles and the completeness of its bug list. In addition, the advice generation was evaluated in terms of the appropriateness of the explanations provided for students’ errors. Therefore, the ten school teachers were given a set of exercises in the domain of algebraic powers that had been answered erroneously. The exercises and their erroneous answers were generated using the domain knowledge and the bug list of EasyMath respectively. The exercises covered the whole material taught in schools about algebraic powers.
For each one of the exercises, each teacher had to provide an explanation of what s/he thought the underlying misconception was. This was meant to evaluate the diagnostic capability of the system. Ideally, the school teachers would give the same explanation as EasyMath for each of the erroneous answers. One problem at this stage of the evaluation process was that many teachers did not provide any explanation and just gave the correct answer to the question.
A sample of questions and answers along with the percentage of the teachers that gave the same explanation as EasyMath and the percentage of the teachers that did not provide any explanation is illustrated in [Table 1].
Table 1. Error Diagnosis Evaluation Questionnaire
The results of this evaluation showed that human teachers gave the same explanation as EasyMath to a large extent. This meant that the diagnostic capability of EasyMath was satisfactory. However, there were cases where the percentage of agreement was very low, such as in question 4 shown in [Table 1]. It was found that, in cases like this, there is an ambiguity as to the underlying cause of the error. For example, the erroneous answer 3^{9} to the exercise 3^{4} * 3^{-5} may be attributed to two different underlying causes. One explanation may be that the student subtracted the exponents instead of adding them (4 - (-5) = 9). Another explanation may be that the student did not pay any attention to the sign of the exponent of the second power (4 + 5 = 9). EasyMath considered only the first explanation, whereas most teachers considered the second. A small percentage of teachers pointed out that there could be two explanations. Questions like this revealed a limitation of the design of EasyMath in terms of ambiguity resolution. This limitation will be addressed in future executable releases of EasyMath.
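The ambiguity can be made concrete with a short Python sketch, assuming that the second exponent in the example is negative (3^4 * 3^-5), which is what the two explanations imply. The two mal-rules and the function names are hypothetical encodings for illustration only, not the EasyMath bug list. Both faulty procedures produce the same answer, so a diagnoser that stops at the first match reports only one of the possible causes.

```python
def subtracted_exponents(base, m, n):
    # Mal-rule: the exponents are subtracted instead of added.
    return (base, m - n)


def ignored_exponent_sign(base, m, n):
    # Mal-rule: the sign of the second exponent is ignored before adding.
    return (base, m + abs(n))


# Hypothetical mal-rule table: faulty procedure -> explanation.
MALRULES = {
    subtracted_exponents: "subtracted the exponents instead of adding them",
    ignored_exponent_sign: "ignored the sign of the second exponent",
}


def candidate_explanations(base, m, n, answer):
    """Return every explanation whose mal-rule reproduces the student's answer."""
    return [text for rule, text in MALRULES.items() if rule(base, m, n) == answer]


# Exercise 3^4 * 3^-5 with the erroneous answer 3^9: both mal-rules match.
for explanation in candidate_explanations(3, 4, -5, (3, 9)):
    print(explanation)
```

Collecting all matching mal-rules only makes the ambiguity visible; how to choose between the candidates is left open here, as it is in the text.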
Evaluation of the overall performance of EasyMath
In order to test the usefulness of the developed intelligent tutoring system, we conducted an evaluation study. EasyMath was tested by teachers alone and then by students in real classrooms. In the cases where students evaluated EasyMath, there was one human teacher and about 30 students in every classroom. Each student had a personal computer with EasyMath installed. The human teacher was still in charge of the tutoring process and asked his/her students to perform certain tasks using EasyMath when s/he thought it was appropriate.
In the formative evaluation of EasyMath, 10 school teachers and 240 students were involved. These teachers and students were different from the ones that participated in the empirical study and the evaluation of the design. Both students and teachers were asked to evaluate EasyMath in terms of its educational purpose. In addition, teachers were asked to evaluate the overall performance of the system and express their opinion about the usability of such an intelligent tutoring system in a real classroom.
Teachers’ involvement in the evaluation
The involvement of school teachers in the evaluation of a tutoring system was considered crucial. Human teachers usually perform some kind of error diagnosis, since they know from earlier experience some typical errors that students often make. In addition, based on their experience, teachers are the most appropriate source of information and estimates about the efficiency of the teaching strategies used by an ITS. Therefore, human teachers' contribution to the overall evaluation of EasyMath was considered of great importance.
The teachers who participated in the overall evaluation were asked to play the role of an average student interacting with EasyMath. Next, they were given a questionnaire to fill in. The evaluation questionnaires were constructed based on a set of "learning with software" heuristics, introduced by Squires & Preece (1999). These heuristics are an adaptation of the “usability heuristics” presented in Nielsen (1994), so as to relate them to socio-constructivist criteria for learning. The "learning with software" heuristics include the following:
The “learning with software” heuristics have been suggested as a method for performing predictive evaluation. This kind of evaluation typically occurs when teachers are either planning lessons or making purchasing decisions about educational software. The set of heuristics is meant to provide guidance to teachers so that they avoid superficial evaluations of educational software. In the case of EasyMath, we used this set of heuristics in order to construct a formative evaluation questionnaire, that would give the teachers insight about the integration of both usability and learning issues that need to be evaluated.
Table 2. Overall Evaluation Questionnaire
Table 3. Relation between the questions of the questionnaire and the set of “learning with software” heuristics
The questionnaire was carefully designed so that its questions relate to as many of the "learning with software" heuristics as possible. A sample of the questionnaire given to the teachers is illustrated in [Table 2]. In addition, [Table 3] shows the relation between the questions of the questionnaire and the "learning with software" heuristics.
The evaluation results showed that the multimedia interface of EasyMath provides an attractive and user-friendly environment for learning. In addition, the relevance of the advice provided by the system when the students were solving exercises was quite satisfactory. However, the evaluation results also showed a need for further improvement of the adaptation of EasyMath to the student's needs. This conclusion was reached because of the poor grading of EasyMath in the questions related to this adaptation (questions 5 and 11 of the questionnaire). One suggested improvement is the adaptation of the tests presented to the student, based on the previous interaction of the student with EasyMath. An enhancement of this kind will be addressed in a following version of EasyMath.
Students’ involvement in the evaluation
As Jones et al. (1999) point out, involving students in the evaluation of a tutoring system is crucial. They say that it will be crucial to track students' use of resource-based packages very closely to uncover the problems and successes, and that although observing and interviewing are time-consuming methods, they will need to play a large part. For that reason, students have also been involved in the evaluation of the overall performance of EasyMath.
The evaluation of EasyMath by the students was conducted in two distinct steps. The first step involved an evaluation using a questionnaire based on a set of "learning with software" heuristics. In the second step, the students used the tutoring system and then their performance in the domain of algebraic powers was tested through written tests, so as to estimate the efficiency of EasyMath in teaching the intended domain.
For the students' evaluation of EasyMath, the total of 240 students were divided into two groups. The first 120 students were introduced to EasyMath and asked to use it for about one hour. After interacting with EasyMath, the students were given the same questionnaire as the one used by the teachers. The questions included in the questionnaire were related to usability, learning, and integrated issues concerning EasyMath. The students’ answers to the questionnaire showed that they were largely satisfied with EasyMath and found it useful and easy to use. For a more detailed description of the results of the questionnaire, see (Virvou & Tsiriga 1999b).
In the second step of the evaluation phase, the remaining 120 students were taught half of the syllabus in algebraic powers without the use of EasyMath and were given a written test. Then they used EasyMath while being taught the rest of the syllabus. In the end they were given another written test, and the grades of their first and second tests were compared.
The results of this research showed that 46% of the students obtained a better grade in the second test, 43% obtained the same grade and only 11% obtained a lower grade in the second test.
Conclusions and future work
To produce a really useful ITS, there is a need to ensure the correctness of the domain knowledge, the completeness of bug lists in a student model (if the buggy approach is used), the accordance of the advising strategies of the system with those of human teachers, and the attractiveness of user interfaces. In addition, there is a need to design the ITS so that it serves the purposes of teaching and learning in real school settings, as well as to evaluate the usability of the product in real-world situations. The above goals may be achieved by involving human school teachers and students in every stage of the development process, as well as by applying evaluation methods throughout the life cycle of the ITS.
Given the results of the overall system evaluation, EasyMath seems to be a useful product for schools. The interface of EasyMath and the educational game included make the system quite attractive to students. Moreover, the student modelling component of EasyMath provides individualised support to students’ learning while they solve exercises. The explanations given by EasyMath are considered satisfactory. However, the evaluations showed that there is still a need for further improvement of the student modelling and advice generation components. For example, the student modelling component cannot cope with ambiguity resolution.
In the future we plan to adjust the exercises component of the system so as to present problems according to the student's proficiency level and to dynamically adapt the theory presentation to the student’s long-term model. In addition, the improvement of the error diagnosis component so that it performs ambiguity resolution will also be addressed in future executable releases of EasyMath. Finally, EasyMath is going to be adapted for the WWW so that it can be made available for distance learning. In this way, we will be able to evaluate it fully by involving larger numbers of teachers and students, for longer periods of time.
References
