Educational Technology – An Unstoppable Force: a selective review of research into the effectiveness of educational media.
Lecturer in Media Studies
Centre for Educational Studies
University of Hull, Hull
Let me start by rapidly reviewing the use of technology in education over the past few thousand years. First, there was language, a very powerful tool that allowed accumulated knowledge to be passed from one generation to the next, speeding up the evolutionary process. Then, about 5,000 years ago, came written language, allowing thoughts and ideas to be transmitted across almost limitless vistas of time. Again, the process of human evolution was speeded up:
"The invention of writing was highly significant for the development not only of language, but of society, and favoured the progress of commerce. It confirmed the power of the priests through the trained scribes, and even more the might and prestige of the ruler" (Singer et al., 1954).
The next major step forward was the invention of moveable type. Printing allowed a dramatic increase in the rate and volume of information distributed, and drove the European Renaissance. As many books were printed in the fifty years after Gutenberg’s invention as had been produced by the scribes of Europe during the previous thousand years. And after such a revolutionary educational invention, things did not change a great deal for 500 years. Teachers taught a variety of subjects using basically the same tools: books and writing materials. Indeed, The Visible World in Pictures (Orbis Sensualium Pictus), produced by Comenius in 1658, was still widely used in schools in the 19th century.
So, to a certain extent, teachers have always used technology, but the available technology was rather limited. With new technologies being introduced in the 19th century to boost production and aid the distribution of the products of industrialisation, there was also a demand for improvements in education to meet the need for skilled workers and clerical staff. This was met by the monitorial training system introduced by Lancaster, based on Bell’s ideas: it was a very efficient, but labour-intensive, system, capitalising on human resources rather than the introduction of machines. Indeed, while Lancaster was introducing his system of schooling, Charles Babbage and Ada Lovelace were laying the foundations for the future revolution in educational technology with their design for the precursor of the modern programmable computer, the Analytical Engine, which was designed to run programmes from punched cards.
The twentieth century has been obsessed with the idea that the new communications technologies, such as film, radio and television, should make a significant impact on education, and that the introduction of such audio and visual aids should raise the level of achievement of pupils. The main early theory governing this approach, the realism theory, was summarised by Carpenter (1953):
"Sign similarity hypothesis: that films whose signals, signs, and symbols have high degrees of similarity ("iconicity") to the objects and situations which they represent will be more effective for most instructional purposes than films whose signals, signs and symbols have low degrees of "iconicity"... Signs (or symbols) vary in "iconicity" to the degree to which they are similar to the things or situations signified. Thus, for example, sound motion pictures have potential capacities for high degrees of "iconicity" in representing objects in motion as well as reproducing authentic sound" (p. 41).
Unfortunately, most of the research comparing the new media, such as film and television, with traditional ways of teaching found no significant differences in student performance.
Early Film Research
With the advent of sound films for educational purposes, in the early 1930s, it was natural for researchers to consider the relative effectiveness of lectures and filmed recordings of such lectures. Hoban and Van Ormer’s report, subtitled Rapid Mass Learning, summarised the research available during the period 1918 to 1950.
The earliest research investigating comparisons of sound films and lecture demonstrations was conducted by Clark in 1932. Three sound films, Radioactive Substances, Liquid Air, and Characteristics of Sound, were found to be as effective as the lecture demonstrations given by regular class instructors, in tests designed to measure thinking and reasoning ability.
A later study by Hall and Cushing in 1947 investigated the difference between a sound film presentation and a lecture with enlarged illustrations, dealing with 3 science topics. No differences were found.
In an experiment conducted by Vernon (1946), information test scores of seamen who witnessed two showings of a film demonstrating two methods of taking depth soundings, a total of 50 minutes of instruction, were only 6 per cent below those of groups which had the usual instruction lasting 3 hours. Vernon concluded that an hour’s film appears to be as effective as three hours of weak oral instruction.
Hoban and Van Ormer summarised their review of the research in which films are compared with demonstrations or lectures, and concluded that films reduce instruction time and are often equivalent to good instructors. They hastened to add that this should not be interpreted as meaning that films can eliminate the need for instructors, but rather that the effectiveness of instructors of average or below-average ability can be improved (and instructional time can be saved). They also suggested that films can be projected on large screens, increasing the size of the viewing group with no loss in instructional effectiveness. Finally, they concluded that films, used alone, can offset a shortage of instructors.
Films Reduce Instruction Time:
The conclusion which continually recurs in these studies is that films reduce instruction time with little or no sacrifice of instructional results. In some of the experiments in film presentation, from one-half to two-thirds of the instructional time was saved by the use of films in place of lecture or demonstration. There thus appears to be considerable support for the Navy slogan "More learning in less time", although this may not always be true for all films.
Films are Often Equivalent to Good Instructors:
A second conclusion that is recurringly supported by the research data is that, in communicating facts and demonstrating concepts, films (or filmstrips) are about equivalent, and sometimes better than superior instructors using the best non-filmic materials at their disposal (Hoban, C.F. and Van Ormer, E., 1950).
The first thirty years of research into what were then comparatively new educational media strongly confirm that they are as effective as traditional methods. Indeed, Hoban and Van Ormer suggest that by using film recordings of above-average teachers there can be some compensation for poorer quality personnel.
Peggie Campeau’s stringent review of the literature concerning audio-visual media, in 1966, cites nine studies conducted at university, senior high school, junior high school and elementary school levels in which no significant differences in achievement were found when students were taught by either motion pictures or conventional instruction. This is confirmed by Greenhill in his introduction to a volume of abstracts on film and tv research by MacLennan & Reid (1964).
Early Television Research
Just as research was indicating quite conclusively that there is no disadvantage in studying from filmed courses, a new technological innovation was entering the mass media market place. Television, although invented in the late 1920s, was only beginning to make headway in the early 1950s, and it was natural for researchers to turn their attention to a medium which offered all the potentialities of film at a possibly lower cost, for, as Lumsdaine and May concluded in 1965, film and tv can be considered substantially identical media for many purposes.
The first research effort used television as a means of expanding the total audience for a given lecture via closed circuit television, in which the transmission was carried from the lecture room, by cable, to several locations, enabling a single teacher to communicate with many hundreds of students. In fact, at one stage, audiences of 7,000 were taught via closed circuit tv at New York University. In his report of the Pennsylvania experiment, Greenhill acknowledges that the major use of closed circuit tv was for the presentation of regular classroom instruction to students located in multiple classrooms, as a means of coping with mounting enrolments. Greenhill also argues that the standard of instruction would be raised by:
This new medium, which is substantially the same as motion film, in that it presents moving pictures with sound accompaniments, should produce substantially the same results when compared with traditional teaching. The shift towards tv, which was viewed as a panacea for all the educational ills during the mid 1950s and early 1960s, resulted in a proliferation of research studies. Stickell, in 1963, reviewed 250 comparisons of educational television and conventional face-to-face instruction from 31 research reports. Overall, 75% of the studies showed no difference, with equal percentages favouring tv or face-to-face instruction. Chu and Schramm found that, by 1968, of 421 separate comparisons taken from 207 published reports, 308 showed no difference, 63 showed tv to be superior and 50 found conventional instruction superior.
The research reviewed was so wide-ranging that they concluded that tv can be used efficiently to teach any subject matter where one-way communication will contribute to learning.
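The Chu and Schramm tally quoted above can be put on the same percentage footing as Stickell's figures. The short sketch below only restates the three counts given in the text as shares of the 421 comparisons; the function and variable names are illustrative and are not taken from the original reports.

```python
# Counts of comparisons as quoted in the text for Chu and Schramm (1968).
chu_schramm = {
    "no difference": 308,
    "tv superior": 63,
    "conventional superior": 50,
}


def shares(tally):
    """Return each outcome as a percentage of all comparisons, to one decimal place."""
    total = sum(tally.values())
    return {outcome: round(100 * n / total, 1) for outcome, n in tally.items()}


total_comparisons = sum(chu_schramm.values())
percentages = shares(chu_schramm)

print(total_comparisons)
for outcome, pct in percentages.items():
    print(f"{outcome}: {pct}%")
```

On these figures roughly three comparisons in four show no difference, which is close to the 75% reported by Stickell, with the remainder split fairly evenly between tv and conventional instruction.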
These results lead to the inevitable conclusion that course enrolment can be greatly expanded by the use of educational television and that student performance will not suffer. However, the second point referred to by Carpenter, concerning the possible improvements in instruction, does not seem to be tenable. If this is the case, what are the advantages of using film or television? The answer is: when there is no difference in performance, the most obvious measure then to come under scrutiny is the cost.
An early cost analysis is available in Greenhill’s final report on tv teaching at Pennsylvania State University, during the period 1956-57. Comparisons were made between actual costs of televised instruction and the costs that would have been incurred in courses had they been taught in the usual way. The analysis showed a total saving, in favour of the tv service, of $40,000, which represented more than the total cost of running the service. However, it was necessary to have at least 200 students per course before any savings were made. The system required large recurrent finance to keep it in operation and, if the system was not fully operational for most of the academic year, it rapidly became less cost-effective.
Radio and Audio Recordings
It is generally recognised that for teaching via radio or audio recordings essential graphic information, usually in the form of printed materials, must be provided in order to fully exploit the potential of the medium. This means that comparisons of radio and traditional teaching are often comparisons of audio plus print with traditional teaching.
Beginning in the 1920s, instructional radio was widely used in the United States and Britain, but with the advent of television its use dwindled in the US, although it has continued to be widely employed by the BBC schools service. Developing countries, however, are making increasing use of radio, its principal attraction lying in its low cost when compared with television. It is also an effective instructional medium, as much of the research confirms.
Carpenter, in 1937, prepared 15-minute radio lessons on science for pupils ranging from fourth grade to senior high school level. The results of the end-of-term examinations indicated that pupils taught by radio did as well as, or better than, those taught by conventional methods. Attitude reports from pupils showed a high degree of interest in radio lessons. Heron & Ziebarth (1946) found that radio was as effective as face-to-face instruction in college psychology courses.
Another example of the effectiveness of radio as a teaching instrument, this time in a developing country, is reported by Mathur & Neurath (1959). A total of 145 villages in Bombay state, averaging about 850 people per village, were chosen as the experimental group and were provided with radio sets. A similar number of villages without radio sets served as the control group. Twenty special farm programmes were broadcast twice a week for 30 minutes. Comparison of test results, both before and after the broadcast programmes, found a significant increase in knowledge in the radio villages, but only negligible increases in the non-radio groups. This and many other reports indicate the efficacy of instructional radio. Forsythe’s (1970) review concluded:
"Research clearly indicates that radio is effective in instruction. Experimental studies comparing radio teaching with other means or media have found radio as effective as the so-called "conventional methods". Even though radio has been criticised for being only an audio medium, studies have shown that visual elements in learning are not uniformly important. In many educational situations visuals may be more harmful than helpful. Also, the efficiency of combined audio and visual media has been challenged by studies which show that multi-channel communications may not be inherently more effective than single channel presentations."
These conclusions can be extended to include the equivalent recorded form of instruction, such as discs or tapes. Popham, in 1961, divided an introductory graduate course into two sections. In one he taught in a lecture-discussion format; in the other, he played a tape-recorded version of the lecture and then led a brief discussion period. The two groups were carefully matched on scholastic aptitude and two achievement pre-tests. Following instruction several post-tests were administered and it was found that there were no differences between the groups.
By 1969, the rapid expansion of cassette tapes was making the medium ideally suited to individualised learning and it was this aspect which was of interest to Menne, who used it in an introductory psychology course at Iowa State University. The lectures were recorded on tape and notes were taken from the blackboard material used by the instructor during the presentation of his lectures. The blackboard notes were then assembled to form a booklet. Each member of the experimental tape group was issued with a tape recorder, a complete set of lecture tapes, a booklet containing the transcribed blackboard material and a schedule of the lecture topics to be given to the lecture group. The audio-taped group was self-paced, though they were required to take 3 objective tests during the course. Information was available concerning student performance on several measures, which enabled a covariance analysis to be applied to the results for the regular exams, class points and a final grade. There were no significant differences.
Ackers and Oosthoek (1972) report a similar experimental course. The subjects in the taped group again had individual access to recorders and tapes on the subject of micro-economics, and were able to follow the course at their own speed, within certain broad limits. The authors do not elaborate on the broad limits, and we can consider the taped course to be substantially student-paced. Both groups had ample opportunity to participate in test problems, which formed an integral feature of the instruction, and were encouraged to take part in fortnightly group discussions. The performance was assessed in a June examination which, according to the authors, called for the sub-categories ‘Application’ and ‘Analysis’ of the category ‘Comprehension’ from Bloom’s Taxonomy of Objectives. The results indicated a slight advantage in favour of the tape group.
General Conclusion from early media research
Much of the early research was concerned with the new mass media, and it was clear from this research that these new approaches produced results similar to more traditional methods of teaching. Traditional approaches had a limited number of technologies embedded in them, such as books, and writing and drawing materials; the new approaches incorporated the teachers themselves, whose performance had previously been evanescent. The new media removed the need to have teachers actually present in the classroom, because they could produce a facsimile of the teacher. This represents a significant change, because prior to the introduction of the new media, if a teacher was not present, instruction could only be given through the medium of print, which needs a decoding skill, the ability to read, to be present in the student. The new media literally spoke directly to the students, and did so as effectively as if the teacher was actually present. This lies at the heart of the embedding process: the new media could actually replace teachers, although researchers usually denied that this was a possible outcome, and their results were euphemistically disguised as demonstrating that the new media could provide a means to compensate for a lack of teachers. Nonetheless, teachers were well able to perceive the threat from the new media.
Media Attribute Research
Early research into the effectiveness of media compared each new medium with traditional approaches. In addition, groups of media which differed along one dimension only were compared in order to determine the effectiveness of a particular attribute, such as visual motion or pictorial colour.
One of the earliest accounts of the experimental investigation of media effectiveness is Freeman’s (1924) Visual Education, published almost 30 years after the first public demonstration of moving pictures by the Lumière brothers. The results are not as scientifically valid as later research, but they do confirm later findings.
McClusky’s (1924) experiment represents one of the first comparisons between film and a lecture illustrated with slides. He used a film on the life history of the Monarch butterfly and compared this with two lecture conditions: a slide lecture using eight slides, each illustrating a step in the life cycle, and an oral presentation illustrated with two pictures and two blackboard sketches. Each was presented to 20 pupils in grades 6-8 in two schools and lasted 12 minutes. The results failed to show any difference between the methods.
Brown (1928) found similar results when comparing films and filmstrips for teaching factual information about the physiology of seeing to high school students. In the filmstrip group discussion was free and questions were asked both by the teacher and the students. A multiple choice test indicated a superior performance for the filmstrip group and Brown concluded that this was because of the greater exchange of comment within the teacher-paced filmstrip group.
A more satisfactory approach to determining the effectiveness of the visual motion attribute was undertaken by Twyford (1954). The topic investigated was methods of riot control, under the title Military Police Support in Emergencies, and introduced the problems of training soldiers to cope with such complex situations as restless, disturbed city populations, agitated groups, mobs and rioting crowds. The film had a Hollywood budget and expensive crowd scenes organised in an American city. The question of simpler and less expensive production methods was raised and Twyford was charged with determining the effectiveness of other methods. Twyford’s group suggested an alternative approach using stock film or newsreel coverage of riots, and even the use of still pictures if motion film was not available. The project eventually compared the Hollywood-style film with two filmograph versions, which were similar to sound filmstrips. In all three versions the soundtrack was identical. One filmograph was based on the motion film and consisted of individual still frames taken from the original. The second filmograph was made up of stock still pictures of riots taken from news libraries, or simple diagrammatic representations of troop movements. The groups of recruits were tested using a 42-question test, with 10% of the questions using pictures. The group that saw the full motion version scored 4% more than the groups that saw the filmographs, which were equally effective. The difference in performance was educationally insignificant, but the difference in cost was very substantial.
This is one of the earliest well-controlled experiments that shows that motion aids such as film and television will not automatically improve student performance when compared with simpler aids such as filmstrips. The essence of this argument is that film and tv can teach many different groups and subjects about as effectively as traditional methods, but so can simpler aids such as sound filmstrips or sound tapes with booklets, and these simpler aids cost less to produce.
Colour and Pictorial Quality
It is also worth, at this point, considering the experimental evidence concerning comparisons between colour illustrations and their counterparts in monochrome, and the effects of changing pictorial quality.
Vander Meer’s (1954) work has been described as demonstrating a rigorous methodology which sets the standard for similar studies. The first experiment involved 500 students, 14 and 15 years old. One half of the students saw colour versions of 5 films, whilst the other half simultaneously saw black and white prints made from the original colour materials. The films were commercially produced titles including: Maps are fun; How man made days; Rivers of the Pacific Slope; Snakes, and Sulphur & its compounds. Two types of tests of perceptual and conceptual learning were developed for each film: non-verbal and verbal. The results for the verbal tests indicate that in only one case was there a statistically significant result in favour of the colour film version. The non-verbal test results reverse the statistically significant results for the verbal tests, with two of the three films favouring the b & w version. However, the differences do not persist and the delayed recall test indicates no difference between the two versions.
The main conclusion reached by Vander Meer was:
"The use of colour in instructional films which may superficially seem to ‘call for colour’ does not appear to be justified in terms of greater learning on the part of those who view the films. If colour is to be used effectively in films there must be careful preproduction consideration of the probable psychological impact of specific uses of colour upon the learner."
A similar project was undertaken at Yale University (May and Lumsdaine, 1958). In the "Learning from Films" report the effects of pictorial quality and colour are considered, especially the importance of factors generally regarded as entering into the degree of polish or quality of the pictorial component of teaching films, both factors being related to the cost of producing and printing films.
The Yale team produced a colour film ‘Seasons’, which dealt with the causes of seasonal change, and was to be used to investigate the efficacy of colour instructional films. During the production phase a story board was produced as a guide for the eventual production of animated and live colour footage. The story board consisted of very crude b & w pencil sketches for each scene. In order to aid in visualising the content of the final film a so-called pencil test running reel was made by photographing these sketches on motion picture film in the planned sequence.
Before the final film became available a silent print of the pencil test version was shown to a sixth grade class, with a staff member reading the commentary. The post-test scores were later compared with the performance of a similar class who viewed the full colour version, and the result was that the learning from the crude, jerky b & w version was substantially as great as from the full colour version. These surprising results were at first accepted only as very tentative, because the groups compared were not selected to assure equivalence or provide any valid measure of error. A second experiment utilised 4 classes of fifth grade pupils, and Lumsdaine concluded that the difference between the two mean scores was so small that it could be attributed to mere chance fluctuation; in other words, it was not statistically significant.
Kanner and Rosenstein in 1960 evaluated the need for colour rather than b & w instructional television in the US Army. The report indicates that reliable colour television equipment was available, but that it cost more than monochrome equipment, although costs were expected to fall with technological developments, as indeed they have done.
The experimental study beamed eleven lessons from a mobile colour television facility into two classrooms: one group viewed the lessons on colour receivers while the other viewed them on monochrome receivers. Immediately following a lesson the subjects were tested using multiple-choice questions. Every effort was made to incorporate colour items into the tests. A total of 368 trainees took part in the experiment; pairs of subjects were matched on electronics aptitude or general technical scores and were then randomly assigned to one of the two experimental conditions.
Ten out of 11 comparisons show no significant differences and the single statistically significant result is considered to be unimportant in view of the overall picture and small differences in test performance. The overall mean scores are remarkably similar, bearing out the results of May and Lumsdaine.
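The reasoning behind discounting a single significant result among eleven comparisons can be made concrete with a little arithmetic. The sketch below assumes a conventional 5% significance level and treats the eleven comparisons as independent; neither assumption is stated in the report as summarised here, so the figure is illustrative only.

```python
# Assumptions (not taken from the Kanner and Rosenstein report):
ALPHA = 0.05          # conventional significance level for each comparison
N_COMPARISONS = 11    # number of lesson-by-lesson comparisons quoted in the text


def chance_of_spurious_result(alpha, n):
    """Probability that at least one of n independent tests is 'significant'
    purely by chance when there is no real difference in any of them."""
    return 1 - (1 - alpha) ** n


p_any = chance_of_spurious_result(ALPHA, N_COMPARISONS)
print(f"{p_any:.2f}")
```

Under these assumptions there is a better than four-in-ten chance of seeing at least one nominally significant difference even if colour made no difference at all, which is consistent with treating the single result as unimportant.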
Stephen Cox produced a survey of the research into the effects of colour in learning from film and television in 1976, and from the results of twenty or so studies concluded that overall there is no marked difference in learning from colour or black and white film or television.
Dwyer took this research forward in an attempt to improve visualised instruction. He planned and carried out his programme of systematic evaluation of the effects of a variety of pictorial types over an extended period of time, involving 100 separate studies with a total population of 23,000 students.
The first piece of research in this programme is recorded in Dwyer’s 1967 research report: Adapting Visual Illustrations for Effective Learning, published in the Harvard Educational Review. He describes an experiment which compares four different audio-visual presentations of the same material. The commentary is the same for each presentation but the pictorial image is different.
Initially, only black and white illustrations were considered. The four conditions used in the experiment were:
Dwyer also designed four individual criterial measures, which were administered in the following order: drawing, identification, terminology, comprehension tests. After the presentation of the instructional materials each student was permitted to take as much time as he required to complete one criterial measure before proceeding to the next. The important question, a re-statement of the ‘Sign Similarity Hypothesis’, was: do students learn more if illustrations are more realistic?
The results of this study indicated that when students viewed their respective instructional presentations for equal amounts of time, the simple line drawing presentation was significantly more effective in facilitating achievement than was the oral presentation without visuals on the drawing, identification, and total criterial tests. The oral presentation without visuals of the heart was found to be as effective as each of the visually complemented treatments on both the terminology and comprehension tests. Dwyer also concluded that, contrary to previously stated theories of visual communication (e.g. Carpenter, 1953), the more realistic illustrations were found to be the least effective in complementing oral instruction.
The purpose of this and other experiments was to test the hypothesis that an increase in realistic detail in visual illustrations increases the probability that learning will occur. Eventually nine slide sequences, possessing differing degrees of realistic detail and colour, were produced so that variations in visual stimuli could be assessed in terms of their ability to facilitate student achievement on five criterion measures. The results indicated that increasing the amount of realistic detail in visual illustrations does not necessarily lead to greater learning.
Why are there no differences?
There are no differences because much of what is happening in mediated instruction hardly differs from what is happening in the classroom. And the classroom is a most inefficient device for education. The reason colour, motion, or pictorial quality adds little to understanding a range of topics is that the human information processing system has processing limits and deals best with information that has been simplified. This actually matches better with the types of tests which are administered to measure learning, such as those used by Dwyer: simple line drawings are best for instruction when assessment uses simple line drawings. Travers (1964) successfully linked this to the emerging discipline which was seeking to apply information theory to psychology. He demonstrated that much of the information that is attended to by the sense organs is actually filtered out before it reaches the higher levels of cognitive processing. In many cases, as exemplified by Dwyer’s line drawings, simplification makes the world more comprehensible because it places fewer demands on the processing system: it is, by its very nature, partly processed, the extraneous information having been stripped out.
There are no differences because the information passed to the student by the teacher, the television or radio programme, book or picture, is not usually sufficiently well-adapted to the student’s needs. The information is often too much, in quantity or speed of delivery, and the student perceives only a fraction of it, and understands even less.
Can there be any differences?
There must be an unequivocal "yes" to this question. Students who are average in a class can be turned into above-average "A"-grade students, but not by merely changing the medium of instruction. Clark (1983) concluded from the research on learning from media that:
"Consistent evidence is found for the generalisation that there are no learning benefits to be gained from employing any specific medium to deliver instruction."
"Media in education" is often misconstrued as being what educational technology is all about. In fact, it is only one aspect of educational technology and should be more properly termed: technology IN education. It does not have a good record for showing significant educational gains when introduced into the classroom. However, there is a technology OF education that has been much more successful in raising levels of achievement. Its roots are in the behavioural sciences, and can be traced back to the work of Skinner and Bloom; it is associated with mastery learning methodologies. A review of the relative effect sizes of different media and methods (Spencer, 1991) shows that the Learning for Mastery (LFM) approaches advocated by Bloom (1968) and Keller’s (1968) Personalised System of Instruction (PSI) produced educationally significant results when compared to the media approaches.
The Personalised System of Instruction is a totally student-paced system, in which students can take course units at any time, and then receive immediate feedback on their performance on tests taken directly after each unit of study. The feedback is provided by proctors, who are paid students who have already completed the course of study. This approach is very similar to that advocated by Lancaster in the monitorial system. Students can only move on from one unit to the next when they have achieved a high level of proficiency in the work being studied: a criterion level of at least 80% correct is usual, and Kulik (1986) has shown that raising that criterion level to >90% produces a much larger effect size of 0.8, compared to 0.4 for 70-80%.
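To give a feel for what effect sizes of 0.4 and 0.8 mean, the sketch below uses one common definition, the difference between group means expressed in control-group standard deviations, and converts it to the percentile that the average treated student would occupy in the control group. The normality assumption, the function names and the example scores are illustrative assumptions and are not taken from Kulik's analysis.

```python
from math import erf, sqrt


def effect_size(mean_treated, mean_control, sd_control):
    """Standardised mean difference: the gap between group means
    measured in units of the control group's standard deviation."""
    return (mean_treated - mean_control) / sd_control


def percentile_of_average_treated_student(d):
    """Percentile rank, within the control group, of a student scoring at the
    treated group's mean, assuming normally distributed scores."""
    return 100 * 0.5 * (1 + erf(d / sqrt(2)))


# Hypothetical scores: control mean 60, treated mean 68, control SD 10.
example_d = effect_size(68, 60, 10)

for d in (0.4, 0.8):
    print(d, round(percentile_of_average_treated_student(d)))
```

Under these assumptions an effect size of 0.4 places the average mastery student at about the 66th percentile of a conventionally taught class, and an effect size of 0.8 at about the 79th percentile.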
Learning for Mastery, Bloom’s approach, is teacher-paced and more suitable for use in the classroom. It requires constant formative evaluation of each pupil’s performance, with support for failing students being provided by the use of audio-visual aids and other suitable remedial materials, and extra sessions with pupils and adults providing remedial help. Bloom is adamant that the tests must be diagnostic:
"For students who fail to master a given unit, the tests should pinpoint their particular learning difficulties — that is, the specific questions answered incorrectly and thus the particular ideas, skills, and processes which need additional work. We have found that students respond best to diagnostic results when the diagnosis is accompanied by very specific prescription of particular alternative instructional materials and processes they can use to overcome their learning difficulties" (Bloom, 1968).
It is interesting to compare these mastery approaches with Postlethwait’s (1972) Audio Tutorial (AT) method. Postlethwait’s system, based on the provision of audio-taped lectures, has many of the characteristics of LFM and PSI: it is student-paced within a set time for particular units of work (often one week), with students visiting the learning centre whenever they need to study; there is continual formative assessment, usually weekly, with students completing a written and an oral test; the course is broken down into small units; and students are provided with a set of learning objectives for each unit. However, there is no mastery requirement, and students can move from one unit to the next even after obtaining very low grades. When compared with traditional approaches, this system shows only a small effect size. The reason seems to be that, however flexible a system is and however carefully its instructional materials are prepared, students who do not invest sufficient mental effort in the learning process will fail to master all the material presented. If later units require full understanding of earlier materials, then those students who have previously achieved only partial understanding will inexorably develop a cumulative learning deficit, and the gap between top and bottom students will widen.
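To make the contrast with a system that merely records grades more concrete, here is a minimal sketch of the diagnose-and-prescribe step that Bloom describes for Learning for Mastery: the test identifies the questions answered incorrectly, maps them to the underlying skills, and points to specific alternative materials. The item-to-skill map, the remedial catalogue and the function names are all hypothetical, invented purely for illustration.

```python
# Hypothetical item-to-skill map and remedial catalogue (illustrative only).
ITEM_SKILLS = {
    "q1": "place value",
    "q2": "place value",
    "q3": "carrying",
    "q4": "word problems",
}
REMEDIAL_MATERIALS = {
    "place value": "worksheet with base-ten blocks",
    "carrying": "small-group session with a peer tutor",
    "word problems": "audio-visual worked examples",
}


def diagnose(answers, answer_key):
    """Group the incorrectly answered questions by the skill each one tests."""
    weak_skills = {}
    for item, expected in answer_key.items():
        if answers.get(item) != expected:
            weak_skills.setdefault(ITEM_SKILLS[item], []).append(item)
    return weak_skills


def prescribe(weak_skills):
    """Attach a specific alternative material to each skill needing more work."""
    return {skill: REMEDIAL_MATERIALS[skill] for skill in weak_skills}


answer_key = {"q1": "B", "q2": "D", "q3": "A", "q4": "C"}
pupil_answers = {"q1": "B", "q2": "A", "q3": "A", "q4": "D"}

weak = diagnose(pupil_answers, answer_key)
plan = prescribe(weak)
for skill, material in plan.items():
    print(f"{skill}: items {weak[skill]} -> {material}")
```

The design point is that the test output is a prescription rather than a mark: a pupil who misses `q2` and `q4` is told which skills to revisit and with what, instead of simply receiving 50%.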
A Synthesis for the Future?
We must begin to accept that what takes place in the classroom can be replaced by a whole host of alternative media without a deterioration in pupil performance. Research has demonstrated this across different media, subjects and age groups. If we wish to improve levels of performance, we must look to the new methodologies. Those systems that incorporate mastery learning strategies seem to offer the most hope for such improvements. Bloom and Keller have demonstrated mastery systems based on traditional media, essentially human resources backed up with written materials, but this mastery methodology can also be automated and applied to computer-based systems. Spencer (1996) demonstrated the effectiveness of such a system for teaching reading and spelling to pupils who had failed to gain literacy skills. Integrated learning systems, which manage student progress by constantly assessing performance and indicating suitable learning materials, are also following this path. At this stage results are variable, but encouraging:
"In the first phase of the evaluation pupils using SuccessMaker made learning gains in numeracy above that of equivalent control groups (an effect size of 0.4 which was equated to progress of 20 months over a six-month period" (P.14, NCET, 1996).
[An effect size of 1.0 improves an average pupil’s performance such that they achieve results previously associated with roughly the most able 16% of the class (the 84th percentile, assuming normally distributed scores); an ES of 0.25 or less is considered educationally insignificant.]
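One common way to read an effect size, sketched here under the assumption that test scores are roughly normally distributed (an assumption the text does not state), is as the percentile of the comparison class that the average treated pupil would reach. The definition used below, the difference in means divided by the control group's standard deviation, is one of several in use and is likewise an assumption; the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def effect_size(treatment_scores, control_scores):
    """Standardised mean difference: (treatment mean - control mean) / control SD."""
    return (mean(treatment_scores) - mean(control_scores)) / stdev(control_scores)


def percentile_reached(es):
    """Percentile of the control distribution reached by the average treated pupil,
    assuming normally distributed scores."""
    return 100 * NormalDist().cdf(es)


# Effect sizes quoted in this section.
for es in (0.25, 0.4, 0.8, 1.0):
    print(f"ES {es:.2f} -> percentile {percentile_reached(es):.0f}")
```

Under this model an effect size of 1.0 corresponds to roughly the 84th percentile, and an effect size of 0.25 to roughly the 60th, which gives a sense of why small effect sizes are treated as educationally insignificant.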
Table 1. Summary of Effect Sizes
We must remember that the development of sophisticated computer-based systems is in its infancy, but even so, as Table 1 illustrates, such systems are now beginning to equal the effectiveness of individual tutoring methods. It is not surprising that the computer begins to excel when it is used to provide simulations. It is inevitable that we shall see the computer competently simulating human one-to-one exchanges, passing what is sometimes known as the Turing test.
The recent chess victory of the Deep Blue computer over world champion Kasparov has fulfilled the expectations that Shannon and Weaver (1949) expressed in their theory of information, published just 50 years ago. They tentatively suggested that information theory could lead to intelligent systems tackling the supreme game of chess. Their work stemmed from the advances made in communication systems during the Second World War, and from the impact of new computer methods for breaking codes. It was also related to the concept of feedback in machines, based on work with rockets and missiles, proposed by the father of cybernetics, Norbert Wiener.
The greatest impact of Deep Blue’s achievement is likely to be in the field of education. The key to education is the assimilation of information to create new mental schemes, which enable us to look at the world anew, to go beyond the obvious. And how do we do this, how does this mental construction come about? By entering into a dialogue with the world, the physical world and the social world, as suggested by Piaget (1971) and Vygotsky (1962). One way in which we extend the capabilities of the child is to have her enter into a dialogue with an expert who can answer questions, provide hints, set expectations. The expert has traditionally been the teacher, but there has been a search for mechanical means to do the same job, just as there were searches for mechanical devices to play chess.
Skinner was the most effective early proselytiser of mechanical methods using feedback mechanisms, demonstrating his teaching machines in 1954. He argued that a country that could mass-produce washing machines and cars could surely develop a machine for providing sufficient feedback to students to enable them to reach high levels of performance, especially in basic numeracy and literacy. Programmed learning was only moderately successful. Later computerised methods, using essentially the same limited psychology, have shown themselves to be at least as effective as teachers, and in some cases even more effective.
However, the newer integrated learning systems, combining mastery strategies with the ability to provide rapid feedback and make decisions about suitable remedial materials, represent the true state of Educational Technology. But much more is to come, and it will come soon. It has already been demonstrated that artificial intelligence (AI) systems can teach, or rather tutor, as effectively as human tutors in advanced courses (Lajoie and Derry, 1993). Intelligent tutoring systems are now being used in college-level maths courses, helping students gain an understanding of the complexities of geometry proofs. These computer-based tutors, using diagnostic modelling procedures, have been found to be as capable as their human counterparts at identifying and correcting student misunderstandings, even in the complex, advanced field of avionics.
Artificial intelligence systems actually learn from experience and develop powerful rules and strategies which, when applied, may be even more effective than those used by the human experts whose skills have been tapped by the computer. Such tutoring systems will get to know their pupils and will have extracted rules and strategies for optimal methods of teaching, so that pupils will really have the very best of personal tutors, a dream that has been sought for many years. Deep Blue’s victory shows that sophisticated dialogues with an understanding, motivating, superior intelligence are achievable. There can be no doubt that, just as the filing-cabinet size of the first teaching computers was reduced, within a few years, to that of the watch on my wrist, so computers more powerful than Deep Blue will shrink to sit on a child’s desk early in the next millennium.
And teachers, what will become of them? Teachers will always be needed, because of the human touch. Their role will undoubtedly change; it may even become more rewarding. And, of course, teachers do use educational technology. They always have done: the written word, on a blackboard or in a book, represents technology which is so embedded in teaching that we hardly acknowledge its presence. The new technologies, such as artificial intelligence tutors, are just emerging. Soon they, too, will become ubiquitous, and so totally embedded within the educational context that they will become transparent, in much the same way that written communication is hardly noticed as an embedded technology today.