Educational Technology & Society 3(4) 2000
ISSN 1436-4522

Avoiding holes in holistic evaluation

Malcolm Shaw
Academic Development Manager
The Academic Registry, Room F101
Leeds Metropolitan University
Calverley Street, Leeds, LS1 3HE, UK
m.shaw@lmu.ac.uk
Tel:  +44 113 283 3444
Fax: +44 113 283 3128

Suzanne Corazzi
Course Leader, Cert. in English with Prof. Studies
Centre for Language Studies, Jean Monnet Building  Room G01
Leeds Metropolitan University
Beckett Park, Leeds, LS6 3QS, UK
s.corazzi@lmu.ac.uk
Tel: +44 113 283 7440
Fax: +44 113 274 5966

 

ABSTRACT

The paper describes the evaluation strategy adopted for a major Teaching and Learning Technology Programme (TLTP3) funded project involving Leeds Metropolitan University (LMU), Sheffield Hallam University (SHU) and Plymouth University. The project concerned the technology transfer of a web-based learning resource that supports the acquisition of Key Skills from one of the Universities (LMU) to the others, and its customisation for these new learning environments.

The principles that guided the development of the evaluation strategy are outlined and the details of the methods employed are given.  The practical ways in which this large project approached the organisation and management of the complexities of the evaluation are discussed. Where appropriate, examples of the sort of procedures and tools used are also provided.

Our overarching aim in regard to evaluation was to take a thorough and coherent approach that was holistic and that fully explored all the main aspects in the project outcomes. The paper identifies the major issues and problems that we encountered and the conclusions that we have reached about the value of our approach in a way that suggests its potential usefulness to others operating in similar circumstances.

Keywords: Holistic evaluation, Project evaluation, Organisation & management of evaluation, Triangulation in evaluation


Introduction

The Key to Key Skills project is funded under HEFCE’s TLTP3 initiative and involves collaboration between Sheffield Hallam University (SHU), Leeds Metropolitan University (LMU) and Plymouth University. The overarching aim of the project is to:

Implement a web-based system ... (which) will enable academic staff to plan, deliver and assess key skills curriculum provision and enable students to identify and meet their personal needs for key skill development.  It will be flexible to account for different HE structures and cultures and will be trialled in a wide variety of ways and with different disciplines, to produce case studies or example routes through the system.

The system was originally developed at Leeds as an internal project over a two year period. Integral to the success of the initiative was the close collaboration between academic and technical staff.  The system is web-based and supports learning by providing students with on-screen guidance and references (for example to books, journals, audio, video, software) including some that are available in paper-based hard copy format. Simple diagnostic tests called Skills Checks, developed at SHU, have also been included to enable students to self-diagnose their levels in specific skill areas.
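
By way of illustration only, the sketch below shows one way such a self-diagnosis might be structured: a handful of self-rating statements whose average maps to a suggested starting point in the guidance. The skill area, statements, rating scale and thresholds are all our own assumptions for the purpose of illustration, not a description of the actual Skills Checks.

```python
# Illustrative sketch of a self-diagnostic "Skills Check" (the real system is
# web-based and is not reproduced here). Students rate themselves against a
# set of statements and the average maps to a suggested starting point.

SKILL_CHECKS = {
    "Information and Research": [
        "I can identify suitable keywords for a literature search",
        "I can locate journal articles using the library catalogue",
        "I can judge the reliability of a web source",
    ],
}

def suggest_guidance(skill_area: str, ratings: list[int]) -> str:
    """Map self-ratings (1 = not confident ... 4 = very confident) to advice.
    The thresholds below are assumed for illustration."""
    statements = SKILL_CHECKS[skill_area]
    if len(ratings) != len(statements):
        raise ValueError("one rating per statement is expected")
    average = sum(ratings) / len(ratings)
    if average < 2:
        return f"Start with the introductory guidance for '{skill_area}'."
    if average < 3:
        return f"Review the intermediate subtopics within '{skill_area}'."
    return f"Browse the further resources listed under '{skill_area}'."

print(suggest_guidance("Information and Research", [2, 1, 3]))
```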

The original project team at Leeds designed the system around eight main themes: Study Skills, Teaching and Learning, Information and Research, Assessment, Group Skills, Personal Development, Employability Skills and Using IT. Sheffield, in contrast, was able to use a more contemporary set of skills themes based closely on the Qualifications and Curriculum Authority (QCA) model. On entering the site, users see the eight themes on the left-hand side (Table of Contents) along with the Skills Checks. Clicking on one of the themes reveals the subtopics within that theme, enabling the user to navigate through the contents to areas of particular interest (see Figure 1).

Figure 1. The skills web site

 

The subtopics themselves are further subdivided into smaller components, allowing quite detailed searching. Also presented are on-screen guidance and references to materials; in some cases the materials are denoted by a red filing cabinet symbol, indicating that they are in paper format and can be obtained from filing cabinets located in all the Learning Centres at LMU. From any page, users can click on any of the other subtopics within the theme or move to other themes. The tool bar contains three sets of tools: the Contents Tools, Window Tools and Navigation Tools, which enable the student to use the system effectively.
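
To make this structure concrete, the sketch below represents the hierarchy as a small nested data structure with a flag for paper-based resources (the red filing cabinet items). The example subtopic, components and resource titles are invented; this is our shorthand for illustration, not the system's internal representation.

```python
# Illustrative sketch of the content hierarchy: themes contain subtopics,
# subtopics contain smaller components, and individual resources may be
# flagged as paper-based (shown on screen by the red filing cabinet symbol).
# The subtopic, components and titles below are invented examples.

site = {
    "Study Skills": {
        "Note taking": {
            "components": ["Linear notes", "Pattern notes"],
            "resources": [
                {"title": "Guide to note taking", "paper_based": False},
                {"title": "Workbook in Learning Centre", "paper_based": True},
            ],
        },
    },
    "Using IT": {},
}

def list_paper_resources(site: dict) -> list[str]:
    """Return titles of resources only available in hard copy (filing cabinets)."""
    titles = []
    for theme, subtopics in site.items():
        for subtopic, detail in subtopics.items():
            for resource in detail.get("resources", []):
                if resource["paper_based"]:
                    titles.append(f"{theme} / {subtopic}: {resource['title']}")
    return titles

print(list_paper_resources(site))
```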

Sheffield has successfully imported the Leeds system, modified it and populated it with resources and materials more appropriate to the SHU environment. The system has also been transported successfully to Plymouth where a shortened version of the Sheffield system concentrating on writing skills, but customised for the Plymouth environment, has been evaluated.

  

The Approach to Evaluation

The main outcomes for the project are stated as:

  • A web-based system to help HE students and academic staff to identify needs and locate relevant materials and other support mechanisms designed to develop key skills.
  • Materials to help Universities set up, adapt and populate the system.
  • Training packages to help those involved to use the system.
  • All outcomes will be based on a thorough evaluation which will consider: system usability, effectiveness in supporting student learning and in supporting staff/institutions, cost effectiveness, efficiency and robustness.

The timetable for completion was two years. 

Thorpe (1993, p.7) has identified a set of nine typical purposes of evaluation in open learning contexts:

  1. measurement of achievement of objectives of a programme as a whole
  2. judging the effectiveness of course or materials
  3. finding out what the inputs into a programme were – number of staff, number and content of contact hours, time spent by the learner, and so on
  4. ‘mapping’ the perceptions of different participants – learners, tutors, trainers, managers, etc.
  5. exploring the comparative effectiveness of different ways of providing the same service
  6. finding out any unintended effects of a programme, whether on learner, clients or open learning staff
  7. regular feedback on progress towards meeting programme goals
  8. finding out the kinds of help learners need at different stages
  9. exploring the factors which appear to affect the outcomes of a programme or service

It was fairly apparent to us that our project, as defined, reflected all of these purposes to a greater or lesser extent.  This wide range of purposes suggested to us that no single approach to evaluation was remotely likely to satisfy the needs of the project and we rapidly adopted an eclectic approach in our search for and development of a comprehensive and holistic evaluation strategy. 

We found ourselves very much in sympathy with the illuminative evaluation approach first outlined by Parlett and Hamilton (1972), who question the wholesale use of scientific approaches to educational evaluation, with their attendant trappings of scientific experimentation, control of variables, psychometric testing and statistical manipulation. They suggest the appropriateness of what we have come to know as the more qualitative approaches to evaluation. In a brief description of illuminative evaluation, Oliver & Conole (1998) identify the three critical stages of observation, inquiry and explanation, and highlight its essentially pragmatic approach, which is closely related to the naturalistic paradigm and emphasises the use of triangulation strategies.

With regard to the value of triangulation, Breen et al. (1998), in their investigation of the IT learning environment in a large modern university, suggest a number of potential benefits:

  • it helps to ensure adequate coverage of all aspects in the evaluation focus
  • it can help to fill in gaps that might occur if relatively few methods are used
  • findings from one method that may be difficult to interpret can often be viewed and resolved in the light of findings from other methods
  • it can enhance the validity of findings (which is often the only purpose cited)

We also found ourselves almost equally committed to notions of formative as well as summative evaluation. Formative approaches seemed relevant not least because we had a period of two years which, whilst very short in relation to what we had originally hoped to achieve, was sufficient for a phased approach in which piloting could influence the subsequent major field trials. It was also our intention to explore the processes as well as the products of our project and to allow early experiences to influence the development of our system, our materials and our approaches as we proceeded. Consequently we kept in mind the possibilities for formative evaluation identified by Tessmer (1993), such as expert review, one-to-one, small group and field tests.

Pawson and Tilley (1997) review the emergence of constructivism as a significant movement in educational evaluation.  With its emphasis on process rather than product, its concern for the needs and views of different stakeholders from their own particular perspectives and in naturalistic settings, constructivism appeared also to chime with what we felt were essential perspectives in our own project. So our concern was not to impose a tightly prescribed set of conditions for the use of the system on our course leaders.  We were more concerned to allow courses to exploit the system as they saw appropriate for their own identified purposes and to evaluate the contribution of the system within those naturally occurring conditions. In fact our early trials enabled us to begin to classify and categorise these modes of utilisation tentatively into four reasonably discrete models – see Figure 2. These models are the refined versions that were carried forward into the field trial stage in order to help us gain additional insights into our data.

1. Optional model
  • system recommended by tutors/course documents
  • student is left to search it out and use at own discretion
  • no formal training/introduction to system
  • no formal relationship with specific parts of course
  • peripheral, not essential

2. Directed model
  • system recommended by tutors/course documents
  • students get formal intro/training in use of system
  • students are directed to system from time to time
  • not identified strongly with specific parts of course
  • functions as background, not essential

3. Semi-integrated model
  • system recommended by tutors/course documents
  • students get formal intro/training in use of system
  • students are directed to system from time to time
  • system identified with specific parts of course (eg in workbooks or tutorials or assignments)
  • system is an important component of module(s)

4. Fully integrated model
  • system recommended
  • formal introduction/training provided
  • students directed to system at appropriate points
  • tutors associate system with key skills
  • system fully integrated within course/module delivery (eg in workbooks and tutorials and assessed assignments)
  • use of system is essential for successful completion of assessment tasks
  • tutors are likely to contextualise system for course/subject

Figure 2. Models of Utilisation
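
As a compact reading of Figure 2, the sketch below derives a model label from a few yes/no features that distinguish the four models (formal introduction, direction to the system, association with specific parts of the course, and whether use is essential for assessment). It is an illustrative simplification, not an instrument used in the project.

```python
# Simplified sketch: derive a mode-of-utilisation label from the features that
# distinguish the four models in Figure 2. A reading of the models for
# illustration only, not a project instrument.

def classify_utilisation(formal_intro: bool,
                         directed: bool,
                         tied_to_course_parts: bool,
                         essential_for_assessment: bool) -> str:
    if essential_for_assessment and tied_to_course_parts:
        return "Fully integrated model"
    if tied_to_course_parts:
        return "Semi-integrated model"
    if formal_intro and directed:
        return "Directed model"
    return "Optional model"

# Example: students are introduced to the system and directed to it, but it is
# not tied to particular modules or to assessment.
print(classify_utilisation(formal_intro=True, directed=True,
                           tied_to_course_parts=False,
                           essential_for_assessment=False))
```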

 

Additionally, even a rapid consideration of our major stakeholders revealed to us an impressive list that includes the managers of TLTP, participating Universities, course tutors, students, members of the Project Team (academic, technical and administrative) and the wider HE community. A careful distinction was made between the roles of these stakeholders as providers and/or consumers of evaluative data. So, for example, students could be considered primarily as providers of evaluation data, and TLTP as mainly consumers, whereas the project team provides data and consumes it, especially in the context of formative evaluation.

So where did all this lead us?  It suggested that, for example:

  • we should have a clear notion of the full range of purposes for the evaluation associated with our project
  • we should consider very carefully the full range of stakeholders - whether providers or consumers of evaluative data
  • we should employ a range of appropriate evaluative methods from whatever paradigm, which would also provide triangulation benefits
  • we should consider formative as well as summative requirements for evaluation
  • we needed to consider processes as well as products
  • we needed to communicate this information about evaluation effectively, albeit selectively, across the range of stakeholders

The need to make decisions about a range of evaluative variables in relation to a range of diverse purposes seemed to point us towards a matrix approach as a convenient way of tackling the complexity and detail of the study.

 

The Experience

It should be appreciated that the matrix approach was adopted at an early stage in the project, both as an organiser and as a communication tool. For brevity it is only possible to include here a section of our main planning matrix (Figure 3), but this should be sufficient to illustrate our approach. In practice it can be combined with means for scheduling tasks in some rational way, as we mention later. Illustrated here is a set of evaluative purposes around system development, portability and customisation, which are at the heart of our project. Not included in this part of the matrix are a raft of detailed concerns around the pedagogic development of the system, as well as those dealing with the effectiveness of the project management and organisation, but this is not to imply that they are of any less importance.

It should be noted also that technical and ‘academic’ perspectives are carefully balanced in the focus column, as they are in all aspects of this project.  This has led to a careful consideration of evaluation of the technical as well as the pedagogic features of the system. One outcome of this has been a consideration of the earlier work of Ravden and Johnson (1989), Johnson (1992) and of the INUSE Project (1996) which is promoted through the work of the European Usability Support Centres - in the UK, based at the Universities of Glasgow and Loughborough.

Our concern for the stakeholder approach is illustrated within the actualisers and the target groups. Responsibilities for the development of instrumentation, data collection, analysis and interpretation were allocated explicitly to specific individuals. In our experience this allows individuals to utilise their particular skills, encourages real task sharing and avoids unnecessary duplication. It also helped to assure the validity of the processes and outcomes of our evaluation strategy.

PURPOSE 1: Testing system portability
Focus: technical aspects of system installation
Actualisers: technical team at LMU
Target groups: project team
Methods: observation; problem logging
Outcomes: technical reports

PURPOSE 2: Populating and customising the system
Focus: materials input to system
Actualisers: academic and technical teams at SHU and Plymouth
Target groups: project team
Methods: observation; problem logging
Outcomes: guidelines (templates)

PURPOSE 3: Piloting system prototypes (formative)
Focus: system effectiveness; materials effectiveness; staff and student briefing; support strategies; organisation and management
Actualisers: technical developers; academic team
Target groups: project team; course tutors; students; steering group; collaborating groups; external evaluators
Methods: observation; questionnaires; interviews; focus groups
Outcomes: reports; papers; system modifications; improved guidelines; improved application

PURPOSE 4: Field trials (summative)
Focus: system effectiveness; materials effectiveness; staff and student briefing; support strategies; organisation and management; cost benefit; course integration
Actualisers: technical developers; academic team; course tutors
Target groups: project team; course tutors; students; steering group; collaborating groups; external evaluators
Methods: observation; questionnaires; interviews; focus groups; case study
Outcomes: final project report; finalised systems; papers; finalised technical guides; finalised system guides

Figure 3. Matrix of Evaluation Strategies
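
In practice, a matrix of this kind can also be held as structured data so that per-purpose summaries or checklists can be generated for circulation. The sketch below encodes the field trials column of Figure 3 in that spirit; it is a minimal illustration of the idea, not the planning tool actually used in the project.

```python
# Illustrative encoding of one column of the evaluation planning matrix
# (Figure 3). Holding the plan as data makes it easy to generate per-purpose
# summaries or checklists; this is a sketch, not the project's actual tool.

evaluation_matrix = {
    "field trials (summative)": {
        "focus": ["system effectiveness", "materials effectiveness",
                  "staff and student briefing", "support strategies",
                  "organisation and management", "cost benefit",
                  "course integration"],
        "actualisers": ["technical developers", "academic team", "course tutors"],
        "target_groups": ["project team", "course tutors", "students",
                          "steering group", "collaborating groups",
                          "external evaluators"],
        "methods": ["observation", "questionnaires", "interviews",
                    "focus groups", "case study"],
        "outcomes": ["final project report", "finalised systems", "papers",
                     "finalised technical guides", "finalised system guides"],
    },
}

def summarise(purpose: str) -> None:
    """Print a compact checklist for one evaluation purpose."""
    entry = evaluation_matrix[purpose]
    print(f"Purpose: {purpose}")
    for row_label, items in entry.items():
        print(f"  {row_label.replace('_', ' ')}: {', '.join(items)}")

summarise("field trials (summative)")
```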

 

The wide range of identified purposes and the range of different foci and target groups led to a carefully identified, but tightly controlled, set of evaluative methods and instruments. Methods were chosen for their appropriateness but also with an eye to the size of the samples involved. Field trials were held at the three participating institutions, involving more than 1000 students in total, mostly first-year full-time undergraduates.

Data acquired early, through observation, was concerned with the technical issues that confronted or impeded students. Observation at SHU, around student induction to the system, was undertaken with the use of a structured schedule (see Figure 4 for an extract). Observations within the field trials typically raised issues around the accessibility of the system for specific student groups: overseas, disabled and inexperienced students.

Observation Checklist

1. Which of the following did tutors explain and/or demonstrate? (please tick more than one if appropriate)
  • Where to find the system
  • What the icons at the top mean
  • What the scratch pad is and how to use it
  • What a skill check is
  • Why self-evaluation is important
  • Access to guidance without using a skill check
  • What guidance looks like
  • That it is a source of further resources

2. How long did students have to familiarise themselves with the system?
  • No time (no access available)
  • Less than 5 minutes
  • Between 6 and 15 minutes
  • Between 16 and 30 minutes
  • More than 30 minutes

3. Where students had access to the system, did they do any of the following? (please tick more than one if appropriate)
  • Not applicable
  • Work individually
  • Work in pairs
  • Work in small groups
  • Complete a Skill Check
  • Make notes in the Scratch Pad
  • Make notes on paper
  • Print out sections
  • Use help
  • Look at further resources

4. Were any technical problems encountered by students? (please tick more than one if appropriate)
  • No problems encountered
  • Yes, getting into the system
  • Yes, system locked/crashed
  • Yes, problems with the scratch pad

Figure 4. Extract from observation schedule

 

We limited our data collection from students by questionnaire to a single comprehensive instrument that allows the exploration of a wide range of issues but attempts to prevent questionnaire overload.  A significant section of the instrument samples student opinion, through 4-point Likert scales, on a range of variables including learnability, helpfulness, navigability, quality of interface, controllability, speed, likeability, and workload. Figure 5 gives an indication of the format of a relevant part of the questionnaire. Student reaction was generally favourable with modal responses of 3 across almost all items.
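
For summary reporting, item-level responses on such 4-point scales can be reduced to their modal (most frequent) value. The short sketch below shows how a modal response per item might be computed; the item wordings and response data are invented for illustration and are not taken from the actual questionnaire.

```python
# Minimal sketch: modal (most frequent) response per questionnaire item on a
# 4-point scale (1 = strongly disagree ... 4 = strongly agree). The item names
# and responses below are invented purely for illustration.
from collections import Counter

responses = {
    "The system was easy to learn": [3, 3, 4, 2, 3, 3],
    "Navigation was straightforward": [3, 4, 3, 3, 2, 3],
}

def modal_response(scores: list[int]) -> int:
    """Return the most common rating; ties resolve to the first encountered."""
    return Counter(scores).most_common(1)[0][0]

for item, scores in responses.items():
    print(f"{item}: mode = {modal_response(scores)}")
```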

Student feedback by questionnaire has been supplemented through the use of student focus groups to allow open and consensual responding and some possibility of triangulation of student feedback. Students were asked to identify the following: what they liked and disliked about the system, areas for improvement, benefits to their learning, disadvantages of using the system and advice for other users. Feedback from the student focus groups was generally positive around aspects such as ease of use, ease of access, and the content being helpful, informative and wide ranging. There were still some critical comments about the amount of on-screen information and the overall appearance.


Figure 5. Extract from Student questionnaire

 

Course tutors were involved in the evaluation processes with students, but in particular they were the focus of structured interviews (see Figure 6). Data that emerged from these interviews provided the following insights:

  • the flexibility and continuous availability of the system is particularly appreciated, in that it can be used as an independent resource or embedded in the course
  • the system would benefit from being more interactive
  • the system is perceived as an investment for the future, acting at both physical and conceptual levels
  • the system has contributed in helping to overcome students’ fears, and in aiding motivation
  • perceived savings in resources and in staff and student time

 

SKILLS CHECKS

  1. Do you think that skills checks are potentially useful devices for enabling students to:
     • identify their needs more effectively and efficiently?
     • identify the needs of their courses more effectively and efficiently?
  2. How did the skills checks feature in your course delivery?
  3. Are there other contributions that you think the skills checks might make?
  4. What do you consider were the main limitations of the skills checks?
  5. Can you suggest any improvements that could be made in their design?

THE SYSTEM

  1. Have you personally found the system useful? In what ways?
  2. What contribution did you intend the system would make to your course?
  3. How did you attempt to integrate the system into your teaching?
  4. What features of the system do you think helped you with integration?
  5. What could be done (by tutors/system) to make the system easier to integrate?
  6. What do you think have been the main benefits to your course of using the system?
     Prompts:
     • savings in amount of course/tutor time
     • easier/better access to resources for students
     • savings in student time
     • more motivating student experience
     • more effective student learning process
     • improved student work relating to key skills application (how do you know?)
  7. What do you consider to have been the main drawbacks (costs) in using the system?
     Prompts:
     • tutor preparation/development time
     • difficulty in accessing computers
     • extra demands by students for tutor support
     • extra time demands on students
     • student browsing and time wasting
  8. How should the system be developed to improve it? Do you intend to continue to use the system? Would you recommend its wider use in your Dept/School/Faculty?

Figure 6. Structured Staff interview

 

Tutors were also tasked with producing a case study of their experience with the project to an agreed template (see Figure 7). There were careful briefing sessions for tutors on case study content. The case studies are of interest in their own right, but they are also a crucial and prime source of data (along with the tutor interviews) on such matters as modes of utilisation, cost-benefits and learning effectiveness.

CASE STUDY TEMPLATE

Course details: title; level; mode; module(s)

Student details: numbers; gender; age profile; levels of web expertise

Curriculum focus:
  • skills on which students were directed/advised to focus
  • approximate course time (study hours) allocated
  • period and pattern of student engagement (please differentiate class contact and independent study)
  • how students actually used the system

Mode of utilisation:
  • optional/directed/integrated/contextualised
  • variations from the above models
  • details of how the system was integrated in the course
  • if, how and where the skills are assessed

Commentary on questionnaire results

Results of observation: technical issues; content issues

Results of focus group discussions

Points on which evaluation data sources achieve a measure of agreement

Identification of benefits of the system:
  • what worked well (what didn't work)
  • what you'd do differently next time

Discussion of costs of using the system (time and resources)

Suggestions for improvement of the existing system

Suggestions for further development and application

Figure 7. Case Study Template
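
Because case studies from three institutions had to be collated, it can help to think of the template as a fixed record. The sketch below expresses Figure 7 in that spirit as a simple data class; the field names paraphrase the template and the example values are invented, so it should be read as an illustration rather than as a project artefact.

```python
# Sketch: the case study template (Figure 7) expressed as a fixed record so
# that submissions from the three institutions can be collated consistently.
# Field names paraphrase the template; the example values are invented.
from dataclasses import dataclass, field

@dataclass
class CaseStudy:
    course_title: str
    level: str
    mode_of_utilisation: str          # optional / directed / integrated / contextualised
    student_numbers: int
    skills_focus: list[str] = field(default_factory=list)
    study_hours_allocated: float = 0.0
    benefits: list[str] = field(default_factory=list)
    costs: list[str] = field(default_factory=list)
    suggested_improvements: list[str] = field(default_factory=list)

example = CaseStudy(
    course_title="Illustrative first-year module",
    level="Level 1",
    mode_of_utilisation="integrated",
    student_numbers=45,
    skills_focus=["Information and Research", "Using IT"],
    study_hours_allocated=6.0,
)
print(example.mode_of_utilisation, example.skills_focus)
```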

 

Issues

There are typical and characteristic practical problems that can be expected when attempting to undertake the evaluation of major projects across a number of institutions.  They would include the following, all of which we have experienced to a greater or lesser extent:

  • communication between the various partners to agree strategies for effective evaluation;
  • bridging the divide between academic and technical evaluation;
  • adequate piloting before the field trials;
  • balancing evaluation of the processes and the products - necessity for on-going monitoring and logging of issues to formatively inform the final product;
  • evaluation in different cultures – including the extent to which evaluation processes and instrumentation are standardised across the collaborating institutions and can allow data to be aggregated;
  • assuring the reliability and validity of data across institutions;
  • effective scheduling, timing of delivery and phasing of evaluation methods to harmonise course/institution requirements with those of the project across the collaborating  institutions;
  • technical issues including access, technical support, technical staff time and problems with software and hardware;
  • coping with diversity of applications - tutors involved in the evaluation are using the system for different purposes and this can conflict with the evaluators’ attempts to standardise their approach;
  • evaluating cost-benefits -  a notoriously difficult area in educational settings;
  • coping with partial/incomplete data sets;
  • striking a balance between getting people involved, with the attendant benefits of  ownership and commitment, and co-ordinating the resulting increased complexity.

In our experience it was probably the technical issues around technology transfer that were most challenging and least predictable, although it is fair to say that problems around co-ordinating and scheduling the major field trials across the three universities, to a tight timescale, ran a close second.

 

Conclusions

In addressing the above issues, we are able to draw a number of conclusions that we feel are worth sharing.

The matrix approach to planning and organising has provided not only an explicit overview of the total evaluation task but has also led to an extended model that allows the consideration and inclusion of matters of logistics and personal responsibility, such as:

  • the actualisers - those responsible for administering the instrumentation;
  • the targets - students and staff involved in providing the team with data;
  • those responsible for collecting and analysing data;
  • timing and phasing of instrumentation;
  • required format of data to ease analysis;
  • deadlines for receipt of data to enable adequate time for analysis, interpretation and editing for final report.

The matrix also acted as an early warning device in alerting the team to potential problems. In particular it helped to enable the effective identification and communication of responsibilities at group and individual level, within and across the three universities. It suggested the need for setting up specialised subgroups to deal with particular aspects of the project (viz. a technical subgroup and an evaluation subgroup). All of this helped with the acknowledgement of team members' expertise and with their sense of ownership and commitment.

Detailed integration of the evaluation of technical dimensions across all aspects of the project has resulted in a strong commitment to evaluation from all team members. Particularly effective has been the formative development of technical features of the system. This was also reinforced through the regime of working meetings that quickly became a part of our modus operandi. General operational meetings were held at about six-weekly intervals, focussing on action planning to explicit deadlines. These meetings were supplemented at various critical phases with occasional meetings of the technical and evaluation subgroups.

A strong emphasis throughout on formative evaluation has allowed issues to be addressed in an iterative way, resulting in a more fully developed and acceptable ‘product’ than might have been expected over the relatively short life cycle of the project (2 years).

The use of a carefully selected range of methods has allowed the various benefits of triangulation, as discussed earlier, to be realised. We do appear to have covered all aspects that we intended, with very few holes in our data. Views from students around on-screen information and the value of paper based resources, which were not always clearly articulated, have been clarified by data contained in the staff interviews and case studies.  In addition, the validity of much of our data has been enhanced through its feedback to us from more than one evaluation source.
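
The kind of cross-checking described here can be made mechanical: collate findings by theme, record which evaluation sources contributed to each, and flag any theme supported by a single source as a potential hole to revisit. The sketch below illustrates the idea with invented findings and source labels; it is not a tool we used in the project.

```python
# Sketch of a triangulation check: collate findings by theme and flag themes
# supported by only one evaluation source as potential "holes" to revisit.
# The themes and sources below are invented for illustration.
from collections import defaultdict

findings = [
    ("amount of on-screen information", "student questionnaire"),
    ("amount of on-screen information", "staff interview"),
    ("value of paper-based resources", "case study"),
]

by_theme = defaultdict(set)
for theme, source in findings:
    by_theme[theme].add(source)

for theme, sources in by_theme.items():
    status = "corroborated" if len(sources) > 1 else "single source - follow up"
    print(f"{theme}: {status} ({', '.join(sorted(sources))})")
```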

In summary, our concern for evaluation in naturalistic settings, involving a wide range of stakeholders and with the need for clarity of purpose and careful organisation across large institutions, has led us to experiment with matrix models in our attempts to define accurately the evaluative methods, processes and tasks. It seems to have worked well, leaving us with very few unpluggable holes, and we would suggest that this approach may have something to offer colleagues operating in similar settings.

 

References

  • Breen, R., Jenkins, A., Lindsay, R. & Smith, P. (1998). Insights through Triangulation. In Oliver, M. (Ed.) Innovation in the Evaluation of Learning Technology. London: University of North London, 151-168.
  • INUSE (1996). http://www.npl.co.uk/inuse (information on usability standards and design now found at http://www.usability.serco.com/).
  • Johnson, P. (1992). Human Computer Interaction: psychology, task analysis, and software, Maidenhead: McGraw Hill.
  • Oliver, M. & Conole, G. (1998). The Evaluation of Learning Technology – an overview. In Oliver, M. (Ed.) Innovation in the Evaluation of Learning Technology. London: University of North London, 5-22.
  • Parlett, M. R. & Hamilton, D. F. (1972). Evaluation as Illumination, Occasional paper No. 9, Edinburgh, UK: University of Edinburgh, Centre for Research in Educational Sciences.
  • Pawson, R. & Tilley, N. (1997). Realistic Evaluation, London: Sage.
  • Ravden, S. J. & Johnson, G. I. (1989). Evaluating Usability of Human-Computer Interfaces – a Practical Method, Chichester: Ellis Horwood.
  • Tessmer, M. (1993). Planning and Conducting Formative Evaluations, London: Kogan Page.
  • Thorpe, M. (1993). Evaluating Open and Distance Learning, Harlow: Longman.
