cyberSlang: the ultimate instant online encyclopedia

Evaluating Instructional Software

Review and Critique of Current Methods

Robert A. Reiser and Harald W. Kegelmann (1994)

Explained by Chris


Who does the evaluating?
What is the nature of the evaluation process?
What features are evaluated?
How reliable are the evaluations?
How might the evaluation process be improved?

The many key features that software evaluators and evaluation organizations employ do not necessarily say much about the instructional effectiveness of the evaluated software package. More valid information results from incorporating students as participants in the evaluation process.

After a review of current evaluation practice, the authors describe their method of involving students as participants in software evaluation.

In order to conduct this review, the authors examined
- more than 30 journal articles that describe or critique various evaluation procedures,
- the evaluation procedures of 18 software evaluation organizations.


The market for instructional software is growing, and teachers and school administrators are in trouble: they can no longer review all the programs themselves.
But educators must be able to select, and have their students use, software that is instructionally effective.

Software evaluation organizations are helping out; Neill & Neill (1992) identified more than 30 of them.
The mission of those evaluation organizations is to review and assess the quality of instructional software and to share the results of that evaluation with educators.

The procedures that are used vary widely across organizations:
- the actual evaluation process,
- the types of individuals conducting the evaluation,
- the types of criteria used during evaluation.

The authors differentiate between two types of evaluation:

  • evaluation or review by the individual teacher looking for an appropriate instructional software: to become familiar with the software, to decide whether to use it;
  • evaluation or critique done for a software evaluation organization: so that other educators can make informed decisions (the focus here).


  • Who does the evaluating?
    Recommended and employed individuals:
    - teachers,
    - subject matter experts,
    - media specialists,
    - school administrators,
    - target group students.

    Very few models suggest that students serve as evaluators:

  • Callison and Haycock (1988)
  • Jolicoeur and Berger (1988a, 1988b)

    Two roles that students may play in the evaluation process:
    - students as evaluators: performing the same evaluation functions as the other evaluators,
    - students as participants in the evaluation process: other evaluators observe the students and draw conclusions, ask them to share their opinions, and assess how much they have learned as a result of using the software.

    The authors believe that there is much benefit in using students as participants.

    How many individuals should evaluate a particular software program?
    Up to three people, reviewing the software independently or working together during the evaluation process.

    How are evaluators trained?
    - Future evaluators are asked to rate a sample piece of software and then share their ratings with a group of trainers (Owston and Dudley-Marling, 1988) - for a better understanding of the rating scale they will be employing.
    - Other evaluators receive at least 20 hours of training and are checked for the accuracy of their ratings on a number of sample programs (Micceri, Pritchard, and Barrett, 1989) before they are asked to conduct formal evaluations.


  • What is the nature of the evaluation process?
    Depending on the procedure, evaluators may:
    - go through a software program in the same way students would;
    - observe students as they work their way through the program (classroom tryouts);
    - work through the program first and then observe students as they do so.

    Individual and overall ratings
    Most procedures involve having evaluators use a rating form to evaluate each of a variety of features. These individual ratings may be weighted in order to arrive at an overall rating of the program.
    In other cases, the overall rating simply represents the rater's subjective overall impression and is not directly derived from the ratings of individual factors.
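    The weighting scheme described above can be sketched in a few lines of Python; the feature names, ratings, and weights below are hypothetical illustrations, not taken from any actual rating form:

```python
# Sketch of a weighted overall rating. The features and weights are
# invented for illustration; real rating forms vary widely.

def overall_rating(ratings, weights):
    """Combine individual feature ratings (e.g. 1-5 Likert values)
    into a single weighted overall score."""
    total_weight = sum(weights[f] for f in ratings)
    return sum(ratings[f] * weights[f] for f in ratings) / total_weight

ratings = {"content": 4, "technical": 5, "documentation": 3}
weights = {"content": 0.5, "technical": 0.3, "documentation": 0.2}

print(round(overall_rating(ratings, weights), 2))  # → 4.1
```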

    Types of rating scales
    - Rating each feature of a software program on a Likert-type scale, indicating the degree to which the feature is present.
    - Simply identifying whether a feature is present or absent.
    - Using a combination of the two approaches.
    - Reviewing a program holistically (instead of rating individual features) and reaching an overall conclusion based on these impressions.


    What features are evaluated?
    The authors found a wide variety of features evaluators were asked to look at: ranging from 4 to more than 300.
    No consistent terminology across the methods can be found.

    They identified a few categories of features:
    - content
    - technical characteristics
    - documentation
    - instructional design
    - learning considerations
    - objectives of the software
    - handling of social issues
    - attitude data: What students like or dislike about a program, why they feel that way.
    - performance data, gain: How much students learn from a particular program.
    - demonstration (by the developers) of instructional effectiveness


    How reliable are the evaluations?
    The authors found that such evaluations are not very reliable:
  • Evaluators make subjective judgements about factors that are not valid indicators of instructional effectiveness.
  • Teachers' subjective ratings of software programs are not valid indicators either (Jolicoeur & Berger, 1988b).

    Teachers and students rate software quite differently
    Callison and Haycock (1988) reported a weak correlation between the teachers' and the students' ratings across 135 evaluated software programs.
    Signer (1983) found that teachers and students have different perceptions about the quality of instructional software: students are more critical than teachers!

    Subjective ratings differ greatly across individuals and groups
    Jolicoeur & Berger (1986) found a very low correlation between the two sets of ratings that two evaluation services gave the same 82 software programs.
    Micceri et al. (1989) found the reliability of ratings much lower when evaluators rated subjectively than when the same individuals used a more objective set of criteria.
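    Comparing two sets of ratings, as in the studies above, amounts to computing a correlation coefficient. A minimal Python sketch (the rating values are invented for illustration):

```python
# Pearson correlation between two evaluators' ratings of the same
# programs. Rating values below are made up for illustration.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

service_a = [4, 2, 5, 3, 1]  # one service's ratings of five programs
service_b = [3, 4, 2, 5, 3]  # another service's ratings of the same five
print(round(pearson(service_a, service_b), 2))  # → -0.42 (low agreement)
```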

    Different groups tend to consider different features to be most important
    Teachers and software developers tend to value technical aspects of software most highly, whereas persons working directly for software evaluation agencies tend to focus on issues of content and instruction (Borton & Rossett, 1989).


  • How might the evaluation process be improved?
    Observing the students
    In order to overcome the problems of subjective evaluation, the authors recommend that students serve as participants:
    - students should be observed as they work through a software program,
    - they should be observed in the classroom as well as in a controlled laboratory setting.

    Measuring the gain
    Evaluators should measure what students learn as a result of studying the program by collecting student performance data. Field testing: testing students before and after they complete a software program.

    See: the software evaluation model proposed by Reiser and Dick (1990):
    - pretesting learners
    - observing learners working through the program
    - posttesting learners
    Reiser and Dick showed several times that software rated highly by subjective evaluation techniques proved not to be very effective when tried out with target-group learners.
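    The gain measurement in this pretest/posttest model can be sketched as follows; the scores are invented for illustration:

```python
# Sketch of the pretest/posttest gain measurement that field testing
# calls for. Student scores below are hypothetical.

def mean_gain(pre, post):
    """Average per-student gain from pretest to posttest scores."""
    return sum(b - a for a, b in zip(pre, post)) / len(pre)

pretest  = [40, 55, 30, 60]   # scores before using the program
posttest = [70, 65, 55, 80]   # scores after completing it

print(mean_gain(pretest, posttest))  # → 21.25
```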

    Collecting attitude data
    Evaluators should collect attitude data from the students who have worked through the program.
    In addition, evaluators should examine student perceptions of the instructional effectiveness of software.


  • Subjective techniques do not help much in evaluating instructional software.
  • Software programs need to be tried out with students as participants.
  • The performance data collected in this way yield more valid information.


  • Conclusion
    The increasing use of computer software for instructional purposes demands a new qualification of educators: the ability to identify software that is instructionally effective. To do so, educators will rely on software evaluation organizations to provide them with the information they need.

    What can evaluation organizations do to improve their evaluation methods?

  • Incorporating students as participants in the evaluation process: observing them as they use the program, asking them to share their opinions of each of the software programs they work through.
  • Assessing how much students have learned as a result of using a particular program.


  • tidBits 111-18

    contact mail to:
    Chris Mueller (

    ++41 (0)52 301 3301 phone
    ++41 (0)52 301 3304 fax

    97 05 02