The Use of Pretests for Screening Examinations

Archie S. Wilson, Department of Chemistry, and Paul W. Fox, Department of Psychology
University of Minnesota, Minneapolis, MN 55455

Objectively scorable, multiple-choice examinations traditionally have not been selected for use by chemistry instructors. This situation has begun to change, however, and interest in objective examinations has been heightened by the feasibility of computer generation of examinations from stored test-item data banks and the advent of computer-assisted instructional procedures. Objective tests offer the advantages of standardized scoring and great convenience, especially in larger classes, though it may be asserted that they do so at the cost of loss of information about the problem-solving processes engaged in by students in arriving at their answers (an apparent strong point of "blue-book" exams).

Whatever the merits of this particular argument, it should be recognized that examination scores on objective, multiple-choice tests may be difficult to interpret unambiguously, regardless of traditional test-item reliability measures. For example, if students score well on a particular item (say 85% correct), the instructor may draw the natural conclusion that the course has been effective in enabling most of the students to learn the material represented by that type of item. Unfortunately, at least two alternative interpretations becloud such a conclusion: (a) students may have learned the material, but not in the course under consideration! That is, they may have known the material in question before the course even began, on the basis of earlier work. Even more humbling from an instructor's point of view is a second possibility: (b) an intelligent person, without any special knowledge of the subject being tested, can figure out the answer on the basis of the way the item is worded. What one may be assessing, then,
is general intelligence and test-taking "skill" rather than acquisition of knowledge of a particular subject. Because of this possibility, it is especially important that the choices or foils on multiple-choice items appear to be equally plausible; if they are not, corrections for guessing, if made, may be seriously incorrect.

The main point of the present paper is the idea that pretests may be used for identifying specific items which are unsuitable for either of the above reasons. The pretest, as we have used it,¹ is an ungraded, objective test given to a class at the beginning of the term. The items are those which relate to the subject areas to be learned during the term. In a sense, the pretest is a preview of attractions to come and gives students an example of the type of material to be tested on later examinations. The potential problem of biasing performance on the final exam by using the same items on the pretest may be effectively circumvented in several ways. One is to sample, from the larger pool of items to be screened, subsets which are
Journal of Chemical Education
then used on the pretests of different groups of students from the entire class. Moreover, and in agreement with Hartley,² we have found that for present purposes it makes little difference on final exam scores whether the same items were pretested or not.

Students are informed on the pretest that they should not be concerned if the questions and choices have little meaning to them, since they are not expected to know this material. All students are urged to make a choice on each question, even though it may be a pure guess. Under these circumstances, the responses to the choices for a question should be random if the item is ideally constructed and the students are unfamiliar with the subject of the test.

In an investigation designed to test the ideas noted above, 25 items (chosen in part from a trial national examination developed by a committee of the American Chemical Society) were used in a pretest administered during Spring Quarter, 1978, to students at the University of Minnesota. The students were enrolled in the second quarter of a two-quarter general chemistry sequence for science-oriented, but non-chemistry, majors. The pretests were printed in two forms with identical questions in scrambled orders. One form was given to students in odd-numbered seats and the other to students in even-numbered seats. All questions were multiple-choice in format, with one correct answer and three foils.

The results of the pretesting revealed a number of cases where otherwise carefully prepared items were seriously flawed by one or both of the interpretive problems noted above. For example, on five of the 25 pretest items, more than 60% of the students chose the correct answer, a hit-rate far higher than expected or desirable at the outset of the course. To illustrate the sort of leverage pretest screening affords both in item revision and eventual exam interpretation, two examples of problems (a) and (b) above will be described here in some detail.
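The screening idea described above can be sketched in a few lines of Python. With one correct answer and three foils, pure guessing should give each choice about 25% of the responses. The sketch flags an item when its pretest hit-rate exceeds a cutoff and also computes a chi-square statistic against a uniform spread of responses. The function names, the 0.60 cutoff (borrowed from the "more than 60%" observation above), the 5% significance level, and the response counts are all illustrative assumptions; the authors describe no software and no formal significance test.

```python
# Hypothetical screening helpers; names, thresholds, and counts are assumptions.

# Chi-square critical value for 3 degrees of freedom at the 5% level.
# It applies only to four-choice items (four categories, df = 3).
CHI2_CRIT_DF3_05 = 7.815


def chance_rate(n_choices):
    """Expected proportion correct if every student guesses at random."""
    return 1.0 / n_choices


def chi_square_uniform(counts):
    """Chi-square statistic comparing observed counts with a uniform spread."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def screen_item(counts, correct_index, hit_threshold=0.60):
    """Summarize pretest responses for one four-choice item."""
    hit_rate = counts[correct_index] / sum(counts)
    chi2 = chi_square_uniform(counts)
    return {
        "hit_rate": hit_rate,
        "chi2": chi2,
        "suspect": hit_rate > hit_threshold,      # too many correct before instruction
        "nonrandom": chi2 > CHI2_CRIT_DF3_05,     # responses not spread evenly
    }


# Made-up counts for two four-choice items answered by 160 students each.
near_ideal = screen_item([42, 38, 41, 39], correct_index=0)
flawed = screen_item([8, 12, 130, 10], correct_index=2)

print("chance rate:", chance_rate(4))
print("near-ideal item:", near_ideal)
print("flawed item:", flawed)
```

An item flagged this way is only a candidate for review: a high pretest hit-rate does not by itself say whether the cause is prior knowledge (a) or a clue in the wording (b).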
Example 1 (below) was correctly answered by 62% of a class of 162 on the Spring 1978 final examination. At first glance, the item suggests fair student performance. However, when data from the pretest are made available, we see that 65% of the students actually selected the correct answer (alternative B) even before instruction began!

Presented at the Great Lakes Regional Meeting of the American Chemical Society, Rockford College, Rockford, IL, June 1979.
¹ Fox, P. W., Wilson, A. S., and Alden, D., "Procedures for the Diagnosis of Instructional Outcomes," J. CHEM. EDUC., in press.
² Hartley, J., "The Effect of Pretesting on Post-testing Performance," Instructional Science, 2, 193-214 (1973).
Example 1
What is the solubility of AgCl ($K_{sp} = 1.56 \times 10^{-10}$) in 0.1 M NaCl?

                                  Pretest Responsesᵃ    Final Responses
A.  $1.25 \times 10^{-5}$ M             (8%)                (25%)
B.  $1.56 \times 10^{-9}$ M             (65%)               (62%)

ᵃ Averaged for two groups (159 students).

Excluding the shift to foil A, which indicates partial but not complete learning of the solubility product concept, the humbling conclusion might be that increased learning about this topic was essentially nil. In order to test alternative interpretations (a) and (b) above, Example 1 was consequently revised as shown below. This time, the question required recognition of the proper equation to use in the calculation, rather than the calculation itself. The revised test item was used in a follow-up pretest and final exam for a Winter 1979 offering of this class, with considerably different results.

Example 1 (revised)
What is the equation which expresses the solubility (S) of AgCl ($K_{sp} = 1.56 \times 10^{-10}$) in 0.1 M NaCl? S is in moles/liter.

                                            Pretest Responsesᵃ    Final Responses
A.  $S = (1.56 \times 10^{-10})/(0.1)$            (28%)               (50%)
B.  $S = (0.1)\sqrt{1.56 \times 10^{-10}}$        (30%)               (11%)

This time, the correct response on the final was chosen by 50% of the students whereas only 28% were correct on the pretest. The pretest response profile was thus almost "ideal" and the final response both more encouraging and more informative than the initial version of the question.

Consider now an example (see Example 2) not just of problem (b) but of (a) and (b) combined (which probably represents a more typical case in collegiate or secondary school instruction). Example 2 was correctly answered by 92% of a class of 162 on the Spring 1978 final examination. As with Example 1, however, the pretest data also showed an unusually high percentage of students selecting the correct response (85%).

Example 2
Which statement most accurately describes the behavior of a catalyst?

                                                                  Pretest Responsesᵃ    Final Responses
A.  A catalyst decreases $\Delta G$ of a reaction and hence
    the forward rate.                                                   (1%)                 (3%)
B.  A catalyst reduces $\Delta G$ of a reaction and hence the
    temperature needed to produce products is reduced.                  (8%)                 (3%)
C.  A catalyst reduces the activation energy for a reaction
    and hence increases the rate of the reaction.                       (85%)                (92%)
D.  A catalyst shifts the equilibrium constant of a reaction
    and hence the final product concentrations.                         (4%)                 (2%)

ᵃ Averaged for two groups (159 students).

Thus, to help us decide between alternative interpretations (a) and (b), the item was reworded to delete the phrases beginning with "and hence" from the foils as shown below. The primary effect of this rewording was to delete the phrase ". . . increases the rate . . ." from alternative C, a phrase which was considered a strong clue to the correct answer. Once again, the revised item (Example 2, revised) was used with a Winter 1979 class.

Example 2 (revised)

                                                                  Pretest Responsesᵃ    Final Responses
A.  A catalyst decreases $\Delta G$ of a reaction.                      (illegible)          (4%)
B.  A catalyst reduces $\Delta H$ of a reaction.                        (illegible)          (2%)
C.  A catalyst reduces the activation energy for a reaction.            (48%)                (84%)
D.  A catalyst shifts the equilibrium constant of a reaction.           (38%)                (10%)

ᵃ Averaged for two groups (109 students).
On the final examination, the correct response was chosen by a heartening 84% of the class, whereas the correct response on the pretest was chosen by 48% of the students. Thus, Example 2 (revised) depicts a class of items about which students appear to possess some knowledge at the beginning of the course, as well as showing a substantial improvement as a consequence of instruction.

In brief, high scores on objective final examination items may not be unambiguously ascribed to what the students have learned from a course. The use of pretests provides a comparison point prior to the start of formal instruction on test items of interest. As the examples above make clear, pretests may also help the instructor differentiate between material which the students know on the basis of earlier experience and good performance based on flaws in the wording of test items.
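The comparison point that a pretest provides can be made concrete by tabulating the percent-correct figures reported in the text for the four items. The short Python sketch below does only that arithmetic; the names `ITEMS`, `raw_gain`, and `summarize` are illustrative, and the simple difference in percentage points is an assumed summary measure, not one the authors define. Note that the original and revised items were given to different classes (Spring 1978 and Winter 1979), so the rows are not strictly comparable.

```python
# Percent of students choosing the correct answer (pretest, final),
# as reported in the text for each item.
ITEMS = {
    "Example 1": (65, 62),
    "Example 1 (revised)": (28, 50),
    "Example 2": (85, 92),
    "Example 2 (revised)": (48, 84),
}


def raw_gain(pre, post):
    """Change in percent correct from pretest to final, in percentage points."""
    return post - pre


def summarize(items):
    """Return (name, pretest %, final %, gain) for every item."""
    return [(name, pre, post, raw_gain(pre, post))
            for name, (pre, post) in items.items()]


for name, pre, post, gain in summarize(ITEMS):
    print(f"{name:<22} pretest {pre:>3}%  final {post:>3}%  gain {gain:+d} points")
```

Read this way, the original items show almost no change from pretest to final, while the revised items show gains of 22 and 36 percentage points, which is the contrast the examples are meant to illustrate.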
Acknowledgment
This research was partially supported by the University of Minnesota, Consulting Group on Instructional Design.
Volume 59
Number 7
July 1982