Chemical Education Today

Commentary

Testing, Testing: Good Teaching Is Difficult; So Is Meaningful Testing

by Sidney Toby and Richard J. Plano

Teachers are learning facilitators. Charisma helps, but remembering the instructor 25 years later is not the same as remembering and understanding the material. The limitations of teaching were well stated by a charismatic teacher, Richard P. Feynman, who put it bluntly but perceptively: “If you want to learn about something, read a book. If you want to understand something, figure it out for yourself” (1). We propose an improvement in the evaluation of learning.

Multiple Choice Questions

Examinations consisting entirely of multiple-choice questions are variously regarded as fair and efficient or as props used by lazy instructors. Properly generated and utilized, multiple-choice questions are difficult to write but fast and unbiased in evaluating student performance. A major difficulty is writing incorrect but plausible answers that are neither misleading nor obvious. We believe that examinations in the sciences for large groups of students can be significantly improved by choosing a mix of questions, some multiple-choice and others requiring numerical answers. Multiple-choice questions, although imperfect, have advantages that are generally considered to outweigh the disadvantages, at least in large classes in the sciences. They have also been used successfully in small classes (2).

As a result of many years of experience in setting both multiple-choice and free-response questions, we offer the following opinion: multiple-choice questions are a pedagogic disaster if they are used merely to save time. True, such examinations may be rapidly graded, but good multiple-choice questions take much longer to set than free-response questions, and that extra time must be spent if the resulting exam is to be of high quality.
On the other hand, when free-response questions are used in large courses, the grading extends over many hours or days, the graders tire, the quality and consistency of the grading deteriorate, and the students are thereby treated unfairly.

An Improved Approach

A good compromise is an optical scanning form that allows the usual multiple-choice answers for concept questions, coupled with questions that ask students to enter numerical answers. For example, a student could enter a (correct) answer of 1.24 × 10⁶; the optical scanner would read this, and the software could be set to accept any value between 1.22 × 10⁶ and 1.26 × 10⁶ for full credit.
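The acceptance-window idea can be sketched in a few lines. The following is an illustrative Python sketch, not the authors' actual Fortran grading program; the tolerances follow the defaults mentioned in the Appendix (2% variation for full credit, 10% for partial), while the point values, function name, and the exponent/mantissa credit rule's exact form are assumptions for illustration.

```python
# Illustrative sketch (Python, not the authors' Fortran) of tolerance-based
# grading of a keyed-in numerical answer, with partial credit when only the
# exponent or only the mantissa is correct. Thresholds and point values are
# assumed, configurable defaults.
import math

def grade_numeric(answer: float, key: float,
                  full_tol: float = 0.02, partial_tol: float = 0.10,
                  full: float = 1.0, partial: float = 0.5,
                  piece_credit: float = 0.25) -> float:
    """Return the credit awarded for a numerical answer against the key."""
    if key == 0:
        return full if answer == 0 else 0.0
    rel_err = abs(answer - key) / abs(key)
    if rel_err <= full_tol:
        return full                      # within 2% of the key: full credit
    if rel_err <= partial_tol:
        return partial                   # within 10%: partial credit
    # Some credit if only the exponent or only the mantissa is correct
    # (e.g. an answer that is off by a factor of 10).
    if answer != 0 and math.copysign(1, answer) == math.copysign(1, key):
        exp_a = math.floor(math.log10(abs(answer)))
        exp_k = math.floor(math.log10(abs(key)))
        mant_a = abs(answer) / 10**exp_a
        mant_k = abs(key) / 10**exp_k
        if exp_a == exp_k or abs(mant_a - mant_k) / mant_k <= full_tol:
            return piece_credit
    return 0.0
```

With the key 1.24 × 10⁶, this accepts 1.22 × 10⁶ through 1.26 × 10⁶ for full credit as in the example above, gives partial credit nearby, and gives a smaller amount for an answer such as 1.24 × 10⁵ that is missing a factor of ten.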

Journal of Chemical Education



Partial credit could be given for answers with larger errors, or for answers that are missing a factor or have the wrong sign. It is also possible to give some credit if only the exponent or only the mantissa is correct. The limits for correct answers, and the limits and amount of credit for partially correct answers, are decided by the examiner. This approach eliminates the drawbacks of multiple-choice questions in numerical problems while maintaining the speed and accuracy of machine grading. An improved examination would mix the usual four- or five-way multiple-choice questions with numerical-answer questions in which the students key in a number that is then scanned and graded by the appropriate software.

The use of computer-graded exams that allow the input of numerical answers is hardly new: they were used more than 35 years ago with punched-card input (3, 4). We have designed a new optical scanning form that allows up to 15 numerical answers of the kind described above, together with up to 30 multiple-choice questions. These forms are used with a grading program that gives full, partial, or zero credit for numerical answers, in addition to the usual scheme for multiple-choice answers (5). A scrambling program makes it convenient to develop exams in which the question or answer order is scrambled in the multiple-choice section and the numerical questions are altered in different versions, as desired by the test writer. The students' answers are unscrambled by the grading program. We believe this provides an important tool to help minimize cheating. A portion of the answer form is shown in Figure 1, and more details of the software programs used are given in the Appendix.

We are convinced that eliminating multiple-choice questions for numerical problems represents an educational step forward. The change might not be popular with some students.
They would no longer have the (specious) crutch that, when they have done the calculation, their answer should agree with one of the given choices. In addition, the mean scores for free-response questions requiring numerical answers are likely to be lower than the scores on numerical multiple-choice questions, because the former are inherently more difficult. We believe, nevertheless, that such examinations would be a worthwhile pedagogical improvement to testing.

Acknowledgment

We thank the Office of the Vice President for Undergraduate Education at Rutgers University for funding to develop these new optical scanning forms.

Vol. 81 No. 2 February 2004



www.JCE.DivCHED.org


Appendix

Figure 1. Portion of the answer form with numerical input.

Literature Cited

1. Hanson, D. J. Chem. Educ. 2001, 78, 1184.
2. Wimpfheimer, T. J. Chem. Educ. 2002, 79, 592.
3. Frigerio, N. A. J. Chem. Educ. 1967, 44, 413.
4. Hamilton, J. D.; Hiller, F. W.; Thomas, E. D.; Thomas, S. S. J. Chem. Educ. 1976, 53, 564.
5. Toby, S.; Plano, R. J. Reducing the Tyranny of Multiple Choice Questions; 16th Biennial Conference on Chemical Education, University of Michigan, Ann Arbor, MI, Aug 2000.




The exam is usually written in TeX or LaTeX, using a program (GRTeX) that reads selected questions from a question file and normally produces four versions in which the questions and/or answers are scrambled or altered to minimize cheating. After the exam, the answer forms are scanned; in our case this was done with an Opscan5 scanner (NCS Pearson, 3975 Continental Drive, Columbia, PA 17512-9779) using the second of our programs (GREAD). A third program (GRAD) grades the answers, both numerical and multiple-choice, and presents the results in a variety of ways, including tabular and graphical. GRAD also handles different versions of the exam with the questions and/or answers in different order. In addition, we have the option of giving arbitrary partial credit for numerical answers within arbitrary boundaries; the default is a 2% variation for full credit and 10% for partial credit.

The programs are written in Fortran for Unix systems. The forms are available to faculty from the authors for the cost of postage, and the programs are available as FTP files at no cost.

Sidney Toby is in the Department of Chemistry and Chemical Biology and Richard J. Plano is in the Department of Physics and Astronomy, at Rutgers, The State University of New Jersey, Piscataway, NJ 08854; [email protected] and [email protected].
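The unscrambling step that the grading program performs can be illustrated with a small sketch. This is a hypothetical Python illustration of the general technique (inverting a recorded permutation so a scrambled version's answers line up with the master key), not the actual Fortran code; the function name and data layout are assumptions.

```python
# Hypothetical illustration of unscrambling one exam version: each version
# records the permutation used when it was generated, and the grading side
# inverts it to map the student's answer sheet back to the master key.

def unscramble(answers: list, perm: list) -> list:
    """perm[i] is the master-key index of the question printed at position i."""
    master = [None] * len(answers)
    for printed_pos, master_idx in enumerate(perm):
        master[master_idx] = answers[printed_pos]
    return master
```

For example, with perm = [2, 0, 1] (printed question 0 is master question 2, and so on), a student's sheet ['B', 'A', 'C'] maps back to ['A', 'C', 'B'] for comparison against the master key.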
