In the Classroom

Resources for Student Assessment
edited by John Alexander, University of Cincinnati, Cincinnati, OH 45221

Graded Multiple Choice Questions: Rewarding Understanding and Preventing Plagiarism

Gareth Denyer* and Dale Hancock
Department of Biochemistry, The University of Sydney, Sydney, NSW 2006, Australia; *[email protected]

Mechanized assessments, such as multiple-choice questions (MCQs) and true/false questions (TFQs), are exceptionally convenient for teachers because they allow large classes to be assessed consistently and rapidly. As academic positions are eroded and class sizes grow, the use of MCQs and TFQs is likely to escalate. Unfortunately, mechanized assessment is not without difficulties.

The Problems with Standard Mechanized Assessments

Mechanized assessments can have a negative effect on student learning. The most serious problem is that, in inexperienced hands, the format rewards surface learning (i.e., the simple recall of facts). Because students' learning strategies are closely allied to their perception of the assessment tasks, MCQs and TFQs may encourage students to take a surface approach to their learning. If the questions are set by skilled academics with a view to testing the understanding of deeper concepts and problem-solving skills, the added complexity introduced into the stem or the options can often lead to ambiguities. These ambiguities may go unnoticed by the less able students but may confuse the better ones.

Individual questions in mechanized assessments do not reward process. The problems associated with setting more complex questions are compounded by the "all or nothing" manner in which mechanized exams are marked. Standard marking schemes give no credit for near misses (i.e., elimination of the majority of incorrect options, or distractors), and so it is impossible to distinguish between a wild stab in the dark and an educated guess. Indeed, the former may well score a full mark and the latter zero. In an attempt to eliminate guessing, some instructors apply negative marks to incorrect answers. Although this may be enough to deter random guessing by some students, it will not prevent all guessing, and it probably intimidates competent but less self-assured students into leaving questions blank. The all-or-nothing marking scheme also constrains examiners to a rigid style of question that meticulously eliminates ambiguity.

Mechanized assessments are marked inflexibly. Using standard marking systems, it is difficult to perform meaningful post-exam revisions to marking schemes. Whereas the marking scheme applied to a short-answer or essay-style question can be revised once it is apparent that the whole class has interpreted a question differently from the examiner, all that is possible in standard mechanized schemes is to fully omit (or credit) the relevant question. The question then serves no useful discriminatory purpose and may even have handicapped the better students, who will have spent time considering their options.

Mechanized assessments are easier for students to plagiarize than short-answer questions. Whereas it is difficult to copy the text, diagrams, and logic of a short-answer or essay response from a peer sitting nearby, it is relatively simple to copy multiple-choice answers. Students sitting in proximity often find it tempting to glance at the responses of those in front of and beside them, even when they had no premeditated intention of copying. There have been several findings of plagiarism in MCQ and TFQ examinations (1). If generalized mark sheets are used for computer scanning, the task of copying a neighbor's work becomes that much easier. These sheets typically have the answers to many questions on one page (often 50 questions to a page), so students are not constrained to viewing only the answers on the page the neighbor is currently working from; they have access to all the questions the neighbor has already completed. They may not be able to read the questions properly from some distance away, but they can easily spot the "colored in" pattern at a glance and gain the answers to questions they haven't even read!

The Graded Alternative

An alternative to the traditional all-or-nothing marking system is to allocate a mark to each option of each question. As an example, consider the following question.

What is the capital city of Australia?

A  Melbourne
B  Sydney
C  Atlanta
D  Canberra
E  Auckland

Option D is correct, but some of the other options are closer than others. Option A or B is at least an educated guess. Response E shows a failure to discriminate between Australia and New Zealand, whereas option C reveals a serious misunderstanding or a wild guess. In fact, a blank answer may be preferable to such an ill-informed selection. If many students left this question blank, it would at least tell the examiner that antipodean geography was not well covered in the course; with random guessing, this feedback is far less reliable. Although this question tests only surface recall of information, it illustrates the point that it is often desirable to allocate partial marks for answers. After all, if this question were in short-answer format, then, after marking several answers, the examiner might well decide to award some marks for certain responses. This is even truer of questions that test deeper understanding, the synthesis of several pieces of information, or the ability to extrapolate. The more complex the question, the more important it is that partial marking is allowed.
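For instance, the examiner might weight the options along the following lines (the specific values are purely illustrative and are not prescribed by this paper):

   A  (Melbourne)   0.5    an educated guess
   B  (Sydney)      0.5    an educated guess
   C  (Atlanta)    -0.25   worse than leaving the question blank
   D  (Canberra)    1      correct
   E  (Auckland)    0.25   right region, wrong country
   Blank            0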


This marking scheme is particularly helpful when setting questions that require a number of calculation steps. If the value returned after each step is offered as one of the options, partial marks can be awarded to students who can do at least part of the question. This would be very useful in mathematics examinations, where the process is sometimes as important as the answer. For example, consider the following problem:

A student has a vacation job in a fast-food outlet. The job pays at a rate of $5.00 an hour, but the student must pay 10% of his earnings in tax. If the shifts are 4 hours long and he works 2 shifts per day, how much would the student take home after 5 days?

To answer this correctly there are a number of factors to be considered:

1. How much would the student earn in one shift?  $(4 × 5) = $20
2. How much would the student earn per day?  $(20 × 2) = $40
3. How much would the student earn in 5 days?  $(40 × 5) = $200
4. How much would the student take home?  $(200 − 0.1 × 200) = $180

If the student answering the question forgot to account for the 2 shifts per day, the answer would be $90. Failure to take account of the tax would give an answer of $200. If both the 2 shifts per day and the 4-hour shifts were ignored, the answer would be $25 less $2.50 for tax, giving $22.50. The options would look like this:

A  $200
B  $180
C  $90
D  $25
E  $22.50

While answer B would attract full marks, answers A and C show more understanding of the various factors involved than either D or E. Answer D shows little understanding of the complexity of the problem and would attract no marks. Answer E at least shows that the student knew how to subtract 10%.
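A graded key for this question might therefore look as follows (again, the specific marks are illustrative assumptions, not values from an actual examination):

   A  $200     0.5    tax step missed, everything else right
   B  $180     1      correct
   C  $90      0.5    shifts-per-day step missed
   D  $25      0      little grasp of the problem
   E  $22.50   0.25   at least subtracts the 10% tax correctly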

We present online a spreadsheet solution that enables an examiner to apply a partial-marking scheme to MCQ-formatted exams, using common spreadsheet programs.W The system permits concurrent analysis of different key sheets, allowing different students to sit different versions of the same exam and thus preventing plagiarism.

The Solutions

Student answer sheets are always marked by scanning software. The exact format of the results varies depending on the package used, but it should always be possible to obtain a table (in Excel or a similar spreadsheet program) showing the response of each student to each question.

Basic Solution Using Spreadsheet Software

The solutions presented online can be implemented in any standard spreadsheet software that supports separate worksheets (e.g., Excel 4.0+). The online tutorial takes the examiner step by step from the scanned students' responses to the final marked exam.W The process involves setting up an answer key on a separate worksheet within the same workbook. Once the mark for each option is entered into the key sheet, the student responses can be compared with it. The comparison is carried out on a third worksheet, also within the same workbook, that lists the student names in the same format as the first worksheet. However, unlike the first worksheet, which contains each student's answer to each question, the grid squares of this worksheet contain an equation that refers to the student's answer in the first worksheet and returns a mark from the key sheet according to the answer the student recorded. This is done with a nested IF statement. For example, the equation can have the structure "if the student's response to question 1 in the first worksheet is 'A', return the value for 'A' from the key sheet; otherwise see whether the response was 'B'", and so on. If care is taken to design this formula with appropriate absolute references, the equation need only be constructed once and can simply be copied and pasted into every box in the grid. The default value (i.e., the value returned if the student has left the question blank) is "Bk". It is then a simple matter to add up all the marks in each row to obtain each student's total. As a further refinement, the COUNTIF function can be used to count the number of Bk's and so on, which helps examiners refine their understanding of students' answering patterns.
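As a concrete illustration, the marking formula might look as follows. This is a minimal sketch: the worksheet names (Responses, Key, Marks) and the cell layout (answers to question 1 in column B of Responses, one student per row; marks for options A–E of question 1 in cells B2:B6 of Key) are assumptions made for this example, not necessarily the layout used in the online tutorial.

   Cell B2 of the Marks worksheet, copied across (one column per question)
   and down (one row per student):

   =IF(Responses!B2="A", Key!B$2,
    IF(Responses!B2="B", Key!B$3,
     IF(Responses!B2="C", Key!B$4,
      IF(Responses!B2="D", Key!B$5,
       IF(Responses!B2="E", Key!B$6, "Bk")))))

The mixed reference Key!B$2 pins the key rows while letting the column track the question, so a single copy-and-paste fills the whole grid. For a 50-question paper, =SUM(B2:AY2) then totals a student's row (SUM ignores the text value "Bk"), and =COUNTIF(B2:AY2,"Bk") counts the questions that student left blank.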

Antiplagiarism Solution

A very effective way to combat plagiarism is to generate different versions of the examination paper (1, 2). This "scrambling" is usually achieved by mixing the order of the questions (often stored in a database). However, this limits the style of questioning. If the order of the questions is chosen to introduce students to material in the same chronological order in which it was presented in the course, the amount of scrambling possible is limited. It could also be argued that, given the same set of questions, certain orderings (i.e., different versions of the paper) are easier than others. If one version of the paper has the easier questions first and the more challenging material later, there are grounds to argue that students would find this version easier than one in which the more difficult material is presented first.

Generating different versions of the paper by mixing the question order also presents a problem when a number of questions have been set from common stimulus material. This situation often arises in chemistry or biochemistry examinations that test the principles behind experiments: the examiner may wish to present the methodology of an experiment, a graph, or a table of sample results. The questions relating to this information must appear below the stimulus material, again limiting the mixing of question order.

We present here a method of generating different versions of the paper by rotating the multiple-choice options.W This method has the advantage that it requires no specialized software, either to generate the different paper versions or to process them, and discrimination arguments based on different question orders no longer apply. In each version of the paper the wording of each option and the stem and order of each question are identical; only the order of the options changes. The "scrambling" process can be illustrated using the original geographical example, "What is the capital of Australia?" From the original paper (version 1), another 3 versions can be generated by simply highlighting the option at the bottom of the list and performing a "drag and drop".


The automatic numbering feature of modern word processors can then be used to renumber the altered option order, A–E. The exact procedure is presented online,W and the resultant 4 versions of this question would appear as follows:

   Option   Version 1   Version 2   Version 3   Version 4
   A        Melbourne   Auckland    Canberra    Atlanta
   B        Sydney      Melbourne   Auckland    Canberra
   C        Atlanta     Sydney      Melbourne   Auckland
   D        Canberra    Atlanta     Sydney      Melbourne
   E        Auckland    Canberra    Atlanta     Sydney

Rotating in this manner makes it very easy to generate a series of exams with identical questions but a different order of options. It also makes it very easy to generate four answer-key sheets from the original key sheet.
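For instance, with the illustrative marks suggested earlier for the geography question (1 for Canberra, 0.5 for Melbourne or Sydney, 0.25 for Auckland, −0.25 for Atlanta), the four key rows simply rotate in step with the options:

               A       B       C       D       E
   Version 1   0.5     0.5    −0.25    1       0.25
   Version 2   0.25    0.5     0.5    −0.25    1
   Version 3   1       0.25    0.5     0.5    −0.25
   Version 4  −0.25    1       0.25    0.5     0.5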

Paper Layout

Like many universities, we have large classes of students, often examined in one large building (e.g., a gymnasium). With even just four versions of the examination, it is easy to arrange the papers in the examination room in such a way as to prevent both casual and premeditated plagiarism (3).W With this layout there is a very low likelihood of students seeing the answer sheet of peers who have the same version of the paper. Clearly, students who copy the responses of the person sitting adjacent to or in front of them will score very poorly. The layout is quite straightforward for invigilators to arrange.

The Processing

To process the four papers, we must be able to match each student with a key sheet. In one method, each version of the examination contains several "questions" that direct students to fill in a certain response on their answer sheet. The instruction can be worded so as to appear simply as a check on the scanning procedure: for example, "as a check on the scanning procedure, enter an answer of A/B/C/D for question XX on your answer sheet." This allows the computer to determine which version of the paper each student has completed, and these check questions may be scattered throughout the paper. In other methods, the student enters a code when completing the entry of personal information. Either way, a record of which version of the paper the student sat is obtained electronically, and this record is linked to the appropriate marking scheme. Using the VLOOKUP function in Excel, the computer can be directed, via this record, to look up the correct marking scheme and assign a mark for each question. If the nested IF statements are included in this solution, examiners can combine the differential marking of multiple-choice options with the sorting of different versions of the paper. The online tutorial takes the examiner through the whole process, step by step.W
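To make the lookup concrete, here is one possible arrangement. It is a sketch under assumed names and layout, not necessarily the tutorial's: a Keys worksheet with one row per version-and-question code (such as "1-Q01") in column A and the marks for options A–E in columns B–F; the detected version number in column B of the Responses sheet; and question codes Q01, Q02, and so on in row 1 of the Marks sheet. A MATCH expression stands in here for the nested IF chain shown earlier; either approach works.

   Cell C2 of the Marks worksheet, copied across and down:

   =IF(Responses!C2="", "Bk",
    VLOOKUP(Responses!$B2 & "-" & C$1, Keys!$A:$F,
     MATCH(Responses!C2, {"A","B","C","D","E"}, 0) + 1, FALSE))

The concatenated code (e.g., "2-Q07") selects the key row for the right version of the paper, and MATCH converts the student's letter into the correct mark column.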

Discussion and Evaluation of the Graded Scheme

The graded scheme has undergone trials at the University of Sydney in several end-of-semester examinations in large classes (>400 students). As a result, it has been possible to determine the effect of the scheme from both the examiners' and the students' points of view.

From the Examiners' Perspective

The ability to allocate partial marks to each option makes the marking of MCQs more flexible. Examiners who have used the scheme remark that it makes the marking more akin to the way we judge short-answer questions. Examiners also appreciate that ambiguities discovered in the examination paper after production can be dealt with constructively. Crucially, changes can be made to the marking scheme without altering the marks given to the other responses to the same question. As importantly, because the data are held in a dynamic spreadsheet or database, changing the key sheet changes all the student marks instantly. This makes post hoc reappraisal of the marking scheme a real possibility. It also opens the door to scaling the students' results with reference to the quality of their responses: scaling can be achieved simply by changing the key grid to a more generous or more rigorous version. We have found this a fairer and more creative way of scaling than raising or lowering everyone's marks by a fixed percentage or amount. When the different versions of the paper are generated by the procedure described in the online tutorial, only the marking scheme on the key sheet of version 1 needs to be changed; the other 3 versions change automatically.

Increased flexibility of marking has also led examiners to become more creative in their question setting. The flexibility releases the constraints imposed by the fear of ambiguity, and as a result questions tend to focus less on regurgitation of minutiae and more on extrapolation. With any permutation of positive, negative, or fractional marks allowed for each option, the examiner can make every option reveal something about students' understanding. Examiners often bemoan the fact that students can obtain 20% or more of their marks by random guessing. With the introduction of negative marks for some options, blatant guessing is discouraged, while the provision of part marks for other options means that informed guessing is not stifled. We hope that this gives the teacher a much better picture of each individual student's strengths, weaknesses, and misconceptions.

From the Students' Perspective

Student opinion of the scheme, from both course surveys and interviews, is generally very favorable. Students appreciate that they may be given credit for knowing the concepts behind a question even if they don't know the actual answer. Certainly the students agree that, because the marking is no longer "all or nothing" (or even "all or negative"!), they find the examination less stressful and are more willing to make educated guesses. The higher-achieving students are particularly satisfied because they perceive that they are less likely to be marked down for incorrectly second-guessing the examiner.

Changes in the Distribution of Marks

Although a direct comparison has obviously not been possible (we can hardly make the students sit the same examination twice!), we have observed a wider distribution of final marks under the new scheme. Whereas weaker students used to achieve marks in the range of 30–40%, the lowest marks are now much lower (some even obtain a total of
