J. Adlin Mann, Jr., Harry Zeitlin, and Allan B. DelRnol
University of Hawaii, Honolulu, Hawaii 96822

A Computer Centered Chemistry Records and Grading System
EDITOR'S NOTE: Many chemistry departments at colleges and universities are using computers for grading and analyzing examinations and for keeping records of student performance. This paper and those by Professors Pollnow and Yaney which follow on pages 677 and 678 describe several methods for using the computer in this way. More information on the implementation and details of the programs is available from the authors.
The traditional hand-grading of large class exams has a number of weaknesses which will be obvious to those who have taught or learned their chemistry under such a system. We have designed and developed the first stage of a Chemistry Records and Grading System (CRAGS) that handles a number of these problems effectively. In particular, extensive bookwork is completely automated. Moreover, it is now possible to study efficiently the correlations of student response to exam questions. It was difficult, under the old system, to judge class weakness in subject matter, and the instructor received little feedback from the exam that would help him direct future lecture work. CRAGS significantly increases the amount of diagnostic information passed back to both the students and the instructors. In describing the system we wish to underline a number of decisions that were required in order to structure the program for efficient use. We will discuss certain practical aspects such as the costs involved in the application of the system. We note that Hinckley and Lagowski at the University of Texas have reported on a system somewhat similar to the one devised at this University.²
Present address: Syntex Research, Palo Alto, Calif.
² Hinckley, C. C., and Lagowski, J. J., J. Chem. Educ., 43, 575 (1966).

System Structure

The first stage of CRAGS was designed to handle freshman chemistry grading and the resulting freshman chemistry grading records. A single exam-grading run may require the handling of 1000 exam papers. Figure 1 is a schematic flow diagram of CRAGS. The first step requires the creation of a student name file on magnetic tape. This master file builds as the semester progresses, since each exam means that a grade must be recorded on each student record. During the exams each student records his responses to the questions on an answer sheet designed to speed key punching of both the student number and his test responses into the conventional IBM punched card. Several verification steps are used to insure minimum probability of error in the transfer of the answer sheet contents to punched cards. The verification details are given later in this paper. Some card handling is carried out prior to machine processing. We order the cards by student number with a card sort. This procedure is required by the computer program input format; a future version of the program will probably include this initial sort. The sorted exam deck is combined with the program deck and the necessary control cards and submitted for processing.

[Figure 1. CRAGS structure. The flow diagram shows: creation of the student name file; exam answers recorded on a form with student name and number; verification of student number; processing of the exam answer cards; the calculations required for grade assignment; and distribution of reports to the faculty and students.]
Processing at first was done in "open shop" mode by our graduate assistants, who were well versed in machine operation. However, since the programs have been tested and debugged, "closed shop" batch processing is satisfactory. Turn-around time varies but has customarily been only a matter of hours. The exams are generally processed in time for the next class meeting. The output from the processing is distributed to the faculty members handling the class lectures and includes detailed diagnostic tables for the lecturer and a full report for each student. The faculty and student reports generated by the grading program are discussed in a subsequent section of this paper. The exam post-mortem generates file update material. While exam errors occur only at a very low rate, some file correction work must be done after each exam.
Most of the updating concerns excused absences. The update data are channeled to the system operator, who then handles the file changes through specially written programs.

We make copies of our master file tape in order to anticipate catastrophic machine or human error that would destroy the name file and all of the records. We back up our tape file in three ways: we keep at least one extra copy of the current tape file; all of the exam cards are stored until after the end of the semester; and each instructor has a current grade listing from which a tape file could be reconstructed. While operator error has occasionally ruined a tape file, we have always been able to reconstruct the file from a copy. These precautions have become part of the system. We find that giving the student a complete report of his performance allows further verification of our records, since he quickly feeds back file error information.

Toward the end of the semester, laboratory exams are given and recorded with the normal routines. The final exam does not require special handling; however, the faculty reports summarize the whole semester's work. Averages are calculated according to the algorithm supplied by the faculty in charge of the classes. Each exam and laboratory grade can be weighted separately, with the weights normalized to give final scores in the grade range desired, usually 0-100.
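The averaging step itself reduces to a short piece of arithmetic once the faculty have chosen their weights. The fragment below is a minimal sketch in modern Fortran and is not the CRAGS code; the particular scores, weights, and the 0-100 target range are assumptions made only for the illustration.

! Minimal sketch (not the CRAGS source): combine separately weighted
! exam and laboratory grades and normalize the result to the 0-100 range.
! The scores and weights below are assumed values for the illustration.
program average_sketch
  implicit none
  integer, parameter :: n = 5                          ! graded items: exams plus laboratory
  real :: score(n)  = (/ 72., 85., 60., 90., 78. /)    ! raw scores, each out of 100
  real :: weight(n) = (/ 1.0, 1.0, 1.0, 2.0, 0.5 /)    ! relative weights chosen by the faculty
  real :: final

  ! Weighted mean; dividing by the weight sum normalizes the weights so
  ! that a perfect record maps to 100 whatever weights are chosen.
  final = sum(weight * score) / sum(weight)
  print '(a,f6.1)', 'final course score (0-100): ', final
end program average_sketch

Because the weights and the target range may change from class to class, CRAGS leaves the choice of this algorithm entirely to the instructors in charge.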
Organization of the Student Name File

All grade records are stored on tape, referred to as the Name File, in student number order; thus, one of the first organizational responsibilities of the section lecturers is to assign student numbers. The programs are general, so that different numbering systems can be used. The physical layout of our large lecture halls suggested student numbers that include reference to the course number, section, and row-seat number. This latter point speeds distribution of the student reports after an exam: the exam reports are printed on the computer's high-speed printer in row-seat number order. This feature could easily be changed to accommodate any desired system. For example, we have considered printing the reports in groups keyed to the laboratory instructors, and this scheme, if desired, could be handled after slight modifications of the present program.

The sections stabilize after the first two weeks of a new term; at this time, and before the first exam, the students are asked to fill out a questionnaire which asks for their name, section, seat number, social security number, and any background information that the faculty might be interested in correlating. This last point is of real interest in course planning. The students at the University of Hawaii are all required to register with social security numbers, so these numbers will be known by the students. The computer can then be used to correlate various items of student background with grade information stored on general University tape files. Thus, through the use of a social-security-number oriented permanent file of the chemistry students, the time evolution of these correlations can be followed rather easily. It will be possible to follow quantitatively the impact of improved high school programs on student performance in freshman chemistry. These ideas, implemented during the next few years, will constitute the next stage of CRAGS.
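The composite student numbers described at the beginning of this section are easy to take apart by integer arithmetic, which is all the report-ordering step requires. The fragment below is a hypothetical illustration only; the 3-1-2-2 digit layout (course, section, row, seat) is an assumption made for the example and is not necessarily the layout used at this University.

! Hypothetical illustration: decode a composite student number of the
! assumed form CCCSRRTT (course, section, row, seat).  The digit layout
! is an assumption made for the example, not the layout used in CRAGS.
program decode_sketch
  implicit none
  integer :: id, course, section, row, seat

  id = 10320415                ! e.g. course 103, section 2, row 04, seat 15
  course  = id / 100000        ! leading three digits
  section = mod(id, 100000) / 10000
  row     = mod(id, 10000) / 100
  seat    = mod(id, 100)

  print '(4(a,i4))', 'course', course, '  section', section, &
                     '  row', row, '  seat', seat
end program decode_sketch

Sorting the graded records on the row and seat fields of such a number is what puts the printed reports in hand-out order after an exam.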
Structuring of the Examinations
The nature of the testing program described in this report necessitates the use of multiple choice examinations. Good objective examinations can be devised to satisfy the most discriminating testing criteria. Toward this goal, the instructor is provided, at the conclusion of each exam processing run, with a breakdown of each question including the number of students answering and the percentage answering correctly. It is relatively simple to determine the questions that have given the students the most trouble and to make use of this information for instruction in class and in the selection of questions for future examinations. Scrutiny of the test evaluation data will uncover poor questions. Over a period of years it should be possible to develop a file of question types which will help in the process of composing discriminating examinations relatively free of "loaded" and undesirable questions. This type of feedback is characteristic of the faculty report aspect of CRAGS.

The grading program is so arranged that it is not necessary to adhere to a fixed number of questions in an examination. The number can vary from one test to another within wide limits (up to 150 questions), restricted only by the 60 minutes allotted for the class examinations during the semester or the two hours for the final. Any number of questions up to the maximum can be used, with scores normalized to fall in the range 0-100%. Each question may have up to nine options. The grading program will handle any grade-evaluation algorithm chosen by the instructors.

The student records his answers on an answer sheet designed for rapid key punching, and in our experience this system has proved more reliable than other commonly used answer sheets and cards designed for mechanical or electronic grading. In particular, the Porta-Punch card system was less satisfactory since, among other things, students are prone to double punch an answer. The key-punch operator records only the student number and the answer response numbers on the standard IBM card. The correct student number must be provided and entered as soon as possible after the beginning of the course. It is expedient not to rely completely on the student for the accuracy of this number; instead, at the conclusion of the test the proctors verify each student number twice, by different individuals. This task, which normally takes 30 min or less, is accomplished with the aid of master lists which contain, in alphabetical order, the names of the students and their numbers. The answer sheets, checked for the correct student number, are brought over to the computer center for key punching and grading.
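The per-question breakdown and the score normalization mentioned above involve only simple counting. The sketch below is an illustration rather than the CRAGS program itself: the answer key, the small answer array, the equal weighting of questions, and the simple right/wrong scoring are all assumptions made for the example. It scores each paper, scales the total to the 0-100% range, and tallies for each question the number of students answering and the percentage of the class answering correctly.

! Sketch only (assumed data, not the CRAGS code): score multiple-choice
! answers against a key, normalize each total to 0-100, and tally the
! per-question statistics reported to the instructor.
program grade_sketch
  implicit none
  integer, parameter :: nq = 4, ns = 3                 ! questions and students (assumed)
  integer :: key(nq) = (/ 2, 5, 1, 3 /)                ! answer key (assumed)
  integer :: answer(nq, ns)                            ! 0 means the question was left blank
  integer :: ncorrect(nq), nanswered(nq)
  real    :: score
  integer :: iq, is

  answer = reshape( (/ 2, 5, 1, 3,    &                ! student 1
                       2, 4, 0, 3,    &                ! student 2
                       1, 5, 1, 0 /), &                ! student 3
                    (/ nq, ns /) )

  ncorrect  = 0
  nanswered = 0
  do is = 1, ns
     score = 0.0
     do iq = 1, nq
        if (answer(iq, is) /= 0) nanswered(iq) = nanswered(iq) + 1
        if (answer(iq, is) == key(iq)) then
           ncorrect(iq) = ncorrect(iq) + 1
           score = score + 1.0
        end if
     end do
     ! Normalize so that a perfect paper scores 100 regardless of nq.
     print '(a,i2,a,f6.1)', 'student ', is, '  score ', 100.0 * score / nq
  end do

  do iq = 1, nq
     print '(a,i2,a,i3,a,f6.1)', 'question ', iq, '  answered ', nanswered(iq), &
           '  percent of class correct ', 100.0 * ncorrect(iq) / ns
  end do
end program grade_sketch

If the instructors choose some other grade-evaluation algorithm, only the inner scoring loop of a sketch like this would change.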
Exam Grading

At least five benefits are expected from CRAGS: reduction of junior and senior staff grading time; increased grading accuracy; elimination of grader variation within an exam set; superior report and record keeping procedures; and detailed exam analysis. The grading burden for each class has been transferred from 10 to 20 junior staff graders spending roughly 7 hr each per exam set of 300 papers (100 man-hours per exam of 300 papers appeared to be an average figure) to a systems operator who spends roughly 5 hr per exam set, and this time is only weakly dependent on the number of papers graded.

We have been supporting a half-time teaching assistant in the position of programmer-operator for the system. His duties range beyond just running the grading programs through the machine at exam time. He is also responsible for coordinating the grading operation from the time the answer sheets are handed in until updating is completed. He supervises the key-punch operators who translate the answer sheet records to punched cards; this operation typically uses four key-punch operators (ca. $1.75/hr) for two hours to handle key punching and verification on 800 exams. He also handles program modifications. While the operator need not be a highly skilled programmer, he must know Fortran in order to handle small changes in the program, which is continually evolving.

Our experience has been that proper verification markedly reduces the error rate in grading large sections. Verification is most efficiently done at two stages of the exam grading operation. The first verification involves checking that the student number written by the student on each exam sheet properly corresponds to the name appearing on the sheet. Since only the student number and answers are key-punched into cards, this check must be done accurately. While the machine could handle this check, the extra key-punch time necessary to put the name on each card as well, coupled with sort errors caused by trivial spelling mistakes, persuaded us to establish our present procedure. Even with these precautions, some of the name file updating involves resolving ambiguous student numbers. However, this verification error amounts to no more than six exams out of a set of 1000. These student reports are still processed, but they cannot be sequenced; they are therefore printed out at the end of the run for special handling.

Key-punch verification is routine. In handling some 14,000 exam papers we have heard from only three students who could substantiate their claim of a key-punch error. This is a very strong argument in favor of the key-punch step in translating a student answer sheet to punched cards: a key-punch operator can intelligently decide on the intent of slightly sloppy answer marking. The original answer sheets are kept until after the name file update has been completed, so any question about a key-punch error can be settled very quickly.

The exam reports are printed either on line or off line, depending upon the computer center's job load. The system operator collects the computer output, "bursts" the sheets, and delivers the faculty reports and the student reports to the proper instructor for distribution to the students at the next lecture session after the exam.
Technical details of the programs used in the grading are presented in the appendix. We will be happy to supply copies of a somewhat more technical discussion of the sections of CRAGS so far implemented. The programs grew over a period of six months and, while written in Fortran IV, may need trimming and modification for operation on computer systems that differ from the University of Hawaii's IBM 7040-1401 system. We will also supply card images or listings of our programs upon request. In case card images are desired, please send a reel of tape with instructions as to the bit density needed for computer compatibility.

Student and Staff Reports
We have attempted to include in CRAGS certain exam diagnostics that aid in staff analysis of exam effectiveness. Mean and median scores and a histogram are printed out. A question summary table is also presented; the fraction of the class missing each question is recorded. A question-reply "matrix" is constructed so that for each question the response distribution among the possible answers is immediately apparent (see Fig. 2). On this same report sheet, rank-reply correlation coefficients are printed for each question. We have partitioned each class into three ranks on the basis of student performance in the exam: upper 25 percentile (U), middle 50 percentile (M), and lower 25 percentile (L). The correlation coefficient for each question and for each rank was calculated by assigning +1 for each student in the rank marking the correct answer and -1 for each student marking an incorrect answer or leaving a blank. The sum of these numbers over all students in the rank, divided by the number of students in the rank, gives the correlation coefficient. For a given question and rank, a correlation coefficient of 1 would mean that every student in that rank answered the question properly; -1 would mean that every student in the rank missed the question or did not answer it. Obviously these are extremes. The correlation coefficients have some obvious properties. A valid question should have correlation coefficients that order as U > M > L, and if this ordering did not obtain, we would doubt the validity of the question. Study of Figure 2 should show the reader a few of the uses of the reply matrix in analysis of the validity of an exam.

The remaining staff reports involve student records. These reports are organized by section, so that each staff member has an updated record of the performance of each student in his section. The names of students missing an exam are printed on a separate list so that the tape file on these students can be updated later.
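Stated compactly, the rank-reply coefficient described above is the number of right answers minus the number of wrong or blank answers, divided by the number of students in the rank, so it runs from +1 to -1. The sketch below illustrates that construction and is not the CRAGS program; the tiny answer array and the pre-assigned rank of each student (1 = upper 25%, 2 = middle 50%, 3 = lower 25%) are assumptions made for the example. It accumulates the question-reply matrix and the U, M, and L coefficients in one pass over the answers.

! Sketch only (assumed data, not the CRAGS program): build the
! question-reply matrix and the rank-reply correlation coefficients.
program reply_matrix_sketch
  implicit none
  integer, parameter :: nq = 2, ns = 4, nreply = 9
  integer :: key(nq)   = (/ 3, 1 /)                    ! answer key (assumed)
  integer :: irank(ns) = (/ 1, 2, 2, 3 /)              ! 1 = U, 2 = M, 3 = L, from exam scores
  integer :: answer(nq, ns)                            ! 0 means the question was left blank
  integer :: matrix(0:nreply, nq)                      ! reply counts; reply 0 counts blanks
  integer :: nrank(3), net(3)
  integer :: iq, is, ir

  answer = reshape( (/ 3, 1,  3, 2,  2, 1,  0, 3 /), (/ nq, ns /) )

  nrank = 0                                            ! number of students in each rank
  do is = 1, ns
     nrank(irank(is)) = nrank(irank(is)) + 1
  end do

  matrix = 0
  do iq = 1, nq
     net = 0
     do is = 1, ns
        matrix(answer(iq, is), iq) = matrix(answer(iq, is), iq) + 1
        if (answer(iq, is) == key(iq)) then
           net(irank(is)) = net(irank(is)) + 1         ! +1 for the correct reply
        else
           net(irank(is)) = net(irank(is)) - 1         ! -1 for a wrong reply or a blank
        end if
     end do
     print '(a,i2,a,10i3)', 'question ', iq, '  replies 0-9: ', matrix(:, iq)
     do ir = 1, 3
        print '(a,i1,a,f6.2)', '  rank ', ir, '  coefficient ', &
              real(net(ir)) / real(max(nrank(ir), 1))
     end do
  end do
end program reply_matrix_sketch

With realistic class sizes the same loop produces the Figure 2 report directly; a coefficient ordering other than U > M > L flags the question for review, as noted above.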
[Figure 2. A portion of the question-reply matrix for the first Chemistry 103 exam, given on October 6, 1966. For each question the matrix lists the number of students giving each reply (reply 0 being the same as a blank), the total number of replies, and the rank-reply correlation coefficients; an asterisk marks the reply appearing on the key.]