A useful grade-scaling equation

J. T. Maloy
Seton Hall University, South Orange, NJ 07079

Somewhat poetically, our students call it "curving the grades". As chemical educators we know that what we are really doing is scaling their numerical exam scores to correspond to institutional standards of achievement that they are unable to attain directly.

Grade scaling may be carried out in a linear or a nonlinear manner. Linear scaling merely adds the same constant to each student's score in order to achieve the desired result. The problem with linear scaling is that the constant that boosts the marginal group into the passable range usually propels the average student into the exceptional range. This problem is illustrated in Figure 1, which shows an actual grade distribution (raw score = % correct) for the first multiple-choice hour exam in a recent offering of general chemistry to 5: students at this university. By adding 40 points to each score, one can boost all but the lowest seven grades into the passing range; the trouble with this is that it also results in

[Figure 1. Raw score distribution for a recent general chemistry hour examination (abscissa: raw score; mean = 44.28, S.D. = 16.27). The mean and the standard deviation of these scores yield the normal distribution that has the same area as the raw score histogram. The ordinate refers to the actual number of students within the scoring interval.]


Journal of Chemical Education

granting an "A" to those 22 students who managed to get more than half of the questions right. Faced with this dilemma, many instructors prefer to use a nonlinear scaling strategy.

The most common nonlinear scaling strategy is to fit the actual grade distribution to a normal (Gaussian) distribution and to assign scaled scores on the basis of the number of standard deviations that an individual's score is above or below the mean. There are two problems with this approach. First, while it is possible to compute the mean and the standard deviation for any distribution, there is no way to guarantee that the underlying distribution is Gaussian. In fact, for nonrandom processes such as shooting craps with loaded dice, one may correctly anticipate that the outcome distribution will be non-Gaussian. (The poorness of fit between the tri-modal grade histogram in Figure 1 and the Gaussian distribution that results from its mean and standard deviation would convince any statistician that these scores are not normally distributed.)

Second, even if the scores are normally distributed, there is an arbitrary and nonlinear operation in going from the number of standard deviations to a numerical scaled score. For the data shown in Figure 1, the arbitrary decision to fit the raw score distribution to a scaled score distribution having a mean of 75 and a standard deviation of 10 raises the lowest score (15% correct) to a scaled score of 57, a gain of 42 points. By comparison, a student whose raw score is the same number of standard deviations above the mean would find his or her raw score of 74 boosted to 93, a gain of only 19 points. Since this nonlinear scaling is based upon a statistical analysis followed by an arbitrary assignment of meaning to the resulting statistical parameters, it is difficult to justify to students who, with good reason, cannot understand how some grades would be raised more than others merely to award an equal number of B's and D's.
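The arithmetic in the preceding paragraphs can be checked with a short sketch. It assumes only the Figure 1 statistics quoted in the text (mean 44.28, S.D. 16.27), the 40-point shift, and the target mean of 75 with a standard deviation of 10. The function names `linear_scale` and `normal_scale` are illustrative and do not come from the article, and no cap at 100 is applied.

```python
# Statistics reported in the text for the raw scores of Figure 1.
RAW_MEAN = 44.28
RAW_SD = 16.27


def linear_scale(raw, shift=40.0):
    """Linear scaling as described in the text: add the same constant to every score."""
    return raw + shift


def normal_scale(raw, target_mean=75.0, target_sd=10.0):
    """Place a raw score on a scale with the chosen mean and standard deviation."""
    # Number of standard deviations above (+) or below (-) the raw mean.
    z = (raw - RAW_MEAN) / RAW_SD
    return target_mean + target_sd * z


# The two scores discussed in the text: the lowest score and its mirror image
# on the other side of the mean.
for raw in (15, 74):
    scaled = normal_scale(raw)
    print(f"raw {raw}: normal-fit scaled {scaled:.0f} (gain {scaled - raw:.0f}); "
          f"raw + 40 = {linear_scale(raw):.0f}")
```

Under these assumptions the two raw scores sit about 1.8 standard deviations on either side of the mean, which is why their gains differ so much once the spread is compressed from 16.27 to 10.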
Proffered here is a simple, nonlinear grade scaling strategy that always raises lower grades more than it raises higher grades and yet is easy to justify to students who have no knowledge of statistics because it raises all grades in a way that is easily understood by everyone. The method is an extension of the old "square root of the grade times 10" gag that once seemed a lot funnier than it is today. All grades are scaled by a simple exponential equation that can be evaluated with the least expensive of scientific calculators or personal computers:

$$S = 100^{1-n}\,R^{n} \tag{1}$$

In this equation S is the scaled score ("percent"), R is the raw score (percent correct), and n is the scaling factor, a number that usually lies between 0 and 1. Figure 2 illustrates the distribution that results when the raw scores shown in Figure 1 are scaled with eq 1 using n = 0.5.

[Figure 2. Scaled score distribution for the data shown in Figure 1 (abscissa: scaled score). Equation 1 with n = 0.5 was used to scale the raw scores. The mean and the standard deviation of these scaled scores generate the normal distribution that has the same area as the scaled score histogram.]

The magnitude of n determines the degree of scaling. The table shows the raw scores that are necessary to achieve some critical standard scores at various values of n. When n = 1, the raw scores are equal to the scaled scores. When n = 1/2, the "square root of the grade times 10" case is obtained. As n approaches 0, all grade ranges are expanded, but at any fixed value of n a lower score is raised more than a higher score. In all cases where 0 < n < 1, the scaled score is better than the corresponding raw score, much to the delight of the students. (Scaled scores that are lower than the corresponding raw scores may be obtained by setting 1 < n. Proceed at your own risk under these conditions.)

Raw Scores Necessary To Achieve Critical Scaled Scores with Various Scaling Factors

Scaling Factor (n)        Scaled Scores (Grade Ranges)
                      100.0   A    90.0   B    80.0   C    70.0   D    60.0
1.00                  100.0        90.0        80.0        70.0        60.0
0.90                  100.0        89.0        78.0        67.3        56.7
0.75                  100.0        86.9        74.3        62.2        50.6
0.50                  100.0        81.0        64.0        49.0        36.0
0.25                  100.0        65.6        41.0        24.0        13.0
0.10                  100.0        34.9        10.7         2.8         0.6

In order to utilize this scaling strategy, one merely determines the value of n that is applicable to the situation at hand. For example, one might gradually tighten the screws by setting n = 0.50 for the first exam, n = 0.75 for the second, n = 0.90 for the third, and n = 1.0 for the final. This brings the class up to full speed without hopelessly losing more than half the class on the first exam. Alternately, one could determine the scaling factor after the results are in. This is best accomplished by using an arbitrary pair of corresponding scores $R_0$ and $S_0$ to determine

$$n = \frac{\log S_0 - 2}{\log R_0 - 2}$$

a result that is obtained by solving eq 1 for n (taking common logarithms of eq 1 gives log S - 2 = n(log R - 2)). For example, if one wished to establish the criterion that a raw score of 50% correct was to correspond to a minimum passing score of "60%", one would set n = (log 60 - 2)/(log 50 - 2) = 0.737. Or, for norm-referenced grading, one might wish to set the mean raw score (say, 45.2%) to a scaled score of "75%" by calculating n = (log 75 - 2)/(log 45.2 - 2) = 0.362. (This latter application is particularly useful when a standardized final is given.) Once n is determined in this manner, all other raw scores may be scaled by using this value of n in eq 1.

In addition to being remarkably simple, this method has the distinct advantage that it employs only one arbitrary step: the determination of n. Cast in terms of forcing a single raw score $R_0$ to correspond to a scaled score $S_0$, this determination can be justified to the most litigious class, particularly since it provides them with an ongoing assessment of their performance according to institutional standards.

Moreover, this grade scaling method can provide a quantitative way to compare successive offerings of the same course or to judge the level of difficulty of new test material. All things being equal, the higher the scaling factor that is necessary to bring about the same (scaled) class average, the better is the class performance. Detailed information of this kind collected over several offerings of the same material can be useful in class-to-class comparisons of student performance. Such comparisons may be used as a measure of teaching effectiveness or as a class calibration for new examination material.
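For readers who would rather use a personal computer than a calculator, the scaling equation and its two rearrangements can be sketched as follows. This is a minimal illustration, not the author's own program. It assumes eq 1 in the form S = 100^(1-n) R^n, with raw and scaled scores both expressed in percent, and the function names `scale`, `raw_needed`, and `scaling_factor` are hypothetical.

```python
import math


def scale(raw, n):
    """Eq 1: scaled score S = 100**(1 - n) * R**n (scores in percent)."""
    return 100.0 ** (1.0 - n) * raw ** n


def raw_needed(scaled, n):
    """Eq 1 solved for R: the raw score that maps onto a given scaled score."""
    return 100.0 * (scaled / 100.0) ** (1.0 / n)


def scaling_factor(raw0, scaled0):
    """Eq 1 solved for n, given one pair of corresponding scores (R0, S0)."""
    return (math.log10(scaled0) - 2.0) / (math.log10(raw0) - 2.0)


# The two worked examples from the text.
print(f"50% raw -> 60% scaled:   n = {scaling_factor(50.0, 60.0):.3f}")
print(f"45.2% raw -> 75% scaled: n = {scaling_factor(45.2, 75.0):.3f}")

# One row of the table per scaling factor: raw scores needed for each cutoff.
cutoffs = (100.0, 90.0, 80.0, 70.0, 60.0)
for n in (1.00, 0.90, 0.75, 0.50, 0.25, 0.10):
    row = "  ".join(f"{raw_needed(s, n):6.1f}" for s in cutoffs)
    print(f"n = {n:4.2f}: {row}")

# Gains in points for the two scores discussed earlier, using n = 0.5.
for raw in (15.0, 74.0):
    print(f"raw {raw:.0f}: scaled {scale(raw, 0.5):.1f}")
```

One observation that follows from eq 1 itself: the ratio S/R = (100/R)^(1-n) falls steadily as R rises, so in proportional terms a lower score is always helped more. The gain in points, S - R, is not monotonic at the very bottom of the range; for n = 0.5 it is largest at R = 25.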

Volume 67, Number 5, May 1990
