Analysis of Success in General Chemistry Based on Diagnostic Testing Using Logistic Regression

Margaret J. Legg, Department of Chemistry, St. Ambrose University, Davenport, IA 52803
Jason C. Legg, Department of Statistics, Iowa State University of Science & Technology, Ames, IA 50011
Thomas J. Greenbowe,* Department of Chemistry, Iowa State University of Science & Technology, Ames, IA 50011; [email protected]

Students arrive in general chemistry with a variety of backgrounds and abilities, and in our institutions and others only about 70–75% of the enrolled students succeed in the course (1). Predicting the probability of success on the basis of a diagnostic exam can be used for advising students. Toward this end, methods for screening by mathematical ability, college entrance exams, or prior grade point average have been used, and several chemistry placement tests have been developed and field-tested (2–6). McFate and Olmsted reviewed the use of chemistry placement exams to predict success in general chemistry (1). They described a number of studies that report a statistically significant correlation between chemistry and mathematics aptitude as predictors of success in general chemistry. Our paper describes a quantitative statistical technique based on scores on placement exams that will help instructors make an informed decision about the probability of success of their students. The methods and equations determined by a logistic analysis of data are explained, and actual examples are presented.

One readily available placement exam is the California Chemistry Diagnostic Test (CCDT) (4, 7). In Iowa, members of the Iowa General Chemistry Network (IGCN), a consortium of teachers of general chemistry (8), have given the CCDT to their students for the past five years. The average scores on the CCDT have been very consistent across institutions. Previous articles in JCE dealing with placement exams have used simple linear regression to show a correlation between diagnostic test scores and course grades (9–19). However, an estimate of the probability of passing the course from a single pre-course measure, such as the CCDT, is more useful, practical, and appropriate than a correlation for informing students and instructors. In these cases, logistic regression is often a good statistical method to use, and it allows success in passing a chemistry course to be interpreted in terms of probability.

Linear Regression and Logistic Regression

Linear regression and logistic regression are both generalized linear models. A generalized linear model is a probability model in which the mean of a response variable is related to a linear combination of predictive variables through a link function. The regression equation is linear in the unknown parameters β0, β1, …, βp:

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon$$

It is helpful to remember that Xi could be raised to a power and the statistical method is still called linear regression. Simple linear regression is often appropriate for a continuous predictive variable (X) with a continuous response (Y). It uses ordinary least squares, a method of maximum likelihood for normal populations, and the population mean response, µ, is a simple linear function of the predictor variables. In simple linear regression, non-statisticians usually ignore the concept of a link function, which makes the connection between Y and the estimated population mean response, µ, because the link function is the identity function. For example, when chemists calculate y at some value of x using the equation y = mx + b, y is the population mean response, so µ = 1 × y and "1" would be the link function. In simple linear regression, it is assumed that the variance (σ²) does not depend on the mean and has the same value at all values of the predictive variable along the line.

Logistic regression is often more appropriate than simple linear regression when a continuous variable is used to predict a binary or categorical variable (20). Logistic regression uses the method of maximum likelihood for a binomial distribution (21). The binary response variable (Y) has only two possible values, 0 or 1 (e.g., success or no success), whereas the predictor variable (X) is continuous. At each value of the predictor variable, the model gives the probability of success. A plot of these probabilities is curved and is usually symbolized by p instead of Y; the symbol p is used to emphasize that it is a proportion or probability. The link function expresses the fact that the mean (p) is not linear in the β's. The link for a binary response variable is the logit, or log-odds, function:

$$\mathrm{logit}(p) = \ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$$

The inverse of the logit function is called the logistic function:

$$p = \frac{\exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p)}{1 + \exp(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p)}$$

A plot of p versus X is shaped like a tilted S, with a minimum of 0 and a maximum of 1 on the ordinate. The shape of the curve follows from the link function. The variance of a population with mean p is p(1 – p); the variance therefore differs at every point on the curve, being lowest at the extremes and highest around p = .5. Several references (22–24) provide a more complete discussion of logistic regression.
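As a concrete illustration of the two functions, the short sketch below (Python; not part of the original article) evaluates the logit link and its logistic inverse for a single predictor. The coefficients β0 = −4.3 and β1 = 0.21 are illustrative values chosen only because they roughly reproduce the 1997 ISU probabilities listed later in Table 4; they are not the published fit.

```python
# Minimal sketch of the logit link and its inverse (the logistic function)
# for one predictor. Coefficients are illustrative, not the published fit.
import numpy as np

def logit(p):
    """Log-odds of a probability p."""
    return np.log(p / (1.0 - p))

def logistic(x, b0, b1):
    """Inverse of the logit: probability of success at predictor value x."""
    eta = b0 + b1 * x                  # linear predictor beta0 + beta1*x
    return np.exp(eta) / (1.0 + np.exp(eta))

b0, b1 = -4.3, 0.21                    # assumed values for illustration only
scores = np.array([5, 10, 15, 20, 25, 30, 35, 40])
probs = logistic(scores, b0, b1)
print(np.round(probs, 3))              # rises along a tilted S from near 0 toward 1
print(np.round(logit(probs), 2))       # recovers the linear predictor b0 + b1*score
```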


Diagnostic Testing

Over the past five years, the CCDT was administered each fall semester during the first week of class at Iowa State University (ISU), a large midwestern university, and St. Ambrose University (SAU), a private liberal arts college. Students at ISU have a mean score of about 25 (out of 44 questions) on the CCDT and students at SAU have a mean score of around 20. Table 1 shows the number of students and the mean, standard deviation, and range of scores for students enrolled in a science and engineering general chemistry course at ISU. Table 2 lists the data for students in the science majors course at SAU. We concluded from the data in Tables 1 and 2 that the average CCDT scores of students at both institutions were stable over the five-year period.

At both institutions, qualitative analysis suggested that the CCDT was a good predictor of the ability to succeed in the course, and there seemed to be a threshold score on the CCDT required for passing the course. At both institutions, only one or two students who scored below 12 ever passed the course. Historically, at both ISU and SAU, the instructors used a CCDT score of 12 or below to alert students that they had a slim chance of being successful in the course. It is worth noting that a CCDT score of 12 out of 44 is at the random guessing level.

Table 1. Results of the California Chemistry Diagnostic Examination Form 1993 Administered at ISU in the First Week of the Fall Semester

Year   No. of Students   Mean Score   Mean (%)   Std Deviation   Range
1995   538               25.0         57         6.8             8–44
1996   529               25.6         58         6.9             5–42
1997   600               24.9         57         6.8             3–43
1998   545               25.7         58         6.9             8–43
1999   680               24.6         56         6.7             6–44

Table 2. Results of the California Chemistry Diagnostic Examination Form 1993 Administered at St. Ambrose University in the First Week of the Fall Semester

Year   No. of Students   Mean Score   Mean (%)   Std Deviation   Range
1995   71                19.8         45         5.6             18–38
1996   56                21.4         49         5.5             11–33
1997   48                20.9         47         5.9             11–37
1998   45                21.1         48         5.3             10–32
1999   48                20.7         47         5.5             11–38

Methods

In our study, the continuous variable is the score on a chemistry diagnostic test (CCDT), and the binary response is success in passing the course. We designated success as a final course grade of C or better and nonsuccess as any course grade below C, including withdrawals. The response variable was assigned a value of 0 (withdrawals and grades below C) or 1 (grades of C and higher). Students who withdrew during the first few days of class were eliminated from the data pool. After eliminating these students, data were available for 269 SAU students and for 537 ISU students in 1997, 511 in 1998, and 635 in 1999.

Since class size was small at SAU, students from fall 1995 through fall 1999 were analyzed as a group; the same SAU instructor taught all the classes. Class size at ISU was large (>500 students) in each year, so the data were analyzed separately for fall 1997, fall 1998, and fall 1999. The ISU instructor was the same for all these classes.

The generalized linear model function of the MathSoft S-Plus program with the logit link and binomial family was used for each analysis (25). Input files contained the CCDT score, success, and gender (male = 0, female = 1) for each student. The gender variable was added to determine whether separate curves were needed for men and women. Similar computer programs for a logit analysis are available in SPSS (26) and SAS (27). From the logistic analysis, equations for the probability of success based on the CCDT score were determined for three years of ISU data (1997, 1998, and 1999) and for the pooled SAU data (1995–1999). The equations were compared to the actual probabilities of success. To illustrate the difference between logistic regression and simple linear regression, the actual fraction of student success for the fall 1999 ISU data set was calculated and analyzed by simple linear regression.
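The original analysis was carried out with the S-Plus generalized linear model function; the sketch below is a rough Python equivalent using statsmodels, not the authors' code. The file name (chem_fall.csv) and the column names ccdt, success, and gender are assumed for illustration.

```python
# Rough Python equivalent of the S-Plus generalized linear model fit described
# above: binomial family with a logit link, CCDT score and gender as predictors.
# File name and column names (ccdt, success, gender) are assumed for illustration.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("chem_fall.csv")        # one row per student: ccdt, success, gender

# Model 1: CCDT score as the only predictor
m1 = smf.glm("success ~ ccdt", data=df, family=sm.families.Binomial()).fit()

# Model 2: CCDT score plus gender (male = 0, female = 1)
m2 = smf.glm("success ~ ccdt + gender", data=df, family=sm.families.Binomial()).fit()

print(m1.null_deviance, m1.deviance)     # null and residual deviance (cf. Tables 3 and 5)
print(m2.deviance)                       # further reduction if gender helps
print(m2.params)                         # estimated beta0, beta1 (ccdt), beta2 (gender)

# Predicted probability of success for, e.g., a male student scoring 15 on the CCDT
print(m2.predict(pd.DataFrame({"ccdt": [15], "gender": [0]})))
```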

Table 3. Residual Deviance from Logistic Regression for St. Ambrose University Students, Fall 1995–Fall 1999

Predictor Variable   Deviance   Difference   df
Intercept (null)     277.7      —            268
CCDT                 238.1      39.6*        267
Gender               224.6      13.5*        266

*Significant factor; p < .05.

Table 4. Probability of Success of ISU Students, 1997–1999, from Logistic Regression Analysis

CCDT Score   1997   1998   1999   Difference^a
 5           .035   .030   .038   .008
10           .093   .076   .106   .030
15           .222   .180   .263   .083
20           .444   .370   .516   .147
25           .691   .611   .762   .151
30           .862   .807   .906   .098
35           .946   .918   .966   .048
40           .980   .968   .989   .021
Average                           .073

^a Difference between the minimum (1998) and maximum (1999) probabilities.

Table 5. Residual Deviance from Logistic Regression for ISU Students, Fall 1999

Predictor Variable   Deviance   Difference   df
Intercept (null)     797.3      —            634
CCDT                 620.6      176.7*       633
Gender               617.6      3.0          632

*Significant factor; p < .05.

Results

Figure 1. Logistic regression curves (plotted separately for females and males) for the probability of success (final course grade of at least C) in general chemistry for St. Ambrose University students based on the CCDT score. The plot is based on pooled data from 269 students enrolled in fall semesters from 1995 through 1999.

Figure 2. Logistic regression curves for the probability of success (final course grade of at least C) in general chemistry for ISU students for the years 1997, 1998, and 1999, based on the CCDT score.

Figure 3. Comparison of the logistic regression curve to the actual data points (CCDT score, fraction of success) to show the fit. This plot is based on data for ISU students in fall 1999.

Figure 1 shows the probability of success (passing with a grade of C or better) for male and female students at SAU. Gender was a statistically significant factor for predicting success in general chemistry at SAU: for a given CCDT score, women had a higher probability of success than men. Table 3 lists some of the statistics from the logistic analysis for SAU. In Table 3, the null deviance corresponds to using no predictor variable; in the null case, the probability of success for all students is the same and equals the passing rate for the course. The first logistic regression analysis used CCDT as the only predictor variable and the second used both CCDT and gender. Both CCDT score and gender are significant, since each factor changes the residual deviance significantly.

Figure 2 shows the probability-of-success curves for ISU students in 1997, 1998, and 1999. Gender was not significant for predicting success in general chemistry at ISU. Figure 3 illustrates how the probability curve based on the logistic analysis corresponds with actual data for ISU students during the fall 1999 semester. The fraction of students passing at each score on the CCDT was calculated as the number of students who earned C or better divided by the number of students who received that CCDT score. In logistic regression, variation is highest in the middle of the curve rather than at either end of the range; the variation among years is most evident in the region of the rise in the curves.

Table 4 lists values of the estimated probability at several CCDT scores for 1997, 1998, and 1999 at ISU. The data show a slight variation in probability for each CCDT score from year to year. Table 5 shows some of the statistics for a logistic analysis of fall 1999 at ISU. In Table 5 (as in Table 3), the null deviance corresponds to using no predictor variable; the null case assumes an identical probability of success for all students, the probability of passing the course with a C grade. The first logistic regression analysis adds CCDT as the predictor variable and the second adds gender. CCDT is a significant factor but gender is not, since the addition of gender does not change the residual deviance significantly.
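The significance calls in Tables 3 and 5 can be checked with a likelihood-ratio (deviance) test: the drop in residual deviance when one predictor is added is compared to a chi-square distribution with one degree of freedom. The short sketch below (not from the original article) applies this check to the deviance differences reported in the tables.

```python
# Likelihood-ratio check of the deviance differences in Tables 3 and 5.
# Adding one predictor costs one degree of freedom, so the reduction in
# residual deviance is compared to a chi-square distribution with df = 1.
from scipy.stats import chi2

def deviance_p_value(reduction, df_lost=1):
    return chi2.sf(reduction, df_lost)      # upper-tail probability

# Table 3 (SAU, pooled 1995-1999)
print(deviance_p_value(39.6))    # CCDT:   p << .05, significant
print(deviance_p_value(13.5))    # gender: p < .05, significant

# Table 5 (ISU, fall 1999)
print(deviance_p_value(176.7))   # CCDT:   p << .05, significant
print(deviance_p_value(3.0))     # gender: p ~ .08, not significant at the .05 level
```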

Comparison of Simple Linear Regression and Logistic Regression

A simple linear regression analysis performed on the fall 1999 ISU data illustrates the difference between simple linear regression and logistic regression. Figure 4 shows the data as well as the simple linear regression line and equation. This analysis appears to be adequate if judgment is based on the value of R² = .87 or a significant F test.¹ A closer look at how the data points fall around the regression line, however, shows that the estimates do not follow the trend of the data: the data points are not randomly distributed about the line (Fig. 4). To visualize the pattern more easily, the residuals are plotted against the fitted values; in simple linear regression, the same assessment can be made by plotting the residuals versus the predictor (Fig. 5). The plot of the residuals versus the predictor (CCDT) shows a nonrandom pattern about the simple linear regression line (Fig. 5); see ref 22 for a more detailed discussion of this issue. The residuals are negative for the low and high values of CCDT and positive for the intermediate values, indicating that the simple linear model is incorrect.


Figure 4. Simple linear regression analysis of the success fraction based on CCDT score for ISU students during fall 1999. The fitted line is y = 0.030x − 0.133, with R² = .87.

Figure 5. Plot of residuals for the simple linear regression of the success fraction based on CCDT score for ISU students during fall 1999.

The use of the simple linear regression equation (y = 0.030x − 0.133) also leads to impossible negative values for the success fraction at low CCDT scores (x < 5) and to values greater than 1 (>100%) at high scores (x > 37). Rather than a simple linear relationship, the data are better fit by the S-shaped curve (Fig. 3) because the logistic regression model treats the data appropriately. Researchers must be careful not to accept a simple linear regression model on the basis of a high R² value or a significant F test alone, but should diagnose whether the model holds by checking plots (e.g., a residuals-versus-fit plot or a normal probability plot of the residuals) and the assumptions of the model.
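A small numeric check (not part of the original article) makes the point concrete: the published linear equation from Figure 4 produces success fractions outside the interval [0, 1] at the ends of the CCDT range, whereas a logistic curve cannot, whatever its coefficients. The logistic coefficients below are the same illustrative values used in the earlier sketch, not the published fit.

```python
# Why the straight line fails as a probability model: the Figure 4 equation
# y = 0.030x - 0.133 leaves [0, 1] at the ends of the CCDT range, while a
# logistic curve (illustrative coefficients, not the published fit) never does.
import numpy as np

scores = np.arange(0, 45)                       # possible CCDT scores, 0-44

linear = 0.030 * scores - 0.133                 # simple linear regression line (Fig. 4)
print(scores[linear < 0])                       # scores with negative "success fractions" (x < 5)
print(scores[linear > 1])                       # scores with fractions above 1 (x > 37)

b0, b1 = -4.3, 0.21                             # illustrative logistic coefficients
logistic = 1.0 / (1.0 + np.exp(-(b0 + b1 * scores)))
print(logistic.min(), logistic.max())           # always strictly between 0 and 1
```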

Discussion

The goal of performing a logistic regression analysis is to advise the next group of students on their estimated probability of succeeding in general chemistry based on diagnostic or placement exam scores. Results of the logistic analysis (e.g., Fig. 2 and Table 4) give instructors a sounder basis for advising students of their probability of success in the course. For example, an ISU student who scores 15 on the CCDT at the beginning of the fall 2000 semester can expect about a 22% probability of success. The advisor would also want to question the student about his or her academic preparation, extenuating circumstances that affected performance on the CCDT, commitments during the semester, and so on. Note that logistic regression provides the probability of success, which becomes the focus of the discussion, rather than a cutoff score. For faculty who set a cutoff score for enrollment in a course, the determination of the cutoff score can be based on an estimated probability of success rather than an educated guess.

One of the assumptions underlying the use of information from logistic regression is that the course and assessments remain fairly consistent from year to year. Figure 2 illustrates that there is minor year-to-year variation in the success curves at ISU (also see Table 4). This variation has multiple causes, including changes in the class demographics, in the degree of difficulty of examinations, in teaching style, and in the minimal C grade cutoff. For example, the fall 1999 ISU class had an added cohort of about 150 engineering students who took the course owing to a change in the engineering curricular requirements. However, the variation between years is tolerable because the goal is advising students of their estimated chance of success.

Researchers who want to do a longitudinal study of the effect of changing an instructor's teaching mode from a lecture-centered to an active, student-centered mode, or who want to investigate students' conceptual understanding, should use a combination of qualitative and quantitative techniques. Logistic regression, which generates conditional probabilities for the occurrence of events and an S-shaped curve, is another valid tool to help researchers analyze data (24, 28).

The desire of chemistry instructors to make informed decisions based on placement tests has been a recurring theme in JCE papers (9–19). For example, McFate and Olmsted compared several placement instruments (1). However, they used simple linear regression to correlate the course success fraction as the response variable with students' scores as a predictor variable, using scores on the Toledo Chemistry Placement Exam, the CCDT, and the Fullerton Placement Chemistry Test (CSUF) from three institutions (their Fig. 1). Even though they obtained R² values of .82, .98, and .87, they analyzed a binary response variable (success–no success), so logistic regression may be a more appropriate statistical technique.

While the probability curves generated in our study are specific to the course and the instructor, the methodology is useful and appropriate for any instructor who wants a means of advising students based upon actual data at that institution. A major assumption in advising students on the basis of an analysis of success in previous years is that the course content, assessment methods, and grade standards for the course are not radically changed. Even though the S curves are different for different courses, the logistic regression method works regardless of how an instructor structures the graded components of the course or where the instructor sets the cutoffs for course grades. Of course, if students are to understand what the instructor tells them about their chance of succeeding in the course, they must be capable of thinking in terms of probability. If a student thinks that he or she has a better than even chance of winning a state lottery, perhaps this student will not be able to understand what it means to have less than a 10% chance of passing the course.

Note

1. The F test for β = 0 here is the same as testing ρ = 0.

Literature Cited

1. McFate, C.; Olmsted, J. J. Chem. Educ. 1999, 76, 562–565.
2. MacPhail, A. H. J. Chem. Educ. 1939, 16, 270–273.
3. Hovey, N. W.; Krohn, A. J. Chem. Educ. 1958, 35, 507–509.
4. Hovey, N. W.; Krohn, A. J. Chem. Educ. 1963, 40, 370–372.
5. Russell, A. A. J. Chem. Educ. 1994, 71, 314–317.
6. Schmidt, F. C.; Kaslow, C. E. J. Chem. Educ. 1952, 29, 624.
7. Examinations Institute of the American Chemical Society Division of Chemical Education. California Chemistry Diagnostic Test, 1993.
8. Greenbowe, T. J.; Burke, K. A. Tech Trends 1995, 40, 23–25.
9. Brautlecht, C. A. J. Chem. Educ. 1926, 3, 903–908.
10. Malin, J. E. J. Chem. Educ. 1928, 5, 208–222.
11. Martin, F. D. J. Chem. Educ. 1942, 19, 274–277.
12. Bachelder, M. C. J. Chem. Educ. 1948, 25, 217–218.
13. Walter, R. I. J. Chem. Educ. 1966, 43, 499–500.
14. Haffner, R. W. J. Chem. Educ. 1969, 46, 160–162.
15. Wilson, A. S.; Fox, P. W. J. Chem. Educ. 1982, 59, 576–577.
16. Dever, D. F. J. Chem. Educ. 1983, 60, 720.
17. Carmichael, J. W.; Bauer, Sr. J.; Sevenair, J. P.; Hunter, J. T.; Gambrell, R. L. J. Chem. Educ. 1986, 63, 333–336.
18. Ealy, J.; Pickering, M. J. Chem. Educ. 1991, 68, 118–121.
19. Spencer, H. E. J. Chem. Educ. 1996, 73, 1150–1153.
20. Wu, H. STATS 1997, 20, 17–20.
21. Ramsey, F. L.; Schafer, D. W. The Statistical Sleuth; Wadsworth: Belmont, CA, 1997.
22. Neter, J.; Kutner, M.; Nachtsheim, C.; Wasserman, W. Applied Linear Statistical Models, 4th ed.; Wm. C. Brown/McGraw-Hill: Dubuque, IA, 1996.
23. Cohen, J.; Cohen, P. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences; Lawrence Erlbaum: Hillsdale, NJ, 1975.
24. Hause, R. F. Educ. Psychol. Meas. 1976, 36, 135–140.
25. MathSoft S-Plus; MathSoft, Inc.: Cambridge, MA, 1999.
26. SPSS; SPSS, Inc.: Chicago, IL, 1999.
27. StatView; SAS, Inc.: Cary, NC, 2000.
28. Voska, K.; Heikkinen, H. J. Res. Sci. Teach. 2000, 37, 160–176.
