Research: Science and Education
Chemical Education Research
edited by Diane M. Bunce, The Catholic University of America, Washington, D.C. 20064
The Impact of Active and Context-Based Learning in Introductory Chemistry Courses: An Early Evaluation of the Modular Approach
Joshua P. Gutwill-Wise
Exploratorium, 3601 Lyon Street, San Francisco, CA 94123; [email protected]

For over a decade, the need for curriculum reform in chemistry has been well recognized (1, 2). Reports in the 1980s and early 1990s were issued by the American Chemical Society (ACS) (3), the ACS Committee on Professional Training (4), the National Research Council (5), the National Science Foundation (NSF) (6, 7), and the National Commission on Excellence in Education (8), describing how the chemistry curriculum fails to serve students effectively. At that time, many believed that the curriculum was neither successful in producing a scientifically literate work force nor able to provide the broadly trained industrial and political leaders necessary for the nation to function efficiently and productively in the years to come.

In response to these problems, several groups of faculty formed coalition projects funded by the NSF’s Systemic Change Initiative (9, 10). These faculty groups took different approaches to reforming chemistry education: the Molecular Science group began building an online delivery and assessment system for college chemistry; the New Traditions group started creating and implementing a new, more interactive pedagogy in chemistry classrooms; the Workshop Chemistry faculty undertook the development of a training program for undergraduate leaders to run chemistry problem-solving discussion sessions with their peers; and the ModularChemistry Consortium and ChemLinks Coalition joined forces as the ChemConnections consortium and developed “modules”, new curricular materials and methods to enhance the learning and appreciation of chemistry. In addition to promoting an understanding of chemistry, a broader goal of many of these groups was for college graduates to command the scientific knowledge and skills necessary to permit continued learning, lead productive lives, and make informed decisions. The study discussed here provides an early evaluation of the ChemConnections modular materials.
The Modular approach involves a change in two components of the chemistry classroom: content and pedagogy. The shift in content emphasizes the importance of chemistry in understanding and solving real-world problems, such as building a better automobile air-bag system, investigating global warming, and understanding atmospheric ozone depletion. Each module typically spans 3–4 weeks of class time and utilizes a single real-world topic as a vehicle for teaching a coherent set of chemistry concepts. The new pedagogy driven by the modules requires greater interaction between students and instructors and among students themselves. Rather than relying solely on the lecture method, Modular classrooms also have students work in groups, solve problems individually, participate in whole-class discussions, and use multimedia. A fuller description of the Modular approach can be found in ref 11.

This article reports two comparative evaluation studies that assess the impact of the Modular approach on students’ understanding, reasoning skills, and attitudes toward chemistry. The hypotheses we set out to test in these studies may be summarized as follows:

1. Students in Modular courses will be better than students in non-Modular courses at understanding chemistry. This includes conceptual understanding and scientific inquiry skills.
2. Students in Modular courses will be as adept as students in non-Modular courses at solving standardized chemistry problems, such as those found in nationally distributed tests.
3. Modular students will have better attitudes toward science. This includes seeing that chemistry is relevant to their lives, feeling that it is interesting and important, enjoying the process of discovery, and appreciating the complexity of real-world problems.
Researchers have studied the effects of context-based chemistry curricula on high school students’ attitudes and conceptual understanding. In the United Kingdom, little or no effect was found on conceptual understanding, but students in context-based courses seemed to enjoy the course more and realized the relevance of chemistry to their lives more than students in a traditional course (12, 13). The evaluation of ChemCom, a context-based high school chemistry curriculum in the United States, found that students in the full-year ChemCom courses outperformed students in traditional courses on assessment items that tested for both chemistry knowledge and the ability to apply that knowledge (14). The ChemConnections faculty felt that they could produce college-level materials that would increase students’ enjoyment of chemistry and improve their understanding of it.

Experimental Method

To test the ChemConnections hypotheses, we conducted two comparative studies, each at a different type of institution. The first study was implemented during the fall of 1997 at Grinnell College, a small, private, liberal arts college in Iowa. The second study was conducted during the spring 1998 semester at the University of California at Berkeley, a large, public, research institution.
Journal of Chemical Education • Vol. 78 No. 5 May 2001 • JChemEd.chem.wisc.edu
The experimental design was the same at the two institutions: two instructors co-taught two sections of the introductory general chemistry course. One section was taught with the Modular method, while the other employed a non-Modular approach (i.e., typical textbook and lecture format). At each school, the study compares the performance of the Modular and non-Modular groups of students. We attempted to control for the effect of instructor by having the instructors alternate between the Modular and non-Modular sections throughout the semester. For example, during the first three weeks, Instructor A taught a module in the Modular section while Instructor B taught one or two topics from the textbook in the non-Modular section. At the end of the three weeks the instructors switched, and Instructor B taught the Modular section while Instructor A taught the non-Modular section. This way of alternating continued throughout the semester in both experiments, thereby ensuring that both the Modular and non-Modular groups spent the same amount of class time with both instructors.
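The alternation described above can be sketched as a simple schedule generator. This is only an illustration: the function name, the instructor labels "A" and "B", the fixed three-week block length, and the choice of four blocks are assumptions made for the example, not details reported for either course.

```python
from collections import Counter

def alternating_schedule(n_blocks, weeks_per_block=3):
    """Return one (start_week, end_week, modular_instr, non_modular_instr) tuple per block."""
    schedule = []
    for block in range(n_blocks):
        # The two instructors swap sections at the end of every block.
        modular, non_modular = ("A", "B") if block % 2 == 0 else ("B", "A")
        start = block * weeks_per_block + 1
        end = start + weeks_per_block - 1
        schedule.append((start, end, modular, non_modular))
    return schedule

# Hypothetical semester: four three-week blocks.
schedule = alternating_schedule(n_blocks=4)

# Tally how many weeks each instructor spends in the Modular section.
weeks_in_modular = Counter()
for start, end, modular, _ in schedule:
    weeks_in_modular[modular] += end - start + 1

for row in schedule:
    print("weeks %d-%d: Modular=%s, non-Modular=%s" % row)
print(dict(weeks_in_modular))
```

With an even number of equal-length blocks, each instructor spends the same number of weeks in each section. With an odd number of blocks the balance would be off by one block, so exact equality of class time depends on how the semester is divided.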
Recruitment

In the ideal experiment, the students in the Modular and non-Modular sections would be identical before instruction. In real classroom environments, however, the students are bound to differ slightly between the two sections. At Grinnell, four sections of General Chemistry were taught that semester, one of which used Modular materials and another of which acted as a non-Modular control section. The Modular materials included laboratory experiments, thus requiring a separate lab period for that section’s students. Students in the non-Modular section, however, could enroll in any of three “standard” lab periods. Thus, the Modular section imposed greater scheduling constraints on students than the non-Modular section. As a result, the Modular section was under-enrolled; only 10 students originally signed up for it. The other three sections of General Chemistry that semester, including the non-Modular section that participated in our study, enrolled over 30 students each. To increase the number in the Modular section, students from the other three sections were offered free course materials if they would switch to the Modular section. In the end, the Modular section contained 16 students and the non-Modular contained 30. This undoubtedly introduced a self-selection effect in the section enrollment at Grinnell. Indeed, the attitudinal presurvey revealed a larger percentage of intended science majors in the non-Modular section (83%) than the Modular section (53%). Perhaps the modules were more appealing to non-science-majors. However, the two groups performed equally well on the conceptual pretest, suggesting that their understanding of chemistry was equivalent.

At U.C. Berkeley, the lab schedules for the two sections were identical, and students did not know that the two sections would be taught differently. In fact, only days before the course started, the instructors flipped a coin to decide which section would use modules. There were no section differences on the attitudinal presurvey or on the conceptual pretest at U.C. Berkeley. A total of 338 students enrolled in the Modular section and 255 students enrolled in the non-Modular section.

Table 1. Curricular Sequence in Modular and Non-Modular Sections

Modules in Modular Section     Content in Non-Modular Section

Grinnell College
Global Warming                 Stoichiometry
Airbags                        Gas Laws
Make a CD Player               Periodicity, Bonding
Computer Chip                  Thermochemistry
Ozone                          Kinetics

U.C. Berkeley
Airbags                        Gas Laws
Solar Energy (a)               Periodicity, Bonding
Computer Chip                  Thermochemistry
Water Treatment (b)            Acid/Base, Equilibrium

(a) Modular materials were incomplete at time of study.
(b) Module had never been used before this study.
Curricula Employed in Each Study

The chemistry topics covered in the Modular sections were chosen to be as similar as possible to those typically taught in non-Modular classes at each institution. However, the standard curriculum varied between the two institutions, so the modules used in the two studies were different. Table 1 provides a rough outline for the course at each school.

There were several differences in the details of the two studies. First, only two of the modules, “Earth, Fire and Air: What Is Needed to Make an Effective Airbag System” (“Airbags” for short) and “Computer Chip Thermochemistry: How Can We Create an Integrated Circuit from Sand?” (hereafter called “Computer Chip Chemistry”), were the same at the two institutions. Second, the amount of class time differed. At Grinnell College, students met with the professor for three 50-minute class periods and one lab per week. At U.C. Berkeley, there were only two 50-minute class meetings and one lab per week. Third, the particular teaching methods were not controlled across studies. Although the Modular pedagogy was constrained by the materials themselves, faculty at each institution taught the Modular and non-Modular sections of their course according to their own school’s norms. It was expected (and indeed provided the motivation for conducting experiments at different types of institutions) that both Modular and non-Modular sections would be taught in different ways at a small college as compared to a large university. (For example, one might expect a non-Modular section to include less straight lecture at a small college than at a large university.) Fourth, the university’s course employed Graduate Student Instructors (GSIs) to run discussions and laboratories, whereas the college’s did not. We attempted to control for the effect of GSIs by creating two matched groups of them, then randomly assigning one group to teach the Modular students and the other to teach the non-Modular students.
For all these reasons, comparisons may only be made between Modular and non-Modular sections within the same school, not between institutions. Both studies were conducted at an early stage in the development of the modular materials. At Grinnell, some of the modules used in the classes had never before been used outside of the author’s home institution. At Berkeley, two of the four modules had never been used in any classroom before the study. The consortium faculty desired an early test of the modules to determine whether they were moving in a worthwhile direction. Because of the underdeveloped nature
of the materials, these studies must be considered progress evaluations, not summative evaluations.
Classroom Observation

To better describe the curricular and pedagogical differences in the Modular and non-Modular classrooms, we gathered empirical data on how each class was implemented. Observers attended the two chemistry sections at each institution and recorded the types of classroom activities employed (lecture, demonstration, whole-class discussions, group work, etc.), the number of questions asked by students and instructors, and the number of students in attendance. These data reveal some of the educational differences in the Modular and non-Modular classes.

Assessment Instruments

Within both studies, we assessed students’ performance and attitudes using pretests, in-class examinations, posttests, and a problem-solving interview near the end of the semester. We also followed students as they went into the next course.1 The evaluation team, working with the faculty at both institutions, created several assessment instruments to compare the performance of the students in the two sections in each study. The conceptual pretests and posttests were designed to assess change in students’ conceptual understanding over the course of the semester. The in-class exams and follow-up exams included a range of questions, mainly developed by the instructors, to assess understanding. The ACS 1996 Brief Exam was included to measure students’ ability to solve standard, well-accepted chemistry problems. Finally, the attitudinal presurveys and postsurveys were intended to assess change in students’ attitudes toward chemistry and the course during the semester.2 The concept pre/post test and the attitudinal surveys are available to readers online.W The specific questions on the conceptual tests, the problem-solving interviews, and the in-class exams differed somewhat between the studies in order to match the curricula taught at the two institutions.3 The written tests and surveys were given to the students in class or in lab, under the observation of a proctor.
The concept tests required both multiple-choice and long-answer explanations from students. These explanations were coded for types of reasoning by evaluation team members, and these codes were converted into scores. To check for interrater reliability, we cross-coded about 20% of the concept posttests at Grinnell and were in agreement on 92% of the codes. At Berkeley, we cross-coded about 10% of the concept posttests and were in agreement on 91% of the codes.

At Grinnell, all 46 students were invited to participate in the problem-solving interviews; only three refused (one from the Modular section and two from the non-Modular section). At Berkeley, 40 students (20 from each section) were selected using a stratified random design with SAT score, gender, ethnicity, and major as the matching variables. The two groups of interviewees at Berkeley were representative of the class at large. Two interviewers conducted one-on-one interviews with students in a private room and coded and scored students’ responses during the interview itself according to a standardized coding scheme. In the problem on gas laws, for example, the interviewer recorded whether a student correctly related all the relevant proportions in the ideal gas law (pressure is proportional to both temperature and number). After each interview, the two interviewers met to discuss the scores and came to agreement on each score. In addition, the interviews were audiotaped and videotaped in case transcription and further analysis were necessary. The interviews and their coding schemes are available online.W

At Berkeley, a trained focus-group interviewer from John Wiley and Sons, the publisher of the modules, also conducted a series of informal focus groups with students throughout the semester to gather in-depth information on their attitudes toward the course.

Results

At both institutions studied, several assessments revealed significantly better performance by the students in the Modular classes while others found no differences. In no case did the students in the non-Modular classes outperform those in the Modular classes. The attitudinal picture is somewhat more complex; students in the Modular class at Grinnell College had more positive attitudes than their non-Modular counterparts, but the reverse was found at Berkeley. The details of these results are explicated below. First, we empirically describe some of the differences in classroom practice between the Modular and non-Modular sections at each institution.
Classroom Observation Data

As intended, the classroom activities were found to be different in the Modular and non-Modular classes at both institutions. We categorized activities as either active, meaning that students had the opportunity to interact with others or to solve problems, or passive, meaning they did not have such an opportunity. Active classroom activities included engaging in whole-class discussions or brainstorming, performing group work, doing individual work, and answering quiz questions. Passive activities were listening to lecture, viewing demonstrations, watching the instructor solve problems at the blackboard, and performing procedural business such as handing in homework. Our observers marked down the amount of time spent in each type of activity. Table 2 shows these data for both institutions.

Table 2. Class Time Spent in Active and Passive Activities

                  Time (%)
Activity          Modular Section    Non-Modular Section

Grinnell College
Active            43                 6
Passive           57                 94

U.C. Berkeley
Active            44                 15
Passive           56                 85

NOTE: Chi-square tests reveal a significant difference between Modular and non-Modular classes at the .001 level at both institutions.

It is interesting that although a large percentage of time in the Modular classrooms was spent in an active learning mode, passive activities still accounted for the majority of the time. The difference between Modular and non-Modular classes is that students in the latter spend very little time engaged in active learning activities. At both institutions, the difference between types of activities in the Modular and non-Modular classes is significant at the .001 level. Consistent with the data in Table 2, both students and instructors asked more questions in the Modular classes than
in the non-Modular ones. At Grinnell College, students asked on average about six times as many questions per day in the Modular class as in the non-Modular class (t59 = 5 × 106; p < .001). At U.C. Berkeley, Modular class students asked on average a little over twice as many questions per day (t43 = 3.67; p < .001). As for the instructors, at Grinnell, they asked an average of twice as many questions per day in the Modular class (t58 = 4.90; p < .001); at Berkeley, they asked an average of almost one and one-half times as many questions per day in the Modular class (t43 = 2.18; p < .05). There were no differences in attrition rates for the Modular and non-Modular classes. At Grinnell, all students remained in both classes throughout the semester. Berkeley saw a loss of three (1%) students from the Modular section and a gain of seven (3%) students in the non-Modular section from the first midterm to the final exam. This apparent difference is not statistically significant (χ2 = 0.60; p = .74). The classroom observation data indicate that the pedagogy employed in the Modular sections was more interactive than that in the non-Modular sections. What, if any, effect did the new pedagogy and real-world contexts have on the students’ understanding and enjoyment of chemistry? To answer this question, we turn to the results from the assessment instruments.
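As a rough consistency check on the attrition result reported above (χ2 = 0.60; p = .74), the upper-tail probability of a chi-square statistic can be computed in closed form for small degrees of freedom. The article does not state the degrees of freedom or the contingency table behind the test, so both values below are assumptions made for illustration; the helper names are hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_df2(x):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    # For df = 2 the chi-square distribution is exponential with mean 2.
    return math.exp(-x / 2.0)

chi2_reported = 0.60  # statistic reported for the Berkeley attrition comparison

p_df1 = chi2_sf_df1(chi2_reported)  # assumption: 1 degree of freedom
p_df2 = chi2_sf_df2(chi2_reported)  # assumption: 2 degrees of freedom

print("df=1: p = %.2f" % p_df1)
print("df=2: p = %.2f" % p_df2)
```

Under these assumptions, the reported p = .74 matches the two-degrees-of-freedom value rather than the one-degree-of-freedom value, which suggests (but does not establish) that the underlying table had more than two categories in one dimension.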
Conceptual Understanding and Reasoning

We performed analyses of variance (ANOVAs) on the data from Grinnell College to reveal differences between the Modular and non-Modular sections. The students’ SAT scores (combined Math and Verbal) were included as covariates in the analyses of the written assessments.4 The analysis of the results from Berkeley was a bit more complicated in order to account for the additional variable of GSI. Although we attempted to match the GSIs for teaching experience and quality across the two sections, there were considerable individual GSI differences, which may have led to subgroup differences in the students. To control for this, our analyses of the written assessment data employ a general linear model, with the assessment score as the dependent variable and SAT, section, and GSI (nested within section) as the independent variables.5

At Grinnell College, students in the Modular section outperformed those in the non-Modular section on the paired exam and quiz questions created by the faculty as a normal part of the course (effect size = 0.7σ).6 In addition, the Modular class students performed marginally better on the in-depth interview question regarding gas laws than the non-Modular class students (effect size = 0.6σ). On all other assessments, there were no significant differences between the two sections.7 The results from all the assessments given at the small college are shown in Table 3.

The results at Berkeley were similar, but the performance differences appeared in different assessment instruments. Recall that the instruments were dissimilar across institutions; hence, one cannot directly compare results from the two schools. At Berkeley, students in the Modular section significantly outperformed their peers in the non-Modular course on the concept posttest (effect size = 0.5σ).
The students from the Modular section who were interviewed outperformed their counterparts on the interview question designed to measure students’ scientific thinking skills (effect size = 0.7σ). There were no significant differences between the two sections on
Table 3. Assessment Results at Grinnell

                            Modular           Non-Modular
Assessment                  Mean     SE       Mean     SE       F       2-Tailed p

Conceptual pretest           7.80    0.65      7.24    0.46     0.49    .49
Conceptual posttest          9.69    0.73      9.59    0.52     0.01    .92
In-class exams              28.69    0.99     26.12    0.70     4.44    .04
ACS 1996 brief exam         20.24    1.04     21.38    0.73     0.81    .37
Interview, gas laws          8.20    0.41      7.21    0.30     3.85    .06
Interview, exptl design      8.53    0.57      8.64    0.42     0.02    .88

NOTE: All written assessments employed SAT scores as a covariate in the ANOVA.
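The effect sizes quoted in the text for Grinnell (0.7σ for the in-class exams, 0.6σ for the gas-law interview question) can be approximately recovered from the means and standard errors in Table 3. The sketch below is an illustration under stated assumptions, not the authors' documented procedure: it assumes each standard deviation can be recovered as SE × √n, that the written assessments were taken by all 16 Modular and 30 non-Modular students, that the interviews involved 15 and 28 students (46 invited, with one Modular and two non-Modular refusals), and that the effect size is a pooled-SD standardized mean difference. The function name is hypothetical.

```python
import math

def pooled_effect_size(mean1, se1, n1, mean2, se2, n2):
    """Standardized mean difference using a pooled SD recovered from standard errors."""
    var1 = (se1 ** 2) * n1  # assumption: SD = SE * sqrt(n)
    var2 = (se2 ** 2) * n2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# In-class exams: Modular 28.69 (SE 0.99), non-Modular 26.12 (SE 0.70).
d_exams = pooled_effect_size(28.69, 0.99, 16, 26.12, 0.70, 30)

# Interview, gas laws: Modular 8.20 (SE 0.41), non-Modular 7.21 (SE 0.30).
d_gas = pooled_effect_size(8.20, 0.41, 15, 7.21, 0.30, 28)

print("in-class exams: d = %.1f" % d_exams)
print("gas-law interview: d = %.1f" % d_gas)
```

Rounded to one decimal place, these come out at about 0.7 and 0.6, in line with the values quoted in the text. Because the tabulated standard errors may come from the covariate-adjusted model rather than from raw section data, the agreement should be read as a plausibility check only.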
Table 4. Assessment Results at U.C. Berkeley Assessment Conceptual pretest Conceptual posttest
Modular
Mean
SE
Non-Modular
Mean
SE
F 2-Tailed p (Section) (Section)
6.58 0.22
6.36 0.20
0.51
10.98 0.29
9.36 0.27
15.73
.47